# PageRank

- Links as pre-existing feedback
- How can we use link information?
- Random walks on the graph; the ergodic theorem
- The eigenvector formula for page-rank
- Combining page-rank with textual features
- Other applications
- Further reading on information retrieval

2. How do we know this is well-defined? Maybe the ratio doesn't converge at all, or it converges to something which depends on the page we started with. Well, we know that the random walk is a Markov chain: the state of the chain is the page being visited. (Why?) We also see that there is some probability that the chain will go from any page to any other page eventually (if only by eventually hitting a dead-end page and then randomly re-starting). So the state-space of the Markov chain is strongly connected. The number of pages $n$ is finite. And remember from probability models that a finite Markov chain whose state-space is strongly connected obeys the ergodic theorem, which says, precisely, that the fraction of time the chain spends in any one state goes to a well-defined limit, which doesn't depend on the starting state.

So one way to calculate the page-rank is just to simulate, i.e., to do a random walk in the way I described. But this is slow, and there is another way. Suppose that $\nu$ is a probability vector on the states, i.e., it's an $n$-dimensional vector whose entries are non-negative and sum to one. Then, again from probability models, if the distribution at time $t$ is $\nu_t$, the distribution one time-step later is

$$\nu_{t+1} = \nu_t P$$

and hence $\nu_t = \nu_0 P^t$, with $P$ the transition matrix we defined earlier. It's another result from probability that the $\nu_t$ keep getting closer and closer to each other, so that

$$\lim_{t \to \infty} \nu_0 P^t = \pi$$

where $\pi$ is a special probability distribution satisfying the equation

$$\pi = \pi P$$

That is, $\pi$ is an eigenvector of $P$ with eigenvalue 1. (There is only one such $\pi$ if the chain is strongly connected.)² In fact, this $\pi$ is the same as the $\pi$ we get from the ergodic theorem. So rather than doing the simulation, we could just calculate the eigenvectors of $P$, which is often faster and more accurate than the simulation.
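As a sketch of the eigenvector approach, here is a small NumPy example. The three-page transition matrix is made up purely for illustration (any row-stochastic matrix on a strongly connected graph would do); the equation $\pi = \pi P$ says $\pi$ is a *left* eigenvector of $P$, i.e., an ordinary eigenvector of $P^T$.

```python
import numpy as np

# A hypothetical 3-page web: P[i, j] = probability the walk moves
# from page i to page j. Rows must sum to 1.
P = np.array([
    [0.0, 0.5, 0.5],
    [0.3, 0.0, 0.7],
    [0.6, 0.4, 0.0],
])

# pi = pi P means pi is a left eigenvector of P with eigenvalue 1,
# i.e., a (right) eigenvector of P transpose.
eigvals, eigvecs = np.linalg.eig(P.T)

# Pick the eigenvector whose eigenvalue is (numerically) 1.
k = np.argmin(np.abs(eigvals - 1.0))
pi = np.real(eigvecs[:, k])
pi = pi / pi.sum()  # normalize so the entries sum to one (also fixes the sign)

print(pi)                         # stationary distribution, one entry per page
assert np.allclose(pi @ P, pi)    # check the defining equation pi = pi P
```

An eigenvalue of exactly 1 always exists for a stochastic matrix; for a strongly connected chain the corresponding $\pi$ is unique and strictly positive.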
Unpacking the last equation, it says

$$\pi(i) = \sum_j \pi(j) P_{ji}$$

which means that pages with high page-rank are ones which are reached, with high probability, from other pages of high page-rank. This sounds circular ("a celebrity is someone who's famous for being famous"), but, as we've seen, it isn't. In fact, one way to compute it is to start with $\nu_0$ being the uniform distribution, i.e., $\nu_0(i) = 1/n$ for all $i$, and then calculate $\nu_1, \nu_2, \ldots$ until the change from $\nu_t$ to $\nu_{t+1}$ is small enough to tolerate. That is, initially every page has equal page-rank, but then it shifts towards those reached by many strong links ($\nu_1$), and then to those with many strong links from pages reached by many strong links ($\nu_2$), and so forth.

² A detailed discussion here would involve the Frobenius-Perron (or Perron-Frobenius) theorem from linear algebra, which is a fascinating and central result, but that will have to wait for another class.
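The iteration just described (start uniform, repeatedly multiply by $P$, stop when the change is tiny) can be sketched in NumPy as follows; the three-page transition matrix and the tolerance are hypothetical choices for illustration.

```python
import numpy as np

# A hypothetical 3-page transition matrix (rows sum to 1).
P = np.array([
    [0.0, 0.5, 0.5],
    [0.3, 0.0, 0.7],
    [0.6, 0.4, 0.0],
])

n = P.shape[0]
nu = np.full(n, 1.0 / n)   # nu_0: uniform -- every page starts with equal rank
tol = 1e-10                # tolerance for "small enough change" (arbitrary)

for step in range(10_000):
    nu_next = nu @ P                        # nu_{t+1} = nu_t P
    if np.abs(nu_next - nu).sum() < tol:    # change from nu_t to nu_{t+1} is tiny
        nu = nu_next
        break
    nu = nu_next

print(step, nu)   # nu is now (approximately) the page-rank vector pi
```

Each pass through the loop is one application of the transition matrix, so after the first pass rank has shifted toward pages reached by many strong links ($\nu_1$), after the second toward pages linked from such pages ($\nu_2$), and so on.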