Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic...
Transcript of Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic...
![Page 1: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/1.jpg)
Link Analysis
Leonid E. Zhukov
School of Data Analysis and Artificial IntelligenceDepartment of Computer Science
National Research University Higher School of Economics
Network Science
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18
![Page 2: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/2.jpg)
Lecture outline
1 Graph-theoretic definitions
2 Web page ranking algorithmsPagerankHITS
3 The Web as a graph
4 PageRank beyond the web
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 2 / 18
![Page 3: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/3.jpg)
Graph theory
Graph G (E ,V ), |V | = n, |E | = mAdjacency matrix An×n, Aij , edge i → j
Graph is directed, matrix is non-symmetric: AT 6= A, Aij 6= Aji
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 3 / 18
![Page 4: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/4.jpg)
Graph theory
sinks: zero out degree nodes, kout(i) = 0, absorbing nodes
sources: zero in degree nodes, kin(i) = 0
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 4 / 18
![Page 5: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/5.jpg)
Graph theory
Graph is strongly connected if every vertex is reachable form everyother vertex.
Strongly connected components are partitions of the graph intosubgraphs that are strongly connected
In strongly connected graphs there is a path is each direction betweenany two pairs of vertices
image from Wikipedia
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 5 / 18
![Page 6: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/6.jpg)
Graph theory
A directed graph is aperiodic if the greatest common divisor of thelengths of its cycles is one (there is no integer k >1 that divides thelength of every cycle of the graph)
image from Wikipedia
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 6 / 18
![Page 7: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/7.jpg)
Web as a graph
Hyperlinks - implicit endorsements
Web graph - graph of endorsements (sometimes reciprocal)
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 7 / 18
![Page 8: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/8.jpg)
PageRank
”PageRank can be thought of as a model of user behavior. We assume there is a”random surfer” who is given a web page at random and keeps clicking on links,never hitting ”back” but eventually gets bored and starts on another randompage. The probability that the random surfer visits a page is its PageRank.”
Sergey Brin and Larry Page, 1998
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 8 / 18
![Page 9: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/9.jpg)
Random walk
Random walk on a directed graph
pt+1i =
∑j∈N(i)
ptjdoutj
=∑j
Aji
doutj
pj
Dii = diag{douti }
pt+1 = (D−1A)Tpt
pt+1 = PTpt
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 9 / 18
![Page 10: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/10.jpg)
Ranking on directed graph
Absorbing nodes
Source nodes
Cycles
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 10 / 18
![Page 11: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/11.jpg)
Perron-Frobenius Theorem
Perron-Frobenius theorem (Fundamental Theorem of Markov Chains)If matrix is
stochastic (non-negative and rows sum up to one, describes Markovchain)
irreducible (strongly connected graph)
aperiodic
then∃ limt→∞
p̄t = π̄
and can be found as a left eigenvector
π̄P = π̄, where ||π̄||1 = 1
π̄ - stationary distribution of Markov chain, row vectorOscar Perron, 1907, Georg Frobenius,1912.
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 11 / 18
![Page 12: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/12.jpg)
PageRank
Transition matrix:
P = D−1A
Stochastic matrix:
P′ = P +seT
nPageRank matrix:
P′′ = αP′ + (1− α)eeT
n
Eigenvalue problem (choose solution with λ = 1):
P′′Tp = λp
Notations:e - unit column vector, s - absorbing nodes indicator vector (column)
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 12 / 18
![Page 13: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/13.jpg)
PageRank computations
Eigenvalue problem (λ = 1, ||p||1 = pTe = 1):(αP′ + (1− α)
eeT
n
)T
p = λp
p = αP′Tp + (1− α)e
n
Power iterations:p← αP′Tp + (1− α)
e
n
Sparse linear system:
(I− αP′T )p = (1− α)e
n
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 13 / 18
![Page 14: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/14.jpg)
Graph structure of the web
Bow tie structure of the web
Andrei Broder et al, 1999
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 14 / 18
![Page 15: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/15.jpg)
PageRank
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 15 / 18
![Page 16: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/16.jpg)
PageRank beyond the Web
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 16 / 18
![Page 17: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/17.jpg)
Hubs and Authorities (HITS)
Citation networks. Reviews vs original research (authoritative) papers
authorities, contain useful information, aihubs, contains links to authorities, hi
Mutual recursion
Good authorities reffered bygood hubs
ai ←∑j
Ajihj
Good hubs point to goodauthorities
hi ←∑j
Aijaj
Jon Kleinberg, 1999
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 17 / 18
![Page 18: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/18.jpg)
HITS
System of linear equations
a = αATh
h = βAa
Symmetric eigenvalue problem
(ATA)a = λa
(AAT )h = λh
where eigenvalue λ = (αβ)−1
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 18 / 18
![Page 19: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/19.jpg)
Hubs and Authorities
Hubs Authorities
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 19 / 18
![Page 20: Leonid E. ZhukovLeonid E. Zhukov (HSE) Lecture 6 20.02.2016 1 / 18 Lecture outline 1 Graph-theoretic de nitions 2 Web page ranking algorithms Pagerank HITS 3 The Web as a graph 4 PageRank](https://reader033.fdocuments.in/reader033/viewer/2022060514/5f83d6fa467b7464c066810c/html5/thumbnails/20.jpg)
References
The PageRank Citation Ranknig: Bringing Order to the Web. S.Brin, L. Page, R. Motwany, T. Winograd, Stanford Digital LibraryTechnologies Project, 1998
Authoritative Sources in a Hyperlinked Environment. Jon M.Kleinberg, Proc. 9th ACM-SIAM Symposium on Discrete Algorithms,
Graph structure in the Web, Andrei Broder et all. Procs of the 9thinternational World Wide Web conference on Computer networks,2000
A Survey of Eigenvector Methods of Web Information Retrieval. AmyN. Langville and Carl D. Meyer, 2004
PageRank beyond the Web. David F. Gleich, arXiv:1407.5107, 2014
Leonid E. Zhukov (HSE) Lecture 6 20.02.2016 20 / 18