CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank
![Page 1: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/1.jpg)
CSE 5243 (AU 14)
Graph Basics and a Gentle Introduction to PageRank
![Page 2: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/2.jpg)
Graphs from the Real World
Königsberg's Bridges
Ref: http://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg
![Page 3: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/3.jpg)
Primitives and Notations
G = (V, E)
E can also be represented as an adjacency matrix
Undirected vs. directed graph
Degree
(Shortest) distance between two vertices
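The primitives above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the four-vertex graph and the helper names `adjacency_matrix`, `degree`, and `shortest_distance` are assumptions made for the example.

```python
from collections import deque

# Hypothetical example graph (not from the slides): 4 vertices, undirected edges.
V = [0, 1, 2, 3]
E = [(0, 1), (1, 2), (2, 3), (0, 2)]


def adjacency_matrix(num_vertices, edges, directed=False):
    """Return A where A[i][j] = 1 if there is an edge from i to j."""
    A = [[0] * num_vertices for _ in range(num_vertices)]
    for i, j in edges:
        A[i][j] = 1
        if not directed:
            A[j][i] = 1  # undirected edges are stored in both directions
    return A


def degree(A, v):
    """Degree of v in an undirected graph (out-degree if A is directed)."""
    return sum(A[v])


def shortest_distance(A, source, target):
    """Number of edges on a shortest path, found by BFS; None if unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for w, connected in enumerate(A[u]):
            if connected and w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return None


A = adjacency_matrix(len(V), E)
deg_2 = degree(A, 2)
dist_0_3 = shortest_distance(A, 0, 3)
print(A, deg_2, dist_0_3)
```

Breadth-first search is used here because all edges are unweighted; a weighted graph would need a different shortest-path method.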
![Page 4: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/4.jpg)
Properties of Nodes
Centrality: how “central” a node is in the graph
How close is the node to all other nodes? (closeness centrality)
How much is a node a “choke point”? (betweenness centrality)
$\sigma_{st}$ is the number of shortest paths between s and t
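As a rough sketch of the two questions above, the code below computes one common form of closeness (the reciprocal of the summed distances) and of betweenness (the fraction of shortest s-t paths that pass through a node, summed over pairs). The slide's own formulas are not recoverable from the transcript, so these definitions, the example graph, and the function names are assumptions.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, s):
    """BFS from s: return (dist, sigma); sigma[t] = number of shortest s-t paths."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]  # every shortest path to u extends to w
    return dist, sigma


def closeness(adj, v):
    """1 / (sum of distances from v to all other nodes); assumes a connected graph."""
    dist, _ = bfs_counts(adj, v)
    return 1.0 / sum(d for node, d in dist.items() if node != v)


def betweenness(adj, v):
    """Sum over unordered pairs {s, t} (both != v) of sigma_st(v) / sigma_st."""
    info = {s: bfs_counts(adj, s) for s in adj}
    dist_v, sigma_v = info[v]
    total = 0.0
    for s, t in combinations([x for x in adj if x != v], 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s or v not in dist_s or t not in dist_v:
            continue  # s and t (or v) are not connected
        if dist_s[v] + dist_v[t] == dist_s[t]:  # v lies on a shortest s-t path
            total += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return total


# Hypothetical undirected graph: a tail (a-b) attached to a 4-cycle (b-c-e-d-b).
adj = {
    "a": ["b"],
    "b": ["a", "c", "d"],
    "c": ["b", "e"],
    "d": ["b", "e"],
    "e": ["c", "d"],
}
print(closeness(adj, "b"), betweenness(adj, "b"), betweenness(adj, "c"))
```

Node `b` is the choke point in this graph: every shortest path from `a` to the rest of the graph runs through it, so its betweenness is the highest.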
![Page 5: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/5.jpg)
Properties of Nodes
Clustering coefficient: how much does a node cluster with its neighbors?
Local clustering coefficient
Global clustering coefficient
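The two coefficients can be illustrated as follows. This sketch assumes the usual definitions for an undirected graph (local: the fraction of a node's neighbor pairs that are themselves linked; global: closed triples divided by all connected triples); the slide's formulas are not in the transcript, and the graph and function names are hypothetical.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are directly connected."""
    neighbors = adj[v]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pair to check
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def global_clustering(adj):
    """Closed triples / all connected triples, counted around each center node."""
    closed = 0
    triples = 0
    for v, neighbors in adj.items():
        for a, b in combinations(neighbors, 2):
            triples += 1          # a - v - b is a connected triple
            if b in adj[a]:
                closed += 1       # a and b are also linked: the triple is closed
    return closed / triples if triples else 0.0


# Hypothetical undirected graph: triangle a-b-c with a pendant node d on c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(local_clustering(adj, "a"), local_clustering(adj, "c"), global_clustering(adj))
```

Each triangle is counted three times in `closed` (once per center node), which matches the "3 × triangles / connected triples" form of the global coefficient.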
![Page 6: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/6.jpg)
Background
Besides the keywords, what other evidence can one use to rate the importance of a webpage?
Solution: Use the hyperlink structure
E.g. a webpage linked to by many webpages is probably important, but simply counting incoming links is not global (comprehensive): it ignores how important the linking pages are.
PageRank was developed by Larry Page in 1998.
![Page 7: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/7.jpg)
Idea
A graph representing the WWW
Node: webpage
Directed edge: hyperlink
A user surfs the WWW by randomly clicking hyperlinks.
The probability that the user stops at a particular webpage is that page's PageRank value.
A node that is linked to by many nodes with high PageRank values receives a high rank itself.
If there are no links to a node, then there is no support for that page.
![Page 8: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/8.jpg)
A simple version
u: a webpage
$B_u$: the set of u’s backlinks
$N_v$: the number of forward links of page v
Initially, R(u) is 1/N for every webpage
Iteratively update each webpage’s PR value until convergence:

$$R(u) = \sum_{v \in B_u} \frac{R(v)}{N_v}$$
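The iteration can be sketched as below. The three-page graph is a hypothetical example (the graph from the slides' Example 1 is not in the transcript), and the sketch assumes every page has at least one forward link so that $N_v$ is never zero.

```python
def simple_pagerank(links, tol=1e-10, max_iter=1000):
    """Iterate R(u) = sum over v in B_u of R(v) / N_v, starting from R(u) = 1/N.

    links[v] is the list of pages that v links to (its forward links).
    Assumes every page has at least one forward link.
    """
    pages = list(links)
    n = len(pages)
    rank = {u: 1.0 / n for u in pages}

    # Derive the backlink sets B_u from the forward links.
    backlinks = {u: [] for u in pages}
    for v, targets in links.items():
        for u in targets:
            backlinks[u].append(v)

    for _ in range(max_iter):
        new_rank = {
            u: sum(rank[v] / len(links[v]) for v in backlinks[u]) for u in pages
        }
        # Stop once no page's value changes by more than tol.
        if max(abs(new_rank[u] - rank[u]) for u in pages) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical web graph: A links to B and C, B links to C, C links to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = simple_pagerank(links)
print(ranks)
```

On this graph the values settle near R(A) = 0.4, R(B) = 0.2, R(C) = 0.4: B receives only half of A's rank, while C collects the other half plus all of B's.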
![Page 9: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/9.jpg)
Example 1
PageRank Calculation: first iteration
![Page 10: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/10.jpg)
Example 1
PageRank Calculation: second iteration
![Page 11: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/11.jpg)
Example 1
Convergence after some iterations
![Page 12: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/12.jpg)
A little more advanced version
Adding a damping factor d
Imagine that a surfer would stop clicking a hyperlink with probability 1-d
R(u) is at least (1-d)/(N-1)
N is the total number of nodes.

$$R(u) = \frac{1-d}{N-1} + d \sum_{v \in B_u} \frac{R(v)}{N_v}$$
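A minimal sketch of the damped update follows, using the base term (1-d)/(N-1) as stated on this slide. Many presentations of PageRank use (1-d)/N instead, which keeps the values summing to 1. The default d = 0.85, the example graph, and the function name are assumptions for illustration, and the sketch again assumes every page has at least one forward link and that N > 1.

```python
def damped_pagerank(links, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate R(u) = (1 - d) / (N - 1) + d * sum over v in B_u of R(v) / N_v.

    links[v] is the list of pages that v links to.
    Assumes N > 1 and that every page has at least one forward link.
    """
    pages = list(links)
    n = len(pages)
    base = (1.0 - d) / (n - 1)  # base term as written on the slide
    rank = {u: 1.0 / n for u in pages}

    # Derive the backlink sets B_u from the forward links.
    backlinks = {u: [] for u in pages}
    for v, targets in links.items():
        for u in targets:
            backlinks[u].append(v)

    for _ in range(max_iter):
        new_rank = {
            u: base + d * sum(rank[v] / len(links[v]) for v in backlinks[u])
            for u in pages
        }
        if max(abs(new_rank[u] - rank[u]) for u in pages) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical web graph: A links to B and C, B links to C, C links to A.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
d = 0.85
ranks = damped_pagerank(links, d=d)
print(ranks)
```

Because the damped sum is scaled by d < 1, the update is a contraction and converges from any starting values, and every page keeps at least the base term even if nothing links to it.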
![Page 13: CSE 5243 (AU 14) Graph Basics and a Gentle Introduction to PageRank 1.](https://reader036.fdocuments.in/reader036/viewer/2022081520/56649c905503460f9494a827/html5/thumbnails/13.jpg)
Other applications
Social network (Facebook, Twitter, etc.)
Node: Person; Edge: Follower / Followee / Friend
Higher PR value: Celebrity
Citation network
Node: Paper; Edge: Citation
Higher PR values: Important papers
Protein-protein interaction network
Node: Protein; Edge: Two proteins bind together
Higher PR values: Essential proteins