Speeding Up Algorithms on Compressed Web Graphs
Chinmay Karande (Georgia Institute of Technology)
Kumar Chellapilla (Microsoft Live Labs)
Reid Andersen (Microsoft Live Labs)
WSDM 2009
C. Karande, K. Chellapilla, R. Andersen () Speeding Up Algorithms on Compressed Web Graphs WSDM 2009 1 / 41
Outline
1 Fast Algorithms for Compressed Graphs
  - Graph Compression
  - Adjacency Matrix Multiplication on Compressed Graphs
  - Adapting PageRank Markov Chain to Compressed Graphs
  - Other algorithms
  - Implementation Results
WWW Graph
A gold-mine of important information.
Webpages are nodes, hyperlinks are edges.
HUGE dataset: ∼ 1 Trillion pages
Graph Algorithms:
  - Importance metrics: PageRank, HITS, SALSA...
  - Finding paths
  - Clustering
Structural Graph Compression
Replace a dense subgraph by a sparse one, such that:
  - Maintain connectivity
  - Decompressible
  - Maintain ‘structure’
Clique-Star Compression
Clique-Star Compression: Terminology
Real Node
Virtual Node
Virtual Edge
Clique-Star Compression
Compression is performed in phases: in each phase, compress edge-disjoint cliques.
In each phase, virtual edges may become longer by one.
Diminishing returns on number of phases: ∼ 6 to 8 phases yield 10-fold compression. [G. Buehrer, K. Chellapilla]
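A minimal sketch of a single compression step. The slides show the clique-star transformation only as a figure, so this assumes it means: a set of source nodes that all link to the same set of targets is rewired through one new virtual node. The function name `compress_biclique`, the dict-of-sets graph layout and the node labels are illustrative and not taken from the talk.

```python
def compress_biclique(adj, sources, targets, virtual):
    """Replace every edge s -> t (s in sources, t in targets) by s -> virtual -> t.

    adj maps each node to the set of its successors. A new dict is returned;
    the input graph is left untouched.
    """
    targets = set(targets)
    new_adj = {u: set(vs) for u, vs in adj.items()}
    for s in sources:
        if not targets <= new_adj[s]:
            raise ValueError(f"{s} does not link to every target")
        new_adj[s] -= targets      # drop the |targets| direct edges
        new_adj[s].add(virtual)    # add one edge into the virtual node
    new_adj[virtual] = set(targets)  # the virtual node fans out to all targets
    return new_adj


def edge_count(adj):
    return sum(len(vs) for vs in adj.values())


# Hypothetical example: three pages that all link to the same three pages.
adj = {
    "u1": {"v1", "v2", "v3"},
    "u2": {"v1", "v2", "v3"},
    "u3": {"v1", "v2", "v3"},
    "v1": set(), "v2": set(), "v3": set(),
}
compressed = compress_biclique(adj, ["u1", "u2", "u3"], ["v1", "v2", "v3"], "w")
print(edge_count(adj), "->", edge_count(compressed))
```

In this example 3 × 3 = 9 edges are replaced by 3 + 3 = 6 edges; the saving grows with the size of the clique.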
Problem Statement
Problem: Given G′ compressed from G, how do we perform computations on G′ so as to infer properties of G?
How to determine metrics like:
  - PageRank
  - HITS
  - SALSA
of G without decompressing G′?
Example: PageRank
PageRank can be viewed as:
Repeated multiplication by adjacency matrix (with adjustments)
  - We need a Black-Box procedure to multiply a vector by the adjacency matrix of G given only G′.
Steady state of a Markov Chain
  - We need a Markov Chain on G′ that ‘mimics’ the PageRank MC on G.
Adjacency Matrix Multiplication: Nuts and Bolts
$y = E^T \cdot x, \quad x \in \mathbb{R}^n$
$y_v = x_{u_1} + x_{u_2} + x_{u_3} + x_{u_4} + x_{u_5}$
Adjacency Matrix Multiplication on Compressed Graph
$y = E^T \cdot x, \quad x \in \mathbb{R}^n$
$y_v = x_{u_1} + x_{u_2} + y_w$
Adjacency Matrix Multiplication on Compressed Graph: Dependencies
Consider a virtual edge $u \to w_1 \to \dots \to w_k \to v$:
$y_v$ depends upon $y_{w_k}$
$y_{w_k}$ depends upon $y_{w_{k-1}}$
...
Acyclic Dependencies on Virtual Nodes
Subgraph induced by edges incident on virtual nodes is a forest. [G. Buehrer, K. Chellapilla]
⇒ There exists a way to resolve dependencies.
while $y$ is undefined on some virtual nodes do
    Pick virtual node $w$ such that $y$ is defined on all virtual predecessors of $w$.
    Compute and define $y_w$.
end while
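The loop above can be sketched in Python as follows. This is one possible reading, not the authors' implementation: real nodes first push their $x$ values along their out-edges, then virtual nodes are processed in dependency order (Kahn's topological sort over virtual-to-virtual edges) and forward their accumulated sums. The name `multiply_transpose`, the dict-based graph layout and the example graph are assumptions made for illustration.

```python
from collections import deque


def multiply_transpose(succ, virtual, x):
    """Compute y = E^T x for the original graph using only the compressed graph.

    succ    : dict mapping every node (real or virtual) to a list of successors
    virtual : set of virtual node ids
    x       : dict with a value for every real node
    """
    acc = {v: 0.0 for v in succ}

    # Step 1: every real node pushes its value along its out-edges.
    for u, outs in succ.items():
        if u not in virtual:
            for v in outs:
                acc[v] += x[u]

    # Step 2: count, for each virtual node, its virtual predecessors.
    pending = {w: 0 for w in virtual}
    for w in virtual:
        for v in succ[w]:
            if v in virtual:
                pending[v] += 1

    # Step 3: a virtual node forwards its sum once all its virtual
    # predecessors have been resolved (acyclic, so this terminates).
    ready = deque(w for w in virtual if pending[w] == 0)
    while ready:
        w = ready.popleft()
        for v in succ[w]:
            acc[v] += acc[w]
            if v in virtual:
                pending[v] -= 1
                if pending[v] == 0:
                    ready.append(v)

    return {v: acc[v] for v in succ if v not in virtual}


# Hypothetical compressed graph: u3, u4, u5 reach v and v2 through virtual node w.
succ = {
    "u1": ["v"], "u2": ["v"],
    "u3": ["w"], "u4": ["w"], "u5": ["w"],
    "w": ["v", "v2"],
    "v": [], "v2": [],
}
x = {"u1": 1.0, "u2": 2.0, "u3": 3.0, "u4": 4.0, "u5": 5.0, "v": 0.0, "v2": 0.0}
y = multiply_transpose(succ, {"w"}, x)
print(y["v"], y["v2"])
```

Here `y["v"]` is computed as $x_{u_1} + x_{u_2} + y_w$, matching the slide, and each edge of the compressed graph is touched once.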
Solution
Permute virtual nodes in the order of dependencies
Practical Considerations
  - Sequential File Access
  - Synchronous algorithm
  - For SALSA: Inverted adjacency required for virtual nodes.
  - Speed-up almost matches the storage reduction ratio!
PageRank Scheme
PageRank is a Markov Chain:
With probability α, perform a uniform ‘jump’.
$\Pr[u \to v] = (1 - \alpha) \, \frac{1}{|\delta_{out}(u)|}$
PageRank on Compressed Graph
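[Figure: nodes u1, u2, u3 and v1, v2 in the original graph, and the same nodes together with virtual node w in the compressed graph]
$\Pr[X_t = u_i] = p_i$
$\Pr[X_{t+1} = v_i \mid p_1, p_2, p_3] = \frac{1}{|\delta(u_1)|} + \frac{1}{|\delta(u_2)|} + \frac{1}{|\delta(u_3)|} = \frac{1}{|\delta(w)|} \sum_i \frac{|\delta(w)|}{|\delta(u_i)|}$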
Defining the ‘reach’ of a node
$\Delta(u) = \begin{cases} 1 & \text{if } u \text{ is real} \\ \sum_{uv \in E'} \Delta(v) & \text{if } u \text{ is virtual} \end{cases}$
Illustration of ∆ function
∆(u) = 1
∆(v) = 5
∆(w) = 3
Defining the true out-degree of a node
$\Gamma(u) = \sum_{uv \in E'} \Delta(v)$
If G′ is compressed from G then:
For real u, Γ(u) is the out-degree of u in G .
For virtual u, Γ(u) = ∆(u).
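A small sketch of how ∆ and Γ could be computed from the compressed graph alone, using memoized recursion over virtual nodes (which terminates because dependencies among virtual nodes are acyclic). The function name `reach_and_degree` and the example graph are hypothetical; the graph was chosen so that the values for u, v and w agree with the numbers on the two illustration slides, whose actual figure is not recoverable here.

```python
def reach_and_degree(succ, virtual):
    """Return (delta, gamma) dicts for every node of the compressed graph.

    succ    : dict mapping every node to a list of successors in G'
    virtual : set of virtual node ids
    """
    memo = {}

    def delta(u):
        if u not in virtual:
            return 1  # a real node counts as exactly one endpoint
        if u not in memo:
            memo[u] = sum(delta(v) for v in succ[u])
        return memo[u]

    delta_map = {u: delta(u) for u in succ}
    gamma_map = {u: sum(delta_map[v] for v in succ[u]) for u in succ}
    return delta_map, gamma_map


# Hypothetical compressed graph: u is real, v and w are virtual.
succ = {
    "u": ["v", "a1", "a2"],
    "v": ["w", "b1", "b2"],
    "w": ["c1", "c2", "c3"],
    "a1": [], "a2": [], "b1": [], "b2": [],
    "c1": [], "c2": [], "c3": [],
}
delta, gamma = reach_and_degree(succ, {"v", "w"})
print(delta["u"], delta["v"], delta["w"])
print(gamma["u"], gamma["v"], gamma["w"])
```

For the real node u, Γ(u) = 7 counts the seven real nodes it links to in the uncompressed graph, even though it has only three out-edges in G′.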
Illustration of Γ function
Γ(u) = 7
Γ(v) = 5
Γ(w) = 3
PageRank on Compressed Graph
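With probability α, perform a uniform ‘jump’, but don’t jump to and from virtual nodes.
$\Pr[u \to v] = \begin{cases} (1 - \alpha) \, \frac{\Delta(v)}{\Gamma(u)} & \text{if } u \text{ is real} \\ \frac{\Delta(v)}{\Gamma(u)} & \text{if } u \text{ is virtual} \end{cases}$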
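A short check of why these weights are a natural choice, worked for the simplest case of a virtual edge of length one: real u, virtual w, real v. It uses only the definitions of ∆ and Γ given earlier, and assumes each edge of G corresponds to exactly one path in G′.

```latex
\begin{align*}
\Pr[u \to w \to v]
  &= (1-\alpha)\,\frac{\Delta(w)}{\Gamma(u)} \cdot \frac{\Delta(v)}{\Gamma(w)} \\
  &= (1-\alpha)\,\frac{\Delta(w)}{\Gamma(u)} \cdot \frac{1}{\Delta(w)}
     && \text{since } \Delta(v) = 1 \text{ and } \Gamma(w) = \Delta(w) \\
  &= \frac{1-\alpha}{\Gamma(u)}
\end{align*}
```

Because Γ(u) is the out-degree of u in G, this is the same probability the original PageRank chain assigns to the edge u → v. For longer virtual edges the intermediate ∆ terms cancel in the same way.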
Correctness theorem
Theorem
If G′ is compressed from G and p′, p are the respective PageRank vectors, then for every real node u, $p'(u) = \varepsilon \, p(u)$.
Proof
Proof.
Split the compression from G to G′ into phases:
$G = G_0 \to G_1 \to \dots \to G_k = G'$
Let $p_i$ be the steady state of (modified) PageRank on $G_i$.
Conclusion: For $u \in V(G_i)$, the $p_i(u)$’s and $p_{i+1}(u)$’s satisfy the same equations ⇒ $p_{i+1}(u) = \varepsilon_{i+1} \, p_i(u)$
$\varepsilon = \varepsilon_1 \cdot \varepsilon_2 \cdots \varepsilon_k$
Solution
Run (modified) PageRank on the compressed graph, and normalize the values on real nodes to unit norm.
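A toy sketch of this recipe, written as plain power iteration rather than as the authors' implementation. It assumes every node has at least one out-edge (no dangling nodes) and uses a fixed, generous iteration count instead of a convergence test. The name `modified_pagerank` and both example graphs are hypothetical. Called with an empty set of virtual nodes, the same function reduces to ordinary PageRank, which gives a convenient cross-check.

```python
def modified_pagerank(succ, virtual, alpha=0.15, iters=5000):
    """Power iteration for the modified chain; returns values on real nodes,
    normalized to sum to one."""
    memo = {}

    def delta(u):  # reach of a node, as defined on the earlier slide
        if u not in virtual:
            return 1
        if u not in memo:
            memo[u] = sum(delta(v) for v in succ[u])
        return memo[u]

    gamma = {u: sum(delta(v) for v in succ[u]) for u in succ}
    real = [u for u in succ if u not in virtual]

    # Start with all probability mass spread uniformly over real nodes.
    p = {u: (1.0 / len(real) if u not in virtual else 0.0) for u in succ}
    for _ in range(iters):
        # Jumps leave real nodes only and land on real nodes only.
        jump = alpha * sum(p[u] for u in real) / len(real)
        nxt = {u: (0.0 if u in virtual else jump) for u in succ}
        for u, outs in succ.items():
            follow = p[u] if u in virtual else (1.0 - alpha) * p[u]
            for v in outs:
                nxt[v] += follow * delta(v) / gamma[u]
        p = nxt

    total = sum(p[u] for u in real)  # mass left on real nodes
    return {u: p[u] / total for u in real}


# Hypothetical original graph and a compressed version with one virtual node w.
G = {"u1": ["v1", "v2"], "u2": ["v1", "v2"], "u3": ["v1", "v2"],
     "v1": ["u1"], "v2": ["u3"]}
G_compressed = {"u1": ["w"], "u2": ["w"], "u3": ["w"], "w": ["v1", "v2"],
                "v1": ["u1"], "v2": ["u3"]}

p_original = modified_pagerank(G, set())
p_compressed = modified_pagerank(G_compressed, {"w"})
for u in sorted(p_original):
    print(u, round(p_original[u], 6), round(p_compressed[u], 6))
```

On this example the two columns agree, which is what the correctness theorem predicts once the values on real nodes are renormalized.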
Precision Theorem
Theorem
$\varepsilon \ge 2^{-k}$
where k is the length of the longest virtual edge.
Proof.
Split the compression from G to G′ into phases:
$G = G_0 \to G_1 \to \dots \to G_k = G'$
Let $p_i$ be the steady state of (modified) PageRank on $G_i$.
To prove: $\varepsilon_i \ge 1/2$.
Follows from the fact that $\sum_{u \in V(G_i)} p_i(u) + \sum_{u \in Q} p_i(u) = 1$
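The final step of the argument, combining the per-phase bound with the product form of ε from the correctness proof, can be written out as:

```latex
\[
\varepsilon \;=\; \prod_{i=1}^{k} \varepsilon_i
\;\ge\; \prod_{i=1}^{k} \frac{1}{2}
\;=\; 2^{-k},
\qquad\text{e.g. } k = 8 \;\Rightarrow\; \varepsilon \ge \frac{1}{256} \approx 0.0039 .
\]
```

The value k = 8 is only an illustration, taken from the upper end of the 6 to 8 phases quoted earlier in the talk.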
Solution
Run (modified) PageRank on the compressed graph, and normalize the values on real nodes to unit norm.
Practical Considerations:
  - Modified only the weights: can run any existing PageRank implementation almost unchanged.
  - Sequential File Access
  - Asynchronous: Distributed computing feasible.
  - Convergence may be slower due to longer path lengths.
  - Speed-up per iteration almost matches the storage reduction ratio!
SALSA: Stochastic Approach to Link-Structure Analysis
Both the Synchronous and Asynchronous methods can be adapted for SALSA.
In-link counterparts of ∆ and Γ required.
Shortest Paths: BFS
Simply define edge weights as:
$w(u, v) = \begin{cases} 1 & \text{if } v \text{ is real} \\ 0 & \text{if } v \text{ is virtual} \end{cases}$
Use a Deque.
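With weights restricted to 0 and 1, the deque hint points to the standard 0-1 BFS technique: zero-weight edges are pushed to the front of the deque, unit-weight edges to the back. A minimal sketch, with the function name `bfs_compressed` and the example graph chosen here for illustration:

```python
from collections import deque


def bfs_compressed(succ, virtual, source):
    """Hop distances in the original graph, computed on the compressed graph.

    Stepping onto a virtual node costs 0 and stepping onto a real node
    costs 1, so a path u -> w -> v through virtual w counts as one hop.
    """
    dist = {source: 0}
    dq = deque([source])
    while dq:
        u = dq.popleft()
        for v in succ[u]:
            weight = 0 if v in virtual else 1
            candidate = dist[u] + weight
            if v not in dist or candidate < dist[v]:
                dist[v] = candidate
                if weight == 0:
                    dq.appendleft(v)  # same distance: process before others
                else:
                    dq.append(v)      # one hop further: process later
    # Report distances for real nodes only.
    return {v: d for v, d in dist.items() if v not in virtual}


# Hypothetical compressed graph: u1, u2, u3 reach v1, v2 through virtual node w.
succ = {"u1": ["w"], "u2": ["w"], "u3": ["w"], "w": ["v1", "v2"],
        "v1": ["u1"], "v2": ["u3"]}
dist = bfs_compressed(succ, {"w"}, "u2")
print(dist)
```

In the uncompressed graph u2 links directly to v1 and v2, and the computed distances (1 to v1 and v2, 2 to u1 and u3) reflect that.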
Experiments: Proof of Concept
If β is the reduction ratio in the number of edges, we cannot hope for the programs to run β times faster.
O(|V|) operations such as:
  - Allocating variables
  - Copying and zeroing values between iterations
bring down the speed-up to a small extent.
PageRank on eu-2005
Uncompressed graph
No. of nodes: 862,664
No. of edges: 19,235,140
Compressed graph has β = 4.34
No. of nodes: 1,196,536
No. of edges: 4,429,375
| | Uncompressed | Synchronous | Asynchronous |
|---|---|---|---|
| Time/iteration (sec) | 5.37 | 1.58 | 1.50 |
| No. of iterations | 19 | 19 | 50 |
| Speed-up | 1 | 3.40 | 1.36 |
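The slide does not say how the speed-up row is derived. The eu-2005 PageRank figures are consistent with it being the ratio of total running times (time per iteration times number of iterations), and β with the ratio of edge counts. A quick arithmetic check under that assumption:

```python
def total_seconds(sec_per_iteration, iterations):
    return sec_per_iteration * iterations


# Figures copied from the eu-2005 PageRank slide.
baseline = total_seconds(5.37, 19)                   # uncompressed
speedup_sync = baseline / total_seconds(1.58, 19)    # synchronous
speedup_async = baseline / total_seconds(1.50, 50)   # asynchronous
beta = 19_235_140 / 4_429_375                        # edge reduction ratio

print(f"{speedup_sync:.2f} {speedup_async:.2f} {beta:.2f}")
# prints "3.40 1.36 4.34"
```

The asynchronous variant is about as fast per iteration as the synchronous one but needs 50 iterations instead of 19, which is why its overall speed-up is lower.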
PageRank on uk-2005
Uncompressed graph
No. of nodes: 39,459,925
No. of edges: 936,364,282
Compressed graph has β = 6.18
No. of nodes: 47,482,140
No. of edges: 151,456,024
| | Uncompressed | Synchronous | Asynchronous |
|---|---|---|---|
| Time/iteration (sec) | 264.40 | 59.52 | 59.15 |
| No. of iterations | 21 | 21 | 53 |
| Speed-up | 1 | 4.44 | 2.53 |
SALSA on eu-2005
Uncompressed graph
No. of nodes: 862,664
No. of edges: 19,235,140
Compressed graph has β = 4.34
No. of nodes: 1,196,536
No. of edges: 4,429,375
| | Uncompressed | Synchronous | Asynchronous |
|---|---|---|---|
| Time/iteration (sec) | 5.48 | 2.37 | 1.97 |
| No. of iterations | 91 | 91 | 100 |
| Speed-up | 1 | 2.31 | 2.70 |
| Storage Reduction | 1 | 2.36 | 3.21 |
SALSA on uk-2005
Uncompressed graph
No. of nodes: 39,459,925
No. of edges: 936,364,282
Compressed graph has β = 6.18
No. of nodes: 47,482,140
No. of edges: 151,456,024
| | Uncompressed | Synchronous | Asynchronous |
|---|---|---|---|
| Time/iteration (sec) | 276.09 | 72.93 | 88.69 |
| No. of iterations | 104 | 104 | 124 |
| Speed-up | 1 | 3.11 | 3.18 |
| Storage Reduction | 1 | 3.47 | 4.54 |
Thank you!