GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some...
Transcript of GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some...
![Page 1: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/1.jpg)
GraphChi: Large-Scale Graph Computation on Just a PC
Aapo Kyrölä (CMU) Guy Blelloch (CMU)
Carlos Guestrin (UW)
OSDI’12
In co-‐opera+on with the GraphLab team.
![Page 2: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/2.jpg)
BigData with Structure: BigGraph
social graph social graph follow-‐graph consumer-‐products graph
user-‐movie ra;ngs graph
DNA interac;on
graph
WWW link graph
etc.
![Page 3: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/3.jpg)
Big Graphs != Big Data
3 GraphChi – Aapo Kyrola
Data size: 140 billion connec;ons
≈ 1 TB
Not a problem!
Computa;on:
Hard to scale
TwiQer network visualiza;on, by Akshay Java, 2009
![Page 4: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/4.jpg)
Could we compute Big Graphs on a single machine?
Disk-based Graph Computation
Can’t we just use the Cloud?
![Page 5: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/5.jpg)
Wri;ng distributed applica;ons remains cumbersome.
5 GraphChi – Aapo Kyrola
Cluster crash Crash in your IDE
Distributed State is Hard to Program
![Page 6: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/6.jpg)
Efficient Scaling
• Businesses need to compute hundreds of distinct tasks on the same graph – Example: personalized recommendations.
Parallelize each task Parallelize across tasks
Task
Task
Task
Task
Task
Task
Task
Task
Task
Task
Task
Task Task
Complex Simple
Expensive to scale
2x machines = 2x throughput
![Page 7: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/7.jpg)
Other Benefits
• Costs – Easier management, simpler hardware.
• Energy Consumption – Full utilization of a single computer.
• Embedded systems, mobile devices – A basic flash-drive can fit a huge graph.
![Page 8: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/8.jpg)
Research Goal
Compute on graphs with billions of edges, in a reasonable *me, on a single PC.
– Reasonable = close to numbers previously reported for distributed systems in the literature.
Experiment PC: Mac Mini (2012)
![Page 9: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/9.jpg)
Computational Model
GraphChi – Aapo Kyrola 9
![Page 10: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/10.jpg)
Computational Model
• Graph G = (V, E) – directed edges: e = (source,
destination) – each edge and vertex associated
with a value (user-defined type) – vertex and edge values can be
modified • (structure modification also
supported) Data
Data Data
Data
Data Data Data
Data Data
Data
10 GraphChi – Aapo Kyrola
A B e
Terms: e is an out-‐edge of A, and in-‐edge of B.
![Page 11: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/11.jpg)
Data
Data Data
Data
Data Data Data
Data
Data
Data
Vertex-centric Programming
• “Think like a vertex” • Popularized by the Pregel and GraphLab projects
– Historically, systolic computation and the Connection Machine
MyFunc(vertex) { // modify neighborhood } Data
Data Data
Data
Data
![Page 12: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/12.jpg)
The Main Challenge of Disk-based Graph Computation:
Random Access
![Page 13: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/13.jpg)
Random Access Problem
vertex in-‐neighbors out-‐neighbors
5 3:2.3, 19: 1.3, 49: 0.65,... 781: 2.3, 881: 4.2..
....
19 3: 1.4, 9: 12.1, ... 5: 1.3, 28: 2.2, ...
• ... or with file index pointers vertex in-‐neighbor-‐ptr out-‐neighbors
5 3: 881, 19: 10092, 49: 20763,... 781: 2.3, 881: 4.2..
....
19 3: 882, 9: 2872, ... 5: 1.3, 28: 2.2, ...
Random write
Random read read
synchronize
• Symmetrized adjacency file with values,
5 19
For sufficient performance, millions of random accesses / second would be needed. Even for SSD, this is too much.
![Page 14: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/14.jpg)
Possible Solutions
1. Use SSD as a memory-‐extension?
[SSDAlloc, NSDI’11]
2. Compress the graph structure to fit into RAM? [ WebGraph framework]
3. Cluster the graph and handle each cluster separately in RAM?
4. Caching of hot nodes?
Too many small objects, need millions / sec.
Associated values do not compress well, and are mutated.
Expensive; The number of inter-‐cluster edges is big.
Unpredictable performance.
![Page 15: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/15.jpg)
Our Solution
Parallel Sliding Windows (PSW)
![Page 16: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/16.jpg)
Parallel Sliding Windows: Phases
• PSW processes the graph one sub-graph a time:
• In one iteration, the whole graph is processed.
– And typically, next iteration is started.
1. Load
2. Compute
3. Write
![Page 17: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/17.jpg)
• Vertices are numbered from 1 to n – P intervals, each associated with a shard on disk. – sub-graph = interval of vertices
PSW: Shards and Intervals
shard(1)
interval(1) interval(2) interval(P)
shard(2) shard(P)
1 n v1 v2
17 GraphChi – Aapo Kyrola
1. Load
2. Compute
3. Write
![Page 18: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/18.jpg)
PSW: Layout
Shard 1
Shards small enough to fit in memory; balance size of shards
Shard: in-‐edges for interval of ver;ces; sorted by source-‐id
in-‐edges fo
r ver;ces 1..100
sorted
by source_id
Ver;ces 1..100
Ver;ces 101..700
Ver;ces 701..1000
Ver;ces 1001..10000
Shard 2 Shard 3 Shard 4 Shard 1
1. Load
2. Compute
3. Write
![Page 19: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/19.jpg)
Ver;ces 1..100
Ver;ces 101..700
Ver;ces 701..1000
Ver;ces 1001..10000
Load all in-‐edges in memory
Load subgraph for verJces 1..100
What about out-‐edges? Arranged in sequence in other shards
Shard 2 Shard 3 Shard 4
PSW: Loading Sub-graph
Shard 1
in-‐edges fo
r ver;ces 1..100
sorted
by source_id
1. Load
2. Compute
3. Write
![Page 20: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/20.jpg)
Shard 1
Load all in-edges in memory
Load subgraph for verJces 101..700
Shard 2 Shard 3 Shard 4
PSW: Loading Sub-graph
Ver;ces 1..100
Ver;ces 101..700
Ver;ces 701..1000
Ver;ces 1001..10000
Out-edge blocks in memory
in-‐edges fo
r ver;ces 1..100
sorted
by source_id
1. Load
2. Compute
3. Write
![Page 21: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/21.jpg)
PSW Load-Phase
Only P large reads for each interval.
P2 reads on one full pass.
21 GraphChi – Aapo Kyrola
1. Load
2. Compute
3. Write
![Page 22: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/22.jpg)
PSW: Execute updates
• Update-function is executed on interval’s vertices • Edges have pointers to the loaded data blocks
– Changes take effect immediately asynchronous.
&Data
&Data &Data
&Data
&Data &Data &Data
&Data &Data
&Data
Block X
Block Y
Determinis;c scheduling prevents races between neighboring ver;ces.
22 GraphChi – Aapo Kyrola
1. Load
2. Compute
3. Write
![Page 23: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/23.jpg)
PSW: Commit to Disk
• In write phase, the blocks are written back to disk – Next load-phase sees the preceding writes
asynchronous.
23 GraphChi – Aapo Kyrola
1. Load
2. Compute
3. Write
&Data
&Data &Data
&Data
&Data &Data &Data
&Data &Data
&Data
Block X
Block Y
In total: P2 reads and writes / full pass on the graph. Performs well on both SSD and hard drive.
![Page 24: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/24.jpg)
GraphChi: Implementation
Evaluation & Experiments
![Page 25: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/25.jpg)
GraphChi
• C++ implementation: 8,000 lines of code – Java-implementation also available (~ 2-3x slower),
with a Scala API.
• Several optimizations to PSW (see paper).
Source code and examples: hQp://graphchi.org
![Page 26: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/26.jpg)
EVALUATION: APPLICABILITY
![Page 27: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/27.jpg)
Evaluation: Is PSW expressive enough?
Graph Mining – Connected components – Approx. shortest paths – Triangle counting – Community Detection
SpMV – PageRank – Generic
Recommendations – Random walks
Collaborative Filtering (by Danny Bickson)
– ALS – SGD – Sparse-ALS – SVD, SVD++ – Item-CF
Probabilistic Graphical Models – Belief Propagation
Algorithms implemented for GraphChi (Oct 2012)
![Page 28: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/28.jpg)
IS GRAPHCHI FAST ENOUGH?
Comparisons to existing systems
![Page 29: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/29.jpg)
Experiment Setting
• Mac Mini (Apple Inc.) – 8 GB RAM – 256 GB SSD, 1TB hard drive – Intel Core i5, 2.5 GHz
• Experiment graphs:
Graph VerJces Edges P (shards) Preprocessing
live-‐journal 4.8M 69M 3 0.5 min
nevlix 0.5M 99M 20 1 min
twiQer-‐2010 42M 1.5B 20 2 min
uk-‐2007-‐05 106M 3.7B 40 31 min
uk-‐union 133M 5.4B 50 33 min
yahoo-‐web 1.4B 6.6B 50 37 min
![Page 30: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/30.jpg)
Comparison to Existing Systems
Notes: comparison results do not include +me to transfer the data to cluster, preprocessing, or the +me to load the graph from disk. GraphChi computes asynchronously, while all but GraphLab synchronously.
PageRank
See the paper for more comparisons.
WebGraph Belief PropagaJon (U Kang et al.)
Matrix FactorizaJon (Alt. Least Sqr.) Triangle CounJng
GraphLab v1 (8 cores)
GraphChi (Mac Mini)
0 2 4 6 8 10 12 Minutes
NeDlix (99B edges)
Spark (50 machines)
GraphChi (Mac Mini)
0 2 4 6 8 10 12 14 Minutes
TwiKer-‐2010 (1.5B edges)
Pegasus / Hadoop (100
machines)
GraphChi (Mac Mini)
0 5 10 15 20 25 30 Minutes
Yahoo-‐web (6.7B edges)
Hadoop (1636
machines)
GraphChi (Mac Mini)
0 50 100 150 200 250 300 350 400 450 Minutes
twiKer-‐2010 (1.5B edges)
On a Mac Mini: GraphChi can solve as big problems as exis;ng large-‐scale systems.
Comparable performance.
![Page 31: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/31.jpg)
PowerGraph Comparison • PowerGraph / GraphLab 2
outperforms previous systems by a wide margin on natural graphs.
• With 64 more machines, 512 more CPUs: – Pagerank: 40x faster than
GraphChi – Triangle counting: 30x faster
than GraphChi.
OSDI’12
GraphChi has state-‐of-‐the-‐art performance / CPU.
vs.
GraphChi
![Page 32: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/32.jpg)
SYSTEM EVALUATION Sneak peek
Consult the paper for a comprehensive evalua+on: • HD vs. SSD • Striping data across mul+ple
hard drives • Comparison to an in-‐memory
version • BoKlenecks analysis • Effect of the number of shards • Block size and performance.
![Page 33: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/33.jpg)
Scalability / Input Size [SSD]
• Throughput: number of edges processed / second.
Conclusion: the throughput remains roughly constant when graph size is increased.
GraphChi with hard-‐drive is ~ 2x slower than SSD (if computa;onal cost low).
Graph size à
domain
twitter-2010
uk-2007-05 uk-union
yahoo-web
0.00E+00
5.00E+06
1.00E+07
1.50E+07
2.00E+07
2.50E+07
0.00E+00 2.00E+00 4.00E+00 6.00E+00 8.00E+00 Billions
PageRank -- throughput (Mac Mini, SSD)
Perf
orm
ance
à
Paper: scalability of other applica;ons.
![Page 34: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/34.jpg)
Bottlenecks / Multicore
Experiment on MacBook Pro with 4 cores / SSD.
• Computationally intensive applications benefit substantially from parallel execution.
• GraphChi saturates SSD I/O with 2 threads.
0 20 40 60 80 100 120 140 160
1 2 4 RunJ
me (secon
ds)
Number of threads
Matrix Factoriza*on (ALS)
Loading ComputaJon
0
200
400
600
800
1000
1 2 4 RunJ
me (secon
ds)
Number of threads
Connected Components
Loading ComputaJon
![Page 35: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/35.jpg)
Evolving Graphs
Graphs whose structure changes over time
![Page 36: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/36.jpg)
Evolving Graphs: Introduction
• Most interesting networks grow continuously: – New connections made, some ‘unfriended’.
• Desired functionality: – Ability to add and remove edges in streaming
fashion; – ... while continuing computation.
• Related work: – Kineograph (EuroSys ‘12), distributed system for
computation on a changing graph.
![Page 37: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/37.jpg)
PSW and Evolving Graphs
• Adding edges – Each (shard, interval) has an associated edge-buffer.
• Removing edges: Edge flagged as “removed”.
interval(1)
interval(2)
interval(P)
shard(j)
edge-buffer(j, 1)
edge-buffer(j, 2)
edge-buffer(j, P)
New edges (for example) TwiQer “firehose”
![Page 38: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/38.jpg)
Recreating Shards on Disk
• When buffers fill up, shards a recreated on disk – Too big shards are split.
• During recreation, deleted edges are permanently removed.
interval(1)
interval(2)
interval(P)
shard(j)
Re-‐create & Split
interval(1)
interval(2)
interval(P+1)
shard(j)
interval(1)
interval(2)
interval(P+1)
shard(j+1)
![Page 39: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/39.jpg)
EVALUATION: EVOLVING GRAPHS
Streaming Graph experiment
![Page 40: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/40.jpg)
Streaming Graph Experiment • On the Mac Mini:
– Streamed edges in random order from the twitter-2010 graph (1.5 B edges)
• With maximum rate of 100K or 200K edges/sec. (very high rate)
– Simultaneously run PageRank. – Data layout:
• Edges were streamed from hard drive • Shards were stored on SSD.
edges
![Page 41: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/41.jpg)
Ingest Rate
0
50000
100000
150000
200000
250000
0 1 2 3
Edges /
sec
Time (hours)
actual-‐100K actual-‐200K 100K 200K
When graph grows, shard recrea;ons become more expensive.
![Page 42: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/42.jpg)
Streaming: Computational Throughput
Throughput varies strongly due to shard rewrites and asymmetric computa;on.
0.00E+00
2.00E+06
4.00E+06
6.00E+06
8.00E+06
1.00E+07
0 1 2 3 4
Throuh
gput Edges/sec
Time (hours)
Throughput (100K) Throughput (200K) Fixed graph
![Page 43: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/43.jpg)
Conclusion
![Page 44: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/44.jpg)
Future Directions
Come to the the poster on Monday to discuss!
• This work: small amount of memory. • What if have hundreds of GBs of RAM?
Graph working memory (PSW) disk
Computation 1 state Computation 2 state
Computational state
Graph working memory (PSW) disk
Computation 1 state Computation 2 state
Computational state
Graph working memory (PSW) disk
computational state
Computation 1 state Computation 2 state
RAM
![Page 45: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/45.jpg)
Conclusion
• Parallel Sliding Windows algorithm enables processing of large graphs with very few non-sequential disk accesses.
• For the system researchers, GraphChi is a solid baseline for system evaluation – It can solve as big problems as distributed systems.
• Takeaway: Appropriate data structures as an alternative to scaling up.
Source code and examples: http://graphchi.org License: Apache 2.0
![Page 46: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/46.jpg)
Extra Slides
![Page 47: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/47.jpg)
PSW is Asynchronous
• If V > U, and there is edge (U,V, &x) = (V, U, &x), update(V) will observe change to x done by update(U): – Memory-shard for interval (j+1) will contain writes to shard(j) done
on previous intervals. – Previous slide: If U, V in the same interval.
• PSW implements the Gauss-Seidel (asynchronous) model of computation – Shown to allow, in many cases, clearly faster convergence of
computation than Bulk-Synchronous Parallel (BSP). – Each edge stored only once.
V ti ( F (V t
0 , V t1 , . . . , V t
i�1, Vt�1i , V t�1
i+1 , ...)
Extended edi;on
47 GraphChi – Aapo Kyrola
![Page 48: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/48.jpg)
Number of Shards
101 102 1030
0.5
1
1.5
2x 107#Shards and Performance
Number of shards (P)
Thro
ughp
ut (e
dges
/sec
) Conn comp. (SSD)
Pagerank (SSD)
Pagerank (HD)
Conn comp. (HD)
Student Version of MATLAB
• If P is in the “dozens”, there is not much effect on performance.
![Page 49: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/49.jpg)
I/O Complexity
• See the paper for theoretical analysis in the Aggarwal-Vitter’s I/O model. – Worst-case only 2x best-case.
• Intuition:
shard(1)
interval(1) interval(2) interval(P)
shard(2) shard(P)
1 |V| v1 v2
Inter-‐interval edge is loaded from disk only once / itera;on.
Edge spanning intervals is loaded twice / itera;on.
![Page 50: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/50.jpg)
Multiple hard-drives (RAIDish)
• GraphChi supports striping shards to multiple disks Parallel I/O.
0
500
1000
1500
2000
2500
3000
Pagerank Conn. components
Second
s / itera;
on
1 hard drive 2 hard drives 3 hard drives
Experiment on a 16-‐core AMD server (from year 2007).
![Page 51: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/51.jpg)
Bottlenecks
0
500
1000
1500
2000
2500
1 thread 2 threads 4 threads
Disk IO Graph construc;on Exec. updates
Connected Components on Mac Mini / SSD
• Cost of constructing the sub-graph in memory is almost as large as the I/O cost on an SSD – Graph construction requires a lot of random access in RAM memory
bandwidth becomes a bottleneck.
![Page 52: GraphChi: Large-Scale Graph Computation on Just a PC · – New connections made, some ‘unfriended’. • Desired functionality: – Ability to add and remove edges in streaming](https://reader034.fdocuments.in/reader034/viewer/2022043007/5f91ef1051d50934b311276a/html5/thumbnails/52.jpg)
Computational Setting
• Constraints: A. Not enough memory to store the whole graph in
memory, nor all the vertex values. B. Enough memory to store one vertex and its edges
w/ associated values.
• Largest example graph used in the experiments: – Yahoo-web graph: 6.7 B edges, 1.4 B vertices
Recently GraphChi has been used on a MacBook Pro to compute with the most recent TwiQer follow-‐graph (last year: 15 B edges)