GraphLab:ANew) Framework)for)Parallel)...
Transcript of GraphLab:ANew) Framework)for)Parallel)...
![Page 1: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/1.jpg)
GraphLab: A New Framework for Parallel Machine LearningYUCHENG LOW, JOSEPH GONAZLEZ, AAPO KYROLA, DANNY BICKSON, CARLOS GUESTRIN, JOSEPH M. HELLERSTEIN
PRESENTED BY: STEPHEN MACKE
![Page 2: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/2.jpg)
The Problem•Need to perform machine learning at scale•Hardware advances hitting power wall•Need to parallelize (multicore + distributed)•But parallelization is hard!
![Page 3: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/3.jpg)
Low-‐level approaches•Pthreads•MPI•(OpenMP)•(CUDA, SSE instructions)
![Page 4: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/4.jpg)
Solution: MapReduce?•Only works well for embarassingly parallel problems
•Iterative MapReduce typically requires global synchronization at a single reducer
![Page 5: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/5.jpg)
Other Abstractions•Systolic, Dataflow• Nodes vertices, edges communication links• Bulk synchronous – bad for “Gauss-‐Seidel”-‐style schedules
•DAG Abstraction (Dryad, Pig Latin)• Dataflow graph depends on # iterations•Must be specified in advance – not suitable for iterative computation
•Graph-‐based messaging (e.g. Pregel)• Require users to explicitly manage communication
![Page 6: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/6.jpg)
Digression: Jacobi vs. Gauss-‐Seidel schedules
![Page 7: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/7.jpg)
Digression: Jacobi vs. Gauss-‐Seidel schedules
![Page 8: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/8.jpg)
Digression: Jacobi vs. Gauss-‐Seidel schedules
![Page 9: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/9.jpg)
The Main Idea
Perform iterative updates in this “Gauss-‐Seidel” fashion, asynchronously.
![Page 10: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/10.jpg)
Other Abstractions (cont’d)
![Page 11: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/11.jpg)
GraphLab Components
![Page 12: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/12.jpg)
Data Graph
![Page 13: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/13.jpg)
User Update Functions
![Page 14: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/14.jpg)
Sync Operation•“Fold” operation aggregates over vertices• Analogous to “Reduce” from MapReduce
•“Merge” operation for parallel tree reduction• Analogous to “Combine” from MapReduce
•“Apply” finalizes value (e.g. rescaling)
![Page 15: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/15.jpg)
Let’s write Update for Pagerank!•Reminder: pagerank eqns
![Page 16: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/16.jpg)
Let’s write Update for Pagerank!
![Page 17: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/17.jpg)
Scope Rules
![Page 18: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/18.jpg)
Scope Rules
Q. What consistency model do we need for pagerank?
![Page 19: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/19.jpg)
Scheduling•Synchronous•Round-‐robin•Set scheduler
•Splash scheduler
![Page 20: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/20.jpg)
Set Scheduling via Graph Coloring•For certain consistency models (q. which ones?), we can update all vertices of same color in parallel!
Ex. Gibbs sampling
![Page 21: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/21.jpg)
Case Studies•MRF Parameter learning•Gibbs sampling•Co-‐Em•Lasso / Shooting•Compressed Sensing
![Page 22: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/22.jpg)
Results (MRF parameter learning)
![Page 23: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/23.jpg)
Results (Gibbs sampling, LBP)
![Page 24: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/24.jpg)
Results (Co-‐Em)
![Page 25: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/25.jpg)
Results (Lasso)
![Page 26: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/26.jpg)
Results (Compressed Sensing)
![Page 27: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/27.jpg)
Conclusion•Does not say anything new about parallel programming fundamentals
•New interesting abstraction for certain types of problems• Shows need for rethinking existing abstractions (e.g. MR)• (particularly for iterative / ML algorithms)
•Commercialized as GraphLab Create• But mostly for toolkit / viz purposes• Actual mucking with update / sync functions somewhat niche• Open sourced in PowerGraph, GraphChi, etc.
![Page 28: GraphLab:ANew) Framework)for)Parallel) Machine)Learningi.stanford.edu/~adityagp/courses/cs598/slides/graphlab.pdf · GraphLab:ANew) Framework)for)Parallel) Machine)Learning YUCHENGLOW,)JOSEPH)GONAZLEZ,](https://reader035.fdocuments.in/reader035/viewer/2022071106/5fe08f5a56df6a782f32a427/html5/thumbnails/28.jpg)
More Gibbs Stuff