A New Parallel Framework for Machine Learning
Transcript of A New Parallel Framework for Machine Learning
![Page 1: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/1.jpg)
Carnegie Mellon

A New Parallel Framework for Machine Learning

Joseph Gonzalez

Joint work with:
Yucheng Low, Aapo Kyrola, Danny Bickson, Carlos Guestrin, Guy Blelloch, Joe Hellerstein, David O'Hallaron, Alex Smola
![Page 2: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/2.jpg)
In ML we face BIG problems:

48 hours of video uploaded to YouTube every minute
24 million Wikipedia pages
750 million Facebook users
6 billion Flickr photos
![Page 3: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/3.jpg)
Parallelism: Hope for the Future

Wide array of different parallel architectures:
GPUs, Multicore, Clusters, Mini Clouds, Clouds

New challenges for designing machine learning algorithms:
Race conditions and deadlocks
Managing distributed model state

New challenges for implementing machine learning algorithms:
Parallel debugging and profiling
Hardware-specific APIs
![Page 4: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/4.jpg)
Carnegie Mellon
Core Question

How will we design and implement parallel learning systems?
![Page 5: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/5.jpg)
Carnegie Mellon
We could use ... Threads, Locks, & Messages:
Build each new learning system using low-level parallel primitives.
![Page 6: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/6.jpg)
Threads, Locks, and Messages

ML experts (typically graduate students) repeatedly solve the same parallel design challenges:
Implement and debug a complex parallel system
Tune for a specific parallel platform
Two months later the conference paper contains:
"We implemented ______ in parallel."

The resulting code:
is difficult to maintain
is difficult to extend
couples the learning model to the parallel implementation
![Page 7: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/7.jpg)
Carnegie Mellon
... a better answer: Map-Reduce / Hadoop

Build learning algorithms on top of high-level parallel abstractions.
![Page 8: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/8.jpg)
MapReduce – Map Phase

Embarrassingly parallel, independent computation; no communication needed.
[Figure: CPUs 1-4 each independently compute a value (12.9, 42.3, 21.3, 25.8)]
![Page 9: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/9.jpg)
MapReduce – Map Phase

[Figure: the CPUs continue independently, extracting image features (24.1, 84.3, 18.4, 84.4)]
![Page 10: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/10.jpg)
MapReduce – Map Phase

Embarrassingly parallel, independent computation; no communication needed.
[Figure: a third round of independent computations (17.5, 67.5, 14.9, 34.3)]
![Page 11: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/11.jpg)
MapReduce – Reduce Phase

[Figure: two CPUs fold the per-image feature values into aggregate "attractive face" and "ugly face" statistics]
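In code, the two phases on these slides amount to a parallel-for followed by a fold. Below is a minimal single-machine sketch (my own illustration, not the Hadoop API; extract_feature is a hypothetical stand-in for the image-feature computation):

```cpp
#include <iostream>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the per-image feature computation.
double extract_feature(double image) { return image * 2.0; }

int main() {
    std::vector<double> images = {12.9, 42.3, 21.3, 25.8};

    // Map phase: embarrassingly parallel; every call is independent,
    // so no communication between workers is needed.
    std::vector<double> features;
    for (double img : images) features.push_back(extract_feature(img));

    // Reduce phase: fold the per-image features into one aggregate statistic.
    double total = std::accumulate(features.begin(), features.end(), 0.0);
    std::cout << "mean feature: " << total / features.size() << "\n";
}
```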
![Page 12: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/12.jpg)
Map-Reduce for Data-Parallel ML

Excellent for large data-parallel tasks!

Data-Parallel (Map Reduce): Cross Validation, Feature Extraction, Computing Sufficient Statistics
Graph-Parallel: Belief Propagation, SVM, Kernel Methods, Deep Belief Networks, Neural Networks, Tensor Factorization, PageRank, Lasso

Is there more to Machine Learning?
![Page 13: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/13.jpg)
Carnegie Mellon
Concrete Example
Label Propagation
![Page 14: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/14.jpg)
Label Propagation Algorithm

Social Arithmetic (example):
I Like: 50% what I list on my profile + 40% what Sue Ann likes + 10% what Carlos likes
My profile: 50% cameras, 50% biking
Sue Ann: 80% cameras, 20% biking
Carlos: 30% cameras, 70% biking
Result: 60% cameras, 40% biking

Recurrence Algorithm: Likes[i] = Σ_j W_ij × Likes[j], iterated until convergence.

Parallelism: compute all Likes[i] in parallel.
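As a check of the arithmetic, here is a minimal sketch of one step of the recurrence (illustrative names, not GraphLab's API); it reproduces the 60% cameras / 40% biking result:

```cpp
#include <cstdio>
#include <utility>
#include <vector>

struct Likes { double cameras, biking; };

int main() {
    // Weighted neighborhood of "Me": (weight Wij, current Likes[j]).
    // 50% my own profile, 40% Sue Ann, 10% Carlos.
    std::vector<std::pair<double, Likes>> nbrs = {
        {0.5, {0.50, 0.50}},   // my profile: 50% cameras, 50% biking
        {0.4, {0.80, 0.20}},   // Sue Ann:    80% cameras, 20% biking
        {0.1, {0.30, 0.70}},   // Carlos:     30% cameras, 70% biking
    };

    // One step of the recurrence: Likes[i] = sum_j Wij * Likes[j].
    Likes me = {0.0, 0.0};
    for (const auto& [w, l] : nbrs) {
        me.cameras += w * l.cameras;
        me.biking  += w * l.biking;
    }
    std::printf("%.0f%% cameras, %.0f%% biking\n",
                100 * me.cameras, 100 * me.biking);  // 60% cameras, 40% biking
}
```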
![Page 15: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/15.jpg)
Properties of Graph-Parallel Algorithms

Dependency graph
Factored computation
Iterative computation: what I like depends on what my friends like.
![Page 16: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/16.jpg)
Map-Reduce for Data-Parallel ML

Excellent for large data-parallel tasks!

Data-Parallel (Map Reduce): Cross Validation, Feature Extraction, Computing Sufficient Statistics
Graph-Parallel (Map Reduce?): Belief Propagation, SVM, Kernel Methods, Deep Belief Networks, Neural Networks, Tensor Factorization, PageRank, Lasso
![Page 17: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/17.jpg)
Carnegie Mellon
Why not use Map-Reduce for Graph-Parallel Algorithms?
![Page 18: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/18.jpg)
Data Dependencies

Map-Reduce does not efficiently express dependent data:
The user must code substantial data transformations
Costly data replication

[Figure: tables of independent data rows]
![Page 19: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/19.jpg)
Iterative Algorithms

Map-Reduce does not efficiently express iterative algorithms:

[Figure: each iteration re-processes all data across CPUs with a barrier after every iteration; a single slow processor stalls each barrier]
![Page 20: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/20.jpg)
MapAbuse: Iterative MapReduce

Only a subset of the data needs computation:

[Figure: every iteration still touches all data and synchronizes at a barrier, even when only a few items changed]
![Page 21: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/21.jpg)
MapAbuse: Iterative MapReduce

The system is not optimized for iteration:

[Figure: each iteration pays a startup penalty and a disk penalty for reloading and rewriting state]
![Page 22: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/22.jpg)
Synchronous vs. Asynchronous

Example algorithm: if a neighbor is red, turn red.

Synchronous computation (Map-Reduce): evaluate the condition on all vertices in every phase.
4 phases × 9 computations = 36 computations

Asynchronous computation (wave-front): evaluate the condition only when a neighbor changes.
4 phases × 2 computations = 8 computations

[Figure: a wave of red spreading across a 3×3 grid from Time 0 to Time 4]
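The evaluation-count argument is easy to reproduce. Here is a small sketch (my own illustration on a 9-vertex chain rather than the slide's grid) that counts condition evaluations under both strategies:

```cpp
#include <deque>
#include <iostream>
#include <vector>

int main() {
    const int n = 9;                    // 9 vertices in a chain, vertex 0 starts red
    std::vector<bool> red(n, false);
    red[0] = true;

    // Synchronous: every phase evaluates the condition on all vertices.
    long sync_evals = 0;
    auto sync_red = red;
    for (int phase = 0; phase < n - 1; ++phase) {
        auto next = sync_red;
        for (int v = 0; v < n; ++v) {
            ++sync_evals;               // condition checked on every vertex, every phase
            if (v > 0 && sync_red[v - 1]) next[v] = true;
        }
        sync_red = next;
    }

    // Asynchronous (wave-front): evaluate only when a neighbor changes.
    long async_evals = 0;
    std::deque<int> work = {1};         // only vertex 1 has a changed neighbor initially
    while (!work.empty()) {
        int v = work.front(); work.pop_front();
        ++async_evals;
        if (red[v - 1] && !red[v]) {
            red[v] = true;
            if (v + 1 < n) work.push_back(v + 1);
        }
    }
    std::cout << sync_evals << " sync vs " << async_evals << " async evaluations\n";
}
```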
![Page 23: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/23.jpg)
Data-Parallel Algorithms Can Be Inefficient

The limitations of the Map-Reduce abstraction can lead to inefficient parallel algorithms.

[Figure: runtime in seconds vs. number of CPUs (1-8); optimized in-memory MapReduce BP remains far slower than asynchronous Splash BP]
![Page 24: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/24.jpg)
The Need for a New Abstraction

Map-Reduce is not well suited for graph-parallelism.

Data-Parallel (Map Reduce): Cross Validation, Feature Extraction, Computing Sufficient Statistics
Graph-Parallel (?): Belief Propagation, SVM, Kernel Methods, Deep Belief Networks, Neural Networks, Tensor Factorization, PageRank, Lasso
![Page 25: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/25.jpg)
Carnegie Mellon
What is GraphLab?
![Page 26: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/26.jpg)
The GraphLab Framework

Graph-Based Data Representation
Update Functions (User Computation)
Scheduler
Consistency Model
![Page 27: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/27.jpg)
Data Graph

A graph with arbitrary data (C++ objects) associated with each vertex and edge.

Graph: social network
Vertex data: user profile text, current interest estimates
Edge data: similarity weights
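For this social-network example, the per-vertex and per-edge C++ objects might look as follows (a hypothetical sketch; GraphLab only requires that vertex and edge data be C++ types):

```cpp
#include <string>
#include <vector>

// Vertex data: one user in the social network.
struct VertexData {
    std::string profile_text;         // user profile text
    std::vector<double> interests;    // current interest estimates, e.g. {cameras, biking}
};

// Edge data: one friendship link.
struct EdgeData {
    double similarity;                // similarity weight Wij
};
```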
![Page 28: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/28.jpg)
Implementing the Data Graph

Multicore setting:
In memory; relatively straightforward
API: vertex_data(vid) → data; edge_data(vid, vid) → data; neighbors(vid) → vid_list
Challenge: fast lookup, low overhead
Solution: dense data structures, fixed Vdata & Edata types, immutable graph structure

Cluster setting:
In memory; partition the graph with ParMETIS or random cuts
Cached ghosting: boundary vertices are replicated ("ghosted") on neighboring nodes

[Figure: a graph partitioned across Node 1 and Node 2, with ghost copies of boundary vertices A, B, C, D cached on each node]
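A minimal sketch of the multicore storage idea: dense arrays, fixed data types, immutable structure. The method names mirror the API above, but the implementation is my own illustration (edge data is looked up by edge index here rather than by vertex pair):

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using vid_t = std::uint32_t;

// Dense, fixed-type, structurally immutable graph storage: once built,
// lookups are plain array indexing, giving fast access with low overhead.
template <typename Vdata, typename Edata>
class DataGraph {
public:
    vid_t add_vertex(const Vdata& d) {
        vdata_.push_back(d);
        adj_.emplace_back();
        return static_cast<vid_t>(vdata_.size() - 1);
    }
    void add_edge(vid_t u, vid_t v, const Edata& d) {
        edata_.push_back(d);
        adj_[u].push_back({v, static_cast<vid_t>(edata_.size() - 1)});
    }

    Vdata& vertex_data(vid_t v) { return vdata_[v]; }   // vertex_data(vid) -> data
    Edata& edge_data(vid_t eid) { return edata_[eid]; } // edge data by edge index
    const std::vector<std::pair<vid_t, vid_t>>& neighbors(vid_t v) const {
        return adj_[v];                                 // (neighbor vid, edge index) pairs
    }

private:
    std::vector<Vdata> vdata_;  // dense vertex data, fixed Vdata type
    std::vector<Edata> edata_;  // dense edge data, fixed Edata type
    std::vector<std::vector<std::pair<vid_t, vid_t>>> adj_;
};
```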
![Page 29: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/29.jpg)
The GraphLab Framework

Graph-Based Data Representation
Update Functions (User Computation)
Scheduler
Consistency Model
![Page 30: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/30.jpg)
Update Functions

An update function is a user-defined program which, when applied to a vertex, transforms the data in the scope of the vertex:

label_prop(i, scope) {
  // Get neighborhood data (Likes[i], Wij, Likes[j]) from scope

  // Update the vertex data:
  // Likes[i] = sum over neighbors j of Wij * Likes[j]

  // Reschedule neighbors if needed
  if Likes[i] changes then reschedule_neighbors_of(i);
}
![Page 31: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/31.jpg)
The GraphLab Framework

Graph-Based Data Representation
Update Functions (User Computation)
Scheduler
Consistency Model
![Page 32: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/32.jpg)
The Scheduler

The scheduler determines the order in which vertices are updated.

[Figure: a shared queue of scheduled vertices (a, b, e, f, i, j, ...) feeds CPU 1 and CPU 2; executing an update can push neighboring vertices back onto the queue]

The process repeats until the scheduler is empty.
![Page 33: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/33.jpg)
Choosing a Schedule

GraphLab provides several different schedulers:
Round Robin: vertices are updated in a fixed order
FIFO: vertices are updated in the order they are added
Priority: vertices are updated in priority order

The choice of schedule affects the correctness and parallel performance of the algorithm.

Obtain different algorithms by simply changing a flag!
--scheduler=roundrobin
--scheduler=fifo
--scheduler=priority
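A toy illustration of how the scheduler choice alone changes the update order (my own sketch, not GraphLab code):

```cpp
#include <deque>
#include <iostream>
#include <queue>
#include <utility>
#include <vector>

// The same scheduled vertices, drained by two different schedulers,
// produce two different update orders.
int main() {
    std::vector<std::pair<double, int>> tasks = {   // (priority, vertex id)
        {0.3, 0}, {0.9, 1}, {0.1, 2}, {0.5, 3}};

    // FIFO: vertices come out in the order they were added.
    std::cout << "FIFO order:     ";
    std::deque<int> fifo;
    for (const auto& [p, v] : tasks) fifo.push_back(v);
    while (!fifo.empty()) { std::cout << fifo.front() << " "; fifo.pop_front(); }

    // Priority: the highest-priority vertex comes out first.
    std::cout << "\nPriority order: ";
    std::priority_queue<std::pair<double, int>> pq(tasks.begin(), tasks.end());
    while (!pq.empty()) { std::cout << pq.top().second << " "; pq.pop(); }
    std::cout << "\n";
}
```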
![Page 34: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/34.jpg)
Implementing the Schedulers

Multicore setting:
Challenging! Fine-grained locking, atomic operations
Approximate FIFO/priority: random placement across per-CPU queues, plus work stealing

Cluster setting:
A multicore scheduler runs on each node and schedules only "local" vertices
Update function calls (e.g., f(v1), f(v2)) are exchanged between nodes

[Figure: per-CPU queues (Queue 1-4) on each node; updates for remote vertices v1, v2 are forwarded to the owning node's queues]
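One way to read "random placement + work stealing" (a simplified illustration with one lock per queue; a production scheduler would use finer-grained synchronization, as noted above):

```cpp
#include <atomic>
#include <deque>
#include <iostream>
#include <mutex>
#include <optional>
#include <random>
#include <thread>
#include <vector>

// New work lands on a random per-CPU queue; a CPU that finds its own
// queue empty scans the other queues and steals work from them.
struct Queues {
    std::vector<std::deque<int>> q;
    std::vector<std::mutex> m;
    explicit Queues(std::size_t n) : q(n), m(n) {}

    void push_random(int v, std::mt19937& rng) {
        std::size_t i = rng() % q.size();            // random placement
        std::lock_guard<std::mutex> g(m[i]);
        q[i].push_back(v);
    }
    std::optional<int> pop_or_steal(std::size_t self) {
        for (std::size_t k = 0; k < q.size(); ++k) { // own queue first, then steal
            std::size_t i = (self + k) % q.size();
            std::lock_guard<std::mutex> g(m[i]);
            if (!q[i].empty()) { int v = q[i].front(); q[i].pop_front(); return v; }
        }
        return std::nullopt;                         // all queues empty
    }
};

int main() {
    Queues qs(4);
    std::mt19937 rng(42);
    for (int v = 0; v < 20; ++v) qs.push_random(v, rng);

    std::atomic<int> done{0};
    std::vector<std::thread> workers;
    for (std::size_t cpu = 0; cpu < 4; ++cpu)
        workers.emplace_back([&qs, &done, cpu] {
            while (qs.pop_or_steal(cpu)) done.fetch_add(1);
        });
    for (auto& t : workers) t.join();
    std::cout << "processed " << done.load() << " vertices\n";
}
```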
![Page 35: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/35.jpg)
The GraphLab Framework

Graph-Based Data Representation
Update Functions (User Computation)
Scheduler
Consistency Model
![Page 36: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/36.jpg)
GraphLab Ensures Sequential Consistency

For each parallel execution, there exists a sequential execution of update functions which produces the same result.

[Figure: a parallel execution on CPU 1 and CPU 2 is equivalent to some single-CPU sequential execution over time]
![Page 37: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/37.jpg)
Ensuring Race-Free Code

How much can computation overlap?
![Page 38: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/38.jpg)
Common Problem: Write-Write Race

Processors running adjacent update functions simultaneously modify shared data:

[Figure: CPU 1 and CPU 2 both write to the same edge value; the final value is whichever write lands last, so one update is silently lost]
![Page 39: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/39.jpg)
Nuances of Sequential Consistency

Data consistency depends on the update function:
Some algorithms are "robust" to data races
[Figure: concurrent reads of shared data are safe; concurrent writes are unsafe]

GraphLab solution:
The user can choose from three consistency models: Full, Edge, Vertex
GraphLab automatically enforces the user's choice
![Page 40: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/40.jpg)
Consistency Rules

Guaranteed sequential consistency for all update functions.

[Figure: the data in a vertex's scope is protected while its update function runs]
![Page 41: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/41.jpg)
Full Consistency

Only update functions whose vertices are at least two apart may run in parallel, which reduces opportunities for parallelism.
![Page 42: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/42.jpg)
Obtaining More Parallelism
Not all update functions will modify the entire scope!
![Page 43: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/43.jpg)
Edge Consistency

[Figure: under edge consistency, CPU 1 and CPU 2 may update non-adjacent vertices in parallel; reads of a shared neighbor are safe]
![Page 44: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/44.jpg)
Obtaining More Parallelism

"Map" operations, e.g., feature extraction on vertex data, touch only the vertex itself.
![Page 45: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/45.jpg)
Vertex Consistency
![Page 46: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/46.jpg)
Consistency Through R/W Locks

Read/write locks:
Full consistency: write-lock the center vertex, its adjacent edges, and its neighbors
Edge consistency: write-lock the center vertex and adjacent edges; read-lock the neighbors
Locks are acquired in a canonical ordering to avoid deadlock.
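A sketch of how the edge-consistency model maps onto reader/writer locks with canonical ordering (illustrative; under full consistency the neighbors would be write-locked as well):

```cpp
#include <algorithm>
#include <shared_mutex>
#include <vector>

// One reader-writer lock per vertex; edge data is protected by the
// locks of its two endpoints.
std::vector<std::shared_mutex> vlocks(100);

// Acquire the edge-consistency locks for an update on `center`: write-lock
// the center vertex, read-lock each neighbor, always in ascending vertex-ID
// order (the canonical ordering), so concurrent updates cannot deadlock.
void lock_edge_scope(int center, const std::vector<int>& nbrs) {
    std::vector<int> order(nbrs);
    order.push_back(center);
    std::sort(order.begin(), order.end());
    for (int v : order) {
        if (v == center) vlocks[v].lock();          // write lock on the center
        else             vlocks[v].lock_shared();   // read lock on a neighbor
    }
}

void unlock_edge_scope(int center, const std::vector<int>& nbrs) {
    for (int v : nbrs) vlocks[v].unlock_shared();
    vlocks[center].unlock();
}
```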
![Page 47: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/47.jpg)
Consistency Through R/W Locks

Multicore setting: pthread R/W locks
Distributed setting: distributed locking

Prefetch locks and data: allow computation to proceed while locks/data are requested.

[Figure: Node 1 holds a data-graph partition; Node 2 pipelines lock and data requests (a "lock pipeline") so computation overlaps communication]
![Page 48: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/48.jpg)
Consistency Through Scheduling

Edge consistency model: two vertices can be updated simultaneously if they do not share an edge.

Graph coloring: two vertices can be assigned the same color if they do not share an edge.

[Figure: vertices are updated color class by color class (Phase 1, Phase 2, Phase 3) with a barrier between phases]
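A sketch of the resulting "chromatic" execution: greedily color the graph, then update one color class per phase with a barrier in between (my own illustration):

```cpp
#include <algorithm>
#include <iostream>
#include <set>
#include <vector>

int main() {
    // Adjacency list of a small graph (a 4-cycle: 0-1-2-3-0).
    std::vector<std::vector<int>> adj = {{1, 3}, {0, 2}, {1, 3}, {0, 2}};

    // Greedy coloring: no two adjacent vertices share a color.
    std::vector<int> color(adj.size(), -1);
    for (std::size_t v = 0; v < adj.size(); ++v) {
        std::set<int> used;
        for (int u : adj[v]) if (color[u] >= 0) used.insert(color[u]);
        int c = 0;
        while (used.count(c)) ++c;
        color[v] = c;
    }
    int ncolors = *std::max_element(color.begin(), color.end()) + 1;

    // Phased execution: all vertices of one color share no edges, so they
    // can be updated in parallel; a barrier separates the color phases.
    for (int c = 0; c < ncolors; ++c) {
        std::cout << "Phase " << c + 1 << ":";
        for (std::size_t v = 0; v < adj.size(); ++v)
            if (color[v] == c) std::cout << " update(" << v << ")";
        std::cout << "   [barrier]\n";
    }
}
```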
![Page 49: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/49.jpg)
The GraphLab Framework

Graph-Based Data Representation
Update Functions (User Computation)
Scheduler
Consistency Model
![Page 50: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/50.jpg)
Global State and Computation

Not everything fits the graph metaphor:
Shared weights (global model parameters)
Global computation (e.g., aggregates over the whole graph)
![Page 51: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/51.jpg)
Anatomy of a GraphLab Program:

1) Define a C++ update function
2) Build the data graph using the C++ graph object
3) Set engine parameters: scheduler type, consistency model
4) Add initial vertices to the scheduler
5) Register global aggregates and set refresh intervals
6) Run the engine on the graph [blocking C++ call]
7) The final answer is stored in the graph
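A toy, self-contained imitation of these steps (all names are illustrative stand-ins, not the real GraphLab API; the graph structure and aggregate step are elided to keep it short):

```cpp
#include <deque>
#include <functional>
#include <iostream>
#include <vector>

struct VertexData { double value; };
using Graph = std::vector<VertexData>;     // toy "data graph" (structure elided)
using UpdateFn = std::function<void(int, Graph&, std::deque<int>&)>;

int main() {
    // 1) Define the update function (here: damp the value, reschedule if not converged).
    UpdateFn update = [](int v, Graph& g, std::deque<int>& sched) {
        double old = g[v].value;
        g[v].value *= 0.5;
        if (old > 0.1) sched.push_back(v);  // reschedule this vertex
    };

    // 2) Build the data graph.
    Graph graph = {{1.0}, {2.0}, {3.0}};

    // 3) Engine parameters (this toy engine is FIFO and single-threaded,
    //    so sequential consistency is trivial).

    // 4) Add initial vertices to the scheduler.
    std::deque<int> scheduler = {0, 1, 2};

    // 5) (Global aggregates omitted in this toy.)

    // 6) Run the engine: a blocking loop until the scheduler is empty.
    while (!scheduler.empty()) {
        int v = scheduler.front(); scheduler.pop_front();
        update(v, graph, scheduler);
    }

    // 7) The final answer is stored in the graph.
    for (std::size_t v = 0; v < graph.size(); ++v)
        std::cout << "vertex " << v << ": " << graph[v].value << "\n";
}
```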
![Page 52: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/52.jpg)
Algorithms Implemented

PageRank
Loopy Belief Propagation
Gibbs Sampling
CoEM
Graphical Model Parameter Learning
Probabilistic Matrix/Tensor Factorization
Alternating Least Squares
Lasso with Sparse Features
Support Vector Machines with Sparse Features
Label Propagation
...
![Page 53: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/53.jpg)
The Code

API implemented in C++: pthreads, GCC atomics, TCP/IP, MPI, in-house RPC

Multicore API: nearly complete implementation; experimental Matlab/Java/Python support
Cloud API: built and tested on EC2; no fault tolerance; still experimental

Available under the Apache 2.0 License
http://graphlab.org
![Page 54: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/54.jpg)
Carnegie Mellon
Shared Memory Experiments

Shared memory setting: 16-core workstation
![Page 55: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/55.jpg)
Loopy Belief Propagation

3D retinal image denoising
Data graph: vertices: 1 million; edges: 3 million
Update function: loopy BP update equation
Scheduler: approximate priority
Consistency model: edge consistency
![Page 56: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/56.jpg)
Loopy Belief Propagation

[Figure: speedup vs. number of CPUs (1-16); SplashBP tracks the optimal line closely, reaching a 15.5x speedup on 16 cores]
![Page 57: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/57.jpg)
CoEM (Rosie Jones, 2005)

Named entity recognition task: is "Dog" an animal? Is "Catalina" a place?

[Figure: bipartite graph linking noun phrases ("the dog", "Australia", "Catalina Island") to contexts ("<X> ran quickly", "travelled to <X>", "<X> is pleasant")]

Vertices: 2 million; Edges: 200 million
Hadoop: 95 cores, 7.5 hrs
![Page 58: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/58.jpg)
CoEM (Rosie Jones, 2005)

[Figure: GraphLab CoEM speedup vs. number of CPUs (1-16), approaching the optimal line]

Hadoop: 95 cores, 7.5 hrs
GraphLab: 16 cores, 30 min
15x faster using 6x fewer CPUs!
![Page 59: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/59.jpg)
Carnegie Mellon
Experiments: Amazon EC2

High-performance nodes
![Page 60: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/60.jpg)
Video Cosegmentation

Identify video segments that mean the same thing.
Gaussian EM clustering + BP on a 3D grid
Model: 10.5 million nodes, 31 million edges
![Page 61: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/61.jpg)
Video Cosegmentation Speedups
![Page 62: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/62.jpg)
Prefetching Data & Locks
![Page 63: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/63.jpg)
Matrix Factorization

Netflix collaborative filtering: alternating least squares matrix factorization
Model: 0.5 million nodes, 99 million edges

[Figure: bipartite graph of Netflix users and movies, factored with latent dimension d]
![Page 64: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/64.jpg)
Netflix Speedup: increasing the size of the matrix factorization
![Page 65: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/65.jpg)
The Cost of Hadoop
![Page 66: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/66.jpg)
Summary

An abstraction tailored to machine learning:
Targets graph-parallel algorithms
Naturally expresses data/computational dependencies and dynamic iterative computation
Simplifies parallel algorithm design
Automatically ensures data consistency
Achieves state-of-the-art parallel performance on a variety of problems
![Page 67: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/67.jpg)
Current/Future Work

Out-of-core storage
Hadoop/HDFS integration: graph construction, graph storage, launching GraphLab from Hadoop, fault tolerance through HDFS checkpoints
Sub-scope parallelism: address the challenge of very high-degree nodes
Improved graph partitioning
Support for dynamic graph structure
![Page 68: A New Parallel Framework for Machine Learning](https://reader035.fdocuments.in/reader035/viewer/2022081514/568130c5550346895d96e618/html5/thumbnails/68.jpg)
Carnegie Mellon
Check out GraphLab: http://graphlab.org

Documentation... Code... Tutorials...

Questions & Feedback