Post on 14-Jan-2016
Ex-MATE: Data-Intensive Computing with Large Reduction Objects and Its Application to Graph Mining
Wei Jiang and Gagan Agrawal
Outline
April 21, 2023
  Background
  System Design of Ex-MATE
  Parallel Graph Mining with Ex-MATE
  Experiments
  Related Work
  Conclusion
Background (I)
Map-Reduce
  Simple API: map and reduce
  Easy to write parallel programs
  Fault-tolerant for large-scale data centers
  Performance? Always a concern for the HPC community
Generalized Reduction
  First proposed in FREERIDE, which was developed at Ohio State in 2001-2003
  Shares a similar processing structure with Map-Reduce
  The key difference lies in a programmer-managed reduction object
  Better performance?
Map-Reduce Execution
Comparing Processing Structures
• The reduction object represents the intermediate state of the execution
• The reduce function is commutative and associative
• Sorting, grouping, and similar overheads are eliminated by the reduction function/object
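As a minimal sketch of this contrast (not the FREERIDE or MATE API; every function name here is hypothetical), the following Python compares the two processing structures on a simple counting task. The Map-Reduce version materializes intermediate (key, value) pairs and groups them before reducing. The generalized-reduction version updates a reduction object in place, and the per-worker copies are merged by a commutative, associative combination step.

```python
from collections import defaultdict

def map_reduce_count(chunks):
    """Map-Reduce style: emit pairs, group by key, then reduce."""
    # Map phase: every element becomes an intermediate (key, 1) pair.
    pairs = [(item, 1) for chunk in chunks for item in chunk]
    # Shuffle phase: sort and group the intermediate pairs by key.
    groups = defaultdict(list)
    for key, value in sorted(pairs):
        groups[key].append(value)
    # Reduce phase: fold each group into one value.
    return {key: sum(values) for key, values in groups.items()}

def generalized_reduction_count(chunks):
    """Generalized reduction: update a reduction object directly."""
    local_objects = []
    for chunk in chunks:              # each chunk stands in for one worker
        robj = defaultdict(int)       # worker-local reduction object
        for item in chunk:
            robj[item] += 1           # accumulate in place, no pairs stored
        local_objects.append(robj)
    # Combination: merge the local copies. The order does not matter
    # because addition is commutative and associative.
    result = defaultdict(int)
    for robj in local_objects:
        for key, value in robj.items():
            result[key] += value
    return dict(result)

chunks = [["a", "b", "a"], ["c", "a", "b"]]
mr = map_reduce_count(chunks)
gr = generalized_reduction_count(chunks)
```

Both functions return the same counts; only the second avoids building and sorting the list of intermediate pairs.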
Our Previous Work
A comparative study between FREERIDE and Hadoop:
  FREERIDE outperformed Hadoop by factors of 5 to 10
  Possible reasons: Java vs. C++? HDFS overheads? Inefficiency of Hadoop? API difference?
Developed MATE (Map-Reduce system with an AlternaTE API) on top of Phoenix from Stanford
  Adopted Generalized Reduction
  Focused on API differences
  MATE improved on Phoenix by an average of 50%
  Avoids a large set of intermediate pairs between Map and Reduce
  Reduces memory requirements
Extending MATE
Main issues of the original MATE:
  Only works on a single multi-core machine
  Datasets must reside in memory
  Assumes the reduction object MUST fit in memory
This paper extends MATE to address these limitations
Focus on graph mining: an emerging class of applications
  Requires large reduction objects as well as large-scale datasets
  E.g., PageRank could have an 8GB reduction object!
Support for managing arbitrary-sized reduction objects
  Also reads disk-resident input data
Evaluated Ex-MATE using PEGASUS
  PEGASUS: a Hadoop-based graph mining system
System Design and Implementation
System design of Ex-MATE
  Execution overview
  Support for distributed environments
System APIs in Ex-MATE
  One set provided by the runtime: operations on reduction objects
  Another set defined or customized by the users: reduction, combination, etc.
Runtime in Ex-MATE
  Data partitioning
  Task scheduling
  Other low-level details
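To illustrate how the two API sets might divide the work, here is a small Python sketch. It is not the Ex-MATE API: the class `ReductionObject`, its methods, and the driver `run` are hypothetical names invented for this example. The class plays the role of the runtime-provided operations on reduction objects, while `reduction` and `combination` play the role of the user-defined functions.

```python
class ReductionObject:
    """Runtime-side API (hypothetical): operations on a reduction object."""

    def __init__(self, size, initial=0.0):
        self.values = [initial] * size

    def accumulate(self, index, value, op):
        # Fold `value` into slot `index` using the user-chosen operator.
        self.values[index] = op(self.values[index], value)

    def get(self, index):
        return self.values[index]


def reduction(robj, element):
    """User-side API: process one input element, here an (index, value) pair."""
    index, value = element
    robj.accumulate(index, value, lambda a, b: a + b)


def combination(robj_a, robj_b):
    """User-side API: merge two reduction objects built by different workers."""
    merged = ReductionObject(len(robj_a.values))
    for i in range(len(robj_a.values)):
        merged.values[i] = robj_a.get(i) + robj_b.get(i)
    return merged


def run(splits, size):
    """Stand-in for the runtime: one reduction object per split, then combine."""
    partials = []
    for split in splits:
        robj = ReductionObject(size)
        for element in split:
            reduction(robj, element)
        partials.append(robj)
    result = partials[0]
    for other in partials[1:]:
        result = combination(result, other)
    return result.values

totals = run([[(0, 1.0), (1, 2.0)], [(0, 3.0), (2, 4.0)]], size=3)
```

In this sketch the user never touches partitioning or scheduling; those stay inside `run`, which mirrors the split of responsibilities listed on the slide.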
Ex-MATE Runtime Overview
  Basic one-stage execution
Implementation Considerations
Support for processing very large datasets
  Partitioning function: partition the data and distribute it to a number of nodes
  Splitting function: use the multi-core CPU on each node
Management of a large reduction object (R.O.): reduce disk I/O!
  Outputs (R.O.) are updated in a demand-driven way
    Partition the reduction object into splits
  Inputs are re-organized based on data access patterns
    Reuse an R.O. split in memory as much as possible
Example: Matrix-Vector Multiplication
An MV-Multiplication Example
[Figure: Input Matrix, Input Vector, and Output Vector; block labels (1, 1), (2, 1), (1, 2)]
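The idea of reusing one reduction-object split can be sketched in Python as follows. This is an illustrative reading, not the Ex-MATE implementation: the function name `blocked_matvec`, the square-matrix assumption, and the block traversal order are all assumptions, and the order shown in the slide's figure may differ. The output vector stands in for the disk-resident reduction object, and each split is brought into memory once, updated by every matrix block that touches it, and then written back.

```python
def blocked_matvec(matrix, vector, block):
    """Multiply a square matrix by a vector, one output split in memory at a time."""
    n = len(vector)
    output = [0.0] * n            # stands in for the disk-resident reduction object
    split_loads = 0
    for row_start in range(0, n, block):            # one output split per block row
        row_end = min(row_start + block, n)
        split = output[row_start:row_end]           # "load" the split once
        split_loads += 1
        for col_start in range(0, n, block):        # every matrix block that updates it
            col_end = min(col_start + block, n)
            for i in range(row_start, row_end):
                for j in range(col_start, col_end):
                    split[i - row_start] += matrix[i][j] * vector[j]
        output[row_start:row_end] = split           # "write back" the split once
    return output, split_loads

matrix = [[1, 2, 0, 0],
          [0, 1, 0, 3],
          [4, 0, 1, 0],
          [0, 0, 2, 1]]
vector = [1, 2, 3, 4]
result, loads = blocked_matvec(matrix, vector, block=2)
```

With a block size of 2 on a 4-element vector, each of the two output splits is loaded exactly once, regardless of how many matrix blocks contribute to it.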
GIM-V for Graph Mining (I)
Generalized Iterative Matrix-Vector Multiplication (GIM-V)
  First proposed at CMU
  Similar to common matrix-vector (MV) multiplication
Three operations in GIM-V, each generalizing one step of MV multiplication:
  combine m(i, j) and v(j) (Multiplication in MV multiplication)
    Does not have to be a multiplication
  combineAll n partial results for the element i (Sum in MV multiplication)
    Does not have to be a sum
  assign v(new) to v(i) (Assignment in MV multiplication)
    The previous value of v(i) is replaced by the new value
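The slide's formulas did not survive extraction. The block below writes out the correspondence in the slide's notation, as a sketch based on the three operations named above; the intermediate symbols x and r are introduced here only for readability.

```latex
% Ordinary matrix-vector multiplication: multiply, sum, assign
v_i^{\mathrm{new}} = \sum_{j=1}^{n} m_{i,j}\, v_j

% GIM-V: each of the three steps becomes a customizable operation
x_{i,j} = \mathrm{combine}\bigl(m_{i,j},\, v_j\bigr), \qquad
r_i = \mathrm{combineAll}\bigl(\{\, x_{i,j} \mid j = 1, \dots, n \,\}\bigr), \qquad
v_i^{\mathrm{new}} = \mathrm{assign}\bigl(v_i,\, r_i\bigr)
```

Choosing combine as multiplication, combineAll as summation, and assign as plain overwriting recovers the ordinary product in the first line.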
GIM-V for Graph Mining (II)
A set of graph mining applications fits into the GIM-V framework
  PageRank, Diameter Estimation, Finding Connected Components, Random Walk with Restart, etc.
Parallelization of GIM-V:
  Using Map-Reduce in PEGASUS
    A two-stage algorithm: two consecutive map-reduce jobs
  Using Generalized Reduction in Ex-MATE
    A one-stage algorithm: simpler code
GIM-V Example: PageRank
PageRank is used by Google to calculate the relative importance of web pages
  Direct implementation of GIM-V: v(j) is the ranking value
The three customized operations are:
  combine: Multiplication
  combineAll: Sum
  assign: Assignment
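A small Python sketch of PageRank expressed through the three operations follows. The helper `gimv`, the tiny example graph, and the damping factor `c = 0.85` are assumptions made for illustration; the slide names only the operation types (multiplication, sum, assignment), so the exact constants in the lambdas follow the usual damped PageRank formula rather than anything stated here.

```python
def gimv(edges, v, combine, combine_all, assign):
    """One GIM-V iteration over a sparse matrix given as {(i, j): m_ij}."""
    partial = {i: [] for i in range(len(v))}
    for (i, j), m_ij in edges.items():
        partial[i].append(combine(m_ij, v[j]))
    return [assign(v[i], combine_all(partial[i])) for i in range(len(v))]

n = 3
c = 0.85  # damping factor: an assumed, conventional value
# Column-normalized adjacency: m[(i, j)] = 1 / out_degree(j) for an edge j -> i.
# Example graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
edges = {(1, 0): 0.5, (2, 0): 0.5, (2, 1): 1.0, (0, 2): 1.0}

ranks = [1.0 / n] * n
for _ in range(50):
    ranks = gimv(
        edges, ranks,
        combine=lambda m_ij, v_j: c * m_ij * v_j,        # Multiplication
        combine_all=lambda xs: (1 - c) / n + sum(xs),    # Sum
        assign=lambda v_old, v_new: v_new,               # Assignment
    )
```

Because every node in this example has at least one outgoing edge, the ranks keep summing to 1 across iterations, and node 2, which receives links from both other nodes, ends up with the highest rank.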
GIM-V: Other Algorithms
Diameter Estimation: HADI is an algorithm to estimate the diameter of a given graph
  The three customized operations are:
    combine: Multiplication
    combineAll: Bitwise-or
    assign: Bitwise-or
Finding Connected Components: HCC is a new algorithm to find the connected components of large graphs
  The three customized operations are:
    combine: Multiplication
    combineAll: Minimum
    assign: Minimum
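The connected-components case can be sketched in Python with the multiplication/minimum/minimum choice listed above. The helper `gimv`, the example graph, and the convention that each node starts with its own id as its label are assumptions for this illustration. HADI is not shown because its bitstring encoding is not described on the slide.

```python
import math

def gimv(edges, v, combine, combine_all, assign):
    """One GIM-V iteration over a sparse matrix given as {(i, j): m_ij}."""
    partial = {i: [] for i in range(len(v))}
    for (i, j), m_ij in edges.items():
        partial[i].append(combine(m_ij, v[j]))
    return [assign(v[i], combine_all(partial[i])) for i in range(len(v))]

# Undirected graph with components {0, 1, 2} and {3, 4}; node 5 is isolated.
undirected = [(0, 1), (1, 2), (3, 4)]
edges = {}
for a, b in undirected:
    edges[(a, b)] = 1
    edges[(b, a)] = 1

labels = list(range(6))           # every node starts with its own id
while True:
    updated = gimv(
        edges, labels,
        combine=lambda m_ij, v_j: m_ij * v_j,                # Multiplication
        combine_all=lambda xs: min(xs, default=math.inf),    # Minimum
        assign=lambda v_old, v_new: min(v_old, v_new),       # Minimum
    )
    if updated == labels:         # fixed point: labels stopped changing
        break
    labels = updated
```

At the fixed point every node carries the smallest node id in its component, so nodes with equal labels belong to the same component.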
Parallelization of GIM-V (I)
Using Map-Reduce: Stage I
  Map: map M(i,j) and V(j) to reducer j
Parallelization of GIM-V (II)
Using Map-Reduce: Stage I (cont.)
  Reduce: map "combine2(M(i,j), V(j))" to reducer i
Parallelization of GIM-V (III)
Using Map-Reduce: Stage II
  Map:
Parallelization of GIM-V (IV)
Using Map-Reduce: Stage II (cont.)
  Reduce:
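The two stages can be simulated in a single process with the Python sketch below. Stage I follows the slide text: matrix and vector elements are routed by column j, and combine2 results are emitted keyed by row i. The Stage II details did not survive extraction, so the version here (an identity map, then combineAll and assign per row) is an assumed reading, as are the function names and the "self"/"partial" tags.

```python
from collections import defaultdict

def stage1(matrix, vector, combine2):
    """Stage I: group M(i, j) and V(j) by column j, emit results keyed by row i."""
    by_column = defaultdict(lambda: {"m": [], "v": None})
    for (i, j), m_ij in matrix.items():          # map: M(i, j) -> reducer j
        by_column[j]["m"].append((i, m_ij))
    for j, v_j in enumerate(vector):             # map: V(j) -> reducer j
        by_column[j]["v"] = v_j
    out = []
    for j, group in by_column.items():           # reduce at reducer j
        out.append((j, ("self", group["v"])))    # keep the old value for assign
        for i, m_ij in group["m"]:
            out.append((i, ("partial", combine2(m_ij, group["v"]))))
    return out

def stage2(pairs, combine_all, assign):
    """Stage II (assumed): group by row i, then apply combineAll and assign."""
    by_row = defaultdict(lambda: {"old": None, "partials": []})
    for i, (tag, value) in pairs:                # map: identity, routed to reducer i
        if tag == "self":
            by_row[i]["old"] = value
        else:
            by_row[i]["partials"].append(value)
    return [assign(by_row[i]["old"], combine_all(by_row[i]["partials"]))
            for i in sorted(by_row)]

matrix = {(0, 0): 2, (0, 1): 1, (1, 1): 3}
vector = [10, 20]
result = stage2(
    stage1(matrix, vector, lambda m, v: m * v),
    combine_all=sum,
    assign=lambda old, new: new,
)
```

Note that the whole list of intermediate pairs produced by `stage1` has to exist before `stage2` can start, which is the cost the one-stage approach aims to avoid.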
Parallelization of GIM-V (V)
Using Generalized Reduction in Ex-MATE:
  Reduction:
Parallelization of GIM-V (VI)
Using Generalized Reduction in Ex-MATE:
  Finalize:
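The content of the Reduction and Finalize steps did not survive extraction, so the following Python is only a plausible sketch of a one-stage formulation, with hypothetical function names. It assumes combineAll can be expressed as a binary, commutative, associative operator `accumulate` with an `identity` value, which lets each matrix element be folded directly into the reduction object without storing intermediate pairs.

```python
def reduction(split, vector, robj, combine2, accumulate):
    """Fold every matrix element of one split into the reduction object."""
    for (i, j), m_ij in split:
        robj[i] = accumulate(robj[i], combine2(m_ij, vector[j]))

def gimv_one_stage(splits, vector, combine2, accumulate, identity, assign):
    # Reduction: each split updates its own copy of the reduction object.
    local = []
    for split in splits:
        robj = [identity] * len(vector)
        reduction(split, vector, robj, combine2, accumulate)
        local.append(robj)
    # Combination: merge the copies element-wise with the same operator.
    merged = [identity] * len(vector)
    for robj in local:
        merged = [accumulate(a, b) for a, b in zip(merged, robj)]
    # Finalize: assign the combined value to each vector element.
    return [assign(old, new) for old, new in zip(vector, merged)]

splits = [
    [((0, 0), 2), ((1, 1), 3)],   # matrix elements handled by worker 0
    [((0, 1), 1)],                # matrix elements handled by worker 1
]
result = gimv_one_stage(
    splits, [10, 20],
    combine2=lambda m, v: m * v,
    accumulate=lambda a, b: a + b,
    identity=0,
    assign=lambda old, new: new,
)
```

The only state that grows with the problem is the reduction object itself, one slot per vector element, which is why its size matters so much for large graphs.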
Experiments Design
Applications: three graph mining algorithms
  PageRank, Diameter Estimation, and Finding Connected Components
Evaluation:
  Performance comparison with PEGASUS
    PEGASUS provides a naïve version and an optimized version
  Speedups with an increasing number of nodes
  Scalability: speedups with an increasing size of datasets
Experimental platform:
  A cluster of multi-core CPU machines
  Used up to 128 cores (16 nodes)
Results: Graph Mining (I)
PageRank: 16GB dataset; a graph of 256 million nodes and 1 billion edges
[Figure: Avg. Time Per Iteration (min) vs. # of nodes; annotation: 10.0 speedup]
Results: Graph Mining (II)
HADI: 16GB dataset; a graph of 256 million nodes and 1 billion edges
[Figure: Avg. Time Per Iteration (min) vs. # of nodes; annotation: 11.0 speedup]
Results: Graph Mining (III)
HCC: 16GB dataset; a graph of 256 million nodes and 1 billion edges
[Figure: Avg. Time Per Iteration (min) vs. # of nodes; annotation: 9.0 speedup]
Scalability: Graph Mining (IV)
HCC: 8GB dataset; a graph of 256 million nodes and 0.5 billion edges
[Figure: Avg. Time Per Iteration (min) vs. # of nodes; annotations: 1.7 speedup, 1.9 speedup]
Scalability: Graph Mining (V)
HCC: 32GB dataset; a graph of 256 million nodes and 2 billion edges
[Figure: Avg. Time Per Iteration (min) vs. # of nodes; annotations: 1.9 speedup, 2.7 speedup]
Scalability: Graph Mining (VI)
HCC: 64GB dataset; a graph of 256 million nodes and 4 billion edges
[Figure: Avg. Time Per Iteration (min) vs. # of nodes; annotations: 1.9 speedup, 2.8 speedup]
Observations
Performance trends are similar for all three applications
  Consistent with the fact that all three applications are implemented using the GIM-V method
Ex-MATE outperforms PEGASUS significantly for all three graph mining algorithms
Reasonable speedups for different datasets
Better scalability for larger datasets with an increasing number of nodes
Related Work: Academia
Evaluation of Map-Reduce-like models in various parallel programming environments:
  Phoenix-rebirth for large-scale multi-core machines
  Mars for a single GPU
  MITHRA for GPGPUs in heterogeneous platforms
  Recent IDAV for GPU clusters
Improvement of Map-Reduce API:
  Integrating pre-fetch and pre-shuffling into Hadoop
  Supporting online queries
  Enforcing less restrictive synchronization semantics between Map and Reduce
Related Work: Industry
Google's Pregel System:
  Map-Reduce may not be so suitable for graph operations
  Proposed to target graph processing
  Open source version: HAMA project in Apache
Variants of Map-Reduce:
  Dryad/DryadLINQ from Microsoft
  Sawzall from Google
  Pig/Map-Reduce-Merge from Yahoo!
  Hive from Facebook
Conclusion
Ex-MATE supports the management of reduction objects of arbitrary sizes
  Deals with disk-resident reduction objects
Outperforms PEGASUS for both the naïve and optimized implementations for all three graph mining applications
  Has simpler code
Offers a promising alternative for developing efficient data-intensive applications
  Uses GIM-V for parallelizing graph mining
Thank You, and Acknowledgments
Questions and comments
  Wei Jiang - jiangwei@cse.ohio-state.edu
  Gagan Agrawal - agrawal@cse.ohio-state.edu
This project was supported by: