
GraphX: Graph Processing in a Distributed Dataflow Framework

Joseph Gonzalez, Postdoc, UC Berkeley AMPLab; Co-founder, GraphLab Inc.
Joint work with Reynold Xin, Ankur Dave, Daniel Crankshaw, Michael Franklin, and Ion Stoica

OSDI 2014, UC Berkeley

Modern Analytics

An example pipeline over raw Wikipedia (XML):

Link Table (Title, Link) → Hyperlinks → PageRank → Top 20 Pages (Title, PR)
Editor Graph → Community Detection → User Community (User, Com.)
Discussion Table (User, Disc.) → Top Communities (Com., PR)

Some stages of the pipeline operate on Tables, others on Graphs. Today, the two views are handled by Separate Systems.

Separate Systems

Dataflow systems: data flows as Rows through a Table to produce a Result. Graph systems: computation runs over a Dependency Graph, updating vertex state in place (before/after values at each vertex).

Difficult to Use

Users must learn, deploy, and manage multiple systems, which leads to brittle and often complex interfaces.

Inefficient

Extensive data movement and duplication across the network and file system (XML → HDFS → HDFS → HDFS → HDFS), and limited reuse of internal data structures across stages.

GraphX, built on the Spark dataflow framework, unifies computation on Tables and Graphs: a Table View and a Graph View of the same physical data, enabling a single system to easily and efficiently support the entire pipeline.


PageRank on the LiveJournal Graph (runtime in seconds, 10 iterations):

GraphLab: 22 s
Spark: 354 s
Hadoop: 1340 s

Hadoop is 60x slower than GraphLab; Spark is 16x slower than GraphLab.

Key Question

How can we naturally express and efficiently execute graph computation in a general-purpose dataflow framework?

Approach: distill the lessons learned from specialized graph systems, in two parts: Representation and Optimizations.


Example Computation: PageRank

Express computation locally: the rank of page i is a weighted sum of its neighbors' ranks plus a random reset probability. Iterate until convergence:

R[i] = 0.15 + Σ_{j ∈ InLinks(i)} R[j] / |OutLinks(j)|

“Think like a Vertex.” - Malewicz et al., SIGMOD’10
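The per-vertex update above can be sketched in plain Scala over an in-memory graph (a toy sketch, not the GraphX API; `inLinks` and `outDegree` are hypothetical helpers, and the 0.85 damping factor follows the deck's later Pregel example):

```scala
// Toy PageRank step over an in-memory graph (not the GraphX API).
// inLinks(i): pages linking to i; outDegree(j): number of out-links of j.
object PageRankSketch {
  def step(ranks: Map[Int, Double],
           inLinks: Map[Int, Seq[Int]],
           outDegree: Map[Int, Int]): Map[Int, Double] =
    ranks.map { case (i, _) =>
      // Weighted sum of in-neighbors' ranks, plus the random-reset term.
      val total = inLinks.getOrElse(i, Seq.empty)
                         .map(j => ranks(j) / outDegree(j))
                         .sum
      i -> (0.15 + 0.85 * total)
    }
}
```

Iterating `step` until the ranks stop changing implements "iterate until convergence" from the slide.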

Graph-Parallel Pattern (Gonzalez et al. [OSDI'12])

1. Gather information from neighboring vertices
2. Apply an update to the vertex property
3. Scatter information to neighboring vertices

Many Graph-Parallel Algorithms

Machine Learning:
Collaborative Filtering » Alternating Least Squares » Stochastic Gradient Descent » Tensor Factorization
Structured Prediction » Loopy Belief Propagation » Max-Product Linear Programs » Gibbs Sampling
Semi-supervised ML » Graph SSL » CoEM

Network Analysis:
Community Detection » Triangle Counting » K-core Decomposition » K-Truss
Graph Analytics » PageRank » Personalized PageRank » Shortest Path » Graph Coloring

Graph System Optimizations

The specialized computational pattern enables specialized graph optimizations:
» Specialized data structures
» Vertex-cut partitioning
» Remote caching / mirroring
» Message combiners
» Active-set tracking

Distilling graph systems into dataflow terms:

Representation: distributed graphs → horizontally partitioned tables; vertex programs → joins and dataflow operators.
Optimizations: advances in graph-processing systems → distributed join optimization and materialized view maintenance.

Property Graph Data Model

A property graph over vertices {A, B, C, D, E, F}.

Vertex properties: » user profile » current PageRank value
Edge properties: » weights » timestamps

Encoding Property Graphs as Tables

The property graph is split across machines with a Vertex Cut:

» Vertex Table (RDD): the vertex properties A–F, horizontally partitioned (Part. 1, Part. 2).
» Edge Table (RDD): edges as (src, dst) pairs, e.g. A→B, A→C, C→D, B→C on machine 1 and A→E, A→F, E→F, E→D on machine 2.
» Routing Table (RDD): for each vertex, the edge partitions that reference it (e.g. A appears in partitions 1 and 2; B only in partition 1).

Separate Properties and Structure

Reuse structural information across multiple graphs: transforming the vertex properties of an input graph produces a transformed graph with a new vertex table, while the routing table and edge table are shared unchanged.
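The three-table encoding can be sketched in plain Scala (hypothetical types, not GraphX internals): edges are assigned to partitions, and the routing table records, for each vertex, which edge partitions reference it.

```scala
// Hypothetical three-table encoding of a property graph (not GraphX internals).
case class Edge[E](src: Long, dst: Long, attr: E)

// Given edges already assigned to partitions, build the routing table:
// vertex id -> set of edge partitions that mention it (the vertex cut).
def routingTable[E](parts: Map[Int, Seq[Edge[E]]]): Map[Long, Set[Int]] =
  parts.toSeq
    .flatMap { case (p, es) => es.flatMap(e => Seq(e.src -> p, e.dst -> p)) }
    .groupBy { case (v, _) => v }
    .map { case (v, pairs) => v -> pairs.map(_._2).toSet }
```

Because the routing table depends only on graph structure, it can be reused unchanged when vertex properties are transformed.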

Table Operators

Table operators are inherited from Spark: map, filter, groupBy, sort, union, join, leftOuterJoin, rightOuterJoin, reduce, count, fold, reduceByKey, groupByKey, cogroup, cross, zip, sample, take, first, partitionBy, mapWith, pipe, save, ...

Graph Operators (Scala)

class Graph[V, E] {
  def Graph(vertices: Table[(Id, V)], edges: Table[(Id, Id, E)])

  // Table Views -----------------
  def vertices: Table[(Id, V)]
  def edges: Table[(Id, Id, E)]
  def triplets: Table[((Id, V), (Id, V), E)]

  // Transformations ------------------------------
  def reverse: Graph[V, E]
  def subgraph(pV: (Id, V) => Boolean,
               pE: Edge[V, E] => Boolean): Graph[V, E]
  def mapV(m: (Id, V) => T): Graph[T, E]
  def mapE(m: Edge[V, E] => T): Graph[V, T]

  // Joins ----------------------------------------
  def joinV(tbl: Table[(Id, T)]): Graph[(V, T), E]
  def joinE(tbl: Table[(Id, Id, T)]): Graph[V, (E, T)]

  // Computation ----------------------------------
  def mrTriplets(mapF: (Edge[V, E]) => List[(Id, T)],
                 reduceF: (T, T) => T): Graph[T, E]
}

The mrTriplets operator captures the Gather-Scatter pattern from specialized graph-processing systems:

def mrTriplets(mapF: (Edge[V, E]) => List[(Id, T)],
               reduceF: (T, T) => T): Graph[T, E]

Triplets Join Vertices and Edges

The triplets operator joins vertices and edges: for each edge (e.g. A→B, A→C, B→C, C→D), the source and destination vertex properties are attached to the edge to form a triplet.

Map-Reduce Triplets

Map-Reduce triplets collects information about the neighborhood of each vertex: the map function is applied to each triplet (A→B, A→C, B→C, C→D) and emits a message keyed by the source or destination vertex, e.g. MapFunction(A→B) → (B, msg). Messages with the same key are then combined by the reduce function, e.g. (C, msg + msg), acting as message combiners.
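The map-reduce pattern described here can be sketched over plain collections: map each triplet to keyed messages, then reduce messages by target vertex (a semantics-only sketch with hypothetical names, not the distributed implementation):

```scala
// Semantics of mrTriplets over in-memory collections (not distributed).
def mrTripletsSketch[V, E, T](
    vertices: Map[Long, V],
    edges: Seq[(Long, Long, E)],
    mapF: ((Long, V), (Long, V), E) => Seq[(Long, T)],
    reduceF: (T, T) => T): Map[Long, T] =
  edges
    // Form each triplet and emit messages keyed by a vertex id.
    .flatMap { case (s, d, e) => mapF((s, vertices(s)), (d, vertices(d)), e) }
    // Combine all messages destined for the same vertex (message combiners).
    .groupBy { case (id, _) => id }
    .map { case (id, msgs) => id -> msgs.map(_._2).reduce(reduceF) }
```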

Using these basic GraphX operators we implemented Pregel and GraphLab in under 50 lines of code!

The GraphX Stack (Lines of Code)

Spark (30,000) → GraphX (2,500) → Pregel API (34) → PageRank (20), Connected Comp. (20), K-core (60), Triangle Count (50), LDA (220), SVD++ (110)

Some algorithms are more naturally expressed using the GraphX primitive operators.

Optimizations

Distilled from advances in graph-processing systems: distributed join optimization and materialized view maintenance.

Join Site Selection using Routing Tables

To build triplets, vertex properties from the Vertex Table (RDD) are shipped only to the edge partitions that reference them, as recorded in the Routing Table (RDD): e.g. A is sent to partitions 1 and 2, while B is sent only to partition 1. Never shuffle edges!

Caching for Iterative mrTriplets

Each edge partition keeps a mirror cache of the vertex properties it references (machine 1 caches B, C, D, A; machine 2 caches D, E, F, A). The vertex table and the caches are scanned using a reusable hash index.

Incremental Updates for Triplets View

When only a few vertex properties change (e.g. A and E), only those changed entries are shipped to the mirror caches; the unchanged entries are reused rather than re-shuffled.

Aggregation for Iterative mrTriplets

Messages from changed vertices are partially aggregated locally on each edge partition (local aggregates for B, C, D, F) before being combined at the vertex table, reducing communication.

Reduction in Communication Due to Cached Updates

Figure: network communication (MB, log scale 0.1 to 10,000) per iteration (0-16) for Connected Components on the Twitter graph. Communication drops sharply after the first few iterations: most vertices are within 8 hops of all vertices in their component.
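Connected components here is label propagation: each vertex starts with its own id and repeatedly adopts the minimum label among itself and its neighbors, which is why the hop count above bounds the number of iterations. A toy in-memory sketch (not GraphX code; assumes a symmetric adjacency list):

```scala
// Toy connected components by iterative min-label propagation.
// adj must list each undirected edge in both directions.
def connectedComponents(adj: Map[Long, Seq[Long]]): Map[Long, Long] = {
  var labels = adj.keys.map(v => v -> v).toMap
  var changed = true
  while (changed) {
    val next = labels.map { case (v, l) =>
      v -> (l +: adj(v).map(labels)).min // adopt the smallest nearby label
    }
    changed = next != labels
    labels = next
  }
  labels
}
```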

Benefit of Indexing Active Vertices

Figure: runtime (seconds, 0-30) per iteration (0-16) for Connected Components on the Twitter graph, with and without active-vertex tracking. With active-vertex tracking, per-iteration runtime falls as vertices converge; without it, every iteration costs roughly the same.

Join Elimination

Identify and bypass joins for unused triplet fields, via Java bytecode inspection.

Figure: communication (MB, 0-14,000) per iteration (0-20) for PageRank on Twitter, with and without join elimination. Join elimination yields a factor of 2 reduction in communication (lower is better).

Additional Optimizations

Indexing and bitmaps:
» To accelerate joins across graphs
» To efficiently construct subgraphs

Lineage-based fault tolerance:
» Exploits Spark lineage to recover in parallel
» Eliminates the need for costly checkpoints

Substantial index and data reuse:
» Reuse routing tables across graphs and subgraphs
» Reuse edge adjacency information and indices

System Comparison

Goal: demonstrate that GraphX achieves performance parity with specialized graph-processing systems.

Setup: 16-node EC2 cluster (m2.4xlarge, 8 cores each) + 1GigE. Compare against GraphLab/PowerGraph (C++), Giraph (Java), and Spark (Java/Scala).

PageRank Benchmark

Figure: runtime in seconds on the Twitter graph (42M vertices, 1.5B edges; axis 0-3,500) and the UK graph (106M vertices, 3.7B edges; axis 0-9,000), on the EC2 cluster of 16 x m2.4xlarge (8 cores) + 1GigE; chart annotations: 7x, 18x.

GraphX performs comparably to state-of-the-art graph processing systems.

Connected Components Benchmark

Figure: runtime in seconds on the Twitter graph (42M vertices, 1.5B edges; axis 0-2,500) and the UK graph (106M vertices, 3.7B edges; axis 0-9,000), same cluster; one system runs out of memory on the UK graph; chart annotations: 8x, 10x.

GraphX performs comparably to state-of-the-art graph processing systems.

Graphs are just one stage... What about a pipeline?

A Small Pipeline in GraphX: HDFS → Spark Preprocess → Compute → Spark Post. → HDFS, over raw Wikipedia (XML) → Hyperlinks → PageRank → Top 20 Pages.

Timed end-to-end, GraphX is the fastest. Total runtime (seconds): Spark 1492, Giraph + Spark 605, GraphLab + Spark 375, GraphX 342.

Adoption and Impact

GraphX is now part of Apache Spark
» Part of the Cloudera Hadoop distribution

In production at Alibaba Taobao
» Order-of-magnitude gains over Spark

Inspired GraphLab Inc. SFrame technology
» Unifies tables and graphs on disk

GraphX → Unified Tables and Graphs

Enabling users to easily and efficiently express the entire analytics pipeline:
» New API: blurs the distinction between tables and graphs
» New system: unifies data-parallel and graph-parallel systems

Specialized graph systems → integrated frameworks (GraphX).

What did we learn?

The lessons of specialized graph systems can be folded into an integrated framework (GraphX).

Future Work

Can the same be done for parameter servers? Open challenges: asynchrony, non-determinism, shared state.

Thank You

jegonzal@eecs.berkeley.edu

http://amplab.cs.berkeley.edu/projects/graphx/

Reynold Xin

Ankur Dave

Daniel Crankshaw

Michael Franklin

Ion Stoica

Related Work

Specialized graph-processing systems: GraphLab [UAI'10], Pregel [SIGMOD'10], Signal-Collect [ISWC'10], Combinatorial BLAS [IJHPCA'11], GraphChi [OSDI'12], PowerGraph [OSDI'12], Ligra [PPoPP'13], X-Stream [SOSP'13]

Alternatives to dataflow frameworks: Naiad [SOSP'13]: GraphLINQ; Hyracks: Pregelix [VLDB'15]

Distributed join optimization: Multicast Join [Afrati et al., EDBT'10]; Semi-Join in MapReduce [Blanas et al., SIGMOD'10]

Edge Files Have Locality

Figure: runtime (seconds, 0-1,200) on the UK graph (106M vertices, 3.7B edges) for GraphLab, GraphX + Shuffle, and GraphX. GraphLab rebalances the edge files on load; GraphX preserves the on-disk layout through Spark → a better vertex cut.

Scalability

Figure: runtime vs. number of EC2 nodes (8-64) on the Twitter graph (42M vertices, 1.5B edges), with a linear-scaling reference line. GraphX scales slightly better than PowerGraph/GraphLab.

Apache Spark Dataflow Platform

Resilient Distributed Datasets (RDDs): Zaharia et al., NSDI'12

HDFS → Load → RDD → Map → RDD → Reduce → RDD → HDFS

Optimized for iterative access to data: calling .cache() persists an RDD in memory for reuse.


Shared Memory Advantage

Spark uses a shared-nothing model: cores exchange data through shuffle files. GraphLab uses shared memory: cores share a de-serialized in-memory graph.

Figure: PageRank runtime (seconds, 0-500) on the Twitter graph (42M vertices, 1.5B edges) for GraphLab, GraphLab NoSHM (shared memory disabled; cores communicate over TCP/IP), and GraphX, quantifying how much of GraphLab's advantage comes from shared memory.


Fault Tolerance

Leverages the Spark fault-tolerance mechanism. Figure: runtime (seconds, 0-1,000) with no failure vs. recovery via lineage restart.

Graph-Processing Systems

Expose a specialized API to simplify graph programming: Google Pregel, GraphLab, CombBLAS, Kineograph, X-Stream, GraphChi, Ligra, GPS.

The Vertex-Program Abstraction

Pregel_PageRank(i, messages):
  // Receive all the messages
  total = 0
  foreach (msg in messages):
    total = total + msg

  // Update the rank of this vertex
  R[i] = 0.15 + total

  // Send new messages to neighbors
  foreach (j in out_neighbors[i]):
    Send msg(R[i]) to vertex j

The Vertex-Program Abstraction (Low, Gonzalez, et al. [UAI'10])

GraphLab_PageRank(i):
  // Compute sum over neighbors
  total = 0
  foreach (j in neighbors(i)):
    total += R[j] * w_ji

  // Update the PageRank
  R[i] = 0.15 + total

Example: Oldest Follower

Given a follower graph over users A-F with ages attached to the vertices (16, 19, 23, 30, 42, 70, 75), calculate the number of older followers for each user:

val olderFollowerAge = graph
  .mrTriplets(
    e => // Map
      if (e.src.age > e.dst.age) List((e.srcId, 1)) else List.empty,
    (a, b) => a + b // Reduce
  )
  .vertices

Enhanced Pregel in GraphX

GraphX modifies Pregel in two ways: messages must be pre-combined (the vertex program receives a single messageSum rather than a message list), and message computation is moved out of the vertex program:

pregelPR(i, messageSum):
  // Update the rank of this vertex
  R[i] = 0.15 + messageSum

sendMsg(i→j, R[i], R[j], E[i,j]):
  // Compute a single message
  return msg(R[i] / E[i,j])

combineMsg(a, b):
  // Compute the sum of two messages
  return a + b

PageRank in GraphX

// Load and initialize the graph
val graph = GraphBuilder.text("hdfs://web.txt")
val prGraph = graph.joinVertices(graph.outDegrees)

// Implement and run PageRank
val pageRank =
  prGraph.pregel(initialMessage = 0.0, iter = 10)(
    (oldV, msgSum) => 0.15 + 0.85 * msgSum,
    triplet => triplet.src.pr / triplet.src.deg,
    (msgA, msgB) => msgA + msgB)

Example Analytics Pipeline

// Load raw data tables
val articles = sc.textFile("hdfs://wiki.xml").map(xmlParser)
val links = articles.flatMap(article => article.outLinks)

// Build the graph from tables
val graph = new Graph(articles, links)

// Run the PageRank algorithm
val pr = graph.PageRank(tol = 1.0e-5)

// Extract and print the top 20 articles
val topArticles = articles.join(pr).top(20).collect
for ((article, pageRank) <- topArticles) {
  println(article.title + '\t' + pageRank)
}

Apache Spark Dataflow Platform

Resilient Distributed Datasets (RDDs): Zaharia et al., NSDI'12

Lineage: each RDD records how it was derived (HDFS → Load → RDD → Map → RDD → Reduce → RDD), so lost partitions can be recomputed; .cache() persists an RDD in memory.