1 QSX: Querying Social Graphs Graph algorithms in MapReduce MapReduce: an introduction BFS for...


QSX: Querying Social Graphs

Graph algorithms in MapReduce

MapReduce: an introduction

BFS for distance queries

PageRank

Keyword search

Subgraph isomorphism


Motivation for studying parallel algorithms

Graph queries are costly
• BFS for reachability: linear time O(|V| + |E|)
• kNN join: quadratic time
• Graph pattern matching by graph simulation: quadratic time

Worse still
• Regular path queries (finding simple paths): NP-complete
• Graph pattern matching via subgraph isomorphism: NP-complete

Can we efficiently evaluate graph queries?

Furthermore, real-life graphs are typically big

Facebook: 1.38 billion nodes and 140 billion links


The impact of the sheer volume of big data

Using an SSD of 6 GB/s, a linear scan of a data set D would take
• 1.9 days when D is of 1 PB (10^15 B)
• 5.28 years when D is of 1 EB (10^18 B)

Is it feasible to query real-life big graphs?

A departure from classical computational complexity theory

Traditional computational complexity theory of almost 50 years:

• The good: polynomial time computable (PTIME)

• The bad: NP-hard (intractable)

• The ugly: PSPACE-hard, EXPTIME-hard, undecidable…

Parallel query answering

We can do better provided more resources

10,000 processors

How to capitalize on the resources and reduce response time?

Using 10000 SSDs of 6 GB/s, a linear scan of D might take:
• 1.9 days/10000 = 16 seconds when D is of 1 PB (10^15 B)
• 5.28 years/10000 = 4.63 hours when D is of 1 EB (10^18 B)

Only ideally!
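The scan-time figures above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that "6G/s" means 6 × 10^9 bytes per second, that a year has 365 days, and that adding disks gives a perfect linear speedup (the "only ideally" case). The function name `scan_seconds` is made up for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365  # assumption: a 365-day year


def scan_seconds(size_bytes: float, rate_bytes_per_s: float = 6e9, disks: int = 1) -> float:
    """Seconds to scan size_bytes when `disks` devices read in parallel at the given rate.

    Assumes perfect linear speedup, which real systems rarely reach.
    """
    return size_bytes / (rate_bytes_per_s * disks)


PB, EB = 1e15, 1e18

one_pb_days = scan_seconds(PB) / SECONDS_PER_DAY                    # about 1.93 days
one_eb_years = scan_seconds(EB) / SECONDS_PER_DAY / DAYS_PER_YEAR   # about 5.28 years
one_pb_parallel_s = scan_seconds(PB, disks=10_000)                  # about 16.7 seconds
one_eb_parallel_h = scan_seconds(EB, disks=10_000) / 3600           # about 4.63 hours

print(one_pb_days, one_eb_years, one_pb_parallel_s, one_eb_parallel_h)
```

Dividing 5.28 years by 10,000 gives roughly 0.19 days, which is why the exabyte case with 10,000 disks is measured in hours.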

[Figure: processors P, each with its own memory M and database DB, connected by an interconnection network]

Pipelined parallelism

Pipelining: a sequence of operations on each data item, each conducted by one processor

What are the limitations?

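[Figure: processors P, each with memory M and database DB, connected by an interconnection network; the data flows through operations op1, op2, …, opn, one per processor]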

Parallel query answering

Given a big graph G, and n processors S1, …, Sn
• G is partitioned into fragments (G1, …, Gn)
• G is distributed to n processors: Gi is stored at Si

Dividing a big G into small fragments of manageable size

Each processor Si processes operations for a query on its local fragment Gi, in parallel

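[Figure: instead of evaluating Q(G) on the whole graph G, the query is evaluated as Q(G1), Q(G2), …, Q(Gn) on the fragments in parallel]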

MapReduce


MapReduce

A programming model (from Google) with two primitive functions:

Map: <k1, v1> → list (k2, v2)

Reduce: <k2, list(v2)> → list (k3, v3)

How does it work?

Input: a list of key-value pairs <k1, v1>

Map: applied to each pair, computes key-value pairs <k2, v2>
• The intermediate key-value pairs are hash-partitioned based on k2. Each partition (k2, list(v2)) is sent to a reducer

Reduce: takes a partition as input, and computes key-value pairs <k3, v3>

The process may iterate: multiple map/reduce steps
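The map, group-by-k2 and reduce steps described above can be imitated in a single process. The following is a minimal in-memory sketch, not Hadoop's API: `map_reduce`, `emit_inlinks` and `count` are hypothetical names, and grouping with a dictionary stands in for the hash partitioning done by a real framework. The example job counts the in-degree of each node of a small made-up graph.

```python
from collections import defaultdict
from typing import Any, Callable, Iterable, List, Tuple

Pair = Tuple[Any, Any]


def map_reduce(
    records: Iterable[Pair],
    map_fn: Callable[[Any, Any], Iterable[Pair]],
    reduce_fn: Callable[[Any, List[Any]], Iterable[Pair]],
) -> List[Pair]:
    """Run one map/reduce step over (k1, v1) records and return (k3, v3) pairs."""
    groups = defaultdict(list)
    for k1, v1 in records:                  # map: applied to each input pair
        for k2, v2 in map_fn(k1, v1):
            groups[k2].append(v2)           # shuffle: group intermediate values by k2
    output: List[Pair] = []
    for k2, values in groups.items():       # reduce: one call per (k2, list(v2))
        output.extend(reduce_fn(k2, values))
    return output


def emit_inlinks(node: str, adj_list: List[str]) -> Iterable[Pair]:
    """Map: for each edge (node, m), send a 1 to the destination m."""
    for m in adj_list:
        yield m, 1


def count(node: str, ones: List[int]) -> Iterable[Pair]:
    """Reduce: sum the 1s received by a node, giving its in-degree."""
    yield node, sum(ones)


# Hypothetical example graph as (node id, adjacency list) records.
graph = [("a", ["b", "c"]), ("b", ["c"]), ("c", [])]
in_degree = dict(map_reduce(graph, emit_inlinks, count))
print(in_degree)
```

Node "a" has no incoming edge, so it never appears as an intermediate key and is absent from the output; the graph algorithms later in the deck avoid this by also emitting the node itself from the mapper.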


Architecture (Hadoop)

No need to worry about how the data is stored and sent

[Figure: MapReduce data flow in Hadoop]
• Input <k1, v1>: stored in DFS, partitioned in blocks (64M); one block for each mapper (a map task)
• Mappers emit <k2, v2>: kept in the local store of mappers
• Hash partition (k2) sends the intermediate pairs to the reducers
• Reducers aggregate results and emit <k3, v3>
• Multiple steps

What parallelism?

Data partitioned parallelism

[Figure: MapReduce data flow; the mappers run in parallel on the input blocks <k1, v1>, and the reducers run in parallel on the partitions of <k2, v2>, producing <k3, v3>]

Parallel computation

Popular in industry

Apache Hadoop, used by Facebook, Yahoo, …
• Hive, Facebook, HiveQL (SQL)
• PIG, Yahoo, Pig Latin (SQL like)
• SCOPE, Microsoft, SQL
• Cassandra, Facebook, CQL (no join)
• HBase, Google, distributed BigTable
• MongoDB, document-oriented (NoSQL)

Scalability

– Yahoo!: 10,000 cores, for Web search queries (2008)

– Facebook: 100 PB, about half a PB per day

– Amazon Elastic Compute Cloud (EC2) and Amazon Simple Storage Service (S3); the New York Times used 100 EC2 instances to process 4TB of image data, for $240

Study Spark: https://spark.apache.org/


Advantages of MapReduce

Simple: one only needs to define two functions

no need to worry about how the data is stored, distributed and how the operations are scheduled

Fault tolerance: why?

scalability: a large number of low end machines
• scale out (scale horizontally): adding a new computer to a distributed software application; low-cost “commodity”
• scale up (scale vertically): upgrade, add (costly) resources to a single node

flexibility: independent of data models or schema

independence: it can work with various storage layers


Fault tolerance

Able to handle an average of 1.2 failures per analysis job

[Figure: MapReduce data flow from <k1, v1> through mappers and reducers to <k3, v3>; the data is triplicated]

Detecting failures and reassigning the tasks of failed nodes to healthy nodes

Redundancy checking to achieve load balancing


MapReduce algorithms

Input: query Q and graph G

Output: answers Q(G) to Q in G

map(key: node, value: (adjacency-list, others))
{ computation;
  emit (mkey, mvalue)
}

reduce(key: __ , value: list[value])
{ …
  emit (rkey, rvalue)
}

Compatibility
• the key and values taken by reduce must match mkey, mvalue
• the key and value taken by map must match rkey, rvalue when multiple iterations of MapReduce are needed

Control flow

Preprocessing: copy files from the input directory to staging dir 1

Iterations of MapReduce:

while (termination condition is not satisfied) do {
  a) map from staging dir 1;
  b) reduce into staging dir 2;
  c) move files from staging dir 2 to staging dir 1
}

Postprocessing: move files from staging dir 2 to the output dir

Termination: decided by a non-MapReduce driver program

Functional programming: no global data structures accessible and mutable by all
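The control flow above can be sketched as a plain loop that lives outside the map and reduce functions. This is an illustrative sketch only: `run_driver` and `one_round` are hypothetical names, in-memory values stand in for the staging directories, and "the state did not change" is used as the termination condition.

```python
from typing import Callable, Tuple, TypeVar

State = TypeVar("State")


def run_driver(
    state: State,
    one_round: Callable[[State], State],
    max_rounds: int = 100,
) -> Tuple[State, int]:
    """Repeat one MapReduce round until the state stops changing.

    Returns the final state and the number of rounds executed.
    """
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        new_state = one_round(state)      # a) map from staging 1, b) reduce into staging 2
        changed = new_state != state      # termination test, done by the driver
        state = new_state                 # c) staging 2 becomes the next round's staging 1
        if not changed:
            break
    return state, rounds


# Hypothetical example: grow the set of nodes reachable from "s" by one hop per round.
edges = {"s": ["a"], "a": ["b"], "b": []}


def expand(reached: frozenset) -> frozenset:
    """One round: add all successors of the nodes reached so far."""
    return frozenset(reached | {m for n in reached for m in edges[n]})


reached, rounds = run_driver(frozenset({"s"}), expand)
print(sorted(reached), rounds)
```

The last round does no useful work; it only confirms that nothing changed, which is the price of detecting convergence from outside the computation.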

BFS for distance queries


Dijkstra’s algorithm for distance queries

Distance: single-source shortest-path problem
• Input: A directed weighted graph G, and a node s in G
• Output: The lengths of shortest paths from s to all nodes in G

Dijkstra (G, s, w):
1. for all nodes v in V do
   a. d[v] ← ∞;
2. d[s] ← 0; Que ← V;
3. while Que is nonempty do
   a. u ← ExtractMin(Que);          (extract one with the minimum d[u])
   b. for all nodes v in adj(u) do
      a) if d[v] > d[u] + w(u, v) then d[v] ← d[u] + w(u, v);

Use a priority queue Que; w(u, v): weight of edge (u, v); d[u]: the distance from s to u

Complexity: O(|V| log|V| + |E|).
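For comparison with the MapReduce version, the sequential algorithm can be written as follows. This is a sketch using Python's `heapq`, a binary heap with lazy deletion of stale entries, so it runs in O((|V| + |E|) log |V|) rather than the bound quoted above, which needs a priority queue with a cheaper decrease-key operation. The small example graph is made up for illustration.

```python
import heapq
from math import inf
from typing import Dict, Hashable, List, Tuple

Graph = Dict[Hashable, List[Tuple[Hashable, float]]]


def dijkstra(adj: Graph, s: Hashable) -> Dict[Hashable, float]:
    """Return d[v], the length of a shortest path from s to every node v.

    adj maps every node to its list of (neighbor, weight) pairs;
    weights are assumed to be non-negative.
    """
    d = {v: inf for v in adj}
    d[s] = 0
    heap = [(0, s)]
    done = set()
    while heap:
        du, u = heapq.heappop(heap)        # ExtractMin
        if u in done:
            continue                       # stale entry: u was already finalized
        done.add(u)
        for v, w in adj[u]:
            if d[v] > du + w:              # relax edge (u, v)
                d[v] = du + w
                heapq.heappush(heap, (d[v], v))
    return d


# Hypothetical example graph.
example = {"s": [("a", 4), ("b", 1)], "a": [], "b": [("a", 2)]}
print(dijkstra(example, "s"))
```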

MapReduce?


A MapReduce algorithm

key-value pairs

Input: graph G, represented by adjacency lists

Node N:

• Node id: nid n

• N.distance: from start node s to N

• N.AdjList: [(m, w(n, m))], node id and weight of edge (n, m)

Key: node id n

Value of node N:

• Distance: from start node s to n got so far

• Node N (id, AdjList, etc)

Different structures


Mapper

Data-partitioned parallelism, parallel processing: all nodes are processed in parallel, each by a mapper

For each node m adjacent to n, emit a revised distance via n

Map (nid n, node N)
• d ← N.distance;
• emit (nid n, N);
• for each (m, w) in N.AdjList do
  • emit (nid m, d + w(n, m));        (revise distance of m via n)

Why emit (nid n, N)? To preserve the graph structure for iterative processing


Reducer

list for m: distances from all predecessors so far, plus node M itself, which must exist (from Mapper)

Reduce (nid m, list)
• dmin ← ∞;
• for all d in list do
  • if IsNode(d)
  •   then M ← d;
  •   else if d < dmin
  •     then dmin ← d;
• M.distance ← dmin;
• emit (nid m, node M);

Notes
• Group by node id: each d in list is either a distance to m from a predecessor, or node m
• Node M is always there. Why? The mapper emits (nid m, M) for every node
• dmin: minimum distance so far
• M.distance is updated for this round: the minimum from all predecessors

Iterations and termination

Termination control

Each MapReduce iteration advances the “known frontier” by one hop

Subsequent iterations include more and more reachable nodes as frontier expands

Multiple iterations are needed to explore entire graph

Termination: when the intermediate result no longer changes

No node n has its N.distance changed in the last round

controlled by a non-MapReduce driver

Use a flag – inspected by non-MapReduce driver
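The mapper, the reducer and the driver described above can be put together in a short in-memory sketch. Several things here are assumptions rather than part of the slides: the names `sssp_map`, `sssp_reduce` and `sssp`; the representation of a node as a `(distance, adjacency list)` tuple; and the example graph, whose edges are reconstructed from the mapper outputs in the iteration slides, with the edge d → s of weight 7 being a guess. The reducer also takes the minimum with the node's current distance, so that the source keeps distance 0 even though no predecessor offers it a smaller value; the pseudocode leaves this implicit.

```python
from collections import defaultdict
from math import inf

# Assumed example graph: node id -> list of (successor, edge weight).
GRAPH = {
    "s": [("a", 10), ("c", 5)],
    "a": [("b", 1), ("c", 2)],
    "b": [("d", 4)],
    "c": [("a", 3), ("b", 9), ("d", 2)],
    "d": [("b", 6), ("s", 7)],   # the edge d -> s is a guess
}


def sssp_map(n, node):
    """Emit the node itself, then a revised distance for every successor."""
    dist, adj = node
    yield n, node                      # preserve graph structure
    for m, w in adj:
        yield m, dist + w              # distance of m via n (inf stays inf)


def sssp_reduce(m, values):
    """Pick the smallest offered distance and store it in node M."""
    node, dmin = None, inf
    for v in values:
        if isinstance(v, tuple):       # IsNode(v): the node record itself
            node = v
        elif v < dmin:
            dmin = v
    dist, adj = node
    return m, (min(dist, dmin), adj)   # assumption: never increase a known distance


def sssp(graph, source, max_rounds=100):
    """Driver: iterate map/reduce rounds until no distance changes."""
    nodes = {n: (0 if n == source else inf, adj) for n, adj in graph.items()}
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        groups = defaultdict(list)
        for n, node in nodes.items():
            for key, value in sssp_map(n, node):
                groups[key].append(value)              # group by node id
        new_nodes = dict(sssp_reduce(m, vals) for m, vals in groups.items())
        changed = new_nodes != nodes                   # the flag inspected by the driver
        nodes = new_nodes
        if not changed:
            break
    return {n: d for n, (d, _) in nodes.items()}, rounds


dist, rounds = sssp(GRAPH, "s")
print(dist, rounds)
```

On this graph the distances stop changing in the fourth round, so the last round only confirms convergence.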

Iteration 0: Base case

mapper: (a,<s,10>) (c,<s,5>) edges

reducer: (a,<10, ...>) (c,<5, ...>)

[Figure: the "wave" starts at s. Nodes s, a, b, c, d with distances s = 0 and a = b = c = d = ∞; edge weights 10, 5, 2, 3, 1, 9, 7, 2, 4, 6]

Iteration 1

mapper: (a,<s,10>) (c,<s,5>) (a,<c,8>) (c,<a,12>) (b,<a,11>) (b,<c,14>) (d,<c,7>) edges

reducer: (a,<8, ...>) (c,<5, ...>) (b,<11, ...>) (d,<7, ...>)

group (a,<s,10>) and (a,<c,8>)

[Figure: the "wave" reaches a and c. Distances shown: s = 0, a = 10, c = 5, b = ∞, d = ∞]

Iteration 2

mapper: (a,<s,10>) (c,<s,5>) (a,<c,8>) (c,<a,10>) (b,<a,9>) (b,<c,14>) (d,<c,7>) (b,<d,13>) (d,<b,15>) edges

reducer: (a,<8>) (c,<5>) (b,<9>) (d,<7>)

[Figure: the "wave" reaches b and d. Distances shown: s = 0, a = 8, c = 5, b = 9, d = 7]

Iteration 3

mapper: (a,<s,10>) (c,<s,5>) (a,<c,8>) (c,<a,10>) (b,<a,9>) (b,<c,14>) (d,<c,7>) (b,<d,13>) (d,<b,13>) edges

reducer: (a,<8>) (c,<5>) (b,<9>) (d,<7>)

No change! Convergence!

[Figure: final distances s = 0, a = 8, c = 5, b = 9, d = 7]

Efficiency?

MapReduce explores all paths in parallel

Each MapReduce iteration advances the “known frontier” by one hop
• Redundant work, since useful work is only done at the “frontier”

Dijkstra’s algorithm is more efficient
• At any step it only pursues edges from the minimum-cost path inside the frontier

Any other sources of inefficiency? Skew

A closer look

Data partitioned parallelism
• Local computation at each node in mapper, in parallel: attributes of the node, adjacent edges and local link structures
• Propagating computations: traversing the graph; this may involve iterative MapReduce

Tips:
• Adjacency lists
• Local computation in mapper
• Pass along partial results via outlinks, keyed by destination node
• Perform aggregation in reducer on inlinks to a node
• Iterate until convergence: controlled by external “driver”; need a way to test for convergence
• Pass graph structures between iterations

PageRank


PageRank

The likelihood that page v is visited by a random walk:

$P(v) = \alpha \cdot \frac{1}{|V|} + (1 - \alpha) \sum_{u \in L(v)} \frac{P(u)}{C(u)}$

• first term: random jump
• second term: following a link from other pages; L(v) is the set of pages that link to v, and C(u) is the number of outgoing links of u

Recursive computation: for each page v in G,
• compute P(v) by using P(u) for all u ∈ L(v)
until
• converge: no changes to any P(v), or
• after a fixed number of iterations

How to speed it up?

A MapReduce algorithm

Assume that α = 0

Simplified version: $P(v) = \sum_{u \in L(v)} \frac{P(u)}{C(u)}$

Node N:
• Node id: nid n
• N.rank: the current rank
• N.AdjList: [m], node id

Key: node id n

Value of node N:
• rank: a rank of a node
• Node N (id, AdjList, etc)

Mapper

Local computation in mapper; parallel processing: all nodes are processed in parallel, each by a mapper

Pass PageRank at n to successors of n

Map (nid n, node N)
• p ← N.rank/|N.AdjList|;            (P(u)/C(u))
• emit (nid n, N);
• for each m in N.AdjList do
  • emit (nid m, p);                 (pass rank to neighbors)

emit (nid n, N): preserve graph structure for iterative processing

Reducer

Aggregation in reducer

list for m: P(u)/C(u) from all predecessors of m

m.rank at the end: $\sum_{u \in L(m)} \frac{P(u)}{C(u)}$

Reduce (nid m, list)
• s ← 0;
• for all p in list do
  • if IsNode(p)
  •   then M ← p;                    (recover graph structure)
  •   else s ← s + p;                (sum up)
• M.rank ← s;
• emit (nid m, node M);              (with updated M.rank for this round)

PageRank in MapReduce

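[Figure: one MapReduce iteration on an example graph]
• Input adjacency lists: n1 [n2, n4], n2 [n3, n5], n3 [n4], n4 [n5], n5 [n1, n2, n3]
• Map: each node sends its rank share to the nodes in its adjacency list, and passes along its own adjacency list
• Reduce: the shares are grouped by destination node n1, …, n5 and summed; the adjacency lists are emitted again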
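One round of the simplified PageRank computation can be traced on the five-node example graph. This is a minimal in-memory sketch: the names `pr_map`, `pr_reduce` and `pagerank_round` are hypothetical, a node is represented as a `(rank, adjacency list)` tuple, and the initial rank of 1/5 per node is an assumption, since the slides do not state the starting values. Nodes without outgoing links would need extra handling that is not covered here.

```python
from collections import defaultdict

# Adjacency lists of the example graph.
ADJ = {
    "n1": ["n2", "n4"],
    "n2": ["n3", "n5"],
    "n3": ["n4"],
    "n4": ["n5"],
    "n5": ["n1", "n2", "n3"],
}


def pr_map(n, node):
    """Emit the node itself, then an equal share of its rank to each successor."""
    rank, adj = node
    yield n, node                        # preserve graph structure
    if not adj:
        return                           # a node with no out-links passes on nothing
    p = rank / len(adj)                  # P(u) / C(u)
    for m in adj:
        yield m, p


def pr_reduce(m, values):
    """Sum the shares received by m and store the sum as its new rank."""
    node, s = None, 0.0
    for v in values:
        if isinstance(v, tuple):         # IsNode(v): recover graph structure
            node = v
        else:
            s += v
    _, adj = node
    return m, (s, adj)


def pagerank_round(nodes):
    """One MapReduce iteration over a dict: node id -> (rank, adjacency list)."""
    groups = defaultdict(list)
    for n, node in nodes.items():
        for key, value in pr_map(n, node):
            groups[key].append(value)
    return dict(pr_reduce(m, vals) for m, vals in groups.items())


start = {n: (1 / len(ADJ), adj) for n, adj in ADJ.items()}   # assumed initial ranks
after_one = pagerank_round(start)
ranks = {n: rank for n, (rank, _) in after_one.items()}
print(ranks)
```

After this round n4 and n5 each hold rank 0.3, n2 and n3 hold about 0.167, and n1 holds about 0.067; the total stays 1 because every node in this graph has at least one outgoing link.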

Acknowledgments: some animation slides are borrowed from

www.cs.kent.edu/~jin/Cloud12Spring/GraphAlgorithms.pptx

Termination control: external driver

Keyword search


Distinct-root trees

Input: A list Q = (k1, …, km) of keywords, a directed graph G, and a positive integer k

Output: distinct trees that match Q bounded by k

Match: a subtree T = (r, (k1, p1, d1(r, p1)), …, (km, pm, dm(r, pm))) of G such that
• each keyword ki in Q is contained in a leaf pi of T
• pi is closest to r among all nodes that contain ki
• the distance from the root r of T to each leaf does not exceed k

k ≥ dj(r, pj): k iterations (termination condition)

A simplified version

A MapReduce algorithm

Node N:
• Node id: nid n
• N.((K1, P1, D1), …, (Km, Pm, Dm)): representing (n, (k1, p1, d1(n, p1)), …, (km, pm, dm(n, pm)))
• N.AdjList: [m], node id

Key: node id n

Preprocessing (can be done in MapReduce itself): for each i, initialize N.(Ki, Pi, Di) with
• Pi = ⊥ and Di = ∞ if N does not contain ki
• Pi = n and Di = 0 otherwise

Mapper

Pass information from successors: one hop forward. Contrast this to, e.g., PageRank

Local computation: shortcut one node

Map (nid n, node N)
• emit (nid n, N);
• for each m in N.AdjList do
  • emit (nid n, (M.(K1, P1, D1+1), …, M.(Km, Pm, Dm+1)));

m is the node id of node M

Reducer

Shortest distances within j hops

Invariant: in iteration j, N.((K1, P1, D1), …, (Km, Pm, Dm)) represents (n, (k1, p1, d1(n, p1)), …, (km, pm, dm(n, pm)))

Reduce (nid n, list)
• for i from 1 to m do
  • pi ← N.Pi; di ← N.Di;            (N: the node represented by n; must be in list)
• for i from 1 to m do
  • Si ← the set of all M.(Ki, Pi, Di) in list;              (group by keyword ki)
  • di ← the smallest M.Di; pi ← the corresponding M.Pi;     (pick the one with the shortest distance to n)
• for i from 1 to m do
  • N.Pi ← pi; N.Di ← di;
• emit (nid n, node N);

Termination and post-processing

A different way of passing information during traversal

Termination: after k iterations, for a given positive integer k

Post-processing: upon termination, for each node n with N.((K1, P1, D1), …, (Km, Pm, Dm)):

if no Pi = ⊥ for i from 1 to m, then N.((K1, P1, D1), …, (Km, Pm, Dm)) represents a valid match (n, (k1, p1, d1(n, p1)), …, (km, pm, dm(n, pm)))
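The preprocessing, the k rounds and the post-processing can be simulated in memory to see how the keyword tuples travel backwards along edges. This is a sketch under several assumptions: the function names and the example graph are made up; `None` stands for a missing leaf; and each round reads the successors' tuples from the previous round's state, which plays the role of the mapper's access to M. The sketch also keeps the node's own tuple among the candidates, so a distance never gets worse.

```python
from math import inf


def init_state(graph, keywords_at, query):
    """Preprocessing: (n, 0) if node n contains keyword ki, else (None, inf)."""
    return {
        n: [(n, 0) if kw in keywords_at.get(n, ()) else (None, inf) for kw in query]
        for n in graph
    }


def one_round(graph, state):
    """One iteration: each node looks one hop forward at its successors."""
    new_state = {}
    for n, adj in graph.items():
        # "Mapper": the node's own tuples, plus each successor's tuples with D + 1.
        candidates = [state[n]] + [[(p, d + 1) for p, d in state[m]] for m in adj]
        # "Reducer": per keyword, keep the candidate with the smallest distance.
        new_state[n] = [
            min((c[i] for c in candidates), key=lambda pd: pd[1])
            for i in range(len(state[n]))
        ]
    return new_state


def keyword_search(graph, keywords_at, query, k):
    """Run k rounds, then keep the nodes that reach every keyword within k hops."""
    state = init_state(graph, keywords_at, query)
    for _ in range(k):
        state = one_round(graph, state)
    # Post-processing: a valid match has no missing leaf.
    return {n: t for n, t in state.items() if all(p is not None for p, _ in t)}


# Hypothetical example: r -> a -> c and r -> b; node a holds "x", node c holds "y".
graph = {"r": ["a", "b"], "a": ["c"], "b": [], "c": []}
keywords_at = {"a": {"x"}, "c": {"y"}}
matches = keyword_search(graph, keywords_at, ("x", "y"), k=2)
print(matches)
```

With k = 1 only node a is a match, because the root r needs two hops to reach the node holding "y".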

Graph pattern matching by subgraph isomorphism


Input: a query Q and a data graph G,

Output: all the matches of Q in G, i.e, all subgraphs of G that are isomorphic to Q

a bijective function f on nodes: (u, u’) ∈ Q iff (f(u), f(u’)) ∈ G

Graph pattern matching by subgraph isomorphism

NP-complete

MapReduce?

A MapReduce algorithm

Two MapReduce steps: preprocessing, and computation


Node N:
• Node id: nid n
• N.Gd: the subgraph of G rooted at n, consisting of nodes within d hops of n, where d is the radius of Q
• N.AdjList: [m], node id

Key: node id n

Preprocessing: for each node n, computes N.Gd
• A MapReduce algorithm of d iterations
• adjacency lists are only used in the preprocessing step

Algorithm

Map (nid n, node N)
• compute all matches S of Q in N.Gd        (invoke any algorithm for subgraph isomorphism: VF2, Ullmann)
• emit (1, S);

reduce (1, list)                             (not necessary; just to eliminate duplicates)
• M ← the union of all sets in list
• emit (M, 1);

Show the correctness? All and only isomorphic mappings? Yes, data locality

Parallel scalability? The more processors, the faster? Yes, as long as the number of processors does not exceed the number of nodes of G

Just a conceptual level evaluation: lot of redundant computations
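The map and reduce steps above can be tried out on a tiny graph. The sketch below is conceptual, like the algorithm itself. Instead of VF2 or Ullmann's algorithm it uses a brute-force matcher that tries every injective mapping, which is only feasible for very small inputs. The names `ball`, `iso_map` and `iso_reduce` are hypothetical, the d-hop neighbourhood is computed while ignoring edge direction (an assumption), and a match is represented as the set of (query node, graph node) pairs of the mapping f.

```python
from itertools import permutations


def ball(adj, n, d):
    """Nodes within d hops of n, ignoring edge direction (an assumption)."""
    undirected = {v: set() for v in adj}
    for u, successors in adj.items():
        for v in successors:
            undirected[u].add(v)
            undirected[v].add(u)
    seen, frontier = {n}, {n}
    for _ in range(d):
        frontier = {v for u in frontier for v in undirected[u]} - seen
        seen |= frontier
    return seen


def iso_map(n, adj, q_nodes, q_edges, d):
    """Map step: all matches of Q inside the d-hop neighbourhood of n."""
    nodes = sorted(ball(adj, n, d))
    edges = {(u, v) for u in nodes for v in adj[u]}
    matches = set()
    for image in permutations(nodes, len(q_nodes)):   # every injective mapping
        f = dict(zip(q_nodes, image))
        # (u, u') in Q iff (f(u), f(u')) in G, checked for every pair of query nodes
        if all(
            ((f[u], f[v]) in edges) == ((u, v) in q_edges)
            for u in q_nodes
            for v in q_nodes
        ):
            matches.add(frozenset(f.items()))
    return matches


def iso_reduce(list_of_sets):
    """Reduce step: union of all match sets, which removes duplicates."""
    return set().union(*list_of_sets)


# Hypothetical example: Q is a single edge x -> y (radius 1); G is the path 1 -> 2 -> 3.
adj = {1: [2], 2: [3], 3: []}
q_nodes, q_edges = ("x", "y"), {("x", "y")}
per_node = [iso_map(n, adj, q_nodes, q_edges, 1) for n in adj]
result = iso_reduce(per_node)
print(result)
```

In this example node 2 finds both matches while nodes 1 and 3 each rediscover one of them; this is the redundant computation that the union in the reduce step cleans up.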

Summing up


Summary and review

Why do we need parallel algorithms for querying graphs?

What is the MapReduce framework?

How to develop graph algorithms in MapReduce?

– Graph representation

– Local computation in mapper

– Aggregation in reducer

– Termination

Graph algorithms in MapReduce may not be efficient. Why?

Develop your own graph algorithms in MapReduce. Give correctness proof, complexity analysis and performance guarantees for your algorithms

Project (1)

Recall strongly connected components (Lecture 2)

• Implement a MapReduce algorithm that, given a graph G, computes all (maximal) strongly connected components of G
• Develop optimization strategies
• Experimentally evaluate your algorithm, especially its scalability with the size of G
• Write a survey on parallel algorithms for computing strongly connected components, as part of the related work.

A development project

Project (2)

Recall kNN joins (Lecture 2)

• Implement a MapReduce algorithm for evaluating kNN join queries
• Develop optimization strategies
• Experimentally evaluate your algorithm
• Write a survey on parallel algorithms for kNN queries and kNN join queries, as part of the related work.

Project (3)

Recall keyword search with Steiner-tree semantics (Lecture 2)

• Implement a MapReduce algorithm for keyword search with Steiner-tree semantics
• Develop optimization strategies
• Experimentally evaluate your algorithm
• Write a survey on parallel algorithms for keyword search, as part of the related work.

Papers for you to review

• W. Fan, F. Geerts, and F. Neven. Making Queries Tractable on Big Data with Preprocessing. VLDB 2013.

• Y. Tao, W. Lin, X. Xiao. Minimal MapReduce Algorithms (MMC). http://www.cse.cuhk.edu.hk/~taoyf/paper/sigmod13-mr.pdf

• L. Qin, J. Yu, L. Chang, H. Cheng, C. Zhang, X. Lin. Scalable big graph processing in MapReduce. SIGMOD 2014. http://www1.se.cuhk.edu.hk/~hcheng/paper/SIGMOD2014qin.pdf

• W. Lu, Y. Shen, S. Chen, B. Ooi. Efficient Processing of k Nearest Neighbor Joins using MapReduce. PVLDB 2012. http://arxiv.org/pdf/1207.0141.pdf

• V. Rastogi, A. Machanavajjhala, L. Chitnis, A. Sarma. Finding connected components in map-reduce in logarithmic rounds. ICDE 2013. http://arxiv.org/pdf/1203.5387.pdf
