Technische Universität München

Handling Data Skew in MapReduce

B. Gufler♠, N. Augsten♣, A. Reiser♠, A. Kemper♠

♠Technische Universität München ♣Free University of Bolzano-Bozen

May 8, 2011

Cloud Data Processing with MapReduce

MapReduce for Business
- usage statistics, index building, ...
- simple reducer tasks

MapReduce for eScience: new challenges

1. heavy data skew

2. complex reducer tasks

→ How to deal with these problems?

Data Skew

- values of an attribute not evenly distributed
- very common in scientific data
- causes load imbalance if attribute is used for partitioning

Example: Node Masses in Millennium Simulation

[Figure: histogram of the number of nodes per node-mass range]

node mass     nodes
1–50          434M
51–500        295M
501–5k        29M
5k–50k        2.5M
50k–500k      143k
500k–5M       1.8k

Complex Reducer Tasks

- complex scientific analysis algorithms
- often polynomial or even exponential complexity
- amplifies load imbalance

[Figure: three Map tasks feed two Reduce tasks; the reducer workloads are 6 · 2² = 24 and 4² + 8² = 80.]
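The arithmetic in the figure can be checked with a short sketch. It assumes a reading of the figure that the slide does not spell out: one reducer receives six clusters of 2 tuples each, the other receives one cluster of 4 and one of 8 tuples, and the reducer algorithm is quadratic in the cluster size. The function name `reducer_cost` is hypothetical.

```python
def reducer_cost(cluster_sizes, exponent=2):
    """Total work of a reducer that spends size**exponent per cluster."""
    return sum(size ** exponent for size in cluster_sizes)


reducer_a = [2] * 6  # assumed: six clusters of 2 tuples each
reducer_b = [4, 8]   # assumed: one cluster of 4 and one of 8 tuples

print(sum(reducer_a), reducer_cost(reducer_a))  # 12 24
print(sum(reducer_b), reducer_cost(reducer_b))  # 12 80
```

Under these assumptions both reducers receive 12 tuples, yet the estimated work differs by more than a factor of three.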

Example: Frequent Subtree Mining

- find common substructures in (large parts of) a forest
- exponential complexity for unordered subtrees

[Figure: example forest of three small trees T1, T2, and T3 with node labels a, b, c, d.]

Summary: MapReduce for eScience

Challenges
- load imbalance due to data skew
- complex analysis amplifies execution time differences
- example: frequent subtree mining (PathJoin algorithm)
  - 16 GB data set (subset of the Millennium simulation)
  - cluster of 16 nodes
  - total execution time: 8 hours
  - execution time difference of reducers: up to 6 hours

Agenda

- Motivation
- Load Balancing in MapReduce: Big Picture
- Estimation of Partition Cost
- Assignment of Partitions to Reducers
- Number of Partitions
- Evaluation

Data Partitioning in MapReduce

[Figure: a Map task applies a hash function to partition its output]

- cluster: all items with the same key
- partition: all keys with the same hash value
- hash function for data partitioning
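The relationship between keys, clusters, and partitions can be illustrated with a minimal sketch. It assumes CRC32 as a deterministic stand-in for whatever hash function the framework actually uses; the names `partition_of` and `group` are hypothetical.

```python
import zlib
from collections import defaultdict


def partition_of(key, num_partitions):
    """Map a key to a partition; CRC32 is an assumption, not the framework's hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def group(pairs, num_partitions):
    """Return {partition: {key: [values]}}; each inner entry is one cluster."""
    partitions = defaultdict(lambda: defaultdict(list))
    for key, value in pairs:
        partitions[partition_of(key, num_partitions)][key].append(value)
    return partitions


pairs = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("a", 5)]
partitions = group(pairs, num_partitions=2)
for pid, clusters in sorted(partitions.items()):
    # Print the cluster sizes found in each partition.
    print(pid, {key: len(values) for key, values in clusters.items()})
```

Because the partition depends only on the key, a cluster is never split across partitions; a partition, however, may hold several clusters of very different sizes.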

Load Balancing: Current MapReduce

[Figure: three mappers each produce partitions P0 and P1; under the controller, one reducer processes P0 and the other P1.]

- hash partitioning on each mapper
- static 1:1 assignment of partitions to reducers
- balances number of keys per reducer instead of workload
- no flexibility

Improved Load Balancing

- create more partitions than there are reducers
- cost-based assignment of partitions to reducers

[Figure: three mappers each produce partitions P0, P1, and P2; the controller assigns P0+P2 to one reducer and P1 to the other.]

- flexibility for load balancing
- challenges
  1. compute number of partitions
  2. estimate partition cost
  3. assign partitions to reducers

Estimate Partition Cost

partition cost depends on

1. number and size of the clusters within the partition
   - locally monitored on each mapper
   - centrally aggregated on one controller
2. complexity of the reducer algorithm
   - specified by the user

→ estimate cluster cost based on this information

→ sum up cluster costs to obtain partition cost

Estimate Partition Cost: Monitoring

- tuple count
  - monitor local count on every mapper
  - sum up on controller
- cluster count
  - create Bloom filter on every mapper
  - employ Linear Counting on controller
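The mapper-side half of this monitoring might look like the following sketch. It makes several assumptions not stated on the slide: a single hash function (a Bloom filter with one hash, which is the plain bit vector that Linear Counting works on), CRC32 as that hash, and a 64-bit vector. The class name `PartitionMonitor` is hypothetical.

```python
import zlib


class PartitionMonitor:
    """Per-mapper statistics for one partition (hypothetical helper)."""

    def __init__(self, num_bits=64):
        self.num_bits = num_bits
        self.tuples = 0  # exact local tuple count
        self.bits = 0    # bit vector of hashed keys, stored as an int

    def observe(self, key):
        """Record one tuple with the given key."""
        self.tuples += 1
        position = zlib.crc32(key.encode("utf-8")) % self.num_bits
        self.bits |= 1 << position


monitor = PartitionMonitor()
for key in ["a", "b", "a", "c", "a"]:
    monitor.observe(key)

print(monitor.tuples)  # 5
# Three distinct keys set at most three bits (fewer if hashes collide).
print(bin(monitor.bits).count("1"))
```

Only the counter and the fixed-size bit vector need to be sent to the controller, so the monitoring cost does not grow with the number of distinct keys.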

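Estimate Partition Cost: Monitoring

[Figure: monitoring data for partition P0 is aggregated on the controller]

             tuples   clusters
Mapper 1     17       [10011]
Mapper 2     31       [01011]
Aggregate    48       8

- tuples: sum (∑) of the local counts
- clusters: bitwise OR (|) of the mappers' bit vectors, followed by Linear Counting (lc)

K. Whang, B. Vander-Zanden, H. Taylor: A Linear-Time Probabilistic Counting Algorithm for Database Applications, ACM TODS 15(2), 1990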
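The aggregation in this example (17 and 31 tuples, bit vectors 10011 and 01011) can be reproduced with a short sketch. It uses the Linear Counting estimator from the cited paper, n ≈ −m · ln(V), where m is the vector length and V the fraction of zero bits. The 5-bit vector is the slide's toy size; the function name `linear_count` is hypothetical.

```python
import math


def linear_count(bits, num_bits):
    """Linear Counting estimate: -m * ln(fraction of zero bits)."""
    zeros = num_bits - bin(bits).count("1")
    if zeros == 0:
        raise ValueError("bit vector is full; a larger vector is needed")
    return -num_bits * math.log(zeros / num_bits)


# (tuples, bit vector) as reported by the two mappers in the example
reports = [(17, 0b10011), (31, 0b01011)]

total_tuples = sum(tuples for tuples, _ in reports)
merged = 0
for _, bits in reports:
    merged |= bits  # bitwise OR of the mappers' vectors

estimate = linear_count(merged, num_bits=5)
print(total_tuples)           # 48
print(format(merged, "05b"))  # 11011
print(round(estimate))        # 8
```

The merged vector has one zero bit out of five, so the estimate is −5 · ln(0.2) ≈ 8.05, which matches the 8 clusters shown in the example.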

Estimate Partition Cost: Calculation

Aggregate: tuples: 48, clusters: 8

reducer complexity: quadratic (e.g., pairwise comparison)

- average cluster size: 48 / 8 = 6
- cluster cost: 6² = 36
- partition cost: 8 · 36 = 288
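The calculation above can be written as a small function. This is a sketch under the slide's own simplification that every cluster in the partition has the average size; the function name `partition_cost` and the way the user-specified complexity is passed in are hypothetical.

```python
def partition_cost(tuples, clusters, complexity=lambda n: n ** 2):
    """Estimate cost as clusters * complexity(average cluster size)."""
    average_size = tuples / clusters
    return clusters * complexity(average_size)


print(partition_cost(48, 8))  # 288.0
```

Since only the average cluster size enters the estimate, all clusters in a partition are treated as equally large. For a convex complexity such as n², the sum of the true cluster costs can never be lower than this estimate, so it is a lower bound when cluster sizes within the partition vary.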

Assign Partitions to Reducers

- assign partitions to reducers balancing the partition cost
- bin packing problem, but: NP-hard
- rely on a heuristic instead

partition costs: 50 40 20 15 3

[Figure: animation of the assignment; accumulated cost per reducer after each step]

assigned cost   reducer 1   reducer 2
(start)         0           0
50              50          0
40              50          40
20              50          60
15              65          60
3               65          63
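The slides only say "a heuristic". The reducer loads shown in the animation (ending at 65 and 63) are consistent with a greedy rule: take the partitions in descending cost order and give each one to the currently least-loaded reducer. The sketch below implements that inferred rule; the function name `assign_partitions` is hypothetical.

```python
import heapq


def assign_partitions(costs, num_reducers):
    """Greedy heuristic: largest cost first, always to the least-loaded reducer."""
    heap = [(0, r) for r in range(num_reducers)]  # (load, reducer index)
    assignment = [[] for _ in range(num_reducers)]
    for cost in sorted(costs, reverse=True):
        load, r = heapq.heappop(heap)  # reducer with the smallest load so far
        assignment[r].append(cost)
        heapq.heappush(heap, (load + cost, r))
    return assignment


assignment = assign_partitions([50, 40, 20, 15, 3], num_reducers=2)
print(assignment)                    # [[50, 15], [40, 20, 3]]
print([sum(a) for a in assignment])  # [65, 63]
```

The heap keeps the least-loaded reducer available in logarithmic time, so the assignment stays cheap even when there are many more partitions than reducers.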

How many Partitions?

- static approach
  - number of partitions fixed beforehand by controller
  - matching partitions created on every mapper
- dynamic approach
  - only initial number of partitions fixed
  - mappers may split some partitions further

Evaluation

- load balancing quality
- influence on execution time

Synthetic Data Sets
- 520M tuples
- 2 000 clusters
- Zipf distributions

Millennium Data Set
- 760M tuples
- ≈ 32 000 clusters

Evaluation: Load Balancing

- measure standard deviation of partition cost per reducer
- small is good: small differences between reducers

[Figure: two bar charts, "Synthetic Data, Moderate Skew" and "Millennium Data", comparing Standard MapReduce and Fine Partitioning. x-axis: reducers (10, 20, 25, 50); y-axis: stdev [% of mean], 0 to 50.]
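The metric on the y-axis, standard deviation as a percentage of the mean, can be computed as in the sketch below. The slide does not say whether the population or the sample standard deviation is used; the population variant is an assumption here, and the function name `imbalance_percent` is hypothetical.

```python
import statistics


def imbalance_percent(loads):
    """Standard deviation of per-reducer cost as a percentage of the mean."""
    return 100 * statistics.pstdev(loads) / statistics.mean(loads)


print(imbalance_percent([64, 64]))   # 0.0
print(imbalance_percent([50, 150]))  # 50.0
```

Normalizing by the mean makes the values comparable across different numbers of reducers, since the mean cost per reducer shrinks as reducers are added.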

Evaluation: Execution Time

- measure execution time per reducer
- max time: overall time of reduce job

[Figure: two bar charts, "Synthetic Data, Moderate Skew" and "Millennium Data", comparing Standard MapReduce, Fine Partitioning, and Time for Most Expensive Cluster. x-axis: reducers (10, 20, 25, 50); y-axis: Synthetic Execution Time, 0 to 20 and 0 to 40 respectively.]

Summary

- scientific data analysis with MapReduce
- partition-based load balancing
- distributed monitoring for cost estimation
- experimental evaluation
  - better load balancing
  - reduced overall execution time

Thank you!
