Parallel Query Optimization

Fall 2008

Page 1: Parallel Query Optimization

Parallel Query Optimization

Page 2: Parallel Query Optimization

Bucket Sizes and I/O Costs

Bucket B does not fit in memory in its entirety, so it must be loaded several times.

[Figure: Bucket A, Bucket B, and Memory, with the label "One tuple at a time".]

Page 3: Parallel Query Optimization

Fit in Memory

Bucket B fits in memory. It needs to be loaded only once.

[Figure: bucket pairs A(1)/B(1), A(2)/B(2), and A(3)/B(3), with Memory and the label "One tuple at a time".]
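The idea of splitting A and B into smaller bucket pairs, so that each B(i) is loaded once and probed by A(i) one tuple at a time, can be sketched as follows. This is a minimal single-node illustration, not the method from the slides: the function names, the tuple layout, and the use of Python's built-in `hash` as the partitioning function are all assumptions.

```python
from collections import defaultdict


def partition(tuples, key, num_buckets):
    """Hash tuples into num_buckets lists by the join attribute at index `key`."""
    buckets = [[] for _ in range(num_buckets)]
    for t in tuples:
        buckets[hash(t[key]) % num_buckets].append(t)
    return buckets


def join_bucket_pair(bucket_a, bucket_b, key_a, key_b):
    """Load bucket_b into an in-memory table once, then probe it with
    bucket_a one tuple at a time."""
    table = defaultdict(list)
    for b in bucket_b:
        table[b[key_b]].append(b)
    out = []
    for a in bucket_a:
        for b in table.get(a[key_a], []):
            out.append((a, b))
    return out


def partitioned_hash_join(rel_a, rel_b, key_a, key_b, num_buckets):
    """Join A and B bucket pair by bucket pair: A(i) only meets B(i)."""
    parts_a = partition(rel_a, key_a, num_buckets)
    parts_b = partition(rel_b, key_b, num_buckets)
    result = []
    for a_i, b_i in zip(parts_a, parts_b):
        result.extend(join_bucket_pair(a_i, b_i, key_a, key_b))
    return result
```

Choosing `num_buckets` large enough that every B(i) fits in memory is what avoids the repeated loading described on the previous slide; how that number is chosen is not stated in the slides.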

Page 4: Parallel Query Optimization

Hash-Based Join

Page 5: Parallel Query Optimization

GRACE Algorithm

Page 6: Parallel Query Optimization

Data Skew

System performance is very sensitive to the skewness in the tuple distribution.

Page 7: Parallel Query Optimization

Zipf-like Distribution

Total: 1,000,000 tuples
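To make the skew concrete, the sketch below spreads a fixed number of tuples over buckets with sizes proportional to 1/rank^theta. The exact distribution used in the slides is not given, so the formula, the `theta` parameter, and the function name are assumptions chosen for illustration.

```python
def zipf_like_sizes(total, num_buckets, theta=1.0):
    """Return bucket sizes summing to `total`, with the bucket of rank i
    weighted by 1 / i**theta (theta = 0 gives a uniform distribution)."""
    weights = [1.0 / (rank ** theta) for rank in range(1, num_buckets + 1)]
    norm = sum(weights)
    sizes = [int(total * w / norm) for w in weights]
    # Truncation can only lose tuples; give the remainder to the largest bucket.
    sizes[0] += total - sum(sizes)
    return sizes
```

For example, `zipf_like_sizes(1_000_000, 10)` yields one bucket far larger than the rest, which is the situation the partition tuning slides address.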

Page 8: Parallel Query Optimization

Partition Tuning

Best Fit Decreasing Strategy:

In this partition tuning strategy, the hash buckets are first sorted into decreasing order according to size.

In each iteration, the currently largest bucket is assigned to the currently smallest partition (or PN).

This process is repeated until all the buckets have been allocated.

This is a dynamic load balancing technique.
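The steps above can be sketched in a few lines. This is a minimal illustration under assumptions: bucket sizes are given as a list of tuple counts, partitions are numbered from 0, and ties between equally loaded partitions are broken by partition number. The function name is hypothetical.

```python
import heapq


def best_fit_decreasing(bucket_sizes, num_partitions):
    """Assign buckets to partitions: largest remaining bucket goes to the
    currently smallest partition. Returns (assignment, loads)."""
    # Min-heap of (current load, partition id); the root is the smallest partition.
    heap = [(0, pn) for pn in range(num_partitions)]
    heapq.heapify(heap)
    assignment = {pn: [] for pn in range(num_partitions)}

    # Visit bucket indices in decreasing order of size.
    order = sorted(range(len(bucket_sizes)),
                   key=lambda b: bucket_sizes[b], reverse=True)
    for b in order:
        load, pn = heapq.heappop(heap)
        assignment[pn].append(b)
        heapq.heappush(heap, (load + bucket_sizes[b], pn))

    loads = {pn: sum(bucket_sizes[b] for b in buckets)
             for pn, buckets in assignment.items()}
    return assignment, loads
```

With bucket sizes `[50, 30, 20, 20, 10, 10]` and two partitions, both partitions end up with a load of 70.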

Page 9: Parallel Query Optimization

Best Fit Decreasing Strategy

Page 10: Parallel Query Optimization

Adaptive Load Balancing (ABJ+)

Page 11: Parallel Query Optimization

ABJ+ vs. GRACE

Page 12: Parallel Query Optimization

L_LBO in Multi-way Join Queries

L_LBO: Linear Tree with Load Balancing

A multi-way join query is treated as a sequence of two-way (or single) joins, each performed using ABJ+.

Page 13: Parallel Query Optimization

B_NLB in Multi-way Join Queries

B_NLB: Bushy Tree without Load Balancing

It tries to join as many pairs of relations as possible.

Split Phase: Each PN partitions its portion of each relation into small subbuckets, and each subbucket is transferred to the PN corresponding to its bucket ID.

Join Phase: Each PN performs the local joins.
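The two phases can be simulated on a single machine as below. This is a sketch under assumptions, not the B_NLB implementation: each PN is modeled as a list of tuples, there is one bucket per PN, the bucket ID is `hash(key) % num_pns`, and both function names are hypothetical.

```python
from collections import defaultdict


def split_phase(local_portions, key, num_pns):
    """local_portions[p] holds the tuples initially stored at PN p.
    Each PN splits its portion into subbuckets by bucket ID, and every
    subbucket is sent to the PN with that ID."""
    received = [[] for _ in range(num_pns)]
    for portion in local_portions:
        subbuckets = defaultdict(list)
        for t in portion:
            subbuckets[hash(t[key]) % num_pns].append(t)
        for bucket_id, sub in subbuckets.items():
            received[bucket_id].extend(sub)  # models the transfer
    return received


def join_phase(r_at_pn, s_at_pn, key_r, key_s):
    """Each PN joins the R and S tuples it received, independently."""
    results = []
    for r_local, s_local in zip(r_at_pn, s_at_pn):
        table = defaultdict(list)
        for r in r_local:
            table[r[key_r]].append(r)
        results.append([(r, s) for s in s_local
                        for r in table.get(s[key_s], [])])
    return results
```

Because the bucket ID alone decides the destination, the per-PN loads `[len(ts) for ts in received]` follow whatever skew the data has; nothing in this scheme rebalances them.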

Page 14: Parallel Query Optimization

NLBO in Multi-way Join Queries

NLBO: No Load Balancing Optimization

Like B_NLB, it tries to join as many pairs of relations as possible.

Hash Phase: Each PN partitions its portion of each relation into small subbuckets and stores them back to its own disks.

Partition Tuning Phase: It allocates the buckets to the PNs using the Best Fit Decreasing Strategy.

Join Phase: Each PN performs the local joins.

Page 15: Parallel Query Optimization

LBO in Multi-way Join Queries

LBO: Load Balancing Optimization

Hash Phase: Each relation is hashed into buckets, which are stored back on the local disks.

Optimization Phase: The Best Fit Decreasing Strategy and a greedy algorithm are used to select the joins that will be executed concurrently.

Executing Phase:

Stage 1: Tune the partitions.

Stage 2: Perform the join operation.

Stage 3: Update the join graph, then go to Optimization Phase.
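The loop between the Optimization Phase and the Executing Phase can be sketched as a round-by-round schedule over the join graph. The slides do not give the greedy criterion, so this example assumes one: consider joins in ascending order of combined relation size and pick those that share no relation. The result-size estimate, the naming of intermediate results, and all function names are likewise assumptions; partition tuning and the joins themselves are not modeled.

```python
def select_concurrent_joins(edges, size):
    """Greedy pick (assumed criterion): cheapest joins first, skipping any
    join that shares a relation with one already chosen for this round."""
    chosen, used = [], set()
    for a, b in sorted(edges, key=lambda e: (size[e[0]] + size[e[1]], e)):
        if a not in used and b not in used:
            chosen.append((a, b))
            used.update((a, b))
    return chosen


def update_join_graph(edges, size, chosen):
    """Replace each joined pair by a single node for its result."""
    merged = {}
    for a, b in chosen:
        name = f"({a}*{b})"
        merged[a] = merged[b] = name
        size[name] = size[a] + size[b]  # placeholder estimate, not a real cardinality
    new_edges = set()
    for a, b in edges:
        a2, b2 = merged.get(a, a), merged.get(b, b)
        if a2 != b2:
            new_edges.add(tuple(sorted((a2, b2))))
    return new_edges


def lbo_schedule(edges, size):
    """Return the list of rounds; each round is a list of joins run concurrently."""
    edges = {tuple(sorted(e)) for e in edges}
    size = dict(size)  # do not modify the caller's dictionary
    rounds = []
    while edges:
        chosen = select_concurrent_joins(edges, size)
        rounds.append(chosen)  # Stages 1 and 2 would run here
        edges = update_join_graph(edges, size, chosen)  # Stage 3
    return rounds
```

For a chain R1, R2, R3, R4 this produces two rounds: the joins (R1, R2) and (R3, R4) together, then the join of their results, which corresponds to a bushy tree.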

Page 16: Parallel Query Optimization

Optimization Phase of LBO

Page 17: Parallel Query Optimization

Effect of Bucket Skew

Page 18: Parallel Query Optimization

LBO-FR

LBO-FR: LBO with Fragment & Replicate Feature

LBO-FR is similar to LBO, except that it partitions bucket pairs into subbucket pairs if those buckets are too large.

Example: suppose bucket pair (S1, R1) is too large and |S1| > |R1|.

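[Figure: S1 is fragmented and each fragment is paired with R1. (S1, R1) becomes (S1,1, R1), (S1,2, R1); with three fragments it becomes (S1,1, R1), (S1,2, R1), (S1,3, R1).]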
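A small sketch of the fragment-and-replicate step for one oversized bucket pair follows. It assumes the larger bucket S1 is cut into roughly equal fragments by position and that R1 is reused whole with every fragment; the function names and the choice of fragment count are hypothetical.

```python
from collections import defaultdict


def fragment_and_replicate(s_bucket, r_bucket, num_fragments):
    """Split the larger bucket S1 into fragments and pair each fragment
    with the whole of R1 (R1 is replicated, not split)."""
    fragments = [s_bucket[i::num_fragments] for i in range(num_fragments)]
    return [(frag, r_bucket) for frag in fragments]


def local_join(s_part, r_part, key_s=0, key_r=0):
    """Hash join of one subbucket pair on the given attribute positions."""
    table = defaultdict(list)
    for r in r_part:
        table[r[key_r]].append(r)
    return [(s, r) for s in s_part for r in table.get(s[key_s], [])]
```

Since every S1 tuple lands in exactly one fragment and meets all of R1, the union of the subbucket joins equals the join of the original pair, and the subbucket pairs can be assigned to different PNs.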

Page 19: Parallel Query Optimization

LBO-SFR

LBO-SFR: LBO with Symmetric Fragment & Replicate Feature

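[Figure: successive partitioning of bucket pair (S1, R1), initially written (S1,1,1, R1,1,1).
|S1| > |R1|, partition S1: (S1,1,1, R1,1,1), (S1,2,1, R1,1,2).
|S1,1,1| < |R1,1,1|, partition R1: (S1,1,1, R1,1,1), (S1,2,1, R1,1,2), (S1,1,2, R1,2,1), (S1,2,2, R1,2,2).
|S1,1,1| > |R1,1,1|, partition S1: (S1,1,1, R1,1,1), (S1,2,1, R1,1,2), (S1,3,1, R1,1,3), (S1,1,2, R1,2,1), (S1,2,2, R1,2,2), (S1,3,2, R1,2,3).]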
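The alternating pattern in the figure (partition whichever side currently has the larger fragment) can be sketched as follows. The stopping rule is not stated in the slides, so this example assumes a hypothetical `max_pair_size` threshold on the combined size of one subbucket pair; tie-breaking toward S and the function names are also assumptions.

```python
def ceil_div(n, k):
    """Integer ceiling of n / k."""
    return -(-n // k)


def choose_fragment_counts(s_size, r_size, max_pair_size):
    """Add one fragment at a time to the side whose fragments are larger,
    until a subbucket pair fits the (assumed) size limit."""
    s_frags, r_frags = 1, 1
    while True:
        s_frag = ceil_div(s_size, s_frags)
        r_frag = ceil_div(r_size, r_frags)
        # Stop when the pair is small enough or nothing is left to split.
        if s_frag + r_frag <= max_pair_size or (s_frag <= 1 and r_frag <= 1):
            return s_frags, r_frags
        if s_frag >= r_frag:
            s_frags += 1
        else:
            r_frags += 1


def subbucket_pairs(s_bucket, r_bucket, s_frags, r_frags):
    """Pair every S fragment with every R fragment."""
    s_parts = [s_bucket[i::s_frags] for i in range(s_frags)]
    r_parts = [r_bucket[j::r_frags] for j in range(r_frags)]
    return [(s, r) for r in r_parts for s in s_parts]
```

With |S1| = 90, |R1| = 50, and a limit of 60, the loop partitions S1, then R1, then S1 again, giving 3 x 2 = 6 subbucket pairs, the same sequence of steps as in the figure.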

Page 20: Parallel Query Optimization

Effect of Bucket Skew