Query Processing CS 347 Decomposition Parallel and...

27
CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 3: Query Processing Query Processing Decomposition Localization Optimization C S 347 Notes 3 2 Decomposition Same as in centralized system 1. Normalization 2. Eliminating redundancy 3. Algebraic rewriting C S 347 Notes 3 3 Normalization Convert from general language to a standard form E.g., relational algebra C S 347 Notes 3 4

Transcript of Query Processing CS 347 Decomposition Parallel and...

CS 347Parallel and Distributed

Data ProcessingSpring 2016

Notes 3: Query Processing

Query ProcessingDecompositionLocalizationOptimization

CS 347 Notes 3 2

DecompositionSame as in centralized system

1. Normalization2. Eliminating redundancy3. Algebraic rewriting

CS 347 Notes 3 3

NormalizationConvert from general language to a standard form E.g., relational algebra

CS 347 Notes 3 4

NormalizationExample:

select R.A, R.Cfrom R, Swhere R.A = S.A and

((R.B = 1 and S.D = 2) or (R.C > 3 and S.D = 2))

CS 347 Notes 3 5

∏R.A,R.C

R S

σ(R.B=1∨R.C>3)∧S.D=2

conjunctivenormal formA

NormalizationAlso detect invalid expressions

select * from R ✘ R does not have D attributewhere R.D = 3

CS 347 Notes 3 6

Eliminating RedundancyE.g., in conditions

(S.A = 1) ∧ (S.A > 5) ⟹ false

(S.A < 10) ∧ (S.A < 5) ⟹ S.A < 5

CS 347 Notes 3 7

Eliminating RedundancyE.g., sharing common sub-expressions

CS 347 Notes 3 8

⋈ ⋈

S σcond σcond T

R R

⋈ ⋈

S σcond T

R

Algebraic RewritingE.g., pushing conditions down

CS 347 Notes 3 9

σcond

R S

σcond3

σcond1 σcond2

R S

Query ProcessingDecomposition ✔→ One or more algebraic query trees on relations

LocalizationReplacing relations by corresponding fragments

CS 347 Notes 3 10

LocalizationSteps

(1) Start with query

(2) Replace relations by fragments

(3) Push ∪ up and ∏, σ down (apply CS 245 rules)

(4) Simplify ~ eliminate unnecessary operations

CS 347 Notes 3 11

LocalizationNotation for fragment

CS 347 Notes 3 12

[R : cond]

fragment

conditions its tuples satisfy

Example A(1)

CS 347 Notes 3 13

σE=3

R

Example A(2)

CS 347 Notes 3 14

σE=3

[R1: E<10] [R2: E≥10]

Example A(3)

CS 347 Notes 3 15

σE=3

[R1: E<10] [R2: E≥10]

σE=3

Example A(3)

CS 347 Notes 3 16

σE=3

[R1: E<10] [R2: E≥10]

σE=3

⟹ Ø

Example A(4)

CS 347 Notes 3 17

σE=3

[R1: E<10]

LocalizationRule 1

σc1 [R: c2] ⟹ σc1 [R: c1∧c2]

[R: false] ⟹ Ø

CS 347 Notes 3 18

Example AσE=3 [R2: E ≥ 10] ⟹ σE=3 [R2: E=3∧E ≥ 10]

⟹ σE=3 [R2: false]⟹ Ø

CS 347 Notes 3 19

Example B(1)

CS 347 Notes 3 20

R

S

AA = common attribute

Example B(2)

CS 347 Notes 3 21

A

[R1: A<5] [R2: 5≤A≤10] [R3: A>10] [S1: A<5] [S2: A≥5]

Example B(3)

CS 347 Notes 3 22

[R1: A<5]

[R2: 5≤A≤10]

[R3: A>10]

[S1: A<5] [S2: A≥5]

[S1: A<5] [R1: A<5] [S2: A≥5]

[R2: 5≤A≤10]

[R3: A>10] [S1: A<5] [S2: A≥5]

⋈A ⋈A ⋈A ⋈A

⋈A ⋈A

Example B(3)

CS 347 Notes 3 23

[R1: A<5]

[R2: 5≤A≤10]

[R3: A>10]

[S1: A<5] [S2: A≥5]

[S1: A<5] [R1: A<5] [S2: A≥5]

[R2: 5≤A≤10]

[R3: A>10] [S1: A<5] [S2: A≥5]

⋈A ⋈A ⋈A ⋈A

⋈A ⋈A

Example B(4)

CS 347 Notes 3 24

[R1: A<5] [R3: A>10][S2: A≥5][S1: A<5] [R2: 5≤A≤10] [S2: A≥5]

⋈A ⋈A ⋈A

LocalizationRule 2

[R: c1] [S: c2] ⟹ [R S: c1∧c2∧R.A=S.A]

[R: false] ⟹ Ø

CS 347 Notes 3 25

⋈A ⋈A

Example B[R1: A < 5] [S2: A ≥ 5]

⟹ [R1 S2: R1.A < 5 ∧ S2.A ≥ 5 ∧ R1.A=S2.A]

⟹ [R1 S2: false]

⟹ Ø

CS 347 Notes 3 26

⋈A

⋈A

⋈A

Example C(2)

CS 347 Notes 3 27

K

[R1: A<10] [R2: A≥10] [S1: K=R.K∧R.A<10] [S2: K=R.K∧R.A≥10]

derived fragmentation

Example C(3)

CS 347 Notes 3 28

[R1]

⋈K ⋈K⋈K ⋈K

[S1] [R1] [S2] [S1][R2] [R2] [S2]

Example C(3)

CS 347 Notes 3 29

[R1]

⋈K ⋈K⋈K ⋈K

[S1] [R1] [S2] [S1][R2] [R2] [S2]

Example C(4)

CS 347 Notes 3 30

[R1: A<10]

⋈K ⋈K

[S1: K=R.K∧R.A<10] [R2: A≥10] [S2: K=R.K∧R.A≥10]

Example C[R1: A < 10] [S2: K = R.K ∧ R.A ≥ 10]

⟹ [R1 S2: R1.A < 10 ∧ S2.K = R.K ∧ R.A ≥ 10 ∧ R1.K = S2.K]

⟹ [R1 S2: false]

⟹ Ø

CS 347 Notes 3 31

⋈K

⋈K

⋈K

Example D(1)

CS 347 Notes 3 32

RR1 (K, A, B)

R2 (K, C, D)

vertical fragmentation

∏A

Example D(2)

CS 347 Notes 3 33

∏A

R1

⋈K

R2(K,A,B) (K,C,D)

Example D(2)

CS 347 Notes 3 34

∏A

R1

⋈K

R2(K,A,B) (K,C,D)

∏K,A ∏K,A

Example D(4)

CS 347 Notes 3 35

∏A

R1(K, A, B)

Rule 3Consider the vertical fragmentation

Ri = ∏ (R), Ai⊆A

Then for any B ⊆A

∏B(R) = ∏B( Ri | B ∩ Ai ≠ Ø )

CS 347 Notes 3 36

⋈i

Ai

Query ProcessingDecomposition ✔Localization ✔

OptimizationOverviewJoins and other operationsInter-operation parallelismOptimization strategies

CS 347 Notes 3 37

OverviewOptimization process1. Generate query plans

2. Estimate size ofintermediate results

3. Estimate cost of plan

CS 347 Notes 3 384. Pick minimum

P1 P2 P3 Pn

C1 C2 C3 Cn

§

§

§

§

§

§

§

§

§

§

§

§

OverviewDifferences from centralized optimizationNew strategies for some operations

Parallel/distributed sortParallel/distributed join

Semi-joinPrivacy preserving join

Duplicate eliminationAggregationSelection

Many ways to assign and schedule processorsCS 347 Notes 3 39

Parallel/Distributed SortInputa) Relation R on single site/diskb) Relation R fragmented/partitioned by sort attributec) Relation R fragmented/partitioned by another attribute

CS 347 Notes 3 40

Parallel/Distributed SortOutputa) Sorted R on single site/diskb) Sorted R fragments/partitions

CS 347 Notes 3 41

5 …610

12 …15

19 …202150

F1 F2 F3

Basic SortR(K, …) to be sorted on KHorizontally fragmented on K using vector (k0, k1, ... , kn)

CS 347 Notes 3 42

73

2722

111714

10 20

k0 k1

Basic SortAlgorithm1. Sort each fragment independently2. Ship results if necessary

CS 347 Notes 3 43

Basic SortSame idea on different architectures

Shared nothing

Shared memory

CS 347 Notes 3 44

P1

M F1…

P2

M F2

M

P1

F2

… P2

F1

Range-Partition SortR(K, …) to be sorted on KR located at one or more site/disk, not fragmented on K

CS 347 Notes 3 45

Range-Partition SortAlgorithm1. Range partition on K2. Basic sort

CS 347 Notes 3 46

R’1

R’2

R’3

k0

k1

local sort

local sort

local sort

Ra R1

R2

R3

Rbresult

Partition VectorsProblemSelect a good partition vector given fragments

CS 347 Notes 3 47

7 …521114

31 …815113217

10 …124

Ra Rb Rc

Partition VectorsExample approach

Each site sends to coordinatorMIN sort keyMAX sort keyNumber of tuples

CoordinatorComputes vector and distributes to sitesDecides the number of sites to perform local sorts

CS 347 Notes 3 48

Partition VectorsSample scenarioCoordinator receives

SA: MIN = 5 MAX = 9 # of tuples = 10SB: MIN = 7 MAX = 16 # of tuples = 10

CS 347 Notes 3 49

Sample scenarioCoordinator receives

SA: MIN = 5 MAX = 9 # of tuples = 10SB: MIN = 7 MAX = 16 # of tuples = 10

Partition Vectors

CS 347 Notes 3 50

Expected tuples

assuming we want to sort at 2 sites

5 10 15 20

k0?

1 = 10 / (16 – 7 + 1)

2 = 10 / (9 – 5 + 1)

Sample scenario

Partition Vectors

CS 347 Notes 3 51

Expected tuples with key < k0 = half of total tuples2(k0 – 5) + (k0 – 7) = 10k0 = (10 + 10 + 7) / 3 = 9

Expected tuples

5 10 15 20

k0?

1

2

Partition VectorsVariationsSend more info to coordinator:a) Partition vector for local site

b) Histogram

CS 347 Notes 3 52

5 6 8 107 9

3 3 3

5 6 8 10

# of tupleslocal vector

Partition VectorsVariationsMultiple rounds1. Sites send range and # of tuples to coordinator2. Coordinator distributes preliminary vector V0

3. Sites send coordinator # of tuples in each V0 range4. Coordinator computes and distributes final vector Vf

CS 347 Notes 3 53

Partition VectorsVariationsDistributed algorithm that does not require a coordinator?

CS 347 Notes 3 54

Parallel External Sort-MergeSame as range-partition sort except does sorting first

CS 347 Notes 3 55

R’1

R’2

R’3

k0

k1

local sort

local sort

Ra R’a

R’bRb result

in order

merge

Parallel/Distributed JoinInputRelations R, SMay or may not be fragmented

OutputR ⋈ SResult at one or more sites

CS 347 Notes 3 56

Partitioned Join (Equijoin)

CS 347 Notes 3 57

RA S1

S2

S3

RB

R1

R2

R3

SA

SB

SC

resultf(A)

f(A)

local join

Partitioned Join (Equijoin)Same partition function f is used for both R and S

f can be range or hash partitioning

Local join can be of any type (use any CS 245 optimization)

Various scheduling options, e.g., a) Partition R; partition S; joinb) Partition R; build local hash table for R; partition S and join

CS 347 Notes 3 58

Partitioned Join (Equijoin)We already know why it works

CS 347 Notes 3 59

[R1] [R2] [R3]

[S1] [S2] [S3]

⋈ ⋈ ⋈

[R1] [S1] [R2] [S2] [R3] [S3]

Sometimes we partition just to make this join possible

Partitioned Join (Equijoin)Selecting a good partition function f is very important

Goal is to make all | Ri | + | Si | the same size

CS 347 Notes 3 60

Asymmetric Fragment + Replicate Join

CS 347 Notes 3 61

RA S

S

S

RB

R1

R2

R3

SA

SB

partition

unionresult

local join

Asymmetric Fragment + Replicate JoinCan use any partition function f for R (even round robin)

Can do any join, not just equijoin (e.g., R S)

CS 347 Notes 3 62

⋈R.A<S.B

General Fragment + Replicate Join

CS 347 Notes 3 63

RA R1

R2

R3

RB

R1

R2

R3

f

n copies of each fragment

partition→3 fragments

General Fragment + Replicate Join

CS 347 Notes 3 64

RA R1

R2

R3

RB

R1

R2

R3

f

n copies of each fragment

partition→3 fragments

S is partitioned in similar fashion

General Fragment + Replicate Join

CS 347 Notes 3 65

n × m pairings

of R, S fragments

R1 S1

R2 S1

R3 S1

R1 S2

R2 S2

R3 S2

result

General Fragment + Replicate Join

The asymmetric F+R join is special case of the general F+R

Asymmetric F+R may be good if S is small

Also works on non-equijoins

CS 347 Notes 3 66

Semi-JoinGoal: reduce communication traffic

R ⋈ S ⟹ ( R ⋉ S ) ⋈ S or

R ⋈ ( S ⋉ R ) or

( R ⋉ S ) ⋈ ( S ⋉ R )

CS 347 Notes 3 67

A A A

A A

AA A

Semi-JoinR ⋈ S

CS 347 Notes 3 68

2 a10 b25 c30 d

3 x10 y15 z25 w32 x

R S A AB C

Semi-JoinR ⋈ S

CS 347 Notes 3 69

2 a10 b25 c30 d

3 x10 y15 z25 w32 x

R S A AB C

∏AR = [2, 10, 25, 30]

Semi-JoinR ⋈ S

CS 347 Notes 3 70

2 a10 b25 c30 d

3 x10 y15 z25 w32 x

R S A AB C

∏AR = [2, 10, 25, 30]

S ⋉R = 10 y25 w

R ⋈ S

Semi-Join

CS 347 Notes 3 71

2 a10 b25 c30 d

3 x10 y15 z25 w32 x

R S A AB C

Transmitted dataWith semi-join R ⋈ ( S ⋉ R ): T = 4 | A | + 2 | A + C | + resultWith join R ⋈ S : T = 4 | A + B | + result

better if | B | is large

Semi-JoinAssume R is the smaller relation

Then, in general,

( R ⋉ S ) ⋈ S is better than ( R ⋈ S ) ifsize( ∏A S ) + size ( R ⋉ S ) < size ( R )

Can do similar comparison for other joins→ Only taking into account transmission cost

CS 347 Notes 3 72

A A A

A

Semi-JoinTrick: encode ∏A S (or ∏A R) as a bit vector

CS 347 Notes 3 73

0 0 1 1 0 1 0 0 0 0 1 0 1 0 0 keys in S

one bit/possible key

Semi-Join

CS 347 Notes 3 74

Useful for three way joins R ⋈ S ⋈ T

Semi-Join

CS 347 Notes 3 75

Useful for three way joins R ⋈ S ⋈ T

Option 1: R’ ⋈ S’ ⋈ T where R’ = R ⋉ S

S’ = S ⋉ T

Semi-Join

CS 347 Notes 3 76

Useful for three way joins R ⋈ S ⋈ T

Option 1: R’ ⋈ S’ ⋈ T where R’ = R ⋉ S

S’ = S ⋉ T

Option 2: R” ⋈ S’ ⋈ T where R” = R ⋉ S’

S’ = S ⋉ T

Semi-Join

CS 347 Notes 3 77

Useful for three way joins R ⋈ S ⋈ T

Option 1: R’ ⋈ S’ ⋈ T where R’ = R ⋉ S

S’ = S ⋉ T

Option 2: R” ⋈ S’ ⋈ T where R” = R ⋉ S’

S’ = S ⋉ T…→ Many options ~ exponential in # of relations

Privacy Preserving JoinSite 1 has R(A, B)Site 2 has S(A, C)

Want to compute R ⋈ S

Site 1 should not discover any S info not in the joinSite 2 should not discover any R info not in the join

CS 347 Notes 3 78

Privacy Preserving JoinSemi-join won’t workIf site 1 sends ∏A R to site 2, site 2 learns all keys of R

CS 347 Notes 3 79

a1 b1

a2 b2

a3 b3

a4 b4

a1 c1

a3 c2

a5 c3

a7 c4

R S A AB C

site 1 site 2

∏A R = (a1, a2, a3, a4)

Privacy Preserving JoinFix: send hash keys onlySite 1 hashes each value of A before sendingSite 2 hashes its own A values (same h) to see what tuples match

CS 347 Notes 3 80

a1 b1

a2 b2

a3 b3

a4 b4

a1 c1

a3 c2

a5 c3

a7 c4

R S A AB C

∏A R = (h(a1), h(a2), h(a3), h(a4))

site 2 sees it has h(a1), h(a3)

(a1, c1), (a3, c3)site 1 site 2

Privacy Preserving JoinWhat is the problem?

CS 347 Notes 3 81

a1 b1

a2 b2

a3 b3

a4 b4

a1 c1

a3 c2

a5 c3

a7 c4

R S A AB C

∏A R = (h(a1), h(a2), h(a3), h(a4))

site 2 sees it has h(a1), h(a3)

(a1, c1), (a3, c3)site 1 site 2

Privacy Preserving JoinWhat is the problem?

CS 347 Notes 3 82

a1 b1

a2 b2

a3 b3

a4 b4

a1 c1

a3 c2

a5 c3

a7 c4

R S A AB C

∏A R = (h(a1), h(a2), h(a3), h(a4))

site 2 sees it has h(a1), h(a3)

(a1, c1), (a3, c3)site 1 site 2

Dictionary attackSite 2 can take all keys a1, a2, a3, … and checks if h(a1), h(a2), h(a3), …matches what site 1 sent

Privacy Preserving JoinOur adversary model: honest but curious

Dictionary attack is possible (cheating is internal and can’t be caught)

Sending incorrect keys not possible (cheater could be caught)

CS 347 Notes 3 83

Privacy Preserving JoinOne solutionInformation Sharing Across Private Databases. Agrawal et al., SIGMOD 2003

→ Use commutative encryption functions

Ei(x) = x encrypted using a key private to site iE1(E2(x)) = E2(E1(x))

Shorthand: E1(x) is x, E2(x) is x, E1(E2(x)) is x

CS 347 Notes 3 84

One solution

Privacy Preserving Join

CS 347 Notes 3 85

a1 b1

a2 b2

a3 b3

a4 b4

a1 c1

a3 c2

a5 c3

a7 c4

R S A AB C

( a1, a2, a3, a4 )

( a1, a2, a3, a4 )

( a1, a3, a5, a7 )

Computes ( a1, a3, a5, a7 ), intersects with ( a1, a2, a3, a4 )

(a1, b1), (a3, b3)

site 1 site 2

Other Privacy Preserving OperationsInequality join

Similarity join

CS 347 Notes 3 86

R ⋈ SR.A>S.A

R ⋈ Ssim(R.A,S.A)<e

Other Parallel OperationsDuplicate eliminationSort first (in parallel) then eliminate duplicates in resultPartition tuples (range or hash) and eliminate locally

AggregatesPartition by grouping attributes; compute aggregate locally

CS 347 Notes 3 87

Parallel/Distributed Aggregates

CS 347 Notes 3 88

sum (sal) group by dept

id dept salary1 toy 102 toy 203 sales 15

Ra

id dept salary4 sales 55 toy 206 mgmt 157 sales 108 mgmt 30

Rb

Parallel/Distributed Aggregates

CS 347 Notes 3 89

id dept salary1 toy 102 toy 203 sales 15

Ra

id dept salary4 sales 55 toy 206 mgmt 157 sales 108 mgmt 30

Rb

id dept salary1 toy 102 toy 205 toy 206 mgmt 158 mgmt 30

id dept salary3 sales 154 sales 57 sales 10

sum (sal) group by dept

Parallel/Distributed Aggregates

CS 347 Notes 3 90

id dept salary1 toy 102 toy 203 sales 15

Ra

id dept salary4 sales 55 toy 206 mgmt 157 sales 108 mgmt 30

Rb

id dept salary1 toy 102 toy 205 toy 206 mgmt 158 mgmt 30

id dept salary3 sales 154 sales 57 sales 10

dept sumtoy 50

mgmt 45

dept sumsales 30

sum

sum

sum (sal) group by dept

Parallel/Distributed Aggregates

CS 347 Notes 3 91

id dept salary1 toy 102 toy 203 sales 15

Ra

id dept salary4 sales 55 toy 206 mgmt 157 sales 108 mgmt 30

Rb

sum (sal) group by dept with less data

Parallel/Distributed Aggregates

CS 347 Notes 3 92

id dept salary1 toy 102 toy 203 sales 15

Ra

id dept salary4 sales 55 toy 206 mgmt 157 sales 108 mgmt 30

Rb

dept salarytoy 30toy 20

mgmt 45

dept salarysales 15sales 15

sum

sum

sum (sal) group by dept with less data

Parallel/Distributed Aggregates

CS 347 Notes 3 93

id dept salary1 toy 102 toy 203 sales 15

Ra

id dept salary4 sales 55 toy 206 mgmt 157 sales 108 mgmt 30

Rb

dept salarytoy 30toy 20

mgmt 45

dept salarysales 15sales 15

sum

sum

dept sumtoy 50

mgmt 45

dept sumsales 30

sum

sum

sum (sal) group by dept with less data

Parallel/Distributed AggregatesEnhancementPerform aggregate during partition to reduce data transmitted→ Does not work for all aggregate functions—which ones?

CS 347 Notes 3 94

Preview: MapReduce

CS 347 Notes 3 95

data A1

data A2

data A3

data B1

data B2

data C1

data C1

map

reduce

Parallel/Distributed SelectionStraightforward if one can leverage range or hash partitioning

How do indexes work?

CS 347 Notes 3 96

Parallel/Distributed SelectionPartition vector can act as the root of a distributed index

CS 347 Notes 3 97

k0 k1

site 1 site 2 site 3

local indexes

Parallel/Distributed SelectionDistributed indexes on a non-partition attributes get complicated

CS 347 Notes 3 98

k0 k1

index sites

tuple sites

Parallel/Distributed SelectionIndexing schemesWhich one is better in a distributed environment?How to make updates and expansion efficient?Where to store the directory and set of participants?Is global knowledge necessary?

→ If the index is not too big, it may be better to keep it whole and replicate it

CS 347 Notes 3 99

Query ProcessingDecomposition ✔Localization ✔

OptimizationOverview ✔Joins and other operations ✔Inter-operation parallelismOptimization strategies

CS 347 Notes 3 100

Inter-Operation ParallelismPipelinedIndependent

CS 347 Notes 3 101

Pipelined Parallelism

CS 347 Notes 3 102

σc

R

S

site 2

site 1

site 1 join

R S

site 2

resultprobetuples matching σc

Pipelined ParallelismPipelining cannot be used in all cases

CS 347 Notes 3 103

stream ofR tuples

stream ofS tuples

Independent Parallelism

CS 347 Notes 3 104

⋈⋈ ⋈

R S T V

site 1 site 2

1. temp1 ⟵R ⋈ Stemp2 ⟵ T ⋈ V

2. result ⟵ temp1 ⋈ temp2

SummaryAs we consider query plans for optimization, we must consider various new strategies for

individual operations

scheduling operations

CS 347 Notes 3 105