Query execution hashing based Two pass algorithms
-
Upload
quentin-lott -
Category
Documents
-
view
49 -
download
1
description
Transcript of Query execution hashing based Two pass algorithms
Terminology Hash Basics Partitioning relation by hashing Hash-based Grouping and Aggregation Union, Intersection and Difference Hash-Join Algorithm Conclusion
Query optimization -- Logical Query plan -- Physical plan Query execution: -- Query processor- group of DBMS
components -- Converts user queries into database
operations Operation Relation- arguments of operation
Large data Hash functions to store large relations Memory buffers Gain factor of M in the size of relations
Tuples from same block foes to same bucket
Hash key depends on grouping attributes First pass: Process each bucket in turn. Second pass: Only one record per group.
Binary operation- same hash function for both arguments
Union: R U S First Pass -- 2(M-1) buckets -- Avoid duplicates Same for R ∩ S, R – S..
I/O operations needed: -- B(R) + B(S) -- 2 more for hashing -- Total: 3(B(R) + B(S)) For, two pass algorithm: -- min(B(R),B(S)) ≤ M2
R(X,Y) ►◄ S(Y,Z) Same as other binary operations Only difference in hash key, Y I/O operations: -- 3(B(R)+B(S)) -- Two pass require min(B(R),B(S)) ≤ M2
-- Further techniques to reduce number of I/O operations