Query Execution: Hashing-Based Two-Pass Algorithms

—Gaurang Patel— (205)


Agenda:
-- Terminology
-- Hash Basics
-- Partitioning relation by hashing
-- Hash-based Grouping and Aggregation
-- Union, Intersection and Difference
-- Hash-Join Algorithm
-- Conclusion

Terminology

Query optimization:
-- Logical query plan
-- Physical plan

Query execution:
-- Query processor: the group of DBMS components that converts user queries into database operations
-- Operation
-- Relation: an argument of an operation

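Hash Basics

-- Used when the data is too large to fit in main memory
-- A hash function splits large relations into buckets that are stored on disk
-- M memory buffers are available
-- Working on one bucket at a time gains a factor of M in the size of relations that can be handled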

Partitioning relation by hashing

Algorithm:

Tuples with the same hash value go to the same bucket
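The partitioning step can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions: the function name `partition_by_hash` and its `key` parameter are hypothetical, and plain lists stand in for the disk-resident buckets. A real system would keep one memory buffer per bucket and write it to disk whenever it fills.

```python
def partition_by_hash(tuples, num_buckets, key=lambda t: t):
    """Distribute tuples over num_buckets lists using a hash of key(t).

    Assumption: lists simulate buckets on disk; with M buffers,
    num_buckets would typically be M - 1 (one buffer reads the input).
    """
    buckets = [[] for _ in range(num_buckets)]
    for t in tuples:
        # Tuples with equal keys have equal hashes, so they share a bucket.
        buckets[hash(key(t)) % num_buckets].append(t)
    return buckets


rows = [(1, "x"), (2, "y"), (1, "z"), (7, "w")]
buckets = partition_by_hash(rows, 3, key=lambda t: t[0])
```

Every tuple ends up in exactly one bucket, and the two tuples with key 1 are guaranteed to share a bucket.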

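Hash-based Grouping and Aggregation

-- Hash key depends only on the grouping attributes, so all tuples of a group land in the same bucket
-- First pass: hash the tuples of the relation into buckets
-- Second pass: process each bucket in turn; only one record per group is needed in memory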
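A minimal sketch of two-pass grouping with a SUM aggregate, assuming rows are `(group, value)` pairs and that in-memory lists stand in for buckets on disk. The name `hash_group_sum` is hypothetical and chosen only for this illustration.

```python
def hash_group_sum(rows, num_buckets):
    """Two-pass grouping sketch: SUM(value) GROUP BY group."""
    # Pass 1: partition on the grouping attribute only.
    buckets = [[] for _ in range(num_buckets)]
    for group, value in rows:
        buckets[hash(group) % num_buckets].append((group, value))

    # Pass 2: handle one bucket at a time, keeping one record per group.
    result = {}
    for bucket in buckets:
        totals = {}
        for group, value in bucket:
            totals[group] = totals.get(group, 0) + value
        # A group never spans two buckets, so no keys collide here.
        result.update(totals)
    return result


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
print(hash_group_sum(rows, 2))
```

Because the hash depends only on the grouping attribute, each bucket can be aggregated independently of the others.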

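Union, Intersection and Difference

Binary operations: use the same hash function for both arguments

Union: R ∪ S
-- First pass: hash R and S into M-1 buckets each, 2(M-1) buckets in total
-- Second pass: take the union of each pair of matching buckets, which avoids duplicates
Same for R ∩ S and R – S.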
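The three set operations can share one sketch, since they differ only in what is done with each pair of matching buckets. The code below assumes set semantics and hashable tuples, and uses in-memory lists in place of disk buckets; `hash_set_operation` and its `op` parameter are hypothetical names.

```python
def _partition(tuples, num_buckets):
    buckets = [[] for _ in range(num_buckets)]
    for t in tuples:
        buckets[hash(t) % num_buckets].append(t)
    return buckets


def hash_set_operation(r, s, num_buckets, op):
    """Set union, intersection, or difference of r and s, bucket by bucket."""
    # Pass 1: the same hash function is applied to both arguments.
    r_buckets = _partition(r, num_buckets)
    s_buckets = _partition(s, num_buckets)

    # Pass 2: equal tuples can only meet in buckets with the same index.
    out = []
    for r_i, s_i in zip(r_buckets, s_buckets):
        r_set, s_set = set(r_i), set(s_i)
        if op == "union":
            out.extend(r_set | s_set)
        elif op == "intersection":
            out.extend(r_set & s_set)
        elif op == "difference":
            out.extend(r_set - s_set)
        else:
            raise ValueError(f"unknown operation: {op}")
    return out


r = [1, 2, 3, 3]
s = [3, 4]
print(sorted(hash_set_operation(r, s, 2, "union")))
```

Using one hash function for both relations is what makes the per-bucket work sufficient: a tuple of R and an identical tuple of S always hash to the same bucket index.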

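I/O operations needed:
-- B(R) + B(S) to read the buckets in the second pass
-- 2(B(R) + B(S)) more for hashing: one read of the input and one write of the buckets
-- Total: 3(B(R) + B(S))
For the two-pass algorithm to work:
-- min(B(R),B(S)) ≤ M²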
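The cost formula and the memory condition can be checked with a little arithmetic. The helper names below are hypothetical, the block counts are made-up example values, and the condition is the approximate bound from the slide rather than an exact one.

```python
def two_pass_hash_cost(b_r, b_s):
    """Disk I/Os: read input, write buckets, read buckets back."""
    return 3 * (b_r + b_s)


def fits_two_pass(b_r, b_s, m):
    """Approximate feasibility test: min(B(R), B(S)) <= M^2."""
    return min(b_r, b_s) <= m * m


# Example values (assumed): B(R) = 1000, B(S) = 500, M = 101 buffers.
print(two_pass_hash_cost(1000, 500))   # 4500
print(fits_two_pass(1000, 500, 101))   # True, since 500 <= 10201
```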

Hash-Join Algorithm

R(X,Y) ⋈ S(Y,Z)
-- Same as the other binary operations
-- Only difference: the hash key is the join attribute Y
I/O operations:
-- 3(B(R)+B(S))
-- Two passes require min(B(R),B(S)) ≤ M²
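A minimal sketch of the two-pass hash join, assuming R holds `(x, y)` pairs, S holds `(y, z)` pairs, and in-memory lists stand in for buckets on disk. The function name `hash_join` is hypothetical, and the per-bucket join shown here (a dictionary built on the S bucket) is just one possible one-pass join.

```python
def hash_join(r, s, num_buckets):
    """Natural join of R(X, Y) and S(Y, Z) on Y; returns (x, y, z) tuples."""
    # Pass 1: partition both relations on the join attribute Y.
    r_buckets = [[] for _ in range(num_buckets)]
    s_buckets = [[] for _ in range(num_buckets)]
    for x, y in r:
        r_buckets[hash(y) % num_buckets].append((x, y))
    for y, z in s:
        s_buckets[hash(y) % num_buckets].append((y, z))

    # Pass 2: join each pair of matching buckets independently.
    out = []
    for r_i, s_i in zip(r_buckets, s_buckets):
        index = {}
        for y, z in s_i:
            index.setdefault(y, []).append(z)
        for x, y in r_i:
            for z in index.get(y, []):
                out.append((x, y, z))
    return out


r = [(1, "a"), (2, "b"), (3, "a")]
s = [("a", 10), ("c", 30), ("a", 11)]
print(sorted(hash_join(r, s, 4)))
```

Tuples that agree on Y always fall into buckets with the same index, so no matches are lost by joining bucket pairs separately.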

-- Further techniques can reduce the number of I/O operations