Query Processing

04/19/23

Query ProcessingQuery Processing

Chapter 13: Query ProcessingChapter 13: Query Processing

Overview Overview Measures of Query CostMeasures of Query Cost Selection Operation Selection Operation Sorting Sorting Join Operation Join Operation Other OperationsOther Operations Evaluation of ExpressionsEvaluation of Expressions

Basic Steps in Query ProcessingBasic Steps in Query Processing

Basic Steps in Query ProcessingBasic Steps in Query Processing

Parsing and translationParsing and translation translate the query into its internal form. This translate the query into its internal form. This

is then translated into relational algebra.is then translated into relational algebra. Parser checks syntax, verifies relationsParser checks syntax, verifies relations

EvaluationEvaluation The query-execution engine takes a query-The query-execution engine takes a query-

evaluation plan, executes that plan, and evaluation plan, executes that plan, and returns the answers to the query.returns the answers to the query.

Basic Steps in Query Processing : Basic Steps in Query Processing : OptimizationOptimization

A relational algebra expression may have many A relational algebra expression may have many equivalent expressionsequivalent expressions E.g., E.g., balancebalance25002500((balancebalance((account)) account)) is equivalent is equivalent

to to balancebalance((balancebalance25002500((account))account))

Annotated expression specifying detailed Annotated expression specifying detailed evaluation strategy is called an evaluation strategy is called an evaluation-planevaluation-plan.. E.g., can use an index on E.g., can use an index on balancebalance to find to find

accounts with balance < 2500,accounts with balance < 2500, or can perform complete relation scan and or can perform complete relation scan and

discard accounts with balance discard accounts with balance 2500 2500

Basic Steps: OptimizationBasic Steps: Optimization

Query OptimizationQuery Optimization:: Amongst all equivalent Amongst all equivalent evaluation plans choose the one with lowest cost. evaluation plans choose the one with lowest cost. Cost is estimated using statistical information Cost is estimated using statistical information

from the database catalogfrom the database catalog e.g. number of tuples in each relation, size of tuples, etc.e.g. number of tuples in each relation, size of tuples, etc.

In this chapter we studyIn this chapter we study How to measure query costsHow to measure query costs Algorithms for evaluating relational algebra Algorithms for evaluating relational algebra

operationsoperations How to combine algorithms for individual How to combine algorithms for individual

operations in order to evaluate a complete operations in order to evaluate a complete expressionexpression

Measures of Query CostMeasures of Query Cost

Cost is generally measured as total elapsed time for Cost is generally measured as total elapsed time for answering queryanswering query Many factors contribute to time costMany factors contribute to time cost

disk accesses, CPUdisk accesses, CPU, or even network , or even network communicationcommunication Typically disk access is the predominant cost, and is Typically disk access is the predominant cost, and is

also relatively easy to estimate. Measured by taking also relatively easy to estimate. Measured by taking into accountinto account Number of seeks * average-seek-costNumber of seeks * average-seek-cost Number of blocks read * average-block-read-costNumber of blocks read * average-block-read-cost Number of blocks written * average-block-write-costNumber of blocks written * average-block-write-cost

Cost to write a block is greater than cost to read a block Cost to write a block is greater than cost to read a block data is read back after being written to ensure that the write data is read back after being written to ensure that the write

was successfulwas successful


For simplicity we just use the For simplicity we just use the number of block number of block transferstransfers as the cost measuresas the cost measures bb

We ignore CPU costs for simplicityWe ignore CPU costs for simplicity We do not include cost to writing output to disk in We do not include cost to writing output to disk in

our cost formulaeour cost formulae


Several algorithms can reduce disk IO by using Several algorithms can reduce disk IO by using extra buffer space extra buffer space Amount of real memory available to buffer Amount of real memory available to buffer

depends on other concurrent queries and OS depends on other concurrent queries and OS processes, known only during executionprocesses, known only during execution

We often use worst case estimates, assuming only We often use worst case estimates, assuming only the minimum amount of memory needed for the the minimum amount of memory needed for the operation is availableoperation is available

Required data may be buffer resident already, Required data may be buffer resident already, avoiding disk I/Oavoiding disk I/O But hard to take into account for cost But hard to take into account for cost

estimationestimation

Selection OperationSelection Operation

File scanFile scan – search algorithms that locate and – search algorithms that locate and retrieve records that fulfill a selection condition.retrieve records that fulfill a selection condition.


Algorithm Algorithm A1A1 ( (linear searchlinear search). Scan each file block ). Scan each file block and test all records to see whether they satisfy and test all records to see whether they satisfy the selection condition.the selection condition. Cost estimate = Cost estimate = bbr r block transfersblock transfers

bbr r denotes number of blocks containing records from denotes number of blocks containing records from relation relation rr

If selection is on a key attribute, can stop on If selection is on a key attribute, can stop on finding recordfinding record

cost = (cost = (bbr r /2) block transfers /2) block transfers Linear search can be applied regardless of Linear search can be applied regardless of

selection condition orselection condition or ordering of records in the file, or ordering of records in the file, or availability of indicesavailability of indices


A2 A2 (binary search). (binary search). Applicable if selection is an Applicable if selection is an equality comparison on the attribute on which file is equality comparison on the attribute on which file is ordered. ordered. Assume that the blocks of a relation are stored Assume that the blocks of a relation are stored

contiguously contiguously Cost estimate (number of disk blocks to be Cost estimate (number of disk blocks to be

scanned):scanned): cost of locating the first tuple by a binary search on the cost of locating the first tuple by a binary search on the

blocksblocks loglog22((bbrr))

If there are multiple records satisfying selectionIf there are multiple records satisfying selection Add transfer cost of the Add transfer cost of the number of blocks containing records number of blocks containing records

that satisfy selection condition that satisfy selection condition

Selections Using IndicesSelections Using Indices

Index scan Index scan – search algorithms that use an – search algorithms that use an indexindex selection condition must be on search-key of selection condition must be on search-key of

index.index.


A3 A3 ((primary index on candidate key, equalityprimary index on candidate key, equality). ). Retrieve a single record that satisfies the Retrieve a single record that satisfies the corresponding equality condition corresponding equality condition CostCost = = hhii + 1+ 1


A4 A4 ((primary index on nonkey, equality) primary index on nonkey, equality) Retrieve Retrieve multiple records. multiple records. Records will be on consecutive blocksRecords will be on consecutive blocks

Let b = number of blocks containing matching Let b = number of blocks containing matching recordsrecords

CostCost = = hhii + b+ b


A5A5 ( (equality on search-key of secondary index).equality on search-key of secondary index). Retrieve a single record if the search-key is a Retrieve a single record if the search-key is a

candidate keycandidate key Cost = hCost = hii + 1+ 1

Retrieve multiple records if search-key is not a Retrieve multiple records if search-key is not a candidate keycandidate key

each of each of nn matching records may be on a different matching records may be on a different block block

Cost = Cost = hhii + + nn Can be very expensive!Can be very expensive!

Selections Involving ComparisonsSelections Involving Comparisons

Can implement selections of the form Can implement selections of the form AAV V ((rr) or ) or A A

VV((rr) by using) by using a linear file scan or binary search,a linear file scan or binary search, or by using indices in the following ways:or by using indices in the following ways:


A6A6 ( (primary index, comparison).primary index, comparison). (Relation is (Relation is sorted on A)sorted on A)

For For A A V V(r)(r) use index to find first tuple use index to find first tuple v v and scan and scan relation sequentially from thererelation sequentially from there

For For AAV V ((rr) just scan relation sequentially till first tuple ) just scan relation sequentially till first tuple > > v; v; do not use indexdo not use index


A7A7 ( (secondary index, comparisonsecondary index, comparison). ). For For A A V V(r)(r) use index to find first index entry use index to find first index entry v v and and

scan index sequentially from there, to find pointers scan index sequentially from there, to find pointers to records.to records.

For For AAV V ((rr) just scan leaf pages of index finding ) just scan leaf pages of index finding pointers to records, till first entry > pointers to records, till first entry > vv

In either case, retrieve records that are pointed toIn either case, retrieve records that are pointed to requires an I/O for each recordrequires an I/O for each record Linear file scan may be cheaperLinear file scan may be cheaper

SortingSorting

We may build an index on the relation, and then We may build an index on the relation, and then use the index to read the relation in sorted order. use the index to read the relation in sorted order. May lead to one disk block access for each tuple.May lead to one disk block access for each tuple.

For relations that fit in memory, techniques like For relations that fit in memory, techniques like quicksort can be used. For relations that don’t fit quicksort can be used. For relations that don’t fit in memory, in memory, external sort-merge external sort-merge is a good is a good choice. choice.

External Sort-MergeExternal Sort-Merge

1.1. Create sortedCreate sorted runsruns. Let . Let ii be 0 initially. be 0 initially. Repeatedly do the following till the end of the Repeatedly do the following till the end of the relation:relation: (a) Read (a) Read MM blocks of relation into memory blocks of relation into memory (b) Sort the in-memory blocks (b) Sort the in-memory blocks (c) Write sorted data to run (c) Write sorted data to run RRii; increment ; increment i.i.

Let the final value ofLet the final value of i i be be NN

2.2. Merge the runs (next slide)…..Merge the runs (next slide)…..

Let M denote memory size (in pages).

External Sort-Merge (Cont.)External Sort-Merge (Cont.)

2.2. Merge the runs (N-way merge)Merge the runs (N-way merge). . We assume (for now) We assume (for now)

that that NN < < MM. .

1.1. Use Use NN blocks of memory to buffer input runs, and blocks of memory to buffer input runs, and 1 block to buffer output. Read the first block of 1 block to buffer output. Read the first block of each run into its buffer pageeach run into its buffer page

2.2. repeatrepeat1.1. Select the first record (in sort order) among all buffer Select the first record (in sort order) among all buffer

pagespages

2.2. Write the record to the output buffer. If the output Write the record to the output buffer. If the output buffer is full write it to disk.buffer is full write it to disk.

3.3. Delete the record from its input buffer page.Delete the record from its input buffer page.IfIf the buffer page becomes empty the buffer page becomes empty thenthen read the next block (if any) of the run into the buffer. read the next block (if any) of the run into the buffer.

3.3. untiluntil all input buffer pages are empty: all input buffer pages are empty:

External Sort-Merge (Cont.)External Sort-Merge (Cont.)

If If NN MM, several merge , several merge passespasses are required. are required. In each pass, contiguous groups of In each pass, contiguous groups of M M - 1 - 1

runs are merged. runs are merged. A pass reduces the number of runs by a A pass reduces the number of runs by a

factor of factor of MM -1, and creates runs longer by -1, and creates runs longer by the same factor. the same factor.

E.g. If M=11, and there are 90 runs, one pass E.g. If M=11, and there are 90 runs, one pass reduces the number of runs to 9, each 10 times reduces the number of runs to 9, each 10 times the size of the initial runsthe size of the initial runs

Repeated passes are performed till all runs Repeated passes are performed till all runs have been merged into one.have been merged into one.

Example: External Sorting Using Sort-MergeExample: External Sorting Using Sort-Merge

External Merge Sort (Cont.)External Merge Sort (Cont.)

Cost analysis:Cost analysis: Total number of merge passes required: Total number of merge passes required: loglogMM––

11((bbrr/M)/M).. Block transfers for initial run creation as well as Block transfers for initial run creation as well as

in each pass is 2in each pass is 2bbrr

for final pass, we don’t count write cost for final pass, we don’t count write cost we ignore final write cost for all operations since the we ignore final write cost for all operations since the

output of an operation may be sent to the parent output of an operation may be sent to the parent operation without being written to diskoperation without being written to disk

Thus total number of block transfers for external Thus total number of block transfers for external sorting:sorting:

bbr r ( 2 ( 2 loglogMM–1–1((bbr r / M)/ M) + 1) + 1)

Join OperationJoin Operation

Several different algorithms to implement joinsSeveral different algorithms to implement joins Nested-loop joinNested-loop join Block nested-loop joinBlock nested-loop join Indexed nested-loop joinIndexed nested-loop join Merge-joinMerge-join Hash-joinHash-join

Choice based on cost estimateChoice based on cost estimate Examples use the following informationExamples use the following information

Number of records of Number of records of customercustomer: 10,000 : 10,000 depositordepositor: 5000: 5000

Number of blocks of Number of blocks of customercustomer: 400 : 400 depositordepositor: 100: 100

Nested-Loop JoinNested-Loop Join To compute the theta join To compute the theta join rr ss

for eachfor each tuple tuple ttrr in in rr do begin do begin

for each tuple for each tuple ttss in in ss do begin do begin

test pair (test pair (ttrr,t,tss) to) to see if they satisfy the join see if they satisfy the join

condition condition if they do, add if they do, add ttrr • t• tss to the result.to the result.

endendendend

rr is called the is called the outerouter relationrelation and and ss the the inner inner relationrelation of the join. of the join.

Requires no indices and can be used with any kind of Requires no indices and can be used with any kind of join condition.join condition.

Expensive since it examines every pair of tuples in Expensive since it examines every pair of tuples in the two relations. the two relations.

Nested-Loop Join (Cont.)Nested-Loop Join (Cont.)

In the worst case, if there is enough memory only to hold In the worst case, if there is enough memory only to hold one block of each relation, the estimated cost is one block of each relation, the estimated cost is nnrr bbss + + b brr

If the smaller relation fits entirely in memory, use that as If the smaller relation fits entirely in memory, use that as the inner relation.the inner relation. Reduces cost to Reduces cost to bbrr + + bbss block transfersblock transfers

Assuming worst case memory availability cost estimate Assuming worst case memory availability cost estimate isis with with depositor depositor as outer relation:as outer relation:

5000 5000 400 + 100 = 2,000,100 block transfers, 400 + 100 = 2,000,100 block transfers, with with customer customer as the outer relation as the outer relation

10000 10000 100 + 400 = 1,000,400 block transfers 100 + 400 = 1,000,400 block transfers If smaller relation (If smaller relation (depositor)depositor) fits entirely in memory, fits entirely in memory,

the cost estimate will be 500 block transfers.the cost estimate will be 500 block transfers. Block nested-loops algorithm (next slide) is preferable.Block nested-loops algorithm (next slide) is preferable.

Block Nested-Loop JoinBlock Nested-Loop Join

Variant of nested-loop join in which every Variant of nested-loop join in which every block of inner relation is paired with every block of inner relation is paired with every block of outer relation.block of outer relation.

for each for each block block BBrr ofof rr do begin do begin

for eachfor each block block BBss of of s s do begindo begin

for eachfor each tuple tuple ttrr in in BBr r do begindo begin

for each for each tuple tuple ttss in in BBss do begindo begin

Check if (Check if (ttrr,t,tss) ) satisfy the join condition satisfy the join condition

if they do, add if they do, add ttrr • • ttss to the result.to the result.

endendendend

endendendend

Block Nested-Loop Join (Cont.)Block Nested-Loop Join (Cont.)

Worst case estimate: Worst case estimate: bbrr b bss + b + brr block block transferstransfers

Best case: Best case: bbrr ++ b bss block transfers.block transfers.

Indexed Nested-Loop JoinIndexed Nested-Loop Join

Index lookups can replace file scans ifIndex lookups can replace file scans if join is an equi-join or natural join andjoin is an equi-join or natural join and an index is available on the inner relation’s join an index is available on the inner relation’s join

attributeattribute Can construct an index just to compute a join.Can construct an index just to compute a join.

For each tuple For each tuple ttrr in the outer relation in the outer relation r,r, use the index to use the index to

look up tuples in look up tuples in ss that satisfy the join condition with tuple that satisfy the join condition with tuple ttrr..

Worst case: buffer has space for only one page of Worst case: buffer has space for only one page of rr, and, , and, for each tuple in for each tuple in rr, we perform an index lookup on , we perform an index lookup on s.s.

Cost of the join: Cost of the join: bbrr + + nnrr cc Where Where cc is the cost of traversing index and fetching all is the cost of traversing index and fetching all

matching matching ss tuples for one tuple or tuples for one tuple or rr

Example of Nested-Loop Join CostsExample of Nested-Loop Join Costs

Compute Compute depositor customer, depositor customer, with with depositordepositor as the outer as the outer relation.relation.

Let Let customercustomer have a primary B have a primary B++-tree index on the join -tree index on the join attribute attribute customer-name, customer-name, which contains 20 entries in each which contains 20 entries in each index node.index node.

SinceSince customer customer has 10,000 tuples, the height of the tree is has 10,000 tuples, the height of the tree is 4, and one more access is needed to find the actual data4, and one more access is needed to find the actual data

depositordepositor has 5000 tuples has 5000 tuples Cost of block nested loops joinCost of block nested loops join

400*100 + 100 = 40,100 block transfers400*100 + 100 = 40,100 block transfers assuming worst case memory assuming worst case memory may be significantly less with more memorymay be significantly less with more memory

Cost of indexed nested loops joinCost of indexed nested loops join

100 + 5000 * 5 = 25,100 block transfers100 + 5000 * 5 = 25,100 block transfers

Other OperationsOther Operations

Duplicate elimination Duplicate elimination can be implemented via hashing can be implemented via hashing or sorting.or sorting. On sorting duplicates will come adjacent to each On sorting duplicates will come adjacent to each

other, and all but one set of duplicates can be other, and all but one set of duplicates can be deleted. deleted.

Optimization: Optimization: duplicates can be deleted during run duplicates can be deleted during run generation as well as at intermediate merge steps in generation as well as at intermediate merge steps in external sort-merge.external sort-merge.

Projection:Projection: perform projection on each tuple perform projection on each tuple followed by duplicate elimination.followed by duplicate elimination.

Other Operations : AggregationOther Operations : Aggregation

AggregationAggregation can be implemented in a manner can be implemented in a manner similar to duplicate elimination.similar to duplicate elimination. Sorting or hashing can be used to bring tuples Sorting or hashing can be used to bring tuples

in the same group together, and then the in the same group together, and then the aggregate functions can be applied on each aggregate functions can be applied on each group.group.

Evaluation of ExpressionsEvaluation of Expressions

So far: we have seen algorithms for individual So far: we have seen algorithms for individual operationsoperations

Alternatives for evaluating an entire expression Alternatives for evaluating an entire expression treetree MaterializationMaterialization: generate results of an : generate results of an

expression whose inputs are relations or are expression whose inputs are relations or are already computed, already computed, materializematerialize (store) it on (store) it on disk. Repeat.disk. Repeat.

PipeliningPipelining: pass on tuples to parent : pass on tuples to parent operations even as an operation is being operations even as an operation is being executedexecuted

Query Processing

Documents

Transcript of Query Processing