Query processing
-
Upload
garapatiavinash -
Transcript of Query processing
5/2/2011
Query Processing and
Optimization
Introduction
• Users are expected to write "efficient" queries, but they
do not always do so.
– Users typically do not have enough information about the
database to write efficient queries, e.g., no information on
table sizes.
– Users cannot know whether a query is efficient without
knowing how the DBMS's query processor works.
• The DBMS's job is to optimize the user's query by:
– Converting the query to an internal representation (tree or
graph)
– Evaluating the costs of several possible ways of executing
the query and choosing the cheapest one.
Steps in Query Processing
[Figure: SQL query → Query Parsing → Parse Tree → Query Optimization → Execution Plan → Code Generation → Code → Runtime DB Processor → Result. Example execution plan: join Employee and Project using hash join, ...]
Query Processing
Query in a high-level language
→ Scanning, Parsing, & Validating
→ Intermediate form of query
→ QUERY OPTIMIZER
→ Execution Plan
→ Query Code Generator
→ Code to execute the query
→ Runtime DB Processor
→ Result of query
Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
Basic Steps in Query Processing
• Parsing and translation
– Translate the query into its internal form, which is then
translated into relational algebra.
– The parser checks syntax and verifies relations.
• Evaluation
– The query-execution engine takes a query-evaluation plan,
executes that plan, and returns the answers to the query.
Query Processing
• Consider the query:
select balance
from account
where balance<2500
• Can be translated into either of the following RA expressions:
σbalance<2500(πbalance(account))
πbalance(σbalance<2500(account))
• The two RA expressions are equivalent
Query Processing
• Each relational algebra operation can be evaluated using one of several different algorithms
– Correspondingly, a relational-algebra expression can be evaluated in many ways.
• An annotated expression specifying a detailed evaluation strategy is called an evaluation plan
– E.g., can use an index on balance to find accounts with balance < 2500,
– or can perform a complete relation scan and discard accounts with balance ≥ 2500
Query Plan
Query Optimization
• Amongst all equivalent evaluation plans, choose the one with the lowest cost.
– Cost is estimated using statistical information from the database catalog
• e.g. number of tuples in each relation, size of tuples, etc.
• First we need to learn:
– How to measure query costs
– Algorithms for evaluating relational algebra operations
– How to combine algorithms for individual operations in order to evaluate a complete expression
– How to optimize queries, that is, how to find an evaluation plan with the lowest estimated cost
Measures of Query Cost
• Cost is generally measured as the total elapsed time for answering a query
– Many factors contribute to time cost
• disk accesses, CPU, or even network communication
• Typically disk access is the predominant cost, and is also relatively easy to estimate. It is measured by taking into account
– Number of seeks * average-seek-cost
+ Number of blocks read * average-block-read-cost
+ Number of blocks written * average-block-write-cost
• Cost to write a block is greater than cost to read a block
– data is read back after being written to ensure that the write was successful
– Assumption: single disk
• Can modify formulae for multiple disks/RAID arrays
• Or just use single-disk formulae, but interpret them as measuring resource consumption instead of time
Measures of Query Cost (Cont.)
• For simplicity we just use the number of block transfers from disk and the number of seeks as the cost measures
– tT – time to transfer one block
– tS – time for one seek
– Cost for b block transfers plus S seeks: b * tT + S * tS
• We ignore CPU costs for simplicity
– Real systems do take CPU cost into account
• We do not include the cost of writing output to disk in our cost formulae
• Several algorithms can reduce disk I/O by using extra buffer space
– Amount of real memory available to buffer depends on other concurrent queries and OS processes, known only during execution
• We often use worst case estimates, assuming only the minimum amount of memory needed for the operation is available
• Required data may be buffer resident already, avoiding disk I/O
– But hard to take into account for cost estimation
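The cost measure b * tT + S * tS can be turned into a small calculation. The following is a minimal sketch: the function name io_cost and the sample timings (0.1 ms per block transfer, 4 ms per seek) are assumptions chosen for illustration, not values from the slides.

```python
def io_cost(block_transfers: int, seeks: int, t_transfer: float, t_seek: float) -> float:
    """Estimated I/O time b * tT + S * tS (CPU cost is ignored, as in the slides)."""
    return block_transfers * t_transfer + seeks * t_seek


# Hypothetical timings in milliseconds (assumed, not from the source).
T_TRANSFER_MS = 0.1
T_SEEK_MS = 4.0

# Example: reading 400 blocks sequentially after a single seek.
scan_cost_ms = io_cost(400, 1, T_TRANSFER_MS, T_SEEK_MS)
print(f"estimated cost: {scan_cost_ms:.1f} ms")
```

Because seeks are much more expensive than transfers under these assumed timings, plans that read many blocks sequentially can still beat plans that touch few blocks at random.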
Statistics and Catalogs
• For each Table
– Table name, file name (or some identifier) & file structure (e.g., heap file)
– Attribute name and type of each attribute
– Index name of each index
– Integrity constraints
• For each Index
– Index name & the structure (e.g., B+ tree)
– Search key attributes
• For each View
– View name & definition
Statistics and Catalogs
• Cardinality: NTuples(R) for each table R
• Size: NPages(R) for each table R
• Index Cardinality: Number of distinct key values NKeys(I) for each index I
• Index Size: INPages(I) for each index I
• For a B+ tree index, INPages is the number of leaf pages
• Index Height: Number of non-leaf levels IHeight(I) for each tree index I
• Index Range: ILow(I) & IHigh(I), the minimum and maximum key values present in index I
Statistics and Catalogs
• Catalogs updated periodically
– Updating whenever data changes is too expensive
• More detailed information (e.g., histograms of the values in some field) is sometimes stored.
Operator Evaluation
Algorithms for evaluating relational operators use some simple ideas extensively:
– Indexing: If a selection or join condition is specified, use an index to examine just the tuples that satisfy the condition.
– Iteration: Sometimes, faster to scan all tuples even if there is an index. (And sometimes, we can scan the data entries in an index instead of the table itself.)
– Partitioning: By using sorting or hashing, we can partition the input tuples and replace an expensive operation by similar operations on smaller inputs.
Access Paths
• An access path is a method of retrieving tuples:
• File scan, or an index that matches a selection (in the query)
• A tree index matches (a conjunction of) terms that involve only attributes in a prefix of the search key.
• E.g., a tree index on <a, b, c> matches the selections a=5 AND b=3, and a=5 AND b>6, but not b=3.
• A hash index matches (a conjunction of) terms that has a term of the form attribute = value for every attribute in the search key of the index.
• E.g., a hash index on <a, b, c> matches a=5 AND b=3 AND c=5; but it does not match b=3, or a=5 AND b=3, or a>5 AND b=3 AND c=5.
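The two matching rules can be checked mechanically. The sketch below is an illustration under simplifying assumptions: a selection is a list of (attribute, operator, value) terms, and the helper names tree_index_matches and hash_index_matches are hypothetical.

```python
from typing import List, Tuple

Term = Tuple[str, str, int]  # (attribute, operator, value), e.g. ("a", "=", 5)


def tree_index_matches(search_key: List[str], terms: List[Term]) -> bool:
    """True if the terms involve exactly the attributes of some prefix of the key."""
    attrs = {attr for attr, _, _ in terms}
    if not attrs:
        return False
    return attrs == set(search_key[:len(attrs)])


def hash_index_matches(search_key: List[str], terms: List[Term]) -> bool:
    """True if there is an equality term for every attribute of the search key."""
    eq_attrs = {attr for attr, op, _ in terms if op == "="}
    return set(search_key) <= eq_attrs


key = ["a", "b", "c"]
print(tree_index_matches(key, [("a", "=", 5), ("b", ">", 6)]))   # prefix <a, b>
print(tree_index_matches(key, [("b", "=", 3)]))                  # b alone is not a prefix
print(hash_index_matches(key, [("a", "=", 5), ("b", "=", 3)]))   # c is missing
```

The tree index tolerates range conditions because its entries are ordered, while the hash index needs an exact value for every key attribute in order to compute the bucket.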
Access Paths
• Selectivity: Number of pages retrieved (Index + data) to retrieve all desired tuples
• Using the most selective access path minimizes the cost of data retrieval
• Reduction Factor:
• Each conjunct is a filter
• The fraction of tuples satisfying a given conjunct is called the reduction factor
Query Optimization
• Techniques used by a DBMS to process, optimize, and execute high-level queries
• A high-level query is
– Scanned
– Parsed
– Validated
• Internal representation
– QUERY TREE
– QUERY GRAPH
• Many execution strategies
• Choosing a suitable one for processing a query is QUERY OPTIMIZATION
• Ideally: want to find the best plan
• Practically: avoid the worst plans!
Query Optimization
• Scanning
– The scanner identifies the language tokens, such as SQL keywords, attribute names, & relation names
• Parsing
– The parser checks the query syntax to determine whether it is formulated according to the grammar rules of the query language
• Validating
– Checking that all the attribute & relation names are valid and semantically meaningful names in the schema of the particular DB being queried
SQL Queries to
Relational Algebra
• SQL queries are optimized by decomposing them into a collection of smaller units, called blocks
• Query optimizer concentrates on optimizing a single block at a time
Translating SQL Queries into
Relational Algebra
• Query block: the basic unit that can be translated into the algebraic operators and optimized.
• A query block contains a single SELECT-FROM-WHERE expression, as well as GROUP BY and HAVING clause if these are part of the block.
• Nested queries within a query are identified as separate query blocks.
• Aggregate operators in SQL must be included in the extended algebra.
Translating SQL Queries into
Relational Algebra
SELECT LNAME, FNAME
FROM EMPLOYEE
WHERE SALARY > ( SELECT MAX (SALARY)
FROM EMPLOYEE
WHERE DNO = 5);
Inner block:
SELECT MAX (SALARY)
FROM EMPLOYEE
WHERE DNO = 5
Outer block (C stands for the result returned by the inner block):
SELECT LNAME, FNAME
FROM EMPLOYEE
WHERE SALARY > C
Outer block in relational algebra: πLNAME, FNAME (σSALARY>C(EMPLOYEE))
Inner block in relational algebra: ℱMAX SALARY (σDNO=5 (EMPLOYEE))
Select Operation
• File scan: scan all records of the file to find records that satisfy the selection condition
• Binary search: when the file is sorted on the attributes specified in the selection condition
• Index scan: use an index to locate the qualifying records
– Primary index, single record retrieval: equality comparison on a primary key attribute with a primary index
– Primary index, multiple records retrieval: comparison condition <, >, etc. on a key field with a primary index
– Clustering index to retrieve multiple records
– Secondary index to retrieve single or multiple records
Conjunctive Conditions
OP1 AND OP2 (e.g., EmpNo=123 AND Age=30)
• Conjunctive selection: evaluate the condition that has an index (i.e., that can be evaluated very fast), get the qualifying tuples, and then check whether these tuples satisfy the remaining conditions.
• Conjunctive selection using composite index: if there is a composite index on the attributes involved in one or more conditions, use the composite index to find the qualifying tuples.
[Figure: a composite index on (EmpNo, Age), with entries such as (012, 25) and (123, 30), pointing to the complete Employee records.]
• Conjunctive selection by intersection of record pointers: if secondary indexes are available, evaluate each condition and intersect the sets of record pointers obtained.
Conjunctive Conditions
When more than one attribute has an index:
– use the one that costs least, and
– the one that returns the smallest number of qualifying tuples
Disjunctive select conditions (OP1 OR OP2) are much more costly:
– potentially a large number of tuples will qualify
– costly if any one of the conditions doesn't have an index
The selectivity of a condition is the number of tuples that satisfy the condition divided by the total number of tuples. The smaller the selectivity, the fewer tuples are retrieved, and the more desirable it is to use that condition to retrieve the records.
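Conjunctive selection by intersection of record pointers, and the selectivity measure, can be sketched as follows. This is an illustration only: the in-memory "secondary indexes" are plain dictionaries, and the sample Employee rows and all function names are assumptions.

```python
from typing import Dict, List, Set

# Hypothetical Employee records; the list position plays the role of a record pointer.
employees: List[Dict[str, int]] = [
    {"EmpNo": 12, "Age": 25},
    {"EmpNo": 123, "Age": 30},
    {"EmpNo": 456, "Age": 30},
]


def build_secondary_index(records: List[Dict[str, int]], attr: str) -> Dict[int, Set[int]]:
    """Map each attribute value to the set of record pointers holding it."""
    index: Dict[int, Set[int]] = {}
    for rid, rec in enumerate(records):
        index.setdefault(rec[attr], set()).add(rid)
    return index


def conjunctive_select(records, indexes, conditions: Dict[str, int]):
    """Evaluate attr = value conditions by intersecting record-pointer sets."""
    pointer_sets = [indexes[attr].get(value, set()) for attr, value in conditions.items()]
    rids = set.intersection(*pointer_sets) if pointer_sets else set(range(len(records)))
    return [records[rid] for rid in sorted(rids)]  # fetch records only at the end


def selectivity(records, attr: str, value: int) -> float:
    """Fraction of tuples that satisfy attr = value."""
    return sum(1 for rec in records if rec[attr] == value) / len(records)


indexes = {attr: build_secondary_index(employees, attr) for attr in ("EmpNo", "Age")}
print(conjunctive_select(employees, indexes, {"EmpNo": 123, "Age": 30}))
print(selectivity(employees, "EmpNo", 123), selectivity(employees, "Age", 30))
```

In this sample, EmpNo=123 has the smaller selectivity, so it is the more desirable condition to evaluate first.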
Join Operation
• Join is one of the most time-consuming operations in query processing.
• A two-way join is a join of two relations, and there are many algorithms to evaluate it.
• A multi-way join is a join of more than two relations; different orders of evaluating a multi-way join have different speeds.
• We shall study methods for implementing two-way joins of the form
R ►◄A=B S
Join Algorithm: Nested (inner-outer) Loop
Nested (inner-outer) loop: for each record r in R (outer loop), retrieve every record s from S (inner loop) and check whether r[A] = s[B].
R ►◄A=B S
for each tuple r in R do
    for each tuple s in S do
        if r[A] = s[B] then output result
    end
end
[Figure: each join-attribute value of R (0005, 0002, 0004, 0002, 0002, 0001, 0005) is compared with each value of S (0005, 0002, 0002, 0003, 0002, 0005).]
• m tuples in R
• n tuples in S
• m*n comparisons
• R and S can be reversed
When One Join Attribute is Indexed
If an index (or hash key) exists, say, on attribute B of S, should we put R in the outer loop or S? Why?
• Records in the outer relation are accessed sequentially, so an index on the outer relation doesn't help.
• Records in the inner relation are accessed randomly, so an index can retrieve all records in the inner relation that satisfy the join condition.
[Figure: R (0005, 0002, 0004, 0002, 0002, 0001, 0005) in the outer loop, probing an index on S (0005, 0002, 0002, 0003, 0002, 0005).]
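The plain nested loop and the indexed variant can be compared on the join-attribute values shown in the figure. This is a minimal in-memory sketch; the function names are hypothetical, and a dictionary stands in for the index on S.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Join-attribute values from the figure (leading zeros dropped).
R = [5, 2, 4, 2, 2, 1, 5]
S = [5, 2, 2, 3, 2, 5]


def nested_loop_join(outer: List[int], inner: List[int]) -> Tuple[List[Tuple[int, int]], int]:
    """Compare every outer value with every inner value; also count comparisons."""
    result, comparisons = [], 0
    for r in outer:
        for s in inner:
            comparisons += 1
            if r == s:
                result.append((r, s))
    return result, comparisons


def index_nested_loop_join(outer: List[int], inner: List[int]) -> List[Tuple[int, int]]:
    """Scan the outer relation sequentially and probe an index on the inner one."""
    index: Dict[int, List[int]] = defaultdict(list)
    for position, s in enumerate(inner):
        index[s].append(position)          # index on the inner join attribute
    result = []
    for r in outer:
        for position in index.get(r, []):  # only matching inner records are touched
            result.append((r, inner[position]))
    return result


pairs, comparisons = nested_loop_join(R, S)
print(len(pairs), comparisons)
print(sorted(index_nested_loop_join(R, S)) == sorted(pairs))
```

Both variants produce the same pairs; the indexed version avoids the m*n comparisons because each outer tuple goes straight to its matches.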
Sort-Merge Join
Sort-merge join: if the records of R and S are sorted on the join attributes A and B, respectively, then the relations are scanned in, say, ascending order, matching the records that have the same values for A and B.
R ►◄A=B S
[Figure: sorted R (0001, 0002, 0002, 0002, 0004, 0005, 0005) merged with sorted S (0002, 0002, 0002, 0003, 0005, 0005).]
• R and S are only scanned once.
• Even if the relations are not sorted, it is better to sort them first and do a sort-merge join than to do a double-loop join.
• Cost if R and S are sorted: n + m
• Cost if not sorted: n log(n) + m log(m) + m + n
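A sort-merge join over in-memory lists might look like the sketch below. It assumes the join-attribute values from the figure and a hypothetical function name; groups of equal values are paired with each other so that duplicates are handled.

```python
from typing import List, Tuple


def sort_merge_join(r_values: List[int], s_values: List[int]) -> List[Tuple[int, int]]:
    """Sort both inputs, then advance two cursors in step, pairing equal groups."""
    r, s = sorted(r_values), sorted(s_values)
    result = []
    i = j = 0
    while i < len(r) and j < len(s):
        if r[i] < s[j]:
            i += 1
        elif r[i] > s[j]:
            j += 1
        else:
            key = r[i]
            i_end = i
            while i_end < len(r) and r[i_end] == key:
                i_end += 1                 # end of the group with this key in R
            j_end = j
            while j_end < len(s) and s[j_end] == key:
                j_end += 1                 # end of the group with this key in S
            for a in r[i:i_end]:
                for b in s[j:j_end]:
                    result.append((a, b))
            i, j = i_end, j_end
    return result


R = [5, 2, 4, 2, 2, 1, 5]
S = [5, 2, 2, 3, 2, 5]
print(len(sort_merge_join(R, S)))
```

After sorting, each cursor only moves forward, which is why the merge phase costs on the order of n + m.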
Hash Join Method
Hash join: R and S are both hashed to the same hash file based on the join attributes. Tuples in the same bucket are then "joined".
[Figure: R (0001, 0002, 0002, 0002, 0004, 0005, 0005) and S (0002, 0002, 0002, 0003, 0005, 0005) hashed into shared buckets, so that equal join values land in the same bucket.]
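The bucket idea can be sketched in a few lines. This is an in-memory illustration, not a disk-based implementation: the bucket count of 4 and the function name are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def hash_join(r_values: List[int], s_values: List[int], n_buckets: int = 4) -> List[Tuple[int, int]]:
    """Partition both inputs with the same hash function, then join bucket by bucket."""
    r_buckets: Dict[int, List[int]] = defaultdict(list)
    s_buckets: Dict[int, List[int]] = defaultdict(list)
    for value in r_values:
        r_buckets[hash(value) % n_buckets].append(value)
    for value in s_values:
        s_buckets[hash(value) % n_buckets].append(value)

    result = []
    for bucket, r_part in r_buckets.items():
        for r in r_part:
            for s in s_buckets.get(bucket, []):
                if r == s:               # different values can share a bucket
                    result.append((r, s))
    return result


R = [1, 2, 2, 2, 4, 5, 5]
S = [2, 2, 2, 3, 5, 5]
print(len(hash_join(R, S)))
```

Only tuples that fall into the same bucket are ever compared, so the work per bucket is small as long as the buckets fit in memory.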
Hints on Evaluating Joins
• Disk accesses are based on blocks, not individual tuples
• Main memory buffers can significantly reduce the number of disk accesses
– Use the smaller relation in the outer loop in the nested loop method
– Consider if 1 buffer is available, 2 buffers, m buffers
• When an index is available, either the smaller relation or the one with a large number of matching tuples should be used in the outer loop.
• If the join attributes are not indexed, it may be faster to create the indexes on-the-fly (hash join is close to generating a hash index on-the-fly)
• Sort-merge is the most efficient when the relations are already sorted, which is often the case
• Hash join is efficient if the hash file can be kept in main memory
Selection Operation
• File scan
• Algorithm A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition.
– Cost estimate = br block transfers + 1 seek (br * tT + tS)
• br denotes the number of blocks containing records from relation r
– If the selection is on a key attribute, can stop on finding the record
• cost = (br /2) block transfers + 1 seek ((br /2) * tT + tS)
– Linear search can be applied regardless of
• selection condition or
• ordering of records in the file, or
• availability of indices
• Note: binary search generally does not make sense since data is not stored consecutively
– except when there is an index available,
– and binary search requires more seeks than index search
Selections Using Indices
• Index scan – search algorithms that use an index
– selection condition must be on the search-key of the index.
• A2 (primary index, equality on key). Retrieve a single record that satisfies the corresponding equality condition
– Cost = (hi + 1) * (tT + tS), where hi is the height of the index
• A3 (primary index, equality on nonkey). Retrieve multiple records.
– Records will be on consecutive blocks
• Let b = number of blocks containing matching records
– Cost = hi * (tT + tS) + tS + tT * b
Selections Using Indices
• A4 (secondary index, equality on nonkey).
– Retrieve a single record if the search-key is a candidate key
• Cost = (hi + 1) * (tT + tS)
– Retrieve multiple records if search-key is not a candidate key
• each of n matching records may be on a different block
• Cost = (hi + n) * (tT + tS)
– Can be very expensive!
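The cost formulas for A1 through A4 can be compared side by side. The sketch below simply encodes the formulas from the slides; the function names and the sample numbers (400 blocks, index height 3, tT = 1 and tS = 40 in arbitrary time units) are assumptions for illustration.

```python
def cost_linear_scan(b_r: int, t_T: float, t_S: float, key_equality: bool = False) -> float:
    """A1: br * tT + tS, or (br / 2) * tT + tS on average when searching for a key."""
    blocks = b_r / 2 if key_equality else b_r
    return blocks * t_T + t_S


def cost_primary_index_key(h_i: int, t_T: float, t_S: float) -> float:
    """A2: (hi + 1) * (tT + tS)."""
    return (h_i + 1) * (t_T + t_S)


def cost_primary_index_nonkey(h_i: int, b: int, t_T: float, t_S: float) -> float:
    """A3: hi * (tT + tS) + tS + tT * b, with b consecutive blocks of matches."""
    return h_i * (t_T + t_S) + t_S + t_T * b


def cost_secondary_index_nonkey(h_i: int, n: int, t_T: float, t_S: float) -> float:
    """A4: (hi + n) * (tT + tS), one seek and one transfer per matching record."""
    return (h_i + n) * (t_T + t_S)


# Hypothetical numbers: 400-block relation, index of height 3, tT = 1, tS = 40.
t_T, t_S = 1, 40
print("A1 full scan      :", cost_linear_scan(400, t_T, t_S))
print("A2 primary, key   :", cost_primary_index_key(3, t_T, t_S))
print("A3 primary, nonkey:", cost_primary_index_nonkey(3, 10, t_T, t_S))
print("A4 secondary, 100 :", cost_secondary_index_nonkey(3, 100, t_T, t_S))
```

With these assumed values, fetching 100 records through a secondary index costs far more than a full scan, which illustrates why A4 can be very expensive.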
Selections Involving Comparisons
• Can implement selections of the form σA≤V(r) or σA≥V(r) by using
– a linear file scan,
– or by using indices in the following ways:
• A5 (primary index, comparison). (Relation is sorted on A)
• For σA≥V(r), use the index to find the first tuple ≥ v and scan the relation sequentially from there
• For σA≤V(r), just scan the relation sequentially till the first tuple > v; do not use the index
• A6 (secondary index, comparison).
• For σA≥V(r), use the index to find the first index entry ≥ v and scan the index sequentially from there, to find pointers to records.
• For σA≤V(r), just scan the leaf pages of the index finding pointers to records, till the first entry > v
• In either case, retrieve records that are pointed to
– requires an I/O for each record
– Linear file scan may be cheaper
Implementation of Complex Selections
• Conjunction: σθ1∧θ2∧...∧θn(r)
• A7 (conjunctive selection using one index).
– Select a combination of θi and algorithms A1 through A7 that results in the least cost for σθi(r).
– Test other conditions on tuple after fetching it into memory buffer.
• A8 (conjunctive selection using composite index).
– Use appropriate composite (multiple-key) index if available.
• A9 (conjunctive selection by intersection of identifiers).
– Requires indices with record pointers.
– Use corresponding index for each condition, and take intersection
of all the obtained sets of record pointers.
– Then fetch records from file
– If some conditions do not have appropriate indices, apply test in
memory.
Algorithms for Complex Selections
• Disjunction: σθ1∨θ2∨...∨θn(r).
• A10 (disjunctive selection by union of identifiers).
– Applicable if all conditions have available indices.
• Otherwise use linear scan.
– Use corresponding index for each condition, and take union
of all the obtained sets of record pointers.
– Then fetch records from file
• Negation: σ¬θ(r)
– Use linear scan on file
– If very few records satisfy ¬θ, and an index is applicable to θ
• Find satisfying records using index and fetch from file
Sorting
• We may build an index on the relation, and then use the index to
read the relation in sorted order. May lead to one disk block access
for each tuple.
• For relations that fit in memory, techniques like quicksort can be
used. For relations that don’t fit in memory, external
sort-merge is a good choice.
External Sort-Merge
Let M denote memory size (in pages).
1. Create sorted runs. Let i be 0 initially.
Repeatedly do the following till the end of the relation:
(a) Read M blocks of the relation into memory
(b) Sort the in-memory blocks
(c) Write sorted data to run Ri; increment i.
Let the final value of i be N
2. Merge the runs (next slide)
External Sort-Merge (Cont.)
2. Merge the runs (N-way merge). We assume (for now) that N < M.
1. Use N blocks of memory to buffer input runs, and 1 block to buffer output. Read the first block of each run into its buffer page
2. repeat
(a) Select the first record (in sort order) among all buffer pages
(b) Write the record to the output buffer. If the output buffer is full, write it to disk.
(c) Delete the record from its input buffer page. If the buffer page becomes empty, then read the next block (if any) of the run into the buffer.
3. until all input buffer pages are empty
External Sort-Merge (Cont.)
• If N ≥ M, several merge passes are required.
– In each pass, contiguous groups of M - 1 runs are merged.
– A pass reduces the number of runs by a factor of M -1, and
creates runs longer by the same factor.
• E.g. If M=11, and there are 90 runs, one pass reduces
the number of runs to 9, each 10 times the size of the
initial runs
– Repeated passes are performed till all runs have been
merged into one.
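Run creation and the repeated (M - 1)-way merge passes can be simulated in memory. The sketch below is an illustration under assumptions: blocks are small Python lists, "disk" is just another list, heapq.merge from the standard library stands in for the buffered N-way merge, and all function names are hypothetical.

```python
import heapq
from typing import List, Tuple

Block = List[int]


def create_runs(blocks: List[Block], M: int) -> List[List[int]]:
    """Step 1: read M blocks at a time, sort them in memory, emit one run each."""
    runs = []
    for start in range(0, len(blocks), M):
        records = [rec for block in blocks[start:start + M] for rec in block]
        runs.append(sorted(records))
    return runs


def merge_pass(runs: List[List[int]], M: int) -> List[List[int]]:
    """One pass: merge contiguous groups of M - 1 runs (1 block is kept for output)."""
    fan_in = M - 1
    return [list(heapq.merge(*runs[i:i + fan_in])) for i in range(0, len(runs), fan_in)]


def external_sort(blocks: List[Block], M: int) -> Tuple[List[int], int]:
    """Return the sorted records and the number of merge passes that were needed."""
    if M < 3:
        raise ValueError("need at least 3 buffer pages to merge two runs")
    runs = create_runs(blocks, M)
    passes = 0
    while len(runs) > 1:
        runs = merge_pass(runs, M)
        passes += 1
    return (runs[0] if runs else []), passes


blocks = [[9, 3], [7, 1], [8, 2], [6, 4], [5, 0], [11, 10]]
print(external_sort(blocks, M=3))

# The example from the slide: with M = 11, one pass turns 90 runs into 9.
print(len(merge_pass([[i] for i in range(90)], M=11)))
```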
Example: External Sorting Using Sort-Merge
Query Optimization
• The query optimizer would now choose an execution plan for each block
• Note that the inner block needs to be evaluated only once to produce the maximum salary
• This is an uncorrelated nested query
• It is much harder to optimize a correlated nested query, where a tuple variable from the outer block appears in the WHERE clause of the inner block
Select S.sname
From Sailors S
Where exists (select *
from reserves R
where R.bid=103
and R.sid=S.sid)
A Word about *
• All we want to do is check that a qualifying row exists; we do not really want to retrieve any columns from the row
Select S.sname
From Sailors S
Where exists (select *
from reserves R
where R.bid=103
and R.sid=S.sid)
Select count (*)
From Sailors S
Select count (distinct S.sname)
From Sailors S
If COUNT does not include DISTINCT, the above two queries give the same result
COUNT (*) is a better querying style since it is immediately clear that all records contribute to the total count
• Given a relational algebra expression,
how do we transform it into a more efficient
one?
Query Optimization
• Use the query tree as a tool to rearrange
the operations of the relational algebra
expression
Query Optimization
• RDBMS query optimizers are very complex pieces of software
• Typically represent 40-50 man years of development effort!!
Query Optimization
• SQL queries are translated into relational algebra & then optimized
• Two main techniques for optimization
• Heuristic based
» Ordering the operations in a query execution strategy
» Works for most cases but not guaranteed for all possible cases
• Cost based
» Systematically estimating the cost of different execution strategies and choosing the execution plan with the lowest cost estimate
• Both are combined in a typical query optimizer
Query Optimization
• Query is essentially treated as a σ-∏-►◄ algebra expression
• Remaining operations are carried out on the result of the σ-∏-►◄ expression
• Optimizing an RA expression involves:
• Enumerating alternative plans for evaluating the expression (NOT ALL possible plans)
• Estimating the cost of each enumerated plan and choosing the plan with the lowest estimated cost
Query Evaluation Plans
• A QEP consists of an extended RA tree
• Additional annotations at each node indicating the access method to use for each table and the implementation method to use for each relational operator
Structure and Execution of a Query Tree
• A query tree is a tree structure that
corresponds to a relational algebra expression
by representing the input relations as leaf
nodes and the relational algebra operations as
internal nodes of the tree
• An execution of the query tree consists of
executing an internal node operation whenever
its operands are available and then replacing
that internal node by the relation that results
from executing the operation
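The bottom-up execution of a query tree can be sketched with a small recursive evaluator. Everything here is illustrative: the node classes, the execute function, and the sample Reserves and Sailors rows are assumptions, and each internal node materializes its full result before its parent runs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Relation:          # leaf node: an input relation
    rows: List[Row]


@dataclass
class Select:            # internal node: σ
    predicate: Callable[[Row], bool]
    child: object


@dataclass
class Project:           # internal node: π
    attrs: List[str]
    child: object


@dataclass
class Join:              # internal node: equi-join on a shared attribute name
    attr: str
    left: object
    right: object


def execute(node) -> List[Row]:
    """Evaluate operands first, then replace the node by the resulting relation."""
    if isinstance(node, Relation):
        return node.rows
    if isinstance(node, Select):
        return [row for row in execute(node.child) if node.predicate(row)]
    if isinstance(node, Project):
        return [{a: row[a] for a in node.attrs} for row in execute(node.child)]
    if isinstance(node, Join):
        left, right = execute(node.left), execute(node.right)
        return [{**l, **r} for l in left for r in right if l[node.attr] == r[node.attr]]
    raise TypeError(f"unknown node type: {type(node).__name__}")


# Hypothetical sample data.
reserves = [{"sid": 1, "bid": 100}, {"sid": 2, "bid": 101}, {"sid": 3, "bid": 100}]
sailors = [
    {"sid": 1, "sname": "Ann", "rating": 7},
    {"sid": 2, "sname": "Bob", "rating": 9},
    {"sid": 3, "sname": "Cy", "rating": 4},
]

tree = Project(
    ["sname"],
    Select(
        lambda row: row["bid"] == 100 and row["rating"] > 5,
        Join("sid", Relation(reserves), Relation(sailors)),
    ),
)
print(execute(tree))
```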
Query Optimization: Example
SELECT S.sname
FROM Reserves R, Sailors S
WHERE R.sid=S.sid AND
R.bid=100 AND S.rating>5
The Schema:
Sailors (sid, sname, rating, age) 50 Bytes
Reserves (sid, bid, day, rname) 40 Bytes
RA Expression: ∏sname (σbid=100∧rating>5 (R ►◄sid=sid S))
[Figure: RA tree with ∏sname at the root, σbid=100∧rating>5 below it, and ►◄sid=sid joining Reserves and Sailors at the bottom.]
[Figure: plan with ∏sname (on-the-fly) at the root, σbid=100∧rating>5 (on-the-fly) below it, and ►◄sid=sid (simple nested loops) joining Reserves and Sailors.]
Interpreting the TREE
Tree partially specifies how to evaluate the query
• First compute join between Reserves & Sailors
• Then the selections
• Finally the projection
[Figure: RA tree with ∏sname at the root, σbid=100∧rating>5 below it, and ►◄sid=sid joining Reserves and Sailors at the bottom.]
Interpreting the TREE
Decide on the implementation of each operation involved
• Page-oriented simple nested loops join between Reserves & Sailors with Reserves as the outer table
• Apply selections & projections to each tuple in the result of the join as it is produced
• Result of the join before the selections and projections is never stored in its entirety
• Convention: the outer table is the left child of the join operator
[Figure: plan with ∏sname (on-the-fly) at the root, σbid=100∧rating>5 (on-the-fly) below it, and ►◄sid=sid (simple nested loops) joining Reserves (file scan) and Sailors (file scan).]
Heuristics for Optimizing a Query
• A query may have several equivalent
query trees
• A query parser generates a standard canonical query tree from a SQL query
– Cartesian products are first applied (FROM)
– then the conditions (WHERE)
– and finally projection (SELECT)
Heuristics for Optimizing a Query
select ProjNo, DeptNo, EmpName, Address,
Birthdate
from Project, Department, Employee
where ProjLocation='Stafford' and
MgrNo=EmpNo and
Department.DeptNo=Employee.DeptNo
[Figure: canonical query tree with πProjNo,DeptNo,EmpName,Address,Birthdate at the root, σProjLocation='Stafford' AND MgrNo=EmpNo AND DeptNo=DeptNo below it, and the Cartesian product of Project, Department, and Employee at the bottom.]
The query optimizer transforms this canonical query into an efficient final query
Example
Find the names of employees born after 1957 who work on a project named 'Aquarius'
select EmpName
from Employee, WorksOn, Project
where ProjName='Aquarius' AND
Project.ProjNo=WorksOn.ProjNo AND
Employee.EmpNo = WorksOn.EmpNo AND
Birthdate > 'DEC-31-1957'
WorksOn (EmpNo, ProjNo, Hours)
[Figure: canonical query tree with πEmpName at the root, σProjName='Aquarius' AND Project.ProjNo=WorksOn.ProjNo AND Employee.EmpNo=WorksOn.EmpNo AND Birthdate > 'DEC-31-1957' below it, and the Cartesian product of Employee, WorksOn, and Project at the bottom.]
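Example
Push all the conditions as far down the tree as possible
[Figure: query tree with πEmpName at the root. The selections σProjName='Aquarius' on Project and σBirthdate>'dec-31-1957' on Employee are pushed down to the leaves; σEmpNo=EmpNo and σProjNo=ProjNo are applied above the Cartesian products with WorksOn.]
Note: expensive due to the large size of Employee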
Example
Rearrange join sequence according to estimates of relation sizes
[Figure: query tree with πEmpName at the root. σProjName='Aquarius'(Project) is combined with WorksOn first (σProjNo=ProjNo); the result is then combined with σBirthdate>'dec-31-1957'(Employee) (σEmpNo=EmpNo).]
• Only need the ProjNo attribute from Project and WorksOn
• Only need the EmpNo attribute from Employee and WorksOn, and EmpName from Employee
Example
Replace cross products and selection sequence with a join operation
[Figure: query tree with πEmpName at the root over ►◄EmpNo=EmpNo, which joins (σProjName='Aquarius'(Project) ►◄ProjNo=ProjNo WorksOn) with σBirthdate>'dec-31-1957'(Employee).]
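Example
Push projection as far down the query tree as possible
[Figure: query tree with πEmpName at the root over ►◄EmpNo=EmpNo. One input is πEmpNo,EmpName(σBirthdate>'dec-31-1957'(Employee)); the other is πEmpNo over ►◄ProjNo=ProjNo, which joins πProjNo(σProjName='Aquarius'(Project)) with πEmpNo,ProjNo(WorksOn).]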
Transformation Rules
1. Cascade of σ: A conjunctive selection condition can be broken up into a cascade (sequence) of individual σ operations:
• σc1 AND c2 AND ... AND cn(R) ≡ σc1(σc2(...(σcn(R))...))
2. Commutativity of σ:
• σc1(σc2(R)) ≡ σc2(σc1(R))
3. Cascade of π:
• πList1(πList2(...(πListn(R))...)) ≡ πList1(R)
if List1 is included in List2…Listn; result is null if List1 is not in any of List2…Listn
Transformation Rules
4. Commuting σ with π: if the selection condition c involves only attributes in the projection list List1
• πList1(σc(R)) ≡ σc(πList1(R))
5. Commutativity of JOIN or ×: R ►◄ S ≡ S ►◄ R
6. Commuting σ with JOIN: if all the attributes in the selection condition c involve only the attributes of one of the relations being joined, say, R
• σc(R ►◄ S) ≡ (σc(R)) ►◄ S
Transformation Rules
7. Commuting π with JOIN: if List can be separated into List1 and List2 involving only attributes from R and S, respectively, and the join condition c involves only attributes in List:
• πList(R ►◄c S) ≡ (πList1(R) ►◄c πList2(S))
8. Commuting set operations: ∪ and ∩ are commutative
9. JOIN, ×, ∪, and ∩ are associative
10. σ distributes over ∪, ∩, and −
• e.g., σc(R ∪ S) ≡ σc(R) ∪ σc(S)
11. π distributes over ∪
• πList(R ∪ S) ≡ (πList(R)) ∪ (πList(S))
Heuristic Algebraic Optimization
• Use rule 1 to break up any σ operation with conjunctive conditions into a sequence of σ operations
• Use rules 2, 4, 6, and 10 concerning the commutativity of σ with other operations to move each σ operation as far down the query tree as possible, based on the attributes in the σ condition
• Use rule 9 concerning associativity of binary operations to rearrange the leaf nodes of the tree so that the leaf node relations with the most restrictive σ operations are executed first
Heuristic Algebraic Optimization
• Combine sequences of Cartesian product and σ operations representing a join condition into single JOIN operations
• Use rules 3, 4, 7, and 11 concerning the cascading of π and commuting π with other operations to break down a π and move the projection attributes down the tree as far as possible
• Identify subtrees that represent groups of operations that can be executed by a single algorithm (select/join followed by project)
Pipelined Evaluation
• Motivation
– A query is mapped into a sequence of operations.
– Each execution of an operation produces a temporary result.
– Generating and saving temporary files on disk is time consuming and expensive.
• Alternative:
– Avoid constructing temporary results as much as possible.
– Pipeline the data through multiple operations: pass the result of a previous operator to the next without waiting for the previous operation to complete.
Pipelined Evaluation
• The result of one operator is sometimes pipelined to another operator without creating a temporary table to hold the intermediate result
• The output of R ►◄S is pipelined into the selections & projections that follow
• Cost of writing out the intermediate result & reading it back in can be significant
• Temporary table: Materialized Tuples
Pipelined Evaluation
• Consider a selection query in which only a part of the selection condition matches an index
• 2 instances of the selection operator
– Matching (primary) part of the selection condition
– Rest
• Pipelining: apply the second selection to each tuple in the result of the primary selection as it is produced & adding tuples that qualify to the final result
• When the input to a unary operator is pipelined into it, we say that the operator is applied on-the-fly
Pipelined Evaluation
• Result tuples of the first join are pipelined into the join with C
• Conceptually, the evaluation is initiated from the root, & the node joining A & B produces tuples as and when they are requested by its parent node
[Figure: join tree for (A ►◄ B) ►◄ C, with A ►◄ B as the left input of the join with C.]
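Demand-driven pipelining maps naturally onto Python generators: each operator pulls one tuple at a time from its input, so no intermediate table is materialized. The sketch below is an illustration only; the operator names and the sample Reserves and Sailors rows are assumptions.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def scan(table: List[Row]) -> Iterator[Row]:
    """File scan: produce tuples one at a time."""
    for row in table:
        yield row


def join(outer: Iterable[Row], inner_table: List[Row], attr: str) -> Iterator[Row]:
    """Simple nested loops join; each result tuple is passed on as soon as it is found."""
    for o in outer:
        for i in inner_table:
            if o[attr] == i[attr]:
                yield {**o, **i}


def select(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Selection applied on-the-fly to each incoming tuple."""
    for row in rows:
        if predicate(row):
            yield row


def project(rows: Iterable[Row], attrs: List[str]) -> Iterator[Row]:
    """Projection applied on-the-fly to each incoming tuple."""
    for row in rows:
        yield {a: row[a] for a in attrs}


# Hypothetical sample data.
reserves = [{"sid": 1, "bid": 100}, {"sid": 2, "bid": 101}, {"sid": 3, "bid": 100}]
sailors = [
    {"sid": 1, "sname": "Ann", "rating": 7},
    {"sid": 2, "sname": "Bob", "rating": 9},
    {"sid": 3, "sname": "Cy", "rating": 4},
]

# The root (project) requests tuples; the request propagates down to the scan.
plan = project(
    select(join(scan(reserves), sailors, "sid"),
           lambda row: row["bid"] == 100 and row["rating"] > 5),
    ["sname"],
)
print(list(plan))
```

Nothing runs until the root is asked for a tuple, which mirrors the idea that evaluation is initiated from the root.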
Estimation of the Size of Joins
• The Cartesian product r × s contains nr * ns tuples; each tuple occupies sr + ss bytes.
• If R ∩ S = ∅, then r ►◄ s is the same as r × s.
• If R ∩ S is a key for R, then a tuple of s will join with at most one tuple from r; therefore, the number of tuples in r ►◄ s is no greater than the number of tuples in s.
• If R ∩ S in S is a foreign key in S referencing R, then the number of tuples in r ►◄ s is exactly the same as the number of tuples in s.
• The case for R ∩ S being a foreign key referencing S is symmetric.
[Figure: matching tuples between R and S.]
Example of Size Estimation
• In the example query depositor ►◄ customer, customer-name in depositor is a foreign key of customer; hence, the result has exactly ndepositor tuples, which is 5000.
• Data: R = Customer, S = Depositor (n = number of tuples, f = number of tuples per block, b = number of blocks)
ncustomer = 10,000
fcustomer = 25
bcustomer = 10000/25 = 400
ndepositor = 5,000
fdepositor = 50
bdepositor = 5000/50 = 100
Estimation of the size of Joins
• If R ∩ S = {A} is not a key for R or S:
If we assume that every tuple t in R produces tuples in R ►◄ S, the number of tuples in R ►◄ S is estimated to be:
nr * ns / V(A, s)
where V(A, s) is the number of distinct values of A in s, so each tuple of R is assumed to match ns / V(A, s) tuples of S.
• If the reverse is true, the estimate obtained will be:
nr * ns / V(A, r)
• The lower of these two estimates is probably the more accurate one.
Estimation of the size of Joins
• Compute the size estimates for depositor ►◄ customer without using information about foreign keys:
– ncustomer = 10,000
– ndepositor = 5,000
– V(customer-name, depositor) = 2500
– V(customer-name, customer) = 10000
– The two estimates are 5000 * 10000/2500 = 20,000 and 5000 * 10000/10000 = 5000
– We choose the lower estimate, which, in this case, is the same as our earlier computation using foreign keys.
Notes:
• There are 5,000 tuples in the depositor relation but only 2,500 distinct depositors, so on average every depositor has two accounts
• Customer-name is unique in customer
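The two estimates can be reproduced with a few lines of arithmetic. The function name below is hypothetical; the numbers are the ones given for depositor and customer.

```python
from typing import Tuple


def estimate_join_size(n_r: int, n_s: int, v_a_r: int, v_a_s: int) -> Tuple[float, float, float]:
    """Return (estimate using V(A, s), estimate using V(A, r), the lower of the two)."""
    by_s = n_r * n_s / v_a_s
    by_r = n_r * n_s / v_a_r
    return by_s, by_r, min(by_s, by_r)


# r = depositor, s = customer, A = customer-name
n_depositor, n_customer = 5_000, 10_000
v_depositor, v_customer = 2_500, 10_000

by_customer, by_depositor, chosen = estimate_join_size(
    n_depositor, n_customer, v_depositor, v_customer
)
print(by_depositor, by_customer, chosen)
```

The lower estimate agrees with the foreign-key argument: every depositor tuple joins with exactly one customer tuple.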
Nested-Loop Join
• Compute the theta join, r ►◄θ s
for each tuple tr in r do begin
    for each tuple ts in s do begin
        test pair (tr, ts) to see if they satisfy the join condition θ
        if they do, add tr · ts to the result.
    end
end
• r is called the outer relation and s the inner relation of the join.
• Requires no indices and can be used with any kind of join condition.
• Expensive since it examines every pair of tuples in the two relations.
Cost of Nested-Loop Join
• If there is enough memory to hold only one block of each relation, the estimated cost is nr * bs + br disk accesses
– br disk accesses to load r into the buffer (br = no. of blocks in r)
– For each tuple in r, s has to be read into the buffer: bs disk accesses (bs = no. of blocks in s)
• If the smaller relation fits entirely in memory, use it as the inner relation. This reduces the cost estimate to br + bs disk accesses.
– br + bs is the minimum possible cost to read R and S once
– Putting both relations in memory won’t reduce the cost further
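As a rough check of these formulas, the sketch below plugs in the customer and depositor statistics given earlier (10,000 tuples in 400 blocks and 5,000 tuples in 100 blocks). The function name and the flag inner_fits_in_memory are assumptions made for this example.

```python
def nested_loop_cost(n_r, b_r, b_s, inner_fits_in_memory=False):
    """Estimated disk accesses with r as the outer and s as the inner relation."""
    if inner_fits_in_memory:
        return b_r + b_s          # each relation is read exactly once
    return n_r * b_s + b_r        # s is re-read for every tuple of r

# depositor as outer, customer as inner
depositor_outer = nested_loop_cost(n_r=5000, b_r=100, b_s=400)
# customer as outer, depositor as inner
customer_outer = nested_loop_cost(n_r=10000, b_r=400, b_s=100)
# smaller relation (depositor) held in memory as the inner relation
in_memory = nested_loop_cost(n_r=10000, b_r=400, b_s=100, inner_fits_in_memory=True)
print(depositor_outer, customer_outer, in_memory)
```

Under these assumptions the choice of outer relation changes the estimate by roughly a factor of two, and holding the inner relation in memory reduces it to 500 block reads.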
Query Processing and
Optimization
Structure and Execution of a Query Tree
• A query tree is a tree structure that
corresponds to a relational algebra expression
by representing the input relations as leaf
nodes and the relational algebra operations as
internal nodes of the tree
• An execution of the query tree consists of
executing an internal node operation whenever
its operands are available and then replacing
that internal node by the relation that results
from executing the operation
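One way to picture this is a small Python sketch in which leaf nodes hold relations and internal nodes hold operations; executing a node first executes its operands and then replaces the node by the resulting relation. The class names and the sample rows are illustrative assumptions, loosely following the Sailors/Reserves example used in the next slides.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Relation:            # leaf node: an input relation
    rows: List[dict]

@dataclass
class Select:              # internal node: selection
    pred: Callable[[dict], bool]
    child: object

@dataclass
class Project:             # internal node: projection
    attrs: List[str]
    child: object

@dataclass
class Join:                # internal node: join
    pred: Callable[[dict, dict], bool]
    left: object
    right: object

def execute(node):
    """Evaluate a query tree bottom-up and return the resulting relation."""
    if isinstance(node, Relation):
        return node.rows
    if isinstance(node, Select):
        return [t for t in execute(node.child) if node.pred(t)]
    if isinstance(node, Project):
        return [{a: t[a] for a in node.attrs} for t in execute(node.child)]
    if isinstance(node, Join):
        left, right = execute(node.left), execute(node.right)
        return [{**l, **r} for l in left for r in right if node.pred(l, r)]
    raise TypeError(f"unknown node: {node!r}")

# Hypothetical sample data
sailors = [{"sid": 1, "sname": "Dustin", "rating": 7},
           {"sid": 2, "sname": "Lubber", "rating": 3}]
reserves = [{"sid": 1, "bid": 100}, {"sid": 2, "bid": 100}, {"sid": 1, "bid": 101}]

tree = Project(["sname"],
               Select(lambda t: t["bid"] == 100 and t["rating"] > 5,
                      Join(lambda r, s: r["sid"] == s["sid"],
                           Relation(reserves), Relation(sailors))))
result = execute(tree)
```

This sketch materializes every intermediate result in memory; it illustrates the tree structure, not an efficient evaluation strategy.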
Query Optimization: Example
SELECT S.sname
FROM Reserves R, Sailors S
WHERE R.sid=S.sid AND
R.bid=100 AND S.rating>5
The Schema:
Sailors (sid, sname, rating, age) 50 Bytes
Reserves (sid, bid, day, rname) 40 Bytes
RA Expression: ∏sname (σ bid=100 ∧ rating>5 (R ►◄sid=sid S))
RA Tree (root to leaves): ∏sname; σ bid=100 ∧ rating>5; ►◄ sid=sid over Reserves and Sailors
Plan (root to leaves): ∏sname (On-the-fly); σ bid=100 ∧ rating>5 (On-the-fly); ►◄ sid=sid (Simple Nested Loops) over Reserves and Sailors
Interpreting the TREE
Tree partially specifies how to evaluate the query
• First compute join between Reserves & Sailors
• Then the selections
• Finally the projection
RA Tree (root to leaves): ∏sname; σ bid=100 ∧ rating>5; ►◄ sid=sid over Reserves and Sailors
Interpreting the TREE
Decide on the implementation of each operation involved
• Page oriented simple nested loops join between Reserves & Sailors with Reserves as the outer table
• Apply selections & projections to each tuple in the result of the join as it is produced
• Result of the join before the selections and projections is never stored in its entirety
• Convention: Outer table is the left child of the operator
Plan (root to leaves): ∏sname (On-the-fly); σ bid=100 ∧ rating>5 (On-the-fly); ►◄ sid=sid (Simple Nested Loops); Reserves (File Scan) and Sailors (File Scan)
Heuristics for Optimizing a Query
• A query may have several equivalent
query trees
• A query parser generates a standard canonical query tree from a SQL query
– Cartesian products are first applied (FROM)
– then the conditions (WHERE)
– and finally projection (SELECT)
• The query optimizer transforms this canonical query tree into an efficient final query tree
Canonical query tree, as an expression:
∏ ProjNo, DeptNo, EmpName, Address, Birthdate (σ ProjLocation=‘Stafford’ AND MgrNo=EmpNo AND DeptNo=DeptNo (Project × Department × Employee))
Heuristics for Optimizing a Query
select ProjNo, DeptNo, EmpName, Address, Birthdate
from Project, Department, Employee
where ProjLocation=‘Stafford’ and
MgrNo=EmpNo and
Department.DeptNo=Employee.DeptNo
Find the names of employees born after 1957 who work on a project named ‘Aquarius’
select EmpName
from Employee, WorksOn, Project
where ProjName=‘Aquarius’ AND
Project.ProjNo=WorksOn.ProjNo AND
Employee.EmpNo=WorksOn.EmpNo AND
Birthdate > ‘DEC-31-1957’
WorksOn (EmpNo, ProjNo, Hours)
Canonical query tree, as an expression:
∏ EmpName (σ ProjName=‘Aquarius’ AND Project.ProjNo=WorksOn.ProjNo AND Employee.EmpNo=WorksOn.EmpNo AND Birthdate > ‘DEC-31-1957’ (Project × WorksOn × Employee))
Example
Push all the conditions as far down the tree as possible
Query tree, as an expression:
∏ EmpName (σ ProjNo=ProjNo ((σ EmpNo=EmpNo ((σ Birthdate > ‘dec-31-1957’ (Employee)) × WorksOn)) × (σ ProjName=‘Aquarius’ (Project))))
Note: expensive due to large size of Employee
Example
Rearrange join sequence according to estimates of relation sizes
Query tree, as an expression:
∏ EmpName (σ EmpNo=EmpNo ((σ ProjNo=ProjNo ((σ ProjName=‘Aquarius’ (Project)) × WorksOn)) × (σ Birthdate > ‘dec-31-1957’ (Employee))))
Notes:
– Only need ProjNo attribute from Project and WorksOn
– Only need EmpNo attribute from Employee and WorksOn and EmpName from Employee
Example
Replace cross products and selection sequence with a join operation
Query tree, as an expression:
∏ EmpName (((σ ProjName=‘Aquarius’ (Project)) ►◄ ProjNo=ProjNo WorksOn) ►◄ EmpNo=EmpNo (σ Birthdate > ‘dec-31-1957’ (Employee)))
Example
Push projection as far down the query tree as possible
Query tree, as an expression:
∏ EmpName ((∏ EmpNo ((∏ ProjNo (σ ProjName=‘Aquarius’ (Project))) ►◄ ProjNo=ProjNo (∏ EmpNo, ProjNo (WorksOn)))) ►◄ EmpNo=EmpNo (∏ EmpNo, EmpName (σ Birthdate > ‘dec-31-1957’ (Employee))))
Transformation Rules
1. Cascade of σ: A conjunctive selection condition can be broken up into a cascade (sequence) of individual σ operations:
• σ c1 AND c2 AND ... AND cn (R) ≡ σ c1 (σ c2 (...(σ cn (R))...))
2. Commutativity of σ:
• σ c1 (σ c2 (R)) ≡ σ c2 (σ c1 (R))
3. Cascade of ∏:
• ∏ List1 (∏ List2 (...(∏ Listn (R))...)) ≡ ∏ List1 (R)
if List1 is included in List2…Listn; result is null if List1 is not in any of List2…Listn
4. Commuting σ with ∏: if the selection condition c involves only attributes that are in the projection list List1
• ∏ List1 (σ c (R)) ≡ σ c (∏ List1 (R))
5. Commutativity of JOIN or ×: R ►◄ S ≡ S ►◄ R
6. Commuting σ with JOIN: if all the attributes in the selection condition c involve only the attributes of one of the relations being joined, say, R
• σ c (R ►◄ S) ≡ (σ c (R)) ►◄ S
7. Commuting ∏ with JOIN: if List can be separated into List1 and List2 involving only attributes from R and S, respectively, and the join condition c involves only attributes in List:
• ∏ List (R ►◄c S) ≡ (∏ List1 (R)) ►◄c (∏ List2 (S))
8. Commuting set operations: ∪ and ∩ are commutative
9. JOIN, ×, ∪, ∩ are associative
10. σ distributes over ∪, ∩, −
• σ c (R ∪ S) ≡ σ c (R) ∪ σ c (S)
11. ∏ distributes over ∪
• ∏ List (R ∪ S) ≡ (∏ List (R)) ∪ (∏ List (S))
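The rule that lets a selection move below a join (rule 6) can be checked on toy data. The Python sketch below is only an illustration: the helper names select and join, the attribute Birthyear (a simplification of Birthdate), and the sample rows are all assumptions.

```python
def select(pred, rel):
    """Selection: keep the tuples of rel that satisfy pred."""
    return [t for t in rel if pred(t)]

def join(pred, r, s):
    """Join: combine pairs of tuples from r and s that satisfy pred."""
    return [{**a, **b} for a in r for b in s if pred(a, b)]

# Hypothetical sample rows
employee = [{"EmpNo": 1, "Birthyear": 1960}, {"EmpNo": 2, "Birthyear": 1950}]
works_on = [{"EmpNo": 1, "ProjNo": 10}, {"EmpNo": 2, "ProjNo": 10},
            {"EmpNo": 1, "ProjNo": 20}]

c = lambda t: t["Birthyear"] > 1957          # condition on Employee attributes only
on = lambda a, b: a["EmpNo"] == b["EmpNo"]   # join condition

lhs = select(c, join(on, employee, works_on))   # selection applied after the join
rhs = join(on, select(c, employee), works_on)   # selection pushed below the join
print(lhs == rhs, len(join(on, employee, works_on)), len(select(c, employee)))
```

Both sides produce the same tuples, but the right-hand side joins one Employee tuple instead of two, which is the point of pushing selections down.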
Pictorial Depiction of Equivalence Rules
Heuristic Algebraic Optimization
• Use rule 1 to break up any σ operation with conjunctive conditions into a sequence of σ operations
• Use rules 2, 4, 6, and 10 concerning commutativity of σ with other operations to move each σ operation as far down the query tree as possible based on the attributes in the σ operations
• Use rule 9 concerning associativity of binary operations to rearrange the leaf nodes of the tree so that the leaf node relations with the most restrictive σ operations are executed first
• Combine sequences of Cartesian product and σ operation representing a join condition into single JOIN operations
• Use rules 3, 4, 7, and 11 concerning the cascading of ∏ and commuting ∏ with other operations to break down a ∏ and move the projection attributes down the tree as far as possible
• Identify subtrees that represent groups of operations that can be executed by a single algorithm (select/join followed by project)
Evaluation of Expressions
• Alternatives for evaluating an entire expression tree
– Materialization: generate results of an expression whose inputs
are relations or are already computed, materialize (store) it on
disk.
– Pipelining: pass on tuples to parent operations even as an
operation is being executed
Materialization
• Materialized evaluation: evaluate one operation at a time, starting at the lowest level. Use intermediate results materialized into temporary relations to evaluate next-level operations.
• E.g., for the expression ∏ name (σ building="Watson" (department) ►◄ instructor), compute and store σ building="Watson" (department), then compute and store its join with instructor, and finally compute the projection on name.
Materialization (Cont.)
• Materialized evaluation is always applicable
• Cost of writing results to disk and reading them back can be quite
high
– Our cost formulas for operations ignore cost of writing results to
disk, so
• Overall cost = Sum of costs of individual operations +
cost of writing intermediate results to disk
• Double buffering: use two output buffers for each operation, when
one is full write it to disk while the other is getting filled
– Allows overlap of disk writes with computation and reduces
execution time
Pipelining
• Pipelined evaluation: evaluate several operations simultaneously, passing the results of one operation on to the next.
• E.g., in the previous expression tree, don’t store the result of σ building="Watson" (department); instead, pass tuples directly to the join. Similarly, don’t store the result of the join; pass tuples directly to the projection.
• Much cheaper than materialization: no need to store a temporary relation to disk.
• Pipelining may not always be possible, e.g., sort, hash-join.
• For pipelining to be effective, use evaluation algorithms that generate output tuples even as tuples are received for inputs to the operation.
• Pipelines can be executed in two ways: demand driven and producer driven
Pipelining
• In demand driven or lazy evaluation
– system repeatedly requests next tuple from top level operation
– Each operation requests next tuple from children operations as required, in order to output its next tuple
– In between calls, operation has to maintain "state" so it knows what to return next
• In producer-driven or eager pipelining
– Operators produce tuples eagerly and pass them up to their
parents
• Buffer maintained between operators, child puts tuples in
buffer, parent removes tuples from buffer
• if buffer is full, child waits till there is space in the buffer, and
then generates more tuples
– System schedules operations that have space in output buffer
and can process more input tuples
• Alternative names: pull and push models of pipelining
Pipelining (Cont.)
• Implementation of demand-driven pipelining
– Each operation is implemented as an iterator implementing the following operations
• open()
– E.g. file scan: initialize file scan
» state: pointer to beginning of file
– E.g. merge join: sort relations;
» state: pointers to beginning of sorted relations
• next()
– E.g. for file scan: Output next tuple, and advance and store file pointer
– E.g. for merge join: continue with merge from earlier state till next output tuple is found. Save pointers as iterator state.
• close()
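The iterator interface above can be sketched in Python as follows. This is a simplified illustration, not any particular system's implementation: an in-memory list stands in for the file, next() returns None when the input is exhausted, and the class names and sample rows are assumptions.

```python
class FileScan:
    """Leaf iterator over a list that stands in for a file."""
    def __init__(self, rows):
        self.rows = rows
        self.pos = None
    def open(self):
        self.pos = 0                     # state: pointer to beginning of file
    def next(self):
        if self.pos >= len(self.rows):
            return None                  # no more tuples
        row = self.rows[self.pos]
        self.pos += 1                    # advance and store the file pointer
        return row
    def close(self):
        self.pos = None

class Select:
    """Applies a predicate on the fly to tuples pulled from its child."""
    def __init__(self, pred, child):
        self.pred, self.child = pred, child
    def open(self):
        self.child.open()
    def next(self):
        while (row := self.child.next()) is not None:
            if self.pred(row):
                return row
        return None
    def close(self):
        self.child.close()

class Project:
    """Keeps only the listed attributes of each tuple pulled from its child."""
    def __init__(self, attrs, child):
        self.attrs, self.child = attrs, child
    def open(self):
        self.child.open()
    def next(self):
        row = self.child.next()
        return None if row is None else {a: row[a] for a in self.attrs}
    def close(self):
        self.child.close()

def run(root):
    """Demand-driven execution: repeatedly request the next tuple from the top."""
    root.open()
    out = []
    while (row := root.next()) is not None:
        out.append(row)
    root.close()
    return out

# Hypothetical sample rows
sailors = [{"sid": 1, "sname": "Dustin", "rating": 7},
           {"sid": 2, "sname": "Lubber", "rating": 3}]
plan = Project(["sname"], Select(lambda t: t["rating"] > 5, FileScan(sailors)))
result = run(plan)
```

No intermediate relation is ever stored: each call to next() at the root pulls one tuple at a time through the selection and the scan.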
Evaluation Algorithms for Pipelining
• Some algorithms are not able to output results even as they get input tuples
– E.g. merge join, or hash join
– intermediate results written to disk and then read back
• Algorithm variants to generate (at least some) results on the fly, as
input tuples are read in
– E.g. hybrid hash join generates output tuples even as probe relation
tuples in the in-memory partition (partition 0) are read in
– Double-pipelined join technique: Hybrid hash join, modified to
buffer partition 0 tuples of both relations in-memory, reading them
as they become available, and output results of any matches
between partition 0 tuples
• When a new r0 tuple is found, match it with existing s0 tuples,
output matches, and save it in r0
• Symmetrically for s0 tuples
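The matching logic for partition 0 can be sketched in Python as below. This covers only the in-memory part described above: tuples of both inputs arrive in an interleaved stream, each new tuple probes the tuples already buffered for the other relation, and is then saved. Partitioning and spilling to disk are omitted, and the function name and input encoding are assumptions.

```python
from collections import defaultdict

def double_pipelined_join(arrivals, key_r, key_s):
    """arrivals: iterable of ("r", tuple) or ("s", tuple) in arrival order.

    Yields joined tuples as soon as a match becomes possible.
    """
    r0 = defaultdict(list)   # buffered r tuples, hashed on the join key
    s0 = defaultdict(list)   # buffered s tuples, hashed on the join key
    for side, t in arrivals:
        if side == "r":
            k = key_r(t)
            for match in s0.get(k, []):   # match with existing s0 tuples
                yield t + match
            r0[k].append(t)               # save it in r0
        else:
            k = key_s(t)
            for match in r0.get(k, []):   # symmetrically for s0 tuples
                yield match + t
            s0[k].append(t)

# Hypothetical arrival order of tuples from r and s
stream = [("r", (1, "a")), ("s", (1, "x")), ("s", (2, "y")),
          ("r", (2, "b")), ("r", (1, "c"))]
output = list(double_pipelined_join(stream, lambda t: t[0], lambda t: t[0]))
```

Results appear in the order matches become available, without waiting for either input to finish.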
Pipelined Evaluation
• Motivation
– A query is mapped into a sequence of operations.
– Each execution of an operation produces a temporary result.
– Generating and saving temporary files on disk is time consuming and expensive.
• Alternative:
– Avoid constructing temporary results as much as possible.
– Pipeline the data through multiple operations: pass the result of a previous operator to the next without waiting to complete the previous operation.
Pipelined Evaluation
• The result of one operator is sometimes pipelined to another operator without creating a temporary table to hold the intermediate result
• The output of R ►◄S is pipelined into the selections & projections that follow
• Cost of writing out the intermediate result & reading it back in can be significant
• Temporary table: Materialized Tuples
Pipelined Evaluation
• Consider a selection query in which only a part of the selection condition matches an index
• 2 instances of selection operator
– Matching (primary) part of the selection condition
– Rest
• Pipelining: apply the second selection to each tuple in the result of the primary selection as it is produced & adding tuples that qualify to the final result
• When the input to a unary operator is pipelined into it, we say that the operator is applied on-the-fly
Pipelined Evaluation
• Result tuples of first join pipelined into join with C
• Conceptually, the evaluation is initiated from the root, & the node joining A & B produces tuples as and when they are requested from their parent node
[Figure: join tree for (A ►◄ B) ►◄ C; the lower join combines A and B, and the upper join combines its result with C]
Statistical Information for Cost Estimation
• nr: number of tuples in a relation r.
• br: number of blocks containing tuples of r.
• lr: size of a tuple of r.
• fr: blocking factor of r — i.e., the number of tuples of
r that fit into one block.
• V(A, r): number of distinct values that appear in r for attribute A; same as the size of ∏A(r).
• If tuples of r are stored together physically in a file, then:
$b_r = \lceil n_r / f_r \rceil$
Histograms
• Histogram on attribute age of relation person
• Equi-width histograms
• Equi-depth histograms
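The difference between the two kinds of histogram can be sketched in Python. An equi-width histogram divides the value range into equal-sized ranges and counts the values in each; an equi-depth histogram chooses range boundaries so that each range holds about the same number of values. The function names and the sample ages are assumptions for illustration.

```python
def equi_width(values, lo, hi, buckets):
    """Count values per bucket when [lo, hi] is split into equal-width ranges."""
    width = (hi - lo) / buckets
    counts = [0] * buckets
    for v in values:
        i = min(int((v - lo) / width), buckets - 1)   # hi falls in the last bucket
        counts[i] += 1
    return counts

def equi_depth(values, buckets):
    """Return the upper boundary of each bucket so buckets hold equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[n * (i + 1) // buckets - 1] for i in range(buckets)]

# Hypothetical ages from a person relation
ages = [21, 22, 23, 25, 25, 26, 30, 41, 58, 60]
width_counts = equi_width(ages, lo=20, hi=60, buckets=4)
depth_bounds = equi_depth(ages, buckets=5)
print(width_counts, depth_bounds)
```

With skewed data the equi-width counts are uneven, while the equi-depth boundaries adapt to where the values are concentrated.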
Estimation of the Size of Joins
• The Cartesian product r × s contains nr · ns tuples; each tuple occupies sr + ss bytes.
• If R ∩ S = ∅, then r ►◄ s is the same as r × s.
• If R ∩ S is a key for R, then a tuple of s will join with at most one tuple from r; therefore, the number of tuples in r ►◄ s is no greater than the number of tuples in s.
• If R ∩ S in S is a foreign key in S referencing R, then the number of tuples in r ►◄ s is exactly the same as the number of tuples in s. The case for R ∩ S being a foreign key referencing S is symmetric.
Selection Size Estimation
• σ A=v (r)
– nr / V(A,r): number of records that will satisfy the selection
– Equality condition on a key attribute: size estimate = 1
• σ A≤v (r) (case of σ A≥v (r) is symmetric)
– Let c denote the estimated number of tuples satisfying the condition.
– If min(A,r) and max(A,r) are available in catalog
• c = 0 if v < min(A,r)
• c = $n_r \cdot \frac{v - \min(A,r)}{\max(A,r) - \min(A,r)}$ otherwise
– If histograms available, can refine above estimate
– In absence of statistical information c is assumed to be nr / 2.
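These estimates translate directly into code. The Python sketch below uses made-up function names; the clamp to nr when v is at or above max(A,r) is an added assumption that the slide does not state, and the balance range in the example is hypothetical.

```python
def estimate_equality(n_r, v_a_r):
    """Estimated size of an equality selection A = v: n_r / V(A, r)."""
    return n_r / v_a_r

def estimate_le(n_r, v, min_a=None, max_a=None):
    """Estimated size of a range selection A <= v."""
    if min_a is None or max_a is None:
        return n_r / 2                      # no statistics available
    if v < min_a:
        return 0
    if v >= max_a:
        return n_r                          # assumption: every tuple qualifies
    return n_r * (v - min_a) / (max_a - min_a)

# depositor: 5,000 tuples and 2,500 distinct customer-names
per_name = estimate_equality(5000, 2500)
# hypothetical account relation: 10,000 tuples, balance between 0 and 10,000
with_stats = estimate_le(10000, 2500, min_a=0, max_a=10000)
without_stats = estimate_le(10000, 2500)
print(per_name, with_stats, without_stats)
```

The linear interpolation assumes values are uniformly distributed between the minimum and maximum, which is what a histogram would refine.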
Size Estimation of Complex Selections
• The selectivity of a condition θi is the probability that a tuple in the relation r satisfies θi.
– If si is the number of satisfying tuples in r, the selectivity of θi is given by si / nr.
• Conjunction: σ θ1 ∧ θ2 ∧ . . . ∧ θn (r). Assuming independence, the estimate of tuples in the result is:
$n_r \cdot \frac{s_1 \cdot s_2 \cdots s_n}{n_r^n}$
• Disjunction: σ θ1 ∨ θ2 ∨ . . . ∨ θn (r). Estimated number of tuples:
$n_r \cdot \left(1 - \left(1 - \frac{s_1}{n_r}\right) \cdot \left(1 - \frac{s_2}{n_r}\right) \cdots \left(1 - \frac{s_n}{n_r}\right)\right)$
• Negation: σ ¬θ (r). Estimated number of tuples:
nr – size(σ θ (r))
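A short Python sketch of the three estimates, under the same independence assumption. The function names and the example sizes (a relation of 1,000 tuples with conditions satisfied by 100 and 500 tuples) are hypothetical.

```python
from math import prod

def conjunction_size(n_r, sizes):
    """n_r times the product of the selectivities s_i / n_r."""
    return n_r * prod(s / n_r for s in sizes)

def disjunction_size(n_r, sizes):
    """n_r times the probability that at least one condition holds."""
    return n_r * (1 - prod(1 - s / n_r for s in sizes))

def negation_size(n_r, size):
    """Tuples that do not satisfy the condition."""
    return n_r - size

conj = conjunction_size(1000, [100, 500])   # selectivities 0.1 and 0.5
disj = disjunction_size(1000, [100, 500])
neg = negation_size(1000, 100)
print(round(conj), round(disj), neg)
```

For these inputs the conjunction keeps about 50 tuples and the disjunction about 550, both smaller than naively multiplying or adding the counts would suggest.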
Heuristic Optimization
• Cost-based optimization is expensive
• Systems may use heuristics to reduce the number of choices that must be made in a cost-based fashion.
• Heuristic optimization transforms the query-tree by using a set of rules that typically (but not in all cases) improve execution performance:
– Perform selection early (reduces the number of tuples)
– Perform projection early (reduces the number of attributes)
– Perform most restrictive selection and join operations before other similar operations.
– Some systems use only heuristics, others combine heuristics with partial cost-based optimization.
Heuristic Optimization
Perform selection operations as early as possible
– A heuristic optimizer would use this rule without finding out whether the cost is reduced by this transformation
– Does it always work?
– Consider this:
σθ (A ►◄B)
Heuristic Optimization
Perform selection operations as early as possible
σθ (A ►◄B)
– Condition θ only refers to attributes in B
– Selection can definitely be performed before the join
– A is extremely small as compared to B
– Index on the join attribute of B
– No index on the attributes used by θ
– Is it a good idea to push the selection before the join?
Heuristic Optimization
Perform selection operations as early as possible
σθ (A ►◄B)
– Performing the selection early, i.e., directly on B, would require a scan of all tuples in B
– Probably cheaper to compute the join using the index and then to reject the tuples that fail the selection
Heuristic Optimization
Perform projection operations as early as possible
– Projection operation, like the selection operation, reduces the size of relations
– Whenever we need to generate a temporary relation, it is advantageous to apply immediately any projections that are possible
Heuristic Optimization
Perform selections earlier than projections
– Selections have the potential of reducing the size of a relation greatly
– Selections enable the use of indices to access tuples
Heuristic Optimization
– Heuristics reorder an initial query-tree representation in such a way that the operations that reduce the size of the intermediate results are applied first
– Early selections reduce the number of tuples
– Early projections reduce the number of attributes
– Heuristic transformations also restructure the tree so that the system performs the most restrictive selection and join operations before other similar operations
SYSTEM R Optimizer
Current relational query optimizers have been greatly influenced by choices made in the design of IBM’s System R query optimizer
– Use of statistics about DB instance to
estimate the cost of a QEP
– Consider only plans with binary joins in which
the inner relation is a base relation
• This heuristic greatly reduces the no. of alternative
plans that must be considered
SYSTEM R Optimizer
– Focus optimization on the class of SQL queries without nesting & treat nested queries in a relatively ad-hoc way
– Not to perform duplicate elimination for projections except as a final step when required by a DISTINCT clause
– Cartesian products avoided
– A model of cost that accounted for CPU costs as well as I/O costs
– Only left-deep plans
Left-Deep Plans
Left Deep Join Trees
• In left-deep join trees, the right-hand-side input
for each join is a relation, not the result of an
intermediate join.
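To see how much the left-deep restriction shrinks the search space, the Python sketch below enumerates left-deep join orders and compares their number with the count of all binary join orders, (2(n-1))! / (n-1)! for n relations. The function names and the nested-tuple plan encoding are assumptions made for this example.

```python
from itertools import permutations
from math import factorial

def left_deep_plans(relations):
    """Yield each left-deep join order as nested pairs, e.g. (("A", "B"), "C")."""
    for order in permutations(relations):
        plan = order[0]
        for rel in order[1:]:
            plan = (plan, rel)     # the right-hand input is always a base relation
        yield plan

def count_all_join_orders(n):
    """Number of join orders over n relations, counting all tree shapes."""
    return factorial(2 * (n - 1)) // factorial(n - 1)

plans = list(left_deep_plans(["A", "B", "C"]))
print(len(plans), count_all_join_orders(3))                     # 3 relations
print(len(list(left_deep_plans("ABCDE"))), count_all_join_orders(5))  # 5 relations
```

For five relations there are 120 left-deep orders against 1,680 join orders overall, and the gap widens quickly as more relations are joined.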