Understanding SQL Server Query Execution Plans Greg Linwood MyDBA [email protected].

19
Understanding SQL Server Query Execution Plans Greg Linwood MyDBA [email protected]

Transcript of Understanding SQL Server Query Execution Plans Greg Linwood MyDBA [email protected].

Page 1: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Understanding SQL Server Query Execution Plans

Greg Linwood

MyDBA

[email protected]

Page 2: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

About Me• MyDBA

– Managed Database Services

• SQLskills– SQL Server training

• Microsoft SQL Server MVP

• Melbourne SQL Server User Group Leader

Page 3: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Agenda• SQL: set based expression / serial execution

• Intro to execution plans

• Table & index access methods

– Table storage formats – Heaps & Clustered Indexes

– Nonclustered Indexes (on Heaps vs on Clustered Indexes)

– Covering Indexes

• Join operators

– Nested Loops Join, Merge Join & Hash Join

• Review

• Discussion

Page 4: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Intro to execution plans

• Why is understanding Execution Plans important?

• Tune performance problems at the source (query efficiency)– More effective than tuning hardware

– Other diagnostics only reveal consequences of poor query execution plans• High CPU is only a consequence of poorly tuned query plans• High disk utilisation usually just a consequence of poorly tuned query plans• Waitstats reveals high resource utilisation of poorly tuned query plans• Locking usually just a consequence of poorly tuned query plans

• Tuning query execution plans is a VERY important tuning technique

• Provides insight into query execution steps / processing efficiency

- SQL Server occasionally makes mistakes

Page 5: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

SQL: set based expression / serial execution

• SQL syntax based on “set based” expressions (no processing rules)

• Query execution is serial– SQL Server “compiles” query into a series of sequential steps which are

executed one after the other

– Individual steps also have internal sequential processing • (eg table scans are processed one page after another & per row within page)

Returns CustID, OrderID & OrderDate for orders >

1st Jan 2005

No processing rules included in SQL statement, just the “set” of

data to be returned

• Execution Plans Display these steps

Page 6: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Intro to execution plans – a simple example

• Execution Plan shows how SQL Server compiled & executes this query– Ctl-L in SSMS for “estimated” plan (without running query)

– Ctl-M displays “actual” plan after query has completed

• Read Execution Plans from top right to bottom left– In this case, plan starts with Clustered Index Scan of [SalesOrderHeader]

– Then, for every row returned, performs an index seek into [Customers]

Plan node cost shown as % of total

plan “cost”

Table / index access methods displayed (Scan, Seek etc)

Join physical operators displayed

(Loops, Merge, Hash)Thickness of intra node rows

denotes estimated / actual number of rows carried between nodes in

the plan

Page 7: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Intro to execution plans – a simple example

• Read Execution Plans from top right to bottom left (loosely)1

3

24

(1) Plan starts with Clustered Index Scan of [SalesOrderHeader](Full scan of table, as no index exists)

(2) “For Each” row returned from [SalesOrderHeader]..(Nested Loops are execution plan terminology for “For Each”, but we’ll come back to this later)

(3) Find row in [Customer] with matching CustomerID

(4) Return rows formatted with CustomerID, SalesOrderID & OrderDate columns

Using our existing

sample query..

Note: plan starts at [SalesOrderHeader] even though [Customer] is actually named first in query expression

Right angle blue arrow in table access method icon represents full scan (bad)

Stepped blue arrow in table access method icon represents index “seek”, but could be either a single row or a range of rows

Page 8: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Execution plan node properties

• Mouse over execution plan node reveals extra properties..

Search predicate. WHERE filter in this case, but can also

be join filter

Number of rows returned shown in

Actual Execution Plan

Name of Schema object accessed to physically process query –

typically an index, but also possibly a heap structure

Ordered / Unordered – displays whether scan operation follows page

“chain” linked list (next / previous page # in page header) or follows Index Allocation Map (IAM) page

Page 9: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

“Heap” Table Storage

• Query execution example:

Select FName, Lname, PhNo

from Customers where Lname = ‘Smith’

No b-tree with HEAPs, so no lookup method

available unless other indexes are present. Only

option is to scan heap

No physical ordering of table rows (despite this

display)Scan cannot complete just because a row is

located. Because data is not ordered, scan must

continue through to end of table (heap)

• Table storage structure used when no clustered index on table– Rarely used as CIXs added to PKs by default– Oracle uses Heap storage by default (even with PKs)

• No physical ordering of rows– Stored in order of insertion– New pages added to end of “heap” as needed

• NO B-Tree index nodes (no “index”)

Page 10: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

“Clustered Index” Table Storage

• Table rows stored in leaf level of clustered index, in physical order of index column/s (key/s)

• B-Tree index nodes also created on indexed columns

• Each level contains entries based on “cluster key” value from the first row in pages from lower level

• Default table storage format for tables WITH a primary key

• Query execution example:

Select FName, Lname, PhNo

From Customers where CustID = 23

Table rows stored in physical order of

clustered index key column/s – CustID in this

case.

CIX also provides b-tree lookup pages, similar to a

regular index

Can only have one CIX per table (as table storage can

only be sorted one way)

• create clustered index cix_CustID on customers (CustID)

Page 11: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Non-Clustered Index (on Heap storage)

• Query execution example:

Select Lname, Fname

from Customers

where Lname = ‘Smith’

• B-tree structure contains one leaf row for every row in base table, sorted by index column values.

• Each row contains a “RowID”, an 8 byte “pointer” to heap storage

– (RowID actually contains File, Page & Slot data)

• If index does NOT cover query, RowID lookups performed to get values for non-indexed columns

• create nonclustered index ncix_lname on customers (lname)

Page 12: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Non-Clustered Index (on Clustered Index storage)• create nonclustered index ncix_lname on customers (lname)• B-tree structure contains one

leaf row for every row in base table, sorted by index column values. (same as when NCIX is on a heap)

• Instead of a RowID, each row’s clustered index “key” value is stored in the index leaf level instead.

• This means RowID bookomarks cannot be performed (as RowID is not available). Instead, bookmark lookups are performed, which are considerably more expensive

Query execution

example:

Select Lname, Fname

From Customers

Where Lname = ‘Smith’

BookmarkBookmarkLookupLookup

Page 13: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Non-Clustered Index (Covering Index)• create nonclustered index ncix_lname on customers (lname, fname)

NCIX now “covers” query because all columns named in query are

present in NCIX

• “Covering” indexes significantly reduce query workload by removing bookmark lookups (& RowID lookups)

• Query execution

example:

Select Lname, Fname

from Customers

where Lname = ‘Smith’

Page 14: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Join Operators (intra-table operators)• Nested Loop Join

– Original & only join operator until SQL Server 7.0– “For Each Row…” type operator– Takes output from one plan node & executes another

operation “for each” output row from that plan node

• Merge Join– Scans both sides of join in parallel – Ideal for large range scans where joined columns are

indexed• If joined columns aren’t indexed, requires expensive sort

operation prior to Merge

• Hash Join– “Hashes” values of join column/s from one side of join

• Usually smaller side

– “Probes” with the other side• Usually larger side

– Hash is conceptually similar to building an index for every execution of a query

• Hash buckets not shared between executions

– Worst case join operator– Useful for large scale range scans which occur infrequently

Page 15: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Nested Loops Join (IX range scan + IX seek)

select p.Class, sod.ProductID

from Production.Product p

join Sales.SalesOrderDetail sod

on p.ProductID = sod.ProductID

where p.Class = ‘M‘

and sod.SpecialOfferID = 2

1

An index seek (3 page reads) is

performed against SalesOrderDetail FOR EACH row

found in the seek range.

3

If a large number of rows are involved in execution plan node (not just results) this can be very costly

Page 16: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Merge Join (IX range scan + IX range scan)

select p.Class, sod.ProductID

from Production.Product p

join Sales.SalesOrderDetail sod

on p.ProductID = sod.ProductID

where p.Class = ‘M‘

and sod.SpecialOfferID = 2

Values on either side of ranges being merged compared for (in)equality

1

Page 17: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Nested Loops Join vs Merge Join

• Comparing Nested Loops vs Merge Join– Nested Loops is often far less efficient than Merge– Setting up Merge operator more expensive than Nested Loops

• But cost savings in terms of IO pay off very quickly – only hundreds of rows required

• Common misconception – “Merge requires left hand columns in indexes to be the same”– Not always true. Note index definitions for the previous example were:

create index ix_SalesOrderDetail_SpecialOfferID_ProductID

on Sales.SalesOrderDetail (SpecialOfferID, ProductID)

create index ix_Product_Class_ProductID

on Production.Product (Class, ProductID) ProductID on RIGHT hand side of both indexes to support JOIN whilst left hand columns support range seek.

• Where only a few rows are involved (tens or perhaps hundreds) there’s little difference

• Where many rows involved (per node or resultset), difference can be huge

• Merge can be very costly if SORT operation required– Important that well designed indexes exist first to avoid large scale sorting within plan

Page 18: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Review• SQL Syntax is Set based but execution is serial

– No such thing as “set based execution”

• Execution Plans describe execution steps chosen by SQL Server

• Only describes current behavior – doesn’t describe solutions– Useful to verify expected behavior rather than look for answers

• SQL Server can only optimize based on existing indexes– Most performance tuning solutions come from designing good indexes

• Other useful tools– Set statistics io on

• Shows table level workload (reads)

– Profiler• Capture Execution Plans at run time

Page 19: Understanding SQL Server Query Execution Plans Greg Linwood MyDBA gregl@MyDBA.com.

Greg Linwood

[email protected]

http://blogs.sqlserver.org.au/blogs/greg_linwood

Questions?