Ten query tuning techniques every SQL Server programmer should know

Post on 24-Jan-2015

1.199 views 3 download

description

From the noted database expert and author of 'SQL in a Nutshell' - SELECT statements have a reputation for being very easy to write, but hard to write very well. This session will take you through ten of the most problematic patterns and anti-patterns when writing queries and how to deal with them all. Loaded with live demonstrations and useful techniques, this session will teach you how to take your SQL Server queries mundane to masterful.

Transcript of Ten query tuning techniques every SQL Server programmer should know

TEN QUERY TUNING TECHNIQUES

Every SQL Programmer Should Know

Kevin KlineDirector of Engineering Services at SQL

SentryMicrosoft MVP since 2003Facebook, LinkedIn, Twitter at KEKLINEKEKline@sqlsentry.comKevinEKline.com, ForITPros.com

FOR FRIENDS OF SQL SENTRY

• Free Plan Explorer download: http://www.sqlsentry.net/plan-explorer/

• Free query tuning consultations: http://answers.sqlperformance.com.

• Free new ebook (regularly $10) to attendees. Send request to sales@sqlsentry.net.

• SQL Server educational videos, scripts, and slides: http://SQLSentry.TV

• Tuning blog: http://www.sqlperformance.com/

• Monthly eNews tips and tricks: http://www.sqlsentry.net/newsletter-archive.asp

AGENDA• Introductions• Test & tuning environment

• 1. Clearing caches

• Looking for red flags• 2. Reading execution plans

• Query tuning techniques:• 8 more specific examples of widespread approaches that lead to poor

performance

• Summary

3

TEST & TUNING ENVIRONMENT

• Code to clear the caches*: o CHECKPOINTo DBCC [FreeProcCache | FreeSystemCache | FlushProcInDB(<dbid>) ]o DBCC DropCleanBuffers

• Code to set measurements:o SET STATISTICS [TIME | IO]o SET SHOWPLAN [TEXT | XML] or Graphic Execution Plans

• Code for Dynamic Management Views (DMV) checks.o System info – sys.dm_os_performance_counters, sys.dm_os_wait_statso Query info – sys.dm_exec_requestso Index info – sys.dm_db_index_usage_stats, sys.dm_io_virtual_file_stats

RED FLAGS IN YOUR SQL CODE

• Red Flags Query Operators:o Lookups, Scans, Spools, Parallelism Operations

• Other Red Flags:o Dissimilar estimated versus actual row countso High physical readso Missing statistics alarmso Large sort operationso Implicit data type conversions

DEMOS: DEFAULT CURSORS

• I don’t always use cursors…o …but when I do, I avoid the default optionso Slow and heavy-handed: Global, updateable, dynamic,

scrollableo I use LOCAL FAST_FORWARDo May want to test STATIC vs. DYNAMIC, when tempdb is a

bottleneck

• Blog post: http://bit.ly/AB-cursors 

DEMOS: WHERE IN VERSUS WHERE EXISTS

• There are lots of ways to find data existing within subsets:• IN, EXISTS, JOIN, Apply, subquery

• Which technique is best?• Blog post: http://bit.ly/AB-NOTIN

OPTIMIZING FOR SELECT VERSUS DML

• Big differences between a SELECT and a DML statement that effects the same rows.

• Shouldn’t blindly create every index the Tuning Advisor or execution plan tells you to!

• Blog post - http://bit.ly/AB-BlindIndex

READS & INDEX STRUCTURE• 8K pages• Leaf pages ARE the data.• Non-leaf pages are pointers.

Leaf Pages

Root Page

Level 0

Intermediate Pages

Level 1

Level 2

WRITES & INDEX STRUCTURE• Each change to the leaf pages requires all index

structures be updated.

Leaf Pages

Root Page

Level 0

Intermediate Pages

Level 1

Level 2

Page Split

DML

Actualplace-ment

DEMOS: UNWANTED RECOMPILES

Execution

Load metadataNOIn Memory?

compile

optimize

ExecuteY

ES

ReComp

Execute

CAUSES OF RECOMPILE

• Expected: Because we request it:• CREATE PROC … WITH RECOMPILE or EXEC myproc … WITH

RECOMPILE• SP_RECOMPILE foo

• Expected: Plan was aged out of memory• Unexpected: Interleaved DDL and DML• Unexpected: Big changes since last execution:

• Schema changes to objects in underlying code• New/updated index statistics• Sp_configure

INTERLEAVED DDL AND DML• CREATE PROC testddldml AS … ;• CREATE TABLE #testdml; -- (DDL)• <some T-SQL code here>• INSERT INTO #testdml; -- (DML + RECOMPILE)• <some T-SQL code here>• ALTER TABLE #testdml; -- (DDL)• <some T-SQL code here>• INSERT INTO #testdml; -- (DML + RECOMPILE)• <some T-SQL code here>• DROP TABLE #testdml; -- (DDL)• <some T-SQL code here>

DEMOS: THE "KITCHEN SINK" PROCEDURE

• Usually see it as a one-query-for-all-queries procedure, or even one-proc-for-for-all-transactions procedure:o Where name starts with S, or placed an order this year, or lives in Texaso Insert AND Update AND Delete AND Select

• Conflicting optional parameters make optimization impossibleo OPTION (RECOMPILE)o Dynamic SQL + Optimize for ad hoc workloadso Specialized procedures

• Better approach?o Specialize and optimize each piece of code to do ONE THING really effectively

DEMOS: SP_EXECUTESQL VS. EXEC(…)

• I don’t always use dynamic SQL…o …but when I do, I always use sp_executesqlo Less fuss with concatenation and implicit/explicit conversionso Better protection against SQL injection (but not for all things)o At worst case, behavior is the same

• Can promote better plan re-use• Encourages strongly typed parameters instead of

building up a massive string

IMPLICIT CONVERSIONS

• SQL Server has to do a lot of extra work / scans when conversion operations are assumed by the SQL programmer.

• Happens all the time with data types you’d think wouldn’t need it, e.g. between date types and character types.

• Very useful data type conversion chart at http://bit.ly/15bDRRA.

• Data type precedence call also have an impact: http://bit.ly/13Zio1f.

IMPLICIT CONVERSION RESOURCES

• Ian Stirk’s Column Mismatch Utility at http://www.sqlservercentral.com/articles/Administration/65138/.

• Jonathan Kehayias’ plan cache analyzer at http://sqlblog.com/blogs/jonathan_kehayias/archive/2010/01/08/finding-implicit-column-conversions-in-the-plan-cache.aspx.

• Jonathan Kehayias’ index scan study at http://www.sqlperformance.com/2013/04/t-sql-queries/implicit-conversion-costs

DEMOS: COMMA-DELIMITED PARAMETERS

• Example: pass a comma-separated list of OrderIDs• String splitting is expensive, even using CLR• Table-valued parameters are typically a better approach

DEMOS: TEMPORARY STRUCTURES

• Which are better, temp tables or temp variables?

Temp Table Temp Variable

Stored in? Tempdb Tempdb

Statistics? Yes No (1 row)

Indexs/Keys? Yes 1 UK / PK only

Truncate? Yes No

Recompiles? Yes No

Parallelism? Yes No

Metadata Overhead?

Low Lowest

Lock Overhead? Normal Lowest

CODING STANDARDS AND DISSIMILARITY

• Might sound frivolous, but naming schemes are importanto Convention is not important; but rather being consistent and logical

• Story: dbo.UpdateCustomer vs. dbo.Customer_Update

• Always specify schema when creating, altering, referencing objectso Object resolution works a little bit harder without ito More importantly, it can get the wrong answero And will often yield multiple copies of the same plan

• Do not use the sp_ prefix on stored procedureso This has observable overhead, no matter how specific you are

MIMICKING PRODUCTION

• Your dev machine is usually nothing like productiono Build representative data when you cano Build a stats-only database when you can’t (a.k.a. a database clone)

• Will allow you to see plan issues, but not speedo Make sure settings are the same

• @@VERSION, edition• Max memory if possible, sp_configure options• Logins (and permissions), tempdb settings• Parameterization settings, recovery model, compression, snapshot isolation• Compatibility level (usually not an issue when working with a restore)• Run a full business cycle workload after a restore

o Simulate equivalent hardware: DBCC OPTIMIZER_WHATIFo Use Distributed Replay when you can

• Not perfect, but more realistic than single-threaded trace replay

SUMMARYTest Harness for your SQL Code to

ensure stable results.Remember the red flags in SQL code

running against SQL Server.Plus 8 more coding techniques:1. Writing “good” cursors2. WHERE IN versus WHERE EXISTS

3. The nature of indexes and Optimizing for Reads versus Writes

4. Inadvertent recompiles5. EXEC versus SP_EXECUTESQL6. Data types and Implicit conversion7. Dissimilar execution plans and

tracking plan cache bloat8. Mimicking the production environmentMore info?

Let’s connect!Facebook, LinkedIn, Twitter at KEKLINE.

Email at KEKline@sqlsentry.com

Blogs at http://KevinEKline.com Andhttp://ForITPros.com

WRAP UPEngage with us on social media

o I’m thankful for your word of mouth promotions and endorsements!

Share your tough SQL tuning problems with us: http://answers.sqlperformance.com

Download SQL Sentry Plan Explorer for free: http://www.sqlsentry.com/plan-explorer/

Check out our other award winning tools: http://www.sqlsentry.net/download

NOLOCK

http://www.flickr.com/photos/relyjus/4289185639/

NOLOCK

• It is a turbo button …if you’re ok with inaccuracy

• There are times it is perfectly valido Ballpark row countso Please use session-level setting, not table hint

• Usually, though, better to use SNAPSHOT or RCSIo But test under heavy load