Page 1: 05_waits_intro

Part II: Wait Events and the Geeks Who Love Them

Kyle Hailey

http://perfvision.com

Page 2: 05_waits_intro

Copyright 2006 Kyle Hailey

Wait Events

Page 3: 05_waits_intro

And the Geeks Who Love Them

Page 4: 05_waits_intro

In this Presentation:

Introduction to Waits
Tuning Methodology
  Plan of Action
  Statspack, AWR or OEM for Collecting Data
  Based on Waits
Using Waits to Solve Bottlenecks

Page 5: 05_waits_intro

Database is Hung!

Everybody blames the database
Yet 9 out of 10 DBAs agree it’s not the database
How do you prove it to management?
On the off chance it’s the database, what do we do?

Page 6: 05_waits_intro

Database: Guilty until proven innocent

*$%@!!

Page 7: 05_waits_intro

Oracle’s Defense

After years of false accusations, Oracle took action and created a defense system:
WAIT EVENTS to the rescue
Oracle is the best-instrumented database on the market, which can save time and money on development and tuning

Page 8: 05_waits_intro

Oracle Instrumentation

Redo
Lib Cache
Buffer Cache
IO
Locks
Network
CPU

Page 9: 05_waits_intro

Waits
  Introduced in v7
  Revolutionized tuning
  Changed from ratio guesswork to an empirical measure of time lost to bottlenecks
10g added the crucial addition: ASH
  Not only identifies bottlenecks but also
    Who (session, service, package, procedure)
    Where (CPU, Wait)
    When (time)
    What (SQL statement)

Page 10: 05_waits_intro

Tuning Methodology

1. Machine
   Run queue (CPU): check other applications, reduce CPU usage or add CPUs
   Paging: reduce memory usage or add memory
2. Oracle
   Waits + CPU > Available CPU: tune waits
   CPU 100%: tune SQL
   Else (low waits, available CPU): it’s the application

We are going to concentrate here on WAITS
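The checklist on this slide can be read as a small decision procedure. The sketch below is one possible reading, not the author's code: the function name, the argument names and the rule "whichever of waits or CPU is larger decides between tuning waits and tuning SQL" are assumptions added for illustration.

```python
def next_tuning_step(run_queue, cpu_count, paging, wait_load, cpu_load):
    """Suggest the next step following the slide's two-stage checklist.

    All arguments are hypothetical measurements supplied by the caller:
      run_queue  - OS run-queue length
      cpu_count  - number of CPUs on the host (available CPU)
      paging     - True if the host is paging
      wait_load  - average number of sessions waiting inside Oracle
      cpu_load   - average number of sessions on CPU inside Oracle
    """
    # Stage 1: machine health is checked before looking at Oracle.
    if run_queue > cpu_count:
        return "machine: reduce CPU usage or add CPUs"
    if paging:
        return "machine: reduce memory usage or add memory"

    # Stage 2: Oracle load (waits + CPU) compared with available CPU.
    if wait_load + cpu_load > cpu_count:
        # Assumption: the larger component decides what to tune.
        if wait_load > cpu_load:
            return "oracle: tune waits"
        return "oracle: tune SQL"

    # Low waits and spare CPU: the database is not the bottleneck.
    return "application"


print(next_tuning_step(run_queue=2, cpu_count=4, paging=False,
                       wait_load=6.0, cpu_load=1.0))
```

With the example values (4 CPUs, 6 sessions waiting, 1 on CPU) the function points at the waits, which is the case the rest of the presentation concentrates on.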

Page 11: 05_waits_intro

Dependable Tuning Strategy

Determine AAS:
  Run Statspack or AWR Report
    Top 5 Timed Events, ~50 lines down from top
    Need Available CPU: Elapsed Time, CPU_COUNT
  ASH Report: ashrpt.sql
  OEM 10g: Performance Page does everything
If there is a wait bottleneck, tune the wait
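The arithmetic behind "Determine AAS" can be sketched as follows, taking AAS to mean average active sessions: time spent in each of the Top 5 Timed Events divided by the elapsed time of the report, then compared with CPU_COUNT. The function name and all figures are made up for illustration; they are not taken from a real report.

```python
def average_active_sessions(event_seconds, elapsed_minutes):
    """Convert per-event time (seconds) into average active sessions.

    event_seconds   - dict of event name -> seconds spent in the interval,
                      as read from the Top 5 Timed Events section
    elapsed_minutes - elapsed time of the report interval
    """
    elapsed_seconds = elapsed_minutes * 60
    return {name: secs / elapsed_seconds
            for name, secs in event_seconds.items()}


# Hypothetical one-hour report on a host with CPU_COUNT = 2.
top_events = {
    "CPU time": 1800.0,
    "db file sequential read": 5400.0,
    "log file sync": 900.0,
}
cpu_count = 2

aas = average_active_sessions(top_events, elapsed_minutes=60)
total_aas = sum(aas.values())

for name, value in sorted(aas.items(), key=lambda kv: -kv[1]):
    print(f"{name:30s} {value:5.2f}")
print("bottleneck" if total_aas > cpu_count else "no bottleneck")
```

In this invented example the total load exceeds the two CPUs and most of it is I/O wait, so the wait is the thing to tune.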

Page 12: 05_waits_intro

Tuning Methodology Graphics

Relax, it’s the application

Get to Work!

Page 13: 05_waits_intro

Waits beyond OEM

OEM
  Identifies Wait problems
  Provides solutions with ADDM, sometimes
But
  What do you do when ADDM isn’t sufficient?
  What do you do if you don’t have OEM 10g?
Waits
  Need to know about waits
  How they work
  How to analyze them

Page 14: 05_waits_intro

Waits

I/O

Library Cache

Locks

Redo

Buffer Cache

SQL*Net

Wait Areas

We’ll discuss Waits in these logical database areas

Page 15: 05_waits_intro

Wait Tree

Waits

IO

Buffer Cache

Library Cache

Lock

Redo

SQL Net

Buffer Busy

Rollback

Free lists

IO Read

Cache Latches

Library Cache

Shared Pool

TX Row Lock

TX ITL Lock

HW Lock

Write IO

Read IO

Log Buffer

Log File Sync

Log File

Page 16: 05_waits_intro

v$active_session_history

When ADDM fails or we don’t have ADDM, we can collect the necessary information from v$active_session_history:
  Session (user, service, client, package, procedure, etc.)
  SQL statement
  For IO-related waits: CURRENT_OBJ#, CURRENT_FILE#, CURRENT_BLOCK#
  Blocking_Session
  P1, P2, P3

Page 17: 05_waits_intro

What are P1,P2,P3 ?

Each wait has three parameters: P1, P2, P3
  Give detailed information
  Meaning is different for each wait
  Meanings are defined in V$EVENT_NAME

select name, parameter1, parameter2, parameter3
from v$event_name;

col parameter1 for a10
col parameter2 for a10
col parameter3 for a10
select parameter1, parameter2, parameter3
from v$event_name
where name = '&1';

Page 18: 05_waits_intro

Wait Arguments Example

select name, parameter1, parameter2, parameter3
from v$event_name;

NAME                           PARAMETER1      PARAMETER2      PARAMETER3
------------------------------ --------------- --------------- ---------------
latch: cache buffers chains    address         number          tries
free buffer waits              file#           block#          set-id#
buffer busy waits              file#           block#          class#
latch: redo copy               address         number          tries
log buffer space
switch logfile command
log file sync                  buffer#
db file sequential read        file#           block#          blocks
enq: TM - contention           name|mode       object #        table/partition
undo segment extension         segment#
enq: TX - row lock contention  name|mode       usn<<16 | slot  sequence
row cache lock                 cache id        mode            request
library cache pin              handle address  pin address     100*mode+namesp
library cache load lock        object address  lock address    100*mask+namesp
pipe put                       handle address  record length   timeout

Page 19: 05_waits_intro

Wait Analysis requires p1,p2,p3

Of the top 30 wait events, 8 can be solved without ASH:
  free buffer waits
  log buffer space
  log file switch (archiving needed)
  log file switch (checkpoint incomplete)
  log file switch completion
  log file sync
  switch logfile command
  write complete waits

The rest need SQL_ID and/or P1, P2, P3

Page 20: 05_waits_intro

Difficult Waits

These 4 waits have multiple causes

Latches p2 = latch # (p1= address, p3= tries)

Locks p1 = lock type and mode ( p2 = id1, p3= id2)

Buffer Busy p3 = block class#, p1= file, p2=block (in 9i p3 was the bbw type)

Row Cache Lock p1 = cache id (p2 = mode, p3=request)
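For locks, P1 packs the two-character lock type and the requested mode into one number (shown as name|mode in v$event_name). A minimal sketch of unpacking it, assuming the usual layout of the type in the two high-order bytes and the mode in the low 16 bits; the function name and the mode table are added here for illustration.

```python
# Standard Oracle lock mode numbers and their usual names.
LOCK_MODES = {
    1: "null",
    2: "row share",
    3: "row exclusive",
    4: "share",
    5: "share row exclusive",
    6: "exclusive",
}


def decode_enqueue_p1(p1):
    """Split an enqueue wait's P1 into (lock type, requested mode)."""
    # The two high-order bytes hold the ASCII characters of the lock type.
    lock_type = chr((p1 >> 24) & 0xFF) + chr((p1 >> 16) & 0xFF)
    # The low 16 bits hold the requested lock mode.
    mode = p1 & 0xFFFF
    return lock_type, mode


lock_type, mode = decode_enqueue_p1(1415053318)
print(lock_type, mode, LOCK_MODES.get(mode, "unknown"))  # TX 6 exclusive
```

A P1 of 1415053318 therefore means a TX lock requested in mode 6 (exclusive), the usual signature of a row lock wait.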

Page 21: 05_waits_intro

Wait Analysis

Find SQL waiting
  Most often the tuning answer lies in looking at what the application is doing, and changing it

Find extended wait information
  Parameter1, Parameter2, Parameter3

Sometimes the wait events that are found are not in the documentation and it takes some educated guesswork to figure out the problem

Page 22: 05_waits_intro

Waits we will Ignore

One thing that makes waits difficult is knowing which ones to look at and which ones to ignore.

  Background
  Idle
  Resource Manager

Page 23: 05_waits_intro

Background Processes

[Diagram: User1, User2 and User3 connect to the SGA (Log Buffer, Buffer Cache, Library Cache); background processes LGWR, DBWR, PMON and SMON; LGWR with the REDO Log Files, DBWR with the Data Files]

Page 24: 05_waits_intro

Background & Foreground

Background Processes
  DBWR
  LGWR
  PMON
  SMON
  Etc.
Foreground Processes
  SQL*Plus
  Pro*C
  SQL*Forms
  Oracle applications

Only interested in Foreground waits

Page 25: 05_waits_intro

Background Waits

ASH: avoid Background waits with

select … from v$active_session_history
where SESSION_TYPE='FOREGROUND'

V$session_wait: avoid Background waits by joining to v$session

select … from v$session s, v$session_wait w
where w.sid=s.sid and s.type='USER'

Page 26: 05_waits_intro

Idle Waits

Filtered out of ASH by default

10g
  where wait_class != 'Idle'
  Create a list:
    select name from v$event_name where wait_class='Idle';

9i
  Create a list with
    Documentation
    List created from 10g
    Stats$idle_events from Statspack

Example idle event: SQL*Net message from client

Page 27: 05_waits_intro

Parallel Query Waits

Filter out Parallel Query waits
  These wait events are unusable
  The same waits are both idle and non-idle waits

Parallel Query waits start with ‘PX’ or ‘KX’
  PX Deq: Par Recov Reply
  PX Deq: Parse Reply

Page 28: 05_waits_intro

Resource Manager Waits

Resource manager throttles user
  Creates wait
  Obfuscates problems

10g

select name from v$event_name where wait_class='Scheduler';
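The filters from the last few slides (foreground sessions only, no Idle waits, no Parallel Query waits, no Resource Manager waits) can be combined into a single test. The sketch below applies them to rows shaped like ASH samples; the dictionary keys mirror ASH column names, but the function name and the sample rows are invented for illustration.

```python
IGNORED_WAIT_CLASSES = {"Idle", "Scheduler"}   # idle and Resource Manager waits
IGNORED_EVENT_PREFIXES = ("PX", "KX")          # Parallel Query waits


def is_interesting(sample):
    """Return True if a sample should be kept for wait analysis."""
    if sample["session_type"] != "FOREGROUND":
        return False
    if sample["wait_class"] in IGNORED_WAIT_CLASSES:
        return False
    # Samples taken on CPU carry no event name, so guard against None.
    event = sample["event"] or ""
    return not event.startswith(IGNORED_EVENT_PREFIXES)


# Invented sample rows; the wait classes are placeholders for the example.
samples = [
    {"session_type": "FOREGROUND", "wait_class": "User I/O",
     "event": "db file sequential read"},
    {"session_type": "FOREGROUND", "wait_class": "Idle",
     "event": "SQL*Net message from client"},
    {"session_type": "FOREGROUND", "wait_class": "Scheduler",
     "event": "resmgr:cpu quantum"},
    {"session_type": "FOREGROUND", "wait_class": "Other",
     "event": "PX Deq: Parse Reply"},
    {"session_type": "BACKGROUND", "wait_class": "System I/O",
     "event": "log file parallel write"},
]

kept = [s["event"] for s in samples if is_interesting(s)]
print(kept)
```

Only the foreground I/O wait survives the filter in this example.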


Page 29: 05_waits_intro

RAC Waits

RAC waits are certainly interesting but will be covered outside of this presentation.
  You are on your own
  Check documentation
  If you are not using RAC then no worries

10g
  select name from v$event_name where wait_class='Cluster';

9i
  RAC and OPS waits usually contain the word “global”

Page 30: 05_waits_intro

Latches

Protect areas of memory from concurrent use
Lightweight locks
  Bit in memory
  Atomic processor call
  Fast and cheap
  Gone if memory is lost
Often used in cache coherency management
  Changes to a data block
Generally exclusive
  Shared reading has been introduced for some latches

Page 31: 05_waits_intro

Finding Latches

“latch free”
  Covers many latches; find the problem latch by
    1. select name from v$latchname where latch# = p2;
    OR
    2. Find highest sleeps in Statspack latch section

In 10g, important latches have their own wait event
  latch: cache buffers chains
  latch: shared pool
  latch: library cache

Page 32: 05_waits_intro

Enqueues aka Locks

“Enqueue” wait – covers all locks pre-10g
Protect data against concurrent changes
Lock info written into data structures
  Block headers
  Data blocks
  Written in cache structures
Shareable in compatible modes

Page 33: 05_waits_intro

Locks 10g

10g breaks all Enqueues out

Event                            Wait class
enq: HW - contention             Configuration
enq: TM - contention             Application
enq: TX - allocate ITL entry     Configuration
enq: TX - index contention       Concurrency
enq: TX - row lock contention    Application
enq: UL - contention             Application

Page 34: 05_waits_intro

Row Cache Lock

Need p1 to see the cache type

SQL> select cache#, parameter from v$rowcache;

    CACHE# PARAMETER
---------- --------------------------------
         1 dc_free_extents
         4 dc_used_extents
         2 dc_segments
         0 dc_tablespaces
         5 dc_tablespace_quotas
         6 dc_files
         7 dc_users
         3 dc_rollback_segments
         8 dc_objects
        17 dc_global_oids
        12 dc_constraints
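When only the P1 value of a row cache lock wait is at hand (for example from an ASH sample), the listing above works as a lookup table. A small sketch, using just the cache numbers shown on this slide; the function name is invented, and numbering can differ by version, so v$rowcache on the system in question remains the authority.

```python
# cache# -> parameter, copied from the v$rowcache listing on this slide.
ROW_CACHES = {
    0: "dc_tablespaces",
    1: "dc_free_extents",
    2: "dc_segments",
    3: "dc_rollback_segments",
    4: "dc_used_extents",
    5: "dc_tablespace_quotas",
    6: "dc_files",
    7: "dc_users",
    8: "dc_objects",
    12: "dc_constraints",
    17: "dc_global_oids",
}


def row_cache_name(p1):
    """Translate the P1 (cache id) of a row cache lock wait into a name."""
    return ROW_CACHES.get(p1, f"unknown cache# {p1}; check v$rowcache")


print(row_cache_name(8))
```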

Page 35: 05_waits_intro

Row Cache Lock Statspack

Dictionary Cache Stats for DB: ORA9  Instance: ora9  Snaps: 1 -2
->"Pct Misses" should be very low (< 2% in most cases)
->"Cache Usage" is the number of cache entries being used
->"Pct SGA" is the ratio of usage to allocated size for that cache

                        Get    Pct    Scan   Pct      Mod      Final
Cache              Requests   Miss    Reqs  Miss     Reqs      Usage
----------------- --------- ------ ------- ----- -------- ----------
dc_object_ids            45    0.0       0              0        958
dc_objects               89    0.0       0              0      1,129
dc_segments              69    0.0       0              0        807
dc_tablespaces           12    0.0       0              0         13
dc_usernames             22    0.0       0              0         19
dc_sequences        120,003    0.0       0        120,003          5

Page 36: 05_waits_intro

Additional Support

AWR Tables – on disk for 7 days by default
  DBA_HIST_ACTIVE_SESS_HISTORY: 1 in 10 ASH samples
  DBA_HIST_SEG_STAT: good for ITL and buffer busy waits
  DBA_HIST_SYSTEM_EVENT: important for getting avg wait times
  DBA_HIST_SQLSTAT: SQL execution deltas
  DBA_HIST_SYSMETRIC_SUMMARY: statistics avg, max, min

Metric Tables – in-memory deltas
  V$EVENTMETRIC

Page 37: 05_waits_intro

All Events over 7 days

select count(*), event
from (
  select event from DBA_HIST_ACTIVE_SESS_HISTORY
  where sample_time < ( select min(sample_time) from v$active_session_history )
  union all
  select event from v$active_session_history
)
group by event
order by event
/

Page 38: 05_waits_intro

Example ASH Query

select ash.p1,
       ash.p2,
       CURRENT_OBJ#||' '||o.object_name objn,
       o.object_type otype,
       CURRENT_FILE# filen,
       CURRENT_BLOCK# blockn,
       ash.SQL_ID,
       w.class ||' '||to_char(ash.p3) block_type
from v$active_session_history ash,
     ( select rownum class#, class from v$waitstat ) w,
     all_objects o
where event='buffer busy waits'
  and w.class#(+)=ash.p3
  and o.object_id (+)= ash.CURRENT_OBJ#
  and ash.session_state='WAITING'
  and ash.sample_time > sysdate - &1/(60*24)
order by sample_time

P1     P2 OBJN                  OTYPE FILEN BLOCKN SQL_ID        BLOCK_TYPE
-- ------ --------------------- ----- ----- ------ ------------- ------------
 1 112796 66053 BBW_INDEX_VAL_I INDEX     1 112796 6avm49ys4k7t6 data block 1
 1 112401 66053 BBW_INDEX_VAL_I INDEX     1 112401 5wqps1quuxqr4 data block 1
 1 112796 66053 BBW_INDEX_VAL_I INDEX     1 112796 5wqps1quuxqr4 data block 1
 1 113523 66053 BBW_INDEX_VAL_I INDEX     1 113523 5wqps1quuxqr4 data block 1

Page 39: 05_waits_intro

Average Wait Times Historic

select
  btime,
  (time_ms_end - time_ms_beg) / nullif(count_end - count_beg, 0) avg_ms
from (
  select
    to_char(s.BEGIN_INTERVAL_TIME, 'DD-MON-YY HH24:MI') btime,
    e.total_waits count_end,
    e.time_waited_micro/1000 time_ms_end,
    lag(e.time_waited_micro/1000)
      over (partition by e.event_name order by s.snap_id) time_ms_beg,
    lag(e.total_waits)
      over (partition by e.event_name order by s.snap_id) count_beg
  from
    DBA_HIST_SYSTEM_EVENT e,
    DBA_HIST_SNAPSHOT s
  where
    s.snap_id = e.snap_id
    and e.event_name = '&1'
  order by begin_interval_time
)
order by btime;

BTIME                      AVG_MS
-------------------- ------------
08-JAN-08 01:00             1.017
08-JAN-08 02:00              .720
08-JAN-08 03:00              .621
08-JAN-08 04:00             1.747
08-JAN-08 05:00             1.046
08-JAN-08 06:00             1.444
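The query works because DBA_HIST_SYSTEM_EVENT stores cumulative counters: the average wait for an interval is the change in time waited divided by the change in wait count between consecutive snapshots, which is what the LAG calls compute. The same arithmetic as a standalone sketch; the function name and the snapshot figures are invented for illustration.

```python
def avg_wait_ms_per_interval(snapshots):
    """Average wait (ms) between consecutive cumulative snapshots.

    snapshots - list of (label, total_waits, time_waited_micro) tuples,
                ordered by snapshot id, holding cumulative counters
    Returns a list of (label, avg_ms); avg_ms is None when no waits
    occurred in the interval (the NULLIF case in the SQL).
    """
    rows = []
    for previous, current in zip(snapshots, snapshots[1:]):
        _, prev_waits, prev_micro = previous
        label, waits, micro = current
        delta_waits = waits - prev_waits
        delta_ms = (micro - prev_micro) / 1000
        rows.append((label, delta_ms / delta_waits if delta_waits else None))
    return rows


# Invented cumulative counters for one event across four snapshots.
history = [
    ("01:00", 1000, 1_000_000),
    ("02:00", 3000, 3_000_000),
    ("03:00", 3000, 3_000_000),
    ("04:00", 4000, 3_500_000),
]

for label, avg_ms in avg_wait_ms_per_interval(history):
    print(label, avg_ms)
```

Unlike the SQL, which returns a row with a null average for the first snapshot, this sketch simply omits it.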

Page 40: 05_waits_intro

Avg Wait Times Now

select
  en.name,
  (time_waited)/nullif(wait_count,0) avg_ms,
  wait_count
from
  v$eventmetric e,
  v$event_name en
where
  e.event# = en.event#
  and en.name like '%&1%';

NAME                          AVG_MS WAIT_COUNT
------------------------- ---------- ----------
db file sequential read   .658863707       6420
db file scattered read    .549427419        186
db file parallel write    .089073438         64

Page 41: 05_waits_intro

Object Translation

Object ID
File # and Block #

Page 42: 05_waits_intro

Wait interface Weaknesses

Logons
  EM 10g shows these on perf page
  Time model helps: V$SYS_TIME_MODEL, connection management call elapsed time
  I’ve had problems
Paging/Memory issues
CPU starvation
Null Events
Bugs – read external table reports CPU

http://blog.tanelpoder.com/

Page 43: 05_waits_intro

Dependable Tuning Strategy

Run Statspack/AWR report
  Top 5 Timed Events, ~50 lines down from top
  Need Available CPU: Elapsed Time, CPU_COUNT
OEM 10g
  Performance Page does everything!
OEM doesn’t solve the problem
  Query v$active_session_history directly

Page 44: 05_waits_intro

Summary

Waits make tuning easy
  Check machine health
  Tune waits
  Tune CPU
    Tune SQL
    Change application architecture
Use
  OEM 10g
  Statspack/AWR, S/ASH
Ignore Background, Idle, Resmgr, PQO
Use ASH if OEM fails
See http://perfvision.com for more info