Adapting to Adaptive Plans on 12c
Mauro Pagano
2
Mauro Pagano
• Consultant, working with both DBAs and Devs
• Oracle, Enkitec, Accenture Enkitec Group
• Database Performance and SQL Tuning
• Training, Workshops, OUG
• Free Tools (SQLd360, TUNAs360, Pathfinder)
• Newbie old fart (thanks Bryn! :-)
3
Some background
• CBO makes mistakes
  – Complex code with a challenging job
  – Based on
    • Statistical formulas, not perfect in 100% of cases
    • Partial knowledge of the data (stats)
• Mistakes can translate into poor exec plans
• Poor plans usually lead to poor performance
4
How to avoid mistakes?
• Improve quality of the model (Oracle)
  – Hard, lots of corner cases for CBO to handle
• Improve quality of stats (kind of you)
  – Hard, need knowledge of data and queries
• In 12c Oracle introduced two more ways:
  – Not committing to a specific plan at parse time
  – Keeping track of mistakes so as not to make them again
BIG shift in mentality for CBO: acknowledging that mistakes are made and reacting to them
5
Typical effect of CBO mistake
6
Behind the curtains
Nested Loop at step 15 is driven by step 16. E-Rows(16) is 1 but A-Rows(16) is 299k.
As a consequence, every step on the inner side of NL(15) is started 299k times.
The first few steps of the inner block are where most of the DB time usually goes.
7
Nested Loop vs Hash Join - scalability
[Chart: elapsed time vs number of rows, one line for Nested Loop and one for Hash Join]
Linear scalability for joins is BAD!! 1ms per row means 1k secs for 1M rows and 11 days for 1B
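The arithmetic behind that callout can be checked with a few lines of Python. This is only a sketch: the 1 ms per row figure is the slide's, and the function name is made up for illustration.

```python
from datetime import timedelta

MS_PER_ROW = 1  # the slide's figure: 1 ms of work per driving row


def nested_loop_elapsed(rows: int, ms_per_row: float = MS_PER_ROW) -> timedelta:
    """Elapsed time when the work grows linearly with the number of driving rows."""
    return timedelta(milliseconds=rows * ms_per_row)


# Print the elapsed time for 1K, 1M and 1B driving rows.
for rows in (1_000, 1_000_000, 1_000_000_000):
    print(f"{rows:>13,} rows -> {nested_loop_elapsed(rows)}")
```

One million rows comes out at 1,000 seconds and one billion rows at a bit more than 11 days, which is the point of the slide: a linear cost per row is harmless for small row counts and a disaster for large ones.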
8
A test is worth 1000 expert opinions
drop table tab1 purge;
drop table tab2 purge;

create table tab1 as
select mod(rownum, 100) j1,
       mod(rownum, 100) j2,
       mod(rownum, 100) f1,
       mod(rownum, 100) f2,
       mod(rownum, 100) f3,
       lpad('x',1000,'x') pad1 -- just to make FTS more expensive
  from dual
connect by rownum <= 100000;

create table tab2 as
select mod(rownum, 100) j1,
       mod(rownum, 100) j2,
       mod(rownum, 100) f1,
       mod(rownum, 100) f2,
       lpad('x',1000,'x') pad1 -- just to make FTS more expensive
  from dual
connect by rownum <= 100000;

create index tab2_idx on tab2(j1,j2);
9
Our demo SQL – pre 12.1
select count(*)
  from tab1 a, tab2 b
 where a.j1 = b.j1
   and a.j2 = b.j2
   and a.f1 = 1
   and a.f2 = 1
   and a.f3 = 1

----------------------------------------------------------------------------------
| Id  | Operation            | Name     | Starts | E-Rows | A-Rows | A-Time      |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |      1 |        |      1 | 00:00:03.33 |
|   1 |  SORT AGGREGATE      |          |      1 |      1 |      1 | 00:00:03.33 |
|   2 |   NESTED LOOPS       |          |      1 |      1 |  1000K | 00:00:02.52 |
|*  3 |    TABLE ACCESS FULL | TAB1     |      1 |      1 |   1000 | 00:00:00.05 |
|*  4 |    INDEX RANGE SCAN  | TAB2_IDX |   1000 |     10 |  1000K | 00:00:00.85 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("A"."F1"=1 AND "A"."F2"=1 AND "A"."F3"=1)
   4 - access("A"."J1"="B"."J1" AND "A"."J2"="B"."J2")
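Why is E-Rows 1 on TAB1 when A-Rows is 1000? A plausible explanation, sketched below in Python, is the usual independence assumption: with no column-group statistics, the three equality filters are each given a selectivity of 1/NDV and the selectivities are multiplied. That costing rule is an assumption here, not something the slide states, and the function names are illustrative. It is at least consistent with the 10053 trace later in the deck, which starts its inflection point search at 0.10.

```python
from fractions import Fraction

NUM_ROWS = 100_000   # rows created in tab1 (connect by rownum <= 100000)
NUM_DISTINCT = 100   # mod(rownum, 100) yields 100 distinct values per column


def estimated_rows(num_rows, ndvs):
    """Assumed independence: multiply the 1/NDV selectivity of each equality filter."""
    selectivity = Fraction(1)
    for ndv in ndvs:
        selectivity *= Fraction(1, ndv)
    return num_rows * selectivity


def actual_rows(num_rows, ndv, value):
    """f1, f2 and f3 are the same expression, so one predicate decides all three."""
    return sum(1 for rownum in range(1, num_rows + 1) if rownum % ndv == value)


estimate = estimated_rows(NUM_ROWS, [NUM_DISTINCT] * 3)
actual = actual_rows(NUM_ROWS, NUM_DISTINCT, 1)
print(float(estimate), actual)
```

Under that assumption the estimate is a tenth of a row, shown as 1 in the plan, while the perfectly correlated columns really return 1000 rows. The inner side of the Nested Loop is therefore started 1000 times instead of once.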
10
Adaptive Features in 12c
11
Meet Adaptive Plans
• Introduced in 12.1
  – Slightly enhanced in 12.2
• Enabled by default even in 12.2
  – Some of the enhancements are disabled in 12.2
• Transparent to users and DBAs
  – CBO takes care of it, no need to intervene
• Designed to avoid runaway executions
• Applies to join method and PX distribution method
12
Adaptive Joins - How does it work?
• CBO identifies potential “weak” areas at parse
  – For example, those with complex join predicates
• CBO doesn’t commit to a decision at parse
  – Risky since CBO knows it’s a weak path
• Execution plan built with two join methods
  – Nested Loop and Hash Join
• Execution “starts small” with Nested Loop
13
Our demo SQL – 12.1
select count(*)
  from tab1 a, tab2 b
 where a.j1 = b.j1
   and a.j2 = b.j2
   and a.f1 = 1
   and a.f2 = 1
   and a.f3 = 1

--------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Starts | E-Rows | A-Rows | A-Time      |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |      1 |        |      1 | 00:00:00.21 |
|   1 |  SORT AGGREGATE          |          |      1 |      1 |      1 | 00:00:00.21 |
|*  2 |   HASH JOIN              |          |      1 |      1 |  1000K | 00:00:00.16 |
|*  3 |    TABLE ACCESS FULL     | TAB1     |      1 |      1 |   1000 | 00:00:00.02 |
|   4 |    INDEX FAST FULL SCAN  | TAB2_IDX |      1 |     10 |   100K | 00:00:00.01 |
--------------------------------------------------------------------------------------

   2 - access("A"."J1"="B"."J1" AND "A"."J2"="B"."J2")
   3 - filter(("A"."F1"=1 AND "A"."F2"=1 AND "A"."F3"=1))

Note
-----
   - this is an adaptive plan
14
How does it really work?
• CBO determines when NL is not good anymore
  – Number of rows where NL and HJ lines intersect
• CBO calls it Inflection Point (IP)
  – Past it, HJ becomes more efficient than NL
  – Still based on CBO costing, could be flawed
• Thresholds reported in execution plan (*)
  – As STATISTICS COLLECTOR step
  – Using existing technology, not reported before
15
How does it really work? – Part 2
• Real exec plan has a blocking op before the join
• Rows are buffered there
  – Becomes pass-through after IP reached
• All rows fetched before IP reached -> NL
  – Low #rows to process, NL more efficient
• IP reached but more rows to fetch -> HJ
  – High #rows to process, HJ more efficient
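The buffering behaviour described above can be pictured with a toy model in Python. This is not Oracle's implementation: the function name, the row representation and the exact boundary rule (switching as soon as the buffered row count reaches the inflection point) are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Tuple


def choose_join_method(driving_rows: Iterable[int],
                       inflection_point: float) -> Tuple[str, Iterator[int]]:
    """Buffer driving rows until the source runs dry or the inflection point is hit."""
    source = iter(driving_rows)
    buffered: List[int] = []
    for row in source:
        buffered.append(row)
        if len(buffered) >= inflection_point:
            # Threshold reached: stop buffering, turn into a pass-through
            # and commit to the Hash Join.
            def passthrough() -> Iterator[int]:
                yield from buffered
                yield from source
            return "HASH JOIN", passthrough()
    # The source was exhausted below the threshold: few rows, Nested Loop is fine.
    return "NESTED LOOPS", iter(buffered)


method, rows = choose_join_method(range(1000), inflection_point=2.39)
print(method, sum(1 for _ in rows))
```

With the demo's inflection point of 2.39 and 1000 driving rows, the model commits to the Hash Join after buffering only three rows and then streams the rest without holding them back.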
16
Our demo SQL – 12.1 – real plan
-----------------------------------------------------------------------------------------
|  Id   | Operation                 | Name     | Starts | E-Rows | A-Rows | A-Time      |
-----------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT          |          |      1 |        |      1 | 00:00:00.23 |
|     1 |  SORT AGGREGATE           |          |      1 |      1 |      1 | 00:00:00.23 |
|  *  2 |   HASH JOIN               |          |      1 |      1 |  1000K | 00:00:00.17 |
|-    3 |    NESTED LOOPS           |          |      1 |      1 |   1000 | 00:00:00.03 |
|-    4 |     STATISTICS COLLECTOR  |          |      1 |        |   1000 | 00:00:00.03 |
|  *  5 |      TABLE ACCESS FULL    | TAB1     |      1 |      1 |   1000 | 00:00:00.03 |
|- *  6 |     INDEX RANGE SCAN      | TAB2_IDX |      0 |     10 |      0 | 00:00:00.01 |
|     7 |    INDEX FAST FULL SCAN   | TAB2_IDX |      1 |     10 |   100K | 00:00:00.01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("A"."J1"="B"."J1" AND "A"."J2"="B"."J2")
   5 - filter(("A"."F1"=1 AND "A"."F2"=1 AND "A"."F3"=1))
   6 - access("A"."J1"="B"."J1" AND "A"."J2"="B"."J2")

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)
17
How does it look in SQL Monitor?
18
How does it look in SQL Monitor?
Grey steps are inactive
19
How does it look in SQLd360?
20
Plans info change a bit
• Distinction between “small” and “big” plan
  – FULL_PLAN_HASH_VALUE -> big plan
  – PLAN_HASH_VALUE -> small plan
• Real exec plan has more steps
  – Hidden by DBMS_XPLAN since not “active”
• Execution plan Step ID a bit misleading
  – Changed by DBMS_XPLAN to be consecutive
  – Mismatch between “small” and “big” plan Step ID
• ASH SQL Plan Line ID shows “big” plan Step ID
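To make the Step ID mismatch concrete, the Python sketch below takes the real plan of the demo SQL shown earlier in the deck and renumbers its active steps consecutively, which is how the slide says DBMS_XPLAN presents them. The data structure and function name are illustrative only.

```python
# "Big" plan of the demo SQL: (step id, operation, is the step active?)
FULL_PLAN = [
    (0, "SELECT STATEMENT", True),
    (1, "SORT AGGREGATE", True),
    (2, "HASH JOIN", True),
    (3, "NESTED LOOPS", False),
    (4, "STATISTICS COLLECTOR", False),
    (5, "TABLE ACCESS FULL TAB1", True),
    (6, "INDEX RANGE SCAN TAB2_IDX", False),
    (7, "INDEX FAST FULL SCAN TAB2_IDX", True),
]


def displayed_ids(full_plan):
    """Map big-plan step ids to the consecutive ids left once inactive steps are hidden."""
    mapping = {}
    for step_id, _operation, active in full_plan:
        if active:
            mapping[step_id] = len(mapping)
    return mapping


print(displayed_ids(FULL_PLAN))
```

Big plan steps 5 and 7 end up displayed as steps 3 and 4, so an ASH sample reporting SQL Plan Line ID 7 refers to the INDEX FAST FULL SCAN that the default plan output shows as step 4.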
21
How does CBO determine IP?
[Chart: elapsed time vs number of rows, with NL and HJ lines]
• Compute cost for min #rows for NL vs HJ. NL wins here
• Compute cost for max #rows for NL vs HJ. HJ wins here
• Divide max #rows in half and compute cost. HJ wins here
• Keep doing it until HJ loses, that’s the inflection point to switch at
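The search described above is a binary search over cardinality, and it can be sketched in Python. The two cost functions are hypothetical linear models with made-up coefficients, not the optimizer's real costing, so the result is not expected to match the 2.39 found in the trace. The stopping tolerance and the handling of the two edge cases are assumptions as well.

```python
def find_inflection_point(nl_cost, hj_cost, min_card, max_card, tolerance=0.01):
    """Bisect the cardinality range until the NL/HJ crossover is within `tolerance`."""
    if nl_cost(min_card) > hj_cost(min_card):
        return min_card   # HJ already cheaper at the low end (toy choice)
    if nl_cost(max_card) <= hj_cost(max_card):
        return None       # NL never loses in this range (toy choice)
    low, high = min_card, max_card   # invariant: NL wins at low, HJ wins at high
    while high - low > tolerance:
        mid = (low + high) / 2
        if hj_cost(mid) < nl_cost(mid):
            high = mid    # HJ still wins: the crossover is at or below mid
        else:
            low = mid     # HJ loses: the crossover is above mid
    return high


def toy_nl_cost(card):
    """Hypothetical NL cost: cheap to start, grows quickly with driving rows."""
    return 178.0 + 2.0 * card


def toy_hj_cost(card):
    """Hypothetical HJ cost: more expensive to start, nearly flat afterwards."""
    return 181.5 + 0.0003 * card


ip = find_inflection_point(toy_nl_cost, toy_hj_cost, 0.10, 199728.76)
print(ip)
```

Even with a search range of almost 200k rows, the halving needs only a couple of dozen cost computations to pin the crossover down to a hundredth of a row.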
22
Nerdy details
AP: Checking validity for query block SEL$1, sqlid=c3jux1qzjc40j.
Searching for inflection point (join #1) between 0.10 and 199728.76
AP: Computing costs for inflection point at min value 0.10
AP: Using binary search for inflection point search
AP: Costing Nested Loops Join for inflection point at card 0.10
  NL Join : Cost: 178.655763 …
AP: Costing Hash Join for inflection point at card 0.10
  Hash join: Resc: 181.510246 …
AP: lcost=178.66, rcost=181.51

AP: Computing costs for inflection point at max value 199728.76
AP: Costing Nested Loops Join for inflection point at card 199728.76
  NL Join : Cost: 401159.563642 …
AP: Costing Hash Join for inflection point at card 199728.76
  Hash join: Resc: 247.056641 …
AP: lcost=401159.56, rcost=247.06
23
Nerdy details – part 2
AP: Costing Nested Loops Join for inflection point at card 2.39
  NL Join : Cost: 180.663398 Resp: 180.663398 Degree: 1
AP: Costing Hash Join for inflection point at card 2.39
  Hash join: Resc: 181.510256
AP: lcost=180.66, rcost=181.51
DP: Found point of inflection for NLJ vs. HJ: card = 2.39
24
When does Adaptive kick in?
• At the first execution
  – Decision used by following executions too (shared cursor)
• What if things change?
  – Different bind -> ACS should take care of it
  – Misleading stats -> if no SPD, stuck with first plan
    • Until next hard parse, which could be triggered by many factors
• Continuous adaptive in 12.2
  – Adapts at every execution
  – Disabled by default (for now)
25
Limitations of Adaptive Joins
• No Sort Merge Join, only NL and HJ
  – Means it only works for equality join conditions
• Doesn’t change join order, only join method
  – If a mistake leads to a poor join order, you are stuck with it
• CBO takes a step back in many cases
  – For example, if the join method is hinted, even if suboptimal
• 10053 trace is the only way to get details
  – Not much of an issue though if the feature works
26
Adaptive Distribution Method (ADM)
• Determine distribution method at runtime
  – Avoid broadcast for large dataset
    • Hash distribution for both rowsets
  – Avoid hash for small dataset
    • Broadcast for driver, round-robin for probe rowset
• STATISTICS COLLECTOR step before PX SEND
• PX SEND becomes PX SEND HYBRID
• Decision made by QC based on info provided by the slaves
  – Harder to spot, plan doesn’t change
27
Little demo – ADM
-- 2x rows, not really necessary
insert into tab2 select * from tab2;

-- start small, 2 rows
delete tab1 where rownum <= 99998;

-- removed filter predicates
-- used hints to force desired join order, method and DoP
select /*+ parallel(4) leading(a) use_hash(b) */ count(*)
  from tab1 a, tab2 b
 where a.j1 = b.j1
   and a.j2 = b.j2
28
Little demo – ADM
2 rows produced
8 rows distributed
2 rows * 4 DoP = 8
BROADCAST distribution
29
Little demo – ADM
-- 256 rows, deleting just to start “clean”
delete tab1;
insert into tab1 … connect by rownum <= 256;

-- removed filter predicates
-- used hints to force desired join order, method and DoP
select /*+ parallel(4) leading(a) use_hash(b) */ count(*)
  from tab1 a, tab2 b
 where a.j1 = b.j1
   and a.j2 = b.j2
30
Little demo – ADM
256 rows produced
256 rows distributed
#rows produced = distributed
HASH distribution
Plan looks the same as before!!
31
Turning point – ADM
-- 254 rows
delete tab1 where rownum <= 2;

-- removed filter predicates
-- used hints to force desired join order, method and DoP
select /*+ parallel(4) leading(a) use_hash(b) */ count(*)
  from tab1 a, tab2 b
 where a.j1 = b.j1
   and a.j2 = b.j2

(from 10053)
ADM inflection point = 256.000000
32
Little demo – ADM
254 rows produced
1016 rows distributed
254 * 4 = 1016
BROADCAST distribution
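The three ADM runs can be summarised with a small Python model of the driving rowset. It is a sketch under assumptions: the function name is invented, and the strict "less than" comparison against the inflection point is a guess that merely fits the two data points the demo provides (254 rows broadcast, 256 rows hashed).

```python
def distribute(rows_produced: int, dop: int, inflection_point: float):
    """Return (method, rows distributed) for the driving rowset of the hash join."""
    if rows_produced < inflection_point:
        # Small rowset: every consumer receives a full copy of it.
        return "BROADCAST", rows_produced * dop
    # Large rowset: each row is sent to exactly one consumer.
    return "HASH", rows_produced


# The three cases from the demo: DoP 4, ADM inflection point 256 (from the 10053 trace).
for rows in (2, 254, 256):
    print(rows, distribute(rows, dop=4, inflection_point=256.0))
```

Comparing rows produced with rows distributed is what gives the decision away in SQL Monitor, since the plan itself looks identical in all three runs.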
33
Why does everybody turn it off?
• Adaptive Features in 12.1 have a single parameter
  – To disable Adaptive Statistics you need to disable everything
• The Adaptive Plans feature works well
  – Turning it off is generally a bad idea
• Oracle provided a patch to split the parameter (22652097)
  – To keep Adaptive Plans in place
  – To disable Adaptive Statistics only
• Default behavior in 12.2: AP enabled and AS disabled
34
Summary
• Can save the day on runaway SQL
  – This is especially true for Adaptive Joins
  – Allows critical decisions to be made with better info
• No setup necessary
  – One of the few features that kind of “just works”
• Makes plan investigation a bit harder
  – But well handled by SQL Monitoring
  – And other tools like SQLd360
35
36
Contact Information
• http://mauro-pagano.com
  – Email
    • [email protected]
  – Free tools to download
    • SQLd360
    • TUNAs360
    • Pathfinder