Adapting to Adaptive Plans on 12c

Mauro Pagano

Mauro Pagano
• Consultant, working with both DBAs and Devs
• Oracle, Enkitec, Accenture Enkitec Group
• Database Performance and SQL Tuning
• Training, Workshops, OUG
• Free Tools (SQLd360, TUNAs360, Pathfinder)
• Newbie old fart (thanks Bryn! :-)

Some background
• CBO makes mistakes
  – Complex code with a challenging job
  – Based on
    • Statistical formulas, not accurate in 100% of cases
    • Partial knowledge of the data (stats)
• Mistakes can translate into poor exec plans
• Poor plans usually lead to poor performance

How to avoid mistakes?
• Improve quality of the model (Oracle)
  – Hard, lots of corner cases for CBO to handle
• Improve quality of stats (kind of you)
  – Hard, need knowledge of data and queries
• In 12c Oracle introduced two more ways:
  – Not committing to a specific plan at parse time
  – Keeping track of mistakes so as not to make them again

BIG shift in mentality for CBO: acknowledging that mistakes are made and reacting to them

Typical effect of CBO mistake

Behind the curtains
Nested Loop at step 15 is driven by step 16. E-Rows(16) is 1 but A-Rows(16) is 299k.
As a consequence, every step on the inner side of NL(15) is started 299k times.
The first few steps of the inner block are where most of the DB time usually goes.

Nested Loop vs Hash Join - scalability
[Chart: elapsed time vs. number of rows, with one curve for Nested Loop and one for Hash Join]
Linear scalability for joins is BAD!! 1ms per row means 1k secs for 1M rows and 11 days for 1B
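The numbers on this slide follow from plain multiplication. A minimal sketch of the arithmetic, using the slide's own 1 ms per row figure (the helper name is only for illustration):

```python
# Elapsed time of a join that scales linearly with the number of rows.
# The 1 ms per-row cost is the figure quoted on the slide; the helper
# name is illustrative only.
MS_PER_ROW = 1.0

def elapsed_seconds(rows, ms_per_row=MS_PER_ROW):
    """Total elapsed seconds when every row costs ms_per_row milliseconds."""
    return rows * ms_per_row / 1000.0

one_million = elapsed_seconds(1_000_000)                      # seconds
one_billion_days = elapsed_seconds(1_000_000_000) / 86_400    # days

print(f"1M rows: {one_million:.0f} s")
print(f"1B rows: {one_billion_days:.1f} days")
```

A thousand times more rows means a thousand times more elapsed time, which is why a Nested Loop driven by an underestimated rowset runs away.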

A test is worth 1000 expert opinions
drop table tab1 purge;
drop table tab2 purge;

create table tab1 as
select mod(rownum, 100) j1,
       mod(rownum, 100) j2,
       mod(rownum, 100) f1,
       mod(rownum, 100) f2,
       mod(rownum, 100) f3,
       lpad('x',1000,'x') pad1 -- just to make FTS more expensive
  from dual
connect by rownum <= 100000;

create table tab2 as
select mod(rownum, 100) j1,
       mod(rownum, 100) j2,
       mod(rownum, 100) f1,
       mod(rownum, 100) f2,
       lpad('x',1000,'x') pad1 -- just to make FTS more expensive
  from dual
connect by rownum <= 100000;

create index tab2_idx on tab2(j1,j2);

Our demo SQL – pre 12.1
select count(*)
  from tab1 a, tab2 b
 where a.j1 = b.j1
   and a.j2 = b.j2
   and a.f1 = 1
   and a.f2 = 1
   and a.f3 = 1

-------------------------------------------------------------------------
| Id | Operation           | Name     |Starts|E-Rows|A-Rows|   A-Time   |
-------------------------------------------------------------------------
|  0 | SELECT STATEMENT    |          |     1|      |     1|00:00:03.33 |
|  1 |  SORT AGGREGATE     |          |     1|     1|     1|00:00:03.33 |
|  2 |   NESTED LOOPS      |          |     1|     1| 1000K|00:00:02.52 |
|* 3 |    TABLE ACCESS FULL| TAB1     |     1|     1|  1000|00:00:00.05 |
|* 4 |    INDEX RANGE SCAN | TAB2_IDX |  1000|    10| 1000K|00:00:00.85 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("A"."F1"=1 AND "A"."F2"=1 AND "A"."F3"=1)
   4 - access("A"."J1"="B"."J1" AND "A"."J2"="B"."J2")
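Why is E-Rows 1 on the TAB1 scan when 1000 rows come back? A hedged sketch of the arithmetic: it rebuilds the three filter columns exactly as the CREATE TABLE does, and assumes the optimizer has only basic column statistics, so it multiplies three independent 1/NDV selectivities. That independence assumption is mine, not something the plan output states.

```python
# Rebuild the TAB1 filter columns from the demo: f1, f2 and f3 are all
# mod(rownum, 100), so they are perfectly correlated.
NUM_ROWS = 100_000
NDV = 100  # distinct values per column

rows = [(n % NDV, n % NDV, n % NDV) for n in range(1, NUM_ROWS + 1)]

# What actually comes back for f1 = 1 and f2 = 1 and f3 = 1.
actual = sum(1 for f1, f2, f3 in rows if f1 == 1 and f2 == 1 and f3 == 1)

# Assumption: with basic column stats only, each equality predicate gets
# selectivity 1/NDV and the predicates are treated as independent.
estimated = NUM_ROWS * (1 / NDV) ** 3

print(actual)               # 1000
print(round(estimated, 2))  # 0.1
```

An estimate of 0.1 rows is displayed as 1 in the plan, while the correlated columns really return 1000 rows, so the Nested Loop is started a thousand times more often than costed.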

Adaptive Features in 12c

Meet Adaptive Plans
• Introduced in 12.1
  – Slightly enhanced in 12.2
• Enabled by default even in 12.2
  – Some of the enhancements are disabled in 12.2
• Transparent to users and DBAs
  – CBO takes care of it, no need to intervene
• Designed to avoid runaway executions
• Applies to join and PX distribution method

Adaptive Joins - How does it work?
• CBO identifies potential “weak” areas at parse
  – For example, those with complex join predicates
• CBO doesn’t commit to a decision at parse
  – Risky since CBO knows it’s a weak path
• Execution plan built with two join methods
  – Nested Loop and Hash Join
• Execution “starts small” with Nested Loop

Our demo SQL – 12.1
select count(*)
  from tab1 a, tab2 b
 where a.j1 = b.j1
   and a.j2 = b.j2
   and a.f1 = 1
   and a.f2 = 1
   and a.f3 = 1

----------------------------------------------------------------------------
| Id | Operation              | Name     |Starts|E-Rows|A-Rows|   A-Time   |
----------------------------------------------------------------------------
|  0 | SELECT STATEMENT       |          |     1|      |     1|00:00:00.21 |
|  1 |  SORT AGGREGATE        |          |     1|     1|     1|00:00:00.21 |
|* 2 |   HASH JOIN            |          |     1|     1| 1000K|00:00:00.16 |
|* 3 |    TABLE ACCESS FULL   | TAB1     |     1|     1|  1000|00:00:00.02 |
|  4 |    INDEX FAST FULL SCAN| TAB2_IDX |     1|    10|  100K|00:00:00.01 |
----------------------------------------------------------------------------

   2 - access("A"."J1"="B"."J1" AND "A"."J2"="B"."J2")
   3 - filter(("A"."F1"=1 AND "A"."F2"=1 AND "A"."F3"=1))

Note
-----
   - this is an adaptive plan

How does it really work?
• CBO determines when NL is no longer the better choice
  – Number of rows where NL and HJ lines intersect
• CBO calls it Inflection Point (IP)
  – Past it, HJ becomes more efficient than NL
  – Still based on CBO costing, could be flawed
• Thresholds reported in execution plan (*)
  – As STATISTICS COLLECTOR step
  – Using existing technology, not reported before

How does it really work? – Part 2
• Real exec plan has a blocking op before the join
• Rows are buffered there
  – Becomes pass-through after IP reached
• All rows fetched before IP reached -> NL
  – Low #rows to process, NL more efficient
• IP reached but more rows to fetch -> HJ
  – High #rows to process, HJ more efficient
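The buffering logic in the bullets above can be pictured with a toy model. This is only a sketch of the idea, not how the database implements the STATISTICS COLLECTOR step; the function name and the exact comparison against the inflection point are assumptions.

```python
from itertools import islice

def choose_join_method(driving_rows, inflection_point):
    """Toy model of the buffering step: returns (method, buffered_rows, rest).

    Rows are buffered until either the source is exhausted (stay with
    Nested Loop) or the buffer grows past the inflection point (switch to
    Hash Join and stop buffering). Whether the real comparison is strict
    or not is an assumption here.
    """
    source = iter(driving_rows)
    # Pull at most floor(IP) + 1 rows: one more than NL is expected to handle.
    limit = int(inflection_point) + 1
    buffered = list(islice(source, limit))
    if len(buffered) > inflection_point:
        return "HASH JOIN", buffered, source   # pass-through from here on
    return "NESTED LOOPS", buffered, source    # source is already exhausted

method_small, _, _ = choose_join_method(range(2), inflection_point=2.39)
method_big, buf, rest = choose_join_method(range(1000), inflection_point=2.39)
print(method_small)  # NESTED LOOPS
print(method_big)    # HASH JOIN
```

The point of the design is that the decision costs only a handful of buffered rows: once the threshold is crossed, the remaining rows stream straight through to the Hash Join.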

Our demo SQL – 12.1 – real plan
---------------------------------------------------------------------------------
|   Id  | Operation                | Name     |Starts|E-Rows|A-Rows|   A-Time   |
---------------------------------------------------------------------------------
|     0 | SELECT STATEMENT         |          |     1|      |     1|00:00:00.23 |
|     1 |  SORT AGGREGATE          |          |     1|     1|     1|00:00:00.23 |
|  *  2 |   HASH JOIN              |          |     1|     1| 1000K|00:00:00.17 |
|-    3 |    NESTED LOOPS          |          |     1|     1|  1000|00:00:00.03 |
|-    4 |     STATISTICS COLLECTOR |          |     1|      |  1000|00:00:00.03 |
|  *  5 |      TABLE ACCESS FULL   | TAB1     |     1|     1|  1000|00:00:00.03 |
|- *  6 |     INDEX RANGE SCAN     | TAB2_IDX |     0|    10|     0|00:00:00.01 |
|     7 |    INDEX FAST FULL SCAN  | TAB2_IDX |     1|    10|  100K|00:00:00.01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("A"."J1"="B"."J1" AND "A"."J2"="B"."J2")
   5 - filter(("A"."F1"=1 AND "A"."F2"=1 AND "A"."F3"=1))
   6 - access("A"."J1"="B"."J1" AND "A"."J2"="B"."J2")

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)

How does it look in SQL Monitor?

How does it look in SQL Monitor?

Grey steps are inactive

How does it look in SQLd360?

Plan info changes a bit
• Distinction between “small” and “big” plan
  – FULL_PLAN_HASH_VALUE -> big plan
  – PLAN_HASH_VALUE -> small plan
• Real exec plan has more steps
  – Hidden by DBMS_XPLAN since not “active”
• Execution plan Step ID is a bit misleading
  – Changed by DBMS_XPLAN to be consecutive
  – Mismatch between “small” and “big” plan Step ID
• ASH SQL Plan Line ID shows “big” plan Step ID
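To make the Step ID mismatch concrete, here is a small sketch that renumbers the active steps of the demo's real plan the way the default display does. The step list is copied from the demo plan; the helper name is hypothetical.

```python
# Step IDs of the "big" plan from the demo, with the inactive steps
# (marked '-' by DBMS_XPLAN) flagged False. Operation names are copied
# from the plan output.
full_plan = [
    (0, "SELECT STATEMENT", True),
    (1, "SORT AGGREGATE", True),
    (2, "HASH JOIN", True),
    (3, "NESTED LOOPS", False),
    (4, "STATISTICS COLLECTOR", False),
    (5, "TABLE ACCESS FULL", True),
    (6, "INDEX RANGE SCAN", False),
    (7, "INDEX FAST FULL SCAN", True),
]

def big_to_small(plan):
    """Map big-plan step IDs to the consecutive IDs shown for active steps."""
    active_ids = [step_id for step_id, _, active in plan if active]
    return {big_id: small_id for small_id, big_id in enumerate(active_ids)}

mapping = big_to_small(full_plan)
print(mapping)  # {0: 0, 1: 1, 2: 2, 5: 3, 7: 4}
```

So an ASH sample on plan line 7 belongs to the INDEX FAST FULL SCAN, which the default "small" plan output shows as step 4.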

How does CBO determine IP?
[Chart: elapsed time vs. number of rows, with one curve for NL and one for HJ]
Compute cost for min #rows for NL vs HJ. NL wins here.
Compute cost for max #rows for NL vs HJ. HJ wins here.
Divide max #rows in half and compute cost. HJ wins here.
Keep doing it until HJ loses, that’s the inflection point to switch at.

Nerdy details
AP: Checking validity for query block SEL$1, sqlid=c3jux1qzjc40j.
Searching for inflection point (join #1) between 0.10 and 199728.76
AP: Computing costs for inflection point at min value 0.10
AP: Using binary search for inflection point search
AP: Costing Nested Loops Join for inflection point at card 0.10
  NL Join : Cost: 178.655763 …
AP: Costing Hash Join for inflection point at card 0.10
  Hash join: Resc: 181.510246 …
AP: lcost=178.66, rcost=181.51

AP: Computing costs for inflection point at max value 199728.76
AP: Costing Nested Loops Join for inflection point at card 199728.76
  NL Join : Cost: 401159.563642 …
AP: Costing Hash Join for inflection point at card 199728.76
  Hash join: Resc: 247.056641 …
AP: lcost=401159.56, rcost=247.06

Nerdy details – part 2
AP: Costing Nested Loops Join for inflection point at card 2.39
  NL Join : Cost: 180.663398 Resp: 180.663398 Degree: 1
AP: Costing Hash Join for inflection point at card 2.39
  Hash join: Resc: 181.510256
AP: lcost=180.66, rcost=181.51

DP: Found point of inflection for NLJ vs. HJ: card = 2.39
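The search the trace describes can be sketched as a plain bisection. The two cost functions below are hypothetical stand-ins, not the optimizer's formulas, so the result is not expected to match the 2.39 in the trace; the stopping tolerance is also an assumption.

```python
def find_inflection_point(nl_cost, hj_cost, lo, hi, tol=0.01):
    """Bisect [lo, hi] for the cardinality where NL stops being cheaper."""
    if nl_cost(lo) > hj_cost(lo):
        return lo    # HJ already wins at the minimum
    if nl_cost(hi) <= hj_cost(hi):
        return hi    # NL still wins at the maximum
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if nl_cost(mid) <= hj_cost(mid):
            lo = mid     # NL still wins: crossover is to the right
        else:
            hi = mid     # HJ wins: crossover is to the left
    return lo

# Hypothetical cost curves, NOT the optimizer's formulas: NL grows by a
# fixed amount per driving row, HJ is close to flat.
def toy_nl_cost(card):
    return 178.0 + 1.5 * card

def toy_hj_cost(card):
    return 181.0 + 0.001 * card

ip = find_inflection_point(toy_nl_cost, toy_hj_cost, lo=0.1, hi=200_000.0)
print(round(ip, 2))
```

Because each step halves the interval, even a range as wide as 0.10 to roughly 200k needs only a couple of dozen cost evaluations, which keeps the extra parse-time work small.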

When does Adaptive kick in?
• At the first execution
  – Decision used by following executions too (shared cursor)
• What if things change?
  – Different bind -> ACS should take care of it
  – Misleading stats -> if no SPD, stuck with first plan
    • Until next hard parse, could be triggered by many factors
• Continuous adaptive in 12.2
  – Adapts at every execution
  – Disabled by default (for now)

Limitations of Adaptive Joins
• No Sort Merge Join, only NL and HJ
  – Means it only works for equality join conditions
• Doesn’t change join order, only join method
  – If a mistake leads to a poor join order, stuck with it
• CBO takes a step back in many cases
  – For example, if join method is hinted, even if suboptimal
• 10053 is the only way to get details
  – Not much of an issue though if the feature works

Adaptive Distribution Method (ADM)
• Determine distribution method at runtime
  – Avoid broadcast for large dataset
    • Hash distribution for both rowsets
  – Avoid hash for small dataset
    • Broadcast for driver, round-robin for probe rowset
• STATISTICS COLLECTOR step before PX SEND
• PX SEND becomes PX SEND HYBRID
• Decision made by QC based on slave provided info
  – Harder to spot, plan doesn’t change

Little demo – ADM
-- 2x rows, not really necessary
insert into tab2 select * from tab2;

-- start small, 2 rows
delete tab1 where rownum <= 99998;

-- removed filter predicates
-- used hints to force desired join order, method and DoP
select /*+ parallel(4) leading(a) use_hash(b) */ count(*)
  from tab1 a, tab2 b
 where a.j1 = b.j1
   and a.j2 = b.j2

Little demo – ADM

2 rows produced

8 rows distributed

2 rows * 4 DoP = 8
BROADCAST distribution

Little demo – ADM
-- 256 rows, deleting just to start “clean”
delete tab1;
insert into tab1 … connect by rownum <= 256;

Little demo – ADM

256 rows produced

256 rows distributed

#rows produced = distributed
HASH distribution

Plan looks the same as before!!

Turning point – ADM
-- 254 rows
delete tab1 where rownum <= 2;

(from 10053)
ADM inflection point = 256.000000

Little demo – ADM

254 rows produced

1016 rows distributed

254 * 4 = 1016
BROADCAST distribution
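The three demo runs fit a simple rule, sketched below as a toy model. The 256 threshold is the inflection point the demo's 10053 trace reported and the DoP of 4 comes from the hint; treating exactly 256 rows as the first HASH case is an assumption that happens to fit the three runs.

```python
def rows_distributed(rows_produced, dop, inflection_point=256):
    """Toy model of the hybrid distribution choice seen in the demo.

    Below the inflection point every producer row is broadcast to all
    `dop` consumers; at or above it each row is hashed to one consumer.
    Where exactly the boundary falls is an assumption that fits the
    three demo runs.
    """
    if rows_produced < inflection_point:
        return "BROADCAST", rows_produced * dop
    return "HASH", rows_produced

for produced in (2, 256, 254):
    print(produced, rows_distributed(produced, dop=4))
```

Comparing rows produced with rows distributed in SQL Monitor is therefore the practical way to tell which method was picked, since the plan text is identical in both cases.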

Why does everybody turn it off?
• Adaptive Features in 12.1 have a single parameter
  – To disable Adaptive Statistics need to disable all
• Adaptive Plans feature works well
  – Turning off is generally a bad idea
• Oracle provided patch to split parameter (22652097)
  – To keep Adaptive Plans in place
  – To disable Adaptive Statistics only
• Default behavior in 12.2, AP enabled and AS disabled

Summary
• Can save the day on runaway SQL
  – This is especially true for Adaptive Joins
  – Allows critical decisions to be made with better info
• No setup necessary
  – One of the few features that kind of “just works”
• Makes plan investigation a bit harder
  – But well handled by SQL Monitoring
  – And other tools like SQLd360

Contact Information
• http://mauro-pagano.com
  – Email
    • [email protected]
  – Free tools to download
    • SQLd360
    • TUNAs360
    • Pathfinder
