8/13/2019 ADVT SQL Plan Explained

ADVT SQL Plan Explained


The query ran for a long time because of high LIO (logical I/O).


1. Those small row estimates are usually the result of missing stats.

2. In this specific case, the query started before partition stats were ready.

3. When there are no partition stats, global stats are used.

A NESTED LOOPS join with a bad cardinality estimate on the first row source is a major reason for high LIO and CPU usage.



1. Here is the plan currently running, using a SQL profile to force a hash join.

2. Note the high cost of HASH GROUP BY at the bottom.



Query Structure

SELECT /*+ parallel(d,4) full(d) */
FROM
SOURCE_BY_SRCH_DLY_REV_MASK s,
( Complex View based on source_search_type_daily t1 ) t,
( Complex View based on DM_SUMMARY_DAILY d ) d
where
d.datestamp >= s.start_date
and d.datestamp



Inline View D

select mrkt_id, datestamp, SOURCE, query_source, search_type, domain, pageview_type, country_of_origin,
sum(pageviews) pageviews, sum(bidded_searches) bidded_searches, sum(bidded_results) bidded_results, sum(bidded_clicks) bidded_clicks, sum(revenue) revenue
from DM_SUMMARY_DAILY d
where
d.datestamp = to_date('20120715', 'yyyymmdd')
and d.source like 'geosign%derp'
and d.mrkt_id = 0
group by
mrkt_id, datestamp, SOURCE, query_source, search_type, domain, pageview_type, country_of_origin

1. Access a single partition of DM_SUMMARY_DAILY.

2. MRKT_ID=0

3. SOURCE uses LIKE expr.



How to Calculate Cardinality

1. Cardinality = (num_rows) * (selectivity of column 1) * (selectivity of column 2) * ... * (selectivity of column n)

2. Column Selectivity =

   1. Without histograms or with a bind value: 1/(number of distinct values (NDV))

   2. With frequency histograms: (number of buckets for the specified value) / (total number of buckets)

   3. With height-balanced histograms: if the value occupies more than 1 bucket, see 2. Otherwise, use the density from column stats, but we can always use 1/NDV as a reference.

   4. For an inequality predicate with a bind variable or function: 0.05
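The rules above can be checked with a few lines of Python. This is only a sketch of the arithmetic, not of Oracle's optimizer code. The function names are made up for illustration, and the input figures are the DM_SUMMARY_DAILY global and partition statistics quoted later in this deck.

```python
import math

def selectivity_ndv(ndv):
    """Equality predicate without a histogram (or with a bind value): 1/NDV."""
    return 1.0 / ndv

def selectivity_histogram(buckets_for_value, total_buckets):
    """Equality predicate with a histogram: share of buckets holding the value."""
    return buckets_for_value / total_buckets

def cardinality(num_rows, selectivities):
    """num_rows multiplied by the selectivity of every filtered column."""
    card = float(num_rows)
    for sel in selectivities:
        card *= sel
    return card

# Global stats (partition stats not ready): datestamp, mrkt_id, source by 1/NDV.
global_est = cardinality(
    23_451_579_811,
    [selectivity_ndv(2145), selectivity_ndv(25), selectivity_ndv(65524)],
)

# Partition stats with the mrkt_id histogram (3946 of 5551 buckets for value 0).
partition_est = cardinality(
    5_127_832,
    [selectivity_ndv(1), selectivity_histogram(3946, 5551), selectivity_ndv(601)],
)

print(math.ceil(global_est))   # the optimizer rounds the tiny estimate up
print(round(partition_est))
```

The gap between the two results (single digits versus thousands of rows) is the point of the slide: the same predicates give very different estimates depending on which statistics are available.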



Bad Plan

DM_SUMMARY_DAILY

Partition stats were not ready.

Global: rows: 23,451,579,811. Global NDV: datestamp: 2145, mrkt_id: 25, source: 65524.

Estimate: 23,451,579,811*(1/2145)*(1/25)*(1/65524) = 6.6742, rounded up to 7.

Actual partition stats: rows: 5,127,832, datestamp: 1, mrkt_id: 23, source: 601.

Estimate if the partition stats had been ready: 5,127,832*(1/1)*(1/23)*(1/601) = 371.

If using histograms for mrkt_id=0 (3946 out of 5551 buckets): 5,127,832*(3946/5551)*(1/601) = 6065.

SRC_BY_SRCH_DREV_MASK_ED

No stats. Defaults to (block_size - cache layer)*blocks/100. block_size is 16K, blocks is 5. 16*1024*5/100 = 819.2. Not sure about the value of cache layer.

SOURCE_SEARCH_TYPE_DAILY, per (datestamp, mrkt_id, source)

No partition or global stats were captured.

We blamed the lack of stats for this plan, so I will skip further research on it.



Good Plan With SQL profile

DM_SUMMARY_DAILY

Actual partition stats: rows: 5,193,086, datestamp: 1, mrkt_id: 21, source: 609 (huge difference from global stats).

Estimate if using partition stats: 5,193,086*(1/1)*(1/21)*(1/609) = 406.

If using histograms for mrkt_id=0 (3924 out of 5615 buckets) and for source like geosign%drep (6 out of 254 buckets): 5,193,086*(3924/5615)*(6/254) = 85,727.

SRC_BY_SRCH_DREV_MASK_ED

Still uses the default of 818 rows. The actual value is 61.

SOURCE_SEARCH_TYPE_DAILY

Partition stats: rows: 3,312,381, datestamp: 1, mrkt_id: 26 (histogram for value 0: 3015 out of 5590 buckets), source: 11890.

When using a hash join, with datestamp and mrkt_id=0: 3,312,381*(3015/5590) = 1,786,542 (1786K in the plan).

When using join predicate push down on column source, the estimate for each (datestamp, mrkt_id, source) is 3,312,381*(3015/5590)*(1/11890) = 150. Here column source is treated as a bind value.



MRKT_ID Histograms

Data is skewed on MRKT_ID=0



SOURCE Histograms

It is not easy to count the actual buckets.



How Does Oracle Evaluate Join Orders?

Estimated cardinalities for each row source:

SRC_BY_SRCH_DREV_MASK_ED: 818

View D on DM_SUMMARY_DAILY: 406 or 85,727, depending on whether histograms are used

View T on SOURCE_SEARCH_TYPE_DAILY: 1,786,542

Oracle normally starts from the row source with the smallest cardinality, then the next smallest, and eventually evaluates all the combinations (factorial of the total number of tables; here 3! = 6).

So in this case, if histograms are used, the first table will be SRC_BY_SRCH_DREV_MASK_ED; otherwise, it will be DM_SUMMARY_DAILY.

Since the view on SOURCE_SEARCH_TYPE_DAILY is the last to be evaluated, its cardinality estimate is usually not very important. The costs of its different access methods, however, are very important and very sensitive to the row count produced by the join of the other two tables.
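As a rough illustration of the ordering described above (not Oracle's actual search algorithm), the sketch below enumerates all 3! permutations and sorts them so that orders starting with the smallest estimated row source come first. The function name is hypothetical; the cardinalities are the estimates from this slide.

```python
from itertools import permutations

def candidate_join_orders(cards):
    """Return every join order, smallest leading cardinalities first."""
    return sorted(
        permutations(cards),
        key=lambda order: [cards[name] for name in order],
    )

# Estimates without histograms on DM_SUMMARY_DAILY (406 rows).
without_hist = {
    "SRC_BY_SRCH_DREV_MASK_ED": 818,
    "DM_SUMMARY_DAILY": 406,
    "SOURCE_SEARCH_TYPE_DAILY": 1_786_542,
}
# Same row sources, with the histogram-based estimate (85,727 rows).
with_hist = dict(without_hist, DM_SUMMARY_DAILY=85_727)

orders_without = candidate_join_orders(without_hist)
orders_with = candidate_join_orders(with_hist)

print(len(orders_without))   # number of permutations considered
print(orders_without[0])     # first order tried without histograms
print(orders_with[0])        # first order tried with histograms
```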



Join Cardinality Between S and D

Join Cardinality = (num_rows_S - num_nulls_S) * (num_rows_D - num_nulls_D) / max(ndv(mrkt_id_S), ndv(mrkt_id_D))

NDV 21 is found in the 10053 trace for the small table. It is interesting how Oracle derives this default value, because it is the actual NDV of the other table at partition level.

If no histograms are used: (818-0)*(406-0)/max(21,1) = 15,814

If histograms are used: (818-0)*(85727-0)/max(21,1) = 3,339,270
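The join cardinality formula on this slide can be written as a small Python function. This is a sketch of the arithmetic only. The parameter names are invented for illustration, and assigning NDV 21 to S and NDV 1 to D is an assumption read off the slide's max(21,1).

```python
def join_cardinality(rows_s, nulls_s, ndv_s, rows_d, nulls_d, ndv_d):
    """Non-null rows from both sides divided by the larger join-column NDV."""
    return (rows_s - nulls_s) * (rows_d - nulls_d) / max(ndv_s, ndv_d)

# S defaults to 818 rows; D is 406 rows without histograms, 85,727 with them.
no_hist = join_cardinality(818, 0, 21, 406, 0, 1)
with_hist = join_cardinality(818, 0, 21, 85_727, 0, 1)

# The slide truncates the fractional part.
print(int(no_hist), int(with_hist))
```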



Join Cardinality Between S and D

After filtering by d.datestamp >= s.start_date and the second d.datestamp predicate (truncated in the source), each inequality contributes a selectivity of 0.05, so the join cardinality is scaled by 0.05*0.05 = 0.0025.

Without histograms: 40

With histograms: 8348.175 -> 8349 (plan uses 8347)

Fortunately, the result is inflated by the default estimate for SRC_BY_SRCH_DREV_MASK_ED, by 818/61 = 13.4 times.

Side note: when dynamic sampling was used in an attempt to resolve this issue, it gave the actual count of SRC_BY_SRCH_DREV_MASK_ED, that is, 61. So even with histograms, the join cardinality estimate is only 622, not enough for Oracle to pick the right plan.
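The figures on this slide can be reproduced under one assumption: each of the two datestamp range predicates gets the fixed 0.05 selectivity from the cardinality rules. That assumption is inferred from the numbers (3,339,270 * 0.0025 = 8348.175), not stated on the slide. A minimal sketch, with hypothetical names:

```python
import math

INEQUALITY_SEL = 0.05  # assumed selectivity per range predicate

def after_range_filters(join_card, n_predicates=2):
    """Scale a join cardinality by the fixed selectivity of each range predicate."""
    return join_card * INEQUALITY_SEL ** n_predicates

no_hist = after_range_filters(15_814)        # join cardinality without histograms
with_hist = after_range_filters(3_339_270)   # join cardinality with histograms

# Dynamic sampling case: S has its actual 61 rows instead of the default 818.
sampled = after_range_filters(61 * 85_727 / 21)

print(math.ceil(no_hist), math.ceil(with_hist), int(sampled))
```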



FTS Cost

FTS CPU cost formula: cost = #SRds + #MRds*mreadtim/sreadtim + #CPUCycles/(cpuspeed*sreadtim)

When using noworkload statistics, as in this case:

MBRC = db_file_multiblock_read_count

sreadtim = ioseektim + db_block_size/iotfrspeed

mreadtim = ioseektim + db_file_multiblock_read_count*db_block_size/iotfrspeed

#SRds: number of single block reads

#MRds: number of multiblock reads with size of db_file_multiblock_read_count.



Cost Estimate For View T

FTS on SOURCE_SEARCH_TYPE_DAILY: 26,243 blocks

Parameters (from 10053; all default values except CPUSPEEDNW):

db_file_multiblock_read_count: 16

CPUSPEEDNW: 1583 million instructions/sec (default is 100)

IOTFRSPEED: 4096 bytes per millisecond (default is 4096)

IOSEEKTIM: 10 milliseconds (default is 10)

sreadtim = (10 + 16*1024/4096) = 14

mreadtim = (10 + 16*16*1024/4096) = 74

FTS Cost = (0 + (26,243/16)*74/14) + cpu_cost = 8669.5625 + cpu_cost

The plan used cost 8736. The difference comes from the cpu_cost to read rows and filter the result.

Because view T is an aggregated complex view, there is a huge cost associated with it for sorting and grouping, bringing the total cost to 31,796.
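The I/O part of the full table scan cost can be reproduced with the sketch below. The function names are hypothetical and the CPU term is left out. Note that the number of multiblock reads is blocks / MBRC (26,243/16), which is what yields the slide's 8669.5625.

```python
def noworkload_read_times(ioseektim_ms, iotfrspeed, block_size, mbrc):
    """Derive sreadtim and mreadtim (ms) from noworkload system statistics."""
    sreadtim = ioseektim_ms + block_size / iotfrspeed
    mreadtim = ioseektim_ms + mbrc * block_size / iotfrspeed
    return sreadtim, mreadtim

def fts_io_cost(blocks, mbrc, sreadtim, mreadtim, single_block_reads=0):
    """I/O cost of a full table scan, in units of single block reads."""
    multiblock_reads = blocks / mbrc
    return single_block_reads + multiblock_reads * mreadtim / sreadtim

# 16K blocks, MBRC 16, IOSEEKTIM 10 ms, IOTFRSPEED 4096 bytes/ms.
sreadtim, mreadtim = noworkload_read_times(10, 4096, 16 * 1024, 16)
io_cost = fts_io_cost(26_243, 16, sreadtim, mreadtim)

print(sreadtim, mreadtim, io_cost)
```

The remaining gap to the plan's cost of 8736 is the CPU component, which this sketch does not model.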



Index Scan Cost

Cost = blevel + ceiling(leaf_blocks * effective index selectivity) + ceiling(clustering_factor * effective table selectivity)

Effective index selectivity is calculated as the product of the selectivities of all leading index columns specified in the predicates. If an index has more columns than the predicates, stop when encountering the first column without



Cost Estimate For T

Cost estimate via JPPD (join predicate push down), per (datestamp, mrkt_id, source), via index range scan

Index IDX2_SOURCE_SEARCH_TYPE_DAILY

Blevel: 2

Leaf_blocks: 11818

Clustering_factor: 530,949

Effective selectivity: sel(datestamp)*sel(mrkt_id)*sel(source) = 1*(3015/5590)*(1/11890) = 0.0000453621

Cost = 2 + ceil(0.536) + ceil(24.08) = 28

Cardinality estimate: 3,312,381*0.0000453621 = 150

Because of the low cardinality, GROUP BY will be done in memory and its cost can be ignored.



Cost Estimate for T

If partition stats are not ready, global stats are used (for index IDX2_SOURCE_SEARCH_TYPE_DAILY).

Num_rows: 2,967,427,119, blevel: 3, leaf blocks: 10,109,550, clustering_factor: 213,185,700.

NDV: datestamp: 3350, mrkt_id: 32, source: 50900. Histogram for mrkt_id, value 0: 4393 out of 9212 buckets.

Effective index selectivity: (1/3350)*(4393/9212)*(1/50900) = 2.796692e-9

Cardinality: 2,967,427,119*2.796692e-9 = 8.3 -> 9

Cost: 3 + ceil(10,109,550*2.796692e-9) + ceil(213,185,700*2.796692e-9) = 3+1+1 = 5
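To compare the two cases side by side, the index range scan cost formula can be applied to both the partition-level and the global statistics of IDX2_SOURCE_SEARCH_TYPE_DAILY. This is a sketch with hypothetical names, and it assumes the effective table selectivity equals the effective index selectivity, as the slide arithmetic does.

```python
import math

def index_range_scan_cost(blevel, leaf_blocks, clustering_factor, ix_sel, tab_sel=None):
    """blevel + ceil(leaf_blocks * ix_sel) + ceil(clustering_factor * tab_sel)."""
    if tab_sel is None:
        tab_sel = ix_sel  # assumption: same selectivity for index and table access
    return (
        blevel
        + math.ceil(leaf_blocks * ix_sel)
        + math.ceil(clustering_factor * tab_sel)
    )

# Partition stats: datestamp NDV 1, mrkt_id histogram 3015/5590, source NDV 11890.
part_sel = 1 * (3015 / 5590) * (1 / 11890)
part_cost = index_range_scan_cost(2, 11_818, 530_949, part_sel)
part_card = 3_312_381 * part_sel

# Global stats: datestamp NDV 3350, mrkt_id histogram 4393/9212, source NDV 50900.
glob_sel = (1 / 3350) * (4393 / 9212) * (1 / 50900)
glob_cost = index_range_scan_cost(3, 10_109_550, 213_185_700, glob_sel)
glob_card = 2_967_427_119 * glob_sel

print(part_cost, round(part_card))
print(glob_cost, math.ceil(glob_card))
```

With global stats the per-probe cost and cardinality both come out much lower than with partition stats, which makes the index path look cheaper than it really is.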

