Ugif 10 2012 ppt0000002

User Group Informix France. Update Statistics. Olivier Bourdin, [email protected]. Wednesday, 3 October 2012.

Transcript of Ugif 10 2012 ppt0000002

Page 1: Ugif 10 2012 ppt0000002

User Group Informix France

Update Statistics

Olivier Bourdin

[email protected]

Wednesday, 3 October 2012

Page 2: Ugif 10 2012 ppt0000002

Overview

• Brief review and history
• What's changed?
  – 11.10, 11.50
  – 11.70: "Smart Statistics"
• 11.70 FAQs
  – Do I need to do anything different?
  – Did the update statistics update any stats?
  – Update statistics and reoptimization

Page 3: Ugif 10 2012 ppt0000002

Why are statistics important?

• Choosing the right QUERY PATH determines how fast you get your results.
• Choosing the wrong path can be like going around the world to get to your neighbor's.
  – Expensive to go around the world.
  – Takes too long.

Page 4: Ugif 10 2012 ppt0000002

Query Optimization Process

• Examine all tables (table A, table B, table C)
  – Examine selectivity of every filter (WHERE clauses)
  – Determine if indexes can be used for filters, ORDER BY, GROUP BY
  – Find the best way to scan a table: sequentially or by an index
• Identify join pairs (AB, AC, BA, BC, CA, CB)
  – Find best join method (nested loop, hash, or sort merge)
  – Decide which indexes are best for the join
  – Calculate the cost of the join
• Repeat for each additional table (ABC, ACB, BAC, ...)
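The join pairs and three-table orders listed on this slide are simply the ordered arrangements of the table names. A minimal Python sketch, for illustration only (the helper name `join_orders` is hypothetical and this is not how the Informix optimizer is implemented), shows how quickly the search space grows:

```python
from itertools import permutations

def join_orders(tables, size):
    """Return every ordered arrangement of `size` tables, e.g. 'AB' or 'ACB'."""
    return ["".join(p) for p in permutations(tables, size)]

pairs = join_orders("ABC", 2)
triples = join_orders("ABC", 3)
print(pairs)    # ['AB', 'AC', 'BA', 'BC', 'CA', 'CB']
print(triples)  # ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
```

Each arrangement has to be costed, which is why the optimizer needs cheap and reasonably accurate statistics rather than scanning the data itself.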

Page 5: Ugif 10 2012 ppt0000002

Estimating costs: need data!

• Find the cheapest (lowest-cost) path.
  – Cost = I/O cost + Weight * (CPU cost)
  – I/O: disk access
  – CPU: rows processed
• Estimate costs
  – Filters: which indexes to use?
  – Joins: nested loop, hash, or sort merge?
  – Eliminate redundant pairs?
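As a rough illustration of the cost formula on this slide, the sketch below compares two candidate paths and picks the cheaper one. The candidate figures and the `WEIGHT` value are made up for the example; the slide does not state the weight the server uses.

```python
def path_cost(io_cost, cpu_cost, weight):
    """Cost = I/O cost + Weight * (CPU cost), as written on the slide."""
    return io_cost + weight * cpu_cost

# Hypothetical candidates: (name, I/O cost in pages, CPU cost in rows processed).
candidates = [
    ("sequential scan", 2048.0, 28672.0),
    ("index path", 40.0, 1024.0),
]
WEIGHT = 0.03  # assumed value, for illustration only

costs = {name: path_cost(io, cpu, WEIGHT) for name, io, cpu in candidates}
best = min(costs, key=costs.get)
print(best, round(costs[best], 2))
```

Both inputs to the formula come from statistics (pages and row counts), so stale statistics shift the comparison even when the formula itself is unchanged.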

Page 6: Ugif 10 2012 ppt0000002

Filter selectivity

• Selectivity is the fraction of rows selected as a result of a filter (a number between 0 and 1)

Expression                     Filter selectivity (F)
indexed_col = literal value    F = 1 / (number of distinct keys in index)
indexed_col > literal value    F = (2nd max - literal value) / (2nd max - 2nd min)
NOT expression                 F = 1 - F(expression)
expr1 AND expr2                F = F(expr1) x F(expr2)
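The equality, NOT and AND rules in the table compose mechanically. A small sketch, assuming hypothetical key counts (50 distinct states, 1000 distinct zip codes) that do not come from the slides:

```python
def sel_equals(distinct_keys):
    """indexed_col = literal: F = 1 / (number of distinct keys in index)."""
    return 1.0 / distinct_keys

def sel_not(f):
    """NOT expression: F = 1 - F(expression)."""
    return 1.0 - f

def sel_and(f1, f2):
    """expr1 AND expr2: F = F(expr1) x F(expr2)."""
    return f1 * f2

f_state = sel_equals(50)      # hypothetical: 50 distinct states
f_zip = sel_equals(1000)      # hypothetical: 1000 distinct zip codes
f_both = sel_and(f_state, f_zip)
print(f_state, f_zip, f_both, sel_not(f_state))
```

Multiplying the two factors treats the filters as independent, which is why correlated columns (state and zip code are a typical case) can make the estimate too low.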

Page 7: Ugif 10 2012 ppt0000002

How do we influence query optimization?

• OPTCOMPIND
• Optimizer directives, optimization goals
• Update Statistics
  – Collect information for the optimizer
  – Table nrows, npused; index statistics: LOW
  – Data distributions: MEDIUM and HIGH
  – Compile stored procedures

Page 8: Ugif 10 2012 ppt0000002

Where are the stats stored?

• systables (Low)
  – nrows, npused
• sysindices (Low)
  – leaves, levels, nunique, clust
• syscolumns (Low)
  – colmin, colmax
• sysfragments (Low)
  – nrows, npused
  – For index partitions: levels, clust
• sysdistrib (Medium or High)
  – Can view with dbschema -hd

Page 9: Ugif 10 2012 ppt0000002

View Query Path

• SET EXPLAIN ON
  – Can be set in session
• EXPLAIN directive
  – Can be embedded in the query
• xtrace debug
  – Support may ask you to turn this on

FOREACH SELECT {+EXPLAIN} order_num INTO p_num
    FROM orders WHERE customer_num = 104 ORDER BY order_num

Page 10: Ugif 10 2012 ppt0000002

Debugging with xtrace

• To "see" the statistics information being used for query optimization

Example:
xtrace heavy -c XTF_OPTMZR -f XTF_DEBUG
xtrace size 10000
xtrace on

Use "xtrace fview" or "xtrace view" to view traces.

"xtrace fview" includes timestamps.

Use "xtrace info" to display current xtrace settings.

Use "xtrace --" for xtrace usage info.

Page 11: Ugif 10 2012 ppt0000002

Xtrace: example

Before:

f1 31310 16 get_distrib(): distrib not found for table c col zipcode
f1 7401 16 selec1: op = 46(OP_EQ), defsel = 0.1 sel = 0.0434783
……
f2 1207 16 oprowspages(tab = c, nrows = 28, npages = 2)
f2 13217 16 opmix_iscancost(numrows=1.21739,npages=2,pagesread=1.13988)
f2 13225 16 opmix_iscancost(scancost=1.1764,indexcost=1.08, …, iscancost=2.2564)

After Update Statistics:

f1 31310 18 get_distrib(): distrib found for table c col zipcode
f1 7401 18 selec1: op = 46(OP_EQ), defsel = 0.1 sel = 0.0357143
……
f2 1207 18 oprowspages(tab = c, nrows = 28672, npages = 2048)
…
f2 2237 18 dpages = 24576 lpages = 84 nlevels = 2
f2 1871 18 dcost = 33.72 seek 0 keyonly = TRUE
f2 1896 18 iscancost(c, zip_ix) cost = 35.72
f2 13217 18 opmix_iscancost(numrows=1024,npages=2048,pagesread=805.977)
f2 13225 18 opmix_iscancost(scancost=836.697,indexcost=35.72, …, iscancost=872.417)
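The numbers in these two traces can be cross-checked with simple arithmetic. The sketch below uses only values copied from the trace lines; the relationships it tests (numrows = nrows x sel, and iscancost = scancost + indexcost) are observations about these two traces, not a documented formula, since the trace output elides some terms with "…".

```python
# Values copied from the "Before" and "After Update Statistics" trace lines.
before = {"nrows": 28, "sel": 0.0434783, "numrows": 1.21739,
          "scancost": 1.1764, "indexcost": 1.08, "iscancost": 2.2564}
after = {"nrows": 28672, "sel": 0.0357143, "numrows": 1024,
         "scancost": 836.697, "indexcost": 35.72, "iscancost": 872.417}

def derived(trace):
    """Recompute the estimated row count and total index-scan cost from a trace."""
    est_rows = trace["nrows"] * trace["sel"]
    total_cost = trace["scancost"] + trace["indexcost"]
    implied_distinct = round(1 / trace["sel"])  # from F = 1 / distinct keys
    return est_rows, total_cost, implied_distinct

for label, trace in (("before", before), ("after", after)):
    est_rows, total_cost, implied_distinct = derived(trace)
    print(label, round(est_rows, 5), round(total_cost, 4), implied_distinct)
```

Read this way, the selectivity moved from 1/23 to 1/28 once a distribution was available, while the row count the optimizer works with grew from 28 to 28672, which is what drives the much larger scan cost in the second trace.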

Page 12: Ugif 10 2012 ppt0000002

Xtrace (after ... cont'd)

…
f2 1207 18 oprowspages(tab = c, nrows = 28672, npages = 2048)
f2 1320 18 opscantabcost(c) npages = 2048, nrows = 28672, cost = 2909.16
f2 1527 18 opcartcost(c) cost = 2909.16 initcost = 0
f2 1988 18 index_info(): index 100_1 fullness 0.75 recs_per_node 128 keylen 4
…
f2 2237 18 dpages = 2048 lpages = 187 nlevels = 3
f2 10863 18 idxtree_travcost s 3.48772e-05 nlevels 3 lpages .. dpages .. mempages 512
f2 14448 18 seek_factor 6 clust 2048 clust_scale 0 seek 0
…
f2 1727 18 opidxcost(c, 100_1) = 0.745763
f1 16094 18 index 100_1 considered, icost 0.745763, istart 0.0078125, fltragg 0
f1 16324 18 indexp(): best index path: idx 100_1 icost = 0.745763 idx_flags2
f3 3462 18 idx cost = 0.745763 initcost = 0.0078125 totalcost = 17.1526
f3 3465 18 outer size = 23 join size = 1
f3 8468 18 build inner table, init cost is 13.5745, join cost is 4.24268
f3 8568 18 build outer table, init cost is 4.24268, join cost is 13.5745

Page 13: Ugif 10 2012 ppt0000002

sqexplain.out (before)

select c.city, c.state, o.ship_date from customer c, orders o
where c.customer_num = o.customer_num and c.state = ? and c.zipcode = ?

Estimated Cost: 3
Estimated # of Rows Returned: 1

1) informix.c: INDEX PATH
    Filters: informix.c.state = 'AZ'
    (1) Index Name: informix.zip_ix
        Index Keys: zipcode (Serial, fragments: ALL)
        Lower Index Filter: informix.c.zipcode = '85016'

2) informix.o: INDEX PATH
    (1) Index Name: informix. 102_4
        Index Keys: customer_num (Serial, fragments: ALL)
        Lower Index Filter: informix.c.customer_num = informix.o.customer_num
NESTED LOOP JOIN

Page 14: Ugif 10 2012 ppt0000002

sqexplain.out (after)

select c.city, c.state, o.ship_date from customer c, orders o
where c.customer_num = o.customer_num and c.state = ? and c.zipcode = ?

Estimated Cost: 19
Estimated # of Rows Returned: 1

1) informix.o: SEQUENTIAL SCAN

2) informix.c: INDEX PATH
    Filters: (informix.c.zipcode = '85016' AND informix.c.state = 'AZ' )
    (1) Index Name: informix. 100_1
        Index Keys: customer_num (Serial, fragments: ALL)
        Lower Index Filter: informix.c.customer_num = informix.o.customer_num
NESTED LOOP JOIN

Customer has 28672 rows.

Orders has 23 rows.

Page 15: Ugif 10 2012 ppt0000002

Before 11.x

• Before 11.x
  – Update statistics low
  – Update statistics medium, high
    • Resolution, confidence
  – Update statistics distributions only
  – Update statistics drop distributions
  – Update statistics for table, for procedure
  – Lots of guidelines
    • What to run update statistics on
    • Which update statistics to run
    • How to run update statistics
• Scripts
• Cron jobs

Page 16: Ugif 10 2012 ppt0000002

Guidelines

• Update statistics medium distributions only for all columns that do not have an index
• Update statistics high for columns that are the first key in an index
• Update statistics low for all columns in multicolumn indexes
• Run with PDQ for better performance (for table ONLY)
• Do not run with PDQ for update statistics for procedure
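The first three guidelines can be turned into a command list mechanically. The following sketch is a simplified illustration, not the Technote script or dostats: the function name, the table description format, and the example table layout are all hypothetical, and it ignores options such as RESOLUTION.

```python
def guideline_commands(table, columns, indexes):
    """Build UPDATE STATISTICS statements from the three column guidelines.

    columns: list of column names in the table.
    indexes: list of indexes, each a list of column names in key order.
    """
    indexed = {col for idx in indexes for col in idx}
    non_indexed = [col for col in columns if col not in indexed]

    lead_keys = []
    multi_cols = []
    for idx in indexes:
        if idx[0] not in lead_keys:
            lead_keys.append(idx[0])
        if len(idx) > 1:
            for col in idx:
                if col not in multi_cols:
                    multi_cols.append(col)

    cmds = []
    if non_indexed:
        cmds.append(f"UPDATE STATISTICS MEDIUM FOR TABLE {table} "
                    f"({', '.join(non_indexed)}) DISTRIBUTIONS ONLY;")
    for col in lead_keys:
        cmds.append(f"UPDATE STATISTICS HIGH FOR TABLE {table} ({col});")
    if multi_cols:
        cmds.append(f"UPDATE STATISTICS LOW FOR TABLE {table} "
                    f"({', '.join(multi_cols)});")
    return cmds

# Hypothetical table layout, for illustration only.
cmds = guideline_commands(
    "orders",
    ["order_num", "customer_num", "ship_date", "po_num"],
    [["order_num"], ["customer_num", "ship_date"]],
)
for cmd in cmds:
    print(cmd)
```

Generating the statements from the catalog rather than writing them by hand is the idea behind the scripts and cron jobs mentioned earlier; the PDQ guidelines would be applied around these statements, not inside them.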

Page 17: Ugif 10 2012 ppt0000002

Issues (before 11.x)

• Difficult to know when update statistics was last run
• Guidelines weren't always well understood
• People weren't sure how to run update statistics
  – Accidentally overwrote statistics by running HIGH first, then MEDIUM
  – Accidentally compiled stored procedures with PDQ
  – Ran Update Statistics LOW twice (performance issue)

Update statistics LOW for table tab1;

Update statistics HIGH for table tab1 (col1, col2);

What might be considered "missing" here?

Page 18: Ugif 10 2012 ppt0000002

11.10 Features

• 11.10 enhancements
  – Create index creates initial stats and distribution information for the leading column of the index
  – Enhanced catalog information
    • What time was update statistics Low run?
    • What time were the distributions created?
    • How many rows were sampled for the distributions?
  – New "Sampling Size" option
  – Update statistics drop distributions ONLY
  – Auto Update Statistics Scheduler tasks

Page 19: Ugif 10 2012 ppt0000002

Help with Guidelines

• Use scheduler task "Auto Update Statistics Evaluation"
  – Scheduler task can be run "on-demand" using exectask()
  – Execute function exectask('Auto Update Statistics Evaluation')
• Use script in Informix Technote (swg21137764)
  – UPDATE STATISTICS commands to allow the optimizer to work its best
  – http://www-01.ibm.com/support/docview.wss?uid=swg21137764
• Use Art Kagel's dostats (from IIUG)

Page 20: Ugif 10 2012 ppt0000002

AUS History

• First introduced in 11.10
  – Scheduler task "Auto Update Statistics Evaluation"
  – Scheduler task "Auto Update Statistics Refresh"
  – Uses the guidelines to determine the update statistics commands to run
• Enhancement to work with non-English locales in 11.50.xC6

Page 21: Ugif 10 2012 ppt0000002

AUS Scheduler Tasks

• Runs Update Statistics FOR TABLE commands

UPDATE STATISTICS LOW FOR TABLE stores7:customer
UPDATE STATISTICS HIGH FOR TABLE stores7:customer ( customer_num, zipcode ) RESOLUTION 0.500 DISTRIBUTIONS ONLY

• Runs with PDQ set to AUS_PDQ in sysadmin:ph_threshold

> select * from ph_threshold where name = "AUS_PDQ";

id           30
name         AUS_PDQ
task_name    Auto Update Statistics Refresh
value        10
value_type   NUMERIC
description  Update statistics executes with this PDQ priority.

Page 22: Ugif 10 2012 ppt0000002

AUS Parameters

Parameter          Task                Description
AUS_AGE            aus_evaluator       The statistics are rebuilt after the specified number of days.
AUS_CHANGE         aus_evaluator       The statistics are rebuilt after the specified percentage of data has changed.
AUS_AUTO_RULES     aus_evaluator       1 or 0. If "off", only evaluates tables that already have statistics.
AUS_SMALL_TABLES   aus_evaluator       Tables containing fewer than this number of rows will always have their statistics rebuilt.
AUS_PDQ            aus_refresh_stats   Run Update Statistics with this PDQ setting.

Page 23: Ugif 10 2012 ppt0000002

11.70 Features

• Smart Statistics
  – Default: AUTO_STAT_MODE 1
  – Default: STATCHANGE 10
  – Update Statistics command, when run, is not executed for index statistics and for table distributions if the STATCHANGE threshold has not been met
• Fragment-level Statistics
  – Not on by default
  – Not discussed in this presentation

Page 24: Ugif 10 2012 ppt0000002

11.70 Statistics Updated?

• Update Statistics info in database catalog tables
  – Look at ustlowts in systables
    • Updated when systables' nrows and npused are updated. This is done whenever the update statistics command is run; the STATCHANGE threshold is not looked at.
  – Look at ustlowts in sysindices
    • Updated when index statistics are rebuilt/updated
  – Look at constr_time in sysdistrib
    • Updated when distribution statistics are rebuilt/updated

Page 25: Ugif 10 2012 ppt0000002

Example

$ dbaccessdemo7 stores7 -nots

Index on customer(zipcode):

select idxname, levels, leaves, nrows, nupdates, ndeletes, ninserts, ustlowts
from sysindices
where tabid = 100 and idxname = "zip_ix";

idxname   zip_ix
levels    1
leaves    1.000000000000
nrows     28.00000000000
nupdates  0.00
ndeletes  0.00
ninserts  28.00000000000
ustlowts  2012-04-03 22:54:56.00000

nupdates, ndeletes and ninserts are the UDI counters for this index at the time of the update statistics low run.

> select * from sysdistrib where tabid = 100;

No rows found.

dbaccessdemo7 did not create table distributions for the customer table.

Page 26: Ugif 10 2012 ppt0000002

Example (cont'd)

> load from customer.unl insert into customer;

199863 row(s) loaded.

> select idxname, levels, leaves, nrows, nupdates, ndeletes, ninserts,
> ustlowts from sysindices
> where tabid = 100 and idxname = "zip_ix";

idxname   zip_ix
levels    1
leaves    1.000000000000
nrows     28.00000000000
nupdates  0.00
ndeletes  0.00
ninserts  28.00000000000
ustlowts  2012-04-03 22:54:56.00000

Index statistics for zip_ix are unchanged after 199,863 rows were inserted into the customer table.

Note: no update statistics command has been run.

Page 27: Ugif 10 2012 ppt0000002

Example (cont'd)

idxname   zip_ix
levels    1
leaves    1.000000000000
nrows     28.00000000000
nupdates  0.00
ndeletes  0.00
ninserts  28.00000000000
ustlowts  2012-04-03 22:54:56.00000

> create index state_ix on customer(state);

idxname   state_ix
levels    3
leaves    556.0000000000
nrows
nupdates  0.00
ndeletes  0.00
ninserts  0.00
ustlowts  2012-04-03 23:04:33.00000

After inserting 199,863 rows into the customer table, create index state_ix on customer(state). Note: no update statistics command has been run.

Page 28: Ugif 10 2012 ppt0000002

Example (cont'd)

Distribution information for column state in the customer table:

> select tabid, colno, mode, smplsize, rowssmpld, constr_time,
> ustnrows, ustbuildduration, nupdates, ndeletes, ninserts
> from sysdistrib where tabid = 100;

tabid             100
colno             8          (column state)
mode              H
smplsize          199891.0000000
rowssmpld         199891.0000000
constr_time       2012-04-03 23:04:33.00000
ustnrows          199891.0000000
ustbuildduration  0:00:00.00000
nupdates          0.00
ndeletes          0.00
ninserts          199891.0000000

Page 29: Ugif 10 2012 ppt0000002

Example (cont'd)

> select partnum, nupdates, ndeletes, ninserts from sysmaster:sysptnhdr
> where partnum in (select partn from sysfragments
> where fragtype = "I" and indexname in ('state_ix', 'zip_ix'));

            partnum   nupdates  ndeletes  ninserts
zip_ix      1049092   0         0         199891
state_ix    1049100   0         0         0

> select partnum, nupdates, ndeletes, ninserts from sysmaster:sysptnhdr
> where partnum = (select partnum from systables where tabid = 100);

            partnum   nupdates  ndeletes  ninserts
customer    1049069   0         0         199891

This is the actual partition page info, showing the UDI counters for the partition since the partition was created. It is not the same as the UDI info in the catalogs, which is updated when statistics are updated.

Page 30: Ugif 10 2012 ppt0000002

OAT view of Statistics

Page 31: Ugif 10 2012 ppt0000002

OAT view (cont'd)

For the customer table:
• Index zip_ix has exceeded STATCHANGE.
• Index state_ix has not.

Page 32: Ugif 10 2012 ppt0000002

Example (cont'd)

zip_ix index, BEFORE:

idxname   zip_ix
levels    1
leaves    1.000000000000
nrows     28.00000000000
nupdates  0.00
ndeletes  0.00
ninserts  28.00000000000
ustlowts  2012-04-03 22:54:56.00000

> update statistics low for table customer;

zip_ix index, AFTER:

idxname   zip_ix
levels    3
leaves    505.0000000000
nrows     199891.0000000
nupdates  0.00
ndeletes  0.00
ninserts  199891.0000000
ustlowts  2012-04-04 00:36:53.00000

• Index statistics updated.
• Catalog UDI values updated.
• sysindices ustlowts updated.

Page 33: Ugif 10 2012 ppt0000002

Example (cont'd)

> update statistics low for table customer;

state_ix index (BEFORE / AFTER):

idxname   state_ix
levels    3
leaves    556.0000000000
nrows     199891.0000000
nupdates  0.00
ndeletes  0.00
ninserts  0.00
ustlowts  2012-04-03 23:04:33.00000

idxname   state_ix
levels    3
leaves    556.0000000000
nrows
nupdates  0.00
ndeletes  0.00
ninserts  0.00
ustlowts  2012-04-03 23:04:33.00000

• Index statistics unchanged.
• Catalog UDI values unchanged.
• sysindices ustlowts unchanged.

Page 34: Ugif 10 2012 ppt0000002

Example (cont'd)

> select tabname, tabid, nrows, created, ustlowts
> from systables where tabid = 100;

tabname   customer
tabid     100
nrows     199891.0000000
created   04/03/2012
ustlowts  2012-04-04 00:36:53.00000

The systables information is always updated when update statistics for table is run, regardless of STATCHANGE.

Page 35: Ugif 10 2012 ppt0000002

Example

Update Statistics LOW for table tab1;

Update Statistics HIGH for table tab1 (col1, col2);

• Before 11.70
  – You should put "Distributions Only" in the Update Statistics HIGH command to avoid collecting index statistics again
• After 11.70
  – Doesn't matter, since index statistics will only be updated if STATCHANGE has been met for the index

Page 36: Ugif 10 2012 ppt0000002

Sysmaster query for %change

SELECT colname as name, 'Column' as type,
       constr_time::datetime year to second as build_date,
       rowssmpld::bigint as sample, d.ustnrows::bigint as nrows,
       case when d.mode = 'M' then 'Medium'
            when d.mode = 'H' then 'High' end as mode,
       resolution, confidence, ustbuildduration as build_duration,
       (table_counter.udi_counter - d.ninserts - d.nupdates - d.ndeletes) as udi_counter,
       CASE WHEN d.ustnrows=0 and
                 (table_counter.udi_counter - d.ninserts - d.nupdates - d.ndeletes) = 0
            THEN 0.00
            WHEN d.ustnrows=0 and
                 (table_counter.udi_counter - d.ninserts - d.nupdates - d.ndeletes) != 0
            THEN -1
            ELSE ROUND((table_counter.udi_counter - d.ninserts - d.nupdates - d.ndeletes)
                       / d.ustnrows * 100, 2)
       END as change
FROM sysdistrib d, syscolumns c,
     ( select SUM(nupdates + ndeletes + ninserts) as udi_counter
       from sysmaster:sysptnhdr
       where partnum in (select partn from sysfragments
                         where tabid = 100 and fragtype = 'T'
                         union
                         select partnum as partn from systables
                         where tabid = 100) ) as table_counter
WHERE d.tabid = 100 and c.tabid = 100 and d.colno = c.colno and d.seqno = 1

UNION

Page 37: Ugif 10 2012 ppt0000002

Sysmaster query for %change

-- Continuing query started on previous slide
SELECT idxname as name, MIN('Index') as type,
       MIN(ustlowts)::datetime year to second as build_date,
       MIN(0) as sample, SUM(f.nrows)::bigint as nrows, MIN('Low') as mode,
       MIN(0) as resolution, MIN(0) as confidence,
       SUM(i.ustbuildduration) as build_duration,
       SUM(NVL(p.ninserts,0) + NVL(p.nupdates,0) + NVL(p.ndeletes,0))
         - SUM(NVL(f.ninserts,0) + NVL(f.nupdates,0) + NVL(f.ndeletes,0)) as udi_counter,
       CASE WHEN SUM(f.nrows)=0 and
                 (SUM(NVL(p.ninserts,0) + NVL(p.nupdates,0) + NVL(p.ndeletes,0))
                  - SUM(NVL(f.ninserts,0) + NVL(f.nupdates,0) + NVL(f.ndeletes,0))) = 0
            THEN 0.00
            WHEN SUM(f.nrows)=0 and
                 (SUM(NVL(p.ninserts,0) + NVL(p.nupdates,0) + NVL(p.ndeletes,0))
                  - SUM(NVL(f.ninserts,0) + NVL(f.nupdates,0) + NVL(f.ndeletes,0))) != 0
            THEN -1
            ELSE ROUND((SUM(NVL(p.ninserts,0) + NVL(p.nupdates,0) + NVL(p.ndeletes,0))
                        - SUM(NVL(f.ninserts,0) + NVL(f.nupdates,0) + NVL(f.ndeletes,0)))
                       / SUM(f.nrows) * 100, 2)
       END as change
FROM sysindices i, sysmaster:sysptnhdr p, sysfragments f
WHERE i.idxname = f.indexname
  AND i.tabid = 100 AND i.tabid = f.tabid AND f.partn = p.partnum
GROUP BY i.idxname
ORDER BY change DESC
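The CASE expression in both halves of this query computes the same thing: the UDI activity since the statistics were last built, as a percentage of the row count recorded at that time. A small Python sketch of that arithmetic, using the zip_ix and state_ix figures from the example slides (it assumes the sysfragments counters match the sysindices values shown there, and that nrows for state_ix is the 199891 shown in one of the listings):

```python
def percent_change(current_udi, catalog_udi, nrows):
    """Mirror the query's CASE logic for the 'change' column."""
    delta = current_udi - catalog_udi
    if nrows == 0:
        return 0.0 if delta == 0 else -1
    return round(delta / nrows * 100, 2)

# zip_ix: partition counters show 199891 inserts; the catalog recorded 28 at
# the last update statistics run, when the index had 28 rows.
zip_change = percent_change(current_udi=199891, catalog_udi=28, nrows=28)

# state_ix: created after the load, so no changes since its statistics were built.
state_change = percent_change(current_udi=0, catalog_udi=0, nrows=199891)

print(zip_change, state_change)
```

With the default STATCHANGE of 10, zip_ix is far over the threshold and state_ix is not, which matches the OAT view shown earlier.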

Page 38: Ugif 10 2012 ppt0000002

Table STATCHANGE value

• Default STATCHANGE applies if not set for table
• Can be set at session level using SET ENVIRONMENT
  – Set environment statchange '5';
• Can set STATCHANGE when creating table
• Can alter table to set STATCHANGE
  – Alter table customer statchange 5;

select tabname,
       NVL(statchange, (select cf_effective from sysmaster:sysconfig
                        where cf_name = 'STATCHANGE')) as statchange
from systables where tabname = "customer";

Page 39: Ugif 10 2012 ppt0000002

FORCE option

• Can add "FORCE" to any update statistics command to ignore STATCHANGE
• When you upgrade to 11.70
  – Existing partition pages will have UDI counters added (UDI values are 0)
  – Catalog tables sysfragments (for indexes) and sysdistrib (for table column data distributions) will have UDI counters added (values are 0)
  – What does this mean for Update Statistics?
    • FORCE: execute even if NO change
    • STATCHANGE 0: execute if any amount of change (non-zero)

Page 40: Ugif 10 2012 ppt0000002

FORCE option (cont'd)

• Add "FORCE" to the end of the update statistics command to get legacy behavior (ignore STATCHANGE)
• FORCE
  – Execute even if NO change
  – Sets sysdistrib nupdates, ndeletes, ninserts to 0; the same behavior isn't seen with sysfragments nupdates, ndeletes, ninserts
• STATCHANGE 0
  – Execute if non-zero amount of change
  – Set environment STATCHANGE '0'
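The difference between FORCE and STATCHANGE 0 can be written as a small decision rule. This is a sketch of the behavior as described on these slides, not the server's code; the function names are hypothetical, and treating "threshold met" as "change greater than or equal to STATCHANGE" is an assumption.

```python
def effective_statchange(table_value, default=10):
    """Table-level STATCHANGE if set, otherwise the default (10 in 11.70)."""
    return default if table_value is None else table_value

def should_rebuild(change_pct, statchange, force=False):
    """Decide whether index or distribution statistics would be rebuilt."""
    if force:
        return True                 # FORCE: execute even if no change
    if statchange == 0:
        return change_pct > 0       # STATCHANGE 0: any non-zero change
    return change_pct >= statchange  # assumed comparison for "threshold met"

threshold = effective_statchange(None)
print(should_rebuild(0.0, threshold))              # no change, default threshold
print(should_rebuild(0.0, threshold, force=True))  # FORCE overrides
print(should_rebuild(0.5, 0))                      # STATCHANGE 0, some change
print(should_rebuild(0.0, 0))                      # STATCHANGE 0, no change
```

This also explains the post-upgrade note: with all UDI counters at 0 after upgrading to 11.70, STATCHANGE 0 would still skip the rebuild, while FORCE would not.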

Page 41: Ugif 10 2012 ppt0000002

Stored Procedures

• Update statistics FOR PROCEDURE is not affected by STATCHANGE
• SQL statements in SPL are optimized
  – When SPL is created or on first execution
  – When dependent tables or indexes are altered
  – When statistics of dependent tables change

In 11.70, this means that every time update statistics is run to update a table, systables' npused, nrows, and ustlowts are updated (even if index statistics or distribution statistics are not updated because STATCHANGE has not been met).

Page 42: Ugif 10 2012 ppt0000002

Update Statistics Low - Summary

• The update statistics low performance improvement feature takes effect when:
  – USTLOW_SAMPLE is set to 1
  – The index has 100,000 or more leaf pages
  – The index is detached
• USTLOW_SAMPLE
  – New ONCONFIG parameter, documented in 11.70.xC4
  – Controls use of sampling (new feature) to collect index statistics during update statistics
  – 0 (off) or 1 (on); default value is 0 (off)
  – Can be updated with onmode -wm/wf
  – Can be set at session level using SET ENVIRONMENT
    • Set Environment USTLOW_SAMPLE '0' / '1' / 'on' / 'off'

Page 43: Ugif 10 2012 ppt0000002

Update Statistics Low: Why?

• Update Statistics LOW takes too long when gathering statistics for large indexes
  – Entire index is read in sequence
  – Each leaf page of an index must be read individually (separate I/O)
  – Some customers do not run the command because it does not fit in the maintenance window
  – On a single large table (billions of rows and many indexes), the command can take over 3 days
• New feature solution: USTLOW_SAMPLE
  – Use sampling to reduce the time required to gather index statistics
  – Many samples are taken, and index statistics are calculated based on statistics from the samples

Page 44: Ugif 10 2012 ppt0000002

Update Statistics Low - Details

• Update statistics low gathers the following index statistics
  – Number of index levels
  – Number of index leaf pages
  – Number of unique values for index lead key
  – Clustering factor
  – 2nd lowest and 2nd highest value for index lead key
• Index statistics saved in database catalog
  – sysindices (levels, leaves, nunique, clust)
  – syscolumns (colmin, colmax)
  – sysfragments (levels, clust) for fragtype = "I"
• When Update Statistics Medium or High is run, index statistics are also collected, unless "Distributions Only" is used

Page 45: Ugif 10 2012 ppt0000002

Update Statistics Low - Details (cont'd)

• Instead of reading the entire index in sequence, the new feature:
  – Uses sampling
  – Each sample goes from the index root page to an index leaf page, reading one or more index leaf pages
  – Sampling is "dynamic": the number of samples is not predetermined
  – The number of samples is determined by the quality of the samples
    • Fewer samples needed if data is evenly distributed
    • More samples needed if data distribution is skewed
    • Standard deviation among the samples is used as the measure of "quality"
  – Time for update statistics is not predictable up front
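The idea of letting sample quality decide the sample count can be sketched with a generic adaptive-sampling loop. This is not the Informix algorithm, whose details are not given in the slides; the stopping rule (relative standard error below `rel_tol`), the function name and the input values are assumptions chosen for illustration.

```python
import math
import statistics

def adaptive_estimate(samples, rel_tol=0.05, min_samples=5):
    """Consume samples until the standard error of the mean is small enough.

    Returns (estimated mean, number of samples used). `samples` must be a
    non-empty iterable of numbers.
    """
    seen = []
    for value in samples:
        seen.append(value)
        if len(seen) < min_samples:
            continue
        mean = statistics.mean(seen)
        stderr = statistics.stdev(seen) / math.sqrt(len(seen))
        if mean != 0 and stderr / abs(mean) <= rel_tol:
            break
    return statistics.mean(seen), len(seen)

# Hypothetical per-sample measurements (e.g. unique keys seen per leaf page).
even_data = [100] * 1000        # evenly distributed
skewed_data = [10, 190] * 500   # same average, heavily skewed

even_mean, even_used = adaptive_estimate(even_data)
skewed_mean, skewed_used = adaptive_estimate(skewed_data)
print(even_used, skewed_used)
```

With the evenly distributed input the loop stops at the minimum sample count, while the skewed input needs many more samples to reach the same tolerance, which is why the run time cannot be predicted up front.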

Page 46: Ugif 10 2012 ppt0000002

Update Statistics Low - Example

• Example based on internal traces

Page 47: Ugif 10 2012 ppt0000002

Update Statistics Low - Example

• Example based on internal traces

Page 48: Ugif 10 2012 ppt0000002

Update Statistics Low - Notes

• Review of Update statistics feature
  – 11.70.xC1 "Smart Statistics" feature review
    • Default: AUTO_STAT_MODE 1
    • Default: STATCHANGE 10
    • Update Statistics command, when run, is not executed for index statistics and for table distributions if the STATCHANGE threshold has not been met
  – Update Statistics info in database catalog tables
    • Look at ustlowts in systables: updated when systables' nrows and npused are updated. This is done whenever the update statistics command is run; the STATCHANGE threshold is not looked at.
    • Look at ustlowts in sysindices: updated when index statistics are rebuilt/updated
    • Look at constr_time in sysdistrib: updated when distribution statistics are rebuilt/updated
• Remember the 11.10 feature: statistics are collected when an index is created

Page 49: Ugif 10 2012 ppt0000002

Catalog for smarter Statistics

Legend on the slide: 11.70 / Existing

systables     sysfragments
statchange    nupdates
statlevel     ndeletes
ustlowts      ninserts

sysindices         sysdistrib         sysfragdist
nupdates           nupdates           nupdates
ndeletes           ndeletes           ndeletes
ninserts           ninserts           ninserts
ustbuildduration   ustbuildduration   ustbuildduration
ustlowts           constr_time        constr_time

Page 50: Ugif 10 2012 ppt0000002

Questions?

Page 51: Ugif 10 2012 ppt0000002

Thank you

Olivier Bourdin

[email protected]

Wednesday, 3 October 2012