Recent Query Processing Enhancements. NoCOUG Conference, February 19th, 2004. George Lumpkin.

Recent Query Processing Enhancements

NoCOUG Conference, February 19th, 2004

George Lumpkin

Query processing

For the purposes of this presentation, ‘query processing’ includes:

– The underlying database objects which are being accessed (tables, indexes, etc.)

– The SQL functions and capabilities used to access those database objects

– The internal algorithms for executing SQL statements (table scans, index probes, joins, etc.)

– The optimization techniques applied to SQL statements

– The capabilities to view and understand Oracle’s query processing

Improved performance is the primary benefit of enhanced query processing

Database Objects

Database objects

Oracle9i:
– Table Compression
– List and Range-List Partitioning
– Bitmap join index
– IOTs: hash partitioning, parallel DML, bitmap indexes
– Datetime datatype

Oracle10g:
– Floating point datatype
– Global hash-partitioned indexes
– Datetime improvements

Table Compression (Oracle9i, Release 2)

Tables can be compressed
– Compression can also be specified at the partition level
– Indexes are not compressed

Typical compression ratios range from 3:1 to 5:1
– Compression is dependent upon the actual data
– Compression algorithm based on removing data redundancy

Key benefit is cost savings
– Save TBs of storage without compromising performance or functionality

However, a secondary benefit is often better performance due to reduced I/O
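As a rough illustration of the points above, compression is declared with the COMPRESS keyword on a table or partition. In Oracle9i Release 2 the data is compressed during bulk operations such as CREATE TABLE AS SELECT, direct-path inserts, and ALTER TABLE ... MOVE. The table and partition names below are hypothetical.

```sql
-- Hypothetical names; the COMPRESS clause is the point of the example.
-- Build a compressed copy of an existing table in one bulk operation.
CREATE TABLE sales_history_comp
  COMPRESS
  AS SELECT * FROM sales_history;

-- Compress a single partition of a partitioned table by rebuilding it.
ALTER TABLE sales MOVE PARTITION sales_q1_1998 COMPRESS;
```

The achieved ratio can then be estimated by comparing the block counts of the original and compressed segments.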

List and Composite Range-List Partitioning

List partitioning allows a table to be partitioned with a list of values

– For example, a table can be partitioned by region or by department

Composite Range-List enables logical sub-partitioning for the most commonly used Range partitioning

Further flexibility in how a DBA can manage large data sets

Provide appropriate partitioning techniques for all business requirements

Composite Range-List Partitioning

Range partition across time

List partition across another major attribute

[Figure: a table range-partitioned by sales_date into monthly partitions (NOV1997, DEC1997, ..., SEP1998, OCT1998, NOV1998), each list-subpartitioned by geography (North, South, West, ...)]

Implementation and Usage Tips: List and Composite Range-List Partitioning

Consider LIST (sub)partitioning when:
– You have a column containing unordered values, which correspond to a logical unit for data maintenance and query access

Use a DEFAULT list partition when:
– You may have unexpected values for the partitioning key
– You often add or modify values in your partitioning key

Migration of existing nonpartitioned and partitioned tables
– For online migration, use the dbms_redefinition package
– For offline creation, use CREATE TABLE AS SELECT or INSERT /*+ APPEND */
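A minimal sketch of a composite range-list table with a DEFAULT list subpartition, along the lines of the tips above. The table, column, and partition names are assumptions made for illustration; only two monthly range partitions are shown.

```sql
-- Hypothetical table: range-partitioned by month, list-subpartitioned by geography.
CREATE TABLE sales_by_region (
  sales_date  DATE,
  geography   VARCHAR2(10),
  amount      NUMBER
)
PARTITION BY RANGE (sales_date)
SUBPARTITION BY LIST (geography)
SUBPARTITION TEMPLATE (
  SUBPARTITION north VALUES ('North'),
  SUBPARTITION south VALUES ('South'),
  SUBPARTITION west  VALUES ('West'),
  -- Catches unexpected or newly introduced geography values.
  SUBPARTITION other VALUES (DEFAULT)
)
(
  PARTITION nov1997 VALUES LESS THAN (TO_DATE('01-DEC-1997', 'DD-MON-YYYY')),
  PARTITION dec1997 VALUES LESS THAN (TO_DATE('01-JAN-1998', 'DD-MON-YYYY'))
);
```

The subpartition template keeps the list definition in one place, so every new range partition picks up the same set of list subpartitions.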

[Figure labels: Performance Benefits, Manageability Benefits]

IEEE Floating Point

New datatypes: binary_float and binary_double

Precise mapping to Java and other application environments

Potential space reduction
– 4/8 bytes fixed vs. up to 21 bytes variable for Oracle number

Increased range of values
– Binary double’s 11-bit exponent

Performance improvement
– Native hardware vs. proprietary software for calculations

Caveat: binary numbers are subject to rounding effects and are not suitable for data requiring exact decimal precision
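The following sketch shows how the new datatypes and their literal suffixes (f for binary_float, d for binary_double) might be used, and gives a simple way to observe the rounding caveat. The table name and columns are hypothetical.

```sql
-- Hypothetical table using the Oracle10g IEEE floating-point types.
CREATE TABLE measurements (
  id       NUMBER,
  value_f  BINARY_FLOAT,    -- 4 bytes, single precision
  value_d  BINARY_DOUBLE    -- 8 bytes, double precision
);

INSERT INTO measurements VALUES (1, 0.1f, 0.1d);

-- Compare binary and decimal arithmetic on the same literals:
-- the binary_double sum is subject to binary rounding, the NUMBER sum is exact.
SELECT 0.1d + 0.2d AS binary_sum,
       0.1  + 0.2  AS number_sum
FROM dual;
```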

IEEE Floating Point

Biggest potential benefit for BI: Improved performance for lengthy/complex arithmetical expressions
– Example query with 20X performance gains:

    select promotion_name,
           exp (geo_mean_temp / cnt) as geometric_mean
    from ( select p.promo_name as promotion_name,
                  sum (ln (quantity_sold)) as geo_mean_temp,
                  count (*) as cnt  -- row count, needed to average the logarithms
           from sales s, promotions p
           where p.promo_id = s.promo_id
             and time_id between to_date ('01-jan-1998', 'dd-mon-yyyy')
                             and to_date ('31-dec-1998', 'dd-mon-yyyy')
             and quantity_sold > 0
           group by promo_name )
    order by geometric_mean desc;

Second biggest potential benefit: Space savings for lengthy numeric types

SQL Functions

SQL Functions

Oracle9i
– ANSI Joins
– Full outer joins
– CASE statement
– Grouping sets
– WITH clause

Oracle10g
– Enhanced connect by
– Partition Outer Join
– Regular Expressions
– SQL Models
– Statistical functions
– Frequent Itemsets

WITH clause

Useful when a given query accesses the same subquery multiple times:

    WITH channel_summary AS (
      SELECT channels.channel_desc, SUM(amount_sold) AS channel_total
      FROM sales, channels
      WHERE sales.channel_id = channels.channel_id
      GROUP BY channels.channel_desc )
    SELECT channel_desc, channel_total
    FROM channel_summary
    WHERE channel_total > ( SELECT SUM(channel_total) * 1/3
                            FROM channel_summary );

Partitioned Outer Join

New outer join syntax enabling easy specification and high performance for joins that "densify" sparse data.

To specify comparison calculations and to format reports reliably, it is best to return a consistent set of dimension members in query results.

Yet data is normally stored in "sparse" form: why waste space storing non-occurrence?

Ugly and slow SQL is otherwise needed to add back rows for nonexistent cases into query output.

Most frequently used to replace missing values along the time dimension.

Accepted for ANSI SQL standard.

Partitioned Outer Join - Basics

Inventory table holds only changed values. But for calculations & reporting, we want rows for the full set of dates.

Inventory Table

    time_id        product  quant
    1 April 2003   Bottle   10
    6 April 2003   Bottle    8
    1 April 2003   Can      15
    4 April 2003   Can      11

    SELECT times.time_id, product, quant
    FROM inventory PARTITION BY (product)
         RIGHT OUTER JOIN times
         ON (times.time_id = inventory.time_id);

Similar to a regular outer join, except the outer join is applied to each partition.

Query result

    time_id        product  quant
    1 April 2003   Bottle   10
    2 April 2003   Bottle
    3 April 2003   Bottle
    4 April 2003   Bottle
    5 April 2003   Bottle
    6 April 2003   Bottle    8
    1 April 2003   Can      15
    2 April 2003   Can
    3 April 2003   Can
    4 April 2003   Can      11
    5 April 2003   Can
    6 April 2003   Can

Partition Outer Join: Repeating Values

The last non-null values should be preserved for subsequent records (typical inventory problem)

New analytical SQL keyword for LAST_VALUE(): IGNORE NULLS

    SELECT time_id, product,
           LAST_VALUE (quant IGNORE NULLS)
             OVER (PARTITION BY product ORDER BY time_id) quant
    FROM ( SELECT times.time_id, product, quant
           FROM inventory PARTITION BY (product)
                RIGHT OUTER JOIN times
                ON (times.time_id = inventory.time_id) );

Query result

    time_id        product  quant
    1 April 2003   Bottle   10
    2 April 2003   Bottle   10
    3 April 2003   Bottle   10
    4 April 2003   Bottle   10
    5 April 2003   Bottle   10
    6 April 2003   Bottle    8
    1 April 2003   Can      15
    2 April 2003   Can      15
    3 April 2003   Can      15
    4 April 2003   Can      11
    5 April 2003   Can      11
    6 April 2003   Can      11

Row Sources

Row sources

Oracle9i
– Sampling
– Index skip scans

Oracle10g
– Table scan speed-up
– Inline LOB access speedup

Index Skip Scan

In Oracle8i, a composite index is used only if its first (prefix) column appears in the predicate

In Oracle9i, a skip scan can use the composite index without the prefix column, which can be far faster than a full table scan

No need for another index

Especially useful if the number of distinct values of the prefix column is relatively low

Index Skip Scan

Business Scenario: Department of Motorized Vehicles
– A car is uniquely identified by State and registration ID
– Unique index on (STATE, REGISTRATION#)

Query: Find the details of a registration ID when the State is not known
– Index skip scan allows the composite index to be used for this query
  Can be many times faster than not using an index
– Before index skip scans:
  Bad performance because of lack of index
  Or extra cost and maintenance to create an index on (registration#)
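The scenario above could be set up roughly as follows. The table name, column types, and the sample registration value are assumptions. Whether the optimizer actually picks an INDEX SKIP SCAN depends on the statistics, in particular on how few distinct STATE values there are.

```sql
-- Hypothetical schema for the vehicle registration scenario.
CREATE TABLE vehicles (
  state          VARCHAR2(2),
  registration#  VARCHAR2(10),
  owner_name     VARCHAR2(100)
);

CREATE UNIQUE INDEX vehicles_uk ON vehicles (state, registration#);

-- The predicate omits the leading index column (STATE).
EXPLAIN PLAN FOR
SELECT * FROM vehicles WHERE registration# = 'ABC1234';

-- Look for an INDEX SKIP SCAN operation on VEHICLES_UK in the output.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```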

Optimizer

Optimizer

Oracle9i
– Dynamic Sampling
– Bind Peeking
– Index Joins

Oracle10g
– Automatic SQL Tuning

Bind peeking

In the first invocation of a cursor containing bind variables, the optimizer will ‘peek’ at the bind values and use those values to optimize the query

– The query plan will remain cached, and will be re-used for future invocations

The bind variables in the first invocation should thus be ‘representative’ values
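A small SQL*Plus sketch of the behavior described above. The ORDERS table and its STATUS column are hypothetical, as is the assumption that the two values have very different selectivity.

```sql
-- SQL*Plus session; table ORDERS and column STATUS are hypothetical.
VARIABLE st VARCHAR2(10)

-- First invocation: the statement is hard-parsed and the optimizer
-- peeks at the value 'SHIPPED' when choosing the plan.
EXEC :st := 'SHIPPED'
SELECT COUNT(*) FROM orders WHERE status = :st;

-- Later invocation: the cached plan is reused, even if 'RETURNED'
-- would have led to a different plan had it been peeked first.
EXEC :st := 'RETURNED'
SELECT COUNT(*) FROM orders WHERE status = :st;
```

If the column is heavily skewed, the order in which values are first seen can therefore decide which plan every later execution gets.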

Optimizer Dynamic Sampling (Oracle9i Rel 2)

Problem: optimizer statistics may be missing or known to be inaccurate

Solution: statistics are dynamically gathered during query optimization
– Table predicate selectivity and cardinality
– Sampling is used to minimize the time required to gather statistics
– Statistics are only gathered for queries which are expected to take a long time (relative to the cost of gathering stats)

Optimizer Dynamic Sampling

Settings for OPTIMIZER_DYNAMIC_SAMPLING parameter:

– 0 -- Off.

– 1 -- Used for multi-table queries for tables w/o both statistics and indexes. Little overhead since you will have to do a full scan anyway. This is the default in 9iR2.

– 2 -- Used for any unanalyzed object. This is the default in 10g, where we have automated stats collection, but users may still have volatile objects without stats.

– 3 -- Used when the optimizer has to use a guess, e.g., to_number(c1) > 10.

– 4 -- Used if correlations could be present, e.g., ANDed or ORed conditions on the same table.
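The level can be set for a session through the parameter, or for a single table in one statement through the DYNAMIC_SAMPLING hint. In this sketch the STAGING_ORDERS table and its columns are hypothetical.

```sql
-- Session-wide setting: sample any object that has no statistics.
ALTER SESSION SET optimizer_dynamic_sampling = 2;

-- Per-statement setting: request level 4 for table alias s, where two
-- ANDed predicates on the same table may be correlated.
SELECT /*+ dynamic_sampling(s 4) */ COUNT(*)
FROM staging_orders s
WHERE status = 'NEW'
  AND region = 'West';
```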

Diagnostics

Diagnostics

Oracle9i
– Query execution statistics
– Enhanced SQL trace information
– Enhanced explain plan output (DBMS_XPLAN)

Oracle10g
– Automatic workload repository and automated diagnosis with ADDM
– Self-tuning SQL optimization
– Parallel Execution Enhancements: “No Slave SQL”

Query Execution Statistics

Oracle9i introduces new dynamic views for a deeper insight into SQL Execution

- V$SQL_PLAN_*
  - Execution plans of all cursors in the shared SQL area
  - Cursor runtime statistics can be collected with STATISTICS_LEVEL=ALL

- V$SQL_WORKAREA_*
  - Detailed information about the memory usage for all running SQL statements down to a row source level
  - Activated when PGA_AGGREGATE_TARGET <> 0

V$SQL_PLAN

    SQL> select /* TRACK_ME */ e.ename, d.dname
         from scott.emp e, scott.dept d
         where e.deptno = d.deptno;

    SQL> select /* NOT_ME */ id, operation, object_name, cost, bytes
         from v$sql_plan
         where hash_value = (select hash_value from v$sql
                             where sql_text like '%TRACK_ME%'
                               and sql_text not like '%NOT_ME%')
         order by 1;

            ID OPERATION          OBJECT_NAME       COST      BYTES
    ---------- ------------------ ----------- ---------- ----------
             0 SELECT STATEMENT                        5
             1 HASH JOIN                               4        588
             2 TABLE ACCESS       DEPT                 2         88
             3 TABLE ACCESS       EMP                  2        280

V$SQL_PLAN equivalent to PLAN_TABLE

Shows actual used plan
– SQL shown to select plan is simplified
– In Oracle10g, you can use DBMS_XPLAN
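As a sketch of the DBMS_XPLAN route mentioned above: in Oracle10g, DBMS_XPLAN.DISPLAY_CURSOR formats the plan of a cursor in the shared SQL area. Called without arguments, it reports on the last statement executed in the session. In SQL*Plus, SERVEROUTPUT should be off, otherwise the last statement seen may be an internal call rather than the query.

```sql
-- Run the statement of interest first.
SELECT /* TRACK_ME */ e.ename, d.dname
FROM scott.emp e, scott.dept d
WHERE e.deptno = d.deptno;

-- Oracle10g: show the plan actually used by the last statement
-- executed in this session, without querying V$SQL_PLAN by hand.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR());
```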

V$SQL_PLAN_STATISTICS_ALL

    SQL> alter session set statistics_level=ALL;

    SQL> select /* TRACK_ME */ e.ename, d.dname
         from scott.emp e, scott.dept d
         where e.deptno = d.deptno;

    SQL> select /* NOT_ME */ id, operation, object_name,
                last_output_rows "ROWS", last_cr_buffer_gets "CR",
                last_disk_reads "PR", last_elapsed_time "TIME us"
         from v$sql_plan_statistics_all
         where hash_value = (select hash_value from v$sql
                             where sql_text like '%TRACK_ME%'
                               and sql_text not like '%NOT_ME%')
         order by 1;

    ID OPERATION    OBJECT_NAME      ROWS        CR       PR    TIME us
    -- ------------ ----------- --------- --------- -------- ----------
     1 HASH JOIN                       14         7        0       2685
     2 TABLE ACCESS DEPT                4         3        0        377
     3 TABLE ACCESS EMP                14         4        0        426

V$SQL_PLAN_STATISTICS shows actual cursor execution statistics (overhead, not enabled by default)
– SQL shown to select plan is simplified
– In Oracle10g, you can use DBMS_XPLAN

Enhanced SQL Trace

By default, SQL trace includes some runtime statistics for a SQL statement

    PARSING IN CURSOR #11
    select e.ename, d.dname from emp e, dept d where e.deptno=d.deptno
    END OF STMT
    PARSE #11:c=20000,e=13838,p=0,cr=8,cu=0,mis=1,r=0,dep=0,og=1,tim=6389414660329
    EXEC #11:c=0,e=102,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=6389414660694
    FETCH #11:c=0,e=2132,p=0,cr=6,cu=0,mis=0,r=1,dep=0,og=1,tim=6389414663127
    FETCH #11:c=0,e=1329,p=0,cr=1,cu=0,mis=0,r=13,dep=0,og=1,tim=6389414665803
    XCTEND rlbk=0, rd_only=1
    STAT #11 id=1 cnt=14 pid=0 … op='HASH JOIN (cr=7 pr=0 pw=0 time=3357 us)'
    STAT #11 id=2 cnt=4 pid=1 … op='TABLE ACCESS FULL DEPT (cr=3 pr=0 pw=0 time=525 us)'
    STAT #11 id=3 cnt=14 pid=1 … op='TABLE ACCESS FULL EMP (cr=4 pr=0 pw=0 time=380 us)'

Workload Repository

“Data Warehouse” of the Database

Facility to collect, process, and maintain important RDBMS statistics and workload: SQL workload, segment statistics, time & wait statistics, metrics, feature usage

Efficiently sample and compute statistics in memory

Periodically flush coarser-grain information to disk - in a self-managed tablespace

Information readily available & real-time accessible when needed

On by default - flush to disk every 30 min, keep for 7 days
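One way to check the flush interval and retention actually in effect on a given Oracle10g database, and to force a flush on demand, is sketched below. It assumes the session has the privileges needed to read the DBA_HIST views and to execute DBMS_WORKLOAD_REPOSITORY.

```sql
-- Current snapshot interval and retention period of the workload repository.
SELECT snap_interval, retention
FROM dba_hist_wr_control;

-- Take a snapshot immediately instead of waiting for the next scheduled flush.
BEGIN
  DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;
END;
/
```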

In Oracle9i, parallel execution plans were complex
– Difficult to read/understand due to multiple cursors
– Difficult to analyze statement-level performance information

    SQL> explain plan for
         select /*+parallel(d) parallel(e) */ dname, ename
         from emp e, dept d
         where e.deptno=d.deptno;

    ---------------------------------…--------------------------------------
    | Id | Operation         | Name | … | TQ    |IN-OUT| PQ Distribution |
    ------------------------------------------------------------------------
    |  0 | SELECT STATEMENT  |      |   |       |      |                 |
    |* 1 | HASH JOIN         |      |   | 88,01 | P->S | QC (RAND)       |
    |  2 | TABLE ACCESS FULL | EMP  |   | 88,01 | PCWP |                 |
    |  3 | TABLE ACCESS FULL | DEPT |   | 88,00 | P->P | BROADCAST       |
    ------------------------------------------------------------------------

    PX Slave SQL Information (identified by operation id):
    ------------------------------------------------------
       1 - SELECT /*+ ORDERED NO_EXPAND USE_HASH(A2) */ A1.C1,A2.C1 FROM (SELECT /*+ NO_EXPAND ROWID(A3) */ A3."DEPTNO" C0,A3."ENAME" C1 FROM "EMP" PX_GRANULE(0, BLOCK_RANGE, DYNAMIC) A3 ) A1,:Q288000 A2 WHERE A1.C0=A2.C0
       3 - SELECT /*+ NO_EXPAND ROWID(A1) */ A1."DEPTNO" C0,A1."DNAME" C1 FROM "DEPT" PX_GRANULE(0, BLOCK_RANGE, DYNAMIC) A1

Oracle10g - No Slave SQL

Oracle Database 10g: single execution plan, single cursor

    SQL> explain plan for
         select /*+parallel(d) parallel(e) */ dname, ename
         from emp e, dept d
         where e.deptno=d.deptno;

    -----------------------------------------..-----------------------------
    | Id | Operation           | Name     |  | TQ    |IN-OUT| PQ Distrib |
    -----------------------------------------..-----------------------------
    |  0 | SELECT STATEMENT    |          |  |       |      |            |
    |  1 | PX COORDINATOR      |          |  |       |      |            |
    |  2 | PX SEND QC (RANDOM) | :TQ10001 |  | Q1,01 | P->S | QC (RAND)  |
    |* 3 | HASH JOIN           |          |  | Q1,01 | PCWP |            |
    |  4 | PX BLOCK ITERATOR   |          |  | Q1,01 | PCWC |            |
    |  5 | TABLE ACCESS FULL   | EMP      |  | Q1,01 | PCWP |            |
    |  6 | BUFFER SORT         |          |  | Q1,01 | PCWC |            |
    |  7 | PX RECEIVE          |          |  | Q1,01 | PCWP |            |
    |  8 | PX SEND BROADCAST   | :TQ10000 |  | Q1,00 | P->P | BROADCAST  |
    |  9 | PX BLOCK ITERATOR   |          |  | Q1,00 | PCWC |            |
    | 10 | TABLE ACCESS FULL   | DEPT     |  | Q1,00 | PCWP |            |
    -----------------------------------------..-----------------------------

