Recent Query Processing Enhancements, NoCOUG Conference, February 19th, 2004, George Lumpkin
Recent Query Processing Enhancements
NoCOUG Conference, February 19th, 2004
George Lumpkin
Query processing
For the purposes of this presentation, ‘query processing’ includes:
– The underlying database objects which are being accessed (tables, indexes, etc.)
– The SQL functions and capabilities used to access those database objects
– The internal algorithms for executing SQL statements (table scans, index probes, joins, etc.)
– The optimization techniques applied to SQL statements
– The capabilities to view and understand Oracle’s query processing
Improved performance is the primary benefit of enhanced query processing
Database Objects
Database objects
Oracle9i:
– Table Compression
– List and Range-List Partitioning
– Bitmap join index
– IOTs: hash partitioning, parallel DML, bitmap indexes
– Datetime datatype
Oracle10g:
– Floating point datatype
– Global hash-partitioned indexes
– Datetime improvements
Table Compression (Oracle9i, Release 2)
Tables can be compressed
– Compression can also be specified at the partition level
– Indexes are not compressed
Typical compression ratios range from 3:1 to 5:1
– Compression is dependent upon the actual data
– Compression algorithm based on removing data redundancy
Key benefit is cost savings
– Save TBs of storage without compromising performance or functionality
However, a secondary benefit is often performance due to reduced IO utilization
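As a rough illustration of "removing data redundancy", the Python sketch below stores each distinct value of a block once in a symbol table and replaces the row values with short references. It is a toy model only, not Oracle's actual block format; the names compress_block and decompress_block and the sample rows are assumptions made for this example.

```python
def compress_block(rows):
    """Replace repeated values with indexes into a per-block symbol table."""
    symbols = []      # each distinct value is stored once
    positions = {}    # value -> index in the symbol table
    encoded = []
    for row in rows:
        encoded_row = []
        for value in row:
            if value not in positions:
                positions[value] = len(symbols)
                symbols.append(value)
            encoded_row.append(positions[value])
        encoded.append(tuple(encoded_row))
    return symbols, encoded


def decompress_block(symbols, encoded):
    """Rebuild the original rows from the symbol table and the references."""
    return [tuple(symbols[i] for i in row) for row in encoded]


if __name__ == "__main__":
    # Hypothetical sample data with heavy repetition.
    demo_rows = [("North", "Bottle"), ("North", "Can"),
                 ("South", "Bottle"), ("North", "Bottle")]
    table, refs = compress_block(demo_rows)
    print(table, refs)
```

The more often values repeat within a block, the more the symbol table pays off, which is consistent with the slide's point that the ratio depends on the actual data.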
List and Composite Range-List Partitioning
List partitioning allows a table to be partitioned with a list of values
– For example, a table can be partitioned by region or by department
Composite Range-List enables logical sub-partitioning for the most commonly used Range partitioning
Further flexibility in how a DBA can manage large data sets
Provide appropriate partitioning techniques for all business requirements
Composite Range-List Partitioning
Range partition across time
List partition across another major attribute
[Figure: a table range-partitioned on sales_date into monthly partitions (NOV1997, DEC1997, ..., SEP1998, OCT1998, NOV1998), each list-subpartitioned on geography (North, South, West, ...)]
Implementation and Usage Tips: List and Composite Range-List Partitioning
Consider LIST (sub)partitioning when:
– You have a column containing unordered values, which correspond to a logical unit for data maintenance and query access
Use a DEFAULT list partition when:
– You may have unexpected values for the partitioning key
– You often add or modify values in your partitioning key
Migration of existing nonpartitioned and partitioned tables
– For online migration, use the dbms_redefinition package
– For offline creation, use CREATE TABLE AS SELECT or INSERT /*+ APPEND */
[Figure labels: Performance Benefits; Manageability Benefits]
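The routing idea behind composite range-list partitioning with a DEFAULT list partition can be sketched in Python as follows. This is a conceptual model, not how Oracle implements partition mapping; the partition names, the upper bounds and the region list are hypothetical values chosen to resemble the diagram.

```python
import bisect
from datetime import date

# Hypothetical range partitions, each defined by an exclusive upper bound
# (in the spirit of VALUES LESS THAN).
RANGE_BOUNDS = [date(1998, 10, 1), date(1998, 11, 1), date(1998, 12, 1)]
RANGE_NAMES = ["SEP1998", "OCT1998", "NOV1998"]

# Hypothetical list subpartitions; anything else falls into DEFAULT.
LIST_VALUES = {"North", "South", "West"}


def route(sales_date, geography):
    """Return the (sub)partition name a row would be stored in."""
    i = bisect.bisect_right(RANGE_BOUNDS, sales_date)
    if i == len(RANGE_BOUNDS):
        raise ValueError("sales_date is beyond the last range partition")
    sub = geography if geography in LIST_VALUES else "DEFAULT"
    return f"{RANGE_NAMES[i]}_{sub.upper()}"


if __name__ == "__main__":
    print(route(date(1998, 10, 15), "North"))  # known list value
    print(route(date(1998, 9, 30), "East"))    # unexpected value goes to DEFAULT
```

Without the DEFAULT branch, the unexpected value "East" would have no home, which is the situation the tip about DEFAULT list partitions is meant to avoid.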
IEEE Floating Point
New datatypes: binary_float and binary_double
Precise mapping to Java and other application environments
Potential space reduction
– 4/8 bytes fixed vs. up to 21 bytes variable for Oracle number
Increased range of values
– Binary double’s 11-bit exponent
Performance improvement
– Native hardware vs. proprietary software for calculations
Caveat: binary numbers are subject to rounding effects and are not suitable for data requiring exact decimal precision
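The rounding caveat and the fixed 4/8-byte sizes can be seen in any IEEE 754 environment. The short Python sketch below uses Python's own float (an IEEE double) and the decimal module as a stand-in for an exact decimal type; it illustrates the general behavior and is not Oracle-specific.

```python
import struct
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum differs slightly from 0.3.
binary_sum = 0.1 + 0.2

# A decimal type keeps the values exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")

# Fixed storage sizes of IEEE single and double precision values.
float_size = struct.calcsize("<f")
double_size = struct.calcsize("<d")

if __name__ == "__main__":
    print(binary_sum == 0.3, decimal_sum == Decimal("0.3"))
    print(float_size, double_size)
```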
IEEE Floating Point
Biggest potential benefit for BI: Improved performance for lengthy/complex arithmetical expressions
– Example query with 20X performance gains:
select promotion_name,
       exp (geo_mean_temp / row_count) as geometric_mean
from ( select p.promo_name as promotion_name,
              sum (ln (quantity_sold)) as geo_mean_temp,
              count (*) as row_count
       from sales s, promotions p
       where p.promo_id = s.promo_id
         and time_id between to_date ('01-jan-1998', 'dd-mon-yyyy')
                         and to_date ('31-dec-1998', 'dd-mon-yyyy')
         and quantity_sold > 0
       group by promo_name )
order by geometric_mean desc;
Second biggest potential benefit: Space savings for lengthy numeric types
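The query computes a geometric mean as the exponential of the average natural logarithm, skipping non-positive quantities. A minimal Python sketch of the same arithmetic, with a hypothetical function name and made-up inputs, is:

```python
import math


def geometric_mean(values):
    """exp(sum(ln(v)) / n) over the strictly positive values."""
    positives = [v for v in values if v > 0]   # mirrors quantity_sold > 0
    if not positives:
        raise ValueError("need at least one positive value")
    log_sum = sum(math.log(v) for v in positives)
    return math.exp(log_sum / len(positives))


if __name__ == "__main__":
    print(geometric_mean([1, 4, 16]))
```

Each row costs one logarithm, so an aggregate like this is dominated by floating-point work, which is where native hardware arithmetic can help.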
SQL Functions
SQL Functions
Oracle9i
– ANSI Joins
– Full outer joins
– CASE statement
– Grouping sets
– WITH clause
Oracle10g
– Enhanced connect by
– Partition Outer Join
– Regular Expressions
– SQL Models
– Statistical functions
– Frequent Itemsets
WITH clause
Useful when a given query accesses the same subquery multiple times:
WITH channel_summary AS
  ( SELECT channels.channel_desc, SUM(amount_sold) AS channel_total
    FROM sales, channels
    WHERE sales.channel_id = channels.channel_id
    GROUP BY channels.channel_desc )
SELECT channel_desc, channel_total
FROM channel_summary
WHERE channel_total >
  ( SELECT SUM(channel_total) * 1/3 FROM channel_summary );
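To try the WITH clause pattern without an Oracle instance, the sketch below runs an equivalent query against an in-memory SQLite database from Python. The table layout follows the slide, but the sample rows are invented, and the threshold is written as / 3.0 to avoid SQLite's integer division; treat it as an illustration of the named-subquery idea rather than of Oracle behavior.

```python
import sqlite3

QUERY = """
WITH channel_summary AS (
    SELECT channels.channel_desc, SUM(amount_sold) AS channel_total
    FROM sales, channels
    WHERE sales.channel_id = channels.channel_id
    GROUP BY channels.channel_desc
)
SELECT channel_desc, channel_total
FROM channel_summary
WHERE channel_total > (SELECT SUM(channel_total) / 3.0 FROM channel_summary)
ORDER BY channel_desc
"""


def build_demo_db():
    """Create a tiny in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE channels (channel_id INTEGER PRIMARY KEY, channel_desc TEXT);
        CREATE TABLE sales (channel_id INTEGER, amount_sold REAL);
        INSERT INTO channels VALUES (1, 'Direct Sales'), (2, 'Internet'), (3, 'Partners');
        INSERT INTO sales VALUES (1, 400.0), (1, 200.0), (2, 300.0), (3, 100.0);
    """)
    return conn


def channels_above_one_third(conn):
    """Channels whose total exceeds one third of all sales."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    print(channels_above_one_third(build_demo_db()))
```

The summary is defined once and referenced twice, which is the case the slide describes.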
Partitioned Outer Join
New outer join syntax enabling easy specification and high performance for joins that "densify" sparse data.
To specify comparison calculations and to format reports reliably, it is best to return a consistent set of dimension members in query results.
Yet data is normally stored in "sparse" form: why waste space storing non-occurrence?
Otherwise, ugly and slow SQL is needed to add back rows for nonexistent cases into query output.
Most frequently used to replace missing values along the time dimension.
Accepted for ANSI SQL standard.
Partitioned Outer Join - Basics
Inventory table holds only changed values. But for calculations & reporting, we want rows for the full set of dates.
Inventory Table
time_id        product  quant
1 April 2003   Bottle   10
6 April 2003   Bottle   8
1 April 2003   Can      15
4 April 2003   Can      11
SELECT times.time_id, product, quant
FROM inventory PARTITION BY (product)
  RIGHT OUTER JOIN times
  ON (times.time_id = inventory.time_id);
Query result
time_id        product  quant
1 April 2003   Bottle   10
2 April 2003   Bottle
3 April 2003   Bottle
4 April 2003   Bottle
5 April 2003   Bottle
6 April 2003   Bottle   8
1 April 2003   Can      15
2 April 2003   Can
3 April 2003   Can
4 April 2003   Can      11
5 April 2003   Can
6 April 2003   Can
Similar to a regular outer join, except the outer join is applied to each partition.
Partitioned Outer Join: Repeating Values
The last non-null values should be preserved for subsequent records (typical inventory problem)
New analytical SQL keyword for LAST_VALUE(): IGNORE NULLS
SELECT time_id, product,
       LAST_VALUE (quant IGNORE NULLS)
         OVER (PARTITION BY product ORDER BY time_id) quant
FROM ( SELECT times.time_id, product, quant
       FROM inventory PARTITION BY (product)
         RIGHT OUTER JOIN times
         ON (times.time_id = inventory.time_id) );
Query result
time_id        product  quant
1 April 2003   Bottle   10
2 April 2003   Bottle   10
3 April 2003   Bottle   10
4 April 2003   Bottle   10
5 April 2003   Bottle   10
6 April 2003   Bottle   8
1 April 2003   Can      15
2 April 2003   Can      15
3 April 2003   Can      15
4 April 2003   Can      11
5 April 2003   Can      11
6 April 2003   Can      11
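The two steps on these slides, densifying per product and then carrying the last non-null quantity forward, can be mimicked in plain Python. The sketch below reproduces the semantics on the slide's sample data (days 1 to 6 stand for 1 to 6 April 2003); the function names are made up for this example and nothing here reflects how Oracle executes the join.

```python
# Sparse inventory from the slide: (day of April 2003, product) -> quantity.
INVENTORY = {
    (1, "Bottle"): 10, (6, "Bottle"): 8,
    (1, "Can"): 15, (4, "Can"): 11,
}
DAYS = range(1, 7)


def densify(inventory, days):
    """Like the partitioned outer join: one row per (product, day); gaps are None."""
    products = sorted({product for _, product in inventory})
    return [(day, product, inventory.get((day, product)))
            for product in products for day in days]


def carry_forward(rows):
    """Like LAST_VALUE(quant IGNORE NULLS) per product, ordered by day."""
    last_seen = {}
    filled = []
    for day, product, quant in rows:
        if quant is not None:
            last_seen[product] = quant
        filled.append((day, product, last_seen.get(product)))
    return filled


if __name__ == "__main__":
    for row in carry_forward(densify(INVENTORY, DAYS)):
        print(row)
```

carry_forward relies on the rows arriving grouped by product and ordered by day, which is what the PARTITION BY and ORDER BY clauses guarantee in the SQL version.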
Row Sources
Row sources
Oracle9i
– Sampling
– Index skip scans
Oracle10g
– Table scan speed-up
– Inline LOB access speedup
Index Skip Scan
In Oracle8i, a composite index is used only if the first (prefix) column appears in the predicate
In Oracle9i, a skip scan can use the composite index even without the prefix column, which can be far faster than a full table scan
No need for another index
Especially useful if the number of distinct values of the prefix column is relatively low
Index Skip Scan
Business Scenario: Department of Motorized Vehicles
– A car is uniquely identified by State and registration ID
– Unique index on (STATE, REGISTRATION#)
Query: Find the details of a registration ID when the State is not known
– Index skip scan allows the composite index to be used for this query
– Can be many times faster than not using an index
– Before index skip scans:
  Bad performance because of lack of index
  Or, extra cost and maintenance to create an index on (registration#)
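Conceptually, a skip scan treats the composite index as one small sub-index per distinct prefix value and probes each in turn. The Python sketch below models the index as a sorted list of (state, registration, rowid) tuples and uses binary search for both the probe and the jump to the next state. It is a simplified model with invented data, not Oracle's B-tree implementation.

```python
import bisect


def skip_scan(index, registration):
    """Find entries for a registration ID without knowing the state.

    index: list of (state, registration, rowid) tuples, sorted ascending.
    """
    states = [entry[0] for entry in index]   # prefix column only
    matches = []
    pos = 0
    while pos < len(index):
        state = states[pos]
        # Probe this state's slice of the index for the registration ID.
        probe = bisect.bisect_left(index, (state, registration), pos)
        if probe < len(index) and index[probe][:2] == (state, registration):
            matches.append(index[probe])
        # Skip ahead to the first entry of the next distinct state.
        pos = bisect.bisect_right(states, state, pos)
    return matches


if __name__ == "__main__":
    # Hypothetical index contents.
    demo_index = sorted([
        ("CA", "ABC123", 1), ("CA", "XYZ999", 2),
        ("NY", "ABC123", 3), ("TX", "JKL555", 4),
    ])
    print(skip_scan(demo_index, "ABC123"))
```

The number of probes grows with the number of distinct states, which is why the technique works best when the prefix column has few distinct values.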
Optimizer
Optimizer
Oracle9i
– Dynamic Sampling
– Bind Peeking
– Index Joins
Oracle10g
– Automatic SQL Tuning
Bind peeking
In the first invocation of a cursor containing bind variables, the optimizer will ‘peek’ at the bind values and use those values to optimize the query
– The query plan will remain cached, and will be re-used for future invocations
The bind variables in the first invocation should thus be ‘representative’ values
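Why the first bind value matters can be shown with a toy model: the plan is chosen from the first value seen and then reused from a cache for every later value. The class below is a deliberately simplified Python sketch; the selectivity threshold, the plan names and the sample column are assumptions for illustration and do not describe Oracle's actual costing.

```python
from collections import Counter


class ToyOptimizer:
    """Chooses a plan on first execution by peeking at the bind value."""

    def __init__(self, column_values, index_threshold=0.1):
        self.histogram = Counter(column_values)
        self.total = len(column_values)
        self.index_threshold = index_threshold   # assumed cut-off
        self.plan_cache = {}                     # SQL text -> cached plan

    def choose_plan(self, bind_value):
        selectivity = self.histogram[bind_value] / self.total
        if selectivity <= self.index_threshold:
            return "INDEX RANGE SCAN"
        return "FULL TABLE SCAN"

    def execute(self, sql_text, bind_value):
        # Only the first invocation peeks; later ones reuse the cached plan.
        if sql_text not in self.plan_cache:
            self.plan_cache[sql_text] = self.choose_plan(bind_value)
        return self.plan_cache[sql_text]


if __name__ == "__main__":
    statuses = ["SHIPPED"] * 95 + ["PENDING"] * 5   # hypothetical skewed column
    opt = ToyOptimizer(statuses)
    sql = "select * from orders where status = :s"
    print(opt.execute(sql, "PENDING"))   # plan chosen for a rare value
    print(opt.execute(sql, "SHIPPED"))   # same plan reused for a common value
```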
Optimizer Dynamic Sampling (Oracle9i Rel 2)
Problem: optimizer statistics may be missing or known to be inaccurate
Solution: statistics are dynamically gathered during query optimization
– Table predicate selectivity and cardinality
– Sampling is used to minimize the time required to gather statistics
– Statistics are only gathered for queries which are expected to take a long time (relative to the cost of gathering stats)
Optimizer Dynamic Sampling
Settings for OPTIMIZER_DYNAMIC_SAMPLING parameter:
– 0 -- Off.
– 1 -- Used for multi-table queries for tables without both statistics and indexes. Little overhead since you will have to do a full scan anyway. This is the default in 9iR2.
– 2 -- Used for any unanalyzed object. This is the default in 10g, where we have automated stats collection but users may still have volatile objects without stats.
– 3 -- Used when the optimizer has to use a guess, e.g., to_number(c1) > 10.
– 4 -- Used if correlations could be present, e.g., ANDed or ORed conditions on the same table.
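The underlying idea, estimating predicate selectivity and cardinality from a small sample instead of reading every row, can be sketched in a few lines of Python. This example samples individual rows at random and uses a made-up table and predicate; Oracle's sampling strategy and sample sizes are not modeled here.

```python
import random


def estimate_cardinality(rows, predicate, sample_size, seed=0):
    """Estimate (selectivity, matching row count) from a random sample."""
    if not rows:
        return 0.0, 0
    rng = random.Random(seed)   # fixed seed keeps the sketch repeatable
    sample = rng.sample(rows, min(sample_size, len(rows)))
    hits = sum(1 for row in sample if predicate(row))
    selectivity = hits / len(sample)
    return selectivity, round(selectivity * len(rows))


if __name__ == "__main__":
    table = list(range(10000))                    # hypothetical column values
    sel, card = estimate_cardinality(table, lambda x: x % 4 == 0, 1000)
    print(sel, card)                              # true selectivity is 0.25
```

Only a tenth of the rows are inspected in this run, which mirrors the trade-off on the slide: a small sampling cost in exchange for a usable estimate when statistics are missing.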
Diagnostics
Diagnostics
Oracle9i
– Query execution statistics
– Enhanced SQL trace information
– Enhanced explain plan output (DBMS_XPLAN)
Oracle10g
– Automatic workload repository and automated diagnosis with ADDM
– Self-tuning SQL optimization
– Parallel Execution Enhancements: “No Slave SQL”
Query Execution Statistics
Oracle9i introduces new dynamic views for a deeper insight into SQL Execution
- V$SQL_PLAN_*
  - Execution plans of all cursors in the shared SQL area
  - Cursor runtime statistics can be collected with STATISTICS_LEVEL=ALL
- V$SQL_WORKAREA_*
  - Detailed information about the memory usage for all running SQL statements down to a row source level
  - Activated when PGA_AGGREGATE_TARGET <> 0
V$SQL_PLAN
SQL> select /* TRACK_ME */ e.ename, d.dname
     from scott.emp e, scott.dept d
     where e.deptno = d.deptno;
SQL> select /* NOT_ME */ id, operation, object_name, cost, bytes
     from v$sql_plan
     where hash_value = (select hash_value from v$sql
                         where sql_text like '%TRACK_ME%'
                           and sql_text not like '%NOT_ME%')
     order by 1;
ID OPERATION         OBJECT_NAME  COST  BYTES
-- ----------------- ------------ ----- -----
 0 SELECT STATEMENT                   5
 1 HASH JOIN                          4   588
 2 TABLE ACCESS      DEPT             2    88
 3 TABLE ACCESS      EMP              2   280
V$SQL_PLAN equivalent to PLAN_TABLE
Shows actual used plan
– SQL shown to select plan is simplified
– In Oracle10g, you can use DBMS_XPLAN
V$SQL_PLAN_STATISTICS_ALL
SQL> alter session set statistics_level=ALL;
SQL> select /* TRACK_ME */ e.ename, d.dname
     from scott.emp e, scott.dept d
     where e.deptno = d.deptno;
SQL> select /* NOT_ME */ id, operation, object_name,
            last_output_rows "ROWS", last_cr_buffer_gets "CR",
            last_disk_reads "PR", last_elapsed_time "TIME us"
     from v$sql_plan_statistics_all
     where hash_value = (select hash_value from v$sql
                         where sql_text like '%TRACK_ME%'
                           and sql_text not like '%NOT_ME%')
     order by 1;
ID OPERATION     OBJECT_NAME  ROWS  CR  PR  TIME us
-- ------------- ------------ ----- --- --- -------
 1 HASH JOIN                     14   7   0    2685
 2 TABLE ACCESS  DEPT             4   3   0     377
 3 TABLE ACCESS  EMP             14   4   0     426
V$SQL_PLAN_STATISTICS shows actual cursor execution statistics (overhead, not enabled by default)
– SQL shown to select plan is simplified
– In Oracle10g, you can use DBMS_XPLAN
Enhanced SQL Trace
By default, SQL trace includes some runtime statistics for a SQL statement
PARSING IN CURSOR #11
select e.ename, d.dname from emp e, dept d where e.deptno=d.deptno
END OF STMT
PARSE #11:c=20000,e=13838,p=0,cr=8,cu=0,mis=1,r=0,dep=0,og=1,tim=6389414660329
EXEC #11:c=0,e=102,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=6389414660694
FETCH #11:c=0,e=2132,p=0,cr=6,cu=0,mis=0,r=1,dep=0,og=1,tim=6389414663127
FETCH #11:c=0,e=1329,p=0,cr=1,cu=0,mis=0,r=13,dep=0,og=1,tim=6389414665803
XCTEND rlbk=0, rd_only=1
STAT #11 id=1 cnt=14 pid=0 … op='HASH JOIN (cr=7 pr=0 pw=0 time=3357 us)'
STAT #11 id=2 cnt=4 pid=1 … op='TABLE ACCESS FULL DEPT (cr=3 pr=0 pw=0 time=525 us)'
STAT #11 id=3 cnt=14 pid=1 … op='TABLE ACCESS FULL EMP (cr=4 pr=0 pw=0 time=380 us)'
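The PARSE, EXEC and FETCH lines are simple key=value lists, so they are easy to total with a short script. The Python sketch below parses the four call lines from the trace above and sums elapsed time (e, in microseconds), consistent reads (cr) and rows (r). The helper names are invented for this example, and the parser handles only this line shape, not a full trace file.

```python
import re

CALL_LINE = re.compile(r"^(PARSE|EXEC|FETCH) #(\d+):(.+)$")


def parse_call(line):
    """Turn one PARSE/EXEC/FETCH line into a dict, or None for other lines."""
    match = CALL_LINE.match(line.strip())
    if match is None:
        return None
    call, cursor, stats = match.groups()
    fields = {key: int(value)
              for key, value in (item.split("=") for item in stats.split(","))}
    return {"call": call, "cursor": int(cursor), **fields}


def summarize(lines):
    """Total elapsed microseconds, consistent reads and rows over all calls."""
    totals = {"elapsed_us": 0, "consistent_reads": 0, "rows": 0}
    for line in lines:
        parsed = parse_call(line)
        if parsed is None:
            continue
        totals["elapsed_us"] += parsed["e"]
        totals["consistent_reads"] += parsed["cr"]
        totals["rows"] += parsed["r"]
    return totals


TRACE = [
    "PARSE #11:c=20000,e=13838,p=0,cr=8,cu=0,mis=1,r=0,dep=0,og=1,tim=6389414660329",
    "EXEC #11:c=0,e=102,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=6389414660694",
    "FETCH #11:c=0,e=2132,p=0,cr=6,cu=0,mis=0,r=1,dep=0,og=1,tim=6389414663127",
    "FETCH #11:c=0,e=1329,p=0,cr=1,cu=0,mis=0,r=13,dep=0,og=1,tim=6389414665803",
    "XCTEND rlbk=0, rd_only=1",
]

if __name__ == "__main__":
    print(summarize(TRACE))
```

The row total of 14 matches cnt=14 on the top STAT line, a quick cross-check between the call statistics and the row-source statistics.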
Workload Repository
“Data Warehouse” of the Database
Facility to collect, process, and maintain important RDBMS statistics and workload: SQL workload, segment statistics, time & wait statistics, metrics, feature usage
Efficiently sample and compute statistics in memory
Periodically flush coarser-grain information to disk - in a self-managed tablespace
Information readily available & real-time accessible when needed
On by default - flush to disk every 30 min, keep for 7 days
Oracle10g - No Slave SQL
In Oracle9i, parallel execution plans were complex
– Difficult to read/understand due to multiple cursors
– Difficult to analyze statement-level performance information
SQL> explain plan for select /*+parallel(d) parallel(e) */ dname, ename from emp e, dept d where e.deptno=d.deptno;
--------------------------------------------…--------------------------------------
| Id | Operation            | Name |… | TQ    |IN-OUT| PQ Distribution |
-----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT     |      |  |       |      |                 |
|* 1 |  HASH JOIN           |      |  | 88,01 | P->S | QC (RAND)       |
|  2 |   TABLE ACCESS FULL  | EMP  |  | 88,01 | PCWP |                 |
|  3 |   TABLE ACCESS FULL  | DEPT |  | 88,00 | P->P | BROADCAST       |
-----------------------------------------------------------------------------------
PX Slave SQL Information (identified by operation id):
------------------------------------------------------
   1 - SELECT /*+ ORDERED NO_EXPAND USE_HASH(A2) */ A1.C1,A2.C1 FROM (SELECT /*+ NO_EXPAND ROWID(A3) */ A3."DEPTNO" C0,A3."ENAME" C1 FROM "EMP" PX_GRANULE(0, BLOCK_RANGE, DYNAMIC) A3 ) A1,:Q288000 A2 WHERE A1.C0=A2.C0
   3 - SELECT /*+ NO_EXPAND ROWID(A1) */ A1."DEPTNO" C0,A1."DNAME" C1 FROM "DEPT" PX_GRANULE(0, BLOCK_RANGE, DYNAMIC) A1
Oracle10g - No Slave SQL
Oracle Database 10g: single execution plan, single cursor
SQL> explain plan for select /*+parallel(d) parallel(e) */ dname, ename from emp e, dept d where e.deptno=d.deptno;
----------------------------------------------..--------------------------------
| Id | Operation                    | Name     |  | TQ    |IN-OUT| PQ Distrib |
----------------------------------------------..--------------------------------
|  0 | SELECT STATEMENT             |          |  |       |      |            |
|  1 |  PX COORDINATOR              |          |  |       |      |            |
|  2 |   PX SEND QC (RANDOM)        | :TQ10001 |  | Q1,01 | P->S | QC (RAND)  |
|* 3 |    HASH JOIN                 |          |  | Q1,01 | PCWP |            |
|  4 |     PX BLOCK ITERATOR        |          |  | Q1,01 | PCWC |            |
|  5 |      TABLE ACCESS FULL       | EMP      |  | Q1,01 | PCWP |            |
|  6 |     BUFFER SORT              |          |  | Q1,01 | PCWC |            |
|  7 |      PX RECEIVE              |          |  | Q1,01 | PCWP |            |
|  8 |       PX SEND BROADCAST      | :TQ10000 |  | Q1,00 | P->P | BROADCAST  |
|  9 |        PX BLOCK ITERATOR     |          |  | Q1,00 | PCWC |            |
| 10 |         TABLE ACCESS FULL    | DEPT     |  | Q1,00 | PCWP |            |
----------------------------------------------..--------------------------------