Oracle Statistics


  • 7/30/2019 Oracle Statistics


    Whenever a valid SQL statement is processed, Oracle has to decide how to retrieve the necessary data. This decision can be made using one of two methods:

    Rule Based Optimizer (RBO) - This method is used if the server has no internal statistics relating to the objects referenced by the statement. This method is no longer favoured by Oracle and was desupported from 10g.

    Cost Based Optimizer (CBO) - This method is used if internal statistics are present. The CBO checks several possible execution plans and selects the one with the lowest cost, where cost relates to system resources. Since Oracle 8i the Cost Based Optimizer (CBO) is the preferred optimizer for Oracle.

    Oracle statistics tell us the size of the tables, the distribution of values within columns, and other important information so that the optimizer can choose good execution plans for SQL statements. If new objects are created, or the amount of data in the database changes, the statistics will no longer represent the real state of the database, so the CBO decision process may be seriously impaired.

    Oracle can do things in several different ways, e.g. a select might be done by a table scan or by using indexes. It uses statistics (a variety of counts, averages and other numbers) to figure out the best way to do things. It does the figuring automatically, using the Cost Based Optimizer. The DBA's job is to make sure the numbers are good enough for that optimizer to work properly.

    The term "Oracle statistics" may refer to historical performance statistics that are kept in STATSPACK or AWR, but more commonly it refers to the optimizer metadata statistics that give the cost-based SQL optimizer information about the nature of the tables. The statistics discussed here are optimizer statistics, which are created for the purposes of query optimization and are stored in the data dictionary. These statistics should not be confused with performance statistics visible through V$ views.

    The optimizer is influenced by the following factors:

    OPTIMIZER_MODE in the initialization file

    Statistics in the data dictionary

    Hints

    OPTIMIZER_MODE can have the following values:

    CHOOSE
    ALL_ROWS
    FIRST_ROWS
    RULE
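    As a minimal sketch, assuming a session that is allowed to query V$PARAMETER, the optimizer mode can be inspected and changed for the current session roughly as follows:

```sql
-- Show the optimizer mode in effect for the current session
SELECT name, value
FROM   v$parameter
WHERE  name = 'optimizer_mode';

-- Switch the current session to throughput-oriented optimization
ALTER SESSION SET optimizer_mode = ALL_ROWS;
```

    Changing the mode at session level leaves the setting in the initialization file untouched, which makes it a convenient way to compare plans.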

    If we provide Oracle with good statistics about the schema, the CBO will almost always generate an optimal execution plan. The areas of schema analysis include:

    Object statistics - Statistics for all tables, partitions, IOTs, etc. should be sampled with a deep and statistically valid sample size.

    Critical columns - Those columns that are regularly referenced in SQL statements and that are:

    o Heavily skewed columns - This helps the CBO properly choose between an index range scan and a full table scan.

    o Foreign key columns - For n-way table joins, the CBO needs to determine the optimal table join order, and knowing the cardinality of the intermediate result sets is critical.

    External statistics - Oracle will sample the CPU cost and I/O cost during statistics collection and use this information to determine the optimal execution plan, based on optimizer_mode. External statistics are most useful for SQL running in the all_rows optimizer mode.

    Optimizer statistics are a collection of data that describe the database and the objects in the database in more detail. These statistics are used by the query optimizer to choose the best execution plan for each SQL statement. Optimizer statistics include the following:

    Table statistics
    o Number of rows
    o Number of blocks
    o Average row length

    Column statistics
    o Number of distinct values (NDV) in column
    o Number of nulls in column
    o Data distribution (histogram)

    Index statistics
    o Number of leaf blocks
    o Levels
    o Clustering factor

    System statistics
    o I/O performance and utilization
    o CPU performance and utilization

    The optimizer statistics are stored in the data dictionary. They can be viewed using data dictionary views. Only statistics stored in the dictionary itself have an impact on the cost-based optimizer.
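    The following queries are a sketch of how the table, column and index statistics listed earlier can be read from the dictionary. They assume we are connected as the owner of an EMP table (for example SCOTT); the HISTOGRAM column is available from 10g onwards.

```sql
-- Table statistics: row count, block count, average row length
SELECT table_name, num_rows, blocks, avg_row_len, last_analyzed
FROM   user_tables
WHERE  table_name = 'EMP';

-- Column statistics: distinct values, nulls, histogram type
SELECT column_name, num_distinct, num_nulls, num_buckets, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'EMP';

-- Index statistics: tree height, leaf blocks, clustering factor
SELECT index_name, blevel, leaf_blocks, clustering_factor
FROM   user_indexes
WHERE  table_name = 'EMP';
```

    A NULL in LAST_ANALYZED indicates that no statistics have been gathered for the object yet.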

    When statistics are updated for a database object, Oracle invalidates any currently parsed SQL statements that access the object. The next time such a statement executes, the statement is re-parsed and the optimizer automatically chooses a new execution plan based on the new statistics. Distributed statements accessing objects with new statistics on remote databases are not invalidated. The new statistics take effect the next time the SQL statement is parsed.

    Because the objects in a database can be constantly changing, statistics must be regularly updated so that they accurately describe these database objects. Statistics are maintained automatically by Oracle, or we can maintain the optimizer statistics manually using the DBMS_STATS package.

    The DBMS_STATS package provides procedures for managing statistics. We can save and restore copies of statistics. You can export statistics from one system and import those statistics into another system. For example, you could export statistics from a production system to a test system. We can lock statistics to prevent those statistics from changing.

    For data warehouses and databases using the all_rows optimizer_mode, from Oracle9i release 2 we can collect the external cpu_cost and io_cost metrics. The ability to save and re-use schema statistics is important for:

    Bi-modal databases - Many databases get huge benefits from using two sets of stats, one for OLTP (daytime), and another for batch (evening jobs).

    Test databases - Many Oracle professionals will export their production statistics into the development instances so that the test execution plans more closely resemble the production database.

    Creating statistics

    In order to make good use of the CBO, we need to create statistics for the data in the database. There are several options to create statistics.

    Automatic Statistics Gathering

    The recommended approach to gathering statistics is to allow Oracle to automatically gather the statistics. Oracle gathers statistics on all database objects automatically and maintains those statistics in a regularly-scheduled maintenance job. Automated statistics collection eliminates many of the manual tasks associated with managing the query optimizer, and significantly reduces the chances of getting poor execution plans because of missing or stale statistics.

    GATHER_STATS_JOB

    Optimizer statistics are automatically gathered with the job GATHER_STATS_JOB. This job gathers statistics on all objects in the database which have missing or stale statistics.

    This job is created automatically at database creation time and is managed by the Scheduler. The Scheduler runs this job when the maintenance window is opened. By default, the maintenance window opens every night from 10 P.M. to 6 A.M. and all day on weekends. The stop_on_window_close attribute controls whether the GATHER_STATS_JOB continues when the maintenance window closes. The default setting for the stop_on_window_close attribute is TRUE, causing the Scheduler to terminate GATHER_STATS_JOB when the maintenance window closes. The remaining objects are then processed in the next maintenance window.

    The GATHER_STATS_JOB job gathers optimizer statistics by calling the DBMS_STATS.GATHER_DATABASE_STATS_JOB_PROC procedure. The GATHER_DATABASE_STATS_JOB_PROC procedure collects statistics on database objects when the object has no previously gathered statistics or the existing statistics are stale because the underlying object has been modified significantly (more than 10% of the rows). The GATHER_DATABASE_STATS_JOB_PROC is an internal procedure, but it operates in a very similar fashion to the DBMS_STATS.GATHER_DATABASE_STATS procedure using the GATHER AUTO option. The primary difference is that the GATHER_DATABASE_STATS_JOB_PROC procedure prioritizes the database objects that require statistics, so that those objects which most need updated statistics are processed first. This ensures that the most-needed statistics are gathered before the maintenance window closes.

    Enabling Automatic Statistics Gathering

    Automatic statistics gathering is enabled by default when a database is created, or when a database is upgraded from an earlier database release. We can verify that the job exists by viewing the DBA_SCHEDULER_JOBS view:

    SQL> SELECT * FROM DBA_SCHEDULER_JOBS WHERE JOB_NAME = 'GATHER_STATS_JOB';

    In situations when you want to disable automatic statistics gathering, disable the GATHER_STATS_JOB as follows:

    BEGIN
      DBMS_SCHEDULER.DISABLE('GATHER_STATS_JOB');
    END;
    /

    Automatic statistics gathering relies on the modification monitoring feature. If this feature is disabled, then the automatic statistics gathering job is not able to detect stale statistics. This feature is enabled when the STATISTICS_LEVEL parameter is set to TYPICAL (default) or ALL.

    When to Use Manual Statistics

    Automatic statistics gathering should be sufficient for most database objects which are being modified at a moderate speed. However, there are cases where automatic statistics gathering may not be adequate. Because the automatic statistics gathering runs during an overnight batch window, the statistics on tables which are significantly modified during the day may become stale. There are typically two types of such objects:

    Volatile tables that are being deleted or truncated and rebuilt during the course of the day.

    Objects which are the target of large bulk loads which add 10% or more to the object's total size.

    For highly volatile tables, there are two approaches:

    The statistics on these tables can be set to NULL. When Oracle encounters a table with no statistics, Oracle dynamically gathers the necessary statistics as part of query optimization. This dynamic sampling feature is controlled by the OPTIMIZER_DYNAMIC_SAMPLING parameter, and this parameter should be set to a value of 2 or higher. The default value is 2. The statistics can be set to NULL by deleting and then locking the statistics:

    BEGIN
      DBMS_STATS.DELETE_TABLE_STATS('SCOTT','EMP');
      DBMS_STATS.LOCK_TABLE_STATS('SCOTT','EMP');
    END;
    /

    The statistics on these tables can be set to values that represent the typical state of the table. We should gather statistics on the table when the table has a representative number of rows, and then lock the statistics.

    This is more effective than the GATHER_STATS_JOB, because any statistics generated on the table during the overnight batch window may not be the most appropriate statistics for the daytime workload.

    For tables which are being bulk-loaded, the statistics-gathering procedures should be run on those tables immediately following the load process, preferably as part of the same script or job that is running the bulk load.
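    The second approach for volatile tables (gather while the table is representative, then lock) could be sketched as below. SCOTT.EMP stands in for the volatile table, and the block is assumed to run at a moment when the table holds a typical volume of rows.

```sql
BEGIN
  -- Release any existing lock so that the gather call is allowed
  DBMS_STATS.UNLOCK_TABLE_STATS('SCOTT', 'EMP');
  -- Gather while the table holds a representative number of rows
  DBMS_STATS.GATHER_TABLE_STATS('SCOTT', 'EMP');
  -- Lock the statistics so that later gathering runs skip this table
  DBMS_STATS.LOCK_TABLE_STATS('SCOTT', 'EMP');
END;
/
```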

    For external tables, statistics are not collected during GATHER_SCHEMA_STATS, GATHER_DATABASE_STATS, and automatic statistics gathering processing. However, you can collect statistics on an individual external table using GATHER_TABLE_STATS. Sampling on external tables is not supported, so the ESTIMATE_PERCENT option should be explicitly set to NULL. Because data manipulation is not allowed against external tables, it is sufficient to analyze external tables when the corresponding file changes.
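    A sketch of such a call, assuming a hypothetical external table named EMP_EXT in the SCOTT schema:

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SCOTT',
    tabname          => 'EMP_EXT',  -- hypothetical external table
    estimate_percent => NULL);      -- NULL means compute, i.e. no sampling
END;
/
```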

    If the monitoring feature is disabled by setting STATISTICS_LEVEL to BASIC, automatic statistics gathering cannot detect stale statistics. In this case statistics need to be manually gathered.

    Another area in which statistics need to be manually gathered is the system statistics. These statistics are not automatically gathered.

    Statistics on fixed objects, such as the dynamic performance tables, need to be manually collected using the GATHER_FIXED_OBJECTS_STATS procedure. Fixed objects record current database activity; statistics gathering should be done when the database has representative activity.
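    Both kinds of manual gathering can be sketched from SQL*Plus as follows. The example assumes a suitably privileged account and a workload that is representative while the system statistics capture is running.

```sql
-- Workload system statistics: start the capture, let a representative
-- workload run, then stop the capture
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('START');
-- ... representative workload runs here ...
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('STOP');

-- Statistics on fixed objects (the tables behind the V$ views)
EXEC DBMS_STATS.GATHER_FIXED_OBJECTS_STATS;
```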

    Whenever statistics in the dictionary are modified, old versions of the statistics are saved automatically for future restoring. Statistics can be restored using the RESTORE procedures of the DBMS_STATS package.
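    As an illustration, the saved history for a table can be listed and an older version restored. The table SCOTT.EMP and the one-day offset are assumptions chosen for the example; the restore only succeeds if a version from that time is still within the retention period.

```sql
-- When were the statistics on SCOTT.EMP changed?
SELECT table_name, stats_update_time
FROM   dba_tab_stats_history
WHERE  owner = 'SCOTT'
AND    table_name = 'EMP';

-- Put back the statistics as they were 24 hours ago
BEGIN
  DBMS_STATS.RESTORE_TABLE_STATS(
    ownname         => 'SCOTT',
    tabname         => 'EMP',
    as_of_timestamp => SYSTIMESTAMP - INTERVAL '1' DAY);
END;
/
```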

    In some cases, we may want to prevent any new statistics from being gathered on a table or schema by the GATHER_STATS_JOB process, for example on highly volatile tables. In those cases, the DBMS_STATS package provides procedures for locking the statistics for a table or schema.

    Scheduling Stats

    Scheduling the gathering of statistics using DBMS_JOB is a simple way to make sure they are kept up to date:

    SET SERVEROUTPUT ON
    DECLARE
      l_job NUMBER;
    BEGIN
      DBMS_JOB.submit(l_job,
                      'BEGIN DBMS_STATS.gather_schema_stats(''SCOTT''); END;',
                      SYSDATE,
                      'SYSDATE + 1');
      COMMIT;
      DBMS_OUTPUT.put_line('Job: ' || l_job);
    END;
    /

    The above code sets up a job to gather statistics for SCOTT at the current time every day. We can list the current jobs on the server using the DBA_JOBS and DBA_JOBS_RUNNING views.

    Existing jobs can be removed using:

    EXEC DBMS_JOB.remove(X);
    COMMIT;

    where X is the number of the job to be removed.
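    On 10g and later the same daily schedule can also be expressed with the Scheduler, which manages GATHER_STATS_JOB as well. The following is only a sketch; the job name GATHER_SCOTT_STATS is made up for the example.

```sql
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'GATHER_SCOTT_STATS',  -- hypothetical job name
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN DBMS_STATS.GATHER_SCHEMA_STATS(''SCOTT''); END;',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY',          -- once a day from start_date
    enabled         => TRUE);
END;
/

-- Remove the job again when it is no longer needed
EXEC DBMS_SCHEDULER.DROP_JOB('GATHER_SCOTT_STATS');
```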

    Manual Statistics Gathering

    If you choose not to use automatic statistics gathering, then you need to manually collect statistics in all schemas, including system schemas. If the data in the database changes regularly, you also need to gather statistics regularly to ensure that the statistics accurately represent characteristics of your database objects.

    The preferred tool for collecting statistics used to be the ANALYZE command. Over the past few releases, the DBMS_STATS package in the PL/SQL Packages and Types reference has taken over the statistics functions, and left the ANALYZE command with more mundane 'health check' work like analyzing chained rows.

    Analyze Statement

    The ANALYZE statement can be used to gather statistics for a specific table, index or cluster. The statistics can be computed exactly, or estimated based on a specific number of rows, or a percentage of rows.

    The ANALYZE command is available for all versions of Oracle; however, to obtain faster and better statistics use the supplied procedures: DBMS_UTILITY.ANALYZE_SCHEMA in 7.3.4 and 8.0, and DBMS_STATS.GATHER_SCHEMA_STATS in 8i and above.

    Each ANALYZE statement creates statistics for one table, index or cluster.

    Syntax:
    ANALYZE TABLE tableName {COMPUTE|ESTIMATE|DELETE} STATISTICS options
    ANALYZE INDEX indexName {COMPUTE|ESTIMATE|DELETE} STATISTICS options
    ANALYZE CLUSTER clusterName {COMPUTE|ESTIMATE|DELETE} STATISTICS options

    ANALYZE TABLE emp COMPUTE STATISTICS;
    ANALYZE TABLE emp COMPUTE STATISTICS FOR COLUMNS sal SIZE 10;
    ANALYZE TABLE emp PARTITION (p1) COMPUTE STATISTICS;
    ANALYZE INDEX emp_pk COMPUTE STATISTICS;

    ANALYZE TABLE emp ESTIMATE STATISTICS;
    ANALYZE TABLE emp ESTIMATE STATISTICS SAMPLE 500 ROWS;
    ANALYZE TABLE emp ESTIMATE STATISTICS SAMPLE 15 PERCENT;
    ANALYZE TABLE emp ESTIMATE STATISTICS FOR ALL COLUMNS;

    ANALYZE TABLE emp DELETE STATISTICS;
    ANALYZE INDEX emp_pk DELETE STATISTICS;

    ANALYZE TABLE emp VALIDATE STRUCTURE CASCADE;
    ANALYZE INDEX emp_pk VALIDATE STRUCTURE;
    ANALYZE CLUSTER emp_custs VALIDATE STRUCTURE CASCADE;

    ANALYZE TABLE emp VALIDATE REF UPDATE;
    ANALYZE TABLE emp LIST CHAINED ROWS INTO cr;

    Note: Do not use the COMPUTE and ESTIMATE clauses of the ANALYZE statement to collect optimizer statistics. These clauses are supported solely for backward compatibility and may be removed in a future release. The DBMS_STATS package collects a broader, more accurate set of statistics, and gathers statistics more efficiently.

    We may continue to use the ANALYZE statement for other purposes not related to optimizer statistics collection:

    To use the VALIDATE or LIST CHAINED ROWS clauses

    To collect information on free list blocks

    To sample a number (rather than a percentage) of rows

    DBMS_UTILITY

    The DBMS_UTILITY package can be used to gather statistics for a whole schema or database. With DBMS_UTILITY.ANALYZE_SCHEMA you can gather all the statistics for all the tables, clusters and indexes of a schema. Both ANALYZE_SCHEMA and ANALYZE_DATABASE follow the same format as the ANALYZE statement:

    EXEC DBMS_UTILITY.ANALYZE_SCHEMA('SCOTT','COMPUTE');
    EXEC DBMS_UTILITY.ANALYZE_SCHEMA('SCOTT','ESTIMATE',ESTIMATE_ROWS=>100);
    EXEC DBMS_UTILITY.ANALYZE_SCHEMA('SCOTT','ESTIMATE',ESTIMATE_PERCENT=>25);
    EXEC DBMS_UTILITY.ANALYZE_SCHEMA('SCOTT','DELETE');

    EXEC DBMS_UTILITY.ANALYZE_DATABASE('COMPUTE');
    EXEC DBMS_UTILITY.ANALYZE_DATABASE('ESTIMATE',ESTIMATE_ROWS=>100);
    EXEC DBMS_UTILITY.ANALYZE_DATABASE('ESTIMATE',ESTIMATE_PERCENT=>15);

    DBMS_STATS

    The DBMS_STATS package was introduced in Oracle 8i and is Oracle's preferred method of gathering object statistics. Oracle lists a number of benefits to using it, including parallel execution, long term storage of statistics and transfer of statistics between servers. This PL/SQL package is also used to modify, view, export, import, and delete statistics. It follows a similar format to the other methods.

    The DBMS_STATS package can gather statistics on tables and indexes, as well as individual columns and partitions of tables. It does not gather cluster statistics; however, we can use DBMS_STATS to gather statistics on the individual tables instead of the whole cluster.

    When we generate statistics for a table, column, or index, if the data dictionary already contains statistics for the object, then Oracle updates the existing statistics. The older statistics are saved and can be restored later if necessary.

    Procedures in the DBMS_STATS package for gathering statistics on database objects:

    Procedure                  Collects
    GATHER_INDEX_STATS         Index statistics
    GATHER_TABLE_STATS         Table, column, and index statistics
    GATHER_SCHEMA_STATS        Statistics for all objects in a schema
    GATHER_DICTIONARY_STATS    Statistics for all dictionary objects
    GATHER_DATABASE_STATS      Statistics for all objects in a database

    EXEC DBMS_STATS.GATHER_DATABASE_STATS;
    EXEC DBMS_STATS.GATHER_DATABASE_STATS(ESTIMATE_PERCENT=>20);

    Syntax:
    DBMS_STATS.GATHER_SCHEMA_STATS(ownname, estimate_percent, block_sample, method_opt, degree, granularity, cascade, stattab, statid, options, statown, no_invalidate, gather_temp, gather_fixed);

    EXEC DBMS_STATS.GATHER_SCHEMA_STATS('SCOTT');
    EXEC DBMS_STATS.GATHER_SCHEMA_STATS(OWNNAME=>'MRT');
    EXEC DBMS_STATS.GATHER_SCHEMA_STATS('SCOTT',ESTIMATE_PERCENT=>10);

    EXEC DBMS_STATS.GATHER_TABLE_STATS('SCOTT','EMP');
    EXEC DBMS_STATS.GATHER_TABLE_STATS('SCOTT','EMP',ESTIMATE_PERCENT=>15);

    EXEC DBMS_STATS.GATHER_INDEX_STATS('SCOTT','EMP_PK');
    EXEC DBMS_STATS.GATHER_INDEX_STATS('SCOTT','EMP_PK',ESTIMATE_PERCENT=>15);

    This package also gives us the ability to delete statistics:

    EXEC DBMS_STATS.DELETE_DATABASE_STATS;
    EXEC DBMS_STATS.DELETE_SCHEMA_STATS('SCOTT');
    EXEC DBMS_STATS.DELETE_TABLE_STATS('SCOTT','EMP');
    EXEC DBMS_STATS.DELETE_INDEX_STATS('SCOTT','EMP_PK');
    EXEC DBMS_STATS.DELETE_PENDING_STATS('SH','SALES');

    Schema-level gathering can also be restricted with OPTIONS or extended to indexes with CASCADE:

    EXEC DBMS_STATS.GATHER_SCHEMA_STATS(OWNNAME=>'"DWH"',OPTIONS=>'GATHER AUTO');
    EXEC DBMS_STATS.GATHER_SCHEMA_STATS(OWNNAME=>'PERFSTAT',CASCADE=>TRUE);

    When gathering statistics on system schemas, we can use the procedure DBMS_STATS.GATHER_DICTIONARY_STATS. This procedure gathers statistics for all system schemas, including SYS and SYSTEM, and other optional schemas, such as CTXSYS and DRSYS.

    Statistics Gathering Using Sampling

    The statistics-gathering operations can utilize sampling to estimate statistics. Sampling is an important technique for gathering statistics. Gathering statistics without sampling requires full table scans and sorts of entire tables. Sampling minimizes the resources necessary to gather statistics.

    Sampling is specified using the ESTIMATE_PERCENT argument to the DBMS_STATS procedures. While the sampling percentage can be set to any value, Oracle Corporation recommends setting the ESTIMATE_PERCENT parameter of the DBMS_STATS gathering procedures to DBMS_STATS.AUTO_SAMPLE_SIZE to maximize performance gains while achieving necessary statistical accuracy. AUTO_SAMPLE_SIZE lets Oracle determine the best sample size necessary for good statistics, based on the statistical property of the object. Because each type of statistics has different requirements, the size of the actual sample taken may not be the same across the table, columns, or indexes. For example, to collect table and column statistics for all tables in the SCOTT schema with auto-sampling, you could use:

    EXEC DBMS_STATS.GATHER_SCHEMA_STATS('SCOTT',DBMS_STATS.AUTO_SAMPLE_SIZE);
    EXEC DBMS_STATS.GATHER_SCHEMA_STATS(OWNNAME=>'SCOTT',ESTIMATE_PERCENT=>DBMS_STATS.AUTO_SAMPLE_SIZE);

    When the ESTIMATE_PERCENT parameter is manually specified, the DBMS_STATS gathering procedures may automatically increase the sampling percentage if the specified percentage did not produce a large enough sample. This ensures the stability of the estimated values by reducing fluctuations.

    EXEC DBMS_STATS.GATHER_SCHEMA_STATS(OWNNAME=>'SCOTT',ESTIMATE_PERCENT=>25);

    Parallel Statistics Gathering

    The statistics-gathering operations can run either serially or in parallel. The degree of parallelism can be specified with the DEGREE argument to the DBMS_STATS gathering procedures. Parallel statistics gathering can be used in conjunction with sampling. Oracle recommends setting the DEGREE parameter to DBMS_STATS.AUTO_DEGREE. This setting allows Oracle to choose an appropriate degree of parallelism based on the size of the object and the settings for the parallel-related init.ora parameters.

    Note that certain types of index statistics are not gathered in parallel, including cluster indexes, domain indexes, and bitmap join indexes.

    EXEC DBMS_STATS.GATHER_SCHEMA_STATS(OWNNAME=>'SCOTT', ESTIMATE_PERCENT=>DBMS_STATS.AUTO_SAMPLE_SIZE, METHOD_OPT=>'FOR ALL COLUMNS SIZE AUTO', DEGREE=>7);
    EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'DWH', TABNAME=>'table_name', METHOD_OPT=>'FOR ALL COLUMNS SIZE AUTO', DEGREE=>6, ESTIMATE_PERCENT=>5, NO_INVALIDATE=>FALSE);

    GATHER_TABLE_STATS requires a TABNAME argument; table_name in the second call is a placeholder for the table to be analyzed.

    Statistics on Partitioned Objects

    For partitioned tables and indexes, DBMS_STATS can gather separate statistics for each partition, as well as global statistics for the entire table or index. Similarly, for composite partitioning, DBMS_STATS can gather separate statistics for subpartitions, partitions, and the entire table or index. The type of partitioning statistics to be gathered is specified in the GRANULARITY argument to the DBMS_STATS gathering procedures.

    Depending on the SQL statement being optimized, the optimizer can choose to use either the partition (or subpartition) statistics or the global statistics. Both types of statistics are important for most applications, and Oracle recommends setting the GRANULARITY parameter to AUTO to gather both types of partition statistics.

    Column Statistics and Histograms

    When gathering statistics on a table, DBMS_STATS gathers information about the data distribution of the columns within the table. The most basic information about the data distribution is the maximum value and minimum value of the column. However, this level of statistics may be insufficient for the optimizer's needs if the data within the column is skewed. For skewed data distributions, histograms can also be created as part of the column statistics to describe the data distribution of a given column.

    Histograms are specified using the METHOD_OPT argument of the DBMS_STATS gathering procedures. Oracle recommends setting the METHOD_OPT to FOR ALL COLUMNS SIZE AUTO. With this setting, Oracle automatically determines which columns require histograms and the number of buckets (size) of each histogram. You can also manually specify which columns should have histograms and the size of each histogram.

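    METHOD_OPT also accepts a column group or an expression in parentheses (extended statistics, available from 11g). The columns used here belong to the SCOTT.EMP table:

    EXEC DBMS_STATS.GATHER_TABLE_STATS('SCOTT','EMP',method_opt=>'FOR COLUMNS (empno, deptno)');
    EXEC DBMS_STATS.GATHER_TABLE_STATS('SCOTT','EMP',method_opt=>'FOR COLUMNS (sal+comm)');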

    Note: If you need to remove all rows from a table when using DBMS_STATS, use TRUNCATE instead of dropping and re-creating the same table. When a table is dropped, workload information used by the auto-histogram gathering feature and saved statistics history used by the RESTORE_*_STATS procedures will be lost. Without this data, these features will not function properly.

    Determining Stale Statistics

    Statistics must be regularly gathered on database objects as those database objects are modified over time. In order to determine whether or not a given database object needs new statistics, Oracle provides a table monitoring facility. This monitoring is enabled by default when STATISTICS_LEVEL is set to TYPICAL or ALL. Monitoring tracks the approximate number of INSERTs, UPDATEs, and DELETEs for each table, as well as whether the table has been truncated, since the last time statistics were gathered. The information about changes to tables can be viewed in the USER_TAB_MODIFICATIONS view. Following a data modification, there may be a delay of a few minutes while Oracle propagates the information to this view. Use the DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO procedure to immediately reflect the outstanding monitored information kept in memory.

    Monitoring can be switched off and on at table, schema or database level:

    -- Table level
    ALTER TABLE emp NOMONITORING;
    ALTER TABLE emp MONITORING;

    -- Schema level
    EXEC DBMS_STATS.alter_schema_tab_monitoring('SCOTT', TRUE);
    EXEC DBMS_STATS.alter_schema_tab_monitoring('SCOTT', FALSE);

    -- Database level
    EXEC DBMS_STATS.alter_database_tab_monitoring(TRUE);
    EXEC DBMS_STATS.alter_database_tab_monitoring(FALSE);
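    To look at what the monitoring facility has recorded, a sketch along these lines can be used, assuming we are connected as the owner of the monitored tables:

```sql
-- Push the in-memory monitoring counters to the dictionary first
EXEC DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO;

-- Approximate DML counts since statistics were last gathered
SELECT table_name, inserts, updates, deletes, truncated, timestamp
FROM   user_tab_modifications
ORDER  BY timestamp DESC;
```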

    The GATHER_DATABASE_STATS or GATHER_SCHEMA_STATS procedures gather new statistics for tables with stale statistics when the OPTIONS parameter is set to GATHER STALE or GATHER AUTO. If a monitored table has been modified more than 10%, then these statistics are considered stale and gathered again.
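    A possible way to see which tables Oracle currently regards as stale, and then refresh only those, is sketched below. It assumes a 10g or later database and a session connected as SCOTT.

```sql
-- Tables in the current schema whose statistics are flagged as stale
SELECT table_name, last_analyzed, stale_stats
FROM   user_tab_statistics
WHERE  stale_stats = 'YES';

-- Re-gather statistics for the stale objects only
EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'SCOTT', options => 'GATHER STALE');
```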

    User-defined Statistics

    You can create user-defined optimizer statistics to support user-defined indexes and functions. When you associate a statistics type with a column or domain index, Oracle calls the statistics collection method in the statistics type whenever statistics are gathered for database objects.

    You should gather new column statistics on a table after creating a function-based index, to allow Oracle to collect the equivalent of column statistics for the expression. This is done by calling the statistics-gathering procedure with the METHOD_OPT argument set to FOR ALL HIDDEN COLUMNS.

    When to Gather Statistics

    When gathering statistics manually, we not only need to determine how to gather statistics, but also when and how often to gather new statistics.

    For an application in which tables are being incrementally modified, we may only need to gather new statistics every week or every month. The simplest way to gather statistics in these environments is to use a script or job scheduling tool to regularly run the GATHER_SCHEMA_STATS and GATHER_DATABASE_STATS procedures. The frequency of collection intervals should balance the task of providing accurate statistics for the optimizer against the processing overhead incurred by the statistics collection process.

    For tables which are being substantially modified in batch operations, such as with bulk loads, statistics should be gathered on those tables as part of the batch operation. The DBMS_STATS procedure should be called as soon as the load operation completes.

    For partitioned tables, there are often cases in which only a single partition is modified. In those cases, statistics can be gathered only on those partitions rather than gathering statistics for the entire table. However, gathering global statistics for the partitioned table may still be necessary.
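    Gathering statistics for one modified partition only could look like the sketch below. The partition name is a hypothetical example; substitute the partition that was actually loaded.

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname     => 'SH',
    tabname     => 'SALES',
    partname    => 'SALES_Q4_2003',  -- hypothetical partition name
    granularity => 'PARTITION');     -- partition-level statistics only
END;
/
```

    With this GRANULARITY setting the global statistics are left as they are, so a separate run may still be needed to refresh them.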

    Transferring Statistics between databases

    It is possible to transfer statistics between servers, allowing consistent execution plans between servers with varying amounts of data. First the statistics must be collected into a statistics table. It can be very handy to use production statistics on a development database, so that we can forecast the optimizer behaviour.

    Statistics can be exported and imported from the data dictionary to user-owned tables. This enables you to create multiple versions of statistics for the same schema. It also enables you to copy statistics from one database to another database. You may want to do this to copy the statistics from a production database to a scaled-down test database.

    Note: Exporting and importing statistics is a distinct concept from the EXP and IMP utilities of the database. The DBMS_STATS export and import packages do utilize IMP and EXP dump files.

    Before exporting statistics, you first need to create a table for holding the statistics. This statistics table is created using the procedure DBMS_STATS.CREATE_STAT_TABLE. After this table is created, you can export statistics from the data dictionary into your statistics table using the DBMS_STATS.EXPORT_*_STATS procedures. The statistics can then be imported using the DBMS_STATS.IMPORT_*_STATS procedures.

    Note that the optimizer does not use statistics stored in a user-owned table. The only statistics used by the optimizer are the statistics stored in the data dictionary. In order to have the optimizer use the statistics in user-owned tables, you must import those statistics into the data dictionary using the statistics import procedures.

    In order to move statistics from one database to another, you must first export the statistics on the first database, then copy the statistics table to the second database, using the EXP and IMP utilities or other mechanisms, and finally import the statistics into the second database.

    Note: The EXP and IMP utilities export and import optimizer statistics from the database along with the table. One exception is that statistics are not exported with the data if a table has columns with system-generated names.

    In the following example the statistics for the APPSCHEMA user are collected into a new table, STATS_TAB, which is owned by DBASCHEMA:

    1. Create the statistics table.

    EXEC DBMS_STATS.CREATE_STAT_TABLE(ownname => 'SCHEMA_NAME', stattab => 'STATS_TABLE', tblspace => 'STATS_TABLESPACE');

    SQL> EXEC DBMS_STATS.CREATE_STAT_TABLE('DBASCHEMA','STATS_TAB');

    2. Export statistics to statistics table.

    EXEC DBMS_STATS.EXPORT_SCHEMA_STATS('ORIGINAL_SCHEMA', 'STATS_TABLE', NULL, 'STATS_TABLE_OWNER');

    SQL> EXEC DBMS_STATS.EXPORT_SCHEMA_STATS('APPSCHEMA','STATS_TAB',NULL,'DBASCHEMA');
    (or)
    EXEC DBMS_STATS.EXPORT_SCHEMA_STATS(OWNNAME=>'APPSCHEMA',STATTAB=>'STATS_TAB',STATID=>'030610',STATOWN=>'DBASCHEMA');

    3. Transfer the statistics table to the other server using any one of the methods below.

    SQL*Plus copy over a database link:
    SQL> insert into dbaschema.stats_tab select * from dbaschema.stats_tab@source;

    Export/Import:
    exp file=stats.dmp log=stats_exp.log tables=dbaschema.stats_tab
    imp file=stats.dmp log=stats_imp.log full=y

    Datapump:
    expdp directory=dpump_dir dumpfile=stats.dmp logfile=stats_exp.log tables=dbaschema.stats_tab
    impdp directory=dpump_dir dumpfile=stats.dmp logfile=stats_imp.log

    4. Import statistics into the data dictionary.

    EXEC DBMS_STATS.IMPORT_SCHEMA_STATS('NEW_SCHEMA', 'STATS_TABLE', NULL, 'SYSTEM');

    SQL> EXEC DBMS_STATS.IMPORT_SCHEMA_STATS('APPSCHEMA','STATS_TAB',NULL,'DBASCHEMA');

    (or)

    EXEC DBMS_STATS.IMPORT_SCHEMA_STATS(OWNNAME=>'APPSCHEMA', STATTAB=>'STATS_TAB', STATID=>'030610', STATOWN=>'DBASCHEMA');

    5. Drop the statistics table (optional step).

    EXEC DBMS_STATS.DROP_STAT_TABLE('SYSTEM','STATS_TABLE');

    SQL> EXEC DBMS_STATS.DROP_STAT_TABLE('DBASCHEMA','STATS_TAB');

    Getting top-quality stats

    Because Oracle9i schema statistics work best with external system load, we like to schedule a valid sample (using dbms_stats.auto_sample_size) during regular working hours. For example, here we refresh statistics using the "auto" settings for sample size and histograms:

    begin
      dbms_stats.gather_schema_stats(
        ownname          => 'SCOTT',
        estimate_percent => dbms_stats.auto_sample_size,
        method_opt       => 'for all columns size auto',
        degree           => 7);
    end;
    /

    To re-analyze only those Oracle tables that have experienced more than a 10% change in row content, the table monitoring facility is used together with the options parameter of gather_schema_stats (for example GATHER STALE), which is not set in the call above.

    Optimizer Hints

    ALL_ROWS
    FIRST_ROWS
    FIRST_n_ROWS
    APPEND
    FULL
    INDEX
    DYNAMIC_SAMPLING
    BYPASS_RECURSIVE_CHECK
    BYPASS_RECURSIVE_CHECK APPEND

    Examples:
    SELECT /*+ ALL_ROWS */ empid, last_name, sal FROM emp;
    SELECT /*+ FIRST_ROWS */ * FROM emp;
    SELECT /*+ FIRST_20_ROWS */ * FROM emp;
    SELECT /*+ FIRST_ROWS(100) */ empid, last_name, sal FROM emp;

    System Statistics

    System statistics describe the system's hardware characteristics, such as I/O and CPU performance and utilization, to the query optimizer. When choosing an execution plan, the optimizer estimates the I/O and CPU resources required for each query. System statistics enable the query optimizer to more accurately estimate I/O and CPU costs, enabling the query optimizer to choose a better execution plan.

    When Oracle gathers system statistics, it analyzes system activity in a specified time period (workload statistics) or simulates a workload (noworkload statistics). The statistics are collected using the DBMS_STATS.GATHER_SYSTEM_STATS procedure. Oracle highly recommends that you gather system statistics.

    Note: You must have DBA privileges or the GATHER_SYSTEM_STATISTICS role to update dictionary system statistics.

    EXEC DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'interval', interval => 720, stattab => 'mystats', statid => 'OLTP');
    EXEC DBMS_STATS.IMPORT_SYSTEM_STATS('mystats', 'OLTP');

    Unlike table, index, or column statistics, Oracle does not invalidate already parsed SQL statements when system statistics get updated. All new SQL statements are parsed using new statistics.

    The two gathering options (workload and noworkload) let the gathering process be fitted to the physical database and workload: when workload system statistics are gathered, noworkload system statistics will be ignored. Noworkload system statistics are initialized to default values at the first database startup.


    Workload Statistics

    Workload statistics, introduced in Oracle 9i, gather single and multiblock read times, mbrc, CPU speed (cpuspeed), maximum system throughput, and average slave throughput. The sreadtim, mreadtim, and mbrc are computed by comparing the number of physical sequential and random reads between two points in time from the beginning to the end of a workload. These values are implemented through counters that change when the buffer cache completes synchronous read requests. Since the counters are in the buffer cache, they include not only I/O delays, but also waits related to latch contention and task switching. Workload statistics thus depend on the activity the system had during the workload window. If the system is I/O bound (in both latch contention and I/O throughput), this will be reflected in the statistics and will therefore promote a less I/O intensive plan after the statistics are used. Furthermore, workload statistics gathering does not generate additional overhead.

    In release 9.2, maximum I/O throughput and average slave throughput were added to set a lower limit for a full table scan (FTS).

    To gather workload statistics, either:

    Run the dbms_stats.gather_system_stats('start') procedure at the beginning of the workload window, then the dbms_stats.gather_system_stats('stop') procedure at the end of the workload window.

    Run dbms_stats.gather_system_stats('interval', interval=>N) where N is the number of minutes after which statistics gathering will be stopped automatically.

    To delete system statistics, run dbms_stats.delete_system_stats(). Workload statistics will be deleted and reset to the default noworkload statistics.

    Noworkload Statistics

    Noworkload statistics consist of I/O transfer speed, I/O seek time, and CPU speed (cpuspeednw). The major difference between workload statistics and noworkload statistics lies in the gathering method.

    Noworkload statistics gather data by submitting random reads against all data files, while workload statistics use counters updated when database activity occurs. ioseektim represents the time it takes to position the disk head to read data. Its value usually varies from 5 ms to 15 ms, depending on disk rotation speed and the disk or RAID specification. The I/O transfer speed represents the speed at which one operating system process can read data from the I/O subsystem. Its value varies greatly, from a few MBs per second to hundreds of MBs per second. Oracle uses relatively conservative default settings for I/O transfer speed.

    In Oracle 10g, Oracle uses noworkload statistics and the CPU cost model by default. The values of noworkload statistics are initialized to defaults at the first instance startup:

    ioseektim = 10 ms
    iotrfspeed = 4096 bytes/ms
    cpuspeednw = gathered value, varies based on system
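To see what these defaults imply, the sketch below derives synthetic single-block and multiblock read times from ioseektim and iotrfspeed. The formulas used (seek time plus transfer time for one block, or for mbrc blocks) are the commonly documented derivation and are not stated in this text, so treat them as an assumption. The 8192-byte block size and the multiblock read count of 8 are assumed example values. Python is used here only for the arithmetic.

```python
def synthetic_read_times(ioseektim_ms, iotrfspeed_bytes_per_ms, block_size, mbrc):
    """Return (sreadtim, mreadtim) in milliseconds derived from noworkload statistics.

    Assumed formulas (not taken from the text above):
      sreadtim = ioseektim + block_size / iotrfspeed
      mreadtim = ioseektim + mbrc * block_size / iotrfspeed
    """
    sreadtim = ioseektim_ms + block_size / iotrfspeed_bytes_per_ms
    mreadtim = ioseektim_ms + mbrc * block_size / iotrfspeed_bytes_per_ms
    return sreadtim, mreadtim


# ioseektim and iotrfspeed are the defaults listed above;
# block_size=8192 and mbrc=8 are assumed example values.
sreadtim, mreadtim = synthetic_read_times(10, 4096, 8192, 8)
print(sreadtim, mreadtim)  # 12.0 26.0
```

With the default values, a single-block read is costed at roughly 12 ms and an 8-block multiblock read at roughly 26 ms, which shows why the defaults are described as conservative.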

    If workload statistics are gathered, noworkload statistics will be ignored and Oracle will use workload statistics instead. To gather noworkload statistics, run dbms_stats.gather_system_stats() with no arguments. There will be an overhead on the I/O system during the gathering process of noworkload statistics. The gathering process may take from a few seconds to several minutes, depending on I/O performance and database size.

    The information is analyzed and verified for consistency. In some cases, the value of noworkload statistics may remain at its default value. In such cases, repeat the statistics gathering process or set the value manually to values that the I/O system has according to its specifications by using the dbms_stats.set_system_stats procedure.

    Managing Statistics

    Restoring Previous Versions of Statistics

    Whenever statistics in the dictionary are modified, old versions of statistics are saved automatically for future restoring. Statistics can be restored using the RESTORE procedures of the DBMS_STATS package. These procedures use a time stamp as an argument and restore statistics as of that time stamp. This is useful in case newly collected statistics lead to some sub-optimal execution plans and the administrator wants to revert to the previous set of statistics.

    There are dictionary views that display the time of statistics modifications. These views are useful in determining the time stamp to be used for statistics restoration.

    The catalog view DBA_OPTSTAT_OPERATIONS contains a history of statistics operations performed at schema and database level using DBMS_STATS.

    The *_TAB_STATS_HISTORY views (ALL, DBA, or USER) contain a history of table statistics modifications.

    The old statistics are purged automatically at regular intervals based on the statistics history retention setting and the time of the recent analysis of the system. Retention is configurable using the ALTER_STATS_HISTORY_RETENTION procedure of DBMS_STATS. The default value is 31 days, which means that you would be able to restore the optimizer statistics to any time in the last 31 days.


    Automatic purging is enabled when the STATISTICS_LEVEL parameter is set to TYPICAL or ALL. If automatic purging is disabled, the old versions of statistics need to be purged manually using the PURGE_STATS procedure.

    The other DBMS_STATS procedures related to restoring and purging statistics include:

    PURGE_STATS: This procedure can be used to manually purge old versions beyond a time stamp.

    GET_STATS_HISTORY_RETENTION: This function can be used to get the current statistics history retention value.

    GET_STATS_HISTORY_AVAILABILITY: This function gets the oldest time stamp where statistics history is available. Users cannot restore statistics to a time stamp older than the oldest time stamp.

    When restoring previous versions of statistics, the following limitations apply:

    RESTORE procedures cannot restore user-defined statistics.

    Old versions of statistics are not stored when the ANALYZE command has been used for collecting statistics.

    Note: If you need to remove all rows from a table when using DBMS_STATS, use TRUNCATE instead of dropping and re-creating the same table. When a table is dropped, workload information used by the auto-histogram gathering feature and saved statistics history used by the RESTORE_*_STATS procedures will be lost. Without this data, these features will not function properly.

    Restoring Statistics versus Importing or Exporting Statistics

    The functionality for restoring statistics is similar in some respects to the functionality of importing and exporting statistics. In general, you should use the restore capability when:

    You want to recover older versions of the statistics. For example, to restore the optimizer behaviour to an earlier date.

    You want the database to manage the retention and purging of statistics histories.

    You should use EXPORT/IMPORT_*_STATS procedures when:

    You want to experiment with multiple sets of statistics and change the values back and forth.


    You want to move the statistics from one database to another database. For example, moving statistics from a production system to a test system.

    You want to preserve a known set of statistics for a longer period of time than the desired retention date for restoring statistics.

    Locking Statistics for a Table or Schema

    Statistics for a table or schema can be locked. Once statistics are locked, no modifications can be made to those statistics until the statistics have been unlocked. These locking procedures are useful in a static environment in which you want to guarantee that the statistics never change.

    The DBMS_STATS package provides two procedures for locking and two procedures for unlocking statistics:

    LOCK_SCHEMA_STATS

    LOCK_TABLE_STATS

    UNLOCK_SCHEMA_STATS

    UNLOCK_TABLE_STATS

    EXEC DBMS_STATS.LOCK_SCHEMA_STATS('AP');
    EXEC DBMS_STATS.UNLOCK_SCHEMA_STATS('AP');

    Setting Statistics

    We can set table, column, index, and system statistics using the SET_*_STATS procedures. Setting statistics in this manner is not recommended, because inaccurate or inconsistent statistics can lead to poor performance.

    Dynamic Sampling

    The purpose of dynamic sampling is to improve server performance by determining more accurate estimates for predicate selectivity and statistics for tables and indexes. The statistics for tables and indexes include table block counts, applicable index block counts, table cardinalities, and relevant join column statistics. These more accurate estimates allow the optimizer to produce better performing plans.

    You can use dynamic sampling to:

    Estimate single-table predicate selectivities when collected statistics cannot be used or are likely to lead to significant errors in estimation.

    Estimate statistics for tables and relevant indexes without statistics.

    Estimate statistics for tables and relevant indexes whose statistics are too out of date to trust.


    This dynamic sampling feature is controlled by the OPTIMIZER_DYNAMIC_SAMPLING parameter. For dynamic sampling to automatically gather the necessary statistics, this parameter should be set to a value of 2 (the default) or higher.

    The primary performance attribute is compile time. Oracle determines at compile time whether a query would benefit from dynamic sampling. If so, a recursive SQL statement is issued to scan a small random sample of the table's blocks, and to apply the relevant single table predicates to estimate predicate selectivities. The sample cardinality can also be used, in some cases, to estimate table cardinality. Any relevant column and index statistics are also collected. Depending on the value of the OPTIMIZER_DYNAMIC_SAMPLING initialization parameter, a certain number of blocks are read by the dynamic sampling query.

    For a query that normally completes quickly (in less than a few seconds), we will not want to incur the cost of dynamic sampling.

    However, dynamic sampling can be beneficial under any of the following conditions:

    A better plan can be found using dynamic sampling.

    The sampling time is a small fraction of total execution time for the query.

    The query will be executed many times.

    Dynamic sampling can be applied to a subset of a single table's predicates and combined with standard selectivity estimates of predicates for which dynamic sampling is not done.

    We control dynamic sampling with the OPTIMIZER_DYNAMIC_SAMPLING parameter, which can be set to a value from 0 to 10. The default is 2.

    A value of 0 means dynamic sampling will not be done.

    Increasing the value of the parameter results in more aggressive application of dynamic sampling, in terms of both the type of tables sampled (analyzed or unanalyzed) and the amount of I/O spent on sampling.

    Dynamic sampling is repeatable if no rows have been inserted, deleted, or updated in the table being sampled. The parameter OPTIMIZER_FEATURES_ENABLE turns off dynamic sampling if set to a version prior to 9.2.0.

    Dynamic Sampling Levels

    The sampling levels are as follows if the dynamic sampling level used is from a cursor hint or from the OPTIMIZER_DYNAMIC_SAMPLING initialization parameter:


    Level 0: Do not use dynamic sampling.

    Level 1: Sample all tables that have not been analyzed if the following criteria are met: (1) there is at least 1 unanalyzed table in the query; (2) this unanalyzed table is joined to another table or appears in a subquery or non-mergeable view; (3) this unanalyzed table has no indexes; (4) this unanalyzed table has more blocks than the number of blocks that would be used for dynamic sampling of this table. The number of blocks sampled is the default number of dynamic sampling blocks (32).

    Level 2: Apply dynamic sampling to all unanalyzed tables. The number of blocks sampled is two times the default number of dynamic sampling blocks.

    Level 3: Apply dynamic sampling to all tables that meet Level 2 criteria, plus all tables for which standard selectivity estimation used a guess for some predicate that is a potential dynamic sampling predicate. The number of blocks sampled is the default number of dynamic sampling blocks. For unanalyzed tables, the number of blocks sampled is two times the default number of dynamic sampling blocks.

    Level 4: Apply dynamic sampling to all tables that meet Level 3 criteria, plus all tables that have single-table predicates that reference 2 or more columns. The number of blocks sampled is the default number of dynamic sampling blocks. For unanalyzed tables, the number of blocks sampled is two times the default number of dynamic sampling blocks.

    Levels 5, 6, 7, 8, and 9: Apply dynamic sampling to all tables that meet the previous level criteria using 2, 4, 8, 32, or 128 times the default number of dynamic sampling blocks respectively.

    Level 10: Apply dynamic sampling to all tables that meet the Level 9 criteria using all blocks in the table.

    The sampling levels are as follows if the dynamic sampling level for a table is set using the DYNAMIC_SAMPLING optimizer hint:

    Level 0: Do not use dynamic sampling.

    Level 1: The number of blocks sampled is the default number of dynamic sampling blocks (32).

    Levels 2, 3, 4, 5, 6, 7, 8, and 9: The number of blocks sampled is 2, 4, 8, 16, 32, 64, 128, or 256 times the default number of dynamic sampling blocks respectively.

    Level 10: Read all blocks in the table.
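The two level lists can be turned into a small lookup to compare how many blocks each setting reads. The sketch below only encodes the multipliers quoted above. The function name and its arguments are hypothetical, and it does not model whether Oracle decides to sample a given table at all. Python is used only for the arithmetic.

```python
DEFAULT_BLOCKS = 32  # default number of dynamic sampling blocks, per the text

# Multipliers when the level comes from OPTIMIZER_DYNAMIC_SAMPLING or a cursor hint.
# Levels 3 and 4 use 1x for analyzed tables and 2x for unanalyzed tables.
PARAM_MULTIPLIER = {1: 1, 2: 2, 3: 1, 4: 1, 5: 2, 6: 4, 7: 8, 8: 32, 9: 128}


def blocks_sampled(level, table_hint=False, analyzed=True):
    """Blocks read by the sampling query; None means every block (level 10)."""
    if not 0 <= level <= 10:
        raise ValueError("level must be between 0 and 10")
    if level == 0:
        return 0
    if level == 10:
        return None
    if table_hint:
        # Table-level DYNAMIC_SAMPLING hint: 1x at level 1, doubling up to 256x at level 9
        return DEFAULT_BLOCKS * 2 ** (level - 1)
    if level in (3, 4) and not analyzed:
        return DEFAULT_BLOCKS * 2
    return DEFAULT_BLOCKS * PARAM_MULTIPLIER[level]


for lvl in range(0, 11):
    print(lvl, blocks_sampled(lvl), blocks_sampled(lvl, table_hint=True))
```

At the default level of 2, the parameter setting reads 64 blocks. The same level number in a table-level hint also reads 64 blocks, but the two scales diverge at higher levels.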

    Handling Missing Statistics

    When Oracle encounters a table with missing statistics, Oracle dynamically gathers the necessary statistics needed by the optimizer. However, for certain types of tables, Oracle does not perform dynamic sampling. These include remote tables and external tables. In those cases and also when dynamic sampling has been disabled, the optimizer uses default values for its statistics.

    Default Table Values When Statistics Are Missing

    Table Statistic             Default Value Used by Optimizer
    --------------------------  --------------------------------------------------------
    Cardinality                 num_of_blocks * (block_size - cache_layer) / avg_row_len
    Average row length          100 bytes
    Number of blocks            100 or actual value based on the extent map
    Remote cardinality          2000 rows
    Remote average row length   100 bytes

    Default Index Values When Statistics Are Missing

    Index Statistic             Default Value Used by Optimizer
    --------------------------  -------------------------------
    Levels                      1
    Leaf blocks                 25
    Leaf blocks/key             1
    Data blocks/key             1
    Distinct keys               100
    Clustering factor           800
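As a worked example of the default cardinality formula in the table above, the sketch below plugs in the default block count and row length. The 8192-byte block size and the 24-byte cache layer are assumed values chosen only for illustration. The text does not say what cache_layer is on a real system. Python is used only for the arithmetic.

```python
def default_cardinality(num_of_blocks, block_size, cache_layer, avg_row_len=100):
    """Default row estimate for a table with no statistics, per the formula above."""
    return num_of_blocks * (block_size - cache_layer) / avg_row_len


# 100 blocks and 100-byte rows are the defaults listed in the table;
# block_size=8192 and cache_layer=24 are assumed example values.
estimate = default_cardinality(100, 8192, 24)
print(estimate)  # 8168.0
```

Under these assumptions the optimizer would work with an estimate of about 8,000 rows, however many rows the table really holds.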

    Viewing Statistics

    Statistics on Tables, Indexes and Columns

    Statistics on tables, indexes, and columns are stored in the data dictionary. To view statistics in the data dictionary, query the appropriate data dictionary view (USER, ALL, or DBA). These views include the following:

    DBA_TAB_STATISTICS
    ALL_TAB_STATISTICS
    USER_TAB_STATISTICS
    DBA_TAB_COL_STATISTICS
    ALL_TAB_COL_STATISTICS
    USER_TAB_COL_STATISTICS
    DBA_TAB_HISTOGRAMS
    ALL_TAB_HISTOGRAMS
    USER_TAB_HISTOGRAMS
    DBA_TABLES
    DBA_OBJECT_TABLES
    DBA_INDEXES
    DBA_IND_STATISTICS
    DBA_CLUSTERS
    DBA_TAB_PARTITIONS
    DBA_TAB_SUBPARTITIONS
    DBA_IND_PARTITIONS
    DBA_IND_SUBPARTITIONS
    DBA_PART_COL_STATISTICS
    DBA_PART_HISTOGRAMS
    DBA_SUBPART_COL_STATISTICS
    DBA_SUBPART_HISTOGRAMS

    Viewing Histograms

    Column statistics may be stored as histograms. These histograms provide accurate estimates of the distribution of column data. Histograms provide improved selectivity estimates in the presence of data skew, resulting in optimal execution plans with nonuniform data distributions.

    Oracle uses two types of histograms for column statistics: height-balanced histograms and frequency histograms. The type of histogram is stored in the HISTOGRAM column of the *TAB_COL_STATISTICS views (USER and DBA). This column can have values of HEIGHT BALANCED, FREQUENCY, or NONE.

    Height-Balanced Histograms

    In a height-balanced histogram, the column values are divided into bands so that each band contains approximately the same number of rows. The useful information that the histogram provides is where in the range of values the endpoints fall. Height-balanced histograms can be viewed using the *TAB_HISTOGRAMS views.

    Example for Viewing Height-Balanced Histogram Statistics

    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(
        OWNNAME    => 'OE',
        TABNAME    => 'INVENTORIES',
        METHOD_OPT => 'FOR COLUMNS SIZE 10 QUANTITY_ON_HAND');
    END;
    /

    SELECT COLUMN_NAME, NUM_DISTINCT, NUM_BUCKETS, HISTOGRAM
    FROM USER_TAB_COL_STATISTICS
    WHERE TABLE_NAME = 'INVENTORIES' AND COLUMN_NAME = 'QUANTITY_ON_HAND';

    COLUMN_NAME                    NUM_DISTINCT NUM_BUCKETS HISTOGRAM
    ------------------------------ ------------ ----------- ---------------
    QUANTITY_ON_HAND                        237          10 HEIGHT BALANCED

    SELECT ENDPOINT_NUMBER, ENDPOINT_VALUE
    FROM USER_HISTOGRAMS
    WHERE TABLE_NAME = 'INVENTORIES' AND COLUMN_NAME = 'QUANTITY_ON_HAND'
    ORDER BY ENDPOINT_NUMBER;

    ENDPOINT_NUMBER ENDPOINT_VALUE
    --------------- --------------
                  0              0
                  1             27
                  2             42
                  3             57
                  4             74
                  5             98
                  6            123
                  7            149
                  8            175
                  9            202
                 10            353

    In the query output, each row after the first corresponds to one bucket in the histogram: ENDPOINT_VALUE is the highest value in that bucket, and endpoint 0 holds the lowest column value.
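Because every bucket of a height-balanced histogram holds roughly the same share of rows, the endpoints above can be read as a rough cumulative distribution. The sketch below estimates the fraction of rows at or below a value by linear interpolation inside a bucket. It is an illustration of how to read the endpoints, not Oracle's actual costing algorithm, and it assumes strictly increasing endpoint values. Python is used only for the arithmetic.

```python
from bisect import bisect_right

# ENDPOINT_VALUE for endpoints 0..10, from the query output above
ENDPOINTS = [0, 27, 42, 57, 74, 98, 123, 149, 175, 202, 353]


def fraction_at_or_below(value, endpoints=ENDPOINTS):
    """Estimate the fraction of rows whose column value is <= value."""
    buckets = len(endpoints) - 1
    if value < endpoints[0]:
        return 0.0
    if value >= endpoints[-1]:
        return 1.0
    i = bisect_right(endpoints, value) - 1  # index of the bucket containing value
    lo, hi = endpoints[i], endpoints[i + 1]
    partial = (value - lo) / (hi - lo)  # linear interpolation inside the bucket
    return (i + partial) / buckets


print(fraction_at_or_below(98))   # 0.5
print(fraction_at_or_below(202))  # 0.9
```

The last bucket spans 202 to 353, far wider than the others, so the top 10% of rows are spread over a much larger value range. This is the kind of skew the histogram exposes.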

    Frequency Histograms

    In a frequency histogram, each value of the column corresponds to a single bucket of the histogram. Each bucket contains the number of occurrences of that single value. Frequency histograms are automatically created instead of height-balanced histograms when the number of distinct values is less than or equal to the number of histogram buckets specified. Frequency histograms can be viewed using the *TAB_HISTOGRAMS views.

    Example for Viewing Frequency Histogram Statistics

    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(
        OWNNAME    => 'OE',
        TABNAME    => 'INVENTORIES',
        METHOD_OPT => 'FOR COLUMNS SIZE 20 WAREHOUSE_ID');
    END;
    /

    SELECT COLUMN_NAME, NUM_DISTINCT, NUM_BUCKETS, HISTOGRAM
    FROM USER_TAB_COL_STATISTICS
    WHERE TABLE_NAME = 'INVENTORIES' AND COLUMN_NAME = 'WAREHOUSE_ID';

    COLUMN_NAME                    NUM_DISTINCT NUM_BUCKETS HISTOGRAM
    ------------------------------ ------------ ----------- ---------------
    WAREHOUSE_ID                              9           9 FREQUENCY

    SELECT ENDPOINT_NUMBER, ENDPOINT_VALUE
    FROM USER_HISTOGRAMS
    WHERE TABLE_NAME = 'INVENTORIES' AND COLUMN_NAME = 'WAREHOUSE_ID'
    ORDER BY ENDPOINT_NUMBER;

    ENDPOINT_NUMBER ENDPOINT_VALUE
    --------------- --------------
                 36              1
                213              2
                261              3
                370              4
                484              5
                692              6
                798              7
                984              8
               1112              9
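In a frequency histogram, ENDPOINT_NUMBER is a running total of rows, so the count for each value is the difference between consecutive endpoint numbers. That cumulative reading is background knowledge rather than something stated in this text. The sketch below applies it to the query output above. Python is used only for the arithmetic.

```python
# (ENDPOINT_NUMBER, ENDPOINT_VALUE) rows from the query output above
rows = [(36, 1), (213, 2), (261, 3), (370, 4), (484, 5),
        (692, 6), (798, 7), (984, 8), (1112, 9)]


def value_frequencies(rows):
    """Convert cumulative endpoint numbers into per-value row counts."""
    counts = {}
    previous = 0
    for endpoint_number, value in sorted(rows):
        counts[value] = endpoint_number - previous
        previous = endpoint_number
    return counts


freq = value_frequencies(rows)
total = rows[-1][0]  # the last endpoint number is the total number of rows counted
for value, count in freq.items():
    print(value, count, round(count / total, 3))
```

For example, WAREHOUSE_ID 2 accounts for 213 - 36 = 177 of the 1112 rows, while WAREHOUSE_ID 1 accounts for only 36.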

    Issues

    Exclude dataload tables from your regular stats gathering, unless you know they will be full at the time that stats are gathered.

    Gathering stats for the SYS schema can make the system run slower, not faster.

    Gathering statistics can be very resource intensive for the server, so avoid peak workload times or gather stale stats only.

    Even if gathering is scheduled, it may be necessary to gather fresh statistics after database maintenance or large data loads.

    If a table goes from 1 row to 200 rows, that's a significant change. When a table goes from 100,000 rows to 150,000 rows, that's not a terribly significant change. When a table goes from 1000 rows all with identical values in commonly-queried column X to 1000 rows with nearly unique values in column X, that's a significant change.

    Statistics store information about item counts and relative frequencies, the things that let the optimizer "guess" how many rows will match a given criterion. When it guesses wrong, the optimizer can pick a very suboptimal query plan.

    Source: Internet