Parallel Processing in Oracle 11g Standard Edition
- Sagar Thorat
Parallel Processing in Standard Edition
• Need for Parallel Querying or Parallel Processing
• Using Parallel Processing in Standard Edition
• Examples
• Advantages
Need for Parallel Querying or Parallel Processing
• When you submit a SQL statement to a database engine, by default it is executed serially by a single server process.
Using Oracle Parallel Query (OPQ)
• The parallel query feature allows certain operations (for example, full table scans or sorts) to be performed in
parallel by multiple query server processes.
OPQ
• Partitions table into logical chunks
• Parallel query slaves read the chunks
• Results from each slave are re-assembled and returned.
• Can be much faster than a serial scan for large tables.
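For context, in editions that support OPQ the degree of parallelism can be requested with the PARALLEL hint; a minimal illustration (the SALES table is hypothetical):

```sql
-- Ask Oracle to scan the table with a degree of parallelism of 4.
-- "s" is the table alias referenced by the hint.
SELECT /*+ PARALLEL(s, 4) */ COUNT(*)
FROM   sales s;
```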
OPQ and RAC
• Oracle Real Application Clusters (RAC) allows one to take advantage of a multi-node clustered environment for availability and performance reasons. It is commonly used to access a very large database from different nodes of a cluster.
• If both RAC and OPQ are available one can split operations across multiple CPUs and multiple nodes in a cluster for even further performance improvements.
• One needs to carefully balance the number of users executing parallel query operations, and the degree of parallelism, against the number of CPUs in the system.
Parallel processing in Standard Edition
• As of Oracle Database 11g Release 2, there is a feature that provides parallel processing capabilities in the Standard Edition. This feature is available through the DBMS_PARALLEL_EXECUTE package.
DBMS_PARALLEL_EXECUTE
• The DBMS_PARALLEL_EXECUTE package allows a workload associated with a base table to be broken down into smaller “chunks” which can be run in parallel.
Using DBMS_PARALLEL_EXECUTE
• User schema requires the CREATE JOB system privilege.
• Create a task.
• Create chunks (of a given size) using one of the following:
  CREATE_CHUNKS_BY_ROWID
  CREATE_CHUNKS_BY_NUMBER_COL
  CREATE_CHUNKS_BY_SQL
• Execute the task using RUN_TASK.
• Perform error handling and re-run failed chunks with RESUME_TASK.
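The steps above can be sketched end-to-end; this is a hedged outline only, with task, table, and column names taken from the examples later in this deck:

```sql
BEGIN
  -- 1. Create a named task.
  DBMS_PARALLEL_EXECUTE.create_task(task_name => 'demo_task');

  -- 2. Split the table into ROWID chunks of roughly 10000 rows each.
  DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid(
    task_name   => 'demo_task',
    table_owner => USER,
    table_name  => 'GIGANTIC_TABLE',
    by_row      => TRUE,
    chunk_size  => 10000);

  -- 3. Run the DML against each chunk using 4 scheduler jobs.
  --    :start_id and :end_id are bound to each chunk's ROWID range.
  DBMS_PARALLEL_EXECUTE.run_task(
    task_name      => 'demo_task',
    sql_stmt       => 'UPDATE gigantic_table SET no_2 = no_2 + 1
                       WHERE ROWID BETWEEN :start_id AND :end_id',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 4);

  -- 4. If some chunks failed, re-run only the failed chunks.
  IF DBMS_PARALLEL_EXECUTE.task_status('demo_task')
       = DBMS_PARALLEL_EXECUTE.finished_with_error THEN
    DBMS_PARALLEL_EXECUTE.resume_task('demo_task');
  END IF;
END;
/
```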
Creating Task• CREATE_TASK creates a named task to be managed by
DBMS_PARALLEL_EXECUTE
• E.g.BEGINDBMS_PARALLEL_EXECUTE.create_task (task_name => ‘parallel_task’);END;/
To generate a unique task name automatically, use the GENERATE_TASK_NAME function:
SELECT DBMS_PARALLEL_EXECUTE.generate_task_name FROM dual;
Information about existing tasks is stored in *_PARALLEL_EXECUTE_TASKS views.
Chunking of workload
• 3 methods:
1. CREATE_CHUNKS_BY_ROWID defines, by ROWID, the various chunks of the total set of rows to be modified by the SQL statement.
2. CREATE_CHUNKS_BY_SQL defines, by a user-specified SQL statement, the chunking of data.
3. CREATE_CHUNKS_BY_NUMBER_COL defines, by a numeric column, the chunking of data.
Examples
Creation of base table which will be chunked
GIGANTIC_TABLE.sql
Insertion script :
GIGANTIC_TABLE INSERT.txt
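The referenced scripts are not included in this transcript. A minimal sketch of a compatible base table, with column names inferred from the later examples (ROW_IDENTIFIER is the numeric chunking key, NO_2 the updated column), might look like:

```sql
-- Assumed shape of GIGANTIC_TABLE, inferred from the update examples below.
CREATE TABLE gigantic_table (
  row_identifier NUMBER NOT NULL,
  no_2           NUMBER
);

-- Populate with sequential identifiers (the original deck uses ~12 million
-- rows; a smaller count is shown here).
INSERT INTO gigantic_table (row_identifier, no_2)
SELECT LEVEL, 0 FROM dual CONNECT BY LEVEL <= 100000;
COMMIT;
```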
CREATE_CHUNKS_BY_ROWID
Syntax
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID (
  task_name   IN VARCHAR2,
  table_owner IN VARCHAR2,
  table_name  IN VARCHAR2,
  by_row      IN BOOLEAN,
  chunk_size  IN NUMBER);
The table to be chunked must be a physical table with physical ROWIDs, so views and table functions cannot be chunked. Index-organized tables are not allowed.
CREATE_CHUNKS_BY_ROWID e.g.
BEGIN
  DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid(
    task_name   => 'parallel_task',
    table_owner => 'POOJA10_SPM',
    table_name  => 'GIGANTIC_TABLE',
    by_row      => TRUE,
    chunk_size  => 10000);
END;
/
The by_row parameter is set to TRUE so that the chunk size (next argument) refers to the number of rows, not the number of blocks.
CREATE_CHUNKS_BY_SQL
Syntax:
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_SQL (
  task_name IN VARCHAR2,
  sql_stmt  IN CLOB,
  by_rowid  IN BOOLEAN);
e.g.
CREATE TABLE GIGANTIC_TABLE_CHUNKS (start_id INTEGER, end_id INTEGER);

BEGIN
  INSERT INTO GIGANTIC_TABLE_CHUNKS VALUES (1, 1000000);
  INSERT INTO GIGANTIC_TABLE_CHUNKS VALUES (1000001, 2000000);
  INSERT INTO GIGANTIC_TABLE_CHUNKS VALUES (2000001, 3000000);
  ...
  INSERT INTO GIGANTIC_TABLE_CHUNKS VALUES (11000001, 12000000);
  COMMIT;
END;
/
CREATE_CHUNKS_BY_SQL
DECLARE
  c_chunk_statement CONSTANT VARCHAR2(1000) :=
    'select start_id, end_id from GIGANTIC_TABLE_CHUNKS';
BEGIN
  DBMS_PARALLEL_EXECUTE.create_chunks_by_SQL(
    task_name => 'Parallel_Update2',
    sql_stmt  => c_chunk_statement,
    by_rowid  => FALSE);
END;
/
CREATE_CHUNKS_BY_NUMBER_COL
Syntax :
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_NUMBER_COL (
  task_name    IN VARCHAR2,
  table_owner  IN VARCHAR2,
  table_name   IN VARCHAR2,
  table_column IN VARCHAR2,
  chunk_size   IN NUMBER);
CREATE_CHUNKS_BY_NUMBER_COL e.g.
BEGIN
  DBMS_PARALLEL_EXECUTE.create_chunks_by_number_col(
    task_name    => 'Parallel_Update3',
    table_owner  => 'POOJA10_SPM',
    table_name   => 'GIGANTIC_TABLE',
    table_column => 'ROW_IDENTIFIER',
    chunk_size   => 2400000);
END;
/
Display information about Chunks and Tasks
• Use the data dictionary views USER_PARALLEL_EXECUTE_CHUNKS and USER_PARALLEL_EXECUTE_TASKS to check the details about the individual chunks and the task associated with them.
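For example, one might inspect chunk states and the overall task status with queries like the following (column lists abbreviated):

```sql
-- Count chunks per status for one task.
SELECT status, COUNT(*)
FROM   user_parallel_execute_chunks
WHERE  task_name = 'parallel_task'
GROUP  BY status;

-- Overall status of all tasks in the current schema.
SELECT task_name, status
FROM   user_parallel_execute_tasks;
```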
RUN_TASK
• Syntax
DBMS_PARALLEL_EXECUTE.RUN_TASK (
  task_name                  IN VARCHAR2,
  sql_stmt                   IN CLOB,
  language_flag              IN NUMBER,
  edition                    IN VARCHAR2 DEFAULT NULL,
  apply_crossedition_trigger IN VARCHAR2 DEFAULT NULL,
  fire_apply_trigger         IN BOOLEAN  DEFAULT TRUE,
  parallel_level             IN NUMBER   DEFAULT 0,
  job_class                  IN VARCHAR2 DEFAULT 'DEFAULT_JOB_CLASS');
Run_task e.g.
DECLARE
  c_update_statement CONSTANT VARCHAR2(1000) :=
    'UPDATE GIGANTIC_TABLE gt
     SET gt.no_2 = gt.no_2 + 1
     WHERE ROW_IDENTIFIER BETWEEN :start_id AND :end_id';
BEGIN
  DBMS_PARALLEL_EXECUTE.run_task(
    task_name      => 'Parallel_Update2',
    sql_stmt       => c_update_statement,
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 4);

  CASE DBMS_PARALLEL_EXECUTE.task_status(task_name => 'Parallel_Update2')
    WHEN DBMS_PARALLEL_EXECUTE.chunking_failed THEN
      dbms_output.put_line('chunking_failed');
    WHEN DBMS_PARALLEL_EXECUTE.finished THEN
      dbms_output.put_line('finished');
    WHEN DBMS_PARALLEL_EXECUTE.finished_with_error THEN
      dbms_output.put_line('finished_with_error');
    WHEN DBMS_PARALLEL_EXECUTE.crashed THEN
      dbms_output.put_line('crashed');
  END CASE;
END;
/
Task and Chunk details
• By Rowid
  CHUNK_BY_ROW_ID_CHUNK_DETAILS
  CHUNK_BY_ROW_ID_JOB_DETAILS
  CHUNK_BY_ROW_ID_TASK_DETAILS
• By SQL
  CHUNK_BY_SQL_CHUNK_DETAILS
  CHUNK_BY_SQL_TASK_DETAILS
• By Number Column
  CHUNK_BY_NUMBER_COLUMN_CHUNK_DETAILS
  CHUNK_BY_NUMBER_COLUMN_TASK_DETAILS
  CHUNK_BY_NUMBER_COLUMN_JOB_DETAILS
User-defined framework
• Rather than simply running the whole task with RUN_TASK, you can control chunk execution yourself: fetch a specific chunk with the GET_ROWID_CHUNK or GET_NUMBER_COL_CHUNK procedure and then execute it with EXECUTE IMMEDIATE.
• You can then immediately resolve any errors and decide whether you want to commit the changes.
User-defined Framework.txt
USER_DEFINED_CHUNK_DETAILS
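The referenced framework file is not included in this transcript. A hedged sketch of the manual chunk-processing loop described above, reusing the task and table names from earlier slides, might look like:

```sql
DECLARE
  l_chunk_id NUMBER;
  l_start_id ROWID;
  l_end_id   ROWID;
  l_any_rows BOOLEAN;
BEGIN
  LOOP
    -- Fetch the next unassigned ROWID chunk for this task.
    DBMS_PARALLEL_EXECUTE.get_rowid_chunk(
      task_name   => 'parallel_task',
      chunk_id    => l_chunk_id,
      start_rowid => l_start_id,
      end_rowid   => l_end_id,
      any_rows    => l_any_rows);
    EXIT WHEN NOT l_any_rows;

    BEGIN
      -- Process this chunk's ROWID range.
      EXECUTE IMMEDIATE
        'UPDATE gigantic_table
         SET no_2 = no_2 + 1
         WHERE ROWID BETWEEN :s AND :e'
        USING l_start_id, l_end_id;
      DBMS_PARALLEL_EXECUTE.set_chunk_status(
        'parallel_task', l_chunk_id, DBMS_PARALLEL_EXECUTE.processed);
      COMMIT;
    EXCEPTION
      WHEN OTHERS THEN
        -- Record the failure so the chunk can be retried later.
        DBMS_PARALLEL_EXECUTE.set_chunk_status(
          'parallel_task', l_chunk_id,
          DBMS_PARALLEL_EXECUTE.processed_with_error, SQLCODE, SQLERRM);
        COMMIT;
    END;
  END LOOP;
END;
/
```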
STATUS
• Constants are available to get the chunk status and task status.
• You can set a chunk's status manually using the SET_CHUNK_STATUS procedure.
• The TASK_STATUS function returns the status of the task (possible values include FINISHED, FINISHED_WITH_ERROR, CHUNKING_FAILED and CRASHED).
• If the status is not FINISHED, you can resume the task using the RESUME_TASK procedure.
• You can stop a running task using the STOP_TASK procedure, and restart it using the RESUME_TASK procedure.
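A brief illustration of stopping and resuming a task (the task name is assumed from the earlier CREATE_TASK example):

```sql
BEGIN
  -- Stop a running task; chunks already processed keep their status.
  DBMS_PARALLEL_EXECUTE.stop_task('parallel_task');
END;
/

BEGIN
  -- Resume: only chunks not yet in PROCESSED state are re-run.
  DBMS_PARALLEL_EXECUTE.resume_task('parallel_task');
END;
/
```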
DROP TASK
• After the job is completed, you can drop the task.
• The associated chunk information is also lost.
BEGIN
  DBMS_PARALLEL_EXECUTE.drop_task('parallel_task');
END;
/
Advantages of DBMS_PARALLEL_EXECUTE
• You lock only one set of rows at a time, for a relatively short time, instead of locking the entire table.
• You do not lose work that has been done if something fails before the entire operation finishes.
• You reduce undo (rollback) space consumption, since each chunk commits separately.
• You can re-run failed chunks without repeating the whole job.
• You improve performance through parallel execution.
– Thank You !