Controlling resources in an Exadata environment
Agenda
• Smart IO
• IO Resource Manager
• Compression
• Hands-on time
• Exadata Security
• Flash Cache
• Storage Indexes
• Parallel Execution
Agenda
• Smart IO
• IO Resource Manager
• Compression
SMART IO
How we (used to) read and write data
• Data read and write has many forms
• SQL statements
• Full and Incremental Backups
• Restore of backups
• Loading of data
• Creation of tablespaces/datafiles
• Exadata is designed to do this fast
• Latest and greatest in hardware
• Optimized software to work together with the hardware
Are fast nodes and storage enough?
• NO!
• The speed is determined by the weakest link
• Processing speed of the database nodes
• Processing speed of the storage environment
• The storage network and components that tie them together
• 4Gb fiber is not fast enough
• 4 single thread sessions can easily use throughput of a 4Gb card
Can we go any faster besides hardware?
• Yes!
• Limit the amount of processing done on the database nodes
• Scanning full tables on the storage, not on the DB nodes
• Only retrieve columns and rows that you actually need
• Encrypt and Decrypt on the storage side, not on the DB
• Transfer tasks (writing zeros to datafiles) to the storage
• Free CPU power from the DB nodes
• Pay less in license fees because you need fewer DB nodes
• Waste less power and generate less heat because you have fewer systems
• Pay less because you need less iron in your environment
• Fully use the CPUs on the storage side
Smart IO applications
• Smart scan
• Query 1 TB and only receive and process the actual results, not the full 1 TB
• Retrieve parts from flash and parts from disk
• Smart file creation and block formatting
• Let the storage write the 5 TB of zeros for the new datafiles
instead of the database nodes. Parallel and faster!
• Both new tablespaces and RMAN restores benefit
• Smart incremental backup
• Let the storage decide which blocks to back up. Parallel and fast, so
no more full database scans from the RMAN process
• Smart Scans on Encrypted Columns and Tablespaces
• Smart Scans for Data Mining Scoring
Smart Scan, first get the basics
This is generic Oracle business
• Database sessions read one or more blocks in the SGA
• After reading/processing, the block stays in the SGA
• Sessions can re-use the block in the SGA
• Buffer space is managed with a least recently used (LRU) algorithm
• Database sessions read blocks in the PGA
• Amount of data too large to fit in the SGA
• Called 'direct reads', stored in the session's PGA
• Blocks evaluated as they come in
• Blocks deleted after usage (new query means reading again)
On Exadata, Smart Scan kicks in for the Direct Reads
which are the most resource intensive queries
Datastream in an Oracle Database
Smart Scan – functional summary
• Smart scan is implemented by function shipping
• Statement is processed by the instance
• Block IDs are determined per Exadata cell
• Predicates (where clause) and block IDs are shipped to the cell
• Cell processes the blocks and returns the filtered rows
• Cell has libraries to understand Oracle block format
• Predicate evaluation (functions in the where clause)
• Column selection
• Join filtering through bloom filters
• Works on compressed and uncompressed blocks
• Tablespace and column encryption is supported
Smart Scan – What happens ?
• Database decides Direct Read is needed
• Too much data to fit in the SGA
• Determines list of blocks that need to be accessed
• Either indexes or tables
• Database detects all required data is on Exadata
• Creates list of blocks per Exadata storage cell
• Ships list of blocks, the required columns and applicable
'where' predicates to the Exadata storage cell
Smart Scan – What happens (cont..)
• Exadata storage cells retrieve requested blocks
• Parallel on cells and parallel in multiple threads per cell
• Based on column requirements and where predicates, retrieve
data from the blocks
• Gather retrieved data and create Oracle-like blocks
• Ship blocks with data to the database node(s)
• Database receives virtual blocks from all cells
• Gathers them in the PGA and determines the result for the session
• Sends the result to the session and deletes the virtual blocks
Query 1 TB and receive only 10 GB in the DB nodes!
Smart scan – when is it used?
• Optimizer does not decide to use smart scan
• It is a run-time decision
• First, a scan decides if direct reads can be used
• Decision based on
• Table size
• Number of dirty buffers
• Amount of data already cached
• Other heuristics (see manual)
• The behavior is the same as non-Exadata behavior
Smart scan – when is it used? Cont.
• Setting of CELL_OFFLOAD_PROCESSING parameter
• TRUE / FALSE
• All the files of a tablespace need to reside
on Exadata storage
• Smart scans are used for scans
• In sub-queries and in-lines as well
• Used for the following row sources:
• Table scan
• Index (fast full) scan
• Bitmap index scan
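As a sketch, offload can be toggled per session through the CELL_OFFLOAD_PROCESSING parameter, which is handy for before/after comparisons (the table and predicate below are illustrative):

```sql
-- Disable smart scan offload for the current session (e.g. for a baseline run)
ALTER SESSION SET cell_offload_processing = FALSE;

-- Re-enable it (TRUE is the default)
ALTER SESSION SET cell_offload_processing = TRUE;

-- A full scan that is a run-time candidate for smart scan
SELECT /*+ FULL(s) */ COUNT(*)
FROM   sales s
WHERE  s.amount_sold > 10000;
```

Remember that even with the parameter set to TRUE, the direct-read heuristics above still decide at run time whether a smart scan actually happens.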
Smart Scan – Predicting offload
• A stable plan is important
• Explain plan should not change in a running environment
• No additional parsing for the same statement
• Explain plans helps you see Exadata offload
• Operations that could be offloaded
• Predicates that could be offloaded
• Joins that could be offloaded though Bloom filtering
• A certain Explain Plan does not guarantee offloading !
For more information on Oracle’s Bloom filtering, see http://antognini.ch/papers/BloomFilters20080620.pdf
Explain plan example
-------------------------------------------
| Id  | Operation                   | Name  |
-------------------------------------------
|   0 | SELECT STATEMENT            |       |
|*  1 |  HASH JOIN                  |       |
|*  2 |   HASH JOIN                 |       |
|*  3 |    TABLE ACCESS STORAGE FULL| SALES |
|*  4 |    TABLE ACCESS STORAGE FULL| SALES |
|*  5 |   TABLE ACCESS STORAGE FULL | SALES |
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------------------------------------
1 - access("T"."CUST_ID"="T2"."CUST_ID" AND "T1"."PROD_ID"="T2"."PROD_ID"
AND "T1"."CUST_ID"="T2"."CUST_ID")
2 - access("T"."PROD_ID"="T1"."PROD_ID")
3 - storage("T1"."PROD_ID"<200 AND
"T1"."AMOUNT_SOLD"*"T1"."QUANTITY_SOLD">10000 AND "T1"."PROD_ID"<>45)
filter("T1"."PROD_ID"<200 AND
"T1"."AMOUNT_SOLD"*"T1"."QUANTITY_SOLD">10000 AND "T1"."PROD_ID"<>45)
4 - storage("T"."PROD_ID"<200 AND "T"."PROD_ID"<>45)
filter("T"."PROD_ID"<200 AND "T"."PROD_ID"<>45)
5 - storage("T2"."PROD_ID"<200 AND "T2"."PROD_ID"<>45)
filter("T2"."PROD_ID"<200 AND "T2"."PROD_ID"<>45)
Manipulating Explain Plan output
• CELL_OFFLOAD_PLAN_DISPLAY parameter
• AUTO (default)
• Explain plan will show predicate offload only if tablespace
resides on Exadata storage
• ALWAYS
• Explain plan will show predicate offload whether the
tablespace resides on Exadata storage or not
• NEVER
• Explain Plan will never indicate predicate offload even if the
tablespace resides on Exadata storage
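A minimal sketch of using this parameter (table and predicate are illustrative):

```sql
-- Show storage() predicates in the plan even when the data
-- does not reside on Exadata storage
ALTER SESSION SET cell_offload_plan_display = ALWAYS;

EXPLAIN PLAN FOR
  SELECT cust_id FROM sales WHERE prod_id < 200;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

With ALWAYS you can preview which predicates would be offloadable before the data is moved to Exadata storage.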
Detecting Scan offloads
• Trace of the session executing the statement
• Querying (G)V$ views
• (G)V$SYSSTAT
• (G)V$SQL
• (G)V$SESSTAT
• etc
Example: V$SYSSTAT
• cell physical IO interconnect bytes
• Bytes transferred between the storage nodes and the
database nodes
• physical IO disk bytes
• Bytes physically read on the Exadata storage nodes. This
includes IO performed both for block IO and for smart
scans.
• cell physical IO bytes eligible for
predicate offload
• Bytes that were processed by the smart scan process using
the column list and the where predicates
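These statistics can be queried directly; a minimal sketch (values are instance-wide and cumulative since startup):

```sql
-- Cumulative smart-scan statistics for the instance
SELECT name,
       ROUND(value / 1024 / 1024 / 1024, 2) AS gbytes
FROM   v$sysstat
WHERE  name IN ('cell physical IO interconnect bytes',
                'cell physical IO bytes eligible for predicate offload');
```

Because the counters are cumulative, take a snapshot before and after the statement of interest and work with the deltas.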
V$SYSSTAT values and efficiency
select name
from table
where col >= 100
Phys. IO: 10 GB
Phys. IO eligible for offload: 10 GB
Phys. IO interconnect: 2 GB
Efficiency: 2 GB / 10 GB = 20%
V$SYSSTAT values and efficiency
select a.name, b.*
from table a, table b
where a.id = b.id
and a.col >= 100
Phys. IO: 10 GB
Phys. IO eligible for offload: 5 GB
Phys. IO interconnect: 5 GB
Efficiency (offloaded portion): 1 GB / 5 GB = 20%
Efficiency (total interconnect): 5 GB / 10 GB = 50%
V$SQL makes it easier
• We can use the following columns
• physical_read_bytes
• How much data was read by the cell
• io_interconnect_bytes
• How much data was transported through the interconnect
• io_cell_offload_eligible_bytes
• How much of the physical read data was processed in the
cell
• io_cell_offload_returned_bytes
• How much of the processed data was actually returned to
the DB
• This is per statement
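A sketch of a per-statement efficiency query over these V$SQL columns (the saving percentage expression is illustrative):

```sql
-- Statements that were eligible for smart scan, largest readers first
SELECT sql_id,
       physical_read_bytes,
       io_cell_offload_eligible_bytes,
       io_interconnect_bytes,
       -- share of the data read that never crossed the interconnect
       ROUND(100 * (1 - io_interconnect_bytes /
                        NULLIF(physical_read_bytes, 0)), 1) AS offload_saving_pct
FROM   v$sql
WHERE  io_cell_offload_eligible_bytes > 0
ORDER  BY physical_read_bytes DESC;
```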
Smart scan inside the cell
• Smart scan is handled by the Cellsrv process on the cell
• Cellsrv is
• Multi-threaded
• Serves block IO and smart IO
• Runs a piece of RDBMS code to support smart IO
• Can provide storage to one or more databases
• Does not communicate with other cells
Predicate disk data flow
• Jobs can execute concurrently
• Concurrent IOs can be issued for a single RDBMS client
• Concurrent filter jobs can be applying predicates
• Exadata adds another level of parallelism in query processing
[Diagram: predicate disk data flow]
• PredicateDiskRead – IO jobs, issue IOs
• PredicateFilter – filters raw data
• PredicateCachePut – queues new IO requests
• PredicateCacheGet – sends results back
Other Smart improvements
• Smart file creation
• Offloads the process of formatting new blocks to the cell storage
• Block IDs (instead of formatted blocks) are shipped to the cells
• Smart file creation is used whenever a file is created
• Tablespace creation
• File resize (increase in size)
• RMAN restore
• Statistics involved (V$SYSSTAT)
• cell physical IO bytes saved during optimized file creation
• cell physical IO bytes saved during optimized RMAN file
restore
Other Smart improvements
• Smart incremental backup
• Offloads identifying the blocks to back up (by SCN) to the Exadata cell
• Used automatically
• Unless Fast Incremental Backup feature is used
• V$BACKUP_DATAFILE for smart incremental backup
• BLOCKS_SKIPPED_IN_CELL
• Number of blocks that were read and filtered by the cells to
optimize the RMAN incremental backup.
• BLOCKS
• Size of the backup data file in blocks.
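These two columns can be combined into a rough per-datafile filtering ratio; a sketch (the percentage expression is illustrative):

```sql
-- How much of each datafile the cells filtered out of the
-- incremental backup, most recent backups first
SELECT file#,
       blocks,
       blocks_skipped_in_cell,
       ROUND(100 * blocks_skipped_in_cell /
             NULLIF(blocks + blocks_skipped_in_cell, 0), 1) AS pct_filtered_in_cell
FROM   v$backup_datafile
ORDER  BY completion_time DESC;
```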
New wait events
• Smart Scan
• cell smart table scan – Database is waiting for table scans to
complete on a cell.
• cell smart index scan – Database is waiting for index or index-
organized table (IOT) fast full scans.
• Smart file creation
• cell smart file creation – Event appears when the database is waiting
for the completion of a file creation on a cell.
• cell smart restore from backup – Event appears when the database
is waiting for the completion of a file initialization for restore from
backup on a cell.
• Smart incremental backup
• cell smart incremental backup – Event appears when the database is
waiting for the completion of an incremental backup on a cell.
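A quick way to see how much time an instance spends on these events is a sketch like the following against V$SYSTEM_EVENT:

```sql
-- Cumulative waits on the Exadata smart events since instance startup
SELECT event,
       total_waits,
       ROUND(time_waited_micro / 1e6, 1) AS seconds_waited
FROM   v$system_event
WHERE  event LIKE 'cell smart%'
ORDER  BY time_waited_micro DESC;
```

The same filter works against V$SESSION_EVENT to narrow it down to a single session.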
Exadata Smart Features - Summary
• Long running actions benefit the most
• Smart Scans for optimizing full table/index/bitmap scans
• Smart file creation for datafile creation and restoring backups
• Smart Incremental Backups for incremental backup creations
• Explain plan displays possible offloads
• Does not guarantee offload
• Indicator 'STORAGE' shows offload options
• Various new wait events
• Do not get scared if you see them
Q U E S T I O N S
A N S W E R S
On Exadata Smart Features
Agenda
• Smart IO
• IO Resource Manager
• Compression
IO RESOURCE MANAGEMENT
Why Would Customers Be Interested
in I/O Resource Manager?
• Exadata Storage can be shared by multiple types of workloads and multiple databases
• Sharing lowers administration costs
• Sharing leads to more efficient usage of storage
• But, workloads may not happily coexist
• ETL jobs interfere with DSS query performance
• One production data warehouse can interfere with another
• Extraordinary query performance also means that one query can utilize all of Exadata's I/O bandwidth!
• Non-priority queries can substantially impact the performance of critical queries
• Customers will need a way to control these workloads
Consequence of I/O Bandwidth Limits
[Diagram: a production database wants 15 GB/s and a development database wants 0.2 + 15 GB/s, for a combined desired bandwidth of 0.2 + 15 + 15 = 30.2 GB/s, but the storage network and storage only provide 21 GB/s.]
IO Resource Manager solves the problem
[Diagram: the same workloads with IORM enabled; the 30.2 GB/s of desired bandwidth is scheduled into the available 21 GB/s. One workload is throttled from 15 to 12.8 GB/s, another from 15 to 8 GB/s, and the 200 MB/s workload keeps its 0.2 GB/s, for an actual total of 0.2 + 12.8 + 8 = 21 GB/s.]
When Does I/O Resource Manager
Help the Most?
• Conflicting Workloads
• Multiple consumer groups
• Multiple databases
• Concurrent database administration
• Backup, ETL, File creation etc
• Of course only if I/O is a bottleneck
• Significant proportion of the wait events are for I/O
• Including the CELL WAIT events
I/O Scheduling, the Traditional Way
• With traditional storage, I/O schedulers are black boxes
• You cannot influence their behavior !
• I/O requests are processed in FIFO order
• Some reordering may be done to improve disk efficiency
[Diagram: high-priority (H) and low-priority (L) RDBMS I/O requests arrive interleaved in a single FIFO disk queue on a traditional storage server.]
I/O Scheduling, the Exadata Way
• Exadata limits the number of outstanding I/O requests
• Issues enough I/Os to keep the disks performing efficiently
• The limit prevents a low-priority intensive workload from flooding
the disk
• Subsequent I/O requests are internally queued
• Exadata dequeues I/O requests based on the database and the
user's resource plans
• Inter-database plans specify bandwidth between multiple
databases
• Intra-database plans specify bandwidth between
workloads within a database
I/O Scheduling, the Exadata Way
[Diagram: the Finance and Sales data warehouse databases each maintain a high-priority and a low-priority consumer group queue on the Exadata cell; the I/O Resource Manager dequeues from these queues according to the Finance and Sales priorities.]
Plans are known by the CellSRV
• Inter-database (or between databases)
• Specified through CellCLI and pushed to CELLSRV
• Intra-database (or inside a database)
• Pushed by the RDBMS to CELLSRV
• Intra-database plans are regular Resource Manager plans
• I/Os are tagged
• Every ASM/RDBMS I/O is tagged with the sender identity
(Database ID / Consumer Group ID)
• CELLSRV uses a Resource Manager component (similar to the RDBMS) to schedule I/Os
Setting up IO Resource Management
• Inter-Database resource plans
• Setup on the Exadata cell
• Enabled / disabled on Exadata cell level
• Intra-Database resource plans
• IORM on Exadata cell must be enabled
• Inter-database plans not required
• Setup Resource Manager on the database
• Map sessions to consumer groups
• Create database resource plan
• Add permissions for users
• Enable Database Resource plan
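The intra-database steps above can be sketched with DBMS_RESOURCE_MANAGER; the group names, user names and the 75/25 split match the DEV/TEST example later in this section, while the plan name and mapped users are illustrative:

```sql
BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  -- Consumer groups for the two workloads
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP('DEV',  'Development sessions');
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP('TEST', 'Test sessions');

  -- Map sessions to groups, here by Oracle user name (users are hypothetical)
  DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
    DBMS_RESOURCE_MANAGER.ORACLE_USER, 'DEV_USER',  'DEV');
  DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
    DBMS_RESOURCE_MANAGER.ORACLE_USER, 'TEST_USER', 'TEST');

  -- Plan with the 75/25 split
  DBMS_RESOURCE_MANAGER.CREATE_PLAN('DEVTST_PLAN', 'Dev/Test IO split');
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE('DEVTST_PLAN', 'DEV',
    'Development', mgmt_p1 => 75);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE('DEVTST_PLAN', 'TEST',
    'Test', mgmt_p1 => 25);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE('DEVTST_PLAN', 'OTHER_GROUPS',
    'Everything else', mgmt_p1 => 0);

  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/

-- Permission for users to switch into their groups
BEGIN
  DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP('DEV_USER', 'DEV', FALSE);
  DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP('TEST_USER', 'TEST', FALSE);
END;
/

-- Enable the plan; the RDBMS pushes it to CELLSRV for intra-database IORM
ALTER SYSTEM SET resource_manager_plan = 'DEVTST_PLAN';
```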
What can we limit, and how? Per database or resource group
• On disk level
• Minimum % of IO
• Only kicks in when workloads are fighting over disk IOs
• Speed may vary but has a guaranteed minimum
• Limited % of IO
• Always limits IO to a certain %
• Even if the system is not being used otherwise
• On Flash Level
• No need to limit flash I/O – fast enough
• Limit access to Flash storage space
How to Configure Inter-Database IORM
• Specify DB name, level and allocation %
• Per level, 100% is available
• Various options are available
• OTHER is wildcard for non-specified databases
CellCLI> alter iormplan
dbplan = ((name = production, level=1, allocation=100),
(name = test, level=2, allocation=80),
(name = other, level=2, allocation=20))
IORMPLAN successfully altered
CellCLI> alter iormplan active
IORMPLAN successfully altered
Limiting resources on Disk
Starting patchset 11.2.0.2
• Between users inside a database
• Uses regular Database Resource Management
• The directive "max_utilization_limit" applies to both CPU and IO
• Between databases
• Use the "limit" parameter in the IORMPlan
• Example:
ALTER IORMPLAN
dbplan=((name=prod, limit=75),
(name=test, limit=20),
(name=other, limit=5))
Control access to flash cache
Starting patchset 11.2.0.2
• Prevent databases from accessing the flash cache
• Low-priority databases, test databases etc.
• New attribute introduced for IORM: FlashCache
CellCLI> ALTER IORMPLAN
dbplan=((name=prod, flashCache=on),
(name=dev, flashCache=on),
(name=test, flashCache=off),
(name=other, flashCache=off))
Measuring the Benefit from IORM
• Method 1: Monitor performance of target workload
• Demonstrates the effectiveness of IORM to the workload
owner
• Measure the target workload's query times or transaction rates
• Method 2: Monitor I/O statistics of target workload
• Characterizes I/O traffic per workload for each cell
• Demonstrates the effect of IORM on each workload
• Measure using Exadata IORM metrics
Example
• Production database
• Users doing normal production
• Development / Test database
• Schemas used for testing
• Schemas used for development
• Both databases run on same cluster
Before IO Resource Management
[Chart: relative throughput (0–100%) of PROD, TEST and DEV over time, without IORM]
Creating an IORM Schema
DB name | Resource group | Inter % (Exadata cell) | Intra % (Resource group)
PROD    | -              | Level=1, alloc=60      | -
DEVTST  | DEV            | Level=2, alloc=100     | 75
DEVTST  | TEST           | Level=2, alloc=100     | 25
• Setup on Exadata cell
• PROD=60, DEVTST=40, OTHER=0
• Setup inside Database Resource Mgr
• Resource group DEV = 75%
• Resource group TEST = 25%
With and without IORM
[Chart: relative throughput (0–100%) of PROD, TEST and DEV over time, without IORM (left) and with IORM (right)]
IO Resource Manager - Summary
• IORM manages load effectively
• Within a database
• Between databases
• Only if IO is the bottleneck, check AWR for this
• To verify IORM, monitor your critical workload
• With and without other workloads
• With and without IORM
• Use IORM metrics to monitor I/O rates and wait times
• Or use your own Metrics / end user
Q U E S T I O N S
A N S W E R S
On IO Resource Manager
Agenda
• Smart IO
• IO Resource Manager
• Compression
COMPRESSION
Data Growth Challenges
• IT must support exponentially growing amounts of data
• Explosion in online access and content
• Government data retention regulations
• Performance often declines as data grows
• IT budgets are flat or decreasing
• Need to grow data
• Without hurting performance
• Without growing cost
• Powerful and efficient compression is key
Oracle Database Compression Overview – Compress All Your Data

Compression Feature                       | Application Fit           | Availability
Table Compression                         | Data Warehouses           | Oracle 8
OLTP Table Compression                    | Actively updated data     | Database 11g Advanced Compression
SecureFiles Compression                   | Unstructured (file) data  | Database 11g Advanced Compression
SecureFiles Deduplication                 | Unstructured (file) data  | Database 11g Advanced Compression
RMAN Backup Compression                   | All applications          | Database 11g Advanced Compression
Data Pump Compression                     | All applications          | Database 11g Advanced Compression
Data Guard Redo Transport Compression     | All applications          | Database 11g Advanced Compression
Network Compression                       | All applications          | Database 11g Advanced Compression
Compress for Query / Compress for Archive | Built on Hybrid Columnar Compression technology | NEW
Hybrid Columnar Compression
• Hybrid Columnar Compressed Tables
• New approach to compressed table storage
• Useful for data that is bulk loaded and queried
• Update activity is light
• How it Works
• Tables are organized into Compression Units
• CUs are larger than database blocks
• Usually around 32K
• Within Compression Unit, data is organized by
column instead of by row
• Column organization brings similar values
close together, enhancing compression
Compression Unit
10x to 15x Reduction
Hybrid Columnar Compression Technology Overview
• Compression Unit
• Logical structure spanning multiple database blocks
• Data organized by column during data load
• Each column compressed separately
• All column data for a set of rows stored in compression unit
• Typically 32k (4 blocks x 8k block size)
[Diagram: a logical Compression Unit with a CU header spanning four database blocks; each column (C1–C8) is stored and compressed contiguously across the blocks.]
Compress for Query Built on Hybrid Columnar Compression
• 10x average storage savings
• 100 TB Database compresses to 10 TB
• Reclaim 90 TB of disk space
• Space for 9 more '100 TB' databases
• 10x average scan improvement
• 1,000 IOPS reduced to 100 IOPS
Compress for Archive Built on Hybrid Columnar Compression
• Compression algorithm optimized for max storage savings
• Benefits any application with data retention requirements
• Best approach for ILM and data archival
• Minimum storage footprint
• No need to move data to tape or less expensive disks
• Data is always online and always accessible
• Run queries against historical data (without recovering from tape)
• Update historical data
• Supports schema evolution (add/drop columns)
Compress for Archive
• Optimal workload characteristics for Archive Compression
• Any application (OLTP, Data Warehouse)
• Cold or Historical Data
• Data loaded with bulk load operations
• Minimal access and update requirements
• Locking is per Compression Unit instead of per record
• 15x average storage savings
• 100 TB Database compresses to 6.6 TB
• Keep historical data online forever
• Up to 70x savings seen on production customer data
Compression in Exadata ILM and Data Archiving Strategies
• OLTP Applications
• Table Partitioning
• Heavily accessed data (read and write)
• Partitions using OLTP Table Compression
• Cold or historical data
• Partitions using Compress for Archive
• Data Warehouses
• Table Partitioning
• Heavily accessed data (read)
• Partitions using Compress for Query
• Cold or historical data
• Partitions using Compress for Archive
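The partitioning strategy above can be sketched in DDL; the table, partition names and date boundaries are illustrative:

```sql
-- Hot partition uses OLTP compression; colder partitions use HCC
CREATE TABLE orders (
  order_id   NUMBER,
  order_date DATE,
  amount     NUMBER
)
PARTITION BY RANGE (order_date) (
  PARTITION p2009 VALUES LESS THAN (DATE '2010-01-01') COMPRESS FOR ARCHIVE HIGH,
  PARTITION p2010 VALUES LESS THAN (DATE '2011-01-01') COMPRESS FOR QUERY HIGH,
  PARTITION p2011 VALUES LESS THAN (MAXVALUE)          COMPRESS FOR OLTP
);

-- An existing partition can be recompressed as it goes cold
ALTER TABLE orders MOVE PARTITION p2010 COMPRESS FOR ARCHIVE LOW;
```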
Hybrid Columnar Compression Outside of Exadata
• Only supported for data stored on
• Exadata Storage Cells
• Exadata Database Machines
• Exadata Sparc Super Cluster
• Oracle ZFS Storage Appliance systems
• Oracle Pillar Axiom storage systems
• Storing and accessing HCC data on other systems
• Storage is possible, including Data Guard, recovery etc.
• Access is only possible after decompression of the data
• Can be done on any 11gR2+ system
Hybrid Columnar Compression Business as Usual
• Fully supported with…
• B-Tree, Bitmap Indexes, Text indexes
• Materialized Views
• Exadata Server and Cells including offload
• Partitioning, Parallel Query, PDML, PDDL
• Schema Evolution support, online, add/drop columns
• Data Guard Physical and Logical Standby (>11.2) Support
• Data only accessible if standby supports HCC too !
• Streams is not supported
• GoldenGate will be supported soon !
Hybrid Columnar Compressed Tables Details
• Data loaded using Direct Load uses EHCC
• Parallel DML
• INSERT /*+ APPEND */
• Direct Path SQL*Loader
• Optimized algorithms avoid or greatly reduce
overhead of decompression during query
• Individual row lookups consume more CPU than row
format
• Need to reconstitute row from columnar format
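A minimal sketch of a direct-path load into an HCC table (table names are illustrative):

```sql
-- HCC is applied only to data loaded via direct path into a table
-- defined with an HCC compression clause
CREATE TABLE sales_hist COMPRESS FOR QUERY HIGH
AS SELECT * FROM sales WHERE 1 = 0;

INSERT /*+ APPEND */ INTO sales_hist
SELECT * FROM sales;
COMMIT;
```

A conventional INSERT into the same table would not get full HCC compression, as the next slide explains.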
Hybrid Columnar Compressed Tables Details continued..
• Updated rows automatically migrate
• to lower compression level to support frequent
transactions
• Table size will increase moderately
• All un-migrated rows in Compression Unit are locked
during migration
• Row will get a new ROWID after update
• Data loaded using Conventional Insert (non-bulk)
uses the lower compression level
Hybrid Columnar Compressed Tables Details continued..
• Specialized columnar query processing engine
• Runs in Exadata Storage Server to run directly against
compressed data
• Column optimized processing of query projection and
filtering
• Result is returned uncompressed
SQL> @advisor
Table: GL_BALANCES
Compression Type: Compress for Query HIGH
Estimated Compression Ratio: 10.17
PL/SQL procedure successfully completed.
SQL>
Compression Advisor
• New Advisor in Oracle Database 11g Release 2
• DBMS_COMPRESSION PL/SQL Package
• Estimates Hybrid Columnar Compression storage savings on
non-Exadata hardware
• Requires Patch # 8896202
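The advisor output shown above can be produced with a call along these lines; the schema, table and scratch tablespace names are illustrative:

```sql
SET SERVEROUTPUT ON
DECLARE
  l_blkcnt_cmp   PLS_INTEGER;
  l_blkcnt_uncmp PLS_INTEGER;
  l_row_cmp      PLS_INTEGER;
  l_row_uncmp    PLS_INTEGER;
  l_ratio        NUMBER;
  l_comptype_str VARCHAR2(100);
BEGIN
  DBMS_COMPRESSION.GET_COMPRESSION_RATIO(
    scratchtbsname => 'USERS',         -- scratch tablespace for sampling
    ownname        => 'GL',            -- hypothetical owner
    tabname        => 'GL_BALANCES',   -- table from the slide
    partname       => NULL,
    comptype       => DBMS_COMPRESSION.COMP_FOR_QUERY_HIGH,
    blkcnt_cmp     => l_blkcnt_cmp,
    blkcnt_uncmp   => l_blkcnt_uncmp,
    row_cmp        => l_row_cmp,
    row_uncmp      => l_row_uncmp,
    cmp_ratio      => l_ratio,
    comptype_str   => l_comptype_str);
  DBMS_OUTPUT.PUT_LINE('Compression Type: ' || l_comptype_str);
  DBMS_OUTPUT.PUT_LINE('Estimated Compression Ratio: ' || ROUND(l_ratio, 2));
END;
/
```

The advisor samples the table into the scratch tablespace, so pick one with some free space.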
Hybrid Columnar Compression Customer Success Stories
• Data Warehouse Customers (Warehouse Compression)
• Top Financial Services 1: 11x
• Top Financial Services 2: 24x
• Top Financial Services 3: 18x
• Top Telco 1: 8x
• Top Telco 2: 14x
• Top Telco 3: 6x
• Scientific Data Customer (Archive Compression)
• Top R&D customer (with PBs of data): 28x
• OLTP Archive Customer (Archive Compression)
• Oracle E-Business Suite, Oracle Corp.: 23x
• Custom Call Center Application, Top Telco: 15x
Summary
• IT must support exponentially growing
amounts of data
• Without growing cost
• Without hurting performance
• Exadata and Hybrid Columnar
Compression
• Extreme Storage Savings
• Compress for Query
• Compress for Archive
• Improve I/O Scan Rates
Q U E S T I O N S
A N S W E R S
Now you do it !