Controlling resources in an Exadata environment
Agenda
• Smart IO
• IO Resource Manager
• Compression
• Hands-on time
• Exadata Security
• Flash Cache
• Storage Indexes
• Parallel Execution
Agenda
• Smart IO
• IO Resource Manager
• Compression
SMART IO
How we (used to) read and write data
• Data read and write has many forms
• SQL statements
• Full and Incremental Backups
• Restore of backups
• Loading of data
• Creation of tablespaces/datafiles
• Exadata is designed to do this fast
• Latest and greatest in hardware
• Optimized software to work together with the hardware
Are fast nodes and storage enough?
• NO!
• The speed is determined by the weakest link
• Processing speed of the database nodes
• Processing speed of the storage environment
• The storage network and components that tie them together
• 4Gb fiber is not fast enough
• 4 single thread sessions can easily use throughput of a 4Gb card
Can we go any faster besides hardware?
• Yes!
• Limit the amount of processing done on the database nodes
• Scanning full tables on the storage, not on the DB nodes
• Only retrieve columns and rows that you actually need
• Encrypt and Decrypt on the storage side, not on the DB
• Transfer tasks (writing zeros to datafiles) to the storage
• Free CPU power from the DB nodes
• Pay less in license fees because you need fewer DB nodes
• Waste less power and generate less heat because you have fewer systems
• Pay less because you need less iron in your environment
• Fully use the CPUs on the storage side
Smart IO applications
• Smart scan
• Query 1 TB and only receive and process the actual results, not the full 1 TB
• Retrieve parts from flash and parts from disk
• Smart file creation and block formatting
• Let the storage write the 5 TB of zeros for the new datafiles
instead of the database nodes. Parallel and faster!
• Both new tablespaces and RMAN restores benefit
• Smart incremental backup
• Let the storage decide which blocks to back up. Parallel and fast, so
no more full database scans from the RMAN process
• Smart Scans on Encrypted Columns and Tablespaces
• Smart Scans for Data Mining Scoring
Smart Scan, first get the basics
This is generic Oracle business
• Database sessions read one or more blocks in the SGA
• After reading/processing, the block stays in the SGA
• Sessions can re-use the block in the SGA
• Buffer space is managed with a least recently used (LRU) algorithm
• Database sessions read blocks in the PGA
• Amount of data too large to fit in the SGA
• Called 'direct reads', stored in the session's PGA
• Blocks evaluated as they come in
• Blocks deleted after usage (new query means reading again)
On Exadata, Smart Scan kicks in for the Direct Reads
which are the most resource intensive queries
Datastream in an Oracle Database
Smart Scan – functional summary
• Smart scan is implemented by function shipping
• Statement is processed by the instance
• Block IDs are determined per Exadata cell
• Predicates (where clause) and block IDs are shipped to the cell
• Cell processes the blocks and returns the filtered rows
• Cell has libraries to understand Oracle block format
• Predicate evaluation (functions in the where clause)
• Column selection
• Join filtering through bloom filters
• Works on compressed and uncompressed blocks
• Tablespace and column encryption is supported
Smart Scan – What happens ?
• Database decides Direct Read is needed
• Too much data to fit in the SGA
• Determines list of blocks that need to be accessed
• Either indexes or tables
• Database detects all required data is on Exadata
• Creates list of blocks per Exadata storage cell
• Ships list of blocks, the required columns and applicable
'where' predicates to the Exadata storage cell
Smart Scan – What happens (cont..)
• Exadata storage cells retrieve requested blocks
• Parallel on cells and parallel in multiple threads per cell
• Based on column requirements and where predicates, retrieve
data from the blocks
• Gather retrieved data and create Oracle-like blocks
• Ship blocks with data to the database node(s)
• Database receives virtual blocks from all cells
• Gathers them in the PGA and determines the result for the session
• Sends the result to the session and deletes the virtual blocks
Query 1 TB and receive only 10 GB in the DB nodes!
Smart scan – when is it used?
• Optimizer does not decide to use smart scan
• It is a run-time decision
• First, a scan decides if direct reads can be used
• Decision based on
• Table size
• Number of dirty buffers
• Amount of data already cached
• Other heuristics (see manual)
• The behavior is the same as non-Exadata behavior
Smart scan – when is it used? Cont.
• Setting of CELL_OFFLOAD_PROCESSING parameter
• TRUE / FALSE
• All the files of a tablespace need to reside
on Exadata storage
• Smart scans are used for scans
• In sub-queries and in-lines as well
• Used for the following row sources:
• Table scan
• Index (fast full) scan
• Bitmap index scan
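As a sketch, offload can be toggled per session through the CELL_OFFLOAD_PROCESSING parameter, which is handy for before/after comparisons (the table and predicate below are illustrative):

```sql
-- Disable smart scan offload for the current session (e.g. for a baseline run)
ALTER SESSION SET cell_offload_processing = FALSE;

-- Re-enable it (TRUE is the default)
ALTER SESSION SET cell_offload_processing = TRUE;

-- A full scan that is a run-time candidate for smart scan
SELECT /*+ FULL(s) */ COUNT(*)
FROM   sales s
WHERE  s.amount_sold > 10000;
```

Remember that even with the parameter set to TRUE, the direct-read heuristics above still decide at run time whether a smart scan actually happens.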
Smart Scan – Predicting offload
• A stable plan is important
• Explain plan should not change in a running environment
• No additional parsing for the same statement
• Explain plans helps you see Exadata offload
• Operations that could be offloaded
• Predicates that could be offloaded
• Joins that could be offloaded though Bloom filtering
• A certain Explain Plan does not guarantee offloading !
For more information on Oracle’s Bloom filtering, see http://antognini.ch/papers/BloomFilters20080620.pdf
Explain plan example
-------------------------------------------
| Id  | Operation                   | Name  |
-------------------------------------------
|   0 | SELECT STATEMENT            |       |
|*  1 |  HASH JOIN                  |       |
|*  2 |   HASH JOIN                 |       |
|*  3 |    TABLE ACCESS STORAGE FULL| SALES |
|*  4 |    TABLE ACCESS STORAGE FULL| SALES |
|*  5 |   TABLE ACCESS STORAGE FULL | SALES |
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------------------------------------
1 - access("T"."CUST_ID"="T2"."CUST_ID" AND "T1"."PROD_ID"="T2"."PROD_ID"
AND "T1"."CUST_ID"="T2"."CUST_ID")
2 - access("T"."PROD_ID"="T1"."PROD_ID")
3 - storage("T1"."PROD_ID"<200 AND
"T1"."AMOUNT_SOLD"*"T1"."QUANTITY_SOLD">10000 AND "T1"."PROD_ID"<>45)
filter("T1"."PROD_ID"<200 AND
"T1"."AMOUNT_SOLD"*"T1"."QUANTITY_SOLD">10000 AND "T1"."PROD_ID"<>45)
4 - storage("T"."PROD_ID"<200 AND "T"."PROD_ID"<>45)
filter("T"."PROD_ID"<200 AND "T"."PROD_ID"<>45)
5 - storage("T2"."PROD_ID"<200 AND "T2"."PROD_ID"<>45)
filter("T2"."PROD_ID"<200 AND "T2"."PROD_ID"<>45)
Manipulating Explain Plan output
• CELL_OFFLOAD_PLAN_DISPLAY parameter
• AUTO (default)
• Explain plan will show predicate offload only if tablespace
resides on Exadata storage
• ALWAYS
• Explain plan will show predicate offload whether the
tablespace resides on Exadata storage or not
• NEVER
• Explain Plan will never indicate predicate offload even if the
tablespace resides on Exadata storage
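A minimal sketch of using this parameter (table and predicate are illustrative):

```sql
-- Show storage() predicates in the plan even when the data
-- does not reside on Exadata storage
ALTER SESSION SET cell_offload_plan_display = ALWAYS;

EXPLAIN PLAN FOR
  SELECT cust_id FROM sales WHERE prod_id < 200;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

With ALWAYS you can preview which predicates would be offloadable before the data is moved to Exadata storage.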
Detecting Scan offloads
• Trace of the session executing the statement
• Querying (G)V$ views
• (G)V$SYSSTAT
• (G)V$SQL
• (G)V$SESSTAT
• etc
Example: V$SYSSTAT
• cell physical IO interconnect bytes
• Bytes transferred between the storage nodes and the
database nodes
• physical IO disk bytes
• Bytes physically read on the Exadata storage nodes. This
includes IO performed both for block IO and for smart
scans.
• cell physical IO bytes eligible for
predicate offload
• Bytes that were processed by the smart scan process using
the column list and the where predicates
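These statistics can be queried directly; a minimal sketch (values are instance-wide and cumulative since startup):

```sql
-- Cumulative smart-scan statistics for the instance
SELECT name,
       ROUND(value / 1024 / 1024 / 1024, 2) AS gbytes
FROM   v$sysstat
WHERE  name IN ('cell physical IO interconnect bytes',
                'cell physical IO bytes eligible for predicate offload');
```

Because the counters are cumulative, take a snapshot before and after the statement of interest and work with the deltas.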
V$SYSSTAT values and efficiency
select name
from table
where col >= 100
Phys. IO: 10 GB
Phys. IO eligible for offload: 10 GB
Phys. IO interconnect: 2 GB
Efficiency: 2 GB / 10 GB = 20%
V$SYSSTAT values and efficiency
select a.name, b.*
from table a, table b
where a.id = b.id
and a.col >= 100
Phys. IO: 10 GB
Phys. IO eligible for offload: 5 GB
Phys. IO interconnect: 5 GB
Efficiency (offloaded portion): 1 GB / 5 GB = 20%
Efficiency (total interconnect): 5 GB / 10 GB = 50%
V$SQL makes it easier
• We can use the following columns
• physical_read_bytes
• How much data was read by the cell
• io_interconnect_bytes
• How much data was transported through the interconnect
• io_cell_offload_eligible_bytes
• How much of the physical read data was processed in the
cell
• io_cell_offload_returned_bytes
• How much of the processed data was actually returned to
the DB
• This is per statement
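A sketch of a per-statement efficiency query over these V$SQL columns (the saving percentage expression is illustrative):

```sql
-- Statements that were eligible for smart scan, largest readers first
SELECT sql_id,
       physical_read_bytes,
       io_cell_offload_eligible_bytes,
       io_interconnect_bytes,
       -- share of the data read that never crossed the interconnect
       ROUND(100 * (1 - io_interconnect_bytes /
                        NULLIF(physical_read_bytes, 0)), 1) AS offload_saving_pct
FROM   v$sql
WHERE  io_cell_offload_eligible_bytes > 0
ORDER  BY physical_read_bytes DESC;
```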
Smart scan inside the cell
• Smart scan is handled by the Cellsrv process on the cell
• Cellsrv is
• Multi-threaded
• Serves block IO and smart IO
• Runs a piece of RDBMS code to support smart IO
• Can provide storage to one or more databases
• Does not communicate with other cells
Predicate disk data flow
• Jobs can execute concurrently
• Concurrent IOs can be issued for a single RDBMS client
• Concurrent filter jobs can be applying predicates
• Exadata adds another level of parallelism in query processing
[Diagram: predicate disk data flow]
• PredicateDiskRead – IO jobs, issue IOs
• PredicateFilter – filters raw data
• PredicateCachePut – queues new IO requests
• PredicateCacheGet – sends results back
Other Smart improvements
• Smart file creation
• Offloads the process of formatting new blocks to the cell storage
• Block IDs (instead of formatted blocks) are shipped to the cells
• Smart file creation is used whenever a file is created
• Tablespace creation
• File resize (increase in size)
• RMAN restore
• Statistics involved (V$SYSSTAT)
• cell physical IO bytes saved during optimized file creation
• cell physical IO bytes saved during optimized RMAN file
restore
Other Smart improvements
• Smart incremental backup
• Offloads identifying the blocks to back up (by SCN) to the Exadata cell
• Used automatically
• Unless Fast Incremental Backup feature is used
• V$BACKUP_DATAFILE for smart incremental backup
• BLOCKS_SKIPPED_IN_CELL
• Number of blocks that were read and filtered by the cells to
optimize the RMAN incremental backup.
• BLOCKS
• Size of the backup data file in blocks.
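These two columns can be combined into a rough per-datafile filtering ratio; a sketch (the percentage expression is illustrative):

```sql
-- How much of each datafile the cells filtered out of the
-- incremental backup, most recent backups first
SELECT file#,
       blocks,
       blocks_skipped_in_cell,
       ROUND(100 * blocks_skipped_in_cell /
             NULLIF(blocks + blocks_skipped_in_cell, 0), 1) AS pct_filtered_in_cell
FROM   v$backup_datafile
ORDER  BY completion_time DESC;
```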
New wait events
• Smart Scan
• cell smart table scan – Database is waiting for table scans to
complete on a cell.
• cell smart index scan – Database is waiting for index or index-
organized table (IOT) fast full scans.
• Smart file creation
• cell smart file creation – Event appears when the database is waiting
for the completion of a file creation on a cell.
• cell smart restore from backup – Event appears when the database
is waiting for the completion of a file initialization for restore from
backup on a cell.
• Smart incremental backup
• cell smart incremental backup – Event appears when the database is
waiting for the completion of an incremental backup on a cell.
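A quick way to see how much time an instance spends on these events is a sketch like the following against V$SYSTEM_EVENT:

```sql
-- Cumulative waits on the Exadata smart events since instance startup
SELECT event,
       total_waits,
       ROUND(time_waited_micro / 1e6, 1) AS seconds_waited
FROM   v$system_event
WHERE  event LIKE 'cell smart%'
ORDER  BY time_waited_micro DESC;
```

The same filter works against V$SESSION_EVENT to narrow it down to a single session.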
Exadata Smart Features - Summary
• Long running actions benefit the most
• Smart Scans for optimizing full table/index/bitmap scans
• Smart file creation for datafile creation and restoring backups
• Smart Incremental Backups for incremental backup creations
• Explain plan displays possible offloads
• Does not guarantee offload
• Indicator 'STORAGE' shows offload options
• Various new wait events
• Do not get scared if you see them
Q U E S T I O N S
A N S W E R S
On Exadata Smart Features
Agenda
• Smart IO
• IO Resource Manager
• Compression
IO RESOURCE MANAGEMENT
Why Would Customers Be Interested
in I/O Resource Manager?
• Exadata Storage can be shared by multiple types of workloads and multiple databases
• Sharing lowers administration costs
• Sharing leads to more efficient usage of storage
• But, workloads may not happily coexist
• ETL jobs interfere with DSS query performance
• One production data warehouse can interfere with another
• Extraordinary query performance also means that one query can utilize all of Exadata's I/O bandwidth!
• Non-priority queries can substantially impact the performance of critical queries
• Customers will need a way to control these workloads
Consequence of I/O Bandwidth Limits
[Diagram: a production database wants 15 GB/s and a development database wants 0.2 + 15 GB/s, for a combined desired bandwidth of 0.2 + 15 + 15 = 30.2 GB/s, but the storage network and storage only provide 21 GB/s.]
IO Resource Manager solves the problem
[Diagram: the same workloads with IORM enabled; the 30.2 GB/s of desired bandwidth is scheduled into the available 21 GB/s. One workload is throttled from 15 to 12.8 GB/s, another from 15 to 8 GB/s, and the 200 MB/s workload keeps its 0.2 GB/s, for an actual total of 0.2 + 12.8 + 8 = 21 GB/s.]
When Does I/O Resource Manager
Help the Most?
• Conflicting Workloads
• Multiple consumer groups
• Multiple databases
• Concurrent database administration
• Backup, ETL, File creation etc
• Of course only if I/O is a bottleneck
• Significant proportion of the wait events are for I/O
• Including the CELL WAIT events
I/O Scheduling, the Traditional Way
• With traditional storage, I/O schedulers are black boxes
• You cannot influence their behavior !
• I/O requests are processed in FIFO order
• Some reordering may be done to improve disk efficiency
[Diagram: high-priority (H) and low-priority (L) RDBMS I/O requests arrive interleaved in a single FIFO disk queue on a traditional storage server.]
I/O Scheduling, the Exadata Way
• Exadata limits the number of outstanding I/O requests
• Issues enough I/Os to keep the disks performing efficiently
• The limit prevents a low-priority intensive workload from flooding
the disk
• Subsequent I/O requests are internally queued
• Exadata dequeues I/O requests based on the database and the
user's resource plans
• Inter-database plans specify bandwidth between multiple
databases
• Intra-database plans specify bandwidth between
workloads within a database
I/O Scheduling, the Exadata Way
[Diagram: the Finance and Sales data warehouse databases each maintain a high-priority and a low-priority consumer group queue on the Exadata cell; the I/O Resource Manager dequeues from these queues according to the Finance and Sales priorities.]
Plans are known by the CellSRV
• Inter-database (or between databases)
• Specified through CellCLI and pushed to CELLSRV
• Intra-database (or inside a database)
• Pushed by the RDBMS to CELLSRV
• Intra-database plans are regular Resource Manager plans
• I/Os are tagged
• Every ASM/RDBMS I/O is tagged with the sender identity
(Database ID / Consumer Group ID)
• CELLSRV uses a Resource Manager component (similar to the RDBMS) to schedule I/Os
Setting up IO Resource Management
• Inter-Database resource plans
• Setup on the Exadata cell
• Enabled / disabled on Exadata cell level
• Intra-Database resource plans
• IORM on Exadata cell must be enabled
• Inter-database plans not required
• Setup Resource Manager on the database
• Map sessions to consumer groups
• Create database resource plan
• Add permissions for users
• Enable Database Resource plan
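The intra-database steps above can be sketched with DBMS_RESOURCE_MANAGER; the group names, user names and the 75/25 split match the DEV/TEST example later in this section, while the plan name and mapped users are illustrative:

```sql
BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_PENDING_AREA();

  -- Consumer groups for the two workloads
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP('DEV',  'Development sessions');
  DBMS_RESOURCE_MANAGER.CREATE_CONSUMER_GROUP('TEST', 'Test sessions');

  -- Map sessions to groups, here by Oracle user name (users are hypothetical)
  DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
    DBMS_RESOURCE_MANAGER.ORACLE_USER, 'DEV_USER',  'DEV');
  DBMS_RESOURCE_MANAGER.SET_CONSUMER_GROUP_MAPPING(
    DBMS_RESOURCE_MANAGER.ORACLE_USER, 'TEST_USER', 'TEST');

  -- Plan with the 75/25 split
  DBMS_RESOURCE_MANAGER.CREATE_PLAN('DEVTST_PLAN', 'Dev/Test IO split');
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE('DEVTST_PLAN', 'DEV',
    'Development', mgmt_p1 => 75);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE('DEVTST_PLAN', 'TEST',
    'Test', mgmt_p1 => 25);
  DBMS_RESOURCE_MANAGER.CREATE_PLAN_DIRECTIVE('DEVTST_PLAN', 'OTHER_GROUPS',
    'Everything else', mgmt_p1 => 0);

  DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA();
  DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
END;
/

-- Permission for users to switch into their groups
BEGIN
  DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP('DEV_USER', 'DEV', FALSE);
  DBMS_RESOURCE_MANAGER_PRIVS.GRANT_SWITCH_CONSUMER_GROUP('TEST_USER', 'TEST', FALSE);
END;
/

-- Enable the plan; the RDBMS pushes it to CELLSRV for intra-database IORM
ALTER SYSTEM SET resource_manager_plan = 'DEVTST_PLAN';
```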
What can we limit, and how? Per database or resource group
• On disk level
• Minimum % of IO
• Only kicks in when workloads are fighting over disk IOs
• Speed may vary but has a guaranteed minimum
• Limited % of IO
• Always limits IO to a certain %
• Even if the system is not being used otherwise
• On Flash Level
• No need to limit flash I/O – fast enough
• Limit access to Flash storage space
How to Configure Inter-Database IORM
• Specify DB name, level and allocation %
• Per level, 100% is available
• Various options are available
• OTHER is wildcard for non-specified databases
CellCLI> alter iormplan
dbplan = ((name = production, level=1, allocation=100),
(name = test, level=2, allocation=80),
(name = other, level=2, allocation=20))
IORMPLAN successfully altered
CellCLI> alter iormplan active
IORMPLAN successfully altered
Limiting resources on Disk
Starting patchset 11.2.0.2
• Between users inside a database
• Uses regular Database Resource Management
• The directive "max_utilization_limit" applies to both CPU and IO
• Between databases
• Use the "limit" parameter in the IORMPlan
• Example:
ALTER IORMPLAN
dbplan=((name=prod, limit=75),
(name=test, limit=20),
(name=other, limit=5))
Control access to flash cache
Starting patchset 11.2.0.2
• Prevent databases from accessing the flash cache
• Low-priority databases, test databases etc.
• New attribute introduced for IORM: FlashCache
CellCLI> ALTER IORMPLAN
dbplan=((name=prod, flashCache=on),
(name=dev, flashCache=on),
(name=test, flashCache=off),
(name=other, flashCache=off))
Measuring the Benefit from IORM
• Method 1: Monitor performance of target workload
• Demonstrates the effectiveness of IORM to the workload
owner
• Measure the target workload's query times or transaction rates
• Method 2: Monitor I/O statistics of target workload
• Characterizes I/O traffic per workload for each cell
• Demonstrates the effect of IORM on each workload
• Measure using Exadata IORM metrics
Example
• Production database
• Users doing normal production
• Development / Test database
• Schemas used for testing
• Schemas used for development
• Both databases run on same cluster
Before IO Resource Management
[Chart: relative throughput (0–100%) of PROD, TEST and DEV over time, without IORM]
Creating an IORM Schema
DB name | Resource group | Inter % (Exadata cell) | Intra % (Resource group)
PROD    | -              | Level=1, alloc=60      | -
DEVTST  | DEV            | Level=2, alloc=100     | 75
DEVTST  | TEST           | Level=2, alloc=100     | 25
• Setup on Exadata cell
• PROD=60, DEVTST=40, OTHER=0
• Setup inside Database Resource Mgr
• Resource group DEV = 75%
• Resource group TEST = 25%
With and without IORM
[Chart: relative throughput (0–100%) of PROD, TEST and DEV over time, without IORM (left) and with IORM (right)]
IO Resource Manager - Summary
• IORM manages load effectively
• Within a database
• Between databases
• Only if IO is the bottleneck, check AWR for this
• To verify IORM, monitor your critical workload
• With and without other workloads
• With and without IORM
• Use IORM metrics to monitor I/O rates and wait times
• Or use your own Metrics / end user
Q U E S T I O N S
A N S W E R S
On IO Resource Manager
Agenda
• Smart IO
• IO Resource Manager
• Compression
COMPRESSION
Data Growth Challenges
• IT must support exponentially growing amounts of data
• Explosion in online access and content
• Government data retention regulations
• Performance often declines as data grows
• IT budgets are flat or decreasing
• Need to grow data
• Without hurting performance
• Without growing cost
• Powerful and efficient compression is key
Oracle Database Compression Overview – Compress All Your Data

Compression Feature                       | Application Fit           | Availability
Table Compression                         | Data Warehouses           | Oracle 8
OLTP Table Compression                    | Actively updated data     | Database 11g Advanced Compression
SecureFiles Compression                   | Unstructured (file) data  | Database 11g Advanced Compression
SecureFiles Deduplication                 | Unstructured (file) data  | Database 11g Advanced Compression
RMAN Backup Compression                   | All applications          | Database 11g Advanced Compression
Data Pump Compression                     | All applications          | Database 11g Advanced Compression
Data Guard Redo Transport Compression     | All applications          | Database 11g Advanced Compression
Network Compression                       | All applications          | Database 11g Advanced Compression
Compress for Query / Compress for Archive | Built on Hybrid Columnar Compression technology | NEW
Hybrid Columnar Compression
• Hybrid Columnar Compressed Tables
• New approach to compressed table storage
• Useful for data that is bulk loaded and queried
• Update activity is light
• How it Works
• Tables are organized into Compression Units
• CUs are larger than database blocks
• Usually around 32K
• Within Compression Unit, data is organized by
column instead of by row
• Column organization brings similar values
close together, enhancing compression
Compression Unit
10x to 15x Reduction
Hybrid Columnar Compression Technology Overview
• Compression Unit
• Logical structure spanning multiple database blocks
• Data organized by column during data load
• Each column compressed separately
• All column data for a set of rows stored in compression unit
• Typically 32k (4 blocks x 8k block size)
[Diagram: a logical Compression Unit with a CU header spanning four database blocks; each column (C1–C8) is stored and compressed contiguously across the blocks.]
Compress for Query Built on Hybrid Columnar Compression
• 10x average storage savings
• 100 TB Database compresses to 10 TB
• Reclaim 90 TB of disk space
• Space for 9 more '100 TB' databases
• 10x average scan improvement
• 1,000 IOPS reduced to 100 IOPS
Compress for Archive Built on Hybrid Columnar Compression
• Compression algorithm optimized for max storage savings
• Benefits any application with data retention requirements
• Best approach for ILM and data archival
• Minimum storage footprint
• No need to move data to tape or less expensive disks
• Data is always online and always accessible
• Run queries against historical data (without recovering from tape)
• Update historical data
• Supports schema evolution (add/drop columns)
Compress for Archive
• Optimal workload characteristics for Archive Compression
• Any application (OLTP, Data Warehouse)
• Cold or Historical Data
• Data loaded with bulk load operations
• Minimal access and update requirements
• Locking is per Compression Unit instead of per record
• 15x average storage savings
• 100 TB Database compresses to 6.6 TB
• Keep historical data online forever
• Up to 70x savings seen on production customer data
Compression in Exadata ILM and Data Archiving Strategies
• OLTP Applications
• Table Partitioning
• Heavily accessed data (read and write)
• Partitions using OLTP Table Compression
• Cold or historical data
• Partitions using Compress for Archive
• Data Warehouses
• Table Partitioning
• Heavily accessed data (read)
• Partitions using Compress for Query
• Cold or historical data
• Partitions using Compress for Archive
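The partitioning strategy above can be sketched in DDL; the table, partition names and date boundaries are illustrative:

```sql
-- Hot partition uses OLTP compression; colder partitions use HCC
CREATE TABLE orders (
  order_id   NUMBER,
  order_date DATE,
  amount     NUMBER
)
PARTITION BY RANGE (order_date) (
  PARTITION p2009 VALUES LESS THAN (DATE '2010-01-01') COMPRESS FOR ARCHIVE HIGH,
  PARTITION p2010 VALUES LESS THAN (DATE '2011-01-01') COMPRESS FOR QUERY HIGH,
  PARTITION p2011 VALUES LESS THAN (MAXVALUE)          COMPRESS FOR OLTP
);

-- An existing partition can be recompressed as it goes cold
ALTER TABLE orders MOVE PARTITION p2010 COMPRESS FOR ARCHIVE LOW;
```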
Hybrid Columnar Compression Outside of Exadata
• Only supported for data stored on
• Exadata Storage Cells
• Exadata Database Machines
• Exadata Sparc Super Cluster
• Oracle ZFS Storage Appliance systems
• Oracle Pillar Axiom storage systems
• Storing and accessing HCC data on other systems
• Storage is possible, including Data Guard, recovery etc.
• Access is only possible after decompression of the data
• Can be done on any 11gR2+ system
Hybrid Columnar Compression Business as Usual
• Fully supported with…
• B-Tree, Bitmap Indexes, Text indexes
• Materialized Views
• Exadata Server and Cells including offload
• Partitioning, Parallel Query, PDML, PDDL
• Schema Evolution support, online, add/drop columns
• Data Guard Physical and Logical Standby (>11.2) Support
• Data only accessible if standby supports HCC too !
• Streams is not supported
• GoldenGate will be supported soon !
Hybrid Columnar Compressed Tables Details
• Data loaded using Direct Load uses EHCC
• Parallel DML
• INSERT /*+ APPEND */
• Direct Path SQL*Loader
• Optimized algorithms avoid or greatly reduce
overhead of decompression during query
• Individual row lookups consume more CPU than row
format
• Need to reconstitute row from columnar format
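A minimal sketch of a direct-path load into an HCC table (table names are illustrative):

```sql
-- HCC is applied only to data loaded via direct path into a table
-- defined with an HCC compression clause
CREATE TABLE sales_hist COMPRESS FOR QUERY HIGH
AS SELECT * FROM sales WHERE 1 = 0;

INSERT /*+ APPEND */ INTO sales_hist
SELECT * FROM sales;
COMMIT;
```

A conventional INSERT into the same table would not get full HCC compression, as the next slide explains.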
Hybrid Columnar Compressed Tables Details continued..
• Updated rows automatically migrate
• to lower compression level to support frequent
transactions
• Table size will increase moderately
• All un-migrated rows in Compression Unit are locked
during migration
• Row will get a new ROWID after update
• Data loaded using Conventional Insert (non-bulk)
uses the lower compression level
Hybrid Columnar Compressed Tables Details continued..
• Specialized columnar query processing engine
• Runs in Exadata Storage Server to run directly against
compressed data
• Column optimized processing of query projection and
filtering
• Result is returned uncompressed
SQL> @advisor
Table: GL_BALANCES
Compression Type: Compress for Query HIGH
Estimated Compression Ratio: 10.17
PL/SQL procedure successfully completed.
SQL>
Compression Advisor
• New Advisor in Oracle Database 11g Release 2
• DBMS_COMPRESSION PL/SQL Package
• Estimates Hybrid Columnar Compression storage savings on
non-Exadata hardware
• Requires Patch # 8896202
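The advisor output shown above can be produced with a call along these lines; the schema, table and scratch tablespace names are illustrative:

```sql
SET SERVEROUTPUT ON
DECLARE
  l_blkcnt_cmp   PLS_INTEGER;
  l_blkcnt_uncmp PLS_INTEGER;
  l_row_cmp      PLS_INTEGER;
  l_row_uncmp    PLS_INTEGER;
  l_ratio        NUMBER;
  l_comptype_str VARCHAR2(100);
BEGIN
  DBMS_COMPRESSION.GET_COMPRESSION_RATIO(
    scratchtbsname => 'USERS',         -- scratch tablespace for sampling
    ownname        => 'GL',            -- hypothetical owner
    tabname        => 'GL_BALANCES',   -- table from the slide
    partname       => NULL,
    comptype       => DBMS_COMPRESSION.COMP_FOR_QUERY_HIGH,
    blkcnt_cmp     => l_blkcnt_cmp,
    blkcnt_uncmp   => l_blkcnt_uncmp,
    row_cmp        => l_row_cmp,
    row_uncmp      => l_row_uncmp,
    cmp_ratio      => l_ratio,
    comptype_str   => l_comptype_str);
  DBMS_OUTPUT.PUT_LINE('Compression Type: ' || l_comptype_str);
  DBMS_OUTPUT.PUT_LINE('Estimated Compression Ratio: ' || ROUND(l_ratio, 2));
END;
/
```

The advisor samples the table into the scratch tablespace, so pick one with some free space.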
Hybrid Columnar Compression Customer Success Stories
• Data Warehouse Customers (Warehouse Compression)
• Top Financial Services 1: 11x
• Top Financial Services 2: 24x
• Top Financial Services 3: 18x
• Top Telco 1: 8x
• Top Telco 2: 14x
• Top Telco 3: 6x
• Scientific Data Customer (Archive Compression)
• Top R&D customer (with PBs of data): 28x
• OLTP Archive Customer (Archive Compression)
• Oracle E-Business Suite, Oracle Corp.: 23x
• Custom Call Center Application, Top Telco: 15x
Summary
• IT must support exponentially growing
amounts of data
• Without growing cost
• Without hurting performance
• Exadata and Hybrid Columnar
Compression
• Extreme Storage Savings
• Compress for Query
• Compress for Archive
• Improve I/O Scan Rates
Q U E S T I O N S
A N S W E R S
Now you do it !