RAC Capacity Planning-1.pdf

Transcript of RAC Capacity Planning-1.pdf

Page 1: RAC Capacity Planning-1.pdf
Page 2: RAC Capacity Planning-1.pdf

Oracle Real Application Clusters: Sizing and Capacity Planning Then and Now

Su Tang, Sri Subramaniam

RACPACK

Page 3: RAC Capacity Planning-1.pdf

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Page 4: RAC Capacity Planning-1.pdf

Agenda

• Capacity Planning in GRID/RAC Environment
• Scalable Infrastructure Design
• On Demand Capacity Addition and Utilization
• Criteria to add more Capacity
• Real World Customer Example
• Questions

Page 5: RAC Capacity Planning-1.pdf

Capacity Planning

Page 6: RAC Capacity Planning-1.pdf

RAC Capacity Planning Advantages

• All current practices still apply
  • Network Storage sizing
  • Interconnect Network capacity
  • Server capacity
  • Application Service design
• RAC flexibility ensures
  • Good initial estimate is sufficient
  • Easily accommodates growth
• Emphasis shifts to capacity utilization

Page 7: RAC Capacity Planning-1.pdf

Storage Network

Page 8: RAC Capacity Planning-1.pdf

Networked Storage

• RAC works with both SAN and NAS storage
• Optimal storage selection depends on:
  • Estimated I/O response time
    • Typically single-block I/O requests
    • Common characteristic of most OLTP applications
    • Measured in IOPS
  • Estimated I/O bandwidth
    • Large multi-block I/Os
    • Data warehouse and mixed-workload environments
    • Occurs during backup/recovery operations
• Estimates should include requirements for both normal and backup I/O

Page 9: RAC Capacity Planning-1.pdf

Storage Capacity Planning

Estimate initial data size and growth rate for all the applications (e.g., 500GB initial, doubling over two years, 1TB total)

Add the fault tolerance requirements (e.g., 2TB with RAID1, 1.2TB with RAID5)

Add the backup requirements to the size (e.g., an additional 1TB for a full backup, another 1TB for 5 incrementals)
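
The three steps above can be sketched as a small calculation. This is only an illustration using the slide's example figures; the function name, the 1.2x RAID5 factor (implied by the slide's 1.2TB figure), and the assumption that backup space is added on top of the protected size without its own redundancy are not from the original.

```python
def storage_estimate_tb(initial_tb, growth_factor, redundancy_factor,
                        full_backup_tb, incremental_backup_tb):
    """Return (data, protected, total) sizes in TB for a simple sizing pass."""
    data_tb = initial_tb * growth_factor          # size after planned growth
    protected_tb = data_tb * redundancy_factor    # size including RAID overhead
    # Assumption: backup space is simply added to the protected size.
    total_tb = protected_tb + full_backup_tb + incremental_backup_tb
    return data_tb, protected_tb, total_tb


# Slide figures: 500GB doubling to 1TB, 1TB full backup, 1TB of incrementals.
raid1 = storage_estimate_tb(0.5, 2, 2.0, 1.0, 1.0)   # RAID1 doubles the footprint
raid5 = storage_estimate_tb(0.5, 2, 1.2, 1.0, 1.0)   # 1.2x implied by the slide

print(f"RAID1: protected {raid1[1]:.1f} TB, total {raid1[2]:.1f} TB")
print(f"RAID5: protected {raid5[1]:.1f} TB, total {raid5[2]:.1f} TB")
```

Under these assumptions the totals come to 4TB with RAID1 and 3.2TB with RAID5; the totals are derived here and do not appear on the slide.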

Page 10: RAC Capacity Planning-1.pdf

Storage Capacity Planning

Estimate aggregated throughput and IOPS (e.g., 2GB/sec, or 300,000 IOPS)

Calculate the total bandwidth requirement per node (e.g., 2GB/sec across 16 nodes = 128MB/sec per node, or 300,000/16 = 18,750 IOPS per node)

Choose the appropriate storage class and build the configuration (e.g., 1,200 IOPS per spindle, 16-way striped = 19,200 IOPS per LUN)
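
The arithmetic in these examples can be reproduced with a short sketch. The function names are hypothetical, and the final step (counting how many such LUNs would cover the aggregate IOPS) is an added illustration that the slide does not state.

```python
import math


def per_node_share(cluster_total, node_count):
    """Split an aggregate throughput or IOPS figure evenly across nodes."""
    return cluster_total / node_count


def lun_iops(iops_per_spindle, stripe_width):
    """IOPS a striped LUN can deliver, assuming ideal scaling across spindles."""
    return iops_per_spindle * stripe_width


# Slide figures: 2GB/sec (2048 MB/sec) and 300,000 IOPS across 16 nodes.
mb_per_node = per_node_share(2 * 1024, 16)
iops_per_node = per_node_share(300_000, 16)
iops_per_lun = lun_iops(1_200, 16)

# Added illustration: LUNs of this class needed to cover the aggregate IOPS.
luns_needed = math.ceil(300_000 / iops_per_lun)

print(mb_per_node, iops_per_node, iops_per_lun, luns_needed)
```

The per-node and per-LUN values match the slide (128MB/sec, 18,750 IOPS, 19,200 IOPS); the LUN count assumes I/O is spread evenly over all LUNs.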

Page 11: RAC Capacity Planning-1.pdf

Interconnect Network

Page 12: RAC Capacity Planning-1.pdf

Interconnect Capacity Planning

• RAC interconnect usage
  • Oracle Clusterware
    • Very small messages exchanged periodically
    • Response time and load are critical; not a big bandwidth consumer
  • Oracle RAC Database
    • Primary user of interconnect capacity
    • Exchanges both small and large messages between nodes
    • Key driver in deciding the network configuration

Page 13: RAC Capacity Planning-1.pdf

RAC Messages

• Small 256-byte messages
  • Used by GES and GCS
• Cache Fusion block messages
  • Db_block_size
• Parallel Query
  • Parallel_execution_message_size
  • Default 8k

Page 14: RAC Capacity Planning-1.pdf

Interconnect Bandwidth

• Messages received (M) per second
  • #GES messages + #GCS messages
• Blocks received (B) per second
  • (db_block_size * (#cr blocks received + #current blocks received)) / mtu size
• PQ messages received (P) per second
  • (PQ_message_size * #PX remote messages recv'd) / mtu size
• Total bandwidth required per second
  • (Messages received + Blocks received + PQ messages received) / max network transmit capacity
  • (M+B+P)/85000
• A similar equation applies to the send side

Page 15: RAC Capacity Planning-1.pdf

Example from AWR Report

• Global Cache blocks received: 2,534
• GCS/GES messages received: 811
• PX remote messages recv'd: 65
• Db_block_size: 8192
• Parallel_execution_message_size: 8192
• Mtu_size: 1500
• One Gigabit Ethernet interface for interconnect
• Total bandwidth required = (M+B+P)/85000
  • = (2534 + ((811 * 8192)/1500) + ((65 * 8192)/1500)) / 85000
  • ≈ 8.6% capacity utilization
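
The calculation can be checked with a short sketch. It follows the slide's own substitution, which places 2,534 in the message term and 811 in the block term; the function and parameter names are hypothetical, and 85000 is taken from the slide as the maximum transmit capacity of one Gigabit Ethernet interface (read here as packets per second, which is an assumption).

```python
def interconnect_utilization(messages, blocks, px_messages,
                             block_size=8192, px_message_size=8192,
                             mtu=1500, max_transmit_capacity=85000):
    """Fraction of interconnect capacity used, following (M + B + P) / capacity."""
    m = messages                              # small messages, one packet each
    b = blocks * block_size / mtu             # packets needed for cache blocks
    p = px_messages * px_message_size / mtu   # packets needed for PQ messages
    return (m + b + p) / max_transmit_capacity


# Values as substituted in the slide's formula.
utilization = interconnect_utilization(messages=2534, blocks=811, px_messages=65)
print(f"{utilization:.1%} of one Gigabit Ethernet interface")

# Planning rule from this deck: consider adding interfaces at 70% utilization.
needs_more_interfaces = utilization >= 0.70
```

The same function can be applied to the send-side statistics, since the deck notes that a similar equation applies there.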

Page 16: RAC Capacity Planning-1.pdf

Interconnect Bandwidth

• Available interconnect bandwidth in an IP-based network
  • Depends on the network packets transmitted
  • Comparing against theoretical bandwidth using total bytes transmitted is not accurate

Page 17: RAC Capacity Planning-1.pdf

Available Network Bandwidth

[Chart: available bandwidth in MB/sec (axis 0 to 120) by message size in bytes: 256, 512, 1024, 2048, 8192]

Page 18: RAC Capacity Planning-1.pdf

RAC Interconnect

• Experience shows that a single Gigabit Ethernet interface is adequate for most applications

• In planning, 70% utilization is a reasonable point at which to add interfaces

Page 19: RAC Capacity Planning-1.pdf

Server Capacity

Page 20: RAC Capacity Planning-1.pdf

Server Capacity Planning

• To size the server optimally
  • Consider the total number of concurrent processes
  • Estimate the CPU utilization of critical queries
    • Grid Control/SQL Trace should give this data
  • Plan for a maximum run-queue length of 2 * number of CPUs
  • During high-utilization periods, never exceed 70% overall CPU in the box
• Factor in the percentage of capacity each server adds
  • This helps to attain your high availability goals
  • In planned outage situations it helps to determine whether surviving nodes can support the workload
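
One way to check whether surviving nodes can support the workload is sketched below. It assumes equal-sized nodes, a workload that redistributes evenly, and linear scaling, none of which the slide states; the function names are hypothetical, and the 70% ceiling is the planning limit given above.

```python
def utilization_after_node_loss(node_count, avg_cpu_utilization, nodes_lost=1):
    """Average CPU utilization on survivors if the same work is spread evenly."""
    surviving = node_count - nodes_lost
    if surviving <= 0:
        raise ValueError("at least one node must survive")
    return avg_cpu_utilization * node_count / surviving


def survivors_can_carry_load(node_count, avg_cpu_utilization,
                             nodes_lost=1, ceiling=0.70):
    """True if survivors stay at or below the planning ceiling."""
    projected = utilization_after_node_loss(node_count, avg_cpu_utilization,
                                            nodes_lost)
    return projected <= ceiling


# Hypothetical 4-node cluster: each node contributes 25% of total capacity.
print(survivors_can_carry_load(4, 0.50))  # projected about 67%, within the ceiling
print(survivors_can_carry_load(4, 0.60))  # projected 80%, over the ceiling
```

The smaller the share of total capacity each server contributes, the less headroom has to be held in reserve for an outage.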

Page 21: RAC Capacity Planning-1.pdf

Server Capacity Planning

• Ensure an optimal number of HBAs is available
  • To get the desired I/O response time and bandwidth
  • Plan for 50-70% capacity utilization
• Ensure an optimal number of NICs is available
  • For both public and cluster interconnects
  • And for NAS storage, if used

Page 22: RAC Capacity Planning-1.pdf

Infrastructure Design

Page 23: RAC Capacity Planning-1.pdf

Scalable Infrastructure Design

• Very critical aspect of a new capacity planning exercise
• Critical elements of scalable infrastructure design consist of:
  • Networked Storage
  • Interconnect Network
  • Optimally sized servers
  • Software and Application Service

Page 24: RAC Capacity Planning-1.pdf

Infrastructure Design

[Diagram: Storage Farm (SAN Fabric 1, SAN Fabric 2; Storage 01, Storage 02, Storage NN)]

• 2 SAN switches
• Low-end SAN storage
• 2 ports from each Storage Processor connected to each SAN switch
• Equal-size RAID5 LUNs are distributed among all SPs
• On a Storage Processor failure in the array, LUNs would fail over

Page 25: RAC Capacity Planning-1.pdf

Infrastructure Design

[Diagram: Server Farm (a001, a002, a003, aNNN, b001, b002, b003, bNNN); Storage Farm (SAN Fabric 1, SAN Fabric 2; Storage 01, Storage 02, Storage NN)]

Server and storage farms are horizontally scalable (“scaling-out”)

• 2-CPU and 4-CPU boxes
• 2-port HBA connecting to each server
• LUNs are load-balanced on both ports
• Protects from failure of an SP, array port, single HBA, or single SAN switch

Page 26: RAC Capacity Planning-1.pdf

Infrastructure Design

[Diagram: Server Farm (a001, a002, a003, aNNN, b001, b002, b003, bNNN); IP Network (Public/App-DB, Private Interconnect, NAS/iSCSI, Management; NAS NN, LAN, WAN); Storage Farm (SAN Fabric 1, SAN Fabric 2; Storage 01, Storage 02, Storage NN)]

Server and storage farms are horizontally scalable (“scaling-out”)

Page 27: RAC Capacity Planning-1.pdf

Infrastructure Design

Separate switches for the public, private, NAS (if used), and management networks

Redundant networks for public, private, and NAS
- For most configurations, active/failover should be sufficient
- Where load balancing is used, ensure the correct network redundancy option is chosen to provide both send-side and receive-side load balancing
- 802.3ad is used to aggregate switch ports
- 802.3ad is used in the host to bond the interfaces

Page 28: RAC Capacity Planning-1.pdf

Storage Network

• Implement zoning/masking using
  • A simple scheme where all LUNs are visible across all nodes, if the cluster infrastructure is used by multiple databases
• Create equal-sized LUNs that meet the planned I/O characteristics
• Ensure each LUN can support the combined throughput of all concurrent RAC node access
• Avoid ISLs in the SAN switch design by sizing the SAN switch appropriately
• In an ASM diskgroup, add disks with similar storage characteristics and capacity

Page 29: RAC Capacity Planning-1.pdf

Interconnect Network

• Ensure a proper VLAN for the cluster-interconnect network
• Avoid cascading switches
• If NIC bonding is used, ensure switch ports are appropriately configured to provide both send-side and receive-side load balancing
• Ensure NICs teamed in the host come from similar vendors

Page 30: RAC Capacity Planning-1.pdf

Server Design

• Ensure similar-sized servers are clustered together
• Ensure Remote Administration has been set up correctly
• Use automated procedures to check consistency of OS, firmware, and application software versions and revision levels
  • Cluster Verification Tool
    • Verifies infrastructure, Clusterware, and RAC configurations
  • ORION
    • Measures available I/O bandwidth and response time
  • IPERF
    • Measures and reports network performance

Page 31: RAC Capacity Planning-1.pdf

Software Considerations

Page 32: RAC Capacity Planning-1.pdf

Cluster Software Design

• If multiple databases are using a common cluster infrastructure
  • Ensure similar-sized nodes are clustered together
  • Install a single, separate CLUSTER_HOME
  • Install a single, separate ASM_HOME
  • DB_HOMEs can be installed/expanded as required

Page 33: RAC Capacity Planning-1.pdf

Adding Capacity

Page 34: RAC Capacity Planning-1.pdf

When to Add More Capacity

• These guidelines assume
  • All configuration and best practices are followed
  • All necessary SQL and DB tuning has been performed
• Key thresholds to monitor for disk I/O
  • Db_file_sequential_read > 25 msec
  • Db_file_scattered_read > 30 msec
  • Log_file_parallel_write > 3 msec
• Determine the source of the bottleneck
  • Host, HBA, SAN switch, or storage array
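
A minimal sketch of applying these disk I/O thresholds to observed average wait times follows. The event names are spelled as on the slide, the function name is hypothetical, and the sample measurements are invented for illustration only.

```python
# Thresholds from the slide, in milliseconds of average wait time.
DISK_IO_THRESHOLDS_MS = {
    "Db_file_sequential_read": 25.0,
    "Db_file_scattered_read": 30.0,
    "Log_file_parallel_write": 3.0,
}


def events_over_threshold(avg_wait_ms):
    """Return the monitored events whose average wait exceeds its threshold."""
    return sorted(
        event
        for event, wait in avg_wait_ms.items()
        if event in DISK_IO_THRESHOLDS_MS and wait > DISK_IO_THRESHOLDS_MS[event]
    )


# Hypothetical measurements, e.g. taken from an AWR report.
observed = {
    "Db_file_sequential_read": 31.2,
    "Db_file_scattered_read": 18.0,
    "Log_file_parallel_write": 4.5,
}
print(events_over_threshold(observed))
```

A flagged event only says that I/O is slow; locating the bottleneck in the host, HBA, SAN switch, or storage array is a separate step, as the last bullet notes.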

Page 35: RAC Capacity Planning-1.pdf

When to Add More Capacity

• Thresholds to monitor for the interconnect network
  • Assumes the following prerequisites
    • Host CPUs in any RAC instance node are not maxed out
    • Correct network configuration and best practices are followed
    • Log_file_parallel_write is not > 3 msec
  • If cache fusion message latencies exceed the following limits

AWR Report Latency Name                             Lower Bound   Typical   Upper Bound
Average time to process cr block request            0.1           1         10
Avg global cache cr block receive time (ms)         0.3           4         12
Average time to process current block request       0.1           3         23
Avg global cache current block receive time (ms)    0.3           8         30
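
The latency limits can be applied programmatically, as in the sketch below. It reads each row of the table as lower bound, typical value, and upper bound in milliseconds; the function name and the three classification labels are assumptions made for illustration.

```python
# (lower bound, typical, upper bound) in milliseconds, per AWR latency name.
LATENCY_BOUNDS_MS = {
    "Average time to process cr block request": (0.1, 1.0, 10.0),
    "Avg global cache cr block receive time (ms)": (0.3, 4.0, 12.0),
    "Average time to process current block request": (0.1, 3.0, 23.0),
    "Avg global cache current block receive time (ms)": (0.3, 8.0, 30.0),
}


def classify_latency(name, observed_ms):
    """Place an observed latency relative to the typical and upper-bound values."""
    _lower, typical, upper = LATENCY_BOUNDS_MS[name]
    if observed_ms > upper:
        return "over upper bound"
    if observed_ms > typical:
        return "above typical"
    return "within typical range"


# Hypothetical reading for one statistic.
print(classify_latency("Avg global cache cr block receive time (ms)", 15.0))
```

Only the "over upper bound" case corresponds to the slide's criterion for adding interconnect capacity, and only once the listed prerequisites hold.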

Page 36: RAC Capacity Planning-1.pdf

AWR Report – RAC Statistics

Page 37: RAC Capacity Planning-1.pdf

When to Add Capacity

• Server
  • Overall CPU utilization constantly exceeds 70%
  • Run-queue length is > 2 * number of CPUs for long periods of time
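
The server criteria can be sketched as a check over monitoring samples. The slide does not quantify "constantly" or "long periods"; the sketch treats a condition as sustained when at least 90% of samples breach the limit, and it flags the server if either criterion is sustained. Both choices, and all names, are assumptions.

```python
def fraction_over(samples, limit):
    """Fraction of samples strictly above the limit."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(1 for s in samples if s > limit) / len(samples)


def server_needs_capacity(cpu_pct_samples, run_queue_samples, cpu_count,
                          cpu_limit_pct=70.0, sustained_fraction=0.9):
    """True if CPU or run-queue length stays over its limit in most samples."""
    cpu_hot = fraction_over(cpu_pct_samples, cpu_limit_pct) >= sustained_fraction
    queue_hot = (fraction_over(run_queue_samples, 2 * cpu_count)
                 >= sustained_fraction)
    return cpu_hot or queue_hot


# Hypothetical samples from a 4-CPU server.
cpu_samples = [72, 75, 81, 78, 74, 69, 77, 80, 76, 73]
run_queue_samples = [3, 4, 5, 3, 4, 6, 5, 4, 3, 4]
print(server_needs_capacity(cpu_samples, run_queue_samples, cpu_count=4))
```

A single spike does not trigger the check, which matches the slide's emphasis on sustained load rather than momentary peaks.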

Page 38: RAC Capacity Planning-1.pdf

Real World Example

Page 39: RAC Capacity Planning-1.pdf

Mercado Libre

• eBay in Latin America
• Runs marketplace from search to bid
• In 2004 moved from mid-range SMP to
  • 4*4 node Itanium2 Linux RAC cluster
  • 16 GB RAM per node
  • NFS filer storage
  • Initially estimated at 400,000 TP per hour, good for 2 years

Page 40: RAC Capacity Planning-1.pdf

Mercado Libre

• Scaled incrementally as marketplace grew

[Chart: business volume (axis 0 to 1,600,000) and nodes by year: 2004, 2005, 2006]

Page 41: RAC Capacity Planning-1.pdf

Mercado Libre Performance Characteristics

MercadoLibre’s 13-node Linux Itanium cluster
• 460 GB RAM clusterwide
• 286 GB SGA
• 14,500 URLs/second
• 47 GB redo/day

Uses at most 40% of the capacity of a single Gigabit Ethernet interconnect

Page 42: RAC Capacity Planning-1.pdf

Summary

• Plan initial sizing with a good estimate
• Design a scalable infrastructure
• Grow capacity with business volume
• Resource utilization is the key driver

Page 43: RAC Capacity Planning-1.pdf

For More Information

http://search.oracle.com

or

otn.oracle.com/rac

REAL APPLICATION CLUSTERS

Page 44: RAC Capacity Planning-1.pdf