tuningfor_oracle

IBM Advanced Technical Support - Americas © 2008 IBM Corporation 06/27/22 AIX Performance: Configuration & Tuning for Oracle Vijay Adik [email protected] ATS - Oracle Solutions Team

Transcript of tuningfor_oracle


Page 2: tuningfor_oracle


Legal information

The information in this presentation is provided by IBM on an "AS IS" basis without any warranty, guarantee or assurance of any kind. IBM also does not provide any warranty, guarantee or assurance that the information in this presentation is free from errors or omissions. Information is believed to be accurate as of the date of publication. You should check with the appropriate vendor to obtain current product information.

Any proposed use of claims in this presentation outside of the United States must be reviewed by local IBM country counsel prior to such use.

IBM, RS/6000, System p, AIX, AIX 5L, GPFS, and Enterprise Storage Server (ESS) are trademarks or registered trademarks of the International Business Machines Corporation.

Oracle, Oracle9i and Oracle10g are trademarks or registered trademarks of Oracle Corporation.

All other products or company names are used for identification purposes only, and may be trademarks of their respective owners.

Page 3: tuningfor_oracle


Agenda

AIX Configuration Best Practices for Oracle

– Memory
– I/O
– Network
– Miscellaneous

Page 4: tuningfor_oracle


AIX Configuration Best Practices for Oracle

• The suggestions presented here are considered to be basic configuration “starting points” for general Oracle workloads
• Your workloads may vary
• Ongoing performance monitoring and tuning is recommended to ensure that the configuration is optimal for the particular workload characteristics

Page 5: tuningfor_oracle


Performance Overview – Tuning Methodology

[Diagram: the predominant bottleneck lies in one of four subsystems: CPU, Memory, I/O, Network]

• Understand the external view of system performance

The external view of system performance is the observable event that is causing someone to say the system is performing poorly. Typically this is (1) end-user response time, (2) application (or task) response time or (3) throughput. System metrics should not be used to judge improvement.

– End-user response time is the elapsed time between when a user submits a request and receives a response.
– Application response time is the elapsed time required for one or more jobs to complete. Historically, these jobs have been called batch jobs.
– Throughput is the amount of work that can be accomplished per unit time. This metric is typically expressed in transactions per minute.

• Performance only improves when the predominant bottleneck is fixed

Fixing a secondary bottleneck will not improve performance and typically results in overloading an already overloaded predominant bottleneck.

• Monitor performance after a change – tuning is an iterative process

Monitoring is required after making a change for two reasons: (1) fixing the predominant bottleneck typically uncovers another bottleneck, and (2) not all changes yield positive results. If possible, have a “repeatable” test so that each change can be accurately evaluated.

Iterative Tuning Process

Stress System (i.e., tune at peak workload) → Monitor Sub-Systems → Identify Predominant Bottleneck → Tune Bottleneck → Repeat

Page 6: tuningfor_oracle


Performance Monitoring and Tuning Tools

Status commands
– CPU: vmstat, topas, iostat, ps, mpstat, lparstat, sar, time/timex, emstat/alstat
– Memory: vmstat, topas, ps, lsps, ipcs
– I/O Subsystem: vmstat, topas, iostat, lvmstat, lsps, lsattr/lsdev, lspv/lsvg/lslv
– Network: netstat, topas, atmstat, entstat, tokstat, fddistat, nfsstat, ifconfig
– Processes & Threads: ps, pstat, topas, emstat/alstat

Monitor commands
– CPU: netpmon
– Memory: svmon, netpmon, filemon
– I/O Subsystem: fileplace, filemon
– Network: netpmon, tcpdump
– Processes & Threads: svmon, truss, kdb, dbx, gprof, fuser, prof

Trace level commands
– CPU: tprof, curt, splat, trace, trcrpt
– Memory: trace, trcrpt
– I/O Subsystem: trace, trcrpt
– Network: iptrace, ipreport, trace, trcrpt
– Processes & Threads: truss, pprof, curt, splat, trace, trcrpt

Tuning tools
– CPU: schedo, fdpr, bindprocessor, bindintcpu, nice/renice, setpri
– Memory: vmo, rmss, fdpr, chps/mkps
– I/O Subsystem: ioo, lvmo, chdev, migratepv, chlv, reorgvg
– Network: no, chdev, ifconfig
– Processes & Threads: nfso, chdev, fdpr


Page 8: tuningfor_oracle

Advanced Technical Support – System p


AIX Memory Management Overview

The role of Virtual Memory Manager (VMM) is to provide the capability for programs to address more memory locations than are actually available in physical memory.

On AIX this is accomplished using segments that are partitioned into fixed sizes called “pages”.

– A segment is 256M

– default page size 4K

– POWER4+ and POWER5 can define large pages, which are 16M

The 32-bit or 64-bit effective address translates into a 52-bit or 80-bit virtual address

– 32-bit system: the high-order 4 bits select a segment register that contains a 24-bit segment id; the remaining 28 bits are the offset.

• 24-bit segment id + 28-bit offset = 52-bit VA

– 64-bit system: the high-order 36 bits select a segment register that contains a 52-bit segment id; the remaining 28 bits are the offset.

• 52-bit segment id + 28-bit offset = 80-bit VA

The VMM maintains a list of free frames that can be used to hold pages that need to be brought into memory.

– The VMM replenishes the free list by removing some of the current pages from real memory (i.e., steal memory).

– The process of moving data between memory and disk is called “paging”.

The VMM uses a Page Replacement Algorithm (implemented in the lrud kernel threads) to select pages that will be removed from memory.

Page 9: tuningfor_oracle


Virtual Memory Space – 64 Bits

[Figure: a 64-bit address is split into 36 bits that select the segment register and a 28-bit offset within the segment. Segments shown include the Kernel Segment, the Page Space Disk Map and the Kernel Heap.]

– Each segment register contains a 52-bit segment ID
– 52-bit segment ID + 28-bit offset = 80-bit virtual address
– Virtual memory: 1 trillion terabytes, or 1 yottabyte
– Each segment is 256 Mbytes: the 28-bit offset accesses a specific location in the segment, and 2^28 = 256M
– A segment is divided into 4096-byte chunks called pages
– Each segment can have a maximum of 65536 pages
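The address arithmetic on this slide can be checked with a few lines of Python. This is only a sanity check of the numbers quoted above; the constant names are illustrative and are not AIX identifiers.

```python
# Address-space arithmetic using the figures quoted on the slide.
OFFSET_BITS = 28        # offset within a segment
SEGMENT_ID_BITS = 52    # segment ID held in a segment register (64-bit case)
PAGE_SIZE = 4096        # default page size in bytes

segment_size = 2 ** OFFSET_BITS                   # bytes per segment
pages_per_segment = segment_size // PAGE_SIZE     # 4 KB pages per segment
virtual_address_bits = SEGMENT_ID_BITS + OFFSET_BITS
selector_bits = 64 - OFFSET_BITS                  # bits that select the segment register

print(segment_size // 2**20, "MB per segment")
print(pages_per_segment, "pages per segment")
print(virtual_address_bits, "bit virtual address")
print(selector_bits, "bits select the segment register")
```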

Page 10: tuningfor_oracle


Memory Tuning Overview

Memory tunables, by area:

– Virtual Memory (General): minfree, maxfree, lru_file_repage, lru_poll_interval
– Large Pages (Pinned Memory): v_pinshm, lgpg_regions, lgpg_size
– JFS: maxperm, strict_maxperm
– Enhanced JFS (JFS2): maxclient, strict_maxclient

NAME                CUR    DEF    BOOT   MIN   MAX     UNIT           TYPE
--------------------------------------------------------------------------------
lru_file_repage     1      1      1      0     1       boolean        D
lru_poll_interval   0      0      0      0     60000   milliseconds   D
maxclient%          80     80     80     1     100     % memory       D
maxfree             1088   1088   1088   8     200K    4KB pages      D
maxperm%            80     80     80     1     100     % memory       D
minfree             960    960    960    8     200K    4KB pages      D
strict_maxclient    1      1      1      0     1       boolean        D
strict_maxperm      0      0      0      0     1       boolean        D
minperm%            20     20     20     1     100     % memory       D

vmo -p -o <parameter name>=<new value>

The -p flag also updates /etc/tunables/nextboot

Page 11: tuningfor_oracle


Virtual Memory Manager (VMM) Tuning

The AIX “vmo” command provides for the display and/or update of several parameters which influence the way AIX manages physical memory

– The “-a” option displays current parameter settings

vmo -a

– The “-o” option is used to change parameter values

vmo -o minfree=1440

– The “-p” option is used to make changes persist across a reboot

vmo -p -o minfree=1440

A number of the default “vmo” settings are not optimized for database workloads and should be modified for Oracle environments

Page 12: tuningfor_oracle


VMM Tuning

Suggested Combination

– maxperm% = maxclient% = <High Percentage>
– minperm% = <Low Percentage>
– strict_maxperm=0
– strict_maxclient=1
– lru_file_repage=0
– lru_poll_interval=10

The file cache will be allowed to grow; however, when the VMM needs memory it will steal only file pages. Why? Because we’ve set lru_file_repage=0.

What is <High Percentage>?

– If possible, set it so that maxclient% is always greater than numclient% (vmstat -v)
• Why? maxclient% is a hard limit; while numclient% stays below it, lrud will not run to enforce the limit

What is <Low Percentage>?

– Set it so that numperm% (vmstat -v) is always greater than minperm%
• Why? If numperm% drops below minperm%, the VMM acts as if lru_file_repage=1 and will steal computational pages

Page 13: tuningfor_oracle


VMM Tuning Combination Summary

Goal is to prevent paging of computational memory.

Recommended Method:

lru_file_repage = 0

strict_maxperm = 0

strict_maxclient = 1

maxperm% = maxclient% = High Percentage

minperm% = Low Percentage

lru_poll_interval=10

Classic Method*:

lru_file_repage = 1

strict_maxperm = 0

strict_maxclient = 0

maxperm% = maxclient% = 20% (or small number)

minperm% = 5

lru_poll_interval=10

* This method is appropriate for systems that do not have the ‘lru_file_repage’ tunable.

Calculated Method:

lru_file_repage = 0

strict_maxperm = 0

strict_maxclient = 1

maxperm% = maxclient% = 1 - % Computational + 20%

lru_poll_interval=10

Where,

%Computational = max. AVM / Real Memory Frames

Avoid:

strict_maxperm = 1 and strict_maxclient = 0

strict_maxperm = strict_maxclient = 0 & lru_file_repage = 0

Page 14: tuningfor_oracle


Virtual Memory Management (VMM) Thresholds

[Figure: physical memory usage (0% to 100%) over time, plotting numperm%, comp% and free% against the maxperm%, maxfree, minfree and minperm% thresholds]

– Start stealing pages when free memory is below minfree
– Stop stealing pages when free memory is above maxfree
– When numperm% > maxperm%, steal only file system pages
– When minperm% < numperm% < maxperm%, steal file system or computational pages, depending on the repage rate
– When numperm% < minperm%, steal both file system and computational pages
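The threshold rules above can be written down as a small decision sketch. This is only a model of the rules as stated on the slide, not the lrud implementation; the function names and return strings are hypothetical, and the behaviour exactly at a threshold is an assumption because the slide uses strict inequalities.

```python
def steal_policy(numperm_pct, minperm_pct, maxperm_pct):
    """Which page types may be stolen, per the slide's numperm% rules."""
    if numperm_pct > maxperm_pct:
        return "file"                       # file cache is over maxperm%
    if numperm_pct < minperm_pct:
        return "file+computational"         # file cache is under minperm%
    # Between the thresholds (boundary values land here by assumption).
    return "depends on repage rate"


def stealing_active(free_pages, minfree, maxfree, currently_stealing):
    """Start stealing below minfree, stop above maxfree, else keep state."""
    if free_pages < minfree:
        return True
    if free_pages > maxfree:
        return False
    return currently_stealing


print(steal_policy(85, 20, 80))
print(stealing_active(900, 960, 1088, False))
```

Keeping the start and stop thresholds apart is what prevents the page stealer from switching on and off around a single value.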

Page 15: tuningfor_oracle


VMM Page Stealing Thresholds

The following define thresholds for the VMM page stealing process (lrud):

minfree
– Set minfree = 120 x # logical CPUs / # memory pools
– Consider increasing if the vmstat “fre” column frequently approaches zero or if “vmstat -s” shows significant “free frame waits”

maxfree
– Set maxfree = minfree + (MAX(maxpgahead, j2_maxPageReadAhead) x # logical CPUs)

Example:

For a 6-way LPAR with SMT enabled (12 logical CPUs), maxpgahead=8 and j2_maxPageReadAhead=8:
– minfree = 120 x 6 x 2 = 1440 in total (1440 / 4 = 360 per memory pool with 4 memory pools)
– maxfree = 1440 + (max(8,8) x 6 x 2) = 1536

vmo -p -o minfree=1440 -o maxfree=1536

Page 16: tuningfor_oracle


AIX 5.3/6.1 – minfree and maxfree changes

– minfree and maxfree on AIX 5.3/6.1 are now applied to each memory pool: total free list = minfree * # of memory pools
– In earlier releases of AIX (5.2 and 5.1), minfree was divided by the number of memory pools so that the total free list (determined by adding minfree for *each* memory pool) equaled the vmo/vmtune value of minfree.

AIX Level   minfree   mempools   LRUD starts when
5.1/5.2     1024      4          free_list <= 1024
5.3         1024      4          free_list <= (4 * 1024)

Initial Setting AIX 5.3/6.1

minfree = max(960, lcpus * 120) / # of mempools
maxfree = minfree + (Max Read Ahead * lcpus) / # of mempools

Initial Setting AIX 5.2

minfree = max(960, lcpus * 120)
maxfree = minfree + (Max Read Ahead * lcpus)

Where,

Max Read Ahead = max(maxpgahead, j2_maxPageReadAhead)
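The initial-setting formulas can be sketched as a small calculator. This is a starting-point sketch only: the function and argument names are hypothetical, integer division is an assumption, and it assumes that in the AIX 5.3/6.1 form only the read-ahead term of maxfree is divided by the number of memory pools.

```python
def initial_free_thresholds(lcpus, mempools, maxpgahead, j2_max_read_ahead,
                            per_pool=True):
    """Starting minfree/maxfree values (in 4 KB pages) from the slide.

    per_pool=True  -> AIX 5.3/6.1 form (divided by memory pools)
    per_pool=False -> AIX 5.2 form (system-wide values)
    """
    read_ahead = max(maxpgahead, j2_max_read_ahead)
    divisor = mempools if per_pool else 1      # assumption: floor division
    minfree = max(960, lcpus * 120) // divisor
    maxfree = minfree + (read_ahead * lcpus) // divisor
    return minfree, maxfree


# 6-way LPAR with SMT (12 logical CPUs), 4 memory pools, read-ahead of 8
print(initial_free_thresholds(12, 4, 8, 8, per_pool=False))
print(initial_free_thresholds(12, 4, 8, 8, per_pool=True))
```

Treat the results as initial values to be validated with vmstat under load, in line with the iterative method described earlier.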

Page 17: tuningfor_oracle


AIX Paging Space

Allocate Paging Space:
– Configure the Server/LPAR with enough physical memory to satisfy memory requirements
– With AIX demand paging, paging space does not have to be large
– Provides a safety net to prevent system crashes when memory is overcommitted
– Generally, keep paging space on an internal drive or on high-performing SAN storage

Monitor paging activity:
– vmstat -s
– sar -r
– nmon

Resolve paging issues:
– Reduce file system cache size (MAXPERM, MAXCLIENT)
– Reduce Oracle SGA or PGA (9i or later) size
– Add physical memory

Do not over commit real memory!

Page 18: tuningfor_oracle


AIX 5.3/6.1 Multiple Page Size Support

AIX 5.3 5300-04 introduces two new page sizes:
– 64K
– 16M (large pages)

Requires p5+ hardware

Requires p5 System Release 240, Service Level 202 microcode

16MB support requires Version 5 Release 2 of the Hardware Management Console (HMC) machine code

User/Application must request preferred page size
– 64K pages appear very promising, since they do not need to be configured/reserved in advance
– Will require Oracle code changes to explicitly support (10.2.0.4)
– If preferred size not available, the largest available smaller size will be used
• Current Oracle versions should end up using 64KB pages if 16MB pages not configured?

Page 19: tuningfor_oracle


Large Page Support (optional)

Pinning shared memory

AIX Parameters
• vmo -p -o v_pinshm=1
• Leave maxpin% at the default of 80% unless the SGA exceeds 77% of real memory
– vmo -p -o maxpin%=[(total mem-SGA size)*100/total mem] + 3

Oracle Parameters
• LOCK_SGA = TRUE

Enabling Large Page Support
vmo -r -o lgpg_size=16777216 -o lgpg_regions=(SGA size / 16 MB)

Allowing Oracle to use Large Pages
chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE oracle

Using Monitoring Tools
svmon -G
svmon -P

Oracle metalink note# 372157.1

Page 20: tuningfor_oracle


Determining SGA size

SGA Memory Summary for DB: test01  Instance: test01  Snaps: 1046 -1047

SGA regions                       Size in Bytes
------------------------------ ----------------
Database Buffers                 16,928,210,944
Fixed Size                              768,448
Redo Buffers                          2,371,584
Variable Size                     1,241,513,984
                               ----------------
sum                              18,172,864,960

lgpg_regions = 18,172,864,960 / 16,777,216 = 1084 (rounded up)
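The same rounding can be reproduced in a few lines, which is handy when the SGA size changes. A minimal sketch using the region sizes from the report above; the function name is hypothetical.

```python
LARGE_PAGE_BYTES = 16 * 1024 * 1024   # 16 MB large page (lgpg_size=16777216)


def lgpg_regions_for(sga_bytes):
    """Number of 16 MB large pages needed to cover the SGA, rounded up."""
    return (sga_bytes + LARGE_PAGE_BYTES - 1) // LARGE_PAGE_BYTES


# Region sizes copied from the SGA Memory Summary
sga_regions = {
    "Database Buffers": 16_928_210_944,
    "Fixed Size": 768_448,
    "Redo Buffers": 2_371_584,
    "Variable Size": 1_241_513_984,
}
sga_total = sum(sga_regions.values())
print(sga_total, lgpg_regions_for(sga_total))
```

Integer arithmetic is used for the round-up so that very large SGA sizes are not affected by floating-point precision.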

Page 21: tuningfor_oracle


Tuning and Improving System Performance

Adjust the VMM Tuning Parameters
– Key parameters listed on word document

Implement VMM related Mount Options
– DIO / CIO
– Release-behind on read and/or write

Reduce Application Memory Requirements

Memory Model
– %Computational < 70% - Large Memory Model – Goal is to adjust tuning parameters to prevent paging
• Multiple memory pools
• Page space smaller than memory
• Must tune VMM key parameters
– %Computational > 70% - Small Memory Model – Goal is to make paging as efficient as possible
• Add multiple page spaces on different spindles
• Make all page spaces the same size to ensure round-robin scheduling
• PS = 1.5 x computational requirements
• Turn off DEFPS
• Memory Load Control

Add additional Memory
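The memory model choice follows from %Computational, defined earlier as max. AVM / Real Memory Frames. A minimal sketch of that classification, with hypothetical names; the slide does not say which model applies at exactly 70%, so the code assigns that case to the small model as an assumption.

```python
def memory_model(max_avm_pages, real_memory_frames):
    """Return (%Computational, model) using the slide's 70% threshold."""
    pct_computational = 100.0 * max_avm_pages / real_memory_frames
    # Assumption: exactly 70% is treated as the small memory model.
    model = "large" if pct_computational < 70 else "small"
    return pct_computational, model


# Peak AVM (from vmstat) and real memory, both in 4 KB frames
print(memory_model(600_000, 1_000_000))
print(memory_model(800_000, 1_000_000))
```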


Page 23: tuningfor_oracle


The AIX IO stack

[Figure: the layers of the I/O stack, top to bottom: Application; Logical File System; Local FS (JFS/JFS2) or Remote FS (NFS); VMM; LVM; Device Driver(s); Disk Subsystem (optional); Disk. Raw LVs and raw disks bypass the file system layers.]

– Application memory area caches data to avoid IO
– NFS caches file attributes; NFS has a cached filesystem for NFS clients
– JFS and JFS2 cache use extra system RAM: JFS uses persistent pages for cache, JFS2 uses client pages for cache
– Queues exist for both adapters and disks
– Adapter device drivers use DMA for IO
– Disk subsystems have read and write cache (write cache: ack sent back to application)
– Disks have memory to store commands/data

Page 24: tuningfor_oracle


Asynchronous I/O

AIX parameters (smit aio)
– minservers = 10 * # cpus
– maxservers = (10 * # disks) / # cpus
– maxreqs = a multiple of 4096 > 4 * # disks * queue_depth
– “enable” at system restart
– Typical settings: minservers=100, maxservers=200, maxreqs=16384

Oracle parameters (init.ora)
– disk_asynch_io = TRUE
– filesystemio_options = {ASYNCH | SETALL}
– db_writer_processes = n (normally left at default, 1)
– db_writer_io_slaves = n (don’t use – implements AIO simulation)

Monitor usage:
• Watch for Oracle alert log or trace file messages:
– Warning “lio_listio returned EAGAIN”
• AIX Monitoring
– “pstat -a | grep aios”
– Use “-A” and “-t” options for NMON

Note: AIX can service asynchronous I/O in two ways. The AIO server method uses process-based I/O, whereas the FASTPATH method is kernel-based (interrupt-based) and performs much better. Make sure FASTPATH is enabled by using the following command:
– lsattr -El aio0 and look for the value "fastpath", which should be enabled
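The AIO rules of thumb above are easy to turn into starting values for a given configuration. A minimal sketch with hypothetical names; integer division for maxservers is an assumption, and maxreqs is taken as the smallest multiple of 4096 strictly greater than 4 * disks * queue_depth.

```python
def aio_starting_values(cpus, disks, queue_depth):
    """Starting AIO settings from the slide's rules of thumb."""
    minservers = 10 * cpus
    maxservers = (10 * disks) // cpus          # assumption: floor division
    outstanding = 4 * disks * queue_depth
    # Smallest multiple of 4096 that is strictly greater than 'outstanding'
    maxreqs = (outstanding // 4096 + 1) * 4096
    return {"minservers": minservers,
            "maxservers": maxservers,
            "maxreqs": maxreqs}


# Example: 8 CPUs, 100 hdisks, queue_depth of 20 per hdisk
print(aio_starting_values(8, 100, 20))
```

These are starting points only; the EAGAIN warnings and the AIX monitoring listed above indicate whether they need to be raised.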

Page 25: tuningfor_oracle


AIX Filesystems

Journaled File System (JFS)
– Better for lots of small file creates & deletes
– Buffer caching (default) provides sequential read-ahead, cached writes, etc.
– Direct I/O (DIO) mount/open option: no caching on reads

Enhanced JFS (JFS2)
– Better for large files/filesystems
– Buffer caching (default) provides sequential read-ahead, cached writes, etc.
– Direct I/O (DIO) mount/open option: no caching on reads
– Concurrent I/O (CIO) mount/open option: DIO, with write serialization disabled
• Use for Oracle .dbf, control files and online redo logs only!!!

GPFS
– Clustered filesystem – the IBM filesystem for RAC
– Non-cached, non-blocking I/Os (similar to JFS2 CIO) for all Oracle files

GPFS and JFS2 with CIO offer performance similar to Raw Devices

Page 26: tuningfor_oracle


Cached vs. non-Cached (Direct) I/O

File system caching tends to benefit heavily sequential workloads with low write content. To enable caching for JFS/JFS2:
– Use default filesystem mount options
– Set Oracle filesystemio_options=ASYNCH

DIO tends to benefit heavily random access workloads and CIO tends to benefit heavy update workloads. To disable JFS, JFS2 caching, see the following table:

        Oracle 9i                          Oracle 10g
JFS     Set filesystemio_options=SETALL    Set filesystemio_options=SETALL
        -or- use “dio” mount option        -or- use “dio” mount option
JFS2    Use “cio” mount option             Set filesystemio_options=SETALL
                                           -or- use “cio” mount option

Page 27: tuningfor_oracle


CIO Demotion and Filesystem Block Size

Data Base Files (DBF)

If db_block_size = 2048 set agblksize=2048

If db_block_size >= 4096 set agblksize=4096

Redo Log Files

Set agblksize=512 and use CIO or DIO
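The block-size rules above reduce to a small lookup. A minimal sketch with a hypothetical function name; block sizes not covered by the slide are rejected rather than guessed.

```python
def agblksize_for(db_block_size=None, redo_log=False):
    """JFS2 agblksize suggested by the slide for a given file type."""
    if redo_log:
        return 512                      # redo logs: agblksize=512 with CIO or DIO
    if db_block_size == 2048:
        return 2048
    if db_block_size is not None and db_block_size >= 4096:
        return 4096
    raise ValueError("block size not covered by the slide's rules")


print(agblksize_for(8192))
print(agblksize_for(redo_log=True))
```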

Page 28: tuningfor_oracle


I/O Tuning (ioo)

READ-AHEAD (Only applicable to JFS/JFS2 with caching enabled)

MINPGAHEAD (JFS) or j2_minPageReadAhead (JFS2)
– Default: 2
– Starting value: MAX(2, DB_BLOCK_SIZE / 4096)

MAXPGAHEAD (JFS) or j2_maxPageReadAhead (JFS2)
– Default: 8 (JFS), 128 (JFS2)
– Set equal to (or multiple of) size of largest Oracle I/O request
• DB_BLOCK_SIZE * DB_FILE_MULTIBLOCK_READ_COUNT

Number of buffer structures per filesystem:

NUMFSBUFS
– Default: 196, Starting Value: 568

j2_nBufferPerPagerDevice (replaced by j2_dynamicBufferPreallocation)
– Default: 512, Starting Value: 2048

Monitor with “vmstat -v”
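The read-ahead starting values can be derived from the Oracle block size and multiblock read count. A minimal sketch with hypothetical names; it assumes the read-ahead tunables are counted in 4 KB pages and rounds the largest Oracle I/O request up to whole pages.

```python
PAGE_BYTES = 4096   # read-ahead tunables are expressed in 4 KB pages


def read_ahead_start(db_block_size, multiblock_read_count):
    """Return (min read-ahead, max read-ahead) in 4 KB pages."""
    min_pages = max(2, db_block_size // PAGE_BYTES)
    largest_io_bytes = db_block_size * multiblock_read_count
    max_pages = -(-largest_io_bytes // PAGE_BYTES)   # ceiling division
    return min_pages, max_pages


# 8 KB blocks with DB_FILE_MULTIBLOCK_READ_COUNT = 16
print(read_ahead_start(8192, 16))
```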

Page 29: tuningfor_oracle


Data Layout for Optimal I/O Performance

Stripe and mirror everything (SAME) approach:

Goal is to balance I/O activity across all disks, loops, adapters, etc...

Avoid/Eliminate I/O hotspots

Manual file-by-file data placement is time consuming, resource intensive and iterative

Use RAID-5 or RAID-10 to create striped LUNs (hdisks)

Create AIX Volume Group(s) (VG) with LUNs from multiple arrays, striping on the front end as well for maximum distribution

– Physical Partition Spreading (mklv -e x) -or-

– Large Grained LVM striping (>= 1MB stripe size)

http://www-1.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP100319

Page 30: tuningfor_oracle


Data Layout cont’d…

Stripe using Logical Volume (LV) or Physical Partition (PP) striping

LV Striping
– Oracle recommends a stripe width that is a multiple of
• db_block_size * db_file_multiblock_read_count
• Usually around 1 MB
– Valid LV stripe sizes:
• AIX 5.2: 4k, 8k, 16k, 32k, 64k, 128k, 256k, 512k, 1 MB
• AIX 5.3: AIX 5.2 stripe sizes + 2M, 4M, 16M, 32M, 64M, 128M
– Use AIX Logical Volume 0 offset (9i Release 2 or later)
• Use Scalable Volume Groups (VGs), or use “mklv -T O” with Big VGs
• Requires AIX APAR IY36656 and Oracle patch (bug 2620053)

PP Striping
– Use minimum Physical Partition (PP) size (mkvg -t, -s parms)
• Spread AIX Logical Volume (LV) PPs across multiple hdisks in VG (mklv -e x)

Page 31: tuningfor_oracle


Tuning and Improving System Performance

Adjust the key IOO Tuning Parameters

Adjust device specific tuning Parameters

Other I/O tuning Options

– DIO / CIO

– Release-behind on read and/or write

– IO Pacing

– Write Behind

Improve the data layout

Add additional hardware resources


Page 33: tuningfor_oracle


Network Options (no) Parameters

• Set sb_max >= 1 MB (1048576)
• Set tcp_sendspace >= 262144
• Set tcp_recvspace >= 262144
• Set rfc1323=1

Page 34: tuningfor_oracle


Additional Network (no) Parameters for RAC:

Set udp_sendspace = db_block_size * db_file_multiblock_read_count

(not less than 65536)

Set udp_recvspace = 4 * udp_sendspace

– Must be < sb_max

Increase if buffer overflows occur

Examples:

no -a | grep udp_sendspace

no -p -o udp_sendspace=65536

netstat -s | grep "socket buffer overflows"
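The RAC UDP sizing rules can be combined into one check. A minimal sketch with hypothetical names; the default sb_max of 1048576 is taken from the general network recommendation in this deck and is an assumption for any particular system.

```python
def rac_udp_buffers(db_block_size, multiblock_read_count, sb_max=1048576):
    """Return (udp_sendspace, udp_recvspace) following the slide's rules."""
    udp_sendspace = max(65536, db_block_size * multiblock_read_count)
    udp_recvspace = 4 * udp_sendspace
    if udp_recvspace >= sb_max:
        # The slide requires udp_recvspace to be less than sb_max.
        raise ValueError("udp_recvspace must be < sb_max; increase sb_max")
    return udp_sendspace, udp_recvspace


# 8 KB blocks with db_file_multiblock_read_count = 16
print(rac_udp_buffers(8192, 16))
```

If the check fails, sb_max has to be raised before the computed udp_recvspace can be applied with the no command.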


Page 36: tuningfor_oracle


Miscellaneous parameters

User Limits (smit chuser)
– Soft FILE size = -1 (Unlimited)
– Soft CPU time = -1 (Unlimited)
– Soft DATA segment = -1 (Unlimited)
– Soft STACK size = -1 (Unlimited)
– /etc/security/limits

Maximum number of PROCESSES allowed per user (smit chgsys)
– maxuproc >= 2048

Environment variables:
– AIXTHREAD_SCOPE=S

Page 37: tuningfor_oracle


DLPAR & Oracle

CPU

Oracle 9i
– Oracle CPU_COUNT does not recognize change in # logical cpus
– AIX scheduler can still use the added CPUs

Oracle 10g
– Oracle CPU_COUNT is dynamically updated for change in # logical cpus

Memory

Oracle 9i or 10g
– SGA can be dynamically resized, but has an upper bound of SGA_MAX_SIZE.
• SGA_TARGET (10g)
• DB_CACHE_SIZE, SHARED_POOL_SIZE, etc.
– PGA_AGGREGATE_TARGET can be dynamically resized

SGA_TARGET and PGA_AGGREGATE_TARGET are not hard limits

Page 38: tuningfor_oracle


Micro-Partitioning technology

Partitioning options
– Micro-partitions: Up to 254*
– Dynamic LPARs: Up to 32*
– Combination of both

Configured via the HMC

Number of logical processors
– Minimum/maximum

Entitled capacity
– In units of 1/100 of a CPU
– Minimum 1/10 of a CPU

Variable weight
– % share (priority) of surplus capacity

Capped or uncapped partitions

[Figure: a pool of 6 CPUs shared by micro-partitions running Linux, i5/OS V5R3** and AIX 5L V5.3, each with a minimum, entitled and maximum capacity, alongside dynamic LPARs on whole processors running AIX 5L V5.2 and AIX 5L V5.3, all on top of the Hypervisor]

Micro-Partitioning technology allows each processor to be subdivided into as many as 10 “virtual servers”

Note: Micro-partitions are optional.

* on p5-590 and p5-595
** on p5-570, p5-590, and p5-595

Page 39: tuningfor_oracle


Shared Processor Logical Partitions – Terminology

[Figure: a shared processor pool with a capacity of 6 processing units serving several LPARs (AIX 5.3, with and without SMT), showing how POWER5 chips and processor cores map to virtual processors and logical processors]

Shared Processor Logical Partition (SPLPAR) key terms that will be discussed:

Physical Processors (PP) – The example is an 8-way p5 590. For this configuration one MCM houses 4 POWER5 chips and each POWER5 chip has two processor cores. The four POWER5 chips are packaged on a Multi-Chip Module (MCM). With SMT enabled each processor core can simultaneously execute two instruction threads.

Shared Processor Pool – 6 processors have been allocated to the shared processor pool and 2 processors have been allocated to a dedicated partition.

Virtual Processors (VP) – The operating system views each virtual processor as a “physical processor”.

Logical Processors – With SMT enabled each VP is viewed by the operating system as having two logical processors.

Processor Capacity specification for SPLPARs – Each SPLPAR has an entitled processing capacity, which is defined via a number of partition configuration parameters.

Now, let’s discuss processor capacity specification in more detail.

Page 40: tuningfor_oracle


Capped Shared Processor LPAR

[Figure: processor capacity utilization over time for a capped LPAR, showing the maximum, entitled and minimum processor capacity, the LPAR's utilized capacity, its ceded capacity and the pool idle capacity available]

Page 41: tuningfor_oracle


Uncapped Shared Processor LPAR

[Figure: processor capacity utilization over time for an uncapped LPAR, showing the maximum, entitled and minimum processor capacity, the LPAR's utilized capacity, its ceded capacity and the pool idle capacity available]

Page 42: tuningfor_oracle


Simultaneous Multithreading (SMT) & Oracle

[Figure: four charts comparing the same workload without and with SMT. Without SMT (AIX52, 3/9/2004, 12:34 to 13:44): “CPU Total” (User%, Sys%, Wait%, scale 0 to 100) and “Processes” (RunQueue, Swap-in, scale 0 to 45). With SMT (AIX53, 10/9/2004, 10:45 to 11:49): “CPU Total” (User%, Sys%, Wait%, scale 0 to 100) and “Processes” (RunQueue, Swap-in, scale 0 to 25).]


Page 44: tuningfor_oracle


Reference Material:

Oracle Technical Documentation

http://technet.oracle.com

Oracle Support

http://metalink.oracle.com (requires support license)

Check metalink note ID 282036.1

IBM Redbooks on Oracle

http://www.redbooks.ibm.com

Advanced Technical Support (Techdocs)

http://www.ibm.com/support/techdocs

http://w3.ibm.com/support/techdocs (IBM Internal)

GPFS Documentation

http://publib.boulder.ibm.com/clresctr/library/gpfs_faqs.html

AIX Documentation

http://www.ibm.com/servers/eserver/pseries/library/

Page 45: tuningfor_oracle


Q&A

Page 46: tuningfor_oracle


The following are trademarks of the International Business Machines Corporation in the United States and/or other countries. For a complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml: AS/400, DBE, e-business logo, ESCON, eServer, FICON, IBM, IBM Logo, iSeries, MVS, OS/390, pSeries, RS/6000, S/390, VM/ESA, VSE/ESA, WebSphere, xSeries, z/OS, zSeries, z/VM

The following are trademarks or registered trademarks of other companies

Lotus, Notes, and Domino are trademarks or registered trademarks of Lotus Development Corporation.
Java and all Java-related trademarks and logos are trademarks of Sun Microsystems, Inc., in the United States and other countries.
LINUX is a registered trademark of Linus Torvalds.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Microsoft, Windows and Windows NT are registered trademarks of Microsoft Corporation.
SET and Secure Electronic Transaction are trademarks owned by SET Secure Electronic Transaction LLC.
Intel is a registered trademark of Intel Corporation.
* All other products may be trademarks or registered trademarks of their respective companies.

NOTES:

Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.

IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.

All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.

This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.

All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.

References in this document to IBM products or services do not imply that IBM intends to make them available in every country.

Any proposed use of claims in this presentation outside of the United States must be reviewed by local IBM country counsel prior to such use.

The information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

Trademarks