Earl Jew, Part I: How to Monitor and Analyze AIX VMM and Storage I/O Statistics (April 4, 2013)


  • Materials may not be reproduced in whole or in part without the prior written permission of IBM.

    2011 IBM Power Systems Technical University, October 10-14 | Fontainebleau Miami Beach | Miami, FL

    Copyright IBM Corporation 2012

    PE61 Part I: Updated Concepts and Tactics -- How to Monitor and Analyze the VMM and Storage I/O Statistics of a Power/AIX LPAR

    Earl Jew ([email protected]) 310-251-2907 cell
    Senior IT Management Consultant - IBM Power Systems and IBM Systems Storage
    IBM Lab Services and Training - US Power Systems (group/dept)
    400 North Brand Blvd., c/o IBM 8th floor, Glendale, CA 91203
    [Extended: April 4th, 2013]

  • 2 Copyright IBM Corporation 2012

    Part I: Updated Concepts and Tactics -- How to Monitor and Analyze the VMM and Storage I/O Statistics of a Power/AIX LPAR

    ABSTRACT: This presentation updates AIX/VMM (Virtual Memory Management) and LVM/JFS2 storage IO performance concepts and tactics for the day-to-day Power/AIX system administrator. It explains the meaning of the numbers offered by AIX commands (vmstat, iostat, mpstat, sar, etc.) to monitor and analyze the AIX VMM and storage IO performance and capacity of a given Power7/AIX LPAR.

    These tactics are further illustrated in Part II: Updated Real-world Case Histories -- How to Monitor and Analyze the VMM and Storage I/O Statistics of a Power/AIX LPAR.

  • 3 Copyright IBM Corporation 2012

    Part II: Updated Real-world Case Histories -- How to Monitor and Analyze the VMM and Storage I/O Statistics of a Power/AIX LPAR

    ABSTRACT: These updated case-histories further illustrate the content presented in Part I: Updated Concepts and Tactics -- How to Monitor and Analyze the VMM and Storage I/O Statistics of a Power/AIX LPAR.

    This presentation includes suggested ranges and ratios of AIX statistics to guide VMM and storage IO performance and capacity analysis.

    Each case is founded on a different real-world customer configuration and workload that manifests characteristically in the AIX performance statistics -- as performing: intensely in bursts, with hangs and releases, AIX:lrud constrained, AIX-buffer constrained, freely unconstrained, inode-lock contended, consistently light, atomic&synchronous, virtually nil IO workload, long avg-wait's, perfectly ideal, long avg-serv's, mostly rawIO, etc.

  • 4 Copyright IBM Corporation 2012

    Strategic Perspective: What is Workload Characterization?

    Power/AIX performance-tuning is based on continuous cycles of:
    workload characterization, i.e. monitoring for indicated issues
    implementing tactics to remedy indicated issues

    Workload characterization is determining an infrastructure's resource capacities under load

    In other words, workload characterization examines:
    the readiness of instructions & data residing in SAN storage, main memory, and Power7 L3/L2/L1 cache
    the latency & throughput of instruction & data transfers between the above, i.e. multipathing, blocked IOs
    the processing of instructions & data, i.e. CPUs simultaneously executing prioritized processes/threads
    the dynamic balance and relative exhaustion/surplus of the above resources

    Workload characterization accounts for an LPAR's technology, implementation, and size/count/bandwidth:
    IBM Power CPU technology, i.e. Power5/5+, Power6/6+, Power7/7+
    Booted implementation, i.e. shared-pool vs dedicated CPU LPARs, SRAD affinity assignment
    Component implementation, i.e. dedicated IO adapters (traditional) vs. dual-VIOS (PowerVM), NPIV
    Size/count/bandwidth of component technologies to address the expected workload, i.e.:
    Total LPAR gbRAM and the relative amounts of the four main sections of AIX VMM memory
    Count of vCPU/eCPU/logicalCPU/FC-HBAs/LAN adapters/PCIe Gen2 slots/etc. and the bandwidth of each

  • 5 Copyright IBM Corporation 2012

    Formulations of AIX Tactics for Empirical Performance Analysis

    This presentation will:
    explain the numbers presented by mundane AIX commands (vmstat, mpstat, iostat, ps, etc.)
    formulate the recognition and severity of indicated AIX performance issues hidden in these numbers
    offer tactics to remedy any indicated AIX performance issues

    Formulated indicators in mundane AIX command output can distinguish areas of resource exhaustion, limitation and over-commitment, as well as resource under-utilization, surplus and over-allocation

    Monitoring AIX: hardware -> implementation -> historical/accumulated stats -> real-time/dynamic stats
    Review component technology of the infrastructure, i.e. ensure proper tuning-by-hardware
    Review implemented AIX structures, i.e. shared vs dedicated CPUs, SRADs, VIOS, NPIV, LVM/JFS2 constructs
    Review historical/accumulated AIX events, usages, pendings, counts, blocks, exhaustion, etc.
    Monitor real-time/dynamic AIX command behaviors, i.e. ps, vmstat, mpstat, iostat, ipcs, etc.

    Interpret all indicators relative to the in-place technology, implementation and count/size/bandwidth of resources
    Historical/cumulative indicators are judged by counts-per-scale over days-uptime since boot
    Real-time/dynamic indicators are compared by ranges & ratios of system resources

    Color-coded Severity-of-Indicators: blue/surplus, green/normal, orange/warning, red/critical

  • 6 Copyright IBM Corporation 2012

    Considerations when Monitoring AIX Performance statistics

    Monitor dynamic AIX behaviors with 1 or 2 second sampling intervals (vs 10-600 secs.)

    Verify a stressful workload exists: We can't tune what is not being taxed

    Discontinue active efforts when done: If/when it runs fast enough, we're tuned

    Favor building track-able discrete structures: We can't tune what can't be tracked

    Discern workload spikes, peaks, bursts and burns: We tune the intensities, not the sleepy-times

    Establish dynamic baselines by monitoring real-time AIX behaviors with ranges & ratios

    Monitor AIX behaviors with the goal of characterizing the workload (vmstat -I 1)
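    A minimal sampling session along these lines (a sketch; the flags are standard AIX 6.1/7.1 options, adjust interval and count to taste):

    $ date ; uptime
    $ vmstat -I 2 30        # kthr / memory / page (fi,fo,pi,po,fr,sr) / faults / cpu at 2-second intervals
    $ mpstat 2 30           # per-logical-CPU utilization
    $ iostat -DRTl 2 30     # per-disk service times and queueing, with timestamps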

  • 7 Copyright IBM Corporation 2012

    Monitoring AIX Usage, Meaning and Interpretation
    Review component technology of the infrastructure, i.e. proper tuning-by-hardware
    Review implemented AIX constructs, i.e. firm near-static structures and settings
    Review historical/accumulated AIX events, i.e. usages, pendings, counts, blocks, etc.
    Monitor dynamic AIX command behaviors, i.e. ps, vmstat, mpstat, iostat, etc.

    Recognizing Common Performance-degrading Scenarios
    High Load Average relative to count-of-LCPUs, i.e. over-threadedness
    vmstat:memory:avm near-to or greater-than lruable-gbRAM, i.e. over-committed
    Continuous low vmstat:memory:fre with persistent lrud (fr:sr) activity
    Continuous high ratio of vmstat:kthr:b relative to vmstat:kthr:r
    Poor ratio of pages freed to pages examined (fr:sr ratio) in vmstat -s output
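    A quick-check sketch for the first two scenarios above (assumes ksh and the standard commands; the thresholds are the ones developed later in this deck):

    $ uptime                                          # load averages, to compare against the lcpu count
    $ vmstat 1 1 | grep "System configuration"        # lcpu= and mem= for that comparison
    $ vmstat -v | grep -E "lruable|computational"     # lruable pages vs percentage of memory used for computational pages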

    Strategic Thoughts, Concepts, Considerations, and Tactics

  • 8 Copyright IBM Corporation 2012

    Note the size, scale, technology and implementation of the given LPAR. Note the LPAR's ratio-of-resources, i.e. CPU-to-RAM-to-SAN I/O.

    $ date ; uname -a ; id ; oslevel -s ; lparstat -i
    Wed Sep 26 00:00:00 EDT 2012
    AIX tsm03 1 6 00X555XX5X00
    uid=0(root) gid=0(system) groups=2(bin),3(sys),7(security),8(cron),10(audit),11(lp)
    6100-06-06-1140
    Node Name : tsm03
    Partition Name : TSM03
    Partition Number : 1
    Type : Shared-SMT-4
    Mode : Uncapped
    Entitled Capacity : 6.00
    Partition Group-ID : 32769
    Shared Pool ID : 0
    Online Virtual CPUs : 6
    Maximum Virtual CPUs : 7
    Minimum Virtual CPUs : 4
    Online Memory : 24064 MB
    Maximum Memory : 24064 MB
    Minimum Memory : 24064 MB
    Variable Capacity Weight : 128
    Minimum Capacity : 4.00
    Maximum Capacity : 7.00
    Capacity Increment : 0.01
    Maximum Physical CPUs in system : 16
    Active Physical CPUs in system : 16
    Active CPUs in Pool : 16
    Shared Physical CPUs in system : 16
    Maximum Capacity of Pool : 1600
    Entitled Capacity of Pool : 1600
    Unallocated Capacity : 0.00
    Physical CPU Percentage : 100.00%
    Unallocated Weight : 0
    Memory Mode : Dedicated
    Total I/O Memory Entitlement : -
    Variable Memory Capacity Weight : -
    Memory Pool ID : -
    Physical Memory in the Pool : -
    Hypervisor Page Size : -
    Unallocated Variable Memory Capacity Weight: -
    Unallocated I/O Memory entitlement : -
    Memory Group ID of LPAR : -
    Desired Virtual CPUs : 6
    Desired Memory : 24064 MB
    Desired Variable Capacity Weight : 128
    Desired Capacity : 6.00

  • 9 Copyright IBM Corporation 2012

    prtconf # note the component technology of the given LPAR

    $ prtconf
    System Model: IBM,8233-E8B
    Machine Serial Number: 5555XXX
    Processor Type: PowerPC_POWER7
    Processor Implementation Mode: POWER 7
    Processor Version: PV_7_Compat
    Number Of Processors: 6
    Processor Clock Speed: 3300 MHz
    CPU Type: 64-bit
    Kernel Type: 64-bit
    LPAR Info: 1 TSM03
    Memory Size: 24064 MB
    Good Memory Size: 24064 MB
    Platform Firmware level: AL710_065
    Firmware Version: IBM,AL710_065
    Console Login: enable
    Auto Restart: true
    Full Core: false

    Network Information
    Host Name: tsm03
    IP Address: 111.222.33.44
    Sub Netmask: 255.255.255.128
    Gateway: 111.222.33.1
    Name Server: 111.222.166.17
    Domain Name: customer.com

    Paging Space Information
    Total Paging Space: 60672MB
    Percent Used: 24%

    Volume Groups Information
    ==============================================================================
    Inactive VGs
    ==============================================================================
    heartbeat_vg
    ==============================================================================
    Active VGs
    ==============================================================================
    tsm_vg:
    PV_NAME        PV STATE   TOTAL PPs   FREE PPs   FREE DISTRIBUTION
    hdiskpower57   active     99          0          00..00..00..00..00
    hdiskpower8    active     9           0          00..00..00..00..00

  • 10 Copyright IBM Corporation 2012

    lscfg # note the placement of components in the implementation of the LPAR

    $ lscfg
    INSTALLED RESOURCE LIST
    The following resources are installed on the machine.
    +/- = Added or deleted from Resource List.
    * = Diagnostic support not available.
    Model Architecture: chrp
    Model Implementation: Multiple Processor, PCI bus

    + sys0        System Object
    + sysplanar0  System Planar
    * vio0        Virtual I/O Bus
    * vscsi2   U8233.E8B.1009ADP-V1-C5-T1  Virtual SCSI Client Adapter
    * vscsi1   U8233.E8B.1009ADP-V1-C3-T1  Virtual SCSI Client Adapter
    * vscsi0   U8233.E8B.1009ADP-V1-C2-T1  Virtual SCSI Client Adapter
    * hdisk3   U8233.E8B.1009ADP-V1-C2-T1-L8400000000000000  Virtual SCSI Disk Drive
    * hdisk2   U8233.E8B.1009ADP-V1-C2-T1-L8300000000000000  Virtual SCSI Disk Drive
    * hdisk1   U8233.E8B.1009ADP-V1-C2-T1-L8200000000000000  Virtual SCSI Disk Drive
    * hdisk0   U8233.E8B.1009ADP-V1-C2-T1-L8100000000000000  Virtual SCSI Disk Drive
    * vsa0     U8233.E8B.1009ADP-V1-C0  LPAR Virtual Serial Adapter
    * vty0     U8233.E8B.1009ADP-V1-C0-L0  Asynchronous Terminal
    * pci5     U5802.001.00H2615-P1  PCI Express Bus
    + ent0     U5802.001.00H2615-P1-C6-T1  2-Port 10/100/1000 Base-TX PCI-Express Ada
    + ent1     U5802.001.00H2615-P1-C6-T2  2-Port 10/100/1000 Base-TX PCI-Express Ada
    * pci4     U5802.001.00H2615-P1  PCI Express Bus
    + fcs6     U5802.001.00H2615-P1-C5-T1  8Gb PCI Express Dual Port FC Adapter (df10
    * fcnet6   U5802.001.00H2615-P1-C5-T1  Fibre Channel Network Protocol Device
    + fscsi6   U5802.001.00H2615-P1-C5-T1  FC SCSI I/O Controller Protocol Device
    * sfwcomm6 U5802.001.00H2615-P1-C5-T1-W0-L0  Fibre Channel Storage Framework Comm
    * rmt156   U5802.001.00H2615-P1-C5-T1-W500308C0022DD803-L0  LTO Ultrium Tape Drive (FCP)
    * rmt157   U5802.001.00H2615-P1-C5-T1-W500308C0022DD803-L1000000000000  LTO Ultrium Tape Drive (FCP)
    + fcs7     U5802.001.00H2615-P1-C5-T2  8Gb PCI Express Dual Port FC Adapter (df10
    + fscsi7   U5802.001.00H2615-P1-C5-T2  FC SCSI I/O Controller Protocol Device
    * rmt74    U5802.001.00H2615-P1-C5-T2-W21000024FF31B5B1-L9000000000000  LTO Ultrium Tape Drive (FCP)
    * rmt75    U5802.001.00H2615-P1-C5-T2-W21000024FF31B5B1-LA000000000000  LTO Ultrium Tape Drive (FCP)
    * rmt76    U5802.001.00H2615-P1-C5-T2-W21000024FF31B5B1-LB000000000000  LTO Ultrium Tape Drive (FCP)
    * rmt77    U5802.001.00H2615-P1-C5-T2-W21000024FF31B5B1-LC000000000000  LTO Ultrium Tape Drive (FCP)
    * rmt78    U5802.001.00H2615-P1-C5-T2-W21000024FF31B5B1-LD000000000000  LTO Ultrium Tape Drive (FCP)
    * rmt79    U5802.001.00H2615-P1-C5-T2-W21000024FF31B5B1-LE000000000000  LTO Ultrium Tape Drive (FCP)

  • 11 Copyright IBM Corporation 2012

    lsdev # note the count & capacity of the component technology of the LPAR

    $ lsdev
    L2cache0 Available L2 Cache
    cd0 Defined Virtual SCSI Optical Served by VIO Server
    en0 Defined 05-00 Standard Ethernet Network Interface
    en1 Defined 05-01 Standard Ethernet Network Interface
    en2 Defined 00-00-00 Standard Ethernet Network Interface
    en3 Available Standard Ethernet Network Interface
    en4 Defined 00-01 Standard Ethernet Network Interface
    en5 Defined 07-00-00 Standard Ethernet Network Interface
    ent0 Available 05-00 2-Port 10/100/1000 Base-TX PCI-Express Adapter (14104003)
    ent1 Available 05-01 2-Port 10/100/1000 Base-TX PCI-Express Adapter (14104003)
    ent2 Available 00-00-00 10 Gigabit Ethernet Adapter (ct3)
    ent3 Available EtherChannel / IEEE 802.3ad Link Aggregation
    et0 Defined 05-00 IEEE 802.3 Ethernet Network Interface
    et1 Defined 05-01 IEEE 802.3 Ethernet Network Interface
    et2 Defined 00-00-00 IEEE 802.3 Ethernet Network Interface
    et3 Defined IEEE 802.3 Ethernet Network Interface
    et4 Defined 00-01 IEEE 802.3 Ethernet Network Interface
    et5 Defined 07-00-00 IEEE 802.3 Ethernet Network Interface
    fcnet0 Defined 01-00-02 Fibre Channel Network Protocol Device
    fcnet1 Defined 01-01-01 Fibre Channel Network Protocol Device
    fcnet2 Defined 03-00-01 Fibre Channel Network Protocol Device
    fcnet3 Defined 03-01-02 Fibre Channel Network Protocol Device
    fcnet4 Defined 04-00-02 Fibre Channel Network Protocol Device
    fcnet5 Defined 04-01-01 Fibre Channel Network Protocol Device
    fcnet6 Defined 02-00-01 Fibre Channel Network Protocol Device
    fcnet7 Defined 02-01-02 Fibre Channel Network Protocol Device
    fcs0 Available 01-00 8Gb PCI Express Dual Port FC Adapter (df1000f114108a03)
    fcs1 Available 01-01 8Gb PCI Express Dual Port FC Adapter (df1000f114108a03)
    fcs2 Available 03-00 8Gb PCI Express Dual Port FC Adapter (df1000f114108a03)
    fcs3 Available 03-01 8Gb PCI Express Dual Port FC Adapter (df1000f114108a03)
    fcs4 Available 04-00 8Gb PCI Express Dual Port FC Adapter (df1000f114108a03)
    fcs5 Available 04-01 8Gb PCI Express Dual Port FC Adapter (df1000f114108a03)
    fcs6 Available 02-00 8Gb PCI Express Dual Port FC Adapter (df1000f114108a03)
    fcs7 Available 02-01 8Gb PCI Express Dual Port FC Adapter (df1000f114108a03)
    fscsi0 Available 01-00-01 FC SCSI I/O Controller Protocol Device
    fscsi1 Available 01-01-02 FC SCSI I/O Controller Protocol Device
    fscsi2 Available 03-00-02 FC SCSI I/O Controller Protocol Device
    fscsi3 Available 03-01-01 FC SCSI I/O Controller Protocol Device
    fscsi4 Available 04-00-01 FC SCSI I/O Controller Protocol Device
    fscsi5 Available 04-01-02 FC SCSI I/O Controller Protocol Device
    fscsi6 Available 02-00-02 FC SCSI I/O Controller Protocol Device
    fscsi7 Available 02-01-01 FC SCSI I/O Controller Protocol Device
    hba0 Available 00-00 10 Gigabit Ethernet-SR PCI-Express Host Bus Adapter (2514300014108c03)

  • 12 Copyright IBM Corporation 2012

    Monitoring AIX Usage, Meaning and Interpretation
    Review component technology of the infrastructure, i.e. proper tuning-by-hardware
    Review implemented AIX constructs, i.e. firm near-static structures and settings
    Review historical/accumulated AIX events, i.e. usages, pendings, counts, blocks, etc.
    Monitor dynamic AIX command behaviors, i.e. ps, vmstat, mpstat, iostat, etc.

    Recognizing Common Performance-degrading Scenarios
    High Load Average relative to count-of-LCPUs, i.e. over-threadedness
    vmstat:memory:avm near-to or greater-than lruable-gbRAM, i.e. over-committed
    Continuous low vmstat:memory:fre with persistent lrud (fr:sr) activity
    Continuous high ratio of vmstat:kthr:b relative to vmstat:kthr:r
    Poor ratio of pages freed to pages examined (fr:sr ratio) in vmstat -s output

    Strategic Thoughts, Concepts, Considerations, and Tactics

  • 13 Copyright IBM Corporation 2012

    lsps -a ; lsps -s ; mount ; df -k # review the implemented construction of firm AIX structures

    $ lsps -a ; lsps -s ; mount ; df -k
    Page Space   Physical Volume   Volume Group   Size      %Used   Active   Auto   Type   Chksum
    paging02     hdisk2            paging_vg      9216MB    39      yes      yes    lv     0
    paging01     hdisk2            paging_vg      24576MB   15      yes      yes    lv     0
    paging00     hdisk2            paging_vg      16384MB   22      yes      yes    lv     0
    hd6          hdisk0            rootvg         10496MB   35      yes      yes    lv     0
    Total Paging Space   Percent Used
    60672MB              24%

    node   mounted              mounted over            vfs      date           options
    ------ -------------------- ----------------------- -------- -------------- ---------------
    /dev/hd4            /                      jfs2    Sep 07 17:03   rw,log=/dev/hd8
    /dev/hd2            /usr                   jfs2    Sep 07 17:03   rw,log=/dev/hd8
    /dev/hd9var         /var                   jfs2    Sep 07 17:03   rw,log=/dev/hd8
    /dev/hd3            /tmp                   jfs2    Sep 07 17:03   rw,log=/dev/hd8
    /dev/hd1            /home                  jfs2    Sep 07 17:07   rw,log=/dev/hd8
    /dev/hd11admin      /admin                 jfs2    Sep 07 17:07   rw,log=/dev/hd8
    /proc               /proc                  procfs  Sep 07 17:07   rw
    /dev/hd10opt        /opt                   jfs2    Sep 07 17:07   rw,log=/dev/hd8
    /dev/livedump       /var/adm/ras/livedump  jfs2    Sep 07 17:07   rw,log=/dev/hd8
    /dev/install_sw_lv  /install_sw            jfs2    Sep 07 17:07   rw,log=/dev/hd8
    /dev/tsmlib1_lv     /tsm/db2lib1           jfs2    Sep 07 17:22   rw,log=INLINE
    /dev/tsm_db_lv      /tsm/tsm               jfs2    Sep 07 17:22   rw,log=INLINE
    /dev/tsm_arc_lv     /tsm/tsm/arch          jfs2    Sep 07 17:22   rw,log=INLINE
    /dev/tsm_dat01_lv   /tsm/tsm/data01        jfs2    Sep 07 17:22   rw,log=INLINE
    /dev/tsm_dat02_lv   /tsm/tsm/data02        jfs2    Sep 07 17:22   rw,log=INLINE
    /dev/tsm_dat03_lv   /tsm/tsm/data03        jfs2    Sep 07 17:22   rw,log=INLINE
    /dev/tsm_lg_lv      /tsm/tsm/log           jfs2    Sep 07 17:22   rw,log=INLINE
    /dev/lv01           /tsm/tsmb              jfs2    Sep 07 17:22   rw,log=INLINE

    Filesystem          1024-blocks   Free        %Used   Iused   %Iused   Mounted on
    /dev/hd4            3145728       2605152     18%     31050   5%       /
    /dev/hd2            4390912       581548      87%     64251   29%      /usr
    /dev/hd9var         2097152       78452       97%     9844    24%      /var
    /dev/hd3            2097152       1035572     51%     2530    2%       /tmp
    /dev/hd1            1048576       250468      77%     1198    3%       /home
    /dev/hd11admin      131072        130692      1%      5       1%       /admin
    /proc               -             -           -       -       -        /proc
    /dev/hd10opt        5242880       1815992     66%     26774   6%       /opt
    /dev/livedump       262144        255344      3%      31      1%       /var/adm/ras/livedump
    /dev/install_sw_lv  20971520      7548932     65%     7944    1%       /install_sw
    /dev/tsmlib1_lv     51380224      21353496    59%     1818    1%       /tsm/db2lib1
    /dev/tsm_db_lv      513802240     209820276   60%     1695    1%       /tsm/tsm
    /dev/tsm_arc_lv     102760448     74128676    28%     73      1%       /tsm/tsm/arch
    /dev/tsm_dat01_lv   519045120     6434120     99%     25      1%       /tsm/tsm/data01
    /dev/tsm_dat02_lv   519045120     32034120    94%     23      1%       /tsm/tsm/data02

  • 14 Copyright IBM Corporation 2012

    df -k # review the implemented construction of firm AIX structures; observe count-of-inodes per GB (used) of each application's data filesystems

    $ df -k
    Filesystem          1024-blocks   Free       %Used   Iused   %Iused   Mounted on
    /dev/hd4            262144        129016     51%     3777    3%       /
    /dev/hd2            3932160       544280     87%     42721   5%       /usr
    /dev/hd9var         1048576       334980     69%     4293    2%       /var
    /dev/hd3            1048576       731832     31%     519     1%       /tmp
    /dev/hd1            262144        63632      76%     2622    5%       /home
    /proc               -             -          -       -       -        /proc
    /dev/hd10opt        262144        213832     19%     849     2%       /opt
    /dev/lvsapcds       2097152       456840     79%     1246    2%       /sapcds
    /dev/lvcnvbt        20480000      16993664   18%     715     1%       /cnv
    /dev/lvhrtmpbt      524288        506984     4%      30      1%       /hrtmp
    /dev/lvoraclebt     524288        436808     17%     2938    3%       /oracle
    /dev/lvorapr1bt     8978432       3838252    58%     21476   3%       /oracle/PR1
    /dev/lvmirrlogAp    3080192       2567348    17%     6       1%       /oracle/PR1/mirrlogA
    /dev/lvmirrlogBp    3080192       2567348    17%     6       1%       /oracle/PR1/mirrlogB
    /dev/lvoriglogAp    3080192       2567348    17%     6       1%       /oracle/PR1/origlogA
    /dev/lvoriglogBp    3080192       2567348    17%     6       1%       /oracle/PR1/origlogB
    /dev/lvsaparchbt    14680064      14296480   3%      7176    1%       /oracle/PR1/saparch
    /dev/lvsapdata1bt   268173312     73734764   73%     116     1%       /oracle/PR1/sapdata1
    /dev/lvsapdata18bt  268173312     73751196   73%     108     1%       /oracle/PR1/sapdata10
    /dev/lvsapdata11bt  268173312     77027948   72%     108     1%       /oracle/PR1/sapdata11
    /dev/lvsapdata24bt  268173312     75455208   72%     108     1%       /oracle/PR1/sapdata12
    /dev/lvsapdata2bt   268173312     76225148   72%     110     1%       /oracle/PR1/sapdata2
    /dev/lvsapdata3bt   268173312     75569716   72%     110     1%       /oracle/PR1/sapdata3
    /dev/lvsapdata14bt  268173312     74930816   73%     108     1%       /oracle/PR1/sapdata4
    /dev/lvsapdata23bt  268173312     77814376   71%     108     1%       /oracle/PR1/sapdata5
    /dev/lvsapdata16bt  268173312     79387368   71%     108     1%       /oracle/PR1/sapdata6
    /dev/lvsapdata7bt   268173312     74013420   73%     108     1%       /oracle/PR1/sapdata7
    /dev/lvsapdata8bt   268173312     75192876   72%     108     1%       /oracle/PR1/sapdata8
    /dev/lvsapdata19bt  268173312     74668728   73%     108     1%       /oracle/PR1/sapdata9
    /dev/lvsapreorgbt   25165824      19272876   24%     1153    1%       /oracle/PR1/sapreorg
    /dev/lvostage       2097152       1957092    7%      794     1%       /oracle/stage
    /dev/lvsapmntbt     2097152       1447736    31%     357     1%       /sapmnt/PR1
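    A rough way to compute inodes per GB used for a single filesystem from the df -k output above (a sketch; the column positions assume the df -k layout shown, and /oracle/PR1/sapdata1 is just one of the mounts listed):

    $ df -k /oracle/PR1/sapdata1 | awk 'NR==2 {usedgb=($2-$3)/1048576; printf "%s: %.1f GB used, %d inodes, %.0f inodes/GB\n", $7, usedgb, $5, $5/usedgb}'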

  • 15 Copyright IBM Corporation 2012

    ipcs -bm # review the implemented construction of firm AIX structures; computational memory includes allocated (vs authorized) shmemsegs

    $ ipcs -bm
    IPC status from /dev/mem as of Wed Sep 26 00:01:26 EDT 2012
    T   ID          KEY          MODE         OWNER      GROUP      SEGSZ
    Shared Memory:
    m   1048576     0x78000166   --rw-rw-rw-  root       system     33554432
    m   1048577     0x7800010b   --rw-rw-rw-  root       system     33554432
    m   1048578     0x21002002   --rw-------  pconsole   system     10485760
    m   3           0x6700b061   --rw-r--r--  root       system     12
    m   4           0x6800b061   --rw-r--r--  root       system     377016
    m   5           0x7000b061   --rw-------  root       system     3168
    m   23068678    0xa7067574   --rw-rw-rw-  db2prd1    db2srvrs   140871904
    m   9437191     0xffffffff   --rw-------  db2lib1    db2srvrs   268435456
    m   15728648    0xffffffff   --rw-------  db2prd1    db2srvrs   16106127360
    m   10485770    0xffffffff   --rw-------  db2lib1    db2srvrs   3758096384
    m   35651595    0xa7067561   --rw-------  db2prd1    db2srvrs   51511296
    m   14680076    0xffffffff   --rw-------  db2lib1    db2srvrs   131072
    m   6291470     0x1b7fa074   --rw-rw-rw-  db2lib1    db2srvrs   140871904
    m   12582927    0xffffffff   --rw-------  db2lib1    db2srvrs   163905536
    m   8388624     0xffffffff   --rw-------  db2prd1    db2srvrs   268435456
    m   17          0xa7067668   --rw-rw----  db2prd1    db2srvrs   50331648
    m   73400338    0x1b7fa168   --rw-rw----  db2lib1    db2srvrs   50331648
    m   20971539    0xffffffff   --rw-------  db2prd1    db2srvrs   163905536
    m   6291476     0xffffffff   --rw-------  db2prd1    db2srvrs   131072
    m   13631509    0x1b7fa061   --rw-------  db2lib1    db2srvrs   51511296
    m   26214422    0xffffffff   --rw-------  db2prd1    db2srvrs   268435456
    m   111149079   0xffffffff   --rw-------  db2prd1    db2srvrs   268435456
    m   89128984    0xffffffff   --rw-------  db2lib1    db2srvrs   268435456
    m   1067450393  0xffffffff   --rw-------  db2prd1    db2srvrs   268435456
    m   115343386   0xffffffff   --rw-------  db2prd1    db2srvrs   268435456
    m   894435355   0xffffffff   --rw-------  db2prd1    db2srvrs   268435456
    m   311427100   0xffffffff   --rw-------  db2lib1    db2srvrs   268435456
    m   371195933   0xffffffff   --rw-------  db2prd1    db2srvrs   268435456
    m   547356703   0xffffffff   --rw-------  db2prd1    db2srvrs   131072
    m   569376800   0xffffffff   --rw-------  db2prd1    db2srvrs   131072
    m   576716833   0xffffffff   --rw-------  db2lib1    db2srvrs   268435456

  • 16 Copyright IBM Corporation 2012

    vmo -L ; ioo -L # review the implemented construction of firm AIX structures

    # vmo -L ; ioo -L
    NAME                            CUR    DEF    BOOT   MIN    MAX     UNIT         TYPE
         DEPENDENCIES
    --------------------------------------------------------------------------------
    ame_cpus_per_pool               n/a    8      8      1      1K      processors   B
    --------------------------------------------------------------------------------
    ame_maxfree_mem                 n/a    24M    24M    320K   16G     bytes        D
         ame_minfree_mem
    --------------------------------------------------------------------------------
    ame_min_ucpool_size             n/a    0      0      5      95      % memory     D
    --------------------------------------------------------------------------------
    ame_minfree_mem                 n/a    8M     8M     64K    16383M  bytes        D
         ame_maxfree_mem
    --------------------------------------------------------------------------------
    ams_loan_policy                 n/a    1      1      0      2       numeric      D
    --------------------------------------------------------------------------------
    enhanced_affinity_affin_time
                                    1      1      1      0      100     numeric      D
    --------------------------------------------------------------------------------
    enhanced_affinity_vmpool_limit
                                    10     10     10     -1     100     numeric      D
    --------------------------------------------------------------------------------
    esid_allocator                  0      0      0      0      1       boolean      D
    --------------------------------------------------------------------------------
    force_relalias_lite             0      0      0      0      1       boolean      D
    --------------------------------------------------------------------------------
    kernel_heap_psize               0      0      0      0      16M     bytes        B
    --------------------------------------------------------------------------------
    lgpg_regions                    0      0      0      0      8E-1                 D
         lgpg_size
    --------------------------------------------------------------------------------
    NAME                            CUR    DEF    BOOT   MIN    MAX     UNIT         TYPE
         DEPENDENCIES
    --------------------------------------------------------------------------------
    aio_active                      1      1                            boolean      S
    --------------------------------------------------------------------------------
    aio_maxreqs                     64K    64K    64K    4K     1M      numeric      D
    --------------------------------------------------------------------------------
    aio_maxservers                  30     30     30     1      20000   numeric      D
         aio_minservers
    --------------------------------------------------------------------------------
    aio_minservers                  3      3      3      0      20000   numeric      D
         aio_maxservers
    ------------------------------------------------

  • 17 Copyright IBM Corporation 2012

    Monitoring AIX Usage, Meaning and Interpretation
    Review component technology of the infrastructure, i.e. proper tuning-by-hardware
    Review implemented AIX constructs, i.e. firm near-static structures and settings
    Review historical/accumulated AIX events, i.e. usages, pendings, counts, blocks, etc.
    Monitor dynamic AIX command behaviors, i.e. ps, vmstat, mpstat, iostat, etc.

    Recognizing Common Performance-degrading Scenarios
    High Load Average relative to count-of-LCPUs, i.e. over-threadedness
    vmstat:memory:avm near-to or greater-than lruable-gbRAM, i.e. over-committed
    Continuous low vmstat:memory:fre with persistent lrud (fr:sr) activity
    Continuous high ratio of vmstat:kthr:b relative to vmstat:kthr:r
    Poor ratio of pages freed to pages examined (fr:sr ratio) in vmstat -s output

    Strategic Thoughts, Concepts, Considerations, and Tactics

  • 18 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
    1356958409 total address trans. faults
    276638320 page ins
    260776199 page outs
    3259560 paging space page ins
    4195229 paging space page outs
    0 total reclaims
    442257234 zero filled pages faults
    849546 executable filled pages faults
    458258136 pages examined by clock
    214 revolutions of the clock hand
    277114986 pages freed by the clock
    16045503 backtracks
    2770 free frame waits
    0 extend XPT waits
    11026835 pending I/O waits
    536747261 start I/Os
    32579821 iodones
    138394979018 cpu context switches
    34131579015 device interrupts
    19730395799 software interrupts
    3300305278 decrementer interrupts
    910908738 mpc-sent interrupts
    910908138 mpc-receive interrupts
    429034782 phantom interrupts
    0 traps
    2395294772518 syscalls
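    Because these counters accumulate from boot, it helps to normalize them by days of uptime before judging them; a sketch (the 18 comes from the uptime line above):

    $ vmstat -s | awk -v days=18 '/paging space page outs|free frame waits|pending I\/O waits/ {cnt=$1; $1=""; printf "%-40s %12.0f per day\n", $0, cnt/days}'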

  • 19 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
    1356958409 total address trans. faults
    276638320 page ins
    260776199 page outs
    3259560 paging space page ins
    4195229 paging space page outs
    0 total reclaims
    442257234 zero filled pages faults
    849546 executable filled pages faults

    address translation faults
    Incremented for each occurrence of an address translation page fault. I/O may or may not be required to resolve the page fault. Storage protection page faults (lock misses) are not included in this count.

    address translation faults occur when virtual-to-physical memory address translations are required when:
    creating/initiating/forking/extending processes (that is, memory is needed to store a process's contents), i.e. zero filled pages faults and executable filled pages faults
    instructions or data are initially read or written to/from persistent storage, i.e. page ins and page outs
    memory is needed by AIX to manage other operations, i.e. network IO mbuf allocations, creating SHMSEGs, dynamic allocation of LVM/JFS2 fsbufs, etc.


  • 20 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
    1356958409 total address trans. faults
    276638320 page ins
    260776199 page outs
    3259560 paging space page ins
    4195229 paging space page outs
    0 total reclaims
    442257234 zero filled pages faults
    849546 executable filled pages faults

    page ins
    Incremented for each page read in by the virtual memory manager. The count is incremented for page ins from page space and file space. Along with the page out statistic, this represents the total amount of real I/O initiated by the virtual memory manager. [These are generally JFS/JFS2/NFS filesystem reads]

    page outs
    Incremented for each page written out by the virtual memory manager. The count is incremented for page outs to page space and for page outs to file space. Along with the page in statistic, this represents the total amount of real I/O initiated by the virtual memory manager. [These are generally JFS/JFS2/NFS filesystem writes]

  • 21 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
    1356958409 total address trans. faults
    276638320 page ins
    260776199 page outs
    3259560 paging space page ins
    4195229 paging space page outs
    0 total reclaims
    442257234 zero filled pages faults
    849546 executable filled pages faults

    paging space page ins
    Incremented for VMM initiated page ins from paging space only.

    paging space page outs
    Incremented for VMM initiated page outs to paging space only.

    Only computational memory is ever written-to or read-from the paging space; the paging space extends computational memory. For any Days Uptime, acceptable tolerance is up to 5 digits of paging space page outs. For any Days Uptime, your concern for performance degradation should grow exponentially greater for each digit beyond 5 digits of paging space page outs.

    But, of course, you might be asking: What is Computational Memory?

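    A quick way to pull just the lifetime paging-space counters for the digit-count rule above:

    $ vmstat -s | grep "paging space page"      # compare the digit-count against days of uptime
    $ lsps -a                                   # current %Used per paging-space device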

  • 22 Copyright IBM Corporation 2012

    What is Computational memory? What is File memory (aka Non-Computational memory)

    Computational memory
    Computational memory is used while your processes are actually working on computing information. These working segments are temporary (transitory) and only exist up until the time a process terminates or the page is stolen. They have no real permanent disk storage location. When a process terminates, both the physical and paging spaces are released in many cases. When there is a large spike in available pages, you can actually see this happening while monitoring your system. When free physical memory starts getting low, programs that have not been used recently are moved from RAM to paging space to help release physical memory for more real work.

    File memory (aka Non-Computational memory)
    File memory (unlike computational memory) uses persistent segments and has a permanent storage location on the disk. Data files or executable programs are mapped to persistent segments rather than working segments. The data files can relate to filesystems, such as JFS, JFS2, or NFS. They remain in memory until the file is unmounted, a page is stolen, or a file is unlinked. After the data file is copied into RAM, VMM controls when these pages are overwritten or used to store other data. Given the alternative, most people would much rather have file memory paged to disk rather than computational memory.

    When a process references a page which is on disk, it must be paged in, which could cause other pages to page out again. VMM is constantly lurking and working in the background trying to steal frames that have not been recently referenced, using the page replacement algorithm discussed earlier. It also helps detect thrashing, which can occur when memory is extremely low and pages are constantly being paged in and out to support processing. VMM actually has a memory load control algorithm, which can detect if the system is thrashing and actually tries to remedy the situation. Unabashed thrashing can literally cause a system to come to a standstill, as the kernel becomes more concerned with making room for pages than with actually doing anything productive.

    Source verbatim: Ken Milberg/Martin Brown http://www.ibm.com/developerworks/aix/library/au-aix7memoryoptimize1/index.html

  • 23 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
    1356958409 total address trans. faults
    276638320 page ins
    260776199 page outs
    3259560 paging space page ins
    4195229 paging space page outs
    0 total reclaims
    442257234 zero filled pages faults
    849546 executable filled pages faults

    zero-filled page faults
    Incremented if the page fault is to working storage and can be satisfied by assigning a frame and zero-filling it.

    executable-filled page faults
    Incremented for each instruction page fault.

    zero-filled page faults are used to allocate memory when creating, initializing, forking or extending AIX processes to be executed, such as when starting-up a database instance, or executing Java applications. They do not involve storage IO. They also load the TLB for fast next access. By definition, they are only computational memory.

    executable filled pages faults are used to allocate memory designated to house binary-executable instructions, and they do involve storage read IOs. They also load the TLB for fast next access. By definition, they are only computational memory.


  • 24 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
    1356958409 total address trans. faults
    276638320 page ins
    260776199 page outs
    3259560 paging space page ins
    4195229 paging space page outs
    0 total reclaims
    442257234 zero filled pages faults
    849546 executable filled pages faults
    458258136 pages examined by clock
    214 revolutions of the clock hand
    277114986 pages freed by the clock
    16045503 backtracks
    2770 free frame waits
    0 extend XPT waits
    11026835 pending I/O waits
    536747261 start I/Os
    32579821 iodones
    138394979018 cpu context switches
    34131579015 device interrupts
    19730395799 software interrupts
    3300305278 decrementer interrupts
    910908738 mpc-sent interrupts
    910908138 mpc-receive interrupts
    429034782 phantom interrupts
    0 traps
    2395294772518 syscalls

  • 25 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    458258136 pages examined by clock
    214 revolutions of the clock hand
    277114986 pages freed by the clock

    pages examined by the clock
    VMM uses a clock-algorithm to implement a pseudo least recently used (lru) page replacement scheme. Pages are aged by being examined by the clock. This count is incremented for each page examined by the clock.

    revolutions of the clock hand
    Incremented for each VMM clock revolution (that is, after each complete scan of memory).

    pages freed by the clock
    Incremented for each page the clock algorithm selects to free from real memory.

    Typically, [pages freed by the clock / pages examined by the clock] is comfortably greater than 0.40, i.e. 277114986 / 458258136 = 0.60471

    If not greater than 0.40, then the lower this value reaches below 0.40, the more likely gbRAM needs to be added. This is a contributing or confirming factor suggesting more gbRAM may be needed; it is not a definitive indicator.

    pages examined by the clock is the historical accumulation of AIX:vmstat:page:sr activity (aka lrud-scanrate).
    pages freed by the clock is the historical accumulation of AIX:vmstat:page:fr activity (aka lrud-freerate).

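    A sketch that computes the freed-to-examined ratio directly from vmstat -s (the patterns match the output lines shown above):

    $ vmstat -s | awk '/pages examined by clock/ {sr=$1} /pages freed by the clock/ {fr=$1} END {if (sr>0) printf "fr/sr = %.5f (comfortably greater than 0.40 is typical)\n", fr/sr}'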

  • 26 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
    1356958409 total address trans. faults
    276638320 page ins
    260776199 page outs
    3259560 paging space page ins
    4195229 paging space page outs
    0 total reclaims
    442257234 zero filled pages faults
    849546 executable filled pages faults
    458258136 pages examined by clock
    214 revolutions of the clock hand
    277114986 pages freed by the clock
    16045503 backtracks
    2770 free frame waits
    0 extend XPT waits
    11026835 pending I/O waits
    536747261 start I/Os
    32579821 iodones
    138394979018 cpu context switches
    34131579015 device interrupts
    19730395799 software interrupts
    3300305278 decrementer interrupts
    910908738 mpc-sent interrupts
    910908138 mpc-receive interrupts
    429034782 phantom interrupts
    0 traps
    2395294772518 syscalls

  • 27 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    16045503 backtracks
    2770 free frame waits
    0 extend XPT waits
    11026835 pending I/O waits
    536747261 start I/Os
    32579821 iodones

    backtracks
    Incremented for each page fault that occurs while resolving a previous page fault. (The new page fault must be resolved first, and then initial page faults can be backtracked.)

    free frame waits
    Incremented each time a process requests a page frame, the free list is empty, and the process is forced to wait while the free list is replenished.

    The count of backtracks monitors the relative intensity or duration of coincident and near-coincident page faulting activity. It can generally distinguish a steady consistently-moderate workload pattern (low count) from a frenetically spiking, peaking, bursting or burning workload pattern (high count).

    The count of free frame waits increases when free memory repeatedly reaches down to zero and slightly back up. High counts indicate a likely start/stop stuttering of user workload progress, as well as, frustrating unfettered storage IO throughput; this is typically associated with harsh bursts and burns of AIX:lrud scanning and freeing, as well as, higher CPU-kernel time (i.e. AIX:vmstat:cpu:sy >25%).


  • 28 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
    1356958409 total address trans. faults
    276638320 page ins
    260776199 page outs
    3259560 paging space page ins
    4195229 paging space page outs
    0 total reclaims
    442257234 zero filled pages faults
    849546 executable filled pages faults
    458258136 pages examined by clock
    214 revolutions of the clock hand
    277114986 pages freed by the clock
    16045503 backtracks
    2770 free frame waits
    0 extend XPT waits
    11026835 pending I/O waits
    536747261 start I/Os
    32579821 iodones
    138394979018 cpu context switches
    34131579015 device interrupts
    19730395799 software interrupts
    3300305278 decrementer interrupts
    910908738 mpc-sent interrupts
    910908138 mpc-receive interrupts
    429034782 phantom interrupts
    0 traps
    2395294772518 syscalls

  • 29 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    16045503 backtracks
    2770 free frame waits
    0 extend XPT waits
    11026835 pending I/O waits
    536747261 start I/Os
    32579821 iodones

    pending I/O waits
    Incremented each time a process is waited by VMM for a page-in I/O to complete.

    start I/Os
    Incremented for each read or write I/O request initiated by VMM.

    iodones
    Incremented at the completion of each VMM I/O request.

    High counts of pending I/O waits could indicate long page-in I/O latencies, or processes awaiting page-in I/O are repeatedly or too rapidly returned to the CPU before the page-in I/O completes, or both in varying degrees. Acceptable tolerance is up to 80% of iodones; warning is 81%-100% of iodones; seek-resolution is beyond 100% of iodones, i.e. pending I/O waits / iodones => 11026835/32579821 = 33.84% = Acceptable

    start I/Os are generally the sum of page ins and page outs.

    The ratio of start I/Os to iodones is a relative indicator of sequential I/O coalescence. Sequential read-aheads and sequential write-behinds of JFS2 default-mount I/O transactions are automatically coalesced to fewer larger I/O transactions. This is a quick&dirty method of distinguishing a generally random IO versus sequential IO workload, i.e. start I/Os/iodones=>536747261/32579821=16.47 is a moderate Sequential IO reduction ratio.

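    Both ratios can be pulled with a one-liner (a sketch; the patterns match the vmstat -s lines above):

    $ vmstat -s | awk '/pending I\/O waits/ {p=$1} /start I\/Os/ {s=$1} / iodones/ {d=$1} END {printf "pending/iodones = %.2f%%   start I/Os : iodones = %.2f\n", 100*p/d, s/d}'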

  • 30 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
    1356958409 total address trans. faults
    276638320 page ins
    260776199 page outs
    3259560 paging space page ins
    4195229 paging space page outs
    0 total reclaims
    442257234 zero filled pages faults
    849546 executable filled pages faults
    458258136 pages examined by clock
    214 revolutions of the clock hand
    277114986 pages freed by the clock
    16045503 backtracks
    2770 free frame waits
    0 extend XPT waits
    11026835 pending I/O waits
    536747261 start I/Os
    32579821 iodones
    138394979018 cpu context switches
    34131579015 device interrupts
    19730395799 software interrupts
    3300305278 decrementer interrupts
    910908738 mpc-sent interrupts
    910908138 mpc-receive interrupts
    429034782 phantom interrupts
    0 traps
    2395294772518 syscalls

  • 31 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    138394979018 cpu context switches
    34131579015 device interrupts
    19730395799 software interrupts
    3300305278 decrementer interrupts

    2395294772518 syscalls

    cpu context switches
    Incremented for each processor context switch (dispatch of a new process).

    device interrupts
    Incremented on each hardware interrupt.

    software interrupts
    Incremented on each software interrupt. A software interrupt is a machine instruction similar to a hardware interrupt that saves some state and branches to a service routine. System calls are implemented with software interrupt instructions that branch to the system call handler routine.

    decrementer interrupts
    Incremented on each decrementer interrupt.

    syscalls
    Incremented for each system call.

  • 32 Copyright IBM Corporation 2012

    uptime ; vmstat -s # Review accumulated count-of-events over days-uptime

    138394979018 cpu context switches
    34131579015 device interrupts
    19730395799 software interrupts
    3300305278 decrementer interrupts

    2395294772518 syscalls

    Note the paired ratios of the above for a relative sense-of-proportion of system events.

    What is useful about the ratio of cpu context switches : decrementer interrupts?
    138394979018 / 3300305278 = an average of 42 cpu context switches per decrementer interrupt

    What is useful about the ratio of device interrupts : decrementer interrupts?
    34131579015 / 3300305278 = an average of 10 device interrupts per decrementer interrupt

    What is useful about the ratio of syscalls : decrementer interrupts?
    2395294772518 / 3300305278 = an average of 726 system calls per decrementer interrupt

    What is useful about the ratio of device interrupts : syscalls : cpu context switches?
    34131579015 : 2395294772518 : 138394979018 ~= 10 : 726 : 42 per decrementer interrupt

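    A sketch that reproduces these per-decrementer-interrupt ratios from vmstat -s:

    $ vmstat -s | awk '/cpu context switches/ {cs=$1} /device interrupts/ {di=$1} / syscalls/ {sc=$1} /decrementer interrupts/ {dec=$1} END {printf "per decrementer interrupt: %.0f context switches, %.0f device interrupts, %.0f syscalls\n", cs/dec, di/dec, sc/dec}'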

  • 33 Copyright IBM Corporation 2012

    Determine points of exhaustion, limitation, and over-commitment
    Determine surplus resources: CPU cycles, RAM, SAN I/O thruput, etc.

    $ uptime; vmstat -s
    12:46AM up 139 days, 1:29, 0 users, load average: 9.24, 4.21, 2.99
    36674080366 total address trans. faults
    303594828999 page ins                   # filesystem reads from disk; vmstat:page:fi
    65127100071 page outs                   # filesystem writes to disk; vmstat:page:fo
    17 paging space page ins                # vmstat:page:pi
    166 paging space page outs              # vmstat:page:po
    0 total reclaims
    10153151099 zero filled pages faults
    379929 executable filled pages faults
    790677067990 pages examined by clock    # vmstat:page:sr
    102342 revolutions of the clock hand
    323578511315 pages freed by the clock   # vmstat:page:fr
    216779474 backtracks
    173781776 free frame waits              # waits when vmstat:memory:fre equals 0
    0 extend XPT waits
    13118848968 pending I/O waits
    369118024444 start I/Os
    21394237531 iodones
    115626032109 cpu context switches       # vmstat:faults:cs
    25244855068 device interrupts           # fc/ent/scsi interrupts; vmstat:faults:in
    3124067547 software interrupts          # software interrupts
    14571190906 decrementer interrupts      # lcpu decrementer clock interrupts
    56397341 mpc-sent interrupts
    56396919 mpc-receive interrupts
    32316580 phantom interrupts
    0 traps
    739431511068 syscalls                   # total system calls (akin to miles traveled)

  • 34 Copyright IBM Corporation 2012

    uptime ; vmstat -v # Review accumulated count-of-events over days-uptime

    12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
    6160384 memory pages
    5954432 lruable pages
    156557 free pages
    3 memory pools
    883319 pinned pages
    80.0 maxpin percentage
    3.0 minperm percentage
    90.0 maxperm percentage
    1.2 numperm percentage
    77369 file pages
    0.0 compressed percentage
    0 compressed pages
    1.2 numclient percentage
    90.0 maxclient percentage
    77369 client pages
    0 remote pageouts scheduled
    19 pending disk I/Os blocked with no pbuf
    1019076 paging space I/Os blocked with no psbuf
    2359 filesystem I/Os blocked with no fsbuf
    0 client filesystem I/Os blocked with no fsbuf
    204910 external pager filesystem I/Os blocked with no fsbuf

    96.2 percentage of memory used for computational pages

  • 35 Copyright IBM Corporation 2012

    uptime ; vmstat -v # Review accumulated count-of-events over days-uptime

    12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
    6160384 memory pages
    5954432 lruable pages
    156557 free pages # real-time count of freemem pages is continuously changing
    3 memory pools
    883319 pinned pages

    free pages
    Number of free 4 KB pages.

    The AIX VMM managed count of free pages is set by AIX:vmo:minfree and AIX:vmo:maxfree. Varying with AIX and the incidental count of memory pools, the count of free pages is maintained by default within a 4-digit range of 4KB pages, say between 2880 and 3264 of 4 KB pages.

    Meanwhile, enterprise-class infrastructures can sustain JFS2 filesystem I/O throughputs of 5-digits of 4KB reads and writes but if-and-only-if a greater count of free pages is always available to accept the reads and writes.

    Remember: The count of free frame waits increases when free memory repeatedly reaches down to zero and slightly back up. High counts indicate a likely start/stop stuttering of user workload progress, as well as, frustrating unfettered storage IO throughput; this is typically associated with harsh bursts and burns of AIX:lrud scanning and freeing, as well as, higher CPU-kernel time (i.e. AIX:vmstat:cpu:sy >25%).

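    A quick sketch to see where a given LPAR stands before changing anything:

    $ vmstat -v | grep -E "memory pools|free pages"     # pool count and the current count of free pages
    $ vmo -L minfree ; vmo -L maxfree                   # current, default, boot, min and max values
    $ vmstat -s | grep "free frame waits"               # lifetime count, judged against days of uptime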

  • 36 Copyright IBM Corporation 2012

    Reduce free frame waits by raising minfree and maxfree higher than default

    === command: vmstat -sv    Note: low 4 digits of free frame waits with a nice 6 digits of free pages; while there's enough freemem, IO (i.e. fi,fo) continues unfettered
    2770 free frame waits
    156557 free pages

    === command: vmo -L    Note: maxfree=8704 (default=1088), minfree=8K (default=960); incidentally, this LPAR has 3 memory pools
    --------------------------------------------------------------------------------
    maxfree        8704   1088   8704   16    4812K   4KB pages   D
         minfree
         memory_frames
    --------------------------------------------------------------------------------
    minfree        8K     960    8K     8     4812K   4KB pages   D
         maxfree
         memory_frames
    --------------------------------------------------------------------------------

    === command: vmstat -Iwt 1    Note: mempool_count*maxfree = 3*8704 = 26112; mempool_count*minfree = 3*8192 = 24576 (fre=24576 starts fr:sr lrud scanning & freeing)
    System configuration: lcpu=24 mem=24064MB ent=6.00

    kthr       memory          page                              faults              cpu                        time
    ---------  --------------  --------------------------------  ------------------  -------------------------  --------
     r  b  p   avm     fre     fi   fo    pi  po   fr    sr      in    sy      cs     us sy id wa  pc   ec      hr mi se
    11  1  0   7539030 117080   2   548   206  0    0     0      32730 2188826 175026 37 35  1 27  5.52 92.1    00:01:12
    17  1  0   7540803 115005   0   132   146  0    0     0      32581 2178059 169341 39 33  0 28  5.51 91.9    00:01:13
    12  0  2   7540365 114924   2   316    16  0    0     0      30417 2188135 171948 36 36  0 28  5.52 92.0    00:01:14
    10  0  2   7540654 106310   0   5863   29  0    0     0      33112 2134251 177073 34 38  0 28  5.52 92.0    00:01:15
    16  0  0   7540058  94698   8   8459   17  0    0     0      33147 2076084 173134 35 40  1 25  5.59 93.1    00:01:16
    23  0  0   7544739  83097   0   4518   15  0    0     0      32137 2098672 170494 39 38  2 22  5.68 94.7    00:01:17
     0  0  0   7552531  70637  19   3518   44  0    0     0      27911 2207363 166832 43 33  7 18  5.65 94.1    00:01:18
    11  2  0   7560676  61953  23  14471   38  0    0     0      24444 2196741 154363 43 30  9 18  5.55 92.6    00:01:19
    17  0  0   7570158  50021  66  11393   41  1733 4661  13412  30670 2063644 166578 39 40  7 14  5.75 95.8    00:01:20
    13  2  0   7570331  39515  17  24859   10  8366 24671 71521  32332 1830946 163441 37 46  5 12  5.81 96.9    00:01:21
    17  3  0   7569607  42002  14   3569    6  4458 2643  4022   26678 2219614 165593 46 29  7 18  5.62 93.7    00:01:22
    17  9  0   7569539  46795  22      1    4  2808    0     0   26107 2179201 164453 43 31  6 20  5.61 93.5    00:01:23
    13 10  0   7569524  49434   7      1    3  2522    0     0   26521 2216482 166354 40 31  7 22  5.48 91.3    00:01:24
    21  6  0   7569511  53096   0      1   10  3530    0     0   26437 2184553 164387 40 32  6 22  5.54 92.3    00:01:25

    Universal Recommendation: If default maxfree & minfree, and 6+ digits of free frame waits per any 90 days uptime: 1) use vmo to tune minfree=(5*2048) and maxfree=(6*2048); 2) use ioo to tune j2_MaxPageReadAhead=2048.
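    Applied literally, the recommendation above would look something like this (a sketch; -p makes the change persistent across reboots, and the exact tunable capitalization, j2_maxPageReadAhead on the levels I have seen, should be verified with ioo -L before use):

    # vmo -p -o minfree=10240 -o maxfree=12288      # 5*2048 and 6*2048 4KB pages
    # ioo -p -o j2_maxPageReadAhead=2048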

  • 37 Copyright IBM Corporation 2012

    uptime ; vmstat -v # Review accumulated count-of-events over days-uptime

    12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
    6160384 memory pages
    5954432 lruable pages
    156557 free pages
    3 memory pools
    883319 pinned pages
    80.0 maxpin percentage
    3.0 minperm percentage
    90.0 maxperm percentage
    1.2 numperm percentage           # a real-time % indicator of disk IO cache
    77369 file pages
    0.0 compressed percentage
    0 compressed pages
    1.2 numclient percentage         # a real-time % indicator of disk IO cache
    90.0 maxclient percentage
    77369 client pages
    0 remote pageouts scheduled
    19 pending disk I/Os blocked with no pbuf
    1019076 paging space I/Os blocked with no psbuf
    2359 filesystem I/Os blocked with no fsbuf
    0 client filesystem I/Os blocked with no fsbuf
    204910 external pager filesystem I/Os blocked with no fsbuf
    96.2 percentage of memory used for computational pages

  • 38 Copyright IBM Corporation 2012

    uptime ; vmstat -v # Review accumulated count-of-events over days-uptime

    3.0 minperm percentage
    90.0 maxperm percentage
    1.2 numperm percentage           # Warning when less than/equal minperm%

    1.2 numclient percentage # Warning when less than/equal minperm%

    90.0 maxclient percentage

minperm percentage
    Tuning parameter (managed using vmo) in percentage of real memory. This specifies the point below which file pages are protected from the re-page algorithm.

maxperm percentage
    Tuning parameter (managed using vmo) in percentage of real memory. This specifies the point above which the page stealing algorithm steals only file pages.

numperm percentage
    Percentage of memory currently used by the file cache.

numclient percentage
    Percentage of memory occupied by client pages.

maxclient percentage
    Tuning parameter (managed using vmo) specifying the maximum percentage of memory which can be used for client pages.
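As a quick check of where an LPAR stands against these thresholds, the vmo-managed settings and the real-time percentages can be pulled side by side; a hedged sketch (the grep patterns are illustrative):

vmo -a | grep -E "minperm%|maxperm%|maxclient%"   # the tuning parameters
vmstat -v | grep -E "perm|client"                 # numperm/numclient versus their thresholds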

  • 39 Copyright IBM Corporation 2012

uptime; vmstat -v: Review accumulated count-of-events over days-uptime

12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13

 6160384 memory pages
 5954432 lruable pages
  156557 free pages
       3 memory pools
  883319 pinned pages
    80.0 maxpin percentage
     3.0 minperm percentage
    90.0 maxperm percentage
     1.2 numperm percentage
   77369 file pages
     0.0 compressed percentage
       0 compressed pages
     1.2 numclient percentage
    90.0 maxclient percentage
   77369 client pages
       0 remote pageouts scheduled
      19 pending disk I/Os blocked with no pbuf
 1019076 paging space I/Os blocked with no psbuf
    2359 filesystem I/Os blocked with no fsbuf
       0 client filesystem I/Os blocked with no fsbuf
  204910 external pager filesystem I/Os blocked with no fsbuf
    96.2 percentage of memory used for computational pages

  • 40 Copyright IBM Corporation 2012

uptime; vmstat -v: Review accumulated count-of-events over days-uptime

      19 pending disk I/Os blocked with no pbuf
 1019076 paging space I/Os blocked with no psbuf
    2359 filesystem I/Os blocked with no fsbuf
       0 client filesystem I/Os blocked with no fsbuf
  204910 external pager filesystem I/Os blocked with no fsbuf

pending disk I/Os blocked with no pbuf
    Number of pending disk I/O requests blocked because no pbuf was available. Pbufs are pinned memory buffers used to hold I/O requests at the logical volume manager layer. This count is currently for the rootvg only.

paging space I/Os blocked with no psbuf
    Number of paging space I/O requests blocked because no psbuf was available. Psbufs are pinned memory buffers used to hold I/O requests at the virtual memory manager layer.

filesystem I/Os blocked with no fsbuf
    Number of filesystem I/O requests blocked because no fsbuf was available. Fsbufs are pinned memory buffers used to hold I/O requests in the filesystem layer.

client filesystem I/Os blocked with no fsbuf
    Number of client filesystem I/O requests blocked because no fsbuf was available. NFS (Network File System) and VxFS (Veritas) are client filesystems. Fsbufs are pinned memory buffers used to hold I/O requests in the filesystem layer.

external pager filesystem I/Os blocked with no fsbuf
    Number of external pager client filesystem I/O requests blocked because no fsbuf was available. JFS2 is an external pager client filesystem. Fsbufs are pinned memory buffers used to hold I/O requests in the filesystem layer.
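Because these five counters sit near the bottom of a long vmstat -v listing, a simple filter makes periodic checks easier; a sketch, with uptime shown alongside so the counts can be judged against days-uptime:

uptime
vmstat -v | grep -i "blocked"    # prints only the five blocked-I/O counters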

  • 41 Copyright IBM Corporation 2012

uptime; vmstat -v: Review accumulated count-of-events over days-uptime

      19 pending disk I/Os blocked with no pbuf    # stat for rootvg only
 1019076 paging space I/Os blocked with no psbuf
    2359 filesystem I/Os blocked with no fsbuf
       0 client filesystem I/Os blocked with no fsbuf
  204910 external pager filesystem I/Os blocked with no fsbuf

Use AIX:lvmo to monitor the pervg_blocked_io_count of each active LVM volume group, i.e. lvmo -a -v rootvg ; echo ; lvmo -a -v datavg

vgname = rootvg
pv_pbuf_count = 512
total_vg_pbufs = 512
max_vg_pbufs = 16384
pervg_blocked_io_count = 19
pv_min_pbuf = 512
max_vg_pbuf_count = 0
global_blocked_io_count = 1566

vgname = datavg
pv_pbuf_count = 512
total_vg_pbufs = 2048
max_vg_pbufs = 65536
pervg_blocked_io_count = 475
pv_min_pbuf = 512
max_vg_pbuf_count = 0
global_blocked_io_count = 1566

    Acceptable tolerance is 4-digits of pervg_blocked_io_count per LVM volume group for any 90 days uptime.

  • 42 Copyright IBM Corporation 2012

Determine points of exhaustion, limitation, and over-commitment
Determine surplus resources: CPUcycles, RAM, SAN I/O thruput, etc.

# lvmo -a -v rootvg              # 270 days uptime for counters below
vgname = rootvg
pv_pbuf_count = 512
total_vg_pbufs = 1024            # total_vg_pbufs / pv_pbuf_count = 1024/512 = 2 LUNs
max_vg_pbuf_count = 16384
pervg_blocked_io_count = 90543
pv_min_pbuf = 512
global_blocked_io_count = 12018771

# lvmo -a -v apvg15
vgname = apvg15
pv_pbuf_count = 512
total_vg_pbufs = 15872           # total_vg_pbufs / pv_pbuf_count = 15872/512 = 31 LUNs
max_vg_pbuf_count = 524288
pervg_blocked_io_count = 517938
pv_min_pbuf = 512
global_blocked_io_count = 12018771

# lvmo -a -v pgvg01
vgname = pgvg01
pv_pbuf_count = 512
total_vg_pbufs = 1024            # total_vg_pbufs / pv_pbuf_count = 1024/512 = 2 LUNs
max_vg_pbuf_count = 16384
pervg_blocked_io_count = 8612687
pv_min_pbuf = 512
global_blocked_io_count = 12018771

  • 43 Copyright IBM Corporation 2012

    Increase total_vg_pbufs to resolve high pervg_blocked_io_count

    19 pending disk I/Os blocked with no pbuf # stat for rootvg only

Four factors complicate how to resolve high counts of pervg_blocked_io_count:
  The number of pbufs allocated per physical volume when it is added to the volume group, i.e. the value of AIX:lvmo:pv_pbuf_count
  The count and size of physical volumes (aka LUNs or hdisks) assigned to the LVM VG
  The count and size of the JFS2:LVM logical volumes created on the VG's physical volumes, i.e. a reasonable balance of JFS2 fsbufs-to-VG pbufs favors optimal performance
  Having either too few or too many VG pbufs can severely hamper performance and throughput

As such, we should only add pbufs by-formula on a schedule of 90-day change&observe cycles.

Use AIX:lvmo to monitor the pervg_blocked_io_count of each active LVM volume group, i.e. lvmo -a -v rootvg ; echo ; lvmo -a -v datavg

    Acceptable tolerance is 4-digits of pervg_blocked_io_count per LVM volume group for any 90 days uptime.

Otherwise, for each LVM volume group, adjust the value of AIX:lvmo:pv_pbuf_count accordingly:
If 5-digits of pervg_blocked_io_count, add ~2048 pbufs to total_vg_pbufs per 90-day cycle.
If 6-digits of pervg_blocked_io_count, add ~[4*2048] pbufs to total_vg_pbufs per 90-day cycle.
If 7-digits of pervg_blocked_io_count, add ~[8*2048] pbufs to total_vg_pbufs per 90-day cycle.
If 8-digits of pervg_blocked_io_count, add ~[12*2048] pbufs to total_vg_pbufs per 90-day cycle.
If 9-digits of pervg_blocked_io_count, add ~[16*2048] pbufs to total_vg_pbufs per 90-day cycle.

    Use AIX:lvmo to confirm/verify the value of total_vg_pbufs for each VG.
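A hedged sketch of the change itself, using the lvmo syntax shown elsewhere in this deck; the volume group name and the increment are illustrative, and depending on the AIX level the new total_vg_pbufs value may take effect immediately or only after the volume group is varied off and on again:

lvmo -a -v datavg                        # note current total_vg_pbufs and pervg_blocked_io_count
lvmo -v datavg -o total_vg_pbufs=4096    # e.g. add ~2048 pbufs for a 5-digit blocked count
lvmo -a -v datavg                        # confirm the new value per the guidance above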

  • 44 Copyright IBM Corporation 2012

uptime; vmstat -v: Review accumulated count-of-events over days-uptime

12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13

1356958409 total address trans. faults
 276638320 page ins
 260776199 page outs
   3259560 paging space page ins
   4195229 paging space page outs
         0 total reclaims

paging space page outs
    Incremented for VMM initiated page outs to paging space only.

      19 pending disk I/Os blocked with no pbuf
 1019076 paging space I/Os blocked with no psbuf
    2359 filesystem I/Os blocked with no fsbuf
       0 client filesystem I/Os blocked with no fsbuf
  204910 external pager filesystem I/Os blocked with no fsbuf

paging space I/Os blocked with no psbuf
    Number of paging space I/O requests blocked because no psbuf was available. Psbufs are pinned memory buffers used to hold I/O requests at the virtual memory manager layer.


The ratio of "paging space I/Os blocked with no psbuf" / "paging space page outs" is a direct measure of intensity, i.e. 1019076 / 4195229 ≈ 24.3%. In this example, suffering 7-digits of paging space page outs in 18 days-uptime is bad enough, but when there are also paging space I/Os blocked with no psbuf, system performance and keyboard responsiveness can stop-and-start in seconds-long cycles. One might believe AIX has even crashed, when it hasn't. Preclude paging space page outs by any means; add more gbRAM to the LPAR.
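The two numbers that form this ratio can be pulled straight from vmstat; a sketch of the check:

vmstat -s | grep "paging space page outs"          # lifetime pagingspace-pageout count
vmstat -v | grep "blocked with no psbuf"           # how many of those I/Os also waited for a psbuf
# divide the blocked count by the page-out count to get the intensity ratio discussed above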

  • 45 Copyright IBM Corporation 2012

    Criteria for Creating a Write-Expedient pagingspace_vg

    The first priority should be to preclude any pagingspace-pageouts. Thus, a write-expedient pagingspace is only needed if you have any unavoidable pagingspace-pageout activity. Ultimately, if we must suffer any pagingspace-pageouts, we want them to write-out to the pagingspace as quickly as possible (thus my term: write-expedient).

    So, for the sake of prudence, we should always create a write-expedient pagingspace. The listed traits below are optimal for write-expediency; include as many as you can (but always apply the key tuning tactic below):

    Create a dedicated AIX:LVM:vg (VolumeGroup) called pagingspace_vg

Create the pagingspace_vg using FC-SAN storage LUNs (ideally RAID5 LUNs on SSD, FC or SAS technology disk drives; not on SATA disk drives, which are slower and typically employ RAID6, nor on any local/internal SAS disks)

    The total size of the pagingspace in pagingspace_vg should match the size of installed LPAR gbRAM

    Assign 3-to-8 LUN/hdisks to pagingspace_vg and size each LUN to be an even fraction of installed gbRAM. For instance, if the LPAR has 18gbRAM, then assign three 6gb LUN/hdisks to pagingspace_vg

    Configure one AIX:LVM:VG:lv (logical volume) for each LUN/hdisk in pagingspace_vg; do not deploy PP-striping (because it messes-up discrete hdisk IO monitoring) - just map one hdisk to one lv

The key tuning tactic: With root-user privileges, use AIX:lvmo to set pagingspace_vg:pv_pbuf_count=2048. This will ensure pagingspace_vg:total_vg_pbufs will equal [count-of-LUN/hdisks * pv_pbuf_count].

To set the pv_pbuf_count value to 2048, type the following:
lvmo -v pagingspace_vg -o pv_pbuf_count=2048
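A hedged end-to-end sketch of the construction described above, for an LPAR with 18gbRAM and three 6gb SAN LUNs; the hdisk names are illustrative, and the -s value assumes a 128MB physical-partition size (48 x 128MB = 6GB):

mkvg -y pagingspace_vg hdisk10 hdisk11 hdisk12   # dedicated VG on three FC-SAN LUNs
lvmo -v pagingspace_vg -o pv_pbuf_count=2048     # the key tuning tactic from this slide
mkps -a -n -s 48 pagingspace_vg hdisk10          # one paging-space lv per hdisk, active now and at restart
mkps -a -n -s 48 pagingspace_vg hdisk11
mkps -a -n -s 48 pagingspace_vg hdisk12
lsps -a                                          # verify the paging spaces and their sizes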

  • 46 Copyright IBM Corporation 2012

uptime; vmstat -v: Review accumulated count-of-events over days-uptime

      19 pending disk I/Os blocked with no pbuf
 1019076 paging space I/Os blocked with no psbuf
    2359 filesystem I/Os blocked with no fsbuf
       0 client filesystem I/Os blocked with no fsbuf
  204910 external pager filesystem I/Os blocked with no fsbuf

filesystem I/Os blocked with no fsbuf
    Number of filesystem I/O requests blocked because no fsbuf was available. Fsbufs are pinned memory buffers used to hold I/O requests in the filesystem layer.

    This count refers to JFS fsbuf exhaustion (vs. JFS2) and is typically ignored today. Virtually all customers use JFS2.

client filesystem I/Os blocked with no fsbuf
    Number of client filesystem I/O requests blocked because no fsbuf was available. NFS (Network File System) and VxFS (Veritas) are client filesystems. Fsbufs are pinned memory buffers used to hold I/O requests in the filesystem layer.

Starting with AIX 6.1 Technology Level 02, the following parameters are obsolete because the network file system (NFS) and the virtual memory manager (VMM) dynamically tune the number of buf structures and page device tables (PDTs) based on workload:
* nfs_v2_pdts
* nfs_v2_vm_bufs
* nfs_v3_pdts
* nfs_v3_vm_bufs
* nfs_v4_pdts
* nfs_v4_vm_bufs


  • 47 Copyright IBM Corporation 2012

    Resolving high external pager filesystem I/Os blocked with no fsbuf

      19 pending disk I/Os blocked with no pbuf
 1019076 paging space I/Os blocked with no psbuf
    2359 filesystem I/Os blocked with no fsbuf
       0 client filesystem I/Os blocked with no fsbuf
  204910 external pager filesystem I/Os blocked with no fsbuf

external pager filesystem I/Os blocked with no fsbuf
    Number of external pager client filesystem I/O requests blocked because no fsbuf was available. JFS2 is an external pager client filesystem. Fsbufs are pinned memory buffers used to hold I/O requests in the filesystem layer.

Acceptable tolerance is 5-digits per any 90 days uptime.
First tactic to attempt: If 6-digits, set ioo -o j2_dynamicBufferPreallocation=128.
First tactic to attempt: If 7+ digits, set ioo -o j2_dynamicBufferPreallocation=256.

ioo -o j2_dynamicBufferPreallocation=value
    The number of 16K slabs to preallocate when the filesystem is running low on bufstructs.

    A value of 16 represents 256K. The bufstructs for Enhanced JFS (aka JFS2) are now dynamic; the number of buffers that start on the JFS2 filesystem is controlled by j2_nBufferPerPagerDevice (now restricted), but buffers are allocated and destroyed dynamically past this initial value. If the number of external pager filesystem I/Os blocked with no fsbuf increases, the j2_dynamicBufferPreallocation should be increased for that file system, as the I/O load on a file system may be exceeding the speed of preallocation.

A value of 0 will disable dynamic buffer allocation completely.
Heavy IO workloads should have this value changed to 256.
File systems do not need to be remounted to activate.
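A minimal, hedged sketch of applying this first tactic; -p makes the change persist across reboots, and as noted above no remount is needed:

ioo -p -o j2_dynamicBufferPreallocation=128   # for a 6-digit blocked count; use 256 for 7+ digits
ioo -o j2_dynamicBufferPreallocation          # verify the new value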


  • 48 Copyright IBM Corporation 2012

    Resolving high external pager filesystem I/Os blocked with no fsbuf

      19 pending disk I/Os blocked with no pbuf
 1019076 paging space I/Os blocked with no psbuf
    2359 filesystem I/Os blocked with no fsbuf
       0 client filesystem I/Os blocked with no fsbuf
  204910 external pager filesystem I/Os blocked with no fsbuf

external pager filesystem I/Os blocked with no fsbuf
    Number of external pager client filesystem I/O requests blocked because no fsbuf was available. JFS2 is an external pager client filesystem. Fsbufs are pinned memory buffers used to hold I/O requests in the filesystem layer.

Acceptable tolerance is 5-digits per 90 days uptime.
Second tactic to attempt (if the first tactic wasn't enough): If 6-digits, set ioo -o j2_nBufferPerPagerDevice=5120.
Second tactic to attempt (if the first tactic wasn't enough): If 7+ digits, set ioo -o j2_nBufferPerPagerDevice=10240.

ioo -o j2_nBufferPerPagerDevice=value [Restricted]
    This tunable specifies the number of JFS2 bufstructs that start when the filesystem is mounted. Enhanced JFS will allocate more dynamically (see j2_dynamicBufferPreallocation). Ideally, this value should not be tuned; instead, j2_dynamicBufferPreallocation should be tuned. However, it may be appropriate to change this value if the number of external pager filesystem I/Os blocked with no fsbuf increases and continues increasing after j2_dynamicBufferPreallocation tuning has already been attempted. If the kernel must wait for a free bufstruct, it puts the process on a wait list before the start I/O is issued and will wake it up once a bufstruct has become available. May be appropriate to increase if striped logical volumes or disk arrays are being used.

Heavy IO workloads may require this value to be changed and a good starting point would be 5120 or 10240.
File system(s) must be remounted.
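Because j2_nBufferPerPagerDevice is a restricted tunable on AIX 6.1 and later, setting it generally requires the force flag and a confirmation; a hedged sketch, followed by the remount the change requires (the mount point is illustrative):

ioo -F -p -o j2_nBufferPerPagerDevice=5120   # for a 6-digit blocked count; 10240 for 7+ digits
umount /data && mount /data                  # the new bufstruct count applies at mount time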


  • 49 Copyright IBM Corporation 2012

    ps -ekf cumulative since last boot; compare CPU-time of key processes

$ uptime ; ps -ekf | grep -v grep | egrep "syncd|lrud|nfsd|biod|wait|getty"
12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13

root  131076 0 0 Sep 07 -  553:26 wait
root  262152 0 0 Sep 07 -   25:37 lrud
root  917532 0 0 Sep 07 - 1942:36 wait
root  983070 0 0 Sep 07 - 2026:38 wait
root 1048608 0 0 Sep 07 - 2030:40 wait
root 1114146 0 0 Sep 07 -  612:19 wait
root 1179684 0 0 Sep 07 - 1825:26 wait
root 1245222 0 0 Sep 07 - 1948:03 wait
root 1310760 0 0 Sep 07 - 1949:43 wait
root 1376298 0 0 Sep 07 -  585:41 wait
root 1441836 0 0 Sep 07 - 1881:58 wait
root 1507374 0 0 Sep 07 - 2005:49 wait
root 1572912 0 0 Sep 07 - 2010:27 wait
root 1638450 0 0 Sep 07 -  615:26 wait
root 1703988 0 0 Sep 07 - 1712:18 wait
root 1769526 0 0 Sep 07 - 1848:42 wait
root 1835064 0 0 Sep 07 - 1853:13 wait
root 1900602 0 0 Sep 07 -  528:33 wait
root 1966140 0 0 Sep 07 - 1412:40 wait
root 2031678 0 0 Sep 07 - 1552:47 wait
root 2097216 0 0 Sep 07 - 1558:07 wait
root 2162754 0 0 Sep 07 -  658:26 wait
root 2228292 0 0 Sep 07 - 1200:31 wait
root 2293830 0 0 Sep 07 - 1334:07 wait
root 2359368 0 0 Sep 07 - 1228:54 wait
root 3014734 1 0 Sep 07 -    0:00 kbiod
root 5111986 1 0 Sep 07 -   14:16 /usr/sbin/syncd 60
root 8847594 1 0 Sep 07 -    7:07 /usr/sbin/getty /dev/console

$ uptime ; ps -ekf | grep -v grep | egrep "syncd|lrud|nfsd|biod|wait|getty" | grep -c wait
12:00AM up 18 days, 6:53, 1 user, load average: 12.99, 12.30, 12.13
24
$

  • 50 Copyright IBM Corporation 2012

iostat -a: cumulative since last boot; mapping&comparing hdisks stats can be useful in characterizing performance-related I/O patterns&trends

$ iostat -a
Disks:      % tm_act     Kbps      tps     Kb_read     Kb_wrtn
hdisk8           2.2    607.3      1.5  2164957533  1876147460
hdisk12          2.4    607.7      1.3  2065741964  1978282924
hdisk13          2.2    582.8      1.3  2002751079  1875515764
hdisk11          2.1    593.4      1.3  2073048903  1875758716
hdisk4           1.9    216.3     23.9   812230724   626802460
hdisk15          0.0      2.2      0.6       25584    14666516
hdisk16         11.5    178.7     23.6  1169343088    19983468
hdisk14          0.0      1.3      0.0     8828331           0
hdisk10          6.3    548.9      7.8  3617545529    35278292
hdisk17          0.0      0.0      0.0        8560           0
hdisk18          0.0      3.0      0.1     9741142    10386688
hdisk7           0.4     53.3      7.6   272419695    82268236
hdisk5           0.6     59.4      6.5   225752039   169601848
hdisk6           2.3    624.3      1.5  2175672098  1978387280
hdisk9           2.3    613.6      1.4  2104140790  1978677528
hdisk20          8.1    228.3     35.1  1511885833     7496668
hdisk21          1.6     99.7     24.8    16230194   647254280
hdisk22         15.9    845.2     58.4  5592808968    31384956
hdisk23          3.5    364.2     60.4  1627955714   795383552
hdisk25         20.3    740.1     36.5  4725304221   199399144
hdisk27         20.1   1015.2     45.8  6675326923    80385252
hdisk26         41.3   2934.5    118.0 18806493859   720917972
hdisk29         19.2    949.4     55.7  6113262738   204348212
hdisk24         25.8   1867.8     59.4 12330198268    98946776
hdisk30         16.9    515.9     38.4  3271643603   161247332
hdisk32          5.1    555.4     34.2   888509245  2807296084
hdisk31         11.7    483.8     71.1  3111959749   107262760
hdisk33         47.2   2760.0    153.7 18308894936    56985180
hdisk36          2.4    597.9      1.3  2103249842  1875221640
hdisk35          2.8    616.8      1.4  2126412342  1977828244

  • 51 Copyright IBM Corporation 2012

iostat -D: cumulative since last boot; mapping&comparing hdisks stats is useful in characterizing performance-related I/O patterns&trends

$ iostat -D
System configuration: lcpu=24 drives=87 paths=172 vdisks=0

hdisk0    xfer:  %tm_act      bps      tps    bread    bwrtn
                     0.8    18.7K      2.3     7.0K    11.7K
          read:      rps  avgserv  minserv  maxserv timeouts    fails
                     0.6      3.0      0.1    267.1        0        0
          write:     wps  avgserv  minserv  maxserv timeouts    fails
                     1.7      5.5      0.3    320.5        0        0
          queue: avgtime  mintime  maxtime  avgwqsz  avgsqsz   sqfull
                     8.8      0.0    291.3      0.0      0.0  6349911
hdisk1    xfer:  %tm_act      bps      tps    bread    bwrtn
                     0.6    12.9K      1.7     1.2K    11.7K
          read:      rps  avgserv  minserv  maxserv timeouts    fails
                     0.0      4.8      0.1    301.8        0        0
          write:     wps  avgserv  minserv  maxserv timeouts    fails
                     1.7      5.4      0.4    281.1        0        0
          queue: avgtime  mintime  maxtime  avgwqsz  avgsqsz   sqfull
                    11.3      0.0    275.6      0.0      0.0  6102418
hdisk86   xfer:  %tm_act      bps      tps    bread    bwrtn
                    10.2   789.3K     33.1   753.9K    35.4K
          read:      rps  avgserv  minserv  maxserv timeouts    fails
                    30.6      6.5      0.1     1.3S        0        0
          write:     wps  avgserv  minserv  maxserv timeouts    fails
                     2.5      2.5      0.2    912.0        0        0
          queue: avgtime  mintime  maxtime  avgwqsz  avgsqsz   sqfull
                     4.3      0.0     1.1S      0.0      0.0 73320194
hdisk87   xfer:  %tm_act      bps      tps    bread    bwrtn
                    10.1   801.6K     33.7   764.2K    37.4K
          read:      rps  avgserv  minserv  maxserv timeouts    fails
                    31.2      6.3      0.1     1.2S        0        0
          write:     wps  avgserv  minserv  maxserv timeouts    fails
                     2.5      2.5      0.2    913.1        0        0
          queue: avgtime  mintime  maxtime  avgwqsz  avgsqsz   sqfull
                     4.3      0.0     1.2S      0.0      0.0 74160810
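Cumulative since-boot figures like these average away peaks; to watch a suspect pair of hdisks at intervals instead, iostat -D also accepts a drive list, an interval and a count. A sketch (the drive names come from the output above; the interval and count are illustrative):

iostat -D hdisk86 hdisk87 5 3    # three 5-second extended-statistics samples for two busy hdisks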

  • 52 Copyright IBM Corporation 2012

netstat -v: high watermark is "Max Packets on S/W Transmit Queue"

$ netstat -v
-------------------------------------------------------------
ETHERNET STATISTICS (ent0) :
Device Type: 2-Port 10/100/1000 Base-TX PCI-Express Adapter (14104003)
Hardware Address: 00:14:5e:74:1b:8a
Elapsed Time: 270 days 21 hours 33 minutes 15 seconds

Transmit Statistics:                          Receive Statistics:
--------------------                          -------------------
Packets: 101419085701                         Packets: 417799880725
Bytes: 402789006370762                        Bytes: 546174259053849
Interrupts: 0                                 Interrupts: 67899053842
Transmit Errors: 0                            Receive Errors: 10965163
Packets Dropped: 0                            Packets Dropped: 30
                                              Bad Packets: 0
Max Packets on S/W Transmit Queue: 3109
S/W Transmit Queue Overflow: 0
Current S/W+H/W Transmit Queue Length: 1

Broadcast Packets: 24079                      Broadcast Packets: 1135765
Multicast Packets: 0                          Multicast Packets: 387934
No Carrier Sense: 0                           CRC Errors: 0
DMA Underrun: 0                               DMA Overrun: 9219805
Lost CTS Errors: 0                            Alignment Errors: 0
Max Collision Errors: 0                       No Resource Errors: 1745358
Late Collision Errors: 0                      Receive Collision Errors: 0
Deferred: 0                                   Packet Too Short Errors: 0
SQE Test: 0                                   Packet Too Long Errors: 0
Timeout Errors: 0                             Packets Discarded by Adapter: 0
Single Collision Count: 0                     Receiver Start Count: 0
Multiple Collision Count: 0
Current HW Transmit Queue Length: 1

General Statistics:
-------------------
No mbuf Errors: 30
Adapter Reset Count: 0
Adapter Data Rate: 2000
Driver Flags: Up Broadcast Running
              Simplex 64BitSupport ChecksumOffload
              PrivateSegment LargeSend DataRateSet
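To pull just the transmit-queue high watermark and the common trouble counters out of the full per-adapter listing, a simple filter is enough; a sketch (the grep pattern is illustrative):

netstat -v | grep -E "ETHERNET STATISTICS|Max Packets on S/W Transmit Queue|S/W Transmit Queue Overflow|No Resource Errors|No mbuf Errors"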

  • 53 Copyright IBM Corporation 2012

Monitoring AIX Usage, Meaning and Interpretation
  Review component technology of the infrastructure, i.e. proper tuning-by-hardware
  Review implemented AIX constructs, i.e. firm near-static structures and settings
  Review historical/accumulated AIX events, i.e. usages, pendings, counts, blocks, etc.
  Monitor dynamic AIX command behaviors, i.e. ps, vmstat, mpstat, iostat, etc.

Recognizing Common Performance-degrading Scenarios
  High Load Average relative to count-of-LCPUs, i.e. over-threadedness
  vmstat:memory:avm near-to or greater-than lruable-gbRAM, i.e. over-committed
  Continuous low vmstat:memory:fre with persistent lrud (fr:sr) activity
  Continuous high ratio of vmstat:kthr:b relative to vmstat:kthr:r
  Poor ratio of pages freed to pages examined (fr:sr ratio) in vmstat -s output

    Strategic Thoughts, Concepts, Considerations, and Tactics

  • 54 Copyright IBM Corporation 2012

ps -kelmo THREAD demonstrates the reality of threadedness

$ ps -kelmo THREAD
    USER      PID     PPID       TID ST CP PRI SC WCHAN            F      TT BND COMMAND
    root        1        0         - A   0  60  1 -                200003  -   - /etc/init
       -        -        -     65539 S   0  60  1 -                410400  -   - -
    root  3145888        1         - A   0  60  1 f1000000a05f9098 240001  -   - /usr/ccs/bin/shlap64
       -        -        -   4063447 S   0  60  1 f1000000a05f9098    400  -   - -
    root  3539136        1         - A   0  60  1 f1000a0000154048  40401  -   - /usr/lib/errdemon
       -        -        -  15270101 S   0  60  1 f1000a0000154048  10400  -   - -
    root  4915350        1         - A   0  60  1 f1000000c0386728  40401  -   - /usr/sbin/emcp_xcryptd -d
       -        -        -  14155953 S   0  60  1 f1000000c0386728 400000  -   - -
    root  5111986        1         - A   0  60 17 *                240001  -   - /usr/sbin/syncd 60
       -        -        -   4915221 S   0  60  1 f1000a011d0ec6b0 410400  -   - -
       -        -        -  11272331 S   0  60  1 f1000a0122cb77b0 410400  -   - -
       -        -        -  13107365 S   0  60  1 f1000a0122cbd2b0 410400  -   - -
       -        -        -  14352589 S   0  60  1 f1000a013a0fe0b0 410400  -   - -
       -        -        -  14418107 S   0  60  1 f1000a013a0fedb0 410400  -   - -
       -        -        -  14483643 S   0  60  1 f1000a011fcf0ab0 410400  -   - -
       -        -        -  14549181 S