A06 Care and Feeding of AIX

© 2006 IBM Corporation IBM Confidential © 2007 IBM Corporation Care and Feeding of AIX Susan Schreitmueller Distinguished Engineer, Client Care

Transcript of A06 Care and Feeding of AIX

Page 1: A06 Care and Feeding of AIX

© 2006 IBM Corporation, IBM Confidential

© 2007 IBM Corporation

Care and Feeding of AIX

Susan Schreitmueller, Distinguished Engineer, Client Care

Page 2: A06 Care and Feeding of AIX

STG – Power Systems Client Care


The THANK YOU Slide

A special note of thanks to those in FTSS and ATS who contributed ideas and expertise to this presentation. Without their submissions this presentation would not be possible.

However, any mistakes I'll take full credit for.

Page 3: A06 Care and Feeding of AIX


Contributors

• Regina Moliff

• Steve Pittman

• Peter Nutt

• Bruce Spencer

• Mark Dixon

• Jerry Petru

• David Sinnot

• Grover Davidson

• Cesar D Maciel

• Maneesh Sharma

• Daryl Scott

• Ravi Shankar

• Ken Fleck

• Michael Sieber


Page 4: A06 Care and Feeding of AIX


Disclaimer

The suggestions contained in this presentation are general suggestions formulated by the author not as a recommendation from IBM.

These recommendations should be carefully examined for your environment and tested rigorously prior to implementing in production.

All environments differ and requirements vary given application and system nuances. Always use YOUR best judgment.

Page 5: A06 Care and Feeding of AIX


Level-Setting for Best Practices:

Best practices for implementing Oracle on AIX:

• Oracle 9i & 10g on IBM AIX 5L: Tips & Considerations

• Oracle Architecture and Tuning on AIX white paper

• Tuning IBM AIX 5L for an Oracle Database white paper

IBM System p Advanced POWER Virtualization Best Practices Redbook

IBM eServer Security Planner and the Strengthening AIX Security: A System-Hardening Approach white paper

Learn 10 good UNIX usage habits web page

• AIX V5.3 installation best practices https://w3.webahead.ibm.com/w3ki/display/wpSeriesFTSS/AIXInstall

• AIX V5.3 backup & restore https://w3.webahead.ibm.com/w3ki/display/wpSeriesFTSS/AIXBackup

• AIX V5.3 boot from SAN https://w3.webahead.ibm.com/w3ki/display/wpSeriesFTSS/AIXSANBoot

Regularly visit Service and support best practices for UNIX servers

Page 6: A06 Care and Feeding of AIX


http://www14.software.ibm.com/webapp/set2/est/home.html

Page 7: A06 Care and Feeding of AIX


From 50,000 FEET

• Know SLAs for all applications, architect accordingly
• Record a view of your applications and SLAs by partition and by machine
• Build an adequate test environment. For EVERYTHING.
• Create and test your backup strategy, in its entirety, routinely. And when anything changes!
• Have a PLAN to apply maintenance. Yes, even in a 24/7, 52 weeks per year environment.
• MONITOR your environment
• TEST ALL CHANGES!
• Understand your baseline performance.
• Understand your peaks and how virtualization features can help consolidate servers.
• Have a capacity plan in place for peaks. Test before needed.
• Review monitoring and escalation procedures.
• Run a 'test' problem if it's been a while.
• Spend extra time on designing your I/O layout. Especially for databases!
• Know tools and monitoring techniques for problems and what they look like when the system is NORMAL…

Page 8: A06 Care and Feeding of AIX


Golden Code

Making It and Keeping It

Page 9: A06 Care and Feeding of AIX


The landscape is changing!

• Know what tools are coming… IBM Director is the proposed strategic direction to acquire and disseminate fixes and will incorporate SUMA and NIM functions long term.

• (SUMA and NIM will still be employed by many shops for a long time to come…)

• Develop a strategy for classifying servers (according to availability and change/release tolerance)

• Develop the strategy for notification of fixes, commonly a mix of SUMA and subscription services.

• Review the tools you will use such as System Planning Tool, NIM, SUMA, FLRT

• Describe the different levels of code to disseminate

• Determine how to disseminate the changes (NIM or other)


Page 10: A06 Care and Feeding of AIX


Golden Code – Making It & Keeping It

Categorize your images: When creating an image that will be used on many servers, you may want to split it into two or three images that have unique characteristics, e.g., database servers, application servers, etc.

Don't reinvent the wheel! Keep it consistent where possible: Incorporate as many of the post-install tasks into the base image as you can and still maintain the 'golden code'.

You might consider sizing your filesystems (/usr, /, /tmp, and dump) before you create a clone, creating any RCT monitoring that is standard, and utilizing performance templates.

Use NIM or some mechanism to track and organize images and fixes: As code is moved into the environment, the golden code should be evaluated and kept current through the NIM process. You need to keep NIM up to date for it to be effective.

Within a NIM environment, include VG backups of the non-rootvg volume group structures.

Using savevg -mrivf /usr/local/recovery/datavg datavg makes sure there is a saved layout of the external disks before the mksysb runs, so the information is also saved within the mksysb image on the NIM server.
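The savevg step above could be scripted so that it covers every non-rootvg volume group before each mksysb. This is a minimal sketch, not a tested procedure: it assumes volume group names arrive one per line (the format lsvg prints), reuses the flags and the /usr/local/recovery directory from the example above, and uses a hypothetical function name. It only prints the commands so they can be reviewed first.

```sh
# Hypothetical helper: print one savevg command per non-rootvg volume group.
# Reads volume group names on stdin, one per line (the format lsvg produces).
# The savevg flags and target directory are copied from the example above.
print_savevg_cmds() {
    while read -r vg; do
        # Skip blank lines and rootvg (rootvg is captured by mksysb itself).
        [ -z "$vg" ] && continue
        [ "$vg" = "rootvg" ] && continue
        echo "savevg -mrivf /usr/local/recovery/$vg $vg"
    done
}

# On an AIX system this might be used as:
#   lsvg | print_savevg_cmds        # review the commands
#   lsvg | print_savevg_cmds | sh   # run them before the mksysb
```

Printing rather than executing keeps the sketch safe to try on any system; test the generated commands in a non-production environment first.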

Page 11: A06 Care and Feeding of AIX


Golden Code – Making It & Keeping It

If there are a number of AIX servers, a ‘golden code’ image should be maintained. A NIM server and a cloned image is an excellent way to maintain O/S consistency.

The most recent Technology Level and Service Pack should be researched and considered as a starting point if this is the initial setup of an environment. Get on the subscription list to get notified of HIPER and critical fixes. Consider using SUMA to monitor and download fixes.

Get familiar with IBM Director, the niminv and geninv commands, and compare_report.

Firmware is just as important as software to keep current. Have a plan to allow currency of both!

Page 12: A06 Care and Feeding of AIX


SUMA (Service Update Management Assistant)

• SUMA, included in the base AIX 5L™ Version 5.3 operating system, provides flexible, policy-based options to perform unattended downloads of AIX 5L™ updates from the Support Web site.

• Notification of requestor via email

• SMIT or command line interface

• TLs or SPs will be downloaded (no support for specific PTFs or APARs after 10/08), but individual updates can be installed if desired after the download

Page 13: A06 Care and Feeding of AIX

System p AIX LINUX Technical University © 2006 IBM Corporation

Network Install Manager (NIM)

[Figure: a NIM Master (NIM server) holds the resources and serves them over the network to the NIM clients: AIX Client 1, AIX Client 2, AIX Client 3, LINUX Client, and Data Warehouse. Resources shown: Installation Source = lpp_source; Network OS Tree = SPOT; Backup Images = mksysbs; Customization = scripts.]

• The niminv command can gather, conglomerate, compare, and download fixes based on the installation inventory of NIM objects.

• The niminv command extends the functionality of compare_report to operate on several NIM objects such as machines and lpp_sources at the same time.

• The geninv command can collect software inventory information from other systems using IP addresses or resolvable hostnames.

Page 14: A06 Care and Feeding of AIX


compare_report

• The AIX compare_report command compares the filesets installed on a system to the contents of a fileset image repository or to a list of available updates that may be downloaded from Fix Central.

• It produces reports that simplify the process of determining the fixes to install to bring a system to the latest maintenance level or the latest level. Reports that are created using the list of available updates can be uploaded directly to Fix Central (5.1 uploads, 5.2 uploads, 5.3 uploads) to request the exact fixes needed for the system.

Page 15: A06 Care and Feeding of AIX


Regularly review Subscription Service notifications

Always have at least one system administrator (and possibly a backup) reviewing HIPER and critical fixes for each system or group of systems.

Page 16: A06 Care and Feeding of AIX


NIM and SUMA

● SUMA can filter against an lpp_source

● Use SUMA to monitor for updates, notify the administrator of critical or desired fixes, filter against repositories to control downloads

● compare_report will accept lpp_source:

● Use compare_report regularly to provide a comparison among deployed machines and a ‘control source’ to determine rate of change and environmental differences

● Use compare_report to check for differences (uplevel / downlevel / missing) in fix deployment and lpps

● geninv and niminv provide plain English inventory reports

• The software inventory, gathered from lslpp -Lc, and the hardware inventories (system and adapter inventories) are often used for comparisons between installed machines and a 'control'

/usr/sbin/compare_report, /usr/sbin/geninv, and /usr/sbin/niminv are the executables for compare_report, geninv, and niminv respectively.

geninv and compare_report are in bos.rte.install; niminv is in bos.sysmgt.nim.master.

Page 17: A06 Care and Feeding of AIX


SUMA and NIM Come Together TL5

Page 18: A06 Care and Feeding of AIX


Some thoughts on SUMA and the new NIM

• Use niminv to gather and conglomerate inventory of NIM clients

• Utilize SUMA to download software fixes to a NIM master based on the conglomerated inventory and use NIM to deploy and track the fixes

• Execute SUMA on a systematic basis to check for AIX fixes of interest (APAR, TL, SP, Security, Latest, etc.)

• geninv gathers software and hardware (system and adapter microcode) inventories on local and remote machines

• niminv gathers, conglomerates, compares, and downloads fixes based on the installation inventory of NIM objects. It also extends the functionality of compare_report to operate on several NIM objects such as machines and lpp_sources at the same time.

Page 19: A06 Care and Feeding of AIX


A word about Maintenance!

Lobby for once a month maintenance window (be happy if you get once a quarter), even if you don’t use it every time…

You MUST update your firmware once per year (firmware releases are supported for 1 year, or 2 years for POWER6) and should plan to embrace a Technology Level per year also. Good planning dictates an additional concurrent firmware upgrade during the year as well.

Remember our 50K view? Have your SLAs defined and a view of them by machine and by partition. This will be useful in defining a maintenance policy.

Check your machines ahead of time (for example, with a readiness checker) to review your environment BEFORE you start the upgrade!

Page 20: A06 Care and Feeding of AIX


Other Recommendations

• Recommendations: Utilize alt_disk_install or multibos to clone a running, optimally laid out system when beginning the mksysb setup.

• Recommendations: rootvg should always be mirrored and quorum turned off in a single disk mirroring configuration.

• Recommendations: A test environment should be maintained that can test the initial and periodic restore of the 'golden code' image. It must include like configuration (e.g., HACMP).

• Move code into and back out of test EXACTLY the same way before you do it in production.

• Recommendations: Customers should install the latest AIX maintenance level on any system which will be tested extensively prior to deployment, unless the application vendor(s) strongly recommend some level other than the latest.

• Recommendations: TAKE A MKSYSB after the initial install is completed. Ensure that the PROMPT field is set to yes in bosinst.data. Label the tape and write-protect it.

Page 21: A06 Care and Feeding of AIX


Install via Bundles

Best Practice

After installing base AIX, install optional components by installing bundles rather than filesets.

Why: Easier to select one or a few bundles rather than selecting hundreds of filesets.

How: Use the smit Install Software Bundle menu (accessed via the smitty install_bundle fast path) to install bundles from the AIX installation media.

Page 22: A06 Care and Feeding of AIX


Additional Filesets to Install

• Install the appropriate AIX bundles using the command smitty install_bundle. The Server and Mozilla bundles are the most popular bundles. Installing the Mozilla bundle requires access to the AIX Toolbox for Linux Applications and Mozilla CDs. (Mozilla can also be installed independently, without using the bundle.) Other bundles (e.g., App-Dev) may be applicable in some environments.

• There are additional filesets you may want to consider, listed on the following pages; however, these will vary with installation.

Once all desired filesets and bundles have been installed, select and apply a Technology Level and Service Pack.

Please note that if additional filesets or bundles are installed after a Technology Level has been applied, it is important to reapply the Technology Level. See the "After installing any new optional AIX filesets, always reapply the current AIX Technology Level and Service Pack" best practice.

Page 23: A06 Care and Feeding of AIX

Maintenance Level Updates

Best Practice:

After installing any new optional AIX filesets, always reapply the current AIX Technology Level and Service Pack.

Why: It is important to keep all system components at a consistent fix level. When an AIX Technology Level or Service Pack is applied, fixes are applied only to filesets installed on the system when the Technology Level or Service Pack application occurs. When new optional AIX filesets are installed, they are installed at the fix level available on the base AIX installation media, which is likely below the AIX Technology Level and Service Pack at which the system is currently running.

How: Use the smit Update Installed Software to Latest Level (Update All) menu (accessed via the smitty update_all fast path) to reapply AIX Technology Level media.

Caution: Use smitty update_all only against AIX Technology Level media, not against other media received from the IBM Support Center which contains miscellaneous fixes, unless instructed to do so by the Support Center.

Page 24: A06 Care and Feeding of AIX

Installing Additional Filesets (Cont.)

Best Practice:

Install a few individual filesets in addition to bundles.

Why: The filesets are not part of every bundle and provide the following valuable functions:

• bos.adt.samples (the vmtune command and other performance tools, but please note that most vmtune command functions have been replaced by the vmo, ioo, and no commands in AIX V5.3)
• X11.apps.config (needed for SSH X11 forwarding to work)
• bos.adt.base (required by Oracle according to the Oracle9i Release Notes)
• bos.adt.libm (required by Oracle according to the Oracle9i Release Notes)
• bos.perf.perfstat (required by Oracle according to the Oracle9i Release Notes)
• bos.perf.libperfstat (required by Oracle according to the Oracle9i Release Notes)
• Possibly bos.dosutil (support for the AIX dosdir, dosread, and doswrite commands to read and write DOS-format diskettes)

How: Use the smit Install and Update from ALL Available Software menu (accessed via the smitty install_all fast path) to install additional filesets from the AIX installation media.

Page 25: A06 Care and Feeding of AIX

Install your favorite web browser

Best Practice

Install a web browser.

Why: A web browser is an important end-user and sysadmin tool.

How: Install the Mozilla browser from the Mozilla V1.7.3 Web Browser and Application Suite for AIX CD (or download the latest Mozilla or Firefox browser from the Web browsers for AIX download site ).

See the technote: Installing Mozilla on AIX

Please note that root's mozilla cache is placed in /.mozilla by default. To preserve space in / and to avoid filling it up, after installing Mozilla consider creating a /home/root/.mozilla directory, creating a symbolic link to it from /.mozilla, and confirming that root's mozilla will run okay with the soft-linked directory.
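The directory-plus-symlink step described above might look like the following. This is a minimal sketch under stated assumptions: the function name and the prefix argument are hypothetical (the prefix lets the steps be rehearsed in a scratch directory), and the function refuses to touch an existing non-link /.mozilla rather than guessing how to migrate it.

```sh
# Hypothetical helper: point <prefix>/.mozilla at <prefix>/home/root/.mozilla.
# The prefix argument is an assumption added so the steps can be rehearsed
# in a scratch directory; pass an empty string to act on the real paths.
relocate_mozilla_cache() {
    prefix=$1
    target="$prefix/home/root/.mozilla"
    link="$prefix/.mozilla"

    mkdir -p "$target" || return 1
    # Refuse to replace an existing real directory or file.
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "$link exists and is not a link; move it aside first" >&2
        return 1
    fi
    # Create the link only if it is not already in place.
    [ -L "$link" ] || ln -s "$target" "$link"
}

# Rehearsal in a scratch directory:  relocate_mozilla_cache /tmp/scratch
# Real paths (as root):              relocate_mozilla_cache ""
```

After running it for real, start Mozilla as root once to confirm it works with the linked directory, as the text advises.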

Use the smitty change_documentation_services fast path to specify Mozilla as the system default browser once Mozilla is installed.

Page 26: A06 Care and Feeding of AIX

Install your favorite PDF Reader

Best Practice

Install Adobe Acrobat Reader.

Why: A .PDF reader is an important end-user and sysadmin tool.

How: Download the Adobe Acrobat Reader V7.0.8 for IBM AIX from the Adobe Reader download site.

To install Acrobat, run INSTALL in the AdobeReader directory after unpacking the downloaded archive.

See the "Don't add local files to / and /usr" best practice, which suggests creating a /usr/local filesystem and installing software such as Adobe Acrobat in that filesystem (in, for example, directory /usr/local/Acrobat7). Add a symbolic link from /usr/local/bin/acroread to the Adobe acroread executable. (If you implemented the "Add useful shell scripts to /usr/local/bin and make them available to all users" best practice, the /usr/local/bin directory will already exist and all users will have /usr/local/bin in their $PATHs.)

Page 27: A06 Care and Feeding of AIX

Keeping / and /usr clean

Best Practice

Don't add local files to / and /usr. (Mount points are okay, though.) And minimize custom changes to executable AIX files in / and /usr (eg, /sbin/rc.boot).

Why:

1. AIX system maintenance updates files in / and /usr. Local customization to executable AIX files in / and /usr may be wiped out when maintenance is applied.
2. During a version upgrade, it is often desirable to use a Preservation reinstall, which discards the /, /usr, /var, and /tmp filesystems and rebuilds them from scratch. Local files stored in / and /usr will be lost during such a reinstall.
3. This helps minimize the amount of space used in / and /usr.

Page 28: A06 Care and Feeding of AIX

Share useful Administrator Scripts

Best Practice

Add useful shell scripts to /usr/local/bin and make them available to all users.

Why: Share scripts – ease of use. (Also suggest change control on scripting – these can be more dangerous than anything!)

How: As described in the "Don't add local files to / and /usr" best practice, create a /usr/local filesystem. Create a /usr/local/bin directory in the filesystem. Add files /usr/local/bin/ptree and /usr/local/bin/stopcmd (follow links to see file contents) with ownership & permissions of bin.bin & r-xr-xr-x.

Add /usr/local/bin near the end of PATH= in /etc/environment. (You did remember to save /etc/environment as /etc/environment.orig first, right? - See the Before manually editing any file in the / and /usr filesystems for the first time, save a copy of the file best practice.) If you implemented the Update the /usr/lib/security/mkuser.sys file shipped with AIX to install a profile other than the system default when a new userid is created best practice, the $PATH set in /etc/environment will be inherited by all users.

Note: Because these files are added to the /usr/local filesystem in rootvg, this is not an exception to the Don't add local files to / and /usr best practice.
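The "save a copy of the file before editing it for the first time" habit referenced above lends itself to a small script in /usr/local/bin. The following is a minimal sketch; the name save_orig is hypothetical and not part of AIX. It never overwrites an existing .orig copy, so the pristine version survives later edits.

```sh
# Hypothetical helper: keep a pristine copy of a file before its first edit.
# Copies FILE to FILE.orig, preserving mode and timestamps, unless FILE.orig
# already exists (so a later edit never overwrites the original copy).
save_orig() {
    file=$1
    if [ ! -f "$file" ]; then
        echo "save_orig: $file not found" >&2
        return 1
    fi
    [ -e "$file.orig" ] || cp -p "$file" "$file.orig"
}

# Example: save_orig /etc/environment
```

Calling it before every edit is safe, because only the first call makes a copy.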

Page 29: A06 Care and Feeding of AIX

Network Configuration tips

Best Practice

Configure TCP/IP to search local /etc/hosts to resolve a host name before trying DNS.

Why: If your network folks hose up DNS, you want to be able to circumvent the problem locally and minimize impact on your servers while they are fixing the problem.

How: There are two options for name resolution tuning:

• Add a line containing hosts=local,bind to /etc/netsvc.conf. (You did remember to save /etc/netsvc.conf as /etc/netsvc.conf.orig first, right? - See the "Before manually editing any file in the / and /usr filesystems for the first time, save a copy of the file" best practice.) This update takes effect as soon as the change is made.

• Add a line containing NSORDER=local,bind to /etc/environment. (You did remember to save /etc/environment as /etc/environment.orig first, right? - See the same best practice.) This update takes effect only as each daemon is stopped & restarted and as users log off & log back in. The easiest way to restart all daemons is to reboot AIX. It is not necessary to add an export NSORDER command to /etc/profile for ksh users, since ksh exports NSORDER by default if it is set.

Note: Leave the loopback/localhost entry in /etc/hosts and add only an entry for your local hostname unless, of course, it becomes necessary to add entries to circumvent a DNS problem.

Note: Before and after making this change, issue the command host <ipaddr> for every <ipaddr> defined in /etc/hosts. Update entries in /etc/hosts so that the host <ipaddr> command generates the same output with and without the new line in /etc/netsvc.conf. That is, make sure your /etc/hosts is consistent with DNS.
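The first option and the consistency check in the last note can be sketched as two small helpers. Both function names are hypothetical, and the sketch assumes the usual hosts file layout: address in the first field, comments starting with #.

```sh
# Append LINE to FILE unless an identical line is already present.
ensure_line() {
    file=$1
    line=$2
    grep -Fxq -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Print the address (first field) of every non-comment entry in a hosts file.
hosts_addrs() {
    awk '$1 !~ /^#/ && NF >= 2 { print $1 }' "$1"
}

# On the AIX system this might be used as:
#   for addr in $(hosts_addrs /etc/hosts); do host "$addr"; done > /tmp/before
#   ensure_line /etc/netsvc.conf "hosts=local,bind"
#   for addr in $(hosts_addrs /etc/hosts); do host "$addr"; done > /tmp/after
#   diff /tmp/before /tmp/after     # no differences means hosts matches DNS
```

Because ensure_line only appends when the exact line is missing, the script can be rerun without duplicating the entry.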

Page 30: A06 Care and Feeding of AIX

Install OpenSSH

Best Practice

Install OpenSSH.

Why: To avoid use of unsecure protocols such as FTP and telnet, which send unencrypted passwords (including root's password) over TCP/IP networks. See the OpenSSH web page for details.

How: The "OpenSSH is now bundled with AIX" web page has instructions for installing OpenSSH on AIX. Here is a summary of those instructions:

1. Point a browser at the AIX Toolbox for Linux Applications web page. Note that the Federal government considers cryptographic software to be a munition, so IBM must require that users register before downloading it. And OpenSSL is cryptographic software. Heave a heavy sigh.
2. Follow the AIX Toolbox Cryptographic Content link on the right. (An IBM ID is required to access the cryptographic content. The registration required to obtain an IBM ID is relatively quick and painless. If you don't want to register for an IBM ID, OpenSSL is shipped with AIX on the AIX Toolbox for Linux Applications CD, although the version on that CD might not have all available patches.)
3. Please note that more than one version of the "openssl - Secure Sockets Layer and cryptography libraries and tools" package is available for download from the web site. Select the latest (openssl-0.9.7l-1.aix5.1.ppc.rpm as of 11/30/2006) and download it.
4. Use the smitty install fast path to install the openssl RPM file.
5. Use the smitty install fast path to install the OpenSSH filesets from the AIX 5L V5.3 Expansion Pack CD. Or, to get a version with the latest available patches, download the latest OpenSSH filesets in installp format from the OpenSSH on AIX web page on SourceForge. Select the latest version of Open Secure Shell (openssh_4.3p2-r2 as of 6/2/2007).

Note: If you need an SSH client for Windows XP, PuTTY is free and seems to work well.

Note: If you prefer to build SSH from open source rather than download executables, the Deploying OpenSSH on AIX web page has instructions for doing so.

Page 31: A06 Care and Feeding of AIX

Setting the TERM value

Best Practice

When configuring an ASCII console/terminal, configure AIX to set the TERM= shell environment variable correctly during login from that terminal.

Why: Ease of use and correct key / video mapping

How: When defining an ASCII console/terminal, always set the tty's TERMINAL type field (on the Change / Show Characteristics of a TTY menu accessed via the smitty chgtty fast path) to the appropriate terminal type. For example, if the terminal is a VT100, set the TERMINAL type field to vt100. When defining a serial port for a modem, leave the TERMINAL type field set to dumb. Discourage users from setting TERM= in their .profile except as a conditional statement (e.g., if [ "$TERM" = "dumb" ] ; then TERM=vt100 ; fi).
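Laid out as it might appear in a user's .profile, the conditional from the text looks like this. It is a sketch only: the function wrapper and its name are assumptions added for readability, and vt100 is just the example value used above.

```sh
# Fragment for a user's .profile: override TERM only when login left it at
# the generic value "dumb". vt100 is the example value from the text;
# substitute the terminal type actually in use.
fix_term() {
    if [ "$TERM" = "dumb" ]; then
        TERM=vt100
        export TERM
    fi
}
fix_term
```

An unconditional TERM=vt100 in .profile would override the value the tty definition supplies, which is what the best practice is trying to avoid.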

Suggestion: Some clients recommend using the mkvterm command on the HMC instead of relying on the WebSM interface. It is quicker and easier to access.

Page 32: A06 Care and Feeding of AIX

Tailoring User IDs

Best Practice

Before creating any userids for which /bin/ksh is the default shell, update the /usr/lib/security/mkuser.sys file shipped with AIX to install a profile other than the system default when a new userid is created.

Why: The system default user profile (/etc/security/.profile) assigns an absolute $PATH, which defeats any attempt to introduce a new directory into all users' $PATHs by updating the PATH= statement in /etc/environment.

How: Replace /usr/lib/security/mkuser.sys (follow link to see new mkuser.sys content). (You did remember to save /usr/lib/security/mkuser.sys as /usr/lib/security/mkuser.sys.orig first, right? - See the "Before manually editing any file in the / and /usr filesystems for the first time, save a copy of the file" best practice.) Verify that mkuser.sys.orig is as expected (follow link to see old mkuser.sys content).

Add files /etc/security/.profile.ksh and /etc/security/.kshrc (follow links to see file contents). Make sure .profile.ksh & .kshrc have the same ownership & permissions as /etc/security/.profile (root.security & rw-rw----).

If you wish to create a user with a default shell other than /bin/ksh, you should make appropriate changes to /usr/lib/security/mkuser.sys for other default shell(s) and add the other default profile(s) to /etc/security.

Note: This is one of the few exceptions to the Don't add local files to / and /usr best practice.

Page 33: A06 Care and Feeding of AIX

Configuring root

Best Practice:

Configure the root user so you always know which system you are on and which directory you are in.

Why: If you are managing systems over a network, it is very important to know what system you are on. It is, therefore, important that root's shell prompt always remind you where you are. (The odd print command in /.profile will set the text in the title bar of an xterm or aixterm window to hostname:username when you log in through an xterm or aixterm window on another system.)

How: Assuming you configure the root user to run ksh at login, add files /.profile and /.kshrc (follow links to see file contents). Set ownership & permissions to root.system & rw-r-----.

Note: After logging in to the root userid through CDE (GUI Desktop) on a graphics console for the first time, edit /.dtprofile to uncomment the last line (# DTSOURCEPROFILE=true) so that /.profile will get executed every time root logs in. (You did remember to save /.dtprofile as /.dtprofile.orig first, right? - See the Before manually editing any file in the / and /usr filesystems for the first time, save a copy of the file best practice.)
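The linked /.profile contents are not reproduced in this transcript, so the following is only a hypothetical illustration of the kind of prompt setting the best practice describes, assuming root runs ksh. It does not attempt the xterm title-bar print command mentioned above.

```sh
# Hypothetical fragment for root's /.profile (ksh). The real file contents
# are not reproduced in this section, so this is only an illustration.
HOST=$(uname -n)            # node name of this system
# ksh re-expands $PWD each time the prompt is shown, so the prompt tracks
# the current directory; the single quotes keep $PWD unexpanded here.
PS1="$HOST:"'$PWD # '
export HOST PS1
```

With this in place the prompt shows the node name, a colon, and the current directory, which answers both "which system" and "which directory" at a glance.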

Page 34: A06 Care and Feeding of AIX

Update the path

Best Practice

Update $MANPATH for all users so that the man command can find man pages installed by AIX Toolbox for Linux Applications software and the csm.dsh fileset.

Why: Ease of use / Usability

How: Add the line MANPATH=/usr/share/man:/opt/freeware/man:/opt/csm/man:/opt/freeware/apache/man to /etc/environment. (You did remember to save /etc/environment as /etc/environment.orig first, right? - See the "Before manually editing any file in the / and /usr filesystems for the first time, save a copy of the file" best practice.)

Note: Bull freeware installation instructions suggest that an export MANPATH command must be added to /etc/profile, but this is not necessary because ksh exports MANPATH by default if it is set. To confirm that MANPATH is exported by default after adding MANPATH to /etc/environment, log in, issue the command export, and observe that MANPATH is displayed (among many other environment variables).

Page 35: A06 Care and Feeding of AIX

Allocating Dump Space

Best Practice

Make sure adequate system dump space is allocated.

Why: So that if AIX crashes, diagnostic information is captured to determine the cause of the crash, which usually allows remedial action to be taken to prevent a reoccurrence.

How: When AIX V5.3 is first installed, run the /usr/lib/ras/dumpcheck command to make sure adequate dump space has been allocated. Since the dump space requirement tends to grow as the system gets busier, configure dumpcheck to run regularly at a time when the system is likely to be fairly heavily loaded.

Use crontab -l to confirm that root's crontab is configured to run dumpcheck regularly at an appropriate time and, if not, use crontab -e to update root's crontab.
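The crontab check described above can be automated with a one-line filter. This is a minimal sketch: the function name is hypothetical, and the schedule in the example entry is an assumption to be replaced with a time when your system is busy.

```sh
# Hypothetical helper: succeed if a crontab listing (read from stdin, as
# printed by crontab -l) has an active, uncommented dumpcheck entry.
has_dumpcheck() {
    grep -v '^[[:space:]]*#' | grep -q '/usr/lib/ras/dumpcheck'
}

# Example entry (an assumption; pick a time when the system is busy).
# This one would run dumpcheck every day at 15:00:
#   0 15 * * * /usr/lib/ras/dumpcheck >/dev/null 2>&1
#
# On the AIX system, as root:
#   crontab -l | has_dumpcheck || echo "dumpcheck is not scheduled"
```

Filtering out comment lines first matters, because a commented-out dumpcheck entry would otherwise look like a scheduled one.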

Page 36: A06 Care and Feeding of AIX

Dump Space Preparation

Best Practice

Prepare the system so you can initiate a stand-alone dump if AIX hangs (won't allow logins - might or might not respond to a ping).

Why: It is too late for preparation when AIX is hung. If AIX hangs and you have no way of initiating a standalone dump, you have no way of collecting diagnostic information to determine why AIX hung.

How: If the system is in a secure area (where unauthorized personnel cannot gain physical access to it), invoke the AIX command sysdumpdev -K. A standalone dump can then be initiated at any time (even if AIX is hung) using any of the methods described in the System Dump Facility article in the AIX V5.3 Kernel Extensions and Device Support Programming Concepts manual.

On POWER5 LPARs managed by a Hardware Management Console, the Dump option of the Restart Partition function can be used to initiate an AIX stand-alone dump, as documented in the Using the Hardware Management Console to restart AIX logical partitions article in the POWER5 Partitioning for AIX with an HMC manual.

A system dump can be initiated remotely via a modem or terminal server after enabling the remote reboot facility using the smitty rrbtty fast path. But according to PMR 24881,L6Q, the AIX remote reboot facility does not work for a system (integrated serial) port on a POWER5 system. One should instead enable serial port snoop (see the Enabling serial port snoop article).

While an LPAR is dumping, dump progress indicators (0c0, 0c2, 0c9, etc.) will appear on the HMC and/or in the LCD display. The various possible indicator values are documented in the "Dump progress indicators (dump status codes)" section of the AIX IPL progress codes article in the System p Reference codes manual.

Page 37: A06 Care and Feeding of AIX

A word about editing /etc/filesystems

Best Practice

Never edit /etc/filesystems to define or modify a filesystem definition.

Why: AIX keeps information regarding each JFS filesystem not only in /etc/filesystems but also in the LVCB of the logical volume on which the filesystem is defined. (The LVCB (logical volume control block) is in the first 512 bytes of a logical volume. Search the AIX Information Center for more information about the LVCB.)

When a volume group is imported, AIX reads the LVCB of every logical volume in the volume group and adds filesystem definitions to /etc/filesystems. And when a volume group is exported, AIX deletes from /etc/filesystems the definition of filesystems in the volume group.

So if you edit /etc/filesystems to change a filesystem definition, the definition will revert to its original state if/when you export and re-import the volume group. Not a good thing.

How: (instead)

Use smitty manfs or the AIX chfs command to change a filesystem definition. Use smitty manfs to remove a filesystem. (There is a documented AIX rmfs command, but be very careful with it. Like the UNIX rm command, rmfs does not prompt for confirmation. It immediately destroys the specified filesystem!)

Page 38: A06 Care and Feeding of AIX


BestPractice

Use smitty crcdrfs to create a CD-ROM filesystem with a mount point of /cdrom.

Why: Sooner or later, you will want to mount a CD. Once you have created the /cdrom filesystem, a CD can be mounted by the root user with mount /cdrom. Why wait until you need to mount a CD to figure out how to create the filesystem?

But please note that the installp command (and the smit install menus which use it) assume that a CD is not mounted. Mounting a CD before attempting to install something from it can result in cryptic (that is, difficult to diagnose) behavior from the installp command at some AIX patch levels.

How: Use the smitty crcdrfs fast path.

Create CDROM filesystem

Page 39: A06 Care and Feeding of AIX


BestPractice

Set up System Hang Detection (smitty shd) on AIX V5.1 or later systems.

Why: If the system suffers from a priority based 'hang', system hang detection can run a user specified recovery script or open up a 'high priority' login session on a specified console (/dev/console is reasonable).

How: Use the smitty shd fast path.

System Hang Detection

Page 40: A06 Care and Feeding of AIX


Don’t set yourself up for failure!

BestPractice

Limit the number of external factors that can affect your environment (network, storage changes). Allow enough time: for test, for implementation, for backups, for back-out. Don’t overload bandwidth when planning to back up, restore, or disseminate.

Why: Firmware updates (especially to the HMC) are particularly vulnerable to failure when network outages occur midstream.

How: Best Practice: Thoroughly review changes that will occur in the same window. Predict the network load and the effect that simultaneous downloads will have. Plot out all the steps that will result in a safe implementation with testing, backups, and possible backout.

Page 41: A06 Care and Feeding of AIX


Review FLRT and your process carefully

BestPractice

Review the Fix Level Recommendation Tool (FLRT) and the pre-req tool before you begin your installation planning. Ensure that you update NIM servers and the HMC first.

Why: Updates (especially to the HMC) can often affect other levels required on the system. It is also suggested to keep the NIM server at the highest level in your environment.

How: Best Practice: Utilize FLRT to plan upgrades: http://www14.software.ibm.com/webapp/set2/flrt/home

and the pre-req tool (feature code analysis) when you touch the hardware: http://www-912.ibm.com/e_dir/eServerPrereq.nsf

Page 42: A06 Care and Feeding of AIX


Configure and Utilize Service Agent (ESA)

BestPractice

Utilize Service Agent on your system. Test the set up prior to needing it – including inventories and call backs.

Why: Configuration issues and incorrect call-back numbers are some of the most common failure points for ESA. ESA is a hardware monitoring tool that can save many outages.

How: See session on Service Agent / Phone home.

Page 43: A06 Care and Feeding of AIX


Virtualization

BestPractice

If you are running POWER5 with virtualization, ensure that APAR IY97605 is installed.

Why: Memory DLPAR on p5 can be unduly slow when checking for memory pages. This check is required when memory is allocated inside a partition, but not when memory is unallocated (as when activated by CoD). APAR IY97605 fixes this problem for unallocated memory. As an example, on a p5-595, adding 90 GB of unallocated memory took one hour before the APAR was applied. After the APAR, it took 1 min 26 sec.

How: Install APAR IY97605

Page 44: A06 Care and Feeding of AIX


Virtualization / VIOS

BestPractice

Document all the devices/components involved between a virtual device on an AIX client and a physical backing device on a virtual I/O server.

Why: It is difficult to obtain an end-to-end mapping between virtual devices and their physical backing devices. When something goes wrong in a virtualized environment, understanding what components are affected and how components are linked together is very valuable.
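One possible way to keep this documentation is a small structured record per virtual disk. The Python sketch below is illustrative only: the class, its field names, and the sample device names (vscsi0, vhost0, vtscsi0, hdisk12) are assumptions made for this example and do not come from the presentation. On a real system the values would be gathered by hand or from commands such as lscfg on the client and lsmap on the VIOS.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class VirtualDiskMapping:
    """One end-to-end chain from a client disk to its backing device (hypothetical layout)."""
    client_lpar: str      # AIX client partition name
    client_disk: str      # virtual disk as seen on the client
    client_adapter: str   # virtual SCSI client adapter
    vios_name: str        # Virtual I/O Server partition name
    server_adapter: str   # virtual SCSI server adapter on the VIOS
    virtual_target: str   # virtual target device on the VIOS
    backing_device: str   # physical disk or logical volume on the VIOS

    def describe(self) -> str:
        """Render the chain as a single line, e.g. for a system book."""
        return " -> ".join([
            f"{self.client_lpar}:{self.client_disk}",
            self.client_adapter,
            f"{self.vios_name}:{self.server_adapter}",
            self.virtual_target,
            self.backing_device,
        ])


def affected_clients(mappings: Iterable[VirtualDiskMapping],
                     vios_name: str, backing_device: str) -> List[str]:
    """List the client LPARs that depend on one backing device of one VIOS."""
    return sorted({
        m.client_lpar for m in mappings
        if m.vios_name == vios_name and m.backing_device == backing_device
    })
```

With records like these, the question "which clients are hit if this backing device fails?" becomes a lookup rather than an investigation during an outage.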

Page 45: A06 Care and Feeding of AIX


General – System

BestPractice

Set up Workload Manager (WLM) and give the 'system' class a minimum CPU limit of 10% (or thereabouts).

Why: If the system becomes CPU saturated and would normally be unresponsive (though still running), an 'admin' can log into the system, and as long as 'system' (everything that runs under 'root') is using less than 10%, admin activities will be scheduled in preference to other 'user' processes.

The benefit is that if the system becomes unresponsive due to extreme workload, an admin can log onto the system and attempt recovery without resorting to a 'reset'.

SHD and WLM may look similar, but SHD will not help if the system is simply extremely overloaded, whereas the WLM minimum CPU limit will still function.

Page 46: A06 Care and Feeding of AIX


VIOS

BestPractice

Install VIOS fix/update packages with the VIOS CLI command "updateios" instead of using SMIT as root (entered via "oem_setup_env").

Why: Installing with SMIT has many times resulted in a VIOS that is inconsistent or is missing new features that were included in the fix/update package.

A SMIT update installation may fail to install all of the fixes and new features in a given VIOS fix package.

How: Run updateios from the command line.

Using "updateios" will completely and correctly install all fixes and new features included in a given VIOS fix package.

Page 47: A06 Care and Feeding of AIX


System Planning Tool

BestPractice

New System p clients (and some others too) should consider using the System Planning Tool output as well as configuring ESA.

Why: This gives a good view that aids in getting a box with multiple partitions, VIOS, and a shared pool up and running.

You can get a box up and running quicker, and this shows Systems Administrators easy ways to plan, deploy and document a server configuration.

Page 48: A06 Care and Feeding of AIX


Security TIPS

Suggestion or Tip (best practice): Enable Stack Execution Disable feature on AIX

Reason Why: There is no reason not to do so. It does not have any performance impact, but provides valuable protection against attacks.

Potential Benefit or Harm: Protects against most buffer overflow based security vulnerabilities

Suggestion or Tip (best practice): Review AIX security expert and enable a level that suits you.

Reason Why: Central dashboard for security configuration on AIX

Potential Benefit or Harm: Hardens the system with ease. Merge any custom hardening scripts with AIXpert

For the latest AIX security information, visit

http://www-03.ibm.com/systems/p/security/

For latest information on AIX certifications, visit

http://www-03.ibm.com/servers/aix/products/aixos/certifications/


Page 49: A06 Care and Feeding of AIX


Security Tips Continued

Suggestion or Tip (best practice): Avoid tools such as telnet and ftp. Use OpenSSH as an alternative.

Reason Why: telnet and ftp send passwords in the clear over the network and are prone to attacks

Potential Benefit or Harm: Secure day to day communication using SSH

Suggestion or Tip (best practice): Disable the services not necessary in inetd. Use AIXpert to do this

Reason Why: Services listening unnecessarily on ports provide attack vectors.

Potential Benefit or Harm: Harden the system by shutting down unused services

Suggestion or Tip (best practice): Avoid using simple passwords (say "root" for root account)

Reason Why: Dictionary attacks and guess attacks could lead to password uncovering

Potential Benefit or Harm: Protection against attacks.


Page 50: A06 Care and Feeding of AIX


AIX 6.1 Security Tips

• #1 (also available in 5.3 by Nov 2007)

• Suggestion or Tip (best practice): Use a different hashing algorithm (other than crypt) for storing passwords.

• Reason Why: Provides much better security for passwords

• Potential Benefit or Harm: Provides better security and also allows passwords longer than 8 characters.

• #2 Suggestion or Tip (best practice): use Trusted Execution feature in AIX 6.1 to enhance security. Protect against intrusion attacks.

• Reason Why: Provides for integrity verification of the system

• Potential Benefit or Harm: Lock down the system and protect against attacks from an intruder

• #3 Suggestion or Tip (best practice): Segregate your important data and use AIX 6.1 Encrypted file system to encrypt data

• Reason Why: Protect important data against attacks


Page 51: A06 Care and Feeding of AIX


Some other general recommendations:

• Utilize the dsmc image command within TSM: This enables an exact point-in-time backup of a filesystem. TSM unmounts and then remounts the filesystem as read-only, then performs a dd of it directly to the TSM server. This is seen as a major time advantage for certain applications.

• Consider using Concurrent I/O in an Oracle/SAP environment. This can result in large memory usage savings. Because this bypasses filesystem caching, the benefits include truer readings of memory usage for core applications and better performance.

• Configure dead gateway detection with multiple default gateways on servers with more than one network. This results in better network availability for the environment. Traffic can be routed through more interfaces, and this can eliminate the single point of failure that exists when there is a single default gateway.

• Use SSH keys for authentication, negating passwords apart from your key password.

Page 52: A06 Care and Feeding of AIX


‘Colin tips’

• Always create a dummy empty LPAR with the same connectivity as primary LPARs to do upgrades into. Then use ALT_CLONE and upgrade the new software (i.e. Oracle, WAS, SAP, etc.) into that and switch the addresses published externally when you are happy the upgrade has worked properly.

• Work to standard infrastructure patterns for an estate. Make each machine adhere to the pattern for that role – even if there is something it requires less than other machines in that role. It makes builds and management easier.

• Learn the HMC command line. Performance is better than the GUI.

• Have a script that runs each night that packages up errpt alerts for that day and e-mails them. Sometimes systems management tools can get errors and not alert you appropriately when a problem occurs, so a failsafe script (which I can supply) is worth having.

• Set up accounts to point to an LDAP / Kerberos environment off the LPARs – even if it uses AD. It makes managing passwords a whole order of magnitude easier because it is all done in one place.

• When setting up WAS on AIX always set it up with WAS security switched off. Then MAKE SURE you have added the security config before you switch the security on. If you don’t you will be rebuilding as you won’t have a login to do administration!

• Always set up more than one WAS instance inside each LPAR – called vertical scalability – cross clustered with instances in another physical machine. Have more than one logical cluster per WAS cell across the two machines. The instances will actually share memory but this allows rolling upgrades and more resilience and performance.
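The nightly errpt digest mentioned in the tips above could be sketched roughly as follows. This is not the author's script; it is a minimal Python illustration that assumes the default errpt summary layout, in which the second column is a timestamp of the form MMDDhhmmYY. The function names and the digest wording are hypothetical. Capturing the errpt output, scheduling with cron, and actually sending the mail are left out.

```python
from datetime import date
from typing import List, Tuple


def entries_for_day(errpt_output: str, day: date) -> List[str]:
    """Return the errpt summary lines whose timestamp falls on the given day."""
    wanted_mmdd = day.strftime("%m%d")
    wanted_yy = day.strftime("%y")
    selected = []
    for line in errpt_output.splitlines():
        fields = line.split()
        # Skip blank lines and the column header line.
        if len(fields) < 2 or fields[0] == "IDENTIFIER":
            continue
        stamp = fields[1]  # assumed format: MMDDhhmmYY
        if (len(stamp) == 10 and stamp.isdigit()
                and stamp[:4] == wanted_mmdd and stamp[8:] == wanted_yy):
            selected.append(line)
    return selected


def build_digest(hostname: str, errpt_output: str, day: date) -> Tuple[str, str]:
    """Build a (subject, body) pair suitable for handing to a mail command."""
    entries = entries_for_day(errpt_output, day)
    subject = f"errpt digest for {hostname} on {day.isoformat()}: {len(entries)} entries"
    body = "\n".join(entries) if entries else "No errpt entries logged today."
    return subject, body
```

Sending a digest even when there are no entries is a deliberate choice in this sketch: a missing e-mail then signals that the failsafe itself has stopped working.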


Page 53: A06 Care and Feeding of AIX


Some Scripting Hints / Tips

• Suggestion or Tip - Script and schedule health checks and basic administrative tasks. Some examples might be:

• A script that runs daily to ensure the bootlist is set correctly. If it is not set correctly, the script sets it correctly. If the script has to set the bootlist correctly or cannot set the bootlist, an e-mail is sent to root. Any stdout or stderr messages are sent to a log file.

• A script that runs weekly that creates a backup mirror of rootvg on a separate physical disk in a separate volume group. stdout and stderr messages are added to a log file.

• A script that runs every 15 minutes to ensure that both mirrors of rootvg are synchronized. If the script cannot synchronize the mirrors, it e-mails root.

• A script that runs every 15 minutes that checks the paths to a SAN disk. If any paths have failed, the script attempts to enable them. If the script cannot enable them, an e-mail is sent to root. If the paths have been disabled, the script does not attempt to enable them (in case they were intentionally disabled) and instead simply e-mails root.

• A NIM server script that runs weekly that backs up rootvg of all of its clients.

• Reasons Why/Potential Benefit or Harm

• If the machine goes down and is restarted, it will boot to the devices in the bootlist. If the bootlist is set incorrectly, the machine may not be able to boot.

• If something happens to rootvg, the mirror in the separate volume group can be used to recover rootvg.

• If the mirrors are not synchronized and you lose one mirror, you may lose data.

• People typically have multiple paths to a disk because they need to be able to access the disk even if all but one of the paths go down. If all of the paths except one go down and you don't notice it, you are betting that the final path will not go down. It is better to try to recover paths so you always have multiple paths to a disk and can risk losing one.

• In case something happens to the server, rootvg can be recovered from the NIM server.
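The decision logic of the SAN path check described above can be sketched as follows. This is a hedged Python illustration, not a supplied script: it assumes lspath output in its default three-column form (status, device name, parent adapter), and the function name and action labels are hypothetical. It only decides what to do. Actually re-enabling a path (for example with chpath) and e-mailing root are left to the surrounding script.

```python
from typing import List, Tuple


def plan_path_actions(lspath_output: str) -> List[Tuple[str, str, str]]:
    """Turn lspath output into (action, disk, parent) tuples.

    action is "enable" for failed paths (worth a recovery attempt) and
    "notify" for anything else that is not enabled, including paths that
    were disabled, possibly on purpose.
    """
    actions = []
    for line in lspath_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore blank or unexpected lines
        status, disk, parent = fields
        status = status.lower()
        if status == "enabled":
            continue  # healthy path, nothing to do
        if status == "failed":
            actions.append(("enable", disk, parent))
        else:
            actions.append(("notify", disk, parent))
    return actions
```

Keeping the decision separate from the commands that act on it makes the logic easy to test away from production, in keeping with the advice elsewhere in this deck about not using a script for the first time in production.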


Page 54: A06 Care and Feeding of AIX


General Recommendations Cont:

• When installing AIX for the first time (or upgrading to a new release level), be sure to install available fixes. See AIX V5.3 installation best practices for more information. When installing AIX on a LUN on a Storage Area Network (SAN), be aware of considerations unique to that environment. See AIX V5.3 boot from SAN for more information.

• Capture bootable backups periodically (monthly/quarterly?). See AIX V5.3 backup and restore for more information regarding AIX backups.

• Store some bootable backups off site for recovery if the data center is destroyed.

• Test the restore process periodically (yearly?) by restoring the most recent bootable backup. Wouldn't want to discover that a backup cannot be successfully restored while in the middle of disaster recovery, would we? Doh! It is best for the restore to be tested by someone other than the person who captured the backup, to confirm that restore procedure documentation is adequate.

• Capture application data backups (volume groups other than rootvg and filesystems not mounted when mksysb is captured).

Page 55: A06 Care and Feeding of AIX


General recommendations (STILL cont):

• Test the application data restore process periodically (yearly?) by restoring the most recent backup. It is best for the restore to be tested by someone other than the person who captured the backup, to confirm that restore procedure documentation is adequate.

• Monitor the system for errors. Attempt to discover the root cause of every error and to address the cause to minimize the number of errors which occur, while acknowledging that getting a failed system back in operation must sometimes take precedence over collecting the diagnostic information required to determine failure root cause.

The primary AIX error log can be displayed using the errpt command. Please note that an AIX Error Notification exit can be used to take action (e.g., send an email) if a particular error occurs.

The primary HMC error log can be displayed from the HMC GUI using Service Applications -> Service Focal Point -> Manage Serviceable Events. Please note that Service Applications -> Service Agent -> Customer Notification can be used to configure an HMC to send an email when a new serviceable event is logged.

Please note that alog -t console -o will display messages which have appeared on the AIX system console and alog -t boot -o will display messages which were generated as AIX booted up.

• Conduct a post mortem after each application outage. Attempt to answer the following questions and then act upon the answers: (1) Was there any warning of this outage? If so, why was the warning not acted upon in time to prevent the outage? (2) What changes can be made to prevent the outage in the future? (3) Are other servers exposed to similar outages?

Page 56: A06 Care and Feeding of AIX


Best practice: Update the number of licensed AIX users to the maximum if you are not at AIX 5.3 (the default is 32767 at 5.3).

Why: As of May 5, 2000, IBM no longer charges per AIX user and all existing AIX licenses now permit an unlimited number of users to log in, but AIX continues to enforce the setting for the number of licensed users, which defaults to 2.

Method: Issue the command 'chlicense -u 32767' to set the number of licensed AIX users as high as possible. The change will become effective on the next reboot.

Page 57: A06 Care and Feeding of AIX


nmon and nmon Analyzer

nmon & nmon analyzer are ‘as-is’ tools, available free of charge via download

http://www-128.ibm.com/developerworks/eserver/articles/analyze_aix/

nmon is a monitoring tool that is basically an augmented ‘monitor’; the analyzer is an analysis spreadsheet useful for high-level views of the system.

The nmon tool is designed for AIX and Linux performance specialists to use for monitoring and analyzing performance data, including:

• CPU utilization
• Memory use
• Kernel statistics and run queue information
• Disks I/O rates, transfers, and read/write ratios
• Free space on file systems
• Disk adapters
• Network I/O rates, transfers, and read/write ratios
• Paging space and paging rates
• CPU and AIX specification
• Top processors
• IBM HTTP Web cache
• User-defined disk groups
• Machine details and resources
• Asynchronous I/O (AIX only)
• Workload Manager (WLM) (AIX only)
• IBM TotalStorage® Enterprise Storage Server® (ESS) disks (AIX only)
• Network File System (NFS)
• Dynamic LPAR (DLPAR) changes (only pSeries p5 and OpenPower, for either AIX or Linux)

Page 58: A06 Care and Feeding of AIX


Download Frequently

Page 59: A06 Care and Feeding of AIX


• Documentation

• LED codes

• Error Records

• Fixes

• RML

Access to the Web

Have web access in computer room to access the fixes and documentation

Have access to documentation for a server somewhere OTHER than on the server (ESPECIALLY restore procedures!)

Page 60: A06 Care and Feeding of AIX

© Copyright IBM Corporation 2005

IBM E server System p

Administrative Planning

• SUMA, compare_report, and lppmgr are designed to give the System p user superior tools for planning O/S maintenance and upgrades

• Combined with subscription services and NIM, maintenance can be largely automated

• SUMA is designed to analyze and filter maintenance and bring down to the System p environment

• compare_report is designed to assist in analyzing an environment and show areas of downlevel, uplevel or missing maintenance

• NIM is utilized to deliver maintenance and manage the code locations

• lppmgr is a house-cleaning tool for directory maintenance and NIM clean-up

• Subscription services, vital for proactive notification in a System p environment, complete the maintenance tool suite for the O/S

Page 61: A06 Care and Feeding of AIX


No TIPping Please

Testing In Production (TIPping)

Page 62: A06 Care and Feeding of AIX

TEST HERE! (Test) Not HERE! (Production)

There is a large exposure to changes that are just minimally tested and not fully assessed for their impact on production.

The test environment should mirror the production environment as closely as feasible. Or utilize test partitions!

[Figure: a system divided into dynamically resizable partitions running Linux, AIX 5L V5.2, and AIX 5L V5.3, with sizes from 1 CPU to 6 CPUs, including TEST and PROD partitions, micro-partitioning, and a Virtual I/O server partition providing Ethernet sharing and storage sharing.]

Another exposure is not properly controlling production access.

Page 63: A06 Care and Feeding of AIX


TIPPING – there are SNEAKY ways to test in production!

• Using a script for the first time in production…

• Not testing HACMP.

• Putting a fix into production in a different way than in test!

• Not testing the BACKOUT before you get to production…


Page 64: A06 Care and Feeding of AIX


Stage to Production

• Code is installed via apply. Production is protected via apply vs. commit.

• Once satisfied, code is committed. The baseline is then updated.

• PLEASE! Utilize the exact same method to move changes into stage that you use for production! This includes backouts, timing, and HACMP!

• Alternate Disk Install provides a means to stage code if desired.

• Compare Reports are available to analyze production vs. stage vs. latest fixes.

[Figure labels: Scalability; Comparative reporting available.]

Page 65: A06 Care and Feeding of AIX


More on Testing Environment

alt_disk_install or multibos can be used to clone the running system, or a mksysb can be made to use against a test system. AIX 5.2 and beyond allows for a migration install with the use of alt_disk_install.

Test the unique and complete environment from production on test… Yes, this means to test HACMP!!

HACMP should be modified and tested whenever changes occur; this includes component updates, network updates, and I/O subsystem changes.

Although you can restore clones to different hardware, test like to like in terms of how you run things – if you use VIOS, TEST w/ VIOS, if you use concurrent volumes – well you get the idea!

Test systems should be reflective of the production environment! You may not be able to duplicate a 2 TB environment, but 100 records won’t give you volume!

Page 66: A06 Care and Feeding of AIX

System p AIX and LINUX Technical University © 2006 IBM Corporation

MultiBOS

rootvg: one ACTIVE instance per boot. The other is the STANDBY.

ACTIVE instance:
hd5 - boot
hd4 - /
hd2 - /usr
hd9var - /var
hd10opt - /opt

STANDBY instance:
bos_hd5 - boot
bos_hd4 - /bos_inst
bos_hd2 - /bos_inst/usr
bos_hd9var - /bos_inst/var
bos_hd10opt - /bos_inst/opt

Other logical volumes in rootvg:
hd6 - paging
hd3 - /tmp
hd1 - /home

Page 67: A06 Care and Feeding of AIX


MultiBOS

rootvg (ACTIVE)

hd5 - boot
hd4 - /
hd2 - /usr
hd9var - /var
hd10opt - /opt

bos_hd5 - boot
bos_hd4 - /
bos_hd2 - /usr
bos_hd9var - /var
bos_hd10opt - /opt

hd6 - paging
hd3 - /tmp
hd1 - /home

Page 68: A06 Care and Feeding of AIX


Virtualization

Page 69: A06 Care and Feeding of AIX


Deciding How Best to Utilize Virtualization May Be Daunting at First

The first step is planning a systematic approach and understanding of how the landscape looks today. New applications are easy to start deploying; migration of test, dev and Q&A are also easy targets with minimal risk.

The easiest step is to review the number of CPUs on the system. Using this information, create a pool of CPUs and simply allow the hypervisor to distribute the workload most efficiently. This drives increased utilization and can significantly reduce software costs as well. Each logical partition (LPAR) can ‘cap’ or limit the number of CPUs utilized. This will reduce the number of CPUs that need to be licensed for a software package. One example might be to have a System p5-595 with 64 CPUs and create 20 LPARs, letting the system distribute workload as needed. Once this is stable, increase the workload to 40 LPARs, then 60, and finally 80 plus. Many customers are now pushing 100+ LPARs per frame.

Page 70: A06 Care and Feeding of AIX


Deciding on CPU needs for a standard build:

Four categories for LPAR builds, namely:

• 1-2 CPU: Great candidate for full virtualization
• 2-4 CPU: Great candidate for full virtualization
• 4-8 CPU: Good candidate for full virtualization
• 8+ CPU: Good for virtualization of rootvg and maybe network, but may require physical adapters for I/O

Page 71: A06 Care and Feeding of AIX


Suggestion: for CPU pools

• Looking again at the System p5-595 running 80 LPARs, we configure four VIOS for load balancing and redundancy, and also to separate test workloads from production.

• Most issues that happen on well-designed systems are caused by sleep-deprived system administrators doing work at the bewitching hour of 2 a.m. Having two VIOS is one way to protect against that same sleep-deprived administrator typing the wrong command and bringing a single VIOS down.

[Figure: POWER5 (p5) with LPARs and virtualization. Four VIO servers (VIO Server 1 to VIO Server 4), each with adapters, memory, and SCSI disks, serve client LPARs over vscsi and vent connections. The LPARs shown draw on a capped/uncapped processor pool and include LPAR1 and LPAR11 (AIX 5L V5.3), LPAR21 (AIX 5L V5.3, AR Server), LPAR27 (AIX 5L V5.3, PeopleSoft), LPAR35 (AIX 5L V5.3, AP Server), LPAR36 (AIX 5L V5.3, AR Server), LPAR78 (AIX 5L V5.3, AR Test Server), LPAR80 (AIX 5L V5.3, AP Q & A Server), and LPAR99 (AIX 5L V5.2).

Legend: P=processor, M=memory, A=adapter, D=scsi disk, vent=virtual ethernet, vdisk=virtual disk, VIO=Virtual I/O server.]

Page 72: A06 Care and Feeding of AIX


Setting up the rootvg to boot from SAN through the VIO Server

• Virtualize the IO for rootvg by creating a pair of Virtual IO Servers (VIOS). Two VIOS reduce risk by providing redundancy. Place rootvg on a SAN and boot from the SAN. This helps save on internal disk space, as SAN is more cost effective and generally provides faster IO throughput. On average, only four 2GB Host Bus Adapter (HBA) cards will be needed in each VIO Server to handle the workload of 40 rootvgs and their paging space requirements. Generally, since the majority of bandwidth is consumed during the boot process, there will be unused bandwidth on the HBAs unless all 40 LPARs boot concurrently, which is unlikely.
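As a rough sanity check on the sizing above, the nominal bandwidth share per LPAR can be worked out directly. The Python sketch below makes two assumptions that the slide does not state: that "2GB" refers to a nominal 2 Gbit/s Fibre Channel link rate, and that the aggregate is split evenly across LPARs. The function name is hypothetical, and real throughput will be lower than the nominal figure.

```python
def nominal_share_gbit(hba_count: int, hba_gbit: float, lpar_count: int) -> float:
    """Nominal per-LPAR share of aggregate HBA bandwidth on one VIO server, in Gbit/s."""
    if lpar_count <= 0:
        raise ValueError("lpar_count must be positive")
    return hba_count * hba_gbit / lpar_count


# Four HBAs at an assumed 2 Gbit/s each, shared by 40 rootvg clients.
share = nominal_share_gbit(hba_count=4, hba_gbit=2.0, lpar_count=40)
print(f"{share:.2f} Gbit/s per LPAR if all 40 were busy at once")
```

Under these assumptions the even split is a worst case: because rootvg traffic is heaviest during boot, most LPARs will be using far less than their share at any given moment.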

Page 73: A06 Care and Feeding of AIX


Networking Aspect of Virtualization

[Figure: virtual Ethernet adapters (VEA).]

• The Shared Ethernet Adapter (SEA) method supports VLAN tagging and large packets. It can be set up on the VIO servers and is therefore easier to manage than NIB. SEA requires backup network cards on the opposite VIO server, but if the goal in the enterprise is to provide the best highly available network and still lower cost, this may be the preferred choice.

Page 74: A06 Care and Feeding of AIX


Planning for Virtual datavg’s

• Careful analysis of the current SAN environment and port utilization is necessary before planning out the transition of physical to virtual, but it is relatively simple to move from physical to virtual when data is on a SAN. When capacity needs outgrow the bandwidth of the virtual subsystem, it is a matter of adding more dedicated HBAs and migrating back from virtual to physical.

• Virtualizing the non-rootvg or datavg disks: on systems with very high, busy or volatile IO, this may be the least desirable virtualization for your system. In fact, some suggest a simple guideline in this case: “if your current system today is very busy with IO, you should not virtualize the disk”. System p hardware provides flexibility in this respect as well. Either physical or virtual disk and networking options can be used in any of the LPARs.

• Start by virtualizing the rootvg, thereby reducing the rootvg HBAs, and then further virtualize and reduce HBAs for datavg where warranted. A good place to begin is to allocate 4 HBAs for each VIO server in an 80 LPAR system and add the datavg to existing OS LUNs.

• Remember to grow slowly, and test and prove stability before moving to the full 80 LPARs: start with 20, then 40, and so on.

Page 75: A06 Care and Feeding of AIX

The Big Picture

(Diagram labels: two sets of HBA0, HBA1, HBA2 and HBA3, and two SVC nodes.)

Page 76: A06 Care and Feeding of AIX

Fine Tuning

Page 77: A06 Care and Feeding of AIX

Tuning – a start

Examine memory utilization during peaks

Determine appropriate vmo/ioo parms (or, on levels before AIX 5.2, vmtune parms, which must be re-executed at boot).

Examine networking characteristics

Determine appropriate no and nfso parms (same note as above)

Review I/O subsystem layout

If database: set async I/O & maxuproc 512

chlicense -u 32767 (AIX default of 2) * Changed at AIX 5.3

Page 78: A06 Care and Feeding of AIX

Initial Tuning – vmtune or vmo – places to START for AIX 5.3

If the load on the system is relatively unknown, the values above could be considered a starting point.

In a general sense, many applications do just fine using the default parameters. Databases may need more monitoring and fine-tuning. In particular, concurrent I/O may bring substantial performance benefits for database servers. Special-use servers like NFS or TSM may warrant even more specialized considerations.

If you are looking for ‘general recommendations’:

lru_file_repage=0, lru_poll_interval=10, maxclient%=80 and maxperm%=80 are the generally accepted starting points.
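
As a minimal sketch of how the starting points above might be turned into a single vmo invocation, the Python snippet below only builds the command string; it does not run anything. The helper name `build_vmo_command` is hypothetical, and the sketch assumes the usual vmo syntax of `-o tunable=value`, with `-p` applying the change to both the current and the reboot values. Review the resulting command for your AIX level before using it.

```python
# Starting values quoted on the slide; a baseline to test, not a prescription.
STARTING_TUNABLES = {
    "lru_file_repage": 0,
    "lru_poll_interval": 10,
    "maxclient%": 80,
    "maxperm%": 80,
}


def build_vmo_command(tunables, persistent=True):
    """Return a vmo command line that sets each tunable with -o name=value."""
    parts = ["vmo"]
    if persistent:
        parts.append("-p")  # assumption: apply to current and reboot values
    for name, value in tunables.items():
        parts += ["-o", f"{name}={value}"]
    return " ".join(parts)


if __name__ == "__main__":
    # Print the command so it can be reviewed before anyone runs it.
    print(build_vmo_command(STARTING_TUNABLES))
```

Printing rather than executing keeps a human in the loop, which fits the advice to test tunables rigorously before production.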

Page 79: A06 Care and Feeding of AIX

Some other hints

• After you install the code, you may want to relink binaries

• Create a SYSTEMBOOK to document your system and your procedures for startup and shutdown

• Run FSCK periodically when you have a maintenance window

• Failover HACMP (forward and back) during a maintenance window

• Configure network backup and dead gateway detection unless you have handled this elsewhere

• Run a checksum (lppchk) after your installs

• Also check: instfix -icqk 5300-05_AIX_ML | grep ":-:"

Page 80: A06 Care and Feeding of AIX

Keeping Watch

A look at the minimum and optimum Monitoring Environment

Page 81: A06 Care and Feeding of AIX

ITM for AIX/P Monitoring New Face Directions

• ITM product to provide advanced monitoring capabilities on AIX/System P
• AIX, VIOS, and HMC availability and topology monitoring
• AIX and VIOS health monitoring
• AIX and VIOS performance and throughput monitoring
• Data warehousing of current and historical performance data
• Logical and graphical views of availability, health, performance and throughput
• Customer configurable views, situations, and workflows
• Scalability of agents/server

AIX/System P Customized
• P virtualization environment
• AIX & VIOS specific health
• AIX & VIOS specific performance, including platform aspects, like SMT

Capabilities

Topology and Navigation
• HMC, IVM, CEC, LPARs, VIOS Server and Client, AIX, WPARs

Availability Monitoring
• HMC, CEC, VIOS, LPAR, AIX, WPAR Status
• AIX, VIOS, WPAR CPU, Memory, I/O High Level Metrics

Health Checks, Alert Messages, Expert Advice, Programmable Actions
• AIX, VIOS, WPAR, HMC
• Customized for the Environment
• CPU, Memory, Disk, and N/W Thresholds, File System Status, Paging Space Status, Status of Daemons and Services, Top Resource Consumers, Critical Errors, etc.

Performance and Throughput
• AIX, VIOS, WPAR
• Customized VIOS and WPAR Metrics
• Existing ITM Metrics (i.e. CPU, Memory, I/O, Network, File System)
• AIX PTX Metrics (i.e. CPU, Memory, LAN, TCP, UDP, IP, WLM, Process, Thread, LPAR, Disk, I/O, LVM, Paging Space, IPC, NFS, CEC)

Data Warehouse
• Historic performance data for trending

Customer Customizable Workspaces, Navigators, Situations (Eventing)

Customer Configurable Workflows

Page 82: A06 Care and Feeding of AIX

ITM – Tivoli’s New Face on AIX/P

• Initiative aimed at driving better AIX/System P support in Tivoli products
• Significant current usage of Tivoli products on AIX/System P
• Address significant pain points around existing Tivoli product capability gaps on AIX
• Get ahead of the curve by lining up Tivoli product support plans for future AIX/P features to avoid future gaps
• Initiative initially focused on “core” Tivoli products and capabilities
  • Base and virtualized environment elements: Monitoring, Performance Management, Capacity Management, Event Management, Data Management, User Management, Access Management, Compliance, Accounting, Configuration Management, License Management, Storage Management
  • Also focus on general issues, like timely Tivoli support for new AIX releases
• Initiative will consider/drive support for the “full” System P environment for platform-related capabilities
  • AIX, Linux on P, HMC, VIOS
  • i.e. availability and configuration monitoring of the entire environment
• Must also drive on better AIX/System P support for Enterprise system management products provided by others
  • e.g. BMC, CA

Page 83: A06 Care and Feeding of AIX

Name: /tmp space used

Resource Class: Journaled File System

Monitored Property: PercentTotUsed

Event Expression: PercentTotUsed > 90

Event Description: An event will be generated when more than 90% of the total space in the /tmp directory is in use.

Rearm Expression: PercentTotUsed < 85

Rearm Description: The event will be rearmed when less than 85% of the total space in the /tmp directory is in use.

Severity: Informational
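
The event and rearm expressions in this condition form a hysteresis pair: once the event fires above 90%, it stays quiet until usage drops below 85%, so a file system hovering around the threshold does not generate a stream of alerts. The Python sketch below illustrates that logic only; the `Condition` class and the sample values are hypothetical and are not part of RMC.

```python
class Condition:
    """Event/rearm pair: fire once above event_above, rearm below rearm_below."""

    def __init__(self, event_above=90, rearm_below=85):
        self.event_above = event_above
        self.rearm_below = rearm_below
        self.armed = True

    def observe(self, percent_used):
        """Return 'event', 'rearm', or None for one sample of PercentTotUsed."""
        if self.armed and percent_used > self.event_above:
            self.armed = False
            return "event"
        if not self.armed and percent_used < self.rearm_below:
            self.armed = True
            return "rearm"
        return None


# Hypothetical samples of /tmp usage over time.
samples = [80, 91, 95, 88, 92, 84, 93]
cond = Condition()
results = [cond.observe(s) for s in samples]
print(results)
```

Note that the samples 95, 88 and 92 produce no second event: the condition is rearmed only by the 84 reading, after which 93 fires again.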

Page 84: A06 Care and Feeding of AIX

At a minimum:

Ensure that your application logs, as well as the AIX error log, are being passed to your automated monitoring and alert system

Look at the general monitoring recommendations in the RMC and ensure that your monitor of choice includes these things

Regularly check the Service Agent logs for anomalies

When you DO have a problem, ask yourself four questions:

• Did my monitoring system catch this problem?

• If it didn’t, why not? If it did, why didn’t it warn me soon enough to prevent a catastrophic problem?

• Have I changed my monitoring to catch this in the future?

• Do I have OTHER systems that could potentially be hit by this error? (consider your other departments / geos too!)

Page 85: A06 Care and Feeding of AIX

Suggestion: set up a script to run via cron

lscfg -v

lsconf

lsdev -CH

nmon

errpt for last 24 hours

vmtune, or vmo / ioo / no / schedo / nfso

no -a

netstat -rn

df -k

Include the things that you might investigate if you encountered a problem the morning after a change cycle.
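
One way to sketch such a cron job is shown below in Python. It is an illustration under stated assumptions rather than a finished tool: the function names are hypothetical, nmon is left out because it is normally interactive, the `-a` flag is assumed to list all values for vmo, ioo, schedo and nfso, and `errpt -s` is assumed to take a start time in mmddhhmmyy format. Commands that are missing on a given system are recorded as not collected rather than stopping the run.

```python
import datetime
import subprocess


def errpt_since(now, hours=24):
    """Build an errpt command for entries logged in the last `hours` hours.

    Assumes errpt -s expects a start time formatted as mmddhhmmyy.
    """
    start = now - datetime.timedelta(hours=hours)
    return ["errpt", "-s", start.strftime("%m%d%H%M%y")]


def snapshot_commands(now):
    """Commands from the slide, written with plain ASCII hyphens."""
    return [
        ["lscfg", "-v"],
        ["lsconf"],
        ["lsdev", "-CH"],
        errpt_since(now),
        ["vmo", "-a"],
        ["ioo", "-a"],
        ["schedo", "-a"],
        ["nfso", "-a"],
        ["no", "-a"],
        ["netstat", "-rn"],
        ["df", "-k"],
    ]


def collect(commands, timeout=60):
    """Run each command and return {command line: output or error note}."""
    report = {}
    for argv in commands:
        key = " ".join(argv)
        try:
            done = subprocess.run(argv, capture_output=True, text=True,
                                  timeout=timeout)
            report[key] = done.stdout + done.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing or hung command should not stop the whole snapshot.
            report[key] = f"[not collected: {exc}]"
    return report


def write_report(report, path, now):
    """Write one dated file so snapshots from different days can be compared."""
    with open(path, "w") as out:
        out.write(f"Snapshot taken {now.isoformat()}\n")
        for key, text in report.items():
            out.write(f"\n===== {key} =====\n{text}")
```

A cron entry would call `collect(snapshot_commands(datetime.datetime.now()))` and pass the result to `write_report` with a dated file name, so the output from before and after a change cycle can be compared.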

Page 86: A06 Care and Feeding of AIX

The honey-do list

Sign up for CERT advisories

Turn on Service Agent

Sign up for HIPER fix alerts

Routinely review RML

Use FLRT!

Use pre-req

UPDATE YOUR FIRMWARE!!

Page 87: A06 Care and Feeding of AIX

© 2006 IBM Corporation, IBM System P Customer Care

HMC Planning Tools (new in SF240 (GA7))

Page 88: A06 Care and Feeding of AIX

HMC Planning Tools - System Plans (new in SF240 (GA7))

• Create “System Plan” from a running system

• Deploy the same LPARs on a different machine

• View System Plan: Hardware Configuration and LPAR details

mksysplan -f file.sysplan -m server

mksysplan creates a system plan file that represents the information known about a managed system's hardware, partitions, profiles, and partition provisioning.

Page 89: A06 Care and Feeding of AIX

HMC - Collecting and Viewing Resource Utilization Data (new in SF240 (GA7))

The HMC collects system activities that affect partition performance and capacity. The following are the types of events that the HMC records and you can view:
• Shared processor utilization data
• Any managed system change that affects data collection
• Any partition change that affects data collection

You can use this data to analyze trends and make resource adjustments.

Page 90: A06 Care and Feeding of AIX

Code Update Readiness Checker

State of the platform before attempting code update can cause code update to fail
• Network connections
• Pending serviceable events

Code Update Readiness Check function in HMC
• Analyze system for problems that will prevent success
• Inform operator of problems to be corrected
• Many of these conditions will not inhibit normal system operation, but will prevent a successful code update

Run Code Update Readiness Check in Advance
• We recommend running the readiness checker one week in advance of code update to allow time to resolve errors if any are found
• These must be resolved before code update

How to Run Readiness Check in Advance
• Change Licensed Internal Code for Current Release
• Select target
• Start Change Licensed Internal Code Wizard
• If you reach “Specify LIC Repository” panel, the readiness checker has passed – select Cancel

Enabled in 01SF235_165 (GA6)

Page 91: A06 Care and Feeding of AIX

POWER5 Code Matrix

http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/home.html

Page 92: A06 Care and Feeding of AIX

POWER5 Code Matrix (cont.)

Key: Rec mm/yyyy = Recommended Combination thru mm/yyyy; Allowed = Allowed, Upgrade Recommended; No = Not a Supported Combination.

P5 Release Level | 240         | 235         | 230         | 225     | 222     | 220     | 210
P5 HMC V5 R2     | Rec 02/2007 | Rec 10/2006 | Rec 05/2006 | Allowed | Allowed | Allowed | Allowed
P5 HMC V5 R1     | No          | Rec 10/2006 | Rec 05/2006 | Allowed | Allowed | Allowed | Allowed
P5 HMC V4 R5     | No          | No          | Rec 05/2006 | Allowed | Allowed | Allowed | Allowed
P5 HMC V4 R4     | No          | No          | No          | Allowed | Allowed | Allowed | Allowed
P5 HMC V4 R3     | No          | No          | No          | No      | Allowed | Allowed | Allowed
P5 HMC V4 R2     | No          | No          | No          | No      | No      | Allowed | Allowed
P5 HMC V4 R1     | No          | No          | No          | No      | No      | No      | Allowed

Recommended Combination Thru mm/yyyy: recommended HMC and System Firmware combination; FW Release covered under general FW support thru mm/yyyy.

Allowed, Upgrade Recommended: no longer supported with Service Packs. IBM recommends that you update your firmware to a recommended Release Level.

Supported code combinations for HMC and server firmware:
1. Supported HMC and POWER5 Server Code combinations (excluding 595 and 590)
2. Supported HMC and POWER5 Server Code combinations for 595 and 590

http://www14.software.ibm.com/webapp/set2/sas/f/power5cm/supportedcode.html

Supported HMC and Server Release combination
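
For scripted checks, the matrix on this slide could be encoded as a small lookup table. The Python sketch below is an illustration only: it assumes the flattened cells read left to right from Release Level 240 down to 210 for each HMC row, the names `MATRIX` and `check` are hypothetical, and the dates are those printed on the slide, so the live matrix at the URL above remains the authority.

```python
# Release levels in the column order shown on the slide.
RELEASES = ["240", "235", "230", "225", "222", "220", "210"]

# One row per HMC level, one cell per release level in RELEASES order.
# "R:<date>" = recommended thru that date, "A" = allowed (upgrade
# recommended), "N" = not a supported combination.
MATRIX = {
    "V5 R2": ["R:02/2007", "R:10/2006", "R:05/2006", "A", "A", "A", "A"],
    "V5 R1": ["N", "R:10/2006", "R:05/2006", "A", "A", "A", "A"],
    "V4 R5": ["N", "N", "R:05/2006", "A", "A", "A", "A"],
    "V4 R4": ["N", "N", "N", "A", "A", "A", "A"],
    "V4 R3": ["N", "N", "N", "N", "A", "A", "A"],
    "V4 R2": ["N", "N", "N", "N", "N", "A", "A"],
    "V4 R1": ["N", "N", "N", "N", "N", "N", "A"],
}


def check(hmc, release):
    """Return the slide's wording for one HMC level and firmware release."""
    cell = MATRIX[hmc][RELEASES.index(release)]
    if cell.startswith("R:"):
        return f"Recommended combination thru {cell[2:]}"
    if cell == "A":
        return "Allowed, upgrade recommended"
    return "Not a supported combination"


if __name__ == "__main__":
    print(check("V5 R1", "240"))
    print(check("V5 R1", "235"))
```

The table shows the pattern behind the matrix: an older HMC level never supports a newer firmware release, which is why the HMC is upgraded first.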

Page 93: A06 Care and Feeding of AIX

Prerequisite Tool

Inventory Pre-Req/Co-Req information: http://www-912.ibm.com/e_dir/eserverprereq.nsf

Page 94: A06 Care and Feeding of AIX

Electronic Service Agent “Phone Home”

The Electronic Service Agent™ (ESA) is a "no-charge" software tool that resides on your System p servers to monitor events.
• ESA is able to automatically report hardware problems.
• This proactive tool enables support to arrive on-site with the knowledge and parts required to resolve issues quickly.
• We recommend our clients utilize this “Phone home” capability.
• Key improvements planned (HTTP Proxy Support in the SF240 (SP3))

Page 95: A06 Care and Feeding of AIX

Electronic Service Agent “Phone Home” (new in GA7)

“Call-Home” using SSL connection

• “Call-Home” can now be set up to use SSL through a firewall
• Existing “VPN” connection method is still available
• Proxy-HTTP support available in GA7 SP3 code

Page 96: A06 Care and Feeding of AIX

Monitoring GOAL

Set up monitoring on your system to ensure that you, the system’s administrator, are warned of impending problems and slow-downs BEFORE your customers tell you about them.

Monitoring should be proactive and exception-based. When something is out of spec or out of norm, an alert should be sent rather than relying on review of logs or reports to assess after the fact.

Page 97: A06 Care and Feeding of AIX

Backup & Recovery

Page 98: A06 Care and Feeding of AIX

Backup & Recovery

Establishing a schedule for mksysb’s

• Before and after upgrades

• Following updates to the I/O configuration

• Following configuration changes

TEST – your recovery (before you need to!)

Set backup tunables as needed – but before processes are started

When evaluating recovery sites or scenarios, ensure that, if the two sites do not have the same features, the recovery site has adequate capacity or there are mechanisms to control workload.

Ensure that vital information (such as device drivers) has not been removed from the system if different hardware is used at the backup site.

Examine all software that might be needed (O/S upgrades, software keys) at the offsite location.

Write your backup and restore documentation so that someone NOT familiar with the system can restore it.

Inactive partitions can be used to define an alternate disaster recovery scenario.

Take a look at using the dsmc command out of TSM

Page 99: A06 Care and Feeding of AIX

The Road Ahead

Developing a Strategy for Firmware Upgrades and O/S Upgrades

Page 100: A06 Care and Feeding of AIX

Maintaining Your Environment

A good fix maintenance strategy is an important part of maintaining and managing your server. Regular maintenance of your server, and application of the latest fixes help to maximize server performance, and may reduce the impact of problems if they arise.

IBM recommends that all servers be kept on a supported release and current with latest available fix packages for HMC and server firmware fixes.

The most important scenario to avoid is remaining on a release so long that all subsequent releases that support a single-step upgrade are withdrawn from marketing. Without a single-step upgrade available, there are no supported ways for you to upgrade your server.

IBM recommends that you apply a release level and a minimum of one service pack per year.

• Release Levels
  • Twice a year
  • Generally in February and August, but can change

• Service Packs
  • Generally released approximately every three months
  • Can be released any time as needed if important fixes are available

Page 101: A06 Care and Feeding of AIX

Fix Level Recommendation Tool (FLRT)

Initial release enabled customers to obtain recommended minimum fix levels for key components of IBM System p5 servers.
• System Firmware
• Hardware Management Console
• Virtual I/O Server virtualization partition
• AIX 5L operating system

We are consistently expanding this tool to support more IBM products.
• High Availability Cluster Multi-Processing (HACMP)
• Cluster Systems Management (CSM)
• Parallel Environment
• General Parallel File System
• Others

Highlights of FLRT
• Scripting enabled to evaluate current fix levels
• Easy to create and understand reports
• Useful for “what if” planning needs
• Links to fix distribution sites
• Print friendly view provides printable report for maintenance planning
• Option to manually determine fix levels for all supported products for clients who do not wish to use automated determination
• Easily obtainable tool from all fix distribution sites

A simple-to-understand report providing customers with a quick reference to minimum IBM recommendations to better prevent system outages.

Page 102: A06 Care and Feeding of AIX

New FLRT enhancements

Page 106: A06 Care and Feeding of AIX

© 2007 IBM Corporation, IBM Power Systems Customer Care

Maintaining Your Environment

A good fix maintenance strategy is an important part of maintaining and managing your server. Regular maintenance of your server, and application of the latest fixes help to maximize server performance, and may reduce the impact of problems if they arise.

IBM recommends that all servers be kept on a supported release and current with latest available fix packages for HMC and server firmware fixes.

The most important scenario to avoid is remaining on a release so long that all subsequent releases that support a single-step upgrade are withdrawn from marketing. Without a single-step upgrade available, there are no supported ways for you to upgrade your server.

IBM recommends that you apply a release level and a minimum of one service pack per year.

• Release Levels
  • Twice a year
  • Generally in February and August, but can change

• Service Packs
  • Generally released approximately every three months
  • Can be released any time as needed if important fixes are available

Page 107: A06 Care and Feeding of AIX

General Firmware Strategies

IBM releases new firmware for the following reasons:
• The addition of new system function.
• To correct or avoid a problem.

There are some natural points at which firmware should be evaluated for potential updates:

• When a subscription notice advises of a critical or HIPER (highly pervasive) fix, the environment should be reviewed to determine if the fix should be applied.
• When one of the twice-yearly updates is released.
• Whenever new hardware is introduced into the environment the firmware pre-reqs and co-reqs should be evaluated.
• Anytime HMC firmware levels are adjusted.
• Whenever an outage is scheduled for a system which otherwise has limited opportunity to update or upgrade.
• When the firmware level your system is on is approaching end-of-service.
• If other similar hardware systems are being upgraded and firmware consistency can be maximized by a more homogeneous firmware level.
• On a yearly cycle if firmware has not been updated or upgraded within the last year.

Page 108: A06 Care and Feeding of AIX

Planning for a Firmware Event

First and foremost, review the environment for any existing issues or problems. Check hardware and software logs and resolve as many outstanding issues as possible before undertaking the maintenance event. (For Release level SF235 and beyond, there is a firmware tool available to assist with this).

Existing HMC, Bulk Power and System firmware levels should be determined and documented.

Determine the correct level of code for HMC, Bulk Power and System Firmware. Locate this and review all README(s) and current documentation.

Review the system’s hardware inventory and validate that against firmware levels. If a piece of hardware is introduced that requires higher level of firmware, this may necessitate upgrading other components.

Determine whether the proposed fixes are concurrent / deferred / disruptive.

Put together a plan for ALL related firmware events (example – HMC should be upgraded and at highest level in the complex).

Some of the suggested physical checks would include reviewing the current Installed Level of code for FSP and BPC through the Licensed Internal Code Maintenance folder on the HMC.
• The Installed Level indicates the level of firmware that has been installed and will be loaded into memory after the managed system is powered off and powered on.
• The Activated Level indicates the level of firmware that is active and running in memory.
• The Accepted Level indicates the backup level of firmware.
• The HMC code level can be ascertained by right clicking on the HMC GUI desktop, selecting ‘rshterm’ and entering lshmc -V.

Schedule and announce a maintenance event even when firmware is concurrent. There will be no planned reboot but there should be advance notice to users of the timeframe.
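
The relationship between the three firmware levels described above can be summarized in a few lines of Python. This is a sketch of the reasoning only: the `FirmwareLevels` class, the `describe` function and the level strings are hypothetical placeholders, and the levels themselves would still be read from the Licensed Internal Code Maintenance folder on the HMC.

```python
from dataclasses import dataclass


@dataclass
class FirmwareLevels:
    installed: str  # loaded into memory at the next power off / power on
    activated: str  # currently active and running in memory
    accepted: str   # backup level


def describe(levels):
    """Return notes about differences between the three levels."""
    notes = []
    if levels.installed != levels.activated:
        notes.append("installed level differs from activated level: "
                     "a power off / power on is needed to run the installed level")
    if levels.accepted != levels.installed:
        notes.append("accepted (backup) level differs from installed level")
    return notes or ["installed, activated and accepted levels match"]


if __name__ == "__main__":
    # Placeholder level names, not real firmware identifiers.
    for note in describe(FirmwareLevels("level-B", "level-A", "level-A")):
        print(note)
```

Recording these notes alongside the documented HMC, Bulk Power and System firmware levels gives the maintenance plan a clear starting state.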

Page 109: A06 Care and Feeding of AIX

Summary

Although this is by no means a complete and comprehensive list, this should give you lots of first thoughts about how to manage AIX.

The most important thing to remember on monitoring – set your system up to warn you of impending problems before your customers tell you about them.

Remember that if you manage a UNIX box like a mainframe, you work towards mainframe-like availability. If you manage it like a PC ….. you know

Page 112: A06 Care and Feeding of AIX

Special notices

This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in other countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM offerings available in your area.

Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. Send license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY 10504-1785 USA.

All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or guarantees either expressed or implied.

All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the results that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations and conditions.

IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions worldwide to qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment type and options, and may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal without notice.

IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies.

All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary.

IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.

Many of the System p features described in this document are operating system dependent and may not be available on Linux. For more information, please check: http://www.ibm.com/servers/eserver/System p/linux/whitepapers/linux_System p.html

Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally-available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this document should verify the applicable data for their specific environment.

Revised February 6, 2004

Page 113: A06 Care and Feeding of AIX

Special notices (cont.)The following terms are registered trademarks of International Business Machines Corporation in the United States and/or other countries: AIX, AIX/L, AIX/L(logo), alphaWorks, AS/400, BladeCenter, Blue Gene, Blue Lightning, C Set++, CICS, CICS/6000, ClusterProven, CT/2, DataHub, DataJoiner, DB2, DEEP BLUE, developerWorks, DFDSM, DirectTalk, DYNIX, DYNIX/ptx, e business(logo), e(logo)business, e(logo)server, Enterprise Storage Server, ESCON, FlashCopy, GDDM, IBM, IBM(logo), ibm.com, IBM TotalStorage Proven, IntelliStation, IQ-Link, LANStreamer, LoadLeveler, Lotus, Lotus Notes, Lotusphere,Magstar, MediaStreamer, Micro Channel, MQSeries, Net.Data, Netfinity, NetView, Network Station, Notes, NUMA-Q, Operating System/2, Operating System/400, OS/2, OS/390, OS/400, Parallel Sysplex, PartnerLink, PartnerWorld, Passport Advantage, POWERparallel, PowerPC, PowerPC(logo), Predictive Failure Analysis, PS/2, System p, PTX, ptx/ADMIN, RETAIN, RISC System/6000, RS/6000, RT Personal Computer, S/390, Scalable POWERparallel Systems, SecureWay, Sequent, ServerProven, SP1, SP2, SpaceBall, System/390, The Engines of e-business, THINK, ThinkPad, Tivoli, Tivoli(logo), Tivoli Management Environment, Tivoli Ready(logo), TME, TotalStorage, TrackPoint, TURBOWAYS, UltraNav, VisualAge, WebSphere, xSeries, z/OS, zSeries.

The following terms are trademarks of International Business Machines Corporation in the United States and/or other countries: Advanced Micro-Partitioning, AIX/L(logo), AIX 5L, AIX PVMe, AS/400e, Chipkill, Cloudscape, DB2 OLAP Server, DB2 Universal Database, DFDSM, DFSORT, Domino, e-business(logo), e-business on demand, eServer, Express Middleware, Express Portfolio, Express Servers, Express Servers and Storage, GigaProcessor, HACMP, HACMP/6000, Hypervisor, i5/OS, IBMLink, IMS, Intelligent Miner, Micro-Partitioning, iSeries, NUMACenter, ON DEMAND BUSINESS logo, OpenPower, POWER, Power Architecture, Power Everywhere, PowerPC Architecture, PowerPC 603, PowerPC 603e, PowerPC 604, PowerPC 750, POWER2, POWER2 Architecture, POWER3, POWER4, POWER4+, POWER5, POWER5+, POWER6, Redbooks, Sequent (logo), SequentLINK, Server Advantage, ServeRAID, Service Director, SmoothStart, SP, System p5, S/390 Parallel Enterprise Server, ThinkVision, Tivoli Enterprise, TME 10, TotalStorage Proven, Ultramedia, VideoCharger, Virtualization Engine, Visualization Data Explorer, X-Architecture, z/Architecture.

A full list of U.S. trademarks owned by IBM may be found at: http://www.ibm.com/legal/copytrade.shtml.

UNIX is a registered trademark in the United States, other countries or both.

Linux is a trademark of Linus Torvalds in the United States, other countries or both.

Microsoft, Windows, Windows NT and the Windows logo are registered trademarks of Microsoft Corporation in the United States and/or other countries.

Intel, Itanium and Pentium are registered trademarks and Xeon and MMX are trademarks of Intel Corporation in the United States and/or other countries

AMD Opteron is a trademark of Advanced Micro Devices, Inc.

Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States and/or other countries.

TPC-C and TPC-H are trademarks of the Transaction Processing Performance Council (TPC).

SPECint, SPECfp, SPECjbb, SPECweb, SPECjAppServer, SPEC OMP, SPECviewperf, SPECapc, SPEChpc, SPECjvm, SPECmail, SPECimap and SPECsfs are trademarks of the Standard Performance Evaluation Corp (SPEC).

NetBench is a registered trademark of Ziff Davis Media in the United States, other countries or both.

Other company, product and service names may be trademarks or service marks of others. Revised July 20, 2005

Page 114: A06 Care and Feeding of AIX

The IBM benchmarks results shown herein were derived using particular, well configured, development-level and generally-available computer systems. Buyers should consult other sources of information to evaluate the performance of systems they are considering buying and should consider conducting application oriented testing. For additional information about the benchmarks, values and systems tested, contact your local IBM office or IBM authorized reseller or access the Web site of the benchmark consortium or benchmark vendor.

IBM benchmark results can be found in the IBM System p5, ~ p5, System p, OpenPower and IBM RS/6000 Performance Report at http://www.ibm.com/servers/eserver/System p/hardware/system_perf.html.

Unless otherwise indicated for a system, the performance benchmarks were conducted using AIX V4.3 or AIX 5L. IBM C Set++ for AIX and IBM XL FORTRAN for AIX with optimization were the compilers used in the benchmark tests. The preprocessors used in some benchmark tests include KAP 3.2 for FORTRAN and KAP/C 1.4.2 from Kuck & Associates and VAST-2 v4.01X8 from Pacific-Sierra Research. The preprocessors were purchased separately from these vendors. Other software packages like IBM ESSL for AIX and MASS for AIX were also used in some benchmarks.

For a definition and explanation of each benchmark and the full list of detailed results, visit the Web site of the benchmark consortium or benchmark vendor.

TPC http://www.tpc.org

SPEC http://www.spec.org

LINPACK http://www.netlib.org/benchmark/performance.pdf

Pro/E http://www.proe.com

GPC http://www.spec.org/gpc

NotesBench http://www.notesbench.org

VolanoMark http://www.volano.com

STREAM http://www.cs.virginia.edu/stream/

SAP http://www.sap.com/benchmark/

Oracle Applications http://www.oracle.com/apps_benchmark/

PeopleSoft - To get information on PeopleSoft benchmarks, contact PeopleSoft directly

Siebel http://www.siebel.com/crm/performance_benchmark/index.shtm

Baan http://www.ssaglobal.com

Microsoft Exchange http://www.microsoft.com/exchange/evaluation/performance/default.asp

Veritest http://www.veritest.com/clients/reports

Fluent http://www.fluent.com/software/fluent/fl5bench/fullres.htm

TOP500 Supercomputers http://www.top500.org/

Ideas International http://www.ideasinternational.com/benchmark/bench.html

Storage Performance Council http://www.storageperformance.org/results

Notes on benchmarks and values

Revised July 5, 2005

Page 115: A06 Care and Feeding of AIX

rPerf

rPerf (Relative Performance) is an estimate of commercial processing performance relative to other IBM UNIX systems. It is derived from an IBM analytical model which uses characteristics from IBM internal workloads, TPC and SPEC benchmarks. The rPerf model is not intended to represent any specific public benchmark results and should not be reasonably used in that way. The model simulates some of the system operations such as CPU, cache and memory. However, the model does not simulate disk or network I/O operations.

rPerf estimates are calculated based on systems with the latest levels of AIX 5L and other pertinent software at the time of system announcement. Actual performance will vary based on application and configuration specifics. The IBM eServer pSeries 640 is the baseline reference system and has a value of 1.0. Although rPerf may be used to approximate relative IBM UNIX commercial processing performance, actual system performance may vary and is dependent upon many factors including system hardware configuration and software design and configuration.

All performance estimates are provided "AS IS" and no warranties or guarantees are expressed or implied by IBM. Buyers should consult other sources of information, including system benchmarks and application sizing guides, to evaluate the performance of a system they are considering buying. For additional information about rPerf, contact your local IBM office or IBM authorized reseller.

Notes on performance estimates

Revised August 12, 2005