Introduction to PowerHA
Power High Availability
Skill Level: Intermediate
Uma Chandolu ([email protected]), Senior Staff Software Engineer, IBM
Tejaswini Kaujalgi ([email protected]), Software Engineer, IBM India Software Labs
17 Aug 2010
PowerHA for AIX® is the new name for HACMP (High Availability Cluster Multiprocessing). HACMP is an application that makes systems fault resilient and reduces application downtime. This article introduces PowerHA and explains in detail how to configure a two-node cluster. Because many customers ask for this configuration, the article should be useful for understanding PowerHA and setting up a two-node cluster.
Introduction
With today's increasing business demands, critical applications need to be available all the time, and the system needs to be fault tolerant. But fault-tolerant systems always come at a heavy cost. Hence, there is a need for a solution that provides these facilities and is also cost effective.
A High Availability solution can ensure that the failure of any component of the solution does not cause the application and its data to be unavailable to the user community. This is achieved through the elimination, or masking, of both planned and unplanned downtime by eliminating single points of failure. Also, no specialized hardware is required to make an application highly available. PowerHA does not perform some administrative tasks like backups, time synchronization, and any application-specific configuration.
Introduction to PowerHA Trademarks© Copyright IBM Corporation 2010. All rights reserved. Page 1 of 22
Figure 1 is an illustration of the failover capacity. When one server goes down, the other takes over.
Figure 1. Failover capacity
Overview of PowerHA
The terms PowerHA and HACMP are used interchangeably. As mentioned earlier, PowerHA eliminates single points of failure (SPOFs). The following table shows possible SPOFs:
Cluster object      Eliminated as SPOF by
Node                Using multiple nodes
Power source        Using multiple circuits or uninterruptible power supplies
Network adapter     Using redundant network adapters
Network             Using multiple networks to connect nodes
TCP/IP subsystem    Using non-IP networks to connect adjoining nodes and clients
Disk adapter        Using redundant disk adapters or multipath hardware
Disk                Using multiple disks with mirroring or RAID
Application         Adding nodes for takeover; configuring an application monitor
VIO server          Implementing dual VIO servers
Site                Adding an additional site
developerWorks® ibm.com/developerWorks
The main goal is to have two servers so that if one fails, the other takes over. PowerHA is a clustering technology that provides both failover protection through redundancy and horizontal scalability through concurrent/parallel access.
PowerHA terminology
PowerHA has its own terminology. The terms can be classified into topology components and resource components.
The topology components are basically the physical components. They include:
• Nodes: System p servers, which can be standalone partitions or VIOS clients
• Networks: IP networks and non-IP networks
• Communication interfaces: Token Ring or Ethernet adapters
• Communication devices: RS232 or heartbeat over disk
The resource components are the logical entities that will be made highly available. They include:
• Application server: The start/stop scripts of the application
• Service IP address: End users are generally given an IP address to connect to the application. This IP address is mapped to the node where the application is actually running. Since the IP address needs to remain highly available, it is part of the resource group.
• File system: Many applications require file systems to be mounted.
• Volume group: Many applications require volume groups to be made highly available.
All the resources together form an entity called a resource group. PowerHA handles this as a single unit. It keeps the resource groups highly available. Resource groups have policies associated with them. These are:
1. Startup policy: Determines the node on which the resource group is activated
2. Fallover policy: When a failure happens, this determines the fallover target node
3. Fallback policy: Determines whether or not the resource group will fall back
Whenever a failure happens, PowerHA consults these policies and acts accordingly.
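The way a fallover policy picks a target can be illustrated with a small shell sketch. This is only a simplified model, not PowerHA's implementation: the function name next_fallover_target and the idea of passing the node priority list as arguments are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: choose the fallover target for a resource group.
# Usage: next_fallover_target FAILED_NODE NODE...
# The NODE arguments are the participating nodes in priority order
# (highest priority first). The first surviving node is the target.
next_fallover_target() {
    failed=$1
    shift
    for node in "$@"; do
        if [ "$node" != "$failed" ]; then
            printf '%s\n' "$node"
            return 0
        fi
    done
    return 1    # no surviving node can host the resource group
}
```

For the two-node cluster in this article, `next_fallover_target node1 node1 node2` selects node2. A fallback policy would then decide whether the group returns to node1 once that node rejoins the cluster.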
The subsystems of PowerHA
Figure 2. Subsystems of PowerHA
The illustration above shows that PowerHA comprises a number of software components:
• The cluster manager, clstrmgr, is the core process that monitors cluster membership. The cluster manager includes a topology manager to manage the topology components, a resource manager to manage resource groups, an event manager with event scripts that works through the RMC facility, and RSCT to react to failures.
• The clinfo process provides an API for communication between the cluster manager and your application. Clinfo also provides remote monitoring capabilities and can run a script in response to a status change in the cluster.
• In PowerHA 5, clcomdES allows the cluster managers to communicate in a secure manner without using rsh and the /.rhosts files.
Configuration of a 2 node cluster
Before starting with the configuration, let's look at the networking and storage considerations of PowerHA.
Networking
PowerHA uses networks to detect and diagnose failures as well as to provide clients with highly available access to applications.
Internode communication also happens through networks. PowerHA detects three kinds of failures directly: network, NIC, and node failures. It detects and diagnoses them through the RSCT daemon. RSCT detects the loss of heartbeat packets, which are sent across all the networks, and determines the exact type of failure (node, NIC, or network).
Figure 3 shows that the heartbeat packets are sent and received by all NICs, which helps in determining the failures.
Figure 3. Cluster representing heartbeat packets
If the heartbeat packets stop, both nodes assume that the other is down, and each tries to bring the resource group online. This could result in massive data corruption.
To avoid this, PowerHA uses two kinds of networks:
1. IP networks: Examples are Ethernet, EtherChannel, etc.
2. Non-IP networks: An example is RS232 (this is needed to make sure that even if the network goes down, PowerHA is capable of differentiating between network failure and node failure)
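The reasoning behind having both kinds of networks can be written down as a small decision table. The sketch below is a simplified illustration from the point of view of one node watching its peer; the function name classify_failure and its "up"/"down" arguments are assumptions made here, and the real diagnosis is done by RSCT with considerably more detail.

```shell
#!/bin/sh
# Hypothetical sketch: interpret heartbeat status seen from one node.
# Usage: classify_failure IP_HEARTBEAT NONIP_HEARTBEAT
# Each argument is "up" or "down".
classify_failure() {
    ip_hb=$1
    nonip_hb=$2
    if [ "$ip_hb" = up ] && [ "$nonip_hb" = up ]; then
        echo "healthy"
    elif [ "$ip_hb" = down ] && [ "$nonip_hb" = up ]; then
        # The peer still answers over the non-IP path, so it is alive:
        # only the IP network is lost and no takeover is needed.
        echo "network failure"
    elif [ "$ip_hb" = down ] && [ "$nonip_hb" = down ]; then
        # No sign of the peer on any path.
        echo "node failure"
    else
        echo "non-IP link failure"
    fi
}
```

Without the non-IP path, the second and third cases would look identical, which is exactly the situation in which both nodes might try to bring the resource group online.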
IP address takeover (IPAT)
Most applications require that the IP address be highly available. To ensure this, we include the service IP in the resource group. The movement of the service IP from one NIC to another is called IP address takeover. There are two ways to use IPAT:
1. IPAT via aliasing: PowerHA adds the service IP address to the NIC using the AIX IP aliasing feature
2. IPAT via replacement: PowerHA replaces the interface IP address with the service IP
Storage
Storage can be broadly classified into two types:
1. Private storage: Owned by only one node
2. Shared storage: Owned by more than one node in the cluster
All application data resides in the shared storage. To avoid data inconsistency, shared storage can be protected in the following ways:
1. Reserve/release-based shared storage protection: Used with standard volume groups
2. RSCT-based shared storage protection: Used with enhanced concurrent volume groups
HACMP 5.x supports the RSCT-based style of shared storage protection, which relies on AIX's RSCT component to coordinate the ownership of shared storage when using enhanced concurrent volume groups in non-concurrent mode.
Configuration
Before starting with the configuration, the cluster must be properly planned. The online planning worksheets (OLPW) can be used for planning. This section explains the configuration of a two-node cluster. In the example provided, both nodes have three Ethernet adapters and two shared disks.
Step 1: Fileset installation
After installing AIX, the first step is to install the required filesets, listed below. The RSCT and BOS filesets can be found on the AIX base version CDs. The license for PowerHA needs to be purchased to install the HACMP filesets.
RSCT filesets
rsct.compat.basic.hacmp
rsct.compat.clients.hacmp
rsct.basic.hacmp
rsct.basic.rte
rsct.opt.storagerm
rsct.crypt.des
rsct.crypt.3des
rsct.crypt.aes256
BOS filesets
bos.data
bos.adt.libm
bos.adt.syscalls
bos.clvm.enh
bos.net.nfs.server
HACMP 5.5 filesets
cluster.adt.es
cluster.es.assist
cluster.es.cspoc
cluster.es.plugins
cluster.assist.license
cluster.doc.en_US.assist
cluster.doc.en_US.es
cluster.es.worksheets
cluster.license
cluster.man.en_US.assist
cluster.man.en_US.es
cluster.es.client
cluster.es.server
cluster.es.nfs
cluster.es.cfs
After installing the filesets, reboot the partition.
Step 2: Setting the path
Next, the path needs to be set. To do that, add the following to the /.profile file:
export PATH=$PATH:/usr/es/sbin/cluster:/usr/es/sbin/cluster/utilities
Step 3: Network configuration
To configure an IP address on the Ethernet adapters, do the following:
#smitty tcpip -> Minimum Configuration and Startup -> Choose Ethernet network interface
You will have three Ethernet adapters: two with private IP addresses and one with a public IP address.
Here, enter the relevant fields for en0, which you will configure with a public IP address.
Image 1. Configuration of a public IP address
This will configure the IP address and start the TCP/IP services on it.
Similarly, configure the private IP addresses on en1 and en2.
Image 2. Configuration of a private IP address
Similarly, configure en2 with private IP 10.10.210.21 and start the TCP/IP services.
Next, add the IP addresses (of node1, node2, and the service IP, which is db2live here) and their labels to the /etc/hosts file. It should look like:
# Internet Address    Hostname    # Comments
127.0.0.1 loopback localhost # loopback (lo0) name/address
192.168.20.72 node1.in.ibm.com node1
192.168.20.201 node2.in.ibm.com node2
10.10.35.5 node2ha1
10.10.210.11 node2ha2
10.10.35.4 node1ha1
10.10.210.21 node1ha2
192.168.22.39 db2live
The idea is that you should include each of the three ports for each machine with relevant labels for name resolution.
Perform similar operations on node2. Configure en0 with the public IP and en1 and en2 with private IPs, and edit the /etc/hosts file.
To test that all is well, you can issue pings to the various IP addresses from each machine.
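Before pinging, it can help to confirm that every label is actually present in the hosts file on each node. The following is a minimal sketch of such a check; the function name missing_labels is an assumption made for illustration and is not a PowerHA or AIX utility.

```shell
#!/bin/sh
# Hypothetical sketch: report which labels are absent from a hosts file.
# Usage: missing_labels HOSTS_FILE LABEL...
# Prints each label that does not appear as a host name or alias.
missing_labels() {
    hosts_file=$1
    shift
    for label in "$@"; do
        # Strip comments, then look for the label in any name column
        # (field 2 onward; field 1 is the IP address).
        if ! awk -v want="$label" '
                { sub(/#.*/, "") }
                { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
                END { exit !found }
            ' "$hosts_file"; then
            printf '%s\n' "$label"
        fi
    done
}
```

With the example above, the check could be run on each node as `missing_labels /etc/hosts node1 node2 node1ha1 node1ha2 node2ha1 node2ha2 db2live`; empty output means every label is defined.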
Step 4: Storage configuration
We need shared storage to create a heartbeat over FC disk. The disks need to be allocated from the SAN. Once both nodes can see the same disks (this can be verified using the LUN number), heartbeat over disk can be configured.
This method does not use Ethernet, which avoids a single point of failure in the Ethernet network, switches, or protocols.
The first step is to identify the available major number on all the nodes.
Image 3. Identifying available major number
Pick a number that is free on all the nodes. In this case, we picked 100.
On node1
1. Create a vg "hbvg" on the shared disk "hdisk1" with enhanced concurrent capability.
#smitty mkvg
Image 4. Volume group creation
Once hbvg is created, the autovaryon flag needs to be disabled. To do that, run the following command.
#chvg -an hbvg
2. Next, create logical volumes in the volume group "hbvg". Enter an LV name such as hbloglv, select 1 for the number of logical partitions, select jfslog as the type, and set scheduling to sequential. Leave the remaining options at their default values. Press Enter.
#smitty mklv
Image 5. Logical Volume creation
Once the LV is created, format it as a log device with logform.
#logform /dev/hbloglv
Repeat this process to create another LV of type jfs and named hblv, but otherwise identical.
3. Next we create a filesystem. To do that, enter the following:
#smitty crfs -> Add a Journaled File System -> Add a Journaled File System on a Previously Defined Logical Volume -> Add a Standard Journaled File System
Here enter the LV name "hblv", the LV for the log as "hbloglv", and the mount point /hb_fs.
Image 6. Filesystem creation in a Logical Volume
Once the filesystem is created, try mounting the file system. Before moving to node2, unmount /hb_fs and varyoffvg hbvg.
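The node1 steps above are driven through SMIT, but they correspond roughly to a handful of AIX commands. The following dry-run sketch only prints those commands so they can be reviewed first. The helper names run and node1_storage_setup and the DRY_RUN variable are assumptions made for illustration, and the flags shown are a minimal subset: the sequential scheduling option chosen in SMIT is left out, so treat this as a starting point and check each command against the AIX documentation.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# Setting DRY_RUN=0 on an AIX node would execute them for real.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

node1_storage_setup() {
    vg=hbvg       # heartbeat volume group from the article
    disk=hdisk1   # shared disk
    major=100     # major number that is free on both nodes
    run mkvg -C -V "$major" -y "$vg" "$disk"   # enhanced concurrent capable VG
    run chvg -an "$vg"                         # no automatic varyon at restart
    run mklv -y hbloglv -t jfslog "$vg" 1      # one-partition log LV
    run logform /dev/hbloglv                   # asks for confirmation when run for real
    run mklv -y hblv -t jfs "$vg" 1            # one-partition data LV
    run crfs -v jfs -d hblv -m /hb_fs          # file system on the existing LV
    run mount /hb_fs
    run umount /hb_fs
    run varyoffvg "$vg"
}
```

Calling `node1_storage_setup` with the default DRY_RUN=1 lists the nine commands in order without touching the system.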
On Node 2
1. Identify the shared disk using its PVID. Import the volume group with the same major number (here it is 100) from the shared disk (hdisk1):
#importvg -V 100 -y hbvg hdisk1
2. Vary on the volume group and disable automatic varyon at system restart.
#varyonvg hbvg
#chvg -an hbvg
Now, you should be able to mount the filesystem. Once done, unmount the filesystem and varyoffvg hbvg.
3. Verification of heartbeat over FC: Open a separate session on each node. On node1, run the following command, where hdisk1 is the shared disk:
#/usr/sbin/rsct/bin/dhb_read -p hdisk1 -r
On node2:
#/usr/sbin/rsct/bin/dhb_read -p hdisk1 -t
Basically, one node will heartbeat to the disk and the other will detect it. Both nodes should return to the command line after reporting "Link operating normally".
Application specific configuration
If you are making an application highly available, for example a DB2 server, application-specific configuration needs to be done. That is beyond the scope of this article.
HACMP related configuration
1. Network takeover: On both nodes, run grep -i community /etc/snmpdv3.conf | grep public and ensure that there is an uncommented line similar to COMMUNITY public public noAuthNoPriv 0.0.0.0 0.0.0.0.
2. Next, add the IP addresses of all the nodes and NICs to the /usr/es/sbin/cluster/etc/rhosts file.
# cat /usr/es/sbin/cluster/etc/rhosts
192.168.20.72
192.168.20.201
10.10.35.5
10.10.210.11
10.10.35.4
10.10.210.21
192.168.22.39
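Since the rhosts file has to list the same addresses that were entered in /etc/hosts, a quick cross-check can catch an address that was left out. The sketch below is one way to do it; the function name ips_missing_from_rhosts is an assumption made for illustration and is not part of PowerHA.

```shell
#!/bin/sh
# Hypothetical sketch: list addresses defined in a hosts file that are
# not present in the cluster rhosts file.
# Usage: ips_missing_from_rhosts HOSTS_FILE RHOSTS_FILE
ips_missing_from_rhosts() {
    hosts_file=$1
    rhosts_file=$2
    # The first column of each non-comment, non-blank hosts line is the address.
    awk '{ sub(/#.*/, "") } NF > 0 { print $1 }' "$hosts_file" |
    while read -r ip; do
        [ "$ip" = 127.0.0.1 ] && continue    # loopback is not a cluster address
        # -F fixed string, -x whole-line match, -q quiet
        grep -Fxq "$ip" "$rhosts_file" || printf '%s\n' "$ip"
    done
}
```

On a cluster node this could be run as `ips_missing_from_rhosts /etc/hosts /usr/es/sbin/cluster/etc/rhosts`. Empty output means every address in /etc/hosts is listed; if /etc/hosts also holds entries unrelated to the cluster, those will be reported too and can be ignored.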
Configuring PowerHA cluster
On Node 1:
1. First define a cluster
#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Configure an HACMP Cluster --> Add/Change/Show an HACMP Cluster
Image 7. Defining a cluster
Press enter. Now, the cluster is defined.
2. Add nodes to the defined cluster.
#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Configure HACMP Nodes --> Add a Node to the HACMP Cluster
Image 8. Adding nodes to a cluster
Similarly, add another node to the cluster.
So far, we have defined a cluster and added nodes to it. Next we will make the two nodes communicate with each other.
3. Adding networks. We will add two kinds of networks, IP (Ethernet) and non-IP (diskhb).
#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Configure HACMP Networks --> Add a Network to the HACMP Cluster
Select "ether" from the list.
Image 9. Adding networks to the cluster
After this is added, return to "Add a Network to the HACMP Cluster" and also add the diskhb network.
4. The next step establishes which physical devices from each node are connected to each network.
#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Configure HACMP Communication Interfaces/Devices --> Add Communication Interfaces/Devices --> Add Pre-defined Communication Interfaces and Devices --> Communication Interfaces
Pick the network that we added in the last step (IP_network) and enter a configuration similar to this:
Image 10. Adding communication devices to the cluster
There should be a warning about an insufficient number of communication ports on particular networks.
These last steps need to be repeated for each adapter that is to be assigned to a network for HACMP purposes. The warnings can be ignored for now; by the time all adapters are assigned to networks, the warnings must be gone.
Note that for the disk communication (the disk heartbeat), the steps are slightly different.
#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Configure HACMP Communication Interfaces/Devices --> Add Communication Devices
Select shared_diskhb or the relevant name as appropriate and fill in the details as below:
Image 11. Adding communication interfaces to the cluster
Each node in the cluster also needs to have a persistent node IP address. We associate each node with its persistent IP as follows:
#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Configure HACMP Persistent Node IP Label/Addresses
Add all the details as below:
Image 12. Adding persistent IP address to the cluster
Checkpoint:
After adding everything, we should check that everything was added correctly.
#smitty hacmp --> Extended Configuration --> Extended Topology Configuration --> Show HACMP Topology --> Show Cluster Topology
It will list all the networks, interfaces, and devices. Verify that they are added correctly.
5. Adding a resource group: So far, we have defined a cluster, added nodes to it, and configured both IP and non-IP networks. The next step is to configure a resource group. As defined earlier, a resource group is a collection of resources. An application server, for example a DB2 server, is one such resource that needs to be kept highly available.
Adding an application server to resource group:
#smitty hacmp --> Extended Configuration --> Extended Resource Configuration --> HACMP Extended Resources Configuration --> Configure HACMP Application Servers --> Add an Application Server
Image 13. Adding resources - Application server
This specifies the server name and the start and stop scripts needed to start and stop the application server. For applications such as DB2, WebSphere, SAP, Oracle, TSM, ECM, LDAP, and IBM HTTP, the start/stop scripts come with the product. For other applications, administrators should write their own scripts to start and stop the application.
The next resource that we will add to the resource group is a service IP. End users connect to the application only through this IP. Hence, the service IP should be kept highly available.
#smitty hacmp --> Extended Configuration --> Extended Resource Configuration --> HACMP Extended Resources Configuration --> Configure HACMP Service IP Labels/Addresses --> Add a Service IP Label/Address
Choose "Configurable on Multiple Nodes" and then "IP_network". Here we have db2live as the service IP.
Image 14. Adding resources - Service IP
Now that the resources are added, we will create a Resource Group (RG), define RG policies, and add all these resources to it.
#smitty hacmp --> Extended Configuration --> HACMP Extended Resource Group Configuration --> Add a Resource Group
Image 15. Resource Group creation
Once the RG is created, we can change its attributes using:
#smitty hacmp --> Extended Configuration --> HACMP Extended Resource Group Configuration --> Change/Show Resources and Attributes for a Resource Group
Select db2_rg and configure as desired:
Image 16. Defining various attributes of the Resource Group
6. Verification and synchronization: Once everything is configured on the primary node (node1 here), we need to synchronize this with all other nodes in the cluster. To do that:
#smitty hacmp --> Extended Configuration --> Extended Verification and Synchronization
Image 17. Verification and synchronization of the cluster
This checks the status and configuration of the local node first, and then propagates the configuration to the other nodes in the cluster, if they are reachable. The output gives plenty of detail on any errors (and on the checks that passed).
Once this is done, your cluster is ready. You can test it by moving the RG manually. To do that:
#smitty hacmp --> System Management (C-SPOC) --> HACMP Resource Group and Application Management --> Move a Resource Group to Another Node / Site --> Move Resource Groups to Another Node
Choose "node2" and press Enter. You should see the stop scripts running on node1 and the start scripts running on node2. After a few seconds, the RG will be online on node2.
About the authors
Uma Chandolu
Uma M. Chandolu works as a Development Support Specialist on AIX. He has five years of extensive hands-on experience in AIX environments and demonstrated expertise in AIX system administration and other subsystems. He has experience interfacing with customers and handling customer-critical situations. He has been recognized as an IBM developerWorks contributing author. He can be contacted at [email protected].
Tejaswini Kaujalgi
Tejaswini Kaujalgi works as a Systems Software Engineer in the IBM AIX UPT Release team in Bangalore. She has been working on AIX, PowerHA, Security, and VIOS components on pSeries for more than 3 years at IBM India Software Labs. She has also worked on various customer configurations using LDAP, Kerberos, RBAC, PowerHA, and AIX on pSeries. She is an IBM certified System p Administrator. You can reach her at [email protected].