FlexiPlatform Alarms for RNC-OMS

DN70155942 Issue 1-0 en draft

Nokia Corporation

The information in this document is subject to change without notice and describes only the product defined in the introduction of this documentation. This document is not an official customer document and Nokia Networks does not take responsibility for any errors or omissions in this document. No part of it may be reproduced or transmitted in any form or means without the prior written permission of Nokia Networks. The document has been prepared to be used by professional and properly trained personnel, and the customer assumes full responsibility when using it. Nokia Networks welcomes customer comments as part of the process of continuous development and improvement of the documentation.

The information or statements given in this document concerning the suitability, capacity, or performance of the mentioned hardware or software products cannot be considered binding but shall be defined in the agreement made between Nokia Networks and the customer.

Nokia Networks WILL NOT BE RESPONSIBLE IN ANY EVENT FOR ERRORS IN THIS DOCUMENT OR FOR ANY DAMAGES, INCIDENTAL OR CONSEQUENTIAL (INCLUDING MONETARY LOSSES), that might arise from the use of this document or the information in it. UNDER NO CIRCUMSTANCES SHALL NOKIA BE RESPONSIBLE FOR ANY LOSS OF USE, DATA, OR INCOME, COST OF PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES, PROPERTY DAMAGE, PERSONAL INJURY OR ANY SPECIAL, INDIRECT, INCIDENTAL, PUNITIVE OR CONSEQUENTIAL DAMAGES HOWSOEVER CAUSED.

THE CONTENTS OF THIS DOCUMENT ARE PROVIDED "AS IS". EXCEPT AS REQUIRED BY APPLICABLE MANDATORY LAW, NO WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT, ARE MADE IN RELATION TO THE ACCURACY, RELIABILITY OR CONTENTS OF THIS DOCUMENT. NOKIA RESERVES THE RIGHT TO REVISE THIS DOCUMENT OR WITHDRAW IT AT ANY TIME WITHOUT PRIOR NOTICE.

This document and the product it describes are considered protected by copyright according to the applicable laws.

NOKIA and Nokia Connecting People are registered trademarks of Nokia Corporation. Other product names mentioned in this document may be trademarks of their respective companies, and they are mentioned for identification purposes only.

Copyright © Nokia Corporation 2006. All rights reserved. Reproduction, transfer, distribution or storage of part or all of the contents in this document in any form without the prior written permission of Nokia is prohibited.

Contents

1 Changes in FlexiPlatform alarms for RNC-OMS

1.1 Changes within the Nokia FlexiPlatform release 4

1.1.1 Changes in the FlexiPlatform alarms for RNC-OMS

2 70005 INCORRECT ALARM DATA

3 70030 DISK DATABASE IS GETTING FULL

4 70156 DISK DATABASE WATCHDOG START-UP FAILED

5 70173 BACKEND DATABASE REQUIRED BY CORBA NAMING SERVICE IS UNAVAILABLE

6 70245 ILLEGAL INTERNAL USAGE OF EXTERNAL ALARM NOTIFICATION FORMAT

7 70256 RESOURCE ALLOCATION OR DE-ALLOCATION FAILURE

1 Changes in FlexiPlatform alarms for RNC-OMS

The information contained in this section is for internal use only. Do not reuse this module in application-specific library configurations. Use only the information that is relevant to your own change module(s).

1.1 Changes within the Nokia FlexiPlatform release 4

1.1.1 Changes in the FlexiPlatform alarms for RNC-OMS

The following alarms are new in the current FP4 release:

. 70245 ILLEGAL INTERNAL USAGE OF EXTERNAL ALARM NOTIFICATION FORMAT

. 70256 RESOURCE ALLOCATION OR DE-ALLOCATION FAILURE

The following alarm descriptions have been updated in FP4:

. 70005 INCORRECT ALARM DATA

. 70030 DISK DATABASE IS GETTING FULL

. 70156 DISK DATABASE WATCHDOG START-UP FAILED

. 70173 BACKEND DATABASE REQUIRED BY CORBA NAMING SERVICE IS UNAVAILABLE

2 70005 INCORRECT ALARM DATA

Probable cause: Invalid parameter

Event type: Processing error

Default severity: Major

Meaning

The alarm system has been requested to raise or clear an alarm with incorrect alarm data. One or more arguments provided with the request might have an invalid value or meaning: they might be null, empty, too long, out of the specified range, contain non-printable characters, or have an incorrect format. The alarm number (Specific Problem) might also be unknown. An incorrect format in this case means, for example, that a character value was entered where a numeric value was expected. A special case of an incorrect format is if the quotes (") surrounding the value of an information field are missing from an alarm notification record in the syslog.

The alarm that is requested to be raised or cleared with incorrect data is not processed further, but its information is included as additional information in this alarm. If the alarm number is unknown, the actual fault for which the alarm was raised also remains unknown.

Identifying additional information fields

1. Erroneous data

. Identifies the alarm data that was incorrect or that was totally missing. Only the name of the first field containing invalid data is mentioned here.

Possible values are:

. SP: Specific Problem given in the data is not known by the alarm system, or is not reasonable.

. MOId: Managed Object Id given in the data is not reasonable.

. PS: Perceived Severity given in the data is not reasonable.

. applId: Application Id given in the data is not reasonable.

. AAI: Additional Information given in the data is not reasonable.

. IAAI: Identifying Additional Information given in the data is not reasonable.

. alarmTime: Alarm time is presented in too long a format, or is in non-numerical format.

. length: The combined length of the string type fields (Managed Object Id, Application Id, Application Additional Information, Identifying Application Additional Information) given in the data exceeds the maximum value of 896 characters. Note that in this case, both Application Id and Managed Object Id in the given data are considered as invalid, as only the combined length is verified.

In addition, these values are also possible for RNC alarms:

. rncLocalMOId: the Local Managed Object Id given in the data is not reasonable;

. rncApplicationId: the RNC Application Id given in the data is not reasonable;

. rncNotificationId: the RNC Notification Id given in the data is not reasonable;

. rncFlowControl: the RNC Flow Control given in the data is not reasonable.

2. Specific Problem

. Specific Problem (the alarm number) of the invalid alarm. This field can also contain the original invalid value if the Specific Problem was the invalid field.

Additional information fields

3. Managed Object Id

. Distinguished name of the managed object that was given as the Managed Object Id in the invalid alarm. If the MOId itself was the incorrect data, then the value fsManagedObjectId=invalid,fsClusterId=ClusterRoot is displayed in this field.
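The rules listed under "Erroneous data" can be illustrated with a minimal Python sketch. It is not the alarm system's implementation: the dictionary keys simply reuse the value names from the list above, the set of known Specific Problems is a hypothetical input, and the order of the checks is an assumption. Only the 896-character limit comes from this description.

```python
from typing import Optional

MAX_COMBINED_LENGTH = 896  # limit stated in the alarm description


def first_erroneous_field(alarm: dict, known_sps: set) -> Optional[str]:
    """Return the name of the first invalid field, or None if all checks pass.

    The check order is an assumption made for this sketch.
    """
    sp = alarm.get("SP")
    if not isinstance(sp, int) or sp not in known_sps:
        return "SP"

    # Mandatory string fields: must be non-empty and printable.
    for name in ("MOId", "applId"):
        value = alarm.get(name)
        if not isinstance(value, str) or not value or not value.isprintable():
            return name

    # Only the combined length of the string type fields is verified.
    string_fields = ("MOId", "applId", "AAI", "IAAI")
    combined = sum(len(alarm.get(name) or "") for name in string_fields)
    if combined > MAX_COMBINED_LENGTH:
        return "length"
    return None
```

For example, a request with an unknown alarm number yields "SP", which matches the case used in the testing instructions of this alarm.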

Instructions

Fill in a problem report and send it to your local Nokia representative.

Clearing

Clear the alarm with the Alarm Browser after correcting the fault as presented in Instructions, in other words, after sending the report to your local Nokia representative.

Testing instructions

Use, for example, the alarm system command line interface (CLI) command flexalarm to send a request to raise or clear an alarm with a Specific Problem that does not exist.

For example:

$> flexalarm -raise -mo=<myMO> -ap=<myAP> -sp=700111

where <myMO> and <myAP> have the correct format.

Since the 700111 Specific Problem does not exist, alarm 70005 is raised.

3 70030 DISK DATABASE IS GETTING FULL

Probable cause: Storage capacity problem

Event type: Processing error

Default severity: Major

Meaning

The disk storage area reserved for disk database is filling up.

The disk database is still fully operational. If the database fills up completely, its services cannot be used anymore.

Identifying additional information fields

-

Additional information fields

1. Max size: the maximum size of database in kB

2. Fill ratio: the fill ratio of the database

Instructions

The actions needed to avoid a completely full database are database-specific, so contact your local Nokia representative immediately and provide them with the information you obtained from the alarm notification's fields.

Clearing

Clear the alarm with the Alarm Browser after correcting the fault as presented in Instructions.

Testing instructions

You can test the alarm either by filling the database until the allocated space exceeds the fill ratio alarm limit, or by decreasing the fill ratio alarm limit below the current fill ratio of the database. You can also combine these two approaches.

. In the first approach, you simply create a dummy table in the database and insert rows into it until the fill ratio exceeds the fill ratio alarm limit (see attribute fsdbFillRatioAlarmLimit in the DB fragment).

. In the second approach, you must use a parameter management tool to change the fsdbFillRatioAlarmLimit attribute of the DB fragment to a value smaller than the current fill ratio of the database. After this, you must restart the recovery group of the database (fshascli -r /<RG>).

The current fill ratio of the database can be estimated as follows:

1. Get the maximum size of the database either by checking the innodb_data_file_path attribute from the MySQL instance configuration file (/var/mnt/local/MySQL_<DBName>/my.cnf) or by connecting to the instance and entering the following command:

SHOW GLOBAL VARIABLES LIKE 'innodb_data_file_path'\G

The maximum size is the sum of the maximum sizes of the InnoDB data files listed in the value. For example, the following result means that the maximum size is 500 MB (512'000 kB):

*************************** 1. row ***************************

Variable_name: innodb_data_file_path

Value: ibdata1:500M

2. Get the free space of the database by connecting to the instance and entering the following command for any InnoDB table:

SHOW TABLE STATUS FROM <schema> LIKE '<table>'\G

<schema> is the schema name of the InnoDB table and <table> is the name of the table. The comment column of the result set shows the free space.

For example, the following result means that the database has 492'544 kB of free space (when using the example size of step 1, the result leads to a fill ratio of 3.8%):

mysql> SHOW TABLE STATUS FROM test LIKE 'mysqlwdtest'\G

*************************** 1. row ***************************

Name: mysqlwdtest

...

Comment: InnoDB free: 492544 kB

It does not matter which InnoDB table is used in the query.

3. Check the schema and the name of an arbitrary InnoDB table by using the following query:

SELECT table_schema,table_name

FROM information_schema.tables

WHERE engine = 'InnoDB'

LIMIT 1;
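The arithmetic behind the estimate can be sketched in Python as follows. This is an illustrative helper, not part of the platform tooling; the function names are made up for this example. It assumes the usual MySQL form of the innodb_data_file_path value (file_name:file_size[:autoextend[:max:max_file_size]], with K, M and G suffixes) and the "InnoDB free: N kB" comment text shown in the example above.

```python
import re

# Multipliers to kB; 500M corresponds to 512'000 kB as in the example above.
_UNIT_TO_KB = {"K": 1, "M": 1024, "G": 1024 * 1024}


def _size_kb(text: str) -> float:
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip(), re.IGNORECASE)
    if not match:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = int(match.group(1)), match.group(2).upper()
    if not unit:  # a plain number is a size in bytes
        return number / 1024
    return number * _UNIT_TO_KB[unit]


def max_size_kb(data_file_path: str) -> float:
    """Sum the maximum size of every data file listed in the value."""
    total = 0.0
    for spec in data_file_path.split(";"):
        parts = spec.strip().split(":")
        if "max" in parts:
            total += _size_kb(parts[parts.index("max") + 1])
        elif "autoextend" in parts:
            raise ValueError("autoextend without max has no upper bound")
        else:
            total += _size_kb(parts[1])
    return total


def free_kb(comment: str) -> int:
    """Extract the free space from the Comment column of SHOW TABLE STATUS."""
    match = re.search(r"InnoDB free:\s*(\d+)\s*kB", comment)
    if not match:
        raise ValueError("no 'InnoDB free' figure in comment")
    return int(match.group(1))


def fill_ratio_percent(data_file_path: str, comment: str) -> float:
    maximum = max_size_kb(data_file_path)
    return 100.0 * (maximum - free_kb(comment)) / maximum
```

With the example values, fill_ratio_percent("ibdata1:500M", "InnoDB free: 492544 kB") gives (512000 - 492544) / 512000, that is, about 3.8%.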

4 70156 DISK DATABASE WATCHDOG START-UP FAILED

Probable cause: Configuration or Customizing Error

Event type: Processing error

Default severity: Critical

Meaning

Start-up of the disk database watchdog has failed due to a configuration error.

Because the disk database and its watchdog belong to the same recovery unit (RU), the disk database watchdog start-up failure means that the database is not available.

Identifying additional information fields

-

Additional information fields

1. Reason. Possible values:

. 1 - Disk database watchdog failed to read the parameters from parameter management.

. 2 - Invalid or missing parameter value.

2. List of invalid or missing parameters if the reason for the alarm is 2.

Instructions

Check the Application Additional Information field for a reason for the configuration error:

. Reason 1: Disk database watchdog failed to read the parameters from parameter management

. Reason 2: Invalid or missing parameter value

and continue according to the following procedure:

1. Check that the following parameters exist in parameter management for each database entry in the database fragment with the DN (Distinguished Name) "fsFragmentId=DB, fsClusterId=ClusterRoot":

fsdbRedundancyModel

fsdbDataSourceName

fsdbFillRatioAlarmLimit

fsdbFillRatioCheckFreq

2. Use Parameter Tool to get the values of those parameters for the database in question. To find those parameters, use the value of the Managed Object field in Alarm Browser, for example:

fsdbName=DB_Alarm,fsFragmentId=DB,fsClusterId=ClusterRoot

3. Send your Nokia representative the values you found and the names of any parameters that do not exist.
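The check in step 1 can be sketched in Python, assuming the parameter values of one database entry have already been read into a dictionary (how they are read is outside this sketch). The numeric check on the two fill ratio parameters is an assumption based on the testing instructions of this alarm, which trigger it with a non-numeric value; the watchdog's real validation rules are not documented here.

```python
REQUIRED_PARAMETERS = (
    "fsdbRedundancyModel",
    "fsdbDataSourceName",
    "fsdbFillRatioAlarmLimit",
    "fsdbFillRatioCheckFreq",
)

# Assumption: these two parameters must hold numeric values.
NUMERIC_PARAMETERS = ("fsdbFillRatioAlarmLimit", "fsdbFillRatioCheckFreq")


def invalid_or_missing(entry: dict) -> list:
    """Return the names of required parameters that are missing or invalid."""
    problems = []
    for name in REQUIRED_PARAMETERS:
        value = entry.get(name)
        if value is None or str(value).strip() == "":
            problems.append(name)
        elif name in NUMERIC_PARAMETERS:
            try:
                float(value)
            except (TypeError, ValueError):
                problems.append(name)
    return problems
```

A non-empty result corresponds to reason 2 (invalid or missing parameter value), and the returned names are the kind of list to pass on to your Nokia representative.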

Clearing

Clear the alarm with Alarm Browser after correcting the fault as presented in Instructions.

Testing instructions

The easiest way to raise the alarm is to use a parameter management tool to change the fsdbFillRatioAlarmLimit or fsdbFillRatioCheckFreq attribute of the database to a non-numeric value and to restart the recovery group of the database (fshascli -r /<RG>).

5 70173 BACKEND DATABASE REQUIRED BY CORBA NAMING SERVICE IS UNAVAILABLE

Probable cause: Underlying Resource Unavailable

Event type: Processing error

Default severity: Major

Meaning

The MySQL database instance DB_CosNaming, used by the public and private CORBA naming service (NaS) instances, cannot be contacted by the NaS wrapper. Note that the recovery group that owns the backend database is NamingServiceDB and CORBA NaS instances belong to recovery groups PrivateCosNaming and PublicCosNaming.

The CORBA NaS is not able to store data in the database. Therefore the CORBA NaS is not functional and replies to the high availability services (HAS) heartbeats with a failure indication.

Identifying additional information fields

-

Additional information fields

-

Instructions

. Check that the error situation still exists

/opt/Nokia/SS_Naming/bin/ns_listall

/opt/Nokia/SS_Naming/bin/ns_listall public

These commands should list the content of the private and the public naming graphs if the NaS is working correctly. If these commands throw exceptions, the NaS is not working correctly, which may result, for example, from an unavailable backend database.

. Check if the backend database DB_CosNaming (RG NamingServiceDB) is unlocked and active.

fshascli -s /NamingServiceDB

If the NamingServiceDB is locked, unlock it.

fshascli -u /NamingServiceDB

After a few seconds the database should have restarted and the NaS should have automatically re-established connections. Verify the restart and the re-established connections by issuing the ns_listall commands mentioned above.

. If this does not solve the problem, there is something wrong with the database deployment or configuration. In that case, the alarm 70156 DISK DATABASE WATCHDOG START-UP FAILED should also be raised by the MySQL DB watchdog dedicated to the DB_CosNaming database instance.

The following steps describe the error checking procedure if the NamingServiceDB RG fails (see alarm description 70156 DISK DATABASE WATCHDOG START-UP FAILED for more information).

1. Check the master-syslog for any indication of errors.

less /var/log/master-syslog

2. Check that the LDAP server is up and running.

. Check that the RG owning the LDAP server is unlocked.

fshascli -s /Directory

. Check that the LDAP server is really working by listing the content of the LDAP tree (CTRL-C aborts the listing).

ldapsearch

3. If the LDAP is working correctly, check that the DB directory mount is functional:

. Lock the NamingServiceDB RG (if not yet locked).

. Mount the database directory manually.

a. Create the SW RAID (md device) on which the DB_CosNaming directory is stored.

create_sw_raid /dev/md8 \

/dev/VG_62/MySQL_DB_CosNaming \

/dev/VG_63/MySQL_DB_CosNaming

Note that the device paths given as arguments above may be different in your system.

Check the correct device paths from:

/opt/Nokia_BP/etc/ldapfile/ldif_in/PFSAN*.ldif

The device paths are defined under an entry defining the FSHWSWRAID object class for the NaS:

dn: fshwStorageResourceName=/dev/md8, fshwSANName=0,

fsFragmentId=HW, fsClusterId=ClusterRoot

fshwStorageResourceName: /dev/md8

objectClass: FSHWStorageResource

objectClass: FSHWSWRAID

objectClass: extensibleObject

fshwRAIDLevel: 1

fshwPartitionName: /dev/VG_62/MySQL_DB_CosNaming

fshwPartitionName: /dev/VG_63/MySQL_DB_CosNaming

fsUserComment: MySQL DB for CORBA Naming Service

b. Mount the directory.

mkdir /tmp/tmp_nasDB

mount /dev/md8 /tmp/tmp_nasDB

Remember to unmount the directory and to stop the md device after the following checks have been performed (see the last step).

4. Check that the database disk content is accessible and readable

ls -la /tmp/tmp_nasDB

5. Check that the my.cnf and odbc.ini files exist in that directory and have read access rights. Check also that these files are identical to those under the SS_Naming home directory.

diff /tmp/tmp_nasDB/odbc.ini /opt/Nokia/SS_Naming/etc/odbc.ini

diff /tmp/tmp_nasDB/my.cnf /opt/Nokia/SS_Naming/etc/my.cnf

6. Check the mysql.err file for any error indications. This file is also located in the /tmp/tmp_nasDB directory.

7. Remove the mount and stop the md devices

a. Unmount and remove the directory.

umount /tmp/tmp_nasDB

rmdir /tmp/tmp_nasDB

b. Stop the md device.

mdadm --manage -S /dev/md8

If any of the preceding checks fail, a major software failure exists in the system. In that case, contact your Nokia representative with the information gathered during the preceding steps.
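Looking up the device paths for step 3a can be sketched as a small Python helper that reads LDIF text and returns the fshwPartitionName values of the FSHWSWRAID entry for a given md device. This is a simplified illustration, not a full LDIF parser: it assumes entries are separated by blank lines and that folded lines start with a single space, as in standard LDIF. The function names and the shortened sample entry are made up for this example.

```python
def unfold(ldif_text: str) -> list:
    """Join LDIF continuation lines (lines starting with one space)."""
    lines = []
    for raw in ldif_text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def swraid_partitions(ldif_text: str, device: str) -> list:
    """Return the partitions of the SW RAID entry that describes `device`."""
    entries, current = [], []
    for line in unfold(ldif_text) + [""]:
        if line.strip():
            current.append(line)
        elif current:
            entries.append(current)
            current = []

    for entry in entries:
        attrs = [
            tuple(part.strip() for part in line.split(":", 1))
            for line in entry
            if ":" in line and not line.startswith("#")
        ]
        classes = {value for key, value in attrs if key == "objectClass"}
        names = [value for key, value in attrs if key == "fshwStorageResourceName"]
        if "FSHWSWRAID" in classes and device in names:
            return [value for key, value in attrs if key == "fshwPartitionName"]
    return []


SAMPLE = (
    "dn: fshwStorageResourceName=/dev/md8, fshwSANName=0,\n"
    " fsFragmentId=HW, fsClusterId=ClusterRoot\n"
    "fshwStorageResourceName: /dev/md8\n"
    "objectClass: FSHWStorageResource\n"
    "objectClass: FSHWSWRAID\n"
    "fshwRAIDLevel: 1\n"
    "fshwPartitionName: /dev/VG_62/MySQL_DB_CosNaming\n"
    "fshwPartitionName: /dev/VG_63/MySQL_DB_CosNaming\n"
)
```

Calling swraid_partitions(SAMPLE, "/dev/md8") returns the two partition paths that are passed to create_sw_raid in step 3a.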

Clearing

HAS clears the alarm automatically when it has detected the NaS to be faulty and therefore restarted the PrivateCosNaming and PublicCosNaming recovery groups.

However, if the backend database remains faulty, the alarm is raised again. This may result in a restart loop constantly raising the same alarm. Therefore, if the problem seems to be permanent, it is recommended to lock the NaS and the database recovery groups with the following commands:

fshascli -l /NamingServiceDB

fshascli -l /PublicCosNaming

fshascli -l /PrivateCosNaming

and to clear the alarm manually before performing the steps for solving the error.

6 70245 ILLEGAL INTERNAL USAGE OF EXTERNAL ALARM NOTIFICATION FORMAT

Probable cause: Software Program Error

Event type: Processing error

Default severity: Major

Meaning

An application raised or cleared an alarm containing an internal application Id and provided its own alarm time. An application is allowed to provide the alarm time only for external alarms (alarms with an external application Id). This alarm is also raised if the application raised or cleared an alarm containing an external application Id but did not provide its own alarm time.

The original alarm is discarded.
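The two conditions in the Meaning section reduce to a single rule: the alarm time must be provided exactly when the application Id is external. A minimal Python sketch of that rule follows. How the alarm system decides whether an application Id is internal or external is not described here, so that decision is passed in as a flag, and the function name is hypothetical.

```python
def is_illegal_usage(application_id_is_external: bool, provides_alarm_time: bool) -> bool:
    """Return True for the combinations that lead to alarm 70245."""
    # Internal Id with own alarm time, or external Id without it.
    return application_id_is_external != provides_alarm_time
```

Of the four possible combinations, only "internal without alarm time" and "external with alarm time" are accepted.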

Identifying additional information fields

Data from the original alarm:

1. Managed Object Id

2. Specific problem

3. Identifying application additional information

(Application Id is present in the Managed Object Id field of the alarm)

Additional information fields

-

Instructions

Fill in a problem report with the alarm data and send it to your Nokia representative.

Clearing

Clear the alarm with Alarm Browser after correcting the fault as presented in Instructions.

Testing instructions

1. Create a text file containing the following single row:

2005 Jul 20 22:09:00 ALARM RAISE SP=70156 \

MO=fshaProcessInstanceName=SolidWDforAlarmType,\

fshaRecoveryUnitName=FSAlarmDBServer,fsipHostName=WAS,\

fsFragmentId=Nodes,fsFragmentId=HA,fsClusterId=ClusterRoot \

AP=fshaProcessInstanceName=SolidWDforAlarmType,\

fshaRecoveryUnitName=FSAlarmDBServer,fsipHostName=WAS,\

fsFragmentId=Nodes,fsFragmentId=HA,fsClusterId=ClusterRoot \

SE=5 IINFO="1 AlarmDB /var/mnt/local/Solid_DB_Alarm" TIME=E1124564940350

2. Use the Parameter Tool to note down the value of the fsParameterId=fsLogFileName,fsAlarmProcessorConfigurationId=Default,fsAlarmProcessorId=AlarmProcessor1, fsFragmentId=AlarmProcessors, fsFragmentId=AlarmMgmt, fsClusterId=ClusterRoot attribute in the alarm processor LDAP configuration and replace it with the name of the created file.

3. Restart alarm processor with the following command:

fshascli -r /<Node>/FSAlarmSystemServer/AlarmProcessor

where <Node> is the name of the node where alarm processor is deployed.

4. After verifying that an alarm for the situation has been raised, clear it with Alarm Browser.

5. Use the Parameter Tool to restore the original name of the alarm log file.

6. Restart alarm processor again.
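To inspect a record like the one created in step 1 before feeding it to the alarm processor, the fields can be split with a short Python sketch. The grammar of the alarm log is not specified in this document, so the parsing rules here are assumptions derived from the example row only: leading tokens without "=" are treated as a header, the remaining tokens as KEY=value pairs, and double quotes group a value that contains spaces. The function name and the shortened sample are hypothetical.

```python
import shlex


def parse_alarm_record(text: str):
    """Split one alarm record into header tokens and KEY=value fields."""
    # Undo the backslash line continuations used to print the single row.
    joined = text.replace("\\\n", "")
    tokens = shlex.split(joined)  # honours the quotes around IINFO
    header = [token for token in tokens if "=" not in token]
    fields = dict(token.split("=", 1) for token in tokens if "=" in token)
    return header, fields


SAMPLE = (
    "2005 Jul 20 22:09:00 ALARM RAISE SP=70156 \\\n"
    "MO=fsFragmentId=HA,\\\n"
    "fsClusterId=ClusterRoot \\\n"
    'SE=5 IINFO="1 AlarmDB /var/mnt/local/Solid_DB_Alarm" TIME=E1124564940350'
)
```

If the quotes around the IINFO value were missing, the value would be split into several tokens, which is the kind of format error that alarm 70005 INCORRECT ALARM DATA describes.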

7 70256 RESOURCE ALLOCATION OR DE-ALLOCATION FAILURE

Probable cause: Software Program Abnormally Terminated

Event type: Processing error

Default severity: Major

Meaning

Allocation or deallocation of resources to or from a computer node in the cluster has failed.

Applications running in the cluster are often identified with resources that are allocated to the node before the application is started and released from the node after the application has terminated. Such resources can, for example, be TCP/IP addresses that are associated with the service provided by the software or a disk partition that, for example, contains the application database. In addition, the application can allocate and deallocate other resources (for example, start and stop 3rd party applications) in its control scripts.

An operation failure has been reported for the defined recovery unit while it was starting or stopping.

If the error occurred when an application was starting, application start-up is aborted. In case of a permanent fault, the service provided by the application is now down. With a transient or node-specific fault, and provided that the application has a standby, the application may have been restarted successfully on another node.

If the fault happened while the application was terminating, the node on which the error happened has now been restarted to restore it to a known state. If the node has restarted successfully or the application has a standby resource, the application has likely already restarted, and service is again available.

Identifying additional information fields

-

Additional information fields

1. Name of the recovery group to which the recovery unit belongs. For example, "/Directory".

2. Situation when the failure happened: string "allocating" or "de-allocating"

3. Type of the resource allocation: "IP(address)", "disk(mount point)" or "ctrlscript". For example, "IP(192.1.1.78)" or "disk(sysimg)".

4. Only present if argument 3 is "ctrlscript". Contains the name of the control script that reported the failure. For example, "RUControlDirectoryServer.sh"
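When alarm data is post-processed, for example for a problem report, fields 3 and 4 can be decoded with a short Python sketch. The three accepted forms come from the list above; the function name and the shape of the returned dictionary are illustrative choices.

```python
import re
from typing import Optional

_RESOURCE = re.compile(r"^(IP|disk)\((.*)\)$")


def parse_resource(field3: str, field4: Optional[str] = None) -> dict:
    """Decode the resource type field and, for control scripts, field 4."""
    if field3 == "ctrlscript":
        # Field 4 carries the name of the control script that failed.
        return {"kind": "ctrlscript", "detail": field4}
    match = _RESOURCE.match(field3)
    if not match:
        raise ValueError(f"unrecognised resource type: {field3!r}")
    return {"kind": match.group(1), "detail": match.group(2)}
```

For example, parse_resource("IP(192.1.1.78)") yields the kind "IP" with the address as detail, and parse_resource("ctrlscript", "RUControlDirectoryServer.sh") yields the script name.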

Instructions

1. Log into the cluster as root user to check the situation.

2. Use the fshascli command to check the state of all recovery units within the recovery group (the name of the recovery group is in the Application Additional Information field).

If the recovery group is providing service, each of its UNLOCKED recovery units that has the ACTIVE role has the ENABLED operational state and an empty procedural status.

For example, the state of the recovery units of the /Directory recovery group can be checked as follows:

$ fshascli --state $(fshascli -children /Directory |

grep -vE "\/.+\/.+\/" )

/CLA-0/FSDirectoryServer:

administrative(UNLOCKED)

operational(ENABLED)

usage(IDLE)

procedural(NOTINITIALIZED)

availability()

unknown(FALSE)

alarm()

role(COLDSTANDBY)

/CLA-1/FSDirectoryServer:

administrative(UNLOCKED)

operational(ENABLED)

usage(ACTIVE)

procedural()

availability()

unknown(FALSE)

alarm()

role(ACTIVE)

In the above case, the recovery unit of the CLA-0 node is acting as a cold standby backup and the recovery unit on CLA-1 is running the service normally.

Note that the grep command in the example is used to filter out information regarding individual processes in each recovery unit.

Since this situation may be caused by various faults, contact your Nokia representative to analyse the root cause.
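The rule from step 2 can be applied to captured fshascli --state output with a Python sketch like the following. The parser is based only on the output layout shown in the example above (a recovery unit name on its own line, followed by name(value) attribute lines); other layouts are not handled, and the function names are hypothetical.

```python
import re

_ATTRIBUTE = re.compile(r"(\w+)\((.*?)\)(?:\s*<==.*)?")


def parse_state(output: str) -> dict:
    """Map each recovery unit name to its attribute values."""
    units, current = {}, None
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("/"):
            current = line.rstrip(":")
            units[current] = {}
            continue
        match = _ATTRIBUTE.fullmatch(line)
        if match and current is not None:
            units[current][match.group(1)] = match.group(2)
    return units


def serving_units(units: dict) -> list:
    """Recovery units that are UNLOCKED, ACTIVE, ENABLED, with empty procedural status."""
    return [
        name
        for name, state in units.items()
        if state.get("administrative") == "UNLOCKED"
        and state.get("role") == "ACTIVE"
        and state.get("operational") == "ENABLED"
        and state.get("procedural") == ""
    ]
```

Applied to the example output, serving_units returns only /CLA-1/FSDirectoryServer, because the CLA-0 unit has the COLDSTANDBY role and a NOTINITIALIZED procedural status. An empty result suggests that the recovery group is not providing service.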

Clearing

Clear the alarm manually after the problem has been solved.

Testing instructions

Simulate an IP address allocation failure

1. An IP address allocation failure can be caused by manually allocating an IP address to a node before a recovery unit is started. Select a cold active/standby recovery group (but do not use the Directory recovery group) that has an IP address associated with it, and allocate the address to the standby node. For example:

$ fshascli --state /CLA-0/FSClusterDNSServer

/CLA-0/FSClusterDNSServer

administrative(UNLOCKED) <== Unlocked

operational(ENABLED) <== Operational

usage(IDLE)

procedural(NOTINITIALIZED)

availability()

unknown(FALSE)

alarm()

role(COLDSTANDBY)

$ grep ClusterDNS /etc/hosts

192.168.2.255 ClusterDNS

. . .

$ ip addr show | grep 192.168.2.255

inet 192.168.2.255/23 scope global secondary bond0

inet fe80::192:168:2:255/10 scope link

$ ssh cla-0

Last login: . . .

$ ip address add 192.168.2.255/23 dev bond0

2. Issue a switchover for the recovery group so that the service attempts to move to the node that already has the IP address. For example:

$ fshascli --switchover /ClusterDNS

The switchover fails and the alarm is raised. The alarm is visible, for example, in the alarm log. Note that you have to clear the alarm manually.

3. Remove the IP address that you added manually or reboot the node. For example:

$ ip address del 192.168.2.255/23 dev bond0
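To confirm that the manually added address is present before the switchover and gone after the clean-up, captured ip addr show output can be checked with a small Python sketch. It compares the address field exactly, whereas the grep pattern used in step 1 treats each dot as "any character", which is why the fe80::192:168:2:255 line also appears in that example. The function name is hypothetical.

```python
def address_assigned(ip_addr_output: str, address: str) -> bool:
    """Return True if `address` appears as an inet/inet6 address in the output."""
    for line in ip_addr_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] in ("inet", "inet6"):
            # The second field is address/prefix-length.
            if fields[1].split("/")[0] == address:
                return True
    return False
```

Run it on the output captured on the standby node: it should return True after step 1 and False after the address has been removed in step 3.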
