Replacing McRNC Hardware Units
WCDMA RAN, Rel. RU40, Operating Documentation, Issue 05
Replacing Multicontroller RNC Hardware Units
DN09109953
Issue 02A
Approval Date 2014-02-10
The information in this document is subject to change without notice and describes only the product defined in the introduction of this documentation. This documentation is intended for the use of Nokia Solutions and Networks customers only for the purposes of the agreement under which the document is submitted, and no part of it may be used, reproduced, modified or transmitted in any form or means without the prior written permission of Nokia Solutions and Networks. The documentation has been prepared to be used by professional and properly trained personnel, and the customer assumes full responsibility when using it. Nokia Solutions and Networks welcomes customer comments as part of the process of continuous development and improvement of the documentation.
The information or statements given in this documentation concerning the suitability, capacity, or performance of the mentioned hardware or software products are given "as is" and all liability arising in connection with such hardware or software products shall be defined conclusively and finally in a separate agreement between Nokia Solutions and Networks and the customer. However, Nokia Solutions and Networks has made all reasonable efforts to ensure that the instructions contained in the document are adequate and free of material errors and omissions. Nokia Solutions and Networks will, if deemed necessary by Nokia Solutions and Networks, explain issues which may not be covered by the document.
Nokia Solutions and Networks will correct errors in this documentation as soon as possible. IN NO EVENT WILL Nokia Solutions and Networks BE LIABLE FOR ERRORS IN THIS DOCUMENTATION OR FOR ANY DAMAGES, INCLUDING BUT NOT LIMITED TO SPECIAL, DIRECT, INDIRECT, INCIDENTAL OR CONSEQUENTIAL OR ANY LOSSES, SUCH AS BUT NOT LIMITED TO LOSS OF PROFIT, REVENUE, BUSINESS INTERRUPTION, BUSINESS OPPORTUNITY OR DATA, THAT MAY ARISE FROM THE USE OF THIS DOCUMENT OR THE INFORMATION IN IT.
This documentation and the product it describes are considered protected by copyrights and other intellectual property rights according to the applicable laws.
NSN is a trademark of Nokia Solutions and Networks. Nokia is a registered trademark of Nokia Corporation. Other product names mentioned in this document may be trademarks of their respective owners, and they are mentioned for identification purposes only.
Copyright © Nokia Solutions and Networks 2014. All rights reserved
Important Notice on Product Safety
This product may present safety risks due to laser, electricity, heat, and other sources of danger.
Only trained and qualified personnel may install, operate, maintain or otherwise handle this product and only after having carefully read the safety information applicable to this product.
The safety information is provided in the Safety Information section in the “Legal, Safety and Environmental Information” part of this document or documentation set.
Nokia Solutions and Networks is continually striving to reduce the adverse environmental effects of its products and services. We would like to encourage you as our customers and users to join us in working towards a cleaner, safer environment. Please recycle product packaging and follow the recommendations for power use and proper disposal of our products and their components.
If you should have questions regarding our Environmental Policy or any of the environmental services we offer, please contact us at Nokia Solutions and Networks for any additional information.
Table of Contents
This document has 55 pages

Summary of changes
1 Replacing the faulty chassis in a running system
1.1 Replacing a faulty chassis
1.1.1 Removing the faulty chassis from the running system
1.1.1.1 Steps
1.1.2 Installing the new chassis
1.1.2.1 Steps
2 Replacing the hard disk drive on hard disk drive carrier AMC
2.1 Removing the faulty hard disk drive
2.1.1 Steps
2.2 Installing the new hard disk drive
2.2.1 Steps
3 Replacing the failed hard disk drives on both CFPU nodes
3.1 Removing the faulty hard disk drives
3.2 Installing the new hard disk drives
4 Replacing an AMC
4.1 Removing an AMC
4.2 Installing an AMC
5 Replacing a fan module
5.1 Removing a fan module
5.1.1 Steps
5.2 Installing a fan module
5.2.1 Steps
6 Replacing an add-in card
6.1 Removing an add-in card
6.1.1 Steps
6.2 Installing an add-in card
6.2.1 Steps
7 Replacing a power distribution unit
7.1 Removing a power distribution unit (PDU)
7.1.1 Steps
7.2 Installing a power distribution unit (PDU)
7.2.1 Steps
8 Replacing a power supply unit
8.1 Removing a power supply unit
8.1.1 Steps
8.2 Installing a power supply unit
8.2.1 Steps
9 Replacing the air filter
10 Dealing with sensor alarms
11 Communication between active and standby units in a BCN cluster fails
11.1 Description
11.2 Symptoms
11.3 Recovery procedures
List of Figures
Figure 1 The SAS/SATA switch in the HDSAM-A
Figure 2 The hard disk drive on the hard disk drive carrier AMC
Figure 3 Installing a hard disk drive on the hard disk drive carrier AMC
Figure 4 The SAS/SATA switch in the HDSAM-A
Figure 5 The hard disk drive on the hard disk drive carrier AMC
Figure 6 Installing a hard disk drive on the hard disk drive carrier AMC
Figure 7 Pulling the hot swap handle of an AMC
Figure 8 Removing an AMC from the BCN module
Figure 9 Inserting an AMC into the BCN module
Figure 10 Pressing the hot swap handle
Figure 11 BCN top cover screws
Figure 12 Removing the BCN top cover
Figure 13 BCN add-in card screws
Figure 14 Pulling an add-in card out from the BCN module
Figure 15 Inserting an add-in card into BCN module
Figure 16 BCN add-in card screws
Figure 17 Installing BCN top cover
Figure 18 Power distribution units in the cabinet
Figure 19 Replacing a PDU
Figure 20 Installing PDU to the cabinet
Figure 21 PDU grounding cable
Figure 22 Unscrewing the two thumbscrews
Figure 23 Opening the air filter cover and pulling out the air filter
Figure 24 Positions of the PSUs and fan trays
Summary of changes
Changes between document issues are cumulative. Therefore, the latest document issue contains all changes made to previous issues.
See Guide to WCDMA RAN and I-HSPA Documentation.
Changes made between issues 02 (RU40) and 02A (RU40)
Instructions for a graceful shutdown have been added when replacing an add-in card.
Changes made between issues 01B (RU30) and 02 (RU40)
Instructions apply to both BCN-A and BCN-B hardware. The example display outputs have been updated and may vary slightly as a result.
Changes made between issues 01A (RU30) and 01B (RU30)
Replacing the hard disk drive on hard disk drive carrier AMC has been updated to include verification steps.
has been added.
1 Replacing the faulty chassis in a running system
1.1 Replacing a faulty chassis
Purpose
In a multi-chassis environment (two or more chassis), if an existing chassis is faulty, you need to remove it and install a new chassis in its place.
Before you start
Ensure that:
• The embedded software is upgraded to the required version in the replacement chassis.
• The initial LMP settings, such as the switch configuration and the backplane resiliency configuration, are properly configured for the replacement chassis. For more information, see Commissioning Multicontroller RNC.
1.1.1 Removing the faulty chassis from the running system
1.1.1.1 Steps
1 Identify the chassis to be replaced in the running system.
In this section, the chassis to be replaced is chassis-2.

2 Check all the nodes that are running in the chassis to be replaced.
To check all the running nodes present in the chassis to be replaced, enter the following command:
show hardware state list
The following output is displayed:
root@CFPU-0 [RNC-89] > show hardware state list
cabinet-1 : unit /cabinet-1
chassis-1 : unit /cabinet-1/chassis-1
chassis-2 : unit /cabinet-1/chassis-2
LMP-1-1-1 : node available /cabinet-1/chassis-1/piu-1
LMP-1-2-1 : node available /cabinet-1/chassis-2/piu-1
CFPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core -0,1,10,2,3,4,5,6,7,8,9
CSPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU-1/core -0,1,2,3,4,5
USPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU-1/core -0,1,2,3,4
EIPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU-1/core -0,1,2,3,4,5
CSPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU-1/core -0,1,2,3,4,5
USPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU-1/core -0,1,2,3,4
USPU-4 : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU-1/core -0,1,2,3,4
EIPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU-1/core -0,1,2,3,4,5
USSR-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core -11
CSUP-0 : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU-1/core -10,11,6,7,8,9
USUP-0 : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU-1/core -10,11,5,6,7,8,9
EITP-0 : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU-1/core -10,11,6,7,8,9
CSUP-2 : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU-1/core -10,11,6,7,8,9
USUP-2 : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU-1/core -10,11,5,6,7,8,9
USUP-4 : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU-1/core -10,11,5,6,7,8,9
EITP-2 : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU-1/core -10,11,6,7,8,9
CFPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
CSPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU-1/core
USPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU-1/core
EIPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU-1/core
CSPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU-1/core
USPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU-1/core
USPU-5 : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU-1/core
EIPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU-1/core
USSR-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
CSUP-1 : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU-1/core
USUP-1 : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU-1/core
EITP-1 : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU-1/core
CSUP-3 : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU-1/core
USUP-3 : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU-1/core
USUP-5 : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU-1/core
EITP-3 : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU-1/core
cluster : cluster available
The output shows that nodes CFPU-1, CSPU-1, USPU-1, EIPU-1, CSPU-3, USPU-3, USPU-5, EIPU-3, USSR-1, CSUP-1, USUP-1, EITP-1, CSUP-3, USUP-3, USUP-5, and EITP-3 are present in the chassis to be replaced.
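When the listing is long, the nodes of one chassis can be picked out with a small script instead of by eye. The following Python sketch is only an illustration and is not part of the product. It assumes that the output of show hardware state list has already been captured as text and that every node line has the shape "name : node state path" seen in the example output. The function name nodes_in_chassis and the shortened sample text are hypothetical.

```python
import re

# Assumed line shape, taken from the example output of "show hardware state list":
#   CFPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/...
NODE_LINE = re.compile(r"^(?P<name>\S+)\s*:\s*node\s+\S+\s+(?P<path>/\S+)")


def nodes_in_chassis(listing: str, chassis: str) -> list:
    """Return the names of all nodes whose hardware path is inside `chassis`."""
    found = []
    for line in listing.splitlines():
        match = NODE_LINE.match(line.strip())
        if match is None:
            continue  # cabinet, chassis, and cluster lines are skipped
        # Compare whole path components so that chassis-2 never matches chassis-20.
        if chassis in match.group("path").split("/"):
            found.append(match.group("name"))
    return found


# Shortened, hypothetical sample in the same shape as the example output.
SAMPLE = """\
chassis-2 : unit /cabinet-1/chassis-2
LMP-1-2-1 : node available /cabinet-1/chassis-2/piu-1
CFPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core
CFPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
USSR-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
cluster : cluster available
"""

print(nodes_in_chassis(SAMPLE, "chassis-2"))
```

Because the LMP of the chassis is also reported as a node, it appears in the result together with the application nodes.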
3 Check if the current SCLI session is running on a node located in the chassis to be replaced.
If the SCLI session is running on a node (the node name is shown in the prompt) located in the chassis to be replaced, perform a switchover for the /SSH recovery group to keep connectivity to the cluster during the chassis replacement. To perform a switchover for the /SSH recovery group, enter the following command:
set has switchover force managed-object /SSH
Note: The SSH connection breaks when the switchover command is executed, and the SSH session must be started again.
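The decision in this step depends only on the node name in the prompt and on the list of nodes in the chassis to be replaced. A minimal Python sketch of that decision is shown here for illustration. It assumes the prompt has the shape "root@CFPU-0 [RNC-89] >" used in the examples of this document; the function names session_node and needs_ssh_switchover are hypothetical.

```python
import re
from typing import Iterable, Optional

# Assumed prompt shape, taken from the examples in this document:
#   root@CFPU-0 [RNC-89] >
PROMPT = re.compile(r"^\S+@(?P<node>\S+)\s+\[")


def session_node(prompt: str) -> Optional[str]:
    """Extract the node name from an SCLI prompt, or None if it is not recognized."""
    match = PROMPT.match(prompt.strip())
    return match.group("node") if match else None


def needs_ssh_switchover(prompt: str, chassis_nodes: Iterable[str]) -> bool:
    """True when the session runs on a node located in the chassis to be replaced."""
    node = session_node(prompt)
    return node is not None and node in set(chassis_nodes)


chassis_2_nodes = ["CFPU-1", "CSPU-1", "USPU-1", "EIPU-1"]
print(needs_ssh_switchover("root@CFPU-0 [RNC-89] >", chassis_2_nodes))
print(needs_ssh_switchover("root@CFPU-1 [RNC-89] >", chassis_2_nodes))
```

If the function returns True, the /SSH switchover described in this step is needed before the chassis is removed.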
4 Disable cluster manager nodes located in the chassis to be replaced.
To identify the nodes configured as a cluster manager, enter the following command:
show has view managed-object /ClusterHA
The following output is displayed:
/ClusterHA:
RecoveryGroup /ClusterHA
specialConstraints=(serviceInterruptionDenied)
RecoveryUnit /CFPU-0/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-0/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)
RecoveryUnit /CFPU-1/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-1/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)
The output indicates that the /ClusterHA recovery group has recovery units for the CFPU-0 and CFPU-1 nodes. Hence, the CFPU-0 and CFPU-1 nodes are configured as cluster manager nodes.
In this example, one cluster manager node (CFPU-1) is located in the chassis to be replaced. Hence, the following steps must be executed:
a) Disable the Cluster Management Functionality (CMF) on the CFPU-1 node. Enter the following command:
set cmf disable node-name /CFPU-1
The following output is displayed:
Cluster management functionality disabled on host CFPU-1
b) Check that the CFPU-1 node, where CMF was disabled, has the CMF-DISABLED status and that the other cluster manager node (in this case CFPU-0) has the CMF-SERVING status. Enter the following command:
show cmf status node-name /CFPU-1
The following output must be displayed:
CFPU-1: CMF-DISABLED priority: 6
CFPU-0: CMF-SERVING priority: 5
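The status check in step 4 b) can be expressed as a comparison between the reported and the expected CMF status of each cluster manager node. The following Python sketch illustrates this under the assumption that each status line has the shape "node: status priority: number", as in the example output. The names parse_cmf_status and cmf_status_matches are hypothetical helpers, not product commands.

```python
import re

# Assumed line shape, taken from the example output of "show cmf status":
#   CFPU-1: CMF-DISABLED priority: 6
STATUS_LINE = re.compile(r"^(?P<node>\S+):\s+(?P<status>CMF-[A-Z]+)\s+priority:\s+\d+$")


def parse_cmf_status(output: str) -> dict:
    """Map each node name to the CMF status reported for it."""
    statuses = {}
    for line in output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            statuses[match.group("node")] = match.group("status")
    return statuses


def cmf_status_matches(output: str, expected: dict) -> bool:
    """True when every node in `expected` reports exactly the expected status."""
    statuses = parse_cmf_status(output)
    return all(statuses.get(node) == status for node, status in expected.items())


after_disable = "CFPU-1: CMF-DISABLED priority: 6\nCFPU-0: CMF-SERVING priority: 5"
print(cmf_status_matches(after_disable,
                         {"CFPU-1": "CMF-DISABLED", "CFPU-0": "CMF-SERVING"}))
```

The same comparison can be reused with a different expected mapping when CMF is enabled again on the new chassis.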
5 Lock all the managed objects in the chassis to be replaced.
To lock all the managed objects, enter the following command:
set has lock managed-object <mo-name1> <mo-name2>...
Note: The SE nodes are not managed and therefore they are not locked.
Example
To lock all the managed nodes in chassis-2, enter the following command:
set has lock managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
If the nodes are successfully locked, the following output is displayed:
root@CFPU-0 [RNC-1002] > set has lock managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
/CFPU-1 locked successfully, 1 services activated on standby node(s)
/CSPU-1 locked successfully, 1 services activated on standby node(s)
/USPU-1 locked successfully
/EIPU-1 locked successfully, 3 services activated on standby node(s)
/CSPU-3 locked successfully, 1 services activated on standby node(s)
/USPU-3 locked successfully
/USPU-5 locked successfully
/EIPU-3 locked successfully, 3 services activated on standby node(s)
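The lock, power off, and unlock commands in this procedure all take the same list of managed objects, so the command lines can be generated from one list to avoid typing mistakes. The Python sketch below is an illustration only. The function name has_command is hypothetical, and the node list is the chassis-2 example from this section.

```python
# Managed nodes of chassis-2 in the example of this section.
MANAGED_NODES = ["CFPU-1", "CSPU-1", "USPU-1", "EIPU-1",
                 "CSPU-3", "USPU-3", "USPU-5", "EIPU-3"]

# Only the actions used in this procedure are accepted.
ALLOWED_ACTIONS = ("lock", "power off", "unlock")


def has_command(action: str, nodes: list) -> str:
    """Build a 'set has <action> managed-object ...' command line."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if not nodes:
        raise ValueError("at least one managed object is required")
    targets = " ".join(f"/{name}" for name in nodes)
    return f"set has {action} managed-object {targets}"


print(has_command("lock", MANAGED_NODES))
print(has_command("power off", MANAGED_NODES))
```

The sketch only builds the text of the commands. Entering them in the SCLI session and checking the output remains a manual step.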
6 Power off all the managed objects in the chassis to be replaced.
To power off the managed objects, enter the following command:
set has power off managed-object <mo-name1> <mo-name2>...
Example
To power off all the managed nodes in chassis-2, enter the following command:
set has power off managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
If the nodes are successfully powered off, the following output is displayed:
/CFPU-1 is powered OFF successfully
/CSPU-1 is powered OFF successfully
/USPU-1 is powered OFF successfully
/EIPU-1 is powered OFF successfully
/CSPU-3 is powered OFF successfully
/USPU-3 is powered OFF successfully
/USPU-5 is powered OFF successfully
/EIPU-3 is powered OFF successfully
7 Verify that all the managed objects in the chassis to be replaced are powered off.
To verify the availability of a managed node, enter the following command:
show has state availability managed-object <mo-name>
Example
To verify that the availability status of all managed nodes in chassis-2 is POWEROFF, enter the following command:
show has state availability managed-object /CFPU-1 /CSPU-1 \
/CSPU-3 /EIPU-1 /EIPU-3 /USPU-1 /USPU-3 /USPU-5
The following output must be displayed:
OBJECT AVAILABILITY
/CFPU-1 POWEROFF
/CSPU-1 POWEROFF
/CSPU-3 POWEROFF
/EIPU-1 POWEROFF
/EIPU-3 POWEROFF
/USPU-1 POWEROFF
/USPU-3 POWEROFF
/USPU-5 POWEROFF
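Before any cable is disconnected, it is worth confirming that no node was missed in the table. The following Python sketch shows one way to do that. It assumes the captured output has one "object state" pair per line under a header row, as in the example; the function names parse_state_table and nodes_not_in_state are hypothetical.

```python
def parse_state_table(output: str) -> dict:
    """Map each managed object (for example /CFPU-1) to its reported state."""
    states = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows have two fields and start with "/"; the header row does not.
        if len(fields) == 2 and fields[0].startswith("/"):
            states[fields[0]] = fields[1]
    return states


def nodes_not_in_state(output: str, expected_nodes: list, wanted: str) -> list:
    """Return the nodes that are missing from the table or not in the wanted state."""
    states = parse_state_table(output)
    return [node for node in expected_nodes if states.get(node) != wanted]


sample = """\
OBJECT AVAILABILITY
/CFPU-1 POWEROFF
/CSPU-1 POWEROFF
"""

# /USPU-1 is absent from the sample, so it is reported as not verified.
print(nodes_not_in_state(sample, ["/CFPU-1", "/CSPU-1", "/USPU-1"], "POWEROFF"))
```

An empty list means that every expected node was reported in the wanted state. A node that does not appear in the output at all is treated as a failure rather than silently ignored.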
8 Disconnect all the cables connected to the chassis to be replaced.
To disconnect the cables connected to the chassis to be replaced, follow these steps:
a) If a PDU is used with the BCN module, switch off the circuit breaker on the PDU for the BCN module in question.
b) Disconnect the power feed cables.
c) Disconnect all the cables from the transceivers on the front side of the chassis.
d) Disconnect the BCN grounding cable.
e) Keep the network cables attached to the front cable tray of the chassis.
f) Uninstall the cable tray with attached network cables from the BCN module.
The cable tray is uninstalled by unscrewing the two thumbscrews fixing the cable tray to the BCN module. If the screws are too tight to be opened by hand, loosening the screws that fix the BCN module mounting flanges to the cabinet might help. For more information about detaching the cable tray, refer to the document Installing BCN Modules to the IR206 Cabinet.
g) Move the cable tray with attached network cables under the module, so that the module can be easily pulled out from the rack.
9 Remove the HDD AMC from the AMC bay.
If the chassis has an HDD AMC installed in the AMC bay, remove it. Follow the instructions in section Replacing an AMC.
10 Remove the chassis to be replaced from the rack.
1.1.2 Installing the new chassis
1.1.2.1 Steps
1 Insert the new chassis in the rack.
2 Insert the AMC back in the AMC bay.
If the chassis had an HDD AMC equipped in the AMC bay, then insert the removed HDD AMC into the same slot of the replacement chassis. Follow the instructions in section Replacing an AMC.
3 Connect all the cables to the new chassis.
a) Install the cable tray with attached network cables back to the BCN module.
For more information about the cable tray installation, see the document Installing BCN Modules to the IR206 Cabinet.
b) Connect the BCN grounding cable.
c) Connect the network cables back to the transceivers on the front side of the module.
d) Connect the power feed cables.
e) If a PDU is used with the BCN module, switch on the circuit breaker on the PDU for the BCN module in question.
4 Check that the LMP and all nodes of the new chassis are available.
To check that the LMP and all nodes of the new chassis are available, enter the following command:
show hardware state list
The following output is displayed:
root@CFPU-0 [RNC-89] > show hardware state list
cabinet-1 : unit /cabinet-1
chassis-1 : unit /cabinet-1/chassis-1
chassis-2 : unit /cabinet-1/chassis-2
LMP-1-1-1 : node available /cabinet-1/chassis-1/piu-1
LMP-1-2-1 : node available /cabinet-1/chassis-2/piu-1
CFPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core -0,1,10,2,3,4,5,6,7,8,9
CSPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU-1/core -0,1,2,3,4,5
USPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU-1/core -0,1,2,3,4
EIPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU-1/core -0,1,2,3,4,5
CSPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU-1/core -0,1,2,3,4,5
USPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU-1/core -0,1,2,3,4
USPU-4 : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU-1/core -0,1,2,3,4
EIPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU-1/core -0,1,2,3,4,5
USSR-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core -11
CSUP-0 : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU-1/core -10,11,6,7,8,9
USUP-0 : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU-1/core -10,11,5,6,7,8,9
EITP-0 : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU-1/core -10,11,6,7,8,9
CSUP-2 : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU-1/core -10,11,6,7,8,9
USUP-2 : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU-1/core -10,11,5,6,7,8,9
USUP-4 : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU-1/core -10,11,5,6,7,8,9
EITP-2 : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU-1/core -10,11,6,7,8,9
CFPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
CSPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU-1/core
USPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU-1/core
EIPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU-1/core
CSPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU-1/core
USPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU-1/core
USPU-5 : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU-1/core
EIPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU-1/core
USSR-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
CSUP-1 : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU-1/core
USUP-1 : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU-1/core
EITP-1 : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU-1/core
CSUP-3 : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU-1/core
USUP-3 : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU-1/core
USUP-5 : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU-1/core
EITP-3 : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU-1/core
cluster : cluster available
The output must display that all the nodes of the new chassis are now available.
Note: The new chassis and its nodes take some time to boot up.
5 Set up the post configuration for the LMP of the new chassis.
To set up the post configuration for the LMP of the new chassis, enter the following commands:
cd /opt/nokiasiemens/SS_FSetup/bin
./configBCNLmp.py
The following output is displayed:
INFO Copy ssh keys to LMPs.
INFO Using credential file : /mnt/state/_global/etc/credentials/BCN-LMP/root.cred
INFO Copying /tftpboot/lmp/hosts file to all LMPs.
INFO Changing the syslog.conf on all LMPs.
INFO Changing the ntp.conf on all LMPs.
INFO Configuring port monitor for all lmps.
INFO Changing the mch.conf on all LMPs.
INFO Removing bcn_sfp module loading from all LMPs.
INFO Patching fastpath reset script on all LMPs.
INFO Adding node reset init script to all LMPs.
INFO Removing PET/SNMP trap configuration on all LMPs
INFO Creating LMP configuration backup for automated configuration restore, this might take up to 5 minutes.
Then exit with Ctrl+C.
6 Unlock all the nodes in the new chassis.
To unlock all the nodes in the new chassis, enter the following command:
set has unlock managed-object <mo-name1> <mo-name2>...
Example
To unlock all the nodes in chassis-2, enter the following command:
set has unlock managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 \
/CSPU-3 /USPU-3 /USPU-5 /EIPU-3
The following output is displayed:
/CFPU-1 unlocked successfully.
/CSPU-1 unlocked successfully.
/USPU-1 unlocked successfully.
/EIPU-1 unlocked successfully.
/CSPU-3 unlocked successfully.
/USPU-3 unlocked successfully.
/USPU-5 unlocked successfully.
/EIPU-3 unlocked successfully.
7 Check that all the nodes in the new chassis are operational.
Wait for the nodes to restart. After the nodes have restarted, wait for the operational state to become OPERATIONAL (ENABLED). Enter the following command to view the operational state of the node:
show has state managed-object <mo-name1> <mo-name2>...
Example
To check that all the nodes in chassis-2 have OPERATIONAL (ENABLED) status, enter the following command:
show has state operational managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
The following output is displayed:
OBJECT OPERATIONAL
/CFPU-1 ENABLED
/CSPU-1 ENABLED
/CSPU-3 ENABLED
/EIPU-1 ENABLED
/EIPU-3 ENABLED
/USPU-1 ENABLED
/USPU-3 ENABLED
/USPU-5 ENABLED
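Because the nodes restart after unlocking, the operational state usually has to be checked several times before every node reports ENABLED. The Python sketch below illustrates such a polling loop. It is an assumption-laden example: fetch_output stands for any user-supplied function that returns the current text of the state table, the number of attempts and the delay are chosen by the operator, and the function name wait_until_enabled is hypothetical.

```python
import time
from typing import Callable, Iterable


def wait_until_enabled(fetch_output: Callable[[], str],
                       nodes: Iterable[str],
                       attempts: int,
                       delay_s: float,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll the state table until every node in `nodes` reports ENABLED."""
    wanted = set(nodes)
    for attempt in range(attempts):
        enabled = set()
        for line in fetch_output().splitlines():
            fields = line.split()
            # Data rows look like "/CFPU-1 ENABLED"; the header row is skipped.
            if len(fields) == 2 and fields[0].startswith("/") and fields[1] == "ENABLED":
                enabled.add(fields[0])
        if wanted <= enabled:
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # wait before asking again
    return False


# Simulated run: /CSPU-1 is not yet listed as ENABLED in the first reply.
replies = iter([
    "OBJECT OPERATIONAL\n/CFPU-1 ENABLED",
    "OBJECT OPERATIONAL\n/CFPU-1 ENABLED\n/CSPU-1 ENABLED",
])
print(wait_until_enabled(lambda: next(replies), ["/CFPU-1", "/CSPU-1"],
                         attempts=5, delay_s=0.0))
```

The loop gives up after the configured number of attempts and returns False, which leaves the decision about further troubleshooting to the operator.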
8 Enable CMF on the node configured as cluster manager.
To identify the nodes configured as cluster manager, enter the following command:
show has view managed-object /ClusterHA
The following output is displayed:
/ClusterHA:
RecoveryGroup /ClusterHA
specialConstraints=(serviceInterruptionDenied)
RecoveryUnit /CFPU-0/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-0/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)
RecoveryUnit /CFPU-1/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-1/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)
The output indicates that the /ClusterHA recovery group has recovery units for the CFPU-0 and CFPU-1 nodes. Hence, the CFPU-0 and CFPU-1 nodes are configured as cluster manager nodes.
In this example, one cluster manager node (CFPU-1) is located in the new chassis. Hence, the following steps must be executed:
a) Enable the Cluster Management Functionality (CMF) on the CFPU-1 node. Enter the following command:
set cmf enable node-name /CFPU-1
The following output is displayed:
Cluster management functionality enabled on host CFPU-1
b) Check that the CFPU-1 node, where CMF was enabled, has the CMF-BACKUP status and that the CFPU-0 node has the CMF-SERVING status. Enter the following command:
show cmf status node-name /CFPU-1
The following output is displayed:
CFPU-1: CMF-BACKUP priority: 6
CFPU-0: CMF-SERVING priority: 5
2 Replacing the hard disk drive on hard disk drive carrier AMC
Purpose
The hard disk drive carrier AMC (HDSAM-A) is delivered with the hard disk drive in place. The hard disk drive should be replaced every 3 to 4 years.
You may also need to replace the hard disk drive if it is faulty or if it needs to be upgraded or serviced.
Before you start
Electrostatic discharge (ESD) may damage components in the module or other units. Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
The HDSAM-A supports both SAS and SATA hard disk drives and includes a SAS/SATA switch for selecting the disk type. The BCN platform supports only SAS hard disk drives, so always check that the switch is set to SAS before starting the replacement procedure.
Figure 1 The SAS/SATA switch in the HDSAM-A
(Figure labels: LEDs, handle, switch with SAS and SATA positions, 2.5” SAS or SATA drive, MMC, 12 V power, 5 V, IPMB-L, AMC connector, mechanical adapter, 2 x SAS.)
2.1 Removing the faulty hard disk drive
2.1.1 Steps
1 Log into the CFPU node where the hard disk is not faulty.
2 Lock the node where the faulty hard disk drive is located.
To lock the node where the faulty hard disk drive is located, enter the following command:
set has lock managed-object <mo-name>
Example
set has lock managed-object /CFPU-1
The following output is displayed:
/CFPU-1 locked successfully.
3 Power off the node where the faulty hard disk drive is located.
To power off the node where the faulty hard disk drive is located, enter the following command:
set has power off managed-object <mo-name>
Example
set has power off managed-object /CFPU-1
The following output is displayed:
/CFPU-1 is powered OFF successfully.
4 Remove the AMC from the AMC bay.
Follow the instructions in section Replacing an AMC.

5 Place the AMC so that the faulty hard disk drive side is facing down. Unscrew the four screws on the metal bracket of the AMC module, then turn the module over carefully while holding the hard disk drive.

6 Disconnect the faulty hard disk drive.
Detach the faulty hard disk drive from the connector by pulling it gently (from right to left in the following figure).
Figure 2 The hard disk drive on the hard disk drive carrier AMC
DN0945257
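The lock and power-off commands in steps 2 and 3 can be sketched as small helpers. This is a minimal sketch, not part of the product: the function names are hypothetical, and the confirmation strings are simply the example outputs shown in the steps above.

```python
# Hypothetical helpers: build the commands from steps 2 and 3 and
# check the confirmation text that the procedure shows as an example.

def lock_command(mo_name: str) -> str:
    """Return the SCLI command that locks the given managed object."""
    return f"set has lock managed-object {mo_name}"


def power_off_command(mo_name: str) -> str:
    """Return the SCLI command that powers off the given managed object."""
    return f"set has power off managed-object {mo_name}"


def confirms_lock(output: str, mo_name: str) -> bool:
    """True if the output contains the lock confirmation for mo_name."""
    return f"{mo_name} locked successfully" in output


def confirms_power_off(output: str, mo_name: str) -> bool:
    """True if the output contains the power-off confirmation for mo_name."""
    return f"{mo_name} is powered OFF successfully" in output


if __name__ == "__main__":
    print(lock_command("/CFPU-1"))
    print(power_off_command("/CFPU-1"))
```

Checking the confirmation text before moving on keeps the AMC from being pulled while the node is still running.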
2.2 Installing the new hard disk drive
2.2.1 Steps
1 Connect the new hard disk drive to the SAS connector of the HDSAM-A.
Push the new hard disk drive gently into the SAS connector in the HDSAM-A (from left to right in the following figure).
Figure 3 Installing a hard disk drive on the hard disk drive carrier AMC
(Figure labels: inserted screw, SAS connector, hard disk drive.)
DN0945245
2 Turn the AMC over and attach the new hard disk drive to the AMC with four screws.
Tighten the screws so that their heads are in line with the metal bracket.
3 Install the AMC module back into the AMC bay.
Follow the instructions in section Replacing an AMC.
4 Enable network boot for the node with the new hard disk drive.
Enter the following commands:
a) Log in as root.
set user username root
b) Power on the node.
hwcli -np on <node_name>
Wait a few seconds before proceeding to the next step.
c) Reset the node.
hwcli -nr -B 3 <node_name>
d) Exit root.
exit
Example:
set user username root
hwcli -np on CFPU-1
The following output is displayed:
Powering on CFPU-1 [ok]
hwcli -nr -B 3 CFPU-1
The following output is displayed:
Resetting CFPU-1 [ok]
Note: The CFPU node takes some time to reboot. You can check its availability by logging in through SSH.
5 Disable the watchdog on the node with the new hard disk drive.
To disable the watchdog through SSH, enter the following command:
ssh <node_name> \
wdctl -d
Example:
ssh CFPU-1 \
wdctl -d
exit
6 Initialize the new disk from the other node, where the hard disk is not faulty.
Enter the following command:
initialise hw
The following output is displayed:
Hardware successfully initialized
Note: To run the initialization script and display the console output, press the space bar several times after entering the command.
7 Reboot the node with the new hard disk drive from the local disk.
Enter the following commands:
set user username root
hwcli -nr -B 2 <node_name>
exit
Example:
set user username root
hwcli -nr -B 2 CFPU-1
exit
The following output is displayed:
Resetting CFPU-1 [ok]
The node restarts and synchronizes the Distributed Replicated Block Devices (DRBD). To see how the synchronization is progressing, enter watch -n 10 cat /proc/drbd. If the watch -n 10 cat /proc/drbd command fails, run cat /proc/drbd instead.
Note: The set user username root command must be executed before watch -n 10 cat /proc/drbd.
Do not restart the node during the DRBD synchronization. The initialization of the new disk is not complete until the synchronization has finished successfully.
Example:
# watch -n 10 cat /proc/drbd
Every 10.0s: cat /proc/drbd                    Wed Apr 24 10:54:39 2013

version: 8.3.7 (api:88/proto:86-92)
srcversion: 35B9BF7C501212268498452
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:512081 nr:0 dw:635 dr:512860 al:4 bm:32 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:104849 nr:0 dw:31303 dr:106531 al:6 bm:7 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 2: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:204756 nr:0 dw:36 dr:205264 al:3 bm:13 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 3: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
    ns:2890764 nr:0 dw:46448 dr:2882497 al:32 bm:190 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:12469416
    [==>.................] sync'ed: 18.9% (12176/14996)M
    finish: 0:05:37 speed: 36,872 (30,744) K/sec
 4: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap--
    ns:16024 nr:0 dw:78448 dr:174701 al:489 bm:200 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:3062668
 5: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap--
    ns:533 nr:0 dw:4921 dr:4830 al:13 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:102360
 6: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap--
    ns:0 nr:0 dw:349 dr:1461 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8152
 7: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap--
    ns:0 nr:0 dw:12 dr:675 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8152
 8: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap--
    ns:133 nr:0 dw:1894 dr:759 al:5 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:511948
 9: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap--
    ns:0 nr:0 dw:116 dr:4461 al:7 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:2916224
10: cs:PausedSyncS ro:Primary/Secondary ds:UpToDate/Inconsistent C rap--
    ns:0 nr:0 dw:1500 dr:956 al:3 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:102360
The example shows that the first three blocks (0 to 2) are synchronized: their oos (out of sync) value is zero. For block 3, the synchronization is in progress and the progress is displayed. Blocks 4 to 10 are waiting for synchronization (PausedSyncS). Once synchronization is complete, the oos value for all blocks is 0.
8 Check that the serving and backup CMF (Cluster Management Functionality) are working normally.
Enter:
show cmf status recovery-unit node-name <mo-name>
Example:
_nokadmin@CFPU-0 [RNC-37] > show cmf status recovery-unit node-name /CFPU-1
CFPU-0@RNC-37 [2013-04-24 13:18:51 +0200]
Recovery units with DRBD resources for managed object /CFPU-1:
/CFPU-1/FSDirectoryServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd1: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSAlarmSystemLightServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd5: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/QNOMUServer-1: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd9: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSPM9Server: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd8: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSClusterStateServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd4: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSLogServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd3: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSSSHServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd2: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/QNEMServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd10: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSCLMServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd6: DRBD_SECONDARY 1/0 (peer/wait secondary)
/CFPU-1/FSHotSwapMonitorServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd7: DRBD_SECONDARY 1/0 (peer/wait secondary)
_nokadmin@CFPU-0 [RNC-37] > show cmf status recovery-unit node-name /CFPU-0
CFPU-0@RNC-37 [2013-04-24 13:18:55 +0200]
Recovery units with DRBD resources for managed object /CFPU-0:
/CFPU-0/FSPM9Server: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd8: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSClusterStateServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd4: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSLogServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd3: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSSSHServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd2: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/QNEMServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd10: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSCLMServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd6: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSHotSwapMonitorServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd7: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSAlarmSystemLightServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd5: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/FSDirectoryServer: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd1: DRBD_PRIMARY 1/0 (peer/wait secondary)
/CFPU-0/QNOMUServer-0: 1/1 (peer[s]/drbd device[s] up)
/dev/drbd9: DRBD_PRIMARY 1/0 (peer/wait secondary)
Compare the blocks for the two managed objects; they should match.
9 Unlock the node with the new hard disk drive.
Enter:
set has unlock managed-object <mo-name>
Example:
set has unlock managed-object /CFPU-1
The following output is displayed:
/CFPU-1 unlocked successfully.
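The checks in steps 7 and 8 can be sketched in Python for text captured from cat /proc/drbd and show cmf status recovery-unit. This is a minimal sketch under stated assumptions: all function names are hypothetical, the completion test (every device reports oos:0) follows the explanation in step 7, and "the blocks match" is read here as both nodes listing the same set of /dev/drbdN devices.

```python
import re

# Start of a device entry, for example " 3: cs:SyncSource ro:..."
DEVICE_RE = re.compile(r"^\s*(\d+):\s+cs:(\S+)")
# Out-of-sync counter at the end of the statistics line
OOS_RE = re.compile(r"\boos:(\d+)")
# Device line in the CMF recovery-unit output
CMF_DEV_RE = re.compile(r"^(/dev/drbd\d+):\s+(DRBD_\w+)")


def parse_drbd_status(text):
    """Return {device_number: {"cs": state, "oos": count}} from /proc/drbd text."""
    devices = {}
    current = None
    for line in text.splitlines():
        match = DEVICE_RE.match(line)
        if match:
            current = int(match.group(1))
            devices[current] = {"cs": match.group(2), "oos": None}
        if current is not None:
            oos = OOS_RE.search(line)
            if oos:
                devices[current]["oos"] = int(oos.group(1))
    return devices


def sync_complete(devices):
    """True only when at least one device was parsed and all report oos 0."""
    return bool(devices) and all(d["oos"] == 0 for d in devices.values())


def cmf_devices(text):
    """Return {"/dev/drbdN": role} from the CMF recovery-unit output."""
    roles = {}
    for line in text.splitlines():
        match = CMF_DEV_RE.match(line.strip())
        if match:
            roles[match.group(1)] = match.group(2)
    return roles


def same_blocks(output_a, output_b):
    """Assumed comparison: both outputs list the same DRBD devices."""
    return set(cmf_devices(output_a)) == set(cmf_devices(output_b))


# Two entries copied from the example output in step 7.
DRBD_SAMPLE = """\
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:512081 nr:0 dw:635 dr:512860 al:4 bm:32 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 3: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
    ns:2890764 nr:0 dw:46448 dr:2882497 al:32 bm:190 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:12469416
"""

if __name__ == "__main__":
    status = parse_drbd_status(DRBD_SAMPLE)
    pending = [number for number, d in status.items() if d["oos"] != 0]
    print("sync complete:", sync_complete(status))
    print("devices still out of sync:", pending)
```

A device whose oos value cannot be found is treated as not synchronized, so an incomplete capture never reports the synchronization as finished.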
3 Replacing the failed hard disk drives on both CFPU nodes
Summary
The hard disk drive carrier AMC (HDSAM-A) is delivered with the hard disk drive in place. The hard disk drive should be replaced every 3 to 4 years.
You may also need to replace the hard disk drives if they are faulty or need to be upgraded or serviced.
Purpose
To replace the failed hard disk drives on both the CFPU nodes.
Before you start
Warning: Electrostatic discharge (ESD) may damage components in the module or other units. Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
The HDSAM-A supports both SAS and SATA hard disk drives and includes a SAS/SATA switch for selecting the disk type. The BCN platform supports only SAS hard disk drives, so always check that the switch is set to SAS before starting the replacement procedure.
Figure 4 The SAS/SATA switch in the HDSAM-A
DN0945027
(Figure labels: LEDs, handle, switch, 2.5” SAS or SATA drive, MMC, 12 V power, 5 V, IPMB-L, AMC connector, mechanical adapter, 2 x SAS, HDSAM-A; switch positions SAS and SATA.)
3.1 Removing the faulty hard disk drives
1 Remove the AMC from the AMC bay.
Follow the instructions in section Replacing an AMC.
2 Place the AMC so that the faulty hard disk drive side is facing down. Unscrew the four screws on the metal bracket of the AMC module, then turn the module over.
3 Disconnect the faulty hard disk drive.
Detach the faulty hard disk drive from the connector by pulling it gently (from right to left in the following figure).
Figure 5 The hard disk drive on the hard disk drive carrier AMC
DN0945257
4 Repeat steps 1 to 3 to remove the other hard disk drive.
3.2 Installing the new hard disk drives
1 Connect the new hard disk drive to the SAS connector of the HDSAM-A.
Push the new hard disk drive gently into the SAS connector in the HDSAM-A (from left to right in the following figure).
Figure 6 Installing a hard disk drive on the hard disk drive carrier AMC
(Figure labels: inserted screw, SAS connector, hard disk drive.)
DN0945245
2 Turn the AMC over and attach the new hard disk drive to the AMC with four screws.
Tighten the screws so that their heads are in line with the metal bracket.
3 Install the AMC module back into the AMC bay.
Follow the instructions in section Replacing an AMC.
4 Check the embedded software version on the new hard disk (HDSAM-A).
Use the following command:
show sw-manage embedded-sw version all
5 Upgrade the embedded software version.
If newer embedded software versions are available, upgrade the embedded software. For instructions, see Upgrading Embedded Software.
6 Repeat steps 1 to 5 to install the hard disk drive on the other CFPU node.
7 Perform the full restoration for the system.
For instructions, see Commissioning mcRNC.
4 Replacing an AMC
Purpose
You may need to replace an AMC if it is faulty or if it needs to be replaced due to configuration changes, extensions or servicing.
Note: When sending a faulty hard disk drive AMC to be replaced, remember to remove the hard disk drive.
Before you start
Warning: Electrostatic discharge (ESD) may damage components in the module or other units. Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
4.1 Removing an AMC
1 Gently pull the hot swap handle on the front panel of the AMC.
Do not pull the handle out all the way yet. Pulling the handle notifies the hardware management system that you are going to remove the AMC and tells it to finish all processes.
The hot swap LED starts flashing.
Figure 7 Pulling the hot swap handle of an AMC
DN0977767
2 Wait until the hot swap LED turns solid blue.
This may take a few seconds.
3 Pull the hot swap handle again more firmly and slide the AMC out of the bay.
Figure 8 Removing an AMC from the BCN module
DN0973762
4 If you are not installing another AMC immediately, install an AMC filler into the empty AMC bay.
This ensures adequate cooling and a proper EMC shield in the module.
4.2 Installing an AMC
1 Check that the EMC gasket is correctly in place and that its contacts are clean.
2 Insert the AMC into the bay, sliding it along the guide rails as shown in the figure below.
Make sure that the AMC is firmly seated in the module’s connectors.
Figure 9 Inserting an AMC into the BCN module
DN0977588
3 Press the hot swap handle firmly.
Wait until the blue hot swap LED turns off and the power LED turns solid green.
Figure 10 Pressing the hot swap handle
DN0977782
Note: If hard disk cross connecting is used, the hard disk AMC can only be placed in AMC bay 1.
5 Replacing a fan module
Summary
The fan modules are located at the rear of the BCN module.
BCN fan modules
DN0973747
Before you start
Warning: Electrostatic discharge (ESD) may damage components in the module or other units. Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
Tip: The fan module can be replaced while the BCN is powered on. Only one fan module can be replaced at a time.
Prepare the spare fan unit for replacement beforehand. After a fan is removed from the BCN module, the system starts to heat up very quickly, so proceed immediately with the new fan installation. The following procedure applies to all three fan modules of the BCN module.
5.1 Removing a fan module
5.1.1 Steps
1 Unscrew the two thumbscrews attaching the fan module to the BCN.
The Phillips screws are built into the fan module and can be loosened either by hand or with a screwdriver.
2 Pull the fan module out from the BCN module.
5.2 Installing a fan module
5.2.1 Steps
1 Insert the fan module into its slot at the rear side of the BCN module.
2 Tighten the fan module’s thumbscrews.
The Phillips screws are built into the fan module and can be tightened either by hand or with a screwdriver.
6 Replacing an add-in card
Before you start
Power off the BCN module before removing or installing an add-in card.
Warning: Electrostatic discharge (ESD) may damage components in the module or other units. Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
6.1 Removing an add-in card
6.1.1 Steps
1 Option Description
If the BCN module is installed in the cabinet,
Then a) Gracefully shut down the BCN module.
1. Identify the chassis that contains the plug-in unit to be replaced.
In this section, the chassis is chassis-2.
2. Check all the nodes that are running in the chassis where the plug-in unit is located.
To check all the running nodes present in the chassis to be replaced, enter the following command:
show hardware state list
The following output is displayed:
root@CFPU-0 [RNC-89] > show hardware state list
cabinet-1 : unit /cabinet-1
chassis-1 : unit /cabinet-1/chassis-1
chassis-2 : unit /cabinet-1/chassis-2
LMP-1-1-1 : node available /cabinet-1/chassis-1/piu-1
LMP-1-2-1 : node available /cabinet-1/chassis-2/piu-1
CFPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core -0,1,10,2,3,4,5,6,7,8,9
CSPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU-1/core -0,1,2,3,4,5
USPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU-1/core -0,1,2,3,4
EIPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU-1/core -0,1,2,3,4,5
CSPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU-1/core -0,1,2,3,4,5
USPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU-1/core -0,1,2,3,4
USPU-4 : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU-1/core -0,1,2,3,4
EIPU-2 : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU-1/core -0,1,2,3,4,5
USSR-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core -11
CSUP-0 : node available /cabinet-1/chassis-1/piu-1/addin-2/CPU-1/core -10,11,6,7,8,9
USUP-0 : node available /cabinet-1/chassis-1/piu-1/addin-3/CPU-1/core -10,11,5,6,7,8,9
EITP-0 : node available /cabinet-1/chassis-1/piu-1/addin-4/CPU-1/core -10,11,6,7,8,9
CSUP-2 : node available /cabinet-1/chassis-1/piu-1/addin-5/CPU-1/core -10,11,6,7,8,9
USUP-2 : node available /cabinet-1/chassis-1/piu-1/addin-6/CPU-1/core -10,11,5,6,7,8,9
USUP-4 : node available /cabinet-1/chassis-1/piu-1/addin-7/CPU-1/core -10,11,5,6,7,8,9
EITP-2 : node available /cabinet-1/chassis-1/piu-1/addin-8/CPU-1/core -10,11,6,7,8,9
CFPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
CSPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU-1/core
USPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU-1/core
EIPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU-1/core
CSPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU-1/core
USPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU-1/core
USPU-5 : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU-1/core
EIPU-3 : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU-1/core
USSR-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
CSUP-1 : node available /cabinet-1/chassis-2/piu-1/addin-2/CPU-1/core
USUP-1 : node available /cabinet-1/chassis-2/piu-1/addin-3/CPU-1/core
EITP-1 : node available /cabinet-1/chassis-2/piu-1/addin-4/CPU-1/core
CSUP-3 : node available /cabinet-1/chassis-2/piu-1/addin-5/CPU-1/core
USUP-3 : node available /cabinet-1/chassis-2/piu-1/addin-6/CPU-1/core
USUP-5 : node available /cabinet-1/chassis-2/piu-1/addin-7/CPU-1/core
EITP-3 : node available /cabinet-1/chassis-2/piu-1/addin-8/CPU-1/core
cluster : cluster available
The output shows that the nodes CFPU-1, CSPU-1, USPU-1, EIPU-1, CSPU-3, USPU-3, USPU-5, EIPU-3, USSR-1, CSUP-1, USUP-1, EITP-1, CSUP-3, USUP-3, USUP-5 and EITP-3 are present in the chassis to be replaced.
3. Check if the current SCLI session is running on a node located in the chassis.
If the SCLI session is running on a node located in the chassis to be replaced (the node name is shown in the prompt), perform a switchover for the /SSH recovery group to keep connectivity to the cluster during the chassis replacement. Enter the following command:
set has switchover force managed-object /SSH
Note: The SSH connection breaks when the switchover command is executed, and the SSH session must be started again.
4. Disable cluster manager nodes located in the chassis.
To identify the nodes configured as a cluster manager, enter the following command:
show has view managed-object /ClusterHA
The following output is displayed:
/ClusterHA:
RecoveryGroup /ClusterHA
specialConstraints=(serviceInterruptionDenied)
RecoveryUnit /CFPU-0/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-0/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)
RecoveryUnit /CFPU-1/FSClusterHAServer
recoveryUnitType=(ClusterManagerRecoveryUnit)
Process /CFPU-1/FSClusterHAServer/HASClusterManager
command=(/opt/nokiasiemens/SS_CFM/bin/HASClusterManager)
status=(nonHA)
startMethod=(always)
severity=(important)
The output indicates that the /ClusterHA recovery group has recovery units for the CFPU-0 and CFPU-1 nodes. Hence, the CFPU-0 and CFPU-1 nodes are configured as cluster manager nodes.
The output also shows that one cluster manager node (CFPU-1) is located in the chassis to be replaced. Hence, the following steps must be executed:
a) Disable the Cluster Management Functionality (CMF) on the CFPU-1 node. Enter the following command:
set cmf disable node-name /CFPU-1
The following output is displayed:
Cluster management functionality disabled on host CFPU-1
b) Check that the CFPU-1 node, where CMF was disabled, has the CMF-DISABLED status and that the other cluster manager node (in this case, the CFPU-0 node) has the CMF-SERVING status. Enter the following command:
show cmf status node-name /CFPU-1
The following output must be displayed:
CFPU-1: CMF-DISABLED priority: 6
CFPU-0: CMF-SERVING priority: 5
5. Lock all the managed objects in the chassis.
To lock all the managed objects, enter the following command:
set has lock managed-object <mo-name1> <mo-name2>...
Note: The SE nodes are not managed and therefore they are not locked.
Example
To lock all the managed nodes in chassis-2, enter the following command:
set has lock managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
If the nodes are successfully locked, the following output is displayed:
root@CFPU-0 [RNC-1002] > set has lock managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
/CFPU-1 locked successfully, 1 services activated on standby node(s)
/CSPU-1 locked successfully, 1 services activated on standby node(s)
/USPU-1 locked successfully
/EIPU-1 locked successfully, 3 services activated on standby node(s)
/CSPU-3 locked successfully, 1 services activated on standby node(s)
/USPU-3 locked successfully
/USPU-5 locked successfully
/EIPU-3 locked successfully, 3 services activated on standby node(s)
6. Power off all the managed objects in the chassis.
To power off the managed objects, enter the following command:
set has power off managed-object <mo-name1> <mo-name2>...
Example
To power off all the managed nodes in chassis-2, enter the following command:
set has power off managed-object /CFPU-1 /CSPU-1 /USPU-1 /EIPU-1 /CSPU-3 /USPU-3 /USPU-5 /EIPU-3
If the nodes are successfully powered off, the following output is displayed:
/CFPU-1 is powered OFF successfully
/CSPU-1 is powered OFF successfully
/USPU-1 is powered OFF successfully
/EIPU-1 is powered OFF successfully
/CSPU-3 is powered OFF successfully
/USPU-3 is powered OFF successfully
/USPU-5 is powered OFF successfully
/EIPU-3 is powered OFF successfully
7. Verify that all the managed objects in the chassis to be replaced are powered off.
To verify the availability of a managed node, enter the following command:
show has state availability managed-object <mo-name>
Example
To verify that the availability status of all managed nodes in chassis-2 is POWEROFF, enter the following command:
show has state availability managed-object /CFPU-1 /CSPU-1 \
/CSPU-3 /EIPU-1 /EIPU-3 /USPU-1 /USPU-3 /USPU-5
The following output must be displayed:
OBJECT AVAILABILITY
/CFPU-1 POWEROFF
/CSPU-1 POWEROFF
/CSPU-3 POWEROFF
/EIPU-1 POWEROFF
/EIPU-3 POWEROFF
/USPU-1 POWEROFF
/USPU-3 POWEROFF
/USPU-5 POWEROFF
b)
c) Disconnect the power feed cables.
d) Disconnect the network cables from the transceivers on the front side of the module.
e) Disconnect the BCN grounding cable.
f) Keep the network cables attached to the front cable tray of the BCN module.
g) Uninstall the cable tray with attached network cables from the BCN module.
The cable tray is uninstalled by unscrewing the two thumbscrews fixing the cable tray to the BCN module. If the screws are too tight to be opened by hand, loosening the screws that fix the BCN module mounting flanges to the cabinet might help.
For more information about detaching the cable tray, refer to the document Installing BCN Modules to the IR206 Cabinet.
h) Move the cable tray with attached network cables under the module, so that the module can be easily pulled out from the cabinet.
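Two of the checks in the graceful shutdown (finding the nodes in the chassis, and confirming that every managed object is powered off) can be sketched in Python for captured command output. This is a minimal sketch: the function names are hypothetical, the filter on "/addin-" is an assumption made so that the result mirrors the node list given in the procedure (which does not include the LMP node), and the AVAILABLE value in the usage example is a made-up placeholder.

```python
def nodes_in_chassis(state_list, chassis):
    """Return names of add-in card nodes whose path contains the chassis.

    Expects lines shaped like "<name> : node available <path> [cores]",
    as in the "show hardware state list" example output.
    """
    names = []
    for line in state_list.splitlines():
        name, sep, rest = line.partition(" : ")
        fields = rest.split()
        if not sep or len(fields) < 3 or fields[0] != "node":
            continue  # prompt lines, unit lines and the cluster line
        path = fields[2] + "/"
        # Assumption: only nodes on add-in cards are of interest here.
        if f"/{chassis}/" in path and "/addin-" in path:
            names.append(name.strip())
    return names


def not_powered_off(availability):
    """Return the objects whose availability is anything other than POWEROFF."""
    pending = []
    for line in availability.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0].startswith("/"):
            if fields[1] != "POWEROFF":
                pending.append(fields[0])
    return pending


STATE_SAMPLE = """\
chassis-2 : unit /cabinet-1/chassis-2
LMP-1-2-1 : node available /cabinet-1/chassis-2/piu-1
CFPU-0 : node available /cabinet-1/chassis-1/piu-1/addin-1/CPU-1/core -0,1,10,2,3,4,5,6,7,8,9
CFPU-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
USSR-1 : node available /cabinet-1/chassis-2/piu-1/addin-1/CPU-1/core
cluster : cluster available
"""

if __name__ == "__main__":
    print(nodes_in_chassis(STATE_SAMPLE, "chassis-2"))
    # "AVAILABLE" is a hypothetical value used only to show a failing check.
    print(not_powered_off("OBJECT AVAILABILITY\n/CFPU-1 POWEROFF\n/CSPU-1 AVAILABLE\n"))
```

The node list still includes the unmanaged SE nodes; the procedure locks and powers off only the managed objects, so that selection remains a manual decision.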
2 Unscrew the two thumbscrews securing the top cover of the BCN module.
The screws are located at the rear side of the module, as shown in the figure below. The Phillips screws are built into the top cover of the BCN module and can be loosened either by hand or with a screwdriver.
Figure 11 BCN top cover screws
DN0973774
3 Option Description
If the BCN module is installed in the cabinet,
Then a) Pull the module out of the cabinet until it locks into the outermost position.
b)
4 Slide the top cover of the module towards the rear side until it stops. Lift the top cover upwards.
Figure 12 Removing the BCN top cover
5 Unscrew the two thumbscrews securing the add-in card to the rails inside the BCN module.
Figure 13 BCN add-in card screws
DN0977525
6 Slide the add-in card upwards to remove it from the BCN module.
Figure 14 Pulling an add-in card out from the BCN module
DN0973798
6.2 Installing an add-in card
6.2.1 Steps
1 Slide the add-in card into the rails inside the BCN module until the pins of the card fall into the connectors of the main board.
Figure 15 Inserting an add-in card into BCN module
DN0973708
2 Secure the add-in card to the rails with the built-in thumbscrews.
The Phillips screws are built into the add-in card and can be tightened either by hand or with a screwdriver.
Figure 16 BCN add-in card screws
DN0977525
3 Place the BCN module’s cover on the top of the module, leaving a small gap between the top cover and the front edge of the module.
Figure 17 Installing BCN top cover
4 Slide the top cover to the front side of the module, until it falls into place.
5 Option Description
If the BCN module is installed in the cabinet,
Then a)
b) Push the module back into the cabinet until it locks into position.
Pull the green latches on the inner sliding rails towards you and slide the BCN module into the cabinet.
6 Tighten the thumbscrews of the top cover.
7 Option Description
If the BCN module is installed in the cabinet,
Then a) Install the cable tray with attached network cables back to the BCN module.
For more information about the cable tray installation, see the document Installing BCN Modules to the IR206 Cabinet.
b) Connect the BCN grounding cable.
c) Connect the network cables back to the transceivers on the front side of the module.
d) Connect the power feed cables.
e)
f) Power on the BCN module.
7 Replacing a power distribution unit
Purpose
If the power distribution unit (PDU) is faulty, you must replace it with a new one.
Figure 18 Power distribution units in the cabinet
(Front view; circuit breakers numbered 1 to 8 with ON and OFF positions.)
DN0960093
Before you start
Make sure you have a digital multimeter or voltage meter available.
Warning: Danger of hazardous voltages and electric shock! Before connecting or removing any power supply cables to or from the power distribution unit, make sure that both site power feeds to the power distribution unit are off, the circuit breakers on the front panel of the power distribution unit are in the OFF position, and the equipment is properly earthed (grounded).
Warning: Danger of hazardous voltages and electric shock! Make sure your hands are dry and remove any metal objects such as rings before touching the power supply equipment.
Warning: Risk of personal injury. Observe the given torque ranges at all times. Incorrect torque can result in damage to equipment, unreliability, and fire hazards due to excessive power dissipation and high temperature of materials.
7.1 Removing a power distribution unit (PDU)
7.1.1 Steps
1 Make sure that the redundant PDU is functional.
2 Switch off the circuit breakers on the PDU you are going to remove.
3 Check the PDU input feeds with a digital multimeter to ensure that there are no voltages in the cables.
4 Disconnect all cables from the PDU.
a) Disconnect the four power feed cables from the PDU.
b) Disconnect the CGNDB grounding cable from the PDU.
c) Disconnect the eight PSU input feeds from the PDU.
5 Unscrew the four fixing screws attaching the PDU to the cabinet.
Figure 19 Replacing a PDU
DN0960109
(Front view; circuit breakers numbered 1 to 8 with ON and OFF positions; M6 fixing screws.)
6 Remove the PDU from the cabinet.
7.2 Installing a power distribution unit (PDU)
7.2.1 Steps
1 Insert the PDU into the cabinet and align the holes of its mounting ear with the cabinet mounting rail.
2 Attach the PDU to the cabinet with four M6x12 screws.
Figure 20 Installing PDU to the cabinet
DN0960187
(Front view; circuit breakers numbered 1 to 8 with ON and OFF positions; M6 screws.)
3 Connect the PDU grounding cable (CGNDB) to the PDU.
Figure 21 PDU grounding cable
DN0977591
(Rear view; terminals numbered 1 to 8, with -48 and RTN markings.)
4 Check the PDU input feeds with a digital multimeter to ensure that there are no voltages in the cables.
5 Connect the site power supply cables to the PDU (for DC power supply only).
6 Connect the site power supply cables to the PDU (for AC power supply only).
7 Connect the eight PSU input feeds to the PDU.
8 Switch on the site power supply to the PDU.
9 Switch on the circuit breakers on the PDU.
8 Replacing a power supply unit
Before you start
Warning: Electrostatic discharge (ESD) may damage components in the module or other units. Wear an ESD wrist strap or use a corresponding method when handling the units, and do not touch the connector surfaces.
8.1 Removing a power supply unit
8.1.1 Steps
1 Option Description
If a PDU is used with the BCN module,
Then Switch off the circuit breaker on the PDU for the power supply unit to be replaced.
2 Unplug the power cable connected to the power supply unit.
3 Unscrew the two thumbscrews attaching the power supply unit to the BCN.
The Phillips screws are built into the power supply unit and can be loosened either by hand or with a screwdriver.
4 Pull the power supply unit out from the BCN module.
Removing an AC PSU from the BCN module
DN0960151
Removing a DC PSU from the BCN module
(Figure labels: RTN, -48V, POK.)
DN0960163
8.2 Installing a power supply unit
8.2.1 Steps
1 Insert the power supply unit into its slot at the rear side of the BCN module so that the screws built into the unit are on the right-hand side.
2 Tighten the unit’s thumbscrews.
The Phillips screws are built into the power supply unit and can be tightened either by hand or with a screwdriver.
3 Plug the power cable into the power supply unit.
4 Attach the cable clamp to the cable.
5 If a PDU is used with the BCN module, switch on the circuit breaker on the PDU for the power supply unit that was replaced.
9 Replacing the air filter
Purpose
Inspect the air filter regularly. To prevent dust from accumulating inside the equipment, replace the filter element twice a year.
Steps
1 Unscrew the two thumbscrews attaching the air filter cover to the BCN module.
Figure 22 Unscrewing the two thumbscrews
2 Open the air filter cover and pull out the air filter.
Figure 23 Opening the air filter cover and pulling out the air filter
3 Push the new air filter into the guide rails on both sides of the air filter cover.
4 Push the air filter cover back and fasten the two thumbscrews.
5 Record the date of the air filter change.
10 Dealing with sensor alarms
Symptoms
An alarm about the sensor value of a Field Replacement Unit (FRU) is received.
The following is an example of the alarm:
Alarm ID: 2813
Specific problem: 70307 - VOLTAGE OUT OF LIMIT
Managed object: fshwModuleId=addin-5,fshwPIUId=piu-1,fshwEquipmentHolderId=chassis-2,fshwEquipmentHolderId=cabinet-1,fsFragmentId=HW,fsClusterId=ClusterRoot
Severity: 2 (critical)
Cleared: no
Clearing: automatic
Acknowledged: no
Ack. user ID: N/A
Ack. time: N/A
Alarm time: 2012-03-12 09:08:29:940 EET
Event type: x5 (equipment)
Application: fshaProcessInstanceName=HPIMonitor,fshaRecoveryUnitName=FSHPIMonitorServer,fsipHostName=CFPU-0,fsFragmentId=Nodes,fsFragmentId=HA,fsClusterId=ClusterRoot
IAppl Addl. Info: Unit={BCNOC-A} Position=/chassis-2/slot-5 Sensor={number=218,Name=VDD_QLM3}
Appl. Addl. Info: 0.044
Notification ID: 8422
Extended event type : x1 (raise)
Control indicator: 7 (full visible)
Recovery procedures
1. Determine the sensor name from the Sensor field of the IAppl Addl. Info section of the alarm.
2. Determine the affected FRU from the Position field of the IAppl Addl. Info section of the alarm.
3. Check the sensor data of the FRU in trouble with the help of the sensor name and FRU name.
BCN includes several sensors that report on hardware conditions. Many of the sensor readings can be used to diagnose the hardware fault.
Follow the steps below to check the sensor data of the FRU in trouble:
a) Check the LMP version.
Issue the following command to show the LMP version:
sw_fw_versioninfo
Example:
root@LMP-1-2-1:~# sw_fw_versioninfo
Active U-Boot Version 5.3.0 (in flash 0)
Backup U-Boot Version 5.3.0 (in flash 1)
LMP Version 5.3.0
PCB Version A104-3
LED CPLD Version 05
PCI-LPC bridge XP2 Version 05
VCMC Version 5.3.0
PWR1014 Version 0007
FRUD Version 5.3.0
Part Number C111721.B3B
Add-in Card 1
  MMC Version 4.2.3
  Part Number C111723.A1A
  BCNOC-A PCPL Version 0.3.0
  BCNOC-A OCTF Version 4.2.3
  BCNOC-A FRUD Version 4.2.3
Add-in Card 2
  MMC Version 4.2.3
  Part Number C111723.A1A
  BCNOC-A PCPL Version 0.3.0
  BCNOC-A OCTF Version 4.2.3
  BCNOC-A FRUD Version 4.2.3
Add-in Card 3
  MMC Version 4.2.3
  Part Number C111723.A1A
  BCNOC-A PCPL Version 0.3.0
  BCNOC-A OCTF Version 4.2.3
  BCNOC-A FRUD Version 4.2.3
Add-in Card 4
  MMC Version 4.2.3
  Part Number C111723.A1A
  BCNOC-A PCPL Version 0.3.0
  BCNOC-A OCTF Version 4.2.3
  BCNOC-A FRUD Version 4.2.3
Add-in Card 5
  MMC Version 4.2.3
  Part Number C111723.A1A
  BCNOC-A PCPL Version 0.3.0
  BCNOC-A OCTF Version 4.2.3
  BCNOC-A FRUD Version 4.2.3
Add-in Card 6
  MMC Version 4.2.3
  Part Number C111723.A1A
  BCNOC-A PCPL Version 0.3.0
  BCNOC-A OCTF Version 4.2.3
  BCNOC-A FRUD Version 4.2.3
Add-in Card 7
  MMC Version 4.2.3
  Part Number C111723.A1A
  BCNOC-A PCPL Version 0.3.0
  BCNOC-A OCTF Version 4.2.3
  BCNOC-A FRUD Version 4.2.3
Add-in Card 8
  MMC Version 4.2.3
  Part Number C111723.A1A
  BCNOC-A PCPL Version 0.3.0
  BCNOC-A OCTF Version 4.2.3
  BCNOC-A FRUD Version 4.2.3
AMC 1
  MMC Version 1.10
  Part Number C110598.B3A
  hdsam-a_ad_frud Version 01.10.0000
AMC 2
PSU info 0
  Board Mfg : EMERSON
  Board Product : BAFE-B
  Board Serial : TR120201616
  Board Part Number : C112156.C1A
  Board Extra : 1f01
  Product Manufacturer : EMERSON
  Product Name : DS1200-3-007
  Product Part Number : PSU
  Product Version : 04
  Product Serial : I510JS000H04P
PSU info 1
  Board Mfg : EMERSON
  Board Product : BAFE-B
  Board Serial : TR120201617
  Board Part Number : C112156.C1A
  Board Extra : 1f01
  Product Manufacturer : EMERSON
  Product Name : DS1200-3-007
  Product Part Number : PSU
  Product Version : 04
  Product Serial : I510JS000J04P
b) Check sensors.
1. List the hardware sensors.
Issue the following command to list all the hardware sensors:
mch_cli ShowSensor
Note: This will list all the sensors, with the Logical Unit Number (LUN) and sensor addresses, attached to a hardware unit.
Example:
root@LMP-1-2-1:~# mch_cli ShowSensor
Entity: Unknown
  Hot Swap PSU 1   0x00 0x43
  PSU1 IN_Curr     0x00 0x2c
  PSU1 Fan 1       0x00 0x28
  PSU1 Temp 2      0x00 0x26
  PSU1 Temp 1      0x00 0x24
  PSU1 Status      0x00 0x22
  PSU1 OUT_Curr    0x00 0x20
  PSU1 OUT_3V3     0x00 0x1e
  PSU1 OUT_12V     0x00 0x1c
  PSU1 INPUT       0x00 0x1a
Entity: BAFE-B
  Hot Swap PSU 2   0x00 0x44
  PSU2 IN_Curr     0x00 0x2d
  PSU2 Fan 1       0x00 0x29
  PSU2 Temp 2      0x00 0x27
  PSU2 Temp 1      0x00 0x25
  PSU2 Status      0x00 0x23
  PSU2 OUT_Curr    0x00 0x21
  PSU2 OUT_3V3     0x00 0x1f
  PSU2 OUT_12V     0x00 0x1d
  PSU2 INPUT       0x00 0x1b
Entity: BMFU-A
  Hot Swap CU 1    0x00 0x45
  Fan 2            0x00 0x0a
  Fan 1            0x00 0x09
Entity: BMFU-A
  Hot Swap CU 2    0x00 0x46
  Fan 4            0x00 0x0c
  Fan 3            0x00 0x0b
Entity: BAFU-A
  Hot Swap CU 3    0x00 0x47
  Fan 6            0x00 0x0e
  Fan 5            0x00 0x0d
Entity: BCNMB-A
  POST Error       0x00 0x48
  LMP Reset        0x00 0x3d
  SEL status       0x00 0x3c
  BMC Watchdog     0x00 0x3b
  CLOCK_IRQ        0x00 0x30
  Reset Button     0x00 0x2e
  VCCA             0x00 0x19
  1.0V             0x00 0x18
  1.25V_GE         0x00 0x17
  1.25V_XG         0x00 0x16
  VCC3             0x00 0x15
  VCC5             0x00 0x14
  12V              0x00 0x13
  0.9V             0x00 0x12
  1.8V             0x00 0x11
  1.1V             0x00 0x10
  3.3SB            0x00 0x0f
  Inlet3 Temp      0x00 0x08
  Inlet2 Temp      0x00 0x07
  Outlet Temp      0x00 0x06
  Inlet1 Temp      0x00 0x05
  BCM56820 Temp    0x00 0x04
  BCM56512 Temp    0x00 0x03
  LMP Temp         0x00 0x02
  PEX Temp         0x00 0x01
  InterSrc NewSel  0x00 0xf5
  InterSrc Loss    0x00 0xf4
  Sync2 NewSel     0x00 0xf3
  Sync2 Loss       0x00 0xf2
  Sync1 NewSel     0x00 0xf1
  Sync1 Loss       0x00 0xf0
  External alarm   0x00 0xfe
Entity: Unknown AMC at 00
  Hot Swap AMC 1   0x00 0x31
Entity: Unknown AMC at 00
  Hot Swap AMC 2   0x00 0x32
Entity: Unknown AMC at 00
  Hot Swap AMC 3   0x00 0x33
Entity: Unknown AMC at 00
  Hot Swap AMC 4   0x00 0x34
Entity: Unknown AMC at 00
  Hot Swap AMC 5   0x00 0x35
Entity: Unknown AMC at 00
  Hot Swap AMC 6   0x00 0x36
Entity: Unknown AMC at 00
  Hot Swap AMC 7   0x00 0x37
Entity: Unknown AMC at 00
  Hot Swap AMC 8   0x00 0x38
Entity: Unknown AMC at 00
  Hot Swap AMC 9   0x00 0x39
Entity: Unknown AMC at 00
  Hot Swap AMC 10  0x00 0x3a
Entity: CPU 1
  RESET_TYPE       0x00 0x5d
  BOOT             0x00 0x5c
  BOOT_ERROR       0x00 0x5b
  VDD_QLM3         0x00 0x5a
  VDD_QLM2         0x00 0x59
  VDD_QLM1         0x00 0x58
  VDD_QLM0         0x00 0x57
  VDD_VTT0         0x00 0x56
  DDR_VDD          0x00 0x55
  VDD_OCORE        0x00 0x54
  MON_3VSB         0x00 0x53
  MON_12V          0x00 0x52
  Tmp421 Temp      0x00 0x51
  BMC Watchdog     0x00 0x50
Entity: CPU 2
  RESET_TYPE       0x00 0x7d
  BOOT             0x00 0x7c
  BOOT_ERROR       0x00 0x7b
  VDD_QLM3         0x00 0x7a
  VDD_QLM2         0x00 0x79
  VDD_QLM1         0x00 0x78
  VDD_QLM0         0x00 0x77
  VDD_VTT0         0x00 0x76
  DDR_VDD          0x00 0x75
  VDD_OCORE        0x00 0x74
  MON_3VSB         0x00 0x73
  MON_12V          0x00 0x72
  Tmp421 Temp      0x00 0x71
  BMC Watchdog     0x00 0x70
Entity: CPU 3
  RESET_TYPE       0x00 0x9d
  BOOT             0x00 0x9c
  BOOT_ERROR       0x00 0x9b
  VDD_QLM3         0x00 0x9a
  VDD_QLM2         0x00 0x99
  VDD_QLM1         0x00 0x98
  VDD_QLM0         0x00 0x97
  VDD_VTT0         0x00 0x96
  DDR_VDD          0x00 0x95
  VDD_OCORE        0x00 0x94
  MON_3VSB         0x00 0x93
  MON_12V          0x00 0x92
  Tmp421 Temp      0x00 0x91
  BMC Watchdog     0x00 0x90
Entity: CPU 4
  RESET_TYPE       0x00 0xbd
  BOOT             0x00 0xbc
  BOOT_ERROR       0x00 0xbb
  VDD_QLM3         0x00 0xba
  VDD_QLM2         0x00 0xb9
  VDD_QLM1         0x00 0xb8
  VDD_QLM0         0x00 0xb7
  VDD_VTT0         0x00 0xb6
  DDR_VDD          0x00 0xb5
  VDD_OCORE        0x00 0xb4
  MON_3VSB         0x00 0xb3
  MON_12V          0x00 0xb2
  Tmp421 Temp      0x00 0xb1
  BMC Watchdog     0x00 0xb0
Entity: CPU 5
  RESET_TYPE       0x00 0xdd
  BOOT             0x00 0xdc
  BOOT_ERROR       0x00 0xdb
  VDD_QLM3         0x00 0xda
  VDD_QLM2         0x00 0xd9
  VDD_QLM1         0x00 0xd8
  VDD_QLM0         0x00 0xd7
  VDD_VTT0         0x00 0xd6
  DDR_VDD          0x00 0xd5
  VDD_OCORE        0x00 0xd4
  MON_3VSB         0x00 0xd3
  MON_12V          0x00 0xd2
  Tmp421 Temp      0x00 0xd1
  BMC Watchdog     0x00 0xd0
Entity: CPU 6
  RESET_TYPE       0x01 0x0d
  BOOT             0x01 0x0c
  BOOT_ERROR       0x01 0x0b
  VDD_QLM3         0x01 0x0a
  VDD_QLM2         0x01 0x09
  VDD_QLM1         0x01 0x08
  VDD_QLM0         0x01 0x07
  VDD_VTT0         0x01 0x06
  DDR_VDD          0x01 0x05
  VDD_OCORE        0x01 0x04
  MON_3VSB         0x01 0x03
  MON_12V          0x01 0x02
  Tmp421 Temp      0x01 0x01
  BMC Watchdog     0x01 0x00
Entity: CPU 7
  RESET_TYPE       0x01 0x2d
  BOOT             0x01 0x2c
  BOOT_ERROR       0x01 0x2b
  VDD_QLM3         0x01 0x2a
  VDD_QLM2         0x01 0x29
  VDD_QLM1         0x01 0x28
  VDD_QLM0         0x01 0x27
  VDD_VTT0         0x01 0x26
  DDR_VDD          0x01 0x25
  VDD_OCORE        0x01 0x24
  MON_3VSB         0x01 0x23
  MON_12V          0x01 0x22
  Tmp421 Temp      0x01 0x21
  BMC Watchdog     0x01 0x20
Entity: CPU 8
  RESET_TYPE       0x01 0x4d
  BOOT             0x01 0x4c
  BOOT_ERROR       0x01 0x4b
  VDD_QLM3         0x01 0x4a
  VDD_QLM2         0x01 0x49
  VDD_QLM1         0x01 0x48
  VDD_QLM0         0x01 0x47
  VDD_VTT0         0x01 0x46
  DDR_VDD          0x01 0x45
  VDD_OCORE        0x01 0x44
  MON_3VSB         0x01 0x43
  MON_12V          0x01 0x42
  Tmp421 Temp      0x01 0x41
  BMC Watchdog     0x01 0x40
Entity: AMC 1
  Version change   0x01 0x68
  DC/DC Failure    0x01 0x67
  MMC Temp         0x01 0x66
  HDD Temp         0x01 0x65
  +5V Backend      0x01 0x64
  +12V Backend     0x01 0x63
  +3.3V MP         0x01 0x62
  +12V Payload     0x01 0x61
  Hot Swap         0x01 0x60
2. Get the hardware sensor threshold.
Find the sensor address of the sensor you want and issue the following command to get the sensor threshold of the hardware unit:
mch_cli GetSensorThreshold <LUN> <Sensor addr>
Example:
root@LMP-1-2-1:~# mch_cli GetSensorThreshold 0x00 0xda
sensor VDD_QLM3 (218)
Lower Non-Critical : NA
Lower Critical : NA
Lower Non-Recoverable : 1.1720
Upper Non-Critical : NA
Upper Critical : NA
Upper Non-Recoverable : 1.2920
support thresholds: lnr unr
3. Get the hardware sensor data.
Issue the following command to get the sensor data of the hardware unit you want:
mch_cli ReadSensor <LUN> <Sensor addr>
Example:
root@LMP-1-2-1:~# mch_cli ReadSensor 0x00 0xda
Sensor: VDD_QLM3
Lun: 0x00
Number: 0xda
Value: 1.236000
4. If the sensor value is not within the range set by the lower and upper threshold values, try replacing the FRU in trouble to correct the sensor value. See chapter Replacing hardware units for details.
5. If the problem persists even after following the instructions, please contact your local Nokia Solutions and Networks representative with your observations on the sensor values.
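The recovery procedure can be sketched in code. The following Python sketch is illustrative only and is not part of the product: it assumes the IAppl Addl. Info field and the GetSensorThreshold output have exactly the layout shown in the examples, and the function names (parse_alarm_info, parse_thresholds, within_limits) are hypothetical. The sensor address derived from the sensor number is an observation from the example only (number 218 corresponds to address 0xda); the LUN must still be looked up in the ShowSensor listing.

```python
import re

# Assumed layout of the IAppl Addl. Info field, copied from the example alarm.
ALARM_INFO_RE = re.compile(
    r"Unit=\{(?P<unit>[^}]*)\}\s+"
    r"Position=(?P<position>\S+)\s+"
    r"Sensor=\{number=(?P<number>\d+),Name=(?P<name>[^}]*)\}"
)


def parse_alarm_info(info):
    """Return the unit, FRU position, and sensor details named in the alarm."""
    match = ALARM_INFO_RE.search(info)
    if match is None:
        raise ValueError("unrecognized IAppl Addl. Info layout")
    number = int(match.group("number"))
    return {
        "unit": match.group("unit"),
        "position": match.group("position"),
        "sensor_name": match.group("name"),
        "sensor_number": number,
        # Observation from the example only: number 218 is address 0xda.
        "sensor_addr": f"0x{number:02x}",
    }


def parse_thresholds(output):
    """Collect the numeric limits from GetSensorThreshold output, skipping NA."""
    limits = {}
    for line in output.splitlines():
        label, sep, value = line.partition(":")
        label, value = label.strip(), value.strip()
        if sep and label.startswith(("Lower", "Upper")) and value != "NA":
            limits[label] = float(value)
    return limits


def within_limits(value, limits):
    """True if value lies between every Lower and Upper limit (inclusive, assumed)."""
    lows = [v for k, v in limits.items() if k.startswith("Lower")]
    highs = [v for k, v in limits.items() if k.startswith("Upper")]
    return all(value >= low for low in lows) and all(value <= high for high in highs)
```

With the example outputs, parse_thresholds returns the two non-recoverable limits (1.1720 and 1.2920), and within_limits(1.236, limits) is true, whereas a reading such as 0.044 would be out of range. Treating the limits as inclusive is an assumption.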
Position of the PSUs and fan trays
The following picture shows the positions of the PSUs and fan trays.
Figure 24 Positions of the PSUs and fan trays
11 Communication between active and standby units in a BCN cluster fails
11.1 Description
A prolonged communication failure between the active and standby units in a box controller node (BCN) cluster causes a split-brain situation. If the cluster internal network connection between BCN modules fails, the cluster may get partitioned into two independent parts, which attempt to provide the same services. As a result, the BCN modules do not function properly. Possible causes for the problem are:
• Improper cabling between the BCN boxes.
• Tampering of cables connecting the BCN boxes.
• Incorrect switch configurations.
• Malfunctioning of hardware or embedded software.
11.2 Symptoms
Improper handling of the hardware might lead to a scenario where two isolated parts of the BCN cluster are running and trying to provide the same services. In this split-brain situation, the following problems might occur:
• Storage resources replicated using Distributed Replicated Block Device (DRBD) get updated independently on both sides.
• As both CLA nodes run an independent instance of the cluster management software, nodes may get reset continuously because they can communicate with only one management software instance at a time.
• External IP addresses are assigned to both units, which causes IP address conflicts and various communication errors.
• The CLA nodes might reboot continuously.
11.3 Recovery procedures
1 Power off all the BCN modules.
2 Power on one of the CLA node BCN modules.
3 Wait until the BCN module starts.
If no console connection is available, wait for 3 minutes.
4 Power on the remaining BCN modules.
If the CLA node is up and running in the powered-on BCN module, power on the remaining BCN modules. This will overwrite the disk devices of the last activated unit with the copies of the unit that was started up first.
5 Verify that all the nodes and services are up and running.
To verify that the split-brain situation is over and all the services are up and running, enter the following command:
show has summary
Note: The example displays a 2 BCN configuration. If there are more BCNs, the display is different.
_nokadmin@CFPU-0 [RNC-37] > show has summary
CFPU-0@RNC-37 [2013-05-09 10:07:57 +0200]
Node status
  Nodes in configuration : 16
  Unlocked nodes : 16
RG status
  RGs in configuration : 68
  Unlocked RGs : 68
RU status
  RUs in configuration : 274
  Unlocked RUs : 274
Process status
  Processes in configuration : 1482
  Unlocked processes : 1482
If there are differences between Nodes in configuration and Unlocked nodes, then split-brain is still active.
Note: There might still be differences in Node status even if the split-brain situation is over. Check whether any nodes are locked. If so, unlock them and try again.
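The comparison between the configured and unlocked counts can also be scripted. The following Python sketch is illustrative only: it assumes the show has summary output uses the field labels shown in the example, and the function names (parse_has_summary, mismatched_groups) are hypothetical. A mismatch in the nodes group is what this chapter describes as a sign that split-brain is still active or that some nodes are locked.

```python
import re


def parse_has_summary(output):
    """Map each status group (nodes, rgs, ...) to (configured, unlocked) counts."""
    configured = {
        m.group(1).lower(): int(m.group(2))
        for m in re.finditer(r"(\w+) in configuration\s*:\s*(\d+)", output)
    }
    unlocked = {
        m.group(1).lower(): int(m.group(2))
        for m in re.finditer(r"Unlocked (\w+)\s*:\s*(\d+)", output)
    }
    # A group without an "Unlocked" line is reported with None.
    return {kind: (count, unlocked.get(kind)) for kind, count in configured.items()}


def mismatched_groups(summary):
    """List the groups whose configured and unlocked counts differ."""
    return [kind for kind, (conf, unl) in summary.items() if conf != unl]
```

For the example output, mismatched_groups returns an empty list. The group names are lowercased so that "Nodes in configuration" and "Unlocked nodes" are matched to each other.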