MFS Trouble Shooting Guide B9 ed15.pdf

  • ED Ed15 Released MFS Troubleshooting guide release B9

    EVOLIUM 3bk29042jaaapwzza-ed15rl.doc 15/03/2007

    3BK 29042 JAAA PWZZA 1/197

    Site

    VELIZY

    EVOLIUM SAS

    Originators

    MFS integration team

    MFS TROUBLESHOOTING GUIDE

    B9 RELEASE

    System: ALCATEL 900 / BSS
    Sub-system: MFS
    Document Category: USER GUIDE

    ABSTRACT

    This document constitutes the reference location for storing troubleshooting actions related to operation of MFS B9. It is restricted to ALCATEL internal usage, notably for ALCATEL personnel providing on site support at customer premises.

    This document will be updated each time a new problem occurs.

    Approvals

    Name App.

    J-J BELLEGO G. ACBARD B. FERNIER

    Name App.

    D. COTTIN

    REVIEW

    ED 12 RL 07-07-06 Reading report EVOLIUM/R&D/TD/MFS/2006-4968-PME

    ED 13 RL 27-09-06 Reading report EVOLIUM/R&D/TD/MFS/2006-5042-PME

    ED 14 RL 27-11-06 Reading report EVOLIUM/R&D/TD/MFS/2006-5092-PME

    ED 15 RL 14-03-07 Reading report EVOLIUM/R&D/TD/MFS/2007-5215-DC

    HISTORY

    Ed. 01 Proposal 01 Cancelled B8 chapters (FR close OUT, NRE, REL)

    Ed. 01 Proposal 02 01-11-2004 P.MENON Some clean up + synchronization with new tips from B8

    Ed. 01 Proposal 03 08-11-2004 P.MENON Remove information redundant with the MFS Installation, Configuration, and Software replacement guide

    Ed. 01 Proposal 04 16-11-2004 P.MENON Minor corrections

    Ed. 01 Proposal 05 16-02-05 P.MENON

    - Add Unix boot impossible (wrong default kernel)
    - Add check if backup Mib is not corrupted
    - Add How to get contents of unix patch BL
    - Add for Trace of unix patch installation

    Ed. 01 release 11-03-05 Release for B9 MR0

    Ed. 02 release 01-06-05 Release for B9 MR2 P.MENON - update Corrective action: second step (install_lsm)

    Ed. 03 release 02-06-05 Release for B9 MR2 P.MENON - S99trace_srv.ds is renamed to S99trace_server.ds since MFSAW10F

    Ed. 04 release 02-06-05 Release for B9 MR2 P.MENON - Add for Failure on Update Remote Inventory

    Ed. 05 release 09-06-05 Release for B9 MR2 P.MENON - Add for rmdir fails during execution of ins_swcx.sh when cygwin is installed on the PC

    Ed. 06 release 30-06-05 Release for B9 MR2 P.MENON -update Error at step 5/10 (Isolation) Check the full SCSI chain...

    Ed. 07 release 06-07-05 Release for B9 MR2 P.MENON
    - Add Connection by ftp from a MFS station to an external server is impossible FR 3BKA20FBR164817
    - Add TRACE_SERVER does not run FR 3BKA13FBR164932
    - Add GPU traces are not completed
    - Add Impossible to load patch GPU B8 on GPUs FR 3BKA13FBR164932
    - Add Unix patch installation from OMC stopped due to a network failure
    - Add After a roll-back it is impossible to open the IMT terminal FR 3BKA20FBR162930

    Ed. 08 Proposal 01 30-08-05 P.MENON - Add for Inall procedure stopped due to a station in "halt in" state FR 3BKA13FBR166921

    Ed. 08 Proposal 02 01-09-05 P.MENON - Add new Installation from a not English PC fails (FR 3BKA20FBR166358)

    Ed. 08 Proposal 03 09-09-05 P.MENON

    - Add new The trace server stops running after a while (FR 3BKA13FBR169218)

    - Add new Result of dupatch in B8 or B9 RC40 with BL24

    Ed. 08 Proposal 04 20-09-05 P.MENON Add new Control station reboots in loop with reset_code 214 after installation of BL22 (FR 3BKA13FBR170335)

    Ed. 08 Proposal 05 20-10-05 P.MENON
    - Update Control station reboots in loop with reset_code 214 after installation of BL22 (FR 3BKA13FBR170335)
    - Suppress yellow paragraph

    Ed. 08 Release 28-10-05 Release

    Ed. 09 Release 13-01-06 Release P.MENON
    - quality corrections
    - update Error at Step 2 (Creation)
    - Add new MFS UNIX patch installation makes Control Station unusable (B9 MR1 ED2) (FR 3BKA23FBR174370)
    - Update Error at step 5/10 (Isolation) (FR 3BKA13FBR175829)
    - Add new Reinstallation of the MFS and restoration of data from OMC
    - Add new How to restore the MIB without needing full reinstallation
    - Add new Sanity check script to prevent any potential problem on the MFS

    Ed. 10 Proposal 01 08-02-06 P.MENON
    - Add new GPU problem but alarm is "Failure of a JAET1 applique" (FR 3BKA13FBR177178)
    - Add new no more available disk space on /usr (FR 3BKA20FBR176683)

    Ed. 10 Proposal 02 09-02-06 P.MENON - update Sanity check script to prevent any potential problem on the MFS (FR 3BKA13CBR176689)

    Ed. 10 Proposal 03 13-02-06 P.MENON
    - update Sanity check script to prevent any potential problem on the MFS (FR 3BKA13CBR176689) after remarks
    - add Result of dupatch in B9 with BL22 since MR1 Edx (MFSSAW11E)

    Ed. 10 Proposal 04 16-02-06 P.MENON
    - update Error at step 5/10 (Isolation)
    - add System and Tomas (formerly named Nectar) traces

    Ed. 10 Proposal 05 22-02-06 P.MENON - add Wrong httpd.conf

    Ed. 10 Release 23-02-06 Release

    Ed. 11 Release 02-03-06 Release P.MENON
    - update Sanity check script to prevent any potential problem on the MFS (FR 3BKA13CBR176689)
    - add JBETI traces
    - update The trace server stops running after a while
    - update TRACE_SERVER does not run
    - add not enough space for Backup MIB
    - add new GPU switch over no more possible (FR 3BKA20FBR149993 and 3BKA20FBR151855)

    Ed. 12 Release 30-06-06 Release P.MENON
    - update Traces of unix patch installation
    - update O&M trace SCIM (RTA)
    - update GPU switch over no more possible, JBETI problems
    - Add new Rebuild of mirrored partitions on RC40
    - rename Sanity check script to prevent any potential problem on the MFS to AuditMFS script to prevent any potential problem on the MFS
    - Add new not possible to get PM of MFS from OMC FR 3BKA13FBR183494; not possible to unlock omcxchg account from User management option of IMT FR 3BKA13FBR183497
    - Update Check if backup Mib is corrupted
    - Update CRAFT cannot connect to MFS floating IP: wrong httpd.conf
    - Update AuditMFS script to prevent any potential problem on the MFS with new codes FR/CR 3BKA13CBR179923 3BKA13CBR180184 3BKA13CBR180203 3BKA13CBR180618
    - Add new Impossible to enable MRTG Collector FR 3BKA13FBR186503
    - Add new active Control Station is blocked after automatic backup MIB on RC40 FR 3BKA13CBR189473
    - Add new MFS UNIX patch installation fails with a core file generated from 'install_patch_du' FR 3BKA13FBR189822
    - Merge with MX Trouble Shooting descriptions

    Ed. 12 Release 05-07-06 Release P.MENON
    - update Sleeping cells
    - Add new bul file execution returns 1 error FR 3BKA13FBR186955
    - Add new No RRALLI sent on GSLs, for all cells of the BSC, after activate MLU FR 3BKA20FBR186403

    Ed. 13 Proposal 01 04-09-06 P.MENON
    - Update Error at step 3/10:
      . after installation from scratch of B9 version containing the script clean_spdata, the next migration does not work (FR 3BKA13FBR188065)
      . after installation from scratch a MFS which was coming from migration or software replacement, with restoration of the backup MIB, a new migration or software replacement fails (FR 3BKA13FBR194235/3BKA13FBR193877/3BKA13FBR181238)
    - Add new Cell parameters modification is not allowed from IMT (BUI request)
    - Add new "dataPatch.bul" error during scratch installation in B9 MR4 (FR 3BKA13FBR185034)
    - update AuditMFS script to prevent any potential problem on the MFS: error codes added (118: CS are not time synchronized (CR 3BKA13CBR193667) and 406: discrepancies in version descriptor files (CR 3BKA13CBR193904))

    Ed. 13 Proposal 02 27-09-06 P.MENON
    - update after installation from scratch a MFS which was coming from migration or software replacement, with restoration of the backup MIB, a new migration or software replacement fails (FR 3BKA13FBR194235/3BKA13CBR193877/3BKA13FBR181238)
    - update AuditMFS script to prevent any potential problem on the MFS

    Ed. 13 Release 11-10-06 P.MENON Release approved

    Ed. 14 Proposal 01 27-10-06, 15-11-06, 16-11-06

    P.MENON
    - Add new Serial splitter and RJ45 converter for Trouble shooting (MFS Evolution only)
    - Add new How to generate/backup on a platform MFS a virgin MIB and how to import this MIB on a field MFS (same architecture / same SW level) (CR 3BKA13CBR194432)
    - Add new Impossible to install MFS Sanity Check Script (AW11EP_00D) (FR 3BKA13FBR196644)
    - update JBETI trace

    D. COTTIN
    - Add FR 3BKA20FBR199071 Hanging alarm "Card voltage out of range" for MFS JBXSSW after power failure
    - Add 3BKA20FBR186125 PV_PEM alarms raised and not cleared from IMT and OMCR after MFS power off/on
    - Add 3BKA13FBR199323 mfssetup or configure_switch failure after replacing a new SSW board

    Ed. 14 Release 27-11-06 P.MENON Release approved - Add new How to detect JBETI is not blocked

    Ed. 15 Proposal 1

    Ed. 15 Proposal 2

    05-01-07 D. COTTIN
    - Add new Not possible to boot MXMFS OMCP board from PC during SW installation -> FR 3BKA20FBR199390
    - Add new AuditMFS script to prevent any potential problem on the MFS (MFS Evolution only) -> CR 3BKA13CBR176689
    - Add new chapter 9 about Crash/Traces for A9130 MFS Evolution only
    - Add FR 3BKA20FBR208698 AuditMFS.pl script reports some errors

    P. GIUDICELLI Add:
    - 3BKA20FBR167349: hardware alarm
    - 3BKA20FBR183063: problem with GPU
    - 3BKA20FBR188564: /RESULT full
    - 3BKA20FBR196642: error 205
    - 3BKA20FBR186005: problem concerning remote inventory

    Ed. 15 Released 14-03-07 D. COTTIN

    TABLE OF CONTENTS

    1 INTRODUCTION
        1.1.1 Document organisation
        1.1.2 Presentation

    2 GPU
      2.1 GPUs disappear from the IMT
        2.1.1 Reference FR: None.
        2.1.2 Problem description
        2.1.3 Corrective action
      2.2 GPU SO Impossible
        2.2.1 Reference FR: 3BKA20FBR108914
        2.2.2 Problem description
        2.2.3 Corrective action
      2.3 GPU reboots continuously
        2.3.1 Reference FR: 3BKA20FBR119782
        2.3.2 Problem description
        2.3.3 Corrective action
        2.3.4 Problem solved
      2.4 GPU connection problem
        2.4.1 Reference FR: none
        2.4.2 Problem description
        2.4.3 Corrective action
      2.5 GPU problem but alarm is "Failure of a JAETI1 applique"
        2.5.1 Reference FR: 3BKA13FBR177178
        2.5.2 Problem description
        2.5.3 Corrective action
      2.6 GPU switch over no more possible, JBETI problems
        2.6.1 Reference FR: 3BKA20FBR149993, 3BKA20FBR151855 and 3BKA13FBR163557
        2.6.2 Problem description
        2.6.3 Preventive action
        2.6.4 Corrective action
      2.7 GPU SW is not loaded (MFS Evolution only)
        2.7.1 Reference FR: 3BKA13FBR 175541
        2.7.2 Problem description
        2.7.3 Corrective action
      2.8 MFS crash while we insert a GPU
        2.8.1 Reference FR: 3BKA20FBR183063
        2.8.2 Problem description
        2.8.3 Corrective action

    3 INSTALLATION
      3.1 Station restart
        3.1.1 Reference FR: None.
        3.1.2 Problem description
        3.1.3 Corrective action
      3.2 Impossible to rlogin/telnet to MFS as root
        3.2.1 Reference FR: None.
        3.2.2 Problem description
        3.2.3 Corrective action
      3.3 Unix boot impossible (wrong default kernel)
        3.3.1 Reference FR: None.
        3.3.2 Problem description
        3.3.3 Corrective action
      3.4 "dataPatch.bul" error during scratch installation in B9 MR4
        3.4.1 Reference FR: 3BKA13FBR185034
        3.4.2 Problem description
        3.4.3 Corrective action

    4 MFS BASED ON RC40
      4.1 Installation from a not English PC fails
        4.1.1 Reference FR: 3BKA20FBR166358
        4.1.2 Problem description
        4.1.3 Corrective action
      4.2 MFS installation failed
        4.2.1 Reference FR: none
        4.2.2 Problem description
        4.2.3 Corrective action
      4.3 Inall procedure stopped due to a station in "halt in" state
        4.3.1 Reference FR: 3BKA13FBR166921
        4.3.2 Problem description
        4.3.3 Corrective action
      4.4 Failure during the SWC from the OMC at step 1/10 (before file transfer)
        4.4.1 Reference FR: none
        4.4.2 Problem description
        4.4.3 Corrective action
      4.5 Unix boot impossible
        4.5.1 Reference FR: None.
        4.5.2 Problem description
        4.5.3 Corrective action
      4.6 Rebuild of mirrored partitions on RC40
        4.6.1 Reference FR: None.
        4.6.2 Problem description
        4.6.3 Corrective action
      4.7 active Control Station is blocked after automatic backup MIB on RC40
        4.7.1 Reference FR: 3BKA13CBR189473
        4.7.2 Problem description
        4.7.3 Corrective action
        4.7.4 Impacts

    5 MFS BASED ON MX
      5.1 "Inall" failed during MX-MFS installation
        5.1.1 Reference FR: None.
        5.1.2 Problem description
        5.1.3 Corrective action
      5.2 Connection to OMCP using console redirection does not work
        5.2.1 Reference FR: None.
        5.2.2 Problem description
        5.2.3 Corrective action
      5.3 "inall" failed during MX-MFS installation
        5.3.1 Reference FR: None.
        5.3.2 Problem description
        5.3.3 Corrective action
      5.4 "inall" failed during MX-MFS installation
        5.4.1 Reference FR: None.
        5.4.2 Problem description
        5.4.3 Corrective action
      5.5 Error at step 2/10 (Creation) (MFS Evolution only)
        5.5.1 Reference FR
        5.5.2 Problem description
        5.5.3 Corrective action
      5.6 Error at step 3/10 (Verify) (MFS Evolution only)
        5.6.1 Reference FR: None.
        5.6.2 Problem description
        5.6.3 Corrective action
      5.7 Error at step 7/10 (Validation) - (MFS Evolution only)
        5.7.1 Reference FR: None.
        5.7.2 Problem description
        5.7.3 Corrective action
      5.8 The stand-by station is not operational (MFS Evolution only)
        5.8.1 Reference FR: 3BKA20FBR199072
        5.8.2 Problem description
        5.8.3 Corrective action
      5.9 Ethernet connection problem (MFS Evolution only)
        5.9.1 Reference FR: none
        5.9.2 Problem description
        5.9.3 Corrective Action
      5.10 Impossible to connect IMT (MFS Evolution only)
        5.10.1 Reference FR: 3BKA20FBR175917
        5.10.2 Problem Description
        5.10.3 Corrective Action
      5.11 After Power-on of ATCA shelf, OMCP servers are powered-off (MFS Evolution only)
        5.11.1 Reference FR 3BKA20FBR172514
        5.11.2 Problem description
        5.11.3 Corrective action
      5.12 How to update time from OMC (MFS Evolution only)
        5.12.1 Reference FR
        5.12.2 Problem description
        5.12.3 Corrective action
        5.12.4 Problem solved
      5.13 NE1oE supervision lost (MFS Evolution only)
        5.13.1 Reference FR: None
        5.13.2 Problem description
        5.13.3 Corrective action
      5.14 Extension from 1 shelf configuration to 2 shelves configurations has failed (MFS Evolution only)
        5.14.1 Reference FR: None
        5.14.2 Problem description
        5.14.3 Corrective action
      5.15 No RRALLI sent on GSLs, for all cells of the BSC, after activate MLU (MFS Evolution only)
        5.15.1 Reference FR: 3BKA20FBR186403
        5.15.2 Problem description
        5.15.3 Corrective action
      5.16 Hanging alarm "Card voltage out of range" for MFS JBXSSW after power failure (MFS Evolution only)
        5.16.1 Reference 3BKA20FBR199071, 3BKA20FBR199072
        5.16.2 Problem description
        5.16.3 Corrective action
      5.17 PV_PEM alarms raised and not cleared from IMT and OMCR after MFS power off/on (MFS Evolution only)
        5.17.1 Reference 3BKA20FBR186125, 3BKA20FBR199072
        5.17.2 Problem description
        5.17.3 Corrective action
      5.18 mfssetup or configure_switch failure after replacing a new SSW board. (MFS Evolution only)
        5.18.1 Reference 3BKA13FBR199323
        5.18.2 Problem description
        5.18.3 Corrective action
      5.19 Not possible to boot MXMFS OMCP board from PC during SW installation
        5.19.1 Reference FR: 3BKA20FBR199390
        5.19.2 Problem description
        5.19.3 Corrective action
      5.20 AuditMFS.pl script reports some errors regarding the filesystem name check.
        5.20.1 Reference FR: 3BKA20FBR208698
        5.20.2 Problem description
    5.20.3 Corrective action ........................................................................................................... 54

    6 AUTOMATIC SOFTWARE CHANGE ............................................................................................ 55 6.1 Error during execution of ins_swcx.sh ................................................................... 55

    6.1.1 Reference FR: None. .................................................................................................... 55 6.1.2 Problem description ...................................................................................................... 55 6.1.3 Corrective action ........................................................................................................... 56

    6.2 rmdir fails during execution of ins_swcx.sh when cygwin is installed on the PC 57

    6.2.1 Reference FR: 3BKA13FBR163888 ............................................................................. 57 6.2.2 Problem description ...................................................................................................... 57 6.2.3 Corrective action ........................................................................................................... 57

    6.3 Error "Temporary local directory error" on IMT during step 0 ............................. 57 6.3.1 Reference FR: None. .................................................................................................... 57 6.3.2 Problem description ...................................................................................................... 57 6.3.3 Corrective action ........................................................................................................... 57

    6.4 Error "File Access Error" with dlv.bck always appears when doing SW replacement ................................................................................................................................ 57

    6.4.1 Reference FR: 3BKA20FBR150527 ............................................................................. 57 6.4.2 Problem description ...................................................................................................... 57 6.4.3 Corrective action ........................................................................................................... 57

    6.5 Error at step 2/10 (Creation) ..................................................................................... 59 6.5.1 Reference FR: None. .................................................................................................... 59 6.5.2 Problem description ...................................................................................................... 59 6.5.3 Corrective action ........................................................................................................... 59

    6.6 Error at step 3/10 (Verify) .......................................................................................... 60 6.6.1 Reference FR: 3BKA20FBR099035 = 3BKA13FBR102355 ........................................ 60 6.6.2 Problem description ...................................................................................................... 60 6.6.3 Corrective action ........................................................................................................... 60 6.6.4 Reference FR: 3BKA13FBR188065 ............................................................................. 63 6.6.5 Reference FR: 3BKA13FBR194235, 3BKA13CBR193877, 3BKA13FBR181238 ....... 64

    6.7 Error at step 5/10 (Isolation) ..................................................................................... 65 6.7.1 Reference FR: 3BK - A13FBR096085 / 105356 / 112480 - A20FBR096035 / 105055 / 129810 / 139842 - A23FBR174097 ......................................................................................... 65 6.7.2 Save traces ................................................................................................................... 65 6.7.3 Problem description ...................................................................................................... 65 6.7.4 Specific case for 3BKA20FBR129810: Problem occurs while Backup Server is down ... 66 6.7.5 Specific case for 3BKA13FBR175829: broken shared disk ......................................... 67

    6.8 Error at step 6/10 (Major version change)............................................................... 74 6.8.1 Reference FR: 3BKA13FBR107676 ............................................................................. 74 6.8.2 Problem description ...................................................................................................... 74 6.8.3 Check if disks are shared correctly ............................................................................... 74 6.8.4 Corrective action ........................................................................................................... 74

    6.9 Error at step 7/10 (Validation)................................................................................... 75 6.9.1 Reference FR: None. .................................................................................................... 75 6.9.2 Problem description ...................................................................................................... 75 6.9.3 Corrective action ........................................................................................................... 75

    6.10 Control station reboots in loop with reset_code 214 after installation of BL22 . 75 6.10.1 Reference FR: 3BKA13FBR170335 ............................................................................. 75 6.10.2 Problem description ...................................................................................................... 75 6.10.3 Corrective action ........................................................................................................... 76

    6.11 MFS UNIX patch installation makes Control Station unusable (B9 MR1 ED2).... 76 6.11.1 Reference FR: 3BKA23FBR174370 ............................................................................. 76 6.11.2 Problem description ...................................................................................................... 76 6.11.3 Corrective action ........................................................................................................... 78

    6.12 MFS UNIX patch installation fails with a core file generated from 'install_patch_du' ....................................................................................................................... 78

    6.12.1 Reference FR: 3BKA45FBR188097/3BKA25FBR188087/ 3BKA13FBR189822......... 78 6.12.2 Problem description ...................................................................................................... 78 6.12.3 Corrective action ........................................................................................................... 79

    6.13 bul file execution returns 1 error ............................................................................. 80

    6.13.1 Reference FR: 3BKA13FBR186955 ............................................................................. 80 6.13.2 Problem description ...................................................................................................... 80 6.13.3 Corrective action ........................................................................................................... 81

    7 MFS RUNNING............................................................................................................................... 82 7.1 The stand-by station is not operational .................................................................. 83

    7.1.1 Reference FR: None. .................................................................................................... 83 7.1.2 Problem description ...................................................................................................... 83 7.1.3 Corrective action ........................................................................................................... 83

    7.2 Station not reachable ................................................................................................ 83 7.2.1 Reference FR: none...................................................................................................... 83 7.2.2 Problem description ...................................................................................................... 83 7.2.3 Corrective action ........................................................................................................... 84 7.2.4 Problem solved.............................................................................................................. 84

    7.3 System console not reachable................................................................................. 84 7.3.1 Reference FR: none...................................................................................................... 84 7.3.2 Problem description ...................................................................................................... 84 7.3.3 Corrective action ........................................................................................................... 84

    7.4 A process is looping ................................................................................................. 85 7.4.1 Reference FR: 3BKA45FBR119174 ............................................................................. 85 7.4.2 Problem description ...................................................................................................... 85 7.4.3 Corrective action ........................................................................................................... 85 7.4.4 impacts .......................................................................................................................... 86

    7.5 Reboots in loop on MFS reset due to bad IP address ........................................... 86 7.5.1 Reference FR: 3BKA20FBR079434 - 3BKA20FBR081233 (close NIP) ...................... 86 7.5.2 Problem description ...................................................................................................... 86 7.5.3 Corrective action ........................................................................................................... 86

    7.6 Reboots in loop due to no more disk space........................................................... 87 7.6.1 Reference FR: None ..................................................................................................... 87 7.6.2 Problem description ...................................................................................................... 87 7.6.3 Corrective action ........................................................................................................... 87

    7.7 OMC-MFS link problem at different interface cases .............................................. 88 7.8 Ethernet connection problem................................................................................... 90

    7.8.1 Reference FR: none...................................................................................................... 90 7.8.2 Problem description ...................................................................................................... 90 7.8.3 Corrective Action........................................................................................................... 90

    7.9 Sleeping cells............................................................................................................. 90 7.9.1 Alerter definition ............................................................................................................ 90

    7.10 DS10 servers don't come up automatically after power off/power on................. 91 7.10.1 Reference FR: 3BKA45FBR17363, 3BKA20FBR135619............................................. 91 7.10.2 Problem description ...................................................................................................... 91 7.10.3 Corrective action ........................................................................................................... 93

    7.11 Failure on Update Remote Inventory....................................................................... 94 7.11.1 Reference FR: none...................................................................................................... 94 7.11.2 Problem description ...................................................................................................... 94 7.11.3 Corrective Action........................................................................................................... 94

    7.12 MFS remote inventory is missing for some boards............................................... 94 7.12.1 Reference FR: 3BKA20FBR186005 ............................................................................ 94 7.12.2 Problem description ...................................................................................................... 94 7.12.3 Corrective action ........................................................................................................... 95

    7.13 Connection by ftp from a MFS station to an external server is impossible ........ 95 7.13.1 Reference FR: 3BKA20FBR164817 ............................................................................. 95 7.13.2 Problem description ...................................................................................................... 95 7.13.3 Corrective action ........................................................................................................... 95

    7.14 The trace server stops running after a while.......................................................... 96 7.14.1 Reference FR: 3BKA13FBR169218 ............................................................................. 96 7.14.2 Problem description ...................................................................................................... 96 7.14.3 Corrective action ........................................................................................................... 96

    7.15 TRACE_SERVER does not run................................................................................. 96 7.15.1 Reference FR: 3BKA13FBR164932 ............................................................................. 96 7.15.2 Problem description ...................................................................................................... 96

    7.15.3 Corrective action ........................................................................................................... 97

    7.16 GPU traces are not completed ................................................................................. 97 7.16.1 Reference FR: none...................................................................................................... 97 7.16.2 Problem description ...................................................................................................... 97 7.16.3 Corrective action ........................................................................................................... 97

    7.17 Impossible to load patch GPU B8 on GPUs............................................................ 97 7.17.1 Reference FR: 3BKA13FBR164932 ............................................................................. 97 7.17.2 Problem description ...................................................................................................... 97 7.17.3 Corrective action ........................................................................................................... 97

    7.18 Unix patch installation from OMC stopped due to a network failure................... 97 7.18.1 Reference FR: none...................................................................................................... 97 7.18.2 Problem description ...................................................................................................... 97 7.18.3 Corrective action ........................................................................................................... 98

    7.19 After a roll-back it is impossible to open the IMT terminal ................................... 98 7.19.1 Reference FR: 3BKA20FBR162930 ............................................................................. 98 7.19.2 Problem description ...................................................................................................... 98 7.19.3 Corrective action ........................................................................................................... 98

    7.20 Telnet access from Windows .................................................................................. 98 7.20.1 Reference FR: none...................................................................................................... 98 7.20.2 Problem description ...................................................................................................... 98 7.20.3 Corrective Action........................................................................................................... 98

    7.21 no more available disk space on /usr...................................................................... 99 7.21.1 Reference FR: 3BKA20FBR176683 ............................................................................. 99 7.21.2 Problem description ...................................................................................................... 99 7.21.3 Corrective action ........................................................................................................... 99

    7.22 CRAFT cannot connect to MFS floating IP:wrong httpd.conf ............................ 100 7.22.1 Reference FR: 3BKA13FBR177317 ........................................................................... 100 7.22.2 Problem description .................................................................................................... 100 7.22.3 Corrective action ......................................................................................................... 100

    7.23 not enough space for Backup MIB ........................................................................ 101 7.23.1 Reference FR: none.................................................................................................... 101 7.23.2 Problem description .................................................................................................... 101 7.23.3 Corrective action ......................................................................................................... 102

    7.24 not possible to get PM of MFS from OMC, not possible to unlock omcxchg account from User management option of IMT................................................................. 102

    7.24.1 Reference FR: 3BKA13FBR183494, 3BKA13FBR183497......................................... 102 7.24.2 Problem description .................................................................................................... 102 7.24.3 Corrective action ......................................................................................................... 103

    7.25 Impossible to enable MRTG Collector................................................................... 103 7.25.1 Reference FR: 3BKA13FBR186503 ........................................................................... 103 7.25.2 Problem description .................................................................................................... 103 7.25.3 Corrective action ......................................................................................................... 103

    7.26 Cell parameters modification is not allowed from IMT (BUI request) ............... 104 7.26.1 Reference FR: none.................................................................................................... 104 7.26.2 Problem description .................................................................................................... 104 7.26.3 Corrective action ......................................................................................................... 104

    7.27 Impossible to install MFS Sanity Check Script (AW11EP_00D) ......................... 104 7.27.1 Reference FR: 3BKA13FBR196644 ........................................................................... 104 7.27.2 Problem description .................................................................................................... 104 7.27.3 Corrective action ......................................................................................................... 105

    7.28 /RESULT is full ......................................................................................................... 105 7.28.1 Reference FR: 3BKA20FBR188564 .......................................................................... 105 7.28.2 Problem description .................................................................................................... 105 7.28.3 Corrective action ......................................................................................................... 105

    7.29 Discrepancy concerning shared disk.................................................................... 106 7.29.1 Reference FR: 3BKA20FBR167439 .......................................................................... 106 7.29.2 Problem description .................................................................................................... 106 7.29.3 Corrective action ......................................................................................................... 106

    8 CRASH/TRACES.......................................................................................................................... 108 8.1 Determine crash cause ........................................................................................... 108

    8.2 Save traces............................................................................................................... 108 8.3 O&M trace................................................................................................................. 109

    8.3.1 SCIM (RTA)................................................................................................................. 109 8.3.2 Q3................................................................................................................................ 109 8.3.3 RETIX.......................................................................................................................... 110 8.3.4 UNIX............................................................................................................................ 110

    8.4 GPU trace ................................................................................................................. 110 8.4.1 Trace level................................................................................................................... 110 8.4.2 Which level to activate ................................................................................................ 111 8.4.3 How to modify size of mfs_trace_p_XX file?............................................................... 111

    8.5 JBETI trace ............................................................................................................... 112 8.6 Traces of unix patch installation............................................................................ 113 8.7 Problems................................................................................................................... 114

    8.7.1 GPU traces.................................................................................................................. 114 8.7.2 Trace Server................................................................................................................ 114 8.7.3 Disk quota ................................................................................................................... 114 8.7.4 mfs_trace_p_XX traces location ................................................................................. 114

    8.8 System and Tomas (Nectar was the name in a former time) traces .................. 115 8.8.1 system traces (if required)........................................................................................... 115 8.8.2 Advfs traces................................................................................................................. 115 8.8.3 TOMAS traces............................................................................................................. 116

    9 CRASH/TRACES (MFS EVOLUTION ONLY) ............................................................................. 117 9.1 Determine crash cause ........................................................................................... 117 9.2 Save traces............................................................................................................... 117 9.3 O&M trace................................................................................................................. 117

    9.3.1 SCIM (RTA)................................................................................................................. 117 9.3.2 Q3................................................................................................................................ 118 9.3.3 RETIX.......................................................................................................................... 118 9.3.4 LINUX.......................................................................................................................... 119

    9.4 GPU trace ................................................................................................................. 119 9.4.1 Trace level................................................................................................................... 119 9.4.2 Which level to activate ................................................................................................ 119 9.4.3 How to modify size of mfs_trace_p_X_XX file? .......................................................... 120

    9.5 NE1oE Traces........................................................................................................... 121 9.6 Problems................................................................................................................... 122

    9.6.1 GPU traces.................................................................................................................. 122 9.6.2 Trace Server................................................................................................................ 122 9.6.3 Disk quota ................................................................................................................... 122 9.6.4 mfs_trace_p_X_XX traces location............................................................................. 122

    10 VARIOUS INFORMATION ........................................................................................................... 123 10.1 User account creation via IMT on MFS...................................................................... 123

    10.1.1 Reference FR: 3BKA45FBR144680 ........................................................................... 123 10.1.2 Problem description .................................................................................................... 123 10.1.3 Corrective action ......................................................................................................... 123

    10.2 Update disk usage information .............................................................................. 123 10.2.1 Problem description .................................................................................................... 123 10.2.2 Action .......................................................................................................................... 124

    10.3 Shared disks access ............................................................................................... 124 10.3.1 Problem description .................................................................................................... 124 10.3.2 Action .......................................................................................................................... 124

    10.4 How to get MFS component versions ................................................................... 126 10.4.1 Problem description .................................................................................................... 126 10.4.2 Action .......................................................................................................................... 126

    10.5 How to know how many IMT are open at same time ? ........................................ 129 10.5.1 Reference FR: none.................................................................................................... 129 10.5.2 Problem description .................................................................................................... 129 10.5.3 Corrective action ......................................................................................................... 129

    10.6 How to update time from OMC............................................................................... 130 10.6.1 Reference FR: 3BKA13FBR141970 ........................................................................... 130

    10.6.2 Problem description .................................................................................................... 130 10.6.3 Corrective action ......................................................................................................... 130 10.6.4 Problem solved............................................................................................................ 130

    10.7 MFS restoration problem ........................................................................................ 131 10.7.1 Problem description .................................................................................................... 131 10.7.2 Corrections description ............................................................................................... 131

    10.8 MFS system restoration problem: supervision ( MFS Evolution only ) ............ 131 10.9 Backup ( MFS Evolution only ) ............................................................................... 132 10.10 Restore ( MFS Evolution only )............................................................................... 132 10.11 Check if backup Mib is corrupted .......................................................................... 133

    10.11.1 Reference FR.............................................................................................................. 133 10.11.2 Problem description .................................................................................................... 133 10.11.3 Correction description ................................................................................................. 135

    10.12 Reinstallation of the MFS and restauration of data from OMC........................... 135 10.12.1 Reference FR: 3BKA13CBR177682........................................................................... 135 10.12.2 Problem description .................................................................................................... 135 10.12.3 Correction description ................................................................................................. 135

    10.13 How to get contents of Unix patch BL .................................................................. 135 10.13.1 Problem description .................................................................................................... 135 10.13.2 Action .......................................................................................................................... 135

    10.14 How to restore the MIB without needing full reinstallation ................................ 138 10.14.1 Reference FR: 3BKA13CBR177682........................................................................... 138 10.14.2 Problem description .................................................................................................... 138 10.14.3 Correction description ................................................................................................. 138

    10.15 AuditMFS script to prevent any potential problem on the MFS.......................... 139 10.15.1 Reference FR: 3BKA13CBR176689........................................................................... 139 10.15.2 Return codes explanation ........................................................................................... 139 10.15.3 Corrective action ......................................................................................................... 141 10.15.4 Example on AS800 (based on Tomas RC23)............................................................. 148 10.15.5 Example on DS10 (based on Tomas RC23)............................................................... 156 10.15.6 Example on DS10 (based on Tomas RC40)............................................................... 163

    10.16 AuditMFS script to prevent any potential problem on the MFS ( MFS Evolution only ) 170

    10.16.1 Reference FR: 3BKA13CBR176689........................................................................... 170 10.16.2 Return codes explanation ........................................................................................... 171 10.16.3 Corrective action ......................................................................................................... 172 10.16.4 Example on MX-MFS (based on Tomix RL42A)......................................................... 175

    10.17 Serial splitter and RJ45 converter for Trouble shooting ( MFS Evolution only)185 10.17.1 Reference FR: None ................................................................................................... 185 10.17.2 Problem description .................................................................................................... 185 10.17.3 Action .......................................................................................................................... 186

    10.18 How to generate/backup on a platform MFS a virgin MIB and how to import this MIB on a field MFS (same architecture / same SW level)..................................................... 186

    10.18.1 Reference FR: 3BKA13CBR194432........................................................................... 186 10.18.2 Problem description .................................................................................................... 186 10.18.3 Action .......................................................................................................................... 187

    10.19 How to to detect JBETI is not blocked .................................................................. 187 10.19.1 Reference FR: none.................................................................................................... 187 10.19.2 Problem description .................................................................................................... 187 10.19.3 Correction description ................................................................................................. 187

    11 GLOSSARY AND ABBREVIATIONS .......................................................................................... 189 A HW SETTINGS OF ENVIRONMENTAL VARIABLES (FW)........................................................ 190

    INTERNAL REFERENCED DOCUMENTS

    Not applicable

    REFERENCED DOCUMENTS

    [ 1 ] MFS B9 installation user guide, reference 3BK 09679 JAAA RJZZA

    [ 2 ] EVOLIUM A9135 MFS MAINTENANCE HANDBOOK, reference 3BK 20935 AAAA PCZZA

    [ 3 ] B8/B9 A9135 MFS SOFTWARE MIGRATION Release B9, reference 3BK 17422 0202 RJZZA

    RELATED DOCUMENTS

    PMU logging messages description and principles release B6.2 3BK 09850 FCAD PWZZA

    OPEN POINTS / RESTRICTIONS

    No open points or restrictions have been found.

    1 INTRODUCTION

    1.1.1 Document organisation

    This document is organized as follows:

    1) This chapter

    2) Troubles coming from the GPU, most of the time with a Quality Alert attached

    3) Troubles occurring at installation time

    4) Troubles occurring at SW change time, depending on the SWC phase

    5) Troubles occurring when the MFS is started

    6) What to do in case of a crash, and which information to keep

    7) How to set and get traces

    8) General information, such as disk usage

    Plus appendices for specific information:

    A) IOLAN configuration

    B) HW setting of environmental variables

    1.1.2 Presentation

    Each chapter is introduced with a table summarising the problems addressed, their origin and the fix.

    Very few chapters can be shown to the customer. They are highlighted in green.

    Commands are presented in grey rectangles.

    2 GPU

    What/behavior | Trouble origin | Fix
    1) GPUs disappear from IMT | 1 or more GPU with bad components | Change GPU
    2) Impossible GPU switch over | JAE1 applique mistake | Change JAE1
    3) GPU reboots continuously | GPU FW mistake | Change the GPU
    4) GPU connection problem | Connection, ethernet | Check Ethernet, extract and re-plug the board
    5) GPU problem but alarm is "Failure of a JAETI1 applique" | Faulty GPU | Change faulty GPU
    6) GPU switch over no more possible | JBETI becomes blocked | Reset the active JBETI
    7) GPU SW is not loaded | No more DHCP lease available | Remove DHCP lease file

    2.1 GPUs disappear from the IMT

    2.1.1 Reference FR: None.

    2.1.2 Problem description

    One or more (up to all) GPUs in a subrack disappear from time to time from the Craft terminal (IMT), as if they had been unplugged.

    GSM and GPRS remain available, but it is impossible to perform any remote action on these GPUs (download or modify the configuration, switch over, reset data, lock).

    A hardware reset (=> telecom outage, GSM + GPRS) solves the problem only for a short time (< 1 day).

    2.1.3 Corrective action

    At least one GPU in the subrack may have bad hardware components.

    All GPUs of the subrack must be checked.

    To check one GPU, unplug it (=> telecom outage, GSM + GPRS).

    Then compare the references of the 5 components with the following pictures.

    For these 5 components (XXX):

    FB2041 is the good reference

    FBL2041 is a wrong reference!

    A bad component must be changed.

    [Pictures of the JBGPU board; the components to check are marked XXX]

    2.2 GPU SO Impossible

    2.2.1 Reference FR: 3BKA20FBR108914

    2.2.2 Problem description

    Switch-over of one GPU (by Craft Terminal) to the spare GPU is not possible: the spare GPU begins to load its telecom configuration but, a few seconds later, the board is blocked and the "On free run mode" alarm appears. The traffic is stopped.

    When a switch-back of the GPU board is done, the traffic comes back, and everything is normal.

    To confirm the problem, try switching some GPUs and some appliques.

    The problem is due to a difference between 2 variants of the appliques in the technology of the redundancy bus transceivers: the AxABxx version is equipped with component FB2041BB (running with VCC = 5 V), and the AxAAxx version is equipped with FBL2041BB (running with VCC = 3.3 V).

    A hardware correction is under study for 3BK08231AxABxx PCM appliques.

    2.2.3 Corrective action

    It has been demonstrated that the PCM applique with reference number 3BK08231AxABxx causes the problem. PCM applique 3BK08231AxAAxx should be fully operational.

    JAE1C boards (75 ohm PCM): check the PCM applique reference:

    3BK08231ABAAxx: good board

    3BK08231ABABxx: faulty board: replace the board with a good JAE1C

    JAE1 boards (120 ohm PCM): check the PCM applique reference:

    3BK08231AAAAxx: good board

    3BK08231AAABxx: faulty board: replace the board with a good JAE1
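    The reference check in 2.2.3 can be sketched as a small helper, for example to screen a site inventory list offline. This is only an illustration: the function name is hypothetical, and the pattern is inferred solely from the four references listed above (the fourth variant letter A marks the good variant, B the faulty one).

```python
import re

# Pattern inferred from section 2.2.3: 3BK08231 A <x> A <A|B> <xx>.
# Assumption: only the fourth variant letter distinguishes good from faulty.
_APPLIQUE_RE = re.compile(r"^3BK08231A([A-Z])A([AB])\w{2}$")


def classify_applique(reference: str) -> str:
    """Return 'good', 'faulty' or 'unknown' for a PCM applique reference."""
    match = _APPLIQUE_RE.match(reference.strip().upper())
    if match is None:
        # Not a 3BK08231 applique reference covered by this section.
        return "unknown"
    # 'A' is the variant reported as operational, 'B' the one causing the problem.
    return "good" if match.group(2) == "A" else "faulty"
```

    Any reference that does not follow the 3BK08231AxA?xx form is reported as unknown rather than guessed.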

    2.3 GPU reboots continuously

    2.3.1 Reference FR: 3BKA20FBR119782

    2.3.2 Problem description

    After GPRS has been configured and the GPU and GPRS have been unlocked, the GPU reboots continuously. When a switchover is performed, the same problem occurs.

    In the internal GPU traces (file mfs_trace_p_XX), the following traces indicate a failure in the PMU package initialisations:

    DATA_ERR : T: 200 : rrmswcomp.cpp : 160 : Cell Traffic Package init failure...
    DATA_ERR : T: 200 : rrmswcomp.cpp : 172 : Bss Management Package init failure...

    Then check whether the GPU reference (on the front side of the board) is GPU 3BK08064ABAC01.

    2.3.3 Corrective action

    If the GPU reference is GPU 3BK08064ABAC01 and the behavior is as described above, contact the local TAC, who will replace the GPU and send the faulty GPU to the Alcatel Repair Center, where a fix will be applied.

    The problem is due to a wrong detection of the remote inventory by the GPU firmware: the firmware checks in the remote inventory the combination of functional variant (VF) and realization variant (VR), ABAA. This is a bug: it should check the (VF) AB field only and ignore the (VR) AC field. As ABAA is not found, the GPU board is not detected as a JBGPU2 (with 128 MB of PPC memory) but, by default, as a JBGPU (with 64 MB of PPC memory). This explains why some PMU packages cannot initialize their memory allocation.
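    The detection logic described above can be illustrated with a short sketch. It is not the firmware code: the function names are hypothetical, and the position of the VF and VR fields inside the reference (the two pairs of letters following 3BK08064) is an assumption based on the reference 3BK08064ABAC01 quoted in this section.

```python
from typing import Tuple

# PPC memory size per detected board type, as stated in section 2.3.3.
PPC_MEMORY_MB = {"JBGPU": 64, "JBGPU2": 128}


def split_variants(reference: str) -> Tuple[str, str]:
    """Split a reference such as 3BK08064ABAC01 into (VF, VR).

    Assumption: characters 9-10 are the functional variant (VF) and
    characters 11-12 the realization variant (VR).
    """
    return reference[8:10], reference[10:12]


def detect_as_described(reference: str) -> str:
    """Faulty behaviour: the VF and VR combination must be exactly ABAA."""
    vf, vr = split_variants(reference)
    return "JBGPU2" if vf + vr == "ABAA" else "JBGPU"


def detect_as_intended(reference: str) -> str:
    """Intended behaviour: only the functional variant (VF) is checked."""
    vf, _ = split_variants(reference)
    return "JBGPU2" if vf == "AB" else "JBGPU"
```

    With the reference 3BK08064ABAC01, the first function falls back to JBGPU (64 MB) while the second returns JBGPU2 (128 MB), which matches the memory allocation failures described above.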

    2.3.4 Problem solved

    Hardware correction under study.

    2.4 GPU connection problem

    2.4.1 Reference FR: none

    2.4.2 Problem description

    The GPU stays in the initial/idle state (Craft site view) and does not connect to the MFS. The LED can be either steady or blinking orange.

    2.4.3 Corrective action

    1. Check that at least one Ethernet link is plugged for that GPU into one of the switches.

    2. Launch a console on that GPU: plug a cable between the debug output of the applique and a COM port (CTRL uu to enter the GPU menu). Type help to list the available commands; ve / vi display the MAC / IP addresses.

    3. If the GPU initialization is stopped at boot request (the GPU does not know its IP address) => there is no connection between the GPU and the control station. Check that the UDP packets corresponding to the boot request are actually sent through one of the interfaces (tu1 or tu2).

    Set up tcpdump on the net:
    cd /dev
    ./MAKEDEV pfilt
    pfconfig +p +c tu1
    tcpdump -i tu1 udp port 68
    (if necessary: lan_config -i tu1 -s 10 -x 0 -a 0   # Set output to 10 Mega)

    Packets sent through port 68 are bootpc (client = GPU) ones. Packets on port 67 are the bootps (server = control station) answers.

    Check the file /etc/bootptab: it should have a line giving the board IP address according to the Ethernet address:
    gpu1_lg0: tc=DS.default: ha=00809F090804: ip=1.1.1.50: bf=Loader.hex:\

    ha is the Ethernet address; check with the console that the GPU gives the right address.

    4. If the GPU initialisation is stopped at BNP init (on the GPU console, the following message is printed: Wait for answer from GEM since x seconds) => there is communication between the GPU and the control station (it is not an Ethernet problem). It is a known bug (see FR:A13/90904). Workaround: extract the board and plug it in again (this may have to be done several times).
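    The /etc/bootptab check of step 3 can be sketched as follows, for instance on a copy of the file taken from the control station. The helper names are hypothetical; the sketch assumes the entry format shown above (colon-separated tag=value fields, hardware address written without separators).

```python
# Entry quoted in step 3 of section 2.4.3 (trailing backslash = line continuation).
EXAMPLE_ENTRY = "gpu1_lg0: tc=DS.default: ha=00809F090804: ip=1.1.1.50: bf=Loader.hex:\\"


def parse_bootptab_entry(entry: str) -> dict:
    """Parse one bootptab entry into {'name': ..., 'tc': ..., 'ha': ..., 'ip': ...}."""
    # Drop the trailing continuation backslash, then split on ':'.
    fields = [f.strip() for f in entry.rstrip().rstrip("\\").split(":")]
    fields = [f for f in fields if f]
    parsed = {"name": fields[0]}
    for field in fields[1:]:
        tag, _, value = field.partition("=")
        parsed[tag] = value
    return parsed


def mac_matches(entry: str, console_mac: str) -> bool:
    """Compare the 'ha' tag with the MAC address displayed on the GPU console."""
    def normalise(mac: str) -> str:
        return mac.replace(":", "").replace("-", "").upper()

    return normalise(parse_bootptab_entry(entry).get("ha", "")) == normalise(console_mac)
```

    For example, mac_matches(EXAMPLE_ENTRY, "00:80:9F:09:08:04") compares the entry with an address typed in colon notation.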

    2.5 GPU problem but alarm is "Failure of a JAETI1 applique"

    2.5.1 Reference FR: 3BKA13FBR177178

    2.5.2 Problem description

    The origin of this issue seems to be a real HW problem (faulty GPU), but the alarm is reported on the wrong board (the problem occurred in B8 MR5 Ed4).

    The GPU's part number is 3BK08064ACAB06 and it is not impacted by known Quality Alerts.

    2.5.3 Corrective action

    Unplug the problematic GPU, then reset the JBETI on either the left-hand or the right-hand side.

    2.6 GPU switch over no more possible, JBETI problems

    2.6.1 Reference FR: 3BKA20FBR149993, 3BKA20FBR151855 and 3BKA13FBR163557

    2.6.2 Problem description

    Sometimes the JBETI becomes blocked: it no longer handles any request (remote inventory, GPU reset, GPU switchover), and alarms are neither cleared nor raised, although all LEDs on the JBETI are green. A switchover to the spare GPU is performed, but no telecom traffic is possible.

    This situation can occur in the following cases:

    - After a GPU crash: on a GPU software crash, O&M detects the loss of supervision of this GPU board and sends a reset order to this GPU through the JBETI, but as the JBETI is blocked, the GPU does not reset/reboot. After a while (about 3 minutes), as O&M does not see the GPU rebooting, it concludes that the GPU is failing. O&M then sends a GPU switchover order (to route the PCM signal from the applique to the spare GPU) and sends the telecom configuration to the spare GPU. As the JBETI is blocked, it does not handle the GPU switchover order, so no traffic is possible on the spare GPU.

    - After a manual GPU switchover command sent from the IMT: O&M sends a GPU switchover order (to route the PCM signal from the applique to the spare GPU) and sends the telecom configuration to the spare GPU. As the JBETI is blocked, it does not handle the GPU switchover order, so no traffic is possible on the spare GPU.

    To confirm that the JBETI is blocked: a remote inventory command from the IMT fails with a time-out.

    2.6.3 Preventive action

    None

    2.6.4 Corrective action

    When the JBETI is suspected to be blocked, reset the active JBETI.

    2.6.4.1 Problem solved

    MFS.PATCH.B9_0.RCxx.11EP_00G (JBETI_AA patch) and MFS.PATCH.B9_0.RCxx.11EP_00H (JBETI_AB patch) solve the problem in B9 MR1 ED4 QD11 (MFSSAW11E/41E, MFSSAW11F/41F)

    2.7 GPU SW is not loaded ( MFS Evolution only )

    2.7.1 Reference FR: 3BKA13FBR175541

    2.7.2 Problem description

    The lease related to BOOTP is infinite, so the lease file must be removed in order to replace boards without constraint (32 different GP boards can be plugged).

    2.7.3 Corrective action

    Remove the file /var/dhcp/dhcpd.leases:

    1. At any terminal accessing the active STATION, type:
    STATION_x> cd /var/dhcp
    STATION_x> rm dhcpd.leases
    STATION_x> rm dhcpd.leases~

    2. Then kill the DHCP server (it may trigger an OMCP switch-over):
    STATION_x> ps -efd | grep dhcpd
    root 1205 1115 0 Dec13 ? 00:00:00 /usr/nectar/bin/dhcpd_ctrl
    root 1467 1205 0 Dec13 ? 00:00:02 /usr/sbin/dhcpd -cf /nfm_local/spdata/nectar/dhcpd/dhcpd.conf -f -q eth0 eth1
    STATION_x> kill -9 1467
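    In step 2, the process to kill is the dhcpd server itself (PID 1467 in the example), not its dhcpd_ctrl parent. The selection can be sketched as below; the function name is hypothetical and the sketch assumes the eight-column ps layout shown above (UID, PID, PPID, C, STIME, TTY, TIME, CMD), with a start time that contains no space.

```python
import os

# The two ps lines quoted in section 2.7.3.
PS_SAMPLE = """\
root 1205 1115 0 Dec13 ? 00:00:00 /usr/nectar/bin/dhcpd_ctrl
root 1467 1205 0 Dec13 ? 00:00:02 /usr/sbin/dhcpd -cf /nfm_local/spdata/nectar/dhcpd/dhcpd.conf -f -q eth0 eth1
"""


def find_dhcpd_pids(ps_output: str) -> list:
    """Return the PIDs whose executable is exactly 'dhcpd'.

    Lines for dhcpd_ctrl, for the grep command itself and for the ps
    header are ignored.
    """
    pids = []
    for line in ps_output.splitlines():
        columns = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(columns) < 8 or not columns[1].isdigit():
            continue
        executable = columns[7].split()[0]
        if os.path.basename(executable) == "dhcpd":
            pids.append(int(columns[1]))
    return pids
```

    Only the PID returned for the sample output would be passed to kill; the dhcpd_ctrl process is left untouched.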

    2.8 MFS crash while we insert a GPU

    2.8.1 Reference FR: 3BKA20FBR183063

    2.8.2 Problem description

    When a GPU is inserted in a subrack, the MFS crashes with a log entry that looks like:

    Feb 21 14:47:50 STATION_A xma[1832]: [.388 ms] [ERRCOUNT 1838 0xc NMA (0)] nnn_util.c - line:479 - label:Nb_station received is too big (268435461)

    The message "Nb_station received is too big (268435461)" identifies the problem.

    2.8.3 Corrective action

    The crash is due to a wrong message sent by the GPU. A new GPU must be used.
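    To look for this signature in a saved copy of the station log, a minimal sketch is given below. The function name is hypothetical; the pattern is taken from the log line quoted in 2.8.2 and matches any reported value, not only 268435461.

```python
import re

# Log line quoted in section 2.8.2.
LOG_LINE = (
    "Feb 21 14:47:50 STATION_A xma[1832]: [.388 ms] [ERRCOUNT 1838 0xc NMA (0)] "
    "nnn_util.c - line:479 - label:Nb_station received is too big (268435461)"
)

_NB_STATION_RE = re.compile(r"Nb_station received is too big \((\d+)\)")


def find_nb_station_errors(log_text: str) -> list:
    """Return the Nb_station values reported by every matching log line."""
    return [int(m.group(1)) for m in _NB_STATION_RE.finditer(log_text)]
```

    An empty list means the signature is absent from the text, in which case the crash has another origin.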

    3 INSTALLATION

    What/behavior | Trouble origin | Fix
    1) Both stations restart | Wrong address declaration | Modify address
    2) rlogin is refused | Impossible rlogin/telnet as root | Modify securettys file
    3) Unix boot impossible | Wrong default kernel | Modify boot_file variable in Firmware
    4) "dataPatch.bul" error during scratch installation in B9 MR4 | conf_alarm [cfgalarm101AH] object is defined twice in bul files | This error is expected and has no bad effect on the MFS

    3.1 Station restart

    3.1.1 Reference FR: None.

    3.1.2 Problem description

    In some cases, when trying to restart one station, both stations restart. This may be because they are declared with a wrong address.

    3.1.3 Corrective action

    Check (and if necessary modify) the firmware software configuration of the control stations, which can be accessed through the system console.

    1. At any terminal accessing one of the control stations (STATION_A or STATION_B, by telnet or rlogin), it is possible to access the system console of any control station by typing either:

    STATION_x> telnet 1.1.1.20 10002
    (for STATION_A system console)

    STATION_x> telnet 1.1.1.20 10003
    (for STATION_B system console)

    2. Press Return a few times to get the prompt; then:

    1) If the UNIX login or the shell prompt is displayed: log in as root if necessary, then halt the station cleanly down to the firmware by typing the following command:

    STATION_x> init 0

    When the firmware prompt is displayed ( >>> ), go to step 6).

    2) If the machine does not react and the display is frozen: force the machine to stop by typing the keystroke sequence:

    rmc

    3) Then, when the RMC prompt is available:
    RMC>halt in

    4) Then again:

    rmc

    5) Then, when the RMC prompt is available:
    RMC>halt out

    The firmware prompt should then be available.

    6) Type the following command:
    >>>show *

    (gives the values of all firmware variables)

    (Refer to Appendix B for a list of currently advised values depending on the hardware configuration)

    If values are erroneous, especially pka0_host_id, pkb0_host_id, pkc0_host_id and auto_action, modify them. For example:
    >>>set pkc0_host_id 6

    When all checks and modifications are done, type:
    >>>init

    The machine should now reboot automatically.

    Now release the system console (as it is used from time to time by NECTAR Hardware Management) :

    - On a Sun station, by typing:

    Ctrl-]

    (Control and closing square bracket simultaneously), then:
    telnet>quit

    - If the console is accessed from the other station through a PC/NT X terminal, simply close the window.

    (Another method to release the system console is to restart the iolan (see other chapter) from another session).
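    The comparison made in step 6) between the output of show * and the advised values can be sketched as follows, working on a captured copy of the console output. The helper names are hypothetical, the one-variable-per-line "name value" layout is an assumption, and the advised values must be taken from the appendix; none are hard-coded here.

```python
def parse_show_output(text: str) -> dict:
    """Turn 'name value' lines from the firmware show command into a dict."""
    values = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts or parts[0].startswith(">>>"):
            continue  # skip blank lines and the echoed command
        values[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return values


def find_mismatches(actual: dict, advised: dict) -> dict:
    """Return {name: (actual value, advised value)} for each differing variable."""
    return {
        name: (actual.get(name), want)
        for name, want in advised.items()
        if actual.get(name) != want
    }


def set_commands(mismatches: dict) -> list:
    """Build the firmware 'set' commands that would restore the advised values."""
    return [f"set {name} {want}" for name, (_, want) in sorted(mismatches.items())]
```

    For instance, if the captured output contains pkc0_host_id 7 while the advised value is 6, the only command produced is set pkc0_host_id 6, as in the example of step 6).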

    3.2 Impossible to rlogin/telnet to MFS as root

    3.2.1 Reference FR: None.

    3.2.2 Problem description

    When trying to rlogin to the control station as root, the action is refused by the control station (access denied).

    3.2.3 Corrective action

    The file /etc/securettys is incorrect: it should include a line ptys to allow root login from another terminal.

    Login as admin on one of the control stations.

    telnet 1.1.1.20 10002 /10003 to gain access to the system console (see 3.1.3 for more details)

    login as root

    type echo ptys >> /etc/securettys

    This adds the line ptys in securettys

    perform the same action on the other control station

    release the terminal or (in case of problem) reboot the iolan (telnet 1.1.1.20, return, su, iolan, reboot)
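    The echo command above appends a ptys line each time it is run. A minimal sketch of the same correction that adds the line only when it is missing is given below; the function name is hypothetical, and on a real station the path would be /etc/securettys.

```python
def ensure_line(path: str, wanted: str = "ptys") -> bool:
    """Append `wanted` to the file unless a line already equals it.

    Returns True when the file was modified, False when nothing was needed.
    """
    try:
        with open(path, "r") as handle:
            content = handle.read()
    except FileNotFoundError:
        content = ""
    if wanted in (line.strip() for line in content.splitlines()):
        return False
    with open(path, "a") as handle:
        # Make sure the new entry starts on its own line.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(wanted + "\n")
    return True
```

    Running the function twice on the same file leaves a single ptys line, so the action can be repeated safely on both control stations.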

    3.3 Unix boot impossible (wrong default kernel)

    3.3.1 Reference FR: None.

    3.3.2 Problem description

    Unix cannot boot because it cannot open the default kernel 'vmunix.pre_capmn':

    You should have the following at the console:

    ff.fe.fd.fc.fb.fa.f9.f8.f7.f6.f5.f3.f2.f1.f0.ef.df.ee.f4.
    probing hose 0, PCI
    probing PCI-to-EISA bridge, bus 1
    probing PCI-to-PCI bridge, bus 2
    bus 0, slot 5 -- pka -- QLogic ISP10x0
    bus 0, slot 6 -- vga -- S3 Trio64/Trio32
    bus 2, slot 0 -- ewa -- DE500-BA Network Controller
    bus 2, slot 1 -- ewb -- DE500-BA Network Controller
    bus 2, slot 2 -- ewc -- DE500-BA Network Controller
    bus 2, slot 3 -- ewd -- DE500-BA Network Controller
    bus 0, slot 12, function 0 -- pkb -- NCR 53C875
    bus 0, slot 12, function 1 -- pkc -- NCR 53C875
    ed.ec.*** keyboard not plugged in...
    eb.....ea.e9.e8.e7.e6.e5.e4.e3.e2.e1.e0.
    V5.8-24, built on Jul 11 2001 at 10:57:51
    Memory Testing and Configuration Status
    512 Meg of System Memory
    Bank 0 = 512 Mbytes(128 MB Per DIMM) Starting at 0x00000000
    Bank 1 = No Memory Detected

    CPU 0 booting

    waiting for pkb0.6.0.12.0 to poll...
    (boot dka0.0.0.5.0 -file vmunix.pre_capmn -flags S)
    block 0 of dka0.0.0.5.0 is a valid boot block
    reading 16 blocks from dka0.0.0.5.0
    bootstrap code read in
    Building FRU table
    FRU table size = 0xbed
    base = 1d2000, image_start = 0, image_bytes = 2000
    initializing HWRPB at 2000
    initializing page table at 1ffce000
    initializing machine state
    setting affinity to the primary CPU
    jumping to bootstrap code

    Digital UNIX boot - Mon Nov 1 17:21:23 EST 1999

    can't open vmunix.pre_capmn

    Enter [option_1 ... option_n] Hit to boot default kernel 'vmunix.pre_capmn':

    This is due to a wrong value of the boot_file variable at Firmware level:

    >>>show boot*_file
    boot_file vmunix.pre_capmn
    booted_file vmunix.pre_capmn

    3.3.3 Corrective action

    Modify the boot_file variable:

    >>>set boot_file vmunix

    Verify the boot_file variable:

    >>>show boot_file
    boot_file vmunix
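    When working from a saved console capture, the diagnosis of section 3.3 can be sketched as below. The function names are hypothetical; the sketch relies only on the "can't open" line shown in 3.3.2 and on the corrective command given in 3.3.3, and assumes that vmunix is the expected kernel.

```python
import re


def failed_kernel(console_text: str):
    """Return the kernel file named in a "can't open <file>" line, or None."""
    match = re.search(r"^\s*can't open (\S+)\s*$", console_text, re.MULTILINE)
    return match.group(1) if match else None


def suggested_fix(console_text: str, expected: str = "vmunix"):
    """Return the firmware command restoring boot_file, or None if not applicable."""
    kernel = failed_kernel(console_text)
    if kernel is None or kernel == expected:
        # Either no open failure was logged, or the expected kernel itself
        # failed to open, which is a different problem from this section.
        return None
    return f"set boot_file {expected}"
```

    The returned string is the command to type at the >>> firmware prompt; the sketch does not send anything to the console.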

    3.4 "dataPatch.bul" error during scratch installation in B9 MR4

    3.