Troubleshoot Ashburn F5 Load Balancer

05.17.2012 -- Troubleshoot Ashburn F5 Load Balancer

Cloud Incubation MOP

Subject of MOP (http://) Troubleshoot Ashburn F5 Load Balancer

Date Scheduled (http://) 5/17/2012

Risk Assessment (http://) Green

Author (http://) Michael Slaughter

Manager (http://) Andrea Stebenne

Version (http://) 1.0

Status (http://) Requested

Technical Authorized by (http://) Insert after approval

ATTUID Insert after approval

Technical Authorized date Insert after approval

Management Authorized by (http://) Andrea Stebenne

ATTUID AS670S

Management Authorized date 5/16/2012

Postmortem (http://) N/A

Detailed Objective: (http://) One of the F5 VIPRION load balancers in Ashburn is offline due to errors that were causing it to fail over. n118402f5vprn2001 will undergo intrusive troubleshooting to determine the nature of the issue and, if a fix is not found during the change window, to reach a determination for resolution.

Risk Assessment (http://) None

Prerequisites (http://) None

Maintenance Impact (http://)

AT&T Impact (http://)

None

AT&T Notification (http://)

None

Customer Impact (http://)

Potential impact to customers using load balancer policies for the VDCs in Ashburn.

At all costs, the standby F5 node should remain the active node with no interruption to traffic.

Customer Notification (http://)

VWR should be notified of possible interruptions during the change window.

Maintenance Conflicts (http://)

None

Versions (http://) 1.0

Access Requirements (http://) All work will be performed by MHO Tier 2 and Tier 3 and the F5 vendor, along with IDC Remote Hands.

Repositories (http://)

Pre-Maintenance Checklists

Precautions (http://)

MOP Preparation

Step Prior to Maintenance Yes/No

1 Has an MR been Opened? http://wmis.web.att.com P17151 / S17151AO - RQ00023

2 Has this passed QA Testing? yes

3 Has this passed CTO Lab (Lab-C) Testing? yes

4 Any potential conflicts with other maintenance? Calendar yes

Ask Yourself

Step General Ask Yourself Yes / No

1 Do I know why I'm doing this work? yes

2 Have I identified and notified everybody – customers and internal groups – who will be directly affected by this work? yes

3 Can I prevent or control service interruption? yes

4 Is this the right time to do this work? yes

5 Am I trained and qualified to do this work? yes

6 Are the work orders, MOPs, and supporting documentation current and error free? yes

7 Do I have everything I need to quickly and safely restore service if something goes wrong? yes

8 Have I walked through the procedure? yes

Emergency / Support Contacts (http://)

Contacts to reach in case of failure.

Organization Contact Name Contact Number Manager Name Manager Number

MHO Brad Elias (859) 227-1141 Karen Damsel (727) 862-3469

CTO Jerry Yuen   Deborah Monteforte  

Implementation (http://)

Pre-Maintenance Steps (http://)

Ashburn IDC has confirmed that there are spare cables and SFPs available.

They have also performed an end-to-end test of the fiber link for 2/2.1 to IPE1 2/2/0, MX960 (CBB Juniper router).

Packages to download

Alarms to Suppress

Implementation (http://)

Jerry Yuen recommends implementing the following to correct the uplink issue:

1. Manually disable the fail-safe feature (uncheck it) in all VLANs on all F5s in all locations.

2. Manually remove all fail-safe feature VLANs in System -> High Availability -> Fail-safe -> VLANs.

3. Verify that the uplink is established to the upstream Juniper router and that the LB URLs are available and responding.

4. Continue to troubleshoot with CBB and F5 (vendor) if needed.
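Because step 1 touches every VLAN on every F5, it may help to prepare a per-VLAN checklist in advance. The sketch below only builds command strings and does not connect to any device. It assumes the tmsh form `modify net vlan <name> failsafe disabled`, which should be verified against the TMOS version in use before anything is run, and the VLAN names shown are hypothetical placeholders. The MOP itself calls for the change to be made manually, so this is a cross-check aid rather than the procedure.

```python
def failsafe_disable_commands(vlans):
    """Build one tmsh command string per VLAN that would turn VLAN fail-safe off.

    Assumption: the tmsh syntax below is valid for the TMOS release on the
    unit; confirm with the F5 vendor before use. Nothing is executed here.
    """
    return [f"tmsh modify net vlan {name} failsafe disabled" for name in vlans]


# Hypothetical VLAN names; substitute the VLANs actually configured on each F5.
for command in failsafe_disable_commands(["external", "internal"]):
    print(command)
```

The printed list can be attached to the change record so that each VLAN is ticked off as fail-safe is unchecked.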

Processes / Ports (http://)

Processes:

TCP Ports Listening / Connected

Processes that are running after start

Ports that are used to connect / serve after start

Testing (http://)

ICO Tier 1 performs a full health check, including testing the following URLs to verify load balancer functionality:

https://32.64.13.158 <-- vwr

http://32.64.3.254 <-- Tier2 test VIP
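The URL test above could be scripted along the following lines. This is a minimal sketch, not part of the approved health check: the function names, the 5-second timeout, and the rule that any 2xx or 3xx status counts as healthy are all assumptions. Certificate verification is skipped on the assumption that the VIPs are addressed by IP rather than by hostname.

```python
import ssl
import urllib.error
import urllib.request

# URLs listed in the Testing section of this MOP.
VIP_URLS = {
    "vwr": "https://32.64.13.158",
    "Tier2 test VIP": "http://32.64.3.254",
}


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if nothing answers."""
    # Assumption: VIPs are reached by IP, so certificate checks are disabled.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the VIP answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timed out, or unreachable


def check_vips(urls, fetch=fetch_status):
    """Map each VIP name to (status, ok), where ok means a 2xx or 3xx reply."""
    results = {}
    for name, url in urls.items():
        status = fetch(url)
        ok = status is not None and 200 <= status < 400
        results[name] = (status, ok)
    return results
```

Calling `check_vips(VIP_URLS)` from a host that can reach the VIPs returns one `(status, ok)` pair per URL. The `fetch` parameter exists so the logic can be exercised without network access.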

Alarms to Un-Suppress

Rollback (http://)

Post-Mop Activities (http://)

Post-Mop Cleanup (http://)