Southwest Airlines: using HP Diagnostics to drive value

Subuhi Ali, Performance Tech Lead, Southwest Airlines

©2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

description

This presentation will demonstrate the value of HP Diagnostics software. We will cover tips and tricks for easy installation, the benefits of the integration of HP Diagnostics with HP LoadRunner, troubleshooting application issues using HP Diagnostics, and using Diagnostics to identify and diagnose out-of-memory issues. We will use specific examples to illustrate our points, such as our main website replacement project and an application in production that had to be frequently restarted because of memory leaks.

Transcript of Southwest Airlines: using HP Diagnostics to drive value

Topics

Company Introduction

What is HP Diagnostics

Installing HP Diagnostics

Configuring the JVM / App Server

Why Southwest Airlines chooses to use it?

Issues – HP Diagnostics – Resolution Time

Problem 1 – 100% CPU

Problem 2 – Application Response Time

Problem 3 – High Memory utilization

Advantages of using Diagnostics

Questions

Southwest Airlines - Intro

In 1971, Southwest began as a small Texas airline that served three cities with three airplanes.

In 2008, we operate more than 3,400 flights a day to 64 destinations in the U.S.

Over the course of 37 years, Southwest has grown to become the largest U.S. carrier in terms of domestic passengers flown

In 2009, Southwest moved 182 million pounds of cargo.

The shortest daily Southwest flight is between Ft. Myers (RSW) and Orlando (MCO) (133 miles). The longest daily Southwest flight is between Providence (PVD) and Las Vegas (LAS) (2,363 miles).

Southwest has 1,164 married couples. In other words, 2,328 Southwest Employees have spouses who also work for the Company.

Southwest was the first airline to establish a home page on the Internet. Initially, five Employees comprised Southwest’s web site development team, and the site took about nine months to create.

Southwest Airlines - Intro

More than 35,000 total Employees throughout the Southwest system and 3,200 flights daily flying to 68 cities.

What Is HP Diagnostics

HP Diagnostics is designed to help you improve the performance of your Java, .NET, and other enterprise applications throughout the application lifecycle. It enables you to:

Identify where time is spent in each application layer.

Drill down from a business transaction that is taking a long time to the problematic component behind it.

Discover "rogue" code or components in real time as they are invoked.

Identify memory leaks.

Tune garbage collection.

For developers, it means that tracing code doesn't have to be added and removed.

Issues are resolved quickly.

It can be enabled in a pre-production or production environment to find resolutions quickly.

Installing HP Diagnostics

The installation order is: Diagnostics Server, then Probe, then LoadRunner integration.

[Diagram labels: Diagnostic Servers, Java Applications, Metrics, Diagnostic UI, Interface with LoadRunner, Controller, NAS Filer, Probe]

Installing HP Diagnostics

Points to remember

Use a common ID across your Linux and Windows platforms

Install the probe once on the NAS mount and share it

Use a unique name for each application JVM node

With LoadRunner integration, if the registered components change, start with a fresh scenario

Configuring the JVM / App Server

Important: Ensure that the names of the probes defined for a machine are unique, and that each probe is assigned to only one application.

This can be done in many ways and is specific to the type of server you are instrumenting (Tomcat vs. a generic Java app).

In general, we are looking for the startup scripts that set the "JAVA_OPTIONS" parameter.

At a minimum, you must set the "-javaagent" parameter; it is a best practice to also set "-Dprobe.id" and "-Dprobe.group".

JAVA_OPTIONS: "-javaagent:<probe_install_dir>\lib\probeagent.jar -Dprobe.id=**Unique_Name** -Dprobe.group=**Group_Name**"
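One way to confirm that a JVM actually picked up these options is to inspect its startup arguments from inside the process. The sketch below is not part of HP Diagnostics; the class name `ProbeConfigCheck` and the method `findProblems` are hypothetical, and the only assumptions taken from the slide are the `probeagent.jar` file name and the `probe.id` / `probe.group` property names.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not an HP Diagnostics API.
public class ProbeConfigCheck {

    /** Returns human-readable problems; an empty list means the settings look complete. */
    static List<String> findProblems(List<String> jvmArgs, Properties systemProps) {
        List<String> problems = new ArrayList<>();

        // The slide names -javaagent as the one mandatory setting.
        boolean hasAgent = jvmArgs.stream()
                .anyMatch(arg -> arg.startsWith("-javaagent:") && arg.contains("probeagent.jar"));
        if (!hasAgent) {
            problems.add("missing -javaagent:<probe_install_dir>/lib/probeagent.jar");
        }

        // -D options passed at startup are visible as system properties.
        for (String key : new String[] {"probe.id", "probe.group"}) {
            String value = systemProps.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                problems.add("missing -D" + key);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Arguments the JVM was started with (excluding main-method arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        List<String> problems = findProblems(jvmArgs, System.getProperties());
        System.out.println(problems.isEmpty()
                ? "probe settings look complete"
                : String.join("\n", problems));
    }
}
```

This only checks that the options are present; it does not check that the probe name is unique across JVMs, which still has to be verified by hand or against the Diagnostics server.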

Why Southwest Airlines Chooses to Use It?

The number one reason is to get more visibility, all the way down to the code level

It gives average response times split across the different application layers

Ease of installation and setup

Very small footprint; you can actually run a load test with full instrumentation turned on

We have a lot of WebSphere and home-grown Java applications

It adds from a nominal 2 hours to 1 day of setup time, provided you already have the load test set up

Issues – HP Diagnostics – Resolution Time

Problem 1: Mule Dispatcher thread hitting 100% CPU. Diagnostics was enabled. Took 2 minutes to identify the issue.

Problem 2: Response times very high on Complete Purchase transactions. Diagnostics was enabled. Took 2 days to identify the issue.

Problem 3: Memory leak in production. Diagnostics was enabled. Took 8 hours to identify the issue.

Problem 1 – 100% CPU

Brief Problem Description:

In one of the application layers, the CPU for a single thread hit 100% as soon as the environment was started.

The thread stack trace page gave us the information we needed to fix it.

A problem that the developer had been looking at for 2 days was resolved in 2 minutes.
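The same idea, finding the thread that is burning CPU and reading its stack, can be sketched with the standard `java.lang.management` API. This is only an illustration of the technique, not how HP Diagnostics implements its thread view; the class name `HotThreadFinder` is hypothetical, and the sketch ranks threads by cumulative CPU time rather than by current CPU percentage.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper illustrating the technique with standard JDK APIs.
public class HotThreadFinder {

    /** Returns the id of the live thread with the most accumulated CPU time, or -1 if none is measurable. */
    static long findHottestThreadId(ThreadMXBean threads) {
        long hottestId = -1;
        long maxCpuNanos = -1;
        for (long id : threads.getAllThreadIds()) {
            // Returns -1 if the thread has died or CPU time measurement is disabled.
            long cpuNanos = threads.getThreadCpuTime(id);
            if (cpuNanos > maxCpuNanos) {
                maxCpuNanos = cpuNanos;
                hottestId = id;
            }
        }
        return hottestId;
    }

    /** Formats the thread name, state, and full stack trace for the given thread id. */
    static String describe(ThreadMXBean threads, long id) {
        if (id <= 0) {
            return "no measurable thread found";
        }
        ThreadInfo info = threads.getThreadInfo(id, Integer.MAX_VALUE);
        if (info == null) {
            return "thread " + id + " is no longer alive";
        }
        StringBuilder sb = new StringBuilder();
        sb.append('"').append(info.getThreadName()).append("\" ")
          .append(info.getThreadState()).append('\n');
        for (StackTraceElement frame : info.getStackTrace()) {
            sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported()) {
            System.out.println("thread CPU time is not supported on this JVM");
            return;
        }
        System.out.println(describe(threads, findHottestThreadId(threads)));
    }
}
```

Run inside the affected JVM (or adapted to a remote JMX connection), the printed stack shows which method the busy thread is executing, which is the information the stack trace page provided in this case.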

Problem 2 – Application Response Time

Brief Problem description:

The application's response-time SLAs were not being met.

We had a goal of 180 TPS and were not reaching even 10 TPS.

Using Diagnostics, we found that most of the time was being spent in one of the application layers.

Problem 2 – Application Response Time

Using Diagnostics, we could see not only the total transaction time reported by LoadRunner but also, for the same transaction, how much of that time was spent on the server side.

Example – Complete Purchase

Average – Total = 6.838

Average – Server = 5.34
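A quick worked calculation makes the split in this example explicit. The slide does not state the unit of these averages, so the sketch below treats them as plain numbers; the class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper; the two averages come from the Complete Purchase example.
public class TransactionTimeSplit {

    /** Time not accounted for by the server (client, network, and so on). */
    static double timeOutsideServer(double total, double server) {
        return total - server;
    }

    /** Share of the total transaction time spent on the server, in percent. */
    static double serverSharePercent(double total, double server) {
        return 100.0 * server / total;
    }

    public static void main(String[] args) {
        double total = 6.838;  // Average – Total
        double server = 5.34;  // Average – Server
        System.out.println(String.format(Locale.ROOT,
                "outside server: %.3f", timeOutsideServer(total, server)));
        System.out.println(String.format(Locale.ROOT,
                "server share: %.1f%%", serverSharePercent(total, server)));
    }
}
```

This prints `outside server: 1.498` and `server share: 78.1%`, so roughly four fifths of the Complete Purchase time was spent on the server side, which is where the drill-down then focused.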

Problem 2 – Application Response Time

The tool gives you the options to:

Break down the layer to classes

Break down the transaction to server requests

Show VM

Break down the layer to server requests

Show chain of calls

Problem 2 – Application Response Time

Breakdown the Layer to – Classes

This option drills down to the class level, which showed that Byte Protocol was the issue

Problem 2 – Application Response Time

Breakdown the Layer to – Methods

java.io.OutputStream is the issue

Problem 2 – Application Response Time

Drill down to the transaction chain of calls, which gives you:

Method name

Class name

Package name

Layer name

Problem 2 – Application Response Time

[Diagram labels: Users, Connection Pool, Apache, Thread Pool Client Framework, Tomcat, Connection Pool, Service]

Locking was occurring on a mutual exclusion (mutex) during synchronization between the Tomcat and Mule layer dispatchers

Problem 2 – Application Response Time

Changed the code to remove the lock

[Diagram labels: Users, Connection Pool, Apache, Thread Pool Client Framework, Tomcat, Connection Pool, Service]
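The slides do not show the code that was changed, so the following is only a generic sketch of the pattern: a dispatcher whose every call passes through one shared monitor, next to a variant that keeps the same bookkeeping without a shared mutex. All class and method names here are hypothetical and do not come from Tomcat, Mule, or the application in question.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration of removing a shared lock from a hot path.
public class DispatcherLockSketch {

    /** Every caller queues on the same monitor, so concurrent requests are serialized. */
    static class LockedDispatcher {
        private long dispatched;
        synchronized void dispatch() { dispatched++; }
        synchronized long count() { return dispatched; }
    }

    /** Same counter without a shared mutex; threads update it without blocking each other. */
    static class LockFreeDispatcher {
        private final LongAdder dispatched = new LongAdder();
        void dispatch() { dispatched.increment(); }
        long count() { return dispatched.sum(); }
    }

    /** Runs the given call callsPerThread times on each of the given number of threads. */
    static void runConcurrently(int threads, int callsPerThread, Runnable call) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    call.run();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
    }

    public static void main(String[] args) {
        LockedDispatcher locked = new LockedDispatcher();
        LockFreeDispatcher lockFree = new LockFreeDispatcher();
        runConcurrently(8, 100_000, locked::dispatch);
        runConcurrently(8, 100_000, lockFree::dispatch);
        // Both variants count every call; only the amount of blocking differs.
        System.out.println(locked.count() + " " + lockFree.count());
    }
}
```

Both variants produce the same count; the difference is that threads in the locked variant can block on the monitor, which is the kind of waiting that shows up as time spent in a layer when many users hit it at once.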

Problem 2 – Application Response Time

A problem that the development team and engineers had been looking at for 2 weeks with no answer was resolved in two days

Problem 3 – High Memory Utilization

Brief Problem Description:

Opening a holiday schedule in the scheduling application drove heap usage up, and the application was constantly running out of memory in production.

Problem 3 – High Memory Utilization

We saw huge fluctuations in heap size. Garbage collection (GC) did not kick in until heap usage was very high, and we did not see any young-generation GC happening.

Problem 3 – High Memory Utilization

Changed the garbage collection policy from “optthruput” to “gencon”
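The policy names suggest an IBM J9 JVM, where the policy is selected with the `-Xgcpolicy:` startup option (for example `-Xgcpolicy:gencon`); the slides do not show the exact option that was used, so treat that as an assumption. Whether young-generation collections are actually running after such a change can be checked from inside the JVM with the standard `GarbageCollectorMXBean` API. The class name `GcActivityReport` below is hypothetical, and collector names differ between JVMs and policies.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper built on standard JDK management APIs.
public class GcActivityReport {

    /** Formats one collector's totals; count and millis are -1 when the JVM does not report them. */
    static String formatLine(String collector, long count, long millis) {
        return collector + ": " + count + " collections, " + millis + " ms";
    }

    public static void main(String[] args) {
        // One bean per collector; a generational policy typically exposes
        // separate beans for young and old collections.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(formatLine(
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
    }
}
```

If the young-generation collector's count stays at zero while the heap keeps growing, that matches the behavior described on the previous slide.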

Problem 3 – High Memory Utilization

A problem that the development team and engineers had been looking at for weeks on end, repeatedly restarting the application without finding an answer, was solved in two sessions of 4 hours each

Advantages of using Diagnostics

Changes black-box load testing to white-box testing, as it gives a lot more visibility

Makes identifying issues in the code or the environment a lot easier

Splits the total transaction time across the different application layers, making it easier to identify bottlenecks

Helps in tuning heap size, garbage collection, etc.

Makes issue identification a lot faster, as it shows the culprit code causing the problem

Questions

To learn more on this topic, and to connect with your peers after the conference, visit the HP Software Solutions Community:

www.hp.com/go/swcommunity
