WebSphere Performance Drivers William R. Sullivan, P.E. CTO WHAM Engineering & Software.

Page 1

WebSphere Performance Drivers

William R. Sullivan, P.E.

CTO WHAM Engineering & Software

Page 2

Memory Performance Drivers

• Memory Concepts

– Address Space Management

– Address Translation

– Locality of Reference

Page 3

Address Space Management

• Managing the addresses that a program generates to reference variables and data storage locations

• In C/C++, a pair of functions called malloc and free manage the heap

• In Java, a garbage collector makes Address Space Management transparent to the programmer

Page 4

Locality of Reference

• This refers to the tendency for the next location fetched from memory to be close to the previous one

• As long as it is in the same page, no new virtual mapping needs to be created

• Programs with poor locality of reference rarely get extra performance with faster CPUs

• Programs with larger resident set size generally have less locality of reference
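The locality effect described above can be sketched with two walks over the same two-dimensional array. This is only an illustration: the class name, the array size, and the timing printout are assumptions and are not part of the original test application. On most hardware the row-major walk tends to be faster, but the size of the gap depends on the machine and the JVM.

```java
import java.util.Arrays;

// Illustrative sketch only: names and sizes are assumptions, not from the slides.
class LocalityDemo {
    static final int N = 2048;

    // Row-major walk: consecutive accesses touch neighboring elements of one row.
    static long sumRowMajor(int[][] grid) {
        long sum = 0;
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                sum += grid[row][col];
            }
        }
        return sum;
    }

    // Column-major walk: each access jumps to a different row array.
    static long sumColumnMajor(int[][] grid) {
        long sum = 0;
        for (int col = 0; col < grid[0].length; col++) {
            for (int row = 0; row < grid.length; row++) {
                sum += grid[row][col];
            }
        }
        return sum;
    }

    // Builds an n-by-n grid where every cell holds 1.
    static int[][] filledGrid(int n) {
        int[][] grid = new int[n][n];
        for (int[] row : grid) {
            Arrays.fill(row, 1);
        }
        return grid;
    }

    public static void main(String[] args) {
        int[][] grid = filledGrid(N);
        long t0 = System.nanoTime();
        long rowSum = sumRowMajor(grid);
        long t1 = System.nanoTime();
        long colSum = sumColumnMajor(grid);
        long t2 = System.nanoTime();
        System.out.println("row-major sum=" + rowSum + " ns=" + (t1 - t0));
        System.out.println("column-major sum=" + colSum + " ns=" + (t2 - t1));
    }
}
```

Both walks do the same arithmetic, so any timing difference comes from the order in which memory is touched.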

Page 5

Address Space Management in Java Applications

• The new operator creates an instance of a class and invokes the class constructor

• No delete operation is needed because of Garbage Collection

• This can lead to poorly performing applications where lots of construction occurs
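As a small illustration of how heavy construction creates work for the collector, the sketch below builds the same text two ways. The class and method names are hypothetical. String concatenation in a loop creates at least one new String object per iteration, while a single StringBuilder is reused across iterations.

```java
// Illustrative sketch only: class and method names are hypothetical.
class ConstructionDemo {
    // Each += produces a new String, so the loop leaves many short-lived objects behind.
    static String joinWithConcat(int count) {
        String out = "";
        for (int i = 0; i < count; i++) {
            out += i + ",";
        }
        return out;
    }

    // One StringBuilder is reused, so far fewer temporary objects are constructed.
    static String joinWithBuilder(int count) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < count; i++) {
            out.append(i).append(',');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithConcat(5));
        System.out.println(joinWithBuilder(5));
    }
}
```

The two methods return identical text; they differ only in how much garbage they leave for the collector.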

Page 6

Java Address Space Management

• Two significant impacts on program operation

– Locality of reference is not controllable except by using a very small heap, which is not always practical

– If the program uses many objects, garbage collection can take too long and cause excessive CPU use

– These two are at odds when tuning WS

Page 7

What Does Garbage Collection Do?

• Collects unused space

• Compacts it by coalescing unused contiguous chunks

• Copies data that is actively in use to new locations

• All request processing is suspended during garbage collection

Page 8

When Does GC Kick In?

• When either a limit in time or a limit in memory use has been reached

• The main tuning knobs are the start size and final size of the heap

• Asynchronous GC can be disabled but it isn’t advisable

• GC can be invoked by the programmer as well
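A minimal sketch of the heap knobs and of programmer-invoked GC, using only standard java.lang calls. The class name is hypothetical. The start and final heap sizes are normally set with the JVM command-line flags -Xms and -Xmx, for example java -Xms64m -Xmx64m HeapInfo. Note that System.gc() is only a request, and the JVM is free to ignore it.

```java
// Illustrative sketch only: the class name is hypothetical.
class HeapInfo {
    // Bytes of heap currently in use, as reported by the JVM.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("current heap bytes: " + rt.totalMemory());
        System.out.println("maximum heap bytes: " + rt.maxMemory());

        long before = usedBytes();
        System.gc(); // a request to collect; the JVM may or may not act on it
        long after = usedBytes();
        System.out.println("used before request: " + before);
        System.out.println("used after request:  " + after);
    }
}
```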

Page 9

What Adverse Effect Does GC Produce on Application Behavior?

• Excessive CPU consumption; IBM says to expect 5%-20%

• Response time impact for all transactions in progress or received during the garbage collection interval

• Can we characterize specifically the impact GCs are having?

Page 10

Test Application from IBM

• We used the Account Transfer Application that came with the Samples from IBM

• Used a URL-based load generator with 10 simultaneous requestors and zero think time

• Used WHAM DRM 3.5 to measure and analyze all the results

Page 11

GC as a percentage of Total CPU

• IBM claims that GC consuming anywhere from 5% to 20% of the application time is acceptable

• That is way too high for the price you pay for the licensing of WebSphere

Page 12

How do we characterize GC?

• From our understanding of GC so far, we would say that when the transaction rate drops to 0 and CPU is 100% of one CPU, the JVM is in the process of collecting garbage

• Is there any way to conclusively demonstrate that?

– WHAM Profiling data on the application

– GC Verbose output correlated to the other data streams, but unfortunately it doesn’t come with timestamps (it did in 3.01)
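One way to get timestamped GC figures that can be lined up with other data streams is sketched below. It uses the standard java.lang.management API, which is available on current JVMs and was not the tooling used for these measurements. The class and method names are hypothetical. The collection time reported by the beans is approximate elapsed time rather than CPU time, so the percentage here is GC time against JVM uptime, not against application CPU.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Illustrative sketch only: class and method names are hypothetical.
class GcSnapshot {
    // Total collections so far across all collectors (undefined values are skipped).
    static long totalGcCount() {
        long count = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            count += Math.max(0, gc.getCollectionCount());
        }
        return count;
    }

    // Approximate accumulated collection time in milliseconds across all collectors.
    static long totalGcMillis() {
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            millis += Math.max(0, gc.getCollectionTime());
        }
        return millis;
    }

    // GC time as a percentage of an elapsed interval.
    static double percent(long gcMillis, long elapsedMillis) {
        return elapsedMillis <= 0 ? 0.0 : 100.0 * gcMillis / elapsedMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcMillis = totalGcMillis();
        // The wall-clock timestamp lets this line be correlated with other logs.
        System.out.println(System.currentTimeMillis()
                + " gcCount=" + totalGcCount()
                + " gcMillis=" + gcMillis
                + " gcPercentOfUptime=" + percent(gcMillis, uptime));
    }
}
```

Calling the snapshot once per second and logging the difference between consecutive readings gives a per-interval view of GC activity.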

Page 13

GC at 2% is good, right?

• Percentages can be misleading and that is why it is always necessary to look at both the frequency and time domain

• The next slides show GC for a 128MB heap and a 512MB heap in which GC is 2% of the application CPU during the interval of observation
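To see why a percentage alone can mislead, the sketch below compares two pause patterns that both add up to 2% of an interval. The pause values are hypothetical and are not the measurements shown in the slides. They only illustrate that the same percentage can hide very different worst-case pauses.

```java
// Illustrative sketch only: names and pause values are hypothetical, not measured data.
class PauseProfile {
    // GC time as a percentage of the observation interval.
    static double gcPercent(long[] pausesMs, long intervalMs) {
        long total = 0;
        for (long pause : pausesMs) {
            total += pause;
        }
        return 100.0 * total / intervalMs;
    }

    // Longest single pause, which is what an in-flight request actually experiences.
    static long longestPauseMs(long[] pausesMs) {
        long longest = 0;
        for (long pause : pausesMs) {
            longest = Math.max(longest, pause);
        }
        return longest;
    }

    public static void main(String[] args) {
        long intervalMs = 100_000;           // hypothetical 100 s observation window
        long[] manyShort = new long[20];     // twenty pauses of 100 ms each
        java.util.Arrays.fill(manyShort, 100);
        long[] oneLong = {2_000};            // a single 2 s pause

        System.out.println("many short: " + gcPercent(manyShort, intervalMs)
                + "% longest=" + longestPauseMs(manyShort) + "ms");
        System.out.println("one long:   " + gcPercent(oneLong, intervalMs)
                + "% longest=" + longestPauseMs(oneLong) + "ms");
    }
}
```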

Page 14

2% GC in a 128MB Heap

Page 15

2% GC in a 128MB Heap

Page 16

2% GC in 512MB Heap

Page 17

2% GC in a 512MB heap

Page 18

Why 512MB GC is Slow

Page 19

Lessons Learned

• A large heap may cause page faulting when garbage collection starts, which extends GC time and has an adverse impact on application response time

• The large heap had other negative effects, such as increased memory management overhead due to page steals and minor faults from poor locality of reference

Page 20

What happens to GC over time?

• It should kick in at a regular rate as long as the rate of object creation is constant

• The cost of GC should be proportional to the rate of garbage creation

• Let’s have a look at the Account Transfer Bean using JSPs

• We ran it for 33 minutes with different heap sizes: 64MB, 128MB and 256MB

Page 21

Time Per Tier View of Load Effects on a 64MB heap over 30 minutes

Page 22

Transaction Rate and Service Time in Tier 2

Page 23

What if We Look at a 10s Sample Rate?

Page 24

CPU 1s vs 10s sample rate

Page 25

Slower Sampling Averages Data

• Cannot see the clear pattern of garbage collection

• We can see that response time is rising over time as CPU is dropping with throughput

• We would assume some internal application slowdown may be occurring or that things may be slower on the database, but we know the latter isn’t the case

• With faster sampling, it is clear that the slowdown is periodic and is clearly in Tier 2

• Faster sampling is key in isolating and identifying these sorts of anomalies
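The averaging effect can be reproduced with synthetic data. In the sketch below, the transaction rate, stall length, and stall period are hypothetical values chosen for illustration and are not the measured results. A periodic stall that drives the 1s samples to zero only shows up as a modest dip once the same data is averaged over 10s windows.

```java
// Illustrative sketch only: names and the synthetic numbers are hypothetical.
class SamplingDemo {
    // One sample per second: the rate is zero for `stall` seconds at the start of each period.
    static double[] syntheticRate(int seconds, int period, int stall, double rate) {
        double[] samples = new double[seconds];
        for (int i = 0; i < seconds; i++) {
            samples[i] = (i % period < stall) ? 0.0 : rate;
        }
        return samples;
    }

    // Averages consecutive windows of samples; a trailing partial window is dropped.
    static double[] averageBy(double[] samples, int window) {
        double[] out = new double[samples.length / window];
        for (int w = 0; w < out.length; w++) {
            double sum = 0;
            for (int i = 0; i < window; i++) {
                sum += samples[w * window + i];
            }
            out[w] = sum / window;
        }
        return out;
    }

    static double min(double[] values) {
        double lowest = Double.POSITIVE_INFINITY;
        for (double v : values) {
            lowest = Math.min(lowest, v);
        }
        return lowest;
    }

    public static void main(String[] args) {
        // Hypothetical load: 100 tx/s with a 2 s stall every 30 s, observed for 60 s.
        double[] oneSecond = syntheticRate(60, 30, 2, 100.0);
        double[] tenSecond = averageBy(oneSecond, 10);
        System.out.println("lowest 1s sample:   " + min(oneSecond));
        System.out.println("lowest 10s average: " + min(tenSecond));
    }
}
```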

Page 26

Expanded View of Previous Charts

Page 27

Functions and Transactions from Silhouette

Page 28

Libs and Service Times

Page 29

Let’s Plot GC CPU vs Time

• We measured the application for 33 minutes under a fixed load and summarized the CPU usage at 3-minute intervals

• We then plotted the total CPU usage in GC per interval for a 64MB heap, a 128MB heap and a 256MB heap

• Notice that the service time effect of GC is about 2s for a 64MB heap from the previous chart

Page 30

CPU Breakout for 64MB Heap

[Chart: CPU Breakdown for 64MB Heap. X axis: Time in Minutes (3 to 33, in 3-minute steps). Y axis: CPU in Seconds (0 to 300). Series: 64MB App CPU, 64MB GC CPU]

Page 31

CPU Breakout for 128MB Heap

[Chart: CPU Breakdown for 128MB Heap. X axis: Time in Minutes (3 to 33, in 3-minute steps). Y axis: CPU in Seconds (0 to 300). Series: 128MB App CPU, 128MB GC CPU]

Page 32

CPU Breakout for 256MB Heap

[Chart: CPU Breakdown for 256MB Heap. X axis: Time in Minutes (3 to 33, in 3-minute steps). Y axis: CPU in Seconds (0 to 300). Series: 256MB App CPU, 256MB GC CPU]

Page 33

So What is Wrong Here?

• Garbage collection is being invoked more frequently but the rate of transactions is decreasing

• Garbage is collected when we run out of space, so with higher frequency GCs we would have to say that we are running out of space sooner

• The implication is that we have a memory leak or something is holding memory active after requests are completed

Page 34

But I thought GC fixed memory Leaks?

• Not exactly

• Objects have seven states

– Created

– In Use

– Invisible

– Unreachable

– Collected

– Finalized

– Deallocated

Page 35

Invisible Objects

• These are objects that are apparently out of scope but are still referenced from a frame of code that is still running, so the JVM will not eliminate them unless they are explicitly de-referenced.

• This behavior can be worked around by explicitly setting the reference to the object to null after it’s used
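A minimal sketch of the null-after-use pattern described above. The class, method, and variable names are hypothetical. Whether the explicit null actually changes when the object is reclaimed depends on the JVM, so this shows the coding pattern rather than a guaranteed effect.

```java
// Illustrative sketch only: class, method, and variable names are hypothetical.
class InvisibleDemo {
    // Fills the buffer with ones and returns their sum.
    static int fill(byte[] buffer) {
        int sum = 0;
        for (int i = 0; i < buffer.length; i++) {
            buffer[i] = 1;
            sum += buffer[i];
        }
        return sum;
    }

    static int process(int size) {
        byte[] scratch = new byte[size]; // large temporary used only at the start
        int checksum = fill(scratch);
        scratch = null;                  // explicitly drop the reference once it is no longer needed

        // Long-running work that does not need the buffer would go here.
        // Without the assignment above, the method's frame could keep the buffer alive.
        return checksum;
    }

    public static void main(String[] args) {
        System.out.println("checksum=" + process(1 << 20));
    }
}
```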

Page 36

Using JSPs or Not

• It turns out that we have two ways to do the transfer funds operation

• We have a JSP-based bean where the bean populates a JSP object and then creates a session servlet to run the JSP

• The direct implementation creates all of the HTML output inside the bean. This isn’t good because programmers would be required to develop content in the direct case

• In the JSP case, content programmers don’t need Java, just HTML, to implement dynamic content because the bean handles all the dynamic content

• The leaky version is the JSP-based version

• We ran the Direct Bean implementation and here were the results

Page 37

No JSP produced good results

Page 38

So What is Special About JSP?

• We investigated the Bean Code and found that the JSP version of the Bean had to create a session

• Sessions are either persistent or cached

• The default is cached and there is a limit of 1000 and a timeout of 30 minutes

• We decided to adjust session timeout to see if shorter timeouts would help free memory quickly enough to keep the program from starving

• First adjustment was to 5 minutes
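The reasoning behind shortening the timeout can be sketched as simple arithmetic: at steady state, the number of sessions still held is roughly the rate of new sessions multiplied by the timeout. The sketch below uses hypothetical names and a hypothetical session rate. It also assumes that the cache simply stops growing at its limit, which is an assumption about the cache and not something the measurements confirm.

```java
// Illustrative sketch only: names, rates, and the capping behavior are assumptions.
class SessionEstimate {
    // Sessions still held at steady state: new sessions per second times the timeout,
    // capped at the cache limit (assumed behavior).
    static long liveSessions(double newSessionsPerSec, int timeoutMinutes, long cacheLimit) {
        long uncapped = Math.round(newSessionsPerSec * timeoutMinutes * 60);
        return Math.min(uncapped, cacheLimit);
    }

    public static void main(String[] args) {
        double rate = 0.5;   // hypothetical: one new session every two seconds
        long limit = 1000;   // cache limit quoted above
        System.out.println("30 minute timeout: " + liveSessions(rate, 30, limit) + " sessions held");
        System.out.println(" 5 minute timeout: " + liveSessions(rate, 5, limit) + " sessions held");
        System.out.println(" 2 minute timeout: " + liveSessions(rate, 2, limit) + " sessions held");
    }
}
```

Every session still held keeps its objects reachable, so fewer held sessions leave more of the heap free between collections.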

Page 39

5 Minute Session Timeout

Page 40

Garbage Collection Cost

• GC becomes consistent but at a fairly high frequency

• Total GC Cost is 11% of the Application

• Not acceptable, so we can increase the heap size or decrease the timeout

• Next we decreased the session timeout to 2 minutes which produced 3% GC cost

• Then we decided to use a larger heap and longer timeout to see how that worked

Page 41

2 Minute Session Timeout

Page 42

5 Minute Timeout 128MB Heap

Page 43

Final Choice

• The 2 minute timeout with 64MB heap produced 3% GC cost

• The 5 minute timeout with 128MB heap produced <1% GC cost

• Longer session timeout is better for most applications

Page 44

Conclusions

• GC can have a significant impact on WebSphere performance

• GC must be characterized in order to ensure that it isn’t negatively affecting the application performance

• Proper tools and the right approach to analyzing GC are imperative to identifying problems and rectifying them

• As far as we could tell, WebSphere doesn’t come with the proper tools for characterizing GC costs