Memory Sizing Methodology for zOS External

WebSphere JVM memory sizing and monitoring methodology on zOS

Rohit Kelapure, IBM Advisory Software Engineer, IBM Software Group

March 02, 2009



Page 2: Memory Sizing Methodology for zOS External

IBM Software Group | WebSphere software


Outline

JVM Memory Sizing

Benchmarking

• Investigate target criteria

• Determine workload characteristics

• Define test scenarios

• Obtain baseline results

• Tune system & measure improvements

JVM Memory monitoring

Profiling

Monitoring

Diagnostic Tools

Questions and Wrap Up

Page 3: Memory Sizing Methodology for zOS External


Benchmarking Benefits

Provides a quantitative expression of business requirements

Reliably predicts the performance of a production system under stress and after long-term operation.

Helps identify performance regressions.

Considerations

Target Criteria

• Mapping business requirements to the benchmark

• Predicting workload for special events

• Estimating long-term system status

Workload Characteristics

• Choose a good mix of test cases

• Account for error conditions

• Test for system failures

Test environment versus production environment

• Resources versus capacity

• Capacity, workload, and response time

End to End Process

Create benchmark (Environment, Tools, Scenarios)

Get a baseline performance number in a controlled environment (measure key performance metrics)

Execute tests again and measure improvements (maintain environmental hygiene)

Scrub & Repeat
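The baseline-then-remeasure loop above can be sketched as a small comparison helper. This is only an illustration: the metric names and the numbers are hypothetical and do not come from the presentation.

```python
def percent_change(baseline: float, tuned: float) -> float:
    """Relative change of one metric against its baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (tuned - baseline) / baseline * 100.0


def compare_runs(baseline: dict, tuned: dict) -> dict:
    """Percent change for every metric that was measured in both runs."""
    return {name: percent_change(baseline[name], tuned[name])
            for name in baseline if name in tuned}


# Hypothetical key performance metrics from two runs of the same benchmark.
baseline_run = {"throughput_tps": 200.0, "avg_response_ms": 50.0}
tuned_run = {"throughput_tps": 230.0, "avg_response_ms": 45.0}

changes = compare_runs(baseline_run, tuned_run)
for metric, pct in changes.items():
    print(f"{metric}: {pct:+.1f}%")
```

Keeping the environment identical between runs is what makes these percentages meaningful; the helper itself does nothing to enforce that.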

Page 4: Memory Sizing Methodology for zOS External


WebSphere Memory Structure on 31-bit zOS

2G storage breakdown

2G address space for 31-bit z/OS

• Common Storage

• Storage owned by other TCBs

• Storage owned by WAS main TCB

  • Subpool 2 Key8/Key2 (mallocs)

    • WAS, other Program Products (mallocs)

    • Java Memory Usage

  • Other Subpools

Main Java Storage Usage

Internal Memory(small)

Java Heap (big -Xmx)

Class Segments (big)

JIT Code (64m max)

JIT Memory (small)

J9pool (big)

Page 5: Memory Sizing Methodology for zOS External

zOS Memory and Startup Overview

Garbage collection frequency in V6.1 with JDK 1.5 is unchanged compared to V6.0.2 with JDK 1.4.2.

V6.1 real storage footprint at application steady state is equivalent to V6.0.2.

V6.1 CPU time and elapsed time of server startup have decreased compared to V6.0.2.

The JVM Quickstart option can further reduce CPU time of server startup with minimal effect on steady-state throughput.

Memory

Virtual storage footprint at startup is essentially unchanged from V6.0.2 to V6.1. Virtual storage footprint at startup in WAS V7 is equivalent to WAS V6.1.

Startup time and footprint reduction from WAS V6.1 to V7 on z/OS

Both elapsed time and CPU time are reduced by moving from WAS V6.1 to WAS V7, and memory footprint remained similar to the 'out of the box' scenario.

WAS V7 is faster in elapsed time and CPU time compared to WAS V6.1.

With provisioning enabled, CPU time is reduced, and it is reduced further with the JVM option -Xquickstart.

These results were achieved through V7 enhancements and new features: provisioning, JDK improvements, and parallel startup of applications.

Page 6: Memory Sizing Methodology for zOS External

Tuning java virtual machine heap on zOS

Java memory or heap tuning

Increasing the heap size supports more object creation, so the application runs longer before a garbage collection occurs. However, a larger heap takes longer to compact, which makes each GC take longer.

Set the maximum JVM heap size (Xmx) to a value higher than the initial JVM heap size (Xms) (Xms == 0.25 * Xmx).

Consider making Xms == Xmx if you accurately know the JVM working set size.

• Typically set during performance analysis when absolute optimal performance is needed.

• If workloads grow and shrink over time, the JVM will not expand/contract accordingly.

Making Xms large initially improves performance by delaying GC, but ultimately affects response time when GC kicks in.

If Xmx exceeds the available physical memory and paging occurs, there is a noticeable decrease in performance.

If the system is already paging heavily, increasing the JVM heap size might make performance worse rather than better.

• To prevent paging, specify Xmx to allow a minimum of 256 MB of physical memory for each processor and 512 MB of physical memory for each application server.
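The sizing rules above can be turned into a quick sanity check. The sketch below assumes one particular reading of the paging guideline, namely that the combined Xmx of all servers should leave 256 MB per processor plus 512 MB per application server of physical memory untouched. The function names and the example machine are hypothetical.

```python
MB = 1024 * 1024


def initial_heap_for(xmx_bytes: int, ratio: float = 0.25) -> int:
    """Initial heap (Xms) derived from Xmx using the Xms == 0.25 * Xmx guideline."""
    return int(xmx_bytes * ratio)


def physical_memory_floor(processors: int, app_servers: int) -> int:
    """Physical memory to leave free: 256 MB per processor + 512 MB per server.

    Assumption: this is one interpretation of the guideline, not an IBM formula.
    """
    return processors * 256 * MB + app_servers * 512 * MB


def heap_fits(total_xmx_bytes: int, physical_bytes: int,
              processors: int, app_servers: int) -> bool:
    """True when the combined Xmx still leaves the floor of physical memory free."""
    budget = physical_bytes - physical_memory_floor(processors, app_servers)
    return total_xmx_bytes <= budget


# Hypothetical machine: 8 GB real storage, 4 processors, 2 application servers.
xmx = 512 * MB
xms = initial_heap_for(xmx)
fits = heap_fits(2 * xmx, 8192 * MB, processors=4, app_servers=2)
too_big = heap_fits(7168 * MB, 8192 * MB, processors=4, app_servers=2)
print(f"-Xms{xms // MB}m -Xmx{xmx // MB}m fits={fits}")
```

A check like this only flags an obviously oversized configuration; whether paging actually occurs still has to be confirmed by measurement on the target system.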

Key metric: the ratio between the average length of a single garbage collection call and the average time between calls. This should be as low as possible.

The percentage of free memory should not decline over time.

Look for memory leaks and heap fragmentation by running long-running, repetitive, and concurrency tests.

• A good test case exercises areas of the application where objects are created.

• Look at areas where collections of objects are used.

Start up versus runtime performance optimization

By default, IBM virtual machines for Java are optimized for runtime performance.

For faster startups, reduce the initial optimization level that the compiler uses.

• Reducing the JIT level may degrade the runtime performance of your applications because the class methods are now compiled at a lower optimization level.

The -Xquickstart setting influences how the JVM uses a lower optimization level for class method compiles.

Page 7: Memory Sizing Methodology for zOS External

Tuning java virtual machines on zOS

Garbage Collection (GC) tuning

There is always a cost for garbage collection in Java applications.

Ideally, the JVM ought to be spending less than 5% of the time in GC. Typical time spent is 5 - 20%.

An Allocation Failure should never cause multiple GCs.

Compaction actions should be occurring on less than half of the GCs.

Select the appropriate GC policy tailored to your application.

• optthruput: optimizes for throughput (DEFAULT)

• optavgpause: optimizes for garbage collection pause time

• gencon: minimizes GC pause times at the expense of application throughput

  • Better suited for workloads that create a lot of short-lived objects

• subpool: can increase performance on multiprocessor systems

As the JVM heap size decreases, the cost of garbage collection increases, causing a decrease in throughput.
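The three GC health rules on this slide (less than 5% of time in GC, no allocation failure causing multiple GCs, compaction on fewer than half of the GCs) can be checked mechanically once GC events have been summarized. The sketch below assumes a hypothetical `AfEvent` record filled in by hand; it does not parse any real verbose GC format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AfEvent:
    """One allocation failure (AF), summarized by hand from a GC log (hypothetical shape)."""
    since_last_af_ms: float        # time between this AF and the previous one
    gc_durations_ms: List[float]   # one entry per GC cycle this AF triggered
    compactions: int               # how many of those cycles compacted the heap


def gc_health(events: List[AfEvent]) -> dict:
    """Evaluate the three rules of thumb against a list of AF events."""
    if not events:
        raise ValueError("need at least one AF event")
    total_gc_ms = sum(sum(e.gc_durations_ms) for e in events)
    total_interval_ms = sum(e.since_last_af_ms for e in events)
    gc_cycles = sum(len(e.gc_durations_ms) for e in events)
    gc_time_pct = total_gc_ms / total_interval_ms * 100.0
    multi_gc_afs = sum(1 for e in events if len(e.gc_durations_ms) > 1)
    compaction_ratio = sum(e.compactions for e in events) / gc_cycles
    return {
        "gc_time_pct": gc_time_pct,
        "gc_time_ok": gc_time_pct < 5.0,          # less than 5% of time in GC
        "multi_gc_afs": multi_gc_afs,
        "single_gc_per_af": multi_gc_afs == 0,    # an AF should not cause multiple GCs
        "compaction_ratio": compaction_ratio,
        "compaction_ok": compaction_ratio < 0.5,  # compaction on fewer than half of GCs
    }


# Hypothetical sample: the second AF needed two GC cycles, one with compaction.
sample = [
    AfEvent(4000.0, [80.0], 0),
    AfEvent(3500.0, [90.0, 60.0], 1),
    AfEvent(4500.0, [70.0], 0),
]
report = gc_health(sample)
print(report)
```

The GC-time percentage here uses the same convention as the verbose GC formula later in the deck: collection time divided by time since the last AF.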

Page 8: Memory Sizing Methodology for zOS External

Tuning java virtual machines on zOS

Enable class sharing in a cache.

Sharing classes in a cache can improve startup time and reduce memory footprint.

Processes, such as application servers, node agents, and deployment managers, can use the share classes option.

Tune the configuration update process for a large cell configuration.

Determine whether configuration update performance or consistency checking is more important.

Consider disabling config_consistency_check on the Dmgr.

Limit the number of dumps that are taken in specific situations.

In certain error conditions, multiple application server threads might fail and the JVM requests a TDUMP for each of those threads. This situation can cause a large number of TDUMPs to be taken concurrently, leading to other problems, such as a shortage of auxiliary storage.

Use the JAVA_DUMP_OPTS environment variable to indicate the number of dumps that you want the JVM to produce in certain situations.

Make sure the debug version of the JVM, libjava_g, is not included in your libpath.

Real Storage footprint

As the size of real storage decreases, the page fault rate increases, causing a decrease in throughput.

Page 9: Memory Sizing Methodology for zOS External

Profiling WebSphere JVM on zOS

Use Java virtual machine Tool Interface (JVMTI) profiling to gather data about your system for performance analysis.

JVMTI is a native programming interface that gives tools the ability to inspect the state of the JVM.

JVMTI provides the ability to collect information about the JVM that runs the application server.

Tivoli® Performance Viewer leverages these interfaces to enable more comprehensive performance analysis.

This interface is new for the JVM, V1.5. JVMTI replaces the Java virtual machine Profiling Interface (JVMPI).

To enable Java virtual machine profiler data:

Type -agentlib:pmiJvmtiProfiler in the Generic JVM arguments field. In a WebSphere Application Server for z/OS 64-bit environment, type -agentlib:pmiJvmtiProfiler64.

Note: If the deprecated JVMPI profiler is used with WAS 6.1, type -XrunpmiJvmpiProfiler; for a z/OS 64-bit environment, type -XrunpmiJvmpiProfiler64.

Page 10: Memory Sizing Methodology for zOS External

Profiling WebSphere JVM on zOS: Jinsight

A tool that provides a dynamic, lightweight Java profiler and visualizer for Linux on System z and z/OS.

The combination of interactive live tracing and the execution view enables the user to find the bottlenecks in the execution.

The live connection allows customers to collect execution information from the Java program on System z and visualize it immediately on a PC, without having to use traces or source code or byte code instrumentation.

The new Enhanced Agent adds the ability to collect profile data when running a 64-bit JVM.

Much of Jinsight's visualization technology is now available in Rational® Application Developer for WebSphere®.

Jinsight is available on alphaWorks: http://www.alphaworks.ibm.com/tech/jinsightlive

Jinsight Education: JinsightLive for System z - Do You Know What your Application is Doing?

http://ew.share.org/proceedingmod/abstract.cfm?abstract_id=17929

Page 11: Memory Sizing Methodology for zOS External

Monitoring WebSphere JVM on zOS

Enable the Verbose garbage collection property if you think garbage collection is occurring too frequently. This writes a report to the output stream each time the garbage collector runs. The report should give you an idea of what is going on with Java GC.

Key things to look for in a verbose GC report are:

Time spent in garbage collection. Ideally, you want to be spending less than 5% of the time in GC.

• To determine the percentage of time spent in GC, divide the time it took to complete the collection by the time since the last allocation failure (AF) and multiply the result by 100. For example, 83.29 / 3724.32 * 100 = 2.236%

• If you are spending more than 5% of your time in GC and GC is occurring frequently, you may need to increase your Java heap size.

Growth in the allocated heap.

• To determine this, look at the %free. You want to make sure the number is not continuing to decline. If the %free continues to decline, you are experiencing a gradual growth in allocated heap from GC to GC, which could indicate that your application/WAS has a memory leak.

If garbage collection is occurring too frequently, increase the maximum size of the JVM heap.

The total, used, and free heap size counters are available by enabling PMI.

The MVS™ console command modify display,jvmheap also displays JVM heap information.

In addition, you can check the server activity and interval SMF records.
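The two verbose GC checks described on this slide reduce to simple arithmetic. The sketch below reproduces the 83.29 / 3724.32 example and adds a deliberately simple trend test for %free. Treating "continues to decline" as "lower after every consecutive GC" is an assumption made for illustration, and the %free samples are hypothetical.

```python
from typing import Sequence


def gc_time_percent(gc_ms: float, since_last_af_ms: float) -> float:
    """Collection time divided by time since the last AF, times 100."""
    return gc_ms / since_last_af_ms * 100.0


def free_keeps_declining(free_pcts: Sequence[float], min_samples: int = 3) -> bool:
    """True when %free after GC is lower on every consecutive GC.

    Assumption: strict decline over at least `min_samples` GCs is used as a
    simple stand-in for a gradual growth in allocated heap.
    """
    if len(free_pcts) < min_samples:
        return False
    return all(later < earlier
               for earlier, later in zip(free_pcts, free_pcts[1:]))


# The example from the slide.
pct = gc_time_percent(83.29, 3724.32)
print(f"time in GC: {pct:.3f}%")

# Hypothetical %free readings taken from successive GC reports.
leaking = free_keeps_declining([62.0, 58.5, 55.1, 51.9, 47.2])
steady = free_keeps_declining([62.0, 58.5, 61.7, 59.0, 62.3])
```

A real workload rarely declines monotonically, so in practice the %free values would be compared over a long, repetitive test run rather than a handful of samples.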

Page 12: Memory Sizing Methodology for zOS External

zOS TCB Monitoring

VSMDATA provides a high-level view of memory usage

allocation to TCB at address 7FE030: F0000 983040

allocation to TCB at address 7FFD90: 91000 593920

allocation to TCB at address 7FFB00: 14A000 1351680

allocation to TCB at address 7FF028: 4FCC6000 1338793984 *

allocation to TCB at address 7EBBE8: E000 57344

allocation to TCB at address 7EB9C8: E000 57344

allocation to TCB at address 7EB6B0: E000 57344

allocation to TCB at address 7EB160: E000 57344

allocation to TCB at address 7C3B48: E000 57344

allocation to TCB at address 7C39B0: E000 57344

* TCB 7FF028 is the WAS main TCB

If the available storage for the main TCB is too small, check the common area or other TCBs' storage usage

Main TCB Storage Breakdown from VSMDATA

Subpool 2, key 8 Total alloc: 3FB82000 1069031424

Subpool 1, key 8 Total alloc: 6124000 101859328

Subpool 230, key 0 Total alloc: 605A000 101031936

Subpool 129, key 8 Total alloc: 3200000 52428800

Subpool 252, key 0 Total alloc: 989000 9998336

Subpool 132, key 8 Total alloc: 187000 1601536

Subpool 249, key 2 Total alloc: 140000 1310720

SP2 Key 8 is used for all mallocs in WAS

SP1 Key 8 seems to be LE control blocks

SP230 Key 0 is the Shared Library Region

Over-specified SHRLIBRGNSIZE can cause OOM. In V7, WAS reports the current value for SHRLIBRGNSIZE in a message issued during startup (BBOO0341I).

Please see: SHRLIBRGNSIZE and Effect on 31-Bit JVM Storage Needs http://www.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP101320

Breakdown of Mallocs in SP2 K2/K8

• Java Storages

  • Internal Memory

  • Java Heap

  • Class Segments

  • JIT Code

  • JIT Memory

  • J9pool

• WAS and other Program Products, e.g.

  • ITCAM

  • WAS Channel I/O
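The VSMDATA figures on this slide can be summarized with a few lines of parsing. The sketch below works only on the lines quoted in this presentation; the regular expressions are an assumption about that layout and are not meant to be a parser for a full VSMDATA report.

```python
import re

# Excerpts copied from the slide (three TCB lines and three subpool lines).
TCB_TEXT = """
allocation to TCB at address 7FE030: F0000 983040
allocation to TCB at address 7FF028: 4FCC6000 1338793984 *
allocation to TCB at address 7EBBE8: E000 57344
"""

SUBPOOL_TEXT = """
Subpool 2, key 8 Total alloc: 3FB82000 1069031424
Subpool 1, key 8 Total alloc: 6124000 101859328
Subpool 230, key 0 Total alloc: 605A000 101031936
"""

TCB_LINE = re.compile(
    r"allocation to TCB at address ([0-9A-F]+):\s+([0-9A-F]+)\s+(\d+)")
SUBPOOL_LINE = re.compile(
    r"Subpool (\d+), key (\d+) Total alloc:\s+([0-9A-F]+)\s+(\d+)")


def parse_tcbs(text: str) -> dict:
    """Map TCB address -> allocated bytes, checking hex against decimal."""
    result = {}
    for addr, hex_size, dec_size in TCB_LINE.findall(text):
        size = int(hex_size, 16)
        if size != int(dec_size):
            raise ValueError(f"hex/decimal mismatch for TCB {addr}")
        result[addr] = size
    return result


def parse_subpools(text: str) -> dict:
    """Map (subpool, key) -> allocated bytes."""
    return {(int(sp), int(key)): int(hex_size, 16)
            for sp, key, hex_size, _dec in SUBPOOL_LINE.findall(text)}


tcbs = parse_tcbs(TCB_TEXT)
subpools = parse_subpools(SUBPOOL_TEXT)

# The largest allocation identifies the WAS main TCB in this dump.
main_tcb = max(tcbs, key=tcbs.get)
sp2_share = subpools[(2, 8)] / tcbs[main_tcb]
print(f"main TCB {main_tcb}: {tcbs[main_tcb] / 2**20:.0f} MB, "
      f"subpool 2 key 8 share {sp2_share:.1%}")
```

On these numbers, subpool 2 key 8 (the malloc pool) accounts for roughly four fifths of the main TCB's storage, which is why the deck goes on to break that subpool down further.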

Page 13: Memory Sizing Methodology for zOS External

Tools to find zOS java storage usage

Existing tools can't find J9pool usage

Javacores

• F BBOS001,JAVACORE

• Located in Servant and Adjunct home directories

• Can browse the file or use IBM Thread and Monitor Dump Analyzer for Java in ISA

Dbx debugger extension

• Most of the time z/OS customers only provide SVCDUMP or IEATDUMP

• dumpallsegments

LEDATA HEAP

• Home-made batch jobs extract the LEDATA HEAP report

• Specifies a report on Storage Management control blocks pertaining to HEAP storage.

• Shows all mallocs in a table

• Exclude known mallocs from Java and create a table of mallocs from WAS and other PPs

• Sort by columns to find size and data patterns for high memory usage consumers.
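The "exclude known Java mallocs, then sort" step can be sketched as follows. The input here is a hypothetical list of (address, size) pairs; the actual LEDATA HEAP report layout is not shown in the presentation, so extracting such pairs from a real report is left out.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def rank_malloc_sizes(mallocs: Iterable[Tuple[int, int]],
                      known_java_addresses: Set[int]) -> List[Tuple[int, int, int]]:
    """Group non-Java mallocs by size and rank the groups by total bytes.

    Returns rows of (size, count, total_bytes), largest total first.
    """
    counts = Counter(size for addr, size in mallocs
                     if addr not in known_java_addresses)
    rows = [(size, n, size * n) for size, n in counts.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Hypothetical allocations: (address, size in bytes).
mallocs = [
    (0x1000, 4096), (0x2000, 4096),
    (0x3000, 65536),                       # assumed to belong to the JVM
    (0x4000, 128), (0x5000, 128), (0x6000, 128),
]
java_owned = {0x3000}

table = rank_malloc_sizes(mallocs, java_owned)
for size, count, total in table:
    print(f"size={size:>6}  count={count:>3}  total={total:>8}")
```

Many allocations of one identical size are the kind of pattern that points to a single high-usage consumer in WAS or another program product.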


Page 14: Memory Sizing Methodology for zOS External

Diagnostic Tools for WebSphere JVM on zOS

Tools provide functionality for problem diagnosis similar to the older deployment environments such as CICS and IMS.

Heap-related issues such as OutOfMemoryError, other crashes, and hangs or loops within WebSphere address spaces can be analyzed using these tools.

svcdump.jar enables direct access to the binary SVC Dumps or Transaction Dumps created on z/OS, without the need for intermediate software such as IPCS. svcdump.jar includes:

• The Dump utility (com.ibm.jvm.svcdump.Dump package) formats native and Java stacks for threads in dumped processes that include an instantiated JVM. The Dump utility can also print other useful information, such as the in-core trace buffers maintained by the JVM and the system trace, mimicking or extending the information that can be obtained with IPCS.

• The FindRoots utility (com.ibm.jvm.findroots.* package) provides multiple ways of formatting the object graphs present in the Java-managed heap. This is critical for the sometimes difficult tasks of finding object leaks and determining heap occupancy.

IBM Support Assistant tooling

• Tools for IBM Support Assistant perform numerous functions such as memory heap dump analysis (MDD4J) and verbose GC log analysis (PMAT).

• Use the Modify command from the MVS console to dynamically generate heapdumps.

• To get an SVCDUMP, use the 'DUMP' console command.

• Use the IBM Heap Analyzer to look at the phd files generated by the svcdump.jar utility to track down memory leaks.

• The Eclipse Memory Analyzer Tool (MAT) is an open-source Eclipse project for analyzing heap dumps and identifying memory leaks from Java virtual machines.

• The IBM Diagnostic Tool Framework for Java (DTFJ) adapter enables MAT to work with system dumps and Portable Heap Dump (PHD) files from IBM Virtual Machines for Java version 6, version 5.0, and version 1.4.2.

Page 15: Memory Sizing Methodology for zOS External

Major Java Storage Consumers: aka Major OOM contributors

Class Segments

• One of the key areas causing OOM

• Some dumps had 40,000+ loaded classes

• WAS CompoundClassLoader (CCL) leaks

  • WAS CCL is not the cause

  • Each application has one CCL

  • After an application is redeployed, a new CCL is used

  • The old CCL is not GCed because some loaded classes are holding some caches

  • The latest WAS 6.1 has a lot of fixes

• Capacity planning should consider how many classes are going to be loaded

Java Heap

• Storage used by the Java Heap is fixed

• Controlled by the -Xmx value

• One customer OOM case with -Xmx1000m and -Xms500m never exceeded 500m

• The VM had to reserve 1000m and wasted 500m

• Tools: javacore, jextract, dumpallsegments

Apache caches reflected methods

• The Apache framework used by RAD defines a lot of reflected methods

• Defined reflected methods are hung in two levels of hashtables: class names and method names

• Sometimes 40,000 to 50,000 methods are defined

• The first 15 calls use native access (slower)

• After -Dsun.reflect.inflationThreshold (default 15) calls, the reflected method is inflated to a DCL (DelegatingClassLoader)

• A DCL is faster but needs native storage

• Saw 25,000 DCLs in one dump

• Currently there are no official tools to find native memory used by classloaders; we are working on it

• An LE option can help with native memory used by classloaders

  • HEAPPOOLS(ON,8,3,16,2,64,2,256,8,504,8,1024,4), under test

• VM APAR IZ30962 helps the native memory footprint for DCLs but does not address the root cause

• -Dsun.reflect.noInflation=true is misleading. It does not check the threshold and always inflates to a DCL

• Apache will not release the cached DCLs once the method has been called 15 times

• There should be a cap and an expiration mechanism to expire the cache

• For capacity planning, customers need to know how many reflected methods are defined and how many of them will be executed

Page 16: Memory Sizing Methodology for zOS External

WebSphere zOS JVM Capacity Planning

Classes: how many classes are going to be loaded

• JSP romclasses are much bigger

• A WAS fix can cap the number of JSPs

How many reflected methods are defined in Apache and how many will be called/executed

Page 17: Memory Sizing Methodology for zOS External


Q&A

Page 18: Memory Sizing Methodology for zOS External


Backup