IBM Java Garbage Collection Tuning

IBM Software Group ® WebSphere ® Support Technical Exchange Java Garbage Collection Best Practices for Sizing and Tuning the Java Heap Chris Bailey


Transcript of IBM Java Garbage Collection Tuning

  • IBM Software Group

    WebSphere Support Technical Exchange

    Java Garbage Collection Best Practices for Sizing and Tuning the Java Heap

    Chris Bailey

  • IBM Software Group

    WebSphere Support Technical Exchange 2

    Objectives

    Overview

    Selecting the Correct GC Policy

    Sizing the Java heap

    Questions/Answers

  • IBM Software Group

    WebSphere Support Technical Exchange 3

    Garbage Collection Performance

    GC performance issues can take many forms. The definition of a performance problem is user-centric; the user requirement may be for:

    - Very short GC pause times
    - Maximum throughput
    - A balance of both

    The first step is to ensure that the correct GC policy has been selected for the workload type. It is helpful to have an understanding of the GC mechanisms.

    The second step is to ensure that heap sizing is correct.

    The third step is to look for specific performance issues.

  • IBM Software Group

    WebSphere Support Technical Exchange

    Selecting the Correct GC Policy

  • IBM Software Group

    WebSphere Support Technical Exchange 5

    Understanding Garbage Collection

    The garbage collector is responsible for the allocation and freeing of Java objects, array objects and Java classes. It:

    - Allocates objects using a contiguous section of the Java heap
    - Ensures the object remains as long as it is in use, or "live". Liveness is determined by a reference from another live object or from outside of the heap
    - Reclaims objects that are no longer referenced
    - Ensures that any finalize method is run before the object is reclaimed

  • IBM Software Group

    WebSphere Support Technical Exchange 6

    Object Allocation

    Requires a contiguous area of Java heap. Driven by requests from:

    - The Java application
    - JNI code

    Most allocations take place in Thread Local Heaps (TLHs):

    - Threads reserve a chunk of free heap to allocate from
    - Reduces contention on the allocation lock
    - Keeps code running in a straight line (fewer failures)
    - Meant to be fast
    - Available for objects < 512 bytes in size

    Larger allocations take place under a global heap lock:

    - These allocations are one-time costs (out-of-line allocate)
    - Multiple threads allocating larger objects at the same time will contend
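    The split between the lock-free TLH path and the locked path for larger objects can be illustrated with a toy model. This is only a sketch, not the JVM's allocator: the class name, the 4096-byte chunk size and the simulated addresses are assumptions made for illustration; only the 512-byte limit comes from the slide.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of TLH allocation; it is not the JVM's real allocator. */
final class TlhAllocatorSketch {
    static final int TLH_OBJECT_LIMIT = 512; // bytes, the limit quoted on the slide
    static final int TLH_SIZE = 4096;        // bytes, hypothetical chunk size

    private final ReentrantLock heapLock = new ReentrantLock();
    private long heapTop = 0;          // next free address in the shared heap
    private int lockedAllocations = 0; // how many times the heap lock was taken

    // Per-thread chunk: [0] is the next free address, [1] is the end of the chunk.
    private final ThreadLocal<long[]> tlh = ThreadLocal.withInitial(() -> new long[2]);

    /** Returns the (simulated) address of the new object. */
    long allocate(int size) {
        if (size >= TLH_OBJECT_LIMIT) {
            return reserve(size);         // large object: out-of-line path under the lock
        }
        long[] chunk = tlh.get();
        if (chunk[0] + size > chunk[1]) {
            chunk[0] = reserve(TLH_SIZE); // TLH exhausted: reserve a fresh chunk
            chunk[1] = chunk[0] + TLH_SIZE;
        }
        long address = chunk[0];
        chunk[0] += size;                 // bump the pointer, no lock needed
        return address;
    }

    /** Takes the global heap lock and hands out the next free region. */
    private long reserve(int size) {
        heapLock.lock();
        try {
            lockedAllocations++;
            long address = heapTop;
            heapTop += size;
            return address;
        } finally {
            heapLock.unlock();
        }
    }

    int lockedAllocations() {
        heapLock.lock();
        try {
            return lockedAllocations;
        } finally {
            heapLock.unlock();
        }
    }

    public static void main(String[] args) {
        TlhAllocatorSketch heap = new TlhAllocatorSketch();
        for (int i = 0; i < 100; i++) heap.allocate(64); // small objects use the TLH
        heap.allocate(1024);                              // one large object takes the lock
        System.out.println("heap lock taken " + heap.lockedAllocations()
                + " times for 101 allocations");
    }
}
```

    In this model, one hundred 64-byte allocations need the lock only when a new chunk is reserved, while every large allocation takes it, which is why many threads allocating large objects contend.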

  • IBM Software Group

    WebSphere Support Technical Exchange 7

    Object Reclamation (Garbage Collection)

    Occurs under two scenarios:

    - An allocation failure: an object allocation is requested and not enough contiguous memory is available
    - A programmatically requested garbage collection cycle:
        - a call is made to System.gc() or Runtime.gc()
        - the Distributed Garbage Collector is running
        - a call to JVMPI/JVMTI is made

    Two main technologies are used to remove the garbage:

    - Mark Sweep Collector
    - Copy Collector

    IBM uses a mark sweep collector, or a combination of the two for generational collection.
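    A programmatic request looks like the following minimal sketch. The class name and the 16 MB of throwaway data are assumptions for illustration; System.gc() and the Runtime memory methods are standard Java APIs. The call is only a request, so the figures printed will vary by JVM and policy.

```java
final class ExplicitGcRequest {
    /** Heap currently in use, as reported by the runtime. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        byte[][] garbage = new byte[16][];
        for (int i = 0; i < garbage.length; i++) {
            garbage[i] = new byte[1 << 20]; // 1 MB each
        }
        garbage = null;                     // drop the only references

        long before = usedHeapBytes();
        System.gc(); // same effect as Runtime.getRuntime().gc(); a request, not a guarantee
        long after = usedHeapBytes();

        System.out.println("used before: " + before + " bytes, after: " + after + " bytes");
    }
}
```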

  • IBM Software Group

    WebSphere Support Technical Exchange 8

    Global Collection Policies

    Garbage collection can be broken down into 2 (3) steps:

    - Mark: find all live objects in the system
    - Sweep: reclaim unused heap memory to the free list
    - Compact: reduce fragmentation within the free list

    All steps are in a single stop-the-world (STW) phase: the application pauses whilst garbage collection is done. Each step is performed as a parallel task within itself.

    Four GC policies, optimized for different scenarios:

      -Xgcpolicy:optthruput     optimized for batch type applications
      -Xgcpolicy:optavgpause    optimized for applications with responsiveness criteria
      -Xgcpolicy:gencon         optimized for highly transactional workloads
      -Xgcpolicy:subpool        optimized for large systems with allocation contention
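    The mark and sweep steps listed on this slide can be sketched on a tiny object graph. This is a single-threaded illustration under stated assumptions: the Obj class, the list standing in for the heap and the demo graph are all hypothetical, and compaction is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class MarkSweepSketch {
    /** A heap object with outgoing references and a mark bit. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    /** Mark: flag every object reachable from the roots. */
    static void mark(List<Obj> roots) {
        Deque<Obj> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (o.marked) continue; // already visited (handles cycles)
            o.marked = true;
            work.addAll(o.refs);
        }
    }

    /** Sweep: drop unmarked objects, clear marks, return the number reclaimed. */
    static int sweep(List<Obj> heap) {
        int before = heap.size();
        heap.removeIf(o -> !o.marked);
        for (Obj o : heap) o.marked = false; // reset for the next cycle
        return before - heap.size();
    }

    /** Builds a five-object heap where only a and b are reachable. */
    static int demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        Obj d = new Obj("d"), e = new Obj("e");
        a.refs.add(b);  // a -> b, both live
        d.refs.add(e);  // d <-> e form an unreachable cycle
        e.refs.add(d);
        List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, c, d, e));
        mark(Arrays.asList(a)); // a is the only root
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println("reclaimed " + demo() + " objects");
    }
}
```

    The unreachable cycle between d and e is reclaimed along with c, because liveness is decided by reachability from the roots rather than by reference counts.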

  • IBM Software Group

    WebSphere Support Technical Exchange 9

    Parallel GC (optthruput)

    Parallel Mark Sweep Collector, with compaction avoidance:

    - Created to make use of additional processors on server systems
    - Designed to increase performance for SMP and not degrade performance for uni-processor systems

    Optimized for throughput: the best policy for batch type applications.

    Consists of a single flat Java heap:

    [Diagram: flat Java heap from Heap Base (0 GB) to Heap Limit (2 GB), showing the Heap Size and the LOA]

  • IBM Software Group

    WebSphere Support Technical Exchange 10

    GC Helper Threads

    Parallelism is achieved through the use of GC helper threads:

    - A parked set of threads that wake to share GC work
    - The main GC thread generates the root set of objects
    - Helper threads share the work for the rest of the phases
    - The number of helpers is one less than the number of processing units, so the helper threads plus the main GC thread equal the number of processing units
    - Configurable using -Xgcthreads
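    The default helper count described on this slide is simple arithmetic, sketched below. The class and method names are hypothetical, and the value a particular JVM actually picks may differ; Runtime.availableProcessors() is the standard way to read the number of processing units from Java.

```java
final class GcHelperThreads {
    /** Default from the slide: one helper fewer than the processing units. */
    static int defaultHelpers(int processingUnits) {
        return Math.max(0, processingUnits - 1);
    }

    public static void main(String[] args) {
        int units = Runtime.getRuntime().availableProcessors();
        System.out.println(units + " processing units: " + defaultHelpers(units)
                + " helper threads plus 1 main GC thread");
    }
}
```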

  • IBM Software Group

    WebSphere Support Technical Exchange 11

    Parallel Mark/Parallel Sweep view of GC

  • IBM Software Group

    WebSphere Support Technical Exchange 12

    Concurrent GC (optavgpause)

    Reduces, and makes more consistent, the time spent inside stop-the-world GC. The reduction is usually between 90 and 95%.

    Achieved by carrying out some of the STW work whilst the application is running:

    - 1.4.2: Concurrent Marking
    - 5.0: Concurrent Marking and Concurrent Sweeping

    Slight overhead on throughput in exchange for greatly reduced STW times. The policy is ideal for systems with responsiveness criteria, e.g. Portal applications.

  • IBM Software Group

    WebSphere Support Technical Exchange 13

    Parallel and Concurrent Mark/Sweep

    Concurrent Kickoff

  • IBM Software Group

    WebSphere Support Technical Exchange 14

    Concurrent Mark hidden object issue

    Higher heap usage

  • IBM Software Group

    WebSphere Support Technical Exchange 15

    Concurrent Mark hidden object issue

    Higher heap usage, because not all garbage is removed

    [Diagram annotation: Dangling pointer!]

  • IBM Software Group

    WebSphere Support Technical Exchange 16

    Generational and Concurrent GC (gencon)

    Similar in concept to that used by Sun and HP. Parallel copy and concurrent global collects by default.

    Motivation: objects die young, so focus collection efforts on recently created objects:

    - Divide the heap up into two areas: new and old
    - Perform allocates from the new area
    - Collections focus on the new area
    - Objects that survive a number of collects in the new area are promoted to the old area (tenured)

    Ideal for transactional and high data throughput workloads.

    [Diagram: Java heap from Heap Base (0 GB) to Heap Limit (2 GB), showing the Heap Size, the Nursery (new) Space with its Allocate and Survivor spaces, the Tenured (old) Space, and the LOA]

  • IBM Software Group

    WebSphere Support Technical Exchange 17

    Nursery (new) Space Copy Collection

    The nursery (young generation) is split into two spaces (semi-spaces), the Allocate Space and the Survivor Space:

    - Only one contains live objects and is available for allocation
    - Minor collections (scavenges) move objects between the spaces
    - The role of the spaces is then reversed
    - Movement results in implicit compaction
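    The semi-space behaviour on this slide can be sketched with two lists standing in for the Allocate and Survivor spaces. This is a toy model: the class name, the use of strings as objects and the liveness predicate are assumptions for illustration, and promotion to the tenured space is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class NurserySketch {
    private List<String> allocateSpace = new ArrayList<>(); // receives new objects
    private List<String> survivorSpace = new ArrayList<>(); // empty between scavenges

    void allocate(String obj) {
        allocateSpace.add(obj);
    }

    /** Scavenge: copy live objects across, then reverse the roles of the spaces. */
    int scavenge(Predicate<String> isLive) {
        for (String obj : allocateSpace) {
            if (isLive.test(obj)) {
                survivorSpace.add(obj); // copies land next to each other: implicit compaction
            }
        }
        int reclaimed = allocateSpace.size() - survivorSpace.size();
        allocateSpace.clear();          // everything left behind was garbage

        List<String> flipped = allocateSpace;
        allocateSpace = survivorSpace;  // survivors now sit in the allocate space
        survivorSpace = flipped;        // the emptied space waits for the next scavenge
        return reclaimed;
    }

    /** Snapshot of the objects currently in the allocate space. */
    List<String> contents() {
        return new ArrayList<>(allocateSpace);
    }

    public static void main(String[] args) {
        NurserySketch nursery = new NurserySketch();
        for (String name : Arrays.asList("a", "b", "c", "d")) {
            nursery.allocate(name);
        }
        Set<String> live = new HashSet<>(Arrays.asList("a", "c"));
        int reclaimed = nursery.scavenge(live::contains);
        System.out.println("reclaimed " + reclaimed + ", survivors " + nursery.contents());
    }
}
```

    Note that the cost of a scavenge in this model is proportional to the number of survivors copied, not to the amount of garbage, which is why low survival rates matter for nursery performance.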

  • IBM Software Group

    WebSphere Support Technical Exchange 18

    Subpooling (subpool)

    Goals:

    - Reduce allocation lock contention by distributing free memory into multiple lists
    - Reduce allocation contention through the use of atomic operations instead of a heap lock
    - Prevent premature garbage collections by using a best fit (or closer to best fit) policy instead of address ordered

    Ideal for very large SMP systems where large amounts of data are being allocated and where there is heap lock contention.

  • IBM Software Group

    WebSphere Support Technical Exchange 19

    Looking for Heap Lock Contention

    All locks can be profiled using Java Lock Analyzer (JLA), available from AlphaWorks:

    http://www.alphaworks.ibm.com/tech/jla

    Provides time accounting and contention statistics for Java and JVM locks.

    Functionality includes:

    - Counters associated with contended locks
    - Total number of successful acquires
    - Recursive acquires: the number of times a thread acquires a lock it already owns
    - Number of times a thread blocks because a monitor is already owned
    - Cumulative time the monitor was held

  • IBM Software Group

    WebSphere Support Technical Exchange 20

    JLA Sample Report

    System (Registered) Monitors

    %MISS   GETS NONREC  SLOW    REC   TIER2  TIER3 %UTIL AVER-HTM  MON-NAME
       87   5273   5273  4572      0  710708  18487     1    95408  JITC Global_Compile lock
        9   6870   6869   631      1  113420   2976     0    11807  Heap lock
        5   1123   1123    51      0   11098    286     1   248385  Binclass lock
        0   1153   1147     5      6    1307     33     0    47974  Monitor Cache lock
        0  46149  45877   134    272   36961    877     1     6558  JITC CHA lock
        0  33734  23483    19  10251    6544    150     1    17083  Thread queue lock
        0      5      5     0      0       0      0     0  9309689  JNI Global Reference lock
        0      5      5     0      0       0      0     0  9283000  JNI Pinning lock
        0      5      5     0      0       0      0     0  9442968  Sleep lock
        0      1      1     0      0       0      0     0        0  Monitor Registry lock
        0      0      0     0      0       0      0     0        0  Evacuation Region lock
        0      0      0     0      0       0      0     0        0  Method trace lock
        0      0      0     0      0       0      0     0        0  Classloader lock
        0      0      0     0      0       0      0     0        0  Heap Promotion lock

    Java (Inflated) Monitors

    %MISS   GETS NONREC  SLOW    REC   TIER2  TIER3 %UTIL AVER-HTM  MON-NAME
       15     68     68    10      0    2204     56     2 11936405  test.lock.testlock1@A09410/A09418
        2     42     42     1      0     186      5     0   300478  test.lock.testlock2@D31358/D31360
        0     70     70     0      0      41      1     0     7617  java.lang.ref.ReferenceQueue$Lock@920628/920630

  • IBM Software Group

    WebSphere Support Technical Exchange 21

    JLA: Fields in the report

  • IBM Software Group

    WebSphere Support Technical Exchange 22

    Choosing the Right GC Policy

    Four GC policies, optimized for different scenarios:

      -Xgcpolicy:optthruput     optimized for batch type applications
      -Xgcpolicy:optavgpause    optimized for applications with responsiveness criteria
      -Xgcpolicy:gencon         optimized for highly transactional workloads
      -Xgcpolicy:subpool        optimized for large systems with allocation contention

    How do I know whether to use optavgpause or gencon?

    - Monitor GC activity
    - Look for certain characteristics

  • IBM Software Group

    WebSphere Support Technical Exchange 23

    Monitoring GC Activity

    Use verbose GC logging:

    - The only data that is required for GC performance tuning
    - Graph the verbose GC output using the GC and Memory Visualizer (GCMV) from ISA

    Activated using command line options:

      -verbose:gc
      -Xverbosegclog:[DIR_PATH][FILE_NAME],X,Y

    where:

    - [DIR_PATH] is the directory where the file should be written
    - [FILE_NAME] is the name of the file to write the logging to
    - X is the number of log files
    - Y is the number of GC cycles a file should contain

    Performance cost: (very) basic testing shows a 2% overhead for a GC duration of 200ms, e.g. if application GC overhead is 5%, it would become 5.1%.

  • IBM Software Group

    WebSphere Support Technical Exchange 24

    Important Characteristics for Choosing GC Policy

    Rate of garbage collection: high rates of object "burn" point to large numbers of transitional objects, and therefore the application may well benefit from the use of gencon.

    Large object allocations: the allocation of very large objects adversely affects gencon unless the nursery is sufficiently large. The application may well benefit from optavgpause.

    Large heap usage variations: the optavgpause algorithms are best suited to consistent allocation profiles. Where large variations occur, gencon may be better suited.

    Rule of thumb: if GC overhead is > 10%, you've most likely chosen the wrong one.
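    One rough way to check the 10% rule of thumb from inside a running JVM is to compare accumulated collection time with uptime through the standard java.lang.management API. This is a sketch under assumptions: the class and method names are hypothetical, and for concurrent collectors the reported collection time may not equal pause time, so verbose GC output remains the more reliable source.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverheadCheck {
    /** GC time as a percentage of elapsed time. */
    static double overheadPercent(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    /** Sums the collection time reported by every collector in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        double percent = overheadPercent(totalGcMillis(), uptime);
        System.out.printf("GC overhead: %.1f%%%n", percent);
        if (percent > 10.0) {
            System.out.println("Above 10%: the GC policy choice is worth revisiting");
        }
    }
}
```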

  • IBM Software Group

    WebSphere Support Technical Exchange 25

    Rate of Garbage Collection

    [Charts: optavgpause compared with gencon]

    - Gencon could handle a higher rate of garbage collection, completing the test quicker
    - Gencon had a smaller percentage of time in garbage collection
    - Gencon had a shorter maximum pause time

  • IBM Software Group

    WebSphere Support Technical Exchange 26

    Rate of Garbage Collection

    - Gencon provides less frequent long garbage collection cycles
    - Gencon provides a shorter longest garbage collection cycle

  • IBM Software Group

    WebSphere Support Technical Exchange 27

    Large Object Allocations

    (Very) large object allocations affect the gencon GC policy.

    If an object is larger than the nursery size, the object is immediately tenured:

    - Removes the benefit of generational heaps
    - Still has the additional overhead of running generational

    If an object fits in the nursery but fills it, frequent nursery collects will have to occur:

    - Too frequent nursery collects mean objects are likely to survive and need copying
    - Copying is an expensive process

    If (very) large objects are being used, a sufficiently large nursery is required.

  • IBM Software Group

    WebSphere Support Technical Exchange

    Sizing the Java Heap

  • IBM Software Group

    WebSphere Support Technical Exchange 29

    Sizing the Java Heap

    Maximum possible Java heap sizes

    The correct Java heap size

    Fixed heap sizes vs. Variable heap sizes

    Heap Sizing for Generational GC

  • IBM Software Group

    WebSphere Support Technical Exchange 30

    Maximum Possible Heap Size

    32 bit Java processes have a maximum possible heap size:

    - Varies according to the OS and platform used
    - Determined by the process memory layout

    64 bit processes do not have this limit:

    - A limit exists, but is so large it can be effectively ignored
    - Addressability is usually between 2^44 and 2^64, which is 16+ terabytes

  • IBM Software Group

    WebSphere Support Technical Exchange 31

    Java Process Memory Layout

    The Java process is an operating system process like any other application:

    - Subject to OS and architecture restrictions
    - A 32 bit architecture has an addressable range of 2^32, which is 0x00000000 to 0xFFFFFFFF, which is 4GB

    Not all addressable space is available to the application. The operating system needs memory for:

    - The kernel
    - The runtime support libraries

    How much memory is needed, and where that memory is located, varies according to the operating system.

    [Diagram: 32 bit address space from 0 GB (0x0) to 4 GB (0xFFFFFFFF), marked at 1 GB (0x40000000), 2 GB (0x80000000) and 3 GB (0xC0000000)]
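    The address-space figures quoted on this slide and the previous one are powers of two, and can be checked with a few lines of Java. The class and method names are hypothetical; the sketch is limited to 62 bits because it uses a signed 64-bit long.

```java
final class AddressSpace {
    /** Bytes addressable with the given number of address bits (0 to 62). */
    static long addressableBytes(int bits) {
        if (bits < 0 || bits > 62) {
            throw new IllegalArgumentException("bits out of range for a long: " + bits);
        }
        return 1L << bits;
    }

    public static void main(String[] args) {
        // 2^30 bytes is 1 GB and 2^40 bytes is 1 TB, so shifting converts the units.
        System.out.println("2^32 bytes = " + (addressableBytes(32) >> 30) + " GB");
        System.out.println("2^44 bytes = " + (addressableBytes(44) >> 40) + " TB");
    }
}
```

    This prints 4 GB for a 32 bit address range and 16 TB for 2^44, matching the slides.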

  • IBM Software Group

    WebSphere Support Technical Exchange 32

    Memory Available to the Java Process

    [Diagram, on Windows: 32 bit address space from 0 GB (0x0) to 4 GB (0xFFFFFFFF), marked at 1 GB (0x40000000), 2 GB (0x80000000) and 3 GB (0xC0000000), with regions labelled Operating System Space and Libraries]

    [Diagram, on AIX: the same 32 bit address space, with regions labelled Kernel and Libraries]

  • IBM Software Group

    WebSphere Support Technical Exchange 33

    Java Process Restrictions

    Not all Java process space is available to the Java application. The Java runtime needs memory for:

    - The Java Virtual Machine
    - Backing resources for some Java objects

    This memory area, as well as some other allocations, is part of the native heap. Memory not allocated to the Java heap is available to the native heap:

      Available memory space - Java heap = native heap

    Effectively, the Java process maintains two memory pools.

  • IBM Software Group

    WebSphere Support Technical Exchange 34

    The Native Heap

    Allocated using malloc() and therefore subject to memory management by the OS.

    Used for Virtual Machine resources, e.g.:

    - Execution engine
    - Class loader
    - Garbage collector infrastructure

    Used to underpin Java objects: threads, classes, AWT objects, ZipFiles.

    Used for allocations by JNI code.

  • IBM Software Group

    WebSphere Support Technical Exchange 35

    Native Heap available to Application

    [Diagram, on Windows: 32 bit address space from 0 GB (0x0) to 4 GB (0xFFFFFFFF), marked at 1 GB (0x40000000), 2 GB (0x80000000) and 3 GB (0xC0000000), with regions labelled Operating System Space, Libraries, Java Heap, VM Resources and Native Heap]

    [Diagram, on AIX (1.4.2 with small heaps): the same 32 bit address space, with regions labelled Kernel, Libraries, Java Heap, VM Resources and Native Heap]

  • IBM Software Group

    WebSphere Support Technical Exchange 36

    Layout with Large Java Heaps on AIX

    - Applies to heaps > 1GB in size and Java 5.0
    - Java heap becomes allocated using mmap()
    - Segments used start at 0xC and work downwards
    - Understanding the memory layout is important for monitoring

    [Diagram: 32 bit address space from 0 GB (0x0) to 4 GB (0xFFFFFFFF), marked at 1 GB (0x40000000), 2 GB (0x80000000) and 3 GB (0xC0000000), with regions labelled Kernel, Libraries, VM Resources, Native Heap and Java Heap, and segment markers 0x3, 0x7 and 0xD]

  • IBM Software Group

    WebSphere Support Technical Exchange 37

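    Memory Layout for Linux

    [Diagram, Linux: 32 bit address space from 0 GB (0x0) to 4 GB (0xFFFFFFFF), marked at 1 GB (0x40000000), 2 GB (0x80000000) and 3 GB (0xC0000000), with regions labelled Kernel, Java Heap, VM Resources and Native Heap, and markers PAGE_OFFSET and TASK_SIZE]

    [Diagram, z/OS: address space from 0 GB (0x0) to 2 GB (0x7FFFFFFF), marked at 1 GB (0x40000000), with regions labelled Java Heap and VM Resources]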

  • IBM Software Group

    WebSphere Support Technical Exchange 38

    Theoretical and Advised Max Heap Sizes

    The larger the Java heap, the more constrained the native heap. Advised limits exist to prevent the native heap from becoming overly restricted, leading to OutOfMemoryErrors.

    Exceeding the advised limits is possible, but should be done only when native heap usage is understood.

    Native heap usage can be measured using OS tools: svmon (AIX), PerfMon (Windows), RMF (z/OS), etc.

    Platform   Additional Options   Maximum Possible   Advised Maximum
    AIX        automatic            3.25 GB            2.5 GB
    Linux                           2 GB               1.5 GB
    Linux      Hugemem kernel       3 GB               2.5 GB
    Windows                         1.8 GB             1.5 GB
    Windows    /3GB                 1.8 GB             1.8 GB
    z/OS                            1.7 GB             1.3 GB

  • IBM Software Group

    WebSphere Support Technical Exchange 39

    Moving to 64bit

    Moving to 64bit removes the Java heap size limit. However, the ability to use more memory is not free.

    64bit applications perform slower:

    - More data has to be manipulated
    - Cache performance is reduced

    64bit applications require more memory:

    - Java object references are larger
    - Internal pointers are larger

    Major improvements to this in Java 6.0 due to compressed pointers.

  • IBM Software Group

    WebSphere Support Technical Exchange 40

    The correct Java heap size

    GC will adapt the heap size to keep occupancy between 40% and 70%:

    - Heap occupancy over 70% causes frequent GC cycles, which generally means reduced performance
    - Heap occupancy below 40% means infrequent GC cycles, but cycles longer than they need to be, which means longer pause times than necessary and generally reduced performance

    The maximum heap size setting should therefore be 43% larger than the maximum occupancy of the application:

    - Maximum occupancy + 43% means occupancy at 70% of the total heap
    - E.g. for 70MB occupancy, a 100MB max heap is required, which is 70MB + 43% of 70MB
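    The 43% figure follows from dividing the peak occupancy by 0.7 (1 / 0.7 is roughly 1.43). A minimal sketch of that calculation, using integer arithmetic and rounding up to whole megabytes; the class and method names are hypothetical.

```java
final class HeapSizing {
    static final int TARGET_OCCUPANCY_PERCENT = 70;

    /** Smallest heap, in MB, at which the given occupancy is at most 70% of the heap. */
    static long maxHeapMb(long peakOccupancyMb) {
        // Ceiling division of (occupancy * 100) by 70.
        return (peakOccupancyMb * 100 + TARGET_OCCUPANCY_PERCENT - 1) / TARGET_OCCUPANCY_PERCENT;
    }

    public static void main(String[] args) {
        long occupancyMb = 70; // the example from the slide
        System.out.println("Peak occupancy " + occupancyMb + "MB: -Xmx" + maxHeapMb(occupancyMb) + "m");
    }
}
```

    For the slide's example of 70MB occupancy this gives -Xmx100m.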

  • IBM Software Group

    WebSphere Support Technical Exchange 41

    The correct Java heap size

    [Chart: memory against time, showing Heap Size and Heap Occupancy with the 40% and 70% occupancy bounds, annotated "Long Garbage Collection Cycles" and "Too Frequent Garbage Collection"]

  • IBM Software Group

    WebSphere Support Technical Exchange 42

    Fixed heap sizes vs. Variable heap sizes

    Should the heap size be fixed, i.e. minimum heap size (-Xms) = maximum heap size (-Xmx)?

    - Each option has advantages and disadvantages
    - As for most performance tuning, you must select which is right for the particular application

    Variable heap sizes:

    - GC will adapt the heap size to keep occupancy between 40% and 70%
    - Expands and shrinks the Java heap
    - Allows for scenarios where usage varies over time, where the variations would take usage outside of the 40-70% window

    Fixed heap sizes:

    - Does not expand or shrink the Java heap

  • IBM Software Group

    WebSphere Support Technical Exchange 43

    Heap Expansion and Shrinkage

    The act of heap expansion and shrinkage is relatively cheap. However, a compaction of the Java heap is sometimes required:

    - Expansion: for some expansions, GC may have already compacted to try to allocate the object before expansion
    - Shrinkage: GC may need to compact to move objects from the area of the heap being shrunk

    Whilst expansion and shrinkage optimize heap occupancy, they (usually) do so at the cost of compaction cycles.

  • IBM Software Group

    WebSphere Support Technical Exchange 44

    Conditions for Heap Expansion

    - Not enough free space is available for object allocation after GC has completed
        - Occurs after a compaction cycle
        - Typically occurs where there is fragmentation or during rapid occupancy growth (i.e. application startup)
    - Heap occupancy is over 70%
        - Compaction unlikely
    - More than 13% of time is spent in GC
        - Compaction unlikely

  • IBM Software Group

    WebSphere Support Technical Exchange 45

    Conditions for Heap Shrinkage

    Heap occupancy is under 40%, and the following are not true:

    - Heap has been recently expanded (last 3 cycles)
    - GC is a result of a System.gc() call

    Compaction occurs if:

    - An object exists in the area being shrunk
    - GC did not shrink on the previous cycle

    Compaction is therefore likely to occur.

  • IBM Software Group

    WebSphere Support Technical Exchange 46

    Introduction to -Xmaxf and -Xminf

    The -Xmaxf and -Xminf settings control the 40% and 70% occupancy bounds:

      -Xmaxf    the maximum heap space free before shrinkage (default is 0.6, for 40% occupancy)
      -Xminf    the minimum heap space free before expansion (default is 0.3, for 70% occupancy)

    Can be used to move the optimum occupancy window if required by the application, e.g. lower heap utilization required for more infrequent GC cycles.

    Can be used to prevent shrinkage: -Xmaxf1.0 would mean shrinkage only when the heap is 100% free, which would completely remove the shrinkage capability.
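    The expansion and shrinkage conditions from the preceding slides, combined with the two free-space bounds, can be sketched as a single decision function. This is a simplified model, not the JVM's logic: the class, enum and parameter names are hypothetical, and expansion after an allocation that still fails following GC is left out.

```java
final class HeapResizeSketch {
    enum Action { EXPAND, SHRINK, NONE }

    static final double GC_TIME_THRESHOLD = 0.13; // "more than 13% of time is spent in GC"

    /**
     * @param freeRatio             fraction of the heap that is free after GC (0.0 to 1.0)
     * @param gcTimeRatio           fraction of time spent in GC (0.0 to 1.0)
     * @param expandedInLast3Cycles true if the heap was expanded in the last 3 cycles
     * @param fromSystemGc          true if this GC was triggered by System.gc()
     * @param minf                  the -Xminf value (default 0.3)
     * @param maxf                  the -Xmaxf value (default 0.6)
     */
    static Action decide(double freeRatio, double gcTimeRatio,
                         boolean expandedInLast3Cycles, boolean fromSystemGc,
                         double minf, double maxf) {
        if (freeRatio < minf || gcTimeRatio > GC_TIME_THRESHOLD) {
            return Action.EXPAND;  // occupancy too high, or too much time in GC
        }
        if (freeRatio > maxf && !expandedInLast3Cycles && !fromSystemGc) {
            return Action.SHRINK;  // occupancy too low and no reason to hold back
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        System.out.println(decide(0.25, 0.05, false, false, 0.3, 0.6)); // 75% occupied
        System.out.println(decide(0.70, 0.05, false, false, 0.3, 0.6)); // 30% occupied
        System.out.println(decide(0.70, 0.05, false, false, 0.3, 1.0)); // with -Xmaxf1.0
    }
}
```

    With maxf set to 1.0 the free ratio can never exceed the bound, so the shrink branch is unreachable, which is the effect the slide attributes to -Xmaxf1.0.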

  • IBM Software Group

    WebSphere Support Technical Exchange 47

    Introduction to -Xmaxe and -Xmine

    The -Xmaxe and -Xmine settings control the bounds of the size of each expansion step:

      -Xmaxe    the maximum amount of memory to add to the heap size in the case of expansion (default is unlimited)
      -Xmine    the minimum amount of memory to add to the heap size in the case of expansion (default is 1MB)

    Can be used to reduce/prevent compaction due to expansion: reduce expansions by setting a large -Xmine.
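    The effect of the two bounds on a single expansion step amounts to a clamp, sketched below. The class and method names are hypothetical, and representing "unlimited" as 0 is a convention of this sketch only, not a statement about how the JVM encodes the default.

```java
final class ExpansionStep {
    static final long MB = 1024L * 1024L;

    /**
     * @param wantedBytes how much the collector would like to add
     * @param mineBytes   the -Xmine lower bound for one step
     * @param maxeBytes   the -Xmaxe upper bound for one step; 0 means unlimited here
     */
    static long stepBytes(long wantedBytes, long mineBytes, long maxeBytes) {
        long step = Math.max(wantedBytes, mineBytes);   // never grow by less than -Xmine
        return maxeBytes > 0 ? Math.min(step, maxeBytes) : step; // never by more than -Xmaxe
    }

    public static void main(String[] args) {
        // A 256KB request with the default 1MB minimum grows the heap by 1MB.
        System.out.println(stepBytes(256 * 1024, 1 * MB, 0) / MB + "MB");
        // A 64MB request capped by a 16MB maximum grows the heap by 16MB.
        System.out.println(stepBytes(64 * MB, 1 * MB, 16 * MB) / MB + "MB");
    }
}
```

    A larger minimum step means each expansion buys more headroom, so fewer expansions (and fewer of the compactions that can precede them) are needed to reach the same heap size.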

  • IBM Software Group

    WebSphere Support Technical Exchange 48

    GC Managed Heap Sizing

    [Chart: memory against time, showing Heap Size and Heap Occupancy with the -Xminf and -Xmaxf bounds and an expansion step (>= -Xmine), annotated "Long Garbage Collection Cycles" and "Too Frequent Garbage Collection"]

  • IBM Software Group

    WebSphere Support Technical Exchange 49

    Fixed or Variable?

    Again, dependent on the application:

    - For flat memory usage, use fixed
    - For widely varying memory usage, consider variable

    Variable provides more flexibility and the ability to avoid OutOfMemoryErrors.

    Some of the disadvantages can be avoided:

    - Setting -Xms to the lowest steady state memory usage prevents expansion at startup
    - Setting -Xmaxf1 will remove shrinkage
    - Use -Xminf to prevent compaction before expansion
    - Use -Xmine to reduce expansions

  • IBM Software Group

    WebSphere Support Technical Exchange 50

    Heap Sizing for Generational GC

    [Diagram: heap split into Nursery and Tenured spaces]

    Options are:

    - Fix both nursery and tenured space
    - Allow them to expand/contract

    General advice:

    - Fix the new space size
    - Size the tenured space as you would for a flat heap

  • IBM Software Group

    WebSphere Support Technical Exchange 51

    Sizing the Nursery

    Copying from Allocate to Survivor or to Tenured space is expensive:

    - Physical data is copied (similar to compaction, which is also expensive)
    - Ideally survival rates should be as low as possible
        - Less data needs to be copied
        - Fewer tenured/global collects will occur

    The larger the nursery:

    - the greater the time between collects
    - the fewer objects that should survive
    - however, the longer a copy can potentially take

    The recommendation is to have a nursery as large as possible, whilst not being so large that nursery collect times affect the application responsiveness.

  • IBM Software Group

    WebSphere Support Technical Exchange 52

    Summary

    GC policy should be chosen according to application scenario

    Java heap should ideally be sized for between 40 and 70% occupancy

    Min=Max heap size is right for some applications, but not for others

  • IBM Software Group

    WebSphere Support Technical Exchange 53

    Additional WebSphere Product Resources

    - Discover the latest trends in WebSphere Technology and implementation, participate in technically-focused briefings, webcasts and podcasts at: http://www.ibm.com/developerworks/websphere/community/
    - Learn about other upcoming webcasts, conferences and events: http://www.ibm.com/software/websphere/events_1.html
    - Join the Global WebSphere User Group Community: http://www.websphere.org
    - Access key product show-me demos and tutorials by visiting IBM Education Assistant: http://www.ibm.com/software/info/education/assistant
    - View a Flash replay with step-by-step instructions for using the Electronic Service Request (ESR) tool for submitting problems electronically: http://www.ibm.com/software/websphere/support/d2w.html
    - Sign up to receive weekly technical My support emails: http://www.ibm.com/software/support/einfo.html

  • IBM Software Group

    WebSphere Support Technical Exchange 54

    Additional Java Product Resources

    - Obtain Java documentation: https://www.ibm.com/developerworks/java/jdk/docs.html
    - Download the IBM Java SDKs: https://www.ibm.com/developerworks/java/jdk/index.html
    - Find and download Java tooling: http://www.ibm.com/software/websphere/events_1.html
    - Troubleshoot Java with the IBM Guided Activity Assistant: http://www-01.ibm.com/support/docview.wss?uid=swg27010135
    - Troubleshoot Java with the Guided Troubleshooting InfoCenter: http://publib.boulder.ibm.com/infocenter/javasdk/tools/topic/com.ibm.java.doc.tools.welcome/tools/welcome/welcome.html
    - Discuss IBM Java: http://www.ibm.com/developerworks/forums/forum.jspa?forumID=367

  • IBM Software Group

    WebSphere Support Technical Exchange 55

    Questions and Answers