Concurrent Garbage Collection - Meetupfiles.meetup.com/3189882/JUG_Concurrent_GC_Jul2015.pdf · Ø...
© Copyright Azul Systems 2015
@azulsystems azulsystems.com
Concurrent Garbage Collection
§ Deepak Sreedhar
§ JVM engineer, Azul Systems
7/27/15
Java User Group Bangalore
About me: Deepak Sreedhar
Ø JVM student at Azul Systems
Ø Currently working on enhancing the C4 garbage collector implementation in Azul Zing JVM
Ø Prior experience with dynamic binary translation and server migration tools
Introduction
Quiz
Ø Does the Java spec mandate automatic GC?
Ø Is GC efficient?
Ø Can GC collect all dead objects?
Ø Can GC impact application throughput?
Ø Can GC impact application latency?
Ø Does a larger heap imply poorer performance?
Ø Does increasing Xmx (more free space) improve GC efficiency?
Terminology
Ø The Java heap memory
Ø Objects and references
Ø Live, reachable and dead objects
Ø Fragmentation and headroom wastage
Ø Virtual and physical memory
Ø Mutators
Ø Allocation and mutation rates
GC Safepoint
Ø A point in thread execution when GC can identify all references correctly, and there is no mutation
Ø Global safepoint (STW) – all threads are at safepoint
Ø Safepointing is not the same as halting. A thread running native code (JNI) is at a safepoint
Ø Time to safepoint is as crucial for low latency as is the GC operation time. Try -XX:+PrintGCApplicationStoppedTime
Ø Safepoints may be needed for non-GC reasons such as deoptimization and JVMTI heap iteration
GC classification
Ø Precise vs. Conservative
Ø Incremental vs. Monolithic
Ø Parallel vs. Serial
Ø Concurrent vs. Stop-the-world
Ø Multi-generational collectors
• Weak generational hypothesis
• Young (new) and Old (tenured) generation
• Promotion (tenuring)
• Shorter pauses usually in new gen (smaller set of live objects)
• Remembered sets, card tables for cross-generational references
• Can delay, but not avoid old gen collections
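A card table for cross-generational references can be sketched as a toy model. This is only an illustration: the class names, the 512-byte card size and the use of plain byte offsets as "addresses" are assumptions, not taken from any particular JVM.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy card table: one dirty flag per fixed-size "card" of old-gen address space. */
final class CardTable {
    static final int CARD_SHIFT = 9; // 512-byte cards (assumed for illustration)
    private final boolean[] dirty;

    CardTable(long oldGenBytes) {
        long cardSize = 1L << CARD_SHIFT;
        dirty = new boolean[(int) ((oldGenBytes + cardSize - 1) >>> CARD_SHIFT)];
    }

    /** Write barrier: run by the mutator when it stores a reference into an old-gen field. */
    void recordStore(long fieldOffset) {
        dirty[(int) (fieldOffset >>> CARD_SHIFT)] = true;
    }

    /** A young collection scans only dirty cards for old-to-young references, then clears them. */
    List<Integer> drainDirtyCards() {
        List<Integer> cards = new ArrayList<>();
        for (int i = 0; i < dirty.length; i++) {
            if (dirty[i]) {
                cards.add(i);
                dirty[i] = false;
            }
        }
        return cards;
    }
}

final class CardTableDemo {
    public static void main(String[] args) {
        CardTable table = new CardTable(4096); // 8 cards
        table.recordStore(10);                 // card 0
        table.recordStore(600);                // card 1
        table.recordStore(700);                // card 1 again
        System.out.println(table.drainDirtyCards()); // prints [0, 1]
    }
}
```

The point of the sketch is that the young collector never walks the whole old generation; it only visits the cards the write barrier flagged.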
Copying collector
Ø Copy and fixup as objects are discovered
Ø “From” and “To” spaces
Ø Used for young (new) gen in many collectors
Ø Usually implemented as monolithic, stop-the-world
Ø Complexity of the order of live objects
Ø Theoretically, requires double the memory
Ø Practically many objects may be dead
• Eden and survivor spaces
• Early promotion to old gen when more memory is needed
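The "copy and fixup as objects are discovered" idea can be sketched with a toy object graph. This is a simplified, single-threaded illustration; Node, CopyingCollector and the use of Java lists as "spaces" are hypothetical stand-ins, not how a real JVM lays out memory.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Toy heap object: a name plus outgoing references. */
final class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();
    Node(String name) { this.name = name; }
}

final class CopyingCollector {
    private final Map<Node, Node> forwarding = new IdentityHashMap<>(); // from-space -> copy
    private final Deque<Node> scanQueue = new ArrayDeque<>();           // copied, not yet scanned
    final List<Node> toSpace = new ArrayList<>();

    /** Copies everything reachable from the roots into toSpace and returns the new roots. */
    List<Node> collect(List<Node> roots) {
        List<Node> newRoots = new ArrayList<>();
        for (Node root : roots) {
            newRoots.add(forward(root));
        }
        while (!scanQueue.isEmpty()) {
            Node current = scanQueue.poll();
            for (int i = 0; i < current.refs.size(); i++) {
                // Fixup: swap the from-space reference for its to-space copy.
                current.refs.set(i, forward(current.refs.get(i)));
            }
        }
        return newRoots;
    }

    /** Returns the to-space copy of an object, copying it on first discovery. */
    private Node forward(Node from) {
        Node to = forwarding.get(from);
        if (to == null) {
            to = new Node(from.name);
            to.refs.addAll(from.refs); // still from-space references until scanned
            forwarding.put(from, to);
            toSpace.add(to);
            scanQueue.add(to);
        }
        return to;
    }
}

final class CopyingDemo {
    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node dead = new Node("dead");
        a.refs.add(b);
        b.refs.add(a);    // cycle between live objects
        dead.refs.add(a); // unreachable, so the collector never visits it
        CopyingCollector gc = new CopyingCollector();
        List<Node> roots = gc.collect(Collections.singletonList(a));
        System.out.println(gc.toSpace.size() + " objects copied, root = " + roots.get(0).name);
    }
}
```

Only reachable objects are ever touched, which is why the cost follows the live set rather than the heap size.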
Mark Compact
Ø Separate mark and compact phases
Ø Mark (trace) - identify live objects
Ø Compact - Move objects to reduce fragmentation
Ø Compact to “To” space
Ø Complexity of the order of live objects
Ø Can be implemented incrementally
Ø Full compaction can be delayed
Mark Sweep Compact
Ø Mark - identify live objects
Ø Sweep – iterate over the heap and find free space
Ø Compact - Move objects to reduce fragmentation
Ø Used for old gen in many collectors
Ø Complexity of the order of heap size
Ø In-place, does not need more memory
Ø Can be implemented incrementally
Ø Can delay compaction to reduce pauses, but not eliminate it
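The mark and sweep steps can be sketched on a toy heap to show where the "order of heap size" cost comes from. This is an illustration under assumptions: Cell and MarkSweep are hypothetical names, the heap is a plain Java list, and the compaction step is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Toy heap cell with a mark bit. */
final class Cell {
    final String name;
    final List<Cell> refs = new ArrayList<>();
    boolean marked;
    Cell(String name) { this.name = name; }
}

final class MarkSweep {
    /** Mark: trace from the roots; the work is proportional to the live set. */
    static void mark(List<Cell> roots) {
        Deque<Cell> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Cell c = work.pop();
            if (c.marked) {
                continue;
            }
            c.marked = true;
            work.addAll(c.refs);
        }
    }

    /** Sweep: visit every cell in the heap, so the work is proportional to heap size. */
    static int sweep(List<Cell> heap) {
        int freed = 0;
        for (Iterator<Cell> it = heap.iterator(); it.hasNext(); ) {
            Cell c = it.next();
            if (c.marked) {
                c.marked = false; // reset for the next cycle
            } else {
                it.remove();      // dead: its space becomes free
                freed++;
            }
        }
        return freed;
    }
}

final class MarkSweepDemo {
    public static void main(String[] args) {
        Cell a = new Cell("a");
        Cell b = new Cell("b");
        Cell c = new Cell("c"); // never referenced, so it is dead
        a.refs.add(b);
        List<Cell> heap = new ArrayList<>();
        heap.add(a);
        heap.add(b);
        heap.add(c);
        List<Cell> roots = new ArrayList<>();
        roots.add(a);
        MarkSweep.mark(roots);
        System.out.println("freed " + MarkSweep.sweep(heap) + " cell(s)");
    }
}
```

Sweeping leaves the survivors where they are, so free space stays fragmented until a compaction step moves them together.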
Object allocation
Ø Increasing memory availability on servers – into the terabyte space
Ø Efficient allocation using Thread Local Allocation Buffers (TLAB) and simple “advance the top” algorithm
Ø Not many Java applications are able to fully utilize this facility
Ø GC pauses (including in new gen)
Ø Difficulty in arriving at the right tuning
Ø Object pools, off heap memory used to get around this problem – not perfect solutions since memory management layer needs to be coded
Ø Can we have a continuously concurrent garbage collector?
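The "advance the top" allocation in a TLAB can be sketched as follows. This is a minimal model: the Tlab class, the use of long values as addresses and the 8-byte alignment are assumptions made for illustration.

```java
/** Toy thread-local allocation buffer: allocation is a bounds check plus a pointer bump. */
final class Tlab {
    private final long end;
    private long top;

    Tlab(long start, long sizeBytes) {
        this.top = start;
        this.end = start + sizeBytes;
    }

    /** Returns the address of the new object, or -1 when the buffer is exhausted. */
    long allocate(long sizeBytes) {
        long aligned = (sizeBytes + 7) & ~7L; // round up to 8 bytes (assumed alignment)
        if (end - top < aligned) {
            return -1; // the thread would ask the shared heap for a fresh buffer here
        }
        long result = top;
        top += aligned; // "advance the top"
        return result;
    }
}

final class TlabDemo {
    public static void main(String[] args) {
        Tlab tlab = new Tlab(1000, 32);
        System.out.println(tlab.allocate(10)); // prints 1000
        System.out.println(tlab.allocate(8));  // prints 1016
        System.out.println(tlab.allocate(16)); // prints -1, only 8 bytes remain
    }
}
```

Because the buffer belongs to one thread, the fast path needs no locking or atomic operations.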
Challenges and approaches
Concurrent Marking
Ø Marking – start from roots and traverse the object graph through discovered references
Ø Mutators can modify the object graph while GC is marking
• Move a reference to an already visited portion of the graph
• Remove references to an object from heap and keep a single reference in a register, hiding it from GC marker
Ø Approaches
• Incremental update – revisit root-set and modified portions of the graph iteratively, end with a re-mark pause
• SATB (snapshot at the beginning) – intercept writes and store old contents into buffers
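The SATB approach can be sketched as a write barrier that logs the value being overwritten. This is a single-threaded toy under assumptions: SatbNode, SatbHeap and the one-field object model are hypothetical, and real collectors use per-thread buffers and a concurrent marker.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy object with a single reference field and a mark bit. */
final class SatbNode {
    SatbNode field;
    boolean marked;
}

final class SatbHeap {
    boolean markingActive;
    final Deque<SatbNode> satbBuffer = new ArrayDeque<>();

    /** Write barrier: while marking is active, log the old value before overwriting it. */
    void writeField(SatbNode holder, SatbNode newValue) {
        SatbNode old = holder.field;
        if (markingActive && old != null) {
            satbBuffer.add(old); // the old referent was reachable in the snapshot
        }
        holder.field = newValue;
    }

    /** The marker drains logged references so everything live at the snapshot gets marked. */
    void drain() {
        while (!satbBuffer.isEmpty()) {
            SatbNode n = satbBuffer.poll();
            while (n != null && !n.marked) {
                n.marked = true;
                n = n.field;
            }
        }
    }
}

final class SatbDemo {
    public static void main(String[] args) {
        SatbHeap heap = new SatbHeap();
        SatbNode a = new SatbNode();
        SatbNode b = new SatbNode();
        a.field = b;
        heap.markingActive = true;
        a.marked = true;          // the marker has visited a but not yet followed a.field
        SatbNode hidden = a.field; // the mutator keeps b only in a local ("register")
        heap.writeField(a, null);  // the heap reference to b disappears
        heap.drain();
        System.out.println("b marked: " + hidden.marked);
    }
}
```

Without the barrier, b would be missed in this scenario even though the mutator still holds it.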
Concurrent Compaction
Ø Mutators can modify an object while it is being copied
Ø Mutators can read an object using stale pointers after it has been copied
Ø Incremental compact - G1GC Approach
• Divide heap into regions, maintain inter region references using remembered sets
• Minor collections use a copying collector
• Some minor collections do incremental compaction for old gen
• After concurrent mark, estimate efficiency of collecting regions; those with no or smaller RSets are easier to collect, so they are prioritized for upcoming minor collections
• Source regions updated while copying, RSet updates on new regions follow copying
• Mark sweep compact for STW major collections
Ø Read Barriers
GC Barriers
Ø Instructions executed by mutators that aid garbage collection
Ø Help maintain metadata
Ø Impose invariants
Ø Write barriers
• Update cross generation or cross region references
• SATB barrier to ensure snapshot is fully marked
• Incremental update barriers that store new references
Ø Read barriers
• Baker-style barrier
• Brooks-style forwarding pointer
• C4 Loaded Value Barrier (LVB)
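Of the read barriers listed, the Brooks-style forwarding pointer is the simplest to sketch: every object carries a pointer to its current copy, and every access goes through it. The names BrooksObject and BrooksBarrier are hypothetical, and the sketch ignores the races a concurrent collector has to handle.

```java
/** Toy object with a Brooks-style forwarding pointer. */
final class BrooksObject {
    BrooksObject forwardee = this; // points to itself until the object is relocated
    int payload;
    BrooksObject(int payload) { this.payload = payload; }
}

final class BrooksBarrier {
    /** Read barrier: every access dereferences the forwarding pointer first. */
    static BrooksObject resolve(BrooksObject ref) {
        return ref.forwardee;
    }

    /** Collector side: copy the object and install the forwarding pointer in the old copy. */
    static BrooksObject relocate(BrooksObject from) {
        BrooksObject to = new BrooksObject(from.payload);
        from.forwardee = to;
        return to;
    }
}

final class BrooksDemo {
    public static void main(String[] args) {
        BrooksObject stale = new BrooksObject(1);
        BrooksObject moved = BrooksBarrier.relocate(stale);
        BrooksBarrier.resolve(stale).payload = 5; // a write through the stale reference
        System.out.println(moved.payload);        // prints 5
    }
}
```

The price is one extra indirection on each access and one extra word per object; the stale reference itself is never corrected by the barrier.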
The Continuously Concurrent Compacting Collector (C4)
Loaded Value Barrier
Ø A read barrier that ensures, at time of load, that the following invariants are met before reference is visible to application
• If GC cycle is in marking phase, the reference will be marked through
• If GC cycle is in relocation phase, or has completed relocation but not fixup, the reference will be updated to point to the relocated object
Ø Simultaneously guarantees that
• No reference misses GC attention during marking
• There is no stale access to a compacted page
Ø The result of the load will always be a valid reference to a valid object
Self Healing
Ø Contents of source location overwritten with the result of LVB
Ø Loading from same source cannot trigger barrier again
Ø Critical property that ensures finite and predictable amount of work
Ø There may be “trap storms” at phase shifts, but they settle down as healing progresses and completes
Ø Unique to the C4 barrier (LVB)
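Self-healing can be illustrated with a toy load barrier that repairs the slot it loaded from. This is not Azul's implementation: ToyLvb, the array of slots and the forwarding map are hypothetical, only the relocation case is modeled, and the NMT check used during marking is left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a load barrier with self-healing (relocation case only). */
final class ToyLvb {
    final long[] slots;                                 // heap slots holding object "addresses"
    final Map<Long, Long> forwarding = new HashMap<>(); // old address -> new address
    int slowPathHits;

    ToyLvb(int slotCount) {
        slots = new long[slotCount];
    }

    /** Loads a reference; a stale value is corrected and written back to its source slot. */
    long loadRef(int slot) {
        long ref = slots[slot];
        Long moved = forwarding.get(ref);
        if (moved != null) {   // the barrier triggers (the "trap")
            slowPathHits++;
            ref = moved;
            slots[slot] = ref; // self-healing: later loads from this slot take the fast path
        }
        return ref;
    }
}

final class ToyLvbDemo {
    public static void main(String[] args) {
        ToyLvb heap = new ToyLvb(1);
        heap.slots[0] = 100L;
        heap.forwarding.put(100L, 200L); // the object at 100 was relocated to 200
        heap.loadRef(0);
        heap.loadRef(0);
        System.out.println("slow path taken " + heap.slowPathHits + " time(s)");
    }
}
```

Each stale slot can trigger the slow path at most once, which is what bounds the total barrier work.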
Mark phase
Ø Like other collectors, start from root set and traverse the object graph
Ø NMT (not marked through) LVB check – does reference metadata match expected GC state for the generation?
Ø Trap handling – Fix NMT state for the reference, heal the source location and add to collector’s work queue
Ø Checkpoints to clean stacks and transfer ref buffers
Ø Marking followed by a concurrent weak reference processing phase
Relocation phase
Ø Forwarding information kept outside of heap pages
Ø Virtual memory of compacted pages remains reserved until fixup is complete
Ø Physical memory can be released immediately (Quick Release) and recycled
Ø Hand over hand relocation – Each GC thread can complete with just one seed page
Ø Compacted pages are protected to catch accesses performed without LVB
Ø Mutators cooperate in the relocation if GC hasn’t moved the object yet at the relocate LVB trap
Ø Also heal the source memory with the new address of the object
Ø Large objects are just remapped to new virtual addresses, not physically copied
Fixup phase
Ø Traverse object graph and heal memory locations if not already done by mutators
Ø At end of fixup phase, virtual memory corresponding to compacted pages can be freed
Ø Can be combined with marking phase for next GC cycle, helping reduce GC cycle duration
Ø Mutators will do the fixup as part of LVB
Generational features
Ø New and old collections can proceed simultaneously and almost independently, unlike most collectors
Ø Perm gen processed by Old collector
Ø Old and new collectors use the same algorithm
Ø Synchronization using simple interlocks and limited suspension at phase changes
Ø Precise card marks for inter generational references. Updated by Store Value Barriers (SVB)
Ø Can be extended to N generations
Heap management
Ø Allocation in 2 MB “pages”
Ø Quick Release allows physical pages to be recycled to satisfy allocation requests before fixup is complete
Ø New, old and perm gen pages interleaved in virtual space
Ø Tiered allocation - Objects divided into small, mid and large “spaces” based on size – helps limit maximum headroom wastage (currently 12.5%)
Ø TLABs for small space allocation, bump-the-pointer
Ø Relocation uses a different mechanism for each space to limit the maximum copy that a mutator needs to do
Zing Safepoints
Ø C4 algorithm is pauseless, but current implementation has a few short pauses, mostly at collector phase transitions (for ease and efficiency)
Ø Pause times independent of heap size, live object size, object lifetime, allocation rate, mutation rate, count of weak/soft/phantom references
Ø Provides sufficient safepoint opportunities to reduce time to bring threads to safepoint
Ø Pause times remain consistent
Ø Employs thread checkpoints when there is a specific action to be performed for/by that thread or when the thread needs to observe a GC state change
More on Zing
Ø GC scheduled by heuristics
Ø In most cases no tuning required
Ø Elastic memory - helps reduce occurrences of OOM
Ø Linux kernel module to improve performance of virtual memory operations
Keywords for reference search
Ø Talks by Gil Tene, CTO Azul Systems
Ø The Garbage Collection Handbook
Ø C4: The Continuously Concurrent Compacting Collector
Ø Garbage-First Garbage Collection
Ø Azul Zing JVM
Where Zing shines
Ø Low latency
Eliminate behaviour blips down to the sub-millisecond-units level
Ø Machine-to-machine “stuff”
Support higher *sustainable* throughput (one that meets SLAs)
Messaging, queues, market data feeds, fraud detection, analytics
Ø Human response times
Eliminate user-annoying response time blips. Multi-second and even fraction-of-a-second blips will be completely gone.
Support larger memory JVMs *if needed* (e.g. larger virtual user counts, or larger cache, in-memory state, or consolidating multiple instances)
Ø “Large” data and in-memory analytics
Make batch stuff “business real time”. Gain super-efficiencies.
Cassandra, Spark, Solr, DataGrid, any large dataset in fast motion
Q & A