Garbage Collection Advantage: Improving Program Locality
Xianglong Huang (UT), Stephen M Blackburn (ANU), Kathryn S McKinley (UT), J Eliot B Moss (UMass), Zhenlin Wang (MTU), Perry Cheng (IBM)
Motivation
• Memory gap problem
• OO programs are becoming more popular
• OO programs exacerbate the memory gap problem
  – Automatic memory management
  – Pointer data structures
  – Many small methods
Goal: improve OO program locality
Cache Performance Matters
[Figure: total cycles (in billions) for _213_javac; y-axis from 0 to 40]
Opportunity
• Generational copying garbage collector reorders objects at runtime
Copying of Linked Objects
[Figure, built up over three slides: a graph of seven linked objects (1 to 7) and the order in which the collector lays them out when copying breadth-first, depth-first, and with online object reordering]
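The difference between the two static copy orders can be sketched in a few lines of Java. The object graph below is a hypothetical seven-node tree chosen for illustration (the edges of the slide's graph are not recoverable from this transcript), and `CopyOrderSketch`, `breadthFirst`, and `depthFirst` are names made up for this sketch. A real copying collector works on heap objects and forwarding pointers rather than integer ids.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CopyOrderSketch {
    // Hypothetical object graph: object id -> ids of the objects its fields reference.
    static final Map<Integer, List<Integer>> GRAPH = Map.of(
        1, List.of(2, 3, 4),
        2, List.of(5),
        3, List.of(6),
        4, List.of(7),
        5, List.of(),
        6, List.of(),
        7, List.of());

    // Breadth-first order: objects are laid out in the order they are discovered,
    // so siblings end up adjacent and their children trail behind.
    static List<Integer> breadthFirst(int root) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> copied = new HashSet<>();
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(root);
        copied.add(root);
        while (!queue.isEmpty()) {
            int obj = queue.remove();
            order.add(obj);
            for (int child : GRAPH.get(obj)) {
                if (copied.add(child)) {
                    queue.add(child);
                }
            }
        }
        return order;
    }

    // Depth-first order: each object is followed immediately by everything
    // reachable through its first field, then its second field, and so on.
    static List<Integer> depthFirst(int root) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> copied = new HashSet<>();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            int obj = stack.pop();
            if (!copied.add(obj)) {
                continue; // already copied via another path
            }
            order.add(obj);
            List<Integer> children = GRAPH.get(obj);
            // Push in reverse so the first field is visited first.
            for (int i = children.size() - 1; i >= 0; i--) {
                stack.push(children.get(i));
            }
        }
        return order;
    }
}
```

On this graph, `breadthFirst(1)` yields 1 2 3 4 5 6 7 and `depthFirst(1)` yields 1 2 5 3 6 4 7. Object 2 and its child 5 are neighbours only in the depth-first layout, while siblings 2, 3, and 4 are contiguous only in the breadth-first one; which layout helps depends on how the program traverses the structure.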
Outline
• Motivation
• Online Object Reordering (OOR)
• Methodology
• Experimental Results
• Conclusion
Online Object Reordering
• Where are the cache misses?
• How to identify hot field accesses at runtime?
• How to reorder the objects?
Where Are The Cache Misses?
• Heap structure:
[Figure: heap layout showing VM objects, stack, older generation, and nursery; not to scale]
Where Are The Cache Misses?
[Figure: total accesses (in millions) for _209_db, split into L2 hits and L2 misses; y-axis from 0 to 2000]
Where Are The Cache Misses?
• Two opportunities to reorder objects in the older generation
  – Promote nursery objects
  – Full heap collection
How to Find Hot Fields?
• Runtime info (intercept every read)?
• Compiler analysis?
• Runtime information + compiler analysis
Key: Low overhead estimation
Which Classes Need Reordering?
Step 1: Compiler analysis
  – Excludes cold basic blocks
  – Identifies field accesses
Step 2: JIT adaptive sampling identifies hot methods
  – Marks field accesses in hot methods as hot
Key: Low overhead estimation
Example: Compiler Analysis
Method Foo {
  Class A a;
  try {
    …=a.b;
    …
  }
  catch(Exception e){
    …a.c
  }
}
Compiler:
  – Hot BB: collect access info
  – Cold BB: ignore
Access List:
  1. A.b
  2. …
Example: Adaptive Sampling
Method Foo {
  Class A a;
  try {
    …=a.b;
    …
  }
  catch(Exception e){
    …a.c
  }
}
Adaptive sampling: Foo is hot
Foo Accesses:
  1. A.b
  2. …
Result: A.b is hot
[Figure: A's type information (fields b, …, c) shown alongside objects A and B]
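The two-step estimate (compiler analysis, then adaptive sampling) can be sketched as a small bookkeeping structure. This is an illustration only: `HotFieldSketch`, `recordAccess`, `markMethodHot`, and `isHot` are hypothetical names, not Jikes RVM APIs, and the compiler and the sampler are reduced to plain method calls.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class HotFieldSketch {
    // Step 1 output: method name -> fields accessed in its non-cold basic blocks.
    private final Map<String, Set<String>> accessInfo = new HashMap<>();
    // Step 2 output: fields accessed by methods that sampling found to be hot.
    private final Set<String> hotFields = new HashSet<>();

    // Called by the (hypothetical) compiler pass for each field access it sees.
    // Accesses in cold basic blocks, such as exception handlers, are ignored.
    void recordAccess(String method, String field, boolean inColdBlock) {
        if (inColdBlock) {
            return;
        }
        accessInfo.computeIfAbsent(method, m -> new HashSet<>()).add(field);
    }

    // Called when adaptive sampling reports a hot method: every field on that
    // method's access list becomes hot.
    void markMethodHot(String method) {
        hotFields.addAll(accessInfo.getOrDefault(method, Set.of()));
    }

    boolean isHot(String field) {
        return hotFields.contains(field);
    }
}
```

Mirroring the slide's example, recording `A.b` for `Foo` from the try block and `A.c` from the catch block, then calling `markMethodHot("Foo")`, leaves `A.b` hot and `A.c` not hot. No per-read instrumentation is involved, which is the point of the low-overhead estimate.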
Copying of Linked Objects
[Figure: a graph of linked objects (1 to 7) copied with online object reordering; type information guides the copy, and objects are placed into a hot space and a cold space]
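One way to picture how a collector could use hot-field information: when it copies an object, it follows the fields marked hot before the rest, so the referents of hot fields land next to their parent. The sketch below is a simplification under stated assumptions. The `Obj` class, the `hotFields` set, and the choice to copy hot referents immediately while queueing the others are all invented for illustration, and the separate hot and cold spaces named on the slide are not modelled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class HotFirstCopySketch {
    // Hypothetical heap object: a name plus named, non-null reference fields.
    static class Obj {
        final String name;
        final Map<String, Obj> fields = new LinkedHashMap<>();
        Obj(String name) { this.name = name; }
    }

    // Returns the order in which objects would be laid out in to-space.
    // Referents of hot fields are copied right after their parent; referents
    // of the remaining fields wait in a queue and are copied later.
    static List<String> copyOrder(Obj root, Set<String> hotFields) {
        List<String> order = new ArrayList<>();
        Set<Obj> copied = new HashSet<>();
        Deque<Obj> coldQueue = new ArrayDeque<>();
        coldQueue.add(root);
        while (!coldQueue.isEmpty()) {
            copyFollowingHotFields(coldQueue.remove(), hotFields, copied, coldQueue, order);
        }
        return order;
    }

    private static void copyFollowingHotFields(Obj obj, Set<String> hotFields, Set<Obj> copied,
                                               Deque<Obj> coldQueue, List<String> order) {
        if (!copied.add(obj)) {
            return; // already copied
        }
        order.add(obj.name);
        for (Map.Entry<String, Obj> field : obj.fields.entrySet()) {
            if (hotFields.contains(field.getKey())) {
                // Hot field: copy the referent now so it sits next to its parent.
                copyFollowingHotFields(field.getValue(), hotFields, copied, coldQueue, order);
            } else {
                // Other fields: defer the referent.
                coldQueue.add(field.getValue());
            }
        }
    }
}
```

For an object `a` whose fields `A.c` and `A.b` (declared in that order) point to `c` and `b`, with `b` pointing to `d` through `B.d`, marking `A.b` hot gives the order a, b, c, d. With no hot fields the same graph is copied as a, c, b, d.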
OOR System Overview
[Figure: system diagram. Boxes: Source Code, Baseline Compiler, Executing Code, Adaptive Sampling, Hot Methods, Optimizing Compiler, Access Info Database, Register Hot Field Accesses, Advice, GC: Copies Objects. Edge labels: Adds Entries, Look Up, Affects Locality, Improves Locality. Legend: OOR addition, Jikes RVM component, Input/Output]
Outline
• Motivation
• Online Object Reordering
• Methodology
• Experimental Results
• Conclusion
Methodology: Virtual Machine
• Jikes RVM
  – VM written in Java
  – High performance
  – Timer-based adaptive sampling
  – Dynamic optimization
• Experiment setup
  – Pseudo-adaptive
  – 2nd iteration [Eeckhout et al.]
Methodology: Memory Management
• Memory Management Toolkit (MMTk):
  – Allocators and garbage collectors
  – Multi-space heap
    • Boot image
    • Large object space (LOS)
    • Immortal space
• Experiment setup
  – Generational copying GC with 4 MB bounded nursery
Overhead: OOR Analysis Only
| Benchmark | Base execution time (sec) | With only OOR analysis (sec) | Overhead |
|---|---|---|---|
| jess | 4.39 | 4.43 | 0.84% |
| jack | 5.79 | 5.82 | 0.57% |
| raytrace | 4.63 | 4.61 | -0.59% |
| mtrt | 4.95 | 4.99 | 0.70% |
| javac | 12.83 | 12.70 | -1.05% |
| compress | 8.56 | 8.54 | 0.20% |
| pseudojbb | 13.39 | 13.43 | 0.36% |
| db | 18.88 | 18.88 | -0.03% |
| antlr | 0.94 | 0.91 | -2.90% |
| hsqldb | 160.56 | 158.46 | -1.30% |
| ipsixql | 41.62 | 42.43 | 1.93% |
| jython | 37.71 | 37.16 | -1.44% |
| ps-fun | 129.24 | 128.04 | -1.03% |
| Mean | | | -0.19% |
Detailed Experiments
• Separate application and GC time
• Vary thresholds for method heat
• Vary thresholds for cold basic blocks
• Three architectures
  – x86, AMD, PowerPC
• x86 performance counters:
  – DL1, trace cache, L2, DTLB, ITLB
Performance javac
Performance db
Performance jython
Any static ordering leaves you vulnerable to pathological cases.
Phase Changes
Related Work
• Evaluate static orderings [Wilson et al.]
  – Large performance variation
• Static profiling [Chilimbi et al., and others]
  – Lack of flexibility
• Instance-based object reordering [Chilimbi et al.]
  – Too expensive
Conclusion
• Static traversal orders show up to 25% performance variation
• OOR matches or improves on the best static ordering
• OOR has very low overhead
• Past predicts future
Questions?
Thank you!
OOR System Overview
• Records object accesses in each method (excludes cold basic blocks)
• Finds hot methods by adaptive sampling
• Reorders objects with hot fields in older generation during GC
• Copies hot objects into separate region