Connectivity-Based Garbage Collection Presenter Feng Xian Author Martin Hirzel, et.al Published in...

Post on 20-Dec-2015



Connectivity-Based Garbage Collection

Presenter: Feng Xian
Author: Martin Hirzel et al.
Published in OOPSLA’2003


Garbage Collection Benefits

Garbage collection leads to simpler

• Design: no complex deallocation protocols
• Implementation: automatic deallocation
• Maintenance: fewer bugs

Benefits are widely accepted

• Java, C#, Python, …

Garbage Collection: Haven’t we solved this problem yet?

• For a state-of-the-art garbage collector:

– time: ~14% of execution time
– space: 3x high watermark
– pauses: 0.8 seconds

• Can reduce any one cost

• Challenge: reduce all three costs

Example Heap

[Figure: heap diagram with objects o1 to o15 and roots s1, s2, g]

Boxes: heap objects

Arrows: pointers

Long box: stack + global variables


Thesis

1. Objects form distinct data structures

2. Connected objects die together

3. Garbage collectors can exploit 1. and 2. to reclaim objects efficiently


Experimental Infrastructure

JikesRVM Research Virtual Machine
– From IBM Research
– Written in Java
– Application and runtime system share heap

Good garbage collection even more important

Benchmarks
– SPECjvm98 suite and SPECjbb2000
– Java Olden suite
– xalan, ipsixql, nfc, jigsaw

Outline

• Garbage Collector Design Principles

• Family of Garbage Collectors

• Design Space Exploration

• Conclusion


Garbage Collector Design Principles

“Do partial collections.”

Don’t collect the full heap every time

Shorter pause times


Garbage Collector Design Principles

“Predict lifetime based on age.”

Generational hypothesis: Most objects die young

Generational garbage collection:
– Partition by age
– Collect young objects most often

Low time overhead

That’s the state of the art.

[Figure: example heap split into a young generation and an old generation]

Garbage Collector Design Principles

Generational GC Problems


• Regular full collections: long peak pause

• Old-to-young pointers: need bookkeeping
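The bookkeeping for old-to-young pointers is commonly done with a write barrier that records such pointers in a remembered set. The following is a minimal sketch of that general technique, not code from the talk; the `Obj` and `Heap` classes and all their members are hypothetical names chosen for illustration.

```python
class Obj:
    """A heap object with named pointer fields (hypothetical model)."""
    def __init__(self, name, young=True):
        self.name = name
        self.young = young
        self.fields = {}


class Heap:
    def __init__(self):
        # (source object, field name) pairs that currently hold old-to-young pointers
        self.remembered = set()

    def write(self, src, field, target):
        """Write barrier: every pointer store is assumed to go through here."""
        src.fields[field] = target
        if not src.young and target is not None and target.young:
            self.remembered.add((src, field))
        else:
            self.remembered.discard((src, field))

    def young_roots(self, stack_roots):
        """Roots for a young-generation collection: young stack roots
        plus the targets of remembered old-to-young pointers."""
        roots = [o for o in stack_roots if o.young]
        roots += [src.fields[f] for (src, f) in self.remembered]
        return roots
```

Every pointer store pays for the barrier check, which is the bookkeeping cost this slide refers to.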


Garbage Collector Design Principles

“Collect connected objects together.”

Likelihood that two objects die at the same time:

Connectivity          Likelihood
Any pair              33.1%
Weakly connected      46.3%
Strongly connected    72.4%
Direct pointer        76.4%
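To make the connectivity categories concrete, here is a small sketch that classifies a pair of objects in a pointer graph. It assumes the standard graph-theoretic meanings of the terms (the slide does not define them), and the function names and the graph encoding (a dict from object to the set of objects it points to) are illustrative choices.

```python
from collections import deque


def reachable(graph, start):
    """Set of nodes reachable from start by following edges (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def undirected(graph):
    """Copy of the graph in which every pointer can be followed both ways."""
    und = {}
    for src, dsts in graph.items():
        und.setdefault(src, set())
        for dst in dsts:
            und[src].add(dst)
            und.setdefault(dst, set()).add(src)
    return und


def connectivity(graph, a, b):
    """Report the first category that applies, checked from most to least direct."""
    if b in graph.get(a, ()) or a in graph.get(b, ()):
        return "direct pointer"
    if b in reachable(graph, a) and a in reachable(graph, b):
        return "strongly connected"
    if b in reachable(undirected(graph), a):
        return "weakly connected"
    return "unconnected"
```

For example, in a four-object cycle p → q → r → s → p, the pair (p, q) is a direct pointer, while (p, r) is strongly connected without a direct pointer.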


Garbage Collector Design Principles

“Focus on objects with few ancestors.”

Short-lived objects are easy to collect

Lifetime    Median number of ancestor objects
Short       2 objects
Long        83,324 objects

Garbage Collector Design Principles

“Predict lifetime based on roots.”

[Figure: stack (s) and globals (g) pointing to objects o1 to o4]

                            Lifetime
Objects reachable …         Short     Long
indirectly from stack       25.6%     16.2%
only directly from stack    32.9%     0.8%
from globals                4.0%      20.5%
Total                       62.5%     37.5%

For details, see [ISMM’02] paper.


Outline

• Garbage Collector Design Principles

• Family of Garbage Collectors

• Design Space Exploration

• Conclusion


CBGC Family of Garbage Collectors:

Connectivity-Based Garbage Collection

[Figure: example heap divided into partitions p1 to p4]

• Do partial collections.
• Collect connected objects together.
• Predict lifetime based on age.
• Focus on objects with few ancestors.
• Predict lifetime based on roots.

Family of Garbage Collectors

Components of CBGC

Before allocation:
1. Partitioning: decide into which partition to put each object

Collection algorithm:
2. Estimator: estimate dead + live objects for each partition
3. Chooser: choose “good” set of partitions
4. Partial collection: collect chosen partitions

Family of Garbage Collectors

Partitioning Problem

Find fine-grained partitions, where

• Partition edges respect pointers

• Objects don’t move between partitions

Family of Garbage Collectors

Partitioning Solutions

Pointer analysis
• Type-based [Harris]
– o1 may point to o2 if o1 has a field of a type compatible with o2
– Conservative: the analysis reports the absence of a pointer between two parts of the heap only if it can prove that such a pointer cannot exist.
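One way to picture a type-based partitioning is to build a may-point-to graph over types and merge types that can reach each other, so that the resulting partitions form an acyclic graph whose edges respect pointers. The sketch below does this with Kosaraju's strongly-connected-components algorithm. It is an illustration under simplifying assumptions, not the analysis used in the talk: "compatible" is reduced to the declared field type (no subtyping), and `type_partitions` and its input format are hypothetical.

```python
def type_partitions(field_types):
    """Map each type name to a representative type naming its partition.

    field_types maps a type name to the set of type names its fields may
    reference (an assumed, simplified may-point-to relation).
    """
    types = set(field_types)
    for targets in field_types.values():
        types |= set(targets)

    # Pass 1: order types by depth-first finishing time on the may-point-to graph.
    order, seen = [], set()

    def visit(t):
        seen.add(t)
        for u in field_types.get(t, ()):
            if u not in seen:
                visit(u)
        order.append(t)

    for t in sorted(types):
        if t not in seen:
            visit(t)

    # Pass 2: sweep the reversed graph in reverse finishing order.
    reverse = {t: set() for t in types}
    for t, targets in field_types.items():
        for u in targets:
            reverse[u].add(t)

    partition_of = {}
    for t in reversed(order):
        if t in partition_of:
            continue
        partition_of[t] = t          # t names the new partition
        stack = [t]
        while stack:
            cur = stack.pop()
            for prev in reverse[cur]:
                if prev not in partition_of:
                    partition_of[prev] = t
                    stack.append(prev)
    return partition_of
```

Because an object's type is fixed at allocation, an object assigned this way never moves between partitions. Mutually referencing types such as a hypothetical `A` and `B` end up in one partition, while `List`, `Node`, and `Item` in a list-of-items structure stay separate.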


Family of Garbage Collectors

Estimator Problem

For each partition, guess

dead
– Objects that can be reclaimed
– Pay-off

live
– Objects that must be traversed
– Cost

[Figure: partitions p1 to p4 annotated with the estimates 3 dead + 3 live, 1 dead + 2 live, 2 dead + 0 live, and 2 dead + 2 live]

Family of Garbage Collectors

Estimator Solutions

Heuristics
• Connected objects die together
• Most objects die young
• Objects reachable from globals live long
• The past predicts the future

Family of Garbage Collectors

Chooser Problem

Pick subset of partitions
• Maximize total dead
• Minimize total live
• Closed under predecessor relation: no bookkeeping for external pointers

[Figure: chosen partitions total 7 dead + 5 live; per-partition estimates are 3 dead + 3 live, 1 dead + 2 live, 2 dead + 0 live, and 2 dead + 2 live]
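The closure requirement can be sketched as follows: choosing a partition forces choosing every partition that may point into it, transitively. The selection rule in this sketch (try each partition's predecessor closure and keep the one with the largest dead minus live) is an assumption made for illustration and is not the talk's algorithm; the function names, the scoring, and the partition edges in the example are hypothetical.

```python
def predecessor_closure(preds, start):
    """All partitions that must be collected together with `start`.

    preds maps a partition to the set of partitions that may point into it.
    """
    chosen, stack = {start}, [start]
    while stack:
        p = stack.pop()
        for q in preds.get(p, ()):
            if q not in chosen:
                chosen.add(q)
                stack.append(q)
    return chosen


def choose(estimates, preds):
    """Pick the best-scoring closed set among single-partition closures.

    estimates maps a partition to an (estimated dead, estimated live) pair.
    Returns an empty set if no closure has a positive score.
    """
    best, best_score = set(), 0
    for p in estimates:
        closure = predecessor_closure(preds, p)
        dead = sum(estimates[q][0] for q in closure)
        live = sum(estimates[q][1] for q in closure)
        score = dead - live          # assumed scoring: pay-off minus cost
        if score > best_score:
            best, best_score = closure, score
    return best


# Estimates as on the slide; the assignment to p1..p3 and the edges are assumed.
estimates = {"p1": (1, 2), "p2": (3, 3), "p3": (2, 0), "p4": (2, 2)}
preds = {"p3": {"p2", "p4"}}
print(sorted(choose(estimates, preds)))
```

With these assumed inputs the chosen set is p2, p3, and p4, whose estimates add up to 7 dead + 5 live.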


Family of Garbage Collectors

Chooser Solutions

Optimal algorithm based on network flow [TR]

Simpler, greedy algorithm


Family of Garbage Collectors

Partial Collection Problem

Look only at chosen partitions

Traverse reachable objects

Reclaim unreachable objects

[Figure: chosen partitions p2, p3, p4 shown apart from the rest of the heap]
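The three steps above can be sketched as a mark-and-sweep pass restricted to the chosen partitions. This is a minimal illustration that assumes the chosen set is closed under the predecessor relation, so no pointer from the rest of the heap leads into it; `partial_collect` and its argument encoding are hypothetical.

```python
def partial_collect(pointers, partition_of, roots, chosen):
    """Return the objects in the chosen partitions that can be reclaimed.

    pointers:     object -> set of objects it points to
    partition_of: object -> partition name
    roots:        objects referenced from the stack and globals
    chosen:       set of partition names, assumed closed under predecessors
    """
    marked = set()
    # Look only at chosen partitions: start from roots that point into them.
    stack = [o for o in roots if partition_of[o] in chosen]
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        for target in pointers.get(obj, ()):
            # Pointers that leave the chosen partitions are not followed.
            if partition_of[target] in chosen and target not in marked:
                stack.append(target)
    # Everything in a chosen partition that was not marked is unreachable.
    candidates = {o for o, p in partition_of.items() if p in chosen}
    return candidates - marked
```

Objects outside the chosen partitions are neither traversed nor reclaimed, which is what keeps the pause proportional to the chosen part of the heap.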


Family of Garbage Collectors

Partial Collection Solutions

Generalize canonical full-heap algorithms

• Mark and sweep [McCarthy’60]

• Semi-space copying [Cheney’70]

• Treadmill [Baker’92]

Outline

• Garbage Collector Design Principles

• Family of Garbage Collectors

• Design Space Exploration

• Conclusion


Design Space Exploration

Questions

How good is a naïve CBGC?

How good could CBGC be in 20 years?

How well does CBGC do in a JVM?


Design Space Exploration

Simulator Methodology

Garbage collection simulator (under GPL)
– Uses traces of allocations and pointer writes from our benchmark runs

Simulator advantages
– Easier to implement variety of collector algorithms
– Know entire trace beforehand: can use that for “in 20 years” experiments

Currently adding CBGC to JikesRVM
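To illustrate what a trace of allocations and pointer writes gives a simulator, here is a minimal sketch that replays such a trace into a pointer graph on which a simulated collector could then run. The event tuples and the `replay` function are a hypothetical format invented for this example, not the trace format of the simulator described here.

```python
def replay(trace):
    """Rebuild the pointer graph from allocation and pointer-write events.

    Assumed event shapes:
      ("alloc", obj)                  a new object is allocated
      ("write", src, field, target)   src.field is set to target (or None)
    Returns a dict mapping each object to the set of objects it points to.
    """
    fields = {}                       # object -> {field name: target or None}
    for event in trace:
        if event[0] == "alloc":
            _, obj = event
            fields[obj] = {}
        elif event[0] == "write":
            _, src, field, target = event
            fields[src][field] = target
        else:
            raise ValueError(f"unknown event: {event!r}")
    return {obj: {t for t in fs.values() if t is not None}
            for obj, fs in fields.items()}
```

Because the whole trace is available before simulation starts, a replay like this can also be run ahead of time to answer questions about the future of each object, which is the kind of knowledge the "in 20 years" experiments rely on.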


Design Space Exploration

How good is a naïve CBGC?

[Figure: bar charts in three panels (cost in time, cost in space, pause times) over the benchmarks jack, xalan, jbb, and javac; surviving axis labels: 1.72, 0.87, 0.22]

Collectors compared:

• Full-heap: semi-space copying

• CBGC-naïve: type-based partitioning [Harris], heuristics estimator

• Appel: copying generational

Design Space Exploration

How good could CBGC be in 20 years?

[Figure: bar charts in three panels (cost in time, cost in space, pause times) over the benchmarks jack, xalan, jbb, and javac; surviving axis labels: 1.72, 0.87, 0.22]

Collectors compared:

• Full-heap: semi-space copying

• CBGC-oracles: partitioning and estimator based on trace

• Appel: copying generational

Design Space Exploration

How good could CBGC be in 20 years?

CBGC with oracles beats Appel
– We did not find a “performance wall”
– CBGC has potential

The performance gap between CBGC with oracles and naïve CBGC is large
– Research challenges

How well does CBGC do in a Java virtual machine?

Implementation in progress

Need a pointer analysis for the partitioning


Contributions presented in this talk

Connectivity-based GC design principles [ISMM’02]

CBGC, a new family of garbage collectors

Design space exploration with simulator [OOPSLA’03]