Compilation 2007
Garbage Collection
Michael I. Schwartzbach
BRICS, University of Aarhus

The Garbage Collector
A garbage collector is part of the runtime system.
It reclaims heap-allocated records (objects) that are no longer in use.
A garbage collector should:
• reclaim all unused records
• spend very little time per record
• not cause significant delays
• allow all of memory to be used
These are difficult and conflicting requirements.

Life Without Garbage Collection
Unused records must be explicitly deallocated.
This is superior if done correctly.
But it is easy to miss some records.
And it is dangerous to handle pointers.
Memory leaks in real life (ical v.2.1):
[Figure: memory usage in MB (axis from 0 to 35) plotted against running time in hours]

Record Liveness
Which records are still in use?
Ideally, those that will be accessed in the future execution of the program.
But that is of course undecidable...
Basic conservative approximation:
A record is live if it is reachable from a stack location (local variable or local stack).
Dead records may still point to each other.

A Heap With Live and Dead Records
[Figure: an example heap in which some records are reachable from the stack variables p, q, and r and others are not]

The Mark-and-Sweep Algorithm
Explore pointers starting from all stack locations and mark all the records encountered.
Sweep through all records in the heap and reclaim the unmarked ones.
Unmark all marked records.
Assumptions:
• we know the start and size of each record in memory
• we know which record fields are pointers
• reclaimed records are kept in a freelist

Pseudo Code for Mark-and-Sweep
function DFS(x) {
  if (x is a heap pointer)
    if (x is not marked) {
      mark x;
      for (i=1; i<=|x|; i++)
        DFS(x.fi)
    }
}
function Sweep() {
  p = first address in heap;
  while (p < last address in heap) {
    if (p is marked)
      unmark p;
    else {
      p.f1 = freelist;
      freelist = p;
    }
    p = p + sizeof(p);
  }
}
function Mark() {
  foreach (v in a stack frame)
    DFS(v);
}
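The pseudo code above can be tried out on a toy heap. The following is only a minimal sketch: the names MarkSweepSketch, Rec, alloc, and collect are hypothetical, records are modelled as Java objects holding pointer fields only, and the freelist is a plain list instead of being threaded through the f1 fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MarkSweepSketch {
    // A toy heap record (hypothetical model): pointer fields plus a mark bit.
    static final class Rec {
        final String name;
        final Rec[] fields;
        boolean marked;
        Rec(String name, int size) { this.name = name; this.fields = new Rec[size]; }
    }

    final List<Rec> heap = new ArrayList<>();      // every allocated record
    final List<Rec> freelist = new ArrayList<>();  // reclaimed records

    Rec alloc(String name, int size) {
        Rec r = new Rec(name, size);
        heap.add(r);
        return r;
    }

    // DFS(x): mark x and everything reachable from it.
    static void dfs(Rec x) {
        if (x == null || x.marked) return;
        x.marked = true;
        for (Rec f : x.fields) dfs(f);
    }

    // Mark from the roots, then sweep: unmark the survivors, reclaim the rest.
    void collect(List<Rec> roots) {
        for (Rec v : roots) dfs(v);
        List<Rec> live = new ArrayList<>();
        for (Rec p : heap) {
            if (p.marked) { p.marked = false; live.add(p); }
            else freelist.add(p);
        }
        heap.clear();
        heap.addAll(live);
    }

    public static void main(String[] args) {
        MarkSweepSketch gc = new MarkSweepSketch();
        Rec a = gc.alloc("a", 1), b = gc.alloc("b", 1);
        Rec c = gc.alloc("c", 1), d = gc.alloc("d", 1);
        a.fields[0] = b;                    // a -> b, reachable from the root a
        c.fields[0] = d; d.fields[0] = c;   // c <-> d, a dead cycle
        gc.collect(Arrays.asList(a));
        System.out.println("live=" + gc.heap.size() + " reclaimed=" + gc.freelist.size());
    }
}
```

Note that the dead cycle c <-> d is reclaimed, since liveness is decided by reachability from the roots only.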

Marking and Sweeping
[Figure sequence, 11 steps: marking from the stack variables p, q, and r, then sweeping the example heap and linking the unmarked records into the freelist]

Analysis of Mark-and-Sweep
Assume the heap has H words.
Assume that R words are reachable.
The cost of garbage collection is:
  c1·R + c2·H
(marking touches the R reachable words, sweeping touches all H words)
The cost per reclaimed word is:
  (c1·R + c2·H)/(H - R)
If R is close to H, then this is expensive.

Allocation
The freelist must be searched for a record that is large enough to provide the requested memory.
Free records may be sorted by size.
The freelist may become fragmented: containing many small free records but none that is large enough.
Defragmentation joins adjacent free records.
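Fragmentation and defragmentation can be illustrated with a small first-fit allocator. This is a sketch under assumptions, not the allocator of any particular runtime: the class FreeListSketch and its methods are hypothetical, free records are kept in a TreeMap sorted by address (not by size) so that neighbours are easy to find, and sizes are counted in words.

```java
import java.util.Map;
import java.util.TreeMap;

class FreeListSketch {
    // Free records: start address -> size in words, sorted by address.
    private final TreeMap<Integer, Integer> free = new TreeMap<>();

    FreeListSketch(int heapSize) { free.put(0, heapSize); }

    // First fit: returns the start address of a record of `size` words, or -1.
    int allocate(int size) {
        Integer found = null;
        for (Map.Entry<Integer, Integer> e : free.entrySet()) {
            if (e.getValue() >= size) { found = e.getKey(); break; }
        }
        if (found == null) return -1;       // no single free record is large enough
        int len = free.remove(found);
        if (len > size) free.put(found + size, len - size);  // keep the remainder
        return found;
    }

    // Frees a record and joins it with adjacent free records.
    void release(int start, int size) {
        Map.Entry<Integer, Integer> prev = free.lowerEntry(start);
        if (prev != null && prev.getKey() + prev.getValue() == start) {
            start = prev.getKey();          // merge with the free record just before
            size += prev.getValue();
            free.remove(start);
        }
        Integer nextLen = free.get(start + size);
        if (nextLen != null) {              // merge with the free record just after
            free.remove(start + size);
            size += nextLen;
        }
        free.put(start, size);
    }

    int freeRecords() { return free.size(); }

    public static void main(String[] args) {
        FreeListSketch heap = new FreeListSketch(10);
        int a = heap.allocate(4), b = heap.allocate(4), c = heap.allocate(2);
        heap.release(a, 4);
        heap.release(c, 2);
        System.out.println("fragmented, allocate(5) = " + heap.allocate(5));
        heap.release(b, 4);
        System.out.println("after joining, allocate(5) = " + heap.allocate(5));
    }
}
```

In the usage example, six words are free after the first two releases, but no request for five words can be served until the middle record is released and joined with its neighbours.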

Pointer Reversal
The DFS recursion stack could have size H (for example, when the heap is one long linked list).
It has at least size log(H).
This may be too much (after all, memory is low).
The recursion stack may be cleverly embedded in the fields of the marked records.
This technique makes mark-and-sweep practical.
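One way to embed the stack in the records is the classic pointer-reversal scheme (often attributed to Deutsch, Schorr, and Waite): while descending into a field, that field temporarily points back to the parent. The following is a minimal sketch; the class and field names are hypothetical, and it assumes each record has room for a small counter (done) besides the mark bit.

```java
class PointerReversalSketch {
    static final class Rec {
        final Rec[] fields;
        boolean marked;
        int done;                 // number of fields already explored
        Rec(int size) { fields = new Rec[size]; }
    }

    // Marks everything reachable from x without recursion or an explicit stack.
    static void mark(Rec x) {
        if (x == null || x.marked) return;
        Rec t = null;             // parent of x on the reversed path
        x.marked = true;
        x.done = 0;
        while (true) {
            int i = x.done;
            if (i < x.fields.length) {
                Rec y = x.fields[i];
                if (y != null && !y.marked) {
                    x.fields[i] = t;   // reverse: the field now points to the parent
                    t = x;
                    x = y;             // descend into the child
                    x.marked = true;
                    x.done = 0;
                } else {
                    x.done = i + 1;    // nothing to explore through this field
                }
            } else {
                Rec y = x;             // all fields of x are done: go back up
                x = t;
                if (x == null) return; // we are back above the root
                i = x.done;
                t = x.fields[i];       // recover the parent of x
                x.fields[i] = y;       // restore the original pointer
                x.done = i + 1;
            }
        }
    }

    public static void main(String[] args) {
        Rec head = new Rec(1), cur = head;
        for (int k = 0; k < 1_000_000; k++) {   // a list this long could overflow a recursive DFS
            Rec n = new Rec(1);
            cur.fields[0] = n;
            cur = n;
        }
        mark(head);
        System.out.println("last record marked: " + cur.marked);
    }
}
```

The extra space is one counter per record, independent of how deep the traversal goes.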

The Reference Counting Algorithm
Maintain a counter of the total number of references to each record.
For each assignment, update the counters.
A record is dead when its counter is zero.
Advantages:
• catches dead records immediately
• does not cause long pauses
Disadvantages:
• cannot detect cycles of dead records
• is rather expensive

Pseudo Code for Reference Counting
function Increment(x) {
  x.count++;
}
function Decrement(x) {
  x.count--;
  if (x.count==0)
    PutOnFreelist(x);
}
function PutOnFreelist(x) {
  Decrement(x.f1);
  x.f1 = freelist;
  freelist = x;
}
function RemoveFromFreelist(x) {
  for (i=2; i<=|x|; i++)
    Decrement(x.fi);
}
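The cycle problem is easy to reproduce. The sketch below is a simplified, eager variant and not the lazy freelist scheme of the pseudo code: when a counter reaches zero, the counters of the children are decremented immediately. All names (RefCountSketch, Rec, assign, reclaimed) are hypothetical, and the stack is modelled as a record whose counter is pinned at 1.

```java
import java.util.ArrayList;
import java.util.List;

class RefCountSketch {
    static final class Rec {
        final String name;
        final Rec[] fields;
        int count;
        Rec(String name, int size) { this.name = name; fields = new Rec[size]; }
    }

    final List<String> reclaimed = new ArrayList<>();  // names of dead records, in order

    void increment(Rec x) { if (x != null) x.count++; }

    void decrement(Rec x) {
        if (x == null) return;
        x.count--;
        if (x.count == 0) {
            reclaimed.add(x.name);
            for (int i = 0; i < x.fields.length; i++) {
                Rec child = x.fields[i];
                x.fields[i] = null;
                decrement(child);     // eager: propagate right away
            }
        }
    }

    // The instrumented version of the assignment x.fi = y.
    void assign(Rec x, int i, Rec y) {
        increment(y);                 // increment first, in case y is the old value
        decrement(x.fields[i]);
        x.fields[i] = y;
    }

    public static void main(String[] args) {
        RefCountSketch rc = new RefCountSketch();
        Rec stack = new Rec("stack", 1);
        stack.count = 1;              // the stack itself is never reclaimed
        Rec c = new Rec("c", 1), d = new Rec("d", 1);
        rc.assign(stack, 0, c);
        rc.assign(c, 0, d);
        rc.assign(d, 0, c);           // cycle c <-> d
        rc.assign(stack, 0, null);    // the cycle is now unreachable
        System.out.println("reclaimed=" + rc.reclaimed + " c.count=" + c.count);
    }
}
```

After the last assignment the cycle is unreachable, yet both counters stay at 1, so nothing is reclaimed.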

The Stop-and-Copy Algorithm
Divide the heap space into two parts.
Only use one part at a time.
When it runs full, copy live records to the other part of the heap space.
Then switch the roles of the two parts.
Advantages:
• fast allocation (no freelist)
• avoids fragmentation
Disadvantage:
• wastes half your memory

Before and After Stop-and-Copy
[Figure: from-space and to-space before and after a collection, with the next and limit pointers; the live records end up compacted in to-space and the two spaces swap roles]

Pseudo Code for Stop-and-Copy
function Forward(x) {
  if (x ∈ from-space) {
    if (x.f1 ∈ to-space)
      return x.f1;
    else {
      for (i=1; i<=|x|; i++)
        next.fi = x.fi;
      x.f1 = next;
      next = next + sizeof(x);
      return x.f1;
    }
  }
  else return x;
}
function Copy() {
  scan = next = start of to-space;
  foreach (v in a stack frame)
    v = Forward(v);
  while (scan < next) {
    for (i=1; i<=|scan|; i++)
      scan.fi = Forward(scan.fi);
    scan = scan + sizeof(scan);
  }
}
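The same breadth-first copying can be sketched in Java. This is an illustration under simplifying assumptions: the names are hypothetical, to-space is a growing list (so next is its size), and the forwarding pointer lives in a separate forward field instead of overwriting f1.

```java
import java.util.ArrayList;
import java.util.List;

class StopAndCopySketch {
    static final class Rec {
        final int value;
        final Rec[] fields;
        Rec forward;          // forwarding pointer, set once the record has been copied
        Rec(int value, int size) { this.value = value; fields = new Rec[size]; }
    }

    private final List<Rec> toSpace = new ArrayList<>();  // "next" is toSpace.size()

    // Returns the to-space copy of the from-space record x, copying it on first use.
    private Rec forward(Rec x) {
        if (x == null) return null;
        if (x.forward == null) {
            Rec copy = new Rec(x.value, x.fields.length);
            System.arraycopy(x.fields, 0, copy.fields, 0, x.fields.length);
            x.forward = copy;
            toSpace.add(copy);
        }
        return x.forward;
    }

    // Copies everything reachable from the roots and updates the roots in place.
    List<Rec> copy(Rec[] roots) {
        toSpace.clear();
        int scan = 0;
        for (int i = 0; i < roots.length; i++) roots[i] = forward(roots[i]);
        while (scan < toSpace.size()) {
            Rec r = toSpace.get(scan);
            for (int i = 0; i < r.fields.length; i++) r.fields[i] = forward(r.fields[i]);
            scan++;
        }
        return new ArrayList<>(toSpace);
    }

    public static void main(String[] args) {
        Rec a = new Rec(1, 2), b = new Rec(2, 1), c = new Rec(3, 1), dead = new Rec(4, 0);
        a.fields[0] = b; a.fields[1] = c;   // a -> b, a -> c
        b.fields[0] = c;                    // b -> c (shared record)
        c.fields[0] = a;                    // c -> a (cycle)
        Rec[] roots = { a };
        List<Rec> live = new StopAndCopySketch().copy(roots);
        System.out.println("copied " + live.size() + " records");
    }
}
```

The forwarding pointer is what keeps shared records and cycles intact: a record is copied once, and every later visit returns the same copy.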

Stopping and Copying
[Figure sequence, 13 steps: the records reachable from p, q, and r are copied one by one from from-space into to-space, after which the two spaces swap roles]

Analysis of Stop-and-Copy
Assume the heap has H words.
Assume that R words are reachable.
The cost of garbage collection is:
  c3·R
The cost per reclaimed word is:
  c3·R/(H/2 - R)
This has no lower bound as H grows: for a fixed R, the cost per reclaimed word approaches zero.
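The two cost formulas, (c1·R + c2·H)/(H - R) for mark-and-sweep and c3·R/(H/2 - R) for stop-and-copy, can be compared numerically. The sketch below uses hypothetical names and, purely as an assumption for illustration, sets all three constants to 1; real constants depend on the implementation and the machine.

```java
class GcCostSketch {
    // Mark-and-sweep: (c1*R + c2*H) / (H - R)
    static double markSweepCostPerWord(double c1, double c2, double h, double r) {
        return (c1 * r + c2 * h) / (h - r);
    }

    // Stop-and-copy: c3*R / (H/2 - R)
    static double copyCostPerWord(double c3, double h, double r) {
        return c3 * r / (h / 2 - r);
    }

    public static void main(String[] args) {
        double r = 100;   // reachable words (illustrative value)
        for (double h : new double[] { 400, 4_000, 40_000 }) {
            System.out.printf("H=%.0f  mark-and-sweep=%.4f  stop-and-copy=%.4f%n",
                    h, markSweepCostPerWord(1, 1, h, r), copyCostPerWord(1, h, r));
        }
    }
}
```

For a fixed R, the mark-and-sweep cost per word levels off near c2 as H grows, because every sweep still visits the whole heap, while the stop-and-copy cost keeps shrinking.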

Recognizing Records and Pointers
Earlier assumptions:
• we know the start and size of each record in memory
• we know which record fields are pointers
For object-oriented languages, each record already contains a pointer to a class descriptor.
For general languages, we must sacrifice a few bytes per record.
For the stack frame:
• use a bit per stack location
• use a table per program point

Conservative Garbage Collection
For mark-and-sweep, we may use a conservative approximation to recognize pointers.
A word is a pointer if it looks like one (its value is an address in the range of the heap space).
This will recognize too many pointers.
Thus, too many records will be marked as live.
This does not work for stop-and-copy: moving a record means updating every pointer to it, and a word that merely looks like a pointer must not be changed.

Triggering Garbage Collection
A collection must be triggered when there is no more free heap space.
But this may cause a long pause in the execution.
Collections may be triggered by heuristics:
• after a certain number of records have been allocated
• when only a certain fraction of the heap is free
• after a certain period of time
• when the program is not busy

Generational Collection
Observation: the young die quickly!
The collector should focus on young records.
Divide the heap into generations: G0, G1, G2, ...
All records in Gi are younger than records in G(i+1).
Collect G0 often, G1 less often, and so on.
Promote a record from Gi to G(i+1) when it survives several collections.

Collecting a Generation
How to collect the G0 generation:
• roots are no longer just stack locations, but also pointers from G1, G2, ...
• it could be expensive to find those pointers
• fortunately they are rare, so we can remember them
Ways to remember pointers:
• maintain a set of all updated records
• mark pages of memory that contain updated records (using hardware or software)

Incremental Collection
A garbage collector creates (long) pauses.
This is bad for real-time programs.
An incremental collector runs concurrently with the program (in a separate thread).
It must now handle simultaneous heap updates.

The Tricoloring Algorithm
Records are colored black, grey, or white:
• black: visited and all children visited
• grey: visited, but not all children visited
• white: not visited
The program may update the heap as it pleases, but must maintain an invariant:
no black record points to a white record

Pseudo Code for Tricoloring
function Tricolor() {
  color all records white;
  color all roots grey;
  while (more grey records) {
    x = a grey record;
    for (i=1; i<=|x|; i++)
      if (x.fi is white)
        color x.fi grey;
    color x black;
  }
  reclaim all white records;
}
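The tricoloring loop can be sketched in Java with the grey records kept on a worklist. The names are hypothetical, and this version runs to completion in one go; it does not model a program updating the heap at the same time. Only white records are shaded grey, so a black record is never revisited and the loop terminates on cyclic heaps.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TricolorSketch {
    enum Color { WHITE, GREY, BLACK }

    static final class Rec {
        final Rec[] fields;
        Color color = Color.WHITE;
        Rec(int size) { fields = new Rec[size]; }
    }

    // Returns the records that are still white (garbage) after marking.
    static List<Rec> collect(List<Rec> heap, List<Rec> roots) {
        Deque<Rec> grey = new ArrayDeque<>();
        for (Rec r : heap) r.color = Color.WHITE;
        for (Rec r : roots) shade(r, grey);
        while (!grey.isEmpty()) {
            Rec x = grey.pop();
            for (Rec child : x.fields) shade(child, grey);
            x.color = Color.BLACK;          // all children of x are now grey or black
        }
        List<Rec> white = new ArrayList<>();
        for (Rec r : heap) if (r.color == Color.WHITE) white.add(r);
        return white;
    }

    // white -> grey; grey and black records are left alone.
    static void shade(Rec r, Deque<Rec> grey) {
        if (r != null && r.color == Color.WHITE) {
            r.color = Color.GREY;
            grey.push(r);
        }
    }

    public static void main(String[] args) {
        Rec a = new Rec(1), b = new Rec(1), c = new Rec(1);
        a.fields[0] = b; b.fields[0] = a;   // live cycle a <-> b
        c.fields[0] = c;                    // dead self-loop
        List<Rec> heap = new ArrayList<>(List.of(a, b, c));
        System.out.println("white records: " + collect(heap, List.of(a)).size());
    }
}
```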

Maintaining the Invariant
Write barriers:
  x.fi = y;   becomes   black2grey(x).fi = y;
Read barriers:
  x.fi = y;   becomes   x.fi = white2grey(y);
Requires synchronization between the running program and the collector.

Garbage Collection in Java
Sun's HotSpot VM uses by default:
• two generations: "nursery" and "old objects"
• the nursery is collected using stop-and-copy
• the old objects are collected using mark-and-sweep in a version that also compacts the live records
For real-time applications:
• use option -Xincgc
• a more sophisticated incremental algorithm
• 10% slower
• but with shorter pauses

Finalizers
If an object has a finalize() method, it will be invoked before the object is reclaimed by the garbage collector.
But there is no guarantee how soon this happens.
This method may actually resurrect the object.
Typically, the garbage collector needs an extra pass to find out if the dead really stay dead.

Interacting With the Garbage Collector
Trigger the garbage collector manually:
• System.gc();
The java.lang.ref package allows variations of the pointer concept:
• SoftReference
• WeakReference

Soft References
The garbage collector may reclaim an object that has soft references but no ordinary (strong) references.
This is typically used for caching:
SoftReference sr = null;
...
Image img = (sr == null) ? null : (Image)sr.get();
if (img == null) {  // not cached yet, or reclaimed by the collector
  img = getImage("huge.gif");
  sr = new SoftReference(img);
}
display(img);
img = null;
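The caching pattern can also be wrapped in a small generic helper. This is a sketch: SoftCacheSketch, its loader argument, and the loads counter are hypothetical names, while SoftReference and Supplier are the standard java.lang.ref and java.util.function classes. Because get() on a soft reference returns null once the collector has cleared it, the helper reloads the value in that case.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

class SoftCacheSketch<T> {
    private SoftReference<T> ref;        // null until the first load
    private final Supplier<T> loader;    // recomputes the value when needed
    int loads = 0;                       // how many times the loader has run

    SoftCacheSketch(Supplier<T> loader) { this.loader = loader; }

    T get() {
        T value = (ref == null) ? null : ref.get();
        if (value == null) {             // never loaded, or cleared by the collector
            value = loader.get();
            loads++;
            ref = new SoftReference<>(value);
        }
        return value;                    // the caller now holds a strong reference
    }

    public static void main(String[] args) {
        SoftCacheSketch<byte[]> cache = new SoftCacheSketch<>(() -> new byte[1 << 20]);
        byte[] first = cache.get();
        byte[] second = cache.get();     // served from the cache while `first` is strongly held
        System.out.println("same array: " + (first == second) + ", loads: " + cache.loads);
    }
}
```

When and whether a softly reachable value is cleared is up to the collector, so a caller should not rely on any particular number of reloads once it has dropped its strong references.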