Operating Systems - Garbage Collection

63
UNIVERSITY OF MASSACHUSETTS AMHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Garbage Collection

description

From the Operating Systems course (CMPSCI 377) at UMass Amherst, Fall 2007.

Transcript of Operating Systems - Garbage Collection

Page 1: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Emery Berger

University of Massachusetts Amherst

Operating SystemsCMPSCI 377

Garbage Collection

Page 2: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Questions To Come

Terms:

Tracing

Copying

Conservative

Parallel GC

Concurrent GC

Runtime & space costs

Live objects

Reachable objects

References2

Page 3: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Explicit Memory Management

malloc/new

allocates space for an object

free/delete

returns memory to system

Simple, but tricky to get right

Forget to free memory leak

free too soon “dangling pointer”

Double free, invalid free...

Page 4: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Dangling Pointers

Node x = new Node (“happy”);

Node ptr = x;

delete x; // But I’m not dead yet!

Node y = new Node (“sad”);

cout << ptr->data << endl; // sad

Insidious, hard-to-track down bugs

Page 5: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Solution: Garbage Collection

Garbage collector periodically scans objects on heap

No need to free

Automatic memory management

Reclaims non-reachable objects

Won’t reclaim objects until they’re dead(actually somewhat later)

Page 6: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

No More Dangling Pointers

Node x = new Node (“happy”);

Node ptr = x;

// x still live (reachable through ptr)Node y = new Node (“sad”);

cout << ptr->data << endl; // happy!

So why not use GCall the time?

Page 7: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Because This Guy Says Not To

There just aren’t all that many worse ways to f*** up your cache behavior than by using lots of allocations and lazy GC to manage your

memory.

GC sucks donkey brains through a straw from a performance standpoint.

LinusTorvalds

Page 8: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Slightly More Technically…

“GC impairs performance”

Extra processing

collection, copying

Degrades cache performance (ibid)

Degrades page locality (ibid)

Increases memory needs

delayed reclamation

Page 9: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

On the other hand…

No, “GC enhances performance!”

Faster allocation

pointer-bumping vs. freelists

Improves cache performance

no need for headers

Better locality

can reduce fragmentation, compact data structures according to use

Page 10: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Outline

Classical GC algorithms

Quantifying GC performance

A hard problem

Oracular memory management

GC vs. malloc/free bakeoff

Page 11: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Classical Algorithms

Three classical algorithms

Mark-sweep

Reference counting

Semispace

Tweaks

Generational garbage collection

Out of scope

Parallel – perform GC in parallel

Concurrent – run GC at same time as app

Real-time – ensure bounded pause times11

Page 12: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Mark-Sweep

12

Start with roots

Global variables, variables on stack& in registers

Recursively visit every object through pointers

Mark every object we find (set mark bit)

Everything not marked = garbage

Can then sweep heap for unmarked objectsand free them

Page 13: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Mark-Sweep Example

13

global 1

global 2

global 3

roots

object 1

object 2

object 3

object 4

object 5

object 6

Initially,all objects white(garbage)

Page 14: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Mark-Sweep Example

14

global 1

global 2

global 3

roots

object 1

object 2

object 3

object 4

object 5

object 6

Initially,all objects white(garbage)

Visit objects, following object graph

Page 15: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Mark-Sweep Example

15

global 1

global 2

global 3

roots

object 1

object 2

object 3

object 4

object 5

object 6

Initially,all objects white(garbage)

Visit objects, following object graph

Page 16: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Mark-Sweep Example

16

global 1

global 2

global 3

roots

object 1

object 2

object 3

object 4

object 5

object 6

Initially,all objects white(garbage)

Visit objects, following object graph

Page 17: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Mark-Sweep Example

17

global 1

global 2

global 3

roots

object 2

object 4

object 5

Initially,all objects white(garbage)

Visit objects, following object graph

Can sweep immediately or lazily

freelist

object 1

object 3

object 6

Page 18: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting

For every object, maintain reference count= number of incoming pointers

a->ptr = x refcount(x)++

a->ptr = y refcount(x)--; refcount(y)++

Reference count = 0 no more incoming pointers: garbage

18

Page 19: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

19

global 1

global 2

global 3

roots

New objects:ref count = 1

object 5

1

object 4

1

object 2

2

Page 20: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

20

global 1

global 2

global 3

roots

New objects:ref count = 1

Delete pointer:refcount--

object 5

1

object 4

1

object 2

1

Page 21: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

21

global 1

global 2

global 3

roots

New objects:ref count = 1

Delete pointer:refcount--

object 5

1

object 4

1

object 2

1

Page 22: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

22

global 1

global 2

global 3

roots

New objects:ref count = 1

Delete pointer:refcount--

And recursively delete pointers

object 5

0

object 4

1

object 2

1

Page 23: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

23

global 1

global 2

global 3

roots

New objects:ref count = 1

Delete pointer:refcount--

And recursively delete pointers

refcount == 0:put on freelist

object 5

0

object 4

0

object 2

1

Page 24: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Cycles & Reference Counting

24

global 1

global 2

global 3

roots

Big problem: cycles

object 5

2

object 4

1

object 2

2

Page 25: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

25

global 1

global 2

global 3

roots

Big problem: cycles

object 5

1

object 4

1

object 2

2

Page 26: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reference Counting Example

26

global 1

global 2

global 3

roots

Big problem: cycles

Cycles lead to unreclaimablegarbage

Need to do periodic tracing collection (e.g., mark-sweep)

object 5

1

object 4

1

object 2

2

Page 27: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace

Divide heap in two semispaces:

Allocate objects from from-space

When from-space fills,

Scan from roots through live objects

Copy them into to-space

When done, switch the spaces

Allocate from leftover part of to-space

27

Page 28: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

28

from-space

to-space

Page 29: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

29

from-space

to-space

Page 30: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

30

from-space

to-space

Page 31: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

Copy live objects into to-space

31

from-space

to-space

Page 32: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

Copy live objects into to-space

Leaves forwarding pointer

32

from-space

to-space

Page 33: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

Copy live objects into to-space

Leaves forwarding pointer

33

from-space

to-space

Page 34: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

Copy live objects into to-space

Leaves forwarding pointer

34

from-space

to-space

Page 35: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Semispace Example

Allocate in from-space

Pointer bumping

Copy live objects into to-space

Leaves forwarding pointer

Flip spaces;allocate from end

35

to-space

from-space

Page 36: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC

Optimization for copying collectors

Generational hypothesis:“most objects die young”

Common-case optimization

Allocate into nursery

Small region

Collect frequently

Copy out survivors

Key idea: keep track of pointers frommature space into nursery

36

Page 37: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

37

nursery

mature space

Page 38: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

38

nursery

mature space

Page 39: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

39

nursery

mature space

Page 40: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

40

nursery

mature space Copy out survivors (via roots & mature space pointers)

Page 41: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

41

nursery

mature space Copy out survivors (via roots & mature space pointers)

Page 42: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

42

nursery

mature space Copy out survivors (via roots & mature space pointers)

Page 43: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Generational GC Example

43

nursery

mature space Copy out survivors (via roots & mature space pointers)

Reset allocation pointer & continue

Page 44: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Conservative GC

Non-copying collectors for C & C++

Must identify pointers

“Duck test”: if it looks like a pointer, it’s a pointer

Trace through “pointers”, marking everything

Can link with Boehm-Demers-Weiser library (“libgc”)

44

Page 45: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

GC vs. malloc/free

45

Page 46: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Comparing Memory Managers

Node v = malloc(sizeof(Node));

v->data=malloc(sizeof(NodeData));

memcpy(v->data, old->data,

sizeof(NodeData));

free(old->data);

v->next = old->next;

v->next->prev = v;

v->prev = old->prev;

v->prev->next = v;

free(old);

Using GC in C/C++ is easy:

BDWCollector

Page 47: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Comparing Memory Managers

Node v = malloc(sizeof(Node));

v->data=malloc(sizeof(NodeData));

memcpy(v->data, old->data,

sizeof(NodeData));

free(old->data);

v->next = old->next;

v->next->prev = v;

v->prev = old->prev;

v->prev->next = v;

free(old);

…slide in BDW and ignore calls to free.

BDWCollector

Page 48: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

What About Other Garbage Collectors?

Compares malloc to GC, but only conservative, non-copying collectors

Can’t reduce fragmentation,reorder objects, etc.

But: faster precise, copying collectors

Incompatible with C/C++

Standard for Java…

Page 49: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Comparing Memory Managers

Node node = new Node();

node.data = new NodeData();

useNode(node);

node = null;

...

node = new Node();

...

node.data = new NodeData();

...

Adding malloc/free to Java:not so easy…

LeaAllocator

Page 50: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Comparing Memory Managers

Node node = new Node();

node.data = new NodeData();

useNode(node);

node = null;

...

node = new Node();

...

node.data = new NodeData();

...

... need to insert frees, but where?

free(node.data)?

free(node)

?

LeaAllocator

Page 51: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Oracular Memory Manager

Java

Simulator

C malloc/free

perform actions at no costbelow here

execute program here

allocation

Oracle

Consult oracle at each allocation

Oracle does not disrupt hardware state

Simulator invokes free()…

Page 52: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Object Lifetime & Oracle Placement

Oracles bracket placement of frees

Lifetime-based: most aggressive

Reachability-based: most conservative

unreachable

live dead

reachable

freed bylifetime-based oracle

freed byreachability-based oracle

can be collected

free(obj) free(??)

obj =

new Object; can be freed

free(obj)

Page 53: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Liveness Oracle Generation

Java

PowerPCSimulator

C malloc/free

perform actions at no costbelow here

execute program here

tracefile

allocation, mem access, prog. roots

Post-process

Liveness: record allocs, memory accesses

Oracle

Page 54: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Reachability Oracle Generation

Java

PowerPCSimulator

C malloc/free

perform actions at no costbelow here

execute program here

tracefile

allocations,ptr updates,prog. roots

Merlin analysis

Reachability:

Illegal instructions mark heap events (especially pointer assignments)

Oracle

Page 55: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Oracular Memory Manager

Java

PowerPCSimulator

C malloc/free

perform actions at no costbelow here

execute program here

oracle

allocation

Run & consult oracle before each allocation

When needed, modify instruction to call free

Extra costs (oracle access) hidden by simulator

Page 56: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Execution Time for pseudoJBB

GC can be faster than malloc/free

90%

100%

110%

120%

130%

140%

150%

1.00 1.25 1.50 1.75 2.00 2.25 2.50 2.75 3.00 3.25 3.50 3.75 4.00

Tim

e R

elat

ive

to L

ea

Heap Size Relative to Collector Minimum

GenMS

GenCopy

GenRC

Lea w/ Reach

Lea w/ Life

Page 57: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Geo. Mean of Execution Time

Trades space for time

90%

95%

100%

105%

110%

115%

120%

125%

130%

1.00 1.25 1.50 1.75 2.00 2.25 2.50 2.75 3.00 3.25 3.50 3.75 4.00

Heap Size Relative to Collector Minimum

Exe

cutio

n Ti

me

Rel

ativ

e to

Lea

GenMS

GenCopy

GenRC

Lea w/ Reach

Lea w/ Life

MSExplic it w/ Reach

Page 58: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Footprint at Quickest Run

GC uses much more memory for speed

Page 59: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Footprint at Quickest Run

GC uses much more memory for speed

1.001.38 1.61

5.105.66

4.84

7.697.09

0.63

Page 60: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Javac Paging Performance

GC: poor paging performance

Page 61: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

Summary of Results

Best collector equals Lea's performance…

Up to 10% faster on some benchmarks

... but uses more memory

Quickest runs require 5x or more memory

GenMS at least doubles mean footprint

Page 62: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

When to Use Garbage Collection

Garbage collection fine if

system has more than 3x needed RAM

and no competition with other processes

or avoiding bugs / security more important

Not so good:

Limited RAM

Competition for physical memory

Depends on RAM for performance

In-memory database

Search engines, etc.

Page 63: Operating Systems - Garbage Collection

UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science

The End

63