Knowing your Garbage Collector / Python Madrid

Knowing your garbage collector Francisco Fernandez Castano Rushmore.fm [email protected] @fcofdezc October 21, 2014 Francisco Fernandez Castano (@fcofdezc) Python GC October 21, 2014 1 / 37

description

Talk about garbage collection in CPython and PyPy

Transcript of Knowing your Garbage Collector / Python Madrid

Page 1: Knowing your Garbage Collector / Python Madrid

Knowing your garbage collector

Francisco Fernandez Castano

Rushmore.fm

[email protected] @fcofdezc

October 21, 2014


Page 2: Knowing your Garbage Collector / Python Madrid

Overview

1 Introduction
    Motivation
    Concepts

2 Algorithms
    CPython RC
    PyPy


Page 3: Knowing your Garbage Collector / Python Madrid

Motivation

Managing memory manually is hard.

Who owns the memory?

Should I free these resources?

What happens with double frees?


Page 4: Knowing your Garbage Collector / Python Madrid

Dangling pointers

int *func(void)
{
    int num = 1234;
    /* ... */
    return &num; /* dangling: num's storage is gone once func returns */
}


Page 5: Knowing your Garbage Collector / Python Madrid

Ownership

#include <stdlib.h>

int *func(void)
{
    int *num = malloc(10 * sizeof(int));
    /* ... */
    return num; /* the caller now owns this buffer and must free() it */
}


Page 6: Knowing your Garbage Collector / Python Madrid

John McCarthy


Page 7: Knowing your Garbage Collector / Python Madrid

Basic concepts

Heap

A data structure in which objects may be allocated or deallocated in any order.


Page 8: Knowing your Garbage Collector / Python Madrid

Basic concepts

Heap

A data structure in which objects may be allocated or deallocated in any order.

Mutator

The part of a running program which executes application code.


Page 9: Knowing your Garbage Collector / Python Madrid

Basic concepts

Heap

A data structure in which objects may be allocated or deallocated in any order.

Mutator

The part of a running program which executes application code.

Collector

The part of a running program responsible for garbage collection.


Page 10: Knowing your Garbage Collector / Python Madrid

Garbage collection

Definition

Garbage collection is automatic memory management. While the mutator runs, it routinely allocates memory from the heap. If more memory than available is needed, the collector reclaims unused memory and returns it to the heap.


Page 11: Knowing your Garbage Collector / Python Madrid

CPython GC

The CPython implementation has garbage collection.

CPython's GC algorithm is reference counting with a cycle detector.

The cycle detector is generational.
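
To make the generational part a little more concrete, here is a minimal sketch, assuming CPython and its standard `gc` module. It pairs each generation's collection threshold with its current counter. The helper name `generation_report` is made up for this example, and the actual numbers vary between CPython versions.

```python
import gc


def generation_report():
    """Pair each generation's collection threshold with its current counter."""
    thresholds = gc.get_threshold()  # one threshold per generation
    counts = gc.get_count()          # current counter for each generation
    return list(zip(thresholds, counts))


for generation, (threshold, count) in enumerate(generation_report()):
    print(f"generation {generation}: threshold={threshold} count={count}")
```

Roughly speaking, a generation becomes a candidate for collection once its counter exceeds its threshold, so short-lived objects are examined far more often than long-lived ones.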


Page 12: Knowing your Garbage Collector / Python Madrid

Young objects

[elem * 2 for elem in elements]

balance = (a / b / c) * 4

'asdadsasd-xxx'.replace('x', 'y').replace('a', 'b')

foo.bar()


Page 13: Knowing your Garbage Collector / Python Madrid

PyObject

typedef struct _object {

_PyObject_HEAD_EXTRA

Py_ssize_t ob_refcnt;

struct _typeobject *ob_type;

} PyObject;
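
The `ob_refcnt` field in the struct above can be observed from Python through `sys.getrefcount`. The following is a small sketch, assuming a standard CPython build; the function name `refcount_deltas` is hypothetical. It reports differences rather than absolute counts, because `sys.getrefcount` itself adds a temporary reference while it runs.

```python
import sys


def refcount_deltas():
    """Show how the reference count moves as references are created and dropped."""
    obj = object()
    base = sys.getrefcount(obj)      # includes a temporary reference for the call
    alias = obj                      # one more name bound to the same object
    with_alias = sys.getrefcount(obj)
    holder = [obj, obj]              # two more references held by a list
    with_list = sys.getrefcount(obj)
    del alias, holder                # drop all three extra references
    after_del = sys.getrefcount(obj)
    return with_alias - base, with_list - base, after_del - base


print(refcount_deltas())
```

On a standard CPython build the three deltas should come out as 1, 3 and 0: binding a name adds one reference, the list adds two, and deleting both brings the count back to where it started.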


Page 14: Knowing your Garbage Collector / Python Madrid

PyTypeObject

typedef struct _typeobject {

PyObject_VAR_HEAD

const char *tp_name;

Py_ssize_t tp_basicsize, tp_itemsize;

destructor tp_dealloc;

printfunc tp_print;

getattrfunc tp_getattr;

setattrfunc tp_setattr;

void *tp_reserved;

.

.

} PyTypeObject;


Page 15: Knowing your Garbage Collector / Python Madrid

Reference Counting Algorithm


Page 16: Knowing your Garbage Collector / Python Madrid

Reference Counting Algorithm


Page 17: Knowing your Garbage Collector / Python Madrid

Reference Counting Algorithm


Page 18: Knowing your Garbage Collector / Python Madrid

Reference Counting Algorithm


Page 19: Knowing your Garbage Collector / Python Madrid

Reference Counting Algorithm


Page 20: Knowing your Garbage Collector / Python Madrid

Cycles

l = []

l.append(l)

del l
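
The same situation can be made observable with a weak reference. This is a sketch only, assuming CPython: the `Node` class and the `cycle_demo` function are hypothetical names, and a class instance stands in for the list because lists cannot be weakly referenced. The automatic collector is paused for the duration so that it cannot run at an inconvenient moment.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for the list in the slide."""


def cycle_demo():
    """Return (alive after del, alive after gc.collect()) for a self-referencing object."""
    was_enabled = gc.isenabled()
    gc.disable()                     # keep automatic collections out of the picture
    try:
        node = Node()
        node.me = node               # the object refers to itself, like l.append(l)
        probe = weakref.ref(node)    # a weak reference does not keep node alive
        del node                     # refcount stays above zero because of node.me
        alive_after_del = probe() is not None
        gc.collect()                 # the cycle detector finds the unreachable cycle
        alive_after_collect = probe() is not None
        return alive_after_del, alive_after_collect
    finally:
        if was_enabled:
            gc.enable()


print(cycle_demo())
```

Reference counting alone leaves the object alive after `del`, which is why the first value is expected to be true; only the explicit `gc.collect()` reclaims it.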


Page 21: Knowing your Garbage Collector / Python Madrid

Cycles


Page 22: Knowing your Garbage Collector / Python Madrid

Cycles


Page 23: Knowing your Garbage Collector / Python Madrid

PyObject

typedef struct _object {

_PyObject_HEAD_EXTRA

Py_ssize_t ob_refcnt;

struct _typeobject *ob_type;

} PyObject;


Page 24: Knowing your Garbage Collector / Python Madrid

PyTypeObject

typedef struct _typeobject {

PyObject_VAR_HEAD

const char *tp_name;

Py_ssize_t tp_basicsize, tp_itemsize;

destructor tp_dealloc;

printfunc tp_print;

getattrfunc tp_getattr;

setattrfunc tp_setattr;

void *tp_reserved;

.

.

} PyTypeObject;


Page 25: Knowing your Garbage Collector / Python Madrid

PyGC_Head

typedef union _gc_head {

struct {

union _gc_head *gc_next;

union _gc_head *gc_prev;

Py_ssize_t gc_refs;

} gc;

double dummy; /* force worst-case alignment */

} PyGC_Head;
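
Only objects that can take part in a reference cycle carry this extra header and are linked into the collector's lists. From Python, `gc.is_tracked` reports whether a given object is currently tracked. The sketch below assumes CPython; `tracked_report` is a hypothetical helper name.

```python
import gc


def tracked_report():
    """Report which sample objects the cycle detector currently tracks."""
    samples = {
        "int": 42,          # atomic, cannot hold references to other objects
        "str": "spam",      # atomic as well
        "list": [],         # a container, so it may end up in a cycle
    }
    return {label: gc.is_tracked(value) for label, value in samples.items()}


for label, tracked in tracked_report().items():
    print(f"{label}: tracked={tracked}")
```

Integers and strings cannot refer to other objects, so the cycle detector can ignore them, while a list is tracked even when it is empty.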


Page 26: Knowing your Garbage Collector / Python Madrid

CPython Memory Allocator


Page 27: Knowing your Garbage Collector / Python Madrid

CPython Memory Allocator


Page 28: Knowing your Garbage Collector / Python Madrid

Demo


Page 29: Knowing your Garbage Collector / Python Madrid

Reference counting

Pros: It is incremental: memory is freed as the program runs, as soon as an object's reference count drops to zero.

Cons: Detecting cycles can be hard.

Cons: Every object pays a size overhead for its reference count.
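
The first point can be checked with a weak reference. This is a minimal sketch that relies on CPython's reference counting; other implementations such as PyPy may reclaim the object later. `Resource` and `freed_immediately` are hypothetical names used only for the illustration.

```python
import weakref


class Resource:
    """Hypothetical placeholder; a plain object() cannot be weakly referenced."""


def freed_immediately():
    """Return True if the object is gone right after its last reference is dropped."""
    res = Resource()
    probe = weakref.ref(res)   # observes res without keeping it alive
    del res                    # the last strong reference disappears here
    return probe() is None     # no gc.collect() call was needed


print(freed_immediately())
```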


Page 30: Knowing your Garbage Collector / Python Madrid

PyPy


Page 31: Knowing your Garbage Collector / Python Madrid

Mark and Sweep Algorithm


Page 32: Knowing your Garbage Collector / Python Madrid

Mark and Sweep Algorithm


Page 33: Knowing your Garbage Collector / Python Madrid

Mark and Sweep Algorithm


Page 34: Knowing your Garbage Collector / Python Madrid

Mark and Sweep Algorithm


Page 35: Knowing your Garbage Collector / Python Madrid

Mark and sweep

Pros: Can collect cycles.

Cons: A basic implementation stops the world: the mutator is paused while the collector runs.
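
As an illustration of the algorithm, here is a toy mark and sweep over an explicit object graph. It is a teaching sketch under simplifying assumptions, not PyPy's actual collector: the heap is a plain Python list, and `Obj`, `mark`, `sweep` and `collect` are names invented for this example.

```python
class Obj:
    """A toy heap object with outgoing references and a mark bit."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def mark(roots):
    """Mark phase: flag everything reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: keep marked objects and clear their mark for the next cycle."""
    survivors = []
    for obj in heap:
        if obj.marked:
            obj.marked = False
            survivors.append(obj)
    return survivors


def collect(heap, roots):
    """Run one full stop-the-world collection and return the new heap."""
    mark(roots)
    return sweep(heap)


a, b, c, d = (Obj(name) for name in "abcd")
a.refs.append(b)   # a -> b, reachable from the root
c.refs.append(d)   # c <-> d form a cycle that nothing else points to
d.refs.append(c)
heap = collect([a, b, c, d], roots=[a])
print([obj.name for obj in heap])
```

Only `a` and `b` survive. The `c`/`d` cycle is dropped because reachability from the roots, not a per-object count, decides what stays alive.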


Page 36: Knowing your Garbage Collector / Python Madrid

Questions?


Page 37: Knowing your Garbage Collector / Python Madrid

The End
