Using linearity to allow heap-recycling in Haskell


    Chris Nicholls

    May 24, 2010


    Abstract

This project investigates destructive updates and heap-space recycling in Haskell through the use of linear types. I provide a semantics for an extension to the STG language, an intermediate language used in the Glasgow Haskell Compiler (GHC), that allows arbitrary data types to be updated.

A type system based on uniqueness typing is also introduced that allows the use of the new semantics without breaking referential transparency. The type system aims to be simple and syntactically light, allowing a programmer to introduce destructive updates with minimal changes to source code.

I have implemented this semantic extension both in an interpreter for the STG language and in the GHC backend. Finally, I have written a type checker for this system that works over a subset of Haskell.


    Contents

Abstract

1 Introduction
      The Problem With Persistence

2 Uniqueness

3 Uniqueness in Type Systems
      Linear Logic
      Clean
      Monads
      Hage & Holdermans' Heap Recycling for Lazy Languages
      Uniqueness in Imperative Languages
      A Simpler Type System for Unique Values

4 Implementation
      The STG Language
      Operational Semantics of STG
      Closure Representation
      Adding an Overwrite construct
          Ministg
          GHC
      Garbage Collection

5 Results

6 Conclusion


    Chapter 1

    Introduction

    The Problem With Persistence

One striking feature of pure functional programming in languages such as Haskell is the lack of state. As all data structures are persistent, updating a value does not destroy it but instead creates a new copy. The advantages of this are well known [1][2], but conversely so are the disadvantages [3][5]. In particular, persistence can lead to excessive memory consumption when structures remain in memory long after they have ceased to be useful [6].

The reason Haskell does not allow state is to avoid side effects, and the reason side effects are avoided is that they can make understanding and reasoning about programs difficult. Indeed, from a theoretical point of view, side effects simply aren't required for computation. Yet undeniably, side effects are useful, particularly when implementing efficient data structures [4].

Whilst the lack of destructive update in Haskell is useful in accomplishing the goal of referential transparency, it is not strictly necessary. It is sometimes possible to allow destructive updates without introducing observable side effects.


    Chapter 2

    Uniqueness

Imagine a program that reads a list of integers from a file, sorts them and then continues to process the sorted list in some manner. In an imperative setting, we might expect this sorting to be done in-place, but in Haskell we must allocate the space for a new, sorted version of the list. However, if the original list is not referred to in the rest of the program, then any changes made to the data contained in the list will never be observed. Thus there is no need to maintain the original list. This means we could re-use the space occupied by the unsorted list, and since we know that sorting preserves length, we might begin to wonder if we can do the sorting in-place.

The reason we could not use destructive updates in the example above is that doing so may introduce side effects into our program. For instance, if we are able to sort a list in-place then the following code becomes problematic:

foo :: [a] -> ([a], [a])
foo xs = (xs, sortInPlace xs)

Does fst (foo [3,2,1]) refer to a sorted list or an unsorted list? With lazy evaluation we have no way of knowing.
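By contrast, the fully persistent version is unambiguous: a pure sort allocates a new list and leaves the original untouched, so both components of the pair are well defined. A minimal sketch (the in-place sort above is hypothetical; this uses the ordinary pure sort):

```haskell
import Data.List (sort)

-- Persistent version: sort copies the list, so xs is never modified.
foo :: Ord a => [a] -> ([a], [a])
foo xs = (xs, sort xs)

main :: IO ()
main = print (foo [3, 2, 1 :: Int])
```

Here fst (foo [3,2,1]) is always the original, unsorted list, regardless of evaluation order.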

Notice however that modifying the original, unsorted list is only a problem if it is referred to again elsewhere in the program. If the list is not used anywhere else, then there can be no observable side effects of updating it in place, as any data that cannot be referenced again can have no semantic effect on the rest of the program.

If this were the case then the compiler would be free to re-use the space previously taken up by the list, perhaps updating the data structure in-place, and referential transparency would not be broken. This condition, that there is only ever one reference to the list, is known as uniqueness; we say that the list is unique.

Consider an algorithm that inserts an element into a binary tree (fig 2.1). In an imperative language this would normally involve walking the tree until we find the correct place to insert the element and updating the node at that


[Figure 2.1: Inserting an element into a binary tree. (a) A binary tree; an element m is to be inserted in the outlined position. (b) After insertion; a new tree has been created from the old one, copying the nodes a, c and g as a', c' and g'.]

position. However in a functional language, we must instead copy all the nodes above the one to be updated and create a new binary tree. If the original tree was unique, that is, the only reference to a was passed to the function that inserted m, then there will no longer be any references to a. Consequently, there will no longer be any references to nodes c or g either. All three nodes will be wasting space in memory. If a larger number of nodes are inserted then it is possible that the space wasted will be many times greater than the space taken up by the tree! Clearly a lot of space can be wasted.

In general it is not possible to predict when an object in a Haskell program will become garbage, so garbage collection must be a dynamic run-time process. Because garbage collection happens at run-time, there is a performance penalty associated with it. Indeed, whilst garbage collection can be very efficient when large amounts of memory are available [8], it can often take up a non-trivial percentage of a program's execution time in memory-constrained environments. But when an object is known always to be unique, its lifetime can be determined statically and so the run-time cost of garbage collection can be avoided.


    Chapter 3

    Uniqueness in Type Systems

    Linear Logic

Linear Logic is a system of logic proposed by Jean-Yves Girard in which each assumption must be used exactly once. Wadler noticed that in the context of programming languages, linear logic corresponds to:

No duplication. No value is shared, so, as we have seen, destructive update is permissible.

No discarding. Each value is used exactly once. This use represents an explicit deallocation, so no garbage collection is required.

Wadler proposed a linear type system based directly on Girard's logic [10][11]. In this type system every value is labelled as being either linear or nonlinear. Functions are then typed to accept either linear or nonlinear arguments.

In [7] David Wakeling and Colin Runciman describe an implementation of a variant of Lazy ML that incorporates linear types. Their results are disappointing: the performance of programs using linear data structures is generally much worse than without. The cost of maintaining linearity easily outweighs the benefits of a reduced need for garbage collection.

Along similar lines, Henry Baker provides an implementation of linear Lisp [17] that restricts every type to being linear. The result is an implementation of Lisp that requires no run-time memory management. This comes at a price, however: Baker found that much of the work must instead be done by the programmer and, as with Linear ML, the large amounts of book-keeping and explicit copying mean that linear Lisp is slightly slower than its classical counterpart.


    Clean

Clean [23] is a language very similar to Haskell that features a uniqueness type system based on linear logic. Clean allows users to specify particular variables as being unique. The type system exposed to the user is large, and often simple functions can have complex types. However, the de-facto implementation has proved to be very efficient.

One particularly interesting feature of Clean is that the state of the world is explicit. Every Clean program passes around a unique object, the world. The world represents the state of the system, explicitly threaded throughout the program, and thus destructive updates to the world can be used to sequence IO operations. Unique objects cannot be duplicated, so no more than one world can exist at a time and hence there is no danger of referring to an old state by accident.

    Monads

Haskell takes a different approach towards IO. Monads, as presented by Wadler and Peyton-Jones [27], can do much of the work of uniqueness typing by the use of encapsulation. Indeed, they are much simpler in terms of both syntax and type system. However, monads do not solve every problem as elegantly.

    Suppose we have a program that makes use of a binary tree:

    data BinTree a = Empty | Node a (BinTree a) (BinTree a)

insert    :: a -> BinTree a -> BinTree a
removeMin :: BinTree a -> (a, BinTree a)
isEmpty   :: BinTree a -> Bool

If we want to allow the tree to be updated destructively we can employ the ST monad, replacing each branch by a mutable reference, an STRef. However, as STRefs require a state parameter, we must also add a type parameter to our binary trees.

data BinTree s a
    = Empty
    | Node a (STRef s (BinTree s a)) (STRef s (BinTree s a))

Unfortunately, none of the code we have written to work over binary trees will work any more! Not only are the type signatures incorrect, but the whole implementation must be re-written to work within the state monad.

insert    :: a -> BinTree s a -> ST s (BinTree s a)
removeMin :: BinTree s a -> ST s (a, BinTree s a)
...
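To make the cost of this rewrite concrete, here is a sketch of what the mutable-branch insert might look like inside the ST monad. The insertion strategy and helper names are illustrative assumptions, not code from the report:

```haskell
import Control.Monad (foldM)
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

-- Each branch is now a mutable STRef, as described above.
data BinTree s a
    = Empty
    | Node a (STRef s (BinTree s a)) (STRef s (BinTree s a))

-- Destructive insert: walks the tree and overwrites a single STRef,
-- instead of copying the spine of the tree.
insert :: Ord a => a -> BinTree s a -> ST s (BinTree s a)
insert x Empty = do
    l <- newSTRef Empty
    r <- newSTRef Empty
    pure (Node x l r)
insert x t@(Node y l r) = do
    let ref = if x < y then l else r
    sub  <- readSTRef ref
    sub' <- insert x sub
    writeSTRef ref sub'
    pure t

-- In-order traversal, to observe the result.
toList :: BinTree s a -> ST s [a]
toList Empty = pure []
toList (Node x l r) = do
    ls <- toList =<< readSTRef l
    rs <- toList =<< readSTRef r
    pure (ls ++ [x] ++ rs)

main :: IO ()
main = print $ runST $ do
    t <- foldM (flip insert) Empty [3, 1, 2 :: Int]
    toList t
```

Note how every operation, including the read-only traversal, is forced into ST: exactly the "monad creep" the next paragraph describes.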


Monadic code can often differ significantly in style from idiomatic functional code, so this may end up affecting large portions of our code. This can clearly cause problems if we were trying to optimise a large program in which the binary tree implementation had been identified as a bottleneck.

Hage & Holdermans' Heap Recycling for Lazy Languages

As a way of avoiding this monad creep, Hage and Holdermans present a language construct to allow destructive updates for unique values in nonstrict, pure functional languages [19]. Their solution makes use of an embedded destructive assignment operator and user-controlled annotations to indicate which closures can be re-used. They describe a type system that restricts the use of this operator and prove that referential transparency is maintained.

Hage and Holdermans do not provide an implementation of either the type system or the destructive assignment operator.

They also express concern about the complexity of the type system exposed to the user, despite it being simpler than the system used in Clean. This is the issue addressed in the next section of this paper.

    Uniqueness in Imperative Languages

The initial motivation for this project came not from linear logic but from imagining an imperative language that maintained a form of referential transparency.

This language has two types of variables, consumable and immutable. Each function then accepts two sets of variables: one set is the set of variables that the function consumes, the other is the set of variables that it views. During execution, a function f is said to own a variable x if and only if:

the variable x was created inside the body of f (either from a closed term or a literal), or x was passed to f as a consumable variable; and

f has not passed x as a consumable variable to any other function.

Each function is restricted so that the only variables it can modify or return are the variables it owns. One further restriction is that when a variable is passed in to a function as a consumable variable, it is removed from the current scope (this means it cannot be used as another argument to the same function). Thus, any variable passed into a function as a viewed argument will not change during the execution of that function, and any variable passed in as a consumed argument cannot be referred to again, so destructively updating the variable will not cause side effects.


As an example, here is an implementation of quicksort in this theoretical language:

qsort (consumed xs :: [Int]) -> [Int] = {
    return sort (xs, Nil)
}

sort (consumed xs :: [Int], end :: [Int]) -> [Int] = {
    case xs of
        Nil -> return end;
        Cons (x, xs) -> {
            ys, zs := split (x, xs);
            zs := sort (zs, end);
            return sort (ys, Cons (x, zs));
        }
}

split (viewed p :: Int; consumed xs :: [Int]) -> ([Int], [Int]) = {
    case xs of
        Nil -> return ([ ], [ ])
        Cons (x, xs) -> {
            ys, zs := split (p, xs);
            if x > p then:
                return (ys, Cons (x, zs));
            else:
                return (Cons (x, ys), zs);
        }
}

In the body of sort, xs will be out of scope after the case expression, and after the line

    split (x, xs)

xs will be out of scope but x will remain in scope, since split consumes its second argument but only views its first.

These rules ensure that at any point in the program's execution, if x is consumable in the current environment then there is no more than a single reference to it. Conversely, if there is more than one reference to x then x must be immutable.

A sufficiently smart compiler would be able to tell that in each case expression, the list under scrutiny is never referred to again; only its elements are. Thus, in the case that the list was a Cons cell, the cell can be re-used when a Cons cell is created later on. In this way, the function sort can avoid allocating any new cells and instead operate in-place.


    A Simpler Type System for Unique Values

We can translate this idea of consumable variables into Haskell. Below is the code for quicksort written in a version of Haskell extended to include this idea.

qsort :: [Int] ; [Int]
qsort xs = sort xs [ ]

sort :: [Int] ; [Int] ; [Int]
sort [ ] end = end
sort (x : xs) end = sort ys (x : sort zs end)
    where
        (ys, zs) = split x xs

split :: Int -> [Int] ; ([Int], [Int])
split p [ ] = ([ ], [ ])
split p (x : xs) = case p > x of
    True  -> (x : ys, zs)
    False -> (ys, x : zs)
    where
        (ys, zs) = split p xs

This is deliberately very close to standard Haskell, with one addition. A new form of arrow has been introduced to the syntax of types. The intended meaning of

    f :: a ; b
    g :: a -> b

is that f consumes a variable of type a and produces a b. Thus the body of f is free to modify its argument. By comparison, g is a standard Haskell function that only views its argument. Intuitively, the new arrow form obeys the following rules:

Only unique variables and closed terms may be used as an argument to a function expecting a unique value;

The result of applying a function of type (a ; b) to a unique value of type a will be a unique value of type b;

A unique variable may be used at most once in the body of a function;

Data structures are unique all the way down, i.e. a function (f :: [a] ; [a]) works over a unique list whose elements are also unique.


map :: (a ; b) -> [a] ; [b]
map f [ ] = [ ]
map f (x : xs) = f x : map f xs
-- map takes a unique list and updates it in place.
-- Notice the function f is not unique itself, as it is used twice
-- on the right-hand side.

id :: x ; x
id x = x

compose :: (b ; c) -> (a ; b) -> a ; c
compose f g x = f (g x)

double1 :: a ; (a, a)
double1 a = (a, a)  -- error: unique variable a is used twice

double2 :: a -> (a, a)
double2 a = (a, a)

apply1 :: (a -> b) -> a ; b
apply1 f x = f x  -- error: result of applying f to x will not be unique

apply2 :: (a ; b) -> a -> b
apply2 f x = f x  -- error: f expects a unique argument, x is not unique

twice :: (a ; a) -> a ; a
twice f = compose f f

fold :: (b ; a ; a) -> a ; List b ; a
fold f e [ ] = e
fold f e (x : xs) = fold f (f x e) xs

f1 :: a ; (a -> b) -> b
f1 x g = g x

f2 :: a -> (a ; b) -> b
f2 x g = g x  -- error: g expects a unique argument, x is not unique

f3 :: a -> (a -> b) ; b
f3 x g = g x  -- error: the result of applying g to x will not be unique

-- A unique variable may be passed to an argument expecting a
-- non-unique variable, but not the other way round.
-- Note that in f1, the type signature is implicitly bracketed like this:
--     f1 :: a ; ((a -> b) -> b)
-- so the result of a partial application would be a function that is
-- itself unique.

Figure 3.1: Some examples of functions with possible type signatures and type errors.


Semantically, this can be viewed in terms of the system proposed by Hage and Holdermans, equivalent to

    f :: a¹ ->¹ b¹
    g :: a -> b

Many functions can be converted to use this type system without needing to alter their definition at all. For instance, a function that reverses a list in-place can be constructed simply by altering the type signature of the standard Haskell function reverse.

reverse :: [a] ; [a]
reverse = rev [ ]
    where
        rev :: [a] ; [a] ; [a]
        rev xs [ ] = xs
        rev xs (y : ys) = rev (y : xs) ys

There is a significant drawback to this system in the fact that there is more than one possible way to assign a type to some fragments of code. If we want to use both in-place reverse and regular reverse, then we must create two separate functions that differ only by name and type signature.

I have implemented a typechecker for this system over a subset of Haskell. Due to time constraints, and the complexity of GHC's type system resulting from the vast number of type system extensions already present, the new type system has not been integrated into GHC. Despite this, the backend mechanisms to allow closure-recycling are fully functional: the example above will compile and run, sorting the list in-place, although it will not be typechecked by GHC.


    Chapter 4

    Implementation

I have implemented the backend mechanisms for dealing with overwriting as an extension to the Glasgow Haskell Compiler. This section includes just enough detail about the inner workings of the compiler to explain this extension.

    There are several main stages in the compilation pipeline:

    The Front End contains the parser and the type checker.

The Desugarer converts from the abstract syntax of Haskell into the tiny intermediate Core language.

    A set of Core-to-Core optimisations and other transformations.

    Translation into the STG language.

    Code generation.

    This chapter deals with the details of the final two phases.


    The STG Language

The STG language is a small, non-strict functional language used internally by GHC as an intermediate language before imperative code is output. Along with a formal denotational semantics [26], the STG language also has a full operational semantics with a clear and simple meaning for each language construct.

Construct                  Operational meaning
Function application       Tail call
Let expression             Heap allocation
Case expression            Evaluation
Constructor application    Return to continuation

    There are also several properties of STG code that are of interest:

Every argument to a function or data constructor is a simple variable or constant. Operationally, this means that arguments to functions are prepared (either by evaluating them or constructing a closure) prior to the call.

All constructors and built-in operations are saturated. This cannot be guaranteed for every function, since Haskell is a higher order language and the arity of functions is not necessarily known, but it simplifies the operational semantics. Functions of known arity can be eta-expanded to ensure saturation.

Pattern matching and evaluation are only ever performed via case expressions, and each case expression matches one-level patterns.

Each closure has an associated update flag. More is explained about these further down.

Bindings in the STG language carry with them a list of free variables. This has no semantic effect but is useful for code generation.


Program         prog    ->  binds

Bindings        binds   ->  var1 = lf1; ...; varn = lfn        (n >= 1)

Lambda-forms    lf      ->  varsf \u varsa -> expr

Update flag     u       ->  u                                  (updatable)
                        |   n                                  (not updatable)

Expression      expr    ->  let binds in expr                  (local definition)
                        |   letrec binds in expr               (local recursion)
                        |   case expr of alts                  (case statements)
                        |   var atoms                          (application)
                        |   constr atoms                       (saturated constructor)
                        |   prim atoms                         (saturated built-in op)
                        |   literal

Alternatives    alts    ->  aalt1; ...; aaltn; default         (n >= 0, algebraic)
                        |   palt1; ...; paltn; default         (n >= 0, primitive)

Algebraic alt   aalt    ->  constr vars -> expr
Primitive alt   palt    ->  literal -> expr
Default alt     default ->  var -> expr

Literals        literal ->  0# | 1# | ...                      (primitive integers)

Primitive ops   prim    ->  +# | -# | *# | /# | ...            (primitive integer ops)

Variable lists  vars    ->  {var1, ..., varn}                  (n >= 0)

Atom lists      atoms   ->  {atom1, ..., atomn}                (n >= 0)
Atom            atom    ->  var | literal

Figure 4.1: Syntax of the STG language


(LET)
    let x = bind in e; s; H   -->   e[x'/x]; s; H[x' |-> bind]        (x' fresh)

(CASECON)
    case v of alts; s; H[v |-> C a1...an]   -->   e[a1/x1 ... an/xn]; s; H
        where alts = {...; C x1...xn -> e; ...}

(CASEANY)
    case v of {...; x -> e; ...}; s; H   -->   e[v/x]; s; H
        (v is a literal and does not match any other case alternative)

(CASE)
    case e of alts; s; H   -->   e; (case • of alts : s); H

(RET)
    v; (case • of alts : s); H   -->   case v of alts; s; H
        (v is a literal or H[v] is in HNF)

(THUNK)
    x; s; H[x |-> e]   -->   e; (Upd x • : s); H        (e is a thunk)

(UPDATE)
    y; (Upd x • : s); H   -->   y; s; H[x |-> H[y]]     (H[y] is a value)

Figure 4.2: The evaluation rules

Operational Semantics of STG

The semantics of the STG language are described in [15] and [26]. An outline of the relevant rules is presented here with some details left out. In particular, the details of both recursion and function application are missing, as neither has much effect on the ideas presented here. The semantics of the STG language is given in terms of three components:

The code e is the expression under evaluation;

The stack s of continuations;

The heap H, a finite mapping from variables to closures.

The continuations on the stack take the following forms:

    case • of alts    Scrutinise the returned value in the case statement
    Upd t •           Update the thunk t with the returned value
    (• a1...an)       Apply the returned function to a1...an

    The first rule, LET, states that to evaluate a let-expression the heapH is extended to map a fresh variable to the right hand side bind of the


expression. The fresh variable corresponds to allocating a new address in memory. After allocation, we enter the code for e with x' substituted for x. Here is the Haskell code for the function reverse, taken from the standard prelude, and the corresponding STG code:

reverse = rev [ ]
    where
        rev xs [ ] = xs
        rev xs (y : ys) = rev (y : xs) ys

reverse = {} \n {} -> rev {Nil}
rev = {} \n {xs ys} -> case ys of
    Nil {} -> xs
    Cons {z, zs} ->
        let rs = {z, xs} \n {} -> Cons {z, xs}
        in rev {rs, zs}

which should be read in the following way:

First bind reverse to a function closure whose code pushes onto the stack a continuation that applies a function to the value Nil, then evaluate the code for rev.

Bind rev to a function closure that expects two arguments, xs and ys. The code for this closure should force evaluation of ys and examine the result:

    if it matches Nil, then evaluate the code for xs;

    if it matches Cons z zs, then allocate a Cons cell with arguments z and xs, load rs and zs onto the stack and enter the code for rev.
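As a sanity check, the accumulator-based definition that this STG code comes from can be run directly in ordinary Haskell (renamed reverse' here to avoid clashing with the prelude; this is an illustration, not code from the report):

```haskell
-- Accumulator-based reverse, matching the prelude-style definition above:
-- the first argument is the accumulator, as in the STG code for rev.
reverse' :: [a] -> [a]
reverse' = rev []
  where
    rev xs []       = xs
    rev xs (y : ys) = rev (y : xs) ys

main :: IO ()
main = print (reverse' [1, 2, 3 :: Int])
```

Each recursive step of rev is exactly one Cons allocation in the STG code, which is what the overwrite construct later eliminates.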

    Update flags

One feature of lazy evaluation is that each closure should be replaced by its (head) normal form upon evaluation, so that the same closure is never evaluated more than once. The update flag attached to each closure specifies whether this update should take place. If the flag is set to u then the closure will be updated, and if it is set to n then no update will be performed. Naïvely, every flag can be set to u, but this is not always necessary. For instance, if a closure is already in head normal form, then updating is not required. Much more detail about this is given in Simon Peyton-Jones' paper Implementing functional languages on stock hardware: the Spineless Tagless G-Machine [26].
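The effect of updateable thunks can be observed in ordinary GHC Haskell with Debug.Trace (an illustrative sketch, not from the report): the let-bound thunk below is shared between both uses of x, so after its first evaluation the closure is updated with the value 2 and the trace message fires only once.

```haskell
import Debug.Trace (trace)

main :: IO ()
main = do
    -- A single shared thunk for (1 + 1). The first use evaluates it
    -- and the closure is updated in place; the second use just reads
    -- the stored value, so "thunk forced" appears only once on stderr.
    let x = trace "thunk forced" (1 + 1 :: Int)
    print (x + x)
```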


    Closure Representation

Every heap object in GHC is in one of three forms: a head normal form (a value), a thunk, which represents an unevaluated expression, or an indirection to another object. A value can either be a function value or a data value formed by a saturated constructor application. The term closure is used to refer to any of these objects. A distinctive feature of GHC is that all closures are represented, and indeed handled, in a uniform way.

    [ Code (info table pointer) | Free Variables ]

All closures are in this form, with a pointer to an info table containing code and other details about the closure, and a list of variables that the closure needs access to. For example, a closure for a function application will store the code for the function in the info table and the arguments in the free variable list. When the closure is evaluated, the arguments can be reached via a known offset from the start of the closure. For a data constructor, the code will return to the continuation of the case statement that forced evaluation, providing the arguments of the constructor application. These arguments are, again, simply stored at an offset from the start of the closure.


    Adding an Overwrite construct

In this section, a new construct overwrite is added to the STG language. The syntax and semantics are given below. The idea is that

    overwrite x with e1 in e2

will behave in a similar manner to

    let x = e1 in e2

but rather than storing e1 as a new heap-allocated closure and binding it to x, the closure bound to x will be overwritten with the closure for e1.

Now, care must be taken to ensure that x really is bound to a closure, not an unboxed value, and that e1 will produce the same type of closure. However, no checking is done at this stage, as we assume this (as well as uniqueness checking) has been taken care of elsewhere in the compiler.

This highlights another difference between let and overwrite, namely that the variable bound in the let-construct may be any variable, free or bound, whereas in the overwrite-construct it must be a bound variable. We can add this construct to the example reverse from above:

reverse :: [a] ; [a]

reverse = {} \n {} -> rev {Nil}
rev = {} \n {xs ys} -> case ys of
    Nil {} -> xs
    Cons {z, zs} ->
        overwrite ys with Cons {z, xs}
        in rev {ys, zs}

Since there are no longer any let-constructs in this code, it doesn't allocate any space on the heap! The function runs using a constant amount of space, although in the case that the list is a previously unevaluated thunk, forcing the evaluation of reverse will also force the evaluation, and therefore allocation, of the list it operates on.

Note that we know it is safe to overwrite ys with a Cons cell because we know it to be unique from the type signature¹, and we know ys to be a Cons cell already since it was matched in a case expression.

In general, it is safe to overwrite a closure x with a constructor application C a1...an exactly when these two conditions hold:

x is known to be unique. This information is provided by the type system.

The closure bound to x was built with constructor C and is in normal form. This happens when x has been matched in a case expression, inside the guard for constructor C.

¹The STG language is untyped, but this information is available during the translation phase.


Expression    expr  ->  ...
                    |   overwrite x with expr in expr

(OVERWRITE)
    overwrite x with e1 in e2; s; H   -->   e2; s; H[x |-> e1]

Figure 4.3: The overwrite construct

    Ministg

Ministg [29] is an interpreter for the STG language that implements the operational semantics as given above. It offers a good place to investigate the new semantics.

Here is an outline of the relevant code. The code dealing with overwrite-expressions is largely similar to the code for let-expressions, and is usually simpler. For instance, no fresh variable need be generated, unlike in the let-expression, and no substitution need be performed. Performing substitutions over overwrite-expressions is also simpler than over the corresponding let-expression, as there is no variable capture to be avoided. The final difference is in calculating free variables: the variable appearing on the left-hand side of a let-expression is not free, but it is free in an overwrite-expression.

    GHC

At an operational level, these are the only differences between let-expressions and overwrite-expressions. When it comes to implementing the STG language in GHC, however, there are a few more hurdles to overcome. Unsurprisingly, much of the code remains the same as for let-expressions, but the translation is not as direct as in the Ministg interpreter.

Firstly: updating variables reacts badly with the generational garbage collector employed in GHC. More detail about this is provided in the next section.

Secondly: whereas in the Ministg interpreter variable locations are stored in a data structure representing a finite mapping, in GHC variable locations are stored as pointers kept in registers or as offsets from the current closure. In the case that the location of a variable is stored at an offset from a closure that is to be overwritten, we must make sure to save this location before performing the update, otherwise the location will be lost and we will no longer be able to access the variable.

    In the example below, the addresses for x and xs will be located at an


    smallStep :: Exp -> Stack -> Heap -> Eval (Maybe (Exp, Stack, Heap))

    -- LET
    smallStep (Let var object exp) stack heap = do
      newVar <- freshVar
      let newHeap = updateHeap newVar object heap
      let newExp  = subs (mkSub var (Variable newVar)) exp
      return $ Just (newExp, stack, newHeap)

    -- OVERWRITE
    smallStep (Overwrite var object exp) stack heap = do
      let newHeap = updateHeap var object heap
      return $ Just (exp, stack, newHeap)

    -- CASECON
    smallStep (Case (Atom (Variable v)) alts) stack heap
      | Con constructor args <- lookupHeap v heap
      , Just (vars, exp) <- exactPatternMatch constructor alts = do
          return $ Just (subs (mkSubList (zip vars args)) exp, stack, heap)

    -- CASEANY
    smallStep (Case (Atom v) alts) stack heap
      | isLiteral v || isValue (lookupHeapAtom v heap)
      , Just (x, exp) <- defaultPatternMatch alts = do
          return $ Just (subs (mkSub x v) exp, stack, heap)

    -- CASE
    smallStep (Case exp alts) stack heap = do
      return $ Just (exp, CaseCont alts callStack : stack, heap)

    -- RET
    smallStep exp@(Atom atom) (CaseCont alts : stackRest) heap
      | isLiteral atom || isValue (lookupHeapAtom atom heap) = do
          return $ Just (Case exp alts, stackRest, heap)

    -- THUNK
    smallStep (Atom (Variable x)) stack heap
      | Thunk exp <- lookupHeap x heap = do
          let newHeap = updateHeap x BlackHole heap
          return $ Just (exp, UpdateCont x : stack, newHeap)

    -- UPDATE
    smallStep atom@(Atom (Variable y)) (UpdateCont x : stackRest) heap
      | object <- lookupHeap y heap, isValue object = do
          return $ Just (atom, stackRest, updateHeap x object heap)

    Figure 4.4: Outline of the Ministg implementation for the evaluation rules given in Figure 4.2, plus the new overwrite expression.


    offset from the closure for ls. When that closure is overwritten, we lose these addresses, so we must take care to save them in temporary variables first.

    ...

    case ls of
      Cons x xs ->
        ...
        overwrite ls with Cons y ys in
        ... x ... xs ...

    (Diagram: two views of a Cons closure sharing the Cons info table, whose fields sit at offsets 1 and 2: x and xs before the update (a), y and ys after it (b).)

    Figure 4.5: Overwriting a Cons cell. Any references that pointed to xs in (a) will point to ys after the update (b), and similarly for x and y.


    Let us now consider another example, map. Intuitively, map seems like a good candidate for in-place updates: we scan across the list, updating each element with a function application. But there is a problem. Looking at the code for map and the corresponding STG binding, we see that map does not allocate any Cons cells! At least not directly:

    map :: (a -> b) -> [a] -> [b]
    map f []       = []
    map f (x : xs) = f x : map f xs

    map = {} \n {f, xs} ->
      case xs of
        Nil {}       -> Nil {}
        Cons {y, ys} ->
          let fy   = {f, y}  \u {} -> f {y}        in
          let mfys = {f, ys} \u {} -> map {f, ys}  in
          Cons {fy, mfys}

    The two closures allocated in the body of map are both thunks allocated on the heap, whereas the Cons cell is placed on the return stack. In a strict language, the recursive call to map would allocate the rest of the list, but in a lazy language a thunk representing the suspended computation is allocated instead. This thunk will later be updated with its normal form (either a Cons cell or Nil) if examined in a case statement.

    In general, the updatee's closure and the thunk will not be the same size, so we cannot blindly overwrite the former with the latter. One can imagine a mechanism whereby, upon seeing a unique value in a case statement, the code generator searches the rest of the code for the closure that fits best. If a closure of the same type is built, then we select that. Otherwise we try to reuse as much space as possible by selecting the largest closure that will take up no more space than the closure we wish to overwrite.
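    Such a best-fit search could be sketched as follows (hypothetical code, not part of GHC; `Candidate` and its fields are illustrative, with closure sizes measured in words):

    ```haskell
    import Data.List (maximumBy)
    import Data.Ord (comparing)

    -- A candidate closure that the code generator could build in the
    -- recycled space: its constructor name and its size in words.
    data Candidate = Candidate
      { candName :: String
      , candSize :: Int
      } deriving (Show, Eq)

    -- Prefer a closure of exactly the same size as the dead closure;
    -- failing that, take the largest candidate that still fits.
    bestFit :: Int -> [Candidate] -> Maybe Candidate
    bestFit deadSize cands =
      case filter ((== deadSize) . candSize) cands of
        (c : _) -> Just c
        []      -> case filter ((<= deadSize) . candSize) cands of
                     []   -> Nothing
                     fits -> Just (maximumBy (comparing candSize) fits)
    ```

    So, recycling a three-word Cons cell, `bestFit` would pick another Cons over a smaller Nil, but would fall back to the Nil if only larger closures remained.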

    There is also the possibility of reusing the thunk allocated for the recursive call itself, since once evaluated, it is no longer needed.

    I have not been able to try implementing this feature, but it would be an interesting improvement to make.

    There is one more optimisation that could potentially be included. When a variable x known to be unique goes out of scope, we know that it has become garbage, whether or not x appears in a case statement. The compiler would then be free to overwrite x with a new variable y without making any assumptions about the uniqueness of y. There is some difficulty here, as we do not know whether x refers to a value or an unevaluated thunk. If x has not been evaluated then in general we can infer nothing about the size of the thunk it refers to, as it may have been formed from an arbitrary expression.
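    That size constraint could be expressed by a check along these lines (a hypothetical sketch; `HeapObject` and its size measure are illustrative, not GHC's actual closure representation):

    ```haskell
    -- An out-of-scope unique closure can only be reused when it is already
    -- in normal form, so its size is statically known; a thunk's eventual
    -- size is unknowable in general.
    data HeapObject
      = ConApp String Int  -- constructor name and number of fields
      | Thunk              -- suspended computation of unknown result size

    -- Can a new closure with the given number of fields be built in the
    -- space occupied by the dead closure?
    canReuse :: HeapObject -> Int -> Bool
    canReuse (ConApp _ fields) newFields = newFields <= fields
    canReuse Thunk             _         = False
    ```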


    Garbage Collection

    GHC uses an n-stage generational garbage collector. A copying collector partitions the heap into two spaces: the from-space and the to-space. Initially, objects are allocated only in the from-space. Once this space is full, the live objects in the from-space are copied into the to-space; live objects are those reachable from the current code. Any unreachable object (garbage) is never copied, so it will not take up space in the new heap area, and the new heap will be smaller than the old one (provided there was unreachable data in the heap). Now the from-space becomes the to-space and vice versa, and the program continues to run. If no space was reclaimed, then the size of the two spaces must be increased, if this is possible. This can be generalised to more than two spaces, so that there are many heap spaces, any one of which may be acting as the to-space at a given time.

    This process clearly cannot be employed in a language that allows pointer arithmetic, for example, since closures are frequently relocated in memory and pointers would be left dangling, or pointing to nonsense. But are things any better in a functional language? Ignoring for the moment lazy evaluation and overwriting, Haskell has the property that any new data value will only point to old data, never the other way round, since values are immutable. This means the references in memory form a directed acyclic graph, with older values at the leaves and newer values nearer the root.
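    The copying step can be modelled in miniature (a hypothetical sketch: closures are represented only by the addresses they point to, and the redirection of pointers to the copies' new addresses is elided):

    ```haskell
    import Data.Map (Map)
    import qualified Data.Map as Map

    type Addr = Int

    -- Each closure is modelled only by the list of addresses it points to.
    type Heap = Map Addr [Addr]

    -- Copy everything reachable from the roots into a fresh to-space;
    -- anything unreachable (garbage) is simply never copied.
    copyLive :: [Addr] -> Heap -> Heap
    copyLive roots fromSpace = go roots Map.empty
      where
        go [] toSpace = toSpace
        go (a : rest) toSpace
          | a `Map.member` toSpace = go rest toSpace  -- already copied
          | otherwise = case Map.lookup a fromSpace of
              Nothing   -> go rest toSpace
              Just ptrs -> go (ptrs ++ rest) (Map.insert a ptrs toSpace)
    ```

    Starting from a root that reaches only part of the heap, the to-space ends up holding just the reachable closures; the rest is reclaimed for free.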

    The idea behind generations is that since structures are immutable, old structures don't usually point to structures created more recently. Because of this, it is possible to partition the heap into generations where old generations do not reference new generations. In this way, the garbage collector can rearrange the new generations without affecting the old generations. It has been observed that in functional programming, old data tends to stay around for much longer than new data [reference], so most unreachable data is newly created. This means that a large proportion of the garbage to be collected usually lies in the youngest generation, so by collecting this we can reclaim decent amounts of space without having to traverse the whole heap. Occasionally, however, garbage collecting a young generation will not free up enough space, in which case older generations must also be collected.

    A generational collector will also use a method to age objects from younger generations into older generations, if they have been around long enough. The usual way of doing this is by recording how many collections an object survives in a particular generation. Once this number exceeds some threshold, the object is moved up into the older generation.
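    As a sketch (hypothetical; the threshold and representation are illustrative, with generation 0 the youngest):

    ```haskell
    -- Track, per object, its generation and how many collections it has
    -- survived there; promote once the count reaches a threshold.
    data Obj = Obj { objGen :: Int, survivals :: Int } deriving (Show, Eq)

    ageThreshold :: Int
    ageThreshold = 2  -- illustrative value, not GHC's actual policy

    -- Called for each live object that survives a collection of its generation.
    survive :: Obj -> Obj
    survive (Obj gen n)
      | n + 1 >= ageThreshold = Obj (gen + 1) 0  -- promote to older generation
      | otherwise             = Obj gen (n + 1)
    ```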

    By default, GHC uses two generations. This scheme leads to frequent, small collections with occasional, much larger collections of the entire heap.

    Up until this point, we have been considering only garbage collection in directed acyclic graphs. Things become much less neat when we allow closures to be overwritten, as a closure in an old generation may well be


    (a) A block of memory split into two generations. The grey blocks are garbage.

    (b) After garbage collecting the youngest generation.

    Figure 4.6: Generational garbage collection

    updated to reference a newer closure in a younger generation. When a garbage collection takes place, the younger closure will be moved to a new location, and the reference inside the older closure will no longer point to the correct location. It is worth noting that this can happen even without the overwrite construct, owing to lazy evaluation. For example, the following code can be used to create a cyclic list:

    cyclic :: [a] -> [a]
    cyclic xs = rs
      where rs = link xs rs

    link []       ys = ys
    link (x : xs) ys = x : link xs ys

    To work around this, GHC keeps track, for each generation, of the set of closures that contain pointers to younger generations, called the remembered set. During a garbage collection, the pointers to younger generations are maintained so as to keep pointing to the correct locations. This remembered set must be updated whenever a closure is overwritten; this is known as the write barrier. It means that whenever a closure is overwritten, we must check for any old-to-new pointers being created, by considering the generation of the closure being overwritten.
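    The shape of that check could be sketched like this (a hypothetical model, not GHC's implementation; generation 0 is the youngest, higher numbers are older):

    ```haskell
    import Data.Set (Set)
    import qualified Data.Set as Set

    type Addr = Int
    type Gen  = Int  -- 0 is the youngest generation

    -- After overwriting the closure at `target` so that it points at
    -- `newPtrs`, add it to the remembered set if it has gained a pointer
    -- into a generation younger than its own.
    writeBarrier :: (Addr -> Gen) -> Addr -> [Addr] -> Set Addr -> Set Addr
    writeBarrier genOf target newPtrs remembered
      | any (\p -> genOf p < genOf target) newPtrs = Set.insert target remembered
      | otherwise                                  = remembered
    ```

    An overwrite that leaves an old-generation closure pointing only at equally old data adds nothing to the remembered set; the cost is paid only when an old-to-new pointer is created.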

    Unfortunately, this does incur a significant performance penalty for the overwrite expression. This will be discussed later on in more detail.


    Chapter 5

    Results

    Due to the write-barrier overhead, performance gains for this new optimisation are slight. In a benchmark program that sorts a large list of integers, we find that although the time spent in garbage collection drops by around 10%, the extra cost of the write-barrier and of saving free variables almost exactly counteracts the benefits.

    By comparison, if we restrict GHC to use a single-space copying collector, thus avoiding the problems associated with older generations, we see a much bigger improvement from in-place updates. However, the overall execution time is worse than when using the generational collector, so there is little point in doing so.

    For larger, more realistic programs the gain is usually even smaller. Typically, there is little difference between an optimised program and one without closure-overwriting. It is interesting to note, though, that in no case has it been observed that the extension causes a program to run noticeably slower.

    However, this is only the case in conditions where the run time system is allowed access to much more heap space than is needed. When the amount of heap space available is restricted to be close to the amount of live data, very different results can be seen. Figure 5.2 shows how the performance of the sorting program varies with the size of the heap. For small heap sizes, including destructive update makes a big difference to the speed of the program. Without destructive update, reducing the size of the heap dramatically increases the amount of time spent garbage collecting. For a heap size of 8MB, garbage collection accounts for approximately 50% of execution time, and for a heap size of 2MB this increases to around 75%. By contrast, with destructive overwriting turned on, reducing the size of the heap has little effect on the program. Indeed, the program actually runs slightly faster with a smaller heap! This may be due to the improved data locality of a smaller heap and fewer cache misses.


                              none     -G1      -M2m
    With optimisation
      Time (s)                6.43     8.38     6.11
      %GC                     46%      56%      44.8%
    Without optimisation
      Time (s)                6.41     9.52     11.18
      %GC                     57%      60%      76%

    Figure 5.1: Results of running a sorting algorithm with various options affecting the run time system. The code under analysis here is exactly the quicksort example given earlier, used to sort a list of 20000 integers, taking the minimum time over three runs.

    Figure 5.2: Performance of the sorting program as the heap size varies.


    Chapter 6

    Conclusion

    As the run time system of GHC has been highly optimised for persistent data structures, overwriting closures provides little benefit under typical conditions. Despite this, the technique appears promising for environments where a large amount of excess heap space is not available.

    A number of possibilities for further optimisation remain open that may improve the impact of this technique. It is likely that being more aggressive in deciding which closures can be overwritten would lead to better results. In particular, allowing the closures allocated for function calls to be updated is likely to be useful for optimising recursive functions that are not tail recursive.


    Bibliography

    [1] P. Hudak. Conception, evolution, and application of functional programming languages. ACM Computing Surveys, 1989.

    [2] J. Hughes. Why Functional Programming Matters.

    [3] David B. MacQueen. Reflections on Standard ML. Lecture Notes in Computer Science, volume 693/1993, pages 32-46.

    [4] Sylvain Conchon, Jean-Christophe Filliâtre. A Persistent Union-Find Data Structure.

    [5] P. Wadler. Functional Programming: Why no one uses functional languages.

    [6] Niklas Röjemo, Colin Runciman. Lag, Drag and Void: heap-profiling and space-efficient compilation revisited. Department of Computer Science, University of York.

    [7] David Wakeling, Colin Runciman. Linearity and Laziness.

    [8] Andrew W. Appel. Garbage Collection Can Be Faster Than Stack Allocation. Department of Computer Science, Princeton University.

    [9] Philip Wadler. The marriage of effects and monads. Bell Laboratories.

    [10] Philip Wadler. Is there a use for linear logic? Bell Laboratories.

    [11] Philip Wadler. A taste of linear logic. Bell Laboratories.

    [12] Philip Wadler. Comprehending Monads. University of Glasgow.

    [13] David N. Turner, Philip Wadler. Operational Interpretations of Linear Logic.

    [14] Simon Peyton Jones. Implementing functional languages on stock hardware: The Spineless Tagless G-machine version 2.5. University of Glasgow.

    [15] Simon Peyton Jones. Making a Fast Curry: Push/Enter vs Eval/Apply for Higher-order Languages.

    [16] Antony L. Hosking. Memory Management for Persistence. University of Massachusetts.

    [17] Henry G. Baker. Lively Linear Lisp: Look Ma, No Garbage!

    [18] Edsko de Vries, Rinus Plasmeijer, David M. Abrahamson. Uniqueness Typing Redefined.

    [19] Jurriaan Hage, Stefan Holdermans. Heap Recycling for Lazy Languages. Department of Information and Computing Sciences, Utrecht University.

    [20] Jon Mountjoy. The Spineless Tagless G-machine, naturally. Department of Computer Science, University of Amsterdam.

    [21] François Pottier. Wandering through linear types, capabilities, and regions.

    [22] A.M. Cheadle, A.J. Field, S. Marlow, S.L. Peyton Jones, R.L. While. Exploring the Barrier to Entry: Incremental Generational Garbage Collection for Haskell.

    [23] E.G.J.M.H. Nöcker, J.E.W. Smetsers, M.C.J.D. van Eekelen, M.J. Plasmeijer. Concurrent Clean.

    [24] Simon Peyton Jones, Simon Marlow. The STG Runtime System (revised).

    [25] Simon Peyton Jones, Simon Marlow. The New GHC/Hugs Runtime System.

    [26] Simon Peyton Jones. Implementing Functional Languages on Stock Hardware: the Spineless Tagless G-Machine.

    [27] Simon Peyton Jones, Philip Wadler. Imperative Functional Programming.

    [28] J. Launchbury, S. Peyton Jones. State in Haskell. Lisp and Symbolic Computation, volume 8, pages 293-342.

    [29] The Mini STG Language: http://www.haskell.org/haskellwiki/Ministg
