
  • Compiler Construction

    Mayer Goldberg \ Ben-Gurion University

    December 3, 2018

  • Chapter 5

    Agenda
    ▶ Intuition about the tail-calls, tail-position, & the tail-call-optimization
    ▶ The tail-position, tail-call
    ▶ The TCO
    ▶ Loops & tail-recursion
    ▶ Annotating the tail-call
    ▶ What TCO code looks like
    ▶ Implementing the TCO

  • The tail-call (intuitively)

    ▶ Two people are walking through a forest, and encounter a witch🧙

    ▶ The witch grants them three wishes…
    ▶ Here is what each person wished for:

        First Person      Second Person
        US$1,000,000      3 more wishes
        A grade of 100    US$1,000,000
        3 more wishes     A grade of 100

    ▶ What is the difference?

  • The tail-call (continued)

    ▶ Here is what each person wished for:

        First Person      Second Person
        US$1,000,000      3 more wishes
        A grade of 100    US$1,000,000
        3 more wishes     A grade of 100

    ▶ The first person’s wishes are simple to grant
    ▶ The second person’s wishes are annoying:
      ▶ The witch must remember what to do once she has returned from granting the 3 wishes
      ▶ Since this nonsense is going to go on for a while, the witch needs a stack of paper slips to manage all these requests in order…

    This is the difference between a tail-call (First Person) & a non-tail-call (Second Person)

  • The tail-call (continued)

    Back to reality:
    ▶ Non-tail-calls require additional stack frames to manage arguments & return addresses
      ▶ The stack depth (in frames) is proportional to the number of non-tail-calls
    ▶ Tail-calls do not require additional stack frames
      ▶ The stack depth (in frames) is independent of the number of tail-calls
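The contrast can be sketched in C with two ways of summing the numbers 1..n. The function names sum_to and sum_acc are hypothetical, chosen only for this illustration. Whether a given C compiler actually reuses the frame for the second version is not guaranteed by the language; the sketch only shows where the tail-position is.

```c
/* Non-tail call: the addition happens after the recursive call returns,
   so every level must keep its frame alive until then. */
unsigned long sum_to(unsigned long n) {
    if (n == 0) { return 0; }
    return n + sum_to(n - 1);
}

/* Tail call: nothing remains to be done after the recursive call,
   so the current frame could be reused for it. */
unsigned long sum_acc(unsigned long n, unsigned long acc) {
    if (n == 0) { return acc; }
    return sum_acc(n - 1, acc + n);
}
```

In sum_to, each pending addition plays the role of the witch's paper slip; in sum_acc, the running total travels along in the argument acc, so there is nothing to remember.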

  • The tail-call optimization (intuitively)

    ▶ You are surfing the web in a browser
    ▶ The browser is broken: It has no ⟨Back⟩ key
      ▶ Once you click on a link, you are unable to return
    ▶ To read a web page, you therefore right-click on a link & select the option to open in a new frame
    ▶ You read the page in the new frame, possibly opening links in additional frames
    ▶ When you are done reading a page, you click on the ⊠ button to remove the frame

  • The tail-call optimization (intuitively, cont)

    ▶ Some web pages have special links:
      ▶ These links are not just the last links on the page
      ▶ These links are the last thing on the page
      ▶ There’s nothing to read past these links!
      ☞ Not all pages have such special links
    ▶ Because there’s nothing to read past these links, we need neither to open a new frame nor to return from it:
      ▶ Rather than right-click & open the page in the new frame, we simply click on the link in place
      ▶ The new contents shall overwrite the old contents
      ▶ The new contents can be larger or smaller than the contents it replaces
      ▶ The size of the frame can change
      ▶ The number of frames shall not change
    ▶ When we’re done reading the page, we close it with ⊠

  • The tail-call optimization (intuitively, cont)

    [Figure: diagram with the label “The tail-calls” and three labels reading “a new frame”]

  • Chapter 5

    Agenda
    🗸 Intuition about the tail-calls, tail-position, & the tail-call-optimization
    ▶ The tail-position, tail-call
    ▶ The TCO
    ▶ Loops & tail-recursion
    ▶ Annotating the tail-call
    ▶ What TCO code looks like
    ▶ Implementing the TCO

  • The tail-position

    ▶ The tail-position is a point in the body of a function, procedure, method, subroutine, etc., just before the return statement, where the last computation is performed
    ▶ Because computation may proceed non-linearly, there may be more than one tail-position
    ▶ To find the tail positions, find all points in the code where a return-statement or the ret instruction could be placed

  • The tail-position (continued)

    Example:
        (lambda (x)
          (f (g (g x))))

    ▶ tail-call: the application of f

  • The tail-position (continued)

    Example:
        (lambda (x)
          (f (lambda (y)
               (g x y))))

    ▶ tail-calls: the application of f and the application of g
    ☞ Each lambda-expression has its own return-statement!

  • The tail-position (continued)

    Example:
        (lambda (x y z w)
          (if (foo? x)
              (goo y)
              (boo (doo z))))

    ▶ tail-calls: (goo y) and the application of boo
    ☞ If an if-expression is in tail-position, then the then-expression & else-expression are also in tail-position

  • The tail-position (continued)

    Example:
        (lambda (x y z)
          (f (if (g? x)
                 (h y)
                 (w z))))

    ▶ tail-call: the application of f
    ☞ If an if-expression is not in tail-position, then neither are its then-expression & else-expression

  • The tail-position (continued)

    Example:
        (lambda (a b)
          (f a)
          (g a b)
          (display "done!\n"))

    ▶ tail-call: (display "done!\n")
    ☞ If a sequence, whether explicit or implicit, is in tail-position, the last expression in the sequence is also in tail-position

  • The tail-position (continued)

    Example:
        (lambda ()
          (and (f x) (g y) (h z)))

    ▶ tail-call: (h z)
    ☞ If an and-expression is in tail-position then its last expression is also in tail-position
      ▶ While it is possible to return after computing previous expressions [within an and-expression], it is only from the last expression that return is possible immediately, without first testing the value of the expression

  • The tail-position (continued)

    Example:
        (lambda ()
          (or (f (g x)) y))

    ▶ The above example contains no application in tail-position
    ☞ Similarly to and-expressions, if an or-expression is in tail-position then its last expression is in tail-position

  • The tail-position (continued)

    Example:
        (lambda ()
          (set! x (f y)))

    ☞ The body of a set!-expression is never in tail-position!

  • The tail-position (continued)

    Example:
        (lambda ()
          (set! x (f (lambda (y)
                       (g x y)))))

    ▶ tail-call: (g x y)
    ☞ Even though the body of the set!-expression is not in tail-position, “every lambda-expression has its own return”
      ▶ The marked application is in tail-position relative to its enclosing lambda-expression

  • The tail-position (continued)

    Example:
        (lambda (x y z)
          (cond ((f? x) (g y))
                ((g? x) (f x) (f y))
                (else (h x) (f y) (g (f x)))))

    ▶ tail-calls: (g y) in the first rib, (f y) in the second rib, and the application of g in (g (f x))
    ☞ If a cond-expression is in tail-position then the last expression in the implicit sequence of each cond-rib is also in tail-position

  • The tail-position (continued)

    Example:
        (let ((x (f y))
              (y (g x)))
          (goo (boo x) y))

    ▶ tail-call: the application of goo
    ☞ The body of a let-expression is the body of a lambda-expression.
      ▶ Therefore, the last expression in the implicit sequence of the body is in tail-position

  • The tail-position (continued)

    Example:
        (lambda (s)
          (apply f s))

    This is an interesting example:
    ▶ On the one hand, there is only one tail-call that appears statically in the source code
    ▶ On the other hand, there is another tail-call that shall take place at run-time: The application of f must re-use the top activation frame!
    ☞ The implementation of apply duplicates part of the code for the tail-call-optimization, so that the frame for the call to f overwrites the frame for the call to apply!

  • Chapter 5

    Agenda
    🗸 Intuition about the tail-calls, tail-position, & the tail-call-optimization
    🗸 The tail-position, tail-call
    ▶ The TCO
    ▶ Loops & tail-recursion
    ▶ Annotating the tail-call
    ▶ What TCO code looks like
    ▶ Implementing the TCO

  • The tail-call optimization

    The tail-call optimization is a recycling of activation frames on the stack:
    ▶ Function/method calls ordinarily open new activation frames on the stack
    ▶ Function/method calls in tail-position need not open new activation frames, but may re-use the current/top frame
    ▶ Only two items in the current activation frame need to be preserved during a tail-call:
      ▶ The old frame-pointer
      ▶ The return-address
    ▶ Everything else in the current frame, namely
      ▶ The lexical environment
      ▶ The values of arguments
      ▶ Any local values
      is overwritten & replaced by the contents of the new frame

  • The tail-call optimization (continued)

    ▶ The tail-call optimization saves stack space by reusing frames
    ▶ In some situations this may also save time
    ▶ We shall not encounter many other optimizations of space
    ▶ The tail-call optimization is very important!

  • Chapter 5

    Agenda
    🗸 Intuition about the tail-calls, tail-position, & the tail-call-optimization
    🗸 The tail-position, tail-call
    🗸 The TCO
    ▶ Loops & tail-recursion
    ▶ Annotating the tail-call
    ▶ What TCO code looks like
    ▶ Implementing the TCO

  • The tail-call optimization (continued)

    The significance of the TCO
    ▶ All loops are special cases of recursive functions where the recursive call is in tail-position
    ▶ The TCO is our license to use recursive procedures to implement iteration
    ▶ Absent the TCO, iteration using recursion would be prohibitively expensive:
      ⚠ The amount of stack consumed in the execution of a loop would be proportional to the number of iterations
      ⚠ Large loops would exhaust the stack

  • Loops

    ▶ By now, we know how to identify the tail-position
    ▶ We made the claim that the TCO is important because it gives us license to implement loops using tail-recursive functions
    ☞ We now need to pay this debt, and show that loops are tail-recursive functions!

  • Loops (continued)

    Example: while-loops
        (define while
          (lambda (test body)
            (if (test)
                (begin
                  (body)
                  (while test body)))))

    ▶ tail-call: (while test body)

  • Loops (continued)

    Example: while-loops (cont)
        > (let ((i 0))
            (while
              (lambda () (< i 10))
              (lambda ()
                (set! i (+ 1 i))
                (display (format "~a~%" i)))))
        1
        2
        ...
        9
        10

  • Loops (continued)

    Example: for-loops
        (define for
          (lambda (from to body)
            (if (< from to)
                (begin
                  (body from)
                  (for (+ 1 from) to body)))))

    ▶ Notice that the body is parameterized by the index-variable of the loop
    ▶ tail-call: (for (+ 1 from) to body)

  • Loops (continued)

    Example: for-loops (cont)
        > (for 1 6
            (lambda (i)
              (for 1 6
                (lambda (j)
                  (display
                    (format "\t~a"
                      (* i j)))))
              (newline)))
            1   2   3   4   5
            2   4   6   8   10
            3   6   9   12  15
            4   8   12  16  20
            5   10  15  20  25
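For readers more at home in C, the for procedure defined earlier can be sketched as a tail-recursive function that takes the loop body as a function pointer. The names for_loop, Body, ctx and add_index are hypothetical, introduced only for this illustration. Note that the C standard does not guarantee the TCO: compilers such as GCC and Clang typically perform it only when optimizations are enabled, so this is a sketch of the shape of the code rather than a recommended way to loop in C.

```c
/* The loop body receives the index plus an opaque context pointer,
   which stands in for the variables a Scheme closure would capture. */
typedef void (*Body)(int i, void *ctx);

/* Calls body(from), body(from + 1), ..., body(to - 1). */
void for_loop(int from, int to, Body body, void *ctx) {
    if (from < to) {
        body(from, ctx);
        for_loop(from + 1, to, body, ctx);  /* tail call: last thing done */
    }
}

/* Example body: adds the index to the int that ctx points at. */
void add_index(int i, void *ctx) {
    *(int *)ctx += i;
}
```

A call such as for_loop(1, 6, add_index, &sum) visits the indices 1 through 5, matching the half-open range of the Scheme version.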

  • The tail-call optimization (continued)

    ▶ The tail-call optimization is not just important for loops
    ▶ Consider Ackermann’s Function:

        Ack(0, q)         = q + 1
        Ack(p + 1, 0)     = Ack(p, 1)
        Ack(p + 1, q + 1) = Ack(p, Ack(p + 1, q))

        Ack(2, 2) = 7
        Ack(3, 3) = 61

    ▶ tail-calls: Ack(p, 1) and the outer call in Ack(p, Ack(p + 1, q))
    ▶ non-tail-call: the inner call Ack(p + 1, q)
    ▶ The TCO reduces the number of frames needed by 50%, which lets us compute Ackermann’s Function for larger inputs

  • The tail-call optimization (continued)

    One disadvantage of the TCO:
    ▶ When a frame is overwritten, debug information is lost
    ▶ Smalltalk & Java have great debuggers
      ▶ Scheme does not!
    ▶ Language implementations that do not implement the TCO should, at the very least, offer efficient & convenient looping mechanisms
    ☞ Compromise: Turn the TCO on/off while debugging
      ▶ No Scheme compiler does this
      ▶ It’s easy (if tedious) to do by hand:
            (define id (lambda (x) x))
        Just wrap each tail-call with id, so that the wrapped call is no longer in tail-position ☺

  • Chapter 5

    Agenda
    🗸 Intuition about the tail-calls, tail-position, & the tail-call-optimization
    🗸 The tail-position, tail-call
    🗸 The TCO
    🗸 Loops & tail-recursion
    ▶ Annotating the tail-call
    ▶ What TCO code looks like
    ▶ Implementing the TCO

  • Annotating tail-calls in our compiler

    ▶ Annotating tail-calls is done in a single pass over the AST of expr
    ▶ You are given the type expr' which includes the type constructor ApplicTP' of expr' * (expr' list) for encoding tail-calls
    ▶ The simplest way to annotate tail-calls is to carry along an auxiliary parameter in_tp (read: in tail-position) to indicate whether the current expression is in tail-position
      ▶ The initial value of in_tp is false
      ▶ When an Applic is encountered and the value of in_tp is true, an ApplicTP' is used to package the result of the recursive calls over the procedure and the list of arguments
      ▶ Upon entering lambda-expressions (of any kind), the value of in_tp is reset back to true
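The in_tp pass can be sketched in C over a deliberately simplified AST. Everything here is an assumption made for illustration: the course's actual types are the OCaml types expr and expr', whereas the Expr struct, the Kind enum and the annotate function below are hypothetical stand-ins. The sketch rewrites the node kind in place instead of building a new tree, and it covers only constants, variables, if, sequences, lambda and applications.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node kinds; APPLIC_TP plays the role of ApplicTP'. */
typedef enum { CONST, VAR, IF, SEQ, LAMBDA, APPLIC, APPLIC_TP } Kind;

typedef struct Expr {
    Kind kind;
    /* IF: test, then, else.  SEQ: the expressions in order.
       LAMBDA: kids[0] is the body.  APPLIC: kids[0] is the procedure,
       the remaining kids are the arguments. */
    struct Expr *kids[3];
    size_t nkids;
} Expr;

void annotate(Expr *e, bool in_tp) {
    switch (e->kind) {
    case CONST:
    case VAR:
        break;
    case IF:
        annotate(e->kids[0], false);   /* the test is never in tail-position */
        annotate(e->kids[1], in_tp);   /* then and else inherit in_tp */
        annotate(e->kids[2], in_tp);
        break;
    case SEQ:
        for (size_t i = 0; i < e->nkids; i++) {
            /* only the last expression of the sequence inherits in_tp */
            annotate(e->kids[i], in_tp && i + 1 == e->nkids);
        }
        break;
    case LAMBDA:
        annotate(e->kids[0], true);    /* every lambda has its own return */
        break;
    case APPLIC:
    case APPLIC_TP:
        e->kind = in_tp ? APPLIC_TP : APPLIC;
        for (size_t i = 0; i < e->nkids; i++) {
            annotate(e->kids[i], false);  /* procedure and arguments */
        }
        break;
    }
}
```

For (lambda (x) (f (g x))), calling annotate on the lambda node with in_tp set to false marks the application of f as APPLIC_TP and leaves the application of g as APPLIC. The remaining forms (or, and, set!, cond, definitions) would follow the rules listed on the tail-position slides.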

  • Chapter 5

    Agenda
    🗸 Intuition about the tail-calls, tail-position, & the tail-call-optimization
    🗸 The tail-position, tail-call
    🗸 The TCO
    🗸 Loops & tail-recursion
    🗸 Annotating the tail-call
    ▶ What TCO code looks like
    ▶ Implementing the TCO

  • TCO Ackermann in C

    The TCO is simplest to demonstrate on a function with immediate tail-recursion:
    ▶ The new frame is identical to the old frame, both in size and types

    Consider Ackermann’s Function:
        int ack(int a, int b) {
          if (a == 0) { return b + 1; }
          else if (b == 0) { return ack(a - 1, 1); }
          else { return ack(a - 1, ack(a, b - 1)); }
        }

    🤔 A tail call is identified by the pattern return f(...)
    ☞ There are two tail-recursive function calls

  • TCO Ackermann in C (continued)

    Optimizing the tail-calls in Ackermann’s Function results in:
        int ack(int a, int b) {
        L:
          if (a == 0) { return b + 1; }
          else if (b == 0) { b = 1; --a; goto L; }
          else { b = ack(a, b - 1); --a; goto L; }
        }

    ▶ Notice that the current frame is changed in place
    ▶ Notice that goto replaces the function call
    ▶ Notice that one non-tail-call remains
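One way to convince yourself that the rewrite preserves behaviour is to keep both versions side by side and compare them on small inputs. The names ack_plain, ack_tco and versions_agree are hypothetical, chosen so that both definitions can live in one file; the bodies are the two listings from the slides.

```c
/* Plain recursive version, as in the first listing. */
int ack_plain(int a, int b) {
    if (a == 0) { return b + 1; }
    else if (b == 0) { return ack_plain(a - 1, 1); }
    else { return ack_plain(a - 1, ack_plain(a, b - 1)); }
}

/* Both tail-calls replaced by updating a and b in place and jumping
   back to the top, as in the second listing. */
int ack_tco(int a, int b) {
L:
    if (a == 0) { return b + 1; }
    else if (b == 0) { b = 1; --a; goto L; }
    else { b = ack_tco(a, b - 1); --a; goto L; }
}

/* Returns 1 if the two versions agree on every input in
   [0, max_a] x [0, max_b], and 0 otherwise. */
int versions_agree(int max_a, int max_b) {
    for (int a = 0; a <= max_a; a++) {
        for (int b = 0; b <= max_b; b++) {
            if (ack_plain(a, b) != ack_tco(a, b)) { return 0; }
        }
    }
    return 1;
}
```

Keep the bounds small: Ackermann's Function grows so quickly that even modest arguments exhaust both the stack and the range of int.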

  • TCO Ackermann in x86/64

    Here is Ackermann’s Function, with the TCO, in x86/64:

        ack:
            cmp rdi, 0
            jz .A
            cmp rsi, 0
            jz .B
            push rdi
            dec rsi
            call ack
            mov rsi, rax
            pop rdi
            dec rdi
            jmp ack
        .A:
            lea rax, [rsi + 1]
            ret
        .B:
            dec rdi
            mov rsi, 1
            jmp ack

    Tail-calls (the two jmp ack instructions) and a non-tail-call (call ack)

  • The tail-call optimization (continued)

    ▶ When the tail-call is not immediately-recursive
      ▶ non-recursive, or
      ▶ mutually-recursive
      we cannot demonstrate the TCO in high-level C
    ▶ This would require that we goto from within the body of one procedure into a label that is local to another procedure
    ▶ The new frame can be very different in size, argument count, and argument type from the old frame, so we cannot overwrite it in a high-level language

  • The tail-position (continued)

    ▶ The tail-recursion optimization, whether immediate or mutual, is a special case of the tail-call optimization, where the call is recursive or mutually recursive
    ▶ The general tail-recursion optimization is all that is required to support the implementation of loops using recursion
      ▶ This is required by the standard for Scheme
    ▶ The tail-call optimization is more general than the tail-recursion optimization, and optimizes all tail-calls
    ▶ The tail-call optimization crosses the boundaries of functions/procedures/methods, and is implemented in assembly/machine language
    ☞ You will implement the tail-call-optimization in your compilers

  • Chapter 5

    Agenda
    🗸 Intuition about the tail-calls, tail-position, & the tail-call-optimization
    🗸 The tail-position, tail-call
    🗸 The TCO
    🗸 Loops & tail-recursion
    🗸 Annotating the tail-call
    🗸 What TCO code looks like
    ▶ Implementing the TCO

  • The tail-call optimization (continued)

    Upon tail-call
    ① Evaluate the arguments, and push their values onto the stack in reverse order (from last to first)
    ② Push the argument count
    ③ Evaluate the procedure expression
      ▶ Verify that we have a closure!
    ④ Push the lexical environment of the closure
    ⑤ Push the return address of the current frame
    ⑥ Restore the old frame-pointer register (on x86/64: rbp)
    ⑦ Overwrite the existing frame with the new frame
      ▶ Loop, memcpy, whatever…
    ⑧ jmp to the code-pointer of the closure
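The frame overwrite can be simulated in C on an array that stands in for the machine stack. This is only a sketch under stated assumptions: the Machine struct and the functions push, enter_frame and tail_call are hypothetical; the stack is an array of words indexed by sp and rbp instead of real registers; and the frame layout (old rbp, return address, environment, argument count, then the arguments) is the one implied by the push order in the steps above, with the callee's prologue saving rbp. Step ③ and the final jmp have no counterpart in the simulation. The copy uses memmove because the new frame and the old frame may overlap, a case in which memcpy has undefined behaviour.

```c
#include <string.h>

enum { STACK_WORDS = 64 };

typedef struct {
    long stack[STACK_WORDS];
    int sp;   /* index of the top-of-stack word; the stack grows downward */
    int rbp;  /* index of the word holding the saved old frame-pointer */
} Machine;

static void push(Machine *m, long v) { m->stack[--m->sp] = v; }

/* Build an ordinary frame, the way a non-tail call followed by the
   callee's prologue would: args (last to first), argc, env, ret, old rbp. */
void enter_frame(Machine *m, const long *args, int argc, long env, long ret) {
    for (int i = argc - 1; i >= 0; i--) { push(m, args[i]); }
    push(m, argc);
    push(m, env);
    push(m, ret);
    push(m, m->rbp);   /* prologue: push rbp */
    m->rbp = m->sp;    /* prologue: mov rbp, rsp */
}

/* Steps 1, 2 and 4 through 7 of the tail-call sequence. */
void tail_call(Machine *m, const long *args, int argc, long env) {
    int old_argc = (int)m->stack[m->rbp + 3];
    int old_top = m->rbp + 4 + old_argc;  /* one past the old frame */
    int new_words = argc + 3;             /* args, argc, env, ret */

    for (int i = argc - 1; i >= 0; i--) { push(m, args[i]); }  /* step 1 */
    push(m, argc);                                              /* step 2 */
    push(m, env);                                               /* step 4 */
    push(m, m->stack[m->rbp + 1]);     /* step 5: reuse the return address */
    int old_rbp = (int)m->stack[m->rbp];                        /* step 6 */

    /* step 7: slide the new frame up so that it ends where the old one did */
    int dest = old_top - new_words;
    memmove(&m->stack[dest], &m->stack[m->sp], new_words * sizeof(long));
    m->sp = dest;
    m->rbp = old_rbp;
}
```

After tail_call returns, sp points at the preserved return address, exactly as it would immediately after an ordinary call instruction, so the callee's usual prologue works unchanged. The number of frames is the same as before; only the size of the top frame may have changed.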

  • The tail-call optimization (continued)

    Upon return from a procedure-call
    ① Add a wordsize (8 bytes) to the stack-pointer (on x86/64: rsp) to move past the lexical environment
    ② Pop off the argument count
      ▶ This can be different from the number of arguments you pushed:
        ▶ If you call a procedure with n arguments and it tail-calls a procedure with k arguments, which in turn tail-calls a procedure with r arguments, which returns, then you get back r arguments on the stack!
        ▶ Variadic lambda-expressions and lambda-expressions with optional arguments will adjust the stack to the number of their parameters: If you call the procedure (lambda (a b c . r) ... ) with 29 arguments, and it returns, you can expect 4 arguments on the stack, the last one of which being a list of the remaining 26 arguments…

  • The tail-call optimization (continued)

    Upon return from a procedure-call
    ② Pop off the argument count
    ③ Remove as many arguments off the stack as indicated by the argument count: Add word-size * argument-count to the stack-pointer register
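The caller's clean-up can be sketched in C with a pointer standing in for rsp. The function name cleanup_after_return is hypothetical, and the sketch assumes that the return address has already been popped by ret, so that sp points at the lexical environment of the frame that just returned. The key point is that the argument count is read from the stack, not taken from what the caller originally pushed.

```c
/* sp points at the lexical-environment word left on the stack after the
   callee's ret; higher addresses hold argc and then the arguments.
   Returns the stack pointer as it was before the call sequence began. */
long *cleanup_after_return(long *sp) {
    sp += 1;             /* step 1: move past the lexical environment */
    long argc = *sp++;   /* step 2: pop the argument count found on the stack */
    sp += argc;          /* step 3: drop that many arguments */
    return sp;
}
```

Because pointer arithmetic on long * already advances in whole words, the multiplication by the word size that appears in the assembly version is implicit here.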

  • Chapter 5

    Agenda
    🗸 Intuition about the tail-calls, tail-position, & the tail-call-optimization
    🗸 The tail-position, tail-call
    🗸 The TCO
    🗸 Loops & tail-recursion
    🗸 Annotating the tail-call
    🗸 What TCO code looks like
    🗸 Implementing the TCO
