CS153: Compilers Lecture 3: Assembly ctd. Stephen Chong https://www.seas.harvard.edu/courses/cs153 Contains content from lecture notes by Steve Zdancewic


Announcements

• Office hours started
  • See website for details
• Homework 1 (HellOCaml) due today
  • Recall: at most 3 days' worth of late minutes can be used per homework
  • You have 10 days' worth of late minutes in total
• Homework 2 (X86lite) out today
  • Due Tuesday, Sept 24


Today

• Continue looking at representation of x86 code
  • From previous lecture
• C memory layout
• Calling convention


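X86 Schematic

[Figure: x86 schematic. Processor: registers rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, r08-r15; the instruction pointer RIP; instruction decoder; control; ALU; and flags OF, SF, ZF. Memory: addresses from 0x00000000 to 0xffffffff, with an arrow marking the direction of larger addresses, holding the code & data, heap, and stack regions.]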


3 parts of the C memory model

• The code & data (or “text”) segment
  • Contains compiled code, constant strings, etc.
• The heap
  • Stores dynamically allocated objects
  • Allocated via “malloc”
  • Deallocated via “free”
  • C runtime system
• The stack
  • Stores local variables
  • Stores the return address of a function
• In practice, most languages use this model

[Figure: memory layout showing the stack, code & data, and heap regions, with an arrow marking the direction of larger addresses]
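To make the three regions concrete, here is a minimal C sketch, not taken from the lecture. The names `global_counter`, `make_heap_int`, and `add_on_stack` are hypothetical and chosen only for illustration; the comments show where each object would typically live under the model above.

```c
#include <stdlib.h>

/* Static storage: typically placed in the code & data segment. */
int global_counter = 7;

/* Hypothetical helper: stores one int on the heap.
 * The caller owns the result and must release it with free(). */
int *make_heap_int(int value) {
    int *p = malloc(sizeof *p);   /* heap: allocated via malloc */
    if (p != NULL) {
        *p = value;
    }
    return p;
}

/* Hypothetical helper: local_sum is a local variable, so it lives in this
 * call's stack frame (or a register) and disappears when the call returns. */
int add_on_stack(int a, int b) {
    int local_sum = a + b;
    return local_sum;
}
```

A caller would pair each `make_heap_int` with a matching `free`, whereas the storage for `local_sum` is reclaimed automatically when `add_on_stack` returns.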


Local/Temporary Variable Storage

• Need space to store:
  • Global variables
  • Values passed as arguments to procedures
  • Local variables (either defined in the source program or introduced by the compiler)
• Processors provide two options:
  • Registers: fast, small size (32 or 64 bits), very limited number
  • Memory: slow, very large amount of space (2+ GB)
    • Caching important
• In practice on X86:
  • Registers are limited (and have restrictions)
  • Divide memory into regions including the stack and the heap


Calling Conventions

• Specify the locations (e.g. register or stack) of arguments passed to a function and of values returned by the function
• Designate registers as either:
  • Caller save: freely usable by the called code
  • Callee save: must be restored by the called code
• Define the protocol for deallocating stack-allocated arguments
  • Caller cleans up
  • Callee cleans up (makes variable arguments harder)
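The remark about variable arguments can be illustrated with a small C sketch. It is an illustration only: `sum_ints` is a hypothetical function, not part of the lecture. The callee learns how many arguments it received only at run time, through `count`, while the caller knows the number statically at each call site. That is one reason caller cleanup fits variadic functions more naturally.

```c
#include <stdarg.h>

/* Hypothetical example: sums `count` int arguments.
 * The number of arguments is not fixed in the function's code;
 * it is only discovered at run time through `count`. */
int sum_ints(int count, ...) {
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);   /* fetch the next int argument */
    }
    va_end(args);
    return total;
}
```

For instance, `sum_ints(3, 1, 2, 3)` and `sum_ints(0)` are both valid calls to the same compiled function, even though they pass different numbers of arguments.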


32-bit cdecl calling conventions

• “Standard” on X86 for many C-based operating systems (i.e. almost all)
  • Still some wrinkles about return values (e.g. some compilers use EAX and EDX to return small values)
  • 64-bit allows for packing multiple values in one register
• Arguments are passed on the stack in right-to-left order
• Return value is passed in EAX
• Registers EAX, ECX, EDX are caller save
• Other registers are callee save
• Ignoring these conventions will cause havoc (bus errors or seg faults)


Stack frame example

Call chain: foo → bar → baz → baz → baz, then bar → quux

void foo(...) {
  ...
  bar();
  ...
}

void bar(...) {
  int x, y;
  x = baz();
  ...
  y = quux();
  ...
}

int baz(...) {
  int z;
  ...
  z = baz();
  ...
  return z;
}

int quux(...) {
  ...
  return 42;
}

Stack frame example

[Figure: foo is running. Call chain so far: foo. The stack holds only foo's frame; %ebp and %esp mark that frame.]

Stack frame example

[Figure: foo has called bar. Call chain so far: foo → bar. Stack: foo, bar; %ebp and %esp mark bar's frame.]

Stack frame example

[Figure: bar has called baz. Call chain so far: foo → bar → baz. Stack: foo, bar, baz; %ebp and %esp mark baz's frame.]

Stack frame example

[Figure: baz has called itself. Call chain so far: foo → bar → baz → baz. Stack: foo, bar, baz, baz; %ebp and %esp mark the newest baz frame.]

Stack frame example

[Figure: baz has called itself again. Call chain so far: foo → bar → baz → baz → baz. Stack: foo, bar, baz, baz, baz; %ebp and %esp mark the newest baz frame.]

Stack frame example

[Figure: the innermost baz has returned and its frame has been popped. Stack: foo, bar, baz, baz; %ebp and %esp mark the second baz frame.]

Stack frame example

[Figure: the second baz has returned. Stack: foo, bar, baz; %ebp and %esp mark the remaining baz frame.]

Stack frame example

[Figure: the first baz has returned to bar. Stack: foo, bar; %ebp and %esp mark bar's frame.]

Stack frame example

[Figure: bar has called quux. Call chain so far: foo → bar → baz → baz → baz, then bar → quux. Stack: foo, bar, quux; %ebp and %esp mark quux's frame.]

Stack frame example

[Figure: quux has returned to bar. Stack: foo, bar; %ebp and %esp mark bar's frame.]

Stack frame example

[Figure: bar has returned to foo. The stack holds only foo's frame again; %ebp and %esp mark that frame.]
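The walk-through can be mimicked in plain C by counting live activations. This is a sketch under stated assumptions rather than the lecture's code: the slides elide the function bodies, so the `remaining` parameter that bounds baz's recursion, the `enter`/`leave` helpers, and the counters are all hypothetical additions.

```c
/* Number of activations currently "on the stack", and the deepest point seen. */
int live_frames = 0;
int max_live_frames = 0;

/* Hypothetical bookkeeping: called at function entry and exit. */
void enter(void) {
    live_frames++;
    if (live_frames > max_live_frames) {
        max_live_frames = live_frames;
    }
}

void leave(void) {
    live_frames--;
}

int quux(void) {
    enter();
    leave();
    return 42;
}

/* Assumption: `remaining` bounds the recursion so the example terminates. */
int baz(int remaining) {
    int z = 0;
    enter();
    if (remaining > 0) {
        z = baz(remaining - 1);
    }
    leave();
    return z;
}

void bar(void) {
    enter();
    int x = baz(2);   /* three nested baz activations, as in the slides */
    int y = quux();   /* runs after every baz frame has been popped */
    (void)x;
    (void)y;
    leave();
}

void foo(void) {
    enter();
    bar();
    leave();
}
```

After a call to `foo()`, `max_live_frames` records the deepest point of the chain (foo, bar, and three baz activations) and `live_frames` is back to zero, mirroring how every frame is popped by the time foo finishes.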


Call Stacks: Example

• Stack frame:
  • Function arguments
  • Local variable storage
  • Return address
  • Link (or “frame”) pointer

[Figure: stack frame layout, larger memory addresses at the top]

  Caller stack frame
    ...
    Arguments
    Return address
  Current stack frame
    Old %ebp                             <- %ebp (frame pointer)
    Saved registers + local variables
    Argument build                       <- %esp (stack pointer)


Call Stacks: Caller’s protocol

• Function call: f(e1, e2, …, en);
  1. Save caller-save registers.
  2. Evaluate e1 to v1, e2 to v2, …, en to vn.
  3. Push vn down to v1 onto the top of the stack.
  4. Use Call to jump to the code for f, pushing the return address onto the stack.
• Invariant: the returned value is passed in EAX.
• After the call:
  1. Clean up the pushed arguments by popping the stack.
  2. Restore caller-save registers.

[Figure: stack frame layout, larger memory addresses at the top. The caller's stack frame ends with the arguments and the return address. Below it come old %ebp (where %ebp, the frame pointer, points), saved registers and local variables, and the argument build area (where %esp, the stack pointer, points).]


Call Stacks: Callee’s protocol

• On entry:
  1. Save old frame pointer (EBP is callee save).
  2. Create new frame pointer: Mov(Esp, Ebp).
  3. Allocate stack space for local variables.
• Invariants (assuming word-size values):
  • Function argument n is located at EBP + (1 + n) * 4
  • Local variable j is located at EBP - j * 4
• On exit:
  1. Pop local storage.
  2. Restore EBP.

[Figure: stack frame layout, larger memory addresses at the top. The caller's stack frame ends with the arguments and the return address. Below it come old %ebp (where %ebp, the frame pointer, points), saved registers and local variables, and the argument build area (where %esp, the stack pointer, points).]
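The caller's and callee's protocols, and the two offset invariants, can be checked on a toy model of a 32-bit stack. The sketch below is an illustration under stated assumptions, not the course's X86lite simulator: the `Machine` type and every function name are hypothetical, addresses are byte offsets into a small array, and caller-save and callee-save registers other than EBP are left out.

```c
#include <stdint.h>

#define STACK_WORDS 64

/* Hypothetical toy machine: a small word-addressed stack plus ESP and EBP. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t esp;   /* byte address of the top of the stack */
    uint32_t ebp;   /* byte address of the current frame base */
} Machine;

void machine_init(Machine *m) {
    for (int i = 0; i < STACK_WORDS; i++) {
        m->mem[i] = 0;
    }
    m->esp = STACK_WORDS * 4;   /* empty stack: ESP just past the highest word */
    m->ebp = m->esp;
}

void push(Machine *m, uint32_t v) {
    m->esp -= 4;                /* the stack grows toward smaller addresses */
    m->mem[m->esp / 4] = v;
}

uint32_t pop(Machine *m) {
    uint32_t v = m->mem[m->esp / 4];
    m->esp += 4;
    return v;
}

uint32_t load(const Machine *m, uint32_t addr) {
    return m->mem[addr / 4];
}

/* Caller: push vn down to v1, then the return address (as Call would). */
void call_with_args(Machine *m, const uint32_t *args, int n, uint32_t ret_addr) {
    for (int i = n - 1; i >= 0; i--) {
        push(m, args[i]);
    }
    push(m, ret_addr);
}

/* Callee entry: save old EBP, set the new EBP, allocate the locals. */
void prologue(Machine *m, int num_locals) {
    push(m, m->ebp);
    m->ebp = m->esp;
    m->esp -= 4 * (uint32_t)num_locals;
}

/* Invariant: argument n (counting from 1) is at EBP + (1 + n) * 4. */
uint32_t argument(const Machine *m, int n) {
    return load(m, m->ebp + (1 + (uint32_t)n) * 4);
}

/* Invariant: local variable j (counting from 1) is at EBP - j * 4. */
uint32_t *local_slot(Machine *m, int j) {
    return &m->mem[(m->ebp - (uint32_t)j * 4) / 4];
}

/* Callee exit: pop locals, restore EBP, then pop the return address. */
uint32_t epilogue(Machine *m) {
    m->esp = m->ebp;
    m->ebp = pop(m);
    return pop(m);
}
```

After `epilogue` returns, the caller would finish its side of the protocol by adding 4 bytes per pushed argument to `esp`, which puts the stack pointer back where it was before the call.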


Example: swap

void swap(int *xp, int *yp) {
  int t0 = *xp;
  int t1 = *yp;
  *xp = t1;
  *yp = t0;
}

/* Global vars */
int zip1 = 15213;
int zip2 = 91125;

void call_swap() {
  swap(&zip1, &zip2);
}

call_swap:
  ...
  pushl $zip2   # Push args
  pushl $zip1   # on stack
  call swap     # Do the call
  ...

Example: swap

[Figure: stack in call_swap before the arguments are pushed; larger addresses at the top]

  ...              <- %esp   (%ebp also refers to call_swap's frame)

Example: swap

[Figure: stack after pushl $zip2]

  ...
  $zip2            <- %esp

Example: swap

[Figure: stack after pushl $zip1]

  ...
  $zip2
  $zip1            <- %esp

Example: swap

[Figure: stack after call swap, which pushes the return address]

  ...
  $zip2
  $zip1
  Return address   <- %esp


Code for swap

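void swap(int *xp, int *yp) {
  int t0 = *xp;
  int t1 = *yp;
  *xp = t1;
  *yp = t0;
}

swap:
  # Set up
  pushl %ebp              # save old frame pointer
  movl %esp,%ebp          # create new frame pointer
  pushl %ebx              # save callee-save register %ebx

  # Body
  movl 12(%ebp),%ecx      # %ecx = yp
  movl 8(%ebp),%edx       # %edx = xp
  movl (%ecx),%eax        # %eax = *yp (t1)
  movl (%edx),%ebx        # %ebx = *xp (t0)
  movl %eax,(%edx)        # *xp = t1
  movl %ebx,(%ecx)        # *yp = t0

  # Finish
  movl -4(%ebp),%ebx      # restore %ebx
  movl %ebp,%esp          # pop local storage
  popl %ebp               # restore old frame pointer
  ret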

Swap setup

Set up:
  pushl %ebp
  movl %esp,%ebp
  pushl %ebx

[Figure: stack entering swap, before any set-up instruction has run; the resulting stack is identical]

  ...
  $zip2
  $zip1
  Return address   <- %esp   (%ebp still refers to the caller's frame)

Swap setup

[Figure: resulting stack after pushl %ebp]

  ...
  $zip2
  $zip1
  Return address
  Old %ebp         <- %esp   (%ebp still refers to the caller's frame)

Swap setup

[Figure: resulting stack after movl %esp,%ebp]

  ...
  $zip2
  $zip1
  Return address
  Old %ebp         <- %esp, %ebp

Swap setup

[Figure: resulting stack after pushl %ebx]

  ...
  $zip2
  $zip1
  Return address
  Old %ebp         <- %ebp
  Old %ebx         <- %esp

Swap body

Body:
  movl 12(%ebp),%ecx
  movl 8(%ebp),%edx
  movl (%ecx),%eax
  movl (%edx),%ebx
  movl %eax,(%edx)
  movl %ebx,(%ecx)

[Figure: stack during the body, with offsets relative to %ebp]

  Offset   Contents
      12   $zip2
       8   $zip1
       4   Return address
       0   Old %ebp         <- %ebp
      -4   Old %ebx         <- %esp

Swap finish

Finish:
  movl -4(%ebp),%ebx
  movl %ebp,%esp
  popl %ebp
  ret

[Figure: stack at end of swap body]

  ...
  $zip2
  $zip1
  Return address
  Old %ebp         <- %ebp
  Old %ebx         <- %esp

Swap finish

[Figure: stack at end of swap body next to the resulting stack; both show the same contents]

  ...
  $zip2
  $zip1
  Return address
  Old %ebp         <- %ebp
  Old %ebx         <- %esp

Swap finish

[Figure: resulting stack after movl -4(%ebp),%ebx; stack contents and pointers are unchanged]

Restores old value of %ebx!

Swap finish

[Figure: resulting stack after movl %ebp,%esp]

  ...
  $zip2
  $zip1
  Return address
  Old %ebp         <- %esp, %ebp

Swap finish

[Figure: resulting stack after popl %ebp]

  ...
  $zip2
  $zip1
  Return address   <- %esp   (%ebp again refers to the caller's frame)

Swap finish

[Figure: resulting stack after ret, which pops the return address]

  ...
  $zip2
  $zip1            <- %esp   (%ebp refers to the caller's frame)

Stack frame cheat sheet

foo() {
  bar(38,42);
}

bar(int arg1, int arg2) {
  int local1, local2;
  ...
  baz(55,77);
}

baz(int arg1, int arg2) {
  ...
}

[Figure: stack while bar calls baz, with addresses relative to bar's %ebp; larger addresses at the top]

  Address     Contents         Value
  12(%ebp)    arg2             42
  8(%ebp)     arg1             38
  4(%ebp)     return address
  (%ebp)      old %ebp
  -4(%ebp)    local1
  -8(%ebp)    local2
              next_arg2        77
              next_arg1        55
              return address
              old %ebp


X86-64 SYSTEM V AMD 64 ABI

• More modern variant of C calling conventions
  • Used on Linux, Solaris, BSD, OS X
• Callee save: rbp, rbx, r12-r15
• Caller save: all others
• Parameters 1 .. 6 go in: rdi, rsi, rdx, rcx, r8, r9
• Parameters 7+ go on the stack (in right-to-left order)
  • So: for n > 6, the nth argument is located at ((n-7)+2)*8 + rbp
• Return value: in rax
• 128-byte “red zone”: scratch pad for the callee's data
  • The 128 bytes beyond rsp (at lower addresses) are considered reserved memory
  • Not modified by signal or interrupt handlers
  • Callee can use this for temporary data not needed across function calls
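The parameter rules above can be written down as two small C helpers. This is a sketch with hypothetical names (`int_arg_register`, `stack_arg_offset`). It assumes integer or pointer arguments only, since the ABI passes floating-point arguments in separate registers, and it assumes the callee has set up rbp as a frame pointer in the usual way.

```c
#include <stddef.h>

/* Hypothetical helper: register holding the n-th (counting from 1)
 * integer/pointer parameter, or NULL if that parameter is on the stack. */
const char *int_arg_register(int n) {
    static const char *const regs[] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
    if (n >= 1 && n <= 6) {
        return regs[n - 1];
    }
    return NULL;
}

/* Hypothetical helper: byte offset from rbp of the n-th parameter for n > 6,
 * using the slide's formula ((n-7)+2)*8. Returns -1 for n <= 6, where the
 * parameter is in a register instead. */
long stack_arg_offset(int n) {
    if (n > 6) {
        return ((long)(n - 7) + 2) * 8;
    }
    return -1;
}
```

With these definitions the seventh parameter sits at 16(%rbp), just above the saved rbp and the return address, and each later parameter is 8 bytes further up.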
