Intermediate Code Generation CS308 Compiler Theory1.


Intermediate Code Generation

• Intermediate codes are machine-independent, but they are close to machine instructions.

• The intermediate code generator converts the given source-language program into an equivalent program in an intermediate language.

• Many different languages can serve as the intermediate language; the compiler designer decides which one to use.
– syntax trees can be used as an intermediate language.
– postfix notation can be used as an intermediate language.
– three-address code (quadruples) can be used as an intermediate language.
  • we will use quadruples to discuss intermediate code generation
  • quadruples are close to machine instructions, but they are not actual machine instructions.
– some programming languages have well-defined intermediate languages.
  • Java – Java Virtual Machine
  • Prolog – Warren Abstract Machine
  • In fact, there are byte-code emulators to execute instructions in these intermediate languages.

Three-Address Code (Quadruples)

• A quadruple is:

x := y op z

where x, y and z are names, constants or compiler-generated temporaries; op is any operator.

• We may also use the following notation for quadruples (it looks more like a machine-code instruction):

op y,z,x

apply operator op to y and z, and store the result in x.

• We use the term “three-address code” because each statement usually contains three addresses (two for operands, one for the result).
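One possible in-memory representation of a quadruple is sketched below in Python. The class name Quad and its field names are assumptions made for illustration; the slides only fix the textual notation op y,z,x.

```python
from dataclasses import dataclass


@dataclass
class Quad:
    """One three-address statement: apply op to arg1 and arg2, store in result."""
    op: str
    arg1: str = ""
    arg2: str = ""
    result: str = ""

    def __str__(self) -> str:
        # Render in the "op y,z,x" notation; unused fields stay empty.
        return f"{self.op} {self.arg1},{self.arg2},{self.result}"


print(Quad("add", "y", "z", "x"))    # binary operator
print(Quad("uminus", "a", "", "c"))  # unary operator: second operand is empty
```

Keeping the four fields separate (rather than storing the printed text) makes it easy to fill in a jump target later, which the backpatching slides rely on.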

Three-Address Statements

Binary Operator: op y,z,result or result := y op z
where op is a binary arithmetic or logical operator. This binary operator is applied to y and z, and the result of the operation is stored in result.

Ex: add a,b,c

gt a,b,c

addr a,b,c

addi a,b,c

Unary Operator: op y,,result or result := op y
where op is a unary arithmetic or logical operator. This unary operator is applied to y, and the result of the operation is stored in result.

Ex: uminus a,,c

not a,,c

inttoreal a,,c

Three-Address Statements (cont.)

Move Operator: mov y,,result or result := y
where the content of y is copied into result.

Ex: mov a,,c

movi a,,c

movr a,,c

Unconditional Jumps: jmp ,,L or goto L
We will jump to the three-address code with the label L, and the execution continues from that statement.

Ex: jmp ,,L1 // jump to L1

jmp ,,7 // jump to the statement 7

Three-Address Statements (cont.)

Conditional Jumps: jmprelop y,z,L or if y relop z goto L
We will jump to the three-address code with the label L if the result of y relop z is true, and the execution continues from that statement. If the result is false, the execution continues from the statement following this conditional jump statement.

Ex: jmpgt y,z,L1 // jump to L1 if y>z

jmpgte y,z,L1 // jump to L1 if y>=z

jmpe y,z,L1 // jump to L1 if y==z

jmpne y,z,L1 // jump to L1 if y!=z

Our relational operator can also be a unary operator.

jmpnz y,,L1 // jump to L1 if y is not zero

jmpz y,,L1 // jump to L1 if y is zero

jmpt y,,L1 // jump to L1 if y is true

jmpf y,,L1 // jump to L1 if y is false

Three-Address Statements (cont.)

Procedure Parameters: param x,, or param x

Procedure Calls: call p,n, or call p,n
where x is an actual parameter; we invoke the procedure p with n parameters.

Ex: p(x1,...,xn)

param x1,,
param x2,,
. . .
param xn,,
call p,n,

Ex: f(x+1,y)

add x,1,t1
param t1,,
param y,,
call f,2,


Three-Address Statements (cont.)

Indexed Assignments:

move y[i],,x or x := y[i]

move x,,y[i] or y[i] := x

Address and Pointer Assignments:

moveaddr y,,x or x := &y

movecont y,,x or x := *y

Syntax-Directed Translation into Three-Address Code

S → id := E
    S.code = E.code || gen(‘mov’ E.place ‘,,’ id.place)

E → E1 + E2
    E.place = newtemp();
    E.code = E1.code || E2.code || gen(‘add’ E1.place ‘,’ E2.place ‘,’ E.place)

E → E1 * E2
    E.place = newtemp();
    E.code = E1.code || E2.code || gen(‘mult’ E1.place ‘,’ E2.place ‘,’ E.place)

E → - E1
    E.place = newtemp();
    E.code = E1.code || gen(‘uminus’ E1.place ‘,,’ E.place)

E → ( E1 )
    E.place = E1.place;
    E.code = E1.code

E → id
    E.place = id.place;
    E.code = null
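The rules above can be sketched as a small recursive translator in Python. This is only an illustration under assumptions: expressions arrive as nested tuples such as ("+", left, right) and ("neg", operand), identifiers are plain strings, and the helper names make_translator, expr and assign are invented here. Each call returns the pair (place, code), mirroring the E.place and E.code attributes.

```python
import itertools


def make_translator():
    counter = itertools.count(1)

    def newtemp():
        return f"t{next(counter)}"

    def expr(node):
        """Return (place, code) for an expression node."""
        if isinstance(node, str):           # E -> id
            return node, []
        if node[0] == "neg":                # E -> - E1
            p1, c1 = expr(node[1])
            t = newtemp()
            return t, c1 + [f"uminus {p1},,{t}"]
        op, left, right = node              # E -> E1 + E2 | E1 * E2
        p1, c1 = expr(left)
        p2, c2 = expr(right)
        t = newtemp()
        name = {"+": "add", "*": "mult"}[op]
        return t, c1 + c2 + [f"{name} {p1},{p2},{t}"]

    def assign(target, node):               # S -> id := E
        place, code = expr(node)
        return code + [f"mov {place},,{target}"]

    return assign


assign = make_translator()
code = assign("a", ("+", ("*", "b", ("neg", "c")), "d"))   # a := b * -c + d
print("\n".join(code))
```

For a := b * -c + d the sketch produces uminus, mult, add and mov in that order, which is the concatenation order E1.code || E2.code || gen(...) prescribed by the rules.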

Syntax-Directed Translation (cont.)

S → while E do S1
    S.begin = newlabel();
    S.after = newlabel();
    S.code = gen(S.begin ‘:’) || E.code ||
             gen(‘jmpf’ E.place ‘,,’ S.after) || S1.code ||
             gen(‘jmp’ ‘,,’ S.begin) ||
             gen(S.after ‘:’)

S → if E then S1 else S2
    S.else = newlabel();
    S.after = newlabel();
    S.code = E.code ||
             gen(‘jmpf’ E.place ‘,,’ S.else) || S1.code ||
             gen(‘jmp’ ‘,,’ S.after) ||
             gen(S.else ‘:’) || S2.code ||
             gen(S.after ‘:’)

Translation Scheme to Produce Three-Address Code

S → id := E
    { p = lookup(id.name);
      if (p is not nil) then emit(‘mov’ E.place ‘,,’ p)
      else error(“undefined-variable”) }

E → E1 + E2
    { E.place = newtemp();
      emit(‘add’ E1.place ‘,’ E2.place ‘,’ E.place) }

E → E1 * E2
    { E.place = newtemp();
      emit(‘mult’ E1.place ‘,’ E2.place ‘,’ E.place) }

E → - E1
    { E.place = newtemp();
      emit(‘uminus’ E1.place ‘,,’ E.place) }

E → ( E1 )
    { E.place = E1.place; }

E → id
    { p = lookup(id.name);
      if (p is not nil) then E.place = id.place
      else error(“undefined-variable”) }

Translation Scheme with Locations

S → id := { E.inloc = S.inloc } E
    { p = lookup(id.name);
      if (p is not nil) then { emit(E.outloc ‘mov’ E.place ‘,,’ p); S.outloc=E.outloc+1 }
      else { error(“undefined-variable”); S.outloc=E.outloc } }

E → { E1.inloc = E.inloc } E1 + { E2.inloc = E1.outloc } E2
    { E.place = newtemp(); emit(E2.outloc ‘add’ E1.place ‘,’ E2.place ‘,’ E.place); E.outloc=E2.outloc+1 }

E → { E1.inloc = E.inloc } E1 * { E2.inloc = E1.outloc } E2
    { E.place = newtemp(); emit(E2.outloc ‘mult’ E1.place ‘,’ E2.place ‘,’ E.place); E.outloc=E2.outloc+1 }

E → - { E1.inloc = E.inloc } E1
    { E.place = newtemp(); emit(E1.outloc ‘uminus’ E1.place ‘,,’ E.place); E.outloc=E1.outloc+1 }

E → ( { E1.inloc = E.inloc } E1 )
    { E.place = E1.place; E.outloc=E1.outloc }   // no instruction is emitted, so the location does not advance

E → id
    { E.outloc = E.inloc; p = lookup(id.name);
      if (p is not nil) then E.place = id.place
      else error(“undefined-variable”) }

Boolean Expressions

E → { E1.inloc = E.inloc } E1 and { E2.inloc = E1.outloc } E2
    { E.place = newtemp(); emit(E2.outloc ‘and’ E1.place ‘,’ E2.place ‘,’ E.place); E.outloc=E2.outloc+1 }

E → { E1.inloc = E.inloc } E1 or { E2.inloc = E1.outloc } E2
    { E.place = newtemp(); emit(E2.outloc ‘or’ E1.place ‘,’ E2.place ‘,’ E.place); E.outloc=E2.outloc+1 }

E → not { E1.inloc = E.inloc } E1
    { E.place = newtemp(); emit(E1.outloc ‘not’ E1.place ‘,,’ E.place); E.outloc=E1.outloc+1 }

E → { E1.inloc = E.inloc } E1 relop { E2.inloc = E1.outloc } E2
    { E.place = newtemp();
      emit(E2.outloc relop.code E1.place ‘,’ E2.place ‘,’ E.place); E.outloc=E2.outloc+1 }

Translation Scheme (cont.)

S → while { E.inloc = S.inloc } E do
    { emit(E.outloc ‘jmpf’ E.place ‘,,’ ‘NOTKNOWN’);
      S1.inloc=E.outloc+1; }
    S1
    { emit(S1.outloc ‘jmp’ ‘,,’ S.inloc);
      S.outloc=S1.outloc+1;
      backpatch(E.outloc,S.outloc); }

S → if { E.inloc = S.inloc } E then
    { emit(E.outloc ‘jmpf’ E.place ‘,,’ ‘NOTKNOWN’);
      S1.inloc=E.outloc+1; }
    S1 else
    { emit(S1.outloc ‘jmp’ ‘,,’ ‘NOTKNOWN’);
      S2.inloc=S1.outloc+1;
      backpatch(E.outloc,S2.inloc); }
    S2
    { S.outloc=S2.outloc;
      backpatch(S1.outloc,S.outloc); }


Three Address Codes - Example

x:=1;

y:=x+10;

while (x<y) { x:=x+1;

if (x%2==1) then y:=y+1;

else y:=y-2;

}

01: mov 1,,x

02: add x,10,t1

03: mov t1,,y

04: lt x,y,t2

05: jmpf t2,,17

06: add x,1,t3

07: mov t3,,x

08: mod x,2,t4

09: eq t4,1,t5

10: jmpf t5,,14

11: add y,1,t6

12: mov t6,,y

13: jmp ,,16

14: sub y,2,t7

15: mov t7,,y

16: jmp ,,4

17:
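To check that the listing above behaves like the source program, the quadruples can be run by a tiny interpreter. The sketch below is illustrative only: the tuple encoding (op, y, z, result) and the function name run are assumptions, and it supports just the operators used in this example.

```python
def run(quads):
    """Execute (op, y, z, result) quadruples; statement numbers start at 1."""
    env = {}

    def val(a):
        # Integer operands are constants; anything else is a variable name.
        return a if isinstance(a, int) else env[a]

    binops = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mod": lambda a, b: a % b,
        "lt": lambda a, b: a < b,
        "eq": lambda a, b: a == b,
    }
    pc = 1
    while pc <= len(quads):
        op, y, z, res = quads[pc - 1]
        if op == "mov":
            env[res] = val(y)
        elif op == "jmp":
            pc = res
            continue
        elif op == "jmpf":
            if not val(y):
                pc = res
                continue
        else:
            env[res] = binops[op](val(y), val(z))
        pc += 1
    return env


program = [
    ("mov", 1, None, "x"),       # 01
    ("add", "x", 10, "t1"),      # 02
    ("mov", "t1", None, "y"),    # 03
    ("lt", "x", "y", "t2"),      # 04
    ("jmpf", "t2", None, 17),    # 05
    ("add", "x", 1, "t3"),       # 06
    ("mov", "t3", None, "x"),    # 07
    ("mod", "x", 2, "t4"),       # 08
    ("eq", "t4", 1, "t5"),       # 09
    ("jmpf", "t5", None, 14),    # 10
    ("add", "y", 1, "t6"),       # 11
    ("mov", "t6", None, "y"),    # 12
    ("jmp", None, None, 16),     # 13
    ("sub", "y", 2, "t7"),       # 14
    ("mov", "t7", None, "y"),    # 15
    ("jmp", None, None, 4),      # 16
]
env = run(program)
print(env["x"], env["y"])        # final values of x and y
```

Tracing the source loop by hand gives x = 8 and y = 6 when the condition x<y first fails, and the interpreted quadruples end in the same state.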

Arrays

• Elements of arrays can be accessed quickly if the elements are stored in a block of consecutive locations.

A one-dimensional array A:

[Figure: array A stored in consecutive locations, marked with baseA, low, i and width]

baseA is the address of the first location of the array A,

width is the width of each array element.

low is the index of the first array element

location of A[i]: baseA+(i-low)*width

Arrays (cont.)

baseA+(i-low)*width

can be re-written as i*width + (baseA-low*width)

where i*width should be computed at run-time and (baseA-low*width) can be computed at compile-time.

• So, the location of A[i] can be computed at run-time by evaluating the formula i*width+c, where c is (baseA-low*width), which is evaluated at compile-time.

• The intermediate code generator should produce the code to evaluate this formula i*width+c (one multiplication and one addition operation).


Two-Dimensional Arrays

• A two-dimensional array can be stored in

– either row-major (row-by-row) or

– column-major (column-by-column).

• Most of the programming languages use row-major method.

• Row-major representation of a two-dimensional array:

row1 | row2 | . . . | rown

Two-Dimensional Arrays (cont.)

• The location of A[i1,i2] is

baseA + ((i1-low1)*n2 + i2-low2)*width

baseA is the location of the array A.

low1 is the index of the first row

low2 is the index of the first column

n2 is the number of elements in each row

width is the width of each array element

• Again, this formula can be re-written as

((i1*n2)+i2)*width + (baseA-((low1*n2)+low2)*width)

where ((i1*n2)+i2)*width should be computed at run-time and (baseA-((low1*n2)+low2)*width) can be computed at compile-time.

Multi-Dimensional Arrays

• In general, the location of A[i1,i2,...,ik] is

(( ... ((i1*n2)+i2) ...)*nk+ik)*width + (baseA-((...((low1*n2)+low2)...)*nk+lowk)*width)

• So, the intermediate code generator should produce the code to evaluate the following formula (to find the location of A[i1,i2,...,ik]):

(( ... ((i1*n2)+i2) ...)*nk+ik)*width + c

• To evaluate the (( ... ((i1*n2)+i2) ...)*nk+ik portion of this formula, we can use the recurrence equation:

e1 = i1

em = e(m-1) * nm + im
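The split between compile-time and run-time work can be checked numerically. The Python sketch below is an illustration, not part of the slides: the function names and the base address 1000 are made up, and the array is the 1..10 x 1..20 integer array used in a later example.

```python
def constant_part(base, lows, dims, width):
    """Compile-time part: c = base - ((...(low1*n2 + low2)...)*nk + lowk) * width."""
    e = lows[0]
    for low, n in zip(lows[1:], dims[1:]):
        e = e * n + low
    return base - e * width


def element_address(indices, dims, width, c):
    """Run-time part: e1 = i1, em = e(m-1)*nm + im, then ek*width + c."""
    e = indices[0]
    for i, n in zip(indices[1:], dims[1:]):
        e = e * n + i
    return e * width + c


# int array A : 1..10 x 1..20, width 4, hypothetical base address 1000
dims, lows, width, base = [10, 20], [1, 1], 4, 1000
c = constant_part(base, lows, dims, width)
addr = element_address([3, 5], dims, width, c)
direct = base + ((3 - lows[0]) * dims[1] + (5 - lows[1])) * width
print(c, addr, direct)
```

Both routes give the same address for A[3,5], which is the point of the rewrite: only the index recurrence and one multiplication by width remain to be done at run-time.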

Translation Scheme for Arrays

• If we use the following grammar to calculate addresses of array elements, we need inherited attributes.

L → id | id [ Elist ]

Elist → Elist , E | E

• Instead of this grammar, we will use the following grammar to calculate addresses of array elements so that we do not need inherited attributes (we will use only synthesized attributes).

L → id | Elist ]

Elist → Elist , E | id [ E

Translation Scheme for Arrays (cont.)

S → L := E
    { if (L.offset is null) emit(‘mov’ E.place ‘,,’ L.place)
      else emit(‘mov’ E.place ‘,,’ L.place ‘[‘ L.offset ‘]’) }

E → E1 + E2
    { E.place = newtemp();
      emit(‘add’ E1.place ‘,’ E2.place ‘,’ E.place) }

E → ( E1 )
    { E.place = E1.place; }

E → L
    { if (L.offset is null) E.place = L.place
      else { E.place = newtemp();
             emit(‘mov’ L.place ‘[‘ L.offset ‘]’ ‘,,’ E.place) } }

Translation Scheme for Arrays (cont.)

L → id
    { L.place = id.place; L.offset = null; }

L → Elist ]
    { L.place = newtemp(); L.offset = newtemp();
      emit(‘mov’ c(Elist.array) ‘,,’ L.place);
      emit(‘mult’ Elist.place ‘,’ width(Elist.array) ‘,’ L.offset) }

Elist → Elist1 , E
    { Elist.array = Elist1.array; Elist.place = newtemp(); Elist.ndim = Elist1.ndim + 1;
      emit(‘mult’ Elist1.place ‘,’ limit(Elist.array,Elist.ndim) ‘,’ Elist.place);
      emit(‘add’ Elist.place ‘,’ E.place ‘,’ Elist.place); }

Elist → id [ E
    { Elist.array = id.place; Elist.place = E.place; Elist.ndim = 1; }

Translation Scheme for Arrays – Example1

• A one-dimensional double array A : 5..100

n1=96 width=8 (double) low1=5

• Intermediate codes corresponding to x := A[y]

mov c,,t1 // where c=baseA-(5)*8
mult y,8,t2
mov t1[t2],,t3
mov t3,,x

Translation Scheme for Arrays – Example2

• A two-dimensional int array A : 1..10x1..20

n1=10 n2=20 width=4 (integers) low1=1 low2=1

• Intermediate codes corresponding to x := A[y,z]

mult y,20,t1
add t1,z,t1
mov c,,t2 // where c=baseA-(1*20+1)*4
mult t1,4,t3
mov t2[t3],,t4
mov t4,,x

Translation Scheme for Arrays – Example3

• A three-dimensional int array A : 0..9x0..19x0..29

n1=10 n2=20 n3=30 width=4 (integers) low1=0 low2=0 low3=0

• Intermediate codes corresponding to x := A[w,y,z]

mult w,20,t1
add t1,y,t1
mult t1,30,t2
add t2,z,t2
mov c,,t3 // where c=baseA-((0*20+0)*30+0)*4
mult t2,4,t4
mov t3[t4],,t5
mov t5,,x

Test Yourself

for i:=1 to M do
    for j:=1 to N do
        A[i,j]:=B[i,j]

Assume that the lower bound of every array dimension is 1, that each array element has width 1, and that arrays are stored in row-major order. Write the quadruple intermediate code.

boolean expressions

boolean expressions serve two purposes:
• compute a logical value
• used as conditional expressions in flow-of-control statements

E → E || E | E && E | ! E | ( E )
  | E relop E | true | false

methods of translating boolean expressions

• numerical computation, e.g. 1 denotes true, 0 denotes false

• position in a program (flow of control), e.g. in if-then-else statements

numerical representation – complete evaluation

• E → E1 || E2
    { E.place := newtemp;
      emit (E.place, ‘:=’, E1.place, ‘or’, E2.place) }

• E → id1 relop id2
    { E.place := newtemp;
      emit (‘if’, id1.place, relop.op, id2.place, ‘goto’, nextstat+3 );
      emit (E.place, ‘:=’, ‘0’ );
      emit (‘goto’, nextstat + 2 );
      emit (E.place, ‘:=’, ‘1’ ) }

translation of a<b || c<d && e<f

100: if a < b goto 103
101: t1 := 0
102: goto 104
103: t1 := 1
104: if c < d goto 107
105: t2 := 0
106: goto 108
107: t2 := 1
108: if e < f goto 111
109: t3 := 0
110: goto 112
111: t3 := 1
112: t4 := t2 and t3
113: t5 := t1 or t4

flow of control – short-circuit evaluation

• E → E1 || E2
    { E1.true := E.true; /* E.true: the label to which control flows if E is true */
      E1.false := newlabel();
      E2.true := E.true;
      E2.false := E.false;
      E.code := E1.code || gen(E1.false, ‘:’) || E2.code }

• E → E1 && E2
    { E1.true := newlabel();
      E1.false := E.false;
      E2.true := E.true;
      E2.false := E.false;
      E.code := E1.code || gen(E1.true, ‘:’) || E2.code }

• E → ! E1
    { E1.true := E.false;
      E1.false := E.true;
      E.code := E1.code }

• E → E1 relop E2
    { E.code := E1.code || E2.code ||
                gen(‘if’, E1.place, relop.op, E2.place, ‘goto’, E.true) ||
                gen(‘goto’, E.false) }

• E → true { E.code := gen(‘goto’, E.true) }

• E → false { E.code := gen(‘goto’, E.false) }


translation of a<b or c<d and e<f

if a < b goto Ltrue

goto L1

L1: if c < d goto L2

goto Lfalse

L2: if e < f goto Ltrue

goto Lfalse

There are redundant goto statements!
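The short-circuit rules can be tried out with a small Python sketch. It is an illustration under assumptions: expressions are nested tuples such as ("or", e1, e2) and ("rel", left, op, right), the inherited attributes E.true and E.false become function parameters, and the name jumping_code is invented here. Labels are printed on their own lines.

```python
import itertools


def jumping_code(expr, true_label, false_label, newlabel):
    """Return three-address lines for a boolean expr, given its true/false exits."""
    kind = expr[0]
    if kind == "or":                        # E -> E1 || E2
        _, e1, e2 = expr
        e1_false = newlabel()
        return (jumping_code(e1, true_label, e1_false, newlabel)
                + [f"{e1_false}:"]
                + jumping_code(e2, true_label, false_label, newlabel))
    if kind == "and":                       # E -> E1 && E2
        _, e1, e2 = expr
        e1_true = newlabel()
        return (jumping_code(e1, e1_true, false_label, newlabel)
                + [f"{e1_true}:"]
                + jumping_code(e2, true_label, false_label, newlabel))
    if kind == "not":                       # E -> ! E1 swaps the two exits
        return jumping_code(expr[1], false_label, true_label, newlabel)
    _, left, op, right = expr               # E -> E1 relop E2
    return [f"if {left} {op} {right} goto {true_label}", f"goto {false_label}"]


counter = itertools.count(1)
newlabel = lambda: f"L{next(counter)}"
expr = ("or", ("rel", "a", "<", "b"),
              ("and", ("rel", "c", "<", "d"), ("rel", "e", "<", "f")))
lines = jumping_code(expr, "Ltrue", "Lfalse", newlabel)
print("\n".join(lines))
```

The output reproduces the listing above, including the goto L1 that jumps to the very next line; that is the redundancy the fall-through technique later removes.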

translation of flow-of-control statements

S → if (E) S1
  | if (E) S1 else S2
  | while (E) S1
  | S1 S2

S.next : the label that is attached to the first three-address code to be executed after the code for S

code for flow-of-control statements

(a) if-then

          E.code          (jumps to E.true or to E.false)
E.true:   S1.code
E.false:  . . .

(b) if-then-else

          E.code          (jumps to E.true or to E.false)
E.true:   S1.code
          goto S.next
E.false:  S2.code
S.next:   . . .

(c) while-do

S.begin:  E.code          (jumps to E.true or to E.false)
E.true:   S1.code
          goto S.begin
E.false:  . . .

S → if (E) S1
    { E.true := newlabel(); E.false := S.next; S1.next := S.next;
      S.code := E.code || gen(E.true, ‘:’) || S1.code }

S → if (E) S1 else S2
    { E.true := newlabel(); E.false := newlabel();
      S1.next := S.next;
      S2.next := S.next;
      S.code := E.code || gen(E.true, ‘:’) || S1.code || gen(‘goto’, S.next)
                || gen(E.false, ‘:’) || S2.code }

S → while (E) S1
    { S.begin := newlabel(); E.true := newlabel(); E.false := S.next; S1.next := S.begin;
      S.code := gen(S.begin, ‘:’) || E.code || gen(E.true, ‘:’) || S1.code || gen(‘goto’, S.begin) }

S → S1 S2
    { S1.next := newlabel();
      S2.next := S.next;
      S.code := S1.code || gen(S1.next, ‘:’) || S2.code }

Example

if (x < 100 || x > 200 && x != y) x = 0;

=>

    if x < 100 goto L2
    goto L3
L3: if x > 200 goto L4
    goto L1
L4: if x != y goto L2
    goto L1
L2: x = 0
L1:

How to generate more efficient code

    if x > 200 goto L4
    goto L1
L4: …

can be replaced by

    if x <= 200 goto L1
L4: … (fall through)

S → if (E) S1
    { E.true := fall; // not newlabel
      E.false := S.next;
      S1.next := S.next;
      S.code := E.code || S1.code }

For if (E) S else S and while (E) S, likewise set E.true to fall.

Using fall through

E → E1 relop E2
    { test = E1 relop E2
      s = if E.true != fall and E.false != fall then
              gen(‘if’ test ‘goto’, E.true) || gen(‘goto’, E.false)
          else if (E.true != fall) then gen(‘if’ test ‘goto’, E.true)
          else if (E.false != fall) then gen(‘if’ ! test ‘goto’, E.false)
          else ‘’
      E.code := E1.code || E2.code || s }

E → E1 || E2
    { E1.true := if (E.true = fall) newlabel() else E.true;
      E1.false := fall;
      E2.true := E.true;
      E2.false := E.false;
      E.code := if (E.true = fall) then E1.code || E2.code || gen(E1.true, ‘:’)
                else E1.code || E2.code }

E → E1 && E2
    { E1.false := if (E.false = fall) newlabel() else E.false;
      E1.true := fall;
      E2.true := E.true;
      E2.false := E.false;
      E.code := if (E.false = fall) then E1.code || E2.code || gen(E1.false, ‘:’)
                else E1.code || E2.code }
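A Python sketch of the fall-through rules follows. Several things are assumptions made for illustration: FALL is represented by None, expressions are nested tuples, the function name gen_bool is invented, and the negated test is printed by flipping the relational operator (as in the x <= 200 example) rather than by writing ! test.

```python
import itertools

FALL = None   # stands for the special label "fall"

NEGATE = {"<": ">=", ">=": "<", ">": "<=", "<=": ">", "==": "!=", "!=": "=="}


def gen_bool(expr, true_label, false_label, newlabel):
    kind = expr[0]
    if kind == "rel":                        # E -> E1 relop E2
        _, left, op, right = expr
        if true_label is not FALL and false_label is not FALL:
            return [f"if {left} {op} {right} goto {true_label}",
                    f"goto {false_label}"]
        if true_label is not FALL:
            return [f"if {left} {op} {right} goto {true_label}"]
        if false_label is not FALL:
            return [f"if {left} {NEGATE[op]} {right} goto {false_label}"]
        return []
    _, e1, e2 = expr
    if kind == "or":                         # E -> E1 || E2
        e1_true = newlabel() if true_label is FALL else true_label
        code = (gen_bool(e1, e1_true, FALL, newlabel)
                + gen_bool(e2, true_label, false_label, newlabel))
        return code + [f"{e1_true}:"] if true_label is FALL else code
    # kind == "and"                          # E -> E1 && E2
    e1_false = newlabel() if false_label is FALL else false_label
    code = (gen_bool(e1, FALL, e1_false, newlabel)
            + gen_bool(e2, true_label, false_label, newlabel))
    return code + [f"{e1_false}:"] if false_label is FALL else code


counter = itertools.count(1)
newlabel = lambda: f"L{next(counter)}"

# if (x < 100 || x > 200 && x != y) x = 0;   with E.true = fall, E.false = S.next
s_next = newlabel()
cond = ("or", ("rel", "x", "<", "100"),
              ("and", ("rel", "x", ">", "200"), ("rel", "x", "!=", "y")))
lines = gen_bool(cond, FALL, s_next, newlabel) + ["x = 0", f"{s_next}:"]
print("\n".join(lines))
```

Compared with the earlier translation of the same statement, the three unconditional gotos and two of the labels disappear, leaving three conditional jumps.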

case statements

switch E
begin
    case V1: S1
    case V2: S2
    . . .
    case Vn-1: Sn-1
    default: Sn
end

n-way branch

        code to evaluate E to t
        goto test
L1:     code for S1
        goto next
L2:     code for S2
        goto next
        . . .
Ln-1:   code for Sn-1
        goto next
Ln:     code for Sn
        goto next
test:   if t = V1 goto L1
        if t = V2 goto L2
        . . .
        if t = Vn-1 goto Ln-1
        goto Ln
next:

to facilitate case optimization

• we need to provide special IL instructions so that compilers can recognize the case construct and do appropriate optimizations:

case V1 L1
case V2 L2
. . .
case Vn-1 Ln-1
case t Ln
next:

backpatching

allows generation of intermediate code in one pass (the problem with the earlier translation schemes is that they use inherited attributes such as S.next, which are not suitable for implementation in bottom-up parsers)

idea: the labels (in the three-address code) will be filled in when we know the places

• attributes: E.truelist (true exits: the list of true labels), E.falselist (false exits)

• We store all generated intermediate code (assumed to be quadruples) in an instruction array, so a label can be treated as an index into that array

• The variable nextquad holds the index of the next quadruple; emit() generates one quadruple and increments nextquad

three auxiliary functions:

• makelist(i): creates a list containing i (an index into the quadruple array)

• merge(p1,p2): returns the concatenation of lists p1 and p2

• backpatch(p, i): inserts i as the target label of each of the statements on list p
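The three helpers are short enough to write out. The Python sketch below is one possible realization under assumptions: quadruples are stored as lists whose last field is the jump target (None while unknown), indices start at 0, and emit and nextquad are modeled as functions over a module-level array.

```python
quads = []                       # instruction array; a label is an index into it


def emit(*fields):
    """Append one quadruple; the last field is its jump target (None if unknown)."""
    quads.append(list(fields))


def nextquad():
    """Index that the next emitted quadruple will receive."""
    return len(quads)


def makelist(i):
    return [i]


def merge(p1, p2):
    return p1 + p2


def backpatch(p, i):
    """Insert i as the target of every quadruple whose index is on list p."""
    for index in p:
        quads[index][-1] = i


truelist = makelist(nextquad())
emit("if a < b goto", None)
falselist = makelist(nextquad())
emit("goto", None)
backpatch(merge(truelist, falselist), nextquad())
print(quads)
```

Plain Python lists keep the sketch short; a production compiler would more likely chain the unfilled jumps through the target fields themselves so that merge runs in constant time.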

boolean expressions

• for E1 or E2, we know we must evaluate E2 if E1 is false. We use a marker nonterminal M.

E → E1 or M E2
E → E1 and M E2
E → not E1
E → ( E1 )
E → id1 relop id2 | true | false
M → ε

• attribute M.quad records the number of the first statement (quadruple) of E2.code

• E → E1 or M E2
    { backpatch(E1.falselist, M.quad);
      E.truelist := merge(E1.truelist, E2.truelist);
      E.falselist := E2.falselist }

• M → ε { M.quad := nextquad }

• E → E1 and M E2
    { backpatch(E1.truelist, M.quad);
      E.falselist := merge(E1.falselist, E2.falselist);
      E.truelist := E2.truelist }

• E → not E1
    { E.truelist := E1.falselist; E.falselist := E1.truelist; }

• E → ( E1 )
    { E.truelist := E1.truelist; E.falselist := E1.falselist; }

• E → id1 relop id2
    { E.truelist := makelist(nextquad);
      E.falselist := makelist(nextquad+1);
      emit(‘if’ id1.place relop.op id2.place ‘goto _’);
      emit(‘goto _’); }

• E → true { E.truelist := makelist(nextquad); emit(‘goto _’); }

• E → false { E.falselist := makelist(nextquad); emit(‘goto _’); }
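To see these actions at work, the reductions for a < b or c < d and e < f can be replayed by hand in the order a bottom-up parser would perform them. The Python sketch below is illustrative: the class Gen, its method names and the starting index 100 are assumptions, and each reduction returns the pair (truelist, falselist).

```python
class Gen:
    def __init__(self, start=100):
        self.start = start
        self.quads = []                       # each entry: [text, target or None]

    def nextquad(self):
        return self.start + len(self.quads)

    def emit(self, text):
        self.quads.append([text, None])

    def backpatch(self, lst, target):
        for q in lst:
            self.quads[q - self.start][1] = target

    def relop(self, a, op, b):                # E -> id1 relop id2
        t, f = [self.nextquad()], [self.nextquad() + 1]
        self.emit(f"if {a} {op} {b} goto")
        self.emit("goto")
        return t, f

    def or_(self, e1, m_quad, e2):            # E -> E1 or M E2
        self.backpatch(e1[1], m_quad)
        return e1[0] + e2[0], e2[1]

    def and_(self, e1, m_quad, e2):           # E -> E1 and M E2
        self.backpatch(e1[0], m_quad)
        return e2[0], e1[1] + e2[1]

    def listing(self):
        return [f"{self.start + i}: {text} {'_' if tgt is None else tgt}"
                for i, (text, tgt) in enumerate(self.quads)]


g = Gen()
e1 = g.relop("a", "<", "b")
m1 = g.nextquad()                             # M.quad before c < d
e2 = g.relop("c", "<", "d")
m2 = g.nextquad()                             # M.quad before e < f
e3 = g.relop("e", "<", "f")
e = g.or_(e1, m1, g.and_(e2, m2, e3))
print("\n".join(g.listing()))
print(e)
```

After the last reduction only the jumps at 101 and 102 have been filled in; the entries still showing _ are exactly those on the final truelist and falselist, and they are patched once the enclosing statement knows where true and false lead.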

flow-of-control statements

• backpatching can also be used to translate flow-of-control statements in one pass

• The grammar

S → if E then S1
  | if E then S1 else S2
  | while E do S1
  | begin L end
  | A

L → L1 ; S | S


• We use an attribute S.nextlist for a list of jumps to the quadruple following S in execution order. We also define L.nextlist similarly

• the need for .nextlist

if E then S1 else S2

we must use a ‘goto _’ after code for S1 to skip over code for S2

• S → if E then M S1
    { backpatch(E.truelist, M.quad);
      S.nextlist := merge(E.falselist, S1.nextlist) }

• S → if E then M1 S1 N else M2 S2
    { backpatch(E.truelist, M1.quad);
      backpatch(E.falselist, M2.quad);
      S.nextlist := merge(S1.nextlist, N.nextlist, S2.nextlist) }

N → ε { N.nextlist := makelist(nextquad); emit(‘goto _’); }

M → ε { M.quad := nextquad; }

S → while M1 E do M2 S1
    { backpatch(E.truelist, M2.quad);
      backpatch(S1.nextlist, M1.quad);
      S.nextlist := E.falselist;
      emit(‘goto’ M1.quad) }

S → begin L end { S.nextlist := L.nextlist; }

S → A { S.nextlist := nil; }

L → L1 ; M S
    { backpatch(L1.nextlist, M.quad);
      L.nextlist := S.nextlist; }

L → S { L.nextlist := S.nextlist; }


Labels and Gotos

• for goto L, we have to change L into the address of the three-address code for the statement where L is attached

• when L’s address has been found, we can do this easily with the information in the symbol table

• otherwise, we have to use backpatching. We keep the list of addresses to be filled in the symbol table

Break and continue

• We must keep track of the enclosing statement S (while, for, switch) that contains the break or continue

• Generate code such as goto _, and add the index of this quadruple to S.nextlist

• Backpatch

procedure calls

• assume procedure calls are generated using the grammar:

S → call id ( Elist )
Elist → Elist , E
Elist → E

• One translation scheme is: first evaluate the parameters, then pass them together

• we have to use a queue to store those evaluated results

S → call id ( Elist )
    { for each item p on queue do emit(‘param’, p);
      emit(‘call’, id.place) }

Elist → Elist , E { append(E.place, queue) }

Elist → E { queue := initqueue(E.place) }

queue is a global variable
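The queue-based scheme can be sketched in a few lines of Python. This is a simplification made for illustration: the function name translate_call is invented, the queue is a local list rather than the global variable used in the scheme, the argument places are assumed to have been computed already, and the call instruction carries the parameter count in the call p,n, notation from the earlier slide.

```python
def translate_call(name, arg_places):
    """S -> call id ( Elist ): queue the argument places, then emit param/call."""
    queue = []
    for place in arg_places:        # Elist -> E and Elist -> Elist , E
        queue.append(place)
    code = [f"param {p},," for p in queue]
    code.append(f"call {name},{len(queue)},")
    return code


# f(x+1, y): the argument expression x+1 is evaluated into t1 first
code = ["add x,1,t1"] + translate_call("f", ["t1", "y"])
print("\n".join(code))
```

Because the queue here is created per call, an argument that is itself a call would get its own queue and could not disturb the outer one.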

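• In a real compiler the problem is more complicated. First, we must handle functions as well as procedures, and function calls can be nested, so a global variable cannot be used. In addition, there is no need to pass all the parameters together; each one can be passed as soon as it has been computed.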

• ( param.c)

Declarations

P → M D

M → ε { offset=0 }

D → D ; D

D → id : T { enter(id.name,T.type,offset); offset=offset+T.width }

T → int { T.type=int; T.width=4 }

T → real { T.type=real; T.width=8 }

T → array[num] of T1 { T.type=array(num.val,T1.type);
                       T.width=num.val*T1.width }

T → ↑ T1 { T.type=pointer(T1.type); T.width=4 }

where enter creates a symbol table entry with given values.
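The offset bookkeeping can be illustrated with a short Python sketch. The encoding is an assumption made here: basic types are the strings "int" and "real", constructed types are tuples such as ("array", 10, "int") and ("pointer", "int"), and a dictionary plays the role of the symbol table filled by enter.

```python
def type_width(t):
    """Return the width of a type, following the T productions."""
    if t == "int":
        return 4
    if t == "real":
        return 8
    if t[0] == "array":            # ("array", num, element_type)
        return t[1] * type_width(t[2])
    if t[0] == "pointer":          # ("pointer", target_type)
        return 4
    raise ValueError(f"unknown type: {t!r}")


def layout(decls):
    """Process D -> id : T in order; return (symbol table, total width)."""
    table, offset = {}, 0          # M -> ε { offset = 0 }
    for name, t in decls:
        table[name] = (t, offset)  # enter(id.name, T.type, offset)
        offset += type_width(t)
    return table, offset


table, total = layout([
    ("x", "int"),
    ("y", "real"),
    ("a", ("array", 10, "int")),
    ("p", ("pointer", "int")),
])
for name, (t, off) in table.items():
    print(name, off)
print("total width:", total)
```

Each name receives the running offset before its own width is added, so the final offset is the total storage needed for the declarations.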


Nested Procedure Declarations

• For each procedure we should create a symbol table.

mktable(previous) – create a new symbol table where previous is the parent symbol table of this new symbol table

enter(symtable,name,type,offset) – create a new entry for a variable in the given symbol table.

enterproc(symtable,name,newsymbtable) – create a new entry for the procedure in the symbol table of its parent.

addwidth(symtable,width) – puts the total width of all entries in the symbol table into the header of that table.

• We will have two stacks:
– tblptr – to hold the pointers to the symbol tables
– offset – to hold the current offsets in the symbol tables in tblptr stack.

Nested Procedure Declarations (cont.)

P → M D { addwidth(top(tblptr),top(offset)); pop(tblptr); pop(offset) }

M → ε { t=mktable(nil); push(t,tblptr); push(0,offset) }

D → D ; D

D → proc id N D ; S { t=top(tblptr); addwidth(t,top(offset)); pop(tblptr); pop(offset); enterproc(top(tblptr),id.name,t) }

D → id : T { enter(top(tblptr),id.name,T.type,top(offset)); top(offset)=top(offset)+T.width }

N → ε { t=mktable(top(tblptr)); push(t,tblptr); push(0,offset) }

Test Yourself

S → do S(1) While E, whose semantics are:

code for S(1)

code for E

For a bottom-up parser, construct a translation scheme for this statement as follows: (1) write productions suitable for syntax-directed translation; (2) write the semantic action for each production.

G(S):

R → do
U → R S(1) While
S → U E

R → do { R.QUAD:=NXQ }

U → R S(1) While
    { U.QUAD:=R.QUAD;
      BACKPATCH(S(1).CHAIN, NXQ) }

S → U E
    { BACKPATCH(E.TC, U.QUAD);
      S.CHAIN:=E.FC }

Answer 2:

(1) S → do M1 S(1) While M2 E
    M → ε (3 points)

(2) M → ε { M.QUAD := NXQ } (6 points)

    S → do M1 S(1) While M2 E
    { BACKPATCH(S(1).CHAIN, M2.QUAD);
      BACKPATCH(E.TC, M1.QUAD);
      S.CHAIN:=E.FC }
