Intermediate Code Generation - wmich.eduzijiang/CS5810/Chap6.pdf · Intermediate‐Code Generation...


Intermediate-Code Generation

Chapter 6

Chapter 6: Intermediate Code Generation

CS5810 Spring 2009

Why Intermediate Code

[Figure: compiler pipeline. Parser → Static Checker → Intermediate Code Generator → Code Generator, divided into a front end and a back end.]

[Figure: programs mapped onto Architecture1 through Architecture4, once directly and once through a shared intermediate code.]

Intermediate Code

• Similar terms: intermediate representation, intermediate language
• Ties the front and back ends together
• Language and machine neutral
• Many forms
• Level depends on how it is being processed
• More than one intermediate language may be used by a compiler

Intermediate language levels

• High
t1 <- a[i,j+2]

• Medium
t1 <- j + 2
t2 <- i * 20
t3 <- t1 + t2
t4 <- 4 * t3
t5 <- addr a
t6 <- t5 + t4
t7 <- *t6

• Low
r1 <- [fp-4]
r2 <- r1 + 2
r3 <- [fp-8]
r4 <- r3 * 20
r5 <- r4 + r2
r6 <- 4 * r5
r7 <- fp - 216
f1 <- [r7+r6]

Intermediate Language Types

• Graphical IRs: abstract syntax trees, DAGs, control flow graphs
• Linear IRs:
– Stack based (postfix)
– Three address code (quadruples)

Graphical IRs

• Abstract Syntax Trees (AST) – retain essential structure of the parse tree, eliminating unneeded nodes.
• Directed Acyclic Graphs (DAG) – compacted AST to avoid duplication – smaller footprint as well
• Control flow graphs (CFG) – explicitly model control flow

ASTs and DAGs: a := b * -c + b * -c

[Figure: the AST for the assignment, with two separate b * (uni) - c subtrees under +, beside the DAG, in which both operands of + share a single b * (uni) - c node.]

DAGs for Expressions

• A node in a Directed Acyclic Graph (DAG) may have more than one parent

a+a*(b-c)+(b-c)*d

[Figure: DAG for a+a*(b-c)+(b-c)*d, in which the leaf a and the node b-c each have two parents.]

SDD for DAGs

Production        Semantic Rules
E1 → E2 + T       E1.node = Create_Node('+', E2.node, T.node)
E → T             E.node = T.node
T1 → T2 * F       T1.node = Create_Node('*', T2.node, F.node)
T → F             T.node = F.node
F → int           F.node = Create_leaf(int, int.val)
F → ( E )         F.node = E.node

Value-number method for DAG

i = i + 10

Value number   Node
1              id    (symbol-table entry for i)
2              num   10
3              +     1   2
4              =     1   3

Input: Label op, node l, and node r
Output: The value number of a node in the array with signature <op, l, r>
Method: Search the array for a node M = <op, l, r>. If M exists, return its value number; otherwise create a new node N = <op, l, r> and return its value number.
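The method above can be sketched in a few lines of Python. This is an illustration under assumptions, not the course's implementation: the class name ValueNumberTable and the method value_number are hypothetical, and a dictionary stands in for the search of the array, which is one common way to avoid a linear scan.

```python
class ValueNumberTable:
    """Array of DAG nodes; a node's value number is its 1-based position."""

    def __init__(self):
        self.nodes = []   # nodes[k - 1] is the signature with value number k
        self.index = {}   # signature -> value number (replaces the array search)

    def value_number(self, op, left=None, right=None):
        signature = (op, left, right)
        if signature in self.index:           # M = <op, l, r> already exists
            return self.index[signature]
        self.nodes.append(signature)          # otherwise create N = <op, l, r>
        self.index[signature] = len(self.nodes)
        return self.index[signature]


table = ValueNumberTable()
i = table.value_number("id", "i")             # leaf for the name i
ten = table.value_number("num", 10)           # leaf for the constant 10
total = table.value_number("+", i, ten)       # i + 10
assign = table.value_number("=", i, total)    # i = i + 10
again = table.value_number("+", i, ten)       # same signature, no new node
```

Asking for i + 10 a second time returns the existing value number instead of growing the array, which is what makes the structure a DAG instead of a tree.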

TEST YOURSELF #1

• Construct the DAG and identify the value numbers for the sub-expressions

((x+y)-((x+y)*(x-y)))+((x+y)*(x-y))

Linearized IC

• Stack based (one address) – compact

push 2
push y
multiply
push x
subtract

• Three address (quadruples) – up to three operands, one operator

t1 <- 2
t2 <- y
t3 <- t1 * t2
t4 <- x
t5 <- t4 - t3
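To make the stack-based form concrete, here is a minimal sketch of an interpreter for the five-instruction program on the slide. The function name run_stack_code is hypothetical, and the operand order for subtract (most recently pushed value minus the one below it) is an assumption chosen so that the program computes x - 2 * y.

```python
def run_stack_code(program, env):
    """Execute a list of one-address instructions and return the final value."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "push":
            token = args[0]
            # A name is looked up in env; anything else is an integer literal.
            stack.append(env[token] if token in env else int(token))
            continue
        top = stack.pop()      # most recently pushed operand
        below = stack.pop()    # operand pushed before it
        if op == "multiply":
            stack.append(top * below)
        elif op == "subtract":
            stack.append(top - below)   # assumed operand order
        else:
            raise ValueError(f"unknown instruction: {instr}")
    return stack.pop()


program = ["push 2", "push y", "multiply", "push x", "subtract"]
result = run_stack_code(program, {"x": 10, "y": 3})
```

No instruction names its operands or its result, which is why the stack form is compact; the three-address form names every intermediate value instead.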

Three-Address Code

• At most one operator on the right hand side
– An expression like x+y*z has to be translated into a sequence of instructions

[Figure: DAG for a+a*(b-c)+(b-c)*d]

t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
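A sketch of how the five instructions for a+a*(b-c)+(b-c)*d could be produced from a DAG. The nested-tuple encoding of expressions and the function name three_address_code are assumptions made for this example; the cache plays the role of DAG node sharing, so b-c is computed only once.

```python
def three_address_code(expr):
    """Return (instructions, address) for a nested-tuple expression."""
    code = []
    cache = {}   # subexpression -> temporary that already holds its value

    def visit(node):
        if isinstance(node, str):     # a name is its own address
            return node
        if node in cache:             # shared DAG node: reuse the temporary
            return cache[node]
        op, left, right = node
        left_addr = visit(left)
        right_addr = visit(right)
        temp = f"t{len(cache) + 1}"   # compiler-generated temporary
        cache[node] = temp
        code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    address = visit(expr)
    return code, address


b_minus_c = ("-", "b", "c")
expr = ("+", ("+", "a", ("*", "a", b_minus_c)), ("*", b_minus_c, "d"))
code, result = three_address_code(expr)
```

Each emitted instruction has exactly one operator on its right hand side, and the second use of b-c refers to t1 without recomputing it.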

Addresses and Instructions

• An address can be
– A name
  • In an actual implementation, a name is replaced by a pointer to its symbol-table entry
– A constant
– A compiler-generated temporary

Addresses and Instructions

• Common three-address instruction forms
– Assignment instructions of the form x = y op z
– Assignment instructions of the form x = op z
– Copy instructions of the form x = y
– Unconditional jump goto L
– Conditional jumps if x goto L and ifFalse x goto L
– Conditional jumps if x relop y goto L
– Procedure calls and returns
  • param x for parameters
  • call p, n and y = call p, n for procedure and function calls
– Indexed copy instructions x = y[i] and x[i] = y
– Address and pointer assignments x = &y, x = *y and *x = y

Three-Address Code Instruction and Address

• At most one operator on the right hand side

do i = i + 1;
while (a[i] < v);

L: t1 = i + 1
   i = t1
   t2 = i * 8
   t3 = a[t2]
   if t3 < v goto L

Quadruples

A quadruple has 4 fields: op, arg1, arg2, result

t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5

     op      arg1   arg2   result
0    minus   c             t1
1    *       b      t1     t2
2    minus   c             t3
3    *       b      t3     t4
4    +       t2     t4     t5
5    =       t5            a

Triples

A triple has 3 fields: op, arg1, arg2

The result field is used primarily for temporary names, so a triple refers to a result by its position instead.

t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5

     op      arg1   arg2
0    minus   c
1    *       b      (0)
2    minus   c
3    *       b      (2)
4    +       (1)    (3)
5    =       a      (4)

What if instructions are moved during an optimization?
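The relationship between the two representations can be sketched as a conversion from quadruples to triples for a = b * -c + b * -c. The tuple layout and the function name quads_to_triples are assumptions for this example. Temporaries disappear and are replaced by positional references such as (0), which is also why moving a triple during optimization is a problem: every reference to its old position would have to be rewritten, whereas quadruples refer to results by name.

```python
quads = [
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]


def quads_to_triples(quads):
    """Replace temporary names by references to the defining triple's position."""
    position = {}   # temporary name -> index of the triple that computes it
    triples = []

    def ref(arg):
        return f"({position[arg]})" if arg in position else arg

    for op, arg1, arg2, result in quads:
        if op == "=":
            # Copy: the target name goes in arg1, the value reference in arg2.
            triples.append((op, result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            position[result] = len(triples) - 1
    return triples


triples = quads_to_triples(quads)
```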

Static Single-Assignment Form

• SSA is an IR that facilitates certain code optimizations
– All assignments are to variables with distinct names

p = a + b        p1 = a + b
q = p - c        q1 = p1 - c
p = q * d        p2 = q1 * d
p = e - p        p3 = e - p2
q = p + q        q2 = p3 + q1
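For straight-line code, the renaming shown above can be sketched as a single pass that keeps a version counter per variable. The function name to_ssa and the string-based statement format are assumptions for this example; it handles only simple assignments of the form "x = expression" and no control flow, so it never needs a φ function.

```python
import re


def to_ssa(statements):
    """Rename variables so that every assignment defines a distinct name."""
    version = {}   # variable -> number of its most recent definition
    renamed = []
    for stmt in statements:
        target, rhs = [part.strip() for part in stmt.split("=")]

        def latest(match):
            # A use refers to the latest definition; names never assigned
            # in this block (a, b, c, ...) are left unchanged.
            name = match.group(0)
            return f"{name}{version[name]}" if name in version else name

        new_rhs = re.sub(r"[A-Za-z_]\w*", latest, rhs)
        # Rename the uses first, then create the new definition.
        version[target] = version.get(target, 0) + 1
        renamed.append(f"{target}{version[target]} = {new_rhs}")
    return renamed


ssa = to_ssa(["p = a + b", "q = p - c", "p = q * d", "p = e - p", "q = p + q"])
```

The order inside the loop matters: in p = e - p the use of p on the right must be renamed to the old version before the new version is created.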

Static Single-Assignment Form

• What if the same variable is defined in two different control flow paths?

if (flag)                 if (flag)
    x = -1;                   x1 = -1;
else                      else
    x = 1;                    x2 = 1;
                          x3 = φ(x1, x2);
y = x * a;                y = x3 * a;

TEST YOURSELF #2

• Translate the expression into
– A DAG
– Quadruples
– Triples

a + -(b+c)

Types and Declarations

• Type checking uses logical rules to reason about the behavior of a program at run time
– Ensures that the types of the operands match the type expected by the operator
• Translation applications
– Determine the storage
– Calculate the address

Type Expressions

• A type expression is either a basic type or is formed by applying a type constructor to a type expression
– array: number, type expression
– record: field names and their types
– → for function types
– Cartesian product: if s and t are type expressions, so is s×t

Type Expressions for int[2][3]

[Figure: tree for array(2, array(3, integer)). The root array node has children 2 and a second array node, whose children are 3 and integer.]

Grammar for Declarations

D → T id ; D | ε
T → B C | record '{' D '}'
B → int | float
C → ε | [ num ] C

Storage Layout

T → B            { t = B.type; w = B.width; }
    C
B → int          { B.type = integer; B.width = 4; }
B → float        { B.type = float; B.width = 8; }
C → ε            { C.type = t; C.width = w; }
C → [ num ] C1   { C.type = array(num.value, C1.type);
                   C.width = num.value * C1.width; }
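The translation scheme above can be sketched as a recursive function, using float[4][5] as the example. The function name layout, the BASIC table, and the string form of the type expression are assumptions for this sketch; the widths 4 and 8 are the ones given in the rules for int and float.

```python
BASIC = {"int": ("integer", 4), "float": ("float", 8)}   # B.type, B.width


def layout(basic, dims):
    """Return (type expression, width) for a declaration such as float[4][5]."""
    t, w = BASIC[basic]                # T -> B { t = B.type; w = B.width }

    def components(remaining):
        if not remaining:              # C -> ε: C.type = t, C.width = w
            return t, w
        num, rest = remaining[0], remaining[1:]
        inner_type, inner_width = components(rest)    # C -> [num] C1
        return f"array({num}, {inner_type})", num * inner_width

    return components(dims)


type_expr, width = layout("float", [4, 5])
```

The basic type and width travel down to the innermost C through t and w, and the array type and total width are then built on the way back up.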

TEST YOURSELF #3

• What's the type and width of int[2][3]?

Translation of Expressions

Production       Actions
S → id = E       S.code = E.code || gen(top.get(id.lexeme) '=' E.addr)
E → E1 + E2      E.addr = new Temp()
                 E.code = E1.code || E2.code || gen(E.addr '=' E1.addr '+' E2.addr)
  | - E1         E.addr = new Temp()
                 E.code = E1.code || gen(E.addr '=' 'minus' E1.addr)
  | ( E1 )       E.addr = E1.addr
                 E.code = E1.code
  | id           E.addr = top.get(id.lexeme)
                 E.code = ''
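A minimal sketch of these actions in Python, for the assignment a = b + -c. The nested-tuple encoding and the function name translate_assignment are assumptions; the symbol-table lookup top.get(id.lexeme) is simplified to using the lexeme itself as the address, and only the productions shown in the table are handled.

```python
import itertools


def translate_assignment(target, expr):
    """Return the three-address code for `target = expr`."""
    temps = itertools.count(1)

    def translate(e):
        """Return (E.code, E.addr) for expression e."""
        if isinstance(e, str):                  # E -> id, with empty code
            return [], e
        if e[0] == "minus":                     # E -> - E1
            code1, addr1 = translate(e[1])
            addr = f"t{next(temps)}"            # new Temp()
            return code1 + [f"{addr} = minus {addr1}"], addr
        if e[0] != "+":
            raise ValueError(f"no rule for operator {e[0]!r}")
        _, e1, e2 = e                           # E -> E1 + E2
        code1, addr1 = translate(e1)
        code2, addr2 = translate(e2)
        addr = f"t{next(temps)}"                # new Temp()
        return code1 + code2 + [f"{addr} = {addr1} + {addr2}"], addr

    code, addr = translate(expr)
    return code + [f"{target} = {addr}"]        # S -> id = E


code = translate_assignment("a", ("+", "b", ("minus", "c")))
```

Parenthesized expressions need no case of their own here, because the tuple nesting already records the grouping.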

TEST YOURSELF #4

• Add to the translation rules for the following productions
– E → E1 * E2
– E → + E1

Type Checking

• To do type checking, a compiler needs to assign a type expression to each component of the source program
– An implementation of a language is strongly typed if the compiler guarantees that no type errors occur at run time

Rules for Type Checking

• Type synthesis builds up the type of an expression from the types of its subexpressions
– Names must be declared before they are used

If f has type s → t and x has type s
Then f(x) has type t

• Type inference determines the type of a language construct from the way it is used
– Names need not be declared

If f(x) is an expression
Then for some s and t, f has type s → t and x has type s

Type Conversions

• In type synthesis the rule associated with E → E1 + E2 builds on

if (E1.type = int and E2.type = int) E.type = int
else if (E1.type = float and E2.type = int) E.type = float
…

Type Conversions

Widening conversions: byte → short → int → long → float → double, and char → int

• max(s,t) returns the maximum of the two types in the widening hierarchy
• widen(a,t,w) widens a of type t into type w

E → E1 + E2   { E.type = max(E1.type, E2.type);
                a1 = widen(E1.addr, E1.type, E.type);
                a2 = widen(E2.addr, E2.type, E.type);
                E.addr = new Temp();
                gen(E.addr '=' a1 '+' a2); }
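The two helper functions can be sketched as follows, using i + x with i an int and x a float. The names WIDER, chain, max_type, and add are hypothetical, the hierarchy is encoded as a parent map, and emitting a single cast instruction of the form "t = (w) a" is an assumption about how widen generates code.

```python
# Assumed widening hierarchy: each type widens to its parent.
WIDER = {"byte": "short", "short": "int", "char": "int",
         "int": "long", "long": "float", "float": "double", "double": None}

code = []   # generated three-address instructions


def chain(t):
    """t followed by every type it can be widened to, narrowest first."""
    types = []
    while t is not None:
        types.append(t)
        t = WIDER[t]
    return types


def max_type(s, t):
    """Narrowest type that both s and t can be widened to."""
    reachable = set(chain(t))
    return next(u for u in chain(s) if u in reachable)


def widen(addr, t, w):
    """Return an address holding addr converted from type t to type w."""
    if t == w:
        return addr
    if w not in chain(t):
        raise TypeError(f"cannot widen {t} to {w}")
    temp = f"t{len(code) + 1}"
    code.append(f"{temp} = ({w}) {addr}")
    return temp


def add(addr1, type1, addr2, type2):
    """Action for E -> E1 + E2: returns (E.addr, E.type)."""
    result_type = max_type(type1, type2)
    a1 = widen(addr1, type1, result_type)
    a2 = widen(addr2, type2, result_type)
    temp = f"t{len(code) + 1}"
    code.append(f"{temp} = {a1} + {a2}")
    return temp, result_type


addr, result_type = add("i", "int", "x", "float")
```

Because char and short sit on different branches of the hierarchy, max_type has to find a common ancestor and cannot simply compare ranks.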

TEST YOURSELF #5

• Assume c and d are characters, s and t are short integers, i and j are integers, and x is a float. Translate the following expressions
– i = s + c
– x = (x+c) * (t+d)

Control Flow

• Boolean expressions are used to
– Alter the flow of control
– Compute logical values

B → B || B | B && B | !B | (B) | E rel E | true | false

Short-Circuit Code

• In short-circuit code, &&, || and ! translate into jumps

if (x < 100 || x > 200 && x != y) x = 0;

if x < 100 goto L2
ifFalse x > 200 goto L1
ifFalse x != y goto L1
L2: x = 0
L1:
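One way to arrive at the five lines above is a generator that passes a true label and a false label down to each subexpression and lets control fall through where possible. The slide does not say which scheme was used, so the fall-through treatment here is an assumption, as are the names gen_bool and gen_if and the tuple encoding. The ! operator is left out, and labels are emitted on their own lines.

```python
import itertools

FALL = None   # special label: fall through to the next instruction


def gen_bool(b, true, false, labels, code):
    """Append jumping code for boolean expression b to code."""
    op = b[0]
    if op == "||":
        b1_true = true if true is not FALL else next(labels)
        gen_bool(b[1], b1_true, FALL, labels, code)   # if false, try b[2]
        gen_bool(b[2], true, false, labels, code)
        if true is FALL:
            code.append(f"{b1_true}:")
    elif op == "&&":
        b1_false = false if false is not FALL else next(labels)
        gen_bool(b[1], FALL, b1_false, labels, code)  # if true, test b[2]
        gen_bool(b[2], true, false, labels, code)
        if false is FALL:
            code.append(f"{b1_false}:")
    else:                                   # relational test (rel, left, right)
        test = f"{b[1]} {op} {b[2]}"
        if true is not FALL and false is not FALL:
            code.extend([f"if {test} goto {true}", f"goto {false}"])
        elif true is not FALL:
            code.append(f"if {test} goto {true}")
        elif false is not FALL:
            code.append(f"ifFalse {test} goto {false}")


def gen_if(cond, body):
    """Code for `if (cond) body`, where body is a list of instructions."""
    labels = (f"L{n}" for n in itertools.count(1))
    code = []
    s_next = next(labels)                   # label of the code that follows
    gen_bool(cond, FALL, s_next, labels, code)
    code.extend(body)
    code.append(f"{s_next}:")
    return code


cond = ("||", ("<", "x", "100"), ("&&", (">", "x", "200"), ("!=", "x", "y")))
code = gen_if(cond, ["x = 0"])
```

No instruction computes a boolean value: the outcome of each test is encoded entirely in which jump is taken.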

SDD for Flow-of-Control Statements

Production               Semantic Rules
S → if (B) S1            B.true = newlabel()
                         B.false = S1.next = S.next
                         S.code = B.code || label(B.true) || S1.code
S → if (B) S1 else S2    B.true = newlabel()
                         B.false = newlabel()
                         S1.next = S2.next = S.next
                         S.code = B.code
                                  || label(B.true) || S1.code
                                  || gen('goto' S.next)
                                  || label(B.false) || S2.code

SDD for Flow-of-Control Statements

Production               Semantic Rules
S → while (B) S1         begin = newlabel()
                         B.true = newlabel()
                         B.false = S.next
                         S1.next = begin
                         S.code = label(begin) || B.code
                                  || label(B.true) || S1.code
                                  || gen('goto' begin)
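The rules for the while statement can be sketched directly in Python. The function name gen_while is hypothetical, B is restricted to a single relational test given as a string, and B.code is assumed to be the pair "if test goto B.true" followed by "goto B.false". The body is straight-line code here, so S1.next is not needed inside it.

```python
import itertools


def gen_while(test, body, s_next, labels):
    """Code for `while (test) body`; s_next is the label after the loop."""
    begin = next(labels)                    # begin = newlabel()
    b_true = next(labels)                   # B.true = newlabel()
    b_false = s_next                        # B.false = S.next
    b_code = [f"if {test} goto {b_true}", f"goto {b_false}"]
    # S.code = label(begin) || B.code || label(B.true) || S1.code || goto begin
    return ([f"{begin}:"] + b_code + [f"{b_true}:"] + body
            + [f"goto {begin}"])


labels = (f"L{n}" for n in itertools.count(1))
s_next = next(labels)                       # label following the statement
code = gen_while("i < n", ["i = i + 1"], s_next, labels)
```

The final goto returns to the test, and the test's false exit leaves the loop through S.next, so the body never has to know where the loop ends.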