Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser...

38
Foundations of Global Networked Computing: Building a Modern Computer From First Principles IWKS 3300: NAND to Tetris Spring 2019 John K. Bennett This course is based upon the work of Noam Nisan and Shimon Schocken. More information can be found at (www.nand2tetris.org ). Compiler II: Code Generation

Transcript of Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser...

Page 1: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Foundations of Global Networked Computing:

Building a Modern Computer From First Principles

IWKS 3300: NAND to Tetris

Spring 2019

John K. Bennett

This course is based upon the work of Noam Nisan and Shimon Schocken.

More information can be found at (www.nand2tetris.org).

Compiler II: Code Generation

Page 2: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Where We Are

Assembler

Chapter 6

H.L. Language

&

Operating Sys.

abstract interface

Compiler

Chapters 10 - 11

VM Translator

Chapters 7 - 8

Computer

Architecture

Chapters 4 - 5

Gate Logic

Chapters 1 - 3 Electrical

EngineeringPhysics

Virtual

Machine

abstract interface

Software

Hierarchy

Assembly

Language

abstract interface

Hardware

Hierarchy

Machine

Code

abstract interface

Hardware

Platform

abstract interface

Chips &

Logic Gates

abstract interface

Human

Thought

Abstract design

Chapters 9, 12

We are still here

Page 3: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

The Big Picture

(Chapter 11)

Keyboar

d

Jack

Program

Toke-

nizerParser

Code

Gene

-ration

Syntax Analyzer

Jack Compiler

VM

code

XML

code

(Chapter 10)

1. Syntax analysis: extracting the semantics from the source code

2. Code generation: expressing the semantics using the target languagethis

week

last

week

Page 4: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

What We Did Last Week

void WriteXML(string tag, string value){

if (DoXML){

for (int i = 0; i < xmlIndent; i++) XMLFile.Write(" ");XMLFile.Write("<" + tag + "> ");// get rid of quotes around string constants// fix the XML special characters, if anyif (value == "") value = tokenizer.currentToken;if (tag == "symbol"){

value = value.Replace("&", "&amp;");value = value.Replace("<", "&lt;");value = value.Replace(">", "&gt;");value = value.Replace("\"", "&quot;");

}XMLFile.Write(value);XMLFile.WriteLine(" </" + tag + ">");

}}

Page 5: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

What We’ll Do This Week

...public VMWriter(StreamWriter VMFile){

VMOutFile = VMFile;}

public void writePush(string segment, int index){

if (validSegNames.ContainsKey(segment)){

VMOutFile.WriteLine(" push {0} {1}", segment, index);}else ReportError("Bad Segment Name", segment);

}

public void writePop(string segment, int index){

if (validSegNames.ContainsKey(segment)){

VMOutFile.WriteLine(" pop {0} {1}", segment, index);}else ReportError("Bad Segment Name", segment);

}...

Page 6: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Moving On to Code Generation

Class Bar {method Fraction foo(int y) {

var int temp; // a variablelet temp = (xxx+12)*-63;...

...

<varDec>

<keyword> var </keyword>

<keyword> int </keyword>

<identifier> temp </identifier>

<symbol> ; </symbol>

</varDec>

<statements>

<letStatement>

<keyword> let </keyword>

<identifier> temp </identifier>

<symbol> = </symbol>

<expression>

<term>

<symbol> ( </symbol>

<expression>

<term>

<identifier> xxx </identifier>

</term>

<symbol> + </symbol>

<term>

<int.Const.> 12 </int.Const.> </term>

</expression> ...

Syntax analyzer

The code generation challenge:

Program = a series of operations that

manipulate data

Compiler: converts each “understood”

(parsed) source operation and data item

into corresponding operations and data

items in the target language

Thus, we have to generate code for

o handling data

o handling operations

Our approach: morph the syntax

analyzer (Project 10) into a full-blown

compiler: instead of generating XML,

we’ll make it generate VM code.

Page 7: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

The Jack Grammar

’x’: x appears verbatim

x: x is a language construct

x?: x appears 0 or 1 times

x*: x appears 0 or more times

x|y: either x or y appears

(x,y): x appears, then y.

Notation

Page 8: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

The Jack Grammar (cont.)

’x’: x appears verbatim

x: x is a language construct

x?: x appears 0 or 1 times

x*: x appears 0 or more times

x|y: either x or y appears

(x,y): x appears, then y.

Page 9: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Memory Segments (review)

Where i is a non-negative integer and segment is one of the following:

static: holds values of global variables, shared by all functions in the same class

argument: holds values of the argument variables of the current function

local: holds values of the local variables of the current function

this: holds values of the private field variables of the current object

that: holds array values

constant: holds all the constants in the range 0 … 32767 (pseudo memory segment)

pointer: used to anchor this and that to various areas in the heap

temp: fixed 8-entry segment that holds temporary variables for general use;

Shared by all VM functions in the program.

VM memory Commands:

pop segment i

push segment i

Page 10: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

“Standard” Hack Memory Map

R0R1R2R3R4

SP

LCL Base

ARG Base

THIS Base

THAT Base

R5R6R7R8R9

TEMP

R10R11R12R13R14

N/A

N/A

R15

N/AN/AN/A

N/A

Stack

Heap

Scrn. & Kbd.

0123456789

101112131415

256-2047 *

2048-16383 *

16384-24575

Stack PointerPtr. To LCL Vars (in Stack Frame)

Ptr. To Args (in Stack Frame)Pointer 0 (ptr to This segment)Pointer 1 (ptr to That segment)

Temp 0Temp 1Temp 2Temp 3Temp 4Temp 5Temp 6Temp 7

GP registerGP registerGP register

Stack (including LCL and ARG)Heap (including THIS and THAT)

Memory-Mapped I/O

N/A Static16-255 * static vars.

RAM[n] Register n Seg. Name Use

*The memory addresses in purple can be changed if there is a reason to do so.

Page 11: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Implementation of the VM’s Stack on the Hack RAM

working stack of

the current function

argument nArgs-1

ARG

saved state of the calling

function. Used by the VM

implementation to restore

the segments of the calling

function just after the

current function returns. saved THIS

saved ARG

saved returnAddress

saved LCL

local 0

local 1

. . .local nVars-1

argument 0

argument 1

. . .

frames of all the functions

up the calling chain

LCL

SP

saved THAT

local variables of

the current function

arguments pushed by the

caller for the current function

Global stack:

the entire RAM area dedicated

for holding the stack

Working stack (“stack frame”):

The stack that the current

function sees

At any point of time, only one

function (the current function)

is executing; other functions

may be waiting up the calling

chain

Shaded areas: irrelevant to the

current function

The current function sees only

the working stack, and has

access only to its memory

segments

The rest of the stack holds the

saved states of all the

functions up the calling

hierarchy.

Page 12: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Code Generation Example

method int foo() {var int x;let x = x + 1;...

<letStatement>

<keyword> let </keyword>

<identifier> x </identifier>

<symbol> = </symbol>

<expression>

<term>

<identifier> x </identifier>

</term>

<symbol> + </symbol>

<term>

<constant> 1 </constant> </term>

</expression>

</letStatement>

Syntax

analysis

(Assume x is the first local variable declared in the method)

push local 0

push constant 1

add

pop local 0

Code

generation

(we usually skip this part)

Page 13: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Handling Variables

When the compiler encounters a variable, say x, in the source code, it has to know:

What is x’s data type?

Primitive (int, bool or char), or user-defined (instance of a class)?

(We need to know this in order to properly allocate RAM resources for x’s

representation)

What kind of variable is x?

local, static, field, argument ?

( We need to know this in order to properly allocate x to the right memory

segment; this allocation also implies the variable’s life cycle, e.g., life of function

call or life of instance).

Page 14: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Handling Variables: mapping to memory segments (example)

When compiling this class, we have to create the following mappings:

The class variables nAccounts , bankCommission are mapped to static 0,1

The object fields id, owner, balance are mapped to this 0,1,2

The argument variables sum, BankAccount, when are mapped to arg 0,1,2

The local variables i, j, due are mapped to local 0,1,2

The target language uses 8 memory segments

Each memory segment, e.g. static,

is an indexed sequence of 16-bit values

that can be referred to as

static 0, static 1, static 2, etc.

Page 15: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Handling Variables: Symbol Tables

How the compiler uses symbol tables:

In a classic implementation, the compiler builds

several linked symbol tables, each reflecting a

single scope nested within the enclosing scope

Identifier lookup works from the current symbol

table back to the list’s head

We don’t have to work quite this hard…

Page 16: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Handling Variables: Symbol Table

class SymbolTable{

int staticIndex = 0;int fieldIndex = 0;int argIndex = 0;int varIndex = 0;

public static Dictionary<string, SymbolTableEntry> classSymbolTable =new Dictionary<string, SymbolTableEntry>() {};

public static Dictionary<string, SymbolTableEntry> subroutineSymbolTable =new Dictionary<string, SymbolTableEntry>() {};

public void startSubroutine(){

subroutineSymbolTable = new Dictionary<string, SymbolTableEntry>();argIndex = 0;varIndex = 0;

}

public void startClass(){

classSymbolTable = new Dictionary<string, SymbolTableEntry>();staticIndex = 0;fieldIndex = 0;

}...

Page 17: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Handling Variables: Symbol Table Entry

// In addition to the four possibilities of "KIND" of variable, (var, argument, static, field) the book// talks about a "Category." The possibilities for Category are "var, argument, static, field, class,// and subroutine" (page 243).// Instead of doing both, we just extend the KIND enum.public enum ST_KIND {STATIC, FIELD, ARG, VAR, CLASS_NAME, SUB_NAME, UND };

// The book also talks about the "Scope" of a variable// Scope can be "subroutine", "class", "instance", "program" or, in cases where variables of// the same name are redefined in different contexts, "subroutine + class", "subroutine + instance",// or "subroutine + program"// Scope is "None" if something goes wrong.

class SymbolTableEntry{

public string ST_TYPE; // int, bool, char or classnamepublic ST_KIND ST_KIND; // see abovepublic int ST_INDEX; // index into the segment named in KINDpublic string ST_SCOPE; // see above

public SymbolTableEntry(string type, ST_KIND kind, int index, string scope){

ST_TYPE = type;ST_KIND = kind;ST_INDEX = index;ST_SCOPE = scope;

}}

Page 18: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Managing Variable Life Cycle

Variable life cycle

static variables: single copy must be kept alive throughout the program duration

field variables: different copies must be kept for each object instance

local variables: created on subroutine entry, killed on exit

argument variables: similar to local variables.

The good news: the VM implementation

already handles almost all of these details!

Page 19: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

class Complex {// Fields:field int re, im; // Real & Img. parts

.../** Constructs a new Complex number */constructor Complex new (int real, int img){

let re = real;let im = img;return this;

}...

}

class Foo {function void bla() {

var Complex a, b, c;...let a = Complex.new (5,17);let b = Complex.new (12,192);...let c = a; // Only reference is copied...

}

Jack code

Handling Objects: construction / memory allocation

How to compile: let a = Complex.new (5,17)

The compiler generates code to push

n, and then calls Memory.alloc*, where

n is the number of words necessary to

represent the object in question;

Memory.alloc is an OS function that

returns the base address of a free

memory block of size n words.

Following

compilation:

* The book’s Jack OS sometimes chokes on alloc(0), so always alloc at least one word.

Page 20: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Handling objects: accessing fields

class Complex {// Fields:field int re, im; // Real & Img. parts

.../** Constructs a new Complex number */constructor Complex new (int real, int img){

let re = real;let im = img;return this;

}... .../** Multiplies this Complex number

by the given scalar */method void mult (int c) {

let re = re * c;let im = im * c;return;

}...

}

Jack code

*(this + 1) = *(this + 1)times(argument 0)

How to compile:

let im = im * c

1. Look up these two variables in

the symbol table:

im:

Segment: field

Index: 1

c:

Segment: arg

Index: 0

2. Generate VM code that

implements this pseudo-code :

Page 21: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Assume that b and r were passed to a

function as its first two arguments, and

the code let b.radius = r; appears in

the function. How do we compile?

// Get b's base address:

push argument 0

// Point the this segment to b:

pop pointer 0

// Get r's value (r is an arg)

push argument 1

// Set b's third field to r:

pop this 2

120

80

50radius:

x:

y:

3color:120

80

50

3012

3013

3014

33015

412 3012

...

...

High level program view RAM view

0

...

b

following

compilationb

object

b

object(Actual RAM locations of program variables are

run-time dependent, and thus the addresses shown

here are arbitrary examples.)

00

1

Virtual memory segments just before

the operation b.radius=17:

3012

17

0

1

......

120

80

17

0

1

2

30120

1

3

3012

17

0

1

argument pointer this

...

3

(this 0is nowalligned withRAM[3012])

...

Virtual memory segments just after

the operation b.radius=17:

argument pointer this

Handling Objects: establishing access to the object’s fields

Background: Suppose we have an object

named b of type Ball. A Ball has x,y

coordinates, a radius, and a color.

r r

Page 22: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Handling objects: Method Calls

General rule: each method call

foo.bar(v1,v2,...)

is translated into:

push foopush v1push v2...

call bar

class Complex {field int re, im; // Real & Img. parts

/** Constructs a new Complex number */constructor Complex new (int real, int img) {

let re = real;let im = img;return this;

}/** Multiplies this Complex number

by the given scalar */method void mult (int c) {

let re = re * c;let im = im * c;return;

}

...

}

class Foo {

...

function void bla() {

var Complex x;

...

let x = Complex.new (1,2);

do x.mult(5);

...

}

}

Jack code

push x

push 5

call mult

How to compile:

x.mult(5);

We will view this method call as:

mult(x,5)

…and generate the following code:

Page 23: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

class Main {...function void foo(int k) {

var int x, y;var Array bar; // declare an array...// Construct the array:let bar = Array.new (10);...let bar[k] = 19;

}...do Main.foo(2); // Call the foo method...

Jack code

How to compile:

let bar = Array.new (10);

Generate call to Array constructor, which in turn calls Memory.alloc

Handling Arrays: declaration / construction

19

4315

4316

4317

4324

(bar array)

...4318

...

...

4315

...0

bar

x

y

2 k

(local 0)

(local 1)

(local 2)

(argument 0)

275

276

277

504

RAM state

...

Following

execution:

Page 24: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

class Main {...function void foo(int k) {

var int x, y;var Array bar; // declare an array...// Construct the array:let bar = Array.new (10);...let bar[k] = 19;

}...do Main.foo(2); // Call the foo method...

Jack code

How to compile: let bar[k] = 19;

// bar[k]=19, or *(bar+k)=19push barpush kadd// Use a pointer to access x[k]pop addr // addr points to bar[k]push 19pop *addr // Set bar[k] to 19

VM Code (pseudo)

// bar[k]=19, or *(bar+k)=19push local 2push argument 0add// Use the that segment to access x[k]pop pointer 1push constant 19pop that 0

VM Code (actual)

19

4315

4316

4317

4324

(bar array)

...4318

...

...

4315

...0

bar

x

y

2 k

(local 0)

(local 1)

(local 2)

(argument 0)

275

276

277

504

RAM state, just after executing bar[k] = 19

...

Following

execution:

Handling arrays: accessing an array entry by its index

Page 25: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

syntax

analysis

parse tree

Handling Expressions

((5+z)/-8)*(4+2)

Jack code

push 5

push z

add

push 8

neg

call div

push 4

push 2

add

call mult

code

generation

To generate VM code from a parse tree exp, use the following logic:

The codeWrite(exp) algorithm:

if exp is a constant n then output "push n"

if exp is a variable v then output "push v"

if exp is op(exp1) then codeWrite(exp1); output "op";

if exp is (exp1 op exp2) then codeWrite(exp1); codeWrite(exp2); output "op";

if exp is func (exp1, ..., expn) then codeWrite(exp1); ... codeWrite(exp1); output "call func";

Pseudo VM code

+

Page 26: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

CompileExpression Example

void CompileExpression(string expNextChar)// we use expNextChar to let us know when we are done with the (op term)* cycle, e.g., ‘)’{

WriteXMLTag("expression");bool doneWithExpression = false;CompileTerm(); // first term on stackwhile (GetNextToken()){

if ((tokenizer.currentToken == "+") || (tokenizer.currentToken == "-") || (tokenizer.currentToken == "*") || (tokenizer.currentToken == "/") ||(tokenizer.currentToken == "&") || (tokenizer.currentToken == "|") || (tokenizer.currentToken == "<") || (tokenizer.currentToken == ">") ||(tokenizer.currentToken == "="))

{WriteXML("symbol", tokenizer.currentToken);string op = tokenizer.currentToken;CompileTerm(); // next term on stack, then process opif (op == "*") vmWriter.writeCall("Math.multiply", 2);else if (op == "/") vmWriter.writeCall("Math.divide", 2);else vmWriter.writeArithmetic(op);continue;

}else if (tokenizer.currentToken == expNextChar) // we’re done{

doneWithExpression = true;break;

}else ReportError((expNextChar + " or term (op Term)*"), tokenizer.currentToken);

}WriteXMLTag("/expression");if (doneWithExpression) WriteXML("symbol", tokenizer.currentToken);

}

Page 27: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Handling Program Flow

if (cond)

s1

else

s2

...

High-level code

VM code to compute and push !(cond)

if-goto L1

VM code for executing s1

goto L2

label L1

VM code for executing s2

label L2

...

Pseudo VM code

code

generation

while (cond)

s

...

High-level code

label L1

VM code to compute and push !(cond)

if-goto L2

VM code for executing s

goto L1

label L2

...

Pseudo VM code

code

generation

Page 28: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

CompileWhileStatement Example

void ComplileWhileStatement(){

WriteXMLTag("whileStatement"); // only for Project 10WriteXML("keyword", "while"); // only for Project 10

string whileCondLabel = NextLabel("WHILE_EXP");string whileEndLabel = NextLabel("WHILE_END");

vmWriter.writeLabel(whileCondLabel);

expect("symbol", "(");CompileExpression(")"); // compute while conditionvmWriter.writeArithmetic("~"); // logically negate itvmWriter.writeIf(whileEndLabel);

expect("symbol", "{");// body of while loopCompileStatements(); vmWriter.writeGoto(whileCondLabel);vmWriter.writeLabel(whileEndLabel);

expect("symbol", "}");WriteXMLTag("/whileStatement"); // only for Project 10

}

Page 29: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Handling Labels Like the Book’s Compiler

private string NextLabel(string prefix){

int labelNumber;switch (prefix){

case "WHILE_EXP":WHILE_EXP_labelNumber += 1;labelNumber = WHILE_EXP_labelNumber;break;

case "WHILE_END":WHILE_END_labelNumber += 1;labelNumber = WHILE_END_labelNumber;break;

case "IF_TRUE":IF_TRUE_labelNumber += 1;labelNumber = IF_TRUE_labelNumber;break;

case "IF_FALSE":IF_FALSE_labelNumber += 1;labelNumber = IF_FALSE_labelNumber;break;

case "IF_END":IF_END_labelNumber += 1;labelNumber = IF_END_labelNumber;break;

default:labelNumber = 99999;break;

}return (prefix + (labelNumber.ToString()));

}

Page 30: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

VMWriter Examples

public void writeLabel(string label)

{

VMOutFile.WriteLine("label {0}", label);

}

public void writeGoto(string label)

{

VMOutFile.WriteLine(" goto {0}", label);

}

public void writeIf(string label)

{

VMOutFile.WriteLine(" if-goto {0}", label);

Page 31: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Function Example

transfer 3transfer

// balance// sum

// this// sum

// balance

Page 32: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

CompileReturn Example

void CompileReturnStatement(){

// Compiles <return-statement > := 'return' < expression >? ';'

WriteXMLTag("returnStatement"); // Chap 10 onlyWriteXML("keyword", "return"); // Chap 10 only

if (GetNextToken()){

if (tokenizer.currentToken == ";") //no return expression{

WriteXML("symbol", ";"); // Chap 10 onlyvmWriter.writePush("constant", 0); // see book pg. 235;

// void returns push 0 onto stack}else{

peekedAhead = true;CompileExpression(";"); // compile an expression or look for a ";"// leaving result as return value on stack

}vmWriter.writeReturn();

}else ReportError("return statement", "premature EOF");WriteXMLTag("/returnStatement"); // Chap 10 only

}

Page 33: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Special Handling of Methods and Constructors

…// when we get here we have processed all args and all var decs (if any)vmWriter.writeFunction(Working_Class + "." + Working_Subroutine,

symbolTable.VarCount(ST_KIND.VAR));

if (subKind == ST_KIND.METHOD_NAME){// Insert VM code that sets the base of the "this" seg. to the object passed as arg 0

vmWriter.writePush("argument", 0);vmWriter.writePop("pointer", 0);

}else if (subKind == ST_KIND.CONSTRUCTOR_NAME){// Allocate a memory block for the new object (and its field variables),// and set the base of the "this" segment to point to the new object's base// The Jack OS reportedly chokes on alloc(0), so always alloc at least one word.// if OS alloc gets fixed, change this

vmWriter.writePush("constant",Math.Max(symbolTable.VarCount(ST_KIND.FIELD),1));

vmWriter.writeCall("Memory.alloc", 1);vmWriter.writePop("pointer", 0);

}

Page 34: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Compilation Engine

The CompilationEngine effects the actual compilation output.

• It gets its input from the Jack Tokenizer and emits its parsed structure into an output filestream.

• The output is generated by a series of compilexxx() routines, one for every syntactic element xxx of the Jack grammar.

• The contract between these routines is that each compilexxx() routine should read the syntactic construct xxx from the input, advance() the tokenizer exactly beyond xxx, and output the parsing of xxx. Thus, compilexxx() may only be called if indeed xxx is the next syntactic element of the input.

• In the first version of the compiler (Project 10), this module emits a structured printout of the code, wrapped in XML tags (defined in the specs of Project 10). In the final version of the compiler, this module generates executable VM code (defined in the specs of Project 11).

• In both cases, the parsing logic and module API are exactly the same.

Page 35: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Compilation Engine (cont.)

Page 36: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

CompilationEngine (cont.)

Page 37: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

CompilationEngine (cont.)

Page 38: Compiler II: Code GenerationThe Big Picture (Chapter 11) Keyboar d Jack Program Toke-nizer Parser Code Gene-ration Syntax Analyzer Jack Compiler VM code XML code (Chapter 10) 1. Syntax

Perspective

Jack simplifications that would be challenging to extend:

Limited primitive type system

No inheritance

No public class fields, e.g. must use r = c.getRadius()rather than r = c.radius

Jack simplifications that are easy to extend: :

Limited control structures, e.g. no for, switch, case, continue, break …

Cumbersome handling of char types, e.g. cannot use let x=‘c’

++, --, <=, >=, ==, etc.

Optimizations

Code, e.g., c=c+1 is translated inefficiently into push c, push 1, add, pop c.

Parallel processing

Many other possible improvements …