Language Translators · What is a Translator? A programming language is a notation that a person...

23
* Course website: https://www.cs.columbia.edu/ rgu/courses//spring ** These slides are borrowed from Prof. Edwards. Language Translators Ronghui Gu Spring Columbia University

Transcript of Language Translators · What is a Translator? A programming language is a notation that a person...

Page 1: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

∗ Course website: https://www.cs.columbia.edu/ rgu/courses/4115/spring2019∗∗ These slides are borrowed from Prof. Edwards.

Language Translators

Ronghui GuSpring 2020

Columbia University

1

Page 2: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

What is a Translator?

A programming language is a notation that a person and acomputer can both understand.

• It allows you to express what is the task to compute• It allows a computer to execute the computation task

A translator translates what you express to what a computercan execute.

2

Page 3: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Compiler

Source Program

Compiler

Input Machine Code Output

• Pros: translation is done once and for all; optimize codeand map identi�ers at compile time.

• Cons: long compilation time; hard to port.

3

Page 4: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Interpreter

Source Program

Input Interpreter Output

• Pros: source code distribution; short development cycle.• Cons: translation is needed every time a statement isexecuted; lack optimization; map identi�ers repeatedly.

4

Page 5: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Bytecode Interpreter

Source Program

Compiler

Bytecode

Input Virtual Machine Output

• Pros: bytecode is highly compressed and optimized;bytecode distribution.

• Cons: compilation overhead + interpreter overhead.

5

Page 6: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Just-In-Time Compiler

Source Program

Compiler

Bytecode

Just-in-time Compiler

Input Machine Code Output

• Pros: compile and optimize many sections just before theexecution; bytecode distribution.

• Cons: compilation overhead + warm-up overhead.6

Page 7: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Language Speeds Compared

Native code compilersJust-in-time compilersBytecode interpreters

ATSC++ GNU g++C GNU gccJava 6 steady stateAda 2005 GNATHaskell GHCScalaJava 6 -serverLua LuaJITFortran IntelOCamlF# MonoC# MonoGo 6g 8gRacketLisp SBCLJavaScript V8Erlang HiPELuaSmalltalk VisualWorksJava 6 -XintPython CPythonPython 3Ruby 1.9Mozart/OzPHPPerl

Source: http://shootout.alioth.debian.org/ 7

Page 8: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Compilation Phases

Page 9: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Compiling a Simple Program

i n t avg ( i n t a , i n t b){

re turn ( a + b) / 2 ;}

Compiler

0101110101...

8

Page 10: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Compilation Phases

int avg (int a, int b) ...

Lexical Analysis

Syntax Analysis

Semantic Analysis

Intermediate Code Generation

Optimization

Code Generation

0101110101...

front-end

middle-end

back-end

9

Page 11: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

What the Compiler Sees

i n t avg ( i n t a , i n t b){

re turn ( a + b) / 2 ;}

i n t SP a v g ( i n t SP a , SP i n t SP b ) NL{ NLSP SP r e t u r n SP ( a SP + SP b ) SP / SP 2 ; NL} NL

Just a sequence of characters

10

Page 12: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Lexical Analysis

int avg (int a, int b) ...

Lexical Analysis

Syntax Analysis

Semantic Analysis

Intermediate Code Generation

Optimization

Code Generation

0101110101...

front-end

middle-end

back-end

11

Page 13: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Lexical Analysis Gives Tokens

i n t avg ( i n t a , i n t b){

re turn ( a + b) / 2 ;}

int avg ( int a , int b ) { return ( a + b

) / 2 ; }

• A stream of tokens; whitespace, comments removed.• Throw errors when failing to create tokens: malformedstrings or numbers or invalid characters (such asnon-ASCII characters in C).

12

Page 14: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Syntax Analysis

int avg (int a, int b) ...

Lexical Analysis

Syntax Analysis

Semantic Analysis

Intermediate Code Generation

Optimization

Code Generation

0101110101...

front-end

middle-end

back-end

13

Page 15: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Syntax Analysis Gives an Abstract Syntax Tree

func

int avg args

arg

int a

arg

int b

return

/

+

a b

2

i n t avg ( i n t a , i n t b){

re turn ( a + b) / 2 ;}

• Syntax analysis will throwerrors if “}” is missing. Lexicalanalysis will not.

14

Page 16: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Semantic Analysis

int avg (int a, int b) ...

Lexical Analysis

Syntax Analysis

Semantic Analysis

Intermediate Code Generation

Optimization

Code Generation

0101110101...

front-end

middle-end

back-end

15

Page 17: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Semantic Analysis: Resolve Symbols; Verify Types

Symbol Table

int a

int b

func

int avg args

arg

int a

arg

int b

return

/

+

a b

2

16

Page 18: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Intermediate Code Generation

int avg (int a, int b) ...

Lexical Analysis

Syntax Analysis

Semantic Analysis

Intermediate Code Generation

Optimization

Code Generation

0101110101...

front-end

middle-end

back-end

17

Page 19: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Translation into 3-Address Code

i n t avg ( i n t a , i n t b){

re turn ( a + b) / 2 ;}

Idealized assembly language w/ in�nite registers

avg :t0 := a + bt1 := 2t2 := t0 / t1r e t t2

18

Page 20: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Optimization

int avg (int a, int b) ...

Lexical Analysis

Syntax Analysis

Semantic Analysis

Intermediate Code Generation

Optimization

Code Generation

0101110101...

front-end

middle-end

back-end

19

Page 21: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Optimization

Optimization

avg :t0 := a + bt1 := 2t2 := t0 / t1r e t t2

avg :t0 := a + bt2 := t0 / 2r e t t2

20

Page 22: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Code Generation

int avg (int a, int b) ...

Lexical Analysis

Syntax Analysis

Semantic Analysis

Intermediate Code Generation

Optimization

Code Generation

0101110101...

front-end

middle-end

back-end

21

Page 23: Language Translators · What is a Translator? A programming language is a notation that a person and a computer can both understand. • It allows you to express what is the task

Generation of x86 Assembly

Code Generation

avg :t0 := a + bt2 := t0 / 2r e t t2

avg : pushl %ebp # save BPmovl %esp ,%ebpmovl 8(%ebp) ,%eax # load a from stackmovl 12(%ebp) ,%edx # load b from stack

addl %edx ,%eax # a += bshr $1 ,%eax # a /= 2r e t

22