
Language processors (Chapter 2)

Course Overview

PART I: overview material
1 Introduction
2 Language processors (tombstone diagrams, bootstrapping)
3 Architecture of a compiler

PART II: inside a compiler
4 Syntax analysis
5 Contextual analysis
6 Runtime organization
7 Code generation

PART III: conclusion
8 Interpretation
9 Review


Chapter 2 Language Processors

RECAP
– Different levels of programming languages: low-level <–> high-level
– Different language processors to bridge the semantic gap

GOAL of this lecture:
– A high-level understanding of what language processors are.
• what do they do?
• what kinds are there?

Topics:
– translators (compilers, assemblers, ...)
– tombstone diagrams
– bootstrapping


Compilers and other translators

Examples:
• Chinese => English
• Java => JVM byte codes
• Scheme => C
• C => Scheme
• x86 Assembly Language => x86 binary codes

Other non-traditional examples:
• disassembler, decompiler (e.g. JVM => Java)


Assemblers versus Compilers

An assembler and a compiler both translate a “more high-level” language into a “more low-level” one.

[Diagram: a chain of translations from more high-level (top) to more low-level (bottom): Scheme, via a Scheme (to C) compiler, to C; C, via a C compiler, to x86 assembly; x86 assembly, via an x86 assembler, to x86 executable binary.]


Assemblers versus Compilers

Q: What is the difference between a compiler and an assembler?

Assembler:
• very direct mapping (almost 1 to 1) from the source language onto the target language.

Compiler:
• more complex “conceptual” mapping from a high-level (HL) language to a low-level (LL) language.
• 1 construct in the HL language => many instructions in the LL language
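The contrast can be made concrete with a toy sketch. Everything here is hypothetical (the mnemonics, the opcode numbers, and the stack-style instruction set are invented for illustration and do not correspond to x86): the assembler maps each line to exactly one instruction, while the compiler expands a single assignment into several.

```python
# Toy illustration only: mnemonics, opcodes, and instruction set are hypothetical.

# Assembler: (almost) one-to-one mapping from mnemonic to opcode.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "MUL": 0x03, "STORE": 0x04}

def assemble(lines):
    """Translate each assembly line into exactly one (opcode, operand) pair."""
    code = []
    for line in lines:
        mnemonic, *operand = line.split()
        code.append((OPCODES[mnemonic], operand[0] if operand else None))
    return code

# Compiler: one high-level construct expands into many low-level instructions.
def compile_assignment(target, expr):
    """Compile `target = expr`, where expr is a variable name or (op, left, right)."""
    def gen(e):
        if isinstance(e, str):            # a variable reference
            return [f"LOAD {e}"]
        op, left, right = e               # a binary operation
        return gen(left) + gen(right) + [{"+": "ADD", "*": "MUL"}[op]]
    return gen(expr) + [f"STORE {target}"]

# One construct, x = a + b * c, becomes six instructions ...
asm = compile_assignment("x", ("+", "a", ("*", "b", "c")))
# ... and the assembler then turns those six lines into six opcodes.
binary = assemble(asm)
```

Note that `len(binary) == len(asm)` always holds for the assembler, whereas the compiler's output length depends on the structure of the source construct.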


Terminology

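[Diagram: a translator takes a source program as input and produces an object program as output.]
• The source program is expressed in the source language.
• The translator is expressed in the implementation language.
• The object program is expressed in the target language.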

Q: Which programming languages play a role in this picture?


Tombstone Diagrams

What are they?
– diagrams consist of a set of “puzzle pieces” we can use to reason about language processors and programs
– different kinds of pieces
– combination rules (not all diagrams are “well formed”)

The pieces:
• M: a machine implemented in hardware
• S --> T over L: a translator (from S to T) implemented in L
• M over L: an interpreter for language M, implemented in L
• P over L: a program P implemented in L


Tombstone diagrams: Combination rules

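OK! Program P (in M) runs on machine M.

OK! Program P (in S) is fed to an S --> T translator (in M, running on machine M), producing P (in T).

WRONG! Program P (in L) cannot run directly on machine M when L is not M.

WRONG! Program P (in L) cannot be fed to an S --> T translator when L is not S.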
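The combination rules can also be written down as checks in code. The following is a minimal sketch, not part of the original slides; the class and function names (`Program`, `Translator`, `run`, `translate`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Program:
    name: str    # e.g. "Tetris"
    lang: str    # language the program is expressed in

@dataclass(frozen=True)
class Translator:
    source: str  # S
    target: str  # T
    lang: str    # implementation language

def run(program, machine):
    """Well formed only if the program is expressed in the machine's own code."""
    if program.lang != machine:
        raise ValueError(f"WRONG: {program.name} is in {program.lang}, not {machine}")
    return f"{program.name} runs on {machine}"

def translate(program, translator, machine):
    """Run `translator` on `machine` with `program` as its input."""
    if translator.lang != machine:
        raise ValueError("WRONG: the translator cannot run on this machine")
    if program.lang != translator.source:
        raise ValueError("WRONG: the program is not in the translator's source language")
    return replace(program, lang=translator.target)

# Compiling a C program on an x86 machine, then running the result.
c_compiler = Translator(source="C", target="x86", lang="x86")
tetris_x86 = translate(Program("Tetris", "C"), c_compiler, machine="x86")
message = run(tetris_x86, "x86")
```

Calling `run(Program("Tetris", "C"), "x86")` raises `ValueError`, which corresponds to one of the diagrams marked WRONG.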


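Compilation

Example: Compilation of C programs on an x86 machine

[Diagram: Tetris (in C) is fed to a C --> x86 compiler (in x86, running on an x86 machine), producing Tetris (in x86); Tetris (in x86) then runs on the x86 machine.]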


Cross compilation

Example: A C “cross compiler” from x86 to PPC

A cross compiler is a compiler which runs on one machine (the host machine) but emits code for another machine (the target machine).

Host ≠ Target

[Diagram: Tetris (in C) is fed to a C --> PPC compiler (in x86, running on an x86 machine), producing Tetris (in PPC); Tetris (in PPC) is then downloaded to a PPC machine, where it runs.]

Q: Are cross compilers useful? Why would/could we use them?


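Two Stage Compilation

A two-stage translator is a composition of two translators. The output of the first translator is provided as input to the second translator.

[Diagram: Tetris (in Java) is fed to a Java --> JVM compiler (in x86, running on an x86 machine), producing Tetris (in JVM code); Tetris (in JVM code) is fed to a JVM --> x86 compiler (in x86, running on an x86 machine), producing Tetris (in x86).]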
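Composing two translators can be sketched as a small function. This is an illustrative model only: the `Translator` tuple and `compose` helper are assumed names, and the sketch simplifies by requiring both stages to be implemented in the same language, as they are in the example here.

```python
from typing import NamedTuple

class Translator(NamedTuple):
    source: str   # language accepted as input
    target: str   # language emitted as output
    lang: str     # implementation language

def compose(first, second):
    """Build a two-stage translator: the output of `first` feeds `second`."""
    if first.target != second.source:
        raise ValueError("the first stage's output is not the second stage's input")
    if first.lang != second.lang:
        # Simplification for this sketch: both stages run on the same machine.
        raise ValueError("both stages must be implemented in the same language")
    return Translator(first.source, second.target, first.lang)

java_to_jvm = Translator("Java", "JVM", "x86")
jvm_to_x86 = Translator("JVM", "x86", "x86")

# Behaves like a single Java --> x86 translator implemented in x86.
java_to_x86 = compose(java_to_jvm, jvm_to_x86)
```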


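Compiling a Compiler

Observation: A compiler is a program! Therefore it can be provided as input to a language processor.

Example: compiling a compiler.

[Diagram: a Java --> x86 compiler (in C++) is fed to a C++ --> x86 compiler (in x86, running on an x86 machine), producing a Java --> x86 compiler (in x86).]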


Interpreters

An interpreter is a language processor implemented in software, i.e. as a program.

Terminology: abstract (or virtual) machine versus real machine

Example: The Java Virtual Machine

[Diagram: Tetris (in JVM code) runs on a JVM interpreter (in x86), which runs on an x86 machine.]

Q: Why are abstract machines useful?


Interpreters

Q: Why are abstract machines useful?

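1) Abstract machines provide better platform independence.

[Diagram: the same Tetris (in JVM code) runs on a JVM interpreter (in x86) on an x86 machine, and on a JVM interpreter (in PPC) on a PPC machine.]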
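An interpreter for an abstract machine can be very small. The sketch below uses a hypothetical three-instruction stack machine (it is not the JVM, and the instruction names are invented); the point is that the `program` list is independent of the real machine, so it runs unchanged wherever `interpret` is available.

```python
def interpret(code):
    """Interpret code for a tiny, hypothetical stack-based abstract machine."""
    stack = []
    for instr, *args in code:
        if instr == "PUSH":
            stack.append(args[0])
        elif instr == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif instr == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction {instr}")
    return stack.pop()

# Abstract-machine code for 2 + 3 * 4; nothing in it refers to x86 or PPC.
program = [("PUSH", 2), ("PUSH", 3), ("PUSH", 4), ("MUL",), ("ADD",)]
result = interpret(program)
```

Porting to a new platform then means porting only the interpreter, not every program written for the abstract machine.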


Interpreters

Q: Why are abstract machines useful?

2) Abstract machines are useful for testing and debugging.

Example: Testing the “Ultima” processor using hardware emulation

[Diagram: program P (in Ultima code) running on an Ultima emulator (in x86) on an x86 machine is functionally equivalent to P (in Ultima code) running on the Ultima hardware.]

Note: we don’t have to implement the Ultima emulator directly in x86; we can use a high-level language and compile it to x86.


Interpreters versus Compilers

Q: What are the tradeoffs between compilation and interpretation?

Compilers typically offer more advantages when
– programs are deployed in a production setting
– programs are “repetitive”
– the instructions of the programming language are complex

Interpreters typically are a better choice when
– we are in a development/testing/debugging stage
– programs are run once and then discarded
– the instructions of the language are simple
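One way to reason about this tradeoff is a simple break-even calculation. The model below is an assumption made for illustration, not something stated in the slides: compilation has a one-time cost, after which each execution is cheap, while interpretation has no up-front cost but each execution is more expensive. The cost figures are arbitrary units.

```python
def total_cost_compiled(executions, compile_cost, run_cost):
    """One-time compilation cost plus a (cheap) cost per execution."""
    return compile_cost + executions * run_cost

def total_cost_interpreted(executions, run_cost):
    """No up-front cost, but a (higher) cost per execution."""
    return executions * run_cost

def break_even(compile_cost, compiled_run, interpreted_run):
    """Smallest number of executions at which compiling is strictly cheaper.

    Assumes integer costs. Returns None if interpretation is never slower.
    """
    if interpreted_run <= compiled_run:
        return None
    return compile_cost // (interpreted_run - compiled_run) + 1

# Hypothetical figures: compiling costs 50 units, a compiled execution costs 1,
# an interpreted execution costs 11.
n = break_even(compile_cost=50, compiled_run=1, interpreted_run=11)
```

Under these made-up numbers a program executed once is cheaper to interpret, while a program executed many times repays the cost of compiling it.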


Interpretive Compilers

Why?
A tradeoff between fast(er) compilation and reasonable runtime performance.

How?
Use an “intermediate language”
• more high-level than machine code => easier to compile to
• more low-level than the source language => easier to implement as an interpreter

Example: A “Java Development Kit” for machine M

[Diagram: a Java --> JVM compiler (in M) together with a JVM interpreter (in M).]


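Interpretive Compilers

Example: Here is how to use a “Java Development Kit” to run a Java program P

[Diagram: P (in Java) is fed to the Java --> JVM compiler (in M, running on machine M), producing P (in JVM code); P (in JVM code) then runs on the JVM interpreter (in M) on machine M.]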


Portable Compilers

Example: Two different “Java Development Kits”

Kit 1: a Java --> JVM compiler (in M) and a JVM interpreter (in M)

Kit 2: a Java --> JVM compiler (in JVM code) and a JVM interpreter (in M)

Q: Which one is “more portable”?


Portable Compilers

In the previous example we have seen that portability is not an “all or nothing” kind of deal.

It is useful to talk about a “degree of portability” as the percentage of code that needs to be re-written when moving to a dissimilar machine.

In practice 100% portability is not possible.


Example: a “portable” compiler kit

Portable Compiler Kit:
• a Java --> JVM compiler (in Java)
• a JVM interpreter (in Java)
• a Java --> JVM compiler (in JVM code)

Q: Suppose we want to run this kit on some machine M. How could we go about realizing that goal (with the least amount of effort)?


Example: a “portable” compiler kit

Q: Suppose we want to run this kit on some machine M. How could we go about realizing that goal (with the least amount of effort)?

[Diagram: reimplement the JVM interpreter, currently written in Java, in C++; then feed the C++ version to a C++ --> M compiler (in M, running on machine M), producing a JVM interpreter (in M).]


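Example: a “portable” compiler kit

This is what we have now:
• a Java --> JVM compiler (in Java)
• a JVM interpreter (in Java)
• a Java --> JVM compiler (in JVM code)
• a JVM interpreter (in M)

Now, how do we run our Tetris program?

[Diagram: Tetris (in Java) is fed to the Java --> JVM compiler (in JVM code), which runs on the JVM interpreter (in M) on machine M, producing Tetris (in JVM code); Tetris (in JVM code) then runs on the JVM interpreter (in M) on machine M.]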


Bootstrapping

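Remember our “portable compiler kit”:
• a Java --> JVM compiler (in Java)
• a JVM interpreter (in Java)
• a Java --> JVM compiler (in JVM code)
• plus the JVM interpreter (in M)

We haven’t used the Java --> JVM compiler (in Java) yet!

It is written in the same language it compiles! Q: What can we do with a compiler written in itself? Is that useful at all?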


Bootstrapping

[Diagram: a Java --> JVM compiler (in Java).] Same language!

Q: What can we do with a compiler written in itself? Is that useful at all?

• By implementing the compiler in (a subset of) its own language, we become less dependent on the target platform => more portable implementation.

• But… “chicken and egg problem”? How can we get around that? => BOOTSTRAPPING: requires some work to make the first “egg”.

There are many possible variations on how to bootstrap a compiler written in its own language.


Bootstrapping an Interpretive Compiler to Generate M code

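Our “portable compiler kit”:
• a Java --> JVM compiler (in Java)
• a JVM interpreter (in Java)
• a Java --> JVM compiler (in JVM code)
• plus the JVM interpreter (in M)

Goal: we want to get a “completely native” Java compiler on machine M

[Diagram: P (in Java) is fed to a Java --> M compiler (in M, running on machine M), producing P (in M).]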


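Bootstrapping an Interpretive Compiler to Generate M code

Idea: we will build a two-stage Java --> M compiler.

[Diagram: P (in Java) is fed to a Java --> JVM compiler (in M, running on M), producing P (in JVM code); P (in JVM code) is fed to a JVM --> M compiler (in M, running on M), producing P (in M).]

• First stage, Java --> JVM (in M): we will make this by compiling the Java --> JVM compiler (in JVM code).
• Second stage, JVM --> M (in M): to get this we implement a JVM --> M compiler in Java and compile it.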


Bootstrapping an Interpretive Compiler to Generate M code

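Step 1: implement a JVM --> M compiler in Java.

Step 2: compile it.

[Diagram: the JVM --> M compiler (in Java) is fed to the Java --> JVM compiler (in JVM code), which runs on the JVM interpreter (in M) on machine M, producing a JVM --> M compiler (in JVM code).]

Step 3: compile this (the JVM --> M compiler in JVM code).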


Bootstrapping an Interpretive Compiler to Generate M code

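Step 3: “Self compile” the JVM --> M compiler (in JVM code)

[Diagram: the JVM --> M compiler (in JVM code) is fed to itself, running on the JVM interpreter (in M) on machine M, producing a JVM --> M compiler (in M).]

This is the second stage of our compiler!

Step 4: use this to compile the Java compiler.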


Bootstrapping an Interpretive Compiler to Generate M code

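Step 4: Compile the Java --> JVM compiler into machine code

[Diagram: the Java --> JVM compiler (in JVM code) is fed to the JVM --> M compiler (in M, running on machine M), producing a Java --> JVM compiler (in M).]

The first stage of our compiler!

We are DONE!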
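The four steps can be traced mechanically. The sketch below is an illustrative model under stated assumptions: `Translator`, `runs_on`, and `compile_with` are invented names, and the JVM interpreter (in M) is modelled simply by listing "JVM" as a language that machine M can execute via interpretation.

```python
from typing import NamedTuple

class Translator(NamedTuple):
    source: str
    target: str
    lang: str    # implementation language

def runs_on(lang, machine, interpreted=()):
    """True if code in `lang` executes on `machine`, directly or via an interpreter."""
    return lang == machine or lang in interpreted

def compile_with(subject, compiler, machine, interpreted=()):
    """Translate `subject` (itself a translator) using `compiler` on `machine`."""
    if not runs_on(compiler.lang, machine, interpreted):
        raise ValueError("the compiler cannot run on this machine")
    if subject.lang != compiler.source:
        raise ValueError("the subject is not in the compiler's source language")
    return subject._replace(lang=compiler.target)

# From the kit: Java --> JVM (in JVM code). The JVM interpreter (in M) is
# modelled by marking JVM code as interpretable on M.
java_jvm_in_jvm = Translator("Java", "JVM", "JVM")
interpreted = ("JVM",)

# Step 1: implement JVM --> M in Java.
jvm_m_in_java = Translator("JVM", "M", "Java")
# Step 2: compile it with the kit's compiler, running on the interpreter.
jvm_m_in_jvm = compile_with(jvm_m_in_java, java_jvm_in_jvm, "M", interpreted)
# Step 3: self-compile, giving the second stage in M code.
jvm_m_in_m = compile_with(jvm_m_in_jvm, jvm_m_in_jvm, "M", interpreted)
# Step 4: compile the Java --> JVM compiler into M code, giving the first stage.
java_jvm_in_m = compile_with(java_jvm_in_jvm, jvm_m_in_m, "M")
```

After step 4 neither stage needs the interpreter any more: both `java_jvm_in_m` and `jvm_m_in_m` are implemented in M.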


Full Bootstrap

A full bootstrap is necessary when we are building a new compiler from scratch.

Example: We want to implement an Ada compiler for machine M. We don’t currently have access to any Ada compiler (neither on M nor on any other machine).

Idea: Ada is very large, so we will implement the compiler in a subset of Ada (Ada-S) and bootstrap it from an Ada-S compiler written in another language (e.g. C).

Step 1a: Build a compiler (v1) for Ada-S in another language.

[Diagram: v1, an Ada-S --> M compiler (in C).]


Full Bootstrap

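Step 1a: Build a compiler (v1) for Ada-S in another language.

[Diagram: v1, an Ada-S --> M compiler (in C).]

Step 1b: Compile the v1 compiler on M.

[Diagram: v1, the Ada-S --> M compiler (in C), is fed to a C --> M compiler (in M, running on machine M), producing v1 as an Ada-S --> M compiler (in M).]

The C --> M compiler can be used for bootstrapping on machine M, but we do not want to rely on it permanently!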


Full Bootstrap

Step 2a: Implement v2 of the Ada-S compiler in Ada-S.

[Diagram: v2, an Ada-S --> M compiler (in Ada-S).]

Q: Is it hard to rewrite the compiler in Ada-S?

Step 2b: Compile the v2 compiler with the v1 compiler.

[Diagram: v2 (in Ada-S) is fed to v1, the Ada-S --> M compiler (in M, running on machine M), producing v2 as an Ada-S --> M compiler (in M).]

We are now no longer dependent on the availability of a C compiler!


Full Bootstrap

Step 3a: Build a full Ada compiler in Ada-S.

[Diagram: v3, an Ada --> M compiler (in Ada-S).]

Step 3b: Compile it with the v2 compiler.

[Diagram: v3 (in Ada-S) is fed to v2, the Ada-S --> M compiler (in M, running on machine M), producing v3 as an Ada --> M compiler (in M).]

From this point on we can maintain the compiler in Ada. We write subsequent versions v4, v5, ... of the compiler in Ada and compile each with the previous version.
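The full bootstrap can be traced as a chain of compilations. This is a minimal sketch with assumed names (`Translator`, `compile_with`); the `version` field is only a label used to follow each compiler through the steps.

```python
from typing import NamedTuple

class Translator(NamedTuple):
    source: str
    target: str
    lang: str      # implementation language
    version: str   # label only, e.g. "v1"

def compile_with(subject, compiler, machine):
    """Translate `subject` (a compiler) using `compiler` running on `machine`."""
    if compiler.lang != machine:
        raise ValueError("the compiler cannot run on this machine")
    if subject.lang != compiler.source:
        raise ValueError("the subject is not in the compiler's source language")
    return subject._replace(lang=compiler.target)

# An existing C compiler on M, used only to get started.
c_to_m = Translator("C", "M", "M", "existing C compiler")

# Step 1a/1b: v1 is written in C and compiled with the C compiler.
v1 = compile_with(Translator("Ada-S", "M", "C", "v1"), c_to_m, "M")
# Step 2a/2b: v2 is written in Ada-S and compiled with v1.
v2 = compile_with(Translator("Ada-S", "M", "Ada-S", "v2"), v1, "M")
# Step 3a/3b: v3, the full Ada compiler, is written in Ada-S and compiled with v2.
v3 = compile_with(Translator("Ada", "M", "Ada-S", "v3"), v2, "M")
```

Only the first step involves `c_to_m`; from v2 onward every compilation uses a compiler produced by the bootstrap itself.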


Half Bootstrap

We discussed the full bootstrap, which is required when we have no access to a compiler for our language at all.

Q: What if we have access to a compiler for our language on a different machine HM (host machine) but want to develop one for TM (target machine)?

We have:
• an Ada --> HM compiler (in HM)
• an Ada --> HM compiler (in Ada)

We want:
• an Ada --> TM compiler (in TM)

Idea: We can use cross compilation from HM to TM to bootstrap the TM compiler.


Half Bootstrap

Idea: We can use cross compilation from HM to TM to bootstrap the TM compiler.

Step 1: Implement an Ada --> TM compiler in Ada.

Step 2: Compile it on HM.

[Diagram: the Ada --> TM compiler (in Ada) is fed to the Ada --> HM compiler (in HM, running on HM), producing an Ada --> TM compiler (in HM).]

The result is a cross compiler: it runs on HM but emits TM code.


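Half Bootstrap

Step 3: Cross compile our Ada --> TM compiler.

[Diagram: the Ada --> TM compiler (in Ada) is fed to the Ada --> TM cross compiler (in HM, running on HM), producing an Ada --> TM compiler (in TM).]

DONE!

From now on we can develop subsequent versions of the compiler completely on TM.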
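The half bootstrap amounts to compiling the same source twice. The sketch below is illustrative only; `Translator` and `compile_with` are assumed names, not part of the slides.

```python
from typing import NamedTuple

class Translator(NamedTuple):
    source: str
    target: str
    lang: str    # implementation language

def compile_with(subject, compiler, machine):
    """Translate `subject` (a compiler) using `compiler` running on `machine`."""
    if compiler.lang != machine:
        raise ValueError("the compiler cannot run on this machine")
    if subject.lang != compiler.source:
        raise ValueError("the subject is not in the compiler's source language")
    return subject._replace(lang=compiler.target)

# We have: a native Ada compiler on the host machine HM.
ada_hm = Translator("Ada", "HM", "HM")

# Step 1: implement an Ada --> TM compiler in Ada.
ada_tm_src = Translator("Ada", "TM", "Ada")
# Step 2: compile it on HM; the result runs on HM but emits TM code.
cross = compile_with(ada_tm_src, ada_hm, "HM")
# Step 3: compile the same source again, this time with the cross compiler.
native = compile_with(ada_tm_src, cross, "HM")
```

Both compilations run on HM; only the final result, `native`, is implemented in TM code and can be moved to the target machine.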


Bootstrapping to Improve Efficiency

The efficiency of programs and compilers:

Efficiency of programs:
- memory usage
- runtime

Efficiency of compilers:
- efficiency of the compiler itself
- efficiency of the emitted code

Idea: We start from a simple compiler (generating inefficient code) and develop a more sophisticated version of it. We can then use bootstrapping to improve the performance of the compiler.


Bootstrapping to Improve Efficiency

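We have:
• an Ada --> Mslow compiler (in Ada)
• an Ada --> Mslow compiler (in Mslow)

We implement:
• an Ada --> Mfast compiler (in Ada)

Step 1: [Diagram: the Ada --> Mfast compiler (in Ada) is fed to the Ada --> Mslow compiler (in Mslow, running on M), producing an Ada --> Mfast compiler (in Mslow).]

Step 2: [Diagram: the Ada --> Mfast compiler (in Ada) is fed to the Ada --> Mfast compiler (in Mslow, running on M), producing an Ada --> Mfast compiler (in Mfast).]

Fast compiler that emits fast code!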
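The two steps can be traced in a small model that tracks code quality alongside the language. This is a sketch under assumptions: `Lang`, `Translator`, and `compile_with` are invented names, and "slow"/"fast" are just labels on M code, not measurements.

```python
from typing import NamedTuple

class Lang(NamedTuple):
    name: str
    quality: str = ""   # "slow" or "fast" for machine code, empty otherwise

class Translator(NamedTuple):
    source: Lang
    target: Lang
    lang: Lang          # implementation language (and its code quality)

def compile_with(subject, compiler, machine):
    """Translate `subject` (a compiler) using `compiler` running on `machine`."""
    if compiler.lang.name != machine:
        raise ValueError("the compiler cannot run on this machine")
    if subject.lang != compiler.source:
        raise ValueError("the subject is not in the compiler's source language")
    return subject._replace(lang=compiler.target)

ADA, M_SLOW, M_FAST = Lang("Ada"), Lang("M", "slow"), Lang("M", "fast")

# We have: a compiler that is itself slow code and emits slow code.
old = Translator(ADA, M_SLOW, M_SLOW)
# We implement: a better compiler, written in Ada, that emits fast code.
new_src = Translator(ADA, M_FAST, ADA)

# Step 1: compile the new source with the old compiler.
step1 = compile_with(new_src, old, "M")     # emits fast code, but is slow itself
# Step 2: compile the new source again, now with the step 1 compiler.
step2 = compile_with(new_src, step1, "M")   # emits fast code and is fast itself
```

The second compilation is what makes the compiler itself fast: it is the compiler's own source that gets translated into fast code.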