Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer...

20
Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages

Transcript of Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer...

Page 1: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

Compilation in More Detail

Asst. Prof. Dr. Ahmet SayarSpring-2012

Kocaeli University Computer Engineering Department

Principles of Programming Languages

Page 2: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

Phases of Compilation

Page 3: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

3

The Structure of a Compiler

1. Lexical Analysis2. Parsing3. Semantic Analysis4. Symbol Table5. Optimization6. Code Generation

The first 3, at least, can be understood by analogy to how humans comprehend English.

Page 4: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

4

1. Lexical Analysis

• First step: recognize words.– Smallest unit above letters

This is a sentence.

• Note the– Capital “T” (start of sentence symbol)– Blank “ “ (word separator)– Period “.” (end of sentence symbol)

Page 5: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

5

More Lexical Analysis

• Lexical analysis is not trivial. Consider:ist his ase nte nce

• Plus, programming languages are typically more cryptic than English:

*p->f += -.12345e-5

Page 6: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

6

And More Lexical Analysis

• Lexical analyzer divides program text into “words” or “tokens”

if x == y then z = 1; else z = 2;

• Units: if, x, ==, y, then, z, =, 1, ;, else, z, =, 2, ;

Page 7: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

7

2. Parsing

• Once words are understood, the next step is to understand sentence structure

• Parsing = Diagramming Sentences– The diagram is a tree

Page 8: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

Diagramming a Sentence

8

This line is a longer sentence

verbarticle noun article adjective noun

subject object

sentence

Page 9: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

9

Parsing Programs

• Parsing program expressions is the same• Consider:

If x == y then z = 1; else z = 2;• Diagrammed:

if-then-else

x y z 1 z 2==

assignrelation assign

predicate else-stmtthen-stmt

Page 10: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

10

3. Semantic Analysis• Once sentence structure is understood, we can try to understand

“meaning”– But meaning is too hard for compilers

• Compilers perform limited analysis to catch inconsistencies

• Some do more analysis to improve the performance of the program

Parse treeSemantic Analyzer

Intermediate Program

Page 11: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

11

Semantic Analysis in English

• Example:Jack said Jerry left his assignment at home.

What does “his” refer to? Jack or Jerry?

• Even worse:Jack said Jack left his assignment at home?

How many Jacks are there?Which one left the assignment?

Page 12: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

• Programming languages define strict rules to avoid such ambiguities

• This C++ code prints “4”; the inner definition is used

Semantic Analysis in Programming

12

{int Jack = 3;{

int Jack = 4;cout <<

Jack;}

}

Page 13: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

13

More Semantic Analysis

• Compilers perform many semantic checks besides variable bindings

• Example:Jack left her homework at home.

• A “type mismatch” between her and Jack; we know they are different people– Presumably Jack is male

Page 14: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

• After the lexical analyzing and parsing, symbol table is created.

• Lexical anayzer gives symbol table as an output• Below table shows tokens for a pascal statemet

– toplam:=değer+10;

4. Symbol Table

Page 15: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

15

5. Optimization

• No strong counterpart in English, but akin to editing

• Automatically modify programs so that they– Run faster– Use less memory– In general, conserve some resource

Page 16: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

• Result = function1(a+b) + function2(a+b)• How do you optimize this code?

Machine independent optimization

Page 17: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

• Code motion– Invariant expressions should be executed only once– E.g.

for (int i = 0; i < x.length; i++) x[i] *= Math.PI * Math.cos(y);

double picosy = Math.PI * Math.cos(y);for (int i = 0; i < x.length; i++) x[i] *= picosy;

Machine independent optimization

Page 18: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

• Multiplying a number with the power of 2– Shift to the left

• Dividing a number with the power of 2– Shift to the right

Machine dependent optimization

Page 19: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

19

Issues

• Compiling is almost this simple, but there are many pitfalls.

• Example: How are erroneous programs handled?

• Language design has big impact on compiler– Determines what is easy and hard to compile– Course theme: many trade-offs in language design

Page 20: Compilation in More Detail Asst. Prof. Dr. Ahmet Sayar Spring-2012 Kocaeli University Computer Engineering Department Principles of Programming Languages.

20

Compilers Today

• The overall structure of almost every compiler adheres to this outline

• The proportions have changed since FORTRAN– Early: lexing, parsing most complex, expensive

– Today: optimization dominates all other phases, lexing and parsing are cheap