1 COMP 3438 – Part II-Lecture 1: Overview of Compiler Design Dr. Zili Shao Department of Computing...

Page 1:

COMP 3438 – Part II-Lecture 1: Overview of Compiler Design

Dr. Zili Shao

Department of Computing

The Hong Kong Polytechnic Univ.

Page 2:

Overview of the Subject (COMP 3438)

Course Organization (this lecture: Overview of Compiler Design)

Part I: Unix System Programming (Device Driver Development)

Overview of Unix Sys. Prog.

Process File System

Overview of Device Driver Development

Character Device Driver Development

Introduction to Block Device Driver

Part II: Compiler Design

Overview of Compiler Design

Lexical Analysis (HW #3)

Syntax Analysis (HW #4)

Page 3:

Outline

Programming language: High-level vs. Low-level

What is a compiler?

Phases of a compiler

Page 4:

Programming language – Machine Language

Machine languages

Everything is a binary number

Operations, data, addresses, …

e.g. In MIPS 2000,

0010 0100 1010 0110 0000 0000 0000 0100

# $t5 + 4 → $t6

Machines like it BUT not us

Page 5:

Programming language – Assembly Language

Assembly languages

Symbolic representation of Machine Language

e.g.

Machine Code:

0010 0100 1010 0110 0000 0000 0000 0100

# $t5 + 4 → $t6

Assembly Code: add $t6, $t5, 4
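
The correspondence between the bit pattern and the assembly line can be checked by splitting the word into fields. The sketch below assumes the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate); the function name `decode_itype` is made up for this example. Note that the fields decode to opcode 9 (`addiu` in the MIPS instruction set) with register numbers 5 and 6, so the `$t5`/`$t6` names on the slide appear to be used loosely for "register 5" and "register 6".

```python
# Decode the 32-bit word from the slide as a MIPS I-type instruction.
# Assumed field layout: opcode[31:26] rs[25:21] rt[20:16] immediate[15:0]

def decode_itype(word: int) -> dict:
    """Split a 32-bit I-type instruction word into its four fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # top 6 bits
        "rs": (word >> 21) & 0x1F,      # source register number
        "rt": (word >> 16) & 0x1F,      # destination register number
        "imm": word & 0xFFFF,           # 16-bit immediate value
    }

bits = "0010 0100 1010 0110 0000 0000 0000 0100"
word = int(bits.replace(" ", ""), 2)
fields = decode_itype(word)
print(hex(word), fields)
# prints: 0x24a60004 {'opcode': 9, 'rs': 5, 'rt': 6, 'imm': 4}
```

Read this way, the word says "register 6 = register 5 + 4", which is what the assembly line expresses symbolically.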

Page 6:

High-level Programming language

High-level languages

Procedural (modular) programming

Group instructions into meaningful abstractions, e.g., data types, control structures, functions, etc.

C, Pascal, Perl

Object oriented programming

Group “data” and “methods” into “objects”

Naturally represents the world around us

C++, Java, JavaScript

Logic programming: Prolog

Functional programming: ML

Page 7:

Why High-level Languages?

Hide unnecessary details: a higher level of abstraction increases productivity

Make programs more robust, e.g., the meaning of information is specified before its use, enabling substantial error checking at compile time

Make programs more portable

Page 8:

Compilers are Translators

Source: C/C++, Fortran, Java, Perl, Matlab, Natural Language, Command

Translate

Target: Machine code, Virtual Machine Code, Transformed code (C, Java, …), Lower level commands, Semantic components, …

Page 9:

Translation Mechanisms

Compilation

To translate a source program in one language into an executable program in another language, and produce results while executing the new program

Examples: C, C++, Fortran

Interpretation

To read a source program and produce results while understanding that program

Examples: Basic

Case Study: Java

First, translate to Java bytecode (compilation)

Second, execute by interpretation (JVM) / compilation (JIT (Just-In-Time))
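
The two-step scheme in the Java case study (compile to bytecode, then run the bytecode on a virtual machine) can be tried out in Python, which is used here only as an analogy and is not part of the slides. The sketch relies on Python's built-in `compile` and `exec`; the source string and the `namespace` dictionary are made up for the example.

```python
source = "result = 2 + 3 * 4"

# Step 1 (compilation): translate the source text into a bytecode object.
code_obj = compile(source, "<demo>", "exec")
print(len(code_obj.co_code), "bytes of bytecode")

# Step 2 (interpretation): the Python virtual machine executes the bytecode.
namespace = {}
exec(code_obj, namespace)
print(namespace["result"])  # prints 14
```

The bytecode object can be executed many times without re-reading the source, which is the same benefit the Java approach gets from separating the two steps.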

Page 10:

Comparison of Compiler/Interpreter

Overview

Compiler: Source Code → Compiler → Object Code; Data → Object Code → Results

Interpreter: Source Code + Data → Interpreter → Results

Advantages

Compiler: Fast program execution; Fully exploit architecture features

Interpreter: Easy to debug; Flexible to modify; Machine independent

Disadvantages

Compiler: Pre-processing of program; Complicated

Interpreter: Execution overhead

Page 11:

What is a compiler?

A compiler is a piece of software that takes a program written in one language (called the source language) and translates it into an equivalent program in another language (called the target language).

It also reports to its user the presence of errors in the source program.

Source program → Compiler → Target program

Compiler → Error messages

Page 12:

The Phases of a Compiler

Source program

Lexical Analyzer

Syntax Analyzer (Parser)

Semantic Analyzer

Intermediate Code Generator

Code Optimizer

Code Generator

Target program

Symbol-table Manager

Error Handler


Page 13:

Lexical Analysis

Scan the source program and group sequences of characters into tokens.

A token is the smallest element of a language: a group of characters (e.g., a series of alphabetic characters forms a keyword; a series of digits forms a number).

The sub-module of the compiler that performs lexical analysis is called a lexical analyzer.

Example: position := initial + rate * 60 (Pascal statement)

Value       Token Type
position    ID
:=          Operator
initial     ID
+           Operator
rate        ID
*           Operator
60          NUM
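
A minimal sketch of a lexical analyzer for the example statement, written in Python with regular expressions. The token type names ID, NUM and Operator follow the slide; the pattern list `TOKEN_SPEC` and the function `tokenize` are assumptions made for this illustration and cover only what the example needs.

```python
import re

# Token patterns, tried in order. SKIP matches whitespace, which is discarded.
TOKEN_SPEC = [
    ("NUM", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("Operator", r":=|[+\-*/]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (value, token_type) pairs; raise on unknown characters."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.group(), m.lastgroup))
        pos = m.end()
    return tokens

for value, kind in tokenize("position := initial + rate * 60"):
    print(f"{value:10} {kind}")
```

Tools such as Lex and flex generate this kind of scanner from a list of patterns, so the hand-written loop here is only meant to show the idea.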

Page 14:

Syntax Analysis

Once the tokens are identified, syntax analysis groups sequences of tokens into language constructs

e.g., identifiers, numbers, and operators can be grouped into expressions.

e.g., keywords, identifiers, expressions and operators can be combined to form statements.

The sub-module of the compiler that performs syntax analysis is called the parser (syntax analyzer).

Page 15:

Syntax Analysis – Syntax (Parse) Tree

The result of syntax analysis is recorded in a hierarchical structure called a syntax tree. Each node represents an operation, and its children represent the arguments of the operation.

Evaluation begins from the bottom and moves up.

e.g., parse tree for position := initial + rate * 60

        =
      /   \
   id1     +
         /   \
      id2     *
            /   \
         id3     NUM (60)
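
A minimal sketch of how a parser could build this tree, using recursive descent in Python. The grammar (stmt → ID := expr, expr → term (+ term)*, term → factor (* factor)*, factor → ID | NUM) is an assumption chosen to cover the example only, and the tree is stored as nested tuples (operator, left, right), with ":=" at the root where the slide writes "=".

```python
import re

def parse(text):
    """Parse 'ID := expr' into a nested-tuple syntax tree (op, left, right)."""
    # Simplification: characters outside these token patterns are ignored.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|:=|[+*]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tokens[pos - 1]

    def factor():                      # factor -> ID | NUM
        tok = advance()
        if tok in (":=", "+", "*"):
            raise SyntaxError(f"unexpected operator {tok!r}")
        return int(tok) if tok.isdigit() else tok

    def term():                        # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():                        # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = advance()                 # stmt -> ID ':=' expr
    if advance() != ":=":
        raise SyntaxError("expected ':='")
    tree = (":=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

def evaluate(node, env):
    """Evaluate an expression subtree bottom-up: children first, then the operator."""
    if isinstance(node, int):
        return node
    if isinstance(node, str):
        return env[node]
    op, left, right = node
    lhs, rhs = evaluate(left, env), evaluate(right, env)
    return lhs + rhs if op == "+" else lhs * rhs

tree = parse("position := initial + rate * 60")
print(tree)
# prints: (':=', 'position', ('+', 'initial', ('*', 'rate', 60)))
print(evaluate(tree[2], {"initial": 10, "rate": 2}))  # prints 130
```

Because `term` is called from inside `expr`, the multiplication ends up lower in the tree than the addition, which is why it is evaluated first.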

Page 16:

Semantic Analysis

Determine the meaning using the syntax tree

Put semantic meaning into the syntax tree

Perform checks to ensure that components fit together meaningfully, e.g., type checking

        =
      /   \
   id1     +
         /   \
      id2     *
            /   \
         id3     inttoreal
                    |
                 NUM (60)
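
A minimal sketch of the type-checking step that inserts inttoreal, in Python. It assumes the syntax tree is stored as nested tuples (operator, left, right) and that all three variables were declared real; the `TYPES` table and the function `annotate` are hypothetical names for this illustration.

```python
# Assumed declarations: the slide's example implies the variables are real.
TYPES = {"position": "real", "initial": "real", "rate": "real"}

def annotate(node):
    """Return (typed_node, type); wrap an int operand in inttoreal when mixed with real."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, str):
        if node not in TYPES:
            raise NameError(f"undeclared identifier {node!r}")
        return node, TYPES[node]
    op, left, right = node
    left, lt = annotate(left)
    right, rt = annotate(right)
    if op == ":=":
        if lt == "int" and rt == "real":
            raise TypeError("cannot assign a real value to an int variable")
        if lt == "real" and rt == "int":
            right = ("inttoreal", right)
        return (op, left, right), lt
    if lt != rt:                       # mixed int/real: convert the int side
        if lt == "int":
            left, lt = ("inttoreal", left), "real"
        else:
            right, rt = ("inttoreal", right), "real"
    return (op, left, right), lt

tree = (":=", "position", ("+", "initial", ("*", "rate", 60)))
typed_tree, _ = annotate(tree)
print(typed_tree)
# prints: (':=', 'position', ('+', 'initial', ('*', 'rate', ('inttoreal', 60))))
```

The only change to the tree is the new inttoreal node above the constant 60, matching the tree on the slide.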

Page 17:

Intermediate Code Generation

Generate IR (Intermediate Representation) code

        =
      /   \
   id1     +
         /   \
      id2     *
            /   \
         id3     inttoreal
                    |
                 NUM (60)

temp1 := inttoreal(60)
temp2 := id3*temp1
temp3 := id2+temp2
id1 := temp3

Easier to generate machine code from IR code
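
A minimal sketch of how the IR lines could be produced from the annotated tree, using a post-order walk in Python. The nested-tuple tree format, the `names` mapping from variables to id1..id3, and the function `gen_ir` are assumptions made for this illustration.

```python
def gen_ir(tree, names):
    """Emit three-address code for an assignment tree; names maps variables to id1, id2, ..."""
    code, counter = [], 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def walk(node):
        """Emit code for the children first, then for the node; return where the value lives."""
        if isinstance(node, int):
            return str(node)
        if isinstance(node, str):
            return names[node]
        if node[0] == "inttoreal":
            arg = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({arg})")
            return temp
        op, left, right = node
        lhs, rhs = walk(left), walk(right)
        temp = new_temp()
        code.append(f"{temp} := {lhs} {op} {rhs}")
        return temp

    _, target, value = tree
    code.append(f"{names[target]} := {walk(value)}")
    return code

typed_tree = (":=", "position", ("+", "initial", ("*", "rate", ("inttoreal", 60))))
names = {"position": "id1", "initial": "id2", "rate": "id3"}
for line in gen_ir(typed_tree, names):
    print(line)
```

Each interior node of the tree gets one temporary, so the output has one line per operation plus the final assignment, in the same order as on the slide.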

Page 18:

Code Optimization

Modify program representation so that program can run faster, use less memory, power, …

IR Code:

temp1 := inttoreal(60)
temp2 := id3*temp1
temp3 := id2+temp2
id1 := temp3

Optimized Code:

temp1 := id3*60.0
id1 := id2+temp1
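
The two improvements in this example are converting the constant 60 at compile time and removing the final copy. A minimal Python sketch of both follows; it assumes each IR instruction is stored as a tuple (dest, op, arg1, arg2) and that every temporary is used exactly once, and `optimize` is a hypothetical name. Unlike the slide, the sketch does not renumber the surviving temporary, so it stays temp2 instead of becoming temp1.

```python
def optimize(ir):
    """ir: list of (dest, op, arg1, arg2); op is 'inttoreal', 'copy', '+' or '*'."""
    consts, out = {}, []
    # Pass 1: evaluate inttoreal on integer literals now and substitute
    # the resulting real constant into later uses of that temporary.
    for dest, op, a, b in ir:
        a, b = consts.get(a, a), consts.get(b, b)
        if op == "inttoreal" and a.isdigit():
            consts[dest] = str(float(a))
        else:
            out.append((dest, op, a, b))
    # Pass 2: drop a trailing copy 'x := temp' by writing the result to x directly
    # (safe only because each temporary is assumed to have a single use).
    if len(out) >= 2 and out[-1][1] == "copy" and out[-2][0] == out[-1][2]:
        dest = out.pop()[0]
        _, op, a, b = out.pop()
        out.append((dest, op, a, b))
    return out

ir = [
    ("temp1", "inttoreal", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
for dest, op, a, b in optimize(ir):
    print(f"{dest} := {a} {op} {b}")
# prints:
# temp2 := id3 * 60.0
# id1 := id2 + temp2
```

Four instructions become two, and the run-time conversion of 60 disappears entirely.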

Page 19:

Code Generation

Generate target program.

Optimized Code:

temp1 := id3*60.0
id1 := id2+temp1

Machine Code:

MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
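
A minimal sketch of a code generator that turns the optimized IR into the slide's pseudo-assembly, in Python. It assumes IR instructions are tuples (dest, op, arg1, arg2), uses the mnemonics MOVF, MULF and ADDF as they appear on the slide, and has a two-register pool that is never refilled, so it handles at most two instructions. The function `gen_code` and the allocation order are assumptions for this illustration.

```python
OPCODES = {"*": "MULF", "+": "ADDF"}

def gen_code(ir):
    """Translate (dest, op, arg1, arg2) tuples into two-operand pseudo-assembly."""
    free = ["R1", "R2"]               # tiny register pool; pop() hands out R2 first
    where = {}                        # temporary name -> register holding its value
    asm = []

    def operand(x):
        if x in where:                # value already sits in a register
            return where[x]
        return f"#{x}" if x[0].isdigit() else x   # literal vs. memory name

    for dest, op, a, b in ir:
        reg = free.pop()
        asm.append(f"MOVF {operand(a)}, {reg}")          # load first operand
        asm.append(f"{OPCODES[op]} {operand(b)}, {reg}")  # combine with second
        if dest.startswith("temp"):
            where[dest] = reg          # keep the temporary in its register
        else:
            asm.append(f"MOVF {reg}, {dest}")            # store to the variable
    return asm

optimized = [("temp1", "*", "id3", "60.0"), ("id1", "+", "id2", "temp1")]
for line in gen_code(optimized):
    print(line)
```

Keeping temp1 in R2 instead of writing it to memory is what lets the addition use `ADDF R2, R1` directly.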

Page 20:

Symbol Table Management

Collect and maintain information about IDs

Attributes:

Storage: where to store (Data, Heap, Stack, …)

Type: char, int, pointer, …

Scope: effective range

Number: value

Information is added and used by all phases

Debuggers use symbol table
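
A minimal sketch of a symbol table that records the attributes listed above, in Python. It assumes nested scopes are handled with a stack of dictionaries; the class names `Symbol` and `SymbolTable` and their methods are hypothetical, and the value attribute is left out for brevity.

```python
from dataclasses import dataclass

@dataclass
class Symbol:
    name: str
    type: str          # e.g. "char", "int", "pointer"
    storage: str       # e.g. "data", "heap", "stack"
    scope: int         # nesting depth at which the name was declared

class SymbolTable:
    """A stack of dictionaries, one per open scope (innermost last)."""

    def __init__(self):
        self.scopes = [{}]

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_, storage):
        current = self.scopes[-1]
        if name in current:
            raise NameError(f"{name!r} already declared in this scope")
        current[name] = Symbol(name, type_, storage, len(self.scopes) - 1)
        return current[name]

    def lookup(self, name):
        for scope in reversed(self.scopes):   # innermost declaration wins
            if name in scope:
                return scope[name]
        raise NameError(f"{name!r} is not declared")

table = SymbolTable()
table.declare("rate", "real", "data")
table.enter_scope()
table.declare("rate", "int", "stack")      # shadows the outer declaration
print(table.lookup("rate").type)           # prints int
table.exit_scope()
print(table.lookup("rate").type)           # prints real
```

The lexical analyzer or parser would typically call `declare`, while later phases such as type checking and code generation call `lookup`.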

Page 21:

Front End and Back End

Source program

Lexical Analyzer

Syntax Analyzer (Parser)

Semantic Analyzer

Intermediate Code Generator

Code Optimizer

Code Generator

Target program

Symbol-table Manager

Error Handler

Front End

Back End


Page 22:

Distinction between Phases and Passes

Passes: the number of times the compiler goes through a program representation

1-pass, 2-pass, multiple-pass compilation

As languages become more complex, more passes are needed

Phases: conceptual stages

Not completely separate

Semantic phase may do things that syntax should do

Page 23:

Compiler Tools

Phases                Tools
Lexical Analysis      Lex, flex
Syntax Analysis       yacc, bison
Semantic Analysis
Intermediate Code
Code Optimization
Code Generation