Chapter 2 A Simple Compiler
2
Outline
• 2.1 The Structure of a Micro Compiler
• 2.2 A Micro Scanner
• 2.3 The Syntax of Micro
• 2.4 Recursive Descent Parsing
• 2.5 Translating Micro
3
Micro
• Micro: a very simple language
  – Only integers
  – No declarations
  – Variables consist of A..Z and 0..9, and are at most 32 characters long.
  – Comments begin with -- and end with end-of-line.
  – Three kinds of statements:
    • assignments, e.g., a := b + c
    • read(list of ids), e.g., read(a, b)
    • write(list of exps), e.g., write(a+b)
  – begin, end, read, and write are reserved words.
  – Tokens may not extend to the following line.
4
The Structure of a Micro compiler
• One-pass type, no explicit intermediate representations used
  – See P. 9, Fig. 1.3
• The interface
  – Parser is the main routine.
  – Parser calls scanner to get the next token.
  – Parser calls semantic routines at appropriate times.
  – Semantic routines produce output in assembly language.
  – A simple symbol table is used by the semantic routines.
5
[Figure: Source Program (Character Stream) → Scanner → Tokens → Parser → Syntactic Structure → Semantic Routines → Target Machine Code; Symbol and Attribute Tables (used by all phases of the compiler)]
6
A Micro Scanner
• The Micro scanner will be a function of no arguments that returns token values
  – There are 14 tokens.
typedef enum token_types {
    BEGIN, END, READ, WRITE, ID, INTLITERAL,
    LPAREN, RPAREN, SEMICOLON, COMMA, ASSIGNOP,
    PLUSOP, MINUSOP, SCANEOF
} token;
extern token scanner(void);
7
A Program Example of Micro
begin A:=BB-314+A; end SCANEOF
9
A Micro Scanner (Cont’d)
• The scanner returns the longest string that constitutes a token, e.g., in
    abcdef
  ab, abc, and abcdef are all valid tokens. The scanner will return the longest one (i.e., abcdef).
10
continue: skips the rest of the current loop iteration
getchar(), isspace(), isalpha(), isalnum(), isdigit()
ungetc(): pushes back one character
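The library calls listed above fit together roughly as follows. This is a minimal sketch, not the scanner shown on the slides: it reads from a FILE * supplied by the caller (the slides' scanner() takes no arguments and reads standard input with getchar()), the name scan_from is an assumption, and reserved words are not yet separated from identifiers.

```c
#include <ctype.h>
#include <stdio.h>

typedef enum token_types {
    BEGIN, END, READ, WRITE, ID, INTLITERAL,
    LPAREN, RPAREN, SEMICOLON, COMMA, ASSIGNOP,
    PLUSOP, MINUSOP, SCANEOF
} token;

/* Hypothetical helper: returns the next token read from `in`.
 * Every name is reported as ID; reserved words are not checked here.
 * Unrecognized characters are reported on stderr and skipped. */
token scan_from(FILE *in)
{
    int c;

    while ((c = getc(in)) != EOF) {
        if (isspace(c))
            continue;                     /* skip white space */
        if (isalpha(c)) {
            /* Longest match: keep reading while the name can still grow. */
            while (isalnum(c = getc(in)))
                ;
            ungetc(c, in);                /* push back the character that ended the name */
            return ID;
        }
        if (isdigit(c)) {
            while (isdigit(c = getc(in)))
                ;
            ungetc(c, in);
            return INTLITERAL;
        }
        switch (c) {
        case '(': return LPAREN;
        case ')': return RPAREN;
        case ';': return SEMICOLON;
        case ',': return COMMA;
        case '+': return PLUSOP;
        case ':':
            c = getc(in);
            if (c == '=')
                return ASSIGNOP;
            ungetc(c, in);
            fprintf(stderr, "lexical error: ':' without '='\n");
            continue;
        case '-':
            c = getc(in);
            if (c == '-') {               /* comment: skip to end of line */
                while ((c = getc(in)) != '\n' && c != EOF)
                    ;
                continue;
            }
            ungetc(c, in);
            return MINUSOP;
        default:
            fprintf(stderr, "lexical error: unexpected '%c'\n", c);
            continue;
        }
    }
    return SCANEOF;
}
```

Each `continue` restarts the outer while loop, so white space, comments, and bad characters are all skipped in the same way. One character of pushback is all this scanner needs, which is what ungetc() guarantees.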
11
12
A Micro Scanner (Cont’d)
• How to handle RESERVED words?
  – Reserved words are similar to identifiers.
• Two approaches:
  – Use a separate table of reserved words
  – Put all reserved words into the symbol table initially.
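The first approach, a separate table of reserved words, might look like the sketch below. The signature check_reserved(const char *) is an assumption made so the example stands alone, and case-sensitive matching is likewise an assumption.

```c
#include <stddef.h>
#include <string.h>

typedef enum token_types {
    BEGIN, END, READ, WRITE, ID, INTLITERAL,
    LPAREN, RPAREN, SEMICOLON, COMMA, ASSIGNOP,
    PLUSOP, MINUSOP, SCANEOF
} token;

/* Separate table of reserved words and their token codes. */
static const struct {
    const char *text;
    token       code;
} reserved_words[] = {
    { "begin", BEGIN }, { "end", END }, { "read", READ }, { "write", WRITE },
};

/* Returns the reserved-word token for `name`, or ID for an ordinary identifier.
 * Matching is case-sensitive (an assumption of this sketch). */
token check_reserved(const char *name)
{
    size_t n = sizeof reserved_words / sizeof reserved_words[0];

    for (size_t i = 0; i < n; i++)
        if (strcmp(name, reserved_words[i].text) == 0)
            return reserved_words[i].code;
    return ID;
}
```

The scanner would call this once a complete name has been collected, so a name such as beginning is still an ordinary identifier.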
13
A Micro Scanner (Cont’d)
• Provision for saving the characters of a token as they are scanned
  – token_buffer, buffer_char(), clear_buffer(), check_reserved()
• Handle end of file
  – feof(stdin)
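A minimal sketch of the token buffer follows. The buffer size is taken from the 32-character limit on names; the int return value of buffer_char() and the MAX_TOKEN_LEN name are assumptions of this sketch.

```c
#define MAX_TOKEN_LEN 32            /* assumed from the 32-character limit on names */

char token_buffer[MAX_TOKEN_LEN + 1];
static int token_length = 0;

/* Empties the buffer before a new token is collected. */
void clear_buffer(void)
{
    token_length = 0;
    token_buffer[0] = '\0';
}

/* Appends c to the buffer and keeps it NUL-terminated.
 * Returns 0 (and stores nothing) if the token is already at its maximum length. */
int buffer_char(int c)
{
    if (token_length >= MAX_TOKEN_LEN)
        return 0;
    token_buffer[token_length++] = (char)c;
    token_buffer[token_length] = '\0';
    return 1;
}
```

The scanner would call clear_buffer() at the start of each token and buffer_char() for every character of a name or literal, so that the text is available to later routines.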
14
Complete Scanner Function for Micro
15
16
The Syntax of Micro
• Micro's syntax is defined by a context-free grammar (CFG)
  – A CFG is also called a BNF (Backus-Naur Form) grammar
• A CFG consists of a set of production rules,
    A → B C D … Z
    LHS → RHS
  – LHS must be a single nonterminal
  – RHS consists of 0 or more terminals or nonterminals
17
The Syntax of Micro (Cont’d)
• Two kinds of symbols
  – Nonterminals
    • Delimited by < and >
    • Represent syntactic structures
  – Terminals
    • Represent tokens
• E.g.
    <program> → begin <statement list> end
• Start or goal symbol
• λ: empty or null string
18
The Syntax of Micro (Cont’d)
• E.g.
    <statement list> → <statement> <statement tail>
    <statement tail> → λ
    <statement tail> → <statement> <statement tail>
19
The Syntax of Micro (Cont’d)
• Extended BNF: some abbreviations
  1. optional: [ ] 0 or 1
      <stmt> → if <exp> then <stmt>
      <stmt> → if <exp> then <stmt> else <stmt>
    can be written as
      <stmt> → if <exp> then <stmt> [ else <stmt> ]
  2. repetition: { } 0 or more
      <stmt list> → <stmt> <tail>
      <tail> → λ
      <tail> → <stmt> <tail>
    can be written as
      <stmt list> → <stmt> { <stmt> }
20
The Syntax of Micro (Cont’d)
• Extended BNF: some abbreviations
  3. alternative: | or
      <stmt> → <assign>
      <stmt> → <if stmt>
    can be written as
      <stmt> → <assign> | <if stmt>
• Extended BNF == BNF
  – Either can be transformed to the other.
  – Extended BNF is more compact and readable
21
The Syntax of Micro (Cont’d)
22
• The derivation of begin ID:= ID + (INTLITERAL – ID); end
23
The Syntax of Micro (Cont’d)
• A CFG defines a language, which is a set of sequences of tokens
• Syntax errors & semantic errors
    e.g., A:=‘X’+True; can be syntactically well formed yet semantically wrong (the operand types do not fit +)
24
The Syntax of Micro (Cont’d)
• Associativity
    A-B-C
• Operator precedence
    A+B*C
25
• A grammar fragment defines such a precedence relationship
26
• With parentheses, the desired grouping can be forced
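How a layered grammar fixes associativity, precedence, and parenthesized grouping can be seen in a small evaluator. The grammar in the comments is an assumed fragment in the usual style, not necessarily the one on the slide, and it includes * only for illustration (Micro itself has just + and -). The names evaluate and cursor are hypothetical.

```c
#include <ctype.h>

static const char *cursor;          /* next unread character of the input */

static long expression(void);

static void skip_spaces(void)
{
    while (isspace((unsigned char)*cursor))
        cursor++;
}

/* <primary> -> ( <expression> ) | INTLITERAL */
static long primary(void)
{
    long v = 0;

    skip_spaces();
    if (*cursor == '(') {
        cursor++;                   /* consume '(' */
        v = expression();
        skip_spaces();
        if (*cursor == ')')
            cursor++;               /* consume ')' */
        return v;
    }
    while (isdigit((unsigned char)*cursor))
        v = v * 10 + (*cursor++ - '0');
    return v;
}

/* <factor> -> <primary> { * <primary> }
 * Being one level below <expression> makes * bind tighter than + and -. */
static long factor(void)
{
    long v = primary();

    for (;;) {
        skip_spaces();
        if (*cursor != '*')
            return v;
        cursor++;
        v *= primary();
    }
}

/* <expression> -> <factor> { (+ | -) <factor> }
 * The loop folds operands left to right, which gives left associativity. */
static long expression(void)
{
    long v = factor();

    for (;;) {
        skip_spaces();
        if (*cursor == '+')      { cursor++; v += factor(); }
        else if (*cursor == '-') { cursor++; v -= factor(); }
        else return v;
    }
}

/* Hypothetical entry point: evaluates a whole expression string. */
long evaluate(const char *text)
{
    cursor = text;
    return expression();
}
```

With this grammar, evaluate("8-3-2") groups as (8-3)-2, evaluate("2+3*4") multiplies first, and evaluate("(2+3)*4") lets the parentheses force the other grouping.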
27
Recursive Descent Parsing
• There are many parsing techniques.
  – Recursive descent is one of the simplest parsing techniques
• Basic idea
  – Each nonterminal has a parsing procedure
  – For the symbols on the RHS: a sequence of matching steps
    • To match a nonterminal A
      – Call the parsing procedure of A
    • To match a terminal symbol t
      – Call match(t)
        » match(t) calls the scanner to get the next token. If it is t, everything is correct. If it is not t, we have found a syntax error.
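A minimal sketch of match(t) and of one parsing procedure built from it follows. To keep it runnable without input, scanner() here is a stand-in that hands out tokens from an array, and the production <read stmt> → read ( ID ) ; is a simplified, hypothetical version of Micro's read statement with a single identifier.

```c
#include <stdio.h>

typedef enum token_types {
    BEGIN, END, READ, WRITE, ID, INTLITERAL,
    LPAREN, RPAREN, SEMICOLON, COMMA, ASSIGNOP,
    PLUSOP, MINUSOP, SCANEOF
} token;

const token *input;                 /* stand-in input: an array ending in SCANEOF */
int syntax_errors = 0;

/* Stand-in for the real scanner: returns the next token of the array. */
token scanner(void)
{
    token t = *input;
    if (t != SCANEOF)
        input++;                    /* stay on SCANEOF once it is reached */
    return t;
}

/* Gets the next token and reports a syntax error if it is not `expected`. */
void match(token expected)
{
    token actual = scanner();
    if (actual != expected) {
        syntax_errors++;
        fprintf(stderr, "syntax error: expected token %d, got %d\n",
                (int)expected, (int)actual);
    }
}

/* Parsing procedure for the simplified production
 *   <read stmt> -> read ( ID ) ;
 * Each terminal on the RHS becomes one match() call, in order. */
void read_stmt(void)
{
    match(READ);
    match(LPAREN);
    match(ID);
    match(RPAREN);
    match(SEMICOLON);
}
```

A nonterminal on the RHS would appear in the same sequence as a call to its own parsing procedure instead of a match() call.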
28
Recursive Descent Parsing
• If a nonterminal has several productions, choose an appropriate one based on the next input token.
• Parser is started by invoking system_goal().
29
• next_token() : a function that returns the next token. It does not call scanner(void).
Handles the case where {<statement>} does not appear
30
<statement> must appear once
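The two annotations above correspond to the shape of the statement-list procedure: one unconditional call to statement(), then a loop driven by next_token(). The recognizer below is a sketch under assumptions: the grammar in the comments is the usual Micro grammar as far as it can be inferred from these slides, the token array stands in for the scanner, and error handling is reduced to counting errors.

```c
#include <stdio.h>

typedef enum token_types {
    BEGIN, END, READ, WRITE, ID, INTLITERAL,
    LPAREN, RPAREN, SEMICOLON, COMMA, ASSIGNOP,
    PLUSOP, MINUSOP, SCANEOF
} token;

const token *input;                 /* stand-in for the scanner: array ending in SCANEOF */
int syntax_errors = 0;

/* Returns the next token without consuming it. */
token next_token(void) { return *input; }

void syntax_error(token t)
{
    syntax_errors++;
    fprintf(stderr, "syntax error near token %d\n", (int)t);
}

/* Consumes the next token, which must be `expected`. */
void match(token expected)
{
    token actual = *input;
    if (actual != SCANEOF)
        input++;
    if (actual != expected)
        syntax_error(actual);
}

void expression(void);

/* <primary> -> ( <expression> ) | ID | INTLITERAL */
void primary(void)
{
    switch (next_token()) {
    case LPAREN:     match(LPAREN); expression(); match(RPAREN); break;
    case ID:         match(ID); break;
    case INTLITERAL: match(INTLITERAL); break;
    default:         syntax_error(next_token()); break;
    }
}

/* <expression> -> <primary> { <add op> <primary> } */
void expression(void)
{
    primary();
    while (next_token() == PLUSOP || next_token() == MINUSOP) {
        match(next_token());
        primary();
    }
}

/* <statement> -> ID := <expression> ;
 *              | read ( ID {, ID} ) ;
 *              | write ( <expression> {, <expression>} ) ;
 * The production is chosen by looking at the next token. */
void statement(void)
{
    switch (next_token()) {
    case ID:
        match(ID); match(ASSIGNOP); expression(); match(SEMICOLON);
        break;
    case READ:
        match(READ); match(LPAREN); match(ID);
        while (next_token() == COMMA) { match(COMMA); match(ID); }
        match(RPAREN); match(SEMICOLON);
        break;
    case WRITE:
        match(WRITE); match(LPAREN); expression();
        while (next_token() == COMMA) { match(COMMA); expression(); }
        match(RPAREN); match(SEMICOLON);
        break;
    default:
        syntax_error(next_token());
        break;
    }
}

/* <statement list> -> <statement> { <statement> } */
void statement_list(void)
{
    statement();                    /* <statement> must appear once */
    for (;;) {
        switch (next_token()) {
        case ID: case READ: case WRITE:
            statement();            /* one more repetition of { <statement> } */
            break;
        default:
            return;                 /* { <statement> } contributes nothing further */
        }
    }
}

/* <system goal> -> <program> SCANEOF, with <program> -> begin <statement list> end */
void system_goal(void)
{
    match(BEGIN);
    statement_list();
    match(END);
    match(SCANEOF);
}
```

Because next_token() only peeks, the loop in statement_list() can decide whether another statement follows without disturbing the token that match(END) will consume.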
31
32
33
34
Translating Micro
• Target language: 3-addr code (quadruple)
  – OP A, B, C
• Note that we do not worry about registers at this time.
  – temporaries: Sometimes we need to hold temporary values.
    • E.g. A+B+C
        ADD A,B,TEMP&1
        ADD TEMP&1,C,TEMP&2
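The A+B+C example can be reproduced with two small helpers. This is a sketch under assumptions: the names generate() and get_temp(), their signatures, and the in-memory output buffer are chosen for illustration; only the instruction format and the TEMP&n naming come from the example above.

```c
#include <stdio.h>
#include <string.h>

#define MAX_NAME 32

char code[512];                     /* generated instructions, kept in memory for this sketch */

/* Appends one three-address instruction "OP A,B,C" to the output. */
void generate(const char *op, const char *a, const char *b, const char *c)
{
    size_t used = strlen(code);
    snprintf(code + used, sizeof code - used, "%s %s,%s,%s\n", op, a, b, c);
}

/* Writes the next unused temporary name (TEMP&1, TEMP&2, ...) into `name`,
 * which must have room for MAX_NAME characters. */
void get_temp(char *name)
{
    static int max_temp = 0;        /* number of temporaries handed out so far */
    snprintf(name, MAX_NAME, "TEMP&%d", ++max_temp);
}

/* Translates A+B+C, evaluated left to right. */
void translate_sum(void)
{
    char t1[MAX_NAME], t2[MAX_NAME];

    get_temp(t1);
    generate("ADD", "A", "B", t1);  /* t1 = A + B  */
    get_temp(t2);
    generate("ADD", t1, "C", t2);   /* t2 = t1 + C */
}
```

Each intermediate result gets a fresh temporary, so no decision about registers has to be made at this stage.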
35
Translating Micro (Cont’d)
• Action Symbols
  – The bulk of a translation is done by semantic routines
  – Action symbols can be added to a grammar to specify when semantic processing should take place
    • They can be placed anywhere in the RHS of a production
  – They are translated into procedure calls in the parsing procedures
    • #add corresponds to a semantic routine named add()
  – They have no impact on the language recognized by a parser driven by a CFG
36
[Figure: Source Program (Character Stream) → Scanner → Tokens → Parser → Syntactic Structure → Semantic Routines → Target Machine Code; Symbol and Attribute Tables (used by all phases of the compiler)]
37
38
Semantic Information
• Semantic routines need certain information to do their work.
  – This information is stored in semantic records.
  – Each kind of grammar symbol has a semantic record
39
Previous parsing procedure: P. 37
New parsing procedure, which involves semantic routines
<expression> → <primary> {<add op> <primary> #gen_infix}
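A sketch of how this production might turn into a parsing procedure that passes semantic records around. It assumes that #gen_infix fires once per operator, as a left-to-right translation requires. The record layout (expr_rec holding only a name), the lexeme array standing in for the scanner and token buffer, the reduced token set, and the SUB opcode are all assumptions of this sketch.

```c
#include <stdio.h>
#include <string.h>

#define MAX_NAME 32

typedef enum { ID, INTLITERAL, PLUSOP, MINUSOP, SCANEOF } token;  /* subset of Micro's tokens */

/* One scanned token with its text; stands in for scanner() plus token_buffer. */
typedef struct { token code; const char *text; } lexeme;

/* Semantic record for an expression: the name under which its value is found. */
typedef struct { char name[MAX_NAME]; } expr_rec;

const lexeme *input;                /* next unread lexeme; the array ends in SCANEOF */
char code[512];                     /* generated three-address code */

token next_token(void) { return input->code; }

/* <primary> -> ID | INTLITERAL   (parenthesized expressions omitted here) */
expr_rec primary(void)
{
    expr_rec r;

    if (next_token() == ID || next_token() == INTLITERAL) {
        snprintf(r.name, MAX_NAME, "%s", input->text);
        input++;
    } else {
        snprintf(r.name, MAX_NAME, "?");    /* placeholder after a syntax error */
    }
    return r;
}

/* Semantic routine for #gen_infix: emits one instruction and returns
 * the record of the temporary that holds the result. */
expr_rec gen_infix(expr_rec left, token op, expr_rec right)
{
    static int max_temp = 0;
    expr_rec result;
    size_t used = strlen(code);

    snprintf(result.name, MAX_NAME, "TEMP&%d", ++max_temp);
    snprintf(code + used, sizeof code - used, "%s %s,%s,%s\n",
             op == PLUSOP ? "ADD" : "SUB", left.name, right.name, result.name);
    return result;
}

/* <expression> -> <primary> { <add op> <primary> #gen_infix } */
expr_rec expression(void)
{
    expr_rec left = primary();

    while (next_token() == PLUSOP || next_token() == MINUSOP) {
        token op = next_token();
        input++;                            /* consume <add op> */
        expr_rec right = primary();
        left = gen_infix(left, op, right);  /* the action symbol becomes a call */
    }
    return left;
}
```

The record returned by gen_infix() becomes the left operand of the next iteration, which is how A+B-314 ends up as two chained instructions.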
40
Semantic Information (Cont’d)
• Subroutines for symbol table and temporaries
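Since Micro has no declarations, the symbol table only needs to remember which names have been seen. The sketch below is one simple way to do that with a linear table; the routine names lookup(), enter(), and check_id(), the table sizes, and the DECLARE output format are assumptions, not taken from the slide.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SYMBOLS 100
#define MAX_NAME    32

char symbol_table[MAX_SYMBOLS][MAX_NAME + 1];
int  symbol_count = 0;
char code[1024];                    /* generated output, kept in memory for this sketch */

/* Is s already in the symbol table? */
int lookup(const char *s)
{
    for (int i = 0; i < symbol_count; i++)
        if (strcmp(symbol_table[i], s) == 0)
            return 1;
    return 0;
}

/* Puts s into the symbol table unconditionally; returns 0 if the table is full. */
int enter(const char *s)
{
    if (symbol_count >= MAX_SYMBOLS)
        return 0;
    snprintf(symbol_table[symbol_count++], MAX_NAME + 1, "%s", s);
    return 1;
}

/* On the first sight of a name, records it and emits a storage declaration. */
void check_id(const char *s)
{
    if (!lookup(s) && enter(s)) {
        size_t used = strlen(code);
        snprintf(code + used, sizeof code - used, "DECLARE %s,INTEGER\n", s);
    }
}
```

A semantic routine that processes an identifier would call check_id() so that every variable is declared exactly once in the generated code.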
41
Semantic Information (Cont’d)
• Semantic routines
42
43
44
45
46