Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf ·...
![Page 1: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/1.jpg)
Fall 2016-2017 Compiler Principles
Lecture 2: LL parsing
Roman Manevich
Ben-Gurion University of the Negev
1
![Page 2: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/2.jpg)
Books
• Compilers: Principles, Techniques, and Tools. Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman
• Advanced Compiler Design and Implementation. Steven Muchnick
• Modern Compiler Design. D. Grune, H. Bal, C. Jacobs, K. Langendoen
• Modern Compiler Implementation in Java. Andrew W. Appel
![Page 3: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/3.jpg)
Tentative syllabus
• Front End
– Scanning
– Top-down Parsing (LL)
– Bottom-up Parsing (LR)
• Intermediate Representation
– Operational Semantics
– Lowering
• Optimizations
– Dataflow Analysis
– Loop Optimizations
• Code Generation
– Register Allocation
– Energy Optimization
– Instruction Selection
• Mid-term exam
![Page 4: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/4.jpg)
Parsing background
• Context-free grammars
– Terminals
– Nonterminals
– Start nonterminal
– Productions (rules)
• Context-free languages
– Derivations (leftmost, rightmost)
– Derivation tree (also called parse tree)
• Ambiguous grammars
4
![Page 5: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/5.jpg)
Agenda
5
• Understand role of syntax analysis
• Parsing strategies
• LL parsing
– Building a predictor table via FIRST/FOLLOW/NULLABLE sets
– Pushdown automata algorithm
• Handling conflicts
![Page 6: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/6.jpg)
Role of syntax analysis
• Recover structure from stream of tokens
– Parse tree / abstract syntax tree
• Error reporting (recovery)
• Other possible tasks
– Syntax directed translation (one pass compilers)
– Create symbol table
– Create pretty-printed version of the program, e.g., Auto Formatting function in IDE
[Diagram: High-level Language (scheme) → Lexical Analysis → Syntax Analysis (Parsing) → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation → Executable Code]
![Page 7: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/7.jpg)
From tokens to abstract syntax trees
program text: 59 + (1257 * xPosition)
→ Lexical Analyzer (regular expressions, finite automata): lexical error / valid
token stream: num + ( num * id )
→ Parser (context-free grammars, push-down automata): syntax error / valid
Grammar:
E → id
E → num
E → E + E
E → E * E
E → ( E )
Abstract Syntax Tree: + ( num, * ( num, x ) )
![Page 8: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/8.jpg)
Marking “end-of-file”
• Sometimes it will be useful to transform a grammar G with start non-terminal S into a grammar G’ with a new start non-terminal S’ and a new production rule S’ → S $
– $ is not part of the set of tokens
– It is a special End-Of-File (EOF) token
• To parse α with G’ we change it into α $
• Simplifies parsing grammars with null productions
– Also simplifies parsing LR grammars
8
![Page 9: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/9.jpg)
Another convention
• We will assume that all productions have been consecutively numbered
(1) S → E $
(2) E → T
(3) E → E + T
(4) T → id
(5) T → ( E )
9
![Page 10: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/10.jpg)
Parsing strategies
10
![Page 11: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/11.jpg)
Broad kinds of parsers
• Parsers for arbitrary grammars
– Cocke-Younger-Kasami [‘65] method: O(n^3)
– Earley’s method (implemented by NLTK): O(n^3) but lower for restricted classes
– Not commonly used by compilers
• Parsers for restricted classes of grammars
– Top-Down
• With/without backtracking
– Bottom-Up
11
![Page 12: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/12.jpg)
Top-down parsing
• Constructs parse tree in a top-down manner
• Finds leftmost derivation
• Predictive: for every non-terminal and k lookahead tokens, predict the next production: LL(k)
• Challenge: beginning with the start symbol, try to guess the productions to apply to end up at the user's program
12
![Page 13: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/13.jpg)
Predictive parsing
13
![Page 14: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/14.jpg)
Exercise: show leftmost derivation
Input: not ( not true or false )
Grammar:
(1) E → LIT
(2) E → ( E OP E )
(3) E → not E
(4) LIT → true
(5) LIT → false
(6) OP → and
(7) OP → or
(8) OP → xor
Leftmost derivation:
E
⇒ not E
⇒ not ( E OP E )
⇒ not ( not E OP E )
⇒ not ( not LIT OP E )
⇒ not ( not true OP E )
⇒ not ( not true or E )
⇒ not ( not true or LIT )
⇒ not ( not true or false )
How did we decide which production of ‘E’ to take?
![Page 15: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/15.jpg)
Predictive parsing
• Given a grammar G, attempt to derive a word ω
• Idea
– Scan input from left to right
– Apply production to leftmost nonterminal
– Pick production rule based on next input token
• Problem: there may be more than one production for the next token
• Solution: restrict grammars to LL(1)
– Parser correctly predicts which production to apply
– If grammar is not in LL(1) the parser construction algorithm will detect it
15
![Page 16: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/16.jpg)
LL(1) parsing via pushdown automata
16
[Diagram: the parsing program reads the input stream (a + b $) and a stack of symbols holding the current sentential form (X Y Z $), consults the prediction table (nonterminal × token → production), and outputs a derivation tree or an error]
![Page 17: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/17.jpg)
LL(1) parsing algorithm
• Set stack = S$
• while true
– Prediction: when top of stack is nonterminal N
1. Pop N
2. Look up Table[N,t], where t is the next input token
3. If Table[N,t] is not empty, push Table[N,t] on stack, else return syntax error
– Match: when top of stack is terminal t
• If t = next input token, pop t and increment input index, else return syntax error
– End: when stack is empty
• If input is empty return success, else return syntax error
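The loop above can be sketched in Python. This is a minimal illustration, not the course's reference implementation: the function name `ll1_parse`, the dictionary encoding of the prediction table, and the choice to keep `$` at the bottom of the stack (so the end of input is handled by an ordinary match) are all assumptions made here. The table is the one for the grammar S → aSb | c.

```python
def ll1_parse(table, start, tokens):
    """Table-driven LL(1) parse. Returns the productions applied, in order.

    table maps (nonterminal, token) -> right-hand side (a tuple of symbols).
    Any symbol that never appears as a table row is treated as a terminal.
    """
    nonterminals = {n for (n, _) in table}
    stack = ["$", start]               # top of the stack is the end of the list
    tokens = list(tokens) + ["$"]      # append the end-of-file marker
    pos = 0
    applied = []
    while stack:
        top = stack.pop()
        tok = tokens[pos]
        if top in nonterminals:        # Prediction
            rhs = table.get((top, tok))
            if rhs is None:
                raise SyntaxError(f"no prediction for ({top}, {tok})")
            applied.append((top, rhs))
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        elif top == tok:               # Match
            pos += 1
        else:
            raise SyntaxError(f"expected {top}, found {tok}")
    if pos != len(tokens):             # End: stack empty but input remains
        raise SyntaxError("unconsumed input")
    return applied


# Prediction table for S -> aSb | c
TABLE = {("S", "a"): ("a", "S", "b"), ("S", "c"): ("c",)}
```

Calling `ll1_parse(TABLE, "S", "aacbb")` applies S → aSb twice and then S → c, while `ll1_parse(TABLE, "S", "abcbb")` raises `SyntaxError` at the second token because the entry for (S, b) is empty.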
17
![Page 18: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/18.jpg)
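Example prediction table
(1) E → LIT
(2) E → ( E OP E )
(3) E → not E
(4) LIT → true
(5) LIT → false
(6) OP → and
(7) OP → or
(8) OP → xor
Rows are nonterminals, columns are input tokens. Table entries determine which production to take.

| | ( | ) | not | true | false | and | or | xor | $ |
|---|---|---|---|---|---|---|---|---|---|
| E | 2 | | 3 | 1 | 1 | | | | |
| LIT | | | | 4 | 5 | | | | |
| OP | | | | | | 6 | 7 | 8 | |

Entry (E, ‘(’) is 2 because ‘(’ ∈ FIRST( ( E OP E ) )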
![Page 19: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/19.jpg)
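Running parser example
Grammar: S → aSb | c
Input: aacbb$

| | a | b | c |
|---|---|---|---|
| S | S → aSb | | S → c |

| Input suffix | Stack content | Move |
|---|---|---|
| aacbb$ | S$ | predict(S,a) = S → aSb |
| aacbb$ | aSb$ | match(a,a) |
| acbb$ | Sb$ | predict(S,a) = S → aSb |
| acbb$ | aSbb$ | match(a,a) |
| cbb$ | Sbb$ | predict(S,c) = S → c |
| cbb$ | cbb$ | match(c,c) |
| bb$ | bb$ | match(b,b) |
| b$ | b$ | match(b,b) |
| $ | $ | match($,$) – success |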
19
![Page 20: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/20.jpg)
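Illegal input example
Grammar: S → aSb | c
Input: abcbb$

| | a | b | c |
|---|---|---|---|
| S | S → aSb | | S → c |

| Input suffix | Stack content | Move |
|---|---|---|
| abcbb$ | S$ | predict(S,a) = S → aSb |
| abcbb$ | aSb$ | match(a,a) |
| bcbb$ | Sb$ | predict(S,b) = ERROR |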
20
![Page 21: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/21.jpg)
Building the prediction table
• Let G be a grammar
• Compute FIRST/NULLABLE/FOLLOW
• Check for conflicts
– No conflicts => G is an LL(1) grammar
– Conflicts exist => G is not an LL(1) grammar
• Attempt to transform G into an equivalent LL(1) grammar G’
21
![Page 22: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/22.jpg)
First sets
22
![Page 23: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/23.jpg)
FIRST sets
• Definition: For a nonterminal A, FIRST(A) is the set of terminals that can start a sentence derived from A
– Formally: FIRST(A) = {t | A ⇒* t ω}
• Definition: For a sentential form α, FIRST(α) is the set of terminals that can start a sentence derived from α
– Formally: FIRST(α) = {t | α ⇒* t ω}
23
![Page 24: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/24.jpg)
FIRST sets example
• FIRST(E) = …?
• FIRST(LIT) = …?
• FIRST(OP) = …?
24
E → LIT | ( E OP E ) | not E
LIT → true | false
OP → and | or | xor
![Page 25: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/25.jpg)
FIRST sets example
• FIRST(E) = FIRST(LIT) ∪ FIRST(( E OP E )) ∪ FIRST(not E)
• FIRST(LIT) = { true, false }
• FIRST(OP) = {and, or, xor}
• A set of recursive equations
• How do we solve them?
25
E → LIT | ( E OP E ) | not E
LIT → true | false
OP → and | or | xor
![Page 26: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/26.jpg)
Computing FIRST sets
• This is known as a fixed-point algorithm
• We will see such iterative methods later in the course and learn to reason about them
Assume no null productions (A → ε)
1. Initially, for all nonterminals A, set FIRST(A) = { t | A → t ω for some ω }
2. Repeat the following until no changes occur:
for each nonterminal A
for each production A → α1 | … | αk
FIRST(A) := FIRST(α1) ∪ … ∪ FIRST(αk)
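The two steps above can be sketched as follows. This is a minimal illustration under the same assumption as the slide (no null productions), so FIRST(αi) depends only on the first symbol of αi. The function name `first_sets` and the dictionary encoding of the grammar are choices made here for illustration; the sets are grown in place rather than reassigned, which reaches the same fixed point.

```python
def first_sets(grammar):
    """Compute FIRST for every nonterminal of a grammar without epsilon productions.

    grammar maps each nonterminal to a list of right-hand sides (non-empty tuples).
    Symbols that are not keys of the dictionary are terminals.
    """
    first = {a: set() for a in grammar}

    # Step 1: terminals that directly begin a right-hand side
    for a, alternatives in grammar.items():
        for rhs in alternatives:
            if rhs[0] not in grammar:
                first[a].add(rhs[0])

    # Step 2: propagate until nothing changes (the fixed point)
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for rhs in alternatives:
                head = rhs[0]
                if head in grammar and not first[head] <= first[a]:
                    first[a] |= first[head]
                    changed = True
    return first


# The Boolean expression grammar used earlier in the lecture
BOOL_GRAMMAR = {
    "E": [("LIT",), ("(", "E", "OP", "E", ")"), ("not", "E")],
    "LIT": [("true",), ("false",)],
    "OP": [("and",), ("or",), ("xor",)],
}
```

For `BOOL_GRAMMAR`, the loop needs one productive pass: FIRST(E) starts as { (, not } and then absorbs FIRST(LIT).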
![Page 27: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/27.jpg)
Exercise: compute FIRST
27
STMT → if EXPR then STMT | while EXPR do STMT | EXPR ;
EXPR → TERM -> id | zero? TERM | not EXPR | ++ id | -- id
TERM → id | constant
(In the first EXPR alternative, -> is a token of the language, not the production arrow.)
FIRST(STMT) = FIRST(if) ∪ FIRST(while) ∪ FIRST(EXPR)
FIRST(EXPR) = FIRST(TERM) ∪ FIRST(zero?) ∪ FIRST(not) ∪ FIRST(++) ∪ FIRST(--)
FIRST(TERM) = FIRST(id) ∪ FIRST(constant)
![Page 28: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/28.jpg)
Exercise: compute FIRST
28
STMT → if EXPR then STMT | while EXPR do STMT | EXPR ;
EXPR → TERM -> id | zero? TERM | not EXPR | ++ id | -- id
TERM → id | constant
FIRST(STMT) = {if, while} ∪ FIRST(EXPR)
FIRST(EXPR) = {zero?, not, ++, --} ∪ FIRST(TERM)
FIRST(TERM) = {id, constant}
![Page 29: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/29.jpg)
1. Initialization
29
FIRST sets after initialization:
FIRST(TERM) = { id, constant }
FIRST(EXPR) = { zero?, not, ++, -- }
FIRST(STMT) = { if, while }
![Page 30: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/30.jpg)
2. Iterate 1
30
FIRST sets after iteration 1 (FIRST(EXPR) is added to FIRST(STMT)):
FIRST(TERM) = { id, constant }
FIRST(EXPR) = { zero?, not, ++, -- }
FIRST(STMT) = { if, while, zero?, not, ++, -- }
![Page 31: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/31.jpg)
2. Iterate 2
31
FIRST sets after iteration 2 (FIRST(TERM) is added to FIRST(EXPR)):
FIRST(TERM) = { id, constant }
FIRST(EXPR) = { zero?, not, ++, --, id, constant }
FIRST(STMT) = { if, while, zero?, not, ++, -- }
![Page 32: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/32.jpg)
2. Iterate 3 – fixed-point
32
FIRST sets after iteration 3 (the new elements of FIRST(EXPR) are added to FIRST(STMT); nothing changes afterwards):
FIRST(TERM) = { id, constant }
FIRST(EXPR) = { zero?, not, ++, --, id, constant }
FIRST(STMT) = { if, while, zero?, not, ++, --, id, constant }
![Page 33: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/33.jpg)
Reasoning about the algorithm
33
• Is the algorithm correct?
• Does it terminate? (complexity)
Assume no null productions (A → ε)
1. Initially, for all nonterminals A, set FIRST(A) = { t | A → t ω for some ω }
2. Repeat the following until no changes occur:
for each nonterminal A
for each production A → α1 | … | αk
FIRST(A) := FIRST(α1) ∪ … ∪ FIRST(αk)
![Page 34: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/34.jpg)
Reasoning about the algorithm
• Termination:
• Correctness:
34
![Page 35: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/35.jpg)
LL(1) Parsing of grammars without epsilon productions
35
![Page 36: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/36.jpg)
Using FIRST sets
• Assume G has no epsilon productions and for every non-terminal X and every pair of productions X → α and X → β we have that FIRST(α) ∩ FIRST(β) = {}
• No intersection between FIRST sets => can always pick a single rule
36
![Page 37: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/37.jpg)
Using FIRST sets
• In our Boolean expressions example
– FIRST( LIT ) = { true, false }
– FIRST( ( E OP E ) ) = { ‘(’ }
– FIRST( not E ) = { not }
• If the FIRST sets intersect, may need longer lookahead
– LL(k) = class of grammars in which production rule can be determined using a lookahead of k tokens
– LL(1) is an important and useful class
• What if there are epsilon productions?
37
![Page 38: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/38.jpg)
Extending LL(1) Parsingfor epsilon productions
38
![Page 39: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/39.jpg)
FIRST, FOLLOW, NULLABLE sets
• For each non-terminal X
• FIRST(X) = set of terminals that can start a sentence derived from X
– FIRST(X) = {t | X ⇒* t ω}
• NULLABLE(X) if X ⇒* ε
• FOLLOW(X) = set of terminals that can follow X in some derivation
– FOLLOW(X) = {t | S ⇒* α X t β}
39
![Page 40: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/40.jpg)
Computing the NULLABLE set
• Lemma: NULLABLE(α1 … αk) = NULLABLE(α1) ∧ … ∧ NULLABLE(αk)
1. Initially NULLABLE(X) = false
2. For each non-terminal X, if there exists a production X → ε then NULLABLE(X) = true
3. Repeat
for each production Y → α1 … αk
if NULLABLE(α1 … αk) then NULLABLE(Y) = true
until NULLABLE stabilizes
40
![Page 41: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/41.jpg)
Exercise: compute NULLABLE
41
S → A a b
A → a | ε
B → A B | C
C → b | ε
NULLABLE(S) = NULLABLE(A) ∧ NULLABLE(a) ∧ NULLABLE(b)
NULLABLE(A) = NULLABLE(a) ∨ NULLABLE(ε)
NULLABLE(B) = (NULLABLE(A) ∧ NULLABLE(B)) ∨ NULLABLE(C)
NULLABLE(C) = NULLABLE(b) ∨ NULLABLE(ε)
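One way to mechanize the NULLABLE iteration is sketched below. The function name `nullable_set` and the grammar encoding (an empty tuple stands for an ε right-hand side) are assumptions made for this illustration. Steps 2 and 3 of the algorithm collapse into a single loop, because `all(...)` over an empty right-hand side is true.

```python
def nullable_set(grammar):
    """Return the set of nullable nonterminals.

    grammar maps each nonterminal to a list of right-hand sides (tuples);
    the empty tuple () encodes an epsilon production.
    """
    nullable = set()
    changed = True
    while changed:                     # repeat until NULLABLE stabilizes
        changed = False
        for y, alternatives in grammar.items():
            if y in nullable:
                continue
            for rhs in alternatives:
                # A terminal is never in `nullable`, so a right-hand side
                # containing one can never satisfy this test.
                if all(symbol in nullable for symbol in rhs):
                    nullable.add(y)
                    changed = True
                    break
    return nullable


# The exercise grammar: S -> A a b, A -> a | eps, B -> A B | C, C -> b | eps
EXERCISE = {
    "S": [("A", "a", "b")],
    "A": [("a",), ()],
    "B": [("A", "B"), ("C",)],
    "C": [("b",), ()],
}
```

Running `nullable_set(EXERCISE)` is a quick way to check a hand-computed answer to the exercise.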
![Page 42: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/42.jpg)
FIRST with epsilon productions
• How do we compute FIRST(α1 … αk) when epsilon productions are allowed?
– FIRST(α1 … αk) = ?
42
![Page 43: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/43.jpg)
FIRST with epsilon productions
• How do we compute FIRST(α1 … αk) when epsilon productions are allowed?
– FIRST(α1 … αk) =
if not NULLABLE(α1) then FIRST(α1)
else FIRST(α1) ∪ FIRST(α2 … αk)
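The recursive rule above unrolls into a simple loop over the symbols of the sequence. The sketch below assumes FIRST and NULLABLE for single nonterminals are already known; the name `first_of_sequence` and the sample values are hypothetical, chosen for a grammar with one nullable nonterminal A.

```python
def first_of_sequence(symbols, first, nullable):
    """FIRST of a sequence of grammar symbols.

    first maps nonterminals to their FIRST sets; any symbol missing from
    it is treated as a terminal t with FIRST(t) = {t}.
    nullable is the set of nullable nonterminals.
    """
    result = set()
    for symbol in symbols:
        result |= first.get(symbol, {symbol})
        if symbol not in nullable:
            break          # a non-nullable symbol hides everything after it
    return result


# Hypothetical inputs: A is nullable and FIRST(A) = {a}
FIRST = {"A": {"a"}}
NULLABLE = {"A"}
```

With these inputs, `first_of_sequence(("A", "c", "b"), FIRST, NULLABLE)` looks past the nullable A and also collects c, but stops there because c is a terminal.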
43
![Page 44: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/44.jpg)
Exercise: compute FIRST
44
S → A c b
A → a | ε
NULLABLE(S) = NULLABLE(A) ∧ NULLABLE(c) ∧ NULLABLE(b)
NULLABLE(A) = NULLABLE(a) ∨ NULLABLE(ε)
FIRST(S) = FIRST(A) ∪ FIRST(cb)
FIRST(A) = FIRST(a) ∪ FIRST(ε)
FIRST(S) = FIRST(A) ∪ {c}
FIRST(A) = {a}
![Page 45: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/45.jpg)
FOLLOW sets
• if X → α Y β then FOLLOW(Y) ⊇ ?
if NULLABLE(β) or β = ε then FOLLOW(Y) ⊇ ?
p. 189
45
![Page 46: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/46.jpg)
FOLLOW sets
• if X → α Y β then FOLLOW(Y) ⊇ FIRST(β)
if NULLABLE(β) or β = ε then FOLLOW(Y) ⊇ ?
p. 189
46
![Page 47: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/47.jpg)
FOLLOW sets
• if X → α Y β then FOLLOW(Y) ⊇ FIRST(β)
if NULLABLE(β) or β = ε then FOLLOW(Y) ⊇ FOLLOW(X)
p. 189
47
![Page 48: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/48.jpg)
FOLLOW sets
• if X → α Y β then FOLLOW(Y) ⊇ FIRST(β)
if NULLABLE(β) or β = ε then FOLLOW(Y) ⊇ FOLLOW(X)
• Allows predicting epsilon productions: X → ε when the lookahead token is in FOLLOW(X)
p. 189
S → A c b
A → a | ε
What should we predict for input “cb”?
What should we predict for input “acb”?
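The two FOLLOW rules can be turned into a fixed-point loop in the same style as the FIRST computation. The sketch below is illustrative only: `follow_sets` is a name chosen here, FIRST and NULLABLE are passed in as precomputed inputs, and the grammar is assumed to be already augmented with S' → S $ so that `$` enters FOLLOW sets through the ordinary rule.

```python
def follow_sets(grammar, first, nullable):
    """Compute FOLLOW for every nonterminal.

    grammar maps nonterminals to lists of right-hand sides (tuples; () is epsilon).
    first maps nonterminals to FIRST sets; nullable is the set of nullable nonterminals.
    """
    def first_seq(symbols):
        out = set()
        for s in symbols:
            out |= first.get(s, {s})   # a terminal t has FIRST(t) = {t}
            if s not in nullable:
                break
        return out

    follow = {x: set() for x in grammar}
    changed = True
    while changed:
        changed = False
        for x, alternatives in grammar.items():
            for rhs in alternatives:
                for i, y in enumerate(rhs):
                    if y not in grammar:
                        continue                   # only nonterminals have FOLLOW
                    beta = rhs[i + 1:]
                    new = first_seq(beta)          # FOLLOW(Y) includes FIRST(beta)
                    if all(s in nullable for s in beta):
                        new |= follow[x]           # beta nullable or empty
                    if not new <= follow[y]:
                        follow[y] |= new
                        changed = True
    return follow


# S' -> S $,  S -> A c b,  A -> a | eps
GRAMMAR = {"S'": [("S", "$")], "S": [("A", "c", "b")], "A": [("a",), ()]}
FIRST = {"S'": {"a", "c"}, "S": {"a", "c"}, "A": {"a"}}
NULLABLE = {"A"}
```

For this grammar the result contains FOLLOW(A) = {c}, which is why the parser can predict A → ε when the lookahead token is c.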
48
![Page 49: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/49.jpg)
LL(1) conflicts
49
![Page 50: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/50.jpg)
Conflicts
• FIRST-FIRST conflict
– Productions X → α and X → β, and
– FIRST(α) ∩ FIRST(β) ≠ {}
• FIRST-FOLLOW conflict
– NULLABLE(X), and
– FIRST(X) ∩ FOLLOW(X) ≠ {}
50
![Page 51: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/51.jpg)
LL(1) grammars
• A grammar is in the class LL(1) when its LL(1) prediction table contains no conflicts
• A language is said to be LL(1) when it has an LL(1) grammar
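A possible way to check a grammar for LL(1) conflicts is to fill the prediction table and record every cell that receives a second production. The sketch below follows the usual textbook construction (a production X → α goes into row X under every token in FIRST(α), plus every token in FOLLOW(X) when α is nullable); the slides do not spell this construction out, so treat it as an assumption. The name `build_table` and the tuple format of a conflict are hypothetical.

```python
def build_table(grammar, first, nullable, follow):
    """Fill the LL(1) prediction table and collect conflicts.

    Returns (table, conflicts). table maps (nonterminal, token) to a
    right-hand side; each conflict is (nonterminal, token, earlier_rhs, later_rhs).
    """
    def first_seq(symbols):
        out = set()
        for s in symbols:
            out |= first.get(s, {s})   # a terminal t has FIRST(t) = {t}
            if s not in nullable:
                break
        return out

    table, conflicts = {}, []
    for x, alternatives in grammar.items():
        for rhs in alternatives:
            lookahead = first_seq(rhs)
            if all(s in nullable for s in rhs):
                lookahead |= follow[x]             # predict via FOLLOW(X)
            for token in lookahead:
                if (x, token) in table:
                    conflicts.append((x, token, table[(x, token)], rhs))
                else:
                    table[(x, token)] = rhs
    return table, conflicts


# S -> A a b,  A -> a | eps   (FIRST(A) and FOLLOW(A) both contain a)
G = {"S": [("A", "a", "b")], "A": [("a",), ()]}
TABLE, CONFLICTS = build_table(
    G, {"S": {"a"}, "A": {"a"}}, {"A"}, {"S": set(), "A": {"a"}}
)
```

Here `CONFLICTS` reports the cell (A, a), which is claimed by both A → a and A → ε, so this grammar is not LL(1). An empty conflict list means the table can drive the parser directly.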
51
![Page 52: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/52.jpg)
LL(k) grammars
52
![Page 53: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/53.jpg)
LL(k) grammars
• Generalizes LL(1) for k lookahead tokens
• Need to generalize FIRST and FOLLOW for k lookahead tokens
53
![Page 54: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/54.jpg)
Agenda
54
• LL(k) via pushdown automata
• Predicting productions via FIRST/FOLLOW/NULLABLE sets
• Handling conflicts
![Page 55: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/55.jpg)
Handling conflicts
55
![Page 56: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/56.jpg)
Problem 1: FIRST-FIRST conflict
• FIRST(term) = { ID }
• FIRST(indexed_elem) = { ID }
• How can we transform the grammar into an equivalent grammar that does not have this conflict?
term → ID | indexed_elem
indexed_elem → ID [ expr ]
56
![Page 57: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/57.jpg)
Solution: left factoring
• Rewrite the grammar to be in LL(1)
Intuition: just like factoring in algebra: x*y + x*z into x*(y+z)
Before:
term → ID | indexed_elem
indexed_elem → ID [ expr ]
After:
term → ID after_ID
after_ID → [ expr ] | ε
New grammar is more complex: it has an epsilon production
![Page 58: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/58.jpg)
S → if E then S else S | if E then S | T
Exercise: apply left factoring
58
![Page 59: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/59.jpg)
Before:
S → if E then S else S | if E then S | T
After:
S → if E then S S’ | T
S’ → else S | ε
Exercise: apply left factoring
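Left factoring of a single nonterminal can be sketched as below. This is a simplified illustration, not a complete algorithm: the hypothetical function `left_factor` factors only the first group of alternatives that share a leading symbol, and it does not substitute nonterminals first (which the term / indexed_elem example would need). The name for the new nonterminal is supplied by the caller.

```python
def left_factor(n, alternatives, fresh):
    """Factor the longest common prefix out of one group of alternatives of n.

    alternatives is a list of right-hand sides (tuples); fresh is the name of
    the new nonterminal. Returns a dict with the rewritten productions.
    """
    groups = {}
    for rhs in alternatives:
        groups.setdefault(rhs[:1], []).append(rhs)   # group by first symbol

    for head, group in groups.items():
        if head and len(group) > 1:
            prefix = group[0]
            for rhs in group[1:]:                    # shrink to the common prefix
                k = 0
                while k < min(len(prefix), len(rhs)) and prefix[k] == rhs[k]:
                    k += 1
                prefix = prefix[:k]
            rest = [rhs for rhs in alternatives if rhs not in group]
            return {
                n: [prefix + (fresh,)] + rest,
                fresh: [rhs[len(prefix):] for rhs in group],   # () is epsilon
            }
    return {n: list(alternatives)}                   # nothing to factor


IF_RULES = [
    ("if", "E", "then", "S", "else", "S"),
    ("if", "E", "then", "S"),
    ("T",),
]
```

`left_factor("S", IF_RULES, "S'")` reproduces the factored grammar of the exercise: the shared prefix `if E then S` stays in S, and S' keeps the two remainders `else S` and ε.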
59
![Page 60: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/60.jpg)
Problem 2: FIRST-FOLLOW conflict
• FIRST(S) = { a } FOLLOW(S) = { }
• FIRST(A) = { a } FOLLOW(A) = { a }
• How can we transform the grammar into an equivalent grammar that does not have this conflict?
S → A a b
A → a | ε
60
![Page 61: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/61.jpg)
Solution: substitution
S → A a b
A → a | ε
Substitute A in S:
S → a a b | a b
61
![Page 62: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/62.jpg)
Solution: substitution
S → A a b
A → a | ε
Substitute A in S:
S → a a b | a b
Left factoring:
S → a after_A
after_A → a b | b
62
![Page 63: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/63.jpg)
Problem 3: FIRST-FIRST conflict
• Left recursion cannot be handled with a bounded lookahead
• How can we transform the grammar into an equivalent grammar that does not have this conflict?
E → E - term | term
63
![Page 64: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/64.jpg)
Solution: left recursion removal
G1: N → Nα | β
G2: N → βN’
N’ → αN’ | ε
• L(G1) = { β, βα, βαα, βααα, … }
• L(G2) = same
For our 3rd example:
E → E - term | term
becomes
E → term TE
TE → - term TE | ε
p. 130
Can be done algorithmically.
Problem 1: grammar becomes mangled beyond recognition
Problem 2: grammar may not be LL(1)
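The G1 to G2 rewrite for immediate left recursion can be sketched as follows. The function name `remove_immediate_left_recursion` and the grammar encoding are assumptions for illustration; indirect left recursion (through other nonterminals) is not handled here.

```python
def remove_immediate_left_recursion(n, alternatives, fresh):
    """Rewrite N -> N a1 | ... | b1 | ... as N -> b N', N' -> a N' | eps.

    alternatives is a list of right-hand sides (tuples); fresh names N'.
    """
    # The alphas: what follows N in each left-recursive alternative
    alphas = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == n]
    # The betas: alternatives that do not start with N
    betas = [rhs for rhs in alternatives if not rhs or rhs[0] != n]
    if not alphas:
        return {n: list(alternatives)}       # no immediate left recursion
    return {
        n: [beta + (fresh,) for beta in betas],
        fresh: [alpha + (fresh,) for alpha in alphas] + [()],   # () is epsilon
    }


E_RULES = [("E", "-", "term"), ("term",)]
```

`remove_immediate_left_recursion("E", E_RULES, "TE")` yields E → term TE and TE → - term TE | ε, matching the rewrite of the third example.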
64
![Page 65: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/65.jpg)
Recap
• Given a grammar
• Compute for each non-terminal
– NULLABLE
– FIRST using NULLABLE
– FOLLOW using FIRST and NULLABLE
• Compute FIRST for each sentential form appearing on right-hand side of a production
• Check for conflicts
– If exist: attempt to remove conflicts by rewriting grammar
65
![Page 66: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/66.jpg)
The bigger picture
• Compilers include different kinds of program analyses, each of which further constrains the set of legal programs
– Lexical constraints: program consists of legal tokens
– Syntax constraints: program included in a given context-free language
– Semantic constraints: program included in a given attribute grammar (type checking, legal inheritance graph, variables initialized before used)
– “Logical” constraints (Verifying Compiler grand challenge): memory safety (null dereference, array-out-of-bounds access, data races), functional correctness (program meets specification)
![Page 67: Fall 2016-2017 Compiler Principles Lecture 2: LL parsingcomp171/wiki.files/02-parsing-1-LL.pdf · Fall 2016-2017 Compiler Principles Lecture 2: LL parsing Roman Manevich Ben-Gurion](https://reader033.fdocuments.in/reader033/viewer/2022042320/5f09b6af7e708231d4282962/html5/thumbnails/67.jpg)
Next lecture: bottom-up parsing