Top Down Parser
Uploaded by vikasdalal
1
Top-Down Parsing
• The parse tree is created top to bottom.
• Top-down parser
– Recursive-Descent Parsing
• Backtracking is needed (If a choice of a production rule does not work, we backtrack to try other alternatives.)
• It is a general parsing technique, but not widely used.
• Not efficient
– Predictive Parsing
• no backtracking
• efficient
• needs a special form of grammars (LL(1) grammars).
• Non-Recursive (Table Driven) Predictive Parser is also known as LL(1) parser.
2
Top-Down Parsing
• Begin with the start symbol at the root of the parse tree
• Build the parse tree from the top down
3
Top-Down Parsing
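S → aSbS | bSaS | ε
[Figure: parse tree rooted at S. The root expands by S → aSbS, one inner S expands by S → bSaS, and the remaining S nodes derive ε.]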
4
Parsing Decisions
Which nonterminal in the parse tree should be expanded?
Which of its grammar rules should be used to expand it?
5
Nondeterministic Parser
Expand any nonterminal.
Expand it using a grammar rule that occurs in the derivation of the
input string.
6
Backtracking Parser
Expand the leftmost nonterminal in the parse tree.
Try a grammar rule for the nonterminal. If it does not work out, try
another one.
7
Backtracking Parser
S → aSa | bSb | a | b
Input: b b b
Parse-tree attempts, one per slide (slides 7 to 13):
7. S → aSa
8. S → bSb, inner S → aSa
9. S → bSb, S → bSb, innermost S → aSa
10. S → bSb, S → bSb, innermost S → bSb
11. S → bSb, S → bSb, innermost S → a
12. S → bSb, S → bSb, innermost S → b
13. S → bSb, inner S → b (matches b b b)
14
Recursive Descent Parsing
• Basic idea:
– Write a routine to recognize each lhs
– This produces a parser with mutually recursive routines.
– Good for hand-coded parsers.
Ex: A → aBb (This is only the production rule for A)
proc A {
- match the current token with a, and move to the next token;
- call 'B';
- match the current token with b, and move to the next token;
}
15
Recursive Descent Parsing (cont.)
• When to apply ε-productions.
A → aA | bB | ε
• If all other productions fail, we should apply an ε-production. For example, if the current token is not a or b, we may apply the ε-production.
16
Recursive Descent Parsing (cont.)
A → aBb | bAB
proc A {
case of the current token {
'a': - match the current token with a, and move to the next token;
- call 'B';
- match the current token with b, and move to the next token;
'b': - match the current token with b, and move to the next token;
- call 'A';
- call 'B';
}
}
17
Recursive Descent Parser for a Simple Declaration Statement
• Decl_stmt → Type Idlist ;
• Type → int | float
• Idlist → id | id , Idlist
proc declstmt()
{
call type();
call idlist();
}
proc type()
{
case of the current token {
'int': match the current token with int, and move to the next token;
'float': match the current token with float, and move to the next token;
}
}
Exercise: write the code for the nonterminal idlist.
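One possible answer to the exercise, sketched in Python rather than the slide pseudocode. It assumes the input is already a list of token strings, that ";" is a terminal of Decl_stmt, and that "$" marks the end of input; the names parse_decl, match and ParseError are illustrative only.

```python
class ParseError(Exception):
    """Raised when the current token does not fit the grammar."""


def parse_decl(tokens):
    """Return True if tokens form: Decl_stmt -> Type Idlist ';'."""
    toks = list(tokens) + ["$"]  # "$" is an assumed end-of-input marker
    pos = 0

    def match(expected):
        # Compare the current token with `expected`, then move to the next token.
        nonlocal pos
        if toks[pos] != expected:
            raise ParseError(f"expected {expected!r}, found {toks[pos]!r}")
        pos += 1

    def type_():
        # Type -> int | float
        if toks[pos] in ("int", "float"):
            match(toks[pos])
        else:
            raise ParseError(f"expected a type, found {toks[pos]!r}")

    def idlist():
        # Idlist -> id | id , Idlist
        # Both alternatives start with id, so match id first and then
        # decide by looking at the next token.
        match("id")
        if toks[pos] == ",":
            match(",")
            idlist()

    try:
        type_()
        idlist()
        match(";")
        match("$")
        return True
    except ParseError:
        return False
```

For example, parse_decl(["int", "id", ",", "id", ";"]) accepts, while parse_decl(["int", ";"]) is rejected because idlist cannot match id.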
18
A → aB | b will correspond to
A() {
    if (lookahead == 'a') {
        match('a');
        B();
    } else if (lookahead == 'b') {
        match('b');
    } else {
        error();
    }
}
19
Recursive descent parser for expression
• E → TE'
• E' → +TE' | ε
• T → FT'
• T' → *FT' | ε
• F → (E)
• F → id
parse() {
    token = get_next_token();
    if (E() and token == '$')
        then return true
        else return false
}
E() {
    if (T())
        then return Eprime()
        else return false
}
Eprime() {
    if (token == '+')
        then token = get_next_token()
             if (T())
                 then return Eprime()
                 else return false
    else if (token == ')' or token == '$')
        then return true
        else return false
}
The remaining procedures are similar.
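A runnable sketch of the full recognizer, including the procedures the slide leaves out. It is written in Python under the assumption that the tokens arrive as a list of strings ("id", "+", "*", "(", ")") and that "$" is appended as the end marker; the helper names are illustrative.

```python
def parse(tokens):
    """Return True if tokens form a sentence of the expression grammar."""
    toks = list(tokens) + ["$"]  # "$" is the assumed end-of-input marker
    pos = 0

    def token():
        return toks[pos]

    def advance():
        nonlocal pos
        pos += 1

    def E():                      # E -> T E'
        return T() and Eprime()

    def Eprime():                 # E' -> + T E' | ε
        if token() == "+":
            advance()
            return T() and Eprime()
        # Apply E' -> ε only if the next token may follow E'.
        return token() in (")", "$")

    def T():                      # T -> F T'
        return F() and Tprime()

    def Tprime():                 # T' -> * F T' | ε
        if token() == "*":
            advance()
            return F() and Tprime()
        # Apply T' -> ε only if the next token may follow T'.
        return token() in ("+", ")", "$")

    def F():                      # F -> ( E ) | id
        if token() == "id":
            advance()
            return True
        if token() == "(":
            advance()
            if E() and token() == ")":
                advance()
                return True
        return False

    return E() and token() == "$"
```

For example, parse(["id", "+", "id", "*", "id"]) accepts and parse(["id", "+"]) is rejected. No backtracking is needed because each procedure chooses its alternative from the current token alone.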
20
When Top down parsing doesn’t Work Well
• Consider productions S → S a | a:
– In the process of parsing S we try the above rules
– Applied consistently in this order, we get an infinite loop
– We could re-order the productions, but the search will involve lots of backtracking, and a general rule for ordering is complex
• The problem here is that the grammar is left-recursive.
21
Left Recursion
E → E + T | T
T → T * F | F
F → n | (E)
[Figure: parse tree in which E is repeatedly expanded by E → E + T along its leftmost branch.]
22
Elimination of Left recursion
• Consider the left-recursive grammar
S → S a | b
• S generates all strings starting with a b followed by any number of a's
• Can rewrite using right-recursion
S → b S'
S' → a S' | ε
23
Elimination of left Recursion. Example
• Consider the grammar
S → 1 | S 0 ( b = 1 and a = 0 )
can be rewritten as
S → 1 S'
S' → 0 S' | ε
24
More Elimination of Left Recursion
• In general
S → S α1 | … | S αn | β1 | … | βm
• All strings derived from S start with one of β1,…,βm and continue with several instances of α1,…,αn
• Rewrite as
S → β1 S' | … | βm S'
S' → α1 S' | … | αn S' | ε
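The rewrite for immediate left recursion can be sketched as a small Python function. This is an illustration, not the general algorithm for indirect left recursion: alternatives are assumed to be tuples of symbols, the empty tuple stands for ε, and the new non-terminal is named by appending a prime.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite S -> S a1 | ... | S an | b1 | ... | bm without left recursion.

    alternatives: list of tuples of symbols; () denotes the ε-alternative.
    Returns a dict mapping each non-terminal to its new list of alternatives.
    Assumes no alternative is just the non-terminal itself (S -> S).
    """
    # The a_i: what follows S in each left-recursive alternative.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The b_j: alternatives that do not start with S.
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: list(alternatives)}  # nothing to do
    new_nt = nonterminal + "'"
    return {
        # S  -> b1 S' | ... | bm S'
        nonterminal: [beta + (new_nt,) for beta in others],
        # S' -> a1 S' | ... | an S' | ε
        new_nt: [alpha + (new_nt,) for alpha in recursive] + [()],
    }
```

Applied to S → S a | b, i.e. eliminate_immediate_left_recursion("S", [("S", "a"), ("b",)]), it returns S → b S' and S' → a S' | ε.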
25
General Left Recursion
• The grammar
S → A a | d (1)
A → S b (2)
is also left-recursive because
S ⇒+ S b a
• This left recursion can also be eliminated by first substituting (2) into (1)
• There is a general algorithm (e.g. Aho, Sethi, Ullman §4.3)
26
Predictive Parsing
• Wouldn't it be nice if
– the recursive-descent parser just knew which production to expand next?
– Idea:
switch ( something ) {
case L1: return E1();
case L2: return E2();
otherwise: print “syntax error”;
}
– what's “something”, L1, L2?
• the parser will do lookahead (look at next token)
27
Predictive parsing (Contd..)
• Modification of Recursive descent top down parsing
in which parser “predicts” which production to use
– By looking at the next few tokens
– No backtracking
• Predictive parsers accept LL(k) grammars
– the first L means “left-to-right” scan of input
– the second L means “leftmost derivation”
– k means “predict based on k tokens of lookahead”
• In practice, LL(1) is used
28
LL(1) Languages
• For each non-terminal and input token there is at most one production that could lead to success.
• LL(k) means that for each non-terminal and k tokens of lookahead, there is at most one production that could lead to success.
29
But First: Left Factoring
• Consider the grammar
E → T + E | T
T → int | int * T | ( E )
Impossible to predict because
– For T two productions start with int
– For E it is not clear how to predict
• A grammar must be left-factored before use for predictive parsing
30
Left-Factoring Example
• Starting with the grammar
– E → T + E | T
– T → int | int * T | ( E )
• Factor out common prefixes of productions
E → T X
X → + E | ε
T → ( E ) | int Y
Y → * T | ε
31
Left-Factoring (cont.)
• In general,
A → αβ1 | αβ2 where α is non-empty and the first symbols of β1 and β2 (if they have one) are different.
• When processing α we cannot know whether to expand
A to αβ1 or
A to αβ2
• But if we re-write the grammar as follows
A → αA'
A' → β1 | β2
we can immediately expand A to αA'.
32
Left-Factoring -- Algorithm
• For each non-terminal A with two or more alternatives (production rules) with a common non-empty prefix, say
A → αβ1 | ... | αβn | γ1 | ... | γm
convert it into
A → αA' | γ1 | ... | γm
A' → β1 | ... | βn
33
Left-Factoring – Example1
A → abB | aB | cdg | cdeB | cdfB
is left-factored into
A → aA' | cdA''
A' → bB | B
A'' → g | eB | fB
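The left-factoring step can be sketched in Python as follows. This is one possible formulation, assuming a grammar is a dict from non-terminal to a list of alternatives (tuples of symbols, with the empty tuple for ε). It groups alternatives by their first symbol and factors out the longest common prefix of each group; fresh non-terminals are named by appending primes.

```python
def common_prefix(seqs):
    """Longest common prefix of a list of symbol tuples."""
    prefix = []
    for symbols in zip(*seqs):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return tuple(prefix)


def fresh_name(base, used):
    """Return base with enough primes appended to be unused."""
    name = base + "'"
    while name in used:
        name += "'"
    used.add(name)
    return name


def left_factor(grammar):
    """Left-factor every non-terminal of the grammar (illustrative sketch)."""
    used = set(grammar)
    result = {}
    work = list(grammar.items())
    while work:
        nt, alts = work.pop(0)
        groups = {}
        for alt in alts:
            # Key: the first symbol as a 1-tuple, or () for the ε-alternative.
            groups.setdefault(alt[:1], []).append(alt)
        new_alts = []
        for key, group in groups.items():
            if not key or len(group) == 1:
                new_alts.extend(group)       # nothing to factor
                continue
            prefix = common_prefix(group)
            new_nt = fresh_name(nt, used)
            new_alts.append(prefix + (new_nt,))
            # The new non-terminal gets the remainders; it is queued so that
            # it is left-factored in turn.
            work.append((new_nt, [alt[len(prefix):] for alt in group]))
        result[nt] = new_alts
    return result
```

Running it on the example grammar A → abB | aB | cdg | cdeB | cdfB yields A → aA' | cdA'', A' → bB | B and A'' → g | eB | fB.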
34
Predictive Parser (example)
stmt → if ...... |
while ...... |
begin ...... |
for .....
• When we are trying to expand the non-terminal stmt, if the current token is if we have to choose the first production rule.
• When we are trying to expand the non-terminal stmt, we can uniquely choose the production rule by just looking at the current token.
• Even after we eliminate the left recursion in a grammar and left factor it, the grammar may still not be suitable for predictive parsing (not an LL(1) grammar).
35
Non-Recursive Predictive Parsing -- LL(1) Parser
• Non-Recursive predictive parsing is a table-driven parser.
• It is a top-down parser.
• It is also known as LL(1) Parser.
[Figure: block diagram. The non-recursive predictive parser reads the input buffer, uses a stack, consults the parsing table, and produces the output.]
36
LL(1) Parser
input buffer
– our string to be parsed. We will assume that its end is marked with a special symbol $.
output
– a production rule representing a step of the derivation sequence (left-most derivation) of the string in the input buffer.
37
stack
– contains the grammar symbols
– at the bottom of the stack, there is a special end marker symbol $.
– initially the stack contains only the symbol $ and the starting symbol S ($S is the initial stack).
– when the stack is emptied (i.e. only $ is left in the stack), the parsing is completed.
• parsing table
– a two-dimensional array M[A,a]
– each row is a non-terminal symbol
– each column is a terminal symbol or the special symbol $
– each entry holds a production rule.
38
• INITIAL CONFIGURATION
Stack    Input Buffer
$S       input string $
• FINAL CONFIGURATION
Stack    Input Buffer
$        $
39
LL(1) Parser – Parser Actions
• The symbol at the top of the stack (say X) and the current symbol in the input string (say a) determine the parser action.
• There are four possible parser actions.
1. If X and a are both $ (final configuration), the parser halts (successful completion).
2. If X and a are the same terminal symbol (different from $), the parser pops X from the stack and advances to the next symbol in the input buffer.
40
3. If X is a non-terminal, the parser looks at the parsing table entry M[X,a]. If M[X,a] holds a production rule X → Y1Y2...Yk, it pops X from the stack and pushes Yk,Yk-1,...,Y1 onto the stack. The parser also outputs the production rule X → Y1Y2...Yk to represent a step of the derivation.
4. None of the above: error.
– All empty entries in the parsing table are errors.
– If X is a terminal symbol different from a, this is also an error case.
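The four parser actions can be sketched as a short Python driver loop. The representation is an assumption made for illustration: the table is a dict keyed by (non-terminal, terminal) whose values are production bodies as tuples, with the empty tuple for ε. The sample table is for the grammar S → aBc, B → bB | ε.

```python
class ParseError(Exception):
    """Raised for an empty table entry or a terminal mismatch."""


def ll1_parse(table, start, nonterminals, tokens):
    """Run the table-driven LL(1) parser; return the productions it outputs."""
    stack = ["$", start]                 # initial stack: $S (top at the end)
    buffer = list(tokens) + ["$"]        # input string followed by $
    pos = 0
    output = []
    while True:
        X, a = stack[-1], buffer[pos]
        if X == "$" and a == "$":
            return output                # action 1: successful completion
        if X in nonterminals:
            body = table.get((X, a))
            if body is None:             # action 4: empty table entry
                raise ParseError(f"no entry for M[{X},{a}]")
            stack.pop()
            stack.extend(reversed(body)) # action 3: push Yk, ..., Y1
            output.append((X, body))
        elif X == a:
            stack.pop()                  # action 2: matching terminal
            pos += 1
        else:                            # action 4: terminal mismatch
            raise ParseError(f"expected {X!r}, found {a!r}")


# Parsing table for S -> aBc, B -> bB | ε
TABLE = {
    ("S", "a"): ("a", "B", "c"),
    ("B", "b"): ("b", "B"),
    ("B", "c"): (),
}
```

Calling ll1_parse(TABLE, "S", {"S", "B"}, "abbc") returns the productions S → aBc, B → bB, B → bB, B → ε in that order, which is the left-most derivation of abbc.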
41
LL(1) Parser – Example1
Grammar:
S → aBc
B → bB | ε
LL(1) Parsing Table:
     a          b          c         $
S    S → aBc
B               B → bB     B → ε
stack    input    output
$S       abbc$    S → aBc
$cBa     abbc$
$cB      bbc$     B → bB
$cBb     bbc$
$cB      bc$      B → bB
$cBb     bc$
$cB      c$       B → ε
$c       c$
$        $        accept, successful completion
42
LL(1) Parser – Example1 (cont.)
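Outputs: S → aBc, B → bB, B → bB, B → ε
Derivation (left-most): S ⇒ aBc ⇒ abBc ⇒ abbBc ⇒ abbc
[Figure: parse tree. S has children a, B, c; that B has children b, B; the inner B has children b, B; the innermost B derives ε.]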
43
LL(1) Parser – Example2
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
      id         +            *            (          )         $
E     E → TE'                              E → TE'
E'               E' → +TE'                            E' → ε    E' → ε
T     T → FT'                              T → FT'
T'               T' → ε       T' → *FT'               T' → ε    T' → ε
F     F → id                               F → (E)
1. E → TE'   2. E' → +TE'   3. E' → ε
4. T → FT'   5. T' → *FT'   6. T' → ε
7. F → (E)   8. F → id
44
LL(1) Parser – Example2
stack       input     output
$E          id+id$    E → TE'
$E'T        id+id$    T → FT'
$E'T'F      id+id$    F → id
$E'T'id     id+id$
$E'T'       +id$      T' → ε
$E'         +id$      E' → +TE'
$E'T+       +id$
$E'T        id$       T → FT'
$E'T'F      id$       F → id
$E'T'id     id$
$E'T'       $         T' → ε
$E'         $         E' → ε
$           $         accept
1. E → TE'   2. E' → +TE'   3. E' → ε
4. T → FT'   5. T' → *FT'   6. T' → ε
7. F → (E)   8. F → id
      id   +   *   (   )   $
E     1            1
E'         2           3   3
T     4            4
T'         6   5       6   6
F     8            7
45
Constructing LL(1) Parsing Tables
• Two functions are used in the construction of LL(1) parsing tables: FIRST and FOLLOW.
• FIRST(α) is the set of terminal symbols which occur as first symbols in strings derived from α, where α is any string of grammar symbols.
• If α derives ε, then ε is also in FIRST(α).
• FOLLOW(A) is the set of terminals which occur immediately after (follow) the non-terminal A in the strings derived from the starting symbol.
– a terminal a is in FOLLOW(A) if S ⇒* αAaβ
– $ is in FOLLOW(A) if S ⇒* αA
46
Compute FIRST for Any String X [FIRST(X)]
1. If X is a terminal symbol or ε, then FIRST(X) = {X}.
2. If X is a non-terminal symbol and X → ε is a production rule, then ε is in FIRST(X).
3. If X is a non-terminal symbol and X → Y1Y2..Yn is a production rule:
– if a terminal a is in FIRST(Yi) and ε is in all FIRST(Yj) for j=1,...,i-1, then a is in FIRST(X).
– if ε is in all FIRST(Yj) for j=1,...,n, then ε is in FIRST(X).
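The three rules can be applied repeatedly until no FIRST set changes. The Python sketch below assumes a grammar is a dict from non-terminal to a list of production bodies (tuples of symbols, with the empty tuple for an ε-production), and that any symbol that is not a key of the dict is a terminal. The function names are illustrative.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """FIRST of a string of grammar symbols, given FIRST of the non-terminals."""
    result = set()
    for sym in symbols:
        # Rule 1: for a terminal a, FIRST(a) = {a}.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result          # this symbol cannot vanish: stop here
    result.add(EPS)                # every symbol can derive ε (or the string is empty)
    return result


def compute_first(grammar):
    """Iterate rules 2 and 3 until nothing more can be added."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                before = len(first[nt])
                first[nt] |= first_of_string(body, first)
                if len(first[nt]) != before:
                    changed = True
    return first
```

For the grammar X → ABC, A → ε, B → ε, C → c this gives FIRST(X) = {c}, because A and B can both vanish and c is the first terminal that can appear.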
47
Example
1. X → a: FIRST(X) = {a}
2. X → ε: FIRST(X) = {ε}
3. X → a | ε: FIRST(X) = {a, ε}
4. X → AbB, A → aB, B → ε: FIRST(X) = {a}, FIRST(A) = {a}, FIRST(B) = {ε}
5. X → ABC, A → ε, B → ε, C → c: FIRST(X) = {c}, FIRST(A) = {ε}, FIRST(B) = {ε}
48
FIRST Example
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
FIRST(F) = {(, id}
FIRST(T') = {*, ε}
FIRST(T) = {(, id}
FIRST(E') = {+, ε}
FIRST(E) = {(, id}
49
E → TE'
• FIRST(E)
– E is a non-terminal and has a production E → TE'; apply rule 3.
– Add all the non-ε symbols of FIRST(T); FIRST(E') is added too only if T can derive ε.
• FIRST(T) = ?
– T is a non-terminal and has a production rule T → FT'; apply rule 3.
– Add all the non-ε symbols of FIRST(F); FIRST(T') is added too only if F can derive ε.
• FIRST(F) = ?
– F is a non-terminal and has productions F → id | (E).
– FIRST(F) = { id, ( }
• Hence FIRST(E) = FIRST(T) = FIRST(F) = { (, id }
50
FIRST(E’) AND FIRST(T’)
E' → +TE' | ε
• FIRST(E') = FIRST(+TE') ∪ FIRST(ε)
= {+, ε}
T' → *FT' | ε
• FIRST(T') = FIRST(*FT') ∪ FIRST(ε)
= {*, ε}
51
FIRST SETS
FIRST(E) = {(, id}
FIRST(T) = {(, id}
FIRST(E') = {+, ε}
FIRST(T') = {*, ε}
FIRST(F) = {(, id}
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
52
Compute FOLLOW (for non-terminals)
• If S is the start symbol, $ is in FOLLOW(S).
• If A → αBβ is a production rule, everything in FIRST(β) except ε is in FOLLOW(B).
• If ( A → αB is a production rule ) or ( A → αBβ is a production rule and ε is in FIRST(β) ), everything in FOLLOW(A) is in FOLLOW(B).
We apply these rules until nothing more can be added to any FOLLOW set.
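A Python sketch of this fixed-point computation. To keep the block short it takes the FIRST sets of the non-terminals as an input rather than computing them; the grammar representation (dict from non-terminal to a list of bodies as tuples, the empty tuple for ε) and the function names are assumptions made for illustration.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """FIRST of a string of symbols; symbols without a FIRST entry are terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start, first):
    """Apply the three FOLLOW rules until no set grows any more."""
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                       # rule 1
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue                 # FOLLOW is for non-terminals only
                    rest = first_of_string(body[i + 1:], first)
                    new = rest - {EPS}           # rule 2
                    if EPS in rest:
                        new |= follow[head]      # rule 3
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow
```

With the expression grammar and its FIRST sets, compute_follow gives FOLLOW(E) = FOLLOW(E') = {$, )}, FOLLOW(T) = FOLLOW(T') = {+, ), $} and FOLLOW(F) = {+, *, ), $}.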
53
FOLLOW Example
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
• FOLLOW(E)
• Since E is the start symbol, add $ to the FOLLOW set.
• From rule 2 applied to F → (E), the terminal ) follows E, so add ) to FOLLOW(E) as well.
• Hence FOLLOW(E) = { $, ) }
54
• FOLLOW(E') : [E → TE', E' → +TE']
• From rule (3), everything in FOLLOW(E) is added to FOLLOW(E').
• Hence FOLLOW(E') = { $, ) }
• FOLLOW(T) : [E → TE', E' → +TE']
• From rule (2), FIRST(E') except ε is added to FOLLOW(T).
• From rule (3), since FIRST(E') contains ε, add FOLLOW(E) to FOLLOW(T).
• Hence FOLLOW(T) = { +, $, ) }
55
• FOLLOW(F) : [T → FT', T' → *FT']
• From rule (2), FIRST(T') except ε is added to FOLLOW(F).
• From rule (3), since FIRST(T') contains ε, add FOLLOW(T) to FOLLOW(F).
• Hence FOLLOW(F) = { *, +, $, ) }
• FOLLOW(T') : [T → FT', T' → *FT']
• From rule (3), everything in FOLLOW(T) is added to FOLLOW(T').
• Hence FOLLOW(T') = { +, $, ) }
56
FOLLOW SETS
FOLLOW(E) = { $, ) }
FOLLOW(E') = { $, ) }
FOLLOW(T) = { +, ), $ }
FOLLOW(T') = { +, ), $ }
FOLLOW(F) = { +, *, ), $ }
57
EXERCISES
• COMPUTE FIRST and FOLLOW
SETS for the following grammar
S → aBc
B → bB | ε
58
• SOLUTION
• FIRST(S)={a}
• FIRST(B)={b, ε}
• FOLLOW(S)={$}
• FOLLOW(B)={c}
59
2. statement → if-statement | other
if-statement → if ( exp ) statement else-part
else-part → else statement | ε
exp → 0 | 1
3. A → ( A ) A | ε
4. lexp → atom | list
atom → number | identifier
list → ( lexp-seq )
lexp-seq → lexp , lexp-seq | lexp
– Left factor the grammar
– Compute FIRST and FOLLOW for the resultant grammar.
60
The LL(1) Parse Table
• Let G be an LL(1) grammar and M be the parsing table.
– M has one row for each nonterminal A
– M has one column for each terminal symbol a, plus a
column for the end of input symbol “$”.
61
Constructing LL(1) Parsing Table -- Algorithm
• for each production rule A → α of a grammar G
– for each terminal a in FIRST(α), add A → α to M[A,a]
– if ε is in FIRST(α), for each terminal b in FOLLOW(A) add A → α to M[A,b]
– if ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A,$]
• All other undefined entries of the parsing table are error entries.
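The table-construction algorithm can be sketched in Python as below. The FIRST and FOLLOW sets are passed in as dicts (here they would be the sets worked out by hand), and each table cell holds a list of production bodies so that multiply-defined entries stay visible. The representation and names are assumptions made for illustration.

```python
EPS = "ε"


def first_of_string(symbols, first):
    """FIRST of a string of symbols; symbols without a FIRST entry are terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_ll1_table(grammar, first, follow):
    """Return a dict (A, a) -> list of bodies of the productions placed in M[A,a].

    A cell with more than one body is a multiply-defined entry.
    Missing keys are the error entries.
    """
    table = {}
    for head, bodies in grammar.items():
        for body in bodies:
            body_first = first_of_string(body, first)
            targets = body_first - {EPS}          # terminals in FIRST(α)
            if EPS in body_first:
                # FOLLOW(A) contains "$" whenever $ may follow A, so this
                # single union covers both ε steps of the algorithm.
                targets |= follow[head]
            for terminal in targets:
                table.setdefault((head, terminal), []).append(body)
    return table
```

For S → aBc, B → bB | ε with FIRST(S) = {a}, FIRST(B) = {b, ε}, FOLLOW(S) = {$} and FOLLOW(B) = {c}, the result has exactly three entries: M[S,a], M[B,b] and M[B,c].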
62
Constructing LL(1) Parsing Table -- Example
S → aBc      B → bB | ε
FIRST(S) = {a}      FIRST(B) = {b, ε}
FOLLOW(S) = {$}     FOLLOW(B) = {c}
• S → aBc: FIRST(aBc) = {a}, hence M[S,a] = S → aBc
• B → bB: FIRST(bB) = {b}, hence M[B,b] = B → bB
• B → ε: FOLLOW(B) = {c}, hence M[B,c] = B → ε
     a          b          c         $
S    S → aBc
B               B → bB     B → ε
63
Expression Grammar Parse Table
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → ( E ) | id
• First (E) = First (T) = First (F) = { (, id }
• First (E') = { +, ε }
• First (T') = { *, ε }
• Follow (E) = Follow (E') = { $, ) }
• Follow (T) = Follow (T') = { +, $, ) }
• Follow (F) = { *, +, $, ) }
64
Expression Grammar Parse Table
• E → TE': Since FIRST(TE') = FIRST(T) = { (, id }, we add E → TE' to M[E,(] and M[E,id].
• E' → +TE': Since FIRST(+TE') = {+}, we add E' → +TE' to M[E',+].
• E' → ε: We must examine FOLLOW(E') = { $, ) }. We add E' → ε to M[E',)] and M[E',$].
• T → FT': Since FIRST(FT') = FIRST(F) = { (, id }, we add T → FT' to M[T,(] and M[T,id].
• T' → *FT': Since FIRST(*FT') = {*}, we add T' → *FT' to M[T',*].
• T' → ε: We examine FOLLOW(T') = { +, $, ) }. We add T' → ε to M[T',+], M[T',)], and M[T',$].
• F → ( E ): We add F → ( E ) to M[F,(].
• F → id: We add F → id to M[F,id].
      id         +            *            (            )         $
E     E → TE'                              E → TE'
E'               E' → +TE'                              E' → ε    E' → ε
T     T → FT'                              T → FT'
T'               T' → ε       T' → *FT'                 T' → ε    T' → ε
F     F → id                               F → ( E )
The completed parse table for the expression grammar
73
Exercises on Parsing Table Construction
1. statement → if-statement | other
if-statement → if ( exp ) statement else-part
else-part → else statement | ε
exp → 0 | 1
2. A → ( A ) A | ε
3. lexp → atom | list
atom → number | identifier
list → ( lexp-seq )
lexp-seq → lexp , lexp-seq | lexp
– Show the actions of the corresponding LL(1) parser, given the input string (a,(b,(2)),( c )).
74
LL(1) Grammars
• A grammar whose parsing table has no multiply-defined entries is said to be an LL(1) grammar.
LL(1):
– first L: input scanned from left to right
– second L: left-most derivation
– 1: one input symbol used as a lookahead symbol to determine the parser action
• An entry in the parsing table of a grammar may contain more than one production rule. In this case, we say that it is not an LL(1) grammar.
75
A Grammar which is not LL(1)
S → i C t S E | a      FOLLOW(S) = { $, e }
E → e S | ε            FOLLOW(E) = { $, e }
C → b                  FOLLOW(C) = { t }
FIRST(iCtSE) = {i}
FIRST(a) = {a}
FIRST(eS) = {e}
FIRST(ε) = {ε}
FIRST(b) = {b}
     a        b        e                 i            t     $
S    S → a                               S → iCtSE
E                      E → eS, E → ε                        E → ε
C             C → b
Two production rules for M[E,e]. Problem: ambiguity.
76
A Grammar which is not LL(1) (cont.)
• What should we do if the resulting parsing table contains multiply defined entries?
– If we didn't eliminate left recursion, eliminate the left recursion in the grammar.
– If the grammar is not left factored, we have to left factor the grammar.
– If the new grammar's parsing table still contains multiply defined entries, that grammar is ambiguous or it is inherently not an LL(1) grammar.
77
• A left recursive grammar cannot be an LL(1) grammar.
• A grammar that is not left factored cannot be an LL(1) grammar.
• An ambiguous grammar cannot be an LL(1) grammar.
78
Properties of LL(1) Grammars
• A grammar G is LL(1) if and only if the following conditions hold for every two distinct production rules A → α and A → β:
– α and β cannot both derive strings starting with the same terminal.
– At most one of α and β can derive ε.
– If β can derive ε, then α cannot derive any string starting with a terminal in FOLLOW(A).
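The three conditions can be checked mechanically for every pair of alternatives. The Python sketch below assumes the FIRST and FOLLOW sets are supplied as dicts and that production bodies are tuples with the empty tuple for ε; the function name and the reason strings are illustrative.

```python
from itertools import combinations

EPS = "ε"


def first_of_string(symbols, first):
    """FIRST of a string of symbols; symbols without a FIRST entry are terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def ll1_violations(grammar, first, follow):
    """List (A, α, β, reason) for each pair A -> α | β that breaks a condition."""
    problems = []
    for head, bodies in grammar.items():
        for alpha, beta in combinations(bodies, 2):
            fa = first_of_string(alpha, first)
            fb = first_of_string(beta, first)
            if (fa & fb) - {EPS}:
                problems.append((head, alpha, beta, "common first terminal"))
            if EPS in fa and EPS in fb:
                problems.append((head, alpha, beta, "both derive ε"))
            # The third condition is checked in both directions.
            if EPS in fb and (fa - {EPS}) & follow[head]:
                problems.append((head, alpha, beta, "FIRST/FOLLOW overlap"))
            if EPS in fa and (fb - {EPS}) & follow[head]:
                problems.append((head, beta, alpha, "FIRST/FOLLOW overlap"))
    return problems
```

For the grammar S → iCtSE | a, E → eS | ε, C → b, the only violation reported is for E: the terminal e is in FIRST(eS) and also in FOLLOW(E), which is the same conflict that appears as two productions in M[E,e].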
79
Non LL(1) Examples
Grammar                 Not LL(1) because
S → S a | a             Left recursive
S → a S | a             FIRST(a S) ∩ FIRST(a) ≠ ∅
S → a R | ε
R → S | ε               For R: S ⇒* ε and ε ⇒* ε
S → a R a
R → S | ε               For R: FIRST(S) ∩ FOLLOW(R) ≠ ∅
80
Error Recovery in Predictive Parsing
• An error may occur in predictive parsing (LL(1) parsing)
– if the terminal symbol on the top of the stack does not match the current input symbol.
– if the top of the stack is a non-terminal A, the current input symbol is a, and the parsing table entry M[A,a] is empty.
• What should the parser do in an error case?
– The parser should give an error message (as meaningful an error message as possible).
– It should recover from that error case, and it should be able to continue parsing the rest of the input.
81
Example
82
Productions appearing in the parsing table M[N][T]:
1. exp → term exp'   2. exp' → addop term exp'   3. exp' → ε
4. addop → +   5. addop → -
6. term → factor term'   7. term' → mulop factor term'   8. term' → ε
9. mulop → *
10. factor → ( exp )   11. factor → number
M[N][T]   (    number   )   +   -   *   $
exp       1    1
exp'                    3   2   2       3
addop                       4   5
term      6    6
term'                   8   8   8   7   8
mulop                               9
factor    10   11
Example: Parse 1 + 2 * 3
Or: number + number * number
Slides 84 to 109 step through the parse, one configuration at a time (top of stack on the right):
stack                        input
$                            num + num * num $
$ exp                        num + num * num $
$ exp' term                  num + num * num $
$ exp' term' factor          num + num * num $
$ exp' term' num             num + num * num $
$ exp' term'                 + num * num $
$ exp'                       + num * num $
$ exp' term addop            + num * num $
$ exp' term +                + num * num $
$ exp' term                  num * num $
$ exp' term' factor          num * num $
$ exp' term' num             num * num $
$ exp' term'                 * num $
$ exp' term' factor mulop    * num $
$ exp' term' factor *        * num $
$ exp' term' factor          num $
$ exp' term' num             num $
$ exp' term'                 $
$ exp'                       $
$                            $
110
Successful Parse!
111
Self Study
• Error Recovery Techniques
• Panic-Mode Error Recovery
• Phrase-Level Error Recovery
• Error-Productions
• Global-Correction
• Reference :
• Aho, Sethi and Ullman