Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser...

59
Chapter 4 Chang Chi-Chung 2007.5.17
  • date post

    22-Dec-2015
  • Category

    Documents

  • view

    225
  • download

    5

Transcript of Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser...

Page 1: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Chapter 4

Chang Chi-Chung

2007.5.17

Page 2: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Parse

tree

intermediate

representation

The Role of the Parser

LexicalAnalyzer ParserSource

Program

Token

Symbol Table

getNextToken

Rest of Front End

Page 3: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

The Types of Parsers for Grammars Universal (any CFG)

Cocke-Younger-Kasimi Earley

Top-down (CFG with restrictions) Build parse trees from the root to the leaves. Recursive descent (predictive parsing) LL (Left-to-right, Leftmost derivation) methods

Bottom-up (CFG with restrictions) Build parse trees from the leaves to the root. Operator precedence parsing LR (Left-to-right, Rightmost derivation) methods

SLR, canonical LR, LALR

Page 4: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Representative Grammars

E → E + T | T T → T * F | F F → ( E ) | id

E → T E’ E’ → + T E’ | ε T → F T’ T’ → * F T’ | ε F → ( E ) | id

E → E + E | E * E | ( E ) | id

Page 5: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Error Handling A good compiler should assist in identifying and

locating errors Lexical errors

important, compiler can easily recover and continue Syntax errors

most important for compiler, can almost always recover Static semantic errors

important, can sometimes recover Dynamic semantic errors

hard or impossible to detect at compile time, runtime checks are required

Logical errors hard or impossible to detect

Page 6: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Error Recovery Strategies

Panic mode Discard input until a token in a set of designated

synchronizing tokens is found Phrase-level recovery

Perform local correction on the input to repair the error Error productions

Augment grammar with productions for erroneous constructs

Global correction Choose a minimal sequence of changes to obtain a global

least-cost correction

Page 7: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Context-Free Grammar

Context-free grammar is a 4-tupleG = < T, N, P, S> where T is a finite set of tokens (terminal symbols) N is a finite set of nonterminals P is a finite set of productions of the form

where N and (NT)*

S N is a designated start symbol

Page 8: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Notational Conventions Terminals

a, b, c, … T example: 0, 1, +, *, id, if

Nonterminals A, B, C, … N example: expr, term, stmt

Grammar symbols X, Y, Z (N T)

Strings of terminals u, v, w, x, y, z T*

Strings of grammar symbols (sentential form) , , (N T)*

The head of the first production is the start symbol, unless stated.

Page 9: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Derivations The one-step derivation is defined by

A where A is a production in the grammar

In addition, we define is leftmost lm if does not contain a nontermi

nal is rightmost rm if does not contain a nonterm

inal Transitive closure * (zero or more steps) Positive closure + (one or more steps)

Page 10: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Sentence and Language Sentence form

If S * in the grammar G, then is a sentential form of G Sentence

A sentential form of G has no nonterminals. Language

The language generated by G is it’s set of sentences. The language generated by G is defined by

L(G) = { w T* | S * w } A language that can be generated by a grammar is said to be a Co

ntext-Free language.

If two grammars generate the same language, the grammars are said to be equivalent.

Page 11: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Example G = < T, N, P, S>

T = { +, *, (, ), -, id } N = { E } P = E E + E

E E * E E ( E ) E - E E id

S = E

E lm – E lm –(E) lm –(E+E) lm –(id +E)

lm –(id + id)

Page 12: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Ambiguity A grammar that produces more than one pars

e tree for some sentence is said to be ambiguous.

Example id + id * id

E E + E id + E id + E * E id + id * E id + id * id

E E * E E + E * E id + E * E id + id * E id + id * id

E → E + E | E * E | ( E ) | id

Page 13: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Grammar Classification A grammar G is said to be

Regular (Type 3) right linear A a B or A a left linear A B a or A a

Context free (Type 2)

where N and ( N T )* Context sensitive (Type 1)

A where A N, , , (N T )*, | | > 0

Unrestricted (Type 0) where , ( N T )*,

Page 14: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Language Classification

The set of all languages generated by grammars G of type T L(T) = { L(G) | G is of type T }

L(regular) L(context free)

L(context sensitive) L(unrestricted)

Page 15: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Example ( a | b )* a b b

A0a A1 A3A2

b b

a

b

A0 aA0 | bA0 | aA1

A1 bA2

A2 bA3

A3 ε

Page 16: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Example L = { anbn | n 1 } is context free.

S0 Si f

path labeled ai path labeled bi

… …

path labeled aj-i

ajbi can be accepted by D, but ajbi is not in the L.

Finite automata cannot count.CFG can count two items but not three.

Page 17: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Writing a Predictive Parsing Grammar Eliminating Ambiguity Elimination of Left Recursion Left Factoring ( Elimination of Left Factor ) Compute FIRST and FOLLOW Two variants:

Recursive (recursive calls) Non-recursive (table-driven)

Page 18: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Eliminating Ambiguity Dangling-else Grammar

stmt if expr then stmt | if expr then stmt else stmt

| other

if E1 then S1 else if E2 then S2 else S3

Page 19: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Eliminating Ambiguity(2)if E1 then if E2 then S1 else S2

Page 20: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Eliminating Ambiguity(3)

Rewrite the dangling-else grammar stmt matched_stmt

| open_stmt

matched_stmt if expr then matched_stmt else matched_stmt

| other

open_stmt if expr then stmt

| if expr then matched_stmt else open_stmt

Page 21: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Elimination of Left Recursion Productions of the form

A A |

are left recursive Non-left-recursions

A A’ A’ A’ | ε

When one of the productions in a grammar is left recursive then a predictive parser loops forever on certain inputs

Page 22: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Immediate Left-Recursion Elimination Group the Productions as

A A1 | A2 | … | Am | 1 | 2 | … | n

Where no i begins with an A

Replace the A-Productions byA 1 A’ | 2 A’ | … | n A’

A’ 1 A’ | 2 A’ | … | m A’ | ε

Page 23: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Example Left-recursive grammar

A A | | | A

Into a right-recursive production A AR

| AR

AR AR

| AR

|

Page 24: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Non-Immediate Left-Recursion The Grammar

S A a | b

A A c | S d | ε The nonterminal S is left recursive, because

S A a Sda

But S is not immediately left recursive.

Page 25: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Elimination of Left Recursion Eliminating left recursion algorithm

Arrange the nonterminals in some order A1, A2, …, Anfor (each i from 1 to n) {

for (each j from 1 to i-1){ replace each production

Ai Ajwith

Ai 1 | 2 | … | kwhere

Aj 1 | 2 | … | k}

eliminate the immediate left recursion in Ai}

Page 26: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Example A B C | aB C A | A bC A B | C C | a

i = 1 nothing to do

i = 2, j = 1 B C A | A b B C A | B C b | a b(imm) B C A BR | a b BR

BR C b BR |

i = 3, j = 1 C A B | C C | a C B C B | a B | C C | a

i = 3, j = 2 C B C B | a B | C C | aC C A BR C B | a b BR C B | a B | C C | a(imm)C a b BR C B CR | a B CR | a CR

CR A BR C B CR | C CR |

Page 27: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Exercise

The grammarS A a | b

A A c | S d | ε

Answer A A c | A a d | b d |

Page 28: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Left Factoring

Left Factoring is a grammar transformation. Predictive Parsing Top-down Parsing

Replace productionsA 1 | 2 | … | n |

withA AR | AR 1 | 2 | … | n

Page 29: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Example

The Grammar

stmt if expr then stmt | if expr then stmt else stmt

Replace with

stmt if expr then stmt stmts

stmts else stmt | ε

Page 30: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Exercise

The following grammarS i E t S | i E t S e S | a

E b Answer

S i E t S S’ | a

S’ e S | ε

E b

Page 31: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Non-Context-Free Grammar Constructs A few syntactic constructs found in typical program

ming languages cannot be specified using grammars alone.

Checking the identifiers are declared before they are used in a program. The abstract language is L1 = { wcw | w is in (a|b)* }

aabcaab is in L1 and L1 is not CFG. C/C++ and Java does not distinguish among identifiers that

are different character strings. All identifiers are represented by a token such as id in a grammar.

In the semantic-analysis phase checks that identifiers are declared before they are used.

Page 32: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Top-Down Parsing LL methods and recursive-descent parsing

Left-to-right, Leftmost derivation Creating the nodes of the parse tree in preorder ( depth-

first )

GrammarE T + TT ( E )T - ET id

Leftmost derivationE lm T + T lm id + T lm id + id

E E

T

+

T

idid

E

TT

+

E

T

+

T

id

Page 33: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Top-down Parsing

LL

recursive-descent parsing

Predictive Parsing

Give a Grammar G E → T E’

E’ → + T E’ | ε

T → F T’

T’ → * F T’ | ε

F → ( E ) | id

Page 34: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.
Page 35: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

FIRST and FOLLOW

S

α A

c γ

a β

c is in FIRST(A)a is in FOLLOW(A)

Page 36: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

FIRST and FOLLOW

The constructed of both top-down and bottom-up parsers is aided by two functions, FIRST and FOLLOW, associated with a grammar G.

During top-down parsing, FIRST and FOLLOW allow us to choose which production to apply.

During panic-mode error recovery, sets of tokens produced by FOLLOW can be used as synchronizing tokens.

Page 37: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

FIRST FIRST()

The set of terminals that begin all strings derived from

FIRST(a) = { a } if a T FIRST() = { } FIRST(A) = A FIRST () for A P FIRST(X1X2…Xk) =

if FIRST (Xj) for all j = 1, …, i-1 thenadd non- in FIRST(Xi) to FIRST(X1X2…Xk)

if FIRST (Xj) for all j = 1, …, k thenadd to FIRST (X1X2…Xk)

Page 38: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

FIRST(1) By definition of the FIRST, we can compute F

IRST(X) If XT, then FIRST(X) = {X}. If XN, X→, then add to FIRST(X). If XN, and X → Y1 Y2 . . . Yn, then add all non- ele

ments of FIRST(Y1) to FIRST(X), if FIRST(Y1), then add all non- elements of FIRST(Y2) to FIRST(X), ..., if FIRST(Yn), then add to FIRST(X).

Page 39: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

FOLLOW FOLLOW(A)

the set of terminals that can immediately follow nonterminal A

FOLLOW(A) =for all (B A ) P do add FIRST()-{} to FOLLOW(A)for all (B A ) P and FIRST() do add FOLLOW(B) to FOLLOW(A)for all (B A) P do add FOLLOW(B) to FOLLOW(A)if A is the start symbol S then add $ to FOLLOW(A)

Page 40: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

FOLLOW(1)

By definition of the FOLLOW, we can compute FOLLOW(X) Put $ into FOLLOW(S). For each A B, add all non- elements of FIRST()

to FOLLOW(B). For each A B or A B, where FIRST(), ad

d all of FOLLOW(A) to FOLLOW(B).

Page 41: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Example

Give a Grammar GE → T E’

E’ → + T E’ | ε

T → F T’

T’ → * F T’ | ε

F → ( E ) | id

FIRST

E ( id

E’ + T ( id

T’ * F ( id

FOLLOW

E $ )

E’ $ )

T + $ )

T’ + $ )

F * + $ )

Page 42: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Recursive Descent Parsing

Every nonterminal has one (recursive) procedure responsible for parsing the nonterminal’s syntactic category of input tokens

When a nonterminal has multiple productions, each production is implemented in a branch of a selection statement based on input look-ahead information

Page 43: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Procedure in Recursive-Descent Parsingvoid A() {

Choose an A-Production, AX1X2…Xk; for (i = 1 to k) {

if ( Xi is a nonterminal) call procedure Xi();

else if ( Xi = current input symbol a ) advance the input to the next symbol; else /* an error has occurred */ }}

Page 44: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Using FIRST and FOLLOW to Write a Recursive Descent Parser

expr term rest rest + term rest | - term rest | term id

FIRST(+ term rest) = { + }FIRST(- term rest) = { - }FOLLOW(rest) = { $ }

rest(){ if (lookahead in FIRST(+ term rest) ) { match(‘+’); term(); rest() } else if (lookahead in FIRST(- term rest) ) { match(‘-’); term(); rest() } else if (lookahead in FOLLOW(rest) ) return else error()}

Page 45: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

LL(1)

Page 46: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

LL(1) Grammar Predictive parsers, that is, recursive-descent parsers

needing no backtracking, can be constructed for a class of grammars called LL(1)

First “L” means the input from left to right. Second “L” means leftmost derivation. “1” for using one input symbol of lookahead at each st

ep tp make parsing action decisions. No left-recursive. No ambiguous.

Page 47: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

LL(1) A grammar G is LL(1) if it is not left recursive

and for each collection of productionsA 1 | 2 | … | n

for nonterminal A the following holds: FIRST(i) FIRST(j) = for all i j

如果交集不是空集合,會如何? if i * then

j * for all i j

FIRST(j) FOLLOW(A) = for all i j

Page 48: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Example

Grammar Not LL(1) because:

S S a | a Left recursive

S a S | a FIRST(a S) FIRST(a) ={a}

S a R | R S |

For R: S * and *

S a R aR S |

For R:FIRST(S) FOLLOW(R)

Page 49: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Non-Recursive Predictive Parsing Table-Driven Parsing Given an LL(1) grammar G = <N, T, P, S> con

struct a table M[A,a] for A N, a T and use a driver program with a stack

Predictive parsingprogram (driver)

Parsing tableM

a + b $

X

Y

Z

$

stack

input

output

Page 50: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Predictive Parsing Table Algorithm

Page 51: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

ExampleE T E’E’ + T E’ | T F T ’ T ’ * F T ’ | F ( E ) | id

id + * ( ) $

E E T E’ E T E’

E’ E’ + T E’ E’ E’

T T F T’ T F TR

T’ T’ T’ * F TR T’ T’

F F id F ( E )

A FIRST() FOLLOW(A)

E T E’ ( id $ )

E’ + T E’ + $ )

E’ $ )

T F T’ ( id + $ )

T’ * F T’ * + $ )

T’ + $ )

F ( E ) ( * + $ )

F id id * + $ )

Page 52: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Exercise

Give a Grammar G as belowS i E t S S’

S’ e S | ε

E b Calculate the FIRST and FOLLOW Create a predictive parsing table

Page 53: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Answer

Ambiguous grammar

S i E t S S’ | aS’ e S | E b

a b e i t $

S S a S i E t S SR

SR

SR SR e S

SR

E E b

A FIRST() FOLLOW(A)

S i E t S S’ i e $

S a a e $

S’ e S e e $

S’ e $

E b b t

Error: duplicate table entry

Page 54: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Predictive Parsing AlgorithmInitially, w$ is in the input buffer and S is on top of the stack

set ip to point to the first symbol of w;set X to the top stack symbol;while (X != $) {

if (X is a) pop the stack and advance ip; else if ( X is a terminal) error(); else if ( M[X, a] is an error entry) error(); else if ( M[X, a] = X Y1Y2…Yk ) { output the production X Y1Y2…Yk pop the stack; push Yk , Yk-1 , ….Y1 onto the stack, with Y1 on top; } set X to the top stack symbol;}

Page 55: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Example MATCHED STACK INPUT ACTION

E$ id + id * id$

TE’$ id + id * id$ E T E’

FT’E’$ id + id * id$ T F T ’

id T’E’$ id + id * id$ F id

id T’E’$ + id * id$ match id

id E’$ + id * id$ T ’

id +TE’$ + id * id$ E’ + T E ’

id + TE’$ id * id$ match +

id + FT’E’$ id * id$ T F T’

id + id T’E’$ id * id$ F id

id + id T’E’$ * id$ match id

id + id * FT’E’$ * id$ T’ * F T’

id + id * FT’E’$ id$ match *

id + id * id T’E’$ id$ F id

id + id * id T’E’$ $ Match id

id + id * id E’$ $ T’

id + id * id $ $ E’

Table-Driven Parsing

E T E’E’ + T E’ | T F T ’ T ’ * F T ’ | F ( E ) | id

Page 56: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Panic Mode Recovery

id + * ( ) $

E E T E’ E T E’ synch synch

E’ E’ + T E’ E’ v

T T F T’ synch T F TR synch synch

T’ T’ T’ * F TR T’ T’

F F id synch synch F ( E ) synch synch

FOLLOW(E) = { ) $ }FOLLOW(E’) = { ) $ }FOLLOW(T) = { + ) $ }FOLLOW(TR) = { + ) $ }FOLLOW(F) = { + * ) $ }

Add synchronizing actions toundefined entries based on FOLLOW

synch: the driver pops current nonterminal A and skips input tillsynch token or skips input until one of FIRST(A) is found

Pro: Can be automatedCons: Error messages are needed

Page 57: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Example STACK INPUT ACTION

E$ ) id * + id $ error, skip )

E$ id * + id $ id is in FIRST(E)

TE’$ id * + id $

FT’E’$ id * + id $

id T’E’$ id * + id $

T’E’$ * + id $

* FT’E’$ * + id $

FT’E’$ + id $ error, M[F, +] = synch

T’E’$ + id $ F has been popped

E’$ + id $

+TE’$ + id $

T’E’$ id $

FT’E’$ id $

id T’E’$ id $

T’E’$ $

E’$ $

$ $

E T E’E’ + T E’ | T F T ’ T ’ * F T ’ | F ( E ) | id

Page 58: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Phrase-Level Recovery

id + * ( ) $

E E T E’ E T E’ synch synch

E’ E’ + T E’ E’ E’

T T F T’ synch T F T’ synch synch

T’ insert * T’ T’ * F T’ T’ T’

F F id synch synch F ( E ) synch synch

Change input stream by inserting missing tokensFor example: id id is changed into id * id

insert *: driver inserts missing * and retries the production

Can then continue here

Pro: Can be automatedCons: Recovery not always intuitive (直覺 )

Page 59: Chapter 4 Chang Chi-Chung 2007.5.17. Parse tree intermediate representation The Role of the Parser Lexical Analyzer Parser Source Program Token Symbol.

Error Productions

id + * ( ) $

E E T E’ E T E’ synch synch

E’ E’ + T E’ E’ E’

T T F T’ synch T F T’ synch synch

T’ T’ F T’ T’ T’ * F TR T’ T’

F F id synch synch F ( E ) synch synch

E T E’

E’ + T E’ | T F T’

T’ * F T’ | F ( E ) | id

Add “error production”:T’ F T’

to ignore missing *, e.g.: id id

Pro: Powerful recovery methodCons: Cannot be automated