Foo Presentation

PRESENTATION ON SYNTAX ANALYSIS PHASE IN A COMPILER by JYOTIRMOY  

Transcript of Foo Presentation

Page 1: Foo Presentation

8/8/2019 Foo Presentation

http://slidepdf.com/reader/full/foo-presentation 1/21

PRESENTATION ON 

SYNTAX ANALYSIS PHASE

IN A COMPILER

by JYOTIRMOY  


Syntax analysis is often also called parsing.

In computing, parsing, or syntactic analysis, is the process of analyzing a text made of a sequence of tokens (for example, words) to determine its grammatical structure with respect to a given (more or less) formal grammar.


Parser

In computing, a parser is one of the components of an interpreter or compiler. It checks for correct syntax and builds a data structure (often some kind of parse tree, abstract syntax tree or other hierarchical structure) implicit in the input tokens. The parser often uses a separate lexical analyzer to create tokens from the sequence of input characters.

Parsers may be programmed by hand or may be (semi-)automatically generated (in some programming languages) by a tool.


Syntax Analysis (Parsing)

Input
 - Sequence of tokens

Output
 - Abstract syntax tree

Report syntax errors
 - e.g. unbalanced parentheses

[Create "symbol table"] and parse tree

In some cases the tree need not be generated (one-pass compilers).
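
As an illustration of the "unbalanced parentheses" error, here is a minimal Python sketch of one check a parser could run over a token sequence. The function name `check_parentheses` and the wording of the messages are hypothetical; a real parser would usually detect this as a side effect of matching grammar rules rather than in a separate pass.

```python
def check_parentheses(tokens):
    """Return None if the parentheses balance, otherwise an error message."""
    depth = 0
    for position, token in enumerate(tokens):
        if token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
            if depth < 0:
                # A ')' appeared with no '(' left to close.
                return f"syntax error: unmatched ')' at token {position}"
    if depth > 0:
        return f"syntax error: {depth} unclosed '('"
    return None


print(check_parentheses(["(", "id", "+", "id", ")"]))  # None
print(check_parentheses(["(", "id", "+", "id"]))
```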


The Parsing process: Stage 2

The next stage is parsing, or syntactic analysis, which checks that the tokens form an allowable expression. This is usually done with reference to a context-free grammar, which recursively defines the components that can make up an expression and the order in which they must appear.


The Parsing process: Diagram

[Diagram: source program -> lexical analyzer -> parser -> parse tree -> rest of front end. The parser sends a "request for token" to the lexical analyzer.]
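
The "request for token" interaction can be sketched in Python with a generator: the lexical analyzer produces a token only when the consumer asks for the next one. This is a minimal sketch; the token pattern (identifiers, integers and a few operator characters) and the name `lexical_analyzer` are assumptions chosen for illustration, not part of the presentation.

```python
import re

# Hypothetical token classes: identifiers, integers, single-character operators.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[-+*/()]")


def lexical_analyzer(source):
    """Yield one token each time the consumer requests it."""
    position = 0
    while position < len(source):
        if source[position].isspace():
            position += 1
            continue
        match = TOKEN_RE.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character {source[position]!r}")
        yield match.group()
        position = match.end()


tokens = lexical_analyzer("a + (b * 42)")
print(next(tokens))  # the first "request for token" prints: a
print(list(tokens))  # the remaining tokens, produced on demand
```

A parser built this way never needs the whole token list in memory; it calls `next(tokens)` whenever it needs more input.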


We categorize parsers into two groups:

1. Top-Down Parser
 - the parse tree is created top to bottom, starting from the root.

2. Bottom-Up Parser
 - the parse tree is created bottom to top, starting from the leaves.


Both top-down and bottom-up parsers scan the input from left to right (one symbol at a time).

Efficient top-down and bottom-up parsers can be implemented only for sub-classes of context-free grammars:
 - LL for top-down parsing
 - LR for bottom-up parsing


Context-Free Grammars (CFG)

The inherently recursive structures of a programming language are defined by a CFG.

In a CFG, we have:
 - A finite set of terminals (in our case, this will be the set of tokens)
 - A finite set of non-terminals (syntactic variables)
 - A finite set of production rules of the form
     A → α
   where A is a non-terminal and α is a string of terminals and non-terminals (including the empty string)
 - A start symbol (one of the non-terminal symbols)


Example:

E → E + E | E - E | E * E | E / E | - E
E → ( E )
E → id
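
One possible way to hold this example grammar in a program is a plain Python dictionary with the four parts of a CFG (terminals, non-terminals, productions, start symbol). The layout and the helper `is_well_formed` are illustrative assumptions, not a standard representation.

```python
# The four components of the example grammar as plain data.
GRAMMAR = {
    "terminals": {"+", "-", "*", "/", "(", ")", "id"},
    "non_terminals": {"E"},
    "productions": {
        "E": [
            ["E", "+", "E"],
            ["E", "-", "E"],
            ["E", "*", "E"],
            ["E", "/", "E"],
            ["-", "E"],
            ["(", "E", ")"],
            ["id"],
        ],
    },
    "start": "E",
}


def is_well_formed(grammar):
    """Check that the start symbol and every production symbol are declared."""
    symbols = grammar["terminals"] | grammar["non_terminals"]
    return (
        grammar["start"] in grammar["non_terminals"]
        and all(head in grammar["non_terminals"] for head in grammar["productions"])
        and all(
            symbol in symbols
            for bodies in grammar["productions"].values()
            for body in bodies
            for symbol in body
        )
    )


print(is_well_formed(GRAMMAR))  # True
```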


Derivations

E ⇒ E+E

E+E derives from E
 - we can replace E by E+E
 - to be able to do this, we have to have a production rule E → E+E in our grammar.

E ⇒ E+E ⇒ id+E ⇒ id+id

A sequence of replacements of non-terminal symbols is called a derivation of id+id from E.


In general a derivation step is

αAβ ⇒ αγβ if there is a production rule A → γ in our grammar,

where α and β are arbitrary strings of terminal and non-terminal symbols.

α1 ⇒ α2 ⇒ ... ⇒ αn (αn derives from α1, or α1 derives αn)

⇒  : derives in one step
⇒* : derives in zero or more steps
⇒+ : derives in one or more steps
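
A single derivation step can be sketched in Python as replacing one occurrence of a non-terminal in a sentential form by the body of a production. The function name `derive_step` and the list-of-symbols representation are assumptions made for this sketch.

```python
def derive_step(sentential_form, index, head, body):
    """Apply the production head -> body to the symbol at position `index`."""
    if sentential_form[index] != head:
        raise ValueError(f"symbol at position {index} is not {head!r}")
    # Everything before and after the chosen non-terminal is left unchanged.
    return sentential_form[:index] + body + sentential_form[index + 1:]


form = ["E"]
form = derive_step(form, 0, "E", ["E", "+", "E"])  # E => E+E
form = derive_step(form, 0, "E", ["id"])           # E+E => id+E
form = derive_step(form, 2, "E", ["id"])           # id+E => id+id
print(" ".join(form))  # id + id
```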


Derivations

E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id)

OR

E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(E+id) ⇒ -(id+id)

At each derivation step, we can choose any of the non-terminals in the sentential form of G for the replacement.


If we always choose the left-most non-terminal in each derivation step, the derivation is called a left-most derivation.

If we always choose the right-most non-terminal in each derivation step, the derivation is called a right-most derivation.


Left-Most and Right-Most Derivation

Left-Most Derivation
E ⇒lm -E ⇒lm -(E) ⇒lm -(E+E) ⇒lm -(id+E) ⇒lm -(id+id)

Right-Most Derivation
E ⇒rm -E ⇒rm -(E) ⇒rm -(E+E) ⇒rm -(E+id) ⇒rm -(id+id)
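
The two derivations of -(id+id) can be reproduced with a short Python sketch that always expands either the left-most or the right-most non-terminal. The helper names `expand` and `run_derivation` are hypothetical, and the sketch assumes E is the only non-terminal, as in the example grammar.

```python
NON_TERMINALS = {"E"}


def expand(form, body, leftmost=True):
    """Replace the left-most (or right-most) non-terminal in `form` by `body`."""
    positions = [i for i, symbol in enumerate(form) if symbol in NON_TERMINALS]
    if not positions:
        raise ValueError("no non-terminal left to expand")
    index = positions[0] if leftmost else positions[-1]
    return form[:index] + body + form[index + 1:]


def run_derivation(bodies, leftmost):
    """Apply the production bodies in order, starting from the start symbol E."""
    form = ["E"]
    steps = ["".join(form)]
    for body in bodies:
        form = expand(form, body, leftmost)
        steps.append("".join(form))
    return " => ".join(steps)


# Bodies of the productions used: E -> -E, E -> (E), E -> E+E, E -> id, E -> id
BODIES = [["-", "E"], ["(", "E", ")"], ["E", "+", "E"], ["id"], ["id"]]

print(run_derivation(BODIES, leftmost=True))
# E => -E => -(E) => -(E+E) => -(id+E) => -(id+id)
print(run_derivation(BODIES, leftmost=False))
# E => -E => -(E) => -(E+E) => -(E+id) => -(id+id)
```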


Top-down parsers try to find the left-most derivation of the given source program.

Bottom-up parsers try to find the right-most derivation of the given source program in reverse order.


Ambiguity

A grammar that produces more than one parse tree for some sentence is called an ambiguous grammar.

E ⇒ E+E ⇒ id+E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id

E ⇒ E*E ⇒ E+E*E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id

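Parse tree with E → E + E at the root:

        E
      / | \
     E  +  E
     |   / | \
    id  E  *  E
        |     |
        id    id

Parse tree with E → E * E at the root:

           E
         / | \
        E  *  E
      / | \   |
     E  +  E  id
     |     |
     id    id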
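
The ambiguity of id+id*id can be checked mechanically by counting parse trees. The Python sketch below does this for a simplified fragment of the example grammar, E → E + E | E * E | id (the restriction to these three productions is an assumption made to keep the code short), by trying every operator as the root of the tree.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count the parse trees of `tokens` under E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(start, end):
        """Number of parse trees deriving tokens[start:end] from E."""
        if end - start == 1:
            return 1 if tokens[start] == "id" else 0
        total = 0
        # Try each operator as the root production E -> E op E.
        for middle in range(start + 1, end - 1):
            if tokens[middle] in ("+", "*"):
                total += count(start, middle) * count(middle + 1, end)
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "*", "id"]))  # 2
```

A result greater than 1 for any sentence shows that the grammar is ambiguous; a result of 1 for one sentence proves nothing about the grammar as a whole.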


Ambiguity (cont.)

For most parsers, the grammar must be unambiguous.

unambiguous grammar ⇒ unique selection of the parse tree for a sentence


Ambiguity (cont.)

We should eliminate the ambiguity in the grammar during the design phase of the compiler.

An unambiguous grammar should be written to eliminate the ambiguity.

We have to prefer one of the parse trees of a sentence (generated by an ambiguous grammar) and disambiguate the grammar so that it is restricted to this choice.
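
One common way to make this choice is to rewrite the expression grammar in layers so that * and / bind tighter than + and -, and operators of equal precedence associate to the left. The layered grammar below is an assumption (it does not appear in the presentation), and the Python recursive-descent parser for it is only a sketch of how a top-down parser could then produce a single tree per sentence.

```python
def parse(tokens):
    """Parse a token list with the layered (unambiguous) grammar
         E -> T { (+|-) T }
         T -> F { (*|/) F }
         F -> - F | ( E ) | id
    and return the syntax tree as nested tuples."""
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        token = peek()
        position += 1
        return token

    def expression():
        node = term()
        while peek() in ("+", "-"):
            operator = advance()
            node = (operator, node, term())  # left-associative
        return node

    def term():
        node = factor()
        while peek() in ("*", "/"):
            operator = advance()
            node = (operator, node, factor())
        return node

    def factor():
        token = advance()
        if token == "id":
            return "id"
        if token == "-":
            return ("neg", factor())
        if token == "(":
            node = expression()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse(["id", "+", "id", "*", "id"]))
# ('+', 'id', ('*', 'id', 'id'))
```

With this grammar, id+id*id has only the tree in which * is grouped below +, which corresponds to preferring the parse tree rooted at E → E + E.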