Foo Presentation

8/8/2019 Foo Presentation


PRESENTATION ON

SYNTAX ANALYSIS PHASE

IN A COMPILER

by JYOTIRMOY



Syntax analysis is also often called parsing.

In computing, parsing (or syntactic analysis) is the process of analyzing a text, made of a sequence of tokens (for example, words), to determine its grammatical structure with respect to a given (more or less) formal grammar.



Parser

In computing, a parser is one of the components of an interpreter or compiler. It checks for correct syntax and builds a data structure (often some kind of parse tree, abstract syntax tree or other hierarchical structure) implicit in the input tokens. The parser often relies on a separate lexical analyzer to create tokens from the sequence of input characters.

Parsers may be programmed by hand or may be (semi-)automatically generated (in some programming languages) by a tool.
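The division of labour between lexical analysis and parsing can be illustrated with a minimal tokenizer sketch. The token kinds (ID, OP, SKIP) and the function name tokenize are assumptions made for illustration; the slides do not define a concrete token set.

```python
import re

# Hypothetical token kinds for a tiny expression language (an assumption,
# not something the slides specify).
TOKEN_SPEC = [
    ("ID", r"[A-Za-z_]\w*"),   # identifiers
    ("OP", r"[-+*/()]"),       # operators and parentheses
    ("SKIP", r"\s+"),          # whitespace, discarded
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Turn a character sequence into a list of (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

A parser would then consume this token list instead of raw characters.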



Syntax Analysis (Parsing)

Input
- Sequence of tokens

Output
- Abstract syntax tree

Report syntax errors
- e.g. unbalanced parentheses

[Create "symbol table"] and parse tree

In some cases the tree need not be generated (one-pass compilers).
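As a small illustration of the kind of syntax error mentioned above, the sketch below detects unbalanced parentheses in a token list. It is a standalone check written for illustration; a real parser would normally report this error as a by-product of matching the grammar. The function name check_parentheses is hypothetical.

```python
def check_parentheses(tokens):
    """Return None if parentheses balance, otherwise a short error message."""
    open_positions = []  # indices of '(' tokens that are still unmatched
    for index, token in enumerate(tokens):
        if token == "(":
            open_positions.append(index)
        elif token == ")":
            if not open_positions:
                return f"unmatched ')' at token {index}"
            open_positions.pop()
    if open_positions:
        return f"unmatched '(' at token {open_positions[-1]}"
    return None
```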



The Parsing process: Stage 2

The next stage is parsing or syntactic analysis, which checks that the tokens form an allowable expression. This is usually done with reference to a context-free grammar, which recursively defines the components that can make up an expression and the order in which they must appear.



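The Parsing process: Diagram

[Diagram: source program -> lexical analyzer -> parser -> rest of front end. The parser issues a "request for token" to the lexical analyzer and passes the parse tree to the rest of the front end.]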



We categorize parsers into two groups:

1. Top-Down Parser
- the parse tree is created top to bottom, starting from the root.

2. Bottom-Up Parser
- the parse tree is created bottom to top, starting from the leaves.



Both top-down and bottom-up parsers scan the input from left to right (one symbol at a time).

Efficient top-down and bottom-up parsers can be implemented only for sub-classes of context-free grammars:
- LL for top-down parsing
- LR for bottom-up parsing



Context-Free Grammars (CFG)

Inherently recursive structures of a programming language are defined by a CFG.

In a CFG, we have:
- A finite set of terminals (in our case, this will be the set of tokens)
- A finite set of non-terminals (syntactic variables)
- A finite set of production rules of the form
  A → α, where A is a non-terminal and α is a string of terminals and non-terminals (including the empty string)
- A start symbol (one of the non-terminal symbols)



Example:

E → E + E | E - E | E * E | E / E | - E
E → ( E )
E → id
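One possible way to hold this example grammar in a program is as plain data. The sketch below is an illustrative encoding, not a standard one: the names GRAMMAR, START, NONTERMINALS and TERMINALS are assumptions. The terminals are recovered as every symbol that never appears on a left-hand side.

```python
# Each non-terminal maps to its list of alternatives; each alternative is a
# tuple of grammar symbols (terminals and non-terminals).
GRAMMAR = {
    "E": [
        ("E", "+", "E"),
        ("E", "-", "E"),
        ("E", "*", "E"),
        ("E", "/", "E"),
        ("-", "E"),
        ("(", "E", ")"),
        ("id",),
    ],
}
START = "E"

# Non-terminals are the left-hand sides; everything else is a terminal.
NONTERMINALS = set(GRAMMAR)
TERMINALS = {
    symbol
    for alternatives in GRAMMAR.values()
    for alternative in alternatives
    for symbol in alternative
} - NONTERMINALS
```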



Derivations

E ⇒ E+E

E+E derives from E
- we can replace E by E+E
- to be able to do this, we have to have a production rule E → E+E in our grammar.

E ⇒ E+E ⇒ id+E ⇒ id+id

A sequence of replacements of non-terminal symbols is called a derivation; the sequence above is a derivation of id+id from E.



In general, a derivation step is

αAβ ⇒ αγβ   if there is a production rule A → γ in our grammar, where α and β are arbitrary strings of terminal and non-terminal symbols.

α₁ ⇒ α₂ ⇒ ... ⇒ αₙ   (αₙ derives from α₁, or α₁ derives αₙ)

⇒  : derives in one step
⇒* : derives in zero or more steps
⇒+ : derives in one or more steps
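A single derivation step can be sketched as a function on tuples of symbols. The function name derive_step and the tuple representation are assumptions chosen for illustration. The example replays the derivation of id+id from E.

```python
def derive_step(form, index, head, body):
    """Rewrite the non-terminal `head` at position `index` of `form` with `body`.

    `form` and `body` are tuples of grammar symbols. This performs one step
    alpha A beta => alpha gamma beta using the production A -> gamma.
    """
    if form[index] != head:
        raise ValueError(f"symbol at position {index} is {form[index]!r}, not {head!r}")
    return form[:index] + tuple(body) + form[index + 1:]


# E => E+E => id+E => id+id
form = ("E",)
form = derive_step(form, 0, "E", ("E", "+", "E"))
form = derive_step(form, 0, "E", ("id",))
form = derive_step(form, 2, "E", ("id",))
print("".join(form))  # id+id
```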



Derivations

E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(id+E) ⇒ -(id+id)

OR

E ⇒ -E ⇒ -(E) ⇒ -(E+E) ⇒ -(E+id) ⇒ -(id+id)

At each derivation step, we can choose any of the non-terminals in the current sentential form of G for the replacement.



If we always choose the left-most non-terminal in each derivation step, the derivation is called a left-most derivation.

If we always choose the right-most non-terminal in each derivation step, the derivation is called a right-most derivation.



Left-Most and Right-Most Derivation

Left-Most Derivation

E ⇒lm -E ⇒lm -(E) ⇒lm -(E+E) ⇒lm -(id+E) ⇒lm -(id+id)

Right-Most Derivation

E ⇒rm -E ⇒rm -(E) ⇒rm -(E+E) ⇒rm -(E+id) ⇒rm -(id+id)
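The two derivations above differ only in which non-terminal is rewritten at each step. The sketch below applies the same list of production bodies twice, once always expanding the left-most E and once the right-most. The names expand and derive are hypothetical, and the code assumes the single non-terminal E of the example grammar.

```python
NONTERMINALS = {"E"}  # the example grammar has a single non-terminal


def expand(form, body, leftmost=True):
    """Replace the left-most (or right-most) non-terminal in `form` by `body`."""
    positions = [i for i, symbol in enumerate(form) if symbol in NONTERMINALS]
    if not positions:
        raise ValueError("no non-terminal left to expand")
    i = positions[0] if leftmost else positions[-1]
    return form[:i] + tuple(body) + form[i + 1:]


def derive(bodies, leftmost=True):
    """Apply the production bodies in order and return every sentential form."""
    form = ("E",)
    steps = [form]
    for body in bodies:
        form = expand(form, body, leftmost)
        steps.append(form)
    return ["".join(step) for step in steps]


BODIES = [("-", "E"), ("(", "E", ")"), ("E", "+", "E"), ("id",), ("id",)]
print(" => ".join(derive(BODIES, leftmost=True)))
print(" => ".join(derive(BODIES, leftmost=False)))
```

The two printed lines differ only in the fifth sentential form: -(id+E) for the left-most derivation and -(E+id) for the right-most one.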



Top-down parsers try to find the left-most derivation of the given source program.

Bottom-up parsers try to find the right-most derivation of the given source program in reverse order.



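Ambiguity

A grammar that produces more than one parse tree for some sentence is called an ambiguous grammar.

E ⇒ E+E ⇒ id+E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id

E ⇒ E*E ⇒ E+E*E ⇒ id+E*E ⇒ id+id*E ⇒ id+id*id

[Figure: the two parse trees for id+id*id. In the first, the root is E + E and the right operand expands to E * E, grouping the sentence as id+(id*id). In the second, the root is E * E and the left operand expands to E + E, grouping it as (id+id)*id.]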
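The ambiguity can be checked mechanically by counting parse trees. The sketch below is an illustration rather than a general parsing algorithm: it hard-codes the example grammar (E → E op E | - E | ( E ) | id) and counts how many distinct parse trees derive a given token list. The function name count_parse_trees is hypothetical.

```python
from functools import lru_cache

BINARY_OPS = {"+", "-", "*", "/"}


def count_parse_trees(tokens):
    """Count parse trees for `tokens` under the ambiguous example grammar."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of parse trees deriving tokens[i:j] from E."""
        if i >= j:
            return 0
        total = 0
        if j - i == 1 and tokens[i] == "id":
            total += 1                                   # E -> id
        if tokens[i] == "-":
            total += count(i + 1, j)                     # E -> - E
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)                 # E -> ( E )
        for k in range(i + 1, j - 1):
            if tokens[k] in BINARY_OPS:
                total += count(i, k) * count(k + 1, j)   # E -> E op E
        return total

    return count(0, len(tokens))


print(count_parse_trees(["id", "+", "id", "*", "id"]))  # 2
```

A count greater than one for any sentence shows that the grammar is ambiguous; here id+id*id has exactly the two trees described above.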



Ambiguity (cont.)

For most parsers, the grammar must be unambiguous.

Unambiguous grammar: unique selection of the parse tree for a sentence.



Ambiguity (cont.)

We should eliminate the ambiguity in the grammar during the design phase of the compiler.

An unambiguous grammar should be written to eliminate the ambiguity.

We have to prefer one of the parse trees of a sentence (generated by an ambiguous grammar) and disambiguate the grammar so that it is restricted to this choice.
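One conventional way to make such a choice for the expression grammar is to rewrite it so that * and / bind tighter than + and -, and binary operators group to the left. The slides do not give this rewritten grammar, so the layering below (E → T { (+|-) T }, T → F { (*|/) F }, F → - F | ( E ) | id) is an assumption, and the recursive-descent parser is only a minimal sketch of it. Trees are returned as nested tuples, a hypothetical representation.

```python
def parse(tokens):
    """Parse a token list into a nested-tuple tree; raise SyntaxError on bad input."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():      # E -> T { (+|-) T }
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():      # T -> F { (*|/) F }
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():    # F -> - F | ( E ) | id
        token = advance()
        if token == "id":
            return "id"
        if token == "-":
            return ("neg", factor())
        if token == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse(["id", "+", "id", "*", "id"]))  # ('+', 'id', ('*', 'id', 'id'))
```

With this grammar, id+id*id has only the tree that groups it as id+(id*id); the other tree from the ambiguous grammar can no longer be derived.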
