Module 2: Compilers and their Working. Software Construction, Lectures 10, 11 and 12.


Page 1: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Module 2: Compilers and their Working
Software Construction
Lectures 10, 11 and 12

Page 2: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

What are Compilers
• Translate information from one representation to another
• Usually the information is a program
• Typical compilers: VC, VC++, GCC, javac, FORTRAN, Pascal, VB
• Translators: Word to PDF, PDF to PostScript

Page 3: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Source Code
• Optimized for human readability
• Matches human notions of grammar
• Uses named constructs such as variables and procedures

Page 4: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

How to Translate
• Translation is a complex process
• The source language and the generated code are very different
• We need to structure the translation

Page 5: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Two-pass Compiler

source code → Front End → IR → Back End → machine code
(errors reported by both passes)

• Use an intermediate representation (IR)
• The front end maps legal source code into IR
• The back end maps IR into target machine code

Page 6: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

The Front End

Modules:
• Scanner (also called lexical analyzer)
• Parser

source code → scanner → tokens → parser → IR
(errors reported)

Page 7: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Scanner
• Maps the character stream into words, the basic units of syntax
• Produces pairs: a word and its part of speech

source code → scanner → tokens → parser → IR
(errors reported)

Page 8: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Scanner Example

x = x + y becomes

<id,x> <assign,=> <id,x> <op,+> <id,y>

We call the pair "<token type, word>" a "token", e.g. <id,x>.
Typical tokens: number, identifier, +, -, new, while, if
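The following is a minimal sketch, not from the lecture, of such a scanner producing <token type, word> pairs for this example. The Token struct and the scan function are illustrative names of my own, and only identifiers, numbers, '=', '+' and '-' are handled.

    #include <cctype>
    #include <iostream>
    #include <string>
    #include <vector>

    // A token is a pair <token type, word>, as on this slide.
    struct Token {
        std::string type;  // "id", "number", "assign", "op"
        std::string word;  // the matched characters
    };

    // Break a character stream into tokens (identifiers, numbers, =, +, -).
    std::vector<Token> scan(const std::string& src) {
        std::vector<Token> tokens;
        size_t i = 0;
        while (i < src.size()) {
            unsigned char c = src[i];
            if (std::isspace(c)) { ++i; continue; }
            if (std::isalpha(c)) {                       // identifier
                size_t j = i;
                while (j < src.size() && std::isalnum((unsigned char)src[j])) ++j;
                tokens.push_back({"id", src.substr(i, j - i)});
                i = j;
            } else if (std::isdigit(c)) {                // number
                size_t j = i;
                while (j < src.size() && std::isdigit((unsigned char)src[j])) ++j;
                tokens.push_back({"number", src.substr(i, j - i)});
                i = j;
            } else if (c == '=') { tokens.push_back({"assign", "="}); ++i; }
            else if (c == '+' || c == '-') { tokens.push_back({"op", std::string(1, c)}); ++i; }
            else { ++i; }                                // skip anything else in this sketch
        }
        return tokens;
    }

    int main() {
        for (const Token& t : scan("x = x + y"))
            std::cout << "<" << t.type << "," << t.word << "> ";
        std::cout << "\n";  // prints: <id,x> <assign,=> <id,x> <op,+> <id,y>
    }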

Page 9: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Parser
• Recognizes context-free syntax and reports errors
• Guides context-sensitive ("semantic") analysis
• Builds the IR for the source program

source code → scanner → tokens → parser → IR
(errors reported)

Page 10: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

What is Context-Free Syntax?
To understand this, we need some background on context-free grammars. A context-free grammar is a set of rewrite rules, defined as follows.

Page 11: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Context-Free Grammars
Context-free syntax is specified with a grammar G = (S, N, T, P)
• S is the start symbol
• N is the set of non-terminal symbols
• T is the set of terminal symbols, or words
• P is the set of productions, or rewrite rules

Page 12: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Context-Free Grammars
Grammar for expressions:
1. goal → expr
2. expr → expr op term
3.      | term
4. term → number
5.      | id
6. op   → +
7.      | -

Page 13: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

The Front End
For this CFG:
S = goal
T = { number, id, +, - }
N = { goal, expr, term, op }
P = { 1, 2, 3, 4, 5, 6, 7 }

Page 14: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Context-Free Grammars
Given a CFG, we can derive sentences by repeated substitution.

Consider the sentence (expression)

x + 2 – y

Page 15: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Derivation

Production   Result
-            goal
1            expr
2            expr op term
5            expr op y
7            expr – y
2            expr op term – y
4            expr op 2 – y
6            expr + 2 – y
3            term + 2 – y
5            x + 2 – y

Page 16: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

The Front End
To recognize a valid sentence in some CFG, we reverse this process and build up a parse.

A parse can be represented by a tree: a parse tree or syntax tree.

Page 17: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Parse

Production   Result
-            goal
1            expr
2            expr op term
5            expr op y
7            expr – y
2            expr op term – y
4            expr op 2 – y
6            expr + 2 – y
3            term + 2 – y
5            x + 2 – y

Page 18: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Syntax Tree for x + 2 – y

goal
  expr
    expr
      expr
        term
          <id,x>
      op: +
      term
        <number,2>
    op: –
    term
      <id,y>

Page 19: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Abstract Syntax Trees
The parse tree contains a lot of unneeded information. Compilers often use an abstract syntax tree (AST) instead.

Page 20: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Abstract Syntax Trees
• This is much more concise
• The AST summarizes the grammatical structure without the details of the derivation
• ASTs are one kind of intermediate representation (IR)

AST for x + 2 – y:

–
  +
    <id,x>
    <number,2>
  <id,y>
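A minimal sketch, not from the lecture, of how this AST could be represented in code; the Node struct and the helper names (leaf, op, print) are illustrative choices of my own.

    #include <iostream>
    #include <memory>
    #include <string>

    // One possible AST node layout: a value plus optional left/right children.
    struct Node {
        std::string value;                  // "-", "+", "x", "2", "y", ...
        std::unique_ptr<Node> left, right;  // null for leaves
    };

    std::unique_ptr<Node> leaf(const std::string& v) {
        auto n = std::make_unique<Node>();
        n->value = v;
        return n;
    }

    std::unique_ptr<Node> op(const std::string& v,
                             std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
        auto n = std::make_unique<Node>();
        n->value = v;
        n->left = std::move(l);
        n->right = std::move(r);
        return n;
    }

    // Print the tree as a fully parenthesized expression.
    void print(const Node* n) {
        if (!n) return;
        if (!n->left && !n->right) { std::cout << n->value; return; }
        std::cout << "(";
        print(n->left.get());
        std::cout << " " << n->value << " ";
        print(n->right.get());
        std::cout << ")";
    }

    int main() {
        // AST from the slide: "-" at the root, "+" over <id,x> and <number,2>, and <id,y>.
        auto ast = op("-", op("+", leaf("x"), leaf("2")), leaf("y"));
        print(ast.get());          // prints: ((x + 2) - y)
        std::cout << "\n";
    }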

Page 21: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Three-pass Compiler

source code → Front End → IR → Middle End → IR → Back End → machine code
(errors reported by all passes)

The middle end is an intermediate stage for code improvement, or optimization:
• Analyzes the IR and rewrites (or transforms) it
• Primary goal is to reduce the running time of the compiled code
• May also improve space usage, power consumption, ...
• Must preserve the "meaning" of the code

Page 22: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Lexical Analysis: Scanner

source code → scanner → tokens → parser → IR
(errors reported)

Page 23: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Lexical Analysis
• The task of the scanner is to take a program written in some programming language, as a stream of characters, and break it into a stream of tokens. This activity is called lexical analysis.
• The lexical analyzer partitions the input string into substrings, called words, and classifies them according to their role.
• The output of lexical analysis is a stream of tokens.

Page 24: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Tokens
Example:

if( i == j ) z = 0; else z = 1;

The input is just a sequence of characters:

if ( \b i \b = = \b j \n \t ....

Page 25: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Tokens
Goal: partition the input string into substrings and classify them according to their role.
A token is a syntactic category.

Natural language: "He wrote the program"
Words: "He", "wrote", "the", "program"

Programming language: "if(b == 0) a = b"
Words: "if", "(", "b", "==", "0", ")", "a", "=", "b"

Page 26: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Tokens
• Identifiers: x  y11  maxsize
• Keywords: if  else  while  for
• Integers: 2  1000  -44  5L
• Floats: 2.0  0.0034  1e5
• Symbols: ( ) + * / { } < > ==
• Strings: "enter x"  "error"

Page 27: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

How to Describe Tokens?
Regular languages are the most popular way of specifying tokens:
• Simple and useful theory
• Easy to understand
• Efficient implementations

Page 28: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Examples of Languages

Alphabet = English characters
Language = English sentences

Alphabet = ASCII
Language = C++, Java, or C# programs

Page 29: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Recap
• Tokens: strings of characters representing lexical units of programs, such as identifiers, numbers, and operators.
• Regular expressions: a concise description of tokens. A regular expression describes a set of strings.
• Language L(R): the set of strings represented by a regular expression R. L(R) is the language denoted by the regular expression R.

Page 30: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Regular Expressions
R|S   = either R or S
RS    = R followed by S (concatenation)
R*    = concatenation of R zero or more times (R* = ε | R | RR | RRR ...)
R?    = ε | R (zero or one R)
R+    = RR* (one or more R)
[abc] = a|b|c (any of the listed characters)
[a-z] = a|b|....|z (range)
[^ab] = c|d|... (anything but 'a' or 'b')

Page 31: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

How to Use REs
We need a mechanism to determine whether an input string w belongs to L(R), the language denoted by the regular expression R.
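One such mechanism is already provided by the C++ standard library: std::regex_match tests whether a whole string belongs to L(R). A minimal sketch, not from the lecture; the identifier and integer patterns are my own examples, written with the operators listed on the previous slide.

    #include <iostream>
    #include <regex>
    #include <string>

    int main() {
        // R for identifiers: a letter followed by zero or more letters or digits.
        std::regex identifier("[a-zA-Z][a-zA-Z0-9]*");
        // R for (unsigned) integers: one or more digits.
        std::regex integer("[0-9]+");

        for (const std::string& w : {"x", "y11", "maxsize", "2", "1000", "2x"}) {
            std::cout << w << ": "
                      << (std::regex_match(w, identifier) ? "identifier"
                          : std::regex_match(w, integer)  ? "integer"
                                                          : "neither")
                      << "\n";
        }
    }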

Page 32: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Acceptor
Such a mechanism is called an acceptor.

input string w → acceptor for language L → yes, if w ∈ L; no, if w ∉ L

Page 33: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Finite Automata (FA)
• Specification: regular expressions
• Implementation: finite automata
• A finite automaton accepts a string if we can follow transitions labelled with the characters of the string from the start state to some accepting state
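A minimal hand-coded sketch, not from the lecture, of such a finite automaton for identifiers of the form letter (letter | digit)*. The state names are illustrative; the accepts function simply follows transitions character by character and checks whether it ends in the accepting state.

    #include <cctype>
    #include <iostream>
    #include <string>

    // States of a small DFA for identifiers: letter (letter|digit)*
    enum State { START, IN_ID, REJECTED };

    bool accepts(const std::string& w) {
        State s = START;
        for (char c : w) {
            switch (s) {
                case START:
                    s = std::isalpha((unsigned char)c) ? IN_ID : REJECTED;
                    break;
                case IN_ID:
                    s = std::isalnum((unsigned char)c) ? IN_ID : REJECTED;
                    break;
                case REJECTED:
                    break;                  // dead state: stay rejected
            }
        }
        return s == IN_ID;                  // IN_ID is the only accepting state
    }

    int main() {
        for (const std::string& w : {"x", "y11", "2x", ""})
            std::cout << '"' << w << "\" -> " << (accepts(w) ? "accept" : "reject") << "\n";
    }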

Page 34: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Syntactic vs Semantic Analysis

Page 35: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Syntactic Analysis
Natural language analogy: consider the sentence

He wrote the program

Parts of speech: He = noun, wrote = verb, the = article, program = noun
Structure: subject (He), predicate (wrote), object (the program) → sentence

Page 36: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Syntactic Analysis
Programming language example:

if ( b <= 0 ) a = b

Here "b <= 0" is a boolean expression, "a = b" is an assignment, and the whole construct is an if-statement.

Page 37: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Syntactic Analysis

int* foo(int i, int j))        // extra parenthesis
{
  for(k=0; i j; )              // missing expression
    fi( i > j )                // "fi" is not a keyword
      return j;
}

Page 38: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Semantic Analysis
Grammatically correct, but not meaningful:

He wrote the computer

Parts of speech: He = noun, wrote = verb, the = article, computer = noun
Structure: subject, predicate, object → sentence

Page 39: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Semantic Analysis

int* foo(int i, int j)
{
  for(k=0; i < j; j++ )        // k: undeclared variable
    if( i < j-2 )
      sum = sum+i;             // sum: undeclared variable
  return sum;                  // return type mismatch (int* vs int)
}

Page 40: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Role of the Parser
Not all sequences of tokens are programs. The parser must distinguish between valid and invalid sequences of tokens.

What we need:
• An expressive way to describe the syntax
• An acceptor mechanism that determines whether the input token stream satisfies the syntax
• Parsing is the process of discovering a derivation for some sentence
• A mathematical model of syntax: a grammar G
• An algorithm for testing membership in L(G)

Page 41: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Backus-Naur Form (BNF)
Context-free grammars are often given by BNF (Backus-Naur Form) expressions. Grammar rules in a similar form were first used in the description of the Algol60 language. The notation was developed by John Backus and adapted by Peter Naur for the Algol60 report, hence the term Backus-Naur Form (BNF).

The meta-symbols of BNF and their meanings:
::=   meaning "is defined as"
|     meaning "or"
< >   angle brackets used to surround category names
[ ]   optional items are enclosed in the meta-symbols [ and ]

Page 42: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Meta-symbols of BNF
• Optional items are enclosed in the meta-symbols [ and ]. Example:
  <if_statement> ::= if <boolean_expression> then <statement_sequence> [ else <statement_sequence> ] end if ;

• Repetitive items (zero or more times) are enclosed in the meta-symbols { and }. Example:
  <identifier> ::= <letter> { <letter> | <digit> }

• Terminals of only one character are surrounded by quotes (") to distinguish them from meta-symbols. Example:
  <statement_sequence> ::= <statement> { ";" <statement> }

• In recent textbooks, terminal and non-terminal symbols are distinguished by using bold face for terminals and suppressing < and > around non-terminals. This greatly improves readability. The example then becomes:
  if_statement ::= if boolean_expression then statement_sequence [ else statement_sequence ] end if ";"

Page 43: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

A More Useful Grammar
1 expr → expr op expr
2      | num
3      | id
4 op   → +
5      | –
6      | *
7      | /

Page 44: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Derivation of x – 2 * y

Rule   Sentential Form
-      expr
1      expr op expr
3      <id,x> op expr
5      <id,x> – expr
1      <id,x> – expr op expr
2      <id,x> – <num,2> op expr
6      <id,x> – <num,2> * expr
3      <id,x> – <num,2> * <id,y>

Page 45: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Derivation
• Such a sequence of rewrites is called a derivation
• The process of discovering a derivation is called parsing
• At each step, we choose a non-terminal to replace
• Different choices can lead to different derivations

Two derivations are of interest:

1. Leftmost derivation

2. Rightmost derivation

Page 46: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Derivations
• Leftmost derivation: replace the leftmost non-terminal (NT) at each step
• Rightmost derivation: replace the rightmost NT at each step
• The example on the preceding slides was a leftmost derivation
• There is also a rightmost derivation

Page 47: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Rightmost Derivation

Rule   Sentential Form
-      expr
1      expr op expr
3      expr op <id,y>
6      expr * <id,y>
1      expr op expr * <id,y>
2      expr op <num,2> * <id,y>
5      expr – <num,2> * <id,y>
3      <id,x> – <num,2> * <id,y>

Page 48: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Derivations
The two derivations produce different parse trees.

The parse trees imply different evaluation orders!

Page 49: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Parse Trees (leftmost derivation)

G
  E
    E
      x
    –
    E
      E
        2
      *
      E
        y

Evaluation order: x – ( 2 * y )

Page 50: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Parse Trees (rightmost derivation)

G
  E
    E
      E
        x
      –
      E
        2
    *
    E
      y

Evaluation order: ( x – 2 ) * y

Page 51: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Precedence
These two derivations point out a problem with the grammar: it has no notion of precedence, or implied order of evaluation.

To add precedence:
• Create a non-terminal for each level of precedence
• Isolate the corresponding part of the grammar
• Force the parser to recognize high-precedence subexpressions first

Page 52: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Precedence
For algebraic expressions:
• Multiplication and division first (level one)
• Subtraction and addition next (level two)

Page 53: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

1 Goal   → expr
2 expr   → expr + term     (level two)
3        | expr – term
4        | term
5 term   → term * factor   (level one)
6        | term / factor
7        | factor
8 factor → number
9        | id

Page 54: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Precedence
• This grammar is larger: it takes more rewriting to reach some of the terminal symbols
• But it encodes the expected precedence
• It produces the same parse tree under leftmost and rightmost derivations
• Let's see how it parses x – 2 * y

Page 55: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Precedence: x – 2 * y

1 Goal   → expr
2 expr   → expr + term
3        | expr – term
4        | term
5 term   → term * factor
6        | term / factor
7        | factor
8 factor → number
9        | id

Rule   Sentential Form
-      Goal
1      expr
3      expr – term
5      expr – term * factor
9      expr – term * <id,y>
7      expr – factor * <id,y>
8      expr – <num,2> * <id,y>
4      term – <num,2> * <id,y>
7      factor – <num,2> * <id,y>
9      <id,x> – <num,2> * <id,y>

This is the rightmost derivation.

Page 56: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Parse Trees

G
  E
    E
      T
        F
          <id,x>
    –
    T
      T
        F
          <num,2>
      *
      F
        <id,y>

Evaluation order: x – ( 2 * y )

Page 57: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Parse Trees

(The other derivation produces the same parse tree; evaluation order: x – ( 2 * y ).)

Page 58: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Precedence
Both the leftmost and the rightmost derivation give the same parse tree, because the grammar directly encodes the desired precedence.

Page 59: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Parsing Techniques

Page 60: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Parsing Techniques
Top-down parsers:
• Start at the root of the parse tree and grow towards the leaves
• Pick a production and try to match the input
• A bad "pick" may require backtracking
• Some grammars are backtrack-free

Page 61: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Top-down Parsers
Also called LL parsing:
• The first L means that tokens are read left to right
• The second L means that the parser constructs a leftmost derivation

Page 62: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Parsing Techniques
Bottom-up parsers:
• Start at the leaves and grow toward the root
• As input is consumed, encode possibilities in an internal state
• Start in a state valid for legal first tokens
• Bottom-up parsers handle a large class of grammars
• Preferred method in practice

Page 63: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Bottom-up Parsing
Also called LR parsing:
• L means that tokens are read left to right
• R means that the parser constructs a rightmost derivation

Page 64: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Top-Down Parser
• A top-down parser starts with the root of the parse tree
• The root node is labeled with the goal symbol of the grammar

Page 65: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Top-Down Parsing Algorithm
• Construct the root node of the parse tree
• Repeat until the fringe (the leaves) of the parse tree matches the input string:
  1. At a node labeled A, select a production with A on its left-hand side and, for each symbol on its right-hand side, construct the appropriate child
  2. When a terminal symbol is added to the fringe and it does not match the input, backtrack
  3. Find the next node to be expanded

Page 66: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Top-Down Parsing
The key is picking the right production in step 1. That choice should be guided by the input string.

Page 67: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Expression Grammar
1 Goal    → expr
2 expr    → expr + term
3         | expr – term
4         | term
5 term    → term * factor
6         | term / factor
7         | factor
8 factor  → number
9         | id
10        | ( expr )

Let's try parsing x – 2 * y

Page 68: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

P   Sentential Form    Input
-   Goal               x – 2 * y
1   expr               x – 2 * y
2   expr + term        x – 2 * y
4   term + term        x – 2 * y
7   factor + term      x – 2 * y
9   <id,x> + term      x – 2 * y
9   <id,x> + term      x – 2 * y

Page 69: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

This worked well except that "–" does not match "+".

P   Sentential Form    Input
-   Goal               x – 2 * y
1   expr               x – 2 * y
2   expr + term        x – 2 * y
4   term + term        x – 2 * y
7   factor + term      x – 2 * y
9   <id,x> + term      x – 2 * y
9   <id,x> + term      x – 2 * y

Page 70: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

The parser must backtrack to the start.

P   Sentential Form    Input
-   Goal               x – 2 * y
1   expr               x – 2 * y
2   expr + term        x – 2 * y
4   term + term        x – 2 * y
7   factor + term      x – 2 * y
9   <id,x> + term      x – 2 * y
9   <id,x> + term      x – 2 * y

Page 71: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

This time the "–" and "–" matched.

P   Sentential Form    Input
-   Goal               x – 2 * y
1   expr               x – 2 * y
2   expr – term        x – 2 * y
4   term – term        x – 2 * y
7   factor – term      x – 2 * y
9   <id,x> – term      x – 2 * y
9   <id,x> – term      x – 2 * y

Page 72: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

We can advance past "–" to look at "2".

P   Sentential Form    Input
-   Goal               x – 2 * y
1   expr               x – 2 * y
2   expr – term        x – 2 * y
4   term – term        x – 2 * y
7   factor – term      x – 2 * y
9   <id,x> – term      x – 2 * y
9   <id,x> – term      x – 2 * y
-   <id,x> – term      x – 2 * y

Page 73: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Now, we need to expand "term".

P   Sentential Form    Input
-   Goal               x – 2 * y
1   expr               x – 2 * y
2   expr – term        x – 2 * y
4   term – term        x – 2 * y
7   factor – term      x – 2 * y
9   <id,x> – term      x – 2 * y
9   <id,x> – term      x – 2 * y
-   <id,x> – term      x – 2 * y

Page 74: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

P   Sentential Form     Input
-   <id,x> – term       x – 2 * y
7   <id,x> – factor     x – 2 * y
8   <id,x> – <num,2>    x – 2 * y
-   <id,x> – <num,2>    x – 2 * y

"2" matches "2".

We have more input but no non-terminals left to expand.

Page 75: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

The expansion terminated too soon.

We need to backtrack.

P   Sentential Form     Input
-   <id,x> – term       x – 2 * y
7   <id,x> – factor     x – 2 * y
8   <id,x> – <num,2>    x – 2 * y
-   <id,x> – <num,2>    x – 2 * y

Page 76: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

P   Sentential Form              Input
-   <id,x> – term                x – 2 * y
5   <id,x> – term * factor       x – 2 * y
7   <id,x> – factor * factor     x – 2 * y
8   <id,x> – <num,2> * factor    x – 2 * y
-   <id,x> – <num,2> * factor    x – 2 * y
-   <id,x> – <num,2> * factor    x – 2 * y
9   <id,x> – <num,2> * <id,y>    x – 2 * y
-   <id,x> – <num,2> * <id,y>    x – 2 * y

Success! We matched and consumed all the input.

Page 77: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Another Possible Parse

P   Sentential Form                       Input
-   Goal                                  x – 2 * y
1   expr                                  x – 2 * y
2   expr + term                           x – 2 * y
2   expr + term + term                    x – 2 * y
2   expr + term + term + term             x – 2 * y
2   expr + term + term + term + ....      x – 2 * y

Consuming no input! A wrong choice of expansion leads to non-termination. The parser must make the right choice.

Page 78: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Left Recursion

Top-down parsers cannot handle left-recursive grammars.

Page 79: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Left Recursion
• Our expression grammar is left-recursive
• This can lead to non-termination in a top-down parser
• Non-termination is bad in any part of a compiler
• For a top-down parser, any recursion must be right recursion
• We would like to convert left recursion to right recursion
• To remove left recursion, we transform the grammar

Page 80: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Eliminating Left Recursion
Consider a grammar fragment:

A → A α
  | β

where neither α nor β starts with A.

Page 81: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Eliminating Left Recursion
We can rewrite this as:

A  → β A'
A' → α A'
   | ε

where A' is a new non-terminal.

Page 82: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Eliminating Left Recursion
A  → β A'
A' → α A'
   | ε

This accepts the same language but uses only right recursion.

Page 83: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Eliminating Left Recursion

The expression grammar we have been using contains two cases of left recursion.

Page 84: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Eliminating Left Recursion

expr → expr + term
     | expr – term
     | term

term → term * factor
     | term / factor
     | factor

Page 85: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Eliminating Left Recursion
Applying the transformation yields:

expr  → term expr'
expr' → + term expr'
      | – term expr'
      | ε

Page 86: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Eliminating Left Recursion
Applying the transformation yields:

term  → factor term'
term' → * factor term'
      | / factor term'
      | ε

Page 87: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Eliminating Left Recursion
• These fragments use only right recursion
• A top-down parser will terminate using them

Page 88: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

1  Goal   → expr
2  expr   → term expr'
3  expr'  → + term expr'
4         | – term expr'
5         | ε
6  term   → factor term'
7  term'  → * factor term'
8         | / factor term'
9         | ε
10 factor → number
11        | id
12        | ( expr )
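Because this grammar is right-recursive, a top-down parser needs no backtracking. The following is a minimal recursive-descent sketch, not from the lecture, of exactly this grammar: each non-terminal becomes one function and the ε alternatives simply return. All names (Parser, exprPrime, termPrime, ...) are illustrative, and for brevity the sketch reads characters directly instead of consuming the scanner's token stream.

    #include <cctype>
    #include <iostream>
    #include <stdexcept>
    #include <string>

    // One function per non-terminal of the right-recursive grammar above.
    class Parser {
    public:
        explicit Parser(std::string s) : src(std::move(s)) {}
        bool parseGoal() {                 // Goal -> expr
            expr();
            skipSpaces();
            return pos == src.size();      // accept only if all input was consumed
        }
    private:
        std::string src;
        size_t pos = 0;

        void skipSpaces() { while (pos < src.size() && std::isspace((unsigned char)src[pos])) ++pos; }
        char peek() { skipSpaces(); return pos < src.size() ? src[pos] : '\0'; }

        void expr() {                      // expr -> term expr'
            term();
            exprPrime();
        }
        void exprPrime() {                 // expr' -> + term expr' | - term expr' | epsilon
            char c = peek();
            if (c == '+' || c == '-') { ++pos; term(); exprPrime(); }
            // epsilon: do nothing
        }
        void term() {                      // term -> factor term'
            factor();
            termPrime();
        }
        void termPrime() {                 // term' -> * factor term' | / factor term' | epsilon
            char c = peek();
            if (c == '*' || c == '/') { ++pos; factor(); termPrime(); }
        }
        void factor() {                    // factor -> number | id | ( expr )
            char c = peek();
            if (std::isdigit((unsigned char)c)) {
                while (pos < src.size() && std::isdigit((unsigned char)src[pos])) ++pos;
            } else if (std::isalpha((unsigned char)c)) {
                while (pos < src.size() && std::isalnum((unsigned char)src[pos])) ++pos;
            } else if (c == '(') {
                ++pos; expr();
                if (peek() != ')') throw std::runtime_error("expected )");
                ++pos;
            } else {
                throw std::runtime_error("expected number, id or (");
            }
        }
    };

    int main() {
        try {
            std::cout << (Parser("x - 2 * y").parseGoal() ? "accepted" : "rejected") << "\n";
            std::cout << (Parser("x - 2 *").parseGoal() ? "accepted" : "rejected") << "\n";
        } catch (const std::exception& e) {
            std::cout << "syntax error: " << e.what() << "\n";
        }
    }

Each alternative here is chosen by looking at the next input character only, which is the predictive, one-symbol-lookahead idea discussed on the following slides.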

Page 89: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

Predictive Parsing
• If a top-down parser picks the wrong production, it may need to backtrack
• The alternative is to look ahead in the input and use context to pick correctly
• How much lookahead is needed? In general, an arbitrarily large amount
• Fortunately, large classes of CFGs can be parsed with limited lookahead
• Most programming language constructs fall into those subclasses

Page 90: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

LL(1) ... LL(k) Parsing
• Scan the input from Left to right
• Construct a Leftmost derivation
• Use 1..k symbols of lookahead
• This is a top-down parsing technique

Page 91: Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.

More on this in the advanced course: Compiler Construction, 7th semester.