Compiler Design UNIT-2
Transcript of Compiler Design UNIT-2
COMPILER DESIGN UNIT-2: BASIC PARSING TECHNIQUES
Topics: Parsers, shift-reduce parsing, operator precedence parsing, top-down parsing, predictive parsers. Automatic construction of efficient parsers: LR parsers, the canonical collection of LR(0) items, constructing SLR parsing tables, constructing canonical LR parsing tables, constructing LALR parsing tables, using ambiguous grammars, an automatic parser generator, implementation of LR parsing tables.
31-01-2017 ANKUR SRIVASTAVA, ASSISTANT PROFESSOR, JIT (CD)
DEFINITION OF PARSING
A parser is a compiler or interpreter component that breaks data into smaller elements for easy translation into another language.
A parser takes input in the form of a sequence of tokens or program instructions and usually builds a data structure in the form of a parse tree or an abstract syntax tree.
Contd…..
• The parser identifies the language constructs present in a given input program.
• If the parser determines the input to be valid, it outputs a representation of the input in the form of a parse tree.
• If the input is grammatically incorrect, the parser reports a syntax error; in this case no parse tree can be produced.
ROLE OF PARSER
• In the compiler model, the parser obtains a string of tokens from the lexical analyzer,
• and verifies that the string can be generated by the grammar for the source language.
• The parser reports any syntax errors in the source program.
• It collects a sufficient number of tokens and builds a parse tree.
• There are basically two types of parser:
• Top-down parser:
  • starts at the root of the derivation tree and fills in
  • picks a production and tries to match the input
  • may require backtracking
  • some grammars are backtrack-free (predictive)
Parser contd……..
• Bottom-up parser:
  • starts at the leaves and fills in
  • starts in a state valid for legal first tokens
  • uses a stack to store both state and sentential forms
TOP DOWN PARSING
• A top-down parser starts with the root of the parse tree, labeled with the start or goal symbol of the grammar.
• To build a parse, it repeats the following steps until the fringe of the parse tree matches the input string:
• STEP 1: At a node labeled A, select a production A → α and construct the appropriate child for each symbol of α.
• STEP 2: When a terminal is added to the fringe that doesn't match the input string, backtrack.
• STEP 3: Find the next node to be expanded.
• The key is selecting the right production in step 1.
EXAMPLE FOR TOP DOWN PARSING
• Suppose the given production rules are as follows:
• S → aAd | aB
• A → b | c
• B → ccd
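The grammar above can be recognized by a tiny backtracking top-down parser. The sketch below is illustrative only (the generator-based `parse` function and the `GRAMMAR` encoding are my own naming, not from the slides):

```python
# Backtracking top-down recognizer for: S -> aAd | aB, A -> b | c, B -> ccd
GRAMMAR = {
    "S": [["a", "A", "d"], ["a", "B"]],
    "A": [["b"], ["c"]],
    "B": [["c", "c", "d"]],
}

def parse(symbol, s, pos):
    """Yield every input position reachable by deriving a prefix of s[pos:]."""
    if symbol not in GRAMMAR:                      # terminal: must match input
        if pos < len(s) and s[pos] == symbol:
            yield pos + 1
        return
    for production in GRAMMAR[symbol]:             # try each alternative
        positions = [pos]
        for sym in production:                     # thread positions through
            positions = [q for p in positions for q in parse(sym, s, p)]
        yield from positions                       # backtracking is implicit

def accepts(s):
    return any(end == len(s) for end in parse("S", s, 0))
```

Here `accepts("abd")` and `accepts("accd")` succeed via S → aAd and S → aB respectively; enumerating all end positions is what makes the backtracking complete.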
PROBLEMS WITH TOP DOWN PARSING
1) BACKTRACKING
Backtracking is a technique in which, for the expansion of a non-terminal symbol, we choose one alternative and, if some mismatch occurs, we try another alternative, if any. If a non-terminal has multiple production rules beginning with the same input symbol, then to get the correct derivation we need to try all these alternatives.
EXAMPLE OF BACKTRACKING
• Suppose the given production rules are as follows:
• S → cAd
• A → a | ab
More Example
• S → rXd | rZd
• X → oa | ea
• Z → ai
2) LEFT RECURSION
Left recursion is the case when the left-most symbol in a production of a non-terminal is the non-terminal itself (direct left recursion), or the non-terminal rewrites to itself through some other non-terminal definitions (indirect left recursion). Consider these examples:
(1) A → Aq (direct)
(2) A → Bq
    B → Ar (indirect)
Left recursion has to be removed if the parser performs top-down parsing.
Contd….
• A production is left-recursive if the leftmost symbol on the right side is the same as the non-terminal on the left side.
• For example, expr → expr + term.
• A grammar is left recursive if it has a nonterminal, say A, that has a derivation of Aα from it.
• The presence of left recursion creates difficulties while designing the corresponding parsers.
• Left recursion is of two types: immediate left recursion and general left recursion.
Contd……
• An immediate left recursion happens with a nonterminal A having a production rule of the form
• A → Aα | β.
• The immediate left recursion can be eliminated by introducing a new nonterminal symbol, say A', thus modifying the grammar.
• The grammar rule A → Aα | β is modified as:
• A → βA'
• A' → αA' | ε
• Thus the rule A → Aα1 | Aα2 | ….. | Aαm | β1 | β2 | …. | βn can be modified as,
Contd……
• A → β1 A' | β2 A' | …….. | βn A'
• A' → α1 A' | α2 A' | …….. | αm A' | ε
• Example
• Consider the following left-recursive grammar for arithmetic expressions:
  E → E + T | T
  T → T * F | F
  F → (E) | id
Contd……
• Elimination of immediate left recursion from the rules modifies the grammar as:
• E → TE'
• E' → +TE' | ε
• T → FT'
• T' → *FT' | ε
• F → (E) | id
• However, even if there is no immediate left recursion, a number of production rules may act together to give a general left recursion.
Contd……
• S → Aa
• A → Sb | c
• Here, S is left recursive, because S ⇒ Aa ⇒ Sba.
• This form of general left recursion can be eliminated with the following algorithm.
Algorithm
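The standard procedure orders the nonterminals A1..An, substitutes each earlier nonterminal Aj into productions of the form Ai → Aj γ (for j < i), and then removes the immediate left recursion on Ai. A Python sketch of that procedure (names are mine; like the textbook algorithm, it assumes no ε-productions or cycles among the input rules):

```python
def eliminate_left_recursion(grammar, order):
    """grammar: dict nonterminal -> list of productions (symbol lists).
       order:   the chosen ordering A1..An of the nonterminals."""
    g = {n: [list(p) for p in ps] for n, ps in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for p in g[ai]:
                if p and p[0] == aj:              # Ai -> Aj gamma: substitute
                    expanded += [d + p[1:] for d in g[aj]]
                else:
                    expanded.append(p)
            g[ai] = expanded
        recursive = [p[1:] for p in g[ai] if p and p[0] == ai]
        if recursive:                             # remove immediate recursion
            new = ai + "'"
            g[ai] = [b + [new] for b in g[ai] if not (b and b[0] == ai)]
            g[new] = [a + [new] for a in recursive] + [[]]  # [] = epsilon
    return g
```

On the grammar S → Aa, A → Sb | c with order S, A it reproduces the result worked out on the next slide: A → cA', A' → abA' | ε.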
Contd…..
• For example, consider the grammar,
• S → Aa
• A → Sb | c
• Let the order of nonterminals be S, A. For i=1, the rule S → Aa passes through unchanged, since there is no immediate left recursion. For i=2, A → Sb | c is modified to A → Aab | c, which has immediate left recursion and hence is eliminated by rewriting the rule as:
  A → cA'
  A' → abA' | ε
RECURSIVE DESCENT PARSING
• A recursive descent parser is a kind of top-down parser built from a set of mutually recursive procedures (or a non-recursive equivalent), where each such procedure usually implements one of the productions of the grammar.
• It is a common form of top-down parsing.
• It is called recursive because it uses recursive procedures to process the input.
• It constructs the parse tree from the top, and the input is read from left to right.
• It uses a procedure for every terminal and non-terminal entity.
Example
• Consider the grammar
  S → abA
  A → cd | c | ε
For the input stream ab, the recursive descent parser starts by constructing a parse tree representing S → abA.
Now construct the parse tree for the above grammar.
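A recognizer for this grammar can be written as one procedure per nonterminal. The sketch below is my own illustration (class and method names are assumptions); note that the alternatives cd and c share the prefix c, so procedure A peeks one extra symbol before committing:

```python
class RecDescent:
    """Recursive-descent recognizer for  S -> a b A,  A -> c d | c | epsilon."""
    def __init__(self, text):
        self.text, self.pos = text, 0

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else None

    def match(self, ch):
        if self.peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {self.pos}")
        self.pos += 1

    def S(self):                        # S -> a b A
        self.match("a"); self.match("b"); self.A()

    def A(self):                        # A -> c d | c | epsilon
        if self.peek() == "c":
            self.match("c")
            if self.peek() == "d":      # decide cd vs c with one more peek
                self.match("d")
        # otherwise derive epsilon: consume nothing

    def accepts(self):
        try:
            self.S()
        except SyntaxError:
            return False
        return self.pos == len(self.text)
```

`RecDescent("ab").accepts()`, with A deriving ε, returns True, as do the inputs abc and abcd.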
PREDICTIVE LL(1) PARSING
• The first "L" in LL(1) refers to the fact that the input is processed from left to right.
• The second "L" refers to the fact that LL(1) parsing determines a leftmost derivation for the input string.
• The "1" in parentheses means that LL(1) parsing uses only one symbol of input to predict the next grammar rule that should be used.
• The data structures used by LL(1) are: 1. Input buffer 2. Stack 3. Parsing table
• The construction of a predictive LL(1) parser is based on two very important functions: First and Follow.
• For the construction of a predictive LL(1) parser we have to follow these steps:
• STEP 1: Compute the FIRST and FOLLOW functions.
• STEP 2: Construct the predictive parsing table using the FIRST and FOLLOW functions.
• STEP 3: Parse the input string with the help of the predictive parsing table.
FIRST
If X is a terminal, then First(X) is just X.
If there is a production X → ε, then add ε to First(X).
If there is a production X → Y1Y2..Yk, then add First(Y1Y2..Yk) to First(X).
First(Y1Y2..Yk) is either:
  First(Y1), if First(Y1) doesn't contain ε;
  OR, if First(Y1) does contain ε, then First(Y1Y2..Yk) is everything in First(Y1) except for ε, as well as everything in First(Y2..Yk).
If First(Y1), First(Y2), .., First(Yk) all contain ε, then add ε to First(Y1Y2..Yk) as well.
FOLLOW
• First put $ (the end-of-input marker) in Follow(S), where S is the start symbol.
• If there is a production A → aBb (where a can be a whole string), then everything in FIRST(b) except for ε is placed in FOLLOW(B).
• If there is a production A → aB, then everything in FOLLOW(A) is in FOLLOW(B).
• If there is a production A → aBb, where FIRST(b) contains ε, then everything in FOLLOW(A) is in FOLLOW(B).
EXAMPLE OF FIRST AND FOLLOW
The Grammar:
E → TE'
E' → +TE'
E' → ε
T → FT'
T' → *FT'
T' → ε
F → (E)
F → id
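The rules on the previous slides amount to a fixed-point computation over this grammar. The sketch below illustrates it (the encoding and names are mine; `eps` stands for ε):

```python
# Fixed-point computation of FIRST and FOLLOW for the expression grammar.
EPS = "eps"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}
NONTERMINALS = set(GRAMMAR)

def first_of_string(symbols, first):
    """FIRST of a symbol string, given the FIRST sets computed so far."""
    out = set()
    for x in symbols:
        if x == EPS:
            out.add(EPS)
            break
        f = first[x] if x in NONTERMINALS else {x}
        out |= f - {EPS}
        if EPS not in f:
            break
    else:                               # every symbol could derive epsilon
        out.add(EPS)
    return out

def compute_first():
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:                      # iterate until nothing new is added
        changed = False
        for lhs, prods in GRAMMAR.items():
            for prod in prods:
                new = first_of_string(prod, first)
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first

def compute_follow(first, start="E"):
    follow = {n: set() for n in NONTERMINALS}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for lhs, prods in GRAMMAR.items():
            for prod in prods:
                for i, x in enumerate(prod):
                    if x not in NONTERMINALS:
                        continue
                    rest = first_of_string(prod[i + 1:], first)
                    new = (rest - {EPS}) | (follow[lhs] if EPS in rest else set())
                    if not new <= follow[x]:
                        follow[x] |= new
                        changed = True
    return follow
```

For this grammar the computation gives the standard sets, e.g. FIRST(E) = { (, id }, FOLLOW(E) = { ), $ } and FOLLOW(F) = { *, +, ), $ }.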
PROPERTIES OF LL(1) GRAMMARS
1. No left-recursive grammar is LL(1).
2. No ambiguous grammar is LL(1).
3. Some languages have no LL(1) grammar.
4. An ε-free grammar where each alternative expansion for A begins with a distinct terminal is a simple LL(1) grammar.
Example:
S → aS | a
is not LL(1) because FIRST(aS) = FIRST(a) = { a }
S → aS'
S' → aS' | ε
accepts the same language and is LL(1).
PREDICTIVE PARSING TABLE
Method:
1. For each production A → α:
   a) For each a ∈ FIRST(α), add A → α to M[A,a].
   b) If ε ∈ FIRST(α):
      I. For each b ∈ FOLLOW(A), add A → α to M[A,b].
      II. If $ ∈ FOLLOW(A), add A → α to M[A,$].
2. Set each undefined entry of M to error.
If some M[A,a] has multiple entries, then G is not LL(1).
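Using the standard FIRST and FOLLOW sets of the expression grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → (E) | id, the method above can be sketched as follows (the dictionary encoding is my own; `eps` stands for ε):

```python
# Build the LL(1) table M[A,a] from precomputed FIRST/FOLLOW sets
# (the values below are the standard sets for this grammar).
EPS = "eps"
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"*", "+", ")", "$"}}

def first_of(prod):
    """FIRST of a production right-hand side."""
    out = set()
    for x in prod:
        f = FIRST.get(x, {x} if x != EPS else {EPS})
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)
    return out

def build_table():
    table = {}
    for lhs, prods in GRAMMAR.items():
        for prod in prods:
            f = first_of(prod)
            # step 1a plus, when epsilon is derivable, steps 1b-I/II
            targets = (f - {EPS}) | (FOLLOW[lhs] if EPS in f else set())
            for a in targets:
                if (lhs, a) in table:          # multiple entries: not LL(1)
                    raise ValueError("grammar is not LL(1)")
                table[(lhs, a)] = prod
    return table
```

For instance, M[E', $] receives E' → ε via rule 1b-II, because ε ∈ FIRST(ε) and $ ∈ FOLLOW(E').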
EXAMPLE OF PREDICTIVE PARSING LL(1) TABLE
The given grammar is as follows:
S → E
E → TE'
E' → +E | −E | ε
T → FT'
T' → *T | /T | ε
F → num | id
BOTTOM UP PARSING
Bottom-up parsing starts from the leaf nodes of a tree and works upward until it reaches the root node.
We start from a sentence and then apply production rules in reverse in order to reach the start symbol.
Here, the parser tries to identify the R.H.S. of a production rule and replace it with the corresponding L.H.S. This activity is known as reduction.
Such a parser is also known as an LR parser, where L means tokens are read from left to right and R means that it constructs a rightmost derivation (in reverse).
Bottom-up parsing is based on the reverse of the top-down process.
Example
• Consider the grammar
  S → aABe
  A → Abc | b
  B → d
and the sentence abbcde. Parsing by bottom-up methods gives
abbcde ⇒ aAbcde ⇒ aAde ⇒ aABe ⇒ S
Reverse of this gives:
S => aABe => aAde => aAbcde => abbcde
Which is clearly a series of rightmost derivations.
EXAMPLE OF BOTTOM-UP PARSER
E → T + E | T
T → int * T | int | (E)
Consider the string: int * int + int

int * int + int    T → int
int * T + int      T → int * T
T + int            T → int
T + T              E → T
T + E              E → T + E
E
SHIFT REDUCE PARSING
• Bottom-up parsing uses two kinds of actions: 1. Shift 2. Reduce
• Shift: Move | one place to the right; this shifts a terminal onto the left string:
  ABC|xyz ⇒ ABCx|yz
• Reduce: Apply an inverse production at the right end of the left string.
  If A → xy is a production, then Cbxy|ijk ⇒ CbA|ijk
EXAMPLE OF SHIFT REDUCE PARSING
|int * int + int        shift
int | * int + int       shift
int * | int + int       shift
int * int | + int       reduce T → int
int * T | + int         reduce T → int * T
T | + int               shift
T + | int               shift
T + int |               reduce T → int
T + T |                 reduce E → T
T + E |                 reduce E → T + E
E |
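The trace above can be replayed mechanically. In the sketch below the sequence of moves is supplied by hand (a real LR parser derives it from its parsing table); the `run` helper and its encoding are my own illustration:

```python
# Replay the shift/reduce moves for the input  int * int + int.
def run(tokens, moves):
    stack, buf = [], list(tokens)
    for move in moves:
        if move == "shift":
            stack.append(buf.pop(0))            # move | one place right
        else:
            lhs, rhs = move                     # reduce by lhs -> rhs
            assert stack[-len(rhs):] == rhs, "handle must be on top of stack"
            del stack[-len(rhs):]
            stack.append(lhs)
    return stack, buf

tokens = ["int", "*", "int", "+", "int"]
moves = [
    "shift", "shift", "shift",
    ("T", ["int"]),
    ("T", ["int", "*", "T"]),
    "shift", "shift",
    ("T", ["int"]),
    ("E", ["T"]),
    ("E", ["T", "+", "E"]),
]
stack, rest = run(tokens, moves)                # ends with stack == ["E"]
```

After the last reduction the stack holds only the start symbol E and the input buffer is empty, i.e. the string is accepted.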
OPERATOR PRECEDENCE PARSING
Operator grammars have the property that no production right side is empty or has two adjacent nonterminals. This property enables the implementation of efficient operator-precedence parsers. These parsers rely on the following three precedence relations:

Relation    Meaning
a <· b      a yields precedence to b
a =· b      a has the same precedence as b
a ·> b      a takes precedence over b
• These operator precedence relations allow us to delimit the handles in the right sentential forms: <· marks the left end, =· appears in the interior of the handle, and ·> marks the right end.
• Suppose that $ is the end of the string. Then for all terminals b we can write: $ <· b and b ·> $.
• If we remove all nonterminals and place the correct precedence relation (<·, =·, ·>) between the remaining terminals, there remain strings that can be analyzed by an easily developed parser.
EXAMPLE OF OPERATOR PRECEDENCE PARSING
For example, the following operator precedence relations can be introduced for simple expressions:

      id    +    *    $
id          ·>   ·>   ·>
+     <·    ·>   <·   ·>
*     <·    ·>   ·>   ·>
$     <·    <·   <·

Example: The input string id1 + id2 * id3, after inserting precedence relations, becomes
$ <· id1 ·> + <· id2 ·> * <· id3 ·> $
Contd….
They follow from the following facts:
• + has lower precedence than * (hence + <· * and * ·> +).
• Both + and * are left-associative (hence + ·> + and * ·> *).
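The relation table can drive a simple recognizer: shift while the top terminal yields to (or equals) the lookahead, and pop a handle when it takes precedence. The sketch below is illustrative (the table encoding and function name are mine):

```python
# Operator-precedence recognizer driven by the relation table above
# ('<' = yields precedence, '=' = same, '>' = takes precedence).
PREC = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}

def op_precedence_parse(tokens):
    stack = ["$"]
    buf = list(tokens) + ["$"]
    while not (stack == ["$"] and buf == ["$"]):
        rel = PREC.get((stack[-1], buf[0]))
        if rel in ("<", "="):
            stack.append(buf.pop(0))            # shift the lookahead
        elif rel == ">":
            last = stack.pop()                  # pop the handle: keep popping
            while PREC.get((stack[-1], last)) != "<":
                last = stack.pop()              # until the top yields to it
        else:
            raise SyntaxError(f"no relation between {stack[-1]} and {buf[0]}")
    return True
```

On id + id * id this shifts past +, shifts * (since + <· *), and reduces the multiplication handle first, matching the precedences in the table.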
Associativity
• If an operand has operators on both sides, the side on which the operator takes this operand is decided by the associativity of those operators.
• Example
• Operations such as addition, multiplication, subtraction, and division are left-associative. If the expression contains:
  id op id op id, it will be evaluated as: (id op id) op id
Example….
• (id + id) + id
• Operations like exponentiation are right-associative, i.e., the order of evaluation in the same expression will be:
• id op (id op id)
• Another example: id ^ (id ^ id)
If two different operators share a common operand, the precedence of the operators decides which will take the operand. That is, 2+3*4 can have two different parse trees, one corresponding to (2+3)*4 and another corresponding to 2+(3*4).
Contd…
• By setting precedence among operators, this problem can be easily removed.
• As in the previous example, mathematically * (multiplication) has precedence over + (addition), so the expression 2+3*4 will always be interpreted as:
• 2 + (3 * 4)
• These methods decrease the chances of ambiguity in a language or its grammar.
Another Definition
• For operators, associativity determines, when the same operator appears in a row, which operator occurrence we apply first.
• In the following, let Q be the operator in a Q b Q c.
If Q is left-associative, then it evaluates as (a Q b) Q c.
And if it is right-associative, then it evaluates as a Q (b Q c).
It's important, since it changes the meaning of an expression.
Contd…….
• Consider the division operator with integer arithmetic, which is left-associative:
  4 / 2 / 3 <=> (4 / 2) / 3 <=> 2 / 3 = 0
If it were right-associative, it would evaluate to an undefined expression, since you would divide by zero:
  4 / 2 / 3 <=> 4 / (2 / 3) <=> 4 / 0 = undefined
• If you write 12 - 5 + 3, the possible evaluations include:
• (12 - 5) + 3 = 10
• 12 - (5 + 3) = 4
Contd…
• Left-associative means we evaluate our expression from the left-hand side to the right.
• Right-associative means we evaluate our expression from the right-hand side to the left.
• We know *, / and % have the same precedence, but according to associativity the answer may change.
• For example, take the expression 4*8/2%5:
• Left-associative: ((4*8)/2)%5 ==> (32/2)%5 ==> 16%5 ==> 1
• Right-associative: 4*(8/(2%5)) ==> 4*(8/2) ==> 4*4 ==> 16
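The two evaluation orders can be written as a left fold and a right fold. A small sketch (function names are mine), using the integer-division example from the previous slide:

```python
from functools import reduce

def eval_left(op, operands):
    """((a op b) op c) op ... : left-associative evaluation."""
    return reduce(op, operands)

def eval_right(op, operands):
    """a op (b op (c op ...)) : right-associative evaluation."""
    result = operands[-1]
    for x in reversed(operands[:-1]):
        result = op(x, result)
    return result
```

With integer division, `eval_left(lambda a, b: a // b, [4, 2, 3])` gives (4 // 2) // 3 = 0, while the right-associative order first computes 2 // 3 = 0 and then attempts 4 // 0, which raises a division error, exactly as the slide argues.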
LR PARSERS
• One of the best methods for syntactic recognition of programming languages is LR parsing.
• An LR parser uses the shift-reduce technique.
• The L stands for left-to-right scanning and the R for a rightmost derivation.
• LR(1) parsing, i.e., LR parsing with one symbol of lookahead.
• LR(k) parsing, with k symbols of lookahead.
• LR parsers are a type of bottom-up parser that efficiently handles deterministic context-free languages.
LR Parsers
• A bottom-up parser follows a rightmost derivation from the bottom up.
• Such parsers typically use the LR algorithm and are called LR parsers.
• L means process tokens from Left to right.
• R means follow a Rightmost derivation.
• Furthermore, in LR parsing, the production is applied only after the pattern has been matched.
• In LL (predictive) parsing, the production is selected first, and then the tokens are matched to it.
Rightmost Derivations
• Let the grammar be
  E → E + T | T
  T → T * F | F
  F → (E) | id | num
• A rightmost derivation of (id + num)*id is
  E ⇒ T ⇒ T*F ⇒ T*id ⇒ F*id ⇒ (E)*id ⇒ (E + T)*id ⇒ (E + F)*id ⇒ (E + num)*id ⇒ (T + num)*id ⇒ (F + num)*id ⇒ (id + num)*id
LR Parsers
• An LR parser uses a parse table, an input buffer, and a stack of "states."
• It performs three operations:
• Shift a token from the input buffer to the stack.
• Reduce the content of the stack by applying a production.
• Go to a new state.
ADVANTAGES
• The advantages of LR parsing are numerous:
1. An LR parser can recognize virtually all PL constructs written with CFGs.
2. It is the most general nonbacktracking technique known.
3. It can be implemented in a very efficient manner.
4. The class of languages it can recognize is a proper superset of that for predictive parsers.
5. It can detect syntax errors quickly.
DISADVANTAGE
• The primary disadvantage of LR parsers is that it is far too much work to construct LR parsing tables by hand.
• However, tools exist to automatically generate an LR parser from a given grammar.
• These are called LR parser generators; examples include YACC, Bison, etc.
• These parser generators are useful not only in creating the parser, but also in finding errors in the grammar.
LR PARSING METHODS
• There are actually three different methods to perform LR parsing:
SLR(1) – Simple LR Parser:
  Works on the smallest class of grammars.
  Few states, hence a very small table.
  Simple and fast construction.
  Easy to implement, but less powerful.
Contd…….
Canonical LR or LR(1) Parser:
  It is the most general and powerful.
  It is tedious and costly to implement.
  Works on the complete set of LR(1) grammars.
  Generates a large table and a large number of states.
  Slow construction.
Contd……
• LALR(1) – Look-Ahead LR Parser:
  Works on an intermediate size of grammar.
  The number of states is the same as in SLR(1).
  It is a mix of SLR and Canonical LR.
  It can be implemented efficiently.
  Most parser generators generate LALR parsers, since they are the trade-off between power and efficiency.
LL vs LR
• LL does a leftmost derivation; LR does a rightmost derivation in reverse.
• LL starts with the root nonterminal on the stack; LR ends with the root nonterminal on the stack.
• LL ends when the stack is empty; LR starts with an empty stack.
• LL uses the stack for designating what is still to be expected; LR uses the stack for designating what is already seen.
• LL builds the parse tree top-down; LR builds the parse tree bottom-up.
• LL continuously pops a nonterminal off the stack and pushes the corresponding right-hand side; LR tries to recognize a right-hand side on the stack, pops it, and pushes the corresponding nonterminal.
• LL expands the non-terminals; LR reduces the non-terminals.
• LL reads the terminals when it pops one off the stack; LR reads the terminals while it pushes them on the stack.
• LL performs a pre-order traversal of the parse tree; LR performs a post-order traversal.
Constructing LR Parsing Tables
• The table is an essential part of LR parsing.
• But how does one go about making the tables?
• It is a daunting task if one does not use automated tools for the purpose.
• One important fact to keep in mind while constructing LR parsing tables is that the state on top of the stack provides a wealth of information to the parser.
• An LR parser keeps track of viable prefixes of the handles.
• It uses an automaton to recognize these prefixes.
• The goto portion of the table simulates this automaton, so the parser does not need to scan the stack on every input symbol to figure out what state it is in.
Contd…..
• Items
• The first concept in LR table construction is that of an item.
• An item is a production rule with a position indicator (dot) at some point in the RHS.
• If A → XYZ is a production, the possible items of this production are:
  A → .XYZ
  A → X.YZ
  A → XY.Z
  A → XYZ.
Contd…..
• Items are also known as LR(0) items, since they assume no lookahead.
• An item denotes how much of a production we have seen so far during the parsing.
• SLR Parsing Tables
• We augment the grammar and use two functions: closure and goto.
• Augmented grammar: An augmented grammar simply has a new "dummy" start symbol, whose only production is the start symbol of the grammar in question.
• If G is our grammar with start symbol S, then the augmented grammar is
  G' = (VT, VN ∪ {S'}, S', F ∪ {S' → S}).
Contd…..
• Let us consider the following augmented grammar for constructing the SLR parsing table:
  S' → S
  S → aABe
  A → Abc
  A → b
  B → d
Contd….
• The sets of items for the augmented grammar, computed using this process:
• I0 = {[S' → .S], [S → .aABe]}
• I1 = {[S' → S.]}
• I2 = {[S → a.ABe], [A → .Abc], [A → .b]}
• I3 = {[S → aA.Be], [A → A.bc], [B → .d]}
• I4 = {[A → b.]}
• I5 = {[S → aAB.e]}
• I6 = {[A → Ab.c]}
• I7 = {[B → d.]}
• I8 = {[S → aABe.]}
• I9 = {[A → Abc.]}
Goto…..
• The goto function for this grammar is:
  goto(I0, S) = I1
  goto(I0, a) = I2
  goto(I2, A) = I3
  goto(I2, b) = I4
  goto(I3, B) = I5
  goto(I3, b) = I6
  goto(I3, d) = I7
  goto(I5, e) = I8
  goto(I6, c) = I9
Graph
S’ --> S.I1
B--> d. I7
A --> Ab.cI6
S’ --> .S I0
S’ -->.aABe
S’ -->a.ABeA -->.AbcA --> .b I2
A --> b. I4
S’ -->aA.BeA -->A.bcB--> .d I3
A --> Abc.I9
S’ -->aAB.eI5
S’ -->aABe.I8
a
d
b
b
A
B e
c
S
Constructing LR(1) PARSERS
• LR(1) item = LR(0) item + lookahead
• Example:
  S → AA
  A → aA | b
• Closure rule: from an item [A → α.Bβ, a/b], add items [B → .γ, c/d], where the lookaheads c/d come from FIRST(β), or are a/b itself if β is not there.
• The canonical collection of LR(1) item sets for this grammar:
  I0 = {[S' → .S, $], [S → .AA, $], [A → .aA, a/b], [A → .b, a/b]}
  I1 = goto(I0, S) = {[S' → S., $]}
  I2 = goto(I0, A) = {[S → A.A, $], [A → .aA, $], [A → .b, $]}
  I3 = goto(I0, a) = {[A → a.A, a/b], [A → .aA, a/b], [A → .b, a/b]}
  I4 = goto(I0, b) = {[A → b., a/b]}
  I5 = goto(I2, A) = {[S → AA., $]}
  I6 = goto(I2, a) = {[A → a.A, $], [A → .aA, $], [A → .b, $]}
  I7 = goto(I2, b) = {[A → b., $]}
  I8 = goto(I3, A) = {[A → aA., a/b]}
  I9 = goto(I6, A) = {[A → aA., $]}
  with goto(I3, a) = I3, goto(I3, b) = I4, goto(I6, a) = I6, goto(I6, b) = I7.
An Automatic Parser Generator
1. ACCENT: A Compiler for the Entire Class of Context-Free Languages
• Accent is a modern compiler generator that can process all grammars without any restriction.
• No knowledge of parsing technology is required; grammars can be used directly without adapting them to a particular parsing technique. There is no struggling against shift/reduce conflicts as known from Yacc and no rewriting of left-recursive rules as it is required for LL(k) parsers.
• Accent is in use in industrial projects, especially where languages are implemented that are defined by complex standard documents and grammars that cannot be processed by traditional systems.
CONTD……• Accent can be used in the style of Yacc, i.e. you provide a grammar
and add semantic actions. However, Accent also supports the Extended Backus Naur Form, and there are no restrictions on where to place semantic actions. Like Yacc, Accent cooperates with Lex.
• Accent is Open Source Software. Commercial support is available from Metarga GmbH.
Contd……
2. AFLEX AND AYACC• Aflex and Ayacc are similar to the Unix tools lex and yacc, but they are
written in Ada and generate Ada output. • They were developed by the Arcadia Project at the University of
California, Irvine. • Aflex is based on the tool 'flex' written by Vern Paxson. These tools
are copyrighted, but are freely redistributable.
Contd……
3. ALE:
• The Attribute-Logic Engine, Version 3.2, is a freeware logic programming and grammar parsing and generation system.
• The available materials include information on obtaining the system, a user's guide, graphical interfaces, and grammars. This version includes:
• ALE is faster (again), both at compile-time and run-time.
Contd…..
• A new parsing compilation algorithm (Empty-First-Daughter closure), which:
  • assumes that a grammar is "EFD-closed", meaning that the first daughter of every grammar rule is non-empty;
  • corrects a long-standing problem in ALE with combining empty categories;
  • works around a problem with non-ISO-compatible Prologs, including SICStus Prolog.
• Shallow cuts (if-then-else predicates) have been added to the definite clause language.
• Faster extensionalisation code, particularly with grammars that have few or no extensional types.
Contd…..
• Faster subsumption checking code for chart edges.• ALE Source-level Debugger 3.0, which has been integrated with the
new SICStus 3.7 source-level debugger.• More compile-time error and warning messages,• Several bug corrections,• An updated user's manual,• An SWI Prolog port.
Contd…..
4. The AnaGram Parser Generator:
• The parser is a C/C++ function that parses text according to the rules in our grammar and, as it matches rules, calls our code to deal with them.
• Using a grammar means we get faster development, easier modification and maintainability, and fewer bugs in our software.
• We use an easy-to-understand description of input instead of bug-prone, fragile, branching code.
• We make the rules, and AnaGram makes a parser for our input.• AnaGram 2.01 runs on Win32 platforms.
Contd…..
5. Bison, the YACC-compatible Parser Generator:
• Bison is a general-purpose parser generator that converts a grammar description for an LALR(1) context-free grammar into a C program to parse that grammar.
• Once we are proficient with Bison, we may use it to develop a wide range of language parsers, from those used in simple desk calculators to complex programming languages.
Contd……
6. BTYACC:• BTYACC is a modified version of yacc that supports automatic
backtracking and semantic disambiguation to parse ambiguous grammars, as well as syntactic sugar for inherited attributes.
7. BYACC:• Berkeley Yacc is a public domain LALR(1) parser generator. It has been
made as compatible as possible with AT&T Yacc.
Contd……8. Coco/R:• It is a compiler generator, which takes an attributed grammar of a
source language and generates a scanner and a parser for this language.
• The scanner works as a deterministic finite automaton. • The parser uses recursive descent. • LL(1) conflicts can be resolved by a multi-symbol lookahead or by
semantic checks.• Coco/R for C#, Java, C++, F#, VB.Net, Delphi, Swift, Oberon, other
languages.
Contd…..
• Some more examples:
• DEPOT4
• FLEX
• GOBO EIFFEL LEX & YACC
• HAPPY
• HOLUB
• LEX
• LLGEN
• MKS LEX & YACC
• PCYACC
• PRECC
• PROGRAMMAR
• QUEX
• RDP
• TP LEX AND YACC
• VISUALPARSE++
Using Ambiguous grammars
• The number of states in the LR(1) parsing table is much larger than in the SLR parsing table.
• LALR reduces the number of states of the LR(1) parsing table.
• LALR (Lookahead LR) is less powerful than LR(1):
• merging states may introduce reduce-reduce conflicts, but not shift-reduce conflicts.
• LALR has the same number of states as SLR, but is more powerful.
• Constructing the LALR parsing table:
• Combine LR(1) sets with the same sets of first parts (ignoring lookaheads).
• Algorithms exist that skip constructing the full LR(1) sets.
Contd….
• Using ambiguous grammars:
• Ambiguous grammars will result in conflicts.
• We can use precedence and associativity to resolve the conflicts.
• This may result in a smaller parsing table in comparison to using unambiguous grammars.
• Example:
  E → E + E
  E → E * E
  E → (E)
  E → id
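Resolving the ambiguity of this grammar with precedence and associativity, rather than by rewriting it, can be illustrated with a precedence-climbing sketch (the function names, the string output format, and the token encoding are my own assumptions, not from the slides):

```python
# Disambiguate  E -> E+E | E*E | (E) | id  by operator precedence
# and left associativity (precedence climbing).
PRECEDENCE = {"+": 1, "*": 2}           # both operators left-associative

def parse_expr(tokens, min_prec=1):
    """Return a fully parenthesized string showing the chosen parse."""
    left = parse_atom(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # left associativity: the right operand must bind strictly tighter
        right = parse_expr(tokens, PRECEDENCE[op] + 1)
        left = f"({left} {op} {right})"
    return left

def parse_atom(tokens):
    tok = tokens.pop(0)
    if tok == "(":
        inner = parse_expr(tokens)
        assert tokens.pop(0) == ")", "unbalanced parentheses"
        return inner
    return tok                          # an id or number
```

Of the two parse trees the ambiguous grammar allows for 2+3*4, the precedence declarations select (2 + (3 * 4)), which is exactly how yacc-style `%left` declarations resolve the corresponding shift-reduce conflicts.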