Theory of Automata and formal languages Unit 3

THEORY OF AUTOMATA AND FORMAL LANGUAGES UNIT-III ABHIMANYU MISHRA ASSISTANT PROF.(CSE) JETGI Abhimanyu Mishra(CSE) JETGI 06/11/2022

Transcript of Theory of Automata and formal languages Unit 3

Page 1: Theory of Automata and formal languages Unit 3

05/03/2023 Abhimanyu Mishra(CSE) JETGI

THEORY OF AUTOMATA AND FORMAL LANGUAGES

UNIT-III

ABHIMANYU MISHRA, ASSISTANT PROF. (CSE)

JETGI

Page 2: Theory of Automata and formal languages Unit 3

Context-Free Grammar

A context-free grammar (CFG) is a quadruple (Vn, Vt, P, S) consisting of a finite set of grammar rules, where:

(i) Vn is a finite set of non-terminal symbols.
(ii) Vt is a finite set of terminals, where Vn ∩ Vt = ∅.
(iii) P is a finite set of production rules of the form A → α with A ∈ Vn and α ∈ (Vn ∪ Vt)*, i.e., the left-hand side of a production rule is a single non-terminal and does not have any right context or left context.
(iv) S ∈ Vn is the start symbol.

Page 3: Theory of Automata and formal languages Unit 3

Example: The following is a CFG for the language

L = { w c w^R | w ∈ {a, b}* }, where w^R is the reverse of w.

Solution: Let G = (Vn, Vt, P, S) be a CFG for L, with Vn = {S}, Vt = {a, b, c}, and P given by

S → aSa
S → bSb
S → c

Page 4: Theory of Automata and formal languages Unit 3

S ⇒ aSa ⇒ abSba ⇒ abbSbba ⇒ abbcbba

So the string abbcbba can be derived from the given CFG.
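The derivation of abbcbba can be replayed mechanically. The following is a minimal Python sketch, not part of the original slides; the function names `derives` and `derivation` are illustrative assumptions. `derives` checks membership by undoing one production at a time, and `derivation` lists the sentential forms for a given w.

```python
def derives(s: str) -> bool:
    """Return True if s can be derived from S using S -> aSa | bSb | c."""
    if s == "c":                                   # S -> c
        return True
    if len(s) >= 3 and s[0] == s[-1] and s[0] in "ab":
        return derives(s[1:-1])                    # undo S -> aSa or S -> bSb
    return False


def derivation(w: str) -> list[str]:
    """Sentential forms of the derivation of w c w^R, starting from S."""
    forms = ["S"]
    prefix = ""
    for ch in w:                                   # one S -> aSa / S -> bSb step per letter of w
        prefix += ch
        forms.append(prefix + "S" + prefix[::-1])
    forms.append(w + "c" + w[::-1])                # final step S -> c
    return forms
```

For w = abb, `derivation("abb")` lists the forms S, aSa, abSba, abbSbba, abbcbba.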

Page 5: Theory of Automata and formal languages Unit 3

BACKUS-NAUR FORM (BNF)

While linguists were studying CFGs, computer scientists began to describe programming languages using a notation called Backus-Naur Form (originally Backus Normal Form), which is the CFG notation with minor changes in format and some shorthand.

Example: a grammar written in this shorthand, with alternatives separated by "|":

S → bA | aB
A → bAA | aS | a
B → aBB | bS | a

Hence BNF is a shorthand notation for context-free grammars.

Page 6: Theory of Automata and formal languages Unit 3

Example: Write a CFG that generates strings of balanced parentheses.

Solution: This grammar generates strings in which left and right parentheses are balanced. For example, ()() is generated, and ((())) is also generated. Let the CFG be G = (Vn, Vt, P, S), where
Vn = set of non-terminals = {S}
Vt = set of terminals = { (, ) }

and the set of productions P is given by

S → SS
S → (S)
S → ε

Page 7: Theory of Automata and formal languages Unit 3

Now we see what this grammar generates:

S ⇒ SS ⇒ (S)S ⇒ (S)SS ⇒ (S)(S)S ⇒ (S)(S)(S) ⇒ ()(S)(S) ⇒ ()()(S) ⇒ ()()()
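As a sanity check on the three productions, here is a small Python sketch (the function name `generated` is an illustrative assumption, not from the slides). It tests membership by trying each production in reverse; for S → SS it only tries splits into two non-empty parts, which avoids looping on the empty string without changing the language.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def generated(s: str) -> bool:
    """Return True if s is derivable from S with S -> SS | (S) | ε."""
    if s == "":                                        # S -> ε
        return True
    if s[0] == "(" and s[-1] == ")" and generated(s[1:-1]):
        return True                                    # S -> (S)
    # S -> SS, splitting s into two non-empty halves
    return any(generated(s[:i]) and generated(s[i:]) for i in range(1, len(s)))
```

Every recursive call is on a strictly shorter string, so the check always terminates.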

Page 8: Theory of Automata and formal languages Unit 3

Example: Write a CFG for the regular expression r = 0*1(0+1)*

Solution: Let the CFG be G = (Vn, Vt, P, S), where
Vn = set of non-terminals = {S, A, B}
Vt = set of terminals = {0, 1}

and the productions P are defined as

S → A1B
A → 0A | ε
B → 0B | 1B | ε

Let us consider the derivation of the string 00101:

S ⇒ A1B ⇒ 0A1B ⇒ 00A1B ⇒ 001B ⇒ 0010B ⇒ 00101B ⇒ 00101

So G generates the language described by the regular expression r.
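The claim that G and r describe the same language can be spot-checked on short strings. The sketch below is an illustration only: the names `P`, `grammar_language` and `regex_language` are assumptions, and the search is bounded by a maximum length, so it is evidence rather than a proof. It expands the leftmost variable breadth-first and compares the result with Python's `re` module, where the textbook `(0+1)` is written `[01]`.

```python
import re
from collections import deque
from itertools import product

# Productions of G; the empty string "" stands for ε.
P = {"S": ["A1B"], "A": ["0A", ""], "B": ["0B", "1B", ""]}


def grammar_language(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    seen, out, queue = {"S"}, set(), deque(["S"])
    while queue:
        form = queue.popleft()
        # position of the leftmost variable, or None if the form is all terminals
        i = next((k for k, ch in enumerate(form) if ch in P), None)
        if i is None:
            out.add(form)
            continue
        for rhs in P[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            n_terminals = sum(ch not in P for ch in new)
            # terminals are never erased, so longer forms can be pruned safely
            if n_terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return out


def regex_language(max_len):
    """All binary strings of length <= max_len matching 0*1(0+1)*."""
    pattern = re.compile(r"0*1[01]*")
    words = ("".join(t) for n in range(max_len + 1) for t in product("01", repeat=n))
    return {w for w in words if pattern.fullmatch(w)}
```

Comparing `grammar_language(5)` with `regex_language(5)` checks the two descriptions against each other for all strings of length at most 5.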

Page 9: Theory of Automata and formal languages Unit 3

Problems

Q1: Write a CFG that generates strings having an equal number of a's and b's.
Q2: Design a CFG that generates every string of a's and b's except the null string.
Q3: Design a CFG for the regular expression r = (a+b)*aa(a+b)*
Q4: Design a CFG for the language L = { 0^n 1^n | n ≥ 0 } ∪ { 1^n 0^n | n ≥ 0 }
Q5: Design a CFG for the language L = { a^n b^m | n ≠ m }

Page 10: Theory of Automata and formal languages Unit 3

Leftmost and Rightmost Derivations

• Leftmost derivation − A leftmost derivation is obtained by applying a production to the leftmost variable in each step.

• Rightmost derivation − A rightmost derivation is obtained by applying a production to the rightmost variable in each step.

Let the production rules of a CFG be X → X+X | X*X | X | a over the terminal alphabet {a, +, *}.

The leftmost derivation for the string "a+a*a" may be −
X ⇒ X+X ⇒ a+X ⇒ a+X*X ⇒ a+a*X ⇒ a+a*a

The rightmost derivation for the same string "a+a*a" may be −
X ⇒ X*X ⇒ X*a ⇒ X+X*a ⇒ X+a*a ⇒ a+a*a
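The two derivation orders differ only in which occurrence of the variable gets rewritten. A minimal Python sketch, assuming the single variable X and illustrative helper names `step` and `derive`, replays both derivations of "a+a*a" from a list of chosen right-hand sides:

```python
def step(form: str, rhs: str, leftmost: bool = True, var: str = "X") -> str:
    """Rewrite the leftmost (or rightmost) occurrence of var in form with rhs."""
    i = form.find(var) if leftmost else form.rfind(var)
    if i < 0:
        raise ValueError("no variable left to expand")
    return form[:i] + rhs + form[i + 1:]


def derive(choices, leftmost=True):
    """Apply the chosen right-hand sides in order and return all sentential forms."""
    forms = ["X"]
    for rhs in choices:
        forms.append(step(forms[-1], rhs, leftmost))
    return forms


# The same string, derived in the two orders.
LEFTMOST = derive(["X+X", "a", "X*X", "a", "a"], leftmost=True)
RIGHTMOST = derive(["X*X", "a", "X+X", "a", "a"], leftmost=False)
```

`LEFTMOST` and `RIGHTMOST` both end in "a+a*a" but pass through different intermediate forms.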

Page 11: Theory of Automata and formal languages Unit 3

Ambiguity in Context-Free Grammars

If a context-free grammar G has more than one derivation tree for some string w ∈ L(G), it is called an ambiguous grammar. Equivalently, some string generated by the grammar has more than one leftmost derivation (or more than one rightmost derivation).

Problem
Check whether the grammar G with production rules X → X+X | X*X | X | a is ambiguous or not.

Solution
Let's find the derivation trees for the string "a+a*a". It has two leftmost derivations.

Page 12: Theory of Automata and formal languages Unit 3

Derivation 1 −
X ⇒ X+X ⇒ a+X ⇒ a+X*X ⇒ a+a*X ⇒ a+a*a

Parse tree 1 −

      X
    / | \
   X  +  X
   |   / | \
   a  X  *  X
      |     |
      a     a

Page 13: Theory of Automata and formal languages Unit 3

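Derivation 2 −
X ⇒ X*X ⇒ X+X*X ⇒ a+X*X ⇒ a+a*X ⇒ a+a*a

Parse tree 2 −

         X
       / | \
      X  *  X
    / | \   |
   X  +  X  a
   |     |
   a     a

Since the string "a+a*a" has two different parse trees, the grammar G is ambiguous.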
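The number of parse trees can also be counted directly. The sketch below is an illustration under one stated assumption: the unit rule X → X is ignored, because it would add infinitely many trivially different trees. The function name `count_trees` is hypothetical. Each operator position is tried as the root of the tree, and the counts for the two sides are multiplied.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_trees(s: str) -> int:
    """Number of parse trees of s under X -> X+X | X*X | a (X -> X ignored)."""
    if s == "a":                                  # X -> a
        return 1
    total = 0
    for i, ch in enumerate(s):
        if ch in "+*":                            # X -> X+X or X -> X*X with root at i
            total += count_trees(s[:i]) * count_trees(s[i + 1:])
    return total
```

A count greater than 1 for any string is enough to show the grammar is ambiguous.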

Page 14: Theory of Automata and formal languages Unit 3

Problem: Let G be the CFG with productions

S → bB | aA
A → b | bS | aAA
B → a | aS | bBB

For the string bbaababa find

(i) the leftmost derivation
(ii) the rightmost derivation
(iii) the parse tree

Page 15: Theory of Automata and formal languages Unit 3

4. Simplified Context-Free Grammar And Its Normal Form

4.1 Reduction of Context-Free Grammar

CFGs are reduced in two phases −

Phase 1 − Derivation of an equivalent grammar, G', from the CFG, G, such that each variable derives some terminal string.

Phase 2 − Derivation of an equivalent grammar, G'', from the CFG, G', such that each symbol appears in a sentential form.

Page 16: Theory of Automata and formal languages Unit 3

Problem: Find a reduced grammar equivalent to the grammar G having production rules
P: S → AC | B, A → a, C → c | BC, E → aA | e

Phase 1 −
T = { a, c, e }
W1 = { A, C, E } from rules A → a, C → c and E → e
W2 = { A, C, E } ∪ { S } from rule S → AC
W3 = { A, C, E, S } ∪ ∅
Since W2 = W3, we can derive G' as −
G' = ({ A, C, E, S }, { a, c, e }, P, S)
where P: S → AC, A → a, C → c, E → aA | e

Page 17: Theory of Automata and formal languages Unit 3

Phase 2 −
Y1 = { S }
Y2 = { S, A, C } from rule S → AC
Y3 = { S, A, C, a, c } from rules A → a and C → c
Y4 = { S, A, C, a, c }
Since Y3 = Y4, we can derive G'' as −
G'' = ({ A, C, S }, { a, c }, P, S)
where P: S → AC, A → a, C → c
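Both phases are fixed-point computations and are easy to sketch in code. The following Python version is a minimal illustration, assuming single-character symbols and a dictionary of productions; the names `generating`, `reachable`, `reduce_grammar`, `G_P` and `G_T` are not from the slides. It is applied to the grammar of this problem.

```python
def generating(productions, terminals):
    """Phase 1: variables that derive some terminal string (the limit of the W_i)."""
    w = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in w and any(
                all(s in terminals or s in w for s in body) for body in bodies
            ):
                w.add(head)
                changed = True
    return w


def reachable(productions, start):
    """Phase 2: symbols that appear in some sentential form (the limit of the Y_i)."""
    y, frontier = {start}, [start]
    while frontier:
        for body in productions.get(frontier.pop(), []):
            for s in body:
                if s not in y:
                    y.add(s)
                    frontier.append(s)
    return y


def reduce_grammar(productions, terminals, start):
    """Run phase 1, then phase 2, and return the remaining productions."""
    w = generating(productions, terminals)
    p1 = {
        h: [b for b in bodies if all(s in terminals or s in w for s in b)]
        for h, bodies in productions.items() if h in w
    }
    y = reachable(p1, start)
    return {h: bodies for h, bodies in p1.items() if h in y}


# The grammar from the problem: S -> AC | B, A -> a, C -> c | BC, E -> aA | e
G_P = {"S": ["AC", "B"], "A": ["a"], "C": ["c", "BC"], "E": ["aA", "e"]}
G_T = set("ace")
```

The order matters: running phase 1 first removes B, and only then does phase 2 see that E is unreachable from S.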

Page 18: Theory of Automata and formal languages Unit 3

4.1.1 Eliminating Useless Symbols

A symbol Y in a context-free grammar is useful if and only if both of the following hold:
(a) Y ⇒* w for some w ∈ Vt*, that is, Y leads to a string of terminals. Here Y is said to be "generating".
(b) There is a derivation S ⇒* αYβ ⇒* w, w ∈ L(G), for some α and β. Here Y is said to be "reachable".

4.1.2 Removal of Unit Productions

A production of the form
Non-terminal → one non-terminal
that is, A → B, is called a unit production. Unit productions increase the cost of derivation in a grammar.

Page 19: Theory of Automata and formal languages Unit 3

4.1.3 Removal of Null Production

In a CFG, a non-terminal symbol 'A' is a nullable variable if there is a production A → ε or there is a derivation that starts at A and finally ends up with ε: A ⇒ ... ⇒ ε

Removal Procedure
(i) Find the nullable non-terminal variables, i.e. those which derive ε.
(ii) For each production A → α, construct all productions A → x where x is obtained from α by removing one or more occurrences of the nullable non-terminals found in step (i).
(iii) Combine the original productions with the result of step (ii) and remove the ε-productions.
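The three steps of the procedure can be sketched as follows. This is a minimal Python illustration, assuming single-character symbols, with "" standing for ε; the names `nullable_vars`, `remove_null` and `G_NULL` are hypothetical. Like the procedure itself, it drops ε from the language if the start symbol happens to be nullable. The example grammar is S → ASA | aB | b, A → B, B → b | ε.

```python
from itertools import product


def nullable_vars(productions):
    """Step (i): variables that derive ε, found by iterating to a fixed point."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in nullable and any(
                all(s in nullable for s in body) for body in bodies
            ):
                nullable.add(head)
                changed = True
    return nullable


def remove_null(productions):
    """Steps (ii) and (iii): add every variant with nullable symbols dropped, omit ε."""
    nullable = nullable_vars(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = []
        for body in bodies:
            # each nullable symbol may be kept or dropped; all others are kept
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


G_NULL = {"S": ["ASA", "aB", "b"], "A": ["B"], "B": ["b", ""]}
```

Note that this sketch removes all ε-productions in one pass rather than one variable at a time, so intermediate production sets may differ from a step-by-step hand calculation.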

Page 20: Theory of Automata and formal languages Unit 3

4.1.4 Chomsky Normal Form

If a CFG has only productions of the form

Non-terminal → string of exactly two non-terminals

or of the form

Non-terminal → one terminal

it is said to be in Chomsky normal form (CNF).

Page 21: Theory of Automata and formal languages Unit 3

Problem: Remove unit productions from the following −
S → XY, X → a, Y → Z | b, Z → M, M → N, N → a

Solution −
There are 3 unit productions in the grammar −
Y → Z, Z → M, and M → N

At first, we will remove M → N.
As N → a, we add M → a, and M → N is removed.
The production set becomes
S → XY, X → a, Y → Z | b, Z → M, M → a, N → a

Now we will remove Z → M.
As M → a, we add Z → a, and Z → M is removed.
The production set becomes
S → XY, X → a, Y → Z | b, Z → a, M → a, N → a

Page 22: Theory of Automata and formal languages Unit 3

Now we will remove Y → Z.
As Z → a, we add Y → a, and Y → Z is removed.
The production set becomes
S → XY, X → a, Y → a | b, Z → a, M → a, N → a

Now Z, M, and N are unreachable, hence we can remove them.
The final CFG is free of unit productions −
S → XY, X → a, Y → a | b
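The same result can be reached in code. The sketch below is an illustration that uses a unit-closure approach rather than the one-at-a-time substitution shown in the solution: for each variable it collects every variable reachable through unit productions and copies their non-unit bodies. Symbols are assumed to be single characters, and the names `remove_unit`, `drop_unreachable`, `G_UNIT`, `UNIT_FREE` and `FINAL` are hypothetical.

```python
def remove_unit(productions):
    """Replace unit productions A -> B by the non-unit bodies reachable through them."""
    variables = set(productions)
    result = {}
    for head in productions:
        # variables reachable from head using unit productions only
        closure, stack = {head}, [head]
        while stack:
            for body in productions[stack.pop()]:
                if body in variables and body not in closure:
                    closure.add(body)
                    stack.append(body)
        # keep every non-unit body of every variable in the closure
        result[head] = {b for v in closure for b in productions[v] if b not in variables}
    return result


def drop_unreachable(productions, start):
    """Remove variables that can no longer be reached from the start symbol."""
    seen, stack = {start}, [start]
    while stack:
        for body in productions[stack.pop()]:
            for s in body:
                if s in productions and s not in seen:
                    seen.add(s)
                    stack.append(s)
    return {h: b for h, b in productions.items() if h in seen}


# S -> XY, X -> a, Y -> Z | b, Z -> M, M -> N, N -> a
G_UNIT = {"S": ["XY"], "X": ["a"], "Y": ["Z", "b"], "Z": ["M"], "M": ["N"], "N": ["a"]}
UNIT_FREE = remove_unit(G_UNIT)
FINAL = drop_unreachable(UNIT_FREE, "S")
```

Bodies are stored as sets here, so the order of alternatives carries no meaning.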

Page 23: Theory of Automata and formal languages Unit 3

Problem: Remove null productions from the following −
S → ASA | aB | b, A → B, B → b | ε

Solution −
There are two nullable variables − A and B.

At first, we will remove B → ε.
After removing B → ε, the production set becomes −
S → ASA | aB | b | a, A → B | b | ε, B → b

Now we will remove A → ε.
After removing A → ε, the production set becomes −
S → ASA | aB | b | a | SA | AS | S, A → B | b, B → b

Page 24: Theory of Automata and formal languages Unit 3

Problem: Convert the following CFG into CNF −
S → ASA | aB, A → B | S, B → b | ε

Solution
(1) Since S appears on the R.H.S., we add a new start symbol S0 and the production S0 → S, and the production set becomes −
S0 → S, S → ASA | aB, A → B | S, B → b | ε

(2) Now we will remove the null productions B → ε and A → ε.
After removing B → ε, the production set becomes −
S0 → S, S → ASA | aB | a, A → B | S | ε, B → b
After removing A → ε, the production set becomes −
S0 → S, S → ASA | aB | a | AS | SA | S, A → B | S, B → b

Page 25: Theory of Automata and formal languages Unit 3

(3) Now we will remove the unit productions.
After removing S → S, the production set becomes −
S0 → S, S → ASA | aB | a | AS | SA, A → B | S, B → b
After removing S0 → S, the production set becomes −
S0 → ASA | aB | a | AS | SA, S → ASA | aB | a | AS | SA, A → B | S, B → b
After removing A → B, the production set becomes −
S0 → ASA | aB | a | AS | SA, S → ASA | aB | a | AS | SA, A → S | b, B → b
After removing A → S, the production set becomes −
S0 → ASA | aB | a | AS | SA, S → ASA | aB | a | AS | SA, A → b | ASA | aB | a | AS | SA, B → b

Page 26: Theory of Automata and formal languages Unit 3

(4) Now we find the productions with more than two variables on the R.H.S.
Here S0 → ASA, S → ASA and A → ASA exceed the limit of two non-terminals on the R.H.S. Introducing a new non-terminal X with X → SA, the production set becomes −

S0 → AX | aB | a | AS | SA
S → AX | aB | a | AS | SA
A → b | AX | aB | a | AS | SA
B → b
X → SA

Page 27: Theory of Automata and formal languages Unit 3

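(5) We have to change the productions S0 → aB, S → aB and A → aB, whose R.H.S. mixes a terminal with a non-terminal. Introducing a new non-terminal Y with Y → a, the final production set, which is in CNF, becomes −

S0 → AX | YB | a | AS | SA
S → AX | YB | a | AS | SA
A → b | AX | YB | a | AS | SA
B → b
X → SA
Y → a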
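A short check can confirm that a production set has the CNF shape. The sketch below is an illustration only: the helper names `parse` and `is_cnf` and the text format, with symbols separated by spaces and "->" for the arrow, are assumptions made so that the two-character name S0 can be handled. It accepts the final production set and rejects the set from step (4), which still contains aB.

```python
def parse(text):
    """Read lines of the form 'A -> B C | a' into {head: [[symbols], ...]}."""
    grammar = {}
    for line in text.strip().splitlines():
        head, rhs = line.split("->")
        grammar[head.strip()] = [alt.split() for alt in rhs.split("|")]
    return grammar


def is_cnf(productions):
    """True if every body is one terminal or exactly two non-terminals."""
    variables = set(productions)
    for bodies in productions.values():
        for body in bodies:
            if len(body) == 1 and body[0] not in variables:
                continue                              # A -> a
            if len(body) == 2 and all(s in variables for s in body):
                continue                              # A -> B C
            return False
    return True


FINAL_SET = parse("""
S0 -> A X | Y B | a | A S | S A
S  -> A X | Y B | a | A S | S A
A  -> b | A X | Y B | a | A S | S A
B  -> b
X  -> S A
Y  -> a
""")

STEP4_SET = parse("""
S0 -> A X | a B | a | A S | S A
S  -> A X | a B | a | A S | S A
A  -> b | A X | a B | a | A S | S A
B  -> b
X  -> S A
""")
```

The check only looks at the form of each production; it does not verify that the converted grammar generates the same language as the original.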

Page 28: Theory of Automata and formal languages Unit 3

Greibach Normal Form

A CFG is in Greibach Normal Form (GNF) if its productions are of the following forms −

A → b
A → bD1…Dn
S → ε

where A, D1, ..., Dn are non-terminals and b is a terminal.

Page 29: Theory of Automata and formal languages Unit 3

Algorithm to Convert a CFG into Greibach Normal Form

(i) If the start symbol S occurs on some right side, create a new start symbol S’ and a new production S’ → S

(ii) Remove Null productions. (Using the Null production removal algorithm discussed earlier).

(iii) Remove unit productions. (Using the Unit production removal algorithm discussed earlier).

(iv) Remove all direct and indirect left-recursion.

(v) Do proper substitutions of productions to convert it into the proper form of GNF