Simplification of CFG and Normal Forms

Wen-Guey Tzeng

Computer Science Department

National Chiao Tung University

Normal Forms

• We want a CFG in either Chomsky or Greibach normal form

– Chomsky normal form

• A → a, A → BC

– Greibach normal form

• A → ax, x ∈ V*

2016 Spring

• CFGs in normal form are easier to parse

– The membership problem

– Given a grammar G and a string w, find a parse tree for w if one exists.

w = x+y*z

• ε-free languages

– A language that does not contain ε, the empty string

• We consider CFG G such that L(G) is ε-free

• For any CFG G, there is G’ such that L(G’) = L(G) − {ε}

Transformation to normal forms: steps

CFG G = (V, T, P, S) (ε-free context-free language)

Remove (1) ε-productions, (2) unit-productions, (3) useless productions from P to get G’

Convert G’ to normal forms

A substitution rule

• For A ≠ B, the productions

A → x1Bx2, B → y1|y2|…|yn

are equivalent to

A → x1y1x2|x1y2x2|…|x1ynx2, B → y1|y2|…|yn

• Example

– A → a|aaA|abBc, B → abbA|b is equivalent to A → a|aaA|ababbAc|abbc, B → abbA|b
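The substitution rule can be tried out mechanically. The following is a minimal Python sketch, not part of the original slides. It assumes a grammar is stored as a dict from a variable to a list of right-hand-side strings with one character per symbol; the helper name substitute is hypothetical. It expands only the first occurrence of B in each A-rule, which matches the shape A → x1Bx2 used above.

```python
def substitute(productions, a, b):
    """Return a copy of `productions` with B expanded inside the rules of A."""
    assert a != b, "the rule requires A and B to be distinct variables"
    new_bodies = []
    for body in productions[a]:
        x1, found, x2 = body.partition(b)  # split at the first occurrence of B
        if found:
            # A -> x1 B x2 becomes A -> x1 y x2 for every rule B -> y.
            new_bodies.extend(x1 + y + x2 for y in productions[b])
        else:
            new_bodies.append(body)        # rule does not mention B: keep it
    result = dict(productions)
    result[a] = new_bodies
    return result


# The example from the slide.
grammar = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
substituted = substitute(grammar, "A", "B")
print(substituted)
```

The rules for B are left in place; they may become useless afterwards and can be removed by the later clean-up steps.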

Remove ε-productions

• ε-production: A → ε

• Nullable variable A: A ⇒* ε

• Steps

1. Find the nullable variable set VN

2. For each A → x1x2…xm, xi ∈ V ∪ T,

• For each combination xi, xj, …, xk of variables in VN

add A → x1 … xi-1 xi+1 … xj-1 xj+1 … xk-1 xk+1 … xm

• Note: don’t add A → ε, if all xi are in VN

3. Remove every ε-production A → ε

Example

• S → ABaC, A → BC, B → b|ε, C → D|ε, D → d

• Nullable set VN={A, B, C}

• Add productions
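The productions for this example can be computed rather than listed by hand. Below is a minimal Python sketch under stated assumptions, not the slides' own code. A grammar is a dict from a variable to a list of right-hand-side strings, one character per symbol, uppercase letters are variables, and the empty string "" stands for the empty right-hand side. The function names are hypothetical.

```python
from itertools import combinations


def nullable_variables(productions):
    """Variables that derive the empty string, found by fixed-point iteration."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in nullable:
                continue
            # An empty body, or a body made only of nullable variables, qualifies.
            if any(all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


def remove_epsilon_productions(productions):
    """Add the shortened rules and leave out every empty right-hand side."""
    nullable = nullable_variables(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            positions = [i for i, s in enumerate(body) if s in nullable]
            # Drop every subset of the nullable occurrences, including none.
            for r in range(len(positions) + 1):
                for dropped in combinations(positions, r):
                    candidate = "".join(
                        s for i, s in enumerate(body) if i not in dropped
                    )
                    if candidate:  # never add a rule with an empty body
                        new_bodies.add(candidate)
        result[head] = sorted(new_bodies)
    return result


grammar = {"S": ["ABaC"], "A": ["BC"], "B": ["b", ""], "C": ["D", ""], "D": ["d"]}
nullable = nullable_variables(grammar)
cleaned = remove_epsilon_productions(grammar)
for head, bodies in cleaned.items():
    print(head, "->", " | ".join(bodies))
```

For the grammar above, S keeps its original rule and gains one rule for each way of deleting some of A, B and C from ABaC, which gives eight rules for S in total.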

Remove unit-productions

• Unit-production: A → B

• Steps

– Remove A → A immediately

– Draw the dependency graph for variables A and B with A ⇒* B

– For A ⇒* B and non-unit productions B → y1|y2|…|yn

• Add A → y1|y2|…|yn

– Remove all A → B, where A and B are in the dependency graph

Example

• S → Aa|B, B → A|bb, A → a|bc|B

• Draw dependency graph

1. Remove unit-productions: S → Aa, B → bb, A → a|bc

2. Add: S → bb|a|bc, A → bb, B → a|bc

3. Finally: S → a|bc|bb|Aa, A → a|bc|bb, B → a|bc|bb
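The three steps of this example can be reproduced with a short program. This is a minimal Python sketch, not taken from the slides. It assumes a grammar stored as a dict from a variable to a list of right-hand-side strings, one character per symbol, with uppercase letters as variables. The dependency graph is explored with a plain depth-first search, and the function name is hypothetical.

```python
def remove_unit_productions(productions):
    """Replace unit-productions A -> B by the non-unit rules of B."""

    def is_unit(body):
        return len(body) == 1 and body.isupper()  # a single variable

    # reach[A] = variables B with A =>* B using unit-productions only.
    reach = {}
    for start in productions:
        seen, stack = set(), [start]
        while stack:
            var = stack.pop()
            for body in productions.get(var, []):
                if is_unit(body) and body not in seen:
                    seen.add(body)
                    stack.append(body)
        reach[start] = seen

    result = {}
    for head, bodies in productions.items():
        kept = {b for b in bodies if not is_unit(b)}
        for other in reach[head]:
            # Copy the non-unit rules of every variable reachable from head.
            kept.update(b for b in productions.get(other, []) if not is_unit(b))
        result[head] = sorted(kept)
    return result


grammar = {"S": ["Aa", "B"], "B": ["A", "bb"], "A": ["a", "bc", "B"]}
without_units = remove_unit_productions(grammar)
for head, bodies in without_units.items():
    print(head, "->", " | ".join(bodies))
```

A rule A → A needs no special handling here, because a variable's own non-unit rules are already kept and duplicates are merged by the set.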

Remove useless productions

• A variable A ∈ V is useful if S can generate some terminal string through it.

– That is, S ⇒* xAy ⇒* w, w ∈ T*

• Example

– S → aSb|AB|Ba, A → aA, B → b|Bb, C → cB|c

– S ⇒ Ba ⇒ ba. Thus, B is useful.

– S is useful.

– But, A and C are not useful (useless)

• Two cases for useless variables

– Case 1: variables that cannot generate strings in T*

• S → aSb|AB|Ba, A → aA, B → b|Bb, C → cB|c

• Algorithm (finding variables that generate strings)

1. V1 = {}

2. For each rule A → x with x ∈ (T ∪ V1)*, add A to V1

3. Repeat 2 until no more variables can be added to V1

• V1 = {S, B, C}

• S → aSb|Ba, B → b|Bb, C → cB|c
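The Case 1 algorithm is a fixed-point computation and can be sketched in a few lines of Python. This is an illustration under assumptions, not the slides' code. A grammar is a dict from a variable to a list of right-hand-side strings, one character per symbol, with uppercase letters as variables and everything else as terminals. Both function names are hypothetical.

```python
def generating_variables(productions):
    """Compute V1: the variables that can derive some terminal string."""
    v1 = set()
    changed = True
    while changed:                      # step 3: repeat until nothing is added
        changed = False
        for head, bodies in productions.items():
            if head in v1:
                continue
            # Step 2: some body uses only terminals and variables already in V1.
            if any(all(not s.isupper() or s in v1 for s in body) for body in bodies):
                v1.add(head)
                changed = True
    return v1


def keep_generating(productions):
    """Drop every rule that mentions a variable outside V1."""
    v1 = generating_variables(productions)
    return {
        head: [b for b in bodies if all(not s.isupper() or s in v1 for s in b)]
        for head, bodies in productions.items()
        if head in v1
    }


grammar = {"S": ["aSb", "AB", "Ba"], "A": ["aA"], "B": ["b", "Bb"], "C": ["cB", "c"]}
v1 = generating_variables(grammar)
case1 = keep_generating(grammar)
print(sorted(v1))
print(case1)
```

A never enters V1 because its only rule A → aA refers back to A, so the rule S → AB is dropped along with the rules of A.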

– Case 2: variables that cannot be reached from S

• S → aSb|Ba, B → b|Bb, C → cB|c

• Algorithm: dependency graph

(Figure: dependency graph on the variables S, B, C)

• C is unreachable from S.

• S → aSb|Ba, B → b|Bb
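The reachability test of Case 2 amounts to a graph search from S over the dependency graph. The following is a minimal Python sketch under assumptions, not the slides' code. A grammar is a dict from a variable to a list of right-hand-side strings, one character per symbol, with uppercase letters as variables. The function name is hypothetical.

```python
def reachable_variables(productions, start="S"):
    """Variables that occur in some sentential form derivable from `start`."""
    reached, stack = {start}, [start]
    while stack:
        var = stack.pop()
        for body in productions.get(var, []):
            for symbol in body:
                # An edge var -> symbol exists when symbol occurs in a rule of var.
                if symbol.isupper() and symbol not in reached:
                    reached.add(symbol)
                    stack.append(symbol)
    return reached


grammar = {"S": ["aSb", "Ba"], "B": ["b", "Bb"], "C": ["cB", "c"]}
reached = reachable_variables(grammar)
case2 = {head: bodies for head, bodies in grammar.items() if head in reached}
print(sorted(reached))
print(case2)
```

C has a rule that mentions B, but no rule reachable from S mentions C, so C and its rules are removed.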

• Algorithm (removing useless productions)

Input: G=(V, T, P, S)

1. Find the useless variables in Case 1 and remove related useless productions.

2. Find the useless (un-reachable) variables in Case 2 and remove the related useless productions

Chomsky normal form

• A CFG is in Chomsky normal form (CNF) if all productions are of the form

A → BC, or A → a

• Example

– S → AS|a, A → SA|b

• Every CFG G with ε ∉ L(G) has an equivalent CNF grammar.

Converting into CNF

1. Apply the rules for removing ε-productions, unit-productions, and useless productions

2. Convert the productions into the form A → C1C2…Cn, or A → a

3. Convert A → C1C2…Cn into A → C1D1, D1 → C2D2, …, Dn-2 → Cn-1Cn

Example

• S → ABa, A → aab, B → Ac

• Step 2:

• Step 3:
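Steps 2 and 3 can be worked out for this example with a small program. The sketch below is an illustration under assumptions, not the slides' own solution. Productions are (head, body) pairs whose body is a tuple of symbols, lowercase symbols are terminals, and step 1 is assumed to be done already. The fresh variable names T_a, T_b, … and D1, D2, … are made up here and are assumed not to clash with existing variables.

```python
def to_cnf(productions):
    """Return (after_step_2, after_step_3) as lists of (head, body) pairs."""
    step2, terminal_var = [], {}
    for head, body in productions:
        if len(body) == 1:
            step2.append((head, body))      # already of the form A -> a
            continue
        new_body = []
        for s in body:
            if s.islower():
                # Step 2: replace terminal a by a fresh variable T_a.
                terminal_var.setdefault(s, "T_" + s)
                new_body.append(terminal_var[s])
            else:
                new_body.append(s)
        step2.append((head, tuple(new_body)))
    # Each fresh variable gets its single rule T_a -> a.
    step2.extend((var, (t,)) for t, var in terminal_var.items())

    step3, counter = [], 0
    for head, body in step2:
        # Step 3: A -> C1 C2 ... Cn becomes A -> C1 D1, D1 -> C2 D2, ...
        while len(body) > 2:
            counter += 1
            fresh = "D" + str(counter)
            step3.append((head, (body[0], fresh)))
            head, body = fresh, body[1:]
        step3.append((head, body))
    return step2, step3


grammar = [("S", ("A", "B", "a")), ("A", ("a", "a", "b")), ("B", ("A", "c"))]
step2, step3 = to_cnf(grammar)
for label, rules in (("Step 2", step2), ("Step 3", step3)):
    print(label)
    for head, body in rules:
        print("  ", head, "->", " ".join(body))
```

With these naming choices, step 2 turns S → ABa into S → A B T_a, and step 3 splits it into S → A D1 and D1 → B T_a. Any other consistent choice of fresh names gives an equivalent grammar.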

Greibach normal form

• A CFG is in Greibach normal form (GNF) if all productions are of the form

A → aB1B2…Bn, n ≥ 0

• Example

– S → aBC, B → aBA, A → a|bBSC

• Every CFG G with ε ∉ L(G) has an equivalent GNF grammar.

Example

• Example

– S → AB, A → aA|bB|b, B → b

– Result

• S → aAB|bBB|bB, A → aA|bB|b, B → b

• Example

– S → abSb|aa

– Result

• S → aBSB|aA, B → b, A → a
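Whether a grammar has the GNF shape is easy to check mechanically. The sketch below is a small Python illustration, not part of the slides. It assumes a grammar stored as a dict from a variable to a list of right-hand-side strings, one character per symbol, with uppercase variables and lowercase terminals. The function name is hypothetical. It checks only the form of the rules, not that the two grammars generate the same language.

```python
def is_gnf(productions):
    """True if every body is one terminal followed by zero or more variables."""
    return all(
        len(body) >= 1
        and body[0].islower()
        and all(s.isupper() for s in body[1:])
        for bodies in productions.values()
        for body in bodies
    )


# The two examples from the slide, before and after conversion.
before_1 = {"S": ["AB"], "A": ["aA", "bB", "b"], "B": ["b"]}
after_1 = {"S": ["aAB", "bBB", "bB"], "A": ["aA", "bB", "b"], "B": ["b"]}
before_2 = {"S": ["abSb", "aa"]}
after_2 = {"S": ["aBSB", "aA"], "B": ["b"], "A": ["a"]}

for name, g in [("before_1", before_1), ("after_1", after_1),
                ("before_2", before_2), ("after_2", after_2)]:
    print(name, is_gnf(g))
```

In the first example only S → AB violates the form, and substituting the rules of A into it is enough. In the second, the terminals after the first position are replaced by the new variables A and B.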

Parsing (membership)

• Question: Given a CFG G in Chomsky normal form and a string w, determine whether w ∈ L(G)

• Idea: the dynamic programming technique

– A large problem is decomposed into smaller problems

– Combine solutions to smaller problems into a solution for the large problem

• Assume w=a1a2…an

• Use the dynamic programming technique

– Vij = {A ∈ V : A ⇒* aiai+1…aj}: variables that can generate substring aiai+1…aj

• Solve smaller problems Vik, Vk+1,j, for k = i, i+1, …, j-1

• Combine them to compute Vij

– Vij = {A : A → BC, B ∈ Vik, C ∈ Vk+1,j, i ≤ k < j}

w = a1 a2 a3 … ai ai+1 … aj-1 aj … an

Vij contains the variables that generate aiai+1…aj-1aj

(Figure: the substring ai … aj is split after position k into ai … ak, covered by Vi,k, and ak+1 … aj, covered by Vk+1,j; the next split uses Vi,k+1 and Vk+2,j, and so on.)

• Triangular table (n=5)

V1,5

V1,4 V2,5

V1,3 V2,4 V3,5

V1,2 V2,3 V3,4 V4,5

V1,1 V2,2 V3,3 V4,4 V5,5

CYK Algorithm

• Input: G = (V, T, P, S) in CNF and w = a1a2…an

– Compute Vij = {A ∈ V : A ⇒* aiai+1…aj} in the order

• V11, V22, …, Vnn

• V12, V23, …, Vn-1,n

• …

• V1n

1. Smallest problem: add A to Vii

• if A → ai is a production in P

2. Bigger problem: add A to Vij if

• for some k, i ≤ k ≤ j-1, A → BC is in P, B is in Vik, and C is in Vk+1,j

3. w ∈ L(G) if and only if S ∈ V1n

Example

• S → AB, A → BB|a, B → AB|b

• w=aabbb

• Steps

– V11={A}, V22={A}, V33={B}, V44={B}, V55={B}

– V12=∅, V23={S, B}, V34={A}, V45={A}

– V13={S, B}, V24={A}, V35={S, B}

– V14={A}, V25={S, B}

– V15={S, B}
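The table in this example can be recomputed with a direct implementation of the three CYK steps. The following is a minimal Python sketch under assumptions, not the slides' code. The CNF grammar is a dict from a variable to a list of right-hand-side strings, one character per symbol, and the table is a dict keyed by the 1-based pair (i, j). Both function names are hypothetical.

```python
def cyk_table(productions, w):
    """Return V with V[i, j] = variables deriving w[i..j] (1-based, inclusive)."""
    n = len(w)
    V = {}
    # Step 1, smallest problems: A is in Vii if A -> ai is a production.
    for i in range(1, n + 1):
        V[i, i] = {A for A, bodies in productions.items() if w[i - 1] in bodies}
    # Step 2, bigger problems, by increasing substring length.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            V[i, j] = set()
            for k in range(i, j):                      # i <= k <= j-1
                for A, bodies in productions.items():
                    for body in bodies:
                        if (len(body) == 2
                                and body[0] in V[i, k]
                                and body[1] in V[k + 1, j]):
                            V[i, j].add(A)
    return V


def accepts(productions, start, w):
    """Step 3: w is in L(G) if and only if the start symbol is in V1n."""
    if not w:
        return False  # the grammar is assumed to be free of empty right-hand sides
    return start in cyk_table(productions, w)[1, len(w)]


grammar = {"S": ["AB"], "A": ["BB", "a"], "B": ["AB", "b"]}
table = cyk_table(grammar, "aabbb")
for length in range(1, 6):
    row = [sorted(table[i, i + length - 1]) for i in range(1, 7 - length)]
    print(row)
print(accepts(grammar, "S", "aabbb"))
```

Since S is in V15, the string aabbb belongs to L(G). The three nested loops over i, j and k give a running time cubic in the length of w for a fixed grammar.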

Sum up

• Context-free grammars are used in designing programming languages, such as C, PASCAL, etc.

• Membership problem in CFG is equivalent to the parsing problem in programming languages

• Normal forms are needed for “automatically” generating a “parser” for the programming language
