Languages That Are and Are Not Context-Free

Languages That Are and Are Not Context-Free Section 3.5 Wed, Oct 26, 2005


Transcript of Languages That Are and Are Not Context-Free

Page 1: Languages That Are and Are Not Context-Free

Languages That Are and Are Not Context-Free

Section 3.5

Wed, Oct 26, 2005

Page 2: Languages That Are and Are Not Context-Free

Regular vs. Context-Free

Theorem: Every regular language is context-free.

Proof: Let L be regular. Given a DFA for L, add a stack, but do not use the stack. That is, change each DFA transition (p, a, q) to a PDA transition ((p, a, e), (q, e)). The result is a PDA whose language is L. Therefore, L is context-free.
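The construction in this proof can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding of the DFA, the helper names `dfa_to_pda` and `pda_accepts`, and the even-number-of-a's automaton are assumptions made for the example, not part of the slides.

```python
E = ""  # the empty string, written e in the slides

def dfa_to_pda(dfa_delta):
    """Turn each DFA transition (p, a, q) into the PDA transition ((p, a, e), (q, e))."""
    return [((p, a, E), (q, E)) for (p, a), q in sorted(dfa_delta.items())]

def pda_accepts(transitions, start, finals, word):
    """Simulate a PDA whose transitions each read exactly one input symbol."""
    configs = {(start, "")}  # (state, stack) pairs; the stack top is at the left
    for symbol in word:
        next_configs = set()
        for state, stack in configs:
            for (p, a, pop), (q, push) in transitions:
                if p == state and a == symbol and stack.startswith(pop):
                    next_configs.add((q, push + stack[len(pop):]))
        configs = next_configs
    # Accept in a final state with an empty stack.
    return any(state in finals and stack == "" for state, stack in configs)

# Hypothetical DFA over {a, b}: accepts words with an even number of a's.
even_a = {("even", "a"): "odd", ("even", "b"): "even",
          ("odd", "a"): "even", ("odd", "b"): "odd"}
pda = dfa_to_pda(even_a)

print(pda_accepts(pda, "even", {"even"}, "abab"))
print(pda_accepts(pda, "even", {"even"}, "ab"))
```

Every transition pops and pushes the empty string, so the stack stays empty throughout and the PDA behaves exactly like the DFA.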

Page 3: Languages That Are and Are Not Context-Free

Closure under Union

Theorem: Let L1 and L2 be CFLs. Then L1 ∪ L2 is also a CFL.

Proof: Let L1 have grammar (V1, Σ1, R1, S1) and let L2 have grammar (V2, Σ2, R2, S2). Then L1 ∪ L2 has the grammar (V, Σ, R, S) where

Σ = Σ1 ∪ Σ2

V = V1 ∪ V2 ∪ {S}

S is the new start symbol

R = R1 ∪ R2 ∪ {S → S1 | S2}.

Page 4: Languages That Are and Are Not Context-Free

Proof, continued

Therefore, L1 ∪ L2 is a CFL.

We must assume in the proof that

(V1 – Σ1) ∩ (V2 – Σ2) = ∅.

Why?
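As a rough sketch of the union construction, the following Python builds the combined rule set using the rule S → S1 | S2, which is what a union requires. The encoding (a dictionary from each nonterminal to a list of right-hand sides, with `()` for e) and the function name `union_grammar` are assumptions of this example.

```python
def union_grammar(rules1, start1, rules2, start2, new_start="S"):
    """Build rules for L1 ∪ L2; the dictionary keys are the nonterminals."""
    heads = set(rules1) | set(rules2)
    clash = (set(rules1) & set(rules2)) | ({new_start} & heads)
    if clash:
        # The proof assumes the two nonterminal sets are disjoint.
        raise ValueError(f"nonterminals must be disjoint: {sorted(clash)}")
    rules = {**rules1, **rules2}
    rules[new_start] = [(start1,), (start2,)]  # S -> S1 | S2
    return rules, new_start

# Hypothetical grammars: S1 -> a S1 b | e   and   S2 -> c S2 | c
g1 = {"S1": [("a", "S1", "b"), ()]}
g2 = {"S2": [("c", "S2"), ("c",)]}

rules, start = union_grammar(g1, "S1", g2, "S2")
for head, bodies in rules.items():
    print(head, "->", " | ".join(" ".join(body) or "e" for body in bodies))
```

The function refuses to merge grammars that share a nonterminal name rather than renaming them silently; renaming one grammar first is the usual way to meet the disjointness assumption.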

Page 5: Languages That Are and Are Not Context-Free

Closure under Concatenation

Theorem: Let L1 and L2 be CFLs. Then L1L2 is also a CFL.

Proof: Let L1 have grammar (V1, Σ1, R1, S1) and let L2 have grammar (V2, Σ2, R2, S2). Then L1L2 has the grammar (V, Σ, R, S) where

Σ = Σ1 ∪ Σ2

V = V1 ∪ V2 ∪ {S}

S is the new start symbol

R = R1 ∪ R2 ∪ {S → S1S2}.

Page 6: Languages That Are and Are Not Context-Free

Proof, continued

Therefore, L1L2 is a CFL.

Again, we must assume that

(V1 – Σ1) ∩ (V2 – Σ2) = ∅.
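The disjointness assumption can be seen on a tiny example. The sketch below concatenates two one-word languages twice: once with a shared nonterminal name A, once after renaming. The grammar encoding, the helper names, and the bounded leftmost search in `generate` are all assumptions of this illustration; the search is not a general parser.

```python
def merge_rules(rules1, rules2):
    """Union of two rule sets; rules for a shared nonterminal get pooled."""
    merged = {}
    for rules in (rules1, rules2):
        for head, bodies in rules.items():
            merged.setdefault(head, []).extend(bodies)
    return merged

def concat_grammar(rules1, start1, rules2, start2, new_start="S"):
    rules = merge_rules(rules1, rules2)
    rules[new_start] = [(start1, start2)]  # S -> S1 S2
    return rules, new_start

def generate(rules, start, max_len):
    """Terminal strings of length <= max_len found by a bounded leftmost search."""
    found, seen, frontier = set(), {(start,)}, [(start,)]
    while frontier:
        form = frontier.pop()
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:  # no nonterminal left: the form is a word
            found.add("".join(form))
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for s in new if s not in rules)
            # Prune forms that are too long (the +2 slack is enough for these grammars).
            if terminals <= max_len and len(new) <= max_len + 2 and new not in seen:
                seen.add(new)
                frontier.append(new)
    return found

g1 = {"S1": [("A",)], "A": [("a",)]}        # L1 = {a}
g2_clash = {"S2": [("A",)], "A": [("b",)]}  # L2 = {b}, but reuses the name A
g2_fresh = {"S2": [("B",)], "B": [("b",)]}  # same language with A renamed to B

clash_words = generate(*concat_grammar(g1, "S1", g2_clash, "S2"), max_len=2)
fresh_words = generate(*concat_grammar(g1, "S1", g2_fresh, "S2"), max_len=2)
print(sorted(clash_words), sorted(fresh_words))
```

With the shared name, the pooled rules A → a | b let either grammar's A derive the other's terminal, so the combined grammar generates words outside L1L2 = {ab}.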

Page 7: Languages That Are and Are Not Context-Free

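Closure under Kleene Star

Theorem: Let L be a CFL. Then L* is also a CFL.

Proof: Let L have grammar (V, Σ, R, S). Then L* has the grammar (V ∪ {S'}, Σ, R', S') where S' is a new start symbol and

R' = R ∪ {S' → e | S S'}.

A new start symbol is needed: if the rules S → e | SS were added to S itself and S occurred on the right side of some rule in R, the new rules could be applied in the middle of a derivation of a single word of L.

Therefore, L* is a CFL.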
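The following sketch compares two ways of building a grammar for L*: adding S → e | SS to the original start symbol, and introducing a fresh start symbol S' with S' → e | S S'. The base grammar S → aSb | ab, the encoding, and the bounded search in `generate` are assumptions chosen for the illustration.

```python
def star_fresh(rules, start, new_start="S'"):
    """L* with a fresh start symbol: S' -> e | S S'."""
    new_rules = dict(rules)
    new_rules[new_start] = [(), (start, new_start)]
    return new_rules, new_start

def star_reuse(rules, start):
    """L* attempted by adding S -> e | S S to the existing start symbol."""
    new_rules = dict(rules)
    new_rules[start] = list(rules[start]) + [(), (start, start)]
    return new_rules, start

def generate(rules, start, max_len):
    """Terminal strings of length <= max_len found by a bounded leftmost search."""
    found, seen, frontier = set(), {(start,)}, [(start,)]
    while frontier:
        form = frontier.pop()
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:  # no nonterminal left: the form is a word
            found.add("".join(form))
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            terminals = sum(1 for s in new if s not in rules)
            # Prune forms that are too long (the +2 slack is enough for these grammars).
            if terminals <= max_len and len(new) <= max_len + 2 and new not in seen:
                seen.add(new)
                frontier.append(new)
    return found

base = {"S": [("a", "S", "b"), ("a", "b")]}  # L = {a^n b^n | n >= 1}
fresh = generate(*star_fresh(base, "S"), max_len=6)
reuse = generate(*star_reuse(base, "S"), max_len=6)
print(sorted(fresh))
print("aababb" in reuse, "aababb" in fresh)
```

The word aababb is not a concatenation of blocks a^n b^n, so it is not in L*, yet the reuse variant derives it via S ⇒ aSb ⇒ aSSb. This happens because S appears on the right side of S → aSb.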

Page 8: Languages That Are and Are Not Context-Free

Intersection of a Regular Language and a CFL

Theorem: The intersection of a CFL and a regular language is a CFL.

Proof (outline): Use the cross product to construct the intersection of the PDA and the DFA. Only one component uses the stack. Therefore, there is no complication. The cross product will function as a PDA.

Page 9: Languages That Are and Are Not Context-Free

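Intersection of a Regular Language and a CFL

More specifically, the transition (p, a) → q from the DFA and the transition (p', a, β) → (q', γ) from the PDA may be combined into

((p, p'), a, β) → ((q, q'), γ)

for the new PDA. A PDA transition (p', e, β) → (q', γ) that reads no input is paired with every DFA state p, giving ((p, p'), e, β) → ((p, q'), γ), since the DFA does not move.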
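A minimal sketch of the cross product, assuming a PDA given as a list of transitions ((state, input, pop), (state, push)) and a DFA given as a dictionary. The example automata (a PDA for a^n b^n and a DFA counting a's modulo 2) and the function names are hypothetical. The small simulator only supports PDAs without e-moves, which is all this example needs.

```python
E = ""  # the empty string, written e in the slides

def product_pda(pda_transitions, dfa_delta, dfa_states):
    """Pair each PDA transition with every compatible DFA move."""
    combined = []
    for (p2, a, pop), (q2, push) in pda_transitions:
        for p1 in dfa_states:
            if a == E:
                # The PDA reads nothing, so the DFA component stays put.
                combined.append((((p1, p2), E, pop), ((p1, q2), push)))
            elif (p1, a) in dfa_delta:
                q1 = dfa_delta[(p1, a)]
                combined.append((((p1, p2), a, pop), ((q1, q2), push)))
    return combined

def accepts(transitions, start, finals, word):
    """Simulate a PDA with no e-moves; accept in a final state with empty stack."""
    assert all(a != E for (_p, a, _pop), _target in transitions)
    configs = {(start, "")}  # (state, stack) pairs; the stack top is at the left
    for symbol in word:
        configs = {(q, push + stack[len(pop):])
                   for state, stack in configs
                   for (p, a, pop), (q, push) in transitions
                   if p == state and a == symbol and stack.startswith(pop)}
    return any(state in finals and stack == "" for state, stack in configs)

# Hypothetical PDA for {a^n b^n | n >= 0}: push A per a, pop A per b.
pda = [(("p", "a", E), ("p", "A")),
       (("p", "b", "A"), ("q", E)),
       (("q", "b", "A"), ("q", E))]
# Hypothetical DFA: state 0 means an even number of a's so far.
dfa = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}

both = product_pda(pda, dfa, dfa_states=[0, 1])
finals = {(0, "p"), (0, "q")}  # DFA final state 0 paired with PDA finals p, q
print(accepts(both, (0, "p"), finals, "aabb"), accepts(both, (0, "p"), finals, "ab"))
```

The product accepts a^n b^n only when n is even, which is the intersection of the two languages. Only the PDA component ever touches the stack.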

Page 10: Languages That Are and Are Not Context-Free

Complementation and Intersection

The complement of a context-free language is not necessarily context-free.

The intersection of two context-free languages is not necessarily context-free.

Counterexamples will be given later.

Page 11: Languages That Are and Are Not Context-Free

The Concept behind the Pumping Lemma for CFLs

The Pumping Lemma for CFLs will allow us to show that some languages are not context-free.

If a CFL contains a word w with a sufficiently long derivation S ⇒* w, then some nonterminal A must appear more than once on a single path of the parse tree.

This is the Pigeonhole Principle.

Page 12: Languages That Are and Are Not Context-Free

The Concept behind the Pumping Lemma for CFLs

That is, we have

S ⇒* uAz ⇒* uvAyz ⇒* uvxyz.

Thus, A ⇒* vAy and A ⇒* x.

We may repeat the derivation A ⇒* vAy as many times as we like (including zero times), producing strings uv^n x y^n z, for any n ≥ 0.
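A small illustration of pumping, assuming the grammar S → aSb | e for {a^n b^n | n ≥ 0} with A = S. The derivation S ⇒ aSb ⇒ aaSbb ⇒ aabb gives u = a, v = a, x = e, y = b, z = b. The function names are hypothetical.

```python
def pump(u, v, x, y, z, k):
    """Return u v^k x y^k z."""
    return u + v * k + x + y * k + z

def in_anbn(s):
    """Membership test for {a^n b^n | n >= 0}."""
    half = len(s) // 2
    return s == "a" * half + "b" * half

u, v, x, y, z = "a", "a", "", "b", "b"
pumped = [pump(u, v, x, y, z, k) for k in range(4)]
print(pumped)
print(all(in_anbn(s) for s in pumped))
```

Taking k = 0 drops the repeated part of the derivation entirely, and k = 1 gives back the original word aabb.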

Page 13: Languages That Are and Are Not Context-Free

The Length of a Path in a Parse Tree

In a parse tree T, define a path to be either empty, or a sequence of nodes starting at a node in the tree, ending at one of its descendants, and including every node along the way.

The length of a path is 0 if the path is empty, and otherwise 1 less than the number of nodes in the path.

Page 14: Languages That Are and Are Not Context-Free

Height and Fanout

The height of a parse tree is the length of the tree’s longest path.

Given a grammar G, the fanout of G, denoted φ(G), is the largest number of symbols on the right side of any rule in G.

Page 15: Languages That Are and Are Not Context-Free

A Lemma for the Lemma

Lemma: Let G be a CFG. The yield of any parse tree of G of height h has length no greater than φ(G)^h.

Proof: The longest possible string is obtained if we always use a grammar rule with the maximum number of symbols on the right-hand side.

Therefore, if we apply grammar rules to each nonterminal in the string at most h times, then the length of the resulting string is at most φ(G)^h.
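The bound in the lemma can be checked numerically on a small grammar. The sketch below assumes the hypothetical grammar S → SS | a | e (fanout 2) and computes the longest yield over all parse trees of height at most h; the encoding and the names `fanout` and `max_yield` are assumptions of the example.

```python
def fanout(rules):
    """Largest number of symbols on the right side of any rule."""
    return max(len(rhs) for bodies in rules.values() for rhs in bodies)

def max_yield(rules, symbol, height):
    """Longest yield of a parse tree rooted at symbol with height <= height.

    Returns None when no complete parse tree fits in that height.
    """
    if symbol not in rules:
        return 1  # a terminal leaf contributes one symbol
    if height == 0:
        return None  # a nonterminal needs at least one rule application
    best = None
    for rhs in rules[symbol]:
        parts = [max_yield(rules, s, height - 1) for s in rhs]
        if None not in parts:
            total = sum(parts)  # an e-rule has no parts and contributes 0
            best = total if best is None else max(best, total)
    return best

rules = {"S": [("S", "S"), ("a",), ()]}
phi = fanout(rules)
for h in range(1, 7):
    print(h, max_yield(rules, "S", h), phi ** h)
```

For this grammar the longest yield at height h is 2^(h-1), because the last level of rule applications only replaces each S by a single a. The lemma's bound φ(G)^h is therefore not tight here, but it is never exceeded.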

Page 16: Languages That Are and Are Not Context-Free

The Pumping Lemma for CFLs

The Pumping Lemma for CFLs: Let G = (V, Σ, R, S) be a context-free grammar. Then any string w ∈ L(G) with length at least n = φ(G)^(|V – Σ| + 1) can be written as w = uvxyz for some strings u, v, x, y, z ∈ Σ* such that

|v| > 0 or |y| > 0,

|vxy| ≤ n, and

uv^k x y^k z ∈ L(G) for every k ≥ 0.

Page 17: Languages That Are and Are Not Context-Free

The Pumping Lemma for CFLs

Proof: Let n = φ(G)^(|V – Σ| + 1). Let w ∈ L(G) with |w| ≥ n.

Let T be a parse tree for w that uses the smallest number of leaves possible (that is, minimize the number of empty strings).

Let P be a path of maximum length in T.

Since |w| > φ(G)^|V – Σ| (taking φ(G) ≥ 2), the length of P is greater than |V – Σ|, i.e., the length of P is at least |V – Σ| + 1. (Lemma)

Therefore, the number of nodes on P is at least |V – Σ| + 2.

Page 18: Languages That Are and Are Not Context-Free

The Pumping Lemma for CFLs

Let P' be the last part of P consisting of exactly |V – Σ| + 2 nodes.

P' must contain exactly |V – Σ| + 1 nonterminals, since every node on it except the final leaf is a nonterminal. Therefore, at least one nonterminal must be repeated.

Let A be the first nonterminal that is repeated as we follow the path from the leaf back towards the root.

Let T' be the subtree with root at the second-to-last occurrence of A on the path P.

If we remove T' from T, except for its root A, the result is a parse tree for a string uAz.

Page 19: Languages That Are and Are Not Context-Free

The Pumping Lemma for CFLs

Let T'' be the subtree whose root node is the last occurrence of A on the path P.

T'' is a parse tree for a string x.

If we remove T'' from T' except the root A, the result is a parse tree for a string vAy.

This parse tree may be attached at the leaf A in the tree T – T' repeatedly as many times as we like (including zero times), creating parse trees for uv^k A y^k z for any k ≥ 0.

Finally, we re-attach T'' and get a parse tree for uv^k x y^k z.

Page 20: Languages That Are and Are Not Context-Free

The Pumping Lemma for CFLs

If v = e and y = e, then they could have been eliminated, producing a parse tree for w with fewer leaves.

We assumed that T has the smallest possible number of leaves among the parse trees for w.

Therefore, v ≠ e or y ≠ e.

The path from the second-to-last A to the last A and then to the terminal has length at most |V – Σ| + 1.

Therefore, the subtree T' represents no more than φ(G)^(|V – Σ| + 1) terminals. (Lemma)

Thus, |vxy| ≤ n.

Page 21: Languages That Are and Are Not Context-Free

Standard Example of a Non-CFL

The language {a^n b^n c^n | n ≥ 0} is not context-free.

Proof: Suppose it is. Let n be the n of the Pumping Lemma. Let w = a^n b^n c^n. Then w = uvxyz where |v| > 0 or |y| > 0 and |vxy| ≤ n.

Then vxy contains at most two different symbols. Suppose it contains only a's and b's (but no c's).

Since v and y are not both empty, together they contain at least one a or at least one b.

Page 22: Languages That Are and Are Not Context-Free

Standard Example of a Non-CFL

Say v and y together contain i a's and j b's, for some i and j, with i > 0 or j > 0.

Then uv^2xy^2z contains n + i a's and n + j b's, at least one of which is greater than n.

But uv^2xy^2z contains only n c's. Thus, uv^2xy^2z is not in the language. This is a contradiction. Therefore, this language is not context-free.

The other case, where vxy contains b's and c's, but no a's, is handled similarly.
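The case analysis can be sanity-checked by brute force for a small candidate value of n. The sketch below tries every split w = uvxyz allowed by the Pumping Lemma and looks for one that survives pumping with k = 0 and k = 2. The candidate value 4 and the function names are assumptions of this example, and a check for one value is an illustration, not a proof.

```python
def in_language(s):
    """Membership test for {a^n b^n c^n | n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

def pump(u, v, x, y, z, k):
    """Return u v^k x y^k z."""
    return u + v * k + x + y * k + z

def decompositions(w, p):
    """Yield every split w = uvxyz with |vxy| <= p and v, y not both empty."""
    for i in range(len(w) + 1):
        end = min(i + p, len(w))  # vxy must fit inside w[i:end]
        for j in range(i, end + 1):
            for k in range(j, end + 1):
                for l in range(k, end + 1):
                    if j > i or l > k:
                        yield w[:i], w[i:j], w[j:k], w[k:l], w[l:]

def survivors(p):
    """Splits of a^p b^p c^p that stay in the language for k = 0 and k = 2."""
    w = "a" * p + "b" * p + "c" * p
    return [d for d in decompositions(w, p)
            if all(in_language(pump(*d, k)) for k in (0, 2))]

print(survivors(4))
```

An empty result means that no admissible split of a^4 b^4 c^4 can be pumped, which matches the argument above.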

Page 23: Languages That Are and Are Not Context-Free

Example of a Non-CFL

The language {a^m b^n c^m d^n | m, n ≥ 0} is not context-free.

Proof: Suppose that it is context-free. Let n be the n of the Pumping Lemma. Let w = a^n b^n c^n d^n. Complete the proof using the Pumping Lemma.

Page 24: Languages That Are and Are Not Context-Free

Example of a Non-CFL

The language

L = {w ∈ Σ* | #a's = #b's = #c's}

is not context-free.

Proof: Suppose that it is context-free. Intersect it with L(a*b*c*), which is regular. The intersection of a CFL and a regular language is a CFL, but this intersection is {a^n b^n c^n | n ≥ 0}, which is known to be a non-CFL. Therefore, the language L is not context-free.
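The claimed intersection can be checked by enumeration for short words. This sketch assumes the alphabet {a, b, c} and a length cutoff of 6, both chosen only for the example.

```python
import re
from itertools import product

ABC = re.compile(r"a*b*c*")

def equal_counts(w):
    """True when w has equally many a's, b's, and c's."""
    return w.count("a") == w.count("b") == w.count("c")

def intersection_up_to(max_len):
    """Words of length <= max_len that are in L and in L(a*b*c*)."""
    words = ("".join(t) for n in range(max_len + 1)
             for t in product("abc", repeat=n))
    return {w for w in words if equal_counts(w) and ABC.fullmatch(w)}

print(sorted(intersection_up_to(6), key=len))
```

Up to length 6 the only words with equal counts that also match a*b*c* are the empty word, abc, and aabbcc, that is, a^n b^n c^n for n = 0, 1, 2.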

Page 25: Languages That Are and Are Not Context-Free

Nonclosure Properties

Theorem: The set of context-free languages is not closed under intersection.

Proof: Let L1 = {a^n b^n c^m | m, n ≥ 0} and let L2 = {a^m b^n c^n | m, n ≥ 0}.

Clearly, L1 and L2 are context-free.

However, L1 ∩ L2 = {a^n b^n c^n | n ≥ 0}, which is known to be non-context-free.
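A quick finite check of the intersection, with both languages truncated to exponents up to a bound. The bound 3 and the function names are assumptions of this example.

```python
def l1(bound):
    """{a^n b^n c^m} with m, n <= bound."""
    return {"a" * n + "b" * n + "c" * m
            for n in range(bound + 1) for m in range(bound + 1)}

def l2(bound):
    """{a^m b^n c^n} with m, n <= bound."""
    return {"a" * m + "b" * n + "c" * n
            for m in range(bound + 1) for n in range(bound + 1)}

def anbncn(bound):
    """{a^n b^n c^n} with n <= bound."""
    return {"a" * n + "b" * n + "c" * n for n in range(bound + 1)}

common = l1(3) & l2(3)
print(sorted(common, key=len))
```

A word in both sets must have as many a's as b's (from L1) and as many b's as c's (from L2), which leaves exactly the words a^n b^n c^n.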

Page 26: Languages That Are and Are Not Context-Free

Nonclosure Properties

Theorem: The set of context-free languages is not closed under complementation.

Proof: Suppose it were closed under complementation. Let L1 and L2 be context-free languages.

Then L1' and L2' are context-free, and since CFLs are closed under union, (L1' ∪ L2')' is also context-free.

However, by De Morgan’s Laws, this is L1 ∩ L2, which we now know is not necessarily context-free.

Page 27: Languages That Are and Are Not Context-Free

Example

The language

L = {w ∈ Σ* | w ≠ uu for any u ∈ Σ*}

is context-free. The language

L′ = {w ∈ Σ* | w = uu for some u ∈ Σ*}

is not context-free.
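To make the two languages concrete, here is a small membership test. The function name `is_square` is hypothetical. The code only decides membership for a given word and says nothing about which language is context-free.

```python
def is_square(w):
    """True when w = uu for some u, i.e. w is in L′; otherwise w is in L."""
    half, odd = divmod(len(w), 2)
    return odd == 0 and w[:half] == w[half:]

for w in ["", "abab", "abba", "aba"]:
    print(repr(w), "in L′" if is_square(w) else "in L")
```

Every word of odd length is in L, and the empty word is in L′ with u = e.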