

SYLLABUS

CS4: Theoretical Foundations of Computer Science

Unit I
Mathematical preliminaries: sets, operations, relations, strings, transitive closure, countability and diagonalisation, induction and proof methods, the pigeonhole principle and simple applications, the concept of a language, grammars and production rules, the Chomsky hierarchy.

Unit II
Finite state machines, regular languages, deterministic finite automata, conversion to deterministic automata, ε-closures, regular expressions and finite automata, minimization of automata, Moore and Mealy machines and their equivalence.

Unit III
Pumping lemma for regular sets, closure properties of regular sets, decision properties for regular sets, equivalence between regular languages and regular grammars. Context-free languages: parse trees and ambiguity, reduction of CFGs, Chomsky and Greibach normal forms.

Unit IV
Push-down automata (PDA), non-determinism, acceptance by two methods and their equivalence, conversion of PDA to CFG, CFLs and PDAs, closure and decision properties of CFLs.

Unit V
Turing machines and variants, recursively enumerable (r.e.) sets, recursive sets, the TM as a computer of functions, decidability and solvability, reductions, the Post correspondence problem (PCP) and unsolvability of the ambiguity problem for CFGs, Church's hypothesis.

Unit VI
Introduction to recursive function theory: primitive recursive and partial recursive functions; parsing, top-down and bottom-up approaches, derivation and reduction.

Unit I

Mathematical preliminaries: sets, operations, relations, strings, transitive closure, countability and diagonalisation, induction and proof methods, the pigeonhole principle and simple applications, the concept of a language, grammars and production rules, the Chomsky hierarchy.

Mathematical preliminaries

Sets, Operations, Relations:

A set is a collection of objects with no repetition. The simplest way to describe a set is by listing its elements. If a set is described using a defining property, the description should clearly specify the objects and the universe of discourse:
A = {x | x ∈ N and x < 10}

Important notation for sets: ∅, a ∈ A, a ∉ A, A ⊆ B, A ⊂ B, A ∪ B, A ∩ B, A = B, A ≠ B, A \ B, A × B, ℘(A) or 2^A, and the complement A^c.

For all sets A, B and C in the universe U the following set properties hold:

Associative law: (A ∪ B) ∪ C = A ∪ (B ∪ C); (A ∩ B) ∩ C = A ∩ (B ∩ C)
Commutative law: A ∪ B = B ∪ A; A ∩ B = B ∩ A
Complement law: A ∪ A^c = U; A ∩ A^c = ∅
Idempotency law: A ∪ A = A; A ∩ A = A
Identity law: A ∪ ∅ = A; A ∩ U = A
Zero law: A ∪ U = U; A ∩ ∅ = ∅
Involution law: (A^c)^c = A
De Morgan's laws: (A ∪ B)^c = A^c ∩ B^c; (A ∩ B)^c = A^c ∪ B^c
Distributive law: A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C); A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

A set A is said to be finite if A contains a finite number of elements, and infinite if A contains an infinite number of elements. The set A is said to be countable or enumerable if there is a way to list the elements of A. More formally, a set A is countable or enumerable if A is finite or if there is a bijection f: A → Z⁺.

Example. The following sets are countable:
A = {x | x = 2y + 1 for some y ∈ Z⁺} = {3, 5, 7, 9, 11, 13, 15, ...}
B = {(x, y) | x, y ∈ A} = {(3,3), (3,5), (5,3), (3,7), (5,5), (7,3), (3,9), (5,7), (7,5), ...}

A relation R is a subset of A × B, where A is the domain and B is the range of R. If the domain is the same as the range, A = B, then R is a relation on the set A.

Properties of relations:
R is reflexive if aRa for all a ∈ A.
R is symmetric if aRb implies bRa.
R is transitive if aRb and bRc implies aRc.
R is antisymmetric if aRb and bRa implies a = b.
R is an equivalence relation if R is reflexive, symmetric and transitive.
An equivalence class is {b ∈ A | aRb}, where R is an equivalence relation on the set A and a ∈ A.
If R is a relation on A, then the reflexive closure of R is the smallest reflexive relation on A with R as a subset; the symmetric closure of R is the smallest symmetric relation on A with R as a subset; and the transitive closure of R is the smallest transitive relation on A with R as a subset.
The relation R on A is an ordering relation if R is reflexive, antisymmetric and transitive.

A function f from the set A to the set B, denoted f: A → B, associates with each element a ∈ A a unique element b ∈ B denoted by f(a). A is the domain of f and B is the codomain of f. The set {f(a) | a ∈ A} is the range of f and is a subset of the codomain.
A function f: A → B is total if f(a) is defined for all a ∈ A; otherwise f is partial.
A function f: A → B is onto if for each b ∈ B there is an a ∈ A such that f(a) = b.
A function f: A → B is one-to-one if f(a) = f(a') implies a = a', i.e., each b ∈ B has at most one preimage.
A function f: A → B is bijective if f is both onto and one-to-one.

Exercise 3.1
1. Determine the cardinality of the power set ℘(A), where A = B × C, B = {1, 2} and C = {a, b}. What is the cardinality of the largest element of ℘(A)?
A = {(1,a), (1,b), (2,a), (2,b)}, so |℘(A)| = 2^|A| = 16:
{∅, {(1,a)}, {(1,b)}, {(2,a)}, {(2,b)}, {(1,a),(1,b)}, {(1,a),(2,a)}, {(1,a),(2,b)}, {(1,b),(2,a)}, {(1,b),(2,b)}, {(2,a),(2,b)}, {(1,a),(1,b),(2,a)}, {(1,a),(1,b),(2,b)}, {(1,a),(2,a),(2,b)}, {(1,b),(2,a),(2,b)}, {(1,a),(1,b),(2,a),(2,b)}}
The largest element is A itself, with cardinality 4.
2. Determine the properties of the following relations:
i) < on the set of real numbers: transitive, antisymmetric.
ii) ≤ on the set of integers: reflexive, transitive, antisymmetric.
iii) "is a relative of" over the set of persons living in the UK: reflexive, symmetric.
3. Determine the properties of the function f: R × R → R × R such that f(x, y) = (x + y, x − y): total, onto, one-to-one, bijective (try solving the equations x + y = a and x − y = b).
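As a quick illustration of Exercise 3.1(1), the power set can be enumerated mechanically. A minimal Python sketch (the helper name power_set is ours, not from the notes):

    from itertools import chain, combinations

    def power_set(s):
        """Return all subsets of s, as tuples."""
        items = list(s)
        return list(chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)))

    # A = B x C with B = {1, 2} and C = {'a', 'b'}
    A = {(b, c) for b in (1, 2) for c in ("a", "b")}
    subsets = power_set(A)
    print(len(subsets))            # 16 = 2**|A|, with |A| = 4
    print(max(subsets, key=len))   # the largest element is A itself (cardinality 4)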

Strings and Languages:

A string is a finite sequence of symbols a1a2a3...an where each ai is an element of the alphabet Σ. The length of a string x, denoted |x|, is the number of symbols in x. The empty string has no symbols and is denoted by ε, with |ε| = 0.
Σ* is the set of all strings of finite length over Σ, and ε ∈ Σ* for every alphabet Σ.
For two strings x, y ∈ Σ*, where x = a1a2...am and y = b1b2...bn, the concatenation of x and y is given by xy = a1a2...am b1b2...bn.
For a string z ∈ Σ* of the form z = xy, where x, y ∈ Σ*, x is a prefix of z and y is a postfix of z.
For a string z ∈ Σ* of the form z = xwy, where x, w, y ∈ Σ*, w is a substring of z.
A formal language L over the alphabet Σ is any subset of Σ*.
The concatenation of two languages L1 and L2 is given by L1L2 = {xy | x ∈ L1, y ∈ L2}.

Exercise 3.2
1. Let Σ = {0, 1} and L = {x ∈ Σ* : |x| = 4}. R is a relation over L where xRy if the first two bits of x are the same as the first two bits of y. Determine whether R is an equivalence relation.
L = {0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111}
Yes, R is an equivalence relation.
2. How many prefixes, postfixes and substrings exist for a string of length n?
n + 1 prefixes, n + 1 postfixes, and n(n+1)/2 nonempty substrings (one for each choice of start and end position).
3. Let Σ = {a, b, c} and L = {w | www = uu for some u ∈ Σ*}. Give an example of a string in L.
Examples: aa, bb, cc, abab, cbacba.

Transitive Closure:

Closures of Binary Relations
A binary relation R on a set S may not have a particular property such as reflexivity, symmetry, or transitivity. However, it may be possible to extend the relation so that it does have the property. Extending R means finding a larger subset of S × S that contains R and which has the desired property. The closure of a relation on S with respect to a property is the smallest such extension that has the desired property.

Commonly used closures:
- transitive closure
- reflexive closure
- symmetric closure

Transitive Closure of Binary Relation:
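The transitive closure R⁺ of a relation R can be computed by repeatedly adding a pair (i, j) whenever (i, k) and (k, j) are already present. A minimal Python sketch using Warshall's algorithm (function and variable names are ours):

    def transitive_closure(pairs, universe):
        """Warshall's algorithm: the smallest transitive relation containing `pairs`."""
        closure = set(pairs)
        for k in universe:            # allow k as an intermediate element
            for i in universe:
                for j in universe:
                    if (i, k) in closure and (k, j) in closure:
                        closure.add((i, j))
        return closure

    R = {(1, 2), (2, 3), (3, 4)}
    print(sorted(transitive_closure(R, [1, 2, 3, 4])))
    # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]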

Countability and Diagonalisation:

Mathematical Induction

Let P(n) be a proposition involving an integer n. The Principle of Mathematical Induction states that P(n) is true for all n ≥ n0 if the following are true: P(n0), and, for n ≥ n0, "P(k) for all n0 ≤ k ≤ n" implies P(n + 1).

Problems
1. The number of subsets of a set of size n is 2^n.
2. Σ_{i=0..n} i = n(n+1)/2
3. Σ_{i=0..n} i² = n(n+1)(2n+1)/6
4. Σ_{i=0..n} i³ = (Σ_{i=0..n} i)²
5. Σ_{i=0..n} 2^i = 2^{n+1} − 1
6. Σ_{i=1..n} 1/(i(i+1)) = n/(n+1)
7. For n ≥ 4, n! > 2^n.
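The closed forms above can be spot-checked numerically for small n; this is evidence, not a proof (the proofs are by induction, as stated). A small Python check:

    from fractions import Fraction
    from math import factorial

    for n in range(0, 25):
        assert sum(range(n + 1)) == n * (n + 1) // 2                              # (2)
        assert sum(i * i for i in range(n + 1)) == n * (n + 1) * (2 * n + 1) // 6 # (3)
        assert sum(i ** 3 for i in range(n + 1)) == sum(range(n + 1)) ** 2        # (4)
        assert sum(2 ** i for i in range(n + 1)) == 2 ** (n + 1) - 1              # (5)
        assert sum(Fraction(1, i * (i + 1))
                   for i in range(1, n + 1)) == Fraction(n, n + 1)                # (6)
    for n in range(4, 25):
        assert factorial(n) > 2 ** n                                              # (7)
    print("identities (2)-(7) hold on the tested range")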

Induction And Proof Methods:

Pigeonhole Principle

If A and B are finite sets and |A| > |B|, then there is no one-to-one function from A to B.
Example: In any group of at least two people there are at least two persons that have the same number of acquaintances within the group.
Example: How many shoes must be drawn from a box containing 10 pairs to ensure a match? (11: the 10 pairs are the pigeonholes.)
Example: In NYC there are at least two people with the same number of hairs on their heads.
Example: If A is a set of 10 numbers between 1 and 100, there are two distinct disjoint subsets of A whose elements sum to the same number.

Diagonalization Proofs

Theorem (Georg Cantor). The set of all subsets of the natural numbers is uncountable.

Proof by contradiction. Suppose there is a one-to-one function f from N onto pow(N), and write pow(N) = {S0, S1, S2, ...} where Si = f(i). Let D = {n in N | n is not in Sn}; D is the diagonal set for f. D is a set of natural numbers, hence D = Sk for some k in N. If k is in Sk, then k is not in D; but Sk = D, a contradiction. If k is not in Sk, then k is in D; but Sk = D, again a contradiction. So pow(N) is not countable.
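The diagonal construction can be made concrete: given any programmatic enumeration i → S_i of sets of naturals (represented as membership tests), the set D built by flipping the diagonal differs from every S_i. A small illustrative Python sketch (the sample enumeration f is arbitrary; any enumeration exhibits the same behavior):

    def f(i):
        """A sample enumeration: S_i = the multiples of i + 1."""
        return lambda n: n % (i + 1) == 0

    def in_D(n):
        """D = { n : n not in S_n }: flip the diagonal entry."""
        return not f(n)(n)

    for k in range(8):
        assert in_D(k) != f(k)(k)   # D disagrees with S_k at the element k
    print("D differs from every S_k at k, so D is not in the enumeration")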

Concept Of Language:

Context-Free Languages and Context-Free Grammars

From the results in the previous chapters, if a language is regular, it is easy to find a general format for sentences of the language by using regular expressions. Also, it is easy to check whether a string is in the language by using the language's DFA model. However, not all languages are regular. In fact, some basic properties of programming languages require something beyond regular languages.

Example 4.0.4. In order to deal with mathematical expressions such as ((x + y)·z + x·y), a programming language needs to have the ability to recognize the language L = { (^n )^n : n ≥ 0 } (n opening parentheses followed by n closing parentheses), which describes a simple kind of nested structure in programming languages.

Context-Free Grammars

Definition 4.1.1. A grammar G = (V, T, S, P) is context-free if every production rule in P has the form A → x, where A ∈ V and x ∈ (V ∪ T)*. A language L is context-free if there exists a context-free grammar G such that L = L(G), where L(G) = {w ∈ T* : S ⇒* w}.
It is easy to see that any regular grammar is context-free, but a context-free grammar need not be regular.
Example 4.1.2. L = {aⁿbⁿ : n ≥ 0} is not regular. Moreover, L is generated by the context-free grammar G = ({S}, {a, b}, S, {S → aSb | ε}).
For a sentence w of a context-free language there may be more than one derivation of w starting from S: S ⇒ ... ⇒ w. Furthermore, since it is possible to have more than one variable on the right-hand side of a production rule, there are several possibilities for applying production rules.
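The grammar of Example 4.1.2 is small enough to animate. A minimal Python sketch that derives aⁿbⁿ by applying S → aSb n times and then S → ε, with a direct membership check alongside (helper names are ours):

    def generate(n):
        """Derive a^n b^n: apply S -> aSb n times, then S -> epsilon."""
        s = "S"
        for _ in range(n):
            s = s.replace("S", "aSb", 1)
        return s.replace("S", "")

    def in_L(w):
        """Direct check that w is in { a^n b^n : n >= 0 }."""
        n = len(w) // 2
        return w == "a" * n + "b" * n

    for n in range(5):
        w = generate(n)
        assert in_L(w)
        print(repr(w))   # '', 'ab', 'aabb', 'aaabbb', 'aaaabbbb'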

Production Rules:

Definition 4.1.3. A derivation is called leftmost if in each step the leftmost variable in the sentential form is replaced. A derivation is called rightmost if in each step the rightmost variable in the sentential form is replaced.
Example 4.1.4. G = ({A, B, S}, {a, b}, S, P), where the numbered rules are
P = {S →(1) AB, A →(2) aaA, A →(3) ε, B →(4) Bb, B →(5) ε}
Leftmost: S ⇒(1) AB ⇒(2) aaAB ⇒(3) aaB ⇒(4) aaBb ⇒(5) aab
Rightmost: S ⇒(1) AB ⇒(4) ABb ⇒(5) Ab ⇒(2) aaAb ⇒(3) aab
A second way of showing derivations is by using derivation trees. This manner of showing derivations is independent of the order in which production rules are applied.
Definition 4.1.5. Let G = (V, T, S, P) be a context-free grammar.

A derivation tree is a tree in which
(1) the root is labeled S;
(2) every leaf has a label in T ∪ {ε};
(3) every interior vertex has a label in V;
(4) for every vertex labeled A ∈ V, if A's children are a1, a2, ..., an, then P must contain the production rule A → a1a2...an;
(5) every leaf with label ε has no sibling.
Like transition graphs for finite automata, derivation trees give a very explicit and easily comprehended description of a derivation. [Figure: a derivation tree for an arithmetic-expression grammar with variables exp, term, factor and number, omitted.]

Theorem 4.1.6. Let G = (V, T, S, P) be a context-free grammar. For all w ∈ T*, w is in L(G) if and only if there exists a derivation tree of G whose yield is w.

Proof.
(⇒):
(1) We first prove that for every sequence S ⇒ x1 ⇒ ... ⇒ xn, with xi ∈ (V ∪ T)* for i = 1..n, there exists a partial derivation tree with root S which satisfies Conditions 1, 3, 4, 5 and yields xn. We prove this fact by induction on the length of the sequence.
n = 1: the tree is constructed by using the single production rule that derives x1 from S.
n = k ≥ 1: assume that for every sequence S ⇒ x1 ⇒ ... ⇒ xk, with xi ∈ (V ∪ T)* for i = 1..k, there exists a partial derivation tree with root S which satisfies Conditions 1, 3, 4, 5 and yields xk.
n = k + 1: since the grammar is context-free, in every sequence S ⇒ x1 ⇒ ... ⇒ xk ⇒ xk+1 of length k + 1, where xi ∈ (V ∪ T)* for i = 1..k+1, xk must be of the form uAv. Moreover, there must be a production rule A → z in the set of production rules P, and xk+1 must be of the form uzv, for some A ∈ V and u, v, z ∈ (V ∪ T)*. By the induction assumption there exists a partial derivation tree with root S that yields uAv. We simply add the children of node A following the production rule A → z. Obviously the new partial tree has root S, satisfies Conditions 1, 3, 4, 5 and yields uzv = xk+1.
(2) We now prove that for all w ∈ T*, if w ∈ L(G) then there exists a derivation tree of G whose yield is w. Since w ∈ L(G), there exists a derivation sequence S ⇒ ... ⇒ w. Therefore there exists a partial derivation tree with root S which satisfies Conditions 1, 3, 4, 5 and yields w. This tree also satisfies Condition 2 because w ∈ T*, which means all leaves of the tree are in T ∪ {ε}.
(⇐):
(1) We first prove that for every partial derivation tree with ≥ 1 interior node whose yield is x ∈ (V ∪ T)*, there exists a sequence S ⇒ ... ⇒ x. We prove this fact by induction on the number of interior nodes.
n = 1: the only interior node is S, and the sequence is S ⇒ x.
n = k ≥ 1: assume that for every partial derivation tree with k ≥ 1 interior nodes whose yield is x ∈ (V ∪ T)*, there exists a sequence S ⇒ ... ⇒ x.
n = k + 1: since k + 1 > 1, every tree with k + 1 interior nodes must have a leaf z' ∈ V ∪ T ∪ {ε} such that its direct parent node A ∈ V is different from S (otherwise there is only one interior node, namely S). Remove z' and all of its siblings. The new partial derivation tree has k interior nodes, so by the induction assumption there exists a sequence S ⇒ ... ⇒ uAv for some u, v ∈ (V ∪ T)*. Simply append ⇒ uzv = x to the sequence, where z is the concatenation of all children of A; we obtain the sequence S ⇒ ... ⇒ uAv ⇒ uzv = x for the tree.
(2) We now prove that for all w ∈ T*, if there exists a derivation tree of G whose yield is w, then w ∈ L(G). For every derivation tree whose yield is w there exists a sequence S ⇒ ... ⇒ w. Since w ∈ T*, w ∈ L(G).

Theorem. Let G = (V, T, S, P) be a context-free grammar which does not have any ε-rules (i.e., A → ε with A ∈ V) or unit-production rules (i.e., A → B with A, B ∈ V). Then for every w ∈ T*, the exhaustive search algorithm either produces a parsing of w or tells us that no parsing is possible.
Proof. After each round, either the length of the sentential form or the number of terminal symbols in it increases by at least one. Since neither the length of the sentential form nor the number of terminal symbols can exceed |w|, a derivation cannot involve more than 2|w| rounds.

Problem. This algorithm, however, is very inefficient: the upper bound for the number of sentential forms examined is M = |P| + |P|² + ... + |P|^(2|w|).
Claim. We can reduce the complexity of the algorithm to |w|³ or lower.
A context-free grammar G = (V, T, S, P) is called a simple grammar (s-grammar) if all of its production rules are of the form A → aX, where A ∈ V, a ∈ T, X ∈ V*, and each pair (A, a) occurs at most once in P.
Lemma 4.2.5. If a grammar G = (V, T, S, P) is simple, then every w ∈ T* can be parsed in at most |w| steps.
Proof. Assume that w = a1a2...an.
- If P does not have a rule S → a1A1..., then stop: w ∉ L(G); else apply the production rule S → a1A1...
- If P does not have a rule A1 → a2A2..., then stop: w ∉ L(G); else apply the production rule A1 → a2A2...
- ... and so on, consuming one terminal of w at each step.
Definition 4.2.6. A context-free grammar G is ambiguous if there exists a sentence w ∈ L(G) which has at least two distinct derivations (derivation trees).

Chomsky's hierarchy:
N. Chomsky: Three models for the description of language, IRE Trans. Information Theory 2, 113-124, 1956.
Motivating example:
Sentence -> Noun Verb Noun, e.g.: Bob loves Alice
Sentence -> Sentence Conjunction Sentence, e.g.: Bob loves Alice and Rome fights Carthage
Grammar G = (V, A, P, S). V: alphabet of non-terminal symbols (variables, grammatical types); A: alphabet of terminal symbols; S ∈ V: start symbol, "sentence"; P: unordered set of productions of the form L -> R, where L, R ∈ (V ∪ A)*.
Rewriting step: for x, y, y', z ∈ (V ∪ A)*, u -> v iff u = xyz, v = xy'z and y -> y' ∈ P.
Derivation: ->* is the transitive, reflexive closure of ->, i.e., u ->* v iff there exist w0, w1, ..., wj, with j ≥ 0, u = w0, w(i-1) -> wi, wj = v.
Language defined by G: L(G) = { w ∈ A* | S ->* w }.
Various restrictions on the productions define different types of grammars and corresponding languages:
Type 0, phrase structure grammar: no restrictions.
Type 1, context sensitive: |L| ≤ |R| (exception: S -> ε is allowed if S never occurs on any right-hand side).
Type 2, context free: L ∈ V.
Type 3, regular: L ∈ V, and R = a or R = aX, where a ∈ A, X ∈ V.

Chomsky Normal Form (CNF)

Definition. A context-free grammar is in Chomsky Normal Form (CNF) if all production rules are of the form
A → BC or A → a
where A, B, C ∈ V and a ∈ T.

Algorithm 4.5.2 [CNF].
Input: a context-free grammar G = (V, T, S, P) with ε ∉ L(G).
Output: an equivalent context-free grammar Ĝ = (V̂, T̂, Ŝ, P̂) that is in CNF.
Step 1: remove ε-production rules and unit production rules from G.
Step 2: construct G1 as follows. For each rule A → x1x2...xn:
- if n = 1, there must be a terminal symbol a such that A → a; add A → a to P1;
- if n ≥ 2, add A → C1C2...Cn to P1, where Ci = xi if xi is a variable, and Ci = Ba is a new variable if xi = a is a terminal symbol. Add Ba → a to P1 for each new variable Ba.
Step 3: construct Ĝ from G1:
- put into P̂ all rules in P1 of the form A → a and A → BC;
- replace each A → C1C2...Cn with n > 2 by
A → C1D1, D1 → C2D2, ..., Dn-2 → Cn-1Cn.
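Steps 2 and 3 can be sketched directly. The Python sketch below assumes Step 1 (removal of ε-rules and unit rules) has already been done, and represents a grammar as a mapping from each variable to a list of rule bodies; the names to_cnf, B_a and D1, D2, ... follow the algorithm but the encoding is ours:

    def to_cnf(rules, terminals):
        """Steps 2-3 of the CNF algorithm. A body is a list of symbols."""
        step2 = []
        for lhs, bodies in rules.items():
            for body in bodies:
                if len(body) == 1:
                    step2.append((lhs, body))              # A -> a is kept as is
                else:
                    new_body = []
                    for x in body:
                        if x in terminals:                 # terminal inside a long body:
                            step2.append(("B_" + x, [x]))  # add B_a -> a (duplicates would
                            new_body.append("B_" + x)      # be deduplicated in practice)
                        else:
                            new_body.append(x)
                    step2.append((lhs, new_body))
        cnf, fresh = [], 0
        for lhs, body in step2:
            while len(body) > 2:                           # A -> C1 C2 ... Cn, n > 2
                fresh += 1
                d = "D%d" % fresh
                cnf.append((lhs, [body[0], d]))            # A -> C1 D1, D1 -> C2 D2, ...
                lhs, body = d, body[1:]
            cnf.append((lhs, body))
        return cnf

    # S -> aSb becomes B_a -> a, B_b -> b, S -> B_a D1, D1 -> S B_b
    print(to_cnf({"S": [["a", "S", "b"]]}, {"a", "b"}))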

Unit II

Finite state machines, regular languages, deterministic finite automata, conversion to deterministic automata, ε-closures, regular expressions and finite automata, minimization of automata, Moore and Mealy machines and their equivalence.

Finite state machine:

Finite State Machines with Output (Mealy and Moore Machines)

Introduction

If a combinational logic circuit is an implementation of a Boolean function, then a sequential logic circuit can be considered an implementation of a finite state machine. There is a little more to it than that (because a sequential logic circuit can contain combinational logic circuits).

If you take a course in programming languages, you will also learn about finite state machines. Usually, you will call it a DFA (deterministic finite automaton).

While finite state machines with outputs are essentially DFAs, the purpose behind them is different.

DFAs in programming languages

When you are learning about models of computation, one simple model is the deterministic finite automaton, or DFA for short.

Formally, the definition of a DFA is:

- Q, a set of states
- S, a single state which is an element of Q; this is the start state
- F, a set of states designated as the final states
- Sigma, the input alphabet
- delta, a transition function that maps a state and a letter from the input alphabet to a state

DFAs are used to recognize a language, L. A language is a set of strings made from characters in the input alphabet. If a language can be recognized by a DFA, it is said to have a regular grammar.

To use a DFA, you start in an initial state, and process the input string a character at a time. For example, if the input alphabet consists of "a" and "b", then a typical question is to ask whether the string "aaab" is accepted by a DFA.

To find out whether it is accepted, you start off in the start state, S. Then you process each character (first "a", then "a", then "a", and finally "b"). This may cause you to move from one state to another. After the last character is processed, if you are in a final state, then the string is in the language. Otherwise, it's not in the language. There are some languages that can't be recognized by a DFA (for example, palindromes). Thus, while a DFA is reasonably powerful, there are other (mathematical) machines that are more powerful.
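Since the DFA's diagram is not reproduced here, the following minimal Python sketch uses a hypothetical two-state DFA over {a, b} that accepts exactly the strings ending in b, and runs it on "aaab" exactly as described: one character at a time from the start state.

    def accepts(delta, start, finals, w):
        """Run a DFA on w and report acceptance."""
        state = start
        for ch in w:
            state = delta[(state, ch)]
        return state in finals

    # Hypothetical example DFA: accepts strings that end in 'b'.
    delta = {("s0", "a"): "s0", ("s0", "b"): "s1",
             ("s1", "a"): "s0", ("s1", "b"): "s1"}
    print(accepts(delta, "s0", {"s1"}, "aaab"))   # True
    print(accepts(delta, "s0", {"s1"}, "aaba"))   # False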

Often, tokens in programming languages can be described using a regular grammar.

FSM with output in hardware

A finite state machine with output has a similar formal description.

- Q, a set of states
- S, a single state which is an element of Q; this is the start state
- Sigma, the input alphabet
- Pi, the output alphabet
- delta, a transition function that maps a state and a letter from the input alphabet to a state and a letter from the output alphabet

The primary difference is that there is no set of final states, and that the transition function not only puts you in a new state, but also generates an output symbol.

The goal of this kind of FSM is not accepting or rejecting strings, but generating a set of outputs given a set of inputs. Recall that a black box takes in inputs, processes, and generates outputs. FSMs are one way of describing how the inputs are being processed, based on the inputs and state, to generate outputs. Thus, we're very interested in what output is generated.

In DFAs, we don't care what output is generated. We care only whether a string has been accepted by the DFA or not.

Since we're talking about circuits, the input alphabet is going to be the set of k bit bitstrings, while the output alphabet is the set of m bit bitstrings.

We'll look at this more informally, just in case you're confused.

An Example

Let's look at an example of an FSM.

Each of the circles is a state. For now, all you need to know is that, at any given moment, you are in one state. Think of this as a game, where there are circles drawn on the ground, and at any moment, you are standing in exactly one circle.

Each circle is given a unique binary number. The number of bits used depends on the total number of states. If there are N states, then you need ceil(lg N) bits (the ceiling of log base 2 of N). The states are labelled with the letter q, plus subscripts. In this example, it's q1q0.

You may have k input bits. The input bits tell you which state to transition to. For example, if you have 2 input bits (x1x0), then there are four possible outgoing edges (x1x0 = 00, x1x0 = 01, x1x0 = 10, and x1x0 = 11). In general, there are 2^k outgoing edges for k bits of input.

Thus, the number of edges depends on the number of bits used in the input.

Tracing an Example

You might be asked: what is the sequence of states and outputs, assuming you start in state 00 and have input (1, 1, 0, 0, 1)?

    State: 00 (start)  01  10  01  01  10
    Input:         1    1   0   0   1

So, you may start in state 00, reading input 1 (see column 1 of the table), which puts you in state 01. At that point, you read in input 1 (see column 2), and go into state 10 (column 3), etc.

FSM with Outputs: Moore machines

The goal of FSMs is to describe a circuit with inputs and outputs. So far, we have inputs that tell us which state we should go to, given some initial start state. However, the machine generates no outputs.

We modify the FSM shown above by adding outputs. Moore machines add outputs to each state; thus, each state is associated with an output. When you transition into the state, the output corresponding to the state is produced. The information in the state is typically written as 01/1: 01 indicates the state, while 1 indicates the output. 01/1 is shorthand for q1q0 = 01 / z = 1.

The number of bits in the output is arbitrary, and depends on whatever your application needs. Thus, the number of bits may be less than, equal to, or greater than the number of bits used to represent the state.

Let's look at an example of a Moore machine.

In this example, you see two bits for the state and two bits for the output. Thus, when you see 00/01 inside one of the circles, it is shorthand for q1q0 = 00 / z1z0 = 01.

Tracing using Timing Diagrams

Given the Moore machine in the previous diagram, and the timing diagram below, you might be asked to determine the state and output.

The timing diagram isn't too hard to follow. Basically, you will start off in some state (let's say, 00), and draw the diagram to indicate what happens to the state (q1q0) and to the output (z1z0).

You'll notice the input does NOT change at the positive edge. That way, it's easier for you to tell the value of the input at the positive edge. To make it easier to read, I've added the value of x at the positive edge. Thus, the inputs are 1, 1, 0, 1, 1, 0.

Let's look at the timing diagram at the first positive edge (drawn with a vertical line). Before the first edge, the state is q1q0 = 00. The input is 1. This should put us in state 01 (i.e., q1q0 = 01), which outputs 11 (i.e., z1z0 = 11).

You have to read down the columns. The first column says that the machine is in state 00, with output 01. The second column says that the machine is in state 01, with output 11. The reason the second column says that is due to the input, x, read in at the first positive edge. The input x is 1, which caused the FSM to move from state 00 to state 01.

The value of the state and output are placed in the middle, but really, it's the dark line that tells you when the change happens. The state and output change value on the positive edge (technically, it takes a small but finite amount of time after the positive edge for the state and output to finally settle down, but we'll draw the diagrams as if it happens instantaneously, even though it doesn't).

Here's the rest of the timing diagram.

FSM with Outputs: Mealy machines

A Moore machine has outputs that are a function of state; that is, z = f(qk-1, ..., q0).

A Mealy machine has outputs that are a function of state and input; that is, z = f(qk-1, ..., q0, xm-1, ..., x0).

We usually indicate that the output is dependent on current state and input by drawing the output on the edge. In the example below, look at the edge from state 00 to state 01. This edge has the value 1/1. This means that if you are in state 00, and you see an input of 1, then you output a 1, and transition to state 01.

Thus, 1/1 is shorthand for x = 1 / z = 1.
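The contrast between the two output conventions can be seen by simulating both. The transition tables in the Python sketch below are illustrative stand-ins (the diagrams from the notes are not reproduced), but the step functions show exactly where the output is attached: to the state for Moore, to the (state, input) pair for Mealy.

    # Illustrative Moore machine: output is a function of the state alone.
    moore_next = {("00", 0): "00", ("00", 1): "01",
                  ("01", 0): "00", ("01", 1): "10",
                  ("10", 0): "01", ("10", 1): "10"}
    moore_out = {"00": "01", "01": "11", "10": "10"}   # z1 z0 per state

    # Illustrative Mealy machine: each edge carries (next state, output).
    mealy = {("00", 0): ("00", 0), ("00", 1): ("01", 1),
             ("01", 0): ("00", 0), ("01", 1): ("01", 0),
             ("10", 0): ("01", 1), ("10", 1): ("10", 0)}

    def run_moore(state, inputs):
        outs = [moore_out[state]]            # a Moore machine emits before any input
        for x in inputs:
            state = moore_next[(state, x)]
            outs.append(moore_out[state])
        return outs

    def run_mealy(state, inputs):
        outs = []
        for x in inputs:
            state, z = mealy[(state, x)]     # output depends on state AND input
            outs.append(z)
        return outs

    print(run_moore("00", [1, 1, 0, 0, 1]))  # ['01', '11', '10', '11', '01', '11']
    print(run_mealy("00", [1, 1, 0, 0, 1]))  # [1, 0, 0, 0, 1]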

Here's a sample Mealy machine. One thing you will notice is the numbering of the states. Usually, if there are 3 states, we number them 00, 01, and 10, since those are the first 3 unsigned binary numbers. However, given that we're using two bits, we can, in principle, pick any 3 of the 4 possible 2-bit numbers.

One reason we might want to pick something else besides 00, 01, and 10 is because implementing an FSM with minimal gates often involves picking the correct state numbering. Thus, if you're careful which state is numbered, say, 00, 01, and 11, you may be able to create a circuit that has fewer gates.

However, minimization of the circuit based on well-chosen state numberings is outside the scope of the course. We only pick the state numberings just to make a note that this could happen, but we won't take advantage of this fact.

Another interesting point to observe is how a Mealy machine differs from a Moore. Already we said that a Mealy machine's output may depend on both the values of state and input variables. We can see this in the example. Look at the edge from 00 to 01.

This edge says that if a 1 is input, we will transition to state 01 and output a 1. Now look at the loop in state 01. This says that if the input is 1, we will loop back to state 01, and output a 0.

So, in the first case, going to state 01 outputs a 1, whereas in the second case, going to state 01 outputs a 0. In a Moore machine, this would not happen: the output depends only on the state you transition into, not how you got into that state.

Tracing using Timing Diagrams

To see how the previous Mealy machine behaves, we can use timing diagrams. We'll use the same input as before.

The following timing diagram shows what happens to the state (q1q0) and to the output (z).

Just to see what happens: initially, we're in state 00, with an output of 0. It doesn't much matter what the initial output is. Unlike a Moore machine, a Mealy machine's output is not determined by the current state alone, so there is no meaningful output until the first input is read.

In state 00, we see an input of a 1. This takes us to state 01, with an output of 1. If you read the second column of numbers, you see 0 and a 1 (which is state 01), followed by a 1 (which is the output).

Equivalence of Mealy and Moore machines

We have two ways to describe an FSM: Mealy and Moore machines. A mathematician might ask: are the two machines equivalent?

Initially, you might think not. A Mealy machine can have its output depend on both input and state. Thus, if we make the output ignore the input, a Moore machine is just a special case, so we should be able to convert a Moore machine to a Mealy machine.

It's not so easy to see that you can convert an arbitrary Mealy machine to a Moore machine.

It turns out that the two machines are equivalent. What does that mean? It means that given a Moore machine, you can create a Mealy machine, such that if both machines are fed the same sequence of inputs, they will both produce the same sequence of outputs. You can also convert from a Mealy machine to its equivalent Moore machine, and again generate the same outputs given the same sequence of inputs.

Actually, to be precise we must ignore one fact about Moore machines. Moore machines generate output even if no input has been read in. So, if you ignore this initial output of the Moore machine, you can convert between one machine and the other.

The actual algorithm is beyond the scope of the course. However, the basic idea of converting a Mealy machine to a Moore machine is to increase the number of states. Roughly speaking, if you have a Mealy machine with N states, and there are k bits of input, you may need up to 2^k · N states in the equivalent Moore machine.

Effectively, the new states record information about how that state was reached.

Regular Language:

Regular languages

We define two new operations on languages, considering languages as sets of strings.

Def 17.9 (p. 462). The concatenation (or set product) XY (sometimes written X·Y) of two sets of strings X and Y is XY = {xy : x ∈ X, y ∈ Y}.
The Kleene star or closure of a set of strings X is X*, the set of all strings formed by concatenating members of X any number of times (including zero) in any order, allowing repetitions. This is just like our existing notion A* for an alphabet A, except that now X is a set of strings, not just an alphabet (which we can consider as a set of strings of length one). So the old notion is just a special case of the new, more general notion: the case where the set consists of strings of length one.

Using these two new operations on sets of strings, as well as standard set-theoretic notions, we can now define the regular languages recursively.

Definition 17.10 (p. 463). Given an alphabet A:
1. ∅ is a regular language.
2. For any string x in A*, {x} is a regular language.
3. If X, Y are regular languages, then so is X ∪ Y.
4. If X, Y are regular languages, then so is XY.
5. If X is a regular language, then so is X*.
6. Nothing else is a regular language.
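Both operations are easy to state as set computations. A minimal Python sketch (the Kleene star is truncated to a length bound, since X* is infinite whenever X contains a nonempty string; helper names are ours):

    def concat(X, Y):
        """Set product XY = { xy : x in X, y in Y }."""
        return {x + y for x in X for y in Y}

    def star(X, max_len):
        """X* restricted to strings of length <= max_len."""
        result, frontier = {""}, {""}
        while frontier:
            frontier = {w + x for w in frontier for x in X
                        if 0 < len(w + x) <= max_len} - result
            result |= frontier
        return result

    print(concat({"a", "ab"}, {"b"}))   # {'ab', 'abb'}
    print(sorted(star({"ab"}, 6)))      # ['', 'ab', 'abab', 'ababab']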

Deterministic Finite Automata (DFA)

We now begin the machine view of processing a string over the alphabet Σ. The machine has a semi-infinite tape of squares holding one alphabet symbol per square. The machine has a finite state set, K, with a known start state. Initially we want the processing to be deterministic; that is, there is only one possible outcome from processing a string. Here is how we process it:
- place the string on the tape with one symbol in each square
- place the machine in the start state and the read head on the first square
- a computation step is done by considering the pair (current state, current tape symbol) and, based on the value of this pair, moving to a new state and moving the tape head one square to the right
- stop the machine when there are no more symbols to process
According to this description, the state change operation is a function K × Σ → K. There are a number of interpretations we can give to this processing method, but the one of most interest to us is that of accepting the input string based on the terminal state. Because of the deterministic behavior, we can say more strongly that it is deciding the string, as to whether it belongs to the language or not. The natural interpretation is that an accepted string's terminal state belongs to a set of final, or accepting, states F ⊆ K.

The description of a DFA M = (K, Σ, δ, s, F) is that which is defined in the textbook.

We must define precisely what it means to compute an input string. Because the machine never goes back, and stops after the last symbol, we can characterize the machine state as a configuration, which is an element of K × Σ* representing (current machine state, remaining portion of string to process). We define the binary relation ⊢, the "yields in one step" relation:
⊢ ⊆ (K × Σ*) × (K × Σ*)
It is defined as follows: for all σ ∈ Σ, (p, σw) ⊢ (q, w) for all w ∈ Σ* if and only if δ(p, σ) = q.
This makes rigorous several notions about processing in a DFA:
- symbols are processed from left to right, each one processed only once
- the state change consults only the current symbol (what is ahead in the string is irrelevant)
Define ⊢* to be the reflexive, transitive closure of ⊢; this is called the yields relation.

In a DFA, computing the string w means putting the machine in the start configuration (s, w). A terminal state q is one in which (s, w) ⊢* (q, ε). Because the state transition is a function, there is exactly one state p such that (q, w) ⊢* (p, ε). It is sometimes convenient to express the state transition as a function:
δ*: K × Σ* → K, where δ*(q, w) = p if and only if (q, w) ⊢* (p, ε)
Because of the nature of the DFA, this function is well-defined.

Acceptance

We say the string w is accepted if, proceeding from the start state, the terminal state we reach is final; concisely, w is accepted if δ*(s, w) ∈ F. Because of the determinism, it is sometimes said that w is decided by the DFA.

The language accepted by a DFA is the set of all strings it accepts.

DFA and Regular Language Equivalence

One of the main goals of Chapter 2 is to show the following:

Theorem: A language is accepted by a DFA if and only if it is a regular language (i.e., has a regular expression).

Graphical representation of DFA

Finite automata lend themselves readily to graphical interpretation.

Each state is a node in the graph; the start state and the final states are drawn with distinguishing markers (figures omitted). A transition δ(p, σ) = q is represented by the labelled edge (p, σ, q). Although it seems obvious, we should still state the equivalence of the two representations, in that state transitions by a string correspond to paths in the graph:

Claim: (q0, σ1, q1), (q1, σ2, q2), ..., (qn-1, σn, qn) are labelled edges if and only if (q0, σ1σ2...σn) ⊢* (qn, ε).

The empty path (no labelled edges) corresponds to the empty string (no symbols).

Example 2.1.1 in textbook

This is the even-parity checker for the number of b's in a string; i.e., the machine accepts the language L = { w : the number of b's in w is even }. Accepted strings include ε and the strings in a* and a*ba*b.

We'll soon see the construction of an RE from a DFA, but this one is easy. You can see two looping paths from the start/final state back to itself, by either a or ba*b. The choice (a ∪ ba*b) represents one loop going by the two general "paths". We can repeat this 0 or more times, getting (a ∪ ba*b)* as our RE for the DFA.

Switching final and non-final states gives us a DFA which represents the complement of the previous language: { w : the number of b's in w is odd }. Note that the RE doesn't "complement" in any easy way.

Example 2.1.2 in textbook

L = { w : w does not contain the substring bbb }. The state q3 is called a dead state, because no paths beyond this point can reach a final state. Again, switching final and non-final states constructs the DFA for the complement language: { w : w contains the substring bbb }. It is also easy to see that a regular expression for this complement is (a ∪ b)*bbb(a ∪ b)*, since we only need to find the substring bbb somewhere in the string, not necessarily its first occurrence, which is what the DFA does.

DFAs for substring acceptance

The previous example can be generalized. Consider the language { w ∈ Σ* : w contains the substring u }. This is very easily expressed by the regular expression (Σ*)u(Σ*). To generate a DFA, write u = σ1...σn and draw the partial DFA representing the "success" transitions: (q0, σ1, q1), (q1, σ2, q2), ..., (qn-1, σn, qn), with start state q0 and final state qn.

What is missing are the failure transitions (qi-1, σ, ??) for σ ≠ σi. We don't always go back to the start state when we fail. In general, look at the failure string y = σ1...σi-1 σ, where σ ≠ σi. What we want to find is the string x where
x = the largest suffix of the failure string y which is also a prefix of the success string u.
Take this string x and run the DFA from the start state to some state p, and add the failure transition (qi-1, σ, p).
Here is another example with Σ = {a, b}: L = { w : w contains the substring abaa }. Consider the failure string at q3: abab. Observe that, for x = ab,
ab·x = abab, the failure string, and
x·aa = abaa, the success string,
and that x is the largest possible string satisfying this requirement. Therefore, to get the target state of the failure transition, run x from the start state, thereby adding (q3, b, q2). Completing this procedure, we get the full DFA.
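The recipe above can be carried out mechanically. A Python sketch that builds the substring DFA, computing each failure transition by taking the largest suffix of the failure string that is also a prefix of u (the function name is ours):

    def substring_dfa(u, alphabet):
        """DFA for { w : w contains the substring u }; state i means
        'the last i symbols read match u[:i]'."""
        n = len(u)
        delta = {}
        for i in range(n):
            for a in alphabet:
                if a == u[i]:
                    delta[(i, a)] = i + 1      # success transition
                else:
                    y = u[:i] + a              # the failure string
                    k = i                      # largest suffix of y that is a prefix of u
                    while k > 0 and y[-k:] != u[:k]:
                        k -= 1
                    delta[(i, a)] = k          # failure transition
        for a in alphabet:
            delta[(n, a)] = n                  # the accepting state is absorbing
        return delta, 0, {n}

    delta, start, finals = substring_dfa("abaa", "ab")
    print(delta[(3, "b")])   # 2: the failure transition (q3, b, q2) derived above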

Complement

We will state precisely one concept that we have been suggesting in the above examples.

Theorem: If a language L is accepted by a DFA, then there is a derived DFA which accepts the complement Σ* − L.

Given the DFA (K, Σ, δ, s, F), the derived DFA is (K, Σ, δ, s, K − F). Namely, the final state set of the derived DFA is the complement of the final state set of the original DFA.

Product Construction: intersection and union

Theorem: If languages L1 and L2 over Σ are accepted by DFAs, then there are derived DFAs which accept the intersection L1 ∩ L2 and the union L1 ∪ L2.

Given the DFAs (K1, Σ, δ1, s1, F1) and (K2, Σ, δ2, s2, F2) for L1 and L2, respectively, the derived DFAs are of the form (K1 × K2, Σ, δ, (s1, s2), F), where
δ((p, q), σ) = (δ1(p, σ), δ2(q, σ))
and F is either
F = F1 × F2 for the intersection, or
F = (F1 × K2) ∪ (K1 × F2) for the union.
Intuitively, the idea is to run the two DFAs simultaneously to a terminal state (q1, q2) and then accept the string if (for the intersection) both q1 and q2 are final in their respective DFAs, or (for the union) at least one of q1 and q2 is final in its respective DFA.

Intersection Example

Find a DFA for the language over {a, b}: { w : w has an even number of b's and does not contain the substring bb }. Here are the two languages and their DFAs:
{ w : w has an even number of b's }
{ w : w does not contain the substring bb }
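The product construction translates directly into code. A minimal Python sketch, with each DFA given as a tuple (states, delta, start, finals) over a shared alphabet; the representation and the two small machines below are ours, chosen to match the intersection example that follows:

    def product_dfa(d1, d2, mode="intersection"):
        """Run two DFAs in lockstep on paired states."""
        (Q1, t1, s1, F1), (Q2, t2, s2, F2) = d1, d2
        alphabet = {a for (_, a) in t1}
        states = {(p, q) for p in Q1 for q in Q2}
        delta = {((p, q), a): (t1[(p, a)], t2[(q, a)])
                 for (p, q) in states for a in alphabet}
        if mode == "intersection":
            finals = {(p, q) for (p, q) in states if p in F1 and q in F2}
        else:  # union
            finals = {(p, q) for (p, q) in states if p in F1 or q in F2}
        return states, delta, (s1, s2), finals

    even_b = ({"A", "B"},
              {("A", "a"): "A", ("A", "b"): "B",
               ("B", "a"): "B", ("B", "b"): "A"}, "A", {"A"})
    no_bb = ({"x", "y", "z"},
             {("x", "a"): "x", ("x", "b"): "y", ("y", "a"): "x",
              ("y", "b"): "z", ("z", "a"): "z", ("z", "b"): "z"}, "x", {"x", "y"})
    Q, d, s, F = product_dfa(even_b, no_bb)
    print(len(Q))   # 6 states, corresponding to Ax, Ay, Az, Bx, By, Bz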

The intersection construction gives us a machine with 6 states, {Ax, Ay, Az, Bx, By, Bz}, derived from all state pairs. The hard part is usually figuring out how to meaningfully draw the constructed DFA. We can lay the states out in a 2x3 grid, but the crossing of the edges tends to obscure any simple sense of the behavior. Here is one possible rendering:

States Az and Bz are both dead and can effectively be replaced by a single state.

Minimization

Reducing the number of states is a step towards minimizing the DFA. There is a procedure for doing so, and it concerns finding states which are equivalent in the sense that the set of strings which lead to final states starting from any of them is the same. In the case of a dead state, the set of such strings is simply ∅.

Equivalent states can be replaced by a single state.

Conversion to Deterministic Automata:

(See the treatment of non-deterministic automata and the equivalence of deterministic and non-deterministic fsa below.)

ε-Closures, Regular Expressions and Finite Automata:

A regular expression specifies a language

The regular languages are those languages specified by regular expressions.

Where a is any single symbol in Σ and E, F are regular expressions, a regular expression is one of:
a, (E), EF (concatenation), E | F (alternation), E* (Kleene star)

Example: 01* | 00 is the regular expression denoting strings beginning with 0 followed by any number of 1s, or 0 followed by a single 0.

Terminology: finite automaton = finite state automaton. The class is sometimes called FSA, or just FA. A language accepted by an fsa is called a finite state language. These are also the regular languages, but we use a separate definition, later, to define regular language, and then we prove the equivalence of the two classes.
We will also consider Chomsky formal grammars and the Chomsky hierarchy. We can prove that a particular class of formal grammars, the Type 3 grammars, defines the same class of languages, the regular languages. We will thus have three independent characterizations of the same class of languages.
An automaton may be deterministic or non-deterministic. We will first define deterministic fsa, then non-deterministic, and then show that for fsa (this is not true for some other classes of automata) the two subclasses are equivalent with respect to the class of languages accepted.
State diagrams: an example illustrating how dfa work [fig 17-2, p. 456]. Transitions are of the form (qi, a, qj).

Definition. A deterministic finite automaton (dfa) M is a 5-tuple (K, A, qin, F, δ), where
- K is a finite set of states
- A is an alphabet
- qin ∈ K is the initial state
- F ⊆ K is the set of final states
- δ: K × A → K is the transition function (or next-state function).
(Notation: we use A for the alphabet where PtMW use Σ, and qin for the initial state where PtMW use q0.)
What makes this a deterministic fsa is that δ must be a function: for each state and symbol, there is exactly one transition to a next state. Automata accept some strings (and don't accept others).
Definition. Given a dfa M, a string x ∈ A*, x = a1a2...an, ai ∈ A, n ≥ 0, is accepted by M iff there exists a sequence of states q1, q2, ..., qn, qn+1 such that:
1) q1 = qin is the initial state,
2) qn+1 ∈ F is a final state,
3) if x is not empty (i.e., n ≥ 1), then δ(qi, ai) = qi+1. In the case n = 0 the string x is empty, x = e, and for e to be accepted by M it is enough that the first two conditions, (1) and (2), hold, i.e., q1 = qin and q1 ∈ F.
The language L(M) accepted by a dfa M is the set of all strings accepted by M.

Non-deterministic fsa (nfa).
Two in-principle weakenings of the requirements, and two more that are optional but commonly included:
(i) for a given state-symbol pair, possibly more than one next state [this is THE crucial one];
(ii) for a given state-symbol pair, possibly no next state [this could always be modelled by adding a dead-end state];
(iii) allowing a transition of the form (qi, w, qj) where w ∈ A*, i.e., being able to read a string of symbols in one move, not only a single symbol; and, as a noteworthy subcase of that,
(iv) allowing a transition of the form (qi, e, qj): changing state without reading a symbol.
Example: fig 17-3, p. 459. A string is accepted by a non-deterministic fa if there is some path through the state diagram which begins in the initial state, reads the entire string, and ends in a final state.
Formal definition of a non-deterministic fa: just like the formal definition of a dfa, except that in place of the transition function δ there is a transition relation Δ, a finite subset of K × A* × K (i.e., the set of transitions of the form (qi, w, qj) where w ∈ A*).
The definition of acceptance of a string is similar to the one for a dfa.
Definition. Given an nfa M, a string x ∈ A* is accepted by M iff there exist a sequence of strings w1, w2, ..., wn, wi ∈ A*, n ≥ 0, such that x = w1w2...wn, and a sequence of states q1, q2, ..., qn, qn+1 such that:
1) q1 = qin is the initial state,
2) qn+1 ∈ F is a final state,
3) if n ≥ 1, then (qi, wi, qi+1) ∈ Δ. In the case when the string x is empty, x = e, for e to be accepted by M it is enough that the first two conditions, (1) and (2), hold, i.e., q1 = qin and q1 ∈ F.
Equivalence of deterministic and non-deterministic fsa. This is a major result; it is not self-evident. The algorithm for constructing an equivalent deterministic fsa, given a non-deterministic one, is a bit complex and we won't carry it out here; in the worst case it may give a dfa with

2^n states corresponding to an nfa with n states. (And that presupposes that we take the narrower definition of nfa, with weakenings (i) and (ii) but not (iii) or (iv).)
Why it is useful to have both notions: the deterministic fa are conceptually more straightforward, but in a given case it is often easier to construct a non-deterministic fa. Also, for some other classes of automata that we will consider, the two subclasses are not equivalent, so the notions remain important.
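For the narrower definition (weakenings (i) and (ii) only, no string moves or e-moves), the construction is short enough to sketch: each state of the dfa is a set of nfa states, and only reachable sets are generated. The names and the example nfa below are ours:

    from itertools import chain

    def subset_construction(nfa_delta, start, finals, alphabet):
        """Each dfa state is a frozenset of nfa states; worst case 2**n of them."""
        d_start = frozenset([start])
        dfa, seen, todo = {}, {d_start}, [d_start]
        while todo:
            S = todo.pop()
            for a in alphabet:
                T = frozenset(chain.from_iterable(
                        nfa_delta.get((q, a), ()) for q in S))
                dfa[(S, a)] = T
                if T not in seen:
                    seen.add(T)
                    todo.append(T)
        d_finals = {S for S in seen if S & set(finals)}
        return dfa, d_start, d_finals

    # Example nfa: strings over {0,1} whose second-to-last symbol is 1.
    nfa = {("p", "0"): {"p"}, ("p", "1"): {"p", "q"},
           ("q", "0"): {"r"}, ("q", "1"): {"r"}}
    dfa, s, F = subset_construction(nfa, "p", {"r"}, "01")
    print(len({S for (S, _) in dfa}))   # 4 reachable dfa states here, not 2**3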

Theorem (Kleene). A set of strings is a finite automaton language iff it is a regular language.
We can sketch one half of the proof by showing how to construct a finite state automaton corresponding to any given regular expression. (See pp. 464-468.) Steps in the proof:
i. The empty language is a fal (finite automaton language).
ii. The unit language {a} for every symbol a in the alphabet is a fal.
iii. fals are closed under union.
iv. fals are closed under concatenation.
v. fals are closed under the Kleene star operation.

Minimization Of Automata:

One important result on finite automata, both theoretically and practically, is that for any regular language there is a unique DFA having the smallest number of states that accepts it. Let M = (Q, Σ, q0, δ, A) be a DFA that accepts a language L. Then the following algorithm produces the DFA, denote it by M1, that has the smallest number of states among the DFAs that accept L.

Minimization Algorithm for DFA

Construct a partition π = { A, Q − A } of the set of states Q;
π_new := new_partition(π);
while (π_new ≠ π):
    π := π_new;
    π_new := new_partition(π);
π_final := π;

function new_partition(π):
    for each set S of π, partition S into subsets such that two states p and q of S are in the same subset of S if and only if, for each input symbol, p and q make a transition to (states of) the same set of π.

The subsets thus formed are sets of the output partition in place of S. If S is not partitioned in this process, S remains in the output partition.
The minimum DFA M1 is constructed from π_final as follows:
- Select one state in each set of the partition π_final as the representative for that set. These representatives are the states of the minimum DFA M1.
- Let p and q be representatives, i.e., states of the minimum DFA M1. Let us also denote by p and q the sets of states of the original DFA M represented by p and q, respectively. Let s be a state in p and t a state in q. If a transition from s to t on symbol a exists in M, then the minimum DFA M1 has a transition from p to q on symbol a.
- The start state of M1 is the representative which contains the start state of M.
- The accepting states of M1 are the representatives that are in A. Note that each set of π_final is either a subset of A or disjoint from A.

Remove from M1 the dead states and the states not reachable from the start state, if there are any. Any transitions to a dead state become undefined. A state is a dead state if it is not an accepting state and has no outgoing transitions except to itself.
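The partition-refinement loop translates almost literally into code. In the Python sketch below, the transition table is a guess consistent with the prose of Example 1 (the figure is missing, and state 2's a-transition in particular is assumed); the function name and representation are ours:

    def minimize(Q, delta, accepting, alphabet):
        """Refine {A, Q - A} until no set splits; returns the final partition."""
        partition = [set(accepting), set(Q) - set(accepting)]
        partition = [p for p in partition if p]
        while True:
            def block_of(q):
                return next(i for i, b in enumerate(partition) if q in b)
            new = []
            for block in partition:
                groups = {}
                for q in block:   # states agreeing on the target block for every symbol stay together
                    sig = tuple(block_of(delta[(q, a)]) for a in alphabet)
                    groups.setdefault(sig, set()).add(q)
                new.extend(groups.values())
            if len(new) == len(partition):
                return new
            partition = new

    delta = {(1, "a"): 3, (1, "b"): 2,
             (5, "a"): 3, (5, "b"): 2,   # 1 and 5 make the same transitions
             (2, "a"): 4, (2, "b"): 1,   # 2's a-transition is assumed
             (3, "a"): 5, (3, "b"): 4,
             (4, "a"): 4, (4, "b"): 4}
    print(minimize({1, 2, 3, 4, 5}, delta, {2, 3, 4}, "ab"))
    # [{2}, {3}, {4}, {1, 5}] in some order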

Example 1: Let us try to minimize the number of states of the following DFA.

Initially π = { {1, 5}, {2, 3, 4} }.

new_partition is applied to π. Since on b state 2 goes to state 1, state 3 goes to state 4, and 1 and 4 are in different sets in π, states 2 and 3 are going to be separated from each other in the new π. Also, since on a state 4 goes to state 4, state 3 goes to state 5, and 4 and 5 are in different sets in π, states 3 and 4 are going to be separated from each other in the new π. Further, since on b state 2 goes to 1, state 4 goes to 4, and 1 and 4 are in different sets in π, states 2 and 4 are separated from each other in the new π. On the other hand, 1 and 5 make the same transitions, so they are not going to be split.

Thus the new partition is { {1, 5}, {2}, {3}, {4} }. This becomes π in the second iteration.

When new_partition is applied to this new π, since 1 and 5 make the same transitions, π remains unchanged. Thus π_final = { {1, 5}, {2}, {3}, {4} }.

Select 1 as the representative for {1, 5}. Since the rest are singletons, they have the obvious representatives. Note here that state 4 is a dead state, because the only transition out of it is to itself. Thus the set of states for the minimized DFA is {1, 2, 3}. For the transitions: since 1 goes to 3 on a and to 2 on b in the original DFA, in the minimized DFA transitions are added from 1 to 3 on a and from 1 to 2 on b. Also, since 2 goes to 1 on b and 3 goes to 1 on a in the original DFA (after merging 5 into its representative 1), in the minimized DFA transitions are added from 2 to 1 on b and from 3 to 1 on a. Since the rest of the states are singletons, all transitions between them are inherited by the minimized DFA.

Thus the minimized DFA is as given in the following figure:

Example 2: Let us try to minimize the number of states of the following DFA.

Initially π = { {3}, {1, 2, 4, 5, 6} }. By applying new_partition to this π, the new π = { {3}, {1, 4, 5}, {2, 6} } is obtained. Applying new_partition to this π, the new π = { {3}, {1, 4}, {5}, {2}, {6} } is obtained. Applying new_partition again, π = { {1}, {2}, {3}, {4}, {5}, {6} } is obtained. Thus the number of states of the given DFA is already minimal, and it cannot be reduced any further.

Moore and Mealy machine and their equivalence:

Example

Q3. Derive a minimal state table for a single-input and single-output Moore-type FSM that produces an output of 1 if it detects either the pattern 110 or the pattern 101 in the input sequence. Overlapping sequences should be detected. (Show the detailed steps of your solution.)

Unit III

Pumping lemma for regular sets, closure properties of regular sets, decision properties for regular sets, equivalence between regular languages and regular grammars. Context-free languages: parse trees and ambiguity, reduction of CFGs, Chomsky and Greibach normal forms.

Pumping lemma for regular sets:

Another view of the concept of language: not the formalization of the notion of effective procedure, but a set of words satisfying a given set of rules. Origin: formalization of natural language.
Example:
1. a phrase is of the form subject verb
2. a subject is a pronoun
3. a pronoun is "he" or "she"
4. a verb is "sleeps" or "listens"
Possible phrases:
1. he listens
2. he sleeps
3. she sleeps
4. she listens

Grammars:
- Grammar: a generative description of a language.
- Automaton: an analytical description.
- Example: programming languages are defined by a grammar (BNF), but recognized with an analytical description (the parser of a compiler).
- Language theory establishes links between analytical and generative language descriptions.

Context-free language:

Context-free grammars (CFG) and languages (CFL)

Goals of this chapter: CFGs and CFLs as models of computation that define the syntax of hierarchical formal notations as used in programming or markup languages. Recursion is the essential feature that distinguishes CFGs and CFLs from FAs and regular languages. Properties, strengths and weaknesses of CFLs. Equivalence of CFGs and NPDAs. Non-equivalence of deterministic and non-deterministic PDAs. Parsing. Context-sensitive grammars (CSG).

Context-free grammars and languages (CFG, CFL)

Algol 60 pioneered CFGs and CFLs to define the syntax of programming languages (Backus-Naur Form).
Ex: arithmetic expression E, term T, factor F, primary P, a-op A = {+, -}, m-op M = {×, /}, exp-op ^:
E -> T | E A T | A T
T -> F | T M F
F -> P | F ^ P
P -> unsigned number | variable | function designator | ( E )
[Notice the recursion: E ->* ( E ).]
Ex: recursive data structures and their traversals. Binary tree T, leaf L, node N:
T -> L | N T T (prefix) or T -> L | T N T (infix) or T -> L | T T N (suffix).
These definitions can be turned directly into recursive traversal procedures, e.g.:
procedure traverse (p: ptr);
begin if p <> nil then begin visit(p); traverse(p.left); traverse(p.right); end; end;
Df CFG: G = (V, A, P, S). V: non-terminal symbols, variables; A: terminal symbols; S ∈ V: start symbol, "sentence"; P: set of productions or rewriting rules of the form X -> w, where X ∈ V, w ∈ (V ∪ A)*.
Rewriting step: for u, v, x, y, y', z ∈ (V ∪ A)*: u -> v iff u = xyz, v = xy'z and y -> y' ∈ P.
Derivation: ->* is the transitive, reflexive closure of ->, i.e., u ->* v iff there exist w0, w1, ..., wk with k ≥ 0 and u = w0, wj-1 -> wj, wk = v.
L(G), the context-free language generated by G: L(G) = { w ∈ A* | S ->* w }.
Ex: symmetric structures: L = { 0^n 1^n | n ≥ 0 }, or even-length palindromes L0 = { w w^reversed | w ∈ {0, 1}* }.
G(L) = ({S}, {0, 1}, {S -> 0S1, S -> ε}, S); G(L0) = ({S}, {0, 1}, {S -> 0S0, S -> 1S1, S -> ε}, S).
Palindromes (length even or odd): L1 = { w | w = w^reversed }. G(L1): add the rules S -> 0, S -> 1 to G(L0).
Ex: parenthesis expressions: V = {S}, T = { (, ), [, ] }, P = { S -> ε, S -> (S), S -> [S], S -> SS }.
Sample derivation: S -> SS -> SSS ->* ()[S][ ] -> ()[SS][ ] ->* ()[()[ ]][ ].
The rule S -> SS makes this grammar ambiguous. Ambiguity is undesirable in practice, since the syntactic structure is generally used to convey semantic information.

Ex: ambiguous structures in natural languages: "Time flies like an arrow" vs. "Fruit flies like a banana"; "Der Gefangene floh" (the prisoner escaped) vs. "Der gefangene Floh" (the captured flea).
Bad news: there exist CFLs that are inherently ambiguous, i.e., every grammar for them is ambiguous (see Exercise). Moreover, the problem of deciding whether a given CFG G is ambiguous or not is undecidable.
Good news: for practical purposes it is easy to design unambiguous CFGs.
Exercise:
a) For the Algol 60 grammar G (simple arithmetic expressions) above, explain the purpose of the rule E -> A T and show examples of its use. Prove or disprove: G is unambiguous.
b) Construct an unambiguous grammar for the language of parenthesis expressions above.
c) The ambiguity of the "dangling else". Several programming languages (e.g., Pascal) assign to nested if-then[-else] statements an ambiguous structure. It is then left to the semantics of the language to disambiguate. Let E denote a Boolean expression, S a statement, and consider the two rules: S -> if E then S, and S -> if E then S else S. Discuss the trouble with this grammar, and fix it.
d) Give a CFG for L = { 0^i 1^j 2^k | i = j or j = k }. Try to prove: L is inherently ambiguous.

Equivalence of CFGs and NPDAs

Thm (CFG ~ NPDA): L ⊆ A* is CF iff there is an NPDA M that accepts L.
Pf ->: Given a CFL L, consider any grammar G(L) for L. Construct an NPDA M that simulates all possible derivations of G. M is essentially a single-state FSM, with a state q that applies one of G's rules at a time. The start state q0 initializes the stack with the content S#, where S is the start symbol of G and # (say) is the bottom-of-stack symbol. This initial stack content means that M aims to read an input that is an instance of S. In general, the current stack content is a sequence of symbols that represent tasks to be accomplished in the characteristic LIFO order (last-in first-out). The task on top of the stack, say a non-terminal X, calls for the next characters of the input string to be an instance of X. When these characters have been read and verified to be an instance of X, X is popped from the stack, and the new task on top of the stack is started. When # is on top of the stack, i.e., the stack is empty, all tasks generated by the first instance of S have been successfully met, i.e., the input string read so far is an instance of S. M moves to the accept state and stops.
The following transitions lead from q to q:
1) ε, X -> w for each rule X -> w. When X is on top of the stack, replace X by a right-hand side for X.
2) a, a -> ε for each a ∈ A. When terminal a is read as input and a is also on top of the stack, pop the stack.
Rule 1 reflects the following fact: one way to meet the task of finding an instance of X as a prefix of the input string not yet read is to solve all the tasks, in the correct order, present in the right-hand side w of the production X -> w. M can be considered to be a non-deterministic parser for G. A formal proof that M accepts precisely L can be done by induction on the length of the derivation of any w ∈ L. QED

Pf <-: Given an NPDA M (w.l.o.g. with a single accept state, accepting with an empty stack), construct a CFG G as follows. For each pair of states p, q of M introduce a non-terminal Vpq; L(Vpq) = { w | Vpq ->* w } will be the language of all strings that can be derived from Vpq according to the productions of the grammar G to be constructed. In particular, L(Vsf) = L(M), where s is the starting state and f the accepting state of M.
Invariant: Vpq generates all strings w that take M from p with an empty stack to q with an empty stack.
The idea is to relate all Vpq to each other in a way that reflects how labeled paths and subpaths through M's state space relate to each other. LIFO stack access implies: any w ∈ L(Vpq) will lead M from p to q regardless of the stack content at p, and leave the stack at q in the same condition as it was at p. Different w's ∈ L(Vpq) may do this in different ways, which leads to different rules of G:
1) The stack may be empty only in p and in q, never in between. If so, w = a v b, for some a, b ∈ A, v ∈ A*, and M includes the transitions (p, a, ε) -> (r, t) and (s, b, t) -> (q, ε). Add the rules: Vpq -> a Vrs b.
2) The stack may be empty at some point between p and q, in some state r. For each triple p, q, r ∈ Q, add the rules: Vpq -> Vpr Vrq.
3) For each p ∈ Q, add the rule Vpp -> ε.
The figure at left illustrates Rule 1, at right Rule 2. If M includes the transitions (p, a, ε) -> (r, t) and (s, b, t) -> (q, ε), then one way to lead M from p to q with identical stack content at the start and the end of the journey is to break the trip into three successive parts: 1) read a symbol a and push t; 2) travel from r to s with identical stack content at the start and the end of this sub-journey; 3) read a symbol b and pop t.
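The rule generation is mechanical enough to write down directly. Below is a sketch in Python (the tuple encodings of push and pop transitions are our own convention): Rule 1 pairs every push transition with every pop transition on the same stack symbol, and Rules 2 and 3 are added for all states:

def npda_to_cfg(states, pushes, pops):
    rules = []
    for (p, a, r, t1) in pushes:              # (p, a, eps) -> (r, push t1)
        for (s, b, t2, q) in pops:            # (s, b, t2)  -> (q, pop)
            if t1 == t2:
                rules.append((f"V{p}{q}", [a, f"V{r}{s}", b]))       # Rule 1
    for p in states:
        for r in states:
            for q in states:
                rules.append((f"V{p}{q}", [f"V{p}{r}", f"V{r}{q}"])) # Rule 2
    for p in states:
        rules.append((f"V{p}{p}", []))        # Rule 3: Vpp -> eps
    return rules

# NPDA for { 0^k 1^k | k >= 1 }: read 0s in p pushing t, read 1s popping t into q.
for rule in npda_to_cfg({"p", "q"},
                        [("p", "0", "p", "t")],
                        [("p", "1", "t", "q"), ("q", "1", "t", "q")]):
    print(rule)
# With start symbol Vpq, the printed rules generate exactly { 0^k 1^k | k >= 1 }.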

Normal forms
When trying to prove that all objects in some class C have a given property P, it is often useful to first prove that each object O in C can be transformed to some equivalent object O' in some subclass C' of C. Here, "equivalent" implies that the transformation preserves the property P of interest. Thereafter, the argument can be limited to the subclass C', taking advantage of any additional properties this subclass may have.
Any CFG can be transformed into a number of normal forms (NF) that are (almost!) equivalent. Here, "equivalent" means that the two grammars define the same language, and the proviso "almost" is necessary because these normal forms cannot generate the null string ε.

Chomsky normal form (right-hand sides are short):
All rules are of the form X -> Y Z or X -> a, for some non-terminals X, Y, Z ∈ V and terminal a ∈ A.
Thm: Every CFG G can be transformed into a Chomsky NF G' such that L(G') = L(G) - {ε}.
Pf idea: repeatedly replace a rule X -> v w, |v| ≥ 1, |w| ≥ 2, by X -> Y Z, Y -> v, Z -> w, where Y and Z are new non-terminals used only in these new rules. Both right-hand sides v and w are shorter than the original right-hand side v w.
The Chomsky NF changes the syntactic structure of L(G), an undesirable side effect in practice. But Chomsky NF turns all syntactic structures into binary trees, a useful technical device that we exploit in later sections on the pumping lemma and the CYK parsing algorithm.

Greibach normal form (at every step, produce 1 terminal symbol at the far left - useful for parsing):
All rules are of the form X -> a w, for some terminal a ∈ A and some w ∈ V*.
Thm: Every CFG G can be transformed into a Greibach NF G' such that L(G') = L(G) - {ε}.
Pf idea: for a rule X -> Y w, ask whether Y can ever produce a terminal at the far left, i.e. Y ->* a v. If so, replace X -> Y w by rules such as X -> a v w. If not, X -> Y w can be omitted, as it will never lead to a terminating derivation.

The pumping lemma for CFLs
Recall the pumping lemma for regular languages, a mathematically precise statement of the intuitive notion "an FSM can count at most up to some constant n". It says that for any regular language L, any sufficiently long word w in L can be split into 3 parts, w = x y z, such that all strings x y^k z, for any k ≥ 0, are also in L.
PDAs, which correspond to CFGs, can count arbitrarily high - though essentially in unary notation, i.e. by storing k symbols to represent the number k. But the LIFO access limitation implies that the stack can only be used to represent one single independent counter at a time. To understand what "independent" means, consider a PDA that recognizes a language of balanced parenthesis expressions, such as ((([[..]]))). This task clearly calls for an arbitrary number of counters to be stored at the same time, each one dedicated to counting its own subexpression. In the example above, the counter for ((( must be saved when the counter for [[ is activated. Fortunately, balanced parentheses are nested in such a way that changing from one counter to another matches the LIFO access pattern of a stack - when a counter, run down to 0, is no longer needed, the next counter on top of the stack is exactly the next one to be activated. Thus, the many counters coded into the stack interact in a controlled manner; they are not independent.
The pumping lemma for CFLs is a precise statement of this limitation.
It asserts that every long word in L serves as a seed that generates an infinity of related words that are also in L.
Thm: For every CFL L there is a constant n such that every z ∈ L of length |z| ≥ n can be written as z = u v w x y such that the following holds: 1) |v x| ≥ 1, 2) |v w x| ≤ n, and 3) u v^k w x^k y ∈ L for all k ≥ 0.
Pf: Given CFL L, choose any G = G(L) in Chomsky NF. This implies that the parse tree of any z ∈ L is a binary tree, as shown in the figure below at left. The length n of the string at the leaves and the height h of a binary tree are related by h ≥ log2 n, i.e. a long string requires a tall parse tree. By choosing the critical length n = 2^(|V|+1) we force the height of the parse trees considered to be h ≥ |V| + 1. On a root-to-leaf path of length |V| + 1 we encounter at least |V| + 1 nodes labeled by non-terminals. Since G has only |V| distinct non-terminals, this implies that on some long root-to-leaf path we must encounter 2 nodes labeled with the same non-terminal, say W, as shown at right.

For two such occurrences of W (in particular, the two lowest ones), and for some u, v, w, x, y ∈ A*, we have: S ->* u W y, W ->* v W x and W ->* w. But then we also have W ->* v^2 W x^2, and in general W ->* v^k W x^k, and S ->* u v^k W x^k y and S ->* u v^k w x^k y for all k ≥ 0. QED

For problems where intuition tells us "a PDA can't do that", the pumping lemma is often the perfect tool needed to prove rigorously that a language is not CF. For example, intuition suggests that neither of the languages L1 = { 0^k 1^k 2^k | k ≥ 0 } or L2 = { w w | w ∈ {0, 1}* } is recognizable by some PDA.
For L1, a PDA would have to count up the 0s, then count down the 1s to make sure there are equally many 0s and 1s. Thereafter, the counter is zero, and although we can count the 2s, we can't compare that number to the number of 0s, or of 1s - that information is now lost.
For L2, a PDA would have to store the first half of the input, namely w, and compare that to the second half to verify that the latter is also w. Whereas this worked trivially for palindromes, w w^reversed, the order w w is the worst case possible for LIFO access: although the stack contains all the information needed, we can't extract the info we need at the time we need it. The pumping lemma confirms these intuitive judgements.

Ex 1: L1 = { 0^k 1^k 2^k | k ≥ 0 } is not context free.
Pf (by contradiction): Assume L1 is CF, let n be the constant asserted by the pumping lemma. Consider z = 0^n 1^n 2^n = u v w x y. Although we don't know where v w x is positioned within z, the assertion |v w x| ≤ n implies that v w x contains at most two distinct letters among 0, 1, 2. In other words, one or two of the three letters 0, 1, 2 are missing in v w x. Now consider u v^2 w x^2 y. By the pumping lemma, it must be in L1. The assertion |v x| ≥ 1 implies that u v^2 w x^2 y is longer than u v w x y. But u v w x y had an equal number of 0s, 1s, and 2s, whereas u v^2 w x^2 y cannot, since only one or two of the three distinct symbols increased in number. This contradiction proves the thm.

Ex 2: L2 = { w w | w ∈ {0, 1}* } is not context free.
Pf (by contradiction): Assume L2 is CF, let n be the constant asserted by the pumping lemma. Consider z = 0^(n+1) 1^(n+1) 0^(n+1) 1^(n+1) = u v w x y. Using k = 0, the lemma asserts z0 = u w y ∈ L2, but we show that z0 cannot have the form t t, for any string t, and thus that z0 ∉ L2, leading to a contradiction. Recall that |v w x| ≤ n, and thus, when we delete v and x, we delete symbols that are within a distance of at most n from each other. By analyzing three cases we show that, under this restriction, it is impossible to delete symbols in such a way as to retain the property that the shortened string z0 = u w y has the form t t. We illustrate this using the example n = 3, but the argument holds for any n.
Given z = 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1, slide a window of length n = 3 across z, and delete any characters you want from within the window. Observe that the blocks of 0s and of 1s within z are so long that the truncated z, call it z', still has the form 0s 1s 0s 1s. This implies that if z' can be written as z' = t t, then t must have the form t = 0s 1s. Checking the three cases: the window of length 3 lies entirely within the left half of z; the window straddles the center of z; and the window lies entirely within the right half of z - in none of these cases does z' have the form z' = t t, and thus z0 = u w y ∉ L2. QED
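For Ex 1 the contradiction can also be checked mechanically for a fixed n: the following brute-force Python sketch (parameters of our own choosing) tries every decomposition z = u v w x y with |v w x| ≤ n and |v x| ≥ 1 and verifies that pumping with k = 0 or k = 2 already leaves the language, i.e. no decomposition satisfies the lemma:

def in_L1(s):
    k = len(s) // 3
    return len(s) == 3 * k and s == "0" * k + "1" * k + "2" * k

def pumpable(z, n):
    for i in range(len(z)):                              # v w x starts at index i
        for j in range(i + 1, min(i + n, len(z)) + 1):   # ... and ends at index j
            u, vwx, y = z[:i], z[i:j], z[j:]
            for a in range(len(vwx) + 1):                # split v w x into v, w, x
                for b in range(a, len(vwx) + 1):
                    v, w, x = vwx[:a], vwx[a:b], vwx[b:]
                    if v + x and all(in_L1(u + v*k + w + x*k + y) for k in (0, 1, 2)):
                        return True
    return False

n = 7
print(pumpable("0"*n + "1"*n + "2"*n, n))   # False: no decomposition survives pumping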
Closure properties of the class of CFLs
Thm (CFL closure properties): The class of CFLs over an alphabet A is closed under the regular operations union, catenation, and Kleene star.
Pf: Given CFLs L, L' ⊆ A*, consider any grammars G, G' that generate L and L', respectively. Combine G and G' appropriately to obtain grammars for L ∪ L', L L', and L*. E.g., if G = (V, A, P, S), we obtain G(L*) = ( V ∪ {S0}, A, P ∪ { S0 -> S S0, S0 -> ε }, S0 ).
The proof above is analogous to the proof of closure of the class of regular languages under union, catenation, and Kleene star. There we combined two FAs into a single one using series, parallel, and loop combinations of FAs. But beyond the three regular operations, the analogy stops. For regular languages, we proved closure under complement by appealing to deterministic FAs as acceptors. For these, changing all accepting states to non-accepting, and vice versa, yields the complement of the language accepted. This reasoning fails for CFLs, because deterministic PDAs accept only a subclass of CFLs. For non-deterministic PDAs, changing accepting states to non-accepting, and vice versa, does not produce the complement of the language accepted. Indeed, closure under complement does not hold for CFLs.

Thm: The class of CFLs over an alphabet A is not closed under intersection and is not closed under complement.
We prove this theorem in two ways: first, by exhibiting two CFLs whose intersection is provably not CF, and second, by exhibiting a CFL whose complement is provably not CF.
Pf (intersection): Consider CFLs L0 = { 0^m 1^m 2^n | m, n ≥ 1 } and L1 = { 0^m 1^n 2^n | m, n ≥ 1 }. L0 ∩ L1 = { 0^k 1^k 2^k | k ≥ 1 } is not CF, as we proved in the previous section using the pumping lemma. This implies that the class of CFLs is not closed under complement. If it were, it would also be closed under intersection, because of the identity L ∩ L' = ¬( ¬L ∪ ¬L' ). But we also prove this result in a direct way by exhibiting a CFL L whose complement is not context free. L's complement is the notorious language L2 = { w w | w ∈ {0, 1}* }, which we have proven not context free using the pumping lemma.
Pf (complement): We show that L = { u | u is not of the form u = w w } is context free by exhibiting a CFG for L:
S -> Y | Z | Y Z | Z Y
Y -> 1 | 0 Y 0 | 0 Y 1 | 1 Y 0 | 1 Y 1
Z -> 0 | 0 Z 0 | 0 Z 1 | 1 Z 0 | 1 Z 1
The productions for Y generate all "odd strings", i.e. strings of odd length, with a 1 as center symbol. Analogously, Z generates all odd strings with a 0 as center symbol. Odd strings are not of the form u = w w, hence they are included in L by the productions S -> Y | Z. Now we show that the strings u of even length that are not of the form u = w w are precisely those of the form Y Z or Z Y.
First, consider a word of the form Y Z, such as the catenation of y = 1 1 0 1 0 0 0 and z = 1 0 1, where the center 1 of y and the center 0 of z are highlighted. Writing y z = 1 1 0 1 0 0 0 1 0 1 as the catenation of two strings of equal length, namely 1 1 0 1 0 and 0 0 1 0 1, shows that the former center symbols 1 of y and 0 of z have both become the 4-th symbol in their respective strings of length 5. Thus, they are a witness pair whose clash shows that y z ≠ w w for any w. This, and the analogous case for Z Y, show that the set of strings of the form Y Z or Z Y are in L.
Conversely, consider any even word u = a1 a2 .. aj .. ak b1 b2 .. bj .. bk which is not of the form u = w w. There exists an index j where aj ≠ bj, and we can take each of aj and bj as center symbol of its own odd string.
The following example shows a clashing pair at index j = 4: u = 1 1 0 0 1 1 1 0 1 1. Now u = 1 1 0 0 1 1 1 0 1 1 can be written as u = z y, where z = 1 1 0 0 1 1 1 ∈ Z and y = 0 1 1 ∈ Y. The figure below (schematic) labels the various string lengths and shows how they add up:
a1 a2 .. .. aj .. ak   b1 b2 .. .. bj .. bk

The word problem. CFL parsing in time O(n^3) by means of dynamic programming
Informally, the word problem asks: given G and w ∈ A*, decide whether w ∈ L(G).
More precisely: is there an algorithm that applies to any grammar G in some given class of grammars, and any w ∈ A*, to decide whether w ∈ L(G)?
Many algorithms solve the word problem for CFGs, e.g.: a) convert G to Greibach NF and enumerate all derivations of length |w| to see whether any of them generates w; or b) construct an NPDA M that accepts L(G), and feed w into M.
Ex1: L = { 0^k 1^k | k ≥ 1 }. G: S -> 01 | 0 S 1. Use 0 as a stack symbol to count the number of 0s.
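A direct stack recognizer for Ex1 - a deterministic PDA, sketched in Python with our own naming - pushes a 0 for every 0 read and pops one for every 1; it accepts iff the stack empties exactly at the end and at least one pair was seen:

def accepts(w):
    stack = []
    seen_one = False
    for c in w:
        if c == "0":
            if seen_one:            # a 0 after a 1: reject
                return False
            stack.append("0")       # count the 0s on the stack
        else:
            seen_one = True
            if not stack:           # more 1s than 0s: reject
                return False
            stack.pop()
    return seen_one and not stack   # k >= 1 and equally many 0s and 1s

print(accepts("000111"), accepts("0011"), accepts("001"))  # True True False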

Ex2: L = { w ∈ {0, 1}* | #0s = #1s }. G: S -> ε | 0 Y | 1 Z, Y -> 1 S | 0 Y Y, Z -> 0 S | 1 Z Z.
Invariant: Y generates any string with an extra 1, Z generates any string with an extra 0.
The production Z -> 0 S | 1 Z Z means that Z has two ways to meet its goal: either produce a 0 now and follow up with a string in S, i.e. with an equal number of 0s and 1s; or produce a 1 but create two new tasks Z.
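For instance, w = 0110 is derived by S -> 0 Y -> 0 1 S -> 0 1 1 Z -> 0 1 1 0 S -> 0 1 1 0; at every step, the pending non-terminal (Y, Z or S) records exactly the surplus still owed to the invariant.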

For CFGs there is a bottom-up algorithm (Cocke, Younger, Kasami) that systematically computes all possible parse trees of all contiguous substrings of the string w to be parsed, and works in time O(|w|^3). We illustrate the idea of the CYK algorithm using the following example:
Ex2a: L = { w ∈ {0, 1}+ | #0s = #1s }. G: S -> 0 Y | 1 Z, Y -> 1 | 1 S | 0 Y Y, Z -> 0 | 0 S | 1 Z Z.
We exclude the nullstring in order to convert G to Chomsky NF. For the sake of formality, introduce Y' that generates a single 1, and similarly Z' for a single 0. Shorten the right-hand sides 0 Y Y and 1 Z Z by introducing non-terminals Y'' -> Y Y and Z'' -> Z Z. Every w generated by Z'' can be written as w = u v with u, v generated by Z: as we read w from left to right, there comes a first index k where #0s = #1s + 1, and that prefix of w can be taken as u; the remainder v again has one more 0 than 1s.
The grammar below maintains the invariants: Y' generates a single 1; Y generates any string with an extra 1; Y'' generates any string with 2 extra 1s. Analogously for Z', Z, Z'' and 0.
S -> Z' Y | Y' Z     start with a 0 and remember to generate an extra 1, or start with a 1 and remember to generate an extra 0
Z' -> 0, Y' -> 1     Z' and Y' are mere formalities
Z -> 0 | Z' S | Y' Z''     produce the extra 0 now, or produce a 0 and continue balanced, or produce a 1 and remember to generate 2 extra 0s
Y -> 1 | Y' S | Z' Y''     produce the extra 1 now, or produce a 1 and continue balanced, or produce a 0 and remember to generate 2 extra 1s
Z'' -> Z Z, Y'' -> Y Y     split the job of generating 2 extra 0s or 2 extra 1s
The following table parses a word w = 001101 with |w| = n. Each of the n (n+1)/2 entries corresponds to a substring of w. Entry (L, i) records all the parse trees of the substring of length L that begins at index i. The entries for L = 1 correspond to rules that produce a single terminal, the other entries to rules that produce 2 non-terminals.
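The table computation can be sketched directly. The following Python fragment encodes the Chomsky-NF grammar above (our own spelling: the primed non-terminals are written Yp, Ypp, Zp, Zpp), and table[(L, i)] holds every non-terminal that derives the substring of length L starting at index i:

TERMINAL = {"Zp": {"0"}, "Yp": {"1"}, "Z": {"0"}, "Y": {"1"}}
BINARY = [
    ("S", "Zp", "Y"), ("S", "Yp", "Z"),
    ("Z", "Zp", "S"), ("Z", "Yp", "Zpp"),
    ("Y", "Yp", "S"), ("Y", "Zp", "Ypp"),
    ("Zpp", "Z", "Z"), ("Ypp", "Y", "Y"),
]

def cyk(w):
    n = len(w)
    table = {}
    for i in range(n):                      # level L = 1: terminal rules
        table[(1, i)] = {x for x, ts in TERMINAL.items() if w[i] in ts}
    for L in range(2, n + 1):               # levels L = 2 .. n
        for i in range(n - L + 1):
            entry = set()
            for split in range(1, L):       # the L-1 ways to split the substring
                left, right = table[(split, i)], table[(L - split, i + split)]
                entry |= {x for x, y, z in BINARY if y in left and z in right}
            table[(L, i)] = entry
    return "S" in table[(n, 0)]

print(cyk("001101"))   # True: three 0s and three 1s
print(cyk("00110"))    # False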

The picture at the lower right shows that for each entry at level L, we must try (L-1) distinct ways of splitting that entry's substring into 2 parts. Since (L-1) < n and there are n (n+1)/2 entries to compute, the CYK parser works in time O(n^3).
Useful CFLs, such as parts of programming languages, should be designed so as to admit more efficient parsers, preferably parsers that work in linear time. LR(k) grammars and languages are a subset of CFGs and CFLs that can be parsed in a single scan from left to right, with a look-ahead of k symbols.

Context sensitive grammars and languages
The rewriting rules B -> w of a CFG imply that a non-terminal B can be replaced by a word w ∈ (V ∪ A)* in any context. In contrast, a context sensitive grammar (CSG) has rules of the form
u B v -> u w v, where u, v, w ∈ (V ∪ A)*,
implying that B can be replaced by w only in the context "u on the left, v on the right".
It turns out that this definition is equivalent (apart from the nullstring ε) to requiring that any CSG rule be of the form v -> w, where v, w ∈ (V ∪ A)* and |v| ≤ |w|. This monotonicity property (in any derivation, the current string never gets shorter) implies that the word problem for CSLs - given CSG G and given w, is w ∈ L(G)? - is decidable. An exhaustive enumeration of all derivations up to the length |w| settles the issue.
As an example of the greater power of CSGs over CFGs, recall that we used the pumping lemma to prove that the language 0^k 1^k 2^k is not CF. By way of contrast, we prove:
Thm: L = { 0^k 1^k 2^k | k ≥ 1 } is context sensitive.
The following CSG generates L. Function of the non-terminals V = {S, B, C, K, Y, Z}: each Y and Z generates a 1 or a 0 at the proper time; B initially marks the beginning (left end) of the string, and later converts the Zs into 0s; K is a counter that ensures an equal number of 0s, 1s, 2s are generated, and C is the counter after it has stopped growing. Non-terminals play a similar role as markers in Markov algorithms. Whereas the latter have a deterministic control structure, grammars are non-deterministic.
S -> B K 2     at the last step in any derivation, B C generates 01, balancing this 2
K -> Z Y K 2     counter K generates (Z Y)^(k-1) 2^(k-1)
K -> C     when k has been fixed, C may start converting Ys into 1s
Y Z -> Z Y     Zs may move towards the left, Ys towards the right at any time
B Z -> 0 B     B may convert a Z into a 0 and shift it left at any time
Y C -> C 1     C may convert a Y into a 1 and shift it right at any time
B C -> 01     when B and C meet, all permutations, shifts and conversions have been done
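For instance, for k = 2 one possible derivation runs:
S -> B K 2 -> B Z Y K 2 2 -> B Z Y C 2 2 -> 0 B Y C 2 2 -> 0 B C 1 2 2 -> 0 0 1 1 2 2,
using the rules S -> B K 2, K -> Z Y K 2, K -> C, B Z -> 0 B, Y C -> C 1 and finally B C -> 01.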

The Chomsky hierarchy:
Types of grammars defined in terms of additional restrictions on the form of the rules:
Type 0: No restriction.
Type 1: Each rule is of the form φPψ -> φαψ, where P ∈ VN and α ≠ e.
Type 2: Each rule is of the form P -> α (α may be e).
Type 3: Each rule is of the form P -> xB or P -> x, where B ∈ VN and x is a string of terminals.
Common names:
Type 0: Unrestricted rewriting systems.
Type 1: Context-sensitive grammars.
Type 2: Context-free grammars.
Type 3: Right-linear, or regular, or finite state grammars.
Note that for type 2 (and of course type 3) grammars the definition of rule application becomes simpler: string y is obtained from the string x by application of the rule P -> α if these strings can be represented in the form x = lPr and y = lαr, where l, r ∈ A*.
Correspondence of type 3 grammars and fsas. (Construction p. 473.) Every type 3 language is a fal (finite automaton language). Can also show that every fal is a type 3 language (construction p. 474).
Automata viewed as either generators or acceptors. Grammars viewed as either generators or acceptors.
Grammars and trees. Grammars of the types 1-3 generate not only strings but also trees of immediate constituents on these strings (for context-sensitive grammars (type 1) such a tree doesn't mirror context-restrictions in the process of this generation). [see 16.3 - 16.4]

6. Properties of regular languages.
Closure properties: We already know that the class of fals is closed under union, concatenation, and Kleene star. What about intersection? Complementation?
Show complementation: if L is a fal, then A* - L is a fal. Use the fsa construction. Assume we have a deterministic fsa M that accepts L. We can construct a deterministic fsa M' which accepts the complement of L just by interchanging final and non-final states. Therefore fals are also closed under intersection (why?). Therefore the class of regular languages over any fixed alphabet is a Boolean algebra.
Decidability properties: is there an algorithm for determining ... ?
-- The membership question: yes.
-- The emptiness question: yes.
-- Does M accept all of A*?
Problem (opt. exercise): Is there an algorithm for determining, given two machines M1, M2, whether L(M1) ⊆ L(M2)? (Yes. Show it.)
Is there an algorithmic solution to the question of whether two fsas accept the same language?
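The complementation construction is a one-liner once a DFA is written down explicitly. A minimal sketch in Python (the dict-based DFA representation is our own convention): swap final and non-final states, leaving everything else unchanged.

def complement(states, alphabet, delta, start, finals):
    # delta: dict mapping (state, symbol) -> state, total on states x alphabet
    return (states, alphabet, delta, start, states - finals)

def accepts(dfa, w):
    states, alphabet, delta, start, finals = dfa
    q = start
    for a in w:
        q = delta[(q, a)]
    return q in finals

# M accepts strings over {0, 1} ending in 1
M = ({"e", "o"}, {"0", "1"},
     {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "e", ("o", "1"): "o"},
     "e", {"o"})
Mc = complement(*M)
print(accepts(M, "011"), accepts(Mc, "011"))   # True False
print(accepts(M, "010"), accepts(Mc, "010"))   # False True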

Language Parse Trees And Ambiguity:

Parse trees

A useful property of Boolean grammars is that they define parse trees of the strings they generate [18], which represent parses of a string according to positive conjuncts in the rules. These are, strictly speaking, finite acyclic graphs rather than trees. A parse tree of a string w = a1 . . . a|w| from a nonterminal A contains a leaf labelled ai for every i-th position in the string; the rest of the vertices are labelled with rules from P. The subtree accessible from any given vertex of the tree contains leaves in the range between i + 1 and j, and thus corresponds to a substring ai+1 . . . aj. In particular, each leaf ai corresponds to itself.
For each vertex labelled with a rule A -> α1 & . . . & αm & ¬β1 & . . . & ¬βn and associated to a substring ai+1 . . . aj, the following conditions hold:
1. It has exactly |α1| + . . . + |αm| direct descendants corresponding to the symbols in positive conjuncts. For each nonterminal in each αk, the corresponding descendant is labelled with some rule for that nonterminal, and for each terminal a ∈ Σ, the descendant is a leaf labelled with a.
2. For each k-th positive conjunct of this rule, let αk = s1 . . . sℓ. There exist numbers i1, . . . , iℓ-1, where i = i0 ≤ i1 ≤ . . . ≤ iℓ-1 ≤ iℓ = j, such that the descendant corresponding to each st encompasses the substring a(i(t-1))+1 . . . ai(t).
3. For each k-th negative conjunct of this rule, ai+1 . . . aj ∉ LG(βk).
The root is the unique vertex with no incoming arcs; it is labelled with any rule for the nonterminal A, and all leaves are reachable from it. To consider the uniqueness of a parse tree for different strings, it is useful to assume that only terminal leaves can have multiple incoming arcs.
Condition 3 ensures that the requirements imposed by negative conjuncts are satisfied. However, nothing related to these negative conjuncts is reflected in the actual trees. For instance, parse trees of the second grammar from Example 3.1 reflect only the conjunct S -> AB, and thus are plain context-free trees. On the other hand, parse trees corresponding to any conjunctive grammar, such as the first grammar in Example 3.1, reflect full information about the membership of a string in the language.

4.2.2 Ambiguity
Unambiguous context-free grammars can be defined in two ways:
1. for every string generated by the grammar there is a unique parse tree (in other words, a unique leftmost derivation);
2. for every nonterminal A and for every string w ∈ L(A) there exists a unique rule A -> s1 . . . sℓ with w ∈ L(s1 . . . sℓ), and a unique factorization w = u1 . . . uℓ with ui ∈ L(si).
Assuming that L(A) ≠ ∅ for every nonterminal A, these definitions are equivalent. In the case of Boolean grammars, the first definition becomes useless, because negative conjuncts are not accounted for in a parse tree. The requirement of parse tree uniqueness can be trivially satisfied as follows. Given any grammar G over an alphabet Σ = {a1, . . . , am} and with a start symbol S, one can define a new start symbol S' and additional symbols Ŝ and A, with the following rules:
S' -> A & ¬Ŝ
Ŝ -> A & ¬S
A -> a1 A | . . . | am A | ε
This grammar generates the same language, and every string in L(G) has a unique parse tree, which reflects only the nonterminal A and hence bears no essential information.
Trying to generalize the second approach for Boolean grammars in the least restrictive way, one may produce the following definition: for every nonterminal A and for every string w ∈ L(A) there exists a unique rule
A -> α1 & . . . & αm & ¬β1 & . . . & ¬βn     (4.4)
with w ∈ LG(αt) and w ∉ LG(βt) for all t, such that for every positive conjunct αt = s1 . . . sℓ there exists a unique factorization w = u1 . . . uℓ with ui ∈ L(si).
However, this definition can be trivialized similarly to the previous case. Given a Boolean grammar G, replace every rule (4.4) with
A -> C1 & . . . & Cm & ¬β1 & . . . & ¬βn,
where every new nonterminal Ct has a unique rule Ct -> αt. The resulting grammar generates the same language, and every positive conjunct is now a single nonterminal, so the condition on factorizations in positive conjuncts is trivially satisfied (while the choice of a rule can be made unique as well using some additional transformations).
Therefore, a proper definition of ambiguity for Boolean grammars must take into account factorizations of strings according to negative conjuncts. The following definition is obtained:
Definition 4.6. A Boolean grammar G = (Σ, N, P, S) is unambiguous if
I. Different rules for every single nonterminal A generate disjoint languages, that is, for every string w there exists at most one rule A -> α1 & . . . & αm & ¬β1 & . . . & ¬βn with w ∈ LG(α1) ∩ . . . ∩ LG(αm) ∩ ¬LG(β1) ∩ . . . ∩ ¬LG(βn).
II. All concatenations are unambiguous, that is, for every conjunct s1 . . . sℓ (positive or negative) of a rule for A, and for every string w, there exists at most one factorization w = u1 . . . uℓ with ui ∈ LG(si) for all i.
Note that Condition II applies to positive and negative conjuncts alike. In the case of a positive conjunct belonging to some rule, this means that a string that is potentially generated by this rule must be uniquely factorized according to this conjunct. For a negative conjunct ¬DE in a rule for A, Condition II requests that a factorization of w ∈ LG(DE) into LG(D) · LG(E) is unique even though w is not generated by any rule involving this conjunct. As argued above, this condition cannot be relaxed.
Consider some examples. Both grammars in Example 3.1 are unambiguous. To see that Condition II is satisfied with respect to the conjunct S -> AB, consider that a factorization w = uv, with u ∈ L(A) and v ∈ L(B), implies that u ∈ a* and v ∈ b*c*, so the boundary between u and v cannot be moved. The same argument applies to the conjuncts S -> DC and S -> ¬DC. Different rules for each of A, B, C, D clearly generate disjoint languages.
On the other hand, the grammar in Example ?? is ambiguous because Condition II does not hold. Consider the string w = aabb and the conjunct S -> AB. This string has two factorizations w = a · abb = aab · b, with a ∈ L(A), abb ∈ L(B), aab ∈ L(A) and b ∈ L(B). This, by definition, means that the grammar is ambiguous. It is not known whether there exists an unambiguous Boolean grammar generating the same language.
Though, as mentioned above, the uniqueness of a parse tree does not guarantee that the grammar is unambiguous, the converse holds:
Proposition 4.1. For any unambiguous Boolean grammar, for any nonterminal A ∈ N and for any string w ∈ LG(A), there exists a unique parse tree of w from A (assuming that only terminal vertices may have multiple incoming arcs).
Another thing to note is that the first condition in the definition of unambiguity can be met for every grammar using simple transformations. Assume every nonterminal A has either a unique rule (4.1) of an arbitrary form, or multiple rules each containing a single positive conjunct:
A -> α1 | . . . | αn     (where αi ∈ (Σ ∪ N)*)     (4.5)
There is no loss of generality in this assumption, because any multiple-conjunct rule for A can be replaced with a rule of the form A -> A', where A' is a new nonterminal with a unique rule replicating the original rule for A.
Then, for every nonterminal with multiple rules of the form (4.5), these rules can be replaced with the following n rules, which clearly generate disjoint languages:
A -> α1
A -> α2 & ¬α1
A -> α3 & ¬α1 & ¬α2
...
A -> αn & ¬α1 & ¬α2 & . . . & ¬α(n-1)     (4.5')
The grammar obtained by this transformation will satisfy Condition I. Additionally, Condition II, if it holds, will be preserved by the transformation.
Proposition 4.2. For every Boolean grammar there exists a Boolean grammar generating the same language, for which Condition I is satisfied. If the original grammar satisfies Condition II, then so will the constructed grammar.
This property does not hold for context-free grammars. Consider the standard example of an inherently ambiguous context-free language:
{ a^i b^j c^k | i, j, k ≥ 0, i = j or j = k }.
Following is the most obvious ambiguous context-free grammar generating this language:
S -> AB | DC
A -> aA | ε
B -> bBc | ε
C -> cC | ε
D -> aDb | ε
Condition II is satisfied for the same reasons as in Example 3.1. On the other hand, Condition I fails for the nonterminal S and for strings of the form a^n b^n c^n, which can be obtained using each of the two rules, and this is what makes this grammar ambiguous.
If the above context-free grammar is regarded as a Boolean grammar (ambiguous as well), then the given transformation disambiguates it in the most natural way, by replacing the rules for the start symbol with the following rules:
S -> AB | DC & ¬AB.
So it has been demonstrated that ambiguity in the choice of a rule represented by Condition I can be fully controlled in a Boolean grammar, which is a practically very useful property not found in the context-free grammars. On the other hand, ambiguity of concatenations formalized in Condition II seems to be, in general, beyond such control.

Reduction of CFGs:

Let A and B be languages over the same alphabet Σ. A reduction is an algorithm that transforms:
- strings in A to strings in B,
- strings not in A to strings not in B.
B is decidable => A is decidable. A is undecidable => B is undecidable.
One way to show a problem B to be undecidable is to reduce an undecidable problem A to B.

1. A Turing machine M computes a function f if: M halts on all inputs, and on input x it writes f(x) on the tape and halts. Such a function f is called a computable function. Examples: increment, addition, multiplication, shift. Any algorithm with output is computing a function.

2. Let A and B be languages over Σ. A is reducible to B if and only if there exists a computable function f : Σ* -> Σ* such that for all w ∈ Σ*: w ∈ A iff f(w) ∈ B. Notation: A ≤m B.
FACT: A ≤m B iff ¬A ≤m ¬B, since "w ∈ A iff f(w) ∈ B" is equivalent to "w ∉ A iff f(w) ∉ B".

3. To re-iterate, a reduction has two parts:
1. Construction: compute f(w) from w by an algorithm.
2. Correctness: w ∈ A iff f(w) ∈ B.

4. An example involving DFAs
EQDFA = { (A, B) | A, B are DFAs and L(A) = L(B) }.
EDFA = { A | A is a DFA and L(A) = ∅ }.
A reduction machine, on input A, B (two DFAs):
1. Constructs the DFA A' such that L(A') = ¬L(A).
2. Constructs the DFA B' such that L(B') = ¬L(B).
3. Constructs the DFA M1 such that L(M1) = L(A) ∩ L(B').
4. Constructs the DFA M2 such that L(M2) = L(A') ∩ L(B).
5. Constructs the DFA C such that L(C) = L(M1) ∪ L(M2).
6. Outputs C.
Correctness:
- Suppose L(A) = L(B). Then L(C) = ∅.
- Suppose L(A) ≠ L(B). Then L(C) ≠ ∅.
That is, EQDFA ≤m EDFA.

5. An example involving CFGs
ALLCFG = { G | G is a CFG and L(G) = Σ* }.
EQCFG = { (G, H) | G, H are CFGs and L(G) = L(H) }.
A reduction machine, on input G (a context-free grammar with alphabet Σ):
1. Constructs a CFG H with rules of the form S0 -> a S0 | ε, for all a ∈ Σ.
2. Outputs (G, H).
Clearly L(H) = Σ*.
Correctness:
- Suppose G generates all strings in Σ*. Then L(G) = L(H).
- Suppose G does not generate some string in Σ*. Then L(G) ≠ L(H).
That is, ALLCFG ≤m EQCFG.
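The DFA reduction in item 4 can be implemented directly. The sketch below (Python, reusing the dict-based DFA convention from the earlier complementation sketch) builds a product DFA for the symmetric difference L(A) xor L(B) - which equals L(M1) ∪ L(M2) above - and then decides emptiness by breadth-first search:

from collections import deque

def xor_product(A, B):
    # A, B: (states, alphabet, delta, start, finals) over a shared alphabet.
    sa, alph, da, qa, fa = A
    sb, _, db, qb, fb = B
    states = {(p, q) for p in sa for q in sb}
    delta = {((p, q), c): (da[(p, c)], db[(q, c)])
             for p in sa for q in sb for c in alph}
    finals = {(p, q) for p in sa for q in sb if (p in fa) != (q in fb)}
    return (states, alph, delta, (qa, qb), finals)

def is_empty(dfa):
    # BFS from the start state; the language is empty iff no final state is reachable.
    states, alph, delta, start, finals = dfa
    seen, todo = {start}, deque([start])
    while todo:
        q = todo.popleft()
        if q in finals:
            return False
        for c in alph:
            r = delta[(q, c)]
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return True

# A accepts strings ending in 1; B accepts strings with an odd number of 1s.
A = ({"e", "o"}, {"0", "1"},
     {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "e", ("o", "1"): "o"}, "e", {"o"})
B = ({"ev", "od"}, {"0", "1"},
     {("ev", "0"): "ev", ("ev", "1"): "od", ("od", "0"): "od", ("od", "1"): "ev"}, "ev", {"od"})
print(is_empty(xor_product(A, B)))   # False: "10" is accepted by B but not by A
print(is_empty(xor_product(A, A)))   # True: L(A) = L(A)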

Chomsky And Greibach Normal Forms:

A context free grammar is in Chomsky normal form if each production yields a terminal or two nonterminals. Any context free grammar can be converted to Chomsky normal form.
First remove all ε productions. If x ⇒* ε, then a production like y -> AxBxC spins off the productions y -> ABC | AxBC | ABxC. Perform similar substitutions for all nonterminals that lead to the empty string, then remove the productions that yield ε. If ε is in the language, i.e. s ⇒* ε, then the production s -> ε must remain. This is the only nonstandard production in Chomsky normal form; all other productions must yield a terminal or two nonterminals.
Next, remove any productions x -> x, as they are pointless. Given a unit production x -> y, let x derive everything that y derives, then remove x -> y. At this point the right side of each production is either a single terminal or a string of two or more symbols.
Finally, introduce new symbols to play out the right side. Each terminal in a long right side is first replaced by a new nonterminal that derives it; then the right side is split into binary productions. For instance, x -> AyzBC, with y and z terminals, is first rewritten as x -> A Y Z B C with Y -> y and Z -> z, and then split:
x -> A q1
q1 -> Y q2
q2 -> Z q3
q3 -> B C
Do this across the board and the resulting grammar is in Chomsky normal form.
Exercise: Prove that a word of length n is derived in 2n-1 steps. Hint: each application of a rule that yields two nonterminals lengthens the string by 1, so there are n-1 such steps, and each application of a rule that yields a terminal converts exactly one nonterminal, so there are n such steps.
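The final step is mechanical, and a minimal sketch makes the bookkeeping concrete. In the Python fragment below (our own representation: a grammar is a list of (lhs, rhs) pairs, lowercase strings are terminals, and Q1, Q2, ... are the freshly invented nonterminals), terminals inside long right sides are hidden behind new nonterminals and every right side longer than 2 is split into binary productions:

def binarize(rules):
    new_rules, counter, term_name = [], 0, {}
    def fresh():
        nonlocal counter
        counter += 1
        return f"Q{counter}"
    for lhs, rhs in rules:
        if len(rhs) >= 2:                # hide terminals behind new nonterminals
            rhs = list(rhs)
            for i, s in enumerate(rhs):
                if s.islower():          # convention: lowercase = terminal
                    if s not in term_name:
                        term_name[s] = fresh()
                        new_rules.append((term_name[s], [s]))
                    rhs[i] = term_name[s]
        while len(rhs) > 2:              # split off the leftmost symbol
            q = fresh()
            new_rules.append((lhs, [rhs[0], q]))
            lhs, rhs = q, rhs[1:]
        new_rules.append((lhs, rhs))
    return new_rules

for r in binarize([("X", ["A", "y", "z", "B", "C"])]):
    print(r)
# ('Q1', ['y']), ('Q2', ['z']), ('X', ['A', 'Q3']),
# ('Q3', ['Q1', 'Q4']), ('Q4', ['Q2', 'Q5']), ('Q5', ['B', 'C'])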