Unit 9 - Final
Uploaded by ashok-sharma
7/30/2019 Unit 9 - Final
Fundamentals of Theory of Computer Science Unit 9
Sikkim Manipal University Page No.: 169
Unit 9 Regular Expressions and
Regular Languages
Structure
9.1 Introduction
Objectives
9.2 Regular expressions
9.3 Regular Expressions accepted by the Language
9.4 Finite Automaton from Regular Grammar
9.5 Regular Grammar from Finite Automata
Self Assessment Questions
9.6 Summary
9.7 Terminal Questions
9.8 Answers
9.1 Introduction
In this unit, you will learn about regular expressions and about finite
automata, which act as devices for recognizing the languages that regular
expressions describe. A regular expression denotes a set of strings of
symbols and is built up using the operations union, concatenation and
iteration; the same set can be generated by a regular grammar. Regular
expressions also obey identities that resemble those of common arithmetic
operations such as addition and multiplication. These identities help
simplify a regular expression. The language denoted by a regular expression
can be accepted by deterministic as well as non-deterministic automata.
Objectives:
After going through this unit, you will be able to
explain the concept of regular expressions
describe the language denoted by a regular expression
construct a finite automaton from a regular grammar
construct a regular grammar from a finite automaton
9.2 Regular Expressions
In computing, regular expressions are used to represent sets of strings; they
consist of symbols arranged according to certain syntax rules. We can define a
regular expression R1 using terminal symbols such as a and b that are
elements of the alphabet Σ. Some of the algebraic operations defined on
regular expressions are:
1. Union: The union of two regular expressions is also a regular
expression. For example, if R1 and R2 are the two regular expressions,
then the union R1 + R2 is also a regular expression.
2. Concatenation: The concatenation of two regular expressions is a
regular expression. For example, if R1 and R2 are the two regular
expressions, then the concatenation R1R2 is also a regular expression.
3. Iteration: The iteration of a regular expression is also a regular
expression. For example, if R1 is a regular expression, then the iteration
R1* is also a regular expression.
4. Order of evaluation: Parentheses fix the order of evaluation of a regular
expression. For example, if R1 is a regular expression, then the
parenthesized expression (R1) is also a regular expression.
9.2.1 Definition:
A regular expression is recursively defined as follows.
1. ∅ is a regular expression denoting the empty language.
2. ε is a regular expression denoting the language {ε}, which contains only
the empty string.
3. For each symbol a in Σ, a is a regular expression denoting the language
{a}.
4. If R is a regular expression denoting the language LR and S is a regular
expression denoting the language LS, then
a. R + S is a regular expression corresponding to the language
LR ∪ LS.
b. RS is a regular expression corresponding to the language LR.LS.
c. R* is a regular expression corresponding to the language LR*.
5. The expressions obtained by applying any of the rules from 1 to 4 are
regular expressions.
Note: If parentheses are not present in a regular expression, then the
precedence of the operators is as follows: iteration, concatenation and
union. First perform the iteration operation, then the concatenation
operation and finally the union operation.
Note: Any set that can be represented by a regular expression is known
as a regular set. If the regular expression is R, then the regular set of R is
L(R).
9.2.2 Example:
Let x, y ∈ Σ. Then
x represents the set {x}
x + y represents the set {x, y}
xy represents the set {xy}
x* represents the set {ε, x, xx, xxx, ...}
(x + y)* represents the set {x, y}*
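The sets in this example can be explored mechanically. The sketch below is only an illustration and not part of the unit's formal development: it assumes single-character symbols, rewrites the union operator + as the | of Python's re module, and lists the strings up to a chosen length that an expression denotes. The helper names to_python_regex and strings_up_to are hypothetical.

```python
import re
from itertools import product

def to_python_regex(expr):
    """Rewrite textbook notation for Python's re module: '+' (union) becomes '|',
    'ε' becomes an empty alternative, and blanks are dropped."""
    return expr.replace(" ", "").replace("ε", "").replace("+", "|")

def strings_up_to(expr, alphabet, max_len):
    """List every string over `alphabet` of length <= max_len denoted by `expr`."""
    pattern = re.compile(to_python_regex(expr))
    found = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            s = "".join(letters)
            if pattern.fullmatch(s):
                found.append(s)
    return found

print(strings_up_to("x*", "xy", 3))        # ['', 'x', 'xx', 'xxx']
print(strings_up_to("x + y", "xy", 3))     # ['x', 'y']
print(strings_up_to("(x + y)*", "xy", 2))  # the 7 strings of length 0, 1 and 2
```

Python's operator precedence (star, then concatenation, then |) matches the precedence stated in the note above, which is why a plain textual rewrite is enough here.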
9.3 Regular Expressions accepted by the Language
9.3.1 Example:
Some examples of regular expressions and the language corresponding to
these regular expressions are given here.
Regular expression    Meaning
(a+b)*                Set of strings of a's and b's of any length, including the empty string.
(a+b)*abb             Set of strings of a's and b's ending with the string abb.
ab(a+b)*              Set of strings of a's and b's starting with the string ab.
(a+b)*aa(a+b)*        Set of strings of a's and b's having the substring aa.
a*b*c*                Set of strings consisting of any number of a's (possibly none) followed by any number of b's (possibly none) followed by any number of c's (possibly none).
a⁺b⁺c⁺                Set of strings consisting of at least one a followed by at least one b followed by at least one c.
aa*bb*cc*             Set of strings consisting of at least one a followed by at least one b followed by at least one c.
(a+b)*(a+bb)          Set of strings of a's and b's ending with either a or bb.
(aa)*(bb)*b           Set of strings consisting of an even number of a's followed by an odd number of b's.
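The "Meaning" column can be checked against the expressions by brute force. The following sketch is an illustration under stated assumptions: each expression is written in Python re syntax (| in place of +), each meaning is restated as an ordinary Python predicate chosen for this example, and agreement is tested only for strings up to a fixed length, so it is a sanity check rather than a proof. The names rows and agrees are hypothetical.

```python
import re
from itertools import product

# Each row pairs a pattern in Python syntax with a predicate restating its meaning.
rows = [
    (r"(a|b)*abb",      lambda s: s.endswith("abb")),
    (r"ab(a|b)*",       lambda s: s.startswith("ab")),
    (r"(a|b)*aa(a|b)*", lambda s: "aa" in s),
    (r"(a|b)*(a|bb)",   lambda s: s.endswith("a") or s.endswith("bb")),
]

def agrees(pattern, predicate, alphabet="ab", max_len=8):
    """True if pattern and predicate accept exactly the same strings up to max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            s = "".join(letters)
            if bool(re.fullmatch(pattern, s)) != predicate(s):
                return False
    return True

for pattern, predicate in rows:
    print(pattern, agrees(pattern, predicate))
```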
9.3.2 Example
Obtain a regular expression to accept the language consisting of strings of
alternating a's and b's.
Solution: Alternating a's and b's can be obtained by concatenating the
string ab zero or more times, which is represented by the regular
expression
(ab)*
and then adding an optional b at the front and an optional a at the end.
Thus, the complete expression is given by
(ε + b)(ab)*(ε + a)
9.3.3 Note
The expression can also be obtained as shown below:
Alternating a's and b's can be generated in one of the following ways:
i) (ab)*
ii) b(ab)*
iii) (ba)*
iv) a(ba)*
Therefore the expression that generates alternating a's and b's can be
obtained by taking the union of these regular expressions:
(ab)* + b(ab)* + (ba)* + a(ba)*
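That the expression (ε + b)(ab)*(ε + a) and the union of the four cases describe the same language can be checked by enumeration. The sketch below is an illustration, not a proof: it compares both expressions (written in Python re syntax, with an empty alternative standing for ε) with a direct test for "no two adjacent symbols are equal" on all strings up to length 10. The names FIRST, SECOND, alternates and all_strings are hypothetical.

```python
import re
from itertools import product

FIRST = r"(|b)(ab)*(|a)"                # (ε + b)(ab)*(ε + a)
SECOND = r"(ab)*|b(ab)*|(ba)*|a(ba)*"   # (ab)* + b(ab)* + (ba)* + a(ba)*

def alternates(s):
    """True if no two adjacent symbols of s are equal."""
    return all(x != y for x, y in zip(s, s[1:]))

def all_strings(alphabet, max_len):
    """Yield every string over alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

mismatches = [
    s for s in all_strings("ab", 10)
    if not (bool(re.fullmatch(FIRST, s)) == bool(re.fullmatch(SECOND, s)) == alternates(s))
]
print(mismatches)  # an empty list means the three descriptions agree up to length 10
```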
9.3.4 Example
Obtain a regular expression to accept the language consisting of strings of 0s
and 1s with at most one pair of consecutive 0s.
Solution: It is clear from the statement that a string with at most
one pair of consecutive 0s
o begins with a combination of any number of 1s and 01s, represented by
(1 + 01)*
o may then contain a single 0 or the single pair 00, represented by
(ε + 0 + 00)
o ends with a combination of any number of 1s and 10s, represented by
(1 + 10)*.
Therefore the complete regular expression for strings of 0s and
1s with at most one pair of consecutive 0s is given by
(1 + 01)*(ε + 0 + 00)(1 + 10)*.
The shorter expression (1 + 01)*001* describes only those strings that
contain the pair 00 and have nothing but 1s after it.
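A brute-force comparison is a convenient way to test a candidate expression against the wording of a problem like this one. The sketch below is an illustration under stated assumptions: it counts overlapping occurrences of 00, treats "at most one pair" as "count ≤ 1", and reports the strings up to a fixed length on which a pattern (in Python re syntax) and that specification differ. It is run on (1 + 01)*(ε + 0 + 00)(1 + 10)* and on the shorter (1 + 01)*001*. The helper names are hypothetical.

```python
import re
from itertools import product

def pairs_of_zeros(s):
    """Count positions i with s[i] == s[i+1] == '0' (overlapping pairs are counted)."""
    return sum(1 for x, y in zip(s, s[1:]) if x == y == "0")

def disagreements(pattern, max_len=10):
    """Strings up to max_len on which `pattern` and the specification differ."""
    out = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if bool(re.fullmatch(pattern, s)) != (pairs_of_zeros(s) <= 1):
                out.append(s)
    return out

# (1 + 01)*(ε + 0 + 00)(1 + 10)*: no 0, a lone 0, or the single pair 00 in the middle
print(disagreements(r"(1|01)*(|0|00)(1|10)*"))   # []
# (1 + 01)*001*: requires the pair 00 and allows only 1s after it
print(disagreements(r"(1|01)*001*")[:5])         # ['', '0', '1', '01', '10']
```

The second call shows that an expression which forces the pair 00 to appear misses every string without consecutive 0s, starting with the empty string.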
9.3.5 Example
Obtain a regular expression to accept the language of strings containing at
least one a and at least one b, where Σ = {a, b, c}.
Solution: Strings of a's, b's and c's can be generated using the regular
expression
(a + b + c)*.
But the string should have at least one a and at least one b. There are
two cases to be considered:
The first a precedes the first b, which can be represented using
c*a(a + c)*b
The first b precedes the first a, which can be represented using
c*b(b + c)*a
Either of the regular expressions from these two cases can be followed by
the regular expression (a + b + c)*.
Therefore the final regular expression is
c*a(a + c)*b(a + b + c)* + c*b(b + c)*a(a + b + c)*
This expression can also be written as shown below:
[c*a(a + c)*b + c*b(b + c)*a](a + b + c)*
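The two-case expression can be compared with the plain statement "contains an a and contains a b". The sketch below is an illustration rather than a proof: it writes the expression in Python re syntax and searches all strings over {a, b, c} up to length 7 for a counterexample. The names PATTERN, has_a_and_b and counterexamples are hypothetical.

```python
import re
from itertools import product

# c*a(a + c)*b(a + b + c)* + c*b(b + c)*a(a + b + c)*, with '|' for union
PATTERN = r"c*a(a|c)*b(a|b|c)*|c*b(b|c)*a(a|b|c)*"

def has_a_and_b(s):
    """The specification: at least one a and at least one b."""
    return "a" in s and "b" in s

def counterexamples(max_len=7):
    """Strings up to max_len on which the expression and the specification differ."""
    bad = []
    for n in range(max_len + 1):
        for letters in product("abc", repeat=n):
            s = "".join(letters)
            if bool(re.fullmatch(PATTERN, s)) != has_a_and_b(s):
                bad.append(s)
    return bad

print(counterexamples())  # [] when the expression and the specification agree
```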
9.3.6 Example
Obtain a regular expression to accept the language consisting of strings of a's
and b's of even length.
Solution: Strings of a's and b's of even length can be obtained by
concatenating the strings aa, ab, ba and bb.
The language also contains the empty string, denoted by ε.
Therefore the regular expression can be of the form
(aa + ab + ba + bb)*
The * closure includes the empty string.
The language corresponding to the regular expression is denoted by
L(R) = {(aa + ab + ba + bb)^n | n ≥ 0}.
9.3.7 Example
Obtain a regular expression to accept the language consisting of strings of a's
and b's of odd length.
Solution: Strings of a's and b's of odd length can be obtained by
concatenating the strings aa, ab, ba and bb and then appending either a or b.
Therefore the regular expression can be of the form
(aa + ab + ba + bb)*(a + b)
Strings of a's and b's of odd length can also be obtained by concatenating
the strings aa, ab, ba and bb and placing either a or b in front.
Therefore the regular expression can also be represented as
(a + b)(aa + ab + ba + bb)*.
Observation: Even though these two expressions look different, the
language corresponding to the two expressions is the same.
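The observation can be tested on all short strings. The sketch below is an illustration under stated assumptions: both expressions are written in Python re syntax, and their languages are compared only up to a fixed length with each other and with the set of odd-length strings. The names SUFFIX_FORM, PREFIX_FORM and language are hypothetical.

```python
import re
from itertools import product

SUFFIX_FORM = r"(aa|ab|ba|bb)*(a|b)"   # (aa + ab + ba + bb)*(a + b)
PREFIX_FORM = r"(a|b)(aa|ab|ba|bb)*"   # (a + b)(aa + ab + ba + bb)*

def language(pattern, alphabet="ab", max_len=9):
    """Set of strings of length <= max_len that the pattern matches in full."""
    return {
        "".join(letters)
        for n in range(max_len + 1)
        for letters in product(alphabet, repeat=n)
        if re.fullmatch(pattern, "".join(letters))
    }

odd_strings = {s for s in language(r"(a|b)*") if len(s) % 2 == 1}
print(language(SUFFIX_FORM) == language(PREFIX_FORM) == odd_strings)  # True
```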
9.3.8 Example
Obtain a regular expression R such that L(R) = {w | w ∈ {0, 1}* with at least
three consecutive 0s}.
Solution: A string consisting of 0s and 1s can be represented by the
regular expression
(0 + 1)*
Such an arbitrary string can precede three consecutive zeros and can follow
three consecutive zeros.
Therefore the regular expression can be written as
(0 + 1)*000(0 + 1)*.
The language corresponding to the regular expression can be written as
L(R) = {(0 + 1)^m 000 (0 + 1)^n | m ≥ 0 and n ≥ 0}.
9.4 Finite Automaton from Regular Grammar
9.4.1 Definition
A grammar G = (VN, VT, S, P) is said to be a regular grammar if the
grammar is right regular or left regular.
A grammar G is said to be right regular if all the productions are of the form
A → wB and/or A → w, where A, B ∈ VN and w ∈ VT*.
A grammar G is said to be left regular if all the productions are of the form
A → Bw and/or A → w, where A, B ∈ VN and w ∈ VT*.
9.4.2 Example
(i) The grammar with the set of productions
S → aaB | bbA
A → aA | b
B → bB | a
is a right linear grammar.
(ii) The grammar with the set of productions
S → Baa | Abb
A → Aa | b
B → Bb | a
is a left linear grammar.
9.4.3 Definition
A grammar which has at most one non terminal on the right side of any
production without restriction on the position of this non terminal (observe
that: non terminal can be leftmost or rightmost) is called linear grammar.
9.4.4 Theorem
Let G = (VN, VT, S, P) be a right linear grammar. Then the
language L(G) is accepted by a finite automaton; that is, the language
generated by a regular grammar is a regular language.
Proof: Let V = {q0, q1, ...} be the variables and S = q0 be the start symbol.
Let the productions in the grammar be
q0 → x1q1
q1 → x2q2
q2 → x3q3
...
qn → xn+1
Assume that w is a string of the language L(G) generated from these productions.
Corresponding to each production in the grammar we can have an equivalent
transition in the FA to accept the string w.
After accepting the string w, the FA will be in the final state.
The procedure to obtain the FA from these productions is given below.
Step 1: The start symbol q0 in the grammar is the start state of the FA.
Step 2: For each production of the form qi → wqj, the corresponding
transition defined will be of the form
δ*(qi, w) = qj.
Step 3: For each production of the form qi → w, the corresponding transition
defined will be of the form
δ*(qi, w) = qf, where qf is the final state.
Since every string w ∈ L(G) is also accepted by the FA obtained by applying
steps 1 through 3, the language is regular.
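Steps 1 to 3 can be sketched in code. The version below is a minimal illustration under several assumptions that are not part of the theorem: productions are given as triples (head, w, tail) with tail set to None for A → w, a multi-symbol w is split across freshly numbered intermediate states, the resulting machine is treated as an NFA, and variable names are assumed not to clash with the generated names q1, q2, ... or qf. The function names are hypothetical. The sample grammar is the right linear grammar S → aaB | bbA, A → aA | b, B → bB | a.

```python
def grammar_to_nfa(productions, final="qf"):
    """Build NFA transitions from right-linear productions.

    productions: list of (head, w, tail); tail is None for A -> w.
    Returns (delta, finals), where delta maps (state, symbol) to a set of states.
    """
    delta, finals, fresh = {}, {final}, 0
    for head, w, tail in productions:
        if w == "":
            if tail is not None:
                raise ValueError("unit productions A -> B are not handled in this sketch")
            finals.add(head)            # A -> ε: A itself becomes a final state
            continue
        target = final if tail is None else tail
        state = head
        for i, symbol in enumerate(w):
            if i == len(w) - 1:
                nxt = target
            else:                       # intermediate state for a multi-symbol w
                fresh += 1
                nxt = f"q{fresh}"
            delta.setdefault((state, symbol), set()).add(nxt)
            state = nxt
    return delta, finals

def accepts(delta, finals, start, s):
    """Simulate the NFA on s, tracking the set of reachable states."""
    current = {start}
    for symbol in s:
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
    return bool(current & finals)

PRODUCTIONS = [("S", "aa", "B"), ("S", "bb", "A"),
               ("A", "a", "A"), ("A", "b", None),
               ("B", "b", "B"), ("B", "a", None)]
delta, finals = grammar_to_nfa(PRODUCTIONS)
print(accepts(delta, finals, "S", "aaa"))   # True:  S => aaB => aaa
print(accepts(delta, finals, "S", "bbab"))  # True:  S => bbA => bbaA => bbab
print(accepts(delta, finals, "S", "ab"))    # False
```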
9.4.5 Problem: Construct a DFA and the transition diagram to accept the
language generated by the following grammar.
S → 01A
A → 10B
B → 0A | 11
Solution: Observe that for each production of the form
A → wB
the corresponding transition will be δ(A, w) = B.
Also, for each production of the form A → w, we can introduce the transition
δ(A, w) = qf, where qf is the final state.
The transitions obtained from grammar G are shown in the table.
The transition diagram is shown below.
The DFA is
M = (Q, Σ, δ, q0, F) where
Q = {S, A, B, qf, q1, q2, q3},
Σ = {0, 1}, q0 = S (start state), F = {qf}, and δ is shown in the table. Here, the
additional vertices (states) introduced are q1, q2, q3.
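The machine for this problem can be written out and simulated. The sketch below is an illustration under an assumption: the transition table is not reproduced in this transcript, so the table in the code is derived from the productions S → 01A, A → 10B, B → 0A | 11, and the placement of the intermediate states q1, q2 and q3 is a guess that may differ from the original table. Undefined transitions are treated as rejection.

```python
DELTA = {
    ("S", "0"): "q1", ("q1", "1"): "A",    # S -> 01A
    ("A", "1"): "q2", ("q2", "0"): "B",    # A -> 10B
    ("B", "0"): "A",                        # B -> 0A
    ("B", "1"): "q3", ("q3", "1"): "qf",   # B -> 11
}

def run(s, start="S", finals=("qf",)):
    """Follow DELTA symbol by symbol; a missing entry means the string is rejected."""
    state = start
    for symbol in s:
        state = DELTA.get((state, symbol))
        if state is None:
            return False
    return state in finals

print(run("011011"))     # True:  S => 01A => 0110B => 011011
print(run("011001011"))  # True:  uses B -> 0A once before finishing with 11
print(run("0110"))       # False: the run stops in B, which is not final
```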
9.4.6 Problem:
Construct a DFA and the corresponding transition diagram to accept the
language generated by the following grammar.
S → aA
A → aA | bB
B → bB
Solution: Observe that for each production of the form
A → wB
the corresponding transition will be
δ(A, w) = B.
Also, for each production of the form
A → w,
we can introduce the transition
δ(A, w) = qf, where qf is the final state.
The transitions obtained from grammar G are shown in the table.
Observe that for each production of the form A → ε, we make A a final
state.
The transition diagram corresponding to this is shown below.
9.5 Regular Grammar from Finite Automata
9.5.1 Theorem:
Let M = (Q, Σ, δ, q0, F) be a finite automaton. If L is the regular language
accepted by the FA, then there exists a right linear grammar G = (VN, VT, S, P)
such that L = L(G).
Proof: Let M = (Q, Σ, δ, q0, F), where Q = {q0, q1, ..., qn} and Σ = {a1, a2, ..., am}.
A regular grammar G = (VN, VT, S, P) can be constructed where
VN = {q0, q1, ..., qn}, VT = Σ, S = q0.
The set of productions P can be obtained as shown below.
Step 1: For each transition of the form δ(qi, a) = qj, the corresponding production is
qi → aqj
Step 2: If q ∈ F, that is, q is a final state of the FA, then introduce the production q → ε.
Since these productions are obtained from the transitions defined for the FA, the
language accepted by the FA is also generated by the grammar.
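The two steps of the construction translate almost directly into code. The sketch below is a minimal illustration: the DFA is assumed to be given as a dictionary from (state, symbol) to state, state names are single letters so that a right-hand side can be written as a plain string, and the sample DFA (strings over {a, b} ending in b) is a hypothetical example, not one of the unit's diagrams.

```python
def dfa_to_grammar(delta, finals):
    """Return productions as {variable: [right-hand sides]}; '' stands for ε."""
    productions = {}
    for (state, symbol), target in delta.items():   # Step 1: qi -> a qj
        productions.setdefault(state, []).append(symbol + target)
    for state in finals:                             # Step 2: q -> ε for q in F
        productions.setdefault(state, []).append("")
    return productions

# Hypothetical DFA: S is the start state, B is reached after reading a b.
delta = {("S", "a"): "S", ("S", "b"): "B",
         ("B", "a"): "S", ("B", "b"): "B"}
grammar = dfa_to_grammar(delta, finals=["B"])

for head, bodies in grammar.items():
    print(head, "->", " | ".join(body or "ε" for body in bodies))
# S -> aS | bB
# B -> aS | bB | ε
```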
9.5.2 Example:
Obtain a regular grammar from the following DFA given by the transition diagram.
Solution: For each transition of the form δ(A, a) = B, introduce the
production A → aB. If A ∈ F (A is a final state), introduce the production A → ε.
The productions obtained from the transitions defined for the DFA are shown
below.
From the diagram, it is clear that the state B is a final state.
Therefore we introduce the production B → ε.
The grammar G corresponding to the productions obtained is shown below.
9.5.3 Example
Construct a regular grammar for the following DFA given by the transition
diagram.
Solution: For each transition of the form δ(A, a) = B, introduce the
production A → aB.
If A ∈ F (A is a final state), introduce the production A → ε.
The productions obtained from the transitions defined for the DFA are shown below.
Since the set of final states is {S, A, B}, we introduce the productions S → ε,
A → ε, and B → ε.
Therefore the grammar G is:
G = (VN, VT, S, P) where
VN = {S, A, B, C}
VT = {a, b}
Observation: The finite automaton in this problem accepts strings of a's
and b's except those containing the substring abb. Therefore from the
grammar G we obtain a regular language which consists of strings of a's
and b's without the substring abb.
9.5.4 Example
Obtain a right linear grammar for the regular expression ((aab)*ab)*, given
by the transition diagram.
The right linear grammar is given by
G = (VN, VT, S, P) where
VN = {S, A, B}
VT = {a, b}
9.5.5 Note
The left linear grammar can be obtained from FA as follows.
Step 1: Obtain the reverse of given DFA.
Step 2: Obtain the right linear grammar from the reversed DFA.
Step 3: Obtain the left linear grammar from right linear grammar.
9.5.6 Example
Obtain a left linear grammar for the DFA shown below.
Step 1: Reverse the DFA. That is, make A the final state and C the start
state, and reverse the direction of the arrows. The reversed DFA is shown
below.
Step 2: Obtain the right linear grammar for the above DFA. The
corresponding productions are shown below.
Step 3: Reverse the productions of the right linear grammar to get the left
linear grammar.
If A → abcdB is a production in the right linear grammar, then after reversal
the production will be of the form
A → Bdcba.
The conversion of the right linear grammar to the left linear grammar is shown
below.
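The reversal in Step 3 is a purely mechanical rewrite of each right-hand side. The sketch below is a minimal illustration: productions are assumed to be given as triples (head, w, tail), with tail set to None for A → w, and the function and variable names are hypothetical.

```python
def to_left_linear(productions):
    """Reverse each right-hand side: A -> wB becomes A -> B w^R.

    productions: list of (head, w, tail) with tail None for A -> w.
    Returns a list of (head, tail, reversed_w) in the same order.
    """
    return [(head, tail, w[::-1]) for head, w, tail in productions]

def show(head, tail, w):
    """Format one left-linear production as text."""
    return f"{head} -> {tail or ''}{w}"

right_linear = [("A", "abcd", "B"), ("B", "ab", None)]
for production in to_left_linear(right_linear):
    print(show(*production))
# A -> Bdcba
# B -> ba
```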
Therefore the final left linear grammar is
G = (VN, VT, S, P) where
VN = {C, A, B}
VT = {0, 1}
Now we show that the string 10101 is accepted by the DFA.
Hence the left linear grammar obtained is equivalent to the given FA.
Self Assessment Questions
1. The regular expression (11)* stands for _______
2. The regular expression (01)* + 1 stands for _____
3. The regular expression (0 + 10)*1* stands for ______
4. Obtain a left linear grammar for the regular expression ((aab)* ab)*.
9.6 Summary
In this unit, a special type of grammar called a regular grammar was
considered. Different forms of regular expressions and the languages
denoted by these regular expressions were given. We provided a method of
obtaining a regular grammar from a finite automaton (and vice versa).
A number of worked examples were given.
9.7 Terminal Questions
1. Obtain a right linear grammar for the language L = {a^n b^m | n ≥ 2, m ≥ 3}.
2. Obtain the left linear grammar for the right linear grammar shown below.
9.8 Answers
Self Assessment Questions
1. Set of strings consisting of an even number of 1s.
2. The language consists of the string 1 or of the string 01 repeated zero
or more times.
3. Strings of 0s and 1s that end with any number of 1s (possibly none) and in
which every earlier 1 is immediately followed by a 0.
4. G = (VN, VT, S, P) where VN = {A, B, S}, VT = {a, b}