Formal Languages, Automata and Models of · PDF fileFormal Languages, Automata and Models of...

188
CDT314 FABER FABER Formal Languages, Automata and Models of Computation Lecture 4 School of Innovation, Design and Engineering Mälardalen University 2011 1

Transcript of Formal Languages, Automata and Models of · PDF fileFormal Languages, Automata and Models of...

CDT314

FABERFABER

Formal Languages, Automata and Models of Computation

Lecture 4

School of Innovation, Design and Engineering Mälardalen University

2011

11

ContentContent

Regular Expressions and Regular Languages

NFA→DFA Subset Construction (Delmängdskonstruktion)NFA→DFA Subset Construction (Delmängdskonstruktion)

FA → RE State Elimination (Tillståndseliminering)

Mi i i i DFA b S t P titi i (Sä kilj d l it )Minimizing DFA by Set Partitioning (Särskiljandealgoritmen)

Grammar: Linear grammar. Regular grammar. Regular Lng. .

2

T B i ThTwo Basic Theorems

3

1: KleeneTheoremNFA↔DFANFA↔DFA

Let M be NFA accepting L. p gThen there exists DFA M´ that accepts L as well.

4

II: Finite Language TheoremA fi it l i FSA t blAny finite language is FSA-acceptable.

Example L = {abba, abb, abab}

FSA = Finite State Automaton = DFA/NFA

5

Regular ExpressionsRegular Expressionsand

R l LRegular Languages

6

Theorem - Part 1

LanguagesGenerated by

Regular⊆Generated byRegular Expressions

Languages⊆

1. For any regular expressionthe language is regular

r)(rL

7

Theorem - Part 2

LanguagesGenerated by

RegularL⊇Generated by

Regular ExpressionsLanguages⊇

2. For any regular language there isLLrL =)(

2. For any regular language there isa regular expression withr

8

Proof - Part 1

rFor any regular expression)(rL

y g pthe language is regular

Proof by induction on the size of rProof by induction on the size of r

9

Induction BasisPrimitive Regular Expressions: αλ,,∅NFAs Σ∈α(where )NFAs

)()( 1 ∅=∅= LML

( )

)()( 1 ∅=∅= LML

Regular)(}{)( 2 λλ LML == RegularLanguages

)(}{)( 3 aLaML ==a

10

Inductive Hypothesis

Assume

1r 2rfor regular expressions and

that and are regular languages)( 1rL )( 2rL

11

Inductive StepWe will prove:

)()( 21

LrrL +

regular languages)(

)(

1

21

rL

rrL∗

languages))(()(

1

1

rLrL

12

By definition of regular expressions:

)()()()()()( 2121

rLrLrrLrLrLrrL

=⋅∪=+

))(()(

)()()(*

11

2121

rLrL

rLrLrrL

=

=⋅∗

)())(())(()(

11

11

rLrL =

13

B i d ti h th i k)( 1rL )( 2rL

By inductive hypothesis we know:and are regular languages

It is known, and we will prove in this lecture:Regular languages are closed under

( ) ( )

It is known, and we will prove in this lecture:

( ) ( )( ) ( )

21

rLrLrLrL ∪union

concatenation ( ) ( )( )( )*1

21

rLrLrLconcatenation

star

14

Therefore

( ) ( ) ( )2121 rLrLrrL ∪=+( ) ( ) ( )

( ) ( ) ( )

2121

LLL

rLrLrrL ∪+

regular( ) ( ) ( )2121 rLrLrrL =⋅ languages

( ) ( )( )** 11 rLrL =

And trivially ))(( 1rL is a regular language15

y ))(( 1 is a regular language

Proof – Part 2

LFor any regular language there isr LrL =)(a regular expression with

Proof by construction of regular expression

16

LM

Since is regular take the NFA that accepts ittake the NFA that accepts it

LML =)(

Single final state17

Single final state

From construct the equivalentGeneralized Transition Graph

Mp

transition labels are regular expressions

ExampleM

a cM

a c

ba, ba +

18

Reverse of a Regular LanguageLanguage

19

TheoremTheorem

The reverse of a regular languageRL Lg g gis a regular language

Proof idea

Construct NFA that accepts :RL

invert the transitions of the NFAthat accepts L

20

that accepts L

ProofSince is regular, there is NFA that accepts

LLthere is NFA that accepts

Example: bExample:

baabL += *b

abb

a

21

I t T itib

Invert Transitionsab

ba

ab

a

22

Make old initial state b

a final state and vice versa

a

b

and vice versa

bb

a

b

abb a

23

b

Add a new initial stateab

b

b a

b

λab

λ

λb aλ

24

Resulting machine accepts RL

RL i l

baabL += *b

RL is regular

ababLR += *b

λa λ

λb aλ

25

Some Properties ofSome Properties of Regular Languages,

SSummary

26

Properties1L 2LFor regular languages and

we will prove that:

21 LL ∪Union:

p

21 LL ∪U o

Are Regular21LLConcatenation:

Are RegularLanguages

*1LStar:27

1LStar:

R l l l d dRegular languages are closed under

21 LL ∪Union:

21LLConcatenation:

*1LStar:

28

L L1LRegular language

( ) 11 LML =2LRegular language

( ) 22 LML =( ) 11 LML

MNFA M

( ) 22 LML

NFA1MNFA 2MNFA

Single final state Single final state

29

Example

a1M

}{1 baL n=a

b

2M

{ }baL =2ab

30

Union (Thompson’s construction)

NFA for

1M21 LL ∪

1

λλ λλ

2M λλ

31

Example

}{}{21 babaLL n ∪=∪NFA for

a}{1 baL n=

ab

λλ λ

λ

λ

λ }{2 baL =ab

λλ }{2 baL =

32

Concatenation (Thompson’s construction)

NFA for 21LL

M M1M 2Mλ λ

33

Example

NFA for }{}}{{21 bbaababaLL nn == }{}}{{21

n

a}{1 baL n=

}{2 baL =ab abλ λ

34

Star Operation (Thompson’s construction)( p )

NFA for *1L λ

1M*1L∈λ

1

λ λ

λ35

λ

Example

NFA for *}{*1 baL n= }{1

λ

}{1 baL n=a

bλ λbλ λ

36λ

Summary: Operations on Regular Expressions

RE Regular language descriptionRE Regular language description

a+b {a b}a+b {a,b} (a+b)(a+b) {aa, ab, ba, bb}a* {λ a aa aaa }a* {λ, a, aa, aaa, …}a*b {b,ab, aab, aaab, …}( +b)* {λ b b b bbb }(a+b)* {λ, a, b, aa, ab, ba, aaa, bbb…}

37

Algebraic Properties

Axiom Description

r +s = s+r + is commutative

(r +s)+t = r +(s+t) + is associative (r s) t r (s t) is associative

(rs)t = r (st) concatenation is associative

r (s+t) = rs+rt (s+t)r = sr +tr

concatenation distributes over +

λr = r λ is the identity element for λr r rλ = r

λ is the identity element for concatenation

r* = ( r +λ)* relation between * and λ

38r** = r* * is idempotent

Operator PrecedenceOperator Precedence

1. Kleene star2. Concatenation3. Union

allows dropping unnecessary parentheses.

39

Converting Regular Expression to a DFA

40

Example: From a Regular Expression to an NFA

Example : (a+b)*abb step by step construction

(a+b)

2a

λ 3 λ

61

λλ 4 5b

λλ

41

Example: From a Regular Expression to an NFA

Example : (a+b)*abbλ

(a+b)*

a

λ

2a

λλ λ6

3

0 7

4 5b

λλ

6

λ

42

Example: From a Regular Expression to an NFA

Example : (a+b)*abbλ (a+b)*abb

a

λ (a b) abb

2 a

10 aλ

λλλ

76

3

84 5b b

λλ7 8

9

4310

C ti FAConverting FA to a Regular Expressiong p

State Elimination

http://www.math.uu.se/~salling/Movies/FA_to_RegExpression.mov

44

Example

ba ,a

1 2b

b

1 2

a b3

a

Add a new start (s) and a new final (f) state:Add a new start (s) and a new final (f) state:• From s to the ex-starting state (1)• From each ex-final state (1,3) to f

45

From each ex final state (1,3) to f

λ ba ,a

1 2sb

b

1

b3

The original:

ba ,a

1 2ba ,

a

1 2

f λb

b

1 2

3

ab

b

1 2

3

a

Let’s remove state 1!Each combination of input/output to 1 will generateEach combination of input/output to 1 will generate a new path once state 1 is gone

b λ46

;λ ba , λ ;λ ba a; , a ; λ

When state 1 is gone we must be able to make all those transitions!

λ b∪a( );λ ba ,

λ baλ ba ,a

b1 2s

λb

3

a

λ baa

Previous:

3

f λ

λ ba,

bba

λ

47

f λλ

λ ;λ

λ b∪a( )

bλ ba ,a

b1 2s

λ

bba

λ λ 3

λ

λ

48f λ

ba a; ,

λ b∪a( )

b

b( )

λ ba ,a

b1 2s

λ

b

ba

λ λ 3

λ

λb∪a( )a

49f λ

A common mistake: having several arrows between the same pair of states. Join the two arrows (by union of their regular p ( y gexpressions) before going on to the next step.

λ bλ b∪a( )

λ ba ,a

b1 2s

ba

λb

λ 3λ bb∪a( )a ∪

50f λ

λa ; λ

ba( )λ b∪a( )

λ ba ,a1 2s

ba

bλ 3λ bb∪a( )a ∪

f λ

51aλ

Union again..

λ ba( )λ b∪a( )

λ ba ,a1 2s

ba bb∪a( )a ∪

λ 3λ

f λ a∪

52

λ ba( )

Without state 1...

λ b∪a( )

a2sb

bb∪a( )a ∪

3λbb( )

f λ a∪

53

λ ba( )

Now we repeat the same procedure for the state 2...

λ b∪a( )

a2sb

b∪a(a ∪ b3

λ b∪a(a ∪ b

f λ a∪

54

Following the path s-2-3 we concatenate all strings on our way…don’t forget a* in 2!

λ b∪a( )

ab

2sbb∪a( )λ a∗

bb∪a( )a ∪

3

f λ a∪

55

f

When 2 is removed, the path 3 - 2 –3 has also to be preserved, so we concatenate 3-2 string, take a* loop and go back 2-3 with string b!

λ b∪a( )

ab 2sbb∪a( )λ a∗

λbb∪a(a ∪

bba( )λ a

( bb∪a( )a ∪ ba∗)

56

f λ a∪

This is how the FA looks like without state 2:

sbb∪a( )λ a∗

( bb∪a( )a ∪ ba∗)

f λ a∪

( bb∪a( )a ∪ ba)

57

Finally we remove state 3…

sbb∪a( )λ a∗

λ a∪

f ( bb∪a( )a ∪ ba∗)

58

...so we concatenate strings s-3, loop in 3 and 3-f

sbaba *)( ∪λ

3λbabbaa ∗∪∪ ))((

f

babbaa ∪∪ ))((a∪λ

∗∗∪∪ )))((( babbaababa ∗∪ )(λ )( a∪λ

59

Now we can omit state 3!

s

λ

f

∗∗∪∪ )))((( babbaababa ∗∪ )(λ )( a∪λ

From s we have two choices, empty string or the long expression

∪∪ )))((( babbaababa∪ )(λ )( a∪λ

60OR is represented by union ∪, as usually

So union the arrows...

λ∪∗∗∪∪ )))((( babbaababa ∗∪ )( )( a∪λs f

)))((()( )(

...and we are done!

61

Converting FA to a RegularConverting FA to a Regular Expression-p

An Algorithm for State EliminationElimination

62

• We expand our notion of NFA- to allow transitions on arbitrary regular expressions, not simply single symbols or λ .

• Successively eliminate states, replacing transitions that enter and leave a state with a more complicated regular expression, until eventually there are only two states left: a start state, an accepting state, and astates left: a start state, an accepting state, and a single transition connecting them, labeled with a regular expression.

• The resulting regular expression is then our answer.

63

• To begin with, the automaton should have a start state that has no transitions into it (including self-loops), and ( g p ),which is not accepting.

• If your automaton does not obey this property, add a i d dd i inew, non-accepting start state s, and add a λ-transition

from s to the original start state.

The automaton should also have a single accepting• The automaton should also have a single accepting final state with no transitions leaving it, and no self-loops.

• If your automaton does not have it, add a new final state q, change all other accepting states to non-accepting and add λ transitions from them to qaccepting, and add λ -transitions from them to q.

This change clearly doesn't change the language accepted by the automaton.

64

accepted by the automaton.

Repeat the following steps, which eliminate a state:

1. Pick a non-start, non-accepting state q to eliminate. The state q will have i transitions in and j transitions out. Each will be labeled with a regular expression. For each of the ij combinations of transitions into and out of q, replace:

A

BCA

p qC

r

ithwithCBA *

p r

65And delete state q.

2 If several transitions go between a pair of states replace2. If several transitions go between a pair of states, replace them with a single transition that is the union of the individual transitions.

E.g. replace: A

p r

B

ithwith

p rBA∪

66

p

Example b aa b

1 3 4a2

b aabab *∪

a

b1 3 41 3

baabab *)*( ∪1 4

67

N.B. common mistake!

a b

meansba ,*)( ba∪

meansi.e.

NOT *)*( ba ∪See example on s 46 and 47 in Sallings book

NOT )( ba ∪

68

p g

Minimizing DFAb S t P titi iby Set Partitioning

http://www.math.uu.se/~salling/Movies/SubsetConstruction.mov

69

Minimization of DFAMinimization of DFA

The deterministic finite automata are not always the smallest possible accepting the source p p glanguage.

There may be states with the sameThere may be states with the same "acceptance behavior". This applies to states p and q, if for all input words, the automaton always or never moves to a final state from p and q.

70

State Reduction by Set Partitioning (Särskiljandealgoritmen)

The set partitioning technique is similar to one usedThe set partitioning technique is similar to one usedfor partitioning people into groups based on theirresponses to questionnaireresponses to questionnaire.

The following slides show the detailed steps forcomputing equivalent state sets of the starting DFAand constructing the corresponding reduced DFA.

71

b

3

a

a

a

b

a 3

1Starting DFA

bb b

aa

b4

2

0

bb a5

b3

a

aa, b

1,20a, b

Reduced DFA

72b

4,5

State Reduction by Set Partitioning

Step 0: Partition the states into two groupsaccepting and non acceptingaccepting and non-accepting.

{ } { }

P1 P2

aa

b

a 31

a

{ 3, 4, 5 } { 0, 1, 2 }

b ba

b 4

2

0

bb a5

73

State Reduction by Set Partitioning

Step 1: Get the response of each state for each input symbol. Notice that States 3 and 0 show different responses from the ones of the other states in the same setones of the other states in the same set.

P1 P2 P1P2

p1 p1 p1 p2 p1 p1a→ ↑ ↑ ↑ a→ ↑ ↑ ↑ a

b3

a

a→ ↑ ↑ ↑ a→ ↑ ↑ ↑{3, 4, 5 } {0, 1, 2 }

b→ ↓ ↓ ↓ b→ ↓ ↓ ↓b b

aa

ab

a

4

1

0

p2 p1 p1 p2 p1 p1 bb a

b b25

74

Step 2: Partition the sets according to the responses, and go to Step 1 until no partition occurs.

P11 P12 P21 P22

p11 p11 p12 p12

a→ ↑ ↑ a→ ↑ ↑

{4, 5} {3} {1, 2} {0}b→ ↓ ↓ b→↓ ↓

p11 p11 p11 p11

No further partition is possible for the sets P11 and P21 . So the final partition results are as follows:

75

{4, 5} {3} {1, 2} {0}

Minimized DFA consists of four states of the final partition, and the transitions are the one corresponding to the

{4, 5} {3} {1, 2} {0}

and the transitions are the one corresponding to the starting DFA.

{ } { } { } { }

aa

b3

a, ba

ab

b

a

4

31

a

0

b3

a, ba

a, b1,2

b4,5

0,

bb a

b bab 4

25

0aa, b

1,2

b4,5

0,

Minimized DFA Starting DFA

76

g

DFA Minimization Algorithm

The algorithm Why does this work?Partition P ∈ 2Q (power set)Start off with 2 subsets of QP ← { F, {Q-F}}

while ( P is still changing)T ← { }f h t P

Start off with 2 subsets of Q{F} and {Q-F}

for each set s ∈ Pfor each α ∈ Σ

partition s by αinto s1, s2, …, sk

While loop takes Pi→Pi+1 by splitting one or more sets

Pi+1 is at least one step closer to1, 2, , k

T ← T ∪ s1, s2, …, skif T ≠ P then

P ← T

Pi+1 is at least one step closer to the partition with |Q | sets

Maximum of |Q | splits

Partitions are never combinedInitial partition ensures that final

t t i t t

This is a fixed-point algorithm!

77

states are intact

Minimization and partitioning (Salling)

aExample 2.38 A minimal automaton },{ ba=Σ

a ab

aaaΣ

Theorem 2.4A DFA with Nstates is

a

bab abastates is

minimal if its language distinguishes b

a

Σb

b

b

distinguishes N strings

b

The automaton above is minimal, as it has 6 states }{ bbb

78

and distinguishes 6 strings: },,,,,{ abaabaabaε

ab

aaa a

Two strings x,y are distinguished by a language b

aaab

aaεa

aaa aε

ε

(DFA) if there is a distinguishing stringsuch that only one of strings

zab

abab b

aaaε ε

aaaε

εsuch that only one of strings xz, yz ends in acceptingstate (belongs to language).

b aa abε a

aa

aaa

a ab

aε ab aba

Σ

ba

b

εb

79Σ b

b

ab

aaa a

A set of strings is distinguished by b

aaab

aaεa

aaa aε

ε

g ylanguage if each pair of strings is distinguished.

a

ababa

b b

aaaε ε

aaaε

ε

aa

aaaaΣ

ε a b aa ab

ba

εb

ab abaΣ

ba

Σb

b

80

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

a

ababa

b b

aaaε ε

aaaε

εε

aa

aaaaΣ

ε a b aa ab

ba

εb

ab abaΣ

ba

Σb

b

81

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

a

ababa

b b

aaaε ε

aaaε

ε

aa

aaaaΣ

ε a b aa ab

ba

εb

ab abaΣ

ba

Σb

b

82

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

a

ababa

b b

aaaε ε

aaaε

ε

aa

aaaaΣ

ε a b aa ab

εba

εb

ab abaΣ ε

ba

Σb

b

83

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

a

ababa

b b

aaaε ε

aaaε

ε

aa

aaaaΣ

ε a b aa ab

ba

εb

ab abaΣ

ba

Σb

b

84

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

a

ababa

b b

aaaε ε

aaaε

εε

aa

aaaaΣ

ε a b aa ab

ba

εb

ab abaΣ

ba

Σb

b

85

Σb

ab

aaa aWhy ?aaa

We can try shorter strings baaab

aaεa

aaa aε

ε

We can try shorter strings. Will do?aNo as both aba and aa

a

ababa

b b

aaaε ε

aaaε

εNo, as both aba and aa get accept! Check!

aa

aaaaΣ

ε a b aa ab

ba

εb

ab abaΣ

},{ ba=Σ

ba

Σb

b

86

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

a

ababa

b b

aaaε ε

aaaε

ε

aa

aaaaΣ

ε a b aa ab

εba

εb

ab abaΣ ε

ba

Σb

b

87

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

ba

ababa

b

aaaε ε

aaaε

εεb

aa

aaaaΣ

ε a aa ab

ba

εb

ab abaΣ

ba

Σb

b

88

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

ba

ababa

b

aaaε ε

aaaε

εb

aa

aaaaΣ

ε a aa ab

ba

εb

ab abaΣ

ba

Σb

b

89

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

ba

ababa

b

aaaε ε

aaaε

εb

aa

aaaaΣ

ε a aa abε

ba

εb

ab abaΣ ε

ba

Σb

b

90

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

ba

ababa

b

aaaε ε

aaaε

εb

aa

aaaaΣ

ε a aa abε

ba

εb

ab abaΣ ε

ba

Σb

b

91

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

a

ababa

b b

aaaε ε

aaaε

εε

aa

aaaaΣ

ε a b aa ab

ba

εb

ab abaΣ

ba

Σb

b

92

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

a

ababa

b b

aaaε ε

aaaε

ε

aa

aaaaΣ

ε a b aa ab

ba

εb

ab abaΣ

a

ba

Σb

b

93

Σb

ab

aaa ab

aaab

aaεa

aaa aε

ε

ba

ababa

b

aaaε ε

aaaε

εab

aa

aaaaΣ

ε a b aa

ba

εb

ab abaΣ ε

ba

Σb

b

94

Σb

Set Partition Algorithm (Salling book)

Example 2.39 We apply step by step set partition algorithm on the following DFA: }{ ba=Σ },{ ba=Σ

b 4Two strings x,y are distinguished by

bb

2

4a ab

g ylanguage (DFA) if there is a (distinguishing) string a

a b1

35 b

(distinguishing) string z such as that only one of strings xz, yz

d i ia

ab3

6ends in acceptingstate (belongs to language).

95

g g )

}6,5,4,3,2,1{N ti A ti

We search for strings that are distinguished by DFA!

}6,5,4,2,1{ }3{Non-accepting: Accepting:

bb 4b

What happens when we read ?a

ba

a b

21

35 ba ab

Distinguishing string: ora bWhat happens when we read ?a

From 1 and 6 by a we reach 3 which is accepting They form a special group

aa

b36

}3{}5,4,2{ }6,1{

accepting. They form a special group.

}3{}5{ }6,1{}4,2{

What happens when we read ?bFrom 5 we end in 6 by b,

96

}3{}5{ }6,1{}4,2{yleaving its group.

The minimal automaton has four states:

}3{}5{ }6,1{}4,2{

a bb

b}5{

}42{ ab

}6,1{ b }3{}4,2{

a

a

a

97

The Chomsky HierarchyThe Chomsky Hierarchy

98

}0:{ ! ≥nan}0,:{ ≥+ lncba lnln

Context-Free LanguagesNon-regular languages

}{ nnba }{ RwwContext Free Languages

Regular LanguagesRegular Languages

99

Some Additional Examplesfor your excersize

100

Example of subset construction for NFA → DFA

ε

2 a 3

εε

1start 1010

ε

0 6

87 9ε ε a b b

4 5b

ε

NFA N for (a+b)*abb

101

( )

Example of subset construction for NFA → DFA

STATEINPUT SYMBOL

a b

A B CABCD

BBBB

CDCED

EBB

EC

Translation table for DFA

102

Example of subset construction for NFA → DFA

b

b bC

Astart Ba Db b 10E

a

a aa

Result of applying the subset construction of NFA for (a+b)*abb.

103

Another Example of State Elimination

b b

baa

b b

ba,

b0q 1q 2q

b

b b

ba +a

b

0q 1q 2q

b

b0q 1q 2q

104

Another Example of State Elimination

ab b

ba +a

0q 1q 2q

b

babb*

0q 2q)(* babb +

105

Another Example of State Elimination

babb*Resulting Regular Expression

0q 2q

babb

)(* babb +0q 2q)(

*)(**)*( bbabbabbr +=

LMLrL == )()(106

)()(

In GeneralRemoving states cd

estates

iq q jqa ba b

dae* bce*dce*

iq jq

107bae*

Obtaining the final regular expression

1rr

4r

0q fq3r

0q fq2r

*)*(* 213421 rrrrrrr +=

LMLrL == )()(

108

Example: From an NFA to a DFA

b

states a bA B C

Cab b

B B DC B CD B E

A B D Ea b b

aD B EE B C a

a

a

109

Example: From an NFA to a DFA

states a bA B AA B A

B B D

D B ED B E

E B A

b b

A B Da b b

b b

A B D E

a

a

a

110

Example: DFA Minimization

Curr ent Part it ion Split on a Split on b

P0 {s4} {s0, s1, s2, s3} none {s0, s1, s2} {s3}

P1 {s 4}{s3}{s0 s1 s2} none {s0 s2}{s1}

final state

P1 {s 4}{s3}{s0, s1, s2} none {s0, s2}{s1}

P2 {s4}{s3}{s1}{s0, s2} none none

a a ba a

s0

as1

b

s3

bs4

a

b

a b

s0 , s2

as1 s3

bs4

b

a b

s2

b

b

111

Example: DFA Minimization

What about a ( b + c )* ?

a λq4 q5

b

λ

λ λλ λ

First the subset construction:

q0 q1 a λ

q6 q7 c

q3 q8 q2 q9

λ

λ λ

λ λ

First, the subset construction:

ε-c los ure (m ove( s ,*)) b

NFA s ta te s a b cs 0 q 0 q 1, q 2, q 3,

q 4, q 6, q 9

none none

s 1 q 1, q 2, q 3, none q 5, q 8, q 9, q 7, q 8, q 9,

s2

s0 s1

ba

b cq 4, q 6, q 9 q 3, q 4, q 6 q 3, q 4, q 6

s 2 q 5, q 8, q 9,q 3, q 4, q 6

none s 2 s 3

s 3 q 7, q 8, q 9,q 3, q 4, q 6

none s 2 s 3

s3

c

c

112Final states

Example: DFA MinimizationThen, apply the minimization algorithm

s

b

S p lit on s2

s0 s1

c

b

ab c

S p lit onCurren t Pa rt ition a b c

P0 { s 1, s 2, s 3} {s 0} none none none

To produce the minimal DFAs3

c

c

final states

a

b , c

s0 s1

113

Grammars

114

GrammarsGrammars express languages

Example: the English language

predicatephrasenounsentence → _

nounarticlephrasenoun →_

verbpredicate →

115

i lthearticleaarticle

birdnoun →

thearticle

dognounbirdnoun

singsverb →

barksverbsingsverb

116

A derivation of “the bird sings”:A derivation of the bird sings :

⟩⟨⟩⟨⇒⟩⟨ di tht⟩⟨⟩⟨⟩⟨⇒

⟩⟨⟩⟨⇒⟩⟨

predicatenounarticlepredicatephrasenounsentence _

⟩⟨⟩⟨⟩⟨⇒ verbnounarticlep

⟩⟨⟩⟨⟩⟨⇒

⟩⟨⟩⟨⟩⟨⇒

verbbirdtheverbnounthe

⟩⟨⟩⟨⟩⟨⇒

⟩⟨⟩⟨⟩⟨⇒

birdtheverbbirdthesings

117

A derivation of “a dog barks”:A derivation of a dog barks :

predicatephrasenounsentence ⇒

verbphrasenounpredicatephrasenounsentence

⇒ _

verbnounarticlep

⇒_

verbnouna⇒

b kdverbdoga

118

barksdoga⇒

The language of the grammar:

,"barksbirda"{=L

,"barksbirdhet","ingssbirda"

"barksogda","ingssbirdhet"

}

,"ingssogda",barksogda

"ingssogdhet","barksogdhet"}

119

gg }

Notation

birdnoun →

dognoun →

Non-terminal(Variable)

TerminalProduction rule(Variable)

120

Example

→ aSbSGrammar:

λ→→

SaSbSGrammar:

Derivation of sentence: ab

abaSbS ⇒⇒

aSbS → λ→S121

λ→S

λ→→

SaSbSGrammar:

bb

λ→S

aabbaaSbbaSbS ⇒⇒⇒

aabbDerivation of sentence

aabbaaSbbaSbS ⇒⇒⇒

aSbS → λ→S122

Other derivationsOther derivations

aaabbbaaaSbbbaaSbbaSbS ⇒⇒⇒⇒ aaabbbaaaSbbbaaSbbaSbS ⇒⇒⇒⇒

aaaabbbbaaaaSbbbbaaaSbbbaaSbbaSbS

⇒⇒⇒⇒⇒aaaabbbbaaaaSbbbb⇒⇒

123

The language of the grammar

→ aSbSλ→S

}0:{ ≥= nbaL nn }{

124

Formal DefinitionFormal Definition

G ( )PSTVGGrammar ( )PSTVG ,,,=

V S t f i bl:V Set of variables

:T Set of terminal symbolsSet of terminal symbols

:S Start variable

:P Set of production rules

125

ExampleGrammar

→ aSbSG

λ→→

SaSbS

( )PSTVG ,,,=

}{SV }{ bT}{SV = },{ baT =

}{ λSSbSP126

},{ λ→→= SaSbSP

Sentential FormSe e a oA sentence that contains variables and terminalsvariables and terminals

Example

aaabbbaaaSbbbaaSbbaSbS ⇒⇒⇒⇒

sentential forms Sentence(sats)

127

(sats)

*We write: aaabbbS ⇒

Instead of:

aaSbbaSbS ⇒⇒aaabbbaaaSbbb

aaSbbaSbS⇒⇒

⇒⇒

128

In general we write

ww*⇒

In general we write

nww1 ⇒

wwww ⇒⇒⇒⇒ L321

if

nwwww ⇒⇒⇒⇒ 321

By default ww*⇒( )

129

By default ww ⇒( )

Example

→ aSbS *Grammar Derivations

λ→→

SaSbS S

*

⇒λ

abS*⇒

aabbS*⇒

aaabbbS*⇒

130

aaabbbS⇒

GExample

→ aSbSGrammar

λ→→

SaSbS

Derivations

aaSbbS∗⇒

baaaaaSbbbbaaSbb∗⇒

131

Another Grammar Example

→ AbSGGrammar

→→

aAbAbS

λ→ADerivations

abbaAbbAbSbAbS

⇒⇒⇒⇒⇒

aabbbaaAbbbaAbbSabbaAbbAbS⇒⇒⇒

⇒⇒⇒

132

aabbbaa bbba bbS ⇒⇒⇒

More Derivations

aaaAbbbbaaAbbbaAbbAbS ⇒⇒⇒⇒aaaabbbbbaaaaAbbbbb⇒⇒

aaaabbbbbS∗⇒

bbbaaaaaabbbbS∗⇒

bbaS

bbbaaaaaabbbbS

nn∗⇒

133

bbaS⇒

The Language of a Grammar

For a grammar with start variable G Sg

}{)( SGL∗⇒ }:{)( wSwGL ⇒=

String of terminalsString of terminals

134

ExampleF GFor grammar

→AbA

AbSG

λ→→

AaAbAλ→A

}0{)( ≥bbGL nn }0:{)( ≥= nbbaGL nn

∗Since bbaS nn⇒

135

Notation

→ aAbAλ|AbA

λ→Aλ|aAbA→

ti lthearticleaarticle

→ theaarticle |→thearticle →

136

Linear Grammars

137

Linear Grammars

Grammars with at most one variable (non-terminal)at most one variable (non terminal)at the right side of a production

→ AbSSbSExamples:

→→

aAbAAbS

λ→→

SaSbS

λ→→

Aλ→S

138

A Non-Linear Grammar

SSS →Grammar G

S →λ

bSaSaSbS

→→

bSaS →

)}()({)(GL )}()(:{)( wnwnwGL ba ==

139

Another Linear Grammar

Grammar AS →G

aBAAS

→→

λ|AbB →

}0:{)( ≥= nbaGL nn }{)(

140

Right-Linear GrammarsRight-Linear Grammars

All productions have form: xBA→or

xA→

abSS →Example

aS →

141

Left-Linear Grammars

All productions have form BxA→All productions have form BxA→

Aor

xA→Example

BAabAAabS

→→

|aB

BAabA→→ |

142

aB →

Regular Grammars

A grammar is a regular grammar if and only if it is right-linear or left-linear.

143

R l GRegular Grammars Generate

Regular LanguagesRegular Languages

144

Theorem

LanguagesGenerated by

Regular=Generated by

Regular GrammarsLanguages

145

Theorem - Part 1

LanguagesGenerated by

Regular⊆Generated byRegular Grammars

Languages⊆

Any regular grammar generatesa regular language

146

Theorem - Part 2

LanguagesR l

g gGenerated byRegular Grammars

RegularLanguages

⊇Regular Grammars

Any regular language is generated by a regular grammar

147

y g g

Proof – Part 1Proof Part 1

LanguagesGenerated by

Regular⊆Generated byRegular Grammars

Languages⊆

The language generated by)(GLThe language generated by any regular grammar is regular

)(GLG

148

The case of Right-Linear Grammars

Let be a right-linear grammarG g g

We will prove: is regular)(GLp g)(

Proof idea We will construct NFAwith

M)()( GLML =with )()( GLML =

149

Grammar is right-linearG

Example

BaaABaAS |

→→

aBbBBaaA|→

→|→

150

Construct NFA such that every state Mis a grammar variable:

AS special

final stateFV

Bfinal state

BaaABaAS |

→→

aBbBBaaA|→

151

|

Add edges for each production:g p

A

S FV

AaS FV

BB

BABaAS |→

aBbBBaaA|→

152

aBbB |→

Aa

S FVλB

BAS |BaaABaAS |

→→

aBbBBaaA|→

153

A

a a

S VS FV

Bλ a

BaAS |→ B

BaaABaAS |

→→

aBbB |→

154

A

a a

S VS FV

Bλ a

BaAS |→ B

bBaaABaAS |

→→

baBbB |→

155

Aaa a

S FVλ a aBBaAS |→

bBaaA|

156aBbB |→

Aaa a

S FVλ a aB

b

157aaabaaaabBaaaBaAS ⇒⇒⇒⇒

GM GrammarNFA

Aa

BaAS |→a a

abBBBaaA|→

S FVλ a a

abBB |→

Bλ a

GLML )()(b

abaaaabGLML

**)()(

+==

158

abaaaab +

In GeneralA right-linear grammar G

has variables: K,,, 210 VVV

and productions:

VV jmi VaaaV L21→or

mi aaaV L21→or

159

We construct the NFA such that:M

each variable corresponds to a node: iV

0V 1V V FV0V 1V 2V FV

special

….

specialfinal state

160

For each production: jmi VaaaV L21→

we add transitions and intermediate nodes

V V1a 2a aiV jV………1a 2a ma

Example:

161

http://www.cs.duke.edu/csed/jflap/tutorial/grammar/toFA/index.htmlConvert Right-Linear Grammar to FA by JFLAP

Example

9a

M48433421

23110

||

VaaaVaaVVaVaV

→→

1V 3V1a3a2a 4a

5a

9

5193

452

| aVaVVaV

→→ 0V

FV3a3a

4a

8a

5a

9a

04

5193 |aV → 2V

4V8a 9a

5a

)()( MLGL =162

)()(

The case of Left-Linear Grammars

Let be a left-linear grammarG g

We will prove: is regular)(GLp g)(GL

Proof ideaWe will construct a right-linearWe will construct a right lineargrammar with G′ RGLGL )()( ′=

163

Since is left-linear grammarthe productions look like:

Gp

aaBaA→ kaaBaA L21→

A kaaaA L21→

164

Construct right-linear grammar G′

I G aaBaA→In :G kaaBaA L21→

vBA→ vBA→

G′ BAIn :G′ BaaaA k 12L→

BA R165

BvA R→

G′Construct right-linear grammar G′

I G AIn :G kaaaA L21→

vA→ vA→

In :G′ 12aaaA kL→R

166

RvA→

RIt is easy to see that: RGLGL )()( ′=

Since is right-linear, we have:G′

)(GL ′ RGL )( ′ )(GL)(GL GL )( )(GLRegular Regular RegularLanguage

gLanguage

gLanguage

167

Proof - Part 2

LanguagesLanguagesGenerated by

RegularLanguages

⊇Regular Grammars

Languages

Any regular language is generated b l

LGby some regular grammar G

168

Any regular language is generated LG

Proof idea

by some regular grammar G

Proof idea

Since is regularLSince is regularthere is an NFA such that

LM )(MLL =

Construct from a regular grammar h th t

M G)()( GLMLsuch that )()( GLML =

169

Example bp b

aM a aM1q 2q0q

*)*(* abbababL =)(MLL

3q)(MLL =

170

Convert to a right-linear grammarMb

M→ aqqG

a aM0q 1q 2q→

11

10

bqqaqq

1 2

→ 21

11

aqqqq

3q→ 32 bqq

λ→→ 13

qqq

LMLGL == )()(171

λ→3q )()(

In General

For any transition:aq p

Add production: apq →

variable terminal variable

172

F fi l t t qFor any final state: fq

Add production: λ→fqAdd production: λ→fq

173

Since is right-linear grammarG

is also a regular grammar with G

LMLGL == )()(

174

Regular GrammarsA regular grammar is any right-linear or left-linear grammarright linear or left linear grammar

Examples

1G 2GExamples

abSS → AabS →

aS →B

BAabA→ |

175aB →

ObservationRegular grammars generate regular languagesExamples

b1G 2G

abSS →BAabA

AabS→→

|aS →

aBBAabA

→→ |

aabGL *)()( 1 = *)()( 2 abaabGL =176

aabGL )()( 1 = )()( 2 abaabGL =

Chomsky’s Language Hierarchy

Non-regular languages

Regular LanguagesRegular Languages

177

Application: CompilerLexical Analysis

(Lexikalisk analys i Kompilatorteori)

178

What is a compiler?A compiler is program that translates a source

language into an equivalent target language

while (i > 3) {C programa[i] = b[i];

i ++}

C program

}

compiler does this

mov eax, ebxadd eax, 1cmp eax 3

assemblyprogram

179

cmp eax, 3jcc eax, edx

program

What is a compiler?

l f {class foo {int bar;...

Java program

}

compiler

struct foo {int bar;

does this

int bar;...

}

C program

180

What is a compiler?

l f {class foo {int bar;...

Java program

}

compiler

........does this

Java virtual.................

Java virtual machine program

181

Phases of a compilerpSource Program

Lexical AnalyzerScanner

Tokens

Parser Syntax Analyzer

P T

Semantic Analyzer

Parse Tree

Abstract Syntax Tree with

182

yattributes

Compilation in a Nutshell

Source code if (b == 0) a = b;Source code(character stream)

Lexical analysis

if (b == 0) a = b;

Parsing

Token stream if ( b ) a = b ;0==

Parsing

Abstract syntax tree(AST)

if==

b 0

=

a b

;

(AST)Semantic Analysisif

==

b 0

=

b

booleanDecorated AST

int;

183

int b int 0 int alvalue

int b

Stages of AnalysisStages of Analysis

Lexical Analysis – breaking the input up into individual words/tokens

Syntax Analysis – parsing the phrase structure ofthe program

Semantic Analysis – calculating the program’s imeaning

184

Lexical AnalyzerLexical Analyzer

Input is a stream of charactersProduces a stream of names, keywords & , y

punctuation marksDiscards white space and commentsp

185

Lexical AnalyzerLexical Analyzer

Recognize “tokens” in a program source code.The tokens can be variable names, reserved

d t b twords, operators, numbers, … etc.Each kind of token can be specified as an RE,

e g a variable name is of the form [A Zae.g., a variable name is of the form [A-Za-z][A-Za-z0-9]*.

We can then construct an λ NFA to recognize itWe can then construct an λ -NFA to recognize it automatically.

186

Lexical AnalyzerLexical Analyzer

By putting all these λ-NFA’s together, we obtain one which can recognize all different kinds of t k i th i t t itokens in the input string.

We can then convert this λ -NFA to NFA and th t DFA d i l t thi DFAthen to DFA, and implement this DFA as a deterministic program - the lexical analyzer.

187

RE NFA DFA Minimal DFA

Thompson’s Contruction

Subset Contruction

Hopcroft Mi i i tiContruction Contruction Minimization

188