
Automata and Formal Language Theory

Stefan Hetzl

Institute of Discrete Mathematics and Geometry

Vienna University of Technology

9th International Tbilisi Summer School in Logic and Language

Tbilisi, Georgia

September 2013


Introduction

- Formal and natural languages
- How to specify a formal language?
    - Automata
    - Grammars
- Strong connections to:
    - Computability theory
    - Complexity theory
- Applications in computer science:
    - Verification
    - Compiler construction
    - Data formats


Outline

- Deterministic finite automata
- Nondeterministic finite automata
- Automata with ε-transitions
- The class of regular languages
- The pumping lemma for regular languages
- Context-free grammars and languages
- Right linear grammars
- Pushdown Automata
- The pumping lemma for context-free languages
- Grammars in computer science
- Further topics


Finite Automata – A First Example

[State diagram: two states, locked and unlocked, with transitions labelled card and push.]


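Finite Automata – A More Abstract Example

[State diagram: start state $q_0$; transitions $q_0 \xrightarrow{a} q_1$, $q_1 \xrightarrow{a} q_1$, $q_1 \xrightarrow{b} q_2$, $q_2 \xrightarrow{b} q_2$, $q_2 \xrightarrow{c} q_3$, $q_3 \xrightarrow{c} q_3$; final state $q_3$.]

$abbbcc$ ✓ (accepted)

$aab$ ✗ (rejected)

$ac$ ✗ (rejected)

The language accepted by this automaton is

$$L = \{ a^k b^n c^m \mid k, n, m \ge 1 \}$$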


The Error State

[State diagram: the automaton with start state $q_0$, final state $q_3$ and transitions $q_0 \xrightarrow{a} q_1$, $q_1 \xrightarrow{a} q_1$, $q_1 \xrightarrow{b} q_2$, $q_2 \xrightarrow{b} q_2$, $q_2 \xrightarrow{c} q_3$, $q_3 \xrightarrow{c} q_3$, completed by an error state $q_e$ with transitions $q_0 \xrightarrow{b,c} q_e$, $q_1 \xrightarrow{c} q_e$, $q_2 \xrightarrow{a} q_e$, $q_3 \xrightarrow{a,b} q_e$, $q_e \xrightarrow{a,b,c} q_e$.]


Deterministic Finite Automata – Definition

Definition. A deterministic finite automaton (DFA) is a tuple $A = \langle Q, \Sigma, \delta, q_0, F \rangle$ where:

1. $Q$ is a finite set (the states).

2. $\Sigma$ is a finite set (the input symbols).

3. $\delta : Q \times \Sigma \to Q$ (the transition function).

4. $q_0 \in Q$ (the starting state).

5. $F \subseteq Q$ (the final states).


DFA – Example

[State diagram: the automaton with states $q_0, q_1, q_2, q_3$ and error state $q_e$ from the slide "The Error State".]

As tuple: BB1


The Language of a DFA

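Definition. Extend $\delta : Q \times \Sigma \to Q$ to $\hat\delta : Q \times \Sigma^* \to Q$ as follows.

$$\hat\delta(q, w) = \begin{cases} q & \text{if } w = \varepsilon \\ \delta(\hat\delta(q, v), x) & \text{if } w = vx \text{ for } v \in \Sigma^*,\ x \in \Sigma \end{cases}$$

Example. $\hat\delta(q_0, abc) = q_3$, $\hat\delta(q_0, aba) = q_e$

Definition. Let $A = \langle Q, \Sigma, \delta, q_0, F \rangle$ be a DFA. The language accepted by $A$ is

$$L(A) = \{ w \in \Sigma^* \mid \hat\delta(q_0, w) \in F \}.$$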
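The definitions above can be tried out directly. The following is a minimal sketch, assuming the DFA with error state from the earlier slides is encoded with plain strings as state names and a dictionary as transition function; the names `DELTA`, `delta_hat` and `accepts` are choices of this sketch, not part of the slides. The loop computes the same value as the recursive definition of the extended transition function.

```python
# Sketch of the DFA with error state; state names are plain strings (an assumption).
Q = {"q0", "q1", "q2", "q3", "qe"}
SIGMA = {"a", "b", "c"}
Q0 = "q0"
F = {"q3"}

# Transition function delta: Q x Sigma -> Q, total thanks to the error state qe.
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "qe", ("q0", "c"): "qe",
    ("q1", "a"): "q1", ("q1", "b"): "q2", ("q1", "c"): "qe",
    ("q2", "a"): "qe", ("q2", "b"): "q2", ("q2", "c"): "q3",
    ("q3", "a"): "qe", ("q3", "b"): "qe", ("q3", "c"): "q3",
    ("qe", "a"): "qe", ("qe", "b"): "qe", ("qe", "c"): "qe",
}

def delta_hat(q, w):
    """Extended transition function: apply delta to w symbol by symbol."""
    for x in w:
        q = DELTA[(q, x)]
    return q

def accepts(w):
    """w is in L(A) iff delta_hat(q0, w) is a final state."""
    return delta_hat(Q0, w) in F
```

With this encoding, `accepts("abbbcc")` holds while `accepts("aab")` and `accepts("ac")` do not, matching the examples on the earlier slide.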


Designing a DFA

$$L = \{ w \in \{a, b\}^* \mid w \text{ contains an even number of } a\text{'s and an even number of } b\text{'s} \}$$

BB2
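One possible design, which may differ from the blackboard solution BB2, uses four states that record the parity of the number of a's and b's read so far. The sketch below assumes states are encoded as pairs of bits; the function names are illustrative only.

```python
# States are pairs (parity of a's, parity of b's); 0 = even, 1 = odd.
START = (0, 0)
FINAL = {(0, 0)}

def step(state, x):
    """Transition function: flip the parity bit that belongs to the symbol read."""
    pa, pb = state
    if x == "a":
        return (1 - pa, pb)
    if x == "b":
        return (pa, 1 - pb)
    raise ValueError(f"symbol {x!r} is not in the alphabet {{a, b}}")

def accepts_even_even(w):
    """Run the four-state DFA on w and test whether it ends in the final state."""
    state = START
    for x in w:
        state = step(state, x)
    return state in FINAL
```

The design choice is that each state remembers exactly the information needed to decide membership, which is the usual way to arrive at a DFA for such a language.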


Outline

✓ Deterministic finite automata
⇒ Nondeterministic finite automata
- Automata with ε-transitions
- The class of regular languages
- The pumping lemma for regular languages
- Context-free grammars and languages
- Right linear grammars
- Pushdown Automata
- The pumping lemma for context-free languages
- Grammars in computer science
- Further topics


Nondeterministic Finite Automata – A Motivating Example

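Automaton for accepting $L = \{ wab \mid w \in \{a, b\}^* \}$?

[State diagram: start state $q_0$; transitions $q_0 \xrightarrow{a,b} q_0$, $q_0 \xrightarrow{a} q_1$, $q_1 \xrightarrow{b} q_2$; final state $q_2$.]

Nondeterminism $\Longrightarrow$ consider all possible runs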


Nondeterministic Finite Automata – Definition

Definition. A nondeterministic finite automaton (NFA) is a tuple $A = \langle Q, \Sigma, \Delta, q_0, F \rangle$ where:

1. $Q$ is a finite set (the states).

2. $\Sigma$ is a finite set (the input symbols).

3. $\Delta \subseteq Q \times \Sigma \times Q$ (the transition relation).

4. $q_0 \in Q$ (the starting state).

5. $F \subseteq Q$ (the final states).


NFA – Example

[State diagram: start state $q_0$; transitions $q_0 \xrightarrow{a,b} q_0$, $q_0 \xrightarrow{a} q_1$, $q_1 \xrightarrow{b} q_2$; final state $q_2$.]

As tuple: BB3


The Language of an NFA

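Definition. Extend $\Delta \subseteq Q \times \Sigma \times Q$ to $\hat\Delta \subseteq Q \times \Sigma^* \times Q$ as follows.

$(q, w, q) \in \hat\Delta$ if $w = \varepsilon$

$(q, w, q^*) \in \hat\Delta$ if $w = vx$ for $v \in \Sigma^*$, $x \in \Sigma$, and $(q, v, q') \in \hat\Delta$, and $(q', x, q^*) \in \Delta$

Example. $(q_0, ab, q_0), (q_0, ab, q_2) \in \hat\Delta$, $(q_0, bb, q_0) \in \hat\Delta$

Definition. Let $A = \langle Q, \Sigma, \Delta, q_0, F \rangle$ be an NFA. The language accepted by $A$ is

$$L(A) = \{ w \in \Sigma^* \mid \exists q \in F \text{ s.t. } (q_0, w, q) \in \hat\Delta \}.$$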
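To consider all possible runs at once, one can track the set of states reachable after each symbol. The sketch below assumes the example NFA for words ending in ab is stored as a set of triples; `NFA_DELTA`, `reachable` and `nfa_accepts` are names introduced here for illustration.

```python
# Transition relation Delta as a set of triples (p, x, q) for the NFA accepting {wab}.
NFA_DELTA = {
    ("q0", "a", "q0"), ("q0", "b", "q0"),
    ("q0", "a", "q1"),
    ("q1", "b", "q2"),
}
NFA_START = "q0"
NFA_FINAL = {"q2"}

def reachable(q, w):
    """All states r with (q, w, r) in the extended relation Delta-hat."""
    current = {q}
    for x in w:
        current = {r for (p, y, r) in NFA_DELTA if p in current and y == x}
    return current

def nfa_accepts(w):
    """w is accepted iff some final state is reachable from the start state."""
    return bool(reachable(NFA_START, w) & NFA_FINAL)
```

Here `reachable("q0", "ab")` contains both `q0` and `q2`, which corresponds to the two triples listed in the example above.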


Equivalence of DFA and NFA

Theorem. Let $L \subseteq \Sigma^*$. Then there is a DFA $D$ with $L(D) = L$ iff there is an NFA $N$ with $L(N) = L$.

Proof (BB4).

1. Converting D to N: easy.

2. Converting N to D: subset construction.


The Subset Construction – Example

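Automaton for accepting $L = \{ wab \mid w \in \{a, b\}^* \}$:

[State diagram: start state $q_0$; transitions $q_0 \xrightarrow{a,b} q_0$, $q_0 \xrightarrow{a} q_1$, $q_1 \xrightarrow{b} q_2$; final state $q_2$.]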

Conversion to DFA: BB5

In practice: only construct reachable states
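The remark about reachable states can be made concrete with a worklist algorithm. The following is a sketch under the assumption that the NFA is given as a set of triples; the function name `subset_construction` and its return format are choices made here, and the blackboard conversion BB5 may be presented differently.

```python
from collections import deque

def subset_construction(delta, start, finals, alphabet):
    """Build only the reachable part of the subset DFA for an NFA given as triples."""
    start_set = frozenset({start})
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        s = queue.popleft()
        for x in sorted(alphabet):
            # Successor subset: every state reachable from some state in s via x.
            t = frozenset(r for (p, y, r) in delta if p in s and y == x)
            dfa_delta[(s, x)] = t
            if t not in seen:
                seen.add(t)
                queue.append(t)
    # A subset is final iff it contains at least one final NFA state.
    dfa_finals = {s for s in seen if s & finals}
    return seen, dfa_delta, start_set, dfa_finals

nfa_delta = {("q0", "a", "q0"), ("q0", "b", "q0"), ("q0", "a", "q1"), ("q1", "b", "q2")}
states, dfa_delta, dfa_start, dfa_finals = subset_construction(
    nfa_delta, "q0", {"q2"}, {"a", "b"}
)
```

For this NFA only three of the eight subsets of the state set are reachable, namely {q0}, {q0, q1} and {q0, q2}.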


The Subset Construction – Lower Bound

Theorem. There are $L_n \subseteq \Sigma^*$, $n \ge 1$, and NFAs $N_n$ with $n + 1$ states and $L(N_n) = L_n$ such that every DFA $D_n$ with $L(D_n) = L_n$ has at least $2^n$ states.

Proof. Let $\Sigma = \{a, b\}$ and for $n \ge 1$ define

$$L_n = \{ wav \mid w, v \in \Sigma^*, |v| = n - 1 \}$$

BB6
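As an experiment accompanying the theorem, one can count the states that the subset construction produces for a natural NFA accepting this language. The sketch assumes an NFA with states 0 to n in which state 0 guesses the marked a; this particular automaton and the function name are assumptions of the sketch. Note that the count only shows that the subset construction yields $2^n$ states here; that no smaller DFA exists is the content of the proof.

```python
def count_subset_states(n):
    """Number of reachable subset-DFA states for an (n+1)-state NFA accepting L_n."""
    # Assumed NFA: state 0 loops on a and b and may move to state 1 on a;
    # states 1..n-1 advance on any symbol; state n is final.
    delta = {(0, "a", 0), (0, "b", 0), (0, "a", 1)}
    for i in range(1, n):
        delta |= {(i, "a", i + 1), (i, "b", i + 1)}
    start = frozenset({0})
    seen, todo = {start}, [start]
    while todo:
        s = todo.pop()
        for x in "ab":
            t = frozenset(r for (p, y, r) in delta if p in s and y == x)
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return len(seen)
```

The reachable subsets are exactly those that contain state 0, because the subset reached after a word records which of its last n positions carry an a.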


Outline

✓ Deterministic finite automata
✓ Nondeterministic finite automata
⇒ Automata with ε-transitions
- The class of regular languages
- The pumping lemma for regular languages
- Context-free grammars and languages
- Right linear grammars
- Pushdown Automata
- The pumping lemma for context-free languages
- Grammars in computer science
- Further topics


Epsilon-Transitions – A Motivating Example

Automaton for accepting decimal representations of integers:

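$$\Sigma = \{-, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$$

$$L = \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}^+ \cup \{ -w \mid w \in \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}^+ \}$$

[State diagram: start state $q_0$; transitions $q_0 \xrightarrow{-,\ \varepsilon} q_1$, $q_1 \xrightarrow{0-9} q_2$, $q_2 \xrightarrow{0-9} q_2$; final state $q_2$.]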


ε-NFA – Definition

Definition. A nondeterministic finite automaton with ε-transitions (ε-NFA) is a tuple $A = \langle Q, \Sigma, \Delta, q_0, F \rangle$ where:

1. $Q$ is a finite set (the states).

2. $\Sigma$ is a finite set (the input symbols).

3. $\Delta \subseteq Q \times (\Sigma \cup \{\varepsilon\}) \times Q$ (the transition relation).

4. $q_0 \in Q$ (the starting state).

5. $F \subseteq Q$ (the final states).


ε-NFA – Properties

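Definition. The extended transition relation $\hat\Delta$ is defined as for NFAs, but also includes ε-transitions.

Definition. Let $A = \langle Q, \Sigma, \Delta, q_0, F \rangle$ be an ε-NFA. The language accepted by $A$ is

$$L(A) = \{ w \in \Sigma^* \mid \exists q \in F \text{ s.t. } (q_0, w, q) \in \hat\Delta \}.$$

Theorem. Let $L \subseteq \Sigma^*$. Then there is a DFA $D$ with $L(D) = L$ iff there is an ε-NFA $N$ with $L(N) = L$.

Proof. Modified subset construction (ε-closed subsets).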
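The notion of an ε-closed subset can be illustrated on the automaton for decimal integers. The sketch below makes two encoding assumptions that are not in the slides: the empty string stands for an ε label, and the ASCII hyphen stands for the minus sign. All names are illustrative.

```python
EPS = ""  # the empty string stands for an epsilon label (a convention of this sketch)
DIGITS = "0123456789"

# epsilon-NFA for decimal integers: q0 --(-, eps)--> q1 --digit--> q2, q2 loops on digits.
E_DELTA = {("q0", "-", "q1"), ("q0", EPS, "q1")}
E_DELTA |= {("q1", d, "q2") for d in DIGITS}
E_DELTA |= {("q2", d, "q2") for d in DIGITS}

def eps_closure(states):
    """All states reachable from `states` using epsilon-transitions only."""
    closure, todo = set(states), list(states)
    while todo:
        p = todo.pop()
        for (p2, y, r) in E_DELTA:
            if p2 == p and y == EPS and r not in closure:
                closure.add(r)
                todo.append(r)
    return closure

def eps_accepts(w, start="q0", finals=frozenset({"q2"})):
    """Track epsilon-closed sets of states while reading w."""
    current = eps_closure({start})
    for x in w:
        moved = {r for (p, y, r) in E_DELTA if p in current and y == x}
        current = eps_closure(moved)
    return bool(current & finals)
```

The sets computed by `eps_accepts` are the ε-closed subsets that would serve as states of the DFA in the modified subset construction.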


Outline

✓ Deterministic finite automata
✓ Nondeterministic finite automata
✓ Automata with ε-transitions
⇒ The class of regular languages
- The pumping lemma for regular languages
- Context-free grammars and languages
- Right linear grammars
- Pushdown Automata
- The pumping lemma for context-free languages
- Grammars in computer science
- Further topics


The Class of Regular Languages

Corollary. Let $L \subseteq \Sigma^*$. The following are equivalent:

- There is a DFA $D$ with $L(D) = L$.
- There is an NFA $N$ with $L(N) = L$.
- There is an ε-NFA $N'$ with $L(N') = L$.

Definition. $L \subseteq \Sigma^*$ is called a regular language if there is a finite automaton $A$ with $L(A) = L$.


Closure Properties of Regular Languages

Theorem. If $L_1, L_2 \subseteq \Sigma^*$ are regular, then $L_1 \cup L_2$ is regular.

Proof. BB7

Theorem. If $L \subseteq \Sigma^*$ is regular, then $L^c = \Sigma^* \setminus L$ is regular.

Proof. BB8

Theorem. If $L_1, L_2 \subseteq \Sigma^*$ are regular, then $L_1 \cap L_2$ is regular.

Proof. $L_1 \cap L_2 = (L_1^c \cup L_2^c)^c$.
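The three closure properties can be sketched on DFAs represented as tuples. This sketch uses the product construction for union and the swapping of final states for complement; these are standard techniques, but the blackboard proofs BB7 and BB8 may proceed differently (for instance via ε-NFAs). The tuple layout and all function names are assumptions of this sketch, and both automata are assumed to be total and to share one alphabet.

```python
from itertools import product

def complement(dfa):
    """Swap final and non-final states of a total DFA."""
    states, alphabet, delta, start, finals = dfa
    return (states, alphabet, delta, start, states - finals)

def union(d1, d2):
    """Product automaton: run both DFAs in parallel, accept if either accepts."""
    q1, alphabet, t1, s1, f1 = d1
    q2, _, t2, s2, f2 = d2
    states = set(product(q1, q2))
    delta = {((p, q), x): (t1[(p, x)], t2[(q, x)]) for (p, q) in states for x in alphabet}
    finals = {(p, q) for (p, q) in states if p in f1 or q in f2}
    return (states, alphabet, delta, (s1, s2), finals)

def intersection(d1, d2):
    """De Morgan: the intersection is the complement of the union of the complements."""
    return complement(union(complement(d1), complement(d2)))

def run(dfa, w):
    """Decide whether the DFA accepts the word w."""
    _, _, delta, q, finals = dfa
    for x in w:
        q = delta[(q, x)]
    return q in finals

def parity_dfa(letter):
    """Two-state DFA over {a, b} accepting words with an even number of `letter`."""
    delta = {(s, x): (1 - s if x == letter else s) for s in (0, 1) for x in "ab"}
    return ({0, 1}, {"a", "b"}, delta, 0, {0})

even_a, even_b = parity_dfa("a"), parity_dfa("b")
both_even = intersection(even_a, even_b)
```

The automaton `both_even` accepts exactly the words with an even number of a's and an even number of b's, so it recognises the language from the slide "Designing a DFA".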


Outline

✓ Deterministic finite automata
✓ Nondeterministic finite automata
✓ Automata with ε-transitions
✓ The class of regular languages
⇒ The pumping lemma for regular languages
- Context-free grammars and languages
- Right linear grammars
- Pushdown Automata
- The pumping lemma for context-free languages
- Grammars in computer science
- Further topics


The Pumping Lemma for Regular Languages

Lemma (Pumping Lemma)

Let $L$ be a regular language. Then there is an $n \in \mathbb{N}$ such that every $w \in L$ with $|w| \ge n$ can be written as $w = v_1 v_2 v_3$ with

1. $v_2 \ne \varepsilon$,

2. $|v_1 v_2| \le n$, and

3. $v_1 v_2^k v_3 \in L$ for all $k \ge 0$.

Proof. BB9


Non-Regular Languages

The pumping lemma gives a way to show that a language $L$ is not regular: for every $n$, exhibit a word $w \in L$ with $|w| \ge n$ that has no decomposition $w = v_1 v_2 v_3$ satisfying conditions 1 to 3.

Example. $L = \{ a^m b^m \mid m \ge 1 \}$ is not regular (BB10).
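A possible write-up of the argument is sketched below; the blackboard proof BB10 is not part of this transcript, so the choice of the word $w$ is an assumption of this sketch.

```latex
Suppose $L = \{a^m b^m \mid m \ge 1\}$ were regular, and let $n$ be the constant
from the pumping lemma. Choose $w = a^{n+1} b^{n+1} \in L$, so $|w| \ge n$.
Let $w = v_1 v_2 v_3$ be a decomposition with $v_2 \ne \varepsilon$ and $|v_1 v_2| \le n$.
Then $v_1 v_2$ lies inside the block $a^{n+1}$, hence $v_2 = a^j$ for some $j \ge 1$.
Pumping with $k = 0$ gives
\[
  v_1 v_2^0 v_3 = a^{n+1-j}\, b^{n+1},
\]
which has fewer $a$'s than $b$'s and therefore is not in $L$.
This contradicts condition 3, so $L$ is not regular.
```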


Outline

✓ Deterministic finite automata
✓ Nondeterministic finite automata
✓ Automata with ε-transitions
✓ The class of regular languages
✓ The pumping lemma for regular languages
⇒ Context-free grammars and languages
- Right linear grammars
- Pushdown Automata
- The pumping lemma for context-free languages
- Grammars in computer science
- Further topics


Context-Free Grammars – A First Example

How can we specify the set of all arithmetical expressions?

E.g. $12$, $30 + 21 \cdot 6$, $(123 + 7) \cdot 15 + 88$, ...

$$E \to N \mid E + E \mid E \cdot E \mid (E)$$

$$N \to D \mid DN$$

$$D \to 0 \mid 1 \mid 2 \mid 3 \mid 4 \mid 5 \mid 6 \mid 7 \mid 8 \mid 9$$


Context-Free Grammars – Definition

Definition. A context-free grammar (CFG) is a tuple $G = \langle N, T, P, S \rangle$ where

1. $N$ is a finite set of symbols (the nonterminals),

2. $T$ is a finite set of symbols (the terminals),

3. $P$ is a finite set of production rules of the form $A \to w$ where $A \in N$ and $w \in (N \cup T)^*$,

4. $S \in N$ (the start symbol).


Context-Free Grammars – Example

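$$G = \langle \mathit{NT}, T, P, S \rangle$$

$$\mathit{NT} = \{N, D, E\}$$

$$T = \{+, \cdot, (, ), 0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$$

$$P = \{E \to N,\ E \to E + E,\ E \to E \cdot E,\ E \to (E),\ N \to D,\ N \to DN,\ D \to 0,\ D \to 1,\ D \to 2,\ D \to 3,\ D \to 4,\ D \to 5,\ D \to 6,\ D \to 7,\ D \to 8,\ D \to 9\}$$

$$S = E$$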


The Language of a Grammar

Let $G = \langle N, T, P, S \rangle$ be a CFG.

Definition. For every $A \to w \in P$ and every $uAv \in (N \cup T)^*$ define

$$uAv \Rightarrow_G uwv.$$

The derivation relation $\Rightarrow_G^*$ is the reflexive and transitive closure of $\Rightarrow_G$.

Definition. The language of $G$ is $L(G) = \{ w \in T^* \mid S \Rightarrow_G^* w \}$.

Definition. $L \subseteq \Sigma^*$ is called context-free if there is a context-free grammar $G$ with $L(G) = L$.


Context-Free Grammars – Example Derivation

$$E \to N \mid E + E \mid E \cdot E \mid (E)$$

$$N \to D \mid DN$$

$$D \to 0 \mid 1 \mid 2 \mid 3 \mid 4 \mid 5 \mid 6 \mid 7 \mid 8 \mid 9$$

Example derivation: BB11
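The blackboard derivation BB11 is not part of this transcript. As an illustration, a leftmost derivation of the expression $1 + 2 \cdot 3$ in this grammar could look as follows; the choice of expression is an assumption of this sketch, and each step replaces one nonterminal using one production.

```latex
\begin{align*}
E &\Rightarrow_G E + E \Rightarrow_G N + E \Rightarrow_G D + E \Rightarrow_G 1 + E \\
  &\Rightarrow_G 1 + E \cdot E \Rightarrow_G 1 + N \cdot E \Rightarrow_G 1 + D \cdot E
   \Rightarrow_G 1 + 2 \cdot E \\
  &\Rightarrow_G 1 + 2 \cdot N \Rightarrow_G 1 + 2 \cdot D \Rightarrow_G 1 + 2 \cdot 3
\end{align*}
```

The same word can also be derived by starting with $E \Rightarrow_G E \cdot E$, so derivations in this grammar are not unique.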


Formalisms

                          Automata      Grammars

Regular Languages         DFAs, NFAs    ?

Context-Free Languages    ?             CFG


Outline

✓ Deterministic finite automata
✓ Nondeterministic finite automata
✓ Automata with ε-transitions
✓ The class of regular languages
✓ The pumping lemma for regular languages
✓ Context-free grammars and languages
⇒ Right linear grammars
• Pushdown Automata
• The pumping lemma for context-free languages
• Grammars in computer science
• Further topics


Right Linear Grammars

Definition

A grammar G = 〈N, T, P, S〉 is called right linear if all productions are of one of the following forms:

A → xB where x ∈ T, B ∈ N

A → x where x ∈ T

A → ε

Theorem

Let L ⊆ Σ∗. Then L is regular iff L has a right linear grammar.

Proof (BB12).

1. From right linear grammar to ε-NFA.

2. From NFA to right linear grammar.

Remark: There is an analogous notion of left linear grammars, with an analogous result.
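The blackboard proof BB12 is not reproduced in this transcript. The sketch below illustrates one common way to carry out step 1, under the assumption that symbols are single characters. With the three production forms above, this variant needs no ε-moves, so it may differ from the construction used in the lecture. The state name `#final` and the sample grammar (words over {a, b} ending in b) are made up for the example.

```python
def grammar_to_nfa(productions, start):
    """Build an NFA from a right linear grammar.

    productions maps each nonterminal to a list of right-hand sides, each of
    which is "" (epsilon), "x" (one terminal) or "xB" (terminal, nonterminal).
    The states are the nonterminals plus one fresh accepting state.
    """
    final = "#final"  # hypothetical name for the fresh accepting state
    delta = {}        # (state, symbol) -> set of successor states
    accepting = {final}
    for a, rhss in productions.items():
        for rhs in rhss:
            if rhs == "":
                accepting.add(a)                                   # A -> epsilon
            elif len(rhs) == 1:
                delta.setdefault((a, rhs), set()).add(final)       # A -> x
            else:
                delta.setdefault((a, rhs[0]), set()).add(rhs[1:])  # A -> xB
    return delta, start, accepting

def nfa_accepts(nfa, word):
    """Simulate the NFA by tracking the set of currently possible states."""
    delta, start, accepting = nfa
    current = {start}
    for symbol in word:
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
    return bool(current & accepting)

# Illustrative grammar: S -> aS | bS | b
ENDS_IN_B = grammar_to_nfa({"S": ["aS", "bS", "b"]}, "S")
print(nfa_accepts(ENDS_IN_B, "aab"))
print(nfa_accepts(ENDS_IN_B, "ba"))
```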


Right Linear Grammars – Example

Let L = {a^m b^n | m, n ≥ 0} (BB13)

Right linear grammar

Automaton
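The grammar and automaton from BB13 are not in this transcript. One possible pair is sketched below, as an assumption rather than the lecture's own solution: an automaton with states S (reading a's) and B (reading b's), and the right linear grammar obtained from it by turning each transition p --x--> q into a production p → xq and each accepting state p into p → ε.

```python
# Assumed automaton for a^m b^n: both states are accepting, S is the start state.
DELTA = {("S", "a"): {"S"}, ("S", "b"): {"B"}, ("B", "b"): {"B"}}
ACCEPTING = {"S", "B"}

def nfa_to_grammar(delta, accepting):
    """Return productions as a dict: state -> list of (x, q) pairs or () for epsilon."""
    productions = {}
    for (p, x), targets in delta.items():
        for q in targets:
            productions.setdefault(p, []).append((x, q))  # p -> x q
    for p in accepting:
        productions.setdefault(p, []).append(())          # p -> epsilon
    return productions

def generates(productions, nonterminal, word):
    """True if nonterminal =>* word; terminates because each rule reads a symbol."""
    for rhs in productions.get(nonterminal, []):
        if rhs == ():
            if word == "":
                return True
        else:
            x, q = rhs
            if word.startswith(x) and generates(productions, q, word[len(x):]):
                return True
    return False

GRAMMAR = nfa_to_grammar(DELTA, ACCEPTING)
print(GRAMMAR)
print(generates(GRAMMAR, "S", "aabbb"), generates(GRAMMAR, "S", "aba"))
```

Read as a grammar, the result is S → aS | bB | ε and B → bB | ε.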


Formalisms

                          Automata      Grammars
Regular Languages         DFA, NFA      LLG, RLG
Context-Free Languages    ?             CFG


Regular vs. Context-Free Languages

Theorem

Every regular language is context-free.

Proof. Every regular language has a right linear grammar. Every right linear grammar is context-free.

Theorem

There is a context-free language which is not regular.

Proof. L = {a^n b^n | n ≥ 1} is not regular (pumping lemma), but S → ab | aSb is a context-free grammar for L.
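To see how the grammar S → ab | aSb captures the matching of a's and b's, the following small sketch mirrors the two productions directly. The function names `generate` and `in_language` are hypothetical.

```python
def generate(n):
    """Word obtained by applying S -> aSb (n - 1) times and then S -> ab."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return "ab" if n == 1 else "a" + generate(n - 1) + "b"

def in_language(word):
    """Recursive membership test that undoes one production per call."""
    if word == "ab":                       # S -> ab
        return True
    return (len(word) >= 4 and word[0] == "a" and word[-1] == "b"
            and in_language(word[1:-1]))   # S -> aSb

print(generate(3))
print(in_language("aabb"), in_language("abab"))
```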


Outline

✓ Deterministic finite automata
✓ Nondeterministic finite automata
✓ Automata with ε-transitions
✓ The class of regular languages
✓ The pumping lemma for regular languages
✓ Context-free grammars and languages
✓ Right linear grammars
⇒ Pushdown Automata
• The pumping lemma for context-free languages
• Grammars in computer science
• Further topics


Automata for Context-Free Languages?

• Well-balanced strings of parentheses, e.g. (), (()()), ((()())()) ∈ W but ((), )(() ∉ W
• Context-free grammar for W: E → EE | (E) | ε
• W is not regular (by pumping lemma, BB14)
• Generating a language vs. accepting a language
• How to accept a well-balanced string? (BB15)
  ⟹ Need more than constant memory
  ⟹ For context-free languages: automaton with a stack
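The idea of accepting W with a stack can be sketched as follows. This is an illustration of the principle only, not the blackboard solution BB15: every opening parenthesis pushes a marker, every closing parenthesis pops one, and the word is accepted if the stack is empty at the end. The marker symbol `"1"` and the function name are arbitrary choices.

```python
def well_balanced(word):
    """Check membership in W using an explicit stack of markers."""
    stack = []
    for ch in word:
        if ch == "(":
            stack.append("1")      # remember one unmatched "("
        elif ch == ")":
            if not stack:          # ")" without a matching "("
                return False
            stack.pop()
        else:
            raise ValueError(f"unexpected symbol {ch!r}")
    return not stack               # accepted only if every "(" was matched

for w in ["()", "(()())", "((()())())", "(()", ")(()"]:
    print(w, well_balanced(w))
```

The stack height is unbounded, which is exactly the memory a finite automaton lacks.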


Pushdown Automata – Definition

Definition

A pushdown automaton (PDA) is a tuple A = 〈Q, Σ, Γ, ∆, q0, Z0, F〉 where:

1. Q is a finite set (the states).

2. Σ is a finite set (the input symbols).

3. Γ is a finite set (the stack symbols).

4. ∆ ⊆ Q × (Σ ∪ {ε}) × Γ × Q × Γ∗ (the transition relation), where for each (q, x, z) ∈ Q × (Σ ∪ {ε}) × Γ there are only finitely many (q′, w) ∈ Q × Γ∗ s.t. (q, x, z, q′, w) ∈ ∆.

5. q0 ∈ Q (the starting state).

6. Z0 ∈ Γ (the starting stack symbol).

7. F ⊆ Q (the final states).


Computation of a PDA

Let A = 〈Q, Σ, Γ, ∆, q0, Z0, F〉 be a PDA.

Definition

A configuration of A is a triple (q, w, v) ∈ Q × Σ∗ × Γ∗ where

• q is the current state,
• w is the remaining input, and
• v is the current stack contents.

Convention: top of the stack is on the left

Definition

Define the binary step relation ⊢A on configurations of A as:

(q, xw, yv) ⊢A (p, w, uv) if (q, x, y, p, u) ∈ ∆

Define ⊢∗A as the reflexive and transitive closure of ⊢A.
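The step relation translates almost literally into code. The sketch below assumes that input and stack symbols are single characters, that the stack is a string with its top on the left (as in the convention above), and that ε is written as the empty string. The function name `successors` and the sample transitions are illustrative only.

```python
def successors(config, transitions):
    """All configurations reachable from config in one step.

    config      = (state, remaining_input, stack), top of stack on the left
    transitions = set of 5-tuples (q, x, y, p, u) with x == "" for epsilon
    """
    q, w, stack = config
    if not stack:                  # no step is possible on an empty stack
        return set()
    top, rest = stack[0], stack[1:]
    result = set()
    for (q1, x, y, p, u) in transitions:
        # (q, xw, yv) |- (p, w, uv): read x, replace the top symbol y by u
        if q1 == q and y == top and w.startswith(x):
            result.add((p, w[len(x):], u + rest))
    return result

# Two sample transitions, with "Z" standing in for the start symbol Z0.
SAMPLE = {("q0", "(", "Z", "q1", "1Z"), ("q1", "", "Z", "q0", "Z")}
print(successors(("q0", "()", "Z"), SAMPLE))
```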


The Language of a PDA

Definition

Let A = 〈Q, Σ, Γ, ∆, q0, Z0, F〉 be a PDA. The language accepted by A by final state is defined as:

L(A) = {w ∈ Σ∗ | (q0, w, Z0) ⊢∗A (q, ε, v) for some q ∈ F and some v ∈ Γ∗}

Definition

Let A = 〈Q, Σ, Γ, ∆, q0, Z0, F〉 be a PDA. The language accepted by A by empty stack is defined as:

N(A) = {w ∈ Σ∗ | (q0, w, Z0) ⊢∗A (q, ε, ε) for some q ∈ Q}


Example

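Σ = {(, )}, Γ = {Z0, 1}

[State diagram: q0 is the start state and the only final state. Transitions are written as input, top of stack / replacement.]

q0 → q1 : (, Z0/1Z0
q1 → q1 : (, 1/11
q1 → q1 : ), 1/ε
q1 → q0 : ε, Z0/Z0

Derivation of (()()): BB16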
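The computation from BB16 is not in this transcript. The sketch below reconstructs one, assuming the diagram is read as follows: q0 is both start and final state, q0 moves to q1 on "(" pushing a 1, q1 pushes a 1 on "(" and pops a 1 on ")", and q1 returns to q0 by an ε-move when Z0 is on top. The single character `Z` stands in for Z0, and `accepting_run` is a hypothetical helper.

```python
from collections import deque

# Transitions (q, x, y, p, u) as read from the diagram; "" denotes epsilon.
TRANSITIONS = {
    ("q0", "(", "Z", "q1", "1Z"),
    ("q1", "(", "1", "q1", "11"),
    ("q1", ")", "1", "q1", ""),
    ("q1", "", "Z", "q0", "Z"),
}
START, BOTTOM, FINAL = "q0", "Z", {"q0"}

def accepting_run(word):
    """Return the configurations of an accepting run (by final state), or None."""
    start = (START, word, BOTTOM)
    parent = {start: None}
    queue = deque([start])
    while queue:
        config = queue.popleft()
        q, w, stack = config
        if q in FINAL and w == "":
            run = []
            while config is not None:      # walk back to the start configuration
                run.append(config)
                config = parent[config]
            return run[::-1]
        if not stack:
            continue
        for (q1, x, y, p, u) in TRANSITIONS:
            if q1 == q and y == stack[0] and w.startswith(x):
                nxt = (p, w[len(x):], u + stack[1:])
                if nxt not in parent:
                    parent[nxt] = config
                    queue.append(nxt)
    return None

for config in accepting_run("(()())"):
    print(config)
```

Under this reading the run on (()()) passes through eight configurations and ends in (q0, ε, Z0).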


Final-State Acceptance vs. Null-Stack Acceptance

Theorem

Let L ⊆ Σ∗. Then the following are equivalent:

1. There is a PDA A_N with N(A_N) = L.

2. There is a PDA A_F with L(A_F) = L.

Proof (BB17).

1 ⇒ 2

2 ⇒ 1
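The proof BB17 is not included in this transcript. For the direction 1 ⇒ 2, a common textbook construction places a fresh bottom symbol under the stack and moves to a fresh final state whenever that symbol becomes visible. The sketch below follows that idea under several assumptions: single-character symbols, the fresh names `s`, `X` and `f` do not clash with existing ones, and the sample PDA for {a^n b^n | n ≥ 1} is made up for the example. The reachability search terminates only when finitely many configurations are reachable.

```python
from collections import deque

def reachable(transitions, start_config):
    """All configurations reachable from start_config (if that set is finite)."""
    seen, queue = {start_config}, deque([start_config])
    while queue:
        q, w, stack = queue.popleft()
        if not stack:
            continue
        for (q1, x, y, p, u) in transitions:
            if q1 == q and y == stack[0] and w.startswith(x):
                nxt = (p, w[len(x):], u + stack[1:])
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen

def accepts_by_empty_stack(transitions, q0, z0, word):
    return any(w == "" and st == "" for _, w, st in reachable(transitions, (q0, word, z0)))

def accepts_by_final_state(transitions, q0, z0, final, word):
    return any(w == "" and q in final for q, w, _ in reachable(transitions, (q0, word, z0)))

def empty_stack_to_final_state(transitions, states, q0, z0):
    """Direction 1 => 2: returns (transitions, start state, start symbol, final states)."""
    new_start, new_bottom, new_final = "s", "X", "f"   # assumed to be fresh names
    result = set(transitions)
    result.add((new_start, "", new_bottom, q0, z0 + new_bottom))  # put Z0 on top of X
    for q in states:
        result.add((q, "", new_bottom, new_final, ""))  # X visible: old stack is empty
    return result, new_start, new_bottom, {new_final}

# Illustrative PDA accepting a^n b^n (n >= 1) by empty stack.
A_N = {("p", "a", "Z", "p", "A"), ("p", "a", "A", "p", "AA"),
       ("p", "b", "A", "r", ""), ("r", "b", "A", "r", "")}
A_F, F_START, F_BOTTOM, F_FINAL = empty_stack_to_final_state(A_N, {"p", "r"}, "p", "Z")

for w in ["ab", "aabb", "aab", "abab"]:
    print(w, accepts_by_empty_stack(A_N, "p", "Z", w),
          accepts_by_final_state(A_F, F_START, F_BOTTOM, F_FINAL, w))
```

The converse direction 2 ⇒ 1 is not shown here.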


Pushdown Automata and Context-Free Grammars

Theorem

A language L has a context-free grammar iff it has a PDA.

Proof (BB18).

1. Given grammar G , construct PDA A.

2. Given PDA A, construct grammar G .
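The proof BB18 is not in this transcript. For step 1, a standard construction uses a single-state PDA that accepts by empty stack: a nonterminal on top of the stack is replaced by one of its right-hand sides, and a terminal on top is matched against the next input symbol. The sketch below assumes single-character symbols and, since there is only one state, stores each transition as a triple (read, pop, push). The pruning step assumes that every stack symbol derives at least one terminal, which holds for the sample grammar S → ab | aSb but not for grammars with ε-productions.

```python
from collections import deque

def grammar_to_pda(productions, terminals):
    """Transitions (x, y, u) of a one-state PDA: read x, pop y, push u."""
    moves = set()
    for a, rhss in productions.items():
        for rhs in rhss:
            moves.add(("", a, rhs))   # expand the nonterminal on top of the stack
    for t in terminals:
        moves.add((t, t, ""))         # match a terminal against the input
    return moves

def accepts(moves, start_symbol, word):
    """Acceptance by empty stack, searching configurations (rest, stack)."""
    start = (word, start_symbol)
    seen, queue = {start}, deque([start])
    while queue:
        rest, stack = queue.popleft()
        if rest == "" and stack == "":
            return True
        if not stack:
            continue
        for x, y, u in moves:
            if y == stack[0] and rest.startswith(x):
                nxt = (rest[len(x):], u + stack[1:])
                # Pruning: a stack longer than the remaining input cannot succeed
                # when every stack symbol yields at least one terminal.
                if len(nxt[1]) <= len(nxt[0]) and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False

MOVES = grammar_to_pda({"S": ["ab", "aSb"]}, "ab")
print(accepts(MOVES, "S", "aabb"), accepts(MOVES, "S", "abab"))
```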


Formalisms

                          Automata      Grammars
Regular Languages         DFA, NFA      LLG, RLG
Context-Free Languages    PDA           CFG


Outline

✓ Deterministic finite automata
✓ Nondeterministic finite automata
✓ Automata with ε-transitions
✓ The class of regular languages
✓ The pumping lemma for regular languages
✓ Context-free grammars and languages
✓ Right linear grammars
✓ Pushdown Automata
⇒ The pumping lemma for context-free languages
• Grammars in computer science
• Further topics


The Pumping Lemma for Context-Free Languages

Lemma (Pumping Lemma)

Let L be a context-free language. Then there is an n ∈ N s.t. for every w ∈ L with |w| ≥ n we have w = v1v2v3v4v5 with

1. |v2v3v4| ≤ n,

2. v2v4 ≠ ε, and

3. for all k ≥ 0 also v1 v2^k v3 v4^k v5 ∈ L.

Proof Sketch (BB19).


Non-Context-Free Languages

Example

L = {a^m b^m c^m | m ≥ 1} is not context-free (BB20).
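The argument BB20 is not in this transcript. Its core can be checked by brute force for small n: for w = a^n b^n c^n, no decomposition that satisfies conditions 1 and 2 of the pumping lemma survives pumping. The sketch below only tests k = 0 and k = 2, which is enough to refute condition 3. The function names are hypothetical, and a check for a few values of n illustrates the proof but does not replace it.

```python
def in_abc(word):
    """Membership in {a^m b^m c^m | m >= 1}."""
    m = len(word) // 3
    return m >= 1 and word == "a" * m + "b" * m + "c" * m

def in_ab(word):
    """Membership in {a^m b^m | m >= 1}, used as a context-free contrast."""
    m = len(word) // 2
    return m >= 1 and word == "a" * m + "b" * m

def pumpable_split(word, n, member):
    """Return a split (v1, ..., v5) with |v2v3v4| <= n and v2v4 nonempty that
    stays in the language for k = 0 and k = 2, or None if there is none."""
    length = len(word)
    for i in range(length + 1):                       # v1 = word[:i]
        for end in range(i, min(i + n, length) + 1):  # v2v3v4 = word[i:end]
            for j in range(i, end + 1):               # v2 = word[i:j]
                for m in range(j, end + 1):           # v3 = word[j:m], v4 = word[m:end]
                    v1, v2, v3 = word[:i], word[i:j], word[j:m]
                    v4, v5 = word[m:end], word[end:]
                    if v2 + v4 == "":
                        continue
                    if all(member(v1 + v2 * k + v3 + v4 * k + v5) for k in (0, 2)):
                        return (v1, v2, v3, v4, v5)
    return None

n = 3
print(pumpable_split("a" * n + "b" * n + "c" * n, n, in_abc))
print(pumpable_split("aabb", 2, in_ab))
```

For a^n b^n c^n the window v2v3v4 spans at most two of the three letters, so removing v2 and v4 always unbalances the counts, while a^n b^n admits a split with v2 = a and v4 = b.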


Outline

✓ Deterministic finite automata
✓ Nondeterministic finite automata
✓ Automata with ε-transitions
✓ The class of regular languages
✓ The pumping lemma for regular languages
✓ Context-free grammars and languages
✓ Right linear grammars
✓ Pushdown Automata
✓ The pumping lemma for context-free languages
⇒ Grammars in computer science
• Further topics


HTML is a Context-free Language

Source of website

Char → a | A | b | B | · · ·
String → ε | Char String
Element → Heading | Paragraph | Link | String | <br> | · · ·
Elements → ε | Element Elements
Heading → <h3> String </h3> | · · ·
Paragraph → <p> Elements </p>
Link → <a href = "String"> String </a>
...


XML

Generalization: XML (Extensible Markup Language)

• DTD (Document Type Definition) is a grammar
• There are DTDs for: HTML, office formats, mathematical formulas, address data, vector graphics, cooking recipes, formal proofs, . . .
• Very rich infrastructure available


Further Topics

• Context-sensitive languages: uAv → uwv
• Regular expressions
• Decidability/complexity of, e.g., membership, emptiness, . . .
• Parser generators
• . . .

Introductory Textbook: J. E. Hopcroft, R. Motwani, J. D. Ullman: Introduction to Automata Theory, Languages, and Computation, 2nd edition, 2001, Addison-Wesley.
