Automata and Formal Languages - CM0081
Regular Expressions

Andrés Sicard-Ramírez

Universidad EAFIT

Semester 2018-2

Introduction

Finite automata
Machine-like descriptions of regular languages.

Regular expressions
Algebraic description of regular languages.
Declarative (“user-friendly”) way to express the strings that belong to the language.
Example: (01)(01)∗ + (010)(010)∗

Uses
Search commands (e.g. Grep)
Lexical-analyzer generators (e.g. Lex and Alex)
Domain specific languages (DSLs)

Operations on Languages

Definition
Let 𝐿 and 𝑀 be languages. Then

Union: 𝐿 ∪ 𝑀 = {𝑤 ∣ 𝑤 ∈ 𝐿 or 𝑤 ∈ 𝑀}

Concatenation: 𝐿.𝑀 = {𝑤 ∣ 𝑤 = 𝑥𝑦, 𝑥 ∈ 𝐿 and 𝑦 ∈ 𝑀}

Powers: 𝐿^0 = {𝜀}, 𝐿^1 = 𝐿, 𝐿^{𝑘+1} = 𝐿.𝐿^𝑘

Kleene closure: 𝐿∗ = ⋃_{𝑖=0}^{∞} 𝐿^𝑖
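
The operations defined above can be tried out on finite languages. The following is a minimal Python sketch, not part of the course material: languages are modelled as sets of strings, 𝜀 is the empty string, and the helper names (union, concat, power, star_up_to) are chosen here for illustration. Since 𝐿∗ is usually infinite, star_up_to only builds the finite approximation 𝐿^0 ∪ ⋯ ∪ 𝐿^𝑛.

```python
from itertools import product


def union(L, M):
    """L ∪ M: the words that belong to L or to M."""
    return set(L) | set(M)


def concat(L, M):
    """L.M = {xy | x in L and y in M}."""
    return {x + y for x, y in product(L, M)}


def power(L, k):
    """L^0 = {ε} and L^(k+1) = L.L^k, with ε written as ''."""
    result = {""}
    for _ in range(k):
        result = concat(L, result)
    return result


def star_up_to(L, n):
    """Finite approximation of L*: the union of L^0, L^1, ..., L^n."""
    result = set()
    for i in range(n + 1):
        result |= power(L, i)
    return result


print(sorted(concat({"0", "1"}, {"a", "b"})))
print(sorted(star_up_to({"0", "11"}, 2)))
print(star_up_to(set(), 5))  # powers of the empty language
```

The last call illustrates why the closure of the empty language is not empty: the power with exponent 0 always contributes 𝜀.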



Examples
If 𝐿 = {0, 1}, then 𝐿∗ consists of all strings of 0’s and 1’s and the empty word.
If 𝐿 = {0^𝑛 ∣ 𝑛 ≥ 1}, then 𝐿∗ = 𝐿 ∪ {𝜀}.
If 𝐿 = {0, 11}, then 𝐿∗ consists of the empty word and those strings of 0’s and 1’s such that the 1’s come in pairs.
∅^0 = {𝜀}
∅^𝑖 = ∅, for 𝑖 ≥ 1
∅∗ = {𝜀}
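
For a finite language 𝐿, membership in 𝐿∗ can be decided without enumerating 𝐿∗. The sketch below is one possible way to do it, by dynamic programming over the prefixes of the word; the function name in_star is hypothetical and the approach is an illustration, not the method used in the course.

```python
def in_star(w, L):
    """Return True if w is a concatenation of zero or more words of L."""
    # reachable[i] is True when the prefix w[:i] belongs to L*.
    reachable = [False] * (len(w) + 1)
    reachable[0] = True  # the empty word always belongs to L*
    for i in range(len(w)):
        if not reachable[i]:
            continue
        for piece in L:
            # The empty piece is skipped: it never advances the position.
            if piece and w.startswith(piece, i):
                reachable[i + len(piece)] = True
    return reachable[len(w)]


print(in_star("0110", {"0", "11"}))  # the 1's come in pairs
print(in_star("010", {"0", "11"}))   # a single 1 cannot be covered
print(in_star("", set()))            # ε belongs to the closure of ∅
```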



ExamplesIf 𝐿 = {0, 1}, then 𝐿∗ consists of all strings of 0’s and 1’s and theempty word.If 𝐿 = {0𝑛 ∣ 𝑛 ≥ 1}, then 𝐿∗ = 𝐿 ∪ {𝜀}.

If 𝐿 = {0, 11}, then 𝐿∗ consists of the empty word and those stringsof 0’s and 1’s such that the 1’s come in pairs.∅0 = {𝜀}∅𝑖 = ∅, for 𝑖 ≥ 1∅∗ = {𝜀}



ExamplesIf 𝐿 = {0, 1}, then 𝐿∗ consists of all strings of 0’s and 1’s and theempty word.If 𝐿 = {0𝑛 ∣ 𝑛 ≥ 1}, then 𝐿∗ = 𝐿 ∪ {𝜀}.If 𝐿 = {0, 11}, then 𝐿∗ consists of the empty word and those stringsof 0’s and 1’s such that the 1’s come in pairs.

∅0 = {𝜀}∅𝑖 = ∅, for 𝑖 ≥ 1∅∗ = {𝜀}



ExamplesIf 𝐿 = {0, 1}, then 𝐿∗ consists of all strings of 0’s and 1’s and theempty word.If 𝐿 = {0𝑛 ∣ 𝑛 ≥ 1}, then 𝐿∗ = 𝐿 ∪ {𝜀}.If 𝐿 = {0, 11}, then 𝐿∗ consists of the empty word and those stringsof 0’s and 1’s such that the 1’s come in pairs.∅0 = {𝜀}∅𝑖 = ∅, for 𝑖 ≥ 1∅∗ = {𝜀}


Syntax: What the Regular Expressions Are

Definition
Let Σ be an alphabet. The regular expressions on Σ are inductively defined by:

Basis step: 𝜀 and ∅ are regular expressions. If 𝑎 ∈ Σ then a is a regular expression.
Inductive step: Let 𝐸 and 𝐹 be regular expressions. Then 𝐸 + 𝐹, 𝐸𝐹, 𝐸∗ and (𝐸) are regular expressions.
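
An inductive definition like this one maps directly onto a tree data type: one constructor per clause. The following Python sketch is an assumption about how one might encode it (the class names Epsilon, Empty, Sym, Union, Concat and Star and the function show are invented here). The clause for (𝐸) needs no constructor of its own because grouping is already explicit in the tree; the code writes the star as an ASCII *.

```python
from dataclasses import dataclass


class Regex:
    """Base class for regular expressions over an alphabet."""


@dataclass(frozen=True)
class Epsilon(Regex):
    """Basis step: the expression ε."""


@dataclass(frozen=True)
class Empty(Regex):
    """Basis step: the expression ∅."""


@dataclass(frozen=True)
class Sym(Regex):
    """Basis step: a single symbol of the alphabet."""
    char: str


@dataclass(frozen=True)
class Union(Regex):
    """Inductive step: E + F."""
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Concat(Regex):
    """Inductive step: EF."""
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Star(Regex):
    """Inductive step: E*."""
    inner: Regex


def show(e):
    """Render an expression with every grouping made explicit."""
    if isinstance(e, Epsilon):
        return "ε"
    if isinstance(e, Empty):
        return "∅"
    if isinstance(e, Sym):
        return e.char
    if isinstance(e, Union):
        return f"({show(e.left)} + {show(e.right)})"
    if isinstance(e, Concat):
        return f"({show(e.left)}{show(e.right)})"
    if isinstance(e, Star):
        return f"({show(e.inner)}*)"
    raise TypeError(f"not a regular expression: {e!r}")


example = Union(Concat(Sym("0"), Star(Sym("1"))), Sym("1"))
print(show(example))
```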






Semantics: What Language a Regular Expression Denotes

Definition
Let 𝐸 be a regular expression. The language denoted by 𝐸, written 𝐿(𝐸), is inductively defined by:

Basis step:
𝐿(𝜀) = {𝜀},
𝐿(∅) = ∅,
𝐿(a) = {𝑎}.

Inductive step: Let 𝐿(𝐸) and 𝐿(𝐹) be the languages denoted by the regular expressions 𝐸 and 𝐹. Then
𝐿(𝐸 + 𝐹) = 𝐿(𝐸) ∪ 𝐿(𝐹),
𝐿(𝐸𝐹) = 𝐿(𝐸)𝐿(𝐹),
𝐿(𝐸∗) = (𝐿(𝐸))∗,
𝐿((𝐸)) = 𝐿(𝐸).
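
The semantic clauses translate almost line by line into a recursive function. The sketch below is an illustration under stated assumptions: the tree encoding (classes Epsilon, Empty, Sym, Union, Concat, Star) and the function name language are invented here, symbols are single characters, and because 𝐿(𝐸) may be infinite the function returns only the words of length at most n.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Epsilon:
    pass


@dataclass(frozen=True)
class Empty:
    pass


@dataclass(frozen=True)
class Sym:
    char: str  # assumed to be a single character


@dataclass(frozen=True)
class Union:
    left: object
    right: object


@dataclass(frozen=True)
class Concat:
    left: object
    right: object


@dataclass(frozen=True)
class Star:
    inner: object


def language(e, n):
    """Words of L(e) whose length is at most n."""
    if isinstance(e, Epsilon):
        return {""}                       # L(ε) = {ε}
    if isinstance(e, Empty):
        return set()                      # L(∅) = ∅
    if isinstance(e, Sym):
        return {e.char} if n >= 1 else set()
    if isinstance(e, Union):              # L(E + F) = L(E) ∪ L(F)
        return language(e.left, n) | language(e.right, n)
    if isinstance(e, Concat):             # L(EF) = L(E)L(F)
        return {x + y
                for x in language(e.left, n)
                for y in language(e.right, n)
                if len(x + y) <= n}
    if isinstance(e, Star):               # L(E*) = (L(E))*
        pieces = language(e.inner, n) - {""}
        result, frontier = {""}, {""}
        while frontier:
            # Extend the newest words by one more piece, within the bound.
            frontier = {x + y for x in frontier for y in pieces
                        if len(x + y) <= n} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {e!r}")


a, b = Sym("a"), Sym("b")
print(sorted(language(Union(a, Star(Concat(a, b))), 4)))
```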


Precedence of Operators

Order of precedence, from highest to lowest: (), ∗, . and +.
The operators . and + are left-associative.

Example
01∗ + 1 = (0(1∗)) + 1
01∗ + 1 ≠ (01)∗ + 1
01∗ + 1 ≠ 0(1∗ + 1)
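
The precedence and associativity rules can be made concrete with a small recursive-descent parser, with one function per precedence level. This is a sketch written for illustration only; the function name parenthesize is hypothetical, the star is typed as an ASCII *, concatenation is written by juxtaposition, and every other non-space character is treated as a symbol.

```python
def parenthesize(text):
    """Return text with the grouping implied by the precedence rules."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_union():   # lowest precedence: E + F, left-associative
        left = parse_concat()
        while peek() == "+":
            advance()
            left = f"({left} + {parse_concat()})"
        return left

    def parse_concat():  # middle precedence: EF, left-associative
        left = parse_star()
        while peek() is not None and peek() not in "+)":
            left = f"({left}{parse_star()})"
        return left

    def parse_star():    # highest-precedence operator: E*
        atom = parse_atom()
        while peek() == "*":
            advance()
            atom = f"({atom}*)"
        return atom

    def parse_atom():    # a symbol or a parenthesized expression
        c = peek()
        if c == "(":
            advance()
            inner = parse_union()
            if peek() != ")":
                raise ValueError("missing ')'")
            advance()
            return inner
        if c is None or c in "+*)":
            raise ValueError(f"unexpected {c!r} at position {pos}")
        advance()
        return c

    result = parse_union()
    if peek() is not None:
        raise ValueError(f"unexpected {peek()!r} at position {pos}")
    return result


for source in ["01* + 1", "(01)* + 1", "0(1* + 1)"]:
    print(source, "is read as", parenthesize(source))
```

The three inputs are the expressions of the example; they produce three different groupings, which is why they are not interchangeable.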


Languages Denoted by Regular Expressions

Example

𝐸                           𝐿(𝐸)
a + b                       𝐿(a) ∪ 𝐿(b) = {𝑎} ∪ {𝑏} = {𝑎, 𝑏}
a∗                          {𝜀, 𝑎, 𝑎𝑎, 𝑎𝑎𝑎, …}
(a + b)(a + b)              𝐿(a + b).𝐿(a + b) = {𝑎, 𝑏}.{𝑎, 𝑏} = {𝑎𝑎, 𝑎𝑏, 𝑏𝑎, 𝑏𝑏}
a + (ab)∗                   {𝑎, 𝜀, 𝑎𝑏, 𝑎𝑏𝑎𝑏, 𝑎𝑏𝑎𝑏𝑎𝑏, …}
(0 + 1)∗01(0 + 1)∗          {𝑥01𝑦 ∣ 𝑥, 𝑦 ∈ {0, 1}∗}
a_i(a_1 + a_2 + ⋯ + a_n)∗   {𝑤 ∈ Σ∗ ∣ 𝑤 starts with 𝑎_𝑖}











Example
Write a regular expression for the language 𝐿 defined by

𝐿 = {𝑤 ∈ {0, 1}∗ ∣ 0 and 1 alternate in 𝑤}.

Solution: (01)∗ + (10)∗ + 0(10)∗ + 1(01)∗

Other solution: (𝜀 + 1)(01)∗(𝜀 + 0)

Example
The regular expression

(10 + 0)∗(𝜀 + 1)

denotes the set of strings of 0’s and 1’s that have no two adjacent 1’s.
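
Claims like these can be sanity-checked by brute force on all short words. The sketch below uses Python's re module and assumes the following translation into its syntax: the union + becomes |, and 𝜀 becomes an empty alternative. The helper names (words, agree, alternates, no_adjacent_ones) are invented for this check, which tests words up to a fixed length and therefore supports the claims without proving them.

```python
import re
from itertools import product


def words(alphabet, max_len):
    """All strings over alphabet of length 0, 1, ..., max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def alternates(w):
    """0 and 1 alternate: no two adjacent symbols are equal."""
    return all(x != y for x, y in zip(w, w[1:]))


def no_adjacent_ones(w):
    return "11" not in w


def agree(pattern, predicate, max_len=10):
    """True if pattern accepts exactly the short words satisfying predicate."""
    return all(bool(pattern.fullmatch(w)) == predicate(w)
               for w in words("01", max_len))


SOLUTION = re.compile(r"(01)*|(10)*|0(10)*|1(01)*")
OTHER_SOLUTION = re.compile(r"(|1)(01)*(|0)")
NO_TWO_ADJACENT_ONES = re.compile(r"(10|0)*(|1)")

print(agree(SOLUTION, alternates))
print(agree(OTHER_SOLUTION, alternates))
print(agree(NO_TWO_ADJACENT_ONES, no_adjacent_ones))
```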











Example
Write a regular expression denoting the set of strings over Σ = {0, 1} not ending in 01.

Solution: 𝜀 + 0 + 1 + (0 + 1)∗(00 + 10 + 11)
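
The terms 𝜀 + 0 + 1 in the solution cover the words that are too short to end in any two-symbol suffix. A quick experiment makes this visible; it is a sketch using Python's re module, with + translated to | and 𝜀 to an empty alternative, and with names (FULL, LONG_ONLY, words) chosen here for illustration.

```python
import re
from itertools import product

FULL = re.compile(r"|0|1|(0|1)*(00|10|11)")
LONG_ONLY = re.compile(r"(0|1)*(00|10|11)")  # the solution without ε + 0 + 1


def words(max_len):
    """All strings over {0, 1} of length 0, 1, ..., max_len."""
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            yield "".join(letters)


# Words that do not end in 01 but are missed by the shortened expression.
missed = [w for w in words(8)
          if not w.endswith("01") and not LONG_ONLY.fullmatch(w)]
print(missed)

# The full solution agrees with the specification on every short word.
print(all(bool(FULL.fullmatch(w)) == (not w.endswith("01"))
          for w in words(8)))
```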





Libraries

Remark
Theoretical regular expressions ≠ practical regular expressions.

Some programming languages with support for regular expressions
.NET, C, Haskell, Java, Mathematica, MATLAB and Perl.


Algorithms

See the Haskell implementation of some algorithms on regular expressions on the course homepage.


Applications

Some programs that use regular expressions
Grep: print lines matching a pattern
Awk: pattern scanning and processing language
Sed: stream editor for filtering and transforming text
Alex, Flex and Lex: lexical-analyzer generators
Emacs and Vim: text editors
MySQL, Oracle and PostgreSQL: databases


Applications

Reading
Applications of Regular Expressions (Hopcroft, Motwani and Ullman 2007, § 3.3).

𝐿^+ ≝ 𝐿𝐿∗ (one-or-more-times operator)

𝐿? ≝ 𝜀 + 𝐿 (zero-or-one-time operator)
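
Both operators exist as postfix operators in many practical regular-expression dialects. The sketch below uses Python's re module as one example of such a dialect and compares each operator with its definition on a few sample words. Note that in this dialect + is the postfix one-or-more operator and union is written |; the helper name same_on and the sample words are assumptions made for this illustration.

```python
import re

SAMPLES = ["", "ab", "abab", "ababab", "a", "ba", "abb"]


def same_on(samples, pattern_1, pattern_2):
    """True if both patterns accept exactly the same sample words."""
    return all(bool(re.fullmatch(pattern_1, w)) == bool(re.fullmatch(pattern_2, w))
               for w in samples)


# L+ = LL*: one or more repetitions.
print(same_on(SAMPLES, r"(ab)+", r"(ab)(ab)*"))

# L? = ε + L: zero or one occurrence (ε is the empty alternative).
print(same_on(SAMPLES, r"(ab)?", r"(|ab)"))

# The two operators differ on the empty word.
print(bool(re.fullmatch(r"(ab)+", "")), bool(re.fullmatch(r"(ab)?", "")))
```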


An Implementation: A Regular Expression Matcher

“Rob’s implementation itself is a superb example of beautiful code: compact, elegant, efficient, and useful. It’s one of the best examples of recursion that I have ever seen.”

Brian Kernighan, p. 3.
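
The slide quotes the praise but does not show the code. As a rough idea of the recursive style being praised, here is a Python sketch of a small matcher. It is an assumption-laden reconstruction, not the original: the supported operators (a literal character, . for any character, ^ and $ as anchors, and * after a single character) and the function names match, match_here and match_star are chosen here, and details may differ from the implementation the quote refers to.

```python
def match(regexp, text):
    """Search for regexp anywhere in text."""
    if regexp.startswith("^"):
        return match_here(regexp[1:], text)
    # Try every starting position, including the end of the text.
    return any(match_here(regexp, text[i:]) for i in range(len(text) + 1))


def match_here(regexp, text):
    """Match regexp at the beginning of text."""
    if not regexp:
        return True
    if len(regexp) >= 2 and regexp[1] == "*":
        return match_star(regexp[0], regexp[2:], text)
    if regexp == "$":
        return text == ""
    if text and regexp[0] in (".", text[0]):
        return match_here(regexp[1:], text[1:])
    return False


def match_star(c, regexp, text):
    """Match zero or more copies of c followed by regexp."""
    i = 0
    while True:
        # First try the rest of the pattern, then consume one more c.
        if match_here(regexp, text[i:]):
            return True
        if i < len(text) and (c == "." or text[i] == c):
            i += 1
        else:
            return False


print(match("^ab*c$", "abbbc"))
print(match("b.d", "abcd"))
print(match("^ab*c$", "abbbd"))
```

Each function handles one case and delegates the rest of the pattern to a recursive call, which keeps the whole matcher to a few short functions.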


References

Hopcroft, J. E., Motwani, R. and Ullman, J. D. (2007). Introduction to Automata Theory, Languages, and Computation. 3rd ed. Pearson Education.

