automata7.ppt
Transcript of automata7.ppt
241-303 Discrete Maths: Automata/7 1
Discrete Maths
Recognising input using:
– automata: a graph-based technique
– regular expressions: an algebraic technique equivalent to automata
241-303, Semester 1 2009-2010
7. Automata and Regular Expressions
Overview
1. Introduction to Automata
2. Representing Automata
3. The ‘aeiou’ Automaton
4. Generating Output
5. Bounce Filter Example
6. Deterministic and Nondeterministic Automata
7. ‘washington’ Partial Anagrams
8. Regular Expressions
9. UNIX Regular Expressions
10. From REs to Automata
11. More Information
1. Introduction to Automata
A finite state automaton represents a problem as a series of states and transitions between the states:
– the automaton starts in an initial state
– input causes a transition from the current state to another
– a state may be accepting
The automaton can terminate successfully when it enters an accepting state (if it wants to).
1.1. An Example
[Figure: The ‘even-odd’ Automaton. Two states, evenA (the start state) and oddA. An ‘a’ input moves from one state to the other; a ‘b’ input stays in the current state.]
The states are the ovals.
The transitions are the arrows
– labelled with the input that ‘triggers’ them
The ‘oddA’ state is accepting.
Execution Sequence
Input: b a b a a
Input read   Move to State
(start)      evenA   initial state
b            evenA
a            oddA    the automaton could choose to terminate here
b            oddA
a            evenA
a            oddA    stops since no more input
1.2. Why are Automata Useful?
Automata are a very good way of modeling finite-state systems which change state due to input. Examples (from very different applications):
– text editors, compilers, UNIX tools like grep
– communications protocols
– digital hardware components, e.g. adders, RAM
2. Representing Automata
Automata have a mathematical basis which allows them to be analysed, e.g.:
– prove that they accept correct input
– prove that they do not accept incorrect input
Automata can be manipulated to simplify them, and they can be automatically converted into code.
2.1. A Mathematical Coding
We can represent an automaton in terms of sets and mathematical functions.
The ‘even-odd’ automaton is:
startSet = { evenA }
acceptSet = { oddA }
nextState(evenA, b) => evenA
nextState(evenA, a) => oddA
nextState(oddA, b) => oddA
nextState(oddA, a) => evenA
Analysis of the mathematical form can show that the ‘even-odd’ automaton only accepts strings which:
– contain an odd number of ‘a’s
– e.g. babaa abb abaab aabba aaaaba …
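The mathematical coding can be tried out directly by storing nextState as a lookup table. The following is a minimal sketch, not the course's own code: the names `eo_state`, `even_odd_accepts` and `next_state` are made up for illustration, and the -1 return value for unexpected characters is an assumption.

```c
#include <stddef.h>

/* States of the 'even-odd' automaton. */
enum eo_state { EVEN_A, ODD_A };

/* Hypothetical helper: runs the automaton over a string of 'a's and 'b's.
   Returns 1 if the run ends in the accepting state oddA, 0 if it ends in
   evenA, and -1 if a character other than 'a' or 'b' is seen. */
int even_odd_accepts(const char *input)
{
    /* next_state[current state][0 for 'a', 1 for 'b'] */
    static const enum eo_state next_state[2][2] = {
        /* evenA */ { ODD_A,  EVEN_A },
        /* oddA  */ { EVEN_A, ODD_A  }
    };
    enum eo_state s = EVEN_A;               /* startSet = { evenA } */
    size_t i;

    for (i = 0; input[i] != '\0'; i++) {
        if (input[i] == 'a')
            s = next_state[s][0];
        else if (input[i] == 'b')
            s = next_state[s][1];
        else
            return -1;                      /* input cannot be processed */
    }
    return s == ODD_A;                      /* acceptSet = { oddA } */
}
```

Each of the example strings (babaa, abb, abaab, aabba, aaaaba) ends the run in oddA, while a string such as aa ends in evenA.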
2.2. Automaton in Code
It is easy to (automatically) translate an automaton into code, but an automaton graph does not contain all the details needed for a program.
The main extra coding issues:
– what to do when we enter an accepting state?
– what to do when the input cannot be processed? e.g. abzz is entered
Encoding the ‘even-odd’ Automaton
#include <stdio.h>
#include <stdlib.h>

enum state {evenA, oddA};   // possible states

enum state nextState(enum state s, int ch);
int acceptable(enum state s);

int main(void)
{
  enum state currState = evenA;   // start state
  int isAccepting = 0;            // false
  int ch;

  // every character read, including a newline, is passed to nextState()
  while ((ch = getchar()) != EOF) {
    currState = nextState(currState, ch);
    isAccepting = acceptable(currState);
  }
  // the accepting state is only used at the end of input
  if (isAccepting)
    printf("accepted\n");
  else
    printf("not accepted\n");
  return 0;
}
enum state nextState(enum state s, int ch)
{
  if ((s == evenA) && (ch == 'b')) return evenA;
  if ((s == evenA) && (ch == 'a')) return oddA;
  if ((s == oddA) && (ch == 'b')) return oddA;
  if ((s == oddA) && (ch == 'a')) return evenA;

  // simple handling of incorrect input
  printf("Illegal Input\n");
  exit(1);
}
int acceptable(enum state s)
{
  if (s == oddA)
    return 1;   // oddA is an accepting state
  return 0;
}
3. The ‘aeiou’ Automaton
What English words contain the five vowels (a, e, i, o, u) in order?
Some words that match:
– abstemious
– facetious
– sacrilegious
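3.1. Automaton Graph
[Figure: the ‘aeiou’ automaton. States 0 to 5 in a chain, with 0 the start state. State 0 moves to 1 on ‘a’ and loops on L - a; state 1 moves to 2 on ‘e’ and loops on L - e; state 2 moves to 3 on ‘i’ and loops on L - i; state 3 moves to 4 on ‘o’ and loops on L - o; state 4 moves to 5 on ‘u’ and loops on L - u. L = all letters.]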
3.2. Execution Sequence (1)
Input: f a c e t i o u s
Input read   Move to State
(start)      0
f            0
a            1
c            1
e            2
t            2
i            3
o            4
u            5   the automaton can terminate here; no need to process more input
Execution Sequence (2)
Input: a n d r e w
Input read   Move to State
(start)      0
a            1
n            1
d            1
r            1
e            2
w            2   state 2, and end of input, means failure
3.3. Translation to Code
#include <stdio.h>
#include <stdlib.h>

// poss. states 0..5 (C enumerators cannot be bare numbers)
enum state {S0, S1, S2, S3, S4, S5};

enum state nextState(enum state s, int ch);
int acceptable(enum state s);

int main(void)
{
  enum state currState = S0;   // start state
  int isAccepting = 0;         // false
  int ch;

  // stop processing when the accepting state is entered
  while (!isAccepting && (ch = getchar()) != EOF) {
    currState = nextState(currState, ch);
    isAccepting = acceptable(currState);
  }
  if (isAccepting)
    printf("accepted\n");
  else
    printf("not accepted\n");
  return 0;
}
enum state nextState(enum state s, int ch)
{
  if (s == 0) {
    if (ch == 'a') return 1;
    else return 0;   // input is L-a
  }
  if (s == 1) {
    if (ch == 'e') return 2;
    else return 1;   // input is L-e
  }
  if (s == 2) {
    if (ch == 'i') return 3;
    else return 2;   // input is L-i
  }
  if (s == 3) {
    if (ch == 'o') return 4;
    else return 3;   // input is L-o
  }
  if (s == 4) {
    if (ch == 'u') return 5;
    else return 4;   // input is L-u
  }

  // simple handling of incorrect input
  printf("Illegal Input\n");
  exit(1);
} // end of nextState()
int acceptable(enum state s)
{
  if (s == 5)
    return 1;   // 5 is an accepting state
  return 0;
}
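Because every state of the ‘aeiou’ automaton waits for exactly one vowel, the state number can double as an index into the string "aeiou". The following compact sketch relies on that observation; it is an alternative to the translation above, and the function name `has_vowels_in_order` is made up for illustration.

```c
/* Hypothetical helper: returns 1 if word contains a, e, i, o, u in that
   order (other letters may appear in between), else 0. */
int has_vowels_in_order(const char *word)
{
    static const char vowels[] = "aeiou";
    int state = 0;                  /* states 0..5, as in the graph */
    int i;

    /* stop early once the accepting state 5 is entered */
    for (i = 0; word[i] != '\0' && state < 5; i++) {
        if (word[i] == vowels[state])
            state++;                /* the awaited vowel: move to the next state */
        /* any other character: stay in the current state */
    }
    return state == 5;
}
```

With this sketch, facetious reaches state 5 while andrew stops in state 2, matching the two execution sequences.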
4. Generating Output
One possible extension to the basic automaton idea is to allow output:
– when a transition is ‘triggered’ there can be optional output as well
Automata which generate output are sometimes called Finite State Machines (FSMs).
4.1. ‘even-odd’ with Output
[Figure: the ‘even-odd’ automaton with output. The states evenA (start) and oddA and the ‘a’ and ‘b’ transitions are as before, except that the ‘a’ transition out of evenA is now labelled a/1.]
When the ‘a’ transition is triggered out of the evenA state, then a ‘1’ is output.
4.2. Mathematical Coding
Add an ‘output’ mathematical function to the automaton representation:
output( evenA, a ) => 1
4.3. Extending the C Coding
The while loop for ‘even-odd’ will become:
  :
while ((ch = getchar()) != EOF) {
  output(currState, ch);
  currState = nextState(currState, ch);
  isAccepting = acceptable(currState);
}
  :
The output() C function:
void output(enum state s, int ch)
{
  if ((s == evenA) && (ch == 'a'))
    putchar('1');
}
5. Bounce Filter Example
A signal processing problem:
– a stream of 1’s and 0’s is ‘smoothed’ by the filter so that:
  a single 0 surrounded by 1’s becomes a 1:
    ...111101111... => ...111111111...
  a single 1 surrounded by 0’s becomes a 0:
    ...000010000... => ...000000000...
This kind of filtering is used in image processing to reduce ‘noise’.
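5.1. The ‘bounce’ Automaton
[Figure: the ‘bounce’ automaton. Four states: a (the start state) and b on the left, c and d on the right. Each transition is labelled input/output; the labels are 0/0 (three transitions), 1/0 (one), 1/1 (three) and 0/1 (one).]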
Notes
There is no accepting state
– the code will simply terminate at EOF
The ‘a’ and ‘b’ states (left side) mostly have transitions that output ‘0’s.
The ‘c’ and ‘d’ states (right side) mostly have transitions that output ‘1’s.
5.2. Execution Sequence
Input: 0 1 0 1 1 0 1
Input read   Move to State   Output
(start)      a
0            a               0
1            b               0
0            a               0
1            b               0
1            c               1       moved to right hand side
0            d               1
1            c               1
5.3. I/O Behaviour
Input:  0 1 0 1 1 0 1
Output: 0 0 0 0 1 1 1
The isolated 1 (second bit) and the isolated 0 (sixth bit) are smoothed away in the output.
It takes 2 bits of the same type before the automaton realises that it has a new bit sequence rather than a ‘noise’ bit.
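The bounce filter can be sketched as two lookup tables, one for the next state and one for the output. The transitions below are read off the execution sequence and the diagram labels; the two transitions that the execution sequence never uses (c on 1, d on 0) are assumed to be c -1/1-> c and d -0/0-> a. All names (`bounce_state`, `bounce_filter`) are made up for illustration.

```c
#include <stddef.h>

/* States of the 'bounce' automaton: a and b are the left side, c and d the right. */
enum bounce_state { ST_A, ST_B, ST_C, ST_D };

/* Hypothetical helper: filters a string of '0'/'1' characters into out,
   which must have room for one character per input character plus '\0'.
   Returns 0 on success, or -1 if a character other than '0' or '1' is
   seen (out is then left unterminated). */
int bounce_filter(const char *in, char *out)
{
    /* next_state[state][input bit] */
    static const enum bounce_state next_state[4][2] = {
        /* a */ { ST_A, ST_B },
        /* b */ { ST_A, ST_C },
        /* c */ { ST_D, ST_C },   /* c on 1 is an assumed transition */
        /* d */ { ST_A, ST_C }    /* d on 0 is an assumed transition */
    };
    /* output[state][input bit] */
    static const char output[4][2] = {
        /* a */ { '0', '0' },
        /* b */ { '0', '1' },
        /* c */ { '1', '1' },
        /* d */ { '0', '1' }
    };
    enum bounce_state s = ST_A;             /* start state */
    size_t i;

    for (i = 0; in[i] != '\0'; i++) {
        int bit;
        if (in[i] == '0')
            bit = 0;
        else if (in[i] == '1')
            bit = 1;
        else
            return -1;
        out[i] = output[s][bit];            /* output first, as in the C coding */
        s = next_state[s][bit];
    }
    out[i] = '\0';
    return 0;
}
```

Feeding it the input 0101101 produces 0000111, the same I/O behaviour as shown above.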
6. Deterministic and Nondeterministic Automata
We have been writing deterministic automata so far:
– for an input read by a state there is at most one transition that can be fired
– e.g. state ‘S’ can process input ‘a’ and ‘w’, and fails for anything else
[Figure: state S with two outgoing transitions, labelled ‘a’ and ‘w’.]
Nondeterministic Automata
A nondeterministic (ND) automaton can have 2 or more transitions with the same label leaving a state.
Problem: if state S sees input ‘x’, then which transition should it use?
[Figure: state S with three outgoing transitions, one labelled ‘a’ and two labelled ‘x’, leading to the states T, U and V.]
6.1. The ‘man’ Automaton
Accept all strings that contain “man”
– this is hard to write as a deterministic automaton. The following has bugs:
[Figure (WRONG): states 0 to 3 in a chain, with 0 the start state; transitions 0 to 1 on ‘m’, 1 to 2 on ‘a’, 2 to 3 on ‘n’; a loop on state 0 labelled L - m; transitions labelled L - a and L - n lead back to state 0.]
The input string command will get stuck at state 0:
Input read   Move to State
(start)      0
c            0
o            0
m            1
m            0   the problem starts here
a            0
n            0
d            0
6.2. A ND Automaton Solution
[Figure: states 0 to 3 in a chain, with 0 the start state; a loop on state 0 labelled L; transitions 0 to 1 on ‘m’, 1 to 2 on ‘a’, 2 to 3 on ‘n’.]
It is nondeterministic because an ‘m’ input in state 0 can be dealt with by two transitions:
– a transition back to state 0, or
– a transition to state 1
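Processing command input:
– staying in state 0 throughout: 0 -c-> 0 -o-> 0 -m-> 0 -m-> 0 -a-> 0 -n-> 0 -d-> 0
– moving to state 1 on the first ‘m’: 0 -c-> 0 -o-> 0 -m-> 1, then the second ‘m’ cannot be processed (fail: reject the input)
– moving to state 1 on the second ‘m’: 0 -c-> 0 -o-> 0 -m-> 0 -m-> 1 -a-> 2 -n-> 3 (accepting state)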
6.3. Executing a ND Automaton
It is difficult to code ND automata in conventional languages, such as C.
Two different coding approaches:
– 1. When an input arrives, execute all transitions in parallel. See which succeeds.
– 2. When an input arrives, try one transition. If it leads to failure then backtrack and try another transition.
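Approach 1 can be imitated in C without real parallelism by keeping track of the set of states the automaton could currently be in. The following is a minimal sketch for the ND ‘man’ automaton, using one bit per state; the function name `contains_man` is made up for illustration, and the loop on state 0 is assumed to accept any character rather than only letters.

```c
/* Hypothetical helper: simulates the ND 'man' automaton by following all
   possible transitions at once. Bit i of the set is 1 when state i is
   active. Returns 1 as soon as state 3 (accepting) is entered, else 0. */
int contains_man(const char *input)
{
    unsigned current = 1u << 0;             /* start in state 0 only */
    int i;

    for (i = 0; input[i] != '\0'; i++) {
        char ch = input[i];
        unsigned next = 0;

        if (current & (1u << 0)) {
            next |= 1u << 0;                /* 0 -L-> 0 (assumed: any character) */
            if (ch == 'm')
                next |= 1u << 1;            /* 0 -m-> 1 is followed as well */
        }
        if ((current & (1u << 1)) && ch == 'a')
            next |= 1u << 2;                /* 1 -a-> 2 */
        if ((current & (1u << 2)) && ch == 'n')
            next |= 1u << 3;                /* 2 -n-> 3 */

        if (next & (1u << 3))
            return 1;                       /* accepting state entered */
        current = next;
    }
    return 0;
}
```

A branch that cannot process its input simply drops out of the set, which plays the role of "fail: reject the input" for that branch only.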
Approach (1) in Parlog
A concurrent logic programming language.
state0([X|Rest]) :- state0(Rest) : true.
state0([m|Rest]) :- state1(Rest) : true.
state1([a|Rest]) :- state2(Rest).
state2([n|Rest]).
(the two state0 clauses are tested concurrently)
Call:
?- state0([c,o,m,m,a,n,d]).
Approach (2) in Prolog
A sequential logic programming language.
nextState(0, _, 0).     % the nondeterministic part:
nextState(0, m, 1).     % an m in state 0 matches both clauses
nextState(1, a, 2).
nextState(2, n, 3).

nda(3, _).              % accepting state: any remaining input is ignored
nda(State, [Ch|Input]) :-
    nextState(State, Ch, NewState),
    nda(NewState, Input).

Call:
?- nda(0, [c,o,m,m,a,n,d]).
6.4. Why use ND Automata?
With nondeterminism, some problems are easier to solve/model.
Nondeterminism is common in some application areas, such as AI, graph search, and compilers.
It is possible to translate a ND automaton into a (larger, more complex) deterministic one.
In mathematical terms, ND automata and deterministic automata are equivalent
– they can be used to model all the same problems
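As an illustration of that equivalence, here is a hand-derived deterministic automaton for "contains man". It is a sketch worked out for this example rather than taken from the slides: the key difference from the buggy version in 6.1 is that an ‘m’ seen in state 1 or 2 goes to state 1 instead of back to state 0. The function name `contains_man_det` is made up for illustration.

```c
/* Hypothetical helper: deterministic automaton for strings containing "man".
   state 0: nothing useful seen   state 1: just seen "m"
   state 2: just seen "ma"        state 3: "man" seen (accepting) */
int contains_man_det(const char *input)
{
    int state = 0;
    int i;

    /* stop as soon as the accepting state is entered */
    for (i = 0; input[i] != '\0' && state != 3; i++) {
        char ch = input[i];
        switch (state) {
        case 0:
            state = (ch == 'm') ? 1 : 0;
            break;
        case 1:
            state = (ch == 'a') ? 2 : (ch == 'm') ? 1 : 0;
            break;
        case 2:
            state = (ch == 'n') ? 3 : (ch == 'm') ? 1 : 0;
            break;
        }
    }
    return state == 3;
}
```

On the input command the second ‘m’ now keeps the automaton in state 1, so the following "an" leads to the accepting state.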
7. ‘washington’ Partial Anagrams
Find all the words which can be made from the letters in “washington”.
There are over 240 words. Some of the 7-letter words:
– agonist
– goatish
– showing
– washing
7.1. A Two Stage Process
1. Select all the words from a dictionary (e.g. /usr/share/dict/words on calvin) which use the letters in “washington”
– use a deterministic automaton
2. Delete the words which use the “washington” letters too many times (e.g. “hash”)
– use a nondeterministic automaton
7.2. Stage 1: Deterministic Automaton
Send each word in the dictionary through the automaton:
[Figure: states 0 (the start state) and 1; a loop on state 0 labelled S, and a transition from 0 to 1 labelled newline. S = {w,a,s,h,i,n,g,t,o}]
If state 1 is reached, then the word is passed to stage 2.
For example, “hash\n” is accepted:
0 -h-> 0 -a-> 0 -s-> 0 -h-> 0 -\n-> 1
7.3. Stage 2: ND Automaton
Check if a word uses a “washington” letter too often:
– e.g. delete “hash”
The ND automaton succeeds if a word uses a letter too many times.
Then the program will not output the word.
Checking each Letter
There are 9 different letters in “washington”.
Nine deterministic automata can be used to detect if the given word has:
– more than 1 ‘a’
– more than 1 ‘g’
– ...
– more than 2 ‘n’s
Check for more than 1 ‘a’
[Figure: states 0 (the start state), 1 and 2; transitions 0 to 1 on ‘a’ and 1 to 2 on ‘a’; loops on states 0 and 1 labelled L - a. e.g. ‘nana’ reaches state 2.]
If this succeeds then the program will not output the word.
Checking all the Letters at Once
The 9 deterministic automata can be applied to the same word at the same time.
Combine the 9 deterministic automata to create a single nondeterministic automaton.
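Nondeterministic Checking
[Figure: the combined ND automaton. State 0 is the start state, with a loop labelled L. One branch per letter leaves state 0:
– two a's: 0 -a-> 1 -a-> 2, with a loop on 1 labelled L - a
– two g's: 0 -g-> 3 -g-> 4, with a loop on 3 labelled L - g
– two h's: 0 -h-> 5 -h-> 6, with a loop on 5 labelled L - h
– two i's: 0 -i-> 7 -i-> 8, with a loop on 7 labelled L - i
– three n's: 0 -n-> 9 -n-> 10 -n-> 11, with loops on 9 and 10 labelled L - n
– two o's: 0 -o-> 12 -o-> 13, with a loop on 12 labelled L - o
– two s's: 0 -s-> 14 -s-> 15, with a loop on 14 labelled L - s
– two t's: 0 -t-> 16 -t-> 17, with a loop on 16 labelled L - t
– two w's: 0 -w-> 18 -w-> 19, with a loop on 18 labelled L - w]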
Processing “hash”
Reaching an accepting state means that the program will not output “hash”.
The possible paths are:
– 0 -h-> 0 -a-> 0 -s-> 0 -h-> 0
– 0 -h-> 5 -a-> 5 -s-> 5 -h-> 6 (accepting state: two h's)
– 0 -h-> 0 -a-> 1 -s-> 1 -h-> 1
– 0 -h-> 0 -a-> 0 -s-> 14 -h-> 14
– 0 -h-> 0 -a-> 0 -s-> 0 -h-> 5
7.4. UNIX Coding
Stages 0, 1, 2, piped together:
tr A-Z a-z < /usr/share/dict/words |
grep '^[washingto]*$' |
egrep -v 'a.*a|g.*g|h.*h|i.*i|n.*n.*n|o.*o|s.*s|t.*t|w.*w'
The call to tr translates all the words taken from the dictionary into lower case.
[Figure: pipeline /usr/share/dict/words => tr => grep => egrep -v]
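For comparison, the same two checks can be written in C by counting letters directly. This is a different technique from the automata above (a counter per letter instead of states), shown only as a cross-check of what the pipeline selects; the function name `is_partial_anagram` is made up for illustration, and the word is assumed to be lower-case ASCII already (the job of tr in stage 0).

```c
/* Hypothetical helper: returns 1 if word can be made from the letters of
   "washington", using each letter no more often than it appears there. */
int is_partial_anagram(const char *word)
{
    const char *source = "washington";
    int available[26] = {0};
    int i;

    /* count how many copies of each letter are available */
    for (i = 0; source[i] != '\0'; i++)
        available[source[i] - 'a']++;

    for (i = 0; word[i] != '\0'; i++) {
        char ch = word[i];
        if (ch < 'a' || ch > 'z')
            return 0;                       /* not a lower-case letter */
        if (available[ch - 'a'] == 0)
            return 0;                       /* letter missing, or used too often */
        available[ch - 'a']--;
    }
    return 1;
}
```

Here washing is kept, while hash is rejected when its second ‘h’ is read, the same point at which the ND automaton reaches state 6.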
8. Regular Expressions (REs)
REs are an algebraic way of specifying how to recognise input
– ‘algebraic’ means that the recognition pattern is defined using RE operands and operators
REs are equivalent to automata
– REs and automata can be used on all the same problems
8.1. REs in grep
grep searches input lines, a line at a time.
If the line contains a string that matches grep's RE (pattern), then the line is output.
[Figure: input lines (e.g. from a file) => grep "RE" => output matching lines (e.g. to a file)]
Example input lines:
hello andy
my name is andy
my bye byhe
Examples
grep "and" outputs:
hello andy
my name is andy

grep -E "an|my" outputs all three lines ("|" means "or"):
hello andy
my name is andy
my bye byhe
grep "hel*" outputs ("*" means "0 or more"):
hello andy
my bye byhe
8.2. Why use REs?
They are very useful for expressing patterns that recognise textual input.
For example, REs are used in:
– editors
– compilers
– web-based search engines
– communication protocols
8.3. The RE Language
A RE defines a pattern which recognises (matches) a set of strings
– e.g. a RE can be defined that recognises the strings { aa, aba, abba, abbba, abbbba, … }
These recognisable strings are sometimes called the RE’s language.
RE Operands
There are 4 basic kinds of operands:
– characters (e.g. ‘a’, ‘1’, ‘(’)
– the symbol ε (means an empty string ‘’)
– the symbol {} (means the empty set)
– variables, which can be assigned a RE: variable = RE
RE Operators
There are three basic operators:
– union ‘|’
– concatenation
– closure *
Union
S | T
– this RE can use the S or T RE to match strings
Example REs:
a | b       matches strings {a, b}
a | b | c   matches strings {a, b, c}
Concatenation
S T
– this RE will use the S RE followed by the T RE to match against strings
Example REs:
a b         matches the string { ab }
w | (a b)   matches the strings {w, ab}
What strings are matched by the RE (a | ab) (c | bc)?
Equivalent to: {a, ab} followed by {c, bc}
=> {ac, abc, abc, abbc}
=> {ac, abc, abbc}
Closure
S*
– this RE can use the S RE 0 or more times to match against strings
Example RE:
a*   matches the strings: {ε, a, aa, aaa, aaaa, aaaaa, ... }
where ε is the empty string
8.4. REs for C Identifiers
We define two RE variables, letter and digit:
letter = A | B | C | D ... Z | a | b | c | d ... z
digit = 0 | 1 | 2 | ... 9
ident is defined using letter and digit:
ident = letter ( letter | digit )*
Strings matched by ident include: ab345  w  h5g
Strings not matched: 2  $abc  ****
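The ident RE can be checked by hand-written C that follows its two parts: one letter, then any number of letters or digits. A minimal sketch follows, with the made-up function name `is_ident`; it follows the definition above, which leaves out the underscore that C itself allows in identifiers, and it relies on the default "C" locale so that isalpha() means A-Z and a-z.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if s matches
   ident = letter ( letter | digit )*, else 0. */
int is_ident(const char *s)
{
    int i;

    /* first character must be a letter (this also rejects the empty string) */
    if (!isalpha((unsigned char)s[0]))
        return 0;

    /* every following character must be a letter or a digit */
    for (i = 1; s[i] != '\0'; i++) {
        if (!isalnum((unsigned char)s[i]))
            return 0;
    }
    return 1;
}
```

The casts to unsigned char are there because passing a negative char value to the <ctype.h> functions is undefined.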
9. UNIX Regular Expressions
Different UNIX tools use slightly different extensions of the basic RE notation
– vi, awk, sed, grep, egrep, etc.
Extra features include:
– character classes
– line start ‘^’ and end ‘$’ symbols
– the wild card symbol ‘.’
– additional operators, R? and R+
9.1. Character Classes
The character class [a1 a2 ... an] stands for a1 | a2 | ... | an
a1-an stands for the set of characters between a1 and an
– e.g. [A-Z] [a-z0-9]
9.2. Line Start and End
The ‘^’ matches the beginning of the line, ‘$’ matches the end
– e.g. grep '^andr' /usr/share/dict/words
       grep '^[washingto]*$' /usr/share/dict/words
Example as a Diagram
[Figure: /usr/share/dict/words (A, A's, AOL, AOL's, ...) => grep "^andr" => androgen, androgen's, androgynous, android, android's, androids]
9.3. Wild Card Symbol
The ‘.’ stands for any character except the newline
– e.g. grep '^a..b.$' chapter1.txt
       grep 't.*t.*t' manual
[Figure: /usr/share/dict/words (A, A's, AOL, AOL's, ...) => grep "^a..b.$" => adobe, alibi, ameba]
9.4. R? and R+
R? stands for ε | R (0 or 1 R)
R+ stands for R | RR | RRR | ..., which can also be written as R R*
– one or more occurrences of R
9.5. Operator Precedence
The operators *, +, and ? have the highest precedence.
Then comes concatenation.
Union ‘|’ has the lowest precedence.
Example:
– a | bc? means a | (b(c?)), and matches the strings {a, b, bc}
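The precedence rules can be checked from C with the POSIX regular expression functions regcomp() and regexec(), which use the same extended RE syntax as egrep. This is a sketch that assumes a POSIX system providing <regex.h>; the helper name `matches_whole` is made up for illustration. The pattern is wrapped as ^( ... )$ so that the whole string must match, because ‘^’ and ‘$’ would otherwise bind to only one side of the ‘|’.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if text matches the POSIX extended RE
   pattern, 0 if it does not, and -1 if the pattern fails to compile. */
int matches_whole(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    /* REG_EXTENDED: egrep-style syntax; REG_NOSUB: only a yes/no answer */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With the pattern "^(a|bc?)$" the strings a, b and bc match and ac does not; with the explicitly grouped "^((a|b)c?)$" the string ac matches as well, which shows that a | bc? is not read as (a | b)c?.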
10. From REs to Automata
The translation uses a special kind of ND automaton which uses ε-transitions. Automata of this type are sometimes called ε-NFAs.
The translation steps are:
– RE => ε-NFA
– ε-NFA => ND automaton
– ND automaton => deterministic automaton
– deterministic automaton => code
10.1. ε-NFAs
An ε-NFA allows a transition to use an ε label.
A transition using an ε label can be triggered without having to match any input.
ε-NFA Example
a*b | b*a is accepted by the following ε-NFA:
[Figure: an ε-NFA with states 1 to 6 and a start state; four transitions are labelled b, a, b, a, and the rest are ε-transitions. One state is marked "nondeterminism occurs here". Example input: "bbba"]
10.2. RE to ε-NFA
The resulting ε-NFA has:
– one start state and one accepting state
– at most two transitions out of any state
The construction uses standard automata ‘pieces’ corresponding to RE operands and operators.
The pieces are put together based on an expression tree for the RE.
Automata Pieces for RE Operands
[Figure: three automata pieces, each with a start state: the automaton for a character x (one transition labelled x), the automaton for ε, and the automaton for {}. The automaton for {} does not accept any strings.]
Automata Pieces for RE Operators
[Figure: Union S | T: the piece built from the automata for S and T, with a new start state.]
[Figure: Concatenation S T: the piece built from the automaton for S followed by the automaton for T.]
[Figure: Closure S*: the piece built around the automaton for S.]
10.3. Translating a | bc*
The first step in building the automaton is to draw a | bc* as an expression tree:
      |
     / \
    a   .
       / \
      b   *
          |
          c
(‘.’ is the concatenate symbol)
Translate the 3 leaves
Automaton for a: start state 1 -a-> 2
Automaton for b: start state 4 -b-> 5
Automaton for c: start state 7 -c-> 8
Automaton for c*
[Figure: ε-NFA with states 6 (start), 7, 8 and 9; the transition 7 -c-> 8, plus ε-transitions.]
Automaton for bc*
[Figure: ε-NFA with states 4, 5, 6, 7, 8 and 9; the transitions 4 -b-> 5 and 7 -c-> 8, plus ε-transitions.]
Final Automaton for a | bc*
[Figure: ε-NFA with states 0 (start) to 9; the transitions 1 -a-> 2, 4 -b-> 5 and 7 -c-> 8, plus ε-transitions.]
10.4. From ε-NFA to ND Automaton
The ε-transitions can be removed by combining the states that use them.
If we are in a state S with outgoing ε-transitions, then we are also in any state that can be reached from S by following those ε-transitions.
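The set of states reachable from S by ε-transitions alone can be computed by repeatedly following ε edges until nothing new is added. The following sketch does this for the ε-NFA of a | bc*. Note the assumption: the list of ε edges below is reconstructed from the usual union, concatenation and closure pieces and the state numbers 0 to 9, not copied from the original figure, and the names `eps_edges` and `eps_closure` are made up for illustration.

```c
/* ASSUMED epsilon edges of the epsilon-NFA for a | bc* (from-state, to-state). */
static const int eps_edges[][2] = {
    {0, 1}, {0, 4},      /* union: new start state 0 to each branch */
    {2, 3}, {9, 3},      /* union: each branch to the accepting state 3 */
    {5, 6},              /* concatenation: end of b to start of c* */
    {6, 7}, {6, 9},      /* closure: enter c, or skip it */
    {8, 7}, {8, 9}       /* closure: repeat c, or leave */
};

/* Hypothetical helper: returns the set of states reachable from state s
   using only epsilon-transitions (including s itself), as a bitmask in
   which bit t is 1 when state t is in the set. */
unsigned eps_closure(int s)
{
    unsigned closure = 1u << s;
    int n_edges = (int)(sizeof eps_edges / sizeof eps_edges[0]);
    int changed = 1;
    int i;

    while (changed) {                       /* repeat until nothing new is added */
        changed = 0;
        for (i = 0; i < n_edges; i++) {
            unsigned from = 1u << eps_edges[i][0];
            unsigned to = 1u << eps_edges[i][1];
            if ((closure & from) && !(closure & to)) {
                closure |= to;
                changed = 1;
            }
        }
    }
    return closure;
}
```

Under these assumed edges, being in state 0 also means being in states 1 and 4, and being in state 5 also means being in states 6, 7, 9 and 3.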
Example: simplify the lower branch of a|bc*
[Figure sequence: the lower branch (states 0, 4, 5, 6, 7, 8, 9 and 3, with the transitions b and c) is simplified step by step by combining states joined by ε-transitions. The combined states shown along the way include 0,4; 6,9,3; 5,6,9,3; 8,9,3; 7,8,9,3; and finally 5,6,7,8,9,3. The result has two states, 0,4 and 5,6,7,8,9,3, with the transitions b and c. Simplifying the labels gives the states 0 and 5.]
All of a|bc* simplified:
[Figure: states 0 (start), 2 and 5, with the transitions 0 -a-> 2, 0 -b-> 5 and 5 -c-> 5.]
This also happens to be a deterministic automaton, so the translation is finished.
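The last translation step, deterministic automaton => code, can be sketched for this result as follows. The accepting states are not marked in the surviving text; states 2 and 5 are taken to be accepting here because a | bc* matches a, b, bc, bcc and so on. The function name `accepts_a_or_bcstar` is made up for illustration.

```c
/* Hypothetical helper: runs the simplified automaton for a | bc* over the
   whole input. Returns 1 if the input is accepted, else 0. */
int accepts_a_or_bcstar(const char *input)
{
    int state = 0;                          /* start state */
    int i;

    for (i = 0; input[i] != '\0'; i++) {
        char ch = input[i];
        if (state == 0 && ch == 'a')
            state = 2;                      /* 0 -a-> 2 */
        else if (state == 0 && ch == 'b')
            state = 5;                      /* 0 -b-> 5 */
        else if (state == 5 && ch == 'c')
            state = 5;                      /* 5 -c-> 5 */
        else
            return 0;                       /* no transition for this input */
    }
    return state == 2 || state == 5;        /* assumed accepting states */
}
```

Unlike the ‘aeiou’ code, this sketch only accepts at the end of the input, since the RE describes whole strings.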
11. More Information
Johnsonbaugh, R. 1997. Discrete Mathematics, Prentice Hall, chapter 10.