LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 10: 9/27.
Today’s Topics
• Homework 2 review
• Recap: DCG system
• New topic: Finite State Automata (FSA)
Homework 2 Review
• Question 1
– Consider a language L3or5 = {111, 11111, 111111, 111111111, 1111111111, 111111111111, 111111111111111, ...}
– each member of L3or5 is a string containing only 1s
– the number of 1s in each string is divisible by 3 or 5
• Give a regular grammar that generates language L3or5
Homework 2 Review
• Regular grammar: divisible by 3 side
1. s --> [1], a.
2. a --> [1], b.
3. b --> [1].
4. b --> [1], c.
5. c --> [1], d.
6. d --> [1], b.
• Regular grammar: divisible by 5 side
7. s --> [1], one.
8. one --> [1], two.
9. two --> [1], three.
10. three --> [1], four.
11. four --> [1].
12. four --> [1], five.
13. five --> [1], six.
14. six --> [1], seven.
15. seven --> [1], eight.
16. eight --> [1], four.
• take the disjunction of the two sub-grammars, i.e. merge them
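The merged grammar can be tried directly in Prolog. The sketch below loads the sixteen rules as one DCG, with the two s rules moved next to each other so the clauses of each nonterminal stay contiguous. It assumes SWI-Prolog (for maplist/2), and the helpers ones/2 and in_l3or5/1 are hypothetical names added only for testing; they are not part of the homework answer.

```prolog
% Merged grammar for L3or5: the two start rules give the disjunction.
s --> [1], a.        % divisible-by-3 side
s --> [1], one.      % divisible-by-5 side

% divisible by 3
a --> [1], b.
b --> [1].
b --> [1], c.
c --> [1], d.
d --> [1], b.

% divisible by 5
one --> [1], two.
two --> [1], three.
three --> [1], four.
four --> [1].
four --> [1], five.
five --> [1], six.
six --> [1], seven.
seven --> [1], eight.
eight --> [1], four.

% ones(+N, -List): helper (assumption), builds a list of N 1s.
ones(N, List) :- length(List, N), maplist(=(1), List).

% in_l3or5(+N): succeeds if the string of N 1s is generated by the grammar.
in_l3or5(N) :- ones(N, List), phrase(s, List), !.
```

For example, `?- in_l3or5(6).` succeeds through the divisible-by-3 rules, while `?- in_l3or5(7).` fails because neither side can consume exactly seven 1s.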
Homework 2 Review
• Question 2: Language $L = \{a^{2n}b^{n+1} \mid n \geq 1\}$ is also non-regular
• but can be generated by a regular grammar extended to allow left and right recursive rules
• Give a Prolog grammar satisfying these rules for L
• Legit rules:
X -> aY
X -> a
X -> Ya
– where X, Y are non-terminals, a is some arbitrary terminal
Homework 2 Review
• $L = \{a^{2n}b^{n+1} \mid n \geq 1\}$
• DCG:
1. s --> [a], b.
2. b --> [a], c.
3. c --> [b], d.
4. c --> s, [b].
5. d --> [b].
[Figure: two parse trees. The first, s(a, b(a, c(b, d(b)))), is generated by rules 1 + 2 + 3 + 5. The second, s(a, b(a, c(s, b))), is generated by rules 1 + 2 + 4 (center-embedded recursive tree).]
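As a quick check, the five rules can be loaded as they stand and queried with phrase/2. This is only a sketch for experimenting with the grammar; the example strings in the comments are worked out by hand from the rules.

```prolog
% DCG for L = { a^(2n) b^(n+1) | n >= 1 }
s --> [a], b.
b --> [a], c.
c --> [b], d.      % base case: closes the innermost aa with bb
c --> s, [b].      % center embedding: aa ... b around a smaller member of L
d --> [b].

% Example queries:
%   ?- phrase(s, [a,a,b,b]).              % n = 1, succeeds
%   ?- phrase(s, [a,a,a,a,b,b,b]).        % n = 2, succeeds
%   ?- phrase(s, [a,a,a,a,b,b]).          % fails: one b short
```

Rule 4 is not left recursive in the harmful sense, because s always consumes an a before the grammar can reach c again, so these queries terminate.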
Homework 2 Review
• Question 3 (Optional 438)
A right recursive regular grammar that generates a (rightmost) parse for the language $\{a^n \mid n \geq 2\}$:
s(s(a,B)) --> [a], b(B).
b(b(a,B)) --> [a], b(B).
b(b(a)) --> [a].
• Example
?- s(Tree,[a,a,a],[]).
Tree = s(a,b(a,b(a))) ?
• A corresponding left recursive regular grammar will not halt in all cases when an input string is supplied
• Modify the right recursive grammar to produce a left recursive parse, e.g.
?- s(Tree,[a,a,a],[]).
Tree = s(b(b(a),a),a)
• Can you do this for any right recursive regular grammar? e.g. the one for sheeptalk
Homework 2 Review
• Original:
s(s(a,B)) --> [a], b(B).
b(b(a,B)) --> [a], b(B).
b(b(a)) --> [a].
• New:
1. s(s(B,a)) --> [a], b(B).
2. b(b(B,a)) --> [a], b(B).
3. b(b(a)) --> [a].
• taking advantage of the fact that we know we’re generating only a’s
• example: in rule 1, we match a terminal a on the left side but place an a at the end for the tree
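The new rules can be run unchanged to confirm the shape of the tree. The sketch below is just the three rules plus a query; it relies on nothing beyond standard DCG translation and phrase/2.

```prolog
% Rules are still right recursive (terminal first, nonterminal last),
% but the tree term puts the sub-tree B on the left and the a on the right.
s(s(B,a)) --> [a], b(B).
b(b(B,a)) --> [a], b(B).
b(b(a))   --> [a].

% ?- phrase(s(Tree), [a,a,a]).
% Tree = s(b(b(a),a),a)
```

Parsing still proceeds left to right and halts on every input, since each rule consumes an a before recursing; only the constructed term is left-branching.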
Extra Arguments: Recap
• some uses for extra arguments on non-terminals in the grammar:
1. to generate a parse tree
• recording the derivation history
2. feature agreement
3. implement counting
Extra Arguments: Recap
Implement Determiner-Noun Number Agreement
• Data:
– the man/men
– a man/*a men
• Modified grammar:
np(np(D,N)) --> det(D,Number), common_noun(N,Number).
det(det(the),_) --> [the].
det(det(a),sg) --> [a].
common_noun(n(ball),sg) --> [ball].
common_noun(n(man),sg) --> [man].
common_noun(n(men),pl) --> [men].
• syntactic sugar: could just encode feature values in nonterminal name
np(np(D,N)) --> detsg(D), common_nounsg(N).
np(np(D,N)) --> detpl(D), common_nounpl(N).
detsg(det(a)) --> [a].
detsg(det(the)) --> [the].
detpl(det(the)) --> [the].
common_nounsg(n(man)) --> [man].
common_nounpl(n(men)) --> [men].
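To see the agreement feature at work, the extra-argument version can be loaded and queried as below. This is a sketch that reuses the slide's rules verbatim; only the example queries in the comments are added.

```prolog
% Number is shared between determiner and noun, so unification
% enforces agreement. The anonymous variable lets "the" go with either.
np(np(D,N)) --> det(D,Number), common_noun(N,Number).

det(det(the),_) --> [the].
det(det(a),sg)  --> [a].

common_noun(n(ball),sg) --> [ball].
common_noun(n(man),sg)  --> [man].
common_noun(n(men),pl)  --> [men].

% ?- phrase(np(T), [the,men]).    % T = np(det(the),n(men))
% ?- phrase(np(T), [a,man]).      % T = np(det(a),n(man))
% ?- phrase(np(T), [a,men]).      % fails: sg does not unify with pl
```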
Extra Arguments: Recap
• Class Exercise:
– Implement Case = {nom,acc} agreement system for the grammar
– Examples: the man hit me vs. *the man hit I
• Modified grammar, using { ... Prolog code ... }:
s(s(Y,Z)) --> np(Y,Case), vp(Z), { Case = nom }.
np(np(Y),Case) --> pronoun(Y,Case).
pronoun(i,nom) --> [i].
pronoun(we,nom) --> [we].
pronoun(me,acc) --> [me].
np(np(D,N),_) --> det(D,Num), common_noun(N,Num).
vp(vp(Y,Z)) --> transitive(Y), np(Z,Case), { Case = acc }.
Extra Arguments: Recap
• Class Exercise:
– Implement Case = {nom,acc} agreement system for the grammar
– Examples: the man hit me vs. *the man hit I
• Modified grammar, without { ... Prolog code ... }:
s(s(Y,Z)) --> np(Y,nom), vp(Z).
np(np(Y),Case) --> pronoun(Y,Case).
pronoun(i,nom) --> [i].
pronoun(we,nom) --> [we].
pronoun(me,acc) --> [me].
np(np(D,N),_) --> det(D,Num), common_noun(N,Num).
vp(vp(Y,Z)) --> transitive(Y), np(Z,acc).
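A runnable version of the Case grammar needs a small lexicon, which the slide leaves out. In the sketch below the det, common_noun and transitive entries are assumptions added so the example sentences can be parsed, and the two np rules are placed together to keep the clauses contiguous.

```prolog
% Case is fixed by position: subject is nom, object is acc.
s(s(Y,Z)) --> np(Y,nom), vp(Z).

np(np(Y),Case) --> pronoun(Y,Case).
np(np(D,N),_)  --> det(D,Num), common_noun(N,Num).   % full NPs take either Case

vp(vp(Y,Z)) --> transitive(Y), np(Z,acc).

pronoun(i,nom)  --> [i].
pronoun(we,nom) --> [we].
pronoun(me,acc) --> [me].

% Assumed lexicon (not on the slide), just enough for the examples.
det(det(the),_)        --> [the].
common_noun(n(man),sg) --> [man].
transitive(v(hit))     --> [hit].

% ?- phrase(s(T), [the,man,hit,me]).
% T = s(np(det(the),n(man)),vp(v(hit),np(me)))
% ?- phrase(s(T), [the,man,hit,i]).     % fails: i is nom, object must be acc
```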
Extra Arguments: Recap
• L = a+b+ (a regular grammar)
1. s --> [a],b.
2. b --> [a],b.
3. b --> [b],c.
4. b --> [b].
5. c --> [b],c.
6. c --> [b].
• Lab = $\{a^n b^n \mid n \geq 1\}$ (regular grammar + extra argument)
1. s(X) --> [a],b(a(X)).
2. b(X) --> [a],b(a(X)).
3. b(a(X)) --> [b],c(X).
4. b(a(0)) --> [b].
5. c(a(X)) --> [b],c(X).
6. c(a(0)) --> [b].
• Query: ?- s(0,L,[]).
Extra Arguments: Recap
1. s(X) --> [a],b(a(X)).
2. b(X) --> [a],b(a(X)).
3. b(a(X)) --> [b],c(X).
4. b(a(0)) --> [b].
5. c(a(X)) --> [b],c(X).
6. c(a(0)) --> [b].
• derivation tree:
s(0)                     rule 1
  a  b(a(0))             rule 2
     a  b(a(a(0)))       rule 3
        b  c(a(0))       rule 6
           b
• logic: ?- s(0,L,[]).
– start with 0
– wrap an a(_) for each “a”
– unwrap a(_) for each “b”
– match #a’s with #b’s = counting
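The counting grammar can be exercised with concrete input lists, as in the sketch below. The wrapper anbn/1 is a hypothetical name added for convenience. Note that with an unbound list the query should be expected not to terminate, because rule 2 is tried first and keeps adding a's under Prolog's depth-first search, so the examples supply the string.

```prolog
% The extra argument is a unary counter: 0, a(0), a(a(0)), ...
s(X)    --> [a], b(a(X)).     % first a: wrap once
b(X)    --> [a], b(a(X)).     % each further a: wrap again
b(a(X)) --> [b], c(X).        % first b: unwrap once, more b's to come
b(a(0)) --> [b].              % first b is also the last (string is ab)
c(a(X)) --> [b], c(X).        % each further b: unwrap
c(a(0)) --> [b].              % last b: counter is back to a(0)

% anbn(+String): helper (assumption) that starts the counter at 0.
anbn(String) :- phrase(s(0), String), !.

% ?- anbn([a,a,b,b]).     % succeeds
% ?- anbn([a,a,b]).       % fails: one a(_) is left unwrapped
```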
So far
• Equivalent formalisms
– regexp
– regular grammars
– FSA
FSA
• example (sheeptalk)
– baa!
– baaa! …
• regexp
– baa+!
[Figure: FSA with states s, w, x, y, z and arcs s –b→ w, w –a→ x, x –a→ y, y –a→ y, y –!→ z]
• > points to start state
• end state marked in red
• basic idea: “just follow the arrows”
• state transition: in state s, if current input symbol is b, go to state w
• accepting computation: in a final state, if there is no more input, accept string
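"Just follow the arrows" can be written as a two-clause recursion over the input. The sketch below encodes the sheeptalk machine as facts; the predicate names transition/3, start/1, final/1, accepts/1 and run/2 are assumptions chosen for illustration.

```prolog
% transition(State, Symbol, NextState): one fact per arrow.
transition(s, b,   w).
transition(w, a,   x).
transition(x, a,   y).
transition(y, a,   y).      % the loop on y
transition(y, '!', z).

start(s).
final(z).

% accepts(+Input): Input is a list of symbols.
accepts(Input) :- start(S), run(S, Input).

% Accepting computation: no more input and we are in a final state.
run(State, []) :- final(State).
% State transition: consume one symbol and follow the matching arrow.
run(State, [Sym|Rest]) :-
    transition(State, Sym, Next),
    run(Next, Rest).

% ?- accepts([b,a,a,'!']).      % succeeds
% ?- accepts([b,a,'!']).        % fails: no ! arrow out of state x
```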
FSA: Construction
• step-by-step
• regexp
– baa+! = baaa*!
[Figure: > s]
FSA: Construction
• step-by-step
• regexp
– baaa*!
– b
– from state s, see a ‘b’, move to state w
[Figure: > s –b→ w]
FSA: Construction
• step-by-step
• regexp
– baaa*!
– ba
– from state w, see an ‘a’, move to state x
[Figure: > s –b→ w –a→ x]
FSA: Construction
• step-by-step
• regexp
– baaa*!
– baa
– from state x, see an ‘a’, move to state y
[Figure: > s –b→ w –a→ x –a→ y]
FSA: Construction
• step-by-step
• regexp
– baaa*!
– baaa*
– baa
– baaa
– baaaa...
– from y, see an ‘a’, move to ?
[Figure: > s –b→ w –a→ x –a→ y –a→ y’ –a→ ...]
• but machine must have a finite number of states!
Regular Expressions: Example
• step-by-step
• regexp
– baaa*!
– baaa*
– baa
– baaa
– baaaa...
– from state y, see an ‘a’, “loop”, i.e. move back to state y
[Figure: > s –b→ w –a→ x –a→ y, with an a-loop on y]
Regular Expressions: Example
• step-by-step
• regexp
– baaa*!
– baaa*!
– from state y, see an ‘!’, move to final state z (indicated in red)
• Note: machine cannot finish (i.e. reach the end of the input string) in states s, w, x or y
[Figure: > s –b→ w –a→ x –a→ y –!→ z, with an a-loop on y]
Finite State Automata (FSA)
• construction
– the step-by-step FSA construction method we just used works for any regular expression
• conclusion
– anything we can encode with a regular expression, we can build an FSA for it
– an important step in showing that FSA and regexps are formally equivalent
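Regular grammars are the third member of the equivalence, and the machine built for baaa*! can be transcribed into one almost mechanically. The sketch below is one way to do that for this particular machine, not a general proof: each non-final state becomes a nonterminal, each arc becomes a rule, and the arc into the final state z becomes a rule with no nonterminal on the right.

```prolog
% One nonterminal per state of the sheeptalk FSA (s, w, x, y).
s --> [b], w.        % arc s -b-> w
w --> [a], x.        % arc w -a-> x
x --> [a], y.        % arc x -a-> y
y --> [a], y.        % loop y -a-> y
y --> ['!'].         % arc y -!-> z, where z is final

% ?- phrase(s, [b,a,a,a,'!']).     % succeeds
% ?- phrase(s, [b,a,'!']).         % fails
```

Every rule has the shape X --> [t], Y or X --> [t], which is the right-recursive regular form used in the homework.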
regexp operators and FSA
• basic wildcards
– . and *
• . any single character
• e.g. p.t
• put, pit, pat, pet
• * zero or more characters
[Figure: for ., one arc from state x to state y for each character a, b, c, d, e, ..., z etc.; for *, a state y with one loop for each character over the whole alphabet]
regexp operators and FSA
• basic wildcards
– +
• one or more of the preceding character
• e.g. a+
[Figure: x –a→ y, with an a-loop on y]
– [ ]
• range of characters
• e.g. [aeiou]
[Figure: arcs from x to y labelled a, e, i, o, u]
regexp operators and FSA
• basic wildcards
– ?
• zero or one of the preceding character
• e.g. a?
[Figure: x –a→ y and x –λ→ y; λ denotes the empty transition]
• Non-determinism
– any FSA with an empty transition is non-deterministic
– see example: could be in state x or y simultaneously
– any FSA with an empty transition can be rewritten without the empty transition
[Figure: the machine rewritten without the empty transition: > x –a→ y]
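Prolog's backtracking gives a simple way to simulate the non-determinism: at each step the machine may either take an empty transition or consume a symbol. The sketch below does this for the a? machine. It assumes x is the start state and y the final state, and the predicate names arc/3, eps/2 and run/2 are hypothetical.

```prolog
% Machine for a? with an empty (lambda) transition.
arc(x, a, y).        % consume an a
eps(x, y).           % empty transition: move without consuming input

start(x).            % assumption: x is the start state
final(y).            % assumption: y is the final state

accepts(Input) :- start(S), run(S, Input).

run(S, [])     :- final(S).
run(S, Input)  :- eps(S, T), run(T, Input).        % input is left untouched
run(S, [C|Cs]) :- arc(S, C, T), run(T, Cs).

% ?- accepts([]).        % succeeds via the empty transition
% ?- accepts([a]).       % succeeds via the a arc
% ?- accepts([a,a]).     % fails
```

This naive search would loop on a cycle of empty transitions; the machine here has none, so every query terminates.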
regexp operators and FSA
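• backreferences:
– by state splitting
– e.g. ([aeiou])\1
[Figure: before splitting, x → y → z with arcs labelled a, e, i, o, u on both steps. After splitting, y becomes five states ya, ye, yi, yo, yu: x reaches each one on its own vowel, and each one reaches z only on that same vowel.]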
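State splitting can be sketched by letting the middle state carry the vowel it has seen. Below, the term y(V) stands for the split states ya, ye, yi, yo, yu; this naming, and the predicates vowel/1, arc/3 and run/2, are assumptions made for the example.

```prolog
vowel(a). vowel(e). vowel(i). vowel(o). vowel(u).

% arc(State, Symbol, NextState)
arc(x, V, y(V)) :- vowel(V).   % first vowel: go to the state that remembers it
arc(y(V), V, z).               % second symbol must be that same vowel

start(x).
final(z).

accepts(Input) :- start(S), run(S, Input).

run(S, [])     :- final(S).
run(S, [C|Cs]) :- arc(S, C, T), run(T, Cs).

% ?- accepts([o,o]).     % succeeds
% ?- accepts([o,u]).     % fails: state y(o) has no arc for u
```

The machine is still finite: y(V) only ever takes five values, one per vowel, which is exactly the five split states in the figure.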
Finite State Automata (FSA)
• more formally
– (Q,s,f,Σ,δ)
1. set of states (Q): {s,w,x,y,z} (5 states; must be a finite set)
2. start state (s): s
3. end state(s) (f): z
4. alphabet (Σ): {a, b, !}
5. transition function δ:
signature: character × state → state
   1. δ(b,s) = w
   2. δ(a,w) = x
   3. δ(a,x) = y
   4. δ(a,y) = y
   5. δ(!,y) = z
[Figure: > s –b→ w –a→ x –a→ y –!→ z, with an a-loop on y]
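The five-tuple maps onto Prolog facts almost one for one. In the sketch below δ is written with the slide's argument order (character, state, next state), which happens to be the order foldl/4 expects, so running the machine is a fold of δ over the input. The predicate names are assumptions, and library(apply) is loaded on the assumption that SWI-Prolog is used.

```prolog
:- use_module(library(apply)).      % foldl/4 (SWI-Prolog)

% Q, s, f and Sigma as facts.
state(s). state(w). state(x). state(y). state(z).
start(s).
final(z).
symbol(a). symbol(b). symbol('!').

% delta(Character, State, NextState): a partial function,
% undefined pairs simply have no fact.
delta(b,   s, w).
delta(a,   w, x).
delta(a,   x, y).
delta(a,   y, y).
delta('!', y, z).

% accepts(+Input): fold delta over the input, starting from the start state.
accepts(Input) :-
    start(S0),
    foldl(delta, Input, S0, Final),
    final(Final).

% ?- foldl(delta, [b,a,a], s, State).    % State = y
% ?- accepts([b,a,a,'!']).               % succeeds
```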
FSA
• Finite State Automata (FSA) have a limited amount of expressive power
• Let’s look at a modification to FSA and its effect on its power
String Transitions
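– so far...
• all machines have had just a single character label on the arc
• so if we allow strings to label arcs
– do they endow the FSA with any more power?
• Answer: No
– because we can always convert a machine with string-transitions into one without
[Figure: an arc labelled with the single character b; an arc labelled with the string abb, and the equivalent chain of three arcs labelled a, b, b]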
Finite State Automata (FSA)
• equivalent
– 5 state machine
[Figure: > s –b→ w –a→ x –a→ y –!→ z, with an a-loop on y]
– 3 state machine using string transitions
[Figure: > s –baa→ y –!→ z, with an a-loop on y]
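The conversion from a string-labelled arc to single-character arcs can be sketched as a small predicate that inserts fresh intermediate states. The names expand/4 and chain/6, and the q(From-To, I) naming scheme for the new states, are assumptions made for illustration.

```prolog
% expand(+From, +To, +Chars, -Arcs): replace one arc From -Chars-> To
% by a chain of single-character arcs through new intermediate states.
expand(From, To, Chars, Arcs) :-
    chain(Chars, From, To, From-To, 1, Arcs).

% Last character: connect to the original target state.
chain([C], Cur, To, _Tag, _I, [arc(Cur, C, To)]).
% Otherwise: invent intermediate state q(Tag, I) and continue from it.
chain([C,C2|Cs], Cur, To, Tag, I, [arc(Cur, C, q(Tag,I))|Arcs]) :-
    I1 is I + 1,
    chain([C2|Cs], q(Tag,I), To, Tag, I1, Arcs).

% ?- expand(s, y, [b,a,a], Arcs).
% Arcs = [arc(s,b,q(s-y,1)), arc(q(s-y,1),a,q(s-y,2)), arc(q(s-y,2),a,y)]
```

Applied to the baa arc of the 3 state machine, the two invented states play the roles of w and x, giving back the 5 state machine once the loop on y and the ! arc are kept as they are.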