Computer Simulation of a Serial Parsing Model
description
Transcript of Computer Simulation of a Serial Parsing Model
![Page 1: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/1.jpg)
Computer Simulation of a Serial Parsing Model
Doan-Nguyen [email protected] (Computational Linguistics at Concordia)Department of Computer ScienceConcordia University (Montréal, Canada)
![Page 2: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/2.jpg)
2
Introduction
Question: How about a computer simulation for serial syntactic parsing, which achieves a medium coverage and can deal with some classes of long and difficult sentences?
We implemented such a computer simulation: Based on the garden-path theory (Frazier, 1978;
Frazier & Clifton, 1996) Parsing process is basically incremental and input-
driven (bottom-up), but some top-down mechanisms are also included
![Page 3: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/3.jpg)
3
Introduction (2)
Focus on reanalysis Not based on context-free grammars. Using
context extensively. Linguistic knowledge is represented procedurally in Prolog clauses
Considering characteristics of computer to obtain high performance whenever psycholinguistic knowledge is not available
Leaving space for non-structural information in future extension (a human-simulating sentence interpretation program)
![Page 4: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/4.jpg)
4
Introduction (3)
Limitations: not yet considered: Lexical ambiguities: words of multiple
categories Conjunctive structures, sentences with
punctuations
![Page 5: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/5.jpg)
5
State-of-the-Art
Computational implementations: Marcus 1980: deterministic serial McRoy & Hirst 1990: serial model, no
reanalysis Lombardo 1998: top-down parsing +
selective reanalysis Stevenson 1994: connectionism Crocker & Brants 2000: parallel,
probabilistic
![Page 6: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/6.jpg)
6
The Algorithm
Basic loop: incrementalCurrentTree := NIL;while not end of sentence do
InNode := next input word;attach InNode to CurrentTree;
end
How to do attachment ? Where to attach InNode into CurrentTree ?In principle: examine all information from the tree and the input word. Attachment may be costly!
![Page 7: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/7.jpg)
7
First refinement: Attachment with Waiting Nodes
A node is waiting for some constituent(s) obligatory: I look for … expected: I am reading …
Incoming element = expected constituent ? Quick checks! I read … the (book) / the (teacher’s book) / good
(books) First refinement:attach(InNode) :- waiting_node(WaitingNode), attach_wait(WaitingNode,
InNode); ... . % more procedures
![Page 8: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/8.jpg)
8
More refinements
Two cases: InNode can be attached to CurrentTree:
specific attachment InNode cannot be attached to CurrentTree:
reanalysis
![Page 9: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/9.jpg)
9
Specific Attachment attach_spec(InNode) examines specifically the tree
and the input element for an attachment I read it … loudly I went to the theatre of the town... to see (her)
Resolve attachment ambiguities and improve efficiency with psycholinguistic preferences. Classical examples: Late Closure: He said that John came …
yesterday. Minimal Attachment: The boy … found ... PP-attachment: V preferred to N:
I found a bird … in (the garden)
![Page 10: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/10.jpg)
10
Reanalysis Reanalysis
I knew the cat in the garden … was (smart) The cat found in the garden … was (smart) I told him the story … was (interesting) Is the girl sitting there … intelligent?
Recent focus in psycholinguistics Frazier & Rayner 1982: Selective Reanalysis. Fodor & Inoue 1994: Diagnosis model: repairing
(not reparsing!) after diagnosis. Fodor & Inoue 1998: Attach Anyway + Adjust Others: Lewis 1998 (NL-Soar), Sturt & Crocker 1996
(Tree Lowering)
![Page 11: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/11.jpg)
11
Reanalysis (2)
Reanalysis: symptoms + guesses + verifications + repair The cat found in the garden … was
(smart)symptoms: 'was' has no subject but needs
oneguesses: the potential subject may be now
the object of a verb, or the subject of a clauseverificationsrepair the current tree
![Page 12: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/12.jpg)
12
Reanalysis instead of easy attachmentverbs with multiple subcategorizations
I will bring the children … some foodv - 1 object --> v - 2 objects (Not a relative!)
I tell the children … that … I tell the children … she …
v - object --> v - object - clause (Not relative) I want him … to come
v - object --> v - (CP (TP him to (VP come))) (Not a purposive adjunct)
Although there is no breakdown and attachment is possible, reanalysis should be invoked instead!
![Page 13: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/13.jpg)
13
Reanalysis instead of easy attachment (2) Hypothesis: When a verb is satisfied by the
incoming element, the parser still remembers its other subcategorizations by registering the verb as a may_wait node.
attach_may_wait(Elt) considers the possibility of reanalysis before doing specific attachment I want him
verb - obj: (VP want (DP him))may_wait(want, other subcats of 'want')
… to comeattach_may_wait reanalyzes the current tree:
(VP want (CP (TP (DP him) to (VP come)))
![Page 14: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/14.jpg)
14
Reanalysis instead of easy attachment (3)empty categories
The picturesi you have ei … seen ei is put right after have (Minimal Chain Principle (De
Vincenzi 1991))seen can begin a modifier, cf. The pictures you took
seen from this perspective are very surrealist
Other examples The booki I want ei … to read
Whati do you want ei … to read?
The folderi which I keep ei … my files in
Whati do you see ei … the birds with?
![Page 15: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/15.jpg)
15
Reanalysis instead of easy attachment (4) Hypothesis: (Similar to multiple subcategorization verbs)
An object trace node is registered in may_wait state. During parsing, when the top may_wait node is a trace node, attach_may_wait will check whether InNode needs to be attached to the verb, and if yes, move the trace node somewhere else.
The picturesi you have ei ... ei is registered as may_wait When seen arrives, attach_may_wait reanalyzes and
gives:The picturesi you have seen
The parser then looks for a new position for ei
The picturesi you have seen ei
![Page 16: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/16.jpg)
16
Reanalysis instead of easy attachment (5)
If the parser cannot find a new position for ei, it hangs it in the memory for future location
The folderi which I keep ei … my files in is blue The folderi which I keep ei …
The folderi which I keep my files -- ei is hung in memory
The folderi which I keep my files in ei -- ei is located
![Page 17: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/17.jpg)
17
The algorithm
attach(N) :- waiting_node(W), attach_wait(W, N);
may_wait_node(W), attach_may_wait(W, N); % reanalysis
attach_spec(N);
reanalyze(N). % reanalysis
CurrentTree := NIL;while not end of input do
N := next word;attach N to CurrentTree;
end
![Page 18: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/18.jpg)
18
Structure postulation
the boy … who the boy … you
Conform to Frazier 1998: parsing is not purely bottom-up, but also has some top-down features
CPthe boy
DP
who
waiting nodeadjunct
CP
DPi
ei TP
you
waiting node
the boy
![Page 19: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/19.jpg)
19
Forward parsing
Examples We need... disk… storage... management... software. I saw the … teacher’s son’s dog.
Reanalysis repeated??? Seems unlikely! I go… to… school / see her. They are… selected… people / by John.
Reanalysis seems not occur in either case Hypothesis: With normal pronouncing rhythm,
reanalysis does not occur here but forward parsing
![Page 20: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/20.jpg)
20
Forward parsing (2)
Forward parsing: the parser delays attachment and continues to work with the input stream to gather enough information for a good attachment We need... disk… storage... management...
software.When disk arrives, the parser does not attach it
immediately, but continues parsing forward to get the head of the DP
I go… to… school / see her.When to arrives, the parser continues parsing
forward to know whether it will get a PP or an infinitive VP
![Page 21: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/21.jpg)
21
Implementation
Programmed in Prolog (Windows)Coverage: many basic and
'advanced' structures of English. See test examples.
Test: Time complexity:
Theoretical ? Practical: seems good (linear ?)
![Page 22: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/22.jpg)
22
Test Results (on 300 sentences)
0
500
1000
1500
2000
2500
3000
1 3 5 7 9 11
13
15
17
19
21
23
25
Average
Min
Max
Length
Time (operations)
![Page 23: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/23.jpg)
23
Some test sentences The folder I kept my letters in is blue. Is the girl working in the house your daughter? Is the girl working in the house your daughter bought recently? What exactly do you want to see in this old house? The big white horse raced quickly past the old barn fell behind the house. The girl who I hear they say he thinks John loves is their daughter. To drink with friends that you have not met for a long time is very
interesting. Do you notice the old tall woman who sits looking at your daughter's
little dog has spent a long time here? Are the three little boys in the computer group chosen recently by the
university where your friend is studying computer science intelligent boys?
![Page 24: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/24.jpg)
24
The folder I kept my letters in is blue.
'node(4,d:0:[sg,def],0-1,the)
node(2,n:[sg],1-2,folder)
- 'node(7,e:o,?,e(d,node(4,d:0:[sg,def],0-1,the)))
node(5,c:r,?,[])
'node(3,d:p:[sg,p(1)],2-3,i)
node(6,t:[],?,[])
node(8,v:[o,g]:[cj,past0,r(keep)],3-4,kept)
node(13,d:ps:[pl],4-5,my)
node(11,n:[pl],5-6,letters)
-node(12,p:[],6-7,in)
node(15,e:d,?,e(d,node(7,e:o,?, e(d,node(4,d:0:[sg,def],0-1,the)))))
node(16,t:[cj,sg,p(3),pres,sg,def],?,[])
node(14,v:be:[cj,sg,p(3),pres],7-8,is)
node(17,a,8-9,blue)
![Page 25: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/25.jpg)
25
DiscussionsFuture work:
Lexical ambiguities: words of multiple categories Conjunctive structures, sentences with
punctuations Incorporating non-structural (semantic, real
world) information: sentence processing. Incorporating other techniques: statistics. Comparing the parser with traditional cfg-based
and statistics-based parsers.
Programming the parser on the computer may help us understand more on the human process.
![Page 26: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/26.jpg)
26
Conclusions
A serial parsing prototype has been implemented with some good results
Particularities: Reanalysis instead of easy attachment Forward parsing
Still a lot of things to do
![Page 27: Computer Simulation of a Serial Parsing Model](https://reader036.fdocuments.in/reader036/viewer/2022062323/56815275550346895dc0a5be/html5/thumbnails/27.jpg)
27
ReferencesCrocker, M. & Brants, T., 2000. Wide-Coverage Probabilistic Sentence Processing. Journal of
Psycholinguistic Research, Vol. 29, No. 6, 2000.De Vincenzi M., 1991. Syntactic parsing strategies in Italian (Dordrecht: Kluwer Academic,
1991).Fodor, J.D. & Ferreira, F., 1998 (eds). Reanalysis in Sentence Processing. Kluwer Academic.Fodor, J.D. & Inoue, A., 1994. The Diagnosis and Cure of Garden Paths. Journal of
Psycholinguistic Research, Vol. 23, No. 5, 1994.Fodor, J.D. & Inoue, A., 1998. Attach Anyway. In Fodor & Ferreira (1998) 101-141.Frazier, L., 1978. On Comprehending Sentences: Syntactic Parsing Strategies. Doctoral
dissertation, University of Connecticut.Frazier, L., 1998. Getting There (Slowly). Journal of Psycholinguistic Research, Vol. 27, No. 2,
1998.Frazier, L. & Clifton, C., 1996. Construal. Cambridge, MA: MIT Press.Lombardo, V., 1998. A Computational Model of Recovery. In Fodor & Ferreira (1998) 287-325.Marcus, M., 1980. A Theory of Syntactic Recognition for Natural Language. Cambridge, MA:
MIT Press.McRoy, S. & Hirst, G., 1990. Race-based Parsing and Syntactic Disambiguation. Cognitive
Science, 14.Stevenson S., 1998. Parsing as Incremental Restructuring. In Fodor & Ferreira (1998) 327-363.