Introduction and Jurafsky Model Resource: A Probabilistic Model of Lexical and Syntactic Access and...
Sentence Comprehension-I
Introduction and Jurafsky Model
Resource: A Probabilistic Model of Lexical and Syntactic Access and Disambiguation, Jurafsky 1996
What is Reading?
The process of accessing words? Or what happens after all the words are recognized?
Analyze it
Evaluate it
Creating new knowledge
Thinking guided by print?
Psycholinguistics
Issues in Language Processing
What is constructed when we comprehend a sentence? Propositional representation
What role do words play? Mental lexicon
How does the process of constructing representation occur?
How is the ability to construct a representation acquired?
Reading Time Experiments
Reading time: the total time a reader takes to read a piece of text.
Reading time experiments:
Word-by-word reading
Sentence-by-sentence reading
Eye-tracking experiments
Reading time reflects the comprehension difficulty of a sentence.
Measuring Reading Time with Eye Tracking
Source: https://wiki.brown.edu/confluence/display/kertzlab/Eye-Tracking+While+Reading
Eye Tracking While Reading (ETWR) Experiment
Modelling Reading Time
Probabilistic Context Free Grammar (PCFG) model
Entropy Reduction model
Competition-Integration model
Connectionist model
Disambiguation
Assumptions by Jurafsky (1996):
Observed preferences in the interpretation of ambiguous sentences reflect the probabilities of different syntactic structures.
Processing difficulty is a continuum: slight preferences at one end, garden path constructs at the other.
Several types of ambiguities:
Lexical category ambiguity
Attachment ambiguity
Unexpected thematic fit
Main clause vs. reduced relative clause ambiguity
Garden Path Phenomena
Garden path: "to be led down the garden path" means to be misled or deceived.
A garden path sentence [Wikipedia]:
A grammatically correct sentence
Starts in such a way that the reader's most likely interpretation will be incorrect
The reader is lured into a dead-end parse
Garden path sentences will be marked with # from now on.
Garden Path Example
(1) #The old man the boat.
(2) #The horse raced past the barn fell.
(3) #The complex houses married and single soldiers and their families.
(4) #The government plans to raise taxes were defeated.
Lexical Category Ambiguity
Ambiguity resolved without trouble (fires = N or V):
1a. The warehouse fires destroyed all the buildings.
1b. The warehouse fires a dozen employees each year.
Ambiguity leads to garden path (complex = N or Adj, houses = N or V, etc.):
2a. #The complex houses married and single students.
2b. #The old man the boats.
Attachment Ambiguity
(1) The spy saw the policeman with binoculars.
(2) The spy saw the policeman with a revolver.
(3) The bird saw the birdwatcher with binoculars.
Subcategorization Frames
The arguments required by a verb are its subcategorization frame or valence.
Primary arguments
Secondary arguments
Attachment preferences vary between verbs
(1) The women discussed the dogs on the beach.
a. The women discussed the dogs which were on the beach. (90%)
b. The women discussed them (the dogs) while on the beach. (10%)
(2) The women kept the dogs on the beach.
a. The women kept the dogs which were on the beach. (5%)
b. The women kept them (the dogs) on the beach. (95%)
Unexpected Thematic Fit
(1) The cop arrested by the detective was guilty of taking bribes.
(2) The crook arrested by the detective was guilty of taking bribes.
(1) introduces more disambiguation difficulty because the initial noun phrase (the cop) is a good agent of the first verb (arrested).
Main Clause vs. Reduced Relative Clause
Reduced relative clause: a that-clause without the that.
a. #The horse raced past the barn fell.
a'. #The horse (that) raced past the barn fell.
b. The horse found in the woods died.
b'. The horse (that was) found in the woods died.
Another case of different subcategorization preferences:
X raced >> X raced Y (intransitive preferred over transitive)
X found Y >> X found (transitive preferred over intransitive)
Serial Parsing Model
If multiple rules can apply, choose one based on a selection rule (determinism).
Example selection rule: minimal attachment (choose the tree with the fewest nodes).
If the parse fails, backtrack to the choice point and reparse.
When backtracking occurs, it causes increased processing times.
Parallel Parsing Model
If multiple rules can apply, pursue all possibilities in parallel (non-determinism).
If any parse fails, discard it.
Problem: the number of parse trees can grow exponentially.
Solution: only pursue a limited number of possibilities (bounded parallelism); prune some of the unpromising parses.
A garden path means the correct tree was pruned from the search space; recovering it causes increased processing times.
A Probabilistic Parallel Parser
How to model non-determinism: probabilistic parsing.
Parsing model [Jurafsky (1996)]:
Each full or partial parse is assigned a probability.
Parses are pruned from the search space if their probability is a factor of α below that of the most probable parse (beam search).
How are parse probabilities determined?
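The pruning rule above can be sketched in a few lines of Python. This is a minimal illustration, not Jurafsky's implementation; the parse labels and probabilities below are made up:

```python
# Sketch of beam pruning: a (partial) parse survives only if its probability
# is within a factor of alpha of the most probable parse.

def prune_beam(parses, alpha):
    """Keep (tree, prob) pairs with prob >= best_prob / alpha."""
    best = max(p for _, p in parses)
    return [(tree, p) for tree, p in parses if p >= best / alpha]

# Illustrative parses: with alpha = 5 the threshold is 0.06 / 5 = 0.012,
# so the 0.01 parse is pruned.
parses = [("main-verb", 0.06), ("reduced-relative", 0.02), ("other", 0.01)]
print(prune_beam(parses, 5))
```

If the pruned parse later turns out to be the correct one, the model predicts a garden path.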
Computing Parse Probabilities
Jurafsky (1996) focuses on two sources of information:
Construction probabilities: the probability of a syntactic tree.
Valence probabilities: the probability of particular syntactic categories as arguments of specific verbs.
Assumes that construction probabilities and valence probabilities are independent, so both can be estimated from a large treebank using relative frequencies.
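Under the independence assumption, a parse's probability is simply the product of the two sources. A toy illustration (all numbers here are invented, not estimates from a treebank):

```python
# Independence assumption: P(parse) = P(construction) * P(valence).
construction_prob = 0.8 * 0.6   # e.g. the product of the PCFG rule probabilities used
valence_prob = 0.95             # e.g. an assumed P(<NP PP> | keep)
parse_prob = construction_prob * valence_prob
print(round(parse_prob, 3))
```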
Grammar of a Language
A grammar for a language is a set of rewrite rules over non-terminal symbols (A, B, C, ...) and terminal symbols.
Context-free: the left-hand side is a single non-terminal, e.g. A → α B C.
Context-sensitive: the left-hand side may carry context, e.g. α A β → α γ β (A is rewritten as γ only in the context α...β).
Probabilistic Context-Free Grammar (PCFG)
A PCFG is a probabilistic version of a CFG where each production has a probability [p].
The probabilities of all productions rewriting a given non-terminal must add to 1, defining a distribution for each non-terminal.
String generation is now probabilistic: production probabilities are used to non-deterministically select a production for rewriting a given non-terminal.
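Probabilistic generation can be sketched as follows. The toy grammar here is invented for illustration (it is not the lecture's example grammar):

```python
import random

# Toy PCFG: each non-terminal maps to a list of (right-hand side, probability)
# pairs; the probabilities for each non-terminal sum to 1.
GRAMMAR = {
    "S":  [(["NP", "VP"], 1.0)],
    "NP": [(["she"], 0.5), (["he"], 0.5)],
    "VP": [(["sleeps"], 0.7), (["runs"], 0.3)],
}

def generate(symbol):
    """Expand a symbol by sampling productions according to their probabilities."""
    if symbol not in GRAMMAR:          # terminal symbol: emit it
        return [symbol]
    rules, probs = zip(*GRAMMAR[symbol])
    rhs = random.choices(rules, weights=probs)[0]
    return [word for s in rhs for word in generate(s)]

print(" ".join(generate("S")))  # e.g. "she sleeps"
```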
Probabilistic Context Free Grammar (PCFG)
A PCFG consists of:
a set of non-terminal symbols
a set of terminal symbols
a set of production rules, each with a probability
PCFG Normal Forms
Chomsky Normal Form (CNF): a PCFG G is in CNF if every production has the form A → B C or A → a.
Conversion to CNF: a longer rule such as A → B C D [p] will be converted to A → X1 D [p] and X1 → B C [1.0].
Simple PCFG
Grammar:
S → NP VP [0.8]
S → Aux NP VP [0.1]
S → VP [0.1]
NP → Pronoun [0.2]
NP → Proper-Noun [0.2]
NP → Det Nominal [0.6]
Nominal → Noun [0.3]
Nominal → Nominal Noun [0.2]
Nominal → Nominal PP [0.5]
VP → Verb [0.2]
VP → Verb NP [0.5]
VP → VP PP [0.3]
PP → Prep NP [1.0]
(The probabilities for each non-terminal sum to 1.0.)
Lexicon:
Det → the [0.6] | a [0.2] | that [0.1] | this [0.1]
Noun → book [0.1] | flight [0.5] | meal [0.2] | money [0.2]
Verb → book [0.5] | include [0.2] | prefer [0.3]
Pronoun → I [0.5] | he [0.1] | she [0.1] | me [0.3]
Proper-Noun → Houston [0.8] | NWA [0.2]
Aux → does [1.0]
Prep → from [0.25] | to [0.25] | on [0.1] | near [0.2] | through [0.2]
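Using the grammar and lexicon above, the probability of a parse tree is the product of the probabilities of all rules used in it. A short worked example for one parse of "book the flight":

```python
from math import prod

# Rules used in the parse tree of "book the flight" under the grammar above,
# with their probabilities from the slide.
rules_used = [
    ("S -> VP",           0.1),
    ("VP -> Verb NP",     0.5),
    ("Verb -> book",      0.5),
    ("NP -> Det Nominal", 0.6),
    ("Det -> the",        0.6),
    ("Nominal -> Noun",   0.3),
    ("Noun -> flight",    0.5),
]
p = prod(prob for _, prob in rules_used)
print(round(p, 6))  # 0.00135
```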
Probabilistic Grammar Conversion
Original Grammar:
S → NP VP [0.8]
S → Aux NP VP [0.1]
S → VP [0.1]
NP → Pronoun [0.2]
NP → Proper-Noun [0.2]
NP → Det Nominal [0.6]
Nominal → Noun [0.3]
Nominal → Nominal Noun [0.2]
Nominal → Nominal PP [0.5]
VP → Verb [0.2]
VP → Verb NP [0.5]
VP → VP PP [0.3]
PP → Prep NP [1.0]
Chomsky Normal Form:
S → NP VP [0.8]
S → X1 VP [0.1]
X1 → Aux NP [1.0]
S → book | include | prefer [0.01 | 0.004 | 0.006]
S → Verb NP [0.05]
S → VP PP [0.03]
NP → I | he | she | me [0.1 | 0.02 | 0.02 | 0.06]
NP → Houston | NWA [0.16 | 0.04]
NP → Det Nominal [0.6]
Nominal → book | flight | meal | money [0.03 | 0.15 | 0.06 | 0.06]
Nominal → Nominal Noun [0.2]
Nominal → Nominal PP [0.5]
VP → book | include | prefer [0.1 | 0.04 | 0.06]
VP → Verb NP [0.5]
VP → VP PP [0.3]
PP → Prep NP [1.0]
Valence Probabilities
Subcategorization frames of the verb keep include, for example, keep <NP> ("keep the dogs") and keep <NP PP> ("keep the dogs on the beach").
Valence probabilities tell us how likely each of these frames is.
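As a sketch, the attachment percentages from the earlier keep/discuss examples can be read as valence probabilities and used to pick each verb's preferred frame. The frame labels and dictionary layout here are illustrative, not Jurafsky's notation:

```python
# Valence probabilities per verb, taken from the attachment-preference
# percentages on the earlier slide (keep: 95% NP+PP; discuss: 90% NP only).
VALENCE = {
    "keep":    {"<NP PP>": 0.95, "<NP>": 0.05},
    "discuss": {"<NP PP>": 0.10, "<NP>": 0.90},
}

def preferred_frame(verb):
    """Return the subcategorization frame with the highest valence probability."""
    frames = VALENCE[verb]
    return max(frames, key=frames.get)

print(preferred_frame("keep"))     # <NP PP>: the PP attaches to the VP
print(preferred_frame("discuss"))  # <NP>: the PP attaches to the NP instead
```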
Modeling Disambiguation
Disambiguation using construction probabilities, no garden path:
"The warehouse fires destroyed all the buildings."
"The warehouse fires a dozen employees each year."
Modeling Valence Preferences
Disambiguation using valence probabilities, no garden path: "keep the dogs on the beach"
Modeling Valence Preferences
Disambiguation using valence probabilities, no garden path: "discuss the dogs on the beach"
Combining Valence and Construction Probabilities
Garden path caused by construction probabilities and valence probabilities: "the horse raced past..."
Main verb interpretation vs. reduced relative interpretation.
Combining Valence and Construction Probabilities
Disambiguation using construction probabilities and valence probabilities, no garden path: "The bird found in the room died"
Main verb interpretation vs. reduced relative interpretation.