Dependency Parsing - Cornell University · Dependency Parsing Instructor: Yoav Artzi CS5740:...
![Page 1: Dependency Parsing - Cornell University · Dependency Parsing Instructor: Yoav Artzi CS5740: Natural Language Processing Slides adapted from Dan Klein, Luke Zettlemoyer, Chris Manning,](https://reader036.fdocuments.in/reader036/viewer/2022081600/6024e00fbd747165a90efecf/html5/thumbnails/1.jpg)
Dependency Parsing
Instructor: Yoav Artzi
CS5740: Natural Language Processing
Slides adapted from Dan Klein, Luke Zettlemoyer, Chris Manning, Dan Jurafsky, and David Weiss
Overview
• The parsing problem
• Methods
– Transition-based parsing
• Evaluation
• Projectivity
Parse Trees
• Part-of-speech tagging:
– Word classes
• Parsing:
– From words to phrases to sentences
– Relations between words
• Two views:
– Dependency
– Constituency
Dependency Parsing
• Dependency structure shows which words depend on (modify or are arguments of) which other words.
The boy put the tortoise on the rug
Constituency (Phrase Structure) Parsing
• Phrase structure organizes words into nested constituents
• Linguists can, and do, argue about details
• Lots of ambiguity
Example: new art critics write reviews with computers
[Figure: phrase-structure tree over the example sentence, with nodes labeled S, VP, NP, N′, and PP]
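Dependency Structure
• Syntactic structure consists of:
– Lexical items
– Binary asymmetric relations → dependencies
[Figure: dependency tree for "Bills on ports and immigration were submitted by Senator Brownback, Republican of Kansas", written here as relation(head, modifier)]
nsubjpass(submitted, Bills)
auxpass(submitted, were)
prep(submitted, by)
pobj(by, Brownback)
nn(Brownback, Senator)
appos(Brownback, Republican)
prep(Republican, of)
pobj(of, Kansas)
prep(Bills, on)
pobj(on, ports)
cc(ports, and)
conj(ports, immigration)
Dependencies are typed with the name of the grammatical relation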
Dependency Structure
• Syntactic structure consists of:
– Lexical items
– Binary asymmetric relations → dependencies
Example: nsubjpass(submitted, Bills)
• Head (governor, superior, regent): submitted
• Modifier (dependent, inferior, subordinate): Bills
• Arrow from head to modifier (but can be reversed)
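A typed dependency can be stored as a (label, head, modifier) triple. The sketch below is one possible encoding, not something the slides prescribe: the `Arc` type and the two helper functions are hypothetical names, and the `auxpass` and `prep` arcs are read off the neighbouring tree figure.

```python
from typing import List, NamedTuple, Optional


class Arc(NamedTuple):
    """One typed dependency: label(head, modifier). Hypothetical helper type."""
    label: str
    head: str
    modifier: str


# Arcs leaving "submitted" in the slide example.
arcs = [
    Arc("nsubjpass", "submitted", "Bills"),
    Arc("auxpass", "submitted", "were"),
    Arc("prep", "submitted", "by"),
]


def modifiers_of(head: str, arcs: List[Arc]) -> List[str]:
    """All modifiers attached to `head`, in the order the arcs are listed."""
    return [a.modifier for a in arcs if a.head == head]


def head_of(modifier: str, arcs: List[Arc]) -> Optional[str]:
    """The head of `modifier`, or None if no arc assigns one."""
    for a in arcs:
        if a.modifier == modifier:
            return a.head
    return None


print(modifiers_of("submitted", arcs))
print(head_of("Bills", arcs))
```

Using word strings as node identifiers only works when no word repeats; a real parser would more likely key arcs on token positions.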
Dependency Structure
[Figure: the typed dependency tree for the "Bills" sentence, headed by "submitted", repeated from the earlier slide]
Dependencies form a tree
Dependency Structure
[Figure: the same typed dependency tree, now with the root marked]
Dependencies form a tree
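The tree property can be checked mechanically once each word stores the position of its head. The following is a minimal sketch under stated assumptions: positions are 1-based with 0 standing for an artificial ROOT, exactly one word attaches to ROOT, and the head list is my reading of the "Bills" figure rather than data given in the slides.

```python
def is_tree(heads):
    """heads[i-1] is the head position of word i; 0 means ROOT.

    Returns True when exactly one word attaches to ROOT and every word
    reaches ROOT by following head links without revisiting a word.
    """
    n = len(heads)
    if sum(1 for h in heads if h == 0) != 1:
        return False
    for start in range(1, n + 1):
        seen = set()
        node = start
        while node != 0:
            if node in seen:          # we came back to a word: a cycle
                return False
            seen.add(node)
            head = heads[node - 1]
            if not 0 <= head <= n:    # head position outside the sentence
                return False
            node = head
    return True


# 1 Bills 2 on 3 ports 4 and 5 immigration 6 were 7 submitted
# 8 by 9 Senator 10 Brownback 11 Republican 12 of 13 Kansas
bills_heads = [7, 1, 2, 3, 3, 7, 0, 7, 10, 8, 10, 11, 12]

print(is_tree(bills_heads))
print(is_tree([2, 1, 0]))   # words 1 and 2 point at each other
```

Storing one head per word already guarantees the "single head" half of the tree property, so the function only has to rule out cycles and extra roots.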
Let’s Parse
He said that the boy who was wearing the blue shirt with the white pockets has left the building
John saw Mary
Start with main verb, and draw dependencies. Don’t worry about labels. Just try to get the modifiers right.
Methods for Dependency Parsing
• Dynamic programming
– Eisner (1996): O(n^3)
• Graph algorithms
– McDonald et al. (2005): score edges independently using a classifier and use maximum spanning tree
• Constraint satisfaction
– Start with all edges, eliminate based on hard constraints
• “Deterministic parsing”
– Left-to-right, each choice is done with a classifier
[Figure: dependency tree for "the little boy jumped over the fence": nsubj(jumped, boy), det(boy, the), amod(boy, little), prep(jumped, over), pobj(over, fence), det(fence, the)]
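To make the graph-based idea concrete, here is a brute-force sketch: every edge gets an independent score and the parser returns the highest-scoring set of heads that forms a tree. It is not the algorithm of McDonald et al.; a real parser would use a maximum spanning tree algorithm such as Chu-Liu/Edmonds instead of enumerating all head assignments. The edge scores are invented for illustration, and the sketch does not restrict ROOT to a single child.

```python
from itertools import product

words = ["ROOT", "John", "saw", "Mary"]

# Hypothetical scores for edge (head, modifier); unlisted edges score 0.
scores = {
    (0, 2): 10.0, (2, 1): 9.0, (2, 3): 8.0,
    (0, 1): 3.0, (0, 3): 3.0, (1, 2): 4.0, (3, 2): 2.0,
}


def reaches_root(heads):
    """heads[m] is the head of word m (index 0 is an unused ROOT slot)."""
    for m in range(1, len(heads)):
        seen, node = set(), m
        while node != 0:
            if node in seen:      # cycle (this also catches self-loops)
                return False
            seen.add(node)
            node = heads[node]
    return True


def best_tree(n, scores):
    """Exhaustively search all head assignments for n words."""
    best_score, best_heads = float("-inf"), None
    for assignment in product(range(n + 1), repeat=n):
        heads = (0,) + assignment
        if not reaches_root(heads):
            continue
        total = sum(scores.get((heads[m], m), 0.0) for m in range(1, n + 1))
        if total > best_score:
            best_score, best_heads = total, heads[1:]
    return best_heads, best_score


heads, total = best_tree(3, scores)
for m, h in enumerate(heads, start=1):
    print(f"{words[h]} -> {words[m]}")
```

The search visits (n+1)^n assignments, so it is only usable on toy sentences; its purpose is to show what "score edges independently, then pick the best tree" means.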
Making Decisions
What are the sources of information for dependency parsing?
1. Bilexical affinities
– [issues → the] is plausible
2. Dependency distance
– Mostly with nearby words
3. Intervening material
– Dependencies rarely span intervening verbs or punctuation
4. Valency of heads
– How many dependents on which side are usual for a head?
ROOT Discussion of the outstanding issues was completed .
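The four information sources above can be turned into simple features of a candidate arc. This is a rough sketch, not a feature set from any particular parser: the POS tags and the head list for the example sentence are my own assumptions, and `arc_features` and `valency` are hypothetical helper names.

```python
words = ["ROOT", "Discussion", "of", "the", "outstanding",
         "issues", "was", "completed", "."]
# Assumed Penn-Treebank-style tags, one per word.
tags = ["ROOT", "NN", "IN", "DT", "JJ", "NNS", "VBD", "VBN", "."]
# Assumed heads by position; heads[0] is unused because ROOT has no head.
heads = [None, 7, 1, 5, 5, 2, 7, 0, 7]


def arc_features(head, mod, words, tags):
    """Features for a candidate arc head -> mod (both are positions)."""
    lo, hi = sorted((head, mod))
    between = tags[lo + 1:hi]
    return {
        "bilexical": (words[head], words[mod]),              # 1. affinity
        "distance": hi - lo,                                 # 2. distance
        "direction": "left" if mod < head else "right",
        "verbs_between": sum(t.startswith("VB") for t in between),   # 3.
        "punct_between": sum(t in {".", ",", ":"} for t in between),
    }


def valency(head, heads):
    """4. Number of dependents to the left and right of `head`."""
    left = sum(1 for m, h in enumerate(heads) if h == head and m < head)
    right = sum(1 for m, h in enumerate(heads) if h == head and m > head)
    return left, right


print(arc_features(5, 3, words, tags))   # issues -> the
print(valency(7, heads))                 # dependents of "completed"
```

In a trained parser these values would be fed to the classifier or edge scorer rather than inspected by hand.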
MaltParser (Nivre et al. 2008)
• Greedy transition-based parser
• Each decision: how to attach each word as we encounter it
– If you are familiar: like a shift-reduce parser
• Select each action with a classifier
• The parser has:
– a stack σ, written with the top to the right
  • which starts with the ROOT symbol
– a buffer β, written with the top to the left
  • which starts with the input sentence
– a set of dependency arcs A
  • which starts off empty
– a set of actions
Arc-standard Dependency Parsing
Start: σ = [ROOT], β = w_1, …, w_n, A = ∅
• Shift: (σ, w_i|β, A) → (σ|w_i, β, A)
• Left-Arc_r: (σ|w_i, w_j|β, A) → (σ, w_j|β, A ∪ {r(w_j, w_i)})
• Right-Arc_r: (σ|w_i, w_j|β, A) → (σ, w_i|β, A ∪ {r(w_i, w_j)})
Finish: β = ∅
Example: ROOT Joe likes Mary
Arc-standard Dependency Parsing
ROOT Joe likes Mary

| Action | Stack σ | Buffer β | Arcs A |
|---|---|---|---|
| (start) | [ROOT] | [Joe, likes, Mary] | ∅ |
| Shift | [ROOT, Joe] | [likes, Mary] | ∅ |
| Left-Arc | [ROOT] | [likes, Mary] | {(likes, Joe)} = A1 |
| Shift | [ROOT, likes] | [Mary] | A1 |
| Right-Arc | [ROOT] | [likes] | A1 ∪ {(likes, Mary)} = A2 |
| Right-Arc | [] | [ROOT] | A2 ∪ {(ROOT, likes)} = A3 |
| Shift | [ROOT] | [] | A3 |
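The three transitions can be written as small functions over a (stack, buffer, arcs) configuration and replayed on the example. This is a minimal sketch of the rules as stated on the slide, with unlabeled arcs stored as (head, modifier) pairs; the function names and the use of word strings instead of token positions are my own simplifications.

```python
def shift(stack, buffer, arcs):
    """(σ, w_i|β, A) → (σ|w_i, β, A)"""
    return stack + [buffer[0]], buffer[1:], arcs


def left_arc(stack, buffer, arcs):
    """(σ|w_i, w_j|β, A) → (σ, w_j|β, A ∪ {(w_j, w_i)})"""
    wi, wj = stack[-1], buffer[0]
    return stack[:-1], buffer, arcs | {(wj, wi)}


def right_arc(stack, buffer, arcs):
    """(σ|w_i, w_j|β, A) → (σ, w_i|β, A ∪ {(w_i, w_j)})

    The head w_i goes back to the front of the buffer, replacing w_j.
    """
    wi, wj = stack[-1], buffer[0]
    return stack[:-1], [wi] + buffer[1:], arcs | {(wi, wj)}


# Replay the action sequence from the slide.
config = (["ROOT"], ["Joe", "likes", "Mary"], frozenset())
for action in [shift, left_arc, shift, right_arc, right_arc, shift]:
    config = action(*config)
    print(f"{action.__name__:10} {config[0]} {config[1]} {sorted(config[2])}")

stack, buffer, arcs = config
```

Parsing stops when the buffer is empty, which is why the final Shift is needed to move ROOT back onto the stack.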
Arc-standard Dependency Parsing
ROOT Happy children like to play with their friends .
Arc-eager Dependency Parsing
Start: σ = [ROOT], β = w_1, …, w_n, A = ∅
• Left-Arc_r: (σ|w_i, w_j|β, A) → (σ, w_j|β, A ∪ {r(w_j, w_i)})
– Precondition: r′(w_k, w_i) ∉ A, w_i ≠ ROOT
• Right-Arc_r: (σ|w_i, w_j|β, A) → (σ|w_i|w_j, β, A ∪ {r(w_i, w_j)})
• Reduce: (σ|w_i, β, A) → (σ, β, A)
– Precondition: r′(w_k, w_i) ∈ A
• Shift: (σ, w_i|β, A) → (σ|w_i, β, A)
Finish: β = ∅
This is the common “arc-eager” variant: a head can immediately take a right dependent, before that dependent's own dependents are found
Arc-eager
1. Left-Arc_r: (σ|w_i, w_j|β, A) → (σ, w_j|β, A ∪ {r(w_j, w_i)})
   Precondition: r′(w_k, w_i) ∉ A, w_i ≠ ROOT
2. Right-Arc_r: (σ|w_i, w_j|β, A) → (σ|w_i|w_j, β, A ∪ {r(w_i, w_j)})
3. Reduce: (σ|w_i, β, A) → (σ, β, A)
   Precondition: r′(w_k, w_i) ∈ A
4. Shift: (σ, w_i|β, A) → (σ|w_i, β, A)
ROOT Happy children like to play with their friends .
Arc-eager
ROOT Happy children like to play with their friends .
| Action | Stack σ | Buffer β | Arcs A |
|---|---|---|---|
| (start) | [ROOT] | [Happy, children, …] | ∅ |
| Shift | [ROOT, Happy] | [children, like, …] | ∅ |
| LA_amod | [ROOT] | [children, like, …] | {amod(children, Happy)} = A1 |
| Shift | [ROOT, children] | [like, to, …] | A1 |
| LA_nsubj | [ROOT] | [like, to, …] | A1 ∪ {nsubj(like, children)} = A2 |
| RA_root | [ROOT, like] | [to, play, …] | A2 ∪ {root(ROOT, like)} = A3 |
| Shift | [ROOT, like, to] | [play, with, …] | A3 |
| LA_aux | [ROOT, like] | [play, with, …] | A3 ∪ {aux(play, to)} = A4 |
| RA_xcomp | [ROOT, like, play] | [with, their, …] | A4 ∪ {xcomp(like, play)} = A5 |
Arc-eager
ROOT Happy children like to play with their friends .

| Action | Stack σ | Buffer β | Arcs A |
|---|---|---|---|
| RA_xcomp | [ROOT, like, play] | [with, their, …] | A4 ∪ {xcomp(like, play)} = A5 |
| RA_prep | [ROOT, like, play, with] | [their, friends, …] | A5 ∪ {prep(play, with)} = A6 |
| Shift | [ROOT, like, play, with, their] | [friends, .] | A6 |
| LA_poss | [ROOT, like, play, with] | [friends, .] | A6 ∪ {poss(friends, their)} = A7 |
| RA_pobj | [ROOT, like, play, with, friends] | [.] | A7 ∪ {pobj(with, friends)} = A8 |
| Reduce | [ROOT, like, play, with] | [.] | A8 |
| Reduce | [ROOT, like, play] | [.] | A8 |
| Reduce | [ROOT, like] | [.] | A8 |
| RA_punc | [ROOT, like, .] | [] | A8 ∪ {punc(like, .)} = A9 |

You terminate as soon as the buffer is empty. Dependencies = A9
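The full arc-eager derivation can be replayed in code, which also checks that each precondition holds at the step where it is needed. The sketch below follows the four transitions as stated on the slides; the function names, the short action codes ("SH", "LA", "RA", "RE"), and the use of word strings as tokens (safe here only because no word repeats) are my own choices.

```python
def has_head(word, arcs):
    """True if some arc in A already gives `word` a head."""
    return any(mod == word for _label, _head, mod in arcs)


def left_arc(label, stack, buffer, arcs):
    wi, wj = stack[-1], buffer[0]
    # Precondition: w_i is not ROOT and does not have a head yet.
    assert wi != "ROOT" and not has_head(wi, arcs)
    return stack[:-1], buffer, arcs + [(label, wj, wi)]


def right_arc(label, stack, buffer, arcs):
    wi, wj = stack[-1], buffer[0]
    # w_j is attached to w_i and pushed, so it can still take dependents.
    return stack + [wj], buffer[1:], arcs + [(label, wi, wj)]


def reduce_(stack, buffer, arcs):
    # Precondition: the word being popped already has a head.
    assert has_head(stack[-1], arcs)
    return stack[:-1], buffer, arcs


def shift(stack, buffer, arcs):
    return stack + [buffer[0]], buffer[1:], arcs


def run(sentence, actions):
    config = (["ROOT"], list(sentence), [])
    for name, label in actions:
        if name == "LA":
            config = left_arc(label, *config)
        elif name == "RA":
            config = right_arc(label, *config)
        elif name == "RE":
            config = reduce_(*config)
        else:
            config = shift(*config)
    return config


sentence = "Happy children like to play with their friends .".split()
actions = [
    ("SH", None), ("LA", "amod"), ("SH", None), ("LA", "nsubj"),
    ("RA", "root"), ("SH", None), ("LA", "aux"), ("RA", "xcomp"),
    ("RA", "prep"), ("SH", None), ("LA", "poss"), ("RA", "pobj"),
    ("RE", None), ("RE", None), ("RE", None), ("RA", "punc"),
]
stack, buffer, arcs = run(sentence, actions)
for label, head, mod in arcs:
    print(f"{label}({head}, {mod})")
```

In a trained parser the action list would come from the classifier one step at a time; here it is hard-coded from the slide trace.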
MaltParser (Nivre et al. 2008)
• Selecting the next action:
– Discriminative classifier (SVM, MaxEnt, etc.)
– Untyped choices: 4
– Typed choices: |R| * 2 + 2
• Features: POS tags, word in stack, word in buffer, etc.
• Greedy → no search
– But can easily do beam search
• Close to state of the art
• Linear time parser → very fast!
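The count |R| * 2 + 2 follows from enumerating the classifier's output classes: Shift and Reduce are untyped, while Left-Arc and Right-Arc come in one version per relation. The sketch below builds that inventory and extracts a few features of the kind listed above. It does not reflect MaltParser's actual feature model; the relation set, POS tags, and function names are illustrative assumptions.

```python
def arc_eager_actions(relations):
    """All classifier outputs for a typed arc-eager parser."""
    actions = [("Shift", None), ("Reduce", None)]
    for r in relations:
        actions += [("Left-Arc", r), ("Right-Arc", r)]
    return actions


def extract_features(stack, buffer, pos_tags):
    """Word and POS of the stack top (s0) and buffer front (b0)."""
    s0 = stack[-1] if stack else None
    b0 = buffer[0] if buffer else None
    return {
        "s0.word": s0, "s0.pos": pos_tags.get(s0),
        "b0.word": b0, "b0.pos": pos_tags.get(b0),
    }


relations = ["nsubj", "dobj", "det", "amod"]       # a toy relation set R
actions = arc_eager_actions(relations)
print(len(actions), "==", len(relations) * 2 + 2)

# Assumed POS tags for the start of the running example.
pos_tags = {"ROOT": "ROOT", "Happy": "JJ", "children": "NNS", "like": "VBP"}
features = extract_features(["ROOT", "Happy"], ["children", "like"], pos_tags)
print(features)
```

A greedy parser calls the classifier on these features once per transition, which is what gives the linear running time mentioned on the slide.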
Parsing with Neural Networks
Chen and Manning (2014)
• Arc-standard transitions
– Shift
– Left-Arc_r
– Right-Arc_r
• Selecting the next action:
– Untyped choices: 3
– Typed choices: |R| * 2 + 1
– Neural network classifier
• A few model improvements and very careful hyper-parameter tuning give SOTA results
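A neural action classifier of this kind can be sketched in a few lines: look up an embedding for each feature word, concatenate, apply one hidden layer, and take a softmax over the |R| * 2 + 1 actions. The cube nonlinearity follows the published Chen and Manning paper; everything else here (dimensions, random initialisation, the three feature slots, the function names) is an assumption for illustration, and the weights are untrained.

```python
import math
import random


def make_params(vocab, emb_dim, n_feats, hidden, n_actions, seed=0):
    """Randomly initialised parameters (hypothetical sizes, no training)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                for _ in range(rows)]

    return {
        "E": {w: [rng.uniform(-0.1, 0.1) for _ in range(emb_dim)]
              for w in vocab},
        "W1": matrix(hidden, emb_dim * n_feats),
        "b1": [0.0] * hidden,
        "W2": matrix(n_actions, hidden),
    }


def action_probs(feature_words, p):
    """Embed and concatenate features, cube hidden layer, softmax output."""
    x = [v for w in feature_words for v in p["E"][w]]
    h = [(sum(w * xi for w, xi in zip(row, x)) + b) ** 3
         for row, b in zip(p["W1"], p["b1"])]
    logits = [sum(w * hi for w, hi in zip(row, h)) for row in p["W2"]]
    m = max(logits)                       # subtract max for stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]


relations = ["nsubj", "dobj", "det"]
n_actions = len(relations) * 2 + 1        # Shift + typed Left/Right-Arc
vocab = ["ROOT", "Joe", "likes", "Mary"]
params = make_params(vocab, emb_dim=4, n_feats=3, hidden=5,
                     n_actions=n_actions)

# Three feature slots, e.g. stack top, buffer front, second buffer word.
probs = action_probs(["ROOT", "Joe", "likes"], params)
print(len(probs), round(sum(probs), 6))
```

The parser would pick the highest-probability legal action at each step, exactly as the earlier classifiers do.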
Hyper-parameters
Slide from David Weiss
Evaluation
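Positions: ROOT = 0, She = 1, saw = 2, the = 3, video = 4, lecture = 5

| Index | Word | Gold head | Gold label | Parsed head | Parsed label |
|---|---|---|---|---|---|
| 1 | She | 2 | nsubj | 2 | nsubj |
| 2 | saw | 0 | root | 0 | root |
| 3 | the | 5 | det | 4 | det |
| 4 | video | 5 | nn | 5 | nsubj |
| 5 | lecture | 2 | dobj | 2 | ccomp |

Acc = (# correct deps) / (# of deps)
UAS (unlabeled attachment score, head only) = 4 / 5 = 80%
LAS (labeled attachment score, head and label) = 2 / 5 = 40%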
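Both scores can be computed directly from the gold and parsed rows. The sketch below uses the (index, head, label) triples from the slide; the function name and data layout are my own.

```python
# (word index, head index, label) for "She saw the video lecture".
gold = [(1, 2, "nsubj"), (2, 0, "root"), (3, 5, "det"),
        (4, 5, "nn"), (5, 2, "dobj")]
parsed = [(1, 2, "nsubj"), (2, 0, "root"), (3, 4, "det"),
          (4, 5, "nsubj"), (5, 2, "ccomp")]


def attachment_scores(gold, parsed):
    """Return (UAS, LAS) as fractions of correctly attached words."""
    assert len(gold) == len(parsed)
    n = len(gold)
    # Unlabeled: the head must match.
    unlabeled = sum(g[1] == p[1] for g, p in zip(gold, parsed))
    # Labeled: the head and the relation label must both match.
    labeled = sum(g[1] == p[1] and g[2] == p[2]
                  for g, p in zip(gold, parsed))
    return unlabeled / n, labeled / n


uas, las = attachment_scores(gold, parsed)
print(f"UAS = {uas:.0%}, LAS = {las:.0%}")
```

Word 3 has the wrong head, so it counts against both scores; words 4 and 5 have the right head but the wrong label, so they count only against LAS.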
Projectivity
• Dependencies from CFG trees with head rules must be projective
– Crossing arcs are not allowed
• But: the theory allows non-projective structures, which account for displaced constituents
Who did Bill buy the coffee from yesterday ?
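Projectivity can be tested by looking for a pair of crossing arcs. In the sketch below, the head positions for the "Who did Bill buy ..." sentence are my own assumed analysis (with "Who" attached to "from"), since the slide figure is not recoverable from the text; the ROOT arc is included in the check.

```python
def is_projective(heads):
    """heads[i-1] is the head position of word i; 0 is ROOT.

    Two arcs cross when exactly one endpoint of one arc lies strictly
    inside the span of the other.
    """
    spans = [(min(h, m), max(h, m)) for m, h in enumerate(heads, start=1)]
    for a, b in spans:
        for c, d in spans:
            if a < c < b < d:
                return False
    return True


# 1 Who 2 did 3 Bill 4 buy 5 the 6 coffee 7 from 8 yesterday 9 ?
# Assumed heads: Who -> from, buy -> ROOT, everything else as labeled below.
who_heads = [7, 4, 4, 0, 6, 4, 4, 4, 4]

# 1 She 2 saw 3 the 4 video 5 lecture (gold heads from the evaluation slide)
she_heads = [2, 0, 5, 5, 2]

print(is_projective(who_heads))
print(is_projective(she_heads))
```

Under the assumed analysis, the arc from "from" back to "Who" crosses the arc from "buy" to "yesterday", which is the displaced-constituent pattern the slide refers to.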
Projectivity
• Arc-eager transition system:
– Can’t handle non-projectivity
• Possible directions:
– Give up!
– Post-processing
– Add new transition types
– Switch to a different algorithm
  • Graph-based parsers (e.g., MSTParser)