6.864: Lecture 2, Fall 2005
Parsing and Syntax I
Overview
• An introduction to the parsing problem
• Context-free grammars
• A brief(!) sketch of the syntax of English
• Examples of ambiguous structures
• PCFGs, their formal properties, and useful algorithms
• Weaknesses of PCFGs
Parsing (Syntactic Structure)
INPUT: Boeing is located in Seattle.
OUTPUT (parse tree, in bracket notation):

[S [NP [N Boeing]] [VP [V is] [VP [V located] [PP [P in] [NP [N Seattle]]]]]]
Data for Parsing Experiments
• Penn WSJ Treebank = 50,000 sentences with associated trees
• Usual set-up: 40,000 training sentences, 2400 test sentences
An example tree, for the sentence:

Canadian Utilities had 1988 revenue of C$ 1.16 billion , mainly from its natural gas and electric utility businesses in Alberta , where the company serves about 800,000 customers .

[The full Treebank tree for this sentence, with non-terminals including NP, VP, PP, SBAR, QP, and WHADVP, is not reproduced here.]
The Information Conveyed by Parse Trees
1) Part of speech for each word
(N = noun, V = verb, D = determiner)
[S [NP [D the] [N burglar]] [VP [V robbed] [NP [D the] [N apartment]]]]
2) Phrases

[S [NP [DT the] [N burglar]] [VP [V robbed] [NP [DT the] [N apartment]]]]
Noun Phrases (NP): “the burglar”, “the apartment”
Verb Phrases (VP): “robbed the apartment”
Sentences (S): “the burglar robbed the apartment”
3) Useful Relationships
In [S [NP ...] [VP [V ...] ...]], the NP is the subject and the V is the verb. For example, in

[S [NP [DT the] [N burglar]] [VP [V robbed] [NP [DT the] [N apartment]]]]

"the burglar" is the subject of "robbed".
An Example Application: Machine Translation
• English word order is subject – verb – object
• Japanese word order is subject – object – verb
English: IBM bought Lotus
Japanese: IBM Lotus bought

English: Sources said that IBM bought Lotus yesterday
Japanese: Sources yesterday IBM Lotus bought that said
Syntax and Compositional Semantics
S: bought(IBM, Lotus)
  NP: IBM ("IBM")
  VP: λy bought(y, Lotus)
    V: λx,y bought(y, x) ("bought")
    NP: Lotus ("Lotus")
• Each syntactic non-terminal now has an associated semantic expression
• (We’ll see more of this later in the course)
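As a small illustration (a sketch, not code from the lecture), the λ-expressions above can be mimicked directly with Python lambdas: the verb is a curried function, the VP is the verb applied to its object, and the sentence meaning is the VP applied to its subject.

```python
# V: λx,y bought(y, x) -- a curried function that builds a predicate string.
verb = lambda x: lambda y: f"bought({y}, {x})"

# VP = V applied to the object NP "Lotus": λy bought(y, Lotus)
vp = verb("Lotus")

# S = VP applied to the subject NP "IBM": bought(IBM, Lotus)
s = vp("IBM")
print(s)  # bought(IBM, Lotus)
```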
Context-Free Grammars
[Hopcroft and Ullman 1979]
A context-free grammar is a 4-tuple G = (N, Σ, R, S) where:
• N is a set of non-terminal symbols
• Σ is a set of terminal symbols
• R is a set of rules of the form X → Y1 Y2 ... Yn, for n ≥ 0, X ∈ N, Yi ∈ (N ∪ Σ)
• S ∈ N is a distinguished start symbol
A Context-Free Grammar for English
N = {S, NP, VP, PP, DT, Vi, Vt, NN, IN}
S = S
Σ = {sleeps, saw, man, woman, telescope, the, with, in}

R = { S → NP VP,
      VP → Vi,
      VP → Vt NP,
      VP → VP PP,
      NP → DT NN,
      NP → NP PP,
      PP → IN NP,
      Vi → sleeps,  Vt → saw,
      NN → man,  NN → woman,  NN → telescope,
      DT → the,  IN → with,  IN → in }
Note: S=sentence, VP=verb phrase, NP=noun phrase, PP=prepositional phrase, DT=determiner, Vi=intransitive verb, Vt=transitive verb, NN=noun, IN=preposition
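For concreteness, a grammar like this can be written down directly as data. The following is a sketch of one possible encoding (symbols as strings, rules as (lhs, rhs) pairs); the representation is an assumption, not code from the course.

```python
# The English CFG above as plain Python data: G = (N, Sigma, R, S).
N = {"S", "NP", "VP", "PP", "DT", "Vi", "Vt", "NN", "IN"}
Sigma = {"sleeps", "saw", "man", "woman", "telescope", "the", "with", "in"}
R = [
    ("S", ("NP", "VP")),
    ("VP", ("Vi",)),
    ("VP", ("Vt", "NP")),
    ("VP", ("VP", "PP")),
    ("NP", ("DT", "NN")),
    ("NP", ("NP", "PP")),
    ("PP", ("IN", "NP")),
    ("Vi", ("sleeps",)), ("Vt", ("saw",)),
    ("NN", ("man",)), ("NN", ("woman",)), ("NN", ("telescope",)),
    ("DT", ("the",)), ("IN", ("with",)), ("IN", ("in",)),
]
S = "S"

# Sanity check: each rule rewrites a non-terminal into symbols of N or Sigma.
assert all(x in N and all(y in N or y in Sigma for y in rhs) for x, rhs in R)
```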
Left-Most Derivations

A left-most derivation is a sequence of strings s1 ... sn, where
• s1 = S, the start symbol
• sn ∈ Σ*, i.e., sn is made up of terminal symbols only
• each si for i = 2 ... n is derived from si−1 by picking the left-most non-terminal X in si−1 and replacing it by some β, where X → β is a rule in R

For example: [S], [NP VP], [D N VP], [the N VP], [the man VP], [the man Vi], [the man sleeps]
Representation of the derivation as a tree:

[S [NP [D the] [N man]] [VP [Vi sleeps]]]
Another example, for "the dog laughs":

DERIVATION        RULES USED
S                 S → NP VP
NP VP             NP → DT N
DT N VP           DT → the
the N VP          N → dog
the dog VP        VP → VB
the dog VB        VB → laughs
the dog laughs

[S [NP [DT the] [N dog]] [VP [VB laughs]]]
Properties of CFGs
• A CFG defines a set of possible derivations
• A string s ∈ Σ* is in the language defined by the CFG if there is at least one derivation which yields s
• Each string in the language generated by the CFG may have more than one derivation (“ambiguity”)
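Ambiguity can be made concrete by counting derivations. The sketch below (the rule set is an illustrative assumption, anticipating the "he drove down the street in the car" example that follows) counts how many parses a CFG assigns to a string:

```python
from functools import lru_cache

RULES = {
    "S": [("NP", "VP")],
    "NP": [("he",), ("DT", "N"), ("NP", "PP")],
    "VP": [("VB", "PP"), ("VP", "PP")],
    "PP": [("IN", "NP")],
    "VB": [("drove",)], "DT": [("the",)],
    "N": [("street",), ("car",)], "IN": [("down",), ("in",)],
}
TOKENS = tuple("he drove down the street in the car".split())

@lru_cache(maxsize=None)
def count(sym, i, j):
    """Number of ways `sym` derives TOKENS[i:j]."""
    if sym not in RULES:                      # terminal symbol
        return 1 if j == i + 1 and TOKENS[i] == sym else 0
    return sum(count_seq(rhs, i, j) for rhs in RULES[sym])

@lru_cache(maxsize=None)
def count_seq(rhs, i, j):
    if not rhs:
        return 1 if i == j else 0
    # leave at least one token per remaining symbol (no empty rules here),
    # which also keeps the left-recursive rules NP -> NP PP, VP -> VP PP finite
    return sum(count(rhs[0], i, k) * count_seq(rhs[1:], k, j)
               for k in range(i + 1, j - len(rhs) + 2))

print(count("S", 0, len(TOKENS)))  # 2 -- two parses (PP attachment ambiguity)
```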
A derivation of "he drove down the street in the car", in which the PP "in the car" attaches to the verb phrase:

DERIVATION                            RULES USED
S                                     S → NP VP
NP VP                                 NP → he
he VP                                 VP → VP PP
he VP PP                              VP → VB PP
he VB PP PP                           VB → drove
he drove PP PP                        PP → down the street
he drove down the street PP           PP → in the car
he drove down the street in the car

[S [NP he] [VP [VP [VB drove] [PP down the street]] [PP in the car]]]
A second derivation of "he drove down the street in the car", in which "in the car" attaches to the noun phrase "the street":

DERIVATION                            RULES USED
S                                     S → NP VP
NP VP                                 NP → he
he VP                                 VP → VB PP
he VB PP                              VB → drove
he drove PP                           PP → down NP
he drove down NP                      NP → NP PP
he drove down NP PP                   NP → the street
he drove down the street PP           PP → in the car
he drove down the street in the car

[S [NP he] [VP [VB drove] [PP down [NP [NP the street] [PP in the car]]]]]
The Problem with Parsing: Ambiguity
INPUT: She announced a program to promote safety in trucks and vans
POSSIBLE OUTPUTS: several distinct parse trees, differing in where the PP "in trucks" attaches (to "safety", "promote", or "announced") and in the scope of the coordination "trucks and vans".
And there are more...
A Brief Overview of English Syntax
Parts of Speech:
• Nouns (tags from the Brown corpus)
  NN = singular noun, e.g., man, dog, park
  NNS = plural noun, e.g., telescopes, houses, buildings
  NNP = proper noun, e.g., Smith, Gates, IBM
• Determiners
  DT = determiner, e.g., the, a, some, every
• Adjectives
  JJ = adjective, e.g., red, green, large, idealistic
A Fragment of a Noun Phrase Grammar
N̄ → NN
N̄ → NN N̄
N̄ → JJ N̄
N̄ → N̄ N̄
NP → DT N̄
DT → the    DT → a
NN → box    NN → car    NN → mechanic    NN → pigeon
JJ → fast    JJ → metal    JJ → idealistic    JJ → clay
Generates: a box, the box, the metal box, the fast car mechanic, . . .
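A grammar like this can also be run "generatively". The sketch below (not from the lecture; NBAR stands in for the N̄ symbol, and the depth cutoff is an arbitrary choice to guarantee termination) samples noun phrases from the fragment above:

```python
import random

RULES = {
    "NP":   [["DT", "NBAR"]],
    "NBAR": [["NN"], ["NN", "NBAR"], ["JJ", "NBAR"], ["NBAR", "NBAR"]],
    "DT":   [["the"], ["a"]],
    "NN":   [["box"], ["car"], ["mechanic"], ["pigeon"]],
    "JJ":   [["fast"], ["metal"], ["idealistic"], ["clay"]],
}

def generate(sym, rng, depth=0):
    """Expand `sym` by repeatedly choosing a rule at random."""
    if sym not in RULES:               # terminal: emit the word
        return [sym]
    # past a fixed depth, force the first (non-recursive) rule so expansion ends
    options = RULES[sym] if depth < 4 else RULES[sym][:1]
    rhs = rng.choice(options)
    return [w for s in rhs for w in generate(s, rng, depth + 1)]

rng = random.Random(0)
for _ in range(3):
    print(" ".join(generate("NP", rng)))
```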
Prepositions, and Prepositional Phrases
• Prepositions IN = preposition e.g., of, in, out, beside, as
An Extended Grammar
N̄ → NN    N̄ → NN N̄    N̄ → JJ N̄    N̄ → N̄ N̄    N̄ → N̄ PP
NP → DT N̄    PP → IN NP
NN → box    NN → car    NN → mechanic    NN → pigeon
JJ → fast    JJ → metal    JJ → idealistic    JJ → clay
DT → the    DT → a
IN → in    IN → under    IN → of    IN → on    IN → with    IN → as
Generates: in a box, under the box, the fast car mechanic under the pigeon in the box, . . .
Verbs, Verb Phrases, and Sentences
• Basic Verb Types
  Vi = intransitive verb, e.g., sleeps, walks, laughs
  Vt = transitive verb, e.g., sees, saw, likes
  Vd = ditransitive verb, e.g., gave
• Basic VP Rules
  VP → Vi
  VP → Vt NP
  VP → Vd NP NP
• Basic S Rule
  S → NP VP
Examples of VP: sleeps, walks, likes the mechanic, gave the mechanic the fast car, gave the fast car mechanic the pigeon in the box, . . .
Examples of S: the man sleeps, the dog walks, the dog likes the mechanic, the dog in the box gave the mechanic the fast car,. . .
PPs Modifying Verb Phrases
A new rule: VP → VP PP
New examples of VP: sleeps in the car, walks like the mechanic, gave the mechanic the fast car on Tuesday, . . .
Complementizers, and SBARs
• Complementizers
  COMP = complementizer, e.g., that
• SBAR
  SBAR → COMP S
Examples: that the man sleeps, that the mechanic saw the dog . . .
More Verbs
• New Verb Types
  V[5], e.g., said, reported
  V[6], e.g., told, informed
  V[7], e.g., bet
• New VP Rules
  VP → V[5] SBAR
  VP → V[6] NP SBAR
  VP → V[7] NP NP SBAR
Examples of New VPs: said that the man sleeps told the dog that the mechanic likes the pigeon bet the pigeon $50 that the mechanic owns a fast car
Coordination
• A New Part-of-Speech: CC = Coordinator e.g., and, or, but
• New Rules
  NP → NP CC NP
  N̄ → N̄ CC N̄
  VP → VP CC VP
  S → S CC S
  SBAR → SBAR CC SBAR
Sources of Ambiguity
• Part-of-Speech ambiguity:
  NNS → walks
  Vi → walks
• Prepositional Phrase Attachment:
  the fast car mechanic under the pigeon in the box
Two analyses of "the fast car mechanic under the pigeon in the box" (bracketings abbreviated):

1. "in the box" modifies "the pigeon":
   [the [fast car mechanic] [under [the [pigeon [in the box]]]]]

2. "in the box" modifies "fast car mechanic under the pigeon":
   [the [[fast car mechanic] [under the pigeon]] [in the box]]
The same ambiguity arises with PPs modifying verb phrases, for "drove down the street in the car":

[VP [VP [Vt drove] [PP down the street]] [PP in the car]]
[VP [Vt drove] [PP down [NP [NP the street] [PP in the car]]]]
Two analyses for: John was believed to have been shot by Bill
Sources of Ambiguity: Noun Premodifiers
• Noun premodifiers: two analyses of "the fast car mechanic":

[NP [D the] [N̄ [JJ fast] [N̄ [NN car] [N̄ [NN mechanic]]]]]
[NP [D the] [N̄ [N̄ [JJ fast] [N̄ [NN car]]] [N̄ [NN mechanic]]]]
A Funny Thing about the Penn Treebank
Leaves NP premodifier structure flat, or underspecified:
[NP [DT the] [JJ fast] [NN car] [NN mechanic]]

[NP [NP [DT the] [JJ fast] [NN car] [NN mechanic]] [PP [IN under] [NP [DT the] [NN pigeon]]]]
A Probabilistic Context-Free Grammar
S → NP VP    1.0        Vi → sleeps     1.0
VP → Vi      0.4        Vt → saw        1.0
VP → Vt NP   0.4        NN → man        0.7
VP → VP PP   0.2        NN → woman      0.2
NP → DT NN   0.3        NN → telescope  0.1
NP → NP PP   0.7        DT → the        1.0
PP → P NP    1.0        IN → with       0.5
                        IN → in         0.5

• Probability of a tree with rules αi → βi, for i = 1 ... n, is ∏i P(αi → βi | αi)
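A sketch of this product-of-rule-probabilities computation (an illustration, not course code), scoring the tree for "the man sleeps" under the PCFG above:

```python
from math import prod

# Rule probabilities from the PCFG above (only the rules this tree uses).
PROBS = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("DT", "NN")): 0.3,
    ("VP", ("Vi",)): 0.4,
    ("DT", ("the",)): 1.0,
    ("NN", ("man",)): 0.7,
    ("Vi", ("sleeps",)): 1.0,
}

# Tree as nested tuples: (label, children...); leaves are words.
tree = ("S",
        ("NP", ("DT", "the"), ("NN", "man")),
        ("VP", ("Vi", "sleeps")))

def rules_of(t):
    """Yield the (lhs, rhs) rules used in tree t."""
    label, *children = t
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for c in children:
        if not isinstance(c, str):
            yield from rules_of(c)

p = prod(PROBS[r] for r in rules_of(tree))
print(round(p, 4))  # 0.084  (= 1.0 * 0.3 * 1.0 * 0.7 * 0.4 * 1.0)
```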
DERIVATION        RULES USED        PROBABILITY
S                 S → NP VP         1.0
NP VP             NP → DT N         0.3
DT N VP           DT → the          1.0
the N VP          N → dog           0.1
the dog VP        VP → VB           0.4
the dog VB        VB → laughs       0.5
the dog laughs

TOTAL PROBABILITY = 1.0 × 0.3 × 1.0 × 0.1 × 0.4 × 0.5 = 0.006
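The total can be checked directly as a product over the rules used; a minimal sketch (the rule strings and probabilities are taken from the derivation above, the encoding as a list of pairs is an illustrative choice):

```python
import math

# Rules used in the derivation above, with their probabilities.
rules_used = [
    ("S -> NP VP", 1.0),
    ("NP -> DT N", 0.3),
    ("DT -> the", 1.0),
    ("N -> dog", 0.1),
    ("VP -> VB", 0.4),
    ("VB -> laughs", 0.5),
]

# A PCFG scores a tree as the product of the probabilities of its rules.
prob = math.prod(p for _, p in rules_used)
print(prob)  # approximately 0.006 (up to floating-point rounding)
```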
Properties of PCFGs
• Assigns a probability to each left-most derivation, or parse-tree, allowed by the underlying CFG
• Say we have a sentence S, and let T(S) be the set of derivations for that sentence. Then a PCFG assigns a probability to each member of T(S), i.e., we now have a ranking of parses in order of probability.

• The probability of a string S is

P(S) = Σ_{T ∈ T(S)} P(T, S)
Deriving a PCFG from a Corpus
• Given a set of example trees, the underlying CFG can simply be taken to be the set of all rules seen in the corpus
• Maximum Likelihood estimates:
P_ML(α → β | α) = Count(α → β) / Count(α)
where the counts are taken from a training set of example trees.
• If the training data is generated by a PCFG, then as the training data size goes to infinity, the maximum-likelihood PCFG will converge to the same distribution as the “true” PCFG.
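These relative-frequency estimates are straightforward to compute from counts; a minimal sketch, using a hypothetical handful of rule occurrences rather than a real treebank:

```python
from collections import Counter

# Hypothetical (lhs, rhs) rule occurrences read off a tiny set of example trees.
observed = [
    ("NP", ("DT", "NN")), ("NP", ("DT", "NN")), ("NP", ("NP", "PP")),
    ("VP", ("Vt", "NP")), ("VP", ("Vi",)),
]

rule_counts = Counter(observed)
lhs_counts = Counter(lhs for lhs, _ in observed)

# Maximum-likelihood estimate: P_ML(a -> b | a) = Count(a -> b) / Count(a)
p_ml = {rule: count / lhs_counts[rule[0]] for rule, count in rule_counts.items()}

print(p_ml[("NP", ("DT", "NN"))])  # 2 of the 3 NP rewrites
```

Note that the estimates for each left-hand side sum to one, so condition 1 of [Booth and Thompson 73] holds by construction.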
PCFGs

[Booth and Thompson 73] showed that a CFG with rule probabilities correctly defines a distribution over the set of derivations provided that:
1. The rule probabilities define conditional distributions over the different ways of rewriting each non-terminal.
2. A technical condition on the rule probabilities, ensuring that the probability that a derivation terminates in a finite number of steps is 1. (This condition is not really a practical concern.)
Algorithms for PCFGs
• Given a PCFG and a sentence S, define T (S) to be the set of trees with S as the yield.
• Given a PCFG and a sentence S, how do we find
arg max_{T ∈ T(S)} P(T, S)
• Given a PCFG and a sentence S, how do we find
P(S) = Σ_{T ∈ T(S)} P(T, S)
Chomsky Normal Form
A context-free grammar G = (N, Σ, R, S) is in Chomsky Normal Form if:

• N is a set of non-terminal symbols

• Σ is a set of terminal symbols

• R is a set of rules, each of which takes one of two forms:

– X → Y1 Y2 for X ∈ N, and Y1, Y2 ∈ N

– X → Y for X ∈ N, and Y ∈ Σ

• S ∈ N is a distinguished start symbol
A Dynamic Programming Algorithm

• Given a PCFG and a sentence S, how do we find

max_{T ∈ T(S)} P(T, S)

• Notation:

n = number of words in the sentence

N_k for k = 1 … K is the k'th non-terminal

w.l.o.g., N_1 = S (the start symbol)

• Define a dynamic programming table

π[i, j, k] = maximum probability of a constituent with non-terminal N_k spanning words i … j inclusive

• Our goal is to calculate max_{T ∈ T(S)} P(T, S) = π[1, n, 1]
A Dynamic Programming Algorithm
• Base case definition: for all i = 1 … n, for k = 1 … K

π[i, i, k] = P(N_k → w_i | N_k)

(note: define P(N_k → w_i | N_k) = 0 if N_k → w_i is not in the grammar)

• Recursive definition: for all i = 1 … n, j = (i + 1) … n, k = 1 … K,

π[i, j, k] = max_{i ≤ s < j, 1 ≤ l ≤ K, 1 ≤ m ≤ K} { P(N_k → N_l N_m | N_k) × π[i, s, l] × π[s + 1, j, m] }

(note: define P(N_k → N_l N_m | N_k) = 0 if N_k → N_l N_m is not in the grammar)
Initialization: For i = 1 … n, k = 1 … K

    π[i, i, k] = P(N_k → w_i | N_k)

Main Loop: For length = 1 … (n − 1), i = 1 … (n − length), k = 1 … K

    j ← i + length
    max ← 0
    For s = i … (j − 1),
        For N_l, N_m such that N_k → N_l N_m is in the grammar
            prob ← P(N_k → N_l N_m) × π[i, s, l] × π[s + 1, j, m]
            If prob > max
                max ← prob
                // Store backpointers which imply the best parse
                Split(i, j, k) = {s, l, m}
    π[i, j, k] = max
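The pseudocode above can be sketched in Python. The grammar encoding (two dicts of binary and lexical rules) and the toy grammar itself are illustrative assumptions, not part of the lecture; the recursion is the Viterbi CKY dynamic program just described, with 0-based word indices:

```python
def cky_max(words, binary, lexical):
    """Viterbi CKY: highest-probability parse of `words` under a CNF PCFG.

    binary:  {(X, Y, Z): P(X -> Y Z)}    lexical: {(X, w): P(X -> w)}
    Returns (best probability of an S spanning the whole sentence, backpointers).
    """
    n = len(words)
    pi = {}  # pi[(i, j, X)] = best probability of an X spanning words i..j (0-based)
    bp = {}  # bp[(i, j, X)] = (s, Y, Z): split point and children of the best parse
    for i, w in enumerate(words):                      # base case: lexical rules
        for (X, word), p in lexical.items():
            if word == w:
                pi[(i, i, X)] = p
    for length in range(1, n):                         # main loop: shorter spans first
        for i in range(n - length):
            j = i + length
            for (X, Y, Z), p in binary.items():
                for s in range(i, j):
                    prob = p * pi.get((i, s, Y), 0.0) * pi.get((s + 1, j, Z), 0.0)
                    if prob > pi.get((i, j, X), 0.0):
                        pi[(i, j, X)] = prob
                        bp[(i, j, X)] = (s, Y, Z)      # backpointer for the best parse
    return pi.get((0, n - 1, "S"), 0.0), bp

# A hypothetical toy CNF grammar:
binary = {("S", "NP", "VP"): 1.0, ("NP", "DT", "N"): 1.0, ("VP", "VB", "NP"): 1.0}
lexical = {("DT", "the"): 1.0, ("N", "dog"): 0.5, ("N", "cat"): 0.5, ("VB", "saw"): 1.0}

best, _ = cky_max("the dog saw the cat".split(), binary, lexical)
print(best)  # 0.25
```

The backpointers in `bp` can be followed top-down from (0, n − 1, "S") to reconstruct the best tree.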
A Dynamic Programming Algorithm for the Sum

• Given a PCFG and a sentence S, how do we find

Σ_{T ∈ T(S)} P(T, S)

• Notation:

n = number of words in the sentence

N_k for k = 1 … K is the k'th non-terminal

w.l.o.g., N_1 = S (the start symbol)

• Define a dynamic programming table

π[i, j, k] = sum of probabilities of parses with root label N_k spanning words i … j inclusive

• Our goal is to calculate Σ_{T ∈ T(S)} P(T, S) = π[1, n, 1]
A Dynamic Programming Algorithm for the Sum
• Base case definition: for all i = 1 … n, for k = 1 … K

π[i, i, k] = P(N_k → w_i | N_k)

(note: define P(N_k → w_i | N_k) = 0 if N_k → w_i is not in the grammar)

• Recursive definition: for all i = 1 … n, j = (i + 1) … n, k = 1 … K,

π[i, j, k] = Σ_{i ≤ s < j, 1 ≤ l ≤ K, 1 ≤ m ≤ K} { P(N_k → N_l N_m | N_k) × π[i, s, l] × π[s + 1, j, m] }

(note: define P(N_k → N_l N_m | N_k) = 0 if N_k → N_l N_m is not in the grammar)
Initialization: For i = 1 … n, k = 1 … K

    π[i, i, k] = P(N_k → w_i | N_k)

Main Loop: For length = 1 … (n − 1), i = 1 … (n − length), k = 1 … K

    j ← i + length
    sum ← 0
    For s = i … (j − 1),
        For N_l, N_m such that N_k → N_l N_m is in the grammar
            prob ← P(N_k → N_l N_m) × π[i, s, l] × π[s + 1, j, m]
            sum ← sum + prob
    π[i, j, k] = sum
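The sum (inside) version differs from the Viterbi recursion only in accumulating rather than maximizing. A minimal self-contained Python sketch, using a deliberately ambiguous hypothetical toy grammar (the dict-based encoding is an illustrative assumption) so that the sum exceeds the probability of any single parse:

```python
def inside_sum(words, binary, lexical):
    """Inside algorithm: total probability P(S) of all parses of `words`.

    binary:  {(X, Y, Z): P(X -> Y Z)}    lexical: {(X, w): P(X -> w)}
    """
    n = len(words)
    pi = {}  # pi[(i, j, X)] = summed probability of all X-rooted parses of words i..j
    for i, w in enumerate(words):                      # base case: lexical rules
        for (X, word), p in lexical.items():
            if word == w:
                pi[(i, i, X)] = pi.get((i, i, X), 0.0) + p
    for length in range(1, n):                         # main loop: shorter spans first
        for i in range(n - length):
            j = i + length
            for (X, Y, Z), p in binary.items():
                for s in range(i, j):                  # sum over splits, not max
                    pi[(i, j, X)] = (pi.get((i, j, X), 0.0)
                                     + p * pi.get((i, s, Y), 0.0) * pi.get((s + 1, j, Z), 0.0))
    return pi.get((0, n - 1, "S"), 0.0)

# Ambiguous toy grammar: "a a a" has two binary-branching parses,
# each with probability 0.5^5 = 0.03125, so P(S) = 0.0625.
binary = {("S", "S", "S"): 0.5}
lexical = {("S", "a"): 0.5}
print(inside_sum("a a a".split(), binary, lexical))  # 0.0625
```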
Overview
• An introduction to the parsing problem
• Context free grammars
• A brief(!) sketch of the syntax of English
• Examples of ambiguous structures
• PCFGs, their formal properties, and useful algorithms
• Weaknesses of PCFGs
Weaknesses of PCFGs
• Lack of sensitivity to lexical information
• Lack of sensitivity to structural frequencies
(S (NP (NNP IBM))
   (VP (Vt bought)
       (NP (NNP Lotus))))

PROB = P(S → NP VP | S)
     × P(NP → NNP | NP) × P(NNP → IBM | NNP)
     × P(VP → Vt NP | VP) × P(Vt → bought | Vt)
     × P(NP → NNP | NP) × P(NNP → Lotus | NNP)
Another Case of PP Attachment Ambiguity
(a) (S (NP (NNS workers))
       (VP (VP (VBD dumped)
               (NP (NNS sacks)))
           (PP (IN into)
               (NP (DT a) (NN bin)))))
(b) (S (NP (NNS workers))
       (VP (VBD dumped)
           (NP (NP (NNS sacks))
               (PP (IN into)
                   (NP (DT a) (NN bin))))))
Rules (a)           Rules (b)
S → NP VP           S → NP VP
NP → NNS            NP → NNS
VP → VP PP          NP → NP PP
VP → VBD NP         VP → VBD NP
NP → NNS            NP → NNS
PP → IN NP          PP → IN NP
NP → DT NN          NP → DT NN
NNS → workers       NNS → workers
VBD → dumped        VBD → dumped
NNS → sacks         NNS → sacks
IN → into           IN → into
DT → a              DT → a
NN → bin            NN → bin

If P(NP → NP PP | NP) > P(VP → VP PP | VP) then (b) is more probable, else (a) is more probable.

The attachment decision is completely independent of the words.
A Case of Coordination Ambiguity
(a) (NP (NP (NP (NNS dogs))
            (PP (IN in)
                (NP (NNS houses))))
        (CC and)
        (NP (NNS cats)))
(b) (NP (NP (NNS dogs))
        (PP (IN in)
            (NP (NP (NNS houses))
                (CC and)
                (NP (NNS cats)))))
Rules (a)           Rules (b)
NP → NP CC NP       NP → NP CC NP
NP → NP PP          NP → NP PP
NP → NNS            NP → NNS
PP → IN NP          PP → IN NP
NP → NNS            NP → NNS
NP → NNS            NP → NNS
NNS → dogs          NNS → dogs
IN → in             IN → in
NNS → houses        NNS → houses
CC → and            CC → and
NNS → cats          NNS → cats

Here the two parses have identical rules, and therefore have identical probability under any assignment of PCFG rule probabilities.
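This can be verified mechanically: list the rules each parse uses and compare the multisets. In the sketch below the rule lists are read off the two trees, while the probability values are arbitrary placeholders (powers of two, so the floating-point products are exact and order-independent), not estimates from any corpus:

```python
import math
from collections import Counter

# Rules of parse (a) and parse (b), read off the two trees in preorder.
parse_a = ["NP -> NP CC NP", "NP -> NP PP", "NP -> NNS", "NNS -> dogs",
           "PP -> IN NP", "IN -> in", "NP -> NNS", "NNS -> houses",
           "CC -> and", "NP -> NNS", "NNS -> cats"]
parse_b = ["NP -> NP PP", "NP -> NNS", "NNS -> dogs", "PP -> IN NP",
           "IN -> in", "NP -> NP CC NP", "NP -> NNS", "NNS -> houses",
           "CC -> and", "NP -> NNS", "NNS -> cats"]

# The two parses use the identical multiset of rules...
assert Counter(parse_a) == Counter(parse_b)

# ...so for ANY assignment of rule probabilities, the products coincide.
q = {rule: 2.0 ** -(i + 1) for i, rule in enumerate(sorted(set(parse_a)))}
pa = math.prod(q[r] for r in parse_a)
pb = math.prod(q[r] for r in parse_b)
assert pa == pb
```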
Structural Preferences: Close Attachment
(a) (NP (NP (NN))
        (PP (IN)
            (NP (NP (NN))
                (PP (IN)
                    (NP (NN))))))

(b) (NP (NP (NP (NN))
            (PP (IN)
                (NP (NN))))
        (PP (IN)
            (NP (NN))))
• Example: president of a company in Africa
• Both parses have the same rules, therefore receive same probability under a PCFG
• “Close attachment” (structure (a)) is twice as likely in Wall Street Journal text.
Structural Preferences: Close Attachment
Previous example: John was believed to have been shot by Bill
Here the low-attachment analysis (Bill does the shooting) contains the same rules as the high-attachment analysis (Bill does the believing), so the two analyses receive the same probability.
References

[Altun, Tsochantaridis, and Hofmann, 2003] Altun, Y., I. Tsochantaridis, and T. Hofmann. 2003. Hidden Markov Support Vector Machines. In Proceedings of ICML 2003.
[Bartlett 1998] P. L. Bartlett. 1998. The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network, IEEE Transactions on Information Theory, 44(2): 525-536, 1998.
[Bod 98] Bod, R. (1998). Beyond Grammar: An Experience-Based Theory of Language. CSLI Publications/Cambridge University Press.
[Booth and Thompson 73] Booth, T., and Thompson, R. 1973. Applying probability measures to abstract languages. IEEE Transactions on Computers, C-22(5), pages 442–450.
[Borthwick et. al 98] Borthwick, A., Sterling, J., Agichtein, E., and Grishman, R. (1998). Exploiting Diverse Knowledge Sources via Maximum Entropy in Named Entity Recognition. Proc. of the Sixth Workshop on Very Large Corpora.
[Collins and Duffy 2001] Collins, M. and Duffy, N. (2001). Convolution Kernels for Natural Language. In Proceedings of NIPS 14.
[Collins and Duffy 2002] Collins, M. and Duffy, N. (2002). New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron. In Proceedings of ACL 2002.
[Collins 2002a] Collins, M. (2002a). Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with the Perceptron Algorithm. In Proceedings of EMNLP 2002.
[Collins 2002b] Collins, M. (2002b). Parameter Estimation for Statistical Parsing Models: Theory and Practice of Distribution-Free Methods. To appear as a book chapter.
[Crammer and Singer 2001a] Crammer, K., and Singer, Y. 2001a. On the Algorithmic Implementation of Multiclass Kernel-based Vector Machines. In Journal of Machine Learning Research, 2(Dec):265-292.
[Crammer and Singer 2001b] Koby Crammer and Yoram Singer. 2001b. Ultraconservative Online Algorithms for Multiclass Problems In Proceedings of COLT 2001.
[Freund and Schapire 99] Freund, Y. and Schapire, R. (1999). Large Margin Classification using the Perceptron Algorithm. In Machine Learning, 37(3):277–296.
[Helmbold and Warmuth 95] Helmbold, D., and Warmuth, M. On Weak Learning. Journal of Computer and System Sciences, 50(3):551-573, June 1995.
[Hopcroft and Ullman 1979] Hopcroft, J. E., and Ullman, J. D. 1979. Introduction to automata theory, languages, and computation. Reading, Mass.: Addison–Wesley.
[Johnson et. al 1999] Johnson, M., Geman, S., Canon, S., Chi, S., & Riezler, S. (1999). Estimators for stochastic ‘unification-based” grammars. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics. San Francisco: Morgan Kaufmann.
[Lafferty et al. 2001] John Lafferty, Andrew McCallum, and Fernando Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML-01, pages 282-289, 2001.
[Littlestone and Warmuth, 1986] Littlestone, N., and Warmuth, M. 1986. Relating data compression and learnability. Technical report, University of California, Santa Cruz.
[MSM93] Marcus, M., Santorini, B., & Marcinkiewicz, M. (1993). Building a large annotated corpus of english: The Penn treebank. Computational Linguistics, 19, 313-330.
[McCallum et al. 2000] McCallum, A., Freitag, D., and Pereira, F. (2000) Maximum entropy markov models for information extraction and segmentation. In Proceedings of ICML 2000.
[Miller et. al 2000] Miller, S., Fox, H., Ramshaw, L., and Weischedel, R. 2000. A Novel Use of Statistical Parsing to Extract Information from Text. In Proceedings of ANLP 2000.
[Ramshaw and Marcus 95] Ramshaw, L., and Marcus, M. P. (1995). Text Chunking Using Transformation-Based Learning. In Proceedings of the Third ACL Workshop on Very Large Corpora, Association for Computational Linguistics, 1995.
[Ratnaparkhi 96] Ratnaparkhi, A. (1996). A maximum entropy part-of-speech tagger. In Proceedings of the Empirical Methods in Natural Language Processing Conference.
[Schapire et al., 1998] Schapire R., Freund Y., Bartlett P. and Lee W. S. 1998. Boosting the margin: A new explanation for the effectiveness of voting methods. The Annals of Statistics, 26(5):1651-1686.
[Zhang, 2002] Zhang, T. 2002. Covering Number Bounds of Certain Regularized Linear Function Classes. In Journal of Machine Learning Research, 2(Mar):527-550, 2002.