Natural Language Models and Interfacesivan-titov.org/teaching/nlmi/nlmi-syntax-2.pdf · 2017. 10....

Natural Language Models and Interfaces Part B, lecture 2 Ivan Titov Institute for Logic, Language and Computation


Page 1: Natural Language Models and Interfacesivan-titov.org/teaching/nlmi/nlmi-syntax-2.pdf · 2017. 10. 1. · A dynamic programming algorithm for parsing (CKY) ! ... Equivalent means that

Natural Language Models and Interfaces: Part B, lecture 2

Ivan Titov

Institute for Logic, Language and Computation

Today

- Parsing algorithms for CFGs
  - Recap, Chomsky Normal Form (CNF)
  - A dynamic programming algorithm for parsing (CKY)
  - Extension of CKY to support unary inner rules
- Parsing for PCFGs
  - Extension of CKY for parsing with PCFGs
- Parser evaluation (if we have time)

After this lecture you should be able to start working on step 2.1 of the assignment.

Parsing

- Parsing is a search through the space of all possible parses
  - e.g., we may want any parse, all parses, or (for a PCFG) the highest-scoring parse:

    $\arg\max_{T \in G(x)} P(T)$

    where $G(x)$ is the set of all trees given by the grammar for the sentence $x$, and $P(T)$ is the probability assigned by the PCFG model
- Bottom-up:
  - Start from the words and attempt to construct the full tree
- Top-down:
  - Start from the start symbol and attempt to expand it to obtain the sentence

CKY algorithm (aka CYK)

- Cocke-Kasami-Younger algorithm
  - Independently discovered in the late 60s / early 70s
- An efficient bottom-up parsing algorithm for (P)CFGs
  - can be used for both the recognition and parsing problems
  - Very important in NLP (and beyond)
- We will start with the non-probabilistic version

Constraints on the grammar

- The basic CKY algorithm supports only rules in Chomsky Normal Form (CNF):
  - Unary preterminal rules C → x (generation of words given PoS tags, e.g., N → telescope, D → the)
  - Binary inner rules C → C1 C2 (e.g., S → NP VP, NP → D N)
- Any CFG can be converted to an equivalent CNF grammar
  - Equivalent means that they define the same language
  - However, the (syntactic) trees will look different (this makes linguists unhappy)
  - This can be addressed by defining transformations that allow for an easy reverse transformation

Transformation to CNF form

- What one needs to do to convert to CNF form:
  - Get rid of empty (aka epsilon) productions: C → ε
    - Generally not a problem, as there are no empty productions in the standard (postprocessed) treebanks
  - Get rid of unary rules: C → C1
    - Not a problem, as our CKY algorithm will support unary rules
  - N-ary rules: C → C1 C2 ... Cn (n > 2)
    - Crucial to process them, as this is required for efficient parsing

Transformation to CNF form: binarization

- Consider the rule NP → DT NNP VBG NN, as used in the tree

  (NP (DT the) (NNP Dutch) (VBG publishing) (NN group))

- How do we get a set of binary rules which are equivalent?

  NP → DT X
  X → NNP Y
  Y → VBG NN

- A more systematic way to refer to the new non-terminals:

  NP → DT @NP|DT
  @NP|DT → NNP @NP|DT_NNP
  @NP|DT_NNP → VBG NN
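The systematic naming scheme can be sketched in a few lines of Python. This is only an illustration: the function name `binarize_rule` and the `_` separator between the children already generated are assumptions of this sketch, not the assignment code.

```python
def binarize_rule(parent, children):
    """Split parent -> c1 ... cn (n > 2) into a chain of binary rules.

    Each new non-terminal is named '@parent|c1_..._ci', recording the
    children generated so far (the '_' separator is an assumption).
    """
    if len(children) <= 2:
        return [(parent, tuple(children))]
    rules = []
    lhs = parent
    for i in range(len(children) - 2):
        new_symbol = "@%s|%s" % (parent, "_".join(children[: i + 1]))
        rules.append((lhs, (children[i], new_symbol)))
        lhs = new_symbol
    # The last rule rewrites to the final two children.
    rules.append((lhs, (children[-2], children[-1])))
    return rules


rules = binarize_rule("NP", ["DT", "NNP", "VBG", "NN"])
for lhs, rhs in rules:
    print(lhs, "->", " ".join(rhs))
```

Because the new symbol names encode the parent and the children seen so far, the original n-ary rule can be read back from the binary ones.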

Transformation to CNF form: binarization

- Instead of binarizing rules we can binarize trees during preprocessing:

  (NP (DT the) (NNP Dutch) (VBG publishing) (NN group))

  becomes

  (NP (DT the)
      (@NP->DT (NNP Dutch)
               (@NP->DT_NNP (VBG publishing) (NN group))))

- Can be easily reversed during postprocessing
- Also known as lossless Markovization in the context of PCFGs

Transformation to CNF form: binarization

- Instead of binarizing rules we can binarize trees during preprocessing:

  (NP (DT the) (NNP Dutch) (VBG publishing) (NN group))

  becomes

  (NP (DT the)
      (@NP->DT (NNP Dutch)
               (@NP->DT_NNP (VBG publishing)
                            (@NP->DT_NNP_VBG (NN group)))))

- This is exactly the transformation used in the code of the assignment
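A minimal sketch of this tree transformation and its inverse, assuming trees are nested tuples `(label, child, ...)` with words as plain strings, and that only preterminals dominate words. The names `binarize`, `unbinarize` and `_chain` are hypothetical; the assignment code may differ in its details (for instance in how it treats nodes that already have two children).

```python
def binarize(tree):
    """Binarize a tree given as (label, child, ...); leaves are plain strings."""
    if isinstance(tree, str):
        return tree
    label = tree[0]
    children = [binarize(child) for child in tree[1:]]
    if len(children) <= 2:
        return (label, *children)
    return (label, children[0], _chain(label, children, 1))


def _chain(label, children, i):
    # Intermediate node named after the parent and the children already generated.
    name = "@%s->%s" % (label, "_".join(c[0] for c in children[:i]))
    if i == len(children) - 1:
        return (name, children[i])  # the last intermediate node is unary
    return (name, children[i], _chain(label, children, i + 1))


def unbinarize(tree):
    """Reverse binarize() by splicing out every node whose label starts with '@'."""
    if isinstance(tree, str):
        return tree
    children = []
    for child in tree[1:]:
        child = unbinarize(child)
        if not isinstance(child, str) and child[0].startswith("@"):
            children.extend(child[1:])
        else:
            children.append(child)
    return (tree[0], *children)


np_tree = ("NP", ("DT", "the"), ("NNP", "Dutch"), ("VBG", "publishing"), ("NN", "group"))
binary_tree = binarize(np_tree)
print(binary_tree)
print(unbinarize(binary_tree) == np_tree)
```

The reverse step needs no grammar at all, which is why the transformation is easy to undo on postprocessing.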

CKY: Parsing task

- We are given
  - a grammar $G = (V, \Sigma, R, S)$, where $S$ is the start symbol
  - a sequence of words $w = (w_1, w_2, \ldots, w_n)$
- Our goal is to produce a parse tree for $w$
- We need an easy way to refer to substrings of $w$
  - indices refer to fenceposts
  - span (i, j) refers to the words between fenceposts i and j

[Some illustrations and slides in this lecture are from Marco Kuhlmann]

Key problems

- Recognition problem: does the sentence belong to the language defined by the CFG?
  - Is there a derivation which yields the sentence?
- Parsing problem: what is a derivation (tree) corresponding to the sentence?
- Probabilistic parsing: what is the most probable tree for the sentence?

Parsing one word

C → w_i

Parsing longer spans

C → C1 C2

Check through all C1, C2, mid

Signatures

- Application of rules is independent of the inner structure of a parse tree
- We only need to know the corresponding span and the root label of the tree
  - Its signature [min, max, C] (also known as an edge)

CKY idea

- Compute for every span a set of admissible labels (may be empty for some spans)
  - Start from small trees (single words) and proceed to larger ones
- When done, check if S is among the admissible labels for the whole sentence; if yes, the sentence belongs to the language
  - That is, if a tree with signature [0, n, S] exists
- Unary rules?

CKY in action

Inner rules:

  S → NP VP
  VP → VP NP
  VP → M V
  VP → V
  NP → N
  NP → N NP

Preterminal rules:

  N → can
  N → lead
  N → poison
  M → can
  M → must
  V → poison
  V → lead

Sentence with fenceposts: 0 lead 1 can 2 poison 3

CKY in action: the chart (aka parsing triangle)

The chart has one cell per span [min, max], with min = 0, 1, 2 and max = 1, 2, 3. The question is whether S is an admissible label for the cell covering the whole sentence, [0, 3].

The cells are filled in the order 1 to 6:

  1, 2, 3: the single-word spans [0, 1], [1, 2], [2, 3]
  4: [0, 2] (lead can)
  5: [1, 3] (can poison)
  6: [0, 3] (lead can poison)

We start with the single-word cells 1, 2 and 3.

CKY in action: single-word spans

The preterminal rules give the labels of the single-word cells:

  [0, 1] lead: N, V
  [1, 2] can: N, M
  [2, 3] poison: N, V

The unary rules (VP → V, NP → N) then add:

  [0, 1] lead: NP, VP
  [1, 2] can: NP
  [2, 3] poison: NP, VP

CKY in action: cell 4, span [0, 2] (lead can)

Label added on the slide: NP (from NP → N NP with mid = 1).

Check about unary rules: no unary rules apply here.

CKY in action: cell 5, span [1, 3] (can poison)

Labels added: S, VP, NP (from S → NP VP, VP → M V and NP → N NP, all with mid = 2).

Check about unary rules: no unary rules apply here.

CKY in action: cell 6, span [0, 3] (lead can poison)

- mid = 1: S, NP (S → NP VP and NP → N NP, combining [0, 1] with [1, 3])
- mid = 2: S (?!) (S → NP VP, combining the NP over [0, 2] with the VP over [2, 3])

Apparently the sentence is ambiguous for the grammar (as the grammar overgenerates).

Ambiguity

Apparently the sentence is ambiguous for the grammar (as the grammar overgenerates):

  (S (NP (N lead)) (VP (M can) (V poison)))

  (S (NP (N lead) (NP (N can))) (VP (V poison)))

No subject-verb agreement, and poison is used as an intransitive verb.

CKY more formally

- The chart can be represented by a Boolean array chart[min][max][label]
  - Relevant entries have $0 \le min < max \le n$
  - chart[min][max][C] = true if the signature (min, max, C) has already been added to the chart; chart[min][max][C] = false otherwise
- Here we assume that labels (C) are integer indices
- In the assignment code we use a class Chart, but its access methods are similar

[Figure: the completed chart for the sentence "lead can poison"]

Implementation: preterminal rules
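The code shown on this slide did not survive in the transcript. As a rough sketch of the preterminal step, assuming the chart is a table of label sets rather than the Boolean array (the names `PRETERMINAL` and `init_chart` are hypothetical):

```python
from collections import defaultdict

# Preterminal rules of the "lead can poison" grammar, indexed by word.
PRETERMINAL = defaultdict(set)
for tag, word in [("N", "can"), ("N", "lead"), ("N", "poison"),
                  ("M", "can"), ("M", "must"),
                  ("V", "poison"), ("V", "lead")]:
    PRETERMINAL[word].add(tag)


def init_chart(words):
    """Create an empty chart and fill the single-word spans (i, i + 1)."""
    n = len(words)
    # chart[lo][hi] holds the set of labels admissible for span (lo, hi).
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] |= PRETERMINAL.get(word, set())
    return chart


chart = init_chart(["lead", "can", "poison"])
print(chart[0][1], chart[1][2], chart[2][3])
```

A set per cell plays the same role as the Boolean entries chart[min][max][C]: a label is in the set exactly when the Boolean entry would be true.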

Implementation: binary rules
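The slide's code is likewise missing from the transcript. A possible sketch of the binary step for one span, again with a chart of label sets; `BINARY` and `apply_binary` are hypothetical names, and the example fills in the cell for "can poison" from the single-word cells obtained earlier:

```python
# Binary inner rules of the example grammar as (parent, left, right).
BINARY = [("S", "NP", "VP"), ("VP", "VP", "NP"),
          ("VP", "M", "V"), ("NP", "N", "NP")]


def apply_binary(chart, lo, hi):
    """Add every label C to span (lo, hi) for which some rule C -> C1 C2
    and some split point mid have C1 in (lo, mid) and C2 in (mid, hi)."""
    for mid in range(lo + 1, hi):
        for parent, left, right in BINARY:
            if left in chart[lo][mid] and right in chart[mid][hi]:
                chart[lo][hi].add(parent)


n = 3
chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
chart[1][2] = {"N", "M", "NP"}        # can (after unary rules)
chart[2][3] = {"N", "V", "NP", "VP"}  # poison (after unary rules)
apply_binary(chart, 1, 3)
print(sorted(chart[1][3]))
```

This is the "check through all C1, C2, mid" loop from the earlier slide, restricted to a single cell.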

Implementation: unary rules

But we forgot something!

Unary closure

- What if the grammar contained 2 rules:

  A → B
  B → C

- But C can be derived from A by a chain of rules: A → B → C
- One could support chains in the algorithm, but it is easier to extend the grammar to get the reflexive transitive closure:

  A → B              A → B
  B → C      ⇒       B → C
                     A → C
                     A → A
                     B → B
                     C → C

- Convenient for programming reasons in the PCFG case
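The closure can be computed once, before parsing. A small sketch, assuming unary rules are given as (parent, child) pairs (`unary_closure` is a hypothetical name):

```python
def unary_closure(unary_rules, symbols):
    """Reflexive transitive closure of unary rules given as (parent, child) pairs."""
    closure = {(s, s) for s in symbols} | set(unary_rules)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for b2, c in list(closure):
                # Chain a -> b and b -> c into a -> c.
                if b == b2 and (a, c) not in closure:
                    closure.add((a, c))
                    changed = True
    return closure


closure = unary_closure([("A", "B"), ("B", "C")], ["A", "B", "C"])
print(sorted(closure))
```

With the closed rule set, a single pass over the unary rules per cell is enough; no chains need to be followed at parsing time.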

Implementation: skeleton
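The skeleton itself is not preserved in the transcript. Putting the pieces together, a recognizer could look roughly as follows; this is a sketch under the assumption that cells are label sets, and `recognize` and `apply_unary` are hypothetical names, not the assignment's API:

```python
PRETERMINAL = {"can": {"N", "M"}, "lead": {"N", "V"},
               "poison": {"N", "V"}, "must": {"M"}}
BINARY = [("S", "NP", "VP"), ("VP", "VP", "NP"),
          ("VP", "M", "V"), ("NP", "N", "NP")]
UNARY = [("VP", "V"), ("NP", "N")]  # (parent, child)


def apply_unary(cell):
    """Add parents of unary rules; repeating until nothing changes
    also covers chains, so no precomputed closure is needed here."""
    changed = True
    while changed:
        changed = False
        for parent, child in UNARY:
            if child in cell and parent not in cell:
                cell.add(parent)
                changed = True


def recognize(words, start="S"):
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    # Single words: preterminal rules, then unary rules.
    for i, word in enumerate(words):
        chart[i][i + 1] |= PRETERMINAL.get(word, set())
        apply_unary(chart[i][i + 1])
    # Longer spans, from short to long: binary rules, then unary rules.
    for length in range(2, n + 1):
        for lo in range(n - length + 1):
            hi = lo + length
            for mid in range(lo + 1, hi):
                for parent, left, right in BINARY:
                    if left in chart[lo][mid] and right in chart[mid][hi]:
                        chart[lo][hi].add(parent)
            apply_unary(chart[lo][hi])
    return start in chart[0][n]


print(recognize(["lead", "can", "poison"]))
print(recognize(["must", "lead"]))
```

Spans are visited by increasing length, so both sub-spans of any split are complete before they are combined.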

Algorithm analysis

- Time complexity?
  - $\Theta(n^3 |R|)$, where $|R|$ is the number of rules in the grammar
- There exist algorithms with better asymptotic time complexity, but the `constant' makes them slower in practice (in general)
- A few seconds for sentences of under 20 words for a non-optimized parser
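One way to see where the cubic factor comes from is to count the rule checks made for binary rules. In this sketch $R_{bin} \subseteq R$ denotes the binary rules, a symbol that is not used on the slide:

```latex
\#\{(min, mid, max) : 0 \le min < mid < max \le n\}
  = \binom{n+1}{3} = \frac{(n+1)\,n\,(n-1)}{6},
\qquad
\text{rule checks} = \binom{n+1}{3}\,|R_{bin}| = \Theta\!\left(n^3 |R_{bin}|\right)
```

For the three-word example, $\binom{4}{3} = 4$: one split point each for the two two-word spans and two split points for the full sentence.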

Practical time complexity

- Time complexity? (for the PCFG version)
  - Empirically $\sim n^{3.6}$

[Plot by Dan Klein]


Probabilistic parsing

- We discussed the recognition problem:
  - check if a sentence is parsable with a CFG
- Now we consider parsing with PCFGs
  - Recognition with PCFGs: what is the probability of the most probable parse tree?
  - Parsing with PCFGs: what is the most probable parse tree?

Distribution over trees

- Let us denote by $G(x)$ the set of derivations for the sentence $x$
- The probability distribution $P(T)$, $T \in G(x)$, defines the scoring over the trees
- Finding the best parse for the sentence according to the PCFG:

  $\arg\max_{T \in G(x)} P(T)$

CKY with PCFGs

- The chart is represented by an array of doubles chart[min][max][label]
  - chart[min][max][C] stores the probability of the most probable subtree with the given signature
- chart[0][n][S] will store the probability of the most probable full parse tree

Intuition

C → C1 C2

For every C choose C1, C2 and mid such that

  $P(T_1) \times P(T_2) \times P(C \to C_1 C_2)$

is maximal, where $T_1$ and $T_2$ are the left and right subtrees.
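Written as a recurrence over chart entries, and ignoring unary inner rules for the moment, the intuition can be sketched as:

```latex
\mathrm{chart}[i][i+1][C] = P(C \to w_{i+1})

\mathrm{chart}[min][max][C] =
  \max_{\substack{C \to C_1 C_2 \\ min < mid < max}}
  P(C \to C_1 C_2) \times \mathrm{chart}[min][mid][C_1] \times \mathrm{chart}[mid][max][C_2]
```

Here $w_{i+1}$ is the word between fenceposts $i$ and $i+1$, and entries for which no derivation exists are taken to be 0.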

Implementation: preterminal rules

Implementation: binary rules
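The code for the probabilistic version is not preserved in the transcript either. A minimal sketch of the preterminal and binary steps follows; the toy PCFG, its probabilities and the name `viterbi_chart` are invented for illustration, and unary inner rules are left out to keep it short.

```python
from collections import defaultdict

# Hypothetical toy PCFG in CNF; the probabilities are made up.
LEXICON = {("D", "the"): 1.0, ("N", "dog"): 0.5,
           ("N", "cat"): 0.5, ("V", "saw"): 1.0}
BINARY = {("S", "NP", "VP"): 1.0, ("NP", "D", "N"): 1.0,
          ("VP", "V", "NP"): 1.0}


def viterbi_chart(words):
    """chart[lo, hi, C] = probability of the best subtree with that signature."""
    n = len(words)
    chart = defaultdict(float)  # missing signatures have probability 0.0
    # Preterminal rules.
    for i, word in enumerate(words):
        for (tag, w), p in LEXICON.items():
            if w == word:
                chart[i, i + 1, tag] = p
    # Binary rules: keep the maximum over rules and split points.
    for length in range(2, n + 1):
        for lo in range(n - length + 1):
            hi = lo + length
            for mid in range(lo + 1, hi):
                for (parent, left, right), p in BINARY.items():
                    score = p * chart[lo, mid, left] * chart[mid, hi, right]
                    if score > chart[lo, hi, parent]:
                        chart[lo, hi, parent] = score
    return chart


chart = viterbi_chart("the dog saw the cat".split())
print(chart[0, 5, "S"])
```

Compared with the Boolean recognizer, the only change is that "add the label" becomes "keep the maximum score".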

Unary (reflexive transitive) closure

  A → B  0.1              A → B  0.1
  B → C  0.2      ⇒       B → C  0.2
                          A → C  0.2 × 0.1
                          A → A  1
                          B → B  1
                          C → C  1

- Note that this is not a PCFG anymore, as the rule probabilities do not sum to 1 for each parent
- If the grammar already contains a direct rule for the same pair, the closure keeps the more probable derivation:

  A → B  0.1              A → B  0.1
  B → C  0.2      ⇒       B → C  0.2
  A → C  1.e-5            A → C  0.02
  ...                     A → A  1
                          B → B  1
                          C → C  1
                          ...

- What about loops, like A → B → A → C?
- The fact that the rule is composite needs to be stored to recover the true tree
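A sketch of how such a closure could be computed, using a max-product variant of the Floyd-Warshall scheme (a technique chosen for this sketch, not necessarily the one used in the assignment). Since probabilities are at most 1, going around a loop can never increase the score, so loops need no special treatment. The name `unary_closure_probs` is hypothetical, and the sketch does not store which chain produced each entry, which a real parser needs for tree recovery.

```python
def unary_closure_probs(unary, symbols):
    """best[a, c] = highest probability of any unary chain from a to c."""
    best = {(s, s): 1.0 for s in symbols}  # trivial rules C -> C
    for (a, b), p in unary.items():
        best[a, b] = max(best.get((a, b), 0.0), p)
    # Allow chains through intermediate symbol k, one k at a time.
    for k in symbols:
        for a in symbols:
            for c in symbols:
                via = best.get((a, k), 0.0) * best.get((k, c), 0.0)
                if via > best.get((a, c), 0.0):
                    best[a, c] = via
    return best


unary = {("A", "B"): 0.1, ("B", "C"): 0.2, ("A", "C"): 1e-5}
best = unary_closure_probs(unary, ["A", "B", "C"])
print(best["A", "C"])
```

In the example the chain A → B → C outscores the direct rule A → C, so the closure entry for (A, C) is the product of the two chain probabilities.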

Recovery of the tree

- For each signature we store backpointers to the elements from which it was built (e.g., the rule and, for binary rules, the midpoint)
  - start recovering from [0, n, S]
- Be careful with unary rules
  - Basically you can assume that you always used a unary rule from the closure (but it could be the trivial one, C → C)
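A sketch of backpointer storage and tree recovery, using a hypothetical toy PCFG in CNF with invented probabilities. Unary inner rules are omitted here, so the care needed for closure rules does not show up in this example; `parse` and `build` are hypothetical names.

```python
# Hypothetical toy PCFG in CNF; the probabilities are made up.
LEXICON = {("D", "the"): 1.0, ("N", "dog"): 0.5,
           ("N", "cat"): 0.5, ("V", "saw"): 1.0}
BINARY = {("S", "NP", "VP"): 1.0, ("NP", "D", "N"): 1.0,
          ("VP", "V", "NP"): 1.0}


def parse(words, start="S"):
    """Return the most probable tree in bracket notation, or None."""
    n = len(words)
    best, back = {}, {}
    for i, word in enumerate(words):
        for (tag, w), p in LEXICON.items():
            if w == word:
                best[i, i + 1, tag] = p
                back[i, i + 1, tag] = word  # backpointer to the word itself
    for length in range(2, n + 1):
        for lo in range(n - length + 1):
            hi = lo + length
            for mid in range(lo + 1, hi):
                for (parent, left, right), p in BINARY.items():
                    score = (p * best.get((lo, mid, left), 0.0)
                             * best.get((mid, hi, right), 0.0))
                    if score > best.get((lo, hi, parent), 0.0):
                        best[lo, hi, parent] = score
                        # Remember the split point and the child labels.
                        back[lo, hi, parent] = (mid, left, right)
    if (0, n, start) not in best:
        return None
    return build(back, 0, n, start)


def build(back, lo, hi, label):
    """Follow backpointers from the signature [lo, hi, label] downwards."""
    pointer = back[lo, hi, label]
    if isinstance(pointer, str):
        return "(%s %s)" % (label, pointer)
    mid, left, right = pointer
    return "(%s %s %s)" % (label, build(back, lo, mid, left),
                           build(back, mid, hi, right))


print(parse("the dog saw the cat".split()))
```

Only the winning rule and midpoint are stored per signature, so recovery is a single walk down from [0, n, S].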

Speeding up the algorithm (approximate search)

- Basic pruning (roughly):
  - For every span (i, j) store only labels whose probability is at most N times smaller than the probability of the most probable label for this span
  - Check not all rules but only rules yielding subtree labels with non-zero probability
- Coarse-to-fine pruning
  - Parse with a smaller (simpler) grammar, precompute (posterior) probabilities for each span, and use only the spans with non-negligible probability under that grammar
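The basic pruning rule can be sketched for a single cell as follows, assuming the cell is a dictionary from labels to probabilities; `prune_cell` is a hypothetical name and `ratio` plays the role of N.

```python
def prune_cell(cell, ratio):
    """Keep only labels whose probability is at most `ratio` times smaller
    than that of the most probable label in the cell."""
    if not cell:
        return {}
    threshold = max(cell.values()) / ratio
    return {label: p for label, p in cell.items() if p >= threshold}


cell = {"NP": 0.02, "VP": 0.001, "S": 0.00001}
pruned = prune_cell(cell, 100)
print(sorted(pruned))
```

Pruning makes the search approximate: a label dropped here might have been part of the best full tree.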


Parser evaluation

- Intrinsic evaluation:
  - Automatic: evaluate against annotation provided by human experts (gold standard) according to some predefined measure
    - Though it has many drawbacks, it is easier and allows tracking the state of the art across many years
  - Manual: ... according to human judgment
- Extrinsic evaluation: score a syntactic representation by comparing how well a system using this representation performs on some task
  - E.g., use the syntactic representation as input for a semantic analyzer and compare the results of the analyzer using syntax predicted by different parsers

Standard evaluation setting in parsing

- Automatic intrinsic evaluation is used: parsers are evaluated against a gold standard provided by linguists
- There is a standard split into three parts:
  - training set: used for estimation of model parameters
  - development set: used for tuning the model (initial experiments)
  - test set: final experiments to compare against previous work

Automatic evaluation of constituent parsers

- Exact match: percentage of trees predicted correctly
- Bracket score: scores how well individual phrases (and their boundaries) are identified
  - The most standard measure; we will focus on it
- Crossing brackets: percentage of phrase boundaries crossing
- Dependency metrics: score the dependency structure corresponding to the constituent tree (percentage of correctly identified heads)

Brackets scores

- The most standard score is the bracket score
- It regards a tree as a collection of brackets [min, max, C] (the subtree signatures for CKY)
- The set of brackets predicted by a parser is compared against the set of brackets in the tree annotated by a linguist
- Precision, recall and F1 are used as scores

Bracketing notation

- The same tree as a bracketed sequence:

(S
  (NP (PN My) (N dog))
  (VP (V ate)
    (NP (D a) (N sausage))
  )
)

[Figure: the corresponding tree drawing]

Brackets scores

$Pr = \frac{\text{number of brackets the parser and annotation agree on}}{\text{number of brackets predicted by the parser}}$

$Re = \frac{\text{number of brackets the parser and annotation agree on}}{\text{number of brackets in annotation}}$

$F_1 = \frac{2 \times Pr \times Re}{Pr + Re}$ (harmonic mean of precision and recall)
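These scores are straightforward to compute once trees are turned into brackets. A sketch, assuming brackets are (min, max, label) triples collected in sets and that preterminal brackets are left out; standard evaluation tools differ in such details, and the predicted brackets below are made up for illustration.

```python
def bracket_scores(predicted, gold):
    """Precision, recall and F1 over sets of (min, max, label) brackets."""
    predicted, gold = set(predicted), set(gold)
    agree = len(predicted & gold)
    pr = agree / len(predicted) if predicted else 0.0
    re = agree / len(gold) if gold else 0.0
    f1 = 2 * pr * re / (pr + re) if pr + re > 0 else 0.0
    return pr, re, f1


# Gold brackets for "My dog ate a sausage" (phrases only).
gold = {(0, 5, "S"), (0, 2, "NP"), (2, 5, "VP"), (3, 5, "NP")}
# A hypothetical parser output with one misplaced NP.
predicted = {(0, 5, "S"), (0, 2, "NP"), (2, 5, "VP"), (2, 4, "NP")}

pr, re, f1 = bracket_scores(predicted, gold)
print(pr, re, f1)
```

Three of the four predicted brackets match the annotation, and three of the four gold brackets are found, so precision, recall and F1 coincide in this example.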

Preview: F1 bracket score

[Bar chart with an F1 axis from 65 to 95, comparing: Treebank PCFG; Unlexicalized PCFG (Klein and Manning, 2003); Lexicalized PCFG (Collins, 1999); Automatically Induced PCFG (Petrov et al., 2006); the best results reported (as of 2012)]

We will introduce these models next time