Natural Language Processing
Lecture 13: More on CFG Parsing

Transcript of slides from demo.clab.cs.cmu.edu › NLP › S19 › files › slides › 13-earley.pdf

Page 1:

Natural Language Processing

Lecture 13: More on CFG Parsing

Page 2:

Probabilistic/Weighted Parsing

Page 3:

Example: ambiguous parse

Page 4:

Probabilistic CFG

Page 5:

Ambiguous parse with probabilities

[Figure: the two parse trees for the ambiguous sentence, with a rule probability (0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.75) attached to each edge.]

P(left) = 2.2 × 10^-6
P(right) = 6.1 × 10^-7

Page 6:

Review: Context-Free Grammars

• Vocabulary of terminal symbols, Σ

• Set of nonterminal symbols (a.k.a. variables), N

• Special start symbol S ∈ N

• Production rules of the form X → α, where
  X ∈ N
  α ∈ (N ∪ Σ)*  (in CNF: α ∈ N² ∪ Σ)

Page 7:

Probabilistic Context-Free Grammars

• Vocabulary of terminal symbols, Σ

• Set of nonterminal symbols (a.k.a. variables), N

• Special start symbol S ∈ N

• Production rules of the form X → α, each with a positive weight p(X → α), where
  X ∈ N
  α ∈ (N ∪ Σ)*  (in CNF: α ∈ N² ∪ Σ)
  ∀X ∈ N, ∑α p(X → α) = 1
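As a sketch, a PCFG in CNF can be stored as nested dictionaries; the toy rules and probabilities below are invented for illustration, and the final loop checks the ∑α p(X → α) = 1 constraint:

```python
# A toy PCFG in CNF, stored as {lhs: {rhs_tuple: probability}}.
# All rules and probabilities here are invented for illustration.
pcfg = {
    "S":  {("V", "NP"): 1.0},
    "NP": {("DT", "N"): 1.0},
    "V":  {("book",): 1.0},
    "DT": {("the",): 0.7, ("that",): 0.3},
    "N":  {("flight",): 1.0},
}

# Check the normalization constraint: for every X, the rule
# probabilities p(X -> alpha) must sum to 1.
for lhs, rules in pcfg.items():
    total = sum(rules.values())
    assert abs(total - 1.0) < 1e-9, f"rules for {lhs} sum to {total}"
print("all rule distributions sum to 1")
```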

Page 8:

CKY Algorithm: Review

for i = 1 ... n

C[i-1, i] = { V | V → w_i }

for ℓ = 2 ... n // width

for i = 0 ... n - ℓ // left boundary

k = i + ℓ // right boundary

for j = i + 1 ... k – 1 // midpoint

C[i, k] = C[i, k] ∪ { V | V → YZ, Y ∈ C[i, j], Z ∈ C[j, k] }

return true if S ∈ C[0, n]
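The pseudocode above translates directly into a runnable recognizer. The grammar encoding (binary rules keyed by their right-hand sides) and the toy grammar, including an invented CNF rule S → V NP covering the imperative, are assumptions of this sketch:

```python
# Boolean CKY recognizer following the pseudocode above. The grammar
# encoding, and the invented CNF rule S -> V NP for the imperative,
# are assumptions of this sketch.
def cky_recognize(words, binary, lexical, start="S"):
    n = len(words)
    C = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):                  # C[i-1, i] = { V | V -> w_i }
        C[i - 1][i] = set(lexical.get(words[i - 1], set()))
    for width in range(2, n + 1):              # width
        for i in range(0, n - width + 1):      # left boundary
            k = i + width                      # right boundary
            for j in range(i + 1, k):          # midpoint
                for (Y, Z), parents in binary.items():
                    if Y in C[i][j] and Z in C[j][k]:
                        C[i][k] |= parents
    return start in C[0][n]

binary = {("V", "NP"): {"S", "VP"}, ("DT", "N"): {"NP"}}
lexical = {"book": {"V"}, "the": {"DT"}, "flight": {"N"}}
print(cky_recognize(["book", "the", "flight"], binary, lexical))  # True
```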

Page 9:

Weighted CKY Algorithm

for i = 1 ... n, V ∈ N

C[V, i-1, i] = p(V → w_i)

for ℓ = 2 ... n // width of span

for i = 0 ... n - ℓ // left boundary

k = i + ℓ // right boundary

for j = i + 1 ... k – 1 // midpoint

for each binary rule V → YZ

C[V, i, k] = max{ C[V, i, k], C[Y, i, j] × C[Z, j, k] × p(V → YZ) }

return C[S, 0, n] // probability of the best parse
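A minimal runnable version of the weighted CKY above; the dictionary-based grammar encoding and the toy probabilities are assumptions for illustration:

```python
# Max-product CKY following the pseudocode above: C[i][k][V] holds the
# probability of the best V spanning words i..k.
def weighted_cky(words, binary, lexical, start="S"):
    n = len(words)
    C = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):                   # C[V, i-1, i] = p(V -> w_i)
        for V, rules in lexical.items():
            if words[i - 1] in rules:
                C[i - 1][i][V] = rules[words[i - 1]]
    for width in range(2, n + 1):               # width of span
        for i in range(0, n - width + 1):       # left boundary
            k = i + width                       # right boundary
            for j in range(i + 1, k):           # midpoint
                for V, rules in binary.items(): # each binary rule V -> Y Z
                    for (Y, Z), p in rules.items():
                        if Y in C[i][j] and Z in C[j][k]:
                            cand = C[i][j][Y] * C[j][k][Z] * p
                            if cand > C[i][k].get(V, 0.0):
                                C[i][k][V] = cand
    return C[0][n].get(start, 0.0)

binary = {"S": {("V", "NP"): 1.0}, "NP": {("DT", "N"): 1.0}}
lexical = {"V": {"book": 0.5}, "DT": {"the": 1.0}, "N": {"flight": 0.2}}
print(weighted_cky(["book", "the", "flight"], binary, lexical))  # 0.1
```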

Page 10:

CKY Algorithm: Review

Page 11:

Weighted CKY Algorithm

Page 12:

P-CKY algorithm from book

Page 13:

Page 14:

Parsing as (Weighted) Deduction

Page 15:

Earley’s Algorithm

Page 16:

Page 17:

Example Grammar (same for CKY)

Page 18:

02/26/2019, Speech and Language Processing - Jurafsky and Martin (slide 18)

Earley Parsing

• Allows arbitrary CFGs
• Top-down control
• Fills a table (or chart) in a single sweep over the input
  – Table is length N+1; N is the number of words
  – Table entries represent:
    • Completed constituents and their locations
    • In-progress constituents
    • Predicted constituents

Page 19:


States

• The table entries are called states and are represented with dotted rules.

S → . VP              A VP is predicted
NP → Det . Nominal    An NP is in progress
VP → V NP .           A VP has been found

Page 20:


States/Locations

• S → . VP [0,0]              A VP is predicted at the start of the sentence

• NP → Det . Nominal [1,2]    An NP is in progress; the Det goes from 1 to 2

• VP → V NP . [0,3]           A VP has been found starting at 0 and ending at 3
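These dotted-rule states with spans can be captured in a small data structure; the class and field names here are my own, not from the slides:

```python
from dataclasses import dataclass

# One Earley state: a dotted rule plus its span, e.g. "VP -> V . NP [0,1]".
@dataclass(frozen=True)
class State:
    lhs: str        # left-hand side of the rule
    rhs: tuple      # right-hand-side symbols
    dot: int        # dot position within rhs
    start: int      # left edge of the span
    end: int        # right edge of the span

    def is_complete(self):
        return self.dot == len(self.rhs)

    def next_symbol(self):
        return None if self.is_complete() else self.rhs[self.dot]

s = State("VP", ("V", "NP"), 1, 0, 1)      # VP -> V . NP [0,1]
print(s.is_complete(), s.next_symbol())    # False NP
```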

Page 21:


Earley top-level

• As with most dynamic programming approaches, the answer is found by looking in the table in the right place.

• In this case, there should be an S state in the final column that spans from 0 to N and is complete. That is,

S → α . [0,N]

• If that’s the case, you’re done.

Page 22:


Earley top-level (2)

• So sweep through the table from 0 to N…
  – New predicted states are created by starting top-down from S
  – New incomplete states are created by advancing existing states as new constituents are discovered
  – New complete states are created in the same way.

Page 23:


Earley top-level (3)

• More specifically…
  1. Predict all the states you can up front
  2. Read a word
     1. Extend states based on matches
     2. Generate new predictions
     3. Go to step 2
  3. When you’re out of words, look at the chart to see if you have a winner

Page 24:

Earley code: top-level

Page 25:

Earley code: 3 main functions

Page 26:


Extended Earley Example

• Book that flight

• We should find: an S from 0 to 3 that is a completed state

Page 27:

Page 28:

Page 29:

Page 30:

Page 31:

Page 32:

Earley’s Algorithm in equations

• We can look at this from the declarative programming point of view too.

axiom:  ROOT → • S [0,0]
goal:   ROOT → S • [0,n]

book the flight through Chicago

Page 33:

Earley’s Algorithm: PREDICT

Given V → α•Xβ [i, j] and the rule X → γ,
create X → •γ [j, j]

ROOT → • S [0,0]
S → • VP [0,0]
S → • NP VP [0,0]
...
VP → • V NP [0,0]
...
NP → • DT N [0,0]
...

book the flight through Chicago

ROOT → • S [0,0]    (rule S → VP)    ⇒    S → • VP [0,0]

Page 34:

Earley’s Algorithm: SCAN

Given V → α•Tβ [i, j] and the rule T → w_{j+1},
create T → w_{j+1} • [j, j+1]

ROOT → • S [0,0]
S → • VP [0,0]
S → • NP VP [0,0]
...
VP → • V NP [0,0]
...
NP → • DT N [0,0]
...

V → book • [0, 1]

book the flight through Chicago

VP → • V NP [0,0]    (rule V → book)    ⇒    V → book • [0,1]

Page 35:

Earley’s Algorithm: COMPLETE

Given V → α•Xβ [i, j] and X → γ• [j, k],
create V → αX•β [i, k]

ROOT → • S [0,0]
S → • VP [0,0]
S → • NP VP [0,0]
...
VP → • V NP [0,0]
...
NP → • DT N [0,0]
...

V → book • [0, 1]
VP → V • NP [0,1]

book the flight through Chicago

VP → • V NP [0,0]  +  V → book • [0,1]    ⇒    VP → V • NP [0,1]
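The PREDICT, SCAN, and COMPLETE operations above can be combined into a minimal recognizer. The tuple encoding of states and the toy grammar are assumptions of this sketch:

```python
# A minimal Earley recognizer combining PREDICT, SCAN, and COMPLETE.
# States are tuples (lhs, rhs, dot, start); chart[j] holds states ending at j.
def earley_recognize(words, grammar, start="S"):
    n = len(words)
    chart = [set() for _ in range(n + 1)]
    chart[0].add(("ROOT", (start,), 0, 0))     # axiom: ROOT -> . S [0,0]
    for j in range(n + 1):
        agenda = list(chart[j])
        while agenda:
            lhs, rhs, dot, i = agenda.pop()
            if dot < len(rhs):
                X = rhs[dot]
                if X in grammar:               # PREDICT: X -> . gamma [j,j]
                    for gamma in grammar[X]:
                        s = (X, gamma, 0, j)
                        if s not in chart[j]:
                            chart[j].add(s)
                            agenda.append(s)
                elif j < n and X == words[j]:  # SCAN: advance over word j+1
                    chart[j + 1].add((lhs, rhs, dot + 1, i))
            else:                              # COMPLETE: advance waiting states
                for (L, R, d, i2) in list(chart[i]):
                    if d < len(R) and R[d] == lhs:
                        s = (L, R, d + 1, i2)
                        if s not in chart[j]:
                            chart[j].add(s)
                            agenda.append(s)
    return ("ROOT", (start,), 1, 0) in chart[n]  # goal: ROOT -> S . [0,n]

grammar = {"S": [("VP",)], "VP": [("V", "NP")], "NP": [("DT", "N")],
           "V": [("book",)], "DT": [("that",)], "N": [("flight",)]}
print(earley_recognize(["book", "that", "flight"], grammar))  # True
```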

Page 36:

Page 37:

Thought Questions

• Runtime?
  – O(n³)

• Memory?
  – O(n²)

• Can we make it faster?

• Recovering trees?

Page 38:

Make it an Earley Parser

Page 39:

Parsing as Search, Again

Page 40:

Implementing Recognizers as Search

Agenda = { state0 }

while (Agenda not empty)
  s = pop a state from Agenda
  if s is a success-state, return s // valid parse tree
  else if s is not a failure-state:
    generate new states from s
    push new states onto Agenda

return nil // no parse!

Page 41:

Agenda-Based Probabilistic Parsing

Agenda = { (item, value) : initial updates from equations }
// items take the form [X, i, j]; values are reals

while (Agenda not empty)
  u = pop an update from Agenda
  if u.item is goal, return u.value // weight of best parse
  else if u.value > Chart[u.item]:
    store Chart[u.item] ← u.value
    if u.item combines with other Chart items:
      generate new updates from u and items stored in Chart
      push new updates onto Agenda

return nil // no parse!
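The agenda loop above can be sketched for weighted CKY items [X, i, j], using a max-heap (negated priorities) so the highest-valued update pops first; with probabilities ≤ 1, the first time the goal pops its value is the best parse probability. The grammar encoding and toy probabilities are invented for illustration:

```python
import heapq

# Agenda-based best-first probabilistic parsing over CKY items (X, i, j).
def agenda_parse(words, binary, lexical, start="S"):
    n = len(words)
    chart = {}
    agenda = []
    for i, w in enumerate(words):              # initial updates from the lexicon
        for V, rules in lexical.items():
            if w in rules:
                heapq.heappush(agenda, (-rules[w], (V, i, i + 1)))
    while agenda:
        neg, item = heapq.heappop(agenda)
        value = -neg
        if item == (start, 0, n):
            return value                       # weight of the best parse
        if value > chart.get(item, 0.0):
            chart[item] = value
            X, i, j = item
            # combine the new item with compatible items already in the chart
            for (V, (Y, Z)), p in binary.items():
                if X == Y:
                    for (X2, j2, k), v2 in list(chart.items()):
                        if X2 == Z and j2 == j:
                            heapq.heappush(agenda, (-(value * v2 * p), (V, i, k)))
                if X == Z:
                    for (X2, h, i2), v2 in list(chart.items()):
                        if X2 == Y and i2 == i:
                            heapq.heappush(agenda, (-(v2 * value * p), (V, h, j)))
    return None                                # no parse

binary = {("S", ("V", "NP")): 1.0, ("NP", ("DT", "N")): 1.0}
lexical = {"V": {"book": 0.5}, "DT": {"the": 1.0}, "N": {"flight": 0.2}}
print(agenda_parse(["book", "the", "flight"], binary, lexical))  # 0.1
```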

Page 42:

Catalog of CF Parsing Algorithms

• Recognition/Boolean vs. parsing/probabilistic

• Chomsky normal form/CKY vs. general/Earley’s

• Exhaustive vs. agenda