Probabilistic Parsing: Enhancements
Ling 571: Deep Processing Techniques for NLP
January 26, 2011
Parser Issues
PCFGs make many (unwarranted) independence assumptions
Structural dependency: NP -> Pronoun is much more likely in subject position
Lexical dependency: verb subcategorization, coordination ambiguity
Improving PCFGs: Structural Dependencies
How can we capture the subject/object asymmetry? E.g., NPsubj -> Pron vs. NPobj -> Pron
Parent annotation: annotate each node with its parent in the parse tree
E.g., NP^S vs. NP^VP
Also annotate pre-terminals: RB^ADVP vs. RB^VP; IN^SBAR vs. IN^PP
Can also split rules on other conditions
Parent Annotation
Parent Annotation:Pre-terminals
Parent Annotation
Advantages:
Captures structural dependency in grammars
Disadvantages:
Increases number of rules in grammar
Decreases amount of training data per rule
Strategies to search for optimal # of rules
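The annotation step itself is a simple tree transform. As a sketch, assuming a toy tuple representation (label, children) rather than any particular treebank API:

```python
# Minimal parent-annotation sketch on a tuple tree: each node is
# (label, [children]); leaves are plain strings (words).

def parent_annotate(node, parent_label=None):
    """Return a copy of the tree with each non-terminal label suffixed
    by ^Parent, e.g. NP under S becomes NP^S."""
    label, children = node
    new_label = f"{label}^{parent_label}" if parent_label else label
    new_children = [
        child if isinstance(child, str)      # leaf word: left alone
        else parent_annotate(child, label)   # non-terminal: recurse
        for child in children
    ]
    return (new_label, new_children)

tree = ("S", [("NP", [("PRP", ["She"])]),
              ("VP", [("VBD", ["slept"])])])
annotated = parent_annotate(tree)
```

Note that the subject NP comes out as NP^S while an object NP would come out as NP^VP, which is exactly the split the asymmetry requires; the pre-terminal PRP is likewise split into PRP^NP.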
Improving PCFGs:Lexical Dependencies
Lexicalized rules:Best known parsers: Collins, Charniak parsersEach non-terminal annotated with its lexical head
E.g. verb with verb phrase, noun with noun phraseEach rule must identify RHS element as head
Heads propagate up treeConceptually like adding 1 rule per head value
VP(dumped) -> VBD(dumped)NP(sacks)PP(into)VP(dumped) -> VBD(dumped)NP(cats)PP(into)
Lexicalized PCFGsAlso, add head tag to non-terminals
Head tag: Part-of-speech tag of head wordVP(dumped) -> VBD(dumped)NP(sacks)PP(into)VP(dumped,VBD) ->
VBD(dumped,VBD)NP(sacks,NNS)PP(into,IN)
Two types of rules:Lexical rules: pre-terminal -> word
Deterministic, probability 1 Internal rules: all other expansions
Must estimate probabilities
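Head propagation can be sketched as a bottom-up pass over the tree. The HEAD_CHILD table below is a toy stand-in for real head-finding rules (e.g. Collins-style head tables), not the actual parsers' tables:

```python
# Sketch: propagate lexical heads up a tuple tree (label, [children]).
# Toy head table: which child category supplies the head of each phrase.
HEAD_CHILD = {"S": "VP", "VP": "VBD", "NP": "NNS", "PP": "IN"}

def lexicalize(node):
    """Annotate each non-terminal as label(headword,headtag);
    returns (annotated_node, (headword, headtag))."""
    label, children = node
    if len(children) == 1 and isinstance(children[0], str):
        word = children[0]                 # pre-terminal: word is its own head
        return (f"{label}({word},{label})", children), (word, label)
    new_children, heads = [], []
    for child in children:
        new_child, head = lexicalize(child)
        new_children.append(new_child)
        heads.append((child[0], head))     # remember child's original label
    wanted = HEAD_CHILD.get(label)         # head child named by the table,
    head = next((h for lbl, h in heads if lbl == wanted), heads[0][1])
    word, tag = head                       # else default to first child
    return (f"{label}({word},{tag})", new_children), head

tree = ("VP", [("VBD", ["dumped"]),
               ("NP", [("NNS", ["sacks"])]),
               ("PP", [("IN", ["into"]), ("NP", [("NNS", ["bins"])])])])
lexicalized, vp_head = lexicalize(tree)
```

Running this on the example VP yields the annotated label VP(dumped,VBD) at the root, with NP(sacks,NNS) and PP(into,IN) beneath it, matching the rule shown above.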
PLCFGs
Issue: too many rules; no corpus has enough examples to estimate them all
(Partial) solution: assume independence
Condition each rule on the category of the LHS and its head
Condition each head on the category of the LHS and the parent's head
Disambiguation Example
p(VP -> VBD NP PP | VP, dumped) = C(VP(dumped) -> VBD NP PP) / C(VP(dumped)) = 6/9 = 0.67
p(VP -> VBD NP | VP, dumped) = C(VP(dumped) -> VBD NP) / C(VP(dumped)) = 0/9 = 0
p(into | PP, dumped) = C(X(dumped) -> ... PP(into) ...) / C(X(dumped) -> ... PP ...) = 2/9 = 0.22
p(into | PP, sacks) = C(X(sacks) -> ... PP(into) ...) / C(X(sacks) -> ... PP ...) = 0/0 (undefined)
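These relative-frequency estimates reduce to simple count ratios. A minimal sketch, where the counts (6 of 9, 2 of 9, ...) are copied from the example above rather than computed from a treebank:

```python
# MLE for lexicalized rule probabilities: p = C(rule, head) / C(LHS, head).
from fractions import Fraction

def mle(rule_count, lhs_count):
    """Relative-frequency estimate; None when the head was never seen."""
    if lhs_count == 0:
        return None          # 0/0: undefined, needs smoothing/backoff
    return Fraction(rule_count, lhs_count)

p_vbd_np_pp = mle(6, 9)      # p(VP -> VBD NP PP | VP, dumped)
p_vbd_np    = mle(0, 9)      # p(VP -> VBD NP | VP, dumped)
p_into_dump = mle(2, 9)      # p(into | PP, dumped)
p_into_sack = mle(0, 0)      # p(into | PP, sacks): undefined
```

The undefined 0/0 case is exactly why lexicalized parsers back off to less specific conditioning (as in the independence assumptions above) rather than using raw counts.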
Improving PCFGs: Tradeoffs
Tension: increasing accuracy requires increasing specificity
E.g., lexicalization, parent annotation, Markovization, etc.
But this also increases grammar size, processing time, and training data requirements
How can we balance these?
CNF Factorization & Markovization
CNF factorization: converts n-ary branching to binary branching
Can maintain information about original structure: neighborhood history and parent
Issue: potentially explosive
If we keep all context: 72 -> 10K non-terminals!
How much context should we keep? What Markov order?
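The Markov order question can be made concrete with a small binarizer. A sketch, assuming a simple (LHS, RHS-list) rule format: the intermediate symbols remember only the last h generated siblings, so h = 0 forgets everything and large h keeps full context (and explodes the grammar):

```python
# Binarization with horizontal Markovization of order h.

def binarize(lhs, rhs, h=1):
    """Turn LHS -> X1 ... Xn into binary rules with markovized
    intermediate labels like VP|<NP>."""
    if len(rhs) <= 2:
        return [(lhs, tuple(rhs))]
    rules, parent = [], lhs
    for i in range(len(rhs) - 2):
        seen = rhs[max(0, i + 1 - h):i + 1]   # last h siblings generated
        new_sym = f"{lhs}|<{'-'.join(seen)}>"
        rules.append((parent, (rhs[i], new_sym)))
        parent = new_sym
    rules.append((parent, (rhs[-2], rhs[-1])))
    return rules

rules = binarize("VP", ["VBD", "NP", "PP", "PP"], h=1)
```

With h = 1 the rule VP -> VBD NP PP PP yields intermediates VP|&lt;VBD&gt; and VP|&lt;NP&gt;; with h = 0 both collapse to a single VP|&lt;&gt; symbol, trading context for a smaller grammar.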
Different Markov Orders
Markovization & Costs (Mohri & Roark 2006)
Efficiency
PCKY is O(V n^3)
The grammar can be huge, and extremely ambiguous: hundreds of analyses are not unusual, especially for long sentences
However, we only care about the best parses; the others can be pretty bad
Can we use this to improve efficiency?
Beam Thresholding
Inspired by the beam search algorithm
Assume low-probability partial parses are unlikely to yield a high-probability parse overall
Keep only the top k most probable partial parses: retain only k choices per cell
For large grammars, k could be 50 or 100; for small grammars, 5 or 10
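The per-cell pruning step can be sketched in a few lines, assuming a chart cell represented as a dict mapping non-terminal label to log probability:

```python
# Beam thresholding: keep only the k most probable constituents per cell.
import heapq

def prune_cell(cell, k):
    """cell: dict label -> log probability; returns the pruned cell."""
    if len(cell) <= k:
        return cell
    return dict(heapq.nlargest(k, cell.items(), key=lambda kv: kv[1]))

cell = {"NP": -2.0, "VP": -5.0, "S": -1.0, "PP": -9.0}
pruned = prune_cell(cell, k=2)   # only the two best labels survive
```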
Heuristic Filtering
Intuition: some rules/partial parses are unlikely to end up in the best parse; don't store those in the table.
Exclusions:
Low frequency: exclude singleton productions
Low probability: exclude constituents x s.t. p(x) < 10^-200
Low relative probability: exclude x if there exists y s.t. p(y) > 100 * p(x)
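The two probability-based exclusions can be sketched as a log-space test (the singleton-frequency filter would instead be applied when inducing the grammar). The thresholds come from the slide above; the function shape is an assumption:

```python
# Heuristic filtering of chart constituents in log space.
import math

ABS_FLOOR = math.log(1e-200)     # low-probability cutoff: p(x) < 10^-200
REL_FACTOR = math.log(100.0)     # relative cutoff: 100x worse than best

def keep(logp, best_logp_in_cell):
    """Return False if the constituent should be filtered out."""
    if logp < ABS_FLOOR:                         # low probability
        return False
    if best_logp_in_cell - logp > REL_FACTOR:    # low relative probability
        return False
    return True
```

Note the relative test compares against the best constituent in the same cell, so filtering gets more aggressive exactly where one analysis clearly dominates.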
Notes on HW#3
Outline:
Induce grammar from (small) treebank
Implement Probabilistic CKY
Evaluate parser
Improve parser
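The probabilistic CKY step can be sketched for a grammar already in CNF. The dictionary-based grammar format and the toy rules below are illustrative assumptions, not the assignment's required interfaces:

```python
# Minimal probabilistic CKY: Viterbi probabilities over a CNF grammar.
from collections import defaultdict

def pcky(words, lexical, binary):
    """lexical: dict word -> [(POS, prob)]
       binary:  dict (B, C) -> [(A, prob)] for rules A -> B C
       Returns chart[(i, j)][label] = best probability of label over i..j."""
    n = len(words)
    chart = defaultdict(dict)
    for i, w in enumerate(words):                 # fill the diagonal
        for pos, p in lexical.get(w, []):
            chart[(i, i + 1)][pos] = p
    for span in range(2, n + 1):                  # grow spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):             # try every split point
                for b, pb in chart[(i, k)].items():
                    for c, pc in chart[(k, j)].items():
                        for a, pr in binary.get((b, c), []):
                            p = pr * pb * pc
                            if p > chart[(i, j)].get(a, 0.0):
                                chart[(i, j)][a] = p
    return chart

lexical = {"she": [("NP", 1.0)], "slept": [("VP", 0.5)]}
binary = {("NP", "VP"): [("S", 0.9)]}
chart = pcky(["she", "slept"], lexical, binary)
```

Evaluation and the later improvements then operate on the chart this returns; a real solution would also track backpointers to recover the best tree.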
Treebank Format
Adapted from Penn Treebank format
Rules simplified:
Removed traces and other null elements
Removed complex tags
Reformatted POS tags as non-terminals
Reading the Parses
POS unary collapse: (NP_NNP Ontario) was originally (NP (NNP Ontario))
Binarization: VP -> VP' PP; VP' -> VB PP was originally VP -> VB PP PP
Start Early!