Deciding entailment and contradiction with stochastic and edit distance-based alignment...

Deciding entailment and Deciding entailment and contradiction with stochastic and contradiction with stochastic and

edit distance-based alignmentedit distance-based alignment

Marie-Catherine de Marneffe,

Sebastian Pado, Bill MacCartney, Anna N. Rafferty, Eric Yeh and Christopher D. Manning

NLP Group

Stanford University

Three-stage architectureThree-stage architecture[MacCartney et al. NAACL 06]

T: India buys missiles.H: India acquires arms.

Attempts to improve the Attempts to improve the different stagesdifferent stages

1) Linguistic analysis:

improving dependency graphs

improving coreference

2) New alignment:

edit distance-based alignment

3) Inference:

entailment and contradiction

Stage I – Capturing long Stage I – Capturing long dependenciesdependencies

Maler realized the importance of publishing his investigations

Recovering long dependenciesRecovering long dependencies

• Training on dependency annotations in the WSJ segment of the Penn Treebank

• 3 MaxEnt classifiers:

1) Identify governor nodes thatare likely to have a missingrelationship

2) Identify the type of GR

3) Find the likeliest dependent(given GR and governor)

Some impact on RTESome impact on RTE

• Cannot handle conjoined dependents:

Pierre Curie and his wife realized the importance of advertising their discovery

• RTE results:

Accuracy With recovery

RTE2 test 61.25 63.38

RTE3 test 65.25 66.50

RTE4 62.60 62.70

Coreference with ILPCoreference with ILP

• Train pairwise classifier to make coreference decisions over pairs of mentions

• Use integer linear programming (ILP) to find best global solution• Normally pairwise classifiers enforce transitivity in an ad-hoc

manner• ILP enforces transitivity by construction

• Candidates:

all based-NP in the text and the hypothesis• No difference in results compared to the OpenNLP

coreference system

[Finkel and Manning ACL 08]

Stage II – Previous stochastic Stage II – Previous stochastic aligneraligner

Word alignment scores:

semantic similarity

Edge alignment scores:

structural similarity

• Linear model form:

• Perceptron learning of weights

sw (hi, t j ) =θw ⋅f(hi, t j )

se ((hi,h j ),(tk, tl )) =θe ⋅f((hi,h j ),(tk, tl ))

score(a) = scorew (hii∈h

∑ ,a(hi)) + scoree( i, j )∈e(h )

∑ ((hi,h j ),(a(hi),a(h j )))

Stochastic local search for Stochastic local search for alignmentsalignments

Complete state formulation

Start with a (possibly bad) completesolution, and try to improve it

At each step, select hypothesis word andgenerate all possible alignments

Sample successor alignment fromnormalized distribution, and repeat

New aligner: MANLINew aligner: MANLI

4 components:

1. Phrase-based representation

2. Feature-based scoring function

3. Decoding using simulated annealing

4. Perceptron learning on MSR RTE2 alignment data

[MacCartney et al. EMNLP 08]

Phrase-based alignment Phrase-based alignment representationrepresentation

DEL(In1)

…DEL(there5)

EQ(are6, are2)

SUB(very7 few8, poorly3 represented4)

EQ(women9, women1)

EQ(in10, in5)

EQ(parliament11, parliament6)

An alignment is a sequence of phrase edits: EQ, SUB, DEL, INS

• 1-to-1 at phrase level but many-to-many at token level:

avoids arbitrary alignment choices

can use phrase-based resources

• Score edits as linear combination of features, then sum:

A feature-based scoring functionA feature-based scoring function

• Edit type features:

EQ, SUB, DEL, INS

• Phrase features:

phrase sizes, non-constituents

• Lexical similarity feature (max over similarity scores)

WordNet, distributional similarity, string/lemma similarity

• Contextual features:

distortion, matching neighbors

RTE4 resultsRTE4 results

2-way 3-way Av. P

stochastic 61.4 55.3 44.2

MANLI 57.0 50.1 54.3

Error analysisError analysis

• MANLI alignments are sparse

- sure/possible alignments in MSR data

- need more paraphrase information

• Difference between previous RTE data and RTE4:

length ratio between text and hypothesis

• All else being equal, a longer text makes it likelier that a hypothesis can get over the threshold

RTE1 RTE3 RTE4

T/H 2:1 3:1 4:1

Stage III – Contradiction Stage III – Contradiction detectiondetection

1. Linguisticanalysis

2. Graphalignment

3.Contradiction features &classification

tunedthreshold

contradicts

doesn’tcontradict

score = = –2.001.84

T: A case of indigenously acquired rabies infection has been confirmed.H: No case of rabies was confirmed.

No rabies

det prep_of

–0.75

rabiesPOSNERIDF

--0.027

… … …

Feature f

Polarity difference - -2.00

No rabies

det prep_of

Arabies

infection

A rabies

detamod

infection

prep_of

Event coreference

[de Marneffe et al. ACL 08]

Event coreference is necessary Event coreference is necessary for contradiction detectionfor contradiction detection

• The contradiction features look for mismatching information between the text and hypothesis

• Problematic if the two sentences do not describe the same event

T: More than 2,000 people lost their lives in the devastating Johnstown Flood.

H: 100 or more people lost their lives in a ferry sinking.

Mismatching information:

more than 2,000 != 100 or more

Contradiction featuresContradiction features

RTE Contradiction

Polarity Polarity

Number, date and time Number, date and time

Antonymy Antonymy

Structure Structure

Factivity Factivity

Modality Modality

Relations Relations

Alignment

AdjectiveGradation, Hypernymy

Adjunct

more precisely defined

Contradiction & EntailmentContradiction & Entailment

• Both systems are run independently

• Trust entailment system more

RTE system

ENTAIL

Contradiction system

UNKNOWN NO

Contradiction resultsContradiction results

• Low recall

- 47 contradictions filtered out by the “event” filter

- 3 contradictions tagged as entailment

- contradictions requiring deep lexical knowledge

precision recall

submission: alone 26.3 10.0

combined 28.6 8.0

post hoc: with filter 27.54 12.67

without filter 30.14 14.67

Deep lexical knowledgeDeep lexical knowledge

T: … Power shortages are a thing of the past.

H: Nigeria power shortage is to persist.

T: … No children were among the victims.

H: A French train crash killed children.

T: … The report of a crash was a false alarm.

H: A plane crashes in Italy.

T: … The current food crisis was ignored.

H: UN summit targets global food crisis.

Precision errorsPrecision errors

• Hard to find contradiction features that reach high accuracy

% error

Bad alignment 23

Coreference 6

Structure 40

Antonymy 10

Negation 10

Relations 6

Numeric 3

More knowledge is necessaryMore knowledge is necessary

T: The company affected by this ban, Flour Mills of Fiji, exports nearly US$900,000 worth of biscuits to Vanuatu yearly.

H: Vanuatu imports biscuits from Fiji.

T: The Concord crashed […], killing all 109 people on board and four workers on the ground.

H: The crash killed 113 people.

ConclusionConclusion

• Linguistic analysis:

some gain when improving dependency graphs

• Alignment:

potential in phrase-based representation not yet proven: need better phrase-based lexical resources

• Inference:

can detect some contradictions, but need to improve precision & add knowledge for higher recall

Deciding entailment and contradiction with stochastic and edit distance-based alignment...

Documents

Transcript of Deciding entailment and contradiction with stochastic and edit distance-based alignment...

SALSA The Saarbrücken Lexical Semantics Annotation & Acquisition Project Aljoscha Burchardt, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Pado,

NATURAL LANGUAGE INFERENCE A DISSERTATION SUBMITTED …nlp.stanford.edu/~manning/dissertations/MacCartney-Bill... · 2016-09-18 · An intrinsic property of the NLI task deﬁnition

Computational Linguistics (aka Natural Language Processing) Bill MacCartney SymSys 100 Stanford University 26 May 2011 (some slides adapted from Chris.

Textual Entailment and Logical Inference · 2017. 12. 4. · Entailment Triggers. Possible features . from (de Marneffe et al., 2006) Polarity features. presence/absence of neative

CADO/PADO: Update on 2015 WHO Consolidated …...CADO/PADO: Update on 2015 WHO Consolidated guidelines Towards Treat All in the context of SDGs Meg Doherty, Treatment and Care Coordinator

Protection Mainstreaming Presentation by GPC/Protection Mainstreaming Task Team Julien Marneffe, Philippines, December 2013.

Pado: A Data Processing Engine for Harnessing Transient ...yunseong.github.io/papers/eurosys17-yang.pdf · 575 Pado: A Data Processing Engine for Harnessing Transient Resources in

Robust local textual inference Marie-Catherine de Marneffe, Bill MacCartney, Teg Grenager, Daniel Cer, Anna Rafferty, Christopher D. Manning NLP Group.

Curriculum Vitae Date Prepared: Name: Poornima …...1 Curriculum Vitae Date Prepared: September 26, 2018 Name: Poornima Kumar, PhD Office Address: McLean Hospital, de Marneffe Building,

Yolanda Heredia Escorza Armando Lozano Rodríguez … · `Losparticipantesson:lasautoridadesLos participantes son: las autoridades educativas, la directora, los profesores, el encargg,pado

Partition-Based Logical Reasoning Bill MacCartney (KSL), Sheila A. McIlraith (KSL), Eyal Amir (FRG/Berkeley), Tomas Uribe (SRI) Richard Fikes and John.

Modeling Semantic Containment and Exclusion in Natural Language Inference Bill MacCartney and Christopher D. Manning NLP Group Stanford University 22 August.

Update on CADO/PADO: what are the challenges in using … · Update on CADO/PADO: what are the ... 11 years (2015-2026) ... •Ensure API availability •Lower cost of production

An Extended Model of Natural Logic Bill MacCartney and Christopher D. Manning NLP Group Stanford University 8 January 2009.

Tetra - JAS...25mm Pado Mattel Max Steel BIO-0087 Cad. p/ Bike Pado 30mm/60mm Azul BIO-0128 Toy Story BIO-OOB TOY Cad. p/ Moto Pado Disco SM 687 CR BIO-0104 Created Date 1/11/2021

Peace and Development Organization Pakistan - PADO

Two Aspects of the Problem of Natural Language Inference Bill MacCartney NLP Group Stanford University 8 October 2008.

Containment, Exclusion, and Implicativity: A Model of Natural Logic for Textual Inference Bill MacCartney and Christopher D. Manning NLP Group Stanford.

T. S. Rappaport, Y. Xing, G. R. MacCartney, Jr., A. F ...

Natural Logic and Natural Language Inference Bill MacCartney Stanford University / Google, Inc. 8 April 2011.