Toward Dependency Path based Entailment

Post on 23-Jan-2016

Rodney Nielsen, Wayne Ward, and James Martin

Dependency Path-based Entailment

DIRT (Lin and Pantel, 2001)

Unsupervised method to discover inference rules
  “X is author of Y ≈ X wrote Y”
  “X solved Y ≈ X found a solution to Y”

If two dependency paths tend to link the same sets of words, they hypothesize that their meanings are similar
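The hypothesis above can be illustrated with a small sketch. It is not the DIRT implementation: Jaccard overlap of slot fillers stands in for the mutual-information-weighted similarity of Lin and Pantel, and the triples and function names are hypothetical toy values.

```python
from collections import defaultdict

# Hypothetical toy triples: (X filler, dependency path, Y filler).
TRIPLES = [
    ("Tolstoy", "X wrote Y", "War and Peace"),
    ("Tolstoy", "X is author of Y", "War and Peace"),
    ("Austen", "X wrote Y", "Emma"),
    ("Austen", "X is author of Y", "Emma"),
    ("Austen", "X lived in Y", "Bath"),
]

def slot_fillers(triples):
    """Collect the sets of words seen in the X and Y slots of each path."""
    fillers = defaultdict(lambda: {"X": set(), "Y": set()})
    for x, path, y in triples:
        fillers[path]["X"].add(x)
        fillers[path]["Y"].add(y)
    return fillers

def jaccard(a, b):
    """Set overlap; a simplified stand-in for a weighted slot similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def path_similarity(fillers, p1, p2):
    """Geometric mean of the X-slot and Y-slot similarities of two paths."""
    sim_x = jaccard(fillers[p1]["X"], fillers[p2]["X"])
    sim_y = jaccard(fillers[p1]["Y"], fillers[p2]["Y"])
    return (sim_x * sim_y) ** 0.5

fillers = slot_fillers(TRIPLES)
print(path_similarity(fillers, "X wrote Y", "X is author of Y"))
print(path_similarity(fillers, "X wrote Y", "X lived in Y"))
```

In this toy data the two authorship paths link exactly the same word sets, while "X lived in Y" shares no Y fillers with "X wrote Y".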

ML Classification Approach

Features derived from corpus statistics
  Unigram co-occurrence
  Surface form bigram co-occurrence
  Dependency-derived bigram co-occurrence

Mixture of experts
  About 18 ML classifiers from Weka toolkit
  Classify by majority vote or average probability
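The two combination rules can give different answers. A minimal sketch in Python (not the Weka API), assuming each expert outputs a probability that the pair is an entailment and that 0.5 is the decision threshold; the probabilities and function names are hypothetical.

```python
def majority_vote(probabilities, threshold=0.5):
    """True if more than half of the experts individually predict entailment."""
    votes = sum(1 for p in probabilities if p > threshold)
    return votes > len(probabilities) / 2

def average_probability(probabilities, threshold=0.5):
    """True if the mean of the experts' probabilities exceeds the threshold."""
    return sum(probabilities) / len(probabilities) > threshold

# Hypothetical outputs of five experts for one text-hypothesis pair.
expert_probs = [0.9, 0.95, 0.4, 0.45, 0.48]

print(majority_vote(expert_probs))
print(average_probability(expert_probs))
```

Here two confident experts outweigh three mildly negative ones under averaging, but lose the vote.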

Bag of Words, Graph Matching, Dependency Path-Based Entailment

Corpora

7.4M articles, 2.5B words, 347 words/doc
  Gigaword (Graff, 2003): 77% of documents
  Reuters Corpus (Lewis et al., 2004)
  TIPSTER

Lucene IR engine
  Two indices
    Word surface form
    Porter stem filter
  Stop words = {a, an, the}
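The slides name only the engine, the two index types, and the stop list. A minimal in-memory sketch of a surface-form index follows. It is not Lucene, it omits the Porter-stemmed index, and it assumes counts are document-level co-occurrences, which the slides do not state. Function names and documents are hypothetical.

```python
import re
from collections import defaultdict

STOP_WORDS = {"a", "an", "the"}  # stop list given on the slide

def tokenize(text):
    """Lowercase surface-form tokens with stop words removed."""
    return [tok for tok in re.findall(r"[a-z0-9]+", text.lower())
            if tok not in STOP_WORDS]

def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for tok in tokenize(text):
            index[tok].add(doc_id)
    return index

def doc_count(index, *terms):
    """Number of documents containing all of the given terms."""
    postings = [index.get(term, set()) for term in terms]
    return len(set.intersection(*postings)) if postings else 0

# Hypothetical three-document corpus.
docs = [
    "The cost of paper is rising.",
    "Paper costs are rising.",
    "A newspaper reported falling revenue.",
]
index = build_index(docs)
print(doc_count(index, "paper"), doc_count(index, "paper", "rising"))
```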

Core Features

Core Repeated Features

Product of MLEs

Average of MLEs

Geometric Mean of MLEs

Worst Non-Zero MLE

Entailing Ngrams for the Lowest Non-Zero MLE

Largest Entailing Ngram Count with a Zero MLE

Smallest Entailing Ngram Count with a Non-Zero MLE

Count of Ngrams in h that do not Co-occur with any Ngrams from t

Count of Ngrams in h that do Co-occur with Ngrams in t
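Several of the features listed above are simple aggregates over the per-ngram MLEs of a hypothesis. A minimal sketch, assuming one MLE per hypothesis ngram and treating an MLE of zero as "does not co-occur"; the function and key names are hypothetical, and the ngram-count features are left out because the slides do not define them further.

```python
import math

def core_repeated_features(mles):
    """Aggregate a non-empty list of per-ngram MLEs into summary features."""
    if not mles:
        raise ValueError("need at least one MLE")
    non_zero = [m for m in mles if m > 0]
    product = math.prod(mles)
    return {
        "product": product,
        "average": sum(mles) / len(mles),
        "geometric_mean": product ** (1 / len(mles)),
        "worst_non_zero": min(non_zero) if non_zero else 0.0,
        "count_cooccurring": len(non_zero),
        "count_not_cooccurring": len(mles) - len(non_zero),
    }

# Hypothetical MLEs for a two-ngram hypothesis.
print(core_repeated_features([0.25, 1.0]))
```

A single zero MLE drives both the product and the geometric mean to zero, which may be why the worst non-zero value and the two counts are tracked separately.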

Dependency Features

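Dependency bigram features

[Equation: dependency bigram probability $P(w, w, t)$ for a word pair with subscripts $p$ and $c$, defined as a maximum over pairs $v_p, v_c$ in $t$ of a co-occurrence count ratio $n / n$]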

[Figure: dependency trees for Hypothesis h (The cost of paper is rising) and Text t (Newspapers choke on rising paper costs and falling revenues)]
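As a rough illustration of where dependency bigrams come from, the sketch below lists the parent-child word pairs of the hypothesis tree. The edge list is an assumption read off the figure layout (rising over cost and is, cost over The and of, of over paper), and the function name is hypothetical.

```python
# Hypothesis tree for "The cost of paper is rising", as parent -> children.
# Assumed from the figure layout; the parser output on the slides may differ.
H_TREE = {
    "rising": ["cost", "is"],
    "cost": ["The", "of"],
    "of": ["paper"],
}

def dependency_bigrams(tree):
    """Return a (parent, child) word pair for every edge in the tree."""
    return [(parent, child)
            for parent, children in tree.items()
            for child in children]

for parent, child in dependency_bigrams(H_TREE):
    print(parent, child)
```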

Dependency Features

Descendent relation statistics

[Equation: the statistic $P(s_w, t)$ for a node $w$, defined from the dependency bigram probabilities linking $w$ to each of its $K$ children $c$ and from the children's own statistics $P(s_{w_c}, t)$]

Dependency Features

Descendent relation statistics

[Equation: worked example for $P(s_{of}, t)$, computed from the dependency bigram probability of "of" and its one child "paper"]

Dependency Features

Descendent relation statistics

[Equation: worked example for $P(s_{cost}, t)$, computed from the dependency bigram probabilities of "cost" with its two children "the" and "of" and from $P(s_{of}, t)$]

Dependency Features

Descendent relation statistics

[Equation: worked example for $P(s_{rising}, t)$, computed from the dependency bigram probabilities of "rising" with its two children "cost" and "is" and from $P(s_{cost}, t)$]

Verb Dependency Features

Combined verb descendent relation features

Worst verb descendent relation features

Subject Dependency Features

Combined and worst subject descendent relations

Combined and worst subject-to-verb paths

Other Dependency Features

Repeat these same features for:
  Object
  pcomp-n
  Other descendent relations

Results

RTE2 by Task        IE     IR     QA     SUM    Overall
Accuracy            55.5   64.0   55.0   70.0   61.1
Average Precision   49.4   73.0   57.3   80.7   65.2

RTE2 Accuracy     SUM    NonSUM   Overall
Test Set          70.0   58.2     61.1
Training Set CV   84.5   62.7     68.1
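The aggregate test-set figures are consistent with the per-task accuracies if each task contributes the same number of test pairs, which is an assumption here rather than something the slides state:

```python
# Per-task RTE2 test accuracies from the slide.
accuracy = {"IE": 55.5, "IR": 64.0, "QA": 55.0, "SUM": 70.0}

# Assumes equally sized tasks, so plain averages apply.
overall = sum(accuracy.values()) / len(accuracy)
non_sum = [value for task, value in accuracy.items() if task != "SUM"]
non_sum_avg = sum(non_sum) / len(non_sum)

print(round(overall, 1), round(non_sum_avg, 1))
```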

RTE1 Accuracy                CD            NonCD         Overall
Test Set (Best submission)   83.3 (83.3)   56.8 (52.8)   61.8 (58.6)
Training Set CV              83.7          56.9          61.6

Feature Analysis

All feature sets are contributing according to cross validation on the training set

Most significant feature set: Unigram stem based word alignment

Most significant core repeated feature: Average MLE

Questions

Mixture of experts classifier using corpus co-occurrence statistics
Moving in the direction of DIRT
Domain of Interest: Student response analysis in intelligent tutoring systems

Why Entailment

Intelligent Tutoring Systems
Student Interaction Analysis
  Are all aspects of the student’s answer entailed by the text and the gold standard answer?
  Are all aspects of the desired answer entailed by the student’s response?

Word Alignment Features

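Unigram word alignment

$P(Tr_h = 1 \mid t) = \prod_{w \in h} \max_{v \in t} \frac{n_{w,v}}{n_v}$

$P(Tr_w = 1 \mid t) = \max_{v \in t} P(Tr_w = 1 \mid v) = \max_{v \in t} \frac{n_{w,v}}{n_v}$

$\mathrm{MLE}(w, t) = \max_{v \in t} \frac{n_{w,v}}{n_v}$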

Word Alignment Features

Bigram word alignment

Example:
<t>Newspapers choke on rising paper costs and falling revenue.</t>
<h>The cost of paper is rising.</h>

$\mathrm{MLE}(\text{cost}, t) = n_{\text{cost of},\,\text{costs of}} / n_{\text{costs of}} = 6086 / 35800 = 0.17$
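The arithmetic in the example can be checked directly. In this minimal sketch, `mle` is the count ratio from the slide. `best_alignment` assumes the score for a hypothesis ngram is the maximum ratio over candidate text-side ngrams. Both function names are hypothetical.

```python
def mle(joint_count, text_count):
    """Ratio of co-occurrence count to text-side ngram count (0 if unseen)."""
    return joint_count / text_count if text_count else 0.0

def best_alignment(candidates):
    """Best ratio over candidates: {text ngram: (joint_count, text_count)}."""
    return max((mle(j, n) for j, n in candidates.values()), default=0.0)

# Counts from the slide: n(cost of, costs of) = 6086, n(costs of) = 35800.
print(round(mle(6086, 35800), 2))  # 0.17
```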

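[Equation: bigram alignment $\mathrm{MLE}$ for a hypothesis word $w_i$, with four variants numbered 1 to 4, each a maximum over $v_j$ in $t$ of a bigram co-occurrence count ratio $n / n$ built from $w_i$, $v_j$, and their neighbors at offsets $\pm 1$]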

Word Alignment Features

Average unigram and bigram

Stem-based tokens

Corpora

7.4M articles/docs & 2.5B words, 347 words/doc
  Gigaword (Graff, 2003)
    5.7M articles, 2.1B words, 375 words/article
    77% of documents and 83% of indexed words
  Reuters Corpus (Lewis et al., 2004)
    0.8M articles, 0.17B words, 213 words/article
  TIPSTER
    0.9M articles, 0.26B words, 291 words/article
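The totals and the Gigaword shares follow from the per-corpus figures. A short check, using the rounded values as printed on the slide (so small rounding differences are expected):

```python
# (articles in millions, words in billions), as listed on the slide.
corpora = {
    "Gigaword": (5.7, 2.1),
    "Reuters": (0.8, 0.17),
    "TIPSTER": (0.9, 0.26),
}

total_articles = sum(articles for articles, _ in corpora.values())
total_words = sum(words for _, words in corpora.values())
gigaword_doc_share = corpora["Gigaword"][0] / total_articles
gigaword_word_share = corpora["Gigaword"][1] / total_words

print(round(total_articles, 1), round(total_words, 2))
print(round(100 * gigaword_doc_share), round(100 * gigaword_word_share))
```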