Classification-based Contextual Preferences

Shachar Mirkin

July 2011

TextInfer, Edinburgh

Joint work with: Ido Dagan, Lili Kotlerman, Idan Szpektor

Context matching in inference

• Motivation
• Task definition

Motivation: Inference & ambiguity

H: The US accepts a large number of foreigners every year

T: The US welcomes hundreds of thousands of aliens yearly

T: If it’s any consolation, dear, our alien abduction insurance is finally going to pay off

Rules: alien → foreigner, welcome → accept

Addressing ambiguity

T: The US welcomes hundreds of thousands of aliens yearly

• WSD
• Direct sense matching
  • Dagan et al., 2006: binary task
  • Avoiding intermediate representation of meanings
  • Suggested for synonyms
• Lexical Substitution (McCarthy & Navigli, 2009)

T1: The US welcomes hundreds of thousands of foreigners yearly

[Figure: two identifiers compared, 10123254 ?= 10123254, labeled "match?"]

Context matching

• Context matching generalizes sense matching
  • Does aliens in T match the meaning of outer-space?
  • Does ‘children acquire English’ match X acquire Y → X learn Y?

• Expected output:
  1. Yes / No
  2. Score / probability, quantifying the match degree

• Both can be provided by classifiers

Name-based Text Categorization*

* This isn’t the task, just the setting

Name-based Text Categorization (TC)

• Unsupervised setting of TC
• The category name is the only input: space, medicine, religion

State of the art (Barak et al., 2009), a vector-space IR approach:

• Query: category name
• Expanded with its entailing terms

A document must include a (direct or indirect) match to the category

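[Figure: the category name h: space with its entailing terms star, alien, shuttle and room; a match t on shuttle in a document is linked to h by the rule r: shuttle → space]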
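The retrieval setting above can be sketched in a few lines. This is a minimal sketch, not Barak et al.'s system: the rule table `RULES`, the whitespace tokenization and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical rule table: category name -> entailing terms (illustration only).
RULES = {"space": ["star", "alien", "shuttle", "room"]}

def expand_query(category):
    """The query is the category name plus its entailing terms."""
    return [category] + RULES.get(category, [])

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def score(document, category):
    """Similarity between a document and the expanded category query."""
    doc_vec = Counter(document.lower().split())
    query_vec = Counter(expand_query(category))
    return cosine(doc_vec, query_vec)

docs = [
    "Millions of Americans followed the space shuttle Atlantis' final mission",
    "The server ran out of disk space",
    "The committee approved the budget",
]
scores = [score(d, "space") for d in docs]
```

The second document receives a non-zero score although "disk space" is unrelated to outer space, which is the kind of match a context model is meant to filter out.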

A closer look at Context Matching

• Context matching aspects
• Directionality

Contextual matches

• Contextual (mis)matches in Name-based TC:
  • t – h
    • H: (outer) space
    • T: The server ran out of disk space
  • r – h
    • r: room → space
    • Invalid for the category
  • t – r
    • r: star → space is a valid rule for space
    • T: Elle Magazine accused of 'whitening' Bollywood star's skin

[Figure: t, r and h, labeled "match?"]

Context matching directionality

• Different roles for t, h & r
  i. t should match the meaning of h and r
  ii. r should match the meaning of h

T: If it’s any consolation, dear, our alien abduction insurance is finally going to pay off

• alien in T should match the meaning of foreigner in H
  • and not the other way around

[Figure: t, r and h, with a "?" on the match]

Contextual Preferences (Szpektor et al., 2008)

• CP: the context matching framework
  • The 3 matches
  • Directionality

• Prior work
• Concrete context matching models in Szpektor et al.

[Figure: t, r and h, labeled CP]

The Contextual Preferences (CP) framework

• Each inference object has a contextual representation, CP
• In inference operations, CPs should also be matched

• Two contextual information types
  • Variables' selectional preferences (CPv)
  • Topical / global (CPg)

Example: X acquire Y
  cpv(X) = {child}
  cpv(Y) = {English}
  cpg() = {learning, grammar}

• Prior work addressed specific aspects of context matching:
  Pantel et al., 2007; Pennacchiotti et al., 2007; Downey et al., 2007; Harabagiu et al., 2003 (QA); Patwardhan and Riloff, 2007; Dagan et al., 2006; Barak et al., 2009 (TC); Connor and Roth, 2007

[Figure: t, r and h, annotated with cpg(t), cpv(t); cpg(r), cpv(r); cpg(h), cpv(h)]

Szpektor et al.’s implementation of CP

[Figure: t, r and h with their preferences cpg(t), cpv(t); cpg(r), cpv(r); cpg(h), cpv(h), matched using:]
• Lin similarity of instantiations
• Cosine similarity of LSA vectors
• Lin sim. of instantiations × score of comparing preferred NE

• Our goal: implementation of CP with a model which is:
  • Unified
  • Directional
  • Incorporates various context information types

Classification-based approach

A classification-based scheme for CP

• Classifiers for context matching
• Classifiers for all CP context matching aspects

Using classifiers to represent & assess context

• A classifier to identify valid contexts of h
  • Trained with typical valid & invalid contexts of h

• Given a match t:
  • Ch is applied on the context of t
  • Is it a valid context for the intended meaning of h?

Example, h: (outer) space
  • Millions of Americans followed the space shuttle Atlantis' final mission
  • Elle Magazine accused of 'whitening' Bollywood star's skin
  • Ch:space(.. Bollywood <t>'s skin ..)

[Figure: t, r and h; the classifier Ch:space is applied to the context of t, giving Ch(t)]
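Applying Ch to the context of a match can be sketched as follows. This is a toy stand-in under stated assumptions: a word-count linear scorer replaces the SVM with WSD features used in the experiments, the four training contexts are invented for illustration, and the class and function names are hypothetical. The matched term is masked as `<t>`, as on the slide.

```python
import re
from collections import Counter

def tokens(text):
    """Lowercased alphabetic tokens; the mask <t> becomes the token 't'."""
    return re.findall(r"[a-z]+", text.lower())

class ContextClassifier:
    """Toy Ch: word weight = (#valid contexts with it) - (#invalid contexts with it)."""

    def __init__(self, valid_contexts, invalid_contexts):
        self.weights = Counter()
        for text in valid_contexts:
            self.weights.update(set(tokens(text)))
        for text in invalid_contexts:
            self.weights.subtract(set(tokens(text)))

    def score(self, context):
        """Average word weight: positive means the context looks valid for h."""
        words = tokens(context)
        if not words:
            return 0.0
        return sum(self.weights[w] for w in words) / len(words)

    def is_valid(self, context):
        return self.score(context) > 0

# Invented training contexts for h: (outer) space.
ch_space = ContextClassifier(
    valid_contexts=[
        "Astronauts boarded the space <t> for its final mission",
        "The telescope observed a distant <t> in another galaxy",
    ],
    invalid_contexts=[
        "The actress became a Bollywood movie <t> last year",
        "The magazine retouched the skin of the pop <t>",
    ],
)

shuttle_context = "Millions of Americans followed the space <t> Atlantis' final mission"
star_context = "Elle Magazine accused of 'whitening' Bollywood <t>'s skin"
```

`ch_space.score(...)` gives the graded output and `ch_space.is_valid(...)` the yes/no output; on these inputs the shuttle context is accepted and the Bollywood context is rejected.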

Classification-based scheme: addressing all CP matches

• A classifier to identify valid contexts of the hypothesis h
  • Applied to t
  • Applied to the rule r

• A classifier to identify valid contexts of the rule r
  • Applied to t

• Classifiers in prior work addressed the t-r match (Kauchak and Barzilay, 2006; Dagan et al., 2006; Connor and Roth, 2007)

[Figure: h: (outer) space; r: star → space; t: .. Bollywood star's skin ..; Ch gives Ch(t) and Ch(r), Cr gives Cr(t)]

A self-supervised context model

• A concrete model for the classification-based scheme

A self-supervised context model

• Unsupervised setting, no labeled examples

• Self-supervised: training examples automatically obtained
  • By querying the TC training set

• Challenge: constructing queries that correctly retrieve valid or invalid contexts

• Key principles of solution:
  1. Add disambiguation information (only) when needed
  2. Gradual process:
    • Most accurate (least ambiguous) queries first, less accurate follow

Self-supervised model: acquiring positive examples

Acquiring positive examples (simplified):

• Start with monosemous terms and
• Gradually add polysemous ones
  • Use “context words” to resolve ambiguity

Example queries for Ch:space:
  i. “outer space”
  ii. (“outer space” OR space) AND (infinite OR science OR “scientific discipline” OR . . . )

• A similar process for Cr
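The gradual query construction can be sketched as below. This is a simplified illustration, assuming the monosemous terms, polysemous terms and context words are already available as plain lists; how they are obtained is not stated here, and `build_queries` and `quote` are hypothetical names.

```python
def quote(term):
    """Quote multi-word terms so the retrieval engine treats them as phrases."""
    return f'"{term}"' if " " in term else term

def build_queries(monosemous, polysemous, context_words):
    """Return queries ordered from least ambiguous to most ambiguous."""
    queries = []
    # Step 1: monosemous terms need no disambiguation information.
    if monosemous:
        queries.append(" OR ".join(quote(t) for t in monosemous))
    # Step 2: polysemous terms are added, constrained by context words.
    if polysemous and context_words:
        terms = " OR ".join(quote(t) for t in monosemous + polysemous)
        context = " OR ".join(quote(w) for w in context_words)
        queries.append(f"({terms}) AND ({context})")
    return queries

queries = build_queries(
    monosemous=["outer space"],
    polysemous=["space"],
    context_words=["infinite", "science", "scientific discipline"],
)
for query in queries:
    print(query)
```

With these inputs the two generated queries have the same shape as queries i and ii above, minus the elided context words.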

Self-supervised model: acquiring negative examples

Acquiring negative examples:

1. Randomly (Kauchak and Barzilay, 2006)
2. Like positive examples, but using cohyponyms
  • basketball for h:hockey; islam for h:christianity
  • Semantically similar to positive
  • Help classifiers achieve better discrimination
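The two strategies for negative examples can be sketched as follows. The cohyponym table holds only the two pairs mentioned above; the toy corpus, the exclusion of retrieved positives from the random pool and the plain term lookup (instead of full queries with context words) are simplifying assumptions.

```python
import random

# Hypothetical cohyponym table, limited to the two pairs given above.
COHYPONYMS = {"hockey": ["basketball"], "christianity": ["islam"]}

def random_negatives(corpus, positives, k, seed=0):
    """Strategy 1: sample k documents that were not retrieved as positives."""
    pool = [doc for doc in corpus if doc not in positives]
    return random.Random(seed).sample(pool, min(k, len(pool)))

def cohyponym_negatives(corpus, h):
    """Strategy 2: retrieve documents for cohyponyms of h instead of h itself."""
    terms = COHYPONYMS.get(h, [])
    return [doc for doc in corpus if any(t in doc.lower().split() for t in terms)]

corpus = [
    "the hockey team won in overtime",
    "the basketball game went to overtime",
    "the senate passed the bill",
]
positives = [corpus[0]]
similar_negatives = cohyponym_negatives(corpus, "hockey")
sampled_negatives = random_negatives(corpus, positives, k=1)
```

The cohyponym negatives share vocabulary with the positives ("overtime" in this toy corpus), which is what forces the classifier to discriminate on more specific cues.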

Self-supervised model: applying the classifiers

• Ch & Cr classify matches in the text → Ch(t), Cr(t)
  • The document provides the context for t

• Ch also classifies each rule → Ch(r)
  • What is the context to classify?

Applying Ch to rules

r: check:n → hockey

1. Sample k texts with the rule’s entailing term
2. Apply Ch to each match
3. Set Ch(r) to the ratio of positive classifications

Sampled text                                                  Ch
The insurance company cut me a check for $6600                false
The camera zoomed in on a shoulder check along the boards     true
There’s a perfect antibiotics that can keep it in check       false
They oppose both waiting periods and background checks        false

• Can be interpreted as domain-specific rule priors

Ch(r) = 0.25
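The three steps reduce to a ratio, sketched below. The classifier is a stand-in: a lookup table hard-coded with the four decisions shown above, not a trained Ch:hockey. Returning 0.0 for an empty sample is an assumption.

```python
def rule_score(classify, sampled_contexts):
    """Ch(r): fraction of sampled matches that Ch accepts as valid contexts of h."""
    if not sampled_contexts:
        return 0.0  # assumption: no evidence is treated as no support for the rule
    accepted = sum(1 for context in sampled_contexts if classify(context))
    return accepted / len(sampled_contexts)

# Stand-in for Ch:hockey, reproducing the four decisions listed above.
SLIDE_DECISIONS = {
    "The insurance company cut me a check for $6600": False,
    "The camera zoomed in on a shoulder check along the boards": True,
    "There's a perfect antibiotics that can keep it in check": False,
    "They oppose both waiting periods and background checks": False,
}

ch_r = rule_score(SLIDE_DECISIONS.get, list(SLIDE_DECISIONS))
print(ch_r)  # one accepted context out of four sampled
```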

Experiments and results

Experimental Setting

• Following the IR approach of state-of-the-art Name-based TC

• Goal: improve TC performance by verifying matches

• Integrating within TC:
  • Modify vector entries with 3 scheme outputs [See paper]

• 2 TC datasets
• Entailment rules: WordNet, Wikipedia (Shnarch et al., 2009)

• SVM, linear kernel
  • Standard WSD features

• Classifiers’ scores (and not binary decisions)

Baselines

Barak et al., 2009:

1. Barak no-context
  • Cosine similarity between document & category vectors
  • Baseline for not using any context model

2. Barak full
  • State of the art for Name-based TC
  • LSA as t-h context model

Results

Reuters-10

Model               Accuracy   P      R      F1
Barak no-context    73.2       63.6   77.0   69.7
Barak full          76.3       68.0   79.2   73.2
Class.-based        79.3       71.8   83.6   77.2

20-Newsgroups

Model               Accuracy   P      R      F1
Barak no-context    63.7       44.5   74.6   55.8
Barak full          69.4       50.1   82.8   62.4
Class.-based        73.4       54.7   76.4   63.7

• Accuracy of the classification decisions
• Recall: relative to the potential recall of the rule-set
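As a consistency check on the tables, the F1 column can be recomputed from P and R. This sketch assumes F1 is the standard balanced F-measure (harmonic mean of precision and recall), which is not stated explicitly here. The recomputed values agree with the reported ones to within 0.1; the remaining gap is presumably rounding of P and R.

```python
def f1(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (dataset, model) -> (P, R, reported F1), copied from the two tables.
RESULTS = {
    ("Reuters-10", "Barak no-context"): (63.6, 77.0, 69.7),
    ("Reuters-10", "Barak full"): (68.0, 79.2, 73.2),
    ("Reuters-10", "Class.-based"): (71.8, 83.6, 77.2),
    ("20-Newsgroups", "Barak no-context"): (44.5, 74.6, 55.8),
    ("20-Newsgroups", "Barak full"): (50.1, 82.8, 62.4),
    ("20-Newsgroups", "Class.-based"): (54.7, 76.4, 63.7),
}

for (dataset, model), (p, r, reported) in RESULTS.items():
    print(f"{dataset:14} {model:17} F1 = {f1(p, r):.2f} (reported {reported})")
```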

Conclusions

• Quick summary
• Discussion

Summary

• 3 contextual aspects need to be considered in inference

• Prior work
  • addressed them partially
  • used symmetric models
  • provided a different method for each aspect

• Classification-based scheme for context in inference
  • Complete, unified, directional
  • Outperforming state of the art for Name-based TC

Future work and discussion

• Future work
  • Apply to other applications
  • Address more complex hypotheses
  • And unknown terms (for which we didn’t train classifiers in advance)

• Scaling
  • Attractive when hypotheses are known in advance
  • Feasible when hypotheses are given online?

• Possible directions
  • Instant methods for training classifiers
  • Training classifiers for entire rule sets in advance
  • Global classifiers
  • …

Thank you!
