Classification-based Contextual Preferences
Shachar Mirkin
July 2011
TextInfer, Edinburgh
Joint work with: Ido Dagan, Lili Kotlerman, Idan Szpektor
Context matching in inference
• Motivation
• Task definition
Motivation: Inference & ambiguity
H: The US accepts a large number of foreigners every year
T: The US welcomes hundreds of thousands of aliens yearly
"If it’s any consolation, dear, our alien abduction insurance is finally going to pay off"
Rules: alien → foreigner, welcome → accept
Addressing ambiguity
H: The US accepts a large number of foreigners every year
T: The US welcomes hundreds of thousands of aliens yearly
• WSD: do the sense IDs match? (10123254 ?= 10123254)
• Direct sense matching
  • Dagan et al., 2006: Binary task
  • Avoiding intermediate representation of meanings
  • Suggested for synonyms
• Lexical Substitution (McCarthy & Navigli, 2009)
  T1: The US welcomes hundreds of thousands of foreigners yearly
Context matching
• Context Matching generalizes sense matching
  • Does aliens in T match the meaning of outer-space?
  • Does ‘children acquire English’ match X acquire Y → X learn Y?
• Expected output:
  1. Yes / No
  2. Score / probability – quantifying the match degree
• Both can be provided by classifiers
Name-based Text Categorization*
* This isn’t the task, just the setting
Name-based Text Categorization (TC)
• Unsupervised setting of TC
• The category name is the only input: space, medicine, religion
State of the art (Barak et al., 2009) - Vector-space IR approach:
• Query: category name
• Expanded with its entailing terms
A document must include a (direct or indirect) match to the category
[Figure: the category name h (space) with entailing terms star, alien, shuttle, room; a text term t (shuttle) matches h through the rule r: shuttle → space]
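The no-context version of this setting can be sketched as follows. This is a minimal illustration, not the Barak et al. system: the rule set `ENTAILING_TERMS`, the uniform term weights and the whitespace tokenization are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical rule set: terms that entail the category name (taken from the slide's figure).
ENTAILING_TERMS = {"space": ["star", "alien", "shuttle", "room"]}

def category_vector(name, rules=ENTAILING_TERMS):
    """Query vector: the category name expanded with its entailing terms (weight 1 each)."""
    return Counter([name] + rules.get(name, []))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def categorize_score(document, name):
    """Score a document against a category using only the expanded category name."""
    doc_vector = Counter(document.lower().split())
    return cosine(category_vector(name), doc_vector)
```

Note that `categorize_score("the server ran out of disk space", "space")` is positive even though the document is not about outer space, which is the kind of mismatch a context model is meant to filter out.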
A closer look at Context Matching
• Context matching aspects
• Directionality
Contextual matches
• Contextual (mis)matches in Name-based TC:
  • t – h
    • H: (outer) space
    • T: The server ran out of disk space
  • r – h
    • r: room → space
    • Invalid for the category
  • t – r
    • r: star → space is a valid rule for space
    • T: Elle Magazine accused of 'whitening' Bollywood star's skin
[Figure: triangle connecting t, r and h; does each pair match?]
Context matching directionality
• Different roles for t, h & r
  i. t should match the meaning of h and r
  ii. r should match the meaning of h
H: The US accepts a large number of foreigners every year
T: If it’s any consolation, dear, our alien abduction insurance is finally going to pay off
• alien in T should match the meaning of foreigner in H
• and not the other way around
Contextual Preferences (Szpektor et al., 2008)
• CP: the context matching framework
  • The 3 matches
  • Directionality
• Prior work
• Concrete context matching models in Szpektor et al.
The Contextual Preferences (CP) framework
• Each inference object has a contextual representation, CP
• In inference operations, CPs should also be matched
• Two contextual information types
  • Variables' selectional preferences (CPv)
  • Topical / global (CPg)
Example: X acquire Y
  cpv(X) = {child}
  cpv(Y) = {English}
  cpg() = {learning, grammar}
[Figure: triangle of t, r and h, annotated with cpg(t), cpv(t); cpg(r), cpv(r); cpg(h), cpv(h)]
• Prior work addressed specific aspects of context matching:
  Pantel et al., 2007; Pennacchiotti et al., 2007; Downey et al., 2007; Harabagiu et al., 2003 (QA); Patwardhan and Riloff, 2007; Dagan et al., 2006; Barak et al., 2009 (TC); Connor and Roth, 2007
Szpektor et al.’s implementation of CP
[Figure: triangle of t, r and h with cpg and cpv on each node; the matches are scored with:]
• Lin similarity of instantiations
• Cosine similarity of LSA vectors
• Lin sim. of instantiations × score of comparing preferred NE
• Our goal: implementation of CP with a model which is:
  • Unified
  • Directional
  • Incorporates various context information types
→ Classification-based approach
A classification-based scheme for CP
• Classifiers for context matching
• Classifiers for all CP context matching aspects
Using classifiers to represent & assess context
• A classifier to identify valid contexts of h
  • Trained with typical valid & invalid contexts of h
• Given a match t:
  • Ch is applied on the context of t
  • Is it a valid context for the intended meaning of h?
Example, h: (outer) space
• Millions of Americans followed the space shuttle Atlantis' final mission
• Elle Magazine accused of 'whitening' Bollywood star's skin
Ch(t): Ch:space(.. Bollywood <t> ‘s skin ..)
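Applying Ch to the context of a match can be sketched as below. The window size, the `<t>` masking helper and the word-overlap scorer are assumptions for illustration only; the talk's actual classifiers are SVMs with standard WSD features, and `ToyContextClassifier` is merely a hypothetical stand-in with the same interface.

```python
def mask_context(tokens, index, window=3):
    """Return the tokens around position `index`, with the match itself replaced by <t>."""
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return left + ["<t>"] + right

class ToyContextClassifier:
    """Hypothetical stand-in for C_h: scores a context by overlap with known context words."""

    def __init__(self, valid_words, invalid_words):
        self.valid = set(valid_words)
        self.invalid = set(invalid_words)

    def score(self, context):
        """Score in [-1, 1]; a positive value means the context looks valid for h."""
        pos = sum(w in self.valid for w in context)
        neg = sum(w in self.invalid for w in context)
        return (pos - neg) / max(1, pos + neg)

# Toy C_h for h = (outer) space; the word lists are invented for the example.
c_space = ToyContextClassifier(
    valid_words={"shuttle", "mission", "orbit"},
    invalid_words={"bollywood", "magazine", "skin"},
)

tokens = "elle magazine accused of whitening bollywood star 's skin".split()
star_context = mask_context(tokens, tokens.index("star"), window=2)
star_score = c_space.score(star_context)  # negative: not a valid context for outer space
```

Masking the match itself keeps the classifier focused on the surrounding context, which is what the slide's `Ch:space(.. Bollywood <t> 's skin ..)` notation suggests.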
Classification-based scheme: addressing all CP matches
• A classifier to identify valid contexts of the hypothesis h (Ch)
  • Applied to t: Ch(t)
  • Applied to the rule r: Ch(r)
• A classifier to identify valid contexts of the rule r (Cr)
  • Applied to t: Cr(t)
• Classifiers in prior work addressed the t-r match (Kauchak and Barzilay, 2006; Dagan et al., 2006; Connor and Roth, 2007)
Example: h: (outer) space; r: star → space; t: .. Bollywood star's skin ..
A self-supervised context model
• A concrete model for the classification-based scheme
A self-supervised context model
• Unsupervised setting, no labeled examples
• Self-supervised: training examples automatically obtained
  • By querying the TC training set
• Challenge: constructing queries that correctly retrieve valid or invalid contexts
• Key principles of solution:
  1. Add disambiguation information (only) when needed
  2. Gradual process:
     • Most accurate (least ambiguous) queries first, less accurate follow
Self-supervised model: acquiring positive examples
Acquiring positive examples (simplified):
• Start with monosemous terms and
• Gradually add polysemous ones
  • Use “context words” to resolve ambiguity
Example queries for Ch:space:
  i. “outer space”
  ii. (“outer space” OR space) AND (infinite OR science OR “scientific discipline” OR . . . )
• A similar process for Cr
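The two example queries can be generated with a small helper like the one below. This is a sketch of the query shape only: the function names are hypothetical, and how the monosemous terms, polysemous terms and context words are actually selected is not covered here.

```python
def quote(term):
    """Quote multi-word terms so they are searched as phrases."""
    return f'"{term}"' if " " in term else term

def or_group(terms):
    """Join terms into a parenthesized OR group."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"

def positive_queries(monosemous, polysemous, context_words):
    """Return queries ordered from most to least accurate.

    1. Monosemous terms alone: no disambiguation needed.
    2. Monosemous plus polysemous terms, constrained by context words.
    """
    first = quote(monosemous[0]) if len(monosemous) == 1 else or_group(monosemous)
    second = or_group(monosemous + polysemous) + " AND " + or_group(context_words)
    return [first, second]

queries = positive_queries(
    monosemous=["outer space"],
    polysemous=["space"],
    context_words=["infinite", "science", "scientific discipline"],
)
```

Here the context-word list is truncated to the three words visible on the slide; the slide's own query continues with further terms.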
Self-supervised model: acquiring negative examples
Acquiring negative examples:
1. Randomly (Kauchak and Barzilay, 2006)
2. Like positive examples, but using cohyponyms
  • basketball for h:hockey; islam for h:christianity
  • Semantically similar to positive
  • Help classifiers achieve better discrimination
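Both sources of negative examples can be sketched as follows. This is a simplification under stated assumptions: the corpus is a plain dictionary standing in for the TC training set, and cohyponym retrieval is reduced to a single-term lookup rather than the gradual query process used for positive examples. The function name and parameters are hypothetical.

```python
import random

def negative_examples(corpus, positive_ids, cohyponyms, n_random, seed=0):
    """Return (cohyponym-based, random) lists of document ids to use as negatives.

    corpus: dict mapping document id -> text (stand-in for the TC training set)
    positive_ids: ids already used as positive examples, never reused as negatives
    """
    candidates = [d for d in sorted(corpus) if d not in positive_ids]
    # Option 2: documents mentioning a cohyponym of h (e.g. basketball for h = hockey).
    by_cohyponym = [
        d for d in candidates
        if any(c in corpus[d].lower().split() for c in cohyponyms)
    ]
    # Option 1: random documents drawn from the remaining candidates.
    rest = [d for d in candidates if d not in by_cohyponym]
    rng = random.Random(seed)
    by_random = rng.sample(rest, min(n_random, len(rest)))
    return by_cohyponym, by_random

corpus = {
    1: "the hockey game went to overtime",
    2: "a basketball game in the arena",
    3: "stock markets fell today",
    4: "new phone released",
}
cohyponym_negatives, random_negatives = negative_examples(corpus, {1}, ["basketball"], n_random=1)
```

The cohyponym-based negatives share vocabulary such as "game" with the positives, which is why they are expected to be more useful for discrimination than random ones.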
Self-supervised model: applying the classifiers
• Ch & Cr classify matches in the text → Ch(t), Cr(t)
  • The document provides the context for t
• Ch also classifies each rule → Ch(r)
  • What is the context to classify?
Applying Ch to rules
r: check:n → hockey
1. Sample k texts with the rule’s entailing term
2. Apply Ch to each match
3. Set Ch(r) to the ratio of positive classifications
Example:
• The insurance company cut me a check for $6600: false
• The camera zoomed in on a shoulder check along the boards: true
• There’s a perfect antibiotics that can keep it in check: false
• They oppose both waiting periods and background checks: false
Ch(r) = 0.25
• Can be interpreted as domain-specific rule priors
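The three steps reduce to a ratio, sketched below. The `classify` callable stands in for Ch and is replaced here by a lookup of the decisions shown on the slide; returning 0.0 for an empty sample is an assumption of this sketch.

```python
def rule_score(classify, sampled_contexts):
    """C_h(r): fraction of sampled contexts of the rule's entailing term that C_h accepts."""
    if not sampled_contexts:
        return 0.0  # assumption: no evidence means no support for the rule
    positives = sum(1 for context in sampled_contexts if classify(context))
    return positives / len(sampled_contexts)

# The slide's four sampled texts for r: check:n -> hockey, with C_h's decisions.
slide_decisions = {
    "The insurance company cut me a check for $6600": False,
    "The camera zoomed in on a shoulder check along the boards": True,
    "There's a perfect antibiotics that can keep it in check": False,
    "They oppose both waiting periods and background checks": False,
}

score = rule_score(slide_decisions.get, list(slide_decisions))
```

With one positive classification out of k = 4 samples, `score` is 0.25, matching the value on the slide.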
Experiments and results
Experimental Setting
• Following IR approach of state of the art Name-based TC
• Goal: improve TC performance by verifying matches
• Integrating within TC:
  • Modify vector entries with 3 scheme outputs [See paper]
• 2 TC datasets
• Entailment rules: WordNet, Wikipedia (Shnarch et al., 2009)
• SVM, linear kernel
• Standard WSD features
• Classifiers’ scores (and not binary decisions)
Baselines
Barak et al., 2009:
1. Barak (no-context)
  • Cosine similarity between document & category vectors
  • Baseline for not using any context model
2. Barak (full)
  • State of the art for Name-based TC
  • LSA as t-h context model
Results

Reuters-10
Model                Accuracy   P      R      F1
Barak (no-context)   73.2       63.6   77.0   69.7
Barak (full)         76.3       68.0   79.2   73.2
Class.-based         79.3       71.8   83.6   77.2

20-Newsgroups
Model                Accuracy   P      R      F1
Barak (no-context)   63.7       44.5   74.6   55.8
Barak (full)         69.4       50.1   82.8   62.4
Class.-based         73.4       54.7   76.4   63.7

• Accuracy of the classification decisions
• Recall: relative to the potential recall of the rule-set
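As a sanity check on the tables, the sketch below recomputes F1 from the reported precision and recall. It assumes that the F1 column is the harmonic mean of the P and R columns, which the slide does not state; small gaps are expected because the inputs are already rounded.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)

# (dataset, model) -> (P, R, reported F1), copied from the results tables.
reported = {
    ("Reuters-10", "Barak (no-context)"): (63.6, 77.0, 69.7),
    ("Reuters-10", "Barak (full)"): (68.0, 79.2, 73.2),
    ("Reuters-10", "Class.-based"): (71.8, 83.6, 77.2),
    ("20-Newsgroups", "Barak (no-context)"): (44.5, 74.6, 55.8),
    ("20-Newsgroups", "Barak (full)"): (50.1, 82.8, 62.4),
    ("20-Newsgroups", "Class.-based"): (54.7, 76.4, 63.7),
}

# Largest difference between the recomputed and the reported F1 over all rows.
max_gap = max(abs(f1(p, r) - f) for p, r, f in reported.values())
```

Under that assumption every recomputed value lands within 0.1 of the reported one, so the columns appear mutually consistent.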
Conclusions
• Quick summary
• Discussion
Summary
• 3 contextual aspects need to be considered in inference
• Prior work
  • addressed them partially
  • used symmetric models
  • provided a different method for each aspect
• Classification-based scheme for context in inference
  • Complete, unified, directional
  • Outperforming state of the art for Name-based TC
Future work and discussion
• Future Work
  • Apply to other applications
  • Address more complex hypotheses
  • And unknown terms (for which we didn’t train classifiers in advance)
• Scaling
  • Attractive when hypotheses are known in advance
  • Feasible when hypotheses are given online?
• Possible directions
  • Instant methods for training classifiers
  • Training classifiers for entire rule sets in advance
  • Global classifiers
  • …
Thank you!