Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense...

19
Word Sense Disambiguation as an Integer Linear Programming Problem Vicky Panagiotopoulou 1 , Iraklis Varlamis 2 , Ion Androutsopoulos 1 , and George Tsatsaronis 3 1 Department of Informatics, Athens University of Economics and Business 2 Department of Informatics and Telematics, Harokopio University of Athens 3 Bi h l C (BIOTEC) T hi h Ui i D d 3 Biotechnology Center (BIOTEC), T echnische Universitat Dresden SETN 2012 SETN 2012 7th Hellenic Conference on Artificial Intelligence, May 28-31, 2012, Lamia

Transcript of Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense...

Page 1: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

Word Sense Disambiguation as an Integer Linear Programming Problem

Vicky Panagiotopoulou1, Iraklis Varlamis2, y g p

Ion Androutsopoulos1, and George Tsatsaronis3

1 Department of Informatics, Athens University of Economics and Business2 Department of Informatics and Telematics, Harokopio University of Athens

3 Bi h l C (BIOTEC) T h i h U i i D d3 Biotechnology Center (BIOTEC), Technische Universitat Dresden

SETN 2012SETN 20127th Hellenic Conference on Artificial Intelligence,

May 28-31, 2012, Lamia

Page 2: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

ContentsContents

• Problem statement

• Existing solutionsExisting solutions

• Our solution

• Implementation

• ResultsResults

• Conclusions

SETN 2012 WSD as an ILP problem 2

Page 3: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

Word Sense Disambiguation WSDWord Sense Disambiguation ‐WSD

• Assign to everyword of a document the most• Assign to every word of a document the most appropriate meaning (sense) among those offered by a lexicon or a thesaurus (inventory of senses)y ( y f ) Some examples:

The two friends jumped off the bank and into the water. bank = sloping land ‐ especially the slope beside a body bank = sloping land   especially the slope beside a body 

of water. They passed by the bank to make a deposit.

bank = a financial institution that accepts deposits and h l th i t l di ti itichannels the money into lending activities.

They used the bank when the army entered the city. bank = a supply or stock held in reserve for future use 

(especially in emergencies).(especially in emergencies). 

What is the correct meaning of “bank” in each sentence?se e ce

SETN 2012 WSD as an ILP problem 3

Page 4: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

How hard is the WSD task?How hard is the WSD task?

Upper Bound: Human performace; 95%‐99% coarse‐grained senses 65‐70% with fine‐grained senses [Haliday and Hasansenses, 65‐70% with fine‐grained senses [Haliday and Hasan, 1976].

Lower Bound: Unsupervised Baseline: 13‐20%, Supervised Lower Bound: Unsupervised Baseline: 13 20%, Supervised Baseline: 61‐64%

Inter‐annotator agreement: 67% ‐ 80% [Snyder and Palmer, g [ y ,2004]

SETN 2012 WSD as an ILP problem 4

Page 5: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

WSD alternativesWSD alternatives

• Several options in applying WSD:– Unsupervised

• High coverage, lower accuracy than supervised, no need for manually annotated data setneed for manually annotated data set

– Supervised• Lower coverage than unsupervised higher accuracy• Lower coverage than unsupervised, higher accuracy, “knowledge acquisition bottleneck”

SETN 2012 WSD as an ILP problem 5

Page 6: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

Graph based Unsupervised WSDGraph‐based Unsupervised WSD

M ll d ’ t d f h• Map all words’ senses to nodes of a graph• Expand the graph by adding  related senses until a connected 

graph is constructedg p

• Rank graph nodes (senses) using graph based metrics (or node activation techniques)

• Each word is  mapped to its most highly ranked (or most active) senseactive) sense

SETN 2012 WSD as an ILP problem 6

Page 7: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

Our suggestionOur suggestion

• Model WSD as an Integer Linear Programming (ILP) Problem

• Select exactly one possible sense of each word in the input t t i i th t t l i i l t dsentence, so as to maximize the total pairwise relatedness 

between the selected senses

• Create a graph that contains• Create a graph that contains only the candidate senses foreach word

• The edges denote relatednessbetween senses

SETN 2012 WSD as an ILP problem 7

Page 8: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

WSD as ILP vs Graph based WSDWSD as ILP vs Graph‐based WSD

• Complete but smaller graphs that contain only words’ senses

• Connected big graphs that contain extra (interconnecting) senseswords  senses

• Weighted edges are created using any pairwise sense

(interconnecting) senses

• The edges are lexical relations from a thesaurususing any pairwise sense 

relatedness measure (semantical or statistical)

relations from  a thesaurus, and are usually unweighed

• Semantic network • ILP  is NP‐hard, however for 

small graphs and using construction is slow 

• Spreading of activation or efficient solvers the method is faster than graph‐based WSD

node ranking run for each new graph

WSD

SETN 2012 WSD as an ILP problem 8

Page 9: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

Towards an ILP formulations1j: possible senses of w1.

a1j: shows whether s1j is selected (a1j =1) or not (a1j = 0)

s2j: possible senses of w2.a2j: shows whether s2j is j jselected (a2j =1) or not (a2j = 0)

Maximize the total pairwise relatedness between selected senses, using only one sensesenses, using only one sense 

per word

R lt i d ti bj ti f ti9SETN 2012 WSD as an ILP problem

Results in a quadratic objective function

Page 10: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

Our ILP model for WSDs1j: possible senses of w1.

a1j: shows if s1j is selected(a1j =1) or not (a1j = 0)

s2j: possible senses of w2.a2j: shows if s2j is selectedj j(a2j =1) or not (a2j = 0)

δij,i’j’: Shows whether  the edge is ij,i jactive (1) or not (0). The edge is active 

iff both connected senses areiff both connected senses are active (aij = ai’j’ =1).

10SETN 2012 WSD as an ILP problem

Page 11: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

Our ILP model for WSDMaximize the total pairwise relatedness between senses, t ki i t ttaking into account senses connected via active edges

d d d

If sij is selected (aij = 1), then 

Edges are undirected

ij ( ij  ),there is exactly one active edge from sij to the senses of all other 

wordsw

If sij is not selected (aij = 0), 

words wi’.

then there is no active edge from sij to senses of other 

wordswi’.

11

words wi’.

SETN 2012 WSD as an ILP problem

Page 12: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

Resources: WordNet [Miller et al ]Resources: WordNet [Miller et al.]

• Each sense is a set of synonymset of synonym words (synset) 

h l d– has a gloss and a POS (noun, 

b dj tiverb, adjective, adverb)

– is connected to other senses

SETN 2012 WSD as an ILP problem 12

Page 13: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

WordNet [Miller et al ]WordNet [Miller et al.]

13

Page 14: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

ImplementationImplementation

• Relatedness measures– Semantic Relatedness (SR) is a knowledge‐based measure that uses 

WordNetWordNet

• Semantic compactness (SCM): the semantic path from s1 to s2 is short and contains highly related senses

• Semantic Path Elaboration (SPE):  the senses in the path are very specific

– Pointwise mutual information (PMI) is a statistical similarity measure ( ) y

– We use the WordNet glosses of each sense si, and a non‐sense tagged corpus (953 million tokens)

SETN 2012 WSD as an ILP problem 14

Page 15: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

ImplementationImplementation• ILP Solver

– lp_solve: A branch‐and‐bound implementation that uses Simplex for LP subproblems. Available at http://lpsolve.sourceforge.net/

• Sense pr ning The sense s of a ord is remo ed from the graph if the• Sense pruning: The sense sij of a word wi is removed from the graph if the gloss of si and the sentence of wi do not overlap 

– The resulting graph is smaller, faster execution with comparable WSD performance

SETN 2012 WSD as an ILP problem 15

Page 16: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

Experimental results

16SETN 2012 WSD as an ILP problem

Page 17: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

ConclusionsConclusions

• B tt lt ith SR (W dN t) i t d f PMI ( )• Better results with SR (WordNet), instead of PMI (co‐occurence).o We are currently evaluating the performance of PMI using sense 

tagged corporaS i i ti f ith t i ifi tlo Sense pruning improves time performance without significantly affecting WSD performance

• In Senseval2 we outperform graph based unsupervised WSD methods (SAN and PageRank)methods (SAN and PageRank)

• In Senseval3 we performed comparable to SAN, but worst than PageRank.o Higher polysemy than Senseval2. o SAN and PageRank create bigger graphs than our method.

• Almost 100% coveragego We may probably compare all methods in 100% coverage, by forcing 

other methods to give an answer in all cases (without using the First Sense heuristic) 

SETN 2012 WSD as an ILP problem 17

Page 18: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

Next stepsNext steps

• PMI in sense‐tagged corporao But then we will be supervised

• Test with other similarity measures or combinations.o χ2, likelihood ratio, LSA, …o χ , likelihood ratio, LSA, …

o Our model works with any relatedness measure

• More evaluation datasets: Semeval 2007 2010• More evaluation datasets: Semeval 2007, 2010

• LP relaxations, in order to use Simplexo Faster solution: real time/scale implementations

o WSD in paragraph or section level

SETN 2012 WSD as an ILP problem 18

Page 19: Word Sense Disambiguation as an Lineargalaxy.hua.gr/~varlamis/Varlamis-papers/C57-ppt.pdfWord Sense Disambiguation ‐WSD • Assign to every word of a document the most appropriate

Thank you  !

Questions ?

Iraklis Varlamis

[email protected]

SETN 2012 WSD as an ILP problem 19