Post on 21-Mar-2016
Intelligent Database Systems Lab
Presenter : Kung, Chien-Hao
Authors : Yoong Keok Lee and Hwee Tou Ng
EMNLP 2002
An Empirical Evaluation of Knowledge Sources and Learning Algorithms for
Word Sense Disambiguation
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
• Natural language is inherently ambiguous.
• A word can have multiple meanings (or senses).
Objectives
• This paper evaluates a variety of knowledge sources
and supervised learning algorithms for word sense
disambiguation on SENSEVAL-2 and SENSEVAL-1 data.
Methodology
Knowledge Sources
• Part of Speech (POS) of Neighboring Words
• Single Words in the Surrounding Context
• Local Collocations
• Syntactic Relations
Methodology
• Part of Speech (POS) of Neighboring Words
– This paper uses 7 features to encode this knowledge source.
– Sentence segmentation program (Reynar and Ratnaparkhi, 1997)
– POS tagger (Ratnaparkhi, 1996)
Example, with target word "bars":
Reid  saw  me   looking  at  the  iron  bars  .
NNP   VBD  PRP  VBG      IN  DT   NN    NNS   .
<IN, DT, NN, NNS, ., ε, ε>
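The 7-feature window in the example above can be sketched in a few lines. This is only an illustration, assuming the window covers the three tokens to the left of the target word, the target itself, and the three tokens to its right, and that positions outside the sentence get a placeholder tag. The function name `pos_window_features` and the placeholder symbol `NULL_TAG` are hypothetical, not taken from the paper.

```python
from typing import List

# Assumed placeholder for window positions that fall outside the sentence.
NULL_TAG = "ε"


def pos_window_features(tags: List[str], target: int, width: int = 3) -> List[str]:
    """Return the POS tags at offsets -width..+width around the target index."""
    features = []
    for offset in range(-width, width + 1):
        i = target + offset
        # Positions before the first or after the last token get the placeholder.
        features.append(tags[i] if 0 <= i < len(tags) else NULL_TAG)
    return features


tokens = ["Reid", "saw", "me", "looking", "at", "the", "iron", "bars", "."]
tags = ["NNP", "VBD", "PRP", "VBG", "IN", "DT", "NN", "NNS", "."]

features = pos_window_features(tags, tokens.index("bars"))
print(features)
```

With `width=3` the window always has 7 entries, which matches the feature count stated on the slide.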
Methodology
• Single Words in the Surrounding Context
– Feature selection method
  • Parameter: M2
Example, with target word "bars":
Candidate words: {chocolate, iron, beer}
Reid saw me looking at the iron bars.
Feature vector: <0, 1, 0>
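The binary vector in this example can be reproduced with a short sketch. It assumes the candidate word list has already been chosen (the feature selection step with its parameter is not implemented here) and that each feature is 1 when the candidate word occurs anywhere in the context and 0 otherwise. The function name `surrounding_word_features` and the lowercasing step are illustrative assumptions.

```python
from typing import List


def surrounding_word_features(context_tokens: List[str], vocabulary: List[str]) -> List[int]:
    """Return one 0/1 feature per vocabulary word: 1 if it occurs in the context."""
    # Lowercasing is an assumption made for this sketch.
    present = {token.lower() for token in context_tokens}
    return [1 if word in present else 0 for word in vocabulary]


vocabulary = ["chocolate", "iron", "beer"]  # candidate words from the slide
context = ["Reid", "saw", "me", "looking", "at", "the", "iron", "bars", "."]

vector = surrounding_word_features(context, vocabulary)
print(vector)
```

Only "iron" appears in the sentence, so only the second position is set.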
Methodology
• Local Collocations
– This paper extracts 11 features:
  C_{-1,-1}, C_{1,1}, C_{-2,-2}, C_{2,2}, C_{-2,-1}, C_{-1,1}, C_{1,2}, C_{-3,-1}, C_{-2,1}, C_{-1,2}, C_{1,3}
Example, with target word "bars":
Candidate collocations for C_{-2,-1}: {a_chocolate, the_wine, the_iron}
Reid saw me looking at the iron bars.
C_{-2,-1} = the_iron
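A collocation feature such as the one in this example can be sketched as follows. The sketch assumes that the two indices are start and end offsets relative to the target word, that the target word itself is skipped, that tokens are joined with underscores, and that a value outside the candidate set falls back to a placeholder. The names `collocation` and `NULL_TOKEN` are hypothetical, and the fallback rule is an assumption rather than something stated on the slide.

```python
from typing import List

# Assumed placeholder for positions outside the sentence and for unseen values.
NULL_TOKEN = "ε"


def collocation(tokens: List[str], target: int, start: int, end: int) -> str:
    """Join the tokens at offsets start..end relative to the target with underscores."""
    parts = []
    for offset in range(start, end + 1):
        if offset == 0:
            continue  # assumption: the target word is not part of the collocation
        i = target + offset
        parts.append(tokens[i] if 0 <= i < len(tokens) else NULL_TOKEN)
    return "_".join(parts)


tokens = ["Reid", "saw", "me", "looking", "at", "the", "iron", "bars", "."]
target = tokens.index("bars")

# Offsets -2..-1 select the two tokens immediately left of the target.
value = collocation(tokens, target, -2, -1)

# Candidate set from the slide; unseen values map to the placeholder (assumed).
candidates = {"a_chocolate", "the_wine", "the_iron"}
feature = value if value in candidates else NULL_TOKEN
print(feature)
```

Calling `collocation` once per offset pair in the list above would yield all 11 feature values for a given occurrence of the target word.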
Methodology
• Syntactic Relations
(a) Shows w and its POS
(b) Shows the sentence where w occurs
(c) Shows the feature vector corresponding to syntactic relations
Methodology
• Learning Algorithms
– Support Vector Machines
– AdaBoost
– Naïve Bayes
– Decision Trees
• Evaluation Data Sets
– SENSEVAL-2
– SENSEVAL-1
Experiments
Conclusions
• Using all of these knowledge sources with SVM
achieves accuracy higher than the best official scores
on both the SENSEVAL-2 and SENSEVAL-1 test data.
Comments
• Advantages
– This paper is easy to read.
• Applications
– Word sense disambiguation (WSD)