Presenter : Kung, Chien-Hao Authors : Yoong Keok Lee and Hwee Tou Ng 2002,EMNLP

An Empirical Evaluation of Knowledge Sources and Learning Algorithms for Word Sense Disambiguation


Page 1

Intelligent Database Systems Lab

Presenter : Kung, Chien-Hao

Authors : Yoong Keok Lee and Hwee Tou Ng

2002,EMNLP

An Empirical Evaluation of Knowledge Sources and Learning Algorithms for Word Sense Disambiguation

Page 2

Outlines
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments

Page 3

Motivation
• Natural language is inherently ambiguous.
• A word can have multiple meanings (or senses).

Page 4

Objectives
• This paper evaluates a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on SENSEVAL-2 and SENSEVAL-1 data.

Page 5

Methodology
Knowledge Sources
• Part of speech (POS) of neighboring words
• Single words in the surrounding context
• Local collocations
• Syntactic relations

Page 6

Methodology
• Part-of-speech (POS) of neighboring words
– This paper uses 7 features to encode this knowledge source.
– Sentence segmentation program (Reynar and Ratnaparkhi, 1997)
– POS tagger (Ratnaparkhi, 1996)

Example (target word: bars)
Sentence: Reid saw me looking at the iron bars.
POS tags: NNP VBD PRP VBG IN DT NN NNS .
Feature vector: {IN, DT, NN, NNS, ., , }
(The last two positions are empty because the sentence ends after the period.)
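
The POS example above can be reproduced with a short Python sketch. It assumes the 7 features are the tags of the target word and of the three tokens on either side, which is consistent with the slide's example but is not spelled out on it. The function name `pos_features` and the empty-string placeholder for positions outside the sentence are hypothetical choices made for illustration, and the tags are copied from the slide rather than produced by a tagger.

```python
from typing import List

# Placeholder for positions that fall outside the sentence.
# This is an assumption of the sketch, not a value given on the slide.
NULL_TAG = ""


def pos_features(tags: List[str], target: int, window: int = 3) -> List[str]:
    """Return the POS tags at offsets -window..+window around tags[target]."""
    features = []
    for offset in range(-window, window + 1):
        i = target + offset
        # Neighbors must lie in the same sentence; otherwise use the placeholder.
        features.append(tags[i] if 0 <= i < len(tags) else NULL_TAG)
    return features


# Tokens and tags are copied from the slide's example sentence.
tokens = ["Reid", "saw", "me", "looking", "at", "the", "iron", "bars", "."]
tags = ["NNP", "VBD", "PRP", "VBG", "IN", "DT", "NN", "NNS", "."]
target = tokens.index("bars")

print(pos_features(tags, target))
# ['IN', 'DT', 'NN', 'NNS', '.', '', '']
```

With `window=3` the vector always has 7 entries, so every training instance gets the same feature layout regardless of where the target word sits in its sentence.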

Page 7

Methodology
• Single words in the surrounding context
– Feature selection method
• Parameter: M2

Example (target word: bars)
Selected words: {chocolate, iron, beer}
Sentence: Reid saw me looking at the iron bars.
Feature vector: <0, 1, 0>
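
A minimal Python sketch of the binary surrounding-word encoding in this example follows. It assumes the word list {chocolate, iron, beer} has already been chosen by the feature selection step, which is not shown here because the slide gives only the parameter name M2. The function name `surrounding_word_features` and the lowercasing of tokens are hypothetical choices for illustration.

```python
from typing import List


def surrounding_word_features(
    tokens: List[str], target: int, vocabulary: List[str]
) -> List[int]:
    """Return one 0/1 feature per vocabulary word: 1 if it occurs in the context."""
    # Every token except the target word itself counts as context.
    # Lowercasing is an assumption of this sketch.
    context = {tok.lower() for i, tok in enumerate(tokens) if i != target}
    return [1 if word in context else 0 for word in vocabulary]


# Word list and sentence are copied from the slide's example.
vocabulary = ["chocolate", "iron", "beer"]
tokens = ["Reid", "saw", "me", "looking", "at", "the", "iron", "bars", "."]
target = tokens.index("bars")

print(surrounding_word_features(tokens, target, vocabulary))
# [0, 1, 0]
```

Only "iron" appears in the sentence, so only the second feature is set, matching the vector on the slide.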

Page 8

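Methodology
• Local collocations
– This paper extracted 11 features:
C_{-1,-1}, C_{1,1}, C_{-2,-2}, C_{2,2}, C_{-2,-1}, C_{-1,1}, C_{1,2}, C_{-3,-1}, C_{-2,1}, C_{-1,2}, C_{1,3}

Example (target word: bars)
Candidate collocations for C_{-2,-1}: {a_chocolate, the_wine, the_iron}
Sentence: Reid saw me looking at the iron bars.
Feature value: C_{-2,-1} = the_iron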
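
The collocation example can be sketched in Python as below. The sketch reads the two indices of each feature as start and end offsets relative to the target word, which fits the slide's example (offsets -2 to -1 give "the_iron"). Two further points are assumptions not stated on the slide: the target word itself is skipped when a span crosses it, and positions outside the sentence get a hypothetical `<null>` placeholder. The function name `collocation` is also illustrative.

```python
from typing import List, Tuple

# The 11 (start, end) offset pairs listed on the slide.
COLLOCATION_SPANS: List[Tuple[int, int]] = [
    (-1, -1), (1, 1), (-2, -2), (2, 2), (-2, -1), (-1, 1),
    (1, 2), (-3, -1), (-2, 1), (-1, 2), (1, 3),
]

# Hypothetical placeholder for positions outside the sentence.
NULL_TOKEN = "<null>"


def collocation(tokens: List[str], target: int, start: int, end: int) -> str:
    """Join the tokens at offsets start..end (relative to the target) with '_'."""
    parts = []
    for offset in range(start, end + 1):
        if offset == 0:
            continue  # assumption: the target word itself is not part of the string
        i = target + offset
        parts.append(tokens[i].lower() if 0 <= i < len(tokens) else NULL_TOKEN)
    return "_".join(parts)


tokens = ["Reid", "saw", "me", "looking", "at", "the", "iron", "bars", "."]
target = tokens.index("bars")
candidates = {"a_chocolate", "the_wine", "the_iron"}  # copied from the slide

value = collocation(tokens, target, -2, -1)
print(value, value in candidates)
# the_iron True
```

Looping over `COLLOCATION_SPANS` and calling `collocation` for each pair yields all 11 feature strings for one occurrence of the target word.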

Page 9

Methodology
• Syntactic relations
(a) shows w and its POS
(b) shows the sentence where w occurs
(c) shows the feature vector corresponding to syntactic relations

Page 10

Methodology
• Learning Algorithms
– Support Vector Machines
– AdaBoost
– Naïve Bayes
– Decision Trees
• Evaluation Data Sets
– SENSEVAL-2
– SENSEVAL-1

Page 11

Experiments

Page 12

Experiments

Page 13

Experiments

Page 14

Experiments

Page 15

Conclusions
• Using all of these knowledge sources and SVM achieves accuracy higher than the best official scores on both SENSEVAL-2 and SENSEVAL-1 test data.

Page 16

Comments
• Advantages
– This paper is easy to read.
• Applications
– WSD