Measuring Opinion Relevance in Latent Topic Space


Measuring Opinion Relevance in Latent Topic Space

Wei Cheng (1), Xiaochuan Ni (2), Jian-Tao Sun (2), Xiaoming Jin (3), Hye-Chung Kum (1), Xiang Zhang (4), Wei Wang (1)

1. Department of Computer Science, University of North Carolina at Chapel Hill, USA
2. Microsoft Research Asia, Beijing, China
3. School of Software, Tsinghua University, Beijing, China
4. Department of Electrical Engineering and Computer Science, Case Western Reserve University, USA

Advisor: Prof. Wei Wang
IEEE International Conference on Social Computing, MIT, 2011


Outline

Overview of opinion retrieval

Related work

Motivation of this work

Problem definition

Topic model based opinion retrieval

Experimental results

Conclusions


Introduction

What is opinion retrieval?

Given a query, find documents that have subjective opinions about the query

Example

A query “ipad”
Relevant: “iPad is perfect.”
Irrelevant: “The lowest price of iPad is $499.”


Routine Solution for Opinion Retrieval: Two-stage Methodology

Figure: the two-stage pipeline. The query is issued against the corpus through a traditional IR engine, which produces Score_IR and the initial topically relevant documents; an opinion analysis step produces Score_op for those documents; re-ranking with both scores yields the opinion-relevant documents.


Related Work

Opinion identification

Lexicon-based methods ([1], [2], [3], [4]), e.g., “perfect”, “wonderful”, etc.

Classification-based methods ([5], [6]), e.g., an SVM classifier

Language-model-based methods ([7]), e.g., “I ... wish”, “I ... felt”, “I ... enjoyed”, etc.

Identifying opinion targets

Word-distance-based approximation ([3])

Sentence-distance-based approximation ([8])

Integrating topic relevance and opinion for ranking

Linear combination ([8])

Quadratic combination ([3])


Motivation

In the literature, opinion relevance is handled by very straightforward methods:

1 In [3], the authors count the number of opinion-carrying words (like “perfect”, “terrible”, etc.) around the search query terms;

2 W. Zhang et al. [8] count opinion-carrying sentences (identified by a sentiment classifier) around the search query.

The drawbacks of this kind of solution:

Opinions around the search query may not be related to the query.

The terms in the search query cannot represent all the semantics of the search topic.

To sum up, topic and opinion are mismatched.


Drawbacks

Example

Figure: Example document from the TREC Blog06 corpus with query “March of the penguins”


Relationship among different document sets

Figure: Relationship among SETopinion to Q, SETrelevant to Q and SETopinion within the corpus


Key challenge of the opinion retrieval task

SETopinion to Q is a proper subset of SETrelevant to Q ∩ SETopinion.

SETrelevant to Q and SETopinion can be retrieved using traditional IR techniques and opinion mining technologies, respectively.

The crucial challenge of the opinion retrieval task is in fact to select SETopinion to Q out of SETrelevant to Q ∩ SETopinion.

Key point: Measuring opinion relevance

Example

Figure: a document that is opinion-irrelevant to the query “iPad”: Paragraph 1 is about the iPhone and carries opinions about the iPhone, while Paragraph 2 mentions the iPad but carries no opinions about the iPad.


Our overall pipeline for topic model based opinion retrieval

Figure: the overall pipeline. The query is issued against the corpus through a traditional IR engine to obtain the initial topically relevant documents; PLSA is applied to the top-ranked documents to build a topic space; the query and the opinion-bearing sentences are projected into this topic space; their similarity is evaluated and used for re-ranking, which outputs the opinion-relevant documents.


Getting initial topically relevant documents

BM-25

\[ \mathrm{Score}_{IR} = \sum_{t \in Q, D} \ln\frac{N - df + 0.5}{df + 0.5} \cdot \frac{(k_1 + 1)\,tf}{k_1\big((1-b) + b\,\frac{dl}{avdl}\big) + tf} \cdot \frac{(k_3 + 1)\,qtf}{k_3 + qtf} \tag{1} \]

where tf is the term's frequency in the document, qtf is the term's frequency in the query, N is the total number of documents in the collection, df is the number of documents that contain the term, dl is the document length (in bytes), avdl is the average document length, and k1 (between 1.0 and 2.0), b (usually 0.75), and k3 (between 0 and 1000) are constants.
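As a concrete illustration of Eq. (1), here is a minimal Python sketch of BM25 scoring for a single document. The toy index inputs (doc_freq, avg_dl) and the default parameter values are illustrative assumptions, not part of the original system.

```python
import math

def bm25_score(query_terms, doc_terms, doc_freq, n_docs, avg_dl,
               k1=1.2, b=0.75, k3=8.0):
    """Toy BM25 (Eq. 1): score one document against a query.

    query_terms / doc_terms: lists of tokens; doc_freq: term -> number of
    documents containing it. Defaults follow the ranges on the slide
    (k1 in 1.0-2.0, b ~ 0.75, k3 in 0-1000).
    """
    dl = len(doc_terms)
    score = 0.0
    for t in set(query_terms):
        if t not in doc_terms:
            continue
        tf = doc_terms.count(t)        # term frequency in the document
        qtf = query_terms.count(t)     # term frequency in the query
        df = doc_freq.get(t, 0)
        idf = math.log((n_docs - df + 0.5) / (df + 0.5))
        tf_part = (k1 + 1) * tf / (k1 * ((1 - b) + b * dl / avg_dl) + tf)
        qtf_part = (k3 + 1) * qtf / (k3 + qtf)
        score += idf * tf_part * qtf_part
    return score
```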


Infer topic space

After we obtain the working set (i.e., the top-ranked documents among the initially retrieved documents), we use Probabilistic Latent Semantic Analysis (PLSA) to infer a topic space.

PLSA [9] models each document as a mixture of topics. It generates documents with the following process:

Select a document d with probability p(d)

Pick a latent topic z with probability p(z|d)

Generate a word w with probability p(w|z)

(Plate diagram: document d, latent topic z, and word w; M and K denote the number of documents and latent topics.)
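A minimal sketch of fitting PLSA by EM on a small document-term count matrix, kept deliberately naive (the dense E-step array is fine for a toy working set, not a full corpus). Variable names and the random initialization are illustrative.

```python
import numpy as np

def plsa_em(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA p(z|d) and p(w|z) by EM on an (n_docs x n_words) count matrix."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics)); p_z_d /= p_z_d.sum(1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words)); p_w_z /= p_w_z.sum(1, keepdims=True)
    for _ in range(n_iter):
        # E-step: p(z | d, w) for every (d, w) pair, shape (n_docs, n_words, n_topics)
        joint = p_z_d[:, None, :] * p_w_z.T[None, :, :]
        joint /= joint.sum(axis=2, keepdims=True) + 1e-12
        # M-step: re-estimate p(w|z) and p(z|d) from expected counts n(d,w) * p(z|d,w)
        expected = counts[:, :, None] * joint
        p_w_z = expected.sum(axis=0).T
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = expected.sum(axis=1)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

# Example: p_z_d, p_w_z = plsa_em(np.array([[2, 1, 0], [0, 1, 3]]), n_topics=2)
```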


Project the search query and opinion-carrying sentences into the topic space

Project an opinion-carrying sentence

\[ \overrightarrow{p(s_{op}|z)} = \langle p(s_{op}|z_1), \ldots, p(s_{op}|z_K) \rangle \tag{2} \]

where \(p(s_{op}|z_k)\) is the probability for topic \(z_k\) to generate \(s_{op}\) [10].

\[ p(s_{op}|z_k) = \frac{1}{K'} \sum_{j=1}^{|V|} n(s_{op}, w_j)\, p(w_j|z_k) \tag{3} \]

Project the search query

\[ \overrightarrow{p(q|z)} = \langle p(q|z_1), \ldots, p(q|z_K) \rangle \tag{4} \]

where \(p(q|z_k)\) is the probability for topic \(z_k\) to generate query \(q\).

\[ p(q|z_k) = \frac{1}{K''} \sum_{j=1}^{|V|} n(q, w_j)\, p(w_j|z_k) \tag{5} \]
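The projections of Eqs. (2)–(5) reduce to a weighted sum of topic-word probabilities over the words of the sentence (or query). A sketch reusing the p_w_z matrix from the PLSA sketch above; here the normalizers K' and K'' are assumed to make the topic vector sum to one, which is an illustrative choice.

```python
from collections import Counter
import numpy as np

def project_to_topic_space(text_tokens, p_w_z, vocab):
    """Return <p(text|z_1), ..., p(text|z_K)> for a sentence or a query."""
    n_topics = p_w_z.shape[0]
    word_counts = Counter(text_tokens)          # n(text, w_j)
    vec = np.zeros(n_topics)
    for word, n in word_counts.items():
        j = vocab.get(word)                     # vocab: word -> column index of p_w_z
        if j is not None:
            vec += n * p_w_z[:, j]              # sum_j n(text, w_j) * p(w_j | z_k)
    total = vec.sum()
    return vec / total if total > 0 else vec    # 1/K' (resp. 1/K'') normalization
```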


Opinion relevance measurement at the topic level

Measuring the similarity between sentences and queries at the topic level

\[ p(q|s_{op}) = p(Q_{topic}|s_{op}) = \cos\!\big(\overrightarrow{p(q|z)}, \overrightarrow{p(s_{op}|z)}\big) = \frac{\overrightarrow{p(q|z)} \cdot \overrightarrow{p(s_{op}|z)}}{\|\overrightarrow{p(q|z)}\|_2 \times \|\overrightarrow{p(s_{op}|z)}\|_2} \tag{6} \]

Note that \(p(Q_{topic}|s_{op})\) is not a strict probability distribution; it is used to denote the similarity score between \(s_{op}\) and \(Q_{topic}\).
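Eq. (6) is plain cosine similarity on the two projected vectors. A minimal helper, with an example call chaining the previous sketches (all names are illustrative):

```python
import numpy as np

def topic_similarity(q_vec, s_vec):
    """cos(p(q|z), p(s_op|z)): query-sentence similarity in the latent topic space."""
    denom = np.linalg.norm(q_vec) * np.linalg.norm(s_vec)
    return float(q_vec @ s_vec / denom) if denom > 0 else 0.0

# q_vec = project_to_topic_space(query_tokens, p_w_z, vocab)
# s_vec = project_to_topic_space(sentence_tokens, p_w_z, vocab)
# p_q_given_s = topic_similarity(q_vec, s_vec)
```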


Re-ranking

Opinion score

\[ \mathrm{Score}_{op} = \sum_{s_m} p(q|s_m) \cdot f(s_m) \quad \text{s.t. } (1)\ p(q|s_m) > \mu, \ (2)\ s_m \in S_o \tag{7} \]

where \(S_o\) is the opinion-expressing sentence set of document \(d\) and \(f(s_m)\) denotes the sentiment strength of opinion-expressing sentence \(s_m\).

Overall score for ranking

\[ \mathrm{Score} = \mathrm{Combination}(\mathrm{Score}_{IR}, \widehat{\mathrm{Score}}_{op}) \tag{8} \]

Two methods for combining \(\mathrm{Score}_{IR}\) and \(\widehat{\mathrm{Score}}_{op}\):

Linear combination: \(\lambda \cdot \mathrm{Score}_{IR} + (1-\lambda) \cdot \widehat{\mathrm{Score}}_{op}\) [11, 8]

Quadratic combination: \(\mathrm{Score}_{IR} \cdot \widehat{\mathrm{Score}}_{op}\) [3]
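Putting Eqs. (7) and (8) together, a sketch of the re-ranking step: opinionated sentences whose topic-level similarity to the query exceeds the threshold mu contribute their similarity times sentiment strength, and the result is combined with the IR score. The input layout and the threshold default are illustrative assumptions.

```python
def opinion_score(query_vec, opinion_sentences, mu=0.1):
    """Eq. (7): opinion_sentences is a list of (sentence_topic_vec, sentiment_strength)."""
    score = 0.0
    for s_vec, strength in opinion_sentences:     # s_m in S_o
        sim = topic_similarity(query_vec, s_vec)  # p(q | s_m)
        if sim > mu:                              # constraint (1)
            score += sim * strength               # p(q|s_m) * f(s_m)
    return score

def combine(score_ir, score_op, method="linear", lam=0.5):
    """Eq. (8): combine the relevance score and the (normalized) opinion score."""
    if method == "linear":
        return lam * score_ir + (1 - lam) * score_op
    return score_ir * score_op                    # quadratic combination
```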


Evaluation Setup

Data Sets

TREC Blog06 [12]: crawled over a period of 11 weeks (December 2005 – February 2006), 148 GB with three components: feeds, permalinks, and homepages.

We use 150 queries from Blog06, Blog07 [13], and Blog08[14] for evaluation.

We tune the algorithm parameters on the Blog Track 06 topics and use the Blog Track 07 and 08 new topics as the testing data.


Evaluation metrics

MAP (mean average precision), P@10 (precision at the top 10 results), and R-prec (R-Precision)

\[ \mathrm{precision} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|} \tag{9} \]

P@10 is the precision that considers only the topmost 10 results returned.

\[ \mathrm{AveP}(q) = \frac{\sum_{k=1}^{n} P@k \cdot rel(k)}{\text{number of relevant documents}} \tag{10} \]

where rel(k) is an indicator function equaling 1 if the item at rank k is a relevant document and 0 otherwise, and P@k is the precision at cut-off k in the list.

Example

For this query result, relevant documents are returned at ranks 1, 3, and 5, so the AveP (average precision) is \(\frac{1}{3}\left(\frac{1}{1} + \frac{2}{3} + \frac{3}{5}\right)\), because P@1 = 1/1, P@3 = 2/3, and P@5 = 3/5.


Evaluation Setup

Evaluation metrics

\[ \mathrm{MAP} = \frac{\sum_{q=1}^{Q} \mathrm{AveP}(q)}{Q} \tag{11} \]

R-Precision is the precision at the R-th position in the ranking of results for a query that has R relevant documents.
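A small sketch of the three metrics (Eqs. 9–11 plus R-Precision) over binary relevance judgments; ranked is a list of document ids in ranked order and relevant is the set of relevant ids, both hypothetical inputs.

```python
def average_precision(ranked, relevant):
    """Eq. (10): average of P@k over the ranks k where a relevant document appears."""
    hits, precisions = 0, []
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / k)   # P@k at each relevant hit
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """Eq. (11): runs is a list of (ranked, relevant) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

def r_precision(ranked, relevant):
    """Precision at rank R, where R = number of relevant documents."""
    r = len(relevant)
    return sum(1 for doc in ranked[:r] if doc in relevant) / r if r else 0.0

# The slide's example: relevant documents at ranks 1, 3 and 5 out of 3 relevant
# documents gives AveP = (1/1 + 2/3 + 3/5) / 3 ≈ 0.756.
```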


Experiment environment setting

Opinion identification policy: we follow M. Zhang et al. [3] and use SentiWordNet [4], choosing opinion-bearing words with positive/negative strength greater than 0.6.

Combination policy: linear combination

Size of the working set: M=1000

Example

SentiWordNet sample

Positive               Negative                 Mixed
happy 0.875            abject 0.75              unfixed 0.125 0.625
fortuitous 0.5         disastrous 0.75          detached 0.125 0.5
good 0.25              dispossessed 0.875       at_liberty 0.125 0.375
providential 0.875     pathetic 1               ascetic 0.625 0.25
lucky 0.875            ill-fated 0.875          deathlike 0.125 0.5
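A sketch of the opinion-word selection policy (keep words whose positive or negative strength exceeds 0.6), using a few of the single-polarity sample entries above as a toy lexicon; the dictionary layout is an illustrative assumption and not the real SentiWordNet interface.

```python
# Toy lexicon built from sample entries above: word -> (positive, negative) strength.
toy_lexicon = {
    "happy": (0.875, 0.0), "good": (0.25, 0.0), "lucky": (0.875, 0.0),
    "abject": (0.0, 0.75), "pathetic": (0.0, 1.0), "ill-fated": (0.0, 0.875),
}

def opinion_bearing_words(lexicon, threshold=0.6):
    """Keep words whose positive or negative strength exceeds the threshold."""
    return {w: max(pos, neg) for w, (pos, neg) in lexicon.items()
            if max(pos, neg) > threshold}

print(opinion_bearing_words(toy_lexicon))
# {'happy': 0.875, 'lucky': 0.875, 'abject': 0.75, 'pathetic': 1.0, 'ill-fated': 0.875}
# "good" (0.25) does not pass the 0.6 threshold.
```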


Baseline methods and implementations

bag-of-words-Method

\[ \mathrm{Score}_{op} = \sum_{s_m \in S_o} f(s_m) \tag{12} \]

single-sentence-Method

\[ \mathrm{Score}_{op} = \sum_{s_m} f(s_m) \quad \text{s.t. } (1)\ s_m \in S_o, \ (2)\ s_m \text{ contains } q \tag{13} \]

window-Method

\[ \mathrm{Score}_{op} = \sum_{s_m} f(s_m) \quad \text{s.t. } (1)\ s_m \in S_o, \ (2)\ \exists\, s_i \in S_{\text{neighbors of } s_m} \wedge s_i \text{ contains } q \tag{14} \]
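For contrast with the topic-level score of Eq. (7), a sketch of the three baseline scorers (Eqs. 12–14). Sentences are modeled as (tokens, sentiment_strength, is_opinionated) tuples and the neighborhood in the window method is taken as the adjacent sentences including s_m itself; both are illustrative assumptions.

```python
def bag_of_words_score(sentences):
    """Eq. (12): sum sentiment strength over all opinionated sentences."""
    return sum(f for _, f, is_op in sentences if is_op)

def single_sentence_score(sentences, query_terms):
    """Eq. (13): only opinionated sentences that themselves contain a query term."""
    return sum(f for tokens, f, is_op in sentences
               if is_op and any(t in tokens for t in query_terms))

def window_score(sentences, query_terms, window=1):
    """Eq. (14): opinionated sentences with a query term in a neighboring sentence."""
    score = 0.0
    for i, (_, f, is_op) in enumerate(sentences):
        if not is_op:
            continue
        lo, hi = max(0, i - window), min(len(sentences), i + window + 1)
        neighbor_tokens = [t for tokens, _, _ in sentences[lo:hi] for t in tokens]
        if any(t in neighbor_tokens for t in query_terms):
            score += f
    return score
```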


Effect of the number of topics

Figure: MAP evolution over topic number (x-axis: topic number, 0–500; y-axis: MAP, roughly 0.16–0.22; the maximum is marked).


Effectiveness of our framework

Figure: Test document d
(a) Query: “March of the Penguins”
(b) Topical relevance of each sentence of document d (x-axis: sentence index; y-axis: topical relevance, 0–1)


Comparison to baselines

Figure: MAP & P@10 comparison for Blog Track 06, 07 and 08 topics with different combination parameters (lambda on the x-axis). Panels: Blog Track 06 MAP, Blog Track 06 P@10, Blog Track 07 MAP, Blog Track 07 P@10, Blog Track 08 new topics MAP, Blog Track 08 new topics P@10. Curves: our topic framework, window-Method, single-sentence-Method, bag-of-words-Method.


MAP improvements over different queries

Figure: MAP improvement after re-ranking for the 150 individual topics of Blog Track 06, 07 and 08 (panels: 06 Topics, 07 Topics, 08 new 50 Topics; y-axis: MAP improvement, roughly −0.5 to 0.5).


Comparison with state-of-the-art methods

Data set             Method                                  MAP      P@10     R-prec
Blog 06              Best title-only run at Blog06 [12]      0.1885   0.512    0.2771
                     Our Relevant Baseline                   0.186    0.32     0.2623
                     Our framework                           0.2504   0.532    0.3169
                     M. Zhang et al. improvement [3]         28.38%   44.86%   16.00%
                     K. Seki et al. improvement [7]          22.00%   –        –
                     Our framework's improvement             34.62%   66.25%   20.82%
Blog 07              Most improvement at Blog07 [13]         15.90%   21.60%   8.60%
                     M. Zhang et al. improvement [3]         28.10%   40.30%   19.90%
                     Our Relevant Baseline                   0.2603   0.438    0.3131
                     Our framework                           0.3484   0.628    0.3869
                     Our framework's improvement             33.84%   43.38%   23.57%
Blog 08 new topics   Most improvement at Blog08 [14]         31.60%   –        –
                     Second-best improvement at Blog08       14.75%   –        –
                     Our Relevant Baseline                   0.2818   0.498    0.3451
                     Our framework                           0.3296   0.6196   0.3789
                     Our framework's improvement             16.96%   24.42%   9.79%


Conclusion

In this paper. . .

1 We propose a novel framework to measure opinion relevance in the latent topic space.

2 It is essentially a general framework for opinion retrieval and can be used by most current solutions to further enhance their performance.

3 Our framework is domain-independent and thus fits different opinion retrieval environments.

4 The effectiveness and advantage of our framework were justified by experimental results on TREC blog datasets.


Q&A

Many thanks!


[1] C. Fellbaum, “WordNet: An Electronic Lexical Database (Language, Speech and Communication),” The MIT Press, May 1998.

[2] S. Kim and E. Hovy, “Determining the sentiment of opinions,” in COLING'04, 2004, pp. 1367–1373.

[3] M. Zhang and X. Ye, “A generation model to unify topic relevance and lexicon-based sentiment for opinion retrieval,” in SIGIR'08, 2008, pp. 411–418.

[4] A. Esuli and F. Sebastiani, “SentiWordNet: A publicly available lexical resource for opinion mining,” in LREC'06, 2006, pp. 417–422.

[5] B. Liu, M. Hu, and J. Cheng, “Opinion observer: Analyzing and comparing opinions on the web,” in WWW'05, 2005, pp. 342–351.

[6] B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? Sentiment classification using machine learning techniques,” CoRR, cs.CL/0205070, 2002.


[7] K. Seki and K. Uehara, “Adaptive subjective triggers for opinionated document retrieval,” in WSDM'09, 2009.

[8] W. Zhang, C. Yu, and W. Meng, “Opinion retrieval from blogs,” in CIKM'07, 2007, pp. 831–840.

[9] T. Hofmann, “Probabilistic latent semantic indexing,” in SIGIR'99, 1999, pp. 50–57.

[10] Y. Lu and C. X. Zhai, “Opinion integration through semi-supervised topic modeling,” in WWW'08, 2008.

[11] G. Mishne, “Multiple ranking strategies for opinion retrieval in blogs,” in TREC 2006, 2006.

[12] I. Ounis, M. de Rijke, C. Macdonald, G. A. Mishne, and I. Soboroff, “Overview of the TREC-2006 Blog Track,” in TREC'06 Working Notes, 2006, pp. 15–27.

[13] C. Macdonald, I. Ounis, and I. Soboroff, “Overview of the TREC-2007 Blog Track,” in TREC'07, 2007.


[14] I. Ounis, C. Macdonald, and I. Soboroff, “Overview of the TREC-2008 Blog Track,” in TREC'08, 2008.
