Modeling Diversity in Information Retrieval


Page 1: Modeling Diversity in   Information Retrieval

ACM SIGIR 2009 Workshop on Redundancy, Diversity, and Interdependent Document Relevance, July 23, 2009, Boston, MA


Modeling Diversity in

Information Retrieval

ChengXiang (“Cheng”) Zhai

Department of Computer Science

Graduate School of Library & Information Science

Institute for Genomic Biology

Department of Statistics

University of Illinois, Urbana-Champaign

Joint work with John Lafferty, William Cohen, and Xuehua Shen

Page 2: Modeling Diversity in   Information Retrieval

Different Reasons for Diversification

• Redundancy reduction

• Diverse information needs

– Mixture of users

– Single user with an under-specified query

– Aspect retrieval

– Overview of results

• Active relevance feedback

• …


Page 3: Modeling Diversity in   Information Retrieval

Outline

• Risk minimization framework

• Capturing different needs for diversification

• Language models for diversification


Page 4: Modeling Diversity in   Information Retrieval


IR as Sequential Decision Making

User ↔ System interaction:

- A1: user enters a query → system decides which documents to present and how to present them
- Ri: results (i = 1, 2, 3, …) → user decides which documents to view
- A2: user views a document → system decides which part of the document to show, and how
- R': document content → user decides whether to view more
- A3: user clicks on the "Back" button → …

(The user acts on an information need; the system maintains a model of that information need.)

Page 5: Modeling Diversity in   Information Retrieval


Retrieval Decisions

User U:  A1, A2, …, At-1, At
System:  R1, R2, …, Rt-1

History H = {(Ai, Ri)}, i = 1, …, t-1, over document collection C.

Given U, C, At, and H, choose the best Rt ∈ r(At) from all possible responses to At.

Examples:
- At = query "Jaguar" → r(At) = all possible rankings of C; the best Rt = the best ranking for the query.
- At = click on "Next" button → r(At) = all possible size-k subsets of unseen docs; the best Rt = the best k unseen docs.

Page 6: Modeling Diversity in   Information Retrieval


A Risk Minimization Framework

Observed: user U, interaction history H, current user action At, document collection C.

All possible responses: r(At) = {r1, …, rn}.

User model M = (θ_U, S, …): θ_U = information need (inferred), S = seen docs.

Loss function L(ri, At, M). The optimal response r* is the one with minimum Bayes risk:

R_t^* = \arg\min_{r \in r(A_t)} \int_M L(r, A_t, M)\, P(M \mid U, H, A_t, C)\, dM
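To make the decision rule concrete, here is a minimal Python sketch that assumes a discrete space of user models so the integral becomes a sum; the candidate responses, toy loss, and toy posterior are hypothetical illustrations, not part of the original framework.

```python
# A minimal sketch of the risk minimization rule, with a discrete
# (hypothetical) space of user models so the integral becomes a sum.

def bayes_optimal_response(responses, models, posterior, loss):
    """Pick r* = argmin_r sum_M loss(r, M) * posterior(M)."""
    def bayes_risk(r):
        return sum(loss(r, m) * posterior[m] for m in models)
    return min(responses, key=bayes_risk)

# Toy example: two candidate rankings, two possible information needs
# behind the ambiguous query "jaguar".
models = ["need_animal", "need_car"]
posterior = {"need_animal": 0.7, "need_car": 0.3}
responses = ["rank_animal_first", "rank_car_first"]
loss = lambda r, m: 0.0 if m.split("_")[1] in r else 1.0
print(bayes_optimal_response(responses, models, posterior, loss))
```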

Page 7: Modeling Diversity in   Information Retrieval


A Simplified Two-Step Decision-Making Procedure

• Approximate the Bayes risk by the loss at the mode of the posterior distribution
• Two-step procedure
– Step 1: Compute an updated user model M* based on the currently available information
– Step 2: Given M*, choose a response to minimize the loss function

R_t^* = \arg\min_{r \in r(A_t)} \int_M L(r, A_t, M)\, P(M \mid U, H, A_t, C)\, dM
      \approx \arg\min_{r \in r(A_t)} L(r, A_t, M^*)\, P(M^* \mid U, H, A_t, C)
      = \arg\min_{r \in r(A_t)} L(r, A_t, M^*)

where M^* = \arg\max_M P(M \mid U, H, A_t, C)
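A corresponding sketch of the two-step shortcut, under the same toy assumptions as the previous snippet (discrete model space, supplied posterior and loss):

```python
def two_step_response(responses, models, posterior, loss):
    """Simplified procedure: MAP user model first, then minimize loss."""
    m_star = max(models, key=lambda m: posterior[m])      # Step 1: MAP user model M*
    return min(responses, key=lambda r: loss(r, m_star))  # Step 2: minimize loss given M*
```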

Page 8: Modeling Diversity in   Information Retrieval


Optimal Interactive Retrieval

The user interacts with the IR system over collection C:

A1 → infer M*1 = argmax P(M1 | U, H, A1, C) → minimize L(r, A1, M*1) → response R1
A2 → infer M*2 = argmax P(M2 | U, H, A2, C) → minimize L(r, A2, M*2) → response R2
A3 → …

Page 9: Modeling Diversity in   Information Retrieval

Refinement of Risk Minimization

• At ∈ {“enter a query”, “click on Back button”, “click on Next button”, …}
• r(At): decision space (At-dependent)
– r(At) = all possible subsets of C + presentation strategies
– r(At) = all possible rankings of docs in C
– r(At) = all possible rankings of unseen docs
– …
• M: user model
– Essential component: θ_U = user information need
– S = seen documents
– n = “topic is new to the user”
• L(Rt, At, M): loss function
– Generally measures the utility of Rt for a user modeled as M
– Often encodes retrieval criteria (e.g., using M to select a ranking of docs)
• P(M | U, H, At, C): user model inference
– Often involves estimating a unigram language model θ_U

Page 10: Modeling Diversity in   Information Retrieval


Generative Model of Document & Query [Lafferty & Zhai 01]

The user U (partially observed) generates a query model θ_Q ~ p(θ_Q | U), which generates the observed query q ~ p(q | θ_Q, U). A source S (partially observed) generates a document model θ_D ~ p(θ_D | S), which generates the observed document d ~ p(d | θ_D, S). Relevance is inferred through p(R | θ_Q, θ_D).

Page 11: Modeling Diversity in   Information Retrieval


Risk Minimization with Language Models [Lafferty & Zhai 01, Zhai & Lafferty 06]

Given user U with query q and source S with doc set C, the system chooses among response options (D1, π1), (D2, π2), …, (DN, πN) — each a document set D plus a presentation strategy π — with a loss L attached to each choice. The optimal choice minimizes the expected loss:

(D^*, \pi^*) = \arg\min_{(D,\pi)} \int_\theta L(D, \pi, \theta)\, p(\theta \mid q, U, C, S)\, d\theta

Page 12: Modeling Diversity in   Information Retrieval


Optimal Ranking for Independent Loss

Assumptions: decision space = {rankings}, sequential browsing, independent loss. The loss of a ranking then decomposes over positions:

\pi^* = \arg\min_{\pi} \int_\theta L(\pi, \theta)\, p(\theta \mid q, U, C, S)\, d\theta,
\qquad L(\pi, \theta) = \sum_{j=1}^{N} s_j\, l(d_{\pi(j)}, \theta)

where s_j is a position-dependent browsing weight (the chance the user still reads position j). Independent risk = independent scoring: each document gets an expected risk

r(d \mid q, U, C, S) = \int_\theta l(d, \theta)\, p(\theta \mid q, U, C, S)\, d\theta

and the optimal ranking simply sorts documents by r(d | q, U, C, S) — the “risk ranking principle” [Zhai 02, Zhai & Lafferty 06].
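A hedged sketch of the principle in Python: under the stated assumptions, ranking reduces to sorting documents by expected risk. The discrete posterior and per-document loss are again toy stand-ins for the integral.

```python
def expected_risk(doc, models, posterior, doc_loss):
    """r(d) = sum_M l(d, M) * P(M | ...), with a discrete toy posterior."""
    return sum(doc_loss(doc, m) * posterior[m] for m in models)

def risk_ranking(docs, models, posterior, doc_loss):
    """Risk ranking principle: sort documents by expected risk (ascending)."""
    return sorted(docs, key=lambda d: expected_risk(d, models, posterior, doc_loss))
```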

Page 13: Modeling Diversity in   Information Retrieval

Risk Minimization for Diversification

• Redundancy reduction: loss function includes a redundancy/novelty measure
– Special case: list presentation + MMR [Zhai et al. 03]

• Diverse information needs: loss function defined on latent topics
– Special case: PLSA/LDA + aspect retrieval [Zhai 02]

• Active relevance feedback: loss function considers both relevance and benefit for feedback
– Special case: feedback only (hard queries) [Shen & Zhai 05]


Page 14: Modeling Diversity in   Information Retrieval

Subtopic Retrieval

Query: What are the applications of robotics in the world today?

Find as many DIFFERENT applications as possible.

Example subtopics:
A1: spot-welding robotics
A2: controlling inventory
A3: pipe-laying robots
A4: talking robot
A5: robots for loading & unloading memory tapes
A6: robot [telephone] operators
A7: robot cranes
… …

Subtopic judgments:
      A1 A2 A3 …  … Ak
d1:   1  1  0  0  … 0  0
d2:   0  1  1  1  … 0  0
d3:   0  0  0  0  … 1  0
…
dk:   1  0  1  0  … 0  1

Need to model interdependent document relevance

Page 15: Modeling Diversity in   Information Retrieval

Diversify = Remove Redundancy [Zhai et al. 03]

The loss of showing d_k after d_1, …, d_{k-1} depends on relevance and novelty, with costs:

Cost        NEW    NOT-NEW
REL         0      C2
NON-REL     C3     C3

C2 < C3, since a redundant relevant doc is still better than a non-relevant doc. This gives

l(d_k \mid q, d_1, \ldots, d_{k-1}) = C_2\, p(\mathrm{Rel} \mid d_k)\,(1 - p(\mathrm{New} \mid d_k)) + C_3\,(1 - p(\mathrm{Rel} \mid d_k))

and the expected risk r(d_k | d_1, …, d_{k-1}) = \int_\theta l(d_k \mid q, d_1, \ldots, d_{k-1}, \theta)\, p(\theta \mid q, U, C, S)\, d\theta. Minimizing this risk is rank-equivalent to sorting, at each step, by

p(q \mid d_k)\,\big(\rho + p(\mathrm{New} \mid d_k)\big), \qquad \rho = \frac{C_3}{C_2} - 1

where p(Rel | d_k) is estimated by the query likelihood p(q | d_k), and ρ ≥ 0 expresses the “willingness to tolerate redundancy”.

Greedy algorithm for ranking: Maximal Marginal Relevance (MMR).
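A sketch of the resulting greedy algorithm, assuming `p_rel` (e.g., the query likelihood) and `p_new` (e.g., the mixture-model estimate on the next slide) are supplied; both names are illustrative.

```python
def greedy_novelty_ranking(docs, p_rel, p_new, rho):
    """At each step, pick the unranked doc maximizing
        p_rel(d) * (rho + p_new(d, already_ranked)),
    where rho = C3/C2 - 1 is the willingness to tolerate redundancy."""
    ranked, remaining = [], list(docs)
    while remaining:
        best = max(remaining, key=lambda d: p_rel(d) * (rho + p_new(d, ranked)))
        ranked.append(best)
        remaining.remove(best)
    return ranked
```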

Page 16: Modeling Diversity in   Information Retrieval

A Mixture Model for Redundancy

The collection gives a background model p(w | Background); the reference document (what the user has already seen) gives p(w | Old). A candidate document d is modeled as a two-component mixture:

p(w \mid d) = \lambda\, p(w \mid \mathrm{Background}) + (1 - \lambda)\, p(w \mid \mathrm{Old})

p(New | d) = λ, the mixing weight of the background model (estimated using EM). p(New | d) can also be estimated using KL-divergence.
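A minimal EM sketch for this two-component mixture; `p_old` and `p_bg` are assumed to be smoothed (nonzero) word distributions, and λ is read off as p(New|d).

```python
def estimate_novelty_em(doc_tokens, p_old, p_bg, iters=50, lam=0.5):
    """Fit the mixing weight lam in
        p(w|d) = lam * p(w|Background) + (1 - lam) * p(w|Old)
    by EM; the fitted lam is read off as p(New|d)."""
    for _ in range(iters):
        # E-step: posterior probability that each word came from the background
        z = [lam * p_bg[w] / (lam * p_bg[w] + (1 - lam) * p_old[w])
             for w in doc_tokens]
        # M-step: the new mixing weight is the average posterior
        lam = sum(z) / len(z)
    return lam  # = p(New|d)
```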

Page 17: Modeling Diversity in   Information Retrieval

Evaluation metrics

• Intuitive goals:
– Should see documents from many different subtopics appear early in a ranking (subtopic coverage/recall)
– Should not see many different documents that cover the same subtopics (redundancy)

• How do we quantify these?
– One problem: the “intrinsic difficulty” of queries can vary.

Page 18: Modeling Diversity in   Information Retrieval

Evaluation metrics: a proposal

• Definition: Subtopic recall at rank K is the fraction of subtopics a such that one of d1, …, dK is relevant to a.

• Definition: minRank(S, r) is the smallest rank K such that the ranking produced by IR system S has subtopic recall r at rank K.

• Definition: Subtopic precision at recall level r for IR system S is:

  SP(r) = minRank(S_opt, r) / minRank(S, r)

where S_opt is the optimal system.

This generalizes ordinary recall-precision metrics.

It does not explicitly penalize redundancy.
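A sketch of these metrics in Python, using the greedy approximation to minRank mentioned on the summary slide (exact minRank is NP-hard); `subtopics_of` maps each doc id to the set of subtopics it is relevant to and is an assumed input.

```python
def subtopic_recall(ranking, subtopics_of, n_subtopics, k):
    """Fraction of the n_subtopics covered by the top-k documents."""
    covered = set()
    for d in ranking[:k]:
        covered |= subtopics_of[d]
    return len(covered) / n_subtopics

def min_rank_of(ranking, subtopics_of, n_subtopics, r):
    """minRank(S, r): smallest K at which the ranking reaches recall r."""
    covered = set()
    for k, d in enumerate(ranking, 1):
        covered |= subtopics_of[d]
        if len(covered) / n_subtopics >= r:
            return k
    return None  # recall r not reachable with this ranking

def min_rank_greedy(docs, subtopics_of, n_subtopics, r):
    """Greedy approximation to minRank(S_opt, r): repeatedly take the doc
    covering the most uncovered subtopics."""
    covered, rank, pool = set(), 0, set(docs)
    while len(covered) / n_subtopics < r and pool:
        best = max(pool, key=lambda d: len(subtopics_of[d] - covered))
        covered |= subtopics_of[best]
        pool.discard(best)
        rank += 1
    return rank

# Subtopic precision at r: min_rank_greedy(...) / min_rank_of(...)
```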

Page 19: Modeling Diversity in   Information Retrieval

Evaluation metrics: rationale

recall

K

minRank(Sopt,r)

minRank(S,r)),minRank(S

),minRank(Sopt

r

r precision

1.0

0.0

For subtopics, theminRank(Sopt,r) curve’s shape is not predictable and linear.

Page 20: Modeling Diversity in   Information Retrieval

Evaluating redundancy

Definition: the cost of a ranking d1, …, dK is

  cost(d_1, \ldots, d_K) = \sum_{i=1}^{K} \big( b + a \cdot \#\mathrm{subtopics}(d_i) \big)

where b is the cost of seeing a document and a is the cost of seeing a subtopic inside a document (before a=0).

Definition: minCost(S, r) is the minimal cost at which recall r is obtained.

Definition: weighted subtopic precision at r is

  WSP(r) = minCost(S_opt, r) / minCost(S, r)

The experiments use a = b = 1.
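A sketch of the cost measure under the reconstruction above (b per document, a per subtopic occurrence); weighted subtopic precision then divides the optimal minCost by the system's minCost.

```python
def ranking_cost(ranking, subtopics_of, a=1, b=1):
    """Cost of reading d1..dK: b per document plus a per subtopic seen
    inside a document (a reconstruction of the slide's formula)."""
    return sum(b + a * len(subtopics_of[d]) for d in ranking)

def min_cost_of(ranking, subtopics_of, n_subtopics, r, a=1, b=1):
    """Smallest prefix cost at which the ranking reaches subtopic recall r."""
    covered, cost = set(), 0
    for d in ranking:
        cost += b + a * len(subtopics_of[d])
        covered |= subtopics_of[d]
        if len(covered) / n_subtopics >= r:
            return cost
    return None  # recall r not reachable
```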

Page 21: Modeling Diversity in   Information Retrieval

Evaluation Metrics Summary

• Measure performance (size of ranking, minRank; cost of ranking, minCost) relative to optimal.

• Generalizes ordinary precision/recall.

• Possible problems:
– Computing minRank and minCost is NP-hard!
– A greedy approximation seems to work well for our data set.

Page 22: Modeling Diversity in   Information Retrieval

Experiment Design

• Dataset: TREC “interactive track” data
– Financial Times (London): 210k docs, 500 MB
– 20 queries from TREC 6-8
  • Subtopics: average 20, min 7, max 56
  • Judged docs: average 40, min 5, max 100
  • Non-judged docs assumed not relevant to any subtopic

• Baseline: relevance-based ranking (using language models)

• Two experiments
– Ranking only relevant documents
– Ranking all documents

Page 23: Modeling Diversity in   Information Retrieval

S-Precision: re-ranking relevant docs

Page 24: Modeling Diversity in   Information Retrieval

WS-precision: re-ranking relevant docs

Page 25: Modeling Diversity in   Information Retrieval

Results for ranking all documents

“Upper bound”: use subtopic names to build an explicit subtopic model.

Page 26: Modeling Diversity in   Information Retrieval

Summary: Remove Redundancy

• Mixture model is effective for identifying novelty in relevant documents

• Trading off novelty and relevance is hard

• Relevance seems to be the dominating factor in TREC interactive-track data

Page 27: Modeling Diversity in   Information Retrieval

Diversity = Satisfy Diverse Info. Need [Zhai 02]

• Need to directly model latent aspects and then optimize results based on aspect/topic matching

• Reducing redundancy doesn’t ensure complete coverage of diverse aspects


Page 28: Modeling Diversity in   Information Retrieval

Aspect Generative Model of Document & Query

As before, user U generates a query model θ_Q ~ p(θ_Q | U) and query q ~ p(q | θ_Q, U); source S generates a document model θ_D ~ p(θ_D | S) and document d ~ p(d | θ_D, S). Now each document model is a mixture over latent aspects with weights θ = (θ_1, …, θ_k):

PLSI:  p(d \mid \theta, R) = \prod_{i=1}^{n} \sum_{a \in A} p(a \mid \theta)\, p(w_i \mid a), \quad d = w_1 \ldots w_n

LDA:   p(d \mid \theta, R) = \int \prod_{i=1}^{n} \sum_{a \in A} p(a \mid \pi)\, p(w_i \mid a)\, \mathrm{Dir}(\pi \mid \theta)\, d\pi
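As a concrete reading of the PLSI line, a toy log-likelihood computation; the aspect weights `theta` and per-aspect word distributions are hypothetical inputs.

```python
import math

def plsi_log_likelihood(doc_tokens, theta, p_w_given_a, eps=1e-12):
    """log p(d | theta) with p(w) = sum_a p(a|theta) * p(w|a).
    theta: dict aspect -> weight; p_w_given_a: aspect -> {word: prob}."""
    return sum(math.log(max(sum(theta[a] * p_w_given_a[a].get(w, 0.0)
                                for a in theta), eps))
               for w in doc_tokens)
```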

Page 29: Modeling Diversity in   Information Retrieval

Aspect Loss Function

The loss of adding d_k given d_1, …, d_{k-1} is the KL divergence between the query's desired aspect coverage and the combined aspect coverage of the ranked documents:

l(d_k \mid q, d_1, \ldots, d_{k-1}, \theta_Q, \{\theta_i\}) = D\big(\hat\theta_Q \,\|\, \hat\theta_{D,1\ldots k}\big)

where the combined coverage mixes the aspects already covered with those of the new candidate:

p(a \mid \theta_{1 \ldots k}) = \lambda\, \frac{1}{k-1} \sum_{i=1}^{k-1} p(a \mid \theta_i) + (1 - \lambda)\, p(a \mid \theta_k)

with λ a novelty coefficient (λ = 0 scores each candidate purely by its own match to the query).

Page 30: Modeling Diversity in   Information Retrieval

Aspect Loss Function: Illustration

Desired coverage

p(a|Q)

“Already covered”

p(a|1)... p(a|k -

1)Combined coverage

p(a|k)

New candidate p(a|k)

non-relevant

redundant

perfect
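A sketch of the aspect loss under the reconstruction above: KL divergence between the query's desired coverage and the λ-mixed combined coverage. All distributions are dicts over a shared aspect set; the combination rule follows the reconstructed formula and should be treated as an assumption.

```python
import math

def aspect_loss(p_a_query, p_a_ranked, p_a_candidate, lam, eps=1e-12):
    """KL(desired coverage || combined coverage) for a candidate d_k, given
    the aspect coverages of the k-1 already-ranked documents (a list of
    dicts). Mixes averaged old coverage (weight lam) with the candidate's
    coverage (weight 1 - lam) -- a reconstruction, see above."""
    k_minus_1 = len(p_a_ranked)
    loss = 0.0
    for a, pq in p_a_query.items():
        if pq <= 0:
            continue
        old = sum(p[a] for p in p_a_ranked) / k_minus_1 if k_minus_1 else 0.0
        combined = lam * old + (1 - lam) * p_a_candidate[a]
        loss += pq * math.log(pq / max(combined, eps))
    return loss
```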

Page 31: Modeling Diversity in   Information Retrieval

Evaluation Measures

• Aspect Coverage (AC): measures per-doc coverage
– #distinct-aspects / #docs

• Aspect Uniqueness (AU): measures redundancy
– #distinct-aspects / #aspects

• Example (aspect bit vectors d1 = 0001001, d2 = 0101100, d3 = 1000101, …):

  #doc        1          2          3     …
  #asp        2          5          8     …
  #uniq-asp   2          4          5
  AC          2/1 = 2.0  4/2 = 2.0  5/3 = 1.67
  AU          2/2 = 1.0  4/5 = 0.8  5/8 = 0.625
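These two measures are straightforward to compute; a sketch matching the worked example, where each `subtopics_of[d]` is the set of aspects the document covers:

```python
def aspect_coverage_and_uniqueness(ranking, subtopics_of, k):
    """AC = #distinct-aspects / #docs, AU = #distinct-aspects / #aspect
    occurrences, over the top-k documents."""
    distinct, total = set(), 0
    for d in ranking[:k]:
        distinct |= subtopics_of[d]
        total += len(subtopics_of[d])
    return len(distinct) / k, (len(distinct) / total if total else 0.0)
```

With the example's d1–d3, this yields AC = 2.0, 2.0, 1.67 and AU = 1.0, 0.8, 0.625 at k = 1, 2, 3.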

Page 32: Modeling Diversity in   Information Retrieval

Effectiveness of Aspect Loss Function (PLSI)

Aspect Coverage Aspect UniquenessData set NoveltyCoefficient Prec() AC() Prec() AU()=0 0.265(0) 0.845(0) 0.265(0) 0.355(0)0 0.249(0.8) 1.286(0.8) 0.263(0.6) 0.344(0.6)

MixedData

Improve -6.0% +52.2% -0.8% -3.1%=0 1(0) 1.772(0) 1(0) 0.611(0)0 1(0.1) 2.153(0.1) 1(0.9) 0.685(0.9)

RelevantData

Improve 0% +21.5% 0% +12.1%

)|()1()|(1

)|(1

11,...,1 k

k

ii

kk apap

kap

Page 33: Modeling Diversity in   Information Retrieval

Effectiveness of Aspect Loss Function (LDA)

Aspect Coverage Aspect UniquenessData set NoveltyCoefficient Prec AC Prec AC=0 0.277(0) 0.863(0) 0.277(0) 0.318(0)0 0.273(0.5) 0.897(0.5) 0.259(0.9) 0.348(0.9)

MixedData

Improve -1.4% +3.9% -6.5% +9.4%=0 1(0) 1.804(0) 1(0) 0.631(0)0 1(0.99) 1.866(0.99) 1(0.99) 0.705(0.99)

RelevantData

Improve 0% +3.4% 0% +11.7%

)|()1()|(1

)|(1

11,...,1 k

k

ii

kk apap

kap

Page 34: Modeling Diversity in   Information Retrieval

Comparison of 4 MMR Methods

Mixed Data Relevant DataMMRMethod AC Improve AU Improve AC Improve AU ImproveCC() 0%(+) 0%(+) +2.6%(1.5) +13.8%(1.5)

QB() 0%(0) 0%(0) +1.8%(0.6) +5.6%(0.99)

MQM() +0.2%(0.4) +1.0%(0.95) +0.2%(0.1) +1.2%(0.9)

MDM() +1.5%(0.5) +2.2%(0.5) 0%(0.1) +1.1%(0.5)

CC - Cost-based CombinationQB - Query Background ModelMQM - Query Marginal ModelMDM - Document Marginal Model

Page 35: Modeling Diversity in   Information Retrieval

Summary: Diverse Information Need

• Mixture model is effective for capturing latent topics

• Direct modeling of latent aspects/topics is more effective than indirect modeling through MMR in improving aspect coverage, but MMR is better for improving aspect uniqueness

• With direct topic modeling and matching, aspect coverage can be improved at the price of lower relevance-based precision

Page 36: Modeling Diversity in   Information Retrieval

Diversify = Active Feedback [Shen & Zhai 05]

Decision problem: decide the subset D = {d_1, …, d_k} of documents to present for relevance judgment.

D^* = \arg\min_D \int_\theta L(D, \theta)\, p(\theta \mid U, q, C)\, d\theta

L(D, \theta) = \sum_{j} l(D, j, \theta)\, p(j \mid D, U, \theta)

where j = (j_1, …, j_k) ranges over the possible judgments of the k presented documents.

Page 37: Modeling Diversity in   Information Retrieval

Independent Loss

If the loss and the judgments decompose over documents, l(D, j, \theta) = \sum_{i=1}^{k} l(d_i, j_i, \theta) and p(j \mid D, U, \theta) = \prod_{i=1}^{k} p(j_i \mid d_i, U, \theta), then

L(D, \theta) = \sum_{i=1}^{k} \sum_{j_i} l(d_i, j_i, \theta)\, p(j_i \mid d_i, U, \theta)

and the optimal set is found by scoring each document independently:

r(d_i) = \sum_{j_i} \int_\theta l(d_i, j_i, \theta)\, p(j_i \mid d_i, U, \theta)\, p(\theta \mid U, q, C)\, d\theta

D^* = \arg\min_D \sum_{i=1}^{k} r(d_i)

Page 38: Modeling Diversity in   Information Retrieval

Independent Loss (cont.)

Uncertainty sampling: use log loss,

l(d_i, 1, \theta) = -\log p(R = 1 \mid d_i, \theta), \qquad l(d_i, 0, \theta) = -\log p(R = 0 \mid d_i, \theta)

\Rightarrow\; r(d_i) = \int_\theta H(R \mid d_i, \theta)\, p(\theta \mid U, q, C)\, d\theta

i.e., select the documents whose relevance is most uncertain (highest entropy).

Top K: use constant losses; for d_i ∈ C,

l(d_i, 1, \theta) = C_1, \qquad l(d_i, 0, \theta) = C_0, \qquad C_1 < C_0

\Rightarrow\; r(d_i) = C_0 - (C_0 - C_1) \int_\theta p(j_i = 1 \mid d_i, U, \theta)\, p(\theta \mid U, q, C)\, d\theta

i.e., select the k documents most likely to be relevant (ordinary top-k feedback).
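A sketch of the two special cases; `p_rel_of` stands in for the (integrated-out) probability of relevance and is a hypothetical input.

```python
import math

def uncertainty_score(p_rel):
    """Binary entropy H(R|d): maximal when p(R=1|d) = 0.5 (most uncertain)."""
    if p_rel <= 0.0 or p_rel >= 1.0:
        return 0.0
    return -(p_rel * math.log2(p_rel) + (1 - p_rel) * math.log2(1 - p_rel))

def select_for_feedback(docs, p_rel_of, k, method="uncertainty"):
    """Uncertainty sampling picks the k most uncertain documents; the
    constant-loss Top-K case picks the k most probably relevant ones."""
    if method == "uncertainty":
        key = lambda d: -uncertainty_score(p_rel_of[d])
    else:  # "top_k"
        key = lambda d: -p_rel_of[d]
    return sorted(docs, key=key)[:k]
```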

Page 39: Modeling Diversity in   Information Retrieval

Dependent Loss

Here the loss couples the selected documents: roughly, it combines the total relevance \sum_{i=1}^{k} p(j_i = 1 \mid d_i, U, \theta) with a set-level diversity term over D.

Heuristics: consider relevance first, then diversity (illustrated in the sketch after this list):

• Gapped Top-K: present ranks 1, 1+G, 1+2G, … (requiring N = (K-1)·G + 1 documents from the original ranking)
• K-cluster centroid: select the top N documents, cluster them into K clusters, and present the K cluster centroids
• MMR
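Sketches of the two presentation heuristics; the clustering variant assumes hypothetical `doc_vectors` (e.g., tf-idf) and uses scikit-learn's k-means, a substitution for whatever clustering the original experiments used.

```python
import numpy as np
from sklearn.cluster import KMeans

def gapped_top_k(ranking, k, gap):
    """Gapped Top-K: present ranks 1, 1+G, 1+2G, ... (0-based indexing here)."""
    return [ranking[i * gap] for i in range(k) if i * gap < len(ranking)]

def cluster_centroid_k(ranking, doc_vectors, n, k):
    """K-cluster centroid: cluster the top-N docs and present, for each
    cluster, the member closest to its centroid."""
    top = ranking[:n]
    X = np.array([doc_vectors[d] for d in top])
    km = KMeans(n_clusters=k, n_init=10).fit(X)
    picked = []
    for c in range(k):
        members = [i for i, lbl in enumerate(km.labels_) if lbl == c]
        best = min(members,
                   key=lambda i: np.linalg.norm(X[i] - km.cluster_centers_[c]))
        picked.append(top[best])
    return picked
```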

Page 40: Modeling Diversity in   Information Retrieval

Illustration of Three AF Methods

[Figure: from an initial ranking 1, 2, 3, …, 16, …, Top-K (normal feedback) presents ranks 1..K; Gapped Top-K presents every G-th rank (1, 1+G, 1+2G, …); K-cluster centroid presents one representative document per cluster. The latter two aim at high diversity.]

Page 41: Modeling Diversity in   Information Retrieval

Evaluating Active Feedback

[Figure: for each query, the system selects K docs with one of the active feedback methods (Top-K, gapped, clustering); the selected docs are judged against the judgment file (+/-); feedback is run with the judged docs to produce feedback results, which are compared against the no-feedback initial results.]

Page 42: Modeling Diversity in   Information Retrieval

Retrieval Methods (Lemur toolkit)

Query q and document d are represented by unigram language models θ_Q and θ_D, and documents are scored by Kullback-Leibler divergence D(θ_Q || θ_D). Default parameter settings are used unless otherwise stated.

For feedback, the feedback docs F = {d_1, …, d_n} chosen by an active feedback method (learning only from the relevant ones) are summarized by a mixture-model feedback model θ_F, which is interpolated into the query model:

θ_Q' = (1 - α) θ_Q + α θ_F
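A sketch of the scoring and feedback interpolation; `theta_q`, `theta_d`, and `theta_f` are dicts over words, and the mixture-model estimation of θ_F itself is omitted.

```python
import math

def kl_score(theta_q, theta_d, eps=1e-12):
    """Score a document by negative KL divergence D(theta_Q || theta_D):
    smaller divergence = better match under KL-divergence retrieval."""
    return -sum(p * math.log(p / max(theta_d.get(w, 0.0), eps))
                for w, p in theta_q.items() if p > 0)

def update_query_model(theta_q, theta_f, alpha):
    """Feedback interpolation: theta_Q' = (1 - alpha)*theta_Q + alpha*theta_F."""
    words = set(theta_q) | set(theta_f)
    return {w: (1 - alpha) * theta_q.get(w, 0.0) + alpha * theta_f.get(w, 0.0)
            for w in words}
```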

Page 43: Modeling Diversity in   Information Retrieval

Comparison of Three AF Methods

(#Rel = number of relevant feedback docs; judged docs are included in the evaluation.)

Collection   Active FB Method   #Rel   MAP      Pr@10doc
HARD         Top-K              146    0.325    0.527
             Gapped             150    0.330    0.548
             Clustering         105    0.332    0.565
AP88-89      Top-K              198    0.228    0.351
             Gapped             180    0.234*   0.389*
             Clustering         118    0.237    0.393

(bold font = worst, * = best)

Top-K is the worst! Clustering uses the fewest relevant docs.

Page 44: Modeling Diversity in   Information Retrieval

Appropriate Evaluation of Active Feedback

New DB(AP88-89,

AP90)

Original DBwith judged docs(AP88-89, HARD)

+ -+

Original DBwithout judged

docs

+ -+

Can’t tell if the ranking of un-judged documents is improved

Different methods

have different test documents

See the learning effectmore explicitly

But the docs must be similar to original docs

Page 45: Modeling Diversity in   Information Retrieval

Comparison of Different Test Data

Test Data                 Active FB Method   #Rel   MAP     Pr@10doc
AP88-89                   Top-K              198    0.228   0.351
(including judged docs)   Gapped             180    0.234   0.389
                          Clustering         118    0.237   0.393
AP90                      Top-K              198    0.220   0.321
                          Gapped             180    0.222   0.326
                          Clustering         118    0.223   0.325

Clustering generates fewer, but higher-quality, examples. Top-K is consistently the worst!

Page 46: Modeling Diversity in   Information Retrieval

Summary: Active Feedback

• Presenting the top-k is not the best strategy

• Clustering can generate fewer, higher quality feedback examples

Page 47: Modeling Diversity in   Information Retrieval

Conclusions

• There are many reasons for diversifying search results (redundancy, diverse information needs, active feedback)

• Risk minimization framework can model all these cases of diversification

• Different scenarios may need different techniques and different evaluation measures


Page 48: Modeling Diversity in   Information Retrieval

References• Risk Minimization

– [Lafferty & Zhai 01] John Lafferty and ChengXiang Zhai, Document language models, query models, and risk minimization for information retrieval, in Proceedings of ACM SIGIR 2001, pages 111-119.

– [Zhai & Lafferty 06] ChengXiang Zhai and John Lafferty, A risk minimization framework for information retrieval, Information Processing and Management, 42(1), Jan. 2006, pages 31-55.

• Subtopic Retrieval

– [Zhai et al. 03] ChengXiang Zhai, William Cohen, and John Lafferty, Beyond Independent Relevance: Methods and Evaluation Metrics for Subtopic Retrieval, In Proceedings of ACM SIGIR 2003.

– [Zhai 02] ChengXiang Zhai, Language Modeling and Risk Minimization in Text Retrieval, Ph.D. thesis, Carnegie Mellon University, 2002.

• Active Feedback

– [Shen & Zhai 05] Xuehua Shen and ChengXiang Zhai, Active feedback in ad hoc information retrieval, in Proceedings of ACM SIGIR 2005, pages 59-66.


Page 49: Modeling Diversity in   Information Retrieval


Thank You!