
Leveraging Dynamic Query Subtopics for Time-aware Search Result Diversification

Tu Ngoc Nguyen and Nattiya Kanhabua

Motivation

• A query's underlying aspects change over time

Outline

I. Subtopic Mining
   A. From Query Logs
   B. From Document Collection
II. Time-aware Diversifying Models
III. Experiment

System Pipeline (step 1.a)

[Pipeline diagram: temporally ambiguous, multi-faceted queries → subtopic mining (1.a from a query log, 1.b from the document collection) → dynamic subtopics over time t → 2. time-aware diversification → ranked documents d1, d2, …, dn]

Subtopic Mining Approach: Query Log

[Diagram for step 1.a: build a co-click bipartite graph over queries q1, q2, …, qn → collect related queries at querying time t → clustering → subtopics; the same process is repeated at querying time t + 1]
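
A minimal sketch of step 1.a, not the authors' code: the click-log layout, the time-slice representation, and the function name are assumptions. Two queries are treated as related when they share clicked URLs inside the same time slice of the co-click bipartite graph.

    from collections import defaultdict

    def related_queries(click_log, query, time_slice):
        """click_log: iterable of (query, clicked_url, timestamp) tuples.
        time_slice: a hypothetical (start, end) pair of timestamps.
        Returns the queries that share at least one clicked URL with
        `query` inside the time slice."""
        start, end = time_slice
        url_to_queries = defaultdict(set)
        for q, url, ts in click_log:
            if start <= ts < end:
                url_to_queries[url].add(q)
        related = set()
        for queries_of_url in url_to_queries.values():
            if query in queries_of_url:
                related |= queries_of_url
        related.discard(query)
        return related

The clustering of these related queries into subtopics is detailed on the "Cluster Subtopic Candidates" backup slide.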

Subtopic Dynamics: Query Log

[Timeline for the query "ncaa" in 2006: the subtopic "march madness" began on 14/03/2006, "ncaa women tournament" on 18/03/2006, and "final four" on 01/04/2006]

System Pipeline (step 1.b)

[Same pipeline diagram as before, now highlighting step 1.b: subtopic mining from the document collection]

Subtopic Mining: Document Collection

– Probabilistic subtopic modeling: Latent Dirichlet Allocation (LDA); a sketch follows below
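
A minimal sketch of per-time-slice LDA subtopic mining, assuming scikit-learn (the transcript does not say which toolkit the authors used); the function and parameter names are illustrative only.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    def mine_subtopics(docs_in_slice, n_subtopics=10, top_words=10):
        """docs_in_slice: list of document texts published in one time slice.
        Returns each subtopic as its top-probability words (the experiment
        settings later use the top 10 words to represent a topic)."""
        vectorizer = CountVectorizer(stop_words="english")
        dtm = vectorizer.fit_transform(docs_in_slice)
        lda = LatentDirichletAllocation(n_components=n_subtopics, random_state=0)
        lda.fit(dtm)
        vocab = vectorizer.get_feature_names_out()
        topics = []
        for word_weights in lda.components_:
            top = word_weights.argsort()[::-1][:top_words]
            topics.append([vocab[i] for i in top])
        return topics

Run once per one-month time slice, this yields a set of subtopics per slice whose evolution across slices gives the subtopic dynamics.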

Subtopic Dynamics: Document Collection

[Plot for the query "apple": subtopic dynamics mined from Blogs08 against query volume from Google Trends]

System Pipeline (step 2)

[Same pipeline diagram, now highlighting step 2: time-aware diversification]

IA-Select Model [Vallet and Castells. 2012]

- Probabilistic model:
  - Pr(c|q): weight of a certain subtopic in a query
  - Pr(q|d), Pr(d|q): relation between document and query
  - Pr(c|d), Pr(d|c): relation between document and subtopic
- IA-Select objective function: combines document relevance and novelty (see the reconstruction below)
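
The objective function itself is an image on the slide. For reference, the IA-Select greedy objective is usually written (following Agrawal et al. 2009, with the "document relevance" and "novelty" parts the slide annotates) as:

    g(d | q, S) = Σ_c Pr(c|q) · V(d|q,c) · Π_{d' ∈ S} (1 − V(d'|q,c))

where V(d|q,c) is the relevance of document d to subtopic c (the document-relevance term) and the product over the already selected set S discounts subtopics that are already well covered (the novelty term).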

Search Result Diversification Models

• MMR [Carbonell and Goldstein 1998]
  - diversifies based on the similarity of document contents
• IA-Select [Agrawal et al. 2009]
  - diversifies based on a taxonomy of subtopic categories
• xQuaD [Santos et al. 2010]
  - general form of IA-Select
  - defines the objective function as a mixture of relevance and diversity probabilities
• Topic richness [Dou et al. 2011]
  - general form of the xQuaD and IA-Select models
  - accepts topics from multiple sources

xQuaD Model [Vallet and Castells. 2012]

- Probabilistic model:
  - Pr(c|q): weight of a certain subtopic in a query
  - Pr(q|d), Pr(d|q): relation between document and query
  - Pr(c|d), Pr(d|c): relation between document and subtopic
- xQuaD objective function: mixes document-query relevance with document-topic relevance and novelty (see the reconstruction below)
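
The slide shows the formula as an image. The standard xQuaD objective (Santos et al. 2010), with the three parts the slide annotates, is:

    score(d, S) = (1 − λ) · Pr(d|q) + λ · Σ_c Pr(c|q) · Pr(d|c) · Π_{d' ∈ S} (1 − Pr(d'|c))

Here Pr(d|q) is the document-query relevance, Pr(d|c) the document-topic relevance, and the product over the already selected documents S the novelty term; λ trades off relevance against diversity.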

Temporal Diversifying Models

• Temp-IASelect
  - IA-Select objective function (shown on the slide)
  - Temp-IASelect objective function (shown on the slide; see the note below)
• Temp-xQuaD
  - xQuaD objective function (shown on the slide)
  - Temp-xQuaD objective function (shown on the slide; see the note below)
• Temp-topic richness
  - generalization of Temp-xQuaD and Temp-IASelect
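
The temporal objective functions are images on the slides and are not reproduced in the transcript. One plausible reading, stated here only as an assumption, is that the temporal variants condition the subtopic weight on the querying time t, e.g. for Temp-xQuaD:

    score(d, S, t) = (1 − λ) · Pr(d|q) + λ · Σ_c Pr(c|q, t) · Pr(d|c) · Π_{d' ∈ S} (1 − Pr(d'|c))

where Pr(c|q, t) is the weight of subtopic c for query q at querying time t, i.e. the dynamic subtopics mined per time slice replace the static Pr(c|q); Temp-IASelect would substitute Pr(c|q, t) into the IA-Select objective in the same way.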

Experiment Settings

• TREC Blogs08 Collection
  - crawled from Jan 2008 to Feb 2009
  - HTML tags cleaned using the HtmlCleaner and Boilerpipe libraries
  - indexed using Lucene Core
  - document publication date extracted from:
    - blog content
    - URL
    - retrieval date

Experiment Settings

• Retrieval baseline:
  - Okapi BM25
• Relevance assessments:
  - human assessment
  - binary relevance judgments, following the TREC Diversity Track 2009 and 2011
  - two-dimensional assessment: relevance and time
  - topics mined from the query log are excluded (time gap between the AOL log and Blogs08)
  - the top 10 highest-probability words represent a topic
• Querying-time points:
  - how popular the query is at a particular time t
  - how different it is from the previous time slice t − 1

Relevance assessments (example judgments)

Example 1
  Title: It's The Most Wonderful Time Of The Year
  Content: The greatest sports week of the year is upon us, and Chris's Sports Blog is ready. Check back daily for coverage of the ACC and NCAA Tournament, tips on how to fill out your bracket …
  Publication date: 2008-03-17 | Subtopic: ncaa basketball tournament | Hitting time: 2008-03 | Relevance: 1

Example 2
  Title: Is there a bigger joke than the NCAA?
  Content: Each year the NCAA discovers a new way to make a bigger ass of itself than the previous one. This years specialty is to bar from NCAA playoffs in every sport any school who persists in using "unacceptable" team names and school mascots, by their exclusive definition…
  Publication date: 2005-08-22 | Subtopic: ncaa basketball tournament | Hitting time: 2008-03 | Relevance: 0

Example 3
  Title: Apple Quince Jam
  Content: The apple quince is a fruit that ripens in the period from October to November, it is an apple with a strange shape: it looks like a pear and apple, and it is lumpy…
  Publication date: 2007-11-15 | Subtopic: apple jam | Hitting time: 2008-03 | Relevance: 1

Evaluation Metrics

• α-NDCG
  - adds diversity and novelty to nDCG
• Intent-Aware Precision (Precision-IA)
  - intent-aware version of Precision
  - treats each subtopic as a distinct interpretation of the query
• Intent-Aware Expected Reciprocal Rank (ERR-IA)
  - based on the cascade model of search
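
For reference (not shown on the slide), the α-NDCG gain of the document at rank k, following Clarke et al. 2008, discounts subtopics that earlier documents already covered:

    G[k] = Σ_i J(d_k, i) · (1 − α)^r(i, k−1)

where J(d_k, i) = 1 if document d_k is judged relevant to subtopic i, and r(i, k−1) is the number of documents ranked above position k that are relevant to subtopic i; the gains are then log-discounted and normalized against an ideal ranking as in standard nDCG.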

Experimental Results

[Results tables not reproduced in the transcript; * marks baselines with dynamic subtopic mining, △ denotes significance at p < 0.05 and △△ at p < 0.01]

Conclusion

• studied temporally ambiguous and multi-faceted queries
  - subtopic temporal variability
  - subtopic mining from two different sources (query logs, document collection)
• proposed time-aware search result diversification frameworks

Future work

• model and predict subtopic change
• combine diversifying by subtopics and time in a unified framework

THANK YOU.

Settings

• Estimate the natural number of subtopics
  – Suresh et al. 2010 view LDA as a matrix factorization mechanism:
    C_{d×w} = M1_{d×t} × M2_{t×w}
    • d: number of documents in the corpus
    • w: size of the vocabulary
    • t: the number of topics
  – the optimum t is the one with the minimal divergence value between:
    • C_M1: the distribution of singular values of M1
    • C_M2: obtained by normalizing the vector L · M2, where L is a 1 × d vector of the lengths of each document in C
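
A small sketch of this divergence computation, following the Suresh et al. 2010 formulation as I read it (singular values taken from the topic-word factor, document lengths weighting the document-topic factor); the variable names are illustrative only.

    import numpy as np
    from scipy.stats import entropy

    def topic_number_divergence(topic_word, doc_topic, doc_lengths):
        """Symmetric KL divergence between (1) the singular-value distribution
        of the topic-word matrix and (2) the document-length-weighted topic
        distribution. The candidate t with the minimal value is taken as the
        natural number of subtopics."""
        # distribution 1: singular values of the topic-word factor, normalized
        sv = np.linalg.svd(topic_word, compute_uv=False)
        c_m1 = sv / sv.sum()
        # distribution 2: document lengths (1 x d) times doc-topic factor (d x t)
        c_m2 = np.asarray(doc_lengths) @ doc_topic
        c_m2 = np.sort(c_m2)[::-1]
        c_m2 = c_m2 / c_m2.sum()
        return entropy(c_m1, c_m2) + entropy(c_m2, c_m1)

Running LDA once for each candidate t and picking the t with the smallest value gives the estimated number of subtopics.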

Motivation

• New documents appear all the time
• Document content changes over time
• Queries and query volumes change over time
• Example: [Kulkarni et al. 2011]
  [Plot: query volume over time for "march madness" vs. "ncaa"]

Query Dynamic Metrics

• Kendall τ coefficient based
• Jaccard coefficient based
  (both formulas are images on the slide; a loose sketch follows below)
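
The exact definitions on the slide are not in the transcript. As a hypothetical reading only, the dynamics between consecutive time slices could be scored from the overlap of subtopic top words (Jaccard) and from the rank correlation of shared words (Kendall τ):

    from scipy.stats import kendalltau

    def jaccard_dynamics(words_t, words_prev):
        """1 minus the Jaccard overlap between the top-word sets of two time
        slices; higher means the subtopic changed more (assumed reading)."""
        a, b = set(words_t), set(words_prev)
        return 1.0 - len(a & b) / len(a | b)

    def kendall_dynamics(ranked_t, ranked_prev):
        """1 minus the Kendall tau between the rank positions of words shared
        by two consecutive time slices (again an assumed reading)."""
        shared = [w for w in ranked_t if w in ranked_prev]
        if len(shared) < 2:
            return 1.0
        tau, _ = kendalltau([ranked_t.index(w) for w in shared],
                            [ranked_prev.index(w) for w in shared])
        return 1.0 - tau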

Cluster Subtopic Candidates

• Clustering approach [Song et al. 2011] (see the sketch after this list):
  – step 1: construct a similarity matrix of the related queries
  – step 2: cluster using the Affinity Propagation algorithm
  – step 3: extract the set of exemplars as subtopics of the query
• Similarity metrics:
  – lexical similarity: keywords and cosine similarity
  – co-click similarity: based on the fraction of common clicks
  – semantic similarity: uses WordNet as an external KB
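
A minimal sketch of the three steps with scikit-learn, using only the lexical (cosine) similarity for brevity; mixing in the co-click and WordNet similarities is left out, and the function name is illustrative.

    from sklearn.cluster import AffinityPropagation
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def cluster_subtopics(related_queries):
        """Cluster related queries with Affinity Propagation on a lexical
        similarity matrix and return the exemplar queries as subtopic
        candidates."""
        tfidf = TfidfVectorizer().fit_transform(related_queries)
        sim = cosine_similarity(tfidf)                    # step 1: similarity matrix
        ap = AffinityPropagation(affinity="precomputed", random_state=0)
        ap.fit(sim)                                       # step 2: affinity propagation
        return [related_queries[i]                        # step 3: exemplars
                for i in ap.cluster_centers_indices_]

Affinity Propagation is convenient here because it selects exemplars directly and does not require fixing the number of clusters in advance.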

Topic Similarity Metrics [Kim and Oh. 2011]

• Vector-based:
  – cosine similarity
• Bag-of-words-based:
  – Jaccard coefficient
• Ranked-list-of-words-based:
  – Kendall τ coefficient
• Multinomial-distribution-based:
  – Kullback-Leibler divergence
  – Jensen-Shannon divergence
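
As one concrete instance (an illustrative sketch, not necessarily the paper's choice), the Jensen-Shannon option can be computed directly from two topics' word distributions over the same vocabulary:

    import numpy as np
    from scipy.spatial.distance import jensenshannon

    def topic_similarity_js(phi_a, phi_b):
        """Similarity of two topics given as word probability vectors;
        returns 1 minus the Jensen-Shannon distance (base 2), so identical
        topics score 1.0."""
        phi_a = np.asarray(phi_a, dtype=float) / np.sum(phi_a)
        phi_b = np.asarray(phi_b, dtype=float) / np.sum(phi_b)
        return 1.0 - jensenshannon(phi_a, phi_b, base=2)

A metric like this can be used to match subtopics between adjacent time slices when measuring their dynamics.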

Subtopic Mining Approach

• Dynamic queries: select 57 out of 61 queries from the AOL query log, i.e., yearly recurrent or time-independent
• Settings:
  – partition the collection into 14 one-month time slices
  – the training data in time slice ti is the top 2000 documents D with d ∈ D and pubDate(d) ∈ ti
  – the number of subtopics is preset in the range from 5 to 20
• Subtopic weight:
  – the weight w(c) is the probability that a given query q implies subtopic c

Temporal Document Collection

• TREC Blogs08 Collection
  - crawled from Jan 2008 to Feb 2009
  - HTML tags cleaned using the HtmlCleaner and Boilerpipe libraries
  - indexed using Lucene Core
  - document publication date extracted from:
    - blog content
    - URL
    - retrieval date

Subtopic Evaluation

• 61 queries: 51 event-related queries, 10 standard ambiguous queries
  – aspects removed, e.g., march madness brackets → march madness
• Subtopic evaluation metrics [Radlinski et al. 2010]:
  – coherence
  – distinctness
  – plausibility
  – completeness

Subtopic Evaluation

- Perplexity: a measure of the ability of a model to generalize to unseen documents (formula below)
- use holdout validation with 90% of the data for training and 10% for testing
- randomly select 20 out of 57 queries at a random time slice
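
The formula itself is missing from the transcript; the standard held-out perplexity for a topic model is:

    perplexity(D_test) = exp( − Σ_d log Pr(w_d) / Σ_d N_d )

where log Pr(w_d) is the log-likelihood of the held-out document d under the trained model and N_d is its number of tokens; lower perplexity indicates better generalization.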