Institute of Computing Technology, Chinese Academy of Sciences 1 A Unified Framework of Recommending...

28
Institute of Computing Technology, Chinese Academy of Sciences 1 A Unified Framework of Recommending Diverse and Relevant Queries Speaker: Xiaofei Zhu Institute of Computing Technology Chinese Academy of Sciences, P.R. China (Joint work with Jiafeng Guo, Xueqi Cheng) Institute Of Computing Technology Chinese Academy Of Sciences

Transcript of Institute of Computing Technology, Chinese Academy of Sciences 1 A Unified Framework of Recommending...

Institute of Computing Technology, Chinese Academy of Sciences 1

A Unified Framework of Recommending Diverse and Relevant Queries

Speaker: Xiaofei Zhu

Institute of Computing Technology

Chinese Academy of Sciences, P.R. China

(Joint work with Jiafeng Guo, Xueqi Cheng)

Institute Of Computing Technology Chinese Academy Of Sciences

Institute of Computing Technology, Chinese Academy of Sciences 2

Outline

• Introduction• Our Approach• Experiments• Summary

Institute of Computing Technology, Chinese Academy of Sciences 3

Outline

• Introduction• Our Approach• Experiments• Summary

Institute of Computing Technology, Chinese Academy of Sciences 4

Motivation

Very short

words are ambitious

users lack of domain-specific knowledge

Given query : ‘abc’

'abc shows''abc television''abc tv''abc news''abc breaking news'

ranke.g.

Relevance is

enough?

(1) Only recommend relevant queries is enough?

Very short

words are ambitious

users lack of domain-specific knowledge

Institute of Computing Technology, Chinese Academy of Sciences 5

Motivation

More blue, more relevantThe Red node is the input query.

(2) Is Euclidean space suitable?

Institute of Computing Technology, Chinese Academy of Sciences 6

Motivation

More blue, more relevantThe Red node is the input query.

Institute of Computing Technology, Chinese Academy of Sciences 7

Contribution

relevance diversity

Query Recommendation

Manifold ranking Import stop points

A novel unified frameworkManifold ranking with stop points

Institute of Computing Technology, Chinese Academy of Sciences 8

Outline

• Introduction• Our Approach• Experiments• Summary

Institute of Computing Technology, Chinese Academy of Sciences 9

Our Approach

1 1 1

2 2 2

m m m

u u u

u u u

u u u

query1 query2 queryn

Affinity matrix W

Institute of Computing Technology, Chinese Academy of Sciences 10

Our Approach

• Traditional manifold ranking process W- affinity matrixD – diagonal matrix

Step 1:

Step 2:

Step 3:

Institute of Computing Technology, Chinese Academy of Sciences 11

Our Approach

• Manifold ranking with stop points

Institute of Computing Technology, Chinese Academy of Sciences 12

- set of stop points

- set of free points

T

R

RR RT

TR TT

S SS

S S

0RT TTS S

Our Approach

( 1) ( )

(1 )RR RT

TR

t t

R R R

T TTT T

S S

S S

f f y

f f y

(2)

( 1) ( )

(1 )0

0

t t

R RR R R

T TR T T

f S f y

f S f y

(3)

( 1) ( ) (1 )t tR RR R Rf S f y (4)

( 1) ( ) (1 )t tf f yS (1)

Institute of Computing Technology, Chinese Academy of Sciences 13

Our Approach

( 1) ( ) (1 )t tR RR R Rf S f y

Institute of Computing Technology, Chinese Academy of Sciences 14

Detailed Algorithm

the input query free point setR stop point setT

1/2 1/2RR RR RR RRS D W D

( ) ( 1)

1 11 12 1 11

2 21 22 2 22

1 2

(1 )

t tR RR R R

nR RR RR RR R

nR RR RR RR R

n n nnRR RR

n nR RRR

f S f y

yf S S S

f f

f

yf S S S f

S S S

ny

the input query(0) 0Rf

Until k recommendations acquired

Institute of Computing Technology, Chinese Academy of Sciences 15

Outline

• Introduction• Our Approach• Experiments• Summary

Institute of Computing Technology, Chinese Academy of Sciences 16

Experiments

• Data Set– 15 million queries (from US users) sampled over one

month in May, 2006.– Clean the raw data

• ignoring non-English queries• replacing all non-alphanumeric characters in each query

with whitespace.• frequency less than 3 was removed

– After cleaning• 191,585 queries, 251,427 URLs and 318,947 edges.• 1.66 distinct URLs,1.27 distinct queries.

Institute of Computing Technology, Chinese Academy of Sciences 17

Experiments

• Baselines– Pair-wise Based• Naïve : only considers relevance• MMR (Maximal Marginal Relevance) : a linear

combination of relevance and diversity

– Graph Based• Hitting time: boost long tail queries• Grasshopper(Graph Random-walk with Absorbing

StateS that HOPs among PEaks for Ranking) : absorbing random walk on the graph

Institute of Computing Technology, Chinese Academy of Sciences 18

Case Study

Institute of Computing Technology, Chinese Academy of Sciences 19

• Automatic Evaluation– Open Directory Project(ODP) <-> Relevance – Commercial search engine (i.e., Google) <-> Diversity

• Evaluation metrics– Relevance– Diversity – Q-measure

Experiments

Institute of Computing Technology, Chinese Academy of Sciences 20

ExperimentsFigure 1: Average Relevance of Query Recommendation

over Different Recommendation Size under Five Approaches.

Institute of Computing Technology, Chinese Academy of Sciences 21

ExperimentsFigure 2: Average Diversity of Query Recommendation

over Different Recommendation Size under Five Approaches.

Institute of Computing Technology, Chinese Academy of Sciences 22

ExperimentsFigure 3: Average Q-measure of Query Recommendation over Different Recommendation Size under Five Approaches.

Institute of Computing Technology, Chinese Academy of Sciences 23

Recommendation pool

search results

Experiments

• Manual Evaluation– Recommendation pool– 3 human judges – Label tool

Institute of Computing Technology, Chinese Academy of Sciences 24

Experiments

• Evaluation Metrics

– Intent-Coverage

– α-nDCG (α -normalized Discounted Cumulative Gain )

Institute of Computing Technology, Chinese Academy of Sciences 25

Experiments

Table 2: Performance of recommendation results over a sample of queries under five different approaches.

Institute of Computing Technology, Chinese Academy of Sciences 26

Outline

• Introduction• Our Approach• Experiments• Summary

Institute of Computing Technology, Chinese Academy of Sciences 27

Summary

• A Novel Unified Model– Relevant and salient queries– Diverse

• Experimental results– Automatic evaluation– Manual evaluation

Institute of Computing Technology, Chinese Academy of Sciences 28

Thank you!

Q&[email protected]