Query Reformulation as a Predictor of Search Satisfaction...Ahmed Hassan, Xiaolin Shi, Nick Craswell...

Page 1

Query Reformulation as a Predictor of Search Satisfaction

Ahmed Hassan, Xiaolin Shi, Nick Craswell and Bill Ramsey

Page 2

Online Satisfaction Measurement

• Satisfying users is the main objective of any search system

• Measuring user satisfaction is essential for improving the system

Page 3

Satisfaction and Implicit Behavior

• How can we model user satisfaction?

– Implicit behavior

• Clicks are the best-known implicit signal

– Clickthrough (e.g., Joachims, 2002; Agichtein et al., SIGIR’06; Carterette and Jones, NIPS’07)

– Dwell Time (e.g., Fox et al., TOIS’05)

– Interleaving (e.g., Joachims, KDD’02; Radlinski et al., CIKM’08)

Page 4

Why not Just Use Clicks?

greenfield, mn accident

Time spent on page: 38 seconds

Page 5

Why not Just Use Clicks?

Session Ends

greenfield, mn accident

Woman dies in a fatal accident in greenfield, minnesota

Page 6

Why not Just Use Clicks?

• User performed this search on July 1st

• User was probably looking for

Page 7

Why not Just Use Clicks?

Query → Click → Query

• User clicked on a result

• The dwell time is long

• But the user was not satisfied

Clicks do not always mean satisfaction

Page 8

Why not Just Use Clicks?

Lack of clicks does not always mean dissatisfaction

Weather in san francisco

Page 9

Query Reformulation

Give Up

Reformulation satisfaction

• What do users do when they do not like the results?

Page 10

Query Reformulation

• OR:

reformulation satisfaction

Give Up

reformulation search satisfaction

• Another implicit feedback signal that has not received as much attention is query reformulation

Page 11

Query Reformulation

• Query Reformulation is the act of submitting a query to modify a previous query in hope of retrieving better results

• Reformulations vs. Related Queries

reformulation satisfaction → reformulation search satisfaction: a reformulation

food in san francisco → weather in san francisco: not a reformulation

Page 12

Clicks and Reformulation

• Clickthrough Rate (CTR) of different sets of pairs relative to CTR of all pairs

                      Query Similarity
Time Diff.     Overall    Not Similar    Similar
Overall           0%          11%         -21%
Short           -29%         -17%         -39%
Long             25%          24%          29%

• Queries are similar if they share a non-stop-word term

• Queries have a short time difference if the difference between their timestamps is less than 5 minutes
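The two definitions above, and the "relative to CTR of all pairs" measure, can be sketched roughly as follows. This is only an illustration: the stop-word list, the pair dictionary layout and the function names are assumptions, and the slides do not say which query's clicks are counted, so `clicked` is simply a flag on the pair.

```python
# Hypothetical stop-word list; the slides do not say which list was used.
STOP_WORDS = {"a", "an", "the", "in", "of", "for", "to", "and", "on"}
SHORT_GAP_SECONDS = 5 * 60  # "short" means less than 5 minutes apart


def is_similar(q1, q2):
    """True if the two queries share at least one non-stop-word term."""
    terms1 = set(q1.lower().split()) - STOP_WORDS
    terms2 = set(q2.lower().split()) - STOP_WORDS
    return bool(terms1 & terms2)


def is_short_gap(t1, t2):
    """True if the two timestamps (in seconds) are less than 5 minutes apart."""
    return abs(t2 - t1) < SHORT_GAP_SECONDS


def relative_ctr(pairs, keep):
    """CTR of the pairs selected by `keep`, relative to the CTR of all pairs.

    Each pair is a dict with keys q1, q2, t1, t2 and clicked (bool).
    A return value of -0.21 reads as "21% below average CTR".
    """
    overall = sum(p["clicked"] for p in pairs) / len(pairs)
    subset = [p for p in pairs if keep(p)]
    ctr = sum(p["clicked"] for p in subset) / len(subset)
    return ctr / overall - 1.0
```

For example, `relative_ctr(pairs, lambda p: is_similar(p["q1"], p["q2"]))` would correspond to the Overall row, Similar column of the table.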

Page 13

Clicks and Reformulation

• Clickthrough Rate (CTR) of different sets of pairs relative to CTR of all pairs

- Similar pairs had a CTR 21% below average

- Pairs where Q1 and Q2 are not similar had a CTR 11% above average

Page 14

Clicks and Reformulation

• Clickthrough Rate (CTR) of different sets of pairs relative to CTR of all pairs

- Pairs with short time diff. had a CTR 29% below average

- Pairs with long time diff. had a CTR 25% above average

Page 15

Clicks and Reformulation

• Clickthrough Rate (CTR) of different sets of pairs relative to CTR of all pairs

- Similar pairs with short time diff. had a CTR 39% below average

- Pairs that are not similar and had long time diff. had a CTR 24% above average

Page 16

Clicks and Reformulation

• Clickthrough Rate (CTR) of different sets of pairs relative to CTR of all pairs

- CTRs for pairs with long time diff. are very similar across similarity groups (25%, 24%, 29%), indicating that query similarity has little effect if the time between queries is large

Page 17

Approach

• Query Representation

• Query Reformulation Prediction

• Query Success Prediction

– Using clicks only

– Using reformulation only

– Using both clicks and reformulation

Page 18

Query Representation

• Query Normalization

– Lower-casing

– Replacing runs of whitespace with a single space

– Word breaking (using a character level n-gram model)

southjeseycraigslist → south jesey craigslist

VerizonWireless → verizon wireless
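A minimal sketch of the three normalization steps follows. The first two steps are as listed; the word breaker is a stand-in, since the slides use a character level n-gram model whose details are not given. Here a small hypothetical vocabulary (`VOCAB`) and a dynamic-programming split are used instead, and all function names are illustrative.

```python
import re

# Hypothetical toy vocabulary; the slides use a character level n-gram model instead.
VOCAB = {"verizon", "wireless", "south", "jesey", "craigslist"}


def normalize(query):
    """Lower-case and collapse runs of whitespace into a single space."""
    return re.sub(r"\s+", " ", query.strip().lower())


def word_break(token, vocab=VOCAB):
    """Split a token into vocabulary words, preferring the fewest words.

    Returns the token unchanged when no full segmentation exists.
    """
    n = len(token)
    best = [None] * (n + 1)  # best[i] = list of words covering token[:i]
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            if best[j] is not None and token[j:i] in vocab:
                candidate = best[j] + [token[j:i]]
                if best[i] is None or len(candidate) < len(best[i]):
                    best[i] = candidate
    return " ".join(best[n]) if best[n] else token


def normalize_query(query):
    """Apply lower-casing, whitespace collapsing, then word breaking per token."""
    return " ".join(word_break(t) for t in normalize(query).split(" "))
```

With this toy vocabulary, `normalize_query("VerizonWireless")` gives `verizon wireless`, matching the slide's example.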

Page 19

Query Representation

• Queries to Keywords

– For a query $x = (x_1, x_2, \ldots, x_n)$, find a mapping $x \to y \in Y_n$, where $y$ is a segmentation from the set $Y_n$

– A segment break is introduced whenever the pointwise mutual information (PMI) between two consecutive words drops below a certain threshold $\tau$.

$$\mathrm{PMI}(x_i, x_{i+1}) = \log \frac{p(x_i, x_{i+1})}{p(x_i)\, p(x_{i+1})}$$

Query                                        Keywords
hotels in san francisco                      hotels in san_francisco
Hyundai roadside assistance phone number     hyundai roadside_assistance phone_number
kodak easyshare recharger chord              Kodak_easyshare recharger_chord
user reviews for apple ipad                  user_reviews for apple_ipad
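The PMI-based segmentation can be sketched as below. The counts are made-up toy numbers, not data from the talk, and the threshold value is arbitrary; the sketch also simplifies by estimating unigram and bigram probabilities against the same total.

```python
import math

# Toy counts for illustration only (assumed, not from the talk).
TOTAL = 1000
UNIGRAM = {"hotels": 50, "in": 200, "san": 20, "francisco": 15}
BIGRAM = {("hotels", "in"): 12, ("in", "san"): 5, ("san", "francisco"): 14}


def pmi(w1, w2, unigram=UNIGRAM, bigram=BIGRAM, total=TOTAL):
    """PMI(w1, w2) = log( p(w1, w2) / (p(w1) * p(w2)) ), estimated from counts."""
    joint = bigram.get((w1, w2), 0) / total
    if joint == 0:
        return float("-inf")  # never seen together: always break
    return math.log(joint / ((unigram[w1] / total) * (unigram[w2] / total)))


def segment(words, tau):
    """Join consecutive words with '_' while PMI >= tau; break when it drops below."""
    if not words:
        return []
    segments = [words[0]]
    for prev, cur in zip(words, words[1:]):
        if pmi(prev, cur) >= tau:
            segments[-1] += "_" + cur
        else:
            segments.append(cur)
    return segments
```

With these toy counts and `tau = 1.0`, `segment("hotels in san francisco".split(), 1.0)` keeps `hotels` and `in` separate and joins `san_francisco`, as in the first table row.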

Page 20

Matching Keywords

• Exact Match

– The two phrases match exactly.

• Approximate Match

– To capture spelling variants and misspellings, we allow two keywords to match if the Levenshtein edit distance between them is less than 2.

• Semantic Match

– Using the depth of the Least Common Subsumer (LCS) in the WordNet hierarchy.

$$\mathrm{wup}(t_i, t_j) = \frac{2 \cdot \mathrm{depth}(\mathrm{LCS})}{\mathrm{depth}(t_i) + \mathrm{depth}(t_j)}$$
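The approximate and semantic matches can be illustrated with a short sketch. The Levenshtein function is the standard dynamic-programming version. For the semantic match, a tiny hand-made hierarchy (`PARENT`) stands in for WordNet, and the root is given depth 1; both are assumptions made only to keep the example self-contained.

```python
def levenshtein(a, b):
    """Standard edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def approximate_match(k1, k2):
    """Keywords match if their edit distance is less than 2."""
    return levenshtein(k1, k2) < 2


# Hypothetical toy hierarchy standing in for WordNet (child -> parent).
PARENT = {"entity": None, "animal": "entity", "vehicle": "entity",
          "dog": "animal", "cat": "animal", "car": "vehicle"}


def path_to_root(term):
    """Nodes from the term up to the root, term first."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def depth(term):
    """Depth in the hierarchy, counting the root as depth 1 (an assumption)."""
    return len(path_to_root(term))


def wup(t1, t2):
    """wup = 2 * depth(LCS) / (depth(t1) + depth(t2))."""
    ancestors = set(path_to_root(t1))
    lcs = next(node for node in path_to_root(t2) if node in ancestors)
    return 2 * depth(lcs) / (depth(t1) + depth(t2))
```

In this toy hierarchy `wup("dog", "cat")` is 2/3 (the LCS is `animal`), while `wup("dog", "car")` is 1/3 because the only shared ancestor is the root.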

Page 21

Query Reformulation Prediction

Textual Features

normalized Levenshtein edit distance

1 if lev > 2, 0 otherwise

num. characters in common starting from the left

num. characters in common starting from the right

num. words in common starting from the left

num. words in common starting from the right

num. words in common

Jaccard distance between sets of words

Adopted from (Jones and Klinkner, CIKM’08)
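A rough sketch of how the textual features listed above might be computed for a pair of normalized queries. The feature names are illustrative, and dividing the edit distance by the longer query's length is an assumed normalization; the slides do not specify one.

```python
def levenshtein(a, b):
    """Standard edit distance over characters."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def common_prefix_len(a, b):
    """Number of leading items (characters or words) the two sequences share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def textual_features(q1, q2):
    w1, w2 = q1.split(), q2.split()
    s1, s2 = set(w1), set(w2)
    union = s1 | s2
    lev = levenshtein(q1, q2)
    return {
        # Assumption: normalize by the length of the longer query.
        "norm_lev": lev / max(len(q1), len(q2), 1),
        "lev_gt_2": int(lev > 2),
        "chars_left": common_prefix_len(q1, q2),
        "chars_right": common_prefix_len(q1[::-1], q2[::-1]),
        "words_left": common_prefix_len(w1, w2),
        "words_right": common_prefix_len(w1[::-1], w2[::-1]),
        "words_common": len(s1 & s2),
        "jaccard_dist": 1 - len(s1 & s2) / len(union) if union else 0.0,
    }
```

For the pair "reformulation satisfaction" and "reformulation search satisfaction", the queries share one word from the left, one from the right, two words overall, and have a Jaccard distance of 1/3.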

Page 22

Query Reformulation Prediction

Keyword Features

num. of “exact match” keywords in common

num. of “approximate match” keywords in common

num. of “semantic match” keywords in common

num. of keywords in Q1

num. of keywords in Q2

num. of keywords in Q1 but not in Q2

num. of keywords in Q2 but not in Q1

1 if Q1’s keywords contain all of Q2’s keywords

1 if Q2’s keywords contain all of Q1’s keywords
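The keyword features can be sketched as a function over two keyword lists. The `match` argument is a hypothetical hook where an exact, approximate, or semantic matcher could be plugged in; exact match is the default here. Reading the last two features as containment checks is an interpretation of the slide, not something it states outright.

```python
def keyword_features(k1, k2, match=lambda a, b: a == b):
    """Keyword-level features for two queries given as lists of keywords."""
    only_q1 = [k for k in k1 if not any(match(k, other) for other in k2)]
    only_q2 = [k for k in k2 if not any(match(k, other) for other in k1)]
    return {
        "common": len(k1) - len(only_q1),   # Q1 keywords that have a match in Q2
        "n_q1": len(k1),
        "n_q2": len(k2),
        "q1_not_q2": len(only_q1),
        "q2_not_q1": len(only_q2),
        "q1_contains_q2": int(not only_q2),  # every Q2 keyword matched in Q1
        "q2_contains_q1": int(not only_q1),  # every Q1 keyword matched in Q2
    }
```

Calling the function once per matcher would give the three "in common" counts listed on the slide.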

Page 23

Query Reformulation Prediction

Other Features

time between queries in seconds

time between queries as a binary feature (5 mins, 30 mins, 60 mins, 120 mins)

cosine distance between vectors derived from the first 10 search results for the query terms
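A possible reading of these features in code. Two points are assumptions, since the slides leave them open: each binary feature is taken to mean "gap is less than the threshold", and the result vectors are taken to be bag-of-words counts over the text of the first 10 results.

```python
import math
from collections import Counter

THRESHOLDS_MIN = (5, 30, 60, 120)


def time_features(gap_seconds):
    """Raw gap plus one binary indicator per threshold (assumed: gap < threshold)."""
    feats = {"gap_seconds": gap_seconds}
    for minutes in THRESHOLDS_MIN:
        feats[f"gap_lt_{minutes}min"] = int(gap_seconds < minutes * 60)
    return feats


def result_vector(results):
    """Bag-of-words counts over the text of the first 10 results (assumed representation)."""
    return Counter(w for text in results[:10] for w in text.lower().split())


def cosine_distance(results1, results2):
    """1 - cosine similarity between the two result vectors."""
    v1, v2 = result_vector(results1), result_vector(results2)
    dot = sum(v1[w] * v2[w] for w in v1.keys() & v2.keys())
    norm = (math.sqrt(sum(c * c for c in v1.values()))
            * math.sqrt(sum(c * c for c in v2.values())))
    return 1.0 - dot / norm if norm else 1.0
```

Two queries that return largely the same results would have a distance near 0 even if their text differs, which is what makes this feature complementary to the textual ones.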

Page 24

Query Reformulation Performance

[Bar chart: accuracy (y-axis, 72% to 88%) by feature set: Heuristic, Textual, Keywords, All]

- Keyword features outperform textual features

- Best performance when all features are combined

Page 25

Query Satisfaction Prediction

1 Clicks Only: A query Q is successful if it receives at least one click

2 SAT Clicks Only: A query Q is successful if it receives at least one long dwell time click (thresholds: 10, 30 and 50 seconds)

3 Reformulation Only: Predict success using reformulation features only (i.e., assume users will always reformulate their queries when not successful)

4 Reformulation + Clicks (classifier): Train a classifier using both reformulation and click features.
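The first three methods are simple rules and can be sketched directly; the fourth needs a trained classifier and is not shown. The query record layout (`click_dwell_seconds`, `followed_by_reformulation`) is hypothetical, and treating "long dwell time" as "at least the threshold" is an assumption.

```python
def clicks_only(query):
    """Method 1: successful if the query received at least one click."""
    return len(query["click_dwell_seconds"]) > 0


def sat_clicks_only(query, threshold=30):
    """Method 2: successful if at least one click has a long dwell time.

    The slides try thresholds of 10, 30 and 50 seconds.
    """
    return any(d >= threshold for d in query["click_dwell_seconds"])


def reformulation_only(query):
    """Method 3: successful unless the next query is predicted to be a reformulation."""
    return not query["followed_by_reformulation"]


# The "greenfield, mn accident" example: one click, 38 seconds on the page,
# then a follow-up query (taken here to be a reformulation).
example = {"click_dwell_seconds": [38], "followed_by_reformulation": True}
```

On this example, the click-based rules at thresholds 10 and 30 call the query successful, while the reformulation rule does not, which is the failure case the earlier slides describe.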

Page 26

Results

• Clicks Only method performs poorly

• Many queries that receive a click still end up being unsuccessful

[Bar chart: accuracy (y-axis, 0% to 90%) for Clicks Only, SAT Clicks Only, Reformulation Only, Reformulation + Clicks]

Page 27

Results

• Accuracy improves when only SAT clicks are considered

Page 28

Results

• Better performance if we use reformulation only

Page 29

Results

• Best performance when we learn a classifier using both the reformulation and the click features

Page 30

Reformulation Only vs. Reformulation + Clicks

• Reformulation Only achieves high DSAT precision but low SAT precision

• Reformulation + Clicks achieves good performance for both SAT and DSAT cases

[Bar charts: accuracy (y-axis, 0% to 90%), and SAT precision and DSAT precision (y-axis, 0% to 100%), for Reformulation Only and Reformulation + Clicks]
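SAT precision and DSAT precision are per-class precisions: of the queries predicted as a given class, the fraction that truly belong to it. A minimal sketch, with illustrative label strings:

```python
def class_precision(y_true, y_pred, label):
    """Fraction of items predicted as `label` whose true label is `label`.

    Returns None when nothing was predicted as `label`.
    """
    truths_for_predicted = [t for t, p in zip(y_true, y_pred) if p == label]
    if not truths_for_predicted:
        return None
    return sum(t == label for t in truths_for_predicted) / len(truths_for_predicted)
```

A method that rarely predicts DSAT, and is usually right when it does, scores high DSAT precision, which is the pattern reported for Reformulation Only.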

Page 31

Reformulation Behavior and Search Tasks

• Queries in successful tasks

Page 32

Reformulation Behavior and Search Tasks

• Queries in successful tasks

Page 33

Reformulation Behavior and Search Tasks

• Queries in unsuccessful tasks

Page 34

Reformulation Behavior and Search Tasks

• Queries in unsuccessful tasks

Page 35

Reformulation Behavior and Search Tasks

Queries in unsuccessful tasks have higher similarity than queries in successful tasks

Data from (Hassan et al., CIKM’11)

Page 36

Conclusions

• We can reliably identify query reformulations

• Query reformulation is a strong predictor of search success

• Best results when using both query reformulation and clicks

• Reformulation behavior differs in successful and unsuccessful tasks

Page 37

Thanks!

Ahmed Hassan

[email protected]