Context-Sensitive Query Auto-Completion
AUTHORS: NAAMA KRAUS AND ZIV BAR-YOSSEF
DATE OF PUBLICATION: NOVEMBER 2010
SPEAKER: RISHU GUPTA
2
Motivating Example
User intent: "I want to buy a good digital camera"
Current Result / Desired Result
Completions shown:
• digital camera reviews
• digital camera buying guide
• digital camera with wifi
• digital camera deals
• digital camera world
• digital picture frame
• digital copy
3
Most Challenging Auto-Completion Scenario
Challenge: Query auto-completion predicts the user's intended query correctly with only 12.8% probability.
Goal: To predict the user's intended query reliably when the user has entered only one character.
Advantages:
◦ Makes the search experience faster
◦ Reduces load on servers in Instant Search
4
QAC Algorithms
• The user enters the prefix “x” of query “q”.
• The algorithm returns a list of “K” completions, ordered by quality score.
• A “hit” occurs if a completion “c” in the top-K completion list satisfies “c” = “q”.
• A QAC algorithm should also work if “c” is semantically equal to “q”.
• An efficient data structure is needed for faster lookup: a hash table or a trie.
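The lookup step above can be sketched with a hash table that maps every prefix to its top-K completions. This is a minimal sketch, not the authors' implementation: it assumes the quality score is the query's frequency in a log, and the names `build_prefix_table`, `complete`, and `is_hit` as well as the counts in the usage lines are made up for illustration.

```python
from collections import defaultdict
import heapq


def build_prefix_table(query_counts, k):
    """Map every prefix to its k highest-scoring completions.

    query_counts: dict of query -> score (assumed here: log frequency).
    """
    by_prefix = defaultdict(list)
    for query, count in query_counts.items():
        # Register the query under each of its non-empty prefixes.
        for end in range(1, len(query) + 1):
            by_prefix[query[:end]].append((count, query))
    return {
        prefix: [q for _, q in heapq.nlargest(k, entries)]
        for prefix, entries in by_prefix.items()
    }


def complete(table, prefix):
    """Return the stored top-k list for the prefix x (empty if unseen)."""
    return table.get(prefix, [])


def is_hit(completions, intended_query):
    """A hit occurs if some completion c equals the intended query q."""
    return intended_query in completions


# Toy counts, invented for this example only.
toy_counts = {
    "digital camera": 50,
    "digital copy": 30,
    "digital picture frame": 20,
    "dog food": 40,
}
table = build_prefix_table(toy_counts, k=2)
print(complete(table, "d"))
print(is_hit(complete(table, "d"), "digital picture frame"))
```

Precomputing the lists trades memory for constant-time lookup; a trie would store shared prefixes only once at the cost of a traversal per request.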
5
Context-Sensitive Auto-Completion
How to compensate for the lack of information?
Observation:
• User searches within some context.
• User context reflects user’s intent.
Context examples:
• Recent queries
• Recently visited pages
• Recent tweets
• etc.
Our focus: “Recent queries”
• Accessible by search engines
• 49% of searches are preceded by a different query in the same session
• For simplicity, in this presentation we focus on the most recent query
6
Recent Query Use Approaches
• Cluster similar queries (using techniques like HMMs)
• Nearest Completion algorithm (assumption: the context is relevant to the query)
• Generalize the Most Popular Completion algorithm
• None of these previous studies took the user input (prefix) into account in the prediction
• In 37% of the query pairs the former query has not occurred in the log before
Problem with this approach?
How to tackle this problem?
7
Nearest Completion: Measure of Similarity
Challenge: Choosing a similarity measure that is correlated and universally applicable
Completions must be semantically related to the context query.
Recommendation Based Query Expansion
• Represent queries and contexts as high-dimensional term-weighted vectors and resort to cosine similarity.
• Idea: a rich representation of a query is constructed not from its search results, but rather from its recommendation tree.
Recommendation Based Query
• Outputs a list of recommendations which are reformulations of the previous query.
• A problem occurs when none of the recommendations is compatible with the user's query.
How to overcome this challenge?
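The cosine-similarity step can be illustrated as follows. This is a minimal sketch under stated assumptions: each query is already available as a sparse term-weighted vector (a dict of term to weight), and how those vectors are derived from the recommendation tree is not shown on the slide and is not modelled here. The function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two sparse term-weight vectors (dicts)."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def nearest_completions(context_vec, candidates, k):
    """Rank candidate completions by similarity to the context vector.

    candidates: dict of completion -> term-weight vector. All candidates
    are assumed to be compatible with the prefix the user typed.
    """
    scored = [(cosine_similarity(context_vec, vec), q)
              for q, vec in candidates.items()]
    # Highest similarity first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [q for _, q in scored[:k]]
```

Sparse dicts keep the cost proportional to the number of non-zero terms, which matters when the vectors are high-dimensional.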
8
Evaluation
EVALUATION METRIC
MRR (Mean Reciprocal Rank)
• A standard IR measure to evaluate a retrieval of a specific object at a high rank
wMRR (Weighted MRR)
• Weight sample pairs according to “prediction difficulty” (total # of candidate completions)
EVALUATION FRAMEWORK
Evaluation Set
• A random sample of (context, query) pairs from the AOL log
Prediction Task
• Given the context query and the first character of the intended query, predict the intended query at as high a rank as possible
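The two metrics can be sketched as below. MRR follows its standard definition: the reciprocal of the rank at which the intended query appears, or 0 if it is absent, averaged over samples. For wMRR the slide only says that samples are weighted by "prediction difficulty", so passing the number of candidate completions directly as the weight is an assumption of this sketch, as are the function names.

```python
def reciprocal_rank(completions, intended_query):
    """1 / rank of the intended query in the list, or 0.0 if it is absent."""
    for rank, completion in enumerate(completions, start=1):
        if completion == intended_query:
            return 1.0 / rank
    return 0.0


def mrr(samples):
    """Mean reciprocal rank over (completions, intended_query) samples."""
    if not samples:
        return 0.0
    return sum(reciprocal_rank(c, q) for c, q in samples) / len(samples)


def weighted_mrr(samples, weights):
    """Weighted mean of reciprocal ranks.

    weights[i] is the "prediction difficulty" of samples[i]; this sketch
    assumes it is the total number of candidate completions.
    """
    total = sum(weights)
    if total == 0:
        return 0.0
    weighted_sum = sum(w * reciprocal_rank(c, q)
                       for (c, q), w in zip(samples, weights))
    return weighted_sum / total
```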
9
Analysis
NearestCompletion
• Fails when the context is irrelevant (difficult to predict whether the context is relevant)
MostPopularCompletion
• Fails when the intended query is not highly popular (long tail)
Solution: HybridCompletion
• HybridCompletion: a combination of MostPopularCompletion and NearestCompletion
• Its MRR is 31.5% higher than that of MostPopularCompletion.
10
Most Popular VS Nearest Completion
Relevant context: the MRR of NearestCompletion (with depth-3 traversal) is 48% higher than that of MostPopularCompletion.
Irrelevant context: NearestCompletion becomes destructive, so its MRR is 19% lower than that of MostPopularCompletion.
11
How HybridCompletion Works
Produce lists
• Produce top k completions of NearestCompletion
• Produce top k completions of MostPopularCompletion
Standardize
• Two lists differ in units and scale
Hybrid score is a convex combination
• hybscore(q) = α · Zsimscore(q) + (1 − α) · Zpopscore(q)
• 0 ≤ α ≤ 1 is a tunable parameter
• Prior probability that context is relevant
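The three steps above can be sketched as follows. The standardization is taken to be the usual z-score (subtract the list mean, divide by the list standard deviation). The slide does not say how to score a query that appears in only one of the two lists, so giving it the lowest standardized score of the other list is an assumption of this sketch, and the function names are hypothetical.

```python
import statistics


def z_scores(scores):
    """Standardize a dict of raw scores to zero mean and unit variance."""
    if not scores:
        return {}
    values = list(scores.values())
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0.0:
        return {q: 0.0 for q in scores}  # all scores are equal
    return {q: (s - mean) / std for q, s in scores.items()}


def hybrid_completions(sim_scores, pop_scores, alpha, k):
    """Rank by hybscore(q) = alpha * Zsimscore(q) + (1 - alpha) * Zpopscore(q)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    z_sim = z_scores(sim_scores)
    z_pop = z_scores(pop_scores)
    # Assumption: a query missing from one list gets that list's lowest score.
    sim_floor = min(z_sim.values(), default=0.0)
    pop_floor = min(z_pop.values(), default=0.0)
    hyb = {
        q: alpha * z_sim.get(q, sim_floor) + (1 - alpha) * z_pop.get(q, pop_floor)
        for q in set(z_sim) | set(z_pop)
    }
    # Highest hybrid score first; ties are broken alphabetically.
    return sorted(hyb, key=lambda q: (-hyb[q], q))[:k]
```

With alpha = 1 the ranking reduces to the similarity ordering and with alpha = 0 to the popularity ordering, which matches the reading of α as the prior probability that the context is relevant.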
MostPopular, Nearest, and Hybrid (2)
HybridCompletion is shown to be at least as good as NearestCompletion when the context is relevant and almost as good as MostPopularCompletion when the context is irrelevant.
13
Examples
14
Conclusion
Query Auto-Completion
• MostPopularCompletion algorithm: based on popular queries (AOL query log)
Context-Sensitive Query Auto-Completion
• NearestCompletion algorithm
◦ Relevant context: based on the user's recent queries
◦ Recommendation-based algorithm: rich query representation
• HybridCompletion algorithm: convex combination of NearestCompletion and MostPopularCompletion
15
Future
• NearestCompletion: a more effective session segmentation technique
• Predicting the first query in a session still remains an open problem
• Use of other context resources like recently visited web pages or search history
• The measure used for quality evaluation should be more relaxed
• The rich query representation may be further fine-tuned