Extending facet search to the general web
Date: 2014/11/27
Authors: Weize Kong, James Allan
Source: CIKM’14
Advisor: Jia-ling Koh
Speaker: LIN, CI-JIE
Introduction
Faceted search helps users by offering drill-down options as a complement to the keyword input box
[Figure: a faceted search interface, annotated with a facet, a facet term, and multiple selections]
Introduction
However, this idea is not well explored for general
web search
Big data
Heterogeneous nature
[Figure: Google default facets]
Introduction
Goal:
query-dependent automatic facet generation
Incorporate user feedback on these query facets into
document ranking
[Figure: query-dependent facet example with the terms "All routes", "International routes", "Domestic routes"]
Flow chart
[Figure: query → search results → extracting candidates → candidate facets → refining candidates → facets → selecting facets → facet feedback terms → facet feedback model → ranking documents → top-ranked documents]
Extracting candidate example
Query: “mars landing”
Search result: “Mars rovers such as Curiosity, Opportunity and Spirit”
Candidate facet:
C: {Curiosity, Opportunity, Spirit}
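The example above can be reproduced with a small pattern-based sketch. The slides do not list the actual extraction patterns, so the single "such as" pattern and the function name `extract_candidate_facet` below are assumptions made only for illustration.

```python
import re

# Hypothetical pattern: "<class> such as A, B and C".
# The real system may use other textual or HTML patterns; this is an assumption.
SUCH_AS = re.compile(r"such as\s+(.+)", re.IGNORECASE)

# Item separators: ", and", ", or", a plain comma, or a bare "and" / "or".
SEPARATOR = re.compile(r"\s*,\s*(?:and|or)\s+|\s*,\s*|\s+(?:and|or)\s+")


def extract_candidate_facet(snippet):
    """Return the list items that follow 'such as', or [] if there is no match."""
    match = SUCH_AS.search(snippet)
    if match is None:
        return []
    items = SEPARATOR.split(match.group(1))
    # Strip surrounding spaces and trailing periods, and drop empty pieces.
    return [item.strip(" .") for item in items if item.strip(" .")]


snippet = "Mars rovers such as Curiosity, Opportunity and Spirit"
print(extract_candidate_facet(snippet))
```

Applied to every snippet in the search result, a pattern like this yields one candidate facet per matched list, which explains why the raw candidates are noisy and need the cleaning and refining steps.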
Cleaning candidate query facets
Converting text to lowercase
Removing non-alphanumeric characters
Removing stopwords and duplicate terms
Removing all candidate facets that contain
only one item or more than 200 items
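The cleaning steps above can be sketched as a single function. This is a minimal illustration, not the authors' code: the stopword list is a tiny placeholder, stopwords are assumed to be removed word by word inside each term, and the size filter is assumed to run after the other steps.

```python
import re

# Tiny illustrative stopword list; the slides do not say which list is used.
STOPWORDS = {"a", "an", "and", "the", "of", "or", "in", "on", "to"}


def clean_facet(terms):
    """Clean one candidate facet; return None if the facet should be removed."""
    cleaned = []
    seen = set()
    for term in terms:
        term = term.lower()                        # convert to lowercase
        term = re.sub(r"[^a-z0-9\s]", " ", term)   # drop non-alphanumeric characters
        words = [w for w in term.split() if w not in STOPWORDS]
        term = " ".join(words)
        if term and term not in seen:              # drop empty and duplicate terms
            seen.add(term)
            cleaned.append(term)
    # Remove facets with at most one item or more than 200 items.
    if len(cleaned) <= 1 or len(cleaned) > 200:
        return None
    return cleaned


print(clean_facet(["Curiosity", "curiosity!", "The Opportunity", "Spirit"]))
```

Note that the character filter as written also removes non-ASCII letters, which may or may not match the original system.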
Extracting candidate
The candidate query facets extracted are usually noisy
Non-relevant to the issued query
Terms are not members of the same class
Incomplete
[Figure: four candidate facets for the query “mars landing”]
Refining Candidate
Re-cluster the query facets or their facet terms
into higher quality query facets
Refining Candidate
Topic models: pLSA, LDA
Unsupervised clustering method: QDMiner (QDM)
Supervised methods based on a graphical model: QF-I, QF-J
Refining Candidate example
Input: {sets of noisy terms}
Output: {pure facets}
Year: {2001, 2012, 2013}
Lab: {nasa, bell lab, mars science lab}
Facet feedback model
Gives a score for each document
Input: document, query, facet feedback terms
Model:
Boolean Filtering Model
Soft Ranking Model
Output: the score of each document
Boolean Filtering Model
𝐹𝑢 denotes the set of feedback facets selected by the user
Condition B can be either AND, OR, or A+O
S(D,Q) is the score returned by the original retrieval model
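The slide's formula did not survive in this transcript, so the following is only a sketch of how such a filter could work. It assumes that a document satisfying condition B keeps its original score S(D,Q) and any other document is scored negative infinity, and that A+O means OR within a facet and AND across facets. The function names and the dictionary layout of the feedback are hypothetical.

```python
import math


def satisfies(doc_terms, feedback, condition):
    """Check condition B. `feedback` maps a facet name to its selected terms."""
    selected = [t for terms in feedback.values() for t in terms]
    if condition == "AND":
        # Every selected term must appear in the document.
        return all(t in doc_terms for t in selected)
    if condition == "OR":
        # At least one selected term must appear in the document.
        return any(t in doc_terms for t in selected)
    if condition == "A+O":
        # Assumed reading: OR within a facet, AND across facets.
        return all(any(t in doc_terms for t in terms) for terms in feedback.values())
    raise ValueError("condition must be 'AND', 'OR', or 'A+O'")


def boolean_filter_score(original_score, doc_terms, feedback, condition):
    """Keep S(D,Q) for documents that pass the filter; reject the rest."""
    if satisfies(doc_terms, feedback, condition):
        return original_score
    return -math.inf


doc = {"curiosity", "2012", "landing"}
feedback = {"rover": {"curiosity", "spirit"}, "year": {"2012"}}
for condition in ("AND", "OR", "A+O"):
    print(condition, boolean_filter_score(1.7, doc, feedback, condition))
```

In this example the document lacks "spirit", so AND rejects it while OR and A+O keep it, which illustrates how strict the AND condition can be.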
Soft Ranking Model
λ is a parameter for adjusting the weight between the two parts
𝑆𝐸(D, 𝐹𝑢) is the expansion model, which captures the relevance between the document D and the feedback facets 𝐹𝑢
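The slide's formula is missing from this transcript. A natural reading of "adjusting the weight between the two parts" is a linear interpolation between S(D,Q) and 𝑆𝐸(D, 𝐹𝑢), which the sketch below assumes, with λ placed on the original score. The `expansion_score` function is a deliberately simple stand-in (fraction of feedback terms found in the document), not the expansion model from the paper.

```python
def expansion_score(doc_terms, feedback_terms):
    """Stand-in for S_E(D, F^u): fraction of feedback terms found in D."""
    if not feedback_terms:
        return 0.0
    hits = sum(1 for t in feedback_terms if t in doc_terms)
    return hits / len(feedback_terms)


def soft_ranking_score(original_score, doc_terms, feedback_terms, lam=0.5):
    """Assumed form: lam * S(D,Q) + (1 - lam) * S_E(D, F^u)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return lam * original_score + (1.0 - lam) * expansion_score(doc_terms, feedback_terms)


feedback_terms = ["curiosity", "spirit"]
print(soft_ranking_score(2.0, {"curiosity", "landing"}, feedback_terms))
print(soft_ranking_score(2.0, {"landing"}, feedback_terms))
```

Unlike a Boolean filter, a document that matches no feedback term is only demoted here, not removed from the ranking.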
Intrinsic Evaluation
Ground truth: query facets are constructed by human annotators
The ground truth is compared with the facets generated by different systems
Facets generated by the different systems are pooled
Annotators are asked to group or re-group terms in the pool into preferred query facets
Intrinsic Evaluation
[Figure: each system takes a query and its search result, extracts candidate facets, and refines them into facets of the form {terms}; the facets generated by the different systems are pooled, and annotators re-group the pooled terms into ground-truth facets]
Extrinsic Evaluation
Evaluate a system based on an interactive search task that incorporates Faceted Web Search (FWS)
The gain can be measured by the improvement of the re-ranked results
The cost can be measured by the time spent by the users giving facet feedback
Extrinsic Evaluation
Oracle Feedback and Annotator Feedback
Oracle feedback selects only effective terms as feedback
For annotator feedback, the annotator is asked to select all the terms from the facets that would help address the information need
Extrinsic Evaluation
User model
Based on the user model, we can estimate the time cost for the user:
time for scanning facets
time for selecting terms
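The two components above suggest an additive cost. The sketch below assumes the user scans every term of every facet shown and then pays a fixed cost per selected term. The per-term constants and the function name `feedback_time_cost` are hypothetical placeholders, not values from the paper.

```python
def feedback_time_cost(facets, selected_terms,
                       scan_time_per_term=1.0, select_time_per_term=2.0):
    """Estimate feedback time as scanning time plus selecting time.

    `facets` maps a facet name to its list of terms. Both time constants
    are illustrative placeholders in arbitrary units.
    """
    shown_terms = [t for terms in facets.values() for t in terms]
    # Only terms that are actually shown can be selected.
    n_selected = sum(1 for t in shown_terms if t in selected_terms)
    scanning = scan_time_per_term * len(shown_terms)
    selecting = select_time_per_term * n_selected
    return scanning + selecting


facets = {"rover": ["curiosity", "opportunity", "spirit"], "year": ["2012", "2013"]}
print(feedback_time_cost(facets, {"curiosity", "2012"}))
```

With such an estimate, the gain in re-ranked results can be plotted against the time a simulated user spends giving feedback.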
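Extrinsic Evaluation
[Figure: a facet generation model produces facets of the form {terms}; simulated feedback chooses the selected terms; the user model estimates the time; the facet feedback model uses the selected terms, and the resulting performance is evaluated]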
Experiment Settings
Data set
For the document corpus, we use the ClueWeb09 Category-B collection
196 queries and 678 query subtopics
Facet Generation Models
pLSA, LDA, QDM, QF-I and QF-J
Facet Feedback Models
Boolean filtering model, soft ranking model
Baseline Retrieval Model
SDM
Conclusion
Proposed Faceted Web Search, an extension of
faceted search to the general Web
Boolean filtering models are too strict in Faceted Web
Search, and less effective than soft ranking models