Date: 2012/10/18 Author: Makoto P. Kato, Tetsuya Sakai, Katsumi Tanaka Source: World Wide Web...
Structured Query Suggestion for Specialization and Parallel Movement: Effect on Search Behaviors
Date: 2012/10/18
Author: Makoto P. Kato, Tetsuya Sakai, Katsumi Tanaka
Source: World Wide Web conference (WWW '12)
Advisor: Jia-ling, Koh
Speaker: Jiun Jia, Chiou
Outline
• Introduction
• Problem Definition
• SParQS Backend Algorithm
  • Clustering entities
  • Clustering queries
  • Classifying query suggestions
• Experiment
• Conclusion
Introduction
• Traditional query suggestion: for the query "Camera", suggestions such as "Nikon camera" and "Canon camera" are presented in a single query suggestion list, ordered from high to low relevance.
Introduction
• Popular query reformulations:
  • Specialization: a broad or ambiguous query is modified to narrow down the search result. Ex: "Nikon" → "Nikon camera"
  • Parallel movement: the user's topic of interest shifts to another with similar aspects. Ex: "Nikon camera" → "Canon camera"
• Specialization: when the current query is "Nikon" and the query suggestions "Nikon camera" and "Canon ixy" are placed in the same cluster, the cluster does not help, because the user wants to select a query suggestion strictly related to "Nikon".
• Parallel movement: when the current query is "Nikon camera", query suggestions such as "Canon ixy" and "Canon camera" are helpful.
• It's difficult for simple clustering approaches to support specialization and parallel movement simultaneously.
Specialization and Parallel movement Query Suggestion [SParQS]
Diagonal movement
Introduction
• SParQS back-end algorithm (input: a log of queries and clicked URLs from Microsoft's Bing):
  1. Clustering entities
  2. Clustering queries
  3. Classifying query suggestions
Problem Definition
[Figure: clickthrough data linking Query 1 and Query 2 to URL 1, URL 2, and URL 3, with click counts 2, 5, 3, 1, 2, 4 on the links]
• Q: set of queries
• U: set of URLs
• w(q,u): how many times a URL u ∈ U presented in response to a query q ∈ Q has been clicked
• E: set of entities, Ex: Wikipedia entry titles
• Sj: set of query suggestions for each entity ej ∈ E
• n: the number of query suggestion categories required
Three Criteria:
① Evenness of Categories: Ex: for the entity cluster {"nikon", "canon", "olympus"}, the category label "ixy" is not suitable.
② Specificity of Categories: Ex: for the entity cluster {"nikon", "canon", "olympus"}, the category "Product" is too broad.
③ Accuracy of Suggestion Classification: Ex: "canon printer" classified into "photo" would confuse the user.
SParQS Backend Algorithm
• From a query log, query contexts are obtained for each entity by replacing the occurrences of the entity in queries with a wildcard.
• Ex: for the entity e = "canon", the queries "canon camera" and "price canon camera" give the query contexts "∗ camera" and "price ∗ camera".
• A context has the form c = "prefix e suffix"; the query obtained by placing the entity e in the context c is denoted c(e).
• C = {c | c(e) ∈ Q ∧ e ∈ E}
• Entity total: 250,000
• Define: entity vector Ve (e: canon), Top 10
  <canon camera, canon photo, canon lens, …>
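The wildcard replacement described on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `extract_contexts` and `substitute`, the ASCII `*` wildcard, and whitespace tokenization are choices made here, not details taken from the paper.

```python
from collections import defaultdict

WILDCARD = "*"  # assumed wildcard symbol for this sketch


def extract_contexts(queries, entities):
    """Map each entity to the set of contexts it occurs in.

    A context is a query in which one whole-word occurrence of the entity
    is replaced by the wildcard, e.g. "price canon camera" -> "price * camera".
    """
    contexts = defaultdict(set)
    for query in queries:
        tokens = query.lower().split()
        for entity in entities:
            ent_tokens = entity.lower().split()
            n = len(ent_tokens)
            # Slide a window over the query and compare token by token.
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] == ent_tokens:
                    replaced = tokens[:i] + [WILDCARD] + tokens[i + n:]
                    contexts[entity].add(" ".join(replaced))
    return contexts


def substitute(context, entity):
    """c(e): place the entity back into the wildcard slot of a context."""
    return context.replace(WILDCARD, entity, 1)


queries = ["canon camera", "price canon camera", "nikon camera"]
contexts = extract_contexts(queries, ["canon", "nikon"])
```

With the slide's two example queries, `contexts["canon"]` holds "* camera" and "price * camera", and `substitute("* camera", "nikon")` rebuilds the query "nikon camera".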
Clustering Entities:
• w(cl(e), u): the number of times a URL u has been clicked in response to the query q = cl(e).
• Vcanon: <canon camera, canon photo, canon lens, …> = <10, 4, 5, …>, where each component is a number of clicks. Ex: URL 1: 3, URL 2: 2, URL 3: 1, URL 4: 2, URL 5: 2 (3+2+1+2+2 = 10)
• Volympus: <5, 3, 9, …>
• Entity vectors are compared by cosine similarity.
• Group-average hierarchical clustering
• Obtain a set of entity clusters ε = {E1, E2, …}
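As a small worked example of the cosine similarity step, the sketch below compares the two entity vectors shown on the slide. The slide elides the remaining components ("…"), so only the first three values are used here; that truncation is an assumption of this example, and the resulting number is not a result from the paper.

```python
import math


def cosine_similarity(v1, v2):
    """Cosine of the angle between two equal-length click-count vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # convention for this sketch: an empty vector matches nothing
    return dot / (norm1 * norm2)


# First three components of the slide's vectors (the rest is elided there).
v_canon = [10, 4, 5]
v_olympus = [5, 3, 9]
sim = cosine_similarity(v_canon, v_olympus)
```

For these truncated vectors the similarity is 107 / sqrt(141 · 115), a little above 0.84.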
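Group-average hierarchical clustering

Example 1: Entity 2 and Entity 3 are merged.

| | Entity 1 | Entity 2 | Entity 3 |
|---|---|---|---|
| Entity 1 | 0 | 0.29 | 0.24 |
| Entity 2 | 0.29 | 0 | 0.37 |
| Entity 3 | 0.24 | 0.37 | 0 |

| | Entity 1 | Entity 2,3 |
|---|---|---|
| Entity 1 | 0 | <1> |
| Entity 2,3 | <1> | 0 |

<1>: (0.24+0.29)/2 = 0.265

Example 2: Entity 2,4 and Entity 3 are merged.

| | Entity 1 | Entity 2,4 | Entity 3 |
|---|---|---|---|
| Entity 1 | 0 | 0.24 | 0.37 |
| Entity 2,4 | 0.24 | 0 | 0.45 |
| Entity 3 | 0.37 | 0.45 | 0 |

| | Entity 1 | Entity 2,3,4 |
|---|---|---|
| Entity 1 | 0 | <2> |
| Entity 2,3,4 | <2> | 0 |

<2>: (0.24*2+0.37)/3 = 0.283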
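The two merge steps on this slide can be reproduced with a short sketch of the group-average update. It assumes the table entries are similarities (the pair with the largest value is the one merged on the slide) and represents each cluster as a tuple of entity ids; the function names are hypothetical.

```python
def most_similar_pair(sim):
    """Return the two clusters with the largest similarity, in sorted order."""
    a, b = sorted(max(sim, key=sim.get))
    return a, b


def merge(sim, a, b):
    """Merge clusters a and b under group-average linkage.

    sim maps frozenset({x, y}) to the similarity between clusters x and y.
    The similarity between the merged cluster and any other cluster c is the
    size-weighted mean of sim(a, c) and sim(b, c).
    """
    merged = a + b
    clusters = {c for pair in sim for c in pair}
    # Keep similarities that do not involve either merged cluster.
    new_sim = {p: v for p, v in sim.items() if a not in p and b not in p}
    for c in clusters - {a, b}:
        weighted = (len(a) * sim[frozenset({a, c})]
                    + len(b) * sim[frozenset({b, c})])
        new_sim[frozenset({merged, c})] = weighted / (len(a) + len(b))
    return merged, new_sim


# Example 1 from the slide: three single-entity clusters.
sim1 = {
    frozenset({(1,), (2,)}): 0.29,
    frozenset({(1,), (3,)}): 0.24,
    frozenset({(2,), (3,)}): 0.37,
}
m1, out1 = merge(sim1, *most_similar_pair(sim1))

# Example 2 from the slide: cluster {2, 4} already has two members.
sim2 = {
    frozenset({(1,), (2, 4)}): 0.24,
    frozenset({(1,), (3,)}): 0.37,
    frozenset({(2, 4), (3,)}): 0.45,
}
m2, out2 = merge(sim2, *most_similar_pair(sim2))
```

In Example 2 the cluster {2, 4} counts twice in the average, which is where the factor 2 in (0.24*2+0.37)/3 comes from.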
Clustering Queries:
• Define: query vector Vq (q = c(e))
• w(c(ej), u): the sum of click counts of queries that have the same context c.
• c = "prefix e suffix", e1 = "canon", e2 = "nikon", e3 = "olympus"
• Ex: for the context "∗ camera", URL 1 was clicked 5 times for "Canon camera", 2 times for "Nikon camera", and 3 times for "Olympus camera".
• V∗ camera: <URL 1, URL 2, URL 3, …> (Top 10) = <5+2+3, 4, 5, …>
• V∗ photo: <5, 3, 9, …>
• Query vectors are compared by cosine similarity.
• Group-average hierarchical clustering
• Obtain a set of query clusters
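The summation over queries that share a context can be sketched as follows. The nested-dictionary layout of `clicks` and the function name `context_vector` are assumptions for illustration; only the URL 1 counts (5, 2, 3) come from the slide.

```python
from collections import Counter


def context_vector(clicks, context, entities, wildcard="*"):
    """Sum click counts over all queries c(e) that share the context c.

    clicks maps a query string to a {url: click count} dictionary.
    """
    total = Counter()
    for entity in entities:
        query = context.replace(wildcard, entity, 1)
        total.update(clicks.get(query, {}))  # Counter.update adds counts
    return total


# URL 1 click counts from the slide; other URLs are omitted here.
clicks = {
    "canon camera": {"URL 1": 5},
    "nikon camera": {"URL 1": 2},
    "olympus camera": {"URL 1": 3},
}
v_camera = context_vector(clicks, "* camera", ["canon", "nikon", "olympus"])
```

Here `v_camera["URL 1"]` is 5+2+3 = 10, matching the first component of V∗ camera.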
Classifying Query Suggestion:
• Define: query cluster vector VQ(k) (Ex: Q(k) = {Canon camera, Nikon camera, Olympus camera, …})
• Define: query suggestion vector Vs
• If Sim(Q(k), s) > θ, classify a query suggestion s into a query cluster Q(k)
• Choose n query clusters as categories to classify query suggestions
  • Criteria: Accuracy, Evenness, Specificity
  • Measures: query suggestion entropy over entities, query suggestion entropy over categories
Query suggestion entropy over entities
Category "Photo":
• Nikon: Nikon digital camera, Nikon camera, Nikon dslr
• Olympus: Olympus camera, Olympus digital camera
• Canon: Canon camera, Canon photo, Canon dslr, Canon digital camera

$P_{\text{photo}}(\text{Nikon}) = 0.33$
$P_{\text{photo}}(\text{Olympus}) = 0.25$
$P_{\text{photo}}(\text{Canon}) = 0.416$
$H_{\text{photo}}(E) = -[(0.33 \log 0.33) + (0.25 \log 0.25) + (0.416 \log 0.416)] = 0.4679$ (base-10 logarithm)

Higher Hk(E): query suggestions classified into a category are distributed more evenly across entities.
Query suggestion entropy over categories
Entity "Nikon":
• Photo: Nikon digital camera, Nikon camera, Nikon dslr
• accessories: Nikon digital camera accessories, Nikon accessories, Nikon camera accessories
• lenses: Nikon lens, Nikon lenses, Nikon lens reviews

$P_{\text{Nikon}}(\text{photo}) = 0.33$
$P_{\text{Nikon}}(\text{accessories}) = 0.33$
$P_{\text{Nikon}}(\text{lenses}) = 0.33$
$H_{\text{Nikon}}(\;) = -[(0.33 \log 0.33) + (0.33 \log 0.33) + (0.33 \log 0.33)] = 0.4767$ (base-10 logarithm)

Higher entropy: query suggestions of an entity ej are distributed more evenly across categories.
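Both entropy values on the slides can be checked with a few lines of Python. The sketch assumes a base-10 logarithm, which is the base that reproduces the slides' numbers, and it uses the rounded probabilities exactly as printed (they do not sum to 1). The function name `entropy` is chosen here.

```python
import math


def entropy(probabilities, base=10):
    """Shannon entropy of a distribution; zero-probability terms are skipped."""
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


# Entropy over entities for the category "Photo" (Nikon, Olympus, Canon).
h_photo = entropy([0.33, 0.25, 0.416])

# Entropy over categories for the entity "Nikon" (photo, accessories, lenses).
h_nikon = entropy([0.33, 0.33, 0.33])
```

Rounded to four decimals, these give 0.4679 and 0.4767, in line with the slides. The uniform Nikon distribution yields the larger value, which is the sense in which higher entropy means a more even spread.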
[Figure: classification of query suggestions and selection of the best query clusters as categories; n = 5, θ = 0.3]
Example:
• ej: nikon
• Sj: {nikon lenses, nikon accessories, nikon customer service, …}
• Q(l): {nikon photo, nikon camera, nikon digital camera}
• Clustering queries: nikon photo = <6,3,2,…>, nikon camera = <3,1,5,…>, nikon digital camera = <2,4,2,…>
• Query cluster vector (component-wise sum): <11,8,9,…>
• Query suggestion vector: <# of clicks on the top 1 URL, top 2 URL, …>, Ex: <3,5,4,…>
• s1 = <3,5,4,…>, s2 = <6,1,2,…>, s3 = <3,1,1,…>, s4 = <4,3,2,…>, …
• Each suggestion is compared with the query cluster set Q(1), Q(2), Q(3), …; the cluster with the maximum cosine similarity is used if the similarity > θ (0.3).
• Has been classified into {nikon photo, nikon camera, nikon digital camera}: s1 = <3,5,4,…>, s2 = <6,1,2,…>
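The thresholded classification in this example can be sketched as below. The cluster vector for Q(1) is the component-wise sum of the three query vectors from the slide, truncated to the three components shown; the second cluster vector and the all-zero suggestion are hypothetical values added only to exercise the code, and picking the maximum-similarity cluster is an assumption about what "Max" means on the slide.

```python
import math


def cosine(v1, v2):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    return dot / norm if norm else 0.0


def cluster_vector(query_vectors):
    """Component-wise sum of the vectors of all queries in a cluster."""
    return [sum(column) for column in zip(*query_vectors)]


def classify(suggestion, cluster_vectors, theta=0.3):
    """Return the id of the most similar cluster, or None if none exceeds theta."""
    best_id, best_sim = None, theta
    for cluster_id, vector in cluster_vectors.items():
        similarity = cosine(suggestion, vector)
        if similarity > best_sim:
            best_id, best_sim = cluster_id, similarity
    return best_id


# nikon photo, nikon camera, nikon digital camera (first three components).
q1 = cluster_vector([[6, 3, 2], [3, 1, 5], [2, 4, 2]])
clusters = {
    "Q(1)": q1,
    "Q(2)": [0, 0, 1],  # hypothetical second cluster, not from the slide
}

label_s1 = classify([3, 5, 4], clusters)      # s1 from the slide
label_empty = classify([0, 0, 0], clusters)   # hypothetical suggestion with no clicks
```

With these inputs `q1` is [11, 8, 9], s1 is assigned to Q(1), and the all-zero suggestion stays unclassified because no similarity exceeds θ.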
Experiment
• Data: Microsoft Bing's query log from April 25th to May 1st, 2010

| | Count |
|---|---|
| Records | 3,503,469,327 |
| Unique queries | 76,462,963 |
| Unique URLs | 62,978,872 |

• Input: named entity list, Total: 5,156
  • Entity classes (person, landmark, city, product, company): 2,000, 119, 1,203, 388, 1,446
• Entity clustering and query clustering were then applied.
• Manually chose 20 entity clusters that had at least 2 entities, from each of the 5 entity classes.
  • Ex (entity class: company): {nikon, canon, olympus}, {sharp, samsung, lg, sony, panasonic}
• Two assessors evaluated the categories of 100 entity clusters with five values of a parameter λ (2,459 categories).
• The assessors were shown a list of category labels, a set of entities, and their unclassified query suggestions.
• Precision: rating levels were Highly relevant, Somewhat relevant, and Irrelevant.
[Figure: Precision; labels: specificity, evenness]
• Prepared 20 tasks, hired 20 subjects, and asked them to collect answers relevant to each task within five minutes. For each task, each subject used either the SParQS interface or a flat list interface (the baseline).
• 10 Information Gathering tasks: finding information about the given entity. Ex: query "nikon" → "nikon cameras"
• 10 Entity Comparison tasks: finding information about entities related to the given one in terms of a particular aspect. Ex: "competitors such as Canon and Olympus"
G: Information Gathering task
C: Entity Comparison task
User study
Questionnaire
Scores: 1 (Not at all), 2, 3 (Somewhat), 4, and 5 (Extremely)
Conclusion
• This paper proposed a new method to present query suggestions to the user, which has been designed to help two query reformulation actions: specialization and parallel movement.
• SParQS classifies query suggestions into automatically generated categories and generates a label for each category.
• SParQS presents some new entities as alternatives to the original query, together with their query suggestions classified in the same way as the original query's suggestions.
• Results show that subjects using the flat list query suggestion interface and those using the SParQS interface behaved significantly differently even though the set of query suggestions presented was exactly the same.