1 Web Search Personalization via Social Bookmarking and Tagging Michael G. Noll & Christoph Meinel...
-
Upload
francis-welch -
Category
Documents
-
view
219 -
download
3
Transcript of 1 Web Search Personalization via Social Bookmarking and Tagging Michael G. Noll & Christoph Meinel...
1
Web Search Personalization
via Social Bookmarking andTagging
Michael G. Noll & Christoph MeinelHasso-Plattner-Institut an der Universit¨at Pots
dam, GermanyISWC 2007Advisor: Prof. Hsin-Hsi Chen Reporter: Yu-Hui Chang
2008/07/30
2
Introduction
2008/07/30 Y.H. Chang 3/27
Social bookmarking and tagging
• social bookmarking:– publicly sharing your bookmarks with others– including any additional metadata
• tagging / folksonomies:– Users annotate Documents with with a flat, un
structured list of keywords called Tags
TUDRtagging
2008/07/30 Y.H. Chang 4/27
Web search personalization
• integration of user-specific data to improve results / advertising
• two main approaches:1. modify user's query:
“nyt” > “new york times”
2. re-rank search results based on user profile
5
Personalization Technique
2008/07/30 Y.H. Chang 6/27
Personalization
• Input: – user profile + document profiles
• Via social bookmarking and tagging
• Algorithm:1. calculate Similarity(user, document) for all docs
2. sort documents by similarity from highest to lowest
• Output: – re-ranked search result list
2008/07/30 Y.H. Chang 7/27
Profile
• user profile– “Tagmarking”
• A user search “research internet security”=>he/she can click single buttonto bookmark a document with auto translate tag “rese
arch”, “internet”, ”security”
• document profiles– Communicating with the bookmarking service
over its web API
Tagmarking
2008/07/30 Y.H. Chang 8/27
Personalization
• Complete process of web search personalization• * every step done in real-time
2008/07/30 Y.H. Chang 9/27
Data Aggregation: User Profile
• User’s profile example
m tags
n documents
2008/07/30 Y.H. Chang 10/27
Data Aggregation: Doc. Profile
• Document’s profile example
m tags
n users
Mu
2008/07/30 Y.H. Chang 11/27
Similarity
• Similarity(u ,d)=puT ||p‧ d||
– ||pd||: simply normalization of document profile, reset matrix element with only 1 and 0 two values ( True or False)
2008/07/30 Y.H. Chang 12/27
Similarity example
• Similarity(u ,d)=puT ||p‧ d||
…
11…
1111……
13192
102134
=13*0+19*1+2*0+10*1+21*0+34*1=63
2008/07/30 Y.H. Chang 13/27
Similarity score properties
• the key factor: unmodified user profile– Promotes known and similar doc., demotes
those unknown or non-similar doc.• more sophisticated normalization for both user and
document is on-going
• Score of unknown document => 0!!!• most critical factor in practice:
– “do we have sufficient data to make all this work?”
2008/07/30 Y.H. Chang 14/27
Personalization
• system setup– server: social bookmarking service– client: browser add-on
• modification of search engine UI by updating the DOM tree of the search result pages in real-time
2008/07/30 Y.H. Chang 15/27
2008/07/30 Y.H. Chang 16/27
Re-rank example
• a user with a strong interest in information technology and network security
2008/07/30 Y.H. Chang 17/27
18
Experiment and Evaluation
2008/07/30 Y.H. Chang 19/27
Experiment
• Del.icio.us– Public social bookmarking service– Large user community
• Key question:• Quantitative analysis
– How many social annotations in practice?
• Qualitative analysis– Quality evaluation
2008/07/30 Y.H. Chang 20/27
Quantitative analysis
• test set– 140 “popular tags” on del.icio.us– 1400 search results link (top 10 results )
• totaling– 981,989 bookmarks– 20,498 tag annotations (2,300 unique)
2008/07/30 Y.H. Chang 21/27
Quantitative analysis
2008/07/30 Y.H. Chang 22/27
Quantitative analysis
• we can expect to personalize approx. 85% (in the 1st page) per query in practice
Percentage of links with at least 1 tag
2008/07/30 Y.H. Chang 23/27
Qualitative analysis
• For each query, participants were presented two search result lists:
1. original list from Google Search2. The personalized version
• 8 participants evaluate the top 10 results for 13 queries each
– Participant’s job: researchers, web masters, software developers, system administrators
– The average number of bookmarks for a participant was 153.
2008/07/30 Y.H. Chang 24/27
Qualitative analysis
29.80%
70.20%6.70%
• Some discussions:– “Expert” user profiles
– Disambiguate words and contexts
Personalized version Better
Worse
Equal
25
Conclusion
2008/07/30 Y.H. Chang 26/27
Conclusion
• proposed personalization approach is feasible and viable in practice:– already sufficient user-supplied metadata available– initial evaluation of personalization quality shows very
promising results
• Open Access:• http://www.michael-noll.com/dmoz100k06/ - dat
a set• http://www.michael-noll.com/delicious-api/ - scri
pts
27
Thank you!!