Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark...
![Page 1: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/1.jpg)
Social Search
Introduction to Information Retrieval, INF 141 / CS 121
Donald J. Patterson
![Page 2: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/2.jpg)
Aardvark
“The Anatomy of a Large-Scale Social Search Engine” by Horowitz and Kamvar, WWW 2010
![Page 4: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/4.jpg)
Aardvark
• Web IR
• Input is a query of keywords
• Search is over documents
• Trust is based on authority
• Mental model is a library
• Social Search
• Input is a question
• Search is over people
• Trust is based on intimacy
• Mental model is a village
![Page 5: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/5.jpg)
Aardvark
• Web IR
• facts
• navigation
• transactions
• Social Search
• opinion
• advice
• experience
• recommendations
![Page 6: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/6.jpg)
Aardvark
Components
• Crawler/Indexer
• Query Analyzer
• Ranking Function
• UI
![Page 7: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/7.jpg)
Aardvark
• Crawler/Indexer
• Users not documents
• Query Analyzer
• Understand the information need
• Ranking Function
• Pick the best resources
• UI
• To manage the conversation
![Page 8: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/8.jpg)
Aardvark
Welcome to Aardvark - Sign on process
• After confirming a new user’s account
• A Social Graph is built
• Facebook/LinkedIn connections
• webmail connections
• manual email invites
• “group” aware
• This is a work colleague, college friend, etc.
![Page 9: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/9.jpg)
Aardvark
Welcome to Aardvark - Sign on process
• A knowledge bank is built on
• self-identified expertise
• friend identified expertise
• home page identified expertise
• Facebook status update analysis
• Twitter status update analysis
• observed Aardvark usage
• knowledge bank’s inverted index maps topic -> user
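The topic -> user mapping can be pictured as an ordinary inverted index whose postings are users rather than documents. The following is a minimal sketch under that assumption, not Aardvark's implementation; the class `KnowledgeBank`, its methods, the strength values, and the sample users are all hypothetical.

```python
from collections import defaultdict


class KnowledgeBank:
    """Toy inverted index from a topic to the users who know about it."""

    def __init__(self):
        # topic -> {user_id: accumulated strength}
        self._index = defaultdict(dict)

    def add_expertise(self, user_id, topic, strength=1.0):
        # Accumulate evidence from several sources (self-reported, friends, ...).
        topic = topic.lower()
        postings = self._index[topic]
        postings[user_id] = postings.get(user_id, 0.0) + strength

    def users_for(self, topic):
        # Users for a topic, strongest first; ties are broken by user id.
        postings = self._index.get(topic.lower(), {})
        return sorted(postings, key=lambda u: (-postings[u], u))


bank = KnowledgeBank()
bank.add_expertise("alice", "Hiking", 2.0)   # hypothetical self-identified topic
bank.add_expertise("bob", "hiking", 1.0)     # hypothetical friend-identified topic
bank.add_expertise("bob", "cooking", 3.0)
print(bank.users_for("hiking"))
```

Looking up a topic returns candidate answerers directly, which is the same access pattern a document index offers for keywords.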
![Page 11: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/11.jpg)
Aardvark
The Question
• The question is acquired by various input channels:
• text, web, IM, mobile, etc.
• The question is screened for obscenity
• The question is topic analyzed
• topic is presented to asker for confirmation
• The question is passed to a routing engine
• which ranks potential answerers based on
• social graph and expertise
![Page 12: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/12.jpg)
Aardvark
Routing Engine
• Pick the best answerer
• What is the probability that user i will answer question q?
• Marginalize over the topics in the question and the topic expertise of the user
• Pick the best pair of users
• Which user i is the most likely to give a good answer to user j?
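Written out, the marginalization over topics takes the following form, as I understand it from the Horowitz and Kamvar paper. Here $T$ is the set of topics, $p(t \mid q)$ is the probability that question $q$ is about topic $t$, and $p(u_i \mid t)$ is the probability that user $u_i$ is the right person to answer a question on topic $t$.

```latex
p(u_i \mid q) = \sum_{t \in T} p(u_i \mid t)\, p(t \mid q)
```

The pairwise term for the second bullet is the query-independent quantity $p(u_i \mid u_j)$: how likely user $u_i$ is to give a satisfying answer to user $u_j$, regardless of the question.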
![Page 13: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/13.jpg)
Aardvark
Routing Engine
• Ranking function combines the two
• For a given query, q, by user j
• what is the best user i to ask?
• Biases intimacy over authority
• Notice there is nothing like PageRank here
• The only real-time component is p(t|q)
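A sketch of how the two quantities might be combined into a single ranking, assuming the score is the product s(u_i, u_j, q) = p(u_i|u_j) · Σ_t p(u_i|t) p(t|q). The function names, the dictionary layout, and the sample numbers are hypothetical and chosen only for illustration.

```python
def expertise(p_user_given_topic, p_topic_given_query):
    """Sum over topics of p(u_i|t) * p(t|q) for one candidate user."""
    return sum(
        p_user_given_topic.get(topic, 0.0) * p_tq
        for topic, p_tq in p_topic_given_query.items()
    )


def rank_answerers(asker, candidates, connect, user_topic, p_topic_given_query):
    """Order candidates by p(u_i|u_j) * p(u_i|q), best first."""
    scored = []
    for user in candidates:
        if user == asker:
            continue  # never route a question back to the person asking it
        p_connect = connect.get((user, asker), 0.0)
        score = p_connect * expertise(user_topic.get(user, {}), p_topic_given_query)
        scored.append((score, user))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [user for _, user in scored]


# Hypothetical data: bob is close to alice, carol is the stronger expert.
connect = {("bob", "alice"): 0.6, ("carol", "alice"): 0.1}
user_topic = {"bob": {"hiking": 0.2}, "carol": {"hiking": 0.9}}
p_tq = {"hiking": 1.0}
ranking = rank_answerers("alice", ["alice", "bob", "carol"], connect, user_topic, p_tq)
print(ranking)
```

With these numbers the close friend outranks the stronger expert, which illustrates the bias toward intimacy. Only `p_tq` depends on the incoming question; everything else can be computed offline.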
![Page 14: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/14.jpg)
Aardvark
Indexing People
• Figuring out p(t|u_i)
• Figuring out p(u_i|u_j)
![Page 15: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/15.jpg)
Aardvark
Indexing People
• Figuring out p(t|u_i)
• Users self-identify topics they are “experts” in
• Others identify topics the user is an “expert” in
• Topics are mined from
• Homepages
• Blogs
![Page 16: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/16.jpg)
Aardvark
Indexing People
• Figuring out p(t|u_i)
• In unstructured text
• an SVM classifies the general topic
• an ad-hoc entity parser figures out specific topics
• scaled by TF-IDF score
• topics are also mined from Aardvark conversations
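To make the TF-IDF scaling concrete, here is a minimal sketch that scores the terms in one user's text against a background corpus. It substitutes plain whitespace tokenization for the SVM and the entity parser, so it illustrates only the scaling step; the function name and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def tfidf_topic_scores(user_docs, corpus_docs):
    """Score each term in a user's text by term frequency times log(N / df)."""
    n_docs = len(corpus_docs)
    # Document frequency: in how many corpus documents does each term occur?
    df = Counter()
    for doc in corpus_docs:
        df.update(set(doc.lower().split()))
    # Term frequency across everything this user wrote.
    tf = Counter()
    for doc in user_docs:
        tf.update(doc.lower().split())
    return {
        term: count * math.log(n_docs / df[term])
        for term, count in tf.items()
        if df[term] > 0  # skip terms never seen in the background corpus
    }


corpus = ["the trail was steep", "the soup was good", "the trail map"]
scores = tfidf_topic_scores([corpus[0], corpus[2]], corpus)
print(sorted(scores, key=lambda term: -scores[term]))
```

Terms that appear in every document, such as "the", score zero, while rarer terms rise to the top as candidate topics.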
![Page 17: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/17.jpg)
Aardvark
Indexing People
• The mined information is not for answering questions
• It’s for identifying people who can answer questions
• Topics are enhanced with social network information
![Page 18: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/18.jpg)
Aardvark
Indexing People
• All of the scores for topics given a user, p(t|u_i), are normalized
• Bayes' law is used to invert the probability, yielding p(u_i|t)
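The two steps can be sketched as follows: normalize each user's raw topic scores so they sum to 1, then apply Bayes' law, p(u_i|t) = p(t|u_i) p(u_i) / p(t). The uniform prior p(u_i), the function names, and the sample scores are assumptions made for illustration.

```python
def normalize(scores):
    """Scale raw topic scores for one user so they sum to 1."""
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()} if total > 0 else {}


def invert(p_topic_given_user, p_user):
    """Bayes' law: p(u|t) = p(t|u) p(u) / p(t), with p(t) = sum_u p(t|u) p(u)."""
    p_topic = {}
    for user, topics in p_topic_given_user.items():
        for topic, p in topics.items():
            p_topic[topic] = p_topic.get(topic, 0.0) + p * p_user[user]
    return {
        topic: {
            user: topics[topic] * p_user[user] / p_t
            for user, topics in p_topic_given_user.items()
            if topic in topics
        }
        for topic, p_t in p_topic.items()
        if p_t > 0
    }


# Hypothetical raw scores per user, and an assumed uniform prior over users.
raw = {"alice": {"hiking": 3, "cooking": 1}, "bob": {"hiking": 1, "cooking": 1}}
p_t_given_u = {user: normalize(s) for user, s in raw.items()}
p_u_given_t = invert(p_t_given_u, {"alice": 0.5, "bob": 0.5})
print(p_u_given_t["hiking"])
```

The inverted table is keyed by topic, which is the direction the routing engine needs when a question arrives.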
![Page 19: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/19.jpg)
Aardvark
Indexing People
• Probability that u_i will respond to u_j: p(u_i|u_j)
• Social Connections
• Demographic similarity
• Profile similarity
• Vocabulary similarity
• Chattiness similarity
• Verbosity similarity
• Politeness similarity
• Speed match
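One plausible way to turn features like these into p(u_i|u_j) is a weighted cosine similarity between per-user feature vectors, normalized over all candidates so the values sum to 1. The sketch below assumes non-negative feature values; the vectors, weights, and function names are hypothetical.

```python
import math


def weighted_cosine(a, b, weights):
    """Cosine similarity in which each feature dimension carries its own weight."""
    dot = sum(w * x * y for w, x, y in zip(weights, a, b))
    norm_a = math.sqrt(sum(w * x * x for w, x in zip(weights, a)))
    norm_b = math.sqrt(sum(w * y * y for w, y in zip(weights, b)))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def connectedness(asker, features, weights):
    """Similarity of every other user to the asker, normalized to sum to 1."""
    sims = {
        user: weighted_cosine(vec, features[asker], weights)
        for user, vec in features.items()
        if user != asker
    }
    total = sum(sims.values())
    return {user: s / total for user, s in sims.items()} if total > 0 else {}


# Hypothetical features: [vocabulary, verbosity, response speed], each in [0, 1].
features = {
    "alice": [1.0, 0.8, 0.2],
    "bob": [0.9, 0.7, 0.3],
    "carol": [0.1, 0.2, 0.9],
}
p_connect = connectedness("alice", features, weights=[1.0, 1.0, 1.0])
print(p_connect)
```

Because nothing here depends on the question, these values can be precomputed and refreshed offline.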
![Page 20: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/20.jpg)
Aardvark
Pulling topics out of a question: p(t|q)
• Real-time response needed
• Doesn’t have to be perfect; people pick up the slack
• Screening
• Is it a question? (No -> reject)
• Is it inappropriate? Spam? Sex? Commercial? (Yes -> reject)
• Is it trivial? (Yes -> answer it)
• Is it location sensitive?
• Location is treated differently than topic
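The screening questions form a short decision cascade that runs before any routing. The sketch below uses deliberately crude stand-ins for each check (a trailing question mark, a blocked-term list, a lookup table of trivial answers); these heuristics, the function name, and the sample data are hypothetical, and location sensitivity is not modeled.

```python
def screen(question, blocked_terms=("buy now", "xxx"), trivial_answers=None):
    """Return (action, detail), where action is 'reject', 'answer', or 'route'."""
    trivial_answers = trivial_answers or {}
    text = question.strip().lower()
    if not text.endswith("?"):
        return ("reject", "not a question")
    if any(term in text for term in blocked_terms):
        return ("reject", "inappropriate")
    if text in trivial_answers:
        # Trivial questions are answered directly instead of bothering a person.
        return ("answer", trivial_answers[text])
    return ("route", text)


trivia = {"what is 2 + 2?": "4"}
print(screen("Where can I find a good hiking trail near Irvine?", trivial_answers=trivia))
print(screen("What is 2 + 2?", trivial_answers=trivia))
print(screen("Buy now, limited offer?", trivial_answers=trivia))
print(screen("hello there", trivial_answers=trivia))
```

Cheap checks like these keep the step fast, and since a person will eventually read the question, occasional misclassification is tolerable.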
![Page 21: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/21.jpg)
Aardvark
Pulling topics out of a question: p(t|q)
• Topics
• Keyword to Topic Matcher
• Taxonomy Topic Mapper
• SVM classifier
• Phrase to Topic Matcher
• User Tag to Topic Mapper
• If the user tags the question
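Several mappers each propose topics for the same question, so their outputs have to be merged into one distribution p(t|q). The sketch below assumes a weighted linear combination followed by normalization; that choice, the mapper names, the weights, and the scores are all hypothetical.

```python
def combine_topic_votes(mapper_outputs, mapper_weights):
    """Merge per-mapper topic scores into one normalized distribution p(t|q)."""
    combined = {}
    for name, topics in mapper_outputs.items():
        weight = mapper_weights.get(name, 0.0)
        for topic, score in topics.items():
            combined[topic] = combined.get(topic, 0.0) + weight * score
    total = sum(combined.values())
    return {t: s / total for t, s in combined.items()} if total > 0 else {}


# Hypothetical outputs for "Where can I find a good hiking trail?"
outputs = {
    "keyword": {"hiking": 1.0},
    "svm": {"hiking": 0.6, "travel": 0.4},
}
p_tq = combine_topic_votes(outputs, {"keyword": 1.0, "svm": 1.0})
print(p_tq)
```

A user-supplied tag could be given a larger weight than the automatic mappers, since the asker has stated the topic explicitly.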
![Page 22: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/22.jpg)
Aardvark
User Interface
![Page 24: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/24.jpg)
Aardvark
Samples
![Page 26: Social Search - Donald Bren School of Information and ...djp3/classes/2014_01_INF... · Aardvark The Question • The question is acquired by various input channels: • text, web,](https://reader035.fdocuments.in/reader035/viewer/2022081403/5f10643b7e708231d448e18b/html5/thumbnails/26.jpg)
Aardvark
Empirical Results