Search and Data Management Rakesh Agrawal MSR Search Lab.
![Page 1: Search and Data Management Rakesh Agrawal MSR Search Lab.](https://reader030.fdocuments.in/reader030/viewer/2022032800/56649d405503460f94a1a4fe/html5/thumbnails/1.jpg)
Search and Data Management
Rakesh Agrawal
MSR Search Lab
![Page 2: Search and Data Management Rakesh Agrawal MSR Search Lab.](https://reader030.fdocuments.in/reader030/viewer/2022032800/56649d405503460f94a1a4fe/html5/thumbnails/2.jpg)
Current Focus & Direction
• Understand the virtuous cycle between search and data and ways to accelerate it
• New search-centric applications
  – Personal data mining (Health)
  – Distributed knowledge creation (Education)
![Page 3: Search and Data Management Rakesh Agrawal MSR Search Lab.](https://reader030.fdocuments.in/reader030/viewer/2022032800/56649d405503460f94a1a4fe/html5/thumbnails/3.jpg)
Search & Data: Virtuous Cycle
[Diagram: cycle connecting Search, Data, and Insights]
Diagram labels: Queries, Clicks; Web Pages; Feeds; Mining; Relevance
Cycle: Better Search Results ► More Data ► Greater Insights ► Better Search Results
Also labeled: Intents, Behaviors, Connections, Popularity, Trends
![Page 4: Search and Data Management Rakesh Agrawal MSR Search Lab.](https://reader030.fdocuments.in/reader030/viewer/2022032800/56649d405503460f94a1a4fe/html5/thumbnails/4.jpg)
Related Searches (aka Query Suggestions)
• Most popular queries containing the current query
• Analysis of how users reformulated their queries
• Query click graph to find related queries
Examples: Football → Soccer (whole query); Wildflower cafe → Wildflower bakery (piecewise)
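One way to picture the click-graph idea is to treat two queries as related when users clicked the same URLs for both. The sketch below is only an illustration under that assumption: the function name `related_queries`, the cosine-similarity scoring, and the toy log with its example URLs are all hypothetical and are not taken from the talk.

```python
from collections import defaultdict
from math import sqrt


def related_queries(click_log, query, top_k=3):
    """Rank other queries by cosine similarity of their clicked-URL counts."""
    # Build the bipartite click graph: query -> {url: click count}.
    clicks = defaultdict(lambda: defaultdict(int))
    for q, url in click_log:
        clicks[q][url] += 1

    def cosine(a, b):
        dot = sum(a[u] * b[u] for u in a if u in b)
        norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    target = clicks.get(query)
    if not target:
        return []
    scored = [(other, cosine(target, vec)) for other, vec in clicks.items() if other != query]
    scored = [(q, s) for q, s in scored if s > 0]  # drop queries with no shared clicks
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:top_k]


# Hypothetical (query, clicked URL) pairs, invented for illustration.
toy_log = [
    ("football", "sports.example/soccer"),
    ("football", "fifa.example"),
    ("football", "nfl.example"),
    ("soccer", "sports.example/soccer"),
    ("soccer", "fifa.example"),
    ("wildflower cafe", "wildflower.example"),
    ("wildflower bakery", "wildflower.example"),
]

suggestions = related_queries(toy_log, "football")
```

In this toy log, "soccer" is the only query that shares clicked URLs with "football", so it is the only suggestion returned. A production system would presumably also weight by query popularity and reformulation data, as the bullets suggest.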
![Page 5: Search and Data Management Rakesh Agrawal MSR Search Lab.](https://reader030.fdocuments.in/reader030/viewer/2022032800/56649d405503460f94a1a4fe/html5/thumbnails/5.jpg)
Result Diversification
• Ideas from portfolio theory to allocate space to different result types
• Marginal utility of adding a document decreases if the result set already contains high quality documents of the same type
• Query and document classification using merged click logs
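The diminishing-marginal-utility bullet can be sketched as a greedy selection. This is a minimal illustration, not the method from the talk: it assumes each candidate has a single result type and a quality score in [0, 1], and that the query comes with a weight per type (for example, from a classifier). The names `diversify`, `type_weights`, and the sample data are hypothetical.

```python
def diversify(candidates, type_weights, k):
    """Greedily pick k results; each pick lowers the value of its own type.

    candidates: list of (doc_id, doc_type, quality), quality in [0, 1].
    type_weights: assumed weight that the query wants each result type.
    """
    remaining = dict(type_weights)  # demand for each type not yet satisfied
    pool = list(candidates)
    selected = []
    while pool and len(selected) < k:
        # Marginal utility = unmet demand for the type * document quality.
        best = max(pool, key=lambda d: remaining.get(d[1], 0.0) * d[2])
        selected.append(best[0])
        pool.remove(best)
        # A high-quality pick satisfies most of the demand for its type.
        remaining[best[1]] = remaining.get(best[1], 0.0) * (1.0 - best[2])
    return selected


# Hypothetical candidates and type weights, invented for illustration.
sample_candidates = [
    ("news1", "news", 0.9),
    ("news2", "news", 0.8),
    ("img1", "image", 0.6),
    ("vid1", "video", 0.5),
]
sample_weights = {"news": 0.6, "image": 0.3, "video": 0.1}

picked = diversify(sample_candidates, sample_weights, k=3)
```

Ranking by quality alone would return news1, news2, img1. With the discount, the second news document loses out once news1 is selected, and the result is news1, img1, vid1.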
![Page 6: Search and Data Management Rakesh Agrawal MSR Search Lab.](https://reader030.fdocuments.in/reader030/viewer/2022032800/56649d405503460f94a1a4fe/html5/thumbnails/6.jpg)
Classification Using Click Graph
[Diagram labels: Seed documents; ANIMALS documents; ANIMALS queries]
Algorithm: Random walk with absorbing states
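A random walk with absorbing states on a click graph can be sketched as follows. The slide names only the algorithm, so the details here are assumptions: labeled seed documents act as absorbing states, a second set of negative seeds (not shown on the slide) gives the walk somewhere else to end, and each unlabeled node is scored by the probability that a click-weighted walk starting there is absorbed at a positive seed. The function name, the `q:`/`d:` node prefixes, and the toy edges are hypothetical.

```python
from collections import defaultdict


def absorption_scores(edges, positive_seeds, negative_seeds, iterations=100):
    """Probability that a click-weighted walk from each node ends at a positive seed."""
    # Undirected bipartite graph: queries and documents linked by click counts.
    neighbors = defaultdict(dict)
    for query, doc, clicks in edges:
        neighbors[query][doc] = clicks
        neighbors[doc][query] = clicks

    score = {node: 0.0 for node in neighbors}
    for seed in positive_seeds:
        score[seed] = 1.0
    absorbing = set(positive_seeds) | set(negative_seeds)

    for _ in range(iterations):
        updated = dict(score)
        for node, nbrs in neighbors.items():
            if node in absorbing:
                continue  # absorbing states keep their fixed label
            total = sum(nbrs.values())
            # One step of the walk: click-weighted average of neighbor scores.
            updated[node] = sum(w * score[n] for n, w in nbrs.items()) / total
        score = updated
    return score


# Hypothetical (query, document, clicks) edges, invented for illustration.
toy_edges = [
    ("q:zebra", "d:zoo", 3),
    ("q:zebra", "d:safari", 1),
    ("q:jaguar", "d:safari", 1),
    ("q:jaguar", "d:carparts", 1),
]

animal_scores = absorption_scores(toy_edges, ["d:zoo"], ["d:carparts"])
```

On this toy graph the scores converge to roughly 0.9 for `q:zebra`, 0.6 for `d:safari`, and 0.3 for `q:jaguar`, so a 0.5 threshold would label the first two as ANIMALS and leave the ambiguous "jaguar" query out.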
![Page 7: Search and Data Management Rakesh Agrawal MSR Search Lab.](https://reader030.fdocuments.in/reader030/viewer/2022032800/56649d405503460f94a1a4fe/html5/thumbnails/7.jpg)
Changing Nature of Disease
[Chart: Number of People With Chronic Conditions (millions), by Year]
Year:     1995  2000  2005  2010  2015  2020  2025  2030
Millions:  118   125   133   141   149   157   164   171
• New challenge: chronic conditions, that is, illnesses and impairments that are expected to last a year or more, limit what one can do, and may require ongoing care.
• In 2005, 133 million Americans lived with a chronic condition (up from 118 million in 1995).
[Slide label: Infectious Diseases]
![Page 8: Search and Data Management Rakesh Agrawal MSR Search Lab.](https://reader030.fdocuments.in/reader030/viewer/2022032800/56649d405503460f94a1a4fe/html5/thumbnails/8.jpg)
Technology Trends
• Tremendous simplification in the technologies for capturing useful personal information
• Dramatic reduction in the cost and form factor for personal storage
• Cloud Computing
![Page 9: Search and Data Management Rakesh Agrawal MSR Search Lab.](https://reader030.fdocuments.in/reader030/viewer/2022032800/56649d405503460f94a1a4fe/html5/thumbnails/9.jpg)
Personal Health Analytics
![Page 10: Search and Data Management Rakesh Agrawal MSR Search Lab.](https://reader030.fdocuments.in/reader030/viewer/2022032800/56649d405503460f94a1a4fe/html5/thumbnails/10.jpg)
Personal Data Mining
Charts for appropriate demographics?
Optimum level for Asian Indians: 150 mg/dL (much lower than 200 mg/dL for Westerners)
Due to elevated levels of lipoprotein(a)*
Computation and selection across millions of data sources
Privacy and security
*Enas et al. Coronary Artery Disease In Asian Indians. Internet J. Cardiology. 2001.
![Page 11: Search and Data Management Rakesh Agrawal MSR Search Lab.](https://reader030.fdocuments.in/reader030/viewer/2022032800/56649d405503460f94a1a4fe/html5/thumbnails/11.jpg)
Collaborative Knowledge Creation (Educational Material)
• More than 3.5 million articles in 75 languages
• Fashioned by more than 25,000 writers
• 1 million articles in English (80,000 in Encyclopedia Britannica)
• Inspired by Wikipedia
• But multiple viewpoints rather than one consensus version!
• How to personalize search to find the material suitable for one’s own style of teaching?
• Management of trust and authoritativeness?
![Page 12: Search and Data Management Rakesh Agrawal MSR Search Lab.](https://reader030.fdocuments.in/reader030/viewer/2022032800/56649d405503460f94a1a4fe/html5/thumbnails/12.jpg)
Summary
• Web search is a “data management and creating value from data” problem
• New search-centric applications can provide rich fodder for future database research.