BIG DATA, We have a communication problem.
![Page 1: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/1.jpg)
BIG DATA, We have a communication problem.
GINORMOUS SYSTEMS
April 30–May 1, 2013
Washington, D.C.
Daniel Tunkelang
Head of Query Understanding, LinkedIn
![Page 2: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/2.jpg)
BIG DATA IS EVERYWHERE
![Page 3: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/3.jpg)
BIG DATA POWERS EVERYTHING
![Page 4: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/4.jpg)
DATA SCIENTISTS WORRY ABOUT VOLUME, VELOCITY, VARIETY, …
![Page 5: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/5.jpg)
BUT THE BOTTLENECK ISN’T COMPUTATIONAL
IT’S COGNITIVE
![Page 6: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/6.jpg)
TOOLS AUGMENT HUMAN INTELLECT
BIG DATA IS A TOOL
Doug Engelbart, inventor of the mouse, hypertext, etc.
![Page 7: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/7.jpg)
NOT EVERYONE SUBSCRIBES TO THIS POINT OF VIEW…
Claudia Perlich, Chief Scientist of media6degrees, speaking at TTI/Vanguard 2012 Conference on Understanding Understanding:
![Page 8: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/8.jpg)
SHE HAS A POINT
![Page 9: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/9.jpg)
BUT PREDICTIVE MODELING IS NOT ENOUGH
![Page 10: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/10.jpg)
TRAINING DATA?
OBJECTIVE FUNCTION?
![Page 11: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/11.jpg)
WE NEED A PEOPLE-CENTRIC APPROACH TO BIG DATA
INTERPRETABILITY
INTERACTION
INSIGHT
![Page 12: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/12.jpg)
LET’S START WITH INTERPRETABILITY
![Page 13: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/13.jpg)
EXAMPLE: SVM vs. DECISION TREE
![Page 14: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/14.jpg)
DECISION TREES HAVE FLAWS…
DISCRETE
![Page 15: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/15.jpg)
BUT THEY COMMUNICATE
(if they’re shallow)
early splits provide big picture…
fat leaves guide feature engineering
…or reveal training data problems
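As a rough illustration of why a shallow tree communicates, here is a minimal sketch in plain Python. It is not the speaker's code: the data set (search sessions described by `query_words` and `num_results`), the labels, and all function names are made up for the example, and a real project would more likely use a library implementation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature_index, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best


def describe_tree(rows, labels, names, max_depth=2, depth=0):
    """Grow a depth-limited tree and return it as indented text lines."""
    pad = "  " * depth
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        # Leaf: report its size and label mix so "fat" or mixed leaves stand out.
        return [f"{pad}leaf: n={len(labels)} {dict(Counter(labels))}"]
    f, t = split
    lines = [f"{pad}{names[f]} <= {t}?"]
    for go_left in (True, False):
        idx = [i for i, r in enumerate(rows) if (r[f] <= t) == go_left]
        lines += describe_tree(
            [rows[i] for i in idx], [labels[i] for i in idx],
            names, max_depth, depth + 1,
        )
    return lines


# Hypothetical sessions: (query_words, num_results) -> outcome. Invented data.
rows = [(1, 0), (2, 0), (1, 3), (4, 12), (3, 25), (5, 40), (2, 18), (6, 0)]
labels = ["abandoned", "abandoned", "abandoned", "clicked",
          "clicked", "clicked", "clicked", "abandoned"]

for line in describe_tree(rows, labels, ["query_words", "num_results"]):
    print(line)
```

The first line printed is the root split, which is the "big picture". Each leaf line reports how many rows landed there, so an unusually large leaf with mixed labels hints at a missing feature or at training data that needs a second look.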
![Page 16: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/16.jpg)
WHICH SUPPORTS ITERATION
![Page 17: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/17.jpg)
INTERPRETABILITY DELIVERS
Key search leader favors rule-based approach for key scoring algorithms.
Replaced regression with decision tree in local search model: gained accuracy and insight.
Using trees to recognize spam, analyze search abandonment, model / quantify social proof.
![Page 18: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/18.jpg)
GO DEEP vs INTERPRETABILITY
A KEY DATA SCIENCE TRADE-OFF
![Page 19: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/19.jpg)
ON TO INTERACTION
![Page 20: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/20.jpg)
![Page 21: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/21.jpg)
DON’T OVERPAY FOR PRECISION
![Page 22: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/22.jpg)
BE FAST, CHEAP, AND 98% RIGHT
http://metamarkets.com/2012/fast-cheap-and-98-right-cardinality-estimation-for-big-data/
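To make the trade-off concrete, here is a small sketch of approximate distinct counting. It uses a K-minimum-values estimator, picked here because it fits in a few lines; it is not necessarily the technique the linked post describes. The function names, the use of SHA-1, and the default `k=1024` are assumptions for illustration.

```python
import hashlib
import heapq


def _hash01(item: str) -> float:
    """Map an item to a pseudo-uniform float in [0, 1] using 64 bits of SHA-1."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def approx_distinct(items, k=1024):
    """Estimate the number of distinct items while remembering only k hash values."""
    heap = []     # max-heap (via negation) holding the k smallest hashes seen so far
    kept = set()  # the same hashes, for constant-time duplicate checks
    for item in items:
        h = _hash01(item)
        if h in kept:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            kept.discard(evicted)
            kept.add(h)
    if len(heap) < k:
        return float(len(heap))  # fewer than k distinct hashes: the count is exact
    # If the k-th smallest of n uniform values is u, then n is roughly (k - 1) / u.
    return (k - 1) / -heap[0]


# 200,000 events from 50,000 distinct (made-up) user ids.
stream = (f"user-{i % 50_000}" for i in range(200_000))
print(round(approx_distinct(stream)))
```

Memory stays fixed at `k` hash values no matter how long the stream is. With `k = 1024` the relative error is typically a few percent, which is the "fast, cheap, and mostly right" bargain the slide argues for.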
![Page 23: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/23.jpg)
ARE PEOPLE THAT IMPATIENT?
tolerable wait time for web users
0.1s increase in latency significantly reduces # of searches, ad revenue
tl;dr: YES
![Page 24: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/24.jpg)
IMPATIENCE IS GOOD
SPEED MATTERS
![Page 25: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/25.jpg)
INSIGHT
![Page 26: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/26.jpg)
http://blog.takejune.com/archives/52334044.html
![Page 27: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/27.jpg)
BE TRENDY AND NORMALIZE
vs
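One reading of "normalize" here is to divide a term's raw count by the total volume in the same period, so that overall growth does not pass for a trend. The sketch below assumes that reading; the function name and all numbers are invented for illustration.

```python
def normalize(term_counts, total_counts):
    """Return the term's share of all activity in each period."""
    return [term / total for term, total in zip(term_counts, total_counts)]


# Made-up numbers: mentions of one term vs. all mentions, per period.
term_counts = [100, 150, 200, 300]
total_counts = [10_000, 15_000, 20_000, 30_000]

shares = normalize(term_counts, total_counts)
print(shares)
```

In this invented series the raw count triples, yet the term's share of total volume does not change, so the apparent trend is only the overall corpus growing.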
![Page 28: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/28.jpg)
Sept. 11th
Abu Ghraib
Weapons Inspectors
SOLVE FOR INTERESTINGNESS
![Page 29: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/29.jpg)
COMPUTE POTENTIAL INSIGHTS
APPLY HUMAN INTUITION
![Page 30: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/30.jpg)
SUMMARY: Let’s have a conversation with Big Data.
INTERPRETABILITY
INTERACTION
INSIGHT
![Page 31: BIG DATA , We have a communication problem.](https://reader035.fdocuments.in/reader035/viewer/2022062517/568135b9550346895d9d1f3a/html5/thumbnails/31.jpg)