calculation | consulting data science leadership
Who Are We?
c|c (TM)
Dr. Charles H. Martin, PhD University of Chicago, Chemical Physics
NSF Fellow in Theoretical Chemistry
Over 10 years experience in applied Machine Learning
Developed ML algos for Demand Media; the first $1B IPO since Google
Lean Start Ups: Aardvark (acquired by Google), eHow, Mode
Wall Street: BlackRock, GLG
Fortune 500: Big Pharma, Telecom, eBay
BackStory: in 2011, Search Changed. Forever.
• first $1B IPO since Google
• Machine Learning based SEO algorithms
• Measure the demand for search, and fulfill it
data science algorithms created a billion $ company
Demand Media
eHow.com
BackStory: in 2011, Search Changed. Forever.
• Google adapted (Panda)
• Lack of diversification
• Lack of adaptation
• Stock price never recovered
algorithmic accountability: DMD or Google?
[Figure: DMD stock price, 2011-2012, with the IPO and Panda marked]
Problem: Cornering the market on search induced a market crash
• first $1B collapse due to Panda?
• CPC revenues down
• premium online publishers died
[Figure: stock price 2011-2012, annotated "collapse?"]
$1B in ad revenue was repriced and reallocated
Panda-Induced ‘Market Crash’
Google CPC dropped just after Panda
Data Science is Different
Thomas H. Davenport
Generating sustainable revenue requires Data Science Leadership and Execution
“Companies need a Spock in the boardroom”
http://www.theonion.com/articles/national-science-foundation-science-hard,1405/
Problem: Data Scientists are Different
not all techies are the same
• theoretical physics: machine learning specialist
• experimental physics: data scientist
• engineer: software, browser tech, dev ops, …
Managing: Data Science Process
• Acquire Domain Knowledge
• Formulate Hypothesis
• Generate Model(s) from the Data
• Predict Revenue Gains
• Backtest Predictions on your Data
• A/B Test in Production
• Attribute Gains to Model(s)
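The backtest, A/B test, and attribution steps above all come down to comparing an outcome with and without the model. A minimal sketch in Python, assuming the metric is a conversion rate and using a standard two-proportion z-test; the function name and the counts are illustrative and do not come from the slides:

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (A) and model-driven variant (B).

    Returns (lift, z, p_value), where lift is the absolute difference in
    conversion rate and p_value is two-sided under a normal approximation.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


# Hypothetical counts: 10,000 visitors per arm.
lift, z, p_value = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"lift={lift:.4f} z={z:.2f} p={p_value:.4f}")
```

Only a lift that holds up in the production A/B test should be attributed to the model. A backtest alone shows what would have happened on historical data.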
[Diagram labels: framing, solving, acting]
Managing: Learning from Data
• Systems Thinking: leveraging the inter-relationships between data, marketing, and the customer
• Knowledge Transfer: mentoring, not training, to develop both personal mastery and team learning
• Mental Models: create a base of small-scale models for thinking about how to use your data
• Knowledge Sharing: foster collaboration between research, engineering, and product to drive revenue
Data Science is Different
• Cross-functional: engineering, product, marketing, finance
• Autonomous: separate from the traditional engineering product lifecycle; self-organizing and self-managing
• Experimental: form hypotheses, analyze data, make predictions, run backtests, A/B test
• Self-sustaining: not a cost center; generates revenue
Solution: Collecting and Organizing Data
• Most companies are struggling to organize their data
• Data needs to be examined
• Don’t assume data is correct or useful
• More is More: simple algos work
• More is Less: noise is noise
Data not examined is not collected
Solutions: Hadoop and Big Data
• Hadoop is an internal data ecosystem
• Hadoop appears to have won the adoption wars?
• Hadoop: 90% of deployments are internal
• Hadoop is a cost center
• ROI needs cut across business divisions
Algorithms, not data, generate revenue
Solutions: Cloud
• Startups don’t need infrastructure
• Long-term data storage is virtually free
• Amazon Redshift
• Google BigQuery
• Cloud is the future?
Algorithms, not data, generate revenue
Solutions: Spark
• Next Gen Platform for Machine Learning
• Sits on Hadoop or the Cloud
• Still very high touch
• Limited algos
Algorithms, not data, generate revenue
Problem: Measurements
good experiments are amazing
“If you can’t measure it, you can’t fix it.”
DJ Patil, White House Chief Data Scientist
Data Science’s Measurement Problem
good experiments are hard to design
http://www.forbes.com/sites/lizryan/2014/02/10/if-you-cant-measure-it-you-cant-manage-it-is-bs/
“Data science has a measurement problem. Simple metrics may not address complex situations. But complex metrics present myriad problems.”
“As we strive for better algorithms, we often fail to think critically about what it means for predictions to be ‘good’”
http://www.kdnuggets.com/2015/03/data-science-measurement-problem-accuracy-auroc-f1.html
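One way to see why a simple metric can mislead, sketched in Python on made-up data: on a heavily imbalanced problem, a model that never predicts the positive class scores high accuracy and zero F1. The labels and helper names below are illustrative assumptions, not taken from the article.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: 10 positives among 1,000 cases.
y_true = [1] * 10 + [0] * 990
# A "model" that always predicts the majority (negative) class.
y_pred = [0] * 1000

print(accuracy(y_true, y_pred))  # 0.99
print(f1_score(y_true, y_pred))  # 0.0
```

Neither number is wrong. They answer different questions, which is why the metric has to be chosen against the business outcome rather than by default.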
“Buffett found it 'extraordinary' that academics studied such things. They studied what was measurable, rather than what was meaningful. '… to a man with a hammer, everything looks like a nail.'”
― Roger Lowenstein, Buffett: The Making of an American Capitalist
Problem: The Cult of the Algorithm
what can algos actually do ?
“We have a new machine learning algo that anticipate your needs over time and behave accordingly”
Problem: What can Machine Learning Do?
what can algos actually do ?
Demand Algos: Gas Station Analogy
Problem: where to open a gas station?
Need: good traffic, weak competition
[Diagram: no traffic, fewer competitors | sweet spot | great traffic, too many competitors]
all businesses balance supply and demand
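The analogy can be sketched as a toy scoring rule in Python. The formula (traffic divided by one plus the number of competitors), the site names, and the numbers are assumptions for illustration only; the slides do not say how demand was actually weighed against competition.

```python
def demand_score(traffic, competitors):
    """Toy score: demand per seller if a new entrant joins the existing ones."""
    return traffic / (1 + competitors)


# Hypothetical candidate sites: daily traffic and existing gas stations.
sites = {
    "quiet road": {"traffic": 50, "competitors": 0},
    "new suburb": {"traffic": 2000, "competitors": 1},
    "highway exit": {"traffic": 5000, "competitors": 9},
}

scores = {name: demand_score(**site) for name, site in sites.items()}
best_site = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.0f}")
print("sweet spot:", best_site)
```

Under these made-up numbers the sweet spot is the site with solid traffic and little competition, not the one with the most traffic.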
SAAS Machine Learning Algos
• Diabetic Retinopathy Detection: $100,000 • 167 teams
• March Machine Learning Mania 2015: $15,000 • 341 teams
machine learning contests
machine learning apis
Problem: What can Deep Learning Do?
what can algos actually do ?
Problem: Externalities
external factors can change
“Zynga is our best company ever!” (2010)
John Doerr, Google Investor, Legendary VC
http://venturebeat.com/2010/11/16/google-investor-john-doerr-zynga-is-our-best-company-ever/
one marketplace | big risks
Solution: Algorithmic Accountability
An asset is an economic resource.
Anything tangible or intangible that is capable of being owned or controlled to produce value and that is held to have positive economic value is considered an asset.
algorithms can be valuable assets
Algorithmic Accountability
does revenue depend on hidden algos?
• WebMD Google SEO
• Amazon Product Listing Algo
• Pinterest Relevance Algo
• Twitter Spam filter
• Apple App Store Rankings
Algorithmic Accountability
do decisions depend on hidden factors ?
A 'Crisis' in Online Ads: One-Third of Traffic Is Bogus
http://www.wsj.com/articles/SB10001424052702304026304579453253860786362
Now Algorithms Are Deciding Whom To Hire…
http://www.npr.org/blogs/alltechconsidered/2015/03/23/394827451/now-algorithms-are-deciding-whom-to-hire-based-on-voice
What you don’t know about Internet algorithms is hurting you…
http://www.washingtonpost.com/news/the-intersect/wp/2015/03/23/what-you-dont-know-about-internet-algorithms-is-hurting-you-and-you-probably-dont-know-very-much/
Solution: Algorithmic Transparency
can you be transparent and not be gamed ?
http://fortune.com/2015/03/18/how-do-you-govern-a-hidden-fluid-and-amoral-algorithm/
83% of the participants in the study changed their behavior once they knew about the algorithm
How do you govern a (hidden, fluid and amoral) algorithm?
participants mistakenly believed that their friends intentionally chose not to show them stories
Algorithmic Accountability
Do you depend on someone else’s marketplace?
How does your revenue depend on algos?
Do you need an internal algo ?
Who will manage it? build it? maintain it?
algorithms have unforeseen liabilities