Flirting With Disaster: Learning from Analytical Failures
Director, Customer Analytics, Walmart Stores, Inc.
Sterling Price
About the presenter
Sterling Price is Senior Director of Shopper Analytics, Merchandising, and Marketing at Walmart. Sterling leads several teams of analytics professionals engaged in generating business value by building a holistic view of the Walmart customer, across channels, banners, and formats. Sterling has been with Walmart for 20 years, in a career spanning Information Systems, Merchandise Flow, Labor Force Analytics, and Customer Insights prior to his current role.
Perspective
Be Careful Out There!
Wise Words
“Big data may mean more
information, but it also means
more false information.”
- Nassim Nicholas Taleb,
Author, “The Black Swan”,
“Antifragile”
Google Flu Trends
• CDC flu data was doctor-reported, 2 weeks
behind
• In 2008 Google launched Google Flu Trends
• Mined 5 years of web logs – huge amount of data
• Found 45 predictive searches out of 50MM tested
• Ran ~450MM models
• Estimated > 90% correlation to historical CDC flu
data
Google Flu Trends - Continued
• Overestimated flu 100 of 108 weeks (as of March
2014)
• Over-predicted 2012-2013 seasonal flu by 50%
• Changes in Google’s search algorithm, and in people’s
search habits (for example, searching for “Google Flu
Trends”), may have contributed to the errors
• Google Flu Trends page still exists, but data no
longer offered to the public – only to medical
researchers
Lessons Learned – Google Flu Trends
• Correlation doesn’t mean much by itself. A huge
amount of data doesn’t change that
• Don’t fall prey to “Big Data Hubris” – assuming
results will be accurate and useful because of how
much data was used
• Still need a way to “separate the wheat from the chaff”
– and there’s a lot more chaff now!
• Methodology still matters. Big Data by itself does
nothing. How we use it defines its value – just as it
always has for any data
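A minimal sketch of why testing a huge number of candidates produces impressive-looking correlations by chance. It is not Google's method: every series here is pure random noise, and the candidate count, series lengths, and function names are illustrative assumptions. The best candidate is picked on a "training" window and then checked on a holdout window it never saw.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def pick_best_then_validate(n_candidates=5000, n_train=20, n_holdout=20, seed=0):
    """Pick the noise series that best 'predicts' a noise target in-sample,
    then measure the same series on the holdout window."""
    rng = random.Random(seed)
    total = n_train + n_holdout
    # The target stands in for the historical data; here it is just noise.
    target = [rng.gauss(0, 1) for _ in range(total)]
    best_r, best_series = 0.0, None
    for _ in range(n_candidates):
        series = [rng.gauss(0, 1) for _ in range(total)]
        r = pearson(series[:n_train], target[:n_train])
        if abs(r) > abs(best_r):
            best_r, best_series = r, series
    holdout_r = pearson(best_series[n_train:], target[n_train:])
    return best_r, holdout_r


if __name__ == "__main__":
    in_sample, holdout = pick_best_then_validate()
    print(f"best in-sample correlation: {in_sample:+.2f}")
    print(f"same series on holdout:     {holdout:+.2f}")
```

With these assumed settings, the winning series usually shows a strong in-sample correlation even though nothing real connects it to the target, and that correlation tends to vanish on the holdout window. More candidates only make the in-sample "winner" look better.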
The Netflix Prize
• One million dollars offered to anyone who could
improve upon Netflix’s own movie recommendations
algorithm by at least 10%.
• The winning entry was never implemented by Netflix
• Netflix: “additional accuracy gains that we measured
did not seem to justify the engineering effort needed
to bring them into a production environment.”
• Was based on DVD rentals - Netflix shifted to
streaming during this time
Lessons Learned - The Netflix Prize
• Assess project ROI before proceeding – will the
incremental benefit be worth it when all costs are
considered?
• Be aware of changes to the nature of the business
that could affect the outcome (difficult for Netflix
because the competition lasted several years).
• Scalability must be considered up front.
Statistical Significance
Right Turn on Red Light
• Right turn on red (RTOR) started in California in 1937
• Engineers questioned its safety
• 1973 oil crisis – government allowed RTOR
• A Virginia consultant ran a before-and-after (pre-post) test
• 20 intersections were studied
• Accidents increased, but the increase was not statistically
significant. Subsequent studies in other states concurred.
Right Turn on Red Light - Continued
• VA consultant: “since the result is not statistically
significant, it is best to assume the safety effect
to be zero.”
• Government: “we can discern no significant
hazard to motorists.”
• Once right turn on red became common, more data
became available leading to analysis that challenged
this finding. This error cost lives.
Lessons Learned - Right Turn on Red
• Lack of statistical significance mistaken for lack of
practical significance – very common problem
• Insufficient statistical power for analysis
• What the consultant probably meant was, in effect, “I
cannot be sure that the safety effect is not zero,” but
what he said was “it is best to assume the safety effect
to be zero.”
• Wrong choice in prioritizing avoidance of Type 1 error
over avoidance of Type 2 error
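A rough simulation of the power problem described above, assuming made-up numbers: 20 sites, an average of 2 accidents per site per period, and a true 20% increase after the change. None of these figures come from the original study, and the simple z-test on total counts is an illustrative stand-in, not the consultant's actual analysis.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) count using Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def rejects_null(before, after):
    """Two-sided z-test at the 5% level for equal before/after totals."""
    total = before + after
    if total == 0:
        return False
    z = (after - before) / math.sqrt(total)
    return abs(z) > 1.96


def rejection_rate(true_increase, n_sites=20, mean_per_site=2.0,
                   trials=2000, seed=1):
    """Share of simulated studies that declare a significant change."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        before = sum(poisson(rng, mean_per_site) for _ in range(n_sites))
        after = sum(poisson(rng, mean_per_site * (1 + true_increase))
                    for _ in range(n_sites))
        rejections += rejects_null(before, after)
    return rejections / trials


if __name__ == "__main__":
    type1_rate = rejection_rate(0.0)   # no real effect: false alarms
    power = rejection_rate(0.20)       # real 20% increase: detections
    print(f"Type 1 error rate (no effect):     {type1_rate:.2f}")
    print(f"Type 2 error rate (20% increase):  {1 - power:.2f}")
```

Under these assumptions the test holds the Type 1 error rate near its nominal 5%, but it misses a real 20% increase most of the time. A "not significant" result from a study this small says very little about whether the effect is zero.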
Refresher – Type 1 vs. Type 2 Error
Let’s Talk About a Presidential Candidate
…this
one!
The Election of 1936
• FDR vs. Alfred Landon
• The Literary Digest commissioned a survey – one of
the largest and most expensive ever. The “big data”
of its time
• ~10MM surveys sent, 2.4MM respondents
• Prediction: Landon 57%, FDR 43%
• The Literary Digest had correctly predicted the outcome
of every presidential election since 1916
The Election of 1936 - Continued
• Actual: FDR landslide victory – 62% vs. 38%
• George Gallup used a random sample of about 50K to
predict an FDR victory
• Reasons for failure: Sample Bias and Nonresponse
Bias
• Survey was sent to magazine subscribers, club
membership lists, people in phone book – during the
Great Depression
1936 Election – Lessons Learned
• A badly chosen large sample (even a really big one)
is much worse than a well chosen small sample
• Selection bias is insidious – how you ask can have
serious implications about who you are asking
• Why is this important for big data? We don’t need to
sample now, right? Wrong – we still need to sample for
things like holdout groups for model training, tool
capacity, etc.
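A small sketch of the first lesson, with invented numbers: suppose 30% of voters appear on the mailing lists and 45% of them favor FDR, while 70% of everyone else does. These shares are assumptions chosen only to resemble the 1936 outcome, and the sketch models sample bias only, not nonresponse bias.

```python
import random

# Hypothetical population parameters (assumptions, not historical data).
ON_LIST_SHARE = 0.30         # voters reachable via subscriber/club/phone lists
FDR_SUPPORT_ON_LIST = 0.45   # FDR support among listed voters
FDR_SUPPORT_OFF_LIST = 0.70  # FDR support among everyone else


def draw_voter(rng, on_list_only):
    """Return True if one sampled voter supports FDR."""
    on_list = True if on_list_only else rng.random() < ON_LIST_SHARE
    support = FDR_SUPPORT_ON_LIST if on_list else FDR_SUPPORT_OFF_LIST
    return rng.random() < support


def poll(sample_size, on_list_only, seed):
    """Estimated FDR share from a poll of the given size and sampling frame."""
    rng = random.Random(seed)
    votes = sum(draw_voter(rng, on_list_only) for _ in range(sample_size))
    return votes / sample_size


TRUE_SHARE = (ON_LIST_SHARE * FDR_SUPPORT_ON_LIST
              + (1 - ON_LIST_SHARE) * FDR_SUPPORT_OFF_LIST)

if __name__ == "__main__":
    big_biased = poll(200_000, on_list_only=True, seed=36)
    small_random = poll(2_000, on_list_only=False, seed=36)
    print(f"true FDR share:               {TRUE_SHARE:.3f}")
    print(f"200,000 voters, lists only:   {big_biased:.3f}")
    print(f"2,000 voters, random sample:  {small_random:.3f}")
```

In this setup the large list-only poll lands very precisely on the wrong answer, while the random sample that is 100 times smaller lands close to the true share. Adding more respondents from the same lists only tightens the estimate around the biased value.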
Management Expectations?
Big data is magic, infallible!
It doesn’t have the same
pesky uncertainty as pre-Big
Data analytics. Machines can
tell me all I need to know!
• Give me the answer I want, as supporting
data for something I’ve already decided
• We owe our organizations objective analysis,
based on science, not wishful thinking
SKU Rationalization & Project Impact
• Reduced “SKUs” (items)
by about 15%
• Removed displays from
“Action Alley”
• Widely blamed for at
least 8 consecutive
quarters of comp sales
decline
Sources
Nassim Taleb - “Beware the Big Errors of Big Data”
http://www.wired.com/2013/02/big-data-means-big-errors-people
Spurious Correlations
http://www.tylervigen.com/spurious-correlations
Big Data: A Revolution that Will Transform How We Live, Work, and Think
Viktor Mayer-Schönberger, Kenneth Cukier ISBN-10: 0544227751
The Parable of Google Flu: Traps in Big Data Analysis
http://scholar.harvard.edu/files/gking/files/0314policyforumff.pdf
Google Flu Trends Failure Shows Good Data > Big Data
https://hbr.org/2014/03/google-flu-trends-failure-shows-good-data-big-data/
Sources - Continued
When Google Got Flu Wrong
http://www.nature.com/news/when-google-got-flu-wrong-1.12413
What the Failed $1M Netflix Prize Says About Business Advice
http://www.forbes.com/sites/ryanholiday/2012/04/16/what-the-failed-1m-netflix-prize-tells-us-about-business-advice/#2715e4857a0b3808da747757
Business Model Lessons From Walmart’s SKU Reductions
http://businessmodelinstitute.com/business-model-lessons-from-wal-mart-sku-reductions/
Walmart Lost Billions by Listening to Customers
http://www.thecmosite.com/author.asp?section_id=1200&doc_id=205973
Contact Information
http://linkedin.com/in/SterlingPrice