Post on 28-Jul-2015
Data Science: A Mindset for Productivity
Daniel Tunkelang
@dtunkelang
Daniel
tl;dr
The most important part of data science is picking the right problem and figuring out how to frame it.
We’re all technologists, right?
But nobody knows everything.*
Class HashMap<K,V>
java.lang.Object
java.util.AbstractMap<K,V>
java.util.HashMap<K,V>
Type Parameters:
K - the type of keys maintained by this map
V - the type of mapped values
All Implemented Interfaces: Serializable, Cloneable, Map<K,V>
*Except Jeff Dean.
Math and computer science matter…
But you have to solve the right problem.
Stay friends with your exes.
explain
express
experiment
Data science is a mindset.
Explain: Iterate using explainable models.
Express: Model your utility and inputs.
Experiment: Optimize for speed of learning.
Explain
With apologies to the little prince.
Deep learning is the new black.
But accuracy isn’t everything.
The importance of being explainable.
• Algorithms can protect you from overfitting, but they can’t protect you from the biases you introduce.
• Introspection into your models and features makes it easier for you and others to debug them.
• This matters especially if you don’t completely trust your objective function or the representativeness of your training data.
Linear models? Decision trees?
• Linear regression and decision trees favor explainability over accuracy, compared to more sophisticated models.
• But size matters. If you have too many features or too deep a decision tree, you lose explainability.
• You can always upgrade to a more sophisticated model when you trust your objective function and training data.
• Building a machine learning model is an iterative process. Optimize for the speed of your own learning.
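To make "explainable" concrete, here is a minimal sketch of about the smallest decision tree possible: a depth-1 "stump" whose learned rule can be read as one sentence. The feature names (query_length, num_results), the data, and the helper functions are made up for illustration and do not come from the talk.

```python
# Minimal sketch: a depth-1 decision tree ("stump") with a human-readable rule.
# Feature names and data are hypothetical, chosen only for illustration.

def fit_stump(rows, labels, feature_names):
    """Return the (feature, threshold) split with the fewest training errors.

    The rule predicts 1 when row[feature] > threshold, else 0.
    Only this one direction is tried, to keep the sketch short.
    """
    best = None
    for j, name in enumerate(feature_names):
        for threshold in sorted({row[j] for row in rows}):
            errors = sum(
                (1 if row[j] > threshold else 0) != y
                for row, y in zip(rows, labels)
            )
            if best is None or errors < best["errors"]:
                best = {"feature": name, "index": j,
                        "threshold": threshold, "errors": errors}
    return best


def explain(stump):
    """Render the learned rule as a sentence a reviewer can sanity-check."""
    return f"predict 1 if {stump['feature']} > {stump['threshold']} else 0"


feature_names = ["query_length", "num_results"]
rows = [(1, 50), (2, 40), (3, 5), (4, 60), (5, 3), (6, 45)]
labels = [0, 0, 0, 1, 1, 1]

stump = fit_stump(rows, labels, feature_names)
print(explain(stump), "| training errors:", stump["errors"])
```

Because the whole model is one threshold on one feature, anyone can check whether the rule makes sense before trusting it. A deeper tree or a longer feature list would fit more data but be harder to read, which is the trade-off described above.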
Express
Machine learning for dummies.
• Define objective function.
• Collect training data.
• Build models.
• Profit!
You only improve what you measure.
Clicks?
Actions?
Outcomes?
Sometimes accuracy is complicated.
What’s your error function?
Consider stratified sampling.
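One way to read the stratified sampling suggestion: when some classes of examples are rare, a uniform random sample can miss them entirely, so you sample within each class instead. The sketch below assumes a hypothetical query log split into "head" and "tail" queries. The function name, the strata, and the sampling fraction are all illustrative choices, not details from the talk.

```python
import random
from collections import defaultdict


def stratified_sample(items, key, fraction, seed=0):
    """Sample the same fraction from each stratum, keeping at least one item per stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical query log: 95 frequent ("head") queries and 5 rare ("tail") queries.
queries = [("head", i) for i in range(95)] + [("tail", i) for i in range(5)]
sample = stratified_sample(queries, key=lambda q: q[0], fraction=0.2)

counts = defaultdict(int)
for stratum, _ in sample:
    counts[stratum] += 1
print(dict(counts))
```

With these inputs every stratum appears in the sample, so an error measured on it reflects the rare class as well as the common one.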
Experiment
How to find your prince.
You have to kiss a lot of frogs to find one prince. So how can you find your prince faster? By finding more frogs and kissing them faster and faster.
-- Mike Moran
Think like an economist.
Yesterday
Experiments are expensive,
choose hypotheses wisely.
Today
Experiments are cheap,
do as many as you can!
But don’t forget you’re a scientist.
Optimize for the speed of learning.
Test one variable at a time.
• Autocomplete
• Entity Tagging
• Vertical Intent
• # of Suggestions
• Suggestion Order
• Language
• Query Construction
• Ranking Model
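As a rough sketch of what testing one variable at a time can look like in practice, the code below first checks that a variant differs from the control in exactly one setting, then compares the two conversion rates with a standard two-proportion z-test. The configuration keys, their values, and the traffic numbers are hypothetical, and the z-test is one common choice rather than the method used in the talk.

```python
import math


def changed_variables(control, variant):
    """List the settings whose values differ between two experiment arms."""
    keys = control.keys() | variant.keys()
    return sorted(k for k in keys if control.get(k) != variant.get(k))


def two_proportion_z_test(successes_a, trials_a, successes_b, trials_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    p_a = successes_a / trials_a
    p_b = successes_b / trials_b
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical arms: only the number of suggestions changes.
control = {"autocomplete": "on", "num_suggestions": 5, "ranking_model": "v1"}
variant = {**control, "num_suggestions": 8}

changed = changed_variables(control, variant)
assert len(changed) == 1, f"expected one changed variable, got {changed}"

# Hypothetical results: conversions out of sessions in each arm.
z, p_value = two_proportion_z_test(1000, 10000, 1100, 10000)
print(f"changed: {changed[0]}, z = {z:.2f}, p = {p_value:.3f}")
```

Keeping the arms identical except for one setting means any measured difference can be attributed to that setting, which is what keeps a cheap experiment informative.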
tl;dr
The most important part of data science is picking the right problem and figuring out how to frame it.
Daniel Tunkelang
dtunkelang@gmail.com
https://linkedin.com/in/dtunkelang
@dtunkelang