NachumShacham, PhD Director, Data Science PayPal · Constructing Operational Big-Data Models...
Transcript of NachumShacham, PhD Director, Data Science PayPal · Constructing Operational Big-Data Models...
5/26/16
1© 2014 PayPal Inc. All rights reserved. .
© 2015 PayPal Inc. All rights reserved..
Constructing Operational Big-Data Models
Nachum Shacham, PhDDirector, Data Science
PayPal
5/18/2016
Predictive Modeling With Big Data
2
Data Sourcing
Data Warehouse
Model(s) selection and tuning
Data Munging
Deploy
FeatureEngineering
Observational big-data
DataExploration
Validation
5/26/16
2© 2014 PayPal Inc. All rights reserved. .
Tools & Infrastructure for Data and Modeling
3
Data: Collect, stream, store, process
Training large scale predictive models
Score millions of objects
Big Server
Open source cluster-based tools
Cloud interface: IAAS, PAAS, SAAS
Collaboration/model-sharing tools
GitHub
• Big predictive models are complex• Trained on historical data to score future cases•Hundreds/thousands of features•Millions of cases
• Machine-learning algorithms have been upgraded to operate efficiently on big data•Neural nets• Tree-based ensembles (Random Forest, GBM)•Clustering
• Do the model results work?
4
Modeling: Predicting Individual Users
5/26/16
3© 2014 PayPal Inc. All rights reserved. .
• Performance Metrics: R2, Error Rate(s)•Measure via cross validation•Based on historical data
• Cost of errors: False Positive, False Negative•Set by the users of the model
• Pre-production evaluation requires trust•A/B testing for selecting alternatives
5
How Good Is Your Model
• Model results: interpretable or black box• Can score reasoning be articulated
• Identify champion(s) who believe in the model• Incentivize adoption of the model• Avoid impression of positional bias • Integrate gradually to prove results on a gradually increasing scale • Repeatedly test and adjust• Measure the benefits of actions based on the model
6
Gain a Trust in the Model
5/26/16
4© 2014 PayPal Inc. All rights reserved. .
THANK YOU
7