Transcript of Session 29: Opening the Black Box: Understanding Complex Models

Page 1

2018 Predictive Analytics Symposium

Session 29: Opening the Black Box: Understanding Complex Models

SOA Antitrust Compliance Guidelines
SOA Presentation Disclaimer

Page 2

Opening the Black Box: Understanding Complex Models
Session 29
September 2018 – Predictive Analytics Symposium

Michael Niemerg, FSA, MAAA
Predictive Modeling Manager, Milliman
[email protected]

Page 3

SOA Antitrust Compliance GuidelinesActive participation in the Society of Actuaries is an important aspect of membership. While the positive contributions of professional societies and associations are well-recognized and encouraged, association activities are vulnerable to close antitrust scrutiny. By their very nature, associations bring together industry competitors and other market participants.

The United States antitrust laws aim to protect consumers by preserving the free economy and prohibiting anti-competitive business practices; they promote competition. There are both state and federal antitrust laws, although state antitrust laws closely follow federal law. The Sherman Act is the primary U.S. antitrust law pertaining to association activities. The Sherman Act prohibits every contract, combination or conspiracy that places an unreasonable restraint on trade. There are, however, some activities that are illegal under all circumstances, such as price fixing, market allocation and collusive bidding.

There is no safe harbor under the antitrust law for professional association activities. Therefore, association meeting participants should refrain from discussing any activity that could potentially be construed as having an anti-competitive effect. Discussions relating to product or service pricing, market allocations, membership restrictions, product standardization or other conditions on trade could arguably be perceived as a restraint on trade and may expose the SOA and its members to antitrust enforcement procedures.

While participating in all SOA in-person meetings, webinars, teleconferences or side discussions, you should avoid discussing competitively sensitive information with competitors and follow these guidelines:

• Do not discuss prices for services or products or anything else that might affect prices.
• Do not discuss what you or other entities plan to do in particular geographic or product markets or with particular customers.
• Do not speak on behalf of the SOA or any of its committees unless specifically authorized to do so.
• Do leave a meeting where any anticompetitive pricing or market allocation discussion occurs.
• Do alert SOA staff and/or legal counsel to any concerning discussions.
• Do consult with legal counsel before raising any matter or making a statement that may involve competitively sensitive information.

Adherence to these guidelines involves not only avoidance of antitrust violations, but avoidance of behavior which might be so construed. These guidelines only provide an overview of prohibited activities. SOA legal counsel reviews meeting agendas and materials as deemed appropriate, and any discussion that departs from the formal agenda should be scrutinized carefully. Antitrust compliance is everyone's responsibility; however, please seek legal counsel if you have any questions or concerns.

2

Page 4

Limitations

3

This presentation is intended for informational purposes only. It reflects the opinions of the presenter, and does not represent any formal views held by Milliman, Inc. Milliman makes no representations or warranties regarding the contents of this presentation. Milliman does not intend to benefit or create a legal duty to any recipient of this presentation.

Page 5

What is interpretability?

• How does the algorithm construct the model?
• What features most influence the model's predictions, and by how much?
• Does the relationship between each predictor and the response make sense?
• How does the model extrapolate and interpolate?
• Why did the model make a specific prediction?
• How confident / sensitive is the model?
• Is the model equitable? Is it discriminatory?

4

Warning: Contents hard to interpret

Page 6

What makes interpretability difficult?

• Algorithmic complexity
• High dimensionality
• Interactions and correlation
• Nonlinear relationships
• Omitted variables
• Noise variables

5

Page 7

Interpretability Issues

• A priori vs. post hoc
  • Choosing an interpretable model form vs. using a "black-box"
• Global vs. local
  • Does the interpretation explain something about the entire model (global) or only a particular instance (local)?
• Model-specific vs. model-agnostic
  • Is the interpretation method only applicable to certain algorithms, or can it be applied to any algorithm?
• Interpretability vs. performance
  • Simpler models are often more interpretable, at the expense of model performance

6

Page 8

Interpretability vs. Performance

7

Page 9

Some of the worst culprits…

• Random Forests
• Gradient Boosted Decision Trees
• Neural Networks
• Ensembles

8

Page 10

Methods

Model Agnostic
• Partial Dependence Plots
• ICE Plots
• Variable Importance
• Local Interpretable Model-agnostic Explanations (LIME)
• Visualization: t-SNE / MDS / PCA
• Surrogate Models
• Sensitivity Analysis
• Shapley Predictions
• Rule Extraction

Neural Networks
• Saliency Masks
• Activation Maximization
• Relevance Propagation

Gradient Boosting
• XGBFI
• xgboostExplainer
• Monotonicity constraints

9

Page 11

Software

• iml (R)
• LIME (R / Python)
• Skater (Python)
• XGBFI (R - xgboost)
• xgboostExplainer (R - xgboost)
• DALEX (R)
• H2O Driverless AI
• Aequitas
• Themis ML (Python)

10

Page 12

Creating more interpretable models


• Simpler methods
  • Decision trees
  • Linear models
• Monotonicity constraints
• Higher regularization
• Fewer parameters
• Shallower tree models (see the xgboost sketch below)
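As a minimal sketch of the last two ideas with xgboost in R — assuming a feature data frame X and target y; the parameter values are illustrative, not recommendations:

```r
library(xgboost)

# Assumed: X holds the numeric-coded features, y the claim amount target
dtrain <- xgb.DMatrix(data = data.matrix(X), label = y)

params <- list(
  objective = "reg:linear", # squared-error regression (pre-1.0 alias)
  max_depth = 2,            # shallower trees: fewer, simpler interactions
  lambda    = 10,           # stronger L2 regularization on leaf weights
  alpha     = 1             # L1 regularization pushes weights toward zero
)
bst <- xgb.train(params = params, data = dtrain, nrounds = 100)
```

Depth-2 trees allow at most two-way interactions, which keeps each individual tree readable.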

Page 13

dataCar

• Library: insuranceData
• Target:
  • claimcst0 – claim amount
• Features:
  • veh_value – vehicle value, in $10,000s
  • veh_body – vehicle body type
  • veh_age – age of vehicle: 1, 2, 3, 4
  • gender – gender of driver: M, F
  • area – driver's area of residence: A, B, C, D, E, F
  • agecat – driver's age category: 1, 2, 3, 4, 5, 6

Let’s meet our data…

12

http://www.businessandeconomics.mq.edu.au/our_departments/Applied_Finance_and_Actuarial_Studies/research/books/GLMsforInsuranceData/data_sets
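In R the data set loads directly from the insuranceData package; a minimal sketch:

```r
# Load the dataCar data set from the insuranceData package
library(insuranceData)
data(dataCar)

str(dataCar)                # feature types and levels
summary(dataCar$claimcst0)  # distribution of the claim amount target
```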

Page 14

PDP and ICE Plots

13

Warning: Contents hard to interpret

Partial Dependence Plot (PDP)
• Displays the marginal impact of a feature on the model – what's happening with "all else equal"
• Shows the relationship between the target and the feature on average
• Fix 1 or 2 predictors at multiple values of interest
• Average over the other variables
• Plot the response

Individual Conditional Expectation (ICE)
• Shows how a single prediction changes when the value of a single feature is varied
• Run this for multiple predictions and plot the results (see the iml sketch below)
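A minimal sketch with the iml package (listed on the Software slide), assuming a fitted model bst and the dataCar features in a data frame X with target y; iml may need a custom predict function for some model classes:

```r
library(iml)

# Wrap the fitted model so iml can query it
# (supply a custom predict function if the default dispatch fails)
predictor <- Predictor$new(bst, data = X, y = y)

# PDP and ICE curves for vehicle value in a single plot
effect <- FeatureEffect$new(predictor, feature = "veh_value", method = "pdp+ice")
plot(effect)
```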

Page 15

PDP and ICE Plots - Visualized

14

Warning: Contents hard to interpret

[Figure: PDP and ICE curves of claim amount against veh_value – XGBoost model (left) vs. neural network (right)]

Page 16

PDP with 2 features - Visualized

15

[Figure: two-feature PDP surface of claim amount over veh_value and veh_age]

Page 17

Surrogate Model

16

Warning: Contents hard to interpret

• A model trained using another model's predictions as its target
  • Decision tree
  • Linear model
• Result is a simpler model that can help interpret the more complex model (see the rpart sketch below)
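A minimal sketch of a global tree surrogate, assuming a fitted black-box booster bst and a feature data frame X; rpart stands in for any interpretable learner:

```r
library(rpart)
library(rpart.plot)

# Score the training data with the black-box model
bb_pred <- predict(bst, newdata = data.matrix(X))

# Fit a shallow decision tree with the black-box predictions as its target
surrogate <- rpart(bb_pred ~ ., data = X, maxdepth = 3)
rpart.plot(surrogate)
```

How closely the tree reproduces the black-box predictions (e.g., R-squared between the two) indicates how much to trust the surrogate's story.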

Page 18

Surrogate Model - Visualized

17

Warning: Contents hard to interpret

Page 19

Feature Importance

18

Warning: Contents hard to interpret

• Measures how much a feature contributes to the predictive performance of the model
• Helps us know what drives predictions at a global level
• Common methods
  • Permute a feature and measure the change in model error (sketch below)
  • LOCO (Leave One Covariate Out): build the model with and without the feature and compare the difference in error
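A minimal sketch of permutation importance via iml, under the same assumptions as the earlier PDP sketch (fitted model bst, features X, target y):

```r
library(iml)

predictor <- Predictor$new(bst, data = X, y = y)

# Shuffle each feature in turn and measure the increase in error;
# loss = "mae" is one reasonable choice for a skewed claim amount target
imp <- FeatureImp$new(predictor, loss = "mae")
plot(imp)
```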

Page 20

Feature Importance - Visualized

19

Warning: Contents hard to interpret

[Figure: permutation feature importance – features on the y-axis ranked by importance]

Page 21

Shapley Predictions

20

Warning: Contents hard to interpret

• Provides a measure of local feature contribution for a given prediction
• Basis in game theory
  • Assigns "payout" to players in proportion to marginal contribution
  • "Game" is the prediction of an observation (see the iml sketch below)
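A minimal sketch using iml's Shapley implementation, again assuming a wrapped model; row 1 is an arbitrary observation to explain:

```r
library(iml)

predictor <- Predictor$new(bst, data = X, y = y)

# Decompose one prediction into additive feature contributions
shap <- Shapley$new(predictor, x.interest = X[1, ])
plot(shap)
```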

Page 22

Shapley Visualization

21

Warning: Contents hard to interpret

[Figure: Shapley decomposition of a single prediction – contribution (x-axis) by feature and feature value (y-axis)]

Page 23

Local Surrogate Models (LIME)

22

Warning: Contents hard to interpret

Algorithm
• Choose an instance to explain
• Permute the instance to create replicated feature data
• Weight the permuted instances by their proximity to the original
• Apply the "black-box" machine learning model to predict outcomes for the permuted data
• Fit a simple model explaining the complex model's outcome, using the selected features from the permuted data weighted by similarity to the original observation
• Explain predictions using this simpler model (see the lime sketch below)
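A minimal sketch with the lime R package; it assumes training features X_train, a fitted model bst of a class lime supports, and a few rows X_test to explain:

```r
library(lime)

# Build an explainer from the training data and the fitted model
explainer <- lime(X_train, bst)

# Fit sparse local models around three individual predictions
explanation <- explain(X_test[1:3, ], explainer, n_features = 4)
plot_features(explanation)
```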

Page 24

LIME - Visualized

23

Warning: Contents hard to interpret

Page 25

Sensitivity Analysis

24

Warning: Contents hard to interpret

• Thoroughly test the model for changes in response to small perturbations of the features (sketch below)

• Use simulated data representing prototypes for different areas of interest
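A minimal generic sketch, perturbing one dataCar feature and watching the predictions move (same assumed bst and X as in the earlier sketches):

```r
# Nudge vehicle value up 10% and compare predictions
X_up <- X
X_up$veh_value <- X_up$veh_value * 1.10

delta <- predict(bst, newdata = data.matrix(X_up)) -
         predict(bst, newdata = data.matrix(X))
summary(delta)  # large or erratic shifts flag unstable regions of the model
```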

Page 26

XGBFI (XGBoost)

25

Warning: Contents hard to interpret

• Computes variable importance and interaction importance ("Gain")

• Shows the number of splits taken on a feature ("FScore") and the cut-points chosen

• & more!

Feature importance:

Feature     Gain           FScore
veh_value   4,259,983,149  1,911
area        1,211,945,038  878
veh_body    1,147,646,618  914
veh_age     1,088,228,059  709
agecat      806,955,407    610
gender      707,919,139    514

Interaction importance:

Interaction          Gain           FScore
veh_value|veh_value  5,970,120,855  1,198
veh_age|veh_value    1,562,875,549  252
agecat|veh_value     1,311,331,233  299
veh_body|veh_value   1,295,426,670  313
area|veh_value       1,100,576,093  327
gender|veh_value     880,025,508    245
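XGBFI parses a text dump of the trained booster rather than the in-memory model; a minimal sketch of producing that dump in R (file names are illustrative):

```r
library(xgboost)

# Feature map: tab-separated index, name, and type
# ("q" = quantitative; adjust per feature as needed)
writeLines(sprintf("%d\t%s\tq", seq_along(colnames(X)) - 1, colnames(X)),
           "fmap.txt")

# Tree dump with gain/cover statistics, which XGBFI reads
xgb.dump(bst, fname = "xgb.dump", fmap = "fmap.txt", with_stats = TRUE)
```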

Page 27

XGBFI (XGBoost)

26

Warning: Contents hard to interpret

[Table: cut-points and split counts per feature (veh_value, area, veh_body, veh_age, agecat, gender); e.g., gender is binary, so all 514 of its splits occur at 1.5, while veh_value is split at many distinct values]

Page 28

xgboostExplainer (XGBoost)

27

Warning: Contents hard to interpret

• Shows how each variable is locally contributing to a prediction
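A minimal sketch with the xgboostExplainer package; the call layout follows the package's README as I recall it, so treat the argument names as assumptions (bst is the trained booster, dtrain/dtest the xgb.DMatrix objects, X_test the raw test features):

```r
library(xgboostExplainer)

# Build the explainer once from the trained booster and training data
explainer <- buildExplainer(bst, dtrain, type = "regression")

# Waterfall chart: how each feature moves prediction no. 1 from the base value
showWaterfall(bst, explainer, dtest, data.matrix(X_test),
              idx = 1, type = "regression")
```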

Page 29

Monotonicity Constraints (XGBoost)

28

http://xgboost.readthedocs.io/en/latest/tutorials/monotonic.html

• Enforce a constraint on the model so that the predicted response can only increase (or only decrease) as a given feature increases (sketch below)
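In the R interface this is a single parameter; a minimal sketch constraining the response to be non-decreasing in the first feature and leaving the other five unconstrained (the ordering is assumed to line up with the dataCar feature columns):

```r
library(xgboost)

params <- list(
  objective = "reg:linear",
  # one entry per feature: 1 = increasing, -1 = decreasing, 0 = unconstrained
  monotone_constraints = c(1, 0, 0, 0, 0, 0)
)
bst_mono <- xgb.train(params = params, data = dtrain, nrounds = 100)
```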

Page 30

Activation Maximization (Neural Networks)

29

https://www.researchgate.net/figure/Figure-S4-a-Previous-state-of-the-art-activation-maximization-algorithms-produce_fig9_301845946

• Find a prototype input that maximally activates a given output (e.g., a class prediction)

Page 31

Saliency Masks (Neural Networks)

30

• Determine which parts of the input are driving the prediction

Page 32

Adversarial Examples

31

https://codewords.recurse.com/issues/five/why-do-neural-networks-think-a-panda-is-a-vulture

Page 33

Model Fairness

32

Does the model discriminate against any protected classes?
• Does the model directly incorporate protected classes?
• Can the model proxy protected classes through other variables in the model?
  • Determine whether the protected classes can be statistically predicted using other data attributes (e.g., logistic regression; see the sketch below)
  • Determine whether model outcomes are statistically different by protected class
• Change either the data or the predictions to increase model fairness
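A minimal sketch of the proxy check in plain R (not tied to any fairness library), assuming a data frame dat holding a 0/1 protected-class indicator protected alongside the other model features:

```r
# Can the remaining features predict the protected attribute?
proxy_fit <- glm(protected ~ ., data = dat, family = binomial)

# Discriminative power well above AUC 0.5 suggests proxy variables exist
library(pROC)
auc(roc(dat$protected, fitted(proxy_fit)))
```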

Page 34

Themis ML

33

“Themis ML is a Python library built on top of pandas and sklearn that implements fairness-aware machine learning algorithms” (https://github.com/cosmicBboy/themis-ml)

Page 35

Aequitas

34

“An open source bias audit toolkit for machine learning developers, analysts, and policymakers to audit machine learning models for discrimination and bias, and make informed and equitable decisions around developing and deploying predictive risk-assessment tools” (https://dsapp.uchicago.edu/aequitas/)

Page 36

Explainable Machine Learning Challenge

35

• Kaggle-like competition for model interpretability
• Collaboration between Google, FICO, and academics at Berkeley, Oxford, Imperial, UC Irvine, and MIT
• Task: use information in a Home Equity Line of Credit (HELOC) application to predict whether someone will repay their HELOC account within 2 years. This prediction is then used to decide whether the homeowner qualifies for a line of credit and, if so, how much credit should be extended.
• https://community.fico.com/s/explainable-machine-learning-challenge

Page 37

References

36

Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/

XGBoost documentation: http://xgboost.readthedocs.io/en/latest/

G. Montavon, W. Samek, and K.-R. Müller. Methods for interpreting and understanding deep neural networks. arXiv preprint arXiv:1706.07979, 2017.

Z. C. Lipton. The mythos of model interpretability. arXiv preprint arXiv:1606.03490, 2016.

F. Doshi-Velez and B. Kim. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608, 2017.

A. Nguyen, J. Yosinski, and J. Clune. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 427–436, 2015.

Page 38

Thank you