CAS Predictive Modeling Seminar Evaluating Predictive Models

Glenn Meyers
ISO Innovative Analytics

October 5, 2006

Choosing Models

• Predicting losses for individual insurance policies involves:
  – Millions of policy records
  – Hundreds (or thousands) of variables

• There are a number of models that provide good predictions
  – GLM, GAM, CART, MARS, Neural Nets, etc.

• Business objectives influence choice of model

The Modeling Process

• Modeling process involves dimension reduction techniques
  – Clustering, Principal Components, Factor Analysis
  – Building submodels and using predicted values as input into a higher level model

• The modeling cycle
  1. Build model with training data
  2. Evaluate model with test data
  3. Identify improvements in models and data
  4. Go back to Step 1

Hidden Parameters

• Classic model building methods correct for the number of parameters using “degrees of freedom.”

• The model exploration process “eats up degrees of freedom” in ways that cannot be captured by formal model adjustments.

• In essence the “test” data gets merged into the “training” data.

What Is Significant?

• Statistical packages will often identify improvements that are “statistically significant” but not “practically significant.”

• This talk is about determining when a model identifies “practically significant” improvements.

• Illustrate how to do this on a real example.

The Example
A Personal Auto Model Under Development

Preliminary Results

• Input – Address of insured vehicle

• Output – Address Specific Loss Cost
  – 30 year old, single car with no SDIP points
  – 500 deductible or 25/50/25 policy limits
  – Symbol 8, model year 2006
  – etc.

• Model derived from over 1,200 variables reflecting weather, traffic, demographic, topographical and economic conditions.

Difference Between Address Specific and ISO Territory Loss Cost

Differences Abound
Some Questions to Ask

• Can the model output be used to improve insurer underwriting results?

• Are the results statistically significant?

Define ELI

Expected Loss Index (ELI) = Address Specific Loss Cost / ISO Territory Loss Cost

Use Expected Loss Index for Risk Selection

Expected Loss Index        Loss Ratio %
Less than 75%              69.7
Between 75 and 100%        85.8
Between 100 and 125%       109.7
Greater than 125%          159.5

Denominator = Full ISO Loss Cost
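
As a rough sketch of how a table like the one above could be produced, the snippet below groups policy records into the four ELI bands and divides actual loss by full ISO loss cost within each band. The record layout `(eli, full_iso_loss_cost, actual_loss)`, the function name, and the sample records are hypothetical. How an ELI that falls exactly on a band edge is assigned is also an assumption.

```python
# Hypothetical record layout: (eli, full_iso_loss_cost, actual_loss).
# Band edges follow the table above; assigning an exact boundary value
# (e.g. ELI = 0.75) to the upper band is an assumption.
BANDS = [
    ("Less than 75%", 0.00, 0.75),
    ("Between 75 and 100%", 0.75, 1.00),
    ("Between 100 and 125%", 1.00, 1.25),
    ("Greater than 125%", 1.25, float("inf")),
]


def loss_ratios_by_band(records):
    """Return {band label: loss ratio in percent} for bands that have records."""
    cost = {label: 0.0 for label, _, _ in BANDS}
    loss = {label: 0.0 for label, _, _ in BANDS}
    for eli, full_loss_cost, actual_loss in records:
        for label, low, high in BANDS:
            if low <= eli < high:
                cost[label] += full_loss_cost
                loss[label] += actual_loss
                break
    # Loss ratio = actual loss / full ISO loss cost, as a percentage.
    return {
        label: 100.0 * loss[label] / cost[label]
        for label, _, _ in BANDS
        if cost[label] > 0
    }


# Made-up records, for illustration only.
sample = [
    (0.60, 100.0, 70.0),
    (0.70, 100.0, 60.0),
    (0.90, 200.0, 170.0),
    (1.10, 100.0, 110.0),
    (1.50, 50.0, 80.0),
]
for label, ratio in loss_ratios_by_band(sample).items():
    print(f"{label:<22}{ratio:6.1f}")
```

If the ELI carries useful information, the loss ratio should rise from the lowest band to the highest, as it does in the table.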

Propose a Standard Way of Evaluating Lift – The Gini Index

• Originally proposed by Corrado Gini in 1912

• Most often used to measure income and/or wealth inequality
  – Search for “Gini” in wikipedia.org

• In insurance underwriting, we want to evaluate systematic methods of finding “loss” inequality.

Gini Index

• Look at the set of policy records below the cutoff point, ELI < 1.

• This set of records accounts for 59% of total ISO (full) loss cost.

• This set of records accounts for 48% of total loss.

• 1 − 48/59 → 19% reduction in loss ratio.
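
The arithmetic behind the 19% figure can be sketched as follows. If the selected records carry 59% of the full loss cost and 48% of the loss, their loss ratio is 48/59 of the overall loss ratio. The function names and the record layout `(eli, full_iso_loss_cost, actual_loss)` are hypothetical.

```python
def shares_below_cutoff(records, cutoff=1.0):
    """Share of total full loss cost and of total loss from records with ELI < cutoff.

    records: iterable of (eli, full_iso_loss_cost, actual_loss) tuples
    (a hypothetical layout).
    """
    records = list(records)
    total_cost = sum(cost for _, cost, _ in records)
    total_loss = sum(loss for _, _, loss in records)
    kept_cost = sum(cost for eli, cost, _ in records if eli < cutoff)
    kept_loss = sum(loss for eli, _, loss in records if eli < cutoff)
    return kept_cost / total_cost, kept_loss / total_loss


def loss_ratio_reduction(cost_share, loss_share):
    """Relative drop in loss ratio from writing only the selected records."""
    return 1.0 - loss_share / cost_share


# The shares quoted above: 59% of full loss cost, 48% of loss.
print(f"{loss_ratio_reduction(0.59, 0.48):.1%}")  # 18.6%, rounded to 19% above
```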

Gini Index

• Do this calculation for other cutoff points.

• The results make up what we call the Lorenz curve.

Gini Index

• If ELI is random, the Lorenz curve will be on the diagonal line.

• The Gini index is the percentage of the area under the “random” line that is above the Lorenz curve.

• Higher Gini means better predictive model.
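
A minimal sketch of this calculation, under some assumptions: records are sorted by ELI in ascending order, the horizontal axis is the cumulative share of full ISO loss cost, the vertical axis is the cumulative share of loss, and the area under the Lorenz curve is approximated with the trapezoid rule. The record layout and function names are hypothetical.

```python
def lorenz_curve(records):
    """Return Lorenz curve points (cumulative cost share, cumulative loss share).

    records: iterable of (eli, full_iso_loss_cost, actual_loss) tuples
    (a hypothetical layout), sorted here by ascending ELI.
    """
    ordered = sorted(records, key=lambda r: r[0])
    total_cost = sum(r[1] for r in ordered)
    total_loss = sum(r[2] for r in ordered)
    points = [(0.0, 0.0)]
    cum_cost = cum_loss = 0.0
    for _, cost, loss in ordered:
        cum_cost += cost
        cum_loss += loss
        points.append((cum_cost / total_cost, cum_loss / total_loss))
    return points


def gini_index(records):
    """Share of the area under the diagonal that lies above the Lorenz curve."""
    points = lorenz_curve(records)
    area_under_lorenz = sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    area_under_diagonal = 0.5
    return (area_under_diagonal - area_under_lorenz) / area_under_diagonal


# Two made-up records with equal loss cost: the low-ELI record has a 50%
# loss ratio and the high-ELI record has a 150% loss ratio.
print(gini_index([(0.5, 100.0, 50.0), (1.5, 100.0, 150.0)]))  # 0.25
```

When every record has the same loss ratio, the curve lies on the diagonal and this function returns 0.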

A Gini Index Thought Experiment

• If we had the ability to predict who will have losses, what would the Gini index be?

• It would be 100% if only one risk had all the losses
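
One way to explore this thought experiment is to score each risk by its realized loss ratio, which stands in for a perfect predictor, and compute the Gini index on made-up portfolios. The sketch below assumes risks of equal size and a trapezoid-rule area. All names and figures are hypothetical.

```python
def gini_index(records):
    """Gini index for (score, full_loss_cost, actual_loss) records sorted by score."""
    ordered = sorted(records, key=lambda r: r[0])
    total_cost = sum(r[1] for r in ordered)
    total_loss = sum(r[2] for r in ordered)
    area = 0.0  # area under the Lorenz curve (trapezoid rule)
    x = y = 0.0
    for _, cost, loss in ordered:
        new_x = x + cost / total_cost
        new_y = y + loss / total_loss
        area += (new_x - x) * (y + new_y) / 2.0
        x, y = new_x, new_y
    return 1.0 - 2.0 * area


def perfect_foresight_gini(costs_and_losses):
    """Score each risk by its realized loss ratio, i.e. a perfect predictor."""
    scored = [(loss / cost, cost, loss) for cost, loss in costs_and_losses]
    return gini_index(scored)


# 1,000 equal-sized risks; only one of them has a loss.
one_claim = [(1.0, 0.0)] * 999 + [(1.0, 5000.0)]
# 1,000 equal-sized risks; 100 of them have the same loss.
hundred_claims = [(1.0, 0.0)] * 900 + [(1.0, 500.0)] * 100

print(round(perfect_foresight_gini(one_claim), 3))       # 0.999
print(round(perfect_foresight_gini(hundred_claims), 3))  # 0.9
```

In this toy setup, even a perfect predictor stays below 100% once losses are spread over more than one risk, so a realistic model's Gini index should be judged against an upper bound that is well below 100%.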

[Figures by coverage: Bodily Injury, Property Damage, Collision]

Statistical Significance

• How much random fluctuation is in the Gini index calculation?

• Use bootstrapping to evaluate
  – Take a random sample of records, with replacement.
  – Calculate Gini index for the sample.
  – Repeat 250 times.

• Plot a histogram of the results.
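
The bootstrap steps above can be sketched with the standard library alone. The Gini function, the record layout `(eli, full_iso_loss_cost, actual_loss)`, the synthetic data, and the text histogram are all illustrative assumptions, and each resample is assumed to have the same number of records as the original data.

```python
import random


def gini_index(records):
    """Gini index for (eli, full_loss_cost, actual_loss) records sorted by ELI."""
    ordered = sorted(records, key=lambda r: r[0])
    total_cost = sum(r[1] for r in ordered)
    total_loss = sum(r[2] for r in ordered)
    area = 0.0  # area under the Lorenz curve (trapezoid rule)
    x = y = 0.0
    for _, cost, loss in ordered:
        new_x = x + cost / total_cost
        new_y = y + loss / total_loss
        area += (new_x - x) * (y + new_y) / 2.0
        x, y = new_x, new_y
    return 1.0 - 2.0 * area


def bootstrap_gini(records, n_resamples=250, seed=0):
    """Gini index of n_resamples samples drawn with replacement from records."""
    rng = random.Random(seed)
    records = list(records)
    return [
        gini_index(rng.choices(records, k=len(records)))
        for _ in range(n_resamples)
    ]


def text_histogram(values, n_bins=10):
    """Return (left bin edge, count) pairs for a simple equal-width histogram."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins or 1.0
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - low) / width), n_bins - 1)] += 1
    return [(low + i * width, c) for i, c in enumerate(counts)]


# Synthetic records for illustration: loss rises with ELI, plus noise.
data_rng = random.Random(1)
data = []
for _ in range(500):
    eli = data_rng.uniform(0.5, 1.5)
    data.append((eli, 100.0, 100.0 * eli * data_rng.uniform(0.5, 1.5)))

for left_edge, count in text_histogram(bootstrap_gini(data)):
    print(f"{left_edge:6.3f} {'#' * count}")
```

The spread of the histogram gives a sense of how much of a difference in Gini index between two models could be random fluctuation.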

Bootstrap Results

Summary

• Standard tests of statistical significance are suspect.

  – Informal model selection process
  – Statistical/Practical significance

• Propose Gini index as a test of practical significance.

• Divide data into three samples
  1. Training – Used to fit models
  2. Test – Used to evaluate fits
  3. Holdout – “Final” evaluation
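
A minimal sketch of such a three-way split, assuming a simple random shuffle of the records. The 60/20/20 proportions, the function name, and the seed are illustrative choices, not taken from the talk.

```python
import random


def three_way_split(records, train_frac=0.6, test_frac=0.2, seed=0):
    """Shuffle records and split them into (training, test, holdout) lists.

    The 60/20/20 proportions are illustrative defaults, not from the talk.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    n_test = round(test_frac * len(shuffled))
    training = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    holdout = shuffled[n_train + n_test:]  # whatever remains
    return training, test, holdout


training, test, holdout = three_way_split(range(1000))
print(len(training), len(test), len(holdout))  # 600 200 200
```

The holdout sample is meant to be scored only once, at the end. Repeated evaluation against the test sample during model exploration is what merges the test data into the training data.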
