CAS Predictive Modeling Seminar
Evaluating Predictive Models
Glenn Meyers
ISO Innovative Analytics
October 5, 2006
Choosing Models
• Predicting losses for individual insurance policies involves:
  – Millions of policy records
  – Hundreds (or thousands) of variables
• There are a number of models that provide good predictions
  – GLM, GAM, CART, MARS, Neural Nets, etc.
• Business objectives influence choice of model
The Modeling Process
• Modeling process involves dimension reduction techniques
  – Clustering, Principal Components, Factor Analysis
  – Building submodels and using predicted values as input into a higher level model
• The modeling cycle
  1. Build model with training data
  2. Evaluate model with test data
  3. Identify improvements in models and data
  4. Go back to Step 1
Hidden Parameters
• Classic model building methods correct for the number of parameters using “degrees of freedom.”
• The model exploration process “eats up degrees of freedom” in ways that cannot be captured by formal model adjustments.
• In essence the “test” data gets merged into the “training” data.
What Is Significant?
• Statistical packages will often identify improvements that are “statistically significant” but not “practically significant.”
• This talk is about determining when a model identifies “practically significant” improvements.
• Illustrate how to do this on a real example.
The Example
A Personal Auto Model Under Development
Preliminary Results
• Input – Address of insured vehicle
• Output – Address Specific Loss Cost
  – 30 year old, single car with no SDIP points
  – 500 deductible or 25/50/25 policy limits
  – Symbol 8, model year 2006
  – etc.
• Model derived from over 1,200 variables reflecting weather, traffic, demographic, topographical and economic conditions.
Difference Between Address Specific and ISO Territory Loss Cost
Differences Abound
Some Questions to Ask
• Can the model output be used to improve insurer underwriting results?
• Are the results statistically significant?
Define ELI
$$\text{Expected Loss Index (ELI)} = \frac{\text{Address Specific Loss Cost}}{\text{ISO Territory Loss Cost}}$$
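The ratio above can be sketched in a few lines. This is only an illustration of the definition: the function name and the dollar amounts are hypothetical and do not come from the talk.

```python
def expected_loss_index(address_loss_cost: float, territory_loss_cost: float) -> float:
    """ELI = address-specific loss cost / ISO territory loss cost."""
    if territory_loss_cost <= 0:
        raise ValueError("territory loss cost must be positive")
    return address_loss_cost / territory_loss_cost


# Hypothetical loss costs, for illustration only.
eli = expected_loss_index(450.0, 600.0)
print(f"ELI = {eli:.2f}")
```

An ELI below 1 means the address-specific model expects lower losses than the territory loss cost implies.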
Use Expected Loss Index for Risk Selection
Expected Loss Index      Loss Ratio %
Less than 75%            69.7
Between 75 and 100%      85.8
Between 100 and 125%     109.7
Greater than 125%        159.5
Denominator = Full ISO Loss Cost
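A table like this could be produced by grouping policy records into ELI bands and dividing total loss by total full ISO loss cost within each band. The sketch below assumes a hypothetical record layout of (ELI, ISO loss cost, actual loss), and the handling of the exact band boundaries is an assumption, since the slide does not say which band a value of exactly 75% falls in.

```python
from collections import defaultdict

BANDS = [
    "Less than 75%",
    "Between 75 and 100%",
    "Between 100 and 125%",
    "Greater than 125%",
]


def eli_band(eli):
    """Assign an ELI value to a band (boundary handling is an assumption)."""
    if eli < 0.75:
        return BANDS[0]
    if eli < 1.00:
        return BANDS[1]
    if eli < 1.25:
        return BANDS[2]
    return BANDS[3]


def loss_ratio_by_band(records):
    """records: iterable of (eli, iso_loss_cost, loss). Returns loss ratio % per band."""
    totals = defaultdict(lambda: [0.0, 0.0])  # band -> [total loss, total ISO loss cost]
    for eli, iso_loss_cost, loss in records:
        band_totals = totals[eli_band(eli)]
        band_totals[0] += loss
        band_totals[1] += iso_loss_cost
    return {
        band: 100.0 * loss / cost
        for band, (loss, cost) in totals.items()
        if cost > 0
    }


# Hypothetical records, for illustration only.
records = [(0.6, 100.0, 50.0), (0.7, 100.0, 90.0), (0.9, 200.0, 180.0), (1.3, 100.0, 160.0)]
ratios = loss_ratio_by_band(records)
for band in BANDS:
    if band in ratios:
        print(f"{band:<22} {ratios[band]:6.1f}")
```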
Propose a Standard Way of Evaluating Lift – The Gini Index
• Originally proposed by Corrado Gini in 1912
• Most often used to measure income and/or wealth inequality– Search for “Gini” in wikipedia.org
• In insurance underwriting, we want to evaluate systematic methods of finding “loss” inequality.
Gini Index
• Look at set of policy records below cutoff point, ELI < 1.
• This set of records accounts for 59% of total ISO (full) loss cost.
• This set of records accounts for 48% of total loss.
• 1 − 48/59 → 19% reduction in loss ratio.
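The arithmetic behind the 19% figure can be spelled out as follows. The selected records carry 48% of the loss against 59% of the loss cost, so their loss ratio is 48/59 of the overall loss ratio. The variable names are illustrative.

```python
share_of_loss_cost = 0.59  # share of total ISO (full) loss cost with ELI < 1
share_of_loss = 0.48       # share of total loss for the same records

# Loss ratio of the selected records relative to the loss ratio of the whole book.
relative_loss_ratio = share_of_loss / share_of_loss_cost

# Reduction in loss ratio from writing only the selected records.
reduction = 1 - relative_loss_ratio
print(f"Reduction in loss ratio: {reduction:.1%}")
```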
Gini Index
• Do this calculation for other cutoff points.
• The results make up what we call the Lorenz curve.
Gini Index
• If ELI is random, the Lorenz curve will be on the diagonal line.
• The Gini index is the percentage of the area under the “random” line that is above the Lorenz curve.
• Higher Gini means better predictive model.
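One way to carry out this calculation is sketched below. It assumes records are sorted by ELI, that the horizontal axis is the cumulative share of ISO loss cost and the vertical axis is the cumulative share of loss (matching the 59% and 48% cutoff example), and that areas are computed with the trapezoid rule. The record layout and function names are hypothetical.

```python
def lorenz_curve(records):
    """records: iterable of (eli, iso_loss_cost, loss).

    Returns points (cumulative share of loss cost, cumulative share of loss),
    with records taken in order of increasing ELI.
    """
    ordered = sorted(records, key=lambda r: r[0])
    total_cost = sum(r[1] for r in ordered)
    total_loss = sum(r[2] for r in ordered)
    points = [(0.0, 0.0)]
    cum_cost = cum_loss = 0.0
    for _, cost, loss in ordered:
        cum_cost += cost
        cum_loss += loss
        points.append((cum_cost / total_cost, cum_loss / total_loss))
    return points


def gini_index(records):
    """Share of the area under the diagonal that lies above the Lorenz curve."""
    pts = lorenz_curve(records)
    area_under_lorenz = sum(
        (x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:])
    )
    area_under_diagonal = 0.5
    return (area_under_diagonal - area_under_lorenz) / area_under_diagonal


# Hypothetical records, for illustration only.
records = [(0.5, 100.0, 20.0), (1.0, 100.0, 80.0), (1.5, 100.0, 200.0)]
gini = gini_index(records)
print(f"Gini index: {gini:.1%}")
```

If losses are exactly proportional to loss cost regardless of ELI, the Lorenz curve sits on the diagonal and this function returns 0.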
A Gini Index Thought Experiment
• If we had the ability to predict who will have losses, what would the Gini index be?
• It would be 100% if only one risk had all the losses.
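The thought experiment can be checked numerically. The sketch below assumes n policies with equal loss cost, a hypothetical perfect score that ranks the single loss-bearing risk last, and the trapezoid-rule Gini calculation described earlier. Under these assumptions the index is 1 − 1/n, which approaches 100% as n grows.

```python
def gini_index(records):
    """Gini index from (score, loss_cost, loss) records, ordered by score."""
    ordered = sorted(records, key=lambda r: r[0])
    total_cost = sum(r[1] for r in ordered)
    total_loss = sum(r[2] for r in ordered)
    area = 0.0
    cum_loss = 0.0
    for _, cost, loss in ordered:
        prev_share = cum_loss / total_loss
        cum_loss += loss
        # Trapezoid between consecutive points on the Lorenz curve.
        area += (cost / total_cost) * (prev_share + cum_loss / total_loss) / 2
    return 1.0 - 2.0 * area


# Hypothetical portfolio: only the highest-scored risk has a loss.
n = 1000
records = [(score, 1.0, 0.0) for score in range(n - 1)] + [(n - 1, 1.0, 1.0)]
perfect_gini = gini_index(records)
print(f"Gini index with one loss among {n} risks: {perfect_gini:.1%}")
```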
Bodily Injury
Property Damage
Collision
Statistical Significance
• How much random fluctuation is in the Gini index calculation?
• Use bootstrapping to evaluate
  – Take a random sample of records, with replacement.
  – Calculate Gini index for the sample.
  – Repeat 250 times.
• Plot a histogram of the results.
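The bootstrap steps above can be sketched as follows. The data here are simulated and purely hypothetical, the record layout (ELI, ISO loss cost, loss) is an assumption, and the resample size is taken to equal the number of records, which the slide does not specify. The resulting list of 250 values is what one would pass to a histogram routine.

```python
import random
import statistics


def gini_index(records):
    """Gini index from (eli, iso_loss_cost, loss) records, ordered by ELI."""
    ordered = sorted(records, key=lambda r: r[0])
    total_cost = sum(r[1] for r in ordered)
    total_loss = sum(r[2] for r in ordered)
    area = 0.0
    cum_loss = 0.0
    for _, cost, loss in ordered:
        prev_share = cum_loss / total_loss
        cum_loss += loss
        area += (cost / total_cost) * (prev_share + cum_loss / total_loss) / 2
    return 1.0 - 2.0 * area


def bootstrap_gini(records, n_reps=250, seed=0):
    """Resample records with replacement and recompute the Gini index each time."""
    rng = random.Random(seed)
    return [
        gini_index(rng.choices(records, k=len(records)))
        for _ in range(n_reps)
    ]


# Simulated records, for illustration only: losses loosely track ELI.
data_rng = random.Random(1)
records = []
for _ in range(500):
    eli = data_rng.uniform(0.5, 1.5)
    cost = 100.0
    loss = cost * eli * data_rng.uniform(0.5, 1.5)
    records.append((eli, cost, loss))

gini_samples = bootstrap_gini(records)
print(f"mean = {statistics.mean(gini_samples):.3f}, "
      f"std dev = {statistics.stdev(gini_samples):.3f}")
```

The spread of `gini_samples` gives a rough sense of how much of an observed Gini index could be random fluctuation.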
Bootstrap Results
Summary
• Standard tests of statistical significance are suspect.
  – Informal model selection process
  – Statistical/Practical significance
• Propose Gini index as a test of practical significance.
• Divide data into three samples
  1. Training – Used to fit models
  2. Test – Used to evaluate fits
  3. Holdout – “Final” evaluation
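A three-way split of this kind might be sketched as below. The 60/20/20 proportions, the random shuffle, and the function name are assumptions for illustration; the talk does not state how the samples were drawn or sized.

```python
import random


def three_way_split(records, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle records and split them into training, test, and holdout samples."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(fractions[0] * len(shuffled))
    n_test = int(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    holdout = shuffled[n_train + n_test:]  # remainder, set aside until the end
    return train, test, holdout


# Hypothetical record identifiers, for illustration only.
train, test, holdout = three_way_split(range(1000))
print(len(train), len(test), len(holdout))
```

The holdout sample is meant to stay untouched during the modeling cycle, so that repeated use of the test sample during model exploration does not contaminate the final evaluation.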
R2