
Post on 02-Jan-2016


1

Does Credit Score Really Help Explain Insurance Losses?

Cheng-Sheng Peter Wu, FCAS, ASA, MAAA,

Jim Guszcza, ACAS, MAAA, Ph. D.

2

Themes

– The History
– What Does the Question Mean?
– Simpson’s Paradox – Need for Multivariate Analysis
– What Has Been Done So Far?
– Our Large-Scale Data Mining Experience
– Going Beyond Credit
– Conclusions

3

The History

Pricing/Class Plans
– Few factors before World War II
– Explosion of class plan factors after the War
– Current class plans (Auto) – territory, driver, vehicle, loss and violation, others, tiers/company, etc.
– Actuarial techniques – Minimum Bias & GLM

4

The History

Credit
– First important factor identified over the past 2 decades
– Composite multivariate score vs. raw credit information
– Introduced in late 80’s and early 90’s
– Viewed at first as a “secret weapon”
– Currently almost everyone is using it
– Industry scores vs. proprietary scores
– Quiet, confidential, controversial, black-box, etc.

5

What Does the Question Mean?

Can Credit Score Really “Explain” Insurance Losses?
– “X explains Y”
  – Weaker than claiming that X causes Y
  – Stronger than merely reporting that X is correlated with Y

6

What Does the Question Mean?

Working Definition
We say that “X helps explain Y” if:
– X is correlated with Y
– The correlation does not go away when other available, measurable information is introduced

7

What Does the Question Mean?

Intuition Behind the Definition
– It might be okay for X to be a proxy for a “true” cause of Y
  – Testosterone level might be a true cause of auto losses... but it’s not available
  – Age/Gender is a reasonable proxy
– It might not be okay for X to be a proxy for other available predictive information

8

What Does the Question Mean?

Applying the Definition
– Suppose we see that credit score plays an important role in a multivariate regression equation that predicts loss ratio
– Then it is fair to say that credit helps explain insurance losses
– A multivariate study is needed

9

Simpson’s Paradox – Need for Multivariate Analysis

– Statistics can lie
– Illustrates how a univariate association can lead to a spurious conclusion
– The “true” explanatory factor is masked by the spurious correlation
– Famous example: 1973 Berkeley admissions data

10

Simpson’s Paradox – Need for Multivariate Analysis

The Berkeley Example (stylized)
– 2200 people applied for admission
– 1100 men; 1100 women
– 210 men, 120 women were accepted
– Clear-cut case of gender discrimination... Or is it?

11

Simpson’s Paradox – Need for Multivariate Analysis

          # Applicants          # Accepted            % Accepted
          Arts   Eng    Total   Arts   Eng    Total   Arts   Eng    Total
Female    1000   100    1100    100    20     120     10%    20%    11%
Male      100    1000   1100    10     200    210     10%    20%    19%
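The table can be checked with a few lines of arithmetic. The following is a minimal sketch that recomputes the acceptance rates from the stylized counts; the dictionary layout and the helper name `acceptance_rate` are illustrative choices, not part of the original slides.

```python
# Stylized Berkeley counts copied from the table: (applicants, accepted).
counts = {
    ("Female", "Arts"): (1000, 100),
    ("Female", "Eng"): (100, 20),
    ("Male", "Arts"): (100, 10),
    ("Male", "Eng"): (1000, 200),
}


def acceptance_rate(gender=None, school=None):
    """Acceptance rate over the cells matching the given gender and/or school."""
    applicants = accepted = 0
    for (g, s), (n, k) in counts.items():
        if gender in (None, g) and school in (None, s):
            applicants += n
            accepted += k
    return accepted / applicants


for gender in ("Female", "Male"):
    overall = acceptance_rate(gender=gender)
    arts = acceptance_rate(gender=gender, school="Arts")
    eng = acceptance_rate(gender=gender, school="Eng")
    print(f"{gender}: overall {overall:.1%}, Arts {arts:.1%}, Eng {eng:.1%}")
```

Within each school the two rates are identical (10% for Arts, 20% for Engineering). The gap in the overall rates comes entirely from where each group applied: most women applied to Arts, which has the lower acceptance rate.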

12

Simpson’s Paradox – Need for Multivariate Analysis

REGRESSION RESULTS

             Beta     T-Score
Intercept    0.109    10.2
Gender       0.082    5.1

             Beta     T-Score
Intercept    0.10     9.20
Gender       0.00     0.00
School       0.10     3.80

13

Simpson’s Paradox – Need for Multivariate Analysis

               # Policies                # Policies w/Claims       Frequency
               Adult  Youthful  Total    Adult  Youthful  Total    Adult  Youthful  Total
Good Credit    1000   100       1100     100    20        120      10%    20%       11%
Bad Credit     100    1000      1100     10     200       210      10%    20%       19%
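One simple way to see that age mix drives the gap in this stylized book is direct standardization: apply each credit group's age-specific frequencies to a common age mix. This is a minimal sketch of that idea, not a method taken from the slides. The function names and the choice of the whole book's age mix as the standard are assumptions.

```python
# Counts copied from the table: (policies, policies with claims).
book = {
    ("Good Credit", "Adult"): (1000, 100),
    ("Good Credit", "Youthful"): (100, 20),
    ("Bad Credit", "Adult"): (100, 10),
    ("Bad Credit", "Youthful"): (1000, 200),
}
age_groups = ("Adult", "Youthful")

# Assumed standard: the share of all policies that falls in each age group.
total_policies = sum(policies for policies, _ in book.values())
age_mix = {
    age: sum(p for (_, a), (p, _) in book.items() if a == age) / total_policies
    for age in age_groups
}


def raw_frequency(credit):
    """Claim frequency of a credit group using its own age mix."""
    policies = sum(book[(credit, age)][0] for age in age_groups)
    claims = sum(book[(credit, age)][1] for age in age_groups)
    return claims / policies


def standardized_frequency(credit):
    """Claim frequency the credit group would show with the overall age mix."""
    return sum(
        age_mix[age] * book[(credit, age)][1] / book[(credit, age)][0]
        for age in age_groups
    )


for credit in ("Good Credit", "Bad Credit"):
    print(
        f"{credit}: raw {raw_frequency(credit):.1%}, "
        f"age-standardized {standardized_frequency(credit):.1%}"
    )
```

The raw frequencies differ (about 10.9% vs. 19.1%), but both groups standardize to 15.0%. In this constructed example credit carries no information beyond age, which is exactly the situation the working definition is meant to rule out.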

14

What Has Been Done So Far

– We (actuaries) have been quiet
– Few published actuarial studies/opinions
  – NAIC/Tillinghast (1997)
  – Monaghan’s Study (2000)
– Recent/related studies
  – Virginia State Study (1999)
  – CAS Sub-Committee (2002)
  – Washington State Study (2003)
  – University of Texas Study (2003)

15

What Has Been Done So Far

Relevant Actuarial/Statistical Principles
– Pure premium vs. loss ratio
  – Loss ratio studies go beyond existing rating plans, and are implicitly multivariate
– Independence vs. correlation
  – Most insurance variables are correlated
– Univariate vs. multivariate
  – Correlated variables call for multivariate studies for true answers (Simpson’s Paradox)
– Credibility vs. homogeneity
  – Studies need to be credible and representative
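The first principle can be made concrete with the standard definitions of the two measures. The notation below is a sketch under a simplifying assumption: premium is written as exposure times a current rate r(z), where r and z (the existing rating variables) are hypothetical symbols that do not appear in the slides.

```latex
% Standard definitions.
\[
\text{Pure premium} = \frac{\text{Losses}}{\text{Exposure}},
\qquad
\text{Loss ratio} = \frac{\text{Losses}}{\text{Premium}}
\]
% Assumption for illustration: r(z) is the current rate per unit of exposure
% produced by the existing rating variables z.
\[
\text{Premium} = \text{Exposure} \times r(z)
\quad\Longrightarrow\quad
\text{Loss ratio} = \frac{\text{Pure premium}}{r(z)}
\]
```

Under this assumption the loss ratio is the pure premium already divided by what the existing rating plan charges for z. A variable that still separates loss ratios is therefore picking up something the current plan does not, which is the sense in which a loss ratio study is implicitly multivariate.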

16

What Has Been Done So Far

The Tillinghast Study
– 9 companies’ data, seems representative
– Loss ratio study
– No other predictive variables included in the study
– No detailed information given about the data
– Strong correlation with loss ratio, seems credible
– This is true, but it doesn’t answer our question and doesn’t quiet the critics

17

What Has Been Done So Far

Tillinghast Study of 9 Companies' Data

Loss Ratio Relativity of the Best and Worst 20% of Credit Score

             Co1    Co2    Co3    Co4    Co5    Co6    Co7    Co8    Co9    Avg
Best 20%     -38%   -29%   -19%   -15%   -14%   -34%   -22%   -22%   -36%   -25%
Worst 20%    48%    20%    32%    30%    46%    59%    20%    22%    95%    41%
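The Avg column is consistent with a simple unweighted mean of the nine company figures. The slide does not say how the average was formed, so treating it as unweighted is an assumption. The short sketch below checks this and also expresses the spread between the worst and best groups as a ratio of relative loss ratios.

```python
# Loss ratio relativities (in percent) copied from the table, Co1 to Co9.
best_20 = [-38, -29, -19, -15, -14, -34, -22, -22, -36]
worst_20 = [48, 20, 32, 30, 46, 59, 20, 22, 95]


def mean(values):
    """Simple unweighted average (an assumption about how Avg was computed)."""
    return sum(values) / len(values)


avg_best = mean(best_20)
avg_worst = mean(worst_20)

# Ratio of the worst group's relative loss ratio to the best group's.
spread = (1 + avg_worst / 100) / (1 + avg_best / 100)

print(f"average best 20%:  {avg_best:.1f}%")
print(f"average worst 20%: {avg_worst:.1f}%")
print(f"worst/best loss ratio multiple: {spread:.2f}")
```

The unweighted means round to -25% and 41%, matching the table. On that basis the worst-scoring fifth runs a loss ratio roughly 1.9 times that of the best-scoring fifth.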

18

What Has Been Done So Far

Monaghan’s Study
– Loss ratio study
– Large amount of data – credible analysis
– Analyzes individual credit variables as well as score
– Multivariate analysis – limited to score + 1 traditional rating variable at a time
– Shows that strong correlations with loss ratio do not go away in the presence of other variables
– Another good step, but we can go further

19

Our Large-Scale Data Mining Experience

Our Work
– Loss ratio studies
– Multiple studies – representative
– Large amounts of data – credible
– Hundreds of variables tested along with credit – truly multivariate
  – Policy, driver, vehicle, coverages, billing, agency, external data, synthetic, etc.
– Sound actuarial and statistical model design
– Disciplined data mining process

20

Our Large-Scale Data Mining Experience

What Have We Found Out?
– Credit score is always one of the top variables selected for the multivariate models
– Credit score has among the strongest parameters and statistical measurements (t-score)
  – Credit’s predictive power does not go away in the truly multivariate context
– Removing credit score dampens the predictive power of the models

21

Our Large-Scale Data Mining Experience

What Do We Conclude?
– We conclude that credit score bears an unambiguous relationship to insurance losses, and is not a mere proxy for other kinds of information available to insurance companies.
– This does not mean that credit score is the “cause” of insurance losses

22

Our Large-Scale Data Mining Experience

Why Is Credit Score Correlated with Insurance Losses?
– Beyond the scope of our work
  – Emphasis is not causation
– Plausible speculations include
  – Stress/planning & organization
  – Risk-seeking behavior
  – ??
– Analogy: Age/Gender might be a proxy for testosterone

23

Going Beyond Credit

Can We Do Well Without Credit?
– YES: non-credit predictive models are
  – Valuable alternative to credit scores
  – Flexible
  – Tailored to individual companies
  – Comparable predictive power to credit scores
– Also possible to build mixed credit/non-credit models

24

Going Beyond Credit

Keys to Building Successful Non-Credit Models:
– Fully utilize all sources of information
  – Leverage company’s internal data sources
  – Enriched with other external data sources
– Use large amount of data
– Employ disciplined analytical process
– Utilize state-of-the-art modeling tools
– Apply multivariate methodology

25

Going Beyond Credit

Advantages of Going Beyond Credit
– Next generation of competitive advantage
– More variables, more predictive power
– Leverages company’s internal data sources
– More flexibility
– Address regulatory issues and public concerns
– Expense savings
– Everyone gets a score (less of a “no hit” problem)
– More customized – less “plain vanilla” than credit score

26

Conclusions

– Credit works... even in a fully multivariate setting
– But non-credit models can work well too!
– What it means to us – beginning of a new era
  – Advances in computer technology
  – Advances in predictive modeling techniques
  – Large scale multivariate studies now practical
  – More external and internal info, anything else out there?
  – Other ways to go beyond credit?

27

Conclusions

Future work on this topic
– Multivariate pure premium analysis would provide more insights
– Further study of public policy issues
  – WA, VA came to opposite conclusions
– Comparison of various existing scoring models