Cutting Edge Tools for Pricing and Underwriting Seminar · 2013-04-02 · 4/2/2013 3 Background...


Cutting Edge Tools for Pricing and Underwriting Seminar

Needles in Haystacks – Reduction Techniques to Find Information in Data

© 2011 Towers Watson. All rights reserved.

Casualty Actuarial Society

by Serhat Guven

Fall 2011

Agenda

Background

Main Effects

Interactions

Residual Analysis

Other Alternatives

towerswatson.com © 2011 Towers Watson. All rights reserved. Proprietary and Confidential. For Towers Watson and Towers Watson client use only.


Background

Goals of predictive modeling

Produce a sensible model that explains recent historical experience and is likely to be predictive of future experience

1. Separate the random components from the systematic components of the estimator

Response Variable = Systematic Component + Unsystematic Component

Signal: Function of the Rating Factors/Predictors

Noise: Reflects stochastic process

2. Balance predictive power and explanatory effects

[Figure: Model Complexity (number of parameters), ranging from Overall mean through Best Models to One parameter per observation]

Underfit: Predictive, poor explanatory power

Overfit: Poor predictive power, explains history


Background

Goals of predictive modeling

Predict a response variable using a series of explanatory variables

Explanatory variables: Age, Accidents, Limit, Convictions, Region, Credit score

Predictive Model

Response variables: Losses, Claims, Retention

Parameters

Validation Statistics

Background

Larger data storage capabilities and greater access to external data enhance the ability to identify the right rate for the risk

Many external factors have been found to be predictive of frequency and/or loss severity. Here are a few for auto…

Major Convictions

Minor Convictions

Territory

Traffic Density

Garaged

Age

Gender

Method of Payment


Background

Component vs. Combined Modeling

Raw loss ratio modeling

Raw pure premium modeling

Standard risk collective

COMPONENT MODELS

Frequency: Poisson/Negative Binomial

Severity: Gamma

COMBINE

Frequency x Severity

Background

Component vs. Combined Modeling

Alternate collective

COMPONENT MODELS

Event Frequency: Poisson/Negative Binomial

Coverage Propensity: Binomial

Severity: Gamma

COMBINE

Event Frequency x Coverage Propensity x Severity

Challenge is dealing with low frequency coverages/perils

Further alternatives are to replace frequency concepts with probability


Background

Component vs. Combined Modeling

Further alternatives replace frequency with probability

COMPONENT MODELS

Event Probability: Binomial

Coverage Propensity: Binomial

Severity: Gamma

COMBINE

Event Probability x Coverage Propensity x Severity

Moving away from defining a variance function
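The COMBINE step on the slides above multiplies the component predictions into an expected loss cost. A minimal sketch of that arithmetic follows; the function name and all numbers are hypothetical and only illustrate the product, not fitted models.

```python
def pure_premium(event_frequency, coverage_propensity, severity):
    """Expected loss cost per exposure as the product of the component estimates."""
    return event_frequency * coverage_propensity * severity


# Hypothetical component predictions for a single risk (illustrative numbers only).
# Standard risk collective: Frequency x Severity (propensity fixed at 1).
standard = pure_premium(0.08, 1.0, 2500.0)

# Alternate collective: Event Frequency x Coverage Propensity x Severity.
alternate = pure_premium(0.10, 0.8, 2500.0)

print(standard, alternate)
```

In practice each argument would come from its own fitted component model (for example Poisson, binomial and gamma), scored on the same risk.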

Background

Modeling is an iterative process

How does the analyst decide which factors are most valuable?

Parameters/standard errors

Consistency of patterns over time or random data sets

Type III statistical tests (e.g., chi-square tests)

Judgment (e.g., do the trends make sense)

[Diagram: iterative cycle linking Model, Review Model, Simplify and Complicate]

Focus of the section is on analysis of data NOT gathering and collecting


Background

Greater availability of data requires a more efficient approach to analysis and selection

Brute force still has value

Aligning with statistical theory – working smarter NOT harder

Different strategies employed

Main effects

Interactions

Residual Analysis – there is still signal left

Other Alternatives


Main Effects Identification

Classical Mining

Useful source of information

Good for Pattern Recognition

Quick and easy

Methodological flaws

Limited by data quantity

No statistical framework

Main Effects Identification

Studying distributional biases

Identify potential duplication of predictors

Identify potential aliasing

Issues

Curse of dimensionality

Defining correlation


Main Effects Identification

Principal Components

Transform correlated predictors into primary effects

Issues

Interpretability

Categorical Effects
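As a rough illustration of transforming correlated predictors into uncorrelated primary effects, the sketch below runs a principal components decomposition with NumPy. The simulated predictors are hypothetical and continuous; it does not address the categorical-effects issue noted above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predictors: the first two are strongly correlated, the third is independent.
base = rng.normal(size=500)
X = np.column_stack([
    base + 0.1 * rng.normal(size=500),
    2.0 * base + 0.1 * rng.normal(size=500),
    rng.normal(size=500),
])

# Standardise each predictor, then diagonalise the correlation matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
corr = np.corrcoef(Z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)

# eigh returns ascending eigenvalues; reorder so the largest variance comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

components = Z @ eigenvectors              # uncorrelated component scores
explained = eigenvalues / eigenvalues.sum()  # share of variance per component
print(np.round(explained, 3))
```

The interpretability issue shows up directly: each column of `components` is a weighted blend of all original predictors rather than a single rating factor.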

Main Effects Identification

Stepwise Regression

Forward regression

            Iteration 1   Iteration 2   …   Model Stable
Base Line   A             A + M             A + M + C + …
            A + B         A + M + B
            A + C         A + M + C
            A + D         A + M + D
            …             …
Selection   M             C                 NONE

Useful search routine when dealing with large number of related factors
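The forward search in the table above can be sketched as a greedy loop. This is a simplified illustration under stated assumptions: it uses ordinary least squares and AIC as the selection test, whereas the slides do not fix a model family or a test; the function names and simulated factors are hypothetical.

```python
import numpy as np


def aic(y, X):
    """AIC of a least-squares fit (Gaussian likelihood, up to an additive constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    n, k = X.shape
    return n * np.log(rss / n) + 2 * k


def forward_select(y, columns, base=()):
    """Add the factor that lowers AIC the most; stop when no candidate improves it."""
    n = len(y)
    selected = list(base)

    def design(names):
        # Intercept plus one column per selected factor.
        return np.column_stack([np.ones(n)] + [columns[c] for c in names])

    best = aic(y, design(selected))
    while True:
        trials = {c: aic(y, design(selected + [c]))
                  for c in columns if c not in selected}
        if not trials:
            break
        candidate = min(trials, key=trials.get)
        if trials[candidate] >= best:
            break  # "Selection: NONE", the model is stable
        selected.append(candidate)
        best = trials[candidate]
    return selected


# Hypothetical data: the response depends on A, M and C; B and D are pure noise.
rng = np.random.default_rng(0)
n = 400
columns = {name: rng.normal(size=n) for name in "ABCDM"}
y = 1.0 + 2.0 * columns["A"] + 3.0 * columns["M"] + columns["C"] + rng.normal(size=n)

selected = forward_select(y, columns, base=("A",))
print(selected)
```

Starting from base line A, the loop should pick M first and C second, mirroring the Selection row; noise factors may still slip in afterwards, which is one reason selected factors need more rigorous testing.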


Main Effects Identification

Stepwise Regression

Backward Elimination

            Iteration 1     Iteration 2   …   Model Stable
Base Line   A + B + C + D   A + B + D         A + D
            A + C + D       A + B
            A + B + D       A + D
            A + B + C
Removal     C               B                 NONE

More appropriate to judge variable importance of a more complete model

Full main effects models are difficult to fit

Main Effects Identification

Stepwise Regression

Adaptive Regression

            Iteration 1              Iteration 2
            Forward     Backward     Forward       Backward
Base Line   A           A + M        A + M         A + M + C
            A + B       A            A + M + B     A + M
            A + C                    A + M + C     A + C
            A + D                    A + M + D
            …                        …
Selection   M           NONE         C             NONE

Full set of permutations developed

Computationally intensive


Main Effects Identification

Stepwise Regression Issues

Computationally intensive

– Need strong hardware and fast software to accomplish in a meaningful way

Selection of test for statistical significance

– Chi-square and F tests

    – Violating normality assumptions

    – Tendency to overfit

– AIC Family

    – Parameter penalization concerns

    – Produce factor selection similar to that of a decision tree

Selected factors need more rigorous testing

Main Effects Identification

Balance

Aggregate average observed vs. aggregate average fitted


Tricky to deal with distributional biases

Often requires more rigor for confirmation


Main Effects Identification

Decision Trees

Factor selection

Factor creation

Model localization

[Figure: decision tree with 18 splitter nodes and 19 terminal nodes. The root, Node 1, splits on AA_POL_DUR (W = 566098.000, N = 566098). Other splitters: PREMIUM_INVITED, PREMIUM_RATIO, CONTENTS_HISTORY$, PAYMENT_METHOD2$, WI_AVG_123_RATIO and VEHICLE_AGE_RN.]

Main Effects Identification

Decision Trees Issues

Tendency to overfit

Performs ‘better’ as a classification tree rather than a regression tree


Main Effects Identification

Even after items have been identified, additional testing is needed

Parameter/Standard Errors

Time Testing

Type III

Model                With         Without
Deviance             8,906.44     8,909.62
Degrees of Freedom   18,469.00    18,475.00
Scale Parameter      0.48         0.48

Chi-Square Test 0.79
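The Type III figures above can be checked with a short calculation. The sketch assumes the tabulated deviances are already scaled, so the test statistic is simply their difference; that reading reproduces the 0.79 on the slide, whereas dividing by the scale parameter of 0.48 would not. The helper uses a closed form that is only valid for even degrees of freedom, which is enough here.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for positive even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


deviance_with, deviance_without = 8906.44, 8909.62
df_with, df_without = 18469, 18475

delta_deviance = deviance_without - deviance_with  # about 3.18
delta_df = df_without - df_with                    # 6 parameters dropped
p_value = chi2_sf_even_df(delta_deviance, delta_df)

print(round(p_value, 2))  # 0.79
```

A p-value this large suggests the factor adds little: removing it does not significantly worsen the fit.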


Interaction Identification

Stepwise Regression


Need to specify direction, test, and stopping conditions

Computationally inefficient

Interaction Identification

Balance

Approach is to compare aggregate average observed values vs. aggregate average fitted values

Imbalance suggests need for interaction


Interaction Identification

Balance

Observed value: $y_i = \frac{\text{Claims}_i}{\text{Exposures}_i}$

Fitted value (assumes simple model structure): $\hat{y}_i = \mu_i = h(x_i \beta)$

Weighted average observed value and weighted average fitted values for Class k

$A_k = \frac{\sum_{i \in k} \text{Exposures}_i \times y_i}{\sum_{i \in k} \text{Exposures}_i}$

$E_k = \frac{\sum_{i \in k} \text{Exposures}_i \times \hat{y}_i}{\sum_{i \in k} \text{Exposures}_i}$

Interaction Identification

Balance

For each combination of cells for two rating factors derive the following:

            Male   Female
Youthful    DYM    DYF
Adult       DAM    DAF
Mature      DMM    DMF
Seniors     DSM    DSF

Such that: $D_k = \frac{\text{Exposures}_k \times (A_k - E_k)^2}{E_k}$

Then: $Q = \sum_{\text{Age}} \sum_{\text{Gender}} D_k$

Follows a chi squared distribution with (n-1)*(m-1) degrees of freedom
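A minimal sketch of the balance statistic, assuming the cell statistic is exposure times the squared difference between average observed and average fitted, divided by average fitted (the minus sign is not legible in the slide text and is an assumption here). The records and the function name are hypothetical.

```python
from collections import defaultdict


def balance_statistic(records):
    """records: iterable of (class_key, exposure, claims, fitted_frequency)."""
    exposure = defaultdict(float)
    observed = defaultdict(float)  # sum of exposure_i * y_i, i.e. total claims
    fitted = defaultdict(float)    # sum of exposure_i * fitted frequency
    for key, expo, claims, y_hat in records:
        exposure[key] += expo
        observed[key] += claims
        fitted[key] += expo * y_hat

    d = {}
    for key in exposure:
        a_k = observed[key] / exposure[key]  # weighted average observed
        e_k = fitted[key] / exposure[key]    # weighted average fitted
        d[key] = exposure[key] * (a_k - e_k) ** 2 / e_k
    return d, sum(d.values())


# Hypothetical age x gender cells: (class, exposures, claims, fitted frequency).
records = [
    (("Youthful", "Male"), 100.0, 15.0, 0.10),
    (("Youthful", "Female"), 100.0, 10.0, 0.10),
    (("Adult", "Male"), 200.0, 10.0, 0.05),
    (("Adult", "Female"), 200.0, 8.0, 0.05),
]

d, q = balance_statistic(records)
n_levels, m_levels = 2, 2
degrees_of_freedom = (n_levels - 1) * (m_levels - 1)
print(round(q, 4), degrees_of_freedom)
```

Comparing `q` against a chi squared distribution with the stated degrees of freedom gives the p-value used to rank candidate interactions.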


Interaction Identification

Balance

Chi squared test then run for every two way combination

Framework allows for ranking of potential constructs

Rank   Factor 1              Factor 2              Chi Test
1      Driving Restriction   Age                   0.0000
2      Age                   Gender                0.0000
3      Driving Restriction   NCD                   0.0000
4      NCD                   Gender                0.0000
5      NCD                   Age                   0.0001
6      Protected NCD         Gender                0.0002
7      Driving Restriction   Gender                0.0004
8      Driving Restriction   Protected NCD         0.0006
9      LossYear              Driving Restriction   0.0008
10     LossYear              Gender                0.0132
11     Vehicle Age           NCD                   0.0176
12     Driving Restriction   Vehicle Category      0.0195
13     LossYear              Protected NCD         0.0425
14     Vehicle Category      Gender                0.0670
15     Rating Area           Gender                0.1185
…      …                     …                     …

Interaction Identification

Balance

Advantages

Can quickly identify areas in the model where interactions are needed

“Exponential” effect of distributional biases can magnify the importance of one interaction structure vs. another

Disadvantages

Sensitive to noise of severity

Limited guidance as to simplification


Interaction Identification

Decision Trees

What would an interaction look like in a decision tree?

[Figure: two decision trees. Left tree: a splitter on factor A, then splitters on factor B on both branches, then splitters on factor C on all four branches. Right tree: a splitter on factor A, then splitters on factors B and C, then splitters on factors D, E, F and G.]

The left hand tree has similar structures down each branch and so is unlikely to indicate interactions

The right hand tree has different structures depending on which branch is traversed. This might have interactions.

Interaction Identification

Decision Trees

Will not guarantee interactions

Provides additional clues as to what interactions to test.

[Figure: decision tree with a splitter on factor A at the root, splitters on factors B and C below it, and splitters on factors D, E, F and G at the next level]

This tree may have the following interactions

A x B, A x C, A x D, A x E, A x F,

A x G, B x D, B x E, C x F, C x G,

A x B x D, A x B x E,

A x C x F, A x C x G

The list of candidate interactions can grow quickly!
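One way to see where this candidate list comes from is to enumerate every combination of factors that share a root-to-leaf path. The sketch assumes the layout implied by the listed interactions (D and E under B, F and G under C); the dictionary encoding and function names are hypothetical.

```python
from itertools import combinations

# Assumed splitter layout: A at the root, B and C below it,
# D and E under B, F and G under C.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}


def root_to_leaf_paths(tree, node):
    """All paths of splitter factors from `node` down to a lowest-level splitter."""
    children = tree.get(node, [])
    if not children:
        return [[node]]
    return [[node] + path
            for child in children
            for path in root_to_leaf_paths(tree, child)]


def candidate_interactions(tree, root, max_order=3):
    """Factor combinations that appear together on at least one path."""
    found = set()
    for path in root_to_leaf_paths(tree, root):
        for order in range(2, max_order + 1):
            found.update(combinations(path, order))
    return sorted(found, key=lambda combo: (len(combo), combo))


candidates = candidate_interactions(tree, "A")
for combo in candidates:
    print(" x ".join(combo))
```

For this three-level tree the enumeration yields ten two-way and four three-way candidates, and the count grows quickly with depth.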


Interaction Identification

Decision Trees

Case study: auto renewals model

Balance test identified 23 interactions

Tree identified a cross holdings and tenure interaction that the balance test did not find

[Figure: decision tree for the auto renewals model. The root, Node 1, splits on AA_POL_DUR (W = 566098.000, N = 566098); lower splitters are PREMIUM_INVITED, PREMIUM_RATIO, CONTENTS_HISTORY$, PAYMENT_METHOD2$, WI_AVG_123_RATIO, VEHICLE_AGE_RN and further AA_POL_DUR splits; 19 terminal nodes.]

Interaction Identification

Decision Trees

Advantages

Quickly identify potential n-way interactions

Suggests areas of localization

Disadvantages

Growth in complexity

Better performance when response is structured as a discrete (i.e. binomial/multinomial) construct

Limited guidance as to simplification


Interaction Identification

Saddles

Quadrant saddle: revisiting a simple main effect model

[Figure: 3D surface chart of the main effects model; horizontal axes 1 to 39 and S1 to S37, vertical axis -400 to 1200]

Interaction Identification

Saddles

Quadrant saddle: interaction terms twist the paper

[Figure: 3D surface chart with the interaction term included; horizontal axes 1 to 39 and S1 to S37, vertical axis -800 to 800]
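One possible reading of the paper-twisting picture: with two centred single-parameter variates, a main-effects-only linear predictor is a flat plane, and a product term lifts two opposite corners while lowering the other two. The sketch below measures that twist as a mixed difference over the four corners; the coefficients and function names are hypothetical.

```python
def surface(u, v, a=1.0, b=2.0, c=0.0):
    """Linear predictor on two centred variates u and v; c weights the interaction."""
    return a * u + b * v + c * u * v


def twist(f):
    """Mixed difference over the four corners; zero for any main-effects-only surface."""
    return f(1, 1) - f(1, -1) - f(-1, 1) + f(-1, -1)


flat = twist(lambda u, v: surface(u, v))             # main effects only
twisted = twist(lambda u, v: surface(u, v, c=0.5))   # with a single interaction parameter

print(flat, twisted)
```

The twist equals four times the interaction coefficient, so a single extra parameter controls how far the surface departs from the flat sheet.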


Interaction Identification

Transforming predictors into single parameter variates

[Figure: two charts of Response (vertical axis) against Factor Levels (horizontal axis)]


Interaction Identification

Case study: vehicle value x rating area

Full interaction is very noisy

Different quadrants to be tested

Focus on higher valued vehicles in certain areas

Systematically study different twists

Interaction Identification

Saddles

Advantages

Identifies and simplifies interactions via the transformation

Useful to find local 3, 4, and 5 way interactions of frequency data

Useful in finding interactions on severity data

Allow for interactions on high dimensional factors

Disadvantages


Difficulties in dealing with interactions where main effects are not part of the model

Difficulties in dealing with interactions with low dimensionality

Residual Analysis

General approach

[Figure: Factors A to F feed the Parametric Signal (Expected Cost); the Leftover Signal goes to a Non parametric Analysis with 20 numbered groups ranging from Low $ to High $]

Residual Analysis

Super factors

After a GLM is constructed, use decision trees to model the residuals to see if any pattern exists

If a pattern is discovered, go back to the model structure and incorporate the findings

Test to see if model structure was inadvertently over-simplified
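As a simplified stand-in for fitting a decision tree to the residuals, the sketch below searches for the single best split of the residuals on one candidate factor. A large gain from a split hints at signal the parametric model missed. The data, the factor and the function name are hypothetical, and a real analysis would grow and prune a full tree.

```python
import numpy as np


def best_split(x, residual):
    """Best single split of `residual` on numeric factor `x` by squared-error reduction."""
    order = np.argsort(x)
    x, r = x[order], residual[order]
    total = np.sum((r - r.mean()) ** 2)
    best_gain, best_threshold = 0.0, None
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue  # no valid threshold between equal values
        left, right = r[:i], r[i:]
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if total - sse > best_gain:
            best_gain = total - sse
            best_threshold = (x[i] + x[i - 1]) / 2
    return best_threshold, best_gain


# Hypothetical residuals (observed minus fitted) that step up for older vehicles.
vehicle_age = np.arange(20.0)
residual = np.where(vehicle_age >= 12, 0.3, -0.2)

threshold, gain = best_split(vehicle_age, residual)
print(threshold, round(float(gain), 4))
```

If the split is material, the finding goes back into the model structure, for example as a new grouping of the factor, and the residuals are examined again.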


Residual Analysis

Principle of Locality

Risks that share similar characteristics should have similar experience

Adjacency Smoothing


Distance Adjacency

Residual Analysis

Case Study – Vehicle Similarity

Instead of using latitude/longitude to build relationships, use vehicle dimensions
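A rough sketch of distance-based smoothing when vehicle dimensions take the place of latitude/longitude. The Gaussian kernel, the bandwidth and the example vehicles are all assumptions for illustration; the slides do not specify the smoothing function.

```python
import math


def smooth_residuals(points, residuals, weights, bandwidth=1.0):
    """Kernel-weighted average of residuals over neighbours in feature space."""
    smoothed = []
    for p in points:
        numerator = denominator = 0.0
        for q, r, w in zip(points, residuals, weights):
            # Closer vehicles (in dimension space) and heavier exposure count more.
            kernel = w * math.exp(-(math.dist(p, q) / bandwidth) ** 2)
            numerator += kernel * r
            denominator += kernel
        smoothed.append(numerator / denominator)
    return smoothed


# Hypothetical vehicles described by (length, width) in metres.
points = [(4.2, 1.7), (4.3, 1.8), (5.1, 2.0)]
residuals = [0.10, -0.05, 0.30]   # model residual per vehicle
weights = [100.0, 80.0, 20.0]     # exposure per vehicle

smoothed = smooth_residuals(points, residuals, weights, bandwidth=0.5)
print([round(s, 4) for s in smoothed])
```

Vehicles with similar dimensions pull each other's residuals together, which is the principle of locality applied to vehicle similarity.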

Residual Analysis

Diagnostics:

Smoothing and diagnostics used to identify signal in the vehicle residual

QQ plot on smoothed residual indicates potential signal

If no signal, then all estimated cdfs would be above uniform cdf

[Figure: QQ Plot: P-Values vs Uniform. Horizontal axis: Uniform (0.0 to 1.1); vertical axis: P-Values (0.0 to 1.1). Series: P-Values Smoothing 1 (20), P-Values Smoothing 2 (40), P-Values Smoothing 3 (60), P-Values Smoothing 4 (80), P-Values Smoothing 5 (100), P-Values Time Period Selector Node 1, Uniform Line]

Consistency tests further support signal identification

Both time and data consistency tests should be performed

[Figure: Smoothing 4 (80). Horizontal axis: Smoothing 4 (80) Band (0 to 22); vertical axes: Node Weight (0 to 200,000) and a second scale from 0.6 to 2.4]

Residual Analysis

Advantages

Identifies complex patterns that parametric model could not

Can provide significant lift to model structure

Disadvantages

Overfit potential is significant

Interpretability


Other Alternatives

Noise Reduction

Use the data to solve for parameters

[Diagram: a Sample of the Data is the Input to the Model, which is used to Solve for Parameters]


Other Alternatives

Noise Reduction

Tested against a holdout

[Diagram: the Data is split into a Sample and a Hold Out; the Sample is the Input used to Solve for Parameters, and the Model is then run through a Hold Out Test]

Practical Issues

What is a good fit?

What do you do if there is not a good fit?
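A minimal sketch of the sample and hold out workflow: fit on the sample only, then compare error on both sets. The straight-line model, the 30% hold out share and the simulated data are assumptions; the slides leave open what counts as a good fit, and comparing the two errors is just one common check.

```python
import random


def split(records, holdout_share=0.3, seed=0):
    """Randomly split records into a fitting sample and a hold out."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = len(shuffled) - round(len(shuffled) * holdout_share)
    return shuffled[:cut], shuffled[cut:]


def fit_line(points):
    """Least-squares intercept and slope for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mse(points, intercept, slope):
    """Mean squared error of the fitted line on a set of points."""
    return sum((y - intercept - slope * x) ** 2 for x, y in points) / len(points)


# Hypothetical data: a linear signal plus noise.
noise = random.Random(1)
data = [(x, 100 + 5 * x + noise.gauss(0, 10)) for x in range(200)]

sample, hold_out = split(data)
intercept, slope = fit_line(sample)          # parameters solved on the sample only
sample_mse = mse(sample, intercept, slope)
hold_out_mse = mse(hold_out, intercept, slope)
print(round(sample_mse, 1), round(hold_out_mse, 1))
```

A hold out error far above the sample error would point to overfitting; similar errors suggest the fitted signal generalises.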

Other Alternatives

Noise Reduction

Tested against a holdout

[Diagram: the Data is split into a Sample and a Hold Out; the result of the Hold Out Test feeds back into the Model, creating a Circular Reference]

Changes in model create an issue


Other Alternatives

Noise Reduction

Use holdout WITHOUT using it

[Diagram: Data split into Sample and Hold Out; Input, Noise Reduced Model, Solve for Parameters, Hold Out Test]

Other Alternatives

• Non parametric

– Neural Networks,

– Genetic Algorithms,

– Decision Trees, etc.


Summary

Many tools can be used to find signal

Patterns can be found with new factors or interactions between existing factors

Underfitting and Overfitting are two sides of the same problem

Statistics allow us to work smarter NOT harder


Contact Details

Serhat Guven

Towers Watson

Senior Consultant

200 Concord Plaza Suite 420

San Antonio, TX 78216

210.826.2878

[email protected]
