Post on 27-Jun-2020
willistowerswatson.com
Machine learning – making an impact
Tony Bradshaw, Duncan Anderson
19 April 2018
PC European Actuarial Directors’ Forum
© 2018 Willis Towers Watson. All rights reserved.
willistowerswatson.com 2© 2018 Willis Towers Watson. All rights reserved. Proprietary and Confidential. For Willis Towers Watson and Willis Towers Watson client use only.
Agenda
Machine Learning - Making an impact - Wider implications
Tony
Machine Learning - Why now? What’s involved? Does it work? Implications for actuaries?
Duncan
This is not all new
3
1940s Today1960s 1980s 2000s
Trees
GLMs
Actuaries adopt GLMs
1950s 1970s 1990s 2010s
First computer
Neural nets
CART
GBMs
Random forests
© 2018 Willis Towers Watson. All rights reserved.
This is not all new
4
1940s Today1960s 1980s 2000s
Trees
GLMs
Actuaries adopt GLMs
1950s 1970s 1990s 2010s
First computer
Neural nets
CART
GBMs
Random forests
© 2018 Willis Towers Watson. All rights reserved.
Integrated
environments
and services
Hyper scale
parallel
computing
Distributed
storage/
processing
with Hadoop
NoSQL
databases
Data
visualisation
tools
Free software
environments,
languages,
analytics
libraries
Data stream
and real-time
processing
supporting IoT
There is a spectrum of complexity
5
Harder
Evolving
Requires significant expertise
Not at all hard
Already in use
Existing teams can normally do this stuff
AI comprehension Bespoke image recognition Speech analytics Machine learning predive modelling
Full autonomous driving Object recognition Topic modelling Automated GLMs
“Vastly more
risky than
North Korea”
GLM
stepwise
macro
© 2018 Willis Towers Watson. All rights reserved.
Example machine learning methods
6
Ensemble
Methods
Classifications
Trees"Earth"
Support Vector
MachinesElastic Net Neural Networks
Gradient
Boosting
Machines
Naïve Bayes
Regression
Trees
Principal
Components
Analysis
LassoK-nearest
Neighbours
K-Means
Clustering
Random Forests
Ridge
Regression
© 2018 Willis Towers Watson. All rights reserved.
Fitting - Penalised Regression
© 2018 Willis Towers Watson. All rights reserved.on client use only. 7
𝐿(𝛽|𝑋, 𝑦) + 𝜆1 𝑖𝛽𝑖 + 𝜆2
𝑖𝛽𝑖2
f(x) = g-1(X.b) where b estimated by minimising
Elastic Net
Ridge Lasso GLM
Fit
Fit
Test
1
Holdout
Fit
Fit
Fit
Fit
Fit
2
Holdout
Fit
Test
Fit
Test
Fit
3
Holdout
Fit
Fit
Test
Fit
Fit
4
Holdout
Fit
Fit
Fit
Fit
Fit
5
Holdout
Test
Fit
1
0.9
5 0.9 0.8
5 0.8 0.7
5 0.7 0.6
5 0.6
0.5
5 0.5
a
Cro
ss-v
ali
da
tio
n e
rro
rlog(l)
Optimal parameters
𝜆1 = 𝜆𝛼, 𝜆2 = 𝜆1 − 𝛼
2
𝐿(𝛽|𝑋, 𝑦)
Fitting - Gradient Boosted Machine or “GBM”
8
Group < 15?
Age < 40?
All dataGroup < 5?
Y N
Y N
Y N
𝑓 𝑥 = λ 𝑛=1
𝑁
𝑓𝑛(𝑥)
λ + λ + λ + λ +
λ + λ + λ + λ +
λ + λ + λ + λ +
λ + λ + λ + λ
© 2018 Willis Towers Watson. All rights reserved.
𝑓 𝑥 = λ 𝑛=1
𝑁
𝑓𝑛(𝑥)
λ + λ + λ + λ +
λ + λ + λ + λ +
λ + λ + λ + λ +
λ + λ + λ + λ
Optimal model of
those considered
Fitting - Gradient Boosted Machine or “GBM”
9© 2018 Willis Towers Watson. All rights reserved.
l(learning rate or “shrinkage”)
Interaction depth(number of splits on each tree)
N(number of trees)
Bag fraction(proportion of data used at each iteration)
y
w3,3
w3,2
w3,1
x1=21
x2=5
w1,1,3
α1,3w1,2,3
w1,2,2
α1,2
w1,1,2w2,1,3
w2,3,1
α2,3
𝛼2,2
𝛼2,1
w2,3,3
w2,3,2
w2,1,2
w2,1,1
w2,2,3
w2,2,1
w2,2,2
𝛼1,1w1,1,1
w1,2,1
Fitting - Neural Networks
10© 2018 Willis Towers Watson. All rights reserved.
Do they work? Do they add value?
11
Ensemble
Methods
Classifications
Trees"Earth"
Support Vector
MachinesElastic Net Neural Networks
Gradient
Boosting
Machines
Naïve Bayes
Regression
Trees
Principal
Components
Analysis
LassoK-nearest
Neighbours
K-Means
Clustering
Random Forests
Ridge
Regression
© 2018 Willis Towers Watson. All rights reserved.
Dimensions of utility
12
Analytical
time and
effort
Predictive power
Execution speed Implementation
Interpretation
Method
Stability
© 2018 Willis Towers Watson. All rights reserved.
Dimensions of utility
13
Analytical
time and
effort
Predictive power
Execution speed Implementation
Interpretation
Method
Stability
© 2018 Willis Towers Watson. All rights reserved.
Data
Factor
engineering &
knowing what
to model Method
Dimensions of utility
14
Analytical
time and
effort
Predictive power
Execution speed Implementation
Interpretation
Method
Stability
© 2018 Willis Towers Watson. All rights reserved.
Think of a model…
Multiply it by 123
Square it
Add 74½ billion
…and you get the
same Gini coefficient!
Dimensions of utility
15
Analytical
time and
effort
Predictive power
Execution speed Implementation
Interpretation
Method
Stability
© 2018 Willis Towers Watson. All rights reserved.
Model RMS Error
GLM 34.7%
Neural Net 33.1%
GBM 31.0%
Dimensions of utility
16
Analytical
time and
effort
Predictive power
Execution speed Implementation
Interpretation
Method
Stability
Loss ratio improvement 3.1%!© 2018 Willis Towers Watson. All rights reserved.
Dimensions of utility
17
Analytical
time and
effort
Predictive power
Execution speed Implementation
Interpretation
Method
Stability
© 2018 Willis Towers Watson. All rights reserved.
18
Dimensions of utility
19
Analytical
time and
effort
Predictive power
Execution speed Implementation
Interpretation
Method
Stability
© 2018 Willis Towers Watson. All rights reserved.
Pre/post
adjustments
“Comfort
Diagnostics”
Model
0
20000
40000
60000
80000
100000
4.4%
4.9%
5.4%
5.9%
6.4%
< 88% 88% -90%
90% -92%
92% -94%
94% -96%
96% -98%
98% -100%
100% -102%
102% -104%
104% -106%
106% -108%
108% -110%
110% -112%
> 112%
Resp
on
se
Proposed Model / Current ModelObserved Current Model Proposed Model
“Comfort diagnostics”
Dimensions of utilityHow much do you need to understand?
How much would you normally understand? (eg vehicle classification)
Cost of error? (eg marketing, operational efficiency)
Regulatory requirements
Professional standards
20
Analytical
time and
effort
Predictive power
Execution speed Implementation
Interpretation
Method
Stability
© 2018 Willis Towers Watson. All rights reserved.
Dimensions of utility
21
Analytical
time and
effort
Predictive power
Execution speed Implementation
Interpretation
Method
Stability
Main Policy
Admin System
Next gen
rating
engine
Analytical
environ-
ment
600
610
620
630
640
650
660
670
680
690
700
710
720
730
740
750
760
770
780
790
800
New Business Price £
Cu
rren
t R
ate
-2.0% +2.0%
-5.0%
+5.0%
10%
40%
40%
10%
Analytical
environment
Next generation
rating engine
Policy administration
system
© 2018 Willis Towers Watson. All rights reserved.
Dimensions of utility
22
Analytical
time and
effort
Predictive power
Execution speed Implementation
Interpretation
Method
Stability
© 2018 Willis Towers Watson. All rights reserved.
A toolkit…
23
Analytical
time and
effort
Predictive power
Execution speed Implementation
Interpretation
Method
Stability
Analytical
time and
effort
Predictive power
Table
Implementation
Interpretation
Penalised
Regression
Stability
Execution speed
Analytical
time and
effort
Predictive power
Execution speedTable
Implementation
Interpretation
"Earth"
Stability
Analytical
time and
effort
Predictive power
Execution speedTable
Implementation
Interpretation
GBMs
Stability
Analytical
time and
effort
Predictive power
Execution speedTable
Implementation
Interpretation
Trees
Stability
Analytical
time and
effort
Predictive power
Execution speedTable
Implementation
Interpretation
Random
Forests
Stability
Analytical
time and
effort
Predictive power
Execution speedTable
Implementation
Interpretation
Support
Vector
Machine
Stability
Analytical
time and
effort
Execution speedTable
Implementation
Interpretation
Neural
Networks
Stability
Predictive power
Analytical
time and
effort
Predictive power
Execution speedTable
Implementation
Interpretation
GLM
Stability
© 2018 Willis Towers Watson. All rights reserved.
That is already in use…
24
Analytical
time and
effort
Predictive power
Execution speed Implementation
Interpretation
Method
Stability
© 2018 Willis Towers Watson. All rights reserved.
2017 Willis Towers Watson US market survey
For which business applications do you use or plan to use these methods?
Willis Towers Watson Predictive Modeling Survey 2017
Underwriting
and risk
management
Pricing ReservingClaims
management
Customer
management
Marketing
and
Distribution
ClaimsUnderwriting/Pricing Marketing
94%
84%
55%
41%
37%
41%
41%
37%
37%
20%
Generalized linear models(GLMs)
One-way analyses
Decision trees
Model combining methods(e.g., stacking, blending)
Gradient boosting machines(GBMs)
Random forest (RF)
Penalized regressionmethods (e.g., lasso, ridge,
elastic net)
Neural networks
Generalized additivemodels (GAMs)
Support vector machines
61%
58%
58%
27%
24%
36%
27%
24%
21%
12%
78%
54%
54%
35%
32%
35%
30%
41%
19%
19%
That spectrum of complexity
25
Becoming BAU
This end could be
interesting…
AI comprehension Bespoke image recognition Speech analytics Machine learning predive modelling
Full autonomous driving Object recognition Topic modelling Automated GLMs
© 2018 Willis Towers Watson. All rights reserved.
26© 2018 Willis Towers Watson. All rights reserved.
So…
27© 2018 Willis Towers Watson. All rights reserved.
It’s not just about methods
Data beats models
It’s not just about methods
Working out what to model matters -
Data beats factor engineering beats models
Machine learning is already in use
Actuaries are already involved
It’s not just about predictiveness
A broader set of problems can be analysed
- operationalising rapid insight
Evolution not revolution
Models are complementary
to existing methods
PCEADF
Issues for the Profession(s)
28© 2018 Willis Towers Watson. All rights reserved.
Regulatory issues
TAS: Judgement - what judgement?
GDPR
UK Government Select Committee
(Science and Technology)
Training
A generation less familiar with stats?
CAS, SOA ahead? (eg CSPA)
IFoA on the case, but fast enough?Role of the actuary
Domain expertise matters (at least currently)
Easier for an actuary to pick up machine learning
than for a data scientist to understand insurance?
Siloed teams don’t work
Familiarity and the right vernacular can help
Scope of involvement?
Pricing Reserving Claims analytics
Customer management ? Marketing ???