Application of Statistical Techniques in Group Insurance
Chit Wai Wong, John Low, Keong Chuah & Jih Ying Tioh
© AIA Australia
This presentation has been prepared for the 2016 Financial Services Forum.
The Institute Council wishes it to be understood that opinions put forward herein are not necessarily those of the Institute and the Council is not
responsible for those opinions.
Agenda
• Background and Motivation
• Stochastic Claims Techniques and Applications
• Generalised Linear Models and Applications
• GLM Case Study: Understanding What Drives TPD Claim Delays
Some Relevant Developments within General Insurance
• 1970s: Basic Chain Ladder
• 1993: Thomas Mack, Standard Error of Chain Ladder
• 2002: England & Verrall, Stochastic Claims Reserving
• 2008: Mario Wüthrich, Stochastic Claims Reserving for Solvency Purposes
• To date: Research with applications in pricing? Based on non-aggregated data?
• Generalised Linear Models
Moving Forward to the 21st Century
• Anticipated improvements in data quality and volume of data available
• Significant improvements in computing power and accessibility of statistical packages
• Further insights to be gained from using individual member data
Suggested Statistical Package
R, with RStudio as the Integrated Development Environment
• Widely used by practising statisticians and researchers for statistical analysis
• Various built-in packages accessible to users who might not have very strong coding experience
• Has active user groups where questions can be asked and are often quickly responded to
• Free of charge and open source
Analysis Capabilities Need to Expand…
The chain ladder and its deterministic variants are heavily used in almost all aspects of group insurance related work, but…
• Moving from the deterministic chain ladder to stochastic methods and GLM
• Additional insights potentially lead to:
  - Enhanced pricing, reserving and capital management
  - Optimised return on capital
Stochastic Claims Models
Simplified Models
• Reflective of key drivers of claims profile
• Reduced policy/member data and computing requirements
• Setting of homogeneous groups with sufficient credibility
Full Models
• Most reflective of claims profile
• Needs policy/member data
• Needs computing power
• Many variants of stochastic models have emerged over the last 30 years in the general insurance area
• Most of these models are chain ladder based
• Some are easy to implement in spreadsheets through bootstrapping method
• Full models using individual data could be more popular as computing power increases
Why Stochastic Claims Models?
99.5%
• Provide measure of variability
• Provide predictive distribution
  - Provide more than just mean and variance
  - Link probability to amount at risk
  - Useful in setting risk margins (pricing, capital)
  - Could be extended to enterprise value at risk
• Produce asymmetric distributions
  - Avoid understating the tail
• Pricing options and guarantees
  - Profit share for group life business
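To make the idea of a predictive distribution concrete, the sketch below simulates total annual claims and reads off the mean, 95th and 99.5th percentiles. It is plain Python rather than the R tooling the slides suggest, and everything in it is an assumption made for illustration: the compound Poisson claim count, the lognormal claim size, and all parameter values are hypothetical and do not come from the presentation.

```python
import math
import random

def simulate_annual_claims(n_sims, expected_count, mean_size, sigma, seed=2016):
    """Simulate total annual claims as a Poisson number of lognormal claim sizes."""
    rng = random.Random(seed)
    mu = math.log(mean_size) - 0.5 * sigma ** 2  # chosen so that E[size] = mean_size
    totals = []
    for _ in range(n_sims):
        # Poisson draw: count exponential inter-arrival times falling within one year.
        count, elapsed = 0, rng.expovariate(expected_count)
        while elapsed < 1.0:
            count += 1
            elapsed += rng.expovariate(expected_count)
        totals.append(sum(rng.lognormvariate(mu, sigma) for _ in range(count)))
    return totals

def percentile(values, q):
    """Empirical percentile by nearest rank, with q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[min(rank, len(ordered)) - 1]

# Hypothetical portfolio: 20 expected claims a year, 150,000 average claim size.
totals = simulate_annual_claims(n_sims=5000, expected_count=20, mean_size=150_000, sigma=0.8)
mean_total = sum(totals) / len(totals)
p95 = percentile(totals, 95)
p995 = percentile(totals, 99.5)
print(f"mean={mean_total:,.0f}  95th={p95:,.0f}  99.5th={p995:,.0f}")
```

The gap between the 99.5th percentile and the mean is wider than the gap on the downside, which is the asymmetry a mean-and-variance summary would miss.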
Capital Risk Margins Calibration
• Stochastic chain ladder models could be used to understand IBNR distributions at the aggregated level or by accident/incurred years
• Support the calibration of risk margins on IBNR
• Most models can produce asymmetric distributions to avoid understating the tail
• Some variations could be implemented in Excel
• R has a very comprehensive package
AY/DY 1 2 3 4 5 6 7
2007 3,855 3,428 2,057 1,374 1,037 598 340
2008 3,979 3,538 2,123 1,418 1,070 618 272
2009 4,263 3,791 2,274 1,519 1,146 615 270
2010 3,992 3,549 2,129 1,423 986 710 163
2011 4,217 3,750 2,250 2,139 901 742 232
2012 5,108 4,542 2,618 2,067 1,214 921 356
2013 6,283 5,812 3,191 2,424 1,415 915 295
[Figure: simulated IBNR distributions by incurred year; panels: All Years, 2011, 2012, 2013]
ChainLadder package: https://cran.r-project.org/web/packages/ChainLadder/ChainLadder.pdf
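As a point of reference for the stochastic variants, here is a minimal sketch of the basic deterministic chain ladder applied to the AY/DY table above. It is written in plain Python, not with the R ChainLadder package, and rests on two assumptions that the slides do not state: the figures are treated as incremental amounts, and only the upper-left triangle (one fewer development year for each later accident year) is treated as observed.

```python
# Rows: accident/incurred years; columns: development years 1-7 (figures from the table).
TABLE = {
    2007: [3855, 3428, 2057, 1374, 1037, 598, 340],
    2008: [3979, 3538, 2123, 1418, 1070, 618, 272],
    2009: [4263, 3791, 2274, 1519, 1146, 615, 270],
    2010: [3992, 3549, 2129, 1423, 986, 710, 163],
    2011: [4217, 3750, 2250, 2139, 901, 742, 232],
    2012: [5108, 4542, 2618, 2067, 1214, 921, 356],
    2013: [6283, 5812, 3191, 2424, 1415, 915, 295],
}

def chain_ladder(table):
    """Return volume-weighted link ratios and projected outstanding amounts by year."""
    years = sorted(table)
    n = len(years)
    # Assumption: keep only the upper-left triangle, then accumulate along each row.
    triangle = {}
    for k, year in enumerate(years):
        running, cumulative = 0, []
        for amount in table[year][: n - k]:
            running += amount
            cumulative.append(running)
        triangle[year] = cumulative
    # Link ratio for step j -> j+1: sum of later cumulatives over sum of earlier ones,
    # using only the years where both columns are observed.
    factors = []
    for j in range(n - 1):
        rows = [c for c in triangle.values() if len(c) > j + 1]
        factors.append(sum(c[j + 1] for c in rows) / sum(c[j] for c in rows))
    # Roll each year's latest cumulative amount forward to ultimate.
    ibnr = {}
    for year, cumulative in triangle.items():
        ultimate = cumulative[-1]
        for factor in factors[len(cumulative) - 1:]:
            ultimate *= factor
        ibnr[year] = ultimate - cumulative[-1]
    return factors, ibnr

factors, ibnr = chain_ladder(TABLE)
for year in sorted(ibnr):
    print(year, round(ibnr[year]))
```

This yields a single point estimate per year; the stochastic models discussed in the slides put a distribution around each of those figures.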
Profit Share Pricing
• Profit share arrangement creates asymmetric return for the insurer
• The variability of claim outcomes will need to be considered when setting the cost of the profit share
Profit Share Pricing
Process to determine premiums is iterative
• Initial premiums could be chosen based on risk tolerance, say, to cover claims at 95th percentile
• Simulation process could be repeated by varying premiums and profit share refund %
• Higher premiums lead to lower exposure to claims variability for the insurer but reduce competitiveness
• Final premiums and profit share refund % will be based on the balance between insurer’s and client’s objectives
Cost of profit share = f (profit share refund %, premiums, claims distribution)
Cost of profit share depends on:
• Profit share refund %
• Premiums
• Volatility in claims distribution
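One way to read the function f is as the average refund over simulated claim outcomes. The Python sketch below assumes a simple refund rule, refund % times any excess of premium (net of an expense loading) over claims, floored at zero. That rule, the `expense_pct` parameter and the three claim outcomes are hypothetical; the slides do not specify the formula.

```python
def profit_share_cost(simulated_claims, premium, refund_pct, expense_pct=0.0):
    """Average refund paid to the client across simulated claim outcomes.

    Assumed rule: refund = refund_pct * max(premium * (1 - expense_pct) - claims, 0)
    """
    retained = premium * (1.0 - expense_pct)
    refunds = [refund_pct * max(retained - claims, 0.0) for claims in simulated_claims]
    return sum(refunds) / len(refunds)

# Hypothetical claim outcomes; in practice these would come from a stochastic claims model.
claims = [80.0, 100.0, 120.0]

# Repeat the calculation for different premiums and refund percentages.
for premium in (100.0, 110.0, 120.0):
    for refund_pct in (0.25, 0.50):
        cost = profit_share_cost(claims, premium, refund_pct)
        print(f"premium={premium:.0f} refund%={refund_pct:.0%} cost={cost:.2f}")
```

Because the refund is floored at zero, the insurer shares the upside but keeps the whole downside, which is why the cost depends on the spread of the claims distribution and not only on its mean.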
Generalised Linear Model
A statistical model that relates a response variable to a set of explanatory variables (rating factors)
Response Variables Examples
Incidence rate
Lodgement delay
Lapse rate
Termination rate
Explanatory Variables Examples
Age
Smoking status
Gender
Policy Duration
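To show what relating a response variable to explanatory variables looks like in practice, the sketch below fits a small GLM by iteratively reweighted least squares in plain Python. The choice of a Gamma error structure with a log link, the single yes/no rating factor and the eight delay values are all assumptions made for illustration; the slides do not state which family or link was used, and a statistical package would normally do this fitting.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            ratio = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= ratio * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_gamma_log_glm(X, y, iterations=25):
    """Fit E[y] = exp(X beta), Gamma errors, by iteratively reweighted least squares."""
    p = len(X[0])
    eta = [math.log(v) for v in y]  # start from the observed responses
    beta = [0.0] * p
    for _ in range(iterations):
        mu = [math.exp(e) for e in eta]
        # With a Gamma family and log link the IRLS weights are all 1, so each
        # step is an ordinary least squares fit to the working response z.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
        xtz = [sum(row[i] * zi for row, zi in zip(X, z)) for i in range(p)]
        beta = solve(xtx, xtz)
        eta = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    return beta

# Hypothetical response (delay in months) and one yes/no rating factor.
delays = [2.0, 3.0, 4.0, 3.0, 6.0, 8.0, 7.0, 7.0]
factor = [0, 0, 0, 0, 1, 1, 1, 1]
X = [[1.0, float(f)] for f in factor]  # intercept column plus the factor

beta = fit_gamma_log_glm(X, delays)
print("baseline delay:", round(math.exp(beta[0]), 3))
print("relative delay for factor = 1:", round(math.exp(beta[1]), 3))
```

With a log link the exponentiated coefficient is a multiplicative relativity, the same kind of "relative delay" quantity the case study charts report.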
Why Use Generalised Linear Model?
Moving away from one-way analysis
One-way analysis does not take account of correlations between factors
GLM is multivariate which allows for correlations and interactions between factors. Rating factors can be ranked and discovery of new rating factors is made easier using statistical packages.
Why Use Generalised Linear Model?
Moving from aggregated data to individual data
Group insurance claims analyses are typically performed on aggregated data using the chain ladder approach. This approach shows whether delays are getting longer or shorter, but it does not necessarily help us understand the drivers behind the delays.
The advantages: Further insights could be generated by identifying what drives claim delays using individual claims data. Volume and richness of data will continue to grow.
Claim ID  Member ID  Loss Date   Reported Date  Claim Paid  Delay
0005      M61868     5/06/2012   5/07/2012      200000      1.0
3005      M003       18/07/2014  16/10/2014     150000      3.0
03656     M30540     6/09/2010   25/03/2011     80000       6.6
9873      M2168      1/07/2011   29/10/2011     400000      3.9
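The Delay column can be derived from the two dates on each record. The Python sketch below assumes the delay is measured in months, taken as days between loss and report divided by an average month of 365.25 / 12 days and rounded to one decimal. The slides do not state this convention, but it reproduces the four sample values.

```python
from datetime import datetime

DAYS_PER_MONTH = 365.25 / 12  # assumed average month length

def delay_in_months(loss_date, reported_date):
    """Months between loss and notification, for dates written as d/mm/yyyy."""
    lost = datetime.strptime(loss_date, "%d/%m/%Y")
    reported = datetime.strptime(reported_date, "%d/%m/%Y")
    return round((reported - lost).days / DAYS_PER_MONTH, 1)

# Loss date and reported date for the four sample claims.
claims = [
    ("5/06/2012", "5/07/2012"),
    ("18/07/2014", "16/10/2014"),
    ("6/09/2010", "25/03/2011"),
    ("1/07/2011", "29/10/2011"),
]
delays = [delay_in_months(loss, reported) for loss, reported in claims]
print(delays)
```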
Questions GLM Can Help Answer
• Which are the most important drivers of experience?
• Which customer segments should the business target?
• Are there any interactions between rating factors?
• What is the optimal grouping of rating factors?
• What is the impact of rating factors on the dependent variable?
Sourcing data for GLM analysis
• GLM analysis is not limited to data that is only available internally.
• External sources of data can be linked to internal data using postcodes to extend analysis.
• Mitchell LGA example:
  – Community profiles: Top employing industry is manufacturing, occupation profile light blue
  – Economic data: Unemployment rate of 7.5% at Dec-15 quarter
  – Socio-Economic Indexes for Areas (SEIFA): Area is not disadvantaged (2011)
Source: abs.gov.au and employment.gov.au
Can these factors help explain claims experience?
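Linking by postcode amounts to a lookup from each claim's postcode to area-level attributes. The Python sketch below uses placeholder postcodes ("P0001", "P9999"), hypothetical claim records and hypothetical field names; only the Mitchell LGA unemployment rate of 7.5% is taken from the example above.

```python
def link_external_data(claims, area_by_postcode):
    """Attach area-level attributes to each claim record via its postcode."""
    linked = []
    for claim in claims:
        area = area_by_postcode.get(claim["postcode"], {})
        linked.append({
            **claim,
            "lga": area.get("lga"),  # None when the postcode has no match
            "unemployment_rate": area.get("unemployment_rate"),
        })
    return linked

# External data keyed by postcode (placeholder key, not a real postcode).
area_by_postcode = {
    "P0001": {"lga": "Mitchell", "unemployment_rate": 7.5},
}

# Internal claim records (hypothetical).
claims = [
    {"claim_id": "A", "postcode": "P0001", "delay_months": 3.0},
    {"claim_id": "B", "postcode": "P9999", "delay_months": 6.6},
]

linked = link_external_data(claims, area_by_postcode)
for row in linked:
    print(row)
```

Unmatched postcodes are kept with empty area fields rather than dropped, so the match rate can be checked before the linked fields are used as rating factors.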
Visualisations
R can be used to visualise analysis.
[Map: Internal Data + ABS Digital Boundaries Data for Australia; shading indicates Higher Delay]
Background and Motivation
Volatility in claim delay drives uncertainty in pricing, profit and capital reporting for Group Insurance.
Mis-estimation of claim delays by a few months could cost millions!
With anticipated continued improvement in data quality and availability of external data, can we utilise member level claims data to gain further insights into what could be driving claims delay?
[Diagram: Claim Delay shown between Incidence and Lodgement]
What Drives Claim Delay?
Level of awareness is often cited, but can we dig deeper?
Candidate drivers of Claim Delay:
• Level of Awareness
• Cause of Claim
• Level of Unemployment
• Age
• Exposure to aggressive lawyer advertising?
• Level of Education
GLM was identified as the appropriate tool to:
• Uncover and test new factors
• Rank these factors by importance
• Define the appropriate banding for these factors (e.g. Age)
• Estimate the impact that the factors might have on the claim delay
A forward selection algorithm may be used to identify rating factors. The process stops when adding more factors no longer improves explanatory power.
One-factor model
•Fit each of the remaining rating factors in turn
•Select the one-factor model that best fits the data
Multi-factor model
•Final model with statistically significant factors
Searching for Relevant Factors and Ranking them by Importance
[Diagram: candidate factors (Age, Level of Awareness, Cause of Claim, Level of Unemployment, Exposure to aggressive lawyer advertising?, Level of Education) fitted against Claim Delay]
Fit additional factors; some may drop off.
forward package: https://cran.r-project.org/web/packages/forward/forward.pdf
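The forward selection loop described on this slide can be sketched as follows. This is a simplified Python stand-in, not the method of the R package cited above: it uses ordinary least squares with the drop in residual sum of squares as the measure of explanatory power, and a hypothetical `min_gain` stopping threshold in place of a formal significance test. The factor names and data are made up so that the delay depends on two of the three candidates.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            ratio = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= ratio * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def rss(columns, y):
    """Residual sum of squares from regressing y on an intercept plus the given columns."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2 for row, yi in zip(X, y))

def forward_select(candidates, y, min_gain=0.01):
    """Add the best remaining factor until the gain falls below min_gain of total variation."""
    selected, remaining = [], dict(candidates)
    total = current = rss([], y)  # intercept-only model
    while remaining:
        scores = {name: rss([candidates[s] for s in selected] + [col], y)
                  for name, col in remaining.items()}
        best = min(scores, key=scores.get)
        if current - scores[best] <= min_gain * total:
            break  # no worthwhile improvement in explanatory power
        selected.append(best)
        current = scores[best]
        del remaining[best]
    return selected

# Hypothetical data: delay depends on age band and solicitor flag, not on the third factor.
candidates = {
    "age_band": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "solicitor": [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0],
    "education_index": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0],
}
delay = [3 * a + 2 * s for a, s in zip(candidates["age_band"], candidates["solicitor"])]

selected = forward_select(candidates, delay)
print(selected)
```

The order in which factors enter gives a rough ranking by importance; the factor that adds nothing once the others are in the model is never selected.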
Claim Delay and Age
Q1: Are older members potentially more reliant on their cover and therefore may be more aware of their cover?
[Chart: Relative Delay (Longer/Shorter) by Age Band: [0,30) [30,40) [40,50) [50,55) [55,60) [60,70), presented relative to a baseline band, with significance markers (***). Annotations: "Preservation Age"; "Indifferent? retirement is still far away"]
R can be used to identify the appropriate age bands: Start with a model with an explanatory variable that contains all raw ages. R will highlight the ages that are statistically significantly different and will inform how the ages should be grouped.
Darker-toned bars indicate a higher number of claims in the group. A higher number of * indicates higher statistical significance.
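Once the bands have been chosen, each member's raw age has to be mapped to its band before refitting the model. A minimal Python sketch, using the band edges shown on the chart's age axis (the function name and the error handling are illustrative choices):

```python
from bisect import bisect_right

AGE_BAND_EDGES = [0, 30, 40, 50, 55, 60, 70]  # band boundaries from the chart's age axis

def age_band(age):
    """Return the half-open band [lower,upper) that contains the given age."""
    if not AGE_BAND_EDGES[0] <= age < AGE_BAND_EDGES[-1]:
        raise ValueError(f"age {age} is outside the banded range")
    i = bisect_right(AGE_BAND_EDGES, age)
    return f"[{AGE_BAND_EDGES[i - 1]},{AGE_BAND_EDGES[i]})"

for age in (29, 30, 54, 55, 69):
    print(age, age_band(age))
```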
Claim Delay and Cause of Claim
Q2: Are certain causes of claim susceptible to longer delays?
[Chart: Relative Delay (Longer/Shorter) by Cause of Claim, presented relative to a baseline cause, with significance markers (* to ***). Annotations: "Small number of claims with significantly longer delays related to more complicated conditions"; "Conclusive diagnosis is potentially complex and could take time?"; "Social perception?"]
Darker-toned bars indicate a higher number of claims in the group. A higher number of * indicates higher statistical significance.
Claim Delay and Unemployment
Q3: Do claimants who reside in areas where unemployment rates are high have relatively shorter claim delays?
[Map: areas with higher unemployment show shorter delay]
High Unemployment, Shorter Claim Delay. Higher reliance?
Claim Delay and Solicitor Advertising
Q4: Does solicitor advertising affect the claim delay? Claims with solicitor representation may come in late as claimants were initially unaware of their cover.
[Chart: Relative Delay (Longer/Shorter) by Solicitor Representation (Yes, Unknown, No), presented relative to a baseline category, with significance markers (***)]
Late onset awareness influenced by advertising?
A higher number of * indicates higher statistical significance.
Other factors to consider…
Factors that may influence Level of Awareness and Claim Delay:
• Whether the member has access to a financial planner
• Frequency and form of communication between superannuation funds and their members
• Other transactional data (payment reminder notices)
• Socio-economic status (consumer profiling based on residential address)
• Claims management process
• Type of cover: Default vs. Opt-in vs. Underwritten Cover
Additional Insights and Potential Benefits
Age, Cause of Claim, Unemployment and Additional Rating Factors inform:
• Better segmentation of risk, assessment of ultimate claim cost and trends in awareness levels
Potential Benefits
• More accurate reserving
• Enhanced pricing based on characteristics of customers under the scheme
• Optimised return on capital
• and others…
Conclusion
• Stochastic claims models and GLM are tools that can be used to unravel useful additional insights
• With the continued increase in computing power and increased access to quality data, these techniques become easier to implement
• Results should not be viewed mechanically without judgement
No models are correct, but some are useful!
DISCLAIMER The content is current as at the date set out on the cover page of this presentation and may be subject to change. This presentation provides general information only, without taking into account the objectives, financial situation, needs or personal circumstances of any individual. This presentation may contain projections concerning financial information and statements concerning future economic performance and events, plans and objectives relating to management, operations, products and services, and assumptions underlying these projections and statements. It is possible that actual results and financial conditions may differ, possibly materially, from the anticipated results and financial condition indicated in these projections and statements.
Thank You
References
• England, PD and Verrall, RJ (2002) Stochastic Claims Reserving in General Insurance
• abs.gov.au: Demographic profiles, Socio-economic indexes
• employment.gov.au: Unemployment rate
• ChainLadder R package: https://cran.r-project.org/web/packages/ChainLadder/ChainLadder.pdf
• forward (forward search) R package: https://cran.r-project.org/web/packages/forward/forward.pdf