Analytics, Big Data and The Cloud II Conference - Kiribatu Labs
Learn how insurers predict risk and how you can apply it to your predictive analytics project
Pawel Brzeminski, Founder & CEO [email protected]
May 15, 2013 Analytics, Big Data, and The Cloud II
Edmonton
The Company
Kiribatu Labs: Discovering Knowledge Assets
Kiribatu is a predictive analytics company, founded in 2009, with 6 employees.
We serve the Canadian financial sector, predominantly Property & Casualty insurance.
Predictive analytics, huh?
Goal-driven ANALYSIS of a large data set to PREDICT human behavior
If speed was important to you…
YOUR insurance premium is calculated by methods designed 40-50 years ago
VS.
Risk assessment in Insurance
A vast majority of Canadian insurers (May 2013) still use outdated premium rating formulas created in the 1960s-1970s.
Only a handful of Canadian insurance companies are sophisticated predictive analytics users.
Leaders are decimating their competition.
Where to start?
Source: By Phil McElhinney from London (Jeremy Wariner) (http://creativecommons.org/licenses/by-sa/2.0)
How to identify an opportunity for a predictive analytics project?
Questions to ask while starting
Data is already collected (or can be easily acquired): transactional data, customer data, sensor-generated data, usage data, etc.
There is a clear objective to predict something: future price, failure rate, customer risk, customer profitability, customer retention, etc.
Well-defined functional settings are a great place to start. We focused on optimizing a Risk Sharing Pool (RSP) problem.
Typically the SMEs (Subject Matter Experts) are making decisions based on their experience and “gut feeling” (senior underwriters in our case).
Significant ROI is expected. Investment in analytics can be small, but usually it is not trivial.
Example
The Risk Sharing Pool is a construct used by Canadian insurers to optimize their risk assessment.
Insurers put their highest risks (a primary driver and a vehicle) in the pool to avoid paying for the claims, but they forfeit the premium.
Insurers retain the risks they deem profitable on their book of business. They can collect the premium and make a profit.
Challenge
Can we effectively predict future claims on policies?
The model would need to predict claims that will occur up to 12 months in advance.
Introducing Underwriting Score
The predictive model generates an Underwriting (UW) Score.
The UW Score is a number between 1 and 1000.
High UW Score = high profitability = low risk
Low UW Score = low profitability = high risk
The UW Score is a highly accurate predictor of future claims on a policy.
The UW Score will be used to assess which risks are placed in the pool and which are not.
4 Key Modeling Steps
Data Preparation
Rating Factor Analysis
Model Development
Gain Assessment
Data Preparation
• Policy & claims data profiling, understanding and verification
• Data cleansing (filling missing values, outlier removal)
• Data transformation
• Data normalization (inflation & claim development factors)
• Data enrichment with 3rd party data (demographic, econometric – Census Canada, VICC, CLEAR, etc.)
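As a rough illustration of the cleansing and normalization bullets, the sketch below drops records above an assumed plausibility cap, fills a missing value with the median of the remaining ones, and restates claim amounts in a common year's dollars. The field names, the 100,000 km cap and the inflation factors are invented for the example and are not Kiribatu's actual rules or data; claim development factors are not covered.

```python
from statistics import median

# Hypothetical policy records; field names and values are invented for illustration.
records = [
    {"policy": "A", "annual_km": 15000, "claim_year": 2010, "claim_amount": 1000.0},
    {"policy": "B", "annual_km": None, "claim_year": 2011, "claim_amount": 0.0},
    {"policy": "C", "annual_km": 200000, "claim_year": 2012, "claim_amount": 500.0},
    {"policy": "D", "annual_km": 25000, "claim_year": 2012, "claim_amount": 2000.0},
]

# Assumed factors that restate each year's claim amounts in 2012 dollars.
INFLATION_TO_2012 = {2010: 1.05, 2011: 1.02, 2012: 1.00}
# Assumed plausibility cap on annual distance driven.
MAX_PLAUSIBLE_KM = 100_000


def prepare(rows):
    """Drop implausible rows, fill missing mileage, and inflation-adjust claims."""
    plausible = [
        r for r in rows
        if r["annual_km"] is None or r["annual_km"] <= MAX_PLAUSIBLE_KM
    ]
    # Median of the known, plausible mileages is used as the fill value.
    fill_value = median(r["annual_km"] for r in plausible if r["annual_km"] is not None)
    cleaned = []
    for r in plausible:
        row = dict(r)  # copy so the raw records stay untouched
        if row["annual_km"] is None:
            row["annual_km"] = fill_value
        factor = INFLATION_TO_2012[row["claim_year"]]
        row["claim_amount_2012"] = row["claim_amount"] * factor
        cleaned.append(row)
    return cleaned


cleaned = prepare(records)
for row in cleaned:
    print(row["policy"], row["annual_km"], row["claim_amount_2012"])
```

Whether an outlier should be dropped, capped or sent back for verification is a judgment call for the underwriting team; dropping is used here only because it is the shortest to show.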
Rating Factor Analysis
• Statistical analysis of each data element for its propensity to claim
• Rating factors with high correlations are included in the final predictive model(s)
• Often, new powerful rating factors are discovered in this step (very useful for Underwriting)
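One simple way to look at a data element's propensity to claim is to compare the claim frequency at each of its levels with the overall frequency. The sketch below does that for a single made-up factor; the factor name `vehicle_use`, the `had_claim` flag and the sample rows are assumptions for illustration, and per-level frequency is used here as a stand-in for whatever correlation measure the actual analysis relied on.

```python
from collections import defaultdict

# Hypothetical policy transactions: one rating factor and a 0/1 claim indicator.
rows = [
    {"vehicle_use": "commute", "had_claim": 0},
    {"vehicle_use": "commute", "had_claim": 1},
    {"vehicle_use": "commute", "had_claim": 0},
    {"vehicle_use": "commute", "had_claim": 0},
    {"vehicle_use": "pleasure", "had_claim": 0},
    {"vehicle_use": "pleasure", "had_claim": 0},
    {"vehicle_use": "pleasure", "had_claim": 0},
    {"vehicle_use": "pleasure", "had_claim": 0},
    {"vehicle_use": "business", "had_claim": 1},
    {"vehicle_use": "business", "had_claim": 0},
]


def claim_frequency_by_level(data, factor):
    """Return {level: share of transactions at that level with a claim}."""
    counts = defaultdict(lambda: [0, 0])  # level -> [transactions, claims]
    for row in data:
        tally = counts[row[factor]]
        tally[0] += 1
        tally[1] += row["had_claim"]
    return {level: claims / n for level, (n, claims) in counts.items()}


overall = sum(r["had_claim"] for r in rows) / len(rows)
frequency = claim_frequency_by_level(rows, "vehicle_use")
# Relativity above 1 means the level claims more often than the book as a whole.
relativity = {level: f / overall for level, f in frequency.items()}

for level in sorted(frequency):
    print(level, frequency[level], round(relativity[level], 2))
```

Levels whose relativity sits far from 1 on enough data are the candidates worth carrying into the model; with only a handful of rows per level, as here, the numbers are illustrative only.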
Model Development
• Algorithm selection (genetic algorithms, neural networks, logistic regression, SVM)
• Time-wise training and testing data set split
• Model parameterization, generation and evaluation
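A time-wise split trains on older policies and tests on newer ones, so the model is evaluated the way it would be used: predicting claims that have not happened yet. A minimal sketch, assuming each record carries an `effective_date` field (a hypothetical name) and using an arbitrary cutoff date:

```python
from datetime import date

# Hypothetical policy records; only the date matters for the split.
policies = [
    {"policy": "A", "effective_date": date(2010, 3, 1)},
    {"policy": "B", "effective_date": date(2011, 7, 15)},
    {"policy": "C", "effective_date": date(2012, 1, 1)},
    {"policy": "D", "effective_date": date(2012, 9, 30)},
]


def time_wise_split(rows, cutoff):
    """Rows before the cutoff go to training, rows on or after it to testing."""
    train = [r for r in rows if r["effective_date"] < cutoff]
    test = [r for r in rows if r["effective_date"] >= cutoff]
    return train, test


train, test = time_wise_split(policies, date(2012, 1, 1))
print([r["policy"] for r in train])
print([r["policy"] for r in test])
```

Unlike a random split, no test record predates a training record, which makes it harder for features that only work in hindsight to look good.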
RSP Gain Assessment
• Calculation of UW Scores on test data set
• Retrospective underwriting gain assessment on historical data sets
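A retrospective gain assessment can be sketched as a what-if on historical policies: rank them by UW Score, assume the lowest-scoring share had been placed in the pool (claims avoided, premium forfeited), and compare the result with retaining everything. The figures and field names below are made up, and the sketch ignores any pool rules, fees and expenses, so it is a simplification rather than the method behind the results in this talk.

```python
# Hypothetical historical policies with their UW Score, premium and actual claims.
history = [
    {"uw_score": 950, "premium": 1000, "claims": 0},
    {"uw_score": 800, "premium": 1200, "claims": 0},
    {"uw_score": 600, "premium": 1100, "claims": 500},
    {"uw_score": 300, "premium": 1500, "claims": 0},
    {"uw_score": 50, "premium": 2000, "claims": 9000},
]


def underwriting_result(policies):
    """Premium collected minus claims paid on the retained policies."""
    return sum(p["premium"] - p["claims"] for p in policies)


def gain_from_ceding(policies, cede_fraction):
    """Gain versus retaining everything when the lowest UW Scores are ceded."""
    ranked = sorted(policies, key=lambda p: p["uw_score"])  # riskiest first
    n_ceded = int(len(ranked) * cede_fraction)
    retained = ranked[n_ceded:]
    return underwriting_result(retained) - underwriting_result(policies)


print(gain_from_ceding(history, 0.2))  # cede the single lowest-scoring policy
print(gain_from_ceding(history, 0.4))  # cede the two lowest-scoring policies
```

In this toy data, ceding the second policy lowers the gain because that policy had no claims, so its premium was forfeited for nothing. That trade-off is why the accuracy of the score matters.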
Results
Source: “Improving P&C Insurance Risk Management and Policy Pricing with Predictive Analytics”, Pawel Brzeminski, September 2011, http://www.kiribatulabs.com/resources.php.
UW Score = 1000 – Risk Score
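The conversion is a one-line inversion of the model's risk output. The sketch below assumes the Risk Score runs from 0 to 999, which is an inference from the stated UW Score range of 1 to 1000 and not something the slides spell out.

```python
def uw_score(risk_score):
    """Convert a Risk Score into a UW Score (higher UW Score = lower risk)."""
    # Assumed valid range; chosen so the result stays within 1 to 1000.
    if not 0 <= risk_score <= 999:
        raise ValueError("risk_score is assumed to lie between 0 and 999")
    return 1000 - risk_score


print(uw_score(950))  # very risky policy, low UW Score
print(uw_score(120))  # low-risk policy, high UW Score
```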
4 Key Challenges
Extremely low correlations / data set imbalance: 98% of policy transactions do not have any claims, 2% have claims.
Bad, bad data: drivers driving 200,000 km per year (that's driving over 500 km per day for 365 days a year).
Over-fitting: certain features do not generalize very well in a time-wise data split.
Data sparsity: Motor Vehicle Abstract (MVA) data that contains convictions, suspensions and reinstatements is not always available.
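To make the 98% / 2% imbalance concrete: one common remedy is to weight each instance inversely to its class frequency, so that the rare claim records carry as much total weight in training as the no-claim records. The sketch below uses the widely used "balanced" heuristic (total count divided by number of classes times class count); it is shown as a generic technique and is not necessarily the weighting scheme used in this project.

```python
from collections import Counter

# 100 hypothetical policy transactions: 98 without a claim (0), 2 with a claim (1).
labels = [0] * 98 + [1] * 2


def balanced_class_weights(y):
    """Weight per class = n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


weights = balanced_class_weights(labels)
print(weights)

# With these weights both classes contribute the same total weight.
total_by_class = {cls: weights[cls] * labels.count(cls) for cls in weights}
print(total_by_class)
```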
5 Key Breakthroughs
Policy transactions collapsed into single vectors: individual risk assessment for each vehicle on a policy.
Instance sampling and weighting: dealing with data set imbalance and bad data.
Custom model quality metric: aggregation of the highest claims in the top 5% of all transactions really moved the needle.
Risk assessment per insurance coverage: different data elements are important for each coverage; for instance, liability coverage and comprehensive coverage are completely different products that behave very differently.
Prediction of profitability: include written premiums in a 2nd level model.
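The slides do not define the custom quality metric precisely. One plausible reading is "what share of all claim dollars falls in the riskiest 5% of transactions as ranked by the model", and the sketch below computes that. The function name, field names and data are hypothetical, and the definition itself is an assumption.

```python
def claim_capture_at_top(transactions, fraction=0.05):
    """Share of total claim dollars found in the lowest-UW-Score fraction."""
    ranked = sorted(transactions, key=lambda t: t["uw_score"])  # riskiest first
    n_top = max(1, int(len(ranked) * fraction))
    total = sum(t["claims"] for t in ranked)
    if total == 0:
        return 0.0
    return sum(t["claims"] for t in ranked[:n_top]) / total


# 20 hypothetical transactions with UW Scores 50, 100, ..., 1000 and no claims...
transactions = [{"uw_score": 50 * (i + 1), "claims": 0} for i in range(20)]
# ...except one large claim on the lowest score and a smaller one mid-range.
transactions[0]["claims"] = 8000
transactions[9]["claims"] = 2000

capture = claim_capture_at_top(transactions)
print(capture)
```

A metric of this shape rewards a model for pushing the largest losses to the very bottom of the ranking, which matches how the score is used: only a small slice of the book is placed in the pool.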
Homework
Where can I apply predictive analytics in my business?
Questions? Always happy to have a coffee
Pawel Brzeminski, Founder & CEO [email protected]
780-232-2634
http://ca.linkedin.com/pub/pawel-brzeminski/0/523/555
@pawelwb