DATA SCIENCE FOR RISK MANAGEMENT (TLT 001) Speaker: Michael W. Elliott, CPCU, AIAF The Institutes



IoT and Technology

• Robotics

• Wearables

• Transportation

Types of Data

            Structured            Unstructured
Internal    • Claims history      • Adjuster notes
            • Safety data         • Surveillance videos
External    • Financial data      • News reports
            • Labor statistics    • Social media text

Example - Commercial Fleet Telematics Data

• Seatbelt use

• Braking

• Driver

• Passengers

• Speed

• Left turns

• Acceleration

• Route

• Mileage

Data Science

• Math and Statistics Skills

• Domain Knowledge

• Hacking Skills

Data Science Techniques for Risk Management

• Association rules

• Clustering

• Classification

• Regression

• Text mining

• Social network analysis

Machine Learning - Safety

• Safety device

• Duties

• Shift

• Experience

• Weather

• Location

Predictive Modeling

Training Data → Test Data → Production
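The training and test stages can be illustrated with a simple holdout split. This is only a minimal sketch: the function name `holdout_split`, the 30% test fraction, the seed, and the claim identifiers are all assumptions made for illustration, not details from the slides.

```python
import random

def holdout_split(instances, test_fraction=0.3, seed=0):
    """Shuffle the instances and hold out a fraction as test data.

    The 30% default and the fixed seed are illustrative assumptions.
    """
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # Returns (training data, test data)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical claim identifiers standing in for labeled historical claims
claims = [f"claim-{i:03d}" for i in range(1, 11)]
train, test = holdout_split(claims)
print(len(train), "training instances;", len(test), "test instances")
```

The model is fitted on the training data only, checked against the held-out test data, and moved to production once its test performance is acceptable.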

Training Data – WC Claims Fraud

Name      Age   Body part            Attorney       Witness   Fraudulent
                previously injured   involvement              Claim
Anna      35    Y                    Y              N         Y
Carlos    42    N                    N              Y         N
David     53    N                    N              N         N
Jason     27    Y                    Y              N         Y
Sonia     32    N                    Y              Y         N

Each row is an instance. The columns Age through Witness are the attributes. Fraudulent Claim is the class label.

New Instance

Gregory   45    Y                    Y              Y         ?

Information Gain from Various Attributes

[Figure: bar chart of information gain, on a scale from 0 to 1, for each attribute: Prev. Injured Body Part, Age, Attorney involvement, Days to Report Claim, Day of week, Witness.]
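Information gain can be worked through by hand on the five training rows from the WC claims fraud slide. The sketch below uses the standard entropy-based definition and covers only the three Y/N attributes in that table. Age would need a split threshold, and Days to Report Claim and Day of week do not appear in the table, so the result will not reproduce the chart. The names `TRAINING`, `ATTRIBUTES`, `entropy`, and `information_gain` are illustrative assumptions.

```python
from collections import Counter
from math import log2

# Rows transcribed from the training data slide:
# (name, age, prev_injured, attorney, witness, fraudulent_claim)
TRAINING = [
    ("Anna", 35, "Y", "Y", "N", "Y"),
    ("Carlos", 42, "N", "N", "Y", "N"),
    ("David", 53, "N", "N", "N", "N"),
    ("Jason", 27, "Y", "Y", "N", "Y"),
    ("Sonia", 32, "N", "Y", "Y", "N"),
]
# Column index of each Y/N attribute (the short names are assumptions)
ATTRIBUTES = {"prev_injured": 2, "attorney": 3, "witness": 4}

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, col, label_col=5):
    """Entropy of the class label minus its weighted entropy after splitting on col."""
    base = entropy([r[label_col] for r in rows])
    remainder = 0.0
    for value in {r[col] for r in rows}:
        subset = [r[label_col] for r in rows if r[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

gains = {name: information_gain(TRAINING, col) for name, col in ATTRIBUTES.items()}
for name, gain in sorted(gains.items(), key=lambda item: -item[1]):
    print(f"{name}: {gain:.3f}")
```

On these five rows, previous injury to the body part separates the fraudulent claims perfectly (gain of about 0.97 bits), while attorney involvement and witness each gain about 0.42 bits. With so few instances this only illustrates the calculation.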

Classification Tree – WC Claim Fraud

Previously Injured Body Part?
  No → Age?
    < 40 → Number of Medical Visits?
      < 3 → Days to Report Claim?
        < 1 → Prob. Fraud = .02
        > 1
      > 3
    >= 40
  Yes → Attorney involvement?
    No
    Yes → Day of week?
      Other than Monday
      Monday → Witness?
        Yes
        No → Prob. Fraud = .80

Classification as a Set of Rules

If (body part previously injured) AND (an attorney is involved) AND (day of week is Monday) AND (no witness) THEN Class = Fraud Likely – Refer for Further Investigation

If (body part not previously injured) AND (age less than 40) AND (number of medical visits less than 3) AND (claim reported within 1 day) THEN Class = Fraud Highly Unlikely
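The two rules above translate directly into code. In this minimal sketch, the dictionary keys (`prev_injured`, `attorney`, `day_of_week`, `witness`, `age`, `medical_visits`, `days_to_report`) and the two example claims are hypothetical. "Reported within 1 day" is read here as one day or less, which is an assumption.

```python
def classify(claim):
    """Apply the two rules from the slide; return None when neither rule fires."""
    if (claim["prev_injured"] and claim["attorney"]
            and claim["day_of_week"] == "Monday" and not claim["witness"]):
        return "Fraud Likely: Refer for Further Investigation"
    if (not claim["prev_injured"] and claim["age"] < 40
            and claim["medical_visits"] < 3
            and claim["days_to_report"] <= 1):  # assumed reading of "within 1 day"
        return "Fraud Highly Unlikely"
    return None  # the slide states no rule for other combinations

# Hypothetical claims, invented for illustration
suspicious = {"prev_injured": True, "attorney": True, "day_of_week": "Monday",
              "witness": False, "age": 35, "medical_visits": 5, "days_to_report": 4}
routine = {"prev_injured": False, "attorney": False, "day_of_week": "Wednesday",
           "witness": True, "age": 29, "medical_visits": 1, "days_to_report": 0}

print(classify(suspicious))
print(classify(routine))
```

Returning `None` for unmatched claims keeps the sketch faithful to the slide, which lists only two of the paths through the tree.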

Evaluating a Predictive Model

[Figure: chart of % correctly identified (vertical axis) against % test sample (horizontal axis), with series Baseline and Model and a labeled point at (20%, 40%).]
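One way to read this chart is as a cumulative gains curve: test claims are ranked by the model's fraud score, and the curve tracks what share of the actual frauds has been found after reviewing the top portion of the sample. Under that reading, the labeled point (20%, 40%) would mean that the highest-scored 20% of the test sample contains 40% of the fraudulent claims, against 20% for a random baseline. The scores and labels below are made up so the sketch passes through that point. They are not data from the presentation.

```python
def cumulative_gains(scores, labels):
    """Return (fraction of sample reviewed, fraction of positives found) points.

    Claims are reviewed in descending order of model score.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    points = [(0.0, 0.0)]
    found = 0
    for i, (_, label) in enumerate(ranked, start=1):
        found += label
        points.append((i / len(ranked), found / total_positives))
    return points

# Hypothetical test sample: 10 claims, 5 of them fraudulent (label 1)
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.15, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]

curve = cumulative_gains(scores, labels)
for sample_share, found_share in curve:
    # A random baseline finds positives in proportion to the share reviewed
    print(f"{sample_share:.0%} of sample: model {found_share:.0%}, baseline {sample_share:.0%}")
```

The further the model curve sits above the baseline diagonal, the more fraud a fixed investigation budget can be expected to uncover.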

Text Mining – Adjusters’ Notes

Training Data

Claim   Co-morbidity   Commute >50 mi.   Current prescriptions   Provider A   Class Label
001     1              1                 0                       1            1
002     1              1                 1                       0            0
003     0              0                 0                       0            0
004     0              0                 1                       1            0
005     1              1                 0                       1            1

New Instance

2237    1              1                 1                       0            ?
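The 0/1 columns in this table have to be derived from free-text adjusters' notes before a model can use them. The sketch below shows one simple way to do that with keyword and pattern matching. The note text, the keyword lists, and the function name `note_to_features` are all hypothetical, and a real system would need to handle negation ("no current prescriptions"), misspellings, and abbreviations.

```python
import re

def note_to_features(note):
    """Turn one adjuster's note into the four binary indicators from the slide.

    The patterns are illustrative assumptions, not a validated vocabulary.
    """
    text = note.lower()
    # Look for a commute distance such as "commutes 60 miles"
    miles = [int(m) for m in re.findall(r"commutes?\s+(\d+)\s*mi", text)]
    return {
        "co_morbidity": int(bool(re.search(r"co-?morbid|diabet|hypertens", text))),
        "commute_over_50_mi": int(any(m > 50 for m in miles)),
        "current_prescriptions": int("prescription" in text),
        "provider_a": int(bool(re.search(r"\bprovider a\b", text))),
    }

# Hypothetical note written to match the new instance (claim 2237) in the table
note = ("History of diabetes noted as a co-morbidity. Commutes 60 miles each way. "
        "Currently taking prescription medication. Treated by Provider B.")

print(note_to_features(note))
```

Once every note is reduced to a row like this, the same classification techniques used on structured claim data can be applied.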

(Social) Network Analysis

“degree” is high

“closeness” is high

“betweenness” is high
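The three measures can be computed on a small example network. In this sketch the seven-node graph is invented: two tight groups (A, B, C and E, F, G) joined only through D. Degree counts direct connections, closeness is the number of other nodes divided by the total shortest-path distance to them, and betweenness counts the shortest paths between other pairs of nodes that pass through a node (computed with Brandes' algorithm for unweighted graphs). All function and variable names are assumptions for illustration.

```python
from collections import deque

# Hypothetical undirected network, stored as adjacency lists
GRAPH = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D", "F", "G"], "F": ["E", "G"], "G": ["E", "F"],
}

def degree(graph):
    """Number of direct connections per node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}

def closeness(graph):
    """(reachable nodes) / (sum of shortest-path distances), via breadth-first search."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result

def betweenness(graph):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes)."""
    score = {n: 0.0 for n in graph}
    for s in graph:
        dist = {s: 0}
        sigma = {n: 0 for n in graph}   # number of shortest paths from s
        sigma[s] = 1
        preds = {n: [] for n in graph}  # predecessors on shortest paths
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in graph}
        for w in reversed(order):       # accumulate from the farthest nodes back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints
    return {n: b / 2 for n, b in score.items()}

deg, close, btw = degree(GRAPH), closeness(GRAPH), betweenness(GRAPH)
for node in GRAPH:
    print(node, deg[node], round(close[node], 3), btw[node])
```

In this example D has only two connections, yet it scores highest on both closeness and betweenness because every path between the two groups runs through it. In a claims setting, a node like that might be a provider or attorney linking otherwise separate groups of claimants.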

Loss Cost vs. Data Value

[Figure: chart relating Loss Cost and Data Value.]

Questions?