Health Equity Analytics Solution

Health Equity Analytics Solution -Team PowderQuants Team lead: Ben Taylor Analyst: Justin Powell [email protected]


Transcript of Health Equity Analytics Solution

Page 1: Health Equity  Analytics Solution

Health Equity Analytics Solution

Team PowderQuants
Team lead: Ben Taylor
Analyst: Justin Powell
[email protected]

Page 2: Health Equity  Analytics Solution

Outline

• Define the objective
• Data Formatting
• Data Clustering
• Predictive Analytics Model
• Solution
• ROI
• Looking Forward

Page 3: Health Equity  Analytics Solution

Define the objective: brief background

• Descriptive Analytics
  – This is the most basic solution: nothing more than a graphical visualization resting on top of a database. If data visualization is needed, there are many plug-and-play vendors such as Tableau, Domo, etc.
• Predictive Analytics
  – Using the data from the descriptive analytics, can a model be built to predict account spend rate? This requires a background in modeling and proper success metrics to ensure overfitting is not an issue.
• Prescriptive Analytics
  – Rather than just firing a prediction or threshold to react to the data, prescriptive analytics attempts to use the model for insight to change the future outcome. An example would be targeting chronic diabetic customers to reduce the risk of limb amputation.

Page 4: Health Equity  Analytics Solution

Define the objective

• The problem objective is to develop a model that can predict account balance risk for preemptive notification.
  – Challenge
    • Focusing too much on the end goal can distract and confuse.
    • Simplifying the problem into tractable pieces reveals where the focus of the algorithm should be: predicting the likely spend rate of each individual. The rest of the math after that is simple.

[Diagram: Inputs → ? → Predicted spend. Predicted spend, expected contribution, and account balance feed a decision with two outcomes: Fund or Ok.]
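The "simple math" after the spend-rate prediction can be sketched roughly as follows. This is a minimal illustration, not the team's code: the `Account` fields, the per-day units, and the 90-day horizon are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Account:
    balance: float                 # current account balance, USD
    expected_contribution: float   # expected contribution, USD per day (assumed unit)
    predicted_spend: float         # predicted spend rate, USD per day (assumed unit)


def days_until_empty(acct: Account) -> float:
    """Days until the balance reaches zero at the net burn rate."""
    net_burn = acct.predicted_spend - acct.expected_contribution
    if net_burn <= 0:
        return float("inf")  # contributions cover predicted spending
    return acct.balance / net_burn


def decision(acct: Account, horizon_days: float = 90.0) -> str:
    """Return "Fund" if the account is projected to empty within the horizon."""
    return "Fund" if days_until_empty(acct) < horizon_days else "Ok"
```

For example, `decision(Account(500, 2, 10))` gives "Fund" (about 62 days of runway), while a well-funded account with the same spend rate gives "Ok".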

Page 5: Health Equity  Analytics Solution

Define the objective

• Common pitfalls with predictive analytics
  – Default objective is incorrect
    • Many novice users apply default algorithms without much thought about the algorithm's underlying objective. This causes problems if the objective is simply tied to an overall error measure (R^2, RMSE, etc.), which is not robust to outlier influence or scaling issues and gives the end user no sense of model confidence. [PowderQuants use 3 metrics for comparison]
  – Overfitting => solution confidence / quality
    • “Any solution without an associated confidence is no solution at all.” An R^2 of 1 can be obtained given enough input variables in a model, but such a model offers poor predictive power beyond the training set. Cross-validation and bootstrapping help assess model confidence. [PowderQuants provide robust confidence metrics]

[Diagram: Inputs → ? → Predicted spend rate]
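The overfitting pitfall can be illustrated with a toy example. This sketch is not the team's model: it stands in "enough input variables" with polynomial terms, fitting ten points of pure noise with a ten-parameter polynomial (Lagrange interpolation), then scoring the same polynomial on holdout points. All data here is synthetic.

```python
import random


def lagrange_predict(xs, ys, x):
    """Evaluate the unique degree-(n-1) polynomial through n points at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


rng = random.Random(0)
# Pure noise: there is no real relationship between x and y to learn.
train_x = [float(i) for i in range(10)]
train_y = [rng.gauss(0.0, 1.0) for _ in train_x]
holdout_x = [i + 0.5 for i in range(9)]
holdout_y = [rng.gauss(0.0, 1.0) for _ in holdout_x]

train_pred = [lagrange_predict(train_x, train_y, x) for x in train_x]
holdout_pred = [lagrange_predict(train_x, train_y, x) for x in holdout_x]

r2_train = r_squared(train_y, train_pred)        # ten parameters, ten points
r2_holdout = r_squared(holdout_y, holdout_pred)  # same model, unseen points
print(f"train R^2 = {r2_train:.3f}, holdout R^2 = {r2_holdout:.3f}")
```

The training R^2 is 1 by construction, because the polynomial passes through every training point. The holdout score is the number that says anything about predictive power.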

Page 6: Health Equity  Analytics Solution

Data Formatting

• Joining up the data
  – Unique [MemberID.dependent x CPT claim]
  – Combine all other data into a single table keyed off of either claimID or memberID
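A join of this kind might look roughly like the sketch below. The slides do not give the real schema, so every column name and row here (`dependent`, `cpt`, `paid`, `birth_year`, and so on) is hypothetical.

```python
# Hypothetical rows; the real column names are not given in the slides.
claims = [
    {"claimID": "c1", "memberID": "m1", "dependent": 0, "cpt": "99051", "paid": 40.0},
    {"claimID": "c2", "memberID": "m1", "dependent": 1, "cpt": "99213", "paid": 85.0},
    {"claimID": "c3", "memberID": "m2", "dependent": 0, "cpt": "99051", "paid": 40.0},
]
members = {
    "m1": {"birth_year": 1980, "gender": "F"},
    "m2": {"birth_year": 1955, "gender": "M"},
}


def join_claims(claims, members):
    """Left-join member attributes onto each claim, keyed by memberID."""
    table = []
    for claim in claims:
        row = dict(claim)
        row.update(members.get(claim["memberID"], {}))
        # One person = member plus dependent index, e.g. "m1.1".
        row["person"] = f'{claim["memberID"]}.{claim["dependent"]}'
        table.append(row)
    return table


def person_by_cpt_counts(table):
    """Count claims for each unique (person, CPT code) pair."""
    counts = {}
    for row in table:
        key = (row["person"], row["cpt"])
        counts[key] = counts.get(key, 0) + 1
    return counts
```

The counts dictionary is a sparse form of the member-by-CPT matrix: only the pairs that actually occur are stored.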

Page 7: Health Equity  Analytics Solution

Clustering Data

Looking at the sparse raw data (left), it is nearly impossible to see the value. Clustering using a self-organizing map (right) allows areas of interest to come to life. Procedures with high use counts among members, and with correlations to other procedures, are now readily visible along the diagonal.

[Figure: left panel, raw matrix of unique members by unique CPT codes, annotated "Sparse!"; right panel, the same matrix with unique members (reordered) by unique CPT codes (reordered). Arrow labels: "Shuffle", "organize".]
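The reordering idea can be sketched with a tiny one-dimensional self-organizing map. The slides do not say how the map was configured, so the node count, learning-rate schedule, neighborhood width, and the toy matrix below are all assumptions for illustration.

```python
import math
import random


def train_som(rows, n_nodes=4, epochs=30, seed=0):
    """Train a 1-D self-organizing map; returns one weight vector per node."""
    rng = random.Random(seed)
    dim = len(rows[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)                           # decaying learning rate
        sigma = max(0.5, (n_nodes / 2.0) * (1.0 - frac))  # shrinking neighborhood
        for row in rng.sample(rows, len(rows)):           # shuffled pass over rows
            bmu = best_node(weights, row)
            for k, w in enumerate(weights):
                # Nodes near the best-matching unit move more than distant ones.
                h = math.exp(-((k - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(dim):
                    w[d] += lr * h * (row[d] - w[d])
    return weights


def best_node(weights, row):
    """Index of the node whose weights are closest to the row (the BMU)."""
    dists = [sum((a - b) ** 2 for a, b in zip(w, row)) for w in weights]
    return dists.index(min(dists))


def reorder(rows, weights):
    """Row indices sorted by best-matching node, so similar rows sit together."""
    return sorted(range(len(rows)), key=lambda i: best_node(weights, rows[i]))


# Toy member x CPT usage matrix (1 = member has a claim with that code).
usage = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0, 0],
]
som = train_som(usage)
row_order = reorder(usage, som)
```

Applying the same two steps to the transposed matrix would reorder the CPT code columns as well.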

Page 8: Health Equity  Analytics Solution

A closer look

cool….

Page 9: Health Equity  Analytics Solution

Cluster + age underlay

Age-specific clusters can be visualized as well.

Page 10: Health Equity  Analytics Solution

Cluster drill down by CPT codes

These codes were not available to us, but I promise you they are closely related and provide insight

Page 11: Health Equity  Analytics Solution

Cluster drill down by CPT codes

99051: Service(s) provided in the office during regularly scheduled evening, weekend, or holiday office hours, in addition to basic service

Page 12: Health Equity  Analytics Solution

Cluster drill down by CPT codes

Page 13: Health Equity  Analytics Solution

Cluster drill down by CPT codes

These codes were not available to us, but I promise you they are closely related and provide insight into spending behavior

Page 14: Health Equity  Analytics Solution

Cluster drill down by CPT codes

Page 15: Health Equity  Analytics Solution

Training / Validation

All model building should try to use some sort of holdout set for confidence assessment.

Train 70%, Validate 30%
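A 70/30 holdout split can be sketched as below. The slides do not say whether the split was made by member, by claim, or by time period, so this generic shuffle over records is an assumption, and `train_validate_split` is a hypothetical helper name.

```python
import random


def train_validate_split(records, train_fraction=0.7, seed=0):
    """Shuffle records and split them into training and validation sets."""
    items = list(records)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split repeatable
    cut = round(len(items) * train_fraction)
    return items[:cut], items[cut:]
```

For 100 records this yields 70 for training and 30 for validation, with no record in both sets.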

Page 16: Health Equity  Analytics Solution

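Define Bucket Classifications

Looking at average daily spending behavior across all members, we can create a histogram and define classification buckets.

[Figure: histogram of average daily spend ($USD), divided into five buckets by percentile thresholds, with the annotation "Wellness (intervention)".]

Bucket 1: low (up to the 50th percentile)
Bucket 2: med (50th to 75th)
Bucket 3: med-high (75th to 95th)
Bucket 4: high (95th to 99.9th)
Bucket 5: extreme (above the 99.9th)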
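Assigning buckets from percentile thresholds could be sketched like this. The nearest-rank percentile rule and the choice to keep a value equal to a threshold in the lower bucket are assumptions; the slides only give the percentile levels.

```python
from bisect import bisect_left

BUCKETS = ["low", "med", "med-high", "high", "extreme"]
PER_MILLE = [500, 750, 950, 999]  # 50th, 75th, 95th, 99.9th percentiles


def thresholds(daily_spend):
    """Nearest-rank percentile cut points, using integer per-mille ranks."""
    ordered = sorted(daily_spend)
    n = len(ordered)
    cuts = []
    for pm in PER_MILLE:
        rank = max(1, -(-pm * n // 1000))  # ceil(pm * n / 1000), at least 1
        cuts.append(ordered[rank - 1])
    return cuts


def bucket(spend, cuts):
    """1-based bucket number; a value equal to a cut stays in the lower bucket."""
    return bisect_left(cuts, spend) + 1
```

The thresholds are computed once on the training data and then reused to classify any member's average daily spend, for example `BUCKETS[bucket(spend, cuts) - 1]`.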

Page 17: Health Equity  Analytics Solution

Simple baseline to compare against

• Assume training bucket classification persists
  – Validation results:
    • Absolute prediction error:
      – Mean = $7.43 USD/day, median = $1.40 USD/day
    • Hit rate:
      – 49% match bucket, 84% within 1 bucket, 98% within 2 buckets
    • Penalty error (over-estimates weighted x1/2, under-estimates x2):
      – We would rather over-estimate than under-estimate (this allows potential intervention)
      – 1.07
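The three comparison metrics might be computed along these lines. The slides do not define the penalty error precisely, so the version here (mean bucket miss, halved for over-estimates and doubled for under-estimates) is one plausible reading and is labeled as an assumption.

```python
from statistics import mean, median


def absolute_errors(actual_spend, predicted_spend):
    """Mean and median absolute error in USD/day."""
    errors = [abs(a - p) for a, p in zip(actual_spend, predicted_spend)]
    return mean(errors), median(errors)


def hit_rate(actual, predicted, within=0):
    """Share of predictions at most `within` buckets from the actual bucket."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= within)
    return hits / len(actual)


def penalty_error(actual, predicted):
    """Assumed form: mean bucket miss, x1/2 if over-estimated, x2 if under."""
    total = 0.0
    for a, p in zip(actual, predicted):
        miss = abs(p - a)
        total += miss * (0.5 if p > a else 2.0)
    return total / len(actual)
```

With actual buckets `[1, 2, 3, 4]` and predicted buckets `[1, 3, 2, 2]`, the hit rate is 0.25 exact and 0.75 within one bucket, and the two under-estimates dominate the penalty error.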

Page 18: Health Equity  Analytics Solution

Bayesian Bootstrap

[Figure: partitions by age group and gender.
0-10 yrs: YM (male), YF (female)
10-40 yrs: AM (male), AF (female)
>40 yrs: EM (male), EF (female)
Other labels: "CPT in", "CPT post probability", "Cumulative price increase >5% probability", "Bootstrap 100x".]
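For readers unfamiliar with the technique, a Bayesian bootstrap of a single probability can be sketched as follows. This is the textbook form (flat Dirichlet weights over the observations, 100 replicates), not the team's partitioned CPT code; the `observed` data and its interpretation are hypothetical.

```python
import random


def bayesian_bootstrap(indicators, n_replicates=100, seed=0):
    """Posterior draws of a proportion using Dirichlet(1, ..., 1) weights."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_replicates):
        # Normalized Exp(1) variates are one draw from a flat Dirichlet.
        raw = [rng.expovariate(1.0) for _ in indicators]
        total = sum(raw)
        draws.append(sum(w * x for w, x in zip(raw, indicators)) / total)
    return draws


# Hypothetical data: 1 if a case showed a >5% cumulative price increase.
observed = [1] * 15 + [0] * 35
posterior = bayesian_bootstrap(observed)
posterior_mean = sum(posterior) / len(posterior)
ordered = sorted(posterior)
interval = (ordered[2], ordered[97])  # rough central band from 100 draws
```

Each replicate reweights the same observations instead of resampling them, so the spread of `posterior` gives a confidence band around the estimated probability rather than a single point estimate.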

Page 19: Health Equity  Analytics Solution

Flowchart

[Flowchart. Training: all historical data is processed on a c3.4xlarge instance ($0.840/hr, intermittent use) into post-prob-matrices for each partition (YM, YF, AM, AF, EM, EF). Prediction: historical data for candidate i passes through ETL and a transform to give a spend rate category, which is combined with historical contributions and balance by simple logic to decide whether to educate (True / False).]

Page 20: Health Equity  Analytics Solution

Launch AWS Demo

• Here I will launch my AWS instance and run the demo showing the distributed Bayesian bootstrap code running in memory on 16 cores, and compare that to the rate on my local machine (~160 hrs).

Page 21: Health Equity  Analytics Solution

Application

Here is a subsample of real customer account balance estimates. We have highlighted interesting accounts to demonstrate different behaviors. The top line shows a low-risk individual who continuously funds their account. Even if the model determined they were at high risk for healthcare costs, they still would not trigger a funding notification because their balance is so high. Funding notifications are only sent out if the account is at risk of being empty within the next few months, based on spending rate predictions coupled with recent funding behavior.

[Figure: account balance estimates, with highlighted accounts annotated "Medium risk, account is running out: Fund", "Low risk, account high: ok", and "Med risk, ok".]

Page 22: Health Equity  Analytics Solution

Investment

• Engineering cost
  – <$20,000 for consultants to set up AWS infrastructure and provide full integration
• Cloud cost (depends on training frequency and optimization)
  – Lowest cost could be $100-200/month in cloud resources, assuming 10 hrs/month training + wireframe infrastructure (ETL, email, etc.)
  – Highest cost could go up to $1000/month for optimization and frequent training

Page 23: Health Equity  Analytics Solution

Return

• Assuming a 5% reduction in health care costs
  – The reduction will come from wellness awareness and insight into clustered medical spending (discovered risks). With Bayesian bootstrapping you are essentially giving your customers rich, tailored probability maps; do what you want with that information (e.g. "I am going in for X surgery: what are the risks or complications, and what are the costs of those risks for my demographic?").
• Patient responsibility:
  – $21,979,894.32 x 0.05 = $1,098,994.72 savings
• Negotiated price:
  – $144,633,170.61 x 0.05 = $7,231,658.53 savings
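The savings figures can be reproduced with exact decimal arithmetic. This is a small check of the slide's numbers under its own 5% assumption; the helper name `savings` is ours.

```python
from decimal import Decimal, ROUND_HALF_UP

REDUCTION = Decimal("0.05")  # the assumed 5% reduction


def savings(total_cost: str) -> Decimal:
    """Projected savings, rounded to the cent."""
    amount = Decimal(total_cost) * REDUCTION
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


patient_savings = savings("21979894.32")      # patient responsibility
negotiated_savings = savings("144633170.61")  # negotiated price
```

Changing `REDUCTION` shows how sensitive the return is to the assumed percentage.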

Page 24: Health Equity  Analytics Solution

Future Opportunities

• The operations involved make this type of problem a GPU candidate.
  – Running on GPUs can offer anywhere from a 10-100x speedup. This could be a cost-savings opportunity if frequent training is needed.
• Bucket thresholds can be optimized.
  – The 50th, 75th, etc. thresholds are arbitrary and can be refined for greater predictive power.
• More specific age/gender Bayesian maps can be created, including location, given enough data.
  – Increase resolution: more age groups, smarter age transitions. Including health assessment data would also improve this type of risk clustering.
• Clustering can be magnified for easier visualization, and automated cluster-threshold data mining methods can be used to automate insight mining in the clusters.
  – These clusters provide a wealth of knowledge on common procedures and the largest pain points in the sector. Spending time to develop cluster evaluation techniques would be worthwhile.

Page 25: Health Equity  Analytics Solution

Code location

• git clone https://bitbucket.org/bentaylorche/heqdatacomp.git
  – Code is partial; I will check in the AWS demo at the presentation.