Tuesday, March 1, 2016 - HIMSS20Calculation of relevant features suggested by literature Raw limited...

Enhancing Patient Outcomes with Big Data: Two Case Studies Tuesday, March 1, 2016 David A. Friedenberg Ph.D., Principal Research Statistician, Battelle Nancy McMillan Ph.D., Research Leader, Battelle



Conflict of Interest

David Friedenberg, Ph.D. and Nancy McMillan, Ph.D.

Salary: Salaried employees of Battelle


Agenda

• Introduction

• Using EHR data to predict acute kidney injuries (AKI)

• Analyzing intracortical brain data to reanimate a paralyzed limb

• Conclusion

• Q&A


Learning Objectives

• Demonstrate with real data how data from EHRs can be used to develop accurate disease prediction models

• Describe a system for bypassing a damaged spinal cord by using large amounts of data collected from a cortical implant to control a muscle stimulation system which moves a paralyzed limb controlled by the subject's thoughts

• Discuss the potential and some of the pitfalls of using big data to improve patient outcomes using two real world examples


Benefits Realized for the Value of Health IT

The value steps impacted were:

• Treatment/Clinical

• Electronic Secure Data

http://www.himss.org/ValueSuite

• 86%/77% sensitivity/specificity of 24hr AKI prediction

• Movement possible: paralyzed person can control hand movements with thoughts


Introduction and Methods Used

• Hypothesis – Inpatient AKI is predictable in advance based on electronic health record (EHR) data

• Goal – Predict AKI 24 hours in advance of its occurrence

• Approach – Conduct a retrospective analysis of hospital inpatients to develop a predictive model for identifying patients who are at risk for AKI

• Monitoring Requirement – The 6-hour urinary output rate and the serum creatinine concentration difference from baseline are both available continuously for a six-hour period

• AKI Encounters – 878 adult, non-prisoner encounters meeting the AKI Network Level 2 or 3 criteria and satisfying the monitoring requirement for 6 or more hours immediately prior to the first AKI event

• Control Encounters – 5096 adult, non-prisoner encounters with a period of 6 or more hours, at some point during the encounter, in which the monitoring requirement was continuously met


Methods Used (Continued)

• The study database was populated with the following data types for each study encounter:

– demographic data

– medications administered

– lab test results

– urinary output rates

– vital measurements

– present-on-admission (POA) diagnoses

– problem list diagnoses

– procedures performed

• Employing two-thirds of the study data:

– Statistical optimization routines were applied to select the risk factors that were most predictive of a future occurrence of AKI

– A logistic regression model employing the selected risk factors was derived from the data and used to produce an AKI risk index on a scale of 0 to 100
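The slide describes a two-thirds development split and a logistic regression model that yields a 0 to 100 risk index. A minimal sketch of both ideas follows. The function names, the feature names, and the choice to define the index as 100 times the predicted probability are assumptions for illustration; the slides do not state how the index is scaled.

```python
import math
import random

def split_two_thirds(encounter_ids, seed=0):
    """Shuffle encounters and return (development, test) lists of about 2/3 and 1/3."""
    ids = list(encounter_ids)
    random.Random(seed).shuffle(ids)
    cut = (2 * len(ids)) // 3
    return ids[:cut], ids[cut:]

def risk_index(intercept, coefficients, features):
    """Map a logistic-regression linear predictor to a 0-100 index.

    Assumption: index = 100 * predicted probability.
    """
    z = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    probability = 1.0 / (1.0 + math.exp(-z))
    return 100.0 * probability

# Hypothetical coefficients, not values from the study.
example = risk_index(-4.0, {"on_ventilator": 1.2}, {"on_ventilator": 1})
```

With the 878 AKI and 5096 control encounters reported earlier, `split_two_thirds(range(5974))` yields 3982 development and 1992 test encounters.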


Etiological Model

• The purpose of the etiological model is to:

– Identify physiological causal pathways leading to the adverse outcome

– Identify risk factors associated with the causal pathways

• Model development involves:

– Literature review

– Consultation with healthcare professionals

– Knowledge of risk factors for which useful information exists in electronic patient records


Data Management Process

[Figure: process flow diagram; the assignment of steps to stages and systems is not recoverable from the transcript]

• Stages: Receipt, Process, Transfer, Clean, Compute, Release

• Systems: PHI Server, LU Server, LU Analytics Server, External System

• Steps shown:

– Raw data converted to limited use; PHI reviewer reviews

– Raw limited use data loaded to SQL database

– Map data to standardized format

– Backup; validation processes performed

– Calculation of relevant features suggested by literature

– Raw limited use data released to analysis team

– Database made available to analysis team; PHI reviewer reviews

– Figures, tables, and summary results


EHR Data Types Utilized

1. Events

– Hospital Admission/Discharge

– Patient Location Intervals

– Patient Care Level Intervals

2. Admission and Discharge Data

– Hospital Admission Data

– Hospital Discharge Data

3. Patient Data

4. Clinical Flowsheet Data

– Clinical Observations

– Intake & Output

– Ventilation Data

– Vital Signs

5. Test and Procedure Results

6. Medication Administered

7. Problem List Entries

8. Diagnoses and Procedures

– Diagnosis Codes

– Procedure Codes

9. Orders

– Lab Test Orders

– Procedure Orders

– Medication Orders


Analysis Dataset

• Static Data Set – Contains:

– A single record per patient

– Data for static variables, i.e., variables that do not vary significantly during a patient’s ICU stay (age, race, weight, etc.)

• Dynamic Data Set – Contains:

– Multiple time-stamped records per patient

– Data for dynamic variables, i.e., variables that vary significantly during a patient’s ICU stay (vitals, urinary output rate, serum creatinine concentration, etc.)

– Each record contains the full set of dynamic variable values that may be used for clinical decision-making from the record’s timestamp to the timestamp of the subsequent record

– A new record is generated when (a) a new value of any dynamic variable is entered into the patient EMR or (b) a value in the current dynamic patient record expires
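The record-generation rule for the dynamic data set can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the tuple layout, the `validity_hours` lookup (how long each variable's value stays usable), and the example variables are all hypothetical.

```python
def build_dynamic_records(observations, validity_hours):
    """Turn time-stamped observations into snapshot records.

    observations: list of (time_hours, variable, value)
    validity_hours: dict mapping variable -> hours before a value expires
    Returns a list of (start_time, {variable: value}) records.
    """
    events = sorted(observations, key=lambda obs: obs[0])
    entry_times = {t for t, _, _ in events}
    # The snapshot can only change when a value is entered or could expire.
    change_times = entry_times | {t + validity_hours[var] for t, var, _ in events}

    records = []
    previous = None
    for now in sorted(change_times):
        snapshot = {}
        for t, var, value in events:
            if t <= now < t + validity_hours[var]:
                snapshot[var] = value  # later entries overwrite earlier ones
        # (a) a new value was entered, or (b) a current value expired
        if now in entry_times or snapshot != previous:
            records.append((now, snapshot))
        previous = snapshot
    return records

records = build_dynamic_records(
    [(0.0, "creatinine", 1.0), (2.0, "heart_rate", 80), (6.0, "creatinine", 1.4)],
    {"creatinine": 24.0, "heart_rate": 4.0},
)
```

Each record stays in force until the start time of the next one, which matches the "timestamp to the timestamp of the subsequent record" description above.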


Model Selection and Fitting

• Stepwise selection of static & dynamic variables to include in the outcome likelihood model based on statistical and practical significance

• Easy for data sets with one observation per patient

• Challenging for data sets with multiple observations per patient

– Correlations among records for each patient invalidate statistical inferences for simple models

– Fitting models that include a random patient effect and produce meaningful estimates of static variable effects was ineffective

– Currently implementing a “k-fold” method based on contributions to area under the ROC curve
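Because records from the same patient are correlated, one common way to run k-fold evaluation on such data is to assign folds by patient rather than by record. The slides do not say how the folds were built, so the sketch below is an assumption, and `assign_patient_folds` is a hypothetical helper.

```python
def assign_patient_folds(patient_ids, k):
    """Return a fold number per record so that all records of a patient share a fold."""
    unique_patients = sorted(set(patient_ids))
    fold_of_patient = {pid: i % k for i, pid in enumerate(unique_patients)}
    return [fold_of_patient[pid] for pid in patient_ids]

# One entry per dynamic record; patient "c" has three records.
folds = assign_patient_folds(["a", "a", "b", "c", "c", "c"], k=2)
```

Keeping a patient's records together prevents the same patient from appearing in both the fitting and evaluation portions of a fold.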


Alerts – Using the Model

• Alerts are based on:

– Selection of an alerting threshold that balances the competing goals of minimizing false negatives and false positives

• Performance of fixed-threshold alerting procedures is characterized in terms of sensitivity and specificity

• Receiver operating characteristic (ROC) curves are used to:

– Characterize true positive/false positive behavior for all possible thresholds

– Aid in threshold selection
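A small sketch of fixed-threshold alerting: compute sensitivity and specificity at a threshold, then pick the threshold with the best specificity among those meeting a sensitivity target. The selection rule and the function names are assumptions for illustration; the slides only say the threshold balances false negatives against false positives.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Alert when score >= threshold; labels are 1 (event) or 0 (no event)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def pick_threshold(scores, labels, min_sensitivity):
    """Among observed scores, return the threshold with the highest specificity
    whose sensitivity is at least min_sensitivity (None if none qualifies)."""
    best = None
    for threshold in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, threshold)
        if sens >= min_sensitivity and (best is None or spec > best[1]):
            best = (threshold, spec)
    return None if best is None else best[0]

# Toy risk indices (0-100) and outcomes, not study data.
toy_scores = [10, 40, 60, 90, 30, 70]
toy_labels = [0, 0, 1, 1, 1, 0]
chosen = pick_threshold(toy_scores, toy_labels, min_sensitivity=1.0)
```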


Risk Attribution

• For every outcome likelihood model, there is a list of contributing risk variables

• The attribution process creates a vector of attribution percentages, one for each risk variable

• The attribution percentages add to 100%

• Each attribution percentage characterizes the degree to which the likelihood of an adverse outcome is attributable to the corresponding risk variable
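The slides do not give the attribution formula. One simple possibility for a logistic model, shown here purely as an assumption, is to take each variable's positive contribution to the log-odds and normalize the contributions to sum to 100%. The variable names and coefficients are hypothetical.

```python
def attribution_percentages(coefficients, features):
    """Share of the risk-increasing log-odds attributable to each variable.

    Assumption: contribution = max(coefficient * value, 0), normalized to 100%.
    """
    contributions = {
        name: max(coefficients[name] * features[name], 0.0) for name in coefficients
    }
    total = sum(contributions.values())
    if total == 0.0:
        return {name: 0.0 for name in contributions}
    return {name: 100.0 * c / total for name, c in contributions.items()}

shares = attribution_percentages(
    {"on_ventilator": 1.0, "post_heart_surgery": 3.0, "protective_factor": -0.5},
    {"on_ventilator": 1, "post_heart_surgery": 1, "protective_factor": 1},
)
```

In this toy case the two risk-increasing variables receive 25% and 75%, and the protective variable receives 0%.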


The Predictive Model

• Example applications of the predictive model:

– The AKI incidence rate for a patient weighing 68 kg and experiencing none of the 6 conditions in the figure above is 1.72%, and the corresponding odds of AKI are 0.0175

– The AKI incidence rate for a patient weighing 68 kg who is post-open-heart-surgery and on a ventilator (but is experiencing none of the other 4 conditions in the figure above) is 20.3%, with corresponding odds of AKI of 0.254

• The risk factors selected for the predictive model and their multiplicative contributions to the odds of a future AKI event are reported in the figure to the right

$\mathrm{Odds} = \dfrac{p_1}{1 - p_1}$
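The two worked examples can be checked directly from the odds definition. The sketch below converts the reported incidence rates to odds and back; the function names are illustrative. Note that 0.203 / 0.797 is about 0.2547, which agrees with the reported 0.254 up to rounding of the published percentage.

```python
def odds_from_probability(p):
    """Odds = p / (1 - p)."""
    return p / (1.0 - p)

def probability_from_odds(odds):
    """Inverse of the odds definition: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)

baseline_odds = odds_from_probability(0.0172)  # patient with none of the 6 conditions
elevated_odds = odds_from_probability(0.203)   # post-open-heart-surgery and on a ventilator

# Combined multiplicative effect of the two conditions on the odds.
combined_multiplier = elevated_odds / baseline_odds
```

Because the model is logistic, each risk factor multiplies the odds, so the combined multiplier here is the product of the individual multipliers for the two conditions.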


Prediction Performance

• The performance of the predictive model / AKI risk index was characterized as follows:

– Employed the complementary one-third of the study data that was not used in developing the model as the test set

– Determined sensitivity and specificity values for all risk index thresholds and plotted sensitivity vs. 1-specificity to create receiver operating characteristic (ROC) curves

– Produced ROC curves for predicting AKI at 6, 12 and 24 hours prior to AKI event (see figure below for 24-hr ROC curve)

• The AKI risk index was very effective at predicting the future occurrence of AKI as evidenced in the table below
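A minimal sketch of how an ROC curve and its area can be computed from risk indices and outcomes on a held-out test set. The data below are toy values, not study results, and the function names are illustrative.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs for every observed threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

toy_auc = area_under_curve(
    roc_points([10, 40, 60, 90, 30, 70], [0, 0, 1, 1, 1, 0])
)
```

The area equals the fraction of (event, non-event) pairs in which the event case has the higher risk index, 6 of 9 pairs in this toy example.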


AKI Prediction Conclusions

• The results of this study indicate that many patients who are at risk for AKI can be identified in advance from data typically stored in an electronic health record (EHR) system

• The methods used in this study should be similarly applicable to other hospital complications such as the need for ventilator assistance, sepsis, shock, infection following surgery, heart attack, and stroke

• Predictive models for hospital complications (like the model derived in this study) can form the basis for in-hospital EHR applications (see mock-up) that monitor patients’ risk indices over time and attribute risk to contributing factors included in the predictive models


Clinical Predictive Analytics Acknowledgements

• Battelle

– Steve Rust, Ph.D.

– Dan Haber

– Mark Davis

– Doug Mooney, Ph.D.

– Michele Morara

– Darlene Wells

• Ohio State University

– Naeem Ali, MD

– Andrew Thomas, MD

– Phyllis Teeter


Analyzing intracortical brain data to reanimate a paralyzed limb


Study overview

• FDA- and IRB-approved clinical IDE study

• Investigate the effectiveness of cortically controlled neuromuscular stimulation to restore movement in a paralyzed person

• Study participant is a 24-year-old male who suffered a complete C5/C6 spinal cord injury from a diving accident

Source: Battelle and Ohio State


Big Data Challenges

• Large Volume of Data

– 30,000 samples/sec x 96 electrodes

– Filter data artifacts

– Isolate signal from noise

• Fast Processing

– Need to pull data, filter, isolate, decode and stimulate in 0.1 sec

• Offline Data Analysis

– Optimize Decoders and Algorithms

– Design experiments

– Publications
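The data volume implied by the sampling figures on this slide can be worked out directly. The constants come from the slide; the function name is illustrative.

```python
SAMPLES_PER_SEC_PER_ELECTRODE = 30_000
ELECTRODES = 96
PROCESSING_WINDOW_MS = 100  # the 0.1 sec budget for the full processing loop

def samples_in_window(rate_hz, electrodes, window_ms):
    """Number of samples arriving across all electrodes in one window."""
    return rate_hz * electrodes * window_ms // 1000

total_per_second = SAMPLES_PER_SEC_PER_ELECTRODE * ELECTRODES
total_per_window = samples_in_window(
    SAMPLES_PER_SEC_PER_ELECTRODE, ELECTRODES, PROCESSING_WINDOW_MS
)
```

That is 2,880,000 samples per second across the array, or 288,000 samples that must be pulled, filtered, decoded and acted on within each 0.1 sec window.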


How does it work?

Image Source: Wikimedia Commons


Neural Spikes

Image by Eric Chudler, UW


Neural data


Neural Spike Panel


Turning Big Data into Manageable Data

• 300,000 samples/sec is a ton of data

• By isolating the signal and filtering out the noise, the data can be summarized in a much more compact form

• Common theme in many successful big data applications

• Big raw data can be more useful when converted to smaller, more concise data


Filters

• 0.3 Hz 1st-order high-pass and 7.5 kHz 3rd-order low-pass Butterworth analog hardware filter applied to data

• 60Hz background filter

• Stimulation artifact

• Static shock

• We can build filters to remove known signal artifacts

• Use wavelets to build robust features
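As an illustration of removing a known artifact such as 60 Hz background, here is a sketch of a digital second-order notch filter using the widely published biquad "cookbook" coefficients. It is not the filter used in the study; the quality factor `q` and the sampling rate in the example are assumptions.

```python
import math

def notch_coefficients(f0_hz, fs_hz, q):
    """Normalized biquad notch coefficients (b0, b1, b2), (a1, a2)."""
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a

def apply_biquad(signal, b, a):
    """Direct-form I filtering: y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in signal:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

def rms(values):
    """Root-mean-square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Example: a pure 60 Hz tone sampled at an assumed 2 kHz, filtered with q = 30.
fs = 2000
hum = [math.sin(2.0 * math.pi * 60.0 * i / fs) for i in range(4000)]
b, a = notch_coefficients(60.0, fs, q=30.0)
filtered_hum = apply_biquad(hum, b, a)
```

After the start-up transient decays, the 60 Hz tone is suppressed almost completely while frequencies away from 60 Hz pass nearly unchanged.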


Wavelets as an alternative to spikes

• Wavelets are used for nonparametric regression, signal processing, image analysis etc.

• Represent the raw electrical signal using a wavelet basis

• Localized in both time and frequency

• Coefficients can then be used to represent the signal
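To make the idea of representing a signal by wavelet coefficients concrete, here is a sketch of a multi-level discrete wavelet decomposition. It uses the Haar wavelet because its filters are trivial to write down; this is a stand-in chosen for illustration and not the wavelet used in the study. The signal length is assumed to be divisible by 2 raised to the number of levels.

```python
import math

def haar_step(signal):
    """One decomposition level: return (approximation, detail) coefficients."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Return (final approximation, [detail at level 1, level 2, ...]).

    Level 1 holds the finest (highest-frequency) detail coefficients.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

approx, details = haar_decompose([4, 6, 10, 12, 8, 6, 5, 5], levels=3)
```

Each detail level is localized in time (one coefficient per block of samples) and in frequency (one level per band), and the transform preserves the signal's energy, so the coefficients are a complete alternative representation of the raw signal.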

http://www.aticourses.com/blog/index.php/tag/continuous-wavelet-transform/ Elements of Statistical Learning 2nd edition

Spikes vs. Wavelets

• Spikes are usually manually sorted (subjective)

• There are methods for automatic spike sorting – PCA, threshold crossing etc.

• Wavelet choice may have some effect, but optimization can be automated

• We hypothesize the wavelet signal will not decline over time as much as the spike signal


Wavelet choice

D. Farina et al., 2007

Wavelet decomposition and multi-unit activity (MUA)

• Wavelet decomposition was used to decompose the raw signal into different frequency sub-bands and characterize them

• Wavelet methods do not require spike sorting

• Multi-unit activity (MUA) is defined as that corresponding to wavelet scales 4 and 5

• Single-unit activity (SUA) is defined as that corresponding to wavelet scales 0-3

[Figure: frequency sub-bands labeled SUA, MUA and LFP]

Sharma et al., 2015


db4 Wavelet Decomposition

Source: Battelle

Decoding

• Signal is input into our decoding (aka classification) algorithms

• Translate brain activity to imagined movement

• Training data is acquired by having the subject imagine the movements of an animated hand he is seeing on a screen

• Decoders are trained and then used to control sleeve in test mode


SVM Decoders

• We use a custom regularized Support Vector Machine (Humber et al., 2012)

• Separate decoders for each movement

• 96 wavelet features, one for each channel

• Predictor variables go through a non-linear Gaussian radial basis kernel to capture relationships between channels
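The sketch below shows how a Gaussian radial basis kernel turns per-channel features into a decoder score, with one decoder per movement. It is not the custom regularized SVM cited on the slide: the support vectors, weights, `gamma`, and the rule of picking the movement with the highest score are all assumptions for illustration, and two features stand in for the 96 channels.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian radial basis kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def decision_value(x, support_vectors, weights, bias, gamma):
    """Kernel-machine score: sum_i w_i * k(sv_i, x) + bias."""
    return sum(w * rbf_kernel(sv, x, gamma) for sv, w in zip(support_vectors, weights)) + bias

def decode_movement(x, decoders, gamma):
    """decoders maps movement name -> (support_vectors, weights, bias).

    Assumption: the movement whose decoder scores highest is selected.
    """
    return max(decoders, key=lambda name: decision_value(x, *decoders[name], gamma))

# Hypothetical decoders trained elsewhere.
decoders = {
    "hand_open": ([[0.0, 0.0]], [1.0], 0.0),
    "hand_close": ([[1.0, 1.0]], [1.0], 0.0),
}
decoded = decode_movement([0.1, 0.0], decoders, gamma=0.5)
```

Because the kernel depends on the distance over all features at once, the score reflects joint patterns across channels rather than each channel in isolation.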


NeuroLife Acknowledgements

• Battelle

– Chad Bouton, MS

– Nicholas Annetta, MS

– Gaurav Sharma, PhD

– Stephanie Kute, PhD

– Nick Skomrock, MS

– Vimal Buck, MS

– Fritz Eubanks, PhD

– Jeff Friend

– Brad Glenn, PhD

– Mingming Zhang, PhD

• Ohio State University

– Ali Rezai, MD

– Jerry Mysiw, MD

– Dina Aziz

– Marcie Bockbrader, MD, PhD

– Ammar Shaikhouni, MD, PhD

– Per Sederberg, PhD

– Dylan Nielson, MD, PhD



Questions

• David A. Friedenberg, Ph.D.

• https://www.linkedin.com/in/davidafriedenberg

• Nancy McMillan, Ph.D.