Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use of EHR Data

Project 3: High-Throughput Phenotyping

Project Lead: Jyotishman Pathak, PhD

PI: Christopher G. Chute, MD, DrPH

June 12, 2012

SHARPn High-Throughput Phenotyping

Electronic health record (EHR)-driven phenotyping

• Overarching goal
  • To develop high-throughput automated techniques and algorithms that operate on normalized EHR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings

©2012 MFMER | slide-2

Current HTP project themes

• Standardization of phenotype definitions

• Library of phenotyping algorithms

• Phenotyping workbench

• Machine learning techniques for phenotyping

• Just-in-time phenotyping

Algorithm Development Process - Modified

[Figure: process diagram with the labels Data, Transform, Transform, Phenotype Algorithm, Visualization, Evaluation, NLP/SQL, Rules, Mappings, and Semi-Automatic Execution]

• Standardized representation of clinical data
  • Create new and re-use existing clinical element models (CEMs)

• Standardized and structured representation of phenotype definition criteria
  • Use the NQF Quality Data Model (QDM)

• Conversion of structured phenotype criteria into executable queries
  • Use JBoss® Drools (DRLs)

[Welch et al. 2012] [Thompson et al., submitted 2012]

[Li et al., submitted 2012]

NQF Quality Data Model (QDM)

• Standard of the National Quality Forum (NQF)

• A structure and grammar to represent quality measures in a standardized format

• Groups of codes in a code set (ICD-9, etc.)
  • "Diagnosis, Active: steroid induced diabetes" using "steroid induced diabetes Value Set GROUPING (2.16.840.1.113883.3.464.0001.113)"

• Supports temporality & sequences
  • AND: "Procedure, Performed: eye exam" > 1 year(s) starts before or during "Measurement end date"

• Implemented as a set of XML schemas
  • Links to standardized terminologies (ICD-9, ICD-10, SNOMED-CT, CPT-4, LOINC, RxNorm, etc.)
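The two ideas on this slide, value-set membership and a temporal constraint, can be illustrated with a rough Python sketch. This is not the QDM XML format or any NQF tooling. The member codes are placeholders rather than real value-set contents, the `Event` type and function names are hypothetical, one year is approximated as 365 days, and reading "> 1 year(s) starts before or during" as "started more than one year before the reference date" is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Value sets keyed by OID. The OID is the one quoted on the slide;
# the member codes are placeholders, NOT the real value-set contents.
VALUE_SETS = {
    "2.16.840.1.113883.3.464.0001.113": {"PLACEHOLDER-1", "PLACEHOLDER-2"},
}


@dataclass(frozen=True)
class Event:
    """Hypothetical normalized clinical event (not an actual CEM)."""
    category: str  # e.g. "Diagnosis, Active" or "Procedure, Performed"
    code: str
    start: date


def in_value_set(event: Event, oid: str) -> bool:
    """True if the event's code belongs to the value set with this OID."""
    return event.code in VALUE_SETS.get(oid, set())


def starts_more_than_before(event: Event, reference: date, gap: timedelta) -> bool:
    """True if the event started more than `gap` before `reference`."""
    return reference - event.start > gap


ONE_YEAR = timedelta(days=365)  # assumption: leap years are ignored

measurement_end = date(2012, 12, 31)
dx = Event("Diagnosis, Active", "PLACEHOLDER-1", date(2012, 3, 1))
eye_exam = Event("Procedure, Performed", "EYE-EXAM-PLACEHOLDER", date(2011, 6, 15))

dx_matches = in_value_set(dx, "2.16.840.1.113883.3.464.0001.113")
exam_is_old = starts_more_than_before(eye_exam, measurement_end, ONE_YEAR)
```

Keying value sets by OID keeps the criterion independent of any one terminology, which is the role the value-set grouping plays in the slide's example.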

116 Meaningful Use Phase I Quality Measures

Example: Diabetes & Lipid Mgmt. - I

Human readable HTML

Example: Diabetes & Lipid Mgmt. - II

Computable XML

Drools-based Phenotyping Architecture

[Figure: layered architecture diagram with the labels Clinical Element Database, Data Access Layer, Transformation Layer, Inference Engine (Drools), Business Logic, Service for Creating Output (File, Database, etc.), and List of Diabetic Patients. The Transformation Layer converts the physical representation into a normalized logical representation (Fact Model).]
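The layering named in the diagram can be sketched in a few lines of Python. This is only an illustration of how the layers might hand data to one another, not the SHARPn implementation. Every function name, column name, and code value is hypothetical, and a plain Python predicate stands in for the Drools inference engine.

```python
from dataclasses import dataclass


# Data access layer: returns physical rows as a database might store them.
# Column names and values are made up for illustration.
def fetch_rows():
    return [
        {"pt_id": "001", "dx_cd": "DM-PLACEHOLDER"},
        {"pt_id": "002", "dx_cd": "OTHER-PLACEHOLDER"},
        {"pt_id": "003", "dx_cd": "DM-PLACEHOLDER"},
    ]


@dataclass(frozen=True)
class DiagnosisFact:
    """Normalized logical representation (a stand-in for the fact model)."""
    patient_id: str
    code: str


# Transformation layer: physical rows -> normalized facts.
def to_facts(rows):
    return [DiagnosisFact(r["pt_id"], r["dx_cd"]) for r in rows]


# Business logic: one rule, written as a predicate over a fact.
def is_diabetic(fact):
    return fact.code == "DM-PLACEHOLDER"


# Inference engine stand-in: a patient qualifies if any rule fires on any fact.
def run_rules(facts, rules):
    return sorted({f.patient_id for f in facts if any(rule(f) for rule in rules)})


# Output service: here it just renders one patient ID per line.
def write_output(patient_ids):
    return "\n".join(patient_ids)


diabetic_patients = run_rules(to_facts(fetch_rows()), [is_diabetic])
report = write_output(diabetic_patients)
```

Keeping the rules separate from data access means the same rule set could run against a different store by swapping only `fetch_rows` and `to_facts`.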

Automatic translation from NQF QDM criteria to Drools

[Li et al., submitted 2012]
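To give a feel for what translating a structured criterion into a rule could involve, here is a minimal Python sketch that renders one value-set criterion as Drools rule (DRL) text. It is not the translator described by Li et al. The fact classes `Event` and `CohortMember`, their fields, and the placeholder codes are all hypothetical.

```python
def criterion_to_drl(rule_name, category, codes):
    """Render one value-set criterion as DRL text.

    `Event` and `CohortMember` are hypothetical fact classes that a real
    rule base would have to define; they do not come from the slides.
    """
    code_list = ", ".join(f'"{c}"' for c in sorted(codes))
    return (
        f'rule "{rule_name}"\n'
        "when\n"
        f'    $e : Event( category == "{category}", code in ({code_list}) )\n'
        "then\n"
        "    insert( new CohortMember( $e.getPatientId() ) );\n"
        "end\n"
    )


drl = criterion_to_drl(
    "Diagnosis, Active: steroid induced diabetes",
    "Diagnosis, Active",
    {"PLACEHOLDER-2", "PLACEHOLDER-1"},  # placeholder codes, not real ones
)
print(drl)
```

Sorting the codes makes the generated text deterministic, so regenerated rules can be compared directly.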

The “executable” Drools flow

Phenotype library and workbench - I

1. Converts QDM to Drools
2. Rule execution by querying the CEM database
3. Generate summary reports

http://phenotypeportal.org

Phenotype library and workbench - II

http://phenotypeportal.org

Phenotype library and workbench - III

Machine learning and HTP - I

• Machine learning and association rule mining
  • Manual creation of algorithms takes time
  • Let computers do the "hard work"
  • Validate against expert-developed ones

[Caroll et al. 2011]

Machine learning and HTP - II

• Origins from sales data

• Items (columns): co-morbid conditions

• Transactions (rows): patients

• Itemsets: sets of co-morbid conditions

• Goal: find all itemsets (sets of conditions) that frequently co-occur in patients.
  • One of those conditions should be DM.

• Support: # of transactions the itemset I appeared in
  • Support({TB, DLM, ND})=3

• Frequent: an itemset I is frequent, if support(I)>minsup

Patient TB DLM ND … IEC

001 Y Y Y Y

002 Y Y Y Y

003 Y Y

004 Y

005 Y Y Y

X: infrequent

[Simon et al. 2012]
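The definitions of support and frequent itemsets can be made concrete with a small Python sketch. It uses brute-force enumeration, not Apriori and not the method of Simon et al., so it only suits tiny examples. The patient data is invented toy data, not the slide's table (whose column alignment did not survive), and the function names and the `required` parameter are hypothetical.

```python
from itertools import combinations

# Toy transactions: patient ID -> set of conditions. Invented for
# illustration; this is NOT a reconstruction of the table on the slide.
patients = {
    "001": {"DM", "TB", "DLM", "ND"},
    "002": {"DM", "TB", "DLM", "ND"},
    "003": {"DM", "TB"},
    "004": {"TB"},
    "005": {"DM", "TB", "DLM", "ND"},
}


def support(itemset, transactions):
    """Number of transactions (patients) containing every item in itemset."""
    items = frozenset(itemset)
    return sum(1 for conditions in transactions.values() if items <= conditions)


def frequent_itemsets(transactions, minsup, required=None):
    """All itemsets with support > minsup, optionally forced to include `required`.

    Brute force over every subset of items: exponential in the number of
    distinct conditions, so suitable only for toy inputs.
    """
    all_items = sorted(set().union(*transactions.values()))
    result = {}
    for size in range(1, len(all_items) + 1):
        for combo in combinations(all_items, size):
            if required is not None and required not in combo:
                continue
            count = support(combo, transactions)
            if count > minsup:  # strict inequality, as on the slide
                result[frozenset(combo)] = count
    return result


tb_dlm_nd = support({"TB", "DLM", "ND"}, patients)
frequent_with_dm = frequent_itemsets(patients, minsup=2, required="DM")
```

In this toy data `support({"TB", "DLM", "ND"}, patients)` is 3, matching the value quoted on the slide, and `required="DM"` reflects the constraint that one of the conditions should be DM.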

Electronic Health Records and Phenomics

Just-in-Time phenotyping - I

Transfusion-related Acute Lung Injury (TRALI)

Transfusion-associated Circulatory Overload (TACO)

Just-in-Time phenotyping - II

TRALI/TACO “sniffer”

Active Surveillance for TRALI and TACO

Of the 88 TRALI cases correctly identified by the CART algorithm, only 11 (12.5%) of these were reported to the blood bank by the clinical service.

Of the 45 TACO cases correctly identified by the CART algorithm, only 5 (11.1%) were reported to the blood bank by the clinical service.
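The two percentages follow directly from the counts quoted above, as this short Python check shows. The function name is hypothetical; only the four counts come from the slide.

```python
def reporting_rate(reported, identified):
    """Percentage of algorithm-identified cases that were also reported."""
    return 100.0 * reported / identified


trali_rate = round(reporting_rate(11, 88), 1)  # TRALI: 11 of 88 reported
taco_rate = round(reporting_rate(5, 45), 1)    # TACO: 5 of 45 reported
```

Both rates come out near one in eight, so roughly seven of every eight identified cases were not reported to the blood bank.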

Publications to date (conservative)

[Bar chart: number of publications per year for Year 1 (2011), Year 2 (2012), and Year 3 (2013); series: Papers, Abstracts, Under review]

2011 Milestones

Standardized definitions for phenotype criteria

Rules-based environment for phenotype algorithm execution

National library for standardized phenotype definitions (collaboration with eMERGE)

Machine learning techniques for algorithm definitions

Online, real-time phenotype execution

Phenotyping algorithm authoring environment

2012 Milestones

• Machine learning techniques for algorithm definitions

• Online, real-time phenotype execution

• Collaboration with NQF, Query Health and i2b2 infrastructures

• Use cases and demonstrations
  • MU quality metrics (w/ NQF, Query Health)
  • Cohort identification (w/ eMERGE, PGRN)
  • Value analysis (w/ Mayo CSHCD, REP)
  • Clinical trial alerting (w/ Mayo Cancer Ctr./CTSA)

Project 3: Collaborators & Acknowledgments

• CDISC (Clinical Data Interchange Standards Consortium)
  • Rebecca Kush, Landen Bain

• Centerphase Solutions
  • Gary Lubin, Jeff Tarlowe

• Group Health Seattle
  • David Carrell

• Harvard University/MIT
  • Guergana Savova, Peter Szolovits

• Intermountain Healthcare/University of Utah
  • Susan Welch, Herman Post, Darin Wilcox, Peter Haug

• Mayo Clinic
  • Cory Endle, Rick Kiefer, Sahana Murthy, Gopu Shrestha, Dingcheng Li, Gyorgy Simon, Matt Durski, Craig Stancl, Kevin Peterson, Cui Tao, Lacey Hart, Erin Martin, Kent Bailey, Scott Tabor, Chris Chute
