Project Lead: Jyotishman Pathak, PhD PI: Christopher G. Chute, MD, DrPH

Strategic Health IT Advanced Research Projects (SHARP)
Area 4: Secondary Use of EHR Data
Project 3: High-Throughput Phenotyping
Project Lead: Jyotishman Pathak, PhD
PI: Christopher G. Chute, MD, DrPH
June 12, 2012


Page 2

SHARPn High-Throughput Phenotyping

Electronic health records (EHRs) driven phenotyping

• Overarching goal
  • To develop high-throughput automated techniques and algorithms that operate on normalized EHR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings

©2012 MFMER | slide-2

Page 3

Current HTP project themes

• Standardization of phenotype definitions

• Library of phenotyping algorithms

• Phenotyping workbench

• Machine learning techniques for phenotyping

• Just-in-time phenotyping


Page 4

Algorithm Development Process - Modified

[Diagram labels: Data, Transform, Transform, Phenotype Algorithm, Visualization, Evaluation, NLP, SQL, Rules, Mappings, Semi-Automatic Execution]

• Standardized representation of clinical data
  • Create new and re-use existing clinical element models (CEMs)
• Standardized and structured representation of phenotype definition criteria
  • Use the NQF Quality Data Model (QDM)
• Conversion of structured phenotype criteria into executable queries
  • Use JBoss® Drools (DRLs)

[Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]

Page 5

NQF Quality Data Model (QDM)

• Standard of the National Quality Forum (NQF)
• A structure and grammar to represent quality measures in a standardized format
• Groups of codes in a code set (ICD-9, etc.)
  • "Diagnosis, Active: steroid induced diabetes" using "steroid induced diabetes Value Set GROUPING (2.16.840.1.113883.3.464.0001.113)"
• Supports temporality & sequences
  • AND: "Procedure, Performed: eye exam" > 1 year(s) starts before or during "Measurement end date"
• Implemented as set of XML schemas
• Links to standardized terminologies (ICD-9, ICD-10, SNOMED-CT, CPT-4, LOINC, RxNorm etc.)
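The two QDM ideas on this slide, value sets and temporal constraints, can be illustrated with a minimal Python sketch. This is not the QDM XML schema or any NQF tooling: the `ValueSet` and `Event` classes, the placeholder codes, and the reading of "> 1 year(s) starts before or during" as "starts more than one year before the reference date" are all assumptions made for illustration. Only the value set name and OID come from the slide.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ValueSet:
    """A named group of codes (hypothetical stand-in for a QDM value set)."""
    oid: str
    name: str
    codes: frozenset


@dataclass(frozen=True)
class Event:
    """A clinical event tagged with a QDM-style category (hypothetical model)."""
    category: str
    code: str
    start: date


# OID and name are taken from the slide; the member codes are placeholders.
STEROID_DM = ValueSet(
    oid="2.16.840.1.113883.3.464.0001.113",
    name="steroid induced diabetes Value Set GROUPING",
    codes=frozenset({"PLACEHOLDER-1", "PLACEHOLDER-2"}),
)


def matches(event, category, value_set):
    """True if the event has the given category and a code in the value set."""
    return event.category == category and event.code in value_set.codes


def years_before(reference, years):
    """Return the date `years` years before `reference` (29 Feb falls back to 28 Feb)."""
    try:
        return reference.replace(year=reference.year - years)
    except ValueError:
        return reference.replace(year=reference.year - years, day=28)


def starts_more_than_years_before(event, years, reference):
    """Assumed reading of '> N year(s) starts before': start precedes reference by more than N years."""
    return event.start < years_before(reference, years)


measurement_end = date(2012, 12, 31)
diagnosis = Event("Diagnosis, Active", "PLACEHOLDER-1", date(2010, 5, 1))
old_exam = Event("Procedure, Performed", "EYE-EXAM-PLACEHOLDER", date(2011, 6, 1))
recent_exam = Event("Procedure, Performed", "EYE-EXAM-PLACEHOLDER", date(2012, 3, 1))

print(matches(diagnosis, "Diagnosis, Active", STEROID_DM))
print(starts_more_than_years_before(old_exam, 1, measurement_end))
print(starts_more_than_years_before(recent_exam, 1, measurement_end))
```

Keeping the value set separate from the criterion that uses it mirrors the slide's point that the same group of codes can be reused across measures.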


Page 6

116 Meaningful Use Phase I Quality Measures

Page 7

Example: Diabetes & Lipid Mgmt. - I


Human readable HTML

Page 8

Example: Diabetes & Lipid Mgmt. - II


Computable XML

Page 9

Algorithm Development Process - Modified (repeat of the Page 4 slide)

Page 10

Drools-based Phenotyping Architecture


[Architecture diagram components]

• Clinical Element Database
• Data Access Layer
• Transformation Layer: transform physical representation to normalized logical representation (Fact Model)
• Business Logic
• Inference Engine (Drools)
• Service for Creating Output (File, Database, etc.)
• List of Diabetic Patients
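The layering named on this slide can be sketched as a chain of small functions. The sketch below is plain Python standing in for the real components: the inference step is a simple loop rather than Drools, and the row layout, `Fact` class, placeholder codes, and function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical rows in their "physical" form, as a clinical element database might return them.
RAW_ROWS = [
    {"pid": "p1", "kind": "dx", "code": "DM-PLACEHOLDER"},
    {"pid": "p1", "kind": "lab", "code": "HBA1C-PLACEHOLDER", "value": "8.1"},
    {"pid": "p2", "kind": "dx", "code": "OTHER-PLACEHOLDER"},
]


@dataclass(frozen=True)
class Fact:
    """Normalized logical representation (a toy fact model)."""
    patient_id: str
    fact_type: str
    code: str
    value: Optional[float] = None


def data_access_layer():
    """Fetch rows in their physical representation."""
    return list(RAW_ROWS)


def transformation_layer(rows):
    """Map physical rows onto normalized facts."""
    kind_to_type = {"dx": "Diagnosis", "lab": "LabResult"}
    return [
        Fact(
            patient_id=row["pid"],
            fact_type=kind_to_type[row["kind"]],
            code=row["code"],
            value=float(row["value"]) if "value" in row else None,
        )
        for row in rows
    ]


def inference_engine(facts, rules):
    """Stand-in for the rule engine: collect patients with a fact matching any rule."""
    matched = set()
    for rule in rules:
        matched |= {fact.patient_id for fact in facts if rule(fact)}
    return matched


def output_service(patient_ids):
    """Produce the final output; here just a sorted list."""
    return sorted(patient_ids)


def diabetes_rule(fact):
    """Business logic: a single illustrative rule on a placeholder code."""
    return fact.fact_type == "Diagnosis" and fact.code == "DM-PLACEHOLDER"


diabetic_patients = output_service(
    inference_engine(transformation_layer(data_access_layer()), [diabetes_rule])
)
print(diabetic_patients)
```

Because the rules only see `Fact` objects, the physical storage can change without touching the business logic, which appears to be the purpose of the transformation layer in the diagram.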

Page 11

Automatic translation from NQF QDM criteria to Drools


[Li et al., submitted 2012]
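As a rough illustration of what translating structured criteria into a Drools rule could look like, the Python sketch below renders a simplified criterion list as DRL text. It is not the translator described by Li et al.: the input dictionary layout, the `Patient` and `ClinicalEvent` fact types, and the `results` global are hypothetical, and a complete DRL file would also need its package, import, and global declarations.

```python
def qdm_to_drl(rule_name, criteria):
    """Render simplified criteria as the text of one DRL rule (illustrative only)."""
    patterns = ["    $p : Patient()"]
    for criterion in criteria:
        category = criterion["category"]
        codes = ", ".join('"{}"'.format(code) for code in criterion["codes"])
        # One pattern per criterion, joined to the patient by a hypothetical patientId field.
        patterns.append(
            '    ClinicalEvent( patientId == $p.id, category == "{}", code in ({}) )'.format(
                category, codes
            )
        )
    lines = (
        ['rule "{}"'.format(rule_name), "when"]
        + patterns
        + ["then", "    results.add($p.getId());", "end"]
    )
    return "\n".join(lines)


drl = qdm_to_drl(
    "steroid induced diabetes",
    [{"category": "Diagnosis, Active", "codes": ["PLACEHOLDER-1", "PLACEHOLDER-2"]}],
)
print(drl)
```

Each criterion becomes one pattern in the `when` block, so criteria combined with AND map naturally onto consecutive patterns; OR and temporal operators would need additional handling that this sketch leaves out.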

Page 12

The “executable” Drools flow


Page 13

Phenotype library and workbench - I

1. Converts QDM to Drools
2. Rule execution by querying the CEM database
3. Generate summary reports

http://phenotypeportal.org

Page 14

Phenotype library and workbench - II

http://phenotypeportal.org

Page 15

Phenotype library and workbench - III

Page 16

Machine learning and HTP - I

• Machine learning and association rule mining
  • Manual creation of algorithms takes time
  • Let computers do the "hard work"
  • Validate against expert-developed ones


[Carroll et al. 2011]

Page 17

Machine learning and HTP - II

• Origins from sales data
• Items (columns): co-morbid conditions
• Transactions (rows): patients
• Itemsets: sets of co-morbid conditions
• Goal: find all itemsets (sets of conditions) that frequently co-occur in patients.
  • One of those conditions should be DM.
• Support: # of transactions the itemset I appeared in
  • Support({TB, DLM, ND})=3
• Frequent: an itemset I is frequent, if support(I)>minsup

Patient TB DLM ND … IEC

001 Y Y Y Y

002 Y Y Y Y

003 Y Y

004 Y

005 Y Y Y

X: infrequent

[Simon et al. 2012]
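The support and frequent-itemset definitions above can be made concrete with a small brute-force sketch in Python. The patient data here is hypothetical, chosen only so that Support({TB, DLM, ND}) equals 3 as on the slide; the "DM" column, the `must_contain` parameter, and the exhaustive enumeration are assumptions for illustration, and a practical miner would prune candidates (for example, with Apriori) rather than test every combination.

```python
from itertools import combinations

# Hypothetical toy data: patient id -> set of recorded conditions.
patients = {
    "001": {"DM", "TB", "DLM", "ND", "IEC"},
    "002": {"DM", "TB", "DLM", "ND", "IEC"},
    "003": {"DM", "TB", "DLM"},
    "004": {"TB"},
    "005": {"DM", "TB", "DLM", "ND"},
}


def support(itemset, transactions):
    """Number of transactions (patients) containing every item of the itemset."""
    items = set(itemset)
    return sum(1 for conditions in transactions.values() if items <= conditions)


def frequent_itemsets(transactions, minsup, must_contain=None):
    """Enumerate all itemsets with support > minsup, optionally requiring one item."""
    all_items = sorted(set().union(*transactions.values()))
    result = {}
    for size in range(1, len(all_items) + 1):
        for candidate in combinations(all_items, size):
            if must_contain is not None and must_contain not in candidate:
                continue
            count = support(candidate, transactions)
            if count > minsup:
                result[frozenset(candidate)] = count
    return result


print(support({"TB", "DLM", "ND"}, patients))

# Itemsets that include DM and occur in more than 2 patients.
dm_itemsets = frequent_itemsets(patients, minsup=2, must_contain="DM")
for itemset, count in sorted(dm_itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The `must_contain` filter reflects the slide's constraint that one of the conditions in each itemset should be DM.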

Page 18

Electronic Health Records and Phenomics

Just-in-Time phenotyping - I

Transfusion-related Acute Lung Injury (TRALI)

Transfusion-associated Circulatory Overload (TACO)

Page 19

Just-in-Time phenotyping - II


TRALI/TACO “sniffer”

Page 20

Electronic Health Records and Phenomics

Page 21

Active Surveillance for TRALI and TACO

Of the 88 TRALI cases correctly identified by the CART algorithm, only 11 (12.5%) of these were reported to the blood bank by the clinical service.

Of the 45 TACO cases correctly identified by the CART algorithm, only 5 (11.1%) were reported to the blood bank by the clinical service.
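The reporting percentages quoted above follow directly from the case counts. A short Python check of the arithmetic, using only the numbers stated on the slide (the variable and function names are illustrative):

```python
# (cases correctly identified by the CART algorithm, cases reported to the blood bank)
cases = {"TRALI": (88, 11), "TACO": (45, 5)}


def reporting_rate(identified, reported):
    """Percentage of identified cases that were also reported."""
    return 100.0 * reported / identified


rates = {name: round(reporting_rate(*counts), 1) for name, counts in cases.items()}
print(rates)
```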

Page 22

Publications till date (conservative)

[Bar chart: Papers, Abstracts, and Under review counts for Year 1 (2011), Year 2 (2012), and Year 3 (2013); vertical axis 0 to 14; bar labels 8, 6, 6, 2, 12]

Page 23

2011 Milestones

• Standardized definitions for phenotype criteria
• Rules-based environment for phenotype algorithm execution
• National library for standardized phenotype definitions (collaboration with eMERGE)
• Machine learning techniques for algorithm definitions
• Online, real-time phenotype execution
• Phenotyping algorithm authoring environment


Page 24

2012 Milestones

• Machine learning techniques for algorithm definitions
• Online, real-time phenotype execution
• Collaboration with NQF, Query Health and i2b2 infrastructures
• Use cases and demonstrations
  • MU quality metrics (w/ NQF, Query Health)
  • Cohort identification (w/ eMERGE, PGRN)
  • Value analysis (w/ Mayo CSHCD, REP)
  • Clinical trial alerting (w/ Mayo Cancer Ctr./CTSA)


Page 25

Project 3: Collaborators & Acknowledgments

• CDISC (Clinical Data Interchange Standards Consortium)
  • Rebecca Kush, Landen Bain
• Centerphase Solutions
  • Gary Lubin, Jeff Tarlowe
• Group Health Seattle
  • David Carrell
• Harvard University/MIT
  • Guergana Savova, Peter Szolovits
• Intermountain Healthcare/University of Utah
  • Susan Welch, Herman Post, Darin Wilcox, Peter Haug
• Mayo Clinic
  • Cory Endle, Rick Kiefer, Sahana Murthy, Gopu Shrestha, Dingcheng Li, Gyorgy Simon, Matt Durski, Craig Stancl, Kevin Peterson, Cui Tao, Lacey Hart, Erin Martin, Kent Bailey, Scott Tabor, Chris Chute
