Project Lead: Jyotishman Pathak, PhD PI: Christopher G. Chute, MD, DrPH
Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use of EHR Data
Project 3: High-Throughput Phenotyping
Project Lead: Jyotishman Pathak, PhD
PI: Christopher G. Chute, MD, DrPH
June 12, 2012
SHARPn High-Throughput Phenotyping
Electronic health records (EHRs) driven phenotyping
• Overarching goal
  • To develop high-throughput automated techniques and algorithms that operate on normalized EHR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings
©2012 MFMER | slide-2
Current HTP project themes
• Standardization of phenotype definitions
• Library of phenotyping algorithms
• Phenotyping workbench
• Machine learning techniques for phenotyping
• Just-in-time phenotyping
Algorithm Development Process - Modified
[Diagram labels: Data; Transform; Transform; Phenotype Algorithm; Visualization; Evaluation; NLP, SQL; Rules; Mappings; Semi-Automatic Execution]
• Standardized representation of clinical data
  • Create new and re-use existing clinical element models (CEMs)
• Standardized and structured representation of phenotype definition criteria
  • Use the NQF Quality Data Model (QDM)
• Conversion of structured phenotype criteria into executable queries
  • Use JBoss® Drools (DRLs)
[Welch et al. 2012] [Thompson et al., submitted 2012] [Li et al., submitted 2012]
NQF Quality Data Model (QDM)
• Standard of the National Quality Forum (NQF)
• A structure and grammar to represent quality measures in a standardized format
• Groups of codes in a code set (ICD-9, etc.)
  • "Diagnosis, Active: steroid induced diabetes" using "steroid induced diabetes Value Set GROUPING (2.16.840.1.113883.3.464.0001.113)"
• Supports temporality & sequences
  • AND: "Procedure, Performed: eye exam" > 1 year(s) starts before or during "Measurement end date"
• Implemented as set of XML schemas
• Links to standardized terminologies (ICD-9, ICD-10, SNOMED-CT, CPT-4, LOINC, RxNorm etc.)
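The two QDM ideas on this slide (value sets as groups of codes, and temporal relationships between data elements) can be illustrated with a minimal Python sketch. Only the OID comes from the slide. The member codes, the `ClinicalEvent` type, and the simplified reading of "starts before or during" as "start date on or before the reference date" are assumptions made for illustration, not the QDM specification. The "> 1 year(s)" offset is deliberately left out.

```python
from dataclasses import dataclass
from datetime import date

# OID copied from the slide; the member codes are placeholders, not the real value set.
STEROID_INDUCED_DIABETES = {
    "oid": "2.16.840.1.113883.3.464.0001.113",
    "codes": {("ICD-9", "PLACEHOLDER-1"), ("ICD-9", "PLACEHOLDER-2")},
}


@dataclass(frozen=True)
class ClinicalEvent:
    """Hypothetical normalized data element (category + coded value + start date)."""
    category: str      # e.g. "Diagnosis, Active"
    code_system: str   # e.g. "ICD-9"
    code: str
    start: date


def in_value_set(event, value_set):
    """True if the event's (code system, code) pair belongs to the value set."""
    return (event.code_system, event.code) in value_set["codes"]


def matches(event, category, value_set):
    """True if the event has the given QDM-style category and a code in the value set."""
    return event.category == category and in_value_set(event, value_set)


def starts_before_or_during(event, reference_end):
    """Simplified temporal test: the event starts on or before the reference date."""
    return event.start <= reference_end


measurement_end = date(2011, 12, 31)
dx = ClinicalEvent("Diagnosis, Active", "ICD-9", "PLACEHOLDER-1", date(2011, 3, 1))
print(matches(dx, "Diagnosis, Active", STEROID_INDUCED_DIABETES))  # True
print(starts_before_or_during(dx, measurement_end))                # True
```

Keeping the value set as data (an OID plus a set of codes) rather than hard-coding codes into the criterion mirrors the slide's point that criteria refer to value sets, not to individual codes.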
116 Meaningful Use Phase I Quality Measures
Example: Diabetes & Lipid Mgmt. - I
Human readable HTML
Example: Diabetes & Lipid Mgmt. - II
Computable XML
Drools-based Phenotyping Architecture
[Architecture diagram labels: Clinical Element Database; Data Access Layer; Transformation Layer (transform physical representation → normalized logical representation (Fact Model)); Inference Engine (Drools); Business Logic; Service for Creating Output (File, Database, etc.); List of Diabetic Patients]
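The layering named in this diagram (data access, transformation into a fact model, inference, output) can be sketched in plain Python. This is a stand-in under stated assumptions, not the SHARPn implementation and not Drools: the row layout, the `Fact` and `Rule` types, and the naive matching loop are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical physical rows, standing in for a clinical element database.
RAW_ROWS = [
    {"pid": "001", "type": "dx", "code": "DM"},
    {"pid": "002", "type": "dx", "code": "HTN"},
    {"pid": "003", "type": "dx", "code": "DM"},
]


@dataclass(frozen=True)
class Fact:
    """Normalized logical representation (the 'fact model' in the diagram)."""
    patient_id: str
    kind: str
    code: str


@dataclass
class Rule:
    """Business logic: a named condition evaluated against a single fact."""
    name: str
    condition: Callable[[Fact], bool]


def data_access_layer() -> List[dict]:
    """Fetch physical records (here: an in-memory list instead of a database)."""
    return list(RAW_ROWS)


def transformation_layer(rows: Iterable[dict]) -> List[Fact]:
    """Map physical rows onto the normalized fact model."""
    return [Fact(r["pid"], r["type"], r["code"]) for r in rows]


def inference_engine(facts: Iterable[Fact], rules: List[Rule]) -> Dict[str, Set[str]]:
    """Naive stand-in for a rule engine: test every rule against every fact."""
    matched: Dict[str, Set[str]] = {rule.name: set() for rule in rules}
    for fact in facts:
        for rule in rules:
            if rule.condition(fact):
                matched[rule.name].add(fact.patient_id)
    return matched


def output_service(matched: Dict[str, Set[str]], rule_name: str) -> List[str]:
    """Produce the final artifact (here: a sorted patient list)."""
    return sorted(matched[rule_name])


diabetes_rule = Rule("diabetic", lambda f: f.kind == "dx" and f.code == "DM")
facts = transformation_layer(data_access_layer())
diabetic_patients = output_service(inference_engine(facts, [diabetes_rule]), "diabetic")
print(diabetic_patients)  # ['001', '003']
```

Separating the transformation step from the rules means the rules only ever see the normalized fact model, so the physical storage can change without rewriting the phenotype logic.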
Automatic translation from NQF QDM criteria to Drools
[Li et al., submitted 2012]
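To give a feel for what translating a structured criterion into a Drools rule might involve, here is a small Python sketch that renders one criterion as DRL text. It is not the translator described in [Li et al., submitted 2012]. The `Criterion` type and the `ClinicalFact` and `PhenotypeMatch` fact types are hypothetical names, and the generated rule is only an assumed shape for such output.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Criterion:
    """Hypothetical structured form of one QDM-style criterion."""
    name: str                # human-readable label, used as the rule name
    category: str            # e.g. "Diagnosis, Active"
    codes: Tuple[str, ...]   # codes drawn from the referenced value set


def to_drl(criterion: Criterion) -> str:
    """Render the criterion as DRL rule text (no escaping of quotes is attempted)."""
    code_list = ", ".join(f'"{c}"' for c in criterion.codes)
    return "\n".join([
        f'rule "{criterion.name}"',
        "when",
        f'    $f : ClinicalFact( category == "{criterion.category}", code in ( {code_list} ) )',
        "then",
        "    insert( new PhenotypeMatch( $f.getPatientId() ) );",
        "end",
    ])


drl = to_drl(Criterion("Diabetes diagnosis", "Diagnosis, Active", ("CODE-A", "CODE-B")))
print(drl)
```

The sketch only produces text. Compiling and executing the rule would still require a Drools runtime and Java classes matching the assumed fact types.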
The “executable” Drools flow
Phenotype library and workbench - I
1. Converts QDM to Drools
2. Rule execution by querying the CEM database
3. Generate summary reports
http://phenotypeportal.org
Phenotype library and workbench - II
http://phenotypeportal.org
Phenotype library and workbench - III
Machine learning and HTP - I
• Machine learning and association rule mining
• Manual creation of algorithms takes time
• Let computers do the “hard work”
• Validate against expert developed ones
[Caroll et al. 2011]
Machine learning and HTP - II
• Origins from sales data
• Items (columns): co-morbid conditions
• Transactions (rows): patients
• Itemsets: sets of co-morbid conditions
• Goal: find all itemsets (sets of conditions) that frequently co-occur in patients.
  • One of those conditions should be DM.
• Support: # of transactions the itemset I appeared in
  • Support({TB, DLM, ND})=3
• Frequent: an itemset I is frequent, if support(I)>minsup
Patient TB DLM ND … IEC
001 Y Y Y Y
002 Y Y Y Y
003 Y Y
004 Y
005 Y Y Y
X: infrequent
[Simon et al. 2012]
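The support and frequent-itemset definitions on this slide can be sketched in a few lines of Python. The patient data below is hypothetical: the slide's table lost its column alignment, so the rows are illustrative only, chosen so that Support({TB, DLM, ND}) = 3 as stated on the slide. The enumeration is brute force for clarity. Practical miners such as Apriori prune candidates instead of testing every subset.

```python
from itertools import combinations

# Hypothetical patient-by-condition data (illustrative, not recovered from the slide).
transactions = {
    "001": {"TB", "DLM", "ND", "IEC"},
    "002": {"TB", "DLM", "ND", "IEC"},
    "003": {"TB", "IEC"},
    "004": {"DLM"},
    "005": {"TB", "DLM", "ND"},
}


def support(itemset, transactions):
    """Number of transactions (patients) that contain every item in the itemset."""
    items = set(itemset)
    return sum(1 for conditions in transactions.values() if items <= conditions)


def frequent_itemsets(transactions, minsup, must_contain=None):
    """Brute-force search for itemsets with support > minsup.

    If must_contain is given, only itemsets including that item are kept
    (the slide requires one of the conditions to be DM).
    """
    all_items = sorted(set().union(*transactions.values()))
    result = {}
    for size in range(1, len(all_items) + 1):
        for candidate in combinations(all_items, size):
            if must_contain is not None and must_contain not in candidate:
                continue
            s = support(candidate, transactions)
            if s > minsup:
                result[candidate] = s
    return result


print(support({"TB", "DLM", "ND"}, transactions))  # 3
frequent = frequent_itemsets(transactions, minsup=2)
print(frequent[("DLM", "ND", "TB")])               # 3
```

With `minsup=2`, an itemset must appear in at least three patients to count as frequent, because the slide's definition uses a strict inequality.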
Electronic Health Records and Phenomics
Just-in-Time phenotyping - I
Transfusion-related Acute Lung Injury (TRALI)
Transfusion-associated Circulatory Overload (TACO)
Just-in-Time phenotyping - II
TRALI/TACO “sniffer”
Active Surveillance for TRALI and TACO
Of the 88 TRALI cases correctly identified by the CART algorithm, only 11 (12.5%) of these were reported to the blood bank by the clinical service.
Of the 45 TACO cases correctly identified by the CART algorithm, only 5 (11.1%) were reported to the blood bank by the clinical service.
Publications to date (conservative)
[Bar chart: number of publications per year for Year 1 (2011), Year 2 (2012), and Year 3 (2013); series: Papers, Abstracts, Under review]
2011 Milestones
• Standardized definitions for phenotype criteria
• Rules-based environment for phenotype algorithm execution
• National library for standardized phenotype definitions (collaboration with eMERGE)
• Machine learning techniques for algorithm definitions
• Online, real-time phenotype execution
• Phenotyping algorithm authoring environment
2012 Milestones
• Machine learning techniques for algorithm definitions
• Online, real-time phenotype execution
• Collaboration with NQF, Query Health and i2b2 infrastructures
• Use cases and demonstrations
  • MU quality metrics (w/ NQF, Query Health)
  • Cohort identification (w/ eMERGE, PGRN)
  • Value analysis (w/ Mayo CSHCD, REP)
  • Clinical trial alerting (w/ Mayo Cancer Ctr./CTSA)
Project 3: Collaborators & Acknowledgments
• CDISC (Clinical Data Interchange Standards Consortium)
  • Rebecca Kush, Landen Bain
• Centerphase Solutions
  • Gary Lubin, Jeff Tarlowe
• Group Health Seattle
  • David Carrell
• Harvard University/MIT
  • Guergana Savova, Peter Szolovits
• Intermountain Healthcare/University of Utah
  • Susan Welch, Herman Post, Darin Wilcox, Peter Haug
• Mayo Clinic
  • Cory Endle, Rick Kiefer, Sahana Murthy, Gopu Shrestha, Dingcheng Li, Gyorgy Simon, Matt Durski, Craig Stancl, Kevin Peterson, Cui Tao, Lacey Hart, Erin Martin, Kent Bailey, Scott Tabor, Chris Chute