10th Annual Utah's Health Services Research Conference - Data Quality in Multi-Site Health Services...

Data Quality in Multi-Site Health Services and Comparative Effective Research: Lessons from PHIS+. Ram Gouripeddi, University of Utah. 10th Annual Utah Health Services Research Conference: Considering Data Quality in Health Services Research. Monday, March 16, 2015


Page 1: 10th Annual Utah's Health Services Research Conference - Data Quality in Multi-Site Health Services and Comparative Effective Research: Lessons from PHIS+ By: Ram Gouripeddi

Data Quality in Multi-Site Health Services and Comparative Effective Research: Lessons from PHIS+

Ram Gouripeddi, University of Utah

10th Annual Utah Health Services Research Conference: Considering Data Quality in Health Services Research

Monday, March 16, 2015

Page 2:

Acknowledgements

• Raj Srivastava, MD, MPH
• Ron Keren, MD, MPH
• OpenFurther Team members
• PHIS+ Team members across multiple institutions
• Apelon

• FURTHeR development was supported by the NCRR and the NCATS, NIH, through Grant UL1RR025764 and supplement 3UL1RR025764-02S2. This project was funded under grant number R01 HS019862-01 from the AHRQ, U.S. Department of Health and Human Services (HHS). The opinions expressed [in this document] are those of the authors and do not reflect the official position of AHRQ or the HHS.

• PHIS+: www.childrenshospitals.org/phisplus/index.html

Page 3:

PHIS+

• Augment the Children’s Hospital Association’s (CHA) existing electronic database of administrative data, the Pediatric Health Information System (PHIS), with clinical data in order to conduct Comparative Effectiveness Research (CER) studies.

• UU Biomedical Informatics Core - Informatics Partners

• Agency for Healthcare Research and Quality (AHRQ) funded project.

[Figure: PHIS and PHIS+]

Page 4:

PHIS+ Overview

CER Studies (4): Pneumonia, Appendicitis, Osteomyelitis, Gastroesophageal Reflux Disease

Data Streams (3): Laboratory, Microbiology, Radiology

Years of Data (5): 2007 – 2011

2009 – Development

2012….

Page 5:

The PHIS+ Process

Pediatric Research in Inpatient Settings (PRIS) Sites (6)

1. Cincinnati Children’s Hospital Medical Center (CCHMC)
2. Children’s Hospital Boston (CHB)
3. Children’s Hospital of Philadelphia (CHOP)
4. Children’s Hospital of Pittsburgh (CHP)
5. Primary Children’s Medical Center, Intermountain Healthcare (PCMC)
6. Seattle Children’s Hospital (SCH)

Page 6:

OpenFurther

Page 7:

Developmental Process Overview

Narus et al., Federating Clinical Data from Six Pediatric Hospitals: Process and Initial Results from the PHIS+ Consortium. AMIA 2011

Page 8:

Modeling & Terminology Phase

• Data Model Harmonization
• Semantic Mapping
• These steps ensured the quality of the data by limiting information loss arising from data transformations.

Page 9:

Data Model Harmonization

• Informatics team worked with domain experts to create representative common data models for storage of different domains of data.

• The team then worked with each hospital’s IT group to harmonize the hospital’s data models with the common data models.
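One way to picture the harmonization step is as a per-site field map onto a shared record layout. The sketch below is illustrative only: the `CommonLabResult` fields, the two site layouts and the `harmonize` helper are all assumptions made for this example, not the actual PHIS+ common data models.

```python
from dataclasses import dataclass


@dataclass
class CommonLabResult:
    """Hypothetical common-model record; field names are illustrative."""
    patient_id: str
    test_code: str
    value: str
    units: str


# Hypothetical per-site maps: common field name -> local column name.
SITE_FIELD_MAPS = {
    "site_a": {"patient_id": "MRN", "test_code": "TEST_CD",
               "value": "RESULT", "units": "UOM"},
    "site_b": {"patient_id": "pat_key", "test_code": "lab_code",
               "value": "res_val", "units": "res_unit"},
}


def harmonize(site: str, local_row: dict) -> CommonLabResult:
    """Copy a local row into the common layout, failing loudly on gaps."""
    field_map = SITE_FIELD_MAPS[site]
    missing = [col for col in field_map.values() if col not in local_row]
    if missing:
        raise KeyError(f"{site} row lacks columns: {missing}")
    return CommonLabResult(
        **{common: str(local_row[local]) for common, local in field_map.items()}
    )
```

Raising on a missing column, rather than filling in a default, keeps information loss visible at the point where it would otherwise be introduced.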

Page 10:

Semantic Mapping

• Obtained detailed information about distinct local data elements using a metadata collection toolkit.

• Mapped local data elements to standard biomedical terminologies.

• Doubtful mappings were discussed with the respective hospital team, including the site PI, lab and EHR personnel.

• All mappings were peer-reviewed within the informatics team and with the contributing hospital team, and were also run through software checks.

Metadata Fields                Example
Local Battery/Panel Name/Code
Battery/Panel Description
Local Test Name                Glucose
Local Test Code                Glu
Test Description               Blood Glucose
LOINC Code                     -
Test Value Type                Numeric
Test Value Sample Data         86
Test Start Date Format
Test End Date Format
Specimen                       Serum
Units of Measure               mg/dL
Reference Range                80 – 120
Interpretation Codes
Test Status Codes
Comments
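As a rough illustration of how a row from the metadata table could be held in code, the sketch below models a subset of the fields and flags entries that still lack a LOINC code. The class name, the attribute names and the `needs_mapping` rule (treating an empty value or "-" as unmapped) are assumptions for this example, not part of the metadata collection toolkit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocalTestMetadata:
    """Subset of the metadata fields from the table; names are illustrative."""
    local_test_name: str
    local_test_code: str
    test_description: str
    loinc_code: Optional[str]
    value_type: str
    sample_value: str
    specimen: str
    units: str
    reference_range: str


def needs_mapping(item: LocalTestMetadata) -> bool:
    """Assumed convention: a blank or '-' LOINC code means 'not yet mapped'."""
    return item.loinc_code is None or item.loinc_code.strip() in ("", "-")


# The example row from the table above.
glucose = LocalTestMetadata(
    local_test_name="Glucose",
    local_test_code="Glu",
    test_description="Blood Glucose",
    loinc_code="-",
    value_type="Numeric",
    sample_value="86",
    specimen="Serum",
    units="mg/dL",
    reference_range="80 – 120",
)
```

With this convention, `needs_mapping(glucose)` is true, so the glucose row would be queued for semantic mapping.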

Page 11:

Differences in Local Coding Schemas

Laboratory Test

Standard: C Reactive Protein [Mass/volume] in Serum or Plasma (1988-5)

Local: C Reactive Protein (8726); C Reactive Protein (CRPT); CRP (CRP); CRP Test (700111); C-Reactive Protein (801582); C R Protein (801679)

Unit of Measure

Standard: Nanogram/Decilitre (258805003)

Local: NG/DL; ng/dL; ng/dL; ng per dL; ng/Dl; ng per dL
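The local-to-standard translation shown on this slide can be sketched as two lookups. The codes are the ones on the slide; the dictionary layout and the `map_test` and `map_unit` helpers are assumptions for illustration and do not reflect how OpenFurther or its terminology server performs the translation.

```python
from typing import Optional

# Local test codes from the slide, all meaning C-reactive protein; "1988-5"
# is the standard code shown for them. A real mapping would also be keyed
# by site, since the same local code can mean different things elsewhere.
LOCAL_TEST_TO_STANDARD = {
    "8726": "1988-5",
    "CRPT": "1988-5",
    "CRP": "1988-5",
    "700111": "1988-5",
    "801582": "1988-5",
    "801679": "1988-5",
}

# Canonical unit spelling -> standard unit code shown on the slide.
UNIT_TO_STANDARD = {"ng per dl": "258805003"}


def map_test(local_code: str) -> Optional[str]:
    """Return the standard test code, or None if the code is unmapped."""
    return LOCAL_TEST_TO_STANDARD.get(local_code.strip())


def map_unit(raw_unit: str) -> Optional[str]:
    """Collapse case, slashes and spacing before looking up the unit."""
    canonical = " ".join(raw_unit.lower().replace("/", " per ").split())
    return UNIT_TO_STANDARD.get(canonical)
```

Returning `None` for an unmapped code or unit, instead of guessing, leaves the decision to the review step with the hospital team.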

Page 12:

Data Processing Phase

• Data collection phase: Each hospital used a combination of a data collection toolkit and data validation scripts to assess their submitted data.

• Contributed data was then processed through the OpenFurther platform for translation to selected standard terminologies and storage in common data models.

• Each row of processed data was checked for different data quality issues specific to each domain.

• Errors in the data were flagged with an error taxonomy and reviewed for fixes or resubmissions.

Page 13:

Example Checks

• Is the lab test associated with a patient?
• Is there a valid lab test in each row of lab result data?
• Does the lab test have a valid result?
• Are there proper relationships between cultures, their test specimens and results?
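The checks listed above could be expressed as row-level functions that return entries from an error taxonomy. This is a minimal sketch under stated assumptions: the `DQError` categories, the dictionary keys and both function names are hypothetical, and the actual PHIS+ checks and taxonomy are not given in the slides.

```python
from enum import Enum


class DQError(Enum):
    """Hypothetical error taxonomy, one entry per example check."""
    NO_PATIENT = "lab test not associated with a patient"
    INVALID_TEST = "no valid lab test in row"
    INVALID_RESULT = "missing or empty result"
    MISSING_SPECIMEN = "culture without a known specimen"
    ORPHAN_RESULT = "result without a known culture"


def check_lab_row(row: dict, valid_test_codes: set) -> list:
    """Return every taxonomy entry that applies to one lab result row."""
    errors = []
    if not row.get("patient_id"):
        errors.append(DQError.NO_PATIENT)
    if row.get("test_code") not in valid_test_codes:
        errors.append(DQError.INVALID_TEST)
    value = row.get("value")
    if value is None or str(value).strip() == "":
        errors.append(DQError.INVALID_RESULT)
    return errors


def check_culture_links(cultures: list, specimen_ids: set, results: list) -> list:
    """Flag cultures lacking a specimen and results lacking a culture."""
    errors = []
    culture_ids = {c["culture_id"] for c in cultures}
    for c in cultures:
        if c.get("specimen_id") not in specimen_ids:
            errors.append((c["culture_id"], DQError.MISSING_SPECIMEN))
    for r in results:
        if r.get("culture_id") not in culture_ids:
            errors.append((r.get("result_id"), DQError.ORPHAN_RESULT))
    return errors
```

Collecting all applicable errors per row, rather than stopping at the first, gives reviewers the full picture when deciding between a fix and a resubmission.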

Page 14:

Study Specific Quality Assessment

• Individual studies have different granularities and specificities in their data requirements.

• We undertook a second set of data quality assessments at the study cohort level.

• This included a chart review of a significant sample within each study cohort.

[Figure: 2823-3: Potassium [Moles/volume] in Serum or Plasma. Horizontal axis: result values from 0.6 to 13.2, plus a ">10.0" category; vertical axis: 0 to 35000.]
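A study-level review of result values, such as the potassium values charted above, has to cope with entries like ">10.0" alongside plain numbers. The sketch below shows one way to sort raw values into numeric, censored and non-numeric groups before profiling them. The category names, the `classify_value` and `profile` helpers and the sample strings are assumptions for illustration, not the PHIS+ assessment code.

```python
import math
import re
from collections import Counter

# A comparator (<, >, <=, >=) followed by a plain decimal number.
_CENSORED = re.compile(r"^([<>]=?)\s*(\d+(?:\.\d+)?)$")


def classify_value(raw: str) -> tuple:
    """Return (kind, comparator, number) for one raw result string."""
    text = raw.strip()
    match = _CENSORED.match(text)
    if match:
        return ("censored", match.group(1), float(match.group(2)))
    try:
        number = float(text)
    except ValueError:
        return ("non_numeric", None, None)
    if not math.isfinite(number):
        # Strings such as "nan" or "inf" parse as floats but are not results.
        return ("non_numeric", None, None)
    return ("numeric", None, number)


def profile(values: list) -> Counter:
    """Count how many raw values fall into each kind."""
    return Counter(classify_value(v)[0] for v in values)
```

A profile dominated by censored or non-numeric entries for a test that a study needs as a number is a signal to go back to the chart review.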

Page 15:

PHIS+ CER Database – 2007-11

Laboratory

Site   Results      LOINC Lab Test Code
A      15,011,312   538
B      33,214,540   1,214
C      16,868,383   860
D      25,706,608   1,089
E      38,422,668   1,016
F      14,507,629   2,131
Total  150,731,140  *6,848 (2,992)

Microbiology

Site   Culture Results  SNOMED Specimen Code  SNOMED Culture Procedure Code  SNOMED Organism Code  RxNorm Anti-microbial Code  Susceptibility Results  LOINC Susceptibility Test Code
A      247,933          114                   70                             113                   57                          487,813                 97
B      359,780          58                    42                             56                    58                          393,594                 85
C      231,071          179                   46                             162                   59                          340,100                 99
D      335,606          110                   34                             145                   57                          376,844                 75
E      486,315          130                   56                             160                   59                          605,000                 76
F      176,848          264                   71                             121                   51                          283,865                 89
Total  1,837,553        *855 (451)            *319 (95)                      *757 (203)            *341 (74)                   2,487,216               *521 (136)

Radiology

Site   Reports    CPT Radiology Procedure Code
A      445,681    280
B      1,151,383  349
C      635,458    296
D      980,740    482
E      1,098,693  497
F      201,708    477
Total  4,513,663  *2,381 (714)

* The first number is the total number of standard codes summed over the sites; the second, in parentheses, is the number of distinct standard codes across all sites.

1,854,406 Kids
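The two figures in each starred total can be read as a sum of per-site counts and the size of the union of the per-site code sets. A minimal sketch of that calculation follows; the `code_counts` helper and the tiny two-site input are assumptions for illustration and are not drawn from the PHIS+ data.

```python
def code_counts(codes_by_site: dict) -> tuple:
    """Return (total, distinct) counts of standard codes.

    total    = sum of the number of codes used at each site
    distinct = number of different codes used across all sites
    """
    per_site = {site: set(codes) for site, codes in codes_by_site.items()}
    total = sum(len(codes) for codes in per_site.values())
    distinct = len(set().union(*per_site.values()))
    return total, distinct


# Hypothetical input: site A uses two codes, site B reuses one of them.
example = {"A": {"1988-5", "2823-3"}, "B": {"1988-5"}}
print(code_counts(example))  # (3, 2)
```

The gap between the two numbers indicates how much the sites overlap in the standard codes they use.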

Page 16:

Discussion

• We developed an infrastructure that assesses the quality of data being integrated from disparate data sources.

• Using this infrastructure we populated a database with high quality data to support HSR & CER.

• To ensure data quality, a combination of computerized data assessment checks within OpenFurther and manual checks was used.

• Global and study-specific data quality assessments were required
– Address systemic issues in data integration and study-specific issues.

Page 17:

Discussion

• Informed by the framework developed by Kahn et al. in “A Pragmatic Framework for Single-site and Multisite Data Quality Assessment in Electronic Health Record-based Clinical Research”

• Inherent dimensions such as Accuracy, Objectivity and Believability, and Conceptual dimensions such as Timeliness and Appropriate amount of data, were measured.

• A software platform that complies with existing theoretical frameworks of data quality can assist this process and speed up the generation of new and reproducible study results.
– A Data Model for Representation and Storage of Biomedical Data Quality, Breakout Session 3 – Strategies for Identifying Data Quality Issues

Page 18:

THANK YOU
