Download - Biomedical Informatics Data Governance and Normalization with the Mayo Clinic Enterprise Christopher G Chute MD DrPH, Mayo Clinic Chair, ICD11 Revision,

Page 1

Biomedical Informatics

Data Governance and Normalization with the Mayo Clinic Enterprise

Christopher G Chute MD DrPH, Mayo Clinic
Chair, ICD11 Revision, World Health Organization
Chair, ISO TC215 on Health Informatics

CTSA Ontology Workshop
Baltimore

25 April 2012

Page 2


From Practice-based Evidence to Evidence-based Practice

Clinical Databases, Registries, et al.

Medical Knowledge

Data Inference

Standards

Shared Semantics

Vocabularies & Terminologies

Page 3

Mayo Clinic – Normalization Needs
Variations on Secondary Use

• Clinical Decision Support
• Quality measurement and improvement
• Best practice discovery and application
• New knowledge discovery (research)
  • Genotype to phenotype association
  • Outcomes Research
  • Comparative Effectiveness Analyses

© 2007 Mayo Clinic College of Medicine 3

Page 4

Comparable and Consistent Data

• Inferencing from data to information requires sorting information into categories
  • Statistical bins
  • Machine learning features
• Accurate and reproducible categorization depends upon semantic consistency
• Semantic consistency is the vocabulary problem
  • Almost always manifest as the “value set” problem
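
As a rough illustration of why binning depends on a shared value set, the sketch below maps local codes from two source systems into one target value set before counting. Everything in it is an assumption made for illustration: the system names, the local codes, and the target codes (loosely modeled on HL7 administrative gender codes) do not come from the slides.

```python
from collections import Counter

# Target value set for one data element (illustrative assumption).
ADMIN_GENDER_VALUE_SET = {"F", "M", "UN"}

# Hypothetical local codes used by two hypothetical source systems.
LOCAL_TO_STANDARD = {
    ("system_a", "female"): "F",
    ("system_a", "male"): "M",
    ("system_b", "1"): "M",
    ("system_b", "2"): "F",
    ("system_b", "9"): "UN",
}


def normalize(source: str, local_code: str) -> str:
    """Map a local code into the shared value set, or fail loudly."""
    key = (source, local_code)
    if key not in LOCAL_TO_STANDARD:
        raise ValueError(f"no mapping for {source!r}/{local_code!r}")
    code = LOCAL_TO_STANDARD[key]
    if code not in ADMIN_GENDER_VALUE_SET:
        raise ValueError(f"{code!r} is outside the target value set")
    return code


# Records from both systems land in the same statistical bins once normalized.
records = [("system_a", "female"), ("system_b", "2"), ("system_b", "1")]
bins = Counter(normalize(source, code) for source, code in records)
```

Without the mapping step, "female" and "2" would be counted as different categories even though they mean the same thing.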


Page 5

LOINC, RxNorm, and SNOMED
We’re not done yet…

• Picking high-level, Meaningful Use conformant, and fashionable vocabularies is not hard
• Finding which subsets (value sets) map to:
  • For each application and interface
  • For each data element and variable
• Implies thousands of value sets for the average enterprise
• Imposes a normalization and mapping challenge


Page 6

Data Governance

Data Standards

Data Quality

Data Architecture and Infrastructure

Data Governance Support Services


Page 7

Terminology Vision
Mayo Clinic Production Terminology Support

• To incrementally standardize Mayo Clinic terminology
• To access and use standardized terminology in applications, interfaces, warehouses, and processes across Mayo Clinic
• To reduce cost and effort of developing and maintaining redundant and disparate terminologies and mappings for operations, analytics, and decision support
• To support business needs and strategies that require or benefit from standardized terminology (e.g., ICD-10 conversion, Meaningful Use, Health Information Exchange)

Page 8


• Leverage external standards
• Support Mayo needs that are not yet part of an external standard and promote them to the industry
• Align with the Enterprise Data Model and other types of Mayo Clinic Data Standards
• Maintain a balance between short-term and strategic needs and solutions

Page 9

High-Level Context

External Terminologies

Enterprise Data Model

Analytic Systems

EDT, Admin, Nursing, DSS, other

Operational Systems

EMRs, Lab, Radiology, other

Research and Collaborating Organizations


Page 10

Solution Vision


Tools and Services: Load, Author, Map, Store, Search, Browse, Publish, Access

• Code Systems
  • Mayo Clinic Backbone Terminology, ICD-9 CM, ICD-10 CM, SNOMED CT, RxNorm, LOINC, etc.
• Mappings
  • Clinical Problems to ICD-9 CM, ICD-10 CM, and SNOMED CT, Lab Orders/Results to LOINC
• Value Sets
  • Clinical Problem, Body Structure, Body Side, Administrative Gender, Unit of Measure
• Picklists
  • Admitting Diagnosis, Lab Unit of Measure
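
One way to picture how code systems, mappings, value sets, and picklists relate is the minimal sketch below. The class names (`Concept`, `ConceptMap`) and their layout are hypothetical and are not the Mayo Clinic tooling; the hypertension codes are shown for illustration only and should be checked against current code system releases.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """A coded concept in some code system (hypothetical structure)."""
    system: str   # e.g. "ICD-10-CM", "SNOMED CT"
    code: str
    display: str


@dataclass
class ConceptMap:
    """Maps a local clinical problem term to concepts in external code systems."""
    targets: dict = field(default_factory=dict)

    def add(self, local_term: str, concept: Concept) -> None:
        self.targets.setdefault(local_term, []).append(concept)

    def lookup(self, local_term: str, system: str) -> list:
        return [c for c in self.targets.get(local_term, []) if c.system == system]


# Value set: the terms allowed for one data element (here, Clinical Problem).
clinical_problem_value_set = {"Hypertension", "Hypothyroidism"}

# Mappings: one clinician-facing term, several target code systems.
problem_map = ConceptMap()
problem_map.add("Hypertension", Concept("ICD-9-CM", "401.9", "Unspecified essential hypertension"))
problem_map.add("Hypertension", Concept("ICD-10-CM", "I10", "Essential (primary) hypertension"))
problem_map.add("Hypertension", Concept("SNOMED CT", "38341003", "Hypertensive disorder"))

# Picklist: an ordered, user-facing presentation of a value set.
picklist = sorted(clinical_problem_value_set)

# A billing interface asks for ICD-10-CM; an exchange might ask for SNOMED CT.
billing_codes = [c.code for c in problem_map.lookup("Hypertension", "ICD-10-CM")]
```

The point of the separation is that the clinician-facing term stays stable while each consuming system asks for the code system it needs.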

Page 11

Value Set and Mappings Example

Clinical Problems (Clinical Facing)

• Hypertension
• Hyperlipidemia
• Depression
• Hypothyroidism
• Obesity
• Osteoporosis
• Coronary artery disease
• Anemia
• Diabetes mellitus
• Obstructive sleep apnea

Code Systems

Administrative Billing

Health Information Exchanges

Page 12

Problem List Terminology Example

Develop and approve

Provide in an accessible format

Distribute and Load

Use in clinical and admin processes

EMRs, Scheduling, Orders, etc.

Page 13

Terminology Management Offerings

• Methodologies and Best Practices
• Terminology Standardization Facilitation
• External Standards Expertise
• Terminology Mapping (and 3rd party recommendations)
• External Terminology Management
• Mayo Clinic Terminology Standards


Page 14

Tools and Terminology Services

• Health Language Inc.
  • Central authoring tool (LExScape)
  • Mapping assistance tool (LEAP)
  • Web-based browser (LExPlorer)
  • Importer/Exporter and Web Services
• Mayo Developed & Supported
  • Exports
  • Web Services (as needed)

Page 16

Approved Data Standards and Policy

• Approved Data Standards on Web
  • Mayo Clinic Data Standards
  • Enterprise Data Model
• Data Standards Policy
  • Data Governance Policy for Data Standards

Page 17


Data Granularity Dissonance
Practice vs. Research vs. Public Health

• Clinical data is impressionistic
  • Not protocol driven – quaint rambling text…
  • Detail and consistency is a happy coincidence
  • However, can be holistic; intuit the unanticipated
• Research data is rigid though rigorous
  • Detailed, structured, complete (ideally), coherent
  • May miss the forest for the chlorophyll
• Public Health comprehends broader interests
  • Metrics of life-style, high risk behavior, fitness
  • Non-clinical circumstances, settings, environment
• Can there be reconciliation among use-cases?

Page 18

electronic MEdical Records and GEnomics
NHGRI eMERGE (U01) Goals

• GWAS – Genome Wide Association Study
  • 600k Affy chip
• High-throughput Phenotyping
  • Disease algorithm scans across EMRs
  • “catch up” with high throughput genomics
• Generalize Phenotypes across the Consortium
  • Measure reproducibility of algorithms among members
  • Vanderbilt, Northwestern, Marshfield, Group Health Seattle, Mayo

Page 19


SHARP Area 4: Secondary Use of EHR Data
A $15M National Consortium

• 16 academic and industry partners
• Develop tools and resources that influence and extend secondary uses of clinical data
• Cross-integrated suite of projects and products
  • Clinical Data Normalization
  • Natural Language Processing (NLP)
  • Phenotyping (cohorts and eligibility)
  • Common pipeline tooling (UIMA) and scaling
  • Data Quality (metrics, missing value management)
  • Evaluation Framework (population networks)

Page 20

SHARP Area 4: Secondary Use of EHR Data

• Harvard Univ.
• Intermountain Healthcare
• Mayo Clinic
• Mirth Corporation, Inc.
• MIT
• MITRE Corp.
• Regenstrief Institute, Inc.
• SUNY
• University of Colorado
• Agilex Technologies
• CDISC (Clinical Data Interchange Standards Consortium)
• Centerphase Solutions
• Deloitte
• Group Health, Seattle
• IBM Watson Research Labs
• University of Utah
• University of Pittsburgh

Page 21

Themes & Projects

Page 22

Clinical Data Normalization

• Data Normalization
  • Clinical data comes in all different forms even for the same kind of information.
  • Comparable and consistent data is foundational to secondary use
• Clinical Data Models – Clinical Element Models (CEMs)
  • Basis for retaining computable meaning when data is exchanged between heterogeneous computer systems.
  • Basis for shared computable meaning when clinical data is referenced in decision support logic.

Page 23

[Figure: Clinical Element Model for Systolic Blood Pressure, a diagram of a simple clinical model. Data values shown: 138 mmHg; Right Arm; Sitting.]
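
The systolic blood pressure model could be approximated in code roughly as follows. This is only a sketch under stated assumptions: it is not the actual CEM definition syntax, and the class and field names (`PhysicalQuantity`, `body_location`, `patient_position`) are guesses at what the three data values in the figure represent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PhysicalQuantity:
    """A numeric value with its unit of measure."""
    value: float
    unit: str


@dataclass(frozen=True)
class SystolicBloodPressure:
    """Hypothetical stand-in for a systolic blood pressure clinical element."""
    data: PhysicalQuantity                   # the measurement itself
    body_location: Optional[str] = None      # assumed qualifier name
    patient_position: Optional[str] = None   # assumed qualifier name


# The instance shown in the figure: 138 mmHg, right arm, sitting.
reading = SystolicBloodPressure(
    data=PhysicalQuantity(138, "mmHg"),
    body_location="Right Arm",
    patient_position="Sitting",
)
```

In a real model the qualifiers would be bound to value sets rather than free strings, which is where the terminology work described earlier comes in.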

Page 25

Data Element Harmonization

• Stan Huff – CIMI
  • Clinical Information Modeling Initiative
• NHS Clinical Statement
• CEN TC251/OpenEHR Archetypes
• HL7 Templates
• ISO TC215 Detailed Clinical Models
• CDISC Common Clinical Elements
• Intermountain/GE CEMs

Page 26

Normalization Pipelines

• Input heterogeneous clinical data
  • HL7, CDA/CCD, structured feeds
• Output Normalized CEMs
  • Create logical structures within UIMA CAS
• Serialize to a persistence layer
  • SQL, RDF, “PCAST like”, XML
• Robust Prototypes Q1 2012
  • Early version production Q3 2010
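
The input, normalize, and serialize stages above can be sketched in miniature. This is an assumption-laden toy, not the SHARP pipeline: it skips UIMA entirely, the normalized dictionary is a hypothetical stand-in for a CEM instance, and the structured-feed column names are invented. The HL7 v2 OBX segment is a hand-written example.

```python
import xml.etree.ElementTree as ET


def from_hl7_obx(segment: str) -> dict:
    """Normalize one HL7 v2 OBX segment (pipe-delimited) into a common shape."""
    fields = segment.split("|")
    code, display, system = fields[3].split("^")[:3]  # OBX-3: observation identifier
    return {"code": code, "system": system, "display": display,
            "value": float(fields[5]), "unit": fields[6]}  # OBX-5 value, OBX-6 units


def from_structured_feed(row: dict) -> dict:
    """Normalize one row of a local feed; the column names are hypothetical."""
    return {"code": row["loinc"], "system": "LN", "display": row["name"],
            "value": float(row["result"]), "unit": row["uom"]}


def to_xml(obs: dict) -> str:
    """Serialize a normalized observation to XML, one of the listed targets."""
    elem = ET.Element("observation", code=obs["code"], system=obs["system"])
    ET.SubElement(elem, "display").text = obs["display"]
    ET.SubElement(elem, "value", unit=obs["unit"]).text = str(obs["value"])
    return ET.tostring(elem, encoding="unicode")


obx = "OBX|1|NM|8480-6^Systolic blood pressure^LN||138|mm[Hg]|||||F"
feed_row = {"loinc": "8480-6", "name": "Systolic blood pressure",
            "result": "138", "uom": "mm[Hg]"}

# Two different inputs converge on the same normalized representation.
normalized = [from_hl7_obx(obx), from_structured_feed(feed_row)]
serialized = [to_xml(obs) for obs in normalized]
```

Once both sources produce the same structure, everything downstream (persistence, phenotyping, quality metrics) can be written once.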

Page 27

This slide is obvious

Determine payload

Is laboratory data

1. Physical Quantity

2. Coded Values

3. Text field
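
A minimal sketch of the payload decision for a laboratory result might look like the following. The regular expression, the small coded-value set, and the function name `classify_payload` are all illustrative assumptions; a production pipeline would consult the data type declared in the message and a governed value set rather than guess from the string.

```python
import re

# Hypothetical value set of coded lab results.
CODED_VALUES = {"positive", "negative", "reactive", "nonreactive"}

# A leading number, optionally followed by a unit string.
QUANTITY = re.compile(r"^\s*([-+]?\d+(?:\.\d+)?)\s*(\S.*)?$")


def classify_payload(raw: str):
    """Return (kind, value) where kind is one of the three payload types."""
    match = QUANTITY.match(raw)
    if match:
        unit = (match.group(2) or "").strip()
        return ("physical_quantity", (float(match.group(1)), unit))
    token = raw.strip().lower()
    if token in CODED_VALUES:
        return ("coded_value", token)
    return ("text", raw.strip())


quantity_result = classify_payload("7.2 mmol/L")
coded_result = classify_payload("Positive")
text_result = classify_payload("Specimen hemolyzed, please recollect")
```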

Page 28

Natural Language Processing

• Information extraction (IE): transformation of unstructured text into structured representations and merging clinical data extracted from free text with structured data
  • Entity and Event discovery
  • Relation discovery
  • Normalization template: Clinical Element Model
• Improving the functionality, interoperability, and usability of a clinical NLP system(s), namely the Clinical Text Analysis and Knowledge Extraction System (cTAKES)
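
To make entity discovery concrete, here is a deliberately naive dictionary lookup that turns free text into structured mentions. It is not cTAKES and does not use its API; the dictionary, the note text, and the output shape are all invented for illustration, and real systems also handle negation, context, and relations.

```python
import re

# Hypothetical mini-dictionary: surface form -> normalized problem name.
DICTIONARY = {
    "high blood pressure": "Hypertension",
    "hypertension": "Hypertension",
    "htn": "Hypertension",
    "obstructive sleep apnea": "Obstructive sleep apnea",
    "osa": "Obstructive sleep apnea",
}


def extract_entities(text: str) -> list:
    """Find dictionary terms in text and return them as structured mentions."""
    found = []
    for surface, concept in DICTIONARY.items():
        pattern = rf"\b{re.escape(surface)}\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append({"concept": concept,
                          "span": (m.start(), m.end()),
                          "text": m.group(0)})
    return sorted(found, key=lambda e: e["span"])


note = "Pt has HTN and OSA; denies chest pain."
entities = extract_entities(note)
concepts = [e["concept"] for e in entities]
```

Each mention carries its character span, so the extracted concepts can later be merged with structured data from the same encounter.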

Page 29

High-Throughput Phenotyping

• Phenotype - identifying a set of characteristics about a patient, such as:
  • A diagnosis
  • Demographics
  • A set of lab results
• Phenotyping – overload of terms
  • Originally for research cohorts from EMRs
  • Obvious extension to clinical trial eligibility
  • Note relevance to quality metrics
    • Numerator and denominator constitute phenotypes
  • Clinical decision support
    • Trigger criteria
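
The remark that a quality metric's numerator and denominator are themselves phenotypes can be shown with a toy example. The patient records are synthetic, and the measure (diabetic patients with a recorded HbA1c under 8.0) and its threshold are hypothetical choices made only to illustrate the idea.

```python
# Synthetic records for illustration only.
patients = [
    {"id": 1, "diabetes": True, "hba1c": 6.8},
    {"id": 2, "diabetes": True, "hba1c": 9.1},
    {"id": 3, "diabetes": False, "hba1c": None},
    {"id": 4, "diabetes": True, "hba1c": None},
]

HBA1C_THRESHOLD = 8.0  # hypothetical cut-off


def in_denominator(p: dict) -> bool:
    """Denominator phenotype: patient has diabetes."""
    return p["diabetes"]


def in_numerator(p: dict) -> bool:
    """Numerator phenotype: in the denominator, with HbA1c recorded and under threshold."""
    return (in_denominator(p)
            and p["hba1c"] is not None
            and p["hba1c"] < HBA1C_THRESHOLD)


denominator = [p for p in patients if in_denominator(p)]
numerator = [p for p in denominator if in_numerator(p)]
rate = len(numerator) / len(denominator)
```

The same two predicates could serve as cohort definitions for research or as trigger criteria for decision support, which is the overloading the slide points to.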

Page 30

EMR Phenotype Algorithms

• Typical components
  • Billing and diagnosis codes; Procedure codes
  • Labs; Medications
  • Phenotype-specific co-variates (e.g., Demographics, Vitals, Smoking Status, CASI scores)
  • Pathology; Imaging?
• Organized into inclusion and exclusion criteria
• Experience from eMERGE Electronic Medical Records and Genomics Network (
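
A phenotype algorithm built from these components can be sketched as lists of inclusion and exclusion rules over a patient record. The rules below are hypothetical and are not the published eMERGE hypothyroidism algorithm; the record layout, the medication name match, and the TSH threshold are assumptions chosen only to show the structure.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical flattened view of one patient's EMR data."""
    patient_id: str
    diagnosis_codes: set = field(default_factory=set)  # ICD-9-CM codes
    medications: set = field(default_factory=set)
    labs: dict = field(default_factory=dict)           # lab name -> latest value


# Illustrative criteria only; thresholds and code choices are assumptions.
INCLUSION = [
    lambda p: "244.9" in p.diagnosis_codes,        # unspecified acquired hypothyroidism
    lambda p: "levothyroxine" in p.medications,
    lambda p: p.labs.get("TSH", 0.0) > 5.0,        # hypothetical threshold
]
EXCLUSION = [
    lambda p: "193" in p.diagnosis_codes,          # malignant neoplasm of thyroid gland
]


def is_case(p: PatientRecord) -> bool:
    """A case meets every inclusion rule and no exclusion rule."""
    return all(rule(p) for rule in INCLUSION) and not any(rule(p) for rule in EXCLUSION)


case = PatientRecord("p1", {"244.9"}, {"levothyroxine"}, {"TSH": 7.4})
excluded = PatientRecord("p2", {"244.9", "193"}, {"levothyroxine"}, {"TSH": 7.4})
cohort = [p.patient_id for p in (case, excluded) if is_case(p)]
```

Keeping each rule as a separate predicate makes it easier to compare how the same algorithm behaves at different sites.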

Page 31

SHARP Area 4: More information…

Page 32


Where is This Going?

• Standards and interoperability are crucial to the operation and success of academic medical centers
• The boundaries among clinical and research standards are eroding
• Information standards, especially vocabularies, are the foundation for scientific synergies
• Data Governance, at an enterprise (and arguably a national) level is critical