Post on 07-May-2015
The electronic medical record (EMR) will constitute the core of a computerized health care system in the near future. The electronic storage of clinical information will create the potential for computer-based tools to help clinicians significantly enhance the quality of medical care and increase the efficiency of medical practice. These tools may include reminder systems that identify patients who are due for preventative care interventions, alerting systems that detect contraindications among prescribed medications, and coding systems that facilitate the selection of correct billing codes for patient encounters. Numerous other "decision-support" tools have been developed and may soon facilitate the practice of clinical medicine. The potential of such tools will not be realized, however, if the EMR is just a set of textual documents stored in a computer, i.e. a "word-processed" patient chart. To support intelligent and useful tools, the EMR must have a systematic internal model of the information it contains and must support the efficient capture of clinical information in a manner consistent with this model. Although commercially available EMR systems that have such features are appearing, the builders and the buyers of EMR systems must continue to focus on the proper design of these systems if the benefits of computerization are to be fully realized. (Sujansky WV. The Benefits and Challenges of an Electronic Medical Record: Much More than a "Word-Processed" Patient Chart. West J Med 1998; 169:176-183) © COPYRIGHT 1998 British Medical Association
Electronic Medical Records
• Electronic medical records (EMR)
  – clinical benefits
    • reduction in medical errors, prescription errors
    • supports quality improvement programs
  – research benefits
    • “Frankly, one of the biggest attractions to LastWord is going to be a boon to clinical research. Information will be accessible in a much more uniform and complete way.” Haile Debas, Daybreak, Feb. 2, 2001
• UCSF spending $50 million over next 2 years on CareCast
• How real is the promise of EMRs for research?
Background
• Understand key properties of useful electronic medical records and data warehousing
  – free vs. coded entry
  – importance of a standardized clinical vocabulary
• Understand implications of database technologies on clinical research
• Be familiar with basic concepts in data security and privacy
Learning Objectives
• Example Study
  – a single-institution outcomes research question
• Electronic Medical Records (EMRs)
  – relational databases
  – vocabulary
• Data Warehousing
• Security and Privacy
Outline
• Retrospective analysis
• Compare 1-year re-admission rate for acute MI for
  – diabetics admitted with acute MI, discharged
    • on β-blockers
    • not on β-blockers
• First acute MI in 1999 to 2001, follow-up to 2002
An Outcomes Research Project
• Find diabetics admitted with AMI, 1999 to 2001
• Find whether D/C’ed on β-blocker
• For these patients, find all re-admissions in the year following the index MI
  – identify re-admissions that were for acute MI
• Analyze
  – predictor = β-blocker status
  – primary outcome = acute MI readmission rate
  – secondary outcome = length of stay (LOS)
Study Steps
• Data needed
  – admission: Admission Discharge Transfer system
  – diabetes diagnosis: chart, HgbA1C
  – MI diagnosis: chart, troponins, EKG readings
    • or just trust coding of admission diagnosis?
  – β-blocker usage: orders, pharmacy
• Existing (legacy) systems
  – claims, pharmacy, ADT, lab, xray, med record, etc.
Health System Minnesota: 50 paper, 50 computer
200,000 lives, 460 physicians
Data Needed for β-Blocker Study
Pros Cons
Chart Review
Electronic Medical Record
Data Collection Method
• EMR provides individual patient data for
  – real-time clinical care
  – reimbursement (e.g., for E&M coding)
  – see table for major functionality dimensions
• Clinical workstation includes interfaces to
  – practice management systems
  – pharmacy benefit management
  – knowledge resources (e.g., WWW, guidelines)
• “EMRs” range from flat file, text-based systems to full-featured workstations
What is an EMR?
8 Types of EMR Functionality
Type | Description
Viewing | Electronic viewing of chart notes, problem and medication lists, discharge summaries, laboratory results, and radiology results.
Documentation | Entry of visit note and other information into the EMR, whether through dictation or direct keyboard entry.
Order Entry | Electronic physician order entry of drug prescriptions, laboratory tests, radiology studies, or referrals.
Care Planning and Management | Managing patients in disease management programs, such as for asthma or congestive heart failure.
Patient-Directed | Patient education materials; web-based education modules, self-diagnosis algorithms, patient viewing of EMR data, and e-mail with care providers.
Billing and Other Administrative | Determination of insurance eligibility, assistance with visit level coding, management and tracking of referrals.
Performance Reporting | Quality and utilization reporting to both internal and external audiences.
Messaging | E-mail or other messaging system among providers and staff within the organization, or to external organizations.
• Physician friendliness
  – if docs won’t use it, it won’t help research
• What data it contains
• How that data is stored (and retrieved)
• Security
• Cost, maintenance, technical support, etc.
Critical EMR Features
• Workflow compatible
  – portable
• Easy data entry
  – voice-recognition
  – pen-based (PDAs)
  – digital ink
• Preserves doctor-patient relationship
• Secure
[Figure: Fujitsu 510]
Physician Friendliness
• Contents: data and detail sufficient for
  – real-time clinical care
    • notes, orders, labs, prescriptions, xray (reports)...
  – administration
    • demographic, billing, provider IDs...
  – research?
    • standardized data collection, symptom scales, etc.
• Structure: generally should store contents in relational form
  – unstructured free text (flat file) difficult to compute on
  – relational data schema provides structure to the EMR data
    • e.g., fields for diagnosis, medication name, dosage
EMR Contents and Structure
Relational Admissions Database

MasterTable
ID | Name | Sex | Birthdate | Insurance
000-01-001 | Lee | M | 09-Jul-00 | B/T Healthnet
000-01-002 | Smith | F | 22-Oct-25 | Medicare
000-01-003 | Perez | F | 13-Jun-57 | B/T Pacificare

AdmissionNumberTable
Adm# | ID | Admit Date | Discharge Date
001 | 000-01-001 | 31-Dec-94 | 12-Jan-95
002 | 000-01-001 | 27-Mar-96 | 31-Mar-96
003 | 000-01-002 | 03-Feb-95 | 16-Feb-95
004 | 000-01-002 | 27-Feb-95 | 20-Mar-95
005 | 000-01-003 | 19-Nov-97 | 23-Nov-97

AdmissionTable
Adm# | Admit Service | Admit Diagnosis | Principal Discharge Diagnosis
001 | Med | Acute MI | Acute MI
002 | Med | COPD | Pneumonia
003 | Surg | THR | THR
004 | Med | Acute MI | Acute MI
005 | Gyn | Menorrhagia | von Willebrand's

Secondary Discharge Diagnosis Table
Admission # | Secondary Discharge Diagnoses
001 | COPD
001 | Diabetes
002 | COPD
003 | Acute MI
004 | VF Arrest
005 | Diabetes
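The relational tables above can be queried directly, which is the point of storing the chart in structured form. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database; the table and column names (admission_number, admission, secondary_dx) are illustrative choices, not names from any real EMR, and the dates are rewritten in ISO form for convenience. It finds admissions whose principal discharge diagnosis is acute MI in patients with a secondary diagnosis of diabetes.

```python
import sqlite3

# In-memory database holding three of the slide's tables (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE admission_number (adm TEXT PRIMARY KEY, patient_id TEXT,
                               admit_date TEXT, discharge_date TEXT);
CREATE TABLE admission (adm TEXT PRIMARY KEY, service TEXT,
                        admit_dx TEXT, principal_dx TEXT);
CREATE TABLE secondary_dx (adm TEXT, dx TEXT);
""")
conn.executemany("INSERT INTO admission_number VALUES (?, ?, ?, ?)", [
    ("001", "000-01-001", "1994-12-31", "1995-01-12"),
    ("002", "000-01-001", "1996-03-27", "1996-03-31"),
    ("003", "000-01-002", "1995-02-03", "1995-02-16"),
    ("004", "000-01-002", "1995-02-27", "1995-03-20"),
    ("005", "000-01-003", "1997-11-19", "1997-11-23"),
])
conn.executemany("INSERT INTO admission VALUES (?, ?, ?, ?)", [
    ("001", "Med", "Acute MI", "Acute MI"),
    ("002", "Med", "COPD", "Pneumonia"),
    ("003", "Surg", "THR", "THR"),
    ("004", "Med", "Acute MI", "Acute MI"),
    ("005", "Gyn", "Menorrhagia", "von Willebrand's"),
])
conn.executemany("INSERT INTO secondary_dx VALUES (?, ?)", [
    ("001", "COPD"), ("001", "Diabetes"), ("002", "COPD"),
    ("003", "Acute MI"), ("004", "VF Arrest"), ("005", "Diabetes"),
])


def diabetics_with_ami(connection):
    """Return (patient_id, adm) for acute MI admissions with diabetes coded."""
    return connection.execute("""
        SELECT n.patient_id, a.adm
        FROM admission AS a
        JOIN admission_number AS n ON n.adm = a.adm
        JOIN secondary_dx AS s ON s.adm = a.adm
        WHERE a.principal_dx = 'Acute MI' AND s.dx = 'Diabetes'
        ORDER BY a.adm
    """).fetchall()


print(diabetics_with_ami(conn))
```

On the slide's sample rows only admission 001 qualifies: admission 004 is an acute MI without a diabetes code, and admission 005 has diabetes but a different principal diagnosis. Note that the query depends on exact string matches, which is why the vocabulary question matters.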
What Goes Into the Table Cells?
• If the entire chart were stored in relational tables, all the chart information (including HPI) would be in the cells
• Free vs. coded entries
  – “Mrs. Jones suffered an anterior non-Q wave MI” vs.
  – MI: Yes, Location: Anterior, Type: Non-Q
• Structure and coding are essential for making the EMR more machine interpretable
  – free text entries in structured fields better than plain flat file
  – even better to code entries into standardized terms
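The difference between free and coded entry shows up as soon as a program tries to retrieve cases. The sketch below is only an illustration: the first note and the coded fields come from the slide, while the other two notes, the MIEntry class, and both search functions are hypothetical. It contrasts a naive substring search over free text with a lookup on a coded field.

```python
from dataclasses import dataclass


@dataclass
class MIEntry:
    """Coded form of the slide's example: MI: Yes, Location: Anterior, Type: Non-Q."""
    mi: bool
    location: str
    mi_type: str


# First note is from the slide; the other two are invented for illustration.
free_text_notes = [
    "Mrs. Jones suffered an anterior non-Q wave MI",
    "Pt admitted with myocardial infarction, anterior wall",
    "No evidence of MI on this admission",
]

# The same three encounters as coded entries (hypothetical values).
coded_entries = [
    MIEntry(mi=True, location="Anterior", mi_type="Non-Q"),
    MIEntry(mi=True, location="Anterior", mi_type="Unspecified"),
    MIEntry(mi=False, location="", mi_type=""),
]


def naive_text_search(notes):
    """Return notes containing the literal string 'MI'."""
    return [note for note in notes if "MI" in note]


def coded_search(entries):
    """Return entries whose coded MI field is true."""
    return [entry for entry in entries if entry.mi]


print(len(naive_text_search(free_text_notes)))  # matches notes 1 and 3
print(len(coded_search(coded_entries)))         # matches entries 1 and 2
```

Both searches return two records, but the text search misses the note that spells out "myocardial infarction" and wrongly retrieves the negated note, whereas the coded search returns exactly the two true cases.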
• A term is a designation of a concept or an object in a specific vocabulary
  – e.g., English blood = German Blut
• Standardization required for communication
  – acts like a dictionary
    • DGIM tried to use STOR to pull out all CHF patients for quality improvement program but terms used were too varied
    • i.e., how to guarantee that all acute MI admissions will be retrieved if asked for?
• Vocabularies (collections of terms)
  – general standardized: ICD-9, CPT, MeSH
  – research-domain specific: CDEs for cancer, etc.
  – your own data dictionary
Standardization of Clinical Terms
Cost/Benefits of Coding
• The more coded and more structured your data, the more advanced computing you can do with that data
  – because the computer can “understand” more
• But coding and structuring costs time and effort
  – selecting billing codes for outpatient practice
  – structured templates for clinic notes may be too constraining
• Tradeoff between
  – costs of more coding and structuring, and
  – benefits to accrue from “smarter” computing
Notable Clinical Vocabularies
Vocabulary | Name | Domain | Use
SNOMED | Systematized Nomenclature of Human and Veterinary Medicine | Clinical Medicine | EMR Documentation
MeSH | Medical Subject Headings | Biomedical Indexing | Bibliographic Retrieval
ICD-9 | International Classification of Diseases | Diseases | Billing
CPT | Current Procedural Terminology | Medical Procedures | Billing
DSM-IV | Diagnostic and Statistical Manual of Mental Disorders | Psychiatry | Billing, Nosology
LOINC | Logical Observation Identifiers Names and Codes | Labs | Lab systems, Billing
Dangers of ICD-9 Coding
• VBAC uterine rupture rate
  – 665.0 and 665.1 ICD-9 discharge codes used in study (NEJM 2001;345:3-8)
  – letter to editor: in 9 years of Massachusetts data
    • 716 patients with 665.0 and 665.1 discharged
    • reviewed 709 charts
    • 363 (51.2%) had actual uterine rupture
    • others had incidental extensions of C-section incision, or were incorrectly coded or typed
    • 674.1 (dehiscence of the uterine wound) also used to code another 197 ruptures (or 35% of confirmed cases of uterine rupture)
• Administrative codes are not ideal for research
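The two percentages on this slide can be reproduced from the counts given. The short calculation below is a sketch under one assumption that the slide does not state outright: that the 35% figure uses the 363 ruptures confirmed under codes 665.0/665.1 plus the 197 ruptures coded 674.1 as its denominator. Variable names are illustrative.

```python
# Counts taken from the slide.
reviewed_charts = 709        # charts reviewed among patients coded 665.0 / 665.1
confirmed_from_665 = 363     # reviewed charts with an actual uterine rupture
ruptures_coded_674_1 = 197   # additional ruptures found under code 674.1

# Share of reviewed 665.x charts that were true ruptures.
positive_predictive_value = confirmed_from_665 / reviewed_charts

# Assumed denominator: all confirmed ruptures, regardless of code.
all_confirmed = confirmed_from_665 + ruptures_coded_674_1
fraction_missed_by_665 = ruptures_coded_674_1 / all_confirmed

print(f"True ruptures among 665.x charts: {positive_predictive_value:.1%}")
print(f"Confirmed ruptures coded 674.1 instead: {fraction_missed_by_665:.1%}")
```

Under that assumption the codes used in the study were right only about half the time and also missed roughly a third of the confirmed ruptures, so both false positives and false negatives are substantial.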
ICD-9 Concept Coverage
• How well would ICD-9 do in capturing a medical chart?
• Inpatient and outpatient charts from 4 medical centers abstracted into 3061 concepts [Chute, 96]
  – diagnoses, modifiers, findings, treatments and procedures, other
• Matching: 0=no match, 1=partial, 2=complete
  – 1.60 for diagnoses
  – 0.77 overall
  – ICD-9 augmented with CPT: overall 0.82
UMLS
• A meta-thesaurus of over 40 English and non-English vocabularies
  – SNOMED, MeSH, ICD-9, CPT, DSM, Read code, etc.
  – designates a UMLS preferred term
    • e.g., “Atrial Fibrillation” is preferred over
      – a fib, afib, or AF
      – auricular fibrillation, or ushka predserdiia fibrilliatsiia
• UMLS terms categorized into 55 semantic types
  – e.g., signs and symptoms, biologic function, chemicals, finding, pathologic function
• Also links concepts together
  – Atrial Fibrillation is-a Cardiovascular Disease
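The idea of a preferred term with synonyms and is-a links can be sketched with two small lookup tables. This is a hand-written illustration using only the examples from the slide; it is not the UMLS data format or any UMLS API, and the names PREFERRED_TERM, IS_A, normalize, and parent_concept are hypothetical.

```python
# Synonym -> preferred term, keyed in lowercase (entries from the slide's example).
PREFERRED_TERM = {
    "atrial fibrillation": "Atrial Fibrillation",
    "a fib": "Atrial Fibrillation",
    "afib": "Atrial Fibrillation",
    "af": "Atrial Fibrillation",
    "auricular fibrillation": "Atrial Fibrillation",
    "ushka predserdiia fibrilliatsiia": "Atrial Fibrillation",
}

# Preferred term -> broader concept (the slide's single is-a example).
IS_A = {"Atrial Fibrillation": "Cardiovascular Disease"}


def normalize(term):
    """Map a clinician's wording to the preferred term, or None if unknown."""
    return PREFERRED_TERM.get(term.strip().lower())


def parent_concept(term):
    """Return the broader concept for a term, or None if no link is recorded."""
    preferred = normalize(term)
    return IS_A.get(preferred) if preferred else None


for wording in ["AFib", "auricular fibrillation", "chest pain"]:
    print(wording, "->", normalize(wording), "->", parent_concept(wording))
```

With a mapping like this, a query for cardiovascular disease can retrieve records written as "AFib" or "auricular fibrillation", while an unmapped phrase such as "chest pain" returns nothing, which is the coverage problem discussed on the following slides.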
UMLS Semantic Coverage
• 1996 UMLS with ~30 vocabularies (Humphreys)
  – 32,679 normalized strings submitted (80% for EMR)
    • 58% exact concept found
    • 28% related to broader concept but modifications not found
    • 13% related concept found
    • 1% not found
  – semantic coverage varied from 45% to 71%
• SNOMED International and Read did the best
• Bottom line: current vocabularies cannot fully capture all the clinical concepts in medical charts
Research Data Dictionaries
• Research data dictionaries are lists of study variables and their definitions
• Standardization of data dictionaries facilitates data sharing, merging, and meta-analysis
• Terms in a data dictionary should ideally come from a standard clinical vocabulary
  – e.g., SOB? shortness of breath? breathlessness?
    • ICD-9: Dyspnea and other respiratory abnormalities (786.0)
    • CPT: no matching concept or term
    • UMLS: Dyspnea is preferred term
Notable Research Data Dictionaries
• Common Data Elements (from the NCI)
  – standardized study variables for breast, lung, cervical, prostate cancer
  – http://cii-server5.nci.nih.gov:8080/cde_browser/cde_java.show
• HCFA’s MedQuest modules
  – domain specific data dictionaries
    • a fib, CHF, diabetes, pneumonia, orthopedics, etc.
• Other domain specific ones?
  – prospective meta-analysis movement attempting to disseminate common data dictionaries
Common Data Elements Example
• Menopausal Status: “Indication of whether a woman is potentially fertile or not.”
• Allowed values:
  Post (Prior bilateral ovariectomy, OR >12 mo since LMP with no prior hysterectomy and not currently receiving therapy with LH-RH analogs [e.g., Zoladex])
  Post (Prior bilateral ovariectomy, OR >12 mo since LMP with no prior hysterectomy)
  Pre (<6 mo since LMP AND no prior bilateral ovariectomy, AND not on estrogen replacement)
  Above categories not applicable AND Age < 50
  Above categories not applicable AND Age >= 50
EMR for Research Summary
• An EMR is not automatically going to help clinical research
  – if it’s all unstructured free text, it won’t help much at all
    • the more structured it is (i.e., more defined fields), the better
  – if it’s just coded sporadically in ICD-9
    • problem with gamed codes
    • poor coverage of many clinical concepts
  – if it’s coded in SNOMED
    • some clinical concepts still not well covered
    • now nationally site licensed, but
• EMR better than chart review; can we do even better?
Types of Queries
• Clinical care
  – What was Mr. Smith’s last potassium?
  – Does he have an old CXR for comparison?
  – What antihypertensives has he been on before?
  – What did the neurology consult say about his epilepsy?
• Research
  – What % of diabetics with AMI admissions were discharged on β-blockers?
  – What was the average Medicine length of stay in 2000 compared to 1995?
  – What is the trend in use of head CTs in patients with migraine?
  – Is admission creatinine an independent predictor of bacteremia outcomes?
[Figure: a central Data Warehouse linked to MICU, Finance, Research, QA, and the Internet, and to the source systems ADT, Chem, EMR, XRay, PMB, and Claims]
• Integrated historical data common to entire enterprise
What is a Data Warehouse?
Types of Data Warehouses
• A data warehouse is just a collection of data from other databases
  – is itself just a database
• Two somewhat distinct types
  – clinical data repository
    • collects data from day-to-day clinical care, admin data, etc.
    • for quality improvement, outcomes research, business decision making…
  – research data repository
    • collects data from multiple research projects
    • may also collect data from day-to-day clinical care, admin data, etc.
Data Warehouses: Hype and Hope
• Touted for
  – business decision making
  – health care quality improvement
  – outcomes research
  – genotype-phenotype correlations for translational research
• Clinical and Genomic Information Management (CGIM) database
  – UCSF partnership with IBM
  – $4-6 million over 3 years
  – goal: a single repository of research data from all UCSF research projects, plus data from STOR, radiology, etc. (maybe CareCast)
  – to enable
    • analyses and data mining across data sets
    • correlation of clinical, genomic, imaging, etc. data
• Need many types of data for research and QI
• E.g., for our outcomes study, need
  – admission: ADT (admission/discharge/transfer) system
  – diabetes diagnosis: e-chart, HgbA1C
  – MI diagnosis: e-chart, troponins, EKG readings
  – β-blocker usage: online ordering, pharmacy system
• Existing (legacy) systems
  – claims, pharmacy, ADT, lab, xray, med record, etc.
  – Health System Minnesota with 50 computer systems, 50 paper systems (200,000 lives, 460 physicians)
Why are Data Warehouses Useful?
• Extract data from legacy systems
• Clean data and feed it to warehouse
• Allow ad hoc use
  – data query, data mining, data analysis
• Service users
  – modify data content based on queries
  – provide standard reports
  – provide alerts to trends
Data Warehousing Procedure
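The extract, clean, and load steps can be sketched in a few lines. Everything here is an assumption made for illustration: the two "legacy" sources, their differing date formats and diagnosis spellings, the DX_SYNONYMS table, and the use of a plain list as the warehouse. A production warehouse would load into a database and use a real vocabulary rather than a hand-written synonym table.

```python
from datetime import datetime

# Hypothetical synonym table used in the cleaning step.
DX_SYNONYMS = {
    "mi": "Acute MI",
    "ami": "Acute MI",
    "acute mi": "Acute MI",
    "myocardial infarction": "Acute MI",
}

# Two hypothetical legacy extracts with different conventions.
adt_rows = [
    {"adm": "001", "admit": "31-Dec-94", "dx": "MI"},
    {"adm": "004", "admit": "27-Feb-95", "dx": "myocardial infarction"},
]
billing_rows = [
    {"adm": "002", "admit": "1996-03-27", "dx": "COPD"},
]


def clean(row):
    """Parse the admit date and map the diagnosis to one agreed spelling."""
    # "%b" assumes English month abbreviations (C or English locale).
    for fmt in ("%d-%b-%y", "%Y-%m-%d"):
        try:
            admit = datetime.strptime(row["admit"], fmt).date()
            break
        except ValueError:
            continue
    else:
        raise ValueError(f"unrecognized date: {row['admit']!r}")
    raw_dx = row["dx"].strip()
    dx = DX_SYNONYMS.get(raw_dx.lower(), raw_dx)
    return {"adm": row["adm"], "admit": admit, "dx": dx}


def load(warehouse, sources):
    """Clean every row from every source and append it to the warehouse."""
    for rows in sources:
        for row in rows:
            warehouse.append(clean(row))
    return warehouse


warehouse = load([], [adt_rows, billing_rows])

# Ad hoc query: how many acute MI admissions are in the warehouse?
ami_count = sum(1 for record in warehouse if record["dx"] == "Acute MI")
print(ami_count)
```

The ad hoc count only finds both MI admissions because the cleaning step reconciled "MI" and "myocardial infarction" first; without it the same query would return zero.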
• Requires physical networking and transmission standards (protocols)
[Figure: a central Warehouse networked to MICU, Finance, Research, QA, and the Internet, and to the source systems ADT, Chem, EMR, XRay, PMB, and Claims]
Networking
Prerequisites for Large-Scale Medical Data Merging
• Health-specific network protocols needed
  – Health Level 7 (HL-7)
    • to provide standards for the exchange, management and integration of data that support clinical patient care and the management, delivery and evaluation of healthcare services
  – Digital Imaging and Communications in Medicine (DICOM)
    • common data exchange format for medical images
HL-7 Version 2.x Example
MSH|…message header
PID|…patient identifier
<!-OBR…observation request>
OBR|1|870930010^OE|CM3562^LAB|80004^ELECTROLYTES|R|198703281530|198703290800|||401-0^INTERN^JOE^^^^MD^L|N|||||SER|^SMITH^RICHARD^W.^^^DR.|(319)377-4400|This is requestor field #1. Requestor field #2|Diag.serv.field #1.|Diag.serv.field #2.|198703311400|||F<CR>
<!-OBX…observation result>
OBX|1|ST|84295^NA||150|mmol/l|136-148|H||A|F|19850301<CR>
OBX|2|ST|84132^K+||4.5|mmol/l|3.5-5|N||N|F|19850301<CR>
OBX|3|ST|82435^CL||102|mmol/l|94-105|N||N|F|19850301<CR>
OBX|4|ST|82374^CO2||27|mmol/l|24-31|N||N|F|19850301<CR>
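To make the pipe-delimited layout concrete, the sketch below splits the four OBX result segments from the example into named fields. It is a minimal illustration using plain string operations, not a full HL-7 parser: it ignores the MSH, PID, and OBR segments, escape sequences, and repeating fields, and the function and dictionary key names are my own.

```python
# The four OBX segments from the slide, joined by carriage returns as in HL-7 v2.
message = "\r".join([
    "OBX|1|ST|84295^NA||150|mmol/l|136-148|H||A|F|19850301",
    "OBX|2|ST|84132^K+||4.5|mmol/l|3.5-5|N||N|F|19850301",
    "OBX|3|ST|82435^CL||102|mmol/l|94-105|N||N|F|19850301",
    "OBX|4|ST|82374^CO2||27|mmol/l|24-31|N||N|F|19850301",
])


def parse_obx(segment):
    """Split one OBX segment on '|' and pick out the commonly used fields."""
    fields = segment.split("|")
    # Field 3 holds "code^name"; '^' separates components within a field.
    code, _, name = fields[3].partition("^")
    return {
        "set_id": int(fields[1]),
        "code": code,
        "name": name,
        "value": float(fields[5]),
        "units": fields[6],
        "reference_range": fields[7],
        "abnormal_flag": fields[8],
    }


def parse_results(hl7_message):
    """Return parsed OBX segments from a carriage-return separated message."""
    return [parse_obx(seg) for seg in hl7_message.split("\r")
            if seg.startswith("OBX|")]


results = parse_results(message)
abnormal = [r["name"] for r in results if r["abnormal_flag"] != "N"]
print(abnormal)
```

In this message only the sodium result (150 mmol/l against a range of 136-148) carries the high flag "H", so it is the only name in the abnormal list.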
• Common data schema
  – type (e.g., relational)
  – data modeling (i.e., column names)
• Common naming of data items
  – e.g., “MI” vs. “myocardial infarction”
• For online data sharing and merging
  – a physical connection between the computers
  – common data transmission protocols
    • e.g., HL-7
  – common database communication protocol
    • e.g., SQL over TCP/IP (the Internet protocol suite)
Prerequisites for Data Warehouse Construction
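[Figure: the same network diagram with the central repository labeled "???", linked to MICU, Finance, Research, QA, and the Internet, and to the source systems ADT, Chem, EMR, XRay, PMB, and Claims]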
Data Warehouse Contents
Should Warehouse Schema = EMR Schema?

MasterTable
ID | Name | Sex | Birthdate | Insurance
000-01-001 | Lee | M | 09-Jul-00 | B/T Healthnet
000-01-002 | Smith | F | 22-Oct-25 | Medicare
000-01-003 | Perez | F | 13-Jun-57 | B/T Pacificare

AdmissionNumberTable
Adm# | ID | Admit Date | Discharge Date
001 | 000-01-001 | 31-Dec-94 | 12-Jan-95
002 | 000-01-001 | 27-Mar-96 | 31-Mar-96
003 | 000-01-002 | 03-Feb-95 | 16-Feb-95
004 | 000-01-002 | 27-Feb-95 | 20-Mar-95
005 | 000-01-003 | 19-Nov-97 | 23-Nov-97

AdmissionTable
Adm# | Admit Service | Admit Diagnosis | Principal Discharge Diagnosis
001 | Med | Acute MI | Acute MI
002 | Med | COPD | Pneumonia
003 | Surg | THR | THR
004 | Med | Acute MI | Acute MI
005 | Gyn | Menorrhagia | von Willebrand's

Secondary Discharge Diagnosis Table
Admission # | Secondary Discharge Diagnoses
001 | COPD
001 | Diabetes
002 | COPD
003 | Acute MI
004 | VF Arrest
005 | Diabetes
Clinical Data Warehouse Schema

Discharge Diagnoses
Discharge Diagnosis | Admission # | LOS | Service | Team | Attending
Acute MI | 001 | 13 | Med | II | Red
Acute MI | 004 | 22 | Med | I | Blue
THR | 003 | 14 | Surg | III | Bronze
COPD | 002 | 5 | Med | II | White
Metrorrhagia | 005 | 4 | Gyn | A | Buff

Discharge Meds for AMI Admissions Table
Admission # | Aspirin on D/C | Beta-Blocker on D/C | Statin on D/C | ACE Inhibitor on D/C
001 | ASA 325 mg QD | Atenolol 50 mg QD | Simvastatin 20 mg QD | Lisinopril 10 mg QD
004 | ECA 81 mg QD | Metoprolol 100 mg BID | Atorvastatin 40 mg QD | Ramipril 5 mg QD

• diagnoses would be ICD-9 codes
• perhaps a separate table for admission diagnoses?
Research Data Warehouse Schema
• Should depend on anticipated queries
• UCSF in midst of trying to understand this
  – are queries mostly within a project? across projects?
  – for analysis of ongoing projects? or analysis across completed projects? both?
  – anonymized participant data?
  – what about participants from other study sites?
  – does administrative data (insurance) need to be there too?
• Scientific issues have huge implications for design (and eventual worth) of research warehouse
  – if you don’t know what you want, no technology will give it to you
Other CGIM Issues
• Does CGIM help project databases? data acquisition?
• How would CGIM benefit single project?
• Standard coding vocabulary? standard data representation?
• Standard definition of clinical variables?
[Figure: CGIM repository drawing on four project databases (Project 1/DB 1 through Project 4/DB 4), Radiology, and STOR, with data mining/display tools. Variables shown on the project databases: Breast CA (not DCIS), Menopause; Osteoporosis (Heel US), Menopause; Osteoporosis (DXA), Menopause; Breast CA (DCIS ok), Alzheimers (path). Also shown: microarray A and microarray B, with labels "in SNOMED-CT" and "in MAGE-ML".]
Choosing a Vocabulary
• For an EMR
  – billing: ICD-9, CPT
  – clinical data capture: SNOMED-CT best
    • under US national site license (i.e., free for all)
    • hard to get docs to choose correct code out of 325,000 terms
  – research: any is better than none!
• For your own research databases
  – if standard domain-specific data dictionary exists, use it
  – if not, use a standard clinical vocabulary
    • often ICD-9 or CPT, or SNOMED, or UMLS preferred terms
  – try not to define your own terms and your own definitions
    • upfront work will make it easier to share data later…
Data Warehouse Summary
• Enterprise viewpoint more appropriate for research than patient viewpoint of EMR
• Integrates data from multiple sources
  – need standardization of codes, definitions, and data formats
• Querying and processing occurs “offline”
  – little impact on real-time clinical care
• Schema can evolve to optimize for analytic needs
  – can make or modify tables off of legacy systems

System | Viewpoint | Time | Queries
EMR | Patient | Real-Time | Clinical
Data Warehouse | Enterprise | Historical | Ad Hoc
• Compare 1-year re-admission rate for acute MI in diabetics discharged on β-blockers or not
  – data captured in EMR and other databases
  – data aggregated in data warehouse
  – you query the data warehouse: NOT YET…
Study Steps Using EMR
Privacy vs. Security
• Security (a technical feature)
  – confidentiality
    • ensuring that only authorized persons can read or copy information
    • encryption of data during transmission impedes eavesdropping only
  – integrity
    • ensuring that information is modified only in appropriate ways
  – availability
    • ensuring that information is not made inaccessible
• Privacy (a legal concept)
  – right to keep personal information from outside world
    • study nurse, data entry clerk, investigator, database administrator, etc. may be authorized to see data but may disclose it inappropriately
• Physical security
  – firewalls
• Encryption
  – public/private keys
• People security
  – authority
  – authentication
  – access
  – audit
[Figure: the Internet separated by a Firewall from the LAN (ucsf.edu; hosts itsa and jaundice)]
Network Security
• Authentication
  – are you who you say you are?
    • use passwords, biometrics (e.g., retinal scan), smartcards
• Authority
  – do you have a need to know?
    • different levels of data access for different users
• Access
  – how to allow only authenticated users to perform authorized activities on authorized data?
• Audit
  – record of who actually got into what
People Security
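The four elements above (authentication, authority, access, audit) can be sketched as one small gatekeeper function. All user names, roles, passwords, and permission names below are hypothetical, and the fixed salt is only for the demonstration; a real system would use per-user salts, a vetted identity service, and tamper-resistant audit storage.

```python
import hashlib
import hmac
from datetime import datetime, timezone

DEMO_SALT = b"demo-only-salt"  # fixed salt for illustration only


def hash_password(password):
    """Derive a password hash with PBKDF2 (standard library)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), DEMO_SALT, 100_000)


# Hypothetical users and role-based permissions.
USERS = {
    "nurse1": {"pw_hash": hash_password("correct horse"), "role": "study_nurse"},
    "clerk1": {"pw_hash": hash_password("battery staple"), "role": "data_entry"},
}
PERMISSIONS = {
    "study_nurse": {"read_labs", "read_notes"},
    "data_entry": {"read_labs"},
}

audit_log = []


def authenticate(user, password):
    """Authentication: are you who you say you are?"""
    record = USERS.get(user)
    if record is None:
        return False
    return hmac.compare_digest(record["pw_hash"], hash_password(password))


def authorized(user, action):
    """Authority: does this user's role include the requested action?"""
    role = USERS[user]["role"]
    return action in PERMISSIONS.get(role, set())


def access(user, password, action):
    """Access: allow only authenticated users to do authorized things."""
    allowed = authenticate(user, password) and authorized(user, action)
    # Audit: record every attempt, whether or not it succeeded.
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "allowed": allowed,
    })
    return allowed


print(access("nurse1", "correct horse", "read_notes"))
print(access("clerk1", "battery staple", "read_notes"))
print(len(audit_log))
```

The study nurse is allowed to read notes and the data entry clerk is not, but both attempts land in the audit log. Logging denied attempts as well as granted ones is a deliberate choice here, since the audit trail is what lets someone review who tried to get into what.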
De-identification Isn’t Easy
• 87% of the American populace can be uniquely identified by only [Sweeney, L. ‘97]
  – date of birth
    • in a room of 23 people, what is the chance that 2 people will share the same birthday (independent of year of birth)?
    • http://www.people.virginia.edu/~rjh9u/birthday.html
  – gender
  – five-digit ZIP code
  – easy to find someone’s info if you’re looking for it; harder to find out whose info it is that you have
• Anonymizing databases does not remove your duty to enforce security and safeguard privacy
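The birthday question has a closed-form answer, and the same counting idea gives a quick check of how identifying a combination of fields is. The sketch below assumes 365 equally likely birthdays (no leap years) and uses a tiny invented set of (date of birth, gender, ZIP) records; both function names and the sample records are hypothetical.

```python
from collections import Counter


def shared_birthday_probability(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        # The i-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def fraction_unique(records):
    """Fraction of records whose field combination appears exactly once."""
    counts = Counter(records)
    unique = sum(1 for record in records if counts[record] == 1)
    return unique / len(records)


print(f"{shared_birthday_probability(23):.3f}")

# Invented (date of birth, gender, five-digit ZIP) tuples for illustration.
sample = [
    ("1957-06-13", "F", "94143"),
    ("1957-06-13", "F", "94143"),
    ("1925-10-22", "F", "94110"),
    ("1960-01-05", "M", "94110"),
]
print(fraction_unique(sample))
```

With 23 people the chance of a shared birthday is just over one half, even though day and month alone leave 365 possibilities. Adding year of birth, gender, and ZIP code multiplies the number of possible combinations so sharply that most records become unique, which is why those three fields together are so identifying.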
Summary of Privacy & Security
• Computing/network infrastructure can deal with security
  – but privacy is a policy matter
• Anonymizing of databases helps but it isn’t foolproof
• In general, people are the weakest security and privacy link
• Compare 1-year re-admission rate for acute MI in diabetics discharged on β-blockers or not
  – data captured in EMR and other databases
  – data aggregated in data warehouse
  – you request IRB approval
  – you are authorized to conduct HIPAA-compliant search (e.g., Limited Data Set) in data warehouse
  – audit trail of queries is maintained
Outcomes Research Project
• EMR does not always = easier clinical research
• Structure and coding are critical
  – structure: schema needed, designed to support intended queries
  – coding: standardized, coded data trumps free text
    • especially important for research
    • but most standardized vocabularies have insufficient clinical coverage
  – data formats: standard needed for genomic, imaging, etc. data
• Clinical/Research data warehouses could be useful for research but must be designed correctly with high-quality, cross-compatible data
Take-Home Points