Medical Data Mining


Page 1: Medical Data Mining

Medical Data Mining

Lars Juhl Jensen

Page 2: Medical Data Mining

unstructured data

Page 3: Medical Data Mining
Page 4: Medical Data Mining

structured data

Page 5: Medical Data Mining

Jensen et al., Nature Reviews Genetics, 2012

Page 6: Medical Data Mining

individual hospitals

Page 7: Medical Data Mining

central registries

Page 8: Medical Data Mining
Page 9: Medical Data Mining

opt-out

Page 10: Medical Data Mining

opt-in

Page 11: Medical Data Mining

Danish registries

Page 12: Medical Data Mining

civil registration system

Page 13: Medical Data Mining

CPR number

Page 14: Medical Data Mining

established in 1968

Page 15: Medical Data Mining

Jensen et al., Nature Reviews Genetics, 2012

Page 16: Medical Data Mining

national discharge registry

Page 17: Medical Data Mining

14 years

Page 18: Medical Data Mining

6.2 million patients

Page 19: Medical Data Mining

45 million admissions

Page 20: Medical Data Mining

68 million records

Page 21: Medical Data Mining

119 million diagnoses

Page 22: Medical Data Mining

ICD-10

Page 23: Medical Data Mining

Jensen et al., Nature Reviews Genetics, 2012

Page 24: Medical Data Mining

reimbursement

Page 25: Medical Data Mining

not research

Page 26: Medical Data Mining

diagnosis trajectories

Page 27: Medical Data Mining

naïve approach

Page 28: Medical Data Mining

comorbidity

Page 29: Medical Data Mining

Jensen et al., Nature Reviews Genetics, 2012
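
One way to read the "naïve approach" to comorbidity is to count how often two diagnosis codes co-occur in the same patient and compare that with what independence would predict. The sketch below is only an illustration under that assumption, not the method used in the cited work. The function name, the data layout (patient id mapped to a set of ICD-10 codes), and the toy patients are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def relative_risks(patients):
    """Observed/expected co-occurrence for every pair of codes.

    patients: dict mapping a patient id to the set of ICD-10 codes
    ever assigned to that patient (hypothetical layout).
    """
    n = len(patients)
    single = Counter()  # number of patients carrying each code
    pair = Counter()    # number of patients carrying both codes of a pair
    for codes in patients.values():
        single.update(codes)
        pair.update(combinations(sorted(codes), 2))
    rr = {}
    for (a, b), observed in pair.items():
        expected = single[a] * single[b] / n  # count expected under independence
        rr[(a, b)] = observed / expected
    return rr

# Toy data, invented for illustration only.
toy = {
    1: {"E11", "I10"},
    2: {"E11", "I10"},
    3: {"E11"},
    4: {"J45"},
}
rr = relative_risks(toy)
```

A ratio above 1 means the pair co-occurs more often than independence predicts. This estimate ignores gender, age, and type of hospital encounter, which is why confounding factors have to be addressed before any such pair is interpreted.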

Page 30: Medical Data Mining

confounding factors

Page 31: Medical Data Mining

“known knowns”

Page 32: Medical Data Mining

gender

Page 33: Medical Data Mining

age

Page 34: Medical Data Mining

type of hospital encounter

Page 35: Medical Data Mining

Jensen et al., submitted, 2014

Page 36: Medical Data Mining

“known unknowns”

Page 37: Medical Data Mining

smoking

Page 38: Medical Data Mining

diet

Page 39: Medical Data Mining

“unknown unknowns”

Page 40: Medical Data Mining

reporting biases

Page 41: Medical Data Mining

matched controls
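
A minimal sketch of matched controls, assuming that matching is done on the "known knowns" listed earlier (gender, age, type of hospital encounter). The record layout, the ten-year age bands, and the function names are assumptions made for this example and do not come from the slides.

```python
import random
from collections import defaultdict

def stratum(person):
    """Matching key: gender, ten-year age band, encounter type (assumed scheme)."""
    return (person["gender"], person["age"] // 10, person["encounter"])

def match_controls(cases, pool, per_case=2, seed=0):
    """Draw up to per_case controls from the same stratum as each case."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in pool:
        strata[stratum(person)].append(person)
    matched = {}
    for case in cases:
        candidates = [p for p in strata[stratum(case)] if p["id"] != case["id"]]
        matched[case["id"]] = rng.sample(candidates, min(per_case, len(candidates)))
    return matched

# Toy records, invented for illustration only.
cases = [{"id": 1, "gender": "F", "age": 63, "encounter": "inpatient"}]
pool = [
    {"id": 2, "gender": "F", "age": 67, "encounter": "inpatient"},
    {"id": 3, "gender": "F", "age": 61, "encounter": "inpatient"},
    {"id": 4, "gender": "M", "age": 65, "encounter": "inpatient"},
    {"id": 5, "gender": "F", "age": 64, "encounter": "outpatient"},
]
matched = match_controls(cases, pool)
```

Comparing a diagnosis rate in the cases with the rate in their matched controls removes the influence of the matching variables, but not of unrecorded factors such as smoking or diet.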

Page 42: Medical Data Mining

temporal correlation

Page 43: Medical Data Mining

Jensen et al., Nature Communications, 2014
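
Temporal correlation can be illustrated by asking whether one diagnosis tends to precede another in patients who have both. The sketch below counts the two orderings and evaluates the imbalance with a one-sided binomial tail. It is an assumption-laden illustration rather than the procedure from the cited paper. The data layout (code mapped to year of first diagnosis) and all names are hypothetical.

```python
from math import comb

def direction_counts(histories, a, b):
    """Count patients in whom a was first diagnosed before b, and vice versa.

    histories: list of dicts mapping an ICD-10 code to the time of its
    first diagnosis (any orderable value; years are used here).
    """
    a_first = b_first = 0
    for first_seen in histories:
        if a in first_seen and b in first_seen:
            if first_seen[a] < first_seen[b]:
                a_first += 1
            elif first_seen[b] < first_seen[a]:
                b_first += 1
    return a_first, b_first

def binomial_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Toy histories, invented for illustration only.
histories = [
    {"E11": 2001, "I10": 2004},
    {"E11": 2002, "I10": 2003},
    {"E11": 2005, "I10": 2009},
    {"E11": 2008, "I10": 2006},
    {"E11": 2003},
]
a_first, b_first = direction_counts(histories, "E11", "I10")
p_value = binomial_tail(a_first, a_first + b_first)
```

Pairs with a significant preferred direction can then be chained into longer trajectories, as long as each step is supported by enough patients.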

Page 44: Medical Data Mining

trajectories

Page 45: Medical Data Mining

Jensen et al., Nature Communications, 2014

Page 46: Medical Data Mining

trajectory networks

Page 47: Medical Data Mining

Jensen et al., Nature Communications, 2014

Page 48: Medical Data Mining

key diagnoses

Page 49: Medical Data Mining

Jensen et al., Nature Communications, 2014

Page 50: Medical Data Mining

direct medical implications

Page 51: Medical Data Mining

electronic health records

Page 52: Medical Data Mining

structured data

Page 53: Medical Data Mining

Jensen et al., Nature Reviews Genetics, 2012

Page 54: Medical Data Mining

unstructured data

Page 55: Medical Data Mining
Page 56: Medical Data Mining

free text

Page 57: Medical Data Mining

Danish

Page 58: Medical Data Mining

busy doctors

Page 59: Medical Data Mining

typos

Page 60: Medical Data Mining

psychiatric patients

Page 61: Medical Data Mining

delusions

Page 62: Medical Data Mining

heavily medicated

Page 63: Medical Data Mining

Eriksson et al., Drug Safety, 2014

Page 64: Medical Data Mining

text mining

Page 65: Medical Data Mining

dictionary-based method

Page 66: Medical Data Mining

diseases

Page 67: Medical Data Mining

drugs

Page 68: Medical Data Mining

adverse drug reactions

Page 69: Medical Data Mining

expansion rules

Page 70: Medical Data Mining

typos

Page 71: Medical Data Mining

“negative modifiers”

Page 72: Medical Data Mining

negations

Page 73: Medical Data Mining

delusions
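
The dictionary-based method with typo handling and negative modifiers could be sketched as follows. This is a simplified illustration in English, not the Danish pipeline described in the talk. The dictionary entries, the modifier list, the similarity cutoff, and the two-token window are all assumptions. Fuzzy matching with Python's difflib stands in here for the expansion rules mentioned on the slides.

```python
import re
from difflib import get_close_matches

# Hypothetical mini-dictionary: term -> entity type.
DICTIONARY = {"nausea": "ADR", "headache": "ADR", "diabetes": "disease"}
# Hypothetical negative modifiers that cancel a following mention.
NEGATIVE_MODIFIERS = {"no", "not", "denies", "without"}

def tag(text, cutoff=0.8, window=2):
    """Return (term, type, negated) for every dictionary hit in text."""
    tokens = re.findall(r"\w+", text.lower())
    terms = list(DICTIONARY)
    hits = []
    for i, token in enumerate(tokens):
        # Fuzzy lookup tolerates small typos such as transposed letters.
        match = get_close_matches(token, terms, n=1, cutoff=cutoff)
        if not match:
            continue
        # A mention is negated if a modifier occurs just before it.
        preceding = tokens[max(0, i - window):i]
        negated = any(t in NEGATIVE_MODIFIERS for t in preceding)
        hits.append((match[0], DICTIONARY[match[0]], negated))
    return hits

hits = tag("Patient denies nausea but reports headahce")
```

Here the misspelled "headahce" is still mapped to "headache", while "nausea" is flagged as negated and would be dropped. Mentions that occur only as part of a described delusion would need a comparable filter, which this sketch does not attempt.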

Page 74: Medical Data Mining

detailed disease profiles

Page 75: Medical Data Mining

Roque et al., PLOS Computational Biology, 2011

[Figure: comparison of assigned codes and text-mined codes]

Page 76: Medical Data Mining

pharmacovigilance

Page 77: Medical Data Mining

structured data

Page 78: Medical Data Mining

medication

Page 79: Medical Data Mining

semi-structured data

Page 80: Medical Data Mining

drug indications

Page 81: Medical Data Mining

known ADRs

Page 82: Medical Data Mining

unstructured data

Page 83: Medical Data Mining

adverse drug reactions

Page 84: Medical Data Mining

temporal correlation

Page 85: Medical Data Mining

Eriksson et al., Drug Safety, 2014

Page 86: Medical Data Mining

known ADRs

Page 87: Medical Data Mining

ADR frequencies

Page 88: Medical Data Mining

Eriksson et al., Drug Safety, 2014

Page 89: Medical Data Mining

new ADRs

Page 90: Medical Data Mining

Drug substance        ADE                   p-value
Chlordiazepoxide      Nystagmus             4.0e-8
Simvastatin           Personality changes   8.4e-8
Dipyridamole          Visual impairment     4.4e-4
Citalopram            Psychosis             8.8e-4
Bendroflumethiazide   Apoplexy              8.5e-3

Eriksson et al., Drug Safety, 2014

Page 91: Medical Data Mining

Acknowledgments

Disease trajectories
Anders Bøck Jensen, Tudor Oprea, Pope Moseley, Søren Brunak

Adverse drug reactions
Robert Eriksson, Thomas Werge, Søren Brunak

EHR text mining
Peter Bjødstrup Jensen, Robert Eriksson, Henriette Schmock, Francisco S. Roque, Anders Juul, Marlene Dalgaard, Massimo Andreatta, Sune Frankild, Eva Roitmann, Thomas Hansen, Karen Søeby, Søren Bredkjær, Thomas Werge, Søren Brunak

Page 92: Medical Data Mining