NetBioSIG2012 annabauermehren

Network analysis of unstructured EHR data for comparative effectiveness research. Anna Bauer-Mehren, ShahLab, Stanford Center for Biomedical Informatics

description

Mining of electronic health records (EHRs) has recently gained importance. However, most efforts are restricted to analyzing drugs, diseases and their associations. In biomedical research, network analysis has provided the conceptual framework to interpret protein-protein interactions or gene-disease association networks via large-scale network maps. We analyze associations between drugs, diseases, devices and procedures mined from EHRs using network analysis to extract hidden “modules of care” for hypothesis generation. In particular, we annotated the textual notes of the EHRs of one million patients in the Stanford Clinical Data Warehouse with disease, drug, procedure and device terms using ontologies such as SNOMED-CT or RxNorm. We then used standard co-occurrence statistics to establish associations between these clinical concepts and to construct networks. Hidden modules of care (clusters of diseases, drugs, procedures and devices) that are useful for hypothesis generation are extracted through network analysis approaches and visualized using Cytoscape. We present a study of the comparative effectiveness of Cilostazol vs. a control group in peripheral artery disease (PAD) patients (see Figure 1) and compare the results derived from the network analysis against standard methods such as regression analysis. We believe that network analysis allows us to uncover hidden (“latent”) modules of care that are not detected by standard approaches, which do not account for the connectivity of the clinical events and entities.

Transcript of NetBioSIG2012 annabauermehren

Page 1: NetBioSIG2012 annabauermehren

Network analysis of unstructured EHR data for comparative effectiveness research

Anna Bauer-Mehren, ShahLab, Stanford Center for Biomedical Informatics

Page 2: NetBioSIG2012 annabauermehren

Background - PAD

7/17/12 NetBio SIG, 2012 2

•  Peripheral artery disease (PAD): obstruction of infra-renal abdominal aorta and lower extremity arteries

•  Symptoms: Intermittent claudication

•  Risk factors: Smoking, Diabetes, Obesity, Hypertension, Age

•  Treatments: Lifestyle changes, Surgical and endovascular interventions, Pharmacotherapy

→ Cilostazol

Page 3: NetBioSIG2012 annabauermehren

Background - Cilostazol

•  Reversible selective inhibitor of phosphodiesterase (PDE) type III

•  Long-term oral milrinone (another PDE III inhibitor) therapy associated with life-threatening cardiovascular events in congestive heart failure (CHF) patients

Black box warning: Cilostazol is contraindicated in patients with CHF

Hypothesis: Cilostazol is not associated with increased risk of major adverse cardiovascular events (MACE)

Page 4: NetBioSIG2012 annabauermehren

STRIDE

•  1.6 million patients

•  15 million visits

•  25 million coded ICD9 diagnoses

•  9.5 million clinical free text notes
   •  pathology
   •  radiology
   •  transcription reports

NCBO

•  We create and maintain a library of more than 300 biomedical ontologies

•  We build tools and web services to enable the use of ontologies

•  We use ontologies for annotating EHRs to allow clinically relevant studies

Page 5: NetBioSIG2012 annabauermehren

Networks for comparative effectiveness

•  Networks have been successfully applied in biology

•  Can we use networks for clinical studies?

•  We want to use networks for
   •  Cohort building
   •  Analysis of clinical outcome

Roque et al, PLoS Comput Biol. 2011; 7(8): e1002141.

Page 6: NetBioSIG2012 annabauermehren

From clinical notes to patient feature matrix

[Figure: annotation workflow. Clean lexicons of Drugs, Diseases, Devices and Procedures are created from the BioPortal knowledge graph. Each clinical text note is passed through the NCBO Annotator (term recognition tool) to obtain the recognized terms, then through NegEx patterns for negation detection and family history (FH) detection. The result is a patient feature matrix over Term-1 ... Term-n, which is used to select the cohort of interest and for further analysis.]
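The last step of the workflow above, going from annotated notes to a binary patient feature matrix, can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the record layout `(patient_id, term, negated, family_history)` and the function name `build_feature_matrix` are assumptions made for the example, and the annotations are made up.

```python
from collections import defaultdict

# Hypothetical annotator output: (patient_id, term, negated, family_history).
annotations = [
    ("p1", "peripheral artery disease", False, False),
    ("p1", "cilostazol", False, False),
    ("p1", "heart failure", True, False),   # negated mention, e.g. "no heart failure"
    ("p2", "peripheral artery disease", False, False),
    ("p2", "stroke", False, True),          # family-history mention
    ("p2", "bypass", False, False),
]

def build_feature_matrix(records):
    """Return (patients, terms, matrix); matrix[i][j] is 1 if patient i has a
    mention of term j that is neither negated nor family history."""
    positive = defaultdict(set)
    for patient, term, negated, family_history in records:
        mentions = positive[patient]  # creates a row even if all mentions are dropped
        if not negated and not family_history:
            mentions.add(term)
    patients = sorted(positive)
    terms = sorted({t for ts in positive.values() for t in ts})
    matrix = [[1 if t in positive[p] else 0 for t in terms] for p in patients]
    return patients, terms, matrix

patients, terms, matrix = build_feature_matrix(annotations)
for patient, row in zip(patients, matrix):
    print(patient, row)
```

Dropping negated and family-history mentions before building the matrix keeps a row entry at 1 only when the note asserts the concept for the patient.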

Page 7: NetBioSIG2012 annabauermehren

Dimension reduction using ontologies

Page 8: NetBioSIG2012 annabauermehren

Cohort building – restrict by follow up time

5757 PAD patients

[Figure: histogram "Follow up time in peripheral artery disease patients". x-axis: follow up time in 30 day intervals; y-axis: Frequency; the value 365 is marked.]

[Figure: patient timeline along t. Follow up time is the interval from PAD (t_PAD) to the last note (t_last).]
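The follow-up restriction above can be sketched in a few lines. The example assumes that follow-up time is t_last minus t_PAD in days and that the 365 marked in the histogram is a minimum follow-up threshold; that threshold, the dates and the names `follow_up_days` and `restrict_cohort` are assumptions for illustration.

```python
from datetime import date

MIN_FOLLOW_UP_DAYS = 365  # assumed reading of the 365 mark in the histogram

# Hypothetical timelines: patient -> (t_pad, t_last).
timeline = {
    "p1": (date(2008, 1, 10), date(2010, 3, 1)),
    "p2": (date(2011, 5, 1), date(2011, 9, 15)),
    "p3": (date(2009, 6, 1), date(2010, 6, 1)),
}

def follow_up_days(t_pad, t_last):
    """Days between the first PAD mention and the last note."""
    return (t_last - t_pad).days

def restrict_cohort(timeline, min_days=MIN_FOLLOW_UP_DAYS):
    """Keep patients with at least min_days of follow-up."""
    return sorted(
        pid for pid, (t_pad, t_last) in timeline.items()
        if follow_up_days(t_pad, t_last) >= min_days
    )

# 30-day bins, as on the histogram's x-axis.
bins = {pid: follow_up_days(*span) // 30 for pid, span in timeline.items()}
print(restrict_cohort(timeline), bins)
```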

Page 9: NetBioSIG2012 annabauermehren

Cohort building – set time

[Figure: two patient timelines along t. Cilostazol patient: PAD (t_PAD), then CIL (t_CIL = t_0), then last note (t_last). Other PAD patient: PAD (t_PAD), then t_0 placed 22 days later (Median = 22 days), then last note (t_last). The period before t_0 is labeled Matching and the period after t_0 is labeled Outcome.]
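One way to read the timelines above is that cilostazol patients get t_0 at their first cilostazol mention, while the other PAD patients get t_0 at t_PAD plus the median PAD-to-cilostazol lag. The sketch below implements that reading. It is an interpretation of the figure, and the record layout, dates and function name `assign_time_zero` are hypothetical.

```python
from datetime import date, timedelta
from statistics import median

# Hypothetical records; t_cil is None for PAD patients without cilostazol.
patients = {
    "c1": {"t_pad": date(2009, 1, 1), "t_cil": date(2009, 1, 11)},
    "c2": {"t_pad": date(2009, 3, 1), "t_cil": date(2009, 3, 23)},
    "c3": {"t_pad": date(2010, 5, 1), "t_cil": date(2010, 6, 30)},
    "o1": {"t_pad": date(2009, 7, 1), "t_cil": None},
}

def assign_time_zero(patients):
    """Return (median_lag_days, {patient_id: t0})."""
    lags = [
        (rec["t_cil"] - rec["t_pad"]).days
        for rec in patients.values()
        if rec["t_cil"] is not None
    ]
    median_lag = median(lags)
    t0 = {}
    for pid, rec in patients.items():
        if rec["t_cil"] is not None:
            t0[pid] = rec["t_cil"]  # treated: t0 is the first cilostazol mention
        else:
            t0[pid] = rec["t_pad"] + timedelta(days=median_lag)  # untreated: shifted t_PAD
    return median_lag, t0

median_lag, t0 = assign_time_zero(patients)
print(median_lag, t0["o1"])
```

Giving untreated patients a comparable t_0 lets the same "before" window be used for matching and the same "after" window for outcomes in both groups.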

Page 10: NetBioSIG2012 annabauermehren

Matching

1.  Choose variables for matching

2.  Compute PS based on variables (logistic regression)

3.  Nearest neighbor matching (1:1)

Patient-patient similarity network

[Figure: a binary patients × concepts matrix (Drug classes, Diseases, Devices, Procedures, Demographics) is converted into a patients × patients distance matrix, followed by nearest neighbor matching (1:1) into Cilostazol and Control groups. Labels in the figure: 5776 patients, 1159 concepts, 446 patients after matching.]

\[ J(A,B) = 1 - \frac{|A \cap B|}{|A \cup B|} \]

Propensity-score matching

[Figure: a binary patients × concepts matrix is reduced to the matched patients. Labels in the figure: 5776 patients, 1159 and 17 concepts, 446 patients after matching.]
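The similarity-based matching above can be illustrated with the Jaccard distance from the slide and a simple greedy 1:1 nearest neighbor search. The slide does not say how ties or the matching order were handled, so the greedy strategy, matching without replacement and the toy concept sets below are assumptions.

```python
def jaccard_distance(a, b):
    """J(A, B) = 1 - |A ∩ B| / |A ∪ B| for two sets of concepts."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty concept sets are treated as identical
    return 1.0 - len(a & b) / len(union)

def match_nearest(treated, controls):
    """Greedy 1:1 nearest neighbor matching without replacement."""
    available = dict(controls)
    pairs = {}
    for pid, concepts in treated.items():
        if not available:
            break
        # Closest remaining control; the id breaks ties deterministically.
        best = min(
            available,
            key=lambda cid: (jaccard_distance(concepts, available[cid]), cid),
        )
        pairs[pid] = best
        del available[best]
    return pairs

# Hypothetical concept sets per patient.
treated = {
    "t1": {"diabetes", "beta blocker", "bypass"},
    "t2": {"hypertension", "statin"},
}
controls = {
    "c1": {"diabetes", "beta blocker"},
    "c2": {"hypertension", "statin", "obesity"},
    "c3": {"pneumonia"},
}
print(match_nearest(treated, controls))
```

Greedy matching depends on the order in which treated patients are processed; an optimal assignment method would remove that dependence at higher cost.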

Page 11: NetBioSIG2012 annabauermehren

Matching - comparison

| Variable | Treatment (n = 223) | Before matching: Control (n = 5534) | p-value | Patient-patient similarity network: Control (n = 223) | p-value | Propensity score matching: Control (n = 223) | p-value |
|---|---|---|---|---|---|---|---|
| Demographics | | | | | | | |
| Age (at indication onset), mean (sd) | 71.22 (11.02) | 70.43 (12.46) | 0.295 | 72.05 (10.62) | 0.41196 | 70.87 (11.51) | 0.74704 |
| Gender (female), n (%) | 37.22 | 45.94 | 0.0090723 | 36.65 | 0.84432 | 35.87 | 0.76777 |
| Race, (%) | | | | | | | |
| Asian | 8.52 | 7.41 | 0.56047 | 6.33 | 0.36671 | 10.31 | 0.50525 |
| Black | 2.69 | 3.71 | 0.36423 | 3.61 | 0.588 | 0.90 | 0.15684 |
| Unknown | 22.87 | 26.17 | 0.25365 | 22.17 | 0.82063 | 20.63 | 0.57404 |
| White | 65.47 | 62.22 | 0.31854 | 67.87 | 0.54642 | 67.27 | 0.68645 |
| Comorbidities | | | | | | | |
| Coronary artery disease, n (%) | 5.38 | 6.47 | 0.48345 | 4.98 | 0.83089 | 6.28 | 0.69516 |
| Congestive heart failure, n (%) | 25.56 | 22.84 | 0.36255 | 20.36 | 0.21361 | 30.49 | 0.36255 |
| Hypertension, n (%) | 10.76 | 11.31 | 0.79597 | 9.50 | 0.75135 | 10.31 | 0.87893 |
| Co-prescriptions | | | | | | | |
| Beta blocking agents, n (%) | 75.34 | 60.77 | 1.6611e-06 | 69.68 | 0.20252 | 74.89 | 0.90693 |
| ACE inhibitors, plain, n (%) | 78.03 | 69.57 | 0.0032697 | 67.87 | 0.013595 | 78.92 | 0.81386 |
| Platelet aggregation inhibitors excl. heparin, n (%) | 91.93 | 79.00 | 8.1983e-11 | 89.59 | 0.41347 | 95.51 | 0.072929 |
| Vasodilators, n (%) | 32.29 | 26.36 | 0.064893 | 31.67 | 0.83902 | 37.22 | 0.28753 |
| History of | | | | | | | |
| Cardiac arrhythmia, n (%) | 32.29 | 32.17 | 0.96957 | 23.08 | 0.03383 | 33.18 | 0.84006 |
| Stroke, n (%) | 17.94 | 18.31 | 0.88877 | 15.84 | 0.61129 | 21.52 | 0.33903 |
| Myocardial infarction, n (%) | 17.94 | 15.87 | 0.43015 | 13.58 | 0.23919 | 19.73 | 0.63765 |
| Vascular surgical procedures, n (%) | 74.44 | 47.71 | < 2.22e-16 | 65.61 | 0.048937 | 74.44 | 1 |
| Bypass surgery, n (%) | 41.70 | 26.56 | 1.0506e-05 | 36.20 | 0.24269 | 40.36 | 0.75073 |

Page 12: NetBioSIG2012 annabauermehren

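Cohort building – set time

[Figure: two patient timelines along t, now with the step label Cohort matching. Cilostazol patient: PAD (t_PAD), then CIL (t_CIL = t_0), then last note (t_last). Other PAD patient: PAD (t_PAD), then t_0 placed 22 days later (Median = 22 days), then last note (t_last). The period before t_0 is labeled Matching and the period after t_0 is labeled Outcome.]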

Page 13: NetBioSIG2012 annabauermehren

Outcome comparison

Concept-concept network

[Figure: the binary patient × concept matrices of the cilostazol and control groups are each turned into a concept × concept matrix of association scores (step label: Compute edges), giving one network per group. Node labels: zolpidem, pravastatin, nifedipine, congestive heart failure, insulin glargine, cilostazol, transplantation, trimethoprim, bypass, doppler studies, surgical revision, atherectomy, bypass graft, vascular surgical procedures, ultrasound imaging, testosterone, fentanyl, amoxicillin, heart failure, coronary angiography, pantoprazole, cephalexin, hydralazine, amiodarone, obesity, wheelchair, diazepam, pneumonia, tacrolimus, sulfamethoxazole, temazepam, decompressive incision, fluoroscopic angiography, heart transplantation, revascularization, diagnostic imaging, vascular diseases, ramipril, angioplasty, doppler ultrasonography, cane, carotid endarterectomy.]

\[ as_{cid1,cid2} = \log\left( \frac{cofreq_{cid1,cid2}}{freq1_{cid1} \cdot freq2_{cid2} / n} \right) \]

Disproportionality analysis

[Figure: logistic regression on the binary patient × concept matrix (coefficient matrix β), and a plot "Odds ratios − cilostazol vs. control" with the series similarity and psm.]
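The edge weights of the concept-concept network can be sketched from the association score above: the log of the observed co-occurrence count over the count expected if the two concepts were independent. The slide does not state the counting unit or the log base, so the example assumes one concept set per patient, n equal to the number of patients, and the natural logarithm. The function name `association_edges` and the data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def association_edges(patient_concepts):
    """Return {(c1, c2): log(cofreq / (freq_c1 * freq_c2 / n))} for every
    concept pair that co-occurs at least once."""
    n = len(patient_concepts)
    freq = Counter()
    cofreq = Counter()
    for concepts in patient_concepts:
        unique = sorted(set(concepts))  # sorted so each pair has one key order
        freq.update(unique)
        cofreq.update(combinations(unique, 2))
    edges = {}
    for (c1, c2), observed in cofreq.items():
        expected = freq[c1] * freq[c2] / n
        edges[(c1, c2)] = math.log(observed / expected)
    return edges

# Hypothetical group of four patients.
group = [
    ["cilostazol", "bypass"],
    ["cilostazol", "bypass"],
    ["cilostazol", "obesity"],
    ["obesity"],
]
edges = association_edges(group)
for pair, score in sorted(edges.items()):
    print(pair, round(score, 3))
```

Running the function once on the cilostazol group and once on the matched control group would give the two networks that the slide compares. Pairs that never co-occur are left out, which avoids taking the log of zero.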

Page 14: NetBioSIG2012 annabauermehren

Concept-concept network

[Figure: concept-concept network with the same drug, disease, device and procedure node labels as the network on page 13. Called out in the figure: Vascular surgical procedures, Heart transplantation, Heart failure.]

Page 15: NetBioSIG2012 annabauermehren

Disproportionality analysis

[Figure: plot "Odds ratios − cilostazol vs. control". x-axis from 0 to 5.5. Outcomes: Ventricular tachycardia, Ventricular fibrillation, Vascular surgical procedures, Stroke, Myocardial infarction, Electric countershock, Death, Coronary stents, Cardiac arrhythmia, Bypass surgery, Atrial fibrillation. Series: similarity, psm.]
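As a rough illustration of what each point in such a plot represents, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2×2 table. This is a simpler stand-in for the logistic regression named on the previous slide, not the method of the study, and the counts are made up rather than taken from the results.

```python
import math

def odds_ratio(treated_events, treated_total, control_events, control_total, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval on the log scale."""
    a = treated_events
    b = treated_total - treated_events
    c = control_events
    d = control_total - control_events
    if min(a, b, c, d) == 0:
        # Haldane-Anscombe correction so that zero cells do not break the ratio.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - z * se)
    upper = math.exp(math.log(ratio) + z * se)
    return ratio, lower, upper

# Hypothetical counts: 30 of 223 treated vs. 25 of 223 matched controls with the outcome.
ratio, lower, upper = odds_ratio(30, 223, 25, 223)
print(round(ratio, 2), round(lower, 2), round(upper, 2))
```

An interval that contains 1 indicates no significant difference between the groups for that outcome at the chosen level.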

Page 16: NetBioSIG2012 annabauermehren

Conclusions

•  Technical
   •  Patient-patient network based on similarity in annotations can be used for cohort building
   •  Concept-concept network based on co-occurrence in clinical notes can be used for analyzing clinical outcome

•  Medical
   •  No significant differences in MACE comparing cilostazol vs. control

[Figure: concept-concept network, repeated from page 14.]

Cilostazol is contraindicated in patients with CHF

Cilostazol is not associated with increased risk of major adverse cardiovascular events (MACE)

Page 17: NetBioSIG2012 annabauermehren

Future work

•  Technical
   •  Improve aggregation of concepts
      •  Might improve cohort building
      •  Useful for outcome analysis based on networks

•  Medical
   •  Repeat study in different hospital
   •  Profile PAD patients using all annotations using concept-concept networks for hypothesis generation

[Figure: two concept-concept networks. Node labels: sulfamethoxazole, carotid endarterectomy, coronary angiography, zolpidem, trimethoprim, transplantation, pantoprazole, insulin glargine, obesity, amiodarone, pravastatin, hydralazine, pneumonia, wheelchair, cilostazol, angioplasty, heart failure; congestive heart failure appears in the first network only.]

Page 18: NetBioSIG2012 annabauermehren

Acknowledgements

•  ShahLab
   •  Nigam Shah
   •  Paea LePendu
   •  Rave Harpaz
   •  Srinivasan Iyer
   •  Amogh Vasekar
   •  Kenneth Jung

•  Medical collaborator
   •  Nicholas Leeper
   •  John Cooke

•  NCBO Team
   •  Mark Musen

•  NIH funding

•  STRIDE Team
   •  Tanya Podchiyska
   •  Todd Ferris