Using Machine Learning to Automate Clinical Pathways

47
Using Machine Learning to Automate Clinical Pathways David Sontag, PhD Department of Computer Science Courant Ins@tute of Mathema@cal Sciences NYU Joint work with my student Yoni Halpern (NYU) and Steven Horng (Beth Israel Deaconess Medical Center)

Transcript of Using Machine Learning to Automate Clinical Pathways

Using  Machine  Learning  to  Automate  Clinical  Pathways  

David  Sontag,  PhD  Department  of  Computer  Science  

Courant  Ins@tute  of  Mathema@cal  Sciences  NYU  

Joint  work  with  my  student  Yoni  Halpern  (NYU)  and  Steven  Horng  (Beth  Israel  Deaconess  Medical  Center)  

Health  Informa@on  Technology  is  Rapidly  Changing  

•  Aided  by  HITECH  Act,  hospital  adop@on  of  EHRs  has  increased  5-­‐fold  since  2008  

[Charles  et  al.,  ONC  Data  Brief,  May  2014]  

•  Over  $4  billion  of  investment  in  digital  health  startups  in  2014  

Health  Informa@on  Technology  is  Rapidly  Changing  

Analy@cs  /  Big  Data  

Healthcare  Consumer  Engagement  

[Wang  et  al.,  “Digital  health  funding  in  Q1  2015  over  $600M”,  Rock  Health,  April  2015]  

EHR  /  Clinical  Workflow  

Digital  Diagnos@cs  

Popula@on  Health  

Management  

Digital  Medical  Device  

[Weber  et  al.  (2014).  Finding  the  Missing  Link  for  Big  Biomedical  Data.  JAMA.]  

Wealth  of  digital  health  data  available  

Research  in  my  clinical  ML  lab  

•  Next-­‐genera*on  electronic  health  records  focus  of  today’s  talk  

•  Popula@on-­‐level  risk  stra@fica@on  

•  Beber  managing  pa@ents  with  chronic  disease  

clinicalml.org  

Emergency  Department:  

•  Limited  resources  •  Time  sensi*ve  •  Cri*cal  decisions  

Triage  Informa@on  (Free  text)  

Lab  results  (Con@nuous  valued)  

MD  comments  (free  text)  

Specialist  consults   Physician  documenta@on  

Repeated  vital  signs  (con@nuous  values)  Measured  every  30  s  

T=0  

30  min  2  hrs  

Disposi@on  

Next-Generation EHR for the Emergency Department

All  pa*ent    observa*ons  

MD/nurse  documenta@on  

Billing  codes   Vitals   Orders   Labs   History  

Built  on  Top  of  Real-­‐@me  Predic@on  of  Clinical  State  Variables  

All  pa*ent    observa*ons  

Clinical  state  variables  

MD/nurse  documenta@on  

Billing  codes   Vitals   Orders   Labs   History  

From  nursing  home?  

Has  altered  mental  status?  

Has  cardiac  e@ology?  

Has  infec@on?  

Will  die  in  next  30  days?  

Built  on  Top  of  Real-­‐@me  Predic@on  of  Clinical  State  Variables  

Machine  learning  and  natural  language  processing  

All  pa*ent    observa*ons  

Clinical  state  variables  

MD/nurse  documenta@on  

Billing  codes   Vitals   Orders   Labs   History  

Ac*on   Alerts/Reminders  

Decision  support   Cohort  Selec@on  QA  review   Contextual  display  

From  nursing  home?  

Has  altered  mental  status?  

Has  cardiac  e@ology?  

Has  infec@on?  

Will  die  in  next  30  days?  

Built  on  Top  of  Real-­‐@me  Predic@on  of  Clinical  State  Variables  

Machine  learning  and  natural  language  processing  

All  pa*ent    observa*ons  

Clinical  state  variables  

MD/nurse  documenta@on  

Billing  codes   Vitals   Orders   Labs   History  

Ac*on   Alerts/Reminders  

Decision  support   Cohort  Selec@on  QA  review   Contextual  display  

From  nursing  home?  

Has  altered  mental  status?  

Has  cardiac  e@ology?  

Has  infec@on?  

Will  die  in  next  30  days?  

Built  on  Top  of  Real-­‐@me  Predic@on  of  Clinical  State  Variables  

Machine  learning  and  natural  language  processing  

Advise  fall  precau@ons  

Suggested  order  sets  

Triggering  celluli@s  pathway  

Sepsis  alert   Panel  management  

Example:  Triggering  Clinical  Pathways  

•  Clinical  Pathways  project  at  Beth  Israel  Deaconess  Medical  Center  (BIDMC)  

•  Standardizing  care  in  the  Emergency  Department  

–  Reduce  possibili@es  for  error  

–  Enforce  established  best  prac@ces  

•  Pathways  have  been  shown  to  reduce  in-­‐hospital  complica@ons,  without  increasing  costs  [Rober  et  al  2010]  

Celluli@s  Pathway  Flowchart  

Automa@ng  triggers  

•  Don’t  rely  on  the  user’s  knowledge  that  the  pathway  exists!  

Current  triggering  mechanism  (Celluli@s  pathway)  

Trigger  if  chief  complaint  contains  any  of  the  following:    

CELLULITIS,  REDDENED  HOT  LIMB,  ERYTHEMA,  LEG  SWELLING,  INFECTION,  HAND,  LEG,  FOOT,  TOE,  ARM,  FACE,  FINGER  

Current  triggering  mechanism  (Celluli@s  pathway)  

Trigger  if  chief  complaint  contains  any  of  the  following:    

CELLULITIS,  REDDENED  HOT  LIMB,  ERYTHEMA,  LEG  SWELLING,  INFECTION,  HAND,  LEG,  FOOT,  TOE,  ARM,  FACE,  FINGER  

Expert  constructed  rule  –  built  for  sensi*vity  

Could  we  learn  a  beber  rule?  

Supervised  learning  is  a  non-­‐starter  

•  Leverage  large  clinical  databases  to  learn  predic@ve  rules.  

•  Need  labeled  data  •  Classifiers  onen  don’t  generalize  across  ins@tu@ons    

LOINC& UMLS&CUID& RXnorm& ICD9& Unstructured&Data&

Our  contribu@on:    Anchor  &  Learn  Framework  

•  Use  a  combina@on  of  domain  exper@se  (simple  rules)  and  vast  amounts  of  data  (machine  learning).  

•  Method  does  not  require  any  manual  labeling.  

•  Anchors  are  highly  transferable  between  ins@tu@ons.  

[Halpern  et  al.,  AMIA  2014]  

What  are  anchors?  

•  Rather  than  provide  gold-­‐standard  labels,  construct  a  simple  rule  that  can  catch  some  posi@ve  cases.    

What  are  anchors?  

•  Rather  than  provide  gold-­‐standard  labels,  construct  a  simple  rule  that  can  catch  some  posi@ve  cases.    

•  Examples:  

Phenotype   Possible  Anchor  Diabe@c   gsn:016313  (insulin)  in  Medica@ons  

Cardiac   ICD9:428.X  (heart  failure)  in  Diagnoses  

Nursing  home   “from  nursing  home”  in  text  

Social  work   “social  work  consulted”  in  text  

What  are  anchors?  

•  Rather  than  provide  gold-­‐standard  labels,  construct  a  simple  rule  that  can  catch  some  posi@ve  cases.  Low  sensi*vity  here  is  ok!    

•  Examples:  

Phenotype   Possible  Anchor  Diabe@c   gsn:016313  (insulin)  in  Medica@ons  

Cardiac   ICD9:428.X  (heart  failure)  in  Diagnoses  

Nursing  home   “from  nursing  home”  in  text  

Social  work   “social  work  consulted”  in  text  

Learning  with  Anchors  LOINC& UMLS&CUID& RXnorm& ICD9& Unstructured&Data&

Pa@ent    database  

•  Iden@fy  anchors  

Learning  with  Anchors  LOINC& UMLS&CUID& RXnorm& ICD9& Unstructured&Data&

Pa@ent    database  

1011001  

•  Iden@fy  anchors  

Learning  with  Anchors  LOINC& UMLS&CUID& RXnorm& ICD9& Unstructured&Data&

Pa@ent    database  

1011001  

•  Iden@fy  anchors  •  Learn  to  predict  the  anchors  (anchor  as  pseudo-­‐labels)  

Learning  with  Anchors  LOINC& UMLS&CUID& RXnorm& ICD9& Unstructured&Data&

Pa@ent    database  

1011001  

•  Iden@fy  anchors  •  Learn  to  predict  the  anchors  (anchor  as  pseudo-­‐labels)  •  Account  for  the  difference  between  anchors  and  labels  

Transform  

Predict  anchor   Predict  label  

New  ins@tu@on  

Generalizability/Portability  LOINC& UMLS&CUID& RXnorm& ICD9& Different&data&types&

LOINC& UMLS&CUID& RXnorm& ICD9& Different&data&types&

New  ins@tu@on  

Generalizability/Portability  

Data  may  be  very  different:  •  Language  •  Representa@on    •  Popula@on  

New  ins@tu@on  

Generalizability/Portability  

As  long  as  our  anchors  appear  in  the  new  data  as  well…  

LOINC& UMLS&CUID& RXnorm& ICD9& Different&data&types&

New  ins@tu@on  

Generalizability/Portability  

As  long  as  our  anchors  appear  in  the  new  data  as  well…  Can  learn  a  new  model,  specific  to  the  new  ins@tu@on.  

LOINC& UMLS&CUID& RXnorm& ICD9& Different&data&types&

New  ins@tu@on  

Generalizability/Portability  

As  long  as  our  anchors  appear  in  the  new  data  as  well…  Can  learn  a  new  model,  specific  to  the  new  ins@tu@on.  

Only  need  to  share  anchor  defini*ons,  Each  site  trains  models  on  its  own  data.  

LOINC& UMLS&CUID& RXnorm& ICD9& Different&data&types&

Theore@cal  basis  for  anchors  •  Unobserved  variable:  Y,  Observa@on:  A  •  A  is  an  anchor  for  Y  if  condi@oning  on  A=1  gives  uniform  samples  from  the  set  of  posi8ve  cases.  

Theore@cal  basis  for  anchors  •  Unobserved  variable:  Y,  Observa@on:  A  •  A  is  an  anchor  for  Y  if  condi@oning  on  A=1  gives  uniform  samples  from  the  set  of  posi8ve  cases.  

•  Alterna@ve  formula@on  –  two  necessary  condi@ons:  

P (Y = 1|A = 1) = 1Posi*ve  condi*on  

A ? X|YCondi*onal  independence  

AND  

X represents  all  other  observa@ons.  

Theore@cal  basis  for  anchors  •  Unobserved  variable:  Y,  Observa@on:  A  •  A  is  an  anchor  for  Y  if  condi@oning  on  A=1  gives  uniform  samples  from  the  set  of  posi8ve  cases.  

•  Alterna@ve  formula@on  –  two  necessary  condi@ons:  

P (Y = 1|A = 1) = 1Posi*ve  condi*on  

A ? X|YCondi*onal  independence  

AND  

X represents  all  other  observa@ons.  

e.g.  If  pa@ent  is  taking  insulin,  the  pa@ent  is  surely  diabe*c.  

e.g.  If  we  know  the  pa@ent  had  heart  failure,  knowing  whether  the  diagnosis  code  appears  does  inform  us  about  the  rest  of  the  record.  

Theore@cal  basis  for  anchors  •  Unobserved  variable:  Y,  Observa@on:  A  •  A  is  an  anchor  for  Y  if  condi@oning  on  A=1  gives  uniform  samples  from  the  set  of  posi8ve  cases.  

•  Theorem  [Elkan  &  Noto  2008]:    

In  the  above  se>ng,  a  func8on  to  predict  A    

can  be  transformed  to  predict  Y  

•  Can  also  use  more  recent  advances  on  learning  with  noisy  labels  (e.g.,  Natarajan  et  al.,  NIPS  ‘13)  

Learning  with  anchors  Input:  anchor  A  

           unlabeled  pa@ents  Output:  predic@on  rule  1.  Learn  a  calibrated  classifier  (e.g.  

logis@c  regression)  to  predict:  

2.  Using  a  validate  set,  let  P  be  the  pa@ents  with  A=1.  Compute:  

3.  For  a  previously  unseen  pa@ent  t,  predict:  

Pr(A = 1 | X̃ )

C =1

|P|X

k2PPr(A = 1 | X̃ (k))

[Elkan  &  Noto  2008]  

1

CPr(A = 1|X (t)) if A(t) = 0

1 if A(t) = 1

Calibra*on  C  is  the  average  model  

predic@on  for  pa@ents  with  anchors.  

Learning  Learn  to  predict  A  from  the  other  variables.  

Transforma*on  If  no  anchor  present,  

according  to  a  scaled  version  of  the  anchor-­‐predic@on  

model.  

…  

…  

Specified  anchors  Automated  sugges@ons  

Detailed  pa@ent  display  

Ranked  pa@ent  list  

Pa@ent  filters  

User  interface  to  specify  anchors  

Rapid  itera*on  ~30  min  to  add  a  new  clinical  state  

variable  

Sonware  freely  available:  clinicalml.org  

Learned  model:  Celluli@s  

Pyxis  

Unstructured  text  

Anchors  Highly  weighted  features  (covariates)  

ICD9  680-­‐686:    Infec*ons  of  skin  and  subcutaneous  *ssue  

celluli*s  celluli*c  cellulits    paronychia    pilonidal    bite  cyst    boil    

abcess  abscess    abcesses  red    redness    reddness    erythema  

unasyn    vanco  finger  thumb  rle  lle  gluteal  

cephalexin  vancomycin  clindamycin  

cephazolin  amoxicillin  sulfameth/trimeth  

(using  200K  pa@ents’  data,  2008-­‐2013)  

Learned  model:  Cardiac  E@ology  

ICD9  codes  410.*  acute  MI  

411.*  other  acute  …  413.*  angina  pectoris  785.51  card.  shock  

Pyxis  coron.  vasodilators  

loop  diure@c  

Anchors  

cmed  

Ages  age=80-­‐90  age=70-­‐80  age=90+  

nstemi  stemi  ntg    lasix  nitro  

lasix  furosemide  

Medica*ons  aspirin  

clopidogrel  Heparin  Sodium  Metoprolol  Tartrate  

Morphine  Sulfate  Integrilin  Labetalol  

Pyxis  

Unstructured  text  

cp  chest  pain  edema  cmed  

chf  exacerba@on  sob  

pedal  edema  

Sex=M  

Highly  weighted  features  (covariates)  

(using  200K  pa@ents’  data,  2008-­‐2013)  

Learned  model:  Nursing  Home  

nursing  facility  

nursing  home  nsg  facility  

nsg  home  nsg.  home  

from  staff  at  

resident  sent  

reported  

Ages  age=90+  age=80-­‐90  age=70-­‐80  

baseline  changes  nonverbal  

ams  unwitnessed_fall  

confusion  

senna  colace  

trazodone  

dnr  full  code  g  tube  foley  nh  

Medica*ons  

vancomycin  levofloxacin  

Pyxis  

Unstructured  text  

Anchors  

mirtazapine  maalox  tums  

Highly  weighted  features  (covariates)  

(using  200K  pa@ents’  data,  2008-­‐2013)  

Evalua@on:  ED  red  flags  

•  Ac@ve  malignancy  •  Fall  •  Cardiac  E@ology  •  Infec@on  •  From  Nursing  Home  

•  An@coagulated  •  Immunosuppressed  •  Sep@c  Shock  •  Pneumonia  

We  gathered  gold  standard  labels  for  these  9  variables  by  adding  ques@ons  to  EMR  at  @me  of  ED  disposi@on:  

Comparison  to  Exis@ng  Approaches  

•  (Rules)  Predict  just  according  to  the  anchors.    – 1  if  anchor  is  present,  0  otherwise  

•  (ML)  Machine  learning  (logis@c  regression)  – Using  up  to  3K  labels  –  Improves  with  more  labels,  but  labels  are  expensive!  

Accuracy  of  predic@ons  

*  

Accuracy  of  predic@ons  

*  

Accuracy  of  predic@ons  

*  

*  

Scaling  this  up  •  Currently  making  predic@ons  for  40  clinical  variables  within  the  BIDMC  pa*ent  display  – e.g.  allergic  reac@on,  motor  vehicle  accident,  hiv+  

•  Only  turned  on  for  a  small  number  of  clinicians  

Suggested  tags:  MD  can  accept/reject  

Scaling  this  up  •  Currently  making  predic@ons  for  40  clinical  variables  within  the  BIDMC  pa*ent  display  – e.g.  allergic  reac@on,  motor  vehicle  accident,  hiv+  

Accep@ng  a  tag  triggers  events    (pathway  enrollment,  specialized  order  sets,  etc)  

Our  next  steps  

•  Shared  library  of  anchored  phenotypes  •  Real-­‐@me  es@ma@on  of  clinical  states  and  actual  use  for  decision  support  within  ED  

•  Test  portability  of  anchors  to  other  ins@tu@ons  

More  info:  clinicalml.org