Acquiring and representing drug-drug interaction knowledge as claims and evidence, NLM informatics...



Page 1: Acquiring and representing drug-drug interaction knowledge as claims and evidence, NLM informatics training conference, 2016-06-26

Jodi Schneider & Richard D. Boyce

Department of Biomedical Informatics School of Medicine, University of Pittsburgh

Designing a data model for representing claims and evidence

1. Identify key ontologies relevant for claims and evidence.
• Nanopublications Ontology represents settled science: each formalized claim (assertion) is wrapped in provenance and publication info.
• Micropublications Ontology represents claims and evidence. It views a scientific paper as a network of claims supported by data, methods, and materials. Claims, data, methods, and materials can be text, images, or multimedia: anything the Open Annotation Ontology can reference.

2. Identify key domain ontologies to reuse.
• Ontology of Biomedical Investigations
• Chemical Entities of Biological Interest
• Drug Ontology

3. Conceptualize 3 layers and determine what belongs in each layer.

4. Formalize key terms about drug-drug interactions in a new ontology, DIDEO. For instance, “potential drug-drug interaction” gets obo:DIDEO_00000000.

5. Create micropublication claims and nanopublication assertions.
Micropublication M: “Clarithromycin interacts with simvastatin”
Nanopublication assertion N: obo:CHEBI_3732 obo:DIDEO_00000000 obo:CHEBI_9150

6. Connect natural language quotations and their information-retrieval friendly versions.
M mp:formalizedBy N .
N mp:formalizes M .
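Steps 5 and 6 above can be sketched in a few lines of plain Python. This is only an illustration, not the project's actual tooling: the identifiers for M and N are hypothetical example.org IRIs, the mp: namespace IRI is an assumption about the Micropublications Ontology, and the output is a simple N-Triples-style serialization.

```python
# Namespace IRIs. The obo: base is the standard OBO PURL prefix; the mp: IRI
# is an assumption about the Micropublications Ontology namespace.
OBO = "http://purl.obolibrary.org/obo/"
MP = "http://purl.org/mp/"  # assumed


def iri(prefix, local):
    """Wrap a prefix plus local name as an N-Triples IRI term."""
    return f"<{prefix}{local}>"


# Hypothetical identifiers for the micropublication claim M and the
# nanopublication assertion N (invented for this sketch).
M = "<http://example.org/micropublication/M>"
N = "<http://example.org/nanopublication/N#assertion>"

CLARITHROMYCIN = iri(OBO, "CHEBI_3732")
SIMVASTATIN = iri(OBO, "CHEBI_9150")
POTENTIAL_DDI = iri(OBO, "DIDEO_00000000")

# Step 5: the formalized assertion, "clarithromycin interacts with simvastatin".
assertion = [(CLARITHROMYCIN, POTENTIAL_DDI, SIMVASTATIN)]

# Step 6: link the natural-language claim and its formalization in both directions.
links = [
    (M, iri(MP, "formalizedBy"), N),
    (N, iri(MP, "formalizes"), M),
]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(assertion + links))
```

Keeping the assertion and the formalizes/formalizedBy links as separate lists mirrors the split between the nanopublication assertion and the micropublication that points to it.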

Ontologies, Data, and Websites

DIDEO: The Potential Drug-drug Interaction and Potential Drug-drug Interaction Evidence Ontology
https://github.com/DIDEO/DIDEO

Contributions to Micropublications Ontology (formalizes/formalizedBy):
https://github.com/dbmi-pitt/DIKB-Micropublication/blob/master/data/mp_1.18.owl

Drug Interaction Knowledge Base website & discussion forums
http://dikb.org and http://forums.dikb.org

Problem    

Potential drug-drug interactions are a significant source of preventable drug-related harm. The drug information sources clinicians use are discordant: most drug information sources disagree substantially in their content (e.g. Abarca et al. 2003, Wang et al. 2010, Saverno et al. 2011). This problem has persisted for more than a decade (e.g. Ayvaz et al. 2015, Ekstein et al. 2015) despite extensive editorial work on the part of each drug information source. This is in part because:
(1) There is no standard, agreed-upon method for assessing evidence about drug-drug interactions.
(2) Knowledge claims and evidence about drug-drug interactions are distributed across multiple sources: pre-market studies, post-market studies, and clinical experience.

                       

University of Pittsburgh, Department of Biomedical Informatics
Funded by training grant T15LM007059 from the National Library of Medicine/National Institute of Dental and Craniofacial Research and by R01LM011838 from the National Library of Medicine.

Approach  

Annotation Results

From prior work, we have transformed
• 410 assertions
• 519 evidence items

Annotators have also identified evidence in
• 158 non-regulatory documents (including full-text research articles)
• 27 FDA-approved drug labels

This leads to an additional:
• 230 assertions of drug-drug interactions in non-regulatory documents
• 609 evidence items relating to potential pharmacokinetic drug-drug interactions from 27 FDA-approved drug labels

Acquiring  claims  and  evidence    

1. Formulate claims of interest. “Clarithromycin interacts with simvastatin”.

2. Identify relevant source documents. Source documents include FDA-approved drug product labels and full-text research papers (clinical trials and case reports).

3. Experts assess quality & relevance of source documents. Experts check that documents meet inclusion criteria. Experts find relevant claims, methods, and results.

4. Pre-annotation by computer text mining. Source documents are pre-processed to find drug mentions, using named entity recognition algorithms.

5. Human curators annotate full-text documents.
(a) The curator highlights the claim.
(b) The curator enters the claim and scientific method.
(c) The curator is prompted to add data based on the method.
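The pre-annotation in step 4 can be illustrated with a deliberately simple stand-in. The sketch below uses a case-insensitive dictionary lookup rather than a trained named entity recognition model, so it shows only the kind of output curators would start from; the two-entry lexicon and the function name are assumptions made for this example.

```python
import re

# Toy drug lexicon mapping surface forms to identifiers (an assumption);
# a real pipeline would draw on a full drug vocabulary and an NER model.
DRUG_LEXICON = {
    "clarithromycin": "obo:CHEBI_3732",
    "simvastatin": "obo:CHEBI_9150",
}

# Whole-word, case-insensitive match on any lexicon entry.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in sorted(DRUG_LEXICON)) + r")\b",
    re.IGNORECASE,
)


def find_drug_mentions(text):
    """Return (start, end, surface form, identifier) for each lexicon hit."""
    return [
        (m.start(), m.end(), m.group(0), DRUG_LEXICON[m.group(0).lower()])
        for m in _PATTERN.finditer(text)
    ]


sentence = (
    "Clarithromycin significantly (p<0.001) increased the AUC (and Cmax) "
    "of all 3 statins, most markedly simvastatin"
)
mentions = find_drug_mentions(sentence)

if __name__ == "__main__":
    for start, end, surface, identifier in mentions:
        print(start, end, surface, identifier)
```

Character offsets are returned with each hit so that a mention can be anchored back to its position in the source text, which is what a highlighting-based annotation step needs.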

[Figure: example micropublication. Micropublication 1 mp:argues Claim 1 [Clarithromycin interacts with Simvastatin]. The claim is linked by mp:qualifies to obo:CHEBI_3732 [Clarithromycin], obo:CHEBI_9150 [Simvastatin], and obo:DIDEO_00000000 [Potential drug-drug interaction]. Data 1, Method 1, and Materials 1 are connected by mp:supports edges (four in the figure). The quoted passage "Clarithromycin significantly (p<0.001) increased the AUC (and Cmax) of all 3 statins, most markedly simvastatin" is linked by oa:hasSource to a source document identified by DOI (the DOI is illegible in this transcript).]

Publications and Presentations

1. Jodi Schneider, Mathias Brochhausen, Samuel Rosko, Paolo Ciccarese, William R. Hogan, Daniel Malone, Yifan Ning, Tim Clark, and Richard D. Boyce. “Formalizing knowledge and evidence about potential drug-drug interactions.” International Workshop on Biomedical Data Mining, Modeling, and Semantic Integration at International Semantic Web Conference 2015. http://ceur-ws.org/Vol-1428/BDM2I_2015_paper_10.pdf

2. Jodi Schneider, Paolo Ciccarese, Tim Clark, and Richard D. Boyce. “Using the Micropublications ontology and the Open Annotation Data Model to represent evidence within a drug-drug interaction knowledge base.” 4th Workshop on Linked Science at International Semantic Web Conference 2014. http://ceur-ws.org/Vol-1282/lisc2014_submission_8.pdf

3. Mathias Brochhausen, Jodi Schneider, Daniel Malone, Philip E. Empey, William R. Hogan, and Richard D. Boyce. “Towards a foundational representation of potential drug-drug interaction knowledge.” First International Workshop on Drug Interaction Knowledge Representation at the International Conference on Biomedical Ontologies 2014. http://ceur-ws.org/Vol-1309/paper2.pdf

4. Mathias Brochhausen, Philip E. Empey, Jodi Schneider, William R. Hogan, and Richard D. Boyce. “Adding evidence type representation to DIDEO.” ICBO 2016. http://jodischneider.com/pubs/icbo2016.pdf

5. Jodi Schneider, Samuel Rosko, Yifan Ning, and Richard D. Boyce. “Towards structured publishing of potential drug-drug interaction knowledge and evidence.” Poster presentation at the Pittsburgh Biomedical Informatics Training Program 2015 Retreat, Pittsburgh, PA, August 20, 2015. doi:10.6084/m9.figshare.1514991

6. Jodi Schneider and Richard D. Boyce. “Medication safety as a use case for argumentation mining.” Dagstuhl Seminar 16161: Natural Language Argumentation: Mining, Processing, and Reasoning over Textual Arguments, Dagstuhl, Germany, April 19, 2016. http://www.slideshare.net/jodischneider/medication-safety-as-a-use-case-for-argumentation-mining-dagstuhl-seminar-16161-2016-0419

Future Work
• Build an information portal that supports clinical pharmacists and drug information professionals in retrieving the claims and evidence.
• Test the information portal in a task-based, within-subject user study. Measure the completeness of the information experts retrieve with our information portal compared to current state-of-the-art retrieval tools.
• Test the feasibility of authors annotating their own claims and evidence.
• Enable annotation beyond PubMed Central open access HTML.
• Use rulesets of “belief criteria” to transform evidence to a knowledge base.
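As a rough illustration of the last item, a belief-criteria ruleset can be pictured as a filter that decides which evidence is strong enough to promote an assertion into the knowledge base. The sketch below is hypothetical throughout: the evidence-type labels, the record layout, the placeholder drugs "drug:A" and "drug:B", and the rule that one qualifying evidence item suffices are all assumptions, not the project's design.

```python
DDI = "obo:DIDEO_00000000"  # potential drug-drug interaction

# Hypothetical evidence records: each pairs an assertion triple with the
# type of evidence that supports it.
evidence_items = [
    {
        "assertion": ("obo:CHEBI_3732", DDI, "obo:CHEBI_9150"),
        "evidence_type": "clinical trial",
    },
    {
        "assertion": ("drug:A", DDI, "drug:B"),  # placeholder drugs
        "evidence_type": "case report",
    },
]


def build_knowledge_base(items, accepted_types):
    """Keep every assertion backed by at least one accepted evidence type."""
    return {
        item["assertion"]
        for item in items
        if item["evidence_type"] in accepted_types
    }


# Two alternative rulesets applied to the same evidence.
strict = build_knowledge_base(evidence_items, {"clinical trial"})
permissive = build_knowledge_base(evidence_items, {"clinical trial", "case report"})

if __name__ == "__main__":
    print(len(strict), len(permissive))
```

Because the ruleset is a parameter, the same annotated evidence can yield different knowledge bases under different criteria, which is the point of separating evidence from belief.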

Aim    

In this work, we address the distributed nature of drug-drug interaction knowledge by developing a computable representation for claims and evidence about drug-drug interactions. Our goal is to support:
(a) Knowledge acquisition from full-text natural language.
(b) Search and retrieval of all evidence.

We are applying this representation to acquire claims and evidence about pharmacokinetic interactions for 65 drugs. This will help us design a search portal to test whether computable representations of knowledge claims and evidence can improve search and retrieval of potential drug-drug interactions. Longer term, we will test whether this can help reduce the discordance between different drug information sources.

         

We model knowledge as claims supported by evidence.

[Figure: a nanopublication, made up of an assertion, provenance, and publication info.]

DIDEO: formalizing medication safety studies

As an OWL ontology, DIDEO supports querying as well as reasoning. For instance, curators will only need to enter a few facts in order for the scientific method of a study to be automatically determined.

DIDEO uses ontological realism to distinguish:
• Potential vs. actual drug-drug interactions.
• Inferred vs. observed interactions.
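The idea of determining a study's method from a few curator-entered facts can be mimicked with ordinary rules. The sketch below is not DIDEO and not OWL reasoning: it is a hand-written decision function in Python, and the method labels and thresholds are hypothetical. In DIDEO the equivalent classification would come from a reasoner applied to the ontology's class definitions.

```python
from dataclasses import dataclass


@dataclass
class StudyFacts:
    """The few facts a curator enters about a study (illustrative only)."""
    participants: int
    randomized: bool


def infer_method(facts):
    """Map curator-entered facts to a method label using hypothetical rules."""
    if facts.participants == 1:
        return "case report"
    if facts.randomized:
        return "randomized study"
    return "non-randomized study"


examples = [
    StudyFacts(participants=1, randomized=False),
    StudyFacts(participants=24, randomized=True),
    StudyFacts(participants=24, randomized=False),
]
labels = [infer_method(facts) for facts in examples]

if __name__ == "__main__":
    for facts, label in zip(examples, labels):
        print(facts, "->", label)
```

The curator supplies only the facts; the label is derived, so a change to the rules reclassifies every study without re-annotation.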

[Figure: three layers, with boxes labeled Claim, Data, and Method. Accompanying labels: "Claim, data, method, and material"; "Dosage, regimen, clearance, Cmax, AUC, half life"; "Number of participants, randomization".]