Biomedical Information Retrieval · questions using an information retrieval system. Bulletin of...
Biomedical Information Retrieval
William Hersh, MD
Professor and Chair
Department of Medical Informatics & Clinical Epidemiology
Oregon Health & Science University
Portland, OR, USA
Email: [email protected]
Web: www.billhersh.info
Blog: http://informaticsprofessor.blogspot.com
Twitter: @williamhersh

References

Alsheikh-Ali, AA, Qureshi, W, et al. (2011). Public availability of published research data in high-impact journals. PLoS ONE. 6(9): e24357. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0024357
Anonymous (2006). Fatally Flawed - Refuting the recent study on encyclopedic accuracy by the journal Nature. Chicago, IL, Encyclopedia Britannica. http://corporate.britannica.com/britannica_nature_response.pdf
Anonymous (2012). From Screen to Script: The Doctor's Digital Path to Treatment. New York, NY, Manhattan Research; Google. https://www.thinkwithgoogle.com/research-studies/the-doctors-digital-path-to-treatment.html
Anonymous (2015). The Beginner's Guide to SEO. Seattle, WA, Moz. http://moz.com/beginners-guide-to-seo
Anonymous (2016). Toward fairness in data sharing. New England Journal of Medicine. 375: 405-407.
Anonymous (2017). Database Resources of the National Center for Biotechnology Information. Nucleic Acids Research. 45: D12-D17.
Bachrach, CA and Charen, T (1978). Selection of MEDLINE contents, the development of its thesaurus, and the indexing process. Medical Informatics. 3: 237-254.
Bastian, H, Glasziou, P, et al. (2010). Seventy-five trials and eleven systematic reviews a day: how will we ever keep up? PLoS Medicine. 7(9): e1000326. http://www.plosmedicine.org/article/info%3Adoi%2F10.1371%2Fjournal.pmed.1000326
Brin, S and Page, L (1998). The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems. 30: 107-117. http://infolab.stanford.edu/pub/papers/google.pdf
Broder, A (2002). A taxonomy of Web search. SIGIR Forum. 36(2): 3-10. http://www.acm.org/sigir/forum/F2002/broder.pdf
Castillo, C and Davison, BD (2011). Adversarial Web Search. Delft, Netherlands, now Publishers.
Cerrato, P (2012). IBM Watson Finally Graduates Medical School. Information Week, October 23, 2012. http://www.informationweek.com/healthcare/clinical-systems/ibm-watson-finally-graduates-medical-sch/240009562
Coletti, MH and Bleich, HL (2001). Medical subject headings used to search the biomedical literature. Journal of the American Medical Informatics Association. 8: 317-323.
Davies, K (2006). Search and Deploy. Bio-IT World, October 16, 2006. http://www.bio-itworld.com/issues/2006/oct/biogen-idec/
DeAngelis, CD, Drazen, JM, et al. (2005). Is this clinical trial fully registered? A statement from the International Committee of Medical Journal Editors. Journal of the American Medical Association. 293: 2927-2929.
Ferrucci, D, Brown, E, et al. (2010). Building Watson: an overview of the DeepQA Project. AI Magazine. 31(3): 59-79. http://www.aaai.org/ojs/index.php/aimagazine/article/view/2303
Ferrucci, DA (2012). Introduction to "This is Watson". IBM Journal of Research and Development. 56(3/4): 1. http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6177724
Fox, S (2011). Health Topics. Washington, DC, Pew Internet & American Life Project. http://www.pewinternet.org/Reports/2011/HealthTopics.aspx
Fox, S (2011). The Social Life of Health Information, 2011. Washington, DC, Pew Internet & American Life Project. http://www.pewinternet.org/Reports/2011/Social-Life-of-Health-Info.aspx
Fox, S and Duggan, M (2013). Health Online 2013. Washington, DC, Pew Internet & American Life Project. http://www.pewinternet.org/Reports/2013/Health-online.aspx
Funk, ME and Reid, CA (1983). Indexing consistency in MEDLINE. Bulletin of the Medical Library Association. 71: 176-183.
Giles, J (2005). Internet encyclopaedias go head to head. Nature. 438: 900-901. http://www.nature.com/nature/journal/v438/n7070/full/438900a.html
Gorman, PN (1995). Information needs of physicians. Journal of the American Society for Information Science. 46: 729-736.
Hanbury, A, Müller, H, et al. (2015). Evaluation-as-a-Service: Overview and Outlook, arXiv. http://arxiv.org/pdf/1512.07454v1
Haynes, RB, McKibbon, KA, et al. (1990). Online access to MEDLINE in clinical settings. Annals of Internal Medicine. 112: 78-84.
Heilman, J (2013). Online encyclopedia provides free health info for all. Bulletin of the World Health Organization. 91: 8-9.
Hersh, W, Müller, H, et al. (2009). The ImageCLEFmed medical image retrieval task test collection. Journal of Digital Imaging. 22: 648-655.
Hersh, W and Voorhees, E (2009). TREC genomics special issue overview. Information Retrieval. 12: 1-15.
Hersh, WR (1994). Relevance and retrieval evaluation: perspectives from medicine. Journal of the American Society for Information Science. 45: 201-206.
Hersh, WR (2009). Information Retrieval: A Health and Biomedical Perspective (3rd Edition). New York, NY, Springer.
Hersh, WR, Bhupatiraju, RT, et al. (2006). Enhancing access to the bibliome: the TREC 2004 Genomics Track. Journal of Biomedical Discovery and Collaboration. 1: 3. http://www.j-biomed-discovery.com/content/1/1/3
Hersh, WR, Crabtree, MK, et al. (2002). Factors associated with success for searching MEDLINE and applying evidence to answer clinical questions. Journal of the American Medical Informatics Association. 9: 283-293.
Hersh, WR, Crabtree, MK, et al. (2000). Factors associated with successful answering of clinical questions using an information retrieval system. Bulletin of the Medical Library Association. 88: 323-331.
Hersh, WR and Hickam, DH (1998). How well do physicians use electronic information retrieval systems? A framework for investigation and review of the literature. Journal of the American Medical Association. 280: 1347-1352.
Hersh, WR, Hickam, DH, et al. (1994). A performance and failure analysis of SAPHIRE with a MEDLINE test collection. Journal of the American Medical Informatics Association. 1: 51-60.
Hersh, WR, Müller, H, et al. (2006). Advancing biomedical image retrieval: development and analysis of a test collection. Journal of the American Medical Informatics Association. 13: 488-496.
Holan, AD (2016). 2016 Lie of the Year: Fake news. St. Petersburg, FL, Politifact. http://www.politifact.com/truth-o-meter/article/2016/dec/13/2016-lie-year-fake-news/
Huesch, MD (2013). Privacy threats when seeking online health information. JAMA Internal Medicine. 173: 1838-1839.
Insel, TR, Volkow, ND, et al. (2003). Neuroscience networks: data-sharing in an information age. PLoS Biology. 1: E17.
Kalpathy-Cramer, J, Seco de Herrera, AG, et al. (2015). Evaluating performance of biomedical image retrieval systems - an overview of the medical image retrieval task at ImageCLEF 2004–2013. Computerized Medical Imaging and Graphics. 39: 55-61.
Laine, C, Horton, R, et al. (2007). Clinical trial registration: looking back and moving ahead. Journal of the American Medical Association. 298: 93-94.
Laurent, MR and Vickers, TJ (2009). Seeking health information online: does Wikipedia matter? Journal of the American Medical Informatics Association. 16: 471-479.
Lee, JS, Lorincz, C, et al. (2011). Should Healthcare Organizations Use Social Media? Falls Church, VA, Computer Sciences Corp. http://assets1.csc.com/health_services/downloads/CSC_Should_Healthcare_Organizations_Use_Social_Media.pdf
Libert, T (2015). Privacy implications of health information seeking on the Web. Communications of the ACM. 58(3): 68-77.
Lohr, S (2012). The Future of High-Tech Health Care – and the Challenge. New York, NY. New York Times. February 13, 2012. http://bits.blogs.nytimes.com/2012/02/13/the-future-of-high-tech-health-care-and-the-challenge/
Magrabi, F, Coiera, EW, et al. (2005). General practitioners' use of online evidence during consultations. International Journal of Medical Informatics. 74: 1-12.
Marcetich, J, Rappaport, M, et al. (2004). Indexing consistency in MEDLINE. MLA 04 Abstracts, Washington, DC. Medical Library Association. 10-11.
Markoff, J (2011). Computer Wins on ‘Jeopardy!’: Trivial, It’s Not. New York, NY. New York Times. February 16, 2011. http://www.nytimes.com/2011/02/17/science/17jeopardy-watson.html
McHenry, R (2004). The Faith-Based Encyclopedia. Tech Central Station, November 15, 2004. http://www.techcentralstation.com/111504A.html
Mello, MM, Francer, JK, et al. (2013). Preparing for responsible sharing of clinical trial data. New England Journal of Medicine. 369: 1651-1658.
Metzger, J and Rhoads, J (2012). Summary of Key Provisions in Final Rule for Stage 2 HITECH Meaningful Use. Falls Church, VA, Computer Sciences Corp. http://skynetehr.com/PDFFiles/MeaningUse_Stage2.pdf
Nicholson, DT (2006). An evaluation of the quality of consumer health information on Wikipedia. Capstone, Oregon Health & Science University.
Nielsen, J and Levy, J (1994). Measuring usability: preference vs. performance. Communications of the ACM. 37: 66-75.
Perrin, A (2015). One-fifth of Americans report going online ‘almost constantly’. Washington, DC, Pew Research Center. http://www.pewresearch.org/fact-tank/2015/12/08/one-fifth-of-americans-report-going-online-almost-constantly/
Pluye, P and Grad, RM (2004). How information retrieval technology may impact on physician practice: an organizational case study in family medicine. Journal of Evaluation in Clinical Practice. 10: 413-430.
Pluye, P, Grad, RM, et al. (2005). Impact of clinical information-retrieval technology on physicians: a literature review of quantitative, qualitative and mixed methods studies. International Journal of Medical Informatics. 74: 745-768.
Purcell, K, Brenner, J, et al. (2012). Search Engine Use 2012. Washington, DC, Pew Internet & American Life Project. http://www.pewinternet.org/Reports/2012/Search-Engine-Use-2012.aspx
Rodwin, MA and Abramson, JD (2012). Clinical trial data as a public good. Journal of the American Medical Association. 308: 871-872.
Roegiest, A and Cormack, GV (2016). An architecture for privacy-preserving and replicable high-recall retrieval experiments. Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, Pisa, Italy. 1085-1088.
Ross, JS and Krumholz, HM (2013). Ushering in a new era of open science through data sharing: the wall must come down. Journal of the American Medical Association. 309: 1355-1356.
Royle, JA, Blythe, J, et al. (1995). Literature search and retrieval in the workplace. Computers in Nursing. 13: 25-31.
Salton, G (1991). Developments in automatic text retrieval. Science. 253: 974-980.
Sánchez-Mendiola, M and Martínez-Franco, AI, Eds. (2014). Informática Biomédica, 2a Edición. Mexico City, MX, Elsevier.
Shortliffe, EH and Cimino, JJ, Eds. (2014). Biomedical Informatics: Computer Applications in Health Care and Biomedicine (Fourth Edition). London, England, Springer.
Smith, M (2014). Targeted: How Technology Is Revolutionizing Advertising and the Way Companies Reach Consumers. Washington, DC, AMACOM.
Stanfill, MH, Williams, M, et al. (2010). A systematic literature review of automated clinical coding and classification systems. Journal of the American Medical Informatics Association. 17: 646-651.
Strzalkowski, T and Harabagiu, S, Eds. (2006). Advances in Open-Domain Question Answering. Dordrecht, Netherlands, Springer.
Taylor, H (2010). "Cyberchondriacs" on the Rise? Those who go online for healthcare information continues to increase. Rochester, NY, Harris Interactive. http://www.harrisinteractive.com/vault/HI-Harris-Poll-Cyberchondriacs-2010-08-04.pdf
Tuason, O, Chen, L, et al. (2004). Biological nomenclatures: a source of lexical knowledge and ambiguity. Pacific Symposium on Biocomputing, Kona, Hawaii. World Scientific. 238-249.
Voorhees, E and Hersh, W (2012). Overview of the TREC 2012 Medical Records Track. The Twenty-First Text REtrieval Conference Proceedings (TREC 2012), Gaithersburg, MD. National Institute of Standards and Technology. http://trec.nist.gov/pubs/trec21/papers/MED12OVERVIEW.pdf
Voorhees, EM (2005). Question Answering in TREC. TREC - Experiment and Evaluation in Information Retrieval. E. Voorhees and D. Harman. Cambridge, MA, MIT Press: 233-257.
Voorhees, EM and Harman, DK, Eds. (2005). TREC: Experiment and Evaluation in Information Retrieval. Cambridge, MA, MIT Press.
Voorhees, EM and Tong, RM (2011). Overview of the TREC 2011 Medical Records Track. The Twentieth Text REtrieval Conference Proceedings (TREC 2011), Gaithersburg, MD. National Institute of Standards and Technology.
Wanke, LA and Hewison, NS (1988). Comparative usefulness of MEDLINE searches performed by a drug information pharmacist and by medical librarians. American Journal of Hospital Pharmacy. 45: 2507-2510.
Westbrook, JI, Gosling, AS, et al. (2005). The impact of an online evidence system on confidence in decision making in a controlled setting. Medical Decision Making. 25: 178-185.
Wu, S, Liu, S, et al. (2017). Intra-institutional EHR collections for patient-level information retrieval. Journal of the American Society for Information Science & Technology: in press.
Yandell, MD and Majoros, WH (2002). Genomics and natural language processing. Nature Reviews - Genetics. 3: 601-610.
Zarin, DA and Tse, T (2013). Trust but verify: trial registration and determining fidelity to the protocol. Annals of Internal Medicine. 159: 65-67.
Zarin, DA, Tse, T, et al. (2015). The proposed rule for U.S. clinical trial registration and results submission. New England Journal of Medicine. 372: 174-180.
Zarin, DA, Tse, T, et al. (2011). The ClinicalTrials.gov results database -- update and key issues. New England Journal of Medicine. 364: 852-860.

Topics to cover
• Content
• Indexing
• Evaluation

Content
• Current status and challenges in biomedical information retrieval (IR)
• Classification and examples of knowledge-based information

Challenges in biomedical IR
• We have gone from information paucity to information overload
• Many topics we want to search on have multiple ways to be expressed
  – e.g., diseases, genes, symptoms, etc.
• The converse is a problem too: many words and terms used to express topics have multiple meanings
• Balancing open access vs. providing for cost of production and maintenance

IR is now “mainstream”
• Internet (and likely search engine) use is now ubiquitous
  – Not only in developed countries (Perrin, 2015) but across world – http://www.internetworldstats.com/stats.htm
• 71% of Internet users (59% of US adults) have searched for health information, with 35% using it for self-diagnosis (Fox, 2013)
• “Search engine optimization” (SEO) is a key function used by many companies and organizations (Moz, 2015)
  – https://moz.com/beginners-guide-to-seo
  – Some are lucky, e.g., last name of “Hersh”

The Web has changed the nature of search
• Three major uses (Broder, 2002)
  – Informational – seeking information (39-48%)
  – Navigational – looking for a specific page, e.g., a home page (20-24%)
  – Transactional – perform transactions, e.g., on-line purchasing (30-36%)
• We are in the era of “adversarial” search – there is content we do not want to retrieve (Castillo, 2011; Smith, 2014)
  – Some of the content we might not want to retrieve is “fake news,” which came to the fore in 2016 (Holan, 2016)
• Growing privacy concerns about tracking our searching (Huesch, 2013; Libert, 2015)

IR also a growing part of “knowledge discovery” from scientific literature
[Figure: All literature → Possibly relevant literature → Definitely relevant literature → Structured knowledge. Process labels: Information retrieval; Information extraction, text mining]

IR and online access firmly planted in health and biomedicine
• Biology is now defined as an “information science” (Insel, 2003)
• Pharmaceutical companies compete for informatics/library talent (Davies, 2006)
• Clinicians cannot keep up – average of 75 clinical trials and 11 systematic reviews published each day (Bastian, 2010)
• Search for health information by clinicians, researchers, and patients/consumers is ubiquitous (Purcell, 2012; Google/Manhattan Research, 2012)
  – It’s even part of “meaningful use” – text search over electronic health record notes (Metzger, 2012)

Use is ubiquitous among physicians (Google/Manhattan Research, 2012)
• Most have multiple devices – 99% with a desktop or laptop, 84% with a smartphone, and 54% with a tablet
• Spend twice as much time using online resources as print resources
• Even physicians aged 55+ are heavy users – 80% own a smartphone, 84% use search engines daily, and 9 hours per week is spent online for professional purposes
• Search engine use is a daily activity – 84%, with average of six searches done per day and 94% using Google
• When looking for clinical or treatment information, about a third click first on sponsored listings from a search
• About 93% say they take action based on searching – everything from pursuing more information to sharing with a patient or colleague to changing treatment decisions
• On smartphones, searching is preferred over mobile apps – 48% of use time with a search engine, 34% with mobile apps, and 18% going to specific Web sites in a browser or with a bookmark
• Spend about 6 hours per week watching online video, with about half of that time spent for professional purposes

What kind of health information do consumers search for? (Fox, 2011)

Health topic                                  % searching
Specific disease or medical problem           66%
Certain medical treatment or procedure        56%
Doctors or other health professionals         44%
Hospitals or other medical facilities         36%
Health insurance – private or government      33%
Food safety or recalls                        29%
Environmental health hazards                  22%
Pregnancy and childbirth                      19%
Medical test results                          16%

How to find more information about IR in health and biomedicine
• Hersh WR, Information Retrieval: A Health and Biomedical Perspective, Third Edition, 2009
  – Web site: www.irbook.info
• Chapters in other books, e.g., Shortliffe (2014), Sanchez-Mendiola (2014)
• Plenty of other books, journals, and other sources

Why is IR pertinent to health and biomedicine?
• Growth of knowledge has long surpassed human memory capabilities
• Clinicians have frequent and unmet information needs
• Researchers must frequently update their knowledge in new areas quickly
• Primary literature on a given topic can be scattered and hard to synthesize
• Non-primary literature sources are often neither comprehensive nor systematic
• Web is increasingly used as source of health and biomedical information

Life-cycle of knowledge-based information
[Figure: flowchart with the nodes Original research, Write up results, Submit for publication, Peer review (outcomes: Revise, Reject, Accept), Relinquish copyright, Publish, Secondary publications, Public data repository]

Classification of knowledge-based scientific information
• Primary – original research
  – Published mainly in journals but also in conference proceedings, technical reports, books, etc.
  – Can include re-analysis, e.g., meta-analysis and systematic reviews
• Secondary – reviews, condensations, and/or synopses of primary literature
  – Textbooks and handbooks are staples of clinical practitioners, researchers, and others
  – Guidelines are important for normalizing care and measuring quality

Classification of knowledge-based content
• Bibliographic
  – By definition rich in metadata
• Full-text
  – Everything on-line
• Annotated
  – Non-text or structured text annotated with text
• Aggregations
  – Bringing together all of the above
• These categories are admittedly fuzzy, and increasing numbers of resources have more than one type

Bibliographic content
• Bibliographic databases
  – The old (e.g., MEDLINE) have been revitalized with new features
  – New ones (e.g., National Guideline Clearinghouse) have emerged
• Web catalogs
  – Share many characteristics of traditional bibliographic databases
• Real simple syndication / Rich site summary (RSS)
  – “Feeds” provide information about new content

Bibliographic databases
• Contain metadata about (mostly) journal articles and other resources typically found in libraries
• Produced by
  – U.S. government – most produced by National Library of Medicine (NLM, www.nlm.nih.gov)
    • e.g., MEDLINE, genomics information, etc.
  – Commercial publishers, e.g.,
    • EMBASE – part of larger SciVal
    • CINAHL – Cumulative Index to Nursing and Allied Health Literature
    • ACM Guide to Computing Literature – computer science and related areas

MEDLINE
• References to biomedical journal literature
  – Original medical IR application – system for searching MEDLINE launched in 1971 with literature maintained in MEDLARS system dating back to 1966
    • Name derives from MEDLARS On-Line – MEDLINE
  – Free to world since 1997 via PubMed – http://pubmed.gov
    • Now with links to full text of articles and other resources
• Statistics
  – http://www.nlm.nih.gov/bsd/bsd_key.html
  – Over 23M references to peer-reviewed literature
  – Over 5600 journals, mostly English language
  – Nearly 900,000 new references added yearly

National Guideline Clearinghouse
• Produced by Agency for Healthcare Research and Quality (AHRQ)
  – www.guideline.gov
• Contains detailed information about guidelines
  – Including degree to which they are evidence-based
  – Interface allows comparison of elements in database for multiple guidelines
• Has links to those that are free on Web and links to producers when proprietary

Web catalogs
• Generally aim to provide quality-filtered Web sites aimed at specific audiences
  – Distinction between catalogs and sites blurry
• Some are aimed towards clinicians
  – HON Select – http://www.hon.ch/HONselect/
  – Translating Research into Practice – www.tripdatabase.com
• Others are aimed towards patients/consumers
  – Healthfinder – www.healthfinder.gov

RSS
• RSS “feeds” provide short summaries, typically of news, journal articles, or other recent postings on Web sites
• Users receive RSS feeds through an RSS aggregator that can typically be configured for the site(s) desired and to filter based on content
  – Work as standalone, in Web browsers, in email clients, etc.
• Two versions (1.0, 2.0) but basically provide
  – Title – name of item
  – Link – URL of full page
  – Description – brief description of page

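The three fields an RSS item provides (title, link, description) and the content filtering an aggregator performs can be sketched in a few lines. This is a minimal illustration, assuming an RSS 2.0 feed; the feed text, journal name, and URLs below are invented for the example, and `parse_items` and `filter_items` are hypothetical helper names, not part of any aggregator.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; the journal name and URLs are made up for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Journal</title>
    <link>http://journal.example.org/</link>
    <description>Recent articles</description>
    <item>
      <title>Statins and stroke prevention</title>
      <link>http://journal.example.org/articles/1</link>
      <description>A randomized trial of statins.</description>
    </item>
    <item>
      <title>Screening for hypertension</title>
      <link>http://journal.example.org/articles/2</link>
      <description>A systematic review of screening.</description>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return the title, link, and description of every <item> in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


def filter_items(items, keyword):
    """Keep items whose title or description mentions the keyword (case-insensitive)."""
    kw = keyword.lower()
    return [i for i in items
            if kw in i["title"].lower() or kw in i["description"].lower()]


items = parse_items(SAMPLE_FEED)
hypertension_items = filter_items(items, "hypertension")
```

A real aggregator would fetch the feed over HTTP and remember which items it has already shown; both are left out here to keep the sketch self-contained.
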
Full-text content
• Contains complete text as well as tables, figures, images, etc.
• If there is corresponding print version, both are usually identical
• Includes
  – Periodicals
  – Books
  – Web sites – may include either of above

Full-text primary literature
• Almost all biomedical journals available electronically
  – Many published by Highwire Press (www.highwire.org), which adds value to content of original publisher, including British Medical Journal, Journal of the American Medical Association, New England Journal of Medicine, etc.
  – Also published by leading commercial scientific publishers, e.g., Elsevier, Kluwer, Springer, etc.
  – Growing number available via open-access model, e.g., Biomed Central (BMC), Public Library of Science (PLoS)
  – Another source of full-text papers is PubMed Central (PMC; http://pubmedcentral.gov)

Books
• Textbooks
  – Most well-known clinical textbooks are now available electronically
    • e.g., Harrison’s Principles of Internal Medicine
  – Most are bundled into large collections by publishers
    • e.g., AccessMedicine (McGraw-Hill), Elsevier, Kluwer
  – NLM has developed books site as part of Entrez
    • http://www.ncbi.nlm.nih.gov/books
• Compendia of drugs, diseases, evidence, etc.
• Handbooks – very popular with clinicians
• Increasingly published on mobile devices

Value added for electronic books
• Multimedia, e.g., skin lesions, shuffling gait of Parkinson’s Disease, etc.
• Bundling of multiple books
• Can be updated in between “editions”
• Linkage to other information, e.g., to references, self-assessments, updates, other resources, etc.

Web sites
• Defined more narrowly here to refer to coherent collections of information on Web
• Usually take advantage of Web features, such as linking, multimedia
• Increasingly integrated with other resources and available on different platforms (e.g., integrated into electronic health records [EHRs], on smartphones, etc.)

Some notable full-text content on Web sites
• Government agencies
  – National Cancer Institute
    • www.cancer.gov
  – Centers for Disease Control – travel and infection information
    • http://www.cdc.gov/DiseasesConditions
    • http://www.cdc.gov/travel/
  – Other NIH institutes, e.g., National Heart, Lung, and Blood Institute (NHLBI)
    • www.nhlbi.nih.gov

Full-text Web sites (cont.)
• Physician-oriented medical news and overviews, e.g.,
  – Medscape – www.medscape.com
  – Many professional societies provide to members, e.g., http://www.acponline.org/clinical_information/
• Patient/consumer-oriented, e.g.,
  – NetWellness – www.netwellness.com
  – WebMD – www.webmd.com
• Many mobile apps provide health information, e.g.,
  – iTriage – www.itriagehealth.com
  – Epocrates – www.epocrates.com

Other interesting types of Web content
• Wikipedia – www.wikipedia.org
  – Encyclopedia with free access and distributed authorship
  – Some concerns about manipulation (McHenry, 2004) but
    • Comparable to Encyclopedia Britannica? (Giles, 2005 – rebuttal: Anonymous, 2006)
    • Health information quality is reasonably good (Nicholson, 2006)
    • Content retrieved prominently in most Web searches (Laurent, 2009)
    • Making attempt to improve quality of medical content (Heilman, 2013)
• Body of knowledge
  – Software Engineering Body of Knowledge (SWEBOK, www.swebok.org) organizes knowledge of field
• Social media/Web 2.0 and beyond (Lee, 2011)

Annotated
• Non-text or structured text annotated with text
• Includes
  – Image collections
  – Citation databases
  – Evidence-based medicine databases
  – Clinical decision support
  – Genomics databases
  – Other databases

Image collections
• Most prominent in the “visual” medical specialties, such as radiology, pathology, and dermatology
• Well-known collections include
  – Visible Human – http://www.nlm.nih.gov/research/visible/visible_human.html
  – Lieberman’s eRadiology – http://eradiology.bidmc.harvard.edu
  – WebPath – http://library.med.utah.edu/WebPath/webpath.html
  – More pathology – PEIR, www.peir.net
  – DermIS – www.dermis.net
  – More dermatology, also a decision-support system – www.visualdx.com
• Many have associated text, which assists with indexing and retrieval

Citation databases
• Science Citation Index and Social Science Citation Index
  – Database of journal articles that have been cited by other journal articles
  – Now part of a package called Web of Science, which itself is part of a larger product, Web of Knowledge (Clarivate)
    • http://clarivate.com/scientific-and-academic-research/research-discovery/web-of-science/
• SCOPUS – http://www.elsevier.com/online-tools/scopus
• Google Scholar – http://scholar.google.com

Evidence-based medicine databases
• Cochrane Database of Systematic Reviews – http://www.cochrane.org
  – Collection of systematic reviews, kept updated
• Evidence “formularies”
  – Clinical Evidence (BMJ) – http://clinicalevidence.bmj.com/x/index.html
  – JAMAevidence – http://jamaevidence.com
• PubMed Health – https://www.ncbi.nlm.nih.gov/pubmedhealth/
  – Systematic reviews and summaries of systematic reviews
• Many resources part of aggregations

Clinical decision support (CDS)
• Content used in CDS systems, usually part of EHRs
  – Order sets (usually “evidence-based”)
  – CDS rules
  – Health/disease management templates
• Growing and evolving commercial market for such tools, especially as EHR adoption increases; leaders include
  – Zynx – www.zynxhealth.com
  – Thomson Reuters Cortellis – http://cortellis.thomsonreuters.com
  – EHR vendors themselves and partners

Genomics databases
• National Center for Biotechnology Information (NCBI, www.ncbi.nlm.nih.gov; NCBI, 2017) collection links
  – Literature references – MEDLINE
  – Textbook of genetic diseases – On-Line Mendelian Inheritance in Man
  – Sequence databases – Genbank
  – Structure databases – Molecular Modeling Database
  – Genomes – Catalog of genes
  – Maps – Locations of genes on chromosomes

Other databases
• ClinicalTrials.gov – www.clinicaltrials.gov
  – Originally database of clinical trials funded by NIH
  – Now used as register for clinical trials, with results reporting for some (DeAngelis, 2005; Laine, 2007; Zarin, 2013; Zarin, 2015)
• NIH RePORTER – http://projectreporter.nih.gov/reporter.cfm
  – Database of all research grants funded by NIH
  – Replaced the CRISP database

Data publishing
• Internet makes it technologically feasible
• Many fields have long tradition of requiring depositing of data in public repository as a condition to publish, e.g., genomics, although availability incomplete (Alsheikh-Ali, 2011)
• Growing advocacy for clinical trials data
  – A “public good” (Rodwin, 2012) for new era of “open science” (Ross, 2013)
  – Calls for doing so by journal editors (Taichman, 2016) and others (Ross, 2013; Mello, 2013)
  – Pushback from trialists who want time-limited protection of those who generate data for rewards of their work and from those who aim to discredit or undermine original research (Anonymous, 2016)
• biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE)
  – Database of metadata about available biomedical datasets – https://datamed.org/

Aggregations – integrating many resources
• Clinical – growing tendency of publishers to aggregate resources into comprehensive products
  – Merck Medicus – www.merckmedicus.com
    • Collection of many resources available to any licensed US physician
  – UptoDate – www.uptodate.com
    • Very popular among clinicians
  – Essential Evidence Plus (includes InfoPOEMS, “Patient-oriented evidence that matters”) – www.essentialevidenceplus.com
  – Dynamed – www.dynamed.com

Other aggregations
• Biomedical research: Model organism databases, e.g., Mouse Genome Informatics – www.informatics.jax.org
  – Combines genomics and related data, bibliographic database, gene references, etc.
• Consumer: MEDLINEplus – http://medlineplus.gov
  – Integrates a variety of licensed resources and public Web sites

Indexing
• Assignment of metadata to content to facilitate retrieval
• Two major types
  – Human indexing with controlled vocabulary
  – Automated indexing of all words
• Also address
  – Indexing other “objects”
  – UMLS Metathesaurus
  – Web indexing

Human indexing
• Usually performed by professional indexer with some background in biomedicine
• Follows protocol to scan resource and select terms from a controlled vocabulary
• Most vocabularies are hierarchical and have specific definitions for when term is to be assigned

Medical Subject Headings (MeSH) vocabulary (Coletti, 2001)
• Over 26,000 terms, with many synonyms for those terms
• Over 230,000 Supplementary Concept Records, formerly mostly chemicals and drugs, now rare diseases and genes
• Hierarchical, based on 16 trees, e.g., Anatomy, Diseases, Chemicals and Drugs
• Contains 83 subheadings, which can be used to make a heading more specific, such as Diagnosis or Therapy
• MeSH browser allows exploration – http://www.nlm.nih.gov/mesh/MBrowser.html

A slice of MeSH
C Diseases
  C1 Bacterial and Fungal Diseases
  C14 Cardiovascular Diseases
    C14.240 Cardiovascular Abnormalities
    C14.280 Heart Diseases
    C14.907 Vascular Diseases
      C14.907.055 Aneurysm
      C14.907.489 Hypertension
        C14.907.489.330 Malignant Hypertension
        C14.907.489.430 Portal Hypertension
        C14.907.489.631 Renal Hypertension
      C14.907.940 Vasculitis
  C20 Immunologic Diseases

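Because each MeSH tree number embeds its ancestors as a dotted prefix, the hierarchy can be walked with plain string operations. The sketch below uses only the tree numbers shown in the slice above; the function names `parent`, `descendants`, and `explode` are hypothetical, chosen for illustration, and "explode" is used in the sense of searching on a heading together with everything beneath it.

```python
# Tree numbers and headings copied from the slice above; real MeSH has many more entries.
MESH_SLICE = {
    "C14": "Cardiovascular Diseases",
    "C14.240": "Cardiovascular Abnormalities",
    "C14.280": "Heart Diseases",
    "C14.907": "Vascular Diseases",
    "C14.907.055": "Aneurysm",
    "C14.907.489": "Hypertension",
    "C14.907.489.330": "Malignant Hypertension",
    "C14.907.489.430": "Portal Hypertension",
    "C14.907.489.631": "Renal Hypertension",
    "C14.907.940": "Vasculitis",
}


def parent(tree_number):
    """Return the tree number one level up, or None at the top of the slice."""
    if "." not in tree_number:
        return None
    return tree_number.rsplit(".", 1)[0]


def descendants(tree_number):
    """Return all tree numbers below the given one, in sorted order."""
    prefix = tree_number + "."
    return sorted(t for t in MESH_SLICE if t.startswith(prefix))


def explode(tree_number):
    """Return the heading itself plus every more specific heading beneath it."""
    return [MESH_SLICE[t] for t in [tree_number] + descendants(tree_number)]


hypertension_family = explode("C14.907.489")
```

Exploding Hypertension here gathers the three more specific hypertension headings as well, which is why an indexer's choice of the most specific heading does not hide an article from a broader search.
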
MEDLINE indexing
• Indexing done by professionals who follow protocol first devised by Bachrach (1978)
  – Read title, introduction, and conclusion and then scan methods, results, figures, tables, and, lastly, abstract
  – Ignore “keywords” of publisher
  – Assign 2-4 headings (with or without subheadings) as central concepts (or major headings) and another 5-10 as minor headings
  – Use most specific headings in hierarchy assigned
• Important additional tag is Publication Types
  – e.g., Randomized Controlled Trial, Meta-Analysis, Practice Guideline, Review
• Many modern tools have been developed to assist indexing, such as term suggestion and look-up

Other bibliographic indexing
• Other NLM databases use MeSH
• Some non-NLM resources use MeSH
  – MeSH freely available from NLM at http://www.nlm.nih.gov/mesh/filelist.html
• Other non-NLM databases have their own subject headings, e.g.,
  – CINAHL subject headings
  – EMTREE

Other metadata
• Indexing covers more than content
• Other attributes of documents to index can include
  – Author(s)
  – Source: journal name, issue, pages
  – Publication or resource type
  – Relationship to other information
    • e.g., gene identifier, grant number, etc.

Automated indexing
• Indexing of all words that occur in content items
  – In bibliographic databases, will usually include title, abstract, and often other fields, e.g., author or subject heading
  – In full-text documents, will usually include all text, including title
• Often use a stop word list to remove common words (e.g., the, and, which)
• Some systems “stem” words to root form (e.g., coughs or coughing to cough)

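The steps of automated indexing (break text into words, drop stop words, stem, record which documents contain each term) can be sketched as follows. This is an illustration under stated assumptions: the stop word list is a tiny sample, `naive_stem` is a toy suffix stripper standing in for a real stemming algorithm, and the three documents are invented.

```python
import re
from collections import defaultdict

# Tiny illustrative stop word list; production lists are much longer.
STOP_WORDS = {"the", "and", "which", "a", "of", "in", "is", "to", "with"}


def naive_stem(word):
    """Strip a few common suffixes; a toy stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text):
    """Lowercase, split into words, remove stop words, and stem what remains."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(w) for w in words if w not in STOP_WORDS]


def build_inverted_index(docs):
    """Map each indexing term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in index_terms(text):
            index[term].add(doc_id)
    return index


# Hypothetical documents for illustration.
docs = {
    1: "The patient coughs at night",
    2: "Coughing and fever in children",
    3: "Hypertension which is treated with diuretics",
}
index = build_inverted_index(docs)
```

After stemming, both "coughs" and "coughing" are filed under "cough", so a search on that one term reaches documents 1 and 2.
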
Weighted indexing (Salton, 1991)
• Usually used with automated indexing
• Gives weight to words that are frequent but discriminating
• Most common approach is for weight to equal product TF*IDF
  – Inverse document frequency of word i
    • IDFi = log(# documents / # documents with word i) + 1
  – Term frequency of word i in document j
    • TFij = frequency of word i in document j

Weighted indexing examples
• From a database on AIDS
  – The word AIDS will likely occur in almost every document, while retinopathy will be much more “discriminating”
• In a general medical database
  – AIDS will occur much less frequently, so is better indexing term

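A small worked example may make the TF*IDF weighting concrete. The sketch below computes weight = TF * IDF with IDF = log(number of documents / number of documents containing the word) + 1 and TF as the raw count of the word in the document. The base of the logarithm is not stated in the slides, so base 10 is an assumption here, and the four-document "AIDS database" is invented for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs maps a document id to its list of words; returns id -> {word: weight}."""
    n_docs = len(docs)
    # Document frequency: number of documents in which each word appears.
    doc_freq = Counter()
    for words in docs.values():
        doc_freq.update(set(words))
    weights = {}
    for doc_id, words in docs.items():
        tf = Counter(words)  # raw term frequency within this document
        weights[doc_id] = {
            w: tf[w] * (math.log10(n_docs / doc_freq[w]) + 1)  # base 10 is an assumption
            for w in tf
        }
    return weights


# Hypothetical miniature database on AIDS: every document mentions "aids".
docs = {
    "d1": ["aids", "retinopathy", "aids"],
    "d2": ["aids", "treatment"],
    "d3": ["aids", "pneumonia"],
    "d4": ["aids", "retinopathy"],
}
weights = tf_idf(docs)
```

Because "aids" occurs in all four documents its IDF is log(1) + 1 = 1, the minimum, while "retinopathy" (two documents) and "treatment" (one document) receive higher IDF values, matching the intuition that rarer words are more discriminating.
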
“Visual” indexing – e.g., Wordle, www.wordle.net
[Figure: scientific publications of your instructor (from SciVal app)]

Citation indexing
• Other content items that “cite” this one, e.g., references, links, etc.
• Indexing is at content item level
• Goal is to designate related or important content items
• Citation databases list all other articles that cite a specific article in journals
  – e.g., Science Citation Index, SCOPUS, and Google Scholar
• Novel feature of Google search engine (Brin, 1998) was giving higher weight to Web pages that have more links to them

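The idea of weighting an item by the links or citations pointing to it can be illustrated in two ways: a simple count of inbound links, and a simplified PageRank-style iteration in which a link from a highly ranked page counts for more. This is a textbook sketch, not the production algorithm of any search engine; the damping value of 0.85, the iteration count, and the four-page link graph are assumptions made for the example.

```python
def inlink_counts(links):
    """Count how many pages link to each page (the citation-count view)."""
    counts = {}
    for source, targets in links.items():
        counts.setdefault(source, 0)
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: links maps a page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Page with no outbound links: spread its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: A, B, and D all link to C; C links back to A.
links = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
counts = inlink_counts(links)
rank = pagerank(links)
top_page = max(rank, key=rank.get)
```

In this graph page C has the most inbound links and also the highest rank, while A ranks well above B and D because its single inbound link comes from the highly ranked C.
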
Limitations of human indexing
• Inconsistency
  – When MEDLINE records indexed in duplicate, consistency varies from 63% for central concept headings to 36% for heading-subheading combination (Funk, 1983)
  – Results verified even with modern indexing tools and methods (Marcetich, 2004)
• Inadequate indexing vocabulary
  – Up to 25% of all concepts not represented in MeSH (Hersh, 1994)
  – Ambiguities and other naming problems with genes, proteins, etc. (Yandell, 2002; Tuason, 2004)

Limitations of word indexing
• Synonymy – e.g., cancer/carcinoma
• Polysemy – e.g., lead
• Context – e.g., high blood pressure
• Focus – e.g., central vs. incidental concepts
• Granularity – e.g., antibiotics vs. specific ones

Research
• Evaluation
  – How valuable are systems to users?
  – How well do systems and users perform?
• Future directions
  – Applying IR techniques to electronic health records
  – Beyond retrieval – question-answering

Evaluation
• Questions often asked
  – Is system used?
  – Are users satisfied?
  – Do they find relevant information?
  – Do they complete their desired task?
• Most studied group is physicians, with systematic reviews of results (Hersh, 1998; Pluye, 2005)
• Most IR evaluation research has focused on retrieval of relevant documents, which may not capture full spectrum of usage
  – Often consists of challenge evaluations that develop “test collections” – best known is (non-medical) Text Retrieval Conference (TREC, http://trec.nist.gov) (Voorhees, 2005)

Is system used?
• Most studies done prior to ubiquitous Internet, electronic health records, mobile devices, etc.
• Studies in various clinical settings (Hersh, 2009; Magrabi, 2005) showed average use varied from 0.3 to 8.7 accesses per person-month
• Whatever the actual number, this paled in comparison to known physician information needs (Gorman, 1995) of two questions per every three patients
Are users satisfied?
• Most studies report good user satisfaction, but some interesting studies to note
  – Nielsen (1994) meta-analysis found association (though imperfect) between user satisfaction and ability to use computer systems
  – Most Internet users believe they mostly find information they are seeking (Taylor, 2010; Fox, 2011)
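Do they find relevant information?
• Most common approach to evaluation
• Usually measured by relevance-based measures of recall and precision
  – Recall (R) – proportion of relevant documents in the collection that are retrieved
  – Precision (P) – proportion of retrieved documents that are relevant
• And various aggregations, e.g., F, MAP, NDCG, etc.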
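For readers who want the measures spelled out, here is a minimal sketch of recall, precision, the balanced F measure, and average precision (the per-query quantity that MAP averages over queries). The function names and the document identifiers are hypothetical, chosen only for this example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, treating results as sets."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall

def f_measure(precision, recall):
    """Balanced F: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def average_precision(ranked, relevant):
    """Average of the precision values at each rank where a relevant item appears."""
    relevant_set = set(relevant)
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant_set:
            hits += 1
            total += hits / rank
    return total / len(relevant_set) if relevant_set else 0.0

# Hypothetical query: 4 documents retrieved, 5 relevant in the collection.
ranked_results = ["d1", "d2", "d3", "d4"]
relevant_docs = {"d1", "d3", "d7", "d9", "d10"}

p, r = precision_recall(ranked_results, relevant_docs)
print("precision:", p)
print("recall:", r)
print("F:", round(f_measure(p, r), 3))
print("average precision:", round(average_precision(ranked_results, relevant_docs), 3))
```

Two of the four retrieved documents are relevant, and two of the five relevant documents are retrieved, so precision and recall differ even though the number of hits is the same.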
Comments about recall and precision
• There tends to be a trade-off between the two
• “Relevance” can be an ambiguous notion (Hersh, 1994)
• It is unclear whether they correlate with a user’s success in using an IR system
• The proliferation of standard test collections leads to a great deal of research that excludes real users
How well do clinicians search? Early results from Haynes (1990)

Searcher Type        Recall   Precision
Novice clinicians    27%      38%
Expert clinicians    48%      48%
Librarians           49%      57%
Other findings
• Little overlap among retrieval sets
• Searchers tended to find similar quantities of disparate relevant documents
• Novice searchers satisfied with results
• Adequate information or ignorant bliss?
Extending evaluation beyond physicians and documents
• Other clinicians
  – Nurses – Rolye, 1995
  – Pharmacists – Wanke, 1988
  – Nurse practitioners – Hersh, 2000; Hersh, 2002
• Biomedical researchers
  – Very little study of their use of IR systems
  – Investigated by TREC Genomics Track (Hersh, 2006; Hersh, 2009)
  – http://ir.ohsu.edu/genomics/
• Image retrieval
  – ImageCLEFmed (Hersh, 2006; Hersh, 2009; Kalpathy-Cramer, 2015)
  – Retrieval performance related to query type, measure selection
  – http://ir.ohsu.edu/image/
Recall and precision studies yield useful results, but
• Are searchers able to solve their information problems by using system?
  – Some research has used a “task-oriented approach” to measure question-answering
  – Hersh (2002) – use of MEDLINE to answer clinical questions
    • Medical students answered 34% of questions correctly before system, 51% afterwards
    • Nurse practitioner students answered 34% of questions correctly before system, with no change after using it
    • Time to answer a question was ~30 minutes
    • No association of recall or precision with correct answering
Another task-oriented study
• Westbrook (2005) – use of online evidence system
  – Physicians answered 37% of questions before system, 50% afterwards
  – Nurse specialists answered 18% of questions before system, 50% afterwards
  – Those who had correct answers had higher confidence in their answers, but those not knowing answer initially had no difference in confidence whether answer right or wrong
How do IR systems impact physician practice? (Pluye, 2004)
• Qualitative study found four themes mentioned by physicians
  – Recall – of forgotten knowledge
  – Learning – new knowledge
  – Confirmation – of existing knowledge
  – Frustration – that system use not successful
• Researchers also noted two additional themes
  – Reassurance – that system is available
  – Practice improvement – of patient-physician relationship
Challenges for IR evaluation moving forward
• Must understand tasks of user and focus evaluation accordingly
• Ultimate measure, like any other informatics application, might be health outcome
  – This may be difficult with IR systems since usage may not directly impact outcomes of patient care or research activity
Research directions – applying IR to medical records
• Most medical records still in narrative documents, where natural language processing (NLP) techniques are improving but still imperfect (Stanfill, 2010)
• For some tasks, can we take an IR approach?
  – TREC Medical Records Track used de-identified corpus of medical records in initial task of identifying patients as candidates for clinical research studies (Voorhees, 2011; Voorhees, 2012)
TREC Medical Records Track test collection
(Courtesy, Ellen Voorhees, NIST)
[Figure: report extract, visit list, and record-visit map]
• Report extract
  – DISCHARGE SUMMARY ... PRINCIPAL DIAGNOSES: 1. Urinary tract infection. 2. Gastroenteritis. 3. Dehydration. 4. Hyperglycemia. 5. Diabetes mellitus. 6. Osteoarthritis. 7. History of anemia. 8. History of tobacco use.
  – HOSPITAL COURSE: The patient is a **AGE[in 40s]-year-old insulin-dependent diabetic who presented with nausea,...
• 17,198 visits
• 101,712 reports (93,552 mapped to visits)
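Because reports are mapped to visits, a retrieval system over this kind of collection needs some way to turn report-level matches into a visit-level ranking. The sketch below shows one simple option, scoring each visit by its best-matching report. This aggregation rule, the function name `rank_visits`, and all identifiers and scores are assumptions made for illustration; they are not taken from the track or its participants.

```python
def rank_visits(report_scores, report_to_visit):
    """Rank visits by the highest score among their mapped reports.

    report_scores: dict of report ID -> retrieval score for one query
    report_to_visit: dict of report ID -> visit ID (unmapped reports are absent)
    """
    visit_scores = {}
    for report_id, score in report_scores.items():
        visit_id = report_to_visit.get(report_id)
        if visit_id is None:
            continue  # report not mapped to any visit, so it cannot be ranked
        visit_scores[visit_id] = max(visit_scores.get(visit_id, score), score)
    # Highest score first; ties broken by visit ID for a stable order.
    return sorted(visit_scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical scores for four reports, one of which has no visit mapping.
scores = {"r1": 0.2, "r2": 0.9, "r3": 0.5, "r4": 0.7}
mapping = {"r1": "v1", "r2": "v1", "r3": "v2"}
print(rank_visits(scores, mapping))
```

Other aggregation choices, such as summing report scores or concatenating a visit's reports before indexing, are equally plausible under the same mapping.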
TREC Medical Records Track results
• Highly variable across different topics
  – Easiest – consistently best results
    • 105: Patients with dementia
  – Hardest – consistently worst results
    • 108: Patients treated for vascular claudication surgically
  – Large differences between best and worst results
    • 125: Patients co-infected with Hepatitis C and HIV
• Overall results show substantial room for improvement
  – Best results involve manual modification of queries
(Voorhees, 2011; Voorhees, 2012)
Subsequent work in medical records search
• Public test collections of medical records stymied by privacy concerns
• Funded for project using parallel corpora with common topics at OHSU and Mayo Clinic (Wu, 2017)
• Exploring options for Evaluation as a Service (EaaS) to allow others to use data without seeing it (Hanbury, 2015)
  – Similar situation to TREC Total Recall Track searching over email and corporate repositories (Roegeist, 2016)
More recent TREC tracks
• Clinical Decision Support, 2014-2016 (Roberts, 2016)
  – Given patient case, find relevant full-text articles (from PMC snapshot) about diagnosis, tests, or treatments
• Precision Medicine (2017)
Research directions – question-answering
• Users may retrieve documents, but usually want answers to questions
• Subarea of IR research has focused on question-answering systems (Strzalkowski, 2006)
• Highest-profile system is IBM Watson
  – Developed out of TREC Question-Answering Track (Voorhees, 2005; Ferrucci, 2010)
  – Additional (exhaustive) details in special issue of IBM Journal of Research and Development (Ferrucci, 2012)
  – Beat humans at Jeopardy! (Markoff, 2011)
  – Now being applied to healthcare (Lohr, 2012); has “graduated” medical school (Cerrato, 2012)
How does Watson work (Ferrucci, 2010)?
• Built around a system called DeepQA, which uses massively parallel computing to acquire knowledge from resources of a given domain
• Learning process builds around sample questions from the domain
  – A key step is to identify lexical answer types (LATs) in the domain
  – Among general questions, some common LATs include he, country, city, man, film, state, she, author, group, here, company, etc.
  – NLP then applied to text and knowledge representation and reasoning (KRR) applied to structured knowledge
  – Machine learning then applied to questions and their answers
Watson architecture (Ferrucci, 2010)
Applying Watson to medicine (Ferrucci, 2012)
• Trained using several resources from internal medicine: ACP Medicine, PIER, Merck Manual, and MKSAP
• Concept adaptation process required
  – Named entity detection – e.g., disambiguation of terms and their senses
  – Measure recognition and interpretation – e.g., age or blood test value
  – Recognition of unary relations – e.g., elevated <test result>
• Trained with 5000 questions from Doctor's Dilemma, a competition like Jeopardy! that is run by the ACP each year and in which medical trainees participate
  – Sample question is, Familial adenomatous polyposis is caused by mutations of this gene, with the answer being, APC Gene
    • Googling the question gives the correct answer at the top of its ranking for this and the two other sample questions listed
Evaluation of Watson on internal medicine questions (Ferrucci, 2012)
• Evaluated on an additional 188 unseen questions
• Primary outcome measure was recall at 10 answers
  – How would Watson compare against other systems, such as Google or PubMed, or using other measures, such as MRR?
• Future use case for Watson is applying system to data in EHR, ultimately aiming to serve as a clinical decision support system (Cerrato, 2012)
• Not much peer-reviewed literature since then…
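The two measures mentioned above can be sketched as follows. Here recall at k is taken to be the fraction of questions with a correct answer among the top k ranked answers, and MRR (mean reciprocal rank) is the mean of 1/rank of the first correct answer, counting 0 when none is returned. The function names and the sample data are hypothetical, and the exact definition used in the Watson evaluation may differ in detail.

```python
def recall_at_k(questions, k=10):
    """Fraction of questions with a correct answer among the top k ranked answers."""
    hits = sum(
        1 for ranked, correct in questions
        if any(answer in correct for answer in ranked[:k])
    )
    return hits / len(questions) if questions else 0.0

def mean_reciprocal_rank(questions):
    """Mean of 1/rank of the first correct answer (0 if no correct answer is returned)."""
    total = 0.0
    for ranked, correct in questions:
        for rank, answer in enumerate(ranked, start=1):
            if answer in correct:
                total += 1.0 / rank
                break
    return total / len(questions) if questions else 0.0

# Hypothetical system output: (ranked answers, set of acceptable answers) per question.
sample = [
    (["APC gene", "TP53"], {"APC gene"}),             # correct at rank 1
    (["answer x", "answer y", "answer z"], {"answer z"}),  # correct at rank 3
    (["answer p", "answer q"], {"answer r"}),         # never correct
]
print("recall@10:", round(recall_at_k(sample, k=10), 3))
print("recall@2:", round(recall_at_k(sample, k=2), 3))
print("MRR:", round(mean_reciprocal_rank(sample), 3))
```

Recall at 10 rewards any correct answer in the top 10 equally, whereas MRR gives more credit the higher the first correct answer is ranked, so the two can order systems differently.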