SciReader : A Recommender system for Biomedical...

Post on 16-Mar-2018

SciReader: A Recommender system for Biomedical literature

Desai, P.1,2, Lehmann, B.2, Telis, N.2, Pritchard, J.P.2

1 Stanford Center for Genomics and Personalized Medicine (SCGPM), 2 Department of Genetics, Stanford University

Motivation

With the recent explosion in biomedical research, it has become increasingly important, and yet challenging, to keep up with the relevant literature. SciReader is a personalized recommender system that specifically aims to help researchers and practitioners in the biomedical community parse through the large volume of literature and filter publications that may be relevant and of interest to them.

SciReader was initially developed at the Pritchard lab (Genetics department, Stanford School of Medicine) and is now maintained and operated by the SCGPM. It is currently being migrated to Google Cloud and should be available soon (http://scireader.org).

Introduction

• SciReader is a cloud-based service that uses novel algorithms to classify and cluster published biomedical corpora using topic modeling (Latent Dirichlet Allocation).

• Users provide basic info, i.e. topics/keywords of interest and journal papers.

• Results are best when the user creates a ‘library’ and uploads papers of interest to it. SciReader can then create personalized recommendations based on relevancy, recency, impact factor and sentiment analysis, updated daily.

• Weekly email digests of important publications in your field of research.

• Relevant trending Twitter feeds provided in real time.
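
One way to picture how the four signals named above (relevancy, recency, impact factor, sentiment) could be folded into a single ranking is a weighted score. The sketch below is an illustration only: the `Paper` fields, the weights, the 30-day recency half-life and the impact-factor cap of 50 are all assumptions made for this example, not SciReader's actual scoring.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    relevance: float      # similarity to the user's library, assumed in [0, 1]
    age_days: float       # days since publication
    impact_factor: float  # journal impact factor
    sentiment: float      # assumed sentiment score in [-1, 1]


# Hypothetical weights; they sum to 1 so the score stays in [0, 1].
WEIGHTS = {"relevance": 0.5, "recency": 0.25, "impact": 0.15, "sentiment": 0.10}


def score(paper, half_life_days=30.0, impact_cap=50.0):
    """Combine the four signals into one number in [0, 1]."""
    recency = 0.5 ** (paper.age_days / half_life_days)  # 1.0 today, 0.5 after one half-life
    impact = min(paper.impact_factor, impact_cap) / impact_cap
    sentiment = (paper.sentiment + 1.0) / 2.0           # rescale [-1, 1] to [0, 1]
    return (WEIGHTS["relevance"] * paper.relevance
            + WEIGHTS["recency"] * recency
            + WEIGHTS["impact"] * impact
            + WEIGHTS["sentiment"] * sentiment)


def recommend(papers, top_n=10):
    """Return the top_n papers, highest score first."""
    return sorted(papers, key=score, reverse=True)[:top_n]


# Made-up candidates for illustration.
papers = [
    Paper("old but relevant", 0.9, 365, 5.0, 0.0),
    Paper("new and relevant", 0.9, 1, 5.0, 0.0),
    Paper("new but off-topic", 0.1, 1, 5.0, 0.0),
]
ranked = recommend(papers, top_n=2)
```

With these assumed weights, relevance dominates and recency breaks ties between equally relevant papers; a daily refresh would simply recompute the scores as `age_days` grows.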

Topic Modeling of PubMed/bioRxiv using LDA

• The cornerstone of SciReader is its topic model of PubMed.

• Topic models represent a class of computer programs that seem to ‘automagically’ extract underlying themes or topics from large unstructured texts.

• LDA: Latent Dirichlet Allocation, a topic model algorithm

• Mathematically:

• We used titles and abstracts from all the articles published in PubMed in 2012 (~1.2 million) to train an LDA model, which was then used to create a ‘topic inferencer’.

• Our topic model has 150 topics, which were grouped into 20 ‘supertopics’.

• All articles from PubMed and bioRxiv have been ‘topic modeled’ using this inferencer.
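
The train-then-infer workflow described above can be sketched on a toy corpus. In standard LDA, each document is a mixture of topics and each topic is a distribution over words, both with Dirichlet priors. The poster does not say which LDA implementation or inference method was used, so this example swaps in a small collapsed Gibbs sampler written in plain Python. The function names `train_lda` and `infer_topics`, the priors `alpha` and `beta`, the four tiny documents and the choice of 2 topics (instead of 150) are all assumptions for illustration.

```python
import random


def train_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                           # tokens per topic
    doc_topic = [[0] * n_topics for _ in docs]             # topic counts per document
    assign = []                                            # topic of every token
    for d, doc in enumerate(docs):                         # random initialization
        assign.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, assign[d]):
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
            doc_topic[d][k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assign[d][i]
                # Take the token out of the counts, then resample its topic.
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                doc_topic[d][k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                topic_word[k][v] += 1
                topic_total[k] += 1
                doc_topic[d][k] += 1
    return vocab, topic_word, topic_total


def infer_topics(doc, model, n_iter=50, alpha=0.1, beta=0.01, seed=0):
    """Estimate the topic mixture of a new document, keeping the topics fixed."""
    vocab, topic_word, topic_total = model
    rng = random.Random(seed)
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words, n_topics = len(vocab), len(topic_word)
    tokens = [word_id[w] for w in doc if w in word_id]  # skip unseen words
    # Smoothed word probabilities per topic, frozen from training.
    phi = [[(topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
            for v in range(n_words)] for k in range(n_topics)]
    z = [rng.randrange(n_topics) for _ in tokens]
    counts = [0] * n_topics
    for k in z:
        counts[k] += 1
    for _ in range(n_iter):
        for i, v in enumerate(tokens):
            counts[z[i]] -= 1
            weights = [(counts[k] + alpha) * phi[k][v] for k in range(n_topics)]
            z[i] = rng.choices(range(n_topics), weights=weights)[0]
            counts[z[i]] += 1
    total = len(tokens) + n_topics * alpha
    return [(counts[k] + alpha) / total for k in range(n_topics)]


# Toy stand-in for titles and abstracts.
docs = [
    "leukemia lymphoma blood cancer mutation".split(),
    "blood cancer leukemia genome mutation".split(),
    "pregnancy birth maternal fetal delivery".split(),
    "maternal pregnancy fetal birth outcome".split(),
]
model = train_lda(docs, n_topics=2)
theta = infer_topics("fetal birth pregnancy".split(), model)
```

Here `theta` is the inferred topic mixture of the new document. Keeping the trained topics frozen during inference is what lets a model fit on one year of articles be applied to articles from any other year.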

SciReader Screenshots

Basic overview of the Recommender pipeline

Example topics ‘discovered’ by LDA: Blood cancers genomics; Obstetrics

Summary

• SciReader is a great way for researchers and medical practitioners to stay abreast of advances in their field.

• SciReader’s topic model and database can be used as a research tool to perform longitudinal studies on the history of disease based on publication data, and other bibliometric studies.
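
As an illustration of the kind of longitudinal study mentioned above, the sketch below averages per-article topic proportions by publication year to trace how much of the literature a topic occupies over time. The record layout (`year`, `topics`), the topic names and the numbers are made up for this example; the poster does not describe SciReader's database schema.

```python
from collections import defaultdict


def topic_share_by_year(records, topic):
    """Mean proportion of `topic` across the articles published in each year."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        totals[rec["year"]] += rec["topics"].get(topic, 0.0)
        counts[rec["year"]] += 1
    return {year: totals[year] / counts[year] for year in sorted(totals)}


# Made-up records: one dict per article, holding its inferred topic mixture.
records = [
    {"year": 2010, "topics": {"blood cancers": 0.75, "obstetrics": 0.25}},
    {"year": 2010, "topics": {"obstetrics": 1.0}},
    {"year": 2011, "topics": {"blood cancers": 0.5, "obstetrics": 0.5}},
]
trend = topic_share_by_year(records, "blood cancers")
```

Articles that do not mention the topic count as zero, so the result reflects the topic's share of all publications in a year rather than only of the articles that touch it.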