Executive Summary: Natural Language Processing of Twitter #swineflu (H1N1) Posts using Semantic...
Influenza A(H1N1)
Executive Summary: Natural Language Processing of Twitter #swineflu Posts using the Semantic MEDLINE Prototype
Dr. Alla Keselman, Dr. Thomas Rindflesch, David Hale
National Library of Medicine, National Institutes of Health, Department of Health and Human Services
May 2009
http://twitter.com/CDCemergency
H1N1 information via Twitter: Communication issues
• Information receivers
  – Information overload
    • >12,000 #swineflu (H1N1) posts/hour @ peak
  – Signal:Noise ratio
    • Quality?
    • Authority?
      – Twitter accounts impersonating CDC
• Information providers
  – Effective information provision
  – Biosurveillance
(un)Controlled Vocabulary
• Folksonomy
• Hashtags (#)
• Grammar
• Abbreviations
  – SRSLY IMO ROI 4 RT? YMMV
• High context
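The hashtag and abbreviation issues listed above can be illustrated with a minimal normalization sketch. It assumes a small hand-built abbreviation table; the expansions and the `normalize_tweet` helper are hypothetical and are not part of MetaMap, SemRep, or the processing described in this deck.

```python
import re

# Hypothetical abbreviation table; the expansions are illustrative
# assumptions, not a vocabulary used by the authors.
ABBREVIATIONS = {
    "srsly": "seriously",
    "imo": "in my opinion",
    "ymmv": "your mileage may vary",
    "rt": "retweet",
}

HASHTAG = re.compile(r"#(\w+)")


def normalize_tweet(text):
    """Return (hashtags, expanded_text) for one tweet."""
    hashtags = [tag.lower() for tag in HASHTAG.findall(text)]
    # Drop the '#' so the tag reads as an ordinary word.
    stripped = HASHTAG.sub(r"\1", text)
    words = []
    for token in stripped.split():
        # Separate trailing punctuation so "YMMV." still matches the table.
        core = token.rstrip(".,!?:;")
        tail = token[len(core):]
        words.append(ABBREVIATIONS.get(core.lower(), core) + tail)
    return hashtags, " ".join(words)
```

A pass like this would run before any semantic processing. It does not address grammar or the high-context nature of posts.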
#swineflu Tweets
Acquisition Challenges
• Twitter timeline
  – Storage requirements
  – Privacy
• Twitter API
  – Limited search functionality
    • Temporal and range limitations
      – Range definition limited to midnight
      – 1500 posts from limit
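To give a sense of scale for these limits, here is a rough sketch that combines the peak rate quoted earlier (>12,000 posts/hour) with the 1500-post figure. It assumes the 1500 figure is a cap on results returned per search, which the slide does not state explicitly. The `daily_windows` helper is likewise hypothetical: it only illustrates what midnight-aligned range definitions look like and does not call any Twitter API.

```python
from datetime import date, timedelta

PEAK_POSTS_PER_HOUR = 12_000   # peak rate quoted earlier in the deck
RESULT_CAP = 1_500             # assumed meaning of the 1500-post limit


def minutes_covered(cap=RESULT_CAP, rate_per_hour=PEAK_POSTS_PER_HOUR):
    """Minutes of traffic that fit under the cap at a given posting rate."""
    return cap / rate_per_hour * 60


def daily_windows(start, end):
    """Yield (day, next_day) pairs, midnight to midnight, for start..end inclusive."""
    day = start
    while day <= end:
        yield day, day + timedelta(days=1)
        day += timedelta(days=1)
```

Under these assumptions, a single capped search at peak volume would span about 7.5 minutes of posts, while the smallest definable range is a full day.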
Semantic MEDLINE Prototype
• Summarizes MEDLINE citations returned by PubMed search
• Natural Language Processing (MetaMap, SemRep) used to analyze salient content in titles and abstracts
• Information presented in graph that has links to the MEDLINE text processed
• Visualize relationships, such as:
  – A is a process of B
  – X treats Y
http://skr3.nlm.nih.gov/SemMedDemo/
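One way to picture the graph described above is as a set of subject-predicate-object triples, each linked back to the text it came from. The sketch below is an assumption about a convenient data shape, not the prototype's actual implementation; the `Predication` type, the predicate labels, and the `source_id` field are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Predication(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source_id: str  # hypothetical identifier of the text it came from


def build_graph(predications):
    """Map each subject to its outgoing (predicate, object) edges and their sources."""
    edges = defaultdict(lambda: defaultdict(list))
    for p in predications:
        edges[p.subject][(p.predicate, p.obj)].append(p.source_id)
    # Convert to plain dicts so missing keys raise instead of silently appearing.
    return {subject: dict(out) for subject, out in edges.items()}
```

For example, two texts asserting "X treats Y" would collapse into a single edge from X to Y that carries both source identifiers, which is what lets a graph node link back to the processed text.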
Semantic processing of #swineflu Tweets
• Sample: 1267 Tweets
  – Afternoon of April 27, 2009
• No adjustments made to NLP software (MetaMap, SemRep)
  – No additional vocabulary, abbreviations, etc.
Preliminary Processing of #swineflu Tweets
Concepts in Tweets Isolated by Semantic Processing
• Disease: influenza
• Disease symptom: coughing
• Geographic area: Mexico
• Animal: family suidae
• Health care organization: Centers for Disease Control and Prevention (U.S.)
• Medical device: mask
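The concept list above pairs each isolated concept with a category. A minimal sketch of indexing such pairs by category follows; the pairs are copied from the slide, while the `group_by_type` helper and the tuple layout are assumptions for illustration, not the output format of MetaMap or SemRep.

```python
from collections import defaultdict

# (category, concept) pairs copied from the slide above.
ISOLATED = [
    ("Disease", "influenza"),
    ("Disease symptom", "coughing"),
    ("Geographic area", "Mexico"),
    ("Animal", "family suidae"),
    ("Health care organization",
     "Centers for Disease Control and Prevention (U.S.)"),
    ("Medical device", "mask"),
]


def group_by_type(pairs):
    """Group concepts under their category, with duplicates removed."""
    grouped = defaultdict(set)
    for category, concept in pairs:
        grouped[category].add(concept)
    return {category: sorted(concepts) for category, concepts in grouped.items()}
```

With a larger sample, the same grouping would let a reader scan, say, every geographic area or every medical device mentioned across the tweets.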
Next Steps
• Processing of larger dataset
  – Include non-H1N1-related Tweets
• Additional vocabulary
  – Folksonomy, abbreviations, etc.
• Visualization of semantic processing results
Opportunities
• Biosurveillance
• Monitoring of widespread sentiment
• Targeted information provision
  – Respond to misinformation trends
• Evaluation of accuracy/authenticity
Links
• Semantic MEDLINE Prototype
  – http://skr3.nlm.nih.gov/SemMedDemo/
• Semantic Medline: Multi-Document Summarization and Visualization
  – http://www.nlm.nih.gov/pubs/techbull/mj07/theater_ppt/semantic.ppt
• National Library of Medicine
  – http://www.nlm.nih.gov
• National Institutes of Health
  – http://nih.gov
• Department of Health and Human Services
  – http://hhs.gov
Dr. Alla Keselman: keselmana AT mail DOT nlm DOT nih DOT gov
Dr. Thomas Rindflesch: trindflesch AT mail DOT nih DOT gov
David Hale: david DOT hale AT nih DOT gov