Ambient Geographic Information and Biosurveillance Capstone Presentation Todd Barr March 20, 2013.
Ambient Geographic Information and Biosurveillance
Capstone Presentation
Todd Barr
March 20, 2013
“Classic” Biosurveillance
• Reports Only the Cases that are handled by Medical Professionals
• Data is sent to the Centers for Disease Control and Prevention
• Data is Aggregated to the State level
• Standard Turnaround time is anywhere from 7 to 10 days, depending on the data and the level of the crisis
Ambient Geographic Information
• Ambient Geographic Information (AGI) differs from Volunteered Geographic Information (VGI)
• Most Commonly Captured from Twitter, Facebook and Foursquare
• Can be used to trace vectors through Social Networks
• Can Determine “Hot Spots” of activity via Hashtags, keywords and modifiers
• Starting to be used in Biosurveillance, but still does not have buy-in from the “establishment”
Risk Terrain Modeling
• Originally Used to Predict Crime
• Core Concept is that Certain activities are related to Geographic Features (Assaults tend to occur near certain Liquor Stores, Bars or Entertainment Venues)
• Leads to a Spatial Understanding for Strategic Decision Making
• Allows Decision Makers to make the best use of their Resources
AGI and RTM Enhancing Biosurveillance
• AGI
  • Allows Real Time Disease Information to be consumed and Analyzed both Spatially and as Text
  • No turnaround time
  • Not Aggregated to a State level
• RTM
  • Generation of an RTM Map for Public Health by County
  • People in the lesser served areas are less likely to seek medical attention and less likely to have symptoms/ailments reported
Data Collection - RTM
• Used the Criteria from the Publication “County Health Rankings and Roadmaps: a Healthier Nation County by County”
  • 32 influencers on health and health care quality
  • Examples
    • Number of Medical Doctors in County
    • Proximity to Medical Care
    • Percentage of Population with Health Insurance
• Divided Counties into Quartiles
  • 152 counties had no Data
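The quartile step can be sketched as follows. This is a minimal illustration, not the method used in the study: it assumes each county has already been reduced to a single composite score (the slides do not say how the 32 influencers were combined), and the function name `assign_quartiles`, the county names and the scores are all hypothetical. Whether Quartile 1 means best served or least served is also an assumption left open here.

```python
from statistics import quantiles


def assign_quartiles(scores):
    """Label each county 'Quartile 1'..'Quartile 4', or 'No Data' if its score is None."""
    known = [s for s in scores.values() if s is not None]
    # Three cut points split the known scores into four groups.
    q1, q2, q3 = quantiles(known, n=4, method="inclusive")
    labels = {}
    for county, score in scores.items():
        if score is None:
            labels[county] = "No Data"
        elif score <= q1:
            labels[county] = "Quartile 1"
        elif score <= q2:
            labels[county] = "Quartile 2"
        elif score <= q3:
            labels[county] = "Quartile 3"
        else:
            labels[county] = "Quartile 4"
    return labels


# Hypothetical composite scores, not real county data.
example_scores = {
    "C1": 1, "C2": 2, "C3": 3, "C4": 4,
    "C5": 5, "C6": 6, "C7": 7, "C8": 8,
    "C9": None,  # a county with no data
}
example_labels = assign_quartiles(example_scores)
```

Counties without a score are kept in their own "No Data" group rather than dropped, mirroring the 152 counties reported above.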
Data Collection - AGI
• Used a Python Script to Collect Tweets within the US and populate a spreadsheet
• Collected an average of 40,000 tweets a night
• Roughly 5% of those Tweets had location data
• Used Hashtags, Keywords and Modifiers to determine if they were talking about the Flu, or getting a Flu shot
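A minimal sketch of the keyword-and-modifier filter is shown below. The original collection script is not part of the slides, so this is only an illustration: the function name `classify_tweet` is hypothetical, the keyword set is taken from the Key Words chart in this deck, and treating "shot" and "vaccination" as the flu-shot modifiers is an assumption.

```python
import re

# Key words taken from the "Data Metrics – Key Words" slide.
FLU_KEYWORDS = {"flu", "influenza", "h1n1", "h3n2", "h5n1", "adenovirus"}
# Assumed modifiers that indicate a tweet is about getting vaccinated.
SHOT_MODIFIERS = {"shot", "vaccination"}


def classify_tweet(text):
    """Return 'flu shot', 'flu', or None for a tweet's text."""
    # Lowercase and split into word tokens; '#flu' yields the token 'flu'.
    tokens = set(re.findall(r"[a-z0-9']+", text.lower()))
    if not tokens & FLU_KEYWORDS:
        return None  # no flu key word or hashtag present
    if tokens & SHOT_MODIFIERS:
        return "flu shot"
    return "flu"
```

Matching on whole tokens instead of substrings keeps a word like "influence" from counting as a flu mention.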
The Study
• Collection of Flu Related Geolocated Tweets within the United States from the week of January 5 to the week ending February 2
• Determined how many of those Tweets were in each Quartile
• Compared the Results to the CDC Data from the same timeframe
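The per-quartile tally could look something like the sketch below. It assumes each geolocated tweet has already been resolved to a county and that a county-to-quartile lookup exists; the function names, county names and counts are hypothetical, not figures from the study.

```python
from collections import Counter


def tweets_per_quartile(tweet_counties, county_quartile):
    """Count tweets per risk quartile; counties missing from the lookup count as 'No Data'."""
    return Counter(county_quartile.get(c, "No Data") for c in tweet_counties)


def quartile_shares(counts):
    """Convert raw counts into each quartile's fraction of all tweets."""
    total = sum(counts.values())
    return {quartile: n / total for quartile, n in counts.items()}


# Hypothetical example: one entry per geolocated tweet.
example_tweets = ["A", "A", "B", "C", "Z"]
example_lookup = {"A": "Quartile 1", "B": "Quartile 2", "C": "Quartile 4"}

example_counts = tweets_per_quartile(example_tweets, example_lookup)
example_shares = quartile_shares(example_counts)
```

Working with shares rather than raw counts makes it easier to judge how evenly tweets fall across the quartiles.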
Data Cleaning - AGI
• Total Usable Tweets: 25,000
• Geocoding Issues
  • Most had City and State
  • Some just had State
  • Others had full State Names which did not Geocode
  • Others had Clinics for Cities and Cities for States
• Used both ESRI Online Geocoding as well as CartoDB
  • ESRI Online Geolocated 75% of the total tweets
  • CartoDB Geolocated 90% of the total tweets
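One possible way to handle the full-state-name problem is to normalize the location string before sending it to a geocoder. The slides do not say whether this was done, so the sketch below is an assumption; `normalize_location` is a hypothetical helper, and the lookup table is deliberately partial.

```python
# Partial lookup for illustration only; a real table would cover every state.
STATE_ABBREVIATIONS = {
    "maryland": "MD",
    "virginia": "VA",
    "new york": "NY",
    "california": "CA",
}


def normalize_location(raw):
    """Rewrite 'City, Full State Name' as 'City, ST'; leave unknown values unchanged."""
    parts = [p.strip() for p in raw.split(",")]
    state = parts[-1]
    abbreviation = STATE_ABBREVIATIONS.get(state.lower())
    if abbreviation:
        parts[-1] = abbreviation
    elif len(state) == 2:
        # Already looks like a two-letter code; just fix the case.
        parts[-1] = state.upper()
    return ", ".join(parts)
```

For example, `normalize_location("Baltimore, Maryland")` gives `"Baltimore, MD"`. Entries with clinics in the city field or cities in the state field would still need manual review.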
Data Metrics – Key Words
[Bar chart: tweet counts by Key Word and Hashtag (flu, Influenza, h1n1, H3N2, H5N1, Adenovirus); count axis from 0 to 30,000]
Data Metrics - Modifiers
[Bar chart: Tweet Modifiers; count axis from 0 to 3,000. Legible modifier labels include sick, have, shot, Health, I hope, doctor, I feel, son, dr, I think, Don't, Nurse, meds, isn't, vaccination, worse, diagnosed, Wellness, suffering, teen, Physician, daycare, reduce and provider; the remaining labels are not recoverable]
Data Metrics – by State
[Bar chart: tweet counts by State; count axis from 0 to 3,000. State labels shown on the axis: AK, AR, MD, MA, CA, GA, TN, NJ, WI, HI, ID, KS, LA, ME, MN, MT, NE, NM, OK, WA, SC, UT, WY]
Data Metrics – by Quartile
[Chart: Total Tweets By Quartile; categories: Quartile 1, Quartile 2, Quartile 3, Quartile 4, No Data]
Maps – All Tweets
Map – Tweets January 5th
Map – CDC ILI January 5
Maps – Tweets January 12
Maps – CDC ILI January 12
Maps – Tweets January 19
Maps – CDC ILI January 19
Maps – Tweets January 26
Maps – CDC ILI January 26
Maps – Tweets February 2
Maps – CDC ILI February 2
Conclusions
• Social Media can be used as a new tool in the Biosurveillance Toolkit
• Tweets are nearly evenly distributed between the Risk Quartiles
• Social Media shows trends that are reflected in the CDC Data