Forecasting Influenza Dynamics with Neural Networks using...



Forecasting Influenza Dynamics with Neural Networks using Signals from Social Media

Ellyn Ayton and Svitlana Volkova


ABOUT Pacific Northwest National Laboratory

The Pacific Northwest National Laboratory, located in southeastern Washington State, is a U.S. Department of Energy Office of Science laboratory that solves complex problems in energy, national security, and the environment, and advances scientific frontiers in the chemical, biological, materials, environmental, and computational sciences. The Laboratory employs nearly 5,000 staff members, has an annual budget in excess of $1 billion, and has been managed by Ohio-based Battelle since 1965.

For more information on the science you see here, please contact:
Svitlana Volkova
Pacific Northwest National Laboratory
Richland, WA 99352
(509) 372-6585

Motivation

• 500,000 deaths worldwide from influenza
• Influenza monitoring data is 1-2 weeks old when released
• Use social media signals as a source for tracking the severity and spread of influenza

Contributions:
• Contrasted the predictive power of social media signals
• Experimented with location-specific vs. location-independent models
• Compared neural network performance to state-of-the-art machine learning models

Data

Twitter

[Figure: The number of tweets collected within a 25-mile radius for 31 geolocations. Axes: locations and tweet counts (ticks from 0 to 1.5e+07).]

ILI-Related Clinical Visit Data

[Figure: Weekly ILI proportion dynamics between 2011 and 2014 for six example geo-locations (L10, L12, L13, L15, L22, L23). Axes: week (0 to 200) and ILI proportion (ticks at 3 × 10^-2, 6 × 10^-2, 9 × 10^-2).]

$$\mathrm{ILI} = \frac{\text{\# of weekly ILI visits per location}}{\text{\# of total weekly visits per location}}$$
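As a small illustration of the ratio above, the sketch below computes weekly ILI proportions from visit counts. The function name `ili_proportion` and the sample counts are hypothetical and are not taken from the poster's data.

```python
def ili_proportion(ili_visits, total_visits):
    """Return the weekly ILI proportion for one location.

    ili_visits:   number of ILI-related clinical visits that week
    total_visits: number of all clinical visits that week
    """
    if total_visits <= 0:
        raise ValueError("total_visits must be positive")
    return ili_visits / total_visits


# Hypothetical (ILI visits, total visits) pairs for three weeks at one location.
weekly_counts = [(30, 1000), (45, 900), (12, 800)]
proportions = [ili_proportion(ili, total) for ili, total in weekly_counts]
```

Each resulting value is a fraction between 0 and 1, on the same scale as the ILI axis of the figure above.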

Predictors

• ILI: ILI proportions
• Tweets: Unigrams, bigrams, trigrams, and TF-IDF
• Network: Hashtags and mentions as tweet ngrams
• Topics: LDA topic representations
• Embeddings: Pre-trained embeddings using Word2Vec
• Stylistic: Proportion of emotions, elongations, mentions, URLs, RTs, capitalization, etc.
• Syntactic: Part-of-speech tags
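To make a few of the predictor types above concrete, here is a minimal sketch of n-gram, network, and stylistic feature extraction for tweets. The poster does not describe its tokenization or feature definitions, so the whitespace tokenizer, the regular expression, the function names, and the sample tweets are all assumptions made for illustration.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tweet_features(text):
    """Count word n-grams (n = 1, 2, 3) and network tokens for one tweet."""
    lowered = text.lower()
    tokens = lowered.split()  # assumption: simple whitespace tokenization
    counts = Counter()
    for n in (1, 2, 3):
        counts.update(ngrams(tokens, n))
    # Network signal: hashtags and mentions found in the tweet.
    network = Counter(re.findall(r"[#@]\w+", lowered))
    return counts, network


def stylistic_proportions(tweets):
    """Proportion of tweets containing a mention, a URL, or an RT marker."""
    total = len(tweets)
    if total == 0:
        return {"mention": 0.0, "url": 0.0, "rt": 0.0}
    return {
        "mention": sum("@" in t for t in tweets) / total,
        "url": sum("http" in t for t in tweets) / total,
        "rt": sum(t.startswith("RT ") for t in tweets) / total,
    }


# Hypothetical tweets, not taken from the poster's data.
tweets = [
    "RT @cdc get your #flu shot http://example.org",
    "home sick with the flu",
]
counts, network = tweet_features(tweets[0])
style = stylistic_proportions(tweets)
```

The raw counts could then be reweighted (for example with TF-IDF) before being passed to a model; that step is omitted here.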

Models

Baselines: AdaBoost with Decision Trees, and Support Vector Machines with linear and RBF kernels.

LSTM: A one-layer Long Short-Term Memory (LSTM) neural network for regression, evaluated using 4-fold cross-validation. LSTMs capture long-term dependencies.

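$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
C_t &= i_t * \tilde{c}_t + f_t * C_{t-1} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + V_o C_t + b_o) \\
h_t &= o_t * \tanh(C_t)
\end{aligned}
$$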
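The gate equations above can be traced with a single-step sketch in plain Python. This is an illustration only, not the authors' implementation: it assumes the forget gate takes x_t as input like the other gates, represents matrices as lists of lists, and uses hypothetical names (`lstm_step`, `zero_params`, the parameter dictionary keys). A practical model would use a tensor library instead.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, x, U, h, b):
    """Compute W x + U h + b for list-of-lists matrices and list vectors."""
    return [
        sum(w * xj for w, xj in zip(W_row, x))
        + sum(u * hj for u, hj in zip(U_row, h))
        + bk
        for W_row, U_row, bk in zip(W, U, b)
    ]


def lstm_step(x, h_prev, C_prev, p):
    """One LSTM time step; p maps names such as "W_i" to parameters."""
    i = [sigmoid(z) for z in affine(p["W_i"], x, p["U_i"], h_prev, p["b_i"])]
    c_tilde = [math.tanh(z)
               for z in affine(p["W_c"], x, p["U_c"], h_prev, p["b_c"])]
    f = [sigmoid(z) for z in affine(p["W_f"], x, p["U_f"], h_prev, p["b_f"])]
    # New cell state: gated candidate plus gated previous state.
    C = [ik * ck + fk * Ck for ik, ck, fk, Ck in zip(i, c_tilde, f, C_prev)]
    # Output gate, including the V_o C_t term from the equations.
    peep = [sum(v * Cj for v, Cj in zip(V_row, C)) for V_row in p["V_o"]]
    pre_o = affine(p["W_o"], x, p["U_o"], h_prev, p["b_o"])
    o = [sigmoid(a + b) for a, b in zip(pre_o, peep)]
    h = [ok * math.tanh(Ck) for ok, Ck in zip(o, C)]
    return h, C


def zero_params(n_in, n_hidden):
    """Hypothetical helper: all-zero parameters, useful for a sanity check."""
    p = {}
    for g in "icfo":
        p["W_" + g] = [[0.0] * n_in for _ in range(n_hidden)]
        p["U_" + g] = [[0.0] * n_hidden for _ in range(n_hidden)]
        p["b_" + g] = [0.0] * n_hidden
    p["V_o"] = [[0.0] * n_hidden for _ in range(n_hidden)]
    return p


# With zero weights every gate equals 0.5 and the candidate is 0,
# so the cell state is simply halved at each step.
h, C = lstm_step([0.03, 0.01], [0.0], [1.0], zero_params(2, 1))
```

The zero-parameter case is a convenient check because each gate value can be worked out by hand from the equations.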

[Figure: Model architecture. SM predictors (tweets, network, embeddings) and ILI predictors at successive weekly time steps feed an LSTM layer, followed by a fully connected layer that outputs the predicted weekly ILI proportions.]

Nowcasting: Predicting this week’s ILI value using 4 weeks of past data.

Forecasting: Predicting the ILI values for next week and in two weeks using 4 weeks of past data.
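One way to read the nowcasting and forecasting setups is as a sliding window over a weekly series. The sketch below pairs each 4-week window with a target at a chosen horizon. The poster does not specify the exact alignment or which predictors fill the window, so the function `make_windows`, its parameters, and the sample values are assumptions.

```python
def make_windows(series, n_past=4, horizon=0):
    """Pair each window of n_past weekly values with a later target value.

    horizon=0 targets the week right after the window (nowcasting);
    horizon=1 and horizon=2 target one and two weeks after that (forecasting).
    """
    samples = []
    last_start = len(series) - n_past - horizon
    for start in range(last_start):
        window = series[start:start + n_past]
        target = series[start + n_past + horizon]
        samples.append((window, target))
    return samples


# Hypothetical weekly ILI proportions for one location.
ili = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07]

nowcast = make_windows(ili)                  # predict this week
next_week = make_windows(ili, horizon=1)     # predict next week
two_weeks = make_windows(ili, horizon=2)     # predict in two weeks
```

Larger horizons leave fewer usable samples, since the target must still fall inside the series.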

Evaluation Metrics

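Pearson Correlation – the linear dependence between the predicted and observed values:

$$r = \frac{\sum_{i=1}^{n} (Y_{t_i} - \bar{Y}_{t_i})(\hat{Y}_{t_i} - \bar{\hat{Y}}_{t_i})}{\sqrt{\sum_{i=1}^{n} (Y_{t_i} - \bar{Y}_{t_i})^2}\,\sqrt{\sum_{i=1}^{n} (\hat{Y}_{t_i} - \bar{\hat{Y}}_{t_i})^2}}$$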
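The Pearson correlation between observed and predicted values can be computed directly from its definition. The following is a minimal sketch using only the standard library; the function name `pearson_r` and the sample values are illustrative.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((y - mean_obs) * (p - mean_pred)
              for y, p in zip(observed, predicted))
    ss_obs = sum((y - mean_obs) ** 2 for y in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if ss_obs == 0 or ss_pred == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_obs * ss_pred)


# Hypothetical values: predictions that are an exact linear function
# of the observations give a correlation of 1 (up to rounding).
r_up = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
r_down = pearson_r([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0])
```

Because correlation ignores scale and offset, a model can score a high r while still being biased, which is why an error measure is reported alongside it.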

Root Mean Squared Error (RMSE) – the difference between the predicted and observed values:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Y_{t_i} - \hat{Y}_{t_i})^2}$$

RMSPE: root mean squared percent error

MAPE: maximum absolute percent error
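The error metrics can be sketched in a few lines of Python. RMSE follows the formula given above. The poster names RMSPE and MAPE without giving formulas, so the percent-error definitions below (errors relative to the observed value, times 100) are an assumption, as are the function names and sample values.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(
        sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n
    )


def percent_errors(observed, predicted):
    """Absolute percent error per point; observed values must be nonzero."""
    return [abs(y - p) / abs(y) * 100.0 for y, p in zip(observed, predicted)]


def rmspe(observed, predicted):
    """Root mean squared percent error (assumed definition)."""
    errs = percent_errors(observed, predicted)
    return math.sqrt(sum(e ** 2 for e in errs) / len(errs))


def max_ape(observed, predicted):
    """Maximum absolute percent error, as MAPE is defined on the poster."""
    return max(percent_errors(observed, predicted))


# Hypothetical observed and predicted weekly ILI proportions.
observed = [0.02, 0.04, 0.05]
predicted = [0.03, 0.04, 0.04]

scores = {
    "rmse": rmse(observed, predicted),
    "rmspe": rmspe(observed, predicted),
    "mape": max_ape(observed, predicted),
}
```

Percent-based errors are convenient for comparing locations whose ILI proportions differ in scale, but they are undefined when an observed value is zero.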

Experimental Results

[Figure: True vs. predicted ILI dynamics as a function of time. Panel: ILI nowcasting using social media signals.]

Summary

• LSTMs trained on social media (SM) data yield the best performance
• Text embeddings, tweet signals, and network signals are predictive
• Location-specific models outperform location-independent models

Future Work:
• Combine Twitter + ILI signals
• Run experiments for 31 locations

This is joint work with Dr. Courtney D. Corley.