Assessing the impact of a health intervention via user-generated Internet content

42
ECML PKDD 2015, Porto, Portugal Assessing the impact of a health intervention via usergenerated Internet data Data Mining and Knowledge Discovery 29(5), pp. 1434–1457, 2015 Vasileios Lampos, Elad YomTov, Richard Pebody and Ingemar J. Cox

Transcript of Assessing the impact of a health intervention via user-generated Internet content

Page 1: Assessing the impact of a health intervention via user-generated Internet content

ECML  PKDD  2015,  Porto,  Portugal

Assessing  the  impact  of  a  health  intervention  via  user-­‐generated  Internet  data  

Data  Mining  and  Knowledge  Discovery  29(5),  pp.  1434–1457,  2015

Vasileios  Lampos,  Elad  Yom-­‐Tov,    Richard  Pebody  and  Ingemar  J.  Cox

STATUTORY NOTIFICATIONS OF INFECTIOUS DISEASES

WEEK 2015/33 week ending 16/08/2015

in ENGLAND and WALES

Table 1 Statutory notifications of infectious diseases in the past 6 weeks with totals for the current year compared with corresponding periods of the two preceding years

CONTENTS

Table 2 Statutory notifications of infectious diseases for diseases for WEEK 2015/33 by PHE Region, county, local and unitary authority including additional diseases notifiable from 6th April 2010

Registered Medical Practioner in England and Wales have a statutory duty to notify a Proper Officer of the local authority, often the CCDC (Consultant in Communicable Disease Control), of suspected cases of certain infectious diseases:

Cholera Meningococcal septicaemia Viral haemorrhagic feverBrucellosis * Measles Typhus

Diphtheria Mumps Whooping cough

Food poisoning Rabies

Enteric fever (typhoid or paratyphoid) Plague Yellow fever

Botulism * Malaria Tuberculosis

Acute infectious hepatitis Infectious bloody diarrhoea * SARS *Acute encephalitis Haemolytic uraemic syndrome * Rubella

Acute meningitis Invasive group A Streptococcal disease

Scarlet fever

Anthrax Leprosy TetanusAcute poliomyelitis Legionnaires disease * Smallpox

* Notifiable from 6th April 2010

Notifications of infectious diseases, some of which are later microbiologically confirmed, prompt local investigation and action to control the diseases. Proper officers are required every week to inform the PHE (formerly the Registrar General) anonymised details of each case of each disease that has been notified. PHE has responsibilty of collating the weekly returns from proper officers and publishing analyses of local and national trends.

All weekly data are Provisional© Public Health England - Information Management Department

NOIDs WEEKLY REPORTTurning ideas into reality at

Microsoft Research Cambridge

Page 2: Assessing the impact of a health intervention via user-generated Internet content

๏ Background  and  motivation  

๏ Nowcasting  disease  rates  from  online  text  

๏ Estimating  the  impact  of  a  health  intervention  

๏ Case  study:  influenza  vaccination  impact  

๏ Conclusions  &  future  work

1%

Assessing  the  impact  of  a  health  intervention  via  online  content

Page 3: Assessing the impact of a health intervention via user-generated Internet content

Online,  user-­‐generated  data

+ Social  media,  blogs,  search  engine  query  logs  

+ Proxy  of  real-­‐world  (online+offline)  behaviour  

+ Complementary  information  sensors  to  more  ‘traditional’  crowdsourcing  efforts  

+ Can  answer  questions  difficult  to  resolve  otherwise  

+ Strong  predictive  power

Page 4: Assessing the impact of a health intervention via user-generated Internet content

Online,  user-­‐generated  data  —  Applications

+ Politics  • voting  intention  • result  of  an  election  

+ Finance  • financial  indices  • tourism  patterns  

+ User  profiling  • age  • gender  • occupation (Preotiuc-­‐Pietro,  Lampos  &  Aletras,  2015)

(Burger  et  al.,  2011)

(Rao  et  al.,  2010)

(Bollen,  Mao  &  Zeng,  2011)

(Choi  &  Varian,  2012)

(Lampos,  Preotiuc-­‐Pietro  &  Cohn,  2013)

(Tumasjan  et  al.,  2010)

Page 5: Assessing the impact of a health intervention via user-generated Internet content

Online,  user-­‐generated  data  for  health

Traditional  disease  surveillance  - does  not  cover  the  entire  population  - not  present  everywhere  (cities  /  countries)  - not  always  timely  

Digital  disease  surveillance  + different  or  better  population  coverage  + better  geographical  granularity  + useful  in  underdeveloped  parts  of  the  world  + almost  instant  - noisy,  unstructured  information

e.g.  (Lampos  &  Cristianini,  2010  &  2012),  (Lamb,  Paul  &  Dredze,  2013),  (Lampos  et  al.,  2015)  

Page 6: Assessing the impact of a health intervention via user-generated Internet content

What  this  work  is  all  about

Health  intervention

disease  rates

(Pebody  &  Cox,  2015

impact ?

Page 7: Assessing the impact of a health intervention via user-generated Internet content

What  this  work  is  all  about

Health  intervention

disease  rates

(Lampos,  Yom-­‐Tov,  Pebody  &  Cox,  2015)

impact ?

Page 8: Assessing the impact of a health intervention via user-generated Internet content

✓ Background  and  motivation  

๏ Estimating  disease  rates  from  online  text  

๏ Estimating  the  impact  of  a  health  intervention  

๏ Case  study:  influenza  vaccination  impact  

๏ Conclusions  &  future  work

Assessing  the  impact  of  a  health  intervention  via  online  content

15%

Page 9: Assessing the impact of a health intervention via user-generated Internet content

Estimating  disease  rates  from  online  text

Ridge regression

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+

MX

j=1

w2j

1

A

Elastic net

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+ �1

MX

j=1

|wj |+ �2

MX

j=1

w2j

1

A

Variables

NMX 2 RN⇥M

y 2 RN

1

time  intervalsn-­‐grams

frequency  of  n-­‐grams  during  the  time  intervals

disease  rates  during  the  time  intervals

Ridge regression

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+

MX

j=1

w2j

1

A

Elastic net

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+ �1

MX

j=1

|wj |+ �2

MX

j=1

w2j

1

A

Variables

NMX 2 RN⇥M

y 2 RN

1

(Hoerl  &  Kennard,  1970)

Ridge  regressionRidge regression

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+

MX

j=1

w2j

1

A

Elastic net

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+ �1

MX

j=1

|wj |+ �2

MX

j=1

w2j

1

A

Variables

NMX 2 RN⇥M

y 2 RN

1

(Zou  &  Hastie,  2005)

Elastic  net

Page 10: Assessing the impact of a health intervention via user-generated Internet content

Estimating  disease  rates  from  online  text

Ridge regression

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+

MX

j=1

w2j

1

A

Elastic net

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+ �1

MX

j=1

|wj |+ �2

MX

j=1

w2j

1

A

Variables

NMX 2 RN⇥M

y 2 RN

GPs can be defined as sets of random variables, any finite number of

which have a multivariate Gaussian distribution.

In GP regression, for the inputs x, x

0 2 RM(both expressing rows of

the observation matrix X) we want to learn a function f : RM ! R that is

drawn from a GP prior

f(x) ⇠ GP �µ(x) = 0, k(x,x0

)

kSE(x,x0) = �2

exp

✓�kx� x

0k222`2

where �2describes the overall level of variance and ` is referred to as the

characteristic length-scale parameter.

An infinite sum of SE kernels with di↵erent length-scales results to an-

other well studied covariance function, the Rational Quadratic (RQ) kernel.

kRQ(x,x0) = �2

✓1 +

kx� x

0k222↵`2

◆�↵

↵ is a parameter that determines the relative weighting between small

and large-scale variations of input pairs. The RQ kernel can be used to model

functions that are expected to vary smoothly across many length-scales.

1

Gaussian  Process

Ridge regression

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+

MX

j=1

w2j

1

A

Elastic net

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+ �1

MX

j=1

|wj |+ �2

MX

j=1

w2j

1

A

Variables

NMX 2 RN⇥M

y 2 RN

GPs can be defined as sets of random variables, any finite number of

which have a multivariate Gaussian distribution.

In GP regression, for the inputs x, x

0 2 RM(both expressing rows of

the observation matrix X) we want to learn a function f : RM ! R that is

drawn from a GP prior

f(x) ⇠ GP �µ(x) = 0, k(x,x0

)

kSE(x,x0) = �2

exp

✓�kx� x

0k222`2

where �2describes the overall level of variance and ` is referred to as the

characteristic length-scale parameter.

An infinite sum of SE kernels with di↵erent length-scales results to an-

other well studied covariance function, the Rational Quadratic (RQ) kernel.

kRQ(x,x0) = �2

✓1 +

kx� x

0k222↵`2

◆�↵

↵ is a parameter that determines the relative weighting between small

and large-scale variations of input pairs. The RQ kernel can be used to model

functions that are expected to vary smoothly across many length-scales.

1

Rational  Quadratic  covariance  function  (kernel)

infinite  sum  of  squared  exponential  (RBF)  kernels

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

One  kernel  per  n-­‐gram  category  varied  usage  patterns,  increasing  semantic  value

(Rasmussen  &  Williams,  2006)

see  also  (

Page 11: Assessing the impact of a health intervention via user-generated Internet content

Estimating  disease  rates  from  online  text

Ridge regression

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+

MX

j=1

w2j

1

A

Elastic net

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+ �1

MX

j=1

|wj |+ �2

MX

j=1

w2j

1

A

Variables

NMX 2 RN⇥M

y 2 RN

GPs can be defined as sets of random variables, any finite number of

which have a multivariate Gaussian distribution.

In GP regression, for the inputs x, x

0 2 RM(both expressing rows of

the observation matrix X) we want to learn a function f : RM ! R that is

drawn from a GP prior

f(x) ⇠ GP �µ(x) = 0, k(x,x0

)

kSE(x,x0) = �2

exp

✓�kx� x

0k222`2

where �2describes the overall level of variance and ` is referred to as the

characteristic length-scale parameter.

An infinite sum of SE kernels with di↵erent length-scales results to an-

other well studied covariance function, the Rational Quadratic (RQ) kernel.

kRQ(x,x0) = �2

✓1 +

kx� x

0k222↵`2

◆�↵

↵ is a parameter that determines the relative weighting between small

and large-scale variations of input pairs. The RQ kernel can be used to model

functions that are expected to vary smoothly across many length-scales.

1

Gaussian  Process

Ridge regression

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+

MX

j=1

w2j

1

A

Elastic net

argmin

w,�

0

@NX

i=1

(xiw + � � yi)2+ �1

MX

j=1

|wj |+ �2

MX

j=1

w2j

1

A

Variables

NMX 2 RN⇥M

y 2 RN

GPs can be defined as sets of random variables, any finite number of

which have a multivariate Gaussian distribution.

In GP regression, for the inputs x, x

0 2 RM(both expressing rows of

the observation matrix X) we want to learn a function f : RM ! R that is

drawn from a GP prior

f(x) ⇠ GP �µ(x) = 0, k(x,x0

)

kSE(x,x0) = �2

exp

✓�kx� x

0k222`2

where �2describes the overall level of variance and ` is referred to as the

characteristic length-scale parameter.

An infinite sum of SE kernels with di↵erent length-scales results to an-

other well studied covariance function, the Rational Quadratic (RQ) kernel.

kRQ(x,x0) = �2

✓1 +

kx� x

0k222↵`2

◆�↵

↵ is a parameter that determines the relative weighting between small

and large-scale variations of input pairs. The RQ kernel can be used to model

functions that are expected to vary smoothly across many length-scales.

1

Rational  Quadratic  covariance  function  (kernel)

infinite  sum  of  squared  exponential  (RBF)  kernels

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].

2

One  kernel  per  n-­‐gram  category  varied  usage  patterns,  increasing  semantic  value

(Rasmussen  &  Williams,  2006)

see  also  (Lampos  et  al.,  2015)

Page 12: Assessing the impact of a health intervention via user-generated Internet content

Estimating  influenza-­‐like  illness  (ILI)  rates  —  Data

2012 2013 20140

0.01

0.02

0.03

0.04

ILI r

ate

per 1

00 p

eopl

e

ILI rates (PHE)

Bing

User-­‐generated  data,  geolocated  in  England  • Twitter:  May  2011  to  April  2014  (308  million  tweets)  • Bing:  end  of  December  2012  to  April  2014

ILI  rates  from  Public  Health  England  (PHE)

Page 13: Assessing the impact of a health intervention via user-generated Internet content

Estimating  ILI  rates  —  Feature  extraction

• Start  with  a  manually  crafted  list  of  36  textual  markers,  e.g.  flu,  headache,  doctor,  cough    

• Extract  frequent  co-­‐occurring  n-­‐grams  from  a  corpus  of  30  million  UK  tweets  (February  &  March,  2014)  after  removing  stop-­‐words  

• Set  of  markers  expanded  to  205  n-­‐grams  (n  ≤  4) e.g.  #flu,  #cough,  annoying  cough,  worst  sore  throat    

• Relatively  small  set  of  features  motivated  by  previous  work   (Culotta,  2013)

Page 14: Assessing the impact of a health intervention via user-generated Internet content

Estimating  ILI  rates  —  Experimental  setup

Two  time  intervals  based  on  the  different  temporal  coverage  of  Twitter  and  Bing  data  • Dt1:  154  weeks  (May  2011  to  April  2014)  • Dt2:  67  weeks  (December  2012  to  April  2014)  

Stratified  10-­‐fold  cross  validation  

Error  metrics  • Pearson  correlation  (r)  • Mean  Absolute  Error  (MAE)

Page 15: Assessing the impact of a health intervention via user-generated Internet content

Pearso

n  co

rrelation  (r)

0.5

0.6

0.7

0.8

0.9

1

User-­‐generated  data  source

Twitter  (Dt1) Twitter  (Dt2) Bing  (Dt2)

0.9520.924

0.8450.867

0.7440.718

0.814

0.698

0.64

Ridge  Regression Elastic  Net Gaussian  Process

Estimating  ILI  rates  —  Performance

Page 16: Assessing the impact of a health intervention via user-generated Internet content

MAE

1

1.64

2.28

2.92

3.56

4.2

User-­‐generated  data  source

Twitter  (Dt1) Twitter  (Dt2) Bing  (Dt2)

1.598

1.9992.196

2.564

3.198

2.828 2.963

4.084

3.074

Ridge  Regression Elastic  Net Gaussian  Process

Estimating  ILI  rates  —  Performancex  10

3

Page 17: Assessing the impact of a health intervention via user-generated Internet content

✓ Background  and  motivation  

✓ Estimating  disease  rates  from  online  text  

๏ Estimating  the  impact  of  a  health  intervention  

๏ Case  study:  influenza  vaccination  impact  

๏ Conclusions  &  future  work

Assessing  the  impact  of  a  health  intervention  via  online  content

41%

Page 18: Assessing the impact of a health intervention via user-generated Internet content

Estimating  the  impact  of  a  health  intervention

1. Disease  intervention  launched  (to  a  set  of  areas)  

2. Define  a  distinct  set  of  control  areas  

3. Estimate  disease  rates  in  all  areas  

4. Identify  pairs  of  areas  with  strong  historical  correlation  in  their  disease  rates  

5. Use  this  relationship  during  and  slightly  after  the  intervention  to  infer  diseases  rates  in  the  affected  areas  had  the  intervention  not  taken  place

Page 19: Assessing the impact of a health intervention via user-generated Internet content

Estimating  the  impact  of  a  health  intervention

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

qv

�v = qv � q

⇤v

✓v =

qv � q

⇤v

q

⇤v

.

2

time  interval(s)  before  the  interventionlocation(s)  where  the  intervention  took  placecontrol  location(s)

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

qv

�v = qv � q

⇤v

✓v =

qv � q

⇤v

q

⇤v

.

2

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

qv

�v = qv � q

⇤v

✓v =

qv � q

⇤v

q

⇤v

.

2

such  that

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

qv

�v = qv � q

⇤v

✓v =

qv � q

⇤v

q

⇤v

.

2

disease  rate(s)  in  affected  location  

before  intervention

disease  rate(s)  in  control  location  

before  intervention

high

Page 20: Assessing the impact of a health intervention via user-generated Internet content

Estimating  the  impact  of  a  health  intervention

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

qv

�v = qv � q

⇤v

✓v =

qv � q

⇤v

q

⇤v

.

2

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

qv

�v = qv � q

⇤v

✓v =

qv � q

⇤v

q

⇤v

.

2

such  that

qvdisease  rate(s)  in  affected  location  during/after  intervention

�v = qv � q

⇤v

absolute  difference

✓v =

qv � q

⇤v

q

⇤v

relative  difference  (impact)

(Lambert  &  Pregibon,  2008

estimate  projected  rate(s)  in  affected  location  during/after  intervention

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

q

⇤v = qcw + b

qv

�v = qv � q

⇤v

2

Page 21: Assessing the impact of a health intervention via user-generated Internet content

Estimating  the  impact  of  a  health  intervention

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

qv

�v = qv � q

⇤v

✓v =

qv � q

⇤v

q

⇤v

.

2

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

qv

�v = qv � q

⇤v

✓v =

qv � q

⇤v

q

⇤v

.

2

such  that

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

qv

�v = qv � q

⇤v

✓v =

qv � q

⇤v

q

⇤v

.

2

disease  rate(s)  in  affected  location  during/after  intervention

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

qv

�v = qv � q

⇤v

✓v =

qv � q

⇤v

q

⇤v

2

absolute  difference

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

qv

�v = qv � q

⇤v

✓v =

qv � q

⇤v

q

⇤v

2

relative  difference  (impact)

(Lambert  &  Pregibon,  2008)

estimate  projected  rate(s)  in  affected  location  during/after  intervention

k(x,x0) =

CX

n=1

kRQ(gn,g0n)

!+ kN(x,x

0)

where gn is used to express the features of each n-gram category, i.e.,

x = {g1, g2, g3, g4}, C is equal to the number of n-gram categories (in our

experiments, C = 4) and kN(x,x0) = �2

N ⇥ �(x,x0) models noise (� being a

Kronecker delta function).

Denoting the disease rate time series as y = (y1, . . . , yN ), the GP re-

gression objective is defined by the minimization of the following negative

log-marginal likelihood function

argmin

�1,...,�C ,`1,...,`C ,↵1,...,↵C ,�N

�(y � µ)|K�1

(y � µ) + log |K|�

where K holds the covariance function evaluations for all pairs of inputs,

i.e., (K)i,j = k(xi,xj), and µ = (µ(x1), . . . , µ(xN )).

Based on a new observation x⇤, a prediction is conducted by computing

the mean value of the posterior predictive distribution, E[y⇤|y,O,x⇤].—

⌧ = {t1, . . . , tN}vc

r(q⌧v ,q

⌧c )

f(w,�) : R ! R

argmin

w,�

NX

i=1

�qtic w + � � qtiv

�2

q

⇤v = q

⇤cw + b

q

⇤v = qcw + b

qv

�v = qv � q

⇤v

2

Page 22: Assessing the impact of a health intervention via user-generated Internet content

✓ Background  and  motivation  

✓ Estimating  disease  rates  from  online  text  

✓ Estimating  the  impact  of  a  health  intervention  

๏ Case  study:  influenza  vaccination  impact  

๏ Conclusions  &  future  work

Assessing  the  impact  of  a  health  intervention  via  online  content

52%

Page 23: Assessing the impact of a health intervention via user-generated Internet content

Live  Attenuated  Influenza  Vaccine  (LAIV)  campaign

2012 2013 20140

0.01

0.02

0.03

ILI r

ate

per 1

00 p

eopl

e

PHE/RCGP LAIV Post LAIV

∆tv

• LAIV  programme  for  children  (4  to  11  years)  in  pilot  areas  of  England  during  the  2013/14  flu  season  

• Vaccination  period  (blue):  Sept.  2013  to  Jan.  2014  • Post-­‐vaccination  period  (green):  Feb.  to  April  2014

Page 24: Assessing the impact of a health intervention via user-generated Internet content

Target  (vaccinated)  &  control  areas

Vaccination impact paper

Vaccinated Locations

Bury

Cumbria

Gateshead

Leicester

East Leicestershire

Rutland

Havering

Newham

South-East Essex

Control Locations

Brighton

Bristol

Cambridge

Exeter

Leeds

Liverpool

Norwich

Nottingham

Plymouth

Sheffield

Southampton

York

Brighton  •  Bristol  •  Cambridge  Exeter  •  Leeds  •  Liverpool  Norwich  •  Nottingham  •  Plymouth  Sheffield  •  Southampton  •  York

Control  areas

Bury  •  Cumbria  •  Gateshead  Leicester  •  East  Leicestershire  Rutland  •  South-­‐East  Essex  Havering  (London)  Newham  (London)

Vaccinated  areas

Page 25: Assessing the impact of a health intervention via user-generated Internet content

Applying  the  impact  estimation  framework

Target  vs.  control  areas  • Use  previous  flu  season  only  to  establish  relationships  • Find  the  best  correlated  areas  or  supersets  of  them  

Confidence  intervals  • Bootstrap  sampling  of  the  regression  residuals  

(mapping  function  of  control  to  vaccinated  areas)  • Bootstrap  sampling  of  data  prior  to  the  application  of  

the  bootstrapped  regressor  • 105  bootstraps;  use  the  .025  and  .975  quantiles  

Statistical  significance  assessment  • Impact  estimate  (abs.)  >  2σ  of  the  bootstrap  estimates

Page 26: Assessing the impact of a health intervention via user-generated Internet content

Relationship  between  vaccinated  &  control  areas

Twitter  —  All  areas

Bing  —  All  areas

0 0.25 0.5 0.75 1

0

0.25

0.5

0.75

1

ILI r

ates

in v

acci

nate

d ar

eas

ILI rates in control areas

pre−vaccination periodduring/after LAIV

0 0.25 0.5 0.75 10

0.25

0.5

0.75

1

ILI r

ates

in v

acci

nate

d ar

eas

ILI rates in control areas

pre−vaccination periodduring/after LAIV

axes  normalised  from  0  to  1

r  =  .86

r  =  .87

Page 27: Assessing the impact of a health intervention via user-generated Internet content

Relationship  between  vaccinated  &  control  areas

Twitter  —  London  areas

Bing  —  London  areas

0 0.25 0.5 0.75 10

0.25

0.5

0.75

1

ILI r

ates

in v

acci

nate

d ar

eas

ILI rates in control areas

pre−vaccination periodduring/after LAIV

0 0.25 0.5 0.75 10

0.25

0.5

0.75

1IL

I rat

es in

vac

cina

ted

area

s

ILI rates in control areas

pre−vaccination periodduring/after LAIV

axes  normalised  from  0  to  1

r  =  .74

r  =  .85

Page 28: Assessing the impact of a health intervention via user-generated Internet content

Impact  estimation  results  (strongly  correlated  controls)

Source Target r δ  x  103 θ  (%)

Twitter All  areas .861 -­‐2.5  (-­‐4.1,  -­‐1.0) -­‐32.8  (-­‐47.4,  -­‐15.6)

Bing All  areas .866 -­‐1.9  (-­‐3.2,  -­‐0.7) -­‐21.7  (-­‐32.1,  -­‐9.10)

TwitterLondon  areas .738 -­‐1.7  (-­‐2.5,  -­‐0.9) -­‐30.5  (-­‐41.8,  -­‐17.5)

BingLondon  areas .848 -­‐2.8  (-­‐4.1,  -­‐1.6) -­‐28.4  (-­‐36.7,  -­‐17.9)

Page 29: Assessing the impact of a health intervention via user-generated Internet content

Impact  estimation  results  (strongly  correlated  controls)

Source Target r δ  x  103 θ  (%)

Twitter All  areas .861 -­‐2.5  (-­‐4.1,  -­‐1.0) -­‐32.8  (-­‐47.4,  -­‐15.6)

Bing All  areas .866 -­‐1.9  (-­‐3.2,  -­‐0.7) -­‐21.7  (-­‐32.1,  -­‐9.10)

TwitterLondon  areas .738 -­‐1.7  (-­‐2.5,  -­‐0.9) -­‐30.5  (-­‐41.8,  -­‐17.5)

BingLondon  areas .848 -­‐2.8  (-­‐4.1,  -­‐1.6) -­‐28.4  (-­‐36.7,  -­‐17.9)

Page 30: Assessing the impact of a health intervention via user-generated Internet content

Source Target r δ  x  103 θ  (%)

Twitter All  areas .861 -­‐2.5  (-­‐4.1,  -­‐1.0) -­‐32.8  (-­‐47.4,  -­‐15.6)

Bing All  areas .866 -­‐1.9  (-­‐3.2,  -­‐0.7) -­‐21.7  (-­‐32.1,  -­‐9.10)

TwitterLondon  areas .738 -­‐1.7  (-­‐2.5,  -­‐0.9) -­‐30.5  (-­‐41.8,  -­‐17.5)

BingLondon  areas .848 -­‐2.8  (-­‐4.1,  -­‐1.6) -­‐28.4  (-­‐36.7,  -­‐17.9)

Impact  estimation  results  (strongly  correlated  controls)

Page 31: Assessing the impact of a health intervention via user-generated Internet content

Source Target r δ  x  103 θ  (%)

Twitter All  areas .861 -­‐2.5  (-­‐4.1,  -­‐1.0) -­‐32.8  (-­‐47.4,  -­‐15.6)

Bing All  areas .866 -­‐1.9  (-­‐3.2,  -­‐0.7) -­‐21.7  (-­‐32.1,  -­‐9.10)

TwitterLondon  areas .738 -­‐1.7  (-­‐2.5,  -­‐0.9) -­‐30.5  (-­‐41.8,  -­‐17.5)

BingLondon  areas .848 -­‐2.8  (-­‐4.1,  -­‐1.6) -­‐28.4  (-­‐36.7,  -­‐17.9)

Impact  estimation  results  (strongly  correlated  controls)

Page 32: Assessing the impact of a health intervention via user-generated Internet content

Impact  estimation  results  (stat.  sig.)-­‐θ  (%

)

0

7

14

21

28

35

All  areas London  areas Newham Cumbria Gateshead

30.228.7

21.7 21.1

30.430.532.8

Twitter Bing

Page 33: Assessing the impact of a health intervention via user-generated Internet content

Projected  vs.  inferred  ILI  rates  in  vaccinated  locations

Twitter  —  All  areas

Bing  —  All  areas

Oct Nov Dec Jan Feb Mar Apr0

0.005

0.01

0.015

0.02

ILI r

ates

per

100

peo

ple

weeks during and after the vaccination programme

inferred ILI ratesprojected ILI rates

Oct Nov Dec Jan Feb Mar Apr0

0.005

0.01

0.015

0.02

ILI r

ates

per

100

peo

ple

weeks during and after the vaccination programme

inferred ILI ratesprojected ILI rates

Page 34: Assessing the impact of a health intervention via user-generated Internet content

Projected  vs.  inferred  ILI  rates  in  vaccinated  locations

Twitter  —  London  areas

Bing  —  London  areas

Oct Nov Dec Jan Feb Mar Apr0

0.005

0.01

ILI r

ates

per

100

peo

ple

weeks during and after the vaccination programme

inferred ILI ratesprojected ILI rates

Oct Nov Dec Jan Feb Mar Apr0

0.005

0.01

0.015

ILI r

ates

per

100

peo

ple

weeks during and after the vaccination programme

inferred ILI ratesprojected ILI rates

Page 35: Assessing the impact of a health intervention via user-generated Internet content

Sensitivity  of  impact  estimates  to  variable  controls

• Repeat  the  impact  estimation  for  the  N  controls  (up  to  a  100)  with  r  ≥  95%  of  the  best  r  —>  μ(δ)  and  μ(θ)  (%)  

• Measure  %  of  difference,  Δ(θ),  between  θ  and  μ(θ)

Source Target N μ(r) μ(δ)  x  103 μ(θ)  (%) Δθ  (%)

Twitter All  areas 100 0.84 -­‐2.5  (0.2) -­‐32.7  (2.1) 0.10Bing All  areas 46 0.85 -­‐1.4  (0.4) -­‐16.4  (3.6) 24.4

Twitter London  areas 79 0.70 -­‐1.5  (0.1) -­‐27.9  (2.0) 8.32

Bing London  areas 100 0.84 -­‐1.4  (0.2) -­‐16.9  (1.8) 40.4

Page 36: Assessing the impact of a health intervention via user-generated Internet content

Sensitivity  of  impact  estimates  to  variable  controls

• Repeat  the  impact  estimation  for  the  N  controls  (up  to  a  100)  with  r  ≥  95%  of  the  best  r  —>  μ(δ)  and  μ(θ)  (%)  

• Measure  %  of  difference,  Δ(θ),  between  θ  and  μ(θ)

Source Target N μ(r) μ(δ)  x  103 μ(θ)  (%) Δθ  (%)

Twitter All  areas 100 0.84 -­‐2.5  (0.2) -­‐32.7  (2.1) 0.10

Bing All  areas 46 0.85 -­‐1.4  (0.4) -­‐16.4  (3.6) 24.4

Twitter London  areas 79 0.70 -­‐1.5  (0.1) -­‐27.9  (2.0) 8.32

Bing London  areas 100 0.84 -­‐1.4  (0.2) -­‐16.9  (1.8) 40.4

Page 37: Assessing the impact of a health intervention via user-generated Internet content

Sensitivity  of  impact  estimates  to  variable  controls

• Repeat  the  impact  estimation  for  the  N  controls  (up  to  a  100)  with  r  ≥  95%  of  the  best  r  —>  μ(δ)  and  μ(θ)  (%)  

• Measure  %  of  difference,  Δ(θ),  between  θ  and  μ(θ)

Source Target N μ(r) μ(δ)  x  103 μ(θ)  (%) Δθ  (%)

Twitter All  areas 100 0.84 -­‐2.5  (0.2) -­‐32.7  (2.1) 0.10Bing All  areas 46 0.85 -­‐1.4  (0.4) -­‐16.4  (3.6) 24.4

Twitter London  areas 79 0.70 -­‐1.5  (0.1) -­‐27.9  (2.0) 8.32

Bing London  areas 100 0.84 -­‐1.4  (0.2) -­‐16.9  (1.8) 40.4

Page 38: Assessing the impact of a health intervention via user-generated Internet content

✓ Background  and  motivation  

✓ Estimating  disease  rates  from  online  text  

✓ Estimating  the  impact  of  a  health  intervention  

✓ Case  study:  influenza  vaccination  impact  

๏ Conclusions  &  future  work

Assessing  the  impact  of  a  health  intervention  via  online  content

89%

Page 39: Assessing the impact of a health intervention via user-generated Internet content

Conclusions  &  points  for  discussion

• Framework  for  estimating  the  impact  of  a  health  intervention  based  on  online  content  

• Access  to  different  &  larger  parts  of  the  population  

Evaluation  is  hard,  however:  • PHE’s  impact  estimates:  -­‐66%  based  on  sentinel  

surveillance,  -­‐24%  laboratory  confirmed  • Correlation  between  actual  vaccination  uptake  and  our  

study’s  estimated  impacts  

Why  are  Bing  and  Twitter  estimations  different?  • Different  user  demographics  (?)  —  this  can  be  useful  • Different  temporal  resolution

(Pebody  et  al.,  2014)

Page 40: Assessing the impact of a health intervention via user-generated Internet content

Potential  future  work  directions

• Improve  supervised  learning  models  - better  natural  language  processing  /  machine  

learning  modelling  - combination  of  different  data  sources  

• Work  on  unsupervised  techniques  - inferring  /  understanding  the  demographics  of  the  

online  medium  will  be  essential  

• More  rigorous  evaluation

Page 41: Assessing the impact of a health intervention via user-generated Internet content

Collaborators,  acknowledgements  &  material

Elad  Yom-­‐Tov,  Microsoft  Research  Richard  Pebody,  Public  Health  England  Ingemar  J.  Cox,  UCL  &  University  of  Copenhagen

Jens  Geyti,  UCL  (Software  Engineer)  Simon  de  Lusignan,  University  of  Surrey  &  RCGP

Slides:  ow.ly/RN7MZPaper:  ow.ly/RN9J2

i-­‐sense.org.uk

Page 42: Assessing the impact of a health intervention via user-generated Internet content

Bollen,  Mao  &  Zeng.  Twitter  mood  predicts  the  stock  market.  J  Comp  Science,  2011.  Burger,  Henderson,  Kim  &  Zarrella.  Discriminating  Gender  on  Twitter.  EMNLP,  2011.  Choi  &  Varian.  Predicting  the  Present  with  Google  Trends.  Economic  Record,  2012.  Culotta.  Lightweight  methods  to  estimate   influenza  rates  and  alcohol  sales  volume  from  Twitter  messages.  Lang  Resour  Eval,  2013.  Hoerl  &  Kennard.  Ridge  regression:  biased  estimation  for  nonorthogonal  problems.  Technometrics,  1970.  Lamb,  Paul  &  Dredze.  Separating  Fact  from  Fear:  Tracking  Flu  Infections  on  Twitter.  NAACL,  2013.  Lambert  &  Pregibon.  Online  effects  of  offline  ads.  Data  Mining  &  Audience  Intelligence  for  Advertising,  2008.  Lampos  &  Cristianini.  Tracking  the  flu  pandemic  by  monitoring  the  Social  Web.  CIP,  2010.  Lampos  &  Cristianini.  Nowcasting  Events  from  the  Social  Web  with  Statistical  Learning.  ACM  TIST,  2012.  Lampos,  Miller,  Crossan  &  Stefansen.  Advances   in  nowcasting   influenza-­‐like   illness  rates  using  search  query   logs.  Sci  Rep,  2015.  Lampos,   Yom-­‐Tov,   Pebody   &   Cox.   Assessing   the   impact   of   a   health   intervention   via   user-­‐generated   Internet  content.  DMKD,  2015.  Pebody  et  al.  Uptake  and  impact  of  a  new  live  attenuated  influenza  vaccine  programme  in  England:  early  results  of  a  pilot  in  primary  school-­‐age  children,  2013/14  influenza  season.  Eurosurveillance,  2014.  Preotiuc-­‐Pietro,  Lampos  &  Aletras.  An  analysis  of  the  user  occupational  class  through  Twitter  content.  ACL,  2015.  Rao,  Yarowsky,  Shreevats  &  Gupta.  Classifying  Latent  User  Attributes  in  Twitter.  SMUC,  2010.  Rasmussen  &  Williams.  Gaussian  Processes  for  Machine  Learning.  MIT  Press,  2006.  Tumasjan,   Sprenger,   Sandner   &   Welpe.   Predicting   Elections   with   Twitter:   What   140   characters   Reveal   about  Political  Sentiment.  ICWSM,  2010.  Zou  &  Hastie.  Regularization  and  variable  selection  via  the  elastic  net.  J  R  Stat  Soc  Series  B  Stat  Methodol,  2005.

References