SISMIDSpatialStatisticsinEpidemiologyand PublicHealth...
Transcript of SISMIDSpatialStatisticsinEpidemiologyand PublicHealth...
SISMID Spatial Statistics in Epidemiology andPublic Health
2015 R Notes: Infectious Disease Data
Jon WakefieldDepartments of Statistics and Biostatistics, University of
Washington
2015-07-21
Lower Saxony Measles Data
As a first example of the Held et al. (2005) approach we examinedata on measles considered in this paper
These data are in the surveillance package.
The data consist of weekly measles counts over 2 years, for each of15 neighborhoods of Lower Saxony in Germany.
The figure below shows the study region.
Included in the dataset are a 15× 15 matrix of 0/1 entriesindicating which areas share a common boundary.
There is also the population fraction that is contained in each area.
Lower Saxony Measles: Reading in the Data
library(surveillance)library(RColorBrewer)data(measles.weser)# Shapefile for Germanydata <- url("http://faculty.washington.edu/jonno/SISMIDmaterial/DEU_adm3.RData")load(data)names(gadm)## [1] "PID" "ID_0" "ISO" "NAME_0" "ID_1"## [6] "NAME_1" "ID_2" "NAME_2" "ID_3" "NAME_3"## [11] "NL_NAME_3" "VARNAME_3" "TYPE_3" "ENGTYPE_3"xxx <- gadm[which(gadm$NAME_2 == "Weser-Ems" & gadm$ID_3 !=
249), ]xxx$HHH <- c(3458, 3404, 3459, 3460, 3461, 3462, 3451,
3452, 3453, 3402, 3454, 3455, 3456, 3457, 3403)dat <- data.frame(t(measles.weser$observed)[c(11, 3,
12, 13, 14, 15, 4, 5, 6, 1, 7, 8, 9, 10, 2), ])names(dat) <- paste("wk", 1:104, sep = "")attributes(xxx)$data <- cbind(attributes(xxx)$data,
dat)weser <- xxx
Lower Saxony Measles Data
Plotting a map of the study region.
save(weser, file = "~Weser_Data.RData")totals <- colSums(measles.weser$observed)totals <- totals[c(11, 3, 12, 13, 14, 15, 4, 5, 6,
1, 7, 8, 9, 10, 2)]nclr <- 5brks <- unique(round(quantile(totals, probs = seq(0,
1, 1/10))), digits = 1)plotclr <- colorRampPalette(brewer.pal(nclr, "BuPu"))(52)plot(xxx)text(coordinates(xxx), labels = xxx$HHH, cex = 0.8,
col = 1, font = 2)
Visualizing Spatial Data
3458
34043459
3460
3461
3462
3451
3452
3453
3402
3454
3455
3456
34573403
Lower Saxony Measles DataWe next plot the data over time# plot of no. infected in each of 15# neighborhoodsplot(measles.weser, type = observed ~ time,
legend.opts = NULL)
time
No.
of I
nfec
ted
2001
I
2001
III
2002
I
2002
III
2003
I
020
40
Lower Saxony Measles Data
Plot of number of infected by neighbourhood
plot(measles.weser, as.one = FALSE, xaxis.years = FALSE)
Lower Saxony Measles Data
0 40 80
020
50
time
3402
0 40 80
020
50
time
No.
infe
cted 3403
0 40 80
020
50
timeN
o. in
fect
ed 3404
0 40 80
020
50
time
No.
infe
cted 3451
0 40 80
020
50
time
No.
infe
cted 3452
0 40 80
020
50
time
3453
0 40 80
020
50
time
No.
infe
cted 3454
0 40 80
020
50
time
No.
infe
cted 3455
0 40 80
020
50
timeN
o. in
fect
ed 3456
0 40 80
020
50
time
No.
infe
cted 3457
0 40 80
020
50 3458
0 40 80
020
50
No.
infe
cted 3459
0 40 80
020
50
No.
infe
cted 3460
0 40 80
020
50
No.
infe
cted 3461
0 40 80
020
50
No.
infe
cted 3462
Lower Saxony Measles Data# Map of cases in week 70plot(xxx)plot(xxx, col = plotclr[findInterval(dat[,
70], 0:51, all.inside = T)], add = T)
Lower Saxony Measles DataWe first fit the model
µi ,t = λAR︸︷︷︸exp(αAR
0 )
yi ,t−1 + λNE︸︷︷︸exp(αNE
0 )
∑j∈ne(i)
yj,t−1 + NitλENit ,
with endemic term:
log(λENit ) = αEN
0 + α1t + γ sin(ωt) + δ cos(ωt)
where,
I λAR is the epidemic force,I λNE is the neighborhood effect,I Nit are (possibly standardized) population counts in area i at
time t,I λEN
it is the endemic term,I α1 is a slope parameter describing the large scale endemic
temporal trend,I γ and δ are seasonal parameters and do not vary across areas,ω = (2π)/52.
Lower Saxony Measles Data# Get the data in the right format ('sts' class)measles <- disProg2sts(measles.weser)str(measles)## Formal class 'sts' [package "surveillance"] with 13 slots## ..@ epoch : int [1:104] 1 2 3 4 5 6 7 8 9 10 ...## ..@ freq : num 52## ..@ start : num [1:2] 2001 1## ..@ observed : int [1:104, 1:15] 0 0 0 0 0 0 0 0 0 0 ...## .. ..- attr(*, "dimnames")=List of 2## .. .. ..$ : NULL## .. .. ..$ : chr [1:15] "3402" "3403" "3404" "3451" ...## ..@ state : num [1:104, 1:15] 0 0 0 0 0 0 0 0 0 0 ...## .. ..- attr(*, "dimnames")=List of 2## .. .. ..$ : NULL## .. .. ..$ : chr [1:15] "3402" "3403" "3404" "3451" ...## ..@ alarm : logi [1:104, 1:15] NA NA NA NA NA NA ...## .. ..- attr(*, "dimnames")=List of 2## .. .. ..$ : NULL## .. .. ..$ : chr [1:15] "3402" "3403" "3404" "3451" ...## ..@ upperbound : logi [1:104, 1:15] NA NA NA NA NA NA ...## .. ..- attr(*, "dimnames")=List of 2## .. .. ..$ : NULL## .. .. ..$ : chr [1:15] "3402" "3403" "3404" "3451" ...## ..@ neighbourhood : int [1:15, 1:15] 0 0 0 0 1 0 0 0 0 1 ...## .. ..- attr(*, "dimnames")=List of 2## .. .. ..$ : chr [1:15] "3402" "3403" "3404" "3451" ...## .. .. ..$ : chr [1:15] "3402" "3403" "3404" "3451" ...## ..@ populationFrac: num [1:104, 1:15] 0.0224 0.0224 0.0224 0.0224 0.0224 ...## .. ..- attr(*, "dimnames")=List of 2## .. .. ..$ : NULL## .. .. ..$ : chr [1:15] "3402" "3403" "3404" "3451" ...## ..@ map :Formal class 'SpatialPolygons' [package "sp"] with 4 slots## .. .. ..@ polygons : list()## .. .. ..@ plotOrder : int(0)## .. .. ..@ bbox : num[0 , 0 ]## .. .. ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot## .. .. .. .. ..@ projargs: chr ""## ..@ control : list()## ..@ epochAsDate : logi FALSE## ..@ multinomialTS : logi FALSE
Lower Saxony Measles Data# Endemic component: inear trend and 1 pair of# seasonal termsf.end1 <- addSeason2formula(f = ~1 + t, S = 1, period = 52)model1 <- list(ar = list(f = ~1), ne = list(f = ~1,
weights = neighbourhood(measles)), end = list(f = f.end1,offset = population(measles)), family = "NegBin1") # NegBin1 has a single
# overdisperion parameterres.hhh1 <- hhh4(measles, control = model1)summary(res.hhh1)#### Call:## hhh4(stsObj = measles, control = model1)#### Coefficients:## Estimate Std. Error## ar.1 -0.472565 0.143066## ne.1 -4.474443 0.420570## end.1 0.343990 0.244113## end.t 0.004128 0.003956## end.sin(2 * pi * t/52) 0.931667 0.175207## end.cos(2 * pi * t/52) -0.241281 0.156280## overdisp 2.823871 0.369716#### Log-likelihood: -1002.05## AIC: 2018.1## BIC: 2055.5#### Number of units: 15## Number of time points: 103
Lower Saxony Measles Data
Hence, the mean model is
E [Yi,t+1|yit , yjt ] = exp(−0.47)yit + exp(−4.47)∑
j∈ne(i)yjt
+ exp [0.34+ 0.0041× t + 0.93 sin(ωt)− 0.24 cos(ωt)]
The variance is
var(Yi,t+1|yit) = E [Yit |yit ](1+ 2.28× E [Yit |yit ]).
The fits are different in each of the areas due to the two epidemiccomponents.
Lower Saxony Measles Data# Plot for the fifth area:plot(res.hhh1, unit = 5, col = c("#FFA50095",
"#0000FF75", "#D9D9D9"), legend = TRUE)
2001.0 2001.5 2002.0 2002.5 2003.0
0
2
4
6
8
No.
infe
cted
3452
spatiotemporalautoregressiveendemic
Lower Saxony Measles Data
y <- measles@observed[2:104, ] # Notice no firstmu1 <- fitted(res.hhh1, "pearson")time <- matrix(rep(measles@epoch, 15), nrow = 104,
ncol = 15)time <- time[-1, ]res1 <- (y - mu1)/sqrt(mu1 * (1 + 2.28 * mu1))par(mfrow = c(1, 3))plot(mu1 ~ y, xlab = "Observed", ylab = "Fitted", type = "n")for (i in 1:15) {
points(mu1[, i] ~ y[, i], col = i)}abline(0, 1)plot(res1 ~ mu1, xlab = "Fitted", ylab = "Residuals",
type = "n")for (i in 1:15) {
points(res1[, i] ~ mu1[, i], col = i)}abline(0, 0)plot(res1 ~ time, xlab = "Time (week)", ylab = "Residuals",
type = "n")for (i in 1:15) {
points(res1[, i] ~ time[, i], col = i)}abline(0, 0)
Lower Saxony Measles Data
0 20 40
05
1015
2025
30
Observed
Fitt
ed
0 10 25
010
2030
4050
60
Fitted
Res
idua
ls
0 40 80
010
2030
4050
60
Time (week)
Res
idua
ls
Lower Saxony Measles Data
Residuals do not look good!
But residuals are very difficult to examine when there are lots ofobserved counts that are zero
Lower Saxony Measles Data
We now extend the model to add area-specific intercepts to theendemic component.
# Endemic Component: region-specific intercepts,# linear trend and 1 pair of seasonal termsf.end2 <- addSeason2formula(f = ~-1 + fe(1, which = rep(TRUE,
ncol(measles))) + t, S = 1, period = 52)# Autoregressive Comp, Neighbourhood Effect,# Endemic compmodel2 <- list(ar = list(f = ~1), ne = list(f = ~1,
weights = neighbourhood(measles)), end = list(f = f.end2,offset = population(measles)), family = "NegBin1")
# Fit the modelres.hhh2 <- hhh4(measles, control = model2)
Lower Saxony Measles Datasummary(res.hhh2)#### Call:## hhh4(stsObj = measles, control = model2)#### Coefficients:## Estimate Std. Error## ar.1 -0.69791 0.14835## ne.1 -6.68692 2.13455## end.t 0.01204 0.00360## end.sin(2 * pi * t/52) 1.32238 0.16243## end.cos(2 * pi * t/52) -0.58866 0.13454## end.1.3402 2.04845 0.33483## end.1.3403 -0.48083 0.41065## end.1.3404 -1.91449 0.65148## end.1.3451 -1.21596 0.70807## end.1.3452 0.74223 0.33867## end.1.3453 -0.19656 0.44766## end.1.3454 0.07133 0.33151## end.1.3455 -2.16494 1.27427## end.1.3456 -2.10119 0.76601## end.1.3457 2.25367 0.27666## end.1.3458 -0.81222 0.48152## end.1.3459 -0.59600 0.37366## end.1.3460 -1.26344 0.54192## end.1.3461 0.31475 0.39120## end.1.3462 -0.66236 0.61236## overdisp 1.73842 0.22807#### Log-likelihood: -910.27## AIC: 1862.53## BIC: 1974.73#### Number of units: 15## Number of time points: 103
Lower Saxony Measles Data# Plot for the fifth area:plot(res.hhh2, unit = 5, col = c("#FFA50095",
"#0000FF75", "#D9D9D9"), legend = TRUE)
2001.0 2001.5 2002.0 2002.5 2003.0
0
2
4
6
8
No.
infe
cted
3452
spatiotemporalautoregressiveendemic
Lower Saxony Measles Data#mu2 <- fitted(res.hhh2)res2 <- (y - mu2)/sqrt(mu2 * (1 + 1.74 * mu2))par(mfrow = c(1, 3))plot(mu2 ~ y, xlab = "Observed", ylab = "Fitted", type = "n")for (i in 1:15) {
points(mu2[, i] ~ y[, i], col = i)}abline(0, 1)plot(res1 ~ mu2, xlab = "Fitted", ylab = "Residuals",
type = "n")for (i in 1:15) {
points(res1[, i] ~ mu2[, i], col = i)}abline(0, 0)plot(res1 ~ time, xlab = "Time (week)", ylab = "Residuals",
type = "n")for (i in 1:15) {
points(res1[, i] ~ time[, i], col = i)}abline(0, 0)
Lower Saxony Measles Data
0 20 40
05
1015
2025
30
Observed
Fitt
ed
0 10 20 30
010
2030
4050
60
Fitted
Res
idua
ls
0 40 80
010
2030
4050
60
Time (week)
Res
idua
ls
Lower Saxony Measles Data# Plot for the 5th area:plot(res.hhh2, unit = 5, col = c("#FFA50095",
"#0000FF75", "#D9D9D9"), legend = TRUE)
2001.0 2001.5 2002.0 2002.5 2003.0
0
2
4
6
8
No.
infe
cted
3452
spatiotemporalautoregressiveendemic
Lower Saxony Measles Data
#mu2 <- fitted(res.hhh2)res2 <- (y - mu2)/sqrt(mu2 * (1 + 1.74 * mu2))par(mfrow = c(1, 3))plot(mu2 ~ y, xlab = "Observed", ylab = "Fitted", type = "n")for (i in 1:15) {
points(mu2[, i] ~ y[, i], col = i)}abline(0, 1)plot(res1 ~ mu2, xlab = "Fitted", ylab = "Residuals",
type = "n")for (i in 1:15) {
points(res1[, i] ~ mu2[, i], col = i)}abline(0, 0)plot(res1 ~ time, xlab = "Time (week)", ylab = "Residuals",
type = "n")for (i in 1:15) {
points(res1[, i] ~ time[, i], col = i)}abline(0, 0)
Lower Saxony Measles Data
0 20 50
05
1015
2025
30
Observed
Fitt
ed
0 15 30
010
2030
4050
60
Fitted
Res
idua
ls
0 40 100
010
2030
4050
60
Time (week)
Res
idua
ls
Southern Germany Influenza Data
These data were analyzed by Paul and Held (2011) and consist ofweekly counts of influenza cases in 140 areas of Southern Germanyover 2001–2008.
Southern Germany Influenza Datadata(fluBYBW)# Time series of countsplot(fluBYBW, type = observed ~ time, legend.opts = NULL)
time
No.
infe
cted
2001
II
2002
IV
2004
II
2005
IV
2007
II
2008
IV
020
060
010
00
# Map of Total number of Cases in 2001plot(fluBYBW[year(fluBYBW) == 2001, ], type = observed ~
1 | unit, labels = FALSE)
0 48
Southern Germany Influenza DataThe code below gives a number of increasingly complex models,including spatial conditional auroregressive (CAR) random effects.
Model is
E [Yi ,t+1|yit , yjt ] = µi ,t+1
varv(Yi ,t+1|yit , yjt) = µi ,t+1(1+ µi ,t+1/ψ)
with
µi ,t+1 = λARi yit + λNE
it∑
j∈ne(i)yjt + Nitλ
ENit
log(λENit ) = αEN
0i + α1t + γ sin(ωt) + δ cos(ωt)
with the possibility of allowing log λARi , log λNE
i and log λENit to
contain independent or spatial random effects.
Southern Germany Influenza Data
First, model is basically the measles model, except that area effectsare taken as random effects.
# Endemic Component: linear trend, seasonal terms,# iid random effects Center time variable and put# on year scale: I((t-208)/100)flu.end <- addSeason2formula(~-1 + ri(type = "iid",
corr = "all") + I((t - 208)/100), S = 3, period = 52)# Make Weight matrix for neighbourhood effectswji <- neighbourhood(fluBYBW)/rowSums(neighbourhood(fluBYBW))# Model specificationmod.flu <- list(end = list(f = flu.end, offset = population(fluBYBW)),
ar = list(f = ~1), ne = list(f = ~-1 + ri(type = "iid",corr = "all"), weights = wji), family = "NegBin1")
#res.flu1 <- hhh4(fluBYBW, mod.flu)
Southern Germany Influenza Datasummary(res.flu1)#### Call:## hhh4(stsObj = fluBYBW, control = mod.flu)#### Random effects:## Var Corr## ne.ri(iid) 0.9645## end.ri(iid) 0.5066 0.5652#### Fixed effects:## Estimate Std. Error## ar.1 -0.89193 0.03676## ne.ri(iid) -1.51746 0.10350## end.I((t - 208)/100) 0.57368 0.02366## end.sin(2 * pi * t/52) 2.17870 0.09808## end.cos(2 * pi * t/52) 2.33738 0.12134## end.sin(4 * pi * t/52) 0.45161 0.10427## end.cos(4 * pi * t/52) -0.37668 0.09404## end.sin(6 * pi * t/52) 0.30145 0.06452## end.cos(6 * pi * t/52) -0.24798 0.06294## end.ri(iid) 0.22345 0.10222## overdisp 1.08439 0.03392#### Penalized log-likelihood: -18696.6## Marginal log-likelihood: -343.59#### Number of units: 140## Number of time points: 415
Southern Germany Influenza Data# A plot with the fit for area 8111:plot(res.flu1, unit = 77, col = c("#FFA50095",
"#0000FF75", "#D9D9D9"), legend = TRUE)
2002 2004 2006 2008
0
10
20
30
40
50
No.
infe
cted
8111
spatiotemporalautoregressiveendemic