Spatio-Temporal Statistics · Spatio-Temporal Statistics Noel Cressie* Program in Spatial...
Transcript of Spatio-Temporal Statistics · Spatio-Temporal Statistics Noel Cressie* Program in Spatial...
Spatio-Temporal StatisticsNoel Cressie *
Program in Spatial Statistics and Environmental Statistics
The Ohio State University
Christopher K. Wikle
Department of Statistics
University of Missouri, Columbia
Slides are based on the book, “Statistics for Spatio-Temporal Data”
by Cressie and Wikle, 2011, Wiley, Hoboken, NJ
– p.1/26
Spatio-Temporal Statistics
There is no history without geography (and v.v.). We consider space and timetogether
The dynamical evolution (time dimension) of spatial processes means that we areable to reach more forecefully for the “Why” question. The problems are clearestwhen there is no aggregation ; henceforth consider processes at point-levelsupport
Consider the deterministic 1-D space × time , reaction-diffusion equation:
∂Y (s; t)
∂t= β
∂2Y (s; t)
∂s2− αY (s; t) ;
β is the diffusion coefficient
– p.2/26
Reaction-Diffusion Plots
Space
Tim
e
(a)
20 40
10
20
30
40
50
60
70
80
0 0.5 1
Space
(b)
20 40
10
20
30
40
50
60
70
80
0 0.5 1
Space
(c)
20 40
10
20
30
40
50
60
70
80
0 0.5 1
Y (s, 0) = I(15 ≤ s ≤ 24)
(a) α = 1, β = 20; (b) α = 0.05, β = 0.05; (c) α = 1, β = 50
– p.3/26
Stochastic Version
Consider the stochastic PDE:
∂Y
∂t− β
∂2Y
∂s2+ αY = δ ,
where {δ(s; t) : s ∈ R, t ≥ 0} is a zero-mean random process. Here we assume whitenoise for δ:
E(δ(s; t)) ≡ 0
cov(δ(s; t), δ(u; r)) = σ2I(s = u, t = r)
– p.4/26
Stochastic Reaction-Diffusion Plots
Space
Tim
e
(a)
20 40
10
20
30
40
50
60
70
80
0 0.5 1
Space
(b)
20 40
10
20
30
40
50
60
70
80
−0.5 0 0.5 1
Space
(c)
20 40
10
20
30
40
50
60
70
80
−5 0 5
Y (s, 0) = I(15 ≤ s ≤ 24)
α = 1 , β = 20
(a) σ = 0.01; (b) σ = 0.1; (c) σ = 1
– p.5/26
Spatio-Temporal Covariance Function
The stochastic reaction-diffusion equation implies a (stationary in space and time)covariance function :
C(h; τ) ≡ cov(Y (s; t), Y (s + h; t + τ))
and correlation function :
ρ(h; τ) ≡ C(h; τ)/C(0; 0)
Heine (1955) Biometrika, gives a closed-form solution for ρ(·; ·)
– p.6/26
Contour Plot of Spatio-Temporal Correlation Function
0.10.1
0.1
0.1
0.2
0.20.2
0.3
0.3
0.4
0.4
0.50.60.70.8
h
τ
0 1 2 3 4 5 6 7 8 90
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
2
ρ(h; τ) for the stochastic reaction-diffusion equation
– p.7/26
Separability of Spatio-Temporal Covariance Functions
Stochastic PDEs are built from dynamical physical considerations and they implycovariance functions
Covariance functions have to be positive-definite (p-d) . So, specifying classesof spatio-temporal covariance functions to describe the dependence inspatio-temporal data is not all that easy
Suppose the spatial C(1)(h) is p-d and the temporal C(2)(τ) is p-d. Then theseparable class:
C(h; τ) ≡ C(1)(h) · C(2)(τ)
is guaranteed to be p-d
Separability is unusual in dynamical models ; it says that temporal evolutionproceeds independently at each spatial location
– p.8/26
Stochastic Reaction-Diffusion and Separability
If C(h; τ) = C(1)(h) · C(2)(τ),then
C(h; 0) = C(1)(h)C(2)(0)
C(0; τ) = C(1)(0)C(2)(τ) ,
and henceρ(h; τ) =
C(1)(h) · C(2)(τ)
C(0; 0)
=C(h; 0) · C(0; τ)
C(0; 0) · C(0; 0)
= ρ(h; 0) · ρ(0; τ)
Is this true for the stochastic reaction-diffusion equation? Plot
ρ(h; 0) · ρ(0; τ) versus (h, τ)
ρ(h; τ) versus (h, τ)
– p.9/26
Contour Plots of Spatio-Temporal Correlation Functions
0.1
0.1
0.1
0.10.2
0.2
0.2
0.3
0.3
0.4
0.4
0.50.60.70.8
h
τ
(a)
0 1 2 3 4 5 6 7 8 90
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
2
0.10.1
0.1
0.20.2
0.2
0.3
0.3
0.4
0.4
0.50.60.70.8
h
τ
(b)
0 1 2 3 4 5 6 7 8 90
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
2
(a) ρ(h; 0) · ρ(0; τ); (b) ρ(h; τ)
The difference in correlation functions is striking . Hence ρ(·; ·) is not separable .Can we see the difference between separability and non-separability in theirrealizations ?
– p.10/26
Non-Separable Realizations in Space-Time
Three realizations of Y (s; t)
Space
Tim
e
(a)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Space
(b)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Space
(c)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Realizations are generated from a stationary Gaussian process with the
non-separable , reaction-diffusion correlation function, ρ(h; τ)
– p.11/26
Separable Realizations in Space-Time
Three realizations of Y (s; t)
Tim
e
Space
(a)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Space
(b)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Space
(c)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Realizations are generated from a stationary Gaussian process with separablecorrelation function, ρ(h; 0) · ρ(0; τ)
– p.12/26
Inference on a Hidden Spatio-Temporal Process
We could ignore the dynamics and treat time as another “spatial” dimension. Writethe data as:
Z = (Z(s1; t1), . . . , Z(sn; tn))′ ,
which are observations taken at known space-time “locations”. The data are noisyand not observed at all locations of interest
Assume a hidden (“true”) process {Y (s; t) : s ∈ D ⊂ Rd ; t ≥ 0}, which is not
observable due to measurement error and missingness. Write
Z = Y + ε ,
where E(ε) = 0, var(ε) = σ2εI. We wish to predict Y (s0; t0) from data Z
– p.13/26
Spatio-Temporal Kriging
Predict Y (s0; t0) with the linear predictor λ′Z:
For simplicity, assume E(Y (s; t)) ≡ 0. Then minimize w.r.t. λ, the mean squaredprediction error ,
E(Y (s0; t0) − λ′Z)2 .
This results in the simple kriging predictor :
bY (s0; t0) = c(s0; t0)′Σ−1Z Z ,
where ΣZ ≡ var(Z) and c(s0; t0) = cov(Y (s0; t0),Z)
The simple kriging standard error (s.e.) is:
σk(s0; t0) = {var(Y (s0; t0)) − c(s0; t0)′Σ−1
Z c(s0; t0)}1/2
– p.14/26
Kriging for the Stochastic Reaction-Diffusion Equation
Space
Tim
e
(a)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Space
(b)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Space
(c)
20 40
10
20
30
40
50
60
70
80
0 0.5 1
(a) Full realization; ε = 0
(b) Crosses show {(si; ti)} superimposed on the kriging predictormap, {bY (s0; t0)}
(c) Kriging s.e. map, {σk(s0; t0)}
– p.15/26
Kriging for the Stochastic Reaction-Diffusion Equation, ctd.
Space
Tim
e
(a)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Space
(b)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Space
(c)
20 40
10
20
30
40
50
60
70
80
0 0.5 1
(a) Same full realization; ε = 0
(b) Crosses show different {(si; ti)} superimposed on the kriging predictormap, {bY (s0; t0)}
(c) Kriging s.e. map, {σk(s0; t0)}
– p.16/26
Emphasize the Dynamics
Approximate the differentials in the reaction-diffusion equation:
∂Y
∂t= β
∂2Y
∂s2− αY
with differences :
Y (s; t + ∆t) − Y (s; t)
∆t= β
Y (s + ∆s; t) − 2Y (s; t) + Y (s − ∆s; t)
∆2s
ff− αY (s; t)
Define Yt ≡ (Y (∆s; t), . . . , Y (79 − ∆s; t);YBt ≡ (Y (0; t), Y (79; t))′. Then the
stochastic version of the difference equation is:
Yt+∆t= MYt + MBY
Bt + δt+∆t
,
where MBYBt represents given boundary effects. The difference equation is a good
approximation to the differential equation, provided α∆t < 1 and 2β∆t/∆2s < 1
– p.17/26
Emphasize the Dynamics, ctd.
Importantly,
M =
2666666666664
θ1 θ2 0 . . . 0
θ2 θ1 θ2 . . ....
0 θ2 θ1
. . ....
.... . .
. . . θ2
0 0 . . . θ2 θ1
3777777777775
,
where θ1 = (1 − α∆t − 2β∆t/∆2s), θ2 = β∆t/∆2
s . This can be viewed as thepropagator matrix of a VAR(1) process. The matrix is defined by the dynamics. In otherwords, in a model of spatio-temporal dependence, M has structure (and is sparse ).Conditional on the boundary effects, we see that the lagged covariances are given by,
C(m)Y = MmC
(0)Y ,
where C(m)Y ≡ cov(Yt,Yt+m∆t
); m = 0, 1, 2, . . .
– p.18/26
Comparison of Differential and Difference Equations
0 0.5 1 1.5 20
0.2
0.4
0.6
0.8
1
τ
ρ(h,
τ)
h = 0
0 0.5 1 1.5 20
0.2
0.4
0.6
0.8
τ
ρ(h,
τ)
h = 1
0 0.5 1 1.5 20
0.2
0.4
0.6
0.8
τ
ρ(h,
τ)
h = 2
0 0.5 1 1.5 20
0.2
0.4
0.6
0.8
τ
ρ(h,
τ)
h = 3
Spatio-temporal correlations; α = 1, β = 20, ∆s = 1, and ∆t = 0.01
Solid blue line : from differential equations
Red dots : from difference equations
– p.19/26
The Dynamics in the Difference Equation
Think of a spatial process at time t rather than a spatio-temporal process. Call it thevector Yt. Then describe the dynamics by a VAR(1):
Yt = MYt−1 + δt
The choice of M is crucial. Define M ≡ (mij) “spatially” , that is, where the mij
corresponding to nearby locations si and sj are non-zero , and are zero when locationsare far apart
This applies the “First Law of Geography” (Tobler; cf. Fisher and wheat yields) to the
dynamical evolution of the process
– p.20/26
Structure of M
su = s
u = u1
u = u2
t-1 t
u
u
u
uii
i − 2
i − 1
i + 1
i + 2
i − 2
i − 1
i + 1
i + 2
t-1 t
u u
u
u
u
u
u
u
u
u
u
u
u
u
General M M defined “spatially”
– p.21/26
Instantaneous Spatial Dependence (ISD)
To capture the process’ behavior at small temporal scales between time t and time t + 1,we need a component of variation that is modeled as instantaneous spatialdependence (ISD) :
Yt = B0Yt + B1Yt−1 + νt ,
where B0 has zero down its diagonal. Model B0 and B1 “spatially” ; see the figurebelow. This implies
Yt = MYt−1 + δt ,
where M = (I − B0)−1B1 and δt = (I − B0)−1νt. What use are B0 and B1? They
have dynamic structure and are sparse !
– p.22/26
ISD in Graphical Form
(a)
ii
i − 2
i − 1
i + 1
i + 2
i − 2
i − 1
i + 1
i + 2
t-1 t
u u
u
u
u
u
u
u
u
u
u
u
u
u
(b)
ii
i − 2
i − 1
i + 1
i + 2
i − 2
i − 1
i + 1
i + 2
t-1 t
u u
u
u
u
u
u
u
u
u
u
u
u
u
(a) Graph structure (sparse) showing (b) Equivalent directed graph
relationships that are defined “spatially” structure (non-sparse )
– p.23/26
Nonstationarity
Stationarity can be an unrealistic assumption. Descriptive approaches tospatio-temporal modeling, expressed in terms of covariance functions, almostdemand it.
Dynamical approaches are much more forgiving. Consider the nonstationaryVAR(1) process:
Yt = MtYt−1 + δt
For example,
Mt = f(t) · M ,
where
f(t) =
8>><>>:
1 0 ≤ t ≤ 29
−1 30 ≤ t ≤ 59
1 60 ≤ t ≤ 79 ,
and M is tridiagonal but has different parameters for 0 ≤ s ≤ 19 and for 20 ≤ s ≤ 39
– p.24/26
Realizations for Nonstationary Process
Three realizations of Y (s; t)
Tim
e
Space
(a)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Space
(b)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Space
(c)
20 40
10
20
30
40
50
60
70
80
−2 0 2
Yt = MtYt−1 + δt
– p.25/26
Space-Time
... the NEXT frontier
(with apologies to Gene Roddenberry and Trekkies)
– p.26/26