Analysis of spatio-temporal point patterns with replication


Jonatan A. González1∗, Ute Hahn2, and Jorge Mateu1

1 Departamento de Matemáticas, Universitat Jaume I, Castellón, Spain; [email protected], [email protected]
2 Centre for Stochastic Geometry and Advanced Bioimaging, Department of Mathematics, University of Aarhus, Aarhus C, Denmark; [email protected]
∗ Corresponding author

Abstract. We develop and apply several methods for the analysis of replicated spatio-temporal point patterns in order to identify structural differences between groups of them. First, we calculate a number of functional descriptors of each spatio-temporal pattern to investigate departures from completely random patterns, both among subjects and groups. The distributions of our functional descriptors and of our derived test statistics are unknown, so for the nonparametric inference we use bootstrap and permutation procedures to estimate the null distribution of our test statistics, even though the null hypothesis is not supported by these data. A simulation study provides evidence of the validity and power of our procedures.

Keywords. K-function, non-parametric test, spatio-temporal point process, subsampling, permutation test.

1 Spatio-temporal tests

We assume that $X$ is a spatio-temporal point process with a separable, locally integrable intensity function $\lambda(u,v) = \lambda_1(u)\lambda_2(v)$, where $\lambda_1$ and $\lambda_2$ are non-negative integrable functions; for a stationary and isotropic spatio-temporal process, $\lambda(u,v)$ takes a constant value $\lambda$. The K-function of a stationary, isotropic spatio-temporal process is defined as

$$K(r,t) = \lambda^{-1}\, \mathbb{E}^{!}_{0}\big(N[c(r,t)]\big) \tag{1}$$

where $c(r,t)$ represents a spatio-temporal cylinder with radius $r$ and height $2t$, and $N(\cdot)$ is the number of points in a spatio-temporal volume. The conditional expectation can be interpreted as the expected number of further events within distance $r$ and time $t$ of an arbitrary event (taken as the origin). Note that the K-function measures the structure of the pattern independently of its spatio-temporal density. Under a spatio-temporal homogeneous Poisson process, whose spatial and temporal components are independent homogeneous

González, Hahn and Mateu ANOVA for spatio-temporal point patterns

Poisson processes on $\mathbb{R}^2$ and $\mathbb{R}_+$ respectively, $K(r,t) = 2\pi r^2 t$, which is the volume of a cylinder with base radius $r$ and height $2t$. An estimator of $K(r,t)$ is given by

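$$\hat{K}(r,t) = \frac{1}{\hat{\lambda}^2\, |W \times T|} \sum_{i \neq j} \mathbf{1}\big[\|u_i - u_j\| \le r\big]\, \mathbf{1}\big[|t_i - t_j| \le t\big]\, e_2(u_i,u_j)\, e_1(t_i,t_j), \tag{2}$$

where $e_d(\cdot)$ is a $d$-dimensional edge correction function.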
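As a rough illustration of estimator (2), the following Python sketch evaluates $\hat{K}(r,t)$ on a grid of $(r,t)$ values. It is a minimal sketch under simplifying assumptions that are not part of the text above: both edge-correction terms are set to 1 (no edge correction), the intensity is estimated as $\hat{\lambda} = n/|W \times T|$, and the function name `stik_estimate` and its arguments are hypothetical.

```python
import numpy as np

def stik_estimate(xy, times, r_grid, t_grid, area, duration):
    """Naive estimate of K(r, t) on a grid; edge-correction weights are assumed equal to 1."""
    xy = np.asarray(xy, dtype=float)          # spatial locations, shape (n, 2)
    times = np.asarray(times, dtype=float)    # event times, shape (n,)
    r_grid = np.asarray(r_grid, dtype=float)
    t_grid = np.asarray(t_grid, dtype=float)
    n = len(times)
    volume = area * duration                  # |W x T|
    lam_hat = n / volume                      # assumed intensity estimate n / |W x T|

    # Pairwise spatial distances and temporal lags
    d_space = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    d_time = np.abs(times[:, None] - times[None, :])

    # Keep ordered pairs with i != j only
    off_diag = ~np.eye(n, dtype=bool)
    ds = d_space[off_diag]
    dt = d_time[off_diag]

    # counts[a, b] = number of ordered pairs with distance <= r_a and lag <= t_b
    close_s = (ds[:, None] <= r_grid[None, :]).astype(float)
    close_t = (dt[:, None] <= t_grid[None, :]).astype(float)
    counts = close_s.T @ close_t

    return counts / (lam_hat ** 2 * volume)
```

For a simulated homogeneous Poisson pattern, the output can be compared with the theoretical value $2\pi r^2 t$; without edge correction the estimate is expected to fall below it for large $r$ and $t$.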

1.1 Diggle’s spatio-temporal test procedure

To test for differences between independent replicates of empirical spatial K-functions, [1, 2] suggested a bootstrap procedure. We develop a similar but more general test for the space-time case. Suppose the original sample consists of $g$ groups of sizes $m_1, \ldots, m_g$. Let $w_{ij} = n_{ij}/n_i$, with $n_i = \sum_{j=1}^{m_i} n_{ij}$, and $n = \sum_{i=1}^{g} n_i$, where $n_{ij}$ denotes the number of points in the $j$-th pattern of group $i$. Given an estimated descriptor $K_{ij}(r,t)$ for each pattern, we define the estimated group-specific and overall mean functions, as usual in heteroscedastic ANOVA, by

$$\bar{K}_i(r,t) = \sum_{j=1}^{m_i} w_{ij}\, \hat{K}_{ij}(r,t) \quad \text{and} \quad \bar{K}(r,t) = \frac{1}{n} \sum_{i=1}^{g} n_i\, \bar{K}_i(r,t); \tag{3}$$

and the statistic

$$D_{st} = \sum_{i=1}^{g} \int_0^{r_0} \int_0^{t_0} \frac{n_i}{r^2 t} \left[\bar{K}_i(r,t) - \bar{K}(r,t)\right]^2 \, dr\, dt, \tag{4}$$

which is a natural extension of the statistic proposed by [2] to measure differences between groups.

The sampling variation of $K_{ij}(r,t)$ increases with $r$ and $t$, so we use the weighting factor $1/(r^2 t)$, which down-weights the variance of the space-time K-function estimates at large $r$ and $t$. The statistic $D_{st}$ is a sensible measure of the extent to which the group-specific mean K-functions differ, and is analogous to a residual sum of squares in a conventional one-way ANOVA.
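The statistic (4) can be approximated numerically once the K-function estimates are available on a grid. The sketch below is one possible discretisation, not a prescribed implementation: it assumes the estimates are stored in a pooled array `K_all` of shape (number of patterns, number of $r$ values, number of $t$ values), uses a trapezoidal rule for the double integral, and requires grids that start at strictly positive $r$ and $t$ so that the weight $1/(r^2 t)$ is finite. All function and argument names are hypothetical.

```python
import numpy as np

def integrate_2d(f, r, t):
    """Trapezoidal rule on the (r, t) grid; f has shape (len(r), len(t))."""
    inner = np.sum(0.5 * (f[:, 1:] + f[:, :-1]) * np.diff(t), axis=1)
    return float(np.sum(0.5 * (inner[1:] + inner[:-1]) * np.diff(r)))

def diggle_stat(K_all, counts, groups, r_grid, t_grid):
    """Discretised D_st from pooled K estimates, point counts n_ij and group labels."""
    K_all = np.asarray(K_all, dtype=float)
    counts = np.asarray(counts, dtype=float)
    groups = np.asarray(groups)
    r = np.asarray(r_grid, dtype=float)
    t = np.asarray(t_grid, dtype=float)

    labels = np.unique(groups)
    # n_i: total number of points in group i
    n_i = np.array([counts[groups == g].sum() for g in labels])
    # Group means with weights w_ij = n_ij / n_i, as in (3)
    Kbar_i = np.stack([
        np.tensordot(counts[groups == g], K_all[groups == g], axes=1) / ni
        for g, ni in zip(labels, n_i)
    ])
    # Overall mean, weighted by n_i / n
    Kbar = np.tensordot(n_i, Kbar_i, axes=1) / n_i.sum()

    weight = 1.0 / (r[:, None] ** 2 * t[None, :])   # 1 / (r^2 t)
    integrand = (n_i[:, None, None] * weight * (Kbar_i - Kbar) ** 2).sum(axis=0)
    return integrate_2d(integrand, r, t)
```

When all group means coincide the statistic is zero, and it grows as the group-specific means move away from the overall mean.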

1.1.1 A Bootstrap Procedure

The interest focuses on testing the null hypothesis that K-functions do not differ between groups, i.e.

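$$H_0: \mathbb{E}(\bar{K}_1(r,t)) = \mathbb{E}(\bar{K}_2(r,t)) = \cdots = \mathbb{E}(\bar{K}_g(r,t)) \quad \text{for all } r \text{ and } t$$

$$H_1: \mathbb{E}(\bar{K}_u(r,t)) \neq \mathbb{E}(\bar{K}_v(r,t)) \quad \text{for some } r, \text{ some } t \text{ and some } u \text{ and } v.$$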

The analytical form of the probability distribution of $D_{st}$ is intractable, but we perform a pure randomization test, permuting the $\hat{K}_{ij}(r,t)$ across groups and recomputing $D_{st}$, in order to obtain its exact conditional distribution. We generate bootstrap samples as follows: in the first step, residual spatio-temporal functions are defined as

$$\hat{R}_{ij}(r,t) = n_{ij}^{1/2} \left(\hat{K}_{ij}(r,t) - \bar{K}_i(r,t)\right). \tag{5}$$

Under the null or the alternative hypotheses, the $\hat{R}_{ij}(r,t)$ are approximately exchangeable quantities, since the sampling variance of each $K_{ij}(r,t)$ is proportional to $n_{ij}^{-1}$. Note that

$$\hat{K}_{ij}(r,t) = \bar{K}_i(r,t) + n_{ij}^{-1/2}\, \hat{R}_{ij}(r,t).$$

Joint METMAVII and GRASPA14 Workshop 2

Then, we obtain a random sample, without replacement, of functional residuals and define

$$\hat{K}^{boot}_{ij}(r,t) = \bar{K}(r,t) + n_{ij}^{-1/2}\, \hat{R}^{boot}_{ij}(r,t). \tag{6}$$

To determine the bootstrap p-value, the observed value of $D_{st}$ is ranked among the corresponding bootstrap values ($D^{boot}_{st}$). We then analyse a set of simulations generated by varying parameters such as the number of patterns per group or the intensity.
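The resampling steps (5) and (6) and the ranking of the observed $D_{st}$ can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: K-function estimates are stored on a grid with strictly positive $r$ and $t$ values, the double integral is approximated by a trapezoidal rule, and the p-value uses the convention $(1 + \#\{D^{boot}_{st} \ge D_{st}\})/(B+1)$, which is a common choice that the text does not specify. All names are hypothetical.

```python
import numpy as np

def integrate_2d(f, r, t):
    """Trapezoidal rule on the (r, t) grid; f has shape (len(r), len(t))."""
    inner = np.sum(0.5 * (f[:, 1:] + f[:, :-1]) * np.diff(t), axis=1)
    return float(np.sum(0.5 * (inner[1:] + inner[:-1]) * np.diff(r)))

def diggle_stat(K_all, counts, idx, r, t):
    """Return D_st together with the group means and the overall mean, as in (3) and (4)."""
    labels = np.unique(idx)
    n_i = np.array([counts[idx == g].sum() for g in labels])
    Kbar_i = np.stack([
        np.tensordot(counts[idx == g], K_all[idx == g], axes=1) / ni
        for g, ni in zip(labels, n_i)
    ])
    Kbar = np.tensordot(n_i, Kbar_i, axes=1) / n_i.sum()
    weight = 1.0 / (r[:, None] ** 2 * t[None, :])
    integrand = (n_i[:, None, None] * weight * (Kbar_i - Kbar) ** 2).sum(axis=0)
    return integrate_2d(integrand, r, t), Kbar_i, Kbar

def residual_bootstrap_pvalue(K_all, counts, groups, r_grid, t_grid, n_boot=999, seed=0):
    """Bootstrap p-value for D_st based on permuted residual functions."""
    K_all = np.asarray(K_all, dtype=float)       # shape (N, nr, nt)
    counts = np.asarray(counts, dtype=float)     # point counts n_ij, shape (N,)
    r = np.asarray(r_grid, dtype=float)
    t = np.asarray(t_grid, dtype=float)
    _, idx = np.unique(np.asarray(groups), return_inverse=True)  # group index 0..g-1

    d_obs, Kbar_i, Kbar = diggle_stat(K_all, counts, idx, r, t)
    root_n = np.sqrt(counts)[:, None, None]
    resid = root_n * (K_all - Kbar_i[idx])       # residual functions, eq. (5)

    rng = np.random.default_rng(seed)
    d_boot = np.empty(n_boot)
    for b in range(n_boot):
        perm = rng.permutation(len(counts))      # residuals drawn without replacement
        K_boot = Kbar + resid[perm] / root_n     # bootstrap functions, eq. (6)
        d_boot[b] = diggle_stat(K_boot, counts, idx, r, t)[0]

    # Rank the observed statistic among the bootstrap values (assumed convention)
    return (1 + np.sum(d_boot >= d_obs)) / (n_boot + 1)
```

Repeating this over many simulated data sets generated under the null hypothesis and inspecting the histogram of the resulting p-values is one way to check whether they are uniformly distributed.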

The simulation study indicates that this way of bootstrapping by permutation of residuals may fail to reproduce the distribution of the test statistic: the p-values are non-uniform in some of the simulation scenarios considered. This leads us to believe that the generalisation of Diggle’s statistic is not sufficient for making comparisons in the spatio-temporal case.

1.2 Spatio-temporal Hahn’s permutation test

Because the spatio-temporal version of Diggle’s test yields non-uniform p-values under the null hypothesis, we give a generalised spatio-temporal version of the Studentized permutation test proposed by [3], which has uniformly distributed rejection rates by construction. Consider the estimates of $K_{ij}(r,t)$ obtained with an unbiased estimator. Let

$$\bar{K}_i(r,t) = \frac{1}{m_i} \sum_{j=1}^{m_i} \hat{K}_{ij}(r,t) \quad \text{and} \quad s_i^2(r,t) = \frac{1}{m_i - 1} \sum_{j=1}^{m_i} \left(\hat{K}_{ij}(r,t) - \bar{K}_i(r,t)\right)^2$$

denote the empirical mean and variance of the K-function estimates in a given group $i$. We define a statistic related to the t-statistic as

$$T_{st} = \sum_{1 \le i < j \le g} \int_0^{r_0} \int_0^{t_0} \frac{\left(\bar{K}_i(r,t) - \bar{K}_j(r,t)\right)^2}{m_i^{-1} s_i^2(r,t) + m_j^{-1} s_j^2(r,t)} \, dr\, dt \tag{7}$$

The use of the statistic $T_{st}$ may lead to tests that are sensitive to heteroscedasticity. In these cases, we prefer to use the statistic

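$$\bar{T}_{st} = \sum_{1 \le i < j \le g} \int_0^{r_0} \int_0^{t_0} \frac{\left(\bar{K}_i(r,t) - \bar{K}_j(r,t)\right)^2}{m_i^{-1} \bar{s}_i^2(r,t) + m_j^{-1} \bar{s}_j^2(r,t)} \, dr\, dt, \tag{8}$$

where

$$\bar{s}_i^2(r,t) = \frac{r^2 t}{r_0 t_0} \int_0^{r_0} \int_0^{t_0} \frac{s_i^2(u,v)}{u^2 v} \, du\, dv.$$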

Under heteroscedasticity, using $\bar{T}_{st}$ instead of $T_{st}$ yields a better performance of the test. As expected, the Studentized permutation tests perform better overall than Diggle’s tests, as shown by a new set of simulations.
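A permutation test based on the statistic $T_{st}$ in (7) can be sketched in Python as below. This is an illustrative sketch under assumptions not stated in the text: K-function estimates are available on a common $(r,t)$ grid, each group contains at least two patterns with non-zero empirical variance at every grid point, the integral is approximated by a trapezoidal rule, and the p-value uses the $(1 + \text{count})/(B + 1)$ convention. The variance-stabilised variant (8) is not implemented here, and all names are hypothetical.

```python
import numpy as np
from itertools import combinations

def integrate_2d(f, r, t):
    """Trapezoidal rule on the (r, t) grid; f has shape (len(r), len(t))."""
    inner = np.sum(0.5 * (f[:, 1:] + f[:, :-1]) * np.diff(t), axis=1)
    return float(np.sum(0.5 * (inner[1:] + inner[:-1]) * np.diff(r)))

def studentized_stat(K_all, labels, r, t):
    """Discretised T_st: sum over group pairs of the integrated Studentized squared difference."""
    groups = np.unique(labels)
    means, var_over_m = {}, {}
    for g in groups:
        Kg = K_all[labels == g]
        means[g] = Kg.mean(axis=0)                          # group mean function
        var_over_m[g] = Kg.var(axis=0, ddof=1) / len(Kg)    # s_i^2 / m_i
    total = 0.0
    for a, b in combinations(groups, 2):
        integrand = (means[a] - means[b]) ** 2 / (var_over_m[a] + var_over_m[b])
        total += integrate_2d(integrand, r, t)
    return total

def studentized_permutation_pvalue(K_all, labels, r_grid, t_grid, n_perm=999, seed=0):
    """Permutation p-value: group labels are randomly reassigned to the K estimates."""
    K_all = np.asarray(K_all, dtype=float)    # shape (N, nr, nt)
    labels = np.asarray(labels)
    r = np.asarray(r_grid, dtype=float)
    t = np.asarray(t_grid, dtype=float)
    rng = np.random.default_rng(seed)

    t_obs = studentized_stat(K_all, labels, r, t)
    t_perm = np.array([
        studentized_stat(K_all, rng.permutation(labels), r, t)
        for _ in range(n_perm)
    ])
    return (1 + np.sum(t_perm >= t_obs)) / (n_perm + 1)
```

Permuting the labels keeps the group sizes fixed, so every permuted data set has the same design as the observed one.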

2 Database

We have a climate database recording the locations of tornado occurrences in the USA over 59 years (see Figure 1). We want to detect possible differences between the patterns of tornado occurrence in the hot (Spring-Summer) and cold (Autumn-Winter) seasons. The database exhibits marked spatial inhomogeneity, which must be addressed before applying any of the tests we have proposed. We therefore divide the spatial region into subregions (as in a tessellation) according to areas of higher or lower intensity, so that the intensity is approximately homogeneous within each subregion and the Studentized permutation test can be applied to our data.

Figure 1: Observed point patterns of tornado occurrences in the U.S. from 1953 until 2012, in two groups corresponding to two climatic seasons. [Panels: Autumn-Winter and Spring-Summer; year axis with ticks from 1953 to 2003.]

Acknowledgments. This research was supported by grant MTM2010-14961 from the Spanish Ministry of Science and Education and by the Centre for Stochastic Geometry and Advanced Bioimaging. We thank the Storm Prediction Center of the National Oceanic and Atmospheric Administration (NOAA), which is part of the National Weather Service (NWS) and the National Centers for Environmental Prediction (NCEP), for providing the dataset.

References

[1] Diggle, Peter J. and Lange, Nicholas and Beneš, Francine M. (1991). Analysis of Variance for Replicated Spatial Point Patterns in Clinical Neuroanatomy, Journal of the American Statistical Association, 86, 618–625.

[2] Diggle, Peter J. and Mateu, Jorge and Clough, Helen E. (2000). A comparison between parametric and non-parametric approaches to the analysis of replicated spatial point patterns, Advances in Applied Probability, 32(2), 331–343.

[3] Hahn, Ute. (2012). A Studentized Permutation Test for the Comparison of Spatial Point Patterns, Journal of the American Statistical Association, 107, 754–764.
