www.csgb.dk
RESEARCH REPORT 2017
CENTRE FOR STOCHASTIC GEOMETRY AND ADVANCED BIOIMAGING
Abdollah Jalilian, Yongtao Guan and Rasmus Waagepetersen
Orthogonal series estimation of the pair correlation function of a spatial point process
No. 01, March 2017
Orthogonal series estimation of the pair correlation function of a spatial point process
Abdollah Jalilian1, Yongtao Guan2 and Rasmus Waagepetersen3
1Department of Statistics, Razi University, Iran, [email protected]
2Department of Management Science, University of Miami, [email protected]
3Department of Mathematical Sciences, Aalborg University, Denmark, [email protected]
Abstract
The pair correlation function is a fundamental spatial point process characteristic that, given the intensity function, determines second order moments of the point process. Non-parametric estimation of the pair correlation function is a typical initial step of a statistical analysis of a spatial point pattern. Kernel estimators are popular but, especially for clustered point patterns, suffer from bias for small spatial lags. In this paper we introduce a new orthogonal series estimator. The new estimator is consistent and asymptotically normal according to our theoretical and simulation results. Our simulations further show that the new estimator can outperform the kernel estimators, in particular for Poisson and clustered point processes.
Keywords: Asymptotic normality; Consistency; Kernel estimator; Orthogonal series estimator; Pair correlation function; Spatial point process.
1 Introduction
The pair correlation function is commonly considered the most informative second-order summary statistic of a spatial point process (Stoyan and Stoyan, 1994; Møller and Waagepetersen, 2003; Illian et al., 2008). Non-parametric estimates of the pair correlation function are useful for assessing regularity or clustering of a spatial point pattern and can moreover be used for inferring parametric models for spatial point processes via minimum contrast estimation (Stoyan and Stoyan, 1996; Illian et al., 2008). Although alternatives exist (Yue and Loh, 2013), kernel estimation is by far the most popular approach (Stoyan and Stoyan, 1994; Møller and Waagepetersen, 2003; Illian et al., 2008) and is closely related to kernel estimation of probability densities.
Kernel estimation is computationally fast and works well except at small spatial lags. For spatial lags close to zero, kernel estimators suffer from strong bias, see e.g. the discussion at page 186 in Stoyan and Stoyan (1994), Example 4.7 in Møller and Waagepetersen (2003) and Section 7.6.2 in Baddeley et al. (2015). The bias is a major drawback if one attempts to infer a parametric model from the non-parametric estimate, since the behavior near zero is important for determining the right parametric model (Jalilian et al., 2013).
In this paper we adapt orthogonal series density estimators (see e.g. the reviews in Hall, 1987; Efromovich, 2010) to the estimation of the pair correlation function. We derive unbiased estimators of the coefficients in an orthogonal series expansion of the pair correlation function and propose a criterion for choosing an optimal smoothing scheme. In the literature on orthogonal series estimation of probability densities, the data are usually assumed to consist of independent observations from the unknown target density. In our case the situation is more complicated, as the data used for estimation consist of spatial lags between observed pairs of points. These lags are neither independent nor identically distributed, and the sample of lags is biased due to edge effects. We establish consistency and asymptotic normality of our new orthogonal series estimator and study its performance in a simulation study and an application to a tropical rain forest data set.
2 Background
2.1 Spatial point processes
We denote by X a point process on Rd, d ≥ 1, that is, X is a locally finite random subset of Rd. For B ⊆ Rd, we let N(B) denote the random number of points in X ∩ B. That X is locally finite means that N(B) is finite almost surely whenever B is bounded. We assume that X has an intensity function ρ and a second-order joint intensity ρ(2) so that for bounded A, B ⊂ Rd,

E{N(B)} = ∫_B ρ(u) du,
E{N(A)N(B)} = ∫_{A∩B} ρ(u) du + ∫_A ∫_B ρ(2)(u, v) du dv.    (2.1)
The pair correlation function g is defined as g(u, v) = ρ(2)(u, v)/{ρ(u)ρ(v)} whenever ρ(u)ρ(v) > 0 (otherwise we define g(u, v) = 0). By (2.1),
Cov{N(A), N(B)} = ∫_{A∩B} ρ(u) du + ∫_A ∫_B ρ(u)ρ(v){g(u, v) − 1} du dv
for bounded A, B ⊂ Rd. Hence, given the intensity function, g determines the covariances of the count variables N(A) and N(B). Further, for locations u, v ∈ Rd, g(u, v) > 1 (< 1) implies that the presence of a point at v yields an elevated (decreased) probability of observing yet another point in a small neighbourhood of u (e.g. Coeurjolly et al., 2016). In this paper we assume that g is isotropic, i.e., with an abuse of notation, g(u, v) = g(‖v − u‖). Examples of pair correlation functions are shown in Figure 6.1.
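For orientation, a standard special case (our illustration, not part of the derivations in this paper): for a Poisson process the points do not interact, so the definitions above give

```latex
% Poisson process: \rho^{(2)}(u,v) = \rho(u)\rho(v), hence
g(u,v) = \frac{\rho^{(2)}(u,v)}{\rho(u)\,\rho(v)} = 1,
% and the covariance formula above reduces to its first term:
\mathrm{Cov}\{N(A), N(B)\} = \int_{A \cap B} \rho(u)\,\mathrm{d}u .
```

Clustered processes accordingly show g(r) > 1 at small lags, while repulsive processes show g(r) < 1.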
2.2 Kernel estimation of the pair correlation function
Suppose X is observed within a bounded observation window W ⊂ Rd and let XW = X ∩ W. Let kb(·) be a kernel of the form kb(r) = k(r/b)/b, where k is a probability density and b > 0 is the bandwidth. Then a kernel density estimator (Stoyan and Stoyan, 1994; Baddeley et al., 2000) of g is
gk(r; b) = 1/(sad r^{d−1}) Σ≠_{u,v∈XW} kb(r − ‖v − u‖) / {ρ(u)ρ(v) |W ∩ W_{v−u}|},   r ≥ 0,
where sad is the surface area of the unit sphere in Rd, Σ≠ denotes the sum over all pairs of distinct points, 1/|W ∩ W_h|, h ∈ Rd, is the translation edge correction factor with W_h = {u − h : u ∈ W}, and |A| is the volume (Lebesgue measure) of A ⊂ Rd. Variations of this include (Guan, 2007a)
gd(r; b) = (1/sad) Σ≠_{u,v∈XW} kb(r − ‖v − u‖) / {‖v − u‖^{d−1} ρ(u)ρ(v) |W ∩ W_{v−u}|},   r ≥ 0,
and the bias corrected estimator (Guan, 2007a)
gc(r; b) = gd(r; b)/c(r; b),   c(r; b) = ∫_{−b}^{min{r,b}} kb(t) dt,
assuming k has bounded support [−1, 1]. Regarding the choice of kernel, Illian et al. (2008), p. 230, recommend the uniform kernel k(r) = 1(|r| ≤ 1)/2, where 1(·) denotes the indicator function, but the Epanechnikov kernel k(r) = (3/4)(1 − r²)1(|r| ≤ 1) is another common choice. The choice of the bandwidth b highly affects the bias and variance of the kernel estimator. In the planar (d = 2) stationary case, Illian et al. (2008), p. 236, recommend b = 0.10/√ρ based on practical experience, where ρ is an estimate of the constant intensity. The default in spatstat (Baddeley et al., 2015), following Stoyan and Stoyan (1994), is to use the Epanechnikov kernel with b = 0.15/√ρ.
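To make the kernel sum and the translation edge correction concrete, here is a minimal sketch of gk(r; b) for d = 2 on a square window with a constant, known intensity; the function and argument names are ours, not from the paper or from spatstat.

```python
import numpy as np

def pcf_kernel(points, rho, side, r, b):
    """Kernel estimator of the pair correlation function on the square
    window W = [0, side]^2 with the Epanechnikov kernel, bandwidth b,
    translation edge correction and constant intensity rho.
    A sketch of the estimator gk above; all names are illustrative."""
    d = points.shape[1]                        # d = 2 here
    sa_d = 2.0 * np.pi                         # surface area of the unit sphere in R^2
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(-1))
    i, j = np.triu_indices(len(points), k=1)   # distinct unordered pairs
    h, dd = diffs[i, j], dist[i, j]
    # translation edge correction |W ∩ W_h| = prod_i max(0, side - |h_i|)
    edge = np.prod(np.clip(side - np.abs(h), 0.0, None), axis=1)
    r = np.atleast_1d(np.asarray(r, dtype=float))
    out = np.empty(len(r))
    for m, rm in enumerate(r):
        u = (rm - dd) / b
        kb = np.where(np.abs(u) <= 1, 0.75 * (1.0 - u ** 2) / b, 0.0)  # Epanechnikov
        # factor 2: each unordered pair occurs twice in the sum over (u, v)
        out[m] = 2.0 * np.sum(kb / (rho ** 2 * edge)) / (sa_d * rm ** (d - 1))
    return out
```

For an inhomogeneous pattern, the constant rho ** 2 would be replaced by estimated intensities evaluated at the two points of each pair.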
Guan (2007b) and Guan (2007a) suggest choosing b by composite likelihood cross validation or by minimizing an estimate of the mean integrated squared error defined over some interval I as
mise(gm, w) = sad ∫_I E{gm(r; b) − g(r)}² w(r − rmin) dr,   (2.2)
where gm, m = k, d, c, is one of the aforementioned kernel estimators, w ≥ 0 is a weight function and rmin ≥ 0. With I = (0, R), w(r) = r^{d−1} and rmin = 0, Guan (2007a) suggests estimating the mean integrated squared error by
M(b) = sad ∫_0^R {gm(r; b)}² r^{d−1} dr − 2 Σ≠_{u,v∈XW, ‖v−u‖≤R} gm^{−{u,v}}(‖v − u‖; b) / {ρ(u)ρ(v) |W ∩ W_{v−u}|},   (2.3)
where gm^{−{u,v}}, m = k, d, c, is defined as gm but based on (X \ {u, v}) ∩ W. Loh and Jang (2010) instead use a spatial bootstrap for estimating (2.2). We return to (2.3) in Section 5.
3 Orthogonal series estimation
3.1 The new estimator
For an R > 0, the new orthogonal series estimator of g(r), 0 ≤ rmin < r < rmin + R, is based on an orthogonal series expansion of g(r) on (rmin, rmin + R):
g(r) = Σ_{k=1}^∞ θk φk(r − rmin),   (3.1)
where {φk}_{k≥1} is an orthonormal basis of functions on (0, R) with respect to some weight function w(r) ≥ 0, r ∈ (0, R). That is, ∫_0^R φk(r)φl(r)w(r) dr = 1(k = l), and the coefficients in the expansion are given by θk = ∫_0^R g(r + rmin)φk(r)w(r) dr.
For the cosine basis, w(r) = 1 and
φ1(r) = 1/√R, φk(r) = (2/R)1/2 cos{(k − 1)πr/R}, k ≥ 2.
Another example is the Fourier-Bessel basis with w(r) = r^{d−1} and

φk(r) = 2^{1/2} Jν(r αν,k/R) r^{−ν} / {R Jν+1(αν,k)},   k ≥ 1,

where ν = (d − 2)/2, Jν is the Bessel function of the first kind of order ν, and {αν,k}_{k=1}^∞ is the sequence of successive positive roots of Jν(r).
An estimator of g is obtained by replacing the θk in (3.1) by unbiased estimators and truncating or smoothing the infinite sum. A similar approach has a long history in the context of non-parametric estimation of probability densities, see e.g. the review in Efromovich (2010). For θk we propose the estimator
θ̂k = (1/sad) Σ≠_{u,v∈XW, rmin<‖v−u‖<rmin+R} φk(‖v − u‖ − rmin) w(‖v − u‖ − rmin) / {ρ(u)ρ(v) ‖v − u‖^{d−1} |W ∩ W_{v−u}|},   (3.2)
which is unbiased by the second order Campbell formula, see Section B of the supplementary material. This type of estimator has some similarity to the coefficient estimators used for probability density estimation, but it is based on the spatial lags v − u, which are neither independent nor identically distributed. Moreover the estimator is adjusted for the possibly inhomogeneous intensity ρ and corrected for edge effects.
The orthogonal series estimator is finally of the form
go(r; b) = Σ_{k=1}^∞ bk θ̂k φk(r − rmin),   (3.3)
where b = {bk}_{k=1}^∞ is a smoothing/truncation scheme. The simplest smoothing scheme is bk = 1(k ≤ K) for some cut-off K ≥ 1. Section 3.3 considers several other smoothing schemes.
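A compact sketch of (3.2) and (3.3) with the simple truncation scheme, assuming d = 2, a square window, a constant intensity and a basis that is orthonormal for the weight w = 1 (e.g. the cosine basis); all names are illustrative.

```python
import numpy as np

def pcf_orthogonal(points, rho, side, r, basis, K, rmin, R):
    """Orthogonal series estimator (3.3) with b_k = 1(k <= K), constant
    intensity rho, square window [0, side]^2 and an orthonormal
    `basis(x, k)` on (0, R) for the weight w = 1. Illustrative sketch."""
    d = points.shape[1]                        # d = 2 here
    sa_d = 2.0 * np.pi
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(-1))
    i, j = np.triu_indices(len(points), k=1)   # distinct unordered pairs
    h, dd = diffs[i, j], dist[i, j]
    keep = (dd > rmin) & (dd < rmin + R)       # lags in (rmin, rmin + R)
    h, dd = h[keep], dd[keep]
    edge = np.prod(np.clip(side - np.abs(h), 0.0, None), axis=1)
    # unbiased coefficient estimators (3.2); factor 2 for ordered pairs
    theta = [2.0 * np.sum(basis(dd - rmin, k)
                          / (rho ** 2 * dd ** (d - 1) * edge)) / sa_d
             for k in range(1, K + 1)]
    r = np.asarray(r, dtype=float)
    return sum(theta[k - 1] * basis(r - rmin, k) for k in range(1, K + 1))
```

Each retained pair (u, v) contributes 2 φk(‖v − u‖ − rmin)/{sad ρ² ‖v − u‖ |W ∩ W_{v−u}|} to θ̂k, which makes small deterministic configurations easy to check by hand.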
3.2 Variance of θ̂k

The factor ‖v − u‖^{d−1} in (3.2) may cause problems when d > 1, where the presence of two very close points in XW could imply division by a quantity close to zero. The expression for the variance of θ̂k given in Section B of the supplementary material indeed shows that the variance is not finite unless g(r)w(r − rmin)/r^{d−1} is bounded for rmin < r < rmin + R. If rmin > 0 this is always satisfied for bounded g. If rmin = 0 the condition is still satisfied in case of the Fourier-Bessel basis and bounded g.
For the cosine basis w(r) = 1, so if rmin = 0 we need boundedness of g(r)/r^{d−1}. If X satisfies a hard core condition (i.e. two points in X cannot be closer than some δ > 0), this is trivially satisfied. Another example is a determinantal point process (Lavancier et al., 2015), for which g(r) = 1 − c(r)² for a correlation function c. The boundedness is then e.g. satisfied if c(·) is the Gaussian (d ≤ 3) or exponential (d ≤ 2) correlation function. In practice, when using the cosine basis, we take rmin to be a small positive number to avoid issues with infinite variances.
3.3 Mean integrated squared error and smoothing schemes
The orthogonal series estimator (3.3) has mean integrated squared error

mise(go, w) = sad ∫_{rmin}^{rmin+R} E{go(r; b) − g(r)}² w(r − rmin) dr
            = sad Σ_{k=1}^∞ E(bk θ̂k − θk)² = sad Σ_{k=1}^∞ [bk² E{(θ̂k)²} − 2bk θk² + θk²].   (3.4)
Each term in (3.4) is minimized by taking bk equal to (cf. Hall, 1987)

b*k = θk² / E{(θ̂k)²} = θk² / {θk² + Var(θ̂k)},   k ≥ 1,   (3.5)

leading to the minimal value sad Σ_{k=1}^∞ b*k Var(θ̂k) of the mean integrated squared error.
Unfortunately, the b*k are unknown. In practice we consider a parametric class of smoothing schemes b(ψ). For practical reasons we need a finite sum in (3.3), so one component of ψ is a cut-off index K such that bk(ψ) = 0 when k > K. The simplest smoothing scheme is bk(ψ) = 1(k ≤ K). A more refined scheme is bk(ψ) = 1(k ≤ K) b̂*k, where b̂*k = θ̂²k/(θ̂k)² is an estimate of the optimal smoothing coefficient b*k given in (3.5); here θ̂²k denotes an asymptotically unbiased estimator of θk² derived in Section 5 (not the square of θ̂k). For these two smoothing schemes ψ = K. Adapting the scheme suggested by Wahba (1981), we also consider ψ = (K, c1, c2), c1 > 0, c2 > 1, and bk(ψ) = 1(k ≤ K)/(1 + c1 k^{c2}). In practice we choose the smoothing parameter ψ by minimizing an estimate of the mean integrated squared error, see Section 5.
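The three schemes can be collected in one small helper (a sketch with our own names; bstar_hat holds the estimates b̂*k described above):

```python
import numpy as np

def smoothing_coefficients(kind, K, kmax, bstar_hat=None, c1=1.0, c2=2.0):
    """Coefficients b_1, ..., b_kmax for the three smoothing schemes of
    this section: simple truncation, the refined scheme (which needs the
    estimated optimal coefficients bstar_hat) and the Wahba-type scheme.
    Illustrative sketch; names and defaults are ours."""
    k = np.arange(1, kmax + 1)
    trunc = (k <= K).astype(float)        # 1(k <= K)
    if kind == "simple":
        return trunc
    if kind == "refined":
        return trunc * np.asarray(bstar_hat, dtype=float)[:kmax]
    if kind == "wahba":
        return trunc / (1.0 + c1 * k.astype(float) ** c2)
    raise ValueError(f"unknown scheme: {kind}")
```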
3.4 Expansion of g(·) − 1

For large R, g(rmin + R) is typically close to one. However, for the Fourier-Bessel basis, φk(R) = 0 for all k ≥ 1, which implies go(rmin + R; b) = 0. Hence the estimator cannot be consistent for r = rmin + R, and the convergence of the estimator for r ∈ (rmin, rmin + R) can be quite slow as the number of terms K in the estimator increases. In practice we obtain quicker convergence by applying the Fourier-Bessel expansion to g(r) − 1 = Σ_{k≥1} ϑk φk(r − rmin), so that the estimator becomes g̃o(r; b) = 1 + Σ_{k=1}^∞ bk ϑ̂k φk(r − rmin), where ϑ̂k = θ̂k − ∫_0^R φk(r)w(r) dr is an estimator of ϑk = ∫_0^R {g(r + rmin) − 1} φk(r)w(r) dr. Note that Var(ϑ̂k) = Var(θ̂k) and g̃o(r; b) − E{g̃o(r; b)} = go(r; b) − E{go(r; b)}. These identities imply that the results regarding consistency and asymptotic normality established for go(r; b) in Section 4 are also valid for g̃o(r; b).
4 Consistency and asymptotic normality
4.1 Setting
To obtain asymptotic results we assume that X is observed through an increasing sequence of observation windows Wn. For ease of presentation we assume square observation windows Wn = ×_{i=1}^d [−n a_i, n a_i] for some a_i > 0, i = 1, . . . , d. More general sequences of windows can be used at the expense of more notation and assumptions. We also consider an associated sequence ψn, n ≥ 1, of smoothing parameters satisfying conditions to be detailed in the following. We let θ̂k,n and go,n denote the estimators of θk and g obtained from X observed on Wn. Thus
θ̂k,n = (1/(sad |Wn|)) Σ≠_{u,v∈X_{Wn}, v−u∈B^R_{rmin}} φk(‖v − u‖ − rmin) w(‖v − u‖ − rmin) / {ρ(u)ρ(v) ‖v − u‖^{d−1} en(v − u)},

where

B^R_{rmin} = {h ∈ Rd : rmin < ‖h‖ < rmin + R} and en(h) = |Wn ∩ (Wn)_h|/|Wn|.   (4.1)
Further,

go,n(r; b) = Σ_{k=1}^{Kn} bk(ψn) θ̂k,n φk(r − rmin)
          = (1/(sad |Wn|)) Σ≠_{u,v∈X_{Wn}, v−u∈B^R_{rmin}} w(‖v − u‖ − rmin) ϕn(v − u, r) / {ρ(u)ρ(v) ‖v − u‖^{d−1} en(v − u)},

where

ϕn(h, r) = Σ_{k=1}^{Kn} bk(ψn) φk(‖h‖ − rmin) φk(r − rmin).   (4.2)
In the results below we refer to higher order normalized joint intensities g(k) of X. Define the k'th order joint intensity of X by the identity

E{ Σ≠_{u1,...,uk∈X} 1(u1 ∈ A1, . . . , uk ∈ Ak) } = ∫_{A1×···×Ak} ρ(k)(v1, . . . , vk) dv1 · · · dvk

for bounded subsets Ai ⊂ Rd, i = 1, . . . , k, where the sum is over distinct u1, . . . , uk. We then let g(k)(v1, . . . , vk) = ρ(k)(v1, . . . , vk)/{ρ(v1) · · · ρ(vk)} and assume, with an abuse of notation, that the g(k) are translation invariant for k = 3, 4, i.e.

g(k)(v1, . . . , vk) = g(k)(v2 − v1, . . . , vk − v1).
4.2 Consistency of orthogonal series estimator
Consistency of the orthogonal series estimator can be established under fairly mild conditions following the approach in Hall (1987). We first state some conditions that ensure (see Section B of the supplementary material) that Var(θ̂k,n) ≤ C1/|Wn| for some 0 < C1 < ∞:
V1: There exist 0 < ρmin < ρmax < ∞ such that for all u ∈ Rd, ρmin ≤ ρ(u) ≤ ρmax.
V2: For any h, h1, h2 ∈ B^R_{rmin}, g(h)w(‖h‖ − rmin) ≤ C2 ‖h‖^{d−1} and g(3)(h1, h2) ≤ C3 for constants C2, C3 < ∞.

V3: A constant C4 < ∞ can be found such that sup_{h1,h2∈B^R_{rmin}} ∫_{Rd} |g(4)(h1, h3, h2 + h3) − g(h1)g(h2)| dh3 ≤ C4.
The first part of V2 is needed to ensure finite variances of the θ̂k,n and is discussed in detail in Section 3.2. The second part simply requires that g(3) is bounded. The condition V3 is a weak dependence condition which is also used for asymptotic normality in Section 4.3 and for the estimation of θk² in Section 5. Regarding the smoothing scheme, we assume
S1: B = sup_{k,ψ} |bk(ψ)| < ∞ and Σ_{k=1}^∞ |bk(ψ)| < ∞ for all ψ.

S2: ψn → ψ* for some ψ*, and lim_{ψ→ψ*} max_{1≤k≤m} |bk(ψ) − 1| = 0 for all m ≥ 1.

S3: |Wn|^{−1} Σ_{k=1}^∞ |bk(ψn)| → 0.
E.g. for the simplest smoothing scheme, ψn = Kn, ψ* = ∞, and we assume that Kn/|Wn| tends to zero.
Assuming the above conditions, we now verify that the mean integrated squared error of go,n tends to zero as n → ∞. By (3.4),
mise(go,n, w)/sad = Σ_{k=1}^∞ [bk(ψn)² Var(θ̂k,n) + θk²{bk(ψn) − 1}²].

By V1–V3 and S1, the right hand side is bounded by

B C1 |Wn|^{−1} Σ_{k=1}^∞ |bk(ψn)| + max_{1≤k≤m} θk² Σ_{k=1}^m {bk(ψn) − 1}² + (B² + 1) Σ_{k=m+1}^∞ θk².
By Parseval's identity, Σ_{k=1}^∞ θk² < ∞. The last term can thus be made arbitrarily small by choosing m large enough. It also follows that θk² tends to zero as k → ∞. Hence, by S2, the middle term can be made arbitrarily small for any choice of m by choosing n large enough. Finally, the first term can be made arbitrarily small by S3 and choosing n large enough.
4.3 Asymptotic normality
The estimators θ̂k,n as well as the estimator go,n(r; b) are of the form
Sn = (1/(sad |Wn|)) Σ≠_{u,v∈X_{Wn}, v−u∈B^R_{rmin}} fn(v − u) / {ρ(u)ρ(v) en(v − u)}   (4.3)
for a sequence of even functions fn : Rd → R. We let τn² = |Wn| Var(Sn).
To establish asymptotic normality of estimators of the form (4.3), we need certain mixing properties for X, as in Waagepetersen and Guan (2009). The strong mixing coefficient for the point process X on Rd is given by (Ivanoff, 1982; Politis et al., 1998)
αX(m; a1, a2) = sup{ |pr(E1 ∩ E2) − pr(E1)pr(E2)| : E1 ∈ FX(B1), E2 ∈ FX(B2), |B1| ≤ a1, |B2| ≤ a2, D(B1, B2) ≥ m, B1, B2 ∈ B(Rd) },

where B(Rd) denotes the Borel σ-field on Rd, FX(Bi) is the σ-field generated by X ∩ Bi, and

D(B1, B2) = inf{ max_{1≤i≤d} |ui − vi| : u = (u1, . . . , ud) ∈ B1, v = (v1, . . . , vd) ∈ B2 }.
To verify asymptotic normality we need the following assumptions as well as V1 (the conditions V2 and V3 are not needed due to conditions N2 and N4 below):
N1: The mixing coefficient satisfies αX(m; (s + 2R)^d, ∞) = O(m^{−d−ε}) for some s, ε > 0.

N2: There exist η > 0 and L1 < ∞ such that g(k)(h1, . . . , h_{k−1}) ≤ L1 for k = 2, . . . , 2(2 + ⌈η⌉) and all h1, . . . , h_{k−1} ∈ Rd.

N3: lim inf_{n→∞} τn² > 0.

N4: There exists L2 < ∞ so that |fn(h)| ≤ L2 for all n ≥ 1 and h ∈ B^R_{rmin}.
The conditions N1–N3 are standard in the point process literature, see e.g. the discussions in Waagepetersen and Guan (2009) and Coeurjolly and Møller (2014). The condition N3 is difficult to verify and is usually left as an assumption, see Waagepetersen and Guan (2009), Coeurjolly and Møller (2014) and Dvořák and Prokešová (2016). However, at least in the stationary case and in case of estimation of θk, the expression for Var(θ̂k,n) in Section B of the supplementary material shows that τn² = |Wn| Var(θ̂k,n) converges to a constant, which supports the plausibility of condition N3. We discuss N4 in further detail below when applying the general framework to θ̂k,n and go,n. The following theorem is proved in Section C of the supplementary material.
Theorem 4.1. Under conditions V1 and N1–N4, τn^{−1} |Wn|^{1/2} {Sn − E(Sn)} →D N(0, 1).
4.4 Application to θ̂k,n and go,n
In case of estimation of θk, θ̂k,n = Sn with fn(h) = φk(‖h‖ − rmin) w(‖h‖ − rmin)/‖h‖^{d−1}. The assumption N4 is then straightforwardly seen to hold in the case of the Fourier-Bessel basis, where |φk(r)| ≤ |φk(0)| and w(r) = r^{d−1}. For the cosine basis, N4 does not hold in general and further assumptions are needed, cf. the discussion in Section 3.2. For simplicity we here just assume rmin > 0. Thus we state the following
Corollary 1. Assume V1, N1–N4 and, in case of the cosine basis, that rmin > 0. Then

{Var(θ̂k,n)}^{−1/2} (θ̂k,n − θk) →D N(0, 1).
For go,n(r; b) = Sn,

fn(h) = ϕn(h, r) w(‖h‖ − rmin)/‖h‖^{d−1} = {w(‖h‖ − rmin)/‖h‖^{d−1}} Σ_{k=1}^{Kn} bk(ψn) φk(‖h‖ − rmin) φk(r − rmin),
where ϕn is defined in (4.2). In this case fn is typically not uniformly bounded, since the number of not necessarily decreasing terms in the sum defining ϕn in (4.2) grows with n. We therefore introduce one more condition:
N5: There exist ω > 0 and Mω < ∞ so that

K_n^{−ω} Σ_{k=1}^{Kn} bk(ψn) |φk(r − rmin) φk(‖h‖ − rmin)| ≤ Mω

for all h ∈ B^R_{rmin}.
Given N5, we can simply rescale: S̃n := K_n^{−ω} Sn and τ̃n² := K_n^{−2ω} τn². Then, assuming lim inf_{n→∞} τ̃n² > 0, Theorem 4.1 gives the asymptotic normality of τ̃n^{−1} |Wn|^{1/2} {S̃n − E(S̃n)}, which is equal to τn^{−1} |Wn|^{1/2} {Sn − E(Sn)}. Hence we obtain
Corollary 2. Assume V1, N1, N2, N5 and lim inf_{n→∞} K_n^{−2ω} τn² > 0. In case of the cosine basis, assume further that rmin > 0. Then for r ∈ (rmin, rmin + R),

τn^{−1} |Wn|^{1/2} [go,n(r; b) − E{go,n(r; b)}] →D N(0, 1).
In case of the simple smoothing scheme bk(ψn) = 1(k ≤ Kn), we take ω = 1 for the cosine basis. For the Fourier-Bessel basis we take ω = 4/3 when d = 1 and ω = d/2 + 2/3 when d > 1 (see the derivations in Section F of the supplementary material).
5 Tuning the smoothing scheme
In practice we choose K, and other parameters in the smoothing scheme b(ψ), by minimizing an estimate of the mean integrated squared error. This is equivalent to minimizing
sad I(ψ) = mise(go, w) − sad ∫_{rmin}^{rmin+R} {g(r) − 1}² w(r − rmin) dr
         = sad Σ_{k=1}^K [bk(ψ)² E{(θ̂k)²} − 2bk(ψ) θk²].   (5.1)
In practice we must replace (5.1) by an estimate. Define θ̂²k as

θ̂²k = (1/sad²) Σ≠_{u,v,u′,v′∈XW; v−u, v′−u′∈B^R_{rmin}} φk(‖v − u‖ − rmin) φk(‖v′ − u′‖ − rmin) w(‖v − u‖ − rmin) w(‖v′ − u′‖ − rmin) / {ρ(u)ρ(v)ρ(u′)ρ(v′) ‖v − u‖^{d−1} ‖v′ − u′‖^{d−1} |W ∩ W_{v−u}| |W ∩ W_{v′−u′}|}.
Then, referring to the set-up in Section 4 and assuming V3,

lim_{n→∞} E(θ̂²k,n) = {∫_0^R g(r + rmin) φk(r) w(r) dr}² = θk²

(see Section D of the supplementary material), and hence θ̂²k,n is an asymptotically unbiased estimator of θk². The estimator θ̂²k is obtained from (θ̂k)² by retaining only the terms where all four points u, v, u′, v′ involved are distinct. In simulation studies, θ̂²k had a smaller root mean squared error than (θ̂k)² for estimation of θk². Thus
Î(ψ) = Σ_{k=1}^K {bk(ψ)² (θ̂k)² − 2bk(ψ) θ̂²k}   (5.2)
is an asymptotically unbiased estimator of (5.1). Moreover, (5.2) is equivalent to the following slight modification of Guan (2007a)'s criterion (2.3):
∫_{rmin}^{rmin+R} {go(r; b)}² w(r − rmin) dr − (2/sad) Σ≠_{u,v∈XW, v−u∈B^R_{rmin}} go^{−{u,v}}(‖v − u‖; b) w(‖v − u‖ − rmin) / {ρ(u)ρ(v) |W ∩ W_{v−u}|}.
For the simple smoothing scheme bk(K) = 1(k ≤ K), (5.2) reduces to

Î(K) = Σ_{k=1}^K {(θ̂k)² − 2θ̂²k} = Σ_{k=1}^K (θ̂k)² (1 − 2b̂*k),   (5.3)

where b̂*k = θ̂²k/(θ̂k)² is an estimator of b*k in (3.5).
In practice, uncertainties of θ̂k and θ̂²k lead to numerical instabilities in the minimization of (5.2) with respect to ψ. To obtain a numerically stable procedure we first determine K as

K = inf{2 ≤ k ≤ Kmax : (θ̂k+1)² − 2θ̂²k+1 > 0} = inf{2 ≤ k ≤ Kmax : b̂*k+1 < 1/2}.   (5.4)
That is, K is the first local minimum of (5.3) larger than 1 and smaller than an upper limit Kmax, which we chose to be 49 in the applications. This choice of K is also used for the refined and the Wahba smoothing schemes. For the refined smoothing scheme we thus let bk = 1(k ≤ K) b̂*k. For the Wahba smoothing scheme, bk = 1(k ≤ K)/(1 + c1 k^{c2}), where c1 and c2 minimize Σ_{k=1}^K {(θ̂k)²/(1 + c1 k^{c2})² − 2θ̂²k/(1 + c1 k^{c2})} over c1 > 0 and c2 > 1.
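The rule (5.4) amounts to a short scan over the coefficient estimates; a sketch with our own array names, where position j (0-indexed) holds the quantities for coefficient k = j + 1:

```python
import numpy as np

def select_cutoff(theta_hat_sq, theta_sq_hat, Kmax=49):
    """Cut-off rule (5.4): the smallest k in {2, ..., Kmax} for which
    bstar_{k+1} < 1/2, where bstar_{j+1} = theta_sq_hat[j]/theta_hat_sq[j];
    theta_hat_sq[j] holds (theta-hat_{j+1})^2 and theta_sq_hat[j] the
    estimate of theta^2_{j+1}. Returns Kmax if no such k exists.
    Illustrative sketch."""
    bstar = np.asarray(theta_sq_hat, dtype=float) / np.asarray(theta_hat_sq, dtype=float)
    for k in range(2, min(Kmax, len(bstar) - 1) + 1):
        if bstar[k] < 0.5:                 # position k holds coefficient k + 1
            return k
    return Kmax
```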
6 Simulation study
[Figure 6.1 here: four panels (Poisson, Thomas, VarGamma, DPP) showing g(r) against r ∈ [0, 0.12].]
Figure 6.1: Pair correlation functions for the point processes considered in the simulation study.
We compare the performance of the orthogonal series estimators and the kernel estimators for data simulated on W = [0, 1]² or W = [0, 2]² from four point processes with constant intensity ρ = 100. More specifically, we consider nsim = 1000 realizations from a Poisson process, a Thomas process (parent intensity κ = 25, dispersion standard deviation ω = 0.0198), a Variance Gamma cluster process (parent intensity κ = 25, shape parameter ν = −1/4, dispersion parameter ω = 0.01845; Jalilian et al., 2013), and a determinantal point process with pair correlation function g(r) = 1 − exp{−2(r/α)²} and α = 0.056. The pair correlation functions of these point processes are shown in Figure 6.1.
For each realization, g(r) is estimated for r in (rmin, rmin + R), with rmin = 10^{−3} and R = 0.06, 0.085, 0.125, using the kernel estimators gk(r; b), gd(r; b) and gc(r; b) or the orthogonal series estimator go(r; b). The Epanechnikov kernel with bandwidth b = 0.15/√ρ is used for gk(r; b) and gd(r; b), while the bandwidth of gc(r; b) is chosen by minimizing Guan (2007a)'s estimate (2.3) of the mean integrated squared error. For the orthogonal series estimator, we consider both the cosine and the Fourier-Bessel bases with the simple, refined or Wahba smoothing schemes. For the Fourier-Bessel basis we use the modified orthogonal series estimator described in Section 3.4. The parameters of the smoothing scheme are chosen according to Section 5.
From the simulations we estimate the mean integrated squared error (2.2) with w(r) = 1 for each estimator gm, m = k, d, c, o, over the intervals [rmin, 0.025] (small spatial lags) and [rmin, rmin + R] (all lags). We consider the kernel estimator gk as the baseline estimator and compare each of the other estimators g with gk using the log relative efficiency eI(g) = log{miseI(gk)/miseI(g)}, where miseI(g) denotes the estimated mean integrated squared error over the interval I for the estimator g. Thus eI(g) > 0 indicates that g outperforms gk on the interval I. Results for W = [0, 1]² are summarized in Figure 6.2.
[Figure 6.2 here: panels of log relative efficiency with respect to gk against R = 0.06, 0.085, 0.125, for the Poisson, Thomas, VarGamma and DPP processes and lag ranges (rmin, 0.025] and (rmin, R]; estimators: kernel d, kernel c, Bessel and cosine, with simple, refined and Wahba smoothing.]
Figure 6.2: Plots of log relative efficiencies for small lags (rmin, 0.025] and all lags (rmin, R], R = 0.06, 0.085, 0.125, and W = [0, 1]². Black: kernel estimators. Blue and red: orthogonal series estimators with the Bessel and cosine basis, respectively. Lines serve to ease visual interpretation.
For all types of point processes, the orthogonal series estimators outperform or do as well as the kernel estimators, both at small lags and over all lags. The detailed conclusions depend on whether the non-repulsive Poisson, Thomas and Var Gamma processes or the repulsive determinantal process is considered. Orthogonal-Bessel with refined or Wahba smoothing is superior for Poisson, Thomas and Var Gamma but only better than gc for the determinantal point process. The performance of the orthogonal-cosine estimator is between or better than that of the kernel estimators for Poisson, Thomas and Var Gamma, and is as good as the best kernel estimator for the determinantal process. Regarding the kernel estimators, gc is better than gd for Poisson, Thomas and Var Gamma and worse than gd for the determinantal process. The above conclusions are stable over the three values of R considered. For W = [0, 2]² (see Figure G.1 in the supplementary material) the conclusions are similar, but with a more clear superiority of the orthogonal series estimators for Poisson and Thomas. For Var Gamma the performance of gc is similar to that of the orthogonal series estimators. For the determinantal process and W = [0, 2]², gc is better than orthogonal-Bessel-refined/Wahba but still inferior to orthogonal-Bessel-simple and orthogonal-cosine. Figures G.2 and G.3 in the supplementary material give more detailed insight into the bias and variance properties of gk, gc and the orthogonal series estimators with the simple smoothing scheme. Table G.1 in the supplementary material shows that the selected K in general increases when the observation window is enlarged, as required for the asymptotic results. The general conclusion, taking into account the simulation results for all four types of point processes, is that the best overall performance is obtained with orthogonal-Bessel-simple, orthogonal-cosine-refined or orthogonal-cosine-Wahba.
To supplement our theoretical results in Section 4, we consider the distribution of the simulated go(r; b) for r = 0.025 and r = 0.1 in case of the Thomas process, using the Fourier-Bessel basis with the simple smoothing scheme. In addition to W = [0, 1]² and W = [0, 2]², also W = [0, 3]² is considered. The mean, standard error, skewness and kurtosis of go(r) are given in Table 6.1, while histograms of the estimates are shown in Figure G.4. The standard error of go(r; b) scales as |W|^{−1/2}, in accordance with our theoretical results. Also the bias decreases and the distributions of the estimates become increasingly normal as |W| increases.
Table 6.1: Monte Carlo mean, standard error, skewness (S) and kurtosis (K) of go(r) using the Bessel basis with the simple smoothing scheme in case of the Thomas process on the observation windows W1 = [0, 1]², W2 = [0, 2]² and W3 = [0, 3]².
        r       g(r)    E{go(r)}   [Var{go(r)}]^{1/2}   S{go(r)}   K{go(r)}
W1   0.025    3.972     3.961           0.923            1.145      5.240
W1   0.1      1.219     1.152           0.306            0.526      3.516
W2   0.025    3.972     3.959           0.467            0.719      4.220
W2   0.1      1.219     1.187           0.150            0.691      4.582
W3   0.025    3.972     3.949           0.306            0.432      3.225
W3   0.1      1.2187    1.2017          0.0951           0.2913     2.9573
7 Application
We consider point patterns of the locations of the Acalypha diversifolia (528 trees), Lonchocarpus heptaphyllus (836 trees) and Capparis frondosa (3299 trees) species in the 1995 census for the 1000 m × 500 m Barro Colorado Island plot (Hubbell and Foster, 1983; Condit, 1998). To estimate the intensity function of each species, we use a log-linear regression model depending on soil condition variables (contents of copper, mineralized nitrogen, potassium and phosphorus, and soil acidity) and topographical variables (elevation, slope gradient, multiresolution index of valley bottom flatness, incoming mean solar radiation and the topographic wetness index). The regression parameters are estimated using the quasi-likelihood approach in Guan et al. (2015). The point patterns and fitted intensity functions are shown in Figure H.1 in the supplementary material.
The pair correlation function of each species is then estimated using the bias corrected kernel estimator gc(r; b), with b determined by minimizing (2.3), and the orthogonal series estimator go(r; b) with both the Fourier-Bessel and cosine bases, the refined smoothing scheme and the optimal cut-offs K obtained from (5.4); see Figure 7.1.
For Lonchocarpus the three estimates are quite similar, while for Acalypha and Capparis the estimates deviate markedly for small lags and then become similar for lags greater than 2 and 8 meters, respectively. For Capparis and the cosine basis, the number of selected coefficients coincides with the chosen upper limit of 49. The cosine estimate displays oscillations which appear to be artefacts of using high frequency components of the cosine basis. The function (5.3) decreases very slowly after K = 7, so we also tried the cosine estimate with K = 7, which gives a more reasonable estimate.
[Figure 7.1 here: estimates of g(r) − 1 against r for Acalypha diversifolia (kernel gc with b = 1.35; Bessel, K = 7, refined; cosine, K = 11, refined), Lonchocarpus heptaphyllus (kernel gc with b = 9.64; Bessel, K = 7, refined; cosine, K = 6, refined) and Capparis frondosa (kernel gc with b = 5.13; Bessel, K = 7, refined; cosine, K = 49 and K = 7, refined).]
Figure 7.1: Estimated pair correlation functions for tropical rain forest trees.
Acknowledgement
Rasmus Waagepetersen is supported by the Danish Council for Independent Research | Natural Sciences, grant "Mathematical and Statistical Analysis of Spatial Data", and by the "Centre for Stochastic Geometry and Advanced Bioimaging", funded by the Villum Foundation.
References
Baddeley, A., Rubak, E., and Turner, R. (2015). Spatial Point Patterns: Methodology andApplications with R. Chapman & Hall/CRC Interdisciplinary Statistics. Chapman &Hall/CRC, Boca Raton.
Baddeley, A. J., Møller, J., and Waagepetersen, R. (2000). Non- and semi-parametric esti-mation of interaction in inhomogeneous point patterns. Statistica Neerlandica, 54:329–350.
Biscio, C. A. N. and Waagepetersen, R. (2016). A central limit theorem for α-mixingspatial point processes. In preparation.
Coeurjolly, J.-f. and Møller, J. (2014). Variational approach for spatial point processintensity estimation. Bernoulli, 20(3):1097–1125.
Coeurjolly, J.-F., Møller, J., and Waagepetersen, R. (2016). A tutorial on Palm distribu-tions for spatial point processes. International Statistical Review. to appear.
14
Condit, R. (1998). Tropical Forest Census Plots. Springer-Verlag and R. G. Landes Company, Berlin, Germany and Georgetown, Texas.
Dvořák, J. and Prokešová, M. (2016). Asymptotic properties of the minimum contrast estimators for projections of inhomogeneous space-time shot-noise Cox processes. Applications of Mathematics, 61(4):387–411.
Efromovich, S. (2010). Orthogonal series density estimation. Wiley Interdisciplinary Reviews: Computational Statistics, 2(4):467–476.
Guan, Y. (2007a). A least-squares cross-validation bandwidth selection approach in pair correlation function estimations. Statistics and Probability Letters, 77(18):1722–1729.
Guan, Y. (2007b). A composite likelihood cross-validation approach in selecting bandwidth for the estimation of the pair correlation function. Scandinavian Journal of Statistics, 34(2):336–346.
Guan, Y., Jalilian, A., and Waagepetersen, R. (2015). Quasi-likelihood for spatial point processes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 77(3):677–697.
Hall, P. (1987). Cross-validation and the smoothing of orthogonal series density estimators. Journal of Multivariate Analysis, 21(2):189–206.
Hubbell, S. P. and Foster, R. B. (1983). Diversity of canopy trees in a Neotropical forest and implications for the conservation of tropical trees. In Sutton, S. L., Whitmore, T. C., and Chadwick, A. C., editors, Tropical Rain Forest: Ecology and Management, pages 25–41. Blackwell Scientific Publications, Oxford.
Illian, J., Penttinen, A., Stoyan, H., and Stoyan, D. (2008). Statistical Analysis and Modelling of Spatial Point Patterns, volume 76. Wiley, London.
Ivanoff, G. (1982). Central limit theorems for point processes. Stochastic Processes and their Applications, 12(2):171–186.
Jalilian, A., Guan, Y., and Waagepetersen, R. (2013). Decomposition of variance for spatial Cox processes. Scandinavian Journal of Statistics, 40(1):119–137.
Landau, L. (2000). Monotonicity and bounds on Bessel functions. In Proceedings of the Symposium on Mathematical Physics and Quantum Field Theory, volume 4, pages 147–154. Southwest Texas State University, San Marcos, TX.
Lavancier, F., Møller, J., and Rubak, E. (2015). Determinantal point process models and statistical inference. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 77:853–877.
Loh, J. M. and Jang, W. (2010). Estimating a cosmological mass bias parameter with bootstrap bandwidth selection. Journal of the Royal Statistical Society: Series C (Applied Statistics), 59(5):761–779.
McGown, K. J. and Parks, H. R. (2007). The generalization of Faulhaber's formula to sums of non-integral powers. Journal of Mathematical Analysis and Applications, 330(1):571–575.
Møller, J. and Waagepetersen, R. P. (2003). Statistical Inference and Simulation for Spatial Point Processes. Chapman and Hall/CRC, Boca Raton.
Politis, D. N., Paparoditis, E., and Romano, J. P. (1998). Large sample inference for irregularly spaced dependent observations based on subsampling. Sankhyā: The Indian Journal of Statistics, 60(2):274–292.
Stoyan, D. and Stoyan, H. (1994). Fractals, Random Shapes and Point Fields: Methods of Geometrical Statistics. Wiley.
Stoyan, D. and Stoyan, H. (1996). Estimating pair correlation functions of planar cluster processes. Biometrical Journal, 38(3):259–271.
Waagepetersen, R. and Guan, Y. (2009). Two-step estimation for inhomogeneous spatial point processes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(3):685–702.
Wahba, G. (1981). Data-based optimal smoothing of orthogonal series density estimates. Annals of Statistics, 9:146–156.
Watson, G. N. (1995). A Treatise on the Theory of Bessel Functions. Cambridge University Press.
Yue, Y. R. and Loh, J. M. (2013). Bayesian nonparametric estimation of pair correlation function for inhomogeneous spatial point processes. Journal of Nonparametric Statistics, 25(2):463–474.
Supplementary material
The supplementary material includes proofs of the consistency and asymptotic normality results and details of the simulation study and data analysis.
Supplementary material for 'Orthogonal series estimation of the pair correlation function of a spatial point process'
A Expanding observation window
This section states a few results on the asymptotic behavior of the edge correction $e_n(h)$ (defined in (4.1) in the main document) and related ratios. For each $n \ge 1$ and $h, h_1, h_2 \in \mathbb{R}^d$,
$$|W_n \cap (W_n)_h| = \begin{cases} \prod_{i=1}^d (2na_i - |h_i|) & |h_i| < 2na_i,\ i = 1, \dots, d \\ 0 & \text{otherwise} \end{cases}$$
and
$$\big|W_n \cap (W_n)_{h_1} \cap (W_n)_{h_2}\big| = \begin{cases} V_n(h_1, h_2) & \max\{|h_{1i}|, |h_{2i}|, |h_{1i} - h_{2i}|\} < 2na_i,\ i = 1, \dots, d \\ 0 & \text{otherwise} \end{cases}$$
where $V_n(h_1, h_2) = \prod_{i=1}^d \big(2na_i - \max\{0, h_{1i}, h_{2i}\} + \min\{0, h_{1i}, h_{2i}\}\big)$. Therefore, for any fixed $h, h_1, h_2 \in \mathbb{R}^d$, as $n \to \infty$,
$$e_n(h) = \frac{|W_n \cap (W_n)_h|}{|W_n|} = \prod_{i=1}^d \Big(1 - \frac{|h_i|}{2na_i}\Big) \to 1. \tag{A.1}$$
If $n \ge R/\min_{1\le i\le d} a_i$, or equivalently $B^R_{r_{\min}} \subset W_n$, then $1/2^d \le e_n(h) \le 1$ for any $h \in B^R_{r_{\min}}$. Further,
$$\frac{\big|W_n \cap (W_n)_{h_1} \cap (W_n)_{h_2}\big|}{|W_n|} = \prod_{i=1}^d \Big(1 - \frac{\max\{0, h_{1i}, h_{2i}\} - \min\{0, h_{1i}, h_{2i}\}}{2na_i}\Big) \to 1. \tag{A.2}$$
Similarly, it can be shown that for any fixed $h_1, h_2, h_3 \in \mathbb{R}^d$,
$$\big|W_n \cap (W_n)_{h_1} \cap (W_n)_{h_3} \cap (W_n)_{h_2+h_3}\big|/|W_n| \to 1.$$
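The closed form for $e_n(h)$ can be sanity-checked numerically. A minimal sketch (assuming, consistently with the product formulas above, $W_n = \times_{i=1}^d [-na_i, na_i]$; function names are ours) comparing the closed form against a grid approximation of $|W_n \cap (W_n)_h|/|W_n|$:

```python
import numpy as np

def edge_correction(h, n, a):
    """e_n(h) = |W_n ∩ (W_n)_h| / |W_n| for W_n = prod_i [-n a_i, n a_i]."""
    h, a = np.asarray(h, float), np.asarray(a, float)
    sides = 2.0 * n * a
    overlap = np.clip(sides - np.abs(h), 0.0, None)  # per-axis overlap length
    return float(np.prod(overlap / sides))

# Grid check on W_2 = [-2, 2]^2 (d = 2, a = (1, 1), n = 2), h = (0.5, -0.3)
h, n, a = np.array([0.5, -0.3]), 2, np.array([1.0, 1.0])
xs = np.linspace(-2, 2, 801)
X, Y = np.meshgrid(xs, xs, indexing="ij")
pts = np.column_stack([X.ravel(), Y.ravel()])           # grid over W_2
frac = np.all(np.abs(pts - h) <= n * a, axis=1).mean()  # fraction with x - h in W_2
exact = edge_correction(h, n, a)                        # prod_i (1 - |h_i|/(2 n a_i))

assert abs(exact - 0.875 * 0.925) < 1e-12  # (1 - 0.5/4)(1 - 0.3/4)
assert abs(frac - exact) < 1e-2
assert edge_correction(h, 1000, a) > 0.999  # (A.1): e_n(h) -> 1 as n grows
```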
B Mean and variance of $\hat\theta_{k,n}$
For $n$ large enough, $1(h \in B^R_{r_{\min}}) > 0$ implies $|W_n \cap (W_n)_h| > 0$. Then, by the second order Campbell formula (see Møller and Waagepetersen, 2003, Section C.2.1),
$$\begin{aligned}
E(\hat\theta_{k,n}) &= \frac{1}{sa_d |W_n|} \int_{W_n^2} 1(v-u \in B^R_{r_{\min}}) \cdot \frac{\phi_k(\|v-u\|-r_{\min})\, w(\|v-u\|-r_{\min})}{\|v-u\|^{d-1}\, e_n(v-u)}\, g(\|v-u\|)\, \mathrm{d}u\, \mathrm{d}v \\
&= \int_{\mathbb{R}^d} 1(h \in B^R_{r_{\min}})\, \frac{g(\|h\|)\,\phi_k(\|h\|-r_{\min})\, w(\|h\|-r_{\min})}{sa_d\, \|h\|^{d-1}} \int_{\mathbb{R}^d} \frac{1\{v \in W_n \cap (W_n)_h\}}{|W_n \cap (W_n)_h|}\, \mathrm{d}v\, \mathrm{d}h \\
&= \int_{r_{\min}}^{r_{\min}+R} g(r)\,\phi_k(r-r_{\min})\, w(r-r_{\min})\, \mathrm{d}r = \theta_k.
\end{aligned}$$
Let $f_k(h) = \phi_k(\|h\|-r_{\min})\, w(\|h\|-r_{\min})/\|h\|^{d-1}$. Then, by the second to fourth order Campbell formulae (omitting the details),
$$\begin{aligned}
\mathrm{Var}(\hat\theta_{k,n}) = \frac{1}{(sa_d)^2 |W_n|} \bigg[ &2\int_{B^R_{r_{\min}}} g(h)\, f_k^2(h)\, G^{(2)}_n(h)\, \mathrm{d}h \\
&+ 4\int_{B^R_{r_{\min}}}\int_{B^R_{r_{\min}}} g^{(3)}(h_1, h_2)\, f_k(h_1)\, f_k(h_2)\, G^{(3)}_n(h_1, h_2)\, \mathrm{d}h_1\, \mathrm{d}h_2 \\
&+ \int_{B^R_{r_{\min}}}\int_{B^R_{r_{\min}}} f_k(h_1)\, f_k(h_2)\, G^{(4)}_n(h_1, h_2)\, \mathrm{d}h_1\, \mathrm{d}h_2 \bigg],
\end{aligned} \tag{B.1}$$
where
$$G^{(2)}_n(h) = \frac{1}{e_n^2(h)} \bigg[\frac{1}{|W_n|} \int_{W_n} \frac{1[u+h \in W_n]}{\rho(u)\rho(u+h)}\, \mathrm{d}u\bigg],$$
$$G^{(3)}_n(h_1, h_2) = \frac{1}{e_n(h_1)e_n(h_2)} \bigg[\frac{1}{|W_n|} \int_{W_n} \frac{1[u \in (W_n)_{h_1} \cap (W_n)_{h_2}]}{\rho(u)}\, \mathrm{d}u\bigg],$$
$$G^{(4)}_n(h_1, h_2) = \frac{1}{e_n(h_1)e_n(h_2)} \int_{\mathbb{R}^d} \big[g^{(4)}(h_1, u, h_2+u) - g(h_1)g(h_2)\big] \cdot \frac{\big|W_n \cap (W_n)_u \cap (W_n)_{h_1} \cap (W_n)_{h_2+u}\big|}{|W_n|}\, \mathrm{d}u.$$
For $n \ge R/\min_{1\le i\le d} a_i$ and any $h, h_1, h_2 \in B^R_{r_{\min}}$, by V1,
$$G^{(2)}_n(h) = \frac{1}{e_n^2(h)} \bigg[\frac{1}{|W_n|}\int_{W_n} \frac{1[u+h\in W_n]}{\rho(u)\rho(u+h)}\,\mathrm{d}u\bigg] \le \frac{1}{\rho_{\min}^2\, e_n(h)} \le \frac{2^d}{\rho_{\min}^2},$$
$$G^{(3)}_n(h_1,h_2) = \frac{1}{e_n(h_1)e_n(h_2)}\bigg[\frac{1}{|W_n|}\int_{W_n} \frac{1[u \in (W_n)_{h_1}\cap(W_n)_{h_2}]}{\rho(u)}\,\mathrm{d}u\bigg] \le \frac{1}{\rho_{\min}\, e_n(h_2)} \le \frac{2^d}{\rho_{\min}},$$
and by V3,
$$\begin{aligned}
\big|G^{(4)}_n(h_1,h_2)\big| &\le \frac{1}{e_n(h_1)e_n(h_2)} \int_{\mathbb{R}^d} \big|g^{(4)}(h_1,u,h_2+u) - g(h_1)g(h_2)\big| \cdot \frac{\big|W_n\cap(W_n)_u\cap(W_n)_{h_1}\cap(W_n)_{h_2+u}\big|}{|W_n|}\,\mathrm{d}u \\
&\le \frac{1}{e_n(h_1)} \int_{\mathbb{R}^d} \big|g^{(4)}(h_1,u,h_2+u) - g(h_1)g(h_2)\big|\,\mathrm{d}u \\
&\le 2^d \int_{\mathbb{R}^d} \big|g^{(4)}(h_1,u,h_2+u) - g(h_1)g(h_2)\big|\,\mathrm{d}u \le 2^d C_4.
\end{aligned}$$
Thus, using V2, for all $k\ge1$ and $n\ge R/\min_{1\le i\le d} a_i$,
$$\mathrm{Var}(\hat\theta_{k,n}) \le \frac{1}{(sa_d)^2|W_n|}\bigg[\frac{2^{d+1}}{\rho_{\min}^2}\int_{B^R_{r_{\min}}} f_k^2(h)\, g(\|h\|)\,\mathrm{d}h + \frac{2^{d+2}}{\rho_{\min}}\, C_3 \bigg(\int_{B^R_{r_{\min}}} \big|f_k(h)\big|\,\mathrm{d}h\bigg)^2 + 2^d C_4 \bigg(\int_{B^R_{r_{\min}}} \big|f_k(h)\big|\,\mathrm{d}h\bigg)^2\bigg].$$
But for all $k\ge1$,
$$\begin{aligned}
\int_{B^R_{r_{\min}}} f_k^2(h)\, g(\|h\|)\,\mathrm{d}h &= \int_{r_{\min}}^{r_{\min}+R} \frac{\phi_k^2(r-r_{\min})\, g(r)\, w^2(r-r_{\min})}{r^{d-1}}\,\mathrm{d}r = \int_0^R \frac{\phi_k^2(r)\, g(r+r_{\min})\, w^2(r)}{(r+r_{\min})^{d-1}}\,\mathrm{d}r \\
&\le C_2 \int_0^R \phi_k^2(r)\, w(r)\,\mathrm{d}r = C_2,
\end{aligned}$$
and by Hölder's inequality,
$$\begin{aligned}
\int_{B^R_{r_{\min}}} \big|f_k(h)\big|\,\mathrm{d}h &= \int_{B^R_{r_{\min}}} \frac{\big|\phi_k(\|h\|-r_{\min})\big|\, w(\|h\|-r_{\min})}{\|h\|^{d-1}}\,\mathrm{d}h = \int_{r_{\min}}^{r_{\min}+R} \big|\phi_k(r-r_{\min})\big|\, w(r-r_{\min})\,\mathrm{d}r \\
&\le \bigg(\int_0^R \phi_k^2(r)\, w(r)\,\mathrm{d}r\bigg)^{1/2}\bigg(\int_0^R w(r)\,\mathrm{d}r\bigg)^{1/2} = \bigg(\int_0^R w(r)\,\mathrm{d}r\bigg)^{1/2} < \infty.
\end{aligned}$$
Therefore
$$\mathrm{Var}(\hat\theta_{k,n}) < \frac{1}{(sa_d)^2|W_n|}\bigg[\frac{2^{d+1}}{\rho_{\min}^2}\, C_2 + \bigg(\frac{2^{d+2}}{\rho_{\min}}\, C_3 + 2^d C_4\bigg)\bigg(\int_0^R w(r)\,\mathrm{d}r\bigg)\bigg] = \frac{C_1}{|W_n|},$$
where
$$C_1 = \frac{1}{(sa_d)^2}\bigg[\frac{2^{d+1}}{\rho_{\min}^2}\, C_2 + \bigg(\frac{2^{d+2}}{\rho_{\min}}\, C_3 + 2^d C_4\bigg)\bigg(\int_0^R w(r)\,\mathrm{d}r\bigg)\bigg] > 0.$$
C Proof of asymptotic normality
In this section we give a proof of Theorem 4.1. For $t = (t_1,\dots,t_d) \in \mathbb{Z}^d$, let
$$\Delta(t) = \times_{i=1}^d \big(s(t_i - 1/2),\ s(t_i + 1/2)\big]$$
be the hyper-cube with side length $s$ centered at $st$. Then $\{\Delta(t) : t\in\mathbb{Z}^d\}$ is a partition of $\mathbb{R}^d$; i.e., $\Delta(t_1)\cap\Delta(t_2) = \emptyset$ for $t_1 \ne t_2$ and $\cup_{t\in\mathbb{Z}^d}\Delta(t) = \mathbb{R}^d$, and $|\Delta(t)_{\oplus R}| = (s+2R)^d$, where
$$\Delta(t)_{\oplus R} = \times_{i=1}^d \big(s(t_i-1/2) - R,\ s(t_i+1/2) + R\big].$$
Let $T_n = \{t\in\mathbb{Z}^d : \Delta(t)\cap W_n \ne \emptyset\}$ and define
$$Y_n(t) = \sum_{u\in X\cap\Delta(t)}\ \sum_{\substack{v\in X\setminus\{u\} \\ v-u\in B^R_{r_{\min}}}} \frac{f_n(v-u)\, 1(u\in W_n, v\in W_n)}{\rho(u)\rho(v)\, e_n(v-u)}.$$
Then, since $X = \cup_{t\in\mathbb{Z}^d}\big(X\cap\Delta(t)\big)$,
$$\begin{aligned}
S_n &= \frac{1}{sa_d|W_n|} \sum^{\ne}_{\substack{u,v\in X\cap W_n \\ v-u\in B^R_{r_{\min}}}} \frac{f_n(v-u)}{\rho(u)\rho(v)\, e_n(v-u)} \\
&= \frac{1}{sa_d|W_n|} \sum_{t\in\mathbb{Z}^d}\ \sum_{u\in X\cap\Delta(t)}\ \sum_{\substack{v\in X\setminus\{u\}\\ v-u\in B^R_{r_{\min}}}} \frac{f_n(v-u)\, 1(u\in W_n, v\in W_n)}{\rho(u)\rho(v)\, e_n(v-u)} \\
&= \frac{1}{sa_d|W_n|} \sum_{\substack{t\in\mathbb{Z}^d \\ \Delta(t)\cap W_n\ne\emptyset}} Y_n(t) = \frac{1}{sa_d|W_n|}\sum_{t\in T_n} Y_n(t).
\end{aligned}$$
Due to V1, N4 and since $e_n(h) > 1/2^d$ for $n$ large enough and $h\in B^R_{r_{\min}}$,
$$E\big(|Y_n(t)|^{2+\lceil\eta\rceil}\big) \le E\bigg(\sum_{u\in X\cap\Delta(t)}\ \sum_{\substack{v\in X\setminus\{u\}\\ v-u\in B^R_{r_{\min}}}} \frac{L_2 2^d}{\rho_{\min}^2}\bigg)^{2+\lceil\eta\rceil}.$$
The moments $E(|Y_n(t)|^{2+\lceil\eta\rceil})$ are thus bounded by sums of integrals involving the normalized joint intensities $g^{(k)}(u_1,\dots,u_{k-1})$ times $(L_2 2^d/\rho_{\min}^2)^{2+\lceil\eta\rceil}$ for $k = 2,\dots,2(2+\lceil\eta\rceil)$. These integrals are bounded uniformly in $t$ and $n$ due to assumption N2. Thus,
$$\sup_{n\ge1}\sup_{t\in T_n} E\big(|Y_n(t)|^{2+\eta}\big) \le \sup_{n\ge1}\sup_{t\in T_n} E\big(|Y_n(t)|^{2+\lceil\eta\rceil}\big) < \infty$$
and hence $\{|Y_n(t)|^{2+\eta} : t\in T_n,\ n\ge1\}$ is a uniformly integrable family (triangular array) of random variables. Invoking finally N1 and N3 and letting $\sigma_n^2 = \mathrm{Var}\{\sum_{t\in T_n} Y_n(t)\}$, it follows directly from Theorem 3.1 in Biscio and Waagepetersen (2016) that
$$\sigma_n^{-1} \sum_{t\in T_n} \big[Y_n(t) - E\{Y_n(t)\}\big] \xrightarrow{D} N(0,1),$$
which is equivalent to Theorem 4.1.
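The tiling of $\mathbb{R}^d$ by the hyper-cubes $\Delta(t)$ underlying the decomposition of $S_n$ can be illustrated directly. A small sketch (the helper names are ours) that maps a point to the unique block containing it and checks the partition property:

```python
import numpy as np

def block_index(x, s):
    """The t in Z^d with x in Delta(t) = prod_i (s(t_i - 1/2), s(t_i + 1/2)].
    From the half-open interval: t_i = ceil(x_i / s - 1/2)."""
    return np.ceil(np.asarray(x, float) / s - 0.5).astype(int)

def in_block(x, t, s):
    x, t = np.asarray(x, float), np.asarray(t, float)
    return bool(np.all((x > s * (t - 0.5)) & (x <= s * (t + 0.5))))

s = 0.25
rng = np.random.default_rng(1)
for x in rng.uniform(-3.0, 3.0, size=(500, 2)):
    t = block_index(x, s)
    assert in_block(x, t, s)  # x lies in its assigned block ...
    for shift in ([1, 0], [-1, 0], [0, 1], [0, -1]):
        assert not in_block(x, t + np.array(shift), s)  # ... and in no neighbor

# Boundary convention: the upper face belongs to the block
assert list(block_index([0.375, -0.125], s)) == [1, -1]
```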
D Asymptotic mean of $\hat\theta_{k,n}^2$
Consider a real function $f$ on $\mathbb{R}^d\times\mathbb{R}^d$ where $f(h_1,h_2)\ne0$ implies $|W_n\cap(W_n)_{h_1}|\cdot|W_n\cap(W_n)_{h_2}| > 0$. Then by the fourth order Campbell formula,
$$\begin{aligned}
&E\bigg\{\sum^{\ne}_{u,v,u',v'\in X\cap W_n} \frac{f(v-u,\ v'-u')}{\rho(u)\rho(v)\rho(u')\rho(v')\,|W_n\cap(W_n)_{v-u}|\,|W_n\cap(W_n)_{v'-u'}|}\bigg\} \\
&= \int_{W_n^4} \frac{f(v-u,\ v'-u')}{|W_n\cap(W_n)_{v-u}|\,|W_n\cap(W_n)_{v'-u'}|}\, g^{(4)}(v-u,\ u'-u,\ v'-u)\,\mathrm{d}u\,\mathrm{d}v\,\mathrm{d}u'\,\mathrm{d}v' \\
&= \int_{(\mathbb{R}^d)^4} f(h_1,h_2)\, g^{(4)}(h_1,\ u'-u,\ h_2+u'-u) \cdot \frac{1\{u\in W_n\cap(W_n)_{h_1},\ u'\in W_n\cap(W_n)_{h_2}\}}{|W_n\cap(W_n)_{h_1}|\,|W_n\cap(W_n)_{h_2}|}\,\mathrm{d}u\,\mathrm{d}h_1\,\mathrm{d}u'\,\mathrm{d}h_2 \\
&= \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} f(h_1,h_2)\, \frac{\int_{W_n\cap(W_n)_{h_1}}\int_{W_n\cap(W_n)_{h_2}} g^{(4)}(h_1,\ u'-u,\ h_2+u'-u)\,\mathrm{d}u\,\mathrm{d}u'}{|W_n\cap(W_n)_{h_1}|\,|W_n\cap(W_n)_{h_2}|}\,\mathrm{d}h_1\,\mathrm{d}h_2.
\end{aligned}$$
This expectation is the sum of
$$A = \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} f(h_1,h_2)\,\bigg\{\frac{\int_{W_n\cap(W_n)_{h_1}}\int_{W_n\cap(W_n)_{h_2}} g(h_1)g(h_2)\,\mathrm{d}u\,\mathrm{d}u'}{|W_n\cap(W_n)_{h_1}|\,|W_n\cap(W_n)_{h_2}|}\bigg\}\,\mathrm{d}h_1\,\mathrm{d}h_2 = \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} f(h_1,h_2)\,g(h_1)g(h_2)\,\mathrm{d}h_1\,\mathrm{d}h_2$$
and
$$\begin{aligned}
B_n &= \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} f(h_1,h_2) \cdot \frac{\int_{W_n\cap(W_n)_{h_1}}\int_{W_n\cap(W_n)_{h_2}} \big\{g^{(4)}(h_1,\ u'-u,\ h_2+u'-u) - g(h_1)g(h_2)\big\}\,\mathrm{d}u\,\mathrm{d}u'}{|W_n\cap(W_n)_{h_1}|\,|W_n\cap(W_n)_{h_2}|}\,\mathrm{d}h_1\,\mathrm{d}h_2 \\
&= \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} f(h_1,h_2)\,\bigg[\int_{\mathbb{R}^d} \frac{|W_n\cap(W_n)_{h_1}\cap(W_n)_{h_3}\cap(W_n)_{h_2+h_3}|}{|W_n\cap(W_n)_{h_1}|\,|W_n\cap(W_n)_{h_2}|} \cdot \big\{g^{(4)}(h_1,h_3,h_2+h_3) - g(h_1)g(h_2)\big\}\,\mathrm{d}h_3\bigg]\,\mathrm{d}h_1\,\mathrm{d}h_2.
\end{aligned}$$
We now specialize to $f(h_1,h_2) = f_k(h_1)f_k(h_2)$, where
$$f_k(h) = \phi_k(\|h\|-r_{\min})\, w(\|h\|-r_{\min})\, 1(h\in B^R_{r_{\min}})/(sa_d\|h\|^{d-1}).$$
Then, by V3,
$$\begin{aligned}
|B_n| &\le \int_{B^R_{r_{\min}}}\int_{B^R_{r_{\min}}} f_k(h_1)f_k(h_2)\bigg[\int_{\mathbb{R}^d} \frac{|W_n\cap(W_n)_{h_1}\cap(W_n)_{h_3}\cap(W_n)_{h_2+h_3}|}{|W_n\cap(W_n)_{h_1}|\,|W_n\cap(W_n)_{h_2}|} \cdot \big|g^{(4)}(h_1,h_3,h_2+h_3) - g(h_1)g(h_2)\big|\,\mathrm{d}h_3\bigg]\,\mathrm{d}h_1\,\mathrm{d}h_2 \\
&\le C_4 \int_{B^R_{r_{\min}}}\int_{B^R_{r_{\min}}} \frac{f_k(h_1)f_k(h_2)}{|W_n\cap(W_n)_{h_1}|}\,\mathrm{d}h_1\,\mathrm{d}h_2.
\end{aligned}$$
Thus $B_n$ tends to zero as $n\to\infty$. Regarding $A$, we have
$$A = \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} f_k(h_1)f_k(h_2)\,g(h_1)g(h_2)\,\mathrm{d}h_1\,\mathrm{d}h_2 = \bigg\{\int_{\mathbb{R}^d} f_k(h)\,g(h)\,\mathrm{d}h\bigg\}^2 = \bigg\{\int_{r_{\min}}^{r_{\min}+R} g(r)\,\phi_k(r-r_{\min})\,w(r-r_{\min})\,\mathrm{d}r\bigg\}^2 = \theta_k^2.$$
E Asymptotic behavior of Bessel function roots
It is known (see Watson, 1995, p. 199) that as $r\to\infty$,
$$J_\nu(r) \sim \Big(\frac{2}{\pi r}\Big)^{1/2} \cos\Big(r - \frac{\nu\pi}{2} - \frac{\pi}{4}\Big),$$
which implies that
$$\alpha_{\nu,k} = \Big(k + \frac{\nu}{2} - \frac{1}{4}\Big)\pi + O(k^{-1}), \quad \text{as } k\to\infty, \tag{E.1}$$
and $\alpha_{\nu,k}\to\infty$ as $k\to\infty$. We can argue that for large $k$,
$$J_{\nu+1}(\alpha_{\nu,k}) \approx \Big(\frac{2}{\pi\alpha_{\nu,k}}\Big)^{1/2} \cos\Big(\alpha_{\nu,k} - \frac{\pi}{4} - \frac{(\nu+1)\pi}{2}\Big) = \Big(\frac{2}{\pi\alpha_{\nu,k}}\Big)^{1/2}\cos\big((k-1)\pi + O(k^{-1})\big),$$
and consequently
$$\big|J_{\nu+1}(\alpha_{\nu,k})\big| \approx \Big(\frac{2}{\pi\alpha_{\nu,k}}\Big)^{1/2}. \tag{E.2}$$
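The expansions (E.1) and (E.2) are easy to check numerically; a hedged sketch for $\nu = 0$ (the planar case $d = 2$), using SciPy's Bessel routines:

```python
import numpy as np
from scipy.special import jn_zeros, jv

nu = 0
k = np.arange(1, 31)
alpha = jn_zeros(nu, 30)  # alpha_{0,k}: first 30 positive zeros of J_0

# (E.1): alpha_{nu,k} = (k + nu/2 - 1/4) pi + O(1/k)
mcmahon = (k + nu / 2 - 0.25) * np.pi
assert np.all(np.abs(alpha - mcmahon) * k < 0.1)  # error decays like 1/k

# (E.2): |J_{nu+1}(alpha_{nu,k})| ~= sqrt(2 / (pi alpha_{nu,k}))
approx = np.sqrt(2.0 / (np.pi * alpha))
assert np.max(np.abs(np.abs(jv(nu + 1, alpha)) - approx)) < 1e-2
```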
F Order of sum of products of Bessel basis functions
In this section we consider the Bessel basis in the case $r_{\min} = 0$. For $\nu\ge0$ and $r>0$ (Landau, 2000),
$$\big|J_\nu(r)\big| \le \frac{c}{r^{1/3}},$$
where $c = 0.7857468704\dots$, and hence
$$\big|\phi_k(r)\big| \le \frac{c\, r^{-\nu-1/3}\sqrt{2}}{(R^2\alpha_{\nu,k})^{1/3}\,\big|J_{\nu+1}(\alpha_{\nu,k})\big|}, \quad r>0,\ k\ge1.$$
Using (E.1) and (E.2), for large $k$,
$$\frac{c\, r^{-\nu-1/3}\sqrt{2}}{(R^2\alpha_{\nu,k})^{1/3}\,\big|J_{\nu+1}(\alpha_{\nu,k})\big|} \approx \frac{c\sqrt{\pi}\, r^{-\nu-1/3}}{R^{2/3}}\,\alpha_{\nu,k}^{1/6} \approx \frac{c\sqrt{\pi}\, r^{-\nu-1/3}}{R^{2/3}}\Big\{\Big(k+\frac{\nu}{2}-\frac{1}{4}\Big)\pi\Big\}^{1/6}.$$
Since $\lim_{r\to0} J_\nu(r)r^{-\nu} = \{\Gamma(\nu+1)2^\nu\}^{-1}$, we also obtain for large $k$ and $0<\|h\|<R$,
$$|\phi_k(\|h\|)| \le \text{const}\,\Big(\frac{R}{\alpha_{\nu,k}}\Big)^{-\nu}\frac{\sqrt2}{R\, J_{\nu+1}(\alpha_{\nu,k})} \approx \text{const}\,\frac{\sqrt{\pi}\,\alpha_{\nu,k}^{\nu+1/2}}{R^{\nu+1}} = O(k^{\nu+1/2}).$$
Thus, for fixed $r$ and $0<\|h\|<R$,
$$|\phi_k(r)\phi_k(\|h\|)| = O\big(k^{1/6+\max(1/6,\,\nu+1/2)}\big) = O\big(k^{1/6+\max(1/6,\,d/2-1/2)}\big).$$
By the generalization of Faulhaber's formula (McGown and Parks, 2007),
$$\sum_{k=1}^{K_n} k^p = O(K_n^{p+1}), \quad p > -1.$$
Therefore,
$$\frac{1}{K_n^\omega}\sum_{k=1}^{K_n} \big|\phi_k(r)\phi_k(\|h\|)\big| = \begin{cases} O(K_n^{4/3-\omega}) & d = 1\\ O(K_n^{d/2+2/3-\omega}) & d > 1 \end{cases}$$
for $\omega > 0$ and $0 < r, \|h\| < R$.
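The generalized Faulhaber bound used above, $\sum_{k=1}^{K} k^p = O(K^{p+1})$ for $p > -1$, can be checked numerically; a small sketch (the helper name is ours):

```python
import numpy as np

def faulhaber_ratio(p, K):
    """sum_{k=1}^K k^p divided by K^{p+1} / (p + 1); tends to 1 for p > -1."""
    k = np.arange(1, K + 1, dtype=float)
    return float(np.sum(k ** p) * (p + 1) / K ** (p + 1))

# Non-integral powers, as in McGown and Parks (2007)
for p in (-0.5, 1.0 / 6.0, 2.0 / 3.0, 1.5):
    near, far = faulhaber_ratio(p, 100), faulhaber_ratio(p, 100_000)
    assert abs(far - 1.0) < abs(near - 1.0)  # ratio approaches 1 ...
    assert abs(far - 1.0) < 0.01             # ... so the sum is O(K^{p+1})
```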
G Simulation study
The results in Figure G.1 are obtained as for the simulation study in the main document but with W = [0, 2]². The estimated cut-offs K are summarized in Table G.1. Simulation means and 95% envelopes for gk, gc and go with both the Fourier-Bessel and cosine bases and the simple smoothing scheme are shown in Figure G.2 and Figure G.3. Figure G.4 shows histograms of orthogonal series estimates. Comments on these figures and the table are given in the main document.
[Figure G.1 comprises panels of log relative efficiency with respect to gk for the Thomas, Poisson, DPP and Variance Gamma processes, each over the lag ranges (rmin, .025] and (rmin, R] with R = 0.060, 0.085, 0.125; estimators: kernel d, kernel c, Bessel and cosine, with simple, refined and Wahba smoothing schemes.]
Figure G.1: Plots of log relative efficiencies for small lags (rmin, 0.025] and all lags (rmin, R], R = 0.06, 0.085, 0.125, and W = [0, 2]². Black: kernel estimators. Blue and red: orthogonal series estimators with Bessel respectively cosine basis. Lines serve to ease visual interpretation.
Table G.1: Monte Carlo mean and quantiles of the estimated cut-off K obtained from (5.4) for the orthogonal series estimator with Bessel (go1) and cosine (go4) basis in the case of Poisson (P), Thomas (T), Variance Gamma (V) and determinantal (D) point processes on observation windows W1 = [0, 1]² and W2 = [0, 2]² with rmin = 0.001.
                   R = 0.06                     R = 0.085                    R = 0.125
           E(K)  q0.05(K)  q0.95(K)     E(K)  q0.05(K)  q0.95(K)     E(K)  q0.05(K)  q0.95(K)
W1 P go1   2.17    2.00      3.00       2.15    2.00      3.00       2.11    2.00      3.00
W1 P go4   2.17    2.00      4.00       2.15    2.00      4.00       2.11    2.00      4.00
W1 T go1   2.24    2.00      3.00       2.24    2.00      3.00       3.23    2.00      4.05
W1 T go4   2.24    2.00      4.00       2.24    2.00      4.00       3.23    3.00      5.00
W1 V go1   2.77    2.00      4.00       3.50    2.00      6.00       4.85    3.00      8.00
W1 V go4   2.77    2.00      5.00       3.50    2.00      7.00       4.85    3.00     10.00
W1 D go1   2.21    2.00      3.00       2.18    2.00      3.00       2.38    2.00      3.00
W1 D go4   2.21    2.00      3.00       2.18    2.00      4.00       2.38    2.00      5.00
W2 P go1   2.17    2.00      3.00       2.12    2.00      3.00       2.09    2.00      3.00
W2 P go4   2.17    2.00      4.00       2.12    2.00      4.00       2.09    2.00      4.00
W2 T go1   2.39    2.00      4.00       2.46    2.00      4.00       3.78    3.00      5.00
W2 T go4   2.39    2.00      5.00       2.46    2.00      5.00       3.78    3.00      6.00
W2 V go1   3.58    3.00      5.00       5.14    3.00      8.00       7.19    5.00     11.00
W2 V go4   3.58    3.00      9.00       5.14    4.00     12.00       7.19    5.00     17.00
W2 D go1   2.22    2.00      4.00       2.17    2.00      3.00       2.79    2.00      4.00
W2 D go4   2.22    2.00      3.00       2.17    2.00      4.00       2.79    3.00      5.00
[Figure G.2 comprises a 4 × 4 grid of panels plotting g(r) against r ∈ [0, 0.08]: rows correspond to the Poisson, Thomas, Variance Gamma and determinantal processes, columns to the kernel, bias-corrected kernel, Bessel (simple scheme) and cosine (simple scheme) estimators.]
Figure G.2: True pair correlation function (solid line), Monte Carlo mean (dashed lines) and 95% pointwise probability interval (grey area) of estimates based on nsim = 1000 simulations from the Poisson (first row), Thomas (second row), Variance Gamma (third row) and determinantal (fourth row) point processes on W = [0, 1]².
[Figure G.3 has the same layout as Figure G.2, with the simulations on W = [0, 2]²: rows correspond to the Poisson, Thomas, Variance Gamma and determinantal processes, columns to the kernel, bias-corrected kernel, Bessel (simple scheme) and cosine (simple scheme) estimators.]
Figure G.3: True pair correlation function (solid line), Monte Carlo mean (dashed lines) and 95% pointwise probability interval (grey area) of estimates based on nsim = 1000 simulations from the Poisson (first row), Thomas (second row), Variance Gamma (third row) and determinantal (fourth row) point processes on W = [0, 2]².
[Figure G.4 comprises three rows of paired histograms of the estimates of g(0.025) (left column) and g(0.1) (right column).]
Figure G.4: Histograms of go(r) at r = 0.025 and r = 0.1 using the Bessel basis with the simple smoothing scheme in case of the Thomas process on W = [0, 1]² (upper panels), W = [0, 2]² (middle panels) and W = [0, 3]² (lower panels).
H Data example
Figure H.1 shows the data sets and fitted intensity functions considered in Section 7 of the main document.
Figure H.1: Locations of Acalypha diversifolia, Lonchocarpus heptaphyllus and Capparis frondosa trees in the Barro Colorado Island plot (upper panels) and their fitted parametric intensity functions (lower panels).
For the Capparis frondosa species and the orthogonal series estimator with cosine basis, the function I(K) given in (5.3) of the main document is shown in Figure H.2. Although I(K) is decreasing over 1 ≤ K ≤ 49, the rate of decrease slows down after K = 7.
Figure H.2: Estimate I(K) of the mean integrated squared error for Capparis frondosa in case of the orthogonal series estimator with cosine basis.
I Behavior of the Fourier-Bessel and cosine basis
Figure I.1 shows the Fourier-Bessel and cosine basis functions $\phi_k(r)$ in the planar case ($d = 2$) for $R = 0.125$, $k = 1,\dots,8$ and $r\in[0, 0.125]$.
[Figure I.1 comprises sixteen panels plotting $\phi_k(r)$ for $k = 1,\dots,8$ and $r\in[0, 0.125]$: the Fourier-Bessel basis functions (upper eight panels) and the cosine basis functions (lower eight panels).]
Figure I.1: Fourier-Bessel and cosine basis functions $\phi_k(r)$ in the planar case ($d = 2$) for $R = 0.125$ and $k = 1,\dots,8$.
Obviously, the cosine basis functions are uniformly bounded and integrable. However, the Fourier-Bessel basis functions exhibit damped oscillations with $\phi_k(R) = 0$ and
$$\phi_k(0) = \frac{\alpha_{\nu,k}^\nu}{R^{\nu+1}\, 2^{\nu-1/2}\,\Gamma(\nu+1)\, J_{\nu+1}(\alpha_{\nu,k})}$$
for all $k\ge1$, because $\lim_{r\to0} J_\nu(r)r^{-\nu} = 1/(\Gamma(\nu+1)2^\nu)$ for $\nu\ge0$. Thus, $\phi_k(0)\to\infty$ as $k\to\infty$.
For $0\le\nu\le1/2$ (or equivalently $d = 2, 3$), $|J_\nu(r)| \le (2/(\pi r))^{1/2}$ and hence
$$\int_0^{\alpha_{\nu,k}} \big|J_\nu(r)\big|\, r^{\nu+1}\,\mathrm{d}r \le \Big(\frac{2}{\pi}\Big)^{1/2}\int_0^{\alpha_{\nu,k}} r^{\nu+\frac12}\,\mathrm{d}r = \Big(\frac{2}{\pi}\Big)^{1/2}\frac{(\alpha_{\nu,k})^{\nu+\frac32}}{\nu+\frac32}.$$
Therefore, as $k\to\infty$,
$$\begin{aligned}
\int_0^R \big|\phi_k(r)\big|\, w(r)\,\mathrm{d}r &= \frac{\sqrt2}{R\,\big|J_{\nu+1}(\alpha_{\nu,k})\big|}\Big(\frac{R}{\alpha_{\nu,k}}\Big)^{\nu+2}\int_0^{\alpha_{\nu,k}}\big|J_\nu(r)\big|\, r^{\nu+1}\,\mathrm{d}r \\
&\le \frac{\sqrt2}{R\,\big|J_{\nu+1}(\alpha_{\nu,k})\big|}\Big(\frac{R}{\alpha_{\nu,k}}\Big)^{\nu+2}\Big(\frac{2}{\pi}\Big)^{1/2}\frac{(\alpha_{\nu,k})^{\nu+\frac32}}{\nu+\frac32} \\
&= \frac{2R^{\nu+1}}{\big|J_{\nu+1}(\alpha_{\nu,k})\big|\,(\pi\alpha_{\nu,k})^{1/2}\,(\nu+\frac32)} \approx \frac{2R^{\nu+1}}{\big(\frac{2}{\pi\alpha_{\nu,k}}\big)^{1/2}(\pi\alpha_{\nu,k})^{1/2}(\nu+\frac32)} = \frac{\sqrt2\, R^{\nu+1}}{\nu+\frac32} < \infty,
\end{aligned}$$
which implies uniform integrability of $\phi_k(r)$.
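The boundedness and orthonormality properties discussed above can be checked numerically in the planar case. A hedged sketch, assuming the Fourier-Bessel form $\phi_k(r) = \sqrt2\, J_\nu(\alpha_{\nu,k} r/R)/(R\, J_{\nu+1}(\alpha_{\nu,k})\, r^\nu)$ with weight $w(r) = r^{d-1}$, which is consistent with the $\phi_k(0)$ expression above (here $\nu = d/2 - 1 = 0$):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import jn_zeros, jv

R = 0.125
alpha = jn_zeros(0, 8)  # alpha_{0,k}, k = 1, ..., 8

def phi(k, r):
    """Assumed Fourier-Bessel basis function for nu = 0 (d = 2)."""
    a = alpha[k - 1]
    return np.sqrt(2.0) * jv(0, a * r / R) / (R * jv(1, a))

# Orthonormality: int_0^R phi_j(r) phi_k(r) w(r) dr = delta_{jk}, w(r) = r
for j in (1, 2):
    for k in (1, 2, 3):
        val, _ = quad(lambda r: phi(j, r) * phi(k, r) * r, 0.0, R)
        assert abs(val - (1.0 if j == k else 0.0)) < 1e-6

# phi_k(R) = 0, and |phi_k(0)| grows with k, matching the display above
assert abs(phi(1, R)) < 1e-8
assert abs(phi(8, 0.0)) > abs(phi(1, 0.0))
```

The same check for $d = 3$ would use $\nu = 1/2$ and $w(r) = r^2$, whose zeros are $\alpha_{1/2,k} = k\pi$ exactly, in agreement with (E.1).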