Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data...

57
ISSN 1440-771X Department of Econometrics and Business Statistics http://business.monash.edu/econometrics-and-business- statistics/research/publications October 2019 Working Paper 26/19 Time-Varying Coefficient Spatial Autoregressive Panel Data Model with Fixed Effects Xuan Liang, Jiti Gao and Xiaodong Gong

Transcript of Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data...

Page 1: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

ISSN 1440-771X

Department of Econometrics and Business Statistics

http://business.monash.edu/econometrics-and-business-statistics/research/publications

October 2019

Working Paper 26/19

Time-Varying Coefficient Spatial Autoregressive Panel Data Model

with Fixed Effects

Xuan Liang, Jiti Gao and Xiaodong Gong

Page 2: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Time–Varying Coefficient Spatial Autoregressive

Panel Data Model with Fixed Effects

Xuan Liang†, Jiti Gao? and Xiaodong Gong‡

The Australian National University†, Monash University?,

The University of Canberra‡, Australia

October 30, 2019

Abstract

This paper develops a time-varying coefficient spatial autoregressive panel data model with the

individual fixed effects to capture the nonlinear effects of the regressors, which vary over the time.

To effectively estimate the model, we propose a method that incorporates the nonparametric local

linear method and the concentrated quasi-maximum likelihood estimation method to obtain consistent

estimators for the spatial coefficient and the time-varying coefficient function. The asymptotic properties

of these estimators are derived as well, showing the regular√NT–rate of convergence for the parametric

parameters and the common√NTh–rate of convergence for the nonparametric component, respectively.

Monte Carlo simulations are conducted to illustrate the finite sample performance of our proposed

method. Meanwhile, we apply our method to study the Chinese labor productivity to identify the

spatial influences and the time-varying spillover effects among 185 Chinese cities with comparison to the

results on a subregion East China.

Key Words: Concentrated quasi-maximum likelihood estimation, local linear estimation, time–varying

coefficient.

JEL Classifications: C21, C23

1 Introduction

Panel data analysis has widely been used in many fields of social sciences as it usually enables strong

identification and increases estimation efficiency. A comprehensive review about these methodologies can be

found in Arellano (2003); Baltagi (2008); Hsiao (2014). In classical panel data models, we normally assume

the independence in the error terms among different units. Even though some dependence assumptions

can be made in the error term, no clear cross-sectional dependence structure can be modelled in pure panel

data models.

Spatial econometric models, which are designed to model spatial interactions, have provided a way

to model the cross-sectional dependence with a clear model structure and meaningful interpretations. A

class of spatial autoregressive (SAR) models is first proposed in Cliff and Ord (1973). Since then, it has

become one of the most active research fields in spatial econometrics. One issue with cross-sectional spatial

econometric models is that the spatial lag terms are endogenous. Various methods have been proposed

The first and the second authors acknowledge the Australian Research Council Discovery Grants Program for its financialsupport under Grant Numbers: DP150101012 & DP170104421.

1

Page 3: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

to overcome this issue, e.g. the instrumental variable (IV) method (Kelejian and Prucha, 1998), the

generalised method of moments (GMM) framework (Kelejian and Prucha, 1999) and the quasi-maximum

likelihood (QML) method (Lee, 2004)1. Combining the advantages of traditional panel data model with

spatial features, spatial panel data models have received considerable attentions. Kapoor et al. (2007)

and Fingleton (2008) both adopt a method of moments (MOM) estimation, but consider different error

structure models with the former containing an SAR-disturbance term while the latter assuming SAR-

dependent variables with random components and a spatial moving average (SMA)-disturbance. Lee and

Yu (2010) consider a QML method to estimate spatial autoregressive panel models under a fixed effects

setup. More recent studies on spatial dynamic panel data models can be found in Yu et al. (2008); Lee

and Yu (2014); Li (2017).

In these standard spatial panel data models, the coefficients of the explanatory variables are assumed

constant. Yet, we know that, especially when the time span of data is long, covariate effects probably change

over time. The reason behind could be due to economic structure change, policy reform, environment

evolution or technology development. In such cases, assuming coefficients of regressors to be constant

is not reasonable not only because it potentially leads to inconsistent estimation but also to misleading

interpretations. In panel data settings, such issues have been recognised and studied well in Li et al. (2011),

Chen et al. (2012) and Silvapulle et al. (2017). Hence, it is necessary to investigate possible time-varyingness

in spatial panel data models.

In addition, evolution of these time–varying effects may be of unknown nonlinear functional forms.

There have been some recent studies for pure spatial models involving nonparametric structures. For

example, Su and Jin (2010) consider using QML method to estimate a partially linear SAR model, and

Su (2012) proposes a semiparametric GMM estimation method for SAR models with a nonparametric

regressor term. A functional–coefficient spatial model with nonparametric spatial weights is also studied in

Sun (2016), and Malikov and Sun (2017) study semiparametric estimation and testing of smooth coefficient

spatial autoregressive models. In terms of spatial panel data models, Zhang and Sun (2015) and Zhang

and Shen (2015) discuss estimation of semiparametric varying-coefficient spatial panel data models with

random-effects and partial specified dynamic spatial panel data models, respectively.

Thus, in this paper, we propose a time-varying coefficient spatial panel data model with fixed effects.

This model allows for regressors to be trending nonstationary so that level data may be used directly

without differencing.2 We specify the time-varying coefficients as nonparametric functions of time to

allow for a flexible functional form. The model is estimated using Quasi Maximum Likelihood Estimator

(QMLE)combined with the local linear regression–a local linear concentrated quasi-maximum likelihood

estimator. We show that under some reasonable assumptions, the estimates obtained are asymptotically

consistent and follows a normal distribution. The finite–sample performance of the estimator is evaluated

and compared with those of the parametric estimators. We also applied the estimator to analyse labour

1One can refer to some classic spatial econometrics books for more logical concepts and details of spatial econometrics, e.g.Anselin et al. (2013); LeSage and Pace (2009).

2When time-varying coefficients and trends involved in regressors do not exist, our model reduces to a classical spatialpanel data model which is considered in Lee and Yu (2010).

2

Page 4: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

productivity in Chinese cities.

Our contributions are then summarized as follows.

• Since there exists both unknown parameters and nonparametric components, we propose incorpo-

rating the local linear estimation (Fan and Gijbels, 1996) into the QMLE estimation to consistently

estimate unknown parameters and functions. We then establish new asymptotic normality results

for the proposed estimators under a general model structure and a set of reasonable assumptions.

• We compare the finite–sample performance of our proposed estimation method with existing paramet-

ric counterparts under several scenarios. We find that parametric estimation methods produce large

biases, but our proposed estimation produces consistent estimates when the true model has strong

time varying feature on the one hand. Even when the model is not time–varying, the performance of

our method is almost the same as the parametric ones.

• To further evaluate the finite–sample performance and empirical relevance, we also apply our model

and the associated estimation method to the analysis of a Chinese labour productivity dataset, which

includes 185 cities over the period of 1995–2009. We demonstrate that the estimated coefficients have

quite strong time–varying features.

The rest of paper is organised as follows. Section 2 introduces the model setting first and proposes the

estimation procedure. After we introduce the necessary assumptions in Section 3, we establish theoretical

results in Section 4. Simulated and real data examples are reported in Section 5 and 6, respectively. We

conclude the paper and discuss future research in Section 7. Appendices A–C provide the proofs of the

main theorems and some useful lemmas, respectively. Appendix D gives some additional simulation results.

2 Model Setting and Estimation

2.1 Model

The models we consider take the following form:

Yit = ρ0

∑j 6=i

wijYjt +

d∑k=1

β0,tkXitk + α0,i + eit

= ρ0

∑j 6=i

wijYjt +X>itβ0,t + α0,i + eit, t = 1, · · · , T, i = 1, · · · , N, (2.1)

where Yit is the response of location i at time t, Xit = (Xit1, · · · , Xitd)> is a d-dimensional vector with the

corresponding time-varying coefficient vector function β0,t = (β0,t1, · · · , β0,td)>, α0,i reflects the unobserved

individual fixed effects, wij describes the spatial weight of observation j to i, which is normally a decreasing

function of spatial distance between i and j, the scalar parameter ρ0 measures the strength of spatial

dependence belonging to (−1, 1) and the error component is eit with variance σ20. In this model, the term

ρ0wijYjt captures the spatial interaction and X>itβ0,t measures the covariate effects over time.

3

Page 5: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Here we denote T and N as the time length and the number of spatial units, respectively. For

the purpose of model identification, we assume that N−1∑N

i=1 α0,i = 0 throughout the paper. Thus,

(α0,1, · · · , α0,N )> = D0α0 where D0 = (−1N−1, IN−1)>. Define a spatial weight matrix W = (wij)N×N

with zero diagonal elements, i.e. wii = 0. A clear matrix form of model (2.1) can be written as

Yt = ρ0WYt +Xtβ0,t +D0α0 + et, t = 1, · · · , T, (2.2)

where Yt = (Y1t, · · · , YNt)>, Xt = (X1t, · · · , XNt)>, α0 = (α0,2, · · · , α0,N )> and et = (e1t, · · · , eNt)>.

When β0,t does not vary over time, and clearly it reduces to a vector of constants, model (2.2) becomes

the traditional spatial panel data model as discussed in Lee and Yu (2010). This paper is to construct

consistent estimates for the spatial coefficient ρ0 and the time-varying coefficient vector function β0,t as

well as the variance σ20. In the same way as the nonparametric function forms used in Li et al. (2011) and

Chen et al. (2012), we use the following specification:

β0,t = β0(τt), t = 1, · · · , T,

where β0,t(·) is a d× 1vector of smooth functions defined on R and τt = t/T ∈ (0, 1]. The reason to scale

the time to (0,1] is for some technical reasons when using a nonparametric kernel method. Thus, model

(2.2) has the form:

Yt = ρ0WYt +Xtβ0(τt) +D0α0 + et, t = 1, · · · , T. (2.3)

If only some components of β0,t change over time, it means a partial linear time-varying spatial panel

data model.

Model (2.3) shows the structure for each time t. Denote

Y =

Y1

...

YT

, X =

X1

. . .

XT

, β0 =

β0(τ1)

...

β0(τT )

, e =

e1

...

eT

.

Let SN (ρ) = IN − ρW , SN,T (ρ) = IT ⊗ SN (ρ), Y ∗(ρ) = SN,T (ρ)Y and D = 1T ⊗D0, where ⊗ denotes the

Kronecker products. Model (2.3) can be written in the matrix form as

Y ∗(ρ0) = Xβ0 +Dα0 + e. (2.4)

In model (2.4), we move the spatial lag term (ρ0WYt) to the left side so that Y ∗(ρ0) can be regarded

as the new response as if ρ0 were known.

4

Page 6: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

2.2 Estimation

To estimate the model, we combine the local linear estimation and quasi-likelihood estimation methods.

In this method, the time-varying component β(τ) is first expressed as a function of the non-time-varying

parametric parameters (ρ and α). For given values of ρ and α, β(τ) can be estimated using local linear

regression. These parameters can then be ’concentrated out’ so that quasi-likelihood method is used

to estimate the two parametric parameters ρ and α. With ρ and α estimated, the estimates of the

nonparametric time-varying parameters can then be updated. This is an iterative procedure.

Let K(·) and h be the kernel function and the smoothing bandwidth, respectively and denote

M(τ) =

X1

1−τTTh X1

......

XTT−τTTh XT

, ω(τ) =

K(

1−τTTh

). . .

K(T−τTTh

) ,

Ω(τ) = ω(τ) ⊗ IN . Assuming that β0 has continuous derivatives of up to the second order, according to

Taylor expansion we have

β0(τt) = β0(τ) + β′0(τ)(τt − τ) +O((τt − τ)2

), (2.5)

where β′0(·) is the first derivative of β0(·) and τ ∈ (0, 1]. By this approximation,

Xβ0 ≈ M(τ)

β(τ)

hβ′(τ)

.

The model can be estimated with the following procedure.

(i) Estimate β0(·) as a function of ρ and α. Define the following weighted loss function:

L(a,b) =Y ∗(ρ)−M(τ)(a>,b>)> −Dα

>Ω(τ)

Y ∗(ρ)−M(τ)(a>,b>)> −Dα

.

For some given ρ and α, we estimate β(τ) and hβ′(τ) by βρ,α(τ)

hβ′ρ,α(τ)

= arg min(a>,b>)>

L(a,b)

=M>(τ)Ω(τ)M(τ)

−1M>(τ)Ω(τ) Y ∗(ρ)−Dα . (2.6)

Denote Φ(τ) = (Id, 0d×d)M>(τ)Ω(τ)M(τ)

−1M>(τ)Ω(τ). β0(·) can then be estimated by

βρ,α(τ) = Φ(τ)Y ∗(ρ)−Dα. (2.7)

5

Page 7: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

(ii) Plugging βρ,α(τ) in model (2.4), we estimate ρ0 and σ20 by maximising the quasi log likelihood

function:

logLN,T (ρ, σ2,α) = −NT2

log(2πσ2) + T log|SN (ρ)|

− 1

2σ2

T∑t=1

SN (ρ)Yt −Xtβρ,α(τt)−D0α

> SN (ρ)Yt −Xtβρ,α(τt)−D0α

= −NT

2log(2πσ2) + T log|SN (ρ)| − 1

2σ2

Y (ρ)− Dα

> Y (ρ)− Dα

, (2.8)

where Y (ρ) = (INT − S)Y ∗(ρ) and D = (INT − S)D are the smoothing versions of Y ∗(ρ) and D by the

smoothing matrix S = XΦ with Φ = (Φ(τ1)>, · · · ,Φ(τT )>)>, respectively.

Taking the derivative of (2.8) with respect to α and setting it to be zero, we get

α(ρ) = (D>D)−1D>Y (ρ).

Define QN,T = INT − D(D>D)−1D>. Pluging α(ρ) into (2.8) leads to

logLN,T (ρ, σ2) = −NT2

log(2πσ2) + T log|SN (ρ)| − 1

2σ2Y >(ρ)QN,T Y (ρ). (2.9)

Then, taking the derivative of (2.9) with respect to σ2, we have

σ2(ρ) =1

NTY >(ρ)QN,T Y (ρ).

Replacing σ2 with σ2(ρ) in (2.9) , we obtain the concentrated quasi log likelihood function

logLN,T (ρ) = −NT2log(2π) + 1 − NT

2log

1

NTY >(ρ)QN,T Y (ρ)

+ T log|SN (ρ)|. (2.10)

Therefore, we can estimate the parameter θ0 = (ρ0, σ20)> and α0 by θ = (ρ, σ2)> and α as follows:

ρ = maxρ

logLN,T (ρ), (2.11)

σ2 =1

NTY >(ρ)QN,T Y (ρ), (2.12)

α = (D>D)−1D>Y (ρ). (2.13)

(iii) Plug ρ and α into (2.7) to obtain the final estimate of β0(τ) by

β(τ) = Φ(τ)Y ∗(ρ)−Dα. (2.14)

In the following sections, we study θ = (ρ, σ2)> and β(τ) from theoretical and numerical aspects, which

show that our proposed method has some nice properties in both large and small samples.

6

Page 8: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

3 Assumptions

Before continuing, we define the following notation. Throughout the paper, we denote ‖a‖s = (∑n

i=1 |ai|s)1/s

as the s-norm (s ≥ 1) for any generic vector a = (a1, · · · , an)>. For any genericm×mmatrix A = (aij)m×m,

define ‖A‖1 = max1≤j≤m

∑mi=1 |aij | and ‖A‖∞ = max

1≤i≤m

∑mj=1 |aij | as the 1-norm and ∞-norm, respectively.

Let 0m1×m2 to be an m1 ×m2 matrix with all zero elements and Im is the m-dimensional identity matrix.

Donote 0n and 1n as the vectors with n elements of zeros and ones, respectively.

Before we establish the theoretical results of this paper, we introduce the following necessary conditions.

Assumption 1. Let d × 1 vetcor Xit = g(τt) + vit contain a time trend part g(τ) = (g1(τ), · · · , gd(τ))>

and an error part vit = (vit1, · · · , vitd)>.

(i) Suppose that g(τ) is a continuous function for any 0 ≤ τ ≤ 1.

(ii) Denote vt = (v1t, · · · ,vNt)>. Suppose that vt, t ≥ 1 is a stationary sequence with mean zero

and α-mixing with mixing coefficient αN (t). Suppose there exist a function α(t) and a constant δ such that

supN≥1 αN (t) < α(t) and∑∞

t=1 α(t)δ/(4+δ) <∞ for some δ > 0.

(iii) For some η1 > 0, vit, i ≥ 1, t ≥ 1 is identically distributed with E(‖vit‖4+η12 ) < ∞ and satisfies

the following conditions:

E(vitv>it ) = Σv = (σ

(k1,k2)v )d×d,

E

∥∥∥∥∥N∑i=1

vit

∥∥∥∥∥2+η1/2

2

= O(N1+η1/4

), E

∥∥∥∥∥N∑i=1

vitv>it

∥∥∥∥∥2+η1/2

2

= O(N1+η1/4

),

T∑t1,t2,t3,t4=1

N∑i1,i2=1

Cov(v>i1t1vi1t2 ,v

>i2t3vi2t4

)= O(NT 3),

andT∑

t1,t2,t3,t4=1

N∑i1,i2,i3,i4

Cov(v>i1t1vi2t2 ,v

>i3t3vi4t4

)= O(N2T 3).

Assumption 2. The error term et, t ≥ 1 is a stationary sequence of martingale differences such that

(i) E(et|EN,t−1) = 0N and E(ete>t |EN,t−1) = σ2

0IN , where Et,N = σ(es1 ,vis2), s1 ≤ t, s2 ≤ T, 1 ≤ i ≤

N is a σ-filed.

(ii) Denote mj = E(ejit|EN,t−1). We assume for any i, i1, i2, i3, i4 = 1, · · · , N and some η2 > 0,

E(|eit|4+η2) <∞, E(|ei1tei2tei3t|) <∞, E(|ei1tei2tei3tei4t|) <∞.

and also for any i, i1 6= i2 6= i3 6= i4 = 1, · · · , N , almost surely,

E(e3it|EN,t−1) = E(e3

it), E(e2i1tei2t|EN,t−1) = 0, E(ei1tei2tei3t|EN,t−1) = 0,

E(e4it|EN,t−1) = E(e4

it), E(e2i1te

2i2t|EN,t−1) = σ4

0,

E(e2i1tei2tei3t|EN,t−1) = 0, E(ei1tei2tei3tei4t|EN,t−1) = 0.

7

Page 9: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

(iii) For some η2 > 0, E

(∥∥∥∑Ni=1 viteit

∥∥∥2+η2

2

)= O

(N1+η2/2

).

Assumption 3. The kernel function K(·) is symmetric and Lipschitz continuous with compact support on

[−1, 1]. The bandwidth is assumed to satisfy Th→∞, NTh→∞ and NTh8 → 0.

Assumption 4. W is a non-stochastic spatial weight with zero diagonals and is uniformly bounded in both

row and column sums in absolute value (for short, UB), i.e. supn≥1 ‖W‖1 <∞ and supn≥1 ‖W‖∞ <∞.

Assumption 5. SN (ρ) is invertible for all ρ ∈ ∆, where ∆ is a compact interval with the true value ρ0 as

an interior point. Also, SN (ρ) and S−1N (ρ) are both UB, uniformly in ρ ∈ ∆.

Assumption 6. The coefficient function β0(·) has continuous derivatives of up to the second order.

Assumption 7. The fixed effects satisfy that ‖Dα0‖1 <∞ and ‖Dα0‖∞ <∞.

To proceed, we need to introduce some notation. Let SN = SN (ρ0), SN,T = SN,T (ρ0), GN (ρ) =

WS−1N (ρ), GN = GN (ρ0), GN,T = IT ⊗GN , PN,T = (INT − S)>QN,T (INT − S) and RN,T = GN,T (Xβ0 +

Dα0).

Assumption 8. Let ΨR,R = limN,T→∞1NT E(R>N,TPN,TRN,T ) > 0.

The justifications for these assumptions are provided in Appendix A.1.

4 Asymptotic Properties

We first establish the asymptotic consistency of θ = (ρ, σ2)> to θ0 = (ρ0, σ20)>. The proofs of the following

theorems are given in Appendix A.2.

Theorem 1. Under Assumptions 1, 2(i)-(ii), 3-8, θ0 is globally identifiable and θ is consistent to θ0.

According to Theorem 1, we know the uniqueness and consistency of parametric parameter θ is guar-

anteed.

Next, we need to establish an asymptotic distribution for θ in the following theorem. Denote c1 =

limN,T→∞ tr(G2N,T )/NT + tr(G>N,TGN,T )/NT, c2 = limN,T→∞ tr(GN,T )/NT . The existence of c1 and c2

is due to the uniform boundedness of GN,T shown Lemma C.5.

Theorem 2. Under Assumptions 1, 2(i)-(ii), 3-8, as T →∞ and N →∞ simultaneously, then

√NT

(θ − θ0

)d→ N

(02,Σ

−1θ0

+ Σ−1θ0

Ωθ0Σ−1θ0

), (4.1)

where Σθ0 =

1σ20ΨR,R + c1

c2σ20

c2σ20

12σ4

0

is positive definite shown in Lemma C.8, Ωθ0 = limN,T→∞ΩNT,θ0

8

Page 10: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

with

ΩNT,θ0

=

2m3(R>N,TPN,T vec[diag(PN,TGN,T )])

NTσ40

+(m4−3σ4

0)(∑NTi=1(gp)

2ii)

NTσ40

m3(R>N,TPN,T vec[diag(PN,T )])

2NTσ60

+(m4−3σ4

0)(∑NTi=1(gp)iipii)

2NTσ60

m3(R>N,TPN,T vec[diag(PN,T )])

2NTσ60

+(m4−3σ4

0)(∑NTi=1(gp)iipii)

2NTσ60

(m4−3σ40)(

∑NTi=1 p

2ii)

4σ80NT

,

in which pii and (gp)ii are the i-th main diagonal elements of PN,T and PN,TGN,T , respectively.

Since we use QMLE to estimate θ0, it relaxes the normality assumption on the error term but it adds

an additional term to the variance that is a function of the third and fourth moments. If m3 = 0 and

m4 = 3σ20, the variance reduces to Σ−1

θ0, as shown in the following proposition.

Proposition 1. Under Assumptions 1, 2(i)-(ii), 3-8, as T →∞ and N →∞ simultaneously

√NT

(θ − θ0

)d→ N

(02,Σ

−1θ0

)when eit, i ≥ 1, t ≥ 1 is i.i.d normally distributed with Ωθ0 = 02×2 due to m3 = 0 and m4 = 3σ4

0.

Define µj =∫ujK(u)du and νj =

∫ujK2(u)du. Let β

′′0(τ) be the second derivative of β0(τ). An

asymptotic distribution for β(τ) is established in the following theorem.

Theorem 3. Let Assumptions 1–8 hold. As T →∞ and N →∞ simultaneously, we have

√NTh

(β(τ)− β0(τ)− bβ(τ)h2 + oP(h2)

)d→ N

(0d, σ

20ν0Σ−1

X (τ)), (4.2)

where bβ(τ) = 12µ2β

′′0(τ) and ΣX(τ) = g(τ)g(τ)> + Σv.

We can see the rates convergence of β(τ) is√NTh, which is the fast possible in the nonparametric

structure. Also it is clear to see that the variance is related to g(τ) so that it involves the trend of Xit.

When Xit is stationary, the asymptotic variance reduces to a constant matrix σ20ν0(µXµ

>X + Σv)−1 where

µX = E(Xit).

In Section 6 of our case study, we use the following sample estimates to estimate the covariance matrix

of θ0 and β(τ): Σθ0 =

1σ2 ΨR,R + c1

c2σ2

c2σ2

12σ4

, Ωθ0 = ΩNT,θ0 , g(τ) =∑i,tK( τt−τh )Xit∑i,tK( τt−τh )

, and ΣX(τ) =

g(τ)g(τ)> + Σv, where ΨR,R = (NT )−1R>N,TPN,TRN,T , Σv = (NT )−1∑

i,t vitv>it and vit = Xit − g(τt).

The consistency is shown in Lemma C.8.

5 Monte Carlo Simulations

In this section, simulation experiments under different scenarios are carried out to evaluate the finite

sample performance and robustness of our estimates for the proposed model. The Epanechnikove kernel

K(u) = 3/4(1− u2)I(|u| ≤ 1) is adopted throughout the papert, where I(·) is the indicator function. The

bandwidth is selected through a leave-one-unit-out cross-validation method explained in Appendix A.3.

9

Page 11: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Consider a univariate time-varying spatial panel data autoregressive model (d = 1) of the form:

Yt = ρ0WYt +Xtβ0(τt) +D0α0 + et, t = 1, · · · , T.

The spatial matrix W in the data generating process is chosen as a “q step head and q step behind” spatial

weights matrix as in Kelejian and Prucha (1999) with q = 2 in this section. The procedure is as follows:

all the units are arranged in a circle and each unit is affected only by the q units immediately before it

and immediately after it with the weight being 1, and then following Kelejian and Prucha (1999), we also

normalise the spatial weights matrix by letting the sum of each row equal to 1 so that it generates an equal

weight influence from all the neighbouring units to each unit.

The data generating process for our simulation is summarized below. The explanatory variables Xit is

generated by

Xit = g(τt) + vit, 1 ≤ i ≤ N, 1 ≤ t ≤ T,

where g(τ) = a0 + a1τ + a2τ2 is the trend function and vit is the i-th element from a N × 1 vector

vt = 0.2vt−1 + N(0N ,Σ∗) with Σ∗ = (0.5|i−j|)N×N for −99 ≤ t ≤ T and v−100 = 0n. It is obvious

that here vit are both serially and cross-sectionally dependent. The error term eit is independent and

identically generated from the distribution of exp(1) − 1 so that σ20 = 1. The fixed effects follow α0,i =

T−1∑T

t=1Xit for i = 1, · · · , N − 1 and α0,N = −∑N−1

i=1 α0,i. The time-varying coefficient function is set

to be β0(τ) = b0 + b1τ + b2τ2. We consider the following cases for the parameters (a0, a1, a2), (b0, b1, b2)

and ρ0:

• Setting of Xit: (I-1) , (a0, a1, a2) = (0, 0, 0); (I-2) (a0, a1, a2) = (1, 0, 0) and (I-3) (a0, a1, a2) = (1, 2, 2);

• Setting of β0: (II-1) (b0, b1, b2) = (1, 0, 0); (II-2) (b0, b1, b2) = (1, 0.05, 0.05) and (II-3) (b0, b1, b2) =

(1, 2, 2);

• Setting of spatial coefficient: (III-1) ρ0 = 0.3, (III-2) ρ0 = 0.7 and (III-3) ρ0 = 0.9.

It is clear that Cases (I-1) and (I-2) consider the stationary covariate Xit, while under Case (I-3) Xit is

nonstationary. The time-varying feature of β0 is reflected by the setting of Case (II) with no time-varying,

minor time-varying and strong time-varying features from Cases (II-1) to (II-3), respectively. Cases (III-1)

– (III-3) measure different spatial autoregressive coefficient or different spatial dependences among cross-

sectional units. All the results are conducted on 1000 replications.

Tables 1 and 2 report the means and standard deviations (SDs) (in parentheses) of the estimates of ρ0

and σ20 under different settings of Xit and β0(τ) with ρ0 = 0.3. They also provide the parametric estimation

results proposed in Lee and Yu (2010). If the underlying model is no time-varying (II-1), we find that the

estimation results obtained form the parametric model in Lee and Yu (2010) perform similarly to ours for

ρ0 and slightly better for σ20 when the covariate Xit is either stationary (I-1 and I-2) or non-stationary (I-3).

Under the “minor time-varying” setting (II-2), the parametric estimation of ρ0 has large biases which does

not decrease when N and T increase if Xit is nonstationary (I-3), while our method gives small biases. In

10

Page 12: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

the case “strong time varying” results (II-3), our estimates of ρ0 and σ20 are very close to the true values

while the ones from the parametric method are quite far away. The bias becomes larger from case (I-1) to

case (I-3); the more Xit tends to non-stationary, the more bias there will be. Meanwhile, our method shows

the asymptotic convergence in both N and T dimensions. Meanwhile, much smaller standard deviations

can be obtained by our method for σ20 than that in the parametric one.

To evaluate the finite–sample performance of the nonparametric estimation of the time-varying coeffi-

cient function β0, the means and SDs (in parentheses) of the mean squared errors (MSEs) of the estimates

of the time-varying coefficient functions are reported in Table 3, where for an estimate β, its MSE is defined

by

MSE(β) =1

T

T∑t=1

β(τt)− β(τt)

2.

Both the means and SDs decrease when either N or T increases. In Appendix D, Table D.1 - Table D.6

report the results when ρ0 = 0.7 and ρ0 = 0.9. The conclusion is quite similar to the case of ρ0 = 0.3. In

summary, our model is robust regardless of the specification of Xit, β0(τ) and ρ0.

6 Case Studies

As a case study, we apply our model to study labour productivity in 185 Chinese cities over the period

between 1995 and 2009. In the last four decades or so, China has been experiencing unprecedent economic

growth and change in economic structure. Its vast internal urban labour markets are also inter-linked. It

is likely that the impacts of economic factors e.g., on productivity change over time and spill over from

one place to another. Thus it is an ideal case study for our proposed model.

Our data run from 1995 to 2009 (T=15) and cover 185 (N) cities, including four cities like Beijing,

Shanghai, Tianjin, and Chongqing directly administrated by the central government, and other 181 cities

at or above prefectural levels in China3. The geographical location of these cities can be found in Figure

1. Table 4 summarises the variables used in this paper and Table 5 reports the number of cities in each

region. We can see that East China (EC) has almost one third cities followed by Northeast (NE) and

Central China (CC) while it is very sparse in the western region, reflecting the uneven distribution of

cities. In this analysis, highway distances in kilometres grasped by Software R are used to measure the

spatial distances between cities, as it can better reflect economic relation between cities. We specify W as

the inverse of highway distances between cities, standardised by its maximum eigenvalue.

As we know, the method we propose does rely on the kernel bandwidth. The optimal bandwidth

(hopt = 0.4000) is obtained from the leave-one-unit-out cross validation method. To check the robustness

of the method, we compare the results on different bandwidths containing 2/3hopt, 4/5hopt, hopt, 5/4hopt

and 3/2hopt. Figure 2 shows that the coefficient trends under different bandwidths are consistent and Table

6 also reports the corresponding estimate of the spatial coefficient ρ0 and variance σ20. The similarities

confirm the robustness of the results. Therefore, in the rest of this paper we report the results based on

3A prefectural city in China means a city that directly controlled by provincial governments.

11

Page 13: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Table 1: Means and standard deviations of bias of ρ when ρ0 = 0.3, σ20 = 1.

(a) Time Varying

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=10-0.0142 -0.0070 -0.0066 -0.0137 -0.0066 -0.0064 -0.0037 -0.0013 -0.0019

(0.0995) (0.0798) (0.0554) (0.0975) (0.0784) (0.0544) (0.0461) (0.0392) (0.0258)

T=15-0.0128 -0.0064 -0.0057 -0.0127 -0.0062 -0.0055 -0.0045 -0.0016 -0.0005

(0.0768) (0.0655) (0.0454) (0.0754) (0.0643) (0.0446) (0.0387) (0.0320) (0.0219)

T=30-0.0049 -0.0028 -0.0025 -0.0047 -0.0027 -0.0024 -0.0013 -0.0005 0.0000

(0.0547) (0.0438) (0.0305) (0.0537) (0.0430) (0.0300) (0.0263) (0.0218) (0.0150)

(I-2)

T=10-0.0258 -0.0150 -0.0107 -0.0245 -0.0143 -0.0102 -0.0029 -0.0009 -0.0007

(0.0853) (0.0707) (0.0497) (0.0829) (0.0689) (0.0483) (0.0345) (0.0300) (0.0198)

T=15-0.0206 -0.0123 -0.0092 -0.0194 -0.0117 -0.0086 -0.0018 -0.0008 -0.0004

(0.0673) (0.0566) (0.0391) (0.0658) (0.0553) (0.0380) (0.0288) (0.0244) (0.0161)

T=30-0.0109 -0.0069 -0.0050 -0.0103 -0.0067 -0.0047 -0.0008 -0.0002 0.0001

(0.0476) (0.0379) (0.0269) (0.0464) (0.0370) (0.0262) (0.0200) (0.0160) (0.0113)

(I-3)

T=10-0.0271 -0.0144 -0.0081 -0.0238 -0.0122 -0.0072 0.0002 0.0018 0.0024

(0.0776) (0.0656) (0.0457) (0.0743) (0.0637) (0.0442) (0.0292) (0.0256) (0.0174)

T=15-0.0196 -0.0113 -0.0068 -0.0175 -0.0097 -0.0061 0.0011 0.0016 0.0016

(0.0625) (0.0523) (0.0355) (0.0608) (0.0509) (0.0344) (0.0245) (0.0211) (0.0142)

T=30-0.0099 -0.0057 -0.0039 -0.0085 -0.0050 -0.0036 0.0011 0.0012 0.0016

(0.0446) (0.0343) (0.0244) (0.0426) (0.0330) (0.0236) (0.0173) (0.0140) (0.0104)

(b) Parametric

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=10-0.0137 -0.0074 -0.0069 -0.0127 -0.0067 -0.0062 0.0768 0.0897 0.0982

(0.0989) (0.0808) (0.0546) (0.0968) (0.0792) (0.0535) (0.0887) (0.0758) (0.0543)

T=15-0.0136 -0.0064 -0.0056 -0.0131 -0.0060 -0.0049 0.0816 0.0928 0.1028

(0.0761) (0.0648) (0.0452) (0.0750) (0.0638) (0.0443) (0.0757) (0.0616) (0.0423)

T=30-0.0054 -0.0026 -0.0028 -0.0050 -0.0020 -0.0025 0.0878 0.0977 0.1027

(0.0536) (0.0437) (0.0299) (0.0528) (0.0429) (0.0294) (0.0528) (0.0437) (0.0328)

(I-2)

T=10-0.0137 -0.0074 -0.0069 -0.0117 -0.0055 -0.0053 0.2608 0.2725 0.2807

(0.0989) (0.0808) (0.0546) (0.0971) (0.0795) (0.0534) (0.0919) (0.0773) (0.0515)

T=15-0.0136 -0.0065 -0.0056 -0.0122 -0.0054 -0.0039 0.2595 0.2716 0.2779

(0.0761) (0.0648) (0.0452) (0.0752) (0.0644) (0.0444) (0.0762) (0.0605) (0.0419)

T=30-0.0054 -0.0026 -0.0028 -0.0041 -0.0015 -0.0016 0.2634 0.2713 0.2755

(0.0536) (0.0437) (0.0300) (0.0528) (0.0430) (0.0294) (0.0541) (0.0440) (0.0293)

(I-3)

T=10-0.0084 -0.0045 -0.0049 0.0219 0.0252 0.0240 0.4006 0.4055 0.4077

(0.0771) (0.0667) (0.0464) (0.0713) (0.0617) (0.0433) (0.0327) (0.0266) (0.0192)

T=15-0.0080 -0.0062 -0.0038 0.0210 0.0224 0.0241 0.3999 0.4037 0.4075

(0.0642) (0.0555) (0.0369) (0.0597) (0.0518) (0.0340) (0.0261) (0.0229) (0.0160)

T=30-0.0035 -0.0035 -0.0028 0.0237 0.0235 0.0237 0.3990 0.4027 0.4056

(0.0464) (0.0355) (0.0253) (0.0432) (0.0331) (0.0234) (0.0190) (0.0159) (0.0115)

12

Page 14: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Table 2: Means and standard deviations of bias of σ2 when ρ0 = 0.3, σ20 = 1.

(a) Time Varying

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=10-0.1413 -0.1226 -0.1150 -0.1411 -0.1223 -0.1149 -0.1368 -0.1173 -0.1084

(0.1350) (0.1064) (0.0769) (0.1350) (0.1062) (0.0768) (0.1351) (0.1060) (0.0758)

T=15-0.0967 -0.0844 -0.0762 -0.0971 -0.0843 -0.0761 -0.0922 -0.0781 -0.0693

(0.1096) (0.0939) (0.0650) (0.1101) (0.0939) (0.0649) (0.1088) (0.0944) (0.0647)

T=30-0.0492 -0.0426 -0.0389 -0.0491 -0.0425 -0.0387 -0.0422 -0.0353 -0.0323

(0.0802) (0.0661) (0.0462) (0.0802) (0.0662) (0.0461) (0.0801) (0.0655) (0.0457)

(I-2)

T=10-0.1442 -0.1242 -0.1167 -0.1443 -0.1244 -0.1168 -0.1310 -0.1136 -0.1064

(0.1347) (0.1079) (0.0765) (0.1347) (0.1078) (0.0765) (0.1324) (0.1064) (0.0758)

T=15-0.1026 -0.0870 -0.0782 -0.1024 -0.0871 -0.0780 -0.0896 -0.0754 -0.0663

(0.1097) (0.0959) (0.0657) (0.1095) (0.0960) (0.0655) (0.1074) (0.0948) (0.0650)

T=30-0.0527 -0.0445 -0.0403 -0.0526 -0.0445 -0.0401 -0.0399 -0.0326 -0.0294

(0.0822) (0.0669) (0.0463) (0.0823) (0.0672) (0.0463) (0.0806) (0.0666) (0.0465)

(I-3)

T=10-0.1422 -0.1205 -0.1140 -0.1403 -0.1190 -0.1136 -0.1088 -0.0906 -0.0864

(0.1334) (0.1071) (0.0763) (0.1328) (0.1063) (0.0763) (0.1398) (0.1138) (0.0851)

T=15-0.1006 -0.0847 -0.0743 -0.0998 -0.0836 -0.0738 -0.0699 -0.0591 -0.0495

(0.1084) (0.0950) (0.0650) (0.1072) (0.0945) (0.0647) (0.1130) (0.1000) (0.0759)

T=30-0.0506 -0.0414 -0.0379 -0.0492 -0.0408 -0.0376 -0.0263 -0.0196 -0.0168

(0.0812) (0.0666) (0.0459) (0.0807) (0.0665) (0.0459) (0.0887) (0.0766) (0.0571)

(b) Parametric

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=10-0.0216 -0.0083 -0.0086 -0.0205 -0.0072 -0.0080 1.2871 1.3137 1.3162

(0.1474) (0.1175) (0.0846) (0.1475) (0.1175) (0.0849) (0.4128) (0.3458) (0.2466)

T=15-0.0170 -0.0072 -0.0039 -0.0161 -0.0067 -0.0028 1.3012 1.2993 1.3194

(0.1153) (0.1003) (0.0692) (0.1154) (0.1004) (0.0691) (0.3395) (0.2689) (0.1838)

T=30-0.0074 -0.0030 -0.0023 -0.0066 -0.0021 -0.0014 1.2988 1.2992 1.3020

(0.0824) (0.0681) (0.0476) (0.0823) (0.0681) (0.0477) (0.2321) (0.1943) (0.1335)

(I-2)

T=10-0.0216 -0.0083 -0.0086 -0.0196 -0.0063 -0.0073 1.9066 1.9372 1.9383

(0.1474) (0.1175) (0.0846) (0.1476) (0.1178) (0.0847) (0.5532) (0.4755) (0.3210)

T=15-0.0170 -0.0072 -0.0039 -0.0154 -0.0061 -0.0020 1.9021 1.8994 1.9122

(0.1154) (0.1003) (0.0692) (0.1153) (0.1005) (0.0692) (0.4513) (0.3698) (0.2497)

T=30-0.0074 -0.0030 -0.0023 -0.0059 -0.0015 -0.0007 1.8800 1.8690 1.8717

(0.0824) (0.0681) (0.0476) (0.0825) (0.0683) (0.0477) (0.3006) (0.2546) (0.1778)

(I-3)

T=10-0.0211 -0.0082 -0.0085 -0.0207 -0.0085 -0.0093 1.9817 1.9661 1.9890

(0.1462) (0.1168) (0.0850) (0.1473) (0.1176) (0.0851) (0.5612) (0.4649) (0.3297)

T=15-0.0180 -0.0073 -0.0039 -0.0179 -0.0079 -0.0040 1.9746 1.9389 1.9624

(0.1145) (0.1007) (0.0691) (0.1146) (0.1007) (0.0691) (0.4707) (0.3595) (0.2453)

T=30-0.0077 -0.0028 -0.0022 -0.0083 -0.0034 -0.0028 1.9366 1.9177 1.9143

(0.0820) (0.0680) (0.0476) (0.0823) (0.0681) (0.0478) (0.3162) (0.2633) (0.1796)

13

Page 15: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Table 3: Means and standard deviations of MSE of β0(τ) when ρ0 = 0.3, σ20 = 1.

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=100.0434 0.0295 0.0143 0.0433 0.0294 0.0143 0.0520 0.0362 0.0177

(0.0503) (0.0299) (0.0152) (0.0501) (0.0297) (0.0153) (0.0527) (0.0318) (0.0154)

T=150.0256 0.0171 0.0085 0.0258 0.0171 0.0083 0.0308 0.0223 0.0124

(0.0374) (0.0220) (0.0110) (0.0378) (0.0220) (0.0103) (0.0335) (0.0231) (0.0104)

T=300.0159 0.0106 0.0056 0.0158 0.0104 0.0055 0.0162 0.0108 0.0065

(0.0219) (0.0145) (0.0073) (0.0212) (0.0142) (0.0073) (0.0200) (0.0122) (0.0058)

(I-2)

T=100.0318 0.0216 0.0104 0.0320 0.0219 0.0105 0.0367 0.0265 0.0132

(0.0352) (0.0233) (0.0108) (0.0352) (0.0236) (0.0109) (0.0330) (0.0234) (0.0109)

T=150.0197 0.0138 0.0067 0.0196 0.0139 0.0066 0.0219 0.0162 0.0096

(0.0246) (0.0169) (0.0077) (0.0246) (0.0169) (0.0075) (0.0216) (0.0149) (0.0079)

T=300.0127 0.0082 0.0042 0.0126 0.0081 0.0041 0.0109 0.0075 0.0046

(0.0179) (0.0121) (0.0057) (0.0178) (0.0122) (0.0055) (0.0143) (0.0080) (0.0040)

(I-3)

T=100.0187 0.0120 0.0056 0.0183 0.0120 0.0056 0.0288 0.0212 0.0118

(0.0228) (0.0135) (0.0064) (0.0220) (0.0135) (0.0064) (0.0262) (0.0200) (0.0102)

T=150.0139 0.0094 0.0038 0.0137 0.0091 0.0037 0.0194 0.0147 0.0092

(0.0182) (0.0116) (0.0047) (0.0181) (0.0111) (0.0045) (0.0165) (0.0127) (0.0083)

T=300.0096 0.0050 0.0026 0.0089 0.0048 0.0024 0.0163 0.0121 0.0089

(0.0141) (0.0079) (0.0037) (0.0131) (0.0076) (0.0035) (0.0184) (0.0134) (0.0111)

20

30

40

50

80 100 120Longitude

Latit

ude

Region

Central ChinaEast ChinaNorth ChinaNortheastNorthwestSouth ChinaSouthwest

Figure 1: Map of 185 Chinese cities where each colour represents one of the following regions from CentralChina (CC), East China (EC), North China (NC), North East (NE), North West (NW), South China (SC)and South West(SW).

14

Page 16: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Table 4: Variable definition

Dependent Variable (Y) Definition

log(wage) Log value of average wage per worker (1994 price)

Independent Variable (X) Definition

log(FDI) Log value of FDI (10 thousand yuans, 1994 price)

log(Asset) Log value of total asset (million yuans, 1994 price)

log(pop) Log value of population (10 thousand persons)

GDPm Proportion of GDP by the manufactural sector

GDPs Proportion of GDP by the services sector

Empms Proportion of employed persons in the manufactural

sector out of the non-agricultural sector

Table 5: Number of cities by region

Regions Number of cities

Whole country 185

Northwest 13

Northeast 32

South China 23

East China 57

Southwest 11

North China 21

Central China 28

15

Page 17: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

−0.02

0.00

0.02

0.04

0.06

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

hopt

2 3hopt

4 5hopt

5 4hopt

3 2hopt

have

log(FDI)

0.05

0.10

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

hopt

2 3hopt

4 5hopt

5 4hopt

3 2hopt

have

log(asset)

0.72

0.73

0.74

0.75

0.76

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

hopt

2 3hopt

4 5hopt

5 4hopt

3 2hopt

have

log(pop)

0.5

0.6

0.7

0.8

0.9

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

hopt

2 3hopt

4 5hopt

5 4hopt

3 2hopt

have

GDPm

0.25

0.50

0.75

1.00

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

hopt

2 3hopt

4 5hopt

5 4hopt

3 2hopt

have

GDPs

−0.5

−0.4

−0.3

−0.2

−0.1

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

hopt

2 3hopt

4 5hopt

5 4hopt

3 2hopt

have

Empms

Figure 2: Estimated curves of the time varying coefficient β(τ) based on under different bandwidths forthe whole country (185 cities).

16

Page 18: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

−0.02

0.00

0.02

0.04

0.06

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

log(FDI)

0.05

0.10

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

log(asset)

0.71

0.72

0.73

0.74

0.75

0.76

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

log(pop)

0.5

0.6

0.7

0.8

0.9

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

GDPm

0.25

0.50

0.75

1.00

1.25

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

GDPs

−0.5

−0.4

−0.3

−0.2

−0.1

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

Empms

Figure 3: Estimated curves and confidence bands of time varying coefficient β(τ) based on 185 cities withhave = 0.4173.

17

Page 19: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

the average bandwidth have = 0.4173 of those five bandwidths.

Table 6: Estimation of parameters under different bandwidths.

hopt 2/3hopt 4/5hopt 5/4hopt 3/2hopt

ρ0 0.3461 (0.0179) 0.3378 (0.0179) 0.3415 (0.0179) 0.3530 (0.0180) 0.3580 (0.0180)

σ20 0.008 (0.0002) 0.008 (0.0002) 0.008 (0.0002) 0.008 (0.0002) 0.008 (0.0002)

Table 7: Estimation results of time-varying spatial models, where the coefficient estimator for the time-varying model is calculated by the average over time.

Whole Country East China

log(FDI) 0.0210 0.0072

log(Asset) 0.0748 0.0647

log(pop) 0.7266 0.8315

GDPm 0.7593 1.2762

GDPs 0.8250 0.9053

Empms -0.3758 -0.4193

ρ0 0.3477*** (0.0179) 0.2870*** (0.020)

σ20 0.0082*** (0.0002) 0.0082*** (0.0002)

Figure 3 displays the time-varying coefficient curve for each variable as well as the 95% confidence bands.

It is obvious that the zero line does not lie within the confidence bands all the time, which confirms it does

vary over time significantly. As we can see, the coefficients of “log(asset)”, ‘log(asset)” and “GDPs” are

positive and increasing , which showing more positive effect on productivity and larger city agglomeration

effects as the time varies. The effect of “GDPm” is always positive while the positive effect is increasing

before 1999 and after that it becomes decreasing. During that time, China began to take actions to be a

member of WTO. It shows that it has a lot of effects and impacts on the Chinese economic sector. The

larger it is, the more negative effect from “Empms” will be on productivity. It is noted that the negative

effects became stable after 1999. The trend of “log(FDI)” is always decreasing from positive to slightly

negative. It is reasonable that the effect of FDI becomes less important than the development of economy

and technology by China its own.

Table 7 reports the estimates of the spatial coefficient ρ0, σ20 and average β0(τt). The significance of the

spatial coefficient estimate reflects the strong spatial dependence and confirms the existences of spillover

effects between cities. From the comparison in Table 7, the spatial impact in East China (Figure 4) is less

than that in the whole country while the magnitude effect of the explanatory variables seems strong except

for FDI and asset. It seems reasonable as cities within a small region share similar economic structure

and government policies. Moreover, the ignorance of the spatial effects outside the regions may lead to

18

Page 20: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

the underestimation of the spatial dependences inside the region. The time varying-trend in East China is

provided in Figure 5.

24

28

32

36

115 120Longitude

Latit

ude

Provinces in EC

AnhuiFujianJiangsuJiangxiShandongShanghaiZhejiang

Figure 4: Map of 57 cities in Region EC where each color represents different province.

7 Conclusion and Discussion

We have considered a time-varying coefficient spatial panel data model with fixed effects. A semi-parametric

profiled quasi-likelihood method has been proposed to estimate the parametric and nonparametric compo-

nents at two steps. Asymptotic properties for these two terms have been derived with√NT and

√NTh

rate of convergence respectively when both the cross-sectional size N and the time length T go to infinity.

We also have set up some simulated examples to evaluate the finite sample performances of our proposed

estimation method. The simulation results have shown that the proposed method uniformly outperforms

the parametric method in most cases. In addition, we have applied the proposed method to study Chinese

productivity in different cities.

Here we assume that there is no time-varying structure in the spatial part ρ0W . However, sometimes it

is violated in reality especially when T is large. It is because that during the period, the spatial structure

W may change and also on the other hand, spatial coefficient, ρ0, may vary over time. The time-varying

spatial structure is worth studying in future. Moreover, how to choose a suitable weigh matrix W in

practice needs more investigation.

19

Page 21: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

−0.02

−0.01

0.00

0.01

0.02

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

log(FDI)

0.04

0.05

0.06

0.07

0.08

0.09

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

log(asset)

0.80

0.85

0.90

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

log(pop)

0.5

1.0

1.5

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

GDPm

0.5

1.0

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

GDPs

−0.75

−0.50

−0.25

0.00

0.25

1995

1997

1999

2001

2003

2005

2007

2009

Year

Coe

ffici

ent

Empms

Figure 5: Time varying coefficients estimation curve based on Region of East China with for have = 0.3478.

20

Page 22: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

References

Anselin, L., Florax, R., and Rey, S. J. (2013). Advances in Spatial Econometrics: Methodology, Tools and Applications.

Springer Science & Business Media.

Arellano, M. (2003). Panel Data Econometrics. Oxford University Press.

Baltagi, B. (2008). Econometric Analysis of Panel Data. John Wiley & Sons.

Chen, J., Gao, J., and Li, D. (2012). Semiparametric trending panel data models with cross-sectional dependence. Journal of

Econometrics, 171(1):71–85.

Chen, J., Li, D., and Linton, O. (2019). A new semiparametric estimation approach for large dynamic covariance matrices

with multiple conditioning variables. Journal of Econometrics, 212(1):155–176s.

Cliff, A. D. and Ord, J. K. (1973). Spatial Autocorrelation, Monographs in Spatial Environmental Systems Analysis. London:

Pion Limited.

Dong, C., Gao, J., and Peng, B. (2015). Semiparametric single-index panel data models with cross-sectional dependence.

Journal of Econometrics, 188(1):301–312.

Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and its Applications. Chapman & Hall/CRC.

Fan, J. and Yao, Q. (2008). Nonlinear Time Series: Nonparametric and Parametric Methods. Springer Science & Business

Media.

Fingleton, B. (2008). A generalized method of moments estimator for a spatial panel model with an endogenous spatial lag

and spatial moving average errors. Spatial Economic Analysis, 3(1):27–44.

Gao, J. (2007). Nonlinear Time Series: Semiparametric and Nonparametric Methods. Chapman & Hall/CRC, London.

Gao, Y. and Li, K. (2013). Nonparametric estimation of fixed effects panel data models. Journal of Nonparametric Statistics,

25(3):679–693.

Hsiao, C. (2014). Analysis of Panel Data. Cambridge University Press.

Kapoor, M., Kelejian, H. H., and Prucha, I. R. (2007). Panel data models with spatially correlated error components. Journal

of Econometrics, 140(1):97–130.

Kelejian, H. H. and Prucha, I. R. (1998). A generalized spatial two-stage least squares procedure for estimating a spatial

autoregressive model with autoregressive disturbances. Journal of Real Estate Finance and Economics, 17(1):99–121.

Kelejian, H. H. and Prucha, I. R. (1999). A generalized moments estimator for the autoregressive parameter in a spatial model.

International Economic Review, 40(2):509–533.

Kelejian, H. H. and Prucha, I. R. (2001). On the asymptotic distribution of the moran i test statistic with applications.

Journal of Econometrics, 104(2):219–257.

Kelejian, H. H. and Prucha, I. R. (2010). Specification and estimation of spatial autoregressive models with autoregressive

and heteroskedastic disturbances. Journal of Econometrics, 157(1):53–67.

Lee, L.-F. (2004). Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econo-

metrica, 72(6):1899–1925.

21

Page 23: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Lee, L.-F. and Yu, J. (2010). Estimation of spatial autoregressive panel data models with fixed effects. Journal of Econometrics,

154(2):165–185.

Lee, L.-F. and Yu, J. (2014). Efficient GMM estimation of spatial dynamic panel data models with fixed effects. Journal of

Econometrics, 180(2):174–197.

LeSage, J. and Pace, R. K. (2009). Introduction to Spatial Econometrics. Chapman and Hall/CRC.

Li, D., Chen, J., and Gao, J. (2011). Non-parametric time-varying coefficient panel data models with fixed effects. Econometrics

Journal, 14(3):387–408.

Li, K. (2017). Fixed-effects dynamic spatial panel data models and impulse response analysis. Journal of Econometrics,

198(1):102–121.

Li, Q. and Racine, J. S. (2007). Nonparametric Econometrics: Theory and Practice. Princeton University Press.

Lin, X. and Lee, L.-f. (2010). Gmm estimation of spatial autoregressive models with unknown heteroskedasticity. Journal of

Econometrics, 157(1):34–52.

Malikov, E. and Sun, Y. (2017). Semiparametric estimation and testing of smooth coefficient spatial autoregressive models.

Journal of Econometrics, 199(1):12–34.

Seber, G. A. F. (2007). A Matrix Handbook for Statisticians. Wiley-Interscience.

Silvapulle, P., Smyth, R., Zhang, X., and Fenech, J.-P. (2017). Nonparametric panel data model for crude oil and stock market

prices in net oil importing countries. Energy Economics, 67:255–267.

Su, L. (2012). Semiparametric gmm estimation of spatial autoregressive models. Journal of Econometrics, 167(2):543–560.

Su, L. and Jin, S. (2010). Profile quasi-maximum likelihood estimation of partially linear spatial autoregressive models. Journal

of Econometrics, 157(1):18–33.

Su, L. and Ullah, A. (2006). Profile likelihood estimation of partially linear panel data models with fixed effects. Economics

Letters, 92(1):75–81.

Sun, Y. (2016). Functional-coefficient spatial autoregressive models with nonparametric spatial weights. Journal of Econo-

metrics, 195(1):134–153.

White, H. (1996). Estimation, Inference and Specification Analysis. Cambridge University Press.

Yu, J., De Jong, R., and Lee, L.-F. (2008). Quasi-maximum likelihood estimators for spatial dynamic panel data with fixed

effects when both n and t are large. Journal of Econometrics, 146(1):118–134.

Zhang, Y. and Shen, D. (2015). Estimation of semi-parametric varying-coefficient spatial panel data models with random-

effects. Journal of Statistical Planning and Inference, 159:64–80.

Zhang, Y. and Sun, Y. (2015). Estimation of partially specified dynamic spatial panel data models with fixed-effects. Regional

Science and Urban Economics, 51:37–46.

Appendix A

In the appendix, we denote C as a finite constant, which adjusts its values under different scenarios.

22

Page 24: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

A.1 Discussion and Justification to the Assumptions.

Assumption 1:

It is a list of assumptions for the structure of d-dimensional variable Xit with an unknown time trend function

g(τ) and a remainder part vit. Here vit can be regarded as the sum of fixed effect and the random component,

i.e. xi + εit, where E(xi) = E(εit) = 0. As we are not interested in the fixed effects among Xit, we stick with the

structure Xit = g(τ) + vit in this paper.

In Assumption (i) we assume the time trend g(τ) is continuous, which is quite general for the trend. As here we

use the kernel function to estimate the trend, it is not suitable for non-continuous trends.

Notice that if vit is i.i.d. over i and t, Assumption (ii) and (iii) hold automatically. To allow for both serial

and cross-sectional dependence, we impose such conditions .

In Assumption (ii), we assume the stationarity and α mixingness of vt. This case can be verifiable and can cover

many linear and nonlinear time series models. See more examples in Fan and Yao (2008); Gao (2007). Since vt is a

high dimensional vector, we can assume an upper bound function α(t). One example is that the α-mixing coefficient

satisfies an exponential decay, like α(t) ≈ ψt, where ψ is a constant. Such assumption is similar to Assumption 1 (i)

in Chen et al. (2012, 2019) for high dimensional vectors.

As the α mixing condition only reflects serial correlation, we impose some moment conditions on vit in Assumption

(iii) such that it allows for both cross-sectional and serial dependence in vit. As we know, there is no natural ordering

for cross-sectional indices so that when vit1 and vjt2 are dependent, any kind of mixing or martingale difference

conditions on vit are not appropriate. Assumption (iii) allows the cross-sectional dependence decays in some order,

such that it is not “long memory” in sections, similarly to the mixing condition in the time domain. Such conditions

cover the following special case where vit = λiηt + uit, where uit is independent and identically distributed (i.i.d.)

over i and t with mean zero and variance Σu, uit and ηt are also identically distributed (i.i.d.) over i and t,

respectively. Moreover, λi, ηt and uit are independent with each other. By simple algebra calculation, we obtain

E(vitv>it) = E(λ2i )E(ηtη

>t ) + Σu. Then, we show the last equation in Assumption (iii), and some other conditions

can be verified similarly. According to the structure of vit, we have

T∑t1,t2,t3,t4=1

N∑i1,i2,i3,i4=1

Cov(v>i1t1vi2t2 ,v

>i3t3vi4t4

)

=

N∑i1,i2,i3,i4=1

E(λi1λi2λi3λi4)

T∑t1,t2,t3,t4=1

E(η>t1ηt2η>t3ηt4)−

N∑

i1,i2=1

E(λi1λi2)

T∑t1,t2=1

E(η>t1ηt2)

2

Using the distance function in Chen et al. (2012), one can show∑Ni1,i2,i3,i4=1 E(λi1λi2λi3λi4) = O(N2) and∑N

i1,i2=1 E(λi1λi2) = O(N). If ηt1 is a strictly stationary α-mixing process, it can be shown that

T∑t1,t2,t3,t4=1

E(η>t1ηt2η>t3ηt4) = O(T 3) and

T∑t1,t2=1

E(η>t1ηt2) = O(T ).

Therefore, we can show the last equation in Assumption (iii) holds. See more examples about cross-sectional

dependence in Dong et al. (2015).

Furthermore, the trend of Xit can be generalised for individual time trend so that gi(τt) is different for different

i. As it increases the complicity of the proof, we just assume the homogenous trend. For the heterogenous trend, we

can estimate the common trend by g(τ) = 1N

∑Ni=1 gi(τ) where gi(τ) =

∑Tt=1K(

τt−τh )Xit∑T

t=1K(τt−τh )

.

23

Page 25: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Assumption 2:

It summarizes some necessary conditions on the error term eit. We assume that et is a stationary sequence

satisfying martingale difference structure, which is weaker than the independence assumption. The moment condition

here is imposed to control the cross-sectional dependence among eit, ejt and vjt on certain comments. Also it rules

out heteroscedasticity. If there is some unknown heteroskedasticity, the MLE (QMLE) will not be consistent. For this

reason, Kelejian and Prucha (2010) and Lin and Lee (2010) explore the GMM estimation of the linear SAR models

with heteroscedasticity. One may apply our approach to time-varying SAR models with the heteroscedasticity case.

Assumption 2 (iii) can also be verified by using the distance function in Chen et al. (2012).

Assumption 3:

It imposes some regularity conditions on the kernel and bandwidth parameter. It is standard in the nonparametric

literature for local linear estimation. Many commonly-used kernels, such as Epanechnikov kernel and biweight kernel,

satisfy these conditions.

Assumptions 4 and 5:

These assumptions are originated from Kelejian and Prucha (1998, 2001) and also used in Lee (2004). The UB

condition ensures that the spatial correlation ρ0 in a manageable degree.

Assumptions 6 and 7:

Assumption 6 is a mild condition on the smoothness of the functions which is required by the local linear fitting

procedure. Such an assumption is common in existing research on nonparametric estimation, e.g. Condition 2.1

of Li and Racine (2007), Assumption 2.7 of Gao (2007) and Assumption A3 of Chen et al. (2012). Assumption 7

guarantees the uniform boundless of the sum of absolute fixed effects.

Assumption 8:

It is a condition for the identification of ρ0, which is similar to Assumption 8 in Lee (2004), and Assumption 4

in Lee and Yu (2010).

A.2 Proofs of Theorems

Proof of Theorem 1.

Even though we have the nonparametric terms in our model, the idea for proving the consistency of parametric

terms can be adopted from Lee (2004). Define QN,T (ρ) = maxσ2 ElogLN,T (θ), where θ = (ρ, σ2). By White (1996)

and Lee (2004), to show the consistency of θ, it only suffices to show

1

NTlogLN,T (ρ)−QN,T (ρ) P→ 0 uniformly on ∆, and (A.1)

lim supN,T→∞

maxρ∈Ncε (ρ0)

1

NTQN,T (ρ)−QN,T (ρ0) < 0 for any ε > 0, (A.2)

where N cε (ρ0) is the complement of an open neighbourhood of ρ0 on 4 of diameter ε.

(1) Proof of (A.1).

By some simple calculations, we obtain

QN,T (ρ) = −NT2log(2π) + 1 − NT

2logσ2∗(ρ)+ T log|SN (ρ)|.

where σ2∗(ρ) = (NT )−1EY >(ρ)QN,T Y (ρ) = (NT )−1EY ∗>(ρ)PN,TY

∗(ρ)

.

24

Page 26: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Due to PN,TD = 0NT,N−1, PN,T = P>N,T and SN,T (ρ) = SN,T + (ρ0−ρ)IT ⊗W , we can break σ2∗(ρ) into several

terms as

σ2∗(ρ) =1

NT(ρ0 − ρ)2E(R>N,TPN,TRN,T ) +

2

NT(ρ0 − ρ)E((Xβ0)>PN,TRN,T )

+1

NTE((Xβ0)>PN,T (Xβ0)) +

σ20

NTtrS−1>N,T S

>N,T (ρ)E(PN,T )SN,T (ρ)S−1N,T . (A.3)

Then, (NT )−1logLN,T (ρ)−QN,T (ρ) = −(1/2)[logσ2(ρ) − logσ2∗(ρ)

], where

σ2(ρ) =1

NTY ∗>(ρ)PN,TY

∗(ρ)

=1

NT(ρ0 − ρ)2R>N,TPN,TRN,T +

2

NT(ρ0 − ρ)(Xβ0)>PN,TRN,T +

1

NT(Xβ0)>PN,T (Xβ0)

+2

NT(Xβ0)>PN,TSN,T (ρ)S−1N,Te +

2(ρ0 − ρ)

NTR>N,TPN,TSN,T (ρ)S−1N,Te

+1

NTe>S−1>N,T SN,T (ρ)>PN,TSN,T (ρ)S−1N,Te. (A.4)

To show (A.1), it is sufficient to show that

σ2(ρ)− σ2∗(ρ) = oP(1), uniformly on ∆. (A.5)

According to Lemma B.1, we know

1

NT(Xβ0)>PN,T (Xβ0) = oP(1),

1

NTE((Xβ0)>PN,T (Xβ0)) = o(1),

1

NT(Xβ0)>PN,TRN,T = oP(1),

1

NTE((Xβ0)>PN,TRN,T ) = o(1),

1

NTR>N,TPN,TRN,T =

1

NTE(R>N,TPN,TRN,T ) + oP(1)→ ΨR,R.

By (A.3) and (A.4),

σ2(ρ)− σ2∗(ρ) =2

NT(Xβ0)>PN,TSN,T (ρ)S−1N,Te +

2(ρ0 − ρ)

NTR>N,TPN,TSN,T (ρ)S−1N,Te

+1

NTe>S−1>N,T SN,T (ρ)>PN,TSN,T (ρ)S−1N,Te− σ2(ρ) + oP(1)

:= 2H1(ρ) + 2(ρ0 − ρ)H2(ρ) +H3(ρ)− σ2(ρ) + oP(1),

where σ2(ρ) = (NT )−1σ20trS−1>N,T S

>N,T (ρ)E(PN,T )SN,T (ρ)S−1N,T .

According to Lemma B.2, we get (A.5) so that (A.1) holds.

(2) Proof of (A.2).

Consider an auxiliary SAR panel process: Yt = ρ0WYt + et, t = 1, · · · , T where et ∼ N(0, σ20In). Denote the

log likelihood of this model as logLaN,T (ρ, σ2). Define σ2(ρ) = (NT )−1σ20trS−1>N,T S

>N,T (ρ)SN,T (ρ)S−1N,T . It can be

verified that

maxσ2

EalogLaN,T (ρ, σ2) = −NT2log(2π) + 1 − NT

2logσ2(ρ)+ T log|SN (ρ)| := QN,T (ρ),

where Ea denotes the expectation under this auxiliary model.

25

Page 27: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Thus, for any ρ ∈ 4, QN,T (ρ) ≤ maxρ,σ2 EalogLaN,T (ρ, σ2) = EalogLaN,T (ρ0, σ20) = QN,T (ρ0) so that

1

NTQN,T (ρ)− QN,T (ρ0) ≤ 0 uniformly on4 .

The term in (A.2) can be rewritten as

1

NTQN,T (ρ)−QN,T (ρ0) =

1

NTQN,T (ρ)− QN,T (ρ0)+

1

2H4(ρ) +

1

2H5,

where H4(ρ) = logσ2(ρ) − logσ2∗(ρ) and H5 = logσ2∗(ρ0) − logσ2(ρ0).According to B.3, we have (NT )−1tr(E(PN,T )) = (NT )−1tr(INT ) = 1 so that

σ2∗(ρ0)− σ2(ρ0) =1

NTE((Xβ0)>PN,T (Xβ0)) + σ2

0

tr(E(PN,T ))

NT− 1

= o(1).

Accordingly, we obtain H5 = o(1).

To show H4(ρ) ≤ 0 uniformly on 4, it suffices to show

σ2(ρ)− σ2∗(ρ) ≤ 0, uniformly on4 . (A.6)

As

σ2(ρ)− σ2∗(ρ) =σ20

NTH1

4 (ρ)− 1

NTH2

4 (ρ),

where

1

NTH1

4 (ρ) = 1− tr(E(PN,T ))

NT+ 2(ρ0 − ρ)

tr(GNT )− tr(E(PN,T )GNT )

NT

+(ρ0 − ρ)2tr(G>NTGNT )− tr(G>NTE(PN,T )GNT )

NT

and H24 (ρ) = E

(Xβ0 + (ρ0 − ρ)RN,T >PN,T Xβ0 + (ρ0 − ρ)RN,T

).

By Lemma B.3, we get (NT )−1H14 (ρ) = op(1) uniformly on 4. Since PN,T = (IN,T − XΦ)>Q>Q(IN,T − XΦ)

and Xβ0 + (ρ0 − ρ)RN,T is not a zero vector, H24 (ρ) can be written as E(x>Ax) with a positive semidefinite matrix

A. Thus, (NT )−1H24 (ρ) ≤ 0 uniformly on 4 so that H4(ρ) ≤ 0 holds uniformly on 4.

Hence, we have limN,T→∞ sup maxρ∈Ncε (ρ0)1NT QN,T (ρ)−QN,T (ρ0) ≤ 0 for any ε > 0.

If the identification uniqueness condition was not satisfied, without loss of generality, there would exist a sequence

ρn converging to ρ∗ 6= ρ0 such that limN,T→∞ sup maxρ∈Ncε (ρ0)1NT QN,T (ρn)−QN,T (ρ∗) = 0. This will follow if (a)

limN,T→∞σ2(ρ∗) − σ2∗(ρ∗) = 0 and (b) limN,T→∞(NT )−1[QN,T (ρ∗) − QN,T (ρ0)] = 0 . Since limN,T→∞σ2(ρ) −σ2∗(ρ) = ΨR,R, (a) contradicts with Assumption 8.

Hence, we have accomplished the proof.

Proof of Theorem 2.

In this section, we provide the proof of the asymptotic distribution of θ. According to Taylor expansion of the

first order condition of (2.9), we obtain

√NT (θ − θ0) = −

1

NT

∂2logLN,T (θ)

∂θ∂θ>

−11√NT

∂logLN,T (θ0)

∂θ, (A.7)

26

Page 28: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

where θ lies between θ and θ0 and hence it converges to θ0 in probability by Theorem 1. If the following statements

hold,

1√NT

∂logLN,T (θ0)

∂θ

d→ N(0,Σθ0 + Ωθ0), (A.8)

− 1

NT

∂2logLN,T (θ0)

∂θ∂θ>− Σθ0

= oP(1), (A.9)

1

NT

∂2logLN,T (θ)

∂θ∂θ>− 1

NT

∂2logLN,T (θ0)

∂θ∂θ>= oP(1) uniformly in θ, (A.10)

the asymptotic normality holds:√NT (θ − θ0)

d→ N(0,Σ−1θ0+ Σ−1θ0

Ωθ0Σ−1θ0).

(1) The proof of (A.8).

First, we can get the first order derivative of logLN,T (θ) as follows

1√NT

∂logLN,T (θ)

∂θ=

1√NT

∂logLN,T (θ)∂ρ

1√NT

∂logLN,T (θ)∂σ2

=

(IT⊗W )Y >PN,TY ∗(ρ)−Tσ2trGN (ρ)σ2√NT

Y ∗(ρ)>PN,TY∗(ρ)−NTσ2

2σ4√NT

=1√NT

R>N,TPN,T e

σ20

+e>G>N,TPN,T e−Tσ

20tr(GN )

σ20

+R>N,TPN,T Xβ0

σ20

+(Xβ0)

>PN,TGN,T e

σ20

e>PN,T e−NTσ20

2σ40

+(Xβ0)

>PN,T (Xβ0)+2(Xβ0)>PN,T e

2σ40

=

1√NT

1σ20

e>G>N,TPN,Te− σ2

0tr(GN,T )

+ 1σ20R>N,TPN,Te

12σ4

0(e>PN,Te−NTσ2

0)

+ oP(1) (A.11)

due to1√NT

R>N,TPN,T Xβ0 = oP(1),1√NT

(Xβ0)>PN,T (Xβ0) = oP(1),

1√NT

(Xβ0)>PN,Te = oP(1),1√NT

(Xβ0)>PN,TGN,Te = oP(1),

in Lemma B.1.

By Lemma B.3, we have

1√NTE(tr(PN,T ))−NT = o(1),

1√NT

E(tr(PN,TGN,T −GN,T )) = o(1)

so that

1√NT

∂logLN,T (θ0)

∂θ=

1√NT

1σ20

e>G>N,TPN,Te− σ2

0tr(G>N,TE(PN,T ))

+ 1σ20R>N,TPN,Te

12σ4

0e>PN,Te− σ2

0tr(E(PN,T ))

+ oP(1)

=1√NTQ − E(Q)+ oP(1),

where Q =

1σ20e>G>N,TPN,Te + 1

σ20R>N,TPN,Te

12σ4

0e>PN,Te

.

Under the regularity assumptions, we can apply the central limit theorem of linear quadratic forms in Kelejian

27

Page 29: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

and Prucha (2001) so that1√NTQ − E(Q) d→ N(0,ΣQ),

where ΣQ = limN,T→∞(NT )−1Cov(Q) = limN,T→∞(ΣNT,θ0+ ΩNT,θ0

),

ΣNT,θ0=

E(R>N,TP2N,TRN,T )

NTσ20

+E(tr(PN,TGN,TPN,TGN,T+P

2N,TGN,TG

>N,T ))

NT

E(tr(G>N,TP2N,T ))

NTσ20

E(tr(G>N,TP2N,T ))

NTσ20

E(tr(P 2N,T ))

2σ40NT

=

R>N,TPN,TRN,T

NTσ20

+tr(G2

N,T )+tr(GN,TG>N,T )

NTtr(GN,T )

NTσ20

tr(GN,T )

NTσ20

12σ4

0

+ oP(1),

and

ΩNT,θ0

=

2m3(R>N,TPN,T vec[diag(PN,TGN,T )])

NTσ40

+(m4−3σ4

0)(∑NTi=1(gp)

2ii)

NTσ40

m3(R>N,TPN,T vec[diag(PN,T )])

2NTσ60

+(m4−3σ4

0)(∑NTi=1(gp)iipii)

2NTσ60

m3(R>N,TPN,T vec[diag(PN,T )])

2NTσ60

+((m4−3σ4

0)∑NTi=1(gp)iipii)

2NTσ60

(m4−3σ40)(

∑NTi=1 p

2ii)

4σ80NT

,

where pii and (gp)ii are the i-th main diagonal elements of PN,T and PN,TGN,T , respectively. As

E(R>N,TPRN,T )/NT → ΦR,R, limN,T→∞

tr(G2N,T ) + tr(G>N,TGN,T )

NT= c1,

tr(GN,T )

NT= c2,

we have

limN,T→∞

ΣNT,θ0= Σθ0

=

1σ20ΦR,R + c1

c2σ20

c2σ20

12σ4

0

.

Therefore,

1√NT

∂logLN,T (θ0)

∂θ

d→ N(0,Σθ0+ Ωθ0

),

where Ωθ0= limN,T→∞ ΩNT,θ0

.

(2) The proof of (A.9).

The second derivative can be obtained as follows

∂2logLN,T (θ)

∂ρ2= −T tr

G2N (ρ)

− 1

σ2(IT ⊗W )Y > PN,T (IT ⊗W )Y ,

∂2logLN,T (θ)

∂ρ∂σ2= −(IT ⊗W )Y > PN,T SN,T (ρ)Y

σ4,

∂2logLN,T (θ)

∂σ2∂σ2=

NT

2σ4− SN,T (ρ)Y >QN,T SN,T (ρ)Y

σ6. (A.12)

28

Page 30: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

By some calculations, we have

1

NT

∂2logLN,T (θ0)

∂ρ2= −

tr(G2N,T )

NT−R>N,TPN,TRN,T + 2R>N,TPN,TGN,Te + e>G>N,TPN,TGN,Te

NTσ20

,

1

NT

∂2logLN,T (θ0)

∂ρ∂σ2= −

R>N,TPN,Te + e>G>N,TPN,Te + e>G>N,TPN,T Xβ0 +R>N,TPN,T Xβ0

NTσ40

,

1

NT

∂2logLN,T (θ0)

∂σ2∂σ2=

1

2σ40

− e>PN,Te + (Xβ0)>PN,T Xβ0 + 2(Xβ0)>PN,Te

NTσ60

. (A.13)

Thus,

− 1

NT

∂2logLN,T (θ0)

∂θ∂θ>=

1

NT

2R>N,TPN,TGN,T e+e>G>N,TPN,TGN,T e

σ20

R>N,TPN,T e+e>G>N,TPN,T e

σ40

R>N,TPN,T e+e>G>N,TPN,T e

σ40

e>PN,T e

σ60

+

tr(G2N,T )

NT +R>N,TPN,TRN,T

NTσ20

0

0 − 12σ4

0

+ oP(1).

By Lemma (B.1), we have

1

NT

2R>N,TPN,TGN,T e+e>G>N,TPN,TGN,T e

σ20

R>N,TPN,T e+e>G>N,TPN,T e

σ40

R>N,TPN,T e+e>G>N,TPN,T e

σ40

e>PN,T e

σ60

P→ E

1

NT

2R>N,TPN,TGN,T e+e>G>N,TPN,TGN,T e

σ20

R>N,TPN,T e+e>G>N,TPN,T e

σ40

R>N,TPN,T e+e>G>N,TPN,T e

σ40

e>PN,T e

σ60

=

tr(G>N,TE(PN,T )GN,T )

NTtr(E(PN,T )GN,T )

NTσ20

tr(E(PN,T )GN,T )

NTσ20

tr(E(PN,T ))

NTσ40

.

According to Lemma (B.3), we can get

− 1

NT

∂2logLN,T (θ0)

∂θ∂θ>P→

tr(G>N,TE(PN,T )GN,T )+tr(G2N,T )

NT +E(R>N,TPN,TRN,T )

NTσ20

tr(E(PN,T )GN,T )

NTσ20

tr(E(PN,T )GN,T )

NTσ20

tr(E(PN,T ))

NTσ40− 1

2σ40

=

tr(G2N,T )+tr(G>N,TGN,T )

NT +E(R>N,TPN,TRN,T )

NTσ20

tr(GN,T )

NTσ20

tr(GN,T )

NTσ20

12σ4

0

+ o(1)

= Σθ0+ o(1).

Thus, (A.9) has been proved.

(3) The proof of (A.10).

By (A.12) , we note that 1/σ2 appears either in linear, quadratic or cubic from in all the elements of the second

derivative of logLN,T (θ) and ρ only appears in linear form in ∂2logLN,T (θ)/∂ρ∂σ2 and ∂2logLN,T (θ)/∂σ2∂σ2.

According to the convergence of θ to θ0 in probability, it is easy to show (A.10) holds for all elements but the

second derivative of logLN,T (θ) with respect to ρ. Since GN (ρ) = WSN (ρ), we have tr

G2N(ρ)

= tr

(G2

N

)+

2tr

G3N(ρ∗)

(ρ∗ − ρ0) for some ρ∗ between ρ and ρ0 by Assumption 5 and mean value theorem. Consequently, we

29

Page 31: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

have

1

NT

∂2logLN,T (θ)

∂ρ2− 1

NT

∂2logLN,T (θ0)

∂ρ2

= −2tr

G3N(ρ∗)

N

(ρ∗ − ρ0) +

(1

σ20

− 1

σ2

)(IT ⊗W )Y > PN,T (IT ⊗W )Y

NT. (A.14)

Since GN (ρ) is UB, N−1tr

G3N(ρ∗)

is bounded, implying the first term (A.14) is oP(1). The second term is also

oP(1) since (NT )−1 (IT ⊗W )Y > PN,T (IT ⊗W )Y which can be shown by Lemma B.1. Consequently, (A.10)

holds.

Hence, we have finished the whole proof.

Proof of Theorem 3.

As we know SN,T (ρ) = SN,T + (ρ0 − ρ)IT ⊗W as well as the definition of β(τ) in (2.14), we have

β(τ)− β0(τ) = Φ(τ)Y ∗(ρ)−Dα − β0(τ)

= Φ(τ)Y ∗(ρ)−D(D>D)−1D>Y (ρ) − β0(τ)

= Φ(τ)IN,T −D(D>D)−1D>(INT − XΦ)Y ∗(ρ)− β0(τ)

= Φ(τ)IN,T −D(D>D)−1D>(INT − XΦ)SN,T (ρ)S−1N,T (Xβ0 +Dα+ e) − β0(τ)

= Φ(τ)IN,T −D(D>D)−1D>(INT − S)Xβ0 − β0(τ)

+Φ(τ)IN,T −D(D>D)−1D>(INT − S)Dα

+Φ(τ)IN,T −D(D>D)−1D>(INT − S)e

+(ρ0 − ρ)Φ(τ)IN,T −D(D>D)−1D>(INT − S)GN,T (Xβ0 +Dα+ e)

:= ΞN,T (1) + ΞN,T (2) + ΞN,T (3) + ΞN,T (4). (A.15)

Due to D = (INT − S)D, we get

ΞN,T (2) = Φ(τ)D −D(D>D)−1D>Dα = 0d. (A.16)

We could rewrite ΞN,T (1) by

ΞN,T (1) =

Φ(τ)Xβ0 − β0(τ)− Φ(τ)D(D>D)−1D>(INT − S)Xβ0

:= ΞN,T (1, 1)− ΞN,T (1, 2).

Then, according to Lemma B.4 and B.5, we know

ΞN,T (1) =1

2µ0β

′′0(τ)h2 + oP(h2). (A.17)

Also, Lemma B.6 shows that

√NThΞN,T (3) =

√NThΦ(τ)e + oP(1)

d→ N(0d, ν0σ20Σ−1X (τ)). (A.18)

From Theorem 2, we have√NThΞN,T (4) =

√NThOP(ρ0 − ρ) = oP(1). (A.19)

30

Page 32: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

By (A.15)-(A.19), we have accomplished the proof.

A.3 Optimal Bandwidth Selection

Due to the existence of individual fixed effects, the traditional cross–validation method may not provide satisfactory

results in panel data when selecting the optimal bandwidth. Hence, throughout the paper, we adopt the leave-one-

unit-out cross-validation method to choose optimal bandwidth which is also used in Li et al. (2011) and Chen et al.

(2012). The initial value ρ is obtained from the parametric spatial panel data model Lee and Yu (2010).

The idea is that firstly, use N(t− 1) observations among all data except for the i-th unit (x>it , yit), 1 ≤ t ≤ Tas the training data to obtain the estimate of β(τ), which is denoted as β

(−i)ρ (τ) for each 1 ≤ i ≤ N . The optimal

bandwidth is the one minimizes a weight squared prediction error of the form

hopt = arg minh

[Z(ρ)−B

(X, β

(−)ρ

)]>M∗>M∗

[Z(ρ)−B

(X, β

(−)ρ

)], (A.20)

where M∗ = INT − T−1(iT i′T ) ⊗ IN is used to delete the unobserved fixed effect due to M∗D = 0, B(X, β

(−)ρ ) =([

x>11β(−1)ρ

(1T

)]>, · · · ,

[x>N1β

(−N)

ρ

(1T

)]>, · · · ,

[x>1T β

(−1)ρ

(TT

)]>, · · · ,

[x>NT β

(−N)

ρ

(TT

)]>)>.

Appendix B: Main Lemmas and Their Proofs

Define ∆M (τ) = Λµ(τ)⊗ ΣX(τ), ∆M,N (τ) = λµ ⊗ ΣX(τ),

∆M (τ) =M>(τ)Ω(τ)M(τ)

NThand ∆M,N (τ) =

M>(τ)Ω(τ)N(τ)

NTh,

where

λµ =

µ2(τ)

µ3(τ)

,Λµ(τ) =

µ0(τ) µ1(τ)

µ1(τ) µ2(τ)

,Λν(τ) =

ν0(τ) ν1(τ)

ν1(τ) ν2(τ)

, N(τ) =

(1−τTTh

)2X1

...(T−τTTh

)2XT

,

where µ`(τ) = µ` =∫∞−∞ u`K(u)du and ν`(τ) = µ` =

∫∞−∞ u`K2(u)du if 0 ≤ τ ≤ 1; µ`(τ) =

∫∞0u`K(u)du and

ν`(τ) =∫∞0u`K2(u)du if τ = 0; and µ`(τ) =

∫ 0

−∞ u`K(u)du and ν`(τ) =∫ 0

−∞ u`K2(u)du if τ = 1.

Lemma B.1. Under Assumptions 1-7, as N,T →∞, we have

(1) 1√NT

A>NTPN,T Xβ0 = oP(1) for ANT = Xβ0, GN,T Xβ0, RN,T , e and GN,Te.

(2)(Xβ0+Dα0+e)>PN,T (Xβ0+Dα0+e)

NT = OP(1),(Xβ0+Dα0+e)>G>N,TPN,T (Xβ0+Dα0+e)

NT = OP(1)

and(Xβ0+Dα0+e)>G>N,TPN,TGN,T (Xβ0+Dα0+e)

NT = OP(1).

Proof. (1) By Lemma B.4, it is obvious that

X>itβ(τt)−X>itΦ(τt)Xβ0 = −1

2µ2X

>itβ

′′

0 (τt)h2 + oP(h2).

31

Page 33: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Define β′′

0 = (β′′

0 (τ1)>, · · · ,β′′

0 (τT )>)>. Then,

(INT − S)Xβ0 =

X>11β(τt)−X>11Φ(τt)Xβ0

...

X>NTβ(τt)−X>NTΦ(τt)Xβ0

= −1

2µ2h

2Xβ′′

0 + 1NT oP(h2).

As Xβ′′

0 is element bounded uniformly by Assumption 1 and 5, we have

‖(INT − S)Xβ0‖∞ =

∥∥∥∥−1

2µ2h

2Xβ′′

0 + 1NT o(h2)

∥∥∥∥∞

= OP(h2)

and ‖(INT − S)Xβ0‖2 =

∥∥∥∥−1

2µ2h

2Xβ′′

0 + o(h2)

∥∥∥∥2

= OP(√NTh2).

By Lemma C.6, ‖QNT ‖2 ≤√‖QNT ‖1‖QNT ‖∞ <∞ holds so that

‖QNT (INT − S)Xβ0‖2 = OP(√NTh2).

Due to Assumption 10 we know NTh8 → 0 , we have

1√NT

(Xβ0)>PN,T Xβ0 =1√NT‖QN,T (INT − S)Xβ0‖22 = OP(

√NTh4)

= oP(1).

Similarly, we can show that

‖QN,T (INT − S)GN,T Xβ0‖2 = OP(√NTh2)

and

‖QN,T (INT − S)RN,T ‖2 = OP(√NTh2).

Then,

1√NT

(GN,T Xβ0)>PN,T Xβ0 =1√NT‖QN,T (INT − S)Xβ0‖2‖QN,T (INT − S)GN,T Xβ0‖2

= O(√NTh4) = oP(1).

1√NT

R>N,TPN,T Xβ0 =1√NT‖QN,T (INT − S)Xβ0‖2‖QN,T (INT − S)RN,T ‖2

= O(√NTh4) = oP(1),

Thus, we have shown that1√NT

A>N,TPN,T Xβ0 = oP(1)

holds for ANT = Xβ0, GN,T Xβ0 and RN,T .

As E(

1√NT

A>N,TPN,T Xβ0

)= 0 for AN,T = e and GN,Te, to prove the lemma it suffices to show that

Var

(1√NT

A>N,TPN,T Xβ0

)= o(1).

32

Page 34: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

It can be show that ‖PN,T Xβ0‖2 = OP(√NTh2) and ‖G>N,TPN,T Xβ0‖2 = OP(

√NTh2) so that we have

Var

(1√NT

e>G>N,TPN,T Xβ0

)=

1

NTE(

(Xβ0)>PN,TGN,TVar(e|FX)G>N,TPN,T Xβ0

)=

σ20

NTE(‖G>N,TPN,T Xβ0‖22

)= o(h4).

and

Var

(1√NT

e>PN,T Xβ0

)=

1

NTE(

(Xβ0)>PN,TVar(e|FX)PN,T Xβ0

)=

σ20

NTE(

(Xβ0)>P 2N,T Xβ0

)=

σ20

NTE(‖PN,T Xβ0‖22

)= o(h4).

Then,1√NT

A>N,TPN,T Xβ0 = oP(1)

for AN,T = e and GN,Te.

(2) From (1), it can be shown that

(Xβ0 +Dα0 + e)>PN,T (Xβ0 +Dα0 + e)

NT=

e>PN,Te

NT+ oP(1),

(Xβ0 +Dα0 + e)>G>N,TPN,T (Xβ0 +Dα0 + e)

NT=

e>G>N,TPN,Te

NT+R>N,TPN,Te

NT+ oP(1),

(Xβ0 +Dα0 + e)>G>N,TPN,TGN,T (Xβ0 +Dα0 + e)

NT=

e>G>N,TPN,TGN,Te

NT+R>N,TPN,TRN,T

NT

+2R>N,TPN,Te

NT+ oP(1). (B.1)

It can be shown that tr(E(P 2N,T )) = O(NT ) and

∑ni=1 E(p2ii) = O(NT ), we have

Var

(e>PN,Te

NT

)=

1

(NT )2

(m4 − 3σ4

0)

n∑i=1

E(p2ii) + 2σ40tr(E(P 2

N,T ))

= O

(1

NT

)= o(1).

Then,e>PN,Te

NT= E

(e>PN,Te

NT

)+ oP(1) = σ2

0 + oP(1) = OP(1).

Similarly, we can show that

e>G>N,TPN,Te

NT= E

(e>G>N,TPN,Te

NT

)+ oP(1) =

tr(GN,T )

NTσ20 + oP(1) = OP(1)

andR>N,TPN,Te

NT= E

(R>N,TPN,Te

NT

)+ oP(1) = oP(1).

AsR>N,TPN,TRN,T

NT → ΨR,R = O(1), we can get the right side of equations in (B.1) to be OP(1).

33

Page 35: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Lemma B.2. Under Assumptions 1-7, as N,T →∞, we have

(1) H1(ρ) = 1NT (Xβ0)>PN,TSN,T (ρ)S−1N,Te = oP(1) uniformly on 4.

(2) H2(ρ) = 1NTR

>N,TPN,TSN,T (ρ)S−1N,Te = oP(1) uniformly on 4.

(3) H3(ρ) = 1NT e>S−1>N,T SN,T (ρ)>PN,TSN,T (ρ)S−1N,Te = σ2(ρ) + oP(1) uniformly on 4.

Proof. (1) By Lemma B.1,

1√NT

(Xβ0)>PN,TSN,T (ρ)S−1N,Te =1√NT

e>PN,T Xβ0 + (ρ0 − ρ)1√NT

(GN,Te)>PN,T Xβ

= oP(1),

uniformly on 4 due to the linear form of ρ.

Thus, H1(ρ) = OP( 1√NT

) = oP(1) uniformly on 4.

(2) Similarly to the (1) in Lemma B.1, we can show that

Var

(R>N,TPN,TGN,Te

NT

)= o(1).

so thatR>N,TPN,TGN,Te

NT= E

(R>N,TPN,TGN,Te

NT

)+ oP(1) = oP(1).

As we have shown that (NT )−1R>N,TPN,Te = oP(1), it holds that

H2(ρ) =1

NTR>N,TPN,Te + (ρ0 − ρ)

1

NTR>N,TPN,TGN,Te = oP(1),

uniformly on 4 due to the linear form of ρ.

(3) It is easy to check that

EH3(ρ) = σ2(ρ).

Define AN,T = (aij) = S−1>N,T S>N,T (ρ)PN,TSN,T (ρ)S−1N,T and therefore AN,T is UB. According to Lemma A.7 in

Lee (2004), we get tr(AN,TA>N,T ) = O(NT ) and

∑ni=1 a

2ii ≤ tr(A2

N,T ) = O(NT ). Then,

VarH3(ρ) =1

(NT )2Var(e>AN,Te)

=1

(NT )2E(e>AN,Te)2 − 1

(NT )2E2(e>AN,Te)

=1

(NT )2

[(m4 − 3σ4

0)

NT∑i=1

E(a2ii)

+ σ40E(tr(AN,TA

>N,T +A2

N,T ))]

= O

(1

NT

)= o(1).

Thus, H3(ρ) = σ2(ρ) + oP(1) uniformly on 4.

Lemma B.3. For any NT ×NT UB matrix A, we have

tr(PN,TA)

NT=

tr(A)

NT+ oP(1)

34

Page 36: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

for sufficiently large N and T .

Proof. Due to the symmetry of PN,T , we have

tr(PN,TA) = tr(PN,TA>) = tr

(PN,T

A+A>

2

)

where (A+A>)/2 is a symmetric matrix.

Then, we only need to show the lemma for the case of symmetric matrix A so that in the following part, we

assume A is symmetric.

By calculation,

tr(PN,TA)

NT− tr(A)

NT

=S>S − S − S> − (INT − S)>D(D>D)−1D>(INT − S)A

NT

=1

NTtr(S>S − S − S>)A − 1

NTtrD(D>D)−1D>(INT − S)A(INT − S)>

(B.2)

It can be show the elements of S are OP(1/NTh) uniformly. As A is UB, we have

1

NTtr(SA) =

1

NTtr(S>A) =

1

NTO

(NT

NTh

)= OP

(1

NTh

)= oP(1)

by Lemma A.8 in Lee (2004). Similarly we can approve that S>S is also with elements being O(1/NTh) uniformly

so that1

NTtr(S>SA) = oP(1).

Thus,

1

NTtr

(S>S − S − S>)A

= oP(1). (B.3)

As INT −S is UB for sufficient large N and T , the product (INT −S)A(INT −S)> is also UB for sufficient large

N and T by Lemma C.5 so that

supN,T≥1

‖(INT − S)A(INT − S)>‖2

≤√

supN,T≥1

‖(INT − XΦ)A(INT − XΦ)>‖1 supN,T≥1

‖(INT − XΦ)A(INT − XΦ)>‖∞

= O(1).

Since D(D>D)−1D> is an idempotent matrix and (INT − S)A(INT − S)> is symmetric, by 6.77 on page 120 in

Seber (2007) it can be shown

trD(D>D)−1D>(INT − S)A(INT − S)>

= tr(D>D)−1/2D>(INT − S)A(INT − S)>D(D>D)−1/2

≤ (N − 1) supN,T≥1

‖(INT − S)A(INT − S)>‖2

= O(N − 1).

35

Page 37: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Thus,

1

NTtrD(D>D)−1D>(INT − S)A(INT − S)>

= O

(1

T

)= o(1). (B.4)

Combining (B.2)-(B.4), we show that it holds that

tr(PN,TA)

NT=

tr(A)

NT+ oP(1)

for sufficiently large N and T .

Lemma B.4. Suppose Assumptions 1 and 3 hold. As T , N →∞,

ΞN,T (1, 1) =1

2µ0β

′′0(τ)h2 + oP(h2).

Proof. According to the Taylor Expansion, we have

Φ(τ)Xβ0 − β0(τ)

= (Id, 0d×d)∆−1M (τ)∆M,N (τ)

(1

2β′′0(τ)h2 + o(h2)

)= (Id, 0d×d)∆

−1M (τ)∆M,N (τ)

(1

2β′′0(τ)h2 + o(h2)

)+(Id, 0d×d)

(∆−1M (τ)−∆−1M (τ)

)∆M,N (τ)

(1

2β′′0(τ)h2 + o(h2)

)+(Id, 0d×d)∆

−1M (τ)

(∆M,N (τ)−∆M,N (τ)

)(1

2β′′0(τ)h2 + o(h2)

)+(Id, 0d×d)

(∆−1M (τ)−∆−1M (τ)

)(∆M,N (τ)−∆M,N (τ)

)(1

2β′′0(τ)h2 + o(h2)

)= ΞN,T (1, 1, 1) + ΞN,T (1, 1, 2) + ΞN,T (1, 1, 3) + ΞN,T (1, 1, 4).

Simple algebra yields that

ΞN,T (1, 1, 1) =1

2µ0β

′′0(τ)h2 + o(h2). (B.5)

By Lemma C.2,

‖ΞN,T (1, 1, `)‖2 = OP

(h2√NTh

)= oP(h2), for ` = 6, 7, 8. (B.6)

Thus, the lemma has been proved.

Lemma B.5. Suppose Assumptions 1 and 3 hold. As T , N →∞,

‖ΞN,T (1, 2)‖2 = o(h2).

36

Page 38: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Proof. Using the similar proof in Su and Ullah (2006) and Gao and Li (2013), we have

(D>D)−1 = (D>D)−1

1 +O

(h2 +

1√NTh

)so that the leading term of ΞN,T (1, 2) is

ΞN,T (1, 2, 1) = Φ(τ)D(D>D)−1D>(INT − S)>(INT − S)Xβ0.

Define Φ(τ) = (Φ1(τ), · · · ,ΦT (τ)). Then, Υ is a NT ×NT matrix with the (s1, s2)-th N ×N block matrix being

Υs1,s2 = (Xs2Φs1(τs2)> +Xs1Φs2(τs1)−T∑t=1

XtΦs1(τt)>XtΦs2(τt).

Thus,

(INT − S)>(INT − S)Xβ0 = Xβ0 −ΥXβ0 =

X1β(τ1)−

∑Ts1=1 Υ1,s1Xs1β(τs1)

...

XTβ(τT )−∑Ts1=1 ΥT,s1Xs1β(τs1)

:=

XS

1

...

XST

,

where

XSt = Xt

(β(τt)−

T∑s1=1

Φs1(τt)Xs1β(τs1)

)−

T∑s1=1

[(Xs1Φt(τs1)> −

T∑s2=1

Xs2Φt(τs2)> Xs2Φs1(τs2)

]Xs1β(τs1)

= Xt

(o(h2)

)by Lemma (C.3).

Define Ks(τ) =(K(τs−τh

)Id,

τs−τh K

(τs−τh

)Id)>

. Since

D(D>D)−1D> =1

T1T1>T ⊗

(IN −

1

N1N1>N

),

we have

ΞN,T (1, 2, 1)

= Φ(τ)

1

T1T1>T ⊗

(IN −

1

N1N1>N

)(INT − S)>(INT − S)Xβ0

= (Id, 0d×d)

M>(τ)Ω(τ)M(τ)

NTh

−11

NThM>(τ)Ω(τ)

[1T ⊗

1

T

T∑t=1

(IN −

1

N1N1>N

)XSt

]

=o(h2)

NT 2h(Id, 0d×d)∆M (τ)−1

T∑t,s=1

Kt(τ)

(v>t vs −

1

Nv>t 1N1>Nvs

)

=o(h2)

NT 2h(Id, 0d×d)

∆M (τ)−1 −∆M (τ)−1

T∑t,s=1

Kt(τ)

(v>t vs −

1

Nv>t 1N1>Nvs

)o(h2)

NT 2h(Id, 0d×d)∆M (τ)−1

T∑t,s=1

Kt(τ)

(v>t vs −

1

Nv>t 1N1>Nvs

).

By Lemma C.2 and Lemma C.4, we have ‖ΞN,T (1, 2, 1)‖2 = oP(h2).

37

Page 39: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Lemma B.6. Suppose Assumptions 1, 2 and 3 hold. As T , N →∞,

√NThΞN,T (3) =

√NThΦ(τ)e + oP(1)

d→ N(0d×d, ν0σ20Σ−1X (τ)).

Proof. By definition, we have

√NThΞN,T (3) =

√NThΦ(τ)IN,T −D(D>D)−1D>(INT − S)e

=√NThΦ(τ)e +

√NThΦ(τ)D(D>D)−1D>(INT − S)e

=√NThΞN,T (3, 1) +

√NThΞN,T (3, 2).

By definition, we have

√NThΞN,T (3, 1)

=√NTh(Id, 0d×d)

M>(τ)Ω(τ)M(τ)

NTh

−11

NThM>(τ)Ω(τ)e

=√NTh(Id, 0d×d)∆

−1M (τ)∆M,e(τ)

=√NTh(Id, 0d×d)∆

−1M (τ)∆M,e(τ)

+√NTh(Id, 0d×d)

(∆−1M (τ)−∆−1M (τ)

)∆M,e(τ)

= ΞN,T (3, 1, 1) + ΞN,T (3, 1, 2).

Define Zt,N = 1√NTh

K(t−τTTh

)X>t et(

t−τTTh

)K(t−τTTh

)X>t et

so that Zt,N , Et,N is a martingale difference array and

√NTh∆M,e(τ) =

1√NTh

∑Tt=1K

(t−τTTh

)X>t et∑T

t=1

(t−τTTh

)K(t−τTTh

)X>t et

=

T∑t=1

Zt,N .

Next we only need to check for any ε > 0 and 2d× 1 element bounded vector a, it holds that

T∑t=1

E∣∣a>Zt,NZ>t,Na

∣∣ I (∣∣a>Zt,N ∣∣ > ε)|EN,t−1

= oP(1) (B.7)

and there exists a positive definite matrix Σz such that

T∑t=1

E(a>Zt,NZ

>t,Na|EN,t−1

)= σ2

0a>Σza + oP(1). (B.8)

To prove (B.7), it is sufficient to prove

T∑t=1

E∣∣a>Zt,N ∣∣2+δ = o(1).

38

Page 40: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Without loss of generality, we assume d = 1 and a = (a0, a1)>. Define K`t (τ) =

(τt−τh

)`K(τt−τh

). We have

T∑t=1

E∣∣a>Zt,N ∣∣2+δ

=1

(NTh)1+δ/2

T∑t=1

E

∣∣∣∣∣K0

t (τ)

N∑i=1

(a0Xiteit) +K1t (τ)

N∑i=1

(a1Xiteit)

∣∣∣∣∣2+δ

=1

(NTh)1+δ/2

T∑t=1

E

∣∣∣∣∣(K0

t (τ)a0 +K1t (τ)a1)g(τt)

(N∑i=1

eit

)+ (K0

t (τ)a0 +K0t (τ)a1)

(N∑i=1

viteit

)∣∣∣∣∣2+δ

=1

(NTh)1+δ/2

T∑t=1

E

∣∣∣∣∣(K0t (τ)a0 +K1

t (τ)a1)g(τt)

(N∑i=1

eit

)∣∣∣∣∣2+δ

1/(2+δ)

+

E

∣∣∣∣∣(K0t (τ)a0 +K1

t (τ)a1)

(N∑i=1

viteit

)∣∣∣∣∣2+δ

1/(2+δ)2+δ

≤ C

(NTh)1+δ/2

T∑t=1

|K0t (τ) +K1

t (τ)|2+δ

E

∣∣∣∣∣N∑i=1

eit

∣∣∣∣∣2+δ

1/(2+δ)

+

E

∣∣∣∣∣N∑i=1

viteit

∣∣∣∣∣2+δ

1/(2+δ)2+δ

=C

N1+δ/2Th

1

Th

T∑t=1

|K0t (τ) +K1

t (τ)|2+δO(N1+δ/2

)= O

(1

Th

)= o(1).

As

T∑t=1

Zt,NZ>t,N =

1

NTh

T∑t=1

(K0t (τ)

)2X>t ete

>t Xt

(K0t (τ)

)1X>t ete

>t Xt(

K1t (τ)

)2X>t ete

>t Xt

(K2t (τ)

)2X>t ete

>t Xt,

we have

1

NTh

T∑t=1

(K`t (τ)

)2E(X>t ete

>t Xt|EN,t−1

)=

1

NTh

T∑t=1

(K`t (τ)

)2X>t E

(ete>t |EN,t−1

)Xt

= σ20

1

NTh

T∑t=1

(K`t (τ)

)2X>t Xt

= σ20ΣX(τ)ν`(τ) + oP(1),

where the last equality can be shown using the similar approach as Lemma C.2 (1).

Thus,

T∑t=1

E(a>Zt,NZ

>t,Na|EN,t−1

)= σ2

0a> (Λν(τ)⊗ ΣX(τ)) a + oP(1). (B.9)

39

Page 41: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Based on the martingale difference array central limit theory, we have

√NTh∆M,e

d→ N(02d, σ

20ΣX(τ)ν`(τ)

)By Cramer-Rao device, we have

ΞN,T (3, 1, 1)d→ N

(0d, ν0σ

20Σ−1X (τ)

).

Also, by Lemma C.2 and (B.10) we have ‖ΞN,T (3, 1, 2)‖2 = oP(1).

Next, we show ΞN,T (3, 2) = oP(1). By Lemma C.7, we have Thus,

E (ΞN,T (3, 2)) = E (E(ΞN,T (3, 2)|FX))

= E(√

NThΦ(τ)D(D>D)−1D>(INT − S)E(e|FX))

= 0d

and

Var (ΞN,T (3, 2))

= E (Var(ΞN,T (3, 2)|FX)) + Var (E(ΞN,T (3, 2)|FX))

= E (Var(ΞN,T (3, 2)|FX))

= E(

Var(√NThΦ(τ)D(D>D)−1D>(INT − S)e|FX)

)= (NTh)E

(Φ(τ)D(D>D)−1D>(INT − S)Var(e|FX)(INT − S)>D(D>D)−1D>Φ(τ)>

)= σ2

0E(NThΦ(τ)D(D>D)−1D>(INT − S)(INT − S)>D(D>D)−1D>Φ(τ)>

)= σ2

0E(ΞN,T (3, 2, 1))

Thus, we only need to show ‖E(ΞN,T (3, 2, 1))‖ = o(1). By definition,

ΞN,T (3, 2, 1)

= NThΦ(τ)D(D>D)−1D>(INT − S)(INT − S)>D(D>D)−1D>Φ(τ)>

= NThΦ(τ)D(D>D)−1D>(INT −Υ)(INT −Υ)D(D>D)−1D>Φ(τ)>

= NThΦ(τ)D(D>(INT −Υ)D

)−1D>Φ(τ)>

−NThΦ(τ)D(D>(INT −Υ)D

)−1D>ΥD

(D>(INT −Υ)D

)−1D>Φ(τ)>

+NThΦ(τ)D(D>(INT −Υ)D

)−1D>ΥΥD

(D>(INT −Υ)D

)−1D>Φ(τ)>

= ΞN,T (3, 2, 1, 1) + ΞN,T (3, 2, 1, 2) + ΞN,T (3, 2, 1, 3)

40

Page 42: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

The leading order of ΞN,T (3, 2, 1, 1) is

NThΦ(τ)D(D>D

)−1D>Φ(τ)>

= NTh(Id, 0d×d)

M>(τ)Ω(τ)M(τ)

NTh

−11

NThM>(τ)Ω(τ)

1

T1T1>T ⊗

(IN −

1

N1N1>N

)1

NThΩ(τ)M(τ)

M>(τ)Ω(τ)M(τ)

NTh

−1(Id, 0d×d)

>

=1

NT 2h(Id, 0d×d)∆M (τ)−1

T∑t,s=1

Kt(τ)

(v>t vs −

1

Nv>t 1N1>Nvs

)Ks(τ)>

∆M (τ)−1(Id, 0d×d)

>

By Lemma C.2 and Lemma C.4, we have∥∥∥NThΦ(τ)D

(D>D

)−1D>Φ(τ)>

∥∥∥ = oP(1).

Similarly, we can show ‖ΞN,T (3, 2, 1, 2)‖ = oP(1) and ‖ΞN,T (3, 2, 1, 3)‖ = oP(1).

Appendix C: Additional Lemmas and Their Proofs

Lemma C.1. Under Assumption 1, for any 1 ≤ i, j ≤ N , ≤ t, s ≤ T and 1 ≤ k1, k2 ≤ d, we have

Cov(i>k1X

>t Xtik2, i

>k1X

>s Xsik2

)≤ CNα(|t− s|)δ/(4+δ).

Proof. By definition, we have

i>k1X>t Xtik2 = Ngk1(τt)gk2(τt) + gk2(τt)1

>Nvtik1 + gk1(τt)1

>Nvtik2 + i>k1v

>t vtik2

i>k1X>s Xsik2 = Ngk1(τs)gk2(τs) + gk2(τs)1

>Nvsik1 + gk1(τs)1

>Nvsik2 + i>k1v

>s vsik2

≤ C

Thus,

Cov(i>k1X

>t Xtik2, i

>k1X

>s Xsik2

)= gk2(τt)gk2(τs)Cov(1>Nvtik1,1

>Nvsik1) + gk1(τt)gk2(τs)Cov(1>Nvtik2,1

>Nvsik1) +

gk2(τs)Cov(i>k1v>t vtik2,1

>Nvsik1) + gk2(τt)gk1(τs)Cov(1>Nvtik1,1

>Nvsik2) +

gk1(τt)gk1(τs)Cov(1>Nvtik2,1>Nvsik2) + gk1(τs)Cov(i>k1v

>t vtik2,1

>Nvsik2) +

gk2(τt)Cov(1>Nvtik1, i>k1v>s vsik2) + gk1(τt)Cov(1>Nvtik2, i

>k1v>s vsik2) +

Cov(i>k1v>t vtik2, i

>k1v>s vsik2) (C.1)

=

9∑m=1

Ξcov(m) (C.2)

41

Page 43: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

According to Proposition 2.5 in Fan and Yao (2008), under Assumption 1 (ii) we get

Ξcov(1)

≤ C∣∣Cov(1>Nvtik1,1

>Nvsik1)

∣∣≤ Cα(|t− s|)δ/(4+δ)

E(|1>Nvtik1|2+δ/2)

2/(4+δ) E(|1>Nvsik1|2+δ/2)

2/(4+δ)

≤ Cα(|t− s|)δ/(4+δ)E

∥∥∥∥∥N∑i=1

vit

∥∥∥∥∥2+δ/2

2/(4+δ)E

∥∥∥∥∥N∑i=1

vis

∥∥∥∥∥2+δ/2

2/(4+δ)

≤ Cα(|t− s|)δ/(4+δ)(O(N1+δ/4)

)4/(4+δ)≤ CNα(|t− s|)δ/(4+δ).

Ξcov(3)

≤ C∣∣Cov(i>k1v

>t vtik2,1

>Nvsik1)

∣∣≤ Cα(|t− s|)δ/(4+δ)

E(|i>k1v

>t vtik2|2+δ/2)

2/(4+δ) E(|1>Nvsik1|2+δ/2)

2/(4+δ)

≤ Cα(|t− s|)δ/(4+δ)E

∥∥∥∥∥N∑i=1

vitv>it

∥∥∥∥∥2+δ/2

2/(4+δ)E

∥∥∥∥∥N∑i=1

vis

∥∥∥∥∥2+δ/2

2/(4+δ)

≤ Cα(|t− s|)δ/(4+δ)(O(N1+δ/4)

)4/(4+δ)≤ CNα(|t− s|)δ/(4+δ).

Ξcov(9)

≤∣∣Cov(i>k1v

>t vtik2, i

>k1v>s vsik2)

∣∣≤ Cα(|t− s|)δ/(4+δ)

E(|i>k1v

>t vtik2|2+δ/2)

2/(4+δ) E(|i>k1v

>s vsik2|2+δ/2)

2/(4+δ)

≤ Cα(|t− s|)δ/(4+δ)E

∥∥∥∥∥N∑i=1

vitv>it

∥∥∥∥∥2+δ/2

2/(4+δ)E

∥∥∥∥∥N∑i=1

visv>is

∥∥∥∥∥2+δ/2

2/(4+δ)

≤ Cα(|t− s|)δ/(4+δ)(O(N1+δ/4)

)4/(4+δ)≤ CNα(|t− s|)δ/(4+δ).

Similarly, we can show other terms having the same rate. Thus, we obtain

Cov (Xitk1Xitk2 , Xjsk1Xjsk2) ≤ CNα(|t− s|)δ/(4+δ).

Lemma C.2. Suppose Assumption 1 and 3 hold. As T , N →∞ simultaneously and 0 ≤ τ ≤ 1, we have

(i)

∥∥∥∥∥ 1

NTh

T∑t=1

(t− τTTh

)`K

(t− τTTh

)X>t Xt − ΣX(τ)µ`(τ)

∥∥∥∥∥2

= OP

(1√NTh

), for ` = 0, · · · , 3

42

Page 44: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

(ii)∥∥∥∆M (τ)−∆M (τ)

∥∥∥2

= OP

(1√NTh

)= oP(1),

(iii)∥∥∥∆M,N (τ)−∆M,N (τ)

∥∥∥2

= OP

(1√NTh

)= oP(1).

Proof. Denote ∆X = ΣX(τ)µ`(τ) =(

∆(k1,k2)X

)d×d

, K`t =

(t−τTTh

)`K(t−τTTh

)and

∆X =(

∆(k1,k2)X

)d×d

=1

NTh

T∑t=1

(t− τTTh

)`K

(t− τTTh

)X>t Xt.

Since ∆X is a finite matrix, to show (i) it suffices to prove

∣∣∣∆(k1,k2)X −∆

(k1,k2)X

∣∣∣ = O

(1√NTh

). (C.3)

By Assumption 1, we have

1

N

N∑i=1

E (Xitk1Xitk2) = gk1(τt)gk2(τt) + σ(k1,k2)v .

Then, the expectation of (k1, k2)-element of ∆X is

E(

∆(k1,k2)X

)=

1

Th

T∑t=1

K`t

(1

N

N∑i=1

E (Xitk1Xitk2)

)

=1

Th

T∑t=1

K`t

(gk1(τt)gk2(τt) + σ(k1,k2)

v

)=(gk1(τ)gk2(τ) + σ(k1,k2)

v

)µ`(τ) + o(1) = ∆

(k1,k2)X + o(1).

By Lemma C.1(ii), we obtain the variance

Var(

∆(k1,k2)X

)=

1

(NTh)2

T∑t,s=1

K`tK

`sCov

(i>k1X

>t Xtik2, i

>k1X

>s Xsik2

)=

C

(NTh)2

N∑i,j=1

T∑t=1

(K`t

)2α(0)δ/(4+δ) +

C

N(Th)2

T∑t=1

t−1∑s=1

K`tK

`sα(|t− s|)δ/(4+δ) +

C

N(Th)2

T∑t=1

T∑s=t+1

K`tK

`sα(|t− s|)δ/(4+δ) ≤ C

NThα(0)δ/(4+δ)

1

Th

T∑t=1

(K`t

)2+

C

NTh

T−1∑h=1

α(h)δ/(4+δ)

1

Th

T∑t=1

K`t

+

C

NTh

T−1∑h=1

α(h)δ/(4+δ)

1

Th

T∑t=1

K`t

= O

(1

NTh

).

Thus, C.3 holds and (i) has been proved.

(ii) and (iii) By definition, we have

∆M (τ) =1

NTh

∑Tt=1K

(t−τTTh

)X>t Xt

∑Tt=1

t−τTTh K

(t−τTTh

)X>t Xt∑T

t=1t−τTTh K

(t−τTTh

)X>t Xt

∑Tt=1

(t−τTTh

)2K(t−τTTh

)X>t Xt

43

Page 45: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

and

∆M,N (τ) =1

NTh

∑Tt=1

(t−τTTh

)2K(t−τTTh

)X>t Xt∑T

t=1

(t−τTTh

)3K(t−τTTh

)X>t Xt

which are block matrices with each block having the convergence rate of (i). Thus, (ii) and (iii) are proved.

Lemma C.3. Under Assumption 1 and 3, as N,T →∞, we have XSt = Xt

(o(h2)

).

Proof. By definition, we have

XSt = Xt

(β(τt)−

T∑s1=1

Φs1(τt)Xs1β(τs1)

)−

T∑s1=1

[(Xs1Φt(τs1)> −

T∑s2=1

Xs2Φt(τs2)> Xs2Φs1(τs2)

]Xs1β(τs1).

The first term can be written as

Xt

(β(τt)−

T∑s1=1

Φs1(τt)Xs1β(τs1)

)

= Xt

β(τt)− (Id, 0d×d)∆M (τt)

−1 1

NTh

T∑s1=1

Ks1(τt)X>s1Xs1

(β(τt) +

1

2h2β′′(τt) + o(h2)

)

= Xt

β(τt)− (Id, 0d×d)∆M (τt)

−1∆M (τt)(Id, 0d×d)>(β(τt) +

1

2h2β′′(τt) + o(h2)

)= −1

2h2Xtβ

′′(τt) +Xt

(o(h2)

).

The second term is

T∑s1=1

[(Xs1Φt(τs1)> −

T∑s2=1

Xs2Φt(τs2)> Xs2Φs1(τs2)

]Xs1β(τs1)

=

T∑s1=1

Φt(τs1)>X>s1Xs1β(τs1)−T∑

s1=1

T∑s2=1

Φt(τs2)>X>s2Xs2Φs1(τs2)Xs1β(τs1)

=1

NThXt

T∑s1=1

K>t (τs1)∆M (τs1)−1(Id, 0d×d)>X>s1Xs1β(τs1)−

1

(NTh)Xt

T∑s2=1

K>t (τs2)∆M (τs2)−1(Id, 0d×d)>X>s2Xs2

(β(τs2) +

1

2h2β′′(τs2) + o(h2)

)

= − 1

NThXt

T∑s1=1

K>t (τs1)∆M (τs1)−1(Id, 0d×d)>X>s1Xs1

(1

2h2β′′(τs1) + o(h2)

)

= − 1

NThXt

T∑s1=1

K>s1(τt)∆M (τt)−1(Id, 0d×d)

>X>s1Xs1

(1

2h2β′′(τt) + o(h2)

)

= −Xt(Id, 0d×d)>∆M (τt)

−1 1

NTh

T∑s1=1

Ks1(τt)X>s1Xs1

(1

2h2β′′(τt) + o(h2)

)= −1

2h2Xtβ

′′(τt) +Xt

(o(h2)

).

44

Page 46: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Therefore, we have

XSt = Xt

(o(h2)

).

Lemma C.4. Under Assumption 1 and 3, as N,T →∞,∥∥∥∥∥ 1

NT 2h

T∑t,s=1

Kt(τ)

(v>t vs −

1

Nv>t 1N1>Nvs

)∥∥∥∥∥2

= oP(1).

Proof. By definition, we have a 2d× d matrix

1

NT 2h

T∑t,s=1

Kt(τ)

(v>t vs −

1

Nv>t 1N1>Nvs

)

=

1NT 2h

∑Tt,s=1

K(τs−τh

) (v>t vs − 1

N v>t 1N1>Nvs)

1NT 2h

∑Tt,s=1

τt−τh K

(τt−τh

) (v>t vs − 1

N v>t 1N1>Nvs)

:=

∆(0)v

∆(1)v

.

Thus, we only need to prove the (k1, k2)-th element ∆(`,k1,k2)v of ∆

(`)v , ` = 0, 1 be oP(1). By definition, we have

∆(`,k1,k2)v =

1

NT 2h

T∑t,s=1

K`t

(i>k1v

>t vsik2 −

1

Ni>k1v

>t 1N1>Nvsik2

)

=1

NT 2h

T∑t,s=1

K`t i>k1v>t vsik2 −

1

N2T 2h

T∑t,s=1

K`t i>k1v>t 1N1>Nvsik2

= ∆(`,k1,k2,1)v − ∆(`,k1,k2,2)

v .

Next we prove the mean and variance of ∆(`,k1,k2,1)v and ∆

(`,k1,k2,2)v to be oP(1).

Since

∣∣∣E(∆(`,k1,k2,1)v

)∣∣∣=

1

NT 2h

T∑t,s=1

K`t

∣∣E (i>k1v>t vsik2)∣∣

=1

NT 2h

T∑t=1

K`t

∣∣E (i>k1v>t vtik2)∣∣+

1

NT 2h

T∑t 6=s

K`t

∣∣E (i>k1v>t vsik2)∣∣

=|σ(k1,k2)v |T

1

Th

T∑t=1

K`t +

1

T 2h

T∑t 6=s

K`t |E (vitk1visk2)|

≤ O

(1

T

)+

C

T 2h

T∑t6=s

K`tα(|t− s|)δ/(4+δ)

E(‖vit‖2+δ/2

)4/(4+δ)

≤ C

T+C

T

(T∑h=1

α(|t− s|)δ/(4+δ))

1

Th

T∑t=1

K`t

= O

(1

T

),

45

Page 47: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Var(

∆(`,k1,k2,1)v

)=

(1

NT 2h

)2 T∑t1,t2,s1,s2=1

K`t1K

`s1Cov(i>k1v

>t1vt2ik2 , i

>k1v>s1vs2ik2)

= O

(1

NTh

)1

Th

T∑t1=1

(K`t1)2

= O

(1

NTh

)= o(1),

∣∣∣E(∆(`,k1,k2,2)v

)∣∣∣≤ 1

N2T 2h

T∑t,s=1

K`t

∣∣Cov(i>k1v

>t 1N ,1

>Nvsik2

)∣∣≤ C

N2T 2h

T∑t,s=1

K`tα(|t− s|)δ/(4+δ)

E

(∥∥∥∥∥T∑t=1

vit

∥∥∥∥∥)2+δ/2

4/(4+δ)

≤ C

NTα(0δ/(4+δ)

1

Th

T∑t=1

K`t +

2C

NT

(T∑h=1

α(h)δ/(4+δ)

)1

Th

T∑t=1

K`t

= O

(1

NT

),

and

Var(

∆(`,k1,k2,2)v

)=

(1

N2T 2h

)2 T∑t1,t2,s1,s2=1

K`t1K

`t2Cov(i>k1v

>t11N1>Nvt2ik2 , i

>k1v>s11N1>Nvs2ik2)

≤(

C

N2T 2h

)2 T∑t1=1

K`t1

T∑t2,s1,s2=1

Cov(i>k1v>t11N1>Nvt2ik2 , i

>k1v>s11N1>Nvs2ik2)

≤ C

N2Th

1

Th

T∑t1=1

K`t1 = o(1),

We have ∆(`,k1,k2,1)v = oP(1) and ∆

(`,k1,k2,2)v = oP(1) so that we have accomplished the proof.

Lemma C.5. Let An = (aij) and Bn = (bij) be n × n matrices. If An and Bn are UB, the product AnBn is also

UB.

Proof. See the proof of Lemma 1 in Kelejian and Prucha (1999).

Lemma C.6. (1) Suppose Assumptions 4-5 hold. Then, S−1N,T (ρ), GN (ρ) and GN,T (ρ) and are UB matrices uniformly

in ρ ∈ ∆, especially when ρ = ρ0, we have S−1N,T , GN and GN,T are UB matrices.

(2) Suppose Assumptions 1-5 hold. Then, S, RN,T , QN,T and PN,T is UB in probability.

Proof. (1) According to Assumptions 4 and 5, we have W and S−1N (ρ) are UB in ρ ∈ ∆. Then, it holds that

46

Page 48: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

supNT≥1 ‖S−1N,T (ρ)‖1 = supNT≥1 ‖IT⊗S−1N (ρ)‖1 = supN>1 ‖S−1N (ρ)‖1 <∞ and supNT≥1 ‖S−1N,T (ρ)‖∞ = supNT≥1 ‖IT⊗S−1N (ρ)‖∞ = supN≥1 ‖S−1N (ρ)‖∞ <∞ uniformly in ρ ∈ ∆ so that S−1N,T (ρ) is UB uniformly in ρ ∈ ∆.

As GN (ρ) = WS−1N (ρ), we have GN (ρ) is UB matrices uniformly in ρ ∈ ∆ by Lemma C.5.

Similarly, supNT≥1 ‖GN,T (ρ)‖1 = supNT≥1 ‖IT⊗GN (ρ)‖1 = supN>1 ‖GN (ρ)‖1 <∞ and supNT≥1 ‖GN,T (ρ)‖∞ =

supNT≥1 ‖IT ⊗GN (ρ)‖∞ = supN≥1 ‖GN (ρ)‖∞ <∞ uniformly in ρ ∈ ∆. Then, we have GN,T (ρ) is UB uniformly in

ρ ∈ ∆.

Lemma C.7. For an NT × NT matrix AN,T = (At1t2) and a NT × 1vector bN,T = (bt) with elements being a

function of Xt, t = 1, · · · , T, we have

(1)

E(b>N,Tee>AN,Te) = m3

NT∑i=1

E(biaii).

(2)

Var(e>AN,Te) = (m4 − 3σ40)

NT∑i=1

E(a2ii)

+ σ40E(tr(AN,TA

>N,T +A2

N,T )).

(3)

E (e|FX) = 0NT , Var (e|FX) = σ20INT .

Proof. (1) By definition, we have

E(b>N,Tee>AN,Te)

= E

(T∑

t1,t2,t3=1

b>t3et3e>t1At1t2et2

)

= E

(T∑

t1=1

b>t1et1e>t1At1t1et1

)+ E

T∑t1 6=t2=1

b>t2et2e>t1At1t1et1

+ E

T∑t1 6=t2=1

b>t1et1e>t1At1t2et2

+E

T∑t1 6=t2=1

b>t1et1e>t2At2t1et1

+ E

T∑t1 6=t2 6=t3=1

b>t3et3e>t1At1t2et2

= E

(T∑

t1=1

b>t1et1e>t1At1t1et1

)

= m3

NT∑i=1

E(biaii).

(2) As Var(e>AN,Te) = E(e>AN,Tee>AN,Te)− E2(e>AN,Te), we have to calculate the two terms on the right

side seperately.

47

Page 49: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

By definition, we have

E(e>AN,Te) = E

(T∑

t1,t2=1

e>t1At1t2et2

)

= E

(T∑

t1=1

e>t1At1t1et1

)+ E

(T∑

t1>t2=1

e>t1At1t2et2

)+ E

(T∑

1=t1<t2

e>t1At1t2et2

)

=

T∑t=1

E(tr(E(Attete

>t |EN,t−1

)))+

T∑t1>t2=1

E(E(e>t1 |EN,t−1

)At1t2et2

)+

T∑1=t1<t2

E(e>t1At1t2E (et2 |EN,t−1)

)= σ2

0

T∑t=1

E(tr(Att)) = σ20E (tr(AN,T )) .

Since

E(e>AN,Tee>AN,Te) = E

(T∑

t1,t2=1

e>t1At1t2et2

)2

=

T∑t1,t2,t3,t4=1

E(e>t1At1t2et2e

>t3At3t4et4

)=

T∑t1=1

E(e>t1At1t1et1e

>t1At1t1et1

)+

T∑t1 6=t2=1

E(e>t1At1t1et1e

>t2At2t2et2

)+

T∑t1 6=t2=1

E(e>t1At1t2et2e

>t1At1t2et2

)+

T∑t1 6=t2=1

E(e>t1At1t2et2e

>t2At2t1et1

),

we need to calculate the four terms in the last equation next.

(a)

E(e>t1At1t1et1e

>t1At1t1et1

)=

N∑i1,i2,i3,i4=1

E(

E(a(t1t1)i1i2

a(t1t1)i3i4

ei1t1ei2t1ei3t1ei4t1 |Et1−1,N))

=

N∑i1=1

E(

(a(t1t1)i1i1

)2E(e4i1t1 |Et1−1,N

))

+

N∑i1 6=i2

E(

(a(t1t1)i1i1

a(t1t1)i2i2

+ a(t1t1)i1i2

a(t1t1)i2i1

+ (a(t1t1)i1i2

)2)E(e2i1t1e

2i2t1 |Et1−1,N

))

= (m4 − 3σ40)

N∑i1=1

E(

(a(t1t1)i1i1

)2)

+ σ40E(tr2(At1t1) + tr(At1t1A

>t1t1) + tr(A2

t1t1)).

48

Page 50: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

(b) Without loss of generality, we assume t1 > t2, we have

E(e>t1At1t1et1e

>t2At2t2et2

)=

N∑i1,i2,i3,i4=1

E(

E(a(t1t1)i1i2

a(t2t2)i3i4

ei1t1ei2t1ei3t2ei4t2 |Et1−1,N))

=

N∑i1,i2,i3,i4=1

E(a(t1t1)i1i2

a(t2t2)i3i4

ei3t2ei4t2E (ei1t1ei2t1 |Et1−1,N ))

=

N∑i1,i3,i4=1

σ20E(a(t1t1)i1i1

a(t2t2)i3i4

ei3t2ei4t2

)

=

N∑i1,i3,i4=1

σ20E(

E(a(t1t1)i1i1

a(t2t2)i3i4

ei3t2ei4t2 |Et2−1,N))

=

N∑i1,i2

σ40E(a(t1t1)i1i1

a(t2t2)i2i2

).

(c)

E(e>t1At1t2et2e

>t1At1t2et2

)=

N∑i1,i2,i3,i4=1

E(

E(a(t1t2)i1i2

a(t1t2)i3i4

ei1t1ei2t2ei3t1ei4t2 |Et1−1,N))

=

N∑i1,i2

σ40E(

(a(t1t2)i1i2

)2).

(d)

E(e>t1At1t2et2e

>t2At2t1et1

)=

N∑i1,i2

σ40E(a(t1t2)i1i2

a(t2t1)i2i1

).

According to (a)-(b), we first get

E(e>AN,Tee>AN,Te)

=

T∑t1=1

E(e>t1At1t1et1e

>t1At1t1et1

)+

T∑t1 6=t2=1

E(e>t1At1t1et1e

>t2At2t2et2

)+

T∑t1 6=t2=1

E(e>t1At1t2et2e

>t1At1t2et2

)+

T∑t1 6=t2=1

E(e>t1At1t2et2e

>t2At2t1et1

)= (m4 − 3σ4

0)

T∑t1=1

N∑i1=1

E(

(a(t1t1)i1i1

)2)

+ σ40

T∑t1,t2

N∑i1,i2

E(a(t1t1)i1i1

a(t2t2)i2i2

+ a(t1t2)i1i2

a(t2t1)i2i1

+ (a(t1t2)i1i2

)2)

and then

Var(e>AN,Te)

= E(e>AN,Tee>AN,Te)− E2(e>AN,Te)

= (m4 − 3σ40)

NT∑i=1

E(a2ii)

+ σ40

T∑t1,t2

N∑i1,i2

E(a(t1t2)i1i2

a(t2t1)i2i1

+ (a(t1t2)i1i2

)2)

= (m4 − 3σ40)

NT∑i=1

E(a2ii)

+ σ40E(tr(AN,TA

>N,T +A2

N,T )).

49

Page 51: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

(3) It is easy to show by the following equations

E (e|FX) =

E (e1|FX)

...

E (eT |FX)

=

E E (E (e1|F0)) |FX

...

E E (E (e1|FT−1)) |FX

= 0NT

and

Var (e|FX) = E(ee>|FX

)

E(e1e>1 |FX

)· · · E

(e1e>T |FX

)...

E(eTe>1 |FX

)· · · E

(eTe>T |FX

)

=

E

E(E(e1e>1 |F0))|FX

)· · · E

E(E

(e1e>T |FT−1))|FX

)...

. . ....

E

E(E(eTe>1 |FT−1))|FX

)· · · E

E(E

(eTe>T |FT−1))|FX

) = σ2

0INT .

Lemma C.8. Under Assumptions 1, 2(i)-(ii), 3-8, Σθ0is positive definite.

Proof. It suffices to show1

2σ40

ΨR,R +c1

2σ40

− c22σ40

> 0.

As it can be show c1 − 2c22 = limN,T→∞ tr(CsN,TCs>N,T )/(NT ) ≥ 0 where CsN,T = GN,T − (NT )−1tr(GN,T )INT ,

we only need to show that1

2σ40

ΨR,R > 0

which is satisfied by Assumption 8. Thus, we conclude the proof.

Appendix D: Additional Tables

50

Page 52: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Table D.1: Means and standard deviations of bias of ρ when ρ0 = 0.7, σ20 = 1.

(a) Time Varying

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=10-0.0149 -0.0091 -0.0056 -0.0144 -0.0088 -0.0055 -0.0039 -0.0020 -0.0018

(0.0553) (0.0452) (0.0313) (0.0540) (0.0443) (0.0306) (0.0247) (0.0212) (0.0138)

T=15-0.0138 -0.0077 -0.0052 -0.0133 -0.0073 -0.0050 -0.0041 -0.0018 -0.0009

(0.0431) (0.0371) (0.0252) (0.0422) (0.0362) (0.0246) (0.0202) (0.0175) (0.0115)

T=30-0.0081 -0.0051 -0.0034 -0.0077 -0.0050 -0.0031 -0.0017 -0.0012 -0.0002

(0.0304) (0.0250) (0.0173) (0.0296) (0.0247) (0.0169) (0.0136) (0.0117) (0.0080)

(I-2)

T=10-0.0244 -0.0165 -0.0094 -0.0229 -0.0153 -0.0088 -0.0032 -0.0016 -0.0009

(0.0452) (0.0373) (0.0263) (0.0438) (0.0362) (0.0253) (0.0164) (0.0142) (0.0094)

T=15-0.0233 -0.0153 -0.0091 -0.0216 -0.0143 -0.0084 -0.0023 -0.0015 -0.0009

(0.0380) (0.0300) (0.0198) (0.0366) (0.0292) (0.0191) (0.0136) (0.0116) (0.0075)

T=30-0.0190 -0.0129 -0.0072 -0.0176 -0.0119 -0.0066 -0.0015 -0.0011 -0.0005

(0.0282) (0.0210) (0.0141) (0.0271) (0.0205) (0.0137) (0.0094) (0.0076) (0.0054)

(I-3)

T=10-0.0165 -0.0098 -0.0048 -0.0144 -0.0082 -0.0042 -0.0008 -0.0004 0.0003

(0.0375) (0.0324) (0.0225) (0.0357) (0.0313) (0.0216) (0.0131) (0.0116) (0.0078)

T=15-0.0125 -0.0077 -0.0042 -0.0110 -0.0065 -0.0037 -0.0005 -0.0003 -0.0003

(0.0311) (0.0263) (0.0172) (0.0299) (0.0250) (0.0165) (0.0110) (0.0094) (0.0063)

T=30-0.0076 -0.0045 -0.0027 -0.0058 -0.0038 -0.0024 -0.0004 -0.0006 -0.0002

(0.0224) (0.0173) (0.0121) (0.0207) (0.0164) (0.0115) (0.0078) (0.0064) (0.0048)

(b) Parametric

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=10-0.0121 -0.0077 -0.0054 -0.0115 -0.0071 -0.0051 0.0294 0.0409 0.0469

(0.0538) (0.0453) (0.0305) (0.0525) (0.0442) (0.0298) (0.0458) (0.0408) (0.0291)

T=15-0.0104 -0.0051 -0.0045 -0.0101 -0.0048 -0.0040 0.0327 0.0428 0.0493

(0.0410) (0.0357) (0.0253) (0.0403) (0.0352) (0.0247) (0.0394) (0.0331) (0.0228)

T=30-0.0051 -0.0032 -0.0031 -0.0052 -0.0030 -0.0030 0.0380 0.0463 0.0493

(0.0286) (0.0241) (0.0169) (0.0280) (0.0236) (0.0166) (0.0267) (0.0233) (0.0176)

(I-2)

T=10-0.0121 -0.0077 -0.0054 -0.0109 -0.0065 -0.0047 0.1255 0.1396 0.1448

(0.0538) (0.0453) (0.0305) (0.0529) (0.0446) (0.0296) (0.0437) (0.0372) (0.0242)

T=15-0.0104 -0.0051 -0.0045 -0.0094 -0.0045 -0.0034 0.1248 0.1394 0.1437

(0.0410) (0.0357) (0.0253) (0.0403) (0.0355) (0.0244) (0.0356) (0.0285) (0.0197)

T=30-0.0051 -0.0032 -0.0031 -0.0045 -0.0023 -0.0026 0.1272 0.1398 0.1428

(0.0286) (0.0241) (0.0169) (0.0280) (0.0238) (0.0166) (0.0248) (0.0207) (0.0139)

(I-3)

T=10-0.0059 -0.0038 -0.0032 0.0110 0.0132 0.0135 0.1761 0.1799 0.1810

(0.0372) (0.0330) (0.0230) (0.0333) (0.0298) (0.0211) (0.0134) (0.0109) (0.0078)

T=15-0.0046 -0.0041 -0.0029 0.0112 0.0126 0.0135 0.1759 0.1792 0.1812

(0.0314) (0.0273) (0.0179) (0.0284) (0.0248) (0.0165) (0.0107) (0.0093) (0.0065)

T=30-0.0023 -0.0030 -0.0025 0.0128 0.0129 0.0131 0.1758 0.1792 0.1807

(0.0228) (0.0176) (0.0128) (0.0204) (0.0161) (0.0120) (0.0077) (0.0063) (0.0046)

51

Page 53: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Table D.2: Means and standard deviations of bias of σ2 when ρ0 = 0.7, σ20 = 1.

(a) Time Varying

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=10-0.1515 -0.1324 -0.1191 -0.1520 -0.1328 -0.1192 -0.1682 -0.1390 -0.1164

(0.1400) (0.1095) (0.0794) (0.1398) (0.1093) (0.0793) (0.1360) (0.1078) (0.0772)

T=15-0.1139 -0.0973 -0.0824 -0.1138 -0.0970 -0.0820 -0.1284 -0.0989 -0.0773

(0.1159) (0.0978) (0.0676) (0.1158) (0.0976) (0.0676) (0.1123) (0.0970) (0.0657)

T=30-0.0687 -0.0572 -0.0462 -0.0688 -0.0571 -0.0460 -0.0668 -0.0500 -0.0346

(0.0864) (0.0715) (0.0486) (0.0861) (0.0715) (0.0482) (0.0887) (0.0711) (0.0467)

(I-2)

T=10-0.1613 -0.1377 -0.1232 -0.1615 -0.1375 -0.1233 -0.1509 -0.1264 -0.1146

(0.1357) (0.1086) (0.0772) (0.1354) (0.1084) (0.0774) (0.1334) (0.1061) (0.0765)

T=15-0.1315 -0.1060 -0.0869 -0.1312 -0.1059 -0.0867 -0.1131 -0.0916 -0.0759

(0.1127) (0.0975) (0.0664) (0.1129) (0.0972) (0.0668) (0.1118) (0.0962) (0.0666)

T=30-0.0881 -0.0678 -0.0513 -0.0873 -0.0672 -0.0509 -0.0584 -0.0470 -0.0369

(0.0886) (0.0697) (0.0477) (0.0881) (0.0697) (0.0480) (0.0863) (0.0709) (0.0492)

(I-3)

T=10-0.1420 -0.1206 -0.1141 -0.1407 -0.1194 -0.1137 -0.1251 -0.1099 -0.1015

(0.1336) (0.1064) (0.0778) (0.1324) (0.1062) (0.0777) (0.1420) (0.1149) (0.0853)

T=15-0.1036 -0.0870 -0.0748 -0.1026 -0.0857 -0.0743 -0.0919 -0.0807 -0.0685

(0.1096) (0.0958) (0.0656) (0.1102) (0.0957) (0.0656) (0.1189) (0.1015) (0.0747)

T=30-0.0546 -0.0436 -0.0388 -0.0518 -0.0425 -0.0383 -0.0467 -0.0432 -0.0340

(0.0842) (0.0677) (0.0465) (0.0832) (0.0674) (0.0464) (0.0966) (0.0814) (0.0579)

(b) Parametric

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=10-0.0151 -0.0041 -0.0059 -0.0142 -0.0032 -0.0054 1.2823 1.2952 1.2912

(0.1508) (0.1206) (0.0874) (0.1509) (0.1207) (0.0872) (0.4191) (0.3493) (0.2468)

T=15-0.0119 -0.0046 -0.0016 -0.0111 -0.0042 -0.0008 1.2926 1.2797 1.2935

(0.1183) (0.1023) (0.0711) (0.1183) (0.1022) (0.0708) (0.3452) (0.2703) (0.1859)

T=30-0.0046 -0.0011 -0.0006 -0.0037 -0.0003 0.0003 1.2834 1.2758 1.2762

(0.0841) (0.0698) (0.0484) (0.0837) (0.0698) (0.0486) (0.2363) (0.1964) (0.1355)

(I-2)

T=10-0.0151 -0.0041 -0.0059 -0.0137 -0.0026 -0.0048 1.7496 1.7152 1.7036

(0.1508) (0.1206) (0.0873) (0.1507) (0.1209) (0.0872) (0.5376) (0.4597) (0.3099)

T=15-0.0119 -0.0046 -0.0016 -0.0107 -0.0037 -0.0002 1.7505 1.6858 1.6869

(0.1182) (0.1023) (0.0711) (0.1180) (0.1023) (0.0706) (0.4394) (0.3565) (0.2396)

T=30-0.0046 -0.0011 -0.0006 -0.0033 0.0000 0.0009 1.7245 1.6570 1.6531

(0.0841) (0.0699) (0.0484) (0.0839) (0.0700) (0.0484) (0.2930) (0.2447) (0.1721)

(I-3)

T=10-0.0177 -0.0059 -0.0068 -0.0240 -0.0129 -0.0144 1.7331 1.6576 1.6705

(0.1483) (0.1180) (0.0870) (0.1477) (0.1179) (0.0864) (0.5147) (0.4229) (0.3030)

T=15-0.0158 -0.0051 -0.0022 -0.0218 -0.0126 -0.0091 1.7311 1.6376 1.6503

(0.1171) (0.1019) (0.0701) (0.1162) (0.1011) (0.0699) (0.4393) (0.3266) (0.2240)

T=30-0.0064 -0.0011 -0.0009 -0.0132 -0.0082 -0.0078 1.6927 1.6180 1.6085

(0.0829) (0.0691) (0.0484) (0.0825) (0.0688) (0.0483) (0.2917) (0.2404) (0.1648)

52

Page 54: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Table D.3: Means and standard deviations of MSE of β0(τ) when ρ0 = 0.7, σ20 = 1.

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=100.0734 0.0525 0.0243 0.0741 0.0532 0.0246 0.1062 0.0682 0.0263

(0.0788) (0.0486) (0.0232) (0.0790) (0.0489) (0.0232) (0.0882) (0.0525) (0.0218)

T=150.0601 0.0414 0.0205 0.0600 0.0410 0.0200 0.0886 0.0497 0.0201

(0.0787) (0.0450) (0.0207) (0.0789) (0.0450) (0.0206) (0.0901) (0.0456) (0.0183)

T=300.0476 0.0319 0.0162 0.0479 0.0319 0.0158 0.0522 0.0284 0.0087

(0.0679) (0.0399) (0.0192) (0.0688) (0.0400) (0.0191) (0.0755) (0.0401) (0.0111)

(I-2)

T=100.0624 0.0421 0.0192 0.0621 0.0416 0.0191 0.0530 0.0356 0.0167

(0.0572) (0.0343) (0.0144) (0.0567) (0.0343) (0.0142) (0.0469) (0.0290) (0.0131)

T=150.0577 0.0384 0.0168 0.0565 0.0379 0.0165 0.0386 0.0268 0.0124

(0.0546) (0.0336) (0.0127) (0.0538) (0.0337) (0.0127) (0.0403) (0.0258) (0.0101)

T=300.0519 0.0317 0.0137 0.0501 0.0303 0.0132 0.0236 0.0157 0.0071

(0.0561) (0.0302) (0.0120) (0.0555) (0.0295) (0.0121) (0.0339) (0.0207) (0.0077)

(I-3)

T=100.0239 0.0158 0.0070 0.0232 0.0155 0.0070 0.0317 0.0229 0.0119

(0.0300) (0.0185) (0.0080) (0.0286) (0.0183) (0.0080) (0.0301) (0.0217) (0.0103)

T=150.0198 0.0130 0.0051 0.0188 0.0121 0.0048 0.0224 0.0172 0.0096

(0.0266) (0.0165) (0.0061) (0.0256) (0.0153) (0.0058) (0.0184) (0.0134) (0.0077)

T=300.0145 0.0075 0.0035 0.0124 0.0067 0.0033 0.0237 0.0175 0.0105

(0.0226) (0.0123) (0.0050) (0.0199) (0.0111) (0.0046) (0.0260) (0.0171) (0.0106)

53

Page 55: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Table D.4: Means and standard deviations of bias of ρ when ρ0 = 0.9, σ20 = 1.

(a) Time Varying

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=10-0.0062 -0.0045 -0.0027 -0.0060 -0.0042 -0.0026 -0.0013 -0.0006 -0.0008

(0.0208) (0.0184) (0.0133) (0.0201) (0.0180) (0.0130) (0.0087) (0.0080) (0.0057)

T=15-0.0055 -0.0037 -0.0025 -0.0053 -0.0036 -0.0024 -0.0014 -0.0006 -0.0005

(0.0158) (0.0148) (0.0108) (0.0154) (0.0145) (0.0105) (0.0070) (0.0067) (0.0048)

T=30-0.0038 -0.0030 -0.0020 -0.0036 -0.0029 -0.0018 -0.0006 -0.0007 -0.0003

(0.0111) (0.0101) (0.0073) (0.0108) (0.0098) (0.0071) (0.0049) (0.0045) (0.0033)

(I-2)

T=10-0.0093 -0.0070 -0.0040 -0.0087 -0.0066 -0.0036 -0.0012 -0.0009 -0.0005

(0.0155) (0.0133) (0.0096) (0.0150) (0.0129) (0.0093) (0.0056) (0.0048) (0.0032)

T=15-0.0091 -0.0064 -0.0038 -0.0085 -0.0059 -0.0036 -0.0011 -0.0008 -0.0006

(0.0130) (0.0106) (0.0070) (0.0125) (0.0103) (0.0068) (0.0046) (0.0039) (0.0025)

T=30-0.0080 -0.0058 -0.0035 -0.0074 -0.0054 -0.0033 -0.0009 -0.0008 -0.0004

(0.0095) (0.0074) (0.0051) (0.0091) (0.0071) (0.0049) (0.0033) (0.0026) (0.0019)

(I-3)

T=10-0.0058 -0.0038 -0.0019 -0.0052 -0.0033 -0.0016 -0.0003 -0.0004 -0.0002

(0.0125) (0.0109) (0.0078) (0.0118) (0.0104) (0.0074) (0.0042) (0.0037) (0.0025)

T=15-0.0047 -0.0030 -0.0018 -0.0041 -0.0025 -0.0016 -0.0003 -0.0003 -0.0004

(0.0105) (0.0089) (0.0059) (0.0100) (0.0085) (0.0057) (0.0035) (0.0030) (0.0020)

T=30-0.0031 -0.0022 -0.0014 -0.0027 -0.0018 -0.0012 -0.0003 -0.0005 -0.0004

(0.0078) (0.0062) (0.0043) (0.0074) (0.0058) (0.0041) (0.0026) (0.0020) (0.0015)

(b) Parametric

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=10-0.0049 -0.0042 -0.0023 -0.0046 -0.0038 -0.0021 0.0068 0.0115 0.0161

(0.0204) (0.0188) (0.0139) (0.0198) (0.0184) (0.0135) (0.0175) (0.0159) (0.0118)

T=15-0.0035 -0.0020 -0.0016 -0.0035 -0.0018 -0.0013 0.0085 0.0124 0.0172

(0.0158) (0.0147) (0.0117) (0.0154) (0.0146) (0.0115) (0.0150) (0.0128) (0.0090)

T=30-0.0013 -0.0009 -0.0010 -0.0013 -0.0008 -0.0010 0.0112 0.0142 0.0175

(0.0112) (0.0105) (0.0082) (0.0108) (0.0104) (0.0080) (0.0101) (0.0087) (0.0069)

(I-2)

T=10-0.0049 -0.0042 -0.0023 -0.0043 -0.0036 -0.0017 0.0418 0.0503 0.0568

(0.0204) (0.0188) (0.0139) (0.0198) (0.0184) (0.0136) (0.0154) (0.0136) (0.0086)

T=15-0.0035 -0.0020 -0.0016 -0.0031 -0.0017 -0.0011 0.0418 0.0502 0.0563

(0.0158) (0.0147) (0.0117) (0.0155) (0.0147) (0.0114) (0.0125) (0.0101) (0.0070)

T=30-0.0013 -0.0009 -0.0010 -0.0011 -0.0007 -0.0005 0.0429 0.0503 0.0560

(0.0112) (0.0105) (0.0082) (0.0107) (0.0103) (0.0080) (0.0087) (0.0069) (0.0055)

(I-3)

T=10-0.0018 -0.0014 -0.0009 0.0042 0.0052 0.0059 0.0582 0.0595 0.0604

(0.0129) (0.0118) (0.0088) (0.0116) (0.0104) (0.0078) (0.0047) (0.0036) (0.0026)

T=15-0.0014 -0.0012 -0.0005 0.0044 0.0052 0.0061 0.0583 0.0593 0.0606

(0.0109) (0.0099) (0.0069) (0.0099) (0.0090) (0.0060) (0.0039) (0.0032) (0.0023)

T=30-0.0006 -0.0008 -0.0005 0.0049 0.0055 0.0059 0.0583 0.0594 0.0604

(0.0080) (0.0066) (0.0050) (0.0074) (0.0060) (0.0045) (0.0029) (0.0023) (0.0017)

54

Page 56: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Table D.5: Means and standard deviations of bias of σ2 when ρ0 = 0.9, σ20 = 1.

(a) Time Varying

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=10-0.1679 -0.1446 -0.1252 -0.1691 -0.1449 -0.1253 -0.1822 -0.1529 -0.1281

(0.1371) (0.1091) (0.0795) (0.1368) (0.1085) (0.0793) (0.1315) (0.1045) (0.0767)

T=15-0.1364 -0.1114 -0.0898 -0.1371 -0.1121 -0.0899 -0.1458 -0.1172 -0.0914

(0.1153) (0.0972) (0.0678) (0.1153) (0.0967) (0.0676) (0.1087) (0.0946) (0.0670)

T=30-0.0954 -0.0743 -0.0541 -0.0953 -0.0743 -0.0539 -0.0919 -0.0702 -0.0468

(0.0863) (0.0712) (0.0482) (0.0866) (0.0709) (0.0482) (0.0883) (0.0710) (0.0490)

(I-2)

T=10-0.1719 -0.1452 -0.1273 -0.1731 -0.1455 -0.1277 -0.1744 -0.1480 -0.1267

(0.1340) (0.1063) (0.0767) (0.1335) (0.1061) (0.0767) (0.1330) (0.1061) (0.0769)

T=15-0.1449 -0.1144 -0.0914 -0.1462 -0.1150 -0.0917 -0.1439 -0.1171 -0.0917

(0.1097) (0.0957) (0.0656) (0.1093) (0.0952) (0.0653) (0.1113) (0.0952) (0.0652)

T=30-0.1034 -0.0787 -0.0578 -0.1033 -0.0790 -0.0578 -0.0877 -0.0761 -0.0537

(0.0863) (0.0681) (0.0470) (0.0863) (0.0682) (0.0470) (0.0908) (0.0701) (0.0492)

(I-3)

T=10-0.1447 -0.1224 -0.1160 -0.1440 -0.1215 -0.1156 -0.1337 -0.1261 -0.1233

(0.1340) (0.1057) (0.0777) (0.1342) (0.1060) (0.0775) (0.1432) (0.1126) (0.0811)

T=15-0.1093 -0.0900 -0.0778 -0.1080 -0.0891 -0.0774 -0.1059 -0.1012 -0.0890

(0.1109) (0.0958) (0.0655) (0.1105) (0.0961) (0.0658) (0.1203) (0.1000) (0.0697)

T=30-0.0597 -0.0484 -0.0416 -0.0586 -0.0472 -0.0412 -0.0633 -0.0657 -0.0584

(0.0857) (0.0674) (0.0467) (0.0855) (0.0678) (0.0465) (0.1010) (0.0803) (0.0507)

(b) Parametric

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=10-0.0131 -0.0014 -0.0055 -0.0123 -0.0008 -0.0051 1.3049 1.3292 1.3160

(0.1519) (0.1221) (0.0886) (0.1520) (0.1217) (0.0885) (0.4302) (0.3617) (0.2560)

T=15-0.0119 -0.0043 -0.0018 -0.0109 -0.0040 -0.0011 1.3143 1.3134 1.3186

(0.1195) (0.1031) (0.0723) (0.1194) (0.1028) (0.0724) (0.3550) (0.2807) (0.1932)

T=30-0.0055 -0.0016 -0.0009 -0.0045 -0.0008 0.0000 1.3013 1.3073 1.2999

(0.0848) (0.0702) (0.0489) (0.0847) (0.0705) (0.0491) (0.2436) (0.2033) (0.1402)

(I-2)

T=10-0.0131 -0.0014 -0.0055 -0.0120 -0.0003 -0.0050 1.7767 1.6997 1.6062

(0.1519) (0.1221) (0.0886) (0.1517) (0.1221) (0.0883) (0.5588) (0.4724) (0.3172)

T=15-0.0118 -0.0043 -0.0018 -0.0107 -0.0035 -0.0006 1.7750 1.6772 1.5995

(0.1195) (0.1031) (0.0723) (0.1191) (0.1029) (0.0721) (0.4538) (0.3664) (0.2436)

T=30-0.0055 -0.0016 -0.0009 -0.0041 -0.0004 0.0000 1.7460 1.6509 1.5701

(0.0848) (0.0702) (0.0489) (0.0845) (0.0702) (0.0490) (0.3043) (0.2512) (0.1764)

(I-3)

T=10-0.0177 -0.0055 -0.0072 -0.0250 -0.0133 -0.0162 1.7567 1.6573 1.6143

(0.1482) (0.1178) (0.0867) (0.1473) (0.1175) (0.0857) (0.5280) (0.4330) (0.3086)

T=15-0.0159 -0.0054 -0.0031 -0.0233 -0.0137 -0.0113 1.7559 1.6416 1.5986

(0.1172) (0.1019) (0.0700) (0.1164) (0.1007) (0.0692) (0.4566) (0.3350) (0.2302)

T=30-0.0067 -0.0016 -0.0017 -0.0147 -0.0099 -0.0100 1.7130 1.6220 1.5583

(0.0829) (0.0687) (0.0482) (0.0826) (0.0682) (0.0477) (0.3005) (0.2441) (0.1705)

55

Page 57: Time-Varying Coefficient Spatial Autoregressive Panel Data ...In terms of spatial panel data models,Zhang and Sun(2015) andZhang and Shen(2015) discuss estimation of semiparametric

Table D.6: Means and standard deviations of MSE of β0(τ) when ρ0 = 0.9, σ20 = 1.

(II-1): β0(τ) = 1 (II-2): β0(τ) = 1 + 0.05τ + 0.05τ2 (II-3): β0(τ) = 1 + 2τ + 2τ2

N=10 N=15 N=30 N=10 N=15 N=30 N=10 N=15 N=30

(I-1)

T=100.1101 0.0761 0.0342 0.1114 0.0762 0.0342 0.1292 0.0854 0.0370

(0.0906) (0.0533) (0.0232) (0.0908) (0.0532) (0.0232) (0.0903) (0.0534) (0.0226)

T=150.1043 0.0656 0.0311 0.1053 0.0664 0.0311 0.1140 0.0707 0.0323

(0.0939) (0.0477) (0.0216) (0.0935) (0.0480) (0.0215) (0.0938) (0.0471) (0.0205)

T=300.0940 0.0581 0.0265 0.0937 0.0581 0.0261 0.0854 0.0512 0.0199

(0.0883) (0.0501) (0.0210) (0.0882) (0.0503) (0.0210) (0.0865) (0.0478) (0.0194)

(I-2)

T=100.0782 0.0510 0.0228 0.0785 0.0511 0.0227 0.0705 0.0481 0.0212

(0.0587) (0.0343) (0.0140) (0.0593) (0.0349) (0.0138) (0.0529) (0.0320) (0.0130)

T=150.0751 0.0475 0.0204 0.0753 0.0470 0.0203 0.0599 0.0415 0.0186

(0.0561) (0.0334) (0.0120) (0.0560) (0.0329) (0.0120) (0.0467) (0.0280) (0.0111)

T=300.0707 0.0416 0.0186 0.0692 0.0413 0.0183 0.0456 0.0331 0.0140

(0.0618) (0.0307) (0.0120) (0.0611) (0.0306) (0.0121) (0.0500) (0.0275) (0.0107)

(I-3)

T=100.0250 0.0164 0.0077 0.0244 0.0160 0.0077 0.0314 0.0227 0.0113

(0.0301) (0.0180) (0.0085) (0.0293) (0.0175) (0.0085) (0.0292) (0.0204) (0.0092)

T=150.0229 0.0142 0.0061 0.0217 0.0134 0.0059 0.0243 0.0192 0.0098

(0.0288) (0.0171) (0.0068) (0.0277) (0.0162) (0.0066) (0.0194) (0.0133) (0.0063)

T=300.0185 0.0105 0.0050 0.0172 0.0093 0.0047 0.0300 0.0227 0.0138

(0.0272) (0.0163) (0.0067) (0.0257) (0.0145) (0.0064) (0.0290) (0.0171) (0.0092)

56