Central Annals of Public Health and Research

Cite this article: Chen J, Wang R, Huang Y (2015) Semiparametric Spatial Autoregressive Model: A Two-Step Bayesian Approach. Ann Public Health Res 2(1): 1012.

*Corresponding author: Renfu Wang, School of Statistics, Renmin University of China, 100872 Beijing, China, Tel: +86-010-82500745; Fax: +86-010-82500745; E-mail:

Submitted: 21 November 2013

Accepted: 14 October 2014

Published: 19 January 2015

Copyright © 2015 Wang et al.

OPEN ACCESS

Keywords: Bayesian inference; Kernel estimation; Markov chain Monte Carlo algorithm; Semiparametric spatial autoregressive model; Spatial data

Research Article

Semiparametric Spatial Autoregressive Model: A Two-Step Bayesian Approach

Jiaqing Chen1, Renfu Wang2* and Yangxin Huang3

1 School of Science, Wuhan University of Technology, China
2 School of Statistics, Renmin University of China, China
3 Department of Epidemiology and Biostatistics, University of South Florida, USA

Abstract

Spatial data arise frequently in econometric studies, and it is common practice to analyze such data with spatial autoregressive (SAR) models. This paper proposes a two-step Bayesian approach for inference in the semiparametric spatial autoregressive (SPSAR) model, including the case of mixed data. With a suitable transformation, estimation under the SPSAR model is carried out in two steps. In the first step, a transformed SAR model is fitted to the data by a Bayesian method, and the residuals of the transformed SAR model are then smoothed with a nonparametric kernel estimator. In the second step, the kernel estimator is substituted into the SPSAR model and the parameters are re-estimated. Because the likelihood function of the spatial autoregressive model is too complex to admit an analytic solution, a Markov chain Monte Carlo (MCMC) algorithm is adopted to implement the Bayesian inference. A simulation study is conducted to assess the performance of the proposed method, and a real example is analyzed with the proposed method.

ABBREVIATIONS

SAR: Spatial Autoregressive; SPSAR: Semiparametric Spatial Autoregressive; MCMC: Markov Chain Monte Carlo; GDP: Gross Domestic Product; GMM: Generalized Method of Moments; NIG: Normal-Inverse Gamma.

INTRODUCTION

Spatial data analysis has attracted considerable research interest since spatial dependence was introduced into econometrics (Anselin [1]). In recent years, nonlinear modeling has become very popular in spatial econometrics. Paelinck [2] pointed out that spatial econometric relations are highly nonlinear. Semiparametric methods offer a compromise between a fully parametric specification and a nonparametric method, accommodating both spatial dependence and nonlinearity. Fotheringham and Rogerson [3] summarized four main applications of semiparametric methods in spatial analysis. First, the semiparametric method serves as an alternative to specifying a particular spatial process for the error term, with the error covariance estimated nonparametrically (Kelejian and Prucha [4]). Second, a spatially lagged dependent variable is introduced in the weight matrix in a nonparametric way (Pinkse et al. [5]). Third, in an approach akin to spatial filtering, unspecified spatial spillover effects are modeled nonparametrically in a so-called smooth spatial effects estimator (Gibbons and Machin [6]). Finally, other variables are introduced into the spatial regression model in a nonparametric fashion (Gress [7]), which yields the semiparametric spatial autoregressive model. The semiparametric spatial autoregressive (SPSAR) model was originally proposed by Gress [7] for modeling hedonic housing prices. Gress [7] compared two semiparametric autoregressive models with parametric spatial models such as the spatial autoregressive (SAR) model and found that the semiparametric models give more accurate and stable estimates than the parametric ones. Basile and Gress [8] applied a semiparametric autoregressive model to test for spatial externalities in the economic growth of European regions, where the dependent variable is the average per capita GDP growth rate over the period 1988-2000 and the exogenous regressors are the initial per capita GDP, the average proportion of real physical investment to real GDP, the average growth rate of the population, and the average unemployment rate. The econometric results provide strong evidence of nonlinearities in the effect of initial per capita incomes and schooling attainment levels.

Beyond the empirical studies of the SPSAR model by Gress [7] and Basile and Gress [8], further research remains to be done, including analytical estimators, the convergence and asymptotic properties of the estimators, and comparisons of estimators based on different estimation techniques. Recently, Su and Jin [9] proposed profile quasi-maximum likelihood estimation in partially linear spatial autoregressive models and showed that the convergence rate of the spatial parameter estimator depends on some general features of the spatial weight matrix of the model. Moreover, the estimators of the other finite-dimensional parameters in the model had the regular $\sqrt{n}$ convergence rate, and the estimator of the nonparametric component was consistent. Further, allowing for both heteroscedasticity and spatial dependence in the error terms, Su [10] explored semiparametric Generalized Method of Moments (GMM) estimation in SPSAR models under weak moment conditions, derived the limiting distributions of the estimators of both the parametric and nonparametric components, and demonstrated that the estimator of the parametric component had the usual $\sqrt{n}$ asymptotics. Su and Jin [9] and Su [10] have thus done considerable work on the estimation of the SPSAR model: they not only treated different specifications of the error terms (heteroscedasticity and spatial dependence), but also derived the asymptotic properties of the parametric and nonparametric components in theory.

In fact, the SPSAR model can be viewed as a combination of spatial dependence, spatial regression and a nonparametric component. Thus, if the parametric and nonparametric components are separated effectively, parametric estimation methods, including maximum likelihood estimation (Anselin [1]) and Bayesian estimation (LeSage [11]), can be used in our model. Moreover, for a known spatial dependence parameter $\rho$, the SPSAR model can be regarded as an ordinary semiparametric regression model. Hence, the estimation of the SPSAR model reduces to the familiar semiparametric regression problem, which is discussed in a large literature. For example, Robinson [12] provided a $\sqrt{n}$-consistent estimator with a normal asymptotic distribution. Andrews [13] adopted a general framework for proving the $\sqrt{n}$-consistency and asymptotic normality of a wide class of semiparametric estimators obtained by minimizing a criterion function, and Chai et al. [14] demonstrated a two-stage estimation for semiparametric regression by transforming the model into a standard linear regression model.

In this paper, with a suitable transformation, the estimation of the SPSAR model is carried out in two steps, which we call the two-step Bayesian estimation procedure. In the first step, we fit a transformed SAR model to the data under the Bayesian framework, and the residuals of the transformed SAR model are then smoothed with a nonparametric kernel estimator. In the second step, we substitute the kernel estimator into the SPSAR model and re-estimate the parameters. The paper is organized as follows. This Introduction presents the semiparametric spatial autoregressive model and previous work by other researchers. The Materials and Methods part consists of three sections. Section 1 describes the SPSAR model and its extensions. Section 2 provides the two-step Bayesian estimation procedure. Section 3 describes the details of the Bayesian inference, including the specification of prior distributions, the Markov chain Monte Carlo (MCMC) algorithm and mixed data modeling. In the Results and Discussion part, we evaluate the performance of the proposed method through a simulation study and analyze a data set concerning knowledge spillovers among cities in several provinces of China. The Conclusion part summarizes our research.

MATERIALS AND METHODS

Semiparametric Spatial Autoregressive Model and Its Extensions

We consider the SPSAR model as follows,

$$\mathbf{y} = \rho \mathbf{W}\mathbf{y} + \mathbf{X}\beta + g(\mathbf{T}) + \mathbf{e} \tag{1}$$

where $\mathbf{y} = (y_1, y_2, \ldots, y_n)'$, $\rho$ is the spatial dependence parameter, $\mathbf{W}$ is a pre-specified constant $n \times n$ weight matrix, $\mathbf{X}$ is an $n \times k$ fixed regressor matrix that does not contain a constant term, $\beta = (\beta_1, \beta_2, \ldots, \beta_k)'$ is the vector of parameters associated with the regressor matrix $\mathbf{X}$, $g(\mathbf{T}) = (g(\mathbf{T}_1), g(\mathbf{T}_2), \ldots, g(\mathbf{T}_n))'$, and $\mathbf{T}_i \in \mathbb{R}^q$, $i = 1, 2, \ldots, n$, are independent and identically distributed (i.i.d.) random variables. Here $g(\cdot)$ is an unknown function defined on $\mathbb{R}^q$, and the error term $\mathbf{e} = (e_1, e_2, \ldots, e_n)'$ is an $n$-dimensional random vector whose components are i.i.d. normal, i.e. $\mathbf{e} \sim N(0, \sigma_e^2 \mathbf{I})$. In addition, $\{e_i\}$ and $\mathbf{T}_i$, $i = 1, 2, \ldots, n$, are independent.

In fact, the semiparametric spatial autoregressive model $\mathbf{y}_n = \rho_0 \mathbf{W}_n \mathbf{y}_n + m(\mathbf{X}_n) + \mathbf{U}_n$ with a homoscedastic error term $\mathbf{U}_n$, considered by Su [10], is the special case in which the parametric linear part $\mathbf{X}\beta$ is omitted. Moreover, the semiparametric partially linear model discussed in the literature (Hardle et al. [15], Li and Racine [16]), in which the spatial term $\rho\mathbf{W}\mathbf{y}$ is omitted from the SPSAR model, and the spatial autoregressive model proposed by Anselin [1], in which the nonparametric component is removed, are also special cases of model (1). In addition, the spatial Durbin model in the form $\mathbf{y} = \rho\mathbf{W}\mathbf{y} + \mathbf{X}(\beta + \gamma) + \mathbf{W}\mathbf{X}(-\rho\beta) + \varepsilon$ and the spatial error model in the form $\mathbf{y} = \rho\mathbf{W}\mathbf{y} + \mathbf{X}\beta - \rho\mathbf{W}\mathbf{X}\beta + \varepsilon$, both discussed by LeSage and Pace [17], can be converted to the SPSAR model if we take the nonparametric component to be $g(\cdot) = \mathbf{W}\mathbf{X}(-\rho\beta)$ and $g(\cdot) = -\rho\mathbf{W}\mathbf{X}\beta$, respectively. Thus, the SPSAR model suggested in this paper can be extended to a number of different specifications.
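As a short worked step, the spatial error model form quoted above can be derived as follows. This sketch assumes the usual textbook statement of the spatial error model, $\mathbf{y} = \mathbf{X}\beta + \mathbf{u}$ with $\mathbf{u} = \rho\mathbf{W}\mathbf{u} + \varepsilon$; the symbol $\mathbf{u}$ is introduced here for illustration only and does not appear in the text.

```latex
\begin{align*}
\mathbf{u} &= \mathbf{y} - \mathbf{X}\beta,
  \qquad (\mathbf{I} - \rho\mathbf{W})\,\mathbf{u} = \varepsilon \\
(\mathbf{I} - \rho\mathbf{W})(\mathbf{y} - \mathbf{X}\beta) &= \varepsilon \\
\mathbf{y} &= \rho\mathbf{W}\mathbf{y} + \mathbf{X}\beta
  - \rho\mathbf{W}\mathbf{X}\beta + \varepsilon
\end{align*}
```

Comparing the last line with model (1) gives $g(\cdot) = -\rho\mathbf{W}\mathbf{X}\beta$.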

Two-step Bayesian Inference

In this section, we discuss the two-step Bayesian inferential approach and provide related details of estimation procedure and special cases for mixed data. We decompose the estimation procedure into two steps as follows.

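First step estimation: Suppose $\alpha = E(g(\mathbf{T}_i))$ with $E(g(\mathbf{T}_i)) < \infty$, and let $\varepsilon_i = g(\mathbf{T}_i) - \alpha + e_i$. Then $E(\varepsilon_i) = 0$ and

$$0 < \operatorname{var}(\varepsilon_i) = \sigma^2 = \operatorname{var}(g(\mathbf{T}_i)) + \sigma_e^2 < \infty, \quad i = 1, 2, \ldots, n.$$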

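Thus, we transform the SPSAR model (1) into the following spatial autoregressive model,

$$\mathbf{y} = \rho\mathbf{W}\mathbf{y} + \alpha\,\mathbf{l}_n + \mathbf{X}\beta + \varepsilon \tag{2}$$

where $\mathbf{l}_n = (1, 1, \ldots, 1)'$ is an $n$-dimensional vector.

Then, model (2) can be simplified as

$$\mathbf{y} = \rho\mathbf{W}\mathbf{y} + \mathbf{Z}\theta + \varepsilon \tag{3}$$

where $\mathbf{Z} = (\mathbf{l}_n \;\; \mathbf{X})$ and $\theta = (\alpha \;\; \beta')'$.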

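It can be shown that the likelihood function of model (3) is given by

$$L(\mathbf{y} \mid \rho, \theta, \sigma^2) = (2\pi\sigma^2)^{-n/2}\, |\mathbf{A}| \exp\left\{ -\frac{1}{2\sigma^2} (\mathbf{A}\mathbf{y} - \mathbf{Z}\theta)'(\mathbf{A}\mathbf{y} - \mathbf{Z}\theta) \right\} \tag{4}$$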

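where $\mathbf{A} = \mathbf{I} - \rho\mathbf{W}$. Based on Bayes' formula, the conditional posterior distribution of $\theta$, $\sigma^2$ and $\rho$ given $\mathbf{y}$ is expressed as

$$p(\rho, \theta, \sigma^2 \mid \mathbf{y}) \propto L(\mathbf{y} \mid \rho, \theta, \sigma^2)\, \pi(\theta, \sigma^2)\, \pi(\rho) \tag{5}$$

where $\pi(\theta, \sigma^2)$ and $\pi(\rho)$ are the prior distributions of $(\theta, \sigma^2)$ and $\rho$, which should be specified from domain knowledge in applications.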
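To make the likelihood of the transformed SAR model concrete, the following is a minimal sketch of its logarithm in Python with NumPy. The function name `sar_log_likelihood` and its argument order are assumptions made for illustration, not part of the paper; the log-determinant is computed with `numpy.linalg.slogdet` for numerical stability.

```python
import numpy as np


def sar_log_likelihood(rho, theta, sigma2, y, W, Z):
    """Log of the SAR likelihood for y = rho*W*y + Z*theta + eps.

    Hypothetical helper: the name and signature are illustrative only.
    """
    n = y.shape[0]
    A = np.eye(n) - rho * W                 # A = I - rho * W
    _, logabsdet = np.linalg.slogdet(A)     # log |det(A)|
    resid = A @ y - Z @ theta               # A y - Z theta
    return (-0.5 * n * np.log(2.0 * np.pi * sigma2)
            + logabsdet
            - 0.5 * (resid @ resid) / sigma2)
```

Working on the log scale avoids overflow in the determinant when $n$ is large, which is one reason a sampler would typically evaluate this quantity rather than the likelihood itself.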

For simplicity, we use $\theta^* = (\alpha^* \;\; \beta^{*\prime})'$ and $\rho^*$ to denote the Bayesian estimators of $\theta$ and $\rho$, respectively, which are calculated from the conditional posterior distribution. Then, based on the residuals $\mathbf{y} - \rho^*\mathbf{W}\mathbf{y} - \mathbf{X}\beta^*$, we may estimate the nonparametric component $g(\cdot)$ by kernel estimation.

Let $\mathbf{S} = \mathbf{y} - \rho^*\mathbf{W}\mathbf{y} - \mathbf{X}\beta^*$; then we have the nominal nonparametric regression model

$$\mathbf{S} = g(\mathbf{T}) + \mathbf{e} \tag{6}$$

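We use $T_{is}$, $i = 1, 2, \ldots, n$, $s = 1, 2, \ldots, q$, to denote the $s$th component of $\mathbf{T}_i$ and let $g(\mathbf{t})$ denote the joint probability density function of the variable $\mathbf{T}_i$. Thus, for $\mathbf{t} = (t_1, t_2, \ldots, t_q)$, define

$$K_\lambda(\mathbf{t}, \mathbf{T}_i) = \frac{1}{\lambda_1 \lambda_2 \cdots \lambda_q} K\!\left(\frac{\mathbf{t} - \mathbf{T}_i}{\lambda}\right) = \prod_{s=1}^{q} \frac{1}{\lambda_s}\, q\!\left(\frac{t_s - T_{is}}{\lambda_s}\right) \tag{7}$$

where $q(\cdot)$ is a symmetric and nonnegative univariate kernel function (Li and Racine [16]), such as the Gaussian kernel or the Epanechnikov kernel, and $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_q)$ is the vector of smoothing parameters.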

According to the local constant kernel estimation method, $g(\mathbf{t})$ can be estimated by

$$\hat{g}(\mathbf{t}) = \frac{\sum_{i=1}^{n} S_i\, K_\lambda(\mathbf{t}, \mathbf{T}_i)}{\sum_{j=1}^{n} K_\lambda(\mathbf{t}, \mathbf{T}_j)} \tag{8}$$

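which is simply a weighted average of the $S_i$, because equation (8) can be rewritten as

$$\hat{g}(\mathbf{t}) = \sum_{i=1}^{n} \omega_{\lambda,i}(\mathbf{t})\, S_i = (\omega_\lambda(\mathbf{t}))'\mathbf{S} \tag{9}$$

where $\omega_{\lambda,i}(\mathbf{t}) = K_\lambda(\mathbf{t}, \mathbf{T}_i) \big/ \sum_{j=1}^{n} K_\lambda(\mathbf{t}, \mathbf{T}_j)$ is the weight attached to $S_i$, and $\omega_\lambda(\mathbf{t}) = (\omega_{\lambda,1}(\mathbf{t}), \omega_{\lambda,2}(\mathbf{t}), \ldots, \omega_{\lambda,n}(\mathbf{t}))'$.

Thus, the first-step estimators of the SPSAR model have been obtained for both the parametric components $\beta^*$ and $\rho^*$ and the nonparametric component $\hat{g}(\mathbf{t})$.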
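The local constant estimator with a product kernel can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation (the paper uses the R package "np"): a Gaussian univariate kernel is assumed, the bandwidths `lam` are taken as given rather than selected by cross-validation, and all function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(u):
    """Second-order Gaussian kernel q(u) = (2*pi)^(-1/2) * exp(-u^2 / 2)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def product_kernel(t, T, lam):
    """Product kernel K_lambda(t, T_i) for every row T_i of T.

    t: (q,) evaluation point, T: (n, q) sample, lam: (q,) bandwidths.
    """
    u = (t - T) / lam
    return np.prod(gaussian_kernel(u) / lam, axis=1)


def local_constant_fit(t, T, S, lam):
    """Weighted average of S with weights K(t, T_i) / sum_j K(t, T_j)."""
    k = product_kernel(t, T, lam)
    weights = k / k.sum()   # assumes at least one kernel value is nonzero
    return weights @ S
```

For example, `local_constant_fit(np.array([0.0]), T, S, np.array([1.0]))` returns the fitted value at $t = 0$ for a one-dimensional regressor stored as an `(n, 1)` array `T`.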

Second step estimation: The first step yields a preliminary estimate of the SPSAR model. However, ignoring $g(\mathbf{T})$ when estimating $\beta^*$ and $\rho^*$ may lead to inaccurate results. In addition, the transformation from model (1) to model (2) increases the variance of the error term from $\sigma_e^2$ to $\sigma^2$. Therefore, a second estimation step is needed to improve the accuracy of the estimators. The detailed procedure is as follows.

First, substituting equation (9) into model (1), we have

$$\mathbf{y} = \rho\mathbf{W}\mathbf{y} + \mathbf{X}\beta + (\omega_\lambda(\mathbf{T}))'\mathbf{S} + \mathbf{e} \tag{10}$$


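where $\omega_\lambda(\mathbf{T}) = (\omega_\lambda(\mathbf{T}_1), \omega_\lambda(\mathbf{T}_2), \ldots, \omega_\lambda(\mathbf{T}_n))$. Thus, the likelihood function of model (10) is given by

$$L(\mathbf{y} \mid \rho, \beta, \sigma_e^2) = (2\pi\sigma_e^2)^{-n/2}\, |\mathbf{A}| \exp\left\{ -\frac{1}{2\sigma_e^2} \left(\mathbf{A}\mathbf{y} - \mathbf{X}\beta - (\omega_\lambda(\mathbf{T}))'\mathbf{S}\right)' \left(\mathbf{A}\mathbf{y} - \mathbf{X}\beta - (\omega_\lambda(\mathbf{T}))'\mathbf{S}\right) \right\} \tag{11}$$

where $\mathbf{A} = \mathbf{I} - \rho\mathbf{W}$. Equation (11) has the same form as the likelihood function in equation (4) if we let $\mathbf{y}^* = \mathbf{y} - \mathbf{A}^{-1}(\omega_\lambda(\mathbf{T}))'\mathbf{S}$. Therefore, given the prior distributions, the Bayesian estimators of $\beta$ and $\rho$ can be calculated in the same way, and are denoted by $\hat{\beta}$ and $\hat{\rho}$, respectively.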
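The equivalence in form between the second-step likelihood and the first-step likelihood rests on one line of algebra, written out here as a worked step. It uses only the substitution for $\mathbf{y}^*$ given in the text and assumes $\mathbf{A}$ is invertible.

```latex
\begin{align*}
\mathbf{A}\mathbf{y}^*
  &= \mathbf{A}\left(\mathbf{y} - \mathbf{A}^{-1}(\omega_\lambda(\mathbf{T}))'\mathbf{S}\right)
   = \mathbf{A}\mathbf{y} - (\omega_\lambda(\mathbf{T}))'\mathbf{S}, \\
\mathbf{A}\mathbf{y}^* - \mathbf{X}\beta
  &= \mathbf{A}\mathbf{y} - \mathbf{X}\beta - (\omega_\lambda(\mathbf{T}))'\mathbf{S}.
\end{align*}
```

The quadratic form in the second-step likelihood is therefore the first-step quadratic form with $\mathbf{y}$ replaced by $\mathbf{y}^*$ and $\mathbf{Z}\theta$ replaced by $\mathbf{X}\beta$.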

DETAILS OF THE TWO-STEP BAYESIAN ESTIMATION

Prior distributions and posterior distribution

In this section, the posterior distribution is derived in detail. For Bayesian inference, we need to specify prior distributions for the unknown parameters of the model in order to obtain the posterior distribution. Because the likelihood in equation (4) resembles a normal likelihood (LeSage and Pace [17]), a normal-inverse gamma (NIG) prior distribution for the parameters $\theta$ and $\sigma^2$, that is, $(\theta, \sigma^2) \sim NIG(a, b)$, is introduced in this paper. However, the NIG prior for $\theta$ and $\sigma^2$ is not a conjugate distribution for this model. Since the parameter $\rho$ plays an important role in the model, we take a uniform prior over its feasible range, that is, $\rho \sim U(1/\lambda_{min}, 1/\lambda_{max})$, where $\lambda_{min}$ and $\lambda_{max}$ are the minimum and maximum eigenvalues of the spatial weight matrix $\mathbf{W}$, respectively.
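The support of the uniform prior on $\rho$ can be computed directly from the eigenvalues of the weight matrix. The sketch below is illustrative: the function name `rho_support` is hypothetical, and it assumes that the eigenvalues of $\mathbf{W}$ are real with $\lambda_{min} < 0 < \lambda_{max}$ (only the real parts are used).

```python
import numpy as np


def rho_support(W):
    """Return (1/lambda_min, 1/lambda_max) for the weight matrix W.

    Assumption: eigenvalues are real, with lambda_min < 0 < lambda_max;
    any numerically tiny imaginary parts are discarded.
    """
    eig = np.linalg.eigvals(W).real
    return 1.0 / eig.min(), 1.0 / eig.max()
```

For a row-normalized weight matrix the largest eigenvalue is 1, so the upper end of the interval is 1.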

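Thus, combining the likelihood function and the prior distributions by Bayes' rule, we give a formal statement of the Bayesian SAR model as follows.

$$\begin{aligned}
\mathbf{y} &= \rho\mathbf{W}\mathbf{y} + \mathbf{Z}\theta + \varepsilon \\
L(\mathbf{y} \mid \rho, \theta, \sigma^2) &= (2\pi\sigma^2)^{-n/2}\, |\mathbf{A}| \exp\left\{ -\frac{1}{2\sigma^2} (\mathbf{A}\mathbf{y} - \mathbf{Z}\theta)'(\mathbf{A}\mathbf{y} - \mathbf{Z}\theta) \right\} \\
\pi(\theta, \sigma^2) &= \pi(\theta \mid \sigma^2)\, \pi(\sigma^2) \sim N(\mathbf{c}, \sigma^2\mathbf{H})\, IG(a, b) \\
\pi(\rho) &\sim U(1/\lambda_{min}, 1/\lambda_{max})
\end{aligned} \tag{12}$$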

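Hence, the conditional posterior distribution is obtained as

$$\begin{aligned}
p(\theta, \sigma^2, \rho \mid \mathbf{y}) &\propto L(\mathbf{y} \mid \rho, \theta, \sigma^2)\, \pi(\theta, \sigma^2)\, \pi(\rho) \\
&\propto (\sigma^2)^{-\left(a^* + \frac{k+1}{2} + 1\right)}\, |\mathbf{A}| \exp\left\{ -\frac{1}{2\sigma^2} \left[ 2b^* + (\theta - \mathbf{c}^*)'(\mathbf{H}^*)^{-1}(\theta - \mathbf{c}^*) \right] \right\}
\end{aligned} \tag{13}$$

where

$$\begin{aligned}
a^* &= a + n/2, \\
b^* &= b + \left( \mathbf{c}'\mathbf{H}^{-1}\mathbf{c} + \mathbf{y}'\mathbf{A}'\mathbf{A}\mathbf{y} - (\mathbf{c}^*)'(\mathbf{H}^*)^{-1}\mathbf{c}^* \right)/2, \\
\mathbf{c}^* &= (\mathbf{Z}'\mathbf{Z} + \mathbf{H}^{-1})^{-1}(\mathbf{Z}'\mathbf{A}\mathbf{y} + \mathbf{H}^{-1}\mathbf{c}), \\
\mathbf{H}^* &= (\mathbf{Z}'\mathbf{Z} + \mathbf{H}^{-1})^{-1}.
\end{aligned}$$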

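Therefore, the conditional posterior distributions of $\rho$ and $\theta$ given the data are obtained as

$$\begin{aligned}
p(\rho \mid \mathbf{y}) &= \iint p(\rho, \theta, \sigma^2 \mid \mathbf{y})\, d\theta\, d\sigma^2 \\
p(\theta \mid \mathbf{y}) &= \iint p(\rho, \theta, \sigma^2 \mid \mathbf{y})\, d\rho\, d\sigma^2
\end{aligned} \tag{14}$$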

However, given the conditional posterior distributions in equation (14), the Bayesian estimators of $\theta$ and $\rho$ vary with the loss function. For example, if the loss function is the quadratic loss $L(\theta, \hat{\theta}) = (\hat{\theta} - \theta)^2$, where $\hat{\theta}$ is the estimator and $\theta$ is the true value, the Bayesian estimator is the mean of the conditional posterior distribution; if the loss function is the absolute loss $L(\theta, \hat{\theta}) = |\hat{\theta} - \theta|$, the Bayesian estimator is the median of the conditional posterior distribution. Therefore, the key to obtaining the Bayesian estimators of $\theta$ and $\rho$ is to obtain the conditional posterior distribution.
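When the posterior is represented by a set of draws, as it will be once a sampler is available, the two estimators mentioned above reduce to the sample mean and the sample median. The following is a small sketch; the function name `bayes_estimates` and the dictionary keys are illustrative choices, not notation from the paper.

```python
import numpy as np


def bayes_estimates(draws):
    """Point estimates from posterior draws (rows = draws, columns = parameters).

    Quadratic loss -> posterior mean; absolute loss -> posterior median.
    """
    draws = np.asarray(draws, dtype=float)
    return {
        "quadratic_loss": draws.mean(axis=0),
        "absolute_loss": np.median(draws, axis=0),
    }
```

For a skewed posterior the two estimates can differ noticeably, which is why the choice of loss function matters in practice.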

However, to obtain an analytical solution of equation (14), we need to integrate equation (13) over the domain of $\rho$, which involves evaluating the $n \times n$ determinant $|\mathbf{A}| = |\mathbf{I} - \rho\mathbf{W}|$. This is practically infeasible, especially when the weight matrix has many rows. Thus, a numerical approach should be used to obtain the Bayesian estimators, which is the topic of the following subsection.

Markov chain Monte Carlo algorithm

The Markov chain Monte Carlo (MCMC) algorithm is implemented to obtain the Bayesian estimators efficiently. Note that when the other parameters are treated as known, some conditional posterior distributions have an analytical form. For example, given $\rho$ and $\sigma^2$, the conditional posterior distribution of $\theta$ is $N(\mathbf{c}^*, \sigma^2\mathbf{H}^*)$. The conditional posterior distribution of $\rho$, in contrast, can only be written as

$$p(\rho \mid \theta, \sigma^2, \mathbf{y}) \propto |\mathbf{A}| \exp\left\{ -\frac{1}{2\sigma^2} (\mathbf{A}\mathbf{y} - \mathbf{Z}\theta)'(\mathbf{A}\mathbf{y} - \mathbf{Z}\theta) \right\} \tag{15}$$

Although generating samples from the normal distribution or other standard distributions is easy, equation (15) is not a standard distribution, so sampling for the parameter $\rho$ must proceed by an alternative approach, for which we choose the Metropolis-Hastings sampling method.

To make this clear, we summarize the MCMC algorithm as follows.

(1) Input $\theta^{(0)}$, $\sigma^{2(0)}$, $\rho^{(0)}$ as initial values, and let $m = 0$;

(2) Generate $\theta^{(m+1)}$ from $p(\theta^{(m+1)} \mid \sigma^{2(m)}, \rho^{(m)}) \sim N(\mathbf{c}^*, \sigma^{2(m)}\mathbf{H}^*)$, where the mean and variance are calculated from
$$\mathbf{c}^* = (\mathbf{Z}'\mathbf{Z} + \mathbf{H}^{-1})^{-1}\left(\mathbf{Z}'(\mathbf{I} - \rho^{(m)}\mathbf{W})\mathbf{y} + \mathbf{H}^{-1}\mathbf{c}\right), \qquad \mathbf{H}^* = (\mathbf{Z}'\mathbf{Z} + \mathbf{H}^{-1})^{-1};$$

(3) Generate $\sigma^{2(m+1)}$ from $p(\sigma^{2(m+1)} \mid \theta^{(m+1)}, \rho^{(m)}) \sim IG(a^*, b^*)$, where
$$a^* = a + n/2, \qquad b^* = b + (\mathbf{A}\mathbf{y} - \mathbf{Z}\theta^{(m+1)})'(\mathbf{A}\mathbf{y} - \mathbf{Z}\theta^{(m+1)})/2, \qquad \mathbf{A} = \mathbf{I} - \rho^{(m)}\mathbf{W};$$

(4) Generate $\rho^{(m+1)}$ from $p(\rho^{(m+1)} \mid \sigma^{2(m+1)}, \theta^{(m+1)})$ with the Metropolis-Hastings (M-H) algorithm as in LeSage and Pace [17];

(5) Update $\theta^{(m)} = \theta^{(m+1)}$, $\sigma^{2(m)} = \sigma^{2(m+1)}$, $\rho^{(m)} = \rho^{(m+1)}$ and repeat steps (2), (3) and (4) until the algorithm converges;

(6) Discard the first $B$ draws and treat the remainder as the sample for posterior analysis;

(7) Plot the posterior distribution or obtain summaries of the posterior distribution (mean, median, standard deviation, quartiles, and correlations).
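The steps above can be sketched as a Metropolis-within-Gibbs sampler in Python. This is a minimal sketch under stated assumptions rather than the authors' code: the function name `sar_gibbs`, the fixed random-walk proposal for $\rho$ with standard deviation `step`, the starting values, and the fixed number of iterations (instead of a convergence check) are all illustrative choices. The conditional distributions for $\theta$ and $\sigma^2$ follow the summary as stated in the text.

```python
import numpy as np


def sar_gibbs(y, W, Z, c, H, a, b, rho_bounds,
              n_iter=2000, burn_in=500, step=0.1, seed=0):
    """Draws of (theta, sigma2, rho) for y = rho*W*y + Z*theta + eps.

    Hypothetical sketch: proposal scale, start values and stopping rule
    are assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, p = Z.shape
    eye = np.eye(n)
    H_inv = np.linalg.inv(H)
    H_star = np.linalg.inv(Z.T @ Z + H_inv)
    chol = np.linalg.cholesky(H_star)
    lo, hi = rho_bounds
    theta, sigma2, rho = np.zeros(p), 1.0, 0.0   # step (1): initial values

    def log_cond_rho(r, th, s2):
        # Log of the conditional posterior of rho, up to a constant.
        A = eye - r * W
        _, logabsdet = np.linalg.slogdet(A)
        e = A @ y - Z @ th
        return logabsdet - 0.5 * (e @ e) / s2

    draws = np.empty((n_iter, p + 2))
    for m in range(n_iter):
        A = eye - rho * W
        # Step (2): theta | sigma2, rho ~ N(c*, sigma2 * H*).
        c_star = H_star @ (Z.T @ (A @ y) + H_inv @ c)
        theta = c_star + np.sqrt(sigma2) * (chol @ rng.standard_normal(p))
        # Step (3): sigma2 | theta, rho ~ IG(a*, b*), drawn as 1 / Gamma.
        e = A @ y - Z @ theta
        a_star = a + n / 2.0
        b_star = b + 0.5 * (e @ e)
        sigma2 = 1.0 / rng.gamma(shape=a_star, scale=1.0 / b_star)
        # Step (4): random-walk Metropolis-Hastings update for rho.
        prop = rho + step * rng.standard_normal()
        if lo < prop < hi:   # proposals outside the prior support are rejected
            log_ratio = (log_cond_rho(prop, theta, sigma2)
                         - log_cond_rho(rho, theta, sigma2))
            if np.log(rng.uniform()) < log_ratio:
                rho = prop
        draws[m] = np.concatenate([theta, [sigma2, rho]])   # step (5)
    return draws[burn_in:]   # step (6): drop the burn-in draws
```

The returned array has one row per retained draw, with the columns ordered as $\theta$, $\sigma^2$, $\rho$, so posterior summaries for step (7) can be computed column by column. In practice the proposal scale would be tuned to reach a reasonable acceptance rate.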

Modeling for mixed data

In estimating the SPSAR model, we usually presume that the underlying data $\mathbf{X}$ and $\mathbf{T}$ are continuous. However, when a data set contains a mix of continuous and discrete (i.e., nominal or ordinal categorical) data types, the previous approach, which treats the two data types identically, is inefficient. For example, in modeling hedonic home prices, school quality in the neighborhood (good, ordinary, poor) is discrete, while the price index in the district is continuous.

From a geometric point of view, when these two variables are included in the parametric linear part, fitting the data should generate two curves in space rather than a surface. Similarly, for the nonparametric part, fitting these two variables should produce two univariate density curves rather than a two-dimensional density surface. To prevent discrete data from being misused as continuous data, dummy variables and custom-designed kernel functions are introduced for the parametric and nonparametric parts, respectively.

For the parametric part, dummy variables work well in mixed data regression. Since restating the use of dummy variables would be redundant, readers who are not familiar with them may find examples in Wooldridge [18] and Rawlings et al. [19].

For the nonparametric part, the notation for continuous and discrete regressors, including ordered and non-ordered discrete regressors, should first be clarified. Let $g(\mathbf{T}) = (g(\mathbf{T}_1), g(\mathbf{T}_2), \ldots, g(\mathbf{T}_n))'$, where $g(\cdot)$ is an unknown function defined on $\mathbb{R}^p$ and $\mathbf{T}_i$ contains both continuous and discrete regressors. We use $\mathbf{T}_i^d$ to denote an $r \times 1$ vector of discrete regressors, $\mathbf{T}_i^c \in \mathbb{R}^q$, $q = p - r$, to denote the remaining continuous regressors, and let $\mathbf{T}_i = (\mathbf{T}_i^d, \mathbf{T}_i^c)$, $i = 1, 2, \ldots, n$. Moreover, we assume that some of the discrete regressors have a natural ordering, for example, environmental conditions (excellent, good, poor). Let $\bar{\mathbf{T}}_i^d$ denote an $r_1 \times 1$ vector of discrete regressors that do not have a natural ordering. Let $\tilde{\mathbf{T}}_i^d$ denote the remaining $r_2 = r - r_1$ discrete regressors that have a natural ordering, and define $\mathbf{T}_i^d = (\bar{\mathbf{T}}_i^d, \tilde{\mathbf{T}}_i^d)$. Again, $T_{is}^d$, $s = 1, 2, \ldots, r$, represents the $s$th component of $\mathbf{T}_i^d$, and $T_{is}^c$, $s = 1, 2, \ldots, q$, the $s$th component of $\mathbf{T}_i^c$, for $i = 1, 2, \ldots, n$, and let $g(\mathbf{t}) = g(\mathbf{t}^d, \mathbf{t}^c)$ denote the joint probability density function of the variable $\mathbf{T}_i = (\mathbf{T}_i^d, \mathbf{T}_i^c)$, $i = 1, 2, \ldots, n$.

Consider the discrete regressors $\bar{\mathbf{t}}^d = (t_1^d, t_2^d, \ldots, t_{r_1}^d)$ that do not have a natural ordering. Following Aitchison and Aitken [20], we define a discrete univariate kernel function,

$$l(t_s^d, T_{is}^d, h_s) = \begin{cases} 1 - h_s, & T_{is}^d = t_s^d \\ \dfrac{h_s}{c_s - 1}, & \text{otherwise} \end{cases} \tag{16}$$

where $t_s^d, T_{is}^d \in \{0, 1, \ldots, c_s - 1\}$, and the range of the smoothing parameter $h_s$, $s = 1, \ldots, r_1$, is $(0, (c_s - 1)/c_s)$.

For the discrete regressors $\tilde{\mathbf{t}}^d = (t_{r_1+1}^d, \ldots, t_r^d)$ that have a natural ordering, following Racine and Li [21], we take

$$l(t_s^d, T_{is}^d, h_s) = \begin{cases} 1, & T_{is}^d = t_s^d \\ h_s^{|t_s^d - T_{is}^d|}, & \text{otherwise} \end{cases} \tag{17}$$


Combining equations (16) and (17), we obtain the product of kernel functions for the discrete regressors as follows.

$$L(\mathbf{t}^d, \mathbf{T}_i^d, \mathbf{h}) = \prod_{s=1}^{r_1} \left(\frac{h_s}{c_s - 1}\right)^{N_{is}(x)} (1 - h_s)^{1 - N_{is}(x)} \prod_{s=r_1+1}^{r} h_s^{|t_s^d - T_{is}^d|} \tag{18}$$

where $\mathbf{h} = (h_1, h_2, \ldots, h_r)'$, and $N_{is}(x) = \mathbf{1}(T_{is}^d \neq t_s^d)$ is an indicator function, which equals 1 if $T_{is}^d \neq t_s^d$, and 0 otherwise.
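The two discrete kernels and their product can be sketched in Python as follows. The function names and the array layout (one row per observation, one column per regressor) are assumptions made for illustration; the formulas follow the unordered and ordered kernels defined above.

```python
import numpy as np


def unordered_kernel(t, T, h, c):
    """Unordered kernel: 1 - h on a match, h / (c - 1) otherwise.

    t: (r1,) evaluation point, T: (n, r1) sample,
    h: (r1,) smoothing parameters, c: (r1,) number of categories.
    """
    return np.where(T == t, 1.0 - h, h / (c - 1.0))


def ordered_kernel(t, T, h):
    """Ordered kernel: h ** |t - T|, which equals 1 on a match."""
    return np.power(h, np.abs(t - T))


def discrete_product_kernel(t_unord, T_unord, h_unord, c, t_ord, T_ord, h_ord):
    """Product over all discrete regressors, one value per observation."""
    ku = unordered_kernel(t_unord, T_unord, h_unord, c).prod(axis=1)
    ko = ordered_kernel(t_ord, T_ord, h_ord).prod(axis=1)
    return ku * ko
```

With $h_s = 0$ both kernels reduce to an exact-match indicator, so the estimator only averages observations in the same cell; larger values borrow information from neighboring categories.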

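For the continuous regressors $\mathbf{t}^c = (t_1^c, t_2^c, \ldots, t_q^c)$, we choose a product kernel $Q(\cdot)$ and a vector of smoothing parameters $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_q)'$. Let

$$Q_\lambda(\mathbf{t}^c, \mathbf{T}_i^c) = \frac{1}{\lambda_1 \lambda_2 \cdots \lambda_q} Q\!\left(\frac{\mathbf{t}^c - \mathbf{T}_i^c}{\lambda}\right) = \prod_{s=1}^{q} \frac{1}{\lambda_s}\, q\!\left(\frac{t_s^c - T_{is}^c}{\lambda_s}\right) \tag{19}$$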

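Thus, combining equations (18) and (19), we obtain the product of kernel functions for all the mixed variables,

$$K_\eta(\mathbf{t}, \mathbf{T}_i) = Q_\lambda(\mathbf{t}^c, \mathbf{T}_i^c)\, L(\mathbf{t}^d, \mathbf{T}_i^d, \mathbf{h}) \tag{20}$$

where $\eta = (\lambda, \mathbf{h})$. Thus, as in equation (8), $g(\mathbf{t})$ for mixed data can be estimated by

$$\hat{g}(\mathbf{t}) = \frac{\sum_{i=1}^{n} S_i\, K_\eta(\mathbf{t}, \mathbf{T}_i)}{\sum_{j=1}^{n} K_\eta(\mathbf{t}, \mathbf{T}_j)} = \frac{\displaystyle\sum_{i=1}^{n} S_i \prod_{s=1}^{q} \frac{1}{\lambda_s}\, q\!\left(\frac{t_s^c - T_{is}^c}{\lambda_s}\right) \prod_{s=1}^{r_1} \left(\frac{h_s}{c_s - 1}\right)^{N_{is}(x)} (1 - h_s)^{1 - N_{is}(x)} \prod_{s=r_1+1}^{r} h_s^{|t_s^d - T_{is}^d|}}{\displaystyle\sum_{j=1}^{n} \prod_{s=1}^{q} \frac{1}{\lambda_s}\, q\!\left(\frac{t_s^c - T_{js}^c}{\lambda_s}\right) \prod_{s=1}^{r_1} \left(\frac{h_s}{c_s - 1}\right)^{N_{js}(x)} (1 - h_s)^{1 - N_{js}(x)} \prod_{s=r_1+1}^{r} h_s^{|t_s^d - T_{js}^d|}} \tag{21}$$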

This completes the description of the estimation procedure for the mixed data case.

RESULTS AND DISCUSSION

Simulation study

In this section, Monte Carlo experiments with different values of $\rho$ and $\beta$ are conducted to evaluate the performance of our two-step Bayesian estimation. To generate the data, we adopt rules similar to those suggested by Su and Yang [22] and Su [10]. First, in the parametric component of the SPSAR model, the continuous data $\mathbf{X}_i^c$ are i.i.d., each equal to the sum of 48 independent random variables from a given distribution, such as the uniform distribution $U(-0.25, 0.25)$ or the standard normal distribution $N(0, 1)$; for the discrete data $\mathbf{X}_i^d$, we assume $P(\mathbf{X}_i^d = l) = p_l$ for $l = 0, 1$ with $0 < p_l < 1$. Second, in the nonparametric component of the SPSAR model, the continuous data $\mathbf{T}_i^c$ are i.i.d. on $U(-5, 5)$; the discrete data $\mathbf{T}_i^d$ are sampled in the same way as $\mathbf{X}_i^d$.

However, a simpler scheme than the Rook contiguity used in Su and Yang [22] is adopted to generate the weight matrix. The elements of the weight matrix are generated by the following rules. First, in row $i$ ($i = 2, 3, \ldots, n-1$), the elements in columns $i-1$ and $i+1$ are nonzero and equal, while all other elements are zero. For symmetry, in the first and $n$th rows we set $\mathbf{W}(1, 2) = 1$ and $\mathbf{W}(n, n-1) = 1$, with all other elements zero. Second, each row is normalized to sum to one.
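The weight matrix described above can be built in a few lines. This is a sketch of the stated rules; the function name `chain_weight_matrix` is hypothetical, and $n \geq 2$ is assumed so that every row has at least one neighbor.

```python
import numpy as np


def chain_weight_matrix(n):
    """Row-normalized weights with neighbors i-1 and i+1 (assumes n >= 2)."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0   # right neighbor of rows 1..n-1
    W[idx + 1, idx] = 1.0   # left neighbor of rows 2..n
    return W / W.sum(axis=1, keepdims=True)   # each row sums to one
```

After normalization, interior rows carry weight 0.5 on each neighbor, while the first and last rows carry weight 1 on their single neighbor.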

In addition, to implement the estimation, the kernel function for the continuous regressors and the bandwidth sequences $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_q)$ and $\mathbf{h} = (h_1, h_2, \ldots, h_r)$ need to be specified. In the experiments, we choose the second-order Gaussian kernel function $q(x) = (2\pi)^{-1/2} \exp(-x^2/2)$ for the continuous regressors, and the bandwidths are selected by the least-squares cross-validation method discussed by Li and Racine [16]. Note that the R package "np" by Hayfield and Racine [23] is used for kernel estimation.

A detailed description of data generation for the two data generation processes is given below.

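$$\text{DGP1}: \quad \mathbf{y} = \rho\mathbf{W}\mathbf{y} + \beta_1\mathbf{X}_1^c + \beta_2\mathbf{X}_2^d + \beta_3\mathbf{X}_3^d + \sin(0.5\pi\,\mathbf{T}_1^c) + \exp(-\mathbf{T}_1^c/5) + \mathbf{T}_1^c + \mathbf{e}$$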

1 1 2 2

33 3 2 1 1

31 2

2 :

sin(0.5* ) cos(0.5* )

exp( / 5) .

ρ β β

β

= + + +

+ + +

− + +

c c

dd d c c

dc d

DGP y Wy X X

X T T T T

T T T e

In DGP1, $X_{1i}^c$, $i = 1, 2, \ldots, n$, is the $i$th component of $X_1^c$; these components are independent and identically distributed, and each equals the sum of 48 independent random variables that all follow the standard normal distribution $N(0,1)$. $X_{2i}^d$ satisfies $P(X_{2i}^d = 0) = 0.35$, $P(X_{2i}^d = 1) = 0.65$, and $X_{3i}^d$ satisfies $P(X_{3i}^d = 0) = 0.2$, $P(X_{3i}^d = 1) = 0.8$, for $i = 1, 2, \ldots, n$. $T_1^c$ is sampled from the uniform distribution $U(-5, 5)$. In addition, the error term, without heteroscedasticity, is drawn from the standard normal distribution. Finally, $\beta = (\beta_1, \beta_2, \beta_3)'$ is specified as $\beta = (10, 8, 5)'$.
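A simulation along the lines of DGP1 can be sketched as follows. The regressors follow the description above; the nonparametric component $g(T) = T + \sin(0.5\pi T) + \exp(-T/5)$ is an assumption made for illustration, as are the function names, the seed and the solver. The response is obtained from the reduced form $y = (I - \rho W)^{-1}(X\beta + g(T) + e)$, computed here by fixed-point iteration on $y = \rho W y + b$, using the neighbour weight matrix described earlier (units $i-1$ and $i+1$, row-normalized).

```python
import math
import random


def apply_w(v):
    """Multiply by the row-normalized neighbour matrix W without storing it."""
    n = len(v)
    out = [0.0] * n
    for i in range(n):
        if i == 0:
            out[i] = v[1]
        elif i == n - 1:
            out[i] = v[n - 2]
        else:
            out[i] = 0.5 * (v[i - 1] + v[i + 1])
    return out


def simulate_dgp1(n=100, rho=0.6, beta=(10.0, 8.0, 5.0), seed=0):
    rng = random.Random(seed)
    # X1: each entry is the sum of 48 independent N(0, 1) draws.
    x1 = [sum(rng.gauss(0.0, 1.0) for _ in range(48)) for _ in range(n)]
    # X2, X3: binary regressors with P(X2 = 1) = 0.65 and P(X3 = 1) = 0.8.
    x2 = [1.0 if rng.random() < 0.65 else 0.0 for _ in range(n)]
    x3 = [1.0 if rng.random() < 0.80 else 0.0 for _ in range(n)]
    t = [rng.uniform(-5.0, 5.0) for _ in range(n)]
    e = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Assumed nonparametric component g(T); illustrative only.
    g = [ti + math.sin(0.5 * math.pi * ti) + math.exp(-ti / 5.0) for ti in t]
    b = [beta[0] * x1[i] + beta[1] * x2[i] + beta[2] * x3[i] + g[i] + e[i]
         for i in range(n)]
    # Solve y = rho * W y + b by fixed-point iteration; this is a contraction
    # because |rho| < 1 and every row of W sums to one.
    y = b[:]
    for _ in range(5000):
        wy = apply_w(y)
        y_new = [rho * wy[i] + b[i] for i in range(n)]
        done = max(abs(y_new[i] - y[i]) for i in range(n)) < 1e-10
        y = y_new
        if done:
            break
    return {"y": y, "x": (x1, x2, x3), "t": t, "b": b}


sim = simulate_dgp1()
```

Calling `simulate_dgp1(rho=r)` for each `r` in the grid of spatial parameters gives one simulated data set per experiment.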

In DGP2, let $X_{1i}^c$ be independent and identically distributed, each equal to the sum of 48 independent random variables following the standard normal distribution $N(0,1)$, for $i = 1, 2, \ldots, n$. Each element of $X_2^c$ equals the sum of 48 random variables that are i.i.d. on $U(-0.25, 0.25)$, and $X_{3i}^d$ satisfies $P(X_{3i}^d = 0) = 0.2$, $P(X_{3i}^d = 1) = 0.8$. $T_1^c$ is sampled from the uniform distribution $U(-5, 5)$, while $T_{2i}^d$ satisfies $P(T_{2i}^d = 0) = 0.35$, $P(T_{2i}^d = 1) = 0.65$. In addition, $T_{3i}^d$ satisfies $P(T_{3i}^d = 0) = 0.15$, $P(T_{3i}^d = 1) = 0.45$, $P(T_{3i}^d = 2) = 0.2$ and $P(T_{3i}^d = 3) = 0.2$, for $i = 1, 2, \ldots, n$. The error term, without heteroscedasticity, is again drawn from the standard normal distribution. Finally, $\beta = (6, 12, 15)'$ is chosen.

For the spatial parameter $\rho$, the nine cases $\rho = -0.8, -0.6, \ldots, 0.6, 0.8$ are considered, and each experiment uses a sample size of $n = 100$.

Using the data generated from the rules and DGPs above, we estimated $\rho$ and $\beta$ with the two-step Bayesian estimation method proposed in this paper. To measure the differences between the predicted values and the values actually observed, the Root Mean Square Error (RMSE) is introduced as

$$\mathrm{RMSE}(\hat{\theta}, \theta) = \sqrt{\frac{\sum_{i=1}^{m} (\hat{\theta}_i - \theta_i)^2}{m}} \qquad (22)$$

where $\theta = (\theta_1, \theta_2, \ldots, \theta_m)'$ is the true value, $\hat{\theta} = (\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_m)'$ is the estimator of $\theta$, and $m$ is the length of $\theta$. For example, $\mathrm{RMSE}(\beta^*, \beta)$ measures the accuracy of the first-step estimator and $\mathrm{RMSE}(\hat{g}(T), g(T))$ indicates the accuracy of the nonparametric estimation.
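As a small worked example, the RMSE definition above can be coded directly and checked against the reported results for DGP1 with $\rho = 0.8$, where the first-step estimate $(10.0400, 8.2073, 4.4735)$ and the second-step estimate $(10.0424, 8.0700, 4.8891)$ of $\beta = (10, 8, 5)'$ are taken from Table 1. The function name `rmse` is our own choice.

```python
import math


def rmse(theta_hat, theta):
    """Root mean square error between an estimate and the true parameter vector."""
    if len(theta_hat) != len(theta) or len(theta) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    m = len(theta)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(theta_hat, theta)) / m)


beta_true = (10.0, 8.0, 5.0)
first_step = rmse((10.0400, 8.2073, 4.4735), beta_true)   # about 0.3275
second_step = rmse((10.0424, 8.0700, 4.8891), beta_true)  # about 0.0796
```

These two values agree with the corresponding RMSE entries reported for $\rho = 0.8$ in Table 2.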

The estimated results and the associated RMSE values obtained by the two-step Bayesian inference for DGP1 are shown in Table 1 and Table 2, respectively. Some important findings are summarized as follows. First, the first-step estimators, the second-step estimators and the true values of $\rho$ and $\beta$ presented in Table 1 differ only slightly for all $\rho = -0.8, -0.6, \ldots, 0.6, 0.8$. Second, the second-step estimators of both $\rho$ and $\beta$ may be accepted as good estimators. This provides simulation evidence to support the result proved by Chai et al. [14] that the estimators of $\rho$ and $\beta$ converge almost surely to the true values. Third, the RMSE results shown in Table 2 indicate that the second-step estimators do improve the accuracy of estimation. In fact, using the kernel estimation result rather than the residuals brings the estimation procedure closer to the real model when the kernel estimator approximates the nonparametric part effectively. Finally, $\mathrm{RMSE}(\hat{y}, y)$, which reflects a general measurement of the estimation procedure, shows that our method performs well.

For the case of mixed data in DGP2, Table 3 demonstrates findings similar to those in Table 1, and Table 4 compares the RMSE results obtained when the original data are treated as mixed data and as continuous data, respectively. Misusing mixed data as continuous data will slightly increase the error of the estimators. Hence, when fitting the SPSAR model to mixed data, a correct understanding of the data type will benefit the estimation results. In circumstances that require high precision, even a slight improvement may make a big difference.

For the estimation of the nonparametric component, the kernel estimation results shown in Figure 1 for DGP1 and Figure 2 for DGP2 demonstrate that the residuals of the first-step estimation fit the true values well. Also, the kernel estimators follow the true values closely and capture the main trend of the nonparametric part. However, an interesting and significant phenomenon should be emphasized. Highly correlated data of $y$ and $T$ will lead to inaccurate, sometimes completely wrong,

| $\rho$ | $\rho^*$ | $\hat{\rho}$ | $\beta$ | $\beta^*$ | $\hat{\beta}$ |
|---|---|---|---|---|---|
| 0.8 | 0.8033 | 0.8003 | (10.0000, 8.0000, 5.0000) | (10.0400, 8.2073, 4.4735) | (10.0424, 8.0700, 4.8891) |
| 0.6 | 0.6004 | 0.5999 | (10.0000, 8.0000, 5.0000) | (10.0027, 8.3400, 4.6680) | (10.0155, 8.1609, 4.8657) |
| 0.4 | 0.006 | 0.4031 | (10.0000, 8.0000, 5.0000) | (9.9537, 8.3284, 4.8391) | (9.9599, 8.1481, 5.0462) |
| 0.2 | 0.197 | 0.1991 | (10.0000, 8.0000, 5.0000) | (9.9782, 8.2568, 4.4042) | (9.9879, 8.1383, 4.5671) |
| 0 | 0.0074 | -0.0009 | (10.0000, 8.0000, 5.0000) | (10.0550, 7.9321, 4.6904) | (10.0324, 7.9474, 4.8309) |
| -0.2 | -0.1934 | -0.1964 | (10.0000, 8.0000, 5.0000) | (10.0280, 8.4140, 4.8494) | (10.0156, 8.3617, 4.9744) |
| -0.4 | -0.4005 | -0.4003 | (10.0000, 8.0000, 5.0000) | (9.9776, 8.3804, 4.6384) | (9.9911, 8.0606, 4.9131) |
| -0.6 | -0.5981 | -0.6005 | (10.0000, 8.0000, 5.0000) | (10.0261, 7.8434, 4.4664) | (10.0001, 7.5898, 4.7314) |
| -0.8 | -0.8027 | -0.8027 | (10.0000, 8.0000, 5.0000) | (9.9596, 8.3890, 4.3865) | (9.9746, 8.2009, 4.6450) |

Table 1: Estimated Results from DGP1.

Abbreviations: DGP: Data Generation Process


| $\rho$ | $\mathrm{RMSE}(\rho^*, \rho)$ | $\mathrm{RMSE}(\hat{\rho}, \rho)$ | $\mathrm{RMSE}(\beta^*, \beta)$ | $\mathrm{RMSE}(\hat{\beta}, \beta)$ | $\mathrm{RMSE}(\hat{g}(T), g(T))$ | $\mathrm{RMSE}(\hat{y}, y)$ |
|---|---|---|---|---|---|---|
| 0.8 | 0.0033 | 0.0003 | 0.3275 | 0.0796 | 0.4538 | 2.0663 |
| 0.6 | 0.0004 | 0.0001 | 0.2744 | 0.1213 | 0.3414 | 1.1482 |
| 0.4 | 0.006 | 0.0031 | 0.2128 | 0.0925 | 0.4006 | 0.8287 |
| 0.2 | 0.0027 | 0.0009 | 0.3748 | 0.2625 | 0.4839 | 0.8285 |
| 0 | 0.0074 | 0.0009 | 0.1857 | 0.104 | 0.3412 | 1.0257 |
| -0.2 | 0.0066 | 0.0036 | 0.2549 | 0.2096 | 0.3747 | 1.0328 |
| -0.4 | 0.0005 | 0.0003 | 0.3033 | 0.0614 | 0.4539 | 0.8511 |
| -0.6 | 0.0019 | 0.0005 | 0.3214 | 0.2831 | 0.6803 | 1.1426 |
| -0.8 | 0.0027 | 0.0027 | 0.42 | 0.2359 | 0.3336 | 2.4139 |

Table 2: RMSE Results from DGP1.

Abbreviations: RMSE: Root Mean Square Error; DGP: Data Generation Process.

| $\rho$ | $\rho^*$ | $\hat{\rho}$ | $\beta$ | $\beta^*$ | $\hat{\beta}$ |
|---|---|---|---|---|---|
| 0.8 | 0.7926 | 0.7938 | (6.0000, 12.0000, 15.0000) | (6.0055, 11.7217, 15.0704) | (5.9986, 11.8385, 15.0285) |
| 0.6 | 0.5975 | 0.5971 | (6.0000, 12.0000, 15.0000) | (6.0242, 11.9450, 14.7962) | (6.0288, 12.0254, 14.8193) |
| 0.4 | 0.3974 | 0.3983 | (6.0000, 12.0000, 15.0000) | (5.9684, 11.9196, 15.1254) | (5.9757, 11.9741, 15.1985) |
| 0.2 | 0.1978 | 0.1997 | (6.0000, 12.0000, 15.0000) | (5.9928, 11.9639, 15.3708) | (5.9972, 11.9518, 15.3650) |
| 0 | 0.009 | 0.0089 | (6.0000, 12.0000, 15.0000) | (6.0418, 11.7354, 14.6292) | (6.0312, 11.8116, 14.6510) |
| -0.2 | -0.2009 | -0.2005 | (6.0000, 12.0000, 15.0000) | (6.0013, 12.1514, 15.0371) | (5.9866, 12.1907, 15.0622) |
| -0.4 | -0.3991 | -0.4001 | (6.0000, 12.0000, 15.0000) | (6.0034, 11.6794, 15.3487) | (5.9996, 11.6497, 15.2462) |
| -0.6 | -0.5994 | -0.6004 | (6.0000, 12.0000, 15.0000) | (5.9887, 11.9799, 14.8130) | (5.9836, 11.9807, 14.8639) |
| -0.8 | -0.8011 | -0.8 | (6.0000, 12.0000, 15.0000) | (5.9552, 11.7612, 14.7693) | (5.9698, 11.8908, 14.8111) |

Table 3: Estimated Results from DGP2.

Abbreviations: DGP: Data Generation Process

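| $\rho$ | $\mathrm{RMSE}(\rho^*, \rho)$ | $\mathrm{RMSE}(\hat{\rho}, \rho)$ | $\mathrm{RMSE}(\beta^*, \beta)$ | $\mathrm{RMSE}(\hat{\beta}, \beta)$ | $\mathrm{RMSE}(\hat{g}(T), g(T))$ | $\mathrm{RMSE}(\hat{y}, y)$ |
|---|---|---|---|---|---|---|
| 0.8 | 0.0074 | 0.0062 | 0.1657 | 0.0947 | 0.7982 | 1.6852 |
| | - | + | - | + | + | - |
| 0.6 | 0.0025 | 0.0029 | 0.1227 | 0.1067 | 0.7215 | 0.9496 |
| | - | - | - | - | + | + |
| 0.4 | 4.00E-04 | 0 | 0.427 | 0.4006 | 0.7025 | 0.8577 |
| | - | - | - | + | + | - |
| 0.2 | 0.0084 | 0.0069 | 0.2735 | 0.2221 | 0.5344 | 0.7116 |
| | - | + | - | - | + | - |
| 0 | 0.009 | 0.0089 | 0.2641 | 0.2297 | 0.5705 | 0.9229 |
| | - | + | - | - | - | - |
| -0.2 | 9.00E-04 | 5.00E-04 | 0.09 | 0.1161 | 0.4981 | 0.8199 |
| | - | - | - | - | + | - |
| -0.4 | 9.00E-04 | 1.00E-04 | 0.2735 | 0.2472 | 0.6426 | 1.003 |
| | - | - | - | + | + | - |
| -0.6 | 6.00E-04 | 4.00E-04 | 0.1088 | 0.0799 | 0.5849 | 1.0463 |
| | - | - | - | - | + | + |
| -0.8 | 0.0011 | 0 | 0.1934 | 0.1272 | 0.5174 | 1.8354 |
| | - | - | - | - | - | + |

Table 4: RMSE Results from DGP2.

Abbreviations: RMSE: Root Mean Square Error; DGP: Data Generation Process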


estimators of $\rho$ and $\beta$. Taking DGP1 with $\rho = 0.6$ as an example, Table 5 reveals that a higher correlation between $y$ and $T$ is associated with higher RMSEs between the estimators and the true values. The same finding for the estimation of the nonparametric component can be seen in Figure 3. The reason why high correlation damages the accuracy of our estimator may be that high correlation conflicts with the assumption of the SPSAR model that $\{e_i\}$ and $T_i$, $i = 1, 2, \ldots, n$, are independent. This finding indicates that it is important to evaluate the correlation between $y$ and $T$ before fitting the SPSAR model.
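Such a pre-check can be as simple as computing the sample (Pearson) correlation between $y$ and $T$ before fitting. The sketch below is illustrative only: the function names are hypothetical, and the cut-off of 0.3 is an arbitrary value chosen for the example, not a threshold established in this paper.

```python
import math


def pearson_corr(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_b)


def correlation_flag(y, t, threshold=0.3):
    """Return (correlation, warning flag); the threshold is an illustrative assumption."""
    r = pearson_corr(y, t)
    return r, abs(r) > threshold


r_yt, needs_attention = correlation_flag([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 1.0, -1.0])
```

When the flag is raised, the estimates of $\rho$ and $\beta$ should be interpreted with caution, in line with the pattern reported in Table 5.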

APPLICATION IN MODELING SPATIAL KNOWLEDGE SPILLOVERS

In this section, we apply the SPSAR model and the associated two-step Bayesian inferential approach to modeling spatial knowledge spillovers. The data set, collected from the 2011 statistical yearbooks of the four Provinces in China, consists of 54 observations, each taken from one city in these four Provinces.

Knowledge and technological progress are the main engines of economic dynamics in most endogenous growth models (Romer [24]). An interesting aspect of this perspective is how spatial correlation affects knowledge spillovers. The conceptual framework for analyzing the geographic spillovers of university research is based on the two-factor Cobb-Douglas knowledge production function (KPF) of Griliches [25] and Jaffe [26]. Formally, this can be expressed as

$$\log(K) = \beta_1 \log(R) + \beta_2 \log(U) + \varepsilon \qquad (23)$$

Figure 1 Kernel Estimation of DGP1.


Figure 2 Kernel Estimation of DGP2.

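| $\rho(y, T)$ | $\mathrm{RMSE}(\rho^*, \rho)$ | $\mathrm{RMSE}(\hat{\rho}, \rho)$ | $\mathrm{RMSE}(\beta^*, \beta)$ | $\mathrm{RMSE}(\hat{\beta}, \beta)$ |
|---|---|---|---|---|
| 0.5545 | 0.1563 | 0.079 | 0.2701 | 0.1922 |
| 0.7072 | 0.1681 | 0.0985 | 0.3332 | 0.1137 |
| 0.4546 | 0.1557 | 0.0703 | 0.1676 | 0.2493 |
| 0.3625 | 0.105 | 0.0279 | 0.1934 | 0.1922 |
| -0.0035 | 0.0071 | 0.008 | 0.0915 | 0.083 |
| 0.1058 | 0.0263 | 0.0129 | 0.2032 | 0.2519 |
| 0.0877 | 0.0133 | 0.0098 | 0.2233 | 0.1317 |

Table 5: Correlation between y and T.

Abbreviations: RMSE: Root Mean Square Error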

In equation (23), $K$, $R$ and $U$ denote a proxy for knowledge (mostly patents), industry research and development, and university research, respectively.

In this paper, we extend this basic model to a semiparametric spatial autoregressive model to accommodate the case of knowledge spillovers in four Provinces in China as follows.

$$\log(K) = \rho W \log(K) + \theta_1 \log(RD) + \theta_2 \log(HC) + \theta_3 \log(GDP) + \theta_4 W \log(RD) + g(DI) + \varepsilon \qquad (24)$$

where $K$ is the number of patents, $RD$ is the research and development expenditure, $HC$ is the human capital, namely the persons engaged in innovation activities, $GDP$ is the gross domestic product of a city, and $W$ is the spatial weight matrix, which is pre-specified based on the distance between different cities.

Figure 3 Relationship between Correlation ρ(y, T) and Kernel Estimation.

Using the two-step Bayesian inferential approach proposed in this paper, we obtain the estimated results summarized in Table 6. All the variables are significant at the 95% confidence level. The positive coefficients of log(RD), log(GDP) and log(HC) suggest that increases in R&D expenditure, gross domestic product and the number of people engaged in R&D will increase knowledge production. Note that the spatial coefficient $\rho$ is significantly larger than 0; thus, the strong spatial spillovers recognized in the data set provide evidence that the SPSAR model and the associated method perform well.

CONCLUSION

In this paper, we propose a two-step Bayesian inferential approach for the semiparametric spatial autoregressive model. In the first step, a transformed SAR model is fitted to the data with the Bayesian estimation method, and then the residuals of the transformed SAR model are smoothed via a nonparametric kernel estimator; in the second step, we substitute the kernel estimator into the SPSAR model to re-estimate the parameters. The method proposed in this paper is quite different from those discussed by Su and Jin [9] and Su [10]. They first estimated the nonparametric component of the SPSAR model while assuming the parameters $\rho$ and $\beta$ to be known, and then applied a parameter estimation method, such as profile quasi-maximum likelihood estimation or the generalized method of moments, to obtain $\rho$ and $\beta$. In this paper, we first estimate the parameters $\rho$ and $\beta$ while treating the nonparametric component as constant, and then the kernel estimation method is applied to estimate the nonparametric component based on the residuals of the previous step; finally, we estimate the parameters $\rho$ and $\beta$ again using the results of the kernel estimation. The results of the simulation study show that our method performs well in estimating both the parametric and nonparametric parts. The second-step estimation of the parameters offers a second chance to correct the estimators, which makes the result more accurate. In fact, if we iterate the second step, the result may be closer to the true value. Moreover, the simulation results show that the correlation between $y$ and $T$ needs to be checked to avoid misestimating the SPSAR model. In addition, the simulation study shows that correctly recognizing mixed data provides more accurate results than treating all the data as continuous. Finally, an application to modeling spatial knowledge spillovers of cities in the four Provinces in China also supports the SPSAR model and our estimation method.

Several extensions can be pursued in further research. First, one can extend the analysis by considering an error term with heteroscedasticity. LeSage and Pace [17] described an effective solution that specifies a set of prior distributions for the variance in the Bayesian approach. Second, one can extend this model to panel data. However, the performance of our approach in these extended cases needs further research.

ACKNOWLEDGEMENTS

This work was partially supported by the Fundamental Research Funds for the Universities of China (2013-1a-040) and the Postdoctoral Funds of China (20100471168).

Conflict of interest

We declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work.

REFERENCES

1. Anselin L. Spatial Econometrics: Methods and Models. Dordrecht: Kluwer Academic. 1988.

2. Paelinck JH. Spatial econometrics. Saxon House. 1979.

3. Fotheringham AS, Rogerson P. The SAGE Handbook of Spatial Analysis. SAGE Publications. 2009.

4. Kelejian HH, Prucha IR. HAC estimation in a spatial framework. Journal of Econometrics. 2007; 140: 131-154.

5. Pinkse J, Slade ME, Brett C. Spatial price competition: A semiparametric approach. Econometrica. 2002; 70: 1111-1153.

6. Gibbons S, Machin S. Valuing English primary schools. Journal of Urban Economics. 2003; 53: 197-219.

7. Gress B. Semiparametric spatial autocovariance models. Ph.D. thesis, University of California. 2004.

8. Basile R, Gress B. Region et Development. 2005; 21: 97-118.

9. Su L, Jin S. Profile quasi-maximum likelihood estimation of partially linear spatial autoregressive models. Journal of Econometrics. 2010; 157: 18-33.

10. Su L. Semiparametric GMM estimation of spatial autoregressive models. Journal of Econometrics. 2012; 167: 543-560.

11. LeSage JP. Bayesian estimation of spatial autoregressive models. International Regional Science Review. 1997; 20: 113-129.

12. Robinson PM. Root-N-consistent semiparametric regression. Econometrica. 1988; 56: 931-954.

13. Andrews DWK. Asymptotics for semiparametric econometric models via stochastic equicontinuity. Econometrica. 1994; 62: 43-72.

14. Chai G, Sun P, Jiang J. Two stage estimation in semiparametric model. Acta Mathematicae Applicatae Sinica. 1995; 18: 353-363.

15. Hardle W, Liang H, Gao J. Partially Linear Models. Physica Verlag Heidelberg. 2000.

16. Li Q, Racine JS. Nonparametric Econometrics: Theory and Practice. Princeton University Press. 2006.

17. LeSage JP, Pace RK. Introduction to Spatial Econometrics. CRC Press. 2009.

18. Wooldridge J. Introductory Econometrics: A Modern Approach. South Western College. 2008.

19. Rawlings JO, Pantula SG, Dickey DA. Applied Regression Analysis: A Research Tool. Springer. 2001.

20. Aitchison J, Aitken CGG. Multivariate binary discrimination by the kernel method. Biometrika. 1976; 63: 413-420.

21. Racine J, Li Q. Nonparametric estimation of regression functions with both categorical and continuous data. Journal of Econometrics. 2004; 119: 99-130.

22. Su L, Yang S. Instrumental variable quantile estimation of spatial autoregressive models. Development Economics Working Papers 22476, East Asian Bureau of Economic Research. 1989.

23. Hayfield T, Racine JS. Nonparametric Econometrics: The np Package. Journal of Statistical Software. 2008; 27.

24. Romer PM. Endogenous technological change. Journal of Political Economy. 1990; 98: 71-102.

25. Griliches Z. Issues in assessing the contribution of research and development to productivity growth. Bell Journal of Economics. 1979; 10: 92-116.

26. Jaffe A. Real effects of academic research. American Economic Review. 1989; 79: 957-970.

| Variable | Coefficient | Standard Deviation | p-Value |
|---|---|---|---|
| $\rho$ | 0.5662 | 0.128 | 0 |
| log(RD) | 0.6312 | 0.111 | 0 |
| log(GDP) | 0.2102 | 0.094 | 0.0095 |
| log(HC) | 0.156 | 0.089 | 0.047 |
| W log(RD) | -0.3029 | 0.0946 | 0.0015 |

$R^2 = 0.8332$

Table 6: Coefficients for the Knowledge Spillover Example.

Abbreviations: RD: Research and Development Expenditure; GDP: Gross Domestic Product; HC: Human Capital
