Nonlinear Cointegrating Regressions

download Nonlinear Cointegrating Regressions

of 56

Transcript of Nonlinear Cointegrating Regressions

  • 7/28/2019 Nonlinear Cointegrating Regressions

    1/56

    Nonlinear regressions with nonstationary time series

    Nigel Chan and Qiying Wang

    The University of Sydney

    February 25, 2013

    Abstract

    This paper develops asymptotic theory for a nonlinear parametric cointegrating

    regression model. We establish a general framework for weak consistency that iseasy to apply for various nonstationary time series, including partial sum of linearprocesses and Harris recurrent Markov chains. We provide limit distributions fornonlinear least squares estimators, extending the previous works. We also introduceendogeneity to the model by allowing the error to be serially dependent and crosscorrelated with the regressors.

    JEL Classification: C13, C22.

    Key words and phrases: Cointegration, nonlinear regressions, consistency, limit distribu-

    tion, nonstationarity, nonlinearity, endogeneity.

    1 Introduction

    The past few decades have witnessed significant developments in cointegration analy-

    sis. In particular, extensive researches have focused on cointegration models with linear

    structure. Although it gives data researchers convenience in implementation, the linear

    structure is often too restrictive. In particular, nonlinear responses with some unknown

    parameters often arise in the context of economics. For empirical examples, we refer to

    Granger and Terasvirta (1993) as well as Terasvirta et al. (2011). In this situation, it is

    expected that nonlinear cointegration captures the features of many long-run relationships

    in a more realistic manner.

    The authors thanks to Editor, Associate Editor and three referees for their very helpful comments onthe original version. The authors also acknowledge research support from Australian Research Council.Address correspondence to Qiying Wang: School of Mathematics and Statistics, University of Sydney,NSW 2006, Australia. E-mail: [email protected]

    1

  • 7/28/2019 Nonlinear Cointegrating Regressions

    2/56

    A typical nonlinear parametric cointegrating regression model has the form

    yt = f(xt, 0) + ut, t = 1,...,n, (1)

    where f : RRm R is a known nonlinear function, xt and ut are regressor and regression

    errors, and 0 is an m-dimensional true parameter vector that lies in the parameter set. With the observed nonstationary data {yt, xt}nt=1, this paper is concerned with thenonlinear least squares (NLS) estimation of the unknown parameters 0 . In thisregard, Park and Phillips (2001) (PP henceforth) considered xt to be an integrated, I(1),

    process. Based on PP framework, Chang et al. (2001) introduced additional linear time

    trend term and stationary regressors into model (1). Chang and Park (2003) extended to

    nonlinear index models driven by integrated processes. More recently, Choi and Saikkonen

    (2010), Gao et al. (2009) and Wang and Phillips (2012) developed statistical tests for

    the existence of a nonlinear cointegrating relation. Park and Chang (2011) allowed the

    regressors xt to be contemporaneously correlated with the regression errors ut and Shi

    and Phillips (2012) extended the model (1) by incorporating a loading coefficient.

    The present paper has a similar goal to the previously mentioned papers but offers

    more general results, which have some advantages for empirical studies. First of all, we

    establish a general framework for weak consistency of the NLS estimator n, allowing for

    the xt to be a wider class of nonstationary time series. The set of sufficient conditions

    are easy to apply to various nonstationary regressors, including partial sum of linear

    processes and recurrent Markov chains. Furthermore, we provide limit distributions for

    the NLS estimator n. It deserves to mention that the routine employed in this paper

    to establish the limit distributions of n is different from those used in the previous

    works, e.g. Park and Phillips (2001). Roughly speaking, our routine is related to joint

    distributional convergence of a martingale under target and its conditional variance, rather

    than using classical martingale limit theorem which requires establishing the convergence

    in probability for the conditional variance. In nonlinear cointegrating regressions, there

    are some advantages for our methodology since it is usually difficult to establish the

    convergence in probability for the conditional variance, in particular, in the situation that

    the regressor xt is a nonstationary time series. Second, in addition to the commonly used

    martingale innovation structure, our model allows for serial dependence in the equilibrium

    errors ut and the innovations driving xt. It is important as our model permits joint

    determination ofxt and yt, and hence the system is a time series structural model. Under

    such situation, the weak consistency and limit distribution of the NLS estimator n are

    2

  • 7/28/2019 Nonlinear Cointegrating Regressions

    3/56

    also established.

    This paper is organized as follow. Section 2 presents our main results on weak consis-

    tency of the NLS estimator n. Theorem 2.1 provides a general framework. Its applications

    to integrable and non-integrable f are given in Theorems 2.22.5, respectively. Section 3

    investigates the limit distributions ofn in which the model (1) has a martingale structure.

    Extension to endogeneity is presented in Section 4. As mentioned above, our routine es-

    tablishing the limit distribution of n is different from previous works. Section 5 performs

    simulation, reporting and discussing the numerical values of means and standard errors

    which provide the evidence of accuracies of our NLS estimator. Section 6 presents an

    empirical example, providing a link between our theory and real applications. The model

    of interest is the carbon Kuznets curve relating the per capita CO2 emissions and per

    capita GDP. Endogeneity occurs in this example due to potential misreporting of GDP,

    omitted variable bias and reverse causality. Section 7 concludes the paper. Technical

    proofs are postponed to Section 8.

    Throughout the paper, we denote constants by C, C1, C2,... which may be different at

    each appearance. For a vector x = (x1,...,xm), assume that x = (x21 + ... + x2m)1/2, andfor a matrix A, the norm operator is defined by A = supx:x=1 xA. Furthermore,the parameter set Rm is assumed to be compact and convex, and the true parametervector 0 is an interior point of .

    2 Weak consistency

    This section considers the estimation of the unknown parameters 0 in model (1) by

    NLS. Let Qn() =n

    t=1(yt f(xt, ))2. The NLS estimator n of 0 is defined to be theminimizer of Qn() over , that is,

    n = arg minQn(), (2)

    and the error estimator is defined by 2n = n1nt=1 u2t , where ut = yt f(xt, n). Toinvestigate the weak consistency for the NLS estimator n, this section assumes the regres-

    sion model (1) to have a martingale structure. In this situation, our sufficient conditions

    are closely related to those of Wu (1981), Lai (1994) and Skouras (2000), and they intend

    to provide a general framework. In comparison to the papers mentioned, our assumptions

    are easy to apply, particularly in nonlinear cointegrating regression situation as stated in

    the two examples below. Extension to endogeneity between xt and ut is investigated in

    3

  • 7/28/2019 Nonlinear Cointegrating Regressions

    4/56

    Section 4.

    2.1 A framework

    We make use of the following assumptions for the development of weak consistency.

    Assumption 2.1. For each , 0 , there exists a real function T : R R such that

    |f(x, ) f(x, 0)| h(|| 0||) T(x), (3)

    where h(x) is a bounded real function such that h(x) h(0) = 0, as x 0.

    Assumption 2.2. (i) {ut, Ft, 1 t n} is a martingale difference sequence satisfyingE(|ut|2|Ft1) = 2 and sup1tn E(|ut|2q|Ft1) < a.s., where q > 1; and

    (ii) xt is adapted to

    Ft

    1, t = 1,...,n.

    Assumption 2.3. There exists an increasing sequence 0 < n such that

    2n

    nt=1

    [T(xt) + T2(xt)] = OP(1), (4)

    and for any 0 < < 1 and = 0, where , 0 , there exist n0 > 0 and M1 > 0 suchthat

    Pn

    t=1 (f(xt, ) f(xt, 0))2

    2n /M1 1 , (5)

    for all n > n0.

    Theorem 2.1. Under Assumptions 2.12.3, the NLS estimator n is a consistent esti-

    mator of 0, i.e. n P 0. If in addition 2nn1 = O(1), then 2n P 2, as n .

    Assumptions 2.1 and 2.2 are the same as those used in Skouras (2000), which are

    standard in the NLS estimation theory. Also see Wu (1981) and Lai (1994). Assumption

    2.3 is used to replace (3.8), (3.9) and (3.11) in Skouras (2000), in which some uniform

    conditions are used. In comparison to Skouras (2000), our Assumption 2.3 is related to

    the conditions on the regressor xt and is more natural and easy to apply. In particular,

    it is directly applicable in the situation that T is integrable and the regressor xt is a

    nonstationary time series, as stated in the following sub-section.

    4

  • 7/28/2019 Nonlinear Cointegrating Regressions

    5/56

    2.2 Identification of Assumption 2.3: integrable functions

    Due to Assumption 2.1, f(x, ) f(x, 0) is integrable in x ifT is an integrable function.This class of functions includes f(x, 1, 2) = 1|x|2I(x [a, b]), where a and b are finiteconstants, the Gaussian function f(x, ) = ex

    2

    , the Laplacian function f(x, ) = e|x|,

    the logistic regression function f(x, ) = e|x|/(1 + e|x|), etc. In this sub-section, two

    commonly used non-stationary regressors xt are shown to satisfy Assumption 2.3 if T is

    integrable.

    Example 1 (Partial sum of linear processes). Let xt =t

    j=1 j, where {j, j 1}is a linear process defined by

    j =k=0

    k jk, (6)

    where

    {j,

    < j 0 for all = 0.

    Then (4) and (5) hold with 2n = n/dn. Consequently, if in addition Assumption 2.2, then

    n P 0.

    Theorem 2.2 improves Theorem 4.1 of PP in two folds. Firstly, we allow for more

    general regressor. The result under C1 is new, which allows xt to be long memory process,

    including the fractionally integrated process as an example. PP only allow xt to satisfy

    C2 with additional conditions on k, that is, they require xt to be a partial sum of short

    memory process. Secondly, we remove the part (b) required in the definition of I-regular

    function given in their Definition 3.3. Furthermore, we allow for certain non-integrable f

    5

  • 7/28/2019 Nonlinear Cointegrating Regressions

    6/56

    such as the logistic regression function f(x, ) = e|x|/(1 + e|x|), 0 for some 0 > 0,although we assume the integrability of T.

    Example 2 (Recurrent Markov Chain). Let {xk}k0 be a Harris recurrent Markovchain with state space (E, E), transition probability P(x, A) and invariant measure . Wedenote P for the Markovian probability with the initial distribution , E for correspon-dent expectation and Pk(x, A) for the k-step transition of{xk}k0. A subset D ofE with0 < (D) < is called D-set of{xk}k0 if for any A E+,

    supxE

    Ex Ak=1

    ID(xk)

    < ,

    where E+ = {A E: (A) > 0} and A = inf{n 1 : xn A}. By Theorem 6.2 of Orey(1971), D-sets not only exist, but generate the entire sigma E, and for any D-sets C, Dand any probability measure , on (E,

    E),

    limn

    nk=1

    Pk(C)/n

    k=1

    Pk(D) =(C)

    (D), (8)

    where Pk(D) = P

    k(x, D)(dx). See Nummelin (1984) for instance.

    Let a D-set D and a probability measure on (E, E) be fixed. Define

    a(t) = 1(D)[t]k=1

    Pk(D), t 0.

    By recurrence, a(t)

    . Here and below, we set the state space to be the real space,

    that is (E, E) = (R, R). We have the following result.Theorem 2.3. Supposext is defined as in Example 2 and Assumption 2.1 holds. Assume:

    (i) T is bounded and |T(x)|(dx) < , and

    (ii)(f(s, ) f(s, 0))2(ds) > 0 for all = 0.

    Then (4) and (5) hold with 2n = a(n). Consequently, if in addition Assumption 2.2, then

    n P 0.

    Theorem 2.3 seems to be new to the literature. By virtue of (8), the asymptotic orderof a(t) depends only on {xk}k0. It is interesting to notice that Theorem 2.3 does notimpose the -regular condition as commonly used in the literature. The Harris recurrent

    Markov chain {xk}k0 is called -regular if

    lim

    a(t)/a() = t, t > 0, (9)

    where 0 < 1. See Chen (1999) for instance.

    6

  • 7/28/2019 Nonlinear Cointegrating Regressions

    7/56

    2.3 Identification of Assumption 2.3: beyond integrable func-tions

    As noticed in Section 2.2, Assumption 2.3 still holds for certain non-integrable f, although

    we require the integrability of T. However, identification of Assumption 2.3 is more

    involved for general non-integrable f. In this situation, instead of requiring T to be

    integrable, we impose certain relationship between f and T.

    Assumption 2.4. (i) There exist a regular function1 g(x,,0) on satisfying|s|

    g2(s,,0)ds > 0

    for all = 0 and > 0, a real function T : R R and a positive real function v()which is bounded away from zero as , such that for any bounded x,

    sup,0

    |f(x,) f(x,0) v() g(x,,0)|/T(x) = o(1), (10)

    as .(ii) There exists a real function T1 : R R such that T(x) v() T1(x) as |x|

    and T1 + T2

    1 is locally integrable (i.e. integrable on any compact set).

    Theorem 2.4. Suppose Assumption 2.4 holds. Suppose that there exists a continuous

    Gaussian process G(t) such that x[nt],n G(t), on D[0, 1], where xi,n = xi/dn and 0 0 continuous functionsH

    , H, and a constant > 0 such that H(x, ) H(y, ) H(x, ) for all |x y| < on K, a

    compact set of R, and such thatK

    (H H)(x, )dx 0 as 0, and (b) for all x R, H(x, ) isequicontinuous in a neighborhood of x.

    7

  • 7/28/2019 Nonlinear Cointegrating Regressions

    8/56

    that of PP, as we only require that x[nt]/dn converges weakly to a continuous Gaussian

    process. This kind of weak convergence condition is very likely close to be necessary.

    We next consider a class of asymptotically homogeneous functions. In this regard, we

    follow PP. Let f : R R have the structure:

    f(x,) = v(, )h(x, ) + b(, ) A(x, ) B(x,), (11)

    where sup |b(, ) v1(, )| 0, as ; sup |A(x, )| is locally bounded, thatis, bounded on bounded intervals; sup |B(x,)| is bounded on R; h(x, ) is regular on satisfying

    |s| h

    2(s, )ds > 0 for all = 0 and > 0, and v(, ) satisfy: there exist > 0 and a neighborhood N of such that as

    inf

    |p

    p

    |

  • 7/28/2019 Nonlinear Cointegrating Regressions

    9/56

    3 Limit distribution

    This section considers limit distribution of n. In what follows, let Qn and Qn be the

    first and second derivatives of Qn() in the usual way, that is, Qn = Qn/ and Qn =

    2Qn/. Similarly we define f and f. We assume these quantities exist whenever they

    are introduced. The following result comes from an application of Lemma 1 of Andrews

    and Sun (2004), which provides a framework and plays a key part in our development on

    limit distribution of n.

    Theorem 3.1. There exists a sequence of m m nonrandom nonsingular matrices Dnwith D1n 0, as n , such that

    (i) sup:Dn(0)kn (D1n )n

    t=1

    f(xt, )f(xt, )

    f(xt, 0)f(xt, 0)

    D1n = oP(1),(ii) sup:Dn(0)kn (D1n )

    nt=1 f(xt, )

    f(xt, ) f(xt, 0)

    D1n = oP(1),(iii) sup:Dn(0)kn (D1n ) nt=1 f(xt, ) ut D1n = oP(1),(iv) Yn := (D

    1n )

    nt=1 f(xt, 0)f(xt, 0)

    D1n D M, where M > 0, a.s., and

    Zn := (D1n )

    nt=1

    f(xt, 0) ut = OP(1),

    for some sequence of constants {kn, n 1} for which kn , as n . Then, thereexists a sequence of estimators {n, n 1} satisfying Qn(n) = 0 with probability thatgoes to one and

    Dn(n 0) = Y1

    n Zn + oP(1). (13)

    If we replace (iv) by the following (iv), then Dn(n 0) D M1 Z.(iv) for any i = (i1,...,im) Rm, i = 1, 2, 3,

    (1 Yn2, 3Zn) D (1 M 2, 3 Z),

    where M > 0, a.s. and P(Z < ) = 1.Our routine in establishing the limit distribution of n is essentially different from that

    of PP, and is particularly convenient in the situation that f, f and f all are integrable in

    which Dn = n I for some 0 < n , where I is an identity matrix. See next sectionfor examples. The condition AD3 of PP requires to show Yn P M instead of (iv). It isequivalent to say that, under the PPs routine, one requires to show (at least under an

    enlarged probability space)

    1n

    nt=1

    g(xt) P

    g(s)ds LW(1, 0), (14)

    9

  • 7/28/2019 Nonlinear Cointegrating Regressions

    10/56

    where xt is an integrated process, g is a real integrable function and LW(t, s) is the

    local time of the standard Brownian Motion W(t)2. The convergence in probability is

    usually hard or impossible to establish without enlarging the probability space. Our

    routine essentially reduces the convergence in probability to less restrictive convergence

    in distribution. Explicitly, in comparison to (14), we only need to show that

    1n

    nt=1

    g(xt) D

    g(s)ds LW(1, 0). (15)

    This allows to extend the nonlinear regressions to much wider class of nonstationary

    regressor data series, and enables our methodology and proofs straightforward and neat.

    3.1 Limit distribution: integrable functions

    We now establish our results on convergence in distribution for then, which mainly

    involves settling the conditions (i)(iii) and (iv) in Theorem 3.1. First consider the

    situation that f is an integrable function, together with some additional conditions on xt

    and ut.

    Assumption 3.1. (i) xt is defined as in Example 1, that is, xt =t

    j=1 j, where j

    satisfies (6);

    (ii) Fk is a sequence of increasing -fields such that k Fk and k+1 is independent of

    Fk for all k

    1, and k

    F1 for all k

    0;

    (iii) {uk, Fk}k1 forms a martingale difference satisfying maxkm |E(u2k+1 | Fk) 1| 0a.s., and for some > 0, maxk1 E(|uk+1|2+ | Fk) < .

    Assumption 3.2. Let p(x, ) be one off, fi and fij , 1 i, j m.

    (i) p(x, 0) is a bounded and integrable real function;

    (ii) = f(s, 0)f(s, 0)

    ds > 0 and(f(s, ) f(s, 0))2ds > 0 for all = 0;

    (iii) There exists a bounded and integrable function Tp : R R such that |p(x, )

    p(x, 0)| hp(|| 0||) Tp(x), for each , 0 , where hp(x) is a bounded realfunction such that hp(x) hp(0) = 0, as x 0.

    2Here and below, the local time process LG(t, s) of a Gaussian process G(t) is defined by

    LG(t, s) = lim0

    1

    2

    t

    0

    I|G(r) s| dr.

    10

  • 7/28/2019 Nonlinear Cointegrating Regressions

    11/56

    Theorem 3.2. Under Assumptions 3.1 and 3.2, we haven/dn(n 0) D 1/2 NL1/2G (1, 0), (16)

    whereN is a standard normal random vector, which is independent of G(t) defined by

    G(t) =W3/2(t), underC1,

    W(t), underC2.(17)

    Here and below, W(t) denotes the fractional Brownian motion with 0 < < 1 on

    D[0, 1], defined as follows:

    W(t) =1

    A()

    0

    (t s)1/2 (s)1/2

    dW(s) +

    t0

    (t s)1/2dW(s),where W(s) is a standard Brownian motion and

    A() = 1

    2+

    0

    (1 + s)1/2 s1/2

    2ds1/2

    .

    Remark 3.1. Theorem 3.2 improves Theorem 5.1 of PP by allowing xt to be a long

    memory process, which includes fractional integrated processes as an example.

    Using the same routine and slight modification of assumptions, we have the following

    limit distribution when xt is a -regular Harris recurrent Markov chain.

    Assumption 3.1*. (i) xt is defined as in Example 2 satisfying (9), that is, xt is a -

    regular Harris recurrent Markov chain;

    (ii) {uk, Fk}k1 forms a martingale difference;(iii) maxkm |E(u2k+1 | Fnk)1| 0 a.s., and for some > 0, maxk1 E(|uk+1|2+ | Fnk) 0 and (f(s, ) f(s, 0))2(ds) > 0 for all = 0;

    (iii) There exists a bounded function Tp : R R with Tp(x)(dx) < such that

    |p(x, ) p(x, 0)| hp(|| 0||) Tp(x), for each , 0 , where hp(x) is a boundedreal function such that hp(x) hp(0) = 0, as x 0.

    11

  • 7/28/2019 Nonlinear Cointegrating Regressions

    12/56

    Theorem 3.3. Under Assumptions 3.1* and 3.2*, we havea(n)(n 0) D 1/2 N1/2 , (18)

    whereN is a standard normal random vector, which is independent of , and for = 1,

    = 1, and for 0 < < 1,

    is a stable random variable with Laplace transform

    Eexp{t } = exp

    t

    (+ 1)

    , t 0. (19)

    Remark 3.2. Theorem 3.3 is new, even for the stationary case ( = 1) in which a(n) = n

    and the NLS estimator n converges to a normal variate. The random variable in

    the limit distribution, after scaled by a factor of (+ 1)1, is a Mittag-Leffler random

    variable with parameter , which is closely related to a stable random variable. For details

    regarding the properties of this distribution, see page 453 of Feller (1971) or Theorem 3.2

    of Karlsen and Tjstheim (2001).

    Remark 3.3. Assumption 3.1* (iii) imposes a strong orthogonal property between the

    regressor xt and the error sequence ut. It is not clear at the moment whether such

    condition can be relaxed to a less restrictive one where xt is adapted to Fnt and xt+1 isindependent ofFnt for all 1 t n. We leave this for future research.

    Remark 3.4. Unlike linear cointegration, integrable regression function does not pro-

    portionately transfer the effect of outliers of the nonstationary regressor to the response.This is useful when practitioners want to lessen the impact of extreme values of xt on the

    response. In addition, such function arises in macro-economics when the dependent vari-

    able has an uneven response to the regressor. Empirical examples include central banks

    market intervention and the currency exchange target zones, described in Phillips (2001).

    Remark 3.5. The convergence rates in (16) and (18) are reduced by the nonstationarity

    property of the regressor xt when f is an integrable function. For the partial sum of linear

    processes in Theorem 3.2, comparing to the stationary situation where the convergence

    rate is n1/2, the rate is reduced to n3/4/21/2(n) for the long memory case, and is further

    reduced to n1/4 for the short memory case. For the -regular Harris recurrent Markov

    chain, note that by (9), the asymptotic order a(n) is regularly varying at infinity, i.e., there

    exists a slowly varying function (n) such that a(n) n(n). Therefore, the convergencerate of the NLS estimator decreases as the regularity index of the chain decreases from

    one (stationary situation) to zero.

    12

  • 7/28/2019 Nonlinear Cointegrating Regressions

    13/56

    3.2 Limit distribution: beyond integrable functions

    We next consider the limit distribution of n when the regression function f is non-

    integrable. Unlike PP and Chang et al. (2001), we use a more general regressor xt in the

    development of our asymptotic distribution.

    Assumption 3.3. (i) {ut, Ft, 1 t n} is a martingale difference sequence satisfyingE(|ut|2|Ft1) = 2 and sup1tn E(|ut|2q|Ft1) < a.s., where q > 1;

    (ii) xt is adapted to Ft1, t = 1,...,n, max1tn |xt|/dn = OP(1), where d2n = var(xn) ; and

    (iii) There exists a vector of continuous Gaussian processes (G, U) such that

    (x[nt],n, n1/2

    [nt]

    i=1ui) D (G(t), U(t)), (20)

    on D[0, 1]2, where xt,n = xt/dn.

    Assumption 3.4. Let p(x, ) be one of f, fi and fij, 1 i, j m. There exists a realfunction Tp : R R such that

    (i) |p(x, ) p(x, 0)| Ap(|| 0||) Tp(x), for each , 0 , where Ap(x) is a boundedreal function satisfying Ap(x) Ap(0) = 0, as x 0;

    (ii) For any bounded x, sup |p(x,) vp() hp(x, )|/Tp(x) = o(1), as ,where hp(x, ) on is a regular function and vp() is a positive real function which

    is bounded away from zero as ; and(iii) Tp(x) vp() T1p(x) as |x| , where T1p : R R is a real function such that

    T1p + T2

    1p is locally integrable (i.e. integrable on any compact set).

    For notation convenience, let v(t) = vf(t), h(x, ) = hf(x, ), vi(t) = vfi(t) and

    v(t) = (v1(t),..., vm(t)). Similarly, we define v(t), h(x, ) and h(x, ). We have the

    following main result.

    Theorem 3.4. Suppose Assumptions 3.3 and 3.4 hold. Further assume that

    sup1i,jm | v(dn) vij(dn)vi(dn) vj(dn) | < and|s| h(s, 0)h(s, 0)

    ds > 0 for all > 0. Then we have

    Dn (n 0) D1

    0

    (t)(t)dt1 1

    0

    (t) dU(t), (21)

    on D[0, 1], as n , where (t) = h(G(t), 0) and Dn = diag(nv1(dn),..., nvm(dn)).

    13

  • 7/28/2019 Nonlinear Cointegrating Regressions

    14/56

    Remark 3.6. Except the joint convergence, other conditions in establishing Theorem

    3.4 are similar to Theorem 5.2 of PP. PP made use of the concept of H0-regular. Our

    conditions on f, fi and fij, 1 i, j m are more straightforward and easy to identify.Furthermore, the joint convergence assumption under present paper is quite natural when

    f,

    f and

    f all satisfy Assumption 3.4 (ii). Indeed, under

    fi(x,) vi()hi(x, ), one can

    easily obtain the following asymptotics under our joint convergence.

    D1n

    nt=1

    f(xt, 0)ut D1

    0

    h(G(t), 0)dU(t), (22)

    on D[0, 1], which is required in the proof of (21). See, e.g., Kurtz and Protter (1991) and

    Hansen (1992).

    Remark 3.7. If we further assume that U(t) and G(t) are asymptotic independent, that

    is, the long run relationship between the regressor sequence xt and the innovative sequenceut vanishes asymptotically, the limiting distribution in (21) will become mixed normal.

    In particular, when U(t) is a standard Wiener process, we have

    Dn (n 0) D 1

    0

    (t)(t)dt1/2

    N, (23)

    where N is a standard normal random vector.

    Remark 3.8. Nonlinear cointegrating regressions with structure described in Theorem

    3.4 are useful to modelling money demand functions. In such case, yt is the logarithm of

    the real money balance, xt is the nominal interest rate, and f can either be f(x,,) =

    + log |x| or f(x,,) = + log(1+|x||x| ). See Bae and de Jong (2004) and Bae etal. (2006) for empirical studies investigating the estimation of money demand functions

    in USA and Japan respectively. Also, see Bae et al. (2004) for the derivation of these

    functional forms from the underlying money demand theories studied in macro-economics.

    Remark 3.9. Another example is the MichaelisMenton model, which was considered

    by Lai (1994). In such case, the regression function is f(x, 1, 2) = 1/(2 + x)I

    {x

    0

    }.

    Bates and Watts (1988) employs such model to investigate the dependence of rate of

    enzymatic reactions on the concentration of substrate.

    Remark 3.10. Unlike in the situation where f is integrable, there is super convergence

    when f has the structure given in Assumption 3.4 and xt is nonstationary. The conver-

    gence rate is given by Dn, whose elements

    nvi(dn) are all faster than the standard raten in stationary situation.

    14

  • 7/28/2019 Nonlinear Cointegrating Regressions

    15/56

    We finally remark that, if f is an asymptotically homogeneous function having the

    structure (11), it is also possible to extend PPs result from a short memory linear process

    to general nonstationary time series as described in Assumption 3.3. The idea of proof

    for such result can be easily generalized from Theorem 5.3 of PP. Also see Chang et al.

    (2001). As it only involves slight modification of notation, we omit the details.

    4 Extension to endogeneity

    Assumption 2.2 ensures the model (1) having a martingale structure. The result in this

    regard is now well known. However, there is little work on allowing for contemporaneous

    correlation between the regressors and regression errors. Using a nonparametric approach,

    Wang and Phillips (2009b) considered kernel estimate of f and allowed the equation error

    ut to be serially dependent and cross correlated with xs for |t s| < m0, thereby inducingendogeneity in the regressor, i.e., cov(ut, xt) = 0. In relation to present paper de Jong(2002) considered model (1) without exogeneity, and assuming certain mixing conditions.

    Chang and Park (2011) considered a simple prototypical model, where the regressor and

    regression error are driven by i.i.d. innovations. This section provides some extensions

    to these works. Our main results show that there is an essential effect of endogeneity on

    the asymptotics, introducing the bias in the related limit distribution. Full investigation

    in this regard requires new asymptotic results, which will be left for future work.

    4.1 Asymptotics: integrable functions

    We first consider the situation that f satisfies Assumption 2.1 with T being integrable. Let

    i (i, i), i Zbe a sequence of i.i.d. random vectors satisfying E0 = 0 and E02q 0. As noticed in Remark 4 of Jeganathan (2008),the conditions on the characteristics function (t) are not very restrictive.

    Assumption 4.1. (i) xt = tj=1 j where j is defined as in (6), that is, j = k=0 kjkwith the coefficients k satisfying C1 or C2;

    (ii) ut =

    k=0 k tk, where the coefficients k are assumed to satisfy

    k=0 k2|k|2 < ,

    k=0 k = 0 and k=0 |k| max{1, |k|} < where k = ki=0 i.As we do not impose the independence between k and k, Assumption 4.1 provides

    the endogeneity in the model (1), and it is much general than the conditions set given in

    15

  • 7/28/2019 Nonlinear Cointegrating Regressions

    16/56

    Chang and Park (2011). We have the following result.

    Theorem 4.1. Suppose Assumptions 2.1 and 4.1 hold. Further assume:

    (i) T is bounded and integrable, and

    (ii) (f(s, ) f(s, 0))2ds > 0 for all = 0.Then the NLS estimator n defined by (2) is a consistent estimator of 0, i.e. n P 0.

    If in addition Assumption 3.2, then, as n ,n/dn(n 0) D 1 1/2 NL1/2G (1, 0), (24)

    wheref() =

    eixf(x, 0)dx,

    = (2)1 f()f()[Eu20 + 2r=1 E(u0ure

    ixr)] d, (25)

    and other notations are given as in Theorem 3.1.

    4.2 Asymptotics: beyond integrable functions

    This section considers the situation that f satisfies Assumption 2.1 with T being non-

    integrable. Let again i (i, i), i Z be a sequence of i.i.d. random vectors satisfyingE0 = 0 and E02q < , where q > 2, and we make use of the following assumption inrelated to the model (1).

    Assumption 4.2. (i) xt =t

    j=1 j , where j =

    k=0 kjk with =

    k=0 k = 0 andk=0 |k| < ;

    (ii) ut =

    k=0 k tk, where the coefficients k are assumed to satisfy

    k=0 k|k| < , k=0 k = 0 and k=0 k|k||k| < .

    Theorem 4.2. Under Assumption 2.1* and 4.2, the NLS estimator n defined by (2) is

    a consistent estimator of 0, i.e. n P 0.

    In the following, we define hx(x, ) = h(x,)x .

    Theorem 4.3. Suppose Assumptions 3.4 and 4.2 hold. Further assume that:

    (i) sup1i,jm | v(n) vij(

    n)

    vi(n) vj(

    n)

    | < and |s| h(s, 0)h(s, 0)ds > 0 for all > 0, and(ii) hx(x, ) satisfies |hx(x, 0)| K(1+ |x|) for some constantsK, > 0 for allx R,

    and q max(3, 2) where q is given in Assumption 4.2.

    16

  • 7/28/2019 Nonlinear Cointegrating Regressions

    17/56

    Then the limit distribution of n is given by

    Dn(n 0) D 1

    0

    (t)(t)dt1

    u

    10

    (t) dt +

    10

    (t) dU(t)

    , (26)

    as n , where (t) = h(W(t), 0), (t) = hx(W(t), 0), u =

    j=0 E(0uj), Dn =

    diag(nv1(n),..., nvm(n)) and (W(t), U(t)) is a bivariate Brownian motion withcovariance matrix

    =

    2E20 E00

    E00 2E20

    .

    5 Simulation

    In this section, we investigate the finite sample performance of the NLS estimator n of

    nonlinear regressions with endogeneity. Chang and Park (2011) performed simulation of

    similar model, but only considered the error structure ut to be i.i.d. innovation. We intend

    to investigate the sampling behavior of n under different degree of serially dependence

    of ut on itself. To this end, we generate our data in the following way:

    xt = xt1 + t,

    vt =

    1 2 wt + t,

    f1(x, ) = exp{|x|} and f2(x,,1, 2) = + 1x + 2x2, (27)

    where {wt} and {t} are i.i.d. N(0, 32

    ) variables, and is the correlation coefficient thatcontrols the degree of endogeneity. The true value of is set as 0.1 and that of (, 1, 2)

    is set as (20, 10, 0.1). The error structure is generated according to the following threescenarios:

    S1: ut = vt,

    S2: ut =

    j=1j10 vtj+1, and

    S3: ut =

    j=1j4 vtj+1.

    Scenario S1 is considered by Chang and Park (2011), which eliminates the case that

    ut is serially correlated. Scenarios S2 and S3 introduce self-dependence to the error

    sequence. The decay rate of S2 is faster than that of S3, which implies that the level of

    self dependence of ut increases from S1 to S3.

    It is obvious that f1 is an integrable function and f2 satisfies the conditions in Theorem

    4.3, with

    v(,,1, 2) = 2 and h(x,,1, 2) = 2x

    2.

    17

  • 7/28/2019 Nonlinear Cointegrating Regressions

    18/56

    In addition, we have

    h(x,,1, 2) = (1, x , x2) and hx(x,,1, 2) = (0, 1, 2x).

    It would be convenient for later discussion to write out the limit distributions of the

    estimators for the f2 case. By Theorem 4.3, we have n1/2(n 0)n(1n 10)n3/2(2n 20)

    D 1 U(1)u + 10 W(t)dU(t)

    2u1

    0 W(t)dt +1

    0 W2(t)dU(t)

    , (28)where u =

    j=0 E(0uj) and

    =

    1

    10

    W(t)dt1

    0W2(t)dt

    1

    0 W(t)dt

    1

    0 W2(t)dt

    1

    0 W3(t)dt

    10 W2(t)dt 10 W3(t)dt 10 W4(t)dt

    .

    In our simulations, we draw samples of sizes n = 200, 500 to estimate the NLS esti-

    mators and their t-ratios. Each simulation is run for 10,000 times, and their densities are

    estimated using kernel method with normal kernel function. Our results are presented

    in 16 tables. Table 23 present the numerical values of the means and standard errors

    of the NLS estimators, n and (n, 1n, 2n), as well as those of the error variance es-

    timates 2n. Table 47 are the estimated densities of the centered distributions of the

    estimators. Table 811 are the estimated densities of the centered distributions scaled by

    their respective convergence rates, that is, n1/4 and (n1/2, n , n3/2). Table 1215 are the

    t-ratios. In all tables, the left columns are simulation results based on 200 samples, while

    the right columns are based on 500 samples. We expect that as increases, the degree of

    endogeneity increases, and hence the variance of the distribution will increase due to (24)

    and (26). We also expect that the more serial correlation of ut, the higher the variance

    of the limit distribution due to the cross terms appeared in (25) and (26).

    Our simulation results largely corroborate with our theoretical results in Section 4.

    Firstly, the means of the estimators are close to the true values, and the deviations

    from true values decrease as the sample sizes increase. Also, the reductions of standard

    errors from n = 200 to 500 are close to the theoretical values. For example, the ratio

    of standard errors for n between the sample size 200 and 500 is 1.564 in scenario S1,

    which is close to theoretical value

    500200

    = 1.581. This confirms that our estimators are

    accurate in all scenarios. Secondly, comparing the S1S3 curves in each plot, we can

    see that the S3 curves have fat tails and lower peak than those of S1S2. This verifies

    18

  • 7/28/2019 Nonlinear Cointegrating Regressions

    19/56

    that high dependence ofut on its own past will increase the variance of limit distribution.

    Thirdly, comparing the shapes of the curves for different values of, the curves with = 0

    have highest peaks, and the peakedness decrease as increases. This matches with our

    expectation that, if the cross dependence between ut and xt increases, the variance of

    limit distribution will increase.Finally, the sampling results for t-ratios are also much expected from our limit theories.

    In particular, we have interesting results for the t-ratios of1n in Table 14. In scenario S1,

    there is no endogeneity and the limiting t-ratios overlap with the normal curve. However,

    as we introduce endogeneity to the model in scenarios S2 and S3, the t-ratios is shifted

    away from the standard normal. Such behavior can be explained by looking at the limit

    distribution of 1n in (28). When = 0, there exists an extra term of 1u, and suchterm will introduce bias to the limit distribution.

    6 An empirical example: Carbon Kuznets Curve

    In this section, we consider a simple example of the model with endogeneity. The equation

    of interest is the Carbon Kuznets Curve (CKC) relating the per capita CO2 emission of

    a country to its per capita GDP. Basically, the CKC hypothesis states that there is an

    inverted-U shaped relationship between economic activities and per capita CO2 emissions.

    As it is well accepted that the GDP variable displays some wandering nonstationary be-

    haviors over time, such problem is relevant to nonlinear transformations of nonstationaryregressors. Refer to Wagner (2008) and Muller-Furstenberger and Wagner (2007) for

    detailed expositions of how such problem relates to the works of PP and Chang et al.

    (2001).

    The formula that we currently consider has a quadratic formulation and is given by

    ln(et) = + 1 ln(xt) + 2(ln(xt))2 + ut, 1 t n, (29)

    where et and yt denote the per capital emissions of CO2 and GDP in period t, and

    ut is a stochastic error term. In the usual CKC formulation, the logarithm of per

    capita CO2 emissions and GDP are assumed to be integrated processes, see Muller-

    Furstenberger and Wagner (2007). Also, it is clear that the nonlinear link function

    f(x,,1, 2) = + 1x + 2x2 satisfies the conditions in Theorem 4.3, and is con-

    sidered in our simulations. There are studies which employ more complicated models, by

    including time trend, stochastic terms and additional explanatory variables such as the

    19

  • 7/28/2019 Nonlinear Cointegrating Regressions

    20/56

    energy structure. To avoid complicating our discussion, we only consider GDP as the only

    explanatory variable.

    In the literature most studies assumed that the parameters (1, 2) are homogeneous

    across different counties. Several works investigated the appropriateness of restricting all

    countries adhering to the same values of (1, 2). For example, List and Gallet (1999), Di-jkgraaf et al. (2005), Piaggio and Padilla (2012). Particularly, Piaggio and Padilla (2012)

    tested this assumption by checking whether the confidence intervals of (1, 2) across dif-

    ferent countries overlap. They rejected the homogeneity assumption of parameters across

    different countries, based on the result that they could not find any possible group with

    more than 4 countries whose confidence intervals overlap with each other.

    As an example of employing our NLS estimators n = (n, 1n, 2n) and calculating

    their confidence intervals, we carry out similar study and examine whether the 95% con-

    fidence intervals of the parameters (1n, 2n) across different countries overlap with each

    other. We extend previous studies by incorporating the endogeneity in the model. The

    assumption that xt is an exogenous variable does not usually hold in practise as there are

    many potential sources of endogeneity. Firstly, endogeneity occurs due to measurement

    error of the explanatory variables xt, including the misreporting of GDP of each country.

    Secondly, there might be omitted variable bias, which arises if there exists a relevant

    explanatory variable that is correlated with the regressor xt (per capita GDP), but is

    excluded from the model. Finally, potential reverse causality will give rise to endogeneity.

    That is, the CO2 emissions yt might at the same time has a feedback effect on the GDP

    xt (e.g., due to stricter government regulation).

    We consider 15 countries, which are shown in Piaggio and Padilla (2012) that their

    response variables have a quadratic relationship described in (29) with the regressors.

    We use annual data in this example and for each country, there are 58 observations of

    the CO2 emission data from 19512008 published by the Carbon Dioxide Information

    Analysis Center (Boden et al. (2009)). For per capita GDP, we adopt the GDP data

    from Maddison (2003), which is transformed to 1990 Geary-Khamis dollars.We first estimate the parameters by minimizing the error function given in (2) using

    the nls() function in the statistical software R. Next, we calculate the confidence interval

    by finding the critical values of the limit distribution of (n, 1n, 2n) given in (29). Note

    that most of the quantities in (29) involve only the standard Brownian motion or integral

    of Brownian motions, which can be easily obtained by direct simulations. However, there

    20

  • 7/28/2019 Nonlinear Cointegrating Regressions

    21/56

    exist two nuisance parameters, including the covariance matrix of Brownian motions

    (W, U) and the autocovariances u of linear processes {t, ut}. Thus we have to replacethe unknown parameters with their consistent estimates and adopt the empirical critical

    values instead of the real ones. Using the linear process estimation procedures described

    in Chang et al. (2001), we have,

    as the estimators of , , and t and t as the

    artificial error sequences. The parameters and u can then be estimated by

    =

    n12

    nt=1

    2t n

    1n

    t=1 ttn1

    nt=1 tt n

    12n

    t=1 2t

    , and u =

    lj=1

    n1njt=1

    tut+j,

    for any l = o(n1/4). See Ibragimov and Phillips (2008), Phillips and Solo (1992) and

    Phillips and Perron (1988).

    Table 1: Estimates and 95% Confidence Intervals of 1n and 2n

    Table 18 and 19 report the estimates and 95% confidence intervals of parameters 1

    21

  • 7/28/2019 Nonlinear Cointegrating Regressions

    22/56

    and 2, and Table 1 presents the plots of the results. For the parameter 1, we can

    not find any group with more than four groups of countries whose confidence intervals

    overlap. Such result is consistent with that of Piaggio and Padilla (2012). However,

    for the parameter 2n, among the 15 countries, we can find group of 11 countries whose

    confidence intervals overlap. Hence, the homogeneity of parameters can not be rejected asin Piaggio and Padilla (2012). There are two main reasons for the differences between our

    results and theirs. Firstly, they assume the innovation is i.i.d. normally distributed while

    we allow the error to be serially correlated with itself. Secondly, we incorporate the effect

    of endogeneity in our limit distribution of estimator. Based on (29), both reasons will

    make the critical values and confidence intervals significantly larger than that of Piaggio

    and Padilla (2012).

    7 Conclusion and discussion

    In this paper, we establish an asymptotic theory for a nonlinear parametric cointegrating

    regression model. A general framework is developed for establishing the weak consistency

    of the NLS estimator n. The framework can easily be applied to a wide class of non-

    stationary regressors, including the partial sum of linear processes and Harris recurrent

    Markov chains. Limit distribution of n is also established, which extends previous works.

    Furthermore, we introduce endogeneity to our model by allowing the error term to be

    serially dependent on itself, and cross-dependent on the regressor. We show that the limitdistribution of n under the endogeneity situation is different from that with martingale

    error structure. This result is of interest in the applied econometric research area.

    Asymptotics in this paper are limited to univariate regressors and nonlinear para-

    metric cointegrating regression models. Specification of the nonlinear regression function

    f depends usually on the underlying theories of the subject. Illustration can be found

    in Bae et al. (2004), where the concept of the money in the utility function with the

    constant elasticity of substitution (MUFCES) is used to relate the money balance with

    nominal interest rate by the link function f(x,,) = + log( x1+x). More currently,

    nonparametric method has been developed to test the specification of f. See, e.g., Wang

    and Phillips (2012), for instance. Extension to multivariate regressors require substantial

    different techniques. This is because local time theory can not in general be extended to

    multivariate data, in which the techniques play a key rule in the asymptotics of the NLS

    estimator with nonstationary regressor when f is an integrable function. Possible exten-

    22

  • 7/28/2019 Nonlinear Cointegrating Regressions

    23/56

    sion to multivariate regressors is the index model discussed in Chang and Park (2003).

    This again requires some new techniques, and hence we leave it for future work.

    8 Appendix: Proofs of main results

    This section provides proofs of the main results. We start with some preliminaries, which

    list the limit theorems that are commonly used in the proofs of the main results.

    8.1 Preliminaries

    Denote N(0) = { : 0 < }, where 0 is fixed.

    Lemma 8.1. Let Assumption 2.1 hold. Then, for anyxt satisfying (4) withT being given

    as in Assumption 2.1,

    supN(0)

    2n

    nt=1

    |f(xt, ) f(xt, 0)| + |f(xt, ) f(xt, 0)|2 P 0, (30)as n first and then 0. If in addition Assumption 2.2, then

    2n

    nt=1

    [f(xt, 0) f(xt, 0)] ut P 0, (31)

    for any 0, 0 , and

    supN(0)

    2nnt=1

    |f(xt, ) f(xt, 0)| |ut| P 0, (32)

    as n first and then 0.

    Proof. (30) is simple. As 2nn

    t=1[f(xt, 0) f(xt, 0)]2 C2nn

    t=1 T2(xt) = OP(1),

    (31) follows from Lemma 2 of Lai and Wei (1982). As for (32), the result follows from

    n

    t=1|f(xt, ) f(xt, 0)| |ut| h( 0)

    n

    t=1T(xt) |ut|,

    andn

    t=1 T(xt) |ut| Cn

    t=1 T(xt) +n

    t=1 T(xt)|ut| E(|ut| | Ft1) = OP(2n).

    Lemma 8.2. Let Assumption 3.1 (i) hold. For any boundedg(x) satisfying |g(x)|dx 2, and (set Fn = (x1,...,xn))

    Ln = qnn

    k=1

    |Znk|qE(|uk|q|Fn) + E2n n

    k=1

    Z2nk[E(u2k|Fnk) 1]

    q/2Fn.Recall from Assumption 3.2 (iv) and the fact that 2n =

    nk=1 Z

    2nk, we have,

    E2n

    nk=1

    Z2nk[E(u2k|Fnk) 1]

    q/2Fn

    P 0,

    by dominated convergence theorem. Hence, routine calculations show that

    Ln C(q2)n a(n)(q2)/2 + oP(1) = oP(1),

    because 2n = OP(1) by (35) and q > 2. This proves (38), which implies that (36) holds

    true with g2(x) = g21(x). Finally, note that, for any a, b R,

    a(n)1nt=1

    ag21(xt) + bg2(xt)

    D

    ag21(s) + bg2(s)

    (ds) ,

    due to Theorem 2.3 of Chen (1999), which implies thata(n)1

    nt=1

    g21(xt), a(n)1

    nt=1

    g2(xt)

    D

    g21 (s)(ds) ,

    g2(s)(ds) ,

    .

    Hence, by continuous mapping theorem,nt=1 g

    21(xt)n

    t=1 g2(xt)P

    g21(s) (ds)

    g2(s) (ds).

    This shows that (36) is still true with general g2(x).

    25

  • 7/28/2019 Nonlinear Cointegrating Regressions

    26/56

    Lemma 8.4. Under Assumption 4.1, for any boundedgi(x), i = 1, 2, satisfying |gi(x)|dx 0 for all x R,and if q max(3, 2) where q is given Assumption 4.2, we have 1

    n

    nt=1

    g1

    xtn

    ut,

    1

    n

    nt=1

    g2

    xtn

    D

    u

    10

    g1(W(t))dt +1

    0

    g1(W(t))dU(t),

    10

    g2(W(t))dt

    , (41)

    as n , where u = j=0 E(0uj), and (W(t), U(t)) is a bivariate Brownian motion

    with covariance matrix

    =

    2E20 E00

    E00 2E20

    .

    Proof. See Theorem 4.3 of Ibragimov and Phillips (2008) with minor improvements. We

    omit the details.

    Lemma 8.6. Under Assumptions 3.1 (i) and 3.2, we have

    dnn

    sup

    N(0)

    n

    t=1 fi(xt, ) fj(xt, ) fi(xt, 0) fj(xt, 0)

    P 0, (42)

    dnn

    supN(0)

    nt=1

    fij(xt, ) [f(xt, ) f(xt, 0)] P 0, (43)as n first and then 0, for any 1 i, j m. If in addition Assumption 3.1(ii)(iii), then

    dnn

    supN(0)

    nt=1

    fij(xt, ) ut P 0, (44)

    26

  • 7/28/2019 Nonlinear Cointegrating Regressions

    27/56

    as n first and then 0, for any 1 i, j m.Similarly, under Assumptions 3.1* (i) and 3.2*, (42) and (43) are true if we replace

    dn/n bya(n)1. If in addition Assumption 3.1* (ii) and (iii), then (44) holds if we replace

    dn/n by a(n)1.

    Proof. We only prove (42). Others are similar and the details are omitted. First note

    that, by Assumption 3.2 (i) and (iii),

    sup

    |fi(xt, )| sup

    |fi(xt, ) fi(xt, 0)| + |fi(xt, 0)|

    sup

    h( 0)T(xt) + |fi(xt, 0)| C.

    It follows that

    fi(xt, ) fj(xt, ) fi(xt, 0) fj(xt, 0) fi(xt, )fj(xt, ) fj(xt, 0)+ fj(xt, 0)fi(xt, ) fi(xt, 0) Cfj(xt, ) fj(xt, 0)+ C1fi(xt, ) fi(xt, 0).

    Therefore, by recalling (33), the result (42) follows from an application of Lemma 8.1 with

    2n = n/dn.

    Lemma 8.7. Under Assumptions 3.3 and 3.4, we have

    1nvi(dn)vj(dn)

    supN(0)

    nt=1

    fi(xt, ) fj(xt, ) fi(xt, 0) fj(xt, 0) P 0, (45)1

    nv(dn)vij(dn)sup

    N(0)

    nt=1

    fij(xt, ) [f(xt, ) f(xt, 0)] P 0, (46)1

    nvij(dn)sup

    N(0)

    nt=1

    fij(xt, ) ut P 0, (47)

    as n first and then 0, for any 1 i, j m.

    Proof. Recall max1tn |xt|/dn = OP(1). Without loss of generality, we assume max1tn |xt|/dn K0 for some K0 > 0. It follows from Assumption 3.4 and dn that, for any 1 i mand , fi(xt, ) vi(dn)|hi(xt/dn)| + o(1) T1fi(xt/dn),fi(xt, ) fi(xt, 0) Afi(|| 0||)vi(dn)T1fi(xt/dn).

    27

  • 7/28/2019 Nonlinear Cointegrating Regressions

    28/56

    This implies that

    supN(0)

    nt=1

    fi(xt, ) fj(xt, ) fi(xt, 0) fj(xt, 0)

    2 (Afi() + Afj()) vi(dn)vj(dn)

    n

    t=1 |hi(xt/dn)| + T1fi(xt/dn). (48)(45) now follows from Afi() 0 as 0 and

    1

    n

    nt=1

    |hi(xt/dn)| + T1fi(xt/dn) D 10

    |hi(G(t))| + T1fi(G(t))dt = OP(1),due to Assumptions 3.3 (iii) and 3.4 (iii).

    The proof of (46) is similar and hence the details are omitted. As for (47), by noting

    nt=1

    fij(xt, ) ut nt=1

    fij(xt, 0) ut+ nt=1

    |fij(xt, ) fij(xt, 0)| |ut|,

    the result can be proved by using the similar arguments as in (31) and (32) of Lemma

    8.1.

    Lemma 8.8. Let Assumption 3.3 hold. Then, for any regular functions g(x, ) and

    g1(x, ) on , we have

    1nn

    t=1 gxtdn , ut, 1nn

    t=1 g1xtdn , D

    10

    g

    G(t),

    dU(t),

    10

    g1

    G(t),

    dt

    . (49)

    Proof. Ifg(x, ) and g1(x, ) are continuous functions, the result follows from (20) and the

    continuous mapping theorem. See, Kurtz and Protter (1991) for instance. The extension

    from continuous function to regular function is standard in literature. The details can be

    found in PP or Park and Phillips (1999).

    Lemma 8.9. LetDn(, 0) = Qn() Qn(0). Suppose that, for any > 0,lim infn

    inf|0|

    Dn(, 0) > 0 in probability, (50)

    then n P 0.

    Proof. See Lemma 1 of Wu (1981).

    28

  • 7/28/2019 Nonlinear Cointegrating Regressions

    29/56

    8.2 Proof of Theorems

    Proof of Theorem 2.1. Let N be any open subset of containing 0. Since n isthe minimizer of Qn() over , by Lemma 8.9, proving consistency is equivalent toshowing that, for any 0 < < 1 and = 0, where , 0 , there exist n0 > 0 andM1 > 0 such that

    P

    infNc

    Dn(, 0) 2n/M1

    1 , (51)

    for all n > n0.

    DenoteN(0) = { : 0 < }. Note that Nc is compact, by the finite coveringproperty of compact set, (51) will follow if we prove that, for any fixed 0 Nc,

    supN(0)

    2n

    Dn(, 0) Dn(0, 0)

    P 0, (52)

    as n first and then 0, and for > 0, there exists M1 > 0 and n > n0 suchthat for all n > n0,

    P

    Dn(0, 0) 2n/M1

    1 2. (53)

    The result (52) is simple. Indeed, by Lemma 8.1, for each fixed 0 Nc, we have

    supN(0)

    2nDn(, 0) Dn(0, 0)

    supN(0) 2n

    n

    t=1

    (f(xt, ) f(xt, 0))2

    + supN(0)

    2n

    nt=1

    |f(xt, ) f(xt, 0)||ut|

    P 0, (54)

    as n first and then 0, which yields (52).To prove (53), we recall

    Dn

    (0,

    0) =

    n

    t=1 (f(xt, 0) f(xt, 0))2 n

    t=1 (f(xt, 0) f(xt, 0))ut.This, together with (5) and (31), implies that, for any > 0, there exists n0 > 0 and

    M1 > 0 such that for all n > n0,

    P

    Dn(0, 0) 2nM11

    P n

    t=1

    (f(xt, 0) f(xt, 0))2 2n M11 /2

    1 2,

    29

  • 7/28/2019 Nonlinear Cointegrating Regressions

    30/56

    which implies (53).

    Finally, by noting Qn(0) = n1n

    t=1 u2t P 2 due to Assumption 2.2 and strong

    law of large number, it follows from the consistency of n and (52) that

    |2n

    2

    | C2n

    |Qn(n)

    Qn(0)

    |+ oP(1) = oP(1).

    Proof of Theorem 2.2. (4) follows from (33) of Lemma 8.2. (5) follows from (34) of

    Lemma 8.2 with g2(x) = (f(x, ) f(x, 0))2 and the facts that P(LG(1, 0) > 0) = 1 and(f(s, ) f(s, 0))2ds > 0, for any = 0.

    Proof of Theorem 2.3. (4) follows from (35) of Lemma 8.3 with g(x) = T(x) + T2(x).

    (5) follows from (36) of Lemma 8.3 with g2(x) = (f(x, ) f(x, 0))2 and the facts thatP( > 0) = 1 and (f(s, ) f(s, 0))2(ds) > 0, for any = 0.Proof of Theorem 2.4. Recalling T(x) v() T1(x), we have

    1

    nv2(dn)

    nt=1

    [T(xt) + T2(xt)] 1

    n

    nt=1

    T1xt

    dn

    + T21

    xtdn

    D

    10

    [T1(G(t)) + T2

    1 (G(t))]dt, (55)

    where the convergence in distribution comes from Berkes and Horvath (2006). This proves

    (4) with n = nv(dn) due to the local integrability of T1 + T21 . To prove (5), by lettingG(, x) := f(x,) f(x,0) v()g(x,,0), we have

    1

    nv2(dn)

    nt=1

    (f(xt, ) f(xt, 0))2

    =1

    nv2(dn)

    nt=1

    G

    dn,xtdn

    + v(dn)g

    xtdn

    , , 0

    2 1

    n

    n

    t=1 g2

    xtdn

    , , 0 |n|, (56)where n =

    2nv(dn)

    nt=1 G

    dn,

    xtdn

    gxtdn

    , , 0

    . The similar arguments as in the proof of

    (55) yield that

    1

    n

    nt=1

    g2xt

    dn, , 0

    D Z :=

    10

    g2(G(s), , 0)ds, (57)

    30

  • 7/28/2019 Nonlinear Cointegrating Regressions

    31/56

    where P(0 < Z < ) = 1 due to Assumption 2.1* (i). On the other hand, it follows from(10) that

    1

    nv2(dn)

    nt=1

    G2

    dn,xtdn

    o(1)

    nv2(dn)

    nt=1

    T2(xt) o(1)n

    nt=1

    T21 (xt/dn) = oP(1).

    Now (5) follows from (56), (57) and the fact that

    n 2 1

    nv2(dn)

    nt=1

    G2

    dn,xtdn

    1/2 1n

    nt=1

    g2xt

    dn, , 0

    1/2 P 0.

Proof of Theorem 2.5. Let $\Theta_0=\{\theta:\|\theta-\theta_0\|\ge\delta\}$, where $\delta>0$ is a constant. By virtue of Lemma 8.9, it suffices to prove that, for any $\epsilon,M_0>0$, there exists an $n_0>0$ such that, for all $n>n_0$,

$$P\Big(n^{-1}\inf_{\theta\in\Theta_0}D_n(\theta,\theta_0)>M_0\Big)>1-\epsilon. \qquad (58)$$

To prove (58), first note that $\sum_{t=1}^{n}u_t^2/n\le M_1$ in probability, for some $M_1>0$, due to Assumption 2.2 (i). This, together with the Cauchy-Schwarz inequality, yields that

$$n^{-1}D_n(\theta,\theta_0)
=\frac{1}{n}\sum_{t=1}^{n}\big(f(x_t,\theta)-f(x_t,\theta_0)\big)^2-\frac{2}{n}\sum_{t=1}^{n}\big(f(x_t,\theta)-f(x_t,\theta_0)\big)u_t$$
$$\ge\frac{1}{n}\sum_{t=1}^{n}\big(f(x_t,\theta)-f(x_t,\theta_0)\big)^2-\frac{2}{n}\Big(\sum_{t=1}^{n}\big(f(x_t,\theta)-f(x_t,\theta_0)\big)^2\Big)^{1/2}\Big(\sum_{t=1}^{n}u_t^2\Big)^{1/2}$$
$$\ge M_n(\theta,\theta_0)\Big[1-2\big(M_1+o_P(1)\big)^{1/2}M_n(\theta,\theta_0)^{-1/2}\Big], \qquad (59)$$

where $M_n(\theta,\theta_0)=\frac{1}{n}\sum_{t=1}^{n}(f(x_t,\theta)-f(x_t,\theta_0))^2$. Hence, for any equivalent process $x_k^*$ of $x_k$ (i.e., $x_k^*=_Dx_k$, $1\le k\le n$, $n\ge1$, where $=_D$ denotes equivalence in distribution), we have

$$P\Big(n^{-1}\inf_{\theta\in\Theta_0}D_n(\theta,\theta_0)>M_0\Big)
\ge P\Big(\inf_{\theta\in\Theta_0}M_n^*(\theta,\theta_0)\Big[1-2\big(M_1+o_P(1)\big)^{1/2}M_n^*(\theta,\theta_0)^{-1/2}\Big]>M_0\Big), \qquad (60)$$

where $M_n^*(\theta,\theta_0)=\frac{1}{n}\sum_{t=1}^{n}(f(x_t^*,\theta)-f(x_t^*,\theta_0))^2$.

Recalling $x_{[nt]}/d_n\to_DG(t)$ on $D[0,1]$ and $G(t)$ is a continuous Gaussian process, by the so-called Skorohod-Dudley-Wichura representation theorem (e.g., Shorack and Wellner, 1986, p. 49, Remark 2), we can choose an equivalent process $x_k^*$ of $x_k$ so that

$$\sup_{0\le t\le1}\big|x_{[nt]}^*/d_n-G(t)\big|=o_P(1). \qquad (61)$$

For this equivalent process $x_t^*$, it follows from the structure of $f(x,\theta)$ that

$$m(d_n,\theta)^2:=\frac{1}{n\,v(d_n,\theta)^2}\sum_{t=1}^{n}f(x_t^*,\theta)^2
=\frac{1}{n}\sum_{t=1}^{n}h(x_t^*/d_n,\theta)^2+o_P(1)
=\int_0^1h(x_{[ns]}^*/d_n,\theta)^2\,ds+o_P(1)
\to_P\int_0^1h(G(s),\theta)^2\,ds=:m(\theta)^2, \qquad (62)$$

uniformly in $\theta\in\Theta$. Due to (62), the same argument as in the proof of Theorem 4.3 in PP yields that

$$\inf_{\theta\in\Theta_0}M_n^*(\theta,\theta_0)\to\infty,\quad\text{in probability},$$

which, together with (60), implies (58).

Proof of Theorem 3.1. It is readily seen that

$$\dot Q_n(\theta_0)=\sum_{t=1}^{n}\dot f(x_t,\theta_0)\big(y_t-f(x_t,\theta_0)\big)=\sum_{t=1}^{n}\dot f(x_t,\theta_0)u_t,$$
$$\ddot Q_n(\theta)=\sum_{t=1}^{n}\dot f(x_t,\theta)\dot f(x_t,\theta)'-\sum_{t=1}^{n}\ddot f(x_t,\theta)\big(y_t-f(x_t,\theta)\big),$$

for any $\theta\in\Theta$. It follows from conditions (i)-(iii) that

$$\sup_{\theta:\|D_n(\theta-\theta_0)\|\le k_n}\big\|(D_n^{-1})'\big[\ddot Q_n(\theta)-\ddot Q_n(\theta_0)\big]D_n^{-1}\big\|=o_P(1).$$

By using (iii) and (iv), we have $(D_n^{-1})'\dot Q_n(\theta_0)=O_P(1)$ and $(D_n^{-1})'\ddot Q_n(\theta_0)D_n^{-1}\to_DM$. As $M>0$, a.s., $\lambda_{\min}\big[(D_n^{-1})'\ddot Q_n(\theta_0)D_n^{-1}\big]\ge\eta_n$ with probability that goes to one, for some $\eta_n>0$, where $\lambda_{\min}(A)$ denotes the smallest eigenvalue of $A$ and $\eta_n\to0$ can be chosen as slowly as required. Due to these facts, a modification of Lemma 1 in Andrews and Sun (2004) implies (13). Note that (iv) implies (iv'). The result $D_n(\hat\theta_n-\theta_0)\to_DM^{-1}Z$ follows from (13) and the continuous mapping theorem.
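For readers who want the intermediate step spelled out, the expansion behind this argument can be written schematically as follows (this display is added here for exposition and is not quoted from the paper; it uses the sign convention for $\dot Q_n$ and $\ddot Q_n$ adopted above, with $\bar\theta_n$ a mean-value point between $\hat\theta_n$ and $\theta_0$):

$$0=(D_n^{-1})'\dot Q_n(\hat\theta_n)=(D_n^{-1})'\dot Q_n(\theta_0)-(D_n^{-1})'\ddot Q_n(\bar\theta_n)D_n^{-1}\,D_n(\hat\theta_n-\theta_0),$$
$$D_n(\hat\theta_n-\theta_0)=\big[(D_n^{-1})'\ddot Q_n(\bar\theta_n)D_n^{-1}\big]^{-1}(D_n^{-1})'\dot Q_n(\theta_0)\to_DM^{-1}Z,$$

where conditions (i)-(iii) allow $\ddot Q_n(\bar\theta_n)$ to be replaced by $\ddot Q_n(\theta_0)$, and (iv) supplies the joint convergence of the two factors.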

Proof of Theorem 3.2. It suffices to verify the conditions (i)-(iii) and (iv) of Theorem 3.1 with

$$D_n=\sqrt{n/d_n}\,I,\qquad Z=\Sigma^{1/2}N\,L_G^{1/2}(1,0),\qquad M=\Sigma\,L_G(1,0),$$

where $I$ is an identity matrix, under Assumptions 3.1 and 3.2. In fact, as $n/d_n\to\infty$, we may take $k_n$ such that $\{\theta:\|D_n(\theta-\theta_0)\|\le k_n\}$ falls in $N(\theta_0)=\{\theta:\|\theta-\theta_0\|<\delta\}$. This, together with Lemma 8.6, yields (i)-(iii).

On the other hand, it follows from Lemma 8.2 with $g_1(x)=\alpha_3'\dot f(x,\theta_0)$ and $g_2(x)=\alpha_1'\dot f(x,\theta_0)\dot f(x,\theta_0)'\alpha_2$ that, for any $\alpha_i=(\alpha_{i1},...,\alpha_{im})'\in R^m$, $i=1,2,3$,

$$\big(\alpha_3'Z_n,\ \alpha_1'Y_n\alpha_2\big)
=\Big(\Big(\frac{d_n}{n}\Big)^{1/2}\sum_{t=1}^{n}\alpha_3'\dot f(x_t,\theta_0)u_t,\ \frac{d_n}{n}\sum_{t=1}^{n}\alpha_1'\dot f(x_t,\theta_0)\dot f(x_t,\theta_0)'\alpha_2\Big)$$
$$\to_D\big(\bar N\,L_G^{1/2}(1,0),\ \alpha_1'M\alpha_2\big)=_D\big(\alpha_3'Z,\ \alpha_1'M\alpha_2\big), \qquad (63)$$

where $\bar N$ is normal with variance $\int\big[\alpha_3'\dot f(s,\theta_0)\big]^2ds$, $N$ is a standard normal random variable independent of $G(t)$, and we have used the fact that

$$\bar N=\Big(\int\big[\alpha_3'\dot f(s,\theta_0)\big]^2ds\Big)^{1/2}N
=_D\Big(\alpha_3'\int\dot f(s,\theta_0)\dot f(s,\theta_0)'ds\,\alpha_3\Big)^{1/2}N
=\alpha_3'\Sigma^{1/2}N.$$

This proves (iv).
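The mixed normal shape implied by (63) can be seen in a small Monte Carlo experiment. The sketch below is illustrative only: $f(x,\theta)=\exp\{-\theta|x|\}$ with $\theta_0=0.1$, a Gaussian random-walk regressor, i.i.d. errors and the sample size used are assumptions of the sketch, and $d_n=\sqrt{n}$ so that the scaling $(n/d_n)^{1/2}$ equals $n^{1/4}$.

    # Illustrative sketch only: Monte Carlo draws of the n^{1/4}-scaled NLS error in the
    # integrable case. The DGP (f(x, theta) = exp(-theta*|x|), theta_0 = 0.1, sigma = 0.5,
    # Gaussian random-walk x_t) is an assumption of this sketch.
    import numpy as np
    from scipy.optimize import minimize_scalar

    rng = np.random.default_rng(2)
    theta0, sigma, n, reps = 0.1, 0.5, 1000, 500

    def one_rep():
        x = np.cumsum(rng.standard_normal(n))
        y = np.exp(-theta0 * np.abs(x)) + sigma * rng.standard_normal(n)
        Qn = lambda th: np.mean((y - np.exp(-th * np.abs(x))) ** 2)
        th_hat = minimize_scalar(Qn, bounds=(0.001, 2.0), method="bounded").x
        return n ** 0.25 * (th_hat - theta0)       # (n/d_n)^{1/2} = n^{1/4} with d_n = sqrt(n)

    scaled = np.array([one_rep() for _ in range(reps)])
    std = scaled.std()
    print("mean:", scaled.mean(), " sd:", std)
    print("excess kurtosis proxy:", np.mean(((scaled - scaled.mean()) / std) ** 4) - 3.0)

The random norming by $L_G^{1/2}(1,0)$ typically shows up as a symmetric distribution with heavier tails than a normal density would suggest.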

Proof of Theorem 3.3. As in the proof of Theorem 3.2, it suffices to verify the conditions (i)-(iii) and (iv) of Theorem 3.1 with $D_n=a(n)\,I$ and with $Z$ and $M$ as specified in the statement of Theorem 3.3, under Assumptions 3.1* and 3.2*. The details are similar to those of Theorem 3.2, with Lemma 8.2 replaced by Lemma 8.3, and hence are omitted.

Proof of Theorem 3.4. It suffices to verify the conditions (i)-(iii) and (iv) of Theorem 3.1 with $D_n=\mathrm{diag}(\sqrt{n}\,v_1(d_n),...,\sqrt{n}\,v_m(d_n))$,

$$Z=\int_0^1\dot h(G(t),\theta_0)\,dU(t),\qquad M=\int_0^1\dot h(G(t),\theta_0)\,\dot h(G(t),\theta_0)'\,dt,$$

under Assumptions 3.3 and 3.4. Recalling $\sup_{1\le i,j\le m}\big|\frac{v(d_n)\,v_{ij}(d_n)}{v_i(d_n)\,v_j(d_n)}\big|<\infty$, it follows from (46) that

$$\frac{1}{n\,v_i(d_n)v_j(d_n)}\sup_{\theta\in N(\theta_0)}\Big|\sum_{t=1}^{n}\ddot f_{ij}(x_t,\theta)\big[f(x_t,\theta)-f(x_t,\theta_0)\big]\Big|
\le\frac{C}{n\,v_{ij}(d_n)v(d_n)}\sup_{\theta\in N(\theta_0)}\Big|\sum_{t=1}^{n}\ddot f_{ij}(x_t,\theta)\big[f(x_t,\theta)-f(x_t,\theta_0)\big]\Big|=o_P(1),$$

for any $1\le i,j\le m$. On the other hand, as $\sqrt{n}\,v_i(d_n)\to\infty$ for all $1\le i\le m$, we may take $k_n$ such that $\{\theta:\|D_n(\theta-\theta_0)\|\le k_n\}$ falls in $N(\theta_0)=\{\theta:\|\theta-\theta_0\|<\delta\}$. These facts imply that

$$\sup_{\theta:\|D_n(\theta-\theta_0)\|\le k_n}\Big\|(D_n^{-1})'\sum_{t=1}^{n}\ddot f(x_t,\theta)\big[f(x_t,\theta)-f(x_t,\theta_0)\big]D_n^{-1}\Big\|=o_P(1).$$

This proves the required (ii). The proofs of (i) and (iii) are similar, and we omit the details. Finally, (iv) follows from Lemma 8.8 with

$$g(x,\theta_0)=\alpha_3'\dot h(x,\theta_0)\quad\text{and}\quad g_1(x,\theta_0)=\alpha_1'\dot h(x,\theta_0)\dot h(x,\theta_0)'\alpha_2.$$

Proof of Theorem 4.1. We first establish the consistency result. The proof goes along the same lines as that of Theorem 2.1. It suffices to show that, for any fixed $\theta_0'\in N^c$,

$$\sup_{\theta\in N(\theta_0')}\frac{d_n}{n}\sum_{t=1}^{n}\big|[f(x_t,\theta)-f(x_t,\theta_0')]u_t\big|\to_P0, \qquad (64)$$

as $\delta\to0$ uniformly for all large $n$, and

$$\sum_{t=1}^{n}\big(f(x_t,\theta_0')-f(x_t,\theta_0)\big)u_t=o_P(n/d_n). \qquad (65)$$

By (39) in Lemma 8.4 with $g_1(x)=|T(x)|$, (64) follows from

$$\sup_{\theta\in N(\theta_0')}\frac{d_n}{n}\sum_{t=1}^{n}\big|[f(x_t,\theta)-f(x_t,\theta_0')]u_t\big|
\le\sup_{\theta\in N(\theta_0')}h(\|\theta-\theta_0'\|)\,\frac{d_n}{n}\sum_{t=1}^{n}|T(x_t)|\,|u_t|
\le C\sup_{\theta\in N(\theta_0')}h(\|\theta-\theta_0'\|)\,O_P(1)\to_P0,$$

as $\delta\to0$. Similarly, by (40) in Lemma 8.4 with $g_1(x)=f(x,\theta_0')-f(x,\theta_0)$, we have that

$$\sum_{t=1}^{n}\big(f(x_t,\theta_0')-f(x_t,\theta_0)\big)u_t=O_P\big[(n/d_n)^{1/2}\big],$$

which implies the required (65).

We next prove the convergence in distribution. As in Theorem 3.2, it suffices to verify the conditions (i)-(iii) and (iv) of Theorem 3.1, but with

$$D_n=\sqrt{n/d_n}\,I,\qquad Z=\Sigma^{1/2}N\,L_G^{1/2}(1,0),\qquad M=\Sigma\,L_G(1,0),$$

under Assumption 4.1.

The proofs for (i) and (ii) are exactly the same as those of Theorem 3.2. Using similar arguments as in the proof of (63), (iv) follows from Lemma 8.4.

Finally, note that (64) and (65) still hold if we replace $f(x,\theta)$ by $\ddot f_{ij}(x,\theta)$, due to Assumption 3.2. By choosing $k_n$ in the same way as in the proof of Theorem 3.2, we have, for any $1\le i,j\le m$,

$$\sup_{\theta:\|D_n(\theta-\theta_0)\|\le k_n}\frac{d_n}{n}\Big|\sum_{t=1}^{n}\ddot f_{ij}(x_t,\theta)u_t\Big|
\le\sup_{\theta:\|D_n(\theta-\theta_0)\|\le k_n}\frac{d_n}{n}\Big|\sum_{t=1}^{n}\big[\ddot f_{ij}(x_t,\theta)-\ddot f_{ij}(x_t,\theta_0)\big]u_t\Big|+\frac{d_n}{n}\Big|\sum_{t=1}^{n}\ddot f_{ij}(x_t,\theta_0)u_t\Big|=o_P(1), \qquad (66)$$

which implies the required (iii).

Proof of Theorem 4.2. The proof goes along the same lines as those of Theorems 2.1 and 4.1. Write $v_n=v(\sqrt{n})$, $v_{in}=v_i(\sqrt{n})$, $v_{ijn}=v_{ij}(\sqrt{n})$, for $1\le i,j\le m$. It suffices to show that, for any fixed $\theta_0'\in N^c$,

$$\sup_{\theta\in N(\theta_0')}\frac{1}{n\,v_n}\sum_{t=1}^{n}\big|[f(x_t,\theta)-f(x_t,\theta_0')]u_t\big|\to_P0, \qquad (67)$$

as $\delta\to0$ uniformly for all large $n$, and

$$\sum_{t=1}^{n}\big(f(x_t,\theta_0')-f(x_t,\theta_0)\big)u_t=o_P(n\,v_n). \qquad (68)$$

In fact, by the Cauchy-Schwarz inequality and then the weak law of large numbers, we have

$$\sup_{\theta\in N(\theta_0')}\frac{1}{n\,v_n}\sum_{t=1}^{n}\big|[f(x_t,\theta)-f(x_t,\theta_0')]u_t\big|
\le\sup_{\theta\in N(\theta_0')}\frac{1}{n\,v_n}\Big(\sum_{t=1}^{n}\big[f(x_t,\theta)-f(x_t,\theta_0')\big]^2\Big)^{1/2}\Big(\sum_{t=1}^{n}u_t^2\Big)^{1/2}+o_P(1)$$
$$\le C\sup_{\theta\in N(\theta_0')}\Big(\frac{1}{n}\sum_{t=1}^{n}\Big[A_f(\theta-\theta_0')\,T_{1f}\Big(\frac{x_t}{\sqrt{n}}\Big)\Big]^2\Big)^{1/2}+o_P(1)\to_P0,$$

as $\delta\to0$. This proves (67). Furthermore, by (41) in Lemma 8.5 with $g(x)=h(x,\theta_0')-h(x,\theta_0)$, we have

$$\sum_{t=1}^{n}\big(f(x_t,\theta_0')-f(x_t,\theta_0)\big)u_t=O_P\big[(n\,v_n)^{1/2}\big],$$

which implies the required (68).

Proof of Theorem 4.3. It suffices to verify the conditions (i)-(iii) and (iv) of Theorem 3.1 with $D_n=\mathrm{diag}(\sqrt{n}\,v_{1n},...,\sqrt{n}\,v_{mn})$,

$$M=\int_0^1\dot h(G(t),\theta_0)\,\dot h(G(t),\theta_0)'\,dt,\qquad
Z=\Lambda_u\int_0^1\dot h(G(t),\theta_0)\,dt+\int_0^1\dot h(G(t),\theta_0)\,dU(t), \qquad (69)$$

under Assumption 4.1.

The proofs for (i) and (ii) are exactly the same as those in Theorem 3.4. Note that $\sup_{1\le i,j\le m}\big|\frac{v_n\,v_{ijn}}{v_{in}\,v_{jn}}\big|<\infty$. It follows that, by choosing $k_n$ in the same way as in the proof of Theorem 3.2, we have, for any $1\le i,j\le m$,

$$\sup_{\theta:\|D_n(\theta-\theta_0)\|\le k_n}\frac{1}{n\,v_{in}v_{jn}}\Big|\sum_{t=1}^{n}\ddot f_{ij}(x_t,\theta)u_t\Big|
\le\sup_{\theta:\|D_n(\theta-\theta_0)\|\le k_n}\frac{C}{n\,v_{ijn}}\Big|\sum_{t=1}^{n}\big[\ddot f_{ij}(x_t,\theta)-\ddot f_{ij}(x_t,\theta_0)\big]u_t\Big|+\frac{C}{n\,v_{ijn}}\Big|\sum_{t=1}^{n}\ddot f_{ij}(x_t,\theta_0)u_t\Big|=o_P(1),$$

where the convergence of the first term follows from (67) with $\theta_0'=\theta_0$ and $f$ replaced by $\ddot f$, and that of the second term follows from Lemma 8.5. This yields the required (iii).

Finally, (iv) follows from Lemma 8.5 with

$$g_1(x)=\alpha_3'\dot h(x,\theta_0)\quad\text{and}\quad g_2(x)=\alpha_1'\dot h(x,\theta_0)\dot h(x,\theta_0)'\alpha_2.$$

REFERENCES

Andrews, D. W. K. and Sun, Y., 2004. Adaptive local polynomial Whittle estimation of long-range dependence. Econometrica, 72, 569-614.

Boden, T. A., Marland, G. and Andres, R. J., 2009. Global, Regional, and National Fossil-Fuel CO2 Emissions. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, U.S. Department of Energy, Oak Ridge, Tenn., U.S.A.

Bae, Y. and de Jong, R., 2007. Money demand function estimation by nonlinear cointegration. Journal of Applied Econometrics, 22(4), 767-793.

Bae, Y., Kakkar, V. and Ogaki, M., 2004. Money demand in Japan and the liquidity trap. Ohio State University Department of Economics Working Paper #0406.

Bae, Y., Kakkar, V. and Ogaki, M., 2006. Money demand in Japan and nonlinear cointegration. Journal of Money, Credit, and Banking, 38(6), 1659-1667.

Bates, D. M. and Watts, D. G., 1988. Nonlinear Regression Analysis and Its Applications. Wiley, New York.

Berkes, I. and Horvath, L., 2006. Convergence of integral functionals of stochastic processes. Econometric Theory, 22(2), 304-322.

Chang, Y. and Park, J. Y., 2011. Endogeneity in nonlinear regressions with integrated time series. Econometric Reviews, 30, 51-87.

Chang, Y. and Park, J. Y., 2003. Index models with integrated time series. Journal of Econometrics, 114, 73-106.

Chang, Y., Park, J. Y. and Phillips, P. C. B., 2001. Nonlinear econometric models with cointegrated and deterministically trending regressors. Econometrics Journal, 4, 1-36.

Chen, X., 1999. How often does a Harris recurrent Markov chain recur? The Annals of Probability, 27, 1324-1346.

Choi, I. and Saikkonen, P., 2010. Tests for nonlinear cointegration. Econometric Theory, 26, 682-709.

Dijkgraaf, E. and Vollebergh, H. R., 2005. A test for parameter homogeneity in CO2 panel EKC estimations. Environmental and Resource Economics, 32(2), 229-239.

de Jong, R., 2002. Nonlinear Regression with Integrated Regressors but Without Exogeneity. Mimeograph, Department of Economics, Michigan State University.

Feller, W., 1971. An Introduction to Probability Theory and Its Applications, Vol. II, 2nd ed. Wiley, New York.

Gao, J., King, M., Lu, Z. and Tjøstheim, D., 2009. Specification testing in nonlinear and nonstationary time series autoregression. The Annals of Statistics, 37, 3893-3928.

Granger, C. W. J. and Terasvirta, T., 1993. Modelling Nonlinear Economic Relationships. Oxford University Press, New York.

Hall, P. and Heyde, C. C., 1980. Martingale Limit Theory and Its Application. Probability and Mathematical Statistics, Academic Press, New York.

Hansen, B. E., 1992. Convergence to stochastic integrals for dependent heterogeneous processes. Econometric Theory, 8, 489-500.

Ibragimov, R. and Phillips, P. C. B., 2008. Regression asymptotics using martingale convergence methods. Econometric Theory, 24, 888-947.

Jeganathan, P., 2008. Limit theorems for functional sums that converge to fractional Brownian and stable motions. Cowles Foundation Discussion Paper No. 1649, Cowles Foundation for Research in Economics, Yale University.

Karlsen, H. A. and Tjøstheim, D., 2001. Nonparametric estimation in null recurrent time series. The Annals of Statistics, 29, 372-416.

Kurtz, T. G. and Protter, P., 1991. Weak limit theorems for stochastic integrals and stochastic differential equations. The Annals of Probability, 19, 1035-1070.

Lai, T. L., 1994. Asymptotic properties of nonlinear least squares estimates in stochastic regression models. The Annals of Statistics, 22, 1917-1930.

Lai, T. L. and Wei, C. Z., 1982. Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. The Annals of Statistics, 10, 154-166.

List, J. and Gallet, C., 1999. The environmental Kuznets curve: does one size fit all? Ecological Economics, 31, 409-423.

Maddison, A., 2003. The World Economy: Historical Statistics. OECD, Paris.

Muller-Furstenberger, G. and Wagner, M., 2007. Exploring the environmental Kuznets hypothesis: Theoretical and econometric problems. Ecological Economics, 62(3), 648-660.

Nummelin, E., 1984. General Irreducible Markov Chains and Non-negative Operators. Cambridge University Press, Cambridge.

Orey, S., 1971. Limit Theorems for Markov Chain Transition Probabilities. Van Nostrand Reinhold, London.

Piaggio, M. and Padilla, E., 2012. CO2 emissions and economic activity: Heterogeneity across countries and non-stationary series. Energy Policy, 46, 370-381.

Park, J. Y. and Phillips, P. C. B., 1999. Asymptotics for nonlinear transformations of integrated time series. Econometric Theory, 15, 269-298.

Park, J. Y. and Phillips, P. C. B., 2001. Nonlinear regressions with integrated time series. Econometrica, 69, 117-161.

Phillips, P. C. B., 2001. Descriptive econometrics for non-stationary time series with empirical illustrations. Cowles Foundation Discussion Paper No. 1023.

Phillips, P. C. B. and Perron, P., 1988. Testing for a unit root in time series regression. Biometrika, 75(2), 335-346.

Phillips, P. C. B. and Solo, V., 1992. Asymptotics for linear processes. The Annals of Statistics, 20, 971-1001.

Terasvirta, T., Tjøstheim, D. and Granger, C. W. J., 2011. Modelling Nonlinear Economic Time Series. Advanced Texts in Econometrics, Oxford University Press, Oxford.

Shi, X. and Phillips, P. C. B., 2012. Nonlinear cointegrating regression under weak identification. Econometric Theory, 28, 509-547.

Skouras, K., 2000. Strong consistency in nonlinear stochastic regression models. The Annals of Statistics, 28, 871-879.

Wagner, M., 2008. The carbon Kuznets curve: A cloudy picture emitted by bad econometrics? Resource and Energy Economics, 30(3), 388-408.

Wang, Q., 2011. Martingale limit theorems revisited and non-linear cointegrating regression. Working paper.

Wang, Q., Lin, Y. X. and Gulati, C. M., 2003. Asymptotics for general fractionally integrated processes with applications to unit root tests. Econometric Theory, 19, 143-164.

Wang, Q. and Phillips, P. C. B., 2009. Asymptotic theory for local time density estimation and nonparametric cointegrating regression. Econometric Theory, 25, 710-738.

Wang, Q. and Phillips, P. C. B., 2009. Structural nonparametric cointegrating regression. Econometrica, 77, 1901-1948.

Wang, Q. and Phillips, P. C. B., 2012. A specification test for nonlinear nonstationary models. The Annals of Statistics, forthcoming.

Wu, C. F., 1981. Asymptotic theory of nonlinear least squares estimation. The Annals of Statistics, 9, 501-513.

Table 2: Means and standard errors of $\hat\theta_n$ and $\hat\sigma_n^2$ with $f(x,\theta)=\exp\{-\theta|x|\}$.

                                 = 0                      = 0.5                    = 1
                      n          Mean       Std. error    Mean       Std. error    Mean       Std. error
Scenario S1
  $\hat\theta_n$      200        0.10018    (0.00511)     0.10018    (0.00517)     0.10011    (0.00563)
                      500        0.10014    (0.00413)     0.10013    (0.00406)     0.10004    (0.00366)
  $\hat\sigma_n^2$    200        8.95974    (0.88064)     8.96749    (0.89797)     8.98783    (0.89446)
                      500        8.98116    (0.57035)     8.98106    (0.56659)     8.99062    (0.57060)
Scenario S2
  $\hat\theta_n$      200        0.10014    (0.00521)     0.10024    (0.00526)     0.10014    (0.00546)
                      500        0.10009    (0.00418)     0.10017    (0.00435)     0.10012    (0.00416)
  $\hat\sigma_n^2$    200        8.97149    (0.89930)     8.96567    (0.88766)     9.00270    (0.89405)
                      500        8.99296    (0.57106)     9.00135    (0.56876)     8.99701    (0.56470)
Scenario S3
  $\hat\theta_n$      200        0.10033    (0.00595)     0.10017    (0.00618)     0.10026    (0.00616)
                      500        0.10022    (0.00481)     0.10020    (0.00499)     0.10012    (0.00489)
  $\hat\sigma_n^2$    200        9.08776    (0.91758)     9.11416    (0.92635)     9.12292    (0.92537)
                      500        9.13221    (0.58686)     9.12295    (0.59715)     9.14155    (0.59417)

Table 3: Means and standard errors of $\hat\theta_n=(\hat\alpha_n,\hat\beta_{1n},\hat\beta_{2n})'$ and $\hat\sigma_n^2$ with $f(x,\alpha,\beta_1,\beta_2)=\alpha+\beta_1x+\beta_2x^2$.

                                 = 0                        = 0.5                      = 1
                      n          Mean        Std. error     Mean        Std. error     Mean        Std. error
Scenario S1
  $\hat\alpha_n$      200       -19.99444    (0.52156)     -19.99776    (0.52417)     -20.00214    (0.53271)
                      500       -20.00171    (0.33352)     -20.00164    (0.33260)     -20.00333    (0.33106)
  $\hat\beta_{1n}$    200         9.99995    (0.04303)      10.01348    (0.04427)      10.02649    (0.04441)
                      500         9.99961    (0.01713)      10.00537    (0.01741)      10.01056    (0.01790)
  $\hat\beta_{2n}$    200         0.09999    (0.00137)       0.09999    (0.00133)       0.10000    (0.00115)
                      500         0.10000    (0.00034)       0.10001    (0.00034)       0.10000    (0.00029)
  $\hat\sigma_n^2$    200         8.87340    (0.89132)       8.83901    (0.89370)       8.79399    (0.87995)
                      500         8.94769    (0.56790)       8.94198    (0.57076)       8.91645    (0.56403)
Scenario S2
  $\hat\alpha_n$      200       -19.99538    (0.54080)     -19.99210    (0.54464)     -20.00177    (0.53812)
                      500       -20.00287    (0.33868)     -20.00412    (0.33885)     -20.00142    (0.34983)
  $\hat\beta_{1n}$    200        10.00054    (0.04456)      10.01464    (0.04606)      10.02658    (0.04718)
                      500        10.00025    (0.01747)      10.00548    (0.01829)      10.01142    (0.01838)
  $\hat\beta_{2n}$    200         0.10001    (0.00145)       0.10000    (0.00136)       0.10000    (0.00118)
                      500         0.09999    (0.00037)       0.10000    (0.00034)       0.10000    (0.00029)
  $\hat\sigma_n^2$    200         8.85807    (0.89311)       8.83928    (0.89470)       8.79743    (0.89541)
                      500         8.95826    (0.56801)       8.94974    (0.56812)       8.92162    (0.56690)
Scenario S3
  $\hat\alpha_n$      200       -19.99784    (0.62713)     -20.00389    (0.60584)     -19.99447    (0.62031)
                      500       -19.99644    (0.39496)     -19.99572    (0.38897)     -20.00707    (0.40941)
  $\hat\beta_{1n}$    200         9.99962    (0.04949)      10.01477    (0.05129)      10.03178    (0.05180)
                      500        10.00011    (0.02033)      10.00647    (0.02073)      10.01282    (0.02142)
  $\hat\beta_{2n}$    200         0.10002    (0.00165)       0.09998    (0.00156)       0.10000    (0.00131)
                      500         0.10000    (0.00040)       0.10000    (0.00039)       0.10000    (0.00033)
  $\hat\sigma_n^2$    200         8.96801    (0.91677)       8.93070    (0.91952)       8.86505    (0.93340)
                      500         9.08728    (0.58567)       9.07935    (0.58298)       9.03815    (0.58235)
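A generic Monte Carlo skeleton for producing summaries of the Table 3 type is sketched below. It is illustrative only: the data-generating details of scenarios S1-S3 are not reproduced, and the random-walk regressor, i.i.d. N(0,9) errors, parameter values $(\alpha,\beta_1,\beta_2)=(-20,10,0.1)$ and 1,000 replications are assumptions of the sketch. Because the polynomial model is linear in its parameters, the NLS estimator coincides with OLS here.

    # Illustrative sketch only: Monte Carlo means and standard errors for the model
    # y_t = alpha + beta1*x_t + beta2*x_t^2 + u_t. The DGP below is an assumption of
    # this sketch and does not reproduce scenarios S1-S3 of the paper.
    import numpy as np

    rng = np.random.default_rng(3)
    alpha, beta1, beta2, sigma = -20.0, 10.0, 0.1, 3.0

    def one_draw(n):
        x = np.cumsum(rng.standard_normal(n))
        y = alpha + beta1 * x + beta2 * x ** 2 + sigma * rng.standard_normal(n)
        X = np.column_stack([np.ones(n), x, x ** 2])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # NLS = OLS: model is linear in parameters
        return coef

    for n in (200, 500):
        draws = np.array([one_draw(n) for _ in range(1000)])
        for name, m, s in zip(("alpha", "beta1", "beta2"), draws.mean(axis=0), draws.std(axis=0)):
            print(f"n={n:3d}  {name:6s}  mean={m:10.5f}  std={s:.5f}")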

Table 4: Density estimate of $\hat\theta_n$ (unscaled).

Table 5: Density estimate of $\hat\alpha_n$ (unscaled).

Table 6: Density estimate of $\hat\beta_{1n}$ (unscaled).

Table 7: Density estimate of $\hat\beta_{2n}$ (unscaled).

Table 8: Density estimate of $\hat\theta_n$ (scaled by $n^{1/4}$).

Table 9: Density estimate of $\hat\alpha_n$ (scaled by $n^{1/2}$).

Table 10: Density estimate of $\hat\beta_{1n}$ (scaled by $n$).

Table 11: Density estimate of $\hat\beta_{2n}$ (scaled by $n^{3/2}$).

Table 12: Density of $\hat\theta_n$ t-ratios.

Table 13: Density of $\hat\alpha_n$ t-ratios.

Table 14: Density of $\hat\beta_{1n}$ t-ratios.

Table 15: Density of $\hat\beta_{2n}$ t-ratios.
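Density plots of the kind listed above can be produced from Monte Carlo draws with a standard kernel density estimator. The sketch below is illustrative only: the Student-t placeholder draws stand in for actual simulation output (for instance, $n^{1/4}$-scaled draws of $\hat\theta_n$ from a simulation like the one sketched earlier), and the Gaussian KDE with its default bandwidth is simply one convenient choice.

    # Illustrative sketch only: kernel density estimate of scaled Monte Carlo draws.
    # `scaled_draws` is a placeholder; in practice it would hold draws such as
    # n**0.25 * (theta_hat - theta_0) from a simulation.
    import numpy as np
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(4)
    scaled_draws = rng.standard_t(df=5, size=2000)      # placeholder draws for the sketch

    kde = gaussian_kde(scaled_draws)                    # Gaussian kernel, Scott's rule bandwidth
    grid = np.linspace(scaled_draws.min(), scaled_draws.max(), 400)
    density = kde(grid)
    print("mode of the estimated density:", grid[np.argmax(density)])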

    Table 16: Estimate and 95% Confidence Interval of n


Table 17: Confidence Intervals (95% confidence) of $\hat\alpha_n$.

Country     $\hat\alpha_n$     Lower        Upper
AUS           -57.387        -58.005      -56.769
AUT           -25.352        -26.414      -24.290
BEL           -41.632        -43.144      -40.121
CAN           -44.442        -45.615      -43.268
CHN           -32.402        -38.126      -26.678
DEN          -113.670       -115.375     -111.965
FIN           -91.869        -94.019      -89.720
FRA           -76.449        -78.162      -74.735
HOL           -71.892        -73.439      -70.344
IND           -59.854        -60.930      -58.779
IRE           -33.058        -34.265      -31.851
ITA           -71.375        -72.604      -70.146
JAP           -27.674        -29.293      -26.056
NOR           -58.519        -60.442      -56.597
USA           -45.950        -46.945      -44.954

Table 18: Confidence Intervals (95% confidence) of $\hat\beta_{1n}$.

Country     $\hat\beta_{1n}$   Lower        Upper
AUS            11.520         10.810       12.230
AUT             5.075          4.193        5.958
BEL             9.116          7.922       10.310
CAN             9.217          8.114       10.321
CHN             7.609          5.063       10.154
DEN            23.724         22.242       25.205
FIN            18.967         17.225       20.709
FRA            16.443         15.000       17.886
HOL            14.921         13.579       16.262
IND            14.908         13.894       15.923
IRE             6.816          6.062        7.570
ITA            14.502         13.647       15.356
JAP             5.447          4.614        6.279
NOR            11.802         10.555       13.050
USA             9.536          8.449       10.623


Table 19: Confidence Intervals (95% confidence) of $\hat\beta_{2n}$.

Country     $\hat\beta_{2n}$   Lower        Upper
AUS            -0.562         -0.730       -0.394
AUT            -0.246         -0.389       -0.102
BEL            -0.485         -0.721       -0.248
CAN            -0.462         -0.732       -0.192
CHN            -0.446         -0.775       -0.118
DEN            -1.226         -1.586       -0.865
FIN            -0.967         -1.256       -0.678
FRA            -0.875         -1.190       -0.559
HOL            -0.762         -1.093       -0.432
IND            -0.945         -1.174       -0.715
IRE            -0.341         -0.443       -0.238
ITA            -0.729         -0.882       -0.576
JAP            -0.258         -0.365       -0.151
NOR            -0.586         -0.822       -0.351
USA            -0.477         -0.768       -0.186
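If the intervals in Tables 17-19 are of the usual normal-approximation form, estimate plus or minus 1.96 times the standard error (an assumption stated here only for illustration; the paper's construction may differ), they can be reproduced from an estimate and its standard error as follows.

    # Illustrative sketch only: normal-approximation 95% confidence interval.
    # Whether Tables 17-19 were built exactly this way is an assumption of this sketch.
    def ci95(estimate, std_error):
        half_width = 1.96 * std_error
        return estimate - half_width, estimate + half_width

    # Back-out example: the AUS entry of Table 19 (-0.562 with interval (-0.730, -0.394))
    # corresponds to a standard error of about (0.730 - 0.394) / (2 * 1.96) = 0.0857
    # under this construction.
    print(ci95(-0.562, 0.0857))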