Nonlinear Cointegrating Regressions

download Nonlinear Cointegrating Regressions

of 56

Transcript of Nonlinear Cointegrating Regressions

  • 7/28/2019 Nonlinear Cointegrating Regressions

    1/56

    Nonlinear regressions with nonstationary time series

    Nigel Chan and Qiying Wang

    The University of Sydney

    February 25, 2013

    Abstract

    This paper develops asymptotic theory for a nonlinear parametric cointegrating

    regression model. We establish a general framework for weak consistency that iseasy to apply for various nonstationary time series, including partial sum of linearprocesses and Harris recurrent Markov chains. We provide limit distributions fornonlinear least squares estimators, extending the previous works. We also introduceendogeneity to the model by allowing the error to be serially dependent and crosscorrelated with the regressors.

    JEL Classification: C13, C22.

    Key words and phrases: Cointegration, nonlinear regressions, consistency, limit distribu-

    tion, nonstationarity, nonlinearity, endogeneity.

    1 Introduction

    The past few decades have witnessed significant developments in cointegration analy-

    sis. In particular, extensive researches have focused on cointegration models with linear

    structure. Although it gives data researchers convenience in implementation, the linear

    structure is often too restrictive. In particular, nonlinear responses with some unknown

    parameters often arise in the context of economics. For empirical examples, we refer to

    Granger and Terasvirta (1993) as well as Terasvirta et al. (2011). In this situation, it is

    expected that nonlinear cointegration captures the features of many long-run relationships

    in a more realistic manner.

    The authors thanks to Editor, Associate Editor and three referees for their very helpful comments onthe original version. The authors also acknowledge research support from Australian Research Council.Address correspondence to Qiying Wang: School of Mathematics and Statistics, University of Sydney,NSW 2006, Australia. E-mail: [email protected]

    1

  • 7/28/2019 Nonlinear Cointegrating Regressions

    2/56

    A typical nonlinear parametric cointegrating regression model has the form

    yt = f(xt, 0) + ut, t = 1,...,n, (1)

    where f : RRm R is a known nonlinear function, xt and ut are regressor and regression

    errors, and 0 is an m-dimensional true parameter vector that lies in the parameter set. With the observed nonstationary data {yt, xt}nt=1, this paper is concerned with thenonlinear least squares (NLS) estimation of the unknown parameters 0 . In thisregard, Park and Phillips (2001) (PP henceforth) considered xt to be an integrated, I(1),

    process. Based on PP framework, Chang et al. (2001) introduced additional linear time

    trend term and stationary regressors into model (1). Chang and Park (2003) extended to

    nonlinear index models driven by integrated processes. More recently, Choi and Saikkonen

    (2010), Gao et al. (2009) and Wang and Phillips (2012) developed statistical tests for

    the existence of a nonlinear cointegrating relation. Park and Chang (2011) allowed the

    regressors xt to be contemporaneously correlated with the regression errors ut and Shi

    and Phillips (2012) extended the model (1) by incorporating a loading coefficient.

    The present paper has a similar goal to the previously mentioned papers but offers

    more general results, which have some advantages for empirical studies. First of all, we

    establish a general framework for weak consistency of the NLS estimator n, allowing for

    the xt to be a wider class of nonstationary time series. The set of sufficient conditions

    are easy to apply to various nonstationary regressors, including partial sum of linear

    processes and recurrent Markov chains. Furthermore, we provide limit distributions for

    the NLS estimator n. It deserves to mention that the routine employed in this paper

    to establish the limit distributions of n is different from those used in the previous

    works, e.g. Park and Phillips (2001). Roughly speaking, our routine is related to joint

    distributional convergence of a martingale under target and its conditional variance, rather

    than using classical martingale limit theorem which requires establishing the convergence

    in probability for the conditional variance. In nonlinear cointegrating regressions, there

    are some advantages for our methodology since it is usually difficult to establish the

    convergence in probability for the conditional variance, in particular, in the situation that

    the regressor xt is a nonstationary time series. Second, in addition to the commonly used

    martingale innovation structure, our model allows for serial dependence in the equilibrium

    errors ut and the innovations driving xt. It is important as our model permits joint

    determination ofxt and yt, and hence the system is a time series structural model. Under

    such situation, the weak consistency and limit distribution of the NLS estimator n are

    2

  • 7/28/2019 Nonlinear Cointegrating Regressions

    3/56

    also established.

    This paper is organized as follow. Section 2 presents our main results on weak consis-

    tency of the NLS estimator n. Theorem 2.1 provides a general framework. Its applications

    to integrable and non-integrable f are given in Theorems 2.22.5, respectively. Section 3

    investigates the limit distributions ofn in which the model (1) has a martingale structure.

    Extension to endogeneity is presented in Section 4. As mentioned above, our routine es-

    tablishing the limit distribution of n is different from previous works. Section 5 performs

    simulation, reporting and discussing the numerical values of means and standard errors

    which provide the evidence of accuracies of our NLS estimator. Section 6 presents an

    empirical example, providing a link between our theory and real applications. The model

    of interest is the carbon Kuznets curve relating the per capita CO2 emissions and per

    capita GDP. Endogeneity occurs in this example due to potential misreporting of GDP,

    omitted variable bias and reverse causality. Section 7 concludes the paper. Technical

    proofs are postponed to Section 8.

    Throughout the paper, we denote constants by C, C1, C2,... which may be different at

    each appearance. For a vector x = (x1,...,xm), assume that x = (x21 + ... + x2m)1/2, andfor a matrix A, the norm operator is defined by A = supx:x=1 xA. Furthermore,the parameter set Rm is assumed to be compact and convex, and the true parametervector 0 is an interior point of .

    2 Weak consistency

    This section considers the estimation of the unknown parameters 0 in model (1) by

    NLS. Let Qn() =n

    t=1(yt f(xt, ))2. The NLS estimator n of 0 is defined to be theminimizer of Qn() over , that is,

    n = arg minQn(), (2)

    and the error estimator is defined by 2n = n1nt=1 u2t , where ut = yt f(xt, n). Toinvestigate the weak consistency for the NLS estimator n, this section assumes the regres-

    sion model (1) to have a martingale structure. In this situation, our sufficient conditions

    are closely related to those of Wu (1981), Lai (1994) and Skouras (2000), and they intend

    to provide a general framework. In comparison to the papers mentioned, our assumptions

    are easy to apply, particularly in nonlinear cointegrating regression situation as stated in

    the two examples below. Extension to endogeneity between xt and ut is investigated in

    3

  • 7/28/2019 Nonlinear Cointegrating Regressions

    4/56

    Section 4.

    2.1 A framework

    We make use of the following assumptions for the development of weak consistency.

    Assumption 2.1. For each , 0 , there exists a real function T : R R such that

    |f(x, ) f(x, 0)| h(|| 0||) T(x), (3)

    where h(x) is a bounded real function such that h(x) h(0) = 0, as x 0.

    Assumption 2.2. (i) {ut, Ft, 1 t n} is a martingale difference sequence satisfyingE(|ut|2|Ft1) = 2 and sup1tn E(|ut|2q|Ft1) < a.s., where q > 1; and

    (ii) xt is adapted to

    Ft

    1, t = 1,...,n.

    Assumption 2.3. There exists an increasing sequence 0 < n such that

    2n

    nt=1

    [T(xt) + T2(xt)] = OP(1), (4)

    and for any 0 < < 1 and = 0, where , 0 , there exist n0 > 0 and M1 > 0 suchthat

    Pn

    t=1 (f(xt, ) f(xt, 0))2

    2n /M1 1 , (5)

    for all n > n0.

    Theorem 2.1. Under Assumptions 2.12.3, the NLS estimator n is a consistent esti-

    mator of 0, i.e. n P 0. If in addition 2nn1 = O(1), then 2n P 2, as n .

    Assumptions 2.1 and 2.2 are the same as those used in Skouras (2000), which are

    standard in the NLS estimation theory. Also see Wu (1981) and Lai (1994). Assumption

    2.3 is used to replace (3.8), (3.9) and (3.11) in Skouras (2000), in which some uniform

    conditions are used. In comparison to Skouras (2000), our Assumption 2.3 is related to

    the conditions on the regressor xt and is more natural and easy to apply. In particular,

    it is directly applicable in the situation that T is integrable and the regressor xt is a

    nonstationary time series, as stated in the following sub-section.

    4

  • 7/28/2019 Nonlinear Cointegrating Regressions

    5/56

    2.2 Identification of Assumption 2.3: integrable functions

    Due to Assumption 2.1, f(x, ) f(x, 0) is integrable in x ifT is an integrable function.This class of functions includes f(x, 1, 2) = 1|x|2I(x [a, b]), where a and b are finiteconstants, the Gaussian function f(x, ) = ex

    2

    , the Laplacian function f(x, ) = e|x|,

    the logistic regression function f(x, ) = e|x|/(1 + e|x|), etc. In this sub-section, two

    commonly used non-stationary regressors xt are shown to satisfy Assumption 2.3 if T is

    integrable.

    Example 1 (Partial sum of linear processes). Let xt =t

    j=1 j, where {j, j 1}is a linear process defined by

    j =k=0

    k jk, (6)

    where

    {j,

    < j 0 for all = 0.

    Then (4) and (5) hold with 2n = n/dn. Consequently, if in addition Assumption 2.2, then

    n P 0.

    Theorem 2.2 improves Theorem 4.1 of PP in two folds. Firstly, we allow for more

    general regressor. The result under C1 is new, which allows xt to be long memory process,

    including the fractionally integrated process as an example. PP only allow xt to satisfy

    C2 with additional conditions on k, that is, they require xt to be a partial sum of short

    memory process. Secondly, we remove the part (b) required in the definition of I-regular

    function given in their Definition 3.3. Furthermore, we allow for certain non-integrable f

    5

  • 7/28/2019 Nonlinear Cointegrating Regressions

    6/56

    such as the logistic regression function f(x, ) = e|x|/(1 + e|x|), 0 for some 0 > 0,although we assume the integrability of T.

    Example 2 (Recurrent Markov Chain). Let {xk}k0 be a Harris recurrent Markovchain with state space (E, E), transition probability P(x, A) and invariant measure . Wedenote P for the Markovian probability with the initial distribution , E for correspon-dent expectation and Pk(x, A) for the k-step transition of{xk}k0. A subset D ofE with0 < (D) < is called D-set of{xk}k0 if for any A E+,

    supxE

    Ex Ak=1

    ID(xk)

    < ,

    where E+ = {A E: (A) > 0} and A = inf{n 1 : xn A}. By Theorem 6.2 of Orey(1971), D-sets not only exist, but generate the entire sigma E, and for any D-sets C, Dand any probability measure , on (E,

    E),

    limn

    nk=1

    Pk(C)/n

    k=1

    Pk(D) =(C)

    (D), (8)

    where Pk(D) = P

    k(x, D)(dx). See Nummelin (1984) for instance.

    Let a D-set D and a probability measure on (E, E) be fixed. Define

    a(t) = 1(D)[t]k=1

    Pk(D), t 0.

    By recurrence, a(t)

    . Here and below, we set the state space to be the real space,

    that is (E, E) = (R, R). We have the following result.Theorem 2.3. Supposext is defined as in Example 2 and Assumption 2.1 holds. Assume:

    (i) T is bounded and |T(x)|(dx) < , and

    (ii)(f(s, ) f(s, 0))2(ds) > 0 for all = 0.

    Then (4) and (5) hold with 2n = a(n). Consequently, if in addition Assumption 2.2, then

    n P 0.

    Theorem 2.3 seems to be new to the literature. By virtue of (8), the asymptotic orderof a(t) depends only on {xk}k0. It is interesting to notice that Theorem 2.3 does notimpose the -regular condition as commonly used in the literature. The Harris recurrent

    Markov chain {xk}k0 is called -regular if

    lim

    a(t)/a() = t, t > 0, (9)

    where 0 < 1. See Chen (1999) for instance.

    6

  • 7/28/2019 Nonlinear Cointegrating Regressions

    7/56

    2.3 Identification of Assumption 2.3: beyond integrable func-tions

    As noticed in Section 2.2, Assumption 2.3 still holds for certain non-integrable f, although

    we require the integrability of T. However, identification of Assumption 2.3 is more

    involved for general non-integrable f. In this situation, instead of requiring T to be

    integrable, we impose certain relationship between f and T.

    Assumption 2.4. (i) There exist a regular function1 g(x,,0) on satisfying|s|

    g2(s,,0)ds > 0

    for all = 0 and > 0, a real function T : R R and a positive real function v()which is bounded away from zero as , such that for any bounded x,

    sup,0

    |f(x,) f(x,0) v() g(x,,0)|/T(x) = o(1), (10)

    as .(ii) There exists a real function T1 : R R such that T(x) v() T1(x) as |x|

    and T1 + T2

    1 is locally integrable (i.e. integrable on any compact set).

    Theorem 2.4. Suppose Assumption 2.4 holds. Suppose that there exists a continuous

    Gaussian process G(t) such that x[nt],n G(t), on D[0, 1], where xi,n = xi/dn and 0 0 continuous functionsH

    , H, and a constant > 0 such that H(x, ) H(y, ) H(x, ) for all |x y| < on K, a

    compact set of R, and such thatK

    (H H)(x, )dx 0 as 0, and (b) for all x R, H(x, ) isequicontinuous in a neighborhood of x.

    7

  • 7/28/2019 Nonlinear Cointegrating Regressions

    8/56

    that of PP, as we only require that x[nt]/dn converges weakly to a continuous Gaussian

    process. This kind of weak convergence condition is very likely close to be necessary.

    We next consider a class of asymptotically homogeneous functions. In this regard, we

    follow PP. Let f : R R have the structure:

    f(x,) = v(, )h(x, ) + b(, ) A(x, ) B(x,), (11)

    where sup |b(, ) v1(, )| 0, as ; sup |A(x, )| is locally bounded, thatis, bounded on bounded intervals; sup |B(x,)| is bounded on R; h(x, ) is regular on satisfying

    |s| h

    2(s, )ds > 0 for all = 0 and > 0, and v(, ) satisfy: there exist > 0 and a neighborhood N of such that as

    inf

    |p

    p

    |

  • 7/28/2019 Nonlinear Cointegrating Regressions

    9/56

    3 Limit distribution

    This section considers limit distribution of n. In what follows, let Qn and Qn be the

    first and second derivatives of Qn() in the usual way, that is, Qn = Qn/ and Qn =

    2Qn/. Similarly we define f and f. We assume these quantities exist whenever they

    are introduced. The following result comes from an application of Lemma 1 of Andrews

    and Sun (2004), which provides a framework and plays a key part in our development on

    limit distribution of n.

    Theorem 3.1. There exists a sequence of m m nonrandom nonsingular matrices Dnwith D1n 0, as n , such that

    (i) sup:Dn(0)kn (D1n )n

    t=1

    f(xt, )f(xt, )

    f(xt, 0)f(xt, 0)

    D1n = oP(1),(ii) sup:Dn(0)kn (D1n )

    nt=1 f(xt, )

    f(xt, ) f(xt, 0)

    D1n = oP(1),(iii) sup:Dn(0)kn (D1n ) nt=1 f(xt, ) ut D1n = oP(1),(iv) Yn := (D

    1n )

    nt=1 f(xt, 0)f(xt, 0)

    D1n D M, where M > 0, a.s., and

    Zn := (D1n )

    nt=1

    f(xt, 0) ut = OP(1),

    for some sequence of constants {kn, n 1} for which kn , as n . Then, thereexists a sequence of estimators {n, n 1} satisfying Qn(n) = 0 with probability thatgoes to one and

    Dn(n 0) = Y1

    n Zn + oP(1). (13)

    If we replace (iv) by the following (iv), then Dn(n 0) D M1 Z.(iv) for any i = (i1,...,im) Rm, i = 1, 2, 3,

    (1 Yn2, 3Zn) D (1 M 2, 3 Z),

    where M > 0, a.s. and P(Z < ) = 1.Our routine in establishing the limit distribution of n is essentially different from that

    of PP, and is particularly convenient in the situation that f, f and f all are integrable in

    which Dn = n I for some 0 < n , where I is an identity matrix. See next sectionfor examples. The condition AD3 of PP requires to show Yn P M instead of (iv). It isequivalent to say that, under the PPs routine, one requires to show (at least under an

    enlarged probability space)

    1n

    nt=1

    g(xt) P

    g(s)ds LW(1, 0), (14)

    9

  • 7/28/2019 Nonlinear Cointegrating Regressions

    10/56

    where xt is an integrated process, g is a real integrable function and LW(t, s) is the

    local time of the standard Brownian Motion W(t)2. The convergence in probability is

    usually hard or impossible to establish without enlarging the probability space. Our

    routine essentially reduces the convergence in probability to less restrictive convergence

    in distribution. Explicitly, in comparison to (14), we only need to show that

    1n

    nt=1

    g(xt) D

    g(s)ds LW(1, 0). (15)

    This allows to extend the nonlinear regressions to much wider class of nonstationary

    regressor data series, and enables our methodology and proofs straightforward and neat.

    3.1 Limit distribution: integrable functions

    We now establish our results on convergence in distribution for then, which mainly

    involves settling the conditions (i)(iii) and (iv) in Theorem 3.1. First consider the

    situation that f is an integrable function, together with some additional conditions on xt

    and ut.

    Assumption 3.1. (i) xt is defined as in Example 1, that is, xt =t

    j=1 j, where j

    satisfies (6);

    (ii) Fk is a sequence of increasing -fields such that k Fk and k+1 is independent of

    Fk for all k

    1, and k

    F1 for all k

    0;

    (iii) {uk, Fk}k1 forms a martingale difference satisfying maxkm |E(u2k+1 | Fk) 1| 0a.s., and for some > 0, maxk1 E(|uk+1|2+ | Fk) < .

    Assumption 3.2. Let p(x, ) be one off, fi and fij , 1 i, j m.

    (i) p(x, 0) is a bounded and integrable real function;

    (ii) = f(s, 0)f(s, 0)

    ds > 0 and(f(s, ) f(s, 0))2ds > 0 for all = 0;

    (iii) There exists a bounded and integrable function Tp : R R such that |p(x, )

    p(x, 0)| hp(|| 0||) Tp(x), for each , 0 , where hp(x) is a bounded realfunction such that hp(x) hp(0) = 0, as x 0.

    2Here and below, the local time process LG(t, s) of a Gaussian process G(t) is defined by

    LG(t, s) = lim0

    1

    2

    t

    0

    I|G(r) s| dr.

    10

  • 7/28/2019 Nonlinear Cointegrating Regressions

    11/56

    Theorem 3.2. Under Assumptions 3.1 and 3.2, we haven/dn(n 0) D 1/2 NL1/2G (1, 0), (16)

    whereN is a standard normal random vector, which is independent of G(t) defined by

    G(t) =W3/2(t), underC1,

    W(t), underC2.(17)

    Here and below, W(t) denotes the fractional Brownian motion with 0 < < 1 on

    D[0, 1], defined as follows:

    W(t) =1

    A()

    0

    (t s)1/2 (s)1/2

    dW(s) +

    t0

    (t s)1/2dW(s),where W(s) is a standard Brownian motion and

    A() = 1

    2+

    0

    (1 + s)1/2 s1/2

    2ds1/2

    .

    Remark 3.1. Theorem 3.2 improves Theorem 5.1 of PP by allowing xt to be a long

    memory process, which includes fractional integrated processes as an example.

    Using the same routine and slight modification of assumptions, we have the following

    limit distribution when xt is a -regular Harris recurrent Markov chain.

    Assumption 3.1*. (i) xt is defined as in Example 2 satisfying (9), that is, xt is a -

    regular Harris recurrent Markov chain;

    (ii) {uk, Fk}k1 forms a martingale difference;(iii) maxkm |E(u2k+1 | Fnk)1| 0 a.s., and for some > 0, maxk1 E(|uk+1|2+ | Fnk) 0 and (f(s, ) f(s, 0))2(ds) > 0 for all = 0;

    (iii) There exists a bounded function Tp : R R with Tp(x)(dx) < such that

    |p(x, ) p(x, 0)| hp(|| 0||) Tp(x), for each , 0 , where hp(x) is a boundedreal function such that hp(x) hp(0) = 0, as x 0.

    11

  • 7/28/2019 Nonlinear Cointegrating Regressions

    12/56

    Theorem 3.3. Under Assumptions 3.1* and 3.2*, we havea(n)(n 0) D 1/2 N1/2 , (18)

    whereN is a standard normal random vector, which is independent of , and for = 1,

    = 1, and for 0 < < 1,

    is a stable random variable with Laplace transform

    Eexp{t } = exp

    t

    (+ 1)

    , t 0. (19)

    Remark 3.2. Theorem 3.3 is new, even for the stationary case ( = 1) in which a(n) = n

    and the NLS estimator n converges to a normal variate. The random variable in

    the limit distribution, after scaled by a factor of (+ 1)1, is a Mittag-Leffler random

    variable with parameter , which is closely related to a stable random variable. For details

    regarding the properties of this distribution, see page 453 of Feller (1971) or Theorem 3.2

    of Karlsen and Tjstheim (2001).

    Remark 3.3. Assumption 3.1* (iii) imposes a strong orthogonal property between the

    regressor xt and the error sequence ut. It is not clear at the moment whether such

    condition can be relaxed to a less restrictive one where xt is adapted to Fnt and xt+1 isindependent ofFnt for all 1 t n. We leave this for future research.

    Remark 3.4. Unlike linear cointegration, integrable regression function does not pro-

    portionately transfer the effect of outliers of the nonstationary regressor to the response.This is useful when practitioners want to lessen the impact of extreme values of xt on the

    response. In addition, such function arises in macro-economics when the dependent vari-

    able has an uneven response to the regressor. Empirical examples include central banks

    market intervention and the currency exchange target zones, described in Phillips (2001).

    Remark 3.5. The convergence rates in (16) and (18) are reduced by the nonstationarity

    property of the regressor xt when f is an integrable function. For the partial sum of linear

    processes in Theorem 3.2, comparing to the stationary situation where the convergence

    rate is n1/2, the rate is reduced to n3/4/21/2(n) for the long memory case, and is further

    reduced to n1/4 for the short memory case. For the -regular Harris recurrent Markov

    chain, note that by (9), the asymptotic order a(n) is regularly varying at infinity, i.e., there

    exists a slowly varying function (n) such that a(n) n(n). Therefore, the convergencerate of the NLS estimator decreases as the regularity index of the chain decreases from

    one (stationary situation) to zero.

    12

  • 7/28/2019 Nonlinear Cointegrating Regressions

    13/56

    3.2 Limit distribution: beyond integrable functions

    We next consider the limit distribution of n when the regression function f is non-

    integrable. Unlike PP and Chang et al. (2001), we use a more general regressor xt in the

    development of our asymptotic distribution.

    Assumption 3.3. (i) {ut, Ft, 1 t n} is a martingale difference sequence satisfyingE(|ut|2|Ft1) = 2 and sup1tn E(|ut|2q|Ft1) < a.s., where q > 1;

    (ii) xt is adapted to Ft1, t = 1,...,n, max1tn |xt|/dn = OP(1), where d2n = var(xn) ; and

    (iii) There exists a vector of continuous Gaussian processes (G, U) such that

    (x[nt],n, n1/2

    [nt]

    i=1ui) D (G(t), U(t)), (20)

    on D[0, 1]2, where xt,n = xt/dn.

    Assumption 3.4. Let p(x, ) be one of f, fi and fij, 1 i, j m. There exists a realfunction Tp : R R such that

    (i) |p(x, ) p(x, 0)| Ap(|| 0||) Tp(x), for each , 0 , where Ap(x) is a boundedreal function satisfying Ap(x) Ap(0) = 0, as x 0;

    (ii) For any bounded x, sup |p(x,) vp() hp(x, )|/Tp(x) = o(1), as ,where hp(x, ) on is a regular function and vp() is a positive real function which

    is bounded away from zero as ; and(iii) Tp(x) vp() T1p(x) as |x| , where T1p : R R is a real function such that

    T1p + T2

    1p is locally integrable (i.e. integrable on any compact set).

    For notation convenience, let v(t) = vf(t), h(x, ) = hf(x, ), vi(t) = vfi(t) and

    v(t) = (v1(t),..., vm(t)). Similarly, we define v(t), h(x, ) and h(x, ). We have the

    following main result.

    Theorem 3.4. Suppose Assumptions 3.3 and 3.4 hold. Further assume that

    sup1i,jm | v(dn) vij(dn)vi(dn) vj(dn) | < and|s| h(s, 0)h(s, 0)

    ds > 0 for all > 0. Then we have

    Dn (n 0) D1

    0

    (t)(t)dt1 1

    0

    (t) dU(t), (21)

    on D[0, 1], as n , where (t) = h(G(t), 0) and Dn = diag(nv1(dn),..., nvm(dn)).

    13

  • 7/28/2019 Nonlinear Cointegrating Regressions

    14/56

    Remark 3.6. Except the joint convergence, other conditions in establishing Theorem

    3.4 are similar to Theorem 5.2 of PP. PP made use of the concept of H0-regular. Our

    conditions on f, fi and fij, 1 i, j m are more straightforward and easy to identify.Furthermore, the joint convergence assumption under present paper is quite natural when

    f,

    f and

    f all satisfy Assumption 3.4 (ii). Indeed, under

    fi(x,) vi()hi(x, ), one can

    easily obtain the following asymptotics under our joint convergence.

    D1n

    nt=1

    f(xt, 0)ut D1

    0

    h(G(t), 0)dU(t), (22)

    on D[0, 1], which is required in the proof of (21). See, e.g., Kurtz and Protter (1991) and

    Hansen (1992).

    Remark 3.7. If we further assume that U(t) and G(t) are asymptotic independent, that

    is, the long run relationship between the regressor sequence xt and the innovative sequenceut vanishes asymptotically, the limiting distribution in (21) will become mixed normal.

    In particular, when U(t) is a standard Wiener process, we have

    Dn (n 0) D 1

    0

    (t)(t)dt1/2

    N, (23)

    where N is a standard normal random vector.

    Remark 3.8. Nonlinear cointegrating regressions with structure described in Theorem

    3.4 are useful to modelling money demand functions. In such case, yt is the logarithm of

    the real money balance, xt is the nominal interest rate, and f can either be f(x,,) =

    + log |x| or f(x,,) = + log(1+|x||x| ). See Bae and de Jong (2004) and Bae etal. (2006) for empirical studies investigating the estimation of money demand functions

    in USA and Japan respectively. Also, see Bae et al. (2004) for the derivation of these

    functional forms from the underlying money demand theories studied in macro-economics.

    Remark 3.9. Another example is the MichaelisMenton model, which was considered

    by Lai (1994). In such case, the regression function is f(x, 1, 2) = 1/(2 + x)I

    {x

    0

    }.

    Bates and Watts (1988) employs such model to investigate the dependence of rate of

    enzymatic reactions on the concentration of substrate.

    Remark 3.10. Unlike in the situation where f is integrable, there is super convergence

    when f has the structure given in Assumption 3.4 and xt is nonstationary. The conver-

    gence rate is given by Dn, whose elements

    nvi(dn) are all faster than the standard raten in stationary situation.

    14

  • 7/28/2019 Nonlinear Cointegrating Regressions

    15/56

    We finally remark that, if f is an asymptotically homogeneous function having the

    structure (11), it is also possible to extend PPs result from a short memory linear process

    to general nonstationary time series as described in Assumption 3.3. The idea of proof

    for such result can be easily generalized from Theorem 5.3 of PP. Also see Chang et al.

    (2001). As it only involves slight modification of notation, we omit the details.

    4 Extension to endogeneity

    Assumption 2.2 ensures the model (1) having a martingale structure. The result in this

    regard is now well known. However, there is little work on allowing for contemporaneous

    correlation between the regressors and regression errors. Using a nonparametric approach,

    Wang and Phillips (2009b) considered kernel estimate of f and allowed the equation error

    ut to be serially dependent and cross correlated with xs for |t s| < m0, thereby inducingendogeneity in the regressor, i.e., cov(ut, xt) = 0. In relation to present paper de Jong(2002) considered model (1) without exogeneity, and assuming certain mixing conditions.

    Chang and Park (2011) considered a simple prototypical model, where the regressor and

    regression error are driven by i.i.d. innovations. This section provides some extensions

    to these works. Our main results show that there is an essential effect of endogeneity on

    the asymptotics, introducing the bias in the related limit distribution. Full investigation

    in this regard requires new asymptotic results, which will be left for future work.

    4.1 Asymptotics: integrable functions

    We first consider the situation that f satisfies Assumption 2.1 with T being integrable. Let

    i (i, i), i Zbe a sequence of i.i.d. random vectors satisfying E0 = 0 and E02q 0. As noticed in Remark 4 of Jeganathan (2008),the conditions on the characteristics function (t) are not very restrictive.

    Assumption 4.1. (i) xt = tj=1 j where j is defined as in (6), that is, j = k=0 kjkwith the coefficients k satisfying C1 or C2;

    (ii) ut =

    k=0 k tk, where the coefficients k are assumed to satisfy

    k=0 k2|k|2 < ,

    k=0 k = 0 and k=0 |k| max{1, |k|} < where k = ki=0 i.As we do not impose the independence between k and k, Assumption 4.1 provides

    the endogeneity in the model (1), and it is much general than the conditions set given in

    15

  • 7/28/2019 Nonlinear Cointegrating Regressions

    16/56

    Chang and Park (2011). We have the following result.

    Theorem 4.1. Suppose Assumptions 2.1 and 4.1 hold. Further assume:

    (i) T is bounded and integrable, and

    (ii) (f(s, ) f(s, 0))2ds > 0 for all = 0.Then the NLS estimator n defined by (2) is a consistent estimator of 0, i.e. n P 0.

    If in addition Assumption 3.2, then, as n ,n/dn(n 0) D 1 1/2 NL1/2G (1, 0), (24)

    wheref() =

    eixf(x, 0)dx,

    = (2)1 f()f()[Eu20 + 2r=1 E(u0ure

    ixr)] d, (25)

    and other notations are given as in Theorem 3.1.

    4.2 Asymptotics: beyond integrable functions

    This section considers the situation that f satisfies Assumption 2.1 with T being non-

    integrable. Let again i (i, i), i Z be a sequence of i.i.d. random vectors satisfyingE0 = 0 and E02q < , where q > 2, and we make use of the following assumption inrelated to the model (1).

    Assumption 4.2. (i) xt =t

    j=1 j , where j =

    k=0 kjk with =

    k=0 k = 0 andk=0 |k| < ;

    (ii) ut =

    k=0 k tk, where the coefficients k are assumed to satisfy

    k=0 k|k| < , k=0 k = 0 and k=0 k|k||k| < .

    Theorem 4.2. Under Assumption 2.1* and 4.2, the NLS estimator n defined by (2) is

    a consistent estimator of 0, i.e. n P 0.

    In the following, we define hx(x, ) = h(x,)x .

    Theorem 4.3. Suppose Assumptions 3.4 and 4.2 hold. Further assume that:

    (i) sup1i,jm | v(n) vij(

    n)

    vi(n) vj(

    n)

    | < and |s| h(s, 0)h(s, 0)ds > 0 for all > 0, and(ii) hx(x, ) satisfies |hx(x, 0)| K(1+ |x|) for some constantsK, > 0 for allx R,

    and q max(3, 2) where q is given in Assumption 4.2.

    16

  • 7/28/2019 Nonlinear Cointegrating Regressions

    17/56

    Then the limit distribution of n is given by

    Dn(n 0) D 1

    0

    (t)(t)dt1

    u

    10

    (t) dt +

    10

    (t) dU(t)

    , (26)

    as n , where (t) = h(W(t), 0), (t) = hx(W(t), 0), u =

    j=0 E(0uj), Dn =

    diag(nv1(n),..., nvm(n)) and (W(t), U(t)) is a bivariate Brownian motion withcovariance matrix

    =

    2E20 E00

    E00 2E20

    .

    5 Simulation

    In this section, we investigate the finite sample performance of the NLS estimator n of

    nonlinear regressions with endogeneity. Chang and Park (2011) performed simulation of

    similar model, but only considered the error structure ut to be i.i.d. innovation. We intend

    to investigate the sampling behavior of n under different degree of serially dependence

    of ut on itself. To this end, we generate our data in the following way:

    xt = xt1 + t,

    vt =

    1 2 wt + t,

    f1(x, ) = exp{|x|} and f2(x,,1, 2) = + 1x + 2x2, (27)

    where {wt} and {t} are i.i.d. N(0, 32

    ) variables, and is the correlation coefficient thatcontrols the degree of endogeneity. The true value of is set as 0.1 and that of (, 1, 2)

    is set as (20, 10, 0.1). The error structure is generated according to the following threescenarios:

    S1: ut = vt,

    S2: ut =

    j=1j10 vtj+1, and

    S3: ut =

    j=1j4 vtj+1.

    Scenario S1 is considered by Chang and Park (2011), which eliminates the case that

    ut is serially correlated. Scenarios S2 and S3 introduce self-dependence to the error

    sequence. The decay rate of S2 is faster than that of S3, which implies that the level of

    self dependence of ut increases from S1 to S3.

    It is obvious that f1 is an integrable function and f2 satisfies the conditions in Theorem

    4.3, with

    v(,,1, 2) = 2 and h(x,,1, 2) = 2x

    2.

    17

  • 7/28/2019 Nonlinear Cointegrating Regressions

    18/56

    In addition, we have

    h(x,,1, 2) = (1, x , x2) and hx(x,,1, 2) = (0, 1, 2x).

    It would be convenient for later discussion to write out the limit distributions of the

    estimators for the f2 case. By Theorem 4.3, we have n1/2(n 0)n(1n 10)n3/2(2n 20)

    D 1 U(1)u + 10 W(t)dU(t)

    2u1

    0 W(t)dt +1

    0 W2(t)dU(t)

    , (28)where u =

    j=0 E(0uj) and

    =

    1

    10

    W(t)dt1

    0W2(t)dt

    1

    0 W(t)dt

    1

    0 W2(t)dt

    1

    0 W3(t)dt

    10 W2(t)dt 10 W3(t)dt 10 W4(t)dt

    .

    In our simulations, we draw samples of sizes n = 200, 500 to estimate the NLS esti-

    mators and their t-ratios. Each simulation is run for 10,000 times, and their densities are

    estimated using kernel method with normal kernel function. Our results are presented

    in 16 tables. Table 23 present the numerical values of the means and standard errors

    of the NLS estimators, n and (n, 1n, 2n), as well as those of the error variance es-

    timates 2n. Table 47 are the estimated densities of the centered distributions of the

    estimators. Table 811 are the estimated densities of the centered distributions scaled by

    their respective convergence rates, that is, n1/4 and (n1/2, n , n3/2). Table 1215 are the

    t-ratios. In all tables, the left columns are simulation results based on 200 samples, while

    the right columns are based on 500 samples. We expect that as increases, the degree of

    endogeneity increases, and hence the variance of the distribution will increase due to (24)

    and (26). We also expect that the more serial correlation of ut, the higher the variance

    of the limit distribution due to the cross terms appeared in (25) and (26).

    Our simulation results largely corroborate with our theoretical results in Section 4.

    Firstly, the means of the estimators are close to the true values, and the deviations

    from true values decrease as the sample sizes increase. Also, the reductions of standard

    errors from n = 200 to 500 are close to the theoretical values. For example, the ratio

    of standard errors for n between the sample size 200 and 500 is 1.564 in scenario S1,

    which is close to theoretical value

    500200

    = 1.581. This confirms that our estimators are

    accurate in all scenarios. Secondly, comparing the S1S3 curves in each plot, we can

    see that the S3 curves have fat tails and lower peak than those of S1S2. This verifies

    18

  • 7/28/2019 Nonlinear Cointegrating Regressions

    19/56

    that high dependence ofut on its own past will increase the variance of limit distribution.

    Thirdly, comparing the shapes of the curves for different values of, the curves with = 0

    have highest peaks, and the peakedness decrease as increases. This matches with our

    expectation that, if the cross dependence between ut and xt increases, the variance of

    limit distribution will increase.Finally, the sampling results for t-ratios are also much expected from our limit theories.

    In particular, we have interesting results for the t-ratios of1n in Table 14. In scenario S1,

    there is no endogeneity and the limiting t-ratios overlap with the normal curve. However,

    as we introduce endogeneity to the model in scenarios S2 and S3, the t-ratios is shifted

    away from the standard normal. Such behavior can be explained by looking at the limit

    distribution of 1n in (28). When = 0, there exists an extra term of 1u, and suchterm will introduce bias to the limit distribution.

    6 An empirical example: Carbon Kuznets Curve

    In this section, we consider a simple example of the model with endogeneity. The equation

    of interest is the Carbon Kuznets Curve (CKC) relating the per capita CO2 emission of

    a country to its per capita GDP. Basically, the CKC hypothesis states that there is an

    inverted-U shaped relationship between economic activities and per capita CO2 emissions.

    As it is well accepted that the GDP variable displays some wandering nonstationary be-

    haviors over time, such problem is relevant to nonlinear transformations of nonstationaryregressors. Refer to Wagner (2008) and Muller-Furstenberger and Wagner (2007) for

    detailed expositions of how such problem relates to the works of PP and Chang et al.

    (2001).

    The formula that we currently consider has a quadratic formulation and is given by

    ln(et) = + 1 ln(xt) + 2(ln(xt))2 + ut, 1 t n, (29)

    where et and yt denote the per capital emissions of CO2 and GDP in period t, and

    ut is a stochastic error term. In the usual CKC formulation, the logarithm of per

    capita CO2 emissions and GDP are assumed to be integrated processes, see Muller-

    Furstenberger and Wagner (2007). Also, it is clear that the nonlinear link function

    f(x,,1, 2) = + 1x + 2x2 satisfies the conditions in Theorem 4.3, and is con-

    sidered in our simulations. There are studies which employ more complicated models, by

    including time trend, stochastic terms and additional explanatory variables such as the

    19

  • 7/28/2019 Nonlinear Cointegrating Regressions

    20/56

    energy structure. To avoid complicating our discussion, we only consider GDP as the only

    explanatory variable.

    In the literature most studies assumed that the parameters (1, 2) are homogeneous

    across different counties. Several works investigated the appropriateness of restricting all

    countries adhering to the same values of (1, 2). For example, List and Gallet (1999), Di-jkgraaf et al. (2005), Piaggio and Padilla (2012). Particularly, Piaggio and Padilla (2012)

    tested this assumption by checking whether the confidence intervals of (1, 2) across dif-

    ferent countries overlap. They rejected the homogeneity assumption of parameters across

    different countries, based on the result that they could not find any possible group with

    more than 4 countries whose confidence intervals overlap with each other.

    As an example of employing our NLS estimators n = (n, 1n, 2n) and calculating

    their confidence intervals, we carry out similar study and examine whether the 95% con-

    fidence intervals of the parameters (1n, 2n) across different countries overlap with each

    other. We extend previous studies by incorporating the endogeneity in the model. The

    assumption that xt is an exogenous variable does not usually hold in practise as there are

    many potential sources of endogeneity. Firstly, endogeneity occurs due to measurement

    error of the explanatory variables xt, including the misreporting of GDP of each country.

    Secondly, there might be omitted variable bias, which arises if there exists a relevant

    explanatory variable that is correlated with the regressor xt (per capita GDP), but is

    excluded from the model. Finally, potential reverse causality will give rise to endogeneity.

    That is, the CO2 emissions yt might at the same time has a feedback effect on the GDP

    xt (e.g., due to stricter government regulation).

    We consider 15 countries, which are shown in Piaggio and Padilla (2012) that their

    response variables have a quadratic relationship described in (29) with the regressors.

    We use annual data in this example and for each country, there are 58 observations of

    the CO2 emission data from 19512008 published by the Carbon Dioxide Information

    Analysis Center (Boden et al. (2009)). For per capita GDP, we adopt the GDP data

    from Maddison (2003), which is transformed to 1990 Geary-Khamis dollars.We first estimate the parameters by minimizing the error function given in (2) using

    the nls() function in the statistical software R. Next, we calculate the confidence interval

    by finding the critical values of the limit distribution of (n, 1n, 2n) given in (29). Note

    that most of the quantities in (29) involve only the standard Brownian motion or integral

    of Brownian motions, which can be easily obtained by direct simulations. However, there

    20

  • 7/28/2019 Nonlinear Cointegrating Regressions

    21/56

    exist two nuisance parameters, including the covariance matrix of Brownian motions

    (W, U) and the autocovariances u of linear processes {t, ut}. Thus we have to replacethe unknown parameters with their consistent estimates and adopt the empirical critical

    values instead of the real ones. Using the linear process estimation procedures described

    in Chang et al. (2001), we have,

    as the estimators of , , and t and t as the

    artificial error sequences. The parameters and u can then be estimated by

    =

    n12

    nt=1

    2t n

    1n

    t=1 ttn1

    nt=1 tt n

    12n

    t=1 2t

    , and u =

    lj=1

    n1njt=1

    tut+j,

    for any l = o(n1/4). See Ibragimov and Phillips (2008), Phillips and Solo (1992) and

    Phillips and Perron (1988).

    Table 1: Estimates and 95% Confidence Intervals of 1n and 2n

    Table 18 and 19 report the estimates and 95% confidence intervals of parameters 1

    21

  • 7/28/2019 Nonlinear Cointegrating Regressions

    22/56

    and 2, and Table 1 presents the plots of the results. For the parameter 1, we can

    not find any group with more than four groups of countries whose confidence intervals

    overlap. Such result is consistent with that of Piaggio and Padilla (2012). However,

    for the parameter 2n, among the 15 countries, we can find group of 11 countries whose

    confidence intervals overlap. Hence, the homogeneity of parameters can not be rejected asin Piaggio and Padilla (2012). There are two main reasons for the differences between our

    results and theirs. Firstly, they assume the innovation is i.i.d. normally distributed while

    we allow the error to be serially correlated with itself. Secondly, we incorporate the effect

    of endogeneity in our limit distribution of estimator. Based on (29), both reasons will

    make the critical values and confidence intervals significantly larger than that of Piaggio

    and Padilla (2012).

    7 Conclusion and discussion

    In this paper, we establish an asymptotic theory for a nonlinear parametric cointegrating

    regression model. A general framework is developed for establishing the weak consistency

    of the NLS estimator n. The framework can easily be applied to a wide class of non-

    stationary regressors, including the partial sum of linear processes and Harris recurrent

    Markov chains. Limit distribution of n is also established, which extends previous works.

    Furthermore, we introduce endogeneity to our model by allowing the error term to be

    serially dependent on itself, and cross-dependent on the regressor. We show that the limitdistribution of n under the endogeneity situation is different from that with martingale

    error structure. This result is of interest in the applied econometric research area.

    Asymptotics in this paper are limited to univariate regressors and nonlinear para-

    metric cointegrating regression models. Specification of the nonlinear regression function

    f depends usually on the underlying theories of the subject. Illustration can be found

    in Bae et al. (2004), where the concept of the money in the utility function with the

    constant elasticity of substitution (MUFCES) is used to relate the money balance with

    nominal interest rate by the link function f(x,,) = + log( x1+x). More currently,

    nonparametric method has been developed to test the specification of f. See, e.g., Wang

    and Phillips (2012), for instance. Extension to multivariate regressors require substantial

    different techniques. This is because local time theory can not in general be extended to

    multivariate data, in which the techniques play a key rule in the asymptotics of the NLS

    estimator with nonstationary regressor when f is an integrable function. Possible exten-

    22

  • 7/28/2019 Nonlinear Cointegrating Regressions

    23/56

    sion to multivariate regressors is the index model discussed in Chang and Park (2003).

    This again requires some new techniques, and hence we leave it for future work.

    8 Appendix: Proofs of main results

    This section provides proofs of the main results. We start with some preliminaries, which

    list the limit theorems that are commonly used in the proofs of the main results.

    8.1 Preliminaries

    Denote N(0) = { : 0 < }, where 0 is fixed.

    Lemma 8.1. Let Assumption 2.1 hold. Then, for anyxt satisfying (4) withT being given

    as in Assumption 2.1,

    supN(0)

    2n

    nt=1

    |f(xt, ) f(xt, 0)| + |f(xt, ) f(xt, 0)|2 P 0, (30)as n first and then 0. If in addition Assumption 2.2, then

    2n

    nt=1

    [f(xt, 0) f(xt, 0)] ut P 0, (31)

    for any 0, 0 , and

    supN(0)

    2nnt=1

    |f(xt, ) f(xt, 0)| |ut| P 0, (32)

    as n first and then 0.

    Proof. (30) is simple. As 2nn

    t=1[f(xt, 0) f(xt, 0)]2 C2nn

    t=1 T2(xt) = OP(1),

    (31) follows from Lemma 2 of Lai and Wei (1982). As for (32), the result follows from

    n

    t=1|f(xt, ) f(xt, 0)| |ut| h( 0)

    n

    t=1T(xt) |ut|,

    andn

    t=1 T(xt) |ut| Cn

    t=1 T(xt) +n

    t=1 T(xt)|ut| E(|ut| | Ft1) = OP(2n).

    Lemma 8.2. Let Assumption 3.1 (i) hold. For any boundedg(x) satisfying |g(x)|dx 2, and (set Fn = (x1,...,xn))

    Ln = qnn

    k=1

    |Znk|qE(|uk|q|Fn) + E2n n

    k=1

    Z2nk[E(u2k|Fnk) 1]

    q/2Fn.Recall from Assumption 3.2 (iv) and the fact that 2n =

    nk=1 Z

    2nk, we have,

    E2n

    nk=1

    Z2nk[E(u2k|Fnk) 1]

    q/2Fn

    P 0,

    by dominated convergence theorem. Hence, routine calculations show that

    Ln C(q2)n a(n)(q2)/2 + oP(1) = oP(1),

    because 2n = OP(1) by (35) and q > 2. This proves (38), which implies that (36) holds

    true with g2(x) = g21(x). Finally, note that, for any a, b R,

    a(n)1nt=1

    ag21(xt) + bg2(xt)

    D

    ag21(s) + bg2(s)

    (ds) ,

    due to Theorem 2.3 of Chen (1999), which implies thata(n)1

    nt=1

    g21(xt), a(n)1

    nt=1

    g2(xt)

    D

    g21 (s)(ds) ,

    g2(s)(ds) ,

    .

    Hence, by continuous mapping theorem,nt=1 g

    21(xt)n

    t=1 g2(xt)P

    g21(s) (ds)

    g2(s) (ds).

    This shows that (36) is still true with general g2(x).

    25

  • 7/28/2019 Nonlinear Cointegrating Regressions

    26/56

    Lemma 8.4. Under Assumption 4.1, for any boundedgi(x), i = 1, 2, satisfying |gi(x)|dx 0 for all x R,and if q max(3, 2) where q is given Assumption 4.2, we have 1

    n

    nt=1

    g1

    xtn

    ut,

    1

    n

    nt=1

    g2

    xtn

    D

    u

    10

    g1(W(t))dt +1

    0

    g1(W(t))dU(t),

    10

    g2(W(t))dt

    , (41)

    as n , where u = j=0 E(0uj), and (W(t), U(t)) is a bivariate Brownian motion

    with covariance matrix

    =

    2E20 E00

    E00 2E20

    .

    Proof. See Theorem 4.3 of Ibragimov and Phillips (2008) with minor improvements. We

    omit the details.

    Lemma 8.6. Under Assumptions 3.1 (i) and 3.2, we have

    dnn

    sup

    N(0)

    n

    t=1 fi(xt, ) fj(xt, ) fi(xt, 0) fj(xt, 0)

    P 0, (42)

    dnn

    supN(0)

    nt=1

    fij(xt, ) [f(xt, ) f(xt, 0)] P 0, (43)as n first and then 0, for any 1 i, j m. If in addition Assumption 3.1(ii)(iii), then

    dnn

    supN(0)

    nt=1

    fij(xt, ) ut P 0, (44)

    26

  • 7/28/2019 Nonlinear Cointegrating Regressions

    27/56

    as n first and then 0, for any 1 i, j m.Similarly, under Assumptions 3.1* (i) and 3.2*, (42) and (43) are true if we replace

    dn/n bya(n)1. If in addition Assumption 3.1* (ii) and (iii), then (44) holds if we replace

    dn/n by a(n)1.

    Proof. We only prove (42). Others are similar and the details are omitted. First note

    that, by Assumption 3.2 (i) and (iii),

    sup

    |fi(xt, )| sup

    |fi(xt, ) fi(xt, 0)| + |fi(xt, 0)|

    sup

    h( 0)T(xt) + |fi(xt, 0)| C.

    It follows that

    fi(xt, ) fj(xt, ) fi(xt, 0) fj(xt, 0) fi(xt, )fj(xt, ) fj(xt, 0)+ fj(xt, 0)fi(xt, ) fi(xt, 0) Cfj(xt, ) fj(xt, 0)+ C1fi(xt, ) fi(xt, 0).

    Therefore, by recalling (33), the result (42) follows from an application of Lemma 8.1 with

    2n = n/dn.

    Lemma 8.7. Under Assumptions 3.3 and 3.4, we have

    1nvi(dn)vj(dn)

    supN(0)

    nt=1

    fi(xt, ) fj(xt, ) fi(xt, 0) fj(xt, 0) P 0, (45)1

    nv(dn)vij(dn)sup

    N(0)

    nt=1

    fij(xt, ) [f(xt, ) f(xt, 0)] P 0, (46)1

    nvij(dn)sup

    N(0)

    nt=1

    fij(xt, ) ut P 0, (47)

    as n first and then 0, for any 1 i, j m.

    Proof. Recall max1tn |xt|/dn = OP(1). Without loss of generality, we assume max1tn |xt|/dn K0 for some K0 > 0. It follows from Assumption 3.4 and dn that, for any 1 i mand , fi(xt, ) vi(dn)|hi(xt/dn)| + o(1) T1fi(xt/dn),fi(xt, ) fi(xt, 0) Afi(|| 0||)vi(dn)T1fi(xt/dn).

    27

  • 7/28/2019 Nonlinear Cointegrating Regressions

    28/56

    This implies that

    supN(0)

    nt=1

    fi(xt, ) fj(xt, ) fi(xt, 0) fj(xt, 0)

    2 (Afi() + Afj()) vi(dn)vj(dn)

    n

    t=1 |hi(xt/dn)| + T1fi(xt/dn). (48)(45) now follows from Afi() 0 as 0 and

    1

    n

    nt=1

    |hi(xt/dn)| + T1fi(xt/dn) D 10

    |hi(G(t))| + T1fi(G(t))dt = OP(1),due to Assumptions 3.3 (iii) and 3.4 (iii).

    The proof of (46) is similar and hence the details are omitted. As for (47), by noting

    nt=1

    fij(xt, ) ut nt=1

    fij(xt, 0) ut+ nt=1

    |fij(xt, ) fij(xt, 0)| |ut|,

    the result can be proved by using the similar arguments as in (31) and (32) of Lemma

    8.1.

    Lemma 8.8. Let Assumption 3.3 hold. Then, for any regular functions g(x, ) and

    g1(x, ) on , we have

    1nn

    t=1 gxtdn , ut, 1nn

    t=1 g1xtdn , D

    10

    g

    G(t),

    dU(t),

    10

    g1

    G(t),

    dt

    . (49)

    Proof. Ifg(x, ) and g1(x, ) are continuous functions, the result follows from (20) and the

    continuous mapping theorem. See, Kurtz and Protter (1991) for instance. The extension

    from continuous function to regular function is standard in literature. The details can be

    found in PP or Park and Phillips (1999).

    Lemma 8.9. LetDn(, 0) = Qn() Qn(0). Suppose that, for any > 0,lim infn

    inf|0|

    Dn(, 0) > 0 in probability, (50)

    then n P 0.

    Proof. See Lemma 1 of Wu (1981).

    28

  • 7/28/2019 Nonlinear Cointegrating Regressions

    29/56

    8.2 Proof of Theorems

    Proof of Theorem 2.1. Let N be any open subset of containing 0. Since n isthe minimizer of Qn() over , by Lemma 8.9, proving consistency is equivalent toshowing that, for any 0 < < 1 and = 0, where , 0 , there exist n0 > 0 andM1 > 0 such that

    P

    infNc

    Dn(, 0) 2n/M1

    1 , (51)

    for all n > n0.

    DenoteN(0) = { : 0 < }. Note that Nc is compact, by the finite coveringproperty of compact set, (51) will follow if we prove that, for any fixed 0 Nc,

    supN(0)

    2n

    Dn(, 0) Dn(0, 0)

    P 0, (52)

    as n first and then 0, and for > 0, there exists M1 > 0 and n > n0 suchthat for all n > n0,

    P

    Dn(0, 0) 2n/M1

    1 2. (53)

    The result (52) is simple. Indeed, by Lemma 8.1, for each fixed 0 Nc, we have

    supN(0)

    2nDn(, 0) Dn(0, 0)

    supN(0) 2n

    n

    t=1

    (f(xt, ) f(xt, 0))2

    + supN(0)

    2n

    nt=1

    |f(xt, ) f(xt, 0)||ut|

    P 0, (54)

    as n first and then 0, which yields (52).To prove (53), we recall

    Dn

    (0,

    0) =

    n

    t=1 (f(xt, 0) f(xt, 0))2 n

    t=1 (f(xt, 0) f(xt, 0))ut.This, together with (5) and (31), implies that, for any > 0, there exists n0 > 0 and

    M1 > 0 such that for all n > n0,

    P

    Dn(0, 0) 2nM11

    P n

    t=1

    (f(xt, 0) f(xt, 0))2 2n M11 /2

    1 2,

    29

  • 7/28/2019 Nonlinear Cointegrating Regressions

    30/56

    which implies (53).

    Finally, by noting Qn(0) = n1n

    t=1 u2t P 2 due to Assumption 2.2 and strong

    law of large number, it follows from the consistency of n and (52) that

    |2n

    2

    | C2n

    |Qn(n)

    Qn(0)

    |+ oP(1) = oP(1).

    Proof of Theorem 2.2. (4) follows from (33) of Lemma 8.2. (5) follows from (34) of

    Lemma 8.2 with g2(x) = (f(x, ) f(x, 0))2 and the facts that P(LG(1, 0) > 0) = 1 and(f(s, ) f(s, 0))2ds > 0, for any = 0.

    Proof of Theorem 2.3. (4) follows from (35) of Lemma 8.3 with g(x) = T(x) + T2(x).

    (5) follows from (36) of Lemma 8.3 with g2(x) = (f(x, ) f(x, 0))2 and the facts thatP( > 0) = 1 and (f(s, ) f(s, 0))2(ds) > 0, for any = 0.Proof of Theorem 2.4. Recalling T(x) v() T1(x), we have

    1

    nv2(dn)

    nt=1

    [T(xt) + T2(xt)] 1

    n

    nt=1

    T1xt

    dn

    + T21

    xtdn

    D

    10

    [T1(G(t)) + T2

    1 (G(t))]dt, (55)

    where the convergence in distribution comes from Berkes and Horvath (2006). This proves

    (4) with n = nv(dn) due to the local integrability of T1 + T21 . To prove (5), by lettingG(, x) := f(x,) f(x,0) v()g(x,,0), we have

    1

    nv2(dn)

    nt=1

    (f(xt, ) f(xt, 0))2

    =1

    nv2(dn)

    nt=1

    G

    dn,xtdn

    + v(dn)g

    xtdn

    , , 0

    2 1

    n

    n

    t=1 g2

    xtdn

    , , 0 |n|, (56)where n =

    2nv(dn)

    nt=1 G

    dn,

    xtdn

    gxtdn

    , , 0

    . The similar arguments as in the proof of

    (55) yield that

    1

    n

    nt=1

    g2xt

    dn, , 0

    D Z :=

    10

    g2(G(s), , 0)ds, (57)

    30

  • 7/28/2019 Nonlinear Cointegrating Regressions

    31/56

    where P(0 < Z < ) = 1 due to Assumption 2.1* (i). On the other hand, it follows from(10) that

    1

    nv2(dn)

    nt=1

    G2

    dn,xtdn

    o(1)

    nv2(dn)

    nt=1

    T2(xt) o(1)n

    nt=1

    T21 (xt/dn) = oP(1).

    Now (5) follows from (56), (57) and the fact that

    n 2 1

    nv2(dn)

    nt=1

    G2

    dn,xtdn

    1/2 1n

    nt=1

    g2xt

    dn, , 0

    1/2 P 0.

Proof of Theorem 2.5. Let $\Theta_0=\{\theta:\|\theta-\theta_0\|\ge\delta\}$, where $\delta>0$ is a constant. By virtue of Lemma 8.9, it suffices to prove that, for any $\epsilon,M_0>0$, there exists an $n_0>0$ such that, for all $n>n_0$,

$$P\Big(n^{-1}\inf_{\theta\in\Theta_0}D_n(\theta,\theta_0)>M_0\Big)>1-\epsilon. \qquad (58)$$

To prove (58), first note that $\sum_{t=1}^{n}u_t^2/n\le M_1$ in probability, for some $M_1>0$, due to Assumption 2.2 (i). This, together with the Cauchy-Schwarz inequality, yields that

$$n^{-1}D_n(\theta,\theta_0)
=\frac{1}{n}\sum_{t=1}^{n}\big(f(x_t,\theta)-f(x_t,\theta_0)\big)^2-\frac{2}{n}\sum_{t=1}^{n}\big(f(x_t,\theta)-f(x_t,\theta_0)\big)u_t$$
$$\ge\frac{1}{n}\sum_{t=1}^{n}\big(f(x_t,\theta)-f(x_t,\theta_0)\big)^2-\frac{2}{n}\Big(\sum_{t=1}^{n}\big(f(x_t,\theta)-f(x_t,\theta_0)\big)^2\Big)^{1/2}\Big(\sum_{t=1}^{n}u_t^2\Big)^{1/2}$$
$$\ge M_n(\theta,\theta_0)\Big[1-2\big(M_1+o_P(1)\big)^{1/2}M_n(\theta,\theta_0)^{-1/2}\Big], \qquad (59)$$

where $M_n(\theta,\theta_0)=\frac{1}{n}\sum_{t=1}^{n}(f(x_t,\theta)-f(x_t,\theta_0))^2$. Hence, for any equivalent process $x_k^*$ of $x_k$ (i.e., $x_k^*=_Dx_k$, $1\le k\le n$, $n\ge1$, where $=_D$ denotes equivalence in distribution), we have

$$P\Big(n^{-1}\inf_{\theta\in\Theta_0}D_n(\theta,\theta_0)>M_0\Big)
\ge P\Big(\inf_{\theta\in\Theta_0}M_n^*(\theta,\theta_0)\Big[1-2\big(M_1+o_P(1)\big)^{1/2}M_n^*(\theta,\theta_0)^{-1/2}\Big]>M_0\Big), \qquad (60)$$

where $M_n^*(\theta,\theta_0)=\frac{1}{n}\sum_{t=1}^{n}(f(x_t^*,\theta)-f(x_t^*,\theta_0))^2$.

Recalling $x_{[nt]}/d_n\to_DG(t)$ on $D[0,1]$ and $G(t)$ is a continuous Gaussian process, by the so-called Skorohod-Dudley-Wichura representation theorem (e.g., Shorack and Wellner, 1986, p. 49, Remark 2), we can choose an equivalent process $x_k^*$ of $x_k$ so that

$$\sup_{0\le t\le1}\big|x_{[nt]}^*/d_n-G(t)\big|=o_P(1). \qquad (61)$$

For this equivalent process $x_t^*$, it follows from the structure of $f(x,\theta)$ that

$$m(d_n,\theta)^2:=\frac{1}{n\,v(d_n,\theta)^2}\sum_{t=1}^{n}f(x_t^*,\theta)^2
=\frac{1}{n}\sum_{t=1}^{n}h(x_t^*/d_n,\theta)^2+o_P(1)
=\int_0^1h(x_{[ns]}^*/d_n,\theta)^2\,ds+o_P(1)
\to_P\int_0^1h(G(s),\theta)^2\,ds=:m(\theta)^2, \qquad (62)$$

uniformly in $\theta\in\Theta$. Due to (62), the same argument as in the proof of Theorem 4.3 in PP yields that

$$\inf_{\theta\in\Theta_0}M_n^*(\theta,\theta_0)\to\infty,\quad\text{in probability},$$

which, together with (60), implies (58).

Proof of Theorem 3.1. It is readily seen that

$$\dot Q_n(\theta_0)=\sum_{t=1}^{n}\dot f(x_t,\theta_0)\big(y_t-f(x_t,\theta_0)\big)=\sum_{t=1}^{n}\dot f(x_t,\theta_0)u_t,$$
$$\ddot Q_n(\theta)=\sum_{t=1}^{n}\dot f(x_t,\theta)\dot f(x_t,\theta)'-\sum_{t=1}^{n}\ddot f(x_t,\theta)\big(y_t-f(x_t,\theta)\big),$$

for any $\theta\in\Theta$. It follows from conditions (i)-(iii) that

$$\sup_{\theta:\|D_n(\theta-\theta_0)\|\le k_n}\big\|(D_n^{-1})'\big[\ddot Q_n(\theta)-\ddot Q_n(\theta_0)\big]D_n^{-1}\big\|=o_P(1).$$

By using (iii) and (iv), we have $(D_n^{-1})'\dot Q_n(\theta_0)=O_P(1)$ and $(D_n^{-1})'\ddot Q_n(\theta_0)D_n^{-1}\to_DM$. As $M>0$, a.s., $\lambda_{\min}\big[(D_n^{-1})'\ddot Q_n(\theta_0)D_n^{-1}\big]\ge\eta_n$ with probability that goes to one, for some $\eta_n>0$, where $\lambda_{\min}(A)$ denotes the smallest eigenvalue of $A$ and $\eta_n\to0$ can be chosen as slowly as required. Due to these facts, a modification of Lemma 1 in Andrews and Sun (2004) implies (13). Note that (iv) implies (iv'). The result $D_n(\hat\theta_n-\theta_0)\to_DM^{-1}Z$ follows from (13) and the continuous mapping theorem.
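For readers who want the intermediate step spelled out, the expansion behind this argument can be written schematically as follows (this display is added here for exposition and is not quoted from the paper; it uses the sign convention for $\dot Q_n$ and $\ddot Q_n$ adopted above, with $\bar\theta_n$ a mean-value point between $\hat\theta_n$ and $\theta_0$):

$$0=(D_n^{-1})'\dot Q_n(\hat\theta_n)=(D_n^{-1})'\dot Q_n(\theta_0)-(D_n^{-1})'\ddot Q_n(\bar\theta_n)D_n^{-1}\,D_n(\hat\theta_n-\theta_0),$$
$$D_n(\hat\theta_n-\theta_0)=\big[(D_n^{-1})'\ddot Q_n(\bar\theta_n)D_n^{-1}\big]^{-1}(D_n^{-1})'\dot Q_n(\theta_0)\to_DM^{-1}Z,$$

where conditions (i)-(iii) allow $\ddot Q_n(\bar\theta_n)$ to be replaced by $\ddot Q_n(\theta_0)$, and (iv) supplies the joint convergence of the two factors.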

Proof of Theorem 3.2. It suffices to verify the conditions (i)-(iii) and (iv) of Theorem 3.1 with

$$D_n=\sqrt{n/d_n}\,I,\qquad Z=\Sigma^{1/2}N\,L_G^{1/2}(1,0),\qquad M=\Sigma\,L_G(1,0),$$

where $I$ is an identity matrix, under Assumptions 3.1 and 3.2. In fact, as $n/d_n\to\infty$, we may take $k_n$ such that $\{\theta:\|D_n(\theta-\theta_0)\|\le k_n\}$ falls in $N(\theta_0)=\{\theta:\|\theta-\theta_0\|<\delta\}$. This, together with Lemma 8.6, yields (i)-(iii).

On the other hand, it follows from Lemma 8.2 with $g_1(x)=\alpha_3'\dot f(x,\theta_0)$ and $g_2(x)=\alpha_1'\dot f(x,\theta_0)\dot f(x,\theta_0)'\alpha_2$ that, for any $\alpha_i=(\alpha_{i1},...,\alpha_{im})'\in R^m$, $i=1,2,3$,

$$\big(\alpha_3'Z_n,\ \alpha_1'Y_n\alpha_2\big)
=\Big(\Big(\frac{d_n}{n}\Big)^{1/2}\sum_{t=1}^{n}\alpha_3'\dot f(x_t,\theta_0)u_t,\ \frac{d_n}{n}\sum_{t=1}^{n}\alpha_1'\dot f(x_t,\theta_0)\dot f(x_t,\theta_0)'\alpha_2\Big)$$
$$\to_D\big(\bar N\,L_G^{1/2}(1,0),\ \alpha_1'M\alpha_2\big)=_D\big(\alpha_3'Z,\ \alpha_1'M\alpha_2\big), \qquad (63)$$

where $\bar N$ is normal with variance $\int\big[\alpha_3'\dot f(s,\theta_0)\big]^2ds$, $N$ is a standard normal random variable independent of $G(t)$, and we have used the fact that

$$\bar N=\Big(\int\big[\alpha_3'\dot f(s,\theta_0)\big]^2ds\Big)^{1/2}N
=_D\Big(\alpha_3'\int\dot f(s,\theta_0)\dot f(s,\theta_0)'ds\,\alpha_3\Big)^{1/2}N
=\alpha_3'\Sigma^{1/2}N.$$

This proves (iv).
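The mixed normal shape implied by (63) can be seen in a small Monte Carlo experiment. The sketch below is illustrative only: $f(x,\theta)=\exp\{-\theta|x|\}$ with $\theta_0=0.1$, a Gaussian random-walk regressor, i.i.d. errors and the sample size used are assumptions of the sketch, and $d_n=\sqrt{n}$ so that the scaling $(n/d_n)^{1/2}$ equals $n^{1/4}$.

    # Illustrative sketch only: Monte Carlo draws of the n^{1/4}-scaled NLS error in the
    # integrable case. The DGP (f(x, theta) = exp(-theta*|x|), theta_0 = 0.1, sigma = 0.5,
    # Gaussian random-walk x_t) is an assumption of this sketch.
    import numpy as np
    from scipy.optimize import minimize_scalar

    rng = np.random.default_rng(2)
    theta0, sigma, n, reps = 0.1, 0.5, 1000, 500

    def one_rep():
        x = np.cumsum(rng.standard_normal(n))
        y = np.exp(-theta0 * np.abs(x)) + sigma * rng.standard_normal(n)
        Qn = lambda th: np.mean((y - np.exp(-th * np.abs(x))) ** 2)
        th_hat = minimize_scalar(Qn, bounds=(0.001, 2.0), method="bounded").x
        return n ** 0.25 * (th_hat - theta0)       # (n/d_n)^{1/2} = n^{1/4} with d_n = sqrt(n)

    scaled = np.array([one_rep() for _ in range(reps)])
    std = scaled.std()
    print("mean:", scaled.mean(), " sd:", std)
    print("excess kurtosis proxy:", np.mean(((scaled - scaled.mean()) / std) ** 4) - 3.0)

The random norming by $L_G^{1/2}(1,0)$ typically shows up as a symmetric distribution with heavier tails than a normal density would suggest.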

Proof of Theorem 3.3. As in the proof of Theorem 3.2, it suffices to verify the conditions (i)-(iii) and (iv) of Theorem 3.1 with $D_n=a(n)\,I$ and with $Z$ and $M$ as specified in the statement of Theorem 3.3, under Assumptions 3.1* and 3.2*. The details are similar to those of Theorem 3.2, with Lemma 8.2 replaced by Lemma 8.3, and hence are omitted.

Proof of Theorem 3.4. It suffices to verify the conditions (i)-(iii) and (iv) of Theorem 3.1 with $D_n=\mathrm{diag}(\sqrt{n}\,v_1(d_n),...,\sqrt{n}\,v_m(d_n))$,

$$Z=\int_0^1\dot h(G(t),\theta_0)\,dU(t),\qquad M=\int_0^1\dot h(G(t),\theta_0)\,\dot h(G(t),\theta_0)'\,dt,$$

under Assumptions 3.3 and 3.4. Recalling $\sup_{1\le i,j\le m}\big|\frac{v(d_n)\,v_{ij}(d_n)}{v_i(d_n)\,v_j(d_n)}\big|<\infty$, it follows from (46) that

$$\frac{1}{n\,v_i(d_n)v_j(d_n)}\sup_{\theta\in N(\theta_0)}\Big|\sum_{t=1}^{n}\ddot f_{ij}(x_t,\theta)\big[f(x_t,\theta)-f(x_t,\theta_0)\big]\Big|
\le\frac{C}{n\,v_{ij}(d_n)v(d_n)}\sup_{\theta\in N(\theta_0)}\Big|\sum_{t=1}^{n}\ddot f_{ij}(x_t,\theta)\big[f(x_t,\theta)-f(x_t,\theta_0)\big]\Big|=o_P(1),$$

for any $1\le i,j\le m$. On the other hand, as $\sqrt{n}\,v_i(d_n)\to\infty$ for all $1\le i\le m$, we may take $k_n$ such that $\{\theta:\|D_n(\theta-\theta_0)\|\le k_n\}$ falls in $N(\theta_0)=\{\theta:\|\theta-\theta_0\|<\delta\}$. These facts imply that

$$\sup_{\theta:\|D_n(\theta-\theta_0)\|\le k_n}\Big\|(D_n^{-1})'\sum_{t=1}^{n}\ddot f(x_t,\theta)\big[f(x_t,\theta)-f(x_t,\theta_0)\big]D_n^{-1}\Big\|=o_P(1).$$

This proves the required (ii). The proofs of (i) and (iii) are similar, and we omit the details. Finally, (iv) follows from Lemma 8.8 with

$$g(x,\theta_0)=\alpha_3'\dot h(x,\theta_0)\quad\text{and}\quad g_1(x,\theta_0)=\alpha_1'\dot h(x,\theta_0)\dot h(x,\theta_0)'\alpha_2.$$

Proof of Theorem 4.1. We first establish the consistency result. The proof goes along the same lines as that of Theorem 2.1. It suffices to show that, for any fixed $\theta_0'\in N^c$,

$$\sup_{\theta\in N(\theta_0')}\frac{d_n}{n}\sum_{t=1}^{n}\big|[f(x_t,\theta)-f(x_t,\theta_0')]u_t\big|\to_P0, \qquad (64)$$

as $\delta\to0$ uniformly for all large $n$, and

$$\sum_{t=1}^{n}\big(f(x_t,\theta_0')-f(x_t,\theta_0)\big)u_t=o_P(n/d_n). \qquad (65)$$

By (39) in Lemma 8.4 with $g_1(x)=|T(x)|$, (64) follows from

$$\sup_{\theta\in N(\theta_0')}\frac{d_n}{n}\sum_{t=1}^{n}\big|[f(x_t,\theta)-f(x_t,\theta_0')]u_t\big|
\le\sup_{\theta\in N(\theta_0')}h(\|\theta-\theta_0'\|)\,\frac{d_n}{n}\sum_{t=1}^{n}|T(x_t)|\,|u_t|
\le C\sup_{\theta\in N(\theta_0')}h(\|\theta-\theta_0'\|)\,O_P(1)\to_P0,$$

as $\delta\to0$. Similarly, by (40) in Lemma 8.4 with $g_1(x)=f(x,\theta_0')-f(x,\theta_0)$, we have that

$$\sum_{t=1}^{n}\big(f(x_t,\theta_0')-f(x_t,\theta_0)\big)u_t=O_P\big[(n/d_n)^{1/2}\big],$$

which implies the required (65).

We next prove the convergence in distribution. As in Theorem 3.2, it suffices to verify the conditions (i)-(iii) and (iv) of Theorem 3.1, but with

$$D_n=\sqrt{n/d_n}\,I,\qquad Z=\Sigma^{1/2}N\,L_G^{1/2}(1,0),\qquad M=\Sigma\,L_G(1,0),$$

under Assumption 4.1.

The proofs for (i) and (ii) are exactly the same as those of Theorem 3.2. Using similar arguments as in the proof of (63), (iv) follows from Lemma 8.4.

Finally, note that (64) and (65) still hold if we replace $f(x,\theta)$ by $\ddot f_{ij}(x,\theta)$, due to Assumption 3.2. By choosing $k_n$ in the same way as in the proof of Theorem 3.2, we have, for any $1\le i,j\le m$,

$$\sup_{\theta:\|D_n(\theta-\theta_0)\|\le k_n}\frac{d_n}{n}\Big|\sum_{t=1}^{n}\ddot f_{ij}(x_t,\theta)u_t\Big|
\le\sup_{\theta:\|D_n(\theta-\theta_0)\|\le k_n}\frac{d_n}{n}\Big|\sum_{t=1}^{n}\big[\ddot f_{ij}(x_t,\theta)-\ddot f_{ij}(x_t,\theta_0)\big]u_t\Big|+\frac{d_n}{n}\Big|\sum_{t=1}^{n}\ddot f_{ij}(x_t,\theta_0)u_t\Big|=o_P(1), \qquad (66)$$

which implies the required (iii).

Proof of Theorem 4.2. The proof goes along the same lines as those of Theorems 2.1 and 4.1. Write $v_n=v(\sqrt{n})$, $v_{in}=v_i(\sqrt{n})$, $v_{ijn}=v_{ij}(\sqrt{n})$, for $1\le i,j\le m$. It suffices to show that, for any fixed $\theta_0'\in N^c$,

$$\sup_{\theta\in N(\theta_0')}\frac{1}{n\,v_n}\sum_{t=1}^{n}\big|[f(x_t,\theta)-f(x_t,\theta_0')]u_t\big|\to_P0, \qquad (67)$$

as $\delta\to0$ uniformly for all large $n$, and

$$\sum_{t=1}^{n}\big(f(x_t,\theta_0')-f(x_t,\theta_0)\big)u_t=o_P(n\,v_n). \qquad (68)$$

In fact, by the Cauchy-Schwarz inequality and then the weak law of large numbers, we have

$$\sup_{\theta\in N(\theta_0')}\frac{1}{n\,v_n}\sum_{t=1}^{n}\big|[f(x_t,\theta)-f(x_t,\theta_0')]u_t\big|
\le\sup_{\theta\in N(\theta_0')}\frac{1}{n\,v_n}\Big(\sum_{t=1}^{n}\big[f(x_t,\theta)-f(x_t,\theta_0')\big]^2\Big)^{1/2}\Big(\sum_{t=1}^{n}u_t^2\Big)^{1/2}+o_P(1)$$
$$\le C\sup_{\theta\in N(\theta_0')}\Big(\frac{1}{n}\sum_{t=1}^{n}\Big[A_f(\theta-\theta_0')\,T_{1f}\Big(\frac{x_t}{\sqrt{n}}\Big)\Big]^2\Big)^{1/2}+o_P(1)\to_P0,$$

as $\delta\to0$. This proves (67). Furthermore, by (41) in Lemma 8.5 with $g(x)=h(x,\theta_0')-h(x,\theta_0)$, we have

$$\sum_{t=1}^{n}\big(f(x_t,\theta_0')-f(x_t,\theta_0)\big)u_t=O_P\big[(n\,v_n)^{1/2}\big],$$

which implies the required (68).

Proof of Theorem 4.3. It suffices to verify the conditions (i)-(iii) and (iv) of Theorem 3.1 with $D_n=\mathrm{diag}(\sqrt{n}\,v_{1n},...,\sqrt{n}\,v_{mn})$,

$$M=\int_0^1\dot h(G(t),\theta_0)\,\dot h(G(t),\theta_0)'\,dt,\qquad
Z=\Lambda_u\int_0^1\dot h(G(t),\theta_0)\,dt+\int_0^1\dot h(G(t),\theta_0)\,dU(t), \qquad (69)$$

under Assumption 4.1.

The proofs for (i) and (ii) are exactly the same as those in Theorem 3.4. Note that $\sup_{1\le i,j\le m}\big|\frac{v_n\,v_{ijn}}{v_{in}\,v_{jn}}\big|<\infty$. It follows that, by choosing $k_n$ in the same way as in the proof of Theorem 3.2, we have, for any $1\le i,j\le m$,

$$\sup_{\theta:\|D_n(\theta-\theta_0)\|\le k_n}\frac{1}{n\,v_{in}v_{jn}}\Big|\sum_{t=1}^{n}\ddot f_{ij}(x_t,\theta)u_t\Big|
\le\sup_{\theta:\|D_n(\theta-\theta_0)\|\le k_n}\frac{C}{n\,v_{ijn}}\Big|\sum_{t=1}^{n}\big[\ddot f_{ij}(x_t,\theta)-\ddot f_{ij}(x_t,\theta_0)\big]u_t\Big|+\frac{C}{n\,v_{ijn}}\Big|\sum_{t=1}^{n}\ddot f_{ij}(x_t,\theta_0)u_t\Big|=o_P(1),$$

where the convergence of the first term follows from (67) with $\theta_0'=\theta_0$ and $f$ replaced by $\ddot f$, and that of the second term follows from Lemma 8.5. This yields the required (iii).

Finally, (iv) follows from Lemma 8.5 with

$$g_1(x)=\alpha_3'\dot h(x,\theta_0)\quad\text{and}\quad g_2(x)=\alpha_1'\dot h(x,\theta_0)\dot h(x,\theta_0)'\alpha_2.$$

REFERENCES

Andrews, D. W. K. and Sun, Y., 2004. Adaptive local polynomial Whittle estimation of long-range dependence. Econometrica, 72, 569-614.

Boden, T. A., Marland, G. and Andres, R. J., 2009. Global, Regional, and National Fossil-Fuel CO2 Emissions. Carbon Dioxide Information Analysis Center, Oak Ridge National Laboratory, U.S. Department of Energy, Oak Ridge, Tenn., U.S.A.

Bae, Y. and de Jong, R., 2007. Money demand function estimation by nonlinear cointegration. Journal of Applied Econometrics, 22(4), 767-793.

Bae, Y., Kakkar, V. and Ogaki, M., 2004. Money demand in Japan and the liquidity trap. Ohio State University Department of Economics Working Paper #0406.

Bae, Y., Kakkar, V. and Ogaki, M., 2006. Money demand in Japan and nonlinear cointegration. Journal of Money, Credit, and Banking, 38(6), 1659-1667.

Bates, D. M. and Watts, D. G., 1988. Nonlinear Regression Analysis and Its Applications. Wiley, New York.

Berkes, I. and Horvath, L., 2006. Convergence of integral functionals of stochastic processes. Econometric Theory, 22(2), 304-322.

Chang, Y. and Park, J. Y., 2011. Endogeneity in nonlinear regressions with integrated time series. Econometric Reviews, 30, 51-87.

Chang, Y. and Park, J. Y., 2003. Index models with integrated time series. Journal of Econometrics, 114, 73-106.

Chang, Y., Park, J. Y. and Phillips, P. C. B., 2001. Nonlinear econometric models with cointegrated and deterministically trending regressors. Econometrics Journal, 4, 1-36.

Chen, X., 1999. How often does a Harris recurrent Markov chain recur? The Annals of Probability, 27, 1324-1346.

Choi, I. and Saikkonen, P., 2010. Tests for nonlinear cointegration. Econometric Theory, 26, 682-709.

Dijkgraaf, E. and Vollebergh, H. R., 2005. A test for parameter homogeneity in CO2 panel EKC estimations. Environmental and Resource Economics, 32(2), 229-239.

de Jong, R., 2002. Nonlinear Regression with Integrated Regressors but Without Exogeneity. Mimeograph, Department of Economics, Michigan State University.

Feller, W., 1971. An Introduction to Probability Theory and Its Applications, Vol. II, 2nd ed. Wiley, New York.

Gao, J., King, M., Lu, Z. and Tjøstheim, D., 2009. Specification testing in nonlinear and nonstationary time series autoregression. The Annals of Statistics, 37, 3893-3928.

Granger, C. W. J. and Terasvirta, T., 1993. Modelling Nonlinear Economic Relationships. Oxford University Press, New York.

Hall, P. and Heyde, C. C., 1980. Martingale Limit Theory and Its Application. Probability and Mathematical Statistics, Academic Press, New York.

Hansen, B. E., 1992. Convergence to stochastic integrals for dependent heterogeneous processes. Econometric Theory, 8, 489-500.

Ibragimov, R. and Phillips, P. C. B., 2008. Regression asymptotics using martingale convergence methods. Econometric Theory, 24, 888-947.

Jeganathan, P., 2008. Limit theorems for functional sums that converge to fractional Brownian and stable motions. Cowles Foundation Discussion Paper No. 1649, Cowles Foundation for Research in Economics, Yale University.

Karlsen, H. A. and Tjøstheim, D., 2001. Nonparametric estimation in null recurrent time series. The Annals of Statistics, 29, 372-416.

Kurtz, T. G. and Protter, P., 1991. Weak limit theorems for stochastic integrals and stochastic differential equations. The Annals of Probability, 19, 1035-1070.

Lai, T. L., 1994. Asymptotic properties of nonlinear least squares estimates in stochastic regression models. The Annals of Statistics, 22, 1917-1930.

Lai, T. L. and Wei, C. Z., 1982. Least squares estimates in stochastic regression models with applications to identification and control of dynamic systems. The Annals of Statistics, 10, 154-166.

List, J. and Gallet, C., 1999. The environmental Kuznets curve: does one size fit all? Ecological Economics, 31, 409-423.

Maddison, A., 2003. The World Economy: Historical Statistics. OECD, Paris.

Muller-Furstenberger, G. and Wagner, M., 2007. Exploring the environmental Kuznets hypothesis: Theoretical and econometric problems. Ecological Economics, 62(3), 648-660.

Nummelin, E., 1984. General Irreducible Markov Chains and Non-negative Operators. Cambridge University Press, Cambridge.

Orey, S., 1971. Limit Theorems for Markov Chain Transition Probabilities. Van Nostrand Reinhold, London.

Piaggio, M. and Padilla, E., 2012. CO2 emissions and economic activity: Heterogeneity across countries and non-stationary series. Energy Policy, 46, 370-381.

Park, J. Y. and Phillips, P. C. B., 1999. Asymptotics for nonlinear transformations of integrated time series. Econometric Theory, 15, 269-298.

Park, J. Y. and Phillips, P. C. B., 2001. Nonlinear regressions with integrated time series. Econometrica, 69, 117-161.

Phillips, P. C. B., 2001. Descriptive econometrics for non-stationary time series with empirical illustrations. Cowles Foundation Discussion Paper No. 1023.

Phillips, P. C. B. and Perron, P., 1988. Testing for a unit root in time series regression. Biometrika, 75(2), 335-346.

Phillips, P. C. B. and Solo, V., 1992. Asymptotics for linear processes. The Annals of Statistics, 20, 971-1001.

Terasvirta, T., Tjøstheim, D. and Granger, C. W. J., 2011. Modelling Nonlinear Economic Time Series. Advanced Texts in Econometrics, Oxford University Press, Oxford.

Shi, X. and Phillips, P. C. B., 2012. Nonlinear cointegrating regression under weak identification. Econometric Theory, 28, 509-547.

Skouras, K., 2000. Strong consistency in nonlinear stochastic regression models. The Annals of Statistics, 28, 871-879.

Wagner, M., 2008. The carbon Kuznets curve: A cloudy picture emitted by bad econometrics? Resource and Energy Economics, 30(3), 388-408.

Wang, Q., 2011. Martingale limit theorems revisited and non-linear cointegrating regression. Working paper.

Wang, Q., Lin, Y. X. and Gulati, C. M., 2003. Asymptotics for general fractionally integrated processes with applications to unit root tests. Econometric Theory, 19, 143-164.

Wang, Q. and Phillips, P. C. B., 2009. Asymptotic theory for local time density estimation and nonparametric cointegrating regression. Econometric Theory, 25, 710-738.

Wang, Q. and Phillips, P. C. B., 2009. Structural nonparametric cointegrating regression. Econometrica, 77, 1901-1948.

Wang, Q. and Phillips, P. C. B., 2012. A specification test for nonlinear nonstationary models. The Annals of Statistics, forthcoming.

Wu, C. F., 1981. Asymptotic theory of nonlinear least squares estimation. The Annals of Statistics, 9, 501-513.

Table 2: Means and standard errors of $\hat\theta_n$ and $\hat\sigma_n^2$ with $f(x,\theta)=\exp\{-\theta|x|\}$.

                                 = 0                      = 0.5                    = 1
                      n          Mean       Std. error    Mean       Std. error    Mean       Std. error
Scenario S1
  $\hat\theta_n$      200        0.10018    (0.00511)     0.10018    (0.00517)     0.10011    (0.00563)
                      500        0.10014    (0.00413)     0.10013    (0.00406)     0.10004    (0.00366)
  $\hat\sigma_n^2$    200        8.95974    (0.88064)     8.96749    (0.89797)     8.98783    (0.89446)
                      500        8.98116    (0.57035)     8.98106    (0.56659)     8.99062    (0.57060)
Scenario S2
  $\hat\theta_n$      200        0.10014    (0.00521)     0.10024    (0.00526)     0.10014    (0.00546)
                      500        0.10009    (0.00418)     0.10017    (0.00435)     0.10012    (0.00416)
  $\hat\sigma_n^2$    200        8.97149    (0.89930)     8.96567    (0.88766)     9.00270    (0.89405)
                      500        8.99296    (0.57106)     9.00135    (0.56876)     8.99701    (0.56470)
Scenario S3
  $\hat\theta_n$      200        0.10033    (0.00595)     0.10017    (0.00618)     0.10026    (0.00616)
                      500        0.10022    (0.00481)     0.10020    (0.00499)     0.10012    (0.00489)
  $\hat\sigma_n^2$    200        9.08776    (0.91758)     9.11416    (0.92635)     9.12292    (0.92537)
                      500        9.13221    (0.58686)     9.12295    (0.59715)     9.14155    (0.59417)

Table 3: Means and standard errors of $\hat\theta_n=(\hat\alpha_n,\hat\beta_{1n},\hat\beta_{2n})'$ and $\hat\sigma_n^2$ with $f(x,\alpha,\beta_1,\beta_2)=\alpha+\beta_1x+\beta_2x^2$.

                                 = 0                        = 0.5                      = 1
                      n          Mean        Std. error     Mean        Std. error     Mean        Std. error
Scenario S1
  $\hat\alpha_n$      200       -19.99444    (0.52156)     -19.99776    (0.52417)     -20.00214    (0.53271)
                      500       -20.00171    (0.33352)     -20.00164    (0.33260)     -20.00333    (0.33106)
  $\hat\beta_{1n}$    200         9.99995    (0.04303)      10.01348    (0.04427)      10.02649    (0.04441)
                      500         9.99961    (0.01713)      10.00537    (0.01741)      10.01056    (0.01790)
  $\hat\beta_{2n}$    200         0.09999    (0.00137)       0.09999    (0.00133)       0.10000    (0.00115)
                      500         0.10000    (0.00034)       0.10001    (0.00034)       0.10000    (0.00029)
  $\hat\sigma_n^2$    200         8.87340    (0.89132)       8.83901    (0.89370)       8.79399    (0.87995)
                      500         8.94769    (0.56790)       8.94198    (0.57076)       8.91645    (0.56403)
Scenario S2
  $\hat\alpha_n$      200       -19.99538    (0.54080)     -19.99210    (0.54464)     -20.00177    (0.53812)
                      500       -20.00287    (0.33868)     -20.00412    (0.33885)     -20.00142    (0.34983)
  $\hat\beta_{1n}$    200        10.00054    (0.04456)      10.01464    (0.04606)      10.02658    (0.04718)
                      500        10.00025    (0.01747)      10.00548    (0.01829)      10.01142    (0.01838)
  $\hat\beta_{2n}$    200         0.10001    (0.00145)       0.10000    (0.00136)       0.10000    (0.00118)
                      500         0.09999    (0.00037)       0.10000    (0.00034)       0.10000    (0.00029)
  $\hat\sigma_n^2$    200         8.85807    (0.89311)       8.83928    (0.89470)       8.79743    (0.89541)
                      500         8.95826    (0.56801)       8.94974    (0.56812)       8.92162    (0.56690)
Scenario S3
  $\hat\alpha_n$      200       -19.99784    (0.62713)     -20.00389    (0.60584)     -19.99447    (0.62031)
                      500       -19.99644    (0.39496)     -19.99572    (0.38897)     -20.00707    (0.40941)
  $\hat\beta_{1n}$    200         9.99962    (0.04949)      10.01477    (0.05129)      10.03178    (0.05180)
                      500        10.00011    (0.02033)      10.00647    (0.02073)      10.01282    (0.02142)
  $\hat\beta_{2n}$    200         0.10002    (0.00165)       0.09998    (0.00156)       0.10000    (0.00131)
                      500         0.10000    (0.00040)       0.10000    (0.00039)       0.10000    (0.00033)
  $\hat\sigma_n^2$    200         8.96801    (0.91677)       8.93070    (0.91952)       8.86505    (0.93340)
                      500         9.08728    (0.58567)       9.07935    (0.58298)       9.03815    (0.58235)
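A generic Monte Carlo skeleton for producing summaries of the Table 3 type is sketched below. It is illustrative only: the data-generating details of scenarios S1-S3 are not reproduced, and the random-walk regressor, i.i.d. N(0,9) errors, parameter values $(\alpha,\beta_1,\beta_2)=(-20,10,0.1)$ and 1,000 replications are assumptions of the sketch. Because the polynomial model is linear in its parameters, the NLS estimator coincides with OLS here.

    # Illustrative sketch only: Monte Carlo means and standard errors for the model
    # y_t = alpha + beta1*x_t + beta2*x_t^2 + u_t. The DGP below is an assumption of
    # this sketch and does not reproduce scenarios S1-S3 of the paper.
    import numpy as np

    rng = np.random.default_rng(3)
    alpha, beta1, beta2, sigma = -20.0, 10.0, 0.1, 3.0

    def one_draw(n):
        x = np.cumsum(rng.standard_normal(n))
        y = alpha + beta1 * x + beta2 * x ** 2 + sigma * rng.standard_normal(n)
        X = np.column_stack([np.ones(n), x, x ** 2])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # NLS = OLS: model is linear in parameters
        return coef

    for n in (200, 500):
        draws = np.array([one_draw(n) for _ in range(1000)])
        for name, m, s in zip(("alpha", "beta1", "beta2"), draws.mean(axis=0), draws.std(axis=0)):
            print(f"n={n:3d}  {name:6s}  mean={m:10.5f}  std={s:.5f}")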

Table 4: Density estimate of $\hat\theta_n$ (unscaled).

Table 5: Density estimate of $\hat\alpha_n$ (unscaled).

Table 6: Density estimate of $\hat\beta_{1n}$ (unscaled).

Table 7: Density estimate of $\hat\beta_{2n}$ (unscaled).

Table 8: Density estimate of $\hat\theta_n$ (scaled by $n^{1/4}$).

Table 9: Density estimate of $\hat\alpha_n$ (scaled by $n^{1/2}$).

Table 10: Density estimate of $\hat\beta_{1n}$ (scaled by $n$).

Table 11: Density estimate of $\hat\beta_{2n}$ (scaled by $n^{3/2}$).

Table 12: Density of $\hat\theta_n$ t-ratios.

Table 13: Density of $\hat\alpha_n$ t-ratios.

Table 14: Density of $\hat\beta_{1n}$ t-ratios.

Table 15: Density of $\hat\beta_{2n}$ t-ratios.
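Density plots of the kind listed above can be produced from Monte Carlo draws with a standard kernel density estimator. The sketch below is illustrative only: the Student-t placeholder draws stand in for actual simulation output (for instance, $n^{1/4}$-scaled draws of $\hat\theta_n$ from a simulation like the one sketched earlier), and the Gaussian KDE with its default bandwidth is simply one convenient choice.

    # Illustrative sketch only: kernel density estimate of scaled Monte Carlo draws.
    # `scaled_draws` is a placeholder; in practice it would hold draws such as
    # n**0.25 * (theta_hat - theta_0) from a simulation.
    import numpy as np
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(4)
    scaled_draws = rng.standard_t(df=5, size=2000)      # placeholder draws for the sketch

    kde = gaussian_kde(scaled_draws)                    # Gaussian kernel, Scott's rule bandwidth
    grid = np.linspace(scaled_draws.min(), scaled_draws.max(), 400)
    density = kde(grid)
    print("mode of the estimated density:", grid[np.argmax(density)])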

    Table 16: Estimate and 95% Confidence Interval of n


Table 17: Confidence Intervals (95% confidence) of $\hat\alpha_n$.

Country     $\hat\alpha_n$     Lower        Upper
AUS           -57.387        -58.005      -56.769
AUT           -25.352        -26.414      -24.290
BEL           -41.632        -43.144      -40.121
CAN           -44.442        -45.615      -43.268
CHN           -32.402        -38.126      -26.678
DEN          -113.670       -115.375     -111.965
FIN           -91.869        -94.019      -89.720
FRA           -76.449        -78.162      -74.735
HOL           -71.892        -73.439      -70.344
IND           -59.854        -60.930      -58.779
IRE           -33.058        -34.265      -31.851
ITA           -71.375        -72.604      -70.146
JAP           -27.674        -29.293      -26.056
NOR           -58.519        -60.442      -56.597
USA           -45.950        -46.945      -44.954

Table 18: Confidence Intervals (95% confidence) of $\hat\beta_{1n}$.

Country     $\hat\beta_{1n}$   Lower        Upper
AUS            11.520         10.810       12.230
AUT             5.075          4.193        5.958
BEL             9.116          7.922       10.310
CAN             9.217          8.114       10.321
CHN             7.609          5.063       10.154
DEN            23.724         22.242       25.205
FIN            18.967         17.225       20.709
FRA            16.443         15.000       17.886
HOL            14.921         13.579       16.262
IND            14.908         13.894       15.923
IRE             6.816          6.062        7.570
ITA            14.502         13.647       15.356
JAP             5.447          4.614        6.279
NOR            11.802         10.555       13.050
USA             9.536          8.449       10.623


Table 19: Confidence Intervals (95% confidence) of $\hat\beta_{2n}$.

Country     $\hat\beta_{2n}$   Lower        Upper
AUS            -0.562         -0.730       -0.394
AUT            -0.246         -0.389       -0.102
BEL            -0.485         -0.721       -0.248
CAN            -0.462         -0.732       -0.192
CHN            -0.446         -0.775       -0.118
DEN            -1.226         -1.586       -0.865
FIN            -0.967         -1.256       -0.678
FRA            -0.875         -1.190       -0.559
HOL            -0.762         -1.093       -0.432
IND            -0.945         -1.174       -0.715
IRE            -0.341         -0.443       -0.238
ITA            -0.729         -0.882       -0.576
JAP            -0.258         -0.365       -0.151
NOR            -0.586         -0.822       -0.351
USA            -0.477         -0.768       -0.186
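If the intervals in Tables 17-19 are of the usual normal-approximation form, estimate plus or minus 1.96 times the standard error (an assumption stated here only for illustration; the paper's construction may differ), they can be reproduced from an estimate and its standard error as follows.

    # Illustrative sketch only: normal-approximation 95% confidence interval.
    # Whether Tables 17-19 were built exactly this way is an assumption of this sketch.
    def ci95(estimate, std_error):
        half_width = 1.96 * std_error
        return estimate - half_width, estimate + half_width

    # Back-out example: the AUS entry of Table 19 (-0.562 with interval (-0.730, -0.394))
    # corresponds to a standard error of about (0.730 - 0.394) / (2 * 1.96) = 0.0857
    # under this construction.
    print(ci95(-0.562, 0.0857))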