SEQUENTIAL FIXED-PRECISION ESTIMATION IN STOCHASTIC LINEAR REGRESSION MODELS

31
This article was downloaded by: [83.63.164.4] On: 22 October 2014, At: 12:02 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Sequential Analysis: Design Methods and Applications Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lsqa20 SEQUENTIAL FIXED-PRECISION ESTIMATION IN STOCHASTIC LINEAR REGRESSION MODELS Sujay Datta a a Department of Statistics , University of Michigan , Ann Arbor, Michigan, 48109, U.S.A. Published online: 16 Aug 2006. To cite this article: Sujay Datta (2002) SEQUENTIAL FIXED-PRECISION ESTIMATION IN STOCHASTIC LINEAR REGRESSION MODELS, Sequential Analysis: Design Methods and Applications, 21:3, 161-190, DOI: 10.1081/SQA-120014362 To link to this article: http://dx.doi.org/10.1081/SQA-120014362 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http:// www.tandfonline.com/page/terms-and-conditions

Transcript of SEQUENTIAL FIXED-PRECISION ESTIMATION IN STOCHASTIC LINEAR REGRESSION MODELS

This article was downloaded by: [83.63.164.4]On: 22 October 2014, At: 12:02Publisher: Taylor & FrancisInforma Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House,37-41 Mortimer Street, London W1T 3JH, UK

Sequential Analysis: Design Methods and ApplicationsPublication details, including instructions for authors and subscription information:http://www.tandfonline.com/loi/lsqa20

SEQUENTIAL FIXED-PRECISION ESTIMATION INSTOCHASTIC LINEAR REGRESSION MODELSSujay Datta aa Department of Statistics , University of Michigan , Ann Arbor, Michigan, 48109, U.S.A.Published online: 16 Aug 2006.

To cite this article: Sujay Datta (2002) SEQUENTIAL FIXED-PRECISION ESTIMATION IN STOCHASTIC LINEAR REGRESSIONMODELS, Sequential Analysis: Design Methods and Applications, 21:3, 161-190, DOI: 10.1081/SQA-120014362

To link to this article: http://dx.doi.org/10.1081/SQA-120014362

PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) containedin the publications on our platform. However, Taylor & Francis, our agents, and our licensors make norepresentations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of theContent. Any opinions and views expressed in this publication are the opinions and views of the authors, andare not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon andshould be independently verified with primary sources of information. Taylor and Francis shall not be liable forany losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoeveror howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use ofthe Content.

This article may be used for research, teaching, and private study purposes. Any substantial or systematicreproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in anyform to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

SEQUENTIAL FIXED-PRECISION

ESTIMATION IN STOCHASTIC LINEAR

REGRESSION MODELS

Sujay Datta*

Department of Statistics, University of Michigan,Ann Arbor, MI 48109-1027

ABSTRACT

The goal of this article is to address the problem of fixed-precision estimation of parameters in a linear regressionsetup with stochastic regressors. First in the case of a linearregression model where the regressor variable and the randomerrors have independent Gaussian distributions, sequentialsampling schemes are proposed for fixed proportional accu-racy estimation and fixed-width interval estimation of theregression-slope based on a Chebyshev inequality approach.The asymptotic second-order efficiency of these procedures isthen established using the techniques developed in Aras andWoodroofe (1993). Asymptotic second-order expansions arederived for a lower bound of P (relative error<preassignedbound) in the fixed proportional accuracy estimation case andthat of the coverage-probability in the interval-estimation

161

DOI: 10.1081/SQA-120014362 0747-4946 (Print); 1532-4176 (Online)Copyright & 2002 by Marcel Dekker, Inc. www.dekker.com

SEQUENTIAL ANALYSISVol. 21, No. 3, pp. 161–190, 2002

*Present address: Department of Mathematics and Computer Science, NorthernMichigan University, 1401 Presque Isle Ave., Marquette, Michigan 49855. E-mail:

[email protected]

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

case. In both cases, the possibility of relaxing the Gaussian

assumption is explored, leading to a reconsideration of

Martinsek’s (1995) fixed proportional accuracy estimation

and a detailed discussion of a stochastic multiple linear regres-

sion model with distribution-free errors and regressors. In this

distribution-free multiple regression scenario, construction of

fixed-size confidence regions for the vector of regression-

parameters is considered and an asymptotically second-

order efficient sequential methodology is put forward.

Moderate sample-size performances of some of these

procedures are investigated via simulation-studies.

Key Words: Stochastic linear regression; Sequential sampling;

Second-order asymptotics; Fixed-proportional-accuracy esti-

mation; Chebyshev’s inequality

AMS 1991 Subject Classifications: 62F10; 62F12; 62F25;

62G05

1. INTRODUCTION

Let us first consider a stochastic linear regression model where theregressors, as well as the random errors, have a probability distribution.Examples of such models are found in time series analysis, dynamicinput-output systems, adaptive stochastic approximation schemes, stochas-tic control theory and so forth. See Lai and Wei (1982) for some of theseapplications and other references. While there is an extensive literature onvarious fixed sample-size and sequential methodologies for fixed-precisionpoint and interval estimation of the regression-parameters with determinis-tic regressors (see, for example, Albert (1966), Gleser (1965), Martinsek(1990), Mukhopadhyay and Abid (1986), Mukhopadhyay and Datta(1995), Sriram (1992)), not nearly as much has been done with stochasticregressors. Only a handful of authors such as Martinsek (1995), Finster(1983) and Sriram and Bose (1988) have worked with random explanatoryvariables. Here our goal is to develop asymptotically efficient sequentialestimation methodologies in a regression model of this kind. For simplicity’ssake, we first look at the case where there is only one regressor variable Xand one response variable Y, and the regressor is independent of the randomerrors. To be more precise, we start by assuming the model

Yi ¼ � þ �Xi þ �i; i ¼ 1, . . . , n ð1:1Þ

162 DATTA

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

where � and � are unknown constants, the random errors �i’s are i.i.dNð0, �2

� Þ, Xi’s are i.i.d. Nð�, �2X Þ and �i’s are independent of Xi’s. Later in

Sec. 4, we deal with a multiple regression model and also drop the assump-tion of normality to take a distribution-free approach. Notice that under themodel (1.1), for each i 2 f1, . . . , ng, EðYijXiÞ ¼ � þ �Xi, EðYiÞ ¼ � þ ��,VðYiÞ ¼ �2�2

X þ �2� , COVðXi,YiÞ ¼ ��2

X and (Xi,Yi) has a bivariate normaldistribution with these parameters.

Under this setup, we address the problem of point-estimating � in Sec. 2such that the proportional error of estimation jð�̂� �Þ=�j is bounded above bya preassigned quantity � with a high probability. In other words, given �>0and 2 ð0, 1Þ, we would like to havePðjð�̂� �Þ= �j �Þ � 1 for all valuesof the unknown parameters �, �2

X and �2� . We call it fixed proportional accu-

racy estimation. Martinsek (1983) first motivated this kind of a loss functionin the context of estimating a regression slope. He argued that it is appro-priate when there is a possibility that the parameter (which is being estimated)may be close to 0, so that a small absolute error is too weak a requirementto place. Later on, Sriram (1991) used a slightly different version of this loss(i.e., squared proportional error plus sampling cost, instead of absolute pro-portional error) for sequential estimation of the mean in the distribution-freescenario. More recently, Martinsek (1995) considered fixed proportionalaccuracy estimation of the slope in a simple regression model with stochasticregressors having an unknown distribution and with distribution-free errorsas well. Under appropriate moment conditions on these unknown distribu-tions, he derived the asymptotic first-order properties of his procedure. In thepresent article, although at first we talk about estimation in a simple stochas-tic regression model under the normality assumption, our ultimate goal is toexplore a distribution-free multiple regression scenario and our emphasisthroughout is on asymptotic second order optimality. In this sense it can beconsidered an improvement over Martinsek’s (1995) paper. In Sec. 2, we usethe ordinary least-squares estimator �̂�ðnÞOLS of � from the conditional regressionmodel: EðYijXiÞ ¼ � þ �Xi, i¼ 1, . . . , n, and apply Chebyshev’s inequality toget a lower bound for the above-mentioned probability of the proportionalerror being<�. We eventually conclude that this fixed proportional accuracypoint-estimation problem does not have a fixed sample-size solution.This motivates us to propose a new sequential methodology whichstarts with m0(� 2) sample-vectors {(Xi,Yi): i¼ 1, . . . ,m} for the m definedin Eq. (2.3) and continues to observe one vector (Xj,Yj) at a time untila stopping number N, defined in Eq. (2.4), is reached. At the stopped stage,based on the observations {(Xj,Yj ): j¼ 1, . . . ,m, . . . ,N}, we estimate �by �̂�ðNÞOLS ¼ S

ðNÞXY=S

ðNÞXX where S

ðNÞXY ¼

PNj¼1ðXj

�XXNÞðYj �YYNÞ and SðNÞXX ¼PN

j¼1ðXj �XXNÞ

2. We show that this procedure is asymptotically second-order efficient in the sense of Ghosh and Mukhopadhyay (1981) by deriving

ESTIMATION IN STOCHASTIC REGRESSION 163

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

asymptotic second-order expansions for E(N) and for a lower boundof P½jð�̂�ðNÞOLS �Þ=�j < �� as �! 0. Later in Sec. 6.1, we demonstrate themoderate sample-size performance of this procedure by means of simulationexercises.

At the end of Sec. 2, we briefly explore the possibility of relaxing theGaussian assumption which played an important role in this section.

We then switch to the construction of fixed-width confidence inter-vals for � in the above framework in Sec. 3. As before, we take theChebyshev’s inequality approach to this problem. Once again we will seethat there is no fixed sample-size procedure to construct intervals ofspecified widths that have confidence-coefficients at least a preassignedquantity for all possible values of the unknown parameters. So we proceedsequentially and devise a sequential sampling scheme which is asymptoti-cally second-order efficient in the aforementioned sense as the width of thecorresponding interval shrinks to zero. This is shown in Theorems 3.1. Wealso derive an asymptotic second-order expansion for a lower bound of thecoverage-probability of the resulting interval. Once again the issue ofrelaxing the Gaussian assumption is raised at the end, and this time itleads to a fixed-size confidence set estimation procedure under a moregeneral setup in Sec. 4.

In Sec. 4, we allow more than one regressor variable in our model and,at the same time, generalize to a distribution-free setup by relaxing theassumption of normality exploited so far. To be precise, we now considerthe model

Yn ¼ Xn�þ �n, ð1:2Þ

where Y0n ¼ ðY1, . . . ,YnÞ is the vector of responses, �0 ¼ ð�1, . . . ,�pÞ is thevector of regression parameters, Xn is the n� p matrix of regressor variableswith the rows of Xn being i.i.d. Observations from an unknown p-variatedistribution F1 with mean �X and dispersion matrix VX¼ ((�ij)), and�0n¼ (�1, . . . ,�n) is the vector of unobservable random errors with �i ei:i:d:F2,an unspecified distribution with mean 0 and variance �2. Denoting(Xi1, . . . ,Xip), the i-th row of Xn, by X

(i), (1 i n), we continue to assumethat the X(i)’s are independent of the �i’s, and impose higher order moment-conditions on F1 and F2 only as necessary. Also, we will need the additionalassumption that X0nXn is nonsingular a.s. for all n� 1.

Under this set-up, assuming that �X, VX and �2 are unknown, we areinterested in constructing fixed-size confidence regions for � with a preas-signed confidence-level (1 ). For deterministic regressors, the correspond-ing problem has been studied extensively by several authors. Using theasymptotic theory of fixed-width confidence intervals for the mean of an

164 DATTA

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

unknown distribution developed in Chow and Robbins (1965), Gleser(1965) constructed fixed-size spherical confidence regions for � centered atits OLS estimator following a sequential sampling scheme. In doing so,he needed V�

¼ limn!1 nVðnÞ to exist and be nonsingular, where V(n) is

the dispersion matrix of �̂�ðnÞOLS ¼ ðX 0nXnÞ

1X 0nYn. Not only does this

lead to the strong consistency of �̂�ðnÞOLS for �, as shown later by Lai,Robbins and Wei (1979), but also the sequential stopping rule used byGleser (1965) actually depended on the eigenvalues of V*. Later Albert(1966) proposed confidence ellipsoids for � with kernels proportional toX0nXn and centered at �̂�ðnÞOLS. In the process, he also developed analogoustechniques for estimable functions of � so that his results can eventually beadapted to sequential testing of the general linear hypothesis. While Gleser(1965) and Albert (1966) confined themselves to the first-order asymptoticproperties of their methodologies, Mukhopadhyay and Abid (1986) wereable to achieve asymptotic second-order efficiency for their own sequentialprocedure, although they had to assume Gaussain errors. They also usedconfidence-ellipsoids with kernels proportional to X0nXn and utilizing toolsfrom Woodroofe (1977), they derived asymptotic second-order expansionsfor both the ASN and the coverage-probability of the resulting interval.A slight variant of their methodology can also be applied to theproblem of bounded-risk point estimation of � under certain specificloss-functions, as pointed out by Datta (1995). As for the problem ofminimum-risk point estimation of �, Martinsek (1990) and Sriram (1992)are excellent references.

The case with stochastic regressors, however, has received relativelylittle attention. Lai and Wei (1982) addressed this problem in a moregeneral distribution-free setup using the asymptotic normality ofn1=2ð�̂�ðnÞOLS �Þ as n!1 under some regularity conditions. But theirswas a fixed sample-size procedure without any preassigned size-restrictionon the confidence region, the design levels were past measurable and theerrors were martingale difference sequences. Finster (1983) consideredsequential minimum-risk point estimation of the regression parameter-vector in a general linear model where X’s were allowed to be randomhaving an unknown distribution. Unlike these authors, our goal here is todevise an asymptotically second-order efficient sequential procedure forconstructing fixed-size ellipsoids for � under the model (1.2). Our ellipsoidsare similar in form to those of Mukhopadhyay and Abid (1986), andalthough we too need certain growth-conditions on X0n Xn in order for�̂�2n to be strongly consistent for �2, our stopping variable is neither directly

dependent on any limiting matrix such as V* nor based on any asymptoticnormal approximation. The procedure is described in Eqs. (4.4)–(4.5) andits second-order asymptotic characteristics summarized in Theorem 4.1,

ESTIMATION IN STOCHASTIC REGRESSION 165

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

the proof of which once again exploits results from Aras and Woodroofe(1993). At the end of Sec. 4, we briefly remark on how an appropriatemodification of our methodology also applies to bounded-risk pointestimation of �.

2. STOCHASTIC LINEAR REGRESSION: FIXED

PROPORTIONAL ACCURACY ESTIMATION

Consider the model (1.1) and assume the same setup as in the begin-ning of the previous section. For notational convenience, let us writeVðY1Þ ¼ �2

Y , COV(X1,Y1)¼ �XY, E(Y1)¼�Y and E(X1)¼�X. Clearly,� ¼ �XY=�

2X and the correlation � between the two variables is �¼ �XY/

(�X�Y). Notice that the conditional regression model E(Y jX )¼ �þ �Xcan also be written as

EðY jXÞ ¼ �Y þ �ðX �X Þ, ð2:1Þ

and the conditional variance of Y given X is �2Y jX ¼ �2

Y ð1 �2Þ. Havingobserved fðXi,YiÞ : i ¼ 1, . . . , ng, we define �XXn ¼ n

1Pni¼1 Xi,

�YYn ¼n1

Pni¼1 Yi,

SðnÞXX ¼

Xni¼1

ðXi �XXnÞ2, S

ðnÞYY ¼

Xni¼1

ðYi �YYnÞ2,

SðnÞXY ¼

Xni¼1

ðXi �XXnÞðYi �YYnÞ

and estimate � by �̂�ðnÞOLS ¼ SðnÞXY=S

ðnÞXX , which is its ordinary least-squares esti-

mator in the conditional regression model (2.1). Suppose the loss incurredby using �̂�ðnÞOLS as an estimator of � is 0 if the relative error jð�̂�ðnÞOLS �Þ=�j isat most a preassigned bound �, and is 1 otherwise. With this loss-function,we would like to bound the risk from above by some given 2 ð0, 1Þ, that iswe would like to have P½jð�̂�ðnÞOLS �Þ=�j < �� � 1 .

2.1. Chebyshev’s Inequality Approach

In order to achieve the above-mentioned goal, we take an approachbased on Chebyshev’s inequality. This approach was originally introducedby Mukhopadhyay and Datta (1996) in the context of constructing fixed-width confidence intervals for the mean in the distribution-free scenario. See

166 DATTA

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

Datta (1995) for additional details. Observe that under the setup describedabove, Eð�̂�ðnÞOLSÞ ¼ EXEð�̂�

ðnÞOLSjXÞ ¼ EX ð�Þ ¼ � and Vð�̂�ðnÞOLSÞ ¼ EXEfð�̂�

ðnÞOLS

�Þ2jXg ¼ EX ð�2� =S

ðnÞXX Þ. These follow from the standard theory of simple

linear regression applied to the conditional regression model (2.1), since Xis assumed to be independent of �. Now, ðS

ðnÞXX=�

2X Þ has a 2

n1 distributionand hence EX ð�

2� =S

ðnÞXX Þ ¼ ðn 3Þ1ð�2

� =�2X Þ, provided that n� 4. Also, by

Chebyshev’s inequality, for any given 2 ð0, 1Þ, P½jð�̂�ðnÞOLS �Þ=�j �� �1 E½ð�̂�ðnÞOLS �Þ=ð��Þ�2 ¼ 1 �2

� fðn 3Þ�2X�

2�2g1 � 1 if

�2�

ðn 3Þ�2X�

2�2 ()n� 3þ

�2�

�2�2�2X

¼ 3þ1 �2

�2�2¼ n0 ðsayÞ, ð2:2Þ

since � ¼ �XY=�2X . In some sense, therefore, we can call n0 an optimal fixed

sample-size for this problem. However, this n0 is useless in practice since � isunknown. So we now propose the following purely sequential samplingscheme: start with {(Xi,Yi): i¼ 1, . . . ,m} where

m ¼ mð�Þ ¼ max 4, ð�2Þ1=2� ��

þ1n o

ð2:3Þ

and continue to observe one (Xj,Yj) at a time until there are N of them where

N ¼ Nð�Þ ¼ inf n � m : n � 1 �̂�2n�

= �̂�2n�2

� �, ð2:4Þ

with �̂�n ¼ SðnÞXY=ðS

ðnÞXXS

ðnÞYY Þ

1=2. Finally having observed {(Xi,Yi): i¼1, . . . ,m, . . . ,N}, estimate � by �̂�ðNÞOLS ¼ S

ðNÞXY=S

ðNÞXX . It is easy to see that N is

finite almost surely and it blows up to 1 as �! 0 for every fixed value of 2 ð0, 1Þ, � 2 R and � 2 ½0, 1�. Also, since �̂�n ! � a.s. as n!1,PðNð�Þ <1Þ ¼ 1 for every fixed value of �, , � and �. In view of thestrong consistency of �̂�n for �, it is easy to see that N=n0 ! 1 a.s. as�! 0, by letting � tend to 0 in the following string of inequalities resultingfrom (2.4):

ð1 �̂�2NÞ

�̂�2N�2n0

N

n0ð1 �̂�2N1Þ

�̂�2N1�2n0

þm

n0: ð2:5Þ

This shows the almost sure asymptotic behavior of the stopping vari-able N in comparison with n0. Now we will examine how E(N), also calledthe average sample number (ASN), behaves asymptotically w.r.t. n0 and findout how close P½jð�̂�ðNÞOLS �Þ=�j < �� gets to its preassigned target (1 )asymptotically as �! 0.

Theorem 2.1. Under the present setup, we have as �! 0,

ESTIMATION IN STOCHASTIC REGRESSION 167

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

(i ) EðN n0Þ ¼ �1 �2 3þ �ð1Þ;(ii ) P½jð�̂�ðNÞOLS �Þ=�j < �� � ð1 Þ n10 K þ �ð�2Þ;

where �1 comes from Eq. (2.10), �2 is defined in Eq. (2.11) and K is obtainablefrom Eqs. (2.13)–(2.14).

Remark 2.1. The above theorem shows that the purely sequential procedure(2.3)–(2.4) is asymptotically second-order efficient in the sense of Ghosh andMukhopadhyay (1981).

Proof. Let us first define i.i.d. random vectors U1,U2, . . . , in thefollowing way:

Ui ¼ ðXi �X ,Yi �Y , ðXi �X Þ2 �2

X , ðYi �Y Þ2 �2

Y ,

ðXi �X ÞðYi �Y Þ �XY Þ: ð2:6Þ

If we denote by G the common distribution of the Ui’s, then G has mean Oand dispersion matrix

M ¼M1 O

O M2

� where M1 ¼

�2X �XY

�XY �2Y

" #and

1

2M2 ¼

�4X �2

XY �2X�XY

�2XY �4

Y �2Y�XY

�2X�XY �2

Y�XY ð�2X�

2Y þ �2

XY Þ=2

264375:

Next we define a function g : D!R as

gðx1, . . . , x5Þ

¼ð1 �2Þðx5 x1x2 þ �XY Þ

2

�2fðx3 x21 þ �2

X Þðx4 x22 þ �2

Y Þ ðx5 x1x2 þ �XY Þ2g

ð2:7Þ

where D ¼ fðx1, . . . , x5Þ 2 R5 : ðx3 x

21 þ �2

X Þðx4 x22 þ �2

Y Þ ðx5 x1x2þ�XY Þ

26¼ 0g. It is easy to observe that gðOÞ ¼ 1 and g is twice continuously

differentiable on some neighborhood of O. If we now define functionsgn : D! for n � 1 as

gnðx1, . . . , x5Þ

¼�2ð1 �2Þ

max 1=n,ðx3 x

21 þ �2

X Þ ðx4 x22 þ �2

Y Þ ðx5 x1x2 þ �XY Þ2

ðx5 x1x2 þ �XY Þ2

" #ð2:8Þ

168 DATTA

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

if n > �2ð1 �2Þ1, and gnðx1, . . . , x5Þ ¼ gðx1, . . . , x5Þ if n �2ð1 �2Þ1,clearly gn ¼ g 8 n � 1 on a sufficiently small neighborhood of O.Notice that the stopping rule (2.4) has the same form as that in Eq. (2)of Aras and Woodroofe (1993) with a ¼ ð1 �2Þ1�2�2 ¼ n�0 ðsayÞ andZn ¼ nþ hc,Dni þ �n for n � 1, where c ¼ DðgÞjO ¼ ð0, 0, �2X ð1 �2Þ1,�2Y ð1 �2Þ1, 2�1XY ð1 �2Þ1Þ, Dn ¼

Pni¼1Ui and

�n ¼ ngnð �UUnÞ þ

Pni¼1 ðXi �X Þ

2

�2X ð1 �2Þ

þ

Pni¼1 ðYi �Y Þ

2

�2Y ð1 �2Þ

2Pn

i¼1 ðXi �X ÞðYi �Y Þ

�XY ð1 �2Þ n:

This is precisely what Eq. (17) of Aras and Woodroofe (1993)demands. Also, since (X,Y) has a bivariate normal distribution, Eq. (12)of Aras and Woodroofe (1993) holds here ði:e:,

RR5 jjujj

qGðduÞ <1Þ for anyq>0. Hence, by their Proposition 4, the conditions (C4), (C5) and (C6) oftheir paper hold under the present setup with ¼ q=2 (i.e., for any >0)and � ¼ 1=2hW,D2

ðgÞjOWi where W ¼ ðW1, . . . ,W5Þ0� N5ðO,MÞ and

D2ð gÞjO¼

A1 O

O A2

� with

1

2A1 ¼

�2Y ð�

2X�

2Y �2

XY Þ1

�1XY�1XY �2

X ð�2X�

2Y �2

XY Þ1

� and

ð�2X�

2Y �2

XY Þ2A2

¼

2�4Y �2

X�2Y þ �2

XY 2�1XY�2Y ð�

2XY þ �2

X�2Y Þ

�2X�

2Y þ �2

XY 2�4X 2�1XY�

2X ð�

2XY þ �2

X�2Y Þ

2�1XY�2Y ð�

2XY þ �2

X�2Y Þ 2�1XY�

2X ð�

2XY þ �2

X�2Y Þ 2�2XY�

2X�

2Y ð�

2X�

2Y þ 3�2

XY Þ

264375:

ð2:9Þ

Also, (C1) of Aras and Woodroofe (1993) is obviously true in this casefor any positive p as the distribution G has mean O and the distribution of(X,Y) has finite joint moments of all positive orders. In order to verifycondition (C2) of Aras and Woodroofe (1993) in this situation, observethat Zn ¼ ngnð �UUnÞ and hence Zn n

2 for all but a few small values of n.Also, Zn kn on fðð1 �̂�2nÞ�

2Þ=ðð1 �2Þ�̂�2nÞ � 1=kg w.p. 1 for any k>0

and any n� 2 which, together with the fact that n2sPfðð1 �̂�2nÞ�2Þ=

ðð1 �2Þ�̂�2nÞ < 1=kg ! 0 as n!1 for any s>0 and some k>1, establishes

ESTIMATION IN STOCHASTIC REGRESSION 169

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

the validity of condition (C2) here. For a proof of this latter fact, see LemmaA.1 in the Appendix. The verification of condition (C3) of Aras andWoodroofe (1993) under the present setup can be done in the same wayas Datta (1995). Details are provided in Lemma A.2 of the Appendix. At thispoint, let us define �1¼E(R) where the distribution of R is given by

Pðr R rþ drÞ ¼1

Eð�ÞP �

P�i¼1 ðXi �X Þ

2

�2X ð1 �2Þ

P�i¼1 ðYi �Y Þ

2

�2Y ð1 �2Þ

(

þ2P�

i¼1 ðXi �X ÞðYi �Y Þ

�XY ð1 �2Þ> r

�dr ð2:10Þ

for 0 < r <1 with

� ¼ inf

(n � 1 : n

Pni¼1 ðXi �X Þ

2

�2X ð1 �2Þ

Pni¼1 ðYi �Y Þ

2

�2Y ð1 �2Þ

þ2Pn

i¼1 ðXi �X ÞðYi �Y Þ

�XY ð1 �2Þ> 0

):

Let us also define �2 ¼ Eð�Þ ¼12EðhW,D2

ðgÞjOWiÞ. So

�2 ¼ �2X�

2Y �2

XY

� 2n2�2

X�2Y �2

X�2Y �2

XY

� 2 �2

X�2Y �2

XY

� 2þ4�4

X�4Y

þ �2X�

2Y�

2XY �2

X�2Y þ �2

XY

� �2X�

2Y þ 3�2

XY

� þ 2�2

XY �2X�

2Y þ �2

XY

� 8�2

X�2Y �2

XY þ �2X�

2Y

� o¼ �4�8

X�6 1 �4�

: ð2:11Þ

Notice that by Proposition 3 of Aras and Woodroofe (1993),ðNð1=2Þ

DN , �N ,RNÞ ) ðW, �,RÞ in distribution as �! 0 where RN ¼ZN n

�0 is the ‘overshoot ’ at the stopped stage, R is as in Eq. (2.10) and

ðW, �Þ is independent of R.Now the first part of our Theorem 2.1 follows directly from Theorem 1

of Aras and Woodroofe (1993) and the fact that n0 n�0 ¼ 3. To prove the

other part of Theorem 2.1, we will utilize Theorem 4 of Aras andWoodroofe (1993). Since Eq. (12) of their paper is true in our case forany q>0 and conditions (C1)–(C6) hold here for any >0 and p>0 with� as defined earlier, we are immediately in a position to apply theirTheorem 4 with h : D�

! R defined as

170 DATTA

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

hðx1, . . . , x5Þ ¼f�2X ðx5 x1x2Þ �XY ðx3 x

21Þg

2

�2XY ðx3 x

21 þ �2

X Þ2

ð2:12Þ

where D�¼ fðx1, . . . , x5Þ 2 R

5 : x3 x21 þ �2

X 6¼ 0g. It is easy to see thath(O)¼ 0, h is four times continuously differentiable on a neighborhood of0 and D(h)jO¼O. Since m ¼ mð�Þ ! 1 as �! 0, Efsupk�m jhð �UUkÞj

rg will

be finite for some r � ðq 1Þ=ðq 3Þ if � is sufficiently small. This is becausea 2 distribution with k degrees of freedom has finite moments of all positiveorders and finite inverse moments of order < k/2. For q� 6, obviouslyðq 1Þ=ðq 3Þ 5=3 so that Efsupk�m jhð �UUkÞj

rg <1 for r ¼ ðq 1Þ=

ðq 3Þ if � is small enough to ensure that m� 8. In other words, Eq. (13)of Aras and Woodroofe (1993) holds in the present setup withhk � h 8 k � m. Since, in addition to that, hc,U1i has a non-arithmetic dis-tribution in this case, Eq. (16) of their paper immediately yields:

ð1 �2Þ

�2�2

( )2

E�̂�ðNÞOLS �

!2

1

2hD2

ðhÞjO,Mi2ð1 �2Þ

�2�2

¼1

2hD2

ðhÞjO,�2i2 þ1

6hD3

ðhÞjO,�3i3 þ1

24hD4

ðhÞjO,�4i4 þ �ð1Þ

ð2:13Þ

as �! 0 where h�, �ik denotes the natural inner product on the inner productspace of all k-linear functionals, Dk(h) jO denotes the 5� � � � � 5 (k times)dimensional hypermatrix of the k-th order partial derivatives of h evaluatedatO, and the forms of�k (k¼ 2,3,4) are obtainable respectively fromEqs. (9),(10) and (11) of Aras and Woodroofe (1993). Since under the current setup,the conditions of both Theorem 2 and Theorem 3 in Aras and Woodroofe(1993) are satisfied, we can use them to obtain the following:

�2 ¼ Ef2�ðWW0Þg þ ðc

0Mc v2 v1ÞM þ 2Mcc0M

þ Eð2hc,U1iU1U01g;

�3ði, j, kÞ ¼X5

l¼16mijmklcl þ EGðUi1Uj1Uk1Þ;

�4ði, j, k, lÞ ¼ 3mijmkl

ð2:14Þ

where U1 ¼ ðU11, . . . ,U51Þ is as in Eq. (2.6), mij is the (i, j)-th entry of M,�3(i, j, k) is the (i, j, k)-th entry of�3 and�4(i, j, k,l) is the (i, j, k,l)-th entry of�4 (1 i, j, k, l 5). The evaluation of the three expectations present in theabove expressions for �2 and �3 will involve various joint moments of abivariate normal distribution, such as EðX1 �X Þ

6¼ 15�6

X , EðY1 �Y Þ6¼

ESTIMATION IN STOCHASTIC REGRESSION 171

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

15�6Y , EfðX1�X ÞðY1�Y Þ

5g¼15�X�

5Y�EfðX1�X Þ

5ðY1�Y Þg¼15�5

X�Y�,EfðX1�X Þ

2ðY1�Y Þ

4g¼3�2

X�4Y ð1þ4�2Þ, EfðX1�X Þ

4ðY1�Y Þ

2g¼3�2

Y�4X

ð1þ4�2Þ and EfðX1�X Þ3ðY1�Y Þ

3g¼9�3

X�3Y� þ 6�3

X�3Y�

3.In view of the fact that

D2ðhÞjO ¼

O O

O B

� where

1

2B ¼

�4X 0 �2X �1XY0 0 0

�2X �1XY 0 �2XY

24 35,some simplification of Eq. (2.13) after division throughout by ð1 �2Þ �1�2�2 leads us to the second part of our theorem with K comingfrom Eqs. (2.13)–(2.14), since P½jð�̂�ðNÞOLS�Þ=�j���1E½ð�̂�ðNÞOLS�Þ=ð��Þ�2

by Chebyshev’s inequality, and since n10 n�10 ¼�ð�2Þ as �!0. Thiscompletes the proof.

Remark 2.2. One of the reasons we have presented the proof of Theorem 2.1in great details is that we will often refer back to it in the following sectionsand thereby avoid repeating similar steps in the proofs of the remainingtheorems.

Remark 2.3. In the above analysis, the assumption of normality on theerrors and regressors was exploited in a crucial way. A partinent questionis whether it is possible to get rid of this assumption and formulate anestimation procedure that still enjoys asymptotic second-order propertiessimilar to the ones in Theorem 2.1. This question is not only inspired bythe desire to increase the scope of application of our methodology, but isalso motivated by Martinsek’s (1995) work. As mentioned earlier, hedevised an asymptotically first-order efficient fixed proportional accuracyestimation procedure in the distribution-free scenario. It might be of interestto investigate if Martinsek’s (1995) sequential stopping rule lends itself to asimilar analysis as above. Under the setup of this section, assume now thatXi’s are i.i.d.�F1 (unknown) and �i’s are i.i.d.�F2 (unknown) and theerrors are independent of the regressors. In order to estimate � so thatPðjð�̂� �Þ=�j �Þ � 1 for all values of �, �2

X and �2� , Martinsek (1995)

used the fact that n1=2ð�̂�ðnÞOLS �Þ ! Nð0, �2� =�

2X Þ in distribution as n!1

to find an optimal fixed sample-size n00 for this problem: n00 ¼n00ð�Þ ¼ z

2�

2� =ð�

2�2�2X Þ. But as this n00 contains unknown parameters, he

essentially proposed the following sequential sampling scheme: start with

172 DATTA

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

m (Xi,Yi)-pairs where m ¼ mð�Þ ¼ maxf2, ½z=���þ 1g and continue taking

one more at a time until there are N0 of them where

N 0¼ N 0

ð�Þ ¼ inf n � m : n �z2�̂�

2�n

ð�̂�ðnÞOLSÞ2�̂�2Xn�

2

( ), ð2:15Þ

with �̂�2�n ¼ n

1Pn1ðYi �̂�ðnÞOLS �̂�ðnÞOLS

�XXiÞ2, �̂�2

Xn ¼ SðnÞXX=n and �̂�ðnÞOLS ¼

�YYn�̂�ðnÞOLS

�XXn. And finally estimate � at the stopped stage by �̂�ðNÞOLS. Notice thatthe N0 above is a slight variant of Martinsek’s (1995) stopping rule, but hisearly stopping prevention mechanism (i.e., addition of a ‘‘fudge factor’’1/n�2) is already incorporated in our choice of the initial sample-sizem¼m(�). Assuming that F1 has finite absolute moments of orders upto6þ � for some �>0, and that F2 has finite moments of orders upto 4,Martinsek showed that N 0=n00 ! 1 a.s., EðN 0

Þ=n00 ! 1 and P½jð�̂�ðNÞOLS �Þ=�j �� ! 1 , as �! 0. In addition to these asymptotic first-orderproperties, he also established the asymptotic normality of ðN 0

n00Þ=ðn00Þ

1=2. However, due to the basic similarity between his approach andours, it is not difficult to prove that:

Theorem 2.2. Under the distribution-free setup described here, assuming thatF1 has finite absolute moments of orders 12, that PðjX1 X2j wÞ ¼ �ðw�

Þ

as w! 0 for some � > 0 and also that F2 has finite absolute moments oforders 12þ � for some �> 0,

(i ) EðN 0 n00Þ ¼ ��1 ��2 þ �ð1Þ;

(ii ) P½jð�̂�ðN0Þ

OLS �Þ=� �� � ð1 Þ K�=n00 þ �ð�2Þ;

as �! 0 where ��1, ��2 and K* are appropriately defined constants.

Proof. We only provide a very brief sketch. Define Ui’s as in Eq. (2.6), anddefine the functions g(x1, . . . , x5) and {gn(x1, . . . , x5)}n�1 in an appropriateway so that they have the properties described in the proof of Theorem 2.1and the stopping variable N0 can be rewritten as N 0

¼ inffn � m :ngnð �UUnÞ � n

00g (i.e., Eq. (17) of Aras and Woodroofe (1993) is satisfied).

Then notice that, in view of the moment conditions on F1, Eq. (12) of theirpaper holds here with q¼ 6. Hence, by their Proposition 4, conditions (C4),(C5) and (C6) hold under the current setup with ¼ 3 and � as in the proof ofTheorem 2.1 (except that if now involves the second-order partial derivativesof the new function g). Also, condition (C1) clearly holds with p¼ 6, andconditions (C2) and (C3) can be verified using appropriate modifications ofLemmas A.1 and A.2. Then, defining ��1 and ��2 in a manner similar toEqs. (2.10) and (2.11) and defining a function h(x1, . . . , x5) exactly as inEq. (2.12), we see that Eq. (13) of Aras and Woodroofe (1993) is satisfied

ESTIMATION IN STOCHASTIC REGRESSION 173

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

with r¼ 53 in view of Lemma A.1 in the appendix of Sriram (1991) which

guarantees that Efsupn�2Lþ1ðn=SðnÞXX Þ

5=3g <1 for L>5/3�. Then an applica-

tion of Theorems 1 and 4 of Aras andWoodroofe (1993) completes the proof.

3. STOCHASTIC LINEAR REGRESSION: FIXED-WIDTH

INTERVAL ESTIMATION

Consider once again the simple linear regression model (1.1). Under thesame setup and notations as in the previous section, we are now interested inconstructing fixed-width (¼ 2d ) confidence intervals for � having confidence-coefficients at least (1 ) for some given d>0 and 2 ð0, 1Þ. This can alsobe viewed in an obvious way as a bounded-risk estimation problem with anappropriate loss-function. In Sec. 3.1, we once again take a Chebyshev’sinequality approach to devise a sequential methodology and in Theorem3.1, we derive asymptotic second-order expansions for its ANS and for alower bound of the associated coverage probability. The technique of proofis very similar to that of Theorem 2.1, and hence details are omitted.

3.1. Chebyshev’s Inequality Approach

Having observed fðXi,YiÞ : i ¼ 1, . . . , ng, we use �̂�ðnÞOLS ¼ SðnÞXY=S

ðnÞXX as the

pivot and consider the interval ð�̂�ðnÞOLS d, �̂�ðnÞOLS þ d Þ ¼ Jn, say. Since for any

n� 2, Pð� 2 JnÞ ¼ Pðj�̂�ðnÞOLS �j d Þ � 1 Eð�̂�ðnÞOLS �Þ2=d2, we follow the

same line of reasoning as in Sec. 2.1 to conclude that a sufficient condition forPð� 2 JnÞ to be at least (1 ) for a given is the following:

�2�

ðn 3Þ�2Xd

2 ()n� 3þ

�2�

d2�2X

¼ 3þ�2Y ð1�2Þ

d2�2X

¼ n1 ðsayÞ, ð3:1Þ

provided of course that n� 4. But since this n1 involves unknown param-eters, we now devise a sequential procedure in which, the boundary-condition for optional stopping essentially mimics the expression for n1.Starting with fðXi,YiÞ : i ¼ 1, . . . ,mg where m is the same as in Eq. (2.3)with � replaced by d, we proceed by observing one additional vector(Xj,Yj) at a time until we have gathered N1 of them where

N1 ¼ N1ðd Þ ¼ inf n � m : n �SðnÞYY ð1 �̂�2nÞ

d2SðnÞXX

( ): ð3:2Þ

with �̂�n being the same as in Eq. (2.4). At the stopped stage, based onfðXi,YiÞ : i ¼ 1, . . . ,m, . . . ,N1g, we construct the interval JN1

. Once again

174 DATTA

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

it is not difficult to see that PðN1ðdÞ <1Þ ¼ 1 and that N1=n1 ! 1 a.s. asd ! 0, for every fixed value of , �2

X , �2Y and �. But our primary interest

lies in the asymptotic second-order properties of this procedure which aresummarized in the following theorem.

Theorem 3.1. For the sequential interval-estimation procedure (3.2), we haveas d ! 0:

(i) EðN1 n1Þ ¼ �4 6þ �ð1Þ;(ii) Pð� 2 JN1

Þ � ð1 Þ n11 K1 þ � ðd2Þ; where �4 is defined in the

paragraph immediately after Eq. (3.6) and K1 is obtainable fromthe paragraph immediately after Eq. (3.7).

Remark 3.1. This theorem testifies to the asymptotic second-order efficiencyof the methodology described above in the sense of Ghosh andMukhopadhyay (1981).

Proof. The technique of proof is essentially the same as in the previoussection. We once again make extensive use of Aras and Woodroofe (1993)having noticed the similarity between the stopping variable ta in Eq. (2) oftheir paper and our N1 with a ¼ �2

Y ð1 �2Þ=ðd2�2X Þ ¼ n

�1 (say) and Zn ¼

ng�nð �UUnÞ ¼ nþ hc�,Dni þ ��n for n� 1, where �UUn ¼ n

1Pni¼1Ui with Ui’s

coming from Eq. (2.6), Dn ¼Pn

i¼1Ui, g�n : D!R are functions defined as

g�nðx1, . . . , x5Þ

¼�2X �2

Y ð1 �2Þ

max ð1=nÞ,ðx3 x

21 þ �2

X Þðx4 x22 þ �2

Y Þ ðx5 x1x2 þ �XY Þ2

ðx3 x21 þ �2

X Þ2

" # ð3:3Þ

for n > �2X�

2Y ð1 �2Þ1 and D as in Eq. (2.7),

c�¼ 0, 0,

1 2�2

�2X ð1 �2Þ

, 1

�2Y ð1 �2Þ

,2�2

�XY ð1 �Þ2

!and

��n þ n

¼ ng�nð �UUnÞ þð2�2 1Þ

Pni¼1 ðXi �X Þ

2

�2X ð1 �2Þ

þ

Pni¼1 ðYi �Y Þ

2

�2Y ð1 �2Þ

2�2

Pni¼1 ðXi �X ÞðYi �Y Þ

�XY ð1 �2Þ:

ESTIMATION IN STOCHASTIC REGRESSION 175

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

If we now define g� : D!R as

g�ðx1, . . . ,x5Þ ¼�2Y ð1 �2Þðx3 x

21 þ �2

X Þ2

�2Xfðx3 x

21 þ �2

X Þðx4 x22 þ �2

Y Þ ðx5 x1x2 þ �XY Þ2g

ð3:4Þ

and let g�nðx1, . . . , x5Þ ¼ g�ðx1, . . . , x5Þ for n < �2

X�2Y ð1 �2Þ1, it is obvious

that g*(O)¼ 1, g* is twice continuously differentiable on some neighbor-hood of O and g�n ¼ g

� for all n� 1 on a suitable neighborhood of O. So,following the same line of arguments as in the proof of Theorem 2.1, it canbe verified that the conditions (C4)–(C6) of Aras and Woodroofe (1993)hold here for any q>0, ¼ q/2 (i.e., for any >0) and �� ¼12 hW,D2

ðg�ÞjOWi where W is the same as in Sec. 2 and

D2ðg�ÞjO ¼

A�1 O

O A�2

with

1

2A�1 ¼

4

�2X

þ2�2

Y

�2X�

2Y �2

XY

2�XY�2X�

2Y �2

XY

2�XY�2X�

2Y �2

XY

2�2X

�2X�

2Y �2

XY

2666437775 ð3:5Þ

and

�2X�

2Y �2

XY

� 2A�2

¼

2�4XY�

4X 3�2

XY �2X�

2Y 4�3

XY�2X

3�2XY �2

X�2Y 2�4

X 4�2X�XY

4�3XY�

2X 4�2

X�XY 2�2X�

2Y þ 6�2

XY

264375 ð3:6Þ

Also, appropriately modified versions of Lemmas A1 and A2 in theAppendix establish the validity of their conditions (C2)–(C3) in this case.Let us now define �5 ¼ Eð�

�Þ and �4 ¼ EðR

�Þ, where the distribution of R* is

as in Eq. (2.10) except that the coefficients ofP

iðXi �X Þ2 andP

iðXi �X ÞðYi �Y Þ are changed respectively to �2X ð2�2 1Þð1 �2Þ1

and 2�2�1XY ð1 �2Þ1 on the rhs of Eq. (2.10) and in the expression for� as well. Since �5 reduces to 3 after simplification, the first part of

176 DATTA

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

Theorem 3.1 is obtained as a direct consequence of Theorem 1 of Aras andWoodroofe (1993). As for the second part, defining h� : D�

! R as

h�ðx1, . . . , x5Þ ¼f�2X ðx5 x1x2Þ �XY ðx3 x

21Þg

2

�4X ðx3 x

21 þ �2

X Þ2

, ð3:7Þ

we essentially repeat the same steps as in the proof of part (ii) of Theorem2.1. Details are omitted.

Remark 3.1. As we remarked at the end of Sec. 2, here again the questionnaturally arises as to whether one might do away with the normalityassumption (on the regressor and errors) and still retain the nice asymptoticproperties of the fixed-width interval estimation procedure describedabove. We provide a more general answer to this question in the followingsection.

4. STOCHASTIC MULTIPLE LINEAR REGRESSION:

CONFIDENCE SETS FOR THE

DISTRIBUTION-FREE CASE

So far we have confined ourselves to a simple linear regressionmodel, and the assumption of normality for the errors as well as for theregressor has been a key one in the earlier sections. For example, we haveexploited the facts that S

ðnÞXX=�

2X � 2

n1 and joint moments for any positiveorder of a bivariate normal distribution exist. But now we wish to considera model with several regressors and no specific distributional assumptionon any of the variables involved, for such assumptions may often turn outto be unrealistic or hard to verify. Assuming the model (1.2) and theassociated setup, as described in Sec. 1, we now address the problem ofconstructing fixed-size confidence-ellipsoids, similar in form to those con-sidered by Mukhopadhyay and Abid (1986), for the vector of regression-coefficients �. We present our detailed analysis for the case where the firstcolumn of the design-matrix is not deterministic, but as indicated later inRemark 4.1, a very similar analysis with only minor modifications isapplicable to the case with deterministic first column (e.g., consistingof all 1’s). Having preassigned d(>0) and having observedfðX

ðiÞ,YiÞ : i ¼ 1, . . . , ng, we propose

Rn ¼ ! 2 Rp : �̂�ðnÞOLS !� �0

n1X 0nXn

� �̂�ðnÞOLS !� �

d2n o

ð4:1Þ

ESTIMATION IN STOCHASTIC REGRESSION 177

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

as a fixed-size confidence ellipsoid for �. Given 2 ð0, 1Þ,

Pð� 2 RnÞ ¼ P �̂�ðnÞOLS �� �0

n1X 0nXn

� �̂�ðnÞOLS �� �

d2n o

� 1 d2E �̂�ðnÞOLS �� �0

n1X 0nXn

� �̂�ðnÞOLS �� �n o

and hence Pð� 2 RnÞ � ð1 Þ if Efð�̂�ðnÞOLS �Þ0ðn1X 0nXnÞ � ð�̂�ðnÞOLS �Þg=

d2 . Now,

E �̂�ðnÞOLS �� �0

n1X 0nXn

� �̂�ðnÞOLS �� �n o

¼ EXnE �̂�ðnÞOLS �� �0

n1X 0nXn

� �̂�ðnÞOLS �� �

jXn

n o¼ EXnE tr �̂�ðnÞOLS �

� �0n1X 0

nXn�

�̂�ðnÞOLS �� �n o���Xnh i

¼ EXnE tr n1X 0nXn

� �̂�ðnÞOLS �� �

�̂�ðnÞOLS �� �0n o���Xnh i

¼ EXn tr n1X 0nXn

� E �̂�ðnÞOLS �� �

�̂�ðnÞOLS �� �0���Xnn oh i

¼ EXn tr n1X 0nXn

� �2 X 0

nXn� 1n oh i

¼ EXn tr n1�2Ip� � �

¼ pn1�2, ð4:2Þ

the last few steps in the above string of equations following from the stan-dard theory of multiple linear regression with deterministic regressors due tothe independence between the X(i)’s and the �i’s. So a sufficient condition forPð� 2 RnÞ to be at least (1 ) is that

pn1d2�2 () n � p1d2�2

¼ n2, say: ð4:3Þ

But this n2, which can be regarded in some sense as an optimal fixedsample-size for the problem at hand, involves the unknown parameter �2

and we therefore propose the following sequential sampling scheme instead:fix �0>0 and define m*, the initial sample-size, as

m�¼ m�

ðdÞ ¼ max 2, ð p1d2Þ1=ð1þ�0Þ

h i�þ 1

n o: ð4:4Þ

Starting with fðXðiÞ,YiÞ : i ¼ 1, . . . ,m�

g, we continue to augment ourdata-set by one new observation ðXð jÞ,YjÞ at a time until there are N2 suchvectors where

N2 ¼ N2ðdÞ ¼ inf n � m� : n � p1d2�̂�2n

�, ð4:5Þ

178 DATTA

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

with �̂�2n ¼ n

1ðYn Xn�̂�

ðnÞOLSÞ

0ðYn Xn�̂�

ðnÞOLSÞ. Finally, having obtained

fðXðiÞ,YiÞ : i ¼ 1, . . . ,m�, . . . ,N2g, we construct the fixed-size confidence

region RN2for �.

It is obvious that

�̂�2n ¼ n

1�0n�n n1 �̂�ðnÞOLS �� �0

X 0nXn

� �̂�ðnÞOLS �� �

, ð4:6Þ

and that n1�0n�n ! �2 a.s. as n!1 whereas the second term on the rhs ofEq. (4.6) is �Pð1Þ as n!1. These imply that �̂�2

n!P�2, and hence one

concludes that PðN2ðd Þ <1Þ ¼ 1 for each fixed value of , d, � and �2.In order to investigate the almost sure asymptotic behavior of N2 in com-parison with n2, we prove the following lemma.

Lemma 4.1. Suppose that �1 has finite moment of order 2þ � for some �>0,and log �ðnÞmax ¼ �ðnÞ a.s. as n!1 where �ðnÞmax is the maximum eigenvalue ofX 0nXn. Then N2=n2 ! 1 a.s. as d ! 0 for every fixed value of , � and �2.

Proof. In view of Eq. (4.6) and the fact that n1�0n�n ! �2 a.s. as n!1, weonly need to show that the above condition on �ðnÞmax forces the second termon the rhs of Eq. (4.6) to converge to 0 a.s. an n!1. For, this will implythe strong consistency of �̂�2

n for �2 which, along with the fact thatN2ðd Þ ! 1 a.s. as d ! 0, will then prove the lemma as can be seen fromthe following string of inequalities:

p�̂�2N2

d2 N2

p�̂�2N21

d2þm�: ð4:7Þ

Now, the fact that the condition log �ðnÞmax ¼ �ðnÞ a.s. as n!1 forcesn1ð�̂�ðnÞOLS �Þ0ðX 0

nXnÞð�̂�ðnÞOLS �Þ to converge to 0 a.s. follows directly from

Lemma 1 and Lemma 3 of Lai and Wei (1982), since Eqs. (1.6) and (4.1) oftheir paper hold here with the increasing sequence of sigma-fieldsF n ¼ �fX ð1Þ, . . . ,X ðnþ1Þ, �1, . . . , �ng, n ¼ 1, 2, . . . . This completes the proofof our lemma.

Now we examine the asymptotic behavior of E(N2), the ASN of thesequential procedure (4.4)–(4.5), w.r.t. n2 as d ! 0 and also derive anasymptotic second-order expansion for a lower bound of Pð� 2 RN2

Þ. Bydoing so, we establish the asymptotic second-order efficiency of this proce-dure in the sense of Ghosh and Mukhopadhyay (1981). For clarity of pre-sentation, we confine ourselves to the case with three regressors in thetheorem below. The proof for the general case can be obtained by followingexactly the same line of arguments as in the proof of this special case, withappropriate modifications of the algebraic details.

ESTIMATION IN STOCHASTIC REGRESSION 179

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

Theorem 4.1. Under the present setup with three regressor variables, supposingthat the conditions of Lemma 4.1 are satisfied and assuming also thatRRj�j12þ�

F2ðd�Þ <1 for some �� > 12�0 (which ) Efsupn�1ðn1�Pn

i¼1 �2i Þ

5=3g <1Þ and

RR3 kxk

12F1ðdxÞ <1, we have the following asd ! 0, provided that hc��,U�

1i has a non-arithmetic distribution:

(i ) EðN2 n2Þ ¼ �6 �7 þ �ð1Þ;(ii ) Pð� 2 RN2

Þ � ð1 Þ n12 K2 þ �ðd2Þ;

with �6, �7 and K2 being obtainable respectively from Eqs. (4.11), (4.12) and(4.14).

Proof. We start by defining i.i.d. zero-mean random vectors U�1,U

�2, . . . as

U�i ¼ ð�2i �2,Xi1�i,Xi2�i,Xi3�i,X

2i1 �11 �2

1,Xi1Xi2 �12 �1�2,

Xi1Xi3 �13 �1�3,X2i2 �22 �2

2,Xi2Xi3 �23 �2�3,

X2i3 �33 �2

3Þ: ð4:8Þ

Denoting the common distribution of the U�i ’s by G* and the dispersion

matrix of G* by VG� ¼ ðð�ijÞÞ, we notice immediately that �ij¼ 0 unlessboth the indices i and j are 4 or both of them are � 5. In other words,VG� is of the form

Vð1ÞG� O

O Vð2ÞG�

" #because of the independence between X(i) and �i, ði � 1Þ, where

Vð1ÞG� ¼

mð4Þ2 �4 �1m

ð3Þ2 �2m

ð3Þ2 �3m

ð3Þ2

�1mð3Þ2 ð�11 þ�2

1Þ�2

ð�12þ�1�2Þ�2ð�13þ�1�3Þ�

2

�2mð3Þ2 ð�12þ�1�2Þ�

2ð�22þ�2

2Þ�2

ð�23þ�2�3Þ�2

�3mð3Þ2 ð�13þ�1�3Þ�

2ð�23þ�2�3Þ�

2ð�33þ�2

3Þ�2

2666437775

with mð jÞ2 ð j ¼ 3, 4Þ being the j-th order moment of F2, and the explicit form

of Vð2ÞG� is omitted due to its irrelevance to the subsequent analysis. Next we

let g�� : D��! R be a function defined as

g��ðu1, . . . , u10Þ ¼ �2 u1 þ �2 ðu2 u3 u4ÞfB

�ðuÞg

1ðu2 u3 u4Þ

0� �1

ð4:9Þ

where

B�ðuÞ ¼u5 þ �11 þ �2

1 u6 þ �12 þ �1�2 u7 þ �13 þ �1�3

u6 þ �12 þ �1�2 u8 þ �22 þ �22 u9 þ �23 þ �2�3

�7 þ �13 þ �1�3 u9 þ �23 þ �2�3 u10 þ �33 þ �23

24 35ð4:10Þ

180 DATTA

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

and D��¼ fu ¼ ðu1, . . . , u10Þ 2 R

10 : B�ðuÞ is nonsingular and the denomina-tor on the rhs of Eq. (4.9) is nonzero}. Evidently, this g** equals unity at theorigin, is twice continuously differentiable on a suitable neighborhood of theorigin and Dðg��ÞjO ¼ ð�2, 0, . . . , 0Þ. If fg��n g, n¼ 1,2, . . . are real-valuedfunctions on D** such that g��n � g�� for n �2 and g��n ðu1, . . . , u10Þ ¼�2½maxfn�

0

, u1 þ �2 ðu2 u3 u4ÞfB

�ðuÞg

1ðu2 u3 u4Þ

0g�1 for n> �2, one

observes at once that g��n ðuÞ ¼ g��ðuÞ8n � 1 on a sufficiently small neigh-

borhood of O. Now the similarity between ta in Eq. (2) of Aras andWoodroofe (1993) and N2 in Eq. (4.5) above becomes apparent witha¼ n2 and Zn ¼ nþ hc

��,D�ni þ ���n for n� 1, where c��¼ Dðg��ÞjO,

D�n¼

Pni¼1U

�i and ���n ¼ ng��n ðUU

�nÞ þ

Pni¼1 �

2�2i 2n. Therefore, due to themoment-conditions we have imposed on F1 and F2, Eqs. (17) and (12)of Aras and Woodroofe (1993) hold here for q¼ 6 as do theirconditions (C4)–(C6) with ¼ q/2 and ��� ¼ 1

2hW�,D2

ðg��ÞjOW�i where

W�¼ ðW�

1 , . . . ,W�10Þ � N10ðO,VG� Þ. The other three conditions of their

paper, namely (C1)–(C3), are also valid under the present setup in muchthe same way as they were in the proof of Theorem 2.1, as can be seen fromLemma A3 and an appropriate modification of Lemma A2 in the Appendix.If R** is a random variable such that

Pðr R�� rþ drÞ ¼ E1ð���ÞP 2��� �2X���i¼1

�2i > r

( )dr ð4:11Þ

with ��� ¼ inffn � 1 : 2n �2Pn

i¼1 �2i > 0g, we get the weak convergence

of ðN1=22 D

�N2, ���N2

,ZN2 n2Þ to ðW

�, ���,R��Þ ad d ! 0, as well as the asymp-totic independence between ðW�, ���Þ and R**, directly from Proposition 3 ofAras and Woodroofe (1993). Defining �6¼E(R

**) and

�7 ¼ Eð���Þ ¼ m

ð4Þ2 �4 þ 2, ð4:12Þ

the first part of Theorem 4.1 can now be obtained from Theorem 1 of Arasand Woodroofe (1993). If we also define a real-valued function h** onR10 as

h��ðu1, . . . , u10Þ ¼ ðu2 u3 u4ÞfB�ðuÞg

1ðu2 u3 u4Þ

0ð4:13Þ

for all u¼ (u1, . . . , u10) such that B*(u) is invertible, and observe thath**(O)¼ 0, D(h**) jO¼O and h** is four times continuously differentiableon a neighborhood of O, we will immediately be in a position to applyTheorem 4 of Aras and Woodroofe (1993) with hk � h 8 k � 1, sinceEq. (13) of their paper is satisfied by this choice of the hk’s due to the factthat hð �UU�

kÞ k1Pk

i¼1 �2i for each k� 1. Assuming that hc��,U�

1i has a

ESTIMATION IN STOCHASTIC REGRESSION 181

Dow

nloa

ded

by [

83.6

3.16

4.4]

at 1

2:02

22

Oct

ober

201

4

©2002 Marcel Dekker, Inc. All rights reserved. This material may not be used or reproduced in any form without the express written permission of Marcel Dekker, Inc.

MARCEL DEKKER, INC. • 270 MADISON AVENUE • NEW YORK, NY 10016

non-arithmetic distribution, we can conclude from their Eq. (16) that

ð31d1�2Þ2E ð�̂�ðN2Þ

OLS�Þ0ðN12 X 0

N2XN2

Þð�̂�ðN2Þ

OLS�Þn o

1

2hD2

ðh��ÞjO,VG� i2ð31d2�2

Þ

¼1

2hD2

ðh��ÞjO,���2 i2þ

1

6hD3h��ÞjO,�

��3 i3

þ1

24hD4

ðh��ÞjO,���4 i4þ�ð1Þ ð4:14Þ

as d ! 0, where the symbols bear the same connotations as in Eq. (2.13),and ���

k ðk ¼ 2, 3, 4Þ have expressions analogous to those of �k(k¼ 2, 3, 4) inEq. (2.14) with �**,W*, c**, VG� , �6, �7 and U

�1 replacing �,W, c,M, �1, �2

and U1 respectively. ���k ðk ¼ 2, 3Þ clearly involve joint moments of F1 upto

the sixth order and various moment of F2 upto order six as well, and theseexist in view of the moment-assumptions we are working under. Thiscompletes the proof of part (ii), noticing that Pð� 2 RN2

Þ � 1

Efð�̂�ðN2Þ

OLS �Þ0 ðN12 X 0

N2XN2

Þð�̂�ðN2Þ

OLS �Þg=d2 by Chebyshev’s inequality, andthat D2

ðh��ÞjO ¼ ððsijÞÞ10i, j¼1 with all sij’s being zeros except s22, s23, s24, s33,

s34 and s44 (explicit forms of these nonzero entries are being omittedfor brevity).

Remark 4.1. In the stochastic multiple linear regression model (1.2), we didnot consider design-matrices with the first column consisting of all 1’s. Butsince having a completely random design-matrix is not a prerequisite for theabove analysis, it should be clear that everything we did here would carryover to the case where the first column of the regressor-matrix is non-stochastic, except that U�

1, G*, VG� , g** and h** would than be differentfrom what they are now and consequently the second-order expansionsprovided by Theorem 4.1 would also be different.

Remark 4.2. It should be noted that in order to prove only the first part ofTheorem 4.1, it is sufficient to have weaker moment-conditions on F1 and F2than we actually assumed in the statement of the theorem. This is because inorder to apply Theorem 1 of Aras and Woodroofe (1993), we only needfinite absolute moments of orders upto 6þ �* for F2 and the condition thatRR3 jjxjj

6F1ðdxÞ<1.

Remark 4.3. If we dropped the assumption of hc��,U�1i having a non-

arithmetic distribution, we would still be able to show under the setup of


Theorem 4.1 that $P(\beta \in R_{N_{2}}) \ge (1 - \alpha) + O(d^{2})$ as $d \to 0$, using Eq. (15) from Theorem 4 of Aras and Woodroofe (1993).

Remark 4.4. Assume that $\int_{\mathbb{R}^{3}} \|x\|^{8}\, F_{1}(dx) < \infty$ and $F_{2}$ has a finite absolute moment of order $8 + \delta^{*}$, instead of the stronger moment-restrictions we imposed on these two distributions in Theorem 4.1. Then we can only claim that $P(\beta \in R_{N_{2}}) \ge (1 - \alpha) + o(1)$ as $d \to 0$, which would follow from Eq. (14) of Aras and Woodroofe (1993), the conditions of their Theorem 4 being satisfied with $q = 4$, $\alpha = 2$ and $r = 5/3 > (q-1)/(q-2) = 3/2$.

Remark 4.5. Under the model (1.2) and the associated setup, suppose we are interested in bounded-risk point-estimation of $\beta$ and the loss in estimating $\beta$ by $\hat\beta^{(n)}_{OLS}$ is

\[
L_{n} = C\Bigl\{\bigl(\hat\beta^{(n)}_{OLS}-\beta\bigr)'\bigl(n^{-1}X'_{n}X_{n}\bigr)\bigl(\hat\beta^{(n)}_{OLS}-\beta\bigr)\Bigr\}^{t}, \qquad (4.15)
\]

where $C\,(>0)$ and $t \in (0, 1]$ are known constants. Then the risk $E(L_{n})$ associated with Eq. (4.15) is bounded above by $C(pn^{-1}\sigma^{2})^{t}$. If our goal is to have $E(L_{n}) \le b$ for some preassigned positive number $b$, it is sufficient to ensure that $C(pn^{-1}\sigma^{2})^{t} \le b$ or, equivalently, that $n \ge (C/b)^{1/t}p\sigma^{2}$. So $[(C/b)^{1/t}p\sigma^{2}]^{*} + 1$ is, in some sense, an optimal fixed sample-size for this problem. But since it is not usable in practice due to the presence of the unknown parameter $\sigma^{2}$, we propose the following sequential procedure instead: start with $\{(X^{(i)}, Y_{i}): i = 1, \ldots, m^{*}\}$, where $m^{*}$ is the same as in Eq. (4.4) except that $\alpha$ and $d^{2}$ are now replaced by $b^{1/t}$ and $C^{-1/t}$ respectively. Continue augmenting this initial data-set by one vector $(X^{(i)}, Y_{i})$ at a time until there are $N'_{2}$ of them, where

\[
N'_{2} = N'_{2}(C) = \inf\bigl\{n \ge m^{*} : n \ge (C/b)^{1/t}\,p\,\hat\sigma^{2}_{n}\bigr\}, \qquad (4.16)
\]

with $\hat\sigma^{2}_{n}$ still given by Eq. (4.6). At the stopped stage, estimate $\beta$ by $\hat\beta^{(N'_{2})}_{OLS}$. Exactly the same analysis as in the proof of Theorem 4.1 carries over to this situation, with only $C^{-1/(2t)}$ and $b^{1/t}$ replacing $d$ and $\alpha$ respectively. This shows that the sequential sampling scheme (4.16) is asymptotically second-order efficient in the sense of Ghosh and Mukhopadhyay (1981) as $C \to \infty$. Asymptotic second-order expansions (as $C \to \infty$) for $E(N'_{2})$ and for a lower bound of $E(L_{N'_{2}})$ can be derived in a manner similar to that in Theorem 4.1. Details are omitted.
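To make the mechanics of a rule of the form (4.16) concrete, here is a minimal simulation sketch for the $p = 3$ case. It is an illustration only, under assumptions that are not part of the paper: the regressor rows and the errors are drawn from hypothetical stand-in distributions (standard normal below), the pilot size m_star is supplied by the user rather than computed from Eq. (4.4), and sigma2_hat is taken to be the usual residual mean square, which may differ in detail from Eq. (4.6).

```python
import numpy as np

def bounded_risk_stop(C, b, t, m_star, beta, sigma, p=3, rng=None):
    """One run of a stopping rule of the form (4.16):
    N'_2 = inf{n >= m_star : n >= (C/b)**(1/t) * p * sigma2_hat_n}.
    Regressors and errors come from assumed, illustrative distributions."""
    rng = np.random.default_rng() if rng is None else rng
    X, y = [], []
    n = 0
    while True:
        n += 1
        x = rng.standard_normal(p)              # hypothetical stand-in for F1
        eps = sigma * rng.standard_normal()     # hypothetical stand-in for F2
        X.append(x)
        y.append(x @ beta + eps)
        if n < max(m_star, p + 1):
            continue
        Xn, yn = np.asarray(X), np.asarray(y)
        beta_hat, *_ = np.linalg.lstsq(Xn, yn, rcond=None)
        resid = yn - Xn @ beta_hat
        sigma2_hat = resid @ resid / (n - p)    # illustrative variance estimate
        if n >= (C / b) ** (1.0 / t) * p * sigma2_hat:
            return n, beta_hat
```

For example, `bounded_risk_stop(C=100.0, b=0.5, t=1.0, m_star=10, beta=np.array([1.0, 2.0, -1.0]), sigma=2.0)` returns the stopped sample size and the terminal OLS estimate; the hypothetical names and defaults above are purely for illustration.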

Remark 4.6. The statement of Theorem 4.1 says that $\int_{\mathbb{R}} |\epsilon|^{12+\delta^{*}}\, F_{2}(d\epsilon) < \infty \Rightarrow E\{\sup_{n\ge1}(n^{-1}\sum_{i=1}^{n}\epsilon_{i}^{2})^{5/3}\} < \infty$. This follows easily from, for example, Lemma 9.2.4 (page 276) of Ghosh et al. (1997) with $m = 1$. This is because sample variances (for sample sizes $n \ge 2$) form a sequence


of U-statistics which, in turn, is a reverse submartingale w.r.t. an appropriate filtration (see 9.2.11 of Ghosh et al. (1997)), and one can apply Doob's moment inequality to a reverse submartingale.
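As a small illustration of the kind of bound such an argument produces (using the elementary fact that the sample mean of the i.i.d. variables $\epsilon_{i}^{2}$ is itself a reverse martingale, rather than the U-statistic route cited above), Doob's $L^{p}$ maximal inequality for a nonnegative reverse submartingale $\{X_{n}\}$ gives, for $p > 1$,

\[
E\Bigl\{\sup_{n \ge 1} X_{n}^{\,p}\Bigr\} \;\le\; \Bigl(\frac{p}{p-1}\Bigr)^{p} E\{X_{1}^{\,p}\},
\]

so with $X_{n} = n^{-1}\sum_{i=1}^{n}\epsilon_{i}^{2}$ and $p = 5/3$ the right-hand side reduces to $(5/2)^{5/3}\,E|\epsilon_{1}|^{10/3}$, which is finite under far weaker moment conditions than the $12 + \delta^{*}$ absolute moments assumed on $F_{2}$.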

5. SIMULATION EXERCISES

In this section, we summarize results from simulation-studies that were conducted in order to assess the moderate-sample-size performance of some of our procedures. All computations were performed on the SUN SPARCstation 20 at the Department of Statistics, University of Michigan, Ann Arbor, using FORTRAN-77 codes. NAG subroutines were used for generating Gaussian random deviates.

5.1. Stochastic Simple Linear Regression: Fixed Proportional Accuracy Estimation

We start with the sequential procedure (2.3)–(2.4), based on a Chebyshev-inequality approach, for the fixed proportional accuracy estimation problem of Sec. 2. For this, we chose μ = μ_X = 5.0; α = 0.05; σ²_ε = 4.0; σ²_X/σ²_ε = 0.25, 1.0; two (intercept, slope) configurations, (0.0, 2.0) and (5.0, 2.0); and 5 different values of n₀* (namely, 50, 100, 200, 400, 800) corresponding to 5 different values of δ. For each of these parameter-combinations, we performed 1000 independent replications of the sequential sampling scheme (2.3)–(2.4), and the results obtained thereby are summarized in Tables 6.1.1–6.1.2. In each row of these tables, N̄ represents the average of 1000 independent values of the stopping variable N, s(N̄) is the corresponding standard error, p̄ stands for the estimate of P(relative error < δ), and s(p̄) provides the associated standard error. Results are being reported for σ²_X/σ²_ε = 0.25

Table 6.1.1. Fixed Proportional Accuracy Estimation: Chebyshev's Inequality Approach (σ²_ε = 4.0, σ²_X/σ²_ε = 0.25, (intercept, slope) = (0.0, 2.0))

 n₀*     δ        m     N̄          s(N̄)     p̄       s(p̄)
  50    0.6523    7      40.4920    0.6928    0.973    0.0051
 100    0.4541   10      92.8240    0.9794    0.988    0.0034
 200    0.3186   15     195.8170    1.2940    0.999    0.0010
 400    0.2245   20     397.3770    1.8259    0.999    0.0010
 800    0.1584   29     797.9450    2.6044    1.000    0.0000


only, for the sake of brevity. We obtained fairly similar results with σ²_X/σ²_ε = 1.0 too.

As is evident from these tables, the sequential sampling scheme (2.3)–(2.4) undersamples compared to the optimal fixed sample-size n₀* on the average; in other words, N̄ < n₀* in all the rows of the tables above. Yet the estimated P(|(β̂_OLS^(N) − β)/β| < δ) is quite impressive even when N̄ is well below 100. As suggested by the asymptotic second-order expansion in part (ii) of Theorem 2.1, P(|(β̂_OLS^(N) − β)/β| < δ) always remains above the pre-specified lower bound (1 − α). Actually, it seems to grow well beyond (1 − α) as n₀* grows (that is, as δ → 0).

6. APPENDIX

Lemma A.1. In Theorem 2.1, $n^{2s} P\{(1-\hat\rho^{2}_{n})\rho^{2}/((1-\rho^{2})\hat\rho^{2}_{n}) \le 1/k\} \to 0$ as $n \to \infty$ for any $s > 0$ and some $k > 1$.

Proof. First we observe that $(1-\hat\rho^{2}_{n})\rho^{2}/\{(1-\rho^{2})\hat\rho^{2}_{n}\}$ is a continuous function of $S^{(n)}_{XX}$, $S^{(n)}_{YY}$ and $S^{(n)}_{XY}$, and it converges to 1 a.s. as $n \to \infty$. So for any particular $k > 1$,

\[
n^{2s} P\biggl\{\frac{(1-\hat\rho^{2}_{n})\rho^{2}}{(1-\rho^{2})\hat\rho^{2}_{n}} \le \frac{1}{k}\biggr\}
\le n^{2s}\Biggl[ P\biggl\{\frac{\rho^{2}}{\hat\rho^{2}_{n}} \le \frac{1}{k^{1/2}}\biggr\}
+ P\biggl\{\frac{1-\hat\rho^{2}_{n}}{1-\rho^{2}} \le \frac{1}{k^{1/2}}\biggr\}\Biggr]
= n^{2s}[I_{1} + I_{2}], \ \text{say.}
\]

Now,

\[
n^{2s} I_{1} \le n^{2s}\Biggl[ P\biggl\{\frac{S^{(n)}_{XX}}{n\sigma^{2}_{X}} \le \frac{1}{k^{1/8}}\biggr\}
+ P\biggl\{\frac{S^{(n)}_{YY}}{n\sigma^{2}_{Y}} \le \frac{1}{k^{1/8}}\biggr\}
+ P\biggl\{\biggl|\frac{n\sigma_{XY}}{S^{(n)}_{XY}}\biggr| \le \frac{1}{k^{1/8}}\biggr\}\Biggr]
= n^{2s}[I_{11} + I_{12} + I_{13}], \ \text{say.}
\]

Table 6.1.2. Fixed Proportional Accuracy Estimation: Chebyshev's Inequality Approach (σ²_ε = 4.0, σ²_X/σ²_ε = 0.25, (intercept, slope) = (5.0, 2.0))

 n₀*     δ        m     N̄          s(N̄)     p̄       s(p̄)
  50    0.6523    7      40.4680    0.7038    0.977    0.0047
 100    0.4541   10      91.7520    1.0075    0.985    0.0038
 200    0.3186   15     195.0110    1.2879    0.999    0.0010
 400    0.2245   20     394.5410    1.7831    0.999    0.0010
 800    0.1584   29     796.3650    2.4784    1.000    0.0000


Also,

\[
n^{2s} I_{13} \le n^{2s}\Biggl[ P\biggl\{\frac{S^{(n)}_{XX}}{n\sigma^{2}_{X}} \ge |\rho|\,k^{1/8}\biggr\}
+ P\biggl\{\frac{S^{(n)}_{YY}}{n\sigma^{2}_{Y}} \ge |\rho|\,k^{1/8}\biggr\}\Biggr]
= n^{2s}[I_{131} + I_{132}], \ \text{say.}
\]

Observing that $|\rho|\,k^{1/8} > 1$ for a large enough $k > 1$, we can apply results from large-deviation theory to $I_{11}$, $I_{12}$, $I_{131}$ and $I_{132}$ to conclude that each one of them is $o(n^{-2s})$ for any $s > 0$. Finally, $I_{2} = P\{(\hat\rho^{2}_{n}/\rho^{2}) \ge 1 + (1 - 1/k^{1/2})((1-\rho^{2})/\rho^{2})\}$, so that it can be handled the same way as $I_{1}$. Or we could apply techniques from Chow and Yu (1981), as cited by Aras and Woodroofe (1993) in their Example 2.

Lemma A.2. In Theorem 2.1, the condition (C3) of Aras and Woodroofe (1993) holds.

Proof. Notice that

\[
n^{-1}\xi_{n} = \frac{(1-\rho^{2})\hat\rho^{2}_{n}}{(1-\hat\rho^{2}_{n})\rho^{2}}
+ \sum_{i=1}^{n}\frac{(X_{i}-\bar X)^{2}}{n\sigma^{2}_{X}(1-\rho^{2})}
+ \sum_{i=1}^{n}\frac{(Y_{i}-\bar Y)^{2}}{n\sigma^{2}_{Y}(1-\rho^{2})}
- \sum_{i=1}^{n}\frac{2(X_{i}-\bar X)(Y_{i}-\bar Y)}{n\sigma_{XY}(1-\rho^{2})} - 1.
\]

Let us call it $M_{n}$. We have to show that $\sum_{n=1}^{\infty} n\,P\{M_{n} \le -\epsilon_{1}\} < \infty$ for $\epsilon_{1} \in (0, 1)$. Now, for $0 < \epsilon < \epsilon_{1} < 1$, we write

\[
P\{M_{n} \le -\epsilon_{1}\} \le P\Biggl[\{M_{n} \le -\epsilon_{1}\} \cap \biggl\{\frac{(1-\rho^{2})\hat\rho^{2}_{n}}{(1-\hat\rho^{2}_{n})\rho^{2}} \ge 1-\epsilon\biggr\}\Biggr]
+ P\biggl\{\frac{(1-\rho^{2})\hat\rho^{2}_{n}}{(1-\hat\rho^{2}_{n})\rho^{2}} < 1-\epsilon\biggr\}
= I_{n} + II_{n}, \ \text{say.}
\]

Then we have the following (with $\epsilon' = \epsilon_{1} - \epsilon > 0$):

\[
I_{n} \le P\Biggl\{(1-\epsilon) + \sum_{i=1}^{n}\frac{(X_{i}-\bar X)^{2}}{n\sigma^{2}_{X}(1-\rho^{2})}
+ \sum_{i=1}^{n}\frac{(Y_{i}-\bar Y)^{2}}{n\sigma^{2}_{Y}(1-\rho^{2})}
- \sum_{i=1}^{n}\frac{2(X_{i}-\bar X)(Y_{i}-\bar Y)}{n\sigma_{XY}(1-\rho^{2})} - 1 \le -\epsilon_{1}\Biggr\}
\]
\[
\le P\Biggl\{\biggl|\sum_{i=1}^{n}\frac{(X_{i}-\bar X)^{2}}{n\sigma^{2}_{X}} - 1\biggr|
+ \biggl|\sum_{i=1}^{n}\frac{(Y_{i}-\bar Y)^{2}}{n\sigma^{2}_{Y}} - 1\biggr|
+ \biggl|\sum_{i=1}^{n}\frac{2(X_{i}-\bar X)(Y_{i}-\bar Y)}{n\sigma_{XY}} - 2\biggr| \ge \epsilon'(1-\rho^{2})\Biggr\}
\]


\[
\le P\Biggl\{\biggl|\sum_{i=1}^{n}\frac{(X_{i}-\bar X)^{2}}{n\sigma^{2}_{X}} - 1\biggr| \ge \frac{\epsilon'(1-\rho^{2})}{3}\Biggr\}
+ P\Biggl\{\biggl|\sum_{i=1}^{n}\frac{(Y_{i}-\bar Y)^{2}}{n\sigma^{2}_{Y}} - 1\biggr| \ge \frac{\epsilon'(1-\rho^{2})}{3}\Biggr\}
+ P\Biggl\{\biggl|\sum_{i=1}^{n}\frac{(X_{i}-\bar X)(Y_{i}-\bar Y)}{n\sigma_{XY}} - 1\biggr| \ge \frac{\epsilon'(1-\rho^{2})}{6}\Biggr\},
\]

and hence, using Theorem 1 of Katz (1963) with some $t \ge 3$, we have $\sum_{n=1}^{\infty} n\,I_{n} < \infty$. Also, with some $\epsilon'' > 0$, we have $II_{n} = P\{(1-\hat\rho^{2}_{n})\rho^{2}/((1-\rho^{2})\hat\rho^{2}_{n}) > 1 + \epsilon''\}$, and the fact that $\sum_{n=1}^{\infty} n\,II_{n} < \infty$ now follows by applying a slight variant of Lemma A.1 with some $s > 1/2$.

Lemma A.3. Under the setup of Theorem 4.1, $P\{(\hat\sigma^{2}_{n}/\sigma^{2}) < 1/k\} = o\bigl(n^{-(1+\delta')s}\bigr)$ as $n \to \infty$, for any $s \le 6$ and any $k > 1$.

Proof. We prove the lemma for, say, $k = 3$. Recall the form of $\hat\sigma^{2}_{n}$ given by Eq. (4.6). Notice that $(\hat\beta^{(n)}_{OLS}-\beta)'X'_{n}X_{n}(\hat\beta^{(n)}_{OLS}-\beta) = \epsilon'_{n}P_{n}\epsilon_{n}$, where $P_{n} = X_{n}(X'_{n}X_{n})^{-1}X'_{n}$ is the projection matrix, and $E(\epsilon'_{n}P_{n}\epsilon_{n}) = E_{F_{1}}E(\epsilon'_{n}P_{n}\epsilon_{n}\,|\,X_{n}) = 3\sigma^{2}$ due to the independence between the $X^{(i)}$'s and the $\epsilon_{i}$'s $(i \ge 1)$ and since $\mathrm{rank}(P_{n}) = \mathrm{trace}(P_{n}) = 3$ in the special case of the model (1.2) with $p = 3$. So

\[
P\biggl(\frac{\hat\sigma^{2}_{n}}{\sigma^{2}} < \frac{1}{3}\biggr)
\le P\biggl(\frac{n^{-1}\epsilon'_{n}\epsilon_{n}}{\sigma^{2}} - \frac{3}{n} < \frac{2}{5}\biggr)
+ P\biggl(\frac{(\hat\beta^{(n)}_{OLS}-\beta)'X'_{n}X_{n}(\hat\beta^{(n)}_{OLS}-\beta)}{n\sigma^{2}} - \frac{3}{n} > \frac{1}{15}\biggr)
= I + II, \ \text{say.}
\]

For large enough $n$ ($n \ge 30$, to be precise), $\tfrac{2}{5} + \tfrac{3}{n} \le \tfrac{1}{2}$ and so

\[
I \le P\bigl(\epsilon'_{n}\epsilon_{n}/n\sigma^{2} < \tfrac{1}{2}\bigr).
\]

On the other hand,

\[
II \le P\Bigl(\bigl|\bigl\{(\hat\beta^{(n)}_{OLS}-\beta)'X'_{n}X_{n}(\hat\beta^{(n)}_{OLS}-\beta) - 3\sigma^{2}\bigr\}/n\sigma^{2}\bigr| > \tfrac{1}{15}\Bigr).
\]


Also, $n^{2s} P(n^{-1}\sigma^{-2}\epsilon'_{n}\epsilon_{n} < \tfrac{1}{2}) \to 0$ as $n \to \infty$ for any fixed $s > 0$, following the same line of arguments as in Example 2 of Aras and Woodroofe (1993), which uses Lemma 4 of Chow and Yu (1981) for this purpose. As for the other part, notice that

\[
II \le (15)^{r}\, E\Bigl\{\bigl|n^{-1}\sigma^{-2}\bigl\{(\hat\beta^{(n)}_{OLS}-\beta)'X'_{n}X_{n}(\hat\beta^{(n)}_{OLS}-\beta) - 3\sigma^{2}\bigr\}\bigr|^{r}\Bigr\}
\]

by Markov's inequality, for any $r > 0$ such that the expectation on the right-hand side exists. Now, fix $s \in (0, 6]$. Then, using Theorem 2 of Whittle (1960) along with the independence between the regressors and the errors, we get

\[
E\bigl|\epsilon'_{n}P_{n}\epsilon_{n} - 3\sigma^{2}\bigr|^{s}
\le 2^{3s}\,C(s)\,\{C(2s)\}^{1/2}\,
E_{F_{1}}\Biggl\{\sum_{i=1}^{n}\sum_{j=1}^{n}\bigl(p^{(n)}_{ij}\bigr)^{2}\,\varphi^{2}_{i}(2s)\,\varphi^{2}_{j}(2s)\Biggr\}^{s/2},
\]

where $C(s) = (2^{s/2}/\pi^{1/2})\,\Gamma((s+1)/2)$ for $s > 0$, $p^{(n)}_{ij}$ is the $(i, j)$-th entry of $P_{n}$, and $\varphi_{i}(r) = (E|\epsilon_{i}|^{r})^{1/r}$ for $i \ge 1$ and $r > 0$. Clearly, $\varphi_{i}(2s)$ is finite for $i \ge 1$ and $s \in (0, 6 + \delta^{*}/2]$ because of the assumption that $\int_{\mathbb{R}} |\epsilon|^{12+\delta^{*}}\, F_{2}(d\epsilon) < \infty$. Moreover, since the $\epsilon_{i}$'s are i.i.d., $\varphi_{i}(2s)$ is the same for all $i \ge 1$ for each fixed $s > 0$. This, together with the fact that $\sum_{i=1}^{n}\sum_{j=1}^{n}(p^{(n)}_{ij})^{2} = 3$ for all $n \ge 3$, forces $II$ to be $o\bigl(n^{-(1+\delta')s}\bigr)$ as $n \to \infty$. The proof of the lemma is complete.

ACKNOWLEDGMENTS

This research was supported by National Science Foundation grants NSF-G-ASC-9504041 and NSF-DMS-9157715. The author is grateful to Prof. Michael Woodroofe for his useful suggestions and to an anonymous referee for a long list of helpful remarks. Thanks also to Prof. Nitis Mukhopadhyay and Dr. Janis Hardwick for their comments on the clarity of presentation and overall organization of the paper.

REFERENCES

1. Albert, A. Fixed Size Confidence Ellipsoids for Linear Regression Parameters. Ann. Math. Statist. 1966, 37, 1602–1630.

2. Aras, G.; Woodroofe, M. Asymptotic Expansions for the Moments of a Randomly Stopped Average. Ann. Statist. 1993, 21, 503–519.


3. Chow, Y.S.; Robbins, H. On the Asymptotic Theory of Fixed-Width Sequential Confidence Intervals for the Mean. Ann. Math. Statist. 1965, 36, 457–462.

4. Chow, Y.S.; Yu, K.F. On the Performance of a Sequential Procedure for the Estimation of the Mean. Ann. Statist. 1981, 9, 184–188.

5. Datta, S. On Multistage Parametric Inference-Procedures: the 'Fine Tuning' Aspect and the Distribution-Free Scenario. Ph.D. Dissertation, Dept. of Statistics, Univ. of Connecticut, Storrs, 1995.

6. Finster, M. A Frequentistic Approach to Sequential Estimation in the General Linear Model. J. Amer. Statist. Assoc. 1983, 78, 403–407.

7. Ghosh, M.; Mukhopadhyay, N. Consistency and Asymptotic Efficiency of Two-Stage and Sequential Estimation Procedures. Sankhya, Ser. A 1981, 43, 220–227.

8. Ghosh, M.; Mukhopadhyay, N.; Sen, P.K. Sequential Estimation; John Wiley and Sons, Inc.: New York, 1997.

9. Gleser, L.J. On the Asymptotic Theory of Fixed-Size Sequential Confidence Bounds for Linear Regression Parameters. Ann. Math. Statist. 1965, 36, 463–467.

10. Katz, M.L. The Probability in the Tail of a Distribution. Ann. Math. Statist. 1963, 34, 312–318.

11. Lai, T.L.; Robbins, H.; Wei, C.Z. Strong Consistency of Least-Squares Estimates in Multiple Regression II. J. Multivariate Anal. 1979, 9, 343–361.

12. Lai, T.L.; Wei, C.Z. Least Squares Estimates in Stochastic Regression Models with Applications to Identification and Control of Dynamic Systems. Ann. Statist. 1982, 10, 154–166.

13. Martinsek, A.T. Sequential Estimation with Squared Relative Error Loss. Bull. Inst. Math. Acad. Sinica 1983, 11, 607–623.

14. Martinsek, A.T. Sequential Point Estimation in Regression Models with Nonnormal Errors. Sequential Anal. 1990, 9, 243–268.

15. Martinsek, A.T. Estimating a Slope Parameter in Regression with Prescribed Proportional Accuracy. Statistics and Decisions 1995, 13, 363–377.

16. Mukhopadhyay, N.; Abid, A.D. On Fixed-Size Confidence Regions for the Regression Parameters. Metron 1986, 44, 197–206.

17. Mukhopadhyay, N.; Datta, S. On Fine-Tuning a Purely Sequential Procedure and Associated Second-Order Properties. Sankhya, Ser. A 1995, 57, 100–117.

18. Mukhopadhyay, N.; Datta, S. On Sequential Fixed-Width Confidence Intervals for the Mean and Second-Order Expansions of


the Associated Coverage Probabilities. Ann. Inst. Stat. Math. 1996, 48, 497–507.

19. Sriram, T.N. Second Order Approximation to the Risk of a Sequential Procedure Measured Under Squared Relative Error Loss. Statistics and Decisions 1991, 9, 375–392.

20. Sriram, T.N. An Improved Sequential Procedure for Estimating the Regression Parameter in Regression Models with Symmetric Errors. Ann. Statist. 1992, 20, 1441–1453.

21. Sriram, T.N.; Bose, A. Sequential Shrinkage Estimation in the General Linear Model. Sequential Anal. 1988, 7, 149–163.

22. Whittle, P. Bounds for the Moments of Linear and Quadratic Forms in Independent Variables. Teoriya Veroyatnostei i ee Primeneniya 1960, 5, 331–335.

23. Woodroofe, M.B. Second-Order Approximations for Sequential Point and Interval Estimation. Ann. Statist. 1977, 5, 984–995.

Received December 2001
Recommended by T. N. Sriram
