
Chapter 7

Prediction in Finite Population under Measurement Error Models

7.1 INTRODUCTION

In practical sample survey situations the true values of the variables are rarely observed; only values mixed with measurement errors are available. Consider again a finite population $P$ of a known number $N$ of identifiable units labelled $1, \ldots, i, \ldots, N$. Associated with $i$ is a value $y_i$ of a study variable $y$. We assume that $y_i$ cannot be observed correctly but a different value $Y_i$, which is mixed with measurement errors, is observed.

We also assume that the true value $y_i$ in the finite population is actually a realisation of a random variable, the vector $y = (y_1, \ldots, y_N)'$ having a joint distribution $\xi$. However, $y_i$ is not observable and we cannot make any distinction between $y_i$ and $Y_i$. Our problem is to predict the population total $T\ (= \sum_{i=1}^N y_i)$ (population mean $\bar{y} = T/N$), the population variance $S_y^2\ (= \sum_{i=1}^N (y_i - \bar{y})^2/N)$ or the population distribution function

$$F_N(t) = \frac{1}{N}\sum_{i=1}^N \Delta(t - y_i)$$

by drawing a sample according to a sampling design $p(s)$, observing the data and using $\xi$. Note that in the previous chapters we used $Y_i$ to denote the random variable corresponding to $y_i$. In this chapter we shall use $Y_i$ to denote the observed value of $y$ on unit $i$ and


$y_i$ will denote both the true value of $y$ on unit $i$ and the random variable corresponding to it.

A general treatment of inference under additive measurement error models is given in Fuller (1987, 1989), and of multiplicative measurement error models in Hwang (1986). In Section 7.2 we review prediction problems in finite populations under additive measurement error models. The next section considers the corresponding problems under multiplicative measurement error models.

7.2 ADDITIVE MEASUREMENT ERROR MODELS

7.2.1 THE LOCATION MODEL WITH MEASUREMENT ERRORS

Consider the simple location model with measurement error:

$$y_i = \mu + e_i, \qquad e_i \sim (0, \sigma_{ee}),$$
$$Y_i = y_i + u_i, \qquad u_i \sim (0, \sigma_{uu}), \qquad (7.2.1)$$
$$e_i \perp u_j \quad (i, j = 1, 2, \ldots)$$

where $\mu$, $\sigma_{ee}(>0)$, $\sigma_{uu}(>0)$ are constants, $e_i \perp u_j$ denotes that $e_i$ and $u_j$ are independent, and $z_i \sim (0, A)$ denotes that the random variables $z_i$ are iid with mean zero and variance $A$. The model (7.2.1) was considered by Bolfarine (1991), Mukhopadhyay (1992, 1994 a), Bhattacharyya (1997), among others. Here, and subsequently, $E, V$ ($E_p, V_p$) will denote expectation and variance with respect to the superpopulation model (sampling design).

Consider the class of linear predictors

$$e(s, Y_s) = b_s + \sum_{k \in s} b_{ks} Y_k \qquad (7.2.2)$$

where $b_s, b_{ks}$ are constants not depending on the $Y$-values and $Y_s = \{Y_i, i \in s\}$ denotes the set of observations on $Y$ for the units in $s$.

As noted in Definition 2.2.2, a predictor $g$ is said to be a design-model ($pm$-) unbiased predictor of $\theta(y)$, or a $pm$-unbiased estimator of $E(\theta(y))$, where $y = (y_1, \ldots, y_N)'$, if

$$E_p E(g(s, Y_s)) = E(\theta(y)) \qquad (7.2.3)$$


for all possible values of the parameters involved in the model. Hence, $e(s, Y_s)$ will be $pm$-unbiased for $\bar{y}$ iff

$$E_p E\Big(b_s + \sum_{k \in s} b_{ks} Y_k\Big) = E(\bar{y}) = \mu,$$

i.e. iff

$$E_p(b_s) = 0 \qquad (7.2.4.1)$$
$$E_p\Big(\sum_{k \in s} b_{ks}\Big) = 1 \qquad (7.2.4.2)$$

Following the usual variance-minimisation criterion, a predictor $g^*$ will be said to be optimal in a class of predictors $G$ for predicting $\theta(y)$ for a fixed $p$, if

$$E_p E(g^* - \theta(y))^2 \le E_p E(g - \theta(y))^2 \qquad (7.2.5)$$

for all $g \in G$.

To find an optimal $pm$-unbiased predictor of $\bar{y}$, we use the following theorem on UMVU-estimation (Rao, 1973). Let $C$ denote a class of $pm$-unbiased estimators of $T$ and $C_0$ the corresponding class of $pm$-unbiased estimators of zero.

THEOREM 7.2.1 A predictor $g^*$ in $C$ is optimal for $T$ iff for every $f$ in $C_0$, $E_p E(g^* f) = 0$.

From the above theorem, Theorem 7.2.2 readily follows.

THEOREM 7.2.2 Under model (7.2.1) the optimal $pm$-unbiased predictor of $\bar{y}$ in the class of all linear $pm$-unbiased predictors, where $p \in P_n$, is given by $\bar{Y}_s\ (= \frac{1}{n}\sum_{k \in s} Y_k)$. Again, any $p \in P_n$ is optimal for using $\bar{Y}_s$.

Again, if $V$ denotes the variance operator with respect to the joint operation of model (7.2.1) and the s.d. $p$,

$$V(\bar{Y}_s - \bar{y}) = \Big(\frac{1}{n} - \frac{1}{N}\Big)\sigma_{ee} + \frac{\sigma_{uu}}{n} = \frac{\tau_0}{N^2}\ \text{(say)}. \qquad (7.2.6)$$

Theorem 7.2.2 states that any $FS(n)$ design, including a purposive sampling design, is optimal for predicting $\bar{y}$. However, for the purpose of robustness under model failures (as shown in a different context by Godambe and Thompson (1977)), one should use a probability sampling design $p \in P_n$ along with $\bar{Y}_s$.
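The variance formula (7.2.6) is easy to check numerically. Below is a minimal Monte Carlo sketch (all parameter values are illustrative choices, not from the text): it repeatedly draws a population from model (7.2.1), takes a simple random sample, and compares the empirical mean squared error of $\bar{Y}_s$ as a predictor of $\bar{y}$ with $(1/n - 1/N)\sigma_{ee} + \sigma_{uu}/n$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 1000, 50                   # population and sample sizes (illustrative)
mu, s_ee, s_uu = 10.0, 4.0, 2.0   # model parameters (illustrative)
R = 20000                         # Monte Carlo replications

se = np.empty(R)
for r in range(R):
    y = mu + rng.normal(0, np.sqrt(s_ee), N)   # true values y_i = mu + e_i
    Y = y + rng.normal(0, np.sqrt(s_uu), N)    # observed values Y_i = y_i + u_i
    s = rng.choice(N, n, replace=False)        # simple random sample s
    se[r] = (Y[s].mean() - y.mean()) ** 2      # (Ybar_s - ybar)^2

print(se.mean())                               # empirical E(Ybar_s - ybar)^2
print((1/n - 1/N) * s_ee + s_uu / n)           # formula (7.2.6)
```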


We now consider optimal prediction of $S_y^2$. For this we confine ourselves to the class of $pm$-unbiased quadratic predictors

$$e_q(s, Y_s) = b_s + \sum_{k \in s} b_{ks} Y_k^2 + \sum_{k \ne k' \in s} b_{kk's} Y_k Y_{k'}$$

where $b_s, b_{ks}, b_{kk's}$ are suitable constants that do not depend on $Y_s$.

By virtue of Theorem 7.2.1 the following result can easily be verified.

THEOREM 7.2.3 Under model (7.2.1) and the assumptions $E(y_i^4) < \infty$, $E(Y_i^4 \mid y_i) < \infty$,

$$s_Y^2 = \frac{1}{n - 1}\sum_{i \in s}(Y_i - \bar{Y}_s)^2$$

is the UMV quadratic $pm$-unbiased predictor of $S_y^2$ for any $p \in P_n$. Again, any $p \in P_n$ is optimal for using $s_Y^2$.

In the next part we consider Bayes prediction of the population total and variance under model (7.2.1).

Bayes Predictor of T

Assume that the variables $e_i, u_i$ are normally distributed with variances $(\sigma_{ee}, \sigma_{uu})$ assumed to be known. As the distribution of a large number of socio-economic variables is (at least approximately) normal in large samples, we consider a normal prior for $\mu$,

$$\mu \sim N(0, \theta^2). \qquad (7.2.7)$$

The posterior distribution of $\mu$ is, therefore,

$$\mu \mid Y_s \sim N\Big(\frac{n\theta^2 \bar{Y}_s}{n\theta^2 + \sigma^2},\ \frac{\theta^2 \sigma^2}{n\theta^2 + \sigma^2}\Big) \qquad (7.2.8)$$

where

$$\sigma^2 = \sigma_{ee} + \sigma_{uu}. \qquad (7.2.9)$$

Again, the posterior distributions of the $y_i$'s, given $(Y_s, \mu)$, are independent:

$$y_i \mid (Y_s, \mu) \sim N\Big(\frac{Y_i \sigma_{ee} + \mu\sigma_{uu}}{\sigma^2},\ \sigma_0^2\Big) \quad (i \in s) \qquad (7.2.10)$$

$$y_i \mid \mu \sim N(\mu, \sigma_{ee}) \quad (i \notin s) \qquad (7.2.11)$$

where

$$\sigma_0^2 = \frac{\sigma_{ee}\sigma_{uu}}{\sigma^2}. \qquad (7.2.12)$$

Therefore, under the squared error loss function, the Bayes predictor of $T$ is

$$T_B = E\Big(\sum_{i=1}^N y_i \mid Y_s\Big) = E\Big\{E\Big(\sum_{i=1}^N y_i \mid Y_s, \mu\Big) \mid Y_s\Big\} = \frac{n\sigma_{ee}}{\sigma^2}\,\bar{Y}_s + \Big(N - \frac{n\sigma_{ee}}{\sigma^2}\Big)E(\mu \mid Y_s). \qquad (7.2.13)$$

Also,

$$V(T \mid Y_s) = E_\mu\{V(T \mid \mu, Y_s)\} + V_\mu\{E(T \mid \mu, Y_s)\} = \tau_\theta(T_B)\ \text{(say)},$$

which, being independent of $s$, is also the Bayes risk of $T_B$. Again,

$$\lim_{\theta \to \infty} \tau_\theta(T_B) = N^2\Big[\Big(\frac{1}{n} - \frac{1}{N}\Big)\sigma_{ee} + \frac{\sigma_{uu}}{n}\Big] = \tau_0\ \text{(say)}. \qquad (7.2.14)$$

It is seen from (7.2.6) that the risk of the predictor $N\bar{Y}_s$ is given by $\tau_0$. Hence, by Theorem 3.2.2, $N\bar{Y}_s$ is a minimax predictor of $T$ under the assumption of normality of the distributions of the $e_i$'s and $u_i$'s as considered above. Again, since expression (7.2.6) was obtained without any assumption about the form of the distributions, it follows by Theorem 3.2.3 that the predictor $N\bar{Y}_s$ is minimax in the general class of distributions (not necessarily normal).

THEOREM 7.2.4 The predictor $N\bar{Y}_s$ is a minimax predictor of $T$ under the general class of distributions of the errors ($e_i$'s and $u_i$'s) which satisfy (7.2.1).

We now assume that $k = \sigma_{ee}/\sigma_{uu}$ is a known positive constant but $\tau = 1/\sigma_{ee}$ is unknown. Assuming a normal-gamma prior for $(\mu, \tau)$ (e.g. Broemeling, 1985) with parameters $(\nu, \alpha, \beta)$, the joint prior distribution of $(\mu, \tau)$ is

$$p(\mu, \tau) \propto \tau^{\alpha - 1/2} \exp\Big\{-\frac{\tau}{2}\big[(\mu - \nu)^2 + 2\beta\big]\Big\},$$
$$\mu, \nu \in R^1,\ \tau > 0,\ \alpha > 0,\ \beta > 0.$$


The marginal posterior distribution of $\mu$ is a Student's $t$-distribution with $(n + 2\alpha)$ d.f. and posterior mean and variance given, respectively, by

$$E(\mu \mid Y_s) = \frac{\nu + nq\bar{Y}_s}{1 + nq},$$

$$V(\mu \mid Y_s) = \frac{2\beta + q\sum_{i \in s} Y_i^2 + \nu^2 - \dfrac{(\nu + nq\bar{Y}_s)^2}{1 + nq}}{(1 + nq)(n + 2\alpha)},$$

where $q = k/(k + 1)$. It is assumed that $n > 1$.

The marginal posterior distribution of $\tau$ is a gamma with parameters

$$\alpha^* = \frac{n + 2\alpha}{2}, \qquad \beta^* = \beta + \frac{\nu^2}{2} + \frac{q}{2}\sum_{i \in s} Y_i^2 - \frac{(\nu + nq\bar{Y}_s)^2}{2(1 + nq)}.$$

Hence, the Bayes predictor of $T$ is

$$T_B^{(1)} = E\Big\{E\Big(\sum_{i \in s} y_i + \sum_{i \notin s} y_i \mid \mu, \tau, Y_s\Big) \mid Y_s\Big\}$$
$$= \frac{kn(N + 1)}{k(n + 1) + 1}\,\bar{Y}_s + \frac{N + k(N - n)}{k(n + 1) + 1}\,\nu. \qquad (7.2.15)$$

To use $T_B^{(1)}$ one needs to know only the value of $k = \sigma_{ee}/\sigma_{uu}$. Note that when $n = N$, $T_B^{(1)} \ne T$ and hence $T_B^{(1)}$ is not a consistent estimator in Cochran's (1977) sense. Similarly, one can calculate the Bayes predictors of $T$ and $S_y^2$ under Jeffreys' non-informative prior (Exercise 1).
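As a quick illustration, the sketch below evaluates $T_B^{(1)}$ of (7.2.15) on simulated data; the values of $k$, $\nu$ and the model parameters are illustrative assumptions, not taken from the text.

```python
import numpy as np

def t_bayes(Y_s, N, k, nu):
    """Bayes predictor (7.2.15) of the total T under the normal-gamma prior.

    Y_s : observed sample values; k = sigma_ee / sigma_uu (known);
    nu  : prior mean of mu.
    """
    n = len(Y_s)
    d = k * (n + 1) + 1                          # common denominator k(n+1)+1
    return k * n * (N + 1) / d * Y_s.mean() + (N + k * (N - n)) / d * nu

rng = np.random.default_rng(2)
N, n, k, nu = 500, 40, 2.0, 10.0                 # illustrative values
y = nu + rng.normal(0, 1.0, N)                   # true values, sigma_ee = 1
Y = y + rng.normal(0, np.sqrt(1.0 / k), N)       # observed, sigma_uu = sigma_ee/k
print(t_bayes(Y[:n], N, k, nu), y.sum())         # predictor vs. actual total
```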

Bhattacharyya (1997) considered the Bayes prediction of $S_y^2$ under model (7.2.1). Consider the identity

$$S_y^2 = \frac{n}{N}\,s_y^2 + \frac{N - n}{N}\,s_{yr}^2 + \frac{n(N - n)}{N^2}(\bar{y}_r - \bar{y}_s)^2$$

where

$$s_y^2 = \sum_{i \in s}(y_i - \bar{y}_s)^2/n, \qquad s_{yr}^2 = \sum_{i \in r}(y_i - \bar{y}_r)^2/(N - n),$$
$$\bar{y}_s = \sum_{i \in s} y_i/n, \qquad \bar{y}_r = \sum_{i \in r} y_i/(N - n), \qquad r = \bar{s} = P - s.$$
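A three-line numerical check of this decomposition (purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
N, n = 100, 30
y = rng.normal(10, 2, N)
ys, yr = y[:n], y[n:]                          # sampled and unsampled parts

S2 = ((y - y.mean()) ** 2).mean()              # S_y^2 with divisor N
s2, s2r = ((ys - ys.mean()) ** 2).mean(), ((yr - yr.mean()) ** 2).mean()
rhs = n/N * s2 + (N - n)/N * s2r + n * (N - n)/N**2 * (yr.mean() - ys.mean())**2
print(np.isclose(S2, rhs))                     # True: the identity holds exactly
```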

Now,

$$\frac{n s_y^2}{\sigma_0^2}\,\Big|\, Y_s \sim \text{non-central } \chi^2((n - 1), \lambda) \qquad (7.2.16)$$

where $\lambda$ is the non-centrality parameter and

$$s_Y^2 = \sum_{i \in s}(Y_i - \bar{Y}_s)^2/n. \qquad (7.2.17)$$

Also,

$$\frac{(N - n)\,s_{yr}^2}{\sigma_{ee}}\,\Big|\,(\mu, Y_s) \sim \chi^2(N - n - 1). \qquad (7.2.18)$$

Again,

$$\bar{y}_r - \bar{y}_s \mid (\mu, Y_s) \sim N(\mu_1, \sigma_1^2) \qquad (7.2.19)$$

where

$$\mu_1 = \frac{\sigma_{ee}}{\sigma^2}(\mu - \bar{Y}_s),$$
$$\sigma_1^2 = \frac{\sigma_{ee}\sigma_{uu}}{n\sigma^2} + \frac{\sigma_{ee}}{N - n} = \frac{\sigma_{ee}}{n(N - n)\sigma^2}\big[N\sigma^2 - (N - n)\sigma_{ee}\big]. \qquad (7.2.20)$$

From (7.2.16)-(7.2.20), the Bayes estimate of $S_y^2$ can easily be calculated.

Again, for a squared error loss function, the Bayes prediction risk of $\hat{S}^2_{yB}$ is given by $E[V(S_y^2 \mid Y_s)]$, where the expectation is taken with respect to the predictive distribution of $Y_s$. The posterior variance $V(S_y^2 \mid Y_s)$, obtained from (7.2.16)-(7.2.20), depends on the data through $s_Y^2$ and $\bar{Y}_s^2$ (7.2.21). Again, the predictive distribution of each $Y_i$ under the above models is $N(0, \theta^2 + \sigma^2)$ (Bolfarine and Zacks, 1991). Hence, taking the expectation of (7.2.21), the Bayes risk of $\hat{S}^2_{yB}$ is

$$E[V(S_y^2 \mid Y_s)] = \frac{2\sigma_{ee}^2}{N^2\sigma^4}\Big\{(n - 1)\sigma_{uu}^2 + (N - n - 1)\sigma^4 + 2(n - 1)\sigma_{ee}\sigma_{uu}\Big\}$$
$$\qquad + \frac{2n^2(N - n)^2}{N^4}\Big[\frac{\sigma_{ee}}{\sigma^2}\Big(\frac{\sigma_{uu}}{n} + \frac{\sigma^2}{N - n} + \frac{\sigma_{ee}\theta^2}{n\theta^2 + \sigma^2}\Big)\Big]^2$$
$$\qquad + \frac{4n^2(N - n)^2}{N^4}\,\frac{\sigma_{ee}\theta^2}{n\theta^2 + \sigma^2}\cdot\frac{\sigma_{ee}}{\sigma^2}\Big(\frac{\sigma_{uu}}{n} + \frac{\sigma^2}{N - n} + \frac{\sigma_{ee}\theta^2}{n\theta^2 + \sigma^2}\Big).$$


Allowing $\theta \to \infty$, the limiting value of the risk of $\hat{S}^2_{yB}$ follows on replacing $\sigma_{ee}\theta^2/(n\theta^2 + \sigma^2)$ by $\sigma_{ee}/n$ in the above expression.

NOTE 7.2.1

Mukhopadhyay (1994 c) considered a variation of model (7.2.1). Consider the general class of superpopulation distributions $\xi$ of $y$ such that, for given $\mu$, $\xi_1$ is a distribution on hyperplanes in $R^N$ with $\bar{y}\ (= \sum_{i=1}^N y_i/N) = \mu$ and

$$E\sum_{i=1}^N (y_i - \mu)^2 \le (N - 1)\sigma_{ee}, \qquad (7.2.23)$$

$\sigma_{ee}$ a constant and $E$ denoting expectation with respect to the superpopulation model $\xi_1$ (and other models relevant to the context). The distribution $\xi_2$ of $Y_s = \{Y_i, i \in s\}$ is considered to be a member of the class with the property that the conditional distributions of the $Y_i$ given $y_i$ are independent and

$$E(Y_i \mid y_i) = y_i, \qquad V(Y_i \mid y_i) = \sigma_{uu}. \qquad (7.2.24)$$

Let $C$ denote the class of distributions $\{\xi = \xi_1 \times \xi_2\}$. Consider the subclass $C_0 = \{\xi = \xi_{10} \times \xi_{20}\}$ of $C$, where $\xi_{10}$ is such that, given $\mu$, $y$ is distributed as an $N$-variate singular normal distribution with mean vector $\mu 1$ and dispersion matrix $\Sigma$ having constant diagonal and constant off-diagonal elements. $\xi_{20}$ is a pdf on $R^n$ such that the conditional distribution of $Y_i$ given $y_i$ is independent normal with means and variances as stated in (7.2.24).

It is assumed that $\mu$ is distributed a priori normally with mean $0$ and variance $\theta^2$. The Bayes predictor of $\bar{y}$ based on a random sample of size $n$ for the class of distributions in $C_0$ is found to be

$$E(\mu \mid Y_s) = \frac{n\theta^2\bar{Y}_s}{\sigma^2 + n\theta^2} = \delta_\theta\ \text{(say)} \qquad (7.2.25)$$

where $\sigma^2 = \sigma_{ee} + \sigma_{uu}$, with posterior variance

$$V(\mu \mid Y_s) = \frac{\theta^2\sigma^2}{\sigma^2 + n\theta^2}. \qquad (7.2.26)$$


An appeal to Theorems 3.2.2 and 3.2.3 (letting $\theta \to \infty$) shows that $\bar{Y}_s$ is minimax for the class of distributions in $C$. Bayes estimation of a domain total was also considered.

The above model was extended to the case of stratified random sampling. It was found that, for the loss function

$$L(F, \delta) = (\delta - F)^2 + \sum_{h=1}^L c_h n_h$$

for estimating $F$ by $\delta$, where $c_h$ is the cost of sampling a unit and $n_h$ the sample size in the $h$th stratum, the minimax estimate of $\sum_{h=1}^L a_h \bar{y}_h$ ($a_h$ a constant) is $\sum_{h=1}^L a_h \bar{Y}_{sh}$, and a minimax choice of $n_h$ is

$$n_h^* = a_h\Big(\frac{\sigma_{eeh} + \sigma_{uuh}}{c_h}\Big)^{1/2}$$

where $\sigma_{eeh}, \sigma_{uuh}$ have the obvious interpretations. In particular, if $F = \bar{y} = \sum_h W_h \bar{y}_h$, $W_h = N_h/N$, then $n_h^* = W_h\{(\sigma_{eeh} + \sigma_{uuh})/c_h\}^{1/2}$.

The model was also extended to the case of two-stage sampling (Mukhopadhyay, 1995 a).

7.2.2 LINEAR REGRESSION MODELS

We first consider the regression model with one auxiliary variable. Assume that associated with each $i$ there is a true value $x_i$ of an auxiliary variable $x$, closely related to the main variable $y$. The values $x_i$, however, cannot be observed without error and instead some other values $X_i$ are observed. It is assumed that $x_1, \ldots, x_N$ are unknown fixed quantities. Consider the model

$$y_i = \beta_0 + \beta x_i + e_i, \qquad e_i \sim (0, \sigma_{ee}),$$
$$Y_i = y_i + u_i, \qquad u_i \sim (0, \sigma_{uu}),$$
$$X_i = x_i + v_i, \qquad v_i \sim (0, \sigma_{vv}), \qquad (7.2.27)$$

where $e_i, v_i, u_i$ are assumed to be mutually independent.

The following theorem easily follows from Theorem 7.2.1.


THEOREM 7.2.4 Under model (7.2.27) the best linear $pm$-unbiased predictor of $\bar{y}$ for any given $p \in P_n$ is

$$e_1^* = \frac{1}{N}\Big[\sum_{k \in s} Y_k + \sum_{k \notin s} Z_k\Big] \qquad (7.2.28)$$

where $Z_k = \beta_0 + \beta x_k$. Again,

$$E_p E(e_1^* - \bar{y})^2 = \delta_n\ \text{(say)},$$

a constant depending only on $n$. Hence, any $p \in P_n$ is an optimal s.d. for using $e_1^*$.

NOTE 7.2.2

In deriving $e_1^*$ it has been assumed that $\beta_0, \beta$ are known, which may not be the case. For estimating the parameters we assume

(i) $e_i, u_i, v_i$ are normally distributed with the parameters as stated in (7.2.27);

(ii) $x_k \sim N(\mu_x, \sigma_{xx})$ and is independent of $e_i, v_j, u_l$ $(i, j, k, l = 1, 2, \ldots)$.

We call these assumptions A. Under these assumptions $(X_i, Y_i)$ have a bivariate normal distribution with mean vector

$$\begin{bmatrix} \mu_Y \\ \mu_X \end{bmatrix} = \begin{bmatrix} \beta_0 + \beta\mu_x \\ \mu_x \end{bmatrix}$$

and dispersion matrix

$$\begin{bmatrix} \beta^2\sigma_{xx} + \sigma_{uu} + \sigma_{ee} & \beta\sigma_{xx} \\ \beta\sigma_{xx} & \sigma_{xx} + \sigma_{vv} \end{bmatrix}.$$

We denote by $m_{XX}, m_{YY}, m_{XY}$ the sample variances of $X$ and $Y$ and the sample covariance of $(X, Y)$ respectively, $m_{XY} = \frac{1}{n}\sum_s (X_i - \bar{X}_s)(Y_i - \bar{Y}_s)$, etc. The parameters are estimated in the following situations:

(i) The ratio $\sigma_{xx}/\sigma_{XX} = \sigma_{xx}/(\sigma_{xx} + \sigma_{vv}) = k_{xx}$, called the reliability ratio, is known. Here,

$$\hat{\beta} = \hat{\beta}_u\ \text{(say)} = \frac{\hat{\beta}_{LS}}{k_{xx}}$$

where

$$\hat{\beta}_{LS} = m_{XY}/m_{XX}$$

is the ordinary least squares estimator of $\beta$; $\hat{\beta}_u$ is unbiased for $\beta$.

(ii) The measurement error variance $\sigma_{vv}$ is known. Here

$$\hat{\beta} = \hat{\beta}_{F1} = \frac{m_{XY}}{m_{XX} - \sigma_{vv}}.$$

(iii) The ratio $(\sigma_{uu} + \sigma_{ee})/\sigma_{vv} = \delta$ is known. Here

$$\hat{\beta} = \hat{\beta}_{F2} = \frac{1}{2m_{XY}}\Big[m_{YY} - \delta m_{XX} + \sqrt{(m_{YY} - \delta m_{XX})^2 + 4\delta m_{XY}^2}\Big].$$

In all the cases $\hat{\beta}_0 = \bar{Y}_s - \hat{\beta}\bar{X}_s$. The above derivations follow from Fuller (1987).
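The three corrections are easy to state in code. Below is a minimal sketch (illustrative, with hypothetical argument names; the formulas are those of (i)-(iii) above):

```python
import numpy as np

def beta_hats(X, Y, kxx=None, s_vv=None, delta=None):
    """Measurement-error-corrected slope estimates from observed (X, Y).

    Exactly one of kxx (reliability ratio), s_vv (error variance of X),
    or delta ((sigma_uu + sigma_ee)/sigma_vv) should be supplied.
    """
    mXX = np.var(X)                      # m_XX with divisor n
    mYY = np.var(Y)
    mXY = np.cov(X, Y, bias=True)[0, 1]  # m_XY with divisor n
    b_ls = mXY / mXX                     # ordinary least squares slope
    if kxx is not None:                  # case (i): reliability ratio known
        return b_ls / kxx
    if s_vv is not None:                 # case (ii): sigma_vv known
        return mXY / (mXX - s_vv)
    d = mYY - delta * mXX                # case (iii): variance ratio known
    return (d + np.sqrt(d**2 + 4 * delta * mXY**2)) / (2 * mXY)
```

For example, `beta_hats(X, Y, kxx=0.8)` returns $\hat{\beta}_u$, while `beta_hats(X, Y, s_vv=0.3)` returns $\hat{\beta}_{F1}$.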

NOTE 7.2.3

In case the $x_k$'s are known only for $k \in s$, $e_1^*$ in (7.2.28) may be replaced by a Hájek (1959)-type predictor. The resulting predictor is $pm$-biased.

Following Royall (1970) and Rodrigues et al. (1985), Bolfarine, Zacks and Sandoval (1996) obtained the best linear $m$-unbiased predictor of $T$ under model (7.2.27) when the $y_i$'s are measured without error. Assume that the $X_k$'s are available for $k = 1, \ldots, N$ and, as in Note 7.2.2, $x_k \sim N(\mu_x, \sigma_{xx})$, $e_k \sim N(0, \sigma_{ee})$ and $v_k \sim N(0, \sigma_{vv})$, the variables being independent. A predictor of $T$ is

$$T_c = \sum_{k \in s} Y_k + \sum_{k \notin s} \hat{Y}_k = N\bar{Y}_s + (N - n)(\bar{X}_r - \bar{X}_s)\hat{\beta}_c \qquad (7.2.29)$$

where $\bar{X}_r = \sum_{k \notin s} X_k/(N - n)$ and $\hat{\beta}_c$ is a predictor of $\beta$. The predictors of $T$ using the predictors $\hat{\beta}_{LS}, \hat{\beta}_u, \hat{\beta}_{F1}, \hat{\beta}_{F2}$ of $\beta$ will be denoted as $T_{LS}, T_u, T_{F1}, T_{F2}$, respectively. We recall below some of their important results.

THEOREM 7.2.5 Under model (7.2.27) and assumptions A (with $\sigma_{uu} = 0$),

$$\sqrt{n}(\hat{\beta}_{LS} - k_{xx}\beta) \to_L N\big(0,\ \sigma_{XX}^{-2}\{\sigma_{XX}B - \beta^2\sigma_{vv}^2\}\big)$$
$$\sqrt{n}(\hat{\beta}_u - \beta) \to_L N\big(0,\ \sigma_{XX}^{-2}k_{xx}^{-2}\{\sigma_{XX}B - \beta^2\sigma_{vv}^2\}\big)$$
$$\sqrt{n}(\hat{\beta}_{F1} - \beta) \to_L N\big(0,\ \sigma_{xx}^{-2}\{\sigma_{XX}B + \beta^2\sigma_{vv}^2\}\big)$$
$$\sqrt{n}(\hat{\beta}_{F2} - \beta) \to_L N\big(0,\ \sigma_{xx}^{-2}\{\sigma_{XX}B - \beta^2\sigma_{vv}^2\}\big)$$

as $n \to \infty$, $N \to \infty$ and $N - n \to \infty$, where

$$B = \sigma_{ee} + \beta^2\sigma_{vv}. \qquad (7.2.30)$$

THEOREM 7.2.6 Under the model of Theorem 7.2.5 and under the assumption that $\sqrt{n}(\hat{\beta}_c - \beta)$ converges to some proper distribution,

$$\sqrt{n}(T_c - T)/N \to_L N(0, (1 - f)B) \qquad (7.2.31)$$

as $n/N \to f$ and $n, N \to \infty$.

Proof.

$$T_c - T = (N - n)(\bar{X}_r - \bar{X}_s)(\hat{\beta}_c - \beta) + (N - n)\big[\bar{Y}_s - \beta_0 - \bar{X}_s\beta - (\bar{Y}_r - \beta_0 - \bar{X}_r\beta)\big]$$

$$\sqrt{n}(T_c - T)/N = (1 - n/N)(\bar{X}_r - \bar{X}_s)\sqrt{n}(\hat{\beta}_c - \beta) + (1 - n/N)\,n^{-1/2}\sum_{k \in s}(Y_k - \beta_0 - \beta X_k)$$
$$\qquad - \sqrt{n/N}\,\sqrt{1 - n/N}\,(N - n)^{-1/2}\sum_{k \in r}(Y_k - \beta_0 - \beta X_k)$$

Now, $\bar{X}_r - \bar{X}_s \to_p 0$. Again,

$$n^{-1/2}\sum_{k \in s}(Y_k - \beta_0 - \beta X_k) = n^{-1/2}\sum_{k \in s}(e_k - \beta v_k) \to_L N(0, B)$$

as $n \to \infty$, by the Central Limit Theorem (CLT). Similarly,

$$(N - n)^{-1/2}\sum_{k \in r}(Y_k - \beta_0 - \beta X_k) \to_L N(0, B)$$

as $N - n \to \infty$.

Theorems 7.2.5 and 7.2.6 imply that $T_u, T_{F1}, T_{F2}$, when properly normalised, are each asymptotically distributed as in (7.2.31).

THEOREM 7.2.7 Under the model of Theorem 7.2.5,

$$\sqrt{n}(T_{LS} - T)/N \to_L N\big(0,\ (1 - f)(\sigma_{ee} + \beta^2 k_{xx}\sigma_{vv})\big). \qquad (7.2.32)$$


Defining the asymptotic relative efficiency (ARE) of $T_1$ with respect to $T_2$ as

$$e_{12} = \frac{AV(T_2)}{AV(T_1)} \qquad (7.2.33)$$

where $AV$ means asymptotic variance, the ARE of $T_{LS}$ with respect to $T_u$ (or $T_{F1}$, $T_{F2}$, since these estimators are asymptotically equivalent by Theorem 7.2.6) is

$$e_{12} = \frac{\sigma_{ee} + \beta^2\sigma_{vv}}{\sigma_{ee} + \beta^2 k_{xx}\sigma_{vv}} = \frac{\delta + \beta^2}{\delta + k_{xx}\beta^2} \qquad (7.2.34)$$

where $\delta = \sigma_{ee}/\sigma_{vv}$. Unlike $T_u, T_{F1}$ and $T_{F2}$, $T_{LS}$ does not depend on any extra knowledge about the population. Also, $e_{12}$ increases as $k_{xx}$ decreases, i.e. as the measurement error becomes more severe.
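A small sketch of (7.2.34) as reconstructed above, showing the advantage of $T_{LS}$ growing as the reliability ratio $k_{xx}$ falls (all numbers illustrative):

```python
def are_ls_vs_u(beta, s_ee, s_vv, kxx):
    """ARE e_12 = AV(T_u)/AV(T_LS) per (7.2.34); values > 1 favour T_LS."""
    delta = s_ee / s_vv
    return (delta + beta**2) / (delta + kxx * beta**2)

for kxx in (0.9, 0.7, 0.5):      # more error => smaller kxx => larger e_12
    print(kxx, are_ls_vs_u(beta=1.0, s_ee=1.0, s_vv=0.5, kxx=kxx))
```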

The authors performed simulation studies to investigate the behaviour of $T_{LS}, T_u, T_{F1}, T_{F2}$ and $T_{F1}^*$, based on

$$\hat{\beta}_{F1}^* = \begin{cases} \hat{\beta}_{F1}, & \text{if } \hat{\lambda} \ge 1 + \frac{1}{n - 1}, \\[4pt] \dfrac{m_{XY}}{m_{XX} - \big(\hat{\lambda} - \frac{1}{n - 1}\big)\sigma_{vv}}, & \text{if } \hat{\lambda} < 1 + \frac{1}{n - 1}, \end{cases} \qquad (7.2.35)$$

with $\hat{\lambda} = m_{XX}/\sigma_{vv}$, as suggested by Fuller (1987) as a small-sample improvement over $\hat{\beta}_{F1}$.
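In code, the modified estimator of (7.2.35) (as reconstructed here, with $\hat{\lambda} = m_{XX}/\sigma_{vv}$ assumed) is a one-line guard around $\hat{\beta}_{F1}$:

```python
import numpy as np

def beta_f1_star(X, Y, s_vv):
    """Fuller-type small-sample modification of beta_F1, per (7.2.35)."""
    n = len(X)
    mXX = np.var(X)
    mXY = np.cov(X, Y, bias=True)[0, 1]
    lam = mXX / s_vv                      # lambda-hat: signal-plus-noise ratio
    if lam >= 1 + 1 / (n - 1):
        return mXY / (mXX - s_vv)         # ordinary beta_F1
    # keeps the denominator positive when mXX is close to (or below) s_vv
    return mXY / (mXX - (lam - 1 / (n - 1)) * s_vv)
```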

Mukhopadhyay (1994 b) obtained the Bayes predictor of $T$ under model (7.2.27) when the $x_i$-values are measured without error. Assume further that the $e_i$'s and $u_i$'s are independently normally distributed with known values of $\sigma_{uu}, \sigma_{ee}$. Suppose also that the $x_k\ (k = 1, \ldots, N)$ are known positive quantities. Assume that the prior distribution of $\beta = (\beta_0, \beta)'$ is bivariate normal with mean $b^0 = (b_0^0, b^0)'$ and precision matrix $q\tau S^0$ ($q = \frac{k}{k+1}$, $k = \frac{\sigma_{ee}}{\sigma_{uu}}$), where $S^0$ is a $2 \times 2$ positive semidefinite matrix. The posterior distribution of $\beta$ given $(Y_s, X_s)$, where $X_s = [1, x_k;\ k \in s]_{n \times 2}$, is normal with mean

$$E(\beta \mid Y_s, X_s) = (S^0 + S)^{-1}(S^0 b^0 + S b) = b^{00}\ \text{(say)} \qquad (7.2.36)$$

and dispersion matrix

$$D(\beta \mid Y_s, X_s) = S^{00^{-1}} = \frac{1}{q\tau}(S^0 + S)^{-1}, \qquad (7.2.37)$$

where $b = (\bar{Y}_s - \hat{b}\bar{x}_s, \hat{b})'$, $\hat{b} = \sum_s (Y_k - \bar{Y}_s)(x_k - \bar{x}_s)/\sum_s (x_k - \bar{x}_s)^2 = m_{xY}/m_{xx}$,


and

$$S = \begin{bmatrix} n & \sum_s x_k \\ \sum_s x_k & \sum_s x_k^2 \end{bmatrix}. \qquad (7.2.38)$$

The Bayes predictor of $T$ is

$$T_B^{(2)} = \frac{nk\bar{Y}_s}{k + 1} + \Big(\frac{n}{k + 1} + N - n\Big)b_0^{00} + \Big(\frac{n\bar{x}_s}{k + 1} + (N - n)\bar{x}_r\Big)b^{00}, \qquad (7.2.39)$$

where $b^{00} = (b_0^{00}, b^{00})'$.

In particular, if we assume a natural conjugate prior for $\beta$ so that $b^0 = b$, $S^0 = S$, then $b^{00} = b$ and $T_B^{(2)}$ in (7.2.39) reduces to

$$T_B^{(2)} = N[\bar{Y}_s + \hat{b}(\bar{x} - \bar{x}_s)], \qquad (7.2.40)$$

the linear regression predictor of $T$.

Bolfarine et al. (1996) extended their work on the simple regression model in Theorems 7.2.5-7.2.7 to the multiple regression model. Consider the model

$$X_k = x_k + v_k, \qquad k = 1, \ldots, N, \qquad (7.2.41)$$
$$Y_k = \beta' x_k + e_k, \qquad k = 1, \ldots, N, \qquad (7.2.42)$$

where $\beta = (\beta_1, \ldots, \beta_p)'$, $x_k = (x_{k1}, \ldots, x_{kp})'$, $X_k = (X_{k1}, \ldots, X_{kp})'$, $v_k = (v_{k1}, \ldots, v_{kp})'$, and $x_{kj}\ (X_{kj})$, $v_{kj}$ are the true (observed) value of the auxiliary variable $x_j$ on $k$ and the corresponding measurement error $(j = 1, \ldots, p)$. Assume that

$$\begin{bmatrix} e_k \\ x_k \\ v_k \end{bmatrix} \sim N_{2p+1}\left(\begin{bmatrix} 0 \\ \mu_x \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_{ee} & 0 & 0 \\ 0 & \Sigma_{xx} & 0 \\ 0 & 0 & \Sigma_{vv} \end{bmatrix}\right)$$

where $\Sigma_{xx}, \Sigma_{vv}$ are non-singular with $\Sigma_{vv}$ known. Consider an unbiased predictor $\hat{\beta}_u$ of $\beta$,

$$\hat{\beta}_u = (M_{XX} - \Sigma_{vv})^{-1} M_{XY} \qquad (7.2.43)$$

where $M_{XX}$ and $M_{XY}$ denote the sample covariance matrix of $X$ and the sample covariance vector of $(X, Y)$ (the $p$-variate analogues of $m_{XX}, m_{XY}$), and

$$\bar{X}_s = (\bar{X}_{s1}, \ldots, \bar{X}_{sp})', \qquad \bar{X}_{sj} = \sum_{k \in s} X_{kj}/n.$$

THEOREM 7.2.8 Under models (7.2.41), (7.2.42),

$$\sqrt{n}(\hat{\beta}_u - \beta) \to_L N(0, G) \qquad (7.2.44)$$

where the covariance matrix $G$ is obtained from Theorem 2.2.1 of Fuller (1987), from which the proof follows. The corresponding predictor of $T$ is

$$T_u = N\bar{Y}_s + (N - n)(\bar{X}_r - \bar{X}_s)'\hat{\beta}_u. \qquad (7.2.45)$$
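A minimal sketch of $\hat{\beta}_u$ and $T_u$ (a hypothetical implementation; it assumes centered sample moments with divisor $n$, matching the simple-regression case):

```python
import numpy as np

def t_u(X_s, Y_s, Xbar_r, N, Sigma_vv):
    """Unbiased-slope predictor T_u of (7.2.45).

    X_s: (n, p) observed covariates in the sample; Y_s: (n,) responses;
    Xbar_r: (p,) mean of X over the N - n unsampled units;
    Sigma_vv: (p, p) known measurement error covariance of X.
    """
    n = len(Y_s)
    Xc = X_s - X_s.mean(axis=0)                  # centered X
    M_XX = Xc.T @ Xc / n                         # sample covariance of X
    M_XY = Xc.T @ (Y_s - Y_s.mean()) / n         # sample covariance of (X, Y)
    beta_u = np.linalg.solve(M_XX - Sigma_vv, M_XY)   # (7.2.43)
    return N * Y_s.mean() + (N - n) * (Xbar_r - X_s.mean(axis=0)) @ beta_u
```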

The following theorem is an extension of Theorem 7.2.6.

THEOREM 7.2.9 Under models (7.2.41), (7.2.42),

$$\sqrt{n}(T_u - T)/N \to_L N\big(0,\ (1 - f)(\sigma_{ee} + \beta'\Sigma_{vv}\beta)\big) \qquad (7.2.46)$$

as $N \to \infty$ (with $n/N \to f$). The least squares predictor of $\beta$ is

$$\hat{\beta}_{LS} = M_{XX}^{-1} M_{XY}. \qquad (7.2.47)$$

The corresponding predictor of $T$ is

$$T_{LS} = N\bar{Y}_s + (N - n)(\bar{X}_r - \bar{X}_s)'\hat{\beta}_{LS}.$$

THEOREM 7.2.10 Under models (7.2.41), (7.2.42),

$$\sqrt{n}(T_{LS} - T)/N \to_L N\big(0,\ (1 - f)(\sigma_{ee} + \beta'\Sigma_{xx}\Sigma_{XX}^{-1}\Sigma_{vv}\beta)\big) \qquad (7.2.48)$$

where $\Sigma_{XX} = \Sigma_{xx} + \Sigma_{vv}$. It follows that the asymptotic relative efficiency of $T_{LS}$ with respect to $T_u$ is

$$e_{12}' = \frac{\sigma_{ee} + \beta'\Sigma_{vv}\beta}{\sigma_{ee} + \beta'\Sigma_{xx}\Sigma_{XX}^{-1}\Sigma_{vv}\beta},$$

which is always greater than one, since

$$\beta'\Sigma_{vv}\beta - \beta'\Sigma_{xx}\Sigma_{XX}^{-1}\Sigma_{vv}\beta = \beta'\Sigma_{vv}\Sigma_{XX}^{-1}\Sigma_{vv}\beta \ge 0. \qquad (7.2.49)$$

Comparison between the matrix mean square errors of $\hat{\beta}_{LS}$ and $\hat{\beta}_u$ poses an interesting problem.


Mukhopadhyay (1995 b, 1998 b) considered the multiple regression model with measurement errors on all the variables. His model is

$$y_i = x_i'\beta + e_i,$$
$$X_i = x_i + v_i, \qquad (7.2.50)$$
$$Y_i = y_i + u_i.$$

Writing $\epsilon_i = (e_i, u_i, v_i')'$, $E(\epsilon_i) = 0$, and

$$V(\epsilon_i) = \begin{bmatrix} \sigma_{ee} & 0 & 0 \\ 0 & \sigma_{uu} & \Sigma_{uv} \\ 0 & \Sigma_{uv}' & \Sigma_{vv} \end{bmatrix}. \qquad (7.2.51)$$

Under the above model he considered best linear optimal $pm$-unbiased prediction of $\bar{y}$ for any given $p \in P_n$, Bayes prediction of $\bar{y}$ under a normal theory set-up, and linear Bayes prediction of $\bar{y}$. Under a slightly different model he (1999 a) derived a strongly consistent estimator of $\beta$ (Exercise 3).

7.2.3 BAYES ESTIMATION UNDER TWO-STAGE SAMPLING ERROR-IN-VARIABLES MODELS

Consider, as in Bolfarine (1991), the following two-stage sampling superpopulation model with errors in variables. Superpopulation models for multistage sampling (without measurement errors) were earlier considered by Scott and Smith (1969), Royall (1976), Malec and Sedransk (1985), among others. Mukhopadhyay (1995 a) considered a different model in two-stage sampling accommodating measurement errors.

The finite population consists of $K$ mutually exclusive and exhaustive subpopulations (clusters) of sizes $M_h$ (numbers of elementary units), $h = 1, \ldots, K$. In the first stage a sample $s$ of $n$ clusters is selected from the $K$ available clusters. In the second stage a sample $s_h$ of $m_h$ elementary units is selected from each cluster $h$ in the sample $s$. Let $y_{hi}$ denote the true value of the characteristic $y$ on the $i$th elementary unit belonging to cluster $h$. We assume that whenever $(h, i)$ is in the sample $s_0 = \cup_{h \in s}\, s_h$, $y_{hi}$ cannot be observed but a different value $Y_{hi}$, mixed with measurement error, is observed.

The following model is assumed:

$$y_{hi} = \mu_h + e_{hi}$$


$$\mu_h = \mu + \eta_h, \qquad i = 1, \ldots, M_h,$$
$$Y_{hi} = y_{hi} + u_{hi}, \qquad h = 1, \ldots, K, \qquad (7.2.52)$$

where $e_{hi}, \eta_h, u_{hi}$ are all independent,

$$e_{hi} \overset{ind}{\sim} N(0, \sigma_{eeh}), \qquad \eta_h \overset{iid}{\sim} N(0, \sigma_{\eta\eta}), \qquad u_{hi} \overset{ind}{\sim} N(0, \sigma_{uuh}).$$

The above model corresponds to exchangeability between elements in a cluster. Let $Y_s = \{Y_{hi}, i \in s_h,\ h \in s\}$ denote the observed values corresponding to the sample $s_0$. The posterior distribution of $\mu_h$ given $Y_s$ is normal,

$$\mu_h \mid Y_s \sim N(\hat{\mu}_h, \phi_h), \qquad (7.2.53)$$

where $\hat{\mu}_h$ and $\phi_h$ follow from standard normal theory. Also, the posterior distribution of $y_{hi}$ given $\mu_h$ and $Y_s$ is

$$y_{hi} \mid \mu_h \sim N(\mu_h, \sigma_{eeh}), \qquad (h, i) \notin s_0.$$

After the sample $s$ has been selected, we may write the population total $T$ as

$$T = \sum_{h \in s}\sum_{i \in s_h} y_{hi} + \sum_{h \in s}\sum_{i \in \bar{s}_h} y_{hi} + \sum_{h \in \bar{s}}\sum_{i=1}^{M_h} y_{hi} \qquad (7.2.54)$$

where $\bar{s}$ denotes the set of clusters not included in $s$ and $\bar{s}_h$ is the set of elementary units in the $h$th cluster not included in the sample $s_h$. Hence, the Bayes predictor of $T$ is

$$\hat{T}_B = E[E\{T \mid Y_s, \mu_h\} \mid Y_s]. \qquad (7.2.55)$$

Mukhopadhyay (1999 c) derived Bayes estimators of a finite population total in unistage and two-stage sampling under the simple location model with measurement error, under the Linex loss function due to Varian (1975) and Zellner (1986). A Bayes estimator of a finite population variance under the same model and the same loss function was also derived (Exercise 6).

7.2.4 PREDICTION OF A FINITE POPULATION DISTRIBUTION FUNCTION

We first consider the model (7.2.1) along with the assumption that the $e_i, u_i$ are normally distributed. Consider the class of predictors linear in $\Delta(t - Y_i)$ for $F_N(t)$ (defined in Section 7.1),

$$\hat{F}_s(t) = b_s + \sum_{k \in s} b_{ks}\,\Delta(t - Y_k) \qquad (7.2.56)$$

where $b_s, b_{ks}$ are constants not depending on the $y$-values. Clearly, $\hat{F}_s(t)$ will be $pm$-unbiased for $F_N(t)$ iff

$$E_p E(\hat{F}_s(t)) = E(F_N(t)) = \Phi\Big(\frac{t - \mu}{\sigma_e}\Big), \qquad (7.2.57)$$

where $\sigma_{ee} = \sigma_e^2$ and $\Phi(z)$ denotes the area under the standard normal curve up to the ordinate at $z$. The condition (7.2.57) implies

$$E_p(b_s) = 0, \qquad \Phi\Big(\frac{t - \mu}{\sigma}\Big)\,E_p\Big(\sum_{k \in s} b_{ks}\Big) = \Phi\Big(\frac{t - \mu}{\sigma_e}\Big), \qquad (7.2.58)$$

where $\sigma^2 = \sigma_{ee} + \sigma_{uu}$. The following theorem easily follows, as did Theorem 7.2.2.

THEOREM 7.2.12 Under model (7.2.1), the optimal $pm$-unbiased predictor of $F_N(t)$ in the class of predictors (7.2.56), where $p \in P_n$, is given by

$$F_s^*(t) = \frac{\Phi_0}{\Phi_1}\,\frac{1}{n}\sum_{k \in s}\Delta(t - Y_k) \qquad (7.2.59)$$

where

$$\Phi_0 = \Phi\Big(\frac{t - \mu}{\sigma_e}\Big), \qquad \Phi_1 = \Phi\Big(\frac{t - \mu}{\sigma}\Big). \qquad (7.2.60)$$

Again, any $p \in P_n$ is optimal for using $F_s^*$.

Assume now that $\mu$ is not known ($\sigma_{ee}, \sigma_{uu}$ are, however, assumed known) and is estimated by $m(Y_s) = m$ (say). In this case, a predictor of $F_N(t)$ is

$$F_s^*(t, m) = \frac{\hat{\Phi}_0}{\hat{\Phi}_1}\,\frac{1}{n}\sum_{k \in s}\Delta(t - Y_k) \qquad (7.2.61)$$

where

$$\hat{\Phi}_0 = \hat{\Phi}_0(t, m) = \Phi\Big(\frac{t - m}{\sigma_e}\Big), \qquad \hat{\Phi}_1 = \hat{\Phi}_1(t, m) = \Phi\Big(\frac{t - m}{\sigma}\Big).$$

Note that $F_s^*(t, m)$ is a $U$-statistic except for the estimate $m$ of $\mu$:

$$F_s^*(t, m) = \frac{1}{n}\sum_{k \in s}\hat{F}_{ks}(t, m)$$

where

$$\hat{F}_{ks}(t, m) = \frac{\hat{\Phi}_0(t, m)}{\hat{\Phi}_1(t, m)}\,\Delta(t - Y_k)$$

is a symmetric kernel of degree one. Taking $m = \bar{Y}_s = \sum_{i \in s} Y_i/n$, it follows by Randles' (1982) theorem (Lemma 6.5.1) that under some regularity conditions $F_s^*(t, m)$ is asymptotically normally distributed with mean $\Phi_0(t, \mu)$ and asymptotic prediction variance determined by

$$\Sigma = D(F_s^*(t, \mu), \bar{Y}_s),$$

$D(\cdot)$ denoting the dispersion matrix (Mukhopadhyay, 1998 d).

We now consider Bayes prediction of $F_N(t)$. Assume that $\sigma_{ee}, \sigma_{uu}$ are known. Also, consider a normal prior $N(0, \theta^2)$ for $\mu$. It then follows, as in (7.2.13), that the Bayes predictor of $F_N(t)$ under a squared error loss function is

$$\hat{F}_B(t) = E(F_N(t) \mid Y_s) = E\Big\{E\Big(\frac{1}{N}\sum_{i=1}^N \Delta(t - y_i) \,\Big|\, \mu, Y_s\Big) \,\Big|\, Y_s\Big\}$$
$$= \frac{1}{N}E\Big\{E\Big(\sum_{i \in s}\Delta(t - y_i) + \sum_{i \in r}\Delta(t - y_i) \,\Big|\, \mu, Y_s\Big) \,\Big|\, Y_s\Big\}.$$

Now, for $i \in s$,

$$E\{E(\Delta(t - y_i) \mid \mu, Y_s) \mid Y_s\} = E\Big(\Phi\Big(\frac{t - \hat{y}_i}{\sigma_0}\Big) \,\Big|\, Y_s\Big) = H_i\ \text{(say)} \qquad (7.2.62)$$

where $\hat{y}_i = (Y_i\sigma_{ee} + \mu\sigma_{uu})/\sigma^2$ is the posterior mean in (7.2.10). Similarly, for $i \in r$,

$$E\{E(\Delta(t - y_i) \mid \mu, Y_s) \mid Y_s\} = G\ \text{(say)}.$$

Therefore, from (7.2.62),

$$\hat{F}_B(t) = \frac{1}{N}\Big[\sum_{i \in s} H_i + (N - n)G\Big]. \qquad (7.2.63)$$

Consider now optimal prediction of $F_N(t)$ under model (7.2.27). Here

$$E(F_N(t)) = \frac{1}{N}\sum_{i=1}^N \Phi\Big(\frac{t - \beta_0 - \beta x_i}{\sigma_e}\Big) = \Phi_3\ \text{(say)}.$$

Hence, $\hat{F}_s(t)$ in (7.2.56) is $pm$-unbiased for $F_N(t)$ iff

$$\sum_{k=1}^N \Phi_{4k}\sum_{s \ni k} b_{ks}\,p(s) = \Phi_3$$

where

$$\Phi_{4k} = \Phi\Big(\frac{t - \beta_0 - \beta x_k}{\sigma}\Big).$$

Hence, we have the following theorem.

THEOREM 7.2.13 Under model (7.2.27) the optimal $pm$-unbiased predictor of $F_N(t)$ in the class of predictors (7.2.56) with $b_s \ge 0\ \forall\, s$, $b_{ks} \ge 0\ \forall\, (k, s)$, and for any given $p \in P_n$ with $p(s) > 0\ \forall\, s$, is

$$F_s^{**}(t) = \frac{\Phi_3}{n}\sum_{k \in s}\frac{\Delta(t - Y_k)}{\Phi_{4k}}. \qquad (7.2.64)$$

Again, any $p \in P_n$ is optimal for using $F_s^{**}(t)$.

7.3 PREDICTION UNDER MULTIPLICATIVE ERROR-IN-VARIABLES MODEL

Hwang (1986) considered the multiplicative error-in-variables model as follows. Let

$$Y = x\beta + e, \qquad X = x * v \qquad (7.3.1)$$

where $x = ((x_{kj},\ k = 1, \ldots, n;\ j = 1, \ldots, p))$ is not observed directly and the observed data are $Y = (Y_1, \ldots, Y_n)'$ and

$$X = x * v = ((x_{kj} v_{kj})), \qquad (7.3.2)$$

the Hadamard product of the matrices $x$ and $v$. We assume

$$v = \begin{bmatrix} v_1' \\ \vdots \\ v_n' \end{bmatrix}, \qquad v_i\ \text{iid with mean vector}\ 1_p, \qquad (7.3.3)$$

where the $x_i$'s are iid with mean $\mu_x$ and finite variance. Also, $e_i, v_j, x_k$ are assumed uncorrelated $(i, j, k = 1, 2, \ldots)$. We define

$$A = E(x_1 x_1'). \qquad (7.3.4)$$

The above model was used by Hwang in analysing data collected by the US Department of Energy concerning energy consumption and housing characteristics of households. To preserve confidentiality some of the predicting variables that might identify household owners were multiplied by $v_{kj}$, which were assumed to be truncated normal random variables with known variances and means all equal to one. Clearly, such a model can also be used in randomised response trials.


known variances and means all equal to one. Clearly, such a model can alsobe used in randomized response trials.

THEOREM 7.3.1 Assume that $E(x_1 x_1')$ exists and is non-singular and that $M = E(v_1 v_1')$ exists and is known. Then

$$\hat{\beta}_c = \{(X'X) \oslash M\}^{-1} X'Y,$$

$\oslash$ denoting elementwise (Hadamard) division, is strongly consistent for $\beta$.

Proof. Consider the least squares estimator

$$\hat{\beta}_{LS} = (X'X)^{-1}X'Y.$$

Now

$$X'Y/n = X'x\beta/n + X'e/n.$$

Again,

$$X'e/n = \sum_{i=1}^n e_i X_i/n \to_{as} E(e_1 X_1) = E(e_1)E(X_1) = 0,$$

where $as$ denotes 'almost surely'. Also,

$$X'x/n = \sum_{i=1}^n X_i x_i'/n \to_{as} E(X_1 x_1') = E(x_1 x_1') = A\ \text{(say)}.$$

Hence,

$$X'Y/n \to_{as} A\beta.$$

Again,

$$X'X/n = \sum_{i=1}^n X_i X_i'/n \to_{as} E(X_1 X_1') = A * M.$$

Therefore,

$$\hat{\beta}_{LS} \to_{as} (A * M)^{-1} A\beta \qquad (7.3.5)$$

provided $A * M$ is non-singular. This is so because

$$A * M - A = A * (M - 1_p 1_p'). \qquad (7.3.6)$$

Again, $A$ is positive definite, and $M - 1_p 1_p' = E(v_1 v_1') - E(v_1)E(v_1') = V(v_1)$ is positive semidefinite. Therefore, the matrix on the right hand side of (7.3.6) is non-negative definite (vide Theorem 3.1 of Styan, 1973). Therefore, $A * M$ is p.d. and hence non-singular, since $A$ is so. Hence

$$\hat{\beta}_c = \{(X'X) \oslash M\}^{-1}X'Y \to_{as} \{(A * M) \oslash M\}^{-1}A\beta = A^{-1}A\beta = \beta,$$

i.e. $\hat{\beta}_c$ is strongly consistent for $\beta$.
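A minimal simulation sketch of $\hat{\beta}_c$ (illustrative parameters; `M` is the known second-moment matrix $E(v_1 v_1')$ of the multiplicative errors):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, beta = 200000, 2, np.array([1.5, -0.5])

x = rng.normal(3.0, 1.0, (n, p))               # true covariates
v = 1.0 + rng.normal(0.0, 0.2, (n, p))         # multiplicative errors, mean 1
M = np.eye(p) * (1.0 + 0.2**2) + (1 - np.eye(p))   # E(v1 v1'): 1 + s_vv on diagonal
X = x * v                                      # observed (Hadamard-contaminated) X
Y = x @ beta + rng.normal(0.0, 1.0, n)

bls = np.linalg.solve(X.T @ X, X.T @ Y)        # lse: converges to (A*M)^{-1} A beta
bc = np.linalg.solve((X.T @ X) / M, X.T @ Y)   # corrected: elementwise division by M
print(bls, bc)                                 # bc should be close to beta
```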

COROLLARY 7.3.1 For $p = 1$,

$$\hat{\beta}_{LS} = \frac{M_{XY}'}{M_{XX}'} \to_{as} M^{-1}\beta \qquad (7.3.7)$$

where $M_{XX}' = \sum_{i=1}^n X_i^2/n$, $M_{XY}' = \sum_{i=1}^n X_i Y_i/n$ and

$$M = E(v_1^2) = 1 + \sigma_{vv}. \qquad (7.3.8)$$

Hence a consistent estimator of $\beta$ is

$$\hat{\beta}_c = (1 + \sigma_{vv})\hat{\beta}_{LS}. \qquad (7.3.9)$$

THEOREM 7.3.2 In addition to the assumptions of Theorem 7.3.1, suppose that the elements of $x$ and $v$ have finite fourth moments and that $e_i$ has a finite second moment. Then $\sqrt{n}(\hat{\beta}_c - \beta)$ is asymptotically normal with mean zero.

Proof.

$$\sqrt{n}(\hat{\beta}_c - \beta) = \sqrt{n}\big[\{(X'X) \oslash M\}^{-1}X'(x\beta + e) - \beta\big]$$
$$= \sqrt{n}\big[\{(X'X) \oslash M\}^{-1}\{(X'x - (X'X) \oslash M)\beta + X'e\}\big]$$
$$= \{n^{-1}(X'X) \oslash M\}^{-1}S/\sqrt{n} \qquad (7.3.10)$$

where

$$S = \{X'x - (X'X) \oslash M\}\beta + X'e,$$
$$S/\sqrt{n} = \frac{1}{\sqrt{n}}\sum_{i=1}^n\big[\{X_i x_i' - (X_i X_i') \oslash M\}\beta + X_i e_i\big]. \qquad (7.3.11)$$

The random vectors within the braces form a sequence of iid random vectors, $i = 1, \ldots, n$, having finite covariance matrices and zero means. Therefore, by the Central Limit Theorem, $S/\sqrt{n}$ is asymptotically normally distributed. Again, $\{\frac{1}{n}(X'X) \oslash M\}^{-1} \to_{as} A^{-1}$. Hence the result.

Hwang (1986) also obtained an asymptotic expression for $\text{Cov}\{\sqrt{n}(\hat{\beta}_c - \beta)\}$ and its estimate.


THEOREM 7.3.3 Under (7.3.1)-(7.3.3) with $p = 1$,

$$\sqrt{n}(\hat{\beta}_{LS} - M^{-1}\beta) \to_L N(0, \sigma_{LS}^2),$$
$$\sqrt{n}(\hat{\beta}_c - \beta) \to_L N(0, \sigma_c^2)$$

as $n \to \infty$, where $\sigma_{LS}^2$ and $\sigma_c^2$ are expressed in terms of the model moments, with

$$E(x_1^2) = \mu_x^2 + \sigma_{xx}, \qquad M = 1 + \sigma_{vv}. \qquad (7.3.12)$$

Gasco, Bolfarine and Sandoval (1997) considered estimation of a finite population total under multiplicative measurement error superpopulation models. Consider the following variant of the model (7.2.41):

$$Y_i = \alpha + \beta' x_i + e_i$$

with

$$X_i = x_i * v_i \qquad (7.3.13)$$

where, as before, the observed data are $(Y_i, X_i,\ i \in s)$ and $v_i$ is a measurement error corresponding to the true value $x_i$ of the auxiliary variable $x = (x_1, \ldots, x_p)'$ on unit $i$. It is assumed that $X_1, \ldots, X_N$ are all known. For $p \ge 1$, two predictors of $T$ are:

$$\hat{T}_{LS} = N\bar{Y}_s + (N - n)(\bar{X}_r - \bar{X}_s)'\hat{\beta}_{LS},$$
$$\hat{T}_c = N\bar{Y}_s + (N - n)(\bar{X}_r - \bar{X}_s)'\hat{\beta}_c.$$

THEOREM 7.3.4 Under model (7.3.13) with $p \ge 1$, $\hat{\beta}_{LS}$ is asymptotically normal about the mean $B\beta$ and $\hat{\beta}_c$ is asymptotically normal about the mean $\beta$, where $B$ is a bias matrix depending on the model moments.

THEOREM 7.3.5 Under model (7.3.13), with $p \ge 1$,

$$\sqrt{n}(\hat{T}_{LS} - T)/N \to_L N(0, (1 - f)\sigma_{LS}^2)$$

where $\sigma_{LS}^2$ denotes the corresponding asymptotic variance.

As in the case of additive measurement error models, the $pm$-unbiased optimal predictor, and the Bayes and minimax predictors of $T$ and $S_y^2$, can be found in the present case (vide Exercise 5).

7.4 EXERCISES

1. Consider model (7.2.1) with the $e_i, u_i$'s having independent normal distributions. Show that, for the Jeffreys (1959) non-informative prior on $(\mu, \sigma_{ee})$, the Bayes predictor of $T$ is given by $N\bar{Y}_s$, with posterior variance of $T$

$$V(T \mid Y_s) = \frac{N(k + 1) - nk}{(k + 1)^2}\,s_Y^2\Big[\frac{2k(n - 1)}{n - 3} + \frac{N(k + 1) - nk}{n}\Big]$$
$$\approx \frac{N^2(k + 1)^2 - n^2 k^2}{n(k + 1)^2}\,s_Y^2 \quad \Big(\text{assuming } \frac{n - 1}{n - 3} \approx 1\Big)$$

where $s_Y^2 = \frac{1}{n-1}\sum_{i \in s}(Y_i - \bar{Y}_s)^2$. Also, the Bayes predictor of $S_y^2$ is

$$\hat{S}^2_{yB} = E(S_y^2 \mid Y_s) = \frac{k^2}{(k + 1)^2}\,\frac{n(N + n - 2)}{N(n - 1)}\,\bar{Y}_s^2 + \frac{k\,s_Y^2}{2(n - 1)(k + 1)^2}\Big[k + \frac{k(N - n)}{N(N - 1)} + \frac{2(n - 1)}{N(n - 3)}\big((N - n)(k + 1) + 1\big)\Big].$$

(Mukhopadhyay, 1994 a)

2. Let a sample $s$ of $n$ clusters be selected out of the $K$ clusters constituting a finite population, and a sample $s_h$ of $m_h$ second stage units be selected from each cluster $h$ (consisting of $M_h$ ssu's) in $s$. Let $y_{hj}$ be the quantity of interest associated with unit $j$ in cluster $h$. Consider the following error-in-variables superpopulation model:

$$y_{hj} = \mu_h + e_{hj},$$
$$\mu_h = \mu + u_h,$$
$$Y_{hj} = y_{hj} + \eta_{hj},$$

where $Y_{hj}$ is the observed value on $(h, j)$ and $e_{hj} \sim N(0, \sigma_{eeh})$, $u_h \sim N(0, \theta^2)$, $\eta_{hj} \sim N(0, \sigma_{vvh})$, the variables are all independent and $Y_s$ denotes the set of all observed $Y$-values in the sample. Show that

$$E(y_{hj} \mid Y_s) = \begin{cases} \dfrac{\sigma_{eeh} Y_{hj} + \sigma_{vvh} E(\mu_h \mid Y_s)}{\sigma_{eeh} + \sigma_{vvh}}, & h \in s,\ j \in s_h, \\[6pt] E(\mu_h \mid Y_s), & \text{otherwise.} \end{cases}$$

Also,

$$E(\mu_h \mid Y_s) = \lambda_h \bar{Y}_{sh} + (1 - \lambda_h)\hat{\mu}$$

where $\bar{Y}_{sh} = \sum_{j \in s_h} Y_{hj}/m_h$, $\hat{\mu} = \sum_{h \in s}\lambda_h\bar{Y}_{sh}/\sum_{j \in s}\lambda_j$ and

$$\lambda_h = \frac{\theta^2}{\theta^2 + (\sigma_{eeh} + \sigma_{vvh})/m_h} \quad \text{for } h \in s\ (0\ \text{otherwise}).$$

Hence find the Bayes estimator of $T$.

(Bolfarine, 1991)

3. Consider the model

$$Y = x\beta + e, \qquad X = x + v$$

with

$$E(e) = 0,\quad V(e) = \sigma_{ee} I_n,\quad E(v) = 0,\quad V(v_i) = M,$$

where $x = [x_1, \ldots, x_n]'$, $x_i = (x_{i1}, \ldots, x_{ip})'$, $v = [v_1, \ldots, v_n]'$, $Y = (Y_1, \ldots, Y_n)'$, $x_{ij}$ being the value of the auxiliary variable $x_j$ on unit $i$. Assume that $E(x_1 x_1') = A$, a nonsingular matrix, and that $v_i, x_j, e_k\ (i, j, k = 1, \ldots, n)$ are mutually uncorrelated. Show that, under the assumption that $(X'X/n - M)$ is nonsingular with probability one, the estimator $\hat{\beta} = (X'X/n - M)^{-1}X'Y/n$ is strongly consistent for $\beta$.

(Mukhopadhyay, 1999 a)

4. Consider the model and the assumptions of Exercise 3. Show that

$$\sqrt{n}(\hat{\beta} - \beta) = (X'X/n - M)^{-1}\sqrt{n}\,S$$

where

$$S = \{X'x/n - (X'X/n - M)\}\beta + X'e/n$$

and $\hat{\beta}$ is defined in Exercise 3. Show also that $\sqrt{n}S$ is asymptotically normally distributed with zero mean. Also, the asymptotic covariance matrix of $\sqrt{n}(\hat{\beta} - \beta)$ is

$$\Sigma = A^{-1}\,\text{Cov}\Big\{\frac{1}{\sqrt{n}}\sum_{i=1}^n (Z_{1i} + Z_{2i})\Big\}A^{-1}$$

where

$$Z_{1i} = X_i e_i, \qquad Z_{2i} = (X_i x_i' - X_i X_i' + M)\beta,$$

the $Z_{1i}$ being iid as $Z_1$ and the $Z_{2i}$ iid as $Z_2$. Also $\text{Cov}(Z_1, Z_2) = 0$. Hence, $\Sigma = A^{-1}(\Sigma_1 + \Sigma_2)A^{-1}$, $\Sigma_1 = \text{Cov}(Z_1) = \sigma_{ee}E(X_1X_1')$, $\Sigma_2 = \text{Cov}(Z_2)$. Again, show that

$$\hat{\sigma}_{ee} = \frac{1}{n}\big[Y'Y - Y'X\hat{\beta}\big]$$

is strongly consistent for $\sigma_{ee}$. However, $\hat{\sigma}_{ee}$ may be negative. A positive-part estimator of $\sigma_{ee}$ is, therefore, $\hat{\sigma}_{ee}^+ = \max(\hat{\sigma}_{ee}, 0)$.

5. Consider the model $y_i = \mu + e_i$, $Y_i = y_i u_i$, $e_i \sim (0, \sigma_{ee})$, $u_i \sim (1, \sigma_{uu})$, $e_i \perp u_j$ $(i, j = 1, 2, \ldots)$. Show that under this model $\bar{Y}_s$ is the best linear $pm$-unbiased predictor of $\bar{y}$ for any given $p \in P_n$. Also, find the best quadratic $pm$-unbiased predictor of $S_y^2$ for any given $p \in P_n$. In both cases any $p \in P_n$ is an optimal sampling design.

(Mukhopadhyay, 1999 a)

6. Consider the model (7.2.1) and a normal prior $\mu \sim N(0, \theta^2)$ for $\mu$. Find the Bayes estimates of the population total and variance under the Linex loss function (3.4.1). Also, find the Bayes estimate of $T$ under the model (7.2.27) with respect to the same loss function.

(Mukhopadhyay, 1999 e)