Grete Usterud Fenstad - UiO
METHODS FOR CONSTRUCTING ASYMPTOTICALLY NORMAL ESTIMATORS
by
Grete Usterud Fenstad
University of Oslo
September 1965
0. INTRODUCTION
In this paper, which is a sequel to [4], we shall give a rather general theorem on constructing asymptotically normal estimators with minimum asymptotic variances. This theorem generalizes previous work of Ferguson [5], Chiang [3] and Stene [6]. The methods of proof are similar to Chiang [3]. In fact, his methods carry over completely; thus we refer the reader to his paper for the details of proof.
The theorem is stated in Section 1. In Section 2 we use this result to give a simple treatment of the multinomial case, and in Section 3 we shall apply it to the gamma-distribution, considering estimators that can be constructed by means of the general method of Section 1.
ASSUMPTION 3. The k×s matrix

$$\dot A(\theta) = \begin{pmatrix} \dfrac{\partial A_1(\theta)}{\partial\theta_1} & \cdots & \dfrac{\partial A_s(\theta)}{\partial\theta_1} \\ \vdots & & \vdots \\ \dfrac{\partial A_1(\theta)}{\partial\theta_k} & \cdots & \dfrac{\partial A_s(\theta)}{\partial\theta_k} \end{pmatrix}$$

has rank k for every θ ∈ Θ.
ASSUMPTION 4. There exists a symmetric, positive definite s×s matrix d(θ) such that

$$\dot A(\theta)\, d(\theta)\, a(\theta) = \dot A(\theta),$$

where a(θ) = (a_{ij}(θ)) is the covariance matrix of each Z_α.
Assumption 4 generalizes the usual condition that a(θ) is positive definite. If a(θ) is positive definite one may choose d(θ) = a(θ)⁻¹. The generalization makes the treatment of the multinomial distribution very simple (see Section 2). The observation that Assumption 4 is sufficient is due to Ferguson [5].
In a previous paper [4; Theorem 1], we have proved that if Z_α, α = 1, 2, … satisfy Assumptions 1 to 4, there exists a consistent, asymptotically normal estimator of θ. This estimator depends on Z_α, α = 1, 2, … only through

$$\bar Z_n = (\bar Z_{n1}, \ldots, \bar Z_{ns}) = \frac{1}{n}\sum_{\alpha=1}^{n} Z_\alpha$$

on a neighborhood S of R = A(Θ) = {A(θ); θ ∈ Θ}. The estimator is continuously differentiable with respect to Z̄_n on S. We now make the following
DEFINITION. The matrix A is greater than or equal to B (A ≥ B) if A − B is positive semi-definite.
If A and B are two covariance matrices of the same order and A ≥ B, it follows that the variances of A are greater than or equal to the corresponding variances of B.
In the aforementioned paper it was also proved [4; Theorem 2] that any consistent estimator being a function of Z̄_n on S and continuously differentiable on S has an asymptotic covariance matrix greater than or equal to

$$\frac{1}{n}\left[\dot A(\theta_0)\, d(\theta_0)\, \dot A(\theta_0)'\right]^{-1}.$$

Thus the problem is to construct estimators with this asymptotic covariance matrix. The following theorem gives one particular method to construct such estimators.
THEOREM. Let the functions f = (f₁, …, f_s) on ℝ^s × Θ to ℝ^s, g = (g₁, …, g_s) on ℝ^s to ℝ^s, and c_{ij}, i, j = 1, …, s, on ℝ^s × Θ to ℝ¹ satisfy the conditions below. Define a quadratic form Q by

$$Q(\bar Z_n, \theta) = \big[g(\bar Z_n) - f(\bar Z_n, \theta)\big]\, c(\bar Z_n, \theta)\, \big[g(\bar Z_n) - f(\bar Z_n, \theta)\big]'$$

where c = (c_{ij}(Z̄_n, θ)). Then
(i) as n → ∞, there exists, with a probability tending to 1, one and only one function θ̂(Z̄_n) which locally minimizes the quadratic form Q(Z̄_n, θ);
(ii) θ̂(Z̄_n) is a consistent estimator of θ₀, the true parameter;
(iii) θ̂(Z̄_n) is continuously differentiable with respect to Z̄_n;
(iv) √n [θ̂(Z̄_n) − θ₀] is asymptotically normally distributed with mean 0;
(v) θ̂(Z̄_n) has asymptotic covariance matrix

$$\left[\dot A(\theta_0)\, d(\theta_0)\, \dot A(\theta_0)'\right]^{-1}.$$
The conditions on f are:
(a) f may be differentiated twice with respect to θ and once with respect to Z̄_n, and the derivatives are all simultaneously continuous in Z̄_n and θ;
(b) the matrix

$$\dot f(\theta) = \begin{pmatrix} \dfrac{\partial f_1(\bar Z_n,\theta)}{\partial\theta_1} & \cdots & \dfrac{\partial f_s(\bar Z_n,\theta)}{\partial\theta_1} \\ \vdots & & \vdots \\ \dfrac{\partial f_1(\bar Z_n,\theta)}{\partial\theta_k} & \cdots & \dfrac{\partial f_s(\bar Z_n,\theta)}{\partial\theta_k} \end{pmatrix}_{(A(\theta),\,\theta)}$$

is of rank k for every θ ∈ Θ; and
(c)

$$\left(\frac{\partial f_i(\bar Z_n,\theta)}{\partial \bar Z_{nj}}\right)_{(A(\theta),\,\theta)} = 0 \quad \text{for every } i, j = 1, \ldots, s.$$
The conditions on g are:
(d) g is continuously differentiable with respect to Z̄_n;
(e) g(A(θ)) = f(A(θ), θ) for every θ ∈ Θ;
(f) g(A(θ′)) ≠ f(A(θ′), θ) for every θ′ ≠ θ;
(g) the matrix

$$\dot g(\theta) = \begin{pmatrix} \dfrac{\partial g_1(\bar Z_n)}{\partial \bar Z_{n1}} & \cdots & \dfrac{\partial g_s(\bar Z_n)}{\partial \bar Z_{n1}} \\ \vdots & & \vdots \\ \dfrac{\partial g_1(\bar Z_n)}{\partial \bar Z_{ns}} & \cdots & \dfrac{\partial g_s(\bar Z_n)}{\partial \bar Z_{ns}} \end{pmatrix}_{A(\theta)}$$

is non-singular for every θ ∈ Θ; and
(h) ḟ(θ) = Ȧ(θ) ġ(θ).
The conditions on c = (c_{ij}) are:
(i) c_{ij}, i, j = 1, …, s may be differentiated twice with respect to θ and once with respect to Z̄_n, and the derivatives are simultaneously continuous in Z̄_n and θ;
(j) c is symmetric and positive definite for every (Z̄_n, θ) ∈ ℝ^s × Θ; and
(k) ġ(θ) c(θ) ġ(θ)′ = d(θ) for every θ ∈ Θ, where c(θ) = (c_{ij}(A(θ), θ)).
The proof of this theorem is similar to the proof of Theorem 6 (and Theorem 2) in Chiang [3], and the reader may be referred to that paper for the details.
In the definition of Q and in the conditions imposed on f, g and c we have implicitly assumed that the various requirements hold for all Z̄_n and θ. It suffices, however, to require that Q is defined and that (a) - (h) hold in some open set containing (A(θ₀), θ₀), and that (i) - (k) hold in some open set containing A(θ₀), because Z̄_n converges to A(θ₀) in probability and because we are only dealing with asymptotic properties.
The extension made in relation to Ferguson [5] and Chiang [3] is that we allow the function f in the quadratic form to depend on both Z̄_n and θ, requiring only that f(A(θ), θ) = g(A(θ)).
This added freedom gives us a convenient method for finding an f(Z̄_n, θ) in applying the theorem (this method was first proposed by Stene [6]): For simplicity, let g(Z̄_n) = Z̄_n; then f(A(θ), θ) must be equal to A(θ). Let θ*(Z̄_n) be a consistent, continuously differentiable estimator of θ. Expanding A(θ) in a Taylor series about θ = θ*(Z̄_n), we obtain for the first two terms

$$A(\theta) \approx A(\theta^*(\bar Z_n)) + \big(\theta - \theta^*(\bar Z_n)\big)\, \dot A(\theta^*(\bar Z_n)).$$

We then propose to choose

$$f(\bar Z_n, \theta) = A(\theta^*(\bar Z_n)) + \big(\theta - \theta^*(\bar Z_n)\big)\, \dot A(\theta^*(\bar Z_n)),$$

and one easily verifies that this choice of f satisfies the conditions of the theorem if A(θ) may be differentiated twice. This particular f leads to linear equations in θ if c depends only on Z̄_n.
An analogous generalization is possible for the linear form
of Ferguson [5] .
Let f and g satisfy the conditions (a) - (h) (actually, condition (a) on f may be replaced by (a)′: f is continuously differentiable with respect to (Z̄_n, θ)), and let b(Z̄_n, θ) = (b_{ρi}(Z̄_n, θ)) be a k×s matrix with real elements satisfying the conditions
(l) b_{ρi}, ρ = 1, …, k, i = 1, …, s, is continuously differentiable with respect to (Z̄_n, θ);
(m) b(θ) ḟ(θ)′ is non-singular for every θ ∈ Θ, where b(θ) = b(A(θ), θ); and
(n) b(θ) ġ(θ)′ = Ȧ(θ) d(θ).
Then, as n → ∞, there exists, with a probability tending to 1, one and only one function θ̂(Z̄_n) which locally solves the equation

$$L(\bar Z_n, \theta) = b(\bar Z_n, \theta)\, \big[g(\bar Z_n) - f(\bar Z_n, \theta)\big]' = 0.$$

θ̂(Z̄_n) is a consistent estimator of θ₀ and continuously differentiable with respect to Z̄_n. Further, θ̂(Z̄_n) is asymptotically normally distributed with covariance matrix

$$\frac{1}{n}\left[\dot A(\theta_0)\, d(\theta_0)\, \dot A(\theta_0)'\right]^{-1}.$$

The proof of this statement may be carried out similarly to the proof of the theorem. The essential point is the use of the implicit function theorem.
The definition of L and the conditions made on f, g and b need only be satisfied "locally"; the remarks made in connection with the above theorem also apply in this case.
2. APPLICATION TO THE MULTINOMIAL DISTRIBUTION
To illustrate the theorem of Section 1, we shall give an application to the multinomial distribution. Consider a multinomial experiment of s mutually exclusive events R₁, …, R_s. For each i the probability of R_i is p_i(θ), p_i(θ) > 0. Let

Z_{αi} = 1 if R_i occurs in the α-th trial, and Z_{αi} = 0 if R_i does not occur in the α-th trial,

i = 1, …, s, α = 1, 2, …
Using the notation already introduced we have

$$Z_\alpha = (Z_{\alpha 1}, \ldots, Z_{\alpha s}), \quad \alpha = 1, 2, \ldots,$$

$$A(\theta) = \big(p_1(\theta), \ldots, p_s(\theta)\big), \quad \sum_{i=1}^{s} p_i(\theta) = 1,$$

$$a(\theta) = \big(\delta_{ij}\, p_i(\theta) - p_i(\theta)\, p_j(\theta)\big).$$

Because of the condition Σ_{i=1}^s p_i(θ) = 1 we have that a(θ) is singular, but

$$d(\theta) = \begin{pmatrix} p_1(\theta)^{-1} & & 0 \\ & \ddots & \\ 0 & & p_s(\theta)^{-1} \end{pmatrix}$$

satisfies Assumption 4.
It is tacitly understood that Assumptions 2 and 3 are satisfied.
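As a numerical sanity check (added here, not part of the original paper), the identity Ȧ(θ)d(θ)a(θ) = Ȧ(θ) of Assumption 4 with d(θ) = diag(p_i(θ)⁻¹) can be verified for a concrete one-parameter multinomial; the Hardy-Weinberg model p(θ) = (θ², 2θ(1−θ), (1−θ)²) and the helper names below are purely illustrative assumptions:

```python
# Verify Assumption 4 for the multinomial choice d(theta) = diag(1/p_i),
# using the illustrative model p(theta) = (theta^2, 2 theta(1-theta), (1-theta)^2).

def p(theta):
    return [theta**2, 2*theta*(1 - theta), (1 - theta)**2]

def p_dot(theta):
    # A-dot(theta): derivatives of the p_i with respect to theta (k = 1 here)
    return [2*theta, 2 - 4*theta, -2*(1 - theta)]

def check(theta):
    pi = p(theta)
    A_dot = p_dot(theta)
    # a(theta)_{ij} = delta_{ij} p_i - p_i p_j  (the singular covariance matrix)
    a = [[(pi[i] if i == j else 0.0) - pi[i]*pi[j] for j in range(3)]
         for i in range(3)]
    # Row vector A-dot d a, where d = diag(1/p_i); compare with A-dot itself.
    Ada = [sum((A_dot[i] / pi[i]) * a[i][j] for i in range(3)) for j in range(3)]
    return max(abs(Ada[j] - A_dot[j]) for j in range(3))

print(max(check(t / 10) for t in range(1, 10)))  # maximal deviation over a grid
```

The deviation is zero up to floating-point roundoff, because Σ_i ∂p_i/∂θ = 0 whenever Σ_i p_i = 1.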
If we generate estimators of θ by minimizing Q(Z̄_n, θ) or solving L(Z̄_n, θ) = 0 we get consistent, asymptotically normal estimators with covariance matrix

$$\frac{1}{n}\left[\dot A(\theta)\, d(\theta)\, \dot A(\theta)'\right]^{-1},$$

which is also the Cramér-Rao lower bound for estimators of θ, viz.

$$\frac{1}{n}\left(-E\,\frac{\partial^2 \ln p(Z_\alpha;\theta)}{\partial\theta_\kappa\, \partial\theta_\lambda}\right)^{-1},$$

where

$$p(z;\theta) = \prod_{i=1}^{s} p_i(\theta)^{z_i}.$$
(It has been shown ([1] and [4]) that if the common probability density of Z_α, α = 1, 2, …, p(·;θ), satisfies some additional regularity conditions, estimators generated by one of these methods have asymptotic covariance matrix equal to the Cramér-Rao lower bound if and only if p(·;θ) is of the exponential type, i.e.

$$p(z;\theta) = \exp\Big\{\alpha_0(\theta) + \beta(z) + \sum_{i=1}^{s} \alpha_i(\theta)\, z_i\Big\}.)$$
We may, if p_i(θ), i = 1, …, s is twice continuously differentiable, for example choose in the quadratic form Q

$$f_i(\bar Z_n, \theta) = p_i(\theta), \quad g_i(\bar Z_n) = \bar Z_{ni}, \quad c_{ij}(\bar Z_n, \theta) = \delta_{ij}\, p_i(\theta)^{-1}, \quad i, j = 1, \ldots, s,$$

or

$$f_i(\bar Z_n, \theta) = p_i(\theta), \quad g_i(\bar Z_n) = \bar Z_{ni}, \quad c_{ij}(\bar Z_n, \theta) = \delta_{ij}\, \bar Z_{nj}^{-1}, \quad i, j = 1, \ldots, s,$$

getting respectively the minimum-χ² estimators and the minimum modified-χ² estimators.
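A minimal numerical sketch (not from the paper) of the second recipe: with g(Z̄_n) = Z̄_n and c_{ij} = δ_{ij} Z̄_{nj}⁻¹, the quadratic form is Q = Σ_i (Z̄_{ni} − p_i(θ))²/Z̄_{ni}, minimized below by a dependency-free grid search; the Hardy-Weinberg probabilities and all function names are illustrative assumptions:

```python
# Minimum modified chi-square for the illustrative one-parameter model
# p(theta) = (theta^2, 2 theta(1-theta), (1-theta)^2).

def p(theta):
    return [theta**2, 2*theta*(1 - theta), (1 - theta)**2]

def q_mod(zbar, theta):
    # modified chi-square quadratic form: denominators are the observed Zbar_ni
    return sum((z - pi)**2 / z for z, pi in zip(zbar, p(theta)))

def min_mod_chisq(zbar, lo=0.0, hi=1.0, rounds=6, grid=100):
    # coarse-to-fine grid search; robust, no convexity assumption needed
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / grid
        best = min((lo + k * step for k in range(grid + 1)),
                   key=lambda t: q_mod(zbar, t))
        lo, hi = max(0.0, best - step), min(1.0, best + step)
    return best

# Frequencies generated exactly by theta = 0.4: the minimizer recovers 0.4.
print(min_mod_chisq(p(0.4)))
```

With sampled frequencies instead of exact ones, the minimizer is the minimum modified-χ² estimate of θ.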
This is easily generalized to the case of several independent multinomial experiments depending on the same basic parameter θ. In this case special choices of f, g and c will give the result of Taylor [8; Theorem, p.88] and thus, as pointed out by Taylor, Berkson's minimum logit χ²-method [2].
3. APPLICATION TO THE GAMMA-DISTRIBUTION
We shall use the results of Section 1 to propose some new estimators of the unknown parameters in the gamma-distribution.
Let X_j, j = 1, 2, … be independent, identically distributed random variables with probability density

$$p(x; \alpha, \sigma) = \frac{1}{\Gamma(\alpha)\, \sigma^\alpha}\, x^{\alpha-1} e^{-x/\sigma}, \quad x > 0, \; \alpha, \sigma > 0,$$

with respect to the Lebesgue measure.
The maximum likelihood estimators of α, σ are found as the solution of the equations

$$(1)\qquad \psi(\alpha^*) + \ln \sigma^* = \bar Z_{n2}, \qquad \alpha^* \sigma^* = \bar Z_{n1},$$

where

$$\bar Z_{n1} = \frac{1}{n}\sum_{j=1}^{n} X_j, \qquad \bar Z_{n2} = \frac{1}{n}\sum_{j=1}^{n} \ln X_j,$$

and ψ(α) = Γ′(α)/Γ(α).
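A small numerical sketch (added here, not in the original) of solving the likelihood equations (1): substituting σ* = Z̄_{n1}/α* reduces them to the single equation ψ(α) + ln(Z̄_{n1}/α) = Z̄_{n2}, whose left side minus the right is strictly increasing in α, so bisection applies; the `digamma` helper is an assumed numerical approximation via a central difference of `math.lgamma`:

```python
import math

def digamma(x, h=1e-6):
    # psi(x) = d/dx ln Gamma(x), approximated by a central difference of lgamma
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def gamma_mle(z1, z2, lo=1e-3, hi=1e6, iters=200):
    # Eliminate sigma via sigma = z1/alpha and solve
    #   psi(alpha) - ln(alpha) + ln(z1) - z2 = 0
    # by bisection; psi(a) - ln(a) is strictly increasing, so the root is unique.
    f = lambda a: digamma(a) - math.log(a) + math.log(z1) - z2
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    a = (lo + hi) / 2
    return a, z1 / a  # (alpha*, sigma*)
```

Feeding the population values Z̄_{n1} = ασ and Z̄_{n2} = ψ(α) + ln σ recovers (α, σ), e.g. `gamma_mle(3.0, digamma(2.0) + math.log(1.5))` returns approximately (2, 1.5).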
(α*, σ*) is asymptotically normally distributed with mean (α, σ) and covariance matrix

$$\frac{1}{n}\, \frac{1}{\alpha\psi'(\alpha) - 1} \begin{pmatrix} \alpha & -\sigma \\ -\sigma & \psi'(\alpha)\, \sigma^2 \end{pmatrix}$$

(see for example Sverdrup [7, p.120]).
" " We get another set of estimators (((, cr) by solving the
equations
EX = 1 n
LX.= X n j=1 J
var X 1 n 2
= L (X. - X) = n j=1 J
One easily finds
EX = c<.. 0'"' and var X = cL C5' 2 ,
thus we get
" -2 X 0(_=
s2 ' " "
It may be shown that (ex., cr) is asymptotically normal distributed
with mean ( o(., <1 ) and covariance matrix
11 2o£.(ot.+1) iii
\-2(ot ... +1 )0"
-2 ( 0(_ + 1 )<J \
2 o( + 3 0'2 J oL
The asymptotic variances of α* and σ* are smaller than the asymptotic variances of α̂ and σ̂ (this will be shown later). This suggests that (α*, σ*) should be preferred as estimator, since we do not know any non-asymptotic properties of (α*, σ*) or (α̂, σ̂). However, a Monte-Carlo experiment has suggested, for some finite n and special values of (α, σ), that the estimator (α̂, σ̂) gives values closer to the real parameter than (α*, σ*).
Introducing

$$\bar Z_{n3} = \frac{1}{n}\sum_{j=1}^{n} X_j^2,$$

we may write

$$(2)\qquad \hat\alpha = \frac{\bar Z_{n1}^2}{\bar Z_{n3} - \bar Z_{n1}^2}, \qquad \hat\sigma = \frac{\bar Z_{n3} - \bar Z_{n1}^2}{\bar Z_{n1}}.$$

Thus (α*, σ*) depends on (Z̄_{n1}, Z̄_{n2}) and (α̂, σ̂) depends on (Z̄_{n1}, Z̄_{n3}). We shall propose an estimator depending on Z̄_n = (Z̄_{n1}, Z̄_{n2}, Z̄_{n3}), hoping it will have the asymptotic properties of (α*, σ*) and the finite properties of (α̂, σ̂).
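The moment estimators (2) can be sketched directly (an illustration added here, with an assumed helper name, not part of the paper):

```python
def moment_estimates(xs):
    # (2): alpha-hat = Zbar1^2 / (Zbar3 - Zbar1^2), sigma-hat = (Zbar3 - Zbar1^2) / Zbar1
    n = len(xs)
    z1 = sum(xs) / n                 # Zbar_n1, the mean of X
    z3 = sum(x * x for x in xs) / n  # Zbar_n3, the mean of X^2
    s2 = z3 - z1 * z1                # the (1/n)-variance s^2
    return z1 * z1 / s2, s2 / z1     # (alpha-hat, sigma-hat)
```

For the two-point sample [1, 5] this gives X̄ = 3 and s² = 4, hence α̂ = 9/4 and σ̂ = 4/3.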
Put

$$Z_j = (X_j, \ln X_j, X_j^2);$$

then Z_j, j = 1, 2, … is a sequence of independent, identically distributed random vectors. In order to apply the theorem of Section 1 we need the first and second order moments of Z = (X, ln X, X²) (deleting the index j). We know that

$$(3)\qquad \Gamma(\alpha)\, \sigma^\alpha = \int_0^\infty x^{\alpha-1} e^{-x/\sigma}\, dx;$$

differentiating on both sides gives

$$(4)\qquad \Gamma'(\alpha)\, \sigma^\alpha + \Gamma(\alpha)\, \sigma^\alpha \ln\sigma = \int_0^\infty (\ln x)\, x^{\alpha-1} e^{-x/\sigma}\, dx,$$

and once more

$$(5)\qquad \Gamma''(\alpha)\, \sigma^\alpha + 2\Gamma'(\alpha)\, \sigma^\alpha \ln\sigma + \Gamma(\alpha)\, \sigma^\alpha (\ln\sigma)^2 = \int_0^\infty (\ln x)^2\, x^{\alpha-1} e^{-x/\sigma}\, dx.$$
Substituting α + r for α in (3) and dividing by Γ(α)σ^α gives

$$E X^r = \frac{\Gamma(\alpha+r)}{\Gamma(\alpha)}\, \sigma^r.$$

In the same way we obtain from (4)

$$E X^r \ln X = \left(\frac{\Gamma'(\alpha+r)}{\Gamma(\alpha)} + \frac{\Gamma(\alpha+r)}{\Gamma(\alpha)}\, \ln\sigma\right) \sigma^r,$$

and from (5)

$$E X^r (\ln X)^2 = \left(\frac{\Gamma''(\alpha+r)}{\Gamma(\alpha)} + 2\,\frac{\Gamma'(\alpha+r)}{\Gamma(\alpha)}\, \ln\sigma + \frac{\Gamma(\alpha+r)}{\Gamma(\alpha)}\, (\ln\sigma)^2\right) \sigma^r.$$
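These moment formulae can be spot-checked by simulation (a check added here, not in the paper), using the standard-library gamma sampler `random.gammavariate(shape, scale)`; the parameter values and sample size are arbitrary illustrations:

```python
import math
import random

# Spot-check E X = alpha*sigma, E ln X = psi(alpha) + ln(sigma),
# E X^2 = alpha(alpha+1) sigma^2 at alpha = 2, sigma = 1.5.
random.seed(1)
alpha, sigma, n = 2.0, 1.5, 200_000
xs = [random.gammavariate(alpha, sigma) for _ in range(n)]

m1 = sum(xs) / n                        # ~ alpha * sigma = 3.0
m2 = sum(math.log(x) for x in xs) / n   # ~ psi(2) + ln(1.5)
m3 = sum(x * x for x in xs) / n         # ~ alpha*(alpha+1)*sigma^2 = 13.5

psi_2 = 1 - 0.5772156649015329          # psi(2) = 1 - Euler's constant
print(m1, m2 - (psi_2 + math.log(sigma)), m3)
```

With 200 000 draws the three sample moments agree with the theoretical values to within a few standard errors.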
These formulae give us

$$A(\alpha, \sigma) = \big(\alpha\sigma, \; \psi(\alpha) + \ln\sigma, \; \alpha(\alpha+1)\sigma^2\big)$$

and

$$a(\alpha, \sigma) = \begin{pmatrix} \alpha\sigma^2 & \sigma & 2\alpha(\alpha+1)\sigma^3 \\ \sigma & \psi'(\alpha) & (2\alpha+1)\sigma^2 \\ 2\alpha(\alpha+1)\sigma^3 & (2\alpha+1)\sigma^2 & 2\alpha(\alpha+1)(2\alpha+3)\sigma^4 \end{pmatrix}.$$

Differentiating A(α, σ) we get

$$\dot A(\alpha, \sigma) = \begin{pmatrix} \sigma & \psi'(\alpha) & (2\alpha+1)\sigma^2 \\ \alpha & 1/\sigma & 2\alpha(\alpha+1)\sigma \end{pmatrix}.$$
Assumptions 1 to 3 of Section 1 are easily verified to hold. The covariance matrix a(α, σ) is positive definite for all (α, σ); if not, there would have been a linear relation between X, ln X and X² with probability 1 (Sverdrup [7; Ch. XII, p.16]). We may therefore choose d(α, σ) = a(α, σ)⁻¹ in Assumption 4. Thus, applying one of the methods in Section 1, the constructed estimators will have asymptotic covariance matrix

$$\frac{1}{n}\left[\dot A(\alpha, \sigma)\, d(\alpha, \sigma)\, \dot A(\alpha, \sigma)'\right]^{-1}.$$

To evaluate this matrix it is necessary to calculate d(α, σ) = a(α, σ)⁻¹:

$$d(\alpha,\sigma) = \frac{1}{|a(\alpha,\sigma)|} \begin{pmatrix} \big(2\alpha(\alpha+1)(2\alpha+3)\psi'(\alpha) - (2\alpha+1)^2\big)\sigma^4 & -4\alpha(\alpha+1)\sigma^5 & \big({-2}\alpha(\alpha+1)\psi'(\alpha) + (2\alpha+1)\big)\sigma^3 \\ -4\alpha(\alpha+1)\sigma^5 & 2\alpha^2(\alpha+1)\sigma^6 & \alpha\sigma^4 \\ \big({-2}\alpha(\alpha+1)\psi'(\alpha) + (2\alpha+1)\big)\sigma^3 & \alpha\sigma^4 & \big(\alpha\psi'(\alpha) - 1\big)\sigma^2 \end{pmatrix},$$

where |a(α, σ)| = α[2α(α+1)ψ′(α) − (2α+3)]σ⁶.
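The closed-form determinant can be checked numerically (an added verification, not in the paper); the `trigamma` approximation by a second central difference of `math.lgamma` is an assumption adequate for a spot check:

```python
import math

def trigamma(x, h=1e-4):
    # psi'(x) via a second central difference of lgamma (spot-check accuracy)
    return (math.lgamma(x + h) - 2 * math.lgamma(x) + math.lgamma(x - h)) / h**2

def det3(m):
    # determinant of a 3x3 matrix by cofactor expansion along the first row
    return (m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
          - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
          + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]))

def check_det(al, sg):
    tg = trigamma(al)
    a = [[al*sg**2,          sg,             2*al*(al+1)*sg**3],
         [sg,                tg,             (2*al+1)*sg**2],
         [2*al*(al+1)*sg**3, (2*al+1)*sg**2, 2*al*(al+1)*(2*al+3)*sg**4]]
    # |a| = alpha [2 alpha (alpha+1) psi'(alpha) - (2 alpha + 3)] sigma^6
    closed_form = al * (2*al*(al+1)*tg - (2*al+3)) * sg**6
    return abs(det3(a) - closed_form)

print(check_det(2.0, 1.5))  # ~ 0 up to floating-point roundoff
```

The two expressions agree identically in ψ′(α), so the discrepancy is pure roundoff for any (α, σ).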
This gives

$$(6)\qquad \frac{1}{n}\, \frac{1}{\alpha\psi'(\alpha) - 1} \begin{pmatrix} \alpha & -\sigma \\ -\sigma & \psi'(\alpha)\, \sigma^2 \end{pmatrix},$$

which is the same asymptotic covariance matrix as the one given above for the maximum likelihood estimators.
REMARK. We are now able to show, as previously stated, that the asymptotic variances of α̂ and σ̂ are greater than the asymptotic variances of α* and σ*. Since a(α, σ) is positive definite, it follows that

$$|a(\alpha, \sigma)| = \alpha\big[2\alpha(\alpha+1)\psi'(\alpha) - (2\alpha+3)\big]\sigma^6 > 0,$$

and therefore

$$\psi'(\alpha) > \frac{2\alpha+3}{2\alpha(\alpha+1)}.$$

Thus the ratios between the asymptotic variances are

$$\frac{\frac{1}{n}\, 2\alpha(\alpha+1)}{\frac{1}{n}\, \frac{\alpha}{\alpha\psi'(\alpha)-1}} = 2(\alpha+1)\big(\alpha\psi'(\alpha)-1\big) = 2\alpha(\alpha+1)\psi'(\alpha) - 2(\alpha+1) > (2\alpha+3) - 2(\alpha+1) = 1$$

and

$$\frac{\frac{1}{n}\, \frac{2\alpha+3}{\alpha}\,\sigma^2}{\frac{1}{n}\, \frac{\psi'(\alpha)\,\sigma^2}{\alpha\psi'(\alpha)-1}} = (2\alpha+3) - \frac{2\alpha+3}{\alpha\psi'(\alpha)} > (2\alpha+3) - 2(\alpha+1) = 1.$$
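The two ratios can be tabulated numerically as a sanity check (added here, not in the paper); the `trigamma` helper is again an assumed numerical approximation:

```python
import math

def trigamma(x, h=1e-4):
    # psi'(x) via a second central difference of lgamma
    return (math.lgamma(x + h) - 2 * math.lgamma(x) + math.lgamma(x - h)) / h**2

def variance_ratios(al):
    # ratios of asymptotic variances, moment estimator over ML estimator,
    # for the alpha- and sigma-components respectively
    tg = trigamma(al)
    r_alpha = 2 * (al + 1) * (al * tg - 1)
    r_sigma = (2*al + 3) * (al * tg - 1) / (al * tg)
    return r_alpha, r_sigma

# Both ratios exceed 1 over a grid alpha = 0.5, 1.0, ..., 20
print(all(min(variance_ratios(k / 2)) > 1 for k in range(1, 41)))
```

As α grows both ratios approach 1, reflecting that the moment estimators lose little efficiency for large shape parameters.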
At this point we have to choose the functions f and g satisfying the conditions (a) - (h), and the matrix c satisfying (i) - (k) if we want to use the quadratic form Q, or the matrix b satisfying (l) - (n) if we want to use the linear form L. Each possible choice of these functions will result in estimators which are asymptotically normally distributed with mean (α, σ) and covariance matrix (6).
We shall give some examples of specific choices of f, g, c and b:
Let

$$f(\bar Z_n, \alpha, \sigma) = A(\alpha, \sigma), \qquad g(\bar Z_n) = \bar Z_n;$$

then ġ(α, σ) = I and ḟ(α, σ) = Ȧ(α, σ). It is easily proved that f and g satisfy conditions (a) - (h). To apply L we need a 2×3 matrix b(Z̄_n, α, σ) satisfying (n):

$$b(A(\alpha,\sigma), \alpha, \sigma) = b(\alpha, \sigma) = \dot A(\alpha, \sigma)\, d(\alpha, \sigma)\, \big(\dot g(\alpha, \sigma)'\big)^{-1} = \dot A(\alpha, \sigma)\, d(\alpha, \sigma) = \begin{pmatrix} 0 & 1 & 0 \\ \sigma^{-2} & 0 & 0 \end{pmatrix}.$$

We simply choose

$$b(\bar Z_n, \alpha, \sigma) = \begin{pmatrix} 0 & 1 & 0 \\ \sigma^{-2} & 0 & 0 \end{pmatrix}.$$

The conditions (l) and (m) are now satisfied. The solutions of

$$L(\bar Z_n, \alpha, \sigma) = b(\bar Z_n, \alpha, \sigma)\, \big[\bar Z_n - A(\alpha, \sigma)\big]' = 0$$

with respect to (α, σ) are estimators with the stated properties. Because of the simple form of b(Z̄_n, α, σ), the equation above is equivalent to

$$\bar Z_{n2} = A_2(\alpha, \sigma), \qquad \bar Z_{n1} = A_1(\alpha, \sigma).$$

Thus we get the estimators (α̃, σ̃) as the solution of

$$\psi(\tilde\alpha) + \ln\tilde\sigma = \bar Z_{n2}, \qquad \tilde\alpha\,\tilde\sigma = \bar Z_{n1},$$

which is seen to be the maximum likelihood estimators (1).
Another possibility is to choose

$$g(\bar Z_n) = \bar Z_n, \qquad f(\bar Z_n, \alpha, \sigma) = A(\hat\alpha, \hat\sigma) + (\alpha - \hat\alpha, \; \sigma - \hat\sigma)\, \dot A(\hat\alpha, \hat\sigma).$$

f(Z̄_n, α, σ) is obtained as mentioned after the theorem in Section 1; we have expanded A(α, σ) "about" the estimator (2). One may verify that the conditions (a) - (h) are satisfied, and that

$$c(\bar Z_n, \alpha, \sigma) = d(\hat\alpha, \hat\sigma)$$

satisfies conditions (i) - (k). Differentiating

$$Q(\bar Z_n, \alpha, \sigma) = \big[\bar Z_n - f(\bar Z_n, \alpha, \sigma)\big]\, d(\hat\alpha, \hat\sigma)\, \big[\bar Z_n - f(\bar Z_n, \alpha, \sigma)\big]'$$

with respect to (α, σ) and setting the derivatives equal to 0, we get

$$-2\, \dot A(\hat\alpha, \hat\sigma)\, d(\hat\alpha, \hat\sigma)\, \big[\bar Z_n - f(\bar Z_n, \alpha, \sigma)\big]' = 0.$$

(In fact we would have obtained an equivalent equation using b(Z̄_n, α, σ) = Ȧ(α̂, σ̂) d(α̂, σ̂) in the linear form L.) As

$$\dot A(\hat\alpha, \hat\sigma)\, d(\hat\alpha, \hat\sigma) = \begin{pmatrix} 0 & 1 & 0 \\ \hat\sigma^{-2} & 0 & 0 \end{pmatrix},$$

the equation is equivalent to

$$\bar Z_{n2} = f_2(\bar Z_n, \alpha, \sigma), \qquad \bar Z_{n1} = f_1(\bar Z_n, \alpha, \sigma).$$

Using the fact that α̂ σ̂ = Z̄_{n1}, we get the estimator (ᾰ, σ̆) defined by

$$\breve\alpha = \hat\alpha\left[1 + \frac{\bar Z_{n2} - \psi(\hat\alpha) - \ln\hat\sigma}{\hat\alpha\, \psi'(\hat\alpha) - 1}\right], \qquad \breve\sigma = \hat\sigma\left[1 - \frac{\bar Z_{n2} - \psi(\hat\alpha) - \ln\hat\sigma}{\hat\alpha\, \psi'(\hat\alpha) - 1}\right].$$
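The construction above, a moment step followed by a one-step correction, can be sketched numerically (an illustration added here; `one_step` and the numerical `digamma`/`trigamma` helpers are assumed names, not from the paper):

```python
import math

def digamma(x, h=1e-6):
    # psi(x) via a central difference of lgamma
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

def trigamma(x, h=1e-4):
    # psi'(x) via a second central difference of lgamma
    return (math.lgamma(x + h) - 2 * math.lgamma(x) + math.lgamma(x - h)) / h**2

def one_step(z1, z2, z3):
    # moment step (2), using (Zbar1, Zbar3)
    a_hat = z1 * z1 / (z3 - z1 * z1)
    s_hat = (z3 - z1 * z1) / z1
    # correction term built from Zbar2, the mean of ln X
    u = (z2 - digamma(a_hat) - math.log(s_hat)) / (a_hat * trigamma(a_hat) - 1)
    return a_hat * (1 + u), s_hat * (1 - u)  # (alpha-breve, sigma-breve)

# Population-level check: with the exact moments of the Gamma(2, 1.5) law,
# the moment step is exact and the correction term vanishes.
al, sg = 2.0, 1.5
a, s = one_step(al * sg, digamma(al) + math.log(sg), al * (al + 1) * sg * sg)
print(a, s)
```

On real data the three sample moments Z̄_{n1}, Z̄_{n2}, Z̄_{n3} replace the population values, and the correction is generally nonzero but small.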
" ... * •* Substituting (OL , a' ) for ( ~, a) in the expression
" " " zn2- 'P( o() -ln if
I
d. 'yJ (ex.) -1
" " "' F(Z~1' 2n2' 2n3) =
we see that F = 0 in this case. Generally it is to be expected
that F
hope that
is small (at least asymptotically, ~0 ), thus one may v v
(of.., if ) is an estimator with finite properties similar " ...
to those of ( cJ.., 0" ), having at the same time the best possible
asymptotic properties. However, the only way to obtain a more de-
finite conclusion concerning the finite properties of the estimators
considered in this section, is through numerical calculations.
REFERENCES
[1] E.W. Barankin and J. Gurland (1951): "On asymptotically normal, efficient estimators: I", University of California Publications in Statistics, Vol. 1, No. 6, pp. 89-130.
[2] J. Berkson (1944): "Application of the logistic function to bio-assay", J. Amer. Statist. Ass., Vol. 39, pp. 357-365.
[3] C.L. Chiang (1956): "On regular best asymptotically normal estimates", Ann. Math. Stat., Vol. 27, pp. 336-351.
[4] G.U. Fenstad (1965): "On asymptotically normal estimators", University of Oslo (to be published).
[5] T.S. Ferguson (1958): "A method of generating best asymptotically normal estimates with application to the estimation of bacterial densities", Ann. Math. Stat., Vol. 29, pp. 1046-1062.
[6] J. Stene (1963): "Bidrag til teorien om beste, asymptotisk normale estimatorer" [Contribution to the theory of best asymptotically normal estimators], University of Oslo (unpublished).
[7] E. Sverdrup (1964): "Lov og tilfeldighet II" [Law and Chance II], Universitetsforlaget, Oslo.
[8] W.F. Taylor (1953): "Distance functions and regular best asymptotically normal estimates", Ann. Math. Stat., Vol. 24, pp. 85-92.