An inequality with applications to statistical estimation for probabilistic functions of markov...
Transcript of An inequality with applications to statistical estimation for probabilistic functions of markov...
8/14/2019 An inequality with applications to statistical estimation for probabilistic functions of markov processes and to a mo…
http://slidepdf.com/reader/full/an-inequality-with-applications-to-statistical-estimation-for-probabilistic 1/4
AN INEQUALITY WITH APPLICATIONS TO STATISTICAL
E S T I M A T I O N F O R P R O B A B I L IS T IC F U N C T I O N S
O F M A R K O V P R O C E S S E S A N D T O
A M O D E L F O R E C O L O G Y
BY LEONARD E. BAUM AND J . A. EAGON
Communicated by R. C. Buck, November 21, 1966
1. Summary . T h e ob jec t of th i s no t e is to p ro ve the theo rem be low
and ske tch two appl ica t ions , one to s ta t i s t i ca l es t imat ion for (p robabi l is t ic) funct ions of Markov processes [ l ] and one to Blak ley ' s
model for ecology [ 4 ] .
2. Resu l t .
T H E O R E M . Let P(x)=P({xij}) be a polynomial w ith nonnegative
coefficients homogeneous of degree d in its variables { # # } . Let x= {##}
be any point of the doma in D: ##§: ( ) , ] p L i ## = 1, i = l, • • • , p,
j=l, • • • , q% . For x= {xij} £ £ > le t 3(#) = 3 {# #} denote the point of D
whose i, j coordinate is
( dP\ \ f « dP3(*)<i = ( Xij 7 — ) / 2 * *< i —
\ dXij\(X)// , - i dXij (»>
Then P(3(x))>P(x) unless 3 ( x ) = x .
Notation, fi will de no te a dou bly indexed a r r ay of non neg a t ive
i n t e g e r s : fx= {M#}> i=
l > • • • > < lu i=l, • • • , A #* the n de no tes
I l f - i H î - i ^ * S im ila r ly , cM is an abb rev ia t ion for C[M iJ}. The po ly
n o m i a l P({xij}) i s then wr i t t en P(x) = ] C M V ^ -In ou r no t a t i on :
(1) 3(&)*i = ( Z ) «Wnys* ) / JLH CpiiijX».
We wish t o p rove
(2) P O ) = E <^" ̂ z ^ n f i 3(*)«w.
P R O O F .
^ ) = i k n i i 3 ( * ) « }
360
8/14/2019 An inequality with applications to statistical estimation for probabilistic functions of markov processes and to a mo…
http://slidepdf.com/reader/full/an-inequality-with-applications-to-statistical-estimation-for-probabilistic 2/4
l/d+l
^ijld\ d/d+1
STATISTICAL ESTIMATION AN D A MODEL FOR ECOLOGY 3 6 1
W e ap ply H old er ' s inequ a l i ty [6, p . 21] to ob ta in
/ P Qi \
( P Qi / %.. \Mild\ *
( 3 ) x {E^nn~-) }( In the las t b races we have used (x»)d+lld=x* ü f - i H j - i ^
/ t f0
S i nc e
Z ? - i Z î - i M t f / ^ 5 5 1 by hom oge ne i ty o f P , we can ap ply the inequ a l i ty
of geom et r ic an d a r i th m et ic m ean s [6, p . 16] to the dou ble p ro du c ts
of the second brace of (3) to conclude:
P Qi / X.-J \PHld P Qi nu Xss
( 4 ) E v n n f e - i i ^ E L ? ^ -We now subst i tute the def ini t ion (1) of 3(#)# in the express ion on
the r igh t o f (4) and in te rchange the order o f summat ion to ob ta in :P Qi n . . < Y » . .
2- c,x" 2- 2- -r 77-r"\ P Qi
= "J Z <T»*M X S /*</*#
(5) ' ( Z Z Ctfp'ijJP' ) / ( Z <V/4^ M ' )
= — £ Z ^i f £ M*;VMJ / ( Z MW*
M'J
* Z Z^'Mtfo^'-i o - i M'
For each (i, j) the express ion wi th in the bracke ts i s = 1 and by hy
pothesis for each i, Z ? - i x * y = *• Hence the whole las t express ion of (5)
reduces to (1/d) Z f - i Z ? { - i Z M V / 4 0X M
' - B u t thi s is ju st (1 /d) Z</a**/o
• (dP/dXij0) so by t he E ule r theo rem for homog eneous func t ions i t i s
equa l to 2/lcMa;'1.
F ina l ly , i f we use th i s upper bound H^W for the express ion within
the second braces in (3 ) , r a i se bo th s ides o f (3 ) to the (d+l ) s t power ,
and d iv ide bo th s ides o f the resu l t ing inequa l i ty by ÇSpW) we ob
ta in the des i red inequa l i ty (2 ) .
T h a t P(3{xij})>P{xij) if { ## }? *{ ## } follows from (4) an d th e
s t r ic tness o f the inequa l i ty o f geomet r ic and a r i thmet ic means i f a l l
s u m m a n d s a r e n o t e q u a l .
8/14/2019 An inequality with applications to statistical estimation for probabilistic functions of markov processes and to a mo…
http://slidepdf.com/reader/full/an-inequality-with-applications-to-statistical-estimation-for-probabilistic 3/4
362 L. E. BAUM AND J. A. EAGON [May
3. Application 1. T h e f irs t app l ica t ion of th i s theor em i s to s ta t i s t i
ca l es t im at ion for (p robabi l i s ti ca l ly ) lum ped M ark ov cha ins . Le t 5
be th e f inite s t a t e space of a M ar ko v cha in. Le t ƒ be a funct ion f rom
5 to R. L e t yG.RT, T an in teger , be an observ a t ion . In [ l ] the p rob lem
is cons idered of es t im at ing the t rans i t ion probab i l i t ies a# for i, j&S,
given y.
L e t X=(fT)-
1(y). XQS
T. F o r xEX, i, jES, le t Vi j{x ) be t he num
ber of t imes the pat tern • , • , • , i, j , •, -, • oc cu rs in x. The func t ion
P ( { # # } )=
Z » e x TLiJesaVij^
x) m ay t>
ei n t e rp r e t ed a s t he " p robab i l i t y
of observing y given the t rans i t ion probabi l i t i es { # # } . " N o t e t h a t P
i s a homogeneous po lynomia l o f degree T with no nne ga t iv e ( in teger )
coeff ic ients in the var iables a#.
An i te ra t ive procedure for es t imat ing the t rans i t ion probabi l i t i es
{&<;} given y i s sugg ested in [ l ] . If {ay} is an a priori es t ima te , l e t
4 j = ( ] £ * e j r Pij(x) TLk,iG8 öww(af)
/-P({^y})«A'a
m aY be in te rp re ted as
th e "a posteriori expected value of the f requency of t ransi t ion f rom
s t a t e i t o s t a t e j given y and the a priori probabi l i t i es {da}." T h u s
AyXjA'u may be thought o f as an "a p osteriori es t imate o f the t rans i
t ion probabi l i t i es g iven y." Since
A'ufàA'H = a^dP/da^/^aaidP/dai,)
by ou r the ore m app l ied to t he t ran sfo rm atio n 3{#<,-} = {A'^fLA'^}
w e c o n cl ud e t h a t - P ( 3 { a # } ) è P ( { a # } ) . I n o t h e r w o r d s t h e a pos
teriori es t imate o f t rans i t ion probabi l i t i es increases the l ike l ihood of
the g iven obse rva t i on y.
Various resul ts on the convergence of hi l l c l imbing i tera t ion pro
cedures [2 ] , [3 ] , [5] may be adduced to show tha t fo r a lmos t a l l
s ta r t s success ive i t e ra t ions wi l l converge to a connec ted component
of the local m ax im um set of P . If P ha s only f initely m an y local ma x
ima then success ive i t e ra tes converge to a po in t .
This i s the usua l case in the more genera l s i tua t ion cons idered in
[ l ] in which the observa t ion y t a t t ime t i s ob ta in ed f rom th e M ark ov
s t a t e x t a t t ime t accord ing to P(y t = k \ x t =j) = b3-k where bjk is an 5 X r
s toc has t ic m at r ix w hich is a l so to be es t im ated . H ere th e iden ti f iab i l-
i ty problem does not ar ise s ince, according to a theorem of Ted
Pé t r i e [7 ] , " i n gene ra l " no o the r ( a# ) , (b jk) y i e l d s t h e s a m e ^ p r o b a
bilities as a given (a°y), (&°*) (save for the si re label l ings of s ta tes) .
The second appl icat ion is to some resul ts of Blakley and Dixon
[2]» [3], [ 4 ] . Let r be a symmetr ic ^ - l inear fo rm on Rn
t h a t h a s
nonnegat ive coeff ic ients with respect to the s tandard basis for Rn.
L e t g(rj) = T(rj, TJ, • • • , rj) w h e r e rj is a vector in Rn. Since g is then
ju s t a pth. degree homogeneous po lynomia l wi th nonnega t ive coef iv
8/14/2019 An inequality with applications to statistical estimation for probabilistic functions of markov processes and to a mo…
http://slidepdf.com/reader/full/an-inequality-with-applications-to-statistical-estimation-for-probabilistic 4/4
1967I STATISTICAL ESTIMATION AND A MODEL FOR ECOLOGY 363
cients of the components of rj we may apply the theorem of this note
to it. In Blakley's model g is the adaptati on (rate of growth)
of a population. The transformation in Blakley's model cr(rj)==r
]i(dg(v)/drli)/pÇ[(y) is the same as the transformation 3 {#»-,*} where
Xij^rjj, * = 1, i = l , • • • , n.
In Blakley's model if rj is the distribution of genotypes at time t>
then a(r]) is the distribution at time t+l. Thus it follows from the
theorem in this note with i = 1 tha t adaptation is nondecreasing with
time when evolution of the genotypes at a single locus is considered.
Our theorem with i>\ yields the same conclusion under natural
hypotheses for evolution of the genotypes at several loci. This non-
decreasing of the adapta tion with time is clearly a desirable feature
of the model.
REFERENCES
1. L. E. Baum, A statistical estimation procedure for probabilistic functions of
M arkov processes% IDA -CR D Working Paper No. 131.2. G. R. Blakley, Homogeneous non-negative symmetric quadratic transformations,
Bull. Amer. Math. Soc. 70 (1964), 712-715.3. G. R. Blakley and R. D. Dixon, The sequence of iterates of a non-negative nonlinear transformation. Ill, The theory of homogeneous symmetric transformations andrelated differential equations, (to appear) .
4. G. R. Blakley, Natural selection in ecosystems from the standpoint of mathematicalgenetics, (to appear).
5. Wolfgang Hahn, Theory and application of Liapunov's direct method, Prentice-Hall, Englewood Cliffs, N. J., 1963, pp. 139-150.
6. G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities, CambridgeUniv. Press, New York, 1959.
7. Ted Pétrie, Classification of equivalent processes which are probabilistic functions
of finite Markov chains, IDA-CR D Working Paper No. 181, ID A- CR D Log No. 8694.
INSTITUTE FOR DEFENSE ANALYSES