An inequality with applications to statistical estimation for probabilistic functions of markov...

8/14/2019 An inequality with applications to statistical estimation for probabilistic functions of markov processes and to a mo…

http://slidepdf.com/reader/full/an-inequality-with-applications-to-statistical-estimation-for-probabilistic 1/4

AN INEQUALITY WITH APPLICATIONS TO STATISTICAL

E S T I M A T I O N F O R P R O B A B I L IS T IC F U N C T I O N S

O F M A R K O V P R O C E S S E S A N D T O

A M O D E L F O R E C O L O G Y

BY LEONARD E. BAUM AND J . A. EAGON

Communicated by R. C. Buck, November 21, 1966

1. Summary . T h e ob jec t of th i s no t e is to p ro ve the theo rem be low

and ske tch two appl ica t ions , one to s ta t i s t i ca l es t imat ion for (p robabi l is t ic) funct ions of Markov processes [ l ] and one to Blak ley ' s

model for ecology [ 4 ] .

2. Resu l t .

T H E O R E M . Let P(x)=P({xij}) be a polynomial w ith nonnegative

coefficients homogeneous of degree d in its variables { # # } . Let x= {##}

be any point of the doma in D: ##§: ( ) , ] p L i ## = 1, i = l, • • • , p,

j=l, • • • , q% . For x= {xij} £ £ > le t 3(#) = 3 {# #} denote the point of D

whose i, j coordinate is

( dP\ \ f « dP3(*)<i = ( Xij 7 — ) / 2 * *< i —

\ dXij\(X)// , - i dXij (»>

Then P(3(x))>P(x) unless 3 ( x ) = x .

Notation, fi will de no te a dou bly indexed a r r ay of non neg a t ive

i n t e g e r s : fx= {M#}> i=

l > • • • > < lu i=l, • • • , A #* the n de no tes

I l f - i H î - i ^ * S im ila r ly , cM is an abb rev ia t ion for C[M iJ}. The po ly

n o m i a l P({xij}) i s then wr i t t en P(x) = ] C M V ^ -In ou r no t a t i on :

(1) 3(&)*i = ( Z ) «Wnys* ) / JLH CpiiijX».

We wish t o p rove

(2) P O ) = E <^" ̂ z ^ n f i 3(*)«w.

P R O O F .

^ ) = i k n i i 3 ( * ) « }

360



l/d+l

^ijld\ d/d+1

STATISTICAL ESTIMATION AN D A MODEL FOR ECOLOGY 3 6 1

W e ap ply H old er ' s inequ a l i ty [6, p . 21] to ob ta in

/ P Qi \

( P Qi / %.. \Mild\ *

( 3 ) x {E^nn~-) }( In the las t b races we have used (x»)d+lld=x* ü f - i H j - i ^

/ t f0

S i nc e

Z ? - i Z î - i M t f / ^ 5 5 1 by hom oge ne i ty o f P , we can ap ply the inequ a l i ty

of geom et r ic an d a r i th m et ic m ean s [6, p . 16] to the dou ble p ro du c ts

of the second brace of (3) to conclude:

P Qi / X.-J \PHld P Qi nu Xss

( 4 ) E v n n f e - i i ^ E L ? ^ -We now subst i tute the def ini t ion (1) of 3(#)# in the express ion on

the r igh t o f (4) and in te rchange the order o f summat ion to ob ta in :P Qi n . . < Y » . .

2- c,x" 2- 2- -r 77-r"\ P Qi

= "J Z <T»*M X S /*</*#

(5) ' ( Z Z Ctfp'ijJP' ) / ( Z <V/4^ M ' )

= — £ Z ^i f £ M*;VMJ / ( Z MW*

M'J

* Z Z^'Mtfo^'-i o - i M'

For each (i, j) the express ion wi th in the bracke ts i s = 1 and by hy

pothesis for each i, Z ? - i x * y = *• Hence the whole las t express ion of (5)

reduces to (1/d) Z f - i Z ? { - i Z M V / 4 0X M

' - B u t thi s is ju st (1 /d) Z</a**/o

• (dP/dXij0) so by t he E ule r theo rem for homog eneous func t ions i t i s

equa l to 2/lcMa;'1.

F ina l ly , i f we use th i s upper bound H^W for the express ion within

the second braces in (3 ) , r a i se bo th s ides o f (3 ) to the (d+l ) s t power ,

and d iv ide bo th s ides o f the resu l t ing inequa l i ty by ÇSpW) we ob

ta in the des i red inequa l i ty (2 ) .

T h a t P(3{xij})>P{xij) if { ## }? *{ ## } follows from (4) an d th e

s t r ic tness o f the inequa l i ty o f geomet r ic and a r i thmet ic means i f a l l

s u m m a n d s a r e n o t e q u a l .



362 L. E. BAUM AND J. A. EAGON [May

3. Application 1. T h e f irs t app l ica t ion of th i s theor em i s to s ta t i s t i

ca l es t im at ion for (p robabi l i s ti ca l ly ) lum ped M ark ov cha ins . Le t 5

be th e f inite s t a t e space of a M ar ko v cha in. Le t ƒ be a funct ion f rom

5 to R. L e t yG.RT, T an in teger , be an observ a t ion . In [ l ] the p rob lem

is cons idered of es t im at ing the t rans i t ion probab i l i t ies a# for i, j&S,

given y.

L e t X=(fT)-

1(y). XQS

T. F o r xEX, i, jES, le t Vi j{x ) be t he num

ber of t imes the pat tern • , • , • , i, j , •, -, • oc cu rs in x. The func t ion

P ( { # # } )=

Z » e x TLiJesaVij^

x) m ay t>

ei n t e rp r e t ed a s t he " p robab i l i t y

of observing y given the t rans i t ion probabi l i t i es { # # } . " N o t e t h a t P

i s a homogeneous po lynomia l o f degree T with no nne ga t iv e ( in teger )

coeff ic ients in the var iables a#.

An i te ra t ive procedure for es t imat ing the t rans i t ion probabi l i t i es

{&<;} given y i s sugg ested in [ l ] . If {ay} is an a priori es t ima te , l e t

4 j = ( ] £ * e j r Pij(x) TLk,iG8 öww(af)

/-P({^y})«A'a

m aY be in te rp re ted as

th e "a posteriori expected value of the f requency of t ransi t ion f rom

s t a t e i t o s t a t e j given y and the a priori probabi l i t i es {da}." T h u s

AyXjA'u may be thought o f as an "a p osteriori es t imate o f the t rans i

t ion probabi l i t i es g iven y." Since

A'ufàA'H = a^dP/da^/^aaidP/dai,)

by ou r the ore m app l ied to t he t ran sfo rm atio n 3{#<,-} = {A'^fLA'^}

w e c o n cl ud e t h a t - P ( 3 { a # } ) è P ( { a # } ) . I n o t h e r w o r d s t h e a pos

teriori es t imate o f t rans i t ion probabi l i t i es increases the l ike l ihood of

the g iven obse rva t i on y.

Various resul ts on the convergence of hi l l c l imbing i tera t ion pro

cedures [2 ] , [3 ] , [5] may be adduced to show tha t fo r a lmos t a l l

s ta r t s success ive i t e ra t ions wi l l converge to a connec ted component

of the local m ax im um set of P . If P ha s only f initely m an y local ma x

ima then success ive i t e ra tes converge to a po in t .

This i s the usua l case in the more genera l s i tua t ion cons idered in

[ l ] in which the observa t ion y t a t t ime t i s ob ta in ed f rom th e M ark ov

s t a t e x t a t t ime t accord ing to P(y t = k \ x t =j) = b3-k where bjk is an 5 X r

s toc has t ic m at r ix w hich is a l so to be es t im ated . H ere th e iden ti f iab i l-

i ty problem does not ar ise s ince, according to a theorem of Ted

Pé t r i e [7 ] , " i n gene ra l " no o the r ( a# ) , (b jk) y i e l d s t h e s a m e ^ p r o b a

bilities as a given (a°y), (&°*) (save for the si re label l ings of s ta tes) .

The second appl icat ion is to some resul ts of Blakley and Dixon

[2]» [3], [ 4 ] . Let r be a symmetr ic ^ - l inear fo rm on Rn

t h a t h a s

nonnegat ive coeff ic ients with respect to the s tandard basis for Rn.

L e t g(rj) = T(rj, TJ, • • • , rj) w h e r e rj is a vector in Rn. Since g is then

ju s t a pth. degree homogeneous po lynomia l wi th nonnega t ive coef iv



1967I STATISTICAL ESTIMATION AND A MODEL FOR ECOLOGY 363

cients of the components of rj we may apply the theorem of this note

to it. In Blakley's model g is the adaptati on (rate of growth)

of a population. The transformation in Blakley's model cr(rj)==r

]i(dg(v)/drli)/pÇ[(y) is the same as the transformation 3 {#»-,*} where

Xij^rjj, * = 1, i = l , • • • , n.

In Blakley's model if rj is the distribution of genotypes at time t>

then a(r]) is the distribution at time t+l. Thus it follows from the

theorem in this note with i = 1 tha t adaptation is nondecreasing with

time when evolution of the genotypes at a single locus is considered.

Our theorem with i>\ yields the same conclusion under natural

hypotheses for evolution of the genotypes at several loci. This non-

decreasing of the adapta tion with time is clearly a desirable feature

of the model.

REFERENCES

1. L. E. Baum, A statistical estimation procedure for probabilistic functions of

M arkov processes% IDA -CR D Working Paper No. 131.2. G. R. Blakley, Homogeneous non-negative symmetric quadratic transformations,

Bull. Amer. Math. Soc. 70 (1964), 712-715.3. G. R. Blakley and R. D. Dixon, The sequence of iterates of a non-negative nonlinear transformation. Ill, The theory of homogeneous symmetric transformations andrelated differential equations, (to appear) .

4. G. R. Blakley, Natural selection in ecosystems from the standpoint of mathematicalgenetics, (to appear).

5. Wolfgang Hahn, Theory and application of Liapunov's direct method, Prentice-Hall, Englewood Cliffs, N. J., 1963, pp. 139-150.

6. G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities, CambridgeUniv. Press, New York, 1959.

7. Ted Pétrie, Classification of equivalent processes which are probabilistic functions

of finite Markov chains, IDA-CR D Working Paper No. 181, ID A- CR D Log No. 8694.

INSTITUTE FOR DEFENSE ANALYSES

An inequality with applications to statistical estimation for probabilistic functions of markov...

Documents

Transcript of An inequality with applications to statistical estimation for probabilistic functions of markov...