Finite Non-Homogeneous Chains

Author(s): David Blackwell. Source: Annals of Mathematics, Second Series, Vol. 46, No. 4 (Oct., 1945), pp. 594-599. Published by: Annals of Mathematics. Stable URL: http://www.jstor.org/stable/1969199


ANNALS OF MATHEMATICS

Vol. 46, No. 4, October, 1945

FINITE NON-HOMOGENEOUS CHAINS

BY DAVID BLACKWELL

(Received March 8, 1945)

Introduction

Let $X$ denote the set of integers $1, 2, \ldots, N$, considered as the possible states of a physical system, and denote by $P_{ij}(r, s)$ the probability that the system is in state $j$ at time $s$ under the hypothesis that it is in state $i$ at time $r$, where $r < s$; $r, s = \ldots, -1, 0, 1, \ldots$. The $N \times N$ matrices $P(r, s)$ with elements $P_{ij}(r, s)$ will be Markoff matrices, i.e. matrices with non-negative elements and row sums unity, and we shall suppose that they satisfy the equation

(1) $P(r, s)P(s, t) = P(r, t)$ for $r < s < t$.

This equation represents the condition that the successive states of the system form a Markoff chain, i.e. the probability that the system is in state $j$ at time $s$ under the hypothesis that it is in state $i$ at time $r$ ($r < s$) is independent of any hypotheses concerning the states of the system at instants prior to $r$. A set of $1 \times N$ Markoff matrices $P(s)$ will be called a set of absolute probabilities if the equation

(2) $P(s)P(s, t) = P(t)$, $s < t$,

holds. Kolmogoroff^1 has shown that sets of absolute probabilities always exist and will be unique under certain conditions; it will be easy to show that there are never more than $N$ linearly independent such sets. Using these facts, together with a theorem of Doob asserting the convergence of a sequence of chance variables satisfying certain conditions, we investigate the asymptotic properties of the matrices $P(r, s)$ and the associated stochastic processes. Our principal result, Theorem 3, shows that the properties of non-homogeneous chains are analogous to those well known in the homogeneous case where $P(r, s)$ depends only on $s - r$.
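As an illustration only, relations (1) and (2) can be checked on a small example. The sketch below assumes a hypothetical two-state chain whose one-step matrices `step(n)` (standing for $P(n, n+1)$) are invented for this purpose; the names `matmul`, `step`, `P`, `is_markoff` and `uniform` are not from the paper. Exact rational arithmetic is used so that the equalities can be tested without rounding.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def step(n):
    """Hypothetical one-step matrix P(n, n+1) for N = 2 states (an assumption)."""
    a = Fr(1, abs(n) + 2)  # switching probability, changes with the time n
    return [[1 - a, a], [a, 1 - a]]

def P(r, s):
    """P(r, s) for r < s, defined as P(r, r+1) P(r+1, r+2) ... P(s-1, s)."""
    M = step(r)
    for n in range(r + 1, s):
        M = matmul(M, step(n))
    return M

def is_markoff(M):
    """Non-negative elements and row sums unity."""
    return all(min(row) >= 0 and sum(row) == 1 for row in M)

r, s, t = -5, -2, 3
assert is_markoff(P(r, t))
assert matmul(P(r, s), P(s, t)) == P(r, t)   # equation (1)

uniform = [[Fr(1, 2), Fr(1, 2)]]             # a 1 x N Markoff matrix
assert matmul(uniform, P(s, t)) == uniform   # equation (2) with P(s) = P(t) = uniform
```

Because every `step(n)` in this example is doubly stochastic, the constant choice $P(s) = (1/2, 1/2)$ for all $s$ happens to be a set of absolute probabilities; a general chain need not admit a time-independent one.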

The associated stochastic processes

Following Kolmogoroff^2, choose a sequence of integers $r_\nu \to -\infty$ such that for every $s$ the sequence $P(r_\nu, s)$ converges^3, say

(3) $P(r_\nu, s) \to Q(s)$.

Letting $r \to -\infty$ through $r_\nu$ in (1), we obtain

(4) $Q(s)P(s, t) = Q(t)$.
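A rough numerical sketch of this construction follows. It assumes a hypothetical two-state chain whose switching probabilities are summable over time, so that $P(r, s)$ converges as $r \to -\infty$ without passing to a subsequence (the general argument needs the subsequence $r_\nu$). The function names, the chain itself, and the cut-off time `r_far` are illustrative assumptions, and floating-point tolerances replace exact limits.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def step(n):
    """Hypothetical P(n, n+1): the switching probability 2^(-|n|)/4 is summable over n."""
    a = 0.25 * 2.0 ** (-abs(n))
    return [[1 - a, a], [a, 1 - a]]

def P(r, s):
    """P(r, s) for r < s as a product of one-step matrices."""
    M = step(r)
    for n in range(r + 1, s):
        M = matmul(M, step(n))
    return M

def dist(A, B):
    """Largest elementwise difference between two matrices of equal shape."""
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def Q(s, r_far=-60):
    """Approximate Q(s) = lim P(r_nu, s) by one far-back starting time (an assumption)."""
    return P(r_far, s)

s, t = -3, 4
assert dist(P(-40, s), P(-60, s)) < 1e-9           # P(r, s) has settled as r decreases
assert dist(matmul(Q(s), P(s, t)), Q(t)) < 1e-12   # equation (4), up to rounding
```

In this example the two rows of `Q(s)` differ, so each row gives a different set of absolute probabilities.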

1 Reference 3, p. 157.

2 Reference 3, p. 157.

3 We shall say that a sequence of matrices $P(n)$ converges to a matrix $P$ if for all $i, j$, $P_{ij}(n) \to P_{ij}$.


(4) asserts that for every fixed $i$ the set of matrices $Q_i(s) = \| Q_{i1}(s), \ldots, Q_{iN}(s) \|$ is a set of absolute probabilities, which is Kolmogoroff's result.

Now, let $R(s)$ be any set of absolute probabilities, and let $r'_\nu$ be a subsequence of $r_\nu$ for which $R(r'_\nu)$ converges, say to $R$. Letting $s \to -\infty$ through $r'_\nu$ in (2), we obtain

(5) $RQ(t) = R(t)$,

so that every set of absolute probabilities $R(t)$ is a linear combination of the $N$ sets $Q_1(t), \ldots, Q_N(t)$ obtained by Kolmogoroff's method. From (4) and (5) follows a further result of Kolmogoroff^4 which we state as

THEOREM 1: There is a unique set of absolute probabilities $P(s)$ if and only if $P(r, s) \to P_1(s)$ as $r \to -\infty$, where $P_1(s)$ has identical rows. Each row of $P_1(s)$ is then identical with $P(s)$.
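The dichotomy in Theorem 1 can be seen numerically on two invented two-state chains. This is only a sketch under stated assumptions: `mixing` and `sticky` are hypothetical one-step matrices chosen so that the switching probabilities have a divergent and a convergent sum respectively, and `row_gap` is an ad hoc measure of how far the rows of $P(r, 0)$ are from being identical.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def P(step, r, s):
    """P(r, s) for r < s, built from the one-step matrices step(n) = P(n, n+1)."""
    M = step(r)
    for n in range(r + 1, s):
        M = matmul(M, step(n))
    return M

def row_gap(M):
    """Largest difference between two rows of M in any column."""
    return max(max(col) - min(col) for col in zip(*M))

def mixing(n):
    """Hypothetical chain: switching probabilities 1/(|n|+2) have a divergent sum."""
    a = 1.0 / (abs(n) + 2)
    return [[1 - a, a], [a, 1 - a]]

def sticky(n):
    """Hypothetical chain: switching probabilities 2^(-|n|)/4 have a convergent sum."""
    a = 0.25 * 2.0 ** (-abs(n))
    return [[1 - a, a], [a, 1 - a]]

gaps_mixing = [row_gap(P(mixing, r, 0)) for r in (-10, -100, -1000)]
gaps_sticky = [row_gap(P(sticky, r, 0)) for r in (-10, -100, -1000)]

assert gaps_mixing[0] > gaps_mixing[1] > gaps_mixing[2]
assert gaps_mixing[2] < 1e-5   # rows of P(r, 0) merge: one set of absolute probabilities
assert gaps_sticky[2] > 0.5    # rows stay apart: more than one set
```

For the `mixing` chain the rows of $P(r, 0)$ approach $(1/2, 1/2)$, consistent with a unique set of absolute probabilities; for the `sticky` chain the limit has distinct rows.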

Denote by $\Omega$ the space of sequences $\omega: (\ldots, x_{-1}, x_0, x_1, \ldots)$, where each $x_r = 1, 2, \ldots, N$, so that the points $\omega$ represent the possible complete histories of the physical system. If $\mathcal{B}$ is the Borel field of subsets of $\Omega$ determined by all sets $\{x_r = i\}$, Doob^5 has shown that to every set of absolute probabilities $P(r)$ there corresponds a unique probability measure $P(E)$ on $\mathcal{B}$ such that

(6) $P\{x_r = i\} = P_i(r)$,

(7) $P(x_r = i; \{x_s = j\}) = P_{ij}(r, s)$;

and has shown that the space $\Omega$ with the measure $P(E)$ is a Markoff process, i.e.

(8) $E(\ldots, x_{n-1}, x_n; f) = E(x_n; f)$

for every summable function $f(\omega)$ depending only on $x_{n+1}, x_{n+2}, \ldots$. This measure $P(E)$ we shall call the measure corresponding to $P(r)$. The following lemma asserts the linearity of this correspondence:

LEMMA 1: If $P(s) = \sum_{i=1}^{m} p_i P_i(s)$, where $P(s)$, $P_i(s)$ are sets of absolute probabilities with corresponding measures $P(E)$, $P_i(E)$, then $P(E) = \sum_{i=1}^{m} p_i P_i(E)$.

PROOF: Write $Q(E) = \sum_{i=1}^{m} p_i P_i(E)$. It is sufficient to show that $P(E) = Q(E)$ when $E = \prod_{j=0}^{n} \{x_{r+j} = t_j\}$, for the two measures will then coincide on the field of finite sums of disjunct sets of this form and consequently on $\mathcal{B}$. For $n = 0$,

$Q(E) = \sum_{i=1}^{m} p_i P_i(E) = \sum_{i=1}^{m} p_i (P_i(r))_{t_0} = P(E)$

by (6), where $(P_i(r))_{t_0}$ is the element of $P_i(r)$ in column $t_0$. Supposing $Q(E) = P(E)$ for $n < k$ and writing $E = \prod_{j=0}^{k} \{x_{r+j} = t_j\}$, $F = \prod_{j=0}^{k-1} \{x_{r+j} = t_j\}$, we have

$P(E) = P(F) P_{t_{k-1} t_k}(r + k - 1, r + k) = \sum_{i=1}^{m} p_i P_i(F) P_{t_{k-1} t_k}(r + k - 1, r + k) = \sum_{i=1}^{m} p_i P_i(E) = Q(E)$

4 Reference 3, p. 157.

5 Reference 2, p. 102 et seq.


using successively (8), the induction hypothesis, and (8). This completes the proof.
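The induction in the proof amounts to a product formula for the measure of a set $\{x_r = t_0, \ldots, x_{r+n} = t_n\}$. A minimal sketch of that formula, and of the linearity asserted by Lemma 1, is given below. It assumes a hypothetical four-state chain with two closed classes, for which two time-independent sets of absolute probabilities are available in closed form; states are numbered from 0 in the code, and all names (`step`, `push`, `cylinder`, `P1`, `P2`) are illustrative.

```python
from fractions import Fraction as Fr

def step(n):
    """Hypothetical P(n, n+1) with two closed classes {0, 1} and {2, 3}."""
    a, b = Fr(1, abs(n) + 2), Fr(1, abs(n) + 3)
    return [[1 - a, a, 0, 0],
            [a, 1 - a, 0, 0],
            [0, 0, 1 - b, b],
            [0, 0, b, 1 - b]]

def push(row, M):
    """Row vector times matrix."""
    return [sum(x * M[i][j] for i, x in enumerate(row)) for j in range(len(M[0]))]

def cylinder(abs_prob, r, states):
    """Measure of {x_r = t_0, ..., x_{r+n} = t_n} when the absolute
    probabilities are the same vector abs_prob at every time (an assumption)."""
    prob = abs_prob[states[0]]                              # equation (6)
    for j in range(1, len(states)):
        prob *= step(r + j - 1)[states[j - 1]][states[j]]   # one-step transition
    return prob

P1 = [Fr(1, 2), Fr(1, 2), Fr(0), Fr(0)]   # uniform on the first class
P2 = [Fr(0), Fr(0), Fr(1, 2), Fr(1, 2)]   # uniform on the second class
# Both are left unchanged by every one-step matrix, so each is a set of absolute probabilities.
assert all(push(P1, step(n)) == P1 and push(P2, step(n)) == P2 for n in range(-6, 6))

p1, p2 = Fr(1, 3), Fr(2, 3)
mix = [p1 * u + p2 * v for u, v in zip(P1, P2)]
for E in [(0, 1, 1), (2, 3, 2), (1, 0, 0, 1), (0, 2, 2)]:
    assert cylinder(mix, -4, E) == p1 * cylinder(P1, -4, E) + p2 * cylinder(P2, -4, E)
```

The check is exact because rational arithmetic is used; the last set, which jumps between classes, has measure zero under all three measures.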

We consider the space $\Omega$ with the probability measure $P(E)$ corresponding to any set of absolute probabilities $P(r)$. We remark that $P(E)$ is absolutely continuous with respect to the measure $Q(E)$ corresponding to the set of absolute probabilities $Q(r) = \frac{1}{N} \sum_{i=1}^{N} Q_i(r)$. For if $Q(N) = 0$ for a set $N \in \mathcal{B}$, it follows from Lemma 1 that $Q_i(N) = 0$ for all $i$, where $Q_i(E)$ is the measure corresponding to $Q_i(r)$. By (5) and Lemma 1, $P(E)$ is a linear combination of the $Q_i(E)$, so that $P(N) = 0$.

Denote by $\mathcal{F}$ the field of all subsets of $\Omega$ which depend on only a finite number of $x_r$, and let $T \in \mathcal{F}$, say $T = T(x_r, x_{r+1}, \ldots, x_s)$. Define $f_n(\omega) = P(x_n; T)$^{6,7} for $n < r$. Then for $m < n$,

(9) $E(\ldots, f_{m-1}, f_m; f_n) = f_m$.^8

For if $S$ is any set depending only on $\ldots, f_{m-1}, f_m$,

$\int_S f_m \, dP = P(ST) = \int_S f_n \, dP$.

Then by a theorem of Doob,^9 $\lim_{n \to -\infty} f_n(\omega) = f(\omega)$ exists with probability 1. It follows that $f(\omega) = P_\omega(T)$, where $P_\omega(E)$ denotes the conditional probability of $E$ with respect to the Borel field $\mathcal{A}$ of all sets in $\mathcal{B}$ which depend only on $\ldots, x_{n-1}, x_n$ for all $n$. For if $A \in \mathcal{A}$, $\int_A f_n \, dP = P(AT)$, and letting $n \to -\infty$ we obtain, since the chance variables $f_n(\omega)$ are uniformly bounded,

$\int_A f \, dP = P(AT)$.
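Relation (9) can be made concrete in a small case. The sketch below assumes a hypothetical two-state chain (states numbered from 0) and the particular set $T = \{x_0 = 0, x_1 = 1\}$, so that $f_n$ is a function of the state at time $n$ alone. It checks the identity with conditioning on $x_m$, which by (8) is the form in which the averaging over the past reduces to the present state; the names `f` and `cond_exp` are illustrative.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def step(n):
    """Hypothetical one-step matrix P(n, n+1) for two states (an assumption)."""
    a = Fr(1, abs(n) + 2)
    return [[1 - a, a], [a, 1 - a]]

def P(r, s):
    """P(r, s) for r < s as a product of one-step matrices."""
    M = step(r)
    for n in range(r + 1, s):
        M = matmul(M, step(n))
    return M

def f(n):
    """f_n as a function of the state at time n < 0, for T = {x_0 = 0, x_1 = 1}."""
    to_zero = P(n, 0)   # column 0 holds the probabilities of reaching state 0 at time 0
    return [to_zero[i][0] * step(0)[0][1] for i in range(2)]

def cond_exp(m, n, g):
    """Conditional expectation of g(x_n) given x_m, as a function of the state at time m < n."""
    M = P(m, n)
    return [sum(M[i][j] * g[j] for j in range(2)) for i in range(2)]

for m, n in [(-7, -3), (-12, -1), (-4, -2)]:
    assert cond_exp(m, n, f(n)) == f(m)   # averaging f_n back to time m gives f_m
```

The equality is a direct consequence of (1), and holds exactly here because rational arithmetic is used.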

By excluding a set of $Q$-measure zero for each $T$, we obtain a set $C$ such that $Q(C) = 1$ and such that for all $T$,

(10) $P(x_m; T) \to P_\omega(T)$ as $m \to -\infty$ for $\omega \in C$.

Comparing (10) with (3), we see that for every $\omega \in C$ there exists an $i$ such that $P_\omega\{x_s = j\} = Q_i\{x_s = j\}$ for all $s, j$. Defining $\omega_1 \sim \omega_2$ if $P_{\omega_1}\{x_s = j\} = P_{\omega_2}\{x_s = j\}$ for all $s, j$, we obtain a division of $C$ into at most $N$ equivalence classes, say $A_1, \ldots, A_l, A_{l+1}, \ldots, A_m$, where $m \le N$, $Q(A_i) > 0$ for $i \le l$, $Q(A_i) = 0$ for $i > l$. We may suppose, relabeling the states if necessary, that

(11) $P_\omega\{x_r = i\} = Q_k\{x_r = i\}$ for $\omega \in A_k$, $k \le m$.

6 We shall use the notations $P(x_r, \ldots, x_s; E)$ and $E(x_r, \ldots, x_s; f)$ to denote the conditional probability of a set $E$ and conditional expectation of a function $f$ with respect to the variables $x_r, \ldots, x_s$.

7 It is easy to see that the function $f_n(\omega)$ can be defined in terms of the matrices $P(r, s)$ only, and is consequently independent of $P(E)$.

8 This fact was pointed out to the writer by Prof. J. L. Doob.

9 Reference 1, p. 458.


This suggests

LEMMA 2: $P_\omega(E) = Q_k(E)$ for $E \in \mathcal{B}$, $\omega \in A_k$, $k \le m$.

This lemma is to be interpreted as meaning that the function $P_\omega(E)$, which for each $E \in \mathcal{B}$ is unique except on a set of measure zero, may be defined uniquely on $A_1 + \cdots + A_m$ simultaneously for all $E \in \mathcal{B}$ by the above equation. So defined, $P_\omega(E)$ becomes for each $\omega \in A_1 + \cdots + A_m$ a probability measure on $\mathcal{B}$. We see also that $P_\omega(E)$, so defined, is independent of $P(E)$ on $A_1 + \cdots + A_m$.

PROOF: We must show that for all $E \in \mathcal{B}$, $A \in \mathcal{A}$,

(12) $P(AE) = \sum_{i=1}^{m} P(AA_i) Q_i(E)$.

It is sufficient to show (12) when $E = \prod_{j=0}^{n} \{x_{r+j} = t_j\}$. For $n = 0$, (12) follows from (11). Supposing (12) holds whenever $n < k$, and writing $E = \prod_{j=0}^{k} \{x_{r+j} = t_j\}$, $F = \prod_{j=0}^{k-1} \{x_{r+j} = t_j\}$, we have

$P(AE) = \sum_{i=1}^{m} P(AA_iE) = \sum_{i=1}^{m} P(AA_iF) P_{t_{k-1} t_k}(r + k - 1, r + k) = \sum_{i=1}^{m} P(AA_i) Q_i(F) P_{t_{k-1} t_k}(r + k - 1, r + k) = \sum_{i=1}^{m} P(AA_i) Q_i(E)$.

COROLLARY: $Q_k(A_k) = 1$ for $k \le l$.

PROOF: Since $A_k \in \mathcal{A}$, $P_\omega(A_k) = 1$ for $\omega \in A_k$.

Collecting our results, we have

THEOREM 2: We may write $\Omega = A_1 + \cdots + A_l + A_{l+1} + \cdots + A_m + D$, where

(a) $P(x_n; T) \to Q_k(T)$ as $n \to -\infty$ for $\omega \in A_k$, $k \le m$, $T \in \mathcal{F}$,

(b) $P_\omega(E) = Q_k(E)$ for $E \in \mathcal{B}$, $\omega \in A_k$, $k \le m$,

(c) $P(A_1 + \cdots + A_l) = 1$.

COROLLARY 1: The sets of absolute probabilities $Q_k(s)$, $k \le l$, are linearly independent, and every set of absolute probabilities is a linear combination of these.

PROOF: From the corollary to Lemma 2, $Q_k(A_k) = 1$ for $k \le l$, so that certainly the measures $Q_k(E)$, $k \le l$, are linearly independent. Also $P(E) = \sum_{k=1}^{l} P(A_kE) = \sum_{k=1}^{l} P(A_k) Q_k(E)$, so that $P(E)$ is a linear combination of the $Q_k(E)$, $k \le l$. By Lemma 1, the corresponding statements hold for $Q_k(s)$.

COROLLARY 2: $l = 1$ if and only if $P(r, s) \to P_1(s)$ as $r \to -\infty$, where $P_1(s)$ is the matrix defined in Theorem 1.

This follows at once from Theorem 1 and Corollary 1 of Theorem 2.

We now apply Theorem 2 to obtain information about the behavior of $P(r, s)$ as $r \to -\infty$. Our result is

THEOREM 3: We may write $X = T_{r1} + \cdots + T_{rl} + N_r$, where

(a) $P_{ij}(r, s) \to Q_{kj}(s)$ for $i \in T_{rk}$, $r \to -\infty$,

(b) $P_{ij}(r, s) \to 0$ for $i \in T_{rk}$, $j \in T_{sn}$, $n \ne k$, as $r, s \to -\infty$,

(c) $P(\limsup_{r \to -\infty} \{x_r \in N_r\}) = 0$.

PROOF: Define subsets $S_{rkp}$ of $\Omega$ as follows:

$S_{rkp} = \{ |P(x_r; E) - Q_k(E)| < 1/p \}$, $k \le l$,

for all $E$ depending only on $x_{-p}, \ldots, x_p$.


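Then $A_k = \prod_{p=1}^{\infty} \sum_{N} \prod_{r \le N} S_{rkp}$, so that for all $p$, $Q_k(\prod_{r \le N} S_{rkp}) \to 1$ as $N \to -\infty$. We can choose $N_1 > N_2 > \cdots$, $N_p \to -\infty$, so that

$Q_k(\prod_{r \le N_p} S_{rkp}) > 1 - \frac{1}{2^p}$.

Defining $S_{rk} = S_{rkp}$ for $N_p \ge r > N_{p+1}$, we see that for $N \le N_{p_0}$,

$Q_k(\prod_{r \le N} S_{rk}) \ge Q_k(\prod_{p \ge p_0} \prod_{r \le N_p} S_{rkp}) > 1 - \frac{1}{2^{p_0 - 1}}$,

so that $Q_k(\liminf_{r \to -\infty} S_{rk}) = 1$. Moreover we have

(13) $P(x_r; T) \to Q_k(T)$ for $\omega \in \liminf_{r \to -\infty} S_{rk}$, $T \in \mathcal{F}$,

so that $\liminf S_{rk} \subseteq A_k$, and $A_k \limsup S_{rj} = 0$ for $j \ne k$. We remark that the convergence in (13) is for fixed $T$ and $N$ uniform on the set $\prod_{r \le N} S_{rk}$. Let $S'_{rk} = S_{rk} C(\sum_{j \ne k} S_{rj})$, where $C$ denotes the complement.

Then

$\liminf_{r \to -\infty} S'_{rk} = (\liminf_{r \to -\infty} S_{rk}) A_k (\liminf_{r \to -\infty} C(\sum_{j \ne k} S_{rj}))$

$= (\liminf_{r \to -\infty} S_{rk}) A_k (\liminf_{r \to -\infty} \prod_{j \ne k} C(S_{rj}))$

$= (\liminf_{r \to -\infty} S_{rk}) A_k \prod_{j \ne k} \liminf_{r \to -\infty} C(S_{rj})$

$= (\liminf_{r \to -\infty} S_{rk}) A_k \prod_{j \ne k} C(\limsup_{r \to -\infty} S_{rj})$

$= \liminf_{r \to -\infty} S_{rk}$.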

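Define $T_{rk} = \{ j : \{x_r = j\} \subseteq S'_{rk} \}$, $N_r = X - \sum_{k=1}^{l} T_{rk}$. Since $S'_{rk} \subseteq S_{rk}$, (a) follows from (13). To establish (b), given $\epsilon > 0$, choose $N_1$ so that for $k \le l$, $Q_k(\prod_{n \le N_1} S'_{nk}) > 1 - \epsilon$. Then choose $N_2 < N_1$ so that

$P(x_r; S'_{N_1 k}) > 1 - 2\epsilon$ for $x_r \in T_{rk}$, $r < N_2$, $k \le l$.

This can be done by (13), since $S'_{N_1 k} \in \mathcal{F}$. Then for $r, s < N_2$, $x_r \in T_{rk}$, $j \ne k$,

$2\epsilon > P(x_r; S'_{sj} S'_{N_1 j}) \ge P(x_r; S'_{sj})(1 - 2\epsilon)$,

so that $P(x_r; S'_{sj}) < 2\epsilon/(1 - 2\epsilon)$. Finally, since $\liminf_{r \to -\infty} S'_{rk} \cdot \limsup_{r \to -\infty} \{x_r \in N_r\} = 0$, $Q_k(\limsup_{r \to -\infty} \{x_r \in N_r\}) = 0$ for $k \le l$, so that

$P(\limsup_{r \to -\infty} \{x_r \in N_r\}) = 0$.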
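A small numerical sketch may help to read parts (a) and (b) of Theorem 3. It assumes a hypothetical two-state chain (states numbered from 0) whose switching probabilities are summable toward $-\infty$; for such a chain one would expect $l = 2$ with each state forming its own class at every sufficiently early time, which is an assumption of the example and is not derived from the paper. The code only inspects the transition probabilities themselves.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def step(n):
    """Hypothetical P(n, n+1): switching probability 2^(-|n|)/4, summable over n."""
    a = 0.25 * 2.0 ** (-abs(n))
    return [[1 - a, a], [a, 1 - a]]

def P(r, s):
    """P(r, s) for r < s as a product of one-step matrices."""
    M = step(r)
    for n in range(r + 1, s):
        M = matmul(M, step(n))
    return M

far = -40
# Probability of crossing from state 0 at time far to state 1 at time s.
cross = [P(far, s)[0][1] for s in (0, -5, -10, -20)]

assert cross == sorted(cross, reverse=True)       # shrinks as s moves toward -infinity
assert cross[0] > 0.1 and cross[-1] < 1e-6        # part (b): small only when s is also early
assert abs(P(-60, 0)[0][1] - P(far, 0)[0][1]) < 1e-9   # part (a): a limit as r decreases, s fixed
```

For fixed $s = 0$ the crossing probability settles at a positive value as $r$ decreases, while it tends to zero once $s$ is also sent toward $-\infty$.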

COROLLARY: If for some $t > 0$ and an infinite number of negative values of $r$ we have $P_{ij}(r, r + t) > \epsilon > 0$ for all $i, j$, then $P(r, s) \to P_1(s)$ as $r \to -\infty$.

PROOF: From (b), $l = 1$. The result then follows from Corollary 2 of Theorem 2.
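The hypothesis of the corollary can be tried on a toy chain. The sketch below assumes a hypothetical two-state chain that does nothing at odd times and applies a fixed matrix with all entries above $\epsilon = 0.25$ at even times, so the condition holds with $t = 1$ for infinitely many negative $r$. The names `step`, `P` and `row_gap` are illustrative.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def step(n):
    """Hypothetical P(n, n+1): all entries exceed 0.25 at even n, identity at odd n."""
    if n % 2 == 0:
        return [[0.7, 0.3], [0.4, 0.6]]
    return [[1.0, 0.0], [0.0, 1.0]]

def P(r, s):
    """P(r, s) for r < s as a product of one-step matrices."""
    M = step(r)
    for n in range(r + 1, s):
        M = matmul(M, step(n))
    return M

def row_gap(M):
    """Largest difference between two rows of M in any column."""
    return max(max(col) - min(col) for col in zip(*M))

gaps = [row_gap(P(r, 0)) for r in (-4, -12, -40)]
assert gaps[0] > gaps[1] > gaps[2]
assert gaps[2] < 1e-9   # rows of P(r, 0) become identical as r decreases
```

Each even step shrinks the gap between the rows by the factor $0.7 - 0.4 = 0.3$ in this example, so the rows of $P(r, 0)$ merge geometrically in the number of even times between $r$ and 0.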

The corollary is a direct generalization of the well-known fact for temporally


homogeneous chains that if for some $t > 0$, $P^t_{ij} > 0$ for all $i, j$, then $P^n$ converges to a limit with identical rows, and there is only one set of absolute probabilities.

HOWARD UNIVERSITY

REFERENCES

1. DOOB, J. L., Regularity Properties of Certain Families of Chance Variables. Trans. A. M. S. (47) 1940, pp. 455-486.

2. DOOB, J. L., Stochastic Processes with an Integral-Valued Parameter. Trans. A. M. S. (44) 1938, pp. 87-150.

3. KOLMOGOROFF, A., Zur Theorie der Markoffschen Ketten. Mathematische Annalen (112) 1935-6, pp. 155-160.
