Probability, Random Processes, and Ergodic Properties

Robert M. Gray

Springer Science+Business Media, LLC

Robert M. Gray, Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA

ISBN 978-1-4757-2026-6 ISBN 978-1-4757-2024-2 (eBook) DOI 10.1007/978-1-4757-2024-2

Library of Congress Cataloging-in-Publication Data Gray, Robert M., 1943-

Probability, random processes, and ergodic properties.

Bibliography: p. Includes index. 1. Probabilities. 2. Measure theory. 3. Stochastic processes. I. Title. QA273.G683 1987 519.2 87-28429

© 1988 by Springer Science+Business Media New York Originally published by Springer-Verlag New York Inc in 1988.

Softcover reprint of the hardcover 1st edition 1988

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher Springer Science+Business Media, LLC, except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc. in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Camera-ready copy prepared by the author using EQN/TROFF.

9 8 7 6 5 4 3 2 1

This book is affectionately dedicated to

Elizabeth Dubois Jordan Gray

and to the memory of

R. Adm. Augustine Heard Gray, U.S.N. 1888-1981

Sara Jean Dubois

and

William "Billy" Gray 1750-1825

CONTENTS

PREFACE
    History and Goals
    Assumed background
    Acknowledgements

1. PROBABILITY AND RANDOM PROCESSES
    1.1 Introduction
    1.2 Probability Spaces and Random Variables
    1.3 Random Processes and Dynamical Systems
    1.4 Distributions
    1.5 Extension
    1.6 Isomorphism
    References

2. STANDARD ALPHABETS
    2.1 Extension of Probability Measures
    2.2 Standard Spaces
    2.3 Some Properties of Standard Spaces
    2.4 Simple Standard Spaces
    2.5 Metric Spaces
    2.6 Extension in Standard Spaces
    2.7 The Kolmogorov Extension Theorem
    2.8 Extension Without a Basis
    References

3. BOREL SPACES AND POLISH ALPHABETS
    3.1 Borel Spaces
    3.2 Polish Spaces
    3.3 Polish Schemes
    References

4. AVERAGES
    4.1 Introduction
    4.2 Discrete Measurements
    4.3 Quantization
    4.4 Expectation
    4.5 Time Averages
    4.6 Convergence of Random Variables
    4.7 Stationary Averages
    References

5. CONDITIONAL PROBABILITY AND EXPECTATION
    5.1 Introduction
    5.2 Measurements and events
    5.3 Restrictions of Measures
    5.4 Elementary Conditional Probability
    5.5 Projections
    5.6 The Radon-Nikodym Theorem
    5.7 Conditional Probability
    5.8 Regular Conditional Probability
    5.9 Conditional Expectation
    5.10 Independence and Markov Chains
    References

6. ERGODIC PROPERTIES
    6.1 Ergodic Properties of Dynamical Systems
    6.2 Some Implications of Ergodic Properties
    6.3 Asymptotically Mean Stationary Processes
    6.4 Recurrence
    6.5 Asymptotic Mean Expectations
    6.6 Limiting Sample Averages
    6.7 Ergodicity
    References

7. ERGODIC THEOREMS
    7.1 Introduction
    7.2 The Pointwise Ergodic Theorem
    7.3 Block AMS Processes
    7.4 The Ergodic Decomposition
    7.5 The Subadditive Ergodic Theorem
    References

8. SPACES OF MEASURES AND THE ERGODIC DECOMPOSITION
    8.1 Introduction
    8.2 A Metric Space of Measures
    8.3 The rho-bar Distance
    8.4 Measures on Measures
    8.5 The Ergodic Decomposition Revisited
    8.6 The Ergodic Decomposition of Markov Sources
    8.7 Barycenters
    8.8 Affine Functions of Measures
    8.9 The Ergodic Decomposition of Affine Functionals
    References

Bibliography

Index

PREFACE

This book has been written for several reasons, not all of which are academic. This material was for many years the first half of a book in progress on information and ergodic theory. The intent was and is to provide a reasonably self-contained advanced treatment of measure theory, probability theory, and the theory of discrete time random processes with an emphasis on general alphabets and on ergodic and stationary properties of random processes that might be neither ergodic nor stationary. The intended audience was mathematically inclined engineering graduate students and visiting scholars who had not had formal courses in measure theoretic probability. Much of the material is familiar stuff for mathematicians, but many of the topics and results have not previously appeared in books.

The original project grew too large and the first part contained much that would likely bore mathematicians and discourage them from the second part. Hence I finally followed the suggestion to separate the material and split the project in two. The original justification for the present manuscript was the pragmatic one that it would be a shame to waste all the effort thus far expended. A more idealistic motivation was that the presentation had merit as filling a unique, albeit small, hole in the literature. Personal experience indicates that the intended audience rarely has the time to take a complete course in measure and probability theory in a mathematics or statistics department, at least not before they need some of the material in their research. In addition, many of the existing mathematical texts on the subject are hard for this audience to follow, and the emphasis is not well matched to engineering applications. A notable exception is Ash's excellent text [1], which was likely influenced by his original training as an electrical engineer. Still, even that text devotes little effort to ergodic theorems, perhaps the most fundamentally important family of results for applying probability theory to real problems. In addition, there are many other special topics that are given little space (or none at all) in most texts on advanced probability and random processes.

Examples of topics developed in more depth here than in most existing texts are the following:

(i) Random processes with standard alphabets. We develop the theory of standard spaces as a model of quite general process alphabets. Although not as general (or abstract) as often considered by probability theorists, standard spaces have useful structural properties that simplify the proofs of some general results and yield additional results that may not hold in the more general abstract case. Examples of results holding for standard alphabets that have not been proved in the general abstract case are the Kolmogorov extension theorem, the ergodic decomposition, and the existence of regular conditional probabilities. In fact, Blackwell [2] introduced the notion of a Lusin space, a structure closely related to a standard space, in order to avoid known examples of probability spaces where the Kolmogorov extension theorem does not hold and regular conditional probabilities do not exist. Standard spaces include the common models of finite alphabets (digital processes) and real alphabets as well as more general complete separable metric spaces (Polish spaces). Thus they include many function spaces, Euclidean vector spaces, two-dimensional image intensity rasters, etc. The basic theory of standard Borel spaces may be found in the elegant text of Parthasarathy [12], and treatments of standard spaces and the related Lusin and Suslin spaces may be found in Christensen [4], Schwartz [13], Bourbaki [3], and Cohn [5]. We here provide a different and more coding oriented development of the basic results and attempt to separate clearly the properties of standard spaces, which are useful and easy to manipulate, from the demonstrations that certain spaces are standard, which are more complicated and can be skipped. Thus, unlike in the traditional treatments, we define and study standard spaces first from a purely probability theory point of view and postpone the topological metric space considerations until later.

(ii) Nonstationary and nonergodic processes. We develop the theory of asymptotically mean stationary processes and the ergodic decomposition in order to model many physical processes better than can traditional stationary and ergodic processes. Both topics are virtually absent in all books on random processes, yet they are fundamental to understanding the limiting behavior of nonergodic and nonstationary processes. Both topics are considered in Krengel's excellent book on ergodic theorems [10], but the treatment here is more detailed and in greater depth. We consider both the common two-sided processes, which are considered to have been producing outputs forever, and the more difficult one-sided processes, which better model processes that are "turned on" at some specific time and which exhibit transient behavior.

(iii) Ergodic properties and theorems. We develop the notion of time averages along with that of probabilistic averages to emphasize their similarity and to demonstrate many of the implications of the existence of limiting sample averages. We prove the ergodic theorem for the general case of asymptotically mean stationary processes. In fact, it is shown that asymptotic mean stationarity is both sufficient and necessary for the classical pointwise or almost everywhere ergodic theorem to hold. We also prove the subadditive ergodic theorem of Kingman [9], which is useful for studying the limiting behavior of certain measurements on random processes that are not simple arithmetic averages. The proofs are based on recent simple proofs of the ergodic theorem developed by Ornstein and Weiss [11], Katznelson and Weiss [8], Jones [7], and Shields [14]. These proofs use coding arguments reminiscent of information and communication theory rather than the traditional (and somewhat tricky) maximal ergodic theorem. We consider the interrelations of stationary and ergodic properties of processes that are stationary or ergodic with respect to block shifts, that is, processes that produce stationary or ergodic vectors rather than scalars.

(iv) Process distance measures. We develop measures of the "distance" between random processes. Such results quantify how "close" one process is to another and are useful for considering spaces of random processes. These in turn provide the means of proving the ergodic decomposition of certain functionals of random processes and of characterizing how close or different the long term behavior of distinct random processes can be expected to be.
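The distinction drawn in topics (ii) and (iii) between time averages and probabilistic averages can be illustrated with a small simulation. The following is only a sketch under assumptions that are not taken from the book: the process is a binary mixture whose bias (0.2 or 0.8, chosen with equal probability) is drawn once and then held fixed, and the names `sample_path` and `time_average` are hypothetical. Such a process is stationary but not ergodic, so each time average settles near the bias of the component that was selected rather than near the overall expectation of 0.5.

```python
import random


def sample_path(n, rng):
    """Draw one length-n path from a stationary but nonergodic binary process.

    Assumption for illustration: a bias of 0.2 or 0.8 is chosen once, with
    equal probability, and held fixed for the whole path. The process is
    therefore a mixture of two i.i.d. (ergodic) components.
    """
    bias = rng.choice([0.2, 0.8])
    path = [1 if rng.random() < bias else 0 for _ in range(n)]
    return bias, path


def time_average(path):
    """Arithmetic mean of the samples along a single path."""
    return sum(path) / len(path)


rng = random.Random(0)  # fixed seed so repeated runs give the same paths
results = [sample_path(20_000, rng) for _ in range(8)]

for bias, path in results:
    # Each time average tracks the selected component, not the mixture mean.
    print(f"component bias = {bias:.1f}, time average = {time_average(path):.3f}")
```

In this toy setting the limit of the time average is itself random, which is the behavior that the ergodic decomposition is meant to describe: the observed path behaves as if it had been produced by one of the ergodic components.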

Having described the topics treated here that are lacking in most texts, we admit to the omission of many topics usually contained in advanced texts on random processes or second books on random processes for engineers. The most obvious omission is that of continuous time random processes. A variety of excuses explain this: The advent of digital systems and sampled-data systems has made discrete time processes at least equally important as continuous time processes in modeling real world phenomena. The shift in emphasis from continuous time to discrete time in texts on electrical engineering systems can be verified by simply perusing modern texts. The theory of continuous time processes is inherently more difficult than that of discrete time processes. It is harder to construct the models precisely and much harder to demonstrate the existence of measurements on the models, e.g., it is harder to prove that limiting integrals exist than limiting sums. One can approach continuous time models via discrete time models by letting the outputs be pieces of waveforms. Thus, in a sense, discrete time systems can be used as a building block for continuous time systems.

Another topic clearly absent is that of spectral theory and its applications to estimation and prediction. This omission is a matter of taste and there are many books on the subject.

A further topic not given the traditional emphasis is the detailed theory of the most popular particular examples of random processes: Gaussian and Poisson processes. The emphasis of this book is on general properties of random processes rather than the specific properties of special cases.

The final noticeably absent topic is martingale theory. Martingales are only briefly discussed in the treatment of conditional expectation. My excuse is again that of personal taste. In addition, this powerful theory is simply not required in the intended sequel to this book on information and ergodic theory.

The book's original goal of providing the needed machinery for a book on information and ergodic theory remains. That book will rest heavily on this book and will only quote the needed material, freeing it to focus on the information measures and their ergodic theorems and on source and channel coding theorems. In hindsight, this manuscript also serves an alternative purpose. I have been approached by engineering students who have taken a master's level course in random processes using my book with Lee Davisson [6] and who are interested in exploring more deeply into the underlying mathematics that is often referred to, but rarely exposed. This manuscript provides such a sequel and fills in many details only hinted at in the lower level text.

As a final, and perhaps less idealistic, goal, I intended in this book to provide a catalogue of many results that I have found need of in my own research together with proofs that I could follow. This is one goal wherein I can judge the success; I often find myself consulting these notes to find the conditions for some convergence result or the reasons for some required assumption or the generality of the existence of some limit. If the manuscript provides similar service for others, it will have succeeded in a more global sense.

The book is aimed at graduate engineers and hence does not assume even an undergraduate mathematical background in functional analysis or measure theory. Hence topics from these areas are developed from scratch, although the developments and discussions often diverge from traditional treatments in mathematics texts. Some mathematical sophistication is assumed for the frequent manipulation of deltas and epsilons, and hence some background in elementary real analysis or a strong calculus knowledge is required.

Acknowledgments

The research in information theory that yielded many of the results and some of the new proofs for old results in this book was supported by the National Science Foundation. Portions of the research and much of the early writing were supported by a fellowship from the John Simon Guggenheim Memorial Foundation. The book was written using the eqn and troff utilities on several UNIX (an AT&T trademark) systems supported by the Industrial Affiliates Program of the Stanford University Information Systems Laboratory.

The book benefited greatly from comments from numerous students and colleagues through many years: most notably Paul Shields, Lee Davisson, John Kieffer, Dave Neuhoff, Don Ornstein, Bob Fontana, Jim Dunham, Farivar Saadat, Mari Ostendorf, Michael Sabin, Paul Algoet, Wu Chou, Phil Chou, and Tom Lookabaugh. They should not be blamed, however, for any mistakes I have made in implementing their suggestions.

I would also like to acknowledge my debt to Al Drake for introducing me to elementary probability theory and to Tom Pitcher for introducing me to measure theory. Both are extraordinary teachers.

Finally, I would like to apologize to Lolly, Tim, and Lori for all the time I did not spend with them while writing this book.

REFERENCES

1. R. B. Ash, Real Analysis and Probability, Academic Press, New York, 1972.

2. D. Blackwell, "On a class of probability spaces," Proc. 3rd Berkeley Symposium on Math. Sci. and Prob., vol. II, pp. 1-6, Univ. California Press, Berkeley, 1956.

3. N. Bourbaki, Elements de Mathematique, Livre VI, Integration, Hermann, Paris, 1956-1965.

4. J. P. R. Christensen, Topology and Borel Structure, Mathematics Studies 10, North-Holland/American Elsevier, New York, 1974.

5. D. C. Cohn, Measure Theory, Birkhauser, New York, 1980.

6. R. M. Gray and L. D. Davisson, Random Processes: A Mathematical Approach for Engineers, Prentice-Hall, Englewood Cliffs, New Jersey, 1986.

7. R. Jones, "New proof for the maximal ergodic theorem and the Hardy-Littlewood maximal inequality," Proc. AMS, vol. 87, pp. 681-684, 1983.

8. Y. Katznelson and B. Weiss, "A simple proof of some ergodic theorems," Israel Journal of Mathematics, vol. 42, pp. 291-296, 1982.

9. J. F. C. Kingman, "The ergodic theory of subadditive stochastic processes," Ann. Probab., vol. 1, pp. 883-909, 1973.

10. U. Krengel, Ergodic Theorems, De Gruyter Series in Mathematics, De Gruyter, New York, 1985.

11. D. Ornstein and B. Weiss, "The Shannon-McMillan-Breiman theorem for a class of amenable groups," Israel J. of Math, vol. 44, pp. 53-60, 1983.

12. K. R. Parthasarathy, Probability Measures on Metric Spaces, Academic Press, New York, 1967.

13. L. Schwartz, Radon Measures on Arbitrary Topological Spaces and Cylindrical Measures, Oxford University Press, Oxford, 1973.

14. P. C. Shields, "The ergodic and entropy theorems revisited," IEEE Transactions on Information Theory, vol. IT-33, pp. 263-266, March 1987.