THE STORY OF THE CENTRAL LIMIT THEOREMsms.math.nus.edu.sg/smsmedley/Vol-04-1/The story of... · THE...

11
THE STORY OF THE CENTRAL LIMIT THEOREM Loh Wei Yin The central limit theprem (CLT) occupies a place of honour in the theory of probability, due to its age, its invaluable contribution to the theory of probability and its applications. Like other limit theorems, it essentially says that all large-scale random phenomena 1.n their collective action produce strict regularity. The limit la\-J in the CLT is the well-known Normal distribution from which is derived many of the techniques in statistics, particularly the so-called 11 large sample theory" . Because the CLT is so very basic, it has attracted the attention of numerous workers. The earliest work on the subject is perhaps the theorem of Bernoulli (1713) which 1.s really a special case of the Law of Large Numbers. De Moivre (1730) and.Laplace (1812) later proved the first vers. ion of the CLT. This was generalized by Poisson to constitute the last of the main achievements before the time of Chebyshev. The theorems mentioned above deal with a sequence of independent events ... , with their respective probabilities denoted by p = P(l; ). The number of actually n n occurring events among the first n events t; 1 , ... ,;n is denoted by th e random variable Z . The above-mentioned n results can now be stated as follows. (The first two theorems have pn = p for all n, and 0 < p < 1.) 1. Bernoulli's Theorem. For every E > 0, z P( Inn -PI> s) 0 as n oo 1 . ..... - 35 -

Transcript of THE STORY OF THE CENTRAL LIMIT THEOREMsms.math.nus.edu.sg/smsmedley/Vol-04-1/The story of... · THE...

THE STORY OF THE CENTRAL LIMIT THEOREM

Loh Wei Yin

The central limit theprem (CLT) occupies a place of

honour in the theory of probability, due to its age, its

invaluable contribution to the theory of probability and

its applications. Like a~l other limit theorems, it

essentially says that all large-scale random phenomena 1.n

their collective action produce strict regularity. The

limit la\-J in the CLT is the well-known Normal distribution

from which is derived many of the techniques in statistics,

particularly the so-called 11 large sample theory" .

Because the CLT is so very basic, it has attracted the

attention of numerous workers. The earliest work on the

subject is perhaps the theorem of Bernoulli (1713) which 1.s

really a special case of the Law of Large Numbers. De

Moivre (1730) and.Laplace (1812) later proved the first

vers.ion of the CLT. This was generalized by Poisson to

constitute the last of the main achievements before the

time of Chebyshev.

The theorems mentioned above deal with a sequence of

independent events ~ 1 ,~ 2 ,~ 3 , ... , with their respective

probabilities denoted by p = P(l; ). The number of actually n n occurring events among the first n events t; 1 , ... ,;n is

denoted by the random variable Z . The above-mentioned n

results can now be stated as follows. (The first two theorems

have pn = p for all n, and 0 < p < 1.)

1. Bernoulli's Theorem. For every E > 0,

z P( Inn -PI> s) ~ 0 as n ~ oo

1 . .....

- 35 -

2. Laplace's Theorem

z n

- np

as n + oo uniformly with respect to z 1 and z2

.

We have used the notation

4?(x) = t"2_

1 fx e ·-2 dt

J2TI J -oo

which 1s the standard Normal distribution functio.n.

3. CLT in Poisson's Form

Then

as n

Let A = p 1+ ••• +p , D~ = p 1 C1 -p 1 )~ ... ~p (1-p l . n n n n n

Z -A P(zl < ~ n < z2) + ¢(z2) - ¢(zl)

n

uniformly with respect to z1

and z 2 .

If we introduce the indicator random variable

if l; occurs I~

if ~ does not occur, =

Zn can be vJri tten as

z n =IE. +It:+ ••• + It: •

~1 ~2 ~n

Thus the above three theorems are 1n fact special c ases of

limit theorems concerning sums of independent random v2~i~~~s= .

- 36 -

The rigorous proof of the more general CLT for sums

of arbitrarily distributed independent random variables was

made possible by the creation in the second half of the

nineteenth century of powerful methods due to Chebyshev,

whose work signalled the dawn of a new development in the

entire theory of probability.

Chebyshev considered a sequence of independent random

variables x1 , x2 , ... , Xn,··· with finite means and variances,

denoted respectively by a =EX b 2 = E(X -a ) 2 • Let . n n' n n n

S = X1

+ •.. + X , A = a1

+ ... +a ,and B2 = b 12 + ... + b 2 • n n n n n n

Chebyshev studied and solved the folloHing problem.

Problem. V.Jhat additional conditions ensure the

validity of the CLT:

P( S -A n n

B n

< z) -+- cl>(z)

for every real z as n + ~?

To solve this problem, Chebyshev created the method

of moments . . His proof, in a paper in 1890, was based on a

lemma which was proved only later by Markov (1899). Soon

afterwards, Lyapunov (1900, 1901) solved the same problem

under considerably more general conditions using another

method, although Markov later showed that the method of

moments is also c~pable of obtaining Lyapunov's theorem.

However, it turned out that Lyapunov's method was simpler

and more powerful in its application to the whole class of

limit theorems concerning sums of independent variables.

This is the method of characteristic functions using

Fourier analytic techniques. It is so powerful that to

d~te no other method can yield. better results for the case

of independent random variables.

The condition Lyapunov used to solve Chebyshev 1 s

problem was

- 37 -

) [ i

::.

lim c /B2+o = o , n n n-+oo

An even weaker condition 1s the famous Lindeberg

condition that for every 6>0.

lim n-+oo

where Fk is the distribution of Xk. Subsequently Feller

(1937) showed that the Lindeberg condition is not only

sufficient but also necessary for the limit law to be

normal, provided an appropriate uniform asymptotic

negligibility of the X./B is assumed. 1 n

In practical applications the CLT is used essentially

as an approximate formula for "suff~ciently large values of

n. In order that this use 1s justified, the formula must

contain an estimate of the error involved. One \.Jay of

doing this is to consider the various asymptotic expansions

for the distribution

S -A Fn(x) =PC~ n < x).

n

In his 1890 paper Chebyshev indicated without proof the

following expansion for the difference F n (x) ··· ~ (x), TtJhen

the random variables are identically distributed:

where the Qi(x) are polynomials. The most definitive resul T~

ln this direction are due to Cramer. Edgeworth (1965) studicC

in detail the expans1on 1n a slightly different form.

~ 3 8 --

· When the random variables are·identica11ydistributed

and possess finitethird moments,,Bei'.'ry (1941) and Esseen

{1945) independently proved the celebrated result

K f3 -·--rr2 ' JTI a

. ~;.;rhe:r>e:f3 = EIX1 -Ex1 ! 3 ,.a 2 =EX~- (EX1 ) 2 and K 1.s a constant.

L~ter results have ~ gene:r>alized this to the case of non­

identically distributed summands with the best p()und

·achiev~d by Esseen (1969) in terms of truncated third

mbments.

. A natural question generated by · Lyapunov 1 s CL'l' is

whether the condition that the random variables be indeper-ldent

can be generalized. It ~as forty-seven yea~s ·later before

Hocffding and Robbins ( 1948) proved a CLT for an m-depend:::nt

sequence of random variables. (The conc.ept of m-dependenc2

~ssentially r~quires that given the sequence xl,x2'''''xn''"''

it ism-dependent if cx1,x,, ... ,X) is independent of

L r . (Xs,Xs+l'''') for s-r>m. In .this terminology an independ2nt

sequence is 0-dependent.) Later Diananda (1955) and O::::>e:y

(l958) improved on this result by assuming only Lindeber g 1 s

condition and the boundedness of the . sum of the individual

·variances.

A~most at the same time, ~osenblatt (1956) proved a

CLT for a ' '' strong mixing 11 sequence·. This condition requir ::s

. only that the d6pend~nce ~etwee~ X and X +' diminishes =2 . n . _ n K

k increases. Thus m-dependenc~ ~s incl~ded as a speci~l

case. Rosenblatt's results tvere subsequently improve.r2 b·.'

Philipp (1969a, 19~gb) wh6 not only relaxed the form0r's

conditions b1it also obtained bounds for the error in th::;

norrnal approximation. Soon after, in the Sixth Berkel o::;:

Symposium. Dvoretzky (1972) presented 'v'ery general result ::.;

for dependent random variables. For the particular caE a

of strong mixing, he went beyond Philipp's (1 969a) theors~

- 39 -

by dropping the condition that the variables be uniformly

bounded. In this connection, the author and Chen ~2J have added a refinement to one of Dvoretzky's theorems.

Recently too, McLeish (1974) has made improvements on

Dvoretzky's paper.

At the same symposlum ln which Dvoretzky presented

his results, an equally interesting paper was given by

Stein (1972). This paper is concerned with bounding the

error in the normal approximation for dependent random

variable. Its significance lies not so much in its improve­

ment of known results, Hhich it did manage handsomely, but

rather in its introduction of a new method vastly different

from the established Fourier techniques. The method, which

makes no use of characteristic functions, essentially

d8pends upon an identity and a perturbation technique.

The interest created by Stein's paper was almost

i~~ediate. Chen (1972) used it to give an elementary proof

of the CLT for i~dependent random variables while Erickson

(1974) obtained an 1 1 bound for the error for m-dependent

sums. The latter has since been generalized by the author

and Che n \2ll to ¢-mixing sequences. Chen [9] has - L--- _)

meam.Jhile employed a variation of Stein's method to obtain

nc:c e ssary and sufficient condi t:.i.ons for the dependent

central limit problem where the limit law need not be

normal. In the case that the limit is normal, the author

and Chen [2 2-1 have improved on the existing results for _,

strong mixing s e quences. Although Stein's method appears

to be more easily applicable to dependent random variabl e s .,

the classical Fourier method lS still superior for independ8nt

variab l e s . This is because to date no one has been able t c

apply Stein's method to yield the classical Berry-Esseen

theorem.

I n yet another direction of generalization~ Markov

~v-a.s among the first to prove a multidimensional CLT, vJh2:;_~.,:;

the sequence of random variables is now a sequence of

independent random vectors. The limit law then becomes the

multidimensional Gaussian distribution. Apart from the

extra work of dealing with matr~ces, the proof of the

multidimensional ,_ C~T appears to be a simple extension · of

the one-dimensional case.

The borresponding problem of bounding the error 1n the

multi-dimensional CLT is more interesting. Among the first

to look for estimates was Rao (1961). He was closely

foildwed by a host . of others, mainly Russians, like Bikjalis

(1966), von Bahr (1967), Bhattacharya - (1968), Sadikova (1968)

Sazanov (1968), Bergstrom (1969), Paulanskas (1970) and

Rota.r~ (1970). With the exception of the last two, all the

authors mentioned above considered only :independent and

identically distribu-ted random vectors. The last two

dropped the assumption of identical distributions. \fuen. 1 .

third moments exist~ p.n or•der of n -~ is obtained' which is

equivalent to the Ber~~-E~seen rate. However, this is

only possible for the class of convex Borel sets. In fact,

Bikjalis (1966) has shown that for arbitrary Borel sets,

additional conditions had ~o be assumed.

This is therefore the present situation regarding

developments in. the study of -the CLT. There are still

many nagging questions left to be asnwered, particularly

in bounding the error in the normal approximation. By

considering coin-tossing, it is seen that the rate given

in the Berry-Esseen theorem is achi~ved and hence further

work on this may only be found in reducing the absolute

constant. A more challenging problem is to obtain a prope r

generalization to dependent variables. So far, all

estimates, with the exception of ~hat of Stein (1972), de

not red~be to the Berry-Esseen rate. Stein (1972) obtaineJ 1.:

the correct order of n- 2 for a sequence of stationary

n-dcpendent random variables with eighth moments. The -~ others manage at best an order of n (see e.g. Philipp

(1969b), Erickson (1974), Loh and Chen [21] ) fer more

general types of dependence. Another problem awaiting

- 41 -

future rese'arch is to get bound~ for the corresponding

rnultidirn2nsiona1 case for dependent random vector!3. There

does not appear to have bben any work:' on this pr()blem yet.

vJork on the CLT has generated much interest in

related problems like the Poisson approximation and the

Central Limit Prohlem. With the , latter are associated some.

of the great pioneers in probability like Lev;y, . Khi tchine

and more recently, Kolmogorbv .' To retrace their work ~,,1ould

I"e'quire another essay as long as the present. ,_, .. , __ )

It lS perhaps justifie~ to add that no othep topic in

the theory of probability has attracted so much attention

for so long as the CLT. F?r two hundred and fifty years

since its birth, the CLT . has held man 1 s f~sc;i.p.ation and 'l.vill

continue to do so for many years to come.

References

von ~ahr, B. (1967). On the central limit theorem in

Rk. Ark. Mat~ 7, 5, 61-69.

~l Bergstrom, H. (1969). On the central limit theorem

in Rk. The remainder term for special Borel sets.

Z. · Wahrscheinlichkeitstheorie 14, 113-126 ~ -

Bernoulli, J. (1713). Conjestandi. Basle.

( 1 i_lr J Be rry, A.C. (1941). The accuracy of the Gaussian

a pproximation to the sum of independent variates.

Trans. Amer. Math. Soc. 49, 122-136.

bl Bhattacharya, R.N. (1968). Berry-Esseen bounds for

the multi-dimensional central limit theorem. Bull.

Amer. Math. Soc. 74, 285-287.

[s] Bik:j a lis, A. ( 19 66) . The remainder terms in multi­

dimensional limit theorems. Soviet Math. DokZ. 7,

705-707.

- 42 -

. . . ..

Chebyshev, P·.L. (1890) .·Sur deux theorems re.latirs

awe probabilites. Acta Math. 14, 305-315.

(~ Chen, L6uis H.Y. (1970). An elementa~y proof of the

central 1imi t theorem. Bu ~ L Singapore Math. Soa.

1·-12.

~] Chen, Louis H.Y. A riew approach to dependent central

limit problems. To appear.

~~ Diananda, P.H. (1955). The central limit theorem for

m-dependent variables. Iroc. Camb. Fhi ~. Soc. 51, 92 .. g 5.

~~ Dvoretzky, A. (1972). Asymptotic normality for sums

of dependent random variables. Froc. Sixth Berkeley

Symp. Math. Statist. Prob. 2~ 513-535.

[1~ Edgeworth, f. Y. (1905). The law of error. Proc. Camb.

PhiZ.. Soc. 20,36-65.

[1~ Er·ickson, R.V . . (1974). L1 bounds for asymptotic

normality of rn-dependent sums using Stein's technique

Ann. l:Y.obabiZ.ity. 2, 532-529.

'[lq Esseen, e.G. (1945). A mathematical study of the

Laplace-Gaussian law. Acta Math. 77, 1-125.

r1~ Esseen, C.G. (1969). On the remainder term ln the

central limit theorem. Ark. Mat. 8, 7 - 15.

~~ feller, W. (1S37). Ueber den zentralen Grenzwerksatz

der Wahrscheinlichkeitsrechnung. Math. Z. 42, 301-312.

~i Gnedenko, B.V. and Kohmogorov, A,N. Limit distributions

for sums of independent random variables. Addison­

Wesley . 1954.

~1 Hoeffding, W. and Robbins, H, (1948). The central

limit theorem for dependent random variables. Duke

Math. J. 15, 773 ·-780.

- 43 -

[19] Laplace, R. (.1812) • T heqrie ana Z.y tiqu {:! iir>a 1:n:-o hrrb-t 7>! t: o ~ . Paris.

Lindeberg, Y.W. (1922). Ei:he nene Herleitung d e s

Exponentialgesetzes in der Wahrscheinlichkeitsrechnung .

Math. Z. 15, 21f-22S.

Loh, Wei--Yin and Chen, Louis · 11. Y. The L., normal J..

approximation for ¢-mixing random variables. To

appear.

[2~ Loh, Wei-Yin and Chen, Louis H~Y. On the asymtotic

normality for sums of strong mixing random variables.

To appear.

)}~ Lyapunov, A.M. (1900). Sur une' proposition de la

theorie des probabilites. BuZ.Z.. de Z. 'Acad. Imp. des

Sci. de Mt. Fetersbour•g. 13,359--386.

pq Lyapunov, A.M. (1901). Nouvelle forme du theoreme sur

la limite de probabilites. Mem. Acad. Sc. St.

Fetersbourg 12, 1 - 24.

~j Markov, A.A.(1899). The law of large numbers and the

method of least squares. Izv. Fiz. ~Mat. Obshchestva

Kazan. Univ. 8, 110-128.

[2s] t1cLeish, D.L. (1974). Dependent Central Limit Theorems

and Invariance Principles. Ann Frobabi Z.ity 2, 620··628.

~~J de t1oivre, A. (1730). MisceZ.Z.anea AnaZ.ytica

SuppZ.ementum. London.

[20 Orey, S.A. (195b). Central limit theorem form­

dependent random variables. Duke Math. J. 25, 543-546.

~~ Paulauskas~ V. (1970). The multi-dimensional central

limit theorem. Litovsk. Mat. Sb. 10, 783-789.

- 44 ·-

[3oJ Philipp, VJ. (1969a). The central limit problem for

mixing sequences of random variables. Z. Wahrachein­

- lichkeitstheorie. 12, 255-171.

b~ Philipp, W. (1969b). The remainder in the central

limit theorem for mixii;J.g stochastic processes. Ann.

Math. Statist. 40, \ 601-609. ,

~-~ .Rao,: ,R. Ranga {_1961). On the central limit theore'm

~n Rk. BuZZ. Amer. Math. Soc. 67, 359~g6l.

Rosenblatt, M. (1956). A central limit ~heorem and . . . • ,

a mlxlng condition. !Toe. nat. Acad. Sci. USA. 42,

412-413.

~~ Rotav, V.I. (1970). The rate of convergence in the

multi-dimensional central limit theorem. Theor. Frob.

Appl. 15, 354-356.

~~- Sadikova, S.M~ (1968). On the . multi-dimensional

central limit. theorem. Th_eor. F1"ob. AppZ. 13, 164· ·170.

[3ij Sazanov, V.V. (1968). On the multi-di~ensional centra:

limit theorem. Sankhya, Ser. A, 3b pg.2.

~~ Stein, C. (1972). A bound for the error in the norm~1 approximation to the distribution ,C!f a. s.um o_f .

dependent random variables. !Toe. Sixth Berkeley Sy m~ .

Math. Statist. Prob. 2, 583·-602.

- 45 -