Central Limit Theorem

The Central Limit Theorem (hereafter, CLT) is the most important result in all of probability and statistics. It explains the profound importance of the normal distribution, and provides a foundation, essentially, for all of elementary statistics.

Warm-up to the CLT

One of the most significant aspects of the CLT is that it shows how repeated sampling of values from a random variable $X$ results in a random variable whose expected value remains the same as that of $X$, but with a variance which decreases linearly in the number of samples (i.e., is proportional to $1/n$).

Formally, let $X_1, X_2, \ldots, X_n$ be independent random variables that are identically distributed (i.e., all have the same probability function in the discrete case or density function in the continuous case) and all have the same finite mean $\mu$ and variance $\sigma^2$. In the simple case of sampling, each $X_i$ represents one "poke" of a single random variable $X$, but this is not required.

Now suppose we consider the random variable $\overline{X}$ representing the mean of the $n$ $X_i$, i.e.,

$$\overline{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}.$$

First, it is hardly surprising that the mean of $\overline{X}$, or the "mean of the mean," remains unchanged, by virtue of the linearity of expected values, i.e.,

$$E(a \cdot X + b) = a \cdot E(X) + b$$

and also, for any RVs $X$ and $Y$ (independence is not even needed here),

$$E(X + Y) = E(X) + E(Y).$$

Thus for $n$ "pokes" of a RV $X$, we have

$$E(X_1 + X_2 + \cdots + X_n) = n \cdot E(X)$$

and so, putting these results together, we have a not unexpected result:

$$E(\overline{X}) = E\left( \frac{X_1 + X_2 + \cdots + X_n}{n} \right) = \frac{1}{n} \cdot n \cdot E(X) = E(X).$$

What about the variance of $\overline{X}$? Does it have the same linearity property? Not quite! As shown in lecture:

$$Var(a \cdot X + b) = a^2 \cdot Var(X)$$

This is not unexpected, since the units of the variance are squared! Note that the standard deviation does have a similar linearity property, except that the scaling factor $a$ loses its sign and the constant shift $b$ disappears entirely, since shifting the distribution left or right does not affect its spread:

$$\sigma_{a \cdot X + b} = |a| \cdot \sigma_X.$$

However, we do have linearity for the variance of a sum of independent RVs (independence enters in the step where $E(XY)$ is replaced by $E(X)E(Y)$):

$$\begin{aligned}
Var(X + Y) &= E[(X + Y)^2] - E(X + Y)^2 \\
&= E(X^2 + 2XY + Y^2) - (E(X) + E(Y))^2 \\
&= E(X^2) + 2E(X)E(Y) + E(Y^2) - \left[ E(X)^2 + 2E(X)E(Y) + E(Y)^2 \right] \\
&= E(X^2) + 2E(X)E(Y) + E(Y^2) - E(X)^2 - 2E(X)E(Y) - E(Y)^2 \\
&= E(X^2) - E(X)^2 + E(Y^2) - E(Y)^2 \\
&= Var(X) + Var(Y)
\end{aligned}$$

and so we have:

$$Var(X_1 + X_2 + \cdots + X_n) = n \cdot Var(X).$$

Finally we can investigate $Var(\overline{X})$:

$$Var(\overline{X}) = Var\left( \frac{X_1 + X_2 + \cdots + X_n}{n} \right) = \frac{Var(X_1 + X_2 + \cdots + X_n)}{n^2} = \frac{Var(X_1) + Var(X_2) + \cdots + Var(X_n)}{n^2} = \frac{n \cdot Var(X)}{n^2} = \frac{Var(X)}{n}$$


and the standard deviation is

$$\sigma_{\overline{X}} = \frac{\sigma_X}{\sqrt{n}}.$$

So the punchline is: when we take the mean of $n$ independent "pokes" of a random variable, the mean value remains the same, but the variance decreases linearly (by a factor of $\frac{1}{n}$) and the standard deviation gets smaller by a factor of $\frac{1}{\sqrt{n}}$.

Now we can finally be precise about our experiment (from day one!) of flipping a fair coin and taking the average of the number of heads! If we flip a fair coin once, we have $X \sim$ Bernoulli(1/2) with a mean of 1/2, a variance of 1/4, and a standard deviation of 1/2. If we flip the coin $n$ times, the mean number of heads converges to 1/2 in the very precise sense that its standard deviation

$$\sigma_{\overline{X}} = \frac{1}{2\sqrt{n}}$$

shrinks to 0 as $n$ grows.

You can see this graphically if we plot the mean of n = 1, ..., 1000 coin flips:
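A minimal sketch of how such a plot can be produced with numpy and matplotlib (the dotted envelope marking $1/2 \pm \frac{1}{2\sqrt{n}}$ is an added illustration of the shrinking standard deviation, not part of the original figure):

import numpy as np
import matplotlib.pyplot as plt

# Simulate 1000 flips of a fair coin (1 = heads) and compute the
# running mean after each flip.
flips = np.random.randint(0, 2, size=1000)
ns = np.arange(1, 1001)
running_mean = np.cumsum(flips) / ns

plt.figure(figsize=(12, 6))
plt.plot(ns, running_mean, lw=1, label='mean of first n flips')
plt.axhline(0.5, color='k', ls='--', label='true mean 1/2')
# One-standard-deviation envelope around the mean: 1/2 +/- 1/(2 sqrt(n))
plt.plot(ns, 0.5 + 1/(2*np.sqrt(ns)), 'r:', label='1/2 +/- 1/(2 sqrt(n))')
plt.plot(ns, 0.5 - 1/(2*np.sqrt(ns)), 'r:')
plt.xlabel('n')
plt.ylabel('running mean')
plt.legend()
plt.show()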

Experiment One: What happens when we keep flipping a coin?


In [1]:

from random import randint

# Flip a fair coin n times and compare the sample mean to the theoretical
# mean 1/2, variance 1/(4n), and standard deviation 1/(2 sqrt(n)).
n = 10**6
count = 0
for k in range(n):
    count += randint(0, 1)
mu = count / n

print("n = " + str(n) + "\t result = " + str(mu) + "\tVar = " + str(0.25/n))
print("sigma = " + str(0.5 * (1/(n**0.5))))
print("delta = " + str(mu - 0.5))

n = 1000000 	 result = 0.499746 	Var = 2.5e-07
sigma = 0.0005
delta = -0.00025399999999997647

Experiment Two: How does variance decrease as n gets larger?


In [35]:

import numpy as np
from numpy import arange, linspace, mean, var, std, unique
import matplotlib.pyplot as plt
from numpy.random import random, randint, uniform, choice, binomial, geometric, poisson
import math
from collections import Counter
import pandas as pd
%matplotlib inline

def round4(x):
    return round(float(x) + 0.00000000001, 4)

def sampleMeanUniform(n):
    X = [uniform(0, 1) for i in range(n)]
    return sum(X)/n

def display_sample_mean_uniform(n, num_trials, decimals):
    fig, ax = plt.subplots(1, 1, figsize=(12, 6))
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 50)
    plt.title('Uniform: Distribution of Sample Mean n = ' + str(n))
    plt.ylabel("f(x)")
    plt.xlabel("k in Range(X)")

    # use uniform to generate random sample means
    X = [sampleMeanUniform(n) for i in range(num_trials)]
    Xrounded = [round(x, decimals) for x in X]

    # Now convert frequency counts into probabilities
    D = Counter(Xrounded)
    Xrounded = unique(Xrounded)     # sorts and removes duplicates
    # must multiply probs by 10**decimals because bins are of width 1/10**decimals
    P = [10**decimals * D[k]/num_trials for k in Xrounded]
    plt.bar(Xrounded, P, width=1/10**decimals, edgecolor='k', alpha=0.5)
    plt.show()

n = 1000          # try for 1, 2, 5, 10, 30, 100, 1000
num_trials = 1000
display_sample_mean_uniform(n, num_trials, 3)
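For n = 1000, the warm-up result predicts the sample means cluster around 0.5 with standard deviation $\sqrt{1/12}/\sqrt{1000} \approx 0.0091$; rerunning the cell with smaller values of n shows the histogram widening accordingly.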


However, we can say much more than this, and that is the content of the next section.

The Central Limit Theorem

We expect the normal distribution to arise whenever the outcome of a situation results from numerous small additive components, with no single effect or small group of effects dominant (since all the components are independent). Hence, it occurs regularly in biostatistics (where many genes and many environmental factors add together to produce some result, such as intelligence or height), in errors in measurement (where many small errors add up), and in finance (where many small effects contribute, for example, to the price of a stock). The CLT provides the formal justification for this phenomenon.

To state the CLT, first suppose we standardize $\overline{X}_n$ by subtracting $\mu$ and dividing by its standard deviation $\frac{\sigma}{\sqrt{n}}$ to obtain a new random variable $Z_n$:

$$Z_n = \frac{\overline{X}_n - \mu}{\sigma / \sqrt{n}}$$

Then the CLT states that as $n \to \infty$, $Z_n$ converges to the standard normal $Z \sim N(0, 1)$, that is:

$$\lim_{n \to \infty} P(Z_n \le a) = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx$$

A simpler version of this theorem is the following, which has immense consequences for the development of sampling theory (which is our next topic in CS 237):

As $n$ gets large, the random variable $\overline{X}_n$ converges to the distribution $N(\mu, \frac{\sigma^2}{n})$.

Equivalently, as $n$ gets large, the random variable $Z_n$ converges to the distribution $N(0, 1)$.

There are several crucial things to remember about the CLT:

1. The mean $\mu$ of $\overline{X}_n$ is the same as the mean of the $X_i$.

2. The standard deviation $\frac{\sigma}{\sqrt{n}}$ of $\overline{X}_n$ gets smaller as $n$ gets larger, and approaches 0 as $n$ approaches $\infty$.

3. The distributions of the $X_i$ do NOT matter at all; as long as they have a common mean and standard deviation, they can be completely different distributions. Typically, however, these are separate "pokes" of the same random variable.

4. We can use the strong properties of the normal distribution, such as the "68-95-99.7 rule," to quantify the randomness inherent in the sampling process, as in the example below. This will be the fundamental fact we will use in developing the various statistical procedures in elementary statistics.
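As a quick illustration of remark 4: if we take the mean of $n = 100$ samples from Uniform(0,1), then $\mu = 0.5$ and $\frac{\sigma}{\sqrt{n}} = \frac{\sqrt{1/12}}{10} \approx 0.0289$, so by the 68-95-99.7 rule about 95% of such sample means should fall within two standard deviations of the mean, i.e., in the interval $0.5 \pm 0.058$.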

In [3]:

# General useful imports
import numpy as np
from numpy import arange, linspace, round, unique
import matplotlib.pyplot as plt
from numpy.random import random, randint, uniform, choice, binomial, geometric, poisson, normal, exponential
from scipy.stats import norm
import math
from collections import Counter
import pandas as pd
%matplotlib inline

Experiments with the CLT

We will verify the CLT with several distributions:

- Uniform:     U ~ Uniform(0,1)   mean = 0.5    std dev = sqrt(1/12) ≈ 0.2887
- Bernoulli:   X ~ Bern(0.61)     mean = 0.61   std dev = sqrt(0.61 * 0.39) ≈ 0.4878
- Exponential: E ~ Exp(0.1)       mean = 10     std dev = 10
- Normal:      Y ~ N(66, 9)       mean = 66     std dev = 3
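Each of the four experiment cells below can follow the same pattern as display_sample_mean_uniform above; what follows is a minimal generic sketch of that pattern, histogramming sample means and overlaying the $N(\mu, \sigma^2/n)$ density predicted by the CLT. The helper names sample_mean and display_sample_mean and their signatures are illustrative, not from the notebook, and the sketch assumes the imports from In [3]:

# A generic version of display_sample_mean_uniform, usable for each
# experiment below; assumes the In [3] imports (plt, linspace, norm).
def sample_mean(sampler, n):
    # Mean of n independent draws from the given zero-argument sampler.
    return sum(sampler() for i in range(n)) / n

def display_sample_mean(sampler, n, num_trials, mu, sigma, title):
    # Histogram num_trials sample means, each the mean of n draws,
    # and overlay the N(mu, sigma^2/n) density predicted by the CLT.
    X = [sample_mean(sampler, n) for i in range(num_trials)]
    fig, ax = plt.subplots(1, 1, figsize=(12, 6))
    ax.hist(X, bins=50, density=True, edgecolor='k', alpha=0.5)
    xs = linspace(mu - 4*sigma/n**0.5, mu + 4*sigma/n**0.5, 200)
    ax.plot(xs, norm.pdf(xs, mu, sigma/n**0.5), 'r')   # scipy.stats.norm
    plt.title(title + ': Distribution of Sample Mean n = ' + str(n))
    plt.show()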


Experiment Three: Uniform Distribution

In [18]:
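A hedged sketch of what this cell could contain, using the helper above (the parameters n = 100 and num_trials = 1000 are assumptions):

# Uniform(0,1): mean 0.5, std dev sqrt(1/12); try n = 1, 2, 5, 10, 30, 100, 1000
display_sample_mean(lambda: uniform(0, 1), 100, 1000, 0.5, (1/12)**0.5, 'Uniform')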

Experiment Four: Bernoulli

In [23]:
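A similar sketch for the Bernoulli case (parameters assumed as above):

# Bern(0.61): mean 0.61, std dev sqrt(0.61*0.39); binomial(1, p) gives one Bernoulli draw
display_sample_mean(lambda: binomial(1, 0.61), 100, 1000, 0.61, (0.61*0.39)**0.5, 'Bernoulli')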

Experiment Five: Exponential


In [28]:
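A similar sketch for the exponential case (note that numpy's exponential takes the scale 1/lambda = 10):

# Exp(0.1): mean 10, std dev 10
display_sample_mean(lambda: exponential(10), 100, 1000, 10, 10, 'Exponential')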

Experiment Six: Normal

In [31]:
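And a similar sketch for the normal case (numpy's normal takes the standard deviation, here 3):

# N(66, 9): mean 66, std dev 3; here the sample mean is exactly N(66, 9/n)
display_sample_mean(lambda: normal(66, 3), 100, 1000, 66, 3, 'Normal')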