The standard error of the sample mean and confidence intervals How far is the average sample mean...

  • date post

    19-Dec-2015

Page 1: The standard error of the sample mean and confidence intervals How far is the average sample mean from the population mean? In what interval around mu.

The standard error of the sample mean and confidence intervals

How far is the average sample mean from the population mean?

In what interval around mu can we expect to find 95% or 99% of sample means?


An introduction to random samples

• When we speak about samples in statistics, we are talking about random samples.

• Random samples are samples that are obtained in line with very specific rules.

• If those rules are followed, the sample will be representative of the population from which it is drawn.


Random samples: Some principles

• In a random sample, each and every score must have an equal chance of being chosen each time you add a score to the sample.

• Thus, the same score can be selected more than once, simply by chance. (This is called sampling with replacement.)

• The number of scores in a sample is called “n.” (Small n, not capital N.)

• Sample statistics based on random samples provide least-squares, unbiased estimates of their population parameters.


The first way a random sample is representative of its population

• One way a random sample will be representative of the population is that the sample mean will be a good estimate of the population mean.

• Sample means are better estimates of mu than are individual scores.

• Thus, on the average, sample means are closer to mu than are individual scores.


The variance and the standard deviation are the basis for the rest of this chapter.

• In Chapter 1 you learned to compute the average squared distance of individual scores from mu. We called it the variance.

• Taking a square root, you got the standard deviation.

• Now we are going to ask a slightly different question and transform the variance and standard deviation in another way.


As you add scores to a random sample

• Each randomly selected score tends to correct the sample mean back toward mu.

• If we have several samples drawn from a single population, as we add scores to each sample, each sample mean gets closer to mu.

• Since the sample means are all getting closer to mu, they will also be getting closer to each other.


As you add scores to a random sample – larger vs. smaller samples

• The larger the random samples, the closer their means will be to mu, on the average.

• Therefore: The larger the random samples, the closer their means will be to each other, on the average.
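A small simulation can make this concrete. The sketch below is an illustration only and is not part of the original slides: it assumes a normally distributed population with a mean of 72 and a standard deviation of 12, and the function name `mean_distance_from_mu` is hypothetical.

```python
import random

def mean_distance_from_mu(n, mu=72.0, sigma=12.0, n_samples=2000, seed=0):
    """Average absolute distance between the means of random samples (size n) and mu.

    Assumption for this sketch: scores come from a normal population.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        total += abs(sample_mean - mu)
    return total / n_samples

# Larger samples should give means that sit closer to mu, on the average.
for n in (1, 4, 16, 64):
    print(n, round(mean_distance_from_mu(n), 2))
```

The exact numbers depend on the seed, but the distances should shrink steadily as n grows.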


Let’s see how that happens

Population is 1320 students taking a test.

mu is 72.00, sigma = 12

Let’s randomly sample one student at a time and see what happens. We’ll create a random sample with 8 students’ scores in the sample.


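[Figure: frequency distribution of test scores. Score axis: 36, 48, 60, 72, 84, 96, 108, corresponding to 3, 2, 1, 0, 1, 2, 3 standard deviations from the mean.]

Sample scores: 102, 72, 66, 76, 66, 78, 69, 63

Means (after the second through eighth scores): 87, 80, 79, 76.4, 76.7, 75.6, 74.0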
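The running means on this slide can be reproduced in a few lines. This is a minimal sketch using the eight sample scores listed above; the variable names are chosen for illustration.

```python
from itertools import accumulate

scores = [102, 72, 66, 76, 66, 78, 69, 63]  # sample scores from the slide

# Running total after each draw, divided by the number of scores drawn so far.
running_means = [total / k for k, total in enumerate(accumulate(scores), start=1)]

for k, m in enumerate(running_means, start=1):
    print(f"n={k}: mean={m:.1f}")
```

The first value is simply the first score (102); from the second score on, the means match the ones on the slide and drift back toward mu = 72.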


How much closer to mu does the sample mean get when you increase n, the size of the sample? (1)

• The average squared distance of individual scores from mu, the population mean, is called the variance. You learned to compute it in Chapter 1.

• The symbol for the mean of a sample is the letter X with a bar over it. We will write that as X-bar or $\bar{X}$.


How much closer to mu does the sample mean get when you increase n, the size of the sample? (2)

• The average squared distance of sample means from mu is the average squared distance of individual scores from mu divided by n, the size of the sample.

• Let’s put that in a formula

• $\sigma^2_{\bar{X}} = \sigma^2 / n$
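For readers who want to see where this formula comes from, here is a brief sketch of the usual derivation. It assumes the n scores $X_1, \dots, X_n$ are drawn independently from the same population with variance $\sigma^2$, which is what sampling with replacement gives us. The $\operatorname{Var}$ notation is not used in the slides and is introduced here only for this sketch.

```latex
\begin{aligned}
\sigma^2_{\bar{X}}
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  &= \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i) \\
  &= \frac{1}{n^2}\cdot n\,\sigma^2
   = \frac{\sigma^2}{n}
\end{aligned}
```

The first step uses independence (variances of independent scores add) and the fact that multiplying by 1/n scales a variance by 1/n².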


Then take a square root to obtain the standard error of the mean, the average unsquared distance of sample means from mu.

• This is important. So here are several definitions of the standard error of the mean. They are all correct, but ignore any that confuse you!

$\sigma_{\bar{X}} = \sigma / \sqrt{n}$


The standard error of the sample mean

• As you know, the square root of the variance is called the standard deviation. It is the average unsquared distance of individual scores from mu.

• The average unsquared distance of sample means from mu is the square root of $\sigma^2_{\bar{X}}$.

• The square root of $\sigma^2_{\bar{X}}$ is $\sigma_{\bar{X}}$.

$\sigma_{\bar{X}}$ is called the standard error of the sample mean or, more briefly, the standard error of the mean. Here are the formulae:

$\sigma^2_{\bar{X}} = \sigma^2 / n$

(Then, to get the standard error of the mean, take the square root of $\sigma^2 / n$.)

$\sigma_{\bar{X}} = \sigma / \sqrt{n}$
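The formula translates directly into code. This is a minimal sketch; the function name `standard_error` is chosen here for illustration and does not come from the slides.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return sigma / math.sqrt(n)

# Test-score population from earlier (sigma = 12) with a sample of 8 scores.
print(round(standard_error(12, 8), 2))
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.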


The standard error of the mean

• Let’s translate the formula into English, just to be sure you understand it. Here is the formula again: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$

• In English: The average (unsquared) distance of sample means from mu equals the ordinary standard deviation divided by the square root of the sample size.


Still another way to say the same thing: The standard error of the mean is the standard deviation of the means of random samples of a specific size (n).

This last definition sometimes confuses people. If it confuses you, just remember: The standard error of the mean is the average unsquared distance of sample means from mu.


Let’s check and make sure that the formula is correct. Let’s see that the standard error equals the ordinary standard deviation divided by the square root of n. To do that, let’s start with a tiny population: N=5

• Here are all the scores in a population: 1, 3, 5, 7, 9.

• The scores in this population form a perfectly rectangular distribution.

• Mu = 5.00


Figure 4.5: Scores of the 5 research participants in this population.

[Frequency plot: each of the scores 1.00, 3.00, 5.00, 7.00, and 9.00 occurs once; the score axis runs from 1.00 to 9.00.]


Computing sigma

• $SS = (1-5)^2 + (3-5)^2 + (5-5)^2 + (7-5)^2 + (9-5)^2 = 40$

• $\sigma^2 = SS/N = 40/5 = 8.00$

• $\sigma = \sqrt{8.00} = 2.83$
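The same three steps can be checked in code. This is a small sketch of the computation above; the variable names are chosen for illustration.

```python
import math

population = [1, 3, 5, 7, 9]
N = len(population)

mu = sum(population) / N                      # population mean
ss = sum((x - mu) ** 2 for x in population)   # sum of squared distances from mu
variance = ss / N                             # divide by N: this is the whole population
sigma = math.sqrt(variance)

print(mu, ss, variance, round(sigma, 2))
```

Python's standard library offers `statistics.pvariance` and `statistics.pstdev` for the same population formulas.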


If we did compute a standard deviation of sample means from mu, it should give the same result as the formula

• Let’s see if it does.

• We can only do all the computations if we have a very small population and an even tinier sample.

• Let’s use the population of 5 scores we just looked at (sigma = 2.83). We’ll look at all the samples with n = 2.

• If the formula is right, the average unsquared distance of sample means from mu, n=2, should be $2.83/\sqrt{2} = 2.83/1.414 = 2.00$.

• Is that right? To find out, let’s compute the standard error of the mean from the differences between the means of all possible samples (n=2) and mu.


Table 4.10: List of all 25 possible samples (n=2) of scores from the tiny population of five scores (1, 3, 5, 7, & 9) and direct computation of the standard error of these means. In this case, the standard error is computed from these means (n=2) just as we would the standard deviation of an ordinary set of scores (n=1).

Sample   Scores   X-bar      Sample   Scores   X-bar
AA       1,1      1.00       DA       7,1      4.00
AB       1,3      2.00       DB       7,3      5.00
AC       1,5      3.00       DC       7,5      6.00
AD       1,7      4.00       DD       7,7      7.00
AE       1,9      5.00       DE       7,9      8.00
BA       3,1      2.00       EA       9,1      5.00
BB       3,3      3.00       EB       9,3      6.00
BC       3,5      4.00       EC       9,5      7.00
BD       3,7      5.00       ED       9,7      8.00
BE       3,9      6.00       EE       9,9      9.00
CA       5,1      3.00
CB       5,3      4.00
CC       5,5      5.00
CD       5,7      6.00
CE       5,9      7.00

Summary statistics (all samples, n=2):
Sum of X-bar = 125.00
N = 25
mu = 5.00
$SS_{\bar{X}}$ = 100.00
$\sigma^2_{\bar{X}}$ = 4.00
$\sigma_{\bar{X}}$ = 2.00
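The whole table can be regenerated by brute force. The sketch below lists every ordered sample of two scores drawn with replacement, computes the standard deviation of their means directly, and compares it with sigma divided by the square root of n. The variable names are chosen for illustration.

```python
import math
from itertools import product

population = [1, 3, 5, 7, 9]
n = 2

# All ordered samples of size n, drawn with replacement: 5**2 = 25 of them.
samples = list(product(population, repeat=n))
means = [sum(s) / n for s in samples]

mu = sum(population) / len(population)

# Direct route: treat the 25 means like an ordinary set of scores.
ss_means = sum((m - mu) ** 2 for m in means)
var_means = ss_means / len(means)
se_direct = math.sqrt(var_means)

# Formula route: sigma divided by the square root of n.
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / len(population))
se_formula = sigma / math.sqrt(n)

print(len(samples), sum(means), ss_means, var_means, se_direct)
print(round(se_formula, 2))
```

Changing `n` to 3 enumerates 125 samples instead, and the two routes should still agree.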


Means of all possible 25 samples (n=2) from this population of scores 1, 3, 5, 7, and 9.

Score (sample mean): 1.00  2.00  3.00  4.00  5.00  6.00  7.00  8.00  9.00
Frequency:           1     2     3     4     5     4     3     2     1


Table 4.10 (repeated from above, now with the variance computation shown): $\sigma^2_{\bar{X}} = SS_{\bar{X}}/N = 100/25 = 4.00$, so $\sigma_{\bar{X}} = \sqrt{4.00} = 2.00$.


The standard error = the standard deviation divided by the square root of n, the sample size

• The formula works. And it works every time.


Let’s see what $\sigma_{\bar{X}}$ can tell us

• We know that the mean of SAT/GRE scores = 500 and sigma = 100

• So 68.26% of individuals will score between 400 and 600 and 95.44% will score between 300 and 700

• REMEMBER THAT SAMPLE MEANS FALL CLOSER TO MU, ON THE AVERAGE, THAN DO INDIVIDUAL SCORES.


What happens when we take random samples of SAT scores with n=4?

• The standard error of the mean is sigma divided by the square root of the sample size = 100/2 = 50.

• SAMPLE MEANS FALL INTO A NORMAL CURVE EVEN BETTER THAN INDIVIDUAL SCORES

• 68.26% of the sample means (n=4) will be within 1.00 standard error of the mean from mu and 95.44% will be within 2.00 standard errors of the mean from mu

• So, 68.26% of the sample means (n=4) will be between 450 and 550 and 95.44% will fall between 400 and 600


Let’s make the samples larger

• If we take random samples of SAT scores with 400 people in each sample, the standard error of the mean is sigma divided by the square root of 400 = 100/20 = 5.00

• 68.26% of the sample means will be within 1.00 standard error of the mean from mu and 95.44% will be within 2.00 standard errors of the mean from mu.

• So, 68.26% of the sample means (n=400) will be between 495 and 505 and 95.44% will fall between 490 and 510.

• See how sample means get closer and closer to mu as sample size increases!
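The ranges on the last few slides can be tabulated side by side. This sketch assumes, as the slides do, that scores and sample means follow a normal curve; the helper name `range_around_mu` is hypothetical.

```python
import math
from statistics import NormalDist

MU, SIGMA = 500.0, 100.0  # SAT/GRE values used in the slides

def range_around_mu(n, k):
    """Interval mu +/- k standard errors for means of samples of size n."""
    se = SIGMA / math.sqrt(n)
    return MU - k * se, MU + k * se

for n in (1, 4, 400):          # n=1 is the case of individual scores
    for k in (1, 2):
        low, high = range_around_mu(n, k)
        # Share of a normal curve lying within k standard errors of its mean.
        share = NormalDist().cdf(k) - NormalDist().cdf(-k)
        print(f"n={n:3d}  within {k} SE: {low:.0f} to {high:.0f}  ({share:.2%})")
```

The computed shares come out as 68.27% and 95.45%. The slides' 68.26% and 95.44% differ only in the last digit, which is what you get from doubling the four-decimal table values 0.3413 and 0.4772.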


The Central Limit Theorem


What happens as n increases?

• The sample means get closer to each other and to mu.

• Their average squared distance from mu equals the variance divided by the size of the sample.

• Therefore, their average unsquared distance from mu (which is called the standard error of the mean) equals the standard deviation divided by the square root of the size of the sample.

• The sample means fall into a more and more perfect normal curve.

• These facts are called “The Central Limit Theorem” and can be proven mathematically.
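A simulation can illustrate the last point. The sketch below is an assumption-laden illustration, not the slides' method: it draws from a continuous rectangular (uniform) population on the interval 0 to 1, which is far from normal, and counts how many sample means land within 1.960 standard errors of mu. If the means fall into a normal curve, that share should approach 95%. The function name `share_within` is hypothetical.

```python
import math
import random

def share_within(n, z=1.960, n_samples=20000, seed=1):
    """Fraction of sample means (size n) within z standard errors of mu.

    Assumed population for this sketch: uniform on [0, 1],
    so mu = 0.5 and sigma = sqrt(1/12).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    mu, sigma = 0.5, math.sqrt(1 / 12)
    se = sigma / math.sqrt(n)
    hits = 0
    for _ in range(n_samples):
        mean = sum(rng.random() for _ in range(n)) / n
        if abs(mean - mu) <= z * se:
            hits += 1
    return hits / n_samples

for n in (1, 2, 30):
    print(n, share_within(n))
```

With n=1 every score lies within 1.960 sigma of mu, because a rectangular distribution has no tails; by n=30 the share should sit close to 0.95, as a normal curve predicts.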


CONFIDENCE INTERVALS


We want to define two intervals around mu:

One interval into which 95% of the sample means will fall.

Another interval into which 99% of the sample means will fall.


95% of sample means will fall in a symmetrical interval around mu that goes from 1.960 standard errors below mu to 1.960 standard errors above mu

• A way to write that fact in statistical language is:

CI.95: $\mu \pm 1.960\,\sigma_{\bar{X}}$ or

CI.95: $\mu - 1.960\,\sigma_{\bar{X}} < \bar{X} < \mu + 1.960\,\sigma_{\bar{X}}$
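The multiplier 1.960 comes from the normal curve, and so does the multiplier for the 99% interval. A short sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the helper name `critical_z` is hypothetical.

```python
from statistics import NormalDist

def critical_z(coverage):
    """z such that the central `coverage` share of a normal curve lies within +/- z."""
    tail = (1 - coverage) / 2              # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)  # z with that much area above it

print(round(critical_z(0.95), 3))
print(round(critical_z(0.99), 3))
```

Rounded to three decimals, these are 1.960 for the 95% interval and 2.576 for the 99% interval.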


As I said, 95% of sample means will fall in a symmetrical interval around mu that goes from 1.960 standard errors below mu to 1.960 standard errors above mu

• Take samples of SAT/GRE scores (n=400)

• Standard error of the mean is sigma divided by the square root of n = $100/\sqrt{400}$ = 100/20.00 = 5.00

• 1.960 standard errors of the mean with such samples = 1.960 (5.00) = 9.80

• So 95% of the sample means with n=400 can be expected to fall in the interval 500 ± 9.80

• 500 - 9.80 = 490.20 and 500 + 9.80 = 509.80

CI.95: $\mu \pm 1.960\,\sigma_{\bar{X}}$ = 500 ± 9.80 or

CI.95: 490.20 < X-bar < 509.80


99% of sample means will fall within 2.576 standard errors from mu

• Take the same samples of SAT/GRE scores (n=400)

• The standard error of the mean is sigma divided by the square root of n=100/20.00=5.00

• 2.576 standard errors of the mean with such samples = 2.576 (5.00) = 12.88

• So 99% of the sample means can be expected to fall in the interval 500 ± 12.88

• 500 - 12.88 = 487.12 and 500 + 12.88 = 512.88

CI.99: $\mu \pm 2.576\,\sigma_{\bar{X}}$ = 500 ± 12.88 or

CI.99: 487.12 < the sample mean < 512.88


Notice that the 99% CI includes the 95% CI. With n=400, mu=500.00, sigma=100.00

CI.95: 490.20 < X-bar < 509.80

CI.99: 487.12 < X-bar < 512.88
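Both intervals follow from one small function. This is a sketch using the multipliers 1.960 and 2.576 from the slides; the function name `interval_for_sample_means` is hypothetical.

```python
import math

def interval_for_sample_means(mu, sigma, n, z):
    """Interval mu +/- z standard errors for means of samples of size n."""
    half_width = z * sigma / math.sqrt(n)
    return mu - half_width, mu + half_width

low95, high95 = interval_for_sample_means(500, 100, 400, 1.960)
low99, high99 = interval_for_sample_means(500, 100, 400, 2.576)

print(f"CI.95: {low95:.2f} < X-bar < {high95:.2f}")
print(f"CI.99: {low99:.2f} < X-bar < {high99:.2f}")
```

The upper 95% limit is 500 + 9.80 = 509.80, and the 99% interval is wider on both sides, so it contains the 95% interval.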


You do one

• Using SAT scores, with n=2500:

• Into what interval should 95% of the sample means fall?


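Interval around mu

• 95% of the sample means should fall within 1.960 standard errors of the mean from mu.

• The standard error of the mean is sigma divided by the square root of n = 100/50 = 2.00. Given that $\sigma_{\bar{X}}$ = 2.00, you multiply 1.960 × $\sigma_{\bar{X}}$ = 1.960 × 2.00 = 3.92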


Thus:

• 95% of the sample means should fall in an interval that goes 3.92 points in both directions around mu

• 500 – 3.92 = 496.08

• 500 + 3.92 = 503.92

• So 95% of sample means (n=2500) should fall between 496.08 and 503.92

• CI.95: 496.08 < X-bar < 503.92