Inferring the Mean and Standard Deviation of a Population.

Transcript of Inferring the Mean and Standard Deviation of a Population.

Page 1: Inferring the Mean and Standard Deviation of a Population.

Inferring the Mean and Standard Deviation of a Population

Page 2: Inferring the Mean and Standard Deviation of a Population.

Central Problem

Two important numbers tell us a lot about a distribution of data:

Mean tells us the central tendency of the data

Standard deviation tells us the spread in the data

The problem is … we don’t normally know either of these and must infer them from an SRS (simple random sample) of the population

Page 3: Inferring the Mean and Standard Deviation of a Population.

Baby Paradox

Two hospitals in the same city deliver, on average, a 50:50 ratio of baby girls and baby boys. Hospital A delivers 120 babies a day (on average) while hospital B delivers 12 babies a day (on average). One day there were twice as many boys as girls born in one of the hospitals. In which hospital is this more likely to happen?
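The question can be checked with a short binomial calculation. This is only a sketch under simplifying assumptions that are not in the slide: each hospital delivers exactly its average number of babies that day, births are independent, and "twice as many boys as girls" is read as exactly two thirds boys. The function name `prob_exact_boys` is made up for illustration.

```python
from math import comb

def prob_exact_boys(n_babies, n_boys, p_boy=0.5):
    """Binomial probability of exactly n_boys boys among n_babies births."""
    return comb(n_babies, n_boys) * p_boy**n_boys * (1 - p_boy)**(n_babies - n_boys)

# "Twice as many boys as girls" means two thirds of the births are boys.
p_hospital_a = prob_exact_boys(120, 80)  # 80 boys, 40 girls
p_hospital_b = prob_exact_boys(12, 8)    # 8 boys, 4 girls

print(f"Hospital A (120 births): {p_hospital_a:.6f}")
print(f"Hospital B (12 births):  {p_hospital_b:.6f}")
```

The smaller hospital gives the larger probability, because proportions computed from small samples vary more.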

Page 4: Inferring the Mean and Standard Deviation of a Population.

Measuring the mean…

How do we know the mean of a population?

Answer: We can either measure every single member of the population or estimate the mean from a suitable SRS.

We will assume that the population is normally distributed, so the sample mean $\bar{X}$ has a normal distribution $N(\mu, \sigma/\sqrt{n})$
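A small simulation can make the sampling distribution of the mean concrete. The population parameters, sample size and number of replicates below are hypothetical values chosen for illustration; the sketch simply draws many samples from a normal population and compares the spread of the sample means with sigma divided by the square root of n.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

mu, sigma = 10.0, 2.0   # hypothetical population mean and standard deviation
n = 25                  # size of each simple random sample
replicates = 5000       # number of samples drawn

# One sample mean per replicate.
sample_means = [
    statistics.fmean([random.gauss(mu, sigma) for _ in range(n)])
    for _ in range(replicates)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = sigma / n ** 0.5

print("mean of sample means:", round(mean_of_means, 3))
print("sd of sample means:  ", round(sd_of_means, 3))
print("sigma / sqrt(n):     ", theoretical_se)
```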

Page 5: Inferring the Mean and Standard Deviation of a Population.

Standard Error and Standard Deviation

These are two distinct ideas:

Standard error measures the uncertainty in the measure of the mean

This depends on how you measure and on the sample size

Standard deviation measures the spread in the data

This is a property of the data set and does not change

The two are linked by $SE = \sigma/\sqrt{n}$ (estimated in practice by $s/\sqrt{n}$), so measuring one lets us estimate the other.

Page 6: Inferring the Mean and Standard Deviation of a Population.

Standard error is always less than standard deviation (for n > 1)

SE gets smaller as n grows

$\sigma$ does not change!

SE measures the uncertainty in location of mean

$\sigma$ measures spread in data
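The contrast between the two quantities can be tabulated directly. The value of sigma and the sample sizes here are assumed for illustration only; the sketch uses the standard relation SE = sigma / sqrt(n).

```python
import math

sigma = 0.005                         # hypothetical standard deviation, in cm
sample_sizes = [2, 5, 20, 100, 1000]  # hypothetical sample sizes

# The standard error shrinks with n while sigma stays fixed.
standard_errors = [sigma / math.sqrt(n) for n in sample_sizes]

for n, se in zip(sample_sizes, standard_errors):
    print(f"n = {n:5d}   sigma = {sigma:.4f}   SE = {se:.6f}")
```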

Page 7: Inferring the Mean and Standard Deviation of a Population.

t-Distributions

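If we know $\sigma$ then setting a confidence interval on how well our sample mean $\bar{X}$ measures the true mean $\mu$ is easy:

$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

But if we don’t know $\sigma$ then we estimate it with the sample standard deviation $s$ and use the t-distribution:

$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$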

Page 8: Inferring the Mean and Standard Deviation of a Population.

Closer look at t-distributions

The t-distribution looks very much like the Normal distribution and as the number of degrees of freedom (df) gets large the two become indistinguishable

t-distribution tables are used much the same way as N(0,1) – major difference is the df value

$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$
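To see how the t-distribution approaches the Normal as df grows, one can compare the two density curves numerically. The sketch below uses the standard formula for the Student t density; the grid, the df values and the function names `t_pdf` and `normal_pdf` are choices made for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the standard Normal distribution N(0,1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

grid = [i / 10 for i in range(-50, 51)]  # x from -5 to 5 in steps of 0.1
dfs = (1, 5, 30, 1000)

# Largest vertical gap between the two curves for each df.
max_gaps = [max(abs(t_pdf(x, df) - normal_pdf(x)) for x in grid) for df in dfs]

for df, gap in zip(dfs, max_gaps):
    print(f"df = {df:4d}   largest gap between t and N(0,1) densities: {gap:.5f}")
```

The gap shrinks steadily as df increases, which is why the bottom rows of a t table agree with the Normal table.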

Page 9: Inferring the Mean and Standard Deviation of a Population.

Example…

You are inspecting a shipment of 10 000 precision machined rods to be used in an engine assembly plant. You select a random sample of 20 and measure the diameters. You find that the average diameter of the sample is 5.465 cm with a standard deviation in the measurements of 0.005 cm. It is critical that the diameters do not exceed 5.471 cm. You are willing to accept a 1% failure rate. Should you accept the shipment?

Page 10: Inferring the Mean and Standard Deviation of a Population.

Solution: This would be an example of a 1-tailed t-test: $\alpha = 0.01$, $t_{19,\,0.01} = 2.539$

A 1% failure rate looks like this:

Page 11: Inferring the Mean and Standard Deviation of a Population.

Test the numbers…

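$\bar{X} = 5.465$ cm

$\mu = 5.471$ cm

$s = 0.005$ cm

$t = \frac{5.471 - 5.465}{0.005/\sqrt{19}} = 5.231$

(Here the standard error is computed with $\sqrt{n-1} = \sqrt{19}$; using $\sqrt{n} = \sqrt{20}$ gives $t \approx 5.37$ and the same conclusion.)

Since 5.231 is well above the critical value 2.539, this implies that 99.998% of the sample will not exceed the threshold diameter

Accept!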
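The arithmetic for the rod example can be reproduced in a few lines. The numbers come from the example; the variable names are illustrative, and the critical value 2.539 is taken from the slide rather than computed. Both the slide's divisor, sqrt(19), and the more common sqrt(20) are shown, as an assumption-labelled comparison.

```python
import math

x_bar = 5.465       # sample mean diameter, cm
threshold = 5.471   # maximum allowable diameter, cm
s = 0.005           # sample standard deviation, cm
n = 20              # sample size
t_critical = 2.539  # one-tailed critical value, df = 19, alpha = 0.01 (slide value)

# The slide's arithmetic divides s by sqrt(n - 1) = sqrt(19).
t_slide = (threshold - x_bar) / (s / math.sqrt(n - 1))
# The more common convention divides s by sqrt(n) = sqrt(20).
t_sqrt_n = (threshold - x_bar) / (s / math.sqrt(n))

accept = t_slide > t_critical and t_sqrt_n > t_critical

print(f"t with sqrt(19): {t_slide:.3f}")
print(f"t with sqrt(20): {t_sqrt_n:.3f}")
print("accept shipment:", accept)
```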

Page 12: Inferring the Mean and Standard Deviation of a Population.

Two-tailed t-Tests

In the previous example we looked at whether or not the diameter was less than a maximum allowable value. Just as we have done earlier with confidence intervals, we can also specify a maximum allowable range (“plus or minus”) for our mean.

Let’s find the 95% confidence interval for the mean diameter that is implied by our measurement

Use following formula:

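$\bar{x} - t_{n-1,\,\alpha/2}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{n-1,\,\alpha/2}\,\frac{s}{\sqrt{n}}$

Margin of error: $t_{n-1,\,\alpha/2}\,\frac{s}{\sqrt{n}}$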

Page 13: Inferring the Mean and Standard Deviation of a Population.

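We measured the mean diameter as 5.465 cm, s = 0.005 cm, so the upper and lower margins are:

Margin of error $= (2.093)\,\frac{0.005}{\sqrt{19}} = 0.0024$ cm

(With $\sqrt{20}$ in place of $\sqrt{19}$ the margin is 0.0023 cm and the interval is unchanged to three decimals.)

We can be 95% confident that the mean diameter of the parts is in the range (5.463, 5.467) cm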
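The margin of error and interval for the rods can be checked as follows. This is a sketch using the slide's own numbers, including its divisor of sqrt(19); the critical value 2.093 is the table value quoted in the slide, not something computed here.

```python
import math

x_bar = 5.465   # sample mean diameter, cm
s = 0.005       # sample standard deviation, cm
t_crit = 2.093  # two-tailed critical value, df = 19, 95% confidence (slide value)

# The slide's arithmetic uses sqrt(19); sqrt(20) is the more common choice.
margin = t_crit * s / math.sqrt(19)
lower, upper = x_bar - margin, x_bar + margin

print(f"margin of error: {margin:.4f} cm")
print(f"95% interval:    ({lower:.3f}, {upper:.3f}) cm")
```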

Page 14: Inferring the Mean and Standard Deviation of a Population.

Example 7.9

Plot data. Identify variables, etc.:

df = (50 - 1) = 49, $\alpha$ = 0.05, $\bar{x}$ = 23.56, s = 12.52, t = 2.009

$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$, with $\mu$ unknown

Interval = (20.00, 27.12)
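The interval quoted for this example follows from the listed values. The sketch below assumes n = 50 (implied by df = 49) and uses the critical value 2.009 as given in the slide.

```python
import math

n = 50          # sample size implied by df = 49
x_bar = 23.56   # sample mean
s = 12.52       # sample standard deviation
t_crit = 2.009  # critical value quoted in the slide for a 95% interval

margin = t_crit * s / math.sqrt(n)
interval = (x_bar - margin, x_bar + margin)

print(f"margin of error: {margin:.3f}")
print(f"95% interval:    ({interval[0]:.2f}, {interval[1]:.2f})")
```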

Page 15: Inferring the Mean and Standard Deviation of a Population.

Example of a Matched Pairs t-test: Exercise 7.40

Formulate appropriate hypotheses:

$H_0$: no difference

$H_a$: LH > RH

Re-arrange data: find $\bar{X}$ and $s$ of the paired differences (see next page)

Page 16: Inferring the Mean and Standard Deviation of a Population.

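$H_0$: $\mu = 0$

df = 25 - 1 = 24

Find $t = \frac{\bar{X}}{s/\sqrt{n}} = 2.844$

Use Excel =tdist(t, df, #tails) or use Table D

The P-value is only 0.004: if the null hypothesis were true, a t this large would arise only about 0.4% of the time

LH thread takes longer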
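For readers without Excel or Table D, the one-tailed P-value can be approximated by integrating the t density numerically. This is a stand-in sketch, not the slide's method: `t_pdf` and `t_upper_tail` are hypothetical helper names, and Simpson's rule with 2000 steps is an arbitrary choice for the integration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the probability lies above 0; subtract the area between 0 and t.
    return 0.5 - total * h / 3

t_stat = 2.844  # test statistic from the slide
df = 24         # 25 pairs minus 1

p_one_tailed = t_upper_tail(t_stat, df)
print(f"one-tailed P-value: {p_one_tailed:.4f}")
```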

Page 17: Inferring the Mean and Standard Deviation of a Population.

Robustness…

A statistical test is considered robust if:

It is insensitive to deviations from the original assumptions being made. This could include smaller sample size or deviation from normality

Page 18: Inferring the Mean and Standard Deviation of a Population.

Rules of thumb – When to use the t-test

• Small sample sizes (n < 15) and close to normal

• Mid range sample size (n ≥ 15) as long as distribution not strongly skewed and no outliers

• Large sample size (n > 40) even if skewed or with some outliers

Fine print: Rules of thumb do not obviate the need to always inspect your data! Stemplots or histograms give you insight into just how “skewed” or “outlier-riddled” is your data. Always know what the data set looks like before applying tests.
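The three rules of thumb can be written as a small decision helper. This is only an illustrative sketch: the function name and its boolean arguments are hypothetical, the first rule is read as n below 15, and the judgments about normality, skewness and outliers still have to come from inspecting a plot of the data.

```python
def t_test_reasonable(n, close_to_normal, strongly_skewed, has_outliers):
    """Hypothetical helper encoding the rules of thumb for using the t-test."""
    if n > 40:
        # Large sample: acceptable even if skewed or with some outliers.
        return True
    if n >= 15:
        # Mid-range sample: needs no strong skew and no outliers.
        return not strongly_skewed and not has_outliers
    # Small sample (assumed here to mean n < 15): data must be close to normal.
    return close_to_normal

print(t_test_reasonable(10, close_to_normal=True, strongly_skewed=False, has_outliers=False))
print(t_test_reasonable(25, close_to_normal=False, strongly_skewed=True, has_outliers=False))
print(t_test_reasonable(60, close_to_normal=False, strongly_skewed=True, has_outliers=True))
```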

Page 19: Inferring the Mean and Standard Deviation of a Population.

In conclusion…

Read 7.1 carefully – we skipped over some terms and discussions of applicability of the t-test

Be sure you understand when (and why) we need the t-test

Know the difference between standard deviation and standard error

Try: 7.4, 7.12, 7.26, 7.42
