Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

Page 1: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

Page 2: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

A Sampling Distribution

We are moving from descriptive statistics to inferential statistics.

Inferential statistics allow the researcher to come to conclusions about a population on the basis of descriptive statistics about a sample.

Page 3: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

A Sampling Distribution

For example:

Your sample says that a candidate gets support from 47% of the people sampled.

Inferential statistics allow you to say that the candidate gets support from 47% of the population with a margin of error of +/- 4%.

This means that the support in the population is likely somewhere between 43% and 51%.

Page 4: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

A Sampling Distribution

Margin of error is taken directly from a sampling distribution. It looks like this:

[Figure: normal curve centered on your sample mean of 47%, with 95% of possible sample means lying between 43% and 51%.]

Page 5: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

A Sampling Distribution

Let’s create a sampling distribution of means…

Take a sample of size 1,500 from the US. Record the mean income. Our census said the mean is $30K.

[Figure: income axis with the census mean of $30K marked.]

Page 6: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

A Sampling Distribution

Let’s create a sampling distribution of means…

Take another sample of size 1,500 from the US. Record the mean income. Our census said the mean is $30K.

[Figure: income axis with the census mean of $30K marked.]

Page 11: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

A Sampling Distribution

Let’s create a sampling distribution of means…

Let’s keep drawing samples of size 1,500 from the US, recording the mean income of each one. Our census said the mean is $30K.

[Figure: income axis with the census mean of $30K marked.]

Page 14: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

A Sampling Distribution

Let’s create a sampling distribution of means…

Let’s keep drawing samples of size 1,500 from the US, recording the mean income of each one. Our census said the mean is $30K.

The sample means would stack up in a normal curve: a normal sampling distribution.

[Figure: sample means stacked in a normal curve centered at $30K.]

Page 15: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

A Sampling Distribution

Say that the standard deviation of this distribution is $10K.

Think back to the empirical rule. What are the odds that you would get a sample mean that is more than $20K off?

The sample means would stack up in a normal curve: a normal sampling distribution.

[Figure: normal curve centered at $30K, with z-scores from -3 to 3 on the horizontal axis.]

Page 16: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

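A Sampling Distribution

Say that the standard deviation of this distribution is $10K.

Think back to the empirical rule. What are the odds that you would get a sample mean that is more than $20K off?

The sample means would stack up in a normal curve: a normal sampling distribution.

By the empirical rule, about 95% of sample means fall within 2 standard deviations ($20K) of $30K, so about 5% are more than $20K off: 2.5% in each tail.

[Figure: normal curve centered at $30K, with z-scores from -3 to 3 on the horizontal axis and 2.5% marked in each tail.]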
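The empirical rule rounds the answer to 5%. As a rough check, the sketch below computes the two-tailed probability from the normal curve directly, using the slide's values ($30K mean, $10K standard deviation). The function name `prob_more_than_off` is made up for this illustration and is not part of the original slides.

```python
from statistics import NormalDist

# Values from the slide: sampling distribution centered at $30K with sd $10K.
MEAN = 30_000
SD = 10_000
CUTOFF = 20_000  # "more than $20K off" is 2 standard deviations away

def prob_more_than_off(distance, mean=MEAN, sd=SD):
    """Two-tailed probability that a sample mean lands more than `distance` from `mean`."""
    dist = NormalDist(mu=mean, sigma=sd)
    lower_tail = dist.cdf(mean - distance)      # area below mean - distance
    upper_tail = 1 - dist.cdf(mean + distance)  # area above mean + distance
    return lower_tail + upper_tail

p_off = prob_more_than_off(CUTOFF)
print(f"P(sample mean more than $20K off) = {p_off:.4f}")
```

The exact normal-curve value is a little under 5% (roughly 4.6%), which is why the empirical rule's "95% within 2 standard deviations" is described as approximate.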

Page 17: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

Central Limit Theorem (CLT)

Page 18: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

Central Limit Theorem: As the sample size increases, the sampling distribution of the sample mean approaches a normal distribution with a mean equal to the population mean and a standard deviation equal to the population standard deviation divided by the square root of n (the sample size).

N(μ, σ/√n): a normal distribution with mean μ and sd σ/√n
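One way to see the theorem at work is a small simulation. The sketch below is only an illustration under assumed settings: the population is exponential (chosen because it is clearly skewed, with mean 1 and standard deviation 1), and the sample size and number of repetitions are arbitrary. None of these choices come from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN = 1.0   # an exponential population with rate 1 has mean 1 ...
POP_SD = 1.0     # ... and standard deviation 1 (a skewed, non-normal shape)
N = 100          # sample size (hypothetical)
REPS = 5_000     # number of repeated samples (hypothetical)

def sample_mean(n):
    """Draw one sample of size n from the population and return its mean."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

means = [sample_mean(N) for _ in range(REPS)]

observed_mean = statistics.fmean(means)   # should be close to POP_MEAN
observed_sd = statistics.stdev(means)     # should be close to POP_SD / sqrt(N)
predicted_sd = POP_SD / N ** 0.5

print(f"mean of sample means: {observed_mean:.3f} (population mean {POP_MEAN})")
print(f"sd of sample means:   {observed_sd:.3f} (sigma/sqrt(n) = {predicted_sd:.3f})")
```

Even though the individual values are skewed, the sample means should cluster around the population mean with a spread near σ/√n.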

Page 19: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

Variability in Sampling Distribution

Knowing the likely variability of the sample means from repeated sampling gives us a context within which to judge how much we can trust the number we got from our sample.

For example, if the variability is low, we can trust our number more than if the variability is high.

Page 20: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

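An Example:

A population’s car values are μ = $12K with σ = $4K.

Which sampling distribution is for sample size 625 and which is for 2500? What are their s.e.’s (standard errors)?

[Figure: two normal sampling distributions centered at $12K, one wider than the other, each with z-scores from -3 to 3 and the middle 95% of M’s marked; the endpoints of each 95% range are shown as question marks.]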

Page 21: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

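An Example:

A population’s car values are μ = $12K with σ = $4K.

Which sampling distribution is for sample size 625 and which is for 2500? What are their s.e.’s?

For n = 625: s.e. = $4K/25 = $160 (√625 = 25)

For n = 2500: s.e. = $4K/50 = $80 (√2500 = 50)

[Figure: two normal sampling distributions centered at $12K, one wider than the other, each with z-scores from -3 to 3 and the middle 95% of M’s marked; the endpoints of each 95% range are shown as question marks.]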
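The arithmetic for this example can be sketched in a few lines. The helper name `standard_error` is made up for this illustration; the range it prints uses the "+/- 2 s.e." rule of thumb for the middle 95% of sample means.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

MU = 12_000    # population mean car value from the example
SIGMA = 4_000  # population standard deviation from the example

for n in (625, 2500):
    se = standard_error(SIGMA, n)
    low, high = MU - 2 * se, MU + 2 * se  # approximate middle 95% of sample means
    print(f"n = {n}: s.e. = ${se:,.0f}; about 95% of M's between ${low:,.0f} and ${high:,.0f}")
```

For n = 625 this gives a range of $11,680 to $12,320, and for n = 2500 a range of $11,840 to $12,160, so the narrower curve belongs to the larger sample.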

Page 22: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

Which sample will be more precise? If you get a particularly bad sample, which sample size will help you be sure that you are closer to the true mean?

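[Figure: two normal sampling distributions centered at $12K, one wider than the other, each with z-scores from -3 to 3 and the middle 95% of M’s marked; the endpoints of each 95% range are shown as question marks.]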

Page 23: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

So we know, in advance of ever collecting a sample, that if the sample size is sufficiently large:

Repeated samples would pile up in a normal distribution.

The sample means will center on the true population mean.

The standard error will be a function of the population variability and the sample size.

The larger the sample size, the more precise, or efficient, a particular sample is.

About 95% of all sample means will fall within +/- 2 s.e. of the population mean.
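These claims can be checked with a quick simulation. The sketch below borrows μ = $12K and σ = $4K from the car example, but the normal shape of the population, the sample size, and the number of repeated samples are assumptions made only for this illustration.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

MU, SIGMA = 12_000, 4_000  # population values borrowed from the car example
N = 100                    # sample size (hypothetical)
REPS = 2_000               # number of repeated samples (hypothetical)

se = SIGMA / math.sqrt(N)  # standard error = sigma / sqrt(n)

def one_sample_mean():
    # Assumption: individual car values are drawn from a normal population.
    return statistics.fmean([random.gauss(MU, SIGMA) for _ in range(N)])

means = [one_sample_mean() for _ in range(REPS)]

# Share of sample means that land within 2 standard errors of the population mean.
within = sum(1 for m in means if abs(m - MU) <= 2 * se)
coverage = within / REPS

print(f"center of sample means: ${statistics.fmean(means):,.0f}")
print(f"share within +/- 2 s.e.: {coverage:.1%}")
```

The share should come out close to 95%, and the center of the sample means close to $12K.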

Page 24: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

What proportion of US teens know that 1492 was the year in which Columbus “discovered” America? A Gallup Poll found that 210 out of a random sample of 501 American teens aged 13-17 knew this historically important date. The sample proportion:

p̂ = 210/501 ≈ 0.42

0.42 is the statistic that we use to gain information about the unknown population parameter p. We may estimate that about 42% of US teens know that Columbus discovered America in 1492.

Page 25: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

Sampling distribution of a sample proportion

p̂ = X/n = (count of successes in the sample) / (size of the sample)

The mean of the sampling distribution of p̂ is exactly p.

The standard deviation of the sampling distribution of p̂ is √(p(1-p)/n).
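As a small sketch of how this formula is used, the code below applies it to the Gallup numbers (210 out of 501). Because the true p is unknown there, the sample proportion is plugged in as an estimate; that substitution is an assumption of this illustration, and the function name is made up.

```python
import math

def sd_of_sample_proportion(p, n):
    """Standard deviation of the sampling distribution of p-hat: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

successes, n = 210, 501   # Gallup poll counts from the earlier slide
p_hat = successes / n     # sample proportion, about 0.42

# Assumption: p is unknown, so p_hat stands in for p in the formula.
se_hat = sd_of_sample_proportion(p_hat, n)

print(f"p-hat = {p_hat:.3f}, estimated sd of p-hat = {se_hat:.4f}")
```

The estimated standard deviation is about 0.022, so sample proportions from polls of this size would typically land within a few percentage points of the true value.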

Page 26: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

Applying to college

Normal calculation involving p̂

A polling organization asks an SRS (simple random sample) of 1500 1st year college students whether they applied for admission to any other college. In fact 35% of all the 1st year students applied to colleges besides the one they are attending. What is the probability that the random sample of 1500 students will give a result within 2 percentage points of this true value?

n = 1500

p = 0.35

Mean of p̂: μ = p = 0.35

Standard deviation of p̂: σ = √(p(1-p)/n) = √(0.35(1-0.35)/1500) = 0.0123
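To finish the calculation, the question becomes the probability that p̂ falls between 0.33 and 0.37. The sketch below assumes the normal approximation to the sampling distribution of p̂ is adequate for n = 1500; the variable names are chosen only for this illustration.

```python
import math
from statistics import NormalDist

p, n = 0.35, 1500                  # true proportion and sample size from the slide
sigma = math.sqrt(p * (1 - p) / n) # sd of p-hat, about 0.0123

# Normal approximation to the sampling distribution of p-hat (an assumption).
p_hat_dist = NormalDist(mu=p, sigma=sigma)

# "Within 2 percentage points of the true value" means 0.33 <= p-hat <= 0.37.
prob_within = p_hat_dist.cdf(0.37) - p_hat_dist.cdf(0.33)

print(f"sd of p-hat = {sigma:.4f}")
print(f"P(0.33 <= p-hat <= 0.37) = {prob_within:.3f}")
```

The result is roughly 0.90, so about 9 out of 10 samples of this size would land within 2 percentage points of the true 35%.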

Page 27: Sampling Distribution WELCOME to INFERENTIAL STATISTICS.

Sampling Distribution

Jeremy, out of boredom, decided to find the probability of a male student at BHS being 72 inches tall. Mr. Delton told him that the average height of the 857 male students at BHS is 67 inches, with a standard deviation of 3.5 inches. Show a statistical procedure that would help Jeremy in his quest to get rid of his boredom.