Sampling Distributions

Review

Random phenomenon
• Individual outcomes unpredictable

Sample space
• All possible outcomes

Probability of an outcome
• Long-run proportion for outcome

Probability distribution
• Probabilities for outcomes in sample space

Review

parameter: numerical fact about the population (e.g. μ) – the thing we want to know, but can’t

statistic: corresponding numerical fact in the sample (e.g. x̄) – the thing we can know

Fact about x̄ (when X has mean μ and s.d. σ)

• Law of Large Numbers
– As n gets larger, x̄ “gets closer and closer” to the mean μ
– More precisely, the chance of getting a “bad” x̄ gets smaller as n gets larger
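The Law of Large Numbers can be checked with a small simulation. The sketch below is only an illustration under assumptions not stated on the slide: it draws from a normal population with μ = 68.18 and σ = 4.49, and it calls an x̄ "bad" when it misses μ by more than 1 inch. The function name `chance_of_bad_xbar`, the 1-inch tolerance, the number of repetitions, and the seed are all hypothetical choices.

```python
import random
import statistics

MU, SIGMA = 68.18, 4.49  # assumed population mean and s.d. (inches)

def chance_of_bad_xbar(n, tolerance=1.0, reps=2000, seed=1):
    """Estimate P(|xbar - MU| > tolerance) for random samples of size n."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(reps):
        # Assumption: population values are normally distributed.
        sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
        if abs(statistics.fmean(sample) - MU) > tolerance:
            bad += 1
    return bad / reps

# The estimated chance of a "bad" xbar should shrink as n grows.
for n in (5, 25, 100, 400):
    print(n, chance_of_bad_xbar(n))
```

The printed proportions depend on the seed, but they should fall steadily as n increases.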

Sampling Distribution
• A mental picture: imagine trying to estimate average height in this class (μ)
– A sample of 5 persons is obtained and x̄ is calculated
– Imagine all possible samples of size 5, with an x̄ for each sample
– Collect all the x̄’s: this is the “sampling distribution of x̄”
• A definition: The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of size n
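For a very small population, the "all possible samples" picture can be carried out literally. The sketch below uses a made-up population of 8 heights (not the class data) and lists x̄ for every possible sample of size 5, drawn without replacement and ignoring order. The name `sampling_distribution` is hypothetical.

```python
from itertools import combinations
from statistics import fmean

# Hypothetical population of 8 heights in inches (made up for illustration).
population = [62, 64, 66, 67, 68, 70, 72, 75]

def sampling_distribution(pop, n):
    """Return xbar for every possible sample of size n
    (without replacement, order ignored)."""
    return [fmean(sample) for sample in combinations(pop, n)]

xbars = sampling_distribution(population, 5)
print(len(xbars))          # number of possible samples, C(8, 5)
print(fmean(population))   # population mean
print(fmean(xbars))        # mean of the sampling distribution of xbar
```

Even with only 8 individuals there are 56 possible samples, which is why a real class cannot be enumerated by hand. The last two printed values agree up to rounding: the x̄ values are centered on the population mean.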

What we did…

[Figure: Histogram of Heights: μ = 68.18; σ = 4.49. x-axis: Height (inches), 60 to 80; y-axis: # of students.]

[Figure: Histogram of xbars: estimated mean of x̄ = 68.39; estimated s.d. of x̄ = 2.17. x-axis: x̄ (inches), 60 to 80; y-axis: # of x̄ values.]

Application to Statistics

If you have a statistic calculated from a random sample or randomized experiment:

• sample space = all possible values of sample statistic

• The probability distribution of the sample statistic is called the sampling distribution

iClicker
Consider a simple random sample of 100 BYU students, asking them how many movies they watched last week (x), and then calculating x̄. What is the sampling distribution?

a. dist. of x for all BYU students
b. dist. of x for 100 BYU students
c. prob. of getting the particular x̄ of the sample
d. prob. dist. of x̄ for samples of 100 BYU students

Why sampling distribution?

• sampling distribution allows us to assess uncertainty of sample results (i.e., “how reliable is x̄?”)

• if we knew the spread of the sampling distribution, we would know how far our x̄ might be from the true μ

Height Data for Our Class

• μ = 68.18 inches (~ 5’ 8”) and σ = 4.49 inches
• What is the sampling distribution for x̄ if n=5?
– We can’t see the (theoretical) sampling distribution because we don’t have time to look at all possible samples of size 5
– We CAN approximate it with simulation
• How does the sampling distribution of x̄ compare with the distribution of heights (x)?
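A simulation of this kind might look like the sketch below. The actual class heights are not reproduced here, so the code draws from a normal population with the stated μ and σ instead; that normal shape, the function name `simulate_xbars`, the 10,000 repetitions, and the seed are assumptions made for illustration.

```python
import math
import random
import statistics

MU, SIGMA, N = 68.18, 4.49, 5  # mu, sigma, n from the slide

def simulate_xbars(n, reps=10000, seed=121):
    """Draw `reps` random samples of size n and return their sample means."""
    rng = random.Random(seed)
    xbars = []
    for _ in range(reps):
        # Assumption: heights are normally distributed in the population.
        sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
        xbars.append(statistics.fmean(sample))
    return xbars

xbars = simulate_xbars(N)
print("mean of xbars:", round(statistics.fmean(xbars), 2))
print("sd of xbars:  ", round(statistics.stdev(xbars), 2))
print("sigma/sqrt(n):", round(SIGMA / math.sqrt(N), 2))
```

The x̄ values center near μ but are much less spread out than individual heights: their s.d. should land near σ/√n rather than near σ.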

What we did…

[Figure: Histogram of Heights: μ = 68.18; σ = 4.49. x-axis: Height (inches), 60 to 80; y-axis: # of students.]

[Figure: Histogram of xbars: estimated mean of x̄ = 68.39; estimated s.d. of x̄ = 2.17. x-axis: x̄ (inches), 60 to 80; y-axis: # of x̄ values.]

If we had truly random samples…

[Figure: Histogram of Heights: μ = 68.18; σ = 4.49. x-axis: Height (inches), 60 to 80; y-axis: # of students.]

[Figure: Histogram of xbars when n = 5: mean of xbars = 68.23 (should be 68.18); sd of xbars = 1.98 (should be 4.49/sqrt(n) = 2.01). x-axis: xbars, 60 to 80; y-axis: Density.]

[Figures: two further density histograms of xbars on the same 60 to 80 axis, with density scales running to 0.3 and 1.0.]

More facts about x̄ (when X has mean μ and s.d. σ)

• Sampling Distribution (aka “Theoretical Sampling Distribution”) for x̄
– Has a mean of exactly μ
– Has a standard deviation of exactly σ/√n
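Both facts follow from the standard rules for the mean and variance of a sum. The derivation below assumes the n observations X_1, …, X_n are independent draws from the population, an assumption the slide does not spell out:

```latex
\begin{align*}
E(\bar{x}) &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
            = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
            = \frac{n\mu}{n} = \mu \\[4pt]
\operatorname{Var}(\bar{x}) &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
            = \frac{n\sigma^{2}}{n^{2}} = \frac{\sigma^{2}}{n}
\quad\Longrightarrow\quad
\operatorname{sd}(\bar{x}) = \frac{\sigma}{\sqrt{n}}
\end{align*}
```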


• μ = 68.18 inches (~ 5’ 8”) and σ = 4.49 inches
• Someone says BYU’s incoming class for Fall 2014 will have a mean height larger than 68.18, based on a random sample of n=5 incoming freshmen with x̄ = 69.5. What do you think?
– What if the x̄ came from a sample with n=16? n=100?
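One way to weigh the claim is to ask how often a random sample would give x̄ of 69.5 or more if the incoming class really had mean 68.18. The sketch below assumes the sampling distribution of x̄ is approximately normal (not yet established at this point in the slides) and that σ = 4.49 applies to the incoming class; the function name `upper_tail` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA, XBAR = 68.18, 4.49, 69.5  # values from the slide

def upper_tail(n):
    """P(xbar >= XBAR) when the true mean is MU,
    assuming a normal sampling distribution with s.d. SIGMA/sqrt(n)."""
    sd_xbar = SIGMA / sqrt(n)
    return 1 - NormalDist(MU, sd_xbar).cdf(XBAR)

for n in (5, 16, 100):
    z = (XBAR - MU) / (SIGMA / sqrt(n))
    print(f"n={n:3d}  z={z:.2f}  P(xbar >= 69.5)={upper_tail(n):.4f}")
```

Under these assumptions the same x̄ = 69.5 is unremarkable for n = 5 but would be rare for n = 100, because the spread of x̄ shrinks as n grows.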


• μ = 68.18 inches (~ 5’ 8”) and σ = 4.49 inches
• QUIZ: What is the mean of the sampling distribution for x̄ if n=4?
– A: impossible to know
– B: exactly 68.18 inches
– C: approximately 68.18 inches, give or take a little bit of room for error
– D: a value that gets closer and closer to 68.18 inches as n gets larger and larger


• μ = 68.18 inches (~ 5’ 8”) and σ = 4.49 inches
• QUIZ: What is the standard deviation of the sampling distribution for x̄ if n=4?
– A: impossible to know
– B: 4.49 inches
– C: 4.49/2 = 2.245 inches
– D: 4.49/4 = 1.1225 inches

sampling distribution applet

iClicker

T F
1. x̄ always estimates μ well (i.e., it’s always close to μ).
2. Using the sampling dist. we can compute probabilities on x̄.
3. x̄ does not vary from sample to sample.
4. The mean of the sampling dist. of x̄ is μ.

Next…
• What if we don’t have the whole population to simulate from?
• What if we don’t have 600 Stat 121 students willing to calculate x̄ values based on 600 different samples?
– What if we only have time for one sample of size n=35 (BYU students), and we get 6.9 hours as an average number of TV hours per week? Can we say that BYU students’ mean viewing time is significantly less than the national average of 10.6 hours for college students? (σ=8.0) What if we knew somehow that the sampling distribution for x̄ is normal?
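If the sampling distribution of x̄ were known to be normal, the TV-hours question reduces to a short calculation. The sketch below takes that normality as given (the slide raises it only as a "what if"), assumes σ = 8.0 applies to BYU students, and asks how likely an average of 6.9 hours or less would be if the true BYU mean equalled the national 10.6 hours. The variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

MU_NATIONAL, SIGMA, N, XBAR = 10.6, 8.0, 35, 6.9  # values from the slide

sd_xbar = SIGMA / sqrt(N)            # s.d. of the sampling distribution of xbar
z = (XBAR - MU_NATIONAL) / sd_xbar   # number of s.d.s that 6.9 lies from 10.6
p_lower = NormalDist().cdf(z)        # P(xbar <= 6.9) if mu were really 10.6

print(f"sd of xbar = {sd_xbar:.3f}, z = {z:.2f}, P(xbar <= 6.9) = {p_lower:.4f}")
```

A small probability here would suggest that a sample average as low as 6.9 is hard to explain by sampling variability alone.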

Vocabulary

Statistic
Parameter
Probability
Probability distribution
Sampling distribution of statistic