Sampling Distributions
A review by Hieu Nguyen (03/27/06)
Parameter vs Statistic
A parameter is a description for the entire population.
Example: A parameter for the US population is the proportion of all people who support President Bush’s nomination of Samuel Alito to the Supreme Court.
p=.74
Parameter vs Statistic
A statistic is a description of a sample taken from the population. It is only an estimate of the population parameter.
Example: In a poll of 1001 Americans, 73% of those surveyed supported Alito’s nomination.
p-hat=.73
Bias
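The bias of a statistic is the difference between the mean of its sampling distribution and the population parameter.
A statistic is unbiased if the mean of its sampling distribution equals the population parameter. A single sample can still miss the parameter.
Example: The sample proportion is unbiased. One poll gave p-hat=.73, but over many random polls the values of p-hat average out to p=.74.
mean of p-hat=.74=p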
Sampling Variability
Samples naturally have varying results. The mean or sample proportion of one sample may be different from that of another.
In the poll mentioned before, p-hat=.73. A repetition of the same poll may have p-hat=.75.
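Sampling variability can be seen by repeating the poll on a computer. The sketch below is only an illustration: it assumes the true proportion is p=.74 and that each of the 1001 respondents answers independently, and the function name simulate_poll and the seed are made up for this example.

```python
import random

def simulate_poll(p, n, rng):
    """Return the sample proportion from one simulated poll of n people."""
    supporters = sum(1 for _ in range(n) if rng.random() < p)
    return supporters / n

rng = random.Random(2006)   # fixed seed (arbitrary) so the run is repeatable
p_true = 0.74               # population proportion from the example

# Repeat the same poll five times; each repetition gives its own p-hat.
polls = [simulate_poll(p_true, 1001, rng) for _ in range(5)]
print([round(p_hat, 3) for p_hat in polls])
```

Each run of simulate_poll plays the role of one new poll, so the printed values differ from one another even though p never changes.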
Central Limit Theorem (CLT)
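Populations that are wildly skewed may cause samples to vary a great deal.
However, the CLT states that when the sample size is large, the sampling distribution of the sample mean (or proportion) is approximately normal and centered at the population parameter, whatever the shape of the population. The CLT is related to the law of large numbers, which says that the statistic of a larger sample tends to be closer to the parameter.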
CLT Example
Imagine that many polls of 1001 Americans are done to find the proportion of those who supported Alito’s nomination.
Although the poll results vary, more samples have a sample proportion that is close to the population parameter p=.74.
CLT Example
Plot the sample proportions of all samples to see the effects of the CLT. Notice how there are more sample proportions near the population parameter p=.74.
This histogram is actually a sampling distribution.
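A histogram like this can be approximated by simulation. The following is a rough sketch, assuming p=.74, independent respondents, and 1000 repeated polls; the bin width of .01, the seed, and the helper name simulate_poll are choices made for this example only.

```python
import random
from collections import Counter

def simulate_poll(p, n, rng):
    """Return the sample proportion from one simulated poll of n people."""
    return sum(rng.random() < p for _ in range(n)) / n

rng = random.Random(74)  # arbitrary fixed seed for a repeatable run
p_hats = [simulate_poll(0.74, 1001, rng) for _ in range(1000)]

# Bin each sample proportion to the nearest .01 and draw a text histogram.
bins = Counter(round(p_hat, 2) for p_hat in p_hats)
for value in sorted(bins):
    print(f"{value:.2f} {'#' * (bins[value] // 5)}")

mean_p_hat = sum(p_hats) / len(p_hats)
tallest_bin = max(bins, key=bins.get)
print("mean of sample proportions:", round(mean_p_hat, 4))
```

The bars should pile up around .74 and thin out on both sides, which is the bell shape the CLT describes.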
Sampling Distributions: Definition
Textbook definition:
A sampling distribution is the distribution of values taken by the statistic in all possible samples of the same size from the same population.
In other words, a sampling distribution is a histogram of the statistics from samples of the same size of a population.
Two Most Common Types of Sampling Distributions
Sample Proportion Distribution
Distribution of the sample proportions of samples from a population
Sample Mean Distribution
Distribution of the sample means of samples from a population
For both types, the ideal shape is a normal distribution.
Sampling Distributions: Conditions
Before assuming that a sampling distribution is normal, check the following conditions:
Plausible independence
Randomness
Each sample is less than 10% of the population
Sampling Distributions As Normal Distributions
When all conditions are met, the sampling distribution can be considered a normal distribution with a center and a spread.
Note: With sample proportion distributions, another condition must be met:
Success-failure condition – there must be at least 10 successes and 10 failures according to the population parameter and sample size (np ≥ 10 and nq ≥ 10, where q = 1 − p)
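The two numeric conditions can be checked with a few lines of code. This is a minimal sketch: the function name normal_model_ok is invented here, and the population size of 300,000,000 is a rough stand-in for the US population, not a figure from the review. Independence and randomness depend on how the sample was collected and cannot be checked by arithmetic.

```python
def normal_model_ok(n, p, population_size):
    """Check the 10% condition and the success-failure condition."""
    ten_percent = n < 0.10 * population_size
    success_failure = n * p >= 10 and n * (1 - p) >= 10
    return ten_percent and success_failure

# Poll example: n = 1001, p = .74.
# 300_000_000 is an assumed, approximate population size.
print(normal_model_ok(1001, 0.74, 300_000_000))

# A poll of only 20 people expects 20 * .26 = 5.2 failures, which is too few.
print(normal_model_ok(20, 0.74, 300_000_000))
```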
Sampling Distributions As Normal Distributions: Equations
Sample Proportion Distribution
p = population proportion (given)
$SD(\hat{p}) = \sqrt{\frac{pq}{n}}$, where $q = 1 - p$
$N(p, SD(\hat{p}))$
Sample Mean Distribution
μ = population mean (given)
σ = population standard deviation (given)
$SD(\bar{y}) = \frac{\sigma}{\sqrt{n}}$
$N(\mu, SD(\bar{y}))$
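The two standard deviation formulas translate directly into code. The helper names below are made up for this sketch; the inputs are the poll example (p=.74, n=1001) and the sample mean example used later in this review (σ=3, n=25).

```python
from math import sqrt

def sd_sample_proportion(p, n):
    """SD(p-hat) = sqrt(p*q/n), with q = 1 - p."""
    return sqrt(p * (1 - p) / n)

def sd_sample_mean(sigma, n):
    """SD(y-bar) = sigma / sqrt(n)."""
    return sigma / sqrt(n)

print(round(sd_sample_proportion(0.74, 1001), 4))  # poll example
print(round(sd_sample_mean(3, 25), 4))             # sigma = 3, n = 25
```

Both functions shrink as n grows, which is why larger samples give tighter sampling distributions.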
Sampling Distributions As Normal Distributions: Note
Note: If any of the parameters are unknown, use the statistics from a sample to approximate them.
Using Sampling Distributions
Sampling Distributions can estimate the probability of getting a certain statistic in a random sample.
Use z-scores or the NormalCDF function on the TI-83/84.
Using Sampling Distributions: Z-Scores w/ Example
Use the z-score table to find appropriate probabilities.
Example: Find the probability that a poll of 1001 Americans on Alito’s nomination will return a sample proportion of .72 or less.
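$P(\hat{p} < \hat{p}_0)$ OR $P(\hat{p} > \hat{p}_0)$, where $\hat{p}_0$ is the cutoff value
$z = \frac{\hat{p}_0 - p}{SD(\hat{p})}$
$p = .74$
$SD(\hat{p}) = \sqrt{\frac{pq}{n}} = \sqrt{\frac{.74 \times .26}{1001}} = .0139$
$z = \frac{\hat{p}_0 - p}{SD(\hat{p})} = \frac{.72 - .74}{.0139} = -1.443$
$P(\hat{p} < .72) = .0749$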
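The z-score calculation can be checked without a table. The sketch below assumes Python 3.8 or later, where the standard library provides statistics.NormalDist; the variable names are chosen for this example. Because it does not round z to two decimals before the lookup, the result can differ from the table value in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.74, 1001        # population proportion and poll size
cutoff = 0.72            # sample proportion of interest

sd_p_hat = sqrt(p * (1 - p) / n)   # SD(p-hat) = sqrt(pq/n)
z = (cutoff - p) / sd_p_hat        # z-score of the cutoff
prob = NormalDist().cdf(z)         # area to the left of z under N(0, 1)

print(round(sd_p_hat, 4), round(z, 3), round(prob, 4))
```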
Using Sampling Distributions: NormalCDF Function w/ Example
The syntax for the NormalCDF function is:
NormalCDF(lower limit, upper limit, μ, σ)
Example: Find the probability that a sample of size 25 will have a mean of 5 or less given that the population has a mean of 7 and a standard deviation of 3.
$\mu = 7$
$\sigma = 3$
$SD(\bar{y}) = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{25}} = .6$
NormalCDF(0, 5, 7, .6) = .000429
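For readers without a TI-83/84, the same four-argument call can be imitated in Python. This is a sketch, not the calculator's own routine: normal_cdf is a name invented here, and it relies on statistics.NormalDist from Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist

def normal_cdf(lower, upper, mu, sigma):
    """Area under N(mu, sigma) between lower and upper, like NormalCDF."""
    model = NormalDist(mu, sigma)
    return model.cdf(upper) - model.cdf(lower)

sd_y_bar = 3 / sqrt(25)                  # SD(y-bar) = sigma / sqrt(n)
prob = normal_cdf(0, 5, 7, sd_y_bar)     # same arguments as the example
print(round(prob, 6))
```

Passing float("inf") or float("-inf") as a limit gives a one-sided area, which is convenient when there is no natural lower or upper bound.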
Sampling Distribution for Two Populations
Use a difference sampling distribution if the question presents 2 different populations.
$\mu_{x-y} = \mu_x - \mu_y$
$\sigma_{x-y} = \sqrt{\sigma_x^2 + \sigma_y^2}$
Sampling Distribution for Two Populations: Example
(adapted from AP Statistics – Chapter 9 – Sampling Distribution Multiple Choice Questions)
Medium oranges have a mean weight of 14oz and a standard deviation of 2oz. Large oranges have a mean weight of 18oz and a standard deviation of 3oz. Find the probability of finding a medium orange that weighs more than a large orange.
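$\mu_x = 14$
$\sigma_x = 2$
$\mu_y = 18$
$\sigma_y = 3$
$\mu_{y-x} = \mu_y - \mu_x = 18 - 14 = 4$
$\sigma_{y-x} = \sqrt{\sigma_y^2 + \sigma_x^2} = \sqrt{3^2 + 2^2} = 3.606$
NormalCDF(−∞, 0, 4, 3.606) = .134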
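The orange example can be reproduced in a few lines. This sketch assumes the two weights are independent and normally distributed, which the variance-adding formula requires; difference_model is a name made up for this example, and it uses statistics.NormalDist from Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist

def difference_model(mu_x, sigma_x, mu_y, sigma_y):
    """Normal model for y - x, assuming x and y are independent."""
    return NormalDist(mu_y - mu_x, sqrt(sigma_x**2 + sigma_y**2))

# x = medium orange (14oz, 2oz), y = large orange (18oz, 3oz)
diff = difference_model(14, 2, 18, 3)

# The medium orange is heavier when y - x is below 0.
prob_medium_heavier = diff.cdf(0)
print(round(diff.mean, 3), round(diff.stdev, 3), round(prob_medium_heavier, 3))
```

Note that the standard deviations are squared and added even though the means are subtracted.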
Example Problem
(adapted from De Veaux Sampling Distribution Models Exercise #42)
Ayrshire cows average 47 pounds of milk a day, with a standard deviation of 6 pounds. For Jersey cows, the mean daily production is 43 pounds, with a standard deviation of 5 pounds. Assume that Normal models describe milk production for these breeds.
A) We select an Ayrshire at random. What’s the probability that she averages more than 50 pounds of milk a day?
B) What’s the probability that a randomly selected Ayrshire gives more milk than a randomly selected Jersey?
C) A farmer has 20 Jerseys. What’s the probability that the average production for this small herd exceeds 45 pounds of milk a day?
D) A neighboring farmer has 10 Ayrshires. What’s the probability that his herd average is at least 5 pounds higher than the average for the Jersey herd?
Example Problem Solution
First, check the assumptions:
Independent samples
Randomness
Sample represents less than 10% of population
Example Problem Solution
A) Use the normal model to estimate the appropriate probability.
$\mu = 47$
$\sigma = 6$
$z = \frac{50 - 47}{6} = .5$
$P(x > 50) = .309$
NormalCDF(50, ∞, 47, 6) = .309
Example Problem Solution
B) Create a normal model for the difference between Ayrshires and Jerseys. Use the model to estimate the appropriate probability.
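$\mu_a = 47$
$\sigma_a = 6$
$\mu_j = 43$
$\sigma_j = 5$
$\mu_{a-j} = \mu_a - \mu_j = 47 - 43 = 4$
$\sigma_{a-j} = \sqrt{\sigma_a^2 + \sigma_j^2} = \sqrt{6^2 + 5^2} = 7.810$
$z = \frac{0 - 4}{7.810} = -.512$
$P(x > 0) = .696$, where $x$ is the Ayrshire amount minus the Jersey amount
NormalCDF(0, ∞, 4, 7.810) = .696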
Example Problem Solution
C) Create a sampling distribution model for which n=20 Jerseys. Use the model to estimate the appropriate probability.
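$\mu = 43$
$\sigma = 5$
$n = 20$
$SD(\bar{y}) = \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{20}} = 1.118$
$z = \frac{45 - 43}{1.118} = 1.789$
$P(\bar{y} > 45) = .0367$
NormalCDF(45, ∞, 43, 1.118) = .0367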
Example Problem Solution
D) First create a sampling distribution model for 10 random Ayrshires and 20 random Jerseys. Then create a normal model for the difference between the 10 Ayrshires and 20 Jerseys.
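$\mu_j = 43$
$\sigma_j = 5$
$n_j = 20$
$SD(\bar{y}_j) = \frac{\sigma_j}{\sqrt{n_j}} = \frac{5}{\sqrt{20}} = 1.118$
$\mu_a = 47$
$\sigma_a = 6$
$n_a = 10$
$SD(\bar{y}_a) = \frac{\sigma_a}{\sqrt{n_a}} = \frac{6}{\sqrt{10}} = 1.897$
$\mu_{a-j} = \mu_a - \mu_j = 47 - 43 = 4$
$\sigma_{a-j} = \sqrt{SD(\bar{y}_a)^2 + SD(\bar{y}_j)^2} = \sqrt{1.897^2 + 1.118^2} = 2.202$
$z = \frac{5 - 4}{2.202} = .454$
$P(x > 5) = .325$, where $x$ is the Ayrshire herd average minus the Jersey herd average
NormalCDF(5, ∞, 4, 2.202) = .325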
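All four parts of the cow problem can be double-checked in one short script. This is a sketch under the problem's own assumptions (Normal models, independent cows and herds); upper_tail is a helper name invented here, and statistics.NormalDist requires Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail(cutoff, mu, sigma):
    """P(value > cutoff) under N(mu, sigma)."""
    return 1 - NormalDist(mu, sigma).cdf(cutoff)

mu_a, sigma_a = 47, 6   # Ayrshire mean and standard deviation
mu_j, sigma_j = 43, 5   # Jersey mean and standard deviation

# A) one Ayrshire gives more than 50 pounds
part_a = upper_tail(50, mu_a, sigma_a)

# B) one Ayrshire beats one Jersey: difference a - j is above 0
part_b = upper_tail(0, mu_a - mu_j, sqrt(sigma_a**2 + sigma_j**2))

# C) mean of 20 Jerseys exceeds 45 pounds
sd_j = sigma_j / sqrt(20)
part_c = upper_tail(45, mu_j, sd_j)

# D) mean of 10 Ayrshires is at least 5 pounds above the Jersey herd mean
sd_a = sigma_a / sqrt(10)
part_d = upper_tail(5, mu_a - mu_j, sqrt(sd_a**2 + sd_j**2))

for label, value in zip("ABCD", (part_a, part_b, part_c, part_d)):
    print(label, round(value, 4))
```

Parts C and D use the standard deviation of a herd average, not of a single cow, which is why the square root of the herd size appears there.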