1 Sampling Distributions Presentation 2 Sampling Distribution of sample proportions Sampling Distribution of sample means.

Transcript
Page 1

Sampling Distributions

Presentation 2

• Sampling Distribution of sample proportions
• Sampling Distribution of sample means

Page 2

Statistics VS Parameters

Statistic – is a numerical value computed from a sample.

Parameter – is a numerical value associated with a population.

Essentially, we would like to know the parameter. But in most cases the parameter is hard to determine, because the population is too large to measure in full.

So we have to estimate the parameter using a suitable statistic computed from a sample.
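As a rough illustration of the distinction, the sketch below builds a made-up population and compares a parameter with a statistic. The population values, the seed, and the names `population`, `sample`, `mu`, and `x_bar` are assumptions for illustration only; they do not come from the slides.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so the run is repeatable

# Hypothetical population: 100,000 made-up measurements.
population = [random.gauss(50, 10) for _ in range(100_000)]

# Parameter: computed from the whole population (usually unknown in practice).
mu = statistics.fmean(population)

# Statistic: computed from a random sample drawn from the population.
sample = random.sample(population, 200)
x_bar = statistics.fmean(sample)

print(f"parameter mu = {mu:.2f}, statistic x-bar = {x_bar:.2f}")
```

The statistic changes from sample to sample, while the parameter stays fixed.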

Page 3

Some Notation

p = population proportion
p̂ = sample proportion
μ = population mean
x̄ = sample mean
σ = population standard deviation
s = sample standard deviation

Page 4

A. Sampling Distribution of the Sample Proportion

Situation 1: A survey is undertaken to determine the proportion of PSU students who engage in under-age drinking. The survey asks 200 random under-age students (assume no problems with bias). Suppose the true population proportion of those who drink is 60%.

Thus, p = 0.6 and p̂ is the proportion in the sample who drink.

Page 5

Repeated Samples

Imagine repeating this survey many times, and each time we record the sample proportion of those who have engaged in under-age drinking. What would the sampling distribution of p̂ look like?

Sample (n=200)    Sample Proportion
1                 p̂_1
2                 p̂_2
3                 p̂_3
4                 p̂_4
5                 p̂_5
…                 …
150,000           p̂_150,000

p̂ is a random variable assigning a value to each sample!
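The repeated-survey idea can be imitated with a small simulation. This is only a sketch: it uses 2,000 repetitions instead of the 150,000 on the slide to keep the run short, and the function name `sample_proportion` and the seed are assumptions for illustration.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the run is repeatable

N_STUDENTS = 200   # sample size from Situation 1
P_TRUE = 0.6       # true population proportion from Situation 1
N_SAMPLES = 2_000  # assumed; far fewer than the slide's 150,000

def sample_proportion(n, p):
    """Survey n random students; return the proportion who say they drink."""
    drinkers = sum(1 for _ in range(n) if random.random() < p)
    return drinkers / n

# One p-hat value per simulated survey.
p_hats = [sample_proportion(N_STUDENTS, P_TRUE) for _ in range(N_SAMPLES)]

print("first five sample proportions:", p_hats[:5])
print("mean of p-hat values:", round(statistics.fmean(p_hats), 4))
print("std. dev. of p-hat values:", round(statistics.pstdev(p_hats), 4))
```

Each entry of `p_hats` plays the role of one row in the table above; a histogram of the list would show the shape of the sampling distribution.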

Page 6

Histogram of p̂ for 150,000 samples.

[Histogram figure: horizontal axis p̂ from 0.4 to 0.8; vertical axis from 0 to 10.]

Page 7

Sampling Distribution of p̂

Let X be the number of respondents who say they engage in under-age drinking. X is binomial with n = 200 and p = 0.6. So, we can calculate the probability of X for each possible outcome (0-200). The probability function (PMF) is plotted below:

[Figure: binomial probabilities; horizontal axis X from 69 to 169; vertical axis Probability from 0.00 to 0.06.]
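The probabilities in that plot can be computed directly from the binomial formula. The sketch below is a minimal version using only the standard library; the helper name `binomial_pmf` is an assumption for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 200, 0.6

# Probability of every possible outcome 0, 1, ..., 200.
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

most_likely = max(range(n + 1), key=lambda k: pmf[k])
print("most likely count:", most_likely)
print("P(X = 120) =", round(pmf[120], 4))
print("sum of all probabilities =", round(sum(pmf), 6))
```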

Page 8

Sampling Distribution of p̂

Since X ~ Bin(n = 200, p = 0.6), the sampling distribution of p̂ is the same as that of the binomial distribution divided by n.

$$\hat{p} = \frac{\text{number of students in the sample that drink}}{\text{total number of students in the sample}} = \frac{X}{n} = \frac{X}{200}$$

Therefore we have

$$\text{Mean for } \hat{p}:\quad E(\hat{p}) = E\!\left(\frac{X}{n}\right) = \frac{np}{n} = p$$

$$\text{Std. Dev. for } \hat{p}:\quad sd(\hat{p}) = sd\!\left(\frac{X}{n}\right) = \frac{\sqrt{np(1-p)}}{n} = \sqrt{\frac{p(1-p)}{n}}$$
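As a numerical check, the mean p and standard deviation sqrt(p(1-p)/n) of p̂ can be compared with values computed directly from the exact binomial distribution of X. This is a sketch under the slide's values n = 200 and p = 0.6; the variable names are assumptions for illustration.

```python
from math import comb, sqrt

n, p = 200, 0.6

# Exact distribution of p-hat = X / n, where X ~ Bin(n, p).
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
values = [k / n for k in range(n + 1)]

# Mean and variance computed term by term from the distribution.
mean_exact = sum(v * w for v, w in zip(values, pmf))
var_exact = sum((v - mean_exact) ** 2 * w for v, w in zip(values, pmf))

print("E(p-hat):  exact", round(mean_exact, 6), " formula", p)
print("sd(p-hat): exact", round(sqrt(var_exact), 6),
      " formula", round(sqrt(p * (1 - p) / n), 6))
```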

Page 9

Sampling Distribution of p̂ - Cont.

Using the Normal approximation to the binomial distribution we have that the sampling distribution of p̂ is approximately Normal with mean p and std. dev. $\sqrt{p(1-p)/n}$, i.e.

$$\hat{p} \overset{\text{approx.}}{\sim} N\!\left(p,\ \frac{p(1-p)}{n}\right)$$

The conditions for this approximation to be valid are:

1. The sample selected from the population is random.

2. The sample must be large enough: np and n(1-p) MUST be greater than 5, and should be greater than 10.
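The sample-size condition and the quality of the approximation can be checked numerically. The sketch below compares an exact binomial probability with its Normal approximation for Situation 1; the helper name `normal_approx_ok`, its default threshold of 10, and the cutoff 0.65 are assumptions chosen for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def normal_approx_ok(n, p, threshold=10):
    """Rule of thumb: np and n(1-p) should both exceed the threshold."""
    return n * p > threshold and n * (1 - p) > threshold

n, p = 200, 0.6
print("conditions met:", normal_approx_ok(n, p))

# Exact binomial probability that p-hat <= 0.65, i.e. X <= 130.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, 131))

# Normal approximation with mean p and std. dev. sqrt(p(1-p)/n).
approx = NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n)).cdf(0.65)

print("exact:", round(exact, 4), " normal approx.:", round(approx, 4))
```

No continuity correction is applied here, so a small gap between the two numbers is expected.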

Page 10

Example:

Recent studies have shown that about 20% of American adults fit the medical definition of being obese.

A large medical clinic would like to estimate what percent of their patients are obese, so they take a random sample of 100 patients and find that 18 percent are obese.

Suppose in truth, the same percentage holds for the patients of the medical clinic as for the general population, 20%.

Give notation and the numerical value for the following.

Page 11

Problem - Cont.

a. The population proportion of obese patients in the medical clinic:

b. The proportion of obese patients in the sample of 100 patients:

c. The mean of the sampling distribution of p̂:

d. The standard deviation of the sampling distribution of p̂:

e. The variance of the sampling distribution of p̂:
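One way to check answers to parts (a) through (e) is to plug the example's numbers into the formulas E(p̂) = p and sd(p̂) = sqrt(p(1-p)/n). The sketch below does this; the variable names are assumptions for illustration.

```python
from math import sqrt

n = 100
p = 0.20          # (a) population proportion, taken to equal the national 20%
p_hat = 18 / 100  # (b) sample proportion: 18 percent of the sampled patients

mean_p_hat = p               # (c) E(p-hat) = p
var_p_hat = p * (1 - p) / n  # (e) Var(p-hat) = p(1-p)/n
sd_p_hat = sqrt(var_p_hat)   # (d) sd(p-hat) = sqrt(p(1-p)/n)

print(f"p = {p}, p-hat = {p_hat}")
print(f"mean = {mean_p_hat}, sd = {sd_p_hat:.4f}, variance = {var_p_hat:.4f}")
```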

Page 12

B. Sampling Distribution of the Sample Mean

Situation 2: The height of women age 20 to 30, X, is normally distributed (bell-shaped) with a mean of 65 inches and a standard deviation of 3 inches, i.e.

X ~ N(65, 9)

A random sample of 200 women was taken and the sample mean X̄ recorded.

Now IMAGINE taking MANY samples of size 200 from the population of women. For each sample we record the X̄. What is the sampling distribution of X̄?

Page 13

Histograms for the Distribution of X and X-Bar

Original Population of Women: X= height of random woman

Distribution of Sample Means: X-bar = mean of random sample of size 200.

Page 14

Normal Data

Consider a Normal random variable X with mean μ and standard deviation σ,

X ~ N(μ, σ²).

The sampling distribution of the sample mean of X for a sample of size n is Normal with

$$\text{Mean or Expected Value of } \bar{X}:\quad E(\bar{X}) = \mu$$

$$\text{Std. Dev. of } \bar{X}:\quad s.d.(\bar{X}) = \frac{\sigma}{\sqrt{n}}$$

$$\text{Variance of } \bar{X}:\quad Var(\bar{X}) = \frac{\sigma^2}{n}$$

i.e.

$$\bar{X} \sim N\!\left(\mu,\ \frac{\sigma^2}{n}\right)$$
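Applied to the heights in Situation 2 (mean 65, standard deviation 3, n = 200), the standard deviation of the sample mean is σ/√n and its variance is σ²/n. The sketch below computes these and compares how tightly a single height and a sample mean cluster around 65; the half-inch window is an assumption chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 65, 3, 200  # Situation 2: heights in inches, sample size 200

sd_x_bar = sigma / sqrt(n)  # std. dev. of X-bar
var_x_bar = sigma**2 / n    # variance of X-bar

single = NormalDist(mu, sigma)     # distribution of one woman's height
x_bar_dist = NormalDist(mu, sd_x_bar)  # distribution of the sample mean

p_single = single.cdf(65.5) - single.cdf(64.5)
p_x_bar = x_bar_dist.cdf(65.5) - x_bar_dist.cdf(64.5)

print(f"sd of X-bar = {sd_x_bar:.4f}, variance of X-bar = {var_x_bar:.4f}")
print("P(|X - 65| < 0.5)     =", round(p_single, 4))
print("P(|X-bar - 65| < 0.5) =", round(p_x_bar, 4))
```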

Page 15

Skewed or Non-Normal Data

[Histogram figure: horizontal axis CDs from 0 to 600.]

Situation 3: In a college survey, students were asked to report the number of CDs they own. Clearly, the number of CDs is a right-skewed data set. Suppose our population looked something like this. Let us take repeated samples from this population and see what the distribution of the sample mean looks like.

Page 16

Suppose we take repeated samples of size n = 4, 8, 16, 32

[Figure: four histograms of the sample mean, one panel each for n = 4, n = 8, n = 16, and n = 32.]

Page 17

Statistics From Skewed Data

Using that CD sample as the population, µ = 87.6, σ = 87.8

The sample means from the previous slide had the following summary statistics:

Sample Size    Mean of X-bar    Std. Dev. of X-bar
n = 4          86.6             43.2
n = 8          86.8             30.9
n = 16         86.7             21.9
n = 32         86.6             15.6

Note that the mean remains constant, and the std. deviation decreases as the sample size increases!
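The standard deviations in the table shrink roughly like σ/√n. The sketch below compares each table value with 87.8/√n; the numbers are copied from the slide, and only the variable names are assumptions.

```python
from math import sqrt

sigma = 87.8  # population std. dev. from the slide

# (sample size, std. dev. of X-bar reported in the table)
table = [(4, 43.2), (8, 30.9), (16, 21.9), (32, 15.6)]

for n, observed in table:
    predicted = sigma / sqrt(n)
    print(f"n = {n:2d}: sigma/sqrt(n) = {predicted:5.1f}, table value = {observed}")
```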

Page 18

Central Limit Theorem

For non-normal data coming from a population with mean µ and standard deviation σ the sampling distribution of the sample mean is approximately normal with

$$\text{Mean or Expected Value of } \bar{X}:\quad E(\bar{X}) = \mu$$

$$\text{Std. Dev. of } \bar{X}:\quad s.d.(\bar{X}) = \frac{\sigma}{\sqrt{n}}$$

$$\text{Variance of } \bar{X}:\quad Var(\bar{X}) = \frac{\sigma^2}{n}$$

i.e.

$$\bar{X} \overset{\text{approx.}}{\sim} N\!\left(\mu,\ \frac{\sigma^2}{n}\right)$$

Conditions: The above is true if the sample size is large enough, usually n > 30 is sufficient.
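A small simulation can illustrate the mean and spread of the sample mean for skewed data. The sketch below does not use the CD data, which is not available here; as a stand-in it assumes an exponential population with mean 87.6 (whose standard deviation is then also 87.6). The seed, the number of repetitions, and all names are assumptions for illustration. It checks only the mean and standard deviation, not the shape of the distribution.

```python
import random
import statistics
from math import sqrt

random.seed(2)  # arbitrary seed so the run is repeatable

# Assumed right-skewed population: exponential with mean 87.6.
# For an exponential distribution the std. dev. equals the mean.
POP_MEAN = 87.6
POP_SD = 87.6
N_REPEATS = 5_000

results = {}
for n in (4, 8, 16, 32):
    # Draw N_REPEATS samples of size n and record each sample mean.
    means = [
        statistics.fmean([random.expovariate(1 / POP_MEAN) for _ in range(n)])
        for _ in range(N_REPEATS)
    ]
    results[n] = (statistics.fmean(means), statistics.pstdev(means))
    print(f"n = {n:2d}: mean of X-bar = {results[n][0]:6.1f}, "
          f"sd of X-bar = {results[n][1]:5.1f}, "
          f"sigma/sqrt(n) = {POP_SD / sqrt(n):5.1f}")
```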

Page 19

What next?

We have shown that the sampling distribution of the sample proportion and the sampling distribution of the sample mean are both approximately normal under certain conditions.

Now we can use what we know about normal distributions to make conclusions about p̂ and X̄!

In the following we will see how to use the values of the statistics (p-hat, x-bar) to make inferences about the parameters (p, µ).

Page 20

Exercise 1

The population proportion is 0.30. Consider the following questions.

1. Find the sampling distribution of p-hat for each of the following sample sizes: n=100, n=200, n=1000

2. What is the probability that a sample proportion will be within ±.04 of the population proportion for each of these sample sizes?

3. What is the advantage of larger sample size?
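One way to check answers to this exercise is a short calculation using the Normal approximation for p-hat. The sketch below is only a checking aid; the helper name `prob_within` is an assumption for illustration, and no continuity correction is applied.

```python
from math import sqrt
from statistics import NormalDist

def prob_within(p, n, margin):
    """Normal-approximation probability that p-hat lies within +/- margin of p."""
    sd = sqrt(p * (1 - p) / n)
    dist = NormalDist(mu=p, sigma=sd)
    return dist.cdf(p + margin) - dist.cdf(p - margin)

for n in (100, 200, 1000):
    print(f"n = {n:4d}: sd = {sqrt(0.30 * 0.70 / n):.4f}, "
          f"P(|p-hat - 0.30| <= 0.04) = {prob_within(0.30, n, 0.04):.4f}")
```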

Page 21

Exercise 2

A certain antibiotic is known to cure 85% of strep bacteria infections. A scientist wants to make sure the drug does not lose its potency over time. He treats 100 strep patients with a 1-year-old supply of the antibiotic. Let p̂ be the proportion of individuals who are cured. ASSUME the drug has NOT lost potency, and answer the following questions…

1. What is the sampling distribution of p̂? Draw a picture.

2. If we repeated this study many times, we would expect 95% of the p̂ values to fall within what interval?

3. What is the probability that more than 90% in the sample are cured?

Page 22

Exercise 3

A newspaper conducts a poll to determine the proportion of adults who favor a certain candidate. They ask a random sample of 800 people whether or not they favor that candidate (Assume no bias!). Suppose the true proportion of adults who favor the candidate is 58%.

1. The newspaper records the sample proportion who favor the candidate. What is the sampling distribution of the sample proportion? Draw a picture of its PDF (center it correctly and include the appropriate scale).

2. What is the probability that the newspaper would have recorded a sample proportion greater than 62%?

3. What is the probability that less than 50% of the newspaper respondents would support this candidate?

4. What is the probability that a randomly selected individual favors this candidate?

Page 23

Exercise 4

Suppose the number of calories FIT students consume in a day is normally distributed with mean 2000 and standard deviation 300.

1. About 95% of these students have a daily caloric intake between what two values?

2. What is the probability that a randomly selected individual consumed between 1800 and 2100 calories yesterday?

3. Suppose I take a random sample of 36 students and record the number of calories each consumed on a given day. Describe the sampling distribution of the sample mean.

4. Draw a picture of the sampling distribution of the sample mean (center it correctly and include the appropriate scale).

5. If I take a sample of size 36 from the student body, what is the probability that the sample mean will be less than 2050?

Page 24

Exercise 5

Assume the length of trout living in the Susquehanna River is normally distributed with mean of 14 inches and standard deviation of 2 inches. A random sample of 16 trout is taken from the river.

1. What is the sampling distribution of the average trout length (i) in a sample of size 16 (ii) in a sample of size 100?

2. What happens to the sampling distribution of the sample mean as the sample size increases? (Draw a picture)

3. What is the probability that a random sample of 16 trout will provide a sample mean within one inch of the population mean?

4. What is the probability that a random sample of 100 trout will provide a sample mean within one inch of the population mean?

5. What is the advantage of a larger sample size when one is attempting to estimate the population mean?