STT 315
This lecture is based on Chapter 5.4
Acknowledgement: Author is thankful to Dr. Ashok Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use/edit some of their slides.
Statistical Inference
• Inference means that we are making a conclusion about the population parameter based on the statistic we calculated from a sample.
• Conclusions made using statistical inference are probabilistic in nature. We may not be able to say for sure, but with certain confidence.
• There are two types of inference:
– Confidence Intervals,
– Hypothesis Tests.
Goal
Students will be able to:
• Construct a confidence interval for a proportion.
• Interpret a confidence interval for a proportion.
• Check conditions for the use of inference about a population proportion:
– Independence (or sample less than 10% of population),
– Sample size large enough (successes and failures each greater than 10).
• Explain the relationship between the margin of error, sample size, and level of certainty.
Estimating Smokers
• Suppose I want to estimate the percent of MSU undergraduate students who smoke.
• A random sample of 99 undergraduate students was selected, and 17 of them smoked tobacco last week.
• I want to make a 95% confidence interval for the proportion of MSU undergraduates who smoke, based on this information.
Are the conditions met?
• First, it is a random sample.
• Although the sample is drawn without replacement, it satisfies the 10% condition, as there are more than 1000 undergraduate students at MSU.
• Also, since both the number of smokers (17) and the number of non-smokers (82) are larger than 10, the sample can be considered large enough.
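The two checks above can be sketched in a few lines of Python. This is only an illustration: the function name `check_conditions` is made up here, and the population size of 1000 is a stand-in for "more than 1000 undergraduates", not an actual enrollment figure.

```python
def check_conditions(successes, n, population_size):
    """Return (ten_percent_ok, large_enough) for a one-proportion interval."""
    failures = n - successes
    # 10% condition: the sample should be less than 10% of the population.
    ten_percent_ok = n < 0.10 * population_size
    # Large-sample condition: successes and failures each greater than 10.
    large_enough = successes > 10 and failures > 10
    return ten_percent_ok, large_enough


# 17 smokers and 82 non-smokers in a sample of 99.
# population_size=1000 is an assumed lower bound, used only for illustration.
print(check_conditions(17, 99, 1000))
```

Randomness of the sample cannot be checked from the numbers alone; it depends on how the data were collected.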
Use the results to make a CI
• The sampling distribution results guarantee that the sample proportions will be roughly normally distributed around the population proportion.
• So about 95% of sample proportions should fall within two standard deviations of the population proportion.
• But we don’t know the population proportion (that’s what we are trying to estimate)! So we cannot get $\sigma_{\hat{p}}$.
• Therefore we need to use the sample proportion and work backward from there.
Construction of C.I.
In our sample, 17 out of 99 students smoked tobacco in the last week, or 17.2%.
17.2% is a statistic (or a point estimate).
We will use 17.2% to make an interval estimate for the value of the parameter.
To make a 95% confidence interval we must create an interval that extends 2 standard deviations above and below the statistic.
One standard deviation is
$$\sqrt{\frac{(0.172)(0.828)}{99}} = 0.0379.$$
So 2 standard deviations is 2(0.0379) = 0.0758 (and 7.58% is the margin of error).
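As a rough check of the arithmetic on this slide, here is a minimal Python sketch. The function name `two_sd_interval` is hypothetical, and the code deliberately rounds the sample proportion to 0.172 and the standard deviation to 0.0379 before doubling, because that is how the slide arrives at 0.0758.

```python
import math


def two_sd_interval(successes, n):
    """Statistic plus or minus 2 standard deviations, rounded as on the slide."""
    p_hat = round(successes / n, 3)                      # 17/99 rounds to 0.172
    sd = round(math.sqrt(p_hat * (1 - p_hat) / n), 4)    # rounds to 0.0379
    margin = 2 * sd                                      # margin of error
    return p_hat, sd, margin, (p_hat - margin, p_hat + margin)


p_hat, sd, margin, (low, high) = two_sd_interval(17, 99)
print(f"p_hat = {p_hat:.3f}, sd = {sd:.4f}, margin = {margin:.4f}")
print(f"interval = ({low:.4f}, {high:.4f})")
```

Without the intermediate rounding the margin would come out very slightly larger, which does not change the interpretation.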
So a 95% confidence interval has endpoints at 0.172 - 0.0758 = 0.0962 and 0.172 + 0.0758 = 0.2478.
And we write:
We are 95% confident that between 9.62% and 24.78% of MSU undergraduates smoke tobacco.
If we want to make a 68% confidence interval, we only have to extend the interval one standard deviation from the statistic in each direction:
So a 68% confidence interval has endpoints at 0.172 - 0.0379 = 0.1341 and 0.172 + 0.0379 = 0.2099, and we write:
We are 68% confident that between 13.4% and 21.0% of MSU undergraduates smoke tobacco.
Our 95% CI for smokers was 9.62% to 24.78%.
This means that (find the correct one):
a) 95% of random samples of MSU undergraduates will have between 9.62% and 24.78% smokers.
b) Between 9.62% and 24.78% of MSU undergraduates smoke.
c) 95% of MSU undergraduates smoke between 9.62% and 24.78% of the time.
d) We are 95% sure that between 9.62% and 24.78% of MSU undergraduates smoke.
Standard Error (S.E.)
• If subjects are independent,
• and if the sample size is large enough,
then the sample proportions are approximately normally distributed with mean p and standard deviation
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}},$$
i.e., $\hat{p} \sim N(p, \sigma_{\hat{p}})$ approximately.
• But in an estimation problem, p is unknown. So we replace the population proportion (p) by the sample proportion ($\hat{p}$) in its formula and get the standard error of the sample proportion
$$S.E.(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{\hat{p}\hat{q}}{n}},$$
where $\hat{q} = 1 - \hat{p}$.
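The claim that sample proportions center on p with standard deviation $\sqrt{p(1-p)/n}$ can be illustrated with a small simulation. This is a sketch under assumptions: the population proportion p = 0.17, the number of repetitions, and the seed are all chosen here for illustration and do not come from the lecture.

```python
import math
import random
import statistics


def simulate_sample_proportions(p, n, reps, seed=0):
    """Draw `reps` independent samples of size n and return their proportions."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]


p, n = 0.17, 99  # p is a hypothetical population proportion
props = simulate_sample_proportions(p, n, reps=5000)

theory_sd = math.sqrt(p * (1 - p) / n)
print("mean of sample proportions:", round(statistics.mean(props), 3))
print("sd of sample proportions:  ", round(statistics.stdev(props), 4))
print("sqrt(p(1-p)/n):            ", round(theory_sd, 4))
```

The simulated mean and standard deviation should land close to p and to the formula value, though not exactly on them.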
Confidence Interval (C.I.) and Margin of Error (M.E.)
• Since for large n the sample proportion ($\hat{p}$) is approximately normal, we can conclude (using the empirical rule) that
– we are about 68% sure that the population proportion (p) lies within a margin of error of 1×S.E. of $\hat{p}$,
– we are about 95% sure that the population proportion (p) lies within a margin of error of 2×S.E. of $\hat{p}$.
• So the confidence interval for p is $\hat{p} \pm M.E.$
• Obviously, the more confidence you require, the larger the margin of error.
Is it 1.96 or 2 for 95% C.I.?
Find the exact area between -2 and 2 standard deviations from the mean on a normal curve using your calculator.
Hint: normalcdf(-2,2,0,1) = 0.954 = 95.4%.
So this is not exactly 95%, but slightly more.
On the other hand, the exact area between -1.96 and 1.96 standard deviations from the mean is normalcdf(-1.96,1.96,0,1) = 0.95 = 95%.
Using 1.96, we get the 95% C.I. for p to be: (0.097, 0.246).
Note: the calculator uses 1.96.
But how do we use the calculator to construct a C.I.?
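For readers without a TI calculator, the same areas can be checked with Python's standard library. This is a sketch: `area_between` and `critical_value` are helper names introduced here, and `statistics.NormalDist` plays the role of the calculator's normalcdf.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def area_between(a, b):
    """Area under the standard normal curve between a and b."""
    return Z.cdf(b) - Z.cdf(a)


def critical_value(confidence):
    """z such that the central area between -z and z equals `confidence`."""
    return Z.inv_cdf(1 - (1 - confidence) / 2)


print(round(area_between(-2, 2), 4))        # slightly more than 95%
print(round(area_between(-1.96, 1.96), 4))  # essentially 95%
print(round(critical_value(0.95), 3))       # multiplier for a 95% C.I.
```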
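Formula: C.I. for p
• The formula for a 100(1-α)% C.I. for p is given by
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}},$$
where $z_{\alpha/2}$ is such a number that $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$, where Z is a standard normal variable.
• However, one can use TI 83/84 to compute C.I.’s for p.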
C.I. with TI 83/84 Plus
We want to make an 85% confidence interval for the proportion of smokers among MSU undergraduates. In a random sample of 99 MSU undergraduates, 17 smoked tobacco last week.
• Press [STAT].
• Select [TESTS].
• Choose A: 1-PropZInt….
• Input the following:
o x: 17
o n: 99
o C-Level: 85
• Choose Calculate and press [ENTER].
Answer: 85% C.I. for p is (0.117, 0.226).
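The calculator steps above can be mirrored in Python. This is a sketch of the usual one-proportion z-interval, $\hat{p} \pm z \times S.E.$, not the calculator's actual implementation. The function name `one_prop_z_interval` is made up for this example, and the confidence level is passed as a fraction (0.85) rather than as 85.

```python
import math
from statistics import NormalDist


def one_prop_z_interval(x, n, c_level):
    """Interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    # Critical value leaving (1 - c_level) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - c_level) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = one_prop_z_interval(x=17, n=99, c_level=0.85)
print(f"85% C.I. for p: ({low:.3f}, {high:.3f})")
```

Changing `c_level` to 0.95 gives the interval based on 1.96 rather than on the rule-of-thumb value 2.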
How would I find a confidence interval for a proportion using a calculator?
[Figure: Sample Input and Sample Output calculator screens]
The sample input shows finding a 99% confidence interval with a sample size of 4040 people and 2048 smokers.
We would interpret the sample output as: “We are 99% confident that between 48.7% and 52.7% of the population smokes.”
Note: This example wasn’t actually about smoking.
Width of a C.I.
• Since the formula of the confidence interval for p is $\hat{p} \pm M.E.$, the width of the C.I. is 2×M.E.
• So if we know the width of the C.I., we can compute the M.E. by halving the width.
• Example: Given that a 90% C.I. for p is (0.23, 0.37), find the values of (a) the sample proportion and (b) the margin of error of the 90% C.I.
Solution: The width is (0.37 - 0.23) = 0.14, and so the margin of error of the 90% C.I. for p is 0.14/2 = 0.07.
Moreover, $\hat{p} - M.E. = 0.23$, and so $\hat{p} = 0.07 + 0.23 = 0.30$.
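The same recovery can be written as a short Python sketch. The helper name `center_and_margin` is hypothetical. It takes the midpoint of the interval as the sample proportion, which is equivalent to adding the margin of error to the lower endpoint as done in the solution.

```python
def center_and_margin(low, high):
    """Recover the sample proportion and margin of error from a C.I."""
    margin = (high - low) / 2   # half the width
    p_hat = (low + high) / 2    # midpoint, same as low + margin
    return p_hat, margin


p_hat, margin = center_and_margin(0.23, 0.37)
print(f"p_hat = {p_hat:.2f}, M.E. = {margin:.2f}")
```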
Smoker example
• We found that 17.2% of a sample of 99 MSU undergraduates had smoked in the past week.
• We used this to find a 95% confidence interval for the proportion of MSU undergraduates who smoke.
• Our 95% confidence interval is (0.096, 0.248).
• The width of the 95% C.I. is (0.248 - 0.096) = 0.152, and so the margin of error is (0.152/2) = 0.076.
• If we want to reduce the margin of error while keeping the confidence level the same, we could increase the sample size.
M.E. and sample size
• If we wanted to reduce the margin of error to 4%, what is the minimum number of undergrads we would have to survey?
• The formula is:
$$n = \frac{(z_{\alpha/2})^2\, pq}{(M.E.)^2}.$$
• But what p to use (remember q = 1-p)?
Two cases:
No information about p is given. In that case use p = 0.5.
In our exercise, if nothing about p is known:
$$n = \frac{1.96^2 \times 0.5 \times 0.5}{(M.E.)^2} = \frac{1.96^2 \times 0.5 \times 0.5}{(0.04)^2} = 600.25.$$
So we would need 601 subjects.
If some information about p is known, use that information.
If we use the information from the sample: p = 0.172, q = 1 - 0.172 = 0.828.
$$n = \frac{1.96^2\, pq}{(M.E.)^2} = \frac{3.8416\,(0.172)(0.828)}{(0.04)^2} = 341.94.$$
So we would need 342 subjects.
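Both cases of the sample-size calculation can be sketched in Python. The function name `required_sample_size` is introduced here for illustration. It uses z = 1.96 for 95% confidence as in the slides and rounds up, since a fractional subject is not possible.

```python
import math


def required_sample_size(margin, p=0.5, z=1.96):
    """Smallest n with z * sqrt(p * (1 - p) / n) no larger than `margin`."""
    n_exact = z ** 2 * p * (1 - p) / margin ** 2
    return math.ceil(n_exact)  # always round up to a whole subject


# Case 1: nothing is known about p, so use p = 0.5.
print(required_sample_size(0.04))

# Case 2: use the sample information p = 0.172.
print(required_sample_size(0.04, p=0.172))
```

Using p = 0.5 gives the largest possible value of pq, so it is the cautious choice when no estimate of p is available.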
Summary
• Larger sample size makes smaller margin of error.
• Larger confidence makes larger margin of error.
• The level of confidence is the proportion of intervals, over repeated random samples, that will contain the value of the population parameter.
• As long as the conditions are met, the confidence interval procedure works.
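The idea that the confidence level describes the long-run proportion of intervals that capture the parameter can be illustrated by simulation. This is a sketch under assumptions: the population proportion p = 0.17, the number of repetitions, and the seed are invented for the illustration, and the function name `coverage` is hypothetical.

```python
import math
import random


def coverage(p, n, reps, z=1.96, seed=1):
    """Fraction of simulated intervals p_hat +/- z * S.E. that contain p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = sum(rng.random() < p for _ in range(n))  # successes in one sample
        p_hat = x / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1
    return hits / reps


rate = coverage(p=0.17, n=99, reps=2000)  # p is a hypothetical true value
print(f"proportion of intervals containing p: {rate:.3f}")
```

The simulated proportion should be close to 95%, though usually not exactly 95%, because the normal approximation is only approximate at this sample size.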