Last lecture summary
Uploaded by annabel-mosley
![Page 1: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/1.jpg)
Last lecture summary
![Page 2: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/2.jpg)
![Page 3: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/3.jpg)
[Figure: the 20 population values: 2, 3, 4, 5, 6, 7, 8, 0, 3, 4, 4, 4, 3, 4, 2, 4, 3, 5, 3, 3]
![Page 4: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/4.jpg)
Population 2015
![Page 5: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/5.jpg)
Population 2014
![Page 6: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/6.jpg)
[Figure: the same 20 population values, annotated with two averages]
mean = 3.3
mean = 3.0
![Page 7: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/7.jpg)
Data 2015
Population:
4,3,3,5,0,4,4,4,3,4,2,6,8,2,4,3,5,7,3,3
25 samples (n=3) and their averages
3.3,5.3,3.6,4.3,2.3,3.0,3.6,3.0,5.3,5.6,3.3,4.3,3.3,4.0,5.6,4.3,4.3,4.6,6.3,3.3,4.0,3.3,4.6,3.0,4.3
http://blue-lover.blog.cz/1106/lentilky
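The sampling step on this slide can be imitated in a few lines of Python. This is only a sketch: it assumes the samples are drawn at random with replacement and uses an arbitrary seed, so its averages will not reproduce the 25 values listed on the slide. The function name `sample_averages` is made up for this illustration.

```python
import random
from statistics import mean

# Population values from the "Data 2015" slide
population = [4, 3, 3, 5, 0, 4, 4, 4, 3, 4, 2, 6, 8, 2, 4, 3, 5, 7, 3, 3]

def sample_averages(values, n, num_samples, seed=0):
    """Draw num_samples random samples of size n (with replacement, which is
    an assumption) and return the average of each sample."""
    rng = random.Random(seed)  # arbitrary seed, only for repeatability
    return [mean(rng.choices(values, k=n)) for _ in range(num_samples)]

averages = sample_averages(population, n=3, num_samples=25)
print("population mean:", mean(population))
print("mean of the 25 sample averages:", round(mean(averages), 2))
```

Raising `num_samples` to 50 or 300 mimics the histograms on the following slides.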
![Page 8: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/8.jpg)
2015, n = 3, number of samples = 25
![Page 9: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/9.jpg)
2015, n = 3, number of samples = 50
![Page 10: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/10.jpg)
2015, n = 3, number of samples = 300
![Page 11: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/11.jpg)
2015, n = 3, all possible samples (1540)
![Page 12: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/12.jpg)
2015, n = 5, all possible samples (42 504)
![Page 13: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/13.jpg)
2015, n = 10, all possible samples (20 030 010)
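The counts 1540, 42 504 and 20 030 010 match the number of unordered samples drawn with replacement from 20 values, i.e. the binomial coefficient C(20 + n − 1, n). That reading is inferred from the numbers and is not stated on the slides. A short sketch under that assumption:

```python
from itertools import combinations_with_replacement
from math import comb
from statistics import mean

population = [4, 3, 3, 5, 0, 4, 4, 4, 3, 4, 2, 6, 8, 2, 4, 3, 5, 7, 3, 3]
N = len(population)

def count_samples(n):
    """Number of unordered samples of size n drawn with replacement."""
    return comb(N + n - 1, n)

for n in (3, 5, 10):
    print(f"n = {n}: {count_samples(n)} possible samples")

# Enumerate every sample for n = 3. Indices are used so that equal
# population values are still treated as distinct items.
all_means = [
    mean(population[i] for i in idx)
    for idx in combinations_with_replacement(range(N), 3)
]
print("number of enumerated samples:", len(all_means))
print("mean of all sample means:", round(mean(all_means), 2))
```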
![Page 14: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/14.jpg)
Central limit theorem
• The distribution of sample means is normal.
• The distribution of means will increasingly approximate a normal distribution as the sample size increases.
• Its mean is equal to the population mean: $M = \mu_{\bar{x}} = \mu$.
• Its standard deviation is equal to the population standard deviation divided by the square root of $n$.
• $\sigma_{\bar{x}}$ is called the standard error: $SE = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
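The two formulas can be checked numerically on the 2015 population. The sketch below assumes ordered samples drawn with replacement, so that every sample is equally likely; in that setting both identities hold exactly. The variable names are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [4, 3, 3, 5, 0, 4, 4, 4, 3, 4, 2, 6, 8, 2, 4, 3, 5, 7, 3, 3]
mu = mean(population)       # population mean
sigma = pstdev(population)  # population standard deviation

n = 3
# All 20**3 = 8000 ordered samples of size 3, drawn with replacement
sample_means = [sum(sample) / n for sample in product(population, repeat=n)]

mean_of_means = mean(sample_means)
se_observed = pstdev(sample_means)  # spread of the sampling distribution
se_formula = sigma / sqrt(n)        # SE = sigma / sqrt(n)

print(f"mu = {mu:.3f}, mean of sample means = {mean_of_means:.3f}")
print(f"observed SE = {se_observed:.4f}, sigma / sqrt(n) = {se_formula:.4f}")
```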
![Page 15: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/15.jpg)
ESTIMATION, CONFIDENCE INTERVALS
![Page 16: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/16.jpg)
Statistical inference
If we can’t conduct a census, we collect data from a sample of the population.
Goal: draw conclusions about that population.
![Page 17: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/17.jpg)
Demonstration
• You sample 36 apples from your farm’s harvest of over 200 000 apples. The mean weight of the sample is 112 grams (with a 40 gram sample standard deviation).
• What is the probability that the mean weight of all 200 000 apples is between 100 and 124 grams?
![Page 18: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/18.jpg)
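What is the question?
• We would like to know the probability that the population mean $\mu$ is within 12 of the sample mean $\bar{x}$.
• But this is the same thing as the probability that the sample mean $\bar{x}$ is within 12 of the population mean $\mu$.
• And since $\mu_{\bar{x}} = \mu$, this is the same thing as the probability that $\bar{x}$ is within 12 of $\mu_{\bar{x}}$, the mean of the sampling distribution.
• So, if I am able to say how many standard deviations of the sampling distribution away from $\mu_{\bar{x}}$ I am, I can use the Z-table to figure out the probability.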
![Page 19: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/19.jpg)
Slight complication
• There is one caveat, can you see it?
• We don’t know the standard deviation of the sampling distribution (the standard error). We only know it equals $\frac{\sigma}{\sqrt{n}}$, but $\sigma$ is unknown.
• What we’re going to do is estimate $\sigma$. The best thing we can do is to use the sample standard deviation $s$.
• $SE \approx \frac{s}{\sqrt{n}} = \frac{40}{\sqrt{36}} = 6.67$. This is our best estimate of the standard error.
• Now you finish the example. What is the probability that the population mean lies within 12 of the sample mean if the SE equals 6.67?
• 92.82% (12 is $12 / 6.67 \approx 1.8$ standard errors, and the Z-table gives $2 \times 0.9641 - 1 = 0.9282$).
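The apple example can be reproduced with Python's standard library instead of a Z-table. This is a minimal sketch that, like the slide, treats the sample standard deviation as a stand-in for σ; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 36         # apples in the sample
s = 40.0       # sample standard deviation in grams
margin = 12.0  # "within 12 grams of the sample mean"

se = s / sqrt(n)  # estimated standard error
z = margin / se   # distance expressed in standard errors

standard_normal = NormalDist()  # mean 0, standard deviation 1
probability = standard_normal.cdf(z) - standard_normal.cdf(-z)

print(f"SE = {se:.2f}, z = {z:.2f}, probability = {probability:.4f}")
```

The result is about 0.928. The slide's 92.82% differs in the last digit because the Z-table rounds the cumulative probability to four decimals before doubling.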
![Page 20: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/20.jpg)
This is neat!
• You sample 36 apples from your farm’s harvest of over 200 000 apples. The mean weight of the sample is 112 grams (with a 40 gram sample standard deviation). What is the probability that the population mean weight of all 200 000 apples is between 100 and 124 grams?
• We started with very little information (we know just the sample statistics), but we can infer that with a probability of 92.82% the population mean lies within 12 grams of our sample mean!
![Page 21: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/21.jpg)
Point vs. interval estimate
• You sample 36 apples from your farm’s harvest of over 200 000 apples. The mean weight of the sample is 112 grams (with a 40 gram sample standard deviation).
• Goal: estimate the population mean.
1. The population mean is estimated as the sample mean, i.e. we say the population mean equals 112 g. This is called a point estimate (Czech: bodový odhad).
2. However, we can do better. We can estimate that the true population mean lies, with 95% confidence, within the following interval (interval estimate):
$\bar{x} \pm 1.96 \times \frac{s}{\sqrt{n}}$
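Plugging the apple sample into the interval formula gives concrete bounds. A minimal sketch; the helper name `z_interval` is made up for this illustration.

```python
from math import sqrt

def z_interval(sample_mean, s, n, z=1.96):
    """Interval estimate: sample_mean +/- z * s / sqrt(n)."""
    margin = z * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Apple example: sample mean 112 g, s = 40 g, n = 36
low, high = z_interval(112.0, 40.0, 36)
print(f"95% interval estimate: {low:.2f} g to {high:.2f} g")
```

The point estimate is the single value 112 g; the interval estimate is roughly 98.9 g to 125.1 g.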
![Page 22: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/22.jpg)
Confidence interval
• This type of result is called a confidence interval (Czech: interval spolehlivosti, konfidenční interval).
• The number of standard errors you want to add/subtract depends on the confidence level (e.g. 95%) (Czech: hladina spolehlivosti).
$\bar{x} \pm Z \times \frac{s}{\sqrt{n}}$
• $Z$ is the critical value (Czech: kritická hodnota).
• $Z \times \frac{s}{\sqrt{n}}$ is the margin of error (Czech: možná odchylka).
![Page 23: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/23.jpg)
Confidence level
• The desired level of confidence is set by the researcher, not determined by the data.
• If you want to be 95% confident in your results, you add/subtract 1.96 standard errors (the empirical rule says about 2 standard errors).
• This gives the 95% confidence interval.
| Confidence level (%) | Z-value |
|---|---|
| 80 | 1.28 |
| 90 | 1.64 |
| 95 | 1.96 |
| 98 | 2.33 |
| 99 | 2.58 |
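The Z-values in the table are quantiles of the standard normal distribution: for a confidence level c, the critical value cuts off (1 − c)/2 in each tail. A minimal sketch using the standard library; the function name `z_critical` is illustrative.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: Z = {z_critical(level):.3f}")
```

To three decimals the 90% value is 1.645, which the table shows as 1.64.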
![Page 24: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/24.jpg)
[Figure: four panels for the 80%, 90%, 95% and 99% confidence levels, with critical values 1.28, 1.64, 1.96 and 2.58]
![Page 25: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/25.jpg)
Small sample size confidence intervals
• The blood pressure of 7 patients was measured after they had been given a new drug for 3 months. They had blood pressure increases of 1.5, 2.9, 0.9, 3.9, 3.2, 2.1 and 1.9. Construct a 95% confidence interval for the true expected blood pressure increase for all patients in the population.
![Page 26: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/26.jpg)
• We will assume that our population distribution is normal, with mean $\mu$ and standard deviation $\sigma$.
• We don’t know anything about this distribution, but we have a sample. Let’s figure out everything we can about this sample:
• $\bar{x} = 2.34$, $s = 1.04$
• We estimate the true population standard deviation with the sample standard deviation: $\sigma \approx s = 1.04$.
• However, we are estimating our standard deviation with $n$ of only seven! This is probably not going to be a very good estimate.
• In general, if $n < 30$ this is considered a bad estimate.
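The sample statistics for the seven patients can be computed directly. A minimal sketch; `stdev` is the sample standard deviation (dividing by n − 1), which is the appropriate estimate of σ here.

```python
from statistics import mean, stdev

# Blood pressure increases of the 7 patients
increases = [1.5, 2.9, 0.9, 3.9, 3.2, 2.1, 1.9]

n = len(increases)
x_bar = mean(increases)  # sample mean
s = stdev(increases)     # sample standard deviation (n - 1 in the denominator)

print(f"n = {n}, sample mean = {x_bar:.2f}, s = {s:.2f}")
```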
![Page 27: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/27.jpg)
William Sealy Gosset aka Student
• 1876-1937
• an employee of Guinness brewery
• 1908 papers addressed the brewer's concern with small samples
• "The probable error of a mean". Biometrika 6 (1): 1–25. March 1908.
• "Probable error of a correlation coefficient". Biometrika 6 (2/3): 302–310. September 1908.
![Page 28: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/28.jpg)
Student t-distribution
• Instead of assuming a sampling distribution is normal, we will use a Student t-distribution.
• It gives a better estimate of your confidence interval if you have a small sample size.
• It looks very similar to a normal distribution, but it has fatter tails to indicate the higher frequency of outliers which come with a small data set.
![Page 29: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/29.jpg)
Student t-distribution
![Page 30: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/30.jpg)
Student t-distribution
df: degrees of freedom (Czech: stupeň volnosti)
![Page 31: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/31.jpg)
Back to our case
• Because the sample size is small, the sampling distribution of the mean won’t be normal. Instead, it will have a Student t-distribution with $df = n - 1 = 6$.
• Construct a 95% confidence interval, please.
For $n < 30$: $\bar{x} \pm t_{n-1} \times \frac{s}{\sqrt{n}}$
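One way to finish the exercise is sketched here. The critical value 2.447 is not on the slides; it is the two-sided 95% value for 6 degrees of freedom from a standard t-table, hard-coded so the example needs only the standard library.

```python
from math import sqrt
from statistics import mean, stdev

increases = [1.5, 2.9, 0.9, 3.9, 3.2, 2.1, 1.9]

n = len(increases)
x_bar = mean(increases)
s = stdev(increases)

# Assumption: critical value read from a t-table (df = n - 1 = 6, 95%, two-sided)
T_CRITICAL_DF6_95 = 2.447

se = s / sqrt(n)
margin = T_CRITICAL_DF6_95 * se
low, high = x_bar - margin, x_bar + margin
print(f"95% t-interval: {low:.2f} to {high:.2f}")

# For comparison, the normal critical value 1.96 would give a narrower interval
z_margin = 1.96 * se
print(f"t margin = {margin:.2f}, z margin = {z_margin:.2f}")
```

The t-based margin is wider than the Z-based one, which reflects the fatter tails of the t-distribution for small samples.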
![Page 32: Last lecture summary. 2345678 0 344434243533 Population 2015.](https://reader036.fdocuments.in/reader036/viewer/2022062309/5697c02e1a28abf838cda4af/html5/thumbnails/32.jpg)
• Just to summarize, the margin of error depends on
1. the confidence level (95% is common)
2. the sample size
• as the sample size increases, the margin of error decreases
• for a bigger sample we get a smaller interval in which we’re pretty sure the true population mean lies
3. the variability of the data (i.e. on σ)
• more variability increases the margin of error
• The margin of error does not measure anything other than chance variation.
• It doesn’t measure any bias or errors that happen during the process.
• It does not tell you anything about the correctness of your data!!!
$\text{margin of error} = \text{something} \times \frac{s}{\sqrt{n}}$, where "something" is the critical value ($Z$ or $t$)
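The dependence on sample size can be made concrete with the apple numbers. This sketch assumes the same sample standard deviation (40 g) at every sample size, which is a simplification; the function name `margin_of_error` is illustrative.

```python
from math import sqrt

def margin_of_error(critical_value, s, n):
    """Margin of error = critical value * s / sqrt(n)."""
    return critical_value * s / sqrt(n)

s = 40.0  # sample standard deviation in grams, held fixed (an assumption)
for n in (36, 144, 576):
    print(f"n = {n:3d}: margin = {margin_of_error(1.96, s, n):.2f} g")
```

Each fourfold increase in the sample size halves the margin of error, because n enters through its square root.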