Unit 4: Inference for numerical data - 2. ANOVA · Outline 1. Housekeeping 2. Main ideas 1....
Unit 4: Inference for numerical data - 2. ANOVA
GOVT 3990 - Spring 2020
Cornell University
Dr. Garcia-Rios
Slides posted at http://garciarios.github.io/govt_3990/
Outline
1. Housekeeping
2. Main ideas
1. Comparing many means requires care
2. ANOVA tests for some difference in means of many different groups
3. ANOVA compares between group variation to within group variation
4. To identify which means are different, use t-tests and the Bonferroni correction
3. Summary
Announcements
- Slack
- Class Zoom link
NEWS FLASH! Jelly beans rumored to cause acne!!!
How would you check this rumor? Imagine that doctors can assign an “acne score” to patients on a 0-100 scale.
- What would your research question be?
- How would you conduct your study?
- What statistical test would you use?
Use an independent samples t-test:
$H_0: \mu_{\text{jelly beans}} - \mu_{\text{placebo}} = 0$
$H_A: \mu_{\text{jelly beans}} - \mu_{\text{placebo}} \neq 0$
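The test statistic behind this comparison can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the acne scores are made-up numbers, the function name `two_sample_t` is hypothetical, and the degrees of freedom use the conservative min(n1 - 1, n2 - 1) rule, which is an assumption about how the test is carried out.

```python
from math import sqrt
from statistics import fmean, variance


def two_sample_t(a, b):
    """Return (t, df) for an independent samples t-test of mean(a) - mean(b) = 0."""
    # Standard error built from each group's own sample variance (unpooled).
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (fmean(a) - fmean(b)) / se
    # Conservative degrees of freedom (an assumption; software often
    # uses Welch's approximation instead).
    df = min(len(a), len(b)) - 1
    return t, df


# Hypothetical acne scores on the 0-100 scale, for illustration only.
jelly_beans = [52, 48, 61, 55, 49, 58]
placebo = [50, 47, 53, 51, 46, 54]

t, df = two_sample_t(jelly_beans, placebo)
print(f"t = {t:.2f} on {df} degrees of freedom")
```

Turning the statistic into a p-value requires the t distribution, which the Python standard library does not provide; a statistics package would be needed for that step.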
Your turn
Suppose α = 0.05. What is the probability of making a Type 1 error and rejecting a null hypothesis like
$H_0: \mu_{\text{purple jelly bean}} - \mu_{\text{placebo}} = 0$
when it is actually true?
(a) 1%
(b) 5%
(c) 36%
(d) 64%
(e) 95%
Your turn
Suppose we want to test 20 different colors of jelly beans versus a placebo with hypotheses like
$H_0: \mu_{\text{purple jelly bean}} - \mu_{\text{placebo}} = 0$
$H_0: \mu_{\text{brown jelly bean}} - \mu_{\text{placebo}} = 0$
$H_0: \mu_{\text{peach jelly bean}} - \mu_{\text{placebo}} = 0$
...
and we use α = 0.05 for each of these tests. What is the probability of making at least one Type 1 error in these 20 independent tests?
(a) 1%
(b) 5%
(c) 36%
(d) 64% → $1 - (1 - 0.05)^{20}$
(e) 95%
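The arithmetic behind the 64% answer can be checked directly. The sketch below assumes the tests are independent, as the question states; the function name `familywise_error_rate` is a hypothetical label for the quantity 1 - (1 - α)^m.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one Type 1 error in m independent tests at level alpha."""
    # P(no Type 1 error in any test) = (1 - alpha)^m, so take the complement.
    return 1.0 - (1.0 - alpha) ** m


# The error rate grows quickly as the number of tests increases.
for m in (1, 5, 10, 20):
    print(f"{m:2d} tests: {familywise_error_rate(0.05, m):.3f}")
```

With a single test the rate is just α, while 20 tests at α = 0.05 push it to roughly 0.64, which is why comparing many means one pair at a time requires care.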
ANOVA tests for some difference in means of many different groups
Null hypothesis:
$H_0: \mu_{\text{placebo}} = \mu_{\text{purple}} = \mu_{\text{brown}} = \ldots = \mu_{\text{peach}} = \mu_{\text{orange}}$
Your turn
Which of the following is a correct statement of the alternative hypothesis?
(a) For any two groups, including the placebo group, no two group means are the same.
(b) For any two groups, not including the placebo group, no two group means are the same.
(c) Among the jelly bean groups, there are at least two groups that have different group means from each other.
(d) Amongst all groups, there are at least two groups that have different group means from each other.
ANOVA compares between group variation to within group variation
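$$\frac{\sum_i \sum_j (\bar{y}_i - \bar{y})^2}{\sum_i \sum_j (y_{ij} - \bar{y}_i)^2}$$
Here $y_{ij}$ is observation $j$ in group $i$, $\bar{y}_i$ is the mean of group $i$, and $\bar{y}$ is the overall mean. The numerator sums the squared between-group deviations and the denominator sums the squared within-group deviations.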
Relatively large WITHIN group variation: little apparent difference
Relatively large BETWEEN group variation: there may be a difference
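For historical reasons, we use a modification of this ratio called the F-statistic:
$$F = \frac{\sum_i \sum_j (\bar{y}_i - \bar{y})^2 / (k - 1)}{\sum_i \sum_j (y_{ij} - \bar{y}_i)^2 / (n - k)} = \frac{MSG}{MSE}$$
k: # of groups; n: # of obs.; $y_{ij}$: observation $j$ in group $i$; $\bar{y}_i$: mean of group $i$; $\bar{y}$: overall mean.

| | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| Between groups | k-1 | $\sum_i \sum_j (\bar{y}_i - \bar{y})^2$ | MSG | $F_{obs}$ | $p_{obs}$ |
| Within groups | n-k | $\sum_i \sum_j (y_{ij} - \bar{y}_i)^2$ | MSE | | |
| Total | n-1 | $\sum_i \sum_j \left((\bar{y}_i - \bar{y}) + (y_{ij} - \bar{y}_i)\right)^2$ | | | |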
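The entries of the ANOVA table can be computed by hand for a small example. The sketch below is illustrative only: the function name `one_way_anova` and the toy data are assumptions, not part of the slides, and the p-value column is left out because it needs the F distribution, which the Python standard library does not provide.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return the sums of squares, mean squares, and F for a list of groups."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = fmean([x for g in groups for x in g])

    # Between groups: every observation contributes
    # (its group mean - overall mean)^2.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: every observation contributes
    # (observation - its group mean)^2.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    msg = ss_between / (k - 1)
    mse = ss_within / (n - k)
    return {
        "df_between": k - 1,
        "df_within": n - k,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "msg": msg,
        "mse": mse,
        "f": msg / mse,
    }


# Toy data, for illustration only: three groups of three observations.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(table)
```

For this toy data the group means are 2, 3, and 7 and the overall mean is 4, so the between-group sum of squares is 42, the within-group sum of squares is 6, and F = 21 / 1 = 21.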
To identify which means are different, use t-tests and the Bonferroni correction
- If the ANOVA yields a significant result, the next natural question is: “Which means are different?”
- Use t-tests comparing each pair of means to each other,
  - with a common variance (MSE from the ANOVA table) instead of each group’s variances in the calculation of the standard error,
  - and with a common degrees of freedom ($df_E$ from the ANOVA table).
- Compare the resulting p-values to a modified significance level
$$\alpha^\star = \frac{\alpha}{K}$$
where K is the total number of pairwise tests.
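The procedure above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `bonferroni_pairwise` and the toy data are hypothetical, every pair of groups is compared so that K = k(k - 1)/2, and only the t statistics are computed because p-values need the t distribution, which is outside the Python standard library.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean


def bonferroni_pairwise(groups, alpha=0.05):
    """Return (alpha_star, df_error, t_stats) for all pairwise comparisons."""
    k = len(groups)
    n = sum(len(g) for g in groups)

    # Common variance and degrees of freedom, as in the ANOVA table.
    mse = sum((x - fmean(g)) ** 2 for g in groups for x in g) / (n - k)
    df_error = n - k

    pairs = list(combinations(range(k), 2))
    alpha_star = alpha / len(pairs)      # Bonferroni: alpha / K

    t_stats = {}
    for i, j in pairs:
        # Standard error uses MSE instead of each group's own variance.
        se = sqrt(mse * (1 / len(groups[i]) + 1 / len(groups[j])))
        t_stats[(i, j)] = (fmean(groups[i]) - fmean(groups[j])) / se
    return alpha_star, df_error, t_stats


# Toy data, for illustration only.
alpha_star, df_error, t_stats = bonferroni_pairwise([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(alpha_star, df_error, t_stats)
```

Each t statistic would then be converted to a p-value using a t distribution with `df_error` degrees of freedom and compared to `alpha_star`. With 20 jelly bean colors plus a placebo there are 21 groups, so comparing every pair gives K = 210 and a much stricter threshold than 0.05.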
Application exercise: 4.4 ANOVA
Summary of main ideas
1. ??
2. ??
3. ??
4. ??