One-way ANOVA
• Example
• Analysis of Variance Hypotheses
• Model & Assumptions
• Analysis of Variance
• Multiple Comparisons
• Checking Assumptions
Example: Days Absent by Job Type
[Figure: dot plots of days absent for job types A, B, C, and D, each on a common axis from 2 to 16]
Analysis of Variance
• Analysis of Variance is a widely used statistical technique that partitions the total variability in our data into components of variability that are used to test hypotheses about equality of population means.
• In One-way ANOVA, we wish to test the hypothesis:
H0: μ1 = μ2 = … = μk
against:
Ha: Not all population means are the same
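The partition of variability described above can be sketched in a few lines of Python. This is only an illustration: the three small groups are hypothetical values invented for the sketch, not the days-absent data, and the names (groups, ss_treatment, ss_error, ss_total) are assumptions chosen here.

```python
# Hypothetical data: three small groups invented purely for illustration.
groups = {
    "g1": [4.0, 5.0, 6.0],
    "g2": [7.0, 8.0, 9.0],
    "g3": [1.0, 2.0, 3.0],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


all_values = [x for sample in groups.values() for x in sample]
grand_mean = mean(all_values)

# Treatment (between-group) sum of squares: group means around the grand mean.
ss_treatment = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
# Error (within-group) sum of squares: observations around their own group mean.
ss_error = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)
# Total sum of squares: observations around the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

df_treatment = len(groups) - 1            # k - 1
df_error = len(all_values) - len(groups)  # n - k
f_statistic = (ss_treatment / df_treatment) / (ss_error / df_error)

print(ss_treatment, ss_error, ss_total, f_statistic)
```

The treatment and error sums of squares add up to the total, and the F statistic is the ratio of the two mean squares; a large F suggests that the group means are not all equal.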
Assumptions
• Each population being sampled is normally distributed, and all populations are equally variable.
• Normality can be checked with skewness/kurtosis or normal probability plots. If any sample does not look like it comes from a normal population, the assumption is not met, unless the samples that do not look normal are large (n > 30).
• Equal variability can be checked by comparing the sample standard deviations. If no standard deviation is more than twice as large as another, equal variability can be assumed.
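The standard-deviation rule of thumb above could be checked along these lines. The samples below are hypothetical and the variable names are assumptions made for this sketch; only the "no more than 2 times" threshold comes from the slide.

```python
from statistics import stdev

# Hypothetical samples, invented only to illustrate the rule of thumb.
samples = {
    "A": [9.0, 11.0, 10.0, 12.0, 8.0],
    "B": [2.0, 3.0, 4.0, 2.0, 3.0],
    "C": [6.0, 7.0, 5.0, 8.0, 6.0],
}

# Sample standard deviation (n - 1 in the denominator) for each group.
sds = {name: stdev(values) for name, values in samples.items()}

# Ratio of the largest to the smallest standard deviation.
ratio = max(sds.values()) / min(sds.values())

# Rule of thumb from the slide: equal variability is assumed
# when no standard deviation is more than 2 times another.
equal_variability_ok = ratio <= 2

print(sds, round(ratio, 2), equal_variability_ok)
```

Comparing only the largest and smallest standard deviations is enough, because every other pair has a ratio no larger than that one.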
Example: Are the population mean days absent the same for all 4 job types?
ANOVA table
Source       SS          df   MS         F       p-value
Treatment    763.823      3   254.6076   69.48   5.72E-24
Error        351.795     96     3.6645
Total      1,115.618     99
H0: μA = μB = μC = μD
Ha: Not all μ’s are equal
The p-value (5.72E-24) is far below any conventional significance level, so we reject H0. We can be almost 100% confident that the population mean days absent differ in some way among the 4 job types.
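As a quick arithmetic check, the mean squares and F statistic can be recomputed from the sums of squares and degrees of freedom reported in the ANOVA table. This sketch uses only the table's numbers; the variable names are assumptions, and the last printed digit may differ slightly from the table because the reported sums of squares are themselves rounded.

```python
# Values copied from the ANOVA table in the example.
ss_treatment, df_treatment = 763.823, 3
ss_error, df_error = 351.795, 96

# Mean square = sum of squares / degrees of freedom.
ms_treatment = ss_treatment / df_treatment
ms_error = ss_error / df_error

# F statistic = treatment mean square / error mean square.
f_statistic = ms_treatment / ms_error

# The total row is the sum of the treatment and error rows.
ss_total = ss_treatment + ss_error
df_total = df_treatment + df_error

print(f"MS treatment = {ms_treatment:.4f}")
print(f"MS error     = {ms_error:.4f}")
print(f"F            = {f_statistic:.2f}")
print(f"SS total     = {ss_total:.3f}, df total = {df_total}")
```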
Multiple Comparisons
• A significant F-test tells us that at least two of the underlying population means are different, but it does not tell us which ones differ from the others.
• We need extra tests to compare all the means, which we call Multiple Comparisons.
• We look at the difference between every pair of group population means, as well as the p-value for each difference.
• When we have k groups, there are:
$\binom{k}{2} = \frac{k!}{2!\,(k-2)!} = \frac{k(k-1)}{2}$
possible pair-wise comparisons. For example, 4 groups have 4*3/2 = 6 comparisons.
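The count of pair-wise comparisons can be confirmed with Python's standard library. The job type labels A to D come from the example; the variable names are assumptions made for this sketch.

```python
from itertools import combinations
from math import comb

job_types = ["A", "B", "C", "D"]

# Number of ways to choose 2 groups out of k, i.e. k(k - 1)/2.
n_pairs = comb(len(job_types), 2)

# The actual pairs that would be compared.
pairs = list(combinations(job_types, 2))

print(n_pairs)
print(pairs)
```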
Multiple Comparisons
• If we estimate each comparison separately with 95% confidence, the overall confidence will be less than 95%.
• So, using ordinary pair-wise comparisons (i.e. lots of individual t-tests), we tend to find too many significant differences between our sample means.
• We need to modify our p-values so that we determine the true differences with 95% confidence across the entire set of comparisons.
• These methods are known as multiple comparison procedures.
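A rough sense of how quickly the overall confidence drops can be had from the sketch below. It assumes the comparisons are independent, which is not exactly true for pair-wise comparisons that share group means, so the result is an illustration rather than the exact overall confidence. The function name is hypothetical.

```python
def overall_confidence(per_comparison_confidence, n_comparisons):
    """Overall confidence if every comparison were independent (an assumption)."""
    return per_comparison_confidence ** n_comparisons


# Six pair-wise comparisons for 4 groups, each made with 95% confidence.
overall = overall_confidence(0.95, 6)

# Chance of at least one false difference across the set of comparisons.
experimentwise_error = 1 - overall

print(f"overall confidence   = {overall:.3f}")
print(f"experimentwise error = {experimentwise_error:.3f}")
```

Under this simplifying assumption the overall confidence for six comparisons falls to roughly 74%, which is why the individual comparisons need adjusting.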
Multiple Comparisons
• We use Tukey simultaneous comparisons.
• Tukey simultaneous comparisons overcome the problem of unadjusted pair-wise comparisons finding too many significant differences (i.e. p-values that are too small).
Tukey Pair-wise Comparisons
Tukey simultaneous comparison t-values (d.f. = 96)

                B       D       C       A
                2.73    5.03    6.74    10.11
B    2.73
D    5.03       4.15
C    6.74       7.47    3.05
A   10.11      14.04    9.24    6.36

critical values for experimentwise error rate:
0.05    2.62
0.01    3.21
We can be at least 99% confident that Job Type A has the highest population mean days absent. We can also be more than 99% confident that C and D each have a larger population mean days absent than B. We can be only 95% confident that Job Type C has a higher population mean than D, because that t-value (3.05) exceeds the 0.05 critical value (2.62) but not the 0.01 critical value (3.21).
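The reading of the Tukey table can be reproduced by comparing each t-value with the two critical values. The numbers below are copied from the table; the dictionary layout and the function name are assumptions made for this sketch.

```python
# Tukey simultaneous comparison t-values from the table, keyed by (row, column).
t_values = {
    ("D", "B"): 4.15,
    ("C", "B"): 7.47,
    ("C", "D"): 3.05,
    ("A", "B"): 14.04,
    ("A", "D"): 9.24,
    ("A", "C"): 6.36,
}

# Critical values for the experimentwise error rate, from the table.
critical = {0.05: 2.62, 0.01: 3.21}


def strongest_level(t):
    """Return the smallest experimentwise error rate at which t is significant."""
    if abs(t) >= critical[0.01]:
        return "0.01"
    if abs(t) >= critical[0.05]:
        return "0.05"
    return "not significant"


results = {pair: strongest_level(t) for pair, t in t_values.items()}

for (high, low), level in results.items():
    print(f"{high} vs {low}: t = {t_values[(high, low)]:5.2f} -> {level}")
```

Only the C versus D comparison falls between the two critical values, which matches the weaker 95% statement for that pair.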