Post on 28-Dec-2015
Slide 1
© 2005 Thomson/South-Western
Chapter 13, Part A
Analysis of Variance and Experimental Design
• Introduction to Analysis of Variance
• Analysis of Variance: Testing for the Equality of k Population Means
• Multiple Comparison Procedures
Slide 2
Introduction to Analysis of Variance
• Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means.
• Data obtained from observational or experimental studies can be used for the analysis.
• We want to use the sample results to test the following hypotheses:
    $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
    $H_a$: Not all population means are equal
Slide 3
Introduction to Analysis of Variance
    $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
    $H_a$: Not all population means are equal
• If $H_0$ is rejected, we cannot conclude that all population means are different.
• Rejecting $H_0$ means that at least two population means have different values.
Slide 4
Introduction to Analysis of Variance
Sampling Distribution of $\bar{x}$ Given $H_0$ is True
[Figure: a single sampling distribution of $\bar{x}$ with variance $\sigma_{\bar{x}}^2 = \sigma^2/n$; the sample means $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$ all fall within it.]
Sample means are close together because there is only one sampling distribution when $H_0$ is true.
Slide 5
Introduction to Analysis of Variance
Sampling Distribution of $\bar{x}$ Given $H_0$ is False
[Figure: three sampling distributions with means $\mu_1$, $\mu_2$, $\mu_3$; the sample means $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$ each come from a different one.]
Sample means come from different sampling distributions and are not as close together when $H_0$ is false.
Slide 6
Assumptions for Analysis of Variance
• For each population, the response variable is normally distributed.
• The variance of the response variable, denoted $\sigma^2$, is the same for all of the populations.
• The observations must be independent.
Slide 7
Analysis of Variance: Testing for the Equality of k Population Means
• Between-Treatments Estimate of Population Variance
• Within-Treatments Estimate of Population Variance
• Comparing the Variance Estimates: The F Test
• ANOVA Table
Slide 8
Between-Treatments Estimate of Population Variance
A between-treatments estimate of $\sigma^2$ is called the mean square treatment and is denoted MSTR.
$$\text{MSTR} = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$$
where $\bar{x}_j$ is the mean of sample j, $n_j$ is its size, and $\bar{\bar{x}}$ is the overall sample mean.
• The numerator is the sum of squares due to treatments and is denoted SSTR.
• The denominator represents the degrees of freedom associated with SSTR.
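The MSTR formula can be sketched in a few lines of Python. This is only an illustration: the function name `mean_square_treatments` and the small data set are hypothetical and do not come from the slides.

```python
def mean_square_treatments(samples):
    """Return (SSTR, MSTR) for a list of samples, one list per treatment."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    # Overall sample mean: every observation pooled together.
    grand_mean = sum(sum(s) for s in samples) / n_total
    # SSTR: squared distance of each sample mean from the overall mean,
    # weighted by that sample's size.
    sstr = sum(len(s) * (sum(s) / len(s) - grand_mean) ** 2 for s in samples)
    return sstr, sstr / (k - 1)


# Hypothetical data (not from the slides): three treatments, three observations each.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
sstr, mstr = mean_square_treatments(samples)
print(f"SSTR = {sstr:.2f}, MSTR = {mstr:.2f}")
```

Pooling all observations for the overall mean keeps the sketch valid when the sample sizes differ.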
Slide 9
Within-Treatments Estimate of Population Variance
The estimate of $\sigma^2$ based on the variation of the sample observations within each sample is called the mean square error and is denoted by MSE.
$$\text{MSE} = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$$
where $s_j^2$ is the variance of sample j and $n_T$ is the total number of observations.
• The numerator is the sum of squares due to error and is denoted SSE.
• The denominator represents the degrees of freedom associated with SSE.
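A minimal sketch of the MSE formula, using the standard-library `statistics.variance` for each sample variance $s_j^2$. The function name `mean_square_error` and the data are hypothetical, chosen only so the arithmetic is easy to follow.

```python
from statistics import variance


def mean_square_error(samples):
    """Return (SSE, MSE) for a list of samples, one list per treatment."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    # SSE pools the within-sample variation: (n_j - 1) * s_j^2 for each sample.
    sse = sum((len(s) - 1) * variance(s) for s in samples)
    return sse, sse / (n_total - k)


# Hypothetical data (not from the slides); each sample has variance 1.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
sse, mse = mean_square_error(samples)
print(f"SSE = {sse:.2f}, MSE = {mse:.2f}")
```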
Slide 10
Comparing the Variance Estimates: The F Test
• If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSTR/MSE is an F distribution with MSTR d.f. equal to k - 1 and MSE d.f. equal to $n_T$ - k.
• If the means of the k populations are not equal, the value of MSTR/MSE will be inflated because MSTR overestimates $\sigma^2$.
• Hence, we will reject $H_0$ if the resulting value of MSTR/MSE appears to be too large to have been selected at random from the appropriate F distribution.
Slide 11
Test for the Equality of k Population Means
• Hypotheses
    $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
    $H_a$: Not all population means are equal
• Test Statistic
    F = MSTR/MSE
Slide 12
Test for the Equality of k Population Means
• Rejection Rule
    p-value Approach: Reject $H_0$ if p-value < $\alpha$
    Critical Value Approach: Reject $H_0$ if F > $F_\alpha$
  where the value of $F_\alpha$ is based on an F distribution with k - 1 numerator d.f. and $n_T$ - k denominator d.f.
Slide 13
Sampling Distribution of MSTR/MSE
• Rejection Region
[Figure: sampling distribution of MSTR/MSE; horizontal axis MSTR/MSE; "Do Not Reject $H_0$" below the critical value $F_\alpha$ and "Reject $H_0$" above it.]
Slide 14
ANOVA Table

Source of     Sum of     Degrees of     Mean
Variation     Squares    Freedom        Squares    F
Treatment     SSTR       k - 1          MSTR       MSTR/MSE
Error         SSE        n_T - k        MSE
Total         SST        n_T - 1

SST is partitioned into SSTR and SSE.
SST’s degrees of freedom (d.f.) are partitioned into SSTR’s d.f. and SSE’s d.f.
Slide 15
ANOVA Table
• SST divided by its degrees of freedom $n_T$ - 1 is the overall sample variance that would be obtained if we treated the entire set of observations as one data set.
• With the entire data set as one sample, the formula for computing the total sum of squares, SST, is:
$$\text{SST} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2 = \text{SSTR} + \text{SSE}$$
Slide 16
ANOVA Table
• ANOVA can be viewed as the process of partitioning the total sum of squares and the degrees of freedom into their corresponding sources: treatments and error.
• Dividing the sum of squares by the appropriate degrees of freedom provides the variance estimates and the F value used to test the hypothesis of equal population means.
Slide 17
Test for the Equality of k Population Means
Example: Reed Manufacturing
Janet Reed would like to know if there is any significant difference in the mean number of hours worked per week for the department managers at her three manufacturing plants (in Buffalo, Pittsburgh, and Detroit).
Slide 18
Test for the Equality of k Population Means
Example: Reed Manufacturing
A simple random sample of five managers from each of the three plants was taken and the number of hours worked by each manager for the previous week is shown on the next slide.
Conduct an F test using $\alpha$ = .05.
Slide 19
Test for the Equality of k Population Means

                   Plant 1    Plant 2       Plant 3
Observation        Buffalo    Pittsburgh    Detroit
1                  48         73            51
2                  54         63            63
3                  57         66            61
4                  54         64            54
5                  62         74            56
Sample Mean        55         68            57
Sample Variance    26.0       26.5          24.5
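The sample means and variances in the table can be reproduced from the fifteen observations. A minimal sketch using Python's standard `statistics` module; the dictionary name `hours` is just a label chosen here.

```python
from statistics import mean, variance

# Hours worked, as listed in the table above (five managers per plant).
hours = {
    "Buffalo": [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit": [51, 63, 61, 54, 56],
}

# statistics.variance uses the n - 1 divisor, i.e. the sample variance.
summary = {plant: (mean(obs), variance(obs)) for plant, obs in hours.items()}
for plant, (m, v) in summary.items():
    print(f"{plant}: mean = {m:.0f}, variance = {v:.1f}")
```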
Slide 20
Test for the Equality of k Population Means
p-Value and Critical Value Approaches
1. Develop the hypotheses.
    $H_0: \mu_1 = \mu_2 = \mu_3$
    $H_a$: Not all the means are equal
   where:
    $\mu_1$ = mean number of hours worked per week by the managers at Plant 1
    $\mu_2$ = mean number of hours worked per week by the managers at Plant 2
    $\mu_3$ = mean number of hours worked per week by the managers at Plant 3
Slide 21
Test for the Equality of k Population Means
p-Value and Critical Value Approaches
2. Specify the level of significance.
    $\alpha$ = .05
3. Compute the value of the test statistic.
   Mean Square Due to Treatments
    $\bar{\bar{x}}$ = (55 + 68 + 57)/3 = 60   (Sample sizes are all equal.)
    SSTR = 5(55 - 60)^2 + 5(68 - 60)^2 + 5(57 - 60)^2 = 490
    MSTR = 490/(3 - 1) = 245
Slide 22
Test for the Equality of k Population Means
p-Value and Critical Value Approaches
3. Compute the value of the test statistic. (continued)
   Mean Square Due to Error
    SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308
    MSE = 308/(15 - 3) = 25.667
    F = MSTR/MSE = 245/25.667 = 9.55
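The arithmetic in step 3 can be checked from the summary statistics alone. This sketch is not part of the original deck, and the variable names are chosen here. Averaging the three sample means to get the overall mean is valid only because the sample sizes are equal.

```python
# Summary statistics for the Reed Manufacturing example.
means = [55, 68, 57]            # sample means for Plants 1, 2, 3
variances = [26.0, 26.5, 24.5]  # sample variances for Plants 1, 2, 3
n = 5                           # observations per plant
k = len(means)
n_total = n * k

# Equal sample sizes, so the overall mean is the plain average of the means.
grand_mean = sum(means) / k

sstr = sum(n * (m - grand_mean) ** 2 for m in means)
mstr = sstr / (k - 1)
sse = sum((n - 1) * v for v in variances)
mse = sse / (n_total - k)
f_stat = mstr / mse

print(f"SSTR = {sstr:.0f}, MSTR = {mstr:.0f}")
print(f"SSE = {sse:.0f}, MSE = {mse:.3f}")
print(f"F = {f_stat:.2f}")
```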
Slide 23
Test for the Equality of k Population Means
ANOVA Table

Source of     Sum of     Degrees of     Mean
Variation     Squares    Freedom        Squares    F
Treatment     490        2              245        9.55
Error         308        12             25.667
Total         798        14
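The sum-of-squares column can be verified directly from the raw observations, confirming that the total splits into treatment and error parts (798 = 490 + 308). A minimal sketch with variable names chosen for illustration:

```python
# Hours worked at Plants 1, 2 and 3 (Buffalo, Pittsburgh, Detroit).
hours = [
    [48, 54, 57, 54, 62],
    [73, 63, 66, 64, 74],
    [51, 63, 61, 54, 56],
]

all_obs = [x for plant in hours for x in plant]
grand_mean = sum(all_obs) / len(all_obs)

# Total variation: every observation against the overall mean.
sst = sum((x - grand_mean) ** 2 for x in all_obs)
# Between-treatments variation: each plant mean against the overall mean.
sstr = sum(len(p) * (sum(p) / len(p) - grand_mean) ** 2 for p in hours)
# Within-treatments variation: each observation against its own plant mean.
sse = sum((x - sum(p) / len(p)) ** 2 for p in hours for x in p)

print(f"SST = {sst:.0f}, SSTR = {sstr:.0f}, SSE = {sse:.0f}")
```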
Slide 24
Test for the Equality of k Population Means
p-Value Approach
4. Compute the p-value.
    With 2 numerator d.f. and 12 denominator d.f., the p-value is .01 for F = 6.93. Therefore, the p-value is less than .01 for F = 9.55.
5. Determine whether to reject $H_0$.
    The p-value < .05, so we reject $H_0$.
    We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all three plants.
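The table lookup only bounds the p-value. For the special case of 2 numerator d.f., the upper-tail area of the F distribution has a simple closed form, $P(F > f) = (1 + 2f/\nu)^{-\nu/2}$ with $\nu$ denominator d.f., so the p-value can be sketched without a statistics library. The function name below is hypothetical, and the shortcut does not apply to other numerator d.f.

```python
def f_upper_tail_2_numerator_df(f_value, denom_df):
    """P(F > f_value) for an F distribution with 2 numerator d.f.

    Closed form valid ONLY when the numerator d.f. equal 2.
    """
    return (1 + 2 * f_value / denom_df) ** (-denom_df / 2)


f_stat = 245 / (308 / 12)  # MSTR / MSE from the Reed example
p_value = f_upper_tail_2_numerator_df(f_stat, 12)
p_at_table_value = f_upper_tail_2_numerator_df(6.93, 12)

print(f"p-value for F = {f_stat:.2f}: {p_value:.4f}")
print(f"tail area at F = 6.93: {p_at_table_value:.4f}")
```

For general degrees of freedom, a library routine such as `scipy.stats.f.sf(f_stat, dfn, dfd)` would be the usual choice, assuming SciPy is available.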
Slide 25
Test for the Equality of k Population Means
Critical Value Approach
4. Determine the critical value and rejection rule.
    Based on an F distribution with 2 numerator d.f. and 12 denominator d.f., $F_{.05}$ = 3.89.
    Reject $H_0$ if F > 3.89
5. Determine whether to reject $H_0$.
    Because F = 9.55 > 3.89, we reject $H_0$.
    We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all three plants.
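The critical value 3.89 normally comes from an F table. With 2 numerator d.f., the tail area $(1 + 2f/\nu)^{-\nu/2}$ can be solved for $f$, which gives a quick check on the table value. This is a sketch under that special-case assumption, and the function name is hypothetical.

```python
def f_critical_2_numerator_df(alpha, denom_df):
    """Upper-tail critical value F_alpha for 2 numerator d.f. (special case only)."""
    # Solve (1 + 2f/v) ** (-v/2) = alpha for f.
    return (denom_df / 2) * (alpha ** (-2 / denom_df) - 1)


f_crit = f_critical_2_numerator_df(0.05, 12)
f_stat = 245 / (308 / 12)  # test statistic from the Reed example

print(f"F_.05 = {f_crit:.2f}")
print("Reject H0" if f_stat > f_crit else "Do not reject H0")
```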
Slide 26
Multiple Comparison Procedures
• Suppose that analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means.
• Fisher’s least significant difference (LSD) procedure can be used to determine where the differences occur.
Slide 27
Fisher’s LSD Procedure
• Hypotheses
    $H_0: \mu_i = \mu_j$
    $H_a: \mu_i \neq \mu_j$
• Test Statistic
$$t = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$$
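As an illustration of the test statistic, the sketch below evaluates $t$ for Plants 1 and 2 of the Reed Manufacturing example, using the sample means 55 and 68, five observations per plant, and MSE = 308/12. The function name `lsd_t_statistic` is chosen here and is not from the slides.

```python
from math import sqrt


def lsd_t_statistic(mean_i, mean_j, n_i, n_j, mse):
    """Fisher's LSD t statistic for comparing two treatment means."""
    return (mean_i - mean_j) / sqrt(mse * (1 / n_i + 1 / n_j))


mse = 308 / 12  # mean square error from the ANOVA table
t_stat = lsd_t_statistic(55, 68, 5, 5, mse)
print(f"t = {t_stat:.3f}")
```

The value is compared against a t distribution with $n_T$ - k = 12 degrees of freedom.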
Slide 28
Fisher’s LSD Procedure
• Rejection Rule
    p-value Approach: Reject $H_0$ if p-value < $\alpha$
    Critical Value Approach: Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
  where the value of $t_{\alpha/2}$ is based on a t distribution with $n_T$ - k degrees of freedom.
Slide 29
Fisher’s LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$
• Hypotheses
    $H_0: \mu_i = \mu_j$
    $H_a: \mu_i \neq \mu_j$
• Test Statistic
    $\bar{x}_i - \bar{x}_j$
• Rejection Rule
    Reject $H_0$ if $|\bar{x}_i - \bar{x}_j| >$ LSD
  where
$$\text{LSD} = t_{\alpha/2} \sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
Slide 30
Fisher’s LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$
Example: Reed Manufacturing
Recall that Janet Reed wants to know if there is any significant difference in the mean number of hours worked per week for the department managers at her three manufacturing plants.
Analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means. Fisher’s least significant difference (LSD) procedure can be used to determine where the differences occur.
Slide 31
Fisher’s LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$
For $\alpha$ = .05 and $n_T$ - k = 15 - 3 = 12 degrees of freedom, $t_{.025}$ = 2.179.
$$\text{LSD} = t_{\alpha/2} \sqrt{\text{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
$$\text{LSD} = 2.179 \sqrt{25.667\left(\frac{1}{5} + \frac{1}{5}\right)} = 6.98$$
(The MSE value was computed earlier.)
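The LSD value is a one-line calculation once $t_{.025}$ and MSE are known. The sketch below takes $t_{.025}$ = 2.179 from the slide as a table value rather than computing it.

```python
from math import sqrt

t_crit = 2.179  # t_.025 with 12 degrees of freedom (table value from the slide)
mse = 308 / 12  # mean square error from the ANOVA table
n_i = n_j = 5   # managers sampled per plant

lsd = t_crit * sqrt(mse * (1 / n_i + 1 / n_j))
print(f"LSD = {lsd:.2f}")
```

Because every plant has the same sample size, this single LSD value applies to all three pairwise comparisons.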
Slide 32
Fisher’s LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$
LSD for Plants 1 and 2
• Hypotheses (A)
    $H_0: \mu_1 = \mu_2$
    $H_a: \mu_1 \neq \mu_2$
• Rejection Rule
    Reject $H_0$ if $|\bar{x}_1 - \bar{x}_2| > 6.98$
• Test Statistic
    $|\bar{x}_1 - \bar{x}_2|$ = |55 - 68| = 13
• Conclusion
    The mean number of hours worked at Plant 1 is not equal to the mean number worked at Plant 2.
Slide 33
Fisher’s LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$
LSD for Plants 1 and 3
• Hypotheses (B)
    $H_0: \mu_1 = \mu_3$
    $H_a: \mu_1 \neq \mu_3$
• Rejection Rule
    Reject $H_0$ if $|\bar{x}_1 - \bar{x}_3| > 6.98$
• Test Statistic
    $|\bar{x}_1 - \bar{x}_3|$ = |55 - 57| = 2
• Conclusion
    There is no significant difference between the mean number of hours worked at Plant 1 and the mean number of hours worked at Plant 3.
Slide 34
Fisher’s LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$
LSD for Plants 2 and 3
• Hypotheses (C)
    $H_0: \mu_2 = \mu_3$
    $H_a: \mu_2 \neq \mu_3$
• Rejection Rule
    Reject $H_0$ if $|\bar{x}_2 - \bar{x}_3| > 6.98$
• Test Statistic
    $|\bar{x}_2 - \bar{x}_3|$ = |68 - 57| = 11
• Conclusion
    The mean number of hours worked at Plant 2 is not equal to the mean number worked at Plant 3.
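The three comparisons (A), (B) and (C) follow the same pattern, so they can be run in one loop. A minimal sketch that uses the rounded LSD value 6.98 from the earlier slide; the dictionary and variable names are chosen here.

```python
from itertools import combinations

LSD = 6.98  # least significant difference computed on the earlier slide
means = {"Plant 1": 55, "Plant 2": 68, "Plant 3": 57}

# Compare every pair of plants: reject H0 when |mean_i - mean_j| > LSD.
significant = {}
for (a, mean_a), (b, mean_b) in combinations(means.items(), 2):
    diff = abs(mean_a - mean_b)
    significant[(a, b)] = diff > LSD
    verdict = "reject H0" if diff > LSD else "do not reject H0"
    print(f"{a} vs {b}: |difference| = {diff}, {verdict}")
```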
Slide 35
Type I Error Rates
• The comparisonwise Type I error rate $\alpha$ indicates the level of significance associated with a single pairwise comparison.
• The experimentwise Type I error rate $\alpha_{EW}$ is the probability of making a Type I error on at least one of the k(k - 1)/2 pairwise comparisons.
$$\alpha_{EW} = 1 - (1 - \alpha)^{k(k-1)/2}$$
• The experimentwise Type I error rate gets larger for problems with more populations (larger k).
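To see how quickly the experimentwise rate grows, the sketch below evaluates it for several values of k. Two points are stated here rather than taken from the slide: the number of pairwise comparisons among k populations is k(k - 1)/2 (3 when k = 3), and the formula treats the comparisons as independent, which is a simplifying assumption. The function name is hypothetical.

```python
def experimentwise_error_rate(alpha, k):
    """Return (number of pairwise comparisons, experimentwise Type I error rate).

    Assumes the pairwise comparisons are independent (a simplification).
    """
    comparisons = k * (k - 1) // 2
    return comparisons, 1 - (1 - alpha) ** comparisons


comparisons_3, rate_3 = experimentwise_error_rate(0.05, 3)
for k in (3, 4, 5, 6):
    c, rate = experimentwise_error_rate(0.05, k)
    print(f"k = {k}: {c} comparisons, experimentwise rate = {rate:.4f}")
```

With three plants and $\alpha$ = .05 per comparison, the chance of at least one false rejection is already about .14.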
Slide 36
End of Chapter 13, Part A