15.3 The F Test for a Randomized Block Experiment


We saw in Chapter 11 that when two treatments are to be compared, a paired experiment is often more effective than one involving two independent samples. This is also the case when more than two treatments are to be compared. Suppose that four different pesticides (the treatments) are being considered for use with a particular crop. There are 20 plots of land available for planting. If 5 of these plots are randomly selected to receive Pesticide 1, 5 of the remaining 15 randomly selected for Pesticide 2, and so on, the result is a completely randomized experiment, and the data should be analyzed using single-factor ANOVA. The disadvantage of this experiment is that if there are differences in characteristics of the plots that could affect yield, this could limit our ability to identify differences among the treatments.

Consider the following alternative experiment. Separate the 20 plots into five groups, each consisting of 4 plots. The plots within each group are chosen so that the plots are as much alike as possible with respect to characteristics affecting yield. Then, within each group one plot is randomly selected for Pesticide 1, a second plot is randomly chosen to receive Pesticide 2, and so on. The homogeneous groups are called blocks, and the random allocation of treatments within each block as described results in a randomized block experiment.

DEFINITION

Suppose that experimental units (individuals or objects to which the treatments are applied) are first separated into blocks consisting of k units in such a way that the units within each block are as similar as possible. Within any particular block, the treatments are then randomly assigned so that each unit in a block receives a different treatment. This experimental design is referred to as a randomized block design.
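The random allocation within blocks described in the pesticide illustration can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `randomized_block_assignment`, the plot labels, and the fixed seed are assumptions made for the example and are not part of the text.

```python
import random


def randomized_block_assignment(blocks, treatments, seed=None):
    """Assign every treatment to exactly one unit within each block, at random."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("each block must contain one unit per treatment")
        shuffled = list(treatments)
        rng.shuffle(shuffled)  # a fresh random order for this block only
        for unit, treatment in zip(units, shuffled):
            assignment[unit] = (block_name, treatment)
    return assignment


# Hypothetical labels: 5 blocks of 4 similar plots, and 4 pesticides.
blocks = {f"block{b}": [f"plot{b}-{u}" for u in range(1, 5)] for b in range(1, 6)}
treatments = ["Pesticide 1", "Pesticide 2", "Pesticide 3", "Pesticide 4"]

plan = randomized_block_assignment(blocks, treatments, seed=1)
for unit, (block, treatment) in sorted(plan.items()):
    print(unit, block, treatment)
```

Because the shuffle is done separately inside each block, every block receives all four treatments exactly once, which is what distinguishes this design from a completely randomized one.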

EXAMPLE 15.8 Cost of Air-Conditioning

High energy costs have made consumers and home builders increasingly aware of whether household appliances are energy efficient. A large developer carried out a study to compare electricity usage for four different residential air-conditioning systems being considered for tract homes. Each system was installed in five homes, and the resulting electricity usage (in kilowatt-hours) was monitored for a 1-month period. Because the developer realized that many characteristics of a home could affect usage (for example, floor space, type of insulation, directional orientation, and type of roof and exterior), care was taken to ensure that extraneous variation in such characteristics did not influence the conclusions. Homes selected for the experiment were grouped into five blocks consisting of four homes each in such a way that the four homes within any given block were as similar as possible. The resulting data are displayed in Table 15.5, in which rows correspond to the different treatments (air-conditioning systems) and columns correspond to the different blocks. Later in this section we will analyze these data to see whether electricity usage depends on which system is used.

The hypotheses of interest and assumptions underlying the analysis of a randomized block design are similar to those for a completely randomized design.

Assumptions and Hypotheses

The single observation made on any particular treatment in a given block is assumed to be selected from a normal distribution with variance $\sigma^2$. Although the mean of the distribution may depend separately on the treatment applied and on the block, the variance $\sigma^2$ is assumed to be the same for each block–treatment combination.

The hypotheses of interest are as follows:

H0: The mean value does not depend on which treatment is applied.
Ha: The mean value does depend on which treatment is applied.

The $\mu_i$ notation used previously is no longer adequate for stating hypotheses, because an observation’s mean value may depend on both the treatment applied and the block.

The F Test

The key to analyzing data from a randomized block experiment is to represent SSTo, which measures total variation in the data, as a sum of three pieces: SSTr and SSE (as was the case in single-factor ANOVA) and a new contribution to variation, the block sum of squares SSBl. SSBl incorporates any variation resulting from differences between the blocks; these differences can be substantial if, before creating the blocks, there was great heterogeneity in experimental units. Once the four sums of squares have been computed, the test statistic is an F ratio, MSTr/MSE, but the number of error degrees of freedom is no longer N - k as it was in single-factor ANOVA.

Alternative formulas for the sums of squares appropriate for efficient hand computation appear in the appendix to this chapter (online).

Calculations for this F test are usually summarized in an ANOVA table. The table is similar to the one for a single-factor ANOVA except that blocks are an extra source of variation, so the table contains four rows rather than just three, consistent with the added source of variation.

TABLE 15.5 Data from the Randomized Block Experiment of Example 15.8

                              Block
Treatment         1        2        3        4        5    Treatment Mean
1               116      118       97      101      115        109.40
2               171      131      105      107      129        128.60
3               138      131      115       93      110        117.40
4               144      141      115       93       99        118.40
Block mean   142.25   130.25   108.00    98.50   113.25    Grand mean 118.45

Table 15.6 shows a mean square for blocks and the mean squares for treatments and error. Sometimes the F ratio MSBl/MSE is also computed. A large value of this ratio suggests that blocking was effective in filtering out extraneous variation.

Summary of the Randomized Block F Test

Notation:

k = number of treatments
l = number of blocks
$\bar{x}_i$ = mean of all observations for treatment i
$\bar{b}_j$ = mean of all observations in block j
$\bar{\bar{x}}$ = mean of all kl observations in the experiment (the grand mean)

Sums of squares and associated df’s are as follows.

Sum of Squares   Symbol   df               Formula
Treatments       SSTr     k - 1            $l\left[(\bar{x}_1 - \bar{\bar{x}})^2 + (\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + (\bar{x}_k - \bar{\bar{x}})^2\right]$
Blocks           SSBl     l - 1            $k\left[(\bar{b}_1 - \bar{\bar{x}})^2 + (\bar{b}_2 - \bar{\bar{x}})^2 + \cdots + (\bar{b}_l - \bar{\bar{x}})^2\right]$
Error            SSE      (k - 1)(l - 1)   by subtraction
Total            SSTo     kl - 1           $\sum_{\text{all } kl \text{ obs.}} (x - \bar{\bar{x}})^2$

SSE is obtained by subtraction through the use of the fundamental identity

$$SSTo = SSTr + SSBl + SSE$$

which implies $SSE = SSTo - SSTr - SSBl$.

Test statistic:

$$F = \frac{MSTr}{MSE}$$

where

$$MSTr = \frac{SSTr}{k - 1} \quad \text{and} \quad MSE = \frac{SSE}{(k - 1)(l - 1)}$$

The test is based on df1 = k - 1 and df2 = (k - 1)(l - 1). P-values are determined using Appendix Table 6, statistical software, or a graphing calculator.

TABLE 15.6 ANOVA Table for a Randomized Block Experiment

Source of Variation   df               Sum of Squares   Mean Square                     F
Treatments            k - 1            SSTr             MSTr = SSTr/(k - 1)             F = MSTr/MSE
Blocks                l - 1            SSBl             MSBl = SSBl/(l - 1)
Error                 (k - 1)(l - 1)   SSE              MSE = SSE/[(k - 1)(l - 1)]
Total                 kl - 1           SSTo
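As a quick consistency check on the df column (a short derivation added here for illustration, not taken from the text), the degrees of freedom for treatments, blocks, and error add up to the total df, mirroring the way the sums of squares add up to SSTo:

```latex
\begin{aligned}
(k-1) + (l-1) + (k-1)(l-1)
  &= (k-1) + (l-1) + (kl - k - l + 1) \\
  &= kl - 1 .
\end{aligned}
```

For the air-conditioning experiment of Example 15.8, with k = 4 treatments and l = 5 blocks, this gives 3 + 4 + 12 = 19 = kl - 1.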

EXAMPLE 15.9 Electricity Cost of Air-Conditioning Revisited

Reconsider the electricity usage data given in Example 15.8.

H0: The mean electricity usage does not depend on which air-conditioning system is used.

Ha: The mean electricity usage does depend on which system is used.

Test statistic: $F = \dfrac{MSTr}{MSE}$

From Example 15.8,

$$\bar{\bar{x}} = 118.45$$

$$\bar{x}_1 = 109.40 \quad \bar{x}_2 = 128.60 \quad \bar{x}_3 = 117.40 \quad \bar{x}_4 = 118.40$$

(these are the four treatment averages), and

$$\bar{b}_1 = 142.25 \quad \bar{b}_2 = 130.25 \quad \bar{b}_3 = 108.00 \quad \bar{b}_4 = 98.50 \quad \bar{b}_5 = 113.25$$

(these are the five block averages). Using the individual observations given previously,

$$SSTo = \sum_{\text{all } kl \text{ obs.}} (x - \bar{\bar{x}})^2 = (116 - 118.45)^2 + (118 - 118.45)^2 + \cdots + (99 - 118.45)^2 = 7594.95$$

The other sums of squares are

$$\begin{aligned}
SSTr &= l\left[(\bar{x}_1 - \bar{\bar{x}})^2 + (\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + (\bar{x}_k - \bar{\bar{x}})^2\right] \\
     &= 5\left[(109.4 - 118.45)^2 + (128.6 - 118.45)^2 + (117.4 - 118.45)^2 + (118.4 - 118.45)^2\right] \\
     &= 930.15
\end{aligned}$$

$$\begin{aligned}
SSBl &= k\left[(\bar{b}_1 - \bar{\bar{x}})^2 + (\bar{b}_2 - \bar{\bar{x}})^2 + \cdots + (\bar{b}_l - \bar{\bar{x}})^2\right] \\
     &= 4\left[(142.25 - 118.45)^2 + (130.25 - 118.45)^2 + \cdots + (113.25 - 118.45)^2\right] \\
     &= 4959.70
\end{aligned}$$

$$SSE = SSTo - SSTr - SSBl = 7594.95 - 930.15 - 4959.70 = 1705.10$$

The remaining calculations are displayed in the accompanying ANOVA table.

Source of Variation   df   Sum of Squares   Mean Square   F
Treatments             3        930.15         310.05     310.05/142.09 = 2.18
Blocks                 4       4959.70        1239.93
Error                 12       1705.10         142.09
Total                 19       7594.95

In Appendix Table 6 with df1 = 3 and df2 = 12, the value 2.61 corresponds to a P-value of .10. Since 2.18 < 2.61, P-value > .10 and H0 cannot be rejected. Mean electricity usage does not seem to depend on which of the four air-conditioning systems is used.
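The arithmetic of Example 15.9 can be checked with a short script. The sketch below uses only the Python standard library; the function names `block_anova` and `f_upper_tail` are made up for this illustration. The P-value is approximated by applying Simpson's rule to the incomplete beta integral that defines the upper tail of the F distribution, which is an assumption of this sketch (the text itself relies on Appendix Table 6 or statistical software), and it is valid as written only for f > 0 and df2 of at least 2.

```python
import math

# Table 15.5: one row per air-conditioning system (treatment),
# one column per block of similar homes.
usage = [
    [116, 118, 97, 101, 115],
    [171, 131, 105, 107, 129],
    [138, 131, 115, 93, 110],
    [144, 141, 115, 93, 99],
]


def block_anova(data):
    """Sums of squares, mean squares, and F for a randomized block design."""
    k, l = len(data), len(data[0])            # treatments, blocks
    grand = sum(map(sum, data)) / (k * l)
    trt_means = [sum(row) / l for row in data]
    blk_means = [sum(col) / k for col in zip(*data)]
    ssto = sum((x - grand) ** 2 for row in data for x in row)
    sstr = l * sum((m - grand) ** 2 for m in trt_means)
    ssbl = k * sum((m - grand) ** 2 for m in blk_means)
    sse = ssto - sstr - ssbl                  # fundamental identity
    df1, df2 = k - 1, (k - 1) * (l - 1)
    mstr, mse = sstr / df1, sse / df2
    return {"SSTo": ssto, "SSTr": sstr, "SSBl": ssbl, "SSE": sse,
            "MSTr": mstr, "MSE": mse, "F": mstr / mse,
            "df1": df1, "df2": df2}


def f_upper_tail(f, df1, df2, steps=2000):
    """Approximate P(F > f) by Simpson's rule (steps must be even)."""
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f)                 # upper limit of the beta integral
    h = x / steps

    def g(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = g(0.0) + g(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return total * h / 3 / math.exp(log_beta)


result = block_anova(usage)
p_value = f_upper_tail(result["F"], result["df1"], result["df2"])
for name in ("SSTo", "SSTr", "SSBl", "SSE", "MSTr", "MSE", "F"):
    print(f"{name:5s} {result[name]:10.2f}")
print(f"approximate P-value: {p_value:.3f}")
```

The printed sums of squares and F ratio should agree with the ANOVA table of Example 15.9, and the approximate P-value falls above .10, consistent with the conclusion that H0 cannot be rejected.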


In many studies, all k treatments can be applied to the same experimental unit, so there is no need to group different experimental units to form blocks. For example, an experiment to compare the effects of four different gasoline additives on automobile engine efficiency could be carried out by selecting just 5 engines and using all four treatments on each one rather than using 20 engines and blocking them. Each engine by itself then constitutes a block. As another example, a manufacturing company might wish to compare outputs for three different packaging machines. Because output could be affected by which operator is using the machine, a design that controls for the effects of operator variation is desirable. One possibility is to use 15 operators grouped into homogeneous blocks of 5 operators each, but such homogeneity within each block may be difficult to achieve. An alternative approach is to use only five operators and to have each one operate all three machines in a randomly chosen order. There are then three observations in each block, all three with the same operator.

EXAMPLE 15.10 Comparing Four Stool Designs

In the article “The Effects of a Pneumatic Stool and a One-Legged Stool on Lower Limb Joint Load and Muscular Activity During Sitting and Rising” (Ergonomics [1993]: 519–535), the accompanying data were given on the effort (measured on the Borg Scale) required by a subject to rise from a sitting position for each of four different stools. Because it was suspected that different people could exhibit large differences in effort, even for the same type of stool, a sample of nine people was selected and each person was tested on all four stools, with the following results:

                        Subject
            1    2    3    4    5    6    7    8    9
Stool A    12   10    7    7    8    9    8    7    9
Stool B    15   14   14   11   11   11   12   11   13
Stool C    12   13   13   10    8   11   12    8   10
Stool D    10   12    9    9    7   10   11    7    8

For each person, the order in which the stools were tested was randomized. This is a randomized block experiment, with subjects playing the role of blocks. The test consists of these hypotheses:

H0: Mean effort does not depend on type of stool.
Ha: Mean effort does depend on type of stool.

Test statistic: $F = \dfrac{MSTr}{MSE}$

Computations are summarized in Table 15.7, an ANOVA table from Minitab.

TABLE 15.7 ANOVA Table for Example 15.10

Two-way Analysis of Variance

Analysis of Variance for Effort
Source   DF       SS      MS       F       P
Stool     3    81.19   27.06   22.36   0.000
Block     8    66.50    8.31    6.87   0.000
Error    24    29.06    1.21
Total    35   176.75
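The entries of Table 15.7 can be reproduced from the stool data with a few lines of Python. The sketch below uses the standard totals-based ("computational") forms of the sums of squares, for example SSTr = (sum of squared treatment totals)/l minus (grand total)²/(kl). These are algebraically equivalent to the definitional formulas in the summary box; whether they match the presentation in the chapter's online appendix is an assumption.

```python
# Effort ratings (Borg scale): one list per stool, one entry per subject.
effort = {
    "A": [12, 10, 7, 7, 8, 9, 8, 7, 9],
    "B": [15, 14, 14, 11, 11, 11, 12, 11, 13],
    "C": [12, 13, 13, 10, 8, 11, 12, 8, 10],
    "D": [10, 12, 9, 9, 7, 10, 11, 7, 8],
}

rows = list(effort.values())
k, l = len(rows), len(rows[0])                      # 4 stools, 9 subjects

# Correction term: (grand total)^2 / (kl)
correction = sum(map(sum, rows)) ** 2 / (k * l)

ssto = sum(x * x for r in rows for x in r) - correction
sstr = sum(sum(r) ** 2 for r in rows) / l - correction          # stool totals
ssbl = sum(sum(c) ** 2 for c in zip(*rows)) / k - correction    # subject totals
sse = ssto - sstr - ssbl

mstr = sstr / (k - 1)
msbl = ssbl / (l - 1)
mse = sse / ((k - 1) * (l - 1))
f_stool = mstr / mse
f_block = msbl / mse

print(f"Stool  {k - 1:3d} {sstr:8.2f} {mstr:7.2f} {f_stool:7.2f}")
print(f"Block  {l - 1:3d} {ssbl:8.2f} {msbl:7.2f} {f_block:7.2f}")
print(f"Error  {(k - 1) * (l - 1):3d} {sse:8.2f} {mse:7.2f}")
print(f"Total  {k * l - 1:3d} {ssto:8.2f}")
```

The large F ratio for blocks suggests that using each subject as a block removed a substantial amount of person-to-person variation from the error term.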

The test statistic value is 22.36, with associated P-value = .000. If $\alpha = .05$, P-value $\le \alpha$, and we reject H0. There is sufficient evidence to conclude that the mean effort required is not the same for all four stool types.

Experiments such as the one described in Example 15.10, in which repeated observations are made on the same experimental unit, are sometimes called repeated-measures designs. Such designs should not be used when application of the first several treatments somehow affects responses to later treatments. This would be the case if treatments were different methods for learning the same skill, so that if all treatments were given to the same subject, the response to the treatment given last would presumably be much better than the response to the treatment initially applied.

Multiple Comparisons

As in single-factor ANOVA, once H0 has been rejected, further analysis of the data is appropriate to identify significant differences among the treatments. The Tukey–Kramer method introduced in Section 15.2 is easily adapted for this purpose.

Declare that treatments i and j differ significantly if the interval

$$(\bar{x}_i - \bar{x}_j) \pm q\sqrt{\frac{MSE}{l}}$$

does not include 0, where q is from Appendix Table 7 and is based on a comparison of k treatments and error df = (k - 1)(l - 1).

EXAMPLE 15.11 Multiple Comparisons for Stool Designs

In Example 15.10, we had k = 4 and error df = 24, from which q = 4.91 for a 99% simultaneous confidence level. The ± term for each interval is

$$q\sqrt{\frac{MSE}{l}} = 4.91\sqrt{\frac{1.21}{9}} = 1.80$$

The four treatment means arranged in order are

Treatment        A        D         C         B
Sample mean    8.556    9.222    10.778    12.444

It is easily verified that the corresponding underscoring pattern is as follows:

Treatment        A        D         C         B
Sample mean    8.556    9.222    10.778    12.444
               --------------
                        ---------------
                                 ----------------

EXERCISES 15.23 - 15.28

15.23 A particular county employs three assessors who are responsible for determining the value of residential property in the county. To see whether these assessors differ systematically in their appraisals, 5 houses are selected, and each assessor is asked to determine the market value of each house. Explain why a randomized block experiment (with blocks corresponding to the 5 houses) was used rather than a completely randomized experiment involving a total of 15 houses with each assessor asked to appraise 5 different houses (a different group of 5 for each assessor).

15.24 The accompanying display is a partially completed ANOVA table for the experiment described in Exercise 15.23 (with houses representing blocks and assessors representing treatments).

Source of Variation   df   Sum of Squares   Mean Square   F
Treatments                       11.7
Blocks                          113.5
Error
Total                           250.8

a. Fill in the missing entries in the ANOVA table.
b. Use the ANOVA F statistic and a .05 level of significance to test the null hypothesis of no difference between assessors.

15.25 With the use of biofuels increasing, investigators are looking for ways in which the wood ash that is a byproduct of biomass combustion can be used. An experiment described in the paper “Wood Ash Admixture to Organic Wastes Improves Compost and Its Performance” (Agriculture, Ecosystems & Environment [2008]: 43–49) looked at the effect of using wood ash in compost. Composts with 0%, 8%, and 16% ash were applied to plots in an experimental field. The plots were grouped into four blocks to create blocks that consisted of plots with similar soil characteristics. Treatments (the three composts) were assigned at random to the plots within each block. At the end of the composting period, the concentration of lead (mg/kg) was measured. Use the accompanying data (consistent with summary quantities given in the paper) to determine if there is evidence that the mean lead concentration differs for the three ash concentrations. Use $\alpha = .01$.

Block   0% Ash   8% Ash   16% Ash
1        61.4     39.0     34.7
2        57.2     41.0     38.2
3        58.0     38.5     34.7
4        63.5     40.3     33.4

15.26 Fire ants can have a negative impact on other ant species. To investigate whether the use of bait traps is effective in reducing the number of fire ants, fire-ant abundance was measured before, during, and after the use of bait traps at each of 10 sites (“Red Imported Fire Ant Impacts on Upland Anthropoids in Southern Mississippi,” The American Midland Naturalist [2010]: 54–64). This can be viewed as a randomized block experiment with the 10 sites corresponding to blocks. Data on fire-ant abundance that are consistent with summary values given in the paper are given in the accompanying table. With

$\mu_1$ = mean fire-ant abundance before the use of bait traps
$\mu_2$ = mean fire-ant abundance during the use of bait traps
$\mu_3$ = mean fire-ant abundance after the use of bait traps

carry out a hypothesis test to determine if the data provide evidence that the null hypothesis H0: $\mu_1 = \mu_2 = \mu_3$ should be rejected. Use $\alpha = .05$.

Site   Before   During   After
1       975      354      104
2       809      459      133
3       917      399      100
4       930      418      110
5       840      371       74
6       977      482      118
7       889      447       76
8       841      472      112
9       834      458      123
10      973      385       85

15.27 An experiment to assess the effect of predators on ecosystem characteristics such as soil temperature and soil moisture is described in the paper “Predators Have Large Effects on Ecosystem Properties by Changing Plant Diversity, Not Plant Biomass” (Ecology [2006]: 1432–1437). Three treatments were considered: (1) a treatment that attempted to exclude predators from a plot by using a sheet-metal barrier; (2) a treatment that attempted to exclude predators from a plot using a chemical treatment; and (3) a control treatment where no attempt was made to exclude predators. A large field was divided into thirty 2 × 2 meter plots. Because the field was sloped, the plots were grouped into 10 blocks each consisting of similar plots with respect to the steepness of slope. After 1 year, soil moisture (%) and soil temperature (°C) were measured for each plot. Data on soil moisture consistent with summary values given in the paper are given in the accompanying table. Do the data provide convincing evidence that mean soil moisture is not the same for all three treatments? Construct an ANOVA table and test the relevant hypotheses using a significance level of .05.

Block   Treatment 1   Treatment 2   Treatment 3
1          41.2          45.3          42.7
2          40.2          42.3          47.1
3          40.7          47.1          42.9
4          46.6          41.3          40.1
5          43.6          44.2          51.8
6          42.3          45.5          43.1
7          42.7          44.8          39.2
8          47.4          41.8          42.8
9          41.5          42.2          42.3
10         39.8          42.6          44.1

15.28 The paper referenced in the previous exercise also gave the accompanying data on soil temperature (°C). Do the data provide convincing evidence that mean soil temperature is not the same for all three treatments? Construct an ANOVA table and test the relevant hypotheses using a significance level of .05.

Block   Treatment 1   Treatment 2   Treatment 3
1          15.2          18.9          20.2
2          19.1          18.4          16.7
3          15.9          14.7          22.5
4          19.4          20.3          20.6
5          19.1          21.2          18.9
6          18.7          18.4          18.7
7          16.3          18.4          16.3
8          15.5          15.2          18.6
9          19.9          19.3          19.3
10         18.2          20.8          17.0

15.4 Two-Factor ANOVA

An investigator is often interested in assessing the effects of two different factors on a response variable. Consider the following examples.

1. A physical education researcher wishes to know how body density of football players varies with position played (a categorical factor with categories defensive back, offensive back, defensive lineman, and offensive lineman) and level of play (a second categorical factor with categories professional, college Division I, college Division II, and college Division III).

2. An agricultural scientist is interested in seeing how yield of tomatoes is affected by choice of variety planted (a categorical factor, with each category corresponding to a different variety) and planting density (a quantitative factor, with a level corresponding to each planting density being considered).

3. An applied chemist might wish to investigate how strength of a particular adhesive varies with application temperature (a quantitative factor with levels 250°F, 260°F, and 270°F) and application pressure (a quantitative factor with levels 110, 120, 130, and 140 lb/in.²).

Let’s label the two factors under study Factor A and Factor B. Even when a factor is categorical, it simplifies terminology to refer to the categories as levels. Thus, in the first example, the categorical factor position played has four levels. The number of levels of Factor A is denoted by k, and l denotes the number of levels of Factor B, as shown in Table 15.8. This rectangular table contains a row corresponding to each level of Factor A and a column corresponding to each level of Factor B. Each cell in the table corresponds to a particular level of Factor A in combination with a particular level of Factor B. Because there are l cells in each row and k rows, there are kl cells in the table. The kl different combinations of Factor A and Factor B levels are often referred to as treatments. For example, if there are three tomato varieties and four different planting densities under consideration, the number of treatments is 12.

Suppose that an experiment is carried out, resulting in a data set that contains some number of observations for each of the kl treatments. In general, there could be more observations for some treatments than for others, and there may even be a few treatments for which no observations are available. An experimenter may set out to make the same number of observations on each treatment, but sometimes events beyond the experimenter’s control, such as the death of an experimental subject or malfunctioning equipment, result in different sample sizes for some treatments. Such imbalances in sample sizes make analysis of the data rather difficult. In this section, we will restrict our discussion to data sets containing the same number of observations for each treatment, and we will let m denote this number.

Notation

k = number of levels of Factor A
l = number of levels of Factor B
kl = number of treatments (each one a combination of a Factor A level and a Factor B level)
m = number of observations on each treatment

TABLE 15.8 A Table of Factor Combinations for a Two-Way ANOVA Experiment

                          Factor B levels
                      1     2     . . .     l
Factor A levels   1
                  2
                  .
                  .
                  k

EXAMPLE 15.12 Tomato Yield and Planting Density

An experiment was carried out to assess the effects of tomato variety (Factor A, with k = 3 levels) and planting density (Factor B, with l = 4 levels of 10,000, 20,000, 30,000, and 40,000 plants per hectare) on yield. Each of the kl = 12 treatments was used on m = 3 plots, resulting in the data set consisting of klm = 36 observations shown in Table 15.9.

TABLE 15.9 Data from the Two-Factor Experiment of Example 15.12

Variety                            Density (Factor B)
(Factor A)   1                  2                  3                  4
1            7.9, 9.2, 10.5     11.2, 12.8, 13.3   12.1, 12.6, 14.0   9.1, 10.8, 12.5
2            8.1, 8.6, 10.1     11.5, 12.7, 13.7   13.7, 14.4, 15.4   11.3, 12.5, 14.5
3            15.3, 16.1, 17.5   16.6, 18.5, 19.2   18.0, 20.8, 21.0   17.2, 18.4, 18.9

Sample mean yields for each treatment, each level of Factor A, and each level of Factor B are important summary quantities. These can be displayed in a rectangular table (see Table 15.10). A plot of these sample means is also quite informative. First, construct horizontal and vertical axes, and scale the vertical axis in units of the response variable (yield). Then mark a point on the horizontal axis for each level of one of the factors (either Factor A or Factor B can be chosen). Now above each such mark, plot a point for the sample mean response for each level of the other factor. Finally, connect all points corresponding to the same level of the other factor using straight line segments.

TABLE 15.10 Sample Means for the 12 Treatments of Example 15.12

                          Factor B (Planting Density)       Sample Mean Yield for
Factor A (Variety)        1        2        3        4      Each Level of Factor A
1                       9.20    12.43    12.90    10.80          11.33
2                       8.93    12.63    14.50    12.77          12.21
3                      16.30    18.10    19.93    18.17          18.13
Sample Mean Yield for
Each Level of Factor B 11.48    14.39    15.78    13.91     Grand mean = 13.89
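The sample means in Table 15.10 can be reproduced directly from the 36 yields of Table 15.9. The following sketch uses only the standard library; the variable names (`yields`, `cell_means`, and so on) are chosen for this illustration and do not come from the text.

```python
from statistics import mean

# Table 15.9: yields[variety][density] holds the m = 3 plot yields
# observed for that treatment (variety-density combination).
yields = {
    1: {1: [7.9, 9.2, 10.5], 2: [11.2, 12.8, 13.3],
        3: [12.1, 12.6, 14.0], 4: [9.1, 10.8, 12.5]},
    2: {1: [8.1, 8.6, 10.1], 2: [11.5, 12.7, 13.7],
        3: [13.7, 14.4, 15.4], 4: [11.3, 12.5, 14.5]},
    3: {1: [15.3, 16.1, 17.5], 2: [16.6, 18.5, 19.2],
        3: [18.0, 20.8, 21.0], 4: [17.2, 18.4, 18.9]},
}
densities = sorted(yields[1])

# Mean of the three plots for each of the kl = 12 treatments.
cell_means = {a: {b: mean(obs) for b, obs in row.items()}
              for a, row in yields.items()}
# Mean over all plots at each level of Factor A (variety).
variety_means = {a: mean(x for obs in row.values() for x in obs)
                 for a, row in yields.items()}
# Mean over all plots at each level of Factor B (density).
density_means = {b: mean(x for row in yields.values() for x in row[b])
                 for b in densities}
grand_mean = mean(x for row in yields.values()
                  for obs in row.values() for x in obs)

for a in sorted(yields):
    cells = "  ".join(f"{cell_means[a][b]:6.2f}" for b in densities)
    print(f"variety {a}: {cells}   | {variety_means[a]:6.2f}")
print("density:   " + "  ".join(f"{density_means[b]:6.2f}" for b in densities)
      + f"   | {grand_mean:6.2f}")
```

The `cell_means` values are also the points that would be plotted in a graph of treatment sample means such as Figure 15.10.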

Figure 15.10 displays two plots: one in which Factor A levels mark the horizontal axis and one in which Factor B levels mark the horizontal axis; usually only one of the two plots is constructed.

FIGURE 15.10 Graphs of treatment sample mean responses for the data of Example 15.12. One panel plots sample mean yield against Factor B levels (1 to 4), with a separate line for each of the three levels of Factor A; the other plots sample mean yield against Factor A levels (1 to 3), with a separate line for each of the four levels of Factor B.

Interaction

An important aspect of two-factor studies involves assessing how simultaneous changes in the levels of both factors affect the response. As a simple example, suppose that an automobile manufacturer is studying engine efficiency (measured in miles per gallon) for two different engine sizes (Factor A, with k = 2 levels) in combination with two different carburetor designs (Factor B, with l = 2 levels). Consider the two possible sets of mean responses displayed in Figure 15.11. In Figure 15.11(a), when Factor A changes from Level 1 to Level 2 and Factor B remains at Level 1 (the change within the first column), the true mean response increases by 2. Similarly, when Factor B changes from Level 1 to Level 2 and Factor A is fixed at Level 1 (the change within the first row), the mean response increases by 3. And when the levels of both factors are changed from 1 to 2, the mean response increases by 5, which is the sum of the two “one-at-a-time” increases. This is because the change in mean response when the level of either factor changes from 1 to 2 is the same for each level of the other factor: The change within either row is 3, and the change within either column is 2. In this case, changes in the levels of the two factors affect the mean response separately or in an additive manner.

The changes in mean responses in the first row and in the first column of Figure 15.11(b) are 3 and 2, respectively, exactly as in Figure 15.11(a). However, the change in mean response when the levels of both factors change simultaneously from 1 to 2 is 8, which is much larger than the separate changes suggest. In this case, there is interaction between the two factors, so that the effect of simultaneous changes cannot be determined from the individual effects of separate changes. This is because in Figure 15.11(b), the change in going from the first to the second column is different for the two rows, and the change in going from the first to the second row is different for the two columns. That is, the change in mean response when the level of one factor changes depends on the level of the other factor. This is not true in Figure 15.11(a).
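The additivity check described above can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `interaction_contrasts` and `is_additive` are hypothetical and not part of the text, and the two tables of means are the ones shown in Figure 15.11. For each cell, the contrast compares the change observed when both levels change with the sum of the two one-at-a-time changes; all contrasts are zero exactly when the factors act additively.

```python
def interaction_contrasts(means):
    """means[i][j] is the true mean response at level i of Factor A and
    level j of Factor B (0-based). Each contrast is the combined change
    from cell (0, 0) minus the sum of the two one-at-a-time changes."""
    base = means[0][0]
    return [
        [means[i][j] - means[i][0] - means[0][j] + base
         for j in range(len(means[0]))]
        for i in range(len(means))
    ]


def is_additive(means, tol=1e-9):
    """True when every interaction contrast is (numerically) zero."""
    return all(abs(c) <= tol
               for row in interaction_contrasts(means) for c in row)


# Mean responses from Figure 15.11 (rows: Factor A, columns: Factor B).
figure_a = [[24, 27], [26, 29]]
figure_b = [[24, 27], [26, 32]]

print(is_additive(figure_a))                  # True: 5 = 2 + 3
print(is_additive(figure_b))                  # False: 8 != 2 + 3
print(interaction_contrasts(figure_b)[1][1])  # 3, the excess over additivity
```

The same function applies unchanged to tables with more than two levels per factor, where a nonzero contrast corresponds to line segments that are not parallel.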

When there are more than two levels of either factor, a graph similar to that for sample mean responses in Figure 15.10 provides insight into how changes in the level of one factor depend on the level of the other factor. Figure 15.12 shows several possible such graphs when $k = 4$ and $l = 3$. The most general situation is pictured in Figure 15.12(a). There, the change in mean response when the level of Factor B is changed (a vertical distance) depends on the level of Factor A. An analogous property would hold if the picture were redrawn so that levels of Factor B were marked on the horizontal axis. This is a prototypical picture suggesting interaction between the factors: the change in mean response when the level of one factor changes depends on the level of the other factor.

FIGURE 15.11 Two possible sets of mean responses when $k = 2$ and $l = 2$.

(a)
                      Factor B
                      Level 1   Level 2
Factor A   Level 1      24        27
           Level 2      26        29

Changes: 3 within each row, 2 within each column, and 5 when both levels change from 1 to 2.

(b)
                      Factor B
                      Level 1   Level 2
Factor A   Level 1      24        27
           Level 2      26        32

Changes: 3 within the first row and 6 within the second row; 2 within the first column and 5 within the second column; 8 when both levels change from 1 to 2.

There is no interaction between the factors when the connected line segments are parallel, as in Figure 15.12(b). Then the change in mean response when the level of one factor changes is the same for each level of the other factor (the vertical distances are the same for each level of Factor A). Figure 15.12(c) illustrates an even more restrictive situation: there is no interaction between factors and, in addition, the mean response does not depend on the level of Factor A. Only when the graph looks like this can it be said that Factor A has no effect on the responses. Similarly, the graph in Figure 15.12(d) indicates no interaction and no dependence on the level of Factor B. A final case, illustrated in Figure 15.12(e), shows a single set of four points connected by horizontal line segments, which indicates that the mean response is identical for every level of both factors.

If the graphs of mean responses are connected line segments that are parallel, there is no interaction between the factors. In this case, the change in mean response when the level of one factor is changed is the same for each level of the other factor. Special cases of no interaction are as follows:

1. The mean response is the same for each level of Factor A (no Factor A main effects).
2. The mean response is the same for each level of Factor B (no Factor B main effects).

FIGURE 15.12 Some graphs of true average responses. [Five panels, (a) through (e), each plotting true mean response against Factor A levels 1 to 4. Panels (a), (b), and (c) show one line for each of Levels 1, 2, and 3 of Factor B; panels (d) and (e) show a single line representing every level of Factor B.]

The graphs in Figure 15.12 depict actual treatment mean responses, that is, quantities whose values are fixed but unknown to an investigator. Figure 15.10 contains graphs of the sample mean responses based on data resulting from an experiment. These sample means are, of course, subject to variability because there is sampling variation in the individual observations. If the experiment discussed in Example 15.12 were repeated, the resulting graphs of sample means would probably look somewhat different from the graph in Figure 15.10, perhaps a great deal different if there was substantial underlying variability in responses. Even when there is no interaction among factors, the connected line segments in the graph of the sample means will not typically be exactly parallel, and they may deviate quite a bit from parallelism in the presence of substantial underlying variability. Similarly, there might actually be no Factor A effects (Figure 15.12(c)), yet the sample graphs would not usually be exactly horizontal. The sample graphs give us insight, but formal inferential procedures are necessary to draw sound conclusions about the nature of the mean responses for different factor levels.

Hypotheses and F Tests

ANOVA procedures can be used to test hypotheses about the effects of two different factors on a response.

Basic Assumptions for Two-Factor ANOVA

The observations on any particular treatment are independently selected from a normal distribution with variance $\sigma^2$ (the same variance for each treatment), and samples from different treatments are independent of one another.

The necessary sums of squares for a two-factor ANOVA result from breaking up $\text{SSTo} = \sum (x - \bar{\bar{x}})^2$ into four parts, which reflect random variation and variation attributable to various factor effects:

$\text{SSTo} = \text{SSA} + \text{SSB} + \text{SSAB} + \text{SSE}$

where

1. SSTo is total sum of squares, with associated $\text{df} = klm - 1$.
2. SSA is the Factor A main effect sum of squares, with associated $\text{df} = k - 1$.
3. SSB is the Factor B main effect sum of squares, with associated $\text{df} = l - 1$.
4. SSAB is the interaction sum of squares, with associated $\text{df} = (k - 1)(l - 1)$.
5. SSE is error sum of squares, with associated $\text{df} = kl(m - 1)$.

The formulas for these sums of squares are similar to those given in previous sections, so we will not give them here. The standard statistical computer packages can calculate all sums of squares and other necessary quantities. The magnitude of SSE is related entirely to the amount of underlying variability (as specified by $\sigma^2$) in the distributions being sampled. SSAB reflects in part underlying variability, but its value is also affected by whether there is interaction between the factors. In general, the more extensive the amount of interaction (i.e., the further the graphs of mean responses are from being parallel), the larger the value of SSAB tends to be. The test statistic for testing the null hypothesis that there is no interaction between factors is the ratio $F = \text{MSAB}/\text{MSE}$. A large value of this statistic suggests that interaction effects are present.
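Because the text leaves the computation to software, a minimal Python sketch of the decomposition may help make it concrete. It assumes a balanced design (the same number $m$ of observations in every cell) and uses the usual balanced-design formulas, which are an assumption here rather than something taken from this section. The function name `two_factor_sums_of_squares` and the small data set are made up purely for illustration.

```python
def two_factor_sums_of_squares(data):
    """data[i][j] is the list of m observations for level i of Factor A
    and level j of Factor B (balanced design assumed)."""
    k, l, m = len(data), len(data[0]), len(data[0][0])

    def mean(values):
        values = list(values)
        return sum(values) / len(values)

    grand = mean(x for row in data for cell in row for x in cell)
    cell_mean = [[mean(cell) for cell in row] for row in data]
    # With equal cell sizes, a factor-level mean is the mean of its cell means.
    a_mean = [mean(cell_mean[i]) for i in range(k)]
    b_mean = [mean(cell_mean[i][j] for i in range(k)) for j in range(l)]

    ssa = l * m * sum((a - grand) ** 2 for a in a_mean)
    ssb = k * m * sum((b - grand) ** 2 for b in b_mean)
    ssab = m * sum(
        (cell_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
        for i in range(k) for j in range(l)
    )
    sse = sum(
        (x - cell_mean[i][j]) ** 2
        for i in range(k) for j in range(l) for x in data[i][j]
    )
    ssto = sum((x - grand) ** 2 for row in data for cell in row for x in cell)
    return {"SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse, "SSTo": ssto}


# Made-up 2 x 2 layout with m = 2 observations per cell (illustration only).
toy = [[[1, 3], [5, 7]],
       [[2, 4], [10, 12]]]
ss = two_factor_sums_of_squares(toy)
print(ss)  # SSA = 18, SSB = 72, SSAB = 8, SSE = 8, SSTo = 106
```

For the made-up data the four parts add up to SSTo (18 + 72 + 8 + 8 = 106), which is the fundamental identity stated above.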

Both the absence of Factor A effects and the absence of Factor B effects are special cases of no-interaction situations. If the data suggest that interaction is present, it does not make sense to investigate effects of one factor without reference to the other factor. Our recommendation is that hypotheses concerning the presence or absence of separate factor effects be tested only if the hypothesis of no interaction is not rejected. Then, the Factor A main effect sum of squares, SSA, will reflect random variation as well as any differences between mean responses for different levels of Factor A. The same applies to SSB.


Two-Factor ANOVA Hypotheses and Tests

1. $H_0$: There is no interaction between factors.
   $H_a$: There is interaction between factors.

   Test statistic: $F = \text{MSAB}/\text{MSE}$, based on $\text{df}_1 = (k - 1)(l - 1)$ and $\text{df}_2 = kl(m - 1)$.

The following two hypotheses should be tested only if the hypothesis of no interaction is not rejected.

2. $H_0$: There are no Factor A main effects (mean response is the same for each level of Factor A).
   $H_a$: $H_0$ is not true.

   Test statistic: $F = \text{MSA}/\text{MSE}$, based on $\text{df}_1 = k - 1$ and $\text{df}_2 = kl(m - 1)$.

3. $H_0$: There are no Factor B main effects.
   $H_a$: $H_0$ is not true.

   Test statistic: $F = \text{MSB}/\text{MSE}$, based on $\text{df}_1 = l - 1$ and $\text{df}_2 = kl(m - 1)$.

Computations are typically summarized in an ANOVA table, as shown in Table 15.11.

TABLE 15.11 Analysis of Variance Table

Source of Variation      df                Sum of Squares   Mean Square                      F
Factor A main effects    k - 1             SSA              MSA = SSA/(k - 1)                F = MSA/MSE
Factor B main effects    l - 1             SSB              MSB = SSB/(l - 1)                F = MSB/MSE
AB interaction           (k - 1)(l - 1)    SSAB             MSAB = SSAB/[(k - 1)(l - 1)]     F = MSAB/MSE
Error                    kl(m - 1)         SSE              MSE = SSE/[kl(m - 1)]
Total                    klm - 1           SSTo

EXAMPLE 15.13 More on Tomato Yield

An ANOVA table for the tomato-yield data of Example 15.12 is given in Table 15.12.

TABLE 15.12 Analysis of Variance on Tomato Yield for Example 15.13

Source of Variation   df   Sum of Squares   Mean Square   F
Variety                2       327.60          163.80     103.70
Density                3        86.69           28.90      18.30
Interaction            6         8.03            1.34       0.85
Error                 24        38.04            1.58
Total                 35       460.36


1. Test of $H_0$: no interaction between variety and density:

   Calculated $F_{AB} = 0.85$, based on $\text{df}_1 = 6$, $\text{df}_2 = 24$

   From Appendix Table 6, the smallest tabled value for these df's is 2.04, so P-value > .10. There is no evidence of interaction, so it is appropriate to carry out further tests concerning the presence of main effects.

2. Test of $H_0$: Factor A (variety) main effects are absent:

   Calculated $F_A = 103.7$, based on $\text{df}_1 = 2$, $\text{df}_2 = 24$

   Appendix Table 6 shows that P-value < .001. We therefore reject $H_0$ and conclude that mean yield does depend on variety.

3. Test of $H_0$: Factor B (density) main effects are absent:

   Calculated $F_B = 18.3$, based on $\text{df}_1 = 3$, $\text{df}_2 = 24$

   Again, P-value < .001, so we reject $H_0$ and conclude that mean yield does depend on planting density.
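The arithmetic behind Table 15.12 can be retraced with a short Python sketch. The function name `two_factor_f_ratios` is hypothetical, and the values $k = 3$, $l = 4$, and $m = 3$ are inferred here from the degrees of freedom in the table (2, 3, and 24) rather than quoted from the example. P-values are not computed; the sketch stops at the mean squares and F ratios.

```python
def two_factor_f_ratios(ssa, ssb, ssab, sse, k, l, m):
    """Degrees of freedom, mean squares, and F ratios for a two-factor
    ANOVA with k levels of A, l levels of B, and m observations per cell."""
    df = {"A": k - 1, "B": l - 1,
          "AB": (k - 1) * (l - 1), "Error": k * l * (m - 1)}
    ss = {"A": ssa, "B": ssb, "AB": ssab, "Error": sse}
    ms = {source: ss[source] / df[source] for source in df}
    f = {source: ms[source] / ms["Error"] for source in ("A", "B", "AB")}
    return df, ms, f


# Sums of squares from Table 15.12 (A = variety, B = density).
df, ms, f = two_factor_f_ratios(327.60, 86.69, 8.03, 38.04, k=3, l=4, m=3)

# Interaction is listed first because it is the first hypothesis tested.
for source in ("AB", "A", "B"):
    print(source, df[source], round(ms[source], 2), round(f[source], 2))
```

Using the unrounded error mean square (38.04/24 = 1.585), the F ratios come out to about 0.84, 103.3, and 18.2. The slightly larger values in Table 15.12 appear to result from dividing by the rounded value 1.58; the conclusions of the three tests are unaffected.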

After the null hypothesis of no Factor A main effects has been rejected, significant differences in Factor A levels can be identified by using a multiple-comparisons procedure. In particular, the Tukey–Kramer method described previously can be applied. The quantities $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$ are now the sample mean responses for levels $1, \ldots, k$ of Factor A, and error df is $kl(m - 1)$. A similar comment applies to Factor B main effects and significant differences in Factor B levels.

The Case m = 1

There is a problem with the analysis just described when $m = 1$ (one observation on each treatment). Although we did not give the formula, MSE is an estimate of $\sigma^2$ obtained by computing a separate sample variance $s^2$ for the $m$ observations on each treatment and then averaging these $kl$ sample variances. With only one observation on each treatment, there is no way to estimate $\sigma^2$ separately from each of the treatments.
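Written out, the verbal description of MSE corresponds to the expression below. The subscripted symbols are introduced here only for illustration and are not the text's notation: $x_{ijr}$ is the $r$th observation at level $i$ of Factor A and level $j$ of Factor B, $\bar{x}_{ij}$ is the mean of the $m$ observations in that cell, and $s_{ij}^2$ is their sample variance.

```latex
\[
\text{MSE} = \frac{1}{kl} \sum_{i=1}^{k} \sum_{j=1}^{l} s_{ij}^2,
\qquad
s_{ij}^2 = \frac{1}{m - 1} \sum_{r=1}^{m} \left( x_{ijr} - \bar{x}_{ij} \right)^2
\]
```

Combining the two pieces gives $\text{MSE} = \text{SSE}/[kl(m - 1)]$, in agreement with the error df stated earlier. When $m = 1$ the divisor $m - 1$ is zero, so no cell variance can be computed and the error df is zero.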

One way to proceed is to assume a priori that there is no interaction between factors. This should, of course, be done only when the investigator has sound reasons, based on a thorough understanding of the problem, for believing that the factors contribute separately to the response. Having made this assumption, the investigator can then use what would otherwise be an interaction sum of squares for SSE. The fundamental identity becomes

$\text{SSTo} = \text{SSA} + \text{SSB} + \text{SSE}$

with the four associated df $kl - 1$, $k - 1$, $l - 1$, and $(k - 1)(l - 1)$. Table 15.13 gives the corresponding ANOVA table. $F_A$ is the test statistic for testing the null hypothesis that mean responses are identical for all Factor A levels.

TABLE 15.13 ANOVA Table for Two-Factor Experiment with m = 1

Source of Variation   df                Sum of Squares   Mean Square                     F
Factor A              k - 1             SSA              MSA = SSA/(k - 1)               F = MSA/MSE
Factor B              l - 1             SSB              MSB = SSB/(l - 1)               F = MSB/MSE
Error                 (k - 1)(l - 1)    SSE              MSE = SSE/[(k - 1)(l - 1)]
Total                 kl - 1            SSTo


15.29 The article “Caffeine, Exercise Help Fight Skin Cancer” (The Salt Lake Tribune, July 31, 2007) summarizes a study investigating the effects of caffeine consumption and exercise on the survival of precancerous cells. The article states:

In mice, there is a protective effect from both caffeine and voluntary exercise, and when both are provided, not necessarily at the same time, protection is even more than the sum of the two, said Allan Conney of the laboratory for cancer research at Rutgers.

$F_B$ plays a similar role for Factor B main effects. The analysis of data from a randomized block experiment in fact assumed no interaction between treatments and blocks. If SSTr is relabeled SSA and if SSBl is relabeled SSB, the formulas for all sums of squares given in Section 15.3 are valid here.

EXAMPLE 15.14 Effect of Soil Type and Pipe Coating on Corrosion

When metal pipe is buried in soil, it is desirable to apply a coating to retard corrosion. Four different coatings are under consideration for use with pipe that will ultimately be buried in three types of soil. An experiment to investigate the effects of these coatings and soils was carried out by first selecting 12 pipe segments and applying each coating to 3 segments. The segments were then buried in soil for a specified period in such a way that each soil type received one piece with each coating. The resulting data (depth of corrosion) and ANOVA table are given in Table 15.14. Assuming that there is no interaction between coating type and soil type, let’s test at level .05 for the presence of separate Factor A (coating) and Factor B (soil) effects.

TABLE 15.14 Data and ANOVA Table for Example 15.14

                        Factor B (Soil)
Factor A (Coating)      1        2        3       Sample Mean
1                      64       49       50       54.33
2                      53       51       48       50.67
3                      47       45       50       47.33
4                      51       43       52       48.67
Sample mean         53.75    47.00    50.00       $\bar{\bar{x}} = 50.25$

Source of Variation   df   Sum of Squares   Mean Square   F
Factor A               3        83.5           27.8       F_A = 27.8/20.6 = 1.3
Factor B               2        91.5           45.8       F_B = 45.8/20.6 = 2.2
Error                  6       123.3           20.6
Total                 11       298.3

Appendix Table 6 shows that P-value > .10 for both tests. It appears that the mean response (amount of corrosion) depends on neither the coating used nor the type of soil in which the pipe is buried.
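The ANOVA quantities for the corrosion data can be recomputed directly from the 12 observations in Table 15.14. The Python sketch below is one way to do this under the no-interaction assumption of this subsection; the function name `additive_two_factor_anova` is hypothetical, and SSE is obtained by subtraction, as the text suggests for the randomized block case.

```python
def additive_two_factor_anova(table):
    """table[i][j] is the single observation at level i of Factor A and
    level j of Factor B (m = 1, no interaction assumed)."""
    k, l = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (k * l)
    a_mean = [sum(row) / l for row in table]
    b_mean = [sum(table[i][j] for i in range(k)) / k for j in range(l)]

    ssa = l * sum((a - grand) ** 2 for a in a_mean)
    ssb = k * sum((b - grand) ** 2 for b in b_mean)
    ssto = sum((x - grand) ** 2 for row in table for x in row)
    sse = ssto - ssa - ssb  # what would otherwise be the interaction SS

    mse = sse / ((k - 1) * (l - 1))
    return {
        "SSA": ssa, "SSB": ssb, "SSE": sse, "SSTo": ssto,
        "FA": (ssa / (k - 1)) / mse,
        "FB": (ssb / (l - 1)) / mse,
    }


# Depth of corrosion: rows are coatings 1-4, columns are soil types 1-3.
corrosion = [
    [64, 49, 50],
    [53, 51, 48],
    [47, 45, 50],
    [51, 43, 52],
]
result = additive_two_factor_anova(corrosion)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Working from the raw data gives values that differ slightly from the rounded entries in Table 15.14 (for example, SSA is about 83.58 and SSE about 123.17, with F ratios of roughly 1.36 and 2.23). Both F ratios remain small, so the conclusion of the example does not change.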

EXERCISES 15.29 - 15.37


Does this statement indicate an interaction between caffeine consumption and exercise, or does it indicate that there is no interaction between caffeine consumption and exercise? Explain.

15.30 The paper “Feedback Enhances the Positive Effects and Reduces the Negative Effects of Multiple-Choice Testing” (Memory & Cognition [2008]: 604–616) describes an experiment to investigate the effects of two factors on performance on a multiple-choice exam. The response variable was the percentage correct on a multiple-choice exam and the two factors of interest were

Factor 1: Prior study, with levels no study, study, and study + review

Factor 2: Number of choices for the questions on the exam, with levels 2, 4, and 6

Subjects were randomly assigned to one of the nine treatments corresponding to the nine prior study and number of choices combinations. The mean percentage correct for each of the treatments is shown in the accompanying table.

                     Number of Choices
Prior Study          2      4      6
No study            56     34     47
Study               69     56     47
Study + review      84     77     70

a. The authors concluded that there was no interaction between prior study and number of choices. Construct a graph of the treatment means (similar to those of Figure 15.10). Does this graph support the conclusion of no interaction? Explain.

b. The paper also included the following F statistic values for testing main effects of prior study and number of choices.

Source                 F
Prior Study           66.25
Number of Choices     73.76

The error df was reported as 69. Carry out appropriate tests to determine if the authors’ conclusion of significant main effects for both prior study and number of choices is justified.

c. Based on your answers from Parts (a) and (b), write a few sentences describing the effect of prior study and the effect of number of choices on percentage correct.

15.31 The following graphs appear in the paper “Which Thoughts Count? Algorithms for Evaluating Satisfaction in Relationships” (Psychological Science [2008]: 1030–1036). The vertical axis in both graphs represents mean score on a measure of relationship satisfaction.

[Top graph: satisfaction (vertical axis, 4.0 to 6.5) plotted against thoughts (low positive, average, high positive), with separate lines for low, average, and high approach. Bottom graph: satisfaction (vertical axis, 4.0 to 6.0) plotted against thoughts (low negative, average, high negative), with separate lines for low, average, and high avoidance.]

a. In the top graph, mean relationship satisfaction is plotted for each level of a measure of thoughts of passion (on the horizontal axis with levels low positive, average, and high positive) separately for participants with low, average, and high approach goals. Does this graph suggest an interaction between the two factors, thoughts of passion and approach goal level? What aspects of the graph support your response?

b. In the bottom graph, mean relationship satisfaction is plotted for each level of a measure of thoughts of insecurity (the horizontal axis with levels low negative, average, and high negative) separately for participants with low, average, and high avoidance goals. Does this graph suggest an interaction between the two factors, thoughts of insecurity and avoidance goal level? What aspects of the graph support your response?


c. Consider the following quote from the paper:

In short, the amount that increases in positive and negative thoughts about the relationship contributed to overall feelings of satisfaction differed in accordance with an individual’s social goals, such that positive thoughts were meaningful to participants high on approach, and negative thoughts were meaningful to those high on avoidance.

Do the given graphs support this statement? Explain.

15.32 The behavior of undergraduate students when exposed to various odors was examined by the authors of the article “Effects of Environmental Odor and Coping Style on Negative Affect, Anger, Arousal and Escape” (Journal of Applied Social Psychology [1999]: 245–260). The following table was constructed using data on reported discomfort level (measured on a scale from 1 to 5). There were 24 students in each odor–gender combination.

Type of Odor      Male $\bar{x}$   Female $\bar{x}$
No odor               1.36             1.68
Rotten egg            1.83             2.33
Skunk                 2.42             2.69
Cigarette ash         2.74             3.16

a. Construct a graph (similar to those of Figure 15.10) that shows the mean discomfort level on the vertical axis. Mark the four odor categories on the horizontal axis. Then plot the four means for the males and connect them with line segments. Plot the four means for females and connect them with line segments.

b. Interpret the interaction plot. Do you think that there is an interaction between gender and type of odor?

15.33 Explain why the individual effects of Factor A or Factor B cannot be interpreted when an AB interaction is present.

15.34 The following partially completed ANOVA table approximately matches summary statistics given in the article “From Here to Equity: The Influence of Status on Student Access to and Understanding of Science” (Science Education [1999]: 577–602). The study described in this article attempted to quantify the effect of socioeconomic status on learning science. The response variable was “Rate of Talk” (the number of on-task talk speech acts per minute) during group work. Data were also collected on the variable socioeconomic status (low, middle, high) and the variable gender (female, male). The author allowed for an interaction between gender and status. Assume that there were 12 students at each status–gender level combination, for a total of 72 subjects.

Source of Variation, df, Sum of Squares, Mean Square, F
Status 2 14.49
Gender 1 .15
Interaction 2 .95
Error 66 .0120
Total 71

a. Fill in the missing numbers in the ANOVA table.
b. Is there a significant interaction between status and gender?
c. Is there a difference between the mean “Rate of Talk” scores for girls and boys?
d. Is there a difference between the mean “Rate of Talk” scores across the three different status groups?

15.35 The article “Experimental Analysis of Prey Selection by Largemouth Bass” (Transactions of the American Fisheries Society [1991]: 500–508) gave an ANOVA summary in which the response variable was a certain preference index, there were three sizes of bass, and there were two different species of prey. Three observations were made for each size–species combination. Sums of squares for size, species, and interaction were reported as .088, .048, and .048, respectively, and SSTo = .316. Test all relevant hypotheses using a significance level of .01.

15.36 The accompanying (slightly modified) ANOVA table appeared in the article “An Experimental Test of Mate Defense in an Iguanid Lizard” (Ecology [1991]: 1218–1224). The response variable was territory size.

Source of Variation   df   Sum of Squares
Age                    1        .614
Sex                    1       1.754
Interaction            1        .146
Error                 80       5.624

a. How many age classes were there?
b. How many observations were made for each age–sex combination?
c. What conclusions can be drawn about how the factors affect the response variable?


15.37 Identification of gender in human skeletons is an important part of many anthropological studies. An experiment conducted to determine whether measurements of the sacrum could be used to determine gender was described in the article “Univariate and Multivariate Methods for Sexing the Sacrum” (American Journal of Physical Anthropology [1978]: 103–110). Sacra from skeletons of individuals of known race (Factor A, with two levels: Caucasian and black) and gender (Factor B, with two levels: male and female) were measured and the lengths recorded. Data compatible with summary quantities given in the article were used to compute the following: SSA = 857, SSB = 291, SSAB = 32, SSE = 5541, and error df = 36.

a. Use a significance level of .01 to test the null hypothesis of no interaction between race and gender.
b. Using a .01 significance level, test to determine whether the mean length differs for the two races.
c. Using a .01 significance level, test to determine whether the mean length differs for males and females.

15.5 Interpreting and Communicating the Results of Statistical Analyses

The ANOVA procedures introduced in this chapter are used to compare more than two population or treatment means. When a single-factor ANOVA has been used to test the null hypothesis of equal population or treatment means, the value of the F statistic and the associated P-value usually are reported. It is also fairly common to see the supporting calculations summarized in an ANOVA table, although this is not always the case.

What to Look For in Published Data

Here are some questions to ask when you read an article that includes a description of a single-factor ANOVA:

• Are the assumptions required for the validity of the ANOVA procedure reasonable? Specifically, are the samples independently chosen, or is there random assignment to treatments? Is it reasonable to think that the population or treatment response distributions are normal in shape? Are the reported sample standard deviations consistent with the assumption of equal population or treatment variances?

• What is the P-value associated with the test? Does the P-value lead to rejection of the null hypothesis?

• If the ANOVA F test led to rejection of the null hypothesis, was a multiple comparisons procedure used to identify differences in the means? Are the results of the multiple comparisons procedure interpreted properly?

• Are the conclusions drawn consistent with the results of the hypothesis test and the multiple comparisons procedure? If $H_0$ was rejected, does this indicate practical significance or only statistical significance?

As an example, consider the following passage from the newspaper article “Meanness Appears to Rub Off on Viewers” (USA Today, September 16, 2008):

Brigham Young University professor Sarah Coyne and colleagues asked 53 British college-aged women to watch one of three video clips, featuring either physical aggression (a knife fight from Kill Bill); relational aggression (a montage from Mean Girls); or no aggression (a séance scene from the horror movie What Lies Beneath). They then filled out a brief questionnaire and were allowed to leave the room. Right outside was another researcher who asked if they would like to participate in a study involving reaction times.


Once the women agreed to take part, the researcher behaved rudely, telling them to hurry. When they showed uneasiness, she said, “Great! This is really going to screw things up!”

The researcher left the room and the subjects took two tests that are commonly used to test aggression. Subjects who viewed the Kill Bill and the Mean Girls clips reacted in similarly aggressive ways. Prompted to subject the rude researcher to a sharp noise by pushing a button, they turned up the noise louder than a control group. They also gave lower scores than the control group on an evaluation form that supposedly was going to be used to decide whether the researcher should be hired.

This newspaper article is summarizing the findings from research that is more fully described in the paper “The Effects of Viewing Physical and Relational Aggression in the Media” (Journal of Experimental Psychology [2008]: 1551–1554). The accompanying table gives the reported means for loudness and for duration of the noises administered by subjects for each of the three video types (physical aggression, relational aggression, and no aggression).

                              Video Viewed
             Physical Aggression   Relational Aggression   No Aggression
Loudness            6.11                   5.82                 3.97
Duration            5.37                   5.23                 3.91

The authors reported that the null hypothesis that the means for noise loudness were equal for the three video types was rejected (F = 8.09, P-value < .001). The authors then used a multiple comparison procedure to conclude that the mean loudness for participants who viewed physical aggression or relational aggression was significantly higher than the mean loudness for those who viewed the no-aggression video, but that the mean loudness was not significantly different for the physical-aggression and the relational-aggression groups. Using the means provided and the underscoring method introduced in this chapter, we could illustrate this conclusion with the following display:

Physical aggression    Relational aggression    No aggression
       6.11                    5.82                  3.97
____________________________________________

Similar conclusions were reached based on the analysis of the duration data. The null hypothesis of equal mean noise duration for the three types of videos was rejected. Multiple comparisons led to the conclusion that while there was no significant difference in mean duration for the two aggression groups, mean duration for the physical- and relational-aggression video groups was significantly higher than the mean duration for the no-aggression group.

A Word to the Wise: Cautions and Limitations

When using analysis of variance methods to test hypotheses about the differences between population or treatment means, keep the following in mind:

1. In single-factor analysis of variance, the alternative hypothesis is that not all population or treatment means are the same. When we reject the null hypothesis, it does not mean that we have evidence that all the population means are different. Remember, the alternative hypothesis is not $\mu_1 \neq \mu_2 \neq \cdots \neq \mu_k$. A multiple-comparisons procedure, such as the Tukey–Kramer method presented in Section 15.2, can be used to identify which means differ.

2. As was the case for the two-sample t test of Chapter 11, when the sample sizes are small, we must be willing to assume that the population distributions are at least approximately normal in order for analysis of variance to be an appropriate method of analysis. However, there is an additional assumption that is necessary for ANOVA: that the population or treatment variances are equal. When this assumption is not reasonable, it is sometimes possible to express the data differently (by using a transformation such as a logarithm or the square root) to obtain data for which the ANOVA assumptions are more plausible. This is why it is not uncommon to see an analysis of variance performed using transformed data.

Additional Key Concepts and Formulas

TERM OR FORMULA: COMMENT

Randomized block design: An experimental design that controls for extraneous variation when comparing treatments. The experimental units are grouped into homogeneous blocks so that within each block, the units are as similar as possible. Then each treatment is used on exactly one experimental unit in every block (each treatment appears once in every block).

Randomized block F test: The four sums of squares for a randomized block design, SSTo, SSTr, SSBl, and SSE (with df $kl - 1$, $k - 1$, $l - 1$, and $(k - 1)(l - 1)$, respectively), are related by $\text{SSTo} = \text{SSTr} + \text{SSBl} + \text{SSE}$. Usually SSE is obtained by subtraction once the other three have been calculated using computing formulas. The null hypothesis is that the mean response does not depend on which treatment is applied. With mean squares $\text{MSTr} = \text{SSTr}/(k - 1)$ and $\text{MSE} = \text{SSE}/[(k - 1)(l - 1)]$, the test statistic is $F = \text{MSTr}/\text{MSE}$, based on $df_1 = k - 1$ and $df_2 = (k - 1)(l - 1)$.

Interaction between factors: Two factors are said to interact if the mean change in response associated with changing the level of one factor depends on the level of the other factor.

Two-factor ANOVA: When there are $k$ levels of factor A and $l$ levels of factor B, and $m$ ($m > 1$) observations made for each combination of A–B levels, the total sum of squares SSTo can be decomposed into SSA (sum of squares for A main effects), SSB, SSAB (interaction sum of squares), and SSE. Associated df are $klm - 1$, $k - 1$, $l - 1$, $(k - 1)(l - 1)$, and $kl(m - 1)$, respectively. The null hypothesis of no interaction between the two factors is tested using $F_{AB} = \text{MSAB}/\text{MSE}$, where $\text{MSAB} = \text{SSAB}/[(k - 1)(l - 1)]$ and $\text{MSE} = \text{SSE}/[kl(m - 1)]$. If this null hypothesis cannot be rejected, tests for A and B main effects are based on $F_A = \text{MSA}/\text{MSE}$ and $F_B = \text{MSB}/\text{MSE}$, respectively.
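The degrees of freedom and F ratios summarized above translate directly into a few lines of Python. The following is a minimal sketch, not a full ANOVA routine: the function names `randomized_block_f` and `two_factor_f_ratios` are illustrative, and the sums of squares passed in at the end are hypothetical values chosen only to exercise the formulas. P-values are not computed; they would still come from an F table or a statistics package.

```python
def randomized_block_f(ss_tr, ss_e, k, l):
    """F statistic for a randomized block design with k treatments and l blocks."""
    df1 = k - 1
    df2 = (k - 1) * (l - 1)
    ms_tr = ss_tr / df1          # MSTr = SSTr / (k - 1)
    ms_e = ss_e / df2            # MSE = SSE / [(k - 1)(l - 1)]
    return ms_tr / ms_e, (df1, df2)


def two_factor_f_ratios(ss_a, ss_b, ss_ab, ss_e, k, l, m):
    """F ratios for two-factor ANOVA with m > 1 observations per A-B combination."""
    if m <= 1:
        raise ValueError("m must exceed 1 so that the error df kl(m - 1) is positive")
    df_a, df_b = k - 1, l - 1
    df_ab = (k - 1) * (l - 1)
    df_e = k * l * (m - 1)
    ms_e = ss_e / df_e
    return {
        "F_AB": (ss_ab / df_ab) / ms_e,   # test for interaction first
        "F_A": (ss_a / df_a) / ms_e,      # then main effects, if appropriate
        "F_B": (ss_b / df_b) / ms_e,
        "df": {"A": df_a, "B": df_b, "AB": df_ab, "error": df_e},
    }


# Hypothetical sums of squares for k = 3, l = 2, m = 4 (not from the text).
ratios = two_factor_f_ratios(ss_a=30.0, ss_b=10.0, ss_ab=6.0, ss_e=36.0, k=3, l=2, m=4)
print(ratios)

# Hypothetical randomized block values for k = 4 treatments and l = 5 blocks.
f_stat, dfs = randomized_block_f(ss_tr=24.0, ss_e=12.0, k=4, l=5)
print(f_stat, dfs)
```

Returning the degrees of freedom alongside each F ratio keeps everything needed for the table lookup in one place.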


Chapter 15 Appendix: ANOVA Computations

Single-Factor ANOVA

Let $T_1$ denote the sum of the observations in the sample from the first population or treatment, and let $T_2, \ldots, T_k$ denote the other sample totals. Also let $T$ represent the sum of all $N$ observations (the grand total) and

$$\text{CF} = \text{correction factor} = \frac{T^2}{N}$$

Then

$$\text{SSTo} = \sum_{\text{all } N \text{ obs.}} x^2 - \text{CF}$$

$$\text{SSTr} = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \cdots + \frac{T_k^2}{n_k} - \text{CF}$$

$$\text{SSE} = \text{SSTo} - \text{SSTr}$$

Example 15A.1

Treatment 1   4.2   3.7   5.0   4.8   T1 = 17.7   n1 = 4
Treatment 2   5.7   6.2   6.4         T2 = 18.3   n2 = 3
Treatment 3   4.6   3.2   3.5   3.9   T3 = 15.2   n3 = 4
                                      T = 51.2    N = 11

$$\text{CF} = \text{correction factor} = \frac{T^2}{N} = \frac{(51.2)^2}{11} = 238.31$$

$$\text{SSTr} = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \cdots + \frac{T_k^2}{n_k} - \text{CF} = \frac{(17.7)^2}{4} + \frac{(18.3)^2}{3} + \frac{(15.2)^2}{4} - 238.31 = 9.40$$

$$\text{SSTo} = \sum_{\text{all } N \text{ obs.}} x^2 - \text{CF} = (4.2)^2 + (3.7)^2 + \cdots + (3.9)^2 - 238.31 = 11.81$$

$$\text{SSE} = \text{SSTo} - \text{SSTr} = 11.81 - 9.40 = 2.41$$

Randomized Block Experiment

Let $T_1, T_2, \ldots, T_k$ denote the treatment totals and $B_1, B_2, \ldots, B_l$ represent the block totals. Also, let $T$ be the grand total of all $kl$ observations and

$$\text{CF} = \text{correction factor} = \frac{T^2}{kl}$$

Then

$$\text{SSTo} = \sum_{\text{all } kl \text{ obs.}} x^2 - \text{CF}$$

$$\text{SSTr} = \frac{1}{l}\left[T_1^2 + T_2^2 + \cdots + T_k^2\right] - \text{CF}$$

$$\text{SSBl} = \frac{1}{k}\left[B_1^2 + B_2^2 + \cdots + B_l^2\right] - \text{CF}$$

$$\text{SSE} = \text{SSTo} - \text{SSTr} - \text{SSBl}$$

Example 15A.2

                      Block
Treatment     1       2       3       4
1            4.2     3.7     5.0     4.8     T1 = 17.7
2            5.2     4.5     6.7     5.4     T2 = 21.8
3            3.4     3.2     5.1     3.9     T3 = 15.6

B1 = 12.8   B2 = 11.4   B3 = 16.8   B4 = 14.1   T = 55.1

$$\text{CF} = \text{correction factor} = \frac{T^2}{kl} = \frac{(55.1)^2}{(3)(4)} = \frac{3036.01}{12} = 253.00$$

$$\text{SSTr} = \frac{1}{l}\left[T_1^2 + T_2^2 + \cdots + T_k^2\right] - \text{CF} = \frac{1}{4}\left[(17.7)^2 + (21.8)^2 + (15.6)^2\right] - 253.00 = 4.97$$

$$\text{SSBl} = \frac{1}{k}\left[B_1^2 + B_2^2 + \cdots + B_l^2\right] - \text{CF} = \frac{1}{3}\left[(12.8)^2 + (11.4)^2 + (16.8)^2 + (14.1)^2\right] - 253.00 = 5.28$$

$$\text{SSTo} = \sum_{\text{all } kl \text{ obs.}} x^2 - \text{CF} = (4.2)^2 + \cdots + (3.9)^2 - 253.00 = 10.73$$

$$\text{SSE} = \text{SSTo} - \text{SSTr} - \text{SSBl} = 10.73 - 4.97 - 5.28 = 0.48$$
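The computing formulas of this appendix can be checked with a short Python script. The sketch below is one possible arrangement: the function names `single_factor_ss` and `randomized_block_ss` are illustrative, and the data are the observations from Examples 15A.1 and 15A.2. It uses only the correction-factor formulas stated above, not a statistics library.

```python
def single_factor_ss(samples):
    """Sums of squares for single-factor ANOVA.

    samples: one list of observations per treatment (sample sizes may differ).
    """
    n_total = sum(len(s) for s in samples)
    grand_total = sum(sum(s) for s in samples)
    cf = grand_total ** 2 / n_total                           # CF = T^2 / N
    ss_to = sum(x * x for s in samples for x in s) - cf       # SSTo
    ss_tr = sum(sum(s) ** 2 / len(s) for s in samples) - cf   # SSTr
    return {"CF": cf, "SSTo": ss_to, "SSTr": ss_tr, "SSE": ss_to - ss_tr}


def randomized_block_ss(table):
    """Sums of squares for a randomized block experiment.

    table[i][j]: the observation for treatment i in block j (k rows, l columns).
    """
    k, l = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    cf = grand_total ** 2 / (k * l)                           # CF = T^2 / (kl)
    ss_to = sum(x * x for row in table for x in row) - cf
    ss_tr = sum(sum(row) ** 2 for row in table) / l - cf      # treatment totals
    ss_bl = sum(sum(col) ** 2 for col in zip(*table)) / k - cf  # block totals
    ss_e = ss_to - ss_tr - ss_bl
    return {"CF": cf, "SSTo": ss_to, "SSTr": ss_tr, "SSBl": ss_bl, "SSE": ss_e}


# Observations from Example 15A.1 (unequal sample sizes).
example_15a1 = [[4.2, 3.7, 5.0, 4.8], [5.7, 6.2, 6.4], [4.6, 3.2, 3.5, 3.9]]
# Observations from Example 15A.2 (rows are treatments, columns are blocks).
example_15a2 = [[4.2, 3.7, 5.0, 4.8], [5.2, 4.5, 6.7, 5.4], [3.4, 3.2, 5.1, 3.9]]

result_15a1 = single_factor_ss(example_15a1)
result_15a2 = randomized_block_ss(example_15a2)

for name, result in (("15A.1", result_15a1), ("15A.2", result_15a2)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

Because the script does not round intermediate values, its results can differ from the hand calculations in the last reported digit. In Example 15A.2, for instance, the unrounded SSE is 0.475, which the hand calculation, working from sums of squares already rounded to two decimals, reports as 0.48.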


Bold exercises answered in back Data set available online Video Solution available

Chapter Review Exercises 15.38 - 15.49

15.38 Suppose that a random sample of size $n = 5$ was selected from the vineyard properties for sale in Sonoma County, California, in each of 3 years. The following data are consistent with summary information on price per acre (in dollars, rounded to the nearest thousand) for disease-resistant grape vineyards in Sonoma County (Wines and Vines, November 1999).

1996   30,000   34,000   36,000   38,000   40,000
1997   30,000   35,000   37,000   38,000   40,000
1998   40,000   41,000   43,000   44,000   50,000

a. Construct boxplots for each of the 3 years on a common axis, and label each by year. Comment on the similarities and differences.

b. Carry out an ANOVA to determine whether there is evidence to support the claim that the mean price per acre for vineyard land in Sonoma County was not the same for the 3 years considered. Use a significance level of .05 for your test.

15.39 Parents are frequently concerned when their child seems slow to begin walking (although when the child finally walks, the resulting havoc sometimes has the parents wishing they could turn back the clock!). The article “Walking in the Newborn” (Science, 176 [1972]: 314–315) reported on an experiment in which the effects of several different treatments on the age at which a child first walks were compared. Children in the first group were given special walking exercises for 12 minutes per day beginning at age 1 week and lasting 7 weeks. The second group of children received daily exercises but not the walking exercises administered to the first group. The third and fourth groups were control groups: They received no special treatment and differed only in that the third group’s progress was checked weekly, whereas the fourth group’s progress was checked just once at the end of the study. Observations on age (in months) when the children first walked are shown in the accompanying table. Also given is the ANOVA table, obtained from the SPSS computer package.

              Age                                          n   Total
Treatment 1    9.00    9.50    9.75   10.00   13.00    9.50   6   60.75
Treatment 2   11.00   10.00   10.00   11.75   10.50   15.00   6   68.25
Treatment 3   11.50   12.00    9.00   11.50   13.25   13.00   6   70.25
Treatment 4   13.25   11.50   12.00   13.50   11.50           5   61.75

Analysis of Variance

Source           df   Sum of sq.   Mean Sq.   F Ratio   F Prob
Between Groups    3     14.778      4.926      2.142     .129
Within Groups    19     43.690      2.299
Total            22     58.467

a. Verify the entries in the ANOVA table.

b. State and test the relevant hypotheses using a significance level of .05.

15.40 The nutritional quality of shrubs commonly used for feed by rabbits was the focus of a study summarized in the article “Estimation of Browse by Size Classes for Snowshoe Hare” (Journal of Wildlife Management [1980]: 34–40). The energy content (cal/g) of three sizes (4 mm or less, 5–7 mm, and 8–10 mm) of serviceberries was studied. Let $\mu_1$, $\mu_2$, and $\mu_3$ denote the true mean energy content for the three size classes. Suppose that 95% simultaneous confidence intervals for $\mu_1 - \mu_2$, $\mu_1 - \mu_3$, and $\mu_2 - \mu_3$ are $(-10, 290)$, $(150, 450)$, and $(10, 310)$, respectively. How would you interpret these intervals?

15.41 Consider the accompanying data on plant growth after the application of different types of growth hormone.

Hormone 1   13   17    7   14
Hormone 2   21   13   20   17
Hormone 3   18   14   17   21
Hormone 4    7   11   18   10
Hormone 5    6   11   15    8

a. Carry out the F test at level $\alpha = .05$.

b. What happens when the T–K procedure is applied? (Note: This “contradiction” can occur when $H_0$ is “barely” rejected. It happens because the test and the multiple comparison method are based on different distributions. Consult your friendly neighborhood statistician for more information.)


15.42 The article “Learning, Opportunity to Cheat, and Amount of Reward” (Journal of Experimental Education [1977]: 30–40) described a study to determine the effects of expectations concerning cheating during a test and perceived payoff on test performance. Subjects, students at UCLA, were randomly assigned to a particular factor-level combination. Factor A was expectation of opportunity to cheat, with levels high, medium, and low. Those in the high group were asked to study and then recall a list of words. For the first four lists, they were left alone in a room with the door closed, so they could look at the original list of words if they wanted to. The medium group was asked to study and recall the list while left alone but with the door open. For the low group, the experimenter remained in the room. For study and recall of a fifth list, the experimenter stayed in the room for all three groups, thus precluding any cheating on the fifth list. Score on the fifth test was the response variable. The second factor (B) under study was the perceived payoff, with a high and a low level. The high payoff group was told that if they scored above average on the test, they would receive 2 hours of credit rather than just 1 hour (subjects were fulfilling a course requirement by participating in experiments). The low group was not given any extra incentive for scoring above the average. The article gave the following statistics: $F_A = 4.99$, $F_B = 4.81$, $F_{AB} = 1$, error df = 120. Test the null hypothesis of no interaction between the factors. If appropriate, test the null hypotheses of no Factor A and no Factor B effects. Use $\alpha = .05$.

15.43 A study was carried out to compare aptitudes and achievements of three different groups of college students (“A Comparison of Three Groups of Academically At-Risk College Students,” Journal of College Student Development [1995]: 270–279):

1. Students diagnosed as learning disabled
2. Students who identified themselves as learning disabled
3. Students who were low achievers

The Scholastic Abilities Test for Adults was given to each student. Consider the following summary data on writing composition score:

$n_1 = 30$   $n_2 = 30$   $n_3 = 30$
$\bar{x}_1 = 9.40$   $\bar{x}_2 = 11.63$   $\bar{x}_3 = 11.00$
SSE = 749.85

Does it appear that population mean score is not the same for the three types of students? Carry out a test of hypothesis. Does your conclusion depend on whether a significance level of .05 or .01 is used?

15.44 In many countries, grains and cereals are the primary food source. The authors of the article “Mineral Contents of Cereal Grains as Affected by Storage and Insect Infestation” (Journal of Stored Products Research [1992]: 147–151) investigated the effects of storage period on the mineral content of maize. Four storage periods were considered: 0 months (no storage), 1 month, 2 months, and 4 months. Twenty-four containers of maize were randomly divided into four groups of six each. The iron content (mg/100 g dry weight) of the first group of six was measured immediately (0 months of storage), the second six were measured after 1 month in storage, etc. The following summary quantities are consistent with information in the article:

Storage Period   $\bar{x}$   $s^2$
0                4.923       .000107
1                4.923       .000067
2                4.917       .000147
4                4.902       .000057

a. Use a test with $\alpha = .05$ to decide whether true average iron content is the same for all four storage periods.

b. If appropriate, carry out a multiple comparison analysis.

15.45 The accompanying ANOVA table is from the article “Bacteriological and Chemical Variations and Their Inter-Relationships in a Slightly Polluted Water-Body” (International Journal of Environmental Studies [1984]: 121–129). A water specimen was taken every month for a year at each of 15 designated locations on the Lago di Piediluco in Italy. The ammonia-nitrogen concentration was determined for each specimen and the resulting data analyzed using a two-way ANOVA. The researchers were willing to assume that there was no interaction between the two factors location and month. Complete the given ANOVA table, and use it to perform the tests required to determine whether the true mean concentration differs by location or by month of year. Use a .05 significance level for both tests.

Source of Variation   df    Sum of Squares   Mean Square   F
Location                    0.6
Month                 11    2.3
Error
Total                 179   6.4


15.46 Suppose that each observation in a single-factor ANOVA data set is multiplied by a constant $c$ (a change in units; for example, $c = 2.54$ changes observations from inches to centimeters). How does this affect MSTr, MSE, and the test statistic $F$? Is this reasonable? Explain.

15.47 Three different brands of automobile batteries, each one having a 42-month warranty, were included in a study of battery lifetime. A random sample of batteries of each brand was selected and lifetime (in months) was determined, resulting in the following data:

Brand 1   45   38   52   47   45   42   43
Brand 2   39   44   50   54   48   46   40
Brand 3   50   46   43   48   57   44   48

State and test the appropriate hypotheses using a significance level of .05. Be sure to summarize your calculations in an ANOVA table.

15.48 Let $c_1, c_2, \ldots, c_k$ denote $k$ specified numbers, and consider the quantity $\theta$ defined by

$$\theta = c_1\mu_1 + c_2\mu_2 + \cdots + c_k\mu_k$$

A confidence interval for $\theta$ is then

$$c_1\bar{x}_1 + \cdots + c_k\bar{x}_k \pm (t \text{ critical value})\sqrt{\text{MSE}\left(\frac{c_1^2}{n_1} + \cdots + \frac{c_k^2}{n_k}\right)}$$

where the t critical value is based on an error df of $N - k$. For example, in a study carried out to compare pain relievers with respect to true average time to relief, suppose that brands 1, 2, and 3 are nationally available, whereas brands 4 and 5 are sold only by two large chains of drug stores. An investigator might then wish to consider

$$\theta = \frac{1}{3}\mu_1 + \frac{1}{3}\mu_2 + \frac{1}{3}\mu_3 - \frac{1}{2}\mu_4 - \frac{1}{2}\mu_5$$

which, in essence, compares the mean time to relief of the national brands to the mean for the house brands. Refer to Exercise 15.47 and suppose that brand 1 is a store brand and brands 2 and 3 are national brands. Obtain a 95% confidence interval for

$$\theta = \mu_1 - \frac{1}{2}\mu_2 - \frac{1}{2}\mu_3$$

15.49 One of the assumptions that underlies the validity of the ANOVA F test is that the population or treatment response variances $\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2$ should be identical regardless of whether $H_0$ is true: the assumption of constant variance across populations or treatments. In some situations, the $x$ values themselves may not satisfy this assumption, yet a transformation using some specified mathematical function (for example, taking the logarithm or the square root) will give observations that have (approximately) constant variance. The ANOVA F test can then be applied to the transformed data. When observations are made on a counting variable ($x$ = number of something), statisticians have found that taking the square root will frequently “stabilize the variance.” In an experiment to compare the quality of four different brands of videotape, cassettes of a specified length were selected, and the number of flaws in each was determined.

Brand 1   10   14    5   12    8
Brand 2   17   14    8    9   12
Brand 3   13   18   15   18   10
Brand 4   14   22   12   16   17

Make a square-root transformation, and analyze the resulting data by using the ANOVA F test at significance level .01.


Answers to Selected Odd-Numbered Exercises

Chapter 15

15.23 A randomized block experiment was used to control the factor value of house, which definitely affects the assessors’ appraisals. If a completely randomized experiment had been done, then there would have been danger of having the assessors appraising houses which were not of similar value. Therefore, differences between assessors would be partly due to the fact that the homes were dissimilar, as well as to differences in the appraisals made.

15.25 $F = 117.36$, P-value $< 0.001$, reject $H_0$.

15.27 $F = 0.42$, P-value $> 0.1$, fail to reject $H_0$.

15.29 The statement indicates that there is interaction between caffeine consumption and exercise.

15.31 a. Yes. The roughly horizontal lines for “low approach”, the small positive slope for “average approach”, the greater slope for “high approach.” b. Yes. The small positive slope for “low avoidance”, the small negative slope for “average avoidance”, the larger negative slope for “high avoidance.” c. Yes. In the top graph the slope for “high approach” is more positive than that of either of the other two categories, and in the bottom graph the slope for “high avoidance” is more negative than that of either of the other two categories.

15.33 When an AB interaction is present, the change in mean response, when going from one level of factor A to another, depends upon which level of factor B is being used. Since these effects are different for different levels of factor B, the individual effects of factor A or factor B cannot be interpreted.

15.35 Test for interaction: $F = 2.18$, P-value $> 0.10$, fail to reject $H_0$; test for main effect for size: $F = 4.00$, $0.01 <$ P-value $< 0.05$, fail to reject $H_0$; test for main effect for species: $F = 4.36$, $0.05 <$ P-value $< 0.10$, fail to reject $H_0$.

15.37 a. $H_0$: There is no interaction between race and sex. $H_a$: There is interaction between race and sex. $\alpha = 0.01$. From Appendix Table 6, P-value $> 0.10$. Since the P-value exceeds $\alpha$, the null hypothesis of no interaction between race and sex is not rejected. Thus, hypothesis tests for main effects are appropriate. b. $H_0$: There are no race main effects. $H_a$: There are race main effects. $\alpha = 0.01$. From Appendix Table 6, $0.05 >$ P-value $> 0.01$. Since the P-value exceeds $\alpha$, the null hypothesis of no race main effects is not rejected. c. $H_0$: There are no sex main effects. $H_a$: There are sex main effects. $\alpha = 0.01$. From Appendix Table 6, P-value $> 0.10$. Since the P-value exceeds $\alpha$, the null hypothesis of no sex main effects is not rejected. The data are consistent with the hypothesis that the true average lengths of sacra do not differ for males and females.

15.39 a. See solutions manual for detailed computations. b. $F = 2.142$, P-value $> 0.10$, fail to reject $H_0$.

15.41 a. $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$. $H_a$: At least two of the five $\mu_i$’s are different. $\alpha = 0.05$. From Appendix Table 6, $0.05 >$ P-value $> 0.01$. Since the P-value is less than $\alpha$, $H_0$ is rejected. The data support the conclusion that the mean plant growth is not the same for all five growth hormones. b. $k = 5$, Error df $= 15$. From Appendix Table 7, $q = 4.37$. Since the sample sizes are the same, the $\pm$ factor is the same for each comparison.

$$4.37\sqrt{\frac{15.1666}{2}\left(\frac{1}{4} + \frac{1}{4}\right)} = 8.51$$

No significant differences are determined using the T-K method.

15.43 $F = 4.60$, $0.01 <$ P-value $< 0.05$, reject $H_0$ for $\alpha = .05$ but not for $\alpha = .01$.

15.45 Test for location effect: $F = 1.89$, $0.01 <$ P-value $< 0.05$, reject $H_0$; test for month effect: $F = 9.20$, P-value $< 0.001$, reject $H_0$.

15.47 Let $\mu_1$, $\mu_2$, and $\mu_3$ denote the mean lifetime for brands 1, 2 and 3 respectively. $H_0$: $\mu_1 = \mu_2 = \mu_3$. $H_a$: At least two of the three $\mu_i$’s are different. $\alpha = 0.05$. From Appendix Table 6, P-value $> 0.10$. Since the P-value exceeds $\alpha$, $H_0$ is not rejected at level of significance 0.05. The data are consistent with the hypothesis that there are no differences in true mean lifetimes of the three brands of batteries.

15.49 Let $\mu_1$, $\mu_2$, $\mu_3$, $\mu_4$ denote the mean of the square root of the number of flaws for brand 1, 2, 3 and 4 of tape, respectively. $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$. $H_a$: At least two of the four $\mu_i$’s are different. $\alpha = 0.01$. From Appendix Table 6, P-value $> 0.01$. Since the P-value exceeds $\alpha$, $H_0$ is not rejected. The data are consistent with the hypothesis that there are no differences in true mean square root of the number of flaws for the four brands of tape.