[Wiley Series in Probability and Statistics] Variations on Split Plot and Split Block Experiment...


CHAPTER 9

Missing Observations in Split Plot and Split Block Experiment Designs

9.1. INTRODUCTION

Missing observations can arise from many causes during the conduct of an experiment. Animals can invade and destroy some experimental units. Floods or fires can damage part of the experiment. Occasionally, workers unintentionally leave out some of the experimental units when setting up the experiment. A statistical analysis of the results from an experiment with missing or damaged experimental units can still be obtained, and several statistical software packages handle this situation. With such software, data analysis with missing observations is no more difficult than when there are none. SAS PROC GLM (SAS Institute, 1999–2001) is used for the numerical examples illustrating the data analysis with missing observations.

A split plot designed experiment with missing observations is presented in the next section, and a split block designed experiment with missing observations is shown in Section 9.3. Whole plot, split plot, split block whole plot, or subplot experimental units can be missing. In most cases, the SAS PROC GLM code handles all these situations with a correct adjustment of the degrees of freedom for the missing observations. Missing observations for variations of the split plot and split block experiment designs are discussed in Section 9.4. The SAS code and output for the numerical examples are given in Appendices 9.1 and 9.2.

Variations on Split Plot and Split Block Experiment Designs, by Walter T. Federer and Freedom King. Copyright © 2007 John Wiley & Sons, Inc.


9.2. MISSING OBSERVATIONS IN A SPLIT PLOT EXPERIMENT DESIGN

To illustrate the computations for a split plot experiment design, we use a numerical example and the SAS PROC GLM procedure (SAS Institute, Inc., 1999–2001). Using the data of Example 1.1 in Chapter 1, omit the two observations in replicate 4 for seedbed preparation 4 and planting methods 3 and 4, that is, data values 65.6 and 63.3. There are now 62 observations rather than 64. Using the SAS PROC GLM code shown in Appendix 9.1, an analysis of variance table with Type III sums of squares is obtained and is presented below:

Source of variation         Degrees of freedom   Sum of squares   Mean square
Replicate = R                        3                 173.87          57.96
Seedbed preparation = A              3                 214.02          71.34
Error A = A × R                      9                  97.38          10.82
Planting method = B                  3                4100.79        1366.93
A × B                                9                 236.99          26.33
Error B = B × R/A                   34                 592.74          17.43

The Type I sums of squares and the estimable least squares means (lsmeans) are given in the output for Example 9.1 in Appendix 9.1. The F-tests proceed as for the equal numbers case, that is, the error term for factor A is Error A, and the error term for factor B and for the A × B interaction is Error B. Several computer packages are able to handle this case of unequal numbers of observations.
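As a worked check of the F-tests just described, the ratios can be formed from the Type III mean squares printed in the Appendix 9.1 output. This is a sketch of the arithmetic only; the symbols F_A, F_B, and F_AB are notation introduced here and do not appear in the original text.

```latex
\begin{align*}
F_A    &= \frac{\text{MS}_A}{\text{MS}_{\text{Error A}}}
        = \frac{71.341028}{10.820132} \approx 6.59
        && \text{on } (3,\,9) \text{ df},\\
F_B    &= \frac{\text{MS}_B}{\text{MS}_{\text{Error B}}}
        = \frac{1366.929797}{17.433395} \approx 78.41
        && \text{on } (3,\,34) \text{ df},\\
F_{AB} &= \frac{\text{MS}_{AB}}{\text{MS}_{\text{Error B}}}
        = \frac{26.332111}{17.433395} \approx 1.51
        && \text{on } (9,\,34) \text{ df}.
\end{align*}
```

The F value of 4.09 printed for A in the PROC GLM output equals 71.341028/17.433395, that is, A tested against Error B. The ratio against Error A therefore has to be formed separately.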

Instead of split plot experimental units, whole plot experimental units may be missing. To illustrate this case, suppose that the disk-harrowed plots A4 in replicates 3 and 4 were missing. There would then be two missing whole plots and eight missing split plot experimental units, leaving 56 data values. A partitioning of the degrees of freedom in an analysis of variance table would be as follows:

Source of variation         Degrees of freedom
Total                               56
Correction for the mean              1
Replicate = R                        3
Seedbed preparation = A              3
Error A = A × R                      7
Planting method = B                  3
A × B                                9
Error B = B × R/A                   30

There are two missing whole plots, and the two degrees of freedom for these are taken out of the Error A degrees of freedom, leaving 9 - 2 = 7. The B × R sum of squares within whole plots A has 9 + 9 + 9 + 3 = 30 degrees of freedom: seedbed preparations A1, A2, and A3 each contribute (4 - 1)(4 - 1) = 9, and A4, which is present in only two replicates, contributes (4 - 1)(2 - 1) = 3. SAS PROC GLM provides the sums of squares and mean squares for the above partitioning of the degrees of freedom. The F-statistics may be obtained just as for no missing values.
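The missing whole plot case can be tried directly on the data of Appendix 9.1. The following is a minimal sketch, assuming the data set spex1 of Appendix 9.1 has already been created in the session; the data set name spex_wp is hypothetical. Because the two observations already omitted from spex1 lie in the whole plot for A4 in replicate 4, deleting the whole plots for A4 in replicates 3 and 4 leaves the 56 observations described above.

```sas
/* Sketch only: assumes data set spex1 (Appendix 9.1) already exists. */
/* spex_wp is a hypothetical name for the reduced data set.           */
data spex_wp;
  set spex1;
  /* Drop the two whole plots: seedbed preparation A4 in replicates 3 and 4. */
  if A = 4 and R in (3, 4) then delete;
run;

/* Same model as in Appendix 9.1, fitted to the 56 remaining observations. */
proc glm data=spex_wp;
  class R A B;
  model Y = R A R*A B A*B;
run;
quit;
```

In the resulting output, R*A should show 7 degrees of freedom and the residual line (Error B) 30, in agreement with the partitioning given above.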

9.3. MISSING OBSERVATIONS IN A SPLIT BLOCK EXPERIMENT DESIGN

Following the same steps as in the previous section, a numerical example is used to illustrate the computations for a split block designed experiment with missing observations. Using the data for Example 2.1 of Chapter 2, omit the last three observations. These are for hybrid 10 in replicate 2 (hybrid 10 carries the code 0 in the data of Appendix 9.2) and are equal to 43, 43, and 42 for generations b, c, and a, respectively. The number of observations is thereby reduced from the 60 of Chapter 2 to 57. From the computer output for Example 9.2 in Appendix 9.2, the following Type III analysis of variance is obtained:

Source of variation         Degrees of freedom   Sum of squares   Mean square
Replicate = R                        1                   0.17           0.17
Hybrid = H                           9                  66.80           7.42
Error H = H × R                      8                  67.00           8.38
Generation = G                       2                  30.67          15.34
Error G = G × R                      2                  12.11           6.06
G × H                               18                  60.50           3.36
Error GH = G × H × R                16                  22.22           1.39

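One degree of freedom is lost from Error H and two from the three-factor interaction G × H × R. As may be seen when using available software, missing observations present no computational difficulties in analyzing data. Note, however, that the output in Appendix 9.2 reports the least squares means for hybrid 10 and for the generations as non-estimable.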
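Taking the error terms named in the table (Error H for hybrids, Error G for generations, and Error GH for the interaction), the F-ratios can be sketched from the Type III mean squares in the Appendix 9.2 output. The symbols F_H, F_G, and F_GH are notation introduced here for illustration.

```latex
\begin{align*}
F_H    &= \frac{\text{MS}_H}{\text{MS}_{\text{Error H}}}
        = \frac{7.42181070}{8.37500000} \approx 0.89
        && \text{on } (9,\,8) \text{ df},\\
F_G    &= \frac{\text{MS}_G}{\text{MS}_{\text{Error G}}}
        = \frac{15.33555556}{6.05555556} \approx 2.53
        && \text{on } (2,\,2) \text{ df},\\
F_{GH} &= \frac{\text{MS}_{GH}}{\text{MS}_{\text{Error GH}}}
        = \frac{3.36131687}{1.38888889} \approx 2.42
        && \text{on } (18,\,16) \text{ df}.
\end{align*}
```

The F values of 5.34 for hyb and 11.04 for gen printed in the PROC GLM output are ratios against the residual mean square (Error GH), so they differ from the first two ratios above.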

9.4. COMMENTS

As demonstrated in Chapters 3, 4, 5, and 6, there are many variations of split plot and split block experiment designs. When missing observations occur, the same computer code as for no missing observations provides the statistical analysis in the same form. Since orthogonality is disturbed by the missing observations, Type III or Type IV analyses should be used. Statistical analyses without the use of computer software can become cumbersome. A note of caution when using software packages: always check that the numbers of degrees of freedom are correct. F-test statistics may be computed as described previously.
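The F-tests described in Sections 9.2 and 9.3 can also be requested from PROC GLM itself with TEST statements. The following is a minimal sketch, assuming the data sets spex1 and sbex of Appendices 9.1 and 9.2 already exist in the session; it is not part of the authors' appendix code. The SS3 option restricts the output to Type III sums of squares, and HTYPE=3 and ETYPE=3 ask for Type III hypothesis and error sums of squares in each test.

```sas
/* Sketch only: assumes spex1 (Appendix 9.1) and sbex (Appendix 9.2) exist. */

/* Split plot: test seedbed preparation A against Error A = R*A.            */
/* B and A*B are tested against the residual (Error B) by default.          */
proc glm data=spex1;
  class R A B;
  model Y = R A R*A B A*B / ss3;
  test h=A e=R*A / htype=3 etype=3;
run;
quit;

/* Split block: test hybrids against hyb*rep and generations against        */
/* gen*rep. gen*hyb is tested against the residual (Error GH) by default.   */
proc glm data=sbex;
  class rep hyb gen;
  model yield = rep hyb hyb*rep gen gen*rep gen*hyb / ss3;
  test h=hyb e=hyb*rep / htype=3 etype=3;
  test h=gen e=gen*rep / htype=3 etype=3;
run;
quit;
```

Comparing the degrees of freedom reported for each test with the tables in Sections 9.2 and 9.3 is one way to carry out the check recommended above.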

9.5. PROBLEMS

Problem 9.1. Omit another observation (e.g., replicate 1, planting method B1, and seedbed preparation A1) from the example discussed in Section 9.2 and perform an analysis of the remaining data.

Problem 9.2. Omit another observation (e.g., replicate 1, hybrid 7, and generation a) from the example discussed in Section 9.3 and perform an analysis of the remaining data.

Problem 9.3. A study was conducted to investigate the effect of gender (whole plot factor), age group (split plot factor), dieting (split split plot factor), and exercise regimen (split split split plot factor) on weight loss. A random sample of 300 males and 300 females was selected as the experimental subjects. The number of subjects per combination of the four factors is presented in the table that follows. Note that the numbers are unequal.

Age group     Diet   Exercise   Female   Male
Young         No         0          8      10
Young         No         1          8      10
Young         No         2          8      10
Young         No         3          8      10
Young         No         4          8      10
Young         Yes        0          8      10
Young         Yes        1          8      10
Young         Yes        2          8      10
Young         Yes        3          8      10
Young         Yes        4          8      10
Middle age    No         0         12      12
Middle age    No         1         12      12
Middle age    No         2         12      12
Middle age    No         3         12      12
Middle age    No         4         12      12
Middle age    Yes        0         12      12
Middle age    Yes        1         12      12
Middle age    Yes        2         12      12
Middle age    Yes        3         12      12
Middle age    Yes        4         12      12
Old           No         0         10       8
Old           No         1         10       8
Old           No         2         10       8
Old           No         3         10       8
Old           No         4         10       8
Old           Yes        0         10       8
Old           Yes        1         10       8
Old           Yes        2         10       8
Old           Yes        3         10       8
Old           Yes        4         10       8
Total                             300     300

(i) Obtain a partitioning of the 600 degrees of freedom into the degrees of freedom for each source of variation in an analysis of variance table.

(ii) Write SAS PROC GLM code for obtaining an analysis of variance table, F-tests, and means for all combinations and their standard errors.

(iii) Are the Type I sums of squares equal to the Type III sums of squares? Why or why not?

(iv) Simulate 600 numbers using random normal deviates plus 5 to form a data set and use your code to analyze the data set.

9.6. REFERENCE

SAS Institute, Inc. (1999–2001). Release 8.02, copyright. Cary, NC.

APPENDIX 9.1. SAS CODE FOR NUMERICAL EXAMPLE IN SECTION 9.2.

The computer code and data for the numerical example in Section 9.2 are given below:

data spex1;
  /* Y = yield, R = block (replicate), A = seedbed preparation (whole plot), */
  /* B = planting method (split plot).                                       */
  /* The last 2 observations of Example 1.1 (R = 4, A = 4, B = 3 and 4)      */
  /* are omitted.                                                            */
  input Y R A B @@;   /* @@ reads several observations from one data line */
  datalines;
82.8 1 1 1  46.2 1 1 2  78.6 1 1 3  77.7 1 1 4
72.2 2 1 1  51.6 2 1 2  70.9 2 1 3  73.6 2 1 4
72.9 3 1 1  53.6 3 1 2  69.8 3 1 3  70.3 3 1 4
74.6 4 1 1  57.0 4 1 2  69.6 4 1 3  72.3 4 1 4
74.1 1 2 1  49.1 1 2 2  72.0 1 2 3  66.1 1 2 4
76.2 2 2 1  53.8 2 2 2  71.8 2 2 3  65.5 2 2 4
71.1 3 2 1  43.7 3 2 2  67.6 3 2 3  66.2 3 2 4
67.8 4 2 1  58.8 4 2 2  60.6 4 2 3  60.6 4 2 4
68.4 1 3 1  54.5 1 3 2  72.0 1 3 3  70.6 1 3 4
68.2 2 3 1  47.6 2 3 2  76.7 2 3 3  75.4 2 3 4
67.1 3 3 1  46.4 3 3 2  70.7 3 3 3  66.2 3 3 4
65.6 4 3 1  53.3 4 3 2  65.6 4 3 3  69.2 4 3 4
71.5 1 4 1  50.9 1 4 2  76.4 1 4 3  75.1 1 4 4
70.4 2 4 1  65.0 2 4 2  75.8 2 4 3  75.8 2 4 4
72.5 3 4 1  54.9 3 4 2  67.6 3 4 3  75.2 3 4 4
67.8 4 4 1  50.2 4 4 2
;

proc glm data=spex1;
  class R A B;
  model Y = R A R*A B A*B;
  lsmeans A B A*B;
run;


The computer output from the above code and data set is presented below:

The GLM Procedure

Dependent Variable: Y

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              27       4945.101519     183.151908      10.51    <.0001
Error              34        592.735417      17.433395
Corrected Total    61       5537.836935

R-Square    Coeff Var    Root MSE    Y Mean
0.892966     6.305765    4.175332    66.21452

Source    DF      Type I SS    Mean Square    F Value    Pr > F
R          3     221.932918      73.977639       4.24    0.0119
A          3     199.166118      66.388706       3.81    0.0187
R*A        9     189.280400      21.031156       1.21    0.3232
B          3    4097.733083    1365.911028      78.35    <.0001
A*B        9     236.989000      26.332111       1.51    0.1840

Source    DF    Type III SS    Mean Square    F Value    Pr > F
R          3     173.866083      57.955361       3.32    0.0311
A          3     214.023083      71.341028       4.09    0.0139
R*A        9      97.381190      10.820132       0.62    0.7710
B          3    4100.789391    1366.929797      78.41    <.0001
A*B        9     236.989000      26.332111       1.51    0.1840

Least Squares Means

A      Y LSMEAN
1    68.3562500
2    64.0625000
3    64.8437500
4    67.9583333

B      Y LSMEAN
1    71.4500000
2    52.2875000
3    70.8604167
4    70.6229167

A    B      Y LSMEAN
1    1    75.6250000
1    2    52.1000000
1    3    72.2250000
1    4    73.4750000
2    1    72.3000000
2    2    51.3500000
2    3    68.0000000
2    4    64.6000000
3    1    67.3250000
3    2    50.4500000
3    3    71.2500000
3    4    70.3500000
4    1    70.5500000
4    2    55.2500000
4    3    71.9666667
4    4    74.0666667

APPENDIX 9.2. SAS CODE FOR NUMERICAL EXAMPLE IN SECTION 9.3.

The computer code for the data of the example in Section 9.3 is presented below:

data sbex;
  /* The last 3 observations for Example 2.1 (hyb = 0 in rep = 2) */
  /* were omitted for this example.                               */
  input yield rep hyb gen @@;   /* @@ reads several observations from one data line */
  datalines;
48 1 3 1  46 1 3 3  43 1 3 2
46 1 8 1  45 1 8 3  42 1 8 2
46 1 2 1  44 1 2 3  42 1 2 2
42 1 1 1  46 1 1 3  44 1 1 2
43 1 6 1  45 1 6 3  44 1 6 2
47 1 7 1  49 1 7 3  47 1 7 2
48 1 0 1  45 1 0 3  45 1 0 2
46 1 9 1  48 1 9 3  47 1 9 2
46 1 4 1  48 1 4 3  47 1 4 2
49 1 5 1  49 1 5 3  48 1 5 2
46 2 4 2  48 2 4 3  42 2 4 1
45 2 3 2  44 2 3 3  42 2 3 1
46 2 9 2  46 2 9 3  44 2 9 1
45 2 5 2  45 2 5 3  43 2 5 1
43 2 1 2  50 2 1 3  44 2 1 1
48 2 7 2  51 2 7 3  48 2 7 1
44 2 2 2  48 2 2 3  47 2 2 1
44 2 8 2  46 2 8 3  46 2 8 1
47 2 6 2  48 2 6 3  44 2 6 1
;

proc glm data=sbex;
  class rep hyb gen;
  model yield = rep hyb hyb*rep gen gen*rep gen*hyb;
  lsmeans hyb gen gen*hyb;
run;

The output of the above code and data set is presented below in an abbreviated form:

Class Level Information

Class    Levels    Values
rep           2    1 2
hyb          10    0 1 2 3 4 5 6 7 8 9
gen           3    1 2 3

Number of observations    57

Dependent Variable: yield

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              40       247.8128655      6.1953216       4.46    0.0011
Error              16        22.2222222      1.3888889
Corrected Total    56       270.0350877

R-Square    Coeff Var    Root MSE    yield Mean
0.917706     2.574747    1.178511      45.77193

Source     DF      Type I SS    Mean Square    F Value    Pr > F
rep         1     0.23879142     0.23879142       0.17    0.6839
hyb         9    66.79629630     7.42181070       5.34    0.0018
rep*hyb     8    67.00000000     8.37500000       6.03    0.0012
gen         2    36.35087719    18.17543860      13.09    0.0004
rep*gen     2    16.92319688     8.46159844       6.09    0.0108
hyb*gen    18    60.50370370     3.36131687       2.42    0.0409

Source     DF    Type III SS    Mean Square    F Value    Pr > F
rep         1     0.16666667     0.16666667       0.12    0.7335
hyb         9    66.79629630     7.42181070       5.34    0.0018
rep*hyb     8    67.00000000     8.37500000       6.03    0.0012
gen         2    30.67111111    15.33555556      11.04    0.0010
rep*gen     2    12.11111111     6.05555556       4.36    0.0308
hyb*gen    18    60.50370370     3.36131687       2.42    0.0409

Least Squares Means

hyb    yield LSMEAN
0      Non-est
1      44.8333333
2      45.1666667
3      44.6666667
4      46.1666667
5      46.5000000
6      45.1666667
7      48.3333333
8      44.8333333
9      46.1666667

gen    yield LSMEAN
1      Non-est
2      Non-est
3      Non-est

hyb    gen    yield LSMEAN
0      1      Non-est
0      2      Non-est
0      3      Non-est
1      1      43.0000000
1      2      43.5000000
1      3      48.0000000
2      1      46.5000000
2      2      43.0000000
2      3      46.0000000
3      1      45.0000000
3      2      44.0000000
3      3      45.0000000
4      1      44.0000000
4      2      46.5000000
4      3      48.0000000
5      1      46.0000000
5      2      46.5000000
5      3      47.0000000
6      1      43.5000000
6      2      45.5000000
6      3      46.5000000
7      1      47.5000000
7      2      47.5000000
7      3      50.0000000
8      1      46.0000000
8      2      43.0000000
8      3      45.5000000
9      1      45.0000000
9      2      46.5000000
9      3      47.0000000
