One-at-a-Time Plans

One-at-a-Time Plans Author(s): Cuthbert Daniel Source: Journal of the American Statistical Association, Vol. 68, No. 342 (Jun., 1973), pp. 353- 360 Published by: American Statistical Association Stable URL: http://www.jstor.org/stable/2284076 . Accessed: 15/06/2014 13:57 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp . JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. . American Statistical Association is collaborating with JSTOR to digitize, preserve and extend access to Journal of the American Statistical Association. http://www.jstor.org This content downloaded from 195.34.78.245 on Sun, 15 Jun 2014 13:57:11 PM All use subject to JSTOR Terms and Conditions


One-at-a-Time Plans CUTHBERT DANIEL*

One-at-a-time experiments are always done when the experimental system is set up to produce single results or pairs of results. When random error is small compared to main effects expected, such experiments are economical, but may give biased estimates. These biases can usually be described by two-factor interactions (2fi). Minimal augmentations of standard one-at-a-time sequences are given, first to separate main effects from 2fi, then to estimate each 2fi separately. Each new datum produces one or more new estimates.

1. INTRODUCTION

Some scientists do their experimental work in single steps. They hope to learn something from each run, or trial. They see and react to their data more rapidly than experimental agronomists or clinical investigators whose endpoints may take months or years to materialize. The statistician who tells such an experimenter that he can secure much greater precision or validity by doing 16 or more runs in a balanced set, may be listened to with courtesy, but rarely with enthusiasm.

If a one-finding-at-a-time experimenter has in fact found out a good deal by his methods, it must be true that the effects he has found are at least three or four times his average random error per trial. The plans proposed in this article are directed to such experimenters. The plans are arranged to produce data of greater validity, i.e., of less bias, than the usual sequences of one-at-a-time trials. The improvement comes in removing two-factor interaction (henceforth 2fi) biases from main effect estimates. Minimal augmentations are given, both for separating effects from 2fi, and for estimation of single 2fi components.

The one-finding-at-a-time scientist usually varies one factor (independent variable) at a time. It is convenient to divide one-at-a-time plans into five categories, since their informative augmentations follow different patterns. I name these five types: strict, standard, paired, free, and nested or curved.

1. Strict o.a.t. plans vary one factor from the condition of the last preceding trial.

2. Standard o.a.t. plans vary one factor from some standard condition.

3. Paired o.a.t. plans produce two observations and hence one simple comparison at a time.

4. Free o.a.t. plans are for those who can make each new run under any conditions that appear useful, after study of all earlier runs.

5. Nested or curved o.a.t. plans produce a subset of results by variation of one easy-to-vary factor, with all others held constant.

*Cuthbert Daniel is a private consultant, R. D. 2, Rhinebeck, N.Y. 12572. The substance of this article was read as the Fisher Memorial Lecture at the annual meeting of the American Statistical Association at Colorado State University, on August 25, 1971. The author is indebted to J.T. Daniel, O. Kempthorne, B.H. Margolin, H. Scheffé, J.W. Tukey, M. Zelen, and to a referee for helpful comments.

It almost, but not quite, goes without saying that we must give up most randomization if we are to do o.a.t. plans. Randomization is a form of insurance against bias from selection, planned or accidental, of experimental material or of treatment. Some situations require more insurance than others. The two commonest sources of bias are, I believe, model error and uncontrollable time-trends. (Unconscious selection of favorable allocation of experimental units to treatments seems to me to be rare in serious scientific experimentation.) The plans proposed here express and detect model error as two-factor interactions.

Uncontrollable time-trends are mainly dangerous to o.a.t. plans when they are long compared to the time required to do a single run (trial). The two natural routes to take to minimize the effects of such trends are: (1) to use blocks of two trials, as close together as possible, and (2) to use a trend-free or trend-robust design. We will take both routes, but not simultaneously.

The factorial representation and its nomenclature have been used unchanged for nearly forty years. Both are well expounded in many texts, e.g., [7, 4, 2]. For our present purposes, only the elementary logic and symbolism of the 2^n series will be required. One level of each factor is indicated by the presence of the corresponding lower-case letter in a symbol; the other level is indicated by the absence of that letter. Thus in a three-factor experiment, ac means that a trial is to be made at "high" levels of the two factors A and C (the assignment of the term high is arbitrary) and at the "low" level of B. The symbol (1) means a trial with all factors at their low levels. The average effect, measured from the mean, of varying factor A over all combinations of levels of the other factors is indicated by A, its estimate by Â. If a 2^2 factorial experiment shows that factor A has different effects at the two levels of factor B, we call this form of non-additivity "the two-factor interaction AB" and measure it by the obvious difference of the two "simple" A-effects:

[(ab - b) - (a - (1))]/4 = [ab - a - b + (1)]/4.

Conforming to tradition, we use the same symbol to denote the conditions for a trial and the result of that trial. If a full 2^3 is done, then the expression for AB just given must be averaged with the corresponding quantity for the four runs at high C. The difference between these two measures of AB gives (twice) the three-factor interaction, ABC.

© Journal of the American Statistical Association, June 1973, Volume 68, Number 342, Theory and Methods Section

A1 and A2. STRICT ONE-AT-A-TIME MINIMAL 2^3//4 ... 8; TIME-TRENDS NEGLIGIBLE

Run  Spec.  Estimable
1    (1)
2    a      A - AB - AC
3    ab     B + AB - BC
4    abc    C + AC + BC
5    bc     A; AB + AC
6    c      B; C; AB - BC
7    ac     AB; AC; BC
8    b      ABC; all other effects with double precision.

Run 8 is not strict o.a.t.

[Cube diagrams A1 and A2, showing the runs in factor space with axes A, B, C, are not reproduced.]

For a 2^n factorial experiment, a treatment combination may be represented by an n-tuple, x = (x1, x2, ..., xn), where each xi is ±1. Thus the factorial representation of the result of a single run in a 2^3 may be written:

y(x) = M + A x1 + B x2 + C x3 + (AB) x1 x2 + (AC) x1 x3 + (BC) x2 x3 + (ABC) x1 x2 x3 + e(x),

where A, B, C, (AB), etc. are parameter values,

M is the average of all eight error-free responses,

xi = -1 at the low level of A, B, or C for i = 1, 2, 3,
   = +1 at the high level of the corresponding factor.

The e(x) represents the random independent error in this particular trial, and is assumed to have expected value zero and constant variance, σ^2. The symbols (AB), etc. are inseparable, not products. Their parentheses will usually be omitted. As an example, for run ac we write

E{ac} = M + A - B + C - AB + AC - BC - ABC.
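The sign rule behind this expansion is mechanical enough to sketch in code. The Python fragment below is only an illustration, and the helper names `sign` and `expected_signs` are inventions for this sketch rather than anything from the article. It lists the sign attached to each parameter in the expected value of a run of a 2^3; M always enters with +1.

```python
from itertools import combinations

FACTORS = "ABC"  # three two-level factors, as in the 2^3 example

def sign(run, effect):
    """Product of the +/-1 levels of the factors named in `effect` for `run`.

    A factor is at +1 when its lower-case letter appears in the run symbol,
    and at -1 otherwise; the symbol "(1)" contains no factor letters.
    """
    s = 1
    for f in effect:
        s *= 1 if f.lower() in run else -1
    return s

def expected_signs(run):
    """Signs of A, B, C, AB, AC, BC, ABC in E{run} (hypothetical helper)."""
    effects = ["".join(c) for r in (1, 2, 3) for c in combinations(FACTORS, r)]
    return {e: sign(run, e) for e in effects}

print(expected_signs("ac"))
# {'A': 1, 'B': -1, 'C': 1, 'AB': -1, 'AC': 1, 'BC': -1, 'ABC': -1}
```

The printed signs agree with the expansion of E{ac} given in the text.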

It is common experience that 2fi's are rarer than main effects. Three-factor interactions (3fi) are generally rarer than 2fi. We propose in this article to assume that all 3fi are zero or negligible, but to retain symbols for all 2fi. We may drop this assumption about 2fi if evidence turns up that sums of two or more are small.

2. STRICT ONE-AT-A-TIME PLANS

2.1 The 2^3 in Six to Eight Runs, or 2^3//6 ... 8

We start with the simplest instructive case. The experimenter has three factors, A, B, and C under study. He has made four runs, each varying a factor from the conditions of the preceding run, thus: (1), a, ab, abc, as diagrammed in factor space in Figure A1. He now wants safer, more generalizable, estimates of the average effects of A, B, and C than are given by the obvious comparisons between the results of consecutive runs.

The simple A-effect from the first two runs, a - (1), has expected value 2(A - AB - AC + ABC). As J.W. Gorman has shown me, the easiest way to find this expression, and most others in this article, is to enter +1 or -1 in the corresponding places in Yates' standard order [7, p. 15], or [2, p. 264] for the 2^n, and then to carry through the familiar addition and subtraction algorithm. The non-zero entries in the nth column (here n = 3) identify the aliases. Thus:

Spec.  (0)  (1)  (2)  (3)  Effect
(1)    -1    0    0    0
a      +1    0    0    2   A
b       0    0    2    0   B
ab      0    0    0   -2   AB
c       0    2    0    0   C
ac      0    0    0   -2   AC
bc      0    0   -2    0   BC
abc     0    0    0    2   ABC.

It is worth emphasizing that this computation is not done for the purpose of finding effects from data, but for getting aliases from designs.
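A minimal Python sketch of this use of the algorithm follows. It assumes the usual sum-then-difference form of Yates' procedure on entries in standard order; the function name `yates` and the label list are choices made for this illustration.

```python
def yates(column):
    """Yates' sum-and-difference algorithm for 2^n entries in standard order.

    Each pass writes the sums of successive pairs followed by their
    differences (second minus first); n passes are made for 2^n entries.
    """
    n = len(column).bit_length() - 1
    col = list(column)
    for _ in range(n):
        pairs = [(col[i], col[i + 1]) for i in range(0, len(col), 2)]
        col = [p + q for p, q in pairs] + [q - p for p, q in pairs]
    return col

# Standard order for the 2^3: (1), a, b, ab, c, ac, bc, abc.
LABELS = ["I", "A", "B", "AB", "C", "AC", "BC", "ABC"]

# The contrast a - (1): enter -1 at (1), +1 at a, and 0 elsewhere.
contrast = [-1, 1, 0, 0, 0, 0, 0, 0]
aliases = {lab: v for lab, v in zip(LABELS, yates(contrast)) if v != 0}
print(aliases)  # {'A': 2, 'AB': -2, 'AC': -2, 'ABC': 2}
```

Entering the design contrast rather than data is the whole point: the output names the parameters that the contrast estimates together, with their multipliers.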

It is easy to see that the expected value of the A-effect, measured at high B and C, i.e., by (abc - bc), is 2(A + AB + AC + ABC), and hence that we can estimate the effect of A free of 2fi by adding to the o.a.t. plan of Fig. A1, the single run bc, since

E{a - (1) + abc - bc} = 4(A + ABC).

Similarly, by going to c, again by strict o.a.t. variation, we can estimate B free of 2fi bias from the two pairs of runs in which B was the only factor varied. The effect of C can also be estimated from (c - (1) + abc - ab)/4.

It should be clear from Figure A2 that the augmenting runs must lower the levels of the factors in the same order in which they were raised. The six-run plan permits the estimation of the three main effects unaliased with any 2fi and so is, in familiar terminology, of Resolution IV. This plan, with its obvious extensions to n > 3, requires the minimum number of runs [6] for a Resolution IV 2^n, namely 2n. They all have the additional property of being connected so that runs can be made in strict o.a.t. order.

A3. STRICT O.A.T. MINIMAL 2^5//10 ... 16; TIME-TRENDS NEGLIGIBLE

Run  Spec.   Estimable
1    (1)
2    a       A - ints. with A
3    ab      B ± ints. with B
4    abc     C ± ints. with C
5    abcd    D ± ints. with D
6    abcde   E ± ints. with E
7    bcde    A; AB + AC + AD + AE
8    cde     B; AB - BC - BD - BE
9    de      C; AC + BC - CD - CE
10   e       D; AD + BD + CD - DE; E; AE + BE + CE + DE
11   ae      AE
12   ade     AD
13   acde    AC; AB
14   acd     BE
15   ac      BC; BD
16   ace     CD; CE; DE

Estimates of pairs of 2fi can be made from the same data. Thus

E{(abc - bc) - (a - (1))} = 4(AB + AC),

E{(bc - c) - (ab - a)} = 4(BC - AB),

and

E{(abc - ab) - (c - (1))} = 4(AC + BC).

If these estimates come out small, the experimenter probably need not worry about 2fi. It is conceivable, but hardly likely, that there is in fact a large positive AB and an equally large negative AC. If the experimenter feels this a contingency worth guarding against, he has only to read further.
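The expected values of such contrasts among runs can be checked mechanically. The sketch below is an illustration only: the helper names are invented here, and the three-factor interaction is carried along so that its cancellation, or its survival, is visible.

```python
from itertools import combinations

EFFECTS = ["".join(c) for r in (1, 2, 3) for c in combinations("ABC", r)]

def sign(run, effect):
    """+1/-1 coefficient of `effect` in the expected value of `run`."""
    s = 1
    for f in effect:
        s *= 1 if f.lower() in run else -1
    return s

def expectation(contrast):
    """Map a contrast {run: coefficient} to its non-zero expected-value terms."""
    out = {}
    for e in EFFECTS:
        total = sum(c * sign(run, e) for run, c in contrast.items())
        if total:
            out[e] = total
    return out

# Contrasts available from the six runs (1), a, ab, abc, bc, c.
a_clear = {"a": 1, "(1)": -1, "abc": 1, "bc": -1}
ab_plus_ac = {"abc": 1, "bc": -1, "a": -1, "(1)": 1}
bc_minus_ab = {"bc": 1, "c": -1, "ab": -1, "a": 1}
ac_plus_bc = {"abc": 1, "ab": -1, "c": -1, "(1)": 1}

print(expectation(a_clear))      # {'A': 4, 'ABC': 4}
print(expectation(ab_plus_ac))   # {'AB': 4, 'AC': 4}
print(expectation(bc_minus_ab))  # {'AB': -4, 'BC': 4}
print(expectation(ac_plus_bc))   # {'AC': 4, 'BC': 4}
```

The main-effect contrast is free of all 2fi, while each of the other three measures a sum or difference of two 2fi, which is why a seventh run is needed to separate them.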

If any of the three contrasts measuring the sum or difference of two 2fi turns out to be large, it will surely be important to separate its components. The single run ac will accomplish this. Inspection of the 2^3 sketched in Figure A2 shows why this is so. The run ac completes three faces of the cube, each one of which permits the estimation of a 2fi. The statistician-reader who is used to seeing data with large error may well need to be reminded that we are dealing with systems whose effects are large compared to their errors. The fact that the 2fi estimates given here have twice the variance of those from the full 2^3 is then not seriously adverse and is more than balanced by the advantage of optional stopping.
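The variance statement can be illustrated with a small exact computation. The sketch below assumes, as the text does, that the 3fi is zero, so that seven runs determine the seven parameters M, A, B, C, AB, AC, BC. The Gauss-Jordan routine and all names are illustrative choices, not the author's method of hand calculation.

```python
from fractions import Fraction

RUNS = ["(1)", "a", "ab", "abc", "bc", "c", "ac"]
PARAMS = ["M", "A", "B", "C", "AB", "AC", "BC"]  # assumption: ABC taken as zero

def sign(run, effect):
    """+1/-1 coefficient of `effect` in the expected value of `run`."""
    if effect == "M":
        return 1
    s = 1
    for f in effect:
        s *= 1 if f.lower() in run else -1
    return s

def invert(matrix):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

X = [[sign(run, p) for p in PARAMS] for run in RUNS]
X_inv = invert(X)  # row k holds the run coefficients of the estimator of PARAMS[k]

estimators = {p: dict(zip(RUNS, X_inv[k])) for k, p in enumerate(PARAMS)}
variances = {p: sum(c * c for c in X_inv[k]) for k, p in enumerate(PARAMS)}  # units of sigma^2

for p in ("AB", "AC", "BC"):
    print(p, variances[p], {r: str(c) for r, c in estimators[p].items() if c})
```

Each 2fi comes out as a four-run face contrast divided by 4, with variance σ^2/4, against σ^2/8 for a regression coefficient from the full eight-run 2^3.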

We now extend this simple example in three directions: to get better elimination of time-trends, to study more than three two-level factors, and to study factors at more than two levels.

2.2 The 2^3//6 ... 8 with Elimination of Linear and Quadratic Time-Trends

We start with the 2^3//6 of the preceding section. Since two more requirements are to be met, namely the elimination of linear (L) and quadratic (Q) time trends, two additional runs will be needed. The runs must repeat earlier experimental conditions. We choose the conditions (1) and a in that order, to maintain our strict o.a.t. regime. This will give the A-effect with half the variance of the B and C estimates. We could of course choose b or c instead of a.

It is assumed that trials can be made in nearly equally-spaced periods of time. If they cannot, then the simple integer coefficients given later will not be correct. Adding linear and quadratic time terms to the usual factorial representation and solving the resulting equations for parameter estimates will then usually require some computer aid.

Figure B shows the two extra runs added as numbers 7 and 8 to the former 2^3//6 of Figure A2. Each estimate before run number 8 is inevitably biased by L or Q or by both. It is of some interest to see that early estimates are not equally trend-biased, and that the L-bias is eliminated for the three main effects at runs 5, 6, and 7, respectively.

B. STRICT ONE-AT-A-TIME; 2^3//4 ... 10; LINEAR (L) AND QUADRATIC (Q) TIME-TRENDS ASSUMED

Run  Spec.  Estimable
1    (1)
2    a      A - AB - AC + L - 3Q
3    ab     B + AB - BC + L - 2Q
4    abc    C + AC + BC + L - Q
5    bc     A - 3Q/2; AB + AC - L + 3Q/2
6    c      B - 3Q/2; AB - BC - 2L + Q
7    (1)    C - 3Q/2; AC + BC + L + Q/2
8    a      A; B; C; AB + AC; AB - BC; L; Q
9    ac     AB; AC; BC
10   b      ABC; all other effects with nearly doubled precision.

The orderly solution of the eight factorial representations of the single responses, with L and Q added, bringing in one new result at a time, gives all the expected values and corresponding estimators shown in lines 2 to 8 of Figure B. The estimators after run 8 are shown in Table 1 as regression coefficients, like all parameter-estimates in this article. These are half the overall effects used by Yates, Kempthorne and Davies for the two-level factorials. The L and Q coefficients are scaled by the familiar Fisher-Chebyshev orthogonal polynomial integer coefficients [3].

The reader will notice that the L and Q regression coefficients are estimated by the familiar contrasts for four equally-spaced or -timed observations, even though these are not so spaced. The contrast for 8Â is a weighted average of the three intuitive estimators of A, needing no further trend correction. The coefficient -3 for a in 8B̂ is "really" (-2 - 1). The -2 gives with the other three ±2's an estimate of (8B - 12Q). The four ±1's appear in the right places to estimate +12Q.
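That the main-effect contrasts need no further trend correction can be checked directly. The sketch below takes the coefficients of the three main-effect rows of Table 1, with blank cells read as zero. The assignment of the blanks to particular runs is an assumption made for this illustration, as are the function names.

```python
from fractions import Fraction

# Coefficients for runs 1..8, i.e. (1), a, ab, abc, bc, c, (1), a.
# Assumption: blank cells of Table 1 are zeros placed as shown here.
CONTRASTS = {
    "8A": [-1, 1, 0, 2, -2, 0, -1, 1],
    "8B": [1, -3, 2, 0, 2, -2, -1, 1],
    "8C": [1, -1, -2, 2, 0, 2, -3, 1],
}

def trend_sums(coeffs):
    """Sums of c, c*t and c*t^2 over equally spaced run times t = 1, 2, ..."""
    times = range(1, len(coeffs) + 1)
    return tuple(sum(c * t ** k for c, t in zip(coeffs, times)) for k in (0, 1, 2))

def variance_factor(coeffs, divisor=8):
    """Var(estimate)/sigma^2 when the contrast estimates divisor * parameter."""
    return Fraction(sum(c * c for c in coeffs), divisor ** 2)

for name, coeffs in CONTRASTS.items():
    print(name, trend_sums(coeffs), variance_factor(coeffs))
# 8A (0, 0, 0) 3/16
# 8B (0, 0, 0) 3/8
# 8C (0, 0, 0) 3/8
```

All three contrasts are orthogonal to a constant, a linear and a quadratic function of run time, and the variance factors agree with the Var column of Table 1.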

I conjecture that a similar partition of any contrast whose coefficients are not all ±1 (but which are small integers) is always feasible and will often be illuminating.

This plan gets all its information on quadratic trend from pairs of runs at the ends of the sequence, and therefore involves a risky correction to the results of the intermediate runs. If the experimenter can break the strict o.a.t. rule for one run, it is natural to ask for a "standard" run, (1), in the middle of the sequence. This produces a slightly more precise design. It is a safer plan since it takes its measure of quadratic trend partly from the middle of the sequence. As before, the runs ac and b will give 2fi estimability and improved precision, respectively.

1. ESTIMATION MATRIX FOR STRICT o.a.t. 2^3//8 WITH LINEAR AND QUADRATIC CORRECTIONS

Run           1    2    3    4    5    6    7    8
Spec.        (1)   a    ab   abc  bc   c   (1)   a    Var/σ^2  Eff.

8Â           -1   +1    0   +2   -2    0   -1   +1    3/16     2/3
8B̂           +1   -3   +2    0   +2   -2   -1   +1    3/8      1/3
8Ĉ           +1   -1   -2   +2    0   +2   -3   +1    3/8      1/3
12(AB + AC)  +1   -2    0   +3   -3    0   +2   -1    7/36     9/14
12(AB - BC)  +1   -3   +3    0   -3   +3   -1    0    19/72    9/19
24L̂          -1   -1    0    0    0    0   +1   +1    1/144    6/7
12Q̂          +1   -1    0    0    0    0   -1   +1    1/36     3/14

C1. STRICT ONE-AT-A-TIME 3^3//20 ... 24. TIME-TRENDS NEGLIGIBLE. MAIN EFFECTS CLEARED FIRST

[Diagram of the numbered path through the 3 x 3 x 3 grid not reproduced.]

2.3 2^n Plans for Four or More Factors

Strict o.a.t. plans for the 2^4//8 and the 2^5//10 are exactly analogous to the 2^3//6 already discussed. If simple trends are likely, then 2n + 2 runs will suffice. But augmentation to estimate 2fi brings minor surprises.

The 2^4//8 can only be extended in one way, namely to ad (permitting estimation of AD). Then there are two alternatives for the next run, abd (to estimate BD and CD), or acd (to estimate AB and AC). The final run in each sequence is then forced. It is bd for the former and ac for the latter (to estimate all remaining 2fi).

The 2^5//10 ... 16 is shown in Figure A3, for ease of comparison with Figure A2. It has been assumed that the experimenter wants the interactions with A to be estimable first. Starting with ae it is also possible to continue with abe, abce, bce, ce, and ace and so find first the interactions with E, then those with D.

The only other starting point (at run 11) is ce, which permits the interactions with D to be estimated first. There are a few minor permutations possible after ce, but it is almost disappointing to find that so very few paths (six in all) through the five dimensional factorial grid are admissible in the sense that they provide at least one new estimate after each run, and that they yield all main effect and 2fi estimates in 16 runs. This limitation can perhaps be made useful as a reminder to the experimenter that if he commits himself to strict o.a.t. plans, then only a few paths of minimal length are possible.
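Whether a proposed sequence of runs obeys the strict o.a.t. rule is easy to check by machine. The following sketch (function names are hypothetical) reports the first run that differs from its predecessor in other than exactly one factor; it checks strictness only, not estimability.

```python
def levels(run, factors):
    """Tuple of booleans: is each factor at its high level in this run?"""
    return tuple(f in run for f in factors)

def first_violation(path, factors):
    """Index of the first run that changes != 1 factor from its predecessor.

    Returns None when the whole path is strict one-at-a-time.
    """
    for i in range(1, len(path)):
        prev, cur = levels(path[i - 1], factors), levels(path[i], factors)
        if sum(p != c for p, c in zip(prev, cur)) != 1:
            return i
    return None

# The sixteen runs of Figure A3, in order.
PATH_A3 = ["(1)", "a", "ab", "abc", "abcd", "abcde", "bcde", "cde", "de", "e",
           "ae", "ade", "acde", "acd", "ac", "ace"]
# The eight runs of Figure A2, in order.
PATH_A2 = ["(1)", "a", "ab", "abc", "bc", "c", "ac", "b"]

print(first_violation(PATH_A3, "abcde"))  # None
print(first_violation(PATH_A2, "abc"))    # 7, i.e. the eighth run, b
```

The alternative continuation named in the text (ae, abe, abce, bce, ce, ace after run 10) passes the same check.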

2.4 The 3^3//12 ... 24

The symbol in the caption means that the effects of three factors (the exponent), each at three levels (the base), are to be explored by o.a.t. rules in from 12 to 24 runs depending on the degree of aliasing, and hence biasing, that is discovered as the trials proceed.

Assuming all 3fi and error to be negligible, the factorial representation for the 3^3 may be written as a regression equation:

Y = M + AL x1 + BL x2 + CL x3
  + AQ x1^2 + BQ x2^2 + CQ x3^2
  + (ALBL) x1 x2 + (ALCL) x1 x3 + (BLCL) x2 x3
  + (ALBQ) x1 x2^2 + ... + (BQCL) x2^2 x3 (six terms)
  + (AQBQ) x1^2 x2^2 + (AQCQ) x1^2 x3^2 + (BQCQ) x2^2 x3^2.

In the interests of typographical economy, carets are omitted over the regression coefficients on the right. It is anticipated that a small minority of the terms in this equation will dominate. All xi are scaled to have levels -1, 0, +1. This scaling has the familiar meaning for factors with quantitative equally spaced levels. It can also be used for "qualitative" factors when two new varieties (or unordered treatments) are being compared with a standard. In this case the "linear" term estimates the difference between the two new varieties and the "quadratic" term measures the difference between the standard and the average response of the two just contrasted.

The equation is printed on five lines to separate sets of terms of decreasing expected importance. The plans which follow produce parameter estimates in the same order.

We give two strict o.a.t. paths through the 3^3//20. The first (Figure C1) gives early estimates of the A, B, and C effects, biased as shown; the second (Figure C3) clears the A-estimates in 11 runs, then those of B and C.

Figure C1 shows in its first 12 runs an edge-path analogous to the 2^3//6 of Figure A2. We see in Figure C2 that the main effect estimates are cleared of linear-by-linear (L X L) 2fi by the opposite-edge augmentations, but that the L X Q and Q X Q components are still attached. To get full separation we must of course take observations at the axial or internal points. The figure starts by going to the A-axis points but any of the three factors (B by strict o.a.t., C not) could be chosen. Presumably the experimenter would prefer to clear first the factor showing the largest effects, or largest interactions.

After run 15 has been done, we have nine points in the sloping plane from runs 1, 2, 3, through 13, 15, 14, to 9, 8, 7, and these suffice to estimate AL and AQ free of all 2fi. Table 2 gives the estimation matrix for the A effects and for the separated pairs of 2fi with A. Two more runs (16 and 17) permit corresponding separation for B and three more complete the design for C and its interaction-pairs.

Just as for the 2^n series, we can derive the alias structure of each estimate by use of the extension of Yates' algorithm to the 3^n series which is shown in Davies [2, p. 366].

The aliasing patterns are complex at first sight, but the computations of effects and 2fi are all simple and should be done by hand. After run 3:

"AL" = (y3 - y1)/2; "AQ" = (y1 - 2y2 + y3)/6.

The inverted commas are merely intended to remind the reader that these are early and hence maximally-biased estimates.

After run 9 we have:

"AL" = (y3 - y1 + y7 - y9)/4,

"AQ" = (y1 - 2y2 + y3 + y7 - 2y8 + y9)/12,

ALBL + ALCL ≐ (y1 - y3 + y7 - y9)/4,

AQBL + AQCL ≐ (-y1 + 2y2 - y3 + y7 - 2y8 + y9)/12,

all as would be intuitively guessed. Each divisor is the sum of the squares of the coefficients of the observations in that estimator. The symbol ≐ is used to abbreviate: "The sum of the two parameters on the left is estimated by the statistic on the right."

C2. ESTIMATES AND ALIASES AFTER EACH RUN OF THE 3^3//20 ... 24

Run   Spec.    Estimable
1     (1)
2     a1       AL - AQ ± eight 2fi with A
3     a2       AL ± four 2fi ALX; AQ ± four 2fi AQX
4     a2b1     BL - BQ ± eight 2fi with B
5     a2b2     BL ± four 2fi BLX; BQ ± four 2fi BQX
6     a2b2c1   CL - CQ ± eight 2fi with C
7     a2b2c2   CL ± four 2fi CLX; CQ ± four 2fi CQX
8     a1b2c2   AL + AQ + four 2fi AXQ ± four 2fi AXL
9     b2c2     AL + ALBQ + ALCQ; AQ + AQBQ + AQCQ; ALBL + ALCL; AQBL + AQCL
10    b1c2
11    c2       BL + AQBL + BLCQ; BQ + AQBQ + BQCQ; CL + AQCL + BQCL; ALBL - BLCL; ALCL + BLCL
12    c1       CQ + AQCQ + BQCQ; ALCQ + BLCQ
13    b1c1
14    a2b1c1   AL; ALBQ + ALCQ
15    a1b1c1   AQ; AQBQ + AQCQ
16    a1b2c1
17    a1c1     BL; BQ; ALBQ - BQCL; AQBQ + BQCQ
18    a1b1c1
19    a1b1c2
20    a1b1     CL; CQ; ALCQ + BLCQ; AQCQ + BQCQ
(21)  a2c2     ALBL; ALCL; BLCL
(22)  a2b1c2   ALBQ; AQBL; AQBQ
(23)  a1c2     ALCQ; AQCL; AQCQ
(24)  a2c1     BLCQ; BQCL; BQCQ

2. ESTIMATION MATRIX FOR A AND AX(a) IN THE 3^3//20

Run(b)              Levels
C1   C3   Spec.     A  B  C   6AL   18AQ   4(ALXL)(c)   12(AQXL)   36(AQXQ)
1    1    (1)       0  0  0   -1     1       1            -1          1
2    2    a1        1  0  0    0    -2       0             2         -2
3    3    a2        2  0  0    1     1      -1            -1          1
13   7    b1c1      0  1  1   -1     1       0             0         -2
15   6    a1b1c1    1  1  1    0    -2       0             0          4
14   5    a2b1c1    2  1  1    1     1       0             0         -2
9    9    b2c2      0  2  2   -1     1      -1             1          1
8    10   a1b2c2    1  2  2    0    -2       0            -2         -2
7    11   a2b2c2    2  2  2    1     1       1             1          1

(a) AX means "interactions with A."
(b) Run numbers for the two paths of Figures C1 and C3.
(c) ALXL means "estimate of ALBL + ALCL."
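The divisor rule (sum of squares of the coefficients) is easily put into code. The sketch below uses the linear and quadratic contrasts (-1, 0, +1) and (+1, -2, +1) for three levels. The numerical responses are made-up, error-free values chosen only to show that the estimators return the parameters that generated them; none of the numbers come from the article.

```python
LINEAR = (-1, 0, 1)      # contrast coefficients at levels 0, 1, 2 of a factor
QUADRATIC = (1, -2, 1)

def estimate(coeffs, ys):
    """Contrast of the observations divided by the sum of squared coefficients."""
    return sum(c * y for c, y in zip(coeffs, ys)) / sum(c * c for c in coeffs)

# Hypothetical error-free responses on the first edge (runs 1, 2, 3),
# generated from M = 10, AL = 3, AQ = 0.5 (invented values).
M, AL, AQ = 10.0, 3.0, 0.5
y1, y2, y3 = (M + AL * l + AQ * q for l, q in zip(LINEAR, QUADRATIC))

print(estimate(LINEAR, (y1, y2, y3)))     # 3.0, i.e. (y3 - y1)/2
print(estimate(QUADRATIC, (y1, y2, y3)))  # 0.5, i.e. (y1 - 2*y2 + y3)/6

# Opposite edge (runs 9, 8, 7 at levels 0, 1, 2 of A), here simply shifted by 2,
# so that A behaves identically on both edges (no interaction with A).
y9, y8, y7 = y1 + 2, y2 + 2, y3 + 2
both = (y1, y2, y3, y9, y8, y7)
print(estimate(LINEAR + LINEAR, both))            # 3.0, divisor 4
print(estimate((1, 0, -1, -1, 0, 1), both))       # 0.0, the ALBL + ALCL contrast
```

With the two edges behaving identically, the interaction contrast is zero and the pooled linear estimate equals the single-edge one.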

Just as for the 2^3//6, we must now add runs at the vacant corner (ac in Figure A1, a2c2 in Figure C1) and along one or more edges adjacent to that corner, to separate one or more 2fi pairs. It is clear from the figure that the addition of two runs will complete one face (a 3^2) and so permit estimation of the separate elements of one pair of 2fi. The four runs a2c2, a1c2, a2c1, and a2b1c2 will allow estimation of all 2fi.

Another path through the 3^3//20 responds to the experimenter who requires early information of the main effects of A. It is shown in Figure C3. The full Resolution IV plan can now be done in 23 runs, since we are not compelled to repeat any point by this route.

C3. STRICT ONE-AT-A-TIME 3^3//20 ... 24. TIME-TRENDS NEGLIGIBLE. A CLEARED FIRST, THEN B.

[Diagram of the numbered path through the 3 x 3 x 3 grid not reproduced.]

D1. STRICT ONE-AT-A-TIME. 7^2//21

[Grid of the seven levels of A (columns) against the seven levels of B (rows), with run numbers 1 to 21 entered in the occupied cells, three in each row and column; not reproduced.]

2.5 A 7^2//21

We take the case of two seven-level factors and assume that the experimenter prefers to sample the 49 cells, getting early information on row and column effects, together with moderate, patchy information on non-additivity. If the 21 observations can be made changing only one factor's level at a time (three in each row and column) we will have eight degrees of freedom for sampling the 2fi.

The standard balanced incomplete block (BIB) design for seven varieties in seven blocks of three [2, Appendix 6A; 4, p. 528], can be written as a "partially replicated two-way layout," using varieties for rows and blocks for columns. It can then be rearranged as shown in Figure D1 so that the maximum number of rows and columns can be swept through in sets of three runs. Such a plan, followed in the sequence indicated, will give estimates of all row and column parameters by its 17th run, biased of course by whatever 2fi exist in those cells. This plan should be compared with the classical o.a.t. plan which goes through one row and one column in 13 runs, but gives no clue to non-additivity. If large residuals appear when the calculated row and column effects are subtracted from the observed values, the experimenter knows that he has found serious interaction.
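For readers who want to see such a design concretely, the sketch below builds one standard BIB for seven varieties in seven blocks of three by cycling the initial block {0, 1, 3} modulo 7. This is an assumed, textbook construction; it is not claimed to reproduce the particular arrangement or run order of Figure D1.

```python
from collections import Counter
from itertools import combinations

# Assumption: cyclic construction from the initial block {0, 1, 3} mod 7.
blocks = [sorted({i % 7, (i + 1) % 7, (i + 3) % 7}) for i in range(7)]

# Balance checks: every pair of varieties together once, every variety 3 times.
pair_counts = Counter(p for b in blocks for p in combinations(b, 2))
replications = Counter(v for b in blocks for v in b)

# Degrees of freedom left for non-additivity in the 21 occupied cells:
# 21 observations less 1 (mean), 6 (rows) and 6 (columns).
n_obs = sum(len(b) for b in blocks)
interaction_df = n_obs - 1 - 6 - 6

print(blocks)
print(set(pair_counts.values()), set(replications.values()), interaction_df)
# {1} {3} 8
```

Written with varieties as rows and blocks as columns, the occupied cells form a layout with three observations in each row and each column, leaving the eight degrees of freedom mentioned in the text.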

2.6 Other Partially Replicated Two-Way Layouts

The cells unoccupied in Figure D1 also form a BIB plan. They are also re-orderable so that no diagonal steps are taken, and hence are usable as a strict o.a.t. plan. This can be used as a start if a more thorough look at 2fi is desired (with 15 degrees of freedom instead of 8), or to complete the square if that is judged advisable.

Many other BIB and even PBIB (partially balanced) plans can be rearranged in this way, but some cannot. For example, the Regular Group Divisible plan for six varieties in six blocks of three [1, design R1] does not appear to be orderable in this sense.

The o.a.t. experimenter who insists on sweeping through all levels of one factor at fixed level of the other, will be more interested in Figure D2. Here the same BIB plan is arranged and numbered for this purpose. Such a plan is obviously not a strict o.a.t. plan. It could be rendered more robust to linear and quadratic time-trends by repeating its first row, or even one observation from its first row, near the middle of the sequence and again at the end.

D2. ROW SWEEP. 7^2//21

Row  Runs
1    1, 2, 3
2    4, 5, 6
3    7, 8, 9
4    10, 11, 12
5    13, 14, 15
6    16, 17, 18
7    19, 20, 21

[The columns (levels of A, 1 to 7) occupied in each row are not reproduced.]

3. STANDARD ONE-AT-A-TIME PLANS

Starting simply, we contemplate the four trials specified by (1), a, b, c as diagrammed in Figure E. This plan has all the defects of 2fi bias of its strict o.a.t. counterpart, plus another which the diagram makes obvious. All the runs are done in one corner of the cube. It is then going to be harder to augment this plan to remove 2fi biases.

As R. De Baun showed me long ago, it is possible to get a rather blurred look at non-additivity by appending to the first four the single run abc, numbered 5 in Figure E. If the response found there matches closely the value we would expect from the first four runs, we might dare to guess that we have near-additivity. The expected value of the difference between the "abc effect," i.e., [abc - (1)], and the sum of the three simple effects of A, B, and C, is 4(AB + AC + BC) and so may give some idea of the seriousness of the failure of the factors to operate additively.
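This five-run check is a one-line computation. The sketch below (helper names invented for the illustration) gives both the numerical gap for a set of responses and its expected value in terms of the model parameters; the 3fi is kept in the expansion so that the assumption ABC = 0 is visible.

```python
from itertools import combinations

EFFECTS = ["".join(c) for r in (1, 2, 3) for c in combinations("ABC", r)]

def sign(run, effect):
    """+1/-1 coefficient of `effect` in the expected value of `run`."""
    s = 1
    for f in effect:
        s *= 1 if f.lower() in run else -1
    return s

def expectation(contrast):
    """Non-zero expected-value terms of a contrast {run: coefficient}."""
    out = {}
    for e in EFFECTS:
        total = sum(c * sign(run, e) for run, c in contrast.items())
        if total:
            out[e] = total
    return out

def additivity_gap(y1, ya, yb, yc, yabc):
    """Observed abc minus the value predicted additively from (1), a, b, c."""
    predicted = ya + yb + yc - 2 * y1
    return yabc - predicted

# [abc - (1)] minus the three simple effects equals abc - a - b - c + 2(1).
GAP = {"abc": 1, "a": -1, "b": -1, "c": -1, "(1)": 2}
print(expectation(GAP))  # {'AB': 4, 'AC': 4, 'BC': 4, 'ABC': -4}
```

With ABC taken as negligible, the gap has expected value 4(AB + AC + BC), as stated.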

E. STANDARD ONE-AT-A-TIME; 2^3//TWO SETS OF FOUR

Run  Spec.  Estimable
1    (1)
2    a      A - AB - AC
3    b      B - AB - BC
4    c      C - AC - BC
5    abc    AB + AC + BC
6    bc     A; AB + AC
7    ac     B; AB + BC
8    ab     C; AB; AC; BC. (Main effects with Eff. 1/2; 2fi with Eff. 1)

[Cube diagram showing the numbered runs in factor space with axes A, B, C not reproduced.]

Another run must be added to get a single interaction isolated. If two can be added to the six already done, then all three 2fi can be estimated. The completion of the cube is then well-nigh irresistible, since it permits double precision on all main effects and 2fi as well as a first look at the 3fi, ABC.

Suppose now that a standard set of four trials has been done, but that some drift or blocking effect must be allowed for if new data are needed. The full complementary set of four, a one-down-at-a-time set from abc must now be carried out. Some information does appear after each run. Thus if the upper horizontal pair (runs 5 and 6, or abc and bc) are completed, we have a new estimate of the effect A, which may be averaged with the earlier one to get a less-biased estimate of the effect of A over the whole cube. The two A effects may also be differenced to get a measure of the sum of the 2fi with A, namely (AB + AC).

When the second set of four is complete, one surprising and one disappointing result emerge. The three 2fi are separately estimated with full efficiency; the blocking bias has not interfered. But the three main effects can still only be estimated within blocks and so only with efficiency factor 1/2.
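Why the blocking bias leaves the 2fi untouched can be seen from inner products. The sketch below (an illustration under the stated block assignment; the names are hypothetical) compares each effect column of the full 2^3 with the indicator contrast of the two sets of four.

```python
RUNS = ["(1)", "a", "b", "c", "abc", "bc", "ac", "ab"]
BLOCK = [-1, -1, -1, -1, 1, 1, 1, 1]  # first set of four vs. second set of four

def sign(run, effect):
    """+1/-1 coefficient of `effect` in the expected value of `run`."""
    s = 1
    for f in effect:
        s *= 1 if f.lower() in run else -1
    return s

def block_overlap(effect):
    """Inner product of the effect column with the block contrast."""
    return sum(sign(run, effect) * b for run, b in zip(RUNS, BLOCK))

for effect in ("A", "B", "C", "AB", "AC", "BC"):
    print(effect, block_overlap(effect))
# A 4, B 4, C 4, AB 0, AC 0, BC 0
```

The 2fi columns are orthogonal to the block contrast, so a shift between the two sets of four cannot bias them; the main-effect columns are not, which is why main effects must be taken from within-block differences.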

So much for this fractured 2^3. All of this generalizes easily to larger numbers of factors, provided that the word easily is interpreted loosely. Margolin [5] has given a useful table of weighing designs that can be used for this purpose.

4. ONE-PAIR-AT-A-TIME PLANS

Much routine testing is controlled by repeated comparison with a standard. Sometimes the comparisons are made in sequence in one test set-up; sometimes two parallel set-ups measure standard and unknown simultaneously. Blocks of two will be more useful when parallel trials are technically reasonable.

It may be a mistake to rush greedily forward and use those blocks of two that give early main-effect estimates, cleared of 2fi. Thus, if we consider the two blocks abcd - (1), and a - bcd, we can indeed find the A effect unaliased with any 2fi, but we have aliased all 2fi with block means and so can only estimate them biased by the block-to-block variation, which is expected to be large.

If instead we start as in Figure F, with the philistine simple comparisons of Blocks I to IV, we are in the framework of the earlier plans. We get each main effect with all its 2fi after the first round, but we can now separate main effects from 2fi strings by doing the complementary sets V to VIII.

There is no logical necessity to carry out these eight blocks in the Roman order. Thus if the very first block showed a very large effect, the experimenter might want to go after that immediately. The first move would surely be Block V, to separate A from its 2fi. If the latter string appeared large, the two blocks IX and X (now numbered 3 and 4) would suffice to give separate estimates of its three components.

F. ONE-PAIR-AT-A-TIME PLAN. $2^4$ IN 12 BLOCKS OF TWO

Order(a)  Block  Spec.        Estimable            Computation

1         I      a - (1)      A - AB - AC - AD
5         II     b - (1)      B - AB - BC - BD
8         III    c - (1)      C - AC - BC - CD
9         IV     d - (1)      D - AD - BD - CD
2         V      abcd - bcd   A; AB + AC + AD      V + I; V - I
6         VI     abcd - acd   B; AB + BC + BD      VI + II; VI - II
10        VII    abcd - abd   C; AC + BC + CD      VII + III; VII - III
11        VIII   abcd - abc   D; AD + BD + CD      VIII + IV; VIII - IV
3         IX     ab - b       AB; AC + AD          IX - I; V - IX
4         X      abc - bc     AC; AD               X - IX; V - X
7         XI     bc - c       BC; BD               XI - II; I + VI - IX - XI
12        XII    cd - d       CD                   XII - III

(a) Suggested order if all 2fi with A, then with B, etc., are desired.

The experimenter may now want to return to the factor B. If II and VI (numbered 5 and 6) are done, he can estimate B and the string of 2fi containing B. Since he has already found AB, only BC and BD require separation. One more block, XI (number 7), will give BC with good precision, but BD with less, since four block differences, as indicated, must be used in its estimation.

We have thus far done seven blocks and we have seven effect estimates clear, namely, A, B, and all their 2fi. We can get the usual preliminary estimates of C and D (each main effect with all its 2fi) from Blocks III and IV (numbered 8 and 9). We then separate C and D from their respective 2fi by Blocks VII and VIII (numbered 10 and 11). A final block, XII, is necessary to get an unencumbered good estimate of CD.

In summary, we have shown one way to get good estimates of four main effects and six 2fi's in twelve blocks of two, learning something from each block, and responding in some way to each outcome.
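The combinations of block differences described in this section can be verified mechanically. The sketch below is illustrative only: it assumes the ±1 coefficient parameterization with 3fi and higher set to zero, and the helper names (`expected`, `diff`, `combine`) are not from the paper. Each result lists the effects that survive in a given combination of within-block differences, with their multipliers.

```python
from itertools import combinations

FACTORS = "abcd"
EFFECTS = [f.upper() for f in FACTORS] + [
    (p + q).upper() for p, q in combinations(FACTORS, 2)
]

# The twelve blocks of two, written as (first run, second run).
BLOCKS = {
    "I": ("a", "(1)"), "II": ("b", "(1)"), "III": ("c", "(1)"), "IV": ("d", "(1)"),
    "V": ("abcd", "bcd"), "VI": ("abcd", "acd"),
    "VII": ("abcd", "abd"), "VIII": ("abcd", "abc"),
    "IX": ("ab", "b"), "X": ("abc", "bc"), "XI": ("bc", "c"), "XII": ("cd", "d"),
}


def expected(run):
    """Coefficient of each main effect and 2fi in E[y] for a run like 'abc'."""
    x = {f: (1 if f in run else -1) for f in FACTORS}
    coef = {}
    for effect in EFFECTS:
        sign = 1
        for letter in effect.lower():
            sign *= x[letter]
        coef[effect] = sign
    return coef


def diff(first, second):
    """Effect coefficients in the within-block difference first - second."""
    e1, e2 = expected(first), expected(second)
    return {k: e1[k] - e2[k] for k in e1}


def combine(weights):
    """Linear combination of block differences; only nonzero terms are kept."""
    total = {}
    for name, w in weights.items():
        for effect, c in diff(*BLOCKS[name]).items():
            total[effect] = total.get(effect, 0) + w * c
    return {e: c for e, c in total.items() if c != 0}


print(combine({"V": 1, "I": 1}))                        # A alone
print(combine({"V": 1, "I": -1}))                       # AB + AC + AD
print(combine({"IX": 1, "I": -1}))                      # AB alone
print(combine({"I": 1, "VI": 1, "IX": -1, "XI": -1}))   # BD from four blocks
print(combine({"XII": 1, "III": -1}))                   # CD alone
```

The block mean cancels inside every difference, so block-to-block variation does not enter any of these combinations.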

5. FREE ONE-AT-A-TIME PLANS

We have had a simple example of a free o.a.t. plan in Figure E at run 5, since we there added the run abc to the standard set (1), a, b, c.

A different sort of example is given here, partly because its analysis produces some surprises. We imagine that some enlightened experimenter has completed the half-replicate of a $2^3$, a $2^{3-1}$ then, with the experimental conditions (1), ab, ac, and bc, and that only one contrast appears large. Call it the one that estimates (A - BC), the usual A-contrast. We "therefore" assume that all other effects, B, C, AB, and AC, are negligible.

Every run in the half-replicate not yet done has an expected value that includes (A + BC). We choose one run, a, and write down its conditional expected value, dropping all the terms just assumed negligible:

$$E'\{a\} = M + A + BC.$$


The prime (') is a reminder that this expectation is conditional on the assumption that four parameters are zero. We can then estimate (A + BC) by simply correcting the observed response to trial a by the estimated mean for the whole $2^3$, which we can find from the $2^{3-1}$ already completed.

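$$\hat{A} + \widehat{BC} = a - \hat{M} \tag{5.1}$$

$$4(\hat{A} + \widehat{BC}) = 4(a - \hat{M}) = 4a - (1) - ab - ac - bc \tag{5.2}$$

$$4(\hat{A} - \widehat{BC}) = -(1) + ab + ac - bc. \tag{5.3}$$

Therefore,

$$4\hat{A} = 2a - (1) - bc \tag{5.4}$$

and

$$4\widehat{BC} = 2a - ab - ac. \tag{5.5}$$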

Equations (5.2) and (5.3) are solvable by inspection for A and BC. It is rather counterintuitive to see that even though four observations are needed to estimate (A - BC), only three are required to estimate A. The estimator of BC (5.5) is also unexpected in that some of us have not seen an interaction estimated from three observations. Even this contrast is redundant, being as it is the sum of two simple comparisons, each of which estimates BC without bias. This is only possible, of course, because we have already made a judgment that B and C are negligible.
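A small numerical check of (5.4) and (5.5) follows. It is only a sketch under the stated assumption that B, C, AB, and AC are exactly zero, and it uses error-free responses; the values chosen for M, A, and BC are arbitrary illustrations, not data from the paper.

```python
def expected(run, effects):
    """Error-free response for a run of the 2^3 under +/-1 coding."""
    x = {f: (1 if f in run else -1) for f in "abc"}
    total = effects.get("M", 0)
    for name, value in effects.items():
        if name == "M":
            continue
        sign = 1
        for letter in name.lower():
            sign *= x[letter]
        total += sign * value
    return total


# Arbitrary illustrative values; B, C, AB, AC are assumed to be zero.
truth = {"M": 10, "A": 3, "BC": 2}

# The completed half-replicate plus the single added run a.
y = {run: expected(run, truth) for run in ["(1)", "ab", "ac", "bc", "a"]}

A_hat = (2 * y["a"] - y["(1)"] - y["bc"]) / 4       # equation (5.4)
BC_hat = (2 * y["a"] - y["ab"] - y["ac"]) / 4       # equation (5.5)

# The two simple comparisons whose sum forms (5.5); each estimates BC.
bc_from_ab = (y["a"] - y["ab"]) / 2
bc_from_ac = (y["a"] - y["ac"]) / 2

print(A_hat, BC_hat, bc_from_ab, bc_from_ac)
```

With these inputs each estimator reproduces the value put in, and the two simple comparisons agree with each other, which illustrates the redundancy noted in the text.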

6. ONE-CURVE-AT-A-TIME PLANS

The common practice of sweeping through the levels of a single multi-level factor, holding all other factors constant, can be justified (or discredited, as the case may be) by augmentation in the manner that must by now be familiar, i.e., by changing all the factors held constant to the other ends of their ranges, and again sweeping through the levels of the easy-to-vary factor. If the two curves produced have nearly the same shape, most experimenters will take it as proven that the "easy" factor does not interact with the others. If the two curves are considerably different, then intermediate curves, at intermediate levels of the other factors, must be taken.

When the experimental situation requires, say, a $5 \times 2^2$, the primitive o.a.t.-er would do $a_0, a_1, a_2, a_3, a_4$. I suggest augmentation by another set of five at high B and C. (To keep in the strict o.a.t. regime we should interpolate the run $a_4 b$ between the two sweeps.) We can now separate the A-effects from their 2fi with B and C. We do this, of course, by comparing the two five-point curves (if A is continuous) or the five pairs of observations (if A is discrete). If the two sets of five do not show similar spacing of their responses then another set of five should be run, either at high B and low C, or at low B and high C. We will then have 16 runs and can separate and estimate the four degree of freedom interaction AB, the 4 d.f. AC, and the single value for BC. We are still assuming that all 3fi are negligible.
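The run and parameter counts can be checked with a short sketch. It covers one of the two options only, namely a third sweep at low B and high C, and it assumes a particular coding (indicator columns for levels 1 to 4 of A, ±1 for B and C) chosen here for illustration; the rank does not depend on that choice. The helper names `rank` and `model_row` are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over Fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [v - factor * w for v, w in zip(m[i], m[r])]
        r += 1
    return r


def model_row(level, b, c, n_levels=5):
    """Mean, A (4 d.f.), B, C, AB (4 d.f.), AC (4 d.f.), BC; no 3fi."""
    dummies = [1 if level == k else 0 for k in range(1, n_levels)]
    return ([1] + dummies + [b, c]
            + [d * b for d in dummies]
            + [d * c for d in dummies]
            + [b * c])


runs = (
    [(i, -1, -1) for i in range(5)]            # a0 ... a4 at low B, low C
    + [(4, 1, -1)]                             # interpolated run a4b
    + [(i, 1, 1) for i in range(4, -1, -1)]    # second sweep at high B, high C
    + [(i, -1, 1) for i in range(5)]           # third sweep at low B, high C
)

design = [model_row(*run) for run in runs]
n_runs = len(runs)
n_params = len(design[0])
design_rank = rank(design)

print(n_runs, n_params, design_rank)
```

In this ordering each run differs from its predecessor in exactly one factor, and the 16 runs are just enough to estimate the 16 parameters of the model without 3fi, leaving no degrees of freedom for error.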

The reader will see that more runs, patience and ingenuity will be required if there are more than two levels of B and of C, or if more than two "hard" factors must be studied. But the general ideas are clear and the examples given should suffice to guide the reader through these more complex situations.

7. CONCLUSIONS

The laboratory researcher's practice of doing one trial at a time, and of making some judgment after each trial, is justified when the effects are expected to be of magnitude 4σ or more. Methods have been given for refinement of the first estimates produced by this practice, isolating two-factor interaction biases, first in strings, then individually.

These augmentations appear to work best for the experimenter who can do "strict" one-at-a-time sets, i.e., each trial varying only one factor from the previous trial. They work well too for the researcher who is set up to carry through simultaneous or closely related pairs of runs. The "standard" one-at-a-time plans (each trial once-removed from some standard condition) have the limitations of their conservatism. They are harder to augment, but the best method of augmentation is clear.

When long trends are likely to be present, and are approximable by linear plus quadratic terms in time, the addition of two runs may give sufficient information to correct adequately for such drifts.

The two basic requirements (small error, quick results) are stringent and, of course, cannot always be met. The plans proposed here cannot conceivably be used in agricultural field trials, in long-term clinical trials, in full-scale plant experiments, or in studies of consumer-product shelf-life. For these and for many other systems, the classical plans of Fisher, Yates, Box, and their associates seem irreplaceable and will probably continue to dominate the field of multifactor experimental design.

[Received March 1972. Revised November 1972.]

REFERENCES

[1] Bose, R.C. et al., Tables of Partially Balanced Designs, Technical Bulletin 107, North Carolina Agricultural Station, Raleigh, N.C., 1954.

[2] Davies, O.L., ed., Design and Analysis of Industrial Experiments, New York: Hafner Publishing Co., 1956.

[3] Fisher, R.A. and Yates, F., Statistical Tables, New York: Hafner Publishing Co., 1949, Table XXIII.

[4] Kempthorne, O., Design and Analysis of Experiments, New York: John Wiley and Sons, Inc., 1952.

[5] Margolin, B.H., "Results on Factorial Designs of Resolution IV for $2^n$ and $2^n 3^m$ Series," Technometrics, 11 (August 1969), 431-44.

[6] Margolin, B.H., "Resolution IV Fractional Factorial Designs," Journal of the Royal Statistical Society, Ser. B, 31, No. 3 (1969), 514-23.

[7] Yates, F., Design and Analysis of Factorial Experiments, Harpenden, England: Imperial Bureau of Soil Science, 1937.
