Incomplete Block Designs
fisher.utstat.toronto.edu/~hadas/STA305/Lecture notes...Incomplete Block...


Incomplete Block Designs

• Recall: in randomized complete block design, each of a treatments was used once within each of b blocks.

• In some situations, it will not be possible to use each of a treatments in each block.

• Any blocked experiment which has fewer than a units per block is called an incomplete block design.

STA305 week11

Balanced Incomplete Block Design (BIBD)

• If all treatment comparisons are of equal importance and if we wish to estimate each treatment effect with the same precision, then the design needs to be balanced:

(1) Each block contains the same number of units.
(2) Each treatment occurs the same number of times in total.
(3) Each pair of treatments occurs together the same number of times in total.

• A design that satisfies these conditions is called a balanced incomplete block design (BIBD).


Example

• Consider a study in which we wish to compare 4 treatments: A, B, C, and D.

• Only 2 treatments can be used in each block and suppose there are 6 blocks available.

• The following is a balanced incomplete block design (BIBD):

• This design is balanced with respect to each of the 3 points above.
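The slide's design table is not reproduced in this transcript. As a minimal sketch, assuming the design uses one block for each unordered pair of treatments (the only way 6 blocks of size 2 can pair every two of the 4 treatments equally often), the three balance conditions can be checked as follows. The variable names are illustrative only.

```python
from itertools import combinations
from collections import Counter

treatments = ["A", "B", "C", "D"]

# Assumed layout: one block for every unordered pair of treatments.
blocks = list(combinations(treatments, 2))

# (1) Block sizes, (2) replications per treatment, (3) co-occurrences per pair.
block_sizes = {len(block) for block in blocks}
replications = Counter(t for block in blocks for t in block)
pair_counts = Counter(
    frozenset(pair) for block in blocks for pair in combinations(block, 2)
)

print("blocks:", blocks)
print("distinct block sizes:", block_sizes)
print("replications per treatment:", dict(replications))
print("distinct pair counts:", set(pair_counts.values()))
```

Each of the three printed summaries contains a single value, which is what balance with respect to the 3 points requires.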


• Consider the following design, which also uses 6 blocks, each with 2 units:

• Although this design is balanced with regard to the first 2 items, the same 2 treatments always appear together.

• So this would not be considered a balanced incomplete block design.

• Could this study be designed using 5 blocks? 7 blocks?


Number of Experimental Units Required

• We will use the following notation:

a - number of treatments, b - number of blocks, k - number of units per block, r - total number of times each treatment occurs

• The total number of units required in the study is therefore N = bk.

• However, the total number of units required can also be expressed as N = ar.

• So in a balanced design, we must have bk = ar.

• Using combinatorics, we can show that the total number of times that each pair of treatments appears together is λ = r(k-1)/(a-1), where λ must be an integer.

• NOTE: designs that satisfy the last two conditions may not exist for a particular choice of a, b, k, r.
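The counting argument behind the expression for λ can be sketched as follows; this is one standard way to obtain it, not necessarily the derivation used in the lecture. Fix a single treatment. It occurs in r blocks, and in each of those blocks it shares the block with k-1 other treatments, giving r(k-1) pairings in total. In a balanced design those pairings are spread equally over the other a-1 treatments, λ to each:

```latex
\lambda\,(a-1) = r\,(k-1)
\quad\Longrightarrow\quad
\lambda = \frac{r\,(k-1)}{a-1}
```

For the earlier example with a = 4, k = 2 and r = 3, this gives λ = 3(1)/3 = 1, so each pair of treatments appears together exactly once.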


Example

• A study was designed to compare 5 treatments.

• Although the study will use blocks to control for a nuisance factor, there are only 3 experimental units within each block.

• Are there values of b (# of blocks) and r (# of times each treatment is used) that would lead to a BIBD?


Solution
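The worked solution from the slide is not reproduced in this transcript. One way to approach the question, sketched here under the assumption that we simply search over small values of r, is to keep the values for which both necessary conditions hold: bk = ar with b an integer, and λ = r(k-1)/(a-1) an integer. The function name `bibd_candidates` and the search limit `max_r` are hypothetical.

```python
def bibd_candidates(a, k, max_r=20):
    """Return (r, b, lam) triples with b*k == a*r and lam = r(k-1)/(a-1) an integer.

    These are necessary conditions only; a design with the returned
    parameters is not guaranteed to exist.
    """
    found = []
    for r in range(1, max_r + 1):
        if (a * r) % k != 0:            # b = a*r/k must be an integer
            continue
        if (r * (k - 1)) % (a - 1) != 0:  # lam must be an integer
            continue
        found.append((r, a * r // k, r * (k - 1) // (a - 1)))
    return found


candidates = bibd_candidates(a=5, k=3)
print("smallest candidate (r, b, lambda):", candidates[0])
```

For a = 5 and k = 3 the smallest candidate is r = 6, b = 10, λ = 3. In this case a design does exist: using every one of the 10 three-treatment subsets of the 5 treatments as a block has exactly these parameters. The same search with a = 4 and k = 2 returns only values of b that are multiples of 6, so 5 or 7 blocks cannot give a balanced design in the earlier example.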


Statistical Model for BIBD

• The model equation for the BIBD is the same as for the randomized complete block design, that is Yij = μ + τi + βj + εij, where

μ is the overall mean,

τi is the effect of the i-th treatment,

βj is the effect of the j-th block and

εij are i.i.d. N(0, σ²), the residual or random error term.


Sums of Squares

• The total sum of squares is calculated in the usual manner:

$$SS_T = \sum_{i=1}^{a} \sum_{j=1}^{b} \left( Y_{ij} - \bar{Y}_{\cdot\cdot} \right)^2$$

where only the (i, j) combinations actually observed contribute to the sum.

• In decomposing the total sum of squares, we use an adjusted treatment sum of squares to account for the fact that each treatment did not occur within each block.

• So the treatment sum of squares sums over different blocks for each treatment.


Additional Notation

• Let Ti denote the sum over all experimental units which received treatment i.

• Since each treatment does not appear in each block, the treatment total must be adjusted to take this into account.

• Let Bi be the sum over all observations in the blocks in which treatment i occurred.

• Then the adjusted treatment total is Qi = Ti - Bi / k

• If we were interested in testing for the equality of block effects, we would also need to compute an adjusted block total.


• The treatment sum of squares adjusted for blocks is

$$SS_{Tr(adj)} = \frac{k \sum_{i=1}^{a} Q_i^2}{\lambda a}$$

• The sum of squares for the blocking factor (not adjusted for treatments) is

$$SS_{Bl} = k \sum_{j=1}^{b} \left( \bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot} \right)^2$$

• The error sum of squares is obtained by subtraction:

SSE = SST - SSTr (adj) - SSBl
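As an illustration of these formulas, the following minimal sketch computes the adjusted treatment totals Qi and the three sums of squares for a small BIBD with a = 4, b = 6, k = 2. The response values are made up purely for illustration and do not come from the lecture; only the formulas are taken from the slides.

```python
# Hypothetical data: (block, treatment, response). The responses are invented.
data = [
    (1, "A", 10.0), (1, "B", 14.0),
    (2, "A", 9.0),  (2, "C", 12.0),
    (3, "A", 11.0), (3, "D", 8.0),
    (4, "B", 13.0), (4, "C", 12.0),
    (5, "B", 15.0), (5, "D", 9.0),
    (6, "C", 11.0), (6, "D", 7.0),
]
a, b, k, r = 4, 6, 2, 3
lam = r * (k - 1) // (a - 1)  # lambda = r(k-1)/(a-1) = 1

N = len(data)
grand_mean = sum(y for _, _, y in data) / N
ss_total = sum((y - grand_mean) ** 2 for _, _, y in data)

# Unadjusted block sum of squares: k * sum over blocks of (block mean - grand mean)^2.
block_totals = {}
for blk, _, y in data:
    block_totals[blk] = block_totals.get(blk, 0.0) + y
ss_blocks = k * sum((tot / k - grand_mean) ** 2 for tot in block_totals.values())

# Adjusted treatment totals Q_i = T_i - B_i / k.
Q = {}
for trt in sorted({t for _, t, _ in data}):
    T_i = sum(y for _, t, y in data if t == trt)
    B_i = sum(block_totals[blk] for blk, t, _ in data if t == trt)
    Q[trt] = T_i - B_i / k

ss_trt_adj = k * sum(q ** 2 for q in Q.values()) / (lam * a)
ss_error = ss_total - ss_trt_adj - ss_blocks

print("Q:", Q)
print("SSTr(adj):", ss_trt_adj, "SSBl:", ss_blocks, "SSE:", ss_error)
```

A useful sanity check on any such computation is that the adjusted totals Qi sum to zero.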


Degrees of Freedom

• Since the same number of parameters must still be estimated in this design (compared to the randomized complete block design), the degrees of freedom will be the same.

• Since there are a treatments, the degrees of freedom for treatment is a-1.

• Similarly, the degrees of freedom for blocks is b-1.

• The total degrees of freedom is N-1, where N = bk = ar


Testing for Treatment Effects

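• In order to test for treatment effects, we start by calculating the necessary sums of squares and constructing the ANOVA table. It is given below:

Source             SS           df         MS                               F
Treatments (adj)   SSTr (adj)   a-1        MSTr (adj) = SSTr (adj)/(a-1)    MSTr (adj)/MSE
Blocks             SSBl         b-1
Error              SSE          N-a-b+1    MSE = SSE/(N-a-b+1)
Total              SST          N-1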

• Note that the mean square for blocks is not computed; if we wanted to test for block effects we would have to compute an adjusted sum of squares for blocks.


• To test the hypotheses

H0: τ1 = τ2 = ... = τa = 0
Ha: not all τi = 0

we use the test statistic: Fobs = MSTr (adj) / MSE

• The corresponding P-value = P(F(a-1, N-a-b+1) > Fobs)

• Small values of the p-value would cause us to reject the hypothesis that there is no treatment effect.


Example

• A researcher planned and carried out a study to examine the inter-rater reliability of a scale for measuring depression.

• Six examiners at a particular institution were the primary focus of the study.

• However, since the outcome of the assessment was likely to be influenced by the patient, it was decided to block on patients.

• It was not feasible to have all 6 examiners question every subject; each subject was to be examined by 3 examiners.

• In order to achieve balance it is necessary to have ar = bk, which in this case is 6r = 3b.

• It is also necessary for λ = r(k-1)/(a-1) to be an integer; in this case λ = r(2/5).

• The smallest value of r for which λ will be an integer is 5, and this requires that the number of blocks, b, be 10.


• A balanced incomplete block design was used, and the following data were collected:

• Do these data provide any evidence that the examiners differ with respect to the mean depression scores that they assign to patients?


Solution
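The slide's worked solution is not reproduced in this transcript. As a rough sketch, the F ratio can be rebuilt from the sums of squares that are quoted from the SAS output later in these notes (total 1156.66666667, unadjusted blocks 982.00, adjusted treatments 35.44444444, error 139.22222222), together with a = 6, b = 10, k = 3, r = 5 from the design. The variable names are illustrative.

```python
# Sums of squares as quoted from the SAS output later in these notes.
ss_total = 1156.66666667
ss_blocks_unadj = 982.00
ss_trt_adj = 35.44444444
ss_error = 139.22222222

# Design parameters from the example: 6 examiners, 10 patients, 3 examiners per patient.
a, b, k, r = 6, 10, 3, 5
N = b * k  # total number of observations

df_trt = a - 1
df_error = N - a - b + 1

ms_trt_adj = ss_trt_adj / df_trt
ms_error = ss_error / df_error
f_obs = ms_trt_adj / ms_error

print("df (treatment, error):", df_trt, df_error)
print("MSTr(adj):", round(ms_trt_adj, 4), "MSE:", round(ms_error, 4))
print("F_obs:", round(f_obs, 4))
```

The resulting F ratio is below 1, so under this reading the data give no indication that the examiners differ in the mean scores they assign. The P-value itself requires the F(5, 15) distribution and is not computed in this sketch.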


Contrasts

• As in other experiments, the researchers may be interested in comparisons between subgroups of treatment means.

• Contrasts can be used to test hypotheses.

• Tests should be based on the adjusted treatment effects:

$$Y'_{i\cdot} = Q'_i = \frac{Q_i}{r}$$

• To test the hypothesis

H0 : c1μ1 + c2μ2 + · · · + caμa = 0
Ha : c1μ1 + c2μ2 + · · · + caμa ≠ 0

we use Fobs = SSC / MSE, where

$$SS_C = \frac{\left\{ \sum_{i=1}^{a} c_i Q'_i \right\}^2}{\lambda a \left\{ \sum_{i=1}^{a} c_i^2 \right\} / (k r^2)}$$
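As a small numerical sketch of a contrast sum of squares, the function below works directly with the adjusted treatment totals Qi, using the form SSC = k(Σ ci Qi)² / (λ a Σ ci²). The function name `contrast_ss` and the Qi values are hypothetical (the totals are chosen to sum to zero, as adjusted totals must), and the example assumes a design with a = 4, k = 2, λ = 1.

```python
def contrast_ss(Q, c, k, lam):
    """Contrast sum of squares from adjusted treatment totals Q_i."""
    a = len(Q)
    if len(c) != a:
        raise ValueError("need one coefficient per treatment")
    if abs(sum(c)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    estimate = sum(ci * qi for ci, qi in zip(c, Q))
    return k * estimate ** 2 / (lam * a * sum(ci ** 2 for ci in c))


# Hypothetical adjusted totals for 4 treatments.
Q = [-2.0, 5.5, 3.0, -6.5]

# Treatment 1 versus the average of the other three (scaled to integers).
c = [3, -1, -1, -1]

ss_c = contrast_ss(Q, c, k=2, lam=1)
print("SS_C:", ss_c)
```

Multiplying all coefficients by a constant leaves SSC unchanged, which is why the integer coefficients (3, -1, -1, -1) can stand in for (1, -1/3, -1/3, -1/3).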


Least Squares Estimates of the Model Parameters

• In deriving the least squares estimators, we will use nij, an indicator for treatment i occurring in block j, as follows:

$$n_{ij} = \begin{cases} 1 & \text{if treatment } i \text{ appears in block } j \\ 0 & \text{if not} \end{cases}$$

• Start by forming the sum of the squared deviations of the observed values from the fitted values…
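The slide stops at the first step. A sketch of how that step can be written with the indicator nij, together with the standard solutions for a BIBD under the usual side conditions Σ τi = 0 and Σ βj = 0 (an assumption here, since the slide does not state the constraints), is:

```latex
L(\mu, \tau_1,\dots,\tau_a, \beta_1,\dots,\beta_b)
  = \sum_{i=1}^{a} \sum_{j=1}^{b} n_{ij}\,\bigl( Y_{ij} - \mu - \tau_i - \beta_j \bigr)^2

\hat{\mu} = \bar{Y}_{\cdot\cdot},
\qquad
\hat{\tau}_i = \frac{k\,Q_i}{\lambda a},
\quad i = 1, \dots, a
```

Here Qi = Ti - Bi / k is the adjusted treatment total defined earlier, so the treatment estimates depend on the data only through the adjusted totals.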


Analysis of BIBD Using SAS

• In the designs we have examined up to now, the order in which factors were specified in the “model” statement was not important.

• However, for the BIBD case the order is important.

• Refer to the Example on slides 15-16.

• The manner in which data are input for the BIBD is similar to that for the RCB design:

data example ;
input patient examiner score ;
cards ;
1 1 10
1 2 14
1 3 10
2 1 3
2 2 3
2 4 1
etc
;
run ;


• The glm procedure is still the appropriate choice for this design, but it is important that the block factor be entered into the model before the treatment factor:

proc glm data = example ;
class patient examiner ;
model score = patient examiner ;
title1 'block before factor' ;
run ;

• Care is needed in reading the SAS output, since Type I SS is used for the block factor (unadjusted) while Type III SS is used for the treatment factor (adjusted). The output is given below:


• The total SS is read from the SAS output, and in this case it is 1156.66666667.

• The error SS is read from the output and it is 139.22222222.

• The unadjusted block SS is taken from the column containing the Type I SS, and in this case is 982.00.

• The adjusted treatment SS is taken from the column containing the Type III SS, and is 35.44444444.

• The correct value to use for the F-test is the one associated with the Type III SS for treatment.

• For the purposes of comparison, the output generated by SAS when the treatment factor is entered into the model before blocks is given below


• From this output we can get the adjusted SS for the factor (examiner), but we cannot get the unadjusted SS for the blocks (patients).

• Because the design is incomplete (each treatment appears in only some of the blocks), least squares treatment means should be used rather than simple averages.

• To obtain least squares means for the treatment factor, just add one line to the SAS code as follows:

proc glm data = example ;
class patient examiner ;
model score = patient examiner ;
lsmeans examiner ;
title1 'block before factor' ;
run ;

• The output generated by the “lsmeans” statement is given below:


• Compare the LS means to the treatment averages that do not take into account block effects:
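The comparison table from the slide is not reproduced in this transcript. To illustrate the same kind of comparison, the sketch below uses a small made-up BIBD (a = 4, b = 6, k = 2; the responses are invented and are not the examiner data) and computes the simple treatment averages Ti / r alongside least squares means of the form grand mean + k Qi / (λ a), which assumes the usual sum-to-zero constraints on the treatment and block effects.

```python
# Hypothetical data: (block, treatment, response). The responses are invented.
data = [
    (1, "A", 10.0), (1, "B", 14.0),
    (2, "A", 9.0),  (2, "C", 12.0),
    (3, "A", 11.0), (3, "D", 8.0),
    (4, "B", 13.0), (4, "C", 12.0),
    (5, "B", 15.0), (5, "D", 9.0),
    (6, "C", 11.0), (6, "D", 7.0),
]
a, k, r = 4, 2, 3
lam = r * (k - 1) // (a - 1)

grand_mean = sum(y for _, _, y in data) / len(data)

block_totals = {}
for blk, _, y in data:
    block_totals[blk] = block_totals.get(blk, 0.0) + y

raw_means = {}
ls_means = {}
for trt in sorted({t for _, t, _ in data}):
    T_i = sum(y for _, t, y in data if t == trt)
    B_i = sum(block_totals[blk] for blk, t, _ in data if t == trt)
    Q_i = T_i - B_i / k                       # adjusted treatment total
    raw_means[trt] = T_i / r                  # ignores which blocks were used
    ls_means[trt] = grand_mean + k * Q_i / (lam * a)  # adjusted for blocks

for trt in raw_means:
    print(trt, "raw mean:", round(raw_means[trt], 3), "LS mean:", round(ls_means[trt], 3))
```

The two sets of means differ because each simple average also reflects the particular blocks in which that treatment happened to appear.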


Contrasts in SAS

• Contrasts can also be tested using SAS.

• Suppose that examiner 1 was more recently hired than the other examiners and it is of interest to test the hypothesis that the mean for examiner 1 is equal to the mean for the others.

• In other words, test

H0 : μ1 = (μ2 + μ3 + μ4 + μ5 + μ6)/5
Ha : μ1 ≠ (μ2 + μ3 + μ4 + μ5 + μ6)/5

• To do this test in SAS, it is still important that the block factor is entered into the model before the treatment factor.

• Add the following statement to the glm code given above:

contrast '1st vs others' examiner -1 0.2 0.2 0.2 0.2 0.2 ;


• An additional line of output will be produced giving the following information:


Additive Models and Alternatives

• Models used for randomized complete block design, BIBD, Latin square & Graeco-Latin square design are known as additive models.

• Consider the model used for the randomized complete block design: Yij = μ + τi + βj + εij.

• In this model, treatment i always causes the expected response to increase/decrease τi units from the overall mean μ.

• The effect of block j is to cause a change of βj units, regardless of which treatment is applied.

• The effect of treatment i and block j together cause an increase or decrease from the overall mean by τi + βj.

• In other words, the effects of the 2 factors are additive.

• This model is not always appropriate for a randomized complete block design.

• Generally, an additive model may not be appropriate for a particular design.


Multiplicative Effects

• In some cases, the effect of one factor might be amplified by the presence of another factor.

• A multiplicative model might be appropriate in this situation: Yij = μ×τi×βj×εij.

• Taking logarithms results in an additive model: ln{Yij} = ln{μ}+ln{τi}+ln{βj}+ln{εij}.

• This can be written as

$$W_{ij} = \mu' + \tau'_i + \beta'_j + \varepsilon'_{ij}$$
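To see the effect of the transformation numerically, the sketch below builds a noise-free multiplicative table (the error term is left out, and the values of μ, τi and βj are made up for illustration) and measures departure from additivity by the quantity cell - row mean - column mean + grand mean, which is zero in every cell exactly when the table is additive. Here W denotes ln Y.

```python
import math

# Hypothetical multiplicative effects (no error term, to isolate the structure).
mu = 10.0
tau = [0.5, 1.0, 2.0]   # treatment multipliers
beta = [1.0, 3.0]       # block multipliers


def interaction_residuals(table):
    """Return cell - row mean - column mean + grand mean for every cell."""
    rows, cols = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (rows * cols)
    row_means = [sum(row) / cols for row in table]
    col_means = [sum(table[i][j] for i in range(rows)) / rows for j in range(cols)]
    return [
        [table[i][j] - row_means[i] - col_means[j] + grand for j in range(cols)]
        for i in range(rows)
    ]


Y = [[mu * t * bl for bl in beta] for t in tau]   # multiplicative scale
W = [[math.log(y) for y in row] for row in Y]     # log scale

max_raw = max(abs(v) for row in interaction_residuals(Y) for v in row)
max_log = max(abs(v) for row in interaction_residuals(W) for v in row)

print("largest departure from additivity, original scale:", max_raw)
print("largest departure from additivity, log scale:", max_log)
```

On the original scale the departures are clearly nonzero, while on the log scale they vanish up to rounding error, in line with the additive form of the transformed model.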


Interaction

• In some cases, there may be an element of additivity, although the additive model may not be adequate.

• For instance, consider a randomized complete block design where treatment i causes an extreme reaction in a particular block, but the other treatments do not.

• Furthermore, suppose that in the other blocks treatment i does not cause an extreme reaction.

• In this case, there is an interaction between treatments and blocks.

• The additive model would need to be modified to include an interaction term.
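One common way to write such a modified model, given here as a sketch in the notation of the additive model above (the symbol (τβ)ij for the interaction term is an assumption, since the slide does not show it), is:

```latex
Y_{ij} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ij}
```

With only one observation for each treatment and block combination, the interaction term cannot be estimated separately from the error term, so fitting this model in practice requires replication within blocks or a more restricted form of interaction.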
