Post on 20-Jan-2016
Chapter 4
Analysis of Variance (ANOVA)
Section 1
The Basic Idea and Conditions of Application
Objective: to infer and compare several (two or more) population means.
Method: analysis of variance (ANOVA), i.e. the F test for comparing several sample means.
Basic idea: according to the type of design, the total sum of squares of deviations from the mean (SS) and the total degrees of freedom (df) are partitioned into two or more components. Apart from chance error, the variation of each component can be explained by one or more factors.
Conditions of application:
Population: normally distributed, with homogeneity of variance.
Sample: independent and random.
Types of design:
The ANOVA of completely random design;
The ANOVA of randomized block design;
The ANOVA of Latin square design;
The ANOVA of cross-over design.
Table 4-1 The results of g groups
Group      Measured values                     Statistics
Level 1    X11  X12  …  X1j  …  X1n1           n1   X̄1   S1
Level 2    X21  X22  …  X2j  …  X2n2           n2   X̄2   S2
…          …                                   …    …    …
Level g    Xg1  Xg2  …  Xgj  …  Xgng           ng   X̄g   Sg
Total                                          N    X̄    S
The basic idea of ANOVA of completely random design
Partition of variation: sum of squares of deviations from the mean, SS:
$$SS = \sum_{i=1}^{g}\sum_{j=1}^{n_i}(X_{ij}-\bar{X})^2 = \sum_{i,j}^{N}(X_{ij}-\bar{X})^2 = (N-1)S^2$$
1. Total variation: the degree of variation of all the observed values. The formula is as follows:
$$SS_{total} = \sum_{i=1}^{g}\sum_{j=1}^{n_i}(X_{ij}-\bar{X})^2 = \sum_{i=1}^{g}\sum_{j=1}^{n_i}X_{ij}^2 - C$$
Correction factor:
$$C = \frac{\left(\sum_{i=1}^{g}\sum_{j=1}^{n_i}X_{ij}\right)^2}{N}$$
$$\nu_{total} = N-1$$
2. Between-group variation: the sum of squares of deviations of the group means from the grand mean. It reflects the effects of treatment and random error. The formula is as follows:
$$SS_{bg} = \sum_{i=1}^{g}n_i(\bar{X}_i-\bar{X})^2 = \sum_{i=1}^{g}\frac{\left(\sum_{j=1}^{n_i}X_{ij}\right)^2}{n_i} - C$$
$$\nu_{bg} = g-1$$
3. Within-group variation: differences among the values within each group. The formula is as follows:
$$SS_{wg} = \sum_{i=1}^{g}\sum_{j=1}^{n_i}(X_{ij}-\bar{X}_i)^2$$
$$\nu_{wg} = N-g$$
The relation among the three variations:
$$SS_{total} = SS_{bg} + SS_{wg}, \qquad \nu_{total} = \nu_{bg} + \nu_{wg}$$
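Mean square, MS:
$$MS_{bg} = \frac{SS_{bg}}{\nu_{bg}}, \qquad MS_{wg} = \frac{SS_{wg}}{\nu_{wg}}$$
Test statistic:
$$F = \frac{MS_{bg}}{MS_{wg}}, \qquad \nu_1 = \nu_{bg}, \quad \nu_2 = \nu_{wg}$$
• If $\mu_1 = \mu_2 = \dots = \mu_g$, both $MS_{bg}$ and $MS_{wg}$ are estimates of the random error variance $\sigma^2$, and the F value should be close to 1.
• If $\mu_1, \mu_2, \dots, \mu_g$ are not all equal, the F value will be larger than 1.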
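The partition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_way_anova` and the three small groups are made up for the sketch and do not come from the slides; the computation follows the correction-factor formulas for SS total, SS between-group and SS within-group.

```python
def one_way_anova(groups):
    """Partition SS and df for a completely random design.

    groups: list of lists, one list of measurements per treatment group
    (hypothetical input format chosen for this sketch).
    """
    g = len(groups)                                   # number of groups
    N = sum(len(grp) for grp in groups)               # total sample size
    grand_sum = sum(sum(grp) for grp in groups)
    C = grand_sum ** 2 / N                            # correction factor

    ss_total = sum(x * x for grp in groups for x in grp) - C
    ss_bg = sum(sum(grp) ** 2 / len(grp) for grp in groups) - C
    ss_wg = ss_total - ss_bg                          # SS_total = SS_bg + SS_wg

    df_bg, df_wg = g - 1, N - g
    ms_bg, ms_wg = ss_bg / df_bg, ss_wg / df_wg
    return {
        "SS_total": ss_total, "SS_bg": ss_bg, "SS_wg": ss_wg,
        "df_bg": df_bg, "df_wg": df_wg,
        "MS_bg": ms_bg, "MS_wg": ms_wg, "F": ms_bg / ms_wg,
    }


# Made-up data: three groups of three values each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

For these made-up values the group means are 2, 3 and 6, so most of the variation lies between groups and F is well above 1.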
Section 2
The ANOVA of Completely Random Design
Completely random design
All objects are randomly assigned to g groups (levels), and each group receives a different treatment. The treatment effects are inferred by comparing the group means after the experiment.
Example 4-1 A doctor wants to explore the clinical effect of a new medicine for reducing blood fat, and selects 120 patients according to the same standard. All patients are to be divided into 4 groups by the completely random design. How should he divide the groups?
The grouping method of the completely random design
1. Number the subjects: the 120 patients are numbered from 1 to 120 (Table 4-2, row 1).
2. Choose random numbers: start from any row and any column of the random number table in Appendix 15 (for example, from the fifth row and seventh column) and read three digits at a time as one random number, writing each one under the corresponding serial number (Table 4-2, row 2).
3. Rank: rank the random numbers from smallest to largest; equal numbers are ranked in their order of appearance (Table 4-2, row 3).
4. Assign by a rule defined in advance: ranks 1-30 form group A, 31-60 group B, 61-90 group C, and 91-120 group D (Table 4-2, row 4).
Table 4-2 The grouping result of the completely random design
Serial number      1    2    3    4    5    6    7    8    9    10   …   119  120
Random number      260  873  373  204  056  930  160  905  886  958  …   220  634
Rank               24   106  39   15   3    114  13   109  108  117  …   16   75
Grouping result    A    D    B    A    A    D    A    D    D    D    …   A    C
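The four grouping steps can be sketched in Python as follows. This is an assumption-laden illustration: a seeded pseudo-random generator stands in for the printed random number table of Appendix 15, and the names `rank_with_ties` and `assign_groups` are hypothetical helpers introduced only for this sketch.

```python
import random


def rank_with_ties(numbers):
    """Rank from smallest to largest; ties keep their order of appearance."""
    # sorted() is stable, so equal numbers stay in their original order.
    order = sorted(range(len(numbers)), key=lambda i: numbers[i])
    ranks = [0] * len(numbers)
    for rank, subject in enumerate(order, start=1):
        ranks[subject] = rank
    return ranks


def assign_groups(n_subjects=120, labels="ABCD", seed=2016):
    """Completely random assignment by ranking one random number per subject."""
    rng = random.Random(seed)        # assumption: replaces the printed table
    random_numbers = [rng.randint(0, 999) for _ in range(n_subjects)]
    ranks = rank_with_ties(random_numbers)
    size = n_subjects // len(labels)                 # 30 subjects per group
    groups = [labels[(r - 1) // size] for r in ranks]
    return random_numbers, ranks, groups


random_numbers, ranks, groups = assign_groups()
for label in "ABCD":
    print(label, groups.count(label))
```

Because the ranks 1 to 120 are all distinct, every group receives exactly 30 patients regardless of which random numbers are drawn.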
(2) The choice of statistical methods
1. If the data are normally distributed with homogeneous variances, use one-way ANOVA, or the independent-samples t test when g = 2.
2. If the data are not normally distributed or the variances are heterogeneous, use a data transformation or the Wilcoxon rank sum test.
Decomposition of variation
Example 4-2 A doctor wanted to explore the clinical effect of a new medicine for reducing blood fat, and selected 120 patients according to the same standard. He divided the patients into 4 groups by the completely random design. Low density lipoprotein was measured after 6 weeks in a double-blind experiment (Table 4-3). Is there a difference among the population means of low density lipoprotein of the 4 groups?
Table 4-3 The low density lipoprotein values of 4 treatment groups (mmol/L)
Placebo group
  3.53 4.59 4.34 2.66 3.59 3.13 2.64 2.56 3.50 3.25
  3.30 4.04 3.53 3.56 3.85 4.07 3.52 3.93 4.19 2.96
  1.37 3.93 2.33 2.98 4.00 3.55 2.96 4.3 4.16 2.59
  Statistics: n = 30, X̄i = 3.43, ΣX = 102.91, ΣX² = 367.85
New medicine, 2.4g
  2.42 3.36 4.32 2.34 2.68 2.95 1.56 3.11 1.81 1.77
  1.98 2.63 2.86 2.93 2.17 2.72 2.65 2.22 2.90 2.97
  2.36 2.56 2.52 2.27 2.98 3.72 2.80 3.57 4.02 2.31
  Statistics: n = 30, X̄i = 2.72, ΣX = 81.46, ΣX² = 233.00
New medicine, 4.8g
  2.86 2.28 2.39 2.28 2.48 2.28 3.21 2.23 2.32 2.68
  2.66 2.32 2.61 3.64 2.58 3.65 2.66 3.68 2.65 3.02
  3.48 2.42 2.41 2.66 3.29 2.70 3.04 2.81 1.97 1.68
  Statistics: n = 30, X̄i = 2.70, ΣX = 80.94, ΣX² = 225.54
New medicine, 7.2g
  0.89 1.06 1.08 1.27 1.63 1.89 1.19 2.17 2.28 1.72
  1.98 1.74 2.16 3.37 2.97 1.69 0.94 2.11 2.81 2.52
  1.31 2.51 1.88 1.41 3.19 1.92 2.47 1.02 2.10 3.71
  Statistics: n = 30, X̄i = 1.97, ΣX = 58.99, ΣX² = 132.13
3. Steps of analysis
1. State the hypotheses and test criterion
H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$, i.e. all 4 population means are equal.
H1: not all of the population means are equal.
$\alpha = 0.05$
2. Calculate the test statistic
Calculate the sums of squares of deviations SS, the degrees of freedom, the mean squares MS and the F value using the formulas in Table 4-4.
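$$\sum X_{ij} = 102.91 + 81.46 + 80.94 + 58.99 = 324.30$$
$$\sum X_{ij}^2 = 367.85 + 233.00 + 225.54 + 132.13 = 958.52$$
$$C = (324.30)^2/120 = 876.42$$
$$SS_{total} = 958.52 - 876.42 = 82.10, \qquad \nu_{total} = 120 - 1 = 119$$
$$SS_{bg} = \frac{(102.91)^2}{30} + \frac{(81.46)^2}{30} + \frac{(80.94)^2}{30} + \frac{(58.99)^2}{30} - 876.42 = 32.16, \qquad \nu_{bg} = 4 - 1 = 3$$
$$SS_{wg} = 82.10 - 32.16 = 49.94, \qquad \nu_{wg} = 120 - 4 = 116$$
$$MS_{bg} = \frac{32.16}{3} = 10.72, \qquad MS_{wg} = \frac{49.94}{116} = 0.43$$
$$F = \frac{10.72}{0.43} = 24.93$$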
List the ANOVA table:
Table 4-5 The ANOVA table of the completely random design
Source of variation    df     SS      MS      F       P
Total                  119    82.10
Between-group          3      32.16   10.72   24.93   <0.01
Within-group           116    49.94   0.43
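The ANOVA table of Example 4-2 can be reproduced from the summary statistics of Table 4-3 alone. The following Python sketch is a check of the arithmetic, not the author's procedure; the variable names are illustrative. Because it carries unrounded intermediate values, its F comes out near 24.9 rather than exactly 24.93, which is the quotient of the rounded mean squares 10.72 and 0.43.

```python
# Summary statistics per group from Table 4-3: placebo, 2.4g, 4.8g, 7.2g.
n = [30, 30, 30, 30]
sums = [102.91, 81.46, 80.94, 58.99]          # sum of X in each group
sum_sq = [367.85, 233.00, 225.54, 132.13]     # sum of X^2 in each group

N = sum(n)
g = len(n)
C = sum(sums) ** 2 / N                        # correction factor

ss_total = sum(sum_sq) - C
ss_bg = sum(s * s / k for s, k in zip(sums, n)) - C
ss_wg = ss_total - ss_bg

ms_bg = ss_bg / (g - 1)
ms_wg = ss_wg / (N - g)
F = ms_bg / ms_wg

print(f"SS_total={ss_total:.2f}  SS_bg={ss_bg:.2f}  SS_wg={ss_wg:.2f}")
print(f"MS_bg={ms_bg:.2f}  MS_wg={ms_wg:.2f}  F={F:.2f}")
```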
3. Determine the P value and draw a conclusion
At the $\alpha = 0.05$ level, reject H0 and accept H1: not all of the 4 population means are equal, i.e. different doses of the medicine have different effects on LDL-C.
Attention: rejecting H0 and accepting H1 in ANOVA does not mean that every pair of population means differs. To find out which groups differ significantly, multiple comparisons among the population means are needed (Section 6). When g = 2, the ANOVA of the completely random design is equivalent to the independent-samples t test, i.e.
$$t^2 = F, \qquad |t| = \sqrt{F}$$
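The equivalence for g = 2 can be checked numerically. The sketch below uses two small made-up samples (hypothetical values, not taken from the slides) and two illustrative helper functions, `pooled_t` and `anova_f`; it computes the pooled-variance t statistic and the one-way ANOVA F statistic and compares t squared with F.

```python
import math


def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_within = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
    pooled_var = ss_within / (na + nb - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1 / na + 1 / nb))


def anova_f(a, b):
    """One-way ANOVA F statistic for two groups (df = 1 and N - 2)."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    grand_mean = (sum(a) + sum(b)) / (na + nb)
    ss_bg = na * (mean_a - grand_mean) ** 2 + nb * (mean_b - grand_mean) ** 2
    ss_wg = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
    return (ss_bg / 1) / (ss_wg / (na + nb - 2))


# Made-up samples for illustration only.
group_a = [3.1, 2.8, 3.6, 3.3]
group_b = [2.4, 2.9, 2.2, 2.7]

t = pooled_t(group_a, group_b)
F = anova_f(group_a, group_b)
print(t ** 2, F)   # the two values agree up to floating-point rounding
```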
Section 3
The ANOVA of Randomized Block Design
Randomized block design
(1) Grouping method of the randomized block design:
First, match the objects into blocks according to a non-treatment factor that affects the result of the experiment (such as sex, weight, age, occupation, state of illness, or course of disease). Second, randomly assign the objects of each block to the treatment groups or the control group.
(2) Characteristics of the randomized block design
• Random assignment is repeated within every block, so the number of objects is the same in every treatment group.
• The SS of block variation is separated from the within-group SS of the completely random design; the within-group SS (error sum of squares) is reduced, and the power of the test is increased.
Example 4-3 How should 15 white mice in 5 blocks be assigned to three treatment groups?
Grouping method: first, number the mice by weight and match every 3 mice of similar weight as one block (Table 4-6). Second, read two digits at a time as one random number from any row and any column of the random number table, for example from the 8th row and third column (Table 4-6), and rank the random numbers from smallest to largest within each block. Within each block, the mice ranked 1, 2 and 3 receive treatments A, B and C respectively (Table 4-6).
Table 4-6 The assignment result for the white mice of 5 blocks
Block            1           2           3           4           5
White mouse      1  2  3     4  5  6     7  8  9     10 11 12    13 14 15
Random number    68 35 26    00 99 53    93 61 28    52 70 05    48 34 56
Order            3  2  1     1  3  2     3  2  1     2  3  1     2  1  3
Treatment        C  B  A     A  C  B     C  B  A     B  C  A     B  A  C
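The assignment in Table 4-6 can be retraced with a short Python sketch. The random numbers are copied from the table; the variable names are illustrative, and ties within a block (which do not occur here) would simply be ranked in order of appearance.

```python
# Random numbers from Table 4-6, listed mouse by mouse (3 mice per block).
random_numbers = [68, 35, 26, 0, 99, 53, 93, 61, 28, 52, 70, 5, 48, 34, 56]
treatments = "ABC"                    # rank 1 -> A, rank 2 -> B, rank 3 -> C
block_size = len(treatments)

assignment = []
for start in range(0, len(random_numbers), block_size):
    block = random_numbers[start:start + block_size]
    # Indices of the block sorted by random number, smallest first.
    order = sorted(range(block_size), key=lambda k: block[k])
    ranks = [0] * block_size
    for rank, k in enumerate(order):
        ranks[k] = rank               # 0-based rank of mouse k in its block
    assignment.extend(treatments[r] for r in ranks)

print(" ".join(assignment))
```

Each block contributes exactly one mouse to each treatment, which is what makes the design balanced.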
Table 4-7 The results of the randomized block design
                 Treatments (g levels)
Block number     1      2      3      …      g
1                X11    X21    X31    …      Xg1
2                X12    X22    X32    …      Xg2
…                …      …      …      …      …
j                X1j    X2j    X3j    …      Xgj
…                …      …      …      …      …
n                X1n    X2n    X3n    …      Xgn
Partition of variation
(1) Total variation: SStotal.
(2) Treatment-group variation: SStreatment.
(3) Block-group variation: SSblock.
(4) Error variation: SSerror.
$$SS_{total} = SS_{treatment} + SS_{block} + SS_{error}$$
$$\nu_{total} = \nu_{treatment} + \nu_{block} + \nu_{error}$$
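Table 4-8 The ANOVA of the randomized block design
Source of variation   df            SS                                                                   MS                                  F
Total                 N-1           $\sum_{i=1}^{g}\sum_{j=1}^{n}X_{ij}^2 - C$
Treatment             g-1           $\frac{1}{n}\sum_{i=1}^{g}\left(\sum_{j=1}^{n}X_{ij}\right)^2 - C$   $SS_{treatment}/\nu_{treatment}$    $MS_{treatment}/MS_{error}$
Block                 n-1           $\frac{1}{g}\sum_{j=1}^{n}\left(\sum_{i=1}^{g}X_{ij}\right)^2 - C$   $SS_{block}/\nu_{block}$            $MS_{block}/MS_{error}$
Error                 (n-1)(g-1)    $SS_{total} - SS_{treatment} - SS_{block}$                           $SS_{error}/\nu_{error}$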
Steps of analysis
Example 4-4 15 mice were divided into 5 blocks by weight, with 3 mice in every block. The results are shown in Table 4-9. Is there a difference among the 3 treatment groups?
Table 4-9 The observed values of the different treatment groups (g)
Block                  A         B         C         $\sum_{i=1}^{g}X_{ij}$
1                      0.82      0.65      0.51      1.98
2                      0.73      0.54      0.23      1.50
3                      0.43      0.34      0.28      1.05
4                      0.41      0.21      0.31      0.93
5                      0.68      0.43      0.24      1.35
$\sum_{j=1}^{n}X_{ij}$     3.07      2.17      1.57      6.81    ($\sum X_{ij}$)
$\bar{X}_i$                0.614     0.434     0.314     0.454   ($\bar{X}$)
$\sum_{j=1}^{n}X_{ij}^2$   2.0207    1.0587    0.5451    3.6245  ($\sum X_{ij}^2$)
H0: $\mu_1 = \mu_2 = \mu_3$, i.e. the 3 population means are equal.
H1: not all of the population means are equal.
$\alpha = 0.05$
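$$C = \left(\sum_{i=1}^{g}\sum_{j=1}^{n}X_{ij}\right)^2 / N = (6.81)^2/15 = 3.0917$$
$$SS_{total} = \sum_{i=1}^{g}\sum_{j=1}^{n}X_{ij}^2 - C = 3.6245 - 3.0917 = 0.5328$$
$$SS_{treatment} = \frac{1}{n}\sum_{i=1}^{g}\left(\sum_{j=1}^{n}X_{ij}\right)^2 - C = \frac{1}{5}(3.07^2 + 2.17^2 + 1.57^2) - 3.0917 = 0.2280$$
$$SS_{block} = \frac{1}{g}\sum_{j=1}^{n}\left(\sum_{i=1}^{g}X_{ij}\right)^2 - C = \frac{1}{3}(1.98^2 + 1.50^2 + 1.05^2 + 0.93^2 + 1.35^2) - 3.0917 = 0.2284$$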
Table 4-10 the ANOVA of example 4-4
variation df SS MS F P
total 14 0.5328
treatment 2 0.2280 0.1140 11.88 <0.01
block 4 0.2284 0.0571 5.95 <0.05
error 8 0.0764 0.0096
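The entries of Table 4-10 can be recomputed from the raw values of Table 4-9. The Python sketch below is a check under the correction-factor formulas of the randomized block design; the variable names are illustrative. It works with unrounded intermediate values, so its F values (about 11.9 and 6.0) differ slightly from the tabled 11.88 and 5.95, which divide by the rounded error mean square 0.0096.

```python
# Table 4-9: one row per block, columns are treatments A, B, C.
data = [
    [0.82, 0.65, 0.51],
    [0.73, 0.54, 0.23],
    [0.43, 0.34, 0.28],
    [0.41, 0.21, 0.31],
    [0.68, 0.43, 0.24],
]
n = len(data)            # number of blocks
g = len(data[0])         # number of treatments
N = n * g

C = sum(map(sum, data)) ** 2 / N                    # correction factor
ss_total = sum(x * x for row in data for x in row) - C
ss_treatment = sum(sum(row[i] for row in data) ** 2 for i in range(g)) / n - C
ss_block = sum(sum(row) ** 2 for row in data) / g - C
ss_error = ss_total - ss_treatment - ss_block

ms_treatment = ss_treatment / (g - 1)
ms_block = ss_block / (n - 1)
ms_error = ss_error / ((n - 1) * (g - 1))
f_treatment = ms_treatment / ms_error
f_block = ms_block / ms_error

print(f"SS: total={ss_total:.4f} treatment={ss_treatment:.4f} "
      f"block={ss_block:.4f} error={ss_error:.4f}")
print(f"F: treatment={f_treatment:.2f} block={f_block:.2f}")
```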
According to $\nu_1 = 2$ and $\nu_2 = 8$, check the F value table:
$$F_{0.05(2,8)} = 4.46, \qquad F_{0.01(2,8)} = 8.65, \qquad F = 11.88 > F_{0.01(2,8)}, \qquad P < 0.01$$
At the $\alpha = 0.05$ level, reject H0 and accept H1: not all of the population means are equal.
Section 6
Multiple Comparison
Can the above example be analyzed by t tests? The number of t tests needed for 4 groups is $C_4^2 = 6$.
With $\alpha = 0.05$, the probability of not making a type I error in one comparison is 1 - 0.05 = 0.95;
the probability of not making a type I error in all 6 comparisons is $0.95^6 = 0.74$;
the probability of making at least one type I error in the 6 comparisons is 1 - 0.74 = 0.26.
Repeated t tests therefore increase the probability of a type I error.
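The inflation of the type I error can be checked with a few lines of Python. The sketch assumes, as the product of per-comparison probabilities implies, that the six comparisons are independent; the variable names are illustrative only.

```python
from math import comb

g = 4            # number of groups
alpha = 0.05     # significance level of each single comparison

m = comb(g, 2)                       # number of pairwise t tests
p_no_error = (1 - alpha) ** m        # no type I error in any of the m tests
familywise_error = 1 - p_no_error    # at least one type I error

print(m, round(p_no_error, 3), round(familywise_error, 3))
```

With 4 groups the overall type I error rate is roughly five times the nominal 0.05, which is why a dedicated multiple comparison procedure is used instead of repeated t tests.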
Condition of application:
When the ANOVA rejects H0 and accepts H1, not all of the population means are equal. To find out whether any two group means differ, a multiple comparison should be carried out.
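LSD-t test (least significant difference)
The formula:
$$LSD\text{-}t = \frac{\bar{X}_i - \bar{X}_j}{S_{\bar{X}_i - \bar{X}_j}}, \qquad \nu = \nu_{error}$$
$$S_{\bar{X}_i - \bar{X}_j} = \sqrt{MS_{error}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
$$MS_{error} = MS_{wg}$$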
Example 4-7 For the data of Example 4-2, is there a difference between the population mean of the placebo group and those of the 2.4g, 4.8g and 7.2g groups?
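Comparing the 2.4g group with the placebo group:
H0: $\mu_{2.4g} = \mu_0$
H1: $\mu_{2.4g} \neq \mu_0$
$\alpha = 0.05$
According to Example 4-2, $\bar{X}_{2.4g} = 2.72$, $\bar{X}_0 = 3.43$, $n_{2.4g} = n_0 = 30$, $MS_{error} = 0.43$, $\nu_{error} = 116$.
$$S_{\bar{X}_i - \bar{X}_j} = \sqrt{0.43\left(\frac{1}{30} + \frac{1}{30}\right)} = 0.17$$
$$LSD\text{-}t = \frac{2.72 - 3.43}{0.17} = -4.18$$
With $\nu = 116$ and $|t| = 4.18$, the t value table gives P < 0.001. At the $\alpha = 0.05$ level, reject H0 and accept H1: the difference is statistically significant.
4.8g vs placebo group: LSD-t = -4.29
7.2g vs placebo group: LSD-t = -8.59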
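The three LSD-t values can be recomputed with the following Python sketch, using the group means and the error mean square quoted from Example 4-2. The variable names are illustrative. The sketch keeps the unrounded standard error (about 0.1693), so its results are slightly larger in magnitude than the slide values, which divide by the rounded 0.17.

```python
import math

ms_error = 0.43          # within-group (error) mean square from Example 4-2
n = 30                   # sample size of every group
placebo_mean = 3.43
dose_means = {"2.4g": 2.72, "4.8g": 2.70, "7.2g": 1.97}

# Standard error of the difference between two group means.
se = math.sqrt(ms_error * (1 / n + 1 / n))

# LSD-t of each dose group against the placebo group.
lsd_t = {dose: (mean - placebo_mean) / se for dose, mean in dose_means.items()}

print(round(se, 4))
for dose, t in lsd_t.items():
    print(dose, round(t, 2))
```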
Dunnett-t test
The formula:
$$Dunnett\text{-}t = \frac{\bar{X}_i - \bar{X}_0}{S_{\bar{X}_i - \bar{X}_0}}, \qquad \nu = \nu_{error}$$
Example 4-8 Using the data of Example 4-2, compare the population mean of each of the 3 treatment groups with that of the placebo group.
H0: $\mu_i = \mu_0$
H1: $\mu_i \neq \mu_0$
$\alpha = 0.05$
According to Example 4-2, $\bar{X}_{2.4g} = 2.72$, $\bar{X}_{4.8g} = 2.70$, $\bar{X}_{7.2g} = 1.97$, $\bar{X}_0 = 3.43$, $n_i = n_0 = 30$, $MS_{error} = 0.43$, $\nu_{error} = 116$.
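$$Dunnett\text{-}t_{2.4g} = \frac{2.72 - 3.43}{\sqrt{0.43\left(\frac{1}{30} + \frac{1}{30}\right)}} = -4.18$$
$$Dunnett\text{-}t_{4.8g} = \frac{2.70 - 3.43}{\sqrt{0.43\left(\frac{1}{30} + \frac{1}{30}\right)}} = -4.29$$
$$Dunnett\text{-}t_{7.2g} = \frac{1.97 - 3.43}{\sqrt{0.43\left(\frac{1}{30} + \frac{1}{30}\right)}} = -8.59$$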
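With $\nu = 116$ and T = g - 1 = 4 - 1 = 3, check the Dunnett-t value table (two-tailed): $t_{0.01/2,116} \approx t_{0.01/2,120} = 2.98$.
$|t_{2.4g}| > t_{0.01/2,116}$, $|t_{4.8g}| > t_{0.01/2,116}$, $|t_{7.2g}| > t_{0.01/2,116}$, so P < 0.01 for all three comparisons.
At the $\alpha = 0.05$ level, reject H0 and accept H1: each treatment group differs significantly from the placebo group.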
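3. SNK-q test (Student-Newman-Keuls)
$$q = \frac{\bar{X}_i - \bar{X}_j}{S_{\bar{X}_i - \bar{X}_j}}, \qquad \nu = \nu_{error}$$
$$S_{\bar{X}_i - \bar{X}_j} = \sqrt{\frac{MS_{error}}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
$\bar{X}_i$, $n_i$ and $\bar{X}_j$, $n_j$ are the means and sample sizes of the two groups being compared.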
Example 4-9 Using the data of Example 4-4, compare the 3 group means by the SNK-q test.
H0: $\mu_A = \mu_B$
H1: $\mu_A \neq \mu_B$ (and likewise for each pair of groups compared)
$\alpha = 0.05$
Rank the 3 group means from smallest to largest and number them:
Mean      0.314   0.434   0.614
Group     C       B       A
Number    1       2       3
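In Example 4-4, $MS_{error} = 0.0096$, $\nu_{error} = 8$ and the sample size of each group is 5, so
$$S_{\bar{X}_i - \bar{X}_j} = \sqrt{\frac{0.0096}{2}\left(\frac{1}{5} + \frac{1}{5}\right)} = 0.0438$$
Table 4-15 The pairwise comparisons between group means
Groups compared   X̄i - X̄j   Number of groups a   q      q0.05   q0.01   P
(1)               (2)        (3)                  (4)    (5)     (6)     (7)
1, 2              0.12       2                    2.74   3.26    4.75    >0.05
1, 3              0.30       3                    6.85   4.04    5.64    <0.01
2, 3              0.18       2                    4.11   3.26    4.75    <0.05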
Conclusion:
There are significant differences between A and B and between A and C; the difference between B and C is not significant.
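The q values of Example 4-9 can be retraced with a short Python sketch. The group means, the error mean square and the critical values q0.05 (3.26 for a = 2, 4.04 for a = 3) are those quoted in the example; the variable names and the dictionary layout are illustrative assumptions.

```python
import math
from itertools import combinations

means = {"A": 0.614, "B": 0.434, "C": 0.314}   # group means from Example 4-4
ms_error, n = 0.0096, 5                        # error mean square, group size
q_crit_05 = {2: 3.26, 3: 4.04}                 # q0.05 for nu = 8, by span a

# Standard error used by the SNK-q test.
se = math.sqrt(ms_error / 2 * (1 / n + 1 / n))

ranked = sorted(means, key=means.get)          # smallest mean first: C, B, A
results = {}
for lo, hi in combinations(range(len(ranked)), 2):
    low_group, high_group = ranked[lo], ranked[hi]
    a = hi - lo + 1                            # number of means spanned
    q = (means[high_group] - means[low_group]) / se
    results[(low_group, high_group)] = (a, q, q > q_crit_05[a])

for pair, (a, q, significant) in results.items():
    print(pair, a, round(q, 2), "P<0.05" if significant else "P>0.05")
```

The sketch only compares each q with the 0.05 critical value; a full SNK procedure would also stop testing inside a span once that span is found non-significant, which does not change the outcome here.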