Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

23
Chapter 8 1-Way Analysis of Variance - Completely Randomized Design

Transcript of Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Page 1: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Chapter 8

1-Way Analysis of Variance - Completely Randomized Design

Page 2: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Comparing t > 2 Groups - Numeric Responses

• Extension of Methods used to Compare 2 Groups• Independent Samples and Paired Data Designs• Normal and non-normal data distributions

DataDesign

Normal Non-normal

IndependentSamples(CRD)

F-Test1-WayANOVA

Kruskal-Wallis Test

Paired Data(RBD)

F-Test2-WayANOVA

Friedman’sTest

Page 3: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Completely Randomized Design (CRD)

• Controlled Experiments - Subjects assigned at random to one of the t treatments to be compared

• Observational Studies - Subjects are sampled from t existing groups

• Statistical model yij is measurement from the jth subject from group i:

ijiijiijy

where is the overall mean, i is the effect of treatment i , ij is a random error, and i is the population mean for group i

Page 4: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

1-Way ANOVA for Normal Data (CRD)

• For each group obtain the mean, standard deviation, and sample size:

1

)( 2.

.

i

jiij

ii

jij

i n

yy

sn

y

y

• Obtain the overall mean and sample size

N

y

N

ynynynnN i j ijtt

t

..11

..1

......

Page 5: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Analysis of Variance - Sums of Squares

• Total Variation

1)(1 1

2..

NdfyyTSS Total

k

i

n

j iji

• Between Group (Sample) Variation

t

i

n

j

t

i Tiiii tdfyynyySST

1 1 1

2...

2... 1)()(

• Within Group (Sample) Variation

ETTotal

E

t

i ii

t

i

n

j iij

dfdfdfSSESSTTSS

tNdfsnyySSE i

1

2

1 1

2. )1()(

Page 6: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Analysis of Variance Table and F-TestSource ofVariation Sum of Squares

Degrres ofFreedom Mean Square F

Treatments SST t-1 MST=SST/(t-1) F=MST/MSEError SSE N-t MSE=SSE/(N-t)Total TSS N-1

• Assumption: All distributions normal with common variance

•H0: No differences among Group Means (t =0)

• HA: Group means are not all equal (Not all i are 0)

)(:

)9(:..

:..

,1,

obs

tNtobs

obs

FFPvalP

TableFFRRMSE

MSTFST

Page 7: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Expected Mean Squares

• Model: yij = +i + ij with ij ~ N(0,2), i = 0:

1)(

)( true),is ( otherwise

1)(

)( true,is 0:When

)1(11

)(

)(

1

)(

)(

10

21

2

2

1

2

2

1

2

2

2

MSEE

MSTEH

MSEE

MSTEH

t

nt

n

MSEE

MSTE

t

nMSTE

MSEE

a

t

t

iii

t

iii

t

iii

Page 8: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Expected Mean Squares

• 3 Factors effect magnitude of F-statistic (for fixed t)– True group effects (1,…,t)

– Group sample sizes (n1,…,nt)

– Within group variance (2)

• Fobs = MST/MSE

• When H0 is true (1=…=t=0), E(MST)/E(MSE)=1 • Marginal Effects of each factor (all other factors fixed)

– As spread in (1,…,t) E(MST)/E(MSE)

– As (n1,…,nt) E(MST)/E(MSE) (when H0 false)

– As 2 E(MST)/E(MSE) (when H0 false)

Page 9: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

0

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

0 20 40 60 80 100 120 140 160 180 200

0

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

0 20 40 60 80 100 120 140 160 180 200

0

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

0 20 40 60 80 100 120 140 160 180

0

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

0 20 40 60 80 100 120 140 160 180 200

A)=100, 1=-20, 2=0, 3=20, = 20 B)=100, 1=-20, 2=0, 3=20, = 5

C)=100, 1=-5, 2=0, 3=5, = 20 D)=100, 1=-5, 2=0, 3=5, = 5

n A B C D4 9 129 1.5 98 17 257 2 1712 25 385 2.5 2520 41 641 3.5 41

)(

)(

MSEE

MSTE

Page 10: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Example - Seasonal Diet Patterns in Ravens

• “Treatments” - t = 4 seasons of year (3 “replicates” each)– Winter: November, December, January

– Spring: February, March, April

– Summer: May, June, July

– Fall: August, September, October

• Response (Y) - Vegetation (percent of total pellet weight)• Transformation (For approximate normality):

100arcsin'

YY

Source: K.A. Engel and L.S. Young (1989). “Spatial and Temporal Patterns in the Diet of Common Ravens in Southwestern Idaho,” The Condor, 91:372-378

Page 11: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Seasonal Diet Patterns in Ravens - Data/MeansY Winter(i=1) Fall(i=2) Summer(i=3) Fall (i=4)

j=1 94.3 80.7 80.5 67.8j=2 90.3 90.5 74.3 91.8j=3 83.0 91.8 32.4 89.3

Y' Winter(i=1) Fall(i=2) Summer(i=3) Fall (i=4)j=1 1.329721 1.115957 1.113428 0.967390j=2 1.254080 1.257474 1.039152 1.280374j=3 1.145808 1.280374 0.605545 1.237554

135572.112

237554.1...329721.1

16773.13

237554.1280374.1967390.0

919375.03

605545.0039152.1113428.1

217935.13

280374.1257474.1115957.1

24203.13

145808.1254080.1329721.1

..

.4

.3

.2

.1

y

y

y

y

y

Page 12: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Seasonal Diet Patterns in Ravens - Data/MeansPlot of Transformed Data by Season

0.500000

0.600000

0.700000

0.800000

0.900000

1.000000

1.100000

1.200000

1.300000

1.400000

1.500000

0 1 2 3 4 5

Season

Tra

nsf

orm

ed

% V

eg

eta

tio

n

Page 13: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Seasonal Diet Patterns in Ravens - ANOVA

241038.0)161773.1237554.1(...)243203.1329721.1(

8)4-12(df :Variation GroupWithin

197387.0)135572.1161773.1(...)135572.124203.1(3

3)1-4(df :Variation GroupBetween

438425.0)135572.127554.1(...)135572.1329721.1(

11)1-12(df :Variation Total

22

E

22

T

22

Total

SSE

SST

TSS

ANOVASource of Variation SS df MS F P-value F critBetween Groups 0.197387 3 0.065796 2.183752 0.167768 4.06618Within Groups 0.241038 8 0.03013

Total 0.438425 11

Do not conclude that seasons differ with respect to vegetation intake

Page 14: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Seasonal Diet Patterns in Ravens - Spreadsheet

Month Season Y' Season MeanOverall Mean TSS SST SSENOV 1 1.329721 1.243203 1.135572 0.037694 0.011584 0.007485DEC 1 1.254080 1.243203 1.135572 0.014044 0.011584 0.000118JAN 1 1.145808 1.243203 1.135572 0.000105 0.011584 0.009486FEB 2 1.115957 1.217935 1.135572 0.000385 0.006784 0.010400MAR 2 1.257474 1.217935 1.135572 0.014860 0.006784 0.001563APR 2 1.280374 1.217935 1.135572 0.020968 0.006784 0.003899MAY 3 1.113428 0.919375 1.135572 0.000490 0.046741 0.037657JUN 3 1.039152 0.919375 1.135572 0.009297 0.046741 0.014346JUL 3 0.605545 0.919375 1.135572 0.280928 0.046741 0.098489AUG 4 0.967390 1.161773 1.135572 0.028285 0.000687 0.037785SEP 4 1.280374 1.161773 1.135572 0.020968 0.000687 0.014066OCT 4 1.237554 1.161773 1.135572 0.010400 0.000687 0.005743

Sum 0.438425 0.197387 0.241038

Total SS Between Season SS Within Season SS

(Y’-Overall Mean)2 (Group Mean-Overall Mean)2 (Y’-Group Mean)2

Page 15: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

CRD with Non-Normal Data Kruskal-Wallis Test

• Extension of Wilcoxon Rank-Sum Test to k > 2 Groups• Procedure:

– Rank the observations across groups from smallest (1) to largest ( N = n1+...+nk ), adjusting for ties

– Compute the rank sums for each group: T1,...,Tk . Note that T1+...+Tk = N(N+1)/2

Page 16: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Kruskal-Wallis Test

• H0: The k population distributions are identical (1=...=k)

• HA: Not all k distributions are identical (Not all i are equal)

)(:

:..

)1(3)1(

12:..

2

21,

1

2

HPvalP

HRR

Nn

T

NNHST

k

k

ii

i

An adjustment to H is suggested when there are many ties in the data. Formula is given on page 426 of O&L.

Page 17: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Example - Seasonal Diet Patterns in RavensMonth Season Y' RankNOV 1 1.329721 12DEC 1 1.254080 8JAN 1 1.145808 6FEB 2 1.115957 5MAR 2 1.257474 9APR 2 1.280374 10.5MAY 3 1.113428 4JUN 3 1.039152 3JUL 3 0.605545 1AUG 4 0.967390 2SEP 4 1.280374 10.5OCT 4 1.237554 7

• T1 = 12+8+6 = 26

• T2 = 5+9+10.5 = 24.5

• T3 = 4+3+1 = 8

• T4 = 2+10.5+7 = 19.5

1632.)12.5(:value

815.7:)05.0.(.

12.53912.44)112(33

)5.19(

3

)8(

3

)5.24(

3

)26(

)112(12

12:..

sDifference Seasonal : difference seasonal No :

2

214,05.

2222

0

HPP

HRR

HST

HH a

Page 18: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Post-hoc Comparisons of Treatments

• If differences in group means are determined from the F-test, researchers want to compare pairs of groups. Three popular methods include:– Fisher’s LSD - Upon rejecting the null hypothesis of no

differences in group means, LSD method is equivalent to doing pairwise comparisons among all pairs of groups as in Chapter 6.

– Tukey’s Method - Specifically compares all t(t-1)/2 pairs of groups. Utilizes a special table (Table 10, pp. 1110-1111).

– Bonferroni’s Method - Adjusts individual comparison error rates so that all conclusions will be correct at desired confidence/significance level. Any number of comparisons can be made. Very general approach can be applied to any inferential problem

Page 19: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Fisher’s Least Significant Difference Procedure

• Protected Version is to only apply method after significant result in overall F-test

• For each pair of groups, compute the least significant difference (LSD) that the sample means need to differ by to conclude the population means are not equal

ijji

ijjiji

jiij

LSDyy

LSDyy

tNnn

MSEtLSD

..

..

2/

:Interval Confidence sFisher'

if Conclude

dfwith 11

Page 20: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Tukey’s W Procedure

• More conservative than Fisher’s LSD (minimum significant difference and confidence interval width are higher).

• Derived so that the probability that at least one false difference is detected is (experimentwise error rate)

. .

. .

( , ) given in Table 10, pp. 1110-1111 with

Conclude if

Tukey's Confidence Interval:

( , ) 1 1When the sample sizes are unequal, use

2

ij

i j iji j

iji j

iji j

MSEW q t q N - t

n

y y W

y y W

q tW MSE

n n

Page 21: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Bonferroni’s Method (Most General)• Wish to make C comparisons of pairs of groups with simultaneous confidence intervals or 2-sided tests

•When all pair of treatments are to be compared, C = t(t-1)/2

• Want the overall confidence level for all intervals to be “correct” to be 95% or the overall type I error rate for all tests to be 0.05

• For confidence intervals, construct (1-(0.05/C))100% CIs for the difference in each pair of group means (wider than 95% CIs)

• Conduct each test at =0.05/C significance level (rejection region cut-offs more extreme than when =0.05)

• Critical t-values are given in table on class website, we will use notation: t/2,C, where C=#Comparisons, = df

Page 22: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Bonferroni’s Method (Most General)

ijji

ijjiji

jivCij

Byy

Byy

N-tvt

nnMSEtB

..

..

,,2/

:Interval Confidence s'Bonferroni

if Conclude

) with websiteclasson given (

11

Page 23: Chapter 8 1-Way Analysis of Variance - Completely Randomized Design.

Example - Seasonal Diet Patterns in Ravens

Note: No differences were found, these calculations are only for demonstration purposes

4930.03

1

3

1)03013.0(479.3

4540.03

1)03013.0(53.4

3268.03

1

3

1)03013.0(306.2

479.353.4306.2303013.0 8,6,025.8,4,05.8,025.

ij

ij

ij

dfCdfti

B

W

LSD

tqtnMSEEE

Comparison(i vs j) Group i Mean Group j MeanDifference1 vs 2 1.243203 1.217935 0.0252671 vs 3 1.243203 0.919375 0.3238281 vs 4 1.243203 1.161773 0.0814302 vs 3 1.217935 0.919375 0.2985602 vs 4 1.217935 1.161773 0.0561623 vs 4 0.919375 1.161773 -0.242398