Multivariate ANOVA (MANOVA)

Lukas Meier, ETH Zürich (most material based on lecture notes and slides from H.R. Roth)

Multivariate Perspective

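Data layout:

Group   Subject ID    a  b  c  d  e
A       1, 2, 3, 4
B       5, 6, 7, 8
C       9, 10, 11, 12

Group is the between-subjects factor; the columns a to e are the levels of the within-subjects factor. Each row (one subject) is one 𝑝-dimensional observation.

Standard MANOVA only tests between-subjects effects.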

Multivariate Perspective

Assume we have 𝑝 repeated measurements per person.

We can interpret these 𝑝 measurements as a single multivariate observation

(which is 𝑝-dimensional) with an arbitrary correlation structure.

This means that we have a response that is 𝑝-dimensional: each profile is one observation.

More precisely, we use the one-way MANOVA model

𝒀𝒊𝒋 = 𝝁 + 𝜶𝒊 + 𝜺𝒊𝒋

where

𝒀𝒊𝒋: vector of the 𝑝 responses of subject 𝑗 in group 𝑖

𝝁: vector of general levels

𝜶𝒊 = (𝛼𝑖1, 𝛼𝑖2, … , 𝛼𝑖𝑝): vector of fixed effects of group 𝑖

𝜺𝒊𝒋: vector of random errors, i.i.d. 𝑁𝑝(0, Σ), with the same covariance matrix Σ in all groups
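To make the model concrete, the following R sketch simulates data from it. All numbers (general levels, group effects, group sizes, covariance matrix) are made up for illustration and are not taken from the slides; only base R is used.

```r
# Sketch: simulate from Y_ij = mu + alpha_i + eps_ij with eps_ij ~ N_p(0, Sigma).
# All parameter values below are hypothetical.
set.seed(1)

p <- 4                      # repeated measurements per subject
g <- 2                      # number of groups
n <- c(16, 11)              # subjects per group (chosen for illustration)

mu    <- c(22, 23, 25, 26)                  # vector of general levels
alpha <- rbind(c( 1,  1,  1.5,  2),         # effects of group 1
               c(-1, -1, -1.5, -2))         # effects of group 2 (rows sum to 0)

# One unstructured covariance matrix, shared by all groups
A     <- matrix(rnorm(p * p), p, p)
Sigma <- crossprod(A) + diag(p)             # symmetric and positive definite

group <- factor(rep(seq_len(g), times = n))
N     <- sum(n)

# Errors: the rows of Z %*% chol(Sigma) are i.i.d. N_p(0, Sigma)
Z   <- matrix(rnorm(N * p), N, p)
eps <- Z %*% chol(Sigma)

# Each row of Y is one p-dimensional observation (one profile)
Y <- matrix(mu, N, p, byrow = TRUE) + alpha[as.integer(group), ] + eps
colnames(Y) <- paste0("t", seq_len(p))

dim(Y)          # N x p data matrix
table(group)    # group sizes
```

The data matrix Y has one row per subject, which is the layout a multivariate analysis works with.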

Digesting the Multivariate Model

For the example of the dental growth curves this means that we have the following two “mean profiles”:

Age             8           10          12          14
Boys (𝑖 = 1)    𝜇1 + 𝛼11    𝜇2 + 𝛼12    𝜇3 + 𝛼13    𝜇4 + 𝛼14
Girls (𝑖 = 2)   𝜇1 + 𝛼21    𝜇2 + 𝛼22    𝜇3 + 𝛼23    𝜇4 + 𝛼24

This means that both profiles start with the same “mean” (𝜇1, 𝜇2, 𝜇3, 𝜇4) and then a sex-specific effect is added at each time point.

An individual profile then contains a person-specific error, i.e. for boy 4 we have the error term 𝜺𝟏𝟒 = (𝜀141, 𝜀142, 𝜀143, 𝜀144) ∼ 𝑁(0, Σ), where Σ is completely unstructured:

Age             8                 10                12                14
Boy 4 (𝑗 = 4)   𝜇1 + 𝛼11 + 𝜀141   𝜇2 + 𝛼12 + 𝜀142   𝜇3 + 𝛼13 + 𝜀143   𝜇4 + 𝛼14 + 𝜀144

[Figure: individual growth profiles, distance versus age (8, 10, 12, 14), one panel for boys and one for girls, one line per person (persons 1 to 27).]

[Figure: mean profile for boys and mean profile for girls, annotated with 𝜇1, … , 𝜇4 and the effects 𝛼11, … , 𝛼14 and 𝛼21, … , 𝛼24.]

[Figure: the same mean profiles together with the profile of the 4th boy; 𝜀142 marks his deviation from the boys' mean profile at age 10.]

MANOVA (Multivariate Analysis of Variance)

As usual, we need a side constraint on the fixed-effect parameter vectors 𝜶𝒊, e.g. 𝜶𝟏 + ⋯ + 𝜶𝒈 = 𝟎.

We assume that the error terms follow a multivariate normal distribution with the same covariance matrix Σ in all groups, i.e.

𝜺𝒊𝒋 ∼ 𝑁𝑝(0, Σ)

We do not assume any specific structure for the covariance matrix, i.e., no compound symmetry etc.

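For 𝑝 = 4:

\[
\Sigma =
\begin{pmatrix}
\sigma_1^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\
\sigma_{12} & \sigma_2^2 & \sigma_{23} & \sigma_{24} \\
\sigma_{13} & \sigma_{23} & \sigma_3^2 & \sigma_{34} \\
\sigma_{14} & \sigma_{24} & \sigma_{34} & \sigma_4^2
\end{pmatrix}
\]

This means that we actually have to estimate a lot of parameters: 𝑝(𝑝 + 1)/2 variances and covariances (10 for 𝑝 = 4), in addition to the mean parameters.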

MANOVA (Multivariate Analysis of Variance)

In ANOVA we partitioned the total sum of squares into between-groups (treatment) and within-groups sums of squares and compared them using the 𝐹-test.

In a MANOVA we do a similar thing based on the corresponding covariance matrices.

No details.

In R: manova
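As a rough sketch of what a call to manova can look like, the following uses simulated wide-format data (one row per subject). The column names y8 to y14, the group sizes and the effect size are hypothetical and not the data from the slides.

```r
# Sketch: manova() on simulated wide-format data (one row per subject).
# Column names and all numbers are made up for illustration.
set.seed(2)
n   <- 12                                    # subjects per group
sex <- factor(rep(c("boy", "girl"), each = n))

shift  <- rnorm(2 * n, sd = 2)               # subject-specific level (induces correlation)
effect <- ifelse(sex == "boy", 1.5, 0)       # made-up group effect

wide <- data.frame(
  sex = sex,
  y8  = 22 + shift + effect + rnorm(2 * n),
  y10 = 23 + shift + effect + rnorm(2 * n),
  y12 = 25 + shift + effect + rnorm(2 * n),
  y14 = 26 + shift + effect + rnorm(2 * n)
)

# The response is the N x p matrix of repeated measurements;
# the right-hand side contains the between-subjects factor only.
fit <- manova(cbind(y8, y10, y12, y14) ~ sex, data = wide)

# test can be "Pillai" (the default), "Wilks", "Hotelling-Lawley" or "Roy"
summary(fit, test = "Wilks")
```

The left-hand side of the formula is a matrix, so each subject contributes a single multivariate observation.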

Let’s have a look at an example to get a better understanding.


Example: Growth Curves

We have two factors: sex (2 levels) and age (4 levels).

We have

𝝁 = (𝜇1, 𝜇2, 𝜇3, 𝜇4) for the overall mean

𝜶𝟏 = (𝛼11, 𝛼12, 𝛼13, 𝛼14) for the effects of the boys

𝜶𝟐 = (𝛼21, 𝛼22, 𝛼23, 𝛼24) for the effects of the girls

In R

[Figure: individual growth profiles, distance versus age (8, 10, 12, 14), one panel for boys and one for girls, one line per person.]

Example: Growth Curves

The standard tests use the null hypothesis

𝐻0: 𝜶𝒊 = 𝟎, 𝑖 = 1, … , 𝑔

which is (very) strong.

This means that we test whether the (multivariate) group means are the same for all groups.

Several test statistics can be used:

Wilks’ lambda

Pillai trace

Hotelling-Lawley trace
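The slides give no details on these statistics. As a sketch using their standard textbook definitions, they can be computed from the within-groups matrix W and the between-groups matrix B of sums of squares and cross-products. The data below are simulated and the variable names are hypothetical.

```r
# Sketch: the three test statistics computed by hand from W and B.
# Simulated data; the shift of group B is made up.
set.seed(3)
n     <- 15
p     <- 3
group <- factor(rep(c("A", "B", "C"), each = n))
Y     <- matrix(rnorm(3 * n * p), ncol = p)
Y[group == "B", ] <- Y[group == "B", ] + 0.8

fit <- manova(Y ~ group)

# Within-groups (residual) and total SSP matrices; B is the difference
W   <- crossprod(residuals(fit))
Tot <- crossprod(scale(Y, center = TRUE, scale = FALSE))
B   <- Tot - W

wilks     <- det(W) / det(W + B)               # small values speak against H0
pillai    <- sum(diag(B %*% solve(B + W)))     # large values speak against H0
hotelling <- sum(diag(B %*% solve(W)))         # large values speak against H0

c(Wilks = wilks, Pillai = pillai, HotellingLawley = hotelling)
```

These values can be compared with the statistic reported by summary(fit, test = ...) for the corresponding test.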


Example: Growth Curves

In R:

We conclude that the two (multivariate) group means do not coincide.

However, we do not have any specific information about where that difference comes from.

Example: Growth Curves

By using a suitable transformation of the data matrix we can also test some more specific hypotheses using the multivariate approach (à la contrasts).

See R-file for more details.
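The R-file itself is not part of this transcript. One common transformation of this kind, shown here only as an assumption about what such a test can look like, replaces the 𝑝 measurements by the 𝑝 − 1 successive differences; a MANOVA on the differences then tests whether the mean profiles are parallel (no group-by-time interaction). Data and names are simulated and hypothetical.

```r
# Sketch: testing parallel profiles via a transformed data matrix.
# This is one common choice, not necessarily the one used in the course R-file.
set.seed(4)
n   <- 12
p   <- 4
sex <- factor(rep(c("boy", "girl"), each = n))

shift <- rnorm(2 * n, sd = 2)                          # subject-specific level
Y <- matrix(c(22, 23, 25, 26), 2 * n, p, byrow = TRUE) +
  shift + matrix(rnorm(2 * n * p), ncol = p)
# Made-up steeper increase for boys
Y[sex == "boy", ] <- Y[sex == "boy", ] +
  matrix(c(0, 0.5, 1, 1.5), n, p, byrow = TRUE)

# Each column of C forms one successive difference of the measurements
C <- rbind(c(-1,  0,  0),
           c( 1, -1,  0),
           c( 0,  1, -1),
           c( 0,  0,  1))
D <- Y %*% C          # columns: t2 - t1, t3 - t2, t4 - t3

# H0: the mean difference vectors are the same in both groups (parallel profiles)
fit_par <- manova(D ~ sex)
summary(fit_par, test = "Wilks")
```

Taking differences removes the overall level of each subject, so the test looks only at the shape of the profiles.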


MANOVA: Checking Model Assumptions

Multivariate normality?

Same covariance matrix in all groups?
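The slides do not say how to check these two assumptions. A simple informal approach, sketched below on simulated data with hypothetical names, is to compare the estimated covariance matrices per group and to look at a chi-squared QQ-plot of the squared Mahalanobis distances of the residuals.

```r
# Sketch: informal checks of the MANOVA assumptions on simulated data.
set.seed(5)
n     <- 20
p     <- 3
group <- factor(rep(c("A", "B"), each = n))
Y     <- matrix(rnorm(2 * n * p), ncol = p)

fit <- manova(Y ~ group)
res <- residuals(fit)

# Same covariance matrix in all groups? Compare the per-group estimates.
covs <- lapply(split(as.data.frame(Y), group), cov)
covs

# Multivariate normality? Under the model, the squared Mahalanobis distances
# of the residuals are approximately chi-squared with p degrees of freedom.
S_pooled <- crossprod(res) / fit$df.residual
d2 <- mahalanobis(res, center = rep(0, p), cov = S_pooled)

qqplot(qchisq(ppoints(length(d2)), df = p), d2,
       xlab = "chi-squared quantiles",
       ylab = "squared Mahalanobis distance")
abline(0, 1)   # points close to this line are compatible with normality
```

With small groups the per-group covariance estimates are very noisy, so only large discrepancies are informative.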


MANOVA: Disadvantages

Standard MANOVA only gives us tests for the between-subjects factors.

Simultaneous inference is often difficult to interpret.

We do not make use of any special structure of the data (time, space, …); this comes at the price that we have to estimate many parameters (→ low power).