Post on 13-Jan-2017
ANOVA: Between Groups
Comparing several means
Thanks to Prof. Andy Field for stats content.
See Chapter 11 of 4th edition text (GLM 1) &
https://www.youtube.com/watch?v=SULO2-gjZoY
http://www.statisticshell.com/html/woodofsuicides.html
LEARNING OUTCOMES
• Understand the basic principles of ANOVA
  – Why it is done
  – What it tells us
  – Explain how ANOVA is used in the context of experimental research
  – Explain how ANOVA is a special case of the linear model
• Compute an F-ratio
  – Understand what SSs, MSs and F represent
• Theory of one-way independent ANOVA
• Following up an ANOVA:
  – Planned Contrasts/Comparisons
    • Choosing Contrasts
    • Coding Contrasts
  – Post Hoc Tests
ANOVA
https://en.wikipedia.org/wiki/Analysis_of_variance
Sir Ronald Fisher: original hipster
http://hipsterlogo.com/
When And Why
• We can use a t-test to compare means
  – compares only 2 means
  – only for one Predictor/Independent Variable
• ANOVA
  – Compares several means
  – Can be used when you have manipulated more than one Independent Variable
  – It is an extension of regression (the General Linear Model)
ANOVA is an extension of
regression (the General Linear
Model)
Why Not Use Lots of t-Tests?
If we want to compare several means, why don't we compare pairs of means with t-tests?
– Can't look at several independent variables.
– Inflates the Type I error (false positive) rate.
With 3 groups there are 3 pairwise comparisons: 1 vs 2, 1 vs 3, 2 vs 3.

Familywise Error = 1 - (0.95)^n

where n is the number of comparisons, and .95 is the probability of no Type I error in a single test at alpha = .05.
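The familywise error formula above can be sketched in Python (a quick illustration, not part of the original slides):

```python
# Familywise error: probability of at least one Type I error
# across n comparisons, each tested at alpha = .05.
def familywise_error(n_comparisons, alpha=0.05):
    return 1 - (1 - alpha) ** n_comparisons

# Three groups -> three pairwise t-tests:
print(round(familywise_error(3), 3))   # 0.143
# Ten comparisons inflate the rate badly:
print(round(familywise_error(10), 3))  # 0.401
```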
Type I and II errors
http://www.economistsdoitwithmodels.com/2010/02/17/before-you-go-locking-up-all-of-those-crazy-people/
What Does ANOVA Tell Us?
Null Hypothesis:
– Like a t-test, ANOVA tests the null hypothesis that the means are the same.
Experimental Hypothesis:
– The means differ.
ANOVA is an omnibus test:
– It tests for an overall difference between groups.
– It tells us that the group means are different.
– It doesn't tell us exactly which means differ.
ANOVA AS REGRESSION
The Data
Viagra.sav is on Moodle.
ANOVA as Regression

outcome_i = (model) + error_i

Libido_i = b0 + b2*High_i + b1*Low_i + error_i

Dummy coding is explained on p. 419 of the Field textbook.

Placebo Group (High = 0, Low = 0):
Libido_i = b0 + (b2 x 0) + (b1 x 0)
Libido_i = b0
b0 = mean of the Placebo group

High Dose Group (High = 1, Low = 0):
Libido_i = b0 + (b2 x 1) + (b1 x 0)
Libido_i = b0 + b2
mean(High) = mean(Placebo) + b2
b2 = mean(High) - mean(Placebo)

Low Dose Group (High = 0, Low = 1):
Libido_i = b0 + (b2 x 0) + (b1 x 1)
Libido_i = b0 + b1
mean(Low) = mean(Placebo) + b1
b1 = mean(Low) - mean(Placebo)
Output from Regression
Use dummy.sav, which contains the Viagra data plus the dummy variables. Run a linear regression with libido as the outcome and dummy1 and dummy2 as predictors. See p. 434 in the textbook.
This is significant: F(2, 12)= 5.12, p = .025.
b0 = mean(Placebo) = 2.2
b1 = mean(Low) - mean(Placebo) = 3.2 - 2.2 = 1.0
b2 = mean(High) - mean(Placebo) = 5.0 - 2.2 = 2.8
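The same regression can be sketched outside SPSS. This assumes the Viagra scores from the textbook example (3,2,1,1,4 / 5,2,4,2,3 / 7,4,5,3,6), which reproduce the slide's group means of 2.2, 3.2 and 5.0:

```python
import numpy as np

# Viagra libido scores (assumed from the Field textbook example).
placebo = [3, 2, 1, 1, 4]   # mean 2.2
low     = [5, 2, 4, 2, 3]   # mean 3.2
high    = [7, 4, 5, 3, 6]   # mean 5.0

y = np.array(placebo + low + high, dtype=float)
dummy_low  = np.array([0]*5 + [1]*5 + [0]*5, dtype=float)
dummy_high = np.array([0]*5 + [0]*5 + [1]*5, dtype=float)

# Design matrix: intercept, Low dummy, High dummy
X = np.column_stack([np.ones_like(y), dummy_low, dummy_high])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

print(round(b0, 2), round(b1, 2), round(b2, 2))  # 2.2 1.0 2.8
```

The intercept recovers the placebo mean, and each dummy coefficient is a group mean minus the placebo mean, exactly as in the SPSS output above.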
Another example of ANOVA as regression
IV with 3 categories: managerial, clerical and custodial. DV is previous experience (months).
ANOVA
– CODED: each category's mean is compared to the grand mean.
Regression
– DUMMY: each category's intercept is compared to the reference group's intercept.
– The intercept = mean value when all other predictors = 0. So the three intercepts are just means.
Job Category has F = 69.192, p < .001. Highly significant.
Different example continued

ANOVA Means             Regression Coefficients
Clerical    85.039      Intercept    77.619
Custodial  298.111      Clerical      7.420
Manager     77.619      Custodial   220.492
The regression coefficient for Clerical is the difference between the mean for Clerical (85.039) and the Intercept, i.e. the mean for Manager: 85.039 - 77.619 = 7.420.
So an ANOVA reports each mean and a p-value that says at least two are significantly different.
A regression reports only one mean (as an intercept), and the differences between that one and all other means, but the p-values evaluate those specific comparisons.
It’s all the same model, the same information, but presented in different ways.
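The equivalence above can be checked arithmetically, using only the means and coefficients from the slide:

```python
# Regression coefficients are just differences between group means
# and the reference (Manager) mean - checked with the slide's numbers.
means = {"Manager": 77.619, "Clerical": 85.039, "Custodial": 298.111}

intercept = means["Manager"]                     # reference group mean
coef_clerical  = means["Clerical"]  - intercept
coef_custodial = means["Custodial"] - intercept

print(round(coef_clerical, 3), round(coef_custodial, 3))  # 7.42 220.492
```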
THEORY OF ANOVA
Experiments vs. Correlation
• ANOVA in Regression:
  – Used to assess whether the regression model is good at predicting an outcome.
• ANOVA in Experiments:
  – Used to see whether experimental manipulations lead to differences in performance on an outcome (DV).
• By manipulating a predictor variable, can we cause (and therefore predict) a change in behaviour?
• Asking the same question, but in experiments we systematically manipulate the predictor; in regression we don't.
Rationale to Experiments
• Variance created by our manipulation
  – Removal of brain (systematic variance)
• Variance created by unknown factors
  – E.g. differences in ability (unsystematic variance)
[Figure: Lecturing Skills example. Frequency histograms of scores for Group 1 and Group 2 drawn from the same population (Mean = 10, SD = 1.22). Repeated samples produce slightly different sample means (e.g. M = 8 up to M = 12) even when no real difference exists.]
Theory of ANOVA
• We calculate how much variability there is between scores
  – Total Sum of Squares (SST).
• We then calculate how much of this variability can be explained by the model we fit to the data
  – How much variability is due to the experimental manipulation: Model Sum of Squares (SSM)...
• ...and how much cannot be explained
  – How much variability is due to individual differences in performance: Residual Sum of Squares (SSR).
Theory of ANOVA cont.
• We compare the amount of variability explained by the Model (experiment) to the error in the model (individual differences)
  – This ratio is called the F-ratio.
• If the model explains a lot more variability than it can't explain, then the experimental manipulation has had a significant effect on the outcome (DV).
Theory of ANOVA
• If the experiment is successful, then the model will explain more variance than it can't
  – SSM will be greater than SSR.
ANOVA BY HAND
ANOVA by Hand
• Testing the effects of Viagra on Libido using three groups:
  – Placebo (Sugar Pill)
  – Low Dose Viagra
  – High Dose Viagra
• The Outcome/Dependent Variable (DV) was an objective measure of Libido.

The Data

ANOVA by Hand:
Step 1: Calculate the Total SS
Step 2: Calculate the Model SS
Step 3: Calculate the Residual SS
Step 4: Calculate the Mean Squared Error
Step 5: Calculate the F-ratio
Step 6: Construct a summary table
Step 1: Calculate SST

SST = sum of (x_i - grand mean)^2

Using the variance: s^2 = SS / (N - 1), so SS = s^2 (N - 1)

SST = s^2_grand (N - 1) = 3.124 x (15 - 1) = 43.74
Degrees of Freedom (df)
• Degrees of Freedom (df) are the number of values that are free to vary.
• In general, the df are one less than the number of values used to calculate the SS.

dfT = N - 1 = 15 - 1 = 14
Step 2: Calculate SSM

SSM = sum of n_i (group mean_i - grand mean)^2
    = 5(2.2 - 3.467)^2 + 5(3.2 - 3.467)^2 + 5(5.0 - 3.467)^2
    = 5(-1.267)^2 + 5(-0.267)^2 + 5(1.533)^2
    = 8.025 + 0.355 + 11.755
    = 20.135
Model Degrees of Freedom
• How many values did we use to calculate SSM? We used the 3 group means.

dfM = k - 1 = 3 - 1 = 2
Step 3: Calculate SSR

SSR = sum of (x_i - group mean)^2

Using the variance again: s^2 = SS / (n - 1), so SS = s^2 (n - 1)

SSR = s^2_group1 (n1 - 1) + s^2_group2 (n2 - 1) + s^2_group3 (n3 - 1)
    = 1.70(5 - 1) + 1.70(5 - 1) + 2.50(5 - 1)
    = 6.8 + 6.8 + 10
    = 23.60
Residual Degrees of Freedom
• How many values did we use to calculate SSR? We used the 5 scores for each group's SS.

dfR = df_group1 + df_group2 + df_group3
    = (5 - 1) + (5 - 1) + (5 - 1)
    = 12
Double Check

SST = SSM + SSR: 20.14 + 23.60 = 43.74
dfT = dfM + dfR: 2 + 12 = 14
Step 4: Calculate the Mean Squared Error

MSM = SSM / dfM = 20.135 / 2 = 10.067
MSR = SSR / dfR = 23.60 / 12 = 1.967
Step 5: Calculate the F-Ratio

F = MSM / MSR = 10.067 / 1.967 = 5.12
Step 6: Construct a Summary Table

Source     SS     df   MS      F
Model      20.14   2   10.067  5.12*
Residual   23.60  12    1.967
Total      43.74  14
There was a significant effect of Viagra on levels of libido, F(2, 12) = 5.12, p = .025, ω = .60.
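The six steps can be reproduced in Python as a cross-check. This is a sketch assuming scipy and the textbook's Viagra scores:

```python
import numpy as np
from scipy import stats

# One-way ANOVA "by hand", following Steps 1-5 above.
groups = [np.array([3, 2, 1, 1, 4], float),   # placebo
          np.array([5, 2, 4, 2, 3], float),   # low dose
          np.array([7, 4, 5, 3, 6], float)]   # high dose

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

ss_t = ((all_scores - grand_mean) ** 2).sum()                      # Step 1
ss_m = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # Step 2
ss_r = sum(((g - g.mean()) ** 2).sum() for g in groups)            # Step 3

df_m, df_r = len(groups) - 1, len(all_scores) - len(groups)
ms_m, ms_r = ss_m / df_m, ss_r / df_r                              # Step 4
f_ratio = ms_m / ms_r                                              # Step 5

print(round(ss_t, 2), round(ss_m, 2), round(ss_r, 2))  # 43.73 20.13 23.6
print(round(f_ratio, 2))                               # 5.12

# Cross-check against scipy's built-in one-way ANOVA
f_check, p = stats.f_oneway(*groups)
print(round(f_check, 2), round(p, 3))  # 5.12 0.025
```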
F ratio
• Signal-to-noise ratio: the variation explained by the model against the variation explained by unsystematic factors.
• How good the model is against how bad it is (how much error there is).
• Similar to the independent t-test: experimental effect versus individual differences.
• If F is less than 1, the effect must be non-significant, because MSR is larger than MSM.
• The Viagra F is above 5, so the experiment had some effect, but we still don't know if this result is due to chance.
• We need to compare the obtained F against the maximum value we would expect if the group means were equal, in an F distribution with the same degrees of freedom.
• Here, with 2 and 12 df, the critical values are 3.89 (p = .05) and 6.93 (p = .01). The observed F of 5.12 falls between the critical values; SPSS gives the exact p = .025 as the significance.
Interpreting F (p. 442 of text)
The F-test assesses the overall fit of the model to the data. A significant F ratio tells us only that there is a difference somewhere: the null hypothesis (X1 = X2 = X3) is unlikely.
It doesn't say which of the following is the case:
X1 ≠ X2 ≠ X3
X1 ≠ X2 = X3
X1 = X2 ≠ X3
Videos
One-way ANOVA by hand: https://www.youtube.com/watch?v=q48uKU_KWas
Andy Field: https://www.youtube.com/watch?v=SULO2-gjZoY (35 mins onwards)
ASSUMPTIONS OF ANOVA (p. 442)

Assumptions (same as t-tests):
1. Each sample is an independent random sample.
2. The distribution of the response variable follows a normal distribution.
   – Check skewness, kurtosis and the histogram.
   – KS and Shapiro-Wilk tests should be non-significant.
3. The population variances are equal across the group levels (Homogeneity of Variance).
   – Levene's test should be non-significant (we don't want the variances to be significantly different).
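These checks can be sketched in Python with scipy (assuming the textbook's Viagra scores; the lecture itself runs these tests in SPSS):

```python
from scipy import stats

placebo = [3, 2, 1, 1, 4]
low     = [5, 2, 4, 2, 3]
high    = [7, 4, 5, 3, 6]

# Normality per group: Shapiro-Wilk should be non-significant (p > .05)
for name, scores in [("placebo", placebo), ("low", low), ("high", high)]:
    w, p = stats.shapiro(scores)
    print(name, "Shapiro-Wilk p =", round(p, 3))

# Homogeneity of variance: Levene's test should be non-significant
stat, p = stats.levene(placebo, low, high)
print("Levene p =", round(p, 3))
```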
Assumptions for t-tests and One-Way ANOVA in SPSS, in depth:
https://www.youtube.com/watch?v=5OCnY153vU8 (by Easy Statistics Now)
When assumptions are violated:
• Chapter 5 of the text (5.4): bias-reduction methods; unequal variances can be corrected by adjusting F itself; non-normality might require a data transform.
• Do scores come from the same distribution? Use the Kruskal-Wallis test (Chapter 6).
• Robust methods for comparing means exist (in the statistics package R).
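As a sketch of the nonparametric fallback, scipy provides Kruskal-Wallis directly (again assuming the textbook's Viagra scores; this result is not one reported in the text):

```python
from scipy import stats

# Kruskal-Wallis: do the scores come from the same distribution?
placebo = [3, 2, 1, 1, 4]
low     = [5, 2, 4, 2, 3]
high    = [7, 4, 5, 3, 6]

h, p = stats.kruskal(placebo, low, high)
print("H =", round(h, 2), " p =", round(p, 3))
```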
SPSS output: one-way ANOVA & Levene's test (p. 456)
• Descriptive stats
• Test of equal variances (Levene's test)
• ANOVA table / Robust Tests of Equality of Means
• Follow-up tests

There was a significant effect of Viagra on levels of libido, F(2, 12) = 5.12, p = .025, ω = .60.

If Levene's test is significant, use the Robust Tests of Equality of Means table; otherwise, use the standard ANOVA table.

Page 460 of text: Analyse > Compare Means > One-Way ANOVA, and fill in all the options inside the buttons.
http://imgs.xkcd.com/comics/self_description.png
Follow up tests
https://www.youtube.com/watch?v=5xWrdheUsyY
Why Use Follow-Up Tests?
• The F-ratio tells us only that the experiment was successful
  – i.e. group means were different.
• It does not tell us specifically which group means differ from which.
• We need additional tests to find out where the group differences lie.
How?
• Multiple t-tests
  – We saw earlier that this is a bad idea.
• Orthogonal Contrasts/Comparisons
  – Hypothesis driven
  – Planned a priori
• Post Hoc Tests
  – Not planned (no hypothesis)
  – Compare all pairs of means
• Trend Analysis
PLANNED CONTRASTS
Planned Contrasts
• Basic Idea:
  – The variability explained by the Model (experimental manipulation, SSM) is due to participants being assigned to different groups.
  – This variability can be broken down further to test specific hypotheses about which groups might differ.
  – We break down the variance according to hypotheses made a priori (before the experiment).
  – It's like cutting up a cake.
The Data

SPSS is using dummy coding.
Rules When Choosing Contrasts
• Independent
  – Contrasts must not interfere with each other (they must test unique hypotheses).
• Only 2 Chunks
  – Each contrast should compare only 2 chunks of variation (why?).
• K - 1
  – You should always end up with one less contrast than the number of groups.
Generating Hypotheses
• Example: testing the effects of Viagra on Libido using three groups:
  – Placebo (Sugar Pill)
  – Low Dose Viagra
  – High Dose Viagra
• The Dependent Variable (DV) was an objective measure of Libido.
• Intuitively, what might we expect to happen?
Example: 3 volunteers to eat chocolate, please.
How Do I Choose Contrasts?
• Big Hint:
  – In most experiments we usually have one or more control groups.
  – The logic of control groups dictates that we expect them to be different from groups that we've manipulated.
  – The first contrast will always compare any control groups (chunk 1) with any experimental conditions (chunk 2).
Hypotheses
• Hypothesis 1:
  – People who take Viagra will have a higher libido than those who don't.
  – Placebo vs (Low, High)
• Hypothesis 2:
  – People taking a high dose of Viagra will have a greater libido than those taking a low dose.
  – Low vs High
Planned Comparisons

Another Example
Coding Planned Contrasts: Rules
• Rule 1: Groups coded with positive weights are compared to groups coded with negative weights.
• Rule 2: The sum of weights for a comparison should be zero.
• Rule 3: If a group is not involved in a comparison, assign it a weight of zero.
• Rule 4: For a given contrast, the weight assigned to the group(s) in one chunk of variation should equal the number of groups in the opposite chunk of variation.
• Rule 5: If a group is singled out in a comparison, that group should not be used in any subsequent contrasts.
Defining Contrasts Using Weights
(SPSS is using a different coding)

Group     Contrast 1 (Dose vs Placebo)   Contrast 2 (Low vs High)
Placebo   -2                              0
Low        1                              1
High       1                             -1
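The rules above can be verified mechanically for these weights (a quick illustration in Python):

```python
# Check the coding rules for the contrast weights in the table above.
contrast1 = {"Placebo": -2, "Low": 1, "High": 1}   # Dose vs Placebo
contrast2 = {"Placebo":  0, "Low": 1, "High": -1}  # Low vs High

# Rule 2: the weights for each comparison sum to zero
print(sum(contrast1.values()), sum(contrast2.values()))  # 0 0

# Independence: orthogonal contrasts have a zero cross-product
cross_product = sum(contrast1[g] * contrast2[g] for g in contrast1)
print(cross_product)  # 0
```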
[Figure: Dummy coding - the model. Plot of Libido for the Placebo, Low and High groups, with b1 and b2 marked as differences from the placebo mean.]

[Figure: The contrast model. Plot of Libido for the Placebo, Low and High groups, with b1 and b2 marked as the contrast comparisons.]
Contrasts in SPSS
• Page 462 of text.
Output (p. 456)
Trend Analysis
Trend Analysis: Output
Trend Analysis: Output II
Reporting Results
• There was a significant effect of Viagra on levels of libido, F(2, 12) = 5.12, p = .025, ω = .60.
• There was a significant linear trend, F(1, 12) = 9.97, p = .008, ω = .62, indicating that as the dose of Viagra increased, libido increased proportionately.

Reporting Results Continued
• Planned contrasts revealed that having any dose of Viagra significantly increased libido compared to having a placebo, t(12) = 2.47, p = .029, r = .58, but having a high dose did not significantly increase libido compared to having a low dose, t(12) = 2.03, p = .065, r = .51.
LEARNING OUTCOMES
• Understand the basic principles of ANOVA
  – Why it is done
  – What it tells us
  – Explain how ANOVA is used in the context of experimental research
  – Explain how ANOVA is a special case of the linear model
• Compute an F-ratio
  – Understand what SSs, MSs and F represent
• Theory of one-way independent ANOVA
• Following up an ANOVA:
  – Planned Contrasts/Comparisons
    • Choosing Contrasts
    • Coding Contrasts
  – Post Hoc Tests
POST HOC TESTS
Post Hoc Tests
• Compare each mean against all others.
• In general terms, they use a stricter criterion to accept an effect as significant.
  – Hence they control the familywise error rate.
• The simplest example is the Bonferroni method:

  Bonferroni-corrected alpha = alpha / Number of Tests
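The Bonferroni method can be sketched by hand (assuming the textbook's Viagra scores; SPSS reports the same comparisons with adjusted p-values instead):

```python
from itertools import combinations
from scipy import stats

# Bonferroni: divide alpha by the number of pairwise tests.
groups = {"placebo": [3, 2, 1, 1, 4],
          "low":     [5, 2, 4, 2, 3],
          "high":    [7, 4, 5, 3, 6]}

pairs = list(combinations(groups, 2))
alpha_crit = 0.05 / len(pairs)   # 3 pairwise tests
print(round(alpha_crit, 4))      # 0.0167

for a, b in pairs:
    t, p = stats.ttest_ind(groups[a], groups[b])
    print(f"{a} vs {b}: p = {p:.3f}, significant: {p < alpha_crit}")
```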
Post Hoc Tests Recommendations
• SPSS has 18 types of post hoc test!
• Field (2013):
  – Assumptions met: REGWQ or Tukey HSD.
  – Safe option: Bonferroni.
  – Unequal sample sizes: Gabriel's (small n), Hochberg's GT2 (large n).
  – Unequal variances: Games-Howell.
Post Hoc Test Output (p. 470)
[SPSS output tables: one where assumptions are met, one for unequal variances (Games-Howell), one one-tailed comparison.]
EFFECT SIZES
PG Stats, Andy Field

Effect Sizes
An effect size is a standardized measure of the size of an effect:
• Standardized = comparable across studies
• Not (as) reliant on the sample size
• Allows people to objectively evaluate the size of an observed effect

Effect Size Measures
There are several effect size measures that can be used:
• Cohen's d
• Pearson's r
• Glass' Δ
• Hedges' g
• Odds Ratio / risk rates
Pearson's r is a good intuitive measure - oh, apart from when group sizes are different...

Effect Size Measures
r = .1, d = .2 (small effect): the effect explains 1% of the total variance.
r = .3, d = .5 (medium effect): the effect accounts for 9% of the total variance.
r = .5, d = .8 (large effect): the effect accounts for 25% of the variance.
Beware of these 'canned' effect sizes though: the size of effect should be placed within the research context.
Calculating the effect size (p. 472 of text)
• SPSS doesn't do this for you.
• Omega squared:

ω² = (SSM - dfM x MSR) / (SST + MSR)
   = (20.13 - 2 x 1.97) / (43.73 + 1.97)
   = .35

ω = .60

Benchmarks: Small .01, Medium .06, Large .14
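The omega-squared calculation above, worked through with the summary-table values:

```python
# Omega squared from the summary table (SPSS doesn't report this,
# so it is computed by hand from SSM, SST, dfM and MSR).
ss_m, ss_t = 20.13, 43.73
df_m, ms_r = 2, 1.97

omega_sq = (ss_m - df_m * ms_r) / (ss_t + ms_r)
omega = omega_sq ** 0.5
print(round(omega_sq, 2), round(omega, 2))  # 0.35 0.6
```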
Calculating the Effect Size - Independent t (p. 80 of text)

d = (MeanA - MeanB) / standard deviation of the control group
Effect sizes for contrasts
r(contrast 1) = sqrt(2.474² / (2.474² + 12)) = .58
r(contrast 2) = sqrt(2.029² / (2.029² + 12)) = .51
These are large effect sizes.
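The contrast effect sizes follow directly from each t statistic and its degrees of freedom:

```python
# Effect size r for a contrast, from its t statistic and df:
# r = sqrt(t^2 / (t^2 + df))
def r_from_t(t, df):
    return (t * t / (t * t + df)) ** 0.5

print(round(r_from_t(2.474, 12), 2))  # 0.58 (contrast 1)
print(round(r_from_t(2.029, 12), 2))  # 0.51 (contrast 2)
```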
One-Way ANOVA Summary (p. 472)
The one-way independent ANOVA compares several means when those means come from different groups of people, e.g. if you have several experimental conditions and have used different participants in each condition.

When you have generated specific hypotheses before the experiment, use planned comparisons; if you don't have specific hypotheses, use post hoc tests.

There are lots of different post hoc tests: when you have equal sample sizes and homogeneity of variance is met, use REGWQ or Tukey's HSD. If sample sizes are slightly different, use Gabriel's procedure; if sample sizes are very different, use Hochberg's GT2. If there is any doubt about homogeneity of variance, use the Games-Howell procedure.

Test for homogeneity of variance using Levene's test. Find the table with this label: if the value in the column labelled Sig. is less than .05, then the assumption is violated. If this is the case, go to the table labelled Robust Tests of Equality of Means. If homogeneity of variance has been met (the Sig. of Levene's test is greater than .05), go to the table labelled ANOVA.

In the table labelled ANOVA (or Robust Tests of Equality of Means), look at the column labelled Sig.: if the value is less than .05, then the means of the groups are significantly different.

For contrasts and post hoc tests, again look at the columns labelled Sig. to discover if your comparisons are significant (they will be if the significance value is less than .05).
Tutorial: http://www.statisticshell.com/html/woodofsuicides.html
ANOVA (one way): handouts, exercises, video lecture, SPSS data files.