Statistics for Linguistics Students Michaelmas 2004 Week 6 Bettina Braun www.phon.ox.ac.uk/~bettina/teaching.html

Page 1:

Statistics for Linguistics Students

Michaelmas 2004, Week 6

Bettina Braun
www.phon.ox.ac.uk/~bettina/teaching.html

Page 2:

Overview

• Recap

• χ²-test for frequency data

• Introduction to Analysis of variance (ANOVA)
– One-factor between-subjects design
– Two-factor between-subjects design

Page 3:

What do we report from the results table?

Differences in the mean: there is a significant difference (t = 2.94, df = 15, p = 0.01)

Page 4:

What are frequency data?

• Frequency count
– Number of subjects/events in a given category (e.g. number of high and low accents)
– What about the number of correct responses of different subjects? (this is not frequency data!!!)

• χ²-test for
– Test of deviation from expected frequencies: test whether the observed frequencies deviate from expected frequencies (e.g. when throwing a die, there is an a priori chance of 16.67% for each number)
– Test of association: finding a relationship between two or more independent variables (e.g. is there a relation between gender and the use of high or low accents?)

Page 5:

χ²-test for deviation from expected frequencies

• Null-hypothesis: there is no difference between expected and observed frequencies

• Example data

            1      2      3      4      5      6      total
observed    12     17     16     18     13     24     100
expected    16.7   16.7   16.7   16.7   16.7   16.7   100

(the observed and expected totals have to be identical)

• Calculation

$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 5.48$
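As a hedged illustration, the calculation for the die counts can be sketched in a few lines of Python. The function name `chi_square` is an assumption made for this sketch, and the expected frequencies are taken as exactly 100/6 rather than the rounded 16.7 shown in the table.

```python
# Observed counts for the six faces of the die (from the example table).
observed = [12, 17, 16, 18, 13, 24]


def chi_square(observed, expected):
    """Sum of (fo - fe)^2 / fe over all categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


n = sum(observed)                                # total number of throws
expected = [n / len(observed)] * len(observed)   # equal a priori chance per face

chi2 = chi_square(observed, expected)
print(f"chi-square = {chi2:.2f}")
```

With these counts the sum of the six terms comes to 5.48.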

Page 6:

Looking up the p-value

The calculated value for χ² must be larger than the one found in the table to be significant

Degrees of freedom:

• If there is one independent variable: df = a – 1

• If there are two independent variables: df = (a – 1)(b – 1)

(a, b: number of levels of each variable)
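The lookup can be sketched as follows. This is only an illustration: the dictionary holds the usual 5% critical values for 1 to 5 degrees of freedom as found in standard χ² tables, and the function names are assumptions made for this sketch.

```python
# Critical chi-square values at alpha = 0.05 for df = 1..5 (standard table values).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def df_one_variable(a):
    """Degrees of freedom for one variable with a levels."""
    return a - 1


def df_two_variables(a, b):
    """Degrees of freedom for two variables with a and b levels."""
    return (a - 1) * (b - 1)


def is_significant(chi2, df):
    """True if the calculated value exceeds the tabled critical value."""
    return chi2 > CRITICAL_05[df]


df_die = df_one_variable(6)       # six faces of a die
df_2x2 = df_two_variables(2, 2)   # a 2 x 2 table
print(df_die, is_significant(5.48, df_die))
```

For the die counts (χ² of about 5.48 with df = 5) the calculated value stays below the critical value of 11.070, so the deviation from the expected frequencies is not significant at the 5% level.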

Page 7:

Further Notes

• The χ² test for the deviation from expected frequencies can be used for one independent variable only

• If the independent variable has only two levels (e.g. high vs. low accent), a correction for continuity has to be used:

$\chi^2 = \sum \frac{(|f_o - f_e| - 0.5)^2}{f_e}$
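The continuity correction subtracts 0.5 from each absolute difference between observed and expected frequency before squaring. A minimal sketch, using hypothetical counts of high and low accents (30 vs. 20, not taken from the slides) and an assumed function name:

```python
def chi_square_corrected(observed, expected):
    """Chi-square with continuity correction: sum of (|fo - fe| - 0.5)^2 / fe."""
    return sum((abs(fo - fe) - 0.5) ** 2 / fe for fo, fe in zip(observed, expected))


observed = [30, 20]   # hypothetical counts: high vs. low accents
expected = [25, 25]   # equal split under the null-hypothesis

uncorrected = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
corrected = chi_square_corrected(observed, expected)
print(f"uncorrected = {uncorrected:.2f}, corrected = {corrected:.2f}")
```

The corrected value is always a little smaller than the uncorrected one, which makes the test slightly more conservative.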

Page 8:

χ² as test of association

• Calculation of expected frequencies:

Cell freq = (Row total × column total) / Grand total

Aspect            Past tense   Present tense   Total
Progressive       308          476             784
Non-progressive   315          297             612
Total             623          773             1396
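The expected frequency of each cell follows directly from the marginal totals. The sketch below applies the formula to the aspect-by-tense counts and then sums (fo – fe)²/fe over the four cells; the variable names are assumptions made for this illustration.

```python
# Observed counts: rows = aspect, columns = tense (past, present).
observed = [
    [308, 476],   # progressive
    [315, 297],   # non-progressive
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected cell frequency = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)

for exp_row in expected:
    print([round(fe, 1) for fe in exp_row])
print(f"chi-square = {chi2:.2f}")
```

With two levels per variable, df = (2 – 1)(2 – 1) = 1 for this table.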

Page 9:

Useful checks

• The sum of the expected frequencies must equal the sum of the observed frequencies: $\sum f_o = \sum f_e = N$

• The sum of the observed frequencies minus the expected frequencies must equal zero: $\sum (f_o - f_e) = 0$
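Both checks are easy to automate. The sketch below runs them on the die counts from the earlier example, using exact fractions so that rounding (such as 16.7 instead of 100/6) cannot hide a mistake; the variable names are assumptions.

```python
from fractions import Fraction

observed = [12, 17, 16, 18, 13, 24]                       # die example
n = sum(observed)
expected = [Fraction(n, len(observed))] * len(observed)   # exactly 100/6 each

# Check 1: the expected frequencies sum to N, like the observed ones.
sums_match = sum(expected) == n

# Check 2: the differences fo - fe cancel out.
differences_cancel = sum(fo - fe for fo, fe in zip(observed, expected)) == 0

print(sums_match, differences_cancel)
```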

Page 10:

χ²-test

• Limitations:
– All raw data for χ² must be frequencies (not percentages!)
– Each subject or event is counted only once, i.e. contributes to only one cell value (strictly between-subjects)
– The total number of observations should be greater than 20
– The expected frequency in any cell should be greater than 5
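Some of these rules of thumb can be checked mechanically before running the test. The helper below is a hypothetical sketch (its name and messages are assumptions). It cannot verify that each subject was counted only once, which has to be ensured by the design.

```python
def chi_square_warnings(observed, expected):
    """Return a list of rule-of-thumb violations (empty if none are found)."""
    warnings = []
    if any(fo < 0 or fo != int(fo) for fo in observed):
        warnings.append("observed values must be whole-number counts")
    if sum(observed) <= 20:
        warnings.append("total number of observations should be greater than 20")
    if any(fe <= 5 for fe in expected):
        warnings.append("every expected frequency should be greater than 5")
    return warnings


die_check = chi_square_warnings([12, 17, 16, 18, 13, 24], [100 / 6] * 6)
small_check = chi_square_warnings([3, 2, 4], [3, 3, 3])   # hypothetical small sample
print(die_check)
print(small_check)
```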

Page 11:

An Example

• You want to test how well non-Chinese speaking students can learn Chinese characters using different kinds of mnemonic. There are three groups of subjects, one with no mnemonic, one with mnemonic 1 and one with mnemonic 2. You count how many characters were correctly recalled.

• What are the independent and dependent variables? IV: kind of mnemonic, DV: recall
– How many levels does the IV have? 3
– What is the type of the dependent variable? interval

Page 12:

ANOVA: general

• Analysis of variance
– Tests the null-hypothesis that all the samples are taken from the same population
– Compares the variance within the samples (random error) to the variance between the samples (systematic error)
– If the variance between the samples is sufficiently larger than the variance within the samples, we can reject the null-hypothesis
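To make the comparison of the two variances concrete, here is a minimal sketch of a one-factor between-subjects calculation. The recall scores are hypothetical (four per group, not taken from the slides), and the variable names are assumptions.

```python
from statistics import mean

# Hypothetical recall scores for three groups of four subjects each.
groups = {
    "no mnemonic": [4, 5, 6, 5],
    "mnemonic 1": [7, 8, 9, 8],
    "mnemonic 2": [8, 9, 10, 9],
}

scores = [x for g in groups.values() for x in g]
grand_mean = mean(scores)
k = len(groups)          # number of groups
n_total = len(scores)    # total number of scores

# Between-samples variation: group means around the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Within-samples variation: scores around their own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_ratio = ms_between / ms_within
print(f"F({k - 1},{n_total - k}) = {f_ratio:.2f}")
```

In this made-up data set the group means differ a lot while the scores within each group are close together, so the ratio of the two variances is large.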

Page 13:

ANOVA: limitations

• All samples must be selected randomly

• The scores must be interval

• The scores in the samples must be normally distributed

• The variances of the samples must be homogeneous

• There needs to be an equal number of scores in each sample

Page 14:

ANOVA: general

• Conventions
– Independent variables are called “factors”
– ANOVA calculates an F-statistic that determines whether the null-hypothesis can be rejected or not
– In SPSS, you find the ANOVAs in Analyze => General Linear Model
  • “univariate”: analysis of one DV (between)
  • “multivariate”: for more than one DV (between)
  • “repeated measures”: within-subjects designs

Page 15:

F-statistic

• F-statistic is the ratio of the between-group variance to the within-group variance. It has to be larger than a critical value in a table

• The p-value of the F-statistic depends on two df-values: F(dfn, dfd) = value
– Df of the numerator: dfn = k – 1
– Df of the denominator: dfd = N – k

(N: total number of scores across all groups, k: number of groups)
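The two df-values follow directly from the group sizes. A short sketch, with a hypothetical helper name and hypothetical group sizes:

```python
def anova_df(group_sizes):
    """Return (dfn, dfd) for a one-factor between-subjects ANOVA."""
    k = len(group_sizes)         # number of groups
    n_total = sum(group_sizes)   # total number of scores
    return k - 1, n_total - k


dfn, dfd = anova_df([10, 10, 10])   # three groups of ten subjects (hypothetical)
print(f"F({dfn},{dfd})")
```

Three groups of ten give F(2,27); three groups of four would give F(2,9).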

Page 16:

Reporting the F-value

• As the p-value of the F-statistic depends on two df-values, you have to report them

• Suppose we have 3 groups (3 levels of an independent variable) and 12 scores in total (4 per group). We report the F-statistic as follows:

F(2,9) = 2.9, p = ???

(similarly to the t-value, the df, and the p-value for t-tests!)
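Without statistics software, the reported F-value can at least be compared with the tabled critical value. The sketch below hard-codes two 5% critical values as found in standard F tables, 4.26 for F(2,9) and 3.35 for F(2,27). The function name and the output format are assumptions.

```python
# Critical F values at alpha = 0.05 from a standard table, keyed by (dfn, dfd).
F_CRITICAL_05 = {(2, 9): 4.26, (2, 27): 3.35}


def report_f(f_value, dfn, dfd):
    """Format an F-value together with a table-based verdict at the 5% level."""
    critical = F_CRITICAL_05[(dfn, dfd)]
    verdict = "p < 0.05" if f_value > critical else "p > 0.05"
    return f"F({dfn},{dfd}) = {f_value}, {verdict}"


print(report_f(2.9, 2, 9))
print(report_f(17.7, 2, 27))
```

A table only tells us on which side of the 5% threshold the result falls. An exact p-value has to come from software such as SPSS.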

Page 17:

Critical values for the F-statistic

Page 18:

One factor between-subjects ANOVA

• If the independent variable has two levels, the results are comparable to an independent t-test (F = t²)

• If we have more than two levels, we could in principle run multiple independent t-tests

• BUT: This increases our Type I error
– With one test we can be 95% sure our conclusion is correct
– With two tests, this percentage drops to 0.95 * 0.95 ≈ 0.90 (we can only be 90% sure of our conclusion)
– With even more tests …
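The drop in confidence can be computed for any number of tests. This sketch assumes that the tests are independent and each use a 5% significance level; the function name is an assumption.

```python
def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one Type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 2, 3, 6):
    risk = familywise_error(n_tests)
    print(f"{n_tests} test(s): {100 * (1 - risk):.1f}% sure, {100 * risk:.1f}% risk")
```

Comparing three groups pairwise already needs three t-tests, so the risk of at least one false positive rises to roughly 14%.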

Page 19:

One factor between-subjects ANOVA

• A one-factor ANOVA corrects for this increased risk of a Type I error

• There are fixed factors and random factors:
– If you choose the IV to be a fixed factor, the model is calculated for just the levels of the independent variable you have (e.g. gender, accentedness)
– If you choose the IV to be a random factor, you want to generalise from the levels of your independent variable to other levels (e.g. the IV contains three different degrees of blood alcohol, but you want to generalise its effect on e.g. speech control to other levels)

Page 20:

SPSS output

• There is a significant effect of the kind of mnemonic on the number of characters recalled: F(2,27) = 17.7, p < 0.001

(SPSS output table not reproduced; slide annotation: “Ignore these!”)

But between which of the groups??

Page 21:

Post-hoc tests

• If the IV has more than 2 levels, we have to do post-hoc tests to find out which of the groups are significantly different

• Scheffé test:
– Suitable for pair-wise comparison between all groups
– Corrects for the increased risk of a Type I error (most conservative post-hoc test)

• Dunnett test:
– Useful for “planned comparisons”, e.g. comparing two different groups against a control group
– Less stringent than Scheffé

Page 22:

Post-hoc tests with SPSS

Page 23:

SPSS output for multiple comparisons (here Scheffé test)

• Significant differences between “no mnemonic” and the other two groups.

Page 24:

SPSS output for homogeneous subsets (Scheffé test)

• There are two subsets

Page 25:

Two factor between-subject ANOVA

• In an ANOVA, you can also investigate the effect of more than one independent variable

• This is called a factorial design

• Example: You would like to investigate how two different speech rates affect the duration of words in sentence-initial, -medial, and -final position.

• What are the IV and DV? IV: speech rates, position
– How many levels do the IVs have? 2 and 3
– What is the type of the DV? interval

Page 26:

Factorial Design: Example

• Every level of each factor is combined with every level of the other factor

• Factorial designs have to be completely randomised, i.e. every group contributes data to only one cell

        Initial   Medial   Final
Fast    Group1    Group2   Group3
Slow    Group4    Group5   Group6
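The cells of a factorial design are simply all combinations of the factor levels. As a small sketch (the variable names are assumptions), the six groups of the table can be generated like this:

```python
from itertools import product

rates = ["fast", "slow"]
positions = ["initial", "medial", "final"]

# Every level of one factor is combined with every level of the other factor.
cells = {
    (rate, position): f"Group{i}"
    for i, (rate, position) in enumerate(product(rates, positions), start=1)
}

for cell, group in cells.items():
    print(cell, group)
```

A 2 × 3 design therefore needs six separate groups of subjects.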

Page 27:

Factorial Design: Main effects and interaction

• For every variable we can find significant main effects. Would you expect main effects here?
– We would expect to find a main effect of speech rate on duration (i.e. higher speech rate => shorter durations)
– Also, there might be a main effect of position (final segments undergo phrase-final lengthening, initial and medial ones don’t)

• An interaction would indicate that the effect of one IV is different in the conditions of another IV

Page 28:

Factorial Design: Hypothesised results

(Line plot: duration on the y-axis against position (initial, medial, final) on the x-axis, with one line for slow speech and one for fast speech)

Does this graph show an interaction?

Non-parallel lines always show interactions!!!
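Whether the lines are parallel can be read off the cell means: if the difference between slow and fast speech is the same at every position, the lines are parallel. The sketch below uses hypothetical mean durations in milliseconds (not taken from the slides); the variable names are assumptions.

```python
# Hypothetical mean word durations in ms for each cell of the 2 x 3 design.
cell_means = {
    "slow": {"initial": 300, "medial": 310, "final": 400},
    "fast": {"initial": 250, "medial": 260, "final": 300},
}

positions = ["initial", "medial", "final"]

# Effect of speech rate (slow minus fast) at each position.
rate_effect = {p: cell_means["slow"][p] - cell_means["fast"][p] for p in positions}

# Parallel lines: the rate effect is identical at all positions.
lines_parallel = len(set(rate_effect.values())) == 1

print(rate_effect)
print("parallel" if lines_parallel else "not parallel")
```

In these made-up means the rate effect doubles in final position, so the lines are not parallel. Whether such a pattern is a significant interaction is what the ANOVA tests.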

Page 29:

Factorial Design: Degrees of freedom

• Degrees of freedom are different for main effects and for interactions

• Numerator (the first value in round brackets):
– For main effects: df = k – 1 (number of levels of that IV – 1)
– For interactions: df = (k – 1)*(j – 1) (k, j: number of levels in each IV)

• Denominator (second value in round brackets):
– For both: df = N – j*k
– Note: df for denominator is always found in row labelled “error”
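These formulas can be bundled into one small helper. The sketch below is hypothetical (function name, dictionary keys and sample size are assumptions) and uses the 2 × 3 speech rate by position design with 60 scores in total.

```python
def factorial_df(levels_a, levels_b, n_total):
    """Return (numerator df, denominator df) for each effect in a two-factor design."""
    df_error = n_total - levels_a * levels_b   # row labelled "error" in the output
    return {
        "main effect A": (levels_a - 1, df_error),
        "main effect B": (levels_b - 1, df_error),
        "interaction": ((levels_a - 1) * (levels_b - 1), df_error),
    }


# Speech rate (2 levels) x position (3 levels), hypothetical N = 60.
df = factorial_df(2, 3, 60)
for effect, (dfn, dfd) in df.items():
    print(f"{effect}: F({dfn},{dfd})")
```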

Page 30:

Output for factorial design: Please interpret

• Significant main effect of task: F(2,126)=132.9, p<0.001

• Significant interaction: F(4,126)=6.3, p<0.001

From http://www.uvm.edu/~dhowell/gradstat/psych341/lectures/Factorial1Folder/class4.html

Page 31:

Why are inhomogeneous variances a problem?

• Assume that the means and the variances are correlated (i.e. a higher mean in one sample co-occurs with a higher variance in that sample, possibly caused by outliers and extreme values)
– Then the mean is very unreliable
– Since the ANOVA compares the variances and the means, you might get a significant difference which is not actually in the data!
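A tiny numerical illustration of how one extreme value raises the mean and the variance together. Both samples are hypothetical and differ only in their last score.

```python
from statistics import mean, variance

sample_a = [10, 11, 9, 10, 10]
sample_b = [10, 11, 9, 10, 30]   # same scores, but with one extreme value

for name, sample in (("A", sample_a), ("B", sample_b)):
    print(name, "mean:", mean(sample), "variance:", variance(sample))
```

The single outlier shifts the mean of sample B from 10 to 14 and at the same time inflates its variance, which is exactly the situation in which the sample mean becomes unreliable.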