
Page 1: Statistical Methods in  Computer Science

Statistical Methods in Computer Science

Hypothesis Testing II: Single-Factor Experiments

Ido Dagan

Page 2: Statistical Methods in  Computer Science

Empirical Methods in Computer Science © 2006-now Gal Kaminka


Single-Factor Experiments

A generalization of treatment experiments: determine the effect of the independent variable's values (nominal) on the dependent variable

treatment1 Ind1 & Ex1 & Ex2 & .... & Exn ==> Dep1

treatment2 Ind2 & Ex1 & Ex2 & .... & Exn ==> Dep2

control Ex1 & Ex2 & .... & Exn ==> Dep3

Compare performance of algorithm A to B to C ...
Control condition: optional (e.g., to establish a baseline)

Page 3: Statistical Methods in  Computer Science


Single-Factor Experiments

A generalization of treatment experiments: determine the effect of the independent variable's values (nominal) on the dependent variable

treatment1 Ind1 & Ex1 & Ex2 & .... & Exn ==> Dep1

treatment2 Ind2 & Ex1 & Ex2 & .... & Exn ==> Dep2

control Ex1 & Ex2 & .... & Exn ==> Dep3

Compare performance of algorithm A to B to C ...
Control condition: optional (e.g., to establish a baseline)

Values of independent variable

Values of dependent variable

Page 4: Statistical Methods in  Computer Science


Single-Factor Experiments: Definitions

The independent variable is called the factor
Its values (being tested) are called levels

Our goal: determine whether there is an effect of levels
Null hypothesis: there is no effect
Alternative hypothesis: at least one level causes an effect

Tool: one-way ANOVA
A simple special case of general Analysis of Variance

Page 5: Statistical Methods in  Computer Science


The case for Single-factor ANOVA (one-way ANOVA)

We have k samples (k levels of the factor)

Each with its own sample mean and sample std. deviation for the dependent variable value

We want to determine whether at least one is different

treatment1 Ind1 & Ex1 & Ex2 & ... & Exn ==> Dep1
...
treatmentk Indk & Ex1 & Ex2 & ... & Exn ==> Depk

control Ex1 & Ex2 & ... & Exn ==> Depk+1

Values of the independent variable = levels of the factor

Values of dependent variable

Cannot use the tests we learned: Why?

Page 6: Statistical Methods in  Computer Science


The case for Single-factor ANOVA (one-way ANOVA)

We have k samples (k levels of the factor)

Each with its own sample mean and sample std. deviation
We want to determine whether at least one is different

H0: M1 = M2 = M3 = M4
H1: there exist i, j such that Mi ≠ Mj

Level   Mi (s. mean)   Si (s. stdev.)   N (sample)
1       4.24           0.91             29
2       3.75           1.38             120
3       2.85           1.38             59
4       2.63           1.41             59

Page 7: Statistical Methods in  Computer Science


The case for Single-factor ANOVA (one-way ANOVA)

We have k samples (k levels of the factor)

Each with its own sample mean and sample std. deviation
We want to determine whether at least one is different

H0: M1 = M2 = M3 = M4
H1: there exist i, j such that Mi ≠ Mj

Level   Mi (s. mean)   Si (s. stdev.)   N (sample)
1       4.24           0.91             29
2       3.75           1.38             120
3       2.85           1.38             59
4       2.63           1.41             59

Why not use a t-test to compare every Mi, Mj?

Page 8: Statistical Methods in  Computer Science


Multiple paired comparisons

Let αc be the probability of a Type I error in a single comparison
(i.e., the probability of incorrectly rejecting the null hypothesis)

1 - αc: probability of making no error in a single comparison

(1 - αc)^m: probability of no error in m comparisons

αe = 1 - (1 - αc)^m: probability of at least one error in the experiment
(under the assumption of independent comparisons)

αe quickly becomes large as m increases

Page 9: Statistical Methods in  Computer Science


Example

Suppose we want to contrast 15 levels of the factor: 15 groups, k = 15

Total number of pairwise comparisons: m = 15 × (15 - 1) / 2 = 105

Suppose αc = 0.05. Then:

αe = 1 - (1 - αc)^m = 1 - (1 - 0.05)^105 = 0.9954

We are very likely to make a Type I error!
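The arithmetic above can be checked with a short Python sketch (the function name is just for illustration):

```python
# Probability of at least one Type I error across m independent
# comparisons, each run at per-comparison level alpha_c.
def experimentwise_alpha(alpha_c, m):
    return 1 - (1 - alpha_c) ** m

k = 15
m = k * (k - 1) // 2              # pairwise comparisons: 15 * 14 / 2 = 105
alpha_e = experimentwise_alpha(0.05, m)
print(m, round(alpha_e, 4))       # 105 0.9954
```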

Page 10: Statistical Methods in  Computer Science


Possible solutions?

Reduce αc until the overall αe level is 0.05 (or as needed)
Risk: the per-comparison alpha target may become unobtainable

Ignore the experiment null hypothesis; focus on the comparisons
Carry out m comparisons; expected # of errors in m comparisons: m × αc
e.g., m = 105, αc = 0.05: expected # of errors = 5.25. But which ones?
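The first option (shrinking the per-comparison alpha) can be sketched numerically. Bonferroni and Šidák are the standard ways to pick the reduced level; they are named here as an illustration, not as part of the slide:

```python
# Per-comparison alpha needed to keep the experimentwise error rate
# near 0.05 for m = 105 comparisons:
#   Bonferroni: simple conservative bound, target / m
#   Sidak: exact under independent comparisons
m = 105
target_alpha_e = 0.05
bonferroni_alpha_c = target_alpha_e / m
sidak_alpha_c = 1 - (1 - target_alpha_e) ** (1 / m)
print(round(bonferroni_alpha_c, 6), round(sidak_alpha_c, 6))
# both around 0.0005 -- a very strict, possibly unobtainable, threshold
```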

Page 11: Statistical Methods in  Computer Science


One-way ANOVA

A method for testing the experiment null hypothesis
H0: all levels' sample means are equal to each other

Key idea:
Estimate a variance B under the assumption that H0 is true
Estimate the “real” variance W (regardless of H0)
Use the F-test to test the hypothesis that B = W

Assumes the variance of all groups is the same

Page 12: Statistical Methods in  Computer Science


Some preliminaries

Let xi,j be the jth element in sample i
Let Mi be the sample mean of sample i
Let Vi be the sample variance of sample i

For example:

Class 1   Class 2   Class 3
 14.9      11.1      5.7
 15.2       9.5      6.6
 17.9      10.9      6.7
 15.6      11.7      6.8
 10.7      11.8      6.9
Mi: 14.86  11        6.54
Vi: 6.8    0.85      0.23

(marked in the slide: x1,2 = 15.2, x3,4 = 6.8)
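The per-class means and variances in the table can be reproduced with the Python standard library (statistics.variance uses the n-1 sample form, matching Vi here):

```python
import statistics

# The three classes from the table above.
classes = {
    "Class 1": [14.9, 15.2, 17.9, 15.6, 10.7],
    "Class 2": [11.1, 9.5, 10.9, 11.7, 11.8],
    "Class 3": [5.7, 6.6, 6.7, 6.8, 6.9],
}
for name, xs in classes.items():
    Mi = statistics.mean(xs)          # sample mean
    Vi = statistics.variance(xs)      # sample variance (n-1 denominator)
    print(name, round(Mi, 2), round(Vi, 2))
```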

Page 13: Statistical Methods in  Computer Science


Some preliminaries

Let xi,j be the jth element in sample i
Let Mi be the sample mean of sample i
Let Vi be the sample variance of sample i

Let M be the grand sample mean (all elements, all samples)

Let V be the grand sample variance

Page 14: Statistical Methods in  Computer Science


The variance contributing to a value

Every element xi,j can be re-written as:

xi,j = M + ei,j

where ei,j is some error component

We can focus on the error component

ei,j = xi,j – M

which we will rewrite as:

ei,j = (xi,j - Mi ) + (Mi - M)

Page 15: Statistical Methods in  Computer Science


Within-group and between-group

The re-written form of the error component has two parts

ei,j = (xi,j - Mi) + (Mi - M)

Within-group component: variance w.r.t. the group mean
Between-group component: variance w.r.t. the grand mean

For example, in the table: x1,1 = 14.9, M1 = 14.86, M = 10.8
e1,1 = (14.9 - 14.86) + (14.86 - 10.8) = 0.04 + 4.06 = 4.1

Page 16: Statistical Methods in  Computer Science


Within-group and between-group

The re-written form of the error component has two parts

ei,j = (xi,j - Mi) + (Mi - M)

Within-group component: variance w.r.t. the group mean
Between-group component: variance w.r.t. the grand mean

For example, in the table: x1,1 = 14.9, M1 = 14.86, M = 10.8
e1,1 = (14.9 - 14.86) + (14.86 - 10.8) = 0.04 + 4.06 = 4.1

Note the within-group and between-group components: most of the error (variance) is due to the between-group component!

Can we use this in more general fashion?

Page 17: Statistical Methods in  Computer Science


No within-group variance

Class 1   Class 2   Class 3
 15        11        6
 15        11        6
 15        11        6
 15        11        6
 15        11        6
Mi: 15     11        6       (grand M = 10.67)
Vi: 0      0         0       (grand V = 14.52)

No variance within group, in any element

Page 18: Statistical Methods in  Computer Science


No between-group variance

Class 1   Class 2   Class 3
 17        11        22
 26        13        14
  9        18        12
 11        18         8
 12        15        19
Mi: 15     15        15      (grand M = 15)
Vi: 46.5   9.5       31      (grand V = 24.86)

No variance between groups, in any group

Page 19: Statistical Methods in  Computer Science


Comparing within-group and between-group components

The error component of a single element is:

ei,j =(xi,j - M) = (xi,j - Mi ) + (Mi - M)

Let us relate this to the sample and grand sums of squares. It can be shown that:

Σi Σj (xi,j - M)² = Σi Σj (xi,j - Mi)² + Σi Σj (Mi - M)²

Let us rewrite this as:

SStotal = SSwithin + SSbetween
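The identity can be verified numerically on the three-class table from the earlier slide (a sketch in plain Python):

```python
# Verify SS_total = SS_within + SS_between on the example data.
data = {
    1: [14.9, 15.2, 17.9, 15.6, 10.7],
    2: [11.1, 9.5, 10.9, 11.7, 11.8],
    3: [5.7, 6.6, 6.7, 6.8, 6.9],
}
all_x = [x for xs in data.values() for x in xs]
M = sum(all_x) / len(all_x)                        # grand mean (10.8)
Mi = {i: sum(xs) / len(xs) for i, xs in data.items()}

ss_total = sum((x - M) ** 2 for x in all_x)
ss_within = sum((x - Mi[i]) ** 2 for i, xs in data.items() for x in xs)
ss_between = sum(len(xs) * (Mi[i] - M) ** 2 for i, xs in data.items())

# The decomposition holds up to floating-point error.
assert abs(ss_total - (ss_within + ss_between)) < 1e-9
```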

Page 20: Statistical Methods in  Computer Science


From Sums of Squares (SS) to variances

We know:

SStotal = SSwithin + SSbetween

... and convert to Mean Squares (as variance estimates), dividing each SS by its degrees of freedom (df):

MSwithin = SSwithin / dfwithin = Σi Σj (xi,j - Mi)² / (N - I)

MSbetween = SSbetween / dfbetween = Σi Ni (Mi - M)² / (I - 1)

where N is the total number of elements and I = k is the number of levels (samples).


Page 24: Statistical Methods in  Computer Science


Determining final alpha level

MSwithin is an estimate of the (inherent) population variance, which does not depend on the null hypothesis (M1 = M2 = ... = MI)
Intuition: it is an “average” of the variances in the individual groups

MSbetween estimates the population variance + the treatment effect; it does depend on the null hypothesis
Intuition: it is similar to an estimate of the variance of the sample means, with each component multiplied by Ni
Recall: N · (variance of the sample mean) = population variance

If the null hypothesis is true, the two values estimate the same inherent variance, and should be equal up to sampling variation

So now we have two variance estimates to compare. Use the F-test:

F = MSbetween / MSwithin

Compare to the F-distribution with (dfbetween, dfwithin) degrees of freedom
Determine the alpha level (significance)

Page 25: Statistical Methods in  Computer Science


Example

Class 1   Class 2   Class 3
 14.9      11.1      5.7
 15.2       9.5      6.6
 17.9      10.9      6.7
 15.6      11.7      6.8
 10.7      11.8      6.9
Mi: 14.86  11        6.54    (grand M = 10.8)
Vi: 6.8    0.85      0.23    (grand V = 14.64)

SSbetween = 5(14.86 - 10.8)² + 5(11 - 10.8)² + 5(6.54 - 10.8)² = 173.3

MSbetween = SSbetween / dfbetween = 173.3 / (3 - 1) = 86.7

Page 26: Statistical Methods in  Computer Science


Example (continued, same table)

SSwithin = (14.9 - 14.86)² + ... + (10.7 - 14.86)²
         + (11.1 - 11)² + ... + (11.8 - 11)²
         + (5.7 - 6.54)² + ... + (6.9 - 6.54)²
         = 31.5

MSwithin = SSwithin / dfwithin = 31.5 / 12 = 2.6

Page 27: Statistical Methods in  Computer Science


Example (continued)

F = MSbetween / MSwithin = 86.7 / 2.6 ≈ 32.97

Check the F distribution with (2, 12) degrees of freedom: significant!
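The whole example can be reproduced in a few lines of plain Python (a sketch of the computation from the slides' definitions, not of any particular library's API):

```python
# One-way ANOVA F statistic for the three-class example, straight from
# the MS_between / MS_within definitions.
groups = [
    [14.9, 15.2, 17.9, 15.6, 10.7],
    [11.1, 9.5, 10.9, 11.7, 11.8],
    [5.7, 6.6, 6.7, 6.8, 6.9],
]
N = sum(len(g) for g in groups)        # total observations: 15
I = len(groups)                        # number of levels: 3
grand = sum(x for g in groups for x in g) / N

means = [sum(g) / len(g) for g in groups]
ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

ms_between = ss_between / (I - 1)      # df_between = 2
ms_within = ss_within / (N - I)        # df_within = 12
F = ms_between / ms_within
print(round(F, 2))                     # 32.97, as on the slide
```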

Page 28: Statistical Methods in  Computer Science


Reading the results from statistics software

You can use statistics software to run a one-way ANOVA.

It will give out something like this:

Source    df    SS      MS     F       p
between    2    173.3   86.7   32.97   p < 0.001
within    12     31.5    2.6
total     14    204.9

You should have no problem reading this now.

Page 29: Statistical Methods in  Computer Science


Analogy to linear regression, where the variance of the observations is composed of:
the variance of the predictions,
plus the variance of the deviations from the corresponding predictions

That is: explained variance (according to the prediction) vs. unexplained variance (due to deviations from the prediction)

Page 30: Statistical Methods in  Computer Science


Summary

Treatment and single-factor experiments:
Independent variable: categorical
Dependent variable: “numerical” (ratio/interval)

Multiple comparisons: a problem for experiment hypotheses

Run one-way ANOVA instead. Assumes:
populations are normal
populations have equal variances
independent random samples (with replacement)

Moderate deviation from normality, particularly with large samples, is still fine
Somewhat different variances are fine for roughly equal sample sizes

If significant, run additional tests for details:
Tukey's procedure (T method), LSD, Scheffé, ...