Lecture11

Transcript of Lecture11

Page 1: Lecture11

Introduction to sample size and power calculations

How much chance do we have to reject the null hypothesis when the alternative is in fact true? (What’s the probability of detecting a real effect?)

Page 2: Lecture11

Can we quantify how much power we have for given sample sizes?

Page 3: Lecture11

Null Distribution: difference=0.

Clinically relevant alternative: difference=10%.

Rejection region. Any value >= 6.5 (0+3.3*1.96)

study 1: 263 cases, 1241 controls

For 5% significance level, one-tail area = 2.5% ($Z_{\alpha/2} = 1.96$)

Power= chance of being in the rejection region if the alternative is true=area to the right of this line (in yellow)

Page 4: Lecture11

Rejection region. Any value >= 6.5 (0+3.3*1.96)

Power= chance of being in the rejection region if the alternative is true=area to the right of this line (in yellow)

study 1: 263 cases, 1241 controls

Power here:

$$P\left(Z > \frac{6.5 - 10}{3.3}\right) = P(Z > -1.06) = 85\%$$
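The power figure for study 1 can be checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` for normal-curve areas; the variable names are ours, and the inputs (standard error 3.3, alternative 10, Z = 1.96) are the rounded values shown on the slide.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal curve: mean 0, sd 1

se_diff = 3.3        # standard error of the difference (percentage points), from the slide
alternative = 10.0   # clinically relevant difference (percentage points)
z_alpha_2 = 1.96     # two-sided 5% significance level

# The rejection region starts at 0 + standard error * Z_{alpha/2}
critical_value = 0 + se_diff * z_alpha_2

# Power = area to the right of the critical value when the alternative is true
z = (critical_value - alternative) / se_diff
power = 1 - std_normal.cdf(z)

print(f"critical value = {critical_value:.2f}")
print(f"Z = {z:.2f}, power = {power:.0%}")
```

Because the slide rounds the critical value to 6.5, the result here may differ from 85% by about a percentage point.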

Page 5: Lecture11

Critical value= 0+10*1.96=20

Power closer to 15% now.

2.5% area

$Z_{\alpha/2} = 1.96$

study 1: 50 cases, 50 controls

Page 6: Lecture11

Critical value= 0+0.52*1.96 = 1

Power is nearly 100%!

Study 2: 18 treated, 72 controls, STD DEV = 2

Clinically relevant alternative: difference=4 points

Page 7: Lecture11

Critical value= 0+2.58*1.96 = 5

Power is about 40%

Study 2: 18 treated, 72 controls, STD DEV=10

Page 8: Lecture11

Critical value= 0+0.52*1.96 = 1

Power is about 50%

Study 2: 18 treated, 72 controls, effect size=1.0

Clinically relevant alternative: difference=1 point
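The three Study 2 scenarios (18 treated, 72 controls) can be recomputed from the group sizes and standard deviation. This is a rough sketch, assuming a normal approximation and the standard-library `statistics.NormalDist`; the function name `power_two_means` is ours. The slides round intermediate values, so expect small differences from the quoted percentages.

```python
from math import sqrt
from statistics import NormalDist

def power_two_means(diff, sd, n1, n2, z_alpha_2=1.96):
    """Approximate power to detect `diff` between two group means (normal approximation)."""
    se = sqrt(sd**2 / n1 + sd**2 / n2)    # standard error of the difference
    critical_value = 0 + se * z_alpha_2   # edge of the rejection region
    z = (critical_value - diff) / se
    return 1 - NormalDist().cdf(z)        # area to the right of Z

# (difference, standard deviation) for the three slides
scenarios = [(4, 2), (4, 10), (1, 2)]
for diff, sd in scenarios:
    p = power_two_means(diff, sd, n1=18, n2=72)
    print(f"difference={diff}, sd={sd}: power = {p:.0%}")
```

The first scenario comes out at essentially 100%, the second lands between 30% and 40%, and the third is close to 50%, in line with the rough figures on the slides.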

Page 9: Lecture11

Factors Affecting Power

1. Size of the effect
2. Standard deviation of the characteristic
3. Bigger sample size
4. Significance level desired
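To see each of the four factors at work, one can vary them one at a time. The sketch below assumes two equal-sized groups and a normal approximation; the function `power` and all of the numbers (difference 3, standard deviation 10, 100 per group) are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power(diff, sd, n_per_group, alpha=0.05):
    """Approximate power for comparing two equal-sized groups (normal approximation)."""
    z_alpha_2 = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(2 * sd**2 / n_per_group)      # standard error of the difference
    return NormalDist().cdf(diff / se - z_alpha_2)

baseline = power(diff=3, sd=10, n_per_group=100)
bigger_effect = power(diff=5, sd=10, n_per_group=100)
bigger_sd = power(diff=3, sd=20, n_per_group=100)
bigger_n = power(diff=3, sd=10, n_per_group=400)
higher_alpha = power(diff=3, sd=10, n_per_group=100, alpha=0.10)

print(f"baseline:                      {baseline:.2f}")
print(f"1. bigger effect:              {bigger_effect:.2f}")
print(f"2. bigger standard deviation:  {bigger_sd:.2f}")
print(f"3. bigger sample size:         {bigger_n:.2f}")
print(f"4. higher significance level:  {higher_alpha:.2f}")
```

A bigger effect, a bigger sample, or a higher significance level each raise power; a bigger standard deviation lowers it.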

Page 10: Lecture11

average weight from samples of 100

Null

Clinically relevant alternative

1. Bigger difference from the null mean

Page 11: Lecture11

average weight from samples of 100

2. Bigger standard deviation

Page 12: Lecture11

average weight from samples of 100

3. Bigger Sample Size

Page 13: Lecture11

average weight from samples of 100

4. Higher significance level

Rejection region.

Page 14: Lecture11

Sample size calculations

Based on these elements, you can write a formal mathematical equation that relates power, sample size, effect size, standard deviation, and significance level…

**WE WILL DERIVE THESE FORMULAS FORMALLY SHORTLY**

Page 15: Lecture11

Simple formula for difference in means

$$n = \frac{2\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$

where:

$n$ = sample size in each group (assumes equal sized groups)

$Z_{\text{power}}$ = represents the desired power (typically .84 for 80% power)

$Z_{\alpha/2}$ = represents the desired level of statistical significance (typically 1.96)

$\sigma$ = standard deviation of the outcome variable

difference = effect size (the difference in means)

Page 16: Lecture11

Simple formula for difference in proportions

$$n = \frac{2\,\bar{p}(1-\bar{p})\,(Z_{\text{power}} + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$

where:

$n$ = sample size in each group (assumes equal sized groups)

$Z_{\text{power}}$ = represents the desired power (typically .84 for 80% power)

$Z_{\alpha/2}$ = represents the desired level of statistical significance (typically 1.96)

$\bar{p}(1-\bar{p})$ = a measure of variability (similar to standard deviation)

$p_1 - p_2$ = effect size (the difference in proportions)

Page 17: Lecture11

Derivation of sample size formula….

Page 18: Lecture11

Critical value= 0+.52*1.96=1

Power close to 50%

Study 2: 18 treated, 72 controls, effect size=1.0

Page 19: Lecture11

SAMPLE SIZE AND POWER FORMULAS

Critical value = 0 + standard error (difference) × $Z_{\alpha/2}$

Power = area to the right of Z, where:

$$Z = \frac{\text{critical value} - \text{alternative difference (here = 1)}}{\text{standard error (diff)}}$$

e.g., here: $Z = \frac{0}{\text{standard error (diff)}} = 0$; power = 50%

Page 20: Lecture11

Power = area to the right of Z, where:

$$Z = \frac{\text{critical value} - \text{alternative difference}}{\text{standard error (diff)}}$$

$$Z = \frac{Z_{\alpha/2} \times \text{standard error (diff)} - \text{difference}}{\text{standard error (diff)}}$$

$$Z = Z_{\alpha/2} - \frac{\text{difference}}{\text{standard error (diff)}}$$

$$-Z = \frac{\text{difference}}{\text{standard error (diff)}} - Z_{\alpha/2}$$

$$Z_{\text{power}} = -Z: \quad \text{the area to the left of } Z_{\text{power}} = \text{the area to the right of } Z$$

Power is the area to the right of Z, or equivalently the area to the left of -Z. Since normal charts give us the area to the left by convention, we need to use -Z to get the correct value. Most textbooks just call this $Z_\beta$; I’ll use the term $Z_{\text{power}}$ to avoid confusion.
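The symmetry argument (area to the right of Z equals area to the left of -Z) is easy to confirm numerically. A small sketch, assuming the standard-library `statistics.NormalDist`; the cutoff -0.84 is just an example value.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal curve

z = -0.84  # example cutoff

area_right_of_z = 1 - std_normal.cdf(z)     # what "power" means geometrically
area_left_of_minus_z = std_normal.cdf(-z)   # what a normal chart gives for Z_power = -Z

print(f"area to the right of Z  = {area_right_of_z:.3f}")
print(f"area to the left of -Z  = {area_left_of_minus_z:.3f}")
```

Both areas agree, and for this cutoff they are about 80%, which is why .84 is the value that goes with 80% power.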

Page 21: Lecture11

All-purpose power formula…

$$Z_{\text{power}} = \frac{\text{difference}}{\text{standard error (difference)}} - Z_{\alpha/2}$$

Page 22: Lecture11

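Derivation of a sample size formula…

Sample size is embedded in the standard error….

$$s.e.(\text{diff}) = \sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}}$$

If r is the ratio of group 2 to group 1 ($n_2 = r n_1$):

$$s.e.(\text{diff}) = \sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{r n_1}}$$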

Page 23: Lecture11

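Algebra…

$$Z_{\text{power}} = \frac{\text{difference}}{\sqrt{\dfrac{\sigma^2}{n_1} + \dfrac{\sigma^2}{r n_1}}} - Z_{\alpha/2}$$

$$Z_{\text{power}} + Z_{\alpha/2} = \frac{\text{difference}}{\sqrt{\dfrac{(r+1)\sigma^2}{r n_1}}}$$

$$(Z_{\text{power}} + Z_{\alpha/2})^2 = \left(\frac{\text{difference}}{\sqrt{\dfrac{(r+1)\sigma^2}{r n_1}}}\right)^2$$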

Page 24: Lecture11

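$$(r+1)\,\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2 = r n_1\,\text{difference}^2$$

$$r n_1 = \frac{(r+1)\,\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$

$$n_1 = \frac{(r+1)}{r}\,\frac{\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$

If r = 1 (equal groups), then:

$$n_1 = \frac{2\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$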

Page 25: Lecture11

Sample size formula for difference in means

$$n_1 = \frac{(r+1)}{r}\,\frac{\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$

where:

$n_1$ = size of smaller group

$r$ = ratio of larger group to smaller group

$\sigma$ = standard deviation of the characteristic

difference = clinically meaningful difference in means of the outcome

$Z_{\text{power}}$ = corresponds to power (.84 = 80% power)

$Z_{\alpha/2}$ = corresponds to two-tailed significance level (1.96 for $\alpha$ = .05)
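The sample size formula for a difference in means translates directly into a few lines of code. This is a sketch only: the function name `n_smaller_group` is ours, the Z values come from the standard-library `statistics.NormalDist` rather than from the rounded .84 and 1.96, and the inputs (difference 5, standard deviation 15) are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_smaller_group(diff, sd, r=1.0, power=0.80, alpha=0.05):
    """Size of the smaller group; r = ratio of larger group to smaller group."""
    z_power = NormalDist().inv_cdf(power)            # about .84 for 80% power
    z_alpha_2 = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    return (r + 1) / r * sd**2 * (z_power + z_alpha_2) ** 2 / diff**2

for r in (1, 2, 3):
    n1 = n_smaller_group(diff=5, sd=15, r=r)
    total = n1 * (1 + r)
    print(f"r={r}: smaller group = {ceil(n1)}, total = {ceil(total)}")
```

As r grows, the smaller group shrinks but the total sample needed goes up.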

Page 26: Lecture11

Examples

Example 1: You want to calculate how much power you will have to see a difference of 3.0 IQ points between two groups: 30 male doctors and 30 female doctors. If you expect the standard deviation to be about 10 on an IQ test for both groups, then the standard error for the difference will be about:

$$\sqrt{\frac{10^2}{30} + \frac{10^2}{30}} \approx 2.57$$

Page 27: Lecture11

Power formula…

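$$Z_{\text{power}} = \frac{d^*}{\sigma(d^*)} - Z_{\alpha/2} = \frac{d^*}{\sqrt{\dfrac{2\sigma^2}{n}}} - Z_{\alpha/2} = \frac{d^*}{\sigma}\sqrt{\frac{n}{2}} - Z_{\alpha/2}$$

$$Z_{\text{power}} = \frac{3}{2.57} - 1.96 \approx -.79 \quad \text{or} \quad Z_{\text{power}} = \frac{3}{10}\sqrt{\frac{30}{2}} - 1.96 \approx -.79$$

P(Z ≤ -.79) = .21; only 21% power to see a difference of 3 IQ points.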
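Example 1 can be reproduced in code, using both forms of the power formula. A minimal sketch, assuming the standard-library `statistics.NormalDist`; the variable names are ours. The code keeps unrounded intermediate values (the standard error is about 2.58 before rounding), so the Z value can differ from the slide in the second decimal place.

```python
from math import sqrt
from statistics import NormalDist

sd = 10.0         # expected standard deviation of IQ in both groups
n = 30            # doctors per group
diff = 3.0        # difference in IQ points we want to detect
z_alpha_2 = 1.96  # two-tailed 5% significance level

# Standard error of the difference in means
se_diff = sqrt(sd**2 / n + sd**2 / n)

# Two algebraically identical forms of the power formula
z_power = diff / se_diff - z_alpha_2
z_power_alt = (diff / sd) * sqrt(n / 2) - z_alpha_2

# Power = area to the left of Z_power
power = NormalDist().cdf(z_power)

print(f"s.e.(diff) = {se_diff:.2f}")
print(f"Z_power = {z_power:.2f} (alternate form: {z_power_alt:.2f})")
print(f"power = {power:.0%}")
```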

Page 28: Lecture11

Example 2: How many people would you need to sample in each group to achieve power of 80% (corresponds to $Z_{\text{power}}$ = .84)?

$$n = \frac{2\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{(d^*)^2} = \frac{100(2)(.84 + 1.96)^2}{(3)^2} = 174$$

174/group; 348 altogether
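The same arithmetic in code, followed by a check that the resulting sample size really does give about 80% power. This is an illustrative sketch with our own variable names, using the rounded Z values from the slide and the standard-library `statistics.NormalDist` for the check.

```python
from math import sqrt
from statistics import NormalDist

sd = 10.0         # standard deviation of IQ
diff = 3.0        # difference to detect (IQ points)
z_power = 0.84    # 80% power
z_alpha_2 = 1.96  # two-tailed 5% significance level

# Sample size per group (equal groups)
n_per_group = 2 * sd**2 * (z_power + z_alpha_2) ** 2 / diff**2
print(f"n per group = {n_per_group:.1f}")

# Plug 174 per group back into the power formula as a sanity check
achieved_z = (diff / sd) * sqrt(174 / 2) - z_alpha_2
achieved_power = NormalDist().cdf(achieved_z)
print(f"power with 174 per group = {achieved_power:.0%}")
```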

Page 29: Lecture11

Sample Size needed for comparing two proportions:

Example: I am going to run a case-control study to determine if pancreatic cancer is linked to drinking coffee. If I want 80% power to detect a 10% difference in the proportion of coffee drinkers among cases vs. controls (if coffee drinking and pancreatic cancer are linked, we would expect that a higher proportion of cases would be coffee drinkers than controls), how many cases and controls should I sample? About half the population drinks coffee.

Page 30: Lecture11

Derivation of a sample size formula:

The standard error of the difference of two proportions is:

$$\sqrt{\frac{\bar{p}(1-\bar{p})}{n_1} + \frac{\bar{p}(1-\bar{p})}{n_2}}$$

Page 31: Lecture11

Derivation of a sample size formula:

Here, if we assume equal sample sizes and that, under the null hypothesis, the proportion of coffee drinkers is .5 in both cases and controls, then

$$s.e.(\text{diff}) = \sqrt{\frac{.5(1-.5)}{n} + \frac{.5(1-.5)}{n}} = \sqrt{.5/n}$$

Page 32: Lecture11

$$Z_{\text{power}} = \frac{\text{test statistic}}{s.e.(\text{test statistic})} - Z_{\alpha/2}$$

$$Z_{\text{power}} = \frac{.10}{\sqrt{.5/n}} - 1.96$$

Page 33: Lecture11

For 80% power…

$$.84 = \frac{.10}{\sqrt{.5/n}} - 1.96$$

$$(.84 + 1.96) = \frac{.10}{\sqrt{.5/n}}$$

$$(.84 + 1.96)^2 = \frac{(.10)^2\, n}{.5}$$

$$n = \frac{.5(.84 + 1.96)^2}{(.10)^2} = 392$$

There is 80% area to the left of a Z-score of .84 on a standard normal curve; therefore, there is 80% area to the right of -.84.

Would take 392 cases and 392 controls to have 80% power! Total=784
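The coffee and pancreatic cancer calculation can be verified the same way. A sketch under the slide's assumptions (equal groups, proportion .5 under the null, rounded Z values); the variable names are ours and `statistics.NormalDist` is used only to confirm the power.

```python
from math import sqrt
from statistics import NormalDist

p_bar = 0.5       # about half the population drinks coffee
diff = 0.10       # difference in proportions to detect
z_power = 0.84    # 80% power
z_alpha_2 = 1.96  # two-tailed 5% significance level

# Equal groups: n = 2 * p(1-p) * (Z_power + Z_alpha/2)^2 / diff^2
n = 2 * p_bar * (1 - p_bar) * (z_power + z_alpha_2) ** 2 / diff**2
print(f"n per group = {n:.0f}, total = {2 * n:.0f}")

# Check: with this n, s.e.(diff) = sqrt(.5/n), and power should be about 80%
se_diff = sqrt(2 * p_bar * (1 - p_bar) / n)
achieved_power = NormalDist().cdf(diff / se_diff - z_alpha_2)
print(f"achieved power = {achieved_power:.0%}")
```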

Page 34: Lecture11

Question 2:

How many total cases and controls would I have to sample to get 80% power for the same study, if I sample 2 controls for every case?

Ask yourself, what changes here?

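$$s.e.(\text{diff}) = \sqrt{\frac{\bar{p}(1-\bar{p})}{n} + \frac{\bar{p}(1-\bar{p})}{2n}} = \sqrt{\frac{.25}{n} + \frac{.25}{2n}} = \sqrt{\frac{.5}{2n} + \frac{.25}{2n}} = \sqrt{\frac{.75}{2n}}$$

$$Z_{\text{power}} = \frac{\text{test statistic}}{s.e.(\text{test statistic})} - Z_{\alpha/2}$$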

Page 35: Lecture11

Different size groups…

$$.84 = \frac{.10}{\sqrt{.75/2n}} - 1.96$$

$$(.84 + 1.96) = \frac{.10}{\sqrt{.75/2n}}$$

$$(.84 + 1.96)^2 = \frac{(.10)^2\, 2n}{.75}$$

$$n = \frac{.75(.84 + 1.96)^2}{2(.10)^2} = 294$$

Need: 294 cases and 2x294=588 controls. 882 total.

Note: you get the best power for the lowest sample size if you keep both groups equal (882 > 784). You would only want to make groups unequal if there was an obvious difference in the cost or ease of collecting data on one group. E.g., cases of pancreatic cancer are rare and take time to find.
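To see how the total sample grows as the groups become more unequal, the calculation can be repeated for several control-to-case ratios. This is a sketch using the same inputs as the coffee example; the function name `cases_needed` is ours, and ratios beyond 2 are included only for illustration.

```python
def cases_needed(r, p_bar=0.5, diff=0.10, z_power=0.84, z_alpha_2=1.96):
    """Number of cases when sampling r controls per case (80% power by default)."""
    return (r + 1) / r * p_bar * (1 - p_bar) * (z_power + z_alpha_2) ** 2 / diff**2

totals = []
for r in (1, 2, 3, 4):
    cases = cases_needed(r)
    controls = r * cases
    totals.append(cases + controls)
    print(f"{r} control(s) per case: {cases:.0f} cases, "
          f"{controls:.0f} controls, {cases + controls:.0f} total")
```

Each extra control per case reduces the number of cases required, but the total keeps rising, so the trade only pays off when cases are much harder to find than controls.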

Page 36: Lecture11

General sample size formula

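$$s.e.(\text{diff}) = \sqrt{\frac{\bar{p}(1-\bar{p})}{rn} + \frac{\bar{p}(1-\bar{p})}{n}} = \sqrt{\frac{\bar{p}(1-\bar{p})}{rn} + \frac{r\,\bar{p}(1-\bar{p})}{rn}} = \sqrt{\frac{(r+1)\,\bar{p}(1-\bar{p})}{rn}}$$

$$n = \frac{(r+1)}{r}\,\frac{\bar{p}(1-\bar{p})\,(Z_{\text{power}} + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$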

Page 37: Lecture11

General sample size needs when outcome is binary:

$$n = \frac{(r+1)}{r}\,\frac{\bar{p}(1-\bar{p})\,(Z_{\text{power}} + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$

where:

$n$ = size of smaller group

$r$ = ratio of larger group to smaller group

$p_1 - p_2$ = clinically meaningful difference in proportions of the outcome

$Z_{\text{power}}$ = corresponds to power (.84 = 80% power)

$Z_{\alpha/2}$ = corresponds to two-tailed significance level (1.96 for $\alpha$ = .05)

Page 38: Lecture11

Compare with when outcome is continuous:

$$n_1 = \frac{(r+1)}{r}\,\frac{\sigma^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$

where:

$n_1$ = size of smaller group

$r$ = ratio of larger group to smaller group

$\sigma$ = standard deviation of the characteristic

difference = clinically meaningful difference in means of the outcome

$Z_{\text{power}}$ = corresponds to power (.84 = 80% power)

$Z_{\alpha/2}$ = corresponds to two-tailed significance level (1.96 for $\alpha$ = .05)

Page 39: Lecture11

Question

How many subjects would we need to sample to have 80% power to detect an average increase in MCAT biology score of 1 point, if the average change without instruction (just due to chance) is plus or minus 3 points (= standard deviation of change)?

Page 40: Lecture11

Standard error here =

$$\frac{\sigma_{\text{change}}}{\sqrt{n}} = \frac{3}{\sqrt{n}}$$

Page 41: Lecture11

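$$Z_{\text{power}} = \frac{\text{test statistic}}{s.e.(\text{test statistic})} - Z_{\alpha/2}$$

$$Z_{\text{power}} = \frac{D}{\sigma_D / \sqrt{n}} - Z_{\alpha/2}$$

$$Z_{\text{power}} + Z_{\alpha/2} = \frac{D\sqrt{n}}{\sigma_D}$$

$$(Z_{\text{power}} + Z_{\alpha/2})^2 = \frac{n D^2}{\sigma_D^2}$$

$$n = \frac{\sigma_D^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{D^2}$$

Therefore, need: $(9)(1.96 + .84)^2 / 1^2 = 70.56$, i.e. about 70 people total (71 if rounded up).

Where D = change from test 1 to test 2 (difference), and $\sigma_D$ = standard deviation of that change.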
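The MCAT calculation in code, as a minimal sketch with our own variable names and the rounded Z values used throughout the lecture. Rounding up with `ceil` is a common convention for sample sizes, not something the slide specifies.

```python
from math import ceil

sd_change = 3.0   # standard deviation of the change in score
diff = 1.0        # average increase we want to detect (points)
z_power = 0.84    # 80% power
z_alpha_2 = 1.96  # two-tailed 5% significance level

# Paired data: one group measured twice, so there is no factor of 2
n = sd_change**2 * (z_power + z_alpha_2) ** 2 / diff**2

print(f"n = {n:.2f}, rounded up = {ceil(n)}")
```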

Page 42: Lecture11

Sample size for paired data:

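$$n = \frac{\sigma_d^2 (Z_{\text{power}} + Z_{\alpha/2})^2}{\text{difference}^2}$$

where:

$n$ = sample size

$\sigma_d$ = standard deviation of the within-pair difference

difference = clinically meaningful difference

$Z_{\text{power}}$ = corresponds to power (.84 = 80% power)

$Z_{\alpha/2}$ = corresponds to two-tailed significance level (1.96 for $\alpha$ = .05)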

Page 43: Lecture11

Paired data difference in proportion: sample size:

$$n = \frac{\bar{p}(1-\bar{p})\,(Z_{\text{power}} + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$

where:

$n$ = sample size for 1 group

$p_1 - p_2$ = clinically meaningful difference in dependent proportions

$Z_{\text{power}}$ = corresponds to power (.84 = 80% power)

$Z_{\alpha/2}$ = corresponds to two-tailed significance level (1.96 for $\alpha$ = .05)