Sample Size Estimation Dr. Tuan V. Nguyen Garvan Institute of Medical Research Sydney, Australia.


Classical hypothesis testing

• Define a null hypothesis (H0) and an alternative hypothesis (H1)

• Collect data (D)

• Estimate p-value = P(D | H0)

• If p-value > α, accept H0; if p-value < α, reject H0

Diagnosis and statistical reasoning

                 Disease status
Test result      Present                    Absent
+ve              True +ve (sensitivity)     False +ve
-ve              False -ve                  True -ve (specificity)

                 Difference is
Test result      Present (Ho not true)      Absent (Ho is true)
Reject Ho        No error (1-β)             Type I error (α)
Accept Ho        Type II error (β)          No error

α: significance level; 1-β: power

Study design issues

• Setting

• Participants: inclusion / exclusion criteria

• Design: survey, factorial, etc

• Measurements: outcome, covariates

• Analysis

• Sample size / power issues

Sample size issues

• How many judges / consumers?
  – Practical and statistical issues
  – Ethical issues

• Ethical issues
  – An unnecessarily large number of judges may be deemed unethical
  – Too small a sample may also be unethical, as the study cannot show anything.

Practical difference vs statistical significance

Outcome         Group A    Group B
Improved        9          18
Not improved    21         12
Total           30         30
% improved      30%        60%

Chi-square: 5.45; P < 0.05 (“statistically significant”)

Outcome         Group A    Group B
Improved        6          12
Not improved    14         8
Total           20         20
% improved      30%        60%

Chi-square: 3.64; P > 0.05 (“not statistically significant”)
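The two tables show the same 30% vs 60% improvement with different group sizes. The sketch below recomputes the test statistic for both, assuming a Pearson chi-square without continuity correction (the slides do not say which variant was used). The function name `pearson_chi_square` is illustrative.

```python
import math

def pearson_chi_square(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: improved / not improved; columns: group A / group B.
tables = {
    "30 per group": (9, 18, 21, 12),
    "20 per group": (6, 12, 14, 8),
}
for label, cells in tables.items():
    chi2, p = pearson_chi_square(*cells)
    print(f"{label}: chi-square = {chi2:.2f}, p = {p:.3f}")
```

The observed proportions are identical in both tables; only the number of subjects changes which side of 0.05 the p-value falls on.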

Effect of sample size on preference proportion

Number of judges    Proportion preferring A (p)    Variance of p    Std Dev of p    Z test for π = 0.5    P-value
20                  0.65                           0.0114           0.1067          1.41                  0.160
30                  0.65                           0.0076           0.0871          1.72                  0.085
40                  0.65                           0.0057           0.0754          1.99                  0.047
50                  0.65                           0.0046           0.0675          2.22                  0.026
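The rows of this table can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the variance is computed from the observed proportion as p(1-p)/n and the p-value is two-sided; `proportion_z_test` is an illustrative name.

```python
import math
from statistics import NormalDist

def proportion_z_test(p_hat, n, p_null=0.5):
    """Return (variance, sd, z, two-sided p-value) for an observed proportion."""
    variance = p_hat * (1 - p_hat) / n      # variance of the sample proportion
    sd = math.sqrt(variance)
    z = (p_hat - p_null) / sd               # distance from the null value in SD units
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return variance, sd, z, p_value

for n in (20, 30, 40, 50):
    variance, sd, z, p = proportion_z_test(0.65, n)
    print(f"{n:3d}  {variance:.4f}  {sd:.4f}  {z:.2f}  {p:.3f}")
```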

Effect of sample size: a simulation

               True mean: 100, true SD: 15    True mean: 100, true SD: 35
Sample size    Est. mean    SD                Est. mean    SD
10             98.0         11.0              108.9        32.2
50             100.4        13.6              95.3         41.4
100            101.3        14.4              99.1         35.5
200            99.9         15.2              100.3        33.2
500            99.8         15.3              98.9         33.8
1000           99.5         15.1              99.9         35.0
2000           99.7         15.0              99.9         34.7
10000          100.1        15.0              99.9         35.0
100000         100.0        15.0              100.0        35.0
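A simulation of this kind can be sketched as follows. It assumes Normally distributed data and a fixed random seed (neither is stated on the slide), so the printed values will not match the table exactly; the point is only that the estimates settle toward the true values as the sample grows.

```python
import random
import statistics

def simulate(sample_size, true_mean=100.0, true_sd=15.0, seed=0):
    """Draw one Normal sample and return its estimated mean and SD."""
    rng = random.Random(seed)
    sample = [rng.gauss(true_mean, true_sd) for _ in range(sample_size)]
    return statistics.mean(sample), statistics.stdev(sample)

for n in (10, 50, 100, 1000, 10000):
    mean_15, sd_15 = simulate(n, true_sd=15.0)
    mean_35, sd_35 = simulate(n, true_sd=35.0, seed=1)
    print(f"{n:6d}  {mean_15:6.1f} {sd_15:5.1f}   {mean_35:6.1f} {sd_35:5.1f}")
```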

What is required for sample size estimation?

• Parameter (or outcome) of major interest

• Magnitude of difference in the parameter

• Variability of the parameter

• Bound of errors (type I and type II error rates)

Parameter of interest

• Type of measurement of primary interest:
  – Continuous or categorical outcome

• Examples:
  – Proportion: proportion (or probability) of preference for a product
  – Hedonic scale: 0-10
  – Nominal scale

Variability of the parameter of interest

• If the parameter is a continuous variable:
  – What is the standard deviation (SD)?

• If the parameter is a categorical variable:
  – SD can be estimated from the proportion/probability.

Magnitude of difference of interest

• Distinction between practical and statistical relevance.

• Examples:
  – Probability of preference: 85% vs 50%
  – Tasting scores: difference between products by 1 SD.

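The Normal distribution

[Figure: standard Normal curves. Two-sided case: central area 0.95 between -1.96 and 1.96, with 0.025 in each tail. One-sided case: area 0.95 below 1.64, with 0.05 in the upper tail.]

Prob.    Z1      Z2
0.80     0.84    1.28
0.90     1.28    1.64
0.95     1.64    1.96
0.99     2.33    2.81

Z1: one-sided deviate; Z2: two-sided deviate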

The Normal deviates

Alpha    Zα (one-sided)    Zα/2 (two-sided)
0.20     0.84              1.28
0.10     1.28              1.64
0.05     1.64              1.96
0.01     2.33              2.81

Power    Z1-β
0.80     0.84
0.90     1.28
0.95     1.64
0.99     2.33
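Normal deviates can also be computed directly rather than read from a table. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper names are illustrative. Computed values may differ from rounded or tabulated ones, and some entries are worth re-checking this way (for instance, the two-sided deviate for an alpha of 0.01 evaluates to about 2.58).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def z_one_sided(alpha):
    """Deviate leaving probability alpha in the upper tail."""
    return STD_NORMAL.inv_cdf(1 - alpha)

def z_two_sided(alpha):
    """Deviate leaving probability alpha / 2 in each tail."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2)

def z_power(power):
    """Deviate corresponding to the desired power (1 - beta)."""
    return STD_NORMAL.inv_cdf(power)

for alpha in (0.20, 0.10, 0.05, 0.01):
    print(f"alpha={alpha:.2f}  one-sided={z_one_sided(alpha):.2f}  two-sided={z_two_sided(alpha):.2f}")
for power in (0.80, 0.90, 0.95, 0.99):
    print(f"power={power:.2f}  z={z_power(power):.2f}")
```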

Study design and outcome

• Single population

• Two populations

• Continuous measurement

• Categorical outcome

• Correlation

Single group

Sample size for estimating a population mean

• How close to the true mean

• Confidence around the sample mean

• Type I error.

• $N = (Z_{\alpha/2})^2 \sigma^2 / d^2$

σ: standard deviation

d: the accuracy of the estimate (how close to the true mean).

Zα/2: a Normal deviate reflecting the type I error.

• Example: we want to estimate the average weight in a population, and we want the error of estimation to be less than 2 kg of the true mean, with a probability of 95% (i.e., an error rate of 5%).

• $N = (1.96)^2 \sigma^2 / 2^2$

Effect of standard deviation

Std Dev (σ)    Sample size
10             96
12             138
14             188
16             246
18             311
20             384

[Figure: sample size (vertical axis, 0 to 450) plotted against standard deviation (horizontal axis, 0 to 25)]
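The sample sizes for the weight example (d = 2 kg, 95% confidence) follow directly from N = (Zα/2)² σ² / d². A minimal sketch, with `n_for_mean` as an illustrative name; it rounds to the nearest integer to match the tabulated values, although rounding up is the more cautious choice in practice.

```python
from statistics import NormalDist

def n_for_mean(sd, d, confidence=0.95):
    """Sample size to estimate a mean to within d, given the standard deviation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided Normal deviate
    return (z * sd / d) ** 2

for sd in (10, 12, 14, 16, 18, 20):
    print(f"SD = {sd:2d}: N = {round(n_for_mean(sd, d=2))}")
```

Because N grows with the square of the standard deviation, doubling the SD from 10 to 20 quadruples the required sample.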

Sample size for estimating a population proportion

• How close to the true proportion

• Confidence around the sample proportion.

• Type I error.

• $N = (Z_{\alpha/2})^2 \, p(1-p) / d^2$

p: proportion to be estimated.

d: the accuracy of the estimate (how close to the true proportion).

Zα/2: a Normal deviate reflecting the type I error.

• Example: The proportion of preference for product A is around 80%. We want to estimate the preference p in a community within 5% with 95% confidence.

• $N = (1.96)^2 (0.8)(0.2) / 0.05^2 = 246$ consumers.

Effect of accuracy

• Example: The proportion of preference in the general population is around 30%. We want to estimate the preference p in a community within 2% with 95% confidence.

• $N = (1.96)^2 (0.3)(0.7) / 0.02^2 = 2017$ subjects.

[Figure: sample size (vertical axis, 0 to 2500) against a horizontal axis from 0 to 0.1 labeled “Standard deviation”]
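Both proportion examples can be checked with a short function implementing N = (Zα/2)² p(1-p) / d². This is a sketch; `n_for_proportion` is an illustrative name, and the result is rounded up to the next whole subject.

```python
import math
from statistics import NormalDist

def n_for_proportion(p, d, confidence=0.95):
    """Sample size to estimate a proportion p to within d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided Normal deviate
    return z ** 2 * p * (1 - p) / d ** 2

print(math.ceil(n_for_proportion(0.8, 0.05)))  # preference around 80%, within 5%
print(math.ceil(n_for_proportion(0.3, 0.02)))  # preference around 30%, within 2%
```

Tightening the accuracy from 5% to 2% dominates the result, since N scales with 1/d².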

Sample size for estimating a correlation coefficient

• In observational studies that involve estimating a correlation (r) between two variables of interest, say X and Y, a typical hypothesis is of the form:
  – Ho: r = 0 vs H1: r not equal to 0.

• The test statistic is based on Fisher's z transformation, which can be written as:

$$t = \frac{1}{2}\log_e\!\left(\frac{1+r}{1-r}\right)\sqrt{n-3}$$

• Where n is the sample size and r is the observed correlation coefficient.

• It can be shown that, under Ho, t is approximately normally distributed with mean 0 and unit variance, and the sample size needed to detect a statistically significant t can be derived as:

$$N = \frac{(Z_{\alpha} + Z_{1-\beta})^2}{\frac{1}{4}\left[\log_e\!\left(\frac{1+r}{1-r}\right)\right]^2} + 3$$

Sample size for estimating r: example

• Example: According to the literature, the correlation between salt intake and systolic blood pressure is around 0.3. A study is conducted to test the correlation in a population, with the significance level of 1% and power of 90%. The sample size for such a study can be estimated as follows:

$$N = \frac{(2.33 + 1.28)^2}{\frac{1}{4}\left[\log_e\!\left(\frac{1+0.3}{1-0.3}\right)\right]^2} + 3 \approx 139$$

• A sample size of about 139 subjects is required for the study.
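The sample size implied by the Fisher z statistic can be evaluated as N = ((Zα + Z1-β) / z_r)² + 3, where z_r = ½ log_e((1+r)/(1-r)). The sketch below takes the Normal deviates as arguments so that tabulated values can be passed in unchanged; `n_for_correlation` is an illustrative name.

```python
import math

def n_for_correlation(r, z_alpha, z_beta):
    """Sample size to detect a correlation r, based on Fisher's z transformation."""
    fisher_z = 0.5 * math.log((1 + r) / (1 - r))
    return ((z_alpha + z_beta) / fisher_z) ** 2 + 3

# Salt intake example: r = 0.3, deviates 2.33 (alpha) and 1.28 (power of 90%).
n = n_for_correlation(0.3, z_alpha=2.33, z_beta=1.28)
print(f"N = {n:.1f}")
```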

Two-group comparisons

Sample size for difference between two means

• Hypotheses: Ho: μ1 = μ2 vs. Ha: μ1 = μ2 + d

• Let n1 and n2 be the sample sizes for group 1 and 2, respectively; N = n1 + n2; r = n1 / n2; σ: standard deviation of the variable of interest.

• Then, the total sample size is given by:

$$N = \frac{(r+1)\,(Z_{\alpha} + Z_{1-\beta})^2\,\sigma^2}{r\,d^2}$$

• If we let Δ = d / σ be the “effect size”, then:

$$N = \frac{(r+1)\,(Z_{\alpha} + Z_{1-\beta})^2}{r\,\Delta^2}$$

Where Zα and Z1-β are Normal deviates.

• If n1 = n2, power = 0.90, alpha = 0.05 (two-sided), then (Zα + Z1-β)² = (1.96 + 1.28)² = 10.5, and the equation reduces to:

$$N = \frac{21}{\Delta^2}$$

Sample size for two means vs. “effect size”

[Figure: total sample size (N), vertical axis 0 to 2400, plotted against effect size (d / s), horizontal axis 0 to 2, for a power of 80% and a significance level of 5%]
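How steeply the requirement falls with effect size can be sketched numerically. The function below evaluates the expression (r + 1)(Zα + Z1-β)² / (r × effect size²), treating alpha as two-sided (an assumption of this sketch); `two_means_n` is an illustrative name. In the usual textbook derivation, the value obtained for r = 1, namely 2(Zα + Z1-β)² / effect size², is the number needed in each group, so it is worth confirming which convention applies before reading the result as a combined total.

```python
from statistics import NormalDist

def two_means_n(effect_size, alpha=0.05, power=0.80, ratio=1.0):
    """Evaluate (r + 1) * (z_alpha + z_power)^2 / (r * effect_size^2).

    effect_size is d / sigma and ratio is r = n1 / n2.
    alpha is treated as two-sided (assumption).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return (ratio + 1) * (z_alpha + z_power) ** 2 / (ratio * effect_size ** 2)

for es in (0.2, 0.5, 1.0, 2.0):
    print(f"effect size {es:.1f}: power 0.80 -> {two_means_n(es):7.1f}, "
          f"power 0.90 -> {two_means_n(es, power=0.90):7.1f}")
```

Halving the effect size multiplies the requirement by four, which is why small differences are so expensive to detect.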

Sample size for difference between 2 proportions

• Hypotheses: Ho: π1 = π2 vs. Ha: π1 = π2 + d.

• Let p1 and p2 be the sample proportions (i.e., estimates of π1 and π2) for group 1 and group 2. Then, the sample size to test the hypothesis is:

$$n = \frac{\left[Z_{\alpha}\sqrt{2p(1-p)} + Z_{1-\beta}\sqrt{p_1(1-p_1) + p_2(1-p_2)}\right]^2}{(p_1 - p_2)^2}$$

Where: n = sample size for each group; p = (p1 + p2) / 2; Zα and Z1-β are Normal deviates.

A better (more conservative) suggestion for sample size is:

$$n_a = \frac{n}{4}\left[1 + \sqrt{1 + \frac{4}{n\,|p_1 - p_2|}}\right]^2$$

Sample size for difference between 2 prevalences

• For most diseases, the prevalence in the general population is small (e.g. 1 per 1000 subjects). Therefore, a different formulation is required.

• Let p1 and p2 be the prevalence for population 1 and population 2. Then, the sample size to test the hypothesis is:

$$n = \frac{(Z_{\alpha} + Z_{1-\beta})^2}{0.00061\,\left(\arcsin\sqrt{p_1} - \arcsin\sqrt{p_2}\right)^2}$$

Where: n = sample size for each group; Zα and Z1-β are Normal deviates; the arcsine is expressed in degrees.
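The constant 0.00061 matches 2 × (π/180)², which suggests the arcsine in this formula is measured in degrees. The sketch below uses the equivalent form with angles in radians, n = (Zα + Z1-β)² / (2 (arcsin √p1 - arcsin √p2)²). The function name and the example prevalences (1 and 2 per 1000) are illustrative assumptions, not values from the slides.

```python
import math

def n_small_prevalence(p1, p2, z_alpha, z_beta):
    """Per-group sample size from the arcsine formula, with angles in radians."""
    diff = math.asin(math.sqrt(p1)) - math.asin(math.sqrt(p2))
    return (z_alpha + z_beta) ** 2 / (2 * diff ** 2)

# Hypothetical prevalences of 1 and 2 per 1000, deviates 1.96 and 1.28.
n = n_small_prevalence(0.001, 0.002, z_alpha=1.96, z_beta=1.28)
print(f"n per group = {math.ceil(n)}")
```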

Sample size for two proportions: example

• Example: The preference for product A is expected to be 70%, and for product B 60%. A study is planned to show the difference at the significance level of 1% and power of 90%.

• The sample size can be calculated as follows:

  – p1 = 0.6; p2 = 0.7; p = (0.6 + 0.7)/2 = 0.65; Zα = 2.81; Z1-β = 1.28.

  – The sample size required for each group should be:

$$n = \frac{\left[2.81\sqrt{2 \times 0.65 \times 0.35} + 1.28\sqrt{0.6 \times 0.4 + 0.7 \times 0.3}\right]^2}{(0.6 - 0.7)^2} = 759$$

• Adjusted / conservative sample size is:

$$n_a = \frac{759}{4}\left[1 + \sqrt{1 + \frac{4}{759 \times |0.6 - 0.7|}}\right]^2 = 779$$
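The two-proportion calculation and its conservative adjustment can be evaluated step by step. This is a sketch with illustrative function names; the Normal deviates are passed in as arguments so the values used in the example (2.81 and 1.28) are applied as given, and results are rounded up to whole subjects.

```python
import math

def n_two_proportions(p1, p2, z_alpha, z_beta):
    """Unadjusted per-group sample size for comparing two proportions."""
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return (numerator / (p1 - p2)) ** 2

def adjusted_n(n, p1, p2):
    """Conservative adjustment: (n / 4) * [1 + sqrt(1 + 4 / (n * |p1 - p2|))]^2."""
    return n / 4 * (1 + math.sqrt(1 + 4 / (n * abs(p1 - p2)))) ** 2

n = math.ceil(n_two_proportions(0.6, 0.7, z_alpha=2.81, z_beta=1.28))
n_a = math.ceil(adjusted_n(n, 0.6, 0.7))
print(f"unadjusted n per group = {n}, adjusted n per group = {n_a}")
```

The adjustment adds roughly 2 / |p1 - p2| subjects per group, so it matters most when the difference to be detected is small.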

Sample size for two proportions vs. effect size

         Difference from p1 by:
p1       0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8
0.1      424    131    67     41     28     19     14     10
0.2      625    173    82     47     30     20     14     9
0.3      759    198    89     50     30     19     13     8
0.4      825    206    89     47     28     17     .      .
0.5      825    198    82     41     22     .      .      .
0.6      759    173    67     31     .      .      .      .
0.7      625    131    45     .      .      .      .      .
0.8      424    73     .      .      .      .      .      .

Note: these values are “unadjusted” sample sizes

Sample size for estimating an odds ratio

• In a case-control study the data are usually summarized by an odds ratio (OR), rather than the difference between two proportions.

• If p1 and p2 are the proportions of cases and controls, respectively, exposed to a risk factor, then:

$$OR = \frac{p_1(1 - p_2)}{p_2(1 - p_1)}$$

• If we know the proportion of exposure in the general population (p), the total sample size N for estimating an OR is:

$$N = \frac{(1+r)^2\,(Z_{\alpha} + Z_{1-\beta})^2}{r\,(\ln OR)^2\,p(1-p)}$$

• Where r = n1 / n2 is the ratio of sample sizes for group 1 and group 2; p is the prevalence of exposure in the controls; and OR is the hypothetical odds ratio. If n1 = n2 (so that r = 1) then the formula reduces to:

$$N = \frac{4\,(Z_{\alpha} + Z_{1-\beta})^2}{(\ln OR)^2\,p(1-p)}$$

Sample size for an odds ratio: example

• Example: The prevalence of vertebral fracture in a population is 25%. We are interested in estimating the effect of smoking on fracture, with an odds ratio of 2, at the significance level of 5% (one-sided test) and power of 80%.

• The total sample size for the study can be estimated by:

$$N = \frac{4\,(1.64 + 0.85)^2}{(\ln 2)^2 \times 0.25 \times 0.75} = 275$$
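The odds ratio calculation can be sketched as a small function covering both the general case and equal group sizes. The name `n_odds_ratio` is illustrative, and the Normal deviates are passed in as arguments so the values used in the example (1.64 and 0.85) apply as given.

```python
import math

def n_odds_ratio(odds_ratio, p, z_alpha, z_beta, ratio=1.0):
    """Total sample size: (1 + r)^2 (z_a + z_b)^2 / (r (ln OR)^2 p (1 - p)).

    p is the prevalence of exposure in the controls and ratio is r = n1 / n2.
    """
    numerator = (1 + ratio) ** 2 * (z_alpha + z_beta) ** 2
    denominator = ratio * math.log(odds_ratio) ** 2 * p * (1 - p)
    return numerator / denominator

n = n_odds_ratio(2, 0.25, z_alpha=1.64, z_beta=0.85)
print(f"N = {n:.1f}")
```

With unequal groups the factor (1 + r)² / r exceeds 4, so for a fixed total an equal split is the most efficient allocation under this formula.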

Sample size for 2 correlation coefficients

• In detecting a relevant difference between two correlation coefficients r1 and r2 obtained from two independent samples of sizes n1 and n2, respectively, we first need to transform these coefficients into z values as follows:

$$z_1 = 0.5\,\log_e\!\left(\frac{1+r_1}{1-r_1}\right), \qquad z_2 = 0.5\,\log_e\!\left(\frac{1+r_2}{1-r_2}\right)$$

• The total sample size N required to detect the difference between two correlation coefficients r1 and r2, with a significance level of α and power 1-β, can be estimated by:

$$N = \frac{4\,(Z_{\alpha} + Z_{1-\beta})^2}{(z_1 - z_2)^2}$$

Where Zα and Z1-β are Normal deviates.

Sample size for two r’s: example

• The sample size required to detect the difference between r1 = 0.8 and r2 = 0.4 with the significance level of 5% (two-tailed) and power of 90% can be solved as follows:
  – z1 = 0.5 ln ((1+0.8) / (1-0.8)) = 1.098
  – z2 = 0.5 ln ((1+0.4) / (1-0.4)) = 0.424

$$N = \frac{4\,(1.96 + 1.28)^2}{(1.098 - 0.424)^2} = 92$$

• 46 subjects are needed in each group.
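The same calculation can be written with `math.atanh`, which equals 0.5 ln((1 + r) / (1 - r)). This is a sketch; `n_two_correlations` is an illustrative name, and the deviates 1.96 and 1.28 are passed in as used in the worked equation.

```python
import math

def n_two_correlations(r1, r2, z_alpha, z_beta):
    """Total sample size: 4 (z_a + z_b)^2 / (z1 - z2)^2, with z = atanh(r)."""
    z1 = math.atanh(r1)  # Fisher z transformation of r1
    z2 = math.atanh(r2)  # Fisher z transformation of r2
    return 4 * (z_alpha + z_beta) ** 2 / (z1 - z2) ** 2

total = n_two_correlations(0.8, 0.4, z_alpha=1.96, z_beta=1.28)
print(f"total N = {total:.1f}, per group = {total / 2:.1f}")
```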

Some comments

• The formulae presented are theoretical.

• They are all based on the assumption of Normal distribution.

• The estimator [of sample size] has its own variability.

• The calculated sample size is only an approximation.

• Non-response must be allowed for in the calculation.

Computer programs

• Software program for sample size and power evaluation
  – PS (Power and Sample size), from Vanderbilt Medical Center. This can be obtained from me by sending email to ([email protected]). Free.

• On-line calculator:
  – http://ebook.stat.ucla.edu/calculators/powercalc/

