Understanding Statistical Power for Non-Statisticians

General Housekeeping

Use mic & speakers or call in

Webinar users submit questions via chat

Dale W. Usner, Ph.D.

20 years in industry

50+ FDA and international regulatory body interactions

Frequent study design support

Therapeutic Expertise: Anti-viral/Anti-infective, Cardiovascular, Gastrointestinal, Oncology/Immunology, Ophthalmology, Surgical, Other

Agenda

What is Statistical Power
How Assumptions Affect Statistical Power and Sample Size
How Power Is Associated With What is Statistically Significant
Q&A

What is Statistical Power?

Clinical/Medical: If I want to compare our Test Product to the Control Product in systolic blood pressure, what sample size will I need?

Statistician: Assuming a Difference in Means of 10 mm Hg and a common Standard Deviation (SD) of 20 mm Hg, 64 subjects per treatment group are required to have 80% power for a 2-sided alpha = 0.05 test.

What is Statistical Power?

Next?

What is Statistical Power? Questions

What does it mean to have 80% power?

What does it mean to assume a difference in means of 10 mm Hg? What if the true difference is <10? >10?

What role does the SD play? What if the true SD is >20? What if the true SD is <20?

What happens if the difference observed in the study is <10 mm Hg?

Motivation

Assuming a Difference in Means of 10 mm Hg and a common SD of 20 mm Hg, N = 64 subjects per treatment group are required to have 80% power for a 2-sided alpha = 0.05 test.

The study will demonstrate a statistically significant result if the difference observed in the study is ≥7.0 mm Hg (assuming the observed SD is 20 mm Hg).
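The threshold quoted above can be approximated with a few lines of Python. This is a minimal sketch, assuming a known common SD and a normal (z) critical value; the function name `min_significant_difference` is illustrative and not from the presentation. A t-based calculation has a slightly larger critical value, so it lands marginally higher than this approximation.

```python
from math import sqrt
from statistics import NormalDist


def min_significant_difference(sd, n_per_group, alpha=0.05):
    """Smallest observed mean difference that reaches 2-sided significance.

    Assumption: normal approximation with a known common SD and equal
    group sizes. A t-test would use a slightly larger critical value.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    standard_error = sd * sqrt(2 / n_per_group)    # SE of the difference in means
    return z_crit * standard_error


threshold = min_significant_difference(sd=20, n_per_group=64)
print(f"Minimum significant observed difference: {threshold:.2f} mm Hg")
```

With SD = 20 and 64 subjects per group this comes out at about 6.9 mm Hg, close to the 7.0 mm Hg on the slide.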

Motivation

With 85% Power (N = 73 subjects / Tx Gp) observed differences ≥6.55 mm Hg would yield statistical significance

With 90% Power (N = 86 subjects / Tx Gp) observed differences ≥6.03 mm Hg would yield statistical significance

Background

Statistical Inference

Statistical Inference: Drawing conclusions about an entire population based on a sample from that population.

Hypotheses (Efficacy)

Superiority
H0: Test Arm is no different from Control Arm
H1: Test Arm is different from (superior to) Control Arm

Non-Inferiority
H0: Test Arm is inferior to Control Arm
H1: Test Arm is non-inferior to Control Arm

Desired Outcome: Reject H0 in favor of H1

Statistical Inference: Coin Flip

H0: Proportion of Heads = Proportion of Tails = 0.50
H1: Proportion of Heads > Proportion of Tails

Flip a coin 4 times with result 3H and 1T. 75% Heads, should H0 be rejected?

No, the probability of 3 or more heads in 4 flips under H0 is 31.25%.

Flip a coin 40 times with result 30H and 10T. 75% Heads, should H0 be rejected?

Yes, the probability of 30 or more heads in 40 flips under H0 is about 0.1%.
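The two coin-flip probabilities can be checked directly from the binomial distribution. The sketch below assumes the relevant quantity is the one-sided tail probability (that many heads or more) under a fair coin; `prob_at_least` is an illustrative helper name.

```python
from math import comb


def prob_at_least(heads, flips, p_heads=0.5):
    """P(X >= heads) for X ~ Binomial(flips, p_heads)."""
    return sum(
        comb(flips, k) * p_heads**k * (1 - p_heads) ** (flips - k)
        for k in range(heads, flips + 1)
    )


p_small = prob_at_least(3, 4)     # 3 or more heads in 4 flips
p_large = prob_at_least(30, 40)   # 30 or more heads in 40 flips
print(f"4 flips:  {p_small:.4f}")   # 0.3125
print(f"40 flips: {p_large:.4f}")   # 0.0011
```

The observed proportion is 75% heads in both cases; only the larger sample makes that result implausible under H0.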

Define Alpha and Power

Alpha (Type I Error) is the probability that the study concludes the Test Arm is different from the Control Arm when the Test Arm truly is no different. This is a regulatory risk.

Power is the probability that the study concludes the Test Arm is different from the Control Arm when the Test Arm truly is different. Lack of power (a true effect going undetected) is a sponsor risk.

Continuous & Binary Measures

Continuous Measures
Generally testing differences in Means or Medians
Standard Deviation also very important

Binary Measures
Generally testing differences or ratios of proportions
Standard Deviations generally determined by assumed proportions

Power in Pictures

Consider a therapy designed to lower systolic blood pressure (SysBP) by an additional 10 mm Hg compared with the currently best-selling therapy, which has been shown to lower SysBP to an average of 140 mm Hg.

Assume that each treatment has an SD of 20 mm Hg.

Assume the data follow normal distributions.

Power in Pictures: Sample Size = 1

Power in Pictures: Sample Size = 5, Power: 10%
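As a rough numerical companion to the pictures, power can be computed from the overlap of the two sampling distributions. This is a sketch under the blood pressure assumptions stated earlier (difference 10 mm Hg, SD 20 mm Hg, normal data), using a normal approximation with known SD; `power_two_sample` is an illustrative name. For very small samples a t-based calculation gives somewhat lower power, which is consistent with the 10% shown for a sample size of 5.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(diff, sd, n_per_group, alpha=0.05):
    """Approximate power of a 2-sided, two-group comparison of means.

    Assumption: z-test with known common SD and equal group sizes.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = diff / (sd * sqrt(2 / n_per_group))  # true difference in SE units
    # Probability the test statistic falls beyond either critical value
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


for n in (1, 5, 20, 64):
    print(f"n per group = {n:>2}: power = {power_two_sample(10, 20, n):.1%}")
```

Under this approximation, power is roughly 12% at 5 subjects per group and roughly 81% at 64 per group.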

Standard Powers

80%: If the test product is as efficacious as assumed under H1, 80% of trials should reject H0 in favor of H1 by design. Generally considered to be the lowest targeted power.

85%: 85% of trials should reject H0 in favor of H1 by design.

90%: 90% of trials should reject H0 in favor of H1 by design.

Sponsor Risk

80%: If the test product is as efficacious as assumed under H1, 20% of trials will fail to reject H0 in favor of H1 by design.

85%: 15% of trials will fail to reject H0 in favor of H1 by design.

90%: 10% of trials will fail to reject H0 in favor of H1 by design.
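The "80% of trials" reading of power can be illustrated by simulating many hypothetical trials. The sketch below is an assumption-laden illustration, not the presenter's method: it draws normal data with a control mean of 140 mm Hg and a test mean of 130 mm Hg, uses a z-test with the SD treated as known, and fixes a random seed so the run is repeatable. `simulate_power` is an illustrative name.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulate_power(diff, sd, n_per_group, alpha=0.05, trials=2000, seed=1):
    """Fraction of simulated trials that reject H0 (2-sided z-test, known SD)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sd * sqrt(2 / n_per_group)
    rejections = 0
    for _ in range(trials):
        # Control arm averages 140 mm Hg; test arm is lower by `diff`
        control = [rng.gauss(140, sd) for _ in range(n_per_group)]
        test = [rng.gauss(140 - diff, sd) for _ in range(n_per_group)]
        z = (fmean(control) - fmean(test)) / standard_error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


success_rate = simulate_power(diff=10, sd=20, n_per_group=64)
false_positive_rate = simulate_power(diff=0, sd=20, n_per_group=64)
print(f"Trials rejecting H0 when the true difference is 10: {success_rate:.1%}")
print(f"Trials rejecting H0 when the true difference is 0:  {false_positive_rate:.1%}")
```

The first rate should land near 80% (the power) and the second near 5% (the alpha), with some simulation noise.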

How Assumptions Affect Statistical Power and Sample Size

Sample Size x Assumed Difference x Power

[Figure: Total Sample Size (y-axis, 0 to 500) versus Assumed Difference (x-axis, 6 to 14), with curves for 80%, 85%, and 90% Power. SD = 20, 2-sided alpha = 0.05.]

% Increase in Sample Size (N) for Increasing Power

Assuming a 2-sided alpha = 0.05 test, N increases by:
  ~14% from 80% to 85% power
  ~34% from 80% to 90% power

Assuming a 2-sided alpha = 0.10 test, N increases by:
  ~16% from 80% to 85% power
  ~38% from 80% to 90% power

Assuming a 2-sided alpha = 0.20 test, N increases by:
  ~19% from 80% to 85% power
  ~46% from 80% to 90% power
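These percentages follow from the standard normal-approximation sample size formula, n per group = 2 * ((z_alpha + z_power) * SD / difference)^2. The sketch below uses that formula as an assumption; the presentation does not say which software or formula produced its numbers, and the function names are illustrative. The approximation gives 63 per group for the running example, one fewer than the 64 quoted, which is what a t-based calculation would typically add.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(diff, sd, power, alpha=0.05):
    """Unrounded per-group sample size (normal approximation, 1:1 allocation)."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    z_power = nd.inv_cdf(power)
    return 2 * ((z_alpha + z_power) * sd / diff) ** 2


def pct_increase(alpha, new_power, base_power=0.80):
    """Percent increase in N when moving from base_power to new_power."""
    base = n_per_group(10, 20, base_power, alpha)
    return 100 * (n_per_group(10, 20, new_power, alpha) / base - 1)


print("n per group at 80% power:", ceil(n_per_group(10, 20, 0.80)))
for alpha in (0.05, 0.10, 0.20):
    for power in (0.85, 0.90):
        print(f"alpha = {alpha:.2f}, 80% -> {power:.0%}: "
              f"+{pct_increase(alpha, power):.0f}%")
```

The percent increase does not depend on the assumed difference or SD, because both cancel in the ratio.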

Sample Size x Assumed Difference x Standard Deviation

[Figure: Total Sample Size (y-axis, 0 to 600) versus Assumed Difference (x-axis, 6 to 14), with curves for SD = 16, 18, 20, 22, and 24. 80% Power, 2-sided alpha = 0.05.]

% Increase in Sample Size (N) for Increasing Standard Deviation

Regardless of the alpha and power, N increases by a factor of (SDhigh / SDlow)^2
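A quick illustration of the squared-ratio rule, taking SD = 20 and the 64 subjects per group from the running example as the reference point (the helper name `sd_inflation_factor` is illustrative):

```python
def sd_inflation_factor(sd_low, sd_high):
    """Factor by which N changes when the SD moves from sd_low to sd_high."""
    return (sd_high / sd_low) ** 2


reference_sd = 20
reference_n = 64  # per treatment group, from the running example

for sd in (16, 18, 20, 22, 24):
    factor = sd_inflation_factor(reference_sd, sd)
    print(f"SD = {sd}: factor = {factor:.2f}, "
          f"approx. n per group = {reference_n * factor:.0f}")
```

For instance, an SD of 24 instead of 20 inflates the sample size by a factor of 1.44, while an SD of 16 reduces it to 0.64 of the reference.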

Sample Size x Assumed Difference x Alpha (Type I Error)

[Figure: Total Sample Size (y-axis, 0 to 400) versus Assumed Difference (x-axis, 6 to 14), with curves for 2-sided alpha = 0.20, 0.10, and 0.05. SD = 20, 80% Power.]

% Decrease in Sample Size (N) for Increasing Alpha (Type I Error)

Assuming 80% power, N decreases by:
  ~21% from 2-sided alpha = 0.05 to 0.10
  ~42% from 2-sided alpha = 0.05 to 0.20
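The effect of relaxing alpha can be sketched the same way. Under the normal approximation, N is proportional to (z_alpha + z_power)^2, so only that term is needed to compare alphas. The function names are illustrative, and the approximation is an assumption rather than the presenter's stated method.

```python
from statistics import NormalDist


def relative_n(alpha, power=0.80):
    """Quantity proportional to N: (z_{1-alpha/2} + z_{power})^2."""
    nd = NormalDist()
    return (nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)) ** 2


def pct_decrease(alpha_from, alpha_to, power=0.80):
    """Percent decrease in N when alpha is relaxed from alpha_from to alpha_to."""
    return 100 * (1 - relative_n(alpha_to, power) / relative_n(alpha_from, power))


for alpha_to in (0.10, 0.20):
    print(f"alpha 0.05 -> {alpha_to:.2f}: N decreases by "
          f"{pct_decrease(0.05, alpha_to):.1f}%")
```

The results agree with the slide's figures to within about a percentage point; small gaps are expected from rounding N to whole subjects and from t-based calculations.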



Sample Size x Assumed Difference x Power and Ratio of Randomization

[Figure: Total Sample Size (y-axis, 0 to 600) versus Assumed Difference (x-axis, 6 to 14), with curves for 80% Power, 80% Power 2:1, 85% Power, 85% Power 2:1, and 90% Power. SD = 20, 2-sided alpha = 0.05.]

% Increase in Sample Size (N) for Increasing Randomization Ratio

Regardless of the alpha and power, N increases by:
  ~4.2% from 1:1 to 3:2 randomization ratio
  ~12.5% from 1:1 to 2:1 randomization ratio
  ~33.3% from 1:1 to 3:1 randomization ratio
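These inflation figures can be reproduced from the variance of a difference in means. With total sample size N split k:1, the variance is proportional to 1/n1 + 1/n2 = (1 + k)^2 / (k * N), compared with 4 / N for 1:1, so the total N must grow by (1 + k)^2 / (4k) to keep the same precision. A minimal sketch, assuming equal SDs in both arms (`total_n_inflation` is an illustrative name):

```python
def total_n_inflation(k):
    """Factor by which total N grows for k:1 allocation relative to 1:1.

    Assumption: equal SDs in both arms, same alpha, power, and difference.
    """
    return (1 + k) ** 2 / (4 * k)


for label, k in (("3:2", 3 / 2), ("2:1", 2), ("3:1", 3)):
    increase = 100 * (total_n_inflation(k) - 1)
    print(f"1:1 -> {label}: total N increases by {increase:.1f}%")
```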

% Decrease in Min Obs Diff Required for Significance with Increasing N

Regardless of the alpha, the Minimum Observed Difference required for Significance decreases by a factor of sqrt(Nlow / Nhigh)

With 80% Power, N = 64 / Tx Group: required diff is 7.00 mm Hg
With 85% Power, N = 73 / Tx Group: required diff is (7.00*sqrt(64/73)) = 6.55 mm Hg
With 90% Power, N = 86 / Tx Group: required diff is (7.00*sqrt(64/86)) = 6.03 mm Hg
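The square-root scaling is easy to apply to other sample sizes. The sketch below takes the 7.00 mm Hg threshold at N = 64 per group as given and rescales it; `rescale_min_difference` is an illustrative name. The rule treats the critical value as fixed, so it is an approximation to a full t-based recalculation.

```python
from math import sqrt


def rescale_min_difference(base_diff, n_low, n_high):
    """Scale the minimum significant observed difference from n_low to n_high."""
    return base_diff * sqrt(n_low / n_high)


base_diff = 7.00  # mm Hg, threshold at 64 subjects per treatment group
for n in (73, 86, 128):
    scaled = rescale_min_difference(base_diff, 64, n)
    print(f"N = {n:>3} / Tx Group: required diff is about {scaled:.2f} mm Hg")
```

The values for 73 and 86 subjects per group agree with the slide to within rounding. Doubling the sample size (64 to 128) lowers the threshold by a factor of sqrt(1/2), about 29%, not by half.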

Effect on Statistical Significance

Increasing Sample Size decreases the minimum difference required to show statistical significance (driven by H0) and therefore increases power.

Required sample size increases with:
  Increasing: Power, SD, Randomization Ratio
  Decreasing: Alpha, Assumed Difference

Thank You

[email protected] www.sdcclinical.com

Q&A
