Post on 27-Mar-2015
BAYESIAN ADAPTIVE DESIGN
& INTERIM ANALYSIS
Donald A. Berry <dberry@mdanderson.org>

Some references
Berry DA (2003). Statistical Innovations in Cancer Research. In Cancer Medicine e6. Ch 33. BC Decker. (Ed: Holland J, Frei T et al.)
Berry DA (2004). Bayesian statistics and the efficiency and ethics of clinical trials. Statistical Science.

Benefits
Adapting; examples:
  Stop early (or late!)
  Change doses
  Add arms
  Drop arms
Final analysis:
  Greater precision (even full follow-up)
  Earlier conclusions

Goals
Learn faster: More efficient trials
More efficient drug/device development
Better treatment of patients in clinical trials

OUTLINE: EXAMPLES
Extraim analysis
Modeling early endpoints
Seamless Phase II/III trial
Adaptive randomization
  Phase II trial in AML
  Phase II drug screening process
  Phase III trial

EXTRAIM ANALYSES*
Endpoint: CR (detect 0.42 vs 0.32)
80% power: N = 800
Two extraim analyses, one at 800
Another after up to 300 added pts
Maximum n = 1400 (only rarely)
Accrual: 70/month
Delay in assessing response
*Modeling due to Scott Berry <scott@berryconsultants.com>

After 800 pts accrued, have response info on 450 pts
Find predictive probability (pred prob) of statistical significance when full info on 800 pts available
Also when full info on 1400
Continue if . . .
Stop if . . .
If continue, n via pred prob
Repeat at 2nd extraim analysis
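The predictive-probability step above can be sketched as a small simulation. This is a minimal illustration, not the modeling used for the slide: the uniform Beta(1, 1) priors, the one-sided pooled z-test at 1.96 as the final analysis, the equal split of patients between arms, and the interim counts in the usage line are all assumptions made for the example.

```python
import math
import random


def z_test_significant(x_e, n_e, x_c, n_c, z_crit=1.96):
    """Pooled two-proportion z-test; True if E beats C (one-sided, assumed cutoff)."""
    p_pool = (x_e + x_c) / (n_e + n_c)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_e + 1 / n_c))
    if se == 0:
        return False
    return (x_e / n_e - x_c / n_c) / se > z_crit


def predictive_prob(x_e, n_e, x_c, n_c, n_final_per_arm, sims=2000, seed=0):
    """Predictive probability that the final test is significant.

    x_*, n_*: responses and patients with known outcome so far, per arm.
    n_final_per_arm: patients per arm once all outcomes are in.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        # Draw response rates from the posteriors (Beta(1, 1) priors: assumption).
        p_e = rng.betavariate(1 + x_e, 1 + n_e - x_e)
        p_c = rng.betavariate(1 + x_c, 1 + n_c - x_c)
        # Simulate the outcomes that have not been observed yet.
        fut_e = sum(rng.random() < p_e for _ in range(n_final_per_arm - n_e))
        fut_c = sum(rng.random() < p_c for _ in range(n_final_per_arm - n_c))
        if z_test_significant(x_e + fut_e, n_final_per_arm,
                              x_c + fut_c, n_final_per_arm):
            hits += 1
    return hits / sims


if __name__ == "__main__":
    # Hypothetical interim: outcomes known for 225 of 400 patients per arm.
    print(predictive_prob(95, 225, 72, 225, n_final_per_arm=400))
```

Calling the function a second time with a larger `n_final_per_arm` gives the corresponding predictive probability for an extended trial, which is the quantity a continue/stop rule would compare.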
Table 1: p0=0.42
p1     P(succ)  meanSS   sdSS    P(800)   P(1400)  P(succ1)  P(succ2)
0.37   0.0001   844.6    122.0   0.8707   0.0194   0.0001    0.0001
0.42   0.0243   1011.2   247.6   0.5324   0.2360   0.0084    0.0059
0.47   0.4467   1188.5   254.5   0.2568   0.5484   0.1052    0.0914
0.52   0.9389   1049.9   248.7   0.4435   0.2693   0.4217    0.2590
0.57   0.9989   874.2    149.1   0.7849   0.0268   0.7841    0.1729

Table 2: p0=0.32
p1     P(succ)  meanSS   sdSS    P(800)   P(1400)  P(succ1)  P(succ2)
0.27   0.0001   836.5    111.1   0.8937   0.0152   0.0005    0.0000
0.32   0.0284   1013.1   246.3   0.5238   0.2338   0.0094    0.0083
0.37   0.4757   1186.6   252.0   0.2513   0.5339   0.1083    0.1044
0.42   0.9545   1045.5   245.9   0.4485   0.2449   0.4316    0.2505
0.47   0.9989   922.7    181.0   0.6632   0.0258   0.6632    0.2111

Table 3: p0=0.22
p1     P(succ)  meanSS   sdSS    P(800)   P(1400)  P(succ1)  P(succ2)
0.17   0.0000   827.7    95.3    0.9163   0.0086   0.0000    0.0000
0.22   0.0288   1013.3   246.6   0.5242   0.2340   0.0090    0.0062
0.27   0.5484   1199.0   246.3   0.2313   0.5392   0.1089    0.1063
0.32   0.9749   1074.4   234.8   0.3702   0.2030   0.3577    0.2065
0.37   0.9995   1024.7   205.4   0.4121   0.0508   0.3977    0.1685
vs 0.80

MODELING EARLY ENDPOINTS: LONGITUDINAL MARKERS
Example: CA125 in ovarian cancer
Use available data from trial (& outside of trial) to model relationship over time with survival, depending on Rx
Predictive distributions
Use covariates
Seamless phases II & III

CA125 data & predictive distributions of survival for two of many patients*
*Modeling due to Scott Berry <scott@berryconsultants.com>
[Figure: panels for Patient #1 and Patient #2; horizontal axis in days; treatment marked]

Methods
Analytical
Multiple imputation

SEAMLESS PHASES II/III*
Early endpoint (tumor response, biomarker) may predict survival?
May depend on treatment
Should model the possibilities
Primary endpoint: survival
But observe relationships
*Inoue, et al (2002 Biometrics)

[Diagram: conventional drug development compared with seamless phase 2/3]
Conventional drug development: Phase 2, then Phase 3; timeline labels 6 mos, 9-12 mos, > 2 yrs
Seamless phase 2/3: < 2 yrs (usually)
Phase 2 labels: Good resp; No resp; Stop
Phase 3 labels: Survival advantage; No survival advantage; Market; Not

Seamless phases
Phase 2: 1 or 2 centers; 10 pts/mo, randomize E vs C
If pred probs “look good,” expand to Phase 3: Many centers; 50 pts/mo (Initial centers continue accrual)
Max n = 900
[Single trial: survival data combined in final analysis]

Early stopping
Use pred probs of stat sig
Frequent analyses (total of 18) using pred probs to:
  Switch to Phase 3
  Stop accrual for
    Futility
    Efficacy
  Submit NDA

Comparisons
Conventional Phase 3 designs: Conv4 & Conv18, max N = 900 (same power as adaptive design)

Expected N under H0
[Bar chart; vertical axis 0 to 1000]
Bayes: 431
Conv4: 855
Conv18: 884

Expected N under H1
[Bar chart; vertical axis 0 to 1000]
Bayes: 649
Conv4: 887
Conv18: 888

Benefits
Duration of drug development is greatly shortened under adaptive design:
  Fewer patients in trial
  No hiatus for setting up phase 3
  All patients used for
    Phase 3 endpoint
    Relation between response & survival

Possibility of large N
N seldom near 900
When it is, it’s necessary!
This possibility gives Bayesian design its edge
[Other reason for edge is modeling response/survival]

ADAPTIVE RANDOMIZATION
Giles, et al JCO (2003)
Troxacitabine (T) in acute myeloid leukemia (AML) combined with cytarabine (A) or idarubicin (I)
Adaptive randomization to: IA vs TA vs TI
Max n = 75
End point: Time to CR (< 50 days)

Adaptive Randomization
Assign 1/3 to IA (standard) throughout (until only 2 arms)
Adaptive to TA and TI based on current results
Results

Patient  Prob IA  Prob TA  Prob TI  Arm  CR<50
1        0.33     0.33     0.33     TI   not
2        0.33     0.34     0.32     IA   CR
3        0.33     0.35     0.32     TI   not
4        0.33     0.37     0.30     IA   not
5        0.33     0.38     0.28     IA   not
6        0.33     0.39     0.28     IA   CR
7        0.33     0.39     0.27     IA   not
8        0.33     0.44     0.23     TI   not
9        0.33     0.47     0.20     TI   not
10       0.33     0.43     0.24     TA   CR
11       0.33     0.50     0.17     TA   not
12       0.33     0.50     0.17     TA   not
13       0.33     0.47     0.20     TA   not
14       0.33     0.57     0.10     TI   not
15       0.33     0.57     0.10     TA   CR
16       0.33     0.56     0.11     IA   not
17       0.33     0.56     0.11     TA   CR
18       0.33     0.55     0.11     TA   not
19       0.33     0.54     0.13     TA   not
20       0.33     0.53     0.14     IA   CR
21       0.33     0.49     0.18     IA   CR
22       0.33     0.46     0.21     IA   CR
23       0.33     0.58     0.09     IA   CR
24       0.33     0.59     0.07     IA   CR
25       0.87     0.13     0        IA   not
26       0.87     0.13     0        TA   not
27       0.96     0.04     0        TA   not
28       0.96     0.04     0        IA   CR
29       0.96     0.04     0        IA   not
30       0.96     0.04     0        IA   CR
31       0.96     0.04     0        IA   not
32       0.96     0.04     0        TA   not
33       0.96     0.04     0        IA   not
34       0.96     0.04     0        IA   CR
Annotations on slide: Drop TI; Compare n = 75

Summary of results
CR < 50 days:
  IA: 10/18 = 56%
  TA: 3/11 = 27%
  TI: 0/5 = 0%
Criticisms . . .

SCREENING PHASE II DRUGS
Many drugs
Tumor response
Goals:
  Treat effectively
  Learn quickly

Standard designs
One drug (or dose) at a time; no drug/dose comparisons
Typical comparison by null hypothesis: RR = 20%
Progress hopelessly slow!

Standard 2-stage design
First stage 20 patients:
  Stop if ≤ 4 or ≥ 9 responses
  Else second set of 20
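The first-stage rule is plain binomial arithmetic, so its behavior at any true response rate can be computed exactly. The sketch below is illustrative only; the function names are made up for this example, and it evaluates just the first-stage rule stated on the slide (the slide gives no decision rule for the end of the second stage).

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k responses among n patients at response rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_stage_summary(p, n1=20, low=4, high=9, n2=20):
    """Return (P(stop with <= low), P(stop with >= high), expected sample size)."""
    p_stop_low = sum(binom_pmf(k, n1, p) for k in range(0, low + 1))
    p_stop_high = sum(binom_pmf(k, n1, p) for k in range(high, n1 + 1))
    p_continue = 1 - p_stop_low - p_stop_high
    return p_stop_low, p_stop_high, n1 + n2 * p_continue


if __name__ == "__main__":
    for rate in (0.20, 0.40, 0.60):
        low_stop, high_stop, expected_n = two_stage_summary(rate)
        print(f"RR={rate:.2f}  stop low={low_stop:.3f}  "
              f"stop high={high_stop:.3f}  E[n]={expected_n:.1f}")
```

At the null rate of 20% the design stops after the first 20 patients a little under two thirds of the time, and every drug still costs at least 20 patients regardless of how it compares with the others.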

An adaptive allocation
When assigning next patient, find r = P(rate ≥ 20%|data) for each drug
Assign drugs in proportion to r
Add drugs as they become available
Drop drugs that have small r
Drugs with large r go to phase 3
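The allocation rule can be sketched in a few lines. The slide does not state the prior or the cutoff for "small r", so this example assumes a uniform Beta(1, 1) prior on each drug's response rate and a hypothetical `drop_below` threshold; the drug names and counts are invented. With a uniform prior the posterior is Beta(x + 1, n - x + 1), and its upper tail equals a binomial sum, which keeps the calculation exact.

```python
from math import comb


def prob_rate_at_least(x, n, threshold=0.20):
    """r = P(rate >= threshold | x responses in n patients), Beta(1, 1) prior (assumed).

    Identity used: the Beta(x + 1, n - x + 1) upper tail at t equals
    P(Binomial(n + 1, t) <= x).
    """
    m = n + 1
    return sum(comb(m, k) * threshold**k * (1 - threshold) ** (m - k)
               for k in range(x + 1))


def allocation_probs(data, drop_below=0.05):
    """Assignment probabilities proportional to r, after dropping drugs with small r.

    data: {drug: (responses, patients)}; drop_below is a hypothetical cutoff.
    """
    r = {drug: prob_rate_at_least(x, n) for drug, (x, n) in data.items()}
    active = {drug: value for drug, value in r.items() if value >= drop_below}
    total = sum(active.values())
    if total == 0:
        return {}
    return {drug: value / total for drug, value in active.items()}


if __name__ == "__main__":
    # Hypothetical data: drug -> (responses, patients treated so far).
    print(allocation_probs({"A": (6, 10), "B": (2, 10), "C": (0, 20)}))
```

A newly added drug with no data gets r = 0.8 under this prior, so it enters with a substantial share of assignments and loses it only if its results are poor.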

Suppose 10 drugs, 200 patients
9 drugs have mix of RRs 20% & 40%, 1 has 60% (“nugget”)
Identify nugget …
  With probability: Standard <70%; Adaptive >99%
  In average n: Standard 110; Adaptive 50
Adaptive also better at finding “40%”, & sooner

Suppose 100 drugs, 2000 patients
99 drugs have mix of RRs 20% & 40%, 1 has 60% (“nugget”)
Identify nugget …
  With probability: Standard <70%; Adaptive >99%
  In average n: Standard 1100; Adaptive 500
Adaptive also better at finding “40%”, & sooner

Consequences
Treat pts in trial effectively
Learn quickly
Attractive to patients, in and out of the trial
Better drugs identified sooner; move through faster

PHASE III TRIAL
Dichotomous endpoint
Q = P(pE > pS|data)
Min n = 150; Max n = 600
After n = 50, assign to arm E with probability Q
Except that 0.2 ≤ P(assign E) ≤ 0.8
(Not “optimal,” but …)
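A minimal sketch of the assignment rule, under stated assumptions: Q is estimated by Monte Carlo from independent Beta posteriors, using the Beta(1.275, 2.975) prior that appears later in this deck; balanced 50/50 assignment for the first 50 patients is assumed, since the slide says only that adaptation begins after n = 50. The function names are illustrative.

```python
import random

# Common prior for pE and pS, taken from the prior-density slide in this deck.
PRIOR_A, PRIOR_B = 1.275, 2.975


def prob_e_better(x_e, n_e, x_s, n_s, sims=20000, seed=0):
    """Monte Carlo estimate of Q = P(pE > pS | data) with independent Beta posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(sims):
        p_e = rng.betavariate(PRIOR_A + x_e, PRIOR_B + n_e - x_e)
        p_s = rng.betavariate(PRIOR_A + x_s, PRIOR_B + n_s - x_s)
        wins += p_e > p_s
    return wins / sims


def assign_prob_e(q, n_enrolled, burn_in=50, lo=0.2, hi=0.8):
    """Probability that the next patient is assigned to arm E."""
    if n_enrolled < burn_in:
        return 0.5  # assumption: balanced assignment before adaptation starts
    return min(hi, max(lo, q))  # Q clipped to [0.2, 0.8]


if __name__ == "__main__":
    q = prob_e_better(x_e=12, n_e=20, x_s=8, n_s=20)
    print(q, assign_prob_e(q, n_enrolled=60))
```

The clipping keeps at least 20% of assignments on each arm, so neither arm stops accumulating data even when Q is extreme.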

Recommendation to DSMB to
  Stop for superiority if Q ≥ 0.99
  Stop accrual for futility if P(pE – pS < 0.10|data) > PF
  PF depends on current n . . .
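The two stopping checks can be sketched as a single function. This is an illustration under assumptions, not the trial's software: both probabilities are Monte Carlo estimates from independent Beta posteriors with the Beta(1.275, 2.975) prior given later in this deck, and PF is passed in by the caller because the slide does not give its value as a function of n. The minimum and maximum sample sizes are not handled here.

```python
import random


def stopping_recommendation(x_e, n_e, x_s, n_s, pf,
                            prior=(1.275, 2.975), sims=20000, seed=0):
    """Return 'stop for superiority', 'stop accrual for futility', or 'continue'.

    pf: futility threshold PF for the current n (supplied by the caller).
    """
    rng = random.Random(seed)
    a, b = prior
    better = 0
    small_gain = 0
    for _ in range(sims):
        p_e = rng.betavariate(a + x_e, b + n_e - x_e)
        p_s = rng.betavariate(a + x_s, b + n_s - x_s)
        better += p_e > p_s
        small_gain += (p_e - p_s) < 0.10
    q = better / sims               # estimate of Q = P(pE > pS | data)
    p_futility = small_gain / sims  # estimate of P(pE - pS < 0.10 | data)
    if q >= 0.99:
        return "stop for superiority"
    if p_futility > pf:
        return "stop accrual for futility"
    return "continue"


if __name__ == "__main__":
    # Hypothetical interim data and a hypothetical PF value.
    print(stopping_recommendation(40, 100, 30, 100, pf=0.90))
```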

[Figure: Futility stopping boundary; PF plotted against n from 0 to 600; vertical axis 0.0 to 1.0, with 0.75 and 0.95 marked]

Common prior density for pE & pS
Independent
Reasonably non-informative
Mean = 0.30
SD = 0.20

[Figure: Beta(1.275, 2.975) density; horizontal axis p from 0 to 1]
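The Beta(1.275, 2.975) parameters follow from the stated mean and SD by matching moments: for a Beta(a, b) density, the mean is a/(a+b) and the variance is mean(1 - mean)/(a + b + 1). A short check (the function name is illustrative):

```python
def beta_from_mean_sd(mean, sd):
    """Beta(a, b) parameters with the given mean and standard deviation."""
    total = mean * (1 - mean) / sd**2 - 1  # a + b
    return mean * total, (1 - mean) * total


if __name__ == "__main__":
    a, b = beta_from_mean_sd(0.30, 0.20)
    print(round(a, 3), round(b, 3))
```

Here a + b = 0.21/0.04 - 1 = 4.25, so the prior carries about as much weight as four patients' worth of data, which is why it is "reasonably non-informative".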

Updating
After 20 patients on each arm
8/20 responses on arm 1
12/20 responses on arm 2

[Figure: posterior densities Beta(9.275, 14.975) and Beta(13.275, 10.975); horizontal axis p from 0 to 1]
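The two posteriors come from the conjugate update: responses are added to the first Beta parameter and non-responses to the second. A short check against the numbers above (function names are illustrative):

```python
def update_beta(prior, responses, n):
    """Posterior Beta parameters after observing `responses` out of `n` patients."""
    a, b = prior
    return a + responses, b + (n - responses)


def beta_mean(params):
    """Mean of a Beta(a, b) distribution."""
    a, b = params
    return a / (a + b)


if __name__ == "__main__":
    prior = (1.275, 2.975)
    arm1 = update_beta(prior, 8, 20)    # 8/20 responses
    arm2 = update_beta(prior, 12, 20)   # 12/20 responses
    print(arm1, round(beta_mean(arm1), 3))
    print(arm2, round(beta_mean(arm2), 3))
```

The posterior means sit slightly below the observed rates of 0.40 and 0.60 because the prior mean of 0.30 pulls both estimates down a little.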

Assumptions
Accrual: 10/month
50-day delay to assess response

Need to stratify. But how?
Suppose the probability of assignment to the experimental arm is 30%, with these data . . .

Proportions of Patients on Experimental Arm by Strata
                 Stratum 2
Stratum 1        Small          Big
Small            6/20 (30%)     10/20 (50%)
Big              6/10 (60%)     2/10 (20%)

Probability of Being Assigned to Experimental Arm for Above Example
                 Stratum 2
Stratum 1        Small          Big
Small            37%            24%
Big              19%            44%

One simulation; pS = 0.30, pE = 0.45
[Figure: Proportion Exp and Probability Exp is better, plotted against months (0 to 24); vertical axis 0.0 to 1.0; superiority boundary marked; final proportion on Exp 178/243 = 73%]
Responses/patients at three successive looks (last column: Final)
Std   12/38   19/60    20/65
Exp   38/83   82/167   87/178

One simulation; pE = pS = 0.30
[Figure: Proportion Exp, Probability Exp is better, and Probability futility, plotted against months (0 to 24); vertical axis 0.0 to 1.0; futility boundary marked; final proportion on Exp 87/155 = 56%]
      9 mos.   End     Final
Std   8/39     15/57   18/68
Exp   11/42    32/81   22/87

Operating characteristics
True ORR       Prob         Mean # of patients (%)             Mean length   Prob
Std    Exp     select exp   Std         Exp          Total     (mos)         max n
0.3    0.2     <0.001       51 (34.9)   95 (65.1)    146       15            <0.001
0.3    0.3     0.05         87 (43.1)   115 (56.9)   202       20            0.003
0.3    0.4     0.59         87 (30.4)   199 (69.6)   286       29            0.05
0.3    0.45    0.88         79 (30.7)   178 (69.3)   257       26            0.02
0.3    0.5     0.98         59 (29.5)   141 (70.5)   200       20            0.003
0.3    0.6     1.0          47 (30.1)   109 (69.9)   156       16            <0.001

FDA: Why do this? What’s the advantage?
Enthusiasm of PIs
Comparison with standard design . . .

Adaptive vs tailored balanced design w/same false-positive rate & power
(Mean number patients by arm)
ORR        pS = 0.20, pE = 0.35    pS = 0.30, pE = 0.45    pS = 0.40, pE = 0.55
Arm        Std     Exp             Std     Exp             Std     Exp
Adaptive   68      168             79      178             74      180
Balanced   171     171             203     203             216     216
Savings    103     3               124     25              142     36

Consequences of Bayesian Adaptive Approach
Fundamental change in way we do medical research
More rapid progress
We’ll get the dose right!
Better treatment of patients . . . at less cost

OUTLINE: EXAMPLES
Extraim analysis
Modeling early endpoints
Seamless Phase II/III trial
Adaptive randomization
  Phase II trial in AML
  Phase II drug screening process
  Phase III trial