Problem GSHv and GSHf Procedures GSP(r) Procedure Methods for Constructing GSP(r) Performance Comparisons for a Single Hypothesis Multiple Hypotheses Diabetes Trial Example Concluding Remarks
Allocating Recycled Significance Levels in Group Sequential Procedures for Multiple Endpoints
Dong Xi, Northwestern University
Joint work with Ajit C. Tamhane, Northwestern University
(Thanks to Ekkehard Glimm)
IWSM (July 2013)
Outline
Problem
GSHv and GSHf Procedures
GSP(r) Procedure
Methods for Constructing GSP(r)
Performance Comparisons for a Single Hypothesis
Multiple Hypotheses
Diabetes Trial Example
Concluding Remarks
Problem
• Test hypotheses H1, ..., Hn concerning n ≥ 2 endpoints using group sequential procedures (GSPs).
• Strong control of the familywise error rate (FWER):
FWER = P{Reject at least one true Hi} ≤ α.
• Previous works: Follmann, Proschan & Geller (1994), Tang & Geller (1999) and others.
• Problem: How to incorporate recycling in a GSP?
Recycling
• More powerful multiple test procedures (MTPs) can be constructed by recycling significance levels from rejected hypotheses to unrejected hypotheses.
• Bretz et al. (2009) and Burman et al. (2009) proposed graphical approaches to construct MTPs with recycling based on weighted Bonferroni tests.
• Graphical representation of the Holm procedure for two hypotheses:
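[Figure: Holm procedure as a graph. Initial graph: H1 and H2 each at level 0.025, with an edge of weight 1 in each direction. Graph after H1 rejected: H2 at level 0.05.]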
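The recycling step in the two-hypothesis Holm graph can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and the use of unadjusted p-values as inputs are choices made here, and α = 0.05 is taken from the graph above.

```python
def holm_two_hypotheses(p1, p2, alpha=0.05):
    """Holm test of H1 and H2 written as a Bonferroni split with recycling."""
    level = {"H1": alpha / 2, "H2": alpha / 2}  # initial graph: alpha/2 each
    p = {"H1": p1, "H2": p2}
    rejected = []
    for first, other in (("H1", "H2"), ("H2", "H1")):
        if p[first] <= level[first]:
            rejected.append(first)
            # Edge weight 1: the full level of the rejected hypothesis is recycled.
            level[other] += level[first]
            if p[other] <= level[other]:
                rejected.append(other)
            break
    return sorted(rejected)


print(holm_two_hypotheses(0.01, 0.04))  # H2 is tested at 0.05 after H1 is rejected
print(holm_two_hypotheses(0.03, 0.04))  # neither p-value is below 0.025
```

Without recycling, the second hypothesis in the first call would have been tested at 0.025 and retained.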
GSPs with Recycling
• Maurer & Bretz (2013) and Ye et al. (2013) studied the problem of constructing GSPs with recycling. We build on these two papers.
• For GSPs, a new problem arises: How to allocate the recycled significance level to the stages of the GSP for the unrejected hypothesis?
• Ye et al. (2013) proposed two procedures: Group Sequential Holm Variable (GSHv) and Group Sequential Holm Fixed (GSHf).
• Maurer & Bretz (2013) implicitly used the GSHv procedure.
GSHv and GSHf Procedures
• GSHv allocates the recycled significance level to all stages of the GSP for the unrejected hypothesis.
• GSHf allocates the recycled significance level only to the final stage of the GSP for the unrejected hypothesis.
• If recycling occurs at stage s > 1, GSHv wastes the portion of the recycled significance level allocated to stages 1, ..., s − 1 since those stages can’t be revisited.
• GSHf does not waste any recycled significance level, but the trial has to continue to the final stage to benefit from recycling.
• In general, neither GSHv nor GSHf minimizes E(N).
GSP(r) Procedure
• Consider m-stage GSPs to test H1 and H2 (m − 1 interim analyses and a final analysis).
• Assume Bonferroni split of α: α1 and α2 s.t. α1 + α2 = α.
• Fix a common r (1 ≤ r ≤ m) for GSPs for both H1 and H2.
• Assume that H1 is rejected at Stage s before H2 is rejected.
• GSP(r) allocates α1 to stages r, r + 1, . . . ,m of GSP for H2.
• GSP(1) = GSHv, GSP(m) = GSHf.
• We call r the planned change point and s the recycling point.
GSP(r) Procedure
• Change in the GSP boundary for H2 due to recycling of α1 cannot take place before the rth or the sth stage, whichever occurs later.
• Let u = max(r, s), the effective change point.
• If s > r then the portion of α1 allocated to stages r, r + 1, ..., s − 1 is wasted.
• If s < r then full α1 is utilized but not until the rth stage.
• Ideally, we would like to set r = s, but such an adaptive GSP does not always control FWER.
Adaptive GSP(s) Procedure
• Adaptive GSP(s) procedure controls FWER when the test statistics for H1 and H2 are independent (more generally for n ≥ 2 hypotheses).
• max FWER > α for ρ > 0, where ρ is the correlation coefficient between the test statistics for H1 and H2 and the max is taken over δ1, the noncentrality parameter of H1, assuming H1 is false and H2 is true.
[Figure: FWER of the adaptive GSP(s) procedure plotted against δ1 (0 to 10), one curve for each ρ = 0, 0.1, ..., 1; vertical axis from 0.02 to 0.065.]
Example
• α1 = α2 = 0.025, m = 3.
• Initial boundary for both H1 and H2: Pocock at 0.025 level:
(c1, c2, c3) = (2.289, 2.289, 2.289).
• Modified Pocock boundaries for H2 if H1 is rejected and its 0.025 level is recycled to H2:
GSP(1): (1.992, 1.992, 1.992)
GSP(2): (2.289, 1.890, 1.890)
GSP(3): (2.289, 2.289, 1.737)
• If s = 2, then the effective boundary for GSP(1): (2.289, 1.992, 1.992).
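The effective boundary follows from the rule u = max(r, s): stages before u keep the initial critical values, and stages from u onward use the modified ones. A minimal sketch, with a hypothetical helper name and the boundaries from this example as inputs:

```python
def effective_boundary(initial, modified, r, s):
    """Boundary actually applied to H2 when recycling happens at stage s.

    initial  : critical values (c_1, ..., c_m) at the original level
    modified : GSP(r) critical values at the raised level
    r, s     : planned change point and recycling point (1-based stages)
    """
    u = max(r, s)  # effective change point
    return tuple(
        initial[k] if k + 1 < u else modified[k] for k in range(len(initial))
    )


pocock = (2.289, 2.289, 2.289)
print(effective_boundary(pocock, (1.992, 1.992, 1.992), r=1, s=2))
print(effective_boundary(pocock, (2.289, 1.890, 1.890), r=2, s=3))
```

The second call shows the waste when s > r: the lowered stage-2 value of GSP(2) is never used because recycling only happens at stage 3.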
Problem Setting
• Consider testing a single hypothesis H0 initially at level γ using an m-stage GSP.
• At some random stage s (1 ≤ s ≤ m) recycling takes place and the total level for testing H0 is raised to γ′ > γ; e.g., γ = 0.025, γ′ = 0.05.
• Let (Z1, ..., Zm) be the test statistics with m-variate normal distribution s.t. under H0,
E(Zi) = 0, var(Zi) = 1, corr(Zi, Zj) = √(i/j) for 1 ≤ i < j ≤ m.
• Let (c1(γ), . . . , cm(γ)) denote the initial γ-level boundary.
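The correlation structure above can be motivated as follows, assuming equal information per stage. This assumption is added here for illustration; X_1, ..., X_m denote hypothetical independent, standardized stage-wise statistics that do not appear in the original slides.

```latex
Z_k = \frac{1}{\sqrt{k}} \sum_{l=1}^{k} X_l, \qquad
\operatorname{cov}(Z_i, Z_j)
  = \frac{1}{\sqrt{ij}} \sum_{l=1}^{i} \operatorname{var}(X_l)
  = \frac{i}{\sqrt{ij}}
  = \sqrt{i/j} \quad (1 \le i < j \le m).
```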
Boundary Method for Constructing GSP(r)
• The modified γ′-level boundary (c_1(γ), ..., c_{r−1}(γ), c*_r(γ′), ..., c*_m(γ′)) is obtained by solving the equation
$$1 - \gamma' = P\{Z_1 \le c_1(\gamma), \ldots, Z_{r-1} \le c_{r-1}(\gamma),\; Z_r \le c_r^*(\gamma'), \ldots, Z_m \le c_m^*(\gamma')\}$$
such that c*_k(γ′) ≤ c_k(γ) for k = r, ..., m.
• May choose the same form for c*_k(γ′) as the initial boundary, e.g., if the initial boundary is Pocock then set
$$c_r^*(\gamma') = \cdots = c_m^*(\gamma').$$
• This method is used in the previous example.
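As a rough plausibility check on the boundary method, the crossing probability of a given boundary under H0 can be estimated by simulation. The sketch below is an illustration, not the authors' computation: it assumes equal information increments, so that Z_k is a scaled sum of k independent N(0, 1) variables with corr(Z_i, Z_j) = √(i/j). The function name, seed, and number of replications are arbitrary choices.

```python
import math
import random


def simulate_crossing_prob(bounds, n_sims=100_000, seed=1):
    """Monte Carlo estimate of P(Z_k > c_k for some k) under H0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        score = 0.0
        for k, c in enumerate(bounds, start=1):
            score += rng.gauss(0.0, 1.0)   # independent stage-wise increment
            if score / math.sqrt(k) > c:   # Z_k = S_k / sqrt(k)
                hits += 1
                break                      # the trial stops at the first crossing
    return hits / n_sims


# Boundaries quoted in the previous example; each modified boundary should
# have an estimated crossing probability near the raised level 0.05.
for label, bounds in [
    ("GSP(1)", (1.992, 1.992, 1.992)),
    ("GSP(2)", (2.289, 1.890, 1.890)),
    ("GSP(3)", (2.289, 2.289, 1.737)),
]:
    print(label, simulate_crossing_prob(bounds))
```

Solving for the critical values themselves would replace the simulation with a numerical evaluation of the multivariate normal probability inside a root finder.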
Error Spending Function Method for Constructing GSP(r)
• Error spending function introduced by Lan & DeMets (1983) as a flexible method for constructing GSPs.
• Let ε(γ, t) be the initial error spending function, which is increasing in t ∈ [0, 1] s.t. ε(γ, 0) = 0 and ε(γ, 1) = γ.
• Approximate error spending functions for Pocock (POC) and O’Brien-Fleming (OBF):
$$\varepsilon_{\mathrm{POC}}(\gamma, t) = \gamma \ln[1 + (e - 1)t], \qquad \varepsilon_{\mathrm{OBF}}(\gamma, t) = 2\Phi\left(-z_{\gamma/2}/\sqrt{t}\right).$$
• Let 0 = t0 < t1 < · · · < tm = 1 be the information times.
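The two approximate spending functions are easy to evaluate directly. A minimal sketch using Python's standard-library `statistics.NormalDist`; the function names are illustrative, and z_{γ/2} is taken to be the upper γ/2 quantile of the standard normal distribution.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def spend_poc(gamma, t):
    """Pocock-type spending: gamma * ln(1 + (e - 1) t)."""
    return gamma * math.log(1.0 + (math.e - 1.0) * t)


def spend_obf(gamma, t):
    """O'Brien-Fleming-type spending: 2 * Phi(-z_{gamma/2} / sqrt(t))."""
    if t <= 0.0:
        return 0.0  # nothing is spent before any information accrues
    z = STD.inv_cdf(1.0 - gamma / 2.0)
    return 2.0 * STD.cdf(-z / math.sqrt(t))


for t in (1 / 3, 2 / 3, 1.0):
    print(t, spend_poc(0.025, t), spend_obf(0.025, t))
```

Both functions return γ at t = 1; the OBF-type function spends far less at early information times than the Pocock-type one.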
Error Spending Function Method for Constructing GSP(r)
• The modified error spending function ε*(γ′, r, t) can be expressed as
$$\varepsilon^*(\gamma', r, t) = \begin{cases} \varepsilon(\gamma, t) & \text{for } 0 \le t \le t_{r-1}, \\ \varepsilon(\gamma, t_{r-1}) + f(\gamma', r, t) & \text{for } t_{r-1} < t \le 1, \end{cases}$$
where f(γ′, r, t) is increasing in t with f(γ′, r, t_{r−1}) = 0 and f(γ′, r, 1) = γ′ − ε(γ, t_{r−1}) so that ε*(γ′, r, 1) = γ′.
• Any error spending function other than the original ε(γ, t) can be used, subject to a certain monotonicity condition.
Error Spending Function Method for Constructing GSP(r)
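• A good choice for f(γ′, r, t):
$$f(\gamma', r, t) = \varepsilon(\gamma^*, t) - \varepsilon(\gamma^*, t_{r-1}),$$
where γ* satisfies
$$\gamma^* - \varepsilon(\gamma^*, t_{r-1}) = \gamma' - \varepsilon(\gamma, t_{r-1}).$$
• Check: f(γ′, r, t_{r−1}) = 0 and
$$f(\gamma', r, 1) = \varepsilon(\gamma^*, 1) - \varepsilon(\gamma^*, t_{r-1}) = \gamma^* - \varepsilon(\gamma^*, t_{r-1}) = \gamma' - \varepsilon(\gamma, t_{r-1}).$$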
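The defining equation for γ* can be solved numerically. The sketch below uses bisection and assumes that γ* − ε(γ*, t_{r−1}) is increasing in γ* on (0, 1), which holds for the Pocock-type function used here; the function names and the tuple `times` of information times (t_0, ..., t_m) are illustrative choices, with m = 3, r = 2, γ = 0.025 and γ′ = 0.05 as inputs.

```python
import math


def spend_poc(gamma, t):
    """Pocock-type spending function."""
    return gamma * math.log(1.0 + (math.e - 1.0) * t)


def solve_gamma_star(spend, gamma, gamma_new, t_prev, tol=1e-12):
    """Solve g - spend(g, t_prev) = gamma_new - spend(gamma, t_prev) for g."""
    target = gamma_new - spend(gamma, t_prev)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - spend(mid, t_prev) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def modified_spend(spend, gamma, gamma_new, r, times, t):
    """Modified spending function; times = (t_0 = 0, t_1, ..., t_m = 1)."""
    t_prev = times[r - 1]
    if t <= t_prev:
        return spend(gamma, t)  # unchanged before the planned change point
    g_star = solve_gamma_star(spend, gamma, gamma_new, t_prev)
    return spend(gamma, t_prev) + spend(g_star, t) - spend(g_star, t_prev)


times = (0.0, 1 / 3, 2 / 3, 1.0)
print(solve_gamma_star(spend_poc, 0.025, 0.05, times[1]))
print([modified_spend(spend_poc, 0.025, 0.05, 2, times, t) for t in times[1:]])
```

For r = 1 the equation reduces to γ* = γ′, so the modified function is simply the original spending function at the raised level.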
Error Spending Function Method for Constructing GSP(r)
Error spending functions for GSP(1), GSP(2) and GSP(3) using the POC boundary (m = 3, γ = 0.025, γ′ = 0.05)
[Figure: error spent versus information fraction (0, 1/3, 2/3, 1), with curves for r = 1, r = 2 and r = 3 rising to γ′ = 0.05 and the level γ = 0.025 marked.]
Error Spending Function Method for Constructing GSP(r)
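• How to calculate the boundary from the modified error spending function?
• Calculate “spent” levels:
$$\alpha_k^*(\gamma') = \varepsilon^*(\gamma', r, t_k) - \varepsilon^*(\gamma', r, t_{k-1}) \quad (1 \le k \le m).$$
Note that $\sum_{k=1}^{m} \alpha_k^*(\gamma') = \gamma'$.
• Then solve for the c*_k(γ′) recursively from the following set of equations for 1 ≤ k ≤ m:
$$\alpha_k^*(\gamma') = P\left\{\bigcap_{j=1}^{k-1} \left[Z_j \le c_j^*(\gamma')\right] \cap \left[Z_k > c_k^*(\gamma')\right]\right\}.$$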
Error Spending Function Method for Constructing GSP(r)
• c*_k(γ′) = c_k(γ) for k < r and c*_k(γ′) < c_k(γ) for k ≥ r.
• Maurer & Bretz (2013) showed that, to ensure consonance and hence a stepwise shortcut, we need monotonicity: c*_k(γ′) ↓ as γ′ ↑, which requires α_k(γ′) ↑ as γ′ ↑. This is a condition on both ε(γ, t) and ε*(γ′, r, t).
• Both POC and OBF boundaries satisfy this monotonicity condition.
Example
• For the POC error spending function with m = 3, r = 2, γ = 0.025, γ′ = 0.05, we can calculate γ* = 0.0707. So for t > 1/3
ε*(.05, 2, t) = ε(0.025, 1/3) + ε(0.0707, t) − ε(0.0707, 1/3).
• This gives ε*(.05, 2, 1/3) = 0.0113, ε*(.05, 2, 2/3) = 0.0333 and ε*(.05, 2, 1) = 0.05. Hence the spent levels are
α*_2(.05) = 0.0333 − 0.0113 = 0.0220,
α*_3(.05) = 0.05 − 0.0333 = 0.0167.
• c*_2(.05) and c*_3(.05) can be determined recursively from α*_2(.05) and α*_3(.05) as c*_2(.05) = 1.925 and c*_3(.05) = 1.865. Note that they are not equal.
• Using the boundary method, c*_2(.05) = c*_3(.05) = 1.890.
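The recursive solve for the critical values can be sketched numerically. The code below is one possible implementation, not necessarily the one behind the numbers in this example. It assumes equal information increments, works with the score S_k = √k Z_k (independent N(0, 1) increments under H0), integrates the sub-density of S_k over the continuation region with Simpson's rule, and finds each critical value by bisection. The grid size, the truncation at −8√k, and all function names are illustrative choices; the inputs are the rounded spent levels from this example, and every spent level is assumed to be strictly positive.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def simpson_grid(a, b, n=200):
    """Nodes and Simpson weights on [a, b]; n must be even."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    ws = [h / 3 * (1 if i in (0, n) else 4 if i % 2 else 2) for i in range(n + 1)]
    return xs, ws


def boundaries_from_spent_levels(alphas, n=200):
    """Solve alpha_k = P(Z_1 <= c_1, ..., Z_{k-1} <= c_{k-1}, Z_k > c_k)."""
    m = len(alphas)
    bounds = [STD.inv_cdf(1.0 - alphas[0])]     # stage 1: plain normal quantile
    xs, ws = simpson_grid(-8.0, bounds[0], n)   # continuation region of S_1
    dens = [STD.pdf(x) for x in xs]             # sub-density of S_1 on that region
    for k in range(2, m + 1):
        def exit_prob(b):
            # P(continue through stage k - 1, then S_k > b)
            return sum(w * g * (1.0 - STD.cdf(b - x))
                       for x, w, g in zip(xs, ws, dens))

        lo, hi = -8.0 * math.sqrt(k), 8.0 * math.sqrt(k)
        for _ in range(60):                     # bisection: exit_prob decreases in b
            mid = 0.5 * (lo + hi)
            if exit_prob(mid) > alphas[k - 1]:
                lo = mid
            else:
                hi = mid
        b = 0.5 * (lo + hi)
        bounds.append(b / math.sqrt(k))         # back to the Z scale
        if k < m:                               # sub-density of S_k for the next stage
            new_xs, new_ws = simpson_grid(-8.0 * math.sqrt(k), b, n)
            dens = [sum(w * g * STD.pdf(y - x) for x, w, g in zip(xs, ws, dens))
                    for y in new_xs]
            xs, ws = new_xs, new_ws
    return bounds


spent = [0.0113, 0.0220, 0.0167]  # spent levels at stages 1, 2, 3
print([round(c, 3) for c in boundaries_from_spent_levels(spent)])
```

Because the spent levels at stages 2 and 3 differ, the resulting critical values differ as well, in contrast to the constant values produced by the boundary method.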
Expected Sample Size Comparisons
• Consider testing H0 : θ = 0 vs. H1 : θ > 0 using an m-stage GSP(r) with γ = 0.025 and γ′ = 0.05.
• For fixed total sample size M = mn, where n is the sample size per stage (assuming a common sample size), power increases with r for each s, so maximum power is attained with GSP(m).
• However, E(N) is also higher for GSP(m) since it stops late, often at the last stage. So we fix power and find r that minimizes E(N).
• Power requirement: Power using GSP(r) = 1 − β when θ = δ > 0.
• Determine M to guarantee power and then calculate
$$E(N) = n \sum_{k=1}^{m-1} k\, P(\text{GSP stops and rejects } H_0 \text{ at Stage } k \mid \theta = \delta) + M \times P(\text{GSP stops at Stage } m \mid \theta = \delta).$$
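Once the stage-wise stopping probabilities under θ = δ are available (from numerical integration or simulation), the expected sample size is a weighted sum. A minimal sketch: the helper name is hypothetical and the probabilities in the usage line are made-up numbers, not results from the talk.

```python
def expected_sample_size(n_per_stage, stop_probs):
    """E(N) = sum over stages k of (k * n) * P(stop at stage k).

    stop_probs[k - 1] is the probability that the GSP stops at stage k;
    the last entry covers stopping at the final stage, where N = M = m * n.
    """
    if abs(sum(stop_probs) - 1.0) > 1e-9:
        raise ValueError("stopping probabilities must sum to 1")
    return sum(n_per_stage * k * p for k, p in enumerate(stop_probs, start=1))


# Hypothetical 4-stage design with 25 subjects per stage (M = 100).
print(expected_sample_size(25, [0.2, 0.3, 0.3, 0.2]))
```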
Expected Sample Size Comparisons
Expected sample sizes (expressed as percentages of the fixed sample size) for GSP(r) conditional on s (m = 4, γ = 0.025, γ′ = 0.05, Power 1 − β = 0.80 at δ = 1).

Initial              E(N)
Boundary   r    s = 1    s = 2    s = 3    s = 4
OBF        1    81.42    81.74    85.15    90.13
OBF        2    81.71    81.71    85.12    90.10
OBF        3    84.17    84.17    84.17    89.33
OBF        4    86.13    86.13    86.13    86.13
POC        1    79.30    83.42    88.04    92.73
POC        2    78.55    78.55    84.09    89.84
POC        3    79.25    79.25    79.25    86.16
POC        4    80.43    80.43    80.43    80.43
Multiple Hypotheses
• Use the graphical approach and the algorithm of Bretz et al. (2009) for updating weights on hypotheses, significance levels and transition parameters.
• Calculate modified error spending function and corresponding modified boundary for each rejection at each stage.
• Allows multiple rejections at each stage.
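The weight and transition update of Bretz et al. (2009) can be sketched with exact fractions. The starting graph below is an assumption: it is the four-hypothesis gatekeeping graph of the diabetes trial example (primary H1, H2 with weight 1/2 each, secondary H3, H4 with weight 0), with edge directions inferred rather than stated in the text. The function and variable names are illustrative.

```python
from fractions import Fraction as F

ZERO = F(0)


def reject(weights, trans, j):
    """Remove hypothesis j and return the updated (weights, transitions)."""
    rest = [h for h in weights if h != j]
    # Each remaining hypothesis inherits its share of the rejected weight.
    new_w = {h: weights[h] + weights[j] * trans[j].get(h, ZERO) for h in rest}
    new_g = {}
    for l in rest:
        new_g[l] = {}
        loop = trans[l].get(j, ZERO) * trans[j].get(l, ZERO)
        for k in rest:
            if k == l or loop >= 1:
                continue
            g = (trans[l].get(k, ZERO)
                 + trans[l].get(j, ZERO) * trans[j].get(k, ZERO)) / (1 - loop)
            if g:
                new_g[l][k] = g
    return new_w, new_g


# Assumed initial graph (edge directions inferred, see the note above).
weights = {"H1": F(1, 2), "H2": F(1, 2), "H3": ZERO, "H4": ZERO}
trans = {
    "H1": {"H2": F(1, 2), "H3": F(1, 2)},
    "H2": {"H1": F(1, 2), "H4": F(1, 2)},
    "H3": {"H2": F(1)},
    "H4": {"H1": F(1)},
}

alpha = F(25, 1000)  # overall one-sided level 0.025
w1, g1 = reject(weights, trans, "H1")
print({h: float(alpha * w) for h, w in w1.items()})  # levels after rejecting H1
```

Multiplying the updated weights by the overall α gives the significance levels at which the modified boundaries are then computed.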
Diabetes Trial Example
• Maurer & Bretz (2013) used a 3-stage GSP(1) with equal sample sizes. We will use GSP(2).
• Primary endpoint: HbA1c, Secondary endpoint: Body weight.
• Low dose and high dose vs. placebo.
• Gatekeeping restriction: Within each dose test the secondary endpoint only if the primary endpoint is significant.
• Overall α = 0.025. Initial significance levels: α1 = 0.0125, α2 = 0.0125, α3 = 0, α4 = 0.
Diabetes Trial Example
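[Figure: initial graph. Primary hypotheses H1 and H2 each have weight 1/2; secondary hypotheses H3 and H4 have weight 0. Edges of weight 1/2: H1 → H2, H2 → H1, H1 → H3, H2 → H4. Edges of weight 1: H3 → H2, H4 → H1.]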
Diabetes Trial Example
• Use the O’Brien-Fleming (OBF) boundary for all hypotheses.
• Stage 1 Test Statistics:
Z11 = 2.50, Z21 = 2.12, Z31 = 2.61, Z41 = 1.13.
The OBF boundary for H1 and H2 is (3.935, 2.782, 2.272). Neither H1 nor H2 can be rejected.
• Stage 2 Test Statistics:
Z12 = 3.04, Z22 = 2.63, Z32 = 2.86, Z42 = 1.55.
Since Z12 = 3.04 > 2.782, reject H1, but not H2 (Z22 = 2.63 < 2.782).
Diabetes Trial Example
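[Figure: graph after H1 is rejected. Weights: H2 = 3/4, H3 = 1/4, H4 = 0. Edges: H2 → H3 = 1/3, H2 → H4 = 2/3, H3 → H2 = 1, H4 → H2 = 1/2, H4 → H3 = 1/2.]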
Diabetes Trial Example
• New significance levels: α2 = 0.01875, α3 = 0.00625, α4 = 0.
• The modified OBF boundary using GSP(2): (3.935, 2.591, 2.118) for H2 and (∞, 3.085, 2.519) for H3.
• Since Z22 = 2.63 > 2.591, reject H2.
• New graph
[Figure: graph after H1 and H2 are rejected. H3 and H4 each have weight 1/2, with an edge in each direction between them.]
Diabetes Trial Example
• New significance levels: α3 = 0.0125, α4 = 0.0125.
• The modified OBF boundary for H3 and H4 using GSP(2): (∞, 2.780, 2.272).
• Since Z32 = 2.86 > 2.780, reject H3.
• New graph
[Figure: graph after H1, H2 and H3 are rejected. H4 has weight 1.]
Diabetes Trial Example
• New significance level: α4 = 0.025.
• The modified OBF boundary for H4 using GSP(2): (∞, 2.452, 2.003).
• Since Z42 = 1.55 < 2.452 we can’t reject H4.
• At this point, the trial may proceed to Stage 3 or the DMC may decide to terminate the trial.
Concluding Remarks
• For a given power requirement, E(N) is generally minimized for some r between 1 and m.
• This value of r can be determined by simulation.
• The goal may be other than minimizing E(N). For example, the EMEA guideline states: “Often it may not be acceptable to stop a trial very early, despite convincing efficacy results, because insufficient data on safety, or on secondary endpoints may be available,” so a larger r may be chosen.
• Ongoing work: incorporate futility boundaries.
References I
Bretz, F., Maurer, W., Brannath, W. and Posch, M. (2009). A graphical approach to sequentially rejective multiple test procedures. Statistics in Medicine, 28, 586–604.
Burman, C.F., Sonesson, C. and Guilbaud, O. (2009). A recycling framework for the construction of Bonferroni-based multiple tests. Statistics in Medicine, 28, 739–761.
European Medicines Agency (EMA). (2007). Reflection paper on methodological issues in confirmatory clinical trials with flexible design and analysis plan. London, UK: EMA.
Geller, N.L., Proschan, M.A. and Follmann, D.A. (1995). Group sequential monitoring of multi-armed clinical trials. Drug Information Journal, 29, 705–713.
Lan, K.K.G. and DeMets, D.L. (1983). Discrete sequential boundaries for clinical trials. Biometrika, 70, 659–663.
References II
Maurer, W. and Bretz, F. (2013). Multiple testing in group sequential trials using graphical approaches. Statistics in Biopharmaceutical Research, published online.
O’Brien, P.C. and Fleming, T.R. (1979). A multiple testing procedure for clinical trials. Biometrics, 35, 549–556.
Pocock, S.J. (1977). Group sequential methods in the design and analysis of clinical trials. Biometrika, 64, 191–199.
Tang, D.I. and Geller, N.L. (1999). Closed testing procedures for group sequential clinical trials with multiple endpoints. Biometrics, 55, 1188–1192.
Ye, Y., Li, A., Liu, L. and Yao, B. (2013). A group sequential Holm procedure with multiple primary endpoints. Statistics in Medicine, 32, 1112–1124.