Seminar, Bordeaux School of Public Health
8 June 2011
Combining endpoints in clinical trials to increase power
John Whitehead
Medical and Pharmaceutical Statistics Research Unit
Department of Mathematics and Statistics, Fylde College
Lancaster University, Lancaster LA1 4YF, UK
Tel: +44 1524 592350
Fax: +44 1524 592681
E-mail: [email protected]
MPS Research Unit
1. Ordinal endpoints in stroke studies
Treatments for acute stroke are administered for a few days following diagnosis
The primary endpoint is the functional status of the patient, 90 days after the stroke
Several scoring systems exist, including the Barthel index, the modified Rankin score and the NIH stroke scale
All are ordinal scales from full recovery to vegetative state, to which death before 90 days can be added
Analysis of an ordinal response
R1 = best response (full recovery)
Rk = worst response (death before day 90)
Response   Control   Experimental   Total
R1         c1        e1             t1
R2         c2        e2             t2
...        ...       ...            ...
Rk         ck        ek             tk
Total      nC        nE             n
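Let
$\bar{C}_h = c_1 + \dots + c_h$
$\bar{C}_h$ = the number of controls with response $R_h$ or better
Let
$\underline{C}_h = c_h + \dots + c_k$
$\underline{C}_h$ = the number of controls with response $R_h$ or worse
Similarly define $\bar{E}_h$, $\underline{E}_h$, $\bar{T}_h$ and $\underline{T}_h$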
Let
$Q_{Ch}$ = P(a control has response $R_h$ or better)
$Q_{Eh}$ = P(an experimental has response $R_h$ or better)
(then $Q_{Ck} = Q_{Ek} = 1$)
Put
$$\theta_h = \log\left\{\frac{Q_{Eh}(1 - Q_{Ch})}{Q_{Ch}(1 - Q_{Eh})}\right\}, \quad h = 1, \dots, k - 1$$
$\theta_h$ is the log-odds ratio for response $R_h$ or better, E:C
The proportional odds (PO) assumption is
$\theta_1 = \theta_2 = \dots = \theta_{k-1} = \theta$
The common value, $\theta$, is a measure of the advantage of the experimental treatment
$\theta > 0$: experimental better
$\theta = 0$: no difference
$\theta < 0$: control better
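Under PO, the most efficient test of treatment advantage (the one with greatest power for any given sample size) is based on the test statistics
$$Z = \frac{1}{n}\sum_{h=1}^{k} c_h\left(\bar{E}_{h-1} - \underline{E}_{h+1}\right)$$
and
$$V = \frac{n_C n_E}{3n}\left\{1 - \sum_{h=1}^{k}\left(\frac{t_h}{n}\right)^3\right\}$$
Here $\bar{E}_{h-1} = e_1 + \dots + e_{h-1}$ is the number of experimentals with a response better than $R_h$, and $\underline{E}_{h+1} = e_{h+1} + \dots + e_k$ is the number with a response worse than $R_h$ (with $\bar{E}_0 = \underline{E}_{k+1} = 0$)
For large samples and small $\theta$, approximately $Z \sim N(\theta V, V)$
Z is the score statistic and V is Fisher's information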
To test for treatment difference, refer $Z^2/V$ to $\chi^2_1$
This is the Mann-Whitney test
Also known as the Wilcoxon test
Under the null hypothesis of no treatment effect, PO is true with $\theta = 0$
Thus the hypothesis test and the p-value are valid without the PO assumption
Estimates of $\theta$ and confidence intervals for $\theta$ do rely on assumptions, as does adjustment for prognostic factors
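The score test for a single ordinal response can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function name `ordinal_score_test` is hypothetical, and it assumes the statistics take the form Z = (1/n) Σ c_h (number of experimentals better than R_h − number worse than R_h) and V = (nC nE / 3n){1 − Σ (t_h/n)³}, with categories listed from best to worst.

```python
import math

def ordinal_score_test(controls, experimentals):
    """Score test for one ordinal response (hypothetical helper).

    controls[h] and experimentals[h] are the counts in the (h+1)th category,
    ordered from best (full recovery) to worst (death).
    Returns (Z, V, chi_square, p_value).
    """
    k = len(controls)
    n_c, n_e = sum(controls), sum(experimentals)
    n = n_c + n_e
    z = 0.0
    for h in range(k):
        better = sum(experimentals[:h])      # experimentals strictly better than this category
        worse = sum(experimentals[h + 1:])   # experimentals strictly worse than this category
        z += controls[h] * (better - worse)
    z /= n
    totals = [c + e for c, e in zip(controls, experimentals)]
    v = n_c * n_e / (3 * n) * (1 - sum((t / n) ** 3 for t in totals))
    chi_square = z * z / v
    # Upper tail of the chi-squared distribution on 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return z, v, chi_square, p_value

# Illustrative counts only (not trial data): three categories, best first
z, v, chi_square, p_value = ordinal_score_test([10, 30, 60], [20, 35, 45])
print(z, v, chi_square, p_value)
```

With only two categories the function reduces to the usual comparison of two proportions, which gives a quick check of the implementation.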
How should investigators choose which scale to use?
An alternative to choosing is to combine more than one stroke scale in the analysis
Tilley et al. (1996) combined four scales in the trial of rTPA as a treatment in acute stroke conducted by the National Institute of Neurological Disorders and Stroke
the trial was positive and the approach caught on
If the treatment has a beneficial effect on all scales, then combining them will increase the power to demonstrate the advantage of the treatment
2. Example: The ICTUS trial in stroke
• Currently ongoing in 60 centres in Europe
• Patients who have suffered acute stroke
• Randomised between citicoline and placebo
• Assessed at 90 days on Barthel index, modified Rankin score and NIH stroke scale
• Prognostic factors:
    baseline NIHSS
    time from stroke to treatment (≤ or > 12 hours)
    age (≤ or > 70 years)
    site of stroke (right or left side)
    use of rTPA (yes or no)
The approach used by Tilley et al.
Combine the three analyses using GEE (based on an independence covariance structure: IEE)
That is, analyse as if the three scores were independent, but adjust the standard error of the treatment effect estimate using the sandwich estimator
• complicated to understand
• no associated sample size formula
• failed in test data set of 1000 patients with binary responses and adjustment for 60 centres
An alternative general approach
The log-odds ratio and the test statistics Z and V for the analysis of the ith response will be denoted by $\theta_i$, $Z_i$ and $V_i$
i = 1 is Barthel index
i = 2 is modified Rankin score
i = 3 is NIH stroke scale
We will test $H_0: \theta_1 = \theta_2 = \theta_3 = 0$ (no effect of treatment on any of the scales) using
$Z = Z_1 + Z_2 + Z_3$
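For each scale,
$Z_i \sim N(\theta_i V_i, V_i)$
if $V_i$ is large and $\theta_i$ is small
If $\theta_1 = \theta_2 = \theta_3 = \theta$, then approximately
$$Z \sim N(\theta V, V + C)$$
where $V = V_1 + V_2 + V_3$, $C = 2(C_{12} + C_{23} + C_{31})$ and
$C_{ij} = \operatorname{cov}(Z_i, Z_j)$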
It follows that, if
$$Z^* = \frac{ZV}{V + C} \quad\text{and}\quad V^* = \frac{V^2}{V + C}$$
then
$$Z^* \sim N(\theta V^*, V^*)$$
as required for a $\chi^2$ test and for sample size calculation
What we need to use this is an expression for
$C_{ij} = \operatorname{cov}(Z_i, Z_j)$
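The rescaling can be checked directly. The short derivation below is a sketch that uses only the approximation Z ~ N(θV, V + C) and the definitions Z* = ZV/(V + C) and V* = V²/(V + C):

```latex
\begin{align*}
E(Z^*) &= \frac{V}{V + C}\,E(Z) = \frac{V}{V + C}\,\theta V = \theta V^*, \\
\operatorname{var}(Z^*) &= \left(\frac{V}{V + C}\right)^{2}(V + C) = \frac{V^2}{V + C} = V^*, \\
\frac{Z^{*2}}{V^*} &= \frac{Z^2 V^2/(V + C)^2}{V^2/(V + C)} = \frac{Z^2}{V + C}.
\end{align*}
```

The last line shows that referring Z*²/V* to the chi-squared distribution on one degree of freedom is the same as standardising the summed statistic Z by its full variance V + C.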
The binary case, no covariates, only one response
               Control   Experimental   Total
Success (R1)   c1        e1             t1
Failure (R2)   c2        e2             t2
Total          nC        nE             n
$$Z = \frac{n_C e_1 - n_E c_1}{n} \quad\text{and}\quad V = \frac{n_C n_E t_1 t_2}{n^3}$$
The binary case, no covariates, ith of several responses
assuming that each patient provides all responses
               Control   Experimental   Total
Success (R1)   ci1       ei1            ti1
Failure (R2)   ci2       ei2            ti2
Total          nC        nE             n
$$Z_i = \frac{n_C e_{i1} - n_E c_{i1}}{n} \quad\text{and}\quad V_i = \frac{n_C n_E t_{i1} t_{i2}}{n^3}$$
Covariance between Zi and Zj
For two such statistics, we have
$$C_{ij} = \operatorname{cov}(Z_i, Z_j) = \frac{n_C n_E}{n^3}\left(n\,t_{(ij),1} - t_{i1}\,t_{j1}\right)$$
where $t_{i1}$ is the number of patients succeeding on the ith scale, $t_{j1}$ the number succeeding on the jth scale and $t_{(ij),1}$ the number succeeding on both scales (Pocock, Geller and Tsiatis, 1987)
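For binary responses the whole combined test can be assembled from patient-level data. The sketch below is illustrative only: `combined_binary_test` is a hypothetical name, there are no covariates, and every patient is assumed to provide all responses. It uses Z_i = (nC e_i1 − nE c_i1)/n, V_i = nC nE t_i1 t_i2 / n³ and C_ij = nC nE (n t_(ij),1 − t_i1 t_j1)/n³, then rescales to Z* = ZV/(V + C) and V* = V²/(V + C).

```python
import math
from itertools import combinations

def combined_binary_test(treatment, outcomes):
    """Combined score test for several binary responses (hypothetical helper).

    treatment[p] is 0 for control and 1 for experimental;
    outcomes[p][i] is 1 if patient p succeeds on scale i and 0 otherwise.
    Returns (Z_star, V_star, p_value).
    """
    n = len(treatment)
    n_e = sum(treatment)
    n_c = n - n_e
    m = len(outcomes[0])
    scale = n_c * n_e / n ** 3
    successes = []          # t_i1 for each scale
    z_sum = v_sum = 0.0
    for i in range(m):
        t1 = sum(y[i] for y in outcomes)
        e1 = sum(y[i] for x, y in zip(treatment, outcomes) if x == 1)
        c1 = t1 - e1
        successes.append(t1)
        z_sum += (n_c * e1 - n_e * c1) / n      # Z_i
        v_sum += scale * t1 * (n - t1)          # V_i
    c_sum = 0.0
    for i, j in combinations(range(m), 2):
        both = sum(y[i] * y[j] for y in outcomes)   # successes on both scales
        c_sum += 2 * scale * (n * both - successes[i] * successes[j])
    z_star = z_sum * v_sum / (v_sum + c_sum)
    v_star = v_sum ** 2 / (v_sum + c_sum)
    p_value = math.erfc(math.sqrt(z_star ** 2 / v_star / 2))  # chi-squared, 1 df
    return z_star, v_star, p_value
```

A useful sanity check is to pass the same response twice: the covariance then equals the variance, and Z* and V* collapse to the single-response Z and V.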
The ordinal case, no covariates, ith of several responses
with $\bar{C}_{ih} = c_{i1} + \dots + c_{ih}$ and $\underline{C}_{ih} = c_{ih} + \dots + c_{ik}$
Response   Control   Experimental   Total
R1         ci1       ei1            ti1
...        ...       ...            ...
Rk         cik       eik            tik
Total      nC        nE             n
Covariance between Zi and Zj
For two such statistics, we have
$$C_{ij} = \frac{n_C n_E}{n^2}\sum_{f,g,v,w}\delta_{fv}\,\delta_{gw}\left(H_{fg}H_{vw} + n_C H_{fg}K_{vw} + n_E K_{fg}H_{vw}\right)$$
where $\delta_{fv} = -1$, 0 or 1 if f <, =, > v respectively,
$K_{fg} = t_{if}\,t_{jg}/n^2$, $H_{fg} = t_{(ij),(fg)}/n - K_{fg}$,
$t_{(ij),(fg)}$ is the count of patients who have both response $R_{f,i}$ on the ith scale and response $R_{g,j}$ on the jth scale
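The ordinal covariance can be computed by direct summation over the joint table of two scales. The following is a sketch under stated assumptions rather than a reference implementation: the function names are hypothetical, and the covariance is taken to be C_ij = (nC nE / n²) Σ δ_fv δ_gw (H_fg H_vw + nC H_fg K_vw + nE K_fg H_vw), with K_fg the product of the marginal proportions and H_fg the joint proportion minus K_fg.

```python
def sign(f, v):
    """delta_fv: -1, 0 or 1 according to whether f <, = or > v."""
    return (f > v) - (f < v)

def ordinal_covariance(joint, n_c, n_e):
    """Covariance C_ij of two ordinal score statistics (hypothetical helper).

    joint[f][g] is the number of patients, both arms pooled, in category f
    on scale i and category g on scale j (categories ordered best to worst).
    """
    n = n_c + n_e
    rows, cols = len(joint), len(joint[0])
    t_i = [sum(row) for row in joint]                                    # margins, scale i
    t_j = [sum(joint[f][g] for f in range(rows)) for g in range(cols)]   # margins, scale j
    K = [[t_i[f] * t_j[g] / n ** 2 for g in range(cols)] for f in range(rows)]
    H = [[joint[f][g] / n - K[f][g] for g in range(cols)] for f in range(rows)]
    total = 0.0
    for f in range(rows):
        for v in range(rows):
            d_fv = sign(f, v)
            if d_fv == 0:
                continue
            for g in range(cols):
                for w in range(cols):
                    d_gw = sign(g, w)
                    if d_gw == 0:
                        continue
                    total += d_fv * d_gw * (
                        H[f][g] * H[v][w]
                        + n_c * H[f][g] * K[v][w]
                        + n_e * K[f][g] * H[v][w]
                    )
    return n_c * n_e / n ** 2 * total

# Illustrative 3 x 3 joint table (not trial data)
print(ordinal_covariance([[30, 10, 5], [10, 40, 15], [5, 15, 70]], 100, 100))
```

For two categories per scale the sum collapses to the binary expression nC nE (n t_(ij),1 − t_i1 t_j1)/n³, which provides a check on the indexing and signs.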
Adjustment for covariates
The approach can be extended to allow for prognostic factors via stratification and/or linear modelling of covariates
Stratification: sum Z and V statistics over strata, and assume that the treatment effect is constant over strata
Covariate adjustment: use proportional odds regression, plus binary logistic regression to model the simultaneous occurrence of particular responses on different scales (such as complete recovery on Barthel index and partial recovery on the modified Rankin)
3. Sample size calculation for the combined test
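For power of 90% to detect a log-odds ratio of $\theta_R$ as significant at level 0.05 (two-sided), we need
$$V = \left(\frac{1.960 + 1.282}{\theta_R}\right)^2 = \frac{10.5}{\theta_R^2}$$
for a test based on a single response, and
$$V^* = \left(\frac{1.960 + 1.282}{\theta_R}\right)^2 = \frac{10.5}{\theta_R^2}$$
for a test based on the combined approach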
For a single binary (success/fail) response, with an overall success probability of p,
$$n = \frac{4V}{p(1 - p)} = \frac{42}{\theta_R^2\,p(1 - p)}$$
For three binary responses, each having an overall success probability of p, and with the probability of success on any two responses being g,
$$n = \frac{4\{p(1 - p) + 2(g - p^2)\}V^*}{3p^2(1 - p)^2} = \frac{42\{p(1 - p) + 2(g - p^2)\}}{3\theta_R^2\,p^2(1 - p)^2}$$
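The sample size expressions follow from the binary formulae for V_i and C_ij. The derivation below is a sketch that assumes equal allocation (nC = nE = n/2) and replaces the counts by their expected values: np successes on each scale and ng successes on each pair of scales.

```latex
\begin{align*}
V_i &= \frac{(n/2)(n/2)(np)\{n(1 - p)\}}{n^3} = \frac{n\,p(1 - p)}{4},
  & V &= V_1 + V_2 + V_3 = \frac{3n\,p(1 - p)}{4}, \\
C_{ij} &= \frac{(n/2)(n/2)}{n^3}\left(n \cdot ng - n^2 p^2\right) = \frac{n\,(g - p^2)}{4},
  & C &= 2(C_{12} + C_{23} + C_{31}) = \frac{3n\,(g - p^2)}{2}, \\
V^* &= \frac{V^2}{V + C} = \frac{3n\,p^2(1 - p)^2}{4\{p(1 - p) + 2(g - p^2)\}}
  & \Longrightarrow\quad n &= \frac{4\{p(1 - p) + 2(g - p^2)\}\,V^*}{3p^2(1 - p)^2}.
\end{align*}
```

Setting the number of responses to one (so that C = 0) recovers n = 4V/{p(1 − p)}.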
Suppose that $g = p^2$ (independence), then
$$n = \frac{42\,p(1 - p)}{3\theta_R^2\,p^2(1 - p)^2} = \frac{42}{3\theta_R^2\,p(1 - p)}$$
that is one third of the sample size using only one response
For $g = p$ (responses coincide),
$$n = \frac{42\{p(1 - p) + 2p(1 - p)\}}{3\theta_R^2\,p^2(1 - p)^2} = \frac{42}{\theta_R^2\,p(1 - p)}$$
that is the same as the sample size using only one response
Otherwise, combining the responses reduces the sample size by up to two thirds, depending on the correlation between the responses
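Now suppose that p = 0.2 and that g = 0.1 (pairwise correlation $(g - p^2)/\{p(1 - p)\} = 0.375$)
then for one response
$$n = \frac{42}{0.16\,\theta_R^2} = \frac{262.5}{\theta_R^2}$$
and for three responses
$$n = \frac{42\{0.16 + 2(0.10 - 0.04)\}}{3 \times 0.16^2\,\theta_R^2} = \frac{153.1}{\theta_R^2}$$
58% of the sample size using a single response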
If the success rate on control is 18%, and the trial is to be powered to detect an improvement to 22%, then the log-odds ratio is
$$\theta_R = \log\left\{\frac{0.22(1 - 0.18)}{0.18(1 - 0.22)}\right\} = 0.25$$
so that, for one response
n = 4200
and for three responses
n = 2450
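These calculations are easy to reproduce. The sketch below assumes equal allocation, three binary responses with a common success probability p, and a common pairwise joint success probability g; the function names are hypothetical. It uses the unrounded constant (1.960 + 1.282)² ≈ 10.51 rather than 10.5, so its results differ from the rounded slide values by a few patients.

```python
import math

def log_odds_ratio(p_control, p_experimental):
    """Log-odds ratio of success, experimental relative to control."""
    return math.log(p_experimental * (1 - p_control) / (p_control * (1 - p_experimental)))

def required_information(theta_r, z_alpha=1.960, z_beta=1.282):
    """Information V (or V*) for 90% power at two-sided level 0.05 by default."""
    return ((z_alpha + z_beta) / theta_r) ** 2

def n_single(theta_r, p):
    """Total sample size using one binary response."""
    return 4 * required_information(theta_r) / (p * (1 - p))

def n_three(theta_r, p, g):
    """Total sample size combining three binary responses."""
    v_star = required_information(theta_r)
    return 4 * (p * (1 - p) + 2 * (g - p * p)) * v_star / (3 * (p * (1 - p)) ** 2)

theta_r = log_odds_ratio(0.18, 0.22)
print(round(theta_r, 4))
print(round(n_single(0.25, 0.2)), round(n_three(0.25, 0.2, 0.1)))
print(round(n_three(0.25, 0.2, 0.1) / n_single(0.25, 0.2), 3))
```

The ratio of the two sample sizes depends only on p and g, not on the target log-odds ratio, which is why the 58% figure holds for any value of the effect size.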
ICTUS trial
Fixed sample size using only
    Barthel: 2590
    modified Rankin: 3584
    NIH stroke scale: 5494
Combined test: 2421
This is for dichotomised responses, based on the previous data available
ICTUS is using a sequential design
Ordinal scales
For sample size calculation when combining several ordinal responses, the probability of every pair of response categories on every pair of scales must be anticipated
Databases from previous trials can be used
A mid-trial sample size review can be used
4. Evaluation of the combined approach
The first of a series of interim analyses of the ICTUS trial takes place when data from 1000 patients are available
A dataset from four previous studies comparing citicoline with placebo is available (Davalos et al., 2002) comprising 1,372 patients
First, one dataset of 1,000 was extracted and analysed using the combined test and the GEE approach
Then 10,000 datasets of size 200, 500 or 1,000 were randomly selected; in each, the treatment code was removed and randomly reassigned
In some runs an artificial treatment effect of known magnitude was introduced
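The re-randomisation idea can be illustrated on a small scale. The sketch below is not the simulation reported in the talk: the data are generated artificially by a hypothetical `synthetic_outcomes` function, only binary responses are used, no treatment effect is introduced, and the number of replicates is far smaller. It reassigns the treatment code at random and counts how often the combined test rejects at the two-sided 5% level.

```python
import random

def combined_chi_square(treatment, outcomes):
    """Return Z^2 / (V + C), which equals Z*^2 / V*, for several binary scales."""
    n = len(treatment)
    n_e = sum(treatment)
    n_c = n - n_e
    m = len(outcomes[0])
    totals = [sum(y[i] for y in outcomes) for i in range(m)]
    z = 0.0
    for i in range(m):
        e1 = sum(y[i] for x, y in zip(treatment, outcomes) if x == 1)
        z += (n_c * e1 - n_e * (totals[i] - e1)) / n
    variance = 0.0
    for i in range(m):
        for j in range(m):
            # i == j gives V_i; i != j gives C_ij, so the double sum is V + C
            both = sum(y[i] * y[j] for y in outcomes)
            variance += n_c * n_e * (n * both - totals[i] * totals[j]) / n ** 3
    return z * z / variance

def synthetic_outcomes(n, rng):
    """Artificial correlated binary responses on three scales; not trial data."""
    data = []
    for _ in range(n):
        shared = rng.random()  # patient-level component that induces correlation
        data.append(tuple(int(0.5 * shared + 0.5 * rng.random() < 0.4) for _ in range(3)))
    return data

def null_rejection_rate(outcomes, n_reps=500, seed=2011, critical=3.841):
    """Proportion of random treatment reassignments in which the test rejects."""
    rng = random.Random(seed)
    n = len(outcomes)
    labels = [0] * (n // 2) + [1] * (n - n // 2)
    rejections = 0
    for _ in range(n_reps):
        rng.shuffle(labels)  # treatment code removed and randomly reassigned
        if combined_chi_square(labels, outcomes) > critical:
            rejections += 1
    return rejections / n_reps

data = synthetic_outcomes(200, random.Random(0))
print(null_rejection_rate(data))
```

Because the labels carry no information about outcome, the printed proportion should be close to the nominal 5% if the variance V + C is correctly specified.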
Analyses of a synthetic stroke dataset, n = 1000
Response   Adjusting for          Method   Z*      V*      $\hat\theta$   sd($\hat\theta$)   p
binary     no factors             GEE      15.55   68.19   0.2280         0.1211             0.0597
                                  comb     15.57   68.94   0.2259         0.1204             0.0607
           all factors            GEE      17.27   60.09   0.2874         0.1290             0.0259
                                  comb     17.84   62.85   0.2839         0.1261             0.0244
           all factors + centre   GEE      Failed to converge
                                  comb     19.64   58.23   0.3373         0.1311             0.0100
ordinal    no factors             GEE       9.02   89.85   0.1004         0.1055             0.3413
                                  comb      7.84   83.92   0.0935         0.1092             0.3918
           all factors            GEE      17.15   89.17   0.1923         0.1059             0.0695
                                  comb     15.18   82.41   0.1842         0.1102             0.0945
           all factors + centre   GEE      20.96   79.86   0.2624         0.1119             0.0190
                                  comb     18.49   80.33   0.2302         0.1116             0.0391
Results from 10,000-fold simulations of the combined score test and the GEE approach
                                             # rejections according to   $\hat\theta$ from
sample size   hypothesis   true $\theta$     comb    GEE     both        comb     GEE
200           H0           0                 232     268     228         0.002    0.002
              H1           0.781             9170    9186    9122        0.795    0.808
500           H0           0                 251     255     239         0.001    0.001
              H1           0.494             8950    8942    8894        0.480    0.472
1000          H0           0                 227     226     211         0.000    0.000
              H1           0.349             9010    8995    8956        0.345    0.334
5. Conclusions
Use of the combined approach can reduce sample size, provided that the treatment effect is apparent on all responses being combined
The score approach used here matches the GEE approach, and is more reliable in small samples
The approach can also combine quantitative responses and survival responses, and can be used to combine different types of response
References
Bolland, K., Whitehead, J., Cobo, E. and Secades, J. J. (2009). Evaluation of a sequential global test of improved recovery following stroke as applied to the ICTUS trial of citicoline. Pharmaceutical Statistics 8, 136-149.
Dávalos, A., Castillo, J., Álvarez-Sabin, J., Secades, J. J., Mercadal, J., López, S., Cobo, E., Warach, S., Sherman, D., Clark, W. M. and Lozano, R. (2002). Oral citicoline in acute ischemic stroke. Stroke 33, 2850-2857.
Dávalos, A. (2007). Protocol 06PRT/3005: ICTUS study: International Citicoline Trial on acUte Stroke (NCT00331890) Oral citicoline in acute ischemic stroke. Lancet Protocol Reviews.
Pocock, S. J., Geller, N. L. and Tsiatis, A. A. (1987). The analysis of multiple endpoints in clinical trials. Biometrics 43, 487-498.
Tilley, B. C., Marler, J., Geller, N. L., Lu, M., Legler, J., Brott, T., Lyden, P. and Grotta, J. for the National Institute of Neurological Disorders and Stroke (NINDS) rt-PA Stroke Trial Study Group. (1996). Use of a global test for multiple outcomes in stroke trials with application to the National Institute of Neurological Disorders and Stroke t-PA Stroke Trial. Stroke 27, 2136-2142.
Whitehead, J., Branson, M. and Todd, S. (2010). A combined score test for binary and ordinal endpoints from clinical trials. Statistics in Medicine 29, 521-532.