Seminar, Bordeaux School of Public Health
8 June 2011

Combining endpoints in clinical trials to increase power

John Whitehead
Medical and Pharmaceutical Statistics Research Unit
Department of Mathematics and Statistics, Fylde College
Lancaster University, Lancaster LA1 4YF, UK
Tel: +44 1524 592350
Fax: +44 1524 592681
E-mail: [email protected]


MPS Research Unit

1. Ordinal endpoints in stroke studies

Treatments for acute stroke are administered for a few days following diagnosis

The primary endpoint is the functional status of the patient, 90 days after the stroke

Several scoring systems exist, including the Barthel index, the modified Rankin score and the NIH stroke scale

All are ordinal scales from full recovery to vegetative state, to which death before 90 days can be added


Analysis of an ordinal response

$R_1$ = best response (full recovery)

$R_k$ = worst response (death before day 90)

Response   Control   Experimental   Total
R_1        c_1       e_1            t_1
R_2        c_2       e_2            t_2
...        ...       ...            ...
R_k        c_k       e_k            t_k
Total      n_C       n_E            n


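Let

$\underline{C}_h = c_1 + \dots + c_h$

$\underline{C}_h$ = the number of controls with response $R_h$ or better

Let

$\overline{C}_h = c_h + \dots + c_k$

$\overline{C}_h$ = the number of controls with response $R_h$ or worse

Similarly define $\underline{E}_h$, $\overline{E}_h$, $\underline{T}_h$ and $\overline{T}_h$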


Let

$Q_{Ch}$ = P(a control has response $R_h$ or better)

$Q_{Eh}$ = P(an experimental has response $R_h$ or better)

(then $Q_{Ck} = Q_{Ek} = 1$)

Put

$$\theta_h = \log\left\{\frac{Q_{Eh}(1 - Q_{Ch})}{Q_{Ch}(1 - Q_{Eh})}\right\}, \quad h = 1, \dots, k - 1$$

$\theta_h$ is the log-odds ratio for response $R_h$ or better, E:C


The proportional odds assumption is

$$\theta_1 = \theta_2 = \dots = \theta_{k-1} = \theta$$

The common value, $\theta$, is a measure of the advantage of the experimental treatment

$\theta > 0$: experimental better

$\theta = 0$: no difference

$\theta < 0$: control better


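Under PO, the most efficient test of treatment advantage (greatest power for any given sample size) is based on the test statistics

$$Z = \frac{1}{n}\sum_{h=1}^{k} c_h\left(\underline{E}_{h-1} - \overline{E}_{h+1}\right)$$

and

$$V = \frac{n_C n_E}{3n}\left\{1 - \sum_{h=1}^{k}\left(\frac{t_h}{n}\right)^3\right\}$$

where $\underline{E}_{h-1} = e_1 + \dots + e_{h-1}$ and $\overline{E}_{h+1} = e_{h+1} + \dots + e_k$ (each is zero when its range is empty)

For large samples and small $\theta$, approximately $Z \sim N(\theta V, V)$

$Z$ is the score statistic and $V$ is Fisher's information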


To test for treatment difference, refer $Z^2/V$ to $\chi^2_1$

This is the Mann-Whitney test

Also known as the Wilcoxon test

Under the null hypothesis of no treatment effect, PO is true with $\theta = 0$

Thus the hypothesis test and the p-value are valid without assumptions

Estimates of $\theta$ and confidence intervals for $\theta$ do rely on assumptions, as does adjustment for prognostic factors
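The score test for a single ordinal response can be sketched in Python as follows. This is a minimal illustration, not the speaker's code: the function name and the counts in the usage line are hypothetical. For each control category it counts the experimentals with a better response minus those with a worse response, divides the total by n to obtain Z, computes V from the pooled category totals, and refers Z²/V to chi-squared on 1 degree of freedom.

```python
import math


def ordinal_score_test(control, experimental):
    """Score test for one ordinal response.

    control, experimental: category counts listed from best (R1) to worst (Rk).
    Returns (Z, V, two-sided p-value).
    """
    k = len(control)
    n_c, n_e = sum(control), sum(experimental)
    n = n_c + n_e

    z = 0.0
    for h in range(k):
        better = sum(experimental[:h])     # experimentals with a response better than R_h
        worse = sum(experimental[h + 1:])  # experimentals with a response worse than R_h
        z += control[h] * (better - worse)
    z /= n

    totals = [c + e for c, e in zip(control, experimental)]
    v = n_c * n_e / (3 * n) * (1 - sum((t / n) ** 3 for t in totals))

    chi_squared = z * z / v
    # Upper tail of chi-squared on 1 df equals erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi_squared / 2))
    return z, v, p_value


# Hypothetical counts for a four-category scale (illustration only)
z, v, p = ordinal_score_test(control=[10, 20, 40, 30], experimental=[18, 27, 35, 20])
```

With two categories the function reduces to the binary score statistics, which gives a quick way to check it by hand.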


How should investigators choose which scale to use?

An alternative to choosing is to combine more than one stroke scale in the analysis

Tilley et al. (1996) combined four scales in the trial of rTPA as a treatment in acute stroke conducted by the National Institute of Neurological Disorders and Stroke

the trial was positive and the approach caught on

If the treatment has a beneficial effect on all scales, then combining them will increase the power to demonstrate the advantage of the treatment


2. Example: The ICTUS trial in stroke

• Currently ongoing in 60 centres in Europe

• Patients who have suffered acute stroke

• Randomised between citicoline and placebo

• Assessed at 90 days on Barthel index, modified Rankin score and NIH stroke scale

• Prognostic factors:
    - baseline NIHSS
    - time from stroke to treatment (≤ or > 12 hours)
    - age (≤ or > 70 years)
    - site of stroke (right or left side)
    - use of rTPA (yes or no)


The approach used by Tilley et al.

Combine the three analyses using GEE (based on an independence covariance structure: IEE)

That is, analyse as if the three scores were independent, but adjust the standard error of the treatment effect estimate using the sandwich estimator

• complicated to understand
• no associated sample size formula
• failed in test data set of 1000 patients with binary responses and adjustment for 60 centres


An alternative general approach

The log-odds ratio and the test statistics $Z$ and $V$ for the analysis of the $i$th response will be denoted by $\theta_i$, $Z_i$ and $V_i$

i = 1 is Barthel index
i = 2 is modified Rankin score
i = 3 is NIH stroke scale

We will test $H_0$: $\theta_1 = \theta_2 = \theta_3 = 0$ (no effect of treatment on any of the scales) using

$$Z = Z_1 + Z_2 + Z_3$$


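For each scale,

$$Z_i \sim N(\theta_i V_i, V_i)$$

if $V_i$ is large and $\theta_i$ is small

If $\theta_1 = \theta_2 = \theta_3 = \theta$, then approximately

$$Z \sim N(\theta V, V + C)$$

where $V = V_1 + V_2 + V_3$, $C = 2(C_{12} + C_{23} + C_{31})$ and

$$C_{ij} = \mathrm{cov}(Z_i, Z_j)$$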


It follows that, if

$$Z^* = \frac{ZV}{V + C} \quad \text{and} \quad V^* = \frac{V^2}{V + C}$$

then

$$Z^* \sim N(\theta V^*, V^*)$$

as required for a $\chi^2$ test and for sample size calculation

What we need to use this is an expression for

$$C_{ij} = \mathrm{cov}(Z_i, Z_j)$$
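As a minimal sketch of the rescaling step, the function below sums the per-scale statistics, adds up the off-diagonal covariances to form C, and returns Z* = ZV/(V + C) and V* = V²/(V + C). The function name and the numbers in the usage lines are hypothetical and serve only to illustrate the arithmetic.

```python
def combine_score_statistics(z, v, cov):
    """Combine per-scale score statistics into (Z*, V*).

    z, v : lists holding Z_i and V_i, one entry per scale
    cov  : symmetric matrix with cov[i][j] = C_ij for i != j (diagonal ignored)
    """
    m = len(z)
    z_sum = sum(z)
    v_sum = sum(v)
    # Summing all off-diagonal entries gives C = 2(C12 + C23 + C31) for three scales
    c = sum(cov[i][j] for i in range(m) for j in range(m) if i != j)
    z_star = z_sum * v_sum / (v_sum + c)
    v_star = v_sum ** 2 / (v_sum + c)
    return z_star, v_star


# Hypothetical values for three scales (illustration only, not trial data)
z_star, v_star = combine_score_statistics(
    z=[5.0, 4.0, 6.0],
    v=[9.0, 9.0, 9.0],
    cov=[[0.0, 4.5, 4.5], [4.5, 0.0, 4.5], [4.5, 4.5, 0.0]],
)
chi_squared = z_star ** 2 / v_star  # refer to chi-squared on 1 degree of freedom
```

When all covariances are zero the function returns the plain sums Z and V; when the scales coincide it returns the statistics of a single scale.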


The binary case, no covariates: only one response

                Control   Experimental   Total
Success (R_1)   c_1       e_1            t_1
Failure (R_2)   c_2       e_2            t_2
Total           n_C       n_E            n

$$Z = \frac{n_C e_1 - n_E c_1}{n} \quad \text{and} \quad V = \frac{n_C n_E t_1 t_2}{n^3}$$


The binary case, no covariates: $i$th of several responses

assuming that each patient provides all responses

                Control   Experimental   Total
Success (R_1)   c_i1      e_i1           t_i1
Failure (R_2)   c_i2      e_i2           t_i2
Total           n_C       n_E            n

$$Z_i = \frac{n_C e_{i1} - n_E c_{i1}}{n} \quad \text{and} \quad V_i = \frac{n_C n_E t_{i1} t_{i2}}{n^3}$$


Covariance between Zi and Zj

For two such statistics, we have

$$C_{ij} = \mathrm{cov}(Z_i, Z_j) = \frac{n_C n_E}{n^3}\left(n\,t_{(ij),1} - t_{i1} t_{j1}\right)$$

where $t_{i1}$ is the number of patients succeeding on the $i$th scale, $t_{j1}$ the number succeeding on the $j$th scale and $t_{(ij),1}$ the number succeeding on both scales (Pocock, Geller and Tsiatis, 1987)
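The binary score statistics and their covariance can be sketched as below. The function names and the counts are hypothetical. The sketch assumes the formulas Z = (n_C e_1 − n_E c_1)/n, V = n_C n_E t_1 t_2/n³ and C_ij = n_C n_E (n t_(ij),1 − t_i1 t_j1)/n³.

```python
def binary_score(c1, e1, n_c, n_e):
    """Z and V for one binary response; c1, e1 are the success counts per arm."""
    n = n_c + n_e
    t1 = c1 + e1   # total successes
    t2 = n - t1    # total failures
    z = (n_c * e1 - n_e * c1) / n
    v = n_c * n_e * t1 * t2 / n ** 3
    return z, v


def binary_covariance(t_i1, t_j1, t_both, n_c, n_e):
    """Covariance of Z_i and Z_j for two binary responses on the same patients.

    t_i1, t_j1 : total successes on scales i and j
    t_both     : number of patients succeeding on both scales
    """
    n = n_c + n_e
    return n_c * n_e * (n * t_both - t_i1 * t_j1) / n ** 3


# Hypothetical counts, 100 patients per arm (illustration only)
z_i, v_i = binary_score(c1=20, e1=30, n_c=100, n_e=100)
c_ij = binary_covariance(t_i1=50, t_j1=60, t_both=40, n_c=100, n_e=100)
```

Two checks follow directly from the formula: if the two scales coincide, the covariance equals V_i, and if successes on the two scales occur independently in the pooled data, the covariance is zero.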


The ordinal case, no covariates: $i$th of several responses

Response   Control   Experimental   Total
R_1        c_i1      e_i1           t_i1
...        ...       ...            ...
R_k        c_ik      e_ik           t_ik
Total      n_C       n_E            n

with $\underline{C}_{ih} = c_{i1} + \dots + c_{ih}$ and $\overline{C}_{ih} = c_{ih} + \dots + c_{ik}$


Covariance between Zi and Zj

For two such statistics, we have

$$C_{ij} = \frac{n_C n_E}{n^2}\sum_{f,g,v,w}\delta_{fv}\,\delta_{gw}\left(H_{fg}H_{vw} + n_C H_{fg}K_{vw} + n_E K_{fg}H_{vw}\right)$$

where $\delta_{fv} = -1$, 0 or 1 if $f <, =, > v$ respectively,

$K_{fg} = t_{fi}\,t_{gj}/n^2$, $H_{fg} = t_{(ij),(fg)}/n - K_{fg}$,

$t_{(ij),(fg)}$ is the count of patients who have both response $R_{f,i}$ on the $i$th scale and response $R_{g,j}$ on the $j$th scale
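A direct, unoptimised sketch of the ordinal covariance follows. It assumes the formula reads C_ij = (n_C n_E/n²) Σ δ_fv δ_gw (H_fg H_vw + n_C H_fg K_vw + n_E K_fg H_vw), with δ_fv = −1, 0 or 1 for f <, =, > v, K_fg = t_fi t_gj/n² and H_fg = t_(ij),(fg)/n − K_fg. The function names and the table of counts are hypothetical.

```python
def sign(a, b):
    """Return -1, 0 or 1 according to whether a <, =, > b."""
    return (a > b) - (a < b)


def ordinal_covariance(joint, n_c, n_e):
    """Covariance of Z_i and Z_j for two ordinal responses.

    joint[f][g] = number of patients (both arms pooled) with response R_f
    on scale i and response R_g on scale j.
    """
    n = n_c + n_e
    rows, cols = len(joint), len(joint[0])
    t_i = [sum(row) for row in joint]                                   # totals on scale i
    t_j = [sum(joint[f][g] for f in range(rows)) for g in range(cols)]  # totals on scale j
    K = [[t_i[f] * t_j[g] / n ** 2 for g in range(cols)] for f in range(rows)]
    H = [[joint[f][g] / n - K[f][g] for g in range(cols)] for f in range(rows)]

    total = 0.0
    for f in range(rows):
        for g in range(cols):
            for v in range(rows):
                for w in range(cols):
                    d = sign(f, v) * sign(g, w)
                    if d:
                        total += d * (H[f][g] * H[v][w]
                                      + n_c * H[f][g] * K[v][w]
                                      + n_e * K[f][g] * H[v][w])
    return n_c * n_e / n ** 2 * total


# Hypothetical 3 x 3 cross-classification of 200 patients (illustration only)
joint_counts = [[30, 10, 5], [15, 40, 20], [5, 25, 50]]
c_ij = ordinal_covariance(joint_counts, n_c=100, n_e=100)
```

For a 2 × 2 table the sum collapses to the binary covariance n_C n_E (n t_(ij),1 − t_i1 t_j1)/n³, which provides a convenient check on the implementation.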


Adjustment for covariates

The approach can be extended to allow for prognostic factors via stratification and/or linear modelling of covariates

Stratification: sum Z and V statistics over strata, and assume that the treatment effect is constant over strata

Covariate adjustment: use proportional hazards regression, plus binary logistic regression to model the simultaneous occurrence of particular responses on different scales (such as complete recovery on Barthel index and partial recovery on the modified Rankin)


3. Sample size calculation for the combined test

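For power of 90% to detect a log-odds ratio of $\theta_R$ as significant at level 0.05 (two-sided), we need

$$V = \left(\frac{1.960 + 1.282}{\theta_R}\right)^2 = \frac{10.5}{\theta_R^2}$$

for a test based on a single response, and

$$V^* = \left(\frac{1.960 + 1.282}{\theta_R}\right)^2 = \frac{10.5}{\theta_R^2}$$

for a test based on the combined approach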


For a single binary (success/fail) response, with an overall success probability of $p$,

$$n = \frac{4V}{p(1 - p)} = \frac{42}{p(1 - p)\,\theta_R^2}$$

For three binary responses, each having an overall success probability of $p$, and with the probability of success on any two responses being $g$,

$$n = \frac{4\{p(1 - p) + 2(g - p^2)\}\,V^*}{3p^2(1 - p)^2} = \frac{42\{p(1 - p) + 2(g - p^2)\}}{3p^2(1 - p)^2\,\theta_R^2}$$


Suppose that $g = p^2$ (independence), then

$$n = \frac{42\,p(1 - p)}{3p^2(1 - p)^2\,\theta_R^2} = \frac{42}{3p(1 - p)\,\theta_R^2}$$

that is one third of the sample size using only one response

For $g = p$ (responses coincide), then

$$n = \frac{42\{p(1 - p) + 2p(1 - p)\}}{3p^2(1 - p)^2\,\theta_R^2} = \frac{42}{p(1 - p)\,\theta_R^2}$$

that is the same as the sample size using only one response

Otherwise, combining the responses reduces the sample size, to as little as one third of that needed for a single response, depending on the correlation between the responses

Now suppose that $p = 0.2$ and that $g = 0.1$ (correlation = 0.75)

then for one response

$$n = \frac{42}{0.16\,\theta_R^2} = \frac{262.5}{\theta_R^2}$$

and for three responses

$$n = \frac{42\{0.16 + 2(0.10 - 0.04)\}}{3 \times 0.16^2\,\theta_R^2} = \frac{153.1}{\theta_R^2}$$

58% of the sample size using a single response

If the success rate on control is 18%, and the trial is to be powered to detect an improvement to 22%, then the log-odds ratio is

$$\theta_R = \log\left\{\frac{0.22(1 - 0.18)}{0.18(1 - 0.22)}\right\} = 0.25$$

so that, for one response

n = 4200

and for three responses

n = 2450
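The sample size arithmetic of this section can be reproduced with a short sketch. The function names are hypothetical, and the code assumes equal allocation to the two arms, three binary responses with a common success probability p, and the rounded constant 10.5 for (1.960 + 1.282)² used on the slides.

```python
import math

INFO_CONSTANT = 10.5  # (1.960 + 1.282)^2 rounded, as on the slides: 90% power, two-sided 0.05


def log_odds_ratio(p_control, p_experimental):
    """Log-odds ratio for success, experimental relative to control."""
    return math.log(p_experimental * (1 - p_control) / (p_control * (1 - p_experimental)))


def n_single(p, theta, info=INFO_CONSTANT):
    """Total sample size for a test based on one binary response."""
    return 4 * info / (p * (1 - p) * theta ** 2)


def n_combined(p, g, theta, info=INFO_CONSTANT):
    """Total sample size for the combined test on three binary responses.

    g is the probability of success on any two of the responses.
    """
    var = p * (1 - p)
    return 4 * info * (var + 2 * (g - p * p)) / (3 * var ** 2 * theta ** 2)


theta_r = round(log_odds_ratio(0.18, 0.22), 2)  # rounded to two decimals, as on the slide
one_response = n_single(0.2, theta_r)
three_responses = n_combined(0.2, 0.1, theta_r)
```

Setting g = p² in `n_combined` gives one third of `n_single`, and setting g = p gives the same value as `n_single`, matching the two limiting cases described earlier.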

ICTUS trial

Fixed sample size using only

Barthel: 2590
modified Rankin: 3584
NIH stroke scale: 5494

Combined test: 2421

This is for dichotomised responses, based on the previous data available

ICTUS is using a sequential design


Ordinal scales

For sample size calculation when combining several ordinal responses, the probabilities of every pair of responses on every pair of scales must be anticipated

Databases from previous trials can be used

A mid-trial sample size review can be used


4. Evaluation of the combined approach

The first of a series of interim analyses of the ICTUS trial takes place when data from 1000 patients are available

A dataset from four previous studies comparing citicoline with placebo is available (Davalos et al., 2002) comprising 1,372 patients

First, one dataset of 1,000 was extracted and analysed using the combined test and the GEE approach

Then 10,000 datasets of size 200, 500 or 1,000 were randomly selected, the treatment code was removed and randomly reassigned

in some runs an artificial treatment effect of known magnitude was introduced


Analyses of a synthetic stroke dataset, n = 1000

Response   Adjusting for          Method   Z*      V*      θ̂        sd       p
binary     no factors             GEE      15.55   68.19   0.2280   0.1211   0.0597
binary     no factors             comb     15.57   68.94   0.2259   0.1204   0.0607
binary     all factors            GEE      17.27   60.09   0.2874   0.1290   0.0259
binary     all factors            comb     17.84   62.85   0.2839   0.1261   0.0244
binary     all factors + centre   GEE      Failed to converge
binary     all factors + centre   comb     19.64   58.23   0.3373   0.1311   0.0100
ordinal    no factors             GEE       9.02   89.85   0.1004   0.1055   0.3413
ordinal    no factors             comb      7.84   83.92   0.0935   0.1092   0.3918
ordinal    all factors            GEE      17.15   89.17   0.1923   0.1059   0.0695
ordinal    all factors            comb     15.18   82.41   0.1842   0.1102   0.0945
ordinal    all factors + centre   GEE      20.96   79.86   0.2624   0.1119   0.0190
ordinal    all factors + centre   comb     18.49   80.33   0.2302   0.1116   0.0391


Results from 10,000-fold simulations of the combined score test and the GEE approach

Sample                           # rejections according to   θ̂ from
size     Hypothesis   True θ     comb    GEE     both        comb    GEE
200      H0           0          232     268     228         0.002   0.002
200      H1           0.781      9170    9186    9122        0.795   0.808
500      H0           0          251     255     239         0.001   0.001
500      H1           0.494      8950    8942    8894        0.480   0.472
1000     H0           0          227     226     211         0.000   0.000
1000     H1           0.349      9010    8995    8956        0.345   0.334


5. Conclusions

Use of the combined approach can reduce sample size, provided that the treatment effect is apparent on all responses being combined

The score approach used here matches the GEE approach, and is more reliable in small samples

The approach can combine quantitative responses and survival responses; it can also be used to combine different types of response


References

Bolland, K., Whitehead, J., Cobo, E. and Secades, J. J. (2009). Evaluation of a sequential global test of improved recovery following stroke as applied to the ICTUS trial of citicoline. Pharmaceutical Statistics 8, 136-149.

Dávalos A, Castillo J, Álvarez-Sabin J, Secades JJ, Mercadal J, López S, Cobo E, Warach S, Sherman D, Clark WM, Lozano R. (2002). Oral citicoline in acute ischemic stroke. Stroke 33, 2850-2857.

Dávalos A. (2007). Protocol 06PRT/3005: ICTUS study: International Citicoline Trial on acUte Stroke (NCT00331890) Oral citicoline in acute ischemic stroke. Lancet Protocol Reviews.

Pocock, S.J., Geller, N. L. and Tsiatis, A. A. (1987). The analysis of multiple endpoints in clinical trials. Biometrics 43, 487-498.

Tilley, B. C., Marler, J., Geller, N. L., Lu, M., Legler, J., Brott, T., Lyden, P. and Grotta, J. for the National Institute of Neurological Disorders and Stroke (NINDS) rt-PA Stroke Trial Study Group. (1996). Use of a global test for multiple outcomes in stroke trials with application to the National Institute of Neurological Disorders and Stroke t-PA Stroke Trial. Stroke 27, 2136-2142.

Whitehead, J., Branson, M. and Todd, S. (2010). A combined score test for binary and ordinal endpoints from clinical trials. Statistics in Medicine 29, 521-532.
