SECONDARY ANALYSES IN CLINICAL TRIALS

2013 CTN Web Seminar Series

Produced by: NIDA CTN CCC Training Office

"This training has been funded in whole or in part with Federal funds from the National Institute on Drug Abuse, National Institutes of Health, Department of Health and Human Services, under Contract No. HHSN271201000024C."

SECONDARY ANALYSES IN CLINICAL TRIALS

Presented by:
George Bigelow, PhD
Daniel J. Feaster, PhD
Abigail G. Matthews, PhD

December 6, 2013

2

Objectives
• Review the statistical issues with analyzing and interpreting secondary analyses, and demonstrate the multiple testing burden
• Explain the importance of secondary outcome and analysis identification during protocol development
• Discuss reporting and interpretation of secondary analyses, including the perspective of the CTN Publications Committee

3

Outline
• Introduction and motivation
• Statistician’s perspective
• Multiplicity
• Implementing secondary analyses
• Summary
• Discussion

4

PUBLICATIONS COMMITTEE PERSPECTIVE ON SECONDARY ANALYSES: OPPORTUNITIES & CAUTIONS

5

Opportunities
• CTN encourages multiple publications
• We want to learn as much as possible
• Large-N, diverse, multi-site studies
• Broad study teams with diverse interests
• Extensive investment in assessments
• Repeated assessments over time
• Assessment commonalities across studies

6

Cautions
• CTN studies typically yield multiple publications
• Question: Are we over-analyzing the data?
• Discussions in Publications and Executive Committees
• Consequence: This webinar

7

Cautions (cont.)
• Multiple testing incurs risk of false conclusions
• Proper planning can reduce this risk
• Acknowledgement of limitations is essential

8

Example of Multiple Publications
• CTN006/007: MIEDAR – Abstinence-Contingent Incentives
• Report of primary outcome
• Do contingent incentives reduce stimulant use?

9

Example of Multiple Publications
CTN006/007: MIEDAR – Abstinence-Contingent Incentives

Reports of secondary outcomes: Do contingent incentives affect…
…HIV risk behavior?
…gambling?
…cost or cost effectiveness?
…methamphetamine use?
…staff attitudes?

10

Example of Multiple Publications
CTN006/007: MIEDAR – Abstinence-Contingent Incentives

Reports of moderator variable associations: Are incentive effects related to…
…gender, race, ethnicity?
…treatment history?
…criminal justice involvement?
…urinalysis result at intake?
…gambling history?

11

A Caution About Demographic Subgroup Differences
Be cautious of thinking of subgroup differences as inherent characteristics of those groups or of individuals within those groups

Demographic subgroup differences are very likely the result of some correlated confounding variable; they likely reflect differences in life experiences and opportunities, and the contexts in which drugs are encountered

12

Example of Multiple Publications
CTN006/007: MIEDAR – Abstinence-Contingent Incentives

Reports of associations unrelated to the study intervention:

What symptoms are related to dependence on various drugs?

13

Types of Analyses
• Intervention Effects on Primary Outcome

• Intervention Effects on Secondary Outcomes

• Analysis of Moderators or Mediation

• Associations

14

Common Errors
• Mistaking correlations for causes

• Mis-describing the study methods

• Overlooking explanatory confounding variables

• Failing to acknowledge limitations

15

Correlation is Not Causation
Avoid language that implies causality when reporting associations

Examples:
“effect on”, “effect of”
“impact on”, “impact of”
“consequently”, etc.

16

Describe Methods Accurately
• Understand and describe original study accurately

• Explain origins and methods of secondary analysis

• Idea should precede looking through the data

• Describe types and numbers of analyses performed

17

Consider Confounding Factors
• One report examined relationship between study pay and proportion of participants present at the final assessment

• Proposed implausible conclusion that greater pay led to less retention

• Failed to note that pay amount was related to study duration and difficulty

18

Proceed with Caution
Many of the factors that must be considered in conducting and reporting secondary analyses are the same as those important for careful and thoughtful reporting of primary analyses

However, secondary analyses can also involve some special statistical considerations, as will be discussed by the following speakers

19

STATISTICIAN’S PERSPECTIVE AND THE ISSUE OF MULTIPLICITY

20

What do we mean by primary and secondary outcomes and analyses?

• The primary outcome is the main outcome variable for the study
– Research hypothesis based on this measure
– Used to power study and determine statistical significance of any treatment effect
– Analytic method must be specified a priori in the Statistical Analysis Plan (SAP) at a minimum

21

What do we mean by primary and secondary outcomes and analyses?

• Secondary analyses are any other analyses, e.g.:
– Sensitivity analysis of primary outcome measure with respect to missing data
– Subgroup analyses by age, race, gender, ethnicity, disease severity, etc.
• Secondary outcomes are any other outcome measures

22

Why secondary analyses?
• Publish or perish!!!
• Possible that primary outcome measure data ends up being unreliable
– e.g., using TLFB (Timeline Followback) but high rate of discordance between self-report and UDS (urine drug screen)
• Analytic issues with pre-specified primary analysis
– e.g., proposed distribution of the primary outcome variable does not hold and alternative methods should be used

23

Why secondary analyses? (cont.)
• Possibly poor power of primary analysis if assumptions used in sample size are not appropriate
• Sensitivity analyses of primary outcome with respect to missing data – key for addiction research
• Subgroup analyses
– Race, ethnicity, gender required by NIH
– Baseline severity of disease (Nunes et al., 2011)

SPECIFY A PRIORI IN PROTOCOL OR SAP AS MUCH AS POSSIBLE!

24

Why secondary outcomes?
• Publish or perish!!!
• In addiction research we always focus on abstinence but is that enough?
– Improved overall quality of life
– Engaging in less risky sexual behaviors
– Less illegal activity such as theft or prostitution
– CTN TEAM Task Force recommends at least one secondary outcome measure be related to “functioning, satisfaction, or quality of life”

25

Why secondary outcomes? (cont.)
• Again, what if there are unanticipated issues with the primary outcome?
• Cannot “hang your hat” on only one outcome measure

SPECIFY A PRIORI IN PROTOCOL OR SAP AS MUCH AS POSSIBLE!

Example of Utility of Secondary Analyses
• CTN: Women with Trauma and Addictions
• Primary outcome results (Hien et al., 2009):
– Trauma symptom severity: NS
– Abstinence: NS
• Secondary analyses:
– Women with baseline eating disorders had significantly less improvement in PTSD severity and abstinence
– SS significantly reduces unprotected sex in high risk women over time
– Racial/ethnic matching with therapist associated with SS effectiveness
– Examples of other positive findings: retention, sleep disorders, intimate partner violence

Denise Hien CPDD 2013 Presentation:

28

Words of Warning
• Too many post hoc analyses open one to accusations of data dredging

• Secondary analyses/outcomes cannot be used to evaluate the trial as a whole (only primary outcome)

• If there are a substantial number of pre-specified secondary outcomes and analyses, consider adjusting for multiple comparisons

• Appropriate interpretation of results is key
– Hypothesis generating

29

Cautionary Tale

Scott Harkonen, MD

• Convicted of wire fraud: “willfully overstating in a press release the evidence for benefit of a drug his company made”

• Primary outcome p-value=0.08
• Asked his statisticians to identify sub-group with significance
• Patients with mild to moderate disease severity: p=0.004
• Press release acknowledged negative finding from primary outcome analysis but maintained drug associated with increased survival

• Post: “…everyone agrees there weren’t any factual errors in the four-page document. The numbers were right; it’s the interpretation of them that was deemed criminal.”

• During appeal court said: “Statements are fraudulent if ‘misleading or deceptive’ and need not be literally ‘false’.”

30

Why such controversy?
Multiplicity
• Type I error is preserved (usually 5%) for primary outcome(s)
• If performing multiple secondary analyses, then overall Type I error will be higher
• If enough analyses are performed, at least one spurious association becomes very likely

• Adjustment not necessary for secondary analyses/outcomes, but interpretation must be cautious and presentation of results forthright and transparent

31

Illustration of Multiplicity
• Generate 10 outcome variables independently from normal with mean=0 and variance=1 for 300 participants
• Calculate Spearman correlation coefficient for each pair-wise combination
• Test correlation coefficient ≠ 0
• Type I error estimated as the number of tests that are statistically significant divided by number of tests (45)
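The simulation steps above can be sketched in a few lines. This is a minimal illustration, not the presenters' code: the function names, the seed, and the use of a normal approximation for the p-value (close to the usual t-based test at n = 300) are assumptions made here so the sketch needs only the Python standard library. In practice a statistics package routine for the Spearman test would normally be used.

```python
import math
import random
from itertools import combinations

def ranks(values):
    """Return 1-based ranks; ties are not expected for continuous draws."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_test(x, y):
    """Spearman rho and a two-sided p-value (normal approximation, an assumption)."""
    n = len(x)
    rho = pearson(ranks(x), ranks(y))
    t = rho * math.sqrt((n - 2) / (1 - rho ** 2))
    p = math.erfc(abs(t) / math.sqrt(2))  # two-sided normal tail area
    return rho, p

def simulate(n_vars=10, n_participants=300, alpha=0.05, seed=2013):
    """Draw independent N(0, 1) variables and test every pair for correlation."""
    rng = random.Random(seed)
    data = [[rng.gauss(0, 1) for _ in range(n_participants)] for _ in range(n_vars)]
    pvals = [spearman_test(data[i], data[j])[1]
             for i, j in combinations(range(n_vars), 2)]
    return sum(p < alpha for p in pvals), len(pvals)

n_sig, n_tests = simulate()
print(f"{n_sig} of {n_tests} tests significant at 0.05 ({n_sig / n_tests:.1%})")
```

Because every variable is generated independently, any "significant" correlation in the output is a false positive. Changing the seed shows how much the count varies from one simulated dataset to the next.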

32

Illustration of Multiplicity (cont’d)

Each cell shows the Spearman correlation R with its p-value in parentheses.

      V2             V3             V4             V5             V6             V7             V8             V9             V10
V1     0.03 (0.661)  -0.03 (0.541)   0.04 (0.448)   0.16 (0.006)   0.07 (0.231)  -0.04 (0.481)  <0.01 (0.960)   0.01 (0.875)  -0.08 (0.186)
V2                   -0.05 (0.436)   0.03 (0.548)  -0.12 (0.033)   0.06 (0.310)   0.02 (0.793)   0.10 (0.083)   0.01 (0.877)   0.07 (0.222)
V3                                   0.10 (0.082)   0.05 (0.427)   0.07 (0.197)  -0.03 (0.605)  -0.06 (0.322)  -0.08 (0.149)   0.07 (0.197)
V4                                                 -0.04 (0.494)   0.08 (0.149)  -0.14 (0.013)   0.15 (0.012)   0.03 (0.598)   0.05 (0.396)
V5                                                                 0.06 (0.282)  -0.03 (0.550)  -0.04 (0.501)   0.02 (0.772)  -0.16 (0.005)
V6                                                                                0.05 (0.433)   0.05 (0.397)  -0.14 (0.013)  -0.06 (0.273)
V7                                                                                              -0.11 (0.055)  -0.07 (0.227)  -0.09 (0.116)
V8                                                                                                              0.03 (0.614)  -0.03 (0.623)
V9                                                                                                                             0.07 (0.232)

Type I Error Rate = 6/45 = 13.3%

33

Implications
• Avoid post hoc analyses
• Pre-specify as much as possible (protocol or SAP)
→ Avoid data dredging criticism
→ Can even adjust Type I error rate for number of secondary analyses performed (rare)
• Interpret secondary results keeping in mind the inflated Type I error rate
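As a sketch of what adjusting the Type I error rate for the number of analyses can look like, the block below applies two common corrections, Bonferroni and Holm, to the 45 p-values from the multiplicity illustration. The slides do not name a specific adjustment method, so the choice of these two procedures and the function names are assumptions made for illustration.

```python
# The 45 p-values from the multiplicity illustration, row by row (V1..V9).
P_VALUES = [
    0.661, 0.541, 0.448, 0.006, 0.231, 0.481, 0.960, 0.875, 0.186,
    0.436, 0.548, 0.033, 0.310, 0.793, 0.083, 0.877, 0.222,
    0.082, 0.427, 0.197, 0.605, 0.322, 0.149, 0.197,
    0.494, 0.149, 0.013, 0.012, 0.598, 0.396,
    0.282, 0.550, 0.501, 0.772, 0.005,
    0.433, 0.397, 0.013, 0.273,
    0.055, 0.227, 0.116,
    0.614, 0.623,
    0.232,
]

def bonferroni_reject(pvalues, alpha=0.05):
    """Reject a test only if its p-value is at most alpha divided by the number of tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]

def holm_reject(pvalues, alpha=0.05):
    """Holm step-down: compare sorted p-values to alpha/m, alpha/(m-1), ... and stop at the first failure."""
    m = len(pvalues)
    order = sorted(range(m), key=pvalues.__getitem__)
    reject = [False] * m
    for step, idx in enumerate(order):
        if pvalues[idx] <= alpha / (m - step):
            reject[idx] = True
        else:
            break
    return reject

unadjusted = sum(p < 0.05 for p in P_VALUES)
print("Significant without adjustment:", unadjusted)
print("Significant after Bonferroni:  ", sum(bonferroni_reject(P_VALUES)))
print("Significant after Holm:        ", sum(holm_reject(P_VALUES)))
```

With 45 tests the Bonferroni threshold is 0.05/45, roughly 0.0011, which is smaller than the smallest p-value in the table (0.005). Under either correction, none of the six nominally significant correlations survives, which is the right answer for data generated with no true associations.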

34

Responsible Analysis and Reporting
• Focus should always be on primary outcome

• Of secondary analyses, focus should be on those that were pre-specified

• Requires careful planning with statement of hypotheses in protocol/SAP (SAP should be finalized before data lock)

• Report in a manuscript the number of pre-specified analyses performed and the number reported

35

Responsible Analysis and Reporting (cont’d)
• Present estimates of treatment differences and CIs: “plausible range of treatment differences consistent with trial results”

• Interpretation needs to be viewed as exploratory rather than confirmatory

• Frame results in context of supporting or contradictory data from other studies

36

Responsible Analysis and Reporting (cont’d)
• For post hoc analyses:

– Acknowledge that analyses were not specified a priori (data driven)

– Describe why analyses are important and the relevance of the research question

– Report number of post hoc analyses performed and the number reported

– Significance should be viewed as descriptive and not used for inference or decision making

– Can be used to justify future research

37

Examples
If primary outcome not statistically significant but some pre-specified secondary analyses were:

While the primary outcome did not demonstrate statistically significant evidence of a treatment effect, some secondary analyses suggested that the treatment may be effective. Therefore, future research is warranted.

If primary outcome is statistically significant but no secondary analyses are:

The primary outcome was statistically significant indicating that treatment is effective in this study population. Despite the fact that numerous secondary analyses did not yield statistical significance, there is sufficient evidence to justify future research of this intervention.

38

Questions…

39

IMPLEMENTATION OF SECONDARY ANALYSES

The Design Stage

Multiple Types of Secondary Analyses
• Secondary hypotheses: utilize the design
• Mediation studies: use data post-randomization
• Association studies: normally don’t use the design
– Example: Predictors of HIV Testing (CTN0032)
– Since these do not use the study design, they are observational!

40

Multiple Types of Secondary Analyses (cont.)
• Subgroup or Moderator Analyses
– Risk reduction counseling impact on HIV testing by modality of substance use treatment (CTN0032)
– Differential Treatment Effects by Race/Ethnicity and/or Gender
– We do not randomize to subgroups, so these are observational!

41

Why consideration at design stage?
• Appropriate measures

• Sample size considerations

• For secondary analyses that do NOT use design:
– Causal interpretation is difficult
– Statistical models can help
  • Subject to assumptions: no unmeasured confounders
  • Implies need to think about and measure confounders for any secondary analysis that is not just a test of difference by randomized treatment group

42

Secondary Outcomes
• Simplest type of “secondary” analysis

• Other outcomes on which we feel the intervention will have an impact

• Analysis strategy frequently very similar to primary outcome analysis

• Need to consider multiple testing issue!
– If enumerate and measure 20 secondary outcomes, the chance of at least one Type I error is about .64 (if we use α = .05 for each test and the tests are independent)
– Each secondary outcome being in a separate paper does NOT change this fact

43
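The .64 figure for 20 secondary outcomes follows from the familywise error rate for independent tests, 1 − (1 − α)^m. A minimal sketch, assuming independence between tests (real secondary outcomes are usually correlated, which changes the exact value); the function name is chosen here for illustration:

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive among independent tests with no true effects."""
    return 1 - (1 - alpha) ** n_tests

for m in (1, 5, 10, 20, 45):
    print(f"{m:>2} tests: {familywise_error_rate(m):.2f}")
```

Publishing each outcome in a separate paper leaves `n_tests` unchanged, so the overall risk is the same; only its visibility to readers changes.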

Mediation Analyses
• Another analysis frequently included in protocols
• Because pieces of the model are determined after randomization, there is difficulty making a strong causal interpretation

44

[Path diagram: Treatment Assignment → Attendance in Treatment → Count of Drug Use Days]

But we do not randomize attendance, so even if observed for everyone:

45

[Path diagram: Treatment Assignment, Attendance in Treatment, and CONFOUNDERS]

We cannot rule out confounders of both Attendance and Drug Use and, therefore, cannot make strong causal statements about the gold pathway without strong assumptions.

Possible Assumptions

No Unmeasured Confounders

• If we measure all potential confounds, then can make causal statements SUBJECT to the assumption that we have measured ALL the confounds

• May want to measure potentially confounding factors

• Example: Propensity Score Analysis

Instrumental Variables

• If we can find exogenous factors, Z, that are correlated with Attendance, X

• Also, Z does not directly affect Count of Drug Use Days

• Can use Z as instruments to identify the causal impact of attendance on Count of Drug Days

• Use the instrumental variable in place of the endogenous variable
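The instrumental variable idea can be sketched with simulated data. Everything in the block below is hypothetical: the variable names, effect sizes, and seed are assumptions chosen for illustration, and the single-instrument ratio estimator cov(Z, Y) / cov(Z, X) stands in for the fuller two-stage least squares procedure a trial would likely use.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)

def simulate(n=5000, true_effect=-0.5, seed=7):
    """Hypothetical data: U confounds attendance (X) and drug use days (Y); Z affects X only."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                               # unmeasured confounder
        zi = rng.gauss(0, 1)                              # instrument (exogenous)
        xi = zi + u + rng.gauss(0, 1)                     # attendance
        yi = true_effect * xi + 2 * u + rng.gauss(0, 1)   # drug use days (arbitrary scale)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y

z, x, y = simulate()
naive = cov(x, y) / cov(x, x)   # ordinary regression slope, biased by U
iv = cov(z, y) / cov(z, x)      # instrumental variable (ratio) estimate

print(f"True effect of attendance: -0.50")
print(f"Naive regression slope:    {naive:.2f}")
print(f"IV estimate:               {iv:.2f}")
```

In this simulated setup the naive slope has the wrong sign because the unmeasured confounder drives both attendance and drug use, while the IV estimate lands near the true effect. That only holds if Z really has no direct path to the outcome, which cannot be checked from the data alone.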

46

Association Studies
• Like mediator models, association studies are looking at the impact of variables which the experimenter has not controlled on some chosen outcome
→ Do not make causal claims (unless include confounder analysis or Instrumental Variable(s))!
• Frequently these studies will look at numerous predictors
– Type I error
– Must be very clear and honest about the way the analyses were done

47

Potential Solution for Type I Error
• Machine Learning approaches
– Used in data mining
– Allows an exhaustive search for the best predictive model of the outcome
– Like testing all covariates, sometimes can over-fit
• Cross-validation
– Simplest approach is split sample, explore on one sample, then replicate on a second sample (the second sample is a “test” of results on the first sample)
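A minimal sketch of the split-sample approach, using simulated data in which no predictor is truly related to the outcome. The function names, the 50/50 split, the number of predictors, and the seed are all assumptions made for illustration.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_sample(n, rng, frac=0.5):
    """Randomly split participant indices into an exploration set and a hold-out set."""
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(n * frac)
    return idx[:cut], idx[cut:]

def explore_then_replicate(predictors, outcome, rng):
    """Pick the strongest predictor in the exploration half, then re-estimate it in the hold-out half."""
    explore, holdout = split_sample(len(outcome), rng)

    def corr(column, rows):
        return pearson([column[i] for i in rows], [outcome[i] for i in rows])

    best = max(range(len(predictors)), key=lambda j: abs(corr(predictors[j], explore)))
    return best, corr(predictors[best], explore), corr(predictors[best], holdout)

# Hypothetical null data: 20 candidate predictors, none related to the outcome.
rng = random.Random(0)
n = 300
predictors = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(20)]
outcome = [rng.gauss(0, 1) for _ in range(n)]

best, r_explore, r_holdout = explore_then_replicate(predictors, outcome, rng)
print(f"Best predictor in exploration half: #{best}, r = {r_explore:.2f}")
print(f"Same predictor in hold-out half:    r = {r_holdout:.2f}")
```

The exploration correlation is selected as the largest of 20, so it is inflated by the search. The hold-out estimate is not subject to that selection, which is why it serves as the honest "test" of the finding.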

48

Subgroup or Moderator Analyses: Should Determine Which Subgroups Are of Interest at Design Stage
• Many of us have interests in:
– Racial and ethnic groups
– Gender
• But other subgroups may be of interest
– Drug of choice
– Severity of individual’s problem
– Age
– Socioeconomic status
– Site Differences (in levels of outcome and/or treatment effects)
– PTSD/No PTSD
• Important to define groups a priori (Necessary to consider at the design phase)

49

Many (if not most) subgroups are not randomized.

• This means these are observational models and cannot make causal statements
– Should assess for confounds
– Should be careful not to over-interpret
– Even if assess for confounds, cannot rule out unobserved confounds
– Race/Ethnic differences are examples where differences are largely NOT causal (race/ethnicity is correlated with true causal agent)

50

Example

• If results show there is an interaction & appears that treatment works best in high severity

• Cannot be sure that high severity “caused” treatment success (or that treatment will work in high severity) because have not randomized to high severity

51


[Path diagram: Randomized Group, Baseline Drug Use Severity, Randomized Group X Baseline Drug Use Severity, and Genetic factor]

Suggestions
• Pre-specify at the design stage the subgroups of interest

• Plan to assess known confounds related to the subgroups

• Minimize the number of subgroups examined (Type I error issue)

• Use tests of interaction within all participants, rather than testing treatment effects within each subgroup!!!
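To make the last suggestion concrete, the sketch below tests whether the treatment effect differs between two subgroups with a single interaction contrast, rather than running a separate significance test in each subgroup. The data are simulated and hypothetical, the large-sample z test is a simplifying assumption, and a real analysis would usually fit the interaction term in the trial's pre-specified regression model.

```python
import math
import random
from statistics import mean, variance

def effect_and_se(treated, control):
    """Difference in means (treated minus control) and its standard error."""
    diff = mean(treated) - mean(control)
    se = math.sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return diff, se

def interaction_test(cells, subgroup_a, subgroup_b):
    """Test whether the treatment effect differs between two subgroups.

    cells maps (subgroup, arm) to a list of outcomes; arm is "treat" or "control".
    Returns the difference in treatment effects, its z statistic, and a two-sided p-value.
    """
    d_a, se_a = effect_and_se(cells[subgroup_a, "treat"], cells[subgroup_a, "control"])
    d_b, se_b = effect_and_se(cells[subgroup_b, "treat"], cells[subgroup_b, "control"])
    contrast = d_a - d_b
    z = contrast / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # large-sample normal approximation
    return contrast, z, p

# Hypothetical simulated data: the true treatment effect (-1.0) is the same in both subgroups.
rng = random.Random(11)
cells = {
    (group, arm): [rng.gauss(base + (-1.0 if arm == "treat" else 0.0), 2.0) for _ in range(75)]
    for group, base in (("high", 12.0), ("low", 6.0))
    for arm in ("treat", "control")
}

contrast, z, p = interaction_test(cells, "high", "low")
print(f"Difference in treatment effects (high - low): {contrast:.2f}, z = {z:.2f}, p = {p:.3f}")
```

Testing each subgroup separately can produce "significant in one group, not in the other" purely by chance or because of unequal subgroup sizes. The interaction contrast asks directly whether the two effects differ.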

52

Wang et al., Statistics in Medicine – Reporting of Subgroup Analyses in Clinical Trials. NEJM, 2007:357;21.
Lagakos, S. The Challenge of Subgroup Analysis – Reporting without Distorting. NEJM, 2006:354:16.

Defining Subgroups
• Easiest to work with subgroups that are inherently categorical (race/ethnicity, gender, primary drug of use, site)
• Subgroup membership is ambiguous (and potentially manipulated) if defined on continuous measures (age or income, etc.); better to include a continuous interaction
• Focusing on categorical subgroups

53

To analyze subgroups, you must recruit subgroups.

• Examine sites and particular clinics for subgroup composition—choose sites accordingly

• Should focus on a few subgroups (or correct for multiple testing in analysis)

• May have sites that are predominately a single minority (Puerto Rico in BSFT was 100% Hispanic)

NOTE: This may create difficulty in identifying subgroup effects separately from site effects in studies with small number of sites

54

Must Protect Integrity of the Overall Study Design

• If subgroup is associated with primary outcome measure, consider stratified randomization (by subgroup to ensure balance across conditions)

• Must decide whether to incorporate subgroups into primary hypothesis testing

• If so, how to incorporate
• Fully stratified primary analysis is like running duplicate trials in each subgroup and would require a large overall sample

55

How much should subgroups be incorporated into primary analysis?
• Depends on what subgroup membership affects (trick is specifying a priori)
• Look at 4 possibilities (generally assuming that randomization works) for subgroup effects:
– Initial levels differ by subgroup
– Initial levels and rates of change differ by subgroup
– Initial levels and rates of change differ by subgroup and failure of randomization within at least one subgroup
– Initial levels and rates of change differ by subgroup and intervention status [i.e., Subgroup X Treatment Interaction]

56

Only Initial Levels Differ

57

• Include a 1 df control for subgroup membership
– Reduces residual variance
– Increases power
• Little cost to include: minor increase in model complexity, 1 df (per additional group) if wrong

Initial Level and Rates of Change

58

• 2 df per group (1 for intercept and 1 for change, if linear)
– Reduces residual variance
– Increases power
• Relatively low cost to complexity and lost df (unless many higher order polynomials in change)

Initial Level & Rates of Change and Randomization Failure in 1 Group

59

• We assume randomization will work—and participants should look similar on average at baseline

• But as you examine more subgroups you increase the chance that one subgroup will have an imbalance (here also differences in slopes)

Initial Levels and Rates of Change Differ by Group and Intervention

60

• Need separate intercepts and slopes for each group BY each intervention condition (have interaction of treatment by subgroup)

• Costly Model (in the sense of statistical power and sample size)
– Complicated, many df (wasted if guess wrong)
– Basically, fully stratified model: should consider powering within subgroup

61

So, what should you do for your a priori Statistical Analysis Plan?
• Mean differences by subgroup: add 1 df control for subgroup membership (low cost if wrong)
• Different trajectories by subgroup (but equal intervention effect)
– With existing evidence: plan to include subgroup specific rates of change
– No prior evidence:
  • Could include examination of trajectories blind by condition
  • May be better to test control group for differences in trajectories by subgroup

So, what should you do for your a priori Statistical Analysis Plan? (cont.)
• Different intervention effects: full stratification with test for intervention interaction effects (large trial)
– Note that testing for interaction does not “solve” the problem
– Must plan for possibility that interaction is significant; then would want power to show effect within a subgroup

62

How many do you need in a subgroup if want to explore?

• If subgroups are not main emphasis of trial, difficult to power subgroup effects
– Emphasize effect sizes and minimal sample to get stable estimates
– Nevertheless, useful to have a feel for power
• Will look at power implications
– For mean (initial) level differences across subgroups
– For rate of change differences across subgroups

63

64

[Figure: N per Group to Have 80% Power for Simple Mean Difference at Point in Time. Standardized difference (0.1 to 0.9) plotted against sample size per group (0 to 500). Source: PASS 2005. Assumes normality and 2 group comparison.]
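For a rough feel of the numbers behind such a power curve, the standard normal-approximation formula for a two-group comparison of means is n per group ≈ 2(z₁₋α/₂ + z_power)² / d², where d is the standardized difference. The sketch below uses that approximation with a two-sided α of 0.05. It is not the PASS calculation from the slide, and software that uses the t distribution will give slightly larger values.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, power=0.80, alpha=0.05):
    """Approximate n per group for a two-sided, two-group comparison of means.

    effect_size is the standardized mean difference; normal approximation (assumption).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)

for d in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"Standardized difference {d}: about {n_per_group(d)} per group")
```

The required sample size grows with the inverse square of the effect size, so halving the detectable difference roughly quadruples the n per group. This is why small subgroup differences are rarely detectable in trials powered only for the overall effect.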

Power for Differences in Slopes
• Assumes 4 assessment times
• Lose 15% at T2, 5% more at T3 & T4
• Compound symmetry in Errors=.20
• Random effects in intercept and slope terms
• Effect size is the standardized mean difference at the LAST time (assumes largest effect at last observation)
• Uses RMASS program from Don Hedeker
– Hedeker, Gibbons, & Waternaux, 1999 JEBS
– Available at http://tigger.uic.edu/~hedeker/

65

66

[Figure: Sample Size per Subgroup for 80% Power, Growth Curve Showing Mean Difference at Last of 4 Times. Effect size (0 to 0.8) plotted against n per group (0 to 350), with separate curves for a linear growth curve and a quadratic growth curve. Assumes 2 group comparison.]

Note About Power Statements
• These are for 2 group comparisons
• If you want to find power for testing difference in intervention effects across 2 (or more) groups (an interaction effect) would need to simulate if planning on a growth curve framework with missing data
• Rough overestimate: 4 times the per group n should over-power the interaction effect
• However, if want power within subgroups, 4 times per group n is the proper sample size

67

Potential Difficulties with Subgroup Analyses
• Disentangling subgroup effects from site effects
• Interpretation of subgroup findings
– Are measurement instruments equally valid and reliable across subgroups (need at least 100 per group even for small invariance analysis)

68

Potential Difficulties with Subgroup Analyses (cont.)
• Interpretation of subgroup findings (cont.)
– What does group membership mean? Proxy for?
  • What person-specific variables do we need to understand subgroup difference?
  • What contextual variables do we need?
    – Local, neighborhood and regional factors (zip codes of individuals and clinics?)
    – Treatment context and other site level factors

69

Summary of Issues with Secondary Analysis
• Analyses that use treatment assignment as the only predictor*:
– Have a strong causal interpretation
– Are useful to characterize the full impact of an intervention, since substance treatment has multiple targets
– But too many secondary outcomes may cause a Type I error, unless a multiple testing procedure is followed
– Need to be clear about the number of secondary outcomes considered

* Could also include other factors measured at baseline as control variables

70

Summary of Issues with Secondary Analysis (cont.)
• All other secondary analyses (mediation, associational, and subgroup):
– Cannot be interpreted as causal without strong assumptions
– Need to assess for confounding variables
– Be careful not to over-interpret: cannot rule out unobserved confounding
– Are also susceptible to Type I errors
  • Limit the number of analyses and/or incorporate a multiple testing strategy and be clear in presentation of what you have done

71

Suggestions
• All manuscripts should have an analysis plan pre-specified
• Any findings that come from exploratory analyses need to be clearly designated in the manuscript
• May be useful to cite the CTN dissemination library for the particular trial (for readers’ reference for the other outcomes that have been published on the same data)

72

73

WRAPPING IT UP!
Highlights

74

Summary
• Secondary analyses are important and necessary components of clinical trial research

• Secondary outcomes and analyses should be considered and identified during protocol and/or SAP development

• Interpret results appropriately: exploratory, confirmatory, descriptive, not causal

• Always report whether analysis/outcome was pre-specified or post hoc and the number performed

75

Summary (cont.)
• CTN encourages multiple analyses & publications

• Describe methods clearly and accurately

• It is essential to acknowledge the limitations

• Beware of:
− Excessive testing
− Data dredging
− Misinterpreting correlations as causes
− Confounding variables

76

Q&A – Questions / Comments

Alternatively, questions can be directed to the presenter by sending an email to [email protected].

77

References
• Hien, et al. 2009. J Consult Clin Psychol, 77(4); 607–619.

• Lagakos, S. 2006. The challenge of subgroup analysis—Reporting without distorting. NEJM, 354; 16.

• Nunes, et al. 2011. American Journal of Drug and Alcohol Abuse, 37; 446-452.

• Wang, et al. 2007. Statistics in medicine – Reporting of subgroup analyses in clinical trials. NEJM, 357; 21.

78

Survey Reminder
The NIDA CCC encourages all to complete the survey issued to participants directly following this webinar session, as this is the primary collective tool for rating your experience with this and other webinars, and for communicating the interests and needs of CTN members and associates.

See you all in 2014!

79

A copy of this presentation will be available electronically after this session.

http://ctndisseminationlibrary.org

80

THANK YOU FOR YOUR PARTICIPATION
