ORIGINAL ARTICLE

Predicting Response to a Cognitive–Behavioral Approach to Treating Low Back Pain: Secondary Analysis of the BeST Data Set

MARTIN UNDERWOOD, DIPESH MISTRY, RANJIT LALL, AND SALLIE LAMB

Arthritis Care & Research, Vol. 63, No. 9, September 2011, pp 1271–1279. DOI 10.1002/acr.20518. © 2011, American College of Rheumatology.

ISRCTN: 37807450.

The BeST trial was funded by the UK National Health Service Health Technology Assessment Programme (project 01/75/01). This project benefited from facilities funded through Birmingham Science City Translational Medicine Clinical Research and infrastructure Trials platform, with support from Advantage West Midlands.

Martin Underwood, MD, Dipesh Mistry, MSc, Ranjit Lall, PhD, Sallie Lamb, PhD: Warwick Medical School, Coventry, Warwickshire, UK.

Address correspondence to Martin Underwood, MD, Warwick Medical School, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, UK. E-mail: m.underwood@warwick.ac.uk.

Submitted for publication December 23, 2010; accepted in revised form May 20, 2011.

Objective. Identifying factors that predict who is likely to gain the greatest benefit from different treatments for low back pain is an important research priority. Here we report moderator analyses of the Back Skills Training Trial (BeST) that tested a cognitive–behavioral approach for low back pain.

Methods. We recruited 701 participants ages >18 years with at least moderately troublesome low back pain present for >6 weeks from 56 general practices in 7 localities across England to a trial adding a group cognitive–behavioral approach to active management advice. The cognitive–behavioral package had a moderate effect on primary outcomes (Roland Morris Disability Questionnaire [RMDQ] and modified Von Korff scales). At 12-month followup, we tested for interaction between randomized groups on 2 prespecified baseline variables (troublesomeness and fear avoidance) and 10 post hoc (exploratory) variables identified from previous studies.

Results. Neither troublesomeness nor fear avoidance moderated treatment effect on any of our primary outcomes. In the final model, the only moderation by baseline variables of the effect of randomization was on the RMDQ outcome. Being younger and currently working both moderated treatment effect, resulting in larger improvements as a response to treatment.

Conclusion. Although BeST is one of the larger trials of back pain treatment, it is still too small to reliably detect moderation if it exists. Since the significant moderation effects were only observed for 1 outcome measure in 3 of 10 post-hoc analyses, we cannot conclude that these are true moderation effects.

INTRODUCTION

Several therapist-delivered intervention packages are effective and cost effective for chronic nonspecific low back pain. Treatment packages of acupuncture needling, exercise, and manual therapy are recommended for the early management of persistent back pain by the National Institute of Health and Clinical Excellence guidelines (1). We have added to these data by demonstrating that a group cognitive–behavioral approach delivered by a range of health professionals has sustained the effect on low back pain disability, at a very modest cost to the health care provider (Back Skills Training Trial n = 701) (2). Most of these interventions produce small to moderate mean benefits that are likely to be important at a population level, but that may not be important to an individual patient. Identifying moderators (effect modifiers), factors that predict who is likely to gain the greatest benefit from different treatments for low back pain, is an important research priority because it will allow us to deliver the best treatment for an individual patient (1,3,4). There are few randomized studies of moderators of treatments for nonspecific low back pain and even fewer that have used prospectively defined moderators (5). Here we report prespecified moderator analyses and further exploratory moderator analyses of the Back Skills Training Trial data.

MATERIALS AND METHODS

Trial design. We have reported the design, intervention, and main analyses of the Back Skills Training Trial in detail elsewhere (2,6–8). Briefly, we recruited 701 participants ages 18 years or older with at least moderately troublesome nonspecific low back pain present for greater than 6 weeks from 56 general practices in 7 localities across England. We randomized these, 2:1 in favor of the intervention, using a remote telephone randomization service. All baseline variables were collected using a self-completed questionnaire prior to randomization.

All participants received a 15-minute session of active management advice, including the benefit of and how to remain active, avoidance of bed rest, appropriate use of pain medication and symptom management, and a copy of The Back Book (9). Those in the intervention group were offered an individual assessment lasting up to 1.5 hours and 6 sessions of group therapy using a cognitive–behavioral approach lasting 1.5 hours per session.

Our primary outcomes, the Roland Morris Disability Questionnaire (RMDQ; scale 0–24, where lower scores indicate less severe disability) and modified von Korff (MVK) scales of pain and disability (scale 0–100%, where lower scores indicate less pain and disability), were measured at 12 months using a self-administered postal questionnaire (10,11). Nonresponders to the postal questionnaire received a telephone interview to collect core outcome data. Collection of the RMDQ data was not included in the telephone interview process due to its length.

We found that the intervention package was moderately effective at 12 months based on the 598 subjects who responded at 12 months (2) (Table 1). To reduce hazards from making multiple comparisons, we have not included 3- and 6-month followup data in this secondary analysis.

Potential effect moderators. At the design stage of the trial, we prespecified 3 hypotheses (confirmatory analyses) (12). First, the benefits from treatment would be greater in those with more troublesome low back pain (moderately versus very/extremely troublesome). We measured this during the participants’ initial assessment prior to randomization using a 5-point Likert scale from not at all troublesome to extremely troublesome, with participants only being included if they reported low back pain of at least moderate troublesomeness (13,14). Those with more troublesome pain have a greater capacity for improvement and arguably have more to gain from additional improvements from treatment. Assessing troublesomeness of pain is an attractive and simple approach that could be used in the consulting room to decide which patients should be considered for further treatment. Second, the intervention would have a greater effect in those with high levels of fear avoidance. Fear avoidance is well established as a potentially modifiable predictor of poor outcome from back pain (15). Our intervention was designed to target fear avoidance among a range of unhelpful beliefs about back pain and to promote physical activity. We measured this using the Fear-Avoidance Beliefs Questionnaire (FABQ) at baseline (16). Third, the benefits of treatment would be greater for those with subacute low back pain (<3 months) than those with chronic low back pain. We were unable to perform this analysis because too few participants had subacute pain.

At baseline, we collected data on other potential predictors or moderators of treatment effect (Table 2). These included demographic information on: 1) whether or not participants received state benefits, 2) depression and anxiety using the Hospital Anxiety and Depression Scale (HADS), 3) self-efficacy using the Pain Self-Efficacy Questionnaire, and 4) health-related quality of life using the physical and mental health component scores of the Short Form 12 (SF-12) (17–19). We dichotomized potential moderators that were continuous variables to facilitate the analyses using cut points available in the literature. We used a cut point of ≥14 for the FABQ and ≥11 for both the anxiety and depression components of the HADS (17,20). For all other continuous outcomes where generally accepted cut points were not available, we used a median cut point (21).
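The dichotomization described above can be illustrated with a minimal sketch. The function name `dichotomize`, the example values, and the choice to place observations equal to the median in the upper group are assumptions made for illustration; the values are made up and are not trial data.

```python
from statistics import median


def dichotomize(values, cut=None):
    """Code each value as 1 (>= cut) or 0 (< cut); missing values stay None.

    If no cut point is supplied, the median of the observed values is used.
    Treating values equal to the cut point as "high" is an assumption here.
    """
    observed = [v for v in values if v is not None]
    if cut is None:
        cut = median(observed)
    coded = [None if v is None else int(v >= cut) for v in values]
    return coded, cut


# Made-up fear-avoidance scores, split at the literature cut point of 14.
fabq = [3, 14, 20, None, 13, 9]
fabq_high, fabq_cut = dichotomize(fabq, cut=14)

# Made-up ages, split at their own median because no accepted cut point exists.
age = [31, 47, 54, 62, 70]
age_high, age_cut = dichotomize(age)
```

Keeping missing values as `None` rather than forcing them into a subgroup mirrors the varying numbers of participants contributing to each analysis.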

Significance & Innovations
● Baseline characteristics were not shown to predict response to a cognitive–behavioral approach to treatment of low back pain in secondary analysis of a large randomized controlled trial.
● No evidence to support selective provision of such services was found.
● Future research into back pain subgroups may need to use different approaches.

Table 1. Main effect of BeST intervention package*

| Outcome | Mean treatment difference (95% CI) |
|---|---|
| All participants at 12-month followup | |
| RMDQ (n = 498)† | 1.3 (0.56, 2.06) |
| MVK disability (n = 552)‡ | 8.4 (4.47, 12.32) |
| MVK pain (n = 583)‡ | 7.0 (3.12, 10.81) |
| Participants with pain for ≥3 months at randomization | |
| 3-month followup | |
| RMDQ (n = 509)† | 1.0 (0.38, 1.69) |
| MVK disability (n = 515)‡ | 4.3 (0.45, 8.17) |
| MVK pain (n = 534)‡ | 6.6 (3.21, 10.08) |
| 6-month followup | |
| RMDQ (n = 524)† | 1.4 (0.64, 2.18) |
| MVK disability (n = 547)‡ | 8.1 (4.25, 12.00) |
| MVK pain (n = 565)‡ | 7.9 (4.11, 11.61) |
| 12-month followup | |
| RMDQ (n = 494)† | 1.3 (0.53, 2.02) |
| MVK disability (n = 548)‡ | 8.2 (4.24, 12.16) |
| MVK pain (n = 579)‡ | 6.8 (2.92, 10.57) |

* 95% CI = 95% confidence interval; RMDQ = Roland Morris Disability Questionnaire; MVK = modified Von Korff.
† Range 0–24.
‡ Range 0–100.

Table 2. Demographic characteristics and outcome measures at baseline of the sample providing followup at 12 months*

| Characteristic | Control (advice only) (n = 199) | Advice plus cognitive–behavioral intervention (n = 399) | Total (n = 598) |
|---|---|---|---|
| Age, years | | | |
| No. | 199 | 398 | 597 |
| Mean ± SD | 54.2 ± 14.46 | 53.7 ± 14.38 | 53.8 ± 14.39 |
| Missing, no. | 0 | 1 | 1 |
| Sex | | | |
| Male, no. (%) | 77 (38.7) | 164 (41.1) | 241 (40.3) |
| Female, no. (%) | 121 (60.8) | 235 (58.9) | 356 (59.5) |
| Missing, no. | 1 | 0 | 1 |
| Ethnicity, no. (%) | | | |
| White | 175 (88.0) | 350 (87.7) | 525 (87.8) |
| Asian or Asian British | 6 (3.0) | 18 (4.5) | 24 (4.0) |
| Black or black British | 3 (1.5) | 6 (1.5) | 9 (1.5) |
| Chinese or other | 4 (2.0) | 5 (1.3) | 9 (1.5) |
| Missing | 11 (5.5) | 20 (5.0) | 31 (5.2) |
| Left full-time education, no. (%) | | | |
| Age ≤16 years | 101 (50.8) | 225 (56.4) | 326 (54.5) |
| Age >16 years | 90 (45.2) | 154 (38.6) | 244 (40.8) |
| Missing | 8 (4.0) | 20 (5.0) | 28 (4.7) |
| Employed, no. (%) | | | |
| Not employed | 103 (51.8) | 192 (48.1) | 295 (49.3) |
| Employed | 95 (47.7) | 206 (51.6) | 301 (50.3) |
| Missing | 1 (0.50) | 1 (0.3) | 2 (0.4) |
| Frequency of back pain (past 6 weeks), no. (%) | | | |
| Comes and goes + getting better | 49 (24.6) | 98 (24.6) | 147 (24.6) |
| Fairly constant + getting worse | 148 (74.4) | 299 (74.9) | 447 (74.7) |
| Missing | 2 (1.0) | 2 (0.5) | 4 (0.7) |
| Troublesomeness, no. (%) | | | |
| Moderately troublesome | 113 (56.8) | 214 (53.6) | 327 (54.7) |
| Very/extremely troublesome | 86 (43.2) | 185 (46.4) | 271 (45.3) |
| Missing | 0 | 0 | 0 |
| Fear avoidance† | | | |
| No. | 187 | 376 | 563 |
| Mean ± SD | 13.9 ± 6.35 | 13.4 ± 6.47 | 13.5 ± 6.43 |
| Missing, no. | 12 | 23 | 35 |
| Duration of back pain, years since first onset | | | |
| No. | 194 | 381 | 575 |
| Mean ± SD | 13.1 ± 12.16 | 13.1 ± 13.23 | 13.1 ± 12.87 |
| Missing, no. | 5 | 18 | 23 |
| Benefits, no. (%) | | | |
| No benefits | 164 (82.4) | 327 (81.9) | 491 (82.1) |
| Benefits | 33 (16.6) | 65 (16.3) | 98 (16.4) |
| Missing | 2 (1.0) | 7 (1.8) | 9 (1.5) |
| HADS anxiety‡ | | | |
| No. | 196 | 390 | 586 |
| Mean ± SD | 7.5 ± 4.43 | 8.1 ± 4.29 | 7.9 ± 4.34 |
| Missing, no. | 3 | 9 | 12 |
| HADS depression‡ | | | |
| No. | 197 | 396 | 593 |
| Mean ± SD | 5.3 ± 3.47 | 5.9 ± 3.75 | 5.7 ± 3.66 |
| Missing, no. | 2 | 3 | 5 |
| Pain self-efficacy§ | | | |
| No. | 190 | 387 | 577 |
| Mean ± SD | 42.1 ± 11.82 | 39.9 ± 13.49 | 40.6 ± 13.00 |
| Missing, no. | 9 | 12 | 21 |
| RMDQ¶ | | | |
| No. | 199 | 399 | 598 |
| Mean ± SD | 8.2 ± 4.49 | 8.7 ± 5.01 | 8.5 ± 4.84 |
| Missing, no. | 0 | 0 | 0 |

(continued)

Statistical analysis. The analysis was on an intent-to-treat basis. We analyzed our primary outcome measures as change from baseline (baseline minus followup). All statistical tests were 2-sided and statistical significance was assessed at the 5% level for the univariate and exploratory analyses only. For the prespecified subgroup analyses, the significance level was adjusted for multiple comparisons using the Bonferroni correction and was therefore assessed at the 2.5% level, i.e., α = 0.025 (22). All of the analyses were carried out using Stata, version 10.1.
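The Bonferroni adjustment for the 2 prespecified subgroup analyses is simple arithmetic, sketched below. The function names are hypothetical and the P value of 0.03 is a made-up illustration, not a result from the trial.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return family_alpha / n_tests


def is_significant(p_value, alpha):
    """A result counts as significant only if its P value is below alpha."""
    return p_value < alpha


# Two prespecified moderators (troublesomeness and fear avoidance)
# share an overall 5% level, giving 2.5% per test.
alpha_prespecified = bonferroni_alpha(0.05, 2)

# A made-up P value of 0.03 passes the unadjusted 5% level
# but fails the adjusted 2.5% level.
passes_unadjusted = is_significant(0.03, 0.05)
passes_adjusted = is_significant(0.03, alpha_prespecified)
```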

Initially, we did an exploratory univariate analysis of all participant demographics and baseline outcomes using simple linear regression in order to identify potential predictors of 12-month outcome. Although identifying predictors of outcome was not our primary focus, we report them here for completeness. We fitted linear regression models for each of the primary outcome measures (change from baseline) with the inclusion of an interaction term to directly examine whether the treatment difference depends on the moderators, both prespecified and exploratory. A statistical test for interaction is the most appropriate method to evaluate and draw inferences from subgroup analyses (23–25). We fitted the following models: 1) model 1, unadjusted model: with treatment assignment, moderator, and the interaction term of these 2 variables; 2) model 2, adjusted model: with treatment assignment, moderator, and the interaction term of these 2 variables adjusted for age, sex, and baseline value of the dependent variable; and 3) model 3, adjusted model: with treatment assignment, moderator, and the interaction term of these 2 variables adjusted for baseline and demographic covariates selected using a forward stepwise variable-selection algorithm with a stringent significance level of 0.01 due to multiple testing to determine whether a variable is added to or removed from the model (26).
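To make the meaning of the interaction term concrete, the sketch below covers only the unadjusted case (model 1) with treatment and moderator both coded 0/1. In that saturated model the least-squares interaction coefficient equals the treatment effect in one subgroup minus the treatment effect in the other, so it can be computed from the 4 cell means. The function name and the data are hypothetical, and this is not the Stata code used for the analyses; models 2 and 3 add covariates and would need a full regression fit.

```python
from statistics import mean


def interaction_estimate(records):
    """Difference between subgroups in the treatment effect.

    records: list of (treated, moderator, change) tuples, where treated and
    moderator are 0/1 flags and change is baseline minus followup.
    """
    def cell_mean(treated, moderator):
        return mean(c for t, m, c in records if t == treated and m == moderator)

    effect_moderator_1 = cell_mean(1, 1) - cell_mean(0, 1)
    effect_moderator_0 = cell_mean(1, 0) - cell_mean(0, 0)
    return effect_moderator_1 - effect_moderator_0


# Made-up change scores: treatment helps by 1 point when the moderator is 0
# and by 3 points when it is 1, so the interaction is 2 points.
records = [
    (0, 0, 1.0), (0, 0, 2.0),
    (1, 0, 2.0), (1, 0, 3.0),
    (0, 1, 1.0), (0, 1, 1.0),
    (1, 1, 4.0), (1, 1, 4.0),
]
interaction = interaction_estimate(records)
```

A positive value means that the subgroup coded 1 gained more from treatment than the subgroup coded 0.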

The unadjusted model was a comparator for the adjusted models to see whether covariate adjustment altered the conclusions of the analyses. The second model adjusted for clinically relevant factors that were prespecified in the study protocol. Selection of which covariates to adjust for can be problematic in subgroup analyses; therefore, baseline factors that predict outcome can be considered using an appropriate statistical selection procedure (27). For this reason, the final model was fitted using a forward stepwise selection procedure to identify covariates that are predictors of outcome to put into the model. Where the model adjusts for a baseline variable that is also a potential moderator (e.g., age ≥54 years versus age <54 years), we excluded that variable from the model to avoid any issues of collinearity.

For the benefit of those doing systematic reviews of treatments for chronic low back pain, we also present here the main analysis just for those reporting pain present for 3 months or more at randomization.

RESULTS

We obtained 12-month followup data on 598 (85%) of our 701 participants (199 control and 399 intervention). Demographic and outcome measure data collected at baseline were well balanced across both arms for the sample providing 12-month followup (Table 2). The number of participants contributing to the univariate analyses ranged from 456 for the SF-12 to 583 for troublesomeness and MVK pain. Initial univariate analyses showed a consistent pattern that age, employment, benefits, and MVK disability score were all predictors of outcome in all 3 outcome measures. Troublesomeness, duration, baseline RMDQ, and MVK pain predicted outcome in some, but not all, outcome measures. None of the other baseline variables, including the FABQ, showed any association with outcome (Table 3).

Table 2. (Cont’d)

| Characteristic | Control (advice only) (n = 199) | Advice plus cognitive–behavioral intervention (n = 399) | Total (n = 598) |
|---|---|---|---|
| MVK disability# | | | |
| No. | 196 | 389 | 585 |
| Mean ± SD | 45.8 ± 23.41 | 47.8 ± 23.86 | 47.1 ± 23.71 |
| Missing, no. | 3 | 10 | 13 |
| MVK pain# | | | |
| No. | 198 | 396 | 594 |
| Mean ± SD | 58.3 ± 18.72 | 58.7 ± 19.17 | 58.6 ± 19.01 |
| Missing, no. | 1 | 3 | 4 |

* HADS = Hospital Anxiety and Depression Scale; RMDQ = Roland Morris Disability Questionnaire; MVK = modified Von Korff.
† Scale 0–24; a lower score indicates lower fear-avoidance beliefs.
‡ Scale 0–21; a lower score indicates less anxiety and depression.
§ Scale 0–60; a lower score indicates lower self-efficacy.
¶ Scale 0–24; lower scores indicate less severe disability.
# Scale 0–100%; lower scores indicate less pain and disability.

Table 3. Univariate analyses to identify potential baseline predictors of outcome at 12 months of followup*

Each outcome is change from baseline.

| Baseline variable | RMDQ: P | RMDQ: difference (95% CI) | RMDQ: no. | MVK disability: P | MVK disability: difference (95% CI) | MVK disability: no. | MVK pain: P | MVK pain: difference (95% CI) | MVK pain: no. |
|---|---|---|---|---|---|---|---|---|---|
| Baseline variables of subgroups specified in protocol | | | | | | | | | |
| FABQ (positive values favor those with greater fear-avoidance beliefs) (16) | 0.94 | 0.002 (−0.06, 0.06) | 470 | 0.48 | 0.13 (−0.23, 0.48) | 523 | 0.19 | −0.20 (−0.50, 0.10) | 551 |
| Troublesomeness of back pain (positive values indicate better outcome for those with greater troublesomeness) (13,14) | 0.02 | 0.90 (0.16, 1.64) | 498 | 0.13 | 3.38 (−1.04, 7.80) | 552 | 0.02 | 4.59 (0.89, 8.30) | 583 |
| Baseline variables of other subgroups | | | | | | | | | |
| Duration of back pain, years (positive values favor those with a longer duration of back pain) | 0.04 | −0.03 (−0.06, −0.01) | 478 | 0.03 | −0.20 (−0.37, −0.02) | 531 | 0.11 | −0.12 (−0.27, 0.03) | 560 |
| Age, years (positive values favor those who are older) | 0.04 | −0.03 (−0.05, −0.01) | 498 | <0.001 | −0.32 (−0.48, −0.17) | 551 | 0.003 | −0.20 (−0.32, −0.07) | 582 |
| Sex (positive values indicate better outcome for women) | 0.58 | 0.21 (−0.54, 0.97) | 497 | 0.06 | 4.37 (−0.10, 8.83) | 551 | 0.88 | 0.29 (−3.50, 4.07) | 582 |
| Age left full-time education (positive values indicate better outcome for those who left education at an older age or are still in education) | 0.09 | 0.42 (−0.07, 0.91) | 475 | 0.21 | 1.88 (−1.03, 4.78) | 527 | 0.41 | 1.01 (−1.41, 3.43) | 558 |
| Frequency of back pain, past 6 weeks (positive values favor those with improved frequency of back pain) | 0.86 | −0.05 (−0.58, 0.49) | 494 | 0.20 | 2.05 (−1.10, 5.21) | 548 | 0.98 | 0.02 (−2.63, 2.68) | 581 |
| In employment (positive values indicate better outcome for those in employment) | <0.001 | 1.45 (0.72, 2.18) | 496 | <0.001 | 9.35 (5.00, 13.70) | 550 | <0.001 | 6.97 (3.30, 10.64) | 581 |
| Benefits (positive values indicate better outcome for those with no benefits) | 0.02 | 1.19 (0.17, 2.22) | 490 | 0.04 | 6.31 (0.42, 12.19) | 545 | 0.01 | 6.81 (1.81, 11.81) | 574 |
| HADS anxiety (positive values favor those with anxiety symptoms) (17) | 0.89 | 0.01 (−0.08, 0.09) | 490 | 0.31 | 0.26 (−0.25, 0.78) | 541 | 0.54 | −0.13 (−0.56, 0.30) | 571 |
| HADS depression (positive values favor those with depressive symptoms) (17) | 0.97 | 0.002 (−0.10, 0.11) | 493 | 0.29 | 0.33 (−0.28, 0.94) | 549 | 0.32 | −0.26 (−0.76, 0.25) | 578 |
| Pain self-efficacy (positive values favor those with stronger self-efficacy beliefs) (18) | 0.32 | 0.02 (−0.01, 0.05) | 480 | 0.74 | −0.03 (−0.20, 0.14) | 536 | 0.09 | 0.12 (−0.02, 0.27) | 564 |
| Other baseline variables tested for inclusion in final model | | | | | | | | | |
| RMDQ (range 0–24, where lower scores indicate less severe disability) | <0.001 | 0.22 (0.15, 0.30) | 498 | 0.36 | 0.21 (−0.25, 0.67) | 552 | 0.97 | −0.01 (−0.39, 0.37) | 583 |
| MVK disability (range 0–100, where lower scores indicate less disability) (11) | <0.001 | 0.03 (0.01, 0.04) | 485 | <0.001 | 0.52 (0.44, 0.60) | 552 | <0.001 | 0.15 (0.08, 0.23) | 572 |

(continued)

Models including interaction term between randomized group and moderator. The goodness of fit was assessed for models 1, 2, and 3 using the adjusted R² statistic. This statistic can take on any value less than or equal to 1, with a value closer to 1 indicating a better fit. If the model contains terms that do not aid in predicting response, then negative values can occur. The adjusted R² values for model 1 (unadjusted model: treatment + moderator + interaction) ranged from 0.02 to 0.06, for model 2 (adjusted for age, sex, and baseline) ranged from 0.09 to 0.33, and for model 3 (adjusted for significant predictors from stepwise selection) ranged from 0.16 to 0.43. Clearly, model 3 explained the most variance and was therefore chosen as our final model.
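The behavior of the adjusted R² statistic, including how negative values arise, can be seen from its standard definition. The sketch below uses the usual formula with n observations and p predictors (excluding the intercept); the function name and the input values are made up for illustration and are not taken from the fitted models.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for a linear model with an intercept.

    r2: ordinary R-squared; n: number of observations;
    p: number of predictors, not counting the intercept.
    """
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# With no predictors there is no penalty, so the value is unchanged.
no_penalty = adjusted_r2(0.5, 101, 0)

# Three predictors that explain only 1% of the variance in 100 observations
# are penalized enough to push the statistic below zero.
weak_model = adjusted_r2(0.01, 100, 3)
```

Because the penalty grows with the number of terms, a model is rewarded only when its added covariates explain enough extra variance to offset it.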

In our third model based on the stepwise selection process, we adjusted for different baseline variables for each of our 3 primary outcomes. Only for the RMDQ did any interactions reach statistical significance: age and employment status. Being younger and being employed were both statistically significant moderators for gaining additional benefit, as measured by the RMDQ, from treatment. This means, for example, that on average those who were employed gained an additional benefit from the treatment of 1.89 (95% confidence interval 0.43, 3.35) points in the RMDQ compared to those who were not employed (Table 4). These effects were not seen with either the MVK disability or MVK pain.
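Because the interactions are reported as point estimates with 95% confidence intervals, it can help to see how an interval relates to a test statistic. The sketch below back-calculates an approximate standard error, z statistic, and 2-sided P value from the employment interaction on the RMDQ. It assumes a symmetric interval and a normal approximation (critical value 1.96), which only approximates the t-based regression output; the function name is hypothetical.

```python
from math import erfc, sqrt


def z_and_p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate SE, z, and 2-sided P value from a 95% confidence interval."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = erfc(abs(z) / sqrt(2))  # 2-sided tail area of the standard normal
    return se, z, p


# Employed versus not employed, RMDQ, model 3: 1.89 (0.43, 3.35).
se, z, p = z_and_p_from_ci(1.89, 0.43, 3.35)
```

Under these assumptions the P value is roughly 0.01, which falls below the 5% level used for the exploratory analyses.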

No statistically significant moderation effects were observed for the prespecified subgroup analyses in any of the models; however, some significant moderators of treatment effect were observed in the exploratory subgroup analyses.

DISCUSSION

This analysis contributes to the body of research seeking to identify subgroups of patients with low back pain who are likely to achieve greater benefit from particular treatments. Particular strengths are that the analysis is based on a large well-conducted randomized controlled trial and that we were able to do confirmatory analyses on 2 variables, fear avoidance and troublesomeness, that were prespecified before the trial started. The study was not, however, originally powered for these confirmatory analyses. Because we had a prespecified duration of back pain as a subgroup, we have included an analysis of pain of more or less than 3 years since onset as an exploratory analysis; however, this may not be a meaningful distinction in the clinical situation. The remainder of our subgroups are known sociodemographic prognostic indicator variables or psychological factors that might moderate the effect of a cognitive–behavioral approach. This reduces the chances of finding spurious positive results purely by chance.

With 598 participants included in this analysis, this is one of the largest studies of moderators of treatment for nonspecific low back pain. Witt et al included substantially more participants (n = 3,093), the UK Back Pain Exercise and Manipulation Trial (UK BEAM) included nearly twice as many participants (n = 1,116), and Sherman et al included a similar number of participants (n = 638). In these latter 2 cases, the participants were split between 4 treatment groups, meaning that the number included in each comparison was much smaller (28–30). Nevertheless, our statistical power to identify moderators is poor (31). Consequently, we present only the point estimate and 95% confidence interval for each interaction in preference to the P value (32). In our study, we had sufficient power to detect a between-subgroup standardized mean difference ranging from 0.2 to 0.3 in the primary outcomes.

Table 3. (Cont’d)

Each outcome is change from baseline.

| Baseline variable | RMDQ: P | RMDQ: difference (95% CI) | RMDQ: no. | MVK disability: P | MVK disability: difference (95% CI) | MVK disability: no. | MVK pain: P | MVK pain: difference (95% CI) | MVK pain: no. |
|---|---|---|---|---|---|---|---|---|---|
| MVK pain (range 0–100, where lower scores indicate less pain) (11) | 0.06 | 0.02 (−0.01, 0.04) | 494 | <0.001 | 0.24 (0.13, 0.36) | 549 | <0.001 | 0.32 (0.23, 0.42) | 583 |
| Ethnicity (UK Census categories; positive values indicate better outcome for those from a nonwhite background) | 0.54 | 0.22 (−0.49, 0.94) | 474 | 0.79 | 0.51 (−3.28, 4.30) | 523 | 0.96 | 0.09 (−3.08, 3.25) | 553 |
| SF-12 physical component score (positive values favor those with better physical quality of life) (19) | 0.80 | 0.01 (−0.03, 0.04) | 456 | 0.75 | −0.04 (−0.28, 0.20) | 510 | 0.13 | 0.15 (−0.04, 0.35) | 538 |
| SF-12 mental component score (positive values favor those with better mental quality of life) (19) | 0.57 | −0.01 (−0.04, 0.02) | 456 | 0.16 | −0.15 (−0.35, 0.06) | 510 | 0.97 | 0.003 (−0.17, 0.17) | 538 |

* RMDQ = Roland Morris Disability Questionnaire; MVK = modified Von Korff; difference = mean effect size of the interaction term; 95% CI = 95% confidence interval; FABQ = Fear-Avoidance Belief Questionnaire; HADS = Hospital Anxiety and Depression Scale; SF-12 = Short Form 12.

We make a clear distinction between confirmatory prespecified subgroup analyses and exploratory subgroup analyses, where the former is used for hypothesis testing and the latter is hypothesis generating (33,34). We used the Bonferroni correction to adjust for multiple testing for the prespecified subgroup analyses only. Had the significance level not been adjusted, the conclusions drawn still would have been the same. For our exploratory analyses, we did not make this correction to ensure that we did identify variables worthy of further exploration. Initially we identified potential predictors of outcome using univariate analyses, followed by a more formal approach of forward stepwise selection (model 3). The stepwise selection process does not identify the same predictors as that from the univariate analyses, as some covariates either become significant or insignificant in the presence of other covariates during the modeling process (35).

Our subgroup analyses considered 3 models: an unadjusted model and 2 adjusted models. The first of the adjusted models (model 2) adjusted for clinically relevant covariates that were prespecified in the protocol. The focus of this model was to try to estimate subgroup effects, correcting for relevant predictors as presented in the literature. The second of the adjusted models (model 3) contained statistically significant predictors of outcome selected using forward stepwise selection, where the choice of covariates in these models was data driven. Based on this particular study, the latter adjusted model (model 3) offered more precision when estimating the subgroup effects compared to the former adjusted model (model 2). Because of our concerns about the scaling and sensitivity to change of the RMDQ, we included MVK disability and pain scores as additional primary outcomes. Consequently, this analysis included 36 individual comparisons, increasing the risk of any statistically significant interactions being chance findings. Such statistically significant interactions that we have found are not consistent across the 3 outcome measures, and thus caution is needed in their interpretation. This does raise for us a slight concern that positive findings in previous studies might not have been consistent if different outcome measures had been used.
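The scale of the multiplicity problem can be illustrated with simple arithmetic on the 36 comparisons (12 moderators by 3 outcomes). The sketch below assumes that every test is carried out at the 5% level, that no true moderation exists, and that the tests are independent. The independence assumption is a deliberate simplification, since the 3 outcomes are correlated, so the figures are indicative only; the function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least 1 false-positive result among independent tests."""
    return 1 - (1 - alpha) ** n_tests


n_comparisons = 12 * 3  # 2 prespecified + 10 exploratory moderators, 3 outcomes
expected_false_positives = 0.05 * n_comparisons
fwer = familywise_error_rate(0.05, n_comparisons)
```

Under these assumptions about 2 statistically significant interactions would be expected by chance alone, and the probability of seeing at least 1 is above 80%.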

In common with other studies, even using a model fitted using a selection procedure, the proportion of the variance in outcome explained by baseline variables is modest. This is greatest for a pure disability measure (MVK disability), least for a pure pain measure (MVK pain), and intermediate for a mixed measure (RMDQ).

Troublesomeness was identified as a potential predictor of outcome in 2 of our 3 primary outcome measures in the univariate analyses, but was not a predictor in the multivariate analyses (model 3) and did not moderate treatment effect. All of the Back Skills Training Trial participants had at least moderately troublesome low back pain. It is possible, had we also included subjects with slightly troublesome pain, that we would have observed a difference, but knowing whether or not our intervention works for those who are only slightly troubled by their back pain may not be a high research priority.

Table 4. Multivariate analyses of outcomes at 12 months of followup (n = 598)*

Model 3

| Subgroup† | RMDQ‡ | MVK disability§ | MVK pain¶ |
|---|---|---|---|
| Prespecified subgroup analyses | | | |
| Fear avoidance, included % (R²) | 75 (0.21) | 80 (0.43) | 89 (0.18) |
| �14 to �14 | −0.01 (−1.53, 1.50) | −2.56 (−10.35, 5.23) | 2.18 (−5.35, 9.71) |
| Troublesomeness, included % (R²) | 79 (0.21) | 83 (0.42) | 93 (0.17) |
| Very/extremely to moderately | −1.01 (−2.52, 0.50) | −4.42 (−12.11, 3.27) | −5.04 (−12.47, 2.40) |
| Exploratory subgroup analyses | | | |
| Duration, included % (R²) | 76 (0.22) | 81 (0.42) | 90 (0.17) |
| �3 years to �3 years | 0.14 (−1.55, 1.83) | −3.00 (−11.59, 5.60) | −3.46 (−11.81, 4.89) |
| Age, included % (R²) | 79 (0.21) | 83 (0.42) | 93 (0.18) |
| �54 years to �54 years | −1.58 (−3.05, −0.12) | −1.67 (−9.28, 5.94) | −3.00 (−10.35, 4.35) |
| Sex, included % (R²) | 78 (0.21) | 83 (0.43) | 93 (0.17) |
| Female to male | −1.27 (−2.79, 0.25) | −3.59 (−11.30, 4.12) | −3.27 (−10.83, 4.28) |
| Left full-time education, included % (R²) | 75 (0.21) | 79 (0.42) | 89 (0.17) |
| Age �16 years to age �16 years | 1.29 (−0.24, 2.82) | 3.01 (−4.90, 10.92) | 4.15 (−3.47, 11.77) |
| Frequency of back pain, included % (R²) | 78 (0.21) | 83 (0.43) | 93 (0.18) |
| Comes and goes � getting better to fairly constant � getting worse | −0.12 (−1.81, 1.57) | 2.81 (−5.73, 11.35) | 0.15 (−8.25, 8.55) |
| Benefits, included % (R²) | 79 (0.21) | 82 (0.43) | 93 (0.17) |
| Benefits to no benefits | 0.32 (−1.67, 2.31) | −5.56 (−15.94, 4.81) | −0.22 (−10.06, 9.62) |
| Employed, included % (R²) | 79 (0.22) | 83 (0.43) | 93 (0.18) |
| Employed to not employed | 1.89 (0.43, 3.35) | 3.16 (−4.44, 10.75) | 5.01 (−2.33, 12.34) |
| HADS anxiety, included % (R²) | 78 (0.21) | 82 (0.43) | 92 (0.17) |
| �11 to �11 | −1.12 (−2.83, 0.58) | −2.15 (−10.97, 6.67) | −2.41 (−10.83, 6.01) |
| HADS depression, included % (R²) | 78 (0.22) | 83 (0.43) | 93 (0.17) |
| �11 to �11 | −2.07 (−4.79, 0.65) | −14.58 (−29.19, 0.03) | −4.82 (−17.98, 8.33) |
| Pain self-efficacy, included % (R²) | 79 (0.19) | 83 (0.42) | 93 (0.16) |
| �42 to �42 | −0.15 (−1.65, 1.36) | 2.05 (−5.67, 9.78) | 0.60 (−6.91, 8.11) |

* Values are the mean estimate of the interaction term (95% confidence interval) of the effect of the treatment difference between subgroups unless otherwise indicated. RMDQ = Roland Morris Disability Questionnaire; MVK = modified Von Korff; included % = percentage of 598 subjects included in this analysis; HADS = Hospital Anxiety and Depression Scale.
† The direction of the analyses is interpreted as presented, i.e., the former subgroup minus the latter.
‡ Adjusted for baseline RMDQ, employed, pain self-efficacy, and benefits.
§ Adjusted for baseline MVK disability, baseline MVK pain, baseline Short Form 12 (SF-12) physical, baseline SF-12 mental, and pain self-efficacy.
¶ Adjusted for baseline MVK pain, pain self-efficacy, and benefits.
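For readers who wish to gauge the strength of evidence behind an individual interaction term in Table 4, an approximate 2-sided P value can be back-calculated from the reported estimate and 95% confidence interval. The sketch below assumes a normal approximation with a symmetric interval (critical value 1.96); the trial analysis itself may have used a different reference distribution, so the results are indicative only. The function name is hypothetical.

```python
import math


def p_value_from_ci(estimate: float, lower: float, upper: float,
                    z_crit: float = 1.96) -> tuple[float, float, float]:
    """Approximate SE, z statistic, and 2-sided P value from a 95% CI.

    ASSUMES a symmetric, normal-theory interval (an assumption made here
    for illustration, not a statement of the trial's method).
    """
    se = (upper - lower) / (2.0 * z_crit)
    z = estimate / se
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)) for a standard normal.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Interaction terms taken from Table 4 (model 3).
rows = {
    "Age, RMDQ": (-1.58, -3.05, -0.12),
    "Employed, RMDQ": (1.89, 0.43, 3.35),
    "HADS depression, MVK disability": (-14.58, -29.19, 0.03),
}

for label, (est, lo, hi) in rows.items():
    se, z, p = p_value_from_ci(est, lo, hi)
    print(f"{label}: SE = {se:.2f}, z = {z:.2f}, P = {p:.3f}")
```

On this approximation, only the age and employment interactions for the RMDQ fall below 0.05, and neither would survive a correction for 36 comparisons; the HADS depression interaction for MVK disability sits just above 0.05, in line with a confidence interval that only just includes zero.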


Fear avoidance did not predict outcome in either our univariate analyses or multivariate analyses (model 3). Also, it did not moderate treatment effect in any of our 3 outcomes. Our participants had all consulted for low back pain in the previous 6 months and had significant ongoing problems. They might, however, not be the same population as those currently attending for treatment, who will tend to have more severe symptoms (36,37).

Only for the RMDQ is there any statistically significant moderation of outcome in our final model (model 3). Were it not for our concerns about the measurement properties of the RMDQ, we would only have these data to consider. Since similar effects were not seen in the MVK disability and MVK pain scores, these are unlikely to be true moderation effects. If these were true results, this would suggest that we should focus our efforts on the younger population that is currently working and not on the older unemployed population.

We have presented the data here just for those with chronic low back pain. We make no further comment on these data, as they are not the focus of this study but will be of use to others.

There is considerable research interest in identifying back pain subgroups. This analysis, in common with secondary analyses of the UK BEAM data set and a trial of acupuncture by Sherman et al, has failed to find convincing data to suggest that subgroups can be identified in existing trial data (29,30). Together they have considered a range of potential moderators for 5 different treatment packages in rigorous analyses. As a rule of thumb, to show an interaction between a potential moderator and treatment effect of a similar size to the main treatment effect, a 4-fold increase in sample size is required (31). The size of both of these studies was based on finding a main treatment effect rather than a moderation of such an effect. To our knowledge, only 1 trial has been explicitly powered to show moderator effects (n = 3,093) (28). In that trial, Witt et al found that acupuncture was more effective for those with worse initial back function, younger patients, and those with �10 years of schooling (28). Although we consider we have used the most appropriate statistical approach to our data, this approach may not yield positive results unless there are resources to run more trials of a similar size to the acupuncture trial by Witt et al. It is, therefore, perhaps surprising that some other studies have found apparently important effects on much smaller numbers. Great care is needed in interpretation of data from such studies. We now have a number of treatments of proven modest effectiveness. Any future studies will, therefore, need to compare 2 active treatments. The mean differences between the 2 treatments are likely to be much smaller than those comparing an active treatment to no treatment. To show a main effect will require a trial substantially larger than the Back Skills Training Trial; to show a statistically significant interaction, the sample size will need to be multiplied further. Any such trial would only test moderators as a single comparison between 2 treatments. It is unlikely that the very substantial funding needed for many such trials will be forthcoming, and any further research on subgrouping for those with nonspecific low back pain will need to consider adopting different approaches.
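The 4-fold rule of thumb can be illustrated with a simple variance calculation. The sketch below is a simplification rather than the derivation given in reference 31: it assumes a continuous outcome with a common standard deviation, 1:1 randomization, and 2 equally sized subgroups. Function and variable names are hypothetical, and the sample size of 600 is an arbitrary round number chosen for illustration.

```python
# Illustrative sketch of the 4-fold rule of thumb for interaction tests.
# ASSUMPTIONS (for illustration only): continuous outcome, common SD `sigma`,
# equal allocation to 2 arms, and 2 equally sized subgroups.


def var_main_effect(n_total: float, sigma: float = 1.0) -> float:
    """Variance of the difference in means between 2 arms of n_total / 2 each."""
    per_arm = n_total / 2.0
    return 2.0 * sigma ** 2 / per_arm


def var_interaction(n_total: float, sigma: float = 1.0) -> float:
    """Variance of the difference between 2 subgroup-specific treatment effects.

    Each of the 4 arm-by-subgroup cells holds n_total / 4 participants, and the
    interaction contrast combines all 4 cell means.
    """
    per_cell = n_total / 4.0
    return 4.0 * sigma ** 2 / per_cell


n = 600  # hypothetical total sample size
ratio = var_interaction(n) / var_main_effect(n)
print(f"Variance ratio (interaction / main effect): {ratio:.1f}")

# Quadrupling the sample brings the interaction variance down to the
# variance of the main effect at the original sample size.
print(f"Main effect variance at n:    {var_main_effect(n):.5f}")
print(f"Interaction variance at 4n:   {var_interaction(4 * n):.5f}")
```

Because the interaction estimate has 4 times the variance of the main effect at the same sample size, detecting an interaction as large as the main effect with the same power needs about 4 times as many participants. Unequal subgroup sizes, which are common in practice, make the requirement larger still.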

We suggest 2 alternative approaches that might be more fruitful. First, there are now many thousands of individuals with back pain who have been recruited to randomized controlled trials. If the research community were to collaborate to develop a repository of individual patient data, it may be possible, with the large number of subjects available, to develop statistical techniques that would allow moderators to be identified and clinical prediction rules to be developed (38). The acupuncture research community is already making progress in this direction (30). The back pain research community also needs to measure potential moderators and outcomes in a similar manner that is congruent with existing suggestions to facilitate such pooling (39–41).

Second, the research community should work together to develop some theoretically informed descriptors of back pain syndromes, which may include subject and clinical characteristics, that may respond to specific treatment approaches and then test the interventions in people meeting these criteria. The headache research community, for example, has developed a largely clinical classification of more than 200 different headache types that are now used to inform entry criteria for trials and clinical management without seeking to prove statistically that any one patient characteristic predicts response to treatment (42).

A robust secondary analysis of a large trial of a cognitive–behavioral approach did not identify baseline characteristics that modify treatment effects, allowing only a 0.2 to 0.3 between-subgroup standardized mean difference to be detected in the primary outcomes. Much larger studies would be needed to be confident that important moderators had not been overlooked; it is unlikely that such studies will appeal to funders. New research approaches are needed to confidently identify back pain subgroups.

AUTHOR CONTRIBUTIONS

All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Underwood had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. Underwood, Mistry, Lall, Lamb.
Acquisition of data. Underwood, Lall, Lamb.
Analysis and interpretation of data. Underwood, Mistry, Lall, Lamb.

REFERENCES

1. Savigny P, Watson P, Underwood M. Early management of persistent non-specific low back pain: summary of NICE guidance. BMJ 2009;338:b1805.

2. Lamb SE, Hansen Z, Lall R, Castelnuovo E, Withers EJ, Nichols V, et al. Group cognitive behavioural treatment for low-back pain in primary care: a randomised controlled trial and cost-effectiveness analysis. Lancet 2010;375:916–23.

3. Borkan JM, Koes B, Reis S, Cherkin DC. A report from the Second International Forum for Primary Care Research on Low Back Pain: reexamining priorities. Spine (Phila Pa 1976) 1998;23:1992–6.

4. Kraemer HC, Stice E, Kazdin A, Offord D, Kupfer D. How do risk factors work together? Mediators, moderators, and independent, overlapping, and proxy risk factors. Am J Psychiatry 2001;158:848–56.

5. Kamper SJ, Maher CG, Hancock MJ, Koes BW, Croft PR, Hay E. Treatment-based subgroups of low back pain: a guide to appraisal of research studies and a summary of current evidence. Best Pract Res Clin Rheumatol 2010;24:181–91.

6. Lamb SE, Lall R, Hansen Z, Withers EJ, Griffiths FE, Szczepura A, et al. Design considerations in a clinical trial of a cognitive behavioural intervention for the management of low back pain in primary care: Back Skills Training Trial. BMC Musculoskelet Disord 2007;8:14.

7. Hansen Z, Daykin A, Lamb SE. A cognitive-behavioural programme for the management of low back pain in primary care: a description and justification of the intervention used in the Back Skills Training Trial (BeST; ISRCTN 54717854). Physiotherapy 2010;96:87–94.

8. Lamb SE, Lall R, Hansen Z, Castelnuovo E, Withers EJ, Nichols V, et al. A multicentred randomised controlled trial of a primary care-based cognitive behavioural programme for low back pain: the Back Skills Training (BeST) trial. Health Technol Assess 2010;14:1–253.

9. Roland M, Waddell G, Klaber Moffett J, Burton K, Main C. The back book. 2nd ed. Norwich: The Stationery Office; 2002.

10. Roland M, Morris R. A study of the natural history of back pain. Part I: development of a reliable and sensitive measure of disability in low-back pain. Spine (Phila Pa 1976) 1983;8:141–4.

11. Underwood MR, Barnett AG, Vickers MR. Evaluation of two time-specific back pain outcome measures. Spine (Phila Pa 1976) 1999;24:1104–12.

12. Kent P, Keating JL, Leboeuf-Yde C. Research methods for subgrouping low back pain. BMC Med Res Methodol 2010;10:62.

13. United Kingdom back pain exercise and manipulation (UK BEAM) randomised trial: effectiveness of physical treatments for back pain in primary care. BMJ 2004;329:1377.

14. Parsons S, Carnes D, Pincus T, Foster N, Breen A, Vogel S, et al. Measuring troublesomeness of chronic pain by location. BMC Musculoskelet Disord 2006;7:34.

15. Pincus T, Vogel S, Burton AK, Santos R, Field AP. Fear avoidance and prognosis in back pain: a systematic review and synthesis of current evidence. Arthritis Rheum 2006;54:3999–4010.

16. Waddell G, Newton M, Henderson I, Somerville D, Main CJ. A Fear-Avoidance Beliefs Questionnaire (FABQ) and the role of fear-avoidance beliefs in chronic low back pain and disability. Pain 1993;52:157–68.

17. Zigmond AS, Snaith RP. The Hospital Anxiety and Depression Scale. Acta Psychiatr Scand 1983;67:361–70.

18. Nicholas MK. The Pain Self-Efficacy Questionnaire: taking pain into account. Eur J Pain 2007;11:153–63.

19. Ware J Jr, Kosinski M, Keller SD. A 12-item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care 1996;34:220–33.

20. Cleland JA, Childs JD, Fritz JM, Whitman JM, Eberhart SL. Development of a clinical prediction rule for guiding treatment of a subgroup of patients with neck pain: use of thoracic spine manipulation, exercise, and patient education. Phys Ther 2007;87:9–23.

21. Altman DG, Royston P. The cost of dichotomising continuous variables. BMJ 2006;332:1080.

22. Lagakos SW. The challenge of subgroup analyses: reporting without distorting. N Engl J Med 2006;354:1667–9.

23. Brookes ST, Whitley E, Peters TJ, Mulheran PA, Egger M, Davey Smith G. Subgroup analyses in randomised controlled trials: quantifying the risks of false-positives and false-negatives. Health Technol Assess 2001;5:1–56.

24. Assmann SF, Pocock SJ, Enos LE, Kasten LE. Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet 2000;355:1064–9.

25. Matthews JN, Altman DG. Interaction 3: how to examine heterogeneity. BMJ 1996;313:862.

26. Petrie A, Sabin C. Medical statistics at a glance. 3rd ed. Chichester (UK): Wiley-Blackwell; 2009.

27. Pocock SJ, Assmann SE, Enos LE, Kasten LE. Subgroup analysis, covariate adjustment and baseline comparisons in clinical trial reporting: current practice and problems. Stat Med 2002;21:2917–30.

28. Witt CM, Jena S, Selim D, Brinkhaus B, Reinhold T, Wruck K, et al. Pragmatic randomized trial evaluating the clinical and economic effectiveness of acupuncture for chronic low back pain. Am J Epidemiol 2006;164:487–96.

29. Underwood MR, Morton V, Farrin A. Do baseline characteristics predict response to treatment for low back pain? Secondary analysis of the UK BEAM dataset [ISRCTN32683578]. Rheumatology (Oxford) 2007;46:1297–302.

30. Sherman KJ, Cherkin DC, Ichikawa L, Avins AL, Barlow WE, Khalsa PS, et al. Characteristics of patients with chronic back pain who benefit from acupuncture. BMC Musculoskelet Disord 2009;10:114.

31. Brookes ST, Whitley E, Egger M, Smith GD, Mulheran PA, Peters TJ. Subgroup analyses in randomized trials: risks of subgroup-specific analyses; power and sample size for the interaction test. J Clin Epidemiol 2004;57:229–36.

32. Matthews JN, Altman DG. Statistics notes. Interaction 2: compare effect sizes not P values. BMJ 1996;313:808.

33. Kent DM, Rothwell PM, Ioannidis JP, Altman DG, Hayward RA. Assessing and reporting heterogeneity in treatment effects in clinical trials: a proposal. Trials 2010;11:85.

34. Pincus T, Miles C, Froud R, Underwood M, Carnes D, Taylor S. Methodological criteria for the assessment of moderators in systematic reviews of randomised controlled trials: a consensus study. BMC Med Res Methodol 2011;11:14.

35. Tu YK, Gunnell D, Gilthorpe MS. Simpson’s paradox, Lord’s paradox, and suppression effects are the same phenomenon: the reversal paradox. Emerg Themes Epidemiol 2008;5:2.

36. Poiraudeau S, Rannou F, Baron G, Le Henanff A, Coudeyre E, Rozenberg S, et al. Fear-avoidance beliefs about back pain in patients with subacute low back pain. Pain 2006;124:305–11.

37. George SZ, Fritz JM, Childs JD. Investigation of elevated fear-avoidance beliefs for patients with low back pain: a secondary analysis involving patients enrolled in physical therapy clinical trials. J Orthop Sports Phys Ther 2008;38:50–8.

38. Stewart LA, Parmar MK. Meta-analysis of the literature or of individual patient data: is there a difference? Lancet 1993;341:418–22.

39. Bombardier C. Outcome assessments in the evaluation of treatment of spinal disorders: summary and general recommendations. Spine 2000;25:3100–3.

40. Deyo RA, Battie M, Beurskens AJ, Bombardier C, Croft P, Koes B, et al. Outcome measures for low back pain research: a proposal for standardized use [published erratum appears in Spine 1999;24:418]. Spine 1998;23:2003–13.

41. Vickers AJ, Cronin AM, Maschino AC, Lewith G, MacPherson H, Victor N, et al. Individual patient data meta-analysis of acupuncture for chronic pain: protocol of the Acupuncture Trialists’ Collaboration. Trials 2010;11:90.

42. The International Classification of Headache Disorders: 2nd edition. Cephalalgia 2004;24 Suppl:9–160.
