Analysis of Longitudinal Substance Use Outcomes u


    Addiction (2000) 95(Supplement 3), S381-S394

    MAKING THE MOST OUT OF DATA ANALYSIS AND INTERPRETATION

    Analysis of longitudinal substance use outcomes using ordinal random-effects regression models

    DONALD HEDEKER¹ & ROBIN J. MERMELSTEIN²

    ¹Division of Epidemiology and Biostatistics, and Health Research and Policy Centers, University of Illinois at Chicago & ²Health Research and Policy Centers, and Department of Psychology, University of Illinois at Chicago, IL, USA

    Correspondence to: Donald Hedeker, Division of Epidemiology and Biostatistics (m/c 922), School of Public Health, University of Illinois at Chicago, 2121 West Taylor Street, Room 525, Chicago, IL 60612-7260, USA. e-mail: [email protected]

    Submitted 22nd February 2000; initial review completed 25th May 2000; final version accepted 10th July 2000.

    ISSN 0965-2140 print/ISSN 1360-0443 online/00/11S381-14 Society for the Study of Addiction to Alcohol and Other Drugs

    Carfax Publishing, Taylor & Francis Limited

    DOI: 10.1080/09652140020004296

    Abstract

    In this paper we describe analysis of longitudinal substance use outcomes using random-effects regression models (RRM). Among the advantages of this approach are that these models allow for incomplete data across time and for time-invariant and time-varying covariates, and can estimate individual change across time. Because substance use outcomes are often measured in terms of dichotomous or ordinal categories, our presentation focuses on categorical versions of RRM. Specifically, we present and describe an ordinal RRM that includes the possibility that covariate effects vary across the cutpoints of the ordinal outcome. This latter feature is particularly useful because a treatment can have varying effects on full versus partial abstinence, for example. Data from a smoking cessation study are used to illustrate application of this model for analysis of longitudinal substance use data.

    Introduction

    Longitudinal studies play an important role in substance use research. In these studies, levels of substance use are measured repeatedly across a series of timepoints, and the goal is often to examine the effects of different treatments and/or predictors on usage levels across time. In some cases, there is interest only in the final measurement, or the repeated usage levels can be aggregated to provide a single outcome per subject, for example, an average substance use level or a simple difference in substance use (e.g. pre- to post-change). In these cases, standard statistical analysis procedures can be readily applied. However, these approaches are limited because they either ignore change across time (i.e. end-point or averaged value analysis), or they only consider within-subjects change that is linear (i.e. pre- to post-change analysis). Furthermore, if subjects from different treatment groups drop out differentially across time, then treatment group and time are confounded in these analyses. Inclusion of time-varying covariates presents another problem for these approaches. Finally, from a statistical point of view these approaches are inefficient because much of the data that are collected are ignored. For these reasons, development of more general statistical methods for longitudinal data analysis has been an active area of statistical research in recent years. In particular, random-effects regression models (RRM) have been developed to overcome many of these limitations. With the concurrent development of software, RRM represent a popular approach for analysis of longitudinal data.

    Variants of RRM have been developed under a variety of names: random-effects models (Laird & Ware, 1982), variance component models (Dempster, Rubin & Tsutakawa, 1981), hierarchical linear models (Bryk & Raudenbush, 1992), multilevel models (Goldstein, 1995), two-stage models (Bock, 1989), random coefficient models (de Leeuw & Kreft, 1986), mixed models (Longford, 1987; Wolfinger, 1993), empirical Bayes models (Hui & Berger, 1983; Strenio, Weisberg & Bryk, 1983), unbalanced repeated-measures models (Jennrich & Schluchter, 1986) and random regression models (Bock, 1983a, 1983b). A basic characteristic of these models is the inclusion of random subject effects into regression models in order to account for the influence of subjects on their repeated observations. These random subject effects thus describe and explain the correlational structure of the longitudinal data. Additionally, they indicate the degree of subject variation that exists in the population of subjects.

    There are several features that make RRM especially useful in longitudinal research. First, subjects are not assumed to be measured on the same number of timepoints; thus, subjects with incomplete data across time are included in the analysis. The ability to include subjects with incomplete data across time is an important advantage relative to procedures that require complete data across time because (a) by including all data, the analysis has increased statistical power, and (b) complete-case analysis may suffer from biases to the extent that subjects with complete data are not representative of the larger population of subjects. Second, because time is treated as a continuous variable in RRM, subjects do not have to be measured at the same timepoints. This is useful for analysis of longitudinal studies where follow-up times are not uniform across all subjects. Third, both time-invariant and time-varying covariates can be included in the model. Thus, changes in the outcome variable may be due both to stable characteristics of the subject (e.g. their gender or race) and to characteristics that change across time (e.g. life-events). Finally, whereas traditional approaches estimate average change (across time) in a population, RRM can also estimate change for each subject. These estimates of individual change across time can be particularly useful in longitudinal studies where a proportion of subjects exhibit change across time that deviates from the average trend. For example, while most subjects over time may show an increase in drinking or smoking, there may be a proportion of subjects who do not.

    In terms of measurement scale, substance use outcomes are often categorical in nature. For example, subjects may simply be dichotomously classified as being abstinent or not at each timepoint. Alternatively, subjects might be classified in terms of degrees of use as none, mild, moderate or severe. Because substance use classifications like these are often the primary outcome variables in a study, statistical models for dichotomous or ordinal responses are particularly germane for analysis of alcohol and smoking data. In this regard, generalizations of RRM have been developed for dichotomous response data (Wong & Mason, 1985; Gibbons & Bock, 1987; Goldstein, 1991; Rosner, 1992; Stiratelli, Laird & Ware, 1984; Wolfinger & Lin, 1997) and ordinal response data (Jansen, 1990; Ezzet & Whitehead, 1991; Hedeker & Gibbons, 1994; Ten Have, 1996), thus allowing a general framework for analysis of both continuous and categorical longitudinal outcomes.

    As these methods have been developed and used more widely, application of RRM for substance use data has grown. For longitudinal smoking data, Hu et al. (1998) review and compare RRM to the generalized estimating equations (GEE) method for analysis of longitudinal dichotomous data. For longitudinal ordinal smoking outcomes, Hedeker & Mermelstein (1996) describe and illustrate use of ordinal RRM. Nich & Carroll (1997) compare RRM to traditional repeated measures analysis of variance for analysis of continuous drug composite scores. In addition to these papers focusing on RRM description and dissemination, several outcomes-orientated papers using RRM for analysis of longitudinal substance use data include Bernstein et al. (1994), Carroll et al. (1994, 1998), Chassin et al. (1996), Gallagher et al. (1997), Gruder et al. (1993), Halikas et al. (1997), Jason et al. (1997), O'Malley et al. (1996) and Salina et al. (1994). These latter papers are especially useful for illustrating how results from RRM can be described and reported.

    More generally, several textbooks describing RRM for longitudinal data analysis have recently been published (Bryk & Raudenbush, 1992; Longford, 1993; Diggle, Liang & Zeger, 1994; Goldstein, 1995; Hand & Crowder, 1996) and review, comparison, and/or tutorial articles on longitudinal data analysis treating RRM have proliferated (Gibbons et al., 1993; Burchinal, Bailey & Snyder, 1994; Manor & Kark, 1996; Cnaan, Laird & Slasor, 1997; Everitt, 1998; Albert, 1999; Delucchi & Bostrom, 1999; Keselman et al., 1999; Lesaffre, Asefa & Verbeke, 1999; Omar et al., 1999; Sullivan, Dukes & Losina, 1999). Most of these papers concern continuous response variables. Papers dealing specifically with dichotomous outcomes have also appeared (Zeger & Liang, 1992; Fitzmaurice, Laird & Rotnitzky, 1993; Gibbons & Hedeker, 1994; Pendergast et al., 1996), although most of these are somewhat technical.

    Statistical software to perform RRM analysis has also proliferated, especially for continuous outcomes (SAS PROC MIXED; BMDP 5V; HLM: Bryk, Raudenbush & Congdon, 1996; MIXREG: Hedeker & Gibbons, 1996b; MLwiN: Goldstein et al., 1998). For categorical data, software has become available for dichotomous (EGRET, Statistics and Epidemiology Research Corporation, 1991) and ordinal or nominal outcomes (SAS PROC NLMIXED; HLM; MLwiN; MIXOR: Hedeker & Gibbons, 1996a; MIXNO: Hedeker, 1999). Of course, software for nominal and ordinal outcomes can be used to fit models for dichotomous outcomes. Review articles comparing some of these software programs include Kreft et al. (1994), van der Leeden, Vrijburg & de Leeuw (1996) and de Leeuw & Kreft (1999).

    Because substance use outcomes are primarily categorical, we will focus on RRM for categorical outcomes in this paper. Also, since a dichotomous outcome is simply a special case of an ordinal outcome with only two categories, we will consider ordinal outcomes only. For longitudinal ordinal outcomes, most of the models include an assumption called the proportional odds assumption (McCullagh, 1980). A proportional odds model, for an ordinal response with $K$ categories, assumes that the effect of a covariate is proportional across the model's $K - 1$ cumulative odds, or homogeneous across the corresponding $K - 1$ cumulative logits (i.e. the log odds). For alcohol and smoking outcomes this assumption may not be reasonable. For example, suppose that there are three categories (abstinence, mild use, severe use) and suppose that an intervention is not successful in increasing the proportion of individuals in the abstinence category but is successful in moving individuals from severe to mild use. In this case, the effect of intervention group (i.e. the covariate) would not be observed on the first cumulative odds (i.e. comparing abstinence versus the two use categories combined), but would be observed on the second cumulative odds (i.e. comparing abstinence and mild use combined versus the severe use category).

    Recently, Hedeker & Mermelstein (1998) described an extension of the random-effects proportional odds model to allow for non-proportional odds for the covariate effects. In this extended model, covariates can be specified either requiring or relaxing the proportional odds assumption. As applied to longitudinal substance use data, this model allows the influence of covariates (e.g. time and intervention group) to vary, for example, across the levels of alcohol or smoking usage. Thus, variables can have different, or heterogeneous, effects on mild and severe usage. In this paper, data from a longitudinal smoking cessation study will be used to illustrate application of this more general model.

    RRM for longitudinal categorical data

    RRM for categorical outcomes generally adopt either a probit or logistic regression model and utilize various methods for incorporating and estimating the influence of the random effects. A recent review paper (Pendergast et al., 1996) presents and describes many of these approaches. In general, parameter estimation is computationally more intensive for these models than for models of continuous outcome data. In this article, we will focus on application of the model. Interested readers should consult the Pendergast et al. review paper for details on estimation. Also, because the logistic regression model is probably more commonly used than the probit regression model, we will present the model in terms of the logistic response function. For an introduction to the use of the probit model for longitudinal dichotomous outcomes, see Gibbons & Hedeker (1994).

    With the logistic response function, the model for subject $i$ ($i = 1, 2, \ldots, N$ subjects) on occasion $j$ ($j = 1, 2, \ldots, n_i$ occasions) can be written in terms of the log odds of response for a dichotomous outcome $Y$ (with values, for instance, 1 and 2) as:

    $$\log\left[\frac{P(Y_{ij} = 1)}{1 - P(Y_{ij} = 1)}\right] = \alpha_0 + \beta_1 t_{ij} + \beta_2 x_i + \beta_3 x_{ij} + v_i$$

    log odds of response = constant + time effect + subject-varying covariate + subject- and time-varying covariate + subject-specific effect

    The numerator is the probability of a 1 response, and the denominator $1 - P(Y_{ij} = 1)$ equals the probability of a 2 response. The ratio of these probabilities is the odds of a 1 response, and the log of this ratio is the log odds of a 1 response (sometimes called the logit of $P$). Notice that the log odds is equal to 0 when the probability of a 1 response equals 0.5 (i.e. equal odds of a response in category 1 and category 2), is negative when the probability is less than 0.5 (i.e. odds favoring a response in category 2), and is positive when the probability is greater than 0.5 (i.e. odds favoring a response in category 1).
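The relationship between probability, odds and log odds described above can be checked numerically. The following is a minimal sketch; the function names `log_odds` and `inverse_logit` are illustrative choices, not part of the paper or of any software it cites.

```python
import math


def log_odds(p: float) -> float:
    """Return the log odds (logit) of a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(eta: float) -> float:
    """Map a log odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Probabilities below, at and above 0.5 give negative, zero and positive log odds.
for p in (0.25, 0.5, 0.75):
    print(f"P = {p:.2f}  odds = {p / (1 - p):.3f}  log odds = {log_odds(p):+.3f}")
```

The loop shows the sign pattern stated in the text: negative log odds below 0.5, zero at 0.5 and positive above 0.5.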

    In terms of the regression parameters, $\alpha_0$ is the intercept, $\beta_1$ is the coefficient for the effect of time $t_{ij}$, $\beta_2$ is the coefficient for the time-invariant covariate $x_i$, and $\beta_3$ is the coefficient for the time-varying covariate $x_{ij}$. Notice that the subscripts for the covariates indicate whether the variable varies by subjects ($i$) or across subjects and time ($ij$). Covariate interactions can be included in the same way as interactions are included in the usual multiple regression model. For example, in the above model $x_i$ might represent the type of treatment that a subject is assigned to for the course of the study, while $x_{ij}$ might be the treatment by time interaction, which is obtained as the product of $x_i$ and $t_{ij}$.

    The remaining term, $v_i$, which represents the influence of subject $i$ on the log odds of response across all time-points, is what separates the above model from an ordinary (fixed-effects) logistic regression model. This term indicates the influence of individual $i$ on his/her repeated observations, and because individuals in a sample are thought typically to be representative of a larger population of individuals, the individual-specific effects $v_i$ are treated as random effects. That is, the $v_i$ are considered to be representative of a distribution of individual effects in the population. The most commonly assumed form for this population distribution is the normal distribution with mean 0 and variance $\sigma^2_v$. Because there is only one random individual effect in this model, it is sometimes referred to as a random-intercepts model, with each $v_i$ indicating how individual $i$ deviates from the (fixed-effects part of the) model.
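To see how a random intercept shifts one subject's probabilities away from the fixed-effects part of the model, consider the sketch below. All numerical values (`fixed_part`, `sigma_v`) are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import math


def inverse_logit(eta):
    """Convert a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def subject_probability(fixed_part, v_i):
    """P(Y_ij = 1) for a subject whose random intercept is v_i."""
    return inverse_logit(fixed_part + v_i)


# Hypothetical values, for illustration only.
fixed_part = 0.0   # alpha_0 + beta_1*t + ... for some covariate pattern
sigma_v = 1.5      # assumed standard deviation of the subject effects

for label, v_i in (("1 SD below", -sigma_v), ("average", 0.0), ("1 SD above", sigma_v)):
    print(f"{label:>10}: P(Y = 1) = {subject_probability(fixed_part, v_i):.3f}")
```

Subjects with the same covariate values can therefore have quite different response probabilities, and the size of $\sigma_v$ governs how different.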

    This model for the dichotomous outcome can also be written as:

    $$\log\left[\frac{P(Y_{ij} \le 1)}{1 - P(Y_{ij} \le 1)}\right] = \alpha_0 + \beta_1 t_{ij} + \beta_2 x_i + \beta_3 x_{ij} + v_i$$

    which then generalizes to the random-intercepts ordinal logistic regression model:

    $$\log\left[\frac{P(Y_{ij} \le k)}{1 - P(Y_{ij} \le k)}\right] = \alpha_{0k} + \beta_1 t_{ij} + \beta_2 x_i + \beta_3 x_{ij} + v_i, \qquad k = 1, \ldots, K - 1, \tag{1}$$

    where $\alpha_{0k}$ are the $K - 1$ intercept terms to model the marginal frequencies in the $K$ ordered categories. In this representation, a positive value for a regression coefficient $\beta$ indicates a negative association between $Y$ and the covariate, in the sense that large values of $Y$ are relatively less likely to occur for larger values of the covariate. Thus, if the outcome is coded as 1 = no use, 2 = minimal use, and 3 = heavy use, a covariate with a positive regression coefficient would imply less use with higher values of the covariate.
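The sign convention can be made concrete by converting cumulative logits into category probabilities. The sketch below assumes three categories and uses hypothetical intercepts and a hypothetical positive coefficient; the helper name `category_probabilities` is an illustrative choice.

```python
import math


def inverse_logit(eta):
    """Convert a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def category_probabilities(intercepts, linear_part):
    """Category probabilities implied by K-1 cumulative logits.

    intercepts  : increasing intercepts alpha_0k, k = 1..K-1
    linear_part : beta*x (plus v_i, if any), shared by all cumulative logits
    """
    cumulative = [inverse_logit(a + linear_part) for a in intercepts] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)   # P(Y = k) = P(Y <= k) - P(Y <= k-1)
        previous = c
    return probs


# Hypothetical values: 1 = no use, 2 = minimal use, 3 = heavy use.
intercepts = [-0.5, 1.0]
beta = 0.8   # positive coefficient

for x in (0, 1):
    probs = category_probabilities(intercepts, beta * x)
    print(f"x = {x}: " + ", ".join(f"{p:.3f}" for p in probs))
```

With a positive coefficient, moving from x = 0 to x = 1 raises the probability of the lowest category and lowers the probability of the highest, which is the "less use" reading given in the text.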

    The above model makes what is called the proportional odds assumption. This proportional odds characterization for ordinal response models, discussed in detail by Agresti (1989), has some important features. The left-hand side of the equality in Equation (1) specifies $K - 1$ cumulative logits, each contrasting the combined first $k$ categories to the remaining combined $(K - k)$ categories. For example, with four possible response categories (coded as 1, 2, 3 or 4), the following three cumulative logits are indicated by the model:

    $$\log\left[\frac{P(Y_{ij} \le 1)}{1 - P(Y_{ij} \le 1)}\right] = \log\left[\frac{P(Y_{ij} = 1)}{P(Y_{ij} = 2, 3 \text{ or } 4)}\right]$$

    $$\log\left[\frac{P(Y_{ij} \le 2)}{1 - P(Y_{ij} \le 2)}\right] = \log\left[\frac{P(Y_{ij} = 1 \text{ or } 2)}{P(Y_{ij} = 3 \text{ or } 4)}\right]$$

    $$\log\left[\frac{P(Y_{ij} \le 3)}{1 - P(Y_{ij} \le 3)}\right] = \log\left[\frac{P(Y_{ij} = 1, 2 \text{ or } 3)}{P(Y_{ij} = 4)}\right]$$

    As the regression coefficients $\beta$ do not carry the $k$ subscript, it is assumed that the effect of a covariate is homogeneous across these $K - 1$ cumulative logits, or proportional across the cumulative odds. Under Equation (1), the odds of a response in category $k$ or lower (for any fixed $k$) are multiplied by $\exp(\beta)$ for a unit change of the given covariate $x$. With four ordered categories, the model simultaneously describes the effect of $x$ on all three cumulative comparisons between the probabilities (i.e. 1 vs. 2, 3 or 4; 1 or 2 vs. 3 or 4; and 1, 2 or 3 vs. 4). Thus, a single effect is estimated for each covariate: the homogeneous effect of the covariate on the $K - 1$ cumulative logits.

    The proportional odds assumption is not always reasonable, and examples violating this assumption are not hard to find (Peterson & Harrell, 1990). For example, an intervention might be successful at reducing alcohol use from heavy to intermediate usage categories (i.e. mild usage) but not to total abstinence. For such heterogeneous effects across the ordered response categories, a model that relaxes the proportional odds assumption is necessary. For cross-sectional data, Peterson & Harrell (1990) and Terza (1985) describe ordinal logistic and probit models, respectively, relaxing this assumption. Applying this model to stages of change data, Hedeker, Mermelstein & Weeks (1999) described it as a "thresholds of change" model. For longitudinal data, Hedeker & Mermelstein (1998) further developed this model and termed it a multilevel "thresholds of change" model. This model can be written as:

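    $$\log\left[\frac{P(Y_{ij} \le k)}{1 - P(Y_{ij} \le k)}\right] = \alpha_{0k} + \boldsymbol{w}_{ij}^{\prime} \boldsymbol{\alpha}_k + \boldsymbol{x}_{ij}^{\prime} \boldsymbol{\beta} + v_i, \qquad k = 1, \ldots, K - 1, \tag{2}$$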

    where $\boldsymbol{w}_{ij}$ and $\boldsymbol{x}_{ij}$ denote the covariate vectors with heterogeneous and homogeneous effects, respectively. Because the regression coefficient vector $\boldsymbol{\alpha}_k$ carries the $k$ subscript, each of the covariates in $\boldsymbol{w}_{ij}$ has $K - 1$ effects, one for each of the $K - 1$ cumulative logits. In this way, a covariate can have no effect on complete cessation (e.g. the first category, thus the first logit), but can have positive effects on reducing the proportions of subjects in the highest usage categories (e.g. the second and third logits). Additionally, by comparing model fit assuming homogeneous versus heterogeneous effects for a covariate or set of covariates, a test of the proportional odds assumption can be performed.
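The difference between homogeneous and heterogeneous effects can be illustrated with a small numerical sketch. The intercepts and effects below are hypothetical, and the random effect is set aside for simplicity. The explicit ordering check reflects a general property of models with threshold-specific effects: the implied cumulative probabilities are not guaranteed to be increasing for every covariate value.

```python
import math


def inverse_logit(eta):
    """Convert a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def cumulative_probabilities(intercepts, effects, w):
    """P(Y <= k), k = 1..K-1, when covariate w has threshold-specific effects."""
    cumulative = [inverse_logit(a + e * w) for a, e in zip(intercepts, effects)]
    # Threshold-specific effects can produce cumulative probabilities that
    # are not increasing in k, so check explicitly.
    if any(later < earlier for earlier, later in zip(cumulative, cumulative[1:])):
        raise ValueError("cumulative probabilities are not ordered for this w")
    return cumulative


# Hypothetical values: 1 = abstinent, 2 = mild use, 3 = severe use.
intercepts = [-1.0, 0.5]       # alpha_01, alpha_02
homogeneous = [0.7, 0.7]       # same effect on both logits (proportional odds)
heterogeneous = [0.0, 0.7]     # no effect on abstinence, effect on severe use


def cumulative_odds_ratios(effects):
    """Ratio of cumulative odds at w = 1 versus w = 0, one per cutpoint."""
    control = cumulative_probabilities(intercepts, effects, 0)
    treated = cumulative_probabilities(intercepts, effects, 1)
    return [(t / (1 - t)) / (c / (1 - c)) for c, t in zip(control, treated)]


for name, effects in (("homogeneous", homogeneous), ("heterogeneous", heterogeneous)):
    print(name, [round(r, 3) for r in cumulative_odds_ratios(effects)])
```

Under homogeneous effects both cumulative odds ratios equal $\exp(0.7)$; under the heterogeneous specification the first ratio is 1 (no effect on abstinence) while the second remains $\exp(0.7)$.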

    It is worth noting that ordinal regression models are often motivated and described using the "threshold concept" (Bock, 1975). Here, it is assumed that a continuous latent variable underlies the observed ordinal response. Specifically, with $K$ ordered response categories, $K - 1$ strictly increasing thresholds separate values of the unobserved continuous variable into the observed ordinal responses. This concept, and its description via thresholds, is what leads to the "thresholds of change" formulation of the above heterogeneous effects model described in Hedeker, Mermelstein & Weeks (1999) and Hedeker & Mermelstein (1998). As noted by McCullagh & Nelder (1989), the assumption of a continuous latent distribution, while providing a useful motivating concept, is not a strict requirement for use of ordinal regression models like the kind presented in this paper.
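One way to connect the threshold concept to Equation (1) is sketched below. The latent variable $y^{*}_{ij}$ and error term $\varepsilon_{ij}$ are notation introduced here for illustration (they do not appear in the paper), and the sign of the latent regression is chosen to match the sign convention of Equation (1).

```latex
\begin{aligned}
y^{*}_{ij} &= -(\beta_1 t_{ij} + \beta_2 x_i + \beta_3 x_{ij} + v_i) + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim \text{standard logistic},\\
Y_{ij} \le k &\iff y^{*}_{ij} \le \alpha_{0k},
  \qquad \alpha_{01} < \alpha_{02} < \dots < \alpha_{0,K-1},\\
P(Y_{ij} \le k) &= P\bigl(\varepsilon_{ij} \le \alpha_{0k} + \beta_1 t_{ij} + \beta_2 x_i + \beta_3 x_{ij} + v_i\bigr)\\
  &= \frac{1}{1 + \exp\{-(\alpha_{0k} + \beta_1 t_{ij} + \beta_2 x_i + \beta_3 x_{ij} + v_i)\}} .
\end{aligned}
```

Taking the log odds of the last expression recovers the right-hand side of Equation (1), with the intercepts $\alpha_{0k}$ playing the role of the thresholds.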

    Repeated ordinal outcomes: change across time in smoking abstinence

    In our example, we present and describe use of RRM to analyze longitudinal smoking data, but these same methods can be applied to alcohol or other substance use data. The data for our example come from a study on the use of extended telephone contact in a multi-component smoking cessation program. All subjects participated in a 7-week group treatment program. Following the group program, subjects were randomized to receive one of two types of telephone treatment (Standard or Recycling condition) that differed in the content of the phone calls. Seven counselor-initiated telephone calls were scheduled in both conditions. The calls started the week following the last group meeting and were spaced approximately every other week. In the Standard condition, counselors only gave subjects words of encouragement without specific guidance (e.g. "keep trying" if the subject had relapsed, or "congratulations" if the subject was abstinent). In the Recycling condition, the telephone call treatment protocol varied depending on the subject's smoking status: still smoking; abstinent; slipped; or relapsed. The goals of the Recycling calls were: (1) to encourage subsequent quit attempts in subjects still smoking at the end of treatment by helping them to problem solve barriers and reset quit dates; (2) to help prevent relapse in subjects who were abstinent by providing continued encouragement and planning for high-risk situations; and (3) to help "Recycle" subjects who had relapsed (to quit again) by debriefing the relapse episode, overcoming barriers and resetting quit dates.

    Table 1. Distribution of participants by group and days abstinent across time

                         Standard group                    Recycling group
                 0 days    1-5 days   6-7 days     0 days    1-5 days   6-7 days
    3-month      55.3%     5.5%       39.1%        36.5%     13.2%      50.4%
                 (140)     (14)       (99)         (97)      (35)       (134)
    9-month      59.2%     11.5%      29.3%        52.7%     15.8%      31.5%
                 (170)     (33)       (84)         (164)     (49)       (98)
    15-month     57.7%     12.9%      29.4%        54.8%     12.9%      32.3%
                 (161)     (36)       (82)         (166)     (39)       (98)

    Following the end of the group treatment, subjects were interviewed every 3 months for 15 months and asked to recall retrospectively their daily smoking behavior. In addition, reports of abstinence were verified biochemically through a combination of expired-air carbon monoxide and saliva cotinine at the post 1 (end of groups), 3-, 6- and 15-month follow-up points. Carbon monoxide values less than 8 p.p.m. and cotinine values less than 10 ng/ml verified abstinence. At each follow-up point, the number of days abstinent during that week was obtained and categorized into one of three outcome categories: 0, 1-5 or 6-7 days abstinent. For this example, for simplicity, we will focus on only three of the follow-up time points: the 3-, 9- and 15-month follow-up points. Table 1 lists the observed percentages in these three categories across time for the two intervention groups.

    As can be seen in Table 1, although the majority of subjects fell into the two extreme categories (either 0 days abstinent per week or 6-7 days abstinent per week), there were still noticeable proportions of subjects who fell in the middle outcome group. Treating the number of days abstinent as an ordinal outcome, rather than recoding it into a dichotomous outcome, is advantageous since one of the issues with alcohol and smoking research is documenting transitional states that are not represented by dichotomous outcomes. Furthermore, the cutpoint for abstinence or "success" is often debatable. The advantage of treating the outcome as more than a dichotomy is that it allows researchers to consider more fine-grained patterns of behavior.

    Table 1 also lists the group sample sizes across time and, as can be seen, the amount of data per subject varied. Clearly, not all subjects were measured at each and every time-point. Also, in terms of an intervention effect, there appears to be an initial advantage for the Recycling group. However, by the last follow-up the groups appear to be similar. In general, a trend of decreased abstinence across time is observed.

    To prepare for the analysis, Table 2 lists the observed cumulative odds and logits (corresponding to the three outcome categories) across the three time-points broken down by group. Since the first cumulative logit separates 0 days abstinent from the 1-5 and 6-7 days abstinent categories, it is termed the partial abstinence threshold. Similarly, the second cumulative logit is termed the full abstinence threshold because it separates those below 6-7 days from those 6-7 days abstinent. As can be seen from Table 2, essentially all of these logits increase in value across time, indicating less abstinence. However, comparing the 9- and 15-month logits to the 3-month logits, it appears that the time effect is more pronounced in terms of the full abstinence threshold than the partial abstinence threshold. In other words, over time, the proportion of subjects in the 0 days abstinent category is more stable than in the 6-7 days abstinent category.

    Table 2. Days abstinent by group across time: observed cumulative odds (and logits)

                              Standard group                            Recycling group
                 1 vs. 2-3*           1-2 vs. 3            1 vs. 2-3            1-2 vs. 3
                 Partial abstinence   Full abstinence      Partial abstinence   Full abstinence
                 threshold            threshold            threshold            threshold
    3-month      140/113 = 1.24       154/99 = 1.56        97/169 = 0.57        132/134 = 0.99
                 (0.21)               (0.44)               (-0.56)              (-0.02)
    9-month      170/117 = 1.45       203/84 = 2.42        164/147 = 1.12       213/98 = 2.17
                 (0.37)               (0.88)               (0.11)               (0.78)
    15-month     161/118 = 1.36       197/82 = 2.40        166/137 = 1.21       205/98 = 2.09
                 (0.31)               (0.88)               (0.19)               (0.74)

    *1 = 0 days abstinent, 2 = 1-5 days abstinent, 3 = 6-7 days abstinent.
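The entries of Table 2 follow directly from the counts in Table 1. As a sketch of that arithmetic (the data structure and the function name `cumulative_odds` are illustrative choices):

```python
import math

# Counts from Table 1, ordered as (0 days, 1-5 days, 6-7 days abstinent).
counts = {
    ("Standard", "3-month"): (140, 14, 99),
    ("Standard", "9-month"): (170, 33, 84),
    ("Standard", "15-month"): (161, 36, 82),
    ("Recycling", "3-month"): (97, 35, 134),
    ("Recycling", "9-month"): (164, 49, 98),
    ("Recycling", "15-month"): (166, 39, 98),
}


def cumulative_odds(cell_counts):
    """Observed odds of falling at or below each of the K-1 cutpoints."""
    total = sum(cell_counts)
    odds, below = [], 0
    for count in cell_counts[:-1]:
        below += count
        odds.append(below / (total - below))
    return odds


for (group, time), cell_counts in counts.items():
    partial, full = cumulative_odds(cell_counts)
    print(f"{group:9} {time:8} partial: {partial:.2f} ({math.log(partial):+.2f})  "
          f"full: {full:.2f} ({math.log(full):+.2f})")
```

For the Standard group at 3 months, for instance, the first cumulative odds is 140/(14 + 99) = 140/113, and its logarithm is the partial abstinence logit.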

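    To examine these observations more formally, the following random-intercepts ordinal logistic regression model was fit:

    $$\log\left[\frac{P(Y_{ij} \le k)}{1 - P(Y_{ij} \le k)}\right] = \alpha_{0k} + \beta_1 \mathrm{Time1}_j + \beta_2 \mathrm{Time2}_j + \beta_3 \mathrm{Group}_i + \beta_4 (\mathrm{Group}_i \times \mathrm{Time1}_j) + \beta_5 (\mathrm{Group}_i \times \mathrm{Time2}_j) + v_i \tag{3}$$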

    where $\mathrm{Time1}$ contrasts 9 to 3 months, $\mathrm{Time2}$ contrasts 15 to 3 months, and $\mathrm{Group}$ is coded as 0 for Standard and 1 for Recycling. Because of these codings and because of the interactions in the model, $\beta_1$ and $\beta_2$ represent the time effects for the Standard group, $\beta_3$ represents the group difference at 3 months, and $\beta_4$ and $\beta_5$ represent differences in the time effects between groups. Also, because subjects were measured at the same time-points in this study, the time variables only carry the occasion subscript $j$ and not the additional subject subscript $i$.

    The left-hand side of Table 3 lists the parameter estimates and standard errors for this model assuming homogeneous covariate effects. All analyses were performed using MIXOR (Hedeker & Gibbons, 1996a). This shareware program and its manual are available at http://www.uic.edu/~hedeker/mix.html. Readers interested in the program specifications for the examples in this article can send a note to the first author at [email protected]. The p-values indicated next to the parameter estimates are obtained using the so-called Wald test (Wald, 1943), which uses the ratio of the parameter estimate to its standard error to determine statistical significance. These test statistics (i.e. $Z$ = ratio of the parameter estimate to its standard error) are compared to a standard normal frequency table to test the null hypothesis that a given parameter equals 0. Alternatively, these $Z$-statistics are sometimes squared, in which case the resulting test statistic is distributed as chi-square on 1 degree of freedom. In either case, the p-values are identical.
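The Wald test described above is simple enough to reproduce by hand. The sketch below uses only the Python standard library; the function names are illustrative, and the estimate and standard error are the 9-month (vs. 3-month) values reported for the homogeneous model in Table 3.

```python
import math


def wald_test(estimate, standard_error):
    """Return (Z, two-sided p-value) for the null hypothesis that the parameter is 0."""
    z = estimate / standard_error
    # Two-sided standard normal tail area: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def chi_square_1df_p(z):
    """p-value from referring Z squared to a chi-square distribution on 1 df."""
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(z * z / 2.0))


# 9-month vs. 3-month effect, homogeneous model in Table 3.
z, p = wald_test(0.621, 0.265)
print(f"Z = {z:.2f}, normal p = {p:.3f}, chi-square p = {chi_square_1df_p(z):.3f}")
```

The two p-values agree, as the text states, and the result falls between 0.01 and 0.05, matching the two-asterisk marking of this estimate in Table 3.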

    Inspection of Table 3 reveals significant time, group and group × time interaction terms. The significant time effects indicate the increasing levels of smoking at 9 and 15 months, relative to 3 months, for the Standard group. Since the group effect is negative and significant, the Recycling group had increased abstinence at 3 months, relative to the Standard group. However, since the group × time interaction terms are positive and significant, this beneficial effect of the Recycling group goes away at the 9- and 15-month follow-ups.
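Because of the dummy coding, the estimated group difference at each follow-up is a sum of coefficients. As a sketch of that arithmetic, using the homogeneous-effects estimates from Table 3 (the variable and function names are illustrative):

```python
# Homogeneous-effects estimates from Table 3.
beta_group = -1.324        # Group (0 = Standard, 1 = Recycling) at 3 months
beta_group_x_9mo = 0.989   # Group x 9-month interaction
beta_group_x_15mo = 1.039  # Group x 15-month interaction


def group_difference(time1, time2):
    """Recycling minus Standard on the cumulative logit scale.

    time1 = 1 at 9 months, time2 = 1 at 15 months, both 0 at 3 months.
    """
    return beta_group + beta_group_x_9mo * time1 + beta_group_x_15mo * time2


for label, (time1, time2) in (("3-month", (0, 0)), ("9-month", (1, 0)), ("15-month", (0, 1))):
    print(f"{label:>8}: {group_difference(time1, time2):+.3f}")
```

The difference shrinks from -1.324 at 3 months to about -0.335 and -0.285 at 9 and 15 months. Standard errors for these sums cannot be obtained from Table 3 alone, since the covariances between estimates are not reported, so this sketch describes the size of the differences only.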

    The next set of estimates, presented in the middle columns of Table 3, is for a model allowing heterogeneous effects for time and group. To represent this change, the regression coefficients for the time and group effects carry the $k$ subscript and are denoted with $\alpha$ rather than $\beta$:

    $$\log\left[\frac{P(Y_{ij} \le k)}{1 - P(Y_{ij} \le k)}\right] = \alpha_{0k} + \alpha_{1k} \mathrm{Time1}_j + \alpha_{2k} \mathrm{Time2}_j + \alpha_{3k} \mathrm{Group}_i + \beta_4 (\mathrm{Group}_i \times \mathrm{Time1}_j) + \beta_5 (\mathrm{Group}_i \times \mathrm{Time2}_j) + v_i \tag{4}$$


    Table 3. Ordinal logistic random-effects regression model estimates (standard errors): homogeneous and heterogeneous time, group and group by time effects

                          Homogeneous effects     Some heterogeneous      All heterogeneous
                                                  effects                 effects
    Parameter             Partial abs Full abs    Partial abs Full abs    Partial abs Full abs
    Intercept             0.388       1.711***    0.696**     1.427***    0.769**     1.360***
                          (0.301)     (0.306)     (0.324)     (0.325)     (0.330)     (0.334)
    9-month               0.621**                 0.298       0.950***    0.234       1.025***
    (vs. 3-month)         (0.265)                 (0.287)     (0.318)     (0.298)     (0.370)
    15-month              0.560**                 0.288       0.816***    0.124       1.026**
    (vs. 3-month)         (0.242)                 (0.273)     (0.275)     (0.290)     (0.298)
    Group                 -1.324***               -1.632***   -1.261***   -1.751***   -1.052**
    (0 = std, 1 = recy)   (0.402)                 (0.433)     (0.415)     (0.445)     (0.434)
    Group × 9-month       0.989***                1.031***                1.191***    0.853*
                          (0.365)                 (0.374)                 (0.417)     (0.468)
    Group × 15-month      1.039***                1.085***                1.456***    0.642
                          (0.324)                 (0.336)                 (0.393)     (0.399)
    Subject SD σ_v        3.831***                3.815***                3.830***
                          (0.250)                 (0.250)                 (0.252)
    -2 log L              2611.8                  2596.8                  2592.4

    *** p < 0.01; ** p < 0.05; * p < 0.10.
    A single entry under a model is an effect common to both cumulative logits.

    Comparing this model to the previous one by a likelihood-ratio χ² test indicates an improvement in model fit; the difference in deviance (−2 log L) equals 2611.8 − 2596.8 = 15, which is significant at p < 0.005 for this χ² test on 3 degrees of freedom. The number of degrees of freedom for this test equals the difference in the number of estimated parameters between the two models. Additionally, the null hypothesis of equal effects across the two cumulative logits is rejected for each of the three parameters (p < 0.004, 0.02 and 0.02 for the 9 months, 15 months and group terms, respectively). As can be seen, the time effects are not significant in terms of the first cumulative logit; they are significant only in terms of the second cumulative logit. In contrast, the group effect is appreciable on both, but is more pronounced for the first cumulative logit. Because the first cumulative logit compares 0 days abstinent to 1–5 and 6–7 days abstinent, the non-significance of the time effects indicates that the proportion of subjects with 0 days abstinent is relatively similar across time for the Standard group. This is seen in Tables 1 and 2. The heterogeneous group effect indicates that there is a larger group difference at 3 months for the 0 days abstinent category than the 6–7 days abstinent category. This is clearly seen in Table 2, where the difference at 3 months is −0.77 and −0.46 for the first and second cumulative logits, respectively.

    The final model presented in Table 3 relaxes the homogeneity (i.e. proportional odds) assumption for all regression coefficients.

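    $$\log\left[\frac{P(Y_{ij} \le k)}{1 - P(Y_{ij} \le k)}\right] = \alpha_{0k} + \alpha_{1k}\,\mathrm{Time1}_j + \alpha_{2k}\,\mathrm{Time2}_j + \alpha_{3k}\,\mathrm{Group}_i + \alpha_{4k}\,(\mathrm{Group}_i \times \mathrm{Time1}_j) + \alpha_{5k}\,(\mathrm{Group}_i \times \mathrm{Time2}_j) + \upsilon_i \tag{5}$$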

    The likelihood-ratio statistic for comparing this model to the previous one equals 4.4, which is not significant on 2 degrees of freedom. Thus, there is not sufficient evidence to reject the assumption of homogeneous interaction effects across the two cumulative logits. The overall diminishing across time of the group effect that is seen in the first two models remains. Descriptively, it is interesting to note that in this final model, the interaction terms are more pronounced in terms of the first, rather than the second, cumulative logit.
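    The two likelihood-ratio comparisons among the models of Table 3 can be checked with a short Python sketch. It uses only the −2 log L values reported in Table 3; the parameter counts (8, 11 and 13) are our own tally of the estimates shown in each column, and the helper names are hypothetical. The χ² upper-tail probability is computed with the standard recurrence for the regularized incomplete gamma function, so no statistical library is assumed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # Q(1/2, z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

# -2 log L from Table 3; parameter counts are tallied from the table columns
# (thresholds + regression effects + subject SD) and are an assumption here.
MODELS = {
    "homogeneous": {"deviance": 2611.8, "n_params": 8},
    "some_heterogeneous": {"deviance": 2596.8, "n_params": 11},
    "all_heterogeneous": {"deviance": 2592.4, "n_params": 13},
}

def lr_test(reduced, full):
    """Likelihood-ratio statistic, degrees of freedom and p-value for nested models."""
    stat = MODELS[reduced]["deviance"] - MODELS[full]["deviance"]
    df = MODELS[full]["n_params"] - MODELS[reduced]["n_params"]
    return stat, df, chi2_sf(stat, df)

for reduced, full in [("homogeneous", "some_heterogeneous"),
                      ("some_heterogeneous", "all_heterogeneous")]:
    stat, df, p = lr_test(reduced, full)
    print(f"{reduced} vs {full}: LR = {stat:.1f}, df = {df}, p = {p:.4f}")
```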

    Figure 1 plots the observed and estimated proportions for the first two models in Table 3. The top half of Fig. 1 is for the model assuming homogeneous effects, and the bottom half is for the model allowing heterogeneous effects due to time and group (but not group × time). As can be seen, the second model fits the data better, especially for the middle response category. Also, the estimates based on the homogeneous model deviate more from the observed proportions in terms of the initial group differences for the two extreme categories (0 and 6–7 days abstinent). This is due to the larger group effect for 0 days abstinent compared to 6–7 days abstinent.

    [Figure 1. The observed and estimated proportions for the first two models in Table 3. Panels: (a) homogeneous model fit, 0 days abstinent; (b) homogeneous model fit, 1–5 days abstinent; (c) homogeneous model fit, 6–7 days abstinent; (d) heterogeneous model fit, 0 days abstinent; (e) heterogeneous model fit, 1–5 days abstinent; (f) heterogeneous model fit, 6–7 days abstinent. Horizontal axis: follow-up month (2 to 16); vertical axis: proportion (0.0 to 0.6). Series: Standard (obs), Recycling (obs), Standard (est), Recycling (est).]
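    Because the regression coefficients are conditional on the subject effect, model-based proportions that can be set beside observed proportions require averaging over the random-effect distribution. This section does not show how the estimated proportions in Fig. 1 were obtained, so the following Python sketch is only one plausible approach: it assumes normally distributed subject effects and integrates numerically with the trapezoid rule. The function names and integration settings are hypothetical; the inputs are the "some heterogeneous effects" intercepts and subject SD from Table 3.

```python
import math

def logistic(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def marginal_cumulative_prob(eta, sigma, n_steps=2000, z_max=8.0):
    """Average logistic(eta + sigma * z) over z ~ N(0, 1) by the trapezoid rule."""
    h = 2.0 * z_max / n_steps
    total = 0.0
    for s in range(n_steps + 1):
        z = -z_max + s * h
        weight = 0.5 if s in (0, n_steps) else 1.0
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * density * logistic(eta + sigma * z)
    return total * h

# Standard group at 3 months, "some heterogeneous effects" model (Table 3).
SIGMA = 3.815
for label, intercept in [("first cumulative logit", 0.696),
                         ("second cumulative logit", 1.427)]:
    conditional = logistic(intercept)  # subject with random effect equal to zero
    marginal = marginal_cumulative_prob(intercept, SIGMA)
    print(f"{label}: conditional = {conditional:.3f}, marginal = {marginal:.3f}")
```

    The averaged probabilities are pulled toward 0.5 relative to the conditional ones, which is one reason subject-specific coefficients are larger in magnitude than differences in observed sample logits.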

    For all models presented in Table 3, there is a considerable effect of the subject on their repeated observations. The population standard deviation associated with the random subject-varying effects is estimated as approximately equal to 3.8 for all three models. These estimates greatly exceed their standard errors, although this is not surprising since it would be unreasonable to assume that a subject's repeated smoking status assessments are independent. The degree of dependency attributable to subjects in their repeated observations is sometimes expressed as an intraclass, or more appropriately intrasubject, correlation. This correlation represents the average correlation between any two observations within the same subject. It also measures the proportion of total variance that is between-subjects. In the present case, since the error variance is assumed to be equal to π²/3 for the logistic model (see Long, 1997, p. 119), the intrasubject correlation is estimated as approximately 0.82 [e.g. (3.83)²/((3.83)² + π²/3) ≈ 0.82]. Thus, the smoking measures are highly correlated within individuals.
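    The intrasubject correlation calculation can be reproduced directly from the subject standard deviations in Table 3. The sketch below is illustrative only; the function and constant names are our own.

```python
import math

# Error variance assumed for the logistic model (Long, 1997, p. 119).
LOGISTIC_ERROR_VARIANCE = math.pi ** 2 / 3

def intrasubject_correlation(subject_sd):
    """Share of total (latent-scale) variance that lies between subjects."""
    between = subject_sd ** 2
    return between / (between + LOGISTIC_ERROR_VARIANCE)

# Subject SD estimates from the three models in Table 3.
for label, sd in [("homogeneous", 3.831),
                  ("some heterogeneous", 3.815),
                  ("all heterogeneous", 3.830)]:
    print(f"{label}: {intrasubject_correlation(sd):.3f}")
```

    All three estimates round to 0.82, so the conclusion about strong within-subject dependence does not depend on which model is used.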

    In sum, with the present example, ordinal RRM allows us to examine time trends in outcome, to include all subjects regardless of missing data, and to examine several ordered levels of outcome. Relaxing the proportional odds assumption allows an examination of whether the covariate effects vary across the levels of the ordinal outcome. For these data, the time effects were more important in distinguishing full abstainers from partial and non-abstainers (i.e. second cumulative logit) than in distinguishing non-abstainers from partial and full abstainers (i.e. first cumulative logit). In contrast, the initial group difference was larger in terms of non-abstainers than in terms of full abstainers. The decrease across time in the group effect was relatively similar for both of these comparisons of the ordinal outcome.

    Discussion

    As demonstrated, RRM provide a useful way of analyzing longitudinal outcomes data. Specifically, RRM allow for the presence of missing data, irregularly spaced measurements across time, time-varying and invariant covariates, accommodation of individual-specific deviations from the average time trend and estimation of the population variance associated with these individual effects. Additionally, methods and software exist for analysis of continuous and categorical outcomes.

    Perhaps the most useful feature of RRM is their treatment of missing data. As has been illustrated, subjects are not assumed to be measured at the same number of time-points. Since there are no restrictions on the number of observations per individual, subjects who are missing at a given interview wave are not excluded from the analysis. The assumption of the model is that the data that are available for a given subject are representative of that subject's deviation from the average population trend across time (which is estimated based on the whole sample). This is termed "ignorable" missingness in the statistical literature (Laird, 1988) and falls under Rubin's (1976) "missing at random" (MAR) assumption, in which the missingness depends only on observed data. That is, the probability of missingness is dependent on observed covariates and previous values of the outcome variable from subjects with missing data. The notion here is that if subject attrition is related to previous performance, in addition to other observable subject characteristics, then RRM provide valid statistical inference.

    In some cases, the assumption of ignorable missingness may not be reasonable for longitudinal substance use data. For example, in the substance use literature, it is often thought that missingness equals substance use, regardless of subject covariate values or prior observed substance use levels. For this reason, researchers sometimes recode missing observations as the highest substance use level. Because it is unlikely that missingness and substance use are completely correlated (as this recoding assumes), a more statistical approach to this problem is desirable. One such approach for dealing with non-ignorable missingness is based on use of pattern-mixture modeling (Little, 1995). For this, subjects are first grouped based on their available data pattern across time. For example, in the simplest case, subjects can be classified as complete-data subjects or incomplete-data subjects. The between-subjects classification variables that are formed based on these missing data patterns are then included in the model to examine their effect on the outcome variable. Interactions can also be included in order to determine, for example, if treatment group-related effects vary by missing data pattern. Utilizing this pattern-mixture approach within RRM, Hedeker & Gibbons (1997) illustrate its application to psychiatric clinical trials data, and Hedeker & Rose (2000) describe its use for longitudinal smoking data. For longitudinal substance use data, pattern-mixture modeling is particularly useful because subjects with missing data across time often have increased baseline levels and/or increased trajectories across time (Tebes, Snow & Arthur, 1992). Other approaches for dealing with non-ignorable missingness in longitudinal settings are described by Conaway (1992) and Diggle & Kenward (1994).
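    The first step of the pattern-mixture approach, grouping subjects by their available data pattern, can be sketched in Python as follows. The record layout, subject identifiers and function names are hypothetical; only the follow-up waves (3, 9 and 15 months) come from the example in this paper. The resulting indicator is the simplest between-subjects classification (complete-data versus incomplete-data), which could then enter a model as a covariate and in interactions.

```python
from collections import defaultdict

WAVES = (3, 9, 15)  # follow-up months in the smoking cessation example

def missing_data_patterns(records, waves=WAVES):
    """Return {subject: tuple of 0/1 flags}, where 1 means the wave was observed."""
    seen_by_subject = defaultdict(set)
    for subject, wave, outcome in records:
        seen = seen_by_subject[subject]  # also registers subjects with no data
        if outcome is not None:
            seen.add(wave)
    return {subject: tuple(int(w in seen) for w in waves)
            for subject, seen in seen_by_subject.items()}

def incomplete_indicator(patterns):
    """Simplest classification: 1 = incomplete-data subject, 0 = complete-data subject."""
    return {subject: int(not all(flags)) for subject, flags in patterns.items()}

# Hypothetical long-format records: (subject id, wave, ordinal outcome or None).
records = [
    ("s1", 3, 1), ("s1", 9, 2), ("s1", 15, 3),
    ("s2", 3, 1), ("s2", 9, None),
    ("s3", 3, 2), ("s3", 15, 1),
]
patterns = missing_data_patterns(records)
incomplete = incomplete_indicator(patterns)
print(patterns)
print(incomplete)
```

    With more waves, the full pattern tuples allow finer groupings (for example, by last wave observed) when the sample size supports them.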

    Due to the categorical nature of alcohol and smoking outcomes, recent extensions of RRM for categorical data are particularly important. In this paper, we have presented use of the ordinal RRM. A common characteristic of the ordinal model is the proportional odds assumption. This assumption specifies that the covariate effects are homogeneous (on the logit scale) across the (comparisons of the) ordinal outcome categories. For smoking and alcohol data, this assumption does not always hold. For example, an intervention might be able to reduce use in the middle outcome categories, but not at the highest level of use. The model presented in this article allows for relaxation of the proportional odds assumption, which helps identify cutpoints among the ordinal categories where variables have the strongest (and weakest) effects.

    In dealing with ordinal outcomes in practice, researchers sometimes dichotomize the ordinal variable. This practice is often performed more for convenience than for any substantive considerations. An issue that emerges is the location on the scale that should be used to dichotomize the variable. For example, should abstinence be defined as 0 days abstinent or 0–1 days abstinent? The model presented in this paper overcomes this issue because it can estimate the effects of covariates for all K − 1 dichotomizations of an ordinal outcome with K categories. Thus, the dilemma of where to dichotomize is effectively solved.
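    The K − 1 dichotomizations referred to here are the cumulative comparisons underlying the model. As a small illustration (with a hypothetical function name and categories coded 1 to K), each ordinal response maps to K − 1 binary indicators:

```python
def cumulative_dichotomizations(y, n_categories):
    """Return K - 1 indicators; the k-th is 1 if y <= k (categories coded 1..K)."""
    if not 1 <= y <= n_categories:
        raise ValueError("y must be between 1 and n_categories")
    return [int(y <= k) for k in range(1, n_categories)]

# Three ordered categories, e.g. 1 = 0 days, 2 = 1-5 days, 3 = 6-7 days abstinent.
for y in (1, 2, 3):
    print(y, cumulative_dichotomizations(y, 3))
```

    Analysing a single one of these indicators corresponds to choosing one cutpoint; the ordinal model with heterogeneous effects considers all of them jointly.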

    While it is important to consider the sample size for application of the ordinal random-effects regression model, especially with heterogeneous effects, it is not easy to provide global recommendations for what the required sample size should be. One consideration in estimating heterogeneous effects is the numbers of observations in the K response categories as they are broken down by the covariates and covariate interactions of the model. Consider the simplest case of one covariate with two categories (e.g. gender); the data then form a 2 × K cross-tabulation table. Estimating heterogeneous threshold effects for gender then requires observations in both gender groups at each of the K − 1 comparisons across the response variable (i.e. category 1 vs. 2 to K combined, categories 1 and 2 combined vs. 3 to K combined, ..., categories 1 to K − 1 combined vs. K). Because allowing interactions (e.g. gender by treatment group) to have heterogeneous effects splits the data up even further, it may not always be possible or may require collapsing of covariate or response categories. A further point regarding sample size is that the significance tests that are formed by taking the ratio of the parameter estimate to its standard error are based on asymptotic statistical theory. However, many other statistical techniques using maximum likelihood estimation also invoke asymptotic theory for hypothesis testing (e.g. logistic regression, log-linear models and structural equation models). For more details on asymptotic theory as applied to random-effects models, see Longford (1993).
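    A simple data check follows from the cross-tabulation argument above: for each covariate level and each of the K − 1 cumulative comparisons, count the observations on either side of the split and flag comparisons where one side is empty. The sketch below is one reading of that requirement, with hypothetical names and toy data; it is not a sample size rule.

```python
from collections import Counter

def cut_counts(rows, n_categories):
    """rows: iterable of (covariate_level, y), with y coded 1..K.

    Returns {(level, k): (n_at_or_below_k, n_above_k)} for k = 1..K-1.
    """
    cells = Counter(rows)
    levels = sorted({level for level, _ in cells})
    counts = {}
    for level in levels:
        for k in range(1, n_categories):
            low = sum(n for (lv, y), n in cells.items() if lv == level and y <= k)
            high = sum(n for (lv, y), n in cells.items() if lv == level and y > k)
            counts[(level, k)] = (low, high)
    return counts

def empty_cuts(counts):
    """Comparisons where a covariate level has no observations on one side."""
    return sorted(key for key, (low, high) in counts.items() if low == 0 or high == 0)

# Toy data: one two-category covariate and K = 3 response categories.
rows = [("F", 1), ("F", 2), ("F", 3), ("M", 2), ("M", 3)]
counts = cut_counts(rows, 3)
print(counts)
print(empty_cuts(counts))
```

    A flagged comparison suggests that a heterogeneous effect for that covariate cannot be estimated at that cutpoint without collapsing categories.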

    In the example, repeated observations were observed nested within individuals. In the terminology of multi-level analysis (Goldstein, 1995) and hierarchical linear models (Bryk & Raudenbush, 1992), this is termed a two-level data structure with individuals representing level 2 and the nested repeated observations level 1. The models that we have presented are thus referred to as two-level models. Individuals themselves, however, are often observed clustered within some higher-level unit; for example, a classroom, clinic or work-site. Cross-sectional clustered data can also be considered as two-level data, with the clusters representing level 2 and the clustered subjects level 1. Analysis of cross-sectional clustered (substance use) data using RRM is discussed by Hedeker, Gibbons & Flay (1994) and Hedeker et al. (1994). In some studies, subjects are clustered and also repeatedly measured, resulting in three levels of data: the cluster (level 3), individual (level 2) and repeated observation (level 1). Analysis of three-level data is described in Goldstein (1995), Bryk & Raudenbush (1992), Longford (1993) and Gibbons & Hedeker (1997).

    Since longitudinal designs are increasingly used to study alcohol, smoking and other substance use patterns across time, it is important that statistical methods are developed and used to extract the most out of these longitudinal datasets. RRM provide an attractive approach for addressing some key questions that emerge from longitudinal designs. We hope that this paper has helped to increase understanding of these methods and their potential for use in analyzing longitudinal substance use outcomes.

    Acknowledgements

    The authors thank Siu-Chi Wong for computer and statistical programming assistance. Preparation of this paper was supported by National Heart, Lung, and Blood Institute Grant HL 42485 and National Institute of Mental Health Grant MH 44826.

    References

    AGRESTI, A. (1989) Tutorial on modeling ordered categorical response data, Psychological Bulletin, 105, 290–301.

    ALBERT, P. S. (1999) Longitudinal data analysis (repeated measures) in clinical trials, Statistics in Medicine, 18, 1707–1732.

    BERNSTEIN, G. A., CARROLL, M. E., CROSBY, R. D., PERWIEN, A. R., GO, F. S. & BENOWITZ, N. L. (1994) Caffeine effects on learning, performance, and anxiety in normal school-age children, Journal of the American Academy of Adolescent Psychiatry, 33, 407–415.


    computer program for mixed-effects ordinal regression analysis, Computer Methods and Programs in Biomedicine, 49, 157–176.

    HEDEKER, D. & GIBBONS, R. D. (1996b) MIXREG: a computer program for mixed-effects regression analysis with autocorrelated errors, Computer Methods and Programs in Biomedicine, 49, 229–252.

    HEDEKER, D. & GIBBONS, R. D. (1997) Application of random-effects pattern-mixture models for missing data in longitudinal studies, Psychological Methods, 2, 64–78.

    HEDEKER, D., GIBBONS, R. D. & FLAY, B. R. (1994) Random-effects regression models for clustered data: with an example from smoking prevention research, Journal of Consulting and Clinical Psychology, 62, 757–765.

    HEDEKER, D., MCMAHON, S. D., JASON, L. A. & SALINA, D. (1994) Analysis of clustered data in community psychology: with an example from a worksite smoking cessation project, American Journal of Community Psychology, 22, 595–615.

    HEDEKER, D. & MERMELSTEIN, R. J. (1996) Application of random-effects regression models in relapse research, Addiction, 91 (suppl.), S211–S229.

    HEDEKER, D. & MERMELSTEIN, R. J. (1998) A multilevel thresholds of change model for analysis of stages of change data, Multivariate Behavioral Research, 33, 427–455.

    HEDEKER, D., MERMELSTEIN, R. J. & WEEKS, K. A. (1999) The thresholds of change model: an approach to analyzing stages of change data, Annals of Behavioral Medicine, 21, 61–70.

    HEDEKER, D. & ROSE, J. S. (2000) The natural history of smoking: a pattern-mixture random-effects regression model, in: ROSE, J. S., CHASSIN, L., PRESSON, C. C. & SHERMAN, S. J. (Eds) Multivariate Applications in Substance Use Research, pp. 79–112 (Hillsdale, NJ, Lawrence Erlbaum Associates).

    HU, F. B., GOLDBERG, J., HEDEKER, D., FLAY, B. R. & PENTZ, M. A. (1998) A comparison of population-averaged and subject-specific approaches for analyzing repeated binary outcomes, American Journal of Epidemiology, 147, 694–703.

    HUI, S. L. & BERGER, J. O. (1983) Empirical Bayes estimation of rates in longitudinal studies, Journal of the American Statistical Association, 78, 753–759.

    JANSEN, J. (1990) On the statistical analysis of ordinal data when extravariation is present, Applied Statistics, 39, 75–84.

    JASON, L., SALINA, D., MCMAHON, S. D., HEDEKER, D. & STOCKTON, M. (1997) A worksite smoking intervention: a 2 year assessment of groups, incentives, and self-help, Health Education Research, 12, 129–138.

    JENNRICH, R. I. & SCHLUCHTER, M. D. (1986) Unbalanced repeated-measures models with structured covariance matrices, Biometrics, 42, 805–820.

    KESELMAN, H. J., ALGINA, J., KOWALCHUK, R. K. & WOLFINGER, R. D. (1999) A comparison of recent approaches to the analysis of repeated measurements, British Journal of Mathematical and Statistical Psychology, 52, 63–78.

    KREFT, I. G., DE LEEUW, J. & VAN DER LEEDEN, R. (1994) Comparing five different statistical packages for hierarchical linear regression: BMDP-5V, GENMOD, HLM, ML3, and VARCL, American Statistician, 48, 324–335.

    LAIRD, N. M. (1988) Missing data in longitudinal studies, Statistics in Medicine, 7, 305–315.

    LAIRD, N. M. & WARE, J. H. (1982) Random-effects models for longitudinal data, Biometrics, 38, 963–974.

    LESAFFRE, E., ASEFA, M. & VERBEKE, G. (1999) Assessing the goodness-of-fit of the Laird and Ware model - an example: the Jimma infant survival differential longitudinal study, Statistics in Medicine, 18, 835–854.

    LITTLE, R. J. A. (1995) Modeling the drop-out mechanism in repeated-measures studies, Journal of the American Statistical Association, 90, 1112–1121.

    LONG, J. S. (1997) Regression Models for Categorical and Limited Dependent Variables (Thousand Oaks, CA, Sage Publications).

    LONGFORD, N. T. (1987) A fast scoring algorithm for maximum likelihood estimation in unbalanced mixed models with nested random effects, Biometrika, 74, 817–827.

    LONGFORD, N. T. (1993) Random Coefficient Models (New York, Oxford University Press).

    MANOR, O. & KARK, J. D. (1996) A comparative study of four methods for analysing repeated measures data, Statistics in Medicine, 15, 1143–1159.

    MCCULLAGH, P. (1980) Regression models for ordinal data (with discussion), Journal of the Royal Statistical Society, Series B, 42, 109–142.

    MCCULLAGH, P. & NELDER, J. A. (1989) Generalized Linear Models, 2nd edn (London, Chapman and Hall).

    NICH, C. & CARROLL, K. (1997) Now you see it, now you don't: a comparison of traditional versus random-effects regression models in the analysis of longitudinal follow-up data from a clinical trial, Journal of Consulting and Clinical Psychology, 65, 252–261.

    O'MALLEY, S. S., JAFFE, A. J., CHANG, G., RODE, S., SCHOTTENFELD, R., MEYER, R. E. & ROUNSAVILLE, B. (1996) Six month follow-up of naltrexone and psychotherapy for alcohol dependence, Archives of General Psychiatry, 53, 217–224.

    OMAR, R. Z., WRIGHT, E. M., TURNER, R. M. & THOMPSON, S. G. (1999) Analysing repeated measures data: a practical comparison of methods, Statistics in Medicine, 18, 1587–1603.

    PENDERGAST, J. F., GANGE, S. J., NEWTON, M. A., LINDSTROM, M. J., PALTA, M. & FISHER, M. R. (1996) A survey of methods for analyzing clustered binary response data, International Statistical Review, 64, 89–118.

    PETERSON, B. & HARRELL, F. E. (1990) Partial proportional odds models for ordinal response variables, Applied Statistics, 39, 205–217.

    ROSNER, B. (1992) Multivariate methods for binary longitudinal data with heterogeneous correlation over time, Statistics in Medicine, 11, 1915–1928.

    RUBIN, D. B. (1976) Inference and missing data, Biometrika, 63, 581–592.

    SALINA, D., JASON, L. A., HEDEKER, D., KAUFMAN, J., LESONDAK, L., MCMAHON, S. D., TAYLOR, S. &
