CHAPTER 17 Moderation



Little-V2: “CHAP17” — 2012/9/7 — 09:56 — PAGE 1 — #1


Herbert W. Marsh, Kit-Tai Hau, Zhonglin Wen, Benjamin Nagengast, and Alexandre J.S. Morin

Abstract

Moderation (or interaction) occurs when the strength or direction of the effect of a predictor variable on an outcome variable varies as a function of the values of another variable, called a moderator. Moderation effects address critical questions, such as under what circumstances, or for what sort of individuals, does an intervention have a stronger or weaker effect? Moderation can have important theoretical, substantive, and policy implications. Especially in psychology with its emphasis on individual differences, many theoretical models explicitly posit interaction effects. Nevertheless, particularly in applied research, even interactions hypothesized on the basis of strong theory and good intuition are typically small, nonsignificant, or not easily replicated. Part of the problem is that applied researchers often do not know how to test interaction effects, as statistical best practice is still evolving and often not followed. Also, tests of interactions frequently lack power, so that meaningfully large interaction effects are not statistically significant. In this chapter we provide an intuitive overview of the issues involved, recent developments in how best to test for interactions, and some directions that further research is likely to take.

Key Words: Interaction effect; moderator; moderated multiple regression; mediation; latent interaction; product indicator; structural equation model

Introduction

Moderation and interactions between variables are important concerns in psychology and the social sciences more generally (here we use moderation and interaction interchangeably). In educational psychology, for example, it is often hypothesized that the effect of an instructional technique will interact with characteristics of individual students, an aptitude-treatment interaction (Cronbach & Snow, 1979). For example, a special remediation program developed for slow learners may not be an effective instructional strategy for bright students (i.e., the effect of the special remediation "treatment" is moderated by the ability "aptitude" of the student). Developmental psychologists are frequently interested in how the effects of a given variable are moderated by age in longitudinal or cross-sectional studies (i.e., effects interact with age or developmental status). Developmental psychopathologists may also be interested to know whether a predictor variable is, in fact, a risk factor, predicting the emergence of new symptoms and useful in preventive efforts, or an aggravation factor, mostly useful in curative efforts (i.e., effects interact with the baseline level on the outcome in longitudinal studies; Morin, Janosz, & Larivée, 2009). Social psychologists and sociologists are concerned with how the effects of individual characteristics are moderated by groups in which people interact with others. Organizational psychologists study how the effects of individual employee characteristics interact with workplace characteristics. Personnel psychologists want to know whether a selection test is equally valid at predicting work performance for different demographic groups. Fundamental to the rationale of differential psychology is the assumption that people differ in the way that they respond to all sorts of external stimuli.

Many psychological theories explicitly hypothesize interaction effects. Thus, for example, some forms of expectancy-value theory hypothesize that resultant motivation is based on the interaction between expectancy of success and the value placed on success by the individual (e.g., motivation is high only if both probability of success and the value placed on the outcome are high). In self-concept research, the relation between an individual component of self-concept (e.g., academic, social, physical) and global self-esteem is sometimes hypothesized to interact with the importance placed on a specific component of self-concept (e.g., if a person places no importance on physical accomplishments, then these physical accomplishments—substantial or minimal—are not expected to be substantially correlated with self-esteem). More generally, a variety of weighted-average models posit—at least implicitly—that the effects of each of a given set of variables will depend on the weight assigned to each variable in the set (i.e., the weight assigned to a given variable interacts with the variable to determine the contribution of that variable to the total effect).

Interaction or moderation can be seen as the opposite of generalizability. For example, if the effect of an intervention is the same for males and females, it is said to generalize across gender. However, if the effect of the intervention differs for men and women, then it is said to interact with gender.

Classic Definition of Moderation

In their classic presentation of moderation, Baron and Kenny (1986, p. 1174) defined a moderator variable to be a "variable that affects the direction and/or strength of the relationship between an independent or predictor variable and a dependent or criterion variable." An interaction occurs when the effect of at least one predictor variable on an outcome variable is moderated by (i.e., depends on or varies as a function of) at least one other predictor. Moderation studies address issues like "when (under what conditions/situations)" or "for whom" X has a stronger/weaker (positive/negative) relation with or effect on Y.

Consider the effect of a variable X1 on an outcome Y: If the effect of X1 on Y is affected by another variable X2, then we say X2 is a moderator; the relation between X1 and Y is moderated by X2. Under this circumstance, the size or direction of the effect of X1 on Y varies with the value of the moderator X2. As used here, X1 and X2 are symmetrical and can be interchanged, such that either of them can moderate the effect of the other. However, depending on the design of the study, the goals of the research, or the specific research questions, it might be reasonable to designate one of the predictor variables to be a moderator variable. Implicit in the discussion of interaction effects is the assumption that the outcome variable is determined, at least in part, by a combination of the main effects of the two predictor variables and their interaction.

Moderators can be categorical variables (e.g., gender, ethnicity, school type) or continuous variables (e.g., age, years of education, self-concept, test scores, reaction time). They can be a manifest observed variable (e.g., gender, race) or a latent variable measured with multiple indicators (e.g., self-concept, test scores). Different analytic methods of testing interactions are associated with the different types of moderators. In tests of interaction effects, the interaction term is typically the product of two variables and treated as a separate variable (e.g., the X1X2 interaction is often denoted as X1-by-X2 or X1 × X2). A significant interaction effect indicates that the simple slopes of the predictor vary when the moderator takes on different values. We begin by discussing methods for analyzing interactions between observed variables and then discuss alternative approaches to probing the meaning of these interactions. We then give a brief account of the development of more sophisticated but statistically (and theoretically) stronger models in the estimation of latent interactions.

Particularly for categorical independent variables, manifest variables as outcomes, and experimental designs with random assignment to groups, ANOVA is commonly used to evaluate interactions. The initial tests of these interactions are performed almost completely automatically by statistical packages with little intervention by the researcher. Even here, however, probing the appropriate interpretation and meaning of statistically significant interactions requires careful consideration. In contrast to ANOVAs, interaction terms in regression analyses are usually based on a priori theoretical predictions. Although statistically all ANOVA models can be respecified as multiple regression models (e.g., by using dummy variables), ANOVA is appropriate when all the independent variables are categorical with a relatively small number of levels. In the present chapter, we concentrate on the detection and estimation of interactions in regression analyses with the understanding that, with appropriate coding of the different design factors, these analytical techniques can be applied equally effectively to experimental designs.

Graphs of Interaction Effects

Whatever the statistical approach used to test interaction effects, it is always helpful to graph statistically significant interaction effects. For example, suppose you want to test the prediction that there is no relation between X1 and Y for boys but that X1 and Y are positively related for girls. The finding that gender interacts significantly with X1 only says that the relation between X1 and the outcome Y is different for boys and girls, but not whether the nature of this interaction is consistent with a specific a priori prediction. The starting point for probing the nature of this interaction is typically to graph the results.

Although we discuss more statistically sophisticated ways to probe significant interaction effects, a graph is always helpful in understanding the nature of the interaction effect. In Figure 17.1 we illustrate a number of different graphs of paradigmatic interaction effects, which are sometimes given specific labels in the literature. For purposes of the example, we can consider these as interactions in which one of the predictor variables is dichotomous (e.g., male/female; experimental/control), whereas the other predictor and the outcome variable are continuous. However, later we will describe how such graphs can be constructed when all variables are continuous.

Even with relatively simple models, interaction effects can be very diverse (Fig. 17.1). In Figure 17.1 we have plotted different forms of interactions in which X1 is a continuous predictor variable and X2 (the other predictor, the moderating variable) is a dichotomous grouping variable (here labelled as boys and girls). The key distinguishing feature is that the lines (regression plots) for boys and girls are parallel in the first graph (indicating no interaction), whereas they are significantly non-parallel for all the other graphs.

[Figure 17.1 shows six panels, each plotting Y against X1 with separate lines for boys and girls: No Interaction (parallel lines); Ordinal (convergent) Interaction; Disordinal (Cross-over) Interaction; Ordinal (divergent) Interaction; Non-linear Interaction; Interaction with Stable Baseline.]

Figure 17.1 Diverse hypothetical outcomes testing whether the relation between a continuous predictor variable (X1) and a continuous outcome variable (Y) varies as a function of the dichotomous grouping variable, gender (X2, the moderating variable). The distinguishing feature is that the lines (regression plots) for boys and girls are parallel in the first graph (indicating no interaction), whereas they are not parallel for any of the other graphs. Regression plots that contain only linear terms necessarily result in graphs that are strictly linear. However, the final graph illustrates an interaction that is nonlinear in relation to X1 (i.e., the difference between the two lines is small when X1 is large or small, but larger when X1 is intermediate).


Specific forms of interactions are sometimes referred to by different names. In particular, it is typical to distinguish between disordinal graphs, where the lines cross, and ordinal graphs, where the lines do not cross, for the range of possible or plausible values that the predictor variables can take on. It is important to note that these graphs represent the predicted values from the regression model used to test the interaction effect (and should be limited to a range of values for X1 and X2 that are plausible and actually considered in the analyses). Thus, for example, all but the last graph are strictly linear in that only the linear effects of X1 were included in the model (because X2 is dichotomous, it can only have a linear effect). However, the final model includes a nonlinear (quadratic) component of X1; the difference between the two lines is small when X1 is small, large when X1 is intermediate, and small when X1 is large. Because of the nature of this interaction in this last model, it is likely that there would have been no statistically significant interaction if only the linear effects of X1 were considered. It would also be possible to control for one or more covariates in any of these models, and this typically would change the form of the interaction.
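The ordinal/disordinal distinction can also be checked algebraically. The following sketch is ours, not the chapter's, and all coefficient values are invented: with Y = b0 + b1X1 + b2X2 + b3X1X2, the simple-slope lines drawn for different values of X2 all intersect at the point where the effect of X2 is zero, X1 = -b2/b3; the interaction is disordinal only if that crossing point falls inside the plausible range of X1.

```python
# Where simple-slope lines cross: a numeric sketch with illustrative weights.
b0, b1, b2, b3 = 1.0, 0.4, 0.6, 0.3  # hypothetical regression weights

def predicted(x1, x2):
    """Predicted Y from the moderated regression model."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

# The effect of X2 on Y is (b2 + b3 * X1); it is zero where X1 = -b2/b3,
# and that is where the lines for any two values of X2 intersect.
x1_cross = -b2 / b3

# At the crossing point, predictions are identical for every value of X2;
# whether the interaction looks ordinal or disordinal over a plot depends on
# whether x1_cross lies inside the range of X1 actually displayed.
```

If x1_cross sits far outside the observed X1 values, the same model produces an ordinal-looking (convergent or divergent) pattern over the plotted range.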

Plotting regression equations like those in Figure 17.1 is a good starting point in the interpretation of statistically significant interactions, but a casual visual inspection of the graphs is not sufficient. Particularly for studies based on modest sample sizes, researchers are likely to overinterpret the results, leading to false-positive errors. We now turn to appropriate strategies to test the statistical significance of interactions and probe their meaning.

Traditional (Non-Latent) Approaches for Observed Variables

In this section, we introduce analytical methods for interaction models, depending on the nature of the variables. The critical feature common to all these approaches is that all the constructs are represented by a single indicator. Hence, these are not latent variable models in which the constructs are represented by multiple indicators (which we will discuss later).

Interactions between Categorical Variables: Analysis of Variance

When the independent variables X1 and X2 are both categorical variables that can take on a relatively small number of levels and the dependent variable is a continuous variable, interaction effects can be easily estimated with traditional analysis of variance (ANOVA) procedures (for more general discussions of ANOVA, see classic textbooks such as Kirk, 1982; see also Jaccard, 1998 and Chapter XXX in the present Handbook). In the simplest factorial design, both X1 and X2 have two levels (i.e., a 2 × 2 design). In addition to the main effects of X1 and X2, this factorial ANOVA provides a test of the statistical significance of the interaction between X1 and X2.

The null hypothesis for the interaction effect is that the effect of neither predictor variable (X1 or X2) depends on the value of the other. More complex factorial designs can have more than two levels of each variable and more than two predictor variables. There can also be higher-order interactions involving more than two variables (e.g., the nature of the X1X2 interaction depends on a third variable, X3).
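In the 2 × 2 case, the interaction has a simple "difference of differences" form: the effect of X1 at one level of X2 minus its effect at the other level. A small sketch with made-up cell means (not data from the chapter):

```python
# 2 x 2 factorial: the interaction as a difference of differences of cell means.
# means[(x1_level, x2_level)] holds hypothetical cell means.
means = {
    (0, 0): 10.0, (1, 0): 14.0,   # effect of X1 when X2 = 0: 4.0
    (0, 1): 12.0, (1, 1): 20.0,   # effect of X1 when X2 = 1: 8.0
}
effect_x1_at_x2_0 = means[(1, 0)] - means[(0, 0)]
effect_x1_at_x2_1 = means[(1, 1)] - means[(0, 1)]

# A nonzero value means the effect of X1 depends on the level of X2,
# which is exactly what the factorial ANOVA interaction term tests.
interaction = effect_x1_at_x2_1 - effect_x1_at_x2_0
```

A factorial ANOVA then asks whether this contrast is larger than would be expected from within-cell sampling variability alone.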

Although ANOVAs are typically used in experimental studies in which participants are randomly assigned to different levels of a grouping variable (e.g., experimental and control), this design can also be evaluated with more general regression approaches, representing the factors with dummy or indicator variables. Likewise, nonexperimental studies with only categorical predictor variables (e.g., gender) can be analyzed with either ANOVA or regression approaches.

The deceptive ease with which ANOVAs can be conducted has tempted researchers to transform reasonably continuous variables into a few discrete categories so that they can be evaluated with traditional ANOVA approaches. For example, with independent variables that are originally reasonably continuous, it might be possible to divide one or both of the variables into two (e.g., using a mean or median split) or a small number of categories (e.g., low, medium, high groups) so that they can be evaluated with ANOVAs. There are, however, potentially serious problems that dictate against this strategy, including (1) a reduction in reliability of the original variables, resulting in a loss of power in detecting main and particularly interaction effects; (2) a reduction in variance explained by the original variables and particularly the interaction terms; (3) absence of a commonly used summary estimate of the strength of the interaction effect (using categories, we have t-values in different groups only; Jaccard, Turrisi, & Wan, 1990); and (4) difficulty in determining the nature of potential nonlinear relations (particularly when a continuous variable is represented by only two categories). A possible exception is when the categorization reflects a natural cut-off of particular interest (e.g., minimum test scores to qualify for acceptance into a program or classification schemes). For example, Baron and Kenny (1986) discuss a threshold form of moderation such that a predictor variable has no effect when the moderator value is low, but has a positive effect when the moderator takes on a value above a certain threshold. In this case, if the threshold value is known a priori, it might be reasonable to dichotomize the moderator at the level of the threshold. Even here, however, there are typically stronger models that test, for example, whether the effects of the predictor variable are unaffected by variation in the moderator below the threshold or variation above the threshold. Nevertheless, the general strategy of forming a small number of categories from a reasonably continuous variable should usually be avoided (for further discussion, see MacCallum, Zhang, Preacher, & Rucker, 2002). Although researchers have been warned of the inappropriateness of this practice for more than a quarter of a century (e.g., Cohen, 1978), it still persists. In summary, researchers should (almost) never transform continuous variables into discrete categories.
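Points 1 and 2 above are easy to demonstrate by simulation. The sketch below is our own illustration (sample size, effect size, and seed are arbitrary assumptions): it median-splits a continuous normal predictor and compares the resulting predictor-outcome correlation with the one obtained from the intact variable.

```python
import random
import statistics

rng = random.Random(17)  # fixed seed so the illustration is reproducible
n = 20000
x = [rng.gauss(0, 1) for _ in range(n)]        # continuous predictor
y = [0.5 * xi + rng.gauss(0, 1) for xi in x]   # outcome with a true linear effect

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    sa, sb = statistics.pstdev(a), statistics.pstdev(b)
    cov = statistics.fmean([(ai - ma) * (bi - mb) for ai, bi in zip(a, b)])
    return cov / (sa * sb)

# Median split: recode the continuous predictor into two categories.
med = statistics.median(x)
x_split = [1.0 if xi > med else 0.0 for xi in x]

r_full = corr(x, y)         # correlation using the intact predictor
r_split = corr(x_split, y)  # attenuated correlation after the median split
```

For a normal predictor, the median split attenuates the correlation by a factor of roughly 0.8, with corresponding losses in variance explained and in power; attenuation of a product term built from split variables is worse still.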

Interactions With One Categorical Variable: Separate Group Multiple Regression

When one independent variable is a categorical variable (e.g., X2)—particularly with only a few naturally occurring levels (e.g., gender as in Fig. 17.1, or ethnic groups)—and the other one is a continuous variable (e.g., X1), a possible approach is to conduct a separate regression for each group (see also the subsequent discussion of multiple-group tests in the section on latent variable interactions). The interaction effect is represented by the differences between the unstandardized regression coefficients obtained in the separate groups (Aiken & West, 1991; Cohen & Cohen, 1983; Cohen, Cohen, Aiken, & West, 2003). Suppose we plot these separate regression lines: if the lines from the different groups are parallel, then we conclude that there is no interaction (see the first graph in Fig. 17.1).

If one of the predictor variables (X2) is dichotomous, then it is possible to test the statistical significance of the difference between the two regression coefficients relating X1 and Y (Cohen & Cohen, 1983, p. 56). If this null hypothesis of equal coefficients is rejected, then the interaction is significant. This approach is particularly useful if there are differences in the error variances at different levels of the moderator (see also the subsequent discussion of latent variable tests of invariance across multiple groups). Although useful in some special cases, this multiple-group approach is typically more limited in terms of facilitating the interpretation of interaction effects, reduces power because of the lower sample size in each group considered separately, and, of course, requires at least one of the predictor variables to be a true categorical variable. Therefore, we now present a more general approach that is appropriate when predictor variables are categorical, continuous, or a mixture of the two.
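The separate-group idea can be sketched with a toy example (ours, with invented group labels and coefficients). With noiseless data, the difference between the two groups' unstandardized slopes is exactly the interaction effect that a pooled model with a product term would estimate:

```python
def slope(xs, ys):
    """Unstandardized simple-regression slope: cov(x, y) / var(x)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

x = [0.0, 1.0, 2.0, 3.0, 4.0]
y_boys = [1.0 + 0.5 * xi for xi in x]    # slope 0.5 in the "boys" group
y_girls = [2.0 + 1.2 * xi for xi in x]   # slope 1.2 in the "girls" group

# Parallel lines (equal slopes) would mean no interaction; here the
# difference between the unstandardized slopes is the interaction effect.
slope_diff = slope(x, y_girls) - slope(x, y_boys)
```

With real data the two slopes are estimates, so the comparison requires a standard error for the difference; note also that the comparison must use unstandardized slopes, since within-group standardization changes the metric in each group.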

Interactions With Continuous Variables: Moderated Multiple Regression Approaches

Now we move from analyses of moderation involving categorical observed independent variables to those using continuous observed independent variables. Consider the familiar regression equation involving two predictors, X1 and X2:

Y = β0 + β1X1 + β2X2 + e,

which assumes no interaction; the effects of X1 and X2 are additive (Judd, McClelland, & Ryan, 2009; Klein, Schermelleh-Engel, Moosbrugger, & Kelava, 2009). That means the effect of a predictor (e.g., X1) does not depend on the value of the other (i.e., X2), and the effects of the two predictors on Y can simply be added. The first graph in Figure 17.1 (with parallel lines) was based on a model of this form.

However, this assumption of strictly additive effects might be false. Irrespective of whether X1 or X2 has a main effect on Y, an interaction effect might exist in that the effect of X1 on Y depends on the value of X2. The mathematical equation representing this can be expressed as:

Y = β0 + β1X1 + β2X2 + β3X1X2 + e, (1)

where β1 and β2 represent the main effects of X1 and X2, β3 represents the interaction effect, X1X2 is the product of X1 and X2 (the interaction term), and e is a random disturbance term with zero mean that is uncorrelated with X1 and X2.

Suppose that

Y = b0 + b1X1 + b2X2 + b3X1X2

represents the estimated values from Equation 1. When X1 is treated as the moderator, it can also be expressed as:

Y = (b0 + b1X1) + (b2 + b3X1)X2,


where the intercept (b0 + b1X1) as well as the slope (b2 + b3X1) of X2 are linear functions of X1. Similarly, the equation can be represented as:

Y = (b0 + b2X2) + (b1 + b3X2)X1,

with the intercept and effect (slope) of X1 on Y being moderated by X2. With simple derivations using either X1 or X2 as the moderator, it can be shown that the interactive effect can be operationalized by adding a product term X1X2 into the regression equation (for details, see, for example, Jaccard et al., 1990; Judd, McClelland, & Ryan, 2009). This shows that the moderating relation is symmetric in that if the effect of X1 on Y depends on the value of X2, then the effect of X2 on Y also depends on the value of X1. Although this equation only represents the linear effects of X1, X2, and X1X2, it could easily be expanded to include nonlinear components of X1 and X2 as well as additional covariates. Although this means that statistically either one of the variables forming the interaction effect can be treated as the moderator, this choice should be guided by the design of the study or substantive theory.
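The product-term model and its simple-slope rearrangement can be verified numerically. The following sketch is ours, not the chapter's: it fits Equation 1 by ordinary least squares via the normal equations on noiseless, made-up data, so the known generating coefficients are recovered exactly, and the simple slope of X2, (b2 + b3X1), can then be read off at any moderator value.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Noiseless data generated from Y = 1 + 0.4*X1 + 0.3*X2 + 0.5*X1*X2
# (all generating weights are illustrative assumptions).
grid = [(x1, x2) for x1 in range(-2, 3) for x2 in range(-2, 3)]
X = [[1.0, x1, x2, x1 * x2] for x1, x2 in grid]
y = [1 + 0.4 * x1 + 0.3 * x2 + 0.5 * x1 * x2 for x1, x2 in grid]

b0, b1, b2, b3 = ols(X, y)

def simple_slope_x2(x1):
    """Effect of X2 on Y at a given value of the moderator X1: b2 + b3*X1."""
    return b2 + b3 * x1
```

Swapping the roles of X1 and X2 in `simple_slope_x2` illustrates the symmetry noted above: the same b3 governs how each predictor's slope changes with the other.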

The variables in Equation 1 can also be centered (M = 0, as the variable mean is subtracted from each value; see Equation 2).

YC = βC0 + βC1X1C + βC2X2C + βC3X1CX2C + eC    (2)

Standardized estimates are the parameter estimates that are obtained when all independent and dependent variables in the regression model are standardized (i.e., converted to z-scores by subtracting from each variable its mean and then dividing by its standard deviation).

ZY = βZ0 + βZ1ZX1 + βZ2ZX2 + βZ3ZX1ZX2 + eZ    (3)

The centered and standardized forms of the regression equation, based on centered and standardized variables respectively, are given by Equations 2 and 3. The subscripts C and Z on the regression weights indicate that these parameter estimates, based on centered scores (Equation 2) and standardized scores (Equation 3), are not the same as the corresponding parameter estimates based on raw scores (Equation 1).

For the raw score regression (Equation 1), β0 is the intercept, or the predicted value of Y when X1 and X2 equal 0. The intercept (β0) might be meaningless if X1 and X2 never take on a value of 0. However, when the predictor variables are centered or standardized (Equation 2 or Equation 3), the intercept (βC0 or βZ0) is the value of Y at the mean of X1 and X2, which is typically meaningful.

β1 is the estimated change in Y associated with 1 unit of change in X1 when X2 = 0 (i.e., the slope of the relation between X1 and Y when X2 = 0). Again, this may be meaningless in the raw score equation. However, with centered or standardized predictors, it is the association between X1 and Y at the mean of X2. With centered predictors, βC1 represents the change in the outcome if X1 changes one unit in the raw metric. If the predictors are standardized, then βZ1 represents the change in Y in standard deviation (SD) units if X1 changes one SD. Although main effects should always be interpreted cautiously when there is an interaction, this is typically a meaningful result in the standardized equation. Because the interaction is symmetric in relation to X1 and X2, β2 is merely the change in Y associated with a 1-unit change in X2 when X1 = 0. For Equations 1 and 2, the 1-unit change is in the original (raw score) metric, whereas in Equation 3 it is in standard deviation units. If the units of the predictor variables are in a meaningful metric, it might be useful to use the centered predictor variables, as changes are then in the same metric as the original variables. However, particularly when the metric is arbitrary, it is often more meaningful to standardize the variables so that changes are in terms of standard deviation units. When one of the predictor variables is a categorical variable, it is traditional to use values of 0 and 1 (or appropriate dummy variables if there are more than two categories).

β3 represents the interaction effect, the amount by which the effect of X1 on Y changes with a 1-unit increase in X2. Equivalently, because of the symmetry of the interaction effect, it is the amount by which the effect of X2 on Y changes with a 1-unit increase in X1. In the raw and centered forms, this value may or may not be interpretable, depending on the nature of the metric of X1 and X2. In the standardized form, the interpretation of βZ3 is the same, with the exception that all changes are in terms of SD units.
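The algebra relating Equations 1 and 2 can be made concrete. Substituting X1 = X1C + M1 and X2 = X2C + M2 into Equation 1 shows how each centered weight maps onto the raw-score weights; in particular βC3 = β3, so the interaction weight is unchanged by centering even though the "main effect" weights shift. A numeric sketch (our illustration; all values are invented):

```python
# Raw-score weights (Equation 1) and hypothetical predictor means.
b0, b1, b2, b3 = 1.0, 0.4, 0.3, 0.5
m1, m2 = 2.0, 3.0  # means of X1 and X2

# Substituting X1 = X1c + m1 and X2 = X2c + m2 into Equation 1 gives the
# centered weights of Equation 2:
b0c = b0 + b1 * m1 + b2 * m2 + b3 * m1 * m2
b1c = b1 + b3 * m2   # the X1 "main effect" shifts with centering...
b2c = b2 + b3 * m1
b3c = b3             # ...but the interaction weight does not.

def y_raw(x1, x2):
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

def y_centered(x1, x2):
    x1c, x2c = x1 - m1, x2 - m2
    return b0c + b1c * x1c + b2c * x2c + b3c * x1c * x2c

# The two parameterizations are the same model: identical predictions
# for every (x1, x2), with the weights merely re-expressed.
```

This is why centered "main effects" are interpreted as effects at the mean of the other predictor, while the interaction coefficient itself is invariant to centering.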

It is important to understand the relation between the regression equation and graphs like those presented in Figure 17.1. Regression plots are constructed by substituting values for the X1 and X2 predictor variables into the regression equation. In this regard, the regression plots represent the predicted values based on the model that is being tested, not the raw data. Thus, for example, even if there is nonlinearity in the raw data, this will not be reflected in regression plots based on a model with no nonlinear terms. To address this issue, researchers sometimes include a scatterplot of the individual cases (or the raw mean values for categorical variables) as well as the regression plot.

The regression plot can be based on raw scores, centered scores, or standardized scores. In terms of plotting, it is, of course, critical that the values plotted are the same as those used to estimate the regression parameters. Hence, if predictor variables are standardized in the regression, then the standardized values should be used to construct the plot. However, it is reasonable to construct different plots in relation to raw, centered, or standardized scores—whichever is most appropriate—as the shape of the plot will be similar except for the metric of the axes of the graph.

When there are more than two groups, typical practice is to use dummy coding (in which one group is "left out" and serves as a baseline of comparison for the other groups). However, the whole range of orthogonal and nonorthogonal coding schemes (e.g., effect coding, polynomial coding, difference coding, as well as coding specific to the nature of the study; e.g., Cohen et al., 2003; Marsh & Grayson, 1994) are available to the researcher and may have strategic advantages depending on the nature of the study design, the goals of the researcher, the meaning of the variables, and so forth. Whatever coding scheme is used, it is useful for all of the predictor variables to have a meaningful zero point to facilitate the interpretation of the regression equations.

When both predictors (X1 and X2) are continuous variables, it is traditional to plot regression lines for at least two strategically chosen values, and typically three or more (although complicated interactions involving nonlinear components might require more than three regression lines). The representative values are typically selected to be the mean, 1 or 2 SDs above the mean, and 1 or 2 SDs below the mean. In each case, the graphs can be constructed by simply substituting these representative values into the regression equation.
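The substitution step described above can be sketched in a few lines of code. The coefficients below are purely illustrative (not from any dataset discussed in this chapter), and standardized predictors are assumed so that the representative values are simply −1, 0, and +1:

```python
import numpy as np

# Hypothetical fitted moderated regression: Y = b0 + b1*X1 + b2*X2 + b3*X1*X2
b0, b1, b2, b3 = 0.10, 0.40, 0.25, 0.30   # illustrative values

def predicted(x1, x2):
    """Predicted Y from the moderated regression equation."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2

# One regression line per representative moderator value (-1 SD, mean, +1 SD)
x1_grid = np.linspace(-2, 2, 5)
for x2 in (-1.0, 0.0, 1.0):
    line = predicted(x1_grid, x2)
    print(f"X2 = {x2:+.0f}: simple slope = {b1 + b3 * x2:.2f}, "
          f"predicted Y = {np.round(line, 2)}")
```

Each printed row supplies the points for one line of the regression plot; the simple slope of X1 at a given X2 is b1 + b3*X2.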

In Figure 17.2 we represent some of the correspondences between the regression equation and the graphs. We have graphed only two regression lines, as would be appropriate if one of the predictor variables was dichotomous (it is also appropriate in a model with only linear effects of X1 and X2, but more conventional to include three lines). To facilitate interpretation, these are presented in standard deviation units (i.e., all variables are standardized before conducting the multiple regression). The regression equation is plotted for X1 = 0 (the mean of X1

[Figure 17.2 appears here: two regression lines of Y plotted for X1 = 0 and X1 = 1; the simple slopes of the two lines differ by β3, and β0 marks the intercept at the point where both predictors equal 0.]

Figure 17.2 Graphical representation of the regression equation. X1 and X2 are predictor variables and Y is the outcome variable (to facilitate interpretation, these are presented in standard deviation units). The regression equation is plotted for X1 = 0 (the mean of X1) and X1 = 1 (1 SD above the mean of X1). β0 (the intercept) is the value that Y takes on when both X1 = 0 and X2 = 0. β1 (the regression weight for X1) is the change in Y associated with a 1-unit change in X1 when X2 = 0. β2 (the regression weight for X2) is the change in Y associated with a 1-unit change in X2 when X1 = 0. This form of the interaction term (the product of X1 and X2) is typical in social science research, but other forms of interaction could be considered.

when it is standardized) and X1 = 1 (1 SD above the mean of X1 when it is standardized). β0 (the intercept) is the value that Y takes on when both X1 = 0 and X2 = 0. β1 (the regression weight for X1) is the change in Y associated with a 1-unit change in X1 when X2 = 0. β2 (the regression weight for X2) is the change in Y associated with a 1-unit change in X2 when X1 = 0. The relation between Y and X2 when X1 = 1 is given by the sum of β2 and β3.

There are, however, important caveats in the interpretation of the results that are highlighted by this graph. In particular, the effects of X1 and X2 on Y are conditional upon the value that the other variable takes on. The regression parameters β1 and β2 cannot be unconditionally interpreted as main effects of X1 and X2, respectively. This would only be possible if there were no interaction effect, that is, if β3 = 0. The values of these regression weights represent the effect of their predictor at the point where the other predictor takes on a value of 0. In most cases in social science research, neither X1 nor X2 actually takes on values of 0, or the value of 0 is arbitrary (i.e., ratio scales with an absolute 0 are rare). In these circumstances, it makes no sense to base substantive interpretations on the values of the regression weights unless the variables are centered or standardized. In that case, the zero point has a meaningful interpretation as the mean of the variable considered.

Some authors (e.g., Carte & Russell, 2003) are so adamant about this point that they have argued


that regression coefficients should never be interpreted unless continuous X1 and X2 predictors are measured on a ratio scale that has an absolute 0 (for discussion of ratio scales, see Chapter XXX in the Handbook). However, we emphasize that regression weights can be meaningfully interpreted when predictor variables are centered, when all variables in the regression equation are standardized, or, perhaps, when zero is meaningful. Thus, for example, β2 (the regression weight for X2) is the change in Y associated with a 1-unit change in X2 when X1 = 0. When values are standardized, this means the change in Y (in SD units) associated with a change of 1 SD in X2 at the mean of X1. Hence, in relation to standardized or centered values, this is an interpretable result. It is, of course, important to emphasize that the change in Y will differ when X1 takes on different values. Similarly, for standardized values it is interpretable to report β2, the marginal change in Y associated with a 1 SD change in X2 at the mean of X1. It is important that the values of the predictor variables are scaled so that 0 is a meaningful value, whether a 0-to-1 coding of experimental groups, a zero-centered coding of continuous predictor variables, or appropriately standardized values so that all predictor variables have M = 0 and SD = 1. However, although we argue that it is meaningful to evaluate interaction effects in relation to scores transformed to have a meaningful zero value, it is also important to evaluate simple effects of the predictor variable at different levels of the moderator variable.

Standardized Solutions for Models With Interaction Terms. Standardized estimates (see Equation 3) are useful for comparing results based on different variables in a standardized metric, even when the original variables are based on different, possibly arbitrary metrics. Standardizing a variable can be treated as two steps: centering (subtracting the variable's mean from it) and rescaling (multiplying by a constant; e.g., meters are replaced by centimeters). Although the main effects of predictors on outcome variables are unaffected by centering in analyses of models without an interaction term, they may be affected substantially in models with interaction terms (Cohen, 1978; Cohen et al., 2003). However, centering does not affect the coefficient for the interaction term (see Cohen et al., 2003) or the nature (e.g., ordinal vs. disordinal) of the interaction. We also note that when predictors are not centered, product or nonlinear (e.g., X²) terms will typically be highly correlated with the original variables, leading to multicollinearity problems such as

large standard errors for the regression coefficients (Aiken & West, 1991; Cohen et al., 2003).
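These two properties of centering (the interaction coefficient is unchanged, while the "main effect" coefficients and the nonessential multicollinearity are not) are easy to verify numerically. A minimal sketch with simulated data (all means and coefficients are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(2.0, 1.0, n)            # deliberately non-centered predictors
x2 = rng.normal(3.0, 1.0, n)
y = 0.5 * x1 + 0.3 * x2 + 0.4 * x1 * x2 + rng.normal(0.0, 1.0, n)

def fit(a, b):
    """OLS of y on [1, a, b, a*b]; returns the coefficient vector."""
    X = np.column_stack([np.ones(n), a, b, a * b])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

raw = fit(x1, x2)
x1c, x2c = x1 - x1.mean(), x2 - x2.mean()
cen = fit(x1c, x2c)

print(raw[3], cen[3])   # interaction coefficient: numerically identical
print(raw[1], cen[1])   # X1 coefficient changes: it is the conditional effect
                        # of X1 at the point where the other predictor is 0

# Nonessential multicollinearity: the raw product is highly correlated with
# X1, whereas the centered product is nearly uncorrelated with centered X1.
print(np.corrcoef(x1, x1 * x2)[0, 1])
print(np.corrcoef(x1c, x1c * x2c)[0, 1])
```

Centering thus changes where the conditional effects are evaluated, not the interaction itself.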

The computation of appropriate standardized effects involving interaction terms in regression analyses is not straightforward with most commercial statistical packages (see Aiken & West, 1991; Cohen et al., 2003). In particular, the standardized regression coefficients are typically not correct for the interaction term. To obtain correct standardized regression coefficients (Friedrich, 1982):

1. standardize (z-score) all variables to become ZY, ZX1, ZX2;

2. form the interaction term by multiplying the two standardized variables, ZX1ZX2 (but do not re-standardize the product term);

3. use ZX1, ZX2, and ZX1ZX2 in a regression analysis to predict ZY; determine the statistical significance of all predictors using their respective t-values; and report the unstandardized regression coefficients as the appropriate standardized coefficients, that is, the coefficients of Equation 2. (Note: the intercept term β0 is necessary and generally not equal to 0 because the product term ZX1ZX2 typically does not have a mean of 0.)
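The three steps above can be scripted directly. A sketch with simulated data (coefficients are arbitrary; note that the mean of the product of two z-scores equals the Pearson correlation of the two predictors, which is why the intercept is generally nonzero):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(10.0, 2.0, n)
x2 = 0.5 * x1 + rng.normal(0.0, 1.5, n)     # correlated predictors
y = 2.0 + 0.8 * x1 + 0.1 * x2 + 0.05 * x1 * x2 + rng.normal(0.0, 3.0, n)

def z(v):
    return (v - v.mean()) / v.std()

# Step 1: standardize all variables.
zy, zx1, zx2 = z(y), z(x1), z(x2)
# Step 2: form the product of the standardized predictors; do NOT
# re-standardize it (its mean equals corr(X1, X2), generally nonzero).
zprod = zx1 * zx2
# Step 3: regress ZY on ZX1, ZX2, and ZX1ZX2; the unstandardized
# coefficients of this regression are the appropriate standardized ones.
X = np.column_stack([np.ones(n), zx1, zx2, zprod])
beta, *_ = np.linalg.lstsq(X, zy, rcond=None)
print(np.round(beta, 3))     # [b0, b1, b2, b3]; b0 is generally nonzero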

Tests of Statistical Significance of Interaction Effects. In the application of a multiple regression model, it is straightforward to test the statistical significance of main effects and interactions by using the so-called hierarchical regression approach. This can be done in relation to raw, centered, or standardized variables:

1. For the additive model (with no interaction terms), predict Y with X1 and X2 and obtain the squared multiple correlation (percentage of variance explained), R₁²;

2. For the interaction model, predict Y with X1, X2, and X1X2 using Equation 1, obtain the squared multiple correlation R₂², and compute the change in the squared multiple correlation, R₂² − R₁².

Based on Equation 1, to test whether the interaction effect is statistically significant (the hypothesis H0: β3 = 0), the test statistic is

t = β̂3 / SE(β̂3),

where SE(β̂3) is the standard error of the estimated β3. More generally, when there is only one interaction term in the regression model, an equivalent test is


whether R₂² is significantly higher than R₁², with the following F-test:

F = [(R₂² − R₁²)/(k₂ − k₁)] / [(1 − R₂²)/(N − k₂ − 1)],

where R₂² is from the equation involving the interaction term, with k₂ predictors; R₁² is from the original equation with k₁ predictors; and N is the sample size. This is evident in that the squared t-statistic is equal to the F-statistic with df = 1 in the numerator. This value will be the same regardless of whether X1 and X2 have been centered or standardized.

The greater the partial regression coefficient β3, and the change in R² (i.e., R₂² − R₁²) resulting from the introduction of the interaction term, the greater the moderating effect of X1 on the relation of X2 to Y (or, symmetrically, of X2 on the relation of X1 to Y). As pointed out by various researchers (e.g., Aiken & West, 1991; McClelland & Judd, 1993), it is important that the hierarchical regression be conducted with a model in which the interaction term is tested after controlling for both X1 and X2. That is, regression models testing the interaction term (X1X2) must also contain the corresponding predictor variables (X1 and X2). Otherwise the effect of the interaction is confounded with the main effects of X1 and X2. For models involving categorical variables in which more than one dummy variable is needed to represent one of the predictor variables (i.e., there are more than two categories), all the related product terms must be entered simultaneously in the same regression step. Whether analyses are done in one step (including all predictor and interaction variables) or two steps (testing additive effects first and then including interaction terms), it is critical that interaction terms are evaluated in a model that contains all the additive terms for all variables in the interaction.

If additional covariates are included as control variables, they should be entered as the very first set of variables in the hierarchical regressions, followed by those involved in the interaction terms (Frazier, Tix, & Barron, 2004). However, this recommendation is based on the assumption that the covariates come before the predictor and outcome variables in their causal ordering. For some covariates this is reasonable (e.g., gender, ethnicity, pretest variables), but for others it might not be (see further discussion of mediated effects below). It is also relevant to evaluate whether the covariates really have similar effects on the outcome variable at different values of the other predictor variables by testing interactions between covariates and the other variables (Cohen et al., 2003; Frazier et al., 2004).

Post hoc Examination of Interactions With Continuous Observed Variables. Even when there is a statistically significant interaction effect, the interpretation of the values of the regression weights is hazardous, particularly when based on raw scores in the original metric. Hence, we recommend that researchers always graph the regression equation at representative values of the predictor variables. Although this can be done in relation to the original (raw score) metric, there are sometimes interpretational advantages in constructing these graphs based on centered or standardized values. Logical questions might include (1) what is the pattern of changes in the slope (e.g., does the slope increase or decrease with increasing values of X2)? and (2) is the regression of Y on X1 significant at a particular value of X2 or for a range of X2 values?

Graphically, this simple approach can be depicted by plots like those in Figure 17.2. When one of the predictor variables is a dichotomous grouping variable and the other is a continuous variable, the group-by-linear interaction can be depicted by two straight lines. Simple slopes refer to the fact that for each value of one predictor variable, it is possible to plot the relation of the other predictor variable to the outcome. This approach can still be used even when the effect of the continuous variable is nonlinear. Even when both variables are continuous, such plots of the effects of one predictor variable at representative values of the other predictor variable provide a heuristic pictorial representation of an interaction. We generally recommend that all interactions be represented graphically to better understand and communicate the nature of the interaction.

More formally, to explore the pattern of changing slopes in the regression of Y on X1 at different values of X2, it is customary to plot at least two of these lines (Y against X1, i.e., simple slopes) at specific values of X2 (e.g., the mean of X2 and X2 = +1 SD and −1 SD; Aiken & West, 1991; Cohen et al., 2003). Mathematically, we can substitute X2 (or, symmetrically, X1) with values at −1 SD, 0, and +1 SD from its mean into the regression equation and manually calculate the appropriate t-values for the respective regression weights for X1 (see Aiken & West, 1991, for an illustrated example). It is also possible to plot simple slopes either in terms of the original regression equation or the completely standardized regression equation, whichever is easier and most interpretable. Based on values provided by most computer packages, it is also possible to compute standard errors and appropriate t-values


to test the statistical significance of slopes for a predictor at a given value of the moderator, or to test the difference in slopes at two different values of the moderator variable (for further discussion, see Aiken & West, 1991; Darlington, 1990; Jaccard et al., 1990). However, a possibly more expedient approach is to center the moderator at the desired value and re-run the regression model with the new terms (i.e., the newly centered moderator variable and the corresponding new cross-product terms based on it). As is always the case, the effect of X1 can be interpreted as its effect when X2 = 0. Hence, if X2 is centered on a particular value of interest, the test of statistical significance of X1 provides a test of the simple slope of X1 at that value of X2. A similar logic can be used with categorical variables by choosing different categories as the reference (or "left-out") category that is assigned a value of zero. Thus, for example, when X2 is gender, with boys = 0 and girls = 1, the test of statistical significance of X1 provides a test of the simple slope of X1 for boys. However, redoing the analysis with boys = 1 and girls = 0 provides a test of the simple slope for girls. (For further discussion, see Cohen et al., 2003.)
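The re-centering trick is simple to script: subtract the value of interest from the moderator, rebuild the product term, and refit. A sketch with simulated data (the slope returned for moderator value c is exactly b1 + b3·c from the original fit):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 0.4 * x1 + 0.3 * x2 + 0.25 * x1 * x2 + rng.normal(size=n)

def simple_slope(c):
    """Re-center the moderator at c and refit; the coefficient (and t-value)
    of X1 then describe the simple slope of X1 at X2 = c."""
    x2c = x2 - c
    X = np.column_stack([np.ones(n), x1, x2c, x1 * x2c])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - X.shape[1])
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
    return beta[1], beta[1] / se[1]

for c in (-1.0, 0.0, 1.0):   # -1 SD, mean, +1 SD for a standardized moderator
    slope, t = simple_slope(c)
    print(f"X2 = {c:+.0f}: simple slope = {slope:.3f}, t = {t:.2f}")
```

Because the refitted slope is linear in c, the slopes at equally spaced moderator values are equally spaced; only the standard errors (and hence the t-values) differ across c.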

The Johnson-Neyman approach is a more general alternative to the simple slopes approach that is particularly relevant when both predictor variables are continuous (Johnson & Neyman, 1936; Potthoff, 1964). This approach is used to define regions of significance representing the range (or ranges) of values of one predictor variable for which the slope of the other predictor variable is significantly different from zero. Thus, a region of significance is the range of X2 (moderator) values for which the relation between X1 (predictor) and Y is statistically significant. For each region of significance, there is at most one upper and one lower bound, although one or the other of these values might be outside the range of possible values of the moderator (or even take on the value of infinity), so that there is effectively only one bound. Alternatively, regions of nonsignificance are the ranges of values of one predictor variable for which the slope of the other predictor variable is not significantly different from 0. As noted by Preacher, Curran, and Bauer (2006; see also Dearing & Hamilton, 2006), the logic of regions of significance is the converse of that used to interpret simple slopes. Simple slopes identify a slope coefficient (and its standard error, confidence interval, and statistical significance) at chosen values of the moderator. In contrast, regions of significance provide the range of values of the moderator for which the simple slopes are significant.
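The boundaries of these regions can be computed from the fitted coefficients and their covariance matrix: the simple slope at moderator value m is b1 + b3·m, with sampling variance v11 + 2m·v13 + m²·v33, and the bounds solve (b1 + b3·m)² = t²crit · Var, a quadratic in m. A sketch (the input numbers are purely illustrative, not the chapter's example; a large-sample critical value of 1.96 stands in for the exact t):

```python
import numpy as np

def jn_bounds(b1, b3, v11, v13, v33, t_crit=1.96):
    """Johnson-Neyman bounds: moderator values m at which the simple slope
    b1 + b3*m sits exactly at the significance threshold. Solves the
    quadratic (b1 + b3*m)^2 = t_crit^2 * (v11 + 2*m*v13 + m^2*v33)."""
    a = b3 ** 2 - t_crit ** 2 * v33
    b = 2.0 * (b1 * b3 - t_crit ** 2 * v13)
    c = b1 ** 2 - t_crit ** 2 * v11
    disc = b ** 2 - 4.0 * a * c
    if disc < 0:
        return None                 # no real boundary anywhere on the line
    lo, hi = sorted(np.roots([a, b, c]).real)
    return lo, hi

# Illustrative inputs: slope 0.4 at the moderator mean, decreasing by 0.4
# per moderator unit; v11, v13, v33 are assumed coefficient (co)variances.
lo, hi = jn_bounds(b1=0.4, b3=-0.4, v11=0.01, v13=0.0, v33=0.01)
print(round(lo, 2), round(hi, 2))   # slope nonsignificant between these values
```

With these inputs the simple slope is exactly 0 at m = 1, which falls inside the computed region of nonsignificance, as it should.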

The manual calculation and plotting involved in probing significant interactions (e.g., simple slopes and regions of significance) is tedious and prone to error. Fortunately, Preacher et al. (2006) have provided a useful online program that facilitates the examination of two- and three-way interactions in multiple regression, multilevel modeling, and latent curve analysis (see the Appendix of this chapter for more detail). Users just input the regression coefficients, the variances/covariances of these coefficients, degrees of freedom, and α levels. Simple slopes and intercepts for conditional values of the moderator at predetermined points of interest, their standard errors, critical ratios, p-values, as well as confidence bands around them are either provided or can be easily generated (see the Appendix for an example presented in detail).

Based on the tools introduced by Preacher et al. (2006; see also the Appendix of this chapter for more detail), we constructed a graph of simple slopes (Fig. 17.3A) and regions of significance (Fig. 17.3B). The graph of relations between X1 (a predictor variable) and Y (the outcome variable) at different values of X2 (the moderator variable) shows that this is a disordinal interaction (Fig. 17.3A) in that the three regression plots cross (at the "point of intersection") within the range of X1 and X2 values that were considered. The effect of X1 on Y is very positive for X2 = −1, less positive for X2 = 0 (the mean of the moderator variable X2), and close to 0 for X2 = +1. Figure 17.3B represents the relation between the moderator (X2) and the simple slope of the relation of X1 on Y, shown by the solid dark-gray regression line. The relation between the simple slope and the moderator is negative; the simple slope is positive for sufficiently low values of X2, 0 for some intermediate value of X2, and negative for sufficiently high values of X2. The curved lines on either side of the regression line are the 95% confidence bands for the simple slope at each value of X2. The solid dark line for the simple slope of 0 represents the values of X2 for which X1 is not related to Y; this occurs at a value of X2 = 0.87. The 95% confidence interval around the value of X2 where the simple slope is 0 is the "region of nonsignificance" (represented by the two horizontal dashed lines), that is, the values of X2 for which the simple slope of X1 on Y is not significantly different from 0 (X2 = +0.55 to X2 = +1.40 in our example). There are two regions of significance: the effect of X1 on Y is positive for


[Figure 17.3 appears here. Panel A plots Y against X1 for X2 = +1, X2 = 0, and X2 = −1, with the point of intersection marked. Panel B plots the simple slope against the moderator X2, showing the slope-moderator relation with its 95% confidence bands, and the region of nonsignificance lying between a region of significance where the simple slope is positive and one where it is negative.]

Figure 17.3 Plot of a two-way interaction (A) and regions of significance and nonsignificance (B). Y = outcome variable, X1 = predictor variable, X2 = moderator. In (A), simple slopes are represented as regression plots of the relation between X1 and Y at different values of X2. (B) shows the relation between the simple slope (relating X1 to Y) and the moderator X2. Regions of significance show the ranges of values of X2 for which the simple slope is statistically significant (i.e., X1 is significantly related to Y), whereas regions of nonsignificance show the range of values of X2 for which the simple slope is not statistically significant.

values of X2 less than +0.55 but negative for values of X2 greater than +1.40.

Point of Intersection for Disordinal Interactions. For a regression equation with a significant interaction, the simple slopes may cross within or outside the possible (or plausible) range of the predictor variables (e.g., the disordinal interaction in Fig. 17.1). In an ordinal interaction, the set of lines merely fan out or fan in, with the nominal crossover point outside the range of plausible values of X1 (see the ordinal interaction in Fig. 17.1). Thus, for example, for the disordinal interaction in Figure 17.1, the point of intersection represents the value of X1 at which gender (X2) has no effect on the outcome variable (i.e., where the lines cross so that scores for boys and girls are the same). In this graph, for all values of X1 to the right of the intersection (i.e., more positive values of X1), girls score higher than boys. Depending on the size of the confidence intervals, for some value of X1 sufficiently above the intersection, girls have outcomes significantly higher than boys. Likewise, for values of X1 below the intersection, boys have higher values than girls, and this difference is statistically significant for values of X1 less than the lower bound of the region of nonsignificance.

For disordinal interactions where all the effects are linear, there can be only one region of nonsignificance and at most two regions of significance. For ordinal interactions based on linear effects, it is possible to have no region of nonsignificance and no more than one region of significance. However, for nonlinear effects, it is possible to have disordinal interactions with more than one point of intersection and multiple regions of significance and nonsignificance.

For a disordinal interaction based on linear effects, we can determine mathematically the intersection or crossing point. With some simple algebra based on the appropriate regression equation (Aiken & West, 1991), it can be shown that when X1 is the


moderator, the point of intersection is X2 = −b1/b3, and when X2 is the moderator, the point of intersection is X1 = −b2/b3. For example, in Figure 17.3A, b2 = −0.355 and b3 = −0.397, so the point of intersection is X1 = −b2/b3 = −(−0.355/−0.397) = −0.89.
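This arithmetic is easy to check:

```python
# Crossing point for the Figure 17.3A example: with X2 as the moderator,
# all simple-slope lines pass through X1 = -b2/b3.
b2, b3 = -0.355, -0.397
x1_cross = -b2 / b3
print(round(x1_cross, 2))    # -0.89
```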

Power in Detecting Interactions. Using standard multiple regression procedures, it is not difficult to test the statistical significance of interaction effects. Nevertheless, McClelland and Judd (1993) emphasized that researchers have had great difficulty in identifying substantively meaningful, statistically significant interactions that are replicable, particularly in nonexperimental research. Reasons for this situation include: (1) overall model error is generally smaller in controlled experiments than in nonexperimental studies; (2) measurement errors are exacerbated when X1 and X2 are multiplied to form the product term X1X2 (see also Frazier et al., 2004) and are likely to be larger in nonexperimental than in experimental studies, where values are manipulated to take on a few fixed values at predetermined levels; (3) the magnitude of interactions is typically constrained in field studies, in which researchers cannot assign participants to optimal levels of the predictor variables; and (4) power to detect interactions is compromised because of nonlinearities in the effects of X1 and X2 and their interaction (e.g., products of higher-order terms), which is more problematic for field studies with more levels of measurement. Finally, the values that factors in experimental studies are assigned are often selected to maximize the sizes of effects (e.g., extreme values) and to be optimal in relation to research predictions, even if they are not representative of values typically observed, whereas the values in nonexperimental studies are more likely to be representative of the population from which the sample was drawn.

Particularly relevant to this chapter, McClelland and Judd (1993) demonstrated that because of the reduced residual variance of the product X1X2 after controlling for X1 and X2 in typical field studies, the efficiency of the interaction coefficient estimate and its associated power are much lower. This is likely to dramatically reduce the power to detect interactions in field studies as compared to the more optimal designs of true experiments. Thus, for example, Aguinis (2004; McClelland & Judd, 1993) has shown that the power of the test of an interaction effect is typically less than 50%, so that large sample sizes might be needed to evaluate interaction effects. The situation becomes worse when the variances of X1

and X2 are limited, truncated, or reduced in a particular study. For these reasons, it is reasonable to conduct a preliminary power analysis before collecting data to determine how large an N is needed to have a reasonably high probability of finding a moderately sized interaction to be statistically significant (for further discussion, see Cohen et al., 2003).
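Such a preliminary power analysis can be approximated by Monte Carlo simulation. A rough sketch (all effect sizes are assumed, and a large-sample z criterion stands in for the exact t):

```python
import numpy as np

rng = np.random.default_rng(4)

def interaction_power(n, b3=0.15, reps=400, z_crit=1.96):
    """Fraction of simulated datasets in which the X1*X2 coefficient is
    'significant' (|t| > z_crit). Assumed model: standardized predictors,
    main effects of 0.3, residual SD of 1."""
    hits = 0
    for _ in range(reps):
        x1, x2 = rng.normal(size=n), rng.normal(size=n)
        y = 0.3 * x1 + 0.3 * x2 + b3 * x1 * x2 + rng.normal(size=n)
        X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        s2 = resid @ resid / (n - X.shape[1])
        se3 = np.sqrt(s2 * np.linalg.inv(X.T @ X)[3, 3])
        hits += abs(beta[3] / se3) > z_crit
    return hits / reps

for n in (100, 400):
    print(n, interaction_power(n))   # power rises sharply with N
```

Varying n until the estimated power reaches the desired level (e.g., 0.80) gives the required sample size under the assumed effect sizes.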

Frazier et al. (2004) suggested that, before conducting research, researchers ensure that they have sufficient power to detect hypothesized interactions. In particular, the sample size should be sufficiently large, especially when interaction effects are likely to be small, as is typically the case. For categorical variables, it is best to have approximately similar sample sizes across subgroups. Further, if error variances within subgroups are not similar, then alternative tests of significance are needed (e.g., a multiple-group approach with heterogeneous error terms; see subsequent discussion). For continuous predictor and outcome measures, reliabilities should be high, and samples leading to restriction of range should be avoided. Several researchers (e.g., Frazier et al., 2004; Marsh, Wen, & Hau, 2004) have also suggested the use of latent factors in structural equation modeling (SEM) to control for measurement errors (see the subsequent discussion of latent-variable approaches to interaction models).

Multicollinearity Involved With Product Terms. Multicollinearity results when multiple predictor variables are correlated with each other such that the values of one variable systematically vary with those of another (Marsh, Dowson, Pietsch, & Walker, 2004). Hence, multicollinearity is a function of the relations among predictor variables and of their relation to the outcome variable. There are three quite different meanings of multicollinearity that are often confused. The first, and the focus of this section, is when two or more predictor variables are correlated, thus complicating the interpretation of each considered separately. This form of multicollinearity occurs in almost all studies in which the predictor variables are not experimentally manipulated so as to be orthogonal. The second meaning of multicollinearity is when the predictor variables are so highly correlated that the typical tests of statistical significance are distorted or cannot be done (e.g., matrices become nonpositive definite so that they cannot be inverted). This almost never happens and so is not a major concern. (It does sometimes happen when researchers inadvertently use different variables that are linear combinations of each other, but this should be easy to recognize.) The third meaning, perhaps, is when multicollinearity


is sufficiently large that the SEs of coefficients are so large that meaningful interpretations of the parameter estimates cannot be made. This is not a statistical problem per se but, rather, an inability to disentangle the effects of each separate variable when they are highly correlated. Although there are numerous guidelines as to how large multicollinearity has to be before it is a serious problem, these are rough rules of thumb typically based on arbitrary cut-off values (see discussion by Cohen et al., 2003) and not "golden rules" (Marsh, Hau, & Wen, 2004) to be followed blindly in all instances. Importantly, any non-zero correlation between predictor variables results in multicollinearity. Nevertheless, reducing multicollinearity can substantially enhance the interpretability of regression weights and, in extreme cases, affect statistical tests of these effects.

Whenever product terms are computed to represent interaction effects (e.g., X1X2), and particularly nonlinear effects (e.g., X² to represent a quadratic effect), there is likely to be a substantial amount of multicollinearity. That is, the product terms are likely to be substantially correlated with the variables used to construct them. Cohen et al. (2003; see also Marquardt, 1980) have distinguished between what they refer to as essential and nonessential multicollinearity. Nonessential multicollinearity can be seen as an artificial effect resulting from the scaling of predictor variables in relation to their means, whereas essential multicollinearity is a function of the correlation between the predictor variables (and would be 0 if X1 and X2 were uncorrelated). Cohen et al. have noted that nonessential multicollinearity can be reduced substantially by centering the predictor variables (or by standardizing them).

Even when there are substantial levels of "essential" multicollinearity resulting in substantial SEs that undermine interpretability, alternative models can be specified to circumvent these problems. Thus, for example, Marsh, Dowson, Pietsch, and Walker (2004) reanalyzed data apparently showing that paths leading from self-efficacy to achievement (0.55, p < 0.05) were much larger than those from self-concept (−0.05, ns). However, the interpretation was problematic because the correlation between self-concept and self-efficacy (0.93) was so high, resulting in very large SEs for both paths. In an alternative model, the two paths were constrained to be equal. This constraint did not result in a statistically significant decline in fit, provided a better, more parsimonious fit to the data, and also substantially reduced the sizes of the SEs (from 0.50 to 0.03).

Multicollinearity can be reduced even more by applying the more general, hierarchical (sequential) approach, analogous to the residualizing procedure proposed by Landram and Alidaee (1997; Lance, 1988; Little, Bovaird, & Widaman, 2006; but see also Lin, Wen, Marsh, & Lin, 2010; Marsh, Wen, Hau, Little, Bovaird, & Widaman, 2007) for polynomial components. With this approach, variance attributable to the predictor variables (X1 and X2) is partialed out of the X1X2 interaction (product) term, main effects and first-order interactions are partialed out of second-order interactions, and so forth. This strategy is consistent with the hierarchical approach used to test interactions (based on the change in R²) but has the advantage of substantially reducing levels of multicollinearity. However, it also introduces interpretational problems, because the coefficients of the interaction terms are no longer in the same metric as the predictor variables, so it is not easy to graph the interaction effect (see King, 1986, on problems with analyses of residuals). Hence, we recommend that this approach be used only with appropriate caution and note that it is generally better to use centering and standardizing strategies. Importantly, multicollinearity may sometimes indicate that the model has been in some ways misspecified (e.g., when the predictor and the moderator should, in fact, be considered as indicators of the same construct) and should not simply be partialled out of the model.
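The residualizing step itself is simple: regress the product term on X1 and X2 and keep the residual. A sketch with simulated data (in the spirit of the orthogonalizing strategy cited above, not any author's exact code) also illustrates why the interaction coefficient is unaffected while the other coefficients change interpretation:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 500
x1 = rng.normal(2.0, 1.0, n)                 # non-centered, correlated predictors
x2 = 1.0 + 0.5 * x1 + rng.normal(0.0, 0.8, n)
y = 0.4 * x1 + 0.3 * x2 + 0.25 * x1 * x2 + rng.normal(size=n)
prod = x1 * x2

# Residualize the product term on the predictors (plus an intercept).
Z = np.column_stack([np.ones(n), x1, x2])
g, *_ = np.linalg.lstsq(Z, prod, rcond=None)
prod_res = prod - Z @ g                      # orthogonal to X1 and X2

print(np.corrcoef(x1, prod)[0, 1])           # raw product: high collinearity
print(np.corrcoef(x1, prod_res)[0, 1])       # residualized: essentially zero

# The interaction coefficient is unchanged across the two parameterizations;
# only the X1 and X2 coefficients (and their interpretation) differ.
b_raw, *_ = np.linalg.lstsq(np.column_stack([Z, prod]), y, rcond=None)
b_res, *_ = np.linalg.lstsq(np.column_stack([Z, prod_res]), y, rcond=None)
print(b_raw[3], b_res[3])
```

The equality of the two interaction coefficients follows from standard partialling algebra: subtracting a linear combination of terms already in the model cannot change the coefficient of the remaining term.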

Summary of Traditional (Non-Latent)Approaches to Interaction Effects

In this section, we briefly reviewed approaches to testing interaction effects based on observed (non-latent) variables. We began with a brief discussion of ANOVA when all predictor variables are categorical, noting in particular the dangers in transforming continuous predictors into categorical variables that are appropriate for ANOVA. We then moved to the more general moderated multiple regression approach that can be applied to continuous or categorical predictors (and thus includes ANOVA models as a special case). We emphasized that graphs of interactions are typically a useful starting place in the interpretation of results, but these should always be supplemented with tests of statistical significance of interaction effects overall and supplemental analyses such as tests of simple


slopes and regions of significance. We recommended consideration of, and discussed issues related to, a number of transformations of the original data (centering predictor variables or standardizing all variables). We concluded with a discussion of the related issues of power and multicollinearity (and the little-used residualizing procedure). A critical limitation in most psychological research is that even when tests of interactions are appropriately constructed, a priori interaction effects that are intuitive and based on a strong theoretical rationale are typically small, non-significant, or not replicable. The implication is that most research lacks the statistical power to detect interaction effects, particularly in non-experimental studies where interaction effects are typically small. A critical issue related to power is the substantial measurement error associated particularly with interaction terms. Latent variable models of interaction effects that control for measurement error might substantially reduce this problem, as well as having more general strategic advantages. Hence, we now turn to an overview of these approaches.

Latent Variable Approaches to Tests of Interaction Effects

The critical feature common to each of the above non-latent approaches is that all of the dependent and independent variables are observed variables inferred on the basis of a single indicator rather than latent variables inferred from multiple indicators. Particularly when there are multiple indicators of these variables (e.g., multiple items in rating scales or achievement tests), latent variable approaches provide a much stronger basis for evaluating the underlying factor structure relating multiple indicators to their factors, controlling for measurement error, increasing power, testing the implicit factor structure used to create scale scores, and, ultimately, providing more defensible interpretations of the interaction effects.

Despite the ongoing emphasis on interaction effects, empirical support for predicted interactions has been disappointingly limited. One reason might be that the independent variables are contaminated by measurement error and do not provide accurate estimates of true interaction effects (Moulder & Algina, 2002). Indeed, because the measurement errors of the main effect variables combine multiplicatively in the formation of the interaction term, measurement error in the interaction is likely to be substantially larger than in either of the main effects variables (see earlier discussion). When each of the independent variables is a latent variable inferred from multiple indicators, SEM provides many advantages over analyses based on observed variables.

Nevertheless, despite the widespread use of SEMs for the purposes of estimating relations among latent variables and the importance of interaction effects, there have been very few substantive applications of SEMs to estimating interactions between two latent variables. Obviously, the paucity of such interaction applications does not result from a lack of relevant substantive applications that require interaction terms. Rather, as noted by Rigdon, Schumacker, and Wothke (1998), inherent problems in the specification of SEMs with interactions between latent variables have led researchers to pursue other approaches. Similarly, Jöreskog (1998) warned that latent variable approaches require the researcher to understand how to specify complicated nonlinear constraints. Although a variety of different approaches to estimating latent interactions were described in the book edited by Schumacker and Marcoulides (1998), and some new approaches have been developed by Algina and Moulder (2001), Wall and Amemiya (2001), and Klein and Moosbrugger (2000), there has been no consensus that any one of these approaches is optimal. Further, most of these approaches are not practically useful for the applied researcher because of the difficulties in specifying the nonlinear constraints. Another possible reason for the infrequent use of the latent approach is the difficulty in deciding how to construct or select the multiple indicators to form the latent factors, as typically required by all traditional latent approaches.

Latent interactions fall into two broad categories. In the first and generally simpler situation, at least one of the variables involved in the interaction is a categorical variable with only a few categories (e.g., gender: male and female). In this situation, a multiple-group SEM is appropriate in which the different categories are treated as separate groups within the same model. Even when both variables involved in the interaction are categorical, they can be used to form groups reflecting the main and interaction effects (e.g., two dichotomous variables can be used to form four groups reflecting the two main effects and one interaction effect, which is like an ANOVA within an SEM framework but providing control for measurement error). Although it might be argued that the interaction in this case is not really latent, it is still a latent variable model as long as the


dependent (or other predictor) variables are inferred on the basis of multiple indicators.

In the second situation, both independent variables involved in the interaction are latent and continuous. Here there are various approaches to estimating their interaction effects, and "best practice" is still evolving. Marsh, Wen, and Hau (2004; see also Marsh, Wen, & Hau, 2006) reviewed various alternative approaches for estimating these latent interaction effects, including the unconstrained (Marsh et al., 2004), constrained (Algina & Moulder, 2001), generalized appended product indicator (GAPI; Wall & Amemiya, 2001), and distribution-analytic (Klein & Moosbrugger, 2000; Klein & Muthén, 2007; see also Moosbrugger, Schermelleh-Engel, Kelava, & Klein, 2009) approaches.1

In the remainder of this chapter, we will first discuss multiple-group SEMs that are appropriate for analyzing interactions when one of the predictors is categorical. We will then turn to the discussion of approaches to study interactions between latent continuous variables.

Multiple Group Structural Equation Modeling Approach to Interaction

Consider an interaction between a latent variable (ξ1) and an observed variable (X2) on a latent variable (η), and assume that X2 is a categorical variable with a few naturally existing categories. We can use a multiple-group SEM approach (e.g., Bagozzi & Yi, 1989; Byrne, 1998; Rigdon, Schumacker, & Wothke, 1998; Vandenberg & Lance, 2000) with the categorical variable (X2) as the grouping variable. Once the sample is divided into a small number of groups according to the values of X2, we conduct multiple-group SEM for the latent variable ξ1 and compare the models with and without constraining the effect of ξ1 on the dependent variable η to be equal across the groups. Let χ₁² (with df₁) and χ₂² (with df₂) be the chi-square test statistics, respectively, with and without constraining the effect of ξ1 on η to be equal in all the groups. If there is a substantial decline in the goodness of fit with the invariance constraint (i.e., χ₁² − χ₂² with df₁ − df₂ is significant and there is a substantial deterioration in selected goodness-of-fit indexes), then the effects in the different groups are not identical and there is an interaction between the categorical predictor X2 and the latent variable ξ1. Of course, the invariance of the corresponding loadings and factor variances (i.e., constraining them to be the same in the different groups) must have been examined earlier (for

details, see, e.g., Marsh, Muthén et al., 2009; Marsh, Lüdtke et al., 2010; Vandenberg & Lance, 2000; see also subsequent discussion and further discussion in Chapter XXX).
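The chi-square difference test described above can be sketched in a few lines of Python. The fit statistics here are purely illustrative (not from any real model); in practice they would come from the SEM software output for the constrained and unconstrained multiple-group models:

```python
from scipy.stats import chi2

# Hypothetical fit statistics: constrained model holds the effect of
# xi1 on eta equal across groups; the free model does not.
chi2_constrained, df_constrained = 61.3, 25
chi2_free, df_free = 52.1, 24

delta_chi2 = chi2_constrained - chi2_free   # chi2_1 - chi2_2
delta_df = df_constrained - df_free         # df_1 - df_2
p = chi2.sf(delta_chi2, delta_df)           # p-value of the difference test

# p < .05 -> the equality constraint significantly worsens fit,
# i.e., evidence of an interaction between the grouping variable and xi1.
print(delta_chi2, delta_df, p)
```

With these illustrative numbers the constraint produces a significant decline in fit, which would be interpreted as evidence of moderation by the grouping variable.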

Historically, this approach has been used often and has some advantages, particularly in terms of ease of implementation. If one of the independent variables is an observed variable that can be used to divide the sample naturally into a small number of groups, then multiple-group SEM is a simple, direct, yet effective approach (Bagozzi & Yi, 1989; Rigdon, Schumacker, & Wothke, 1998) that can be easily implemented in most common commercial SEM software packages. However, as emphasized earlier, this approach is generally not appropriate when two or a small number of groups are formed from a reasonably continuous predictor variable, in that it ignores measurement error in the variable used to divide the sample into multiple groups and actually increases the unreliability of the grouped variable relative to the original continuous variable. Strategically, there are limitations in that it is not easy to estimate the size of the interaction effect, and some of the groups can become unacceptably small. In general, we recommend not using this approach unless one of the interacting variables is a true categorical variable with a small number of categories and at least moderate sample sizes.

Structural Equation Models with Product Indicators

Kenny and Judd (1984) first used an SEM model with an interaction term to estimate latent interaction effects. In their approach, the dependent variable y is an observed variable, whereas the independent variables ξ1 and ξ2 each have two observed indicators, x1, x2 and x3, x4, respectively, with all variables being centered (mean = 0). If the latent variables are identified by fixing the loading of the first indicator to 1, then the measurement equations of the model are:

x1 = ξ1 + δ1, x2 = λ2ξ1 + δ2; x3 = ξ2 + δ3, x4 = λ4ξ2 + δ4.

The structural equation is:

y = γ1ξ1 + γ2ξ2 + γ3ξ1ξ2 + ζ, (4)

where ξ1ξ2 is the interaction term of ξ1 and ξ2 on y. In the model, it is assumed that the latent variables and the error terms are all normally distributed and that there is no correlation between the latent variables and the error terms or between any two error terms.


Equation 4 is not linear with respect to ξ1 and ξ2 and is different from the SEM equations generally used. If we consider ξ1ξ2 as a separate third latent variable, then having no indicators for this variable remains a problem. To solve this, Kenny and Judd used the products of all possible pairs of the centered indicators (x1x3, x1x4, x2x3, x2x4) as the indicators for ξ1ξ2, with many constraints added for identification.

Kenny and Judd's (1984) ingenious work was heuristic, stimulating many published studies that present alternative approaches to the use of products of indicators to estimate latent interactions (e.g., Algina & Moulder, 2001; Coenders, Batista-Foguet, & Saris, 2008; Hayduk, 1987; Jaccard & Wan, 1995; Jöreskog & Yang, 1996; Marsh et al., 2004; Ping, 1996; Wall & Amemiya, 2001). However, their original approach was unduly cumbersome and overly restrictive in terms of the assumptions on which it was based (see Marsh et al., 2004), leading to the development of new approaches.

Unconstrained Approach for Latent Interaction. Compared to the traditional constrained approach (e.g., Algina & Moulder, 2001; Marsh et al., 2004) and the partially constrained approach (Wall & Amemiya, 2001; Marsh et al., 2004), the unconstrained approach is fundamentally different and impressively simple to implement, in that many of the complicated constraints in the original Kenny and Judd (1984) approach are no longer necessary. Marsh et al. (2004) showed that the unconstrained approach performed nearly as well as the constrained approach when all assumptions of the constrained approach were met, and substantially better under more realistic conditions.

For the sake of simplicity, suppose that the endogenous latent variable η has three indicators, y1, y2, y3, and that the exogenous latent variables ξ1 and ξ2 each also have three indicators: x1, x2, x3 and x4, x5, x6, respectively. Lin et al. (2010) proposed a double-mean-centering strategy for the unconstrained model that does not require a mean structure. First they centered all indicators at their means, then formed a more parsimonious set of product indicators, and finally centered the products again (double-mean-centering). When SEM software (such as LISREL) is employed in practice, however, the product indicators do not really need to be re-centered, because when the model does not include a mean structure, the estimation results are identical with and without re-centering the product indicators. In other words, SEM software routinely treats the data as mean-centered if there is no mean structure in the model. Noting this point, Wu, Wen, and Lin (2009) suggested that researchers can directly use single-mean-centered data to analyze a latent interaction model without a mean structure. We note, however, that some statistical packages (e.g., Mplus) include a mean structure as the default, so this potential advantage of being able to ignore the mean structure might not be so important. Thus, the steps for analyzing the latent interaction are as follows:

1. center all indicators at their means, still denoted as y1, y2, y3; x1, x2, x3; x4, x5, x6;

2. form the product indicators x1x4, x2x5, x3x6;

3. use y1, y2, y3 as the indicators of η; x1, x2, x3 as the indicators of ξ1; x4, x5, x6 as the indicators of ξ2; and x1x4, x2x5, x3x6 as the indicators of ξ3 (ξ3 is the centered interaction term ξ1ξ2 − E(ξ1ξ2); see Lin et al., 2010; Wu et al., 2009). The structural equation has three exogenous latent variables:

η = γ1ξ1 + γ2ξ2 + γ3ξ3 + ζ. (5)

ξ1, ξ2, and ξ3 are allowed to be correlated with each other, but each is uncorrelated with the measurement errors and the residual term ζ.
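The data-preparation steps above (centering, forming matched products, double-mean-centering) can be sketched with simulated data in numpy; the latent model itself would still be estimated in SEM software such as LISREL or Mplus:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
# Illustrative raw indicator scores: three indicators for each factor
# (named X1 and X2 here, corresponding to x1-x3 and x4-x6 in the text).
X1 = rng.normal(5.0, 1.0, size=(n, 3))
X2 = rng.normal(3.0, 1.0, size=(n, 3))

# Step 1: mean-center every indicator.
X1c = X1 - X1.mean(axis=0)
X2c = X2 - X2.mean(axis=0)

# Step 2: matched product indicators x1x4, x2x5, x3x6
# (each indicator used once and only once).
prods = X1c * X2c

# Step 3 (double-mean-centering): center the products themselves,
# which removes the mean structure from the interaction indicators.
prods_dmc = prods - prods.mean(axis=0)

print(np.allclose(prods_dmc.mean(axis=0), 0.0))
```

The doubly mean-centered products have exactly zero means, which is why the latent interaction model can then be fit without a mean structure.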

In Equation 5, γ1 and γ2 represent the conditional main effects, and γ3 represents the interaction effect (for the path diagram, see Fig. 17.4). A simple LISREL syntax is given in Appendix 3. Readers who are familiar with the usual LISREL syntax may easily revise it for their own research involving latent interactions.

Construction of Product Indicators of the Latent Interaction. A potential problem with the product-indicator approach is how to form the product indicators. In contrast to earlier, ad hoc approaches to creating product indicators, Marsh, Wen, and Hau (2004, 2006) proposed two guiding principles: (1) use all the multiple indicators from both interacting variables in the formation of the indicators of the latent interaction factor, and (2) do not re-use the same indicator in forming the indicators for the latent interaction factor. Thus, each indicator of ξ1 and ξ2 should be used once and only once in forming the indicators for the latent interaction factor.

In some situations there is a natural matching that should be used to form product indicators (e.g., the items used to infer ξ1 and ξ2 have parallel wording). More generally, when the two first-order effect factors ξ1 and ξ2 have the same number of indicators, Marsh et al. (2004; see also Saris, Batista-Foguet, & Coenders, 2007) suggested that it would be better to match indicators in terms of their reliabilities (i.e., the best item from the first factor with the best item from the second factor, etc.).


Figure 17.4 The notional path diagram of the latent interaction model. ξ3 is the centered interaction term ξ1ξ2 − E(ξ1ξ2).

However, if the number of indicators differs for the two first-order effect factors, then a simple matching strategy does not work. Assume, for example, that there were 5 indicators for the first factor and 10 for the second. One approach would be to use the 10 items from the second factor to form five (item-pair) parcels by taking the average of the first 2 items to form the first item parcel, the average of the second 2 items to form the second parcel, and so forth. In this way, the first factor would be defined in terms of 5 (single-indicator) indicators, the second factor would be defined by 5 (item-pair parcel) indicators, and the latent interaction factor would be defined in terms of 5 matched-product indicators (cross-product terms based on 5 single indicators for the first factor and 5 item-pair parcels for the second factor).
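A minimal sketch of this parceling strategy, assuming simulated item scores (5 items for the first factor, 10 for the second; variable names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
items_f1 = rng.normal(size=(n, 5))    # 5 indicators for the first factor
items_f2 = rng.normal(size=(n, 10))   # 10 indicators for the second factor

# Average successive pairs of the 10 items into 5 item-pair parcels
# (items 1-2 -> parcel 1, items 3-4 -> parcel 2, and so forth).
parcels_f2 = items_f2.reshape(n, 5, 2).mean(axis=2)

# Center and match: 5 product indicators for the latent interaction factor.
f1c = items_f1 - items_f1.mean(axis=0)
p2c = parcels_f2 - parcels_f2.mean(axis=0)
interaction_indicators = f1c * p2c

print(parcels_f2.shape, interaction_indicators.shape)
```

Each item or parcel is used once and only once, consistent with the guiding principles above.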

Appropriate Standardized Solution for Latent Interactions. Standardized parameter estimates are important because they facilitate the comparison of the effects of different predictor variables in the equation. Analogous to the corresponding problem in multiple regression with the manifest interaction described earlier, the usual standardized coefficients of SEMs with latent interactions are not appropriate. Wen, Marsh, and Hau (2010; see also Wen, Hau, & Marsh, 2008) derived an appropriate standardized solution for latent interactions. They proved that these appropriately standardized estimates of the main and interaction effects, as well as their standard errors and t-values, are all scale-free (for further discussion, see Wen et al., 2010). Although specifically designed for the latent interaction model, this one-step approach might also be appropriate for manifest models based on single indicators of predictor variables.

Distribution-Analytic Approaches

In contrast to the product-indicator approaches that model the latent interaction by specifying a separate latent interaction variable, the distribution-analytic approaches (Klein & Moosbrugger, 2000; Klein & Muthén, 2007; Moosbrugger, Schermelleh-Engel, Kelava, & Klein, 2009) explicitly model the distribution of the latent outcome variables and their manifest indicators in the presence of latent nonlinear effects and provide a promising alternative for the estimation of nonlinear SEMs. Two methods are currently available: the Latent Moderated Structural Equations (LMS) approach (Klein & Moosbrugger, 2000), which is implemented in Mplus (Muthén & Muthén, 1997–2008), and the Quasi-Maximum Likelihood (QML) approach (Klein & Muthén, 2007), which has not yet been implemented in a readily available software package (but is available from its author, Andreas Klein).

Both QML and LMS directly estimate the parameters of the latent interaction model given in Equation 4 (and are flexible enough to handle quadratic effects of latent variables as well) without having to resort to the use of product indicators. They differ in the distributional assumptions made about the latent dependent variable η and its indicators and in the estimation method used to obtain the parameter estimates (for more technical


description, see Klein & Moosbrugger, 2000; Klein & Muthén, 2007).

In contrast to product-indicator approaches following from Kenny and Judd's (1984) research, it is not necessary to have indicators of the latent interaction variable in the distribution-analytic approaches. The product-indicator approaches assume normality of indicators and latent constructs to estimate parameters and standard errors. In general, both assumptions are violated in models with latent interactions, although results seem to be robust to these violations. In addition, alternative approaches such as bootstrapping (see Wen et al., 2010) can be used. Bootstrapping is becoming more readily available in standard statistical packages and provides a more accurate estimate of SEs, particularly when N is small. The distribution-analytic approaches, on the other hand, maximize special fitting functions that take the non-normality of the indicators of the dependent latent variable explicitly into account (but still rely on normality assumptions about the indicators of the latent predictor variables). However, these advantages are offset by the need to use specialized software to estimate these models (Mplus and the QML program) and the high computational demands of the LMS approach that limit its applicability.
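To illustrate the bootstrap idea for obtaining SEs, here is a sketch using a manifest moderated regression with simulated non-normal errors (the same resampling logic extends to latent models in packages that support bootstrapping):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 150  # deliberately small N, where bootstrap SEs help most
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
# Heavy-tailed (t-distributed) errors: illustrative non-normality.
y = 0.4 * x1 + 0.3 * x2 + 0.2 * x1 * x2 + rng.standard_t(df=4, size=n)

def interaction_coef(idx):
    """OLS estimate of the interaction coefficient on the cases in idx."""
    X = np.column_stack([np.ones(len(idx)), x1[idx], x2[idx], x1[idx] * x2[idx]])
    b, *_ = np.linalg.lstsq(X, y[idx], rcond=None)
    return b[3]

# Nonparametric bootstrap: resample cases with replacement, re-estimate,
# and take the SD of the bootstrap estimates as the SE.
boot = np.array([interaction_coef(rng.integers(0, n, n)) for _ in range(1000)])
print(boot.std(ddof=1))  # bootstrap SE of the interaction coefficient
```

The standard deviation of the bootstrap distribution serves as the SE estimate without relying on normal-theory formulas.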

Simulation studies that compared the distribution-analytic approaches (Klein & Muthén, 2007; Marsh et al., 2004) to the unconstrained approach showed that QML appears to be sufficiently robust to non-normal data. LMS, on the other hand, can yield biased standard errors when distributional assumptions about the indicators of the predictor variables are violated. The QML program provides a fully standardized solution that assumes that all manifest and latent variables are standardized, but no standard errors for the standardized effects. These estimates are equal to the parameters of the appropriately standardized solution. Standardized effects for LMS are harder to obtain, as the current implementation provides neither a standardized solution nor an estimator for the variance of the latent product variable ξ1ξ2 with which to calculate the standardized solution.

Comparisons of LMS and QML (Klein & Muthén, 2007) indicate that LMS is slightly more efficient when its distributional assumptions are fulfilled, but QML is comparatively more robust to violations of normality. It appears that the QML approach will be more useful for applied researchers, but there have been too few studies with real data to fully evaluate its potential in actual practice. Indeed,

this concern can be applied to all of the various approaches to latent interaction.

Summary of Latent-Variable Approaches to Interaction Effects

Non-latent variable tests of interaction effects traditionally lack power because of the substantial amount of measurement error, particularly in the interaction component. In this section we briefly described new and evolving approaches to testing latent interaction effects. When one of the predictor variables is a manifest grouping variable with a small number of categories (e.g., male/female), the traditional approach to multigroup SEM can be applied. This approach is well established, easily implemented, and facilitates a detailed evaluation of invariance assumptions (e.g., invariance of factor loadings over groups) that are largely ignored in other approaches. However, the multigroup approach is generally not recommended when all the predictor variables are continuous or based on multiple indicators. Historically, the product-indicator approaches have dominated latent interaction research. Although these approaches are still evolving, there is good evidence in support of the unconstrained approach in terms of robustness and ease of implementation in all commercial SEM packages. More recently, the distribution-analytic approaches (LMS and QML) hold considerable promise and apparently have strategic advantages over the product-indicator approach. Although LMS is available in Mplus, the QML approach is not available in any major statistical package (but is available from its author, Andreas Klein, upon request). Despite a long history dating back to 1984, latent interaction effects are rarely used in applied research. Furthermore, even the limited numbers of demonstration and simulation articles have focused on statistical issues involved in estimating the models and have not adequately dealt with many of the issues faced by applied researchers (including some of those discussed here in relation to the moderated multiple regression approach).

Summary

Tests of interaction effects are central to psychology. The implications of being able to test and interpret interaction effects are critical for theory, substantive understanding, and applied practice. Yet, typical practice in the evaluation of interaction effects is surprisingly weak. In our chapter we have provided an overview of the topic of moderation


based on both manifest and latent variable models. In doing so, we have attempted to summarize current best practice. The good news is that there has been important progress in both best practice and typical practice. Applied researchers are apparently becoming more aware of the basic issues in the testing and interpretation of interaction effects. The bad news is that many well-established methodological requirements continue to be ignored in applied research. Although there are some stunning new developments in the application of latent models to testing interaction effects, these new approaches have not yet had much impact on applied research. Further, these new developments have focused primarily on the substantial statistical issues involved in fitting data to the latent variable models and have not made much progress in resolving many of the complications that have been the focus of manifest applications. Hence, it is perhaps unsurprising that evolving latent variable approaches to moderation effects have had limited effect on typical practice in applied research.

In this chapter, we have tried to promote a substantive-methodological synergy (Marsh & Hau, 2007), bringing to bear new, strong, and evolving methodology to tackle complex substantive issues with important implications for theory and practice. Here, as in other areas of applied psychological research, theory, good measurement, research, and practice are inexorably related, such that the neglect of one will undermine pursuit of the others. Marsh and Hau (2007) claimed: (1) some of the best methodological research is based on the development of creative methodological solutions to problems that stem from substantive research; (2) new methodologies provide important new approaches to current substantive issues; and (3) methodological-substantive synergies are particularly important in applied research. In the study of moderation, this synergy is particularly important.

Limitations and Directions for Further Research

We now turn to a set of issues that are beyond the scope of the present chapter. In some cases these are ongoing or evolving issues that have not been resolved in the literature; they reflect our thoughts about directions for further research.

Quadratic Effects: Confounding Nonlinear and Interaction Effects

Examples of quadratic effects are common. For example, nonlinear effects may be hypothesized between the strength of an intervention (or dosage level) and outcome variables, such that benefits increase up to an optimal level and then level off or even decrease beyond this optimal point. At low levels of anxiety, increases in anxiety may facilitate performance, but at higher levels of anxiety, further increases in anxiety may undermine performance. Self-concept may decrease with age for young children, level out in middle adolescence, and then increase with age into early adulthood. The level of workload demanded by teachers may be positively related to student evaluations of teaching effectiveness for low to moderate levels of workload but may have diminishing positive effects or even negative effects for possibly excessive levels of workload. Quadratic effects can be seen as a special case of nonlinear effects, which can be very complicated and are not discussed here in detail.

The existence of quadratic effects can complicate the analysis and interpretation of interaction effects. A strong quadratic effect may give the appearance of a spurious significant interaction effect, and it is sometimes difficult to distinguish the two (Klein et al., 2009; Lubinski & Humphreys, 1990; MacCallum & Mar, 1995). Thus, without proper analysis of the potential quadratic effects, the investigator might easily misinterpret, overlook, or mistake a quadratic effect for an interaction effect. Particularly when X1 (the predictor) and the moderator are positively correlated, it is likely that the product (quadratic and interaction) terms are also correlated. That is, the quadratic function of X1 (i.e., X1 × X1 = X1²) is likely to be correlated with the interaction (X1 × X2). Unless there is a well-established causal ordering of X1 and X2, there is no easy solution as to how to disentangle the potential confounding between the quadratic and moderation terms. In most cases, the variance that can uniquely be explained by the interaction effect will be diminished by controlling for quadratic effects. However, this is the same sort of confounding that is typical in multiple regression analyses even when product terms are not considered; some variance can be uniquely attributed to one predictor, some to the other predictor, and some can be explained by either predictor. Hence, it is reasonable to include both the nonlinear and the interaction effects in the same model to more fully specify the pattern of relations among the variables.
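A small simulation makes the confounding concrete: with positively correlated predictors, the quadratic term X1² and the interaction term X1X2 are themselves substantially correlated (the numbers here are illustrative, but for bivariate normal predictors correlated about .7 the two product terms correlate around .8):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
x1 = rng.normal(size=n)
# Construct x2 so that corr(x1, x2) is approximately .7.
x2 = 0.7 * x1 + rng.normal(scale=np.sqrt(1 - 0.7 ** 2), size=n)

quad = x1 * x1    # quadratic term X1^2
inter = x1 * x2   # interaction term X1*X2

# The two product terms are strongly correlated, so a real quadratic
# effect can masquerade as an interaction effect (and vice versa).
r = np.corrcoef(quad, inter)[0, 1]
print(r > 0.5)
```

This is why including both terms in the same model, as recommended above, is the safer strategy when the causal ordering of X1 and X2 is unclear.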

When all variables are manifest, the inclusion of quadratic effects is a relatively straightforward extension of the approaches outlined here. Indeed, the construction of the statistical model is likely to be


much easier than the interpretation of the results. However, for latent variable models in which all predictor variables are based on multiple indicators, the simultaneous inclusion of latent interaction and latent quadratic terms is likely to prove more complicated. Nevertheless, there has been some recent work on this issue involving both the multiple-indicator and distributional approaches to latent variable modeling (see Klein et al., 2009; Marsh et al., 2006; Kelava & Brandt, in press). Furthermore, as is the case with interaction effects more generally, the focus of latent variable models has been more on the statistical issues involved in fitting the model than on the interpretational concerns that have been the focus of studies based on manifest variables (and our chapter).

We also note that in applied research it is sometimes suggested that nonlinear relations should be specified a priori on the basis of theory or previous research. However, the linearity of the observed relationships is an implicit assumption of any model that does not include nonlinear terms. Hence, it is always reasonable to test the appropriateness of the linearity assumption through the inclusion of nonlinear terms.

Moderation versus Mediation and the Role of Causal Ordering

The concept of moderation (or interaction) is often confused with that of mediation (see Holmbeck, 1997; Baron & Kenny, 1986). As noted by Frazier et al. (2004, p. 116): "Whereas moderators address 'when' or 'for whom' a predictor is more strongly related to an outcome, mediators establish 'how' or 'why' one variable predicts or causes an outcome variable." Mediation refers to the mechanism that explains the relation between X1 and Y.

Mediation occurs (see Fig. 17.5) when some of the effects of an independent variable (X1) on the dependent variable (Y) can be explained in terms of another mediating variable (MED) that falls between X1 and Y in the causal ordering: X1 ➔ MED ➔ Y. A mediator is an intervening variable that accounts, at least in part, for the relation between a predictor and an outcome, such that the predictor influences the outcome indirectly through the mediator. Thus, for example, the effects of mathematical ability prior to the start of high school (X1) on mathematics achievement test scores at the end of high school (Y) are likely to be mediated in part by the mathematics coursework completed during high school (MED). In Figure 17.5A the model assumes that the effect of X1 on Y is unmediated (X1 ➔ Y).

Figure 17.5 The distinction between mediation (MED) and moderation (MOD). X1 = predictor variable. Y = outcome variable. INT = interaction. In (A), X1 affects Y with no mediation or moderation. In (B), the effect of X1 on Y is mediated in part by MED. There is total mediation if the direct effect of X1 on Y is 0 (β1 = 0) and partial mediation if β1 ≠ 0. The indirect effect of X1 on Y that is mediated by MED is the product of α1 and β2. (C) and (D) offer two representations of the interaction effect. In (C), the interaction is depicted as a separate construct (typically defined as the product of X1 and MOD, sometimes after both have been centered or standardized). (D) demonstrates more explicitly that MOD moderates the relation between X1 and Y.

In Figure 17.5B the model assumes that part of the effect of X1 on Y is direct (X1 ➔ Y, the path that goes directly from X1 to Y) and that some of the effect is mediated (X1 ➔ MED ➔ Y) by the indirect path that goes from X1 to MED and then from MED to Y. If both the direct (X1 ➔ Y) and indirect (X1 ➔ MED ➔ Y) effects are non-zero, then the X1–Y relation is said to be partially mediated, whereas the X1–Y relation is said to be completely mediated by MED if the X1 ➔ Y effect is 0.

In contrast, and as discussed extensively above, moderation is said to have taken place when the size or direction of the effect of X1 on Y varies with the value of a moderating variable (MOD). Thus, for example, the effect of a remedial course in mathematics rather than regular mathematics coursework (X1) on subsequent mathematics achievement (Y) may vary systematically depending on the student's initial level of mathematics ability (MOD); the effect of the remedial course may be very positive for initially less able students, negligible for average-ability students, and even detrimental for high-ability students who would probably gain more from regular or advanced mathematics coursework. Figure 17.5C shows the interaction as a separate construct (INT) and the effect of the interaction on the outcome (INT ➔ Y) as a separate regression coefficient (β3). Alternatively, the representation of the interaction effect in Figure 17.5D shows more clearly that the moderator influences the relation between the two variables.

Although a full discussion of mediation is beyond the scope of this chapter, a critical requirement of mediation is that there is a clear causal ordering from X1 ➔ MED ➔ Y. Particularly because mediation analyses are typically applied in nonexperimental studies, this is an important consideration. Frazier et al. (2004) have proposed that the most defensible strategy is a longitudinal study in which all three variables are measured on at least three occasions (e.g., Marsh, Trautwein, Lüdtke, Köller, & Baumert, 2005; Marsh & O'Mara, 2008). Without clear support for the causal ordering of the X1 and MED variables, traditional tests of mediation make no sense. Unless there is a clear causal ordering, it is not possible to rule out (empirically, theoretically, or commonsensically) the possibility that X1 and MED are reciprocally related (i.e., both X1 ➔ MED and MED ➔ X1; see Marsh & Craven, 2006, for further discussion of reciprocal effects). This point is at the heart of Judd and Kenny's (2010) critical re-evaluation of the Baron and Kenny (1986; Judd & Kenny, 1981) assumptions about what are necessary and sufficient conditions to demonstrate mediation. The issue of causality in mediation requires strong theory, a good design, appropriate statistical analyses, and probably an ongoing research programme that addresses the same issues from multiple perspectives. It is our contention that the vast majority of studies purporting to test mediation, particularly when based on a single wave of data, are either wrong or cannot be defended in relation to the typically implicit, untested assumption of causal ordering. Without strong assumptions of causality, tests of mediation are typically uninterpretable.

For purposes of this chapter, we argue that no causal ordering needs to be established between X1 and MOD to test interaction effects. The statistical tests cannot distinguish between the interpretations that MOD moderates X1 and that X1 moderates MOD; the two perspectives are equivalent in terms of statistical significance, variance explained, and so forth. Our recommendation is that unless there is a clearly established causal ordering, both possibilities should be considered, although one might be more substantively or theoretically useful (Brambor, Clark, & Golder, 2006). However, if a strong basis for causal ordering can be established from theory or the design of the study (e.g., with longitudinal data), then the interpretability of the results is greatly enhanced. In the evaluation of interaction effects there is, of course, the assumption that both X1 and MOD precede Y in the causal ordering, and tests of interactions typically are not symmetric in relation to X1 and Y (or MOD and Y). Thus, if there is no theoretical or substantive rationale for assuming X1 ➔ Y rather than Y ➔ X1 (as there would be, for example, if X1 were an intervention based on random assignment), Carte and Russell (2003) have suggested that researchers should consider alternative models in which X1 is treated as the outcome rather than the predictor variable.

Although we have emphasized the distinction between mediation and moderation, they are obviously not mutually exclusive. It is quite conceivable to have moderated mediation or mediated moderation. Here the issues of direction of causality become even more important (for further discussion, see Judd & Kenny, 2010; Muller, Judd, & Yzerbyt, 2005).

Interactions with More Than Two Continuous Variables

The moderated regression approach described above can be extended to include more than two independent variables, with multiple two-way or higher-order interactions involving more than two variables. Should we include all possible interactions in the regression equation? In general, the inclusion of interaction terms should be theory driven. It is also tempting to exclude nonsignificant effects to simplify the regression equation, increase the degrees of freedom, and increase the power of the remaining statistical tests. Nevertheless, Jaccard et al. (1990) and many others have recommended that interaction terms supported by strong theory, as well as the predictor variables used to construct those interaction terms, should always be included irrespective of whether they are significant. In particular, failure to include main effects (or lower-order interactions) will typically bias coefficient estimates for the higher-order interactions (see earlier discussion). Also, for hypothesized interaction effects that are based on theory, the values of even nonsignificant effects can be important, for example, for meta-analyses that seek to combine the effects from many different studies.

When multiple interaction effects are examined in the same regression equation, one approach to safeguard against inflated Type I error is to use an omnibus F-test to compare the squared multiple correlations with and without the entire set of interaction terms. Interactions are individually inspected only when this omnibus F-test is significant (Aiken & West, 1991; Frazier et al., 2004). Analogous to analyses of higher-order interactions in ANOVA, the interpretation of a three-way (or other higher-order) interaction in regression may not be straightforward. When a three-way interaction is significant, for example, we can construct simple slopes similar to those for two-way interactions. Operationally, to find the slope of X1 at certain values of X2 and X3 (or, symmetrically, the slope of X2 at certain values of X1 and X3), substitute these values of X2 and X3 into the full regression equation with known coefficients; the resulting coefficient of X1 will be the required slope. Corresponding t-tests on these coefficients can also be computed with the appropriate standard errors of the coefficients (see Jaccard et al., 1990), as in the case of two-way interactions.
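The simple-slope computation amounts to collecting every term of the fitted equation that multiplies X1. The sketch below uses hypothetical coefficient values (not estimates from the chapter) for the full three-way model y = b0 + b1·X1 + b2·X2 + b3·X3 + b4·X1X2 + b5·X1X3 + b6·X2X3 + b7·X1X2X3:

```python
# Simple slope of X1 in a three-way moderated regression: gather the
# terms that multiply X1 once X2 and X3 are fixed at chosen values.

def simple_slope_x1(b, x2, x3):
    """Slope of X1 conditional on X2 = x2 and X3 = x3; b = (b0, ..., b7)."""
    b0, b1, b2, b3, b4, b5, b6, b7 = b
    return b1 + b4 * x2 + b5 * x3 + b7 * x2 * x3

# Hypothetical fitted coefficients (b0 .. b7), for illustration only.
b = (1.0, 0.20, 0.10, -0.05, 0.30, -0.10, 0.00, 0.15)

# Probe the slope at +/- 1 SD of standardized X2 and X3.
for x2 in (-1.0, 1.0):
    for x3 in (-1.0, 1.0):
        print(f"X2 = {x2:+.0f}, X3 = {x3:+.0f}: "
              f"slope of X1 = {simple_slope_x1(b, x2, x3):+.3f}")
```

Probing the slope at a few substantively meaningful values of X2 and X3 (here ±1 SD of standardized moderators) is the same strategy used for two-way interactions, applied one level higher.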

Measurement error is likely to be an even larger problem for the evaluation of higher-order interactions. As noted earlier for two-way interactions, the measurement errors associated with each of the separate predictor variables combine multiplicatively, so that measurement error is typically much larger for the interaction terms. This issue is likely to be exacerbated even further for higher-order interactions. Hence, there is a need for latent variable models that control for unreliability. However, latent variable models of interactions have focused primarily on two-way interactions, and there has been little progress in extending these models to include higher-order interactions.

Tests of Measurement Invariance

Of particular substantive importance for applied psychological research are mean-level differences across multiple groups (e.g., male vs. female; age groups; single-sex vs. co-educational schools) or over time (i.e., observing the same group of participants on multiple occasions, perhaps before and after an intervention), as well as interactions between these variables. What has typically been ignored in such studies are tests of whether the variables have the same meaning in the different groups or on multiple occasions. When there are multiple indicators of the constructs, the typical approach is to test the invariance over groups or time of the factor structure and the item intercepts. An important assumption underlying tests of mean differences is that group differences in the latent mean are reflected in each of the multiple indicators of the latent construct. For example, if the underlying factor structure for the outcome variable Y differs for boys and girls, or over time, then interpretations of mean-level differences are problematic. Also, if mean-level differences are not consistent across the items used to infer the outcome, given comparable levels on the estimated outcome (i.e., there is differential item functioning), then the observed differences might be idiosyncratic to the particular items used. Obviously, these issues are critical in the interpretation of interaction effects. If the meaning of a predictor variable is qualitatively different for boys and girls, then it makes no sense to test the effect of the interaction between the predictor variable and gender.

The evaluation of model invariance over different groups (e.g., gender) or over time for the same group is widely applied in SEM studies (Jöreskog & Sörbom, 1988; Meredith & Teresi, 2006; Vandenberg & Lance, 2000). Indeed, such tests of invariance might be seen as a fundamental advantage of CFA over traditional approaches to EFA (but see also recent applications of exploratory structural equation modeling that integrate EFA and CFA approaches in the evaluation of measurement invariance; Marsh, Muthén et al., 2009; Marsh, Lüdtke et al., 2010). Tests of measurement invariance begin with a model with no invariance of any parameters (configural invariance), followed by tests of whether factor loadings are invariant over groups (weak invariance). Strong measurement invariance requires that the indicator intercepts and factor loadings are invariant over groups or time and is an assumption underlying the comparison of latent means. Strict measurement invariance requires invariance of item uniquenesses (i.e., each item's residual variance) in addition to invariant factor loadings and intercepts and is an assumption underlying the comparison of manifest means over groups or time. Unless there is support for at least strong measurement invariance, the comparison of latent means is not justified (because of the problem of differential item functioning). The comparison of manifest means requires the much stronger assumption of strict measurement invariance (as a necessary, but not a sufficient, condition).
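Successive invariance models (configural, weak, strong, strict) are nested, so they are commonly compared with a chi-square difference test. The sketch below is illustrative only: the fit statistics are hypothetical numbers, and the survival function uses the closed form that holds for even degrees of freedom.

```python
# Chi-square difference test between two nested invariance models.
# The closed-form survival function below is exact for even df
# (the Poisson/Erlang identity); the fit values are hypothetical.
import math

def chi2_sf_even_df(x, df):
    """P(chi2_df > x) for even df."""
    if df % 2 != 0:
        raise ValueError("closed form requires even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Hypothetical fits: configural model vs. weak (loadings) invariance.
chi2_configural, df_configural = 210.4, 100
chi2_weak, df_weak = 224.9, 108

delta_chi2 = chi2_weak - chi2_configural
delta_df = df_weak - df_configural
p = chi2_sf_even_df(delta_chi2, delta_df)
print(f"delta chi2 = {delta_chi2:.1f} on {delta_df} df, p = {p:.3f}")
```

A nonsignificant difference is taken as support for the more constrained model, although in practice chi-square difference tests are usually supplemented with changes in fit indices because the test is sensitive to sample size.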

The multiple-group approach to measurement invariance is easily formulated and tested in relation to a predictor that is a true grouping variable with a relatively small number of discrete categories. However, for continuous variables, the multiple-group approach would require researchers to transform continuous variables into a relatively small number of categories that constitute the multiple groups, a practice we have criticized earlier. Marsh, Tracey, and Craven (2006; see also Marsh, Lüdtke et al., 2010) have proposed a hybrid approach involving an integration of interpretations based on both multiple-indicator-multiple-cause (MIMIC) and multiple-group approaches. In the MIMIC approach, the predictor variables (X1 and X2) and their interaction are included in the model based on the total sample; they can be categorical or continuous, latent or manifest. In this respect, the distinction between the MIMIC and multiple-group approaches is analogous to the distinction between the multiple-group and single (total) group approaches already discussed in relation to moderated regression. The main difference is that in the MIMIC and multiple-group approaches as used here, some of the variables are latent variables based on multiple indicators.

Moderated multiple regression models like those considered in the first half of this chapter are based on manifest variables that might not have multiple indicators. Although some constructs might be appropriately measured with a single indicator that has no measurement error (e.g., gender), most cannot. However, tests involving single-indicator (manifest) variables make the same assumptions of measurement invariance as do latent variables. Indeed, the assumptions are even more stringent (strict measurement invariance is needed rather than strong measurement invariance). The problem is that typical approaches to measurement invariance cannot be used to test the appropriateness of these assumptions for single-indicator manifest constructs. Although issues of measurement invariance have been largely ignored in tests of interactions, this situation is likely to change with the increased focus on latent variable models in general and latent interactions in particular.

Multilevel Designs and Clustered Samples

There is a special type of interaction in which data points are related as clusters and are not collected fully independently (e.g., students' questionnaire responses from 100 schools, with data from the same school sharing some commonalities). The effects of the independent variables on the dependent variables (or their relations) may change according to the subgroup units (e.g., schools) or their characteristics. For example, parental influence on students' achievement may depend on the characteristics of the school (e.g., whether the student is studying in

a school with low or high average socioeconomic status). Indeed, a key question in many multilevel studies is how the effect of an individual-level variable varies from group to group, and whether there are group-level variables that can explain the group-level variation. Although the groups are typically considered a random effect, rather than a fixed effect, the issue is still an interaction question: is the effect of one variable moderated by another variable (e.g., the particular school that a student attends)?

Historically, multilevel researchers have tended to work with manifest variables that ignore measurement error, whereas SEM researchers have tended to work with latent variable models that ignore the multilevel structure of their data. However, these two dominant analytic approaches are increasingly being integrated within more comprehensive multilevel SEMs (e.g., Lüdtke, Marsh et al., 2008; Marsh, Lüdtke et al., 2009). Inevitably, this integration will lead to increased sophistication in the study of latent interactions, a better understanding of the cross-level interactions between individual-level and group-level variables, and a clearer distinction between the fixed effects that have been emphasized in our chapter and the random effects that are emphasized in multilevel analyses. Although clearly beyond the scope of our chapter (but see the chapter "Multilevel Regression and Multilevel SEM" in this monograph), this is an important emerging area in quantitative analysis.

Author Notes

The authors would like to thank the following colleagues for input in relation to earlier drafts of this chapter: Eric Dearing, Kathleen McCartney, David Kenny, Matthew R. Golder, and Thomas Brambor.

Note

1. As in many other statistical models, there are Bayesian and non-Bayesian approaches to estimating latent interaction models. Although well developed (see, e.g., Arminger & Muthén, 1998; Lee, Song, & Poon, 2004), Bayesian approaches and their computational algorithms are relatively difficult for general applied researchers, and thus, although available in the specialized WinBUGS software, these approaches have not been adopted in the most popular commercial SEM software (but see also the Bayesian approach recently introduced in Mplus).

References

Aguinis, H. (2004). Moderated regression. New York: Guilford.

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.


Algina, J., & Moulder, B. C. (2001). A note on estimating the Jöreskog-Yang model for latent variable interaction using LISREL 8.3. Structural Equation Modeling: A Multidisciplinary Journal, 8, 40–52.

Bagozzi, R. P., & Yi, Y. (1989). On the use of structural equation models in experimental designs. Journal of Marketing Research, 26, 271–284.

Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.

Brambor, T., Clark, W. R., & Golder, M. (2006). Understanding interaction models: Improving empirical analyses. Political Analysis, 14, 63–82.

Byrne, B. M. (1998). Structural equation modeling with LISREL, PRELIS, and SIMPLIS: Basic concepts, applications and programming. Mahwah, NJ: Erlbaum.

Carte, T. A., & Russell, C. J. (2003). In pursuit of moderation: Nine common errors and their solutions. Management Information Systems Quarterly, 27, 479–501.

Coenders, G., Batista-Foguet, J. M., & Saris, W. E. (2008). Simple, efficient and distribution-free approach to interaction effects in complex structural equation models. Quality & Quantity, 42, 369–396.

Cohen, J. (1968). Multiple regression as a general data-analytic system. Psychological Bulletin, 70, 426–443.

Cohen, J. (1978). Partialed products are interactions; partialed powers are curve components. Psychological Bulletin, 85, 858–866.

Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum.

Cohen, J., Cohen, P., Aiken, L. S., & West, S. G. (2003). Applied multiple regression/correlation analysis for the social sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum.

Cronbach, L. J., & Snow, R. E. (1977). Aptitudes and instructional methods: A handbook for research on interactions. Oxford, England: Irvington.

Darlington, R. B. (1990). Regression and linear models. New York: McGraw-Hill.

Dearing, E., & Hamilton, L. C. (2006). Contemporary approaches and classic advice for analyzing mediating and moderating variables. Monographs of the Society for Research in Child Development, 71 (Serial No. 285).

Frazier, P. A., Tix, A. P., & Barron, K. E. (2004). Testing moderator and mediator effects in counseling psychology. Journal of Counseling Psychology, 51, 115–134.

Friedrich, R. J. (1982). In defense of multiplicative terms in multiple regression equations. American Journal of Political Science, 26, 797–833.

Hayduk, L. A. (1987). Structural equation modeling with LISREL: Essentials and advances. Baltimore: Johns Hopkins University Press.

Holmbeck, G. N. (1997). Toward terminological, conceptual, and statistical clarity in the study of mediators and moderators: Examples from the child-clinical and pediatric psychology literatures. Journal of Consulting and Clinical Psychology, 65, 599–610.

Jaccard, J. (1998). Interaction effects in factorial analysis of variance. Thousand Oaks, CA: Sage.

Jaccard, J., Turrisi, R., & Wan, C. K. (1990). Interaction effects in multiple regression. Newbury Park, CA: Sage.

Jaccard, J., & Wan, C. K. (1995). Measurement error in the analysis of interaction effects between continuous predictors using multiple regression: Multiple indicator and structural equation approaches. Psychological Bulletin, 117, 348–357.

Johnson, P. O., & Neyman, J. (1936). Tests of certain linear hypotheses and their application to some educational problems. Statistical Research Memoirs, 1, 57–93.

Jöreskog, K. G. (1998). Interaction and nonlinear modeling: Issues and approaches. In R. E. Schumacker & G. A. Marcoulides (Eds.), Interaction and nonlinear effects in structural equation modeling (pp. 239–250). Mahwah, NJ: Erlbaum.

Jöreskog, K. G., & Sörbom, D. (1988). LISREL 7: A guide to the program and applications (2nd ed.). Chicago, IL: SPSS.

Jöreskog, K. G., & Yang, F. (1996). Nonlinear structural equation models: The Kenny-Judd model with interaction effects. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced structural equation modeling: Issues and techniques (pp. 57–88). Mahwah, NJ: Erlbaum.

Judd, C. M., & Kenny, D. A. (1981). Process analysis: Estimating mediation in treatment evaluations. Evaluation Review, 5, 602–619.

Judd, C. M., & Kenny, D. A. (2010). Data analysis in social psychology: Recent and recurring issues. In S. T. Fiske, D. T. Gilbert, & G. Lindzey (Eds.), Handbook of social psychology (5th ed., pp. 115–139). Hoboken, NJ: Wiley.

Judd, C. M., McClelland, G. H., & Ryan, C. S. (2009). Data analysis: A model comparison approach (2nd ed.). New York: Routledge.

Kelava, A., & Brandt, H. (in press). Estimation of nonlinear latent structural equation models using the extended unconstrained approach. Review of Psychology.

Kenny, D. A., & Judd, C. M. (1984). Estimating the nonlinear and interactive effects of latent variables. Psychological Bulletin, 96, 201–210.

King, G. (1986). How not to lie with statistics: Avoiding common mistakes in political science. American Journal of Political Science, 30, 666–687.

Kirk, R. (1982). Experimental design: Procedures for the behavioral sciences. Belmont, CA: Brooks/Cole.

Klein, A. G., & Moosbrugger, H. (2000). Maximum likelihood estimation of latent interaction effects with the LMS method. Psychometrika, 65, 457–474.

Klein, A. G., & Muthén, B. (2007). Quasi-maximum likelihood estimation of structural equation models with multiple interaction and quadratic effects. Multivariate Behavioral Research, 42, 647–674.

Klein, A. G., Schermelleh-Engel, K., Moosbrugger, H., & Kelava, A. (2009). Assessing spurious interaction effects. In T. Teo & M. S. Khine (Eds.), Structural equation modeling in educational research: Concepts and applications (pp. 13–28). Rotterdam, The Netherlands: Sense Publishers.

Lance, C. E. (1988). Residual centering, exploratory and confirmatory moderator analysis, and decomposition of effects in path models containing interactions. Applied Psychological Measurement, 12, 163–175.

Landram, F. G., & Alidaee, B. (1997). Computing orthogonal polynomials. Computers & Operations Research, 24, 473–476.

Lee, S. Y., Song, X. Y., & Poon, W. Y. (2004). Comparison of approaches in estimating interaction and quadratic effects of latent variables. Multivariate Behavioral Research, 39, 37–67.


Lin, G.-C., Wen, Z., Marsh, H. W., & Lin, H.-S. (2010). Structural equation models of latent interactions: Clarification of orthogonalizing and double-mean-centering strategies. Structural Equation Modeling, 17, 374–391.

Little, T. D., Bovaird, J. A., & Widaman, K. F. (2006). On the merits of orthogonalizing powered and product terms: Implications for modeling interactions among latent variables. Structural Equation Modeling, 13, 497–519.

Lubinski, D., & Humphreys, L. G. (1990). Assessing spurious "moderator effects": Illustrated substantively with the hypothesized ("synergistic") relation between spatial and mathematical ability. Psychological Bulletin, 107, 385–393.

Lüdtke, O., Marsh, H. W., Robitzsch, A., Trautwein, U., Asparouhov, T., & Muthén, B. (2008). The multilevel latent covariate model: A new, more reliable approach to group-level effects in contextual studies. Psychological Methods, 13, 203–229.

MacCallum, R. C., & Mar, C. M. (1995). Distinguishing between moderator and quadratic effects in multiple regression. Psychological Bulletin, 118, 405–421.

MacCallum, R. C., Zhang, S., Preacher, K. J., & Rucker, D. D. (2002). On the practice of dichotomization of quantitative variables. Psychological Methods, 7, 19–40.

Marquardt, D. W. (1980). You should standardize the predictor variables in your regression models. Journal of the American Statistical Association, 75, 87–91.

Marsh, H. W., & Craven, R. G. (2006). Reciprocal effects of self-concept and performance from a multidimensional perspective: Beyond seductive pleasure and unidimensional perspectives. Perspectives on Psychological Science, 1, 133–163.

Marsh, H. W., Dowson, M., Pietsch, J., & Walker, R. (2004). Why multicollinearity matters: A re-examination of relations between self-efficacy, self-concept and achievement. Journal of Educational Psychology, 96, 518–522.

Marsh, H. W., & Grayson, D. (1994). Longitudinal stability of latent means and individual differences: A unified approach. Structural Equation Modeling, 1, 317–359.

Marsh, H. W., & Hau, K.-T. (2007). Applications of latent-variable models in educational psychology: The need for methodological-substantive synergies. Contemporary Educational Psychology, 32, 151–171.

Marsh, H. W., Hau, K.-T., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler's (1999) findings. Structural Equation Modeling, 11, 320–341.

Marsh, H. W., Lüdtke, O., Muthén, B., Asparouhov, T., Morin,

AQ: Isthere anupdate toMarsh,Ludtke,Muthen,Asparouhov,Morin,Trautwein,&Nagengast?

A. J. S., Trautwein, U. & Nagengast, B. (in press). A new lookat the big-five factor structure through exploratory structuralequation modeling. Psychological Assessment.

Marsh, H. W., Muthén, B., Asparouhov, T., Lüdtke, O.,Robitzsch, A., Morin, A. J. S., & Trautwein, U. (2009).Exploratory Structural equation modeling, integrating CFAand EFA: Application to students’ evaluations of universityteaching. Structural Equation Modeling,16, 439–476.

Marsh, H. W., & O’Mara, A. (2008). Reciprocal effectsbetween academic self-concept, self-esteem, achievement,and attainment over seven adolescent years: Unidimensional

and multidimensional perspectives of self-concept. Personalityand Social Psychology Bulletin, 34, 542–552.

Marsh, H. W., Tracey, D. K., & Craven, R. G. (2006). Mul-tidimensional self-concept structure for preadolescents withmild intellectual disabilities: A hybrid multigroup-mimicapproach to factorial invariance and latent mean differences.Educational and Psychological Measurement, 66, 795–818.

Marsh, H. W., Trautwein, U., Lüdtke, O., Köller, O. &Baumert, J. (2005) Academic self-concept, interest, gradesand standardized test scores: Reciprocal effects models ofcausal ordering. Child Development, 76, 297–416.

Marsh, H. W., Wen, Z., & Hau, K.T. (2004). Structural equationmodels of latent interactions: Evaluation of alternative esti-mation strategies and indicator construction. PsychologicalMethods, 9, 275–300.

Marsh, H. W., Wen, Z., & Hau, K. T. (2006). Structuralequation models of latent interaction and quadratic effects. InG. Hancock & Mueller (Eds). Structural equation modeling :A second course (pp. 225–265). Greenwich, CT: InformationAge Publishing.

Marsh, H. W., Wen, Z. L., Hau, K. T, Little, T. D. Bovaird,J. A. & Widaman, K. F. (2007). Unconstrained struc-tural equation models of latent interactions: Contrastingresidual- and mean-centered approaches. Structural EquationModelling, 14, 570–580.

McClelland, G. H., & Judd, C. M. (1993). Statistical difficultiesof detecting interactions and moderator effects. PsychologicalBulletin, 114, 376–390.

Meredith, W., & Teresi, J. A. (2006). n Essay on Measurementand Factorial Invariance. edical Care, 44, S69–S77.

Moosbrugger, H., Schermelleh-Engel, K., Kelava, A., & Klein,A. G. (2009). Testing multiple nonlinear effects in structuralequation modeling: A comparison of alternative estimationapproaches. In T. Teo & M. S. Khine (Eds.), Structuralequation modeling in educational research: Concepts and appli-cations (pp.103–136). Rotterdam, The Netherlands: SensePublishers.

Morin, A. J. S., Janoz, M. & Larivée, S. (2009). The Mon-treal adolescent depression development project (MADDP):School life and depression following high school transition.Psychiatry Research Journal, 1(3), 1–50.

Moulder, B. C., & Algina, J. (2002). Comparison of methods for estimating and testing latent variable interactions. Structural Equation Modeling, 9, 1–19.

Muller, D., Judd, C. M., & Yzerbyt, V. Y. (2005). When moderation is mediated and mediation is moderated. Journal of Personality and Social Psychology, 89, 852–863.

Muthén, L. K., & Muthén, B. O. (1997–2008). Mplus user’s guide. Los Angeles, CA: Muthén & Muthén.

Ping, R. A., Jr. (1996). Latent variable interaction and quadratic effect estimation: A two-step technique using structural equation analysis. Psychological Bulletin, 119, 166–175.

Potthoff, R. F. (1964). On the Johnson-Neyman technique and some extensions thereof. Psychometrika, 29, 241–256.

Preacher, K. J., Curran, P. J., & Bauer, D. J. (2006). Computational tools for probing interactions in multiple linear regression, multilevel modeling, and latent curve analysis. Journal of Educational and Behavioral Statistics, 31, 437–448.

Rigdon, E. E., Schumacker, R. E., & Wothke, W. (1998). A comparative review of interaction and nonlinear modeling. In R. E. Schumacker & G. A. Marcoulides (Eds.), Interaction and nonlinear effects in structural equation modeling (pp. 1–16). Mahwah, NJ: Lawrence Erlbaum Associates.

Lin, G.-C., Wen, Z. L., Marsh, H. W., & Lin, H.-S. (2010). Structural equation models of latent interactions: Clarification of orthogonalizing and double-mean-centering strategies. Structural Equation Modeling, 17(3), 374–391.

Marsh, H. W., Lüdtke, O., Muthén, B., Asparouhov, T., Morin, A. J. S., Trautwein, U., & Nagengast, B. (2010). A new look at the big-five factor structure through exploratory structural equation modeling. Psychological Assessment, 22, 471–491.

Saris, W. E., Batista-Foguet, J. M., & Coenders, G. (2007). Selection of indicators for the interaction term in structural equation models with interaction. Quality & Quantity, 41, 55–72.

Schumacker, R. E., & Marcoulides, G. A. (Eds.) (1998). Interaction and nonlinear effects in structural equation modeling. Mahwah, NJ: Erlbaum.

Carte, T. A., & Russell, C. J. (2003). In pursuit of moderation: Nine common errors and their solutions. MIS Quarterly, 27, 479–501.

Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3, 4–69.

Wall, M. M., & Amemiya, Y. (2001). Generalized appended product indicator procedure for nonlinear structural equation analysis. Journal of Educational and Behavioral Statistics, 26, 1–29.

Wen, Z., Hau, K. T., & Marsh, H. W. (2008). Appropriate standardized estimates for moderating effects in structural equation models. Acta Psychologica Sinica, 40(6), 729–736.

Wen, Z., Marsh, H. W., & Hau, K. T. (2010). Structural equation models of latent interactions: An appropriate standardized solution and its scale-free properties. Structural Equation Modeling, 17, 1–22.

Wu, Y., Wen, Z., & Lin, G. C. (2009). Structural equation modeling of latent interactions without using the mean structure. Acta Psychologica Sinica, 41(12), 1252–1259.
