Advanced Topics in Regression: Quantile Regression, Analysis of Causality, Mediation Analysis, Hierarchical Linear Modeling. Compiled by Nick Evangelopoulos, 2013


Advanced Topics in Regression

Quantile Regression Analysis of Causality Mediation Analysis Hierarchical Linear Modeling

Compiled by Nick Evangelopoulos, 2013


Part 1: Quantile Regression


Motivation for Quantile Regression

Problem

• ANOVA and regression provide information only about the conditional mean.
• More knowledge about the distribution of the statistic may be important.
• The covariates may shift not only the location or scale of the distribution; they may affect its shape as well.

Solution

• Quantile regression models the relationship between X and the conditional quantiles of Y given X = x.


Quantile Definition

• Definition: Given p ∈ [0, 1], a pth quantile of a random variable Z is any number ζ_p such that Pr(Z < ζ_p) ≤ p ≤ Pr(Z ≤ ζ_p). The solution always exists, but need not be unique.

• Ex: Suppose Z takes the values {3, 4, 7, 9, 9, 11, 17, 21} with equal probability and p = 0.5. Then Pr(Z < 9) = 3/8 ≤ 1/2 ≤ Pr(Z ≤ 9) = 5/8, so the 50th percentile is equal to 9.
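The definition can be checked mechanically. The sketch below is illustrative only: the function name `is_pth_quantile` is made up for this example, and it treats the listed values as an equally weighted sample. It tests the two inequalities from the definition with exact fractions.

```python
from fractions import Fraction


def is_pth_quantile(zeta, sample, p):
    """Return True if zeta satisfies Pr(Z < zeta) <= p <= Pr(Z <= zeta)."""
    n = len(sample)
    below = Fraction(sum(z < zeta for z in sample), n)         # Pr(Z < zeta)
    at_or_below = Fraction(sum(z <= zeta for z in sample), n)  # Pr(Z <= zeta)
    return below <= p <= at_or_below


Z = [3, 4, 7, 9, 9, 11, 17, 21]
half = Fraction(1, 2)

# Check every distinct sample value as a candidate median.
medians = [z for z in sorted(set(Z)) if is_pth_quantile(z, Z, half)]
print(medians)  # [9]

# Non-uniqueness: with four distinct values, every number in [2, 3] is a median.
print(is_pth_quantile(Fraction(5, 2), [1, 2, 3, 4], half))  # True
```

The second call illustrates why the definition says "a" pth quantile rather than "the" pth quantile.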


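Quantile Regression

• A family of conditional quantiles of Y given X = x.
• The median regression line (p = 0.5) is the least absolute deviations fit. It targets the same line as OLS only when the conditional distribution of Y given x is symmetric, so that the conditional median and the conditional mean coincide.
• All of the quantile functions, the median included, are solutions to a set of linear programming problems.

[Figure: conditional quantile lines of Y against x at the 90%, 75%, 50%, 25%, and 10% levels.]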


Quantile Regression

[Figure: Daily High Temperature. A scatter of daily high temperature in Sydney, with Yesterday on the horizontal axis and Today on the vertical axis (both 0 to 50). The red line is the 45-degree line.]


Quantile Regression

[Figure: histogram titled "Cool Yesterday (n=259)". Horizontal axis: Temperature Today; vertical axis: Frequency.]


Quantile Regression

[Figure: histogram titled "Hot Yesterday (n=259)". Horizontal axis: Temperature Today; vertical axis: Frequency.]


Quantile Regression

Quantiles at .90, .75, .50, .25, and .10. Given yesterday’s temperature, today’s temperature has a conditional distribution that is not symmetric.

[Figure: Temperature Quantiles. Quantile lines of today's temperature (vertical axis, 0 to 60) against yesterday's temperature (horizontal axis, 5 to 45).]


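Quantile Regression: Estimation

• The quantile regression coefficients are the solution to

$$\min_{\beta} \; \frac{1}{n} \sum_{i=1}^{n} \left( p - \tfrac{1}{2} + \tfrac{1}{2}\,\operatorname{sgn}(y_i - x_i'\beta) \right) (y_i - x_i'\beta) \qquad (1)$$

• The k first order conditions are

$$\frac{1}{n} \sum_{i=1}^{n} \left( p - \tfrac{1}{2} + \tfrac{1}{2}\,\operatorname{sgn}(y_i - x_i'\hat{\beta}_p) \right) x_i = 0 \qquad (2)$$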
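The estimation objective weights each positive residual by p and each negative residual by 1 − p. As a rough sketch only, the code below evaluates that objective for a single-predictor line and finds a minimizer by brute force. It relies on the standard linear programming property that, when there are at least two distinct x values, some optimal line passes through two of the data points. The function names and the toy data are assumptions made for illustration, and a real analysis would use a dedicated routine such as the SAS QUANTREG procedure mentioned later in these slides.

```python
from itertools import combinations


def check_loss(u, p):
    """One term of the objective: (p - 1/2 + 1/2 * sgn(u)) * u."""
    sgn = (u > 0) - (u < 0)
    return (p - 0.5 + 0.5 * sgn) * u


def objective(xs, ys, b0, b1, p):
    """Average check loss of the line y = b0 + b1 * x."""
    return sum(check_loss(y - (b0 + b1 * x), p) for x, y in zip(xs, ys)) / len(xs)


def fit_quantile_line(xs, ys, p):
    """Brute-force search over lines through pairs of data points.

    Returns (objective value, intercept, slope), or None if all x are equal.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical pair does not define a regression line
        b1 = (y2 - y1) / (x2 - x1)
        b0 = y1 - b1 * x1
        value = objective(xs, ys, b0, b1, p)
        if best is None or value < best[0]:
            best = (value, b0, b1)
    return best


# Hypothetical data: four points on y = x plus one large outlier.
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 2, 3, 100]
value, b0, b1 = fit_quantile_line(xs, ys, p=0.5)
print(b0, b1)  # the median line ignores the outlier: intercept 0.0, slope 1.0
```

The search is cubic in the number of observations, so it is only meant to make the objective concrete; production solvers use simplex or interior point methods on the same linear program.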


Quantile Regression: Coefficient Interpretation

• The jth coefficient is the marginal change in the pth conditional quantile due to a marginal change in the jth element of x:

$$\beta_{pj} = \frac{\partial Q_p(y_i \mid x_i)}{\partial x_{ij}}$$

• There is no guarantee that the ith person will remain in the same quantile after her x is changed.


Quantile Regression: Bibliography

• Koenker and Hallock (2001), “Quantile Regression,” Journal of Economic Perspectives, Vol. 15, pp. 143-156.

• Buchinsky (1998), “Recent Advances in Quantile Regression Models,” Journal of Human Resources, Vol. 33, pp. 88-126.

• www.econ.uiuc.edu/~roger

• http://Lib.stat.cmu.edu/R/CRAN


Quantile Regression in SAS

Optional Reading: Colin (Lin) Chen, An Introduction to Quantile Regression and the QUANTREG Procedure, SUGI 30, Paper 213-30


Part 2: Analysis of Causality

For more information: BUSI 6280

The material presented here is based on a paper by Josef Brüderl (University of Mannheim, Germany)


Panel Data

• Methods for analysis of causality exploit a data structure of multi-dimensional longitudinal data, which is typically described in the statistics and econometrics literature as panel data.

• Panel data is defined as a combination of cross-section data, where data on one or more variables are collected at the same point in time, and time-series data, where data are collected at regular time intervals.

• Analysis of panel data will be performed using the TSCSREG procedure in the statistical package SAS (Allison 2005; Mohd Nor & Maarof 2007) and the xtreg command in the statistical package Stata (Brüderl 2005).
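To make the data structure concrete, here is a small sketch of a panel in long format (one row per unit and period) together with a fixed-effects "within" slope, which is one of the estimators that panel procedures of this kind offer. The data, the variable layout, and the function names are invented for illustration; this is not the SAS or Stata implementation.

```python
from collections import defaultdict

# Hypothetical long-format panel: (unit, period, x, y).
# Unit A has a high baseline and low x; unit B has a low baseline and high x.
panel = [
    ("A", 1, 1.0, 12.0), ("A", 2, 2.0, 14.0), ("A", 3, 3.0, 16.0),
    ("B", 1, 4.0, -2.0), ("B", 2, 5.0, 0.0), ("B", 3, 6.0, 2.0),
]


def pooled_slope(rows):
    """OLS slope of y on x, ignoring which unit each row belongs to."""
    xs = [r[2] for r in rows]
    ys = [r[3] for r in rows]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def within_slope(rows):
    """Fixed-effects slope: demean x and y within each unit, then pool."""
    by_unit = defaultdict(list)
    for unit, _period, x, y in rows:
        by_unit[unit].append((x, y))
    sxy = sxx = 0.0
    for obs in by_unit.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mx) * (y - my)
            sxx += (x - mx) ** 2
    return sxy / sxx


print(within_slope(panel))  # 2.0, the slope shared by both units
print(pooled_slope(panel))  # negative, because unit baselines are confounded with x
```

In this toy panel the repeated observations per unit are what allow the stable unit differences to be separated from the effect of x, which is the property the causal analysis methods exploit.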


References

• Allison, P.D. (2005). Fixed Effects Regression Methods for Longitudinal Data Using SAS. SAS Press.

• Brüderl, J. (2005). Panel Data Analysis. University of Mannheim, http://www2.sowi.uni-mannheim.de/lsssm/veranst/Panelanalyse.pdf (accessed October 15, 2012).

• Mohd Nor, A. H. S., & Maarof, F. (2007). “Panel Data Analysis Using SAS”. Proceedings of the 21st Annual SAS Malaysia Forum, 5th September 2007, Kuala Lumpur.

• Halaby, C. (2004). Panel Models in Sociological Research. Annual Review of Sociology, 30: 507-544.

• Wooldridge, J. (2002). Econometric Analysis of Cross Section and Panel Data. MIT Press.

• Wooldridge, J. (2003). Introductory Econometrics: A Modern Approach. Thomson. Chapters 13, 14.

• Baron and Kenny (1986)


Part 3: Mediation Analysis

For more information: BUSI 6280, EPSY 6270

The material presented here is based on Wikipedia


Mediation Models

• Mediation is a hypothesized causal chain in which one variable affects a second variable that, in turn, affects a third variable.
• The intervening variable, M, is the mediator. It “mediates” the relationship between a predictor, X, and an outcome, Y.
• a and b: direct effects of X on M and of M on Y, respectively.
• c’: direct effect of X on Y after accounting for M.

[Diagram: X → M (path a), M → Y (path b), X → Y (path c’).]


Baron and Kenny steps

• The Baron and Kenny (1986) approach is not the best, but many researchers are still using it.
• STEP 1: Conduct a simple regression analysis with X predicting Y to test for path c alone.
• c is the total effect of X on Y, estimated without taking M into account. This is not the same as c’ on the previous slide!

[Diagram: X → Y (path c); M is left out of the model.]


Baron and Kenny steps

• STEP 2: Conduct a simple regression analysis with X predicting M to test the significance of path a alone.

[Diagram: X → M (path a).]


Baron and Kenny steps

• STEP 3: Conduct a simple regression analysis with M predicting Y to test the significance of path b alone.
• The purpose of Steps 1-3 is to establish that zero-order relationships among the variables exist. If one or more of these relationships are non-significant, researchers usually conclude that mediation is not possible or likely.
• Assuming there are significant relationships from Steps 1 through 3, proceed to Step 4.

[Diagram: M → Y (path b).]


Baron and Kenny steps

• STEP 4: Conduct a multiple regression analysis with X and M predicting Y.
• In Step 4, some form of mediation is supported if the effect of M (path b) remains significant after controlling for X. If X is no longer significant when M is controlled, the finding supports full mediation. If X is still significant, the finding supports partial mediation.

[Diagram: M → Y (path b) and X → Y (path c’).]
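The four regressions can be sketched with ordinary least squares written out by hand. The code below is only an illustration under stated assumptions: the data are made up, the helper names are hypothetical, and it computes the path estimates without the significance tests that the Baron and Kenny steps call for (those would normally come from a statistics package).

```python
def mean(values):
    return sum(values) / len(values)


def simple_slope(x, y):
    """OLS slope of y on x (model with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def two_predictor_slopes(x, m, y):
    """OLS slopes of y on x and m (with intercept), from the 2x2 normal equations."""
    mx, mm, my = mean(x), mean(m), mean(y)
    dx = [v - mx for v in x]
    dm = [v - mm for v in m]
    dy = [v - my for v in y]
    sxx = sum(u * u for u in dx)
    smm = sum(u * u for u in dm)
    sxm = sum(u * v for u, v in zip(dx, dm))
    sxy = sum(u * v for u, v in zip(dx, dy))
    smy = sum(u * v for u, v in zip(dm, dy))
    det = sxx * smm - sxm ** 2  # zero only if x and m are perfectly collinear
    return (smm * sxy - sxm * smy) / det, (sxx * smy - sxm * sxy) / det


# Made-up illustrative data.
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
M = [1.5, 3.0, 2.5, 5.0, 4.5, 7.5, 6.0, 8.5, 8.0, 11.0]
Y = [2.0, 4.5, 3.5, 6.0, 7.5, 9.0, 8.0, 11.5, 10.0, 13.5]

c = simple_slope(X, Y)                      # Step 1: path c (X alone predicting Y)
a = simple_slope(X, M)                      # Step 2: path a
b_alone = simple_slope(M, Y)                # Step 3: M alone predicting Y
c_prime, b = two_predictor_slopes(X, M, Y)  # Step 4: paths c' and b

# With OLS estimates, c decomposes exactly into c' plus the product a * b.
print(abs(c - (c_prime + a * b)) < 1e-9)  # True
```

The final check shows why a drop from c to c’ is read as evidence of mediation: the difference is exactly the product of the two paths that run through M.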


Sobel steps

• STEP 1: Conduct a multiple regression analysis with X and M predicting Y: Y = b0 + b1X + b2M + e
• STEP 2: Conduct a simple regression analysis with X predicting M: M = b3 + b4X + u
• STEP 3: Compute the indirect effect as b_indirect = (b2)(b4). Significance is best determined using bootstrapping.

[Diagrams: X → M (path a); M → Y (path b) together with X → Y (path c’).]
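A percentile bootstrap for the indirect effect (b2)(b4) might look like the sketch below. Everything here is an assumption made for illustration: the data are invented, the function names are hypothetical, 2000 resamples and a 95% percentile interval are arbitrary choices, and resamples in which the regressions have no unique solution are simply redrawn.

```python
import random


def fit_paths(x, m, y):
    """Return (b2, b4) from Y = b0 + b1*X + b2*M + e and M = b3 + b4*X + u.

    Returns None when the sample is degenerate (no unique OLS solution).
    """
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    dx = [v - mx for v in x]
    dm = [v - mm for v in m]
    dy = [v - my for v in y]
    sxx = sum(u * u for u in dx)
    smm = sum(u * u for u in dm)
    sxm = sum(u * v for u, v in zip(dx, dm))
    sxy = sum(u * v for u, v in zip(dx, dy))
    smy = sum(u * v for u, v in zip(dm, dy))
    det = sxx * smm - sxm ** 2
    if det <= 1e-12 * sxx * smm:  # constant or collinear predictors
        return None
    b2 = (sxx * smy - sxm * sxy) / det  # Step 1: coefficient of M
    b4 = sxm / sxx                      # Step 2: coefficient of X
    return b2, b4


def bootstrap_indirect(x, m, y, reps=2000, seed=0):
    """Approximate 95% percentile bootstrap interval for b2 * b4."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(x)
    draws = []
    while len(draws) < reps:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        paths = fit_paths([x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx])
        if paths is not None:
            draws.append(paths[0] * paths[1])
    draws.sort()
    return draws[int(0.025 * reps)], draws[int(0.975 * reps) - 1]


# Made-up illustrative data.
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
M = [1.5, 3.0, 2.5, 5.0, 4.5, 7.5, 6.0, 8.5, 8.0, 11.0]
Y = [2.0, 4.5, 3.5, 6.0, 7.5, 9.0, 8.0, 11.5, 10.0, 13.5]

b2, b4 = fit_paths(X, M, Y)
indirect = b2 * b4  # Step 3: point estimate of the indirect effect
lower, upper = bootstrap_indirect(X, M, Y)
print(indirect, (lower, upper))
```

A common reading is that the indirect effect is treated as significant when the interval excludes zero. Whole cases are resampled together so that the dependence among X, M, and Y is preserved in every bootstrap sample.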


SEM approach

• The Structural Equation Modeling (SEM) approach is considered the best for testing mediation effects. In SEM, a single mediation model is tested.
• Full mediation and partial mediation models can be compared by fitting both as alternative models. The model with the better fit statistics is the more appropriate.

[Diagrams: full mediation, with paths X → M (a) and M → Y (b) only; partial mediation, with the additional direct path X → Y (c’).]


References

• Baron, R.M. & Kenny, D.A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182.

• MacKinnon, D.P. (2008). Introduction to statistical mediation analysis. Mahwah, NJ: Erlbaum.

• Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. In S. Leinhardt (Ed.), Sociological Methodology (pp. 290-312). Washington DC: American Sociological Association.


Part 4: Hierarchical Linear Modeling

For more information: BUSI 6480, EPSY 6230 (EPSY offered at the UNT College of Education)


Multilevel Models

• Multilevel models are particularly appropriate for research designs where the data for participants are organized at more than one level.
• Analysis of Covariance (ANCOVA) includes nested designs:
• Individuals nested within groups
• Companies nested within industries