Advanced Topics in Regression
Quantile Regression, Analysis of Causality, Mediation Analysis, Hierarchical Linear Modeling
Compiled by Nick Evangelopoulos, 2013
Part 1: Quantile Regression
Motivation for Quantile Regression
Problem
• ANOVA and regression provide information only about the conditional mean.
• More knowledge about the distribution of the statistic may be important.
• The covariates may shift not only the location or scale of the distribution; they may affect its shape as well.
Solution
• Quantile regression models the relationship between X and the conditional quantiles of Y given X = x.
Quantile Definition
• Definition: Let p ∈ [0, 1]. A pth quantile of a random variable Z is any number ζ_p such that Pr(Z < ζ_p) ≤ p ≤ Pr(Z ≤ ζ_p). A solution always exists but need not be unique.
• Example: Suppose Z takes the values {3, 4, 7, 9, 9, 11, 17, 21} with equal probability and p = 0.5. Then Pr(Z < 9) = 3/8 ≤ 1/2 ≤ Pr(Z ≤ 9) = 5/8, so the 50th percentile (the median) is equal to 9.
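The definition above can be checked mechanically. The following is a minimal sketch, assuming Z is treated as an empirical distribution in which every listed value has equal probability; the function name `is_quantile` and the second data set `W` are illustrative and not part of the original slides.

```python
from fractions import Fraction

def is_quantile(values, p, candidate):
    """Return True if Pr(Z < candidate) <= p <= Pr(Z <= candidate)."""
    n = len(values)
    below = Fraction(sum(v < candidate for v in values), n)
    at_or_below = Fraction(sum(v <= candidate for v in values), n)
    return below <= p <= at_or_below

p = Fraction(1, 2)

# The example from the slide: only 9 satisfies the definition.
Z = [3, 4, 7, 9, 9, 11, 17, 21]
medians_Z = [z for z in sorted(set(Z)) if is_quantile(Z, p, z)]

# A hypothetical data set where the median is not unique:
# both 2 and 3 (and every number in between) satisfy the definition.
W = [1, 2, 3, 4]
medians_W = [w for w in sorted(set(W)) if is_quantile(W, p, w)]

print(medians_Z, medians_W)
```

Exact fractions are used so that comparisons such as 3/8 ≤ 1/2 are not affected by floating-point rounding.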
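Quantile Regression
• A family of conditional quantiles of Y given X = x.
• The median regression line coincides with the OLS regression line only when the conditional distribution of Y given X is symmetric, so that the conditional mean and median are equal; in general, OLS estimates the conditional mean. The quantile functions, including the median, are solutions to a set of linear programming problems.
[Figure: Y plotted against x, with conditional quantile lines at 90%, 75%, 50%, 25%, and 10%.]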
Quantile Regression
[Figure: Daily High Temperature. A scatter of daily high temperature in Sydney, with Yesterday on the horizontal axis and Today on the vertical axis (both 0 to 50). The red line is the 45-degree line.]
Quantile Regression
[Figure: histogram, "Cool Yesterday (n=259)". Horizontal axis: Temperature Today; vertical axis: Frequency.]
Quantile Regression
[Figure: histogram, "Hot Yesterday (n=259)". Horizontal axis: Temperature Today; vertical axis: Frequency.]
Quantile Regression
Quantiles at .90, .75, .50, .25, and .10. Given yesterday's temperature, today's temperature has an expected distribution that is not symmetric.
[Figure: Temperature Quantiles. Today's temperature plotted against yesterday's temperature, with the fitted quantile lines.]
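Quantile Regression
Estimation
• The quantile regression coefficients are the solution to
$$\min_{\beta} \; \frac{1}{n} \sum_{i=1}^{n} \left( p - \tfrac{1}{2} + \tfrac{1}{2}\,\operatorname{sgn}(y_i - x_i'\beta) \right) (y_i - x_i'\beta) \qquad (1)$$
• The k first order conditions are
$$\frac{1}{n} \sum_{i=1}^{n} \left( p - \tfrac{1}{2} + \tfrac{1}{2}\,\operatorname{sgn}(y_i - x_i'\hat{\beta}_p) \right) x_i = 0 \qquad (2)$$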
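To make the estimation problem concrete, here is a small sketch of the minimization for a single regressor plus an intercept. It is not the algorithm used by any particular package. It relies on a known property of the underlying linear program: with k = 2 coefficients, some optimal line passes through at least two data points, so for a tiny sample one can simply try every pair. The data and function names are hypothetical.

```python
from itertools import combinations

def check_loss(p, intercept, slope, xs, ys):
    """Mean of (p - 1/2 + 1/2*sgn(r)) * r over residuals r = y - x'beta."""
    total = 0.0
    for x, y in zip(xs, ys):
        r = y - (intercept + slope * x)
        sgn = (r > 0) - (r < 0)
        total += (p - 0.5 + 0.5 * sgn) * r
    return total / len(xs)

def fit_quantile_line(p, xs, ys):
    """Brute-force search over lines through pairs of data points."""
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line cannot be written as y = b0 + b1*x
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = check_loss(p, intercept, slope, xs, ys)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

def ols_line(xs, ys):
    """Ordinary least squares line, for comparison with the median fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

# Hypothetical data: four points on y = 1 + 2x and one large outlier.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 30]

median_fit = fit_quantile_line(0.5, xs, ys)
ols_fit = ols_line(xs, ys)
print("median regression:", median_fit)
print("OLS:", ols_fit)
```

The median fit stays on the line through the four regular points, while the OLS line is pulled toward the outlier. The pairwise search costs O(n³) and is only meant for illustration; production software solves the same problem with linear programming methods.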
Quantile Regression
Coefficient Interpretation
• The jth coefficient is the marginal change in the pth conditional quantile due to a marginal change in the jth element of x:
$$\beta_{pj} = \frac{\partial Q_p(y_i \mid x_i)}{\partial x_{ij}}$$
• There is no guarantee that the ith person will remain in the same quantile after her x is changed.
Quantile Regression
Bibliography
• Koenker and Hallock (2001), "Quantile Regression," Journal of Economic Perspectives, Vol. 15, pp. 143-156.
• Buchinsky (1998), "Recent Advances in Quantile Regression Models," Journal of Human Resources, Vol. 33, pp. 88-126.
• www.econ.uiuc.edu/~roger
• http://Lib.stat.cmu.edu/R/CRAN
Quantile Regression in SAS
Optional Reading: Colin (Lin) Chen, An Introduction to Quantile Regression and the QUANTREG Procedure, SUGI 30, Paper 213-30
Part 2: Analysis of Causality
For more information: BUSI 6280
The material presented here is based on a paper by Josef Brüderl (University of Mannheim, Germany)
Panel Data
• Methods for the analysis of causality exploit multi-dimensional longitudinal data, a structure typically described in the statistics and econometrics literature as panel data.
• Panel data combine cross-section data, in which one or more variables are observed at a single point in time, with time-series data, in which observations are collected at regular time intervals.
• Analysis of panel data will be performed using the TSCSREG procedure in the statistical package SAS (Allison 2005; Mohd Nor & Maarof 2007) and the xtreg command in the statistical package Stata (Brüderl 2005).
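As an illustration of why the panel structure matters, the sketch below compares a pooled regression with a within (fixed-effects) estimate on a tiny, made-up panel. This is an assumption about which estimator is of interest, chosen because fixed effects is the topic of the cited Allison (2005) text; it does not reproduce SAS or Stata output. The units, values, and function names are all hypothetical.

```python
from collections import defaultdict

def slope(xs, ys):
    """Least-squares slope of y on x (with an intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def within_slope(rows):
    """Fixed-effects slope: demean x and y within each unit, then regress."""
    by_unit = defaultdict(list)
    for unit, _time, x, y in rows:
        by_unit[unit].append((x, y))
    x_dev, y_dev = [], []
    for obs in by_unit.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            x_dev.append(x - mx)
            y_dev.append(y - my)
    return sum(a * b for a, b in zip(x_dev, y_dev)) / sum(a * a for a in x_dev)

# Hypothetical panel in long format: (unit, time, x, y).
# Built as y = unit effect + 2*x, with unit effects 10 (A) and 0 (B).
panel = [
    ("A", 1, 1.0, 12.0), ("A", 2, 2.0, 14.0), ("A", 3, 3.0, 16.0),
    ("B", 1, 4.0, 8.0),  ("B", 2, 5.0, 10.0), ("B", 3, 6.0, 12.0),
]

pooled = slope([r[2] for r in panel], [r[3] for r in panel])
within = within_slope(panel)
print("pooled slope:", pooled)
print("within slope:", within)
```

Because the unit with the higher baseline also has the lower x values, the pooled slope comes out negative even though the effect of x within every unit is exactly 2. Removing unit means recovers the within-unit effect.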
References
• Allison, P.D. (2005). Fixed Effects Regression Methods for Longitudinal Data Using SAS. SAS Press.
• Brüderl, J. (2005). Panel Data Analysis. University of Mannheim, http://www2.sowi.uni-mannheim.de/lsssm/veranst/Panelanalyse.pdf (accessed October 15, 2012).
• Mohd Nor, A. H. S., & Maarof, F. (2007). Panel Data Analysis Using SAS. Proceedings of the 21st Annual SAS Malaysia Forum, 5th September 2007, Kuala Lumpur.
• Halaby, C. (2004). Panel Models in Sociological Research. Annual Review of Sociology, 30, 507-544.
• Wooldridge, J. (2002). Econometric Analysis of Cross Section and Panel Data. MIT Press.
• Wooldridge, J. (2003). Introductory Econometrics: A Modern Approach. Thomson. Chapters 13, 14.
Baron and Kenny (1986)
Part 3: Mediation Analysis
For more information: BUSI 6280, EPSY 6270
The material presented here is based on Wikipedia
Mediation Models
• Mediation is a hypothesized causal chain in which one variable affects a second variable that, in turn, affects a third variable. The intervening variable, M, is the mediator. It "mediates" the relationship between a predictor, X, and an outcome, Y.
• a and b: direct effects of X on M and of M on Y, respectively
• c': direct effect of X on Y after accounting for M
[Diagram: X → M (path a), M → Y (path b), X → Y (path c')]
Baron and Kenny steps
• The Baron and Kenny (1986) approach is not the best, but many researchers still use it.
• STEP 1: Conduct a simple regression analysis with X predicting Y to test for path c alone.
• c is the total effect of X on Y, without taking M into account. This is not the same as c' on the previous slide!
[Diagram: X → Y (path c)]
Baron and Kenny steps
• STEP 2: Conduct a simple regression analysis with X predicting M to test the significance of path a alone.
[Diagram: X → M (path a)]
Baron and Kenny steps
• STEP 3: Conduct a simple regression analysis with M predicting Y to test the significance of path b alone.
• The purpose of Steps 1-3 is to establish that zero-order relationships among the variables exist. If one or more of these relationships are non-significant, researchers usually conclude that mediation is not possible or likely.
• Assuming there are significant relationships from Steps 1 through 3, proceed to Step 4.
[Diagram: M → Y (path b)]
Baron and Kenny steps
• STEP 4: Conduct a multiple regression analysis with X and M predicting Y.
• In Step 4, some form of mediation is supported if the effect of M (path b) remains significant after controlling for X. If X is no longer significant when M is controlled, the finding supports full mediation. If X is still significant, the finding supports partial mediation.
[Diagram: M → Y (path b), X → Y (path c')]
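The four regressions in the Baron and Kenny steps can be sketched as follows. This is an illustration on simulated data, not a full implementation: it estimates the path coefficients only and leaves out the significance tests that each step calls for. The simulated "true" paths (a = 0.5, b = 0.7, c' = 0.3), the seed, and all function names are assumptions made for this example.

```python
import random

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def two_predictor_ols(x1, x2, y):
    """Return (b0, b1, b2) for y = b0 + b1*x1 + b2*x2 via the normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1y = sum((u - m1) * (w - my) for u, w in zip(x1, y))
    s2y = sum((v - m2) * (w - my) for v, w in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2

# Simulated data with assumed paths a = 0.5, b = 0.7, c' = 0.3.
rng = random.Random(42)
X = [rng.gauss(0, 1) for _ in range(200)]
M = [0.5 * x + rng.gauss(0, 1) for x in X]
Y = [0.3 * x + 0.7 * m + rng.gauss(0, 1) for x, m in zip(X, M)]

_, c = simple_ols(X, Y)                      # Step 1: X -> Y (path c)
_, a = simple_ols(X, M)                      # Step 2: X -> M (path a)
_, b_alone = simple_ols(M, Y)                # Step 3: M -> Y, ignoring X
_, c_prime, b = two_predictor_ols(X, M, Y)   # Step 4: X and M -> Y

print(f"c = {c:.3f}, a = {a:.3f}, b (alone) = {b_alone:.3f}")
print(f"c' = {c_prime:.3f}, b (controlling for X) = {b:.3f}")
print(f"c' + a*b = {c_prime + a * b:.3f}")
```

When all regressions are fitted by least squares on the same sample, the Step 1 coefficient decomposes exactly as c = c' + a·b, which is why the last printed value matches c.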
Sobel steps
• STEP 1: Conduct a multiple regression analysis with X and M predicting Y: Y = b0 + b1X + b2M + e
• STEP 2: Conduct a simple regression analysis with X predicting M: M = b3 + b4X + u
• STEP 3: Compute the indirect effect as b_indirect = (b2)(b4). Significance is best determined using bootstrapping.
[Diagrams: X → M (path a); M → Y (path b), X → Y (path c')]
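A minimal sketch of the three steps, with a percentile bootstrap for the indirect effect, might look like the following. The data are simulated, and the number of resamples, the seeds, the 95% level, and the function names are all assumptions made for illustration; other bootstrap intervals (for example bias-corrected ones) are also used in practice.

```python
import random

def cov(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))

def indirect_effect(X, M, Y):
    """b2 from Y = b0 + b1*X + b2*M + e, b4 from M = b3 + b4*X + u; return b2*b4."""
    sxx, smm, sxm = cov(X, X), cov(M, M), cov(X, M)
    sxy, smy = cov(X, Y), cov(M, Y)
    b2 = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)  # coefficient on M
    b4 = sxm / sxx                                         # coefficient on X
    return b2 * b4

def bootstrap_ci(X, M, Y, reps=1000, level=0.95, seed=0):
    """Percentile bootstrap interval: resample cases with replacement."""
    rng = random.Random(seed)
    n = len(X)
    estimates = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([X[i] for i in idx],
                                         [M[i] for i in idx],
                                         [Y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * reps)]
    upper = estimates[int((1 + level) / 2 * reps) - 1]
    return lower, upper

# Simulated data (assumed paths: X -> M is 0.5, M -> Y is 0.7, X -> Y is 0.3).
rng = random.Random(1)
X = [rng.gauss(0, 1) for _ in range(200)]
M = [0.5 * x + rng.gauss(0, 1) for x in X]
Y = [0.3 * x + 0.7 * m + rng.gauss(0, 1) for x, m in zip(X, M)]

b_indirect = indirect_effect(X, M, Y)
ci_low, ci_high = bootstrap_ci(X, M, Y)
print(f"indirect effect = {b_indirect:.3f}")
print(f"95% percentile bootstrap interval: ({ci_low:.3f}, {ci_high:.3f})")
```

An interval that excludes zero is usually read as evidence of a non-zero indirect effect. Bootstrapping avoids assuming that the product b2·b4 is normally distributed.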
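SEM approach
• The Structural Equation Modeling (SEM) approach is considered the best for testing mediation effects. In SEM, a single mediation model is tested.
• Full mediation and partial mediation models can be compared by fitting both as alternative models. The model that fits the data better is the more appropriate. Because the full mediation model is the partial mediation model with c' fixed at zero, the two are nested and can be compared with a chi-square difference test.
[Diagram, partial mediation: X → M (path a), M → Y (path b), X → Y (path c')]
[Diagram, full mediation: X → M (path a), M → Y (path b)]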
References
• Baron, R.M. & Kenny, D.A. (1986). The Moderator-Mediator variable distinction in Social Psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182.
• MacKinnon, D.P. (2008). Introduction to statistical mediation analysis. Mahwah, NJ: Erlbaum.
• Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. In S. Leinhardt (Ed.), Sociological Methodology (pp. 290-312). Washington DC: American Sociological Association.
Part 4: Hierarchical Linear Modeling
For more information: BUSI 6480, EPSY 6230 (EPSY offered at the UNT College of Education)
Multilevel Models
• Multilevel models are particularly appropriate for research designs where the data for participants are organized at more than one level.
• Analysis of Covariance (ANCOVA) includes nested designs:
• Individuals nested within groups
• Companies nested within industries