8/13/2019 2013 06 19 Robust Regression
1/101
8/13/2019 2013 06 19 Robust Regression
2/101
8/13/2019 2013 06 19 Robust Regression
3/101
Introduction
Introduction
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 3/45
I d i
8/13/2019 2013 06 19 Robust Regression
4/101
Introduction
Modern regression
(OLS Ordinary Least Squares) regression:One of the most widely used statistical methods in socialsciences.
However, we will see that OLS regression does have limitations.
Modern regression methods can be seen as improvements/alternatives to the usual regression model.
Robust regression is one such modern method.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 4/45
I t d ti
8/13/2019 2013 06 19 Robust Regression
5/101
Introduction
Robust regression
The termrobusthas multiple interpretations in estimationframeworks:
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 5/45
Introduction
8/13/2019 2013 06 19 Robust Regression
6/101
Introduction
Robust regression
The termrobusthas multiple interpretations in estimationframeworks:
Robustness of validity: Resistance of the estimator to unusualobservations.A robustestimator should not suffer big changes when small changes are
made to the data.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 5/45
Introduction
8/13/2019 2013 06 19 Robust Regression
7/101
Introduction
Robust regression
The termrobusthas multiple interpretations in estimationframeworks:
Robustness of validity: Resistance of the estimator to unusualobservations.A robustestimator should not suffer big changes when small changes are
made to the data.
Robustness of efficiency: Resistance of the estimator toviolations of underlying distributional assumptions.A robustestimator should keep high precision (i.e., small SEs) when
distributional assumptions are violated.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 5/45
Introduction
8/13/2019 2013 06 19 Robust Regression
8/101
Introduction
Robust regression
The termrobusthas multiple interpretations in estimationframeworks:
Robustness of validity: Resistance of the estimator to unusualobservations.A robustestimator should not suffer big changes when small changes are
made to the data.
Robustness of efficiency: Resistance of the estimator toviolations of underlying distributional assumptions.A robustestimator should keep high precision (i.e., small SEs) when
distributional assumptions are violated.
We will mostly focus onrobustness of validityin regression.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 5/45
Introduction
8/13/2019 2013 06 19 Robust Regression
9/101
Introduction
Limitations of OLS regression: Example
Jasso, G. (1985). Marital coital frequency and the passage of time: Estimating
the separate effects of spouses ages and marital duration, birth and marriagecohorts, and period influences. American Sociological Review, 50(2), 224-41.doi:10.2307/2095411
Jasso (1985) found that wifes age had apositive effecton the
monthly coital frequency of married couples, after controlling forperiod and cohort effects.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 6/45
Introduction
8/13/2019 2013 06 19 Robust Regression
10/101
Introduction
Limitations of OLS regression: Example
Jasso, G. (1985). Marital coital frequency and the passage of time: Estimating
the separate effects of spouses ages and marital duration, birth and marriagecohorts, and period influences. American Sociological Review, 50(2), 224-41.doi:10.2307/2095411
Jasso (1985) found that wifes age had apositive effecton the
monthly coital frequency of married couples, after controlling forperiod and cohort effects.
Kahn and Udry (1986) questioned her findings:
They found four miscodes in the dataset. They found four additional outliersusing model diagnostics. They claim Jasso failed to consider a relevant interaction effect
(length of marriage by wifes age): Model misspecification.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 6/45
Introduction
8/13/2019 2013 06 19 Robust Regression
11/101
Limitations of OLS regression: Example
Jasso, G. (1985). Marital coital frequency and the passage of time: Estimating
the separate effects of spouses ages and marital duration, birth and marriagecohorts, and period influences. American Sociological Review, 50(2), 224-41.doi:10.2307/2095411
Jasso (1985) found that wifes age had apositive effecton the
monthly coital frequency of married couples, after controlling forperiod and cohort effects.
Kahn and Udry (1986) questioned her findings:
They found four miscodes in the dataset. They found four additional outliersusing model diagnostics. They claim Jasso failed to consider a relevant interaction effect
(length of marriage by wifes age): Model misspecification.
Total sample size: 2062.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 6/45
Introduction
8/13/2019 2013 06 19 Robust Regression
12/101
Limitations of OLS regression: Example
Model 1 Model 2Period 0.72 0.67
Log Wifes Age 27.61 13.56Log Husbands Age 6.43 7.87Log Marital Duration 1.50 1.56
Wife Pregnant 3.71
3.74
Child Under 6 0.56 0.68
Wife Employed 0.37 0.23Husband Employed 1.28 1.10
R2 .0475 .0612
n 2062 2054
Note. p< .10, p< .05, p< .01.
Model 1: Jassos (1985) original model.Model 2: Kahn and Udrys (1986) model excluding 4 miscodes and 4
outliers.Workshop Modern methods for robust regression, (# 152, Robert Andersen) 7/45
Introduction
8/13/2019 2013 06 19 Robust Regression
13/101
Limitations of OLS regression: Example
Some conclusions:
A small number of unusual observations can have a large effecton the estimated regression coefficients.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 8/45
8/13/2019 2013 06 19 Robust Regression
14/101
Introduction
8/13/2019 2013 06 19 Robust Regression
15/101
Limitations of OLS regression: Example
Some conclusions:
A small number of unusual observations can have a large effecton the estimated regression coefficients.
Large samples arenotimmune to this problem.(Previous example: 8/2062= 0.39% of data.)
Using diagnostic tools to uncover potential problems is a crucial(and often disregarded) analysis step.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 8/45
Introduction
8/13/2019 2013 06 19 Robust Regression
16/101
Limitations of OLS regression: Example
Some conclusions:
A small number of unusual observations can have a large effecton the estimated regression coefficients.
Large samples arenotimmune to this problem.(Previous example: 8/2062= 0.39% of data.)
Using diagnostic tools to uncover potential problems is a crucial(and often disregarded) analysis step.
The decision on what to do when influential observations arefound should be based onsubstantive knowledge(i.e., no one-way-out solution exists).
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 8/45
8/13/2019 2013 06 19 Robust Regression
17/101
Important background
8/13/2019 2013 06 19 Robust Regression
18/101
Properties of estimators
Assessing whether an estimator is robustrequires checking severalmathematical properties.
Notation:
population parameter that we intend to estimate
T estimator for Y sample (Y = (y1, . . . , yn))
estimate of (T(Y) =
)
n sample size
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 10/45
Important background
8/13/2019 2013 06 19 Robust Regression
19/101
Properties of estimators
1 Bias: Does the estimator give, on average, the desiredparameter?
bias=E[ ]
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 11/45
8/13/2019 2013 06 19 Robust Regression
20/101
8/13/2019 2013 06 19 Robust Regression
21/101
Important background
8/13/2019 2013 06 19 Robust Regression
22/101
Properties of estimators
4 Influence function(IF): Localmeasure of the resistance of anestimator.Ideal: Having aboundedinfluence function, implying that onesingle observation has limited influence on the estimator.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 12/45
Important background
8/13/2019 2013 06 19 Robust Regression
23/101
Properties of estimators
4 Influence function(IF): Localmeasure of the resistance of anestimator.Ideal: Having aboundedinfluence function, implying that onesingle observation has limited influence on the estimator.
5 Relative efficiency: Ratio ofMSEs of the estimator withsmallest MSE (say TOpt) and estimator T
Relative efficiency=
MSE(TOpt)
MSE(T)
0 Rel. Effic. 1, the larger the better.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 12/45
Important background
8/13/2019 2013 06 19 Robust Regression
24/101
Properties of estimators
Fact:
OLS estimators are the most efficient (i.e., with smallestMSE) among all unbiased linear estimators if all theregression model assumption are met.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 13/45
Important background
8/13/2019 2013 06 19 Robust Regression
25/101
Properties of estimators
Fact:
OLS estimators are the most efficient (i.e., with smallestMSE) among all unbiased linear estimators if all theregression model assumption are met.
As a consequence:Relative efficiency(more precisely, asymptoticefficiency) ofrobust estimators is assessed in comparison to the OLSestimators when model assumptions are met.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 13/45
8/13/2019 2013 06 19 Robust Regression
26/101
Important background
8/13/2019 2013 06 19 Robust Regression
27/101
Measures of location
Measure of location: Quantity that characterizes a position in a
distribution
Statistic Formula BDP IF In current use in
robust regression
Mean y=
i
yi
n 0 Unbounded No
-trimmed mean yt= y(g+1)++y(ng)
n2g Bounded Yes
Median M=Q.50 .50 Bounded Yes
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 14/45
Important background
8/13/2019 2013 06 19 Robust Regression
28/101
Measures of location: Influence function
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 15/45
8/13/2019 2013 06 19 Robust Regression
29/101
8/13/2019 2013 06 19 Robust Regression
30/101
Robustness, resistance, and OLS regression
Cl ifi i f l b i
8/13/2019 2013 06 19 Robust Regression
31/101
Classification of unusual observations
Observation Definition
Does it affect
regression estimates?
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 18/45
Robustness, resistance, and OLS regression
Cl ifi i f l b i
8/13/2019 2013 06 19 Robust Regression
32/101
Classification of unusual observations
Observation Definition
Does it affect
regression estimates?
Univariate outlier Outlier ofx ory
Not necessarily(unconditional on each other)
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 18/45
Robustness, resistance, and OLS regression
Cl ifi ti f l b ti
8/13/2019 2013 06 19 Robust Regression
33/101
Classification of unusual observations
Observation Definition
Does it affect
regression estimates?
Univariate outlier Outlier ofx ory
Not necessarily(unconditional on each other)
Regressionoutlier Unusual yvalue for a given x Not necessarily (A)
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 18/45
Robustness, resistance, and OLS regression
Cl ifi ti f l b ti
8/13/2019 2013 06 19 Robust Regression
34/101
Classification of unusual observations
Observation Definition
Does it affect
regression estimates?
Univariate outlier Outlier ofx ory
Not necessarily(unconditional on each other)
Regressionoutlier Unusual yvalue for a given x Not necessarily (A)
Observation withleverage Unusual x value Not necessarily (B)
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 18/45
Robustness, resistance, and OLS regression
Cl ssifi ti f s l bs ti s
8/13/2019 2013 06 19 Robust Regression
35/101
Classification of unusual observations
Observation Definition
Does it affect
regression estimates?
Univariate outlier Outlier ofx ory
Not necessarily(unconditional on each other)
Regressionoutlier Unusual yvalue for a given x Not necessarily (A)
Observation withleverage Unusual x value Not necessarily (B)
Observation withinfluence Regression estimates change
Yes (C)greatly with/without them
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 18/45
8/13/2019 2013 06 19 Robust Regression
36/101
8/13/2019 2013 06 19 Robust Regression
37/101
Robustness, resistance, and OLS regression
Classification of unusual observations
8/13/2019 2013 06 19 Robust Regression
38/101
Classification of unusual observations
Conclusion:
Being x-discrepant (leverage) ory-discrepant (regressionoutlier)is not sufficient to identifyinfluentialobservations.
However, it is a combination of both x- and y- discrepanciesthat determines the influence of an observation.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 19/45
Robustness, resistance, and OLS regression
Classification of unusual observations
8/13/2019 2013 06 19 Robust Regression
39/101
Classification of unusual observations
Conclusion:
Being x-discrepant (leverage) ory-discrepant (regressionoutlier)is not sufficient to identifyinfluentialobservations.
However, it is a combination of both x- and y- discrepanciesthat determines the influence of an observation.
Important note:Observations with high leverages that follow the main regressiontrend helpdecreasingthe SE of the estimates! Plot B
SE(b) = se
i(xi x)2
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 19/45
Robustness, resistance, and OLS regression
Classification of unusual observations
8/13/2019 2013 06 19 Robust Regression
40/101
Classification of unusual observations
Conclusion:
Being x-discrepant (leverage) ory-discrepant (regressionoutlier)is not sufficient to identifyinfluentialobservations.
However, it is a combination of both x- and y- discrepanciesthat determines the influence of an observation.
Important note:Observations with high leverages that follow the main regressiontrend helpdecreasingthe SE of the estimates! Plot B
SE(b) = se
i(xi x)2
How do we identifyregressionoutliers, observations withleverage,and observations withinfluence?
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 19/45
Robustness, resistance, and OLS regression
Identifying unusual observations
8/13/2019 2013 06 19 Robust Regression
41/101
Identifying unusual observations
n = number of observations (i=1, . . . , n)
k= number of predictors (j=1, . . . , k)
Detecting. . . Use. . . Rule of thumb?
Test? Plot?Flag if...
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 20/45
Robustness, resistance, and OLS regression
Identifying unusual observations
8/13/2019 2013 06 19 Robust Regression
42/101
Identifying unusual observations
n = number of observations (i=1, . . . , n)
k= number of predictors (j=1, . . . , k)
Detecting. . . Use. . . Rule of thumb?
Test? Plot?Flag if...
Leverage Hat values hi >2(k+ 1)/n a Index plot
a. For large samples. Use hi >3(k+1)/n for small samples.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 20/45
8/13/2019 2013 06 19 Robust Regression
43/101
Robustness, resistance, and OLS regression
Identifying unusual observations
8/13/2019 2013 06 19 Robust Regression
44/101
Identifying unusual observations
n = number of observations (i=1, . . . , n)
k= number of predictors (j=1, . . . , k)
Detecting. . . Use. . . Rule of thumb?
Test? Plot?Flag if...
Leverage Hat values hi >2(k+ 1)/n a Index plot
Regression outliers Studentized residuals | | 2 Yesb QQ-plot
InfluenceDFBETAs | | 2/n Index plotCooks D
> .5 c Index plot
>4/(n k 1)d
a. For large samples. Use hi >3(k+1)/n for small samples.b. Controlling for capitalization by chance is required.c. According to Cook and Weisberg (1999).d. According to Fox (1997).
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 20/45
Robustness, resistance, and OLS regression
Identifying unusual observations: Example
8/13/2019 2013 06 19 Robust Regression
45/101
Identifying unusual observations: Example
Weakliem, D.L., Andersen, R., & Heath, A.F. (2005). By popular demand:
The effect of public opinion on income quality. Comparative Sociology, 4(3),261-284. doi:10.1163/156913305775010124
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 21/45
Robustness, resistance, and OLS regression
Identifying unusual observations: Example
8/13/2019 2013 06 19 Robust Regression
46/101
Identifying unusual observations: Example
Weakliem, D.L., Andersen, R., & Heath, A.F. (2005). By popular demand:
The effect of public opinion on income quality. Comparative Sociology, 4(3),261-284. doi:10.1163/156913305775010124
We will focus on a subset of the original data: Sample: n=26 countries with democracies
8/13/2019 2013 06 19 Robust Regression
47/101
y g p
Weakliem, D.L., Andersen, R., & Heath, A.F. (2005). By popular demand:
The effect of public opinion on income quality. Comparative Sociology, 4(3),261-284. doi:10.1163/156913305775010124
We will focus on a subset of the original data: Sample: n=26 countries with democracies
8/13/2019 2013 06 19 Robust Regression
48/101
y g p
Weakliem, D.L., Andersen, R., & Heath, A.F. (2005). By popular demand:
The effect of public opinion on income quality. Comparative Sociology, 4(3),261-284. doi:10.1163/156913305775010124
We will focus on a subset of the original data: Sample: n=26 countries with democracies
8/13/2019 2013 06 19 Robust Regression
49/101
8/13/2019 2013 06 19 Robust Regression
50/101
8/13/2019 2013 06 19 Robust Regression
51/101
Robustness, resistance, and OLS regression
Identifying unusual observations: Example
8/13/2019 2013 06 19 Robust Regression
52/101
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 23/45
8/13/2019 2013 06 19 Robust Regression
53/101
Robustness, resistance, and OLS regression
Identifying unusual observations: Example
8/13/2019 2013 06 19 Robust Regression
54/101
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 23/45
8/13/2019 2013 06 19 Robust Regression
55/101
8/13/2019 2013 06 19 Robust Regression
56/101
8/13/2019 2013 06 19 Robust Regression
57/101
Robustness, resistance, and OLS regression
Identifying unusual observations: Example
8/13/2019 2013 06 19 Robust Regression
58/101
Conclusion:
Slovakia and Czech Republic seem to have a large influence onthe estimation of the regression coefficients. This can be confirmed:
OLS OLS(all countries) (omitting cz and sk)
b SE b SE
Intercept .028 .128 .107* .058Gini .00074 .0028 .00527*** .0013GDP .0175** .0079 .0063 .0037
s .138 .0602
R2
.175 .4622n 26 24
Note. p< .10, p< .05, p< .01.
Lets now look into better alternatives to OLS!
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 24/45
Robust regression for the linear model
8/13/2019 2013 06 19 Robust Regression
59/101
Robust regression for the
linear model
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 25/45
8/13/2019 2013 06 19 Robust Regression
60/101
8/13/2019 2013 06 19 Robust Regression
61/101
8/13/2019 2013 06 19 Robust Regression
62/101
Robust regression for the linear model
Various robust regression models
8/13/2019 2013 06 19 Robust Regression
63/101
Type Estimator Breakdown Bounded Asymptotic
Point Infl. Func. Efficiency
OLS 0 No 100%
L-EstimatorsLAV (Least Absolute Values) 0 Yes 64%LMS (Least Median of Squares) .5 Yes 37%
LTS (Least Trimmed Squares) .5 Yes 8%
R-Estimators Bounded influence estimator < .2 Yes 90%
M-EstimatorsM-estimates (Huber, biweight) 0 No 95%GM-estimates (Mal.& Schw.) 1/(p+1) Yes 95%GM-estimates (S1S) .5 Yes 95%
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 26/45
8/13/2019 2013 06 19 Robust Regression
64/101
Robust regression for the linear model
Various robust regression models
8/13/2019 2013 06 19 Robust Regression
65/101
Type Estimator Breakdown Bounded Asymptotic
Point Infl. Func. Efficiency
OLS 0 No 100%
L-EstimatorsLAV (Least Absolute Values) 0 Yes 64%LMS (Least Median of Squares) .5 Yes 37%
LTS (Least Trimmed Squares) .5 Yes 8%
R-Estimators Bounded influence estimator < .2 Yes 90%
M-EstimatorsM-estimates (Huber, biweight) 0 No 95%GM-estimates (Mal.& Schw.) 1/(p+1) Yes 95%GM-estimates (S1S) .5 Yes 95%
S-Estimators S-estimates .5 Yes 33%
GS-estimates .5 Yes 67%
MM-Estimators MM-estimates .5 Yes 95%
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 26/45
8/13/2019 2013 06 19 Robust Regression
66/101
8/13/2019 2013 06 19 Robust Regression
67/101
Robust regression for the linear model
M-Estimators
8/13/2019 2013 06 19 Robust Regression
68/101
Objectivefunctions:
Weightfunctions:
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 29/45
8/13/2019 2013 06 19 Robust Regression
69/101
Robust regression for the linear model
GM-, S-, GS-, MM-estimators
8/13/2019 2013 06 19 Robust Regression
70/101
GM-Estimators: Generalized M-estimators that take both vertical
outliers and leverage into account: Vertical outliers: By using M-estimators.
Leverage: Using hat values.
Method to retain: Schweppe one-step estimator (S1S).
S-Estimators: Minimize a robust M-estimate of the residual scale(instead ofY scale).
GS-Estimators: Generalized S-estimates that increase efficiency.
MM-Estimators: Rely on both M- (high efficiency) and S- (highBDP) estimators.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 31/45
Robust regression for the linear model
Example: Simulated data
8/13/2019 2013 06 19 Robust Regression
71/101
8/13/2019 2013 06 19 Robust Regression
72/101
Robust regression for the linear model
Example: Simulated data
8/13/2019 2013 06 19 Robust Regression
73/101
Robust regression for the linear model
Example: Simulated data
8/13/2019 2013 06 19 Robust Regression
74/101
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 32/45
8/13/2019 2013 06 19 Robust Regression
75/101
Robust regression for the linear model
Using robust regression as regression diagnostics
8/13/2019 2013 06 19 Robust Regression
76/101
Common statistics measuring influence of observations (e.g., Cooks
distance) are not robust against unusual obervations.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 34/45
8/13/2019 2013 06 19 Robust Regression
77/101
8/13/2019 2013 06 19 Robust Regression
78/101
8/13/2019 2013 06 19 Robust Regression
79/101
Robust regression for the linear model
Using robust regression as regression diagnostics
E l I d l f b i i h
8/13/2019 2013 06 19 Robust Regression
80/101
Example: Index plots of robust regression weights wi
Idea: Weights indicate levels of unusualness (y- and/or x-discrepancies) the smaller, the more unusual.
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 35/45
8/13/2019 2013 06 19 Robust Regression
81/101
Robust regression for the linear model
Using robust regression as regression diagnostics
E l RR l t (T k 1991)
8/13/2019 2013 06 19 Robust Regression
82/101
Example: RR-plots (Tukey, 1991)
Idea: Robust regression residuals are better than OLS residuals fordiagnosing outliers:
OLS regression tries to produce normal-looking residuals
even when the data themselves are not normal.(Rousseeuw & van Zomeren, 1990)
Workshop Modern methods for robust regression, (# 152, Robert Andersen) 36/45 Robust regression for the linear model
Using robust regression as regression diagnostics
E a le RR lots (T ke 1991)
8/13/2019 2013 06 19 Robust Regression
83/101
Example: RR-plots (Tukey, 1991)
Idea: Robust regression residuals are better than OLS residuals fordiagnosing outliers:
OLS regression tries to produce normal-looking residuals
even when the data themselves are not normal.(Rousseeuw & van Zomeren, 1990)
RR-plots: Residual-residual scatterplot matrix
OLS assumptions hold = scatter around y=x line(OLS vsrobust regression residuals)
Workshop Modern methods for robust regression (# 152 Robert Andersen) 36/45 Robust regression for the linear model
Using robust regression as regression diagnostics
8/13/2019 2013 06 19 Robust Regression
84/101
Workshop Modern methods for robust regression (# 152 Robert Andersen) 37/45
8/13/2019 2013 06 19 Robust Regression
85/101
Standard errors for robust regression
Standard errors for robust regression
Analytical (asymptotic) SEs are available for only some types of
8/13/2019 2013 06 19 Robust Regression
86/101
Analytical (asymptotic) SEs are available for only some types ofrobust regression: M- and GM-estimation. S- and GS-estimation. MM-estimation.
These estimates are given by
SE = Diagonalvar-cov matrix.
Workshop Modern methods for robust regression (# 152 Robert Andersen) 39/45 Standard errors for robust regression
Standard errors for robust regression
Analytical (asymptotic) SEs are available for only some types of
8/13/2019 2013 06 19 Robust Regression
87/101
Analytical (asymptotic) SEs are available for only some types ofrobust regression: M- and GM-estimation. S- and GS-estimation. MM-estimation.
These estimates are given by
SE = Diagonalvar-cov matrix. Some problems with SE: Unreliable for small n (say,
8/13/2019 2013 06 19 Robust Regression
88/101
Analytical (asymptotic) SEs are available for only some types ofrobust regression: M- and GM-estimation. S- and GS-estimation. MM-estimation.
These estimates are given by
SE = Diagonalvar-cov matrix. Some problems with SE: Unreliable for small n (say,
8/13/2019 2013 06 19 Robust Regression
89/101
Bootstrapping is useful to estimate difficult/unknown sampling
distributions.
Workshop Modern methods for robust regression (# 152 Robert Andersen) 40/45 Standard errors for robust regression
Computing standard errors: Bootstrapping
Bootstrapping is useful to estimate difficult/unknown sampling
8/13/2019 2013 06 19 Robust Regression
90/101
Bootstrapping is useful to estimate difficult/unknown sampling
distributions.Remember: If OLS assumptions are met, then OLS estimates for theSEs are better than bootstrap estimates!
Workshop Modern methods for robust regression (# 152 Robert Andersen) 40/45
8/13/2019 2013 06 19 Robust Regression
91/101
8/13/2019 2013 06 19 Robust Regression
92/101
8/13/2019 2013 06 19 Robust Regression
93/101
Standard errors for robust regression
Computing bootstrapping CIs
Having estimated regression coefficients plustheir associated
8/13/2019 2013 06 19 Robust Regression
94/101
g g p
(bootstrapped) SEs, it is now possible to compute (1
)% CI.
Workshop Modern methods for robust regression (# 152 Robert Andersen) 42/45 Standard errors for robust regression
Computing bootstrapping CIs
Having estimated regression coefficients plustheir associated
8/13/2019 2013 06 19 Robust Regression
95/101
(bootstrapped) SEs, it is now possible to compute (1 )% CI. There are some possibilities:
Bootstrap t CI: When bias is small and the bootstrap samplingdistribution is roughly normally distributed.
CI= tnk1,/2SE()
Workshop Modern methods for robust regression (# 152 Robert Andersen) 42/45
8/13/2019 2013 06 19 Robust Regression
96/101
Standard errors for robust regression
Computing bootstrapping CIs
Having estimated regression coefficients plustheir associated
8/13/2019 2013 06 19 Robust Regression
97/101
(bootstrapped) SEs, it is now possible to compute (1 )% CI. There are some possibilities:
Bootstrap t CI: When bias is small and the bootstrap samplingdistribution is roughly normally distributed.
CI= tnk1,/2SE() Bootstrap percentile CI: When bias is small but the bootstrap
sampling distribution deviates from the normal distribution.
CI= (q/2(), q1/2()), =bootstrapped dist.
Bias-corrected percentile CI: When bias is large (e.g., for smallsample sizes).
W ksh M d th ds f b st ssi (# 152 R b t A d s ) 42/45
8/13/2019 2013 06 19 Robust Regression
98/101
Standard errors for robust regression
Example: Public opinion about pay inequality
8/13/2019 2013 06 19 Robust Regression
99/101
Bootstrap t CIs
B Boot. SE Lower 95% Upper 95%
Intercept .0978 .0580Gini .0051 .0012 .0026 .0076GDP .0057 .0033 .0015 .0129
W k h M d th d f b t i (# 152 R b t A d ) 43/45 Robust regression in R
8/13/2019 2013 06 19 Robust Regression
100/101
Robust regression in R
W k h M d th d f b t i (# 152 R b t A d ) 44/45 Robust regression in R
Robust regression in R
Fitting regression models
8/13/2019 2013 06 19 Robust Regression
101/101
Fitting regression models
Reg. Model R package R command
OLS lm(SECPAY gini + GDP)L-Estimation (LAV) quantreg rq(SECPAY gini + GDP)M-Estimation (Huber) MASS rlm(SECPAY gini + GDP)M-Estimation (Biweight) MASS rlm(SECPAY gini + GDP,psi=psi.bisquare)MM-Estimation MASS rlm(SECPAY
gini + GDP,method="MM")
Model diagnostics (stats package, loaded by default)
Statistic Diagnosing. . . R command
Hat values Leverage hatvalues(lm(SECPAY
gini + GDP))
Studentized residual Reg. outlier rstudent(lm(SECPAY gini + GDP))DFBETA Influence dfbeta(lm(SECPAY gini + GDP))Cooks D Influence cooks.distance(lm(SECPAY gini + GDP))
W k h M d h d f b i (# 152 R b A d ) 45/45
Top Related