Robust Regression. Regression Methods We are going to look at three approaches to robust...

52
Robust Regression

description

 We will look at a model that predicts the api 2000 scores  Our focus is whether the average class size in K through 3 (acs_k3) and average class size 4 through 6 (acs_46) affect the academic performance

Transcript of Robust Regression. Regression Methods We are going to look at three approaches to robust...

Page 1: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

Robust Regression

Page 2: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

Regression Methods We are going to look at three approaches

to robust regression: Regression with robust standard errors Regression with robust standard errors

including the cluster option Regression with random effect Regression with fixed effect

Page 3: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

We will look at a model that predicts the api 2000 scores Our focus is whether the average class size in K through 3 (acs_k3) and average class size 4 through 6 (acs_46) affect the academic performance

Page 4: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

use a new data set http://www.ats.ucla.edu/stat/stata/webbooks/reg/elemapi2

Page 5: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 6: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 7: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 8: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

4.1.1 Regression with Robust Standard Errors The Stata regress command includes a robust option for estimating the standard errors using the Huber-White sandwich estimators. Such robust standard errors can deal with a collection of minor concerns about failure to meet assumptions,

Minor problems about normality Heteroscedasticity Some observations that exhibit large residuals, leverage or influence.

Page 9: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

With the robust option, the point estimates of the coefficients are exactly the same as in ordinary OLS, but the standard errors take into account issues concerning heterogeneity and lack of normality.

Page 10: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 11: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 12: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

As with the robust option, the estimate of the coefficients are the same as the OLS estimates, but the standard errors take into account that the observations within districts are non-independent. 

If you have a very small number of clusters compared to your overall sample size it is possible that the standard errors could be quite larger than the OLS results. For example, if there were only 3 districts, the standard errors would be computed on the aggregate scores for just 3 districts. 

Page 13: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

Using the Cluster Option The elemapi2 dataset contains data on 400 schools that come from 37 school districts. It is very possible that the scores within each school district may not be independent, and this could lead to residuals that are not independent within districts. We can use the cluster option to indicate that the observations are clustered into districts (based on dnum) and that the observations may be correlated within districts, but would be independent between districts.

Page 14: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 15: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 16: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 17: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

Control for random effect (school district)

Page 18: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 19: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 20: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

Control for fixed effect (school district)

Page 21: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 22: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 23: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

2.4 Examine Distribution Assumption Classical regression assumption requires

that the outcome (dependent) to be normally distributed.

In large sample, this assumption is not that important because of Central Limit Theory

In small sample, however, the distribution assumption could be relevant

We will investigate issues concerning normality.

Page 24: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

Here we check the normality of enroll We start with making some graphs

Hisgram Kdesnity

Page 25: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 26: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 27: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 28: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 29: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

We can use the normal option to superimpose a normal curve on this graph and the bin(20) option to use 20 bins.  The distribution looks skewed to the right.

Page 30: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 31: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 32: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 33: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 34: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

An alternative to histograms is the kernel density plot, which approximates the probability density of the variable. Kernel density plots have the advantage of being smooth and of being independent of the choice of origin, unlike histograms. Stata implements kernel density plots with the kdensity command.

Page 35: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 36: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 37: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 38: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 39: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 40: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

Having concluded that enroll is not normally distributed, how should we address this problem? We may try to transform enroll to make it more normally distributed. Potential transformations include taking the log, the square root or raising the variable to a power. Stata includes the ladder and gladder commands to help selecting the right transformation. Ladder reports numeric results and gladder produces a graphic display.

Page 41: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 42: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 43: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 44: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

This indicates that the log transformation would help to make enroll more normally distributed. Let's use the generate command with the log function to create the variable lenroll which will be the log of enroll. Note that log in Stata will give you the natural log, not log base 10. To get log base 10, type log10(var)

Page 45: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 46: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 47: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 48: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 49: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.
Page 50: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

2. 5 Summary Simple Regression Multiple Regression Hypothesis Testing Examine the normality assumption

Page 51: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

Quiz I Make graphs of api99: histogram, kdensity plot What is the correlation between api99 and meals? Regress api99 on meals. Create and list the fitted (predicted) values. Graph meals and api99 with and without the regression line.

Page 52: Robust Regression. Regression Methods  We are going to look at three approaches to robust regression:  Regression with robust standard errors  Regression.

Quiz II Look at the correlations among the variables api99 meals ell avg_ed using the corr and pwcorr commands. Perform a regression predicting api99 from meals and ell. Interpret the output.