Empirical Asset Pricing
Seppo Pynnonen
Department of Mathematics and Statistics, University of Vaasa, Finland
Texas A&M University, May – June, 2013
As of May 24, 2013
Stata regression
Part III
Stata Regression
1 Stata regression
Regression
Factor variables
Postestimation: test
More about factor variables
Robust standard errors
Postestimation commands
Regression
Stata has a very rich family of regression models and estimation methods.
A full description is found in the manuals.
The basic syntax of all alternatives is the same as in the basic regression:
regress depvar [indepvars] [if] [in] [weight] [, options]
options              Description
---------------------------------------------------------------------------------------
Model
  noconstant         suppress constant term
  hascons            has user-supplied constant
  tsscons            compute total sum of squares with constant; seldom used

SE/Robust
  vce(vcetype)       vcetype may be ols, robust, cluster clustvar, bootstrap,
                       jackknife, hc2, or hc3

Reporting
  level(#)           set confidence level; default is level(95)
  beta               report standardized beta coefficients
  eform(string)      report exponentiated coefficients and label as string
  depname(varname)   substitute dependent variable name; programmer's option
  display_options    control column formats, row spacing, line width, and display of
                       omitted variables and base and empty cells
  noheader           suppress output header
  notable            suppress coefficient table
  plus               make table extendable
  mse1               force mean squared error to 1
  coeflegend         display legend instead of statistics
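The syntax and a few of the options above can be tried out as follows. This is only a sketch: it uses the auto dataset shipped with Stata (an assumption made so that the example is self-contained), not the IBM data of the lecture.

```stata
* Sketch on Stata's bundled auto data (assumption: not the lecture's data)
sysuse auto, clear

* basic form: depvar followed by indepvars
regress price weight mpg

* [if] qualifier plus a reporting option: domestic cars only, 90% intervals
regress price weight mpg if foreign == 0, level(90)

* model option: suppress the constant term
regress price weight mpg, noconstant

* reporting option: standardized beta coefficients
regress price weight mpg, beta
```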
Example: IBM FF-factor model (regression.do)
IBM stock Fama-French 3-factor model (daily data)
reg rete mkt_rf smb hml, cformat(%6.3f) pformat(%5.3f) sformat(%8.2f)
Source | SS df MS Number of obs = 5817
-------------+------------------------------ F( 3, 5813) = 1065.30
Model | 6941.76055 3 2313.92018 Prob > F = 0.0000
Residual | 12626.3454 5813 2.17208763 R-squared = 0.3547
-------------+------------------------------ Adj R-squared = 0.3544
Total | 19568.106 5816 3.36452991 Root MSE = 1.4738
------------------------------------------------------------------------------
rete | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mkt_rf | 0.866 0.017 51.20 0.000 0.833 0.899
smb | -0.242 0.034 -7.12 0.000 -0.309 -0.176
hml | -0.446 0.033 -13.38 0.000 -0.511 -0.381
_cons | 0.030 0.019 1.57 0.117 -0.008 0.068
------------------------------------------------------------------------------
For formatting the printed output, see help cformat.
Factor variables
gen dow = dow(date) // day-of-week 0 = Sun, 1 = Mon, ...
i.dow is called a factor variable. It defines a virtual indicator variable (dummy variable) for each unique value of dow, where the smallest value forms the reference class.
list i.dow in 1/5, clean
1b. 2. 3. 4. 5.
dow dow dow dow dow
1. 0 1 0 0 0
2. 0 0 1 0 0
3. 0 0 0 1 0
4. 0 0 0 0 1
5. 0 0 0 0 0
1b. is the reference class with all values equal to zero. It is dropped from the regression.
The base value (reference value/class) can be defined by ib.
Base operator   Description
ib#.            use # as base, # = value of variable
ib(##).         use the #th ordered value as base
ib(first).      use smallest value as base (default)
ib(last).       use largest value as base
ib(freq).       use most frequent value as base
ibn.            no base level
The i may be omitted. E.g., you can type ib2.dow or b2.dow.
For example, ib(#2). means to use the second value as the base.
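The base operators can be compared on a small example. The sketch below uses the bundled auto data, where rep78 (values 1 to 5, with some missing) stands in for dow; that substitution is an assumption for illustration only.

```stata
* Sketch on Stata's bundled auto data (assumption: rep78 plays the role of dow)
sysuse auto, clear

* show the virtual indicator variables next to the original variable
list rep78 i.rep78 in 1/5, clean

regress price weight i.rep78          // default base: smallest value (1)
regress price weight ib3.rep78        // base: rep78 == 3
regress price weight ib(last).rep78   // base: largest value (5)
regress price weight ib(freq).rep78   // base: most frequent value
```

Changing the base only changes which class the dummy coefficients are measured against; the fit of the model is the same in all four regressions.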
In fact, bn., rather than i., defines a virtual indicator variable for each class (no class is dropped as the base).
list bn.dow in 1/5, clean
1. 2. 3. 4. 5.
dow dow dow dow dow
1. 0 1 0 0 0
2. 0 0 1 0 0
3. 0 0 0 1 0
4. 0 0 0 0 1
5. 1 0 0 0 0
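Since bn. keeps an indicator for every class, it is normally combined with noconstant. A minimal sketch on the bundled auto data (an assumption, with foreign as the factor) suggests how the two parameterizations relate: the class coefficients of the bn. model are the constant and the constant plus the dummy coefficient of the i. model.

```stata
* Sketch on Stata's bundled auto data (assumption: not the lecture's data)
sysuse auto, clear

* usual parameterization: constant plus dummy for foreign == 1
regress price weight i.foreign
scalar b_cons = _b[_cons]
scalar b_for  = _b[1.foreign]

* no base level: one intercept per class, so the constant is suppressed
regress price weight bn.foreign, noconstant

display "class 0 intercept: " _b[0.foreign] "   _cons:         " b_cons
display "class 1 intercept: " _b[1.foreign] "   _cons + dummy: " b_cons + b_for
```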
Testing for day-of-the-week effects
// estimate ff-model with weekday dummies
reg rete mkt_rf smb hml bn.dow, noconstant
Source | SS df MS Number of obs = 5817
-------------+------------------------------ F( 8, 5809) = 404.01
Model | 7000.01619 8 875.002024 Prob > F = 0.0000
Residual | 12580.9525 5809 2.16576906 R-squared = 0.3575
-------------+------------------------------ Adj R-squared = 0.3566
Total | 19580.9687 5817 3.36616274 Root MSE = 1.4717
------------------------------------------------------------------------------
rete | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mkt_rf | 0.866 0.017 51.29 0.000 0.833 0.899
smb | -0.230 0.034 -6.75 0.000 -0.297 -0.163
hml | -0.447 0.033 -13.41 0.000 -0.512 -0.381
|
dow |
1 | 0.178 0.044 4.00 0.000 0.091 0.265
2 | 0.088 0.043 2.07 0.039 0.005 0.172
3 | -0.067 0.043 -1.57 0.116 -0.151 0.017
4 | -0.001 0.043 -0.02 0.984 -0.085 0.083
5 | -0.037 0.043 -0.87 0.386 -0.122 0.047
------------------------------------------------------------------------------
Monday (dow = 1) is statistically significant (Monday effect?).
Postestimation: test
Given the Monday effect, we can test whether the rest of the weekday effects are jointly zero.
This can be carried out by the postestimation command test:
test (spec) [(spec) . . .] [, test options]
See help test for details.
Example: Testing for linear hypotheses
test 2.dow 3.dow 4.dow 5.dow // note, e.g., i2.dow is
// the same as 2.dow
( 1) 2.dow = 0
( 2) 3.dow = 0
( 3) 4.dow = 0
( 4) 5.dow = 0
F( 4, 5809) = 1.87
Prob > F = 0.1122
No empirical evidence of joint difference from zero.
Note: Usually 2.dow 3.dow 4.dow 5.dow is equivalent to i(2 3 4 5).dow or i(2/5).dow, but with test this does not seem to work.
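A related command, testparm, accepts a variable list with factor-variable notation and tests the listed coefficients jointly against zero. The sketch below uses the bundled auto data with rep78 as the factor (an assumption); it only checks the simple case in which all non-base levels are tested, and does not verify the i(2/5). form in the lecture's model.

```stata
* Sketch on Stata's bundled auto data (assumption: rep78 plays the role of dow)
sysuse auto, clear
regress price weight i.rep78

* joint test with every term written out
test 2.rep78 3.rep78 4.rep78 5.rep78
scalar F_test = r(F)

* the same joint test via testparm and factor-variable notation
testparm i.rep78
display "F from test: " F_test "   F from testparm: " r(F)
```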
Equality of Tue through Fri effects
test 2.dow = 3.dow = 4.dow = 5.dow // equality of coefficients
( 1) 2.dow - 3.dow = 0
( 2) 2.dow - 4.dow = 0
( 3) 2.dow - 5.dow = 0
F( 3, 5809) = 2.49
Prob > F = 0.0588
More about factor variables
Factor variable operators
Operator   Description
i.         unary operator to specify indicators
c.         unary operator to treat as continuous
#          binary operator to indicate interaction
##         binary operator to specify full-factorial interactions
E.g. i1.dow##c.smb includes all the terms 1.dow, smb, and 1.dow#c.smb in the model.
The c. means continuous variable.
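The expansion done by ## can be checked against hand-made interaction terms. A minimal sketch, assuming the bundled auto data with foreign as the dummy and weight as the continuous variable (the generated variable name is hypothetical):

```stata
* Sketch on Stata's bundled auto data (assumption: not the lecture's data)
sysuse auto, clear

* full-factorial interaction: 1.foreign, weight and 1.foreign#c.weight
regress price i.foreign##c.weight
scalar b_int = _b[1.foreign#c.weight]

* the same model with a hand-made interaction term
generate for_x_wgt = foreign * weight
regress price foreign weight for_x_wgt

display "## interaction: " b_int "   manual interaction: " _b[for_x_wgt]
```

The factor-variable form has the advantage that postestimation commands such as margins know that the terms belong together.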
Monday effect in risk, i.e., in the factor betas?
gen mon = 1.dow // for convenience, define Monday dummy and interaction terms
gen mkt_x_mon = mkt_rf * mon
gen smb_x_mon = smb * mon
gen hml_x_mon = hml*mon
reg rete mkt_rf smb hml mon mkt_x_mon smb_x_mon hml_x_mon // regression with
// monday effects in coefficient
Source | SS df MS Number of obs = 5817
-------------+------------------------------ F( 7, 5809) = 460.98
Model | 6988.12246 7 998.303209 Prob > F = 0.0000
Residual | 12579.9835 5809 2.16560226 R-squared = 0.3571
-------------+------------------------------ Adj R-squared = 0.3563
Total | 19568.106 5816 3.36452991 Root MSE = 1.4716
------------------------------------------------------------------------------
rete | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mkt_rf | 0.890 0.019 46.07 0.000 0.852 0.928
smb | -0.217 0.038 -5.68 0.000 -0.291 -0.142
hml | -0.445 0.037 -11.88 0.000 -0.519 -0.372
mon | 0.178 0.050 3.58 0.000 0.081 0.276
mkt_x_mon | -0.105 0.040 -2.64 0.008 -0.184 -0.027
smb_x_mon | -0.079 0.085 -0.93 0.354 -0.245 0.088
hml_x_mon | 0.013 0.082 0.16 0.876 -0.149 0.175
_cons | -0.005 0.021 -0.25 0.806 -0.047 0.037
------------------------------------------------------------------------------
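The enhanced model is of the form

$$ r^e_t = \alpha + \beta_{mkt}\, r^e_{m,t} + s\,\mathrm{SMB}_t + h\,\mathrm{HML}_t + \delta\,\mathrm{mon}_t + \delta_m (r^e_m \times \mathrm{mon})_t + \delta_s (\mathrm{SMB} \times \mathrm{mon})_t + \delta_h (\mathrm{HML} \times \mathrm{mon})_t + u_t \tag{3} $$

which we can write

$$ r^e_t = \alpha + \delta\,\mathrm{mon}_t + (\beta_{mkt} + \delta_m \mathrm{mon}_t)\, r^e_{m,t} + (s + \delta_s \mathrm{mon}_t)\,\mathrm{SMB}_t + (h + \delta_h \mathrm{mon}_t)\,\mathrm{HML}_t + u_t. \tag{4} $$

We see that, e.g., on Mondays $\beta_{mkt} \to \beta_{mkt} + \delta_m$, and thus $\delta_m$ can be interpreted as the shift in market beta on Mondays.
The significance of the $\delta_m$ estimate above indicates a change in IBM stock's market risk on Mondays (it decreases).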
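The shifted slope itself, i.e., a sum such as $\beta_{mkt} + \delta_m$, can be estimated with its standard error by lincom. A sketch on the bundled auto data (an assumption), where foreign plays the role of the Monday dummy and weight the role of the market factor:

```stata
* Sketch on Stata's bundled auto data (assumption: foreign ~ mon, weight ~ mkt_rf)
sysuse auto, clear
regress price i.foreign##c.weight

* slope of weight when foreign == 1: base slope plus the shift
lincom weight + 1.foreign#c.weight
scalar slope_foreign = r(estimate)

* the shift alone is the interaction coefficient
test 1.foreign#c.weight
display "slope for foreign == 1: " slope_foreign
```

With the lecture's variable names the corresponding call would presumably be lincom mkt_rf + mkt_x_mon after the regression with the hand-made interaction terms.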
Robust standard errors
In financial data, homoscedasticity and independence of the regression errors are typically not satisfied.
Stata has several options to deal with these problems through the vce() option (see help vce_option).
If the errors are not correlated, White (1980, Econometrica, 817–838) type corrected standard errors are typically utilized.
Generally the OLS standard errors depend on the elements of the matrix

$$ X'\Omega X \tag{5} $$

where $\Omega$ is the covariance matrix of the error terms, which is diagonal if the error terms are not correlated.
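Under this assumption the elements of (5) are of the form

$$ \sum_{i=1}^{n} \omega_i^2\, x_{ij} x_{ik} \tag{6} $$

$j, k = 1, \ldots, p$ = number of regressors, where $\omega_i^2 = \mathrm{var}[u_i]$ is the variance of the $i$th error term.
In the literature there are four different practices to estimate $\omega_i^2$ from the OLS residuals $\hat{u}_i$:

$$ \begin{array}{lll} \mathrm{HC0}: & \hat{u}_i^2 & \\ \mathrm{HC1}: & \dfrac{n}{n-p-1}\,\hat{u}_i^2 & \text{Stata: vce(robust)} \\ \mathrm{HC2}: & \dfrac{\hat{u}_i^2}{1-h_i} & \text{Stata: vce(hc2)} \\ \mathrm{HC3}: & \dfrac{\hat{u}_i^2}{(1-h_i)^2} & \text{Stata: vce(hc3)} \end{array} \tag{7} $$

$h_i$ is the $i$th diagonal element of $X(X'X)^{-1}X'$.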
Generally the different scalings of the squared residuals in (7) result in different bias corrections when estimating the error variances $\omega_i^2$.
HC0 performs worst, HC1 does better, HC2 does better still, and HC3 does best of all (usually); see Davidson and MacKinnon (1993, p. 554), Estimation and Inference in Econometrics.
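To see how such a correction enters the standard errors, the HC3 version can be computed by hand and compared with vce(hc3). The sketch below is an illustration on the bundled auto data (an assumption), using Mata and the sandwich form $(X'X)^{-1} X'\hat{\Omega} X (X'X)^{-1}$ with the HC3 weights from the list above on the diagonal of $\hat{\Omega}$. The variable and matrix names are hypothetical.

```stata
* Sketch on Stata's bundled auto data (assumption: not the lecture's data)
sysuse auto, clear
regress price weight mpg
predict double uhat, residuals      // OLS residuals
predict double h, leverage          // diagonal elements of X(X'X)^-1X'

mata:
// regressors plus a column of ones for the constant
X   = st_data(., ("weight", "mpg")), J(st_nobs(), 1, 1)
u2  = st_data(., "uhat"):^2
h   = st_data(., "h")
XXi = invsym(cross(X, X))
// sandwich with HC3 weights u^2/(1-h)^2; cross(X, w, X) is X'diag(w)X
V   = XXi * cross(X, u2 :/ (1 :- h):^2, X) * XXi
st_matrix("V_hc3", V)
end

display "manual HC3 s.e. of weight: " sqrt(V_hc3[1,1])
regress price weight mpg, vce(hc3)
display "Stata  HC3 s.e. of weight: " _se[weight]
```

Replacing the weight vector by u2 :/ (1 :- h) should give the HC2 version in the same way.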
OLS standard errors
reg rete mkt_rf smb hml mon mkt_x_mon // OLS standard errors
Source | SS df MS Number of obs = 5817
-------------+------------------------------ F( 5, 5811) = 645.29
Model | 6986.00399 5 1397.2008 Prob > F = 0.0000
Residual | 12582.102 5811 2.16522147 R-squared = 0.3570
-------------+------------------------------ Adj R-squared = 0.3565
Total | 19568.106 5816 3.36452991 Root MSE = 1.4715
------------------------------------------------------------------------------
rete | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mkt_rf | 0.891 0.019 46.29 0.000 0.853 0.928
smb | -0.233 0.034 -6.84 0.000 -0.300 -0.166
hml | -0.441 0.033 -13.23 0.000 -0.506 -0.376
mon | 0.184 0.049 3.71 0.000 0.087 0.281
mkt_x_mon | -0.104 0.040 -2.63 0.009 -0.182 -0.027
_cons | -0.005 0.021 -0.23 0.821 -0.047 0.037
------------------------------------------------------------------------------
Huber-White heteroscedasticity-consistent standard errors (HC1)
// White-Huber heteroscedastic consistent standard errors
reg rete mkt_rf smb hml mon mkt_x_mon, vce(robust)
Linear regression Number of obs = 5817
F( 5, 5811) = 412.25
Prob > F = 0.0000
R-squared = 0.3570
Root MSE = 1.4715
------------------------------------------------------------------------------
| Robust
rete | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mkt_rf | 0.891 0.022 40.94 0.000 0.848 0.933
smb | -0.233 0.039 -6.05 0.000 -0.309 -0.158
hml | -0.441 0.043 -10.28 0.000 -0.525 -0.357
mon | 0.184 0.047 3.90 0.000 0.091 0.276
mkt_x_mon | -0.104 0.052 -1.99 0.047 -0.207 -0.001
_cons | -0.005 0.022 -0.22 0.823 -0.048 0.038
------------------------------------------------------------------------------
Notice the difference in the F-statistic and standard errors, in particular that of the mkt_x_mon interaction and its t and p values!
White-MacKinnon correction (HC2)
reg rete mkt_rf smb hml mon mkt_x_mon, vce(hc2)
Linear regression Number of obs = 5817
F( 5, 5811) = 408.31
Prob > F = 0.0000
R-squared = 0.3570
Root MSE = 1.4715
------------------------------------------------------------------------------
| Robust HC2
rete | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mkt_rf | 0.891 0.022 40.87 0.000 0.848 0.933
smb | -0.233 0.039 -6.04 0.000 -0.309 -0.157
hml | -0.441 0.043 -10.25 0.000 -0.525 -0.357
mon | 0.184 0.047 3.89 0.000 0.091 0.276
mkt_x_mon | -0.104 0.053 -1.95 0.051 -0.209 0.001
_cons | -0.005 0.022 -0.22 0.823 -0.048 0.038
------------------------------------------------------------------------------
It is notable that mkt_x_mon is at best borderline significant!
White-MacKinnon correction (HC3)
reg rete mkt_rf smb hml mon mkt_x_mon, vce(hc3)
Linear regression Number of obs = 5817
F( 5, 5811) = 404.03
Prob > F = 0.0000
R-squared = 0.3570
Root MSE = 1.4715
------------------------------------------------------------------------------
| Robust HC3
rete | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mkt_rf | 0.891 0.022 40.78 0.000 0.848 0.933
smb | -0.233 0.039 -6.01 0.000 -0.309 -0.157
hml | -0.441 0.043 -10.22 0.000 -0.526 -0.356
mon | 0.184 0.047 3.89 0.000 0.091 0.276
mkt_x_mon | -0.104 0.055 -1.91 0.056 -0.211 0.003
_cons | -0.005 0.022 -0.22 0.823 -0.048 0.038
------------------------------------------------------------------------------
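Comparing the variants is easier when the standard errors are shown side by side. A minimal sketch, assuming the bundled auto data and hypothetical names for the stored results:

```stata
* Sketch on Stata's bundled auto data (assumption: not the lecture's data)
sysuse auto, clear

regress price weight mpg
estimates store ols
regress price weight mpg, vce(robust)
estimates store hc1
regress price weight mpg, vce(hc2)
estimates store hc2
regress price weight mpg, vce(hc3)
estimates store hc3

* coefficients are identical across columns; only the standard errors differ
estimates table ols hc1 hc2 hc3, b(%9.3f) se(%9.3f)
```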
Stata regress also has jackknife and bootstrap standard errors, as well as cluster-type robust standard errors.
Jackknife
reg rete mkt_rf smb hml mon mkt_x_mon, vce(jackknife, nodots) // nodots suppresses the replication dots
Linear regression Number of obs = 5817
Replications = 5817
F( 5, 5816) = 404.10
Prob > F = 0.0000
R-squared = 0.3570
Adj R-squared = 0.3565
Root MSE = 1.4715
------------------------------------------------------------------------------
| Jackknife
rete | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mkt_rf | 0.891 0.022 40.78 0.000 0.848 0.933
smb | -0.233 0.039 -6.02 0.000 -0.309 -0.157
hml | -0.441 0.043 -10.22 0.000 -0.526 -0.356
mon | 0.184 0.047 3.89 0.000 0.091 0.276
mkt_x_mon | -0.104 0.055 -1.91 0.056 -0.211 0.003
_cons | -0.005 0.022 -0.22 0.823 -0.048 0.038
------------------------------------------------------------------------------
Bootstrap
reg rete mkt_rf smb hml mon mkt_x_mon, vce(boot, nodots) // nodots suppresses the replication dots
Linear regression Number of obs = 5817
Replications = 50
Wald chi2(5) = 3190.50
Prob > chi2 = 0.0000
R-squared = 0.3570
Adj R-squared = 0.3565
Root MSE = 1.4715
------------------------------------------------------------------------------
| Observed Bootstrap Normal-based
rete | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mkt_rf | 0.891 0.018 50.55 0.000 0.856 0.925
smb | -0.233 0.042 -5.59 0.000 -0.315 -0.151
hml | -0.441 0.047 -9.44 0.000 -0.533 -0.349
mon | 0.184 0.044 4.18 0.000 0.098 0.270
mkt_x_mon | -0.104 0.056 -1.87 0.061 -0.213 0.005
_cons | -0.005 0.022 -0.22 0.823 -0.047 0.038
------------------------------------------------------------------------------
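The output above is based on the default of 50 replications, so the bootstrap standard errors vary noticeably from run to run. A sketch of how the number of replications and the random seed can be set, again on the bundled auto data (an assumption; the seed value is arbitrary):

```stata
* Sketch on Stata's bundled auto data (assumption: not the lecture's data)
sysuse auto, clear

* more replications and a fixed seed make the results stable and reproducible
regress price weight mpg, vce(bootstrap, reps(500) seed(12345) nodots)
display "replications used: " e(N_reps)
```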
Finally, Stata also allows robustifying the standard errors with respect to correlation within clusters (autocorrelation corrections can be dealt with by time series regressions).
Using dow as a cluster variable gives:
reg rete mkt_rf smb hml mon mkt_x_mon, vce(cluster dow)
Linear regression Number of obs = 5817
F( 3, 4) = .
Prob > F = .
R-squared = 0.3570
Root MSE = 1.4715
(Std. Err. adjusted for 5 clusters in dow)
------------------------------------------------------------------------------
| Robust
rete | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mkt_rf | 0.891 0.015 57.75 0.000 0.848 0.933
smb | -0.233 0.049 -4.78 0.009 -0.368 -0.098
hml | -0.441 0.010 -46.35 0.000 -0.467 -0.415
mon | 0.184 0.038 4.84 0.008 0.078 0.289
mkt_x_mon | -0.104 0.015 -7.15 0.002 -0.145 -0.064
_cons | -0.005 0.034 -0.14 0.894 -0.100 0.090
------------------------------------------------------------------------------
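Estimation seems to have some problems because the F-statistic does not show up. A likely reason is that with only 5 clusters the cluster-robust covariance matrix does not have sufficient rank for a joint test of all five slope coefficients.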
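How little information a handful of clusters carries can be seen from the stored results: after vce(cluster), inference is based on the number of clusters minus one degrees of freedom. A sketch on the bundled auto data (an assumption), where rep78 has five distinct values like dow:

```stata
* Sketch on Stata's bundled auto data (assumption: rep78 plays the role of dow)
sysuse auto, clear
regress price weight mpg, vce(cluster rep78)

* number of clusters and the degrees of freedom used for t and F statistics
display "clusters: " e(N_clust) "   residual df: " e(df_r)
```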
The overall conclusion from the above estimation results is that while the Monday effect on IBM's returns seems evident, the Monday effect on IBM's market beta may not exist after all.
Using dow as a cluster variable to capture the potential similarity of regression residuals on particular weekdays (intraclass correlation among weekdays, e.g. Mondays) does not seem to work.
In this case a better alternative is to use time series tools, to which we will return later. Before that we look at some postestimation commands.
Postestimation commands
Postestimation commands provide tools for model checking, testing, etc. Earlier we have already used test for testing linear hypotheses.
Some regress postestimation commands
The following postestimation commands are of special interest after regress:
================================================================================
Command          Description
--------------------------------------------------------------------------------
dfbeta           DFBETA influence statistics
estat hettest    tests for heteroskedasticity
estat imtest     information matrix test
estat ovtest     Ramsey regression specification-error test for omitted variables
estat szroeter   Szroeter's rank test for heteroskedasticity
estat vif        variance inflation factors for the independent variables
acprplot         augmented component-plus-residual plot
avplot           added-variable plot
avplots          all added-variable plots in one image
cprplot          component-plus-residual plot
lvr2plot         leverage-versus-squared-residual plot
rvfplot          residual-versus-fitted plot
rvpplot          residual-versus-predictor plot
================================================================================
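A few of the test-type commands from the list can be run directly after an OLS fit. The sketch below uses the bundled auto data (an assumption) and leaves out the plotting commands.

```stata
* Sketch on Stata's bundled auto data (assumption: not the lecture's data)
sysuse auto, clear
regress price weight mpg

estat hettest                 // test for heteroskedasticity
scalar p_het = r(p)           // keep its p-value for later use
estat ovtest                  // Ramsey specification-error test
estat vif                     // variance inflation factors

display "p-value of the heteroskedasticity test: " p_het
```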
Some more
More useful postestimation commands:
================================================================================
Command        Description
--------------------------------------------------------------------------------
contrast       contrasts and ANOVA-style joint tests of estimates
estat          AIC, BIC, VCE, and estimation sample summary
estat (svy)    postestimation statistics for survey data
estimates      cataloging estimation results
hausman        Hausman's specification test
lincom         point estimates, standard errors, testing,
                 and inference for linear combinations of coefficients
linktest       link test for model specification
lrtest         likelihood-ratio test
margins        marginal means, predictive margins, marginal effects,
                 and average marginal effects
marginsplot    graph the results from margins (profile plots, interaction plots, etc.)
nlcom          point estimates, standard errors, testing,
                 and inference for nonlinear combinations of coefficients
predict        predictions, residuals, influence statistics,
                 and other diagnostic measures
predictnl      point estimates, standard errors, testing,
                 and inference for generalized predictions
pwcompare      pairwise comparisons of estimates
suest          seemingly unrelated estimation
test           Wald tests of simple and composite linear hypotheses
testnl         Wald tests of nonlinear hypotheses
================================================================================
For more information, see help (e.g. help estat).
Postestimation commands apply to the last estimated model.
We estimate the previous model with HC3 corrected standard errors.
Examples
predict res, residuals // generates residuals into res
predict yhat, xb // generates fitted values into yhat
avplots, msymbol(point) // added variable plots
avplots, called added variable plots or "partial-regression leverage plots", create two-dimensional scatter plots of projections of the multidimensional data. These can be used to visually identify outliers in the data.
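The projections behind an added variable plot can be reproduced by hand: both the dependent variable and the regressor of interest are regressed on the remaining regressors, and the two sets of residuals are plotted against each other. The slope of that scatter equals the coefficient in the full regression. A sketch on the bundled auto data (an assumption; the residual variable names are hypothetical):

```stata
* Sketch on Stata's bundled auto data (assumption: not the lecture's data)
sysuse auto, clear
regress price weight mpg
scalar b_weight = _b[weight]

* e(price | X): price with the other regressor (mpg) partialled out
regress price mpg
predict double e_price, residuals

* e(weight | X): weight with the other regressor (mpg) partialled out
regress weight mpg
predict double e_weight, residuals

* slope of the residual-on-residual regression equals the full-model coefficient
regress e_price e_weight
display "partial slope: " _b[e_weight] "   full-model coefficient: " b_weight
```

Points far from the fitted line in a scatter of e_price against e_weight are the observations an added variable plot would flag.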
[Figure: added variable plots from avplots, one panel per regressor, each plotting e( rete | X ) against e( regressor | X ). Panel annotations:]
mkt_rf: coef = .89060303, se = .01923992, t = 46.29
smb: coef = -.23311098, se = .03406098, t = -6.84
hml: coef = -.44099632, se = .03333577, t = -13.23
mon: coef = .1836463, se = .04945237, t = 3.71
mkt_x_mon: coef = -.10413018, se = .03956648, t = -2.63