A New Estimation on Time-varying Betas in Conditional · PDF fileA New Estimation on...
Transcript of A New Estimation on Time-varying Betas in Conditional · PDF fileA New Estimation on...
A New Estimation on Time-varying Betas in
Conditional CAPM∗
Zongwu Caia,b Yu Renb
aDepartment of Mathematics and Statistics and Department of Economics,
University of North Carolina at Charlotte, Charlotte, NC 28223, USA
bThe Wang Yanan Institute for Studies in Economics, Xiamen University, Fujian, China.
June 15, 2011
∗Cai’s research was supported, in part, by the National Science Foundation of China grant #70871003,
and funds provided by the Cheung Kong Scholarship from Chinese Ministry of Education, the Minjiang
Scholarship from Fujian Province, China and Xiamen University. Ren’s research was supported by
Xiamen University and National Science Foundation of China grant #70971113. We thank the seminar
participants in UC riverside, University of Oklahoma, Canadian Economics Association 2010 meeting,
Tsinghua Econometrics Workshop 2010 and FERM meeting 2010.
Abstract
This paper uses a functional coefficient regression to estimate time-varying betas
and alphas in the conditional capital asset pricing model (CAPM). Functional co-
efficient representation relaxes the strict assumptions on the structure of betas and
alphas by combining the predictors into an index that best captures time variations
in the betas and alphas and estimates them nonparametrically. This index of betas
and alphas helps us to determine which economic variables we should track and
more importantly, in what combination. We select an appropriate index variable
by applying the smoothly clipped absolute deviation penalty to functional coeffi-
cients. In this way, estimation and variable selection can be done simultaneously.
Our model performs better than the alternatives in explaining asset returns. Based
on the empirical studies, we find no evidence to reject the conditional CAPM.
JEL classification: C13; C52; G12
Keywords: Conditional CAPM; Functional coefficient regression; Index model; Smoothly
clipped absolute deviation penalty;
2
1 Introduction
The capital asset pricing model (CAPM) is a cornerstone of finance. It states that a linear
relationship exists between the excess return of a risky asset and the beta of that asset
with respect to the market return. The betas in the CAPM are commonly assumed to
be constant over time. However, recent empirical studies provide ample evidence against
this assumption because the relative risk of a firm’s cash flow varies over the business
cycle and the state of the economy (see Fama and French (1997), Lettau and Ludvigson
(2001), Zhang (2005), Lewellen and Nagel (2006)). In other words, it is more reasonable to
believe that the CAPM holds under the condition of current information sets, which leads
to the conditional CAPM. In the conditional CAPM, the corresponding betas should be
adjusted accordingly because information sets are updated over time. Thus, the betas are
time varying. Because the betas measure the risk exposure to the assets, it is important
to estimate their values correctly. In this paper, we introduce a new methodology to
accomplish this task.
Estimation of the time-varying betas has already been discussed extensively through
two different approaches in the literature. First, the betas can be regarded as a function
of time (for examples, see the papers by Hall and Hart (1990), Johnstone and Silverman
(1997), and Robinson (1997)). For parametric models in this approach, there are threshold
values for the betas, either discretely, such as the threshold CAPM proposed by Akdeniz,
Altay-Salih and Caner (2003), or continuously, such as the smooth threshold transition
(STR) developed by Lin and Terasvirta (1994). For nonparametric models, the betas are
simply a function of time, as described by Ang and Kristensen (2011). This approach is
criticized for hiding the driving force behind the betas; first, it does not show how and
why betas vary over time, and second, the betas are assumed to be affected by some
instrument variables. These instrument variables can be the proxies of latent variables
(Ang and Chen (2007)) or of some observable macro variables (Ferson and Harvey (1999)).
This approach can provide more economic intuition on the movement of the betas than
3
the first one. Because there is no overwhelming argument for either approach, this paper
circumvents this debate and assumes that the betas are functions of some observable
variables. These variables are called financial instruments.
There are two advantages of using financial instruments to track the movement of
the betas. The first advantage is that this method can reveal the close relationship
between the relative risk of a firm’s cash flow as it varies over the business cycle and the
state of the economy. As Aıt-Sahalia and Brandt (2001), all of the potential instrument
variables are combined into an index that best captures time variations in the betas,
and the index can be explained as an economic state variable. The index has a good
interpretation as follows. From a statistical standpoint, the index avoids the curse of
dimensionality because it allows us to reduce the multivariate problem to one where we can
implement the nonparametric approach described below in a univariate setting (because
the index is univariate). From an economic perspective, this index offers a convenient
univariate summary statistic that describes the current state of the various time-varying
economic indicators related to investment opportunities for portfolio investment. From
a normative perspective, our results can help investors with any set of preferences to
determine which economic variables they should track and, more importantly, in what
single combination. And the other advantage is that this approach considers not only the
variation across averages of betas in each short time window but also the variation of the
actual betas within each window. Campbell and Vuolteenaho (2004), Fama and French
(2005), and Lewellen and Nagel (2006), among others, assume discrete changes in betas
across subsamples but constant betas within subsamples.
However, there are still three pitfalls in this approach. First, there is a strict assump-
tion about the relationship between the betas and the instrument variables. Ferson and
Harvey (1999) impose the assumption that the betas are linear functions of the index,
although the linearity assumption is quite controversial. Wang (2002, 2003) find strong
evidence against it, arguing that this strong assumption will lead to model misspecifica-
4
tion. As shown in Ghysels (1998), inference and estimation based on misspecification can
be very misleading. In addition, he shows that among several well-known time-varying
beta models, a serious misspecification produces time variation in the beta that is highly
volatile and leads to large pricing errors. Thus it is important for us to analyze the time-
varying betas by relaxing this assumption. Second, as pointed out by Aıt-Sahalia and
Brandt (2001), there is a separation of instrument selection and model estimation. In the
literature, it is common to choose the instruments by claiming the relationship between
the instrument and the asset returns. However, this evidence is found based on models
different from the one being explored. Because different objective functions place differ-
ent emphases on the instruments, to select the instruments that best fit the model, the
selection should be conducted simultaneously with the estimation. Lastly, Harvey (2001)
shows that the estimates of the betas obtained using instrumental variables are very sen-
sitive to the choice of instruments used as a proxy for time-variation in the conditional
betas.
To address the three aforementioned issues, we use functional coefficient regression
(FCR) proposed by Cai, Fan and Yao (2000), to estimate the time-varying betas. FCR
estimates the betas nonparametrically. It assumes that the coefficients of the factors are
some unknown functions of the state variable. The estimates are obtained with local
linear fitting (see Fan and Gijbels (1996)). Thus, FCR can relax the strong assumption of
linearity and avoid model misspecification. In addition, by adding a penalty term (in this
case, the smoothly clipped absolute deviation penalty (SCAD) function, introduced by
Fan and Li (2001)) to the conventional FCR, we can estimate the state variable and the
betas simultaneously. By doing so, we can choose the state variable that is the most suit-
able for the regression model. It combines the variable selection and the model estimation
together. Moreover, SCAD is a data-driven method, and it can delete the “useless” in-
struments automatically. Thus, we can place all of the potential candidates into the model
and do not need to study the relationship between the individual instrument variable with
5
the asset returns. The results will not be as sensitive to the choice of the instruments as
before.
Instrument selection is very important in empirical analysis. Through this channel,
we can observe which instruments play a role in the estimation and which do not. This
helps us explore the theoretical mechanism behind the estimation results. Furthermore,
instrument selection can enforce the robustness of empirical results. Because this pro-
cedure deletes “useless” instruments automatically, adding or changing those “useless”
instruments in the model will have minimal effects on the model. These two features
cannot be offered by the conventional index model.
The framework for a discussion concerning time-varying betas is the conditional CAPM.
However, the conditional CAPM, as an alternative to the static CAPM, is quite controver-
sial in the literature. Theoretically, the conditional CAPM can hold perfectly, from period
to period, and many models are derived based on it, for example, the Premium-Labor
model in Jagannathan and Wang (1996). Conversely, Ferson and Harvey (1999), Wang
(2003), and Lewellen and Nagel (2006) find that the conditional CAPM is rejected by
their empirical analysis. Still, their results are not convincing. The model in Ferson and
Harvey (1999) may be misspecified, and Wang (2003) uses four smoothing variables to
perform nonparametric estimations and tests. The number of observations in his data is
approximately 600, which is too small to produce reliable inferences. Lewellen and Nagel
(2006) do a simple comparison to evaluate whether the variation in the betas and the
equity premium is large enough to explain important asset-pricing anomalies; however,
the size of their method is challenged by Cai, Li and Ren (2011). Thus, the validity of
the conditional CAPM is still an open question.
To evaluate the conditional CAPM fairly, there are many aspects to consider. Because
it is infeasible for us to analyze all of them here, we only focus on two points in this paper.
The first is whether the data supports the time-varying betas, and the second is whether
the pricing errors, the alphas, are significant. We test on these two phenomenon based
6
on our model. By applying the Fama-French data sets, we find that we cannot reject
the hypotheses that the betas are time varying and that the alphas are insignificant.
Therefore, at this stage, we do not find any evidence against the conditional CAPM.
The rest of this paper is organized as follows: Section 2 describes our estimation model,
Section 3 presents some simulation results, and Section 4 reports the empirical analysis.
Finally, Section 5 concludes.
2 Econometric Model
2.1 Functional Coefficient CAPM
In the conditional CAPM, Ferson and Harvey (1999) assume the beta is a linear function of
the state variable. But Wang (2002, 2003) find a strong evidence against this assumption.
In order to circumvent a possible model misspecification, we adopt functional coefficient
regression (FCR) model proposed by Cai, Fan and Yao (2000). This FCR representation
does not impose any assumptions on the relationship between the beta and the state
variable. Also, pointed by Cai, Gu and Li (2009), a functional coefficient regression model
might approximate a general nonparametric regression model well and has an ability to
capture heteroscedasticity. Indeed, it fits the data locally by a linear function (see (3)
later).
Specifically, let {(zt, xt, yt)}+∞t=1 be jointly strictly stationary processes1, where yt is the
excess return of the portfolios, and both zt and xt are the factors in the asset pricing
model. In the conditional CAPM model, xt is commonly the excess market return and zt
is a univariate state variable.
Define m(z0, x0) = E(yt|zt = z0, xt = x0). A functional coefficient model expresses
1In practice, financial data may not be exactly stationary. But we adopt this assumption for simplicityand it is a standard assumption in the literature. When zt or xt is nonstationary, the statistical inferencesfor this are totally different from those for the stationary case so that the econometric modeling fornonstationary case becomes complicated; see Cai, Li and Park (2009) for details.
7
m(z0, x0) as
m(z0, x0) = α(z0) + β(z0)x0, (1)
so that the functional coefficient capital asset pricing model (FCCAPM) is given by
yt = α(zt) + β(zt)xt + et, (2)
where α(·) and β(·) are two unknown functions of zt. FCCAPM uses a local linear fitting
(see Fan and Gijbels (1996)) to estimate α(·) and β(·) in (2). For a given grid point z0,
we can approximate α(z) and β(z) locally by a linear function α(z) = a0 + a1(z − z0)
β(z) = b0 + b1(z − z0) when z is in a neighborhood of z0. The local linear estimators of
α(z0) and β(z0) are defined as α(z0) = a0 and β(z0) = b0. {aj, bj} (j = 0, 1) minimize the
following locally weighted least squares
T∑t=1
{yt − [a0 + a1(zt − z0)]− [b0 + b1(zt − z0)]xt}2Kh(zt − z0), (3)
where Kh(·) = K(·/h)/h. K(·) is a kernel function on R1 and h is a bandwidth, which
controls the amount of smoothing used in the estimation. Next, we write the minimization
problem (3) in a matrix form. Let
U =
1 x1 (z1 − z0) (z1 − z0)x1
1 x2 (z2 − z0) (z2 − z0)x2...
......
...
1 xT (zT − z0) (zT − z0)xT
, W = diag{Kh(zt − z0)},
B = (a0, b0, a1, b1)′ and Y = (y1, y2, · · · , yT )′. Then, the minimization problem can be
transformed to
minB
(Y − UB)′W (Y − UB). (4)
8
And the solution is
B = (U ′WU)−1U ′WY, (5)
the weighted least squares estimate. Clearly, (5) provides a formula for computational
implementation, which can be carried out by any standard statistical package. When z0
moves over the domain of zt, the estimated curves of α(zt) and β(zt) are obtained.
2.2 Bandwidth Selection
It is well known that the bandwidth is important in nonparametric estimation. It balances
the trade-off between the bias and the variance of the estimates. Following Cai and Tiwari
(2000), Cai (2002) and Cai (2007), we find the optimal bandwidth hopt by minimizing
AIC(h) = log(σ2) + 2(Th + 1)/(T − Th − 2), (6)
where σ2 =∑T
t=1(yt − yt)2/T and Th is the trace of the hat matrix Hh which makes
Y = HhY .
This selection criterion counteracts the over/under-fitting tendency of the generalized
cross-validation and the classical AIC; see Cai and Tiwari (2000) and Cai (2002) for more
details. Alternatively, one might use some existing methods in the time series literature
although they may require more computing; see Fan and Gijbels (1996), Cai, Fan and
Yao (2000) and Cai (2007).
2.3 Testing
In order to check the validity of the conditional CAPM, we have to test the significance
of the pricing error. In the conditional CAPM, the pricing error is denoted as α(·) in the
model. Hence, the testing problem can be formulated as
H0 : α(z) = 0, (7)
9
and
H1 : α(z) 6= 0. (8)
The sum square residuals (RSS) under the null hypothesis is
RSS0 = T−1T∑t=1
[yt − β(zt)xt]2, (9)
where β is estimated under the null. The RSS under the alternative is
RSS1 = T−1T∑t=1
[yt − α(zt)− β(zt)xt]2, (10)
where α and β are estimated under the alternative. The test statistic is defined as
JT =RSS0
RSS1
− 1. (11)
We use the following nonparametric bootstrap approach to obtain the p−value of the
statistic:
1. Collect the residuals {et} by
et = yt − βtxt.
2. Generate the bootstrap residuals {e∗i } from the empirical distribution of the centered
residuals {et − ¯et}.
3. Define
y∗t = βtxt + e∗t .
4. Calculate the bootstrap test statistic J∗T based on the sample {y∗t , xt, zt}.
5. Compute the p-value of the test based on the relative frequency of the event {J∗T ≥
JT} in the replications of the bootstrap sampling.
10
2.4 Instrument Variables Selection
Although it is common to model the betas as a function of the observed macroeconomic
variables, in practice, the selection process itself is crucial (Cochrane (2001), Harvey
(2001)). Usually, the instrument variables are chosen according to the specific studies of
the relationship between that variable and the asset returns. For example, the spread
between the returns of three-month and one-month Treasury bills (Ferson and Harvey
(1991)), the spread between Moody’s Baa and Aaa corporate bond yield (Fama (1990)),
the spread between ten-year and one-year Treasury bond yields (Fama and French (1989))
and one-month Treasury bill yield (Ferson (1989)) are empirically proven to be highly
related to asset returns. Those instruments are proven to be useful in explaining the
asset returns in the models different from the one that we are studying. To select the
instruments that best fit our model, we adopt the variable selection method proposed by
Fan and Li (2001).
We define ut (t = 1, 2, · · · , T ) as a k × 1 vector of the instrument variables, and the
linear combination of these variables zt = c′ut is the state variable for FCR, where c is
a k × 1 coefficient vector. When c = 0, the corresponding instrument is ignored in the
state variable. Thus, we propose a model that can delete the redundant instrument by
delivering the estimated coefficient as 0.
The loss function of FCR can be expressed as
L(c) =T∑t=1
[yt − α(zt)− β(zt)xt]2 . (12)
It is a function of c since zt depends on c. We add Smoothly Clipped Absolute Deviation
penalty to the loss function L(c) in equation
mincL(c) + T
k∑i=1
Pλ,v(|ci|), (13)
11
where the first order derivative of Pλ,v is
P ′λ,v(ci) = λI(ci ≤ λ) +(vλ− ci)+v − 1
I(ci > λ), (14)
and λ and v are two parameters. The choices of λ and v are important for the estimation.
Here we follow Fan and Li (2001), and use fivefold cross-validation as follows: Denote the
full dataset by T , and denote cross-validation training and test set by T − T q and T q,
respectively, for q = 1..., 5. For every λ, v and q, we get the estimators α and β by using
T − T q training set. Then, we choose λ and v to minimize
CV (λ, v) =5∑q=1
∑(yp,xp,zp)∈T q
[yp − α(zp)− β(zp)xp]2. (15)
As shown in Fan and Li (2001), the SCAD penalty function leads to the estimators
with the three desired properties that cannot be achieved by either the Lp penalty function
or the hard penalty function. The three properties are the following: unbiasedness for
the large true coefficient to avoid unnecessary estimation bias, sparsity for estimating
a coefficient as small as 0 to reduce model complexity, and continuity of the resulting
estimator to avoid unnecessary variation in model prediction. More discussions regarding
these properties can be found in the papers by Antoniadis and Fan (2001) and Fan and
Li (2001).
Intuitively, if the instrument variables are not helpful in explaining the time-varying
betas, then the estimated coefficient for this instrument will shrink to zero. Hence, in
formatting the state variable, the estimated coefficient for the instrument will not be
considered. This variable selection method is entirely data-driven and reveals the potential
effects of the instrument variables while simultaneously maintains the analysis within the
framework of FCR. Furthermore, because the number of the instrument variables are no
longer restricted, SCAD will ignore the “useless” instruments automatically.
12
3 Monte Carlo Simulations
In this section, we illustrate the proposed method using simulated data sets. This data set
mimics the actual portfolio returns and market returns in the conditional CAPM model.
In addition, we generate three instrument variables but assume that only two of them
format the state of the economy. We then determine whether our model can deliver good
estimates in terms of the mean absolute deviation and determine the actual state variable.
Because the key point that demonstrates the validity of the conditional CAPM is
whether the pricing error is significant, the bootstrap testing procedure we use should
have the appropriate size and power to make correct inferences. Thus, based on the
simulated data, we also check the size and the power of our testing procedure.
Linear Model
We generate 500 data points, z1t, z2t and z3t drawn from a normal distribution with a
mean of 0 and a standard deviation of 0.38 as our instrument variables. xt is generated
from a uniform distribution on [0, 1]. The data generating process for yt is
yt = (c1z1t + c2z2t)xt + et, (16)
where et is an error term distributed normally with a mean of 0 and a standard deviation of
0.08. c1 and c2 are the coefficients of the factors, and they are set equal to√
2/2 ≈ 0.7071.
Although only two instruments are involved in DGP, we plug all three into our estimation
procedure. We expect our model to account for this fact and decrease the effect of the
irrelevant instrument somehow. We repeat the simulation 1,000 times, and report the
estimates of c1, c2 and c3 in terms of their absolute deviation.
Table 1 shows the median and the standard deviation of the absolute deviation of the
estimators. We find c3 shrinks to 0 in all the simulations, which indicates that our model
can delete the third instrument, which is not included in the true DGP, from the state
13
Table 1: Estimation Results: absolute deviationWe generate three instruments, and only use the first two to form the state variable. The excess returns aresimulated linearly. This table reports the median and the standard deviation for the absolute deviation ofthe estimates based on 1000 simulations. c1, c2 and c3 are the coefficients for the instruments. ‘Shrinkagerate’ indicates the frequency that the estimator of c3 equal to 0.
c1 c2 c3
T=300median 0.0022 0.0022 0.0000std 0.0017 0.0017 0.0000shrinkage rate 100%
T=500median 0.0019 0.0019 0.0000std 0.0010 0.0010 0.0000shrinkage rate 100%
T=1000median 0.0014 0.0014 0.0000std 0.0010 0.0010 0.0000shrinkage rate 100%
variable automatically. In addition, the estimates of c1 and c2 are very close to their true
values. This finding is consistent with what our theoretical model indicates.
Nonlinear Model
We use the same method to simulate z1t, z2t, z3t, xt and et, whereas, yt is generated
nonlinearly as follows
yt = 0.1 exp(c1z1t + c2z2t)xt + et. (17)
Similarly, c1 = c2 =√
2/2 ≈ 0.7071. The estimation results are reported in Table 2.
This time, the estimated c3 deviates from its true value slightly, and the shrinkage rates
decrease to 80% when T = 300, which is lower than the rate in the linear model. This
low shrinkage rate is reasonable because the performance of our model depends on the
complexity of the true DGP. If we just focus on the median and the standard deviation
of the estimators, then the performance of our model still works well.
14
Table 2: Estimation Results: absolute deviationWe generate three instruments, and only use the first two to form the state variable. The excess returnsare simulated nonlinearly. This table reports the median and the standard deviation for the absolutedeviation of the estimates based on 1000 simulations. c1, c2 and c3 are the coefficients for the instruments.‘Shrinkage rate’ indicates the frequency that the estimator of c3 equal to 0.
c1 c2 c3
T=300median 0.0180 0.0160 0.0000std 0.0186 0.0196 0.0753shrinkage rate 80%
T=500median 0.0112 0.0114 0.0000std 0.0133 0.0143 0.0036shrinkage rate 95%
T=1000median 0.0106 0.0107 0.0000std 0.0059 0.0073 0.0000shrinkage rate 98%
Size and Power
Because we use the bootstrap method to test the significance of the pricing error, we need
to check the size and the power of our test. If the size or power distortion is large, then
the inferences based on the test results are not reliable.
We still use the DGPs previously described as the representatives of the linear and the
nonlinear models for the conditional CAPM. For each simulation, we simulate 500 data
points and test the null hypothesis that the intercept is zero. Then we repeat the entire
procedure 1,000 times and report the frequency of the rejection of the null hypothesis.
Because the true DGP is based on the conditional CAPM, the rejection frequency should
be close to the nominal levels. These results are summarized in Table 3.
In order to check the power of the test, we add a local time-varying intercept to the
linear and the nonlinear models described before. For the linear model,
yt =c0T 2/5
(c1z1t + c2z2t) + (c1z1t + c2z2t)xt + et, (18)
15
Table 3: Size of TestWe use the linear and the nonlinear model in the previous section to generate the data, and test the nullhypothesis that the pricing error is insignificant. We report the rejection frequencies in this table.
Nomial Level (%) 1 5 10
T=100Linear Model (%) 0.3 3.3 7.6
Nonlinear Model (%) 1.3 3.7 9.2
T=300Linear Model (%) 0.7 3.9 8.2
Nonlinear Model (%) 0.9 5.7 10.9
T=500Linear Model (%) 1.2 4.6 9.8
Nonlinear Model (%) 1.1 4.8 9.5
and for the nonlinear model,
yt =c0T 2/5
(c1z1t + c2z2t) + 0.1 exp(c1z1t + c2z2t)xt + et. (19)
We obtain the rejection frequency based on 1,000 replications. We try different values of
c0 and T , and set the significant level at 5%. The rejection frequencies are reported in
Table 4.
The power of our test increases sharply with c0. It reaches almost 100% when c0 = 0.5.
So the power of our test is warranted to the usage on the empirical analysis.
4 Empirical Analysis
4.1 Data
We collect monthly returns of the Fama-French 25 portfolios from July 1963 to December
2009. The financial instruments, following Ferson and Harvey (1999), are the spread
between the returns of the three-month and the one-month Treasury bill , the spread
16
Table 4: Power of TestWe use the linear and the nonlinear model, which contain a local pricing error, to generate the data, andtest the null hypothesis that the pricing error is insignificant. The significant level is 5%. We report therejection frequencies in this table.
Parameter c0=0.1 c0=0.15 c0=0.2 c0=0.25 c0=0.3 c0=0.5
T=100Linear Model(%) 10.5 29 43.5 63.5 79.5 98.0
Nonlinear Model(%) 11.0 25 42 62 83.5 97.5
T=300Linear Model (%) 22.0 45.5 70.5 86.5 95.5 99.5
Nonlinear Model (%) 17.0 46.5 70.5 86.5 96.0 100
T=500Linear Model(%) 26.5 49 76 91.5 97.5 100
Nonlinear Model (%) 25.5 50 76 95.5 99 100
between Moody’s Baa and Aaa corporate bond yield, the spread between a ten-year and
one-year Treasury bond yields and the one-month Treasury bill yield. To match the
model, we obtain the one-month lagged data for the instrument variable.
We do not claim that these instrument variables are the set of all potential instruments:
these are the most popular ones used in practice because interest rates and spreads are
usually benchmark indexes for the business cycle. By putting them into our model, we
can estimate which variable plays a more important role than the others. Certainly, it is
feasible to incorporate more macro variables into the model because the SCAD penalty
criterion will shrink the coefficients of those “useless” variables to zero. Doing so will not
hurt our estimation, but only needs more computation time.
4.2 Data Smoothing
We use our model to estimate the state variable and the asset returns. To compare the
relative performances, we also employ the model in Ferson and Harvey (1999) (FH model)
to the data. We report the mean square error in Table 5. The first column indicates the
17
portfolios we study. The portfolios are sorted by the orders of the size (“S”) and the
book-to-market ratio (“B”). “S1” (and “B1”) denotes the lowest order and “S5” (and
“B5”) denotes the highest order. The mean square errors (MSE) estimated in our model
and in the FH model are reported in the second and the third columns respectively. We
can see that the numbers in the second column are smaller than those in the third column
for all portfolios. The improvement of the MSE can be as large as 12% shown in portfolio
“S2/B3”. For the others, our model can decrease the MSE for approximately 7% on
average. The relatively poor performance of the FH model may be explained by the
model’s misspecification. The linearity assumption imposed on the relationship between
the state variable and the time-varying betas seems too strong. The fitting performance
increases when this assumption is relaxed, which is consistent with the findings in Wang
(2002, 2003).
To explore further, we check how the choice for the state variable affects our results.
We run a FCR with a different state variable. Three different state variables are studied
here. The first is the one selected by SCAD, as our model describes. For the second state
variable, we run a linear regression of the asset returns on the instrument variables and
treat the fitted value as the state variable. The last is the state variable implied by the
time-varying betas in the FH model2. All of the state variables used in the smoothing
variable for FCR, and we report the MSE in Table 6. On average, SCAD gives us the
best state variable among these three. The majority of the MSE in the second column
are smaller than the ones in the other two columns. The reason is that SCAD selects
the variables to directly fit the asset returns and chooses the index that is the best for
the model estimation. This method avoids the problems caused by the separation of
instrument selection and model estimation.
When our model is employed, it produces the estimated coefficients of the instruments
summarized in Table 7. To avoid the identification problem, we search the estimates over
2The FH model allows the alpha and the beta to have different state variables.
18
Table 5: Mean Square ErrorWe use our model to estimate the state variable and the asset returns. The instruments are the spreadbetween the returns of a three-month and a one-month Treasury bill , the spread between Moody’s Baaand Aaa corporate bond yield, the spread between a ten-year and one-year Treasury bond yield and theone-month Treasury bill yield. In order to compare the relative performances, we also employ the modelin Ferson and Harvey (1999) (FH model) to fit the data. This table reports the mean square error. Thefirst column indicates the portfolios we study. The portfolio are sorted by the orders of the size (“S”) andthe book-to-market ratio (“B”). “S1” (and “B1”) denotes the lowest order and “S5” (and “B5”) denotesthe hightest order. The mean square error delivered by our model and by the FH model are reported inthe second and the third columns respectively. The first column indicates the portfolios we study. Thesecond and the third columns report the mean square error delivered by our model and the FH modelrespectively.
FCR+SCAD Ferson and Harvey
S1/B1 22.4948 23.3314S1/B2 17.5266 17.7231S1/B3 10.7763 12.048S1/B4 10.6107 11.4277S1/B5 12.3372 13.4097S2/B1 12.5011 13.3613S2/B2 8.1924 8.6243S2/B3 6.4449 7.3834S2/B4 7.3587 7.4214S2/B5 9.8359 10.8331S3/B1 8.9408 9.2193S3/B2 4.9812 5.1106S3/B3 4.9300 5.2432S3/B4 5.3756 5.812S3/B5 8.1348 9.2337S4/B1 4.8292 5.1875S4/B2 3.3935 3.5154S4/B3 4.1486 4.6364S4/B4 4.5217 5.0956S4/B5 7.5399 8.383S5/B1 2.3848 2.7164S5/B2 2.4869 2.6139S5/B3 3.6080 3.9479S5/B4 5.1640 5.3412S5/B5 8.4341 9.0278
19
Table 6: Mean Square ErrorWe use FCR to fit the data, but try different state variables. The second column reports the MSEwhen we use SCAD to find the state variable. The third column reports the MSE when we run a linearregression of the asset returns on the instrument variables and treat the fitted value as the state variable.The last one is the MSE when the state variables are chosen by the time-varying betas in the FH model.
SCAD OLS Ferson and Harvey
S1/B1 22.4948 23.4331 23.1556S1/B2 17.5266 17.4579 17.6291S1/B3 10.7763 11.7326 11.6373S1/B4 10.6107 11.1470 11.1239S1/B5 12.3372 13.4502 13.3691S2/B1 12.5011 12.9744 13.1612S2/B2 8.1924 8.3108 8.4618S2/B3 6.4449 7.2664 7.2051S2/B4 7.3587 7.2686 7.2975S2/B5 9.8359 10.6718 10.6696S3/B1 8.9408 8.8972 9.0156S3/B2 4.9812 4.9189 4.9750S3/B3 4.9300 5.0807 5.0757S3/B4 5.3756 5.6697 5.5941S3/B5 8.1348 9.0188 8.7943S4/B1 4.8292 4.9809 5.0273S4/B2 3.3935 3.5223 3.4804S4/B3 4.1486 4.3329 4.4692S4/B4 4.5217 4.8917 4.9957S4/B5 7.5399 8.4117 8.1457S5/B1 2.3848 2.6724 2.5847S5/B2 2.4869 2.6469 2.6192S5/B3 3.6080 3.9656 3.8562S5/B4 5.1640 5.3757 5.1527S5/B5 8.4341 8.7039 8.6781
20
a unit circle. Interestingly, the coefficients of the one-month interest rate are zero for
many portfolios. This is consistent with the empirical evidence in Bernanke (1990) that
the interest rate spreads play a more important role than the rates in financial markets.
This feature cannot be observed in Ferson and Harvey (1999).
Table 7: Coefficients for Instrument VariablesWe use our method to estimate the coefficients of the instrument variables. The portfolios are theFama-French 25 portfolios, and the instrument variables are the spread between Moody’s Baa and Aaacorporate bond yield (BmA), the spread between the returns of a three-month and a one-month Treasurybill (r3m-r1m), spread between a ten-year and one-year Treasury bond yield (r10y-r1y) and one-monthTreasury bill yield (r1m).
Coefficient BmA r3m-r1m r10y-r1y r1m
S1/B1 0.3009 0.5242 0.8009 0S1/B2 0.5549 0.5916 0.5849 0S1/B3 0.6663 0.7118 0.2268 0S1/B4 0.6049 0.6890 0.3393 0.2103S1/B5 0.6192 0.7541 0.1880 0.1123S2/B1 0.2910 0.5239 0.8006 0S2/B2 0.6811 0.7134 0.1682 0S2/B3 0.4812 0.6944 0.4446 0.2977S2/B4 0.2225 0.9749 0 0S2/B5 0.5210 0.5639 0.3908 0.5078S3/B1 0.2910 0.5123 0.8080 0S3/B2 0.7350 0.4576 0.3759 0.3303S3/B3 0.4736 0.5125 0.4434 0.5625S3/B4 0.3021 0.6616 0.5472 0.4143S3/B5 0.5759 0.5426 0.3492 0.5020S4/B1 0.6001 0.2166 0.4654 0.6134S4/B2 0.1821 0.5188 0.6673 0.5024S4/B3 0.6216 0.3452 0.2588 0.6539S4/B4 0.4373 0.4900 0.4528 0.6030S4/B5 0.5620 0.5216 0.3925 0.5080S5/B1 0.3137 0.4947 0.6700 0.4561S5/B2 0.6118 0.3644 0.5044 0.4884S5/B3 0.3182 0.6129 0.6187 0.3747S5/B4 0.5079 0.5639 0.3933 0.5191S5/B5 0.5081 0.4720 0.4500 0.5626
Whether or not the betas are time varying is crucial to the conditional CAPM. We plot
the estimates of the alphas and the betas with their 95% point-wise confidence intervals.
The confidence intervals are obtained by bootstrapping the data 1000 times. To save
space, we only report the S4/B2 portfolios. Figure 1 shows clearly that the betas are not
constant over the time period studied. This justifies the usage of the conditional CAPM.
21
And Figure 2 gives us a rough idea on the pricing errors, which fluctuate around zero.
Figure 1: Estimated betas with 95% point wise confidence intervals for S4/B2 portfolio
0 100 200 300 400 500 6000.4
0.5
0.6
0.7
0.8
0.9
1
1.1
1.2
1.3
1.4
time
beta
Figure 2: Estimated alphas with 95% point wise confidence intervals for S4/B2 portfolio
0 100 200 300 400 500 600−4
−3
−2
−1
0
1
2
3
time
alph
a
4.3 Model Testing
After estimating the time-varying betas and the alphas, we are curious about whether our
model specification is correct and whether our model is sufficient to explain the portfolio
returns. To answer these two questions, we test the model specified in Ferson and Harvey
(1999). In addition, we test the significance of the pricing errors produced by our model.
22
We do so with both time series analysis and cross-sectional analysis. If the conditional
CAPM is valid and our model estimates the betas correctly, then the pricing error should
be insignificant.
Nonlinearity
Ferson and Harvey (1999) use a linear model to describe the time-varying betas, while
our paper opts to use the nonparametric model to fulfill this job. It is necessary for us to
test the model specification. Namely, the null hypothesis is
H0 : αt = a0 + a′1Zt βt = b0 + b′1Zt. (20)
The alternative hypothesis is
H1 : αt = f1(c′Zt) βt = f2(c
′Zt), (21)
where f1 and f2 are two unknown functions. The p-values of 25 portfolios are summarized
by Table 8. The portfolios in each row are sorted by the book-to-market ratios and
in each column by the sizes. We find that we can reject the null hypothesis for 17
portfolios, which means that the majority of the Fama-French 25 portfolios do not satisfy
the linearity assumption imposed in Ferson and Harvey (1999). This nonlinearity is
even more prominent for the portfolios that have high book-to-market ratios or small
sizes. This nonlinearity implies the instruments capture the time lag information mainly
through their high-order term.
Pricing Error of Time Series Analysis
In the conditional CAPM, the alphas represent the abnormal returns of the assets. The
abnormal returns occur when there are other systematic risks in the market that the
model does not capture. Thus, we need to test the significance of the alphas. When
23
Table 8: p−ValuesThis table reports the p−values of the test on the linearity of the alphas and the betas. The nullhypothesis the linear model in Ferson and Harvey (1999); the alternative is our nonparametric model.The p−values are obtained by bootstrapping the data 1000 times. The portfolios in each row are sortedby the book-to-market ratios and in each column by the sizes.
B1 B2 B3 B4 B5
S1 0.2500 0.1954 0.0002 0 0
S2 0.2082 0.0125 0.0011 0.0001 0.0088
S3 0.2344 0.0412 0.0168 0.0248 0.0038
S4 0.0050 0.0002 0.0136 0.0184 0.0006
S5 0.2508 0.0360 0.0702 0.2628 0.0806
the estimated alphas are significant, then the model is rejected; however, if the model is
misspecified, the test results may be misleading. Because we have already rejected the
linear time-varying beta and alpha, we are warranted in testing the significance of the
alphas based on our model.
We estimate the time-varying alphas for the entire time period of the data and then
perform a significance test on the alphas. The null hypothesis is H0 : αt = 0 and
H1 : αt 6= 0 for t = 1, 2, · · · , T . Here, we only impose restrictions on the alphas, and
allow the betas to vary. The p-values are obtained using the bootstrapping procedure
mentioned previously. Table 9 summarizes the results. Out of 25 portfolios, 20 have
insignificant alphas at the level of 5%. This result is quite different from the results found
by Ferson and Harvey (1999). The difference is due to the model specification. As argued
in Ghysels (1998), the inferences made from the misspecified model can be very mislead-
ing. Thus, we believe that the conditional CAPM can deliver insignificant pricing errors.
The insignificant pricing errors also suggest that the conditional CAPM can capture the
systematic risk from period to period. Therefore, we do not need to use conditional
multi-factor models in asset pricing. This result also gives strong support for the models
that are derived or based on the conditional CAPM, such as the premium-labor model
proposed by Jagannathan and Wang (1996).
24
Table 9: p−ValuesThis table reports the p−values for the test on the significance of the alphas when the model has time-varying betas. The null hypothesis is H0 : αt = 0 and the alternative is H1 : αt 6= 0.
Portfolio B1 B2 B3 B4 B5
S1 0.6840 0.5480 0.0020 0 0.0002
S2 0.6520 0.3980 0.0100 0.1500 0.1060
S3 0.0900 0.2620 0.5300 0.0300 0.1480
S4 0.0860 0.0800 0.7020 0.0500 0.4880
S5 0.5400 0.0940 0.6160 0.6660 0.5480
Pricing Error of Cross Sectional Analysis
In addition, we also run a cross-sectional regression. Since the conditional CAPM model
is expressed as
E[Rit|It−1] = γ1t−1βit−1, (22)
where γ1t−1 is the conditional market risk premium, following Ferson and Harvey (1999),
the cross-sectional regression can be setup as
Rit = γ0,t−1 + γ1,t−1βit−1 + γ2,t−1δ′it−1Zt−1 + eit, (23)
where γ0,t−1 is the intercept and γ1,t−1, γ2,t−1 are the slope coefficients. δ′it−1Zt−1 is the
fitted conditional expected return, where δit−1 is estimated by regressing the return on
the lagged variable Z, using the data up to time t− 1. γ0,t−1 denotes the pricing errors in
the cross-sectional regression, and γ2,t−1 indicates how much information is not captured
by the conditional betas. So if the conditional CAPM is valid, γ0,t−1 and γ2,t−1 should be
insignificant.
We estimate the betas using two methods. One method uses the rolling 120-month
prior estimation period, and the other uses an expanding sample. The time-series averages
of the cross-sectional regression coefficients are shown along with their Fama-MacBeth
25
t−ratios in Table 10.
Table 10: Cross-sectional RegressionWe run the cross-sectional regression of the excess returns on constant, the time-varying beta and fittedconditional expected return. The time-series averages of the cross-sectional regression coefficients areshown in the first row in each panel and the corresponding Fama-MacBeth t−ratios are reported in thesecond row with parenthesis.
γ0 γ1 γ2Rolling Sample
0.9996 -0.3498(0.1787) (-0.0519)0.6815 -0.0057 0.8408
(0.1171) (-0.0010) (0.1622)
Expanding Sample
0.9749 -0.2360(0.1268) (-0.0300)0.8599 -0.3111 0.9330
(0.1166) (-0.0437) (0.2138)
The results suggest that γ0 is insignificant in both cases. Although the pricing errors of
some portfolios are significant in the time-series analysis, we can still obtain insignificant
pricing errors for the cross-sectional regression. This is because, in the time-series analysis,
pricing error is a function of the instruments, α = α(Zt). The realized value of this
function is not zero, but its expected value can be zero. Moreover, we find that γ2 is
insignificant as well. This implies that the public information has already been revealed
in the time-varying betas. These two findings can justify the conditional CAPM from the
cross-sectional point of view.
5 Conclusion
This paper uses a functional coefficient regression to estimate the time-varying betas and
the alphas in the conditional capital asset pricing model. Functional coefficient repre-
sentation relaxes the strict assumptions regarding the structure of betas and alphas by
26
combining the predictors into an index that best captures time variations in betas and
alphas and estimates them nonparametrically. This index of betas and alphas helps us
to determine which economic variables we should track and, more importantly, in what
combination. We select an appropriate index variable by using a smoothly clipped abso-
lute deviation penalty on functional coefficients. In this manner, estimation and variable
selection can be performed simultaneously.
Our findings are quite interesting. First, empirical results show that our model has
the best fit compared with other methods in the literature. Second, our results support
the conventional wisdom about the conditional CAPM, which is that it holds from period
to period. The pricing errors are not significant for most of the portfolios that we study.
27
References
Aıt-Sahalia, Y. and Brandt, W. (2001). Variable selection for portfolio choice, Journal of
Finance, 56, 1297-1351.
Ang, A. and Kristensen, D. (2011). Testing Conditional Factor Models. Working paper.
Akdeniz, L., Altay-Salih, A. and Caner, M. (2003). Time-varying betas help in asset
pricing: the threshold CAPM. Studies in Nonlinear Dynamics & Econometrics, 6, 1-
16.
Ang, A. and Chen, J. (2007). CAPM over the long run: 1926-2001. Journal Empirical
Finance, 14, 1-40.
Antoniadis, A. and Fan, J. (2001). Regularization of wavelet approximations. Journal of
American Statistical Association, 96, 939-955.
Bernanke, B. (1990). On the predictive power of interest rates and interest rate spreads.
New England Economic Review, 51-68.
Cai, Z. (2002). A two-stage approach to additive time series models. Statistica Neerlandica,
56, 415-433.
Cai, Z. (2007). Trending time varying coefficient time series models with serially correlated
errors. Journal of Econometrics, 136, 163-188.
Cai, Z., Fan, J. and Yao, Q. (2000). Functional-coefficient regression models for nonlinear
time series. Journal of American Statistical Association, 95, 941-956.
Cai, Z., Gu, J. and Li, Q. (2009). Some recent developments on nonparametric economet-
rics. Advances in Econometrics, 25, 495-549.
Cai, Z., Li, Q. and Park, J.Y. (2009). Functional-coefficient models for nonstationary time
series data. Journal of Econometrics, 148, 101-113.
28
Cai, Z., Li, X. and Ren, Y. (2011). A Direct Bootstrap Test on the Conditional CAPM.
Working paper.
Cai, Z. and Tiwari, R.C. (2000). Application of a local linear autoregressive model to
BOD time series. Environmetrics, 11, 341-350.
Campbell, J. and Vuolteenaho, T. (2004). Good beta, bad beta. American Economic
Review, 94, 1249-1275.
Cochrane, J. (2001). Asset Pricing. Princeton University Press, Princeton, NJ.
Fama, E. (1990). Stock returns, expected returns, and real activity. Journal of Finance,
45, 1089-1108.
Fama, E. and French, K. (1989). Business conditions and expected stock returns. Journal
of Financial Economics, 25, 23-50.
Fama, E. and French, K. (1997). Industry costs of equity. Journal of Financial Economics,
43, 153-193.
Fama, E. and French, K. (2005). Financing decisions: who issues stock? Journal of
Financial Economics, 76, 549-582.
Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its
oracle properties. Journal of American Statistical Association, 96, 1348-1360.
Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman
and Hall, London.
Ferson W. (1989). Changes in expected security returns, risk and the level of interest
rates. Journal of Finance, 44, 1191-1218.
Ferson, W. and Harvey, C. (1991). The variation of economic risk premiums. Journal of
Political Economy, 99, 385-415.
29
Ferson, W. and Harvey, C. (1999). Conditional variables and the cross section of stock
returns. Journal of Finance, 54, 1325-1360.
Ghysels, E. (1998). On Stable Factor Structures in the Pricing of Risk: Do Time-Varying
Betas Help or Hurt?. Journal of Finance, 53, 549-573.
Hall, P. and Hart, J.D. (1990). Nonparametric regression with long-range dependence.
Stochastic Processes and Their Applications, 36, 339-351.
Harvey, C. (2001). The specification of conditional expectations. Journal of Empirical
Finance, 8, 573-637.
Jagannathan, R. and Wang, Z. (1996). The conditional CAPM and the cross-section of
expected returns. Journal of Finance, 51, 3-53.
Johnstone, I.M. and Silverman, B.W. (1997). Wavelet threshold estimators for data with
correlated noise. Journal of the Royal Statistical Society, 59, 319-351.
Letta, M. and Ludvigson, S. (2001). Consumption, aggregate Wealth, and expected Stock
Returns. Journal of Finance, 56, 815-849.
Lewellen, J. and Nagel, S. (2001). The conditional CAPM does not explain asset-pricing
anomalies. Journal of Financial Economics, 82, 289-314.
Lin, C.F. and Terasvirta, T. (1994). Testing the constancy of regression parameters against
continuous structural change. Journal of Econometrics, 62, 211-228.
Robinston, P.M. (1997). Large-sample inference for nonparametric regression with depen-
dent errors. The Annals of Statistics, 25, 2054-2083.
Shanken, J. (1992). On the estimation of beta pricing models. Review of Financial Studies,
5, 1-34.
30
Wang, K. (2002). Nonparametric tests of conditional mean-variance efficiency of a bench-
mark portfolio. Journal of Empirical Finance, 9, 133-169.
Wang, K. (2003). Asset pricing with conditional information: A new test. Journal of
Finance, 58, 161-196.
Zhang, L. (2005). The value premium. Journal of Finance, 60, 67-103.
31