A New Estimation on Time-varying Betas in Conditional · PDF fileA New Estimation on...

A New Estimation on Time-varying Betas in

Conditional CAPM∗

Zongwu Caia,b Yu Renb

aDepartment of Mathematics and Statistics and Department of Economics,

University of North Carolina at Charlotte, Charlotte, NC 28223, USA

bThe Wang Yanan Institute for Studies in Economics, Xiamen University, Fujian, China.

June 15, 2011

∗Cai’s research was supported, in part, by the National Science Foundation of China grant #70871003,

and funds provided by the Cheung Kong Scholarship from Chinese Ministry of Education, the Minjiang

Scholarship from Fujian Province, China and Xiamen University. Ren’s research was supported by

Xiamen University and National Science Foundation of China grant #70971113. We thank the seminar

participants in UC riverside, University of Oklahoma, Canadian Economics Association 2010 meeting,

Tsinghua Econometrics Workshop 2010 and FERM meeting 2010.

Abstract

This paper uses a functional coefficient regression to estimate time-varying betas

and alphas in the conditional capital asset pricing model (CAPM). Functional co-

efficient representation relaxes the strict assumptions on the structure of betas and

alphas by combining the predictors into an index that best captures time variations

in the betas and alphas and estimates them nonparametrically. This index of betas

and alphas helps us to determine which economic variables we should track and

more importantly, in what combination. We select an appropriate index variable

by applying the smoothly clipped absolute deviation penalty to functional coeffi-

cients. In this way, estimation and variable selection can be done simultaneously.

Our model performs better than the alternatives in explaining asset returns. Based

on the empirical studies, we find no evidence to reject the conditional CAPM.

JEL classification: C13; C52; G12

Keywords: Conditional CAPM; Functional coefficient regression; Index model; Smoothly

clipped absolute deviation penalty;

2

1 Introduction

The capital asset pricing model (CAPM) is a cornerstone of finance. It states that a linear

relationship exists between the excess return of a risky asset and the beta of that asset

with respect to the market return. The betas in the CAPM are commonly assumed to

be constant over time. However, recent empirical studies provide ample evidence against

this assumption because the relative risk of a firm’s cash flow varies over the business

cycle and the state of the economy (see Fama and French (1997), Lettau and Ludvigson

(2001), Zhang (2005), Lewellen and Nagel (2006)). In other words, it is more reasonable to

believe that the CAPM holds under the condition of current information sets, which leads

to the conditional CAPM. In the conditional CAPM, the corresponding betas should be

adjusted accordingly because information sets are updated over time. Thus, the betas are

time varying. Because the betas measure the risk exposure to the assets, it is important

to estimate their values correctly. In this paper, we introduce a new methodology to

accomplish this task.

Estimation of the time-varying betas has already been discussed extensively through

two different approaches in the literature. First, the betas can be regarded as a function

of time (for examples, see the papers by Hall and Hart (1990), Johnstone and Silverman

(1997), and Robinson (1997)). For parametric models in this approach, there are threshold

values for the betas, either discretely, such as the threshold CAPM proposed by Akdeniz,

Altay-Salih and Caner (2003), or continuously, such as the smooth threshold transition

(STR) developed by Lin and Terasvirta (1994). For nonparametric models, the betas are

simply a function of time, as described by Ang and Kristensen (2011). This approach is

criticized for hiding the driving force behind the betas; first, it does not show how and

why betas vary over time, and second, the betas are assumed to be affected by some

instrument variables. These instrument variables can be the proxies of latent variables

(Ang and Chen (2007)) or of some observable macro variables (Ferson and Harvey (1999)).

This approach can provide more economic intuition on the movement of the betas than

3

the first one. Because there is no overwhelming argument for either approach, this paper

circumvents this debate and assumes that the betas are functions of some observable

variables. These variables are called financial instruments.

There are two advantages of using financial instruments to track the movement of

the betas. The first advantage is that this method can reveal the close relationship

between the relative risk of a firm’s cash flow as it varies over the business cycle and the

state of the economy. As Aıt-Sahalia and Brandt (2001), all of the potential instrument

variables are combined into an index that best captures time variations in the betas,

and the index can be explained as an economic state variable. The index has a good

interpretation as follows. From a statistical standpoint, the index avoids the curse of

dimensionality because it allows us to reduce the multivariate problem to one where we can

implement the nonparametric approach described below in a univariate setting (because

the index is univariate). From an economic perspective, this index offers a convenient

univariate summary statistic that describes the current state of the various time-varying

economic indicators related to investment opportunities for portfolio investment. From

a normative perspective, our results can help investors with any set of preferences to

determine which economic variables they should track and, more importantly, in what

single combination. And the other advantage is that this approach considers not only the

variation across averages of betas in each short time window but also the variation of the

actual betas within each window. Campbell and Vuolteenaho (2004), Fama and French

(2005), and Lewellen and Nagel (2006), among others, assume discrete changes in betas

across subsamples but constant betas within subsamples.

However, there are still three pitfalls in this approach. First, there is a strict assump-

tion about the relationship between the betas and the instrument variables. Ferson and

Harvey (1999) impose the assumption that the betas are linear functions of the index,

although the linearity assumption is quite controversial. Wang (2002, 2003) find strong

evidence against it, arguing that this strong assumption will lead to model misspecifica-

4

tion. As shown in Ghysels (1998), inference and estimation based on misspecification can

be very misleading. In addition, he shows that among several well-known time-varying

beta models, a serious misspecification produces time variation in the beta that is highly

volatile and leads to large pricing errors. Thus it is important for us to analyze the time-

varying betas by relaxing this assumption. Second, as pointed out by Aıt-Sahalia and

Brandt (2001), there is a separation of instrument selection and model estimation. In the

literature, it is common to choose the instruments by claiming the relationship between

the instrument and the asset returns. However, this evidence is found based on models

different from the one being explored. Because different objective functions place differ-

ent emphases on the instruments, to select the instruments that best fit the model, the

selection should be conducted simultaneously with the estimation. Lastly, Harvey (2001)

shows that the estimates of the betas obtained using instrumental variables are very sen-

sitive to the choice of instruments used as a proxy for time-variation in the conditional

betas.

To address the three aforementioned issues, we use functional coefficient regression

(FCR) proposed by Cai, Fan and Yao (2000), to estimate the time-varying betas. FCR

estimates the betas nonparametrically. It assumes that the coefficients of the factors are

some unknown functions of the state variable. The estimates are obtained with local

linear fitting (see Fan and Gijbels (1996)). Thus, FCR can relax the strong assumption of

linearity and avoid model misspecification. In addition, by adding a penalty term (in this

case, the smoothly clipped absolute deviation penalty (SCAD) function, introduced by

Fan and Li (2001)) to the conventional FCR, we can estimate the state variable and the

betas simultaneously. By doing so, we can choose the state variable that is the most suit-

able for the regression model. It combines the variable selection and the model estimation

together. Moreover, SCAD is a data-driven method, and it can delete the “useless” in-

struments automatically. Thus, we can place all of the potential candidates into the model

and do not need to study the relationship between the individual instrument variable with

5

the asset returns. The results will not be as sensitive to the choice of the instruments as

before.

Instrument selection is very important in empirical analysis. Through this channel,

we can observe which instruments play a role in the estimation and which do not. This

helps us explore the theoretical mechanism behind the estimation results. Furthermore,

instrument selection can enforce the robustness of empirical results. Because this pro-

cedure deletes “useless” instruments automatically, adding or changing those “useless”

instruments in the model will have minimal effects on the model. These two features

cannot be offered by the conventional index model.

The framework for a discussion concerning time-varying betas is the conditional CAPM.

However, the conditional CAPM, as an alternative to the static CAPM, is quite controver-

sial in the literature. Theoretically, the conditional CAPM can hold perfectly, from period

to period, and many models are derived based on it, for example, the Premium-Labor

model in Jagannathan and Wang (1996). Conversely, Ferson and Harvey (1999), Wang

(2003), and Lewellen and Nagel (2006) find that the conditional CAPM is rejected by

their empirical analysis. Still, their results are not convincing. The model in Ferson and

Harvey (1999) may be misspecified, and Wang (2003) uses four smoothing variables to

perform nonparametric estimations and tests. The number of observations in his data is

approximately 600, which is too small to produce reliable inferences. Lewellen and Nagel

(2006) do a simple comparison to evaluate whether the variation in the betas and the

equity premium is large enough to explain important asset-pricing anomalies; however,

the size of their method is challenged by Cai, Li and Ren (2011). Thus, the validity of

the conditional CAPM is still an open question.

To evaluate the conditional CAPM fairly, there are many aspects to consider. Because

it is infeasible for us to analyze all of them here, we only focus on two points in this paper.

The first is whether the data supports the time-varying betas, and the second is whether

the pricing errors, the alphas, are significant. We test on these two phenomenon based

6

on our model. By applying the Fama-French data sets, we find that we cannot reject

the hypotheses that the betas are time varying and that the alphas are insignificant.

Therefore, at this stage, we do not find any evidence against the conditional CAPM.

The rest of this paper is organized as follows: Section 2 describes our estimation model,

Section 3 presents some simulation results, and Section 4 reports the empirical analysis.

Finally, Section 5 concludes.

2 Econometric Model

2.1 Functional Coefficient CAPM

In the conditional CAPM, Ferson and Harvey (1999) assume the beta is a linear function of

the state variable. But Wang (2002, 2003) find a strong evidence against this assumption.

In order to circumvent a possible model misspecification, we adopt functional coefficient

regression (FCR) model proposed by Cai, Fan and Yao (2000). This FCR representation

does not impose any assumptions on the relationship between the beta and the state

variable. Also, pointed by Cai, Gu and Li (2009), a functional coefficient regression model

might approximate a general nonparametric regression model well and has an ability to

capture heteroscedasticity. Indeed, it fits the data locally by a linear function (see (3)

later).

Specifically, let {(zt, xt, yt)}+∞t=1 be jointly strictly stationary processes1, where yt is the

excess return of the portfolios, and both zt and xt are the factors in the asset pricing

model. In the conditional CAPM model, xt is commonly the excess market return and zt

is a univariate state variable.

Define m(z0, x0) = E(yt|zt = z0, xt = x0). A functional coefficient model expresses

1In practice, financial data may not be exactly stationary. But we adopt this assumption for simplicityand it is a standard assumption in the literature. When zt or xt is nonstationary, the statistical inferencesfor this are totally different from those for the stationary case so that the econometric modeling fornonstationary case becomes complicated; see Cai, Li and Park (2009) for details.

7

m(z0, x0) as

m(z0, x0) = α(z0) + β(z0)x0, (1)

so that the functional coefficient capital asset pricing model (FCCAPM) is given by

yt = α(zt) + β(zt)xt + et, (2)

where α(·) and β(·) are two unknown functions of zt. FCCAPM uses a local linear fitting

(see Fan and Gijbels (1996)) to estimate α(·) and β(·) in (2). For a given grid point z0,

we can approximate α(z) and β(z) locally by a linear function α(z) = a0 + a1(z − z0)

β(z) = b0 + b1(z − z0) when z is in a neighborhood of z0. The local linear estimators of

α(z0) and β(z0) are defined as α(z0) = a0 and β(z0) = b0. {aj, bj} (j = 0, 1) minimize the

following locally weighted least squares

T∑t=1

{yt − [a0 + a1(zt − z0)]− [b0 + b1(zt − z0)]xt}2Kh(zt − z0), (3)

where Kh(·) = K(·/h)/h. K(·) is a kernel function on R1 and h is a bandwidth, which

controls the amount of smoothing used in the estimation. Next, we write the minimization

problem (3) in a matrix form. Let

U =

1 x1 (z1 − z0) (z1 − z0)x1

1 x2 (z2 − z0) (z2 − z0)x2...

......

...

1 xT (zT − z0) (zT − z0)xT

, W = diag{Kh(zt − z0)},

B = (a0, b0, a1, b1)′ and Y = (y1, y2, · · · , yT )′. Then, the minimization problem can be

transformed to

minB

(Y − UB)′W (Y − UB). (4)

8

And the solution is

B = (U ′WU)−1U ′WY, (5)

the weighted least squares estimate. Clearly, (5) provides a formula for computational

implementation, which can be carried out by any standard statistical package. When z0

moves over the domain of zt, the estimated curves of α(zt) and β(zt) are obtained.

2.2 Bandwidth Selection

It is well known that the bandwidth is important in nonparametric estimation. It balances

the trade-off between the bias and the variance of the estimates. Following Cai and Tiwari

(2000), Cai (2002) and Cai (2007), we find the optimal bandwidth hopt by minimizing

AIC(h) = log(σ2) + 2(Th + 1)/(T − Th − 2), (6)

where σ2 =∑T

t=1(yt − yt)2/T and Th is the trace of the hat matrix Hh which makes

Y = HhY .

This selection criterion counteracts the over/under-fitting tendency of the generalized

cross-validation and the classical AIC; see Cai and Tiwari (2000) and Cai (2002) for more

details. Alternatively, one might use some existing methods in the time series literature

although they may require more computing; see Fan and Gijbels (1996), Cai, Fan and

Yao (2000) and Cai (2007).

2.3 Testing

In order to check the validity of the conditional CAPM, we have to test the significance

of the pricing error. In the conditional CAPM, the pricing error is denoted as α(·) in the

model. Hence, the testing problem can be formulated as

H0 : α(z) = 0, (7)

9

and

H1 : α(z) 6= 0. (8)

The sum square residuals (RSS) under the null hypothesis is

RSS0 = T−1T∑t=1

[yt − β(zt)xt]2, (9)

where β is estimated under the null. The RSS under the alternative is

RSS1 = T−1T∑t=1

[yt − α(zt)− β(zt)xt]2, (10)

where α and β are estimated under the alternative. The test statistic is defined as

JT =RSS0

RSS1

− 1. (11)

We use the following nonparametric bootstrap approach to obtain the p−value of the

statistic:

1. Collect the residuals {et} by

et = yt − βtxt.

2. Generate the bootstrap residuals {e∗i } from the empirical distribution of the centered

residuals {et − ¯et}.

3. Define

y∗t = βtxt + e∗t .

4. Calculate the bootstrap test statistic J∗T based on the sample {y∗t , xt, zt}.

5. Compute the p-value of the test based on the relative frequency of the event {J∗T ≥

JT} in the replications of the bootstrap sampling.

10

2.4 Instrument Variables Selection

Although it is common to model the betas as a function of the observed macroeconomic

variables, in practice, the selection process itself is crucial (Cochrane (2001), Harvey

(2001)). Usually, the instrument variables are chosen according to the specific studies of

the relationship between that variable and the asset returns. For example, the spread

between the returns of three-month and one-month Treasury bills (Ferson and Harvey

(1991)), the spread between Moody’s Baa and Aaa corporate bond yield (Fama (1990)),

the spread between ten-year and one-year Treasury bond yields (Fama and French (1989))

and one-month Treasury bill yield (Ferson (1989)) are empirically proven to be highly

related to asset returns. Those instruments are proven to be useful in explaining the

asset returns in the models different from the one that we are studying. To select the

instruments that best fit our model, we adopt the variable selection method proposed by

Fan and Li (2001).

We define ut (t = 1, 2, · · · , T ) as a k × 1 vector of the instrument variables, and the

linear combination of these variables zt = c′ut is the state variable for FCR, where c is

a k × 1 coefficient vector. When c = 0, the corresponding instrument is ignored in the

state variable. Thus, we propose a model that can delete the redundant instrument by

delivering the estimated coefficient as 0.

The loss function of FCR can be expressed as

L(c) =T∑t=1

[yt − α(zt)− β(zt)xt]2 . (12)

It is a function of c since zt depends on c. We add Smoothly Clipped Absolute Deviation

penalty to the loss function L(c) in equation

mincL(c) + T

k∑i=1

Pλ,v(|ci|), (13)

11

where the first order derivative of Pλ,v is

P ′λ,v(ci) = λI(ci ≤ λ) +(vλ− ci)+v − 1

I(ci > λ), (14)

and λ and v are two parameters. The choices of λ and v are important for the estimation.

Here we follow Fan and Li (2001), and use fivefold cross-validation as follows: Denote the

full dataset by T , and denote cross-validation training and test set by T − T q and T q,

respectively, for q = 1..., 5. For every λ, v and q, we get the estimators α and β by using

T − T q training set. Then, we choose λ and v to minimize

CV (λ, v) =5∑q=1

∑(yp,xp,zp)∈T q

[yp − α(zp)− β(zp)xp]2. (15)

As shown in Fan and Li (2001), the SCAD penalty function leads to the estimators

with the three desired properties that cannot be achieved by either the Lp penalty function

or the hard penalty function. The three properties are the following: unbiasedness for

the large true coefficient to avoid unnecessary estimation bias, sparsity for estimating

a coefficient as small as 0 to reduce model complexity, and continuity of the resulting

estimator to avoid unnecessary variation in model prediction. More discussions regarding

these properties can be found in the papers by Antoniadis and Fan (2001) and Fan and

Li (2001).

Intuitively, if the instrument variables are not helpful in explaining the time-varying

betas, then the estimated coefficient for this instrument will shrink to zero. Hence, in

formatting the state variable, the estimated coefficient for the instrument will not be

considered. This variable selection method is entirely data-driven and reveals the potential

effects of the instrument variables while simultaneously maintains the analysis within the

framework of FCR. Furthermore, because the number of the instrument variables are no

longer restricted, SCAD will ignore the “useless” instruments automatically.

12

3 Monte Carlo Simulations

In this section, we illustrate the proposed method using simulated data sets. This data set

mimics the actual portfolio returns and market returns in the conditional CAPM model.

In addition, we generate three instrument variables but assume that only two of them

format the state of the economy. We then determine whether our model can deliver good

estimates in terms of the mean absolute deviation and determine the actual state variable.

Because the key point that demonstrates the validity of the conditional CAPM is

whether the pricing error is significant, the bootstrap testing procedure we use should

have the appropriate size and power to make correct inferences. Thus, based on the

simulated data, we also check the size and the power of our testing procedure.

Linear Model

We generate 500 data points, z1t, z2t and z3t drawn from a normal distribution with a

mean of 0 and a standard deviation of 0.38 as our instrument variables. xt is generated

from a uniform distribution on [0, 1]. The data generating process for yt is

yt = (c1z1t + c2z2t)xt + et, (16)

where et is an error term distributed normally with a mean of 0 and a standard deviation of

0.08. c1 and c2 are the coefficients of the factors, and they are set equal to√

2/2 ≈ 0.7071.

Although only two instruments are involved in DGP, we plug all three into our estimation

procedure. We expect our model to account for this fact and decrease the effect of the

irrelevant instrument somehow. We repeat the simulation 1,000 times, and report the

estimates of c1, c2 and c3 in terms of their absolute deviation.

Table 1 shows the median and the standard deviation of the absolute deviation of the

estimators. We find c3 shrinks to 0 in all the simulations, which indicates that our model

can delete the third instrument, which is not included in the true DGP, from the state

13

Table 1: Estimation Results: absolute deviationWe generate three instruments, and only use the first two to form the state variable. The excess returns aresimulated linearly. This table reports the median and the standard deviation for the absolute deviation ofthe estimates based on 1000 simulations. c1, c2 and c3 are the coefficients for the instruments. ‘Shrinkagerate’ indicates the frequency that the estimator of c3 equal to 0.

c1 c2 c3

T=300median 0.0022 0.0022 0.0000std 0.0017 0.0017 0.0000shrinkage rate 100%



variable automatically. In addition, the estimates of c1 and c2 are very close to their true

values. This finding is consistent with what our theoretical model indicates.

Nonlinear Model

We use the same method to simulate z1t, z2t, z3t, xt and et, whereas, yt is generated

nonlinearly as follows

yt = 0.1 exp(c1z1t + c2z2t)xt + et. (17)

Similarly, c1 = c2 =√

2/2 ≈ 0.7071. The estimation results are reported in Table 2.

This time, the estimated c3 deviates from its true value slightly, and the shrinkage rates

decrease to 80% when T = 300, which is lower than the rate in the linear model. This

low shrinkage rate is reasonable because the performance of our model depends on the

complexity of the true DGP. If we just focus on the median and the standard deviation

of the estimators, then the performance of our model still works well.

14

Table 2: Estimation Results: absolute deviationWe generate three instruments, and only use the first two to form the state variable. The excess returnsare simulated nonlinearly. This table reports the median and the standard deviation for the absolutedeviation of the estimates based on 1000 simulations. c1, c2 and c3 are the coefficients for the instruments.‘Shrinkage rate’ indicates the frequency that the estimator of c3 equal to 0.

c1 c2 c3




Size and Power

Because we use the bootstrap method to test the significance of the pricing error, we need

to check the size and the power of our test. If the size or power distortion is large, then

the inferences based on the test results are not reliable.

We still use the DGPs previously described as the representatives of the linear and the

nonlinear models for the conditional CAPM. For each simulation, we simulate 500 data

points and test the null hypothesis that the intercept is zero. Then we repeat the entire

procedure 1,000 times and report the frequency of the rejection of the null hypothesis.

Because the true DGP is based on the conditional CAPM, the rejection frequency should

be close to the nominal levels. These results are summarized in Table 3.

In order to check the power of the test, we add a local time-varying intercept to the

linear and the nonlinear models described before. For the linear model,

yt =c0T 2/5

(c1z1t + c2z2t) + (c1z1t + c2z2t)xt + et, (18)

15

Table 3: Size of TestWe use the linear and the nonlinear model in the previous section to generate the data, and test the nullhypothesis that the pricing error is insignificant. We report the rejection frequencies in this table.

Nomial Level (%) 1 5 10

T=100Linear Model (%) 0.3 3.3 7.6

Nonlinear Model (%) 1.3 3.7 9.2

T=300Linear Model (%) 0.7 3.9 8.2


T=500Linear Model (%) 1.2 4.6 9.8


and for the nonlinear model,

yt =c0T 2/5

(c1z1t + c2z2t) + 0.1 exp(c1z1t + c2z2t)xt + et. (19)

We obtain the rejection frequency based on 1,000 replications. We try different values of

c0 and T , and set the significant level at 5%. The rejection frequencies are reported in

Table 4.

The power of our test increases sharply with c0. It reaches almost 100% when c0 = 0.5.

So the power of our test is warranted to the usage on the empirical analysis.

4 Empirical Analysis

4.1 Data

We collect monthly returns of the Fama-French 25 portfolios from July 1963 to December

2009. The financial instruments, following Ferson and Harvey (1999), are the spread

between the returns of the three-month and the one-month Treasury bill , the spread

16

Table 4: Power of TestWe use the linear and the nonlinear model, which contain a local pricing error, to generate the data, andtest the null hypothesis that the pricing error is insignificant. The significant level is 5%. We report therejection frequencies in this table.

Parameter c0=0.1 c0=0.15 c0=0.2 c0=0.25 c0=0.3 c0=0.5

T=100Linear Model(%) 10.5 29 43.5 63.5 79.5 98.0

Nonlinear Model(%) 11.0 25 42 62 83.5 97.5

T=300Linear Model (%) 22.0 45.5 70.5 86.5 95.5 99.5

Nonlinear Model (%) 17.0 46.5 70.5 86.5 96.0 100

T=500Linear Model(%) 26.5 49 76 91.5 97.5 100

Nonlinear Model (%) 25.5 50 76 95.5 99 100

between Moody’s Baa and Aaa corporate bond yield, the spread between a ten-year and

one-year Treasury bond yields and the one-month Treasury bill yield. To match the

model, we obtain the one-month lagged data for the instrument variable.

We do not claim that these instrument variables are the set of all potential instruments:

these are the most popular ones used in practice because interest rates and spreads are

usually benchmark indexes for the business cycle. By putting them into our model, we

can estimate which variable plays a more important role than the others. Certainly, it is

feasible to incorporate more macro variables into the model because the SCAD penalty

criterion will shrink the coefficients of those “useless” variables to zero. Doing so will not

hurt our estimation, but only needs more computation time.

4.2 Data Smoothing

We use our model to estimate the state variable and the asset returns. To compare the

relative performances, we also employ the model in Ferson and Harvey (1999) (FH model)

to the data. We report the mean square error in Table 5. The first column indicates the

17

portfolios we study. The portfolios are sorted by the orders of the size (“S”) and the

book-to-market ratio (“B”). “S1” (and “B1”) denotes the lowest order and “S5” (and

“B5”) denotes the highest order. The mean square errors (MSE) estimated in our model

and in the FH model are reported in the second and the third columns respectively. We

can see that the numbers in the second column are smaller than those in the third column

for all portfolios. The improvement of the MSE can be as large as 12% shown in portfolio

“S2/B3”. For the others, our model can decrease the MSE for approximately 7% on

average. The relatively poor performance of the FH model may be explained by the

model’s misspecification. The linearity assumption imposed on the relationship between

the state variable and the time-varying betas seems too strong. The fitting performance

increases when this assumption is relaxed, which is consistent with the findings in Wang

(2002, 2003).

To explore further, we check how the choice for the state variable affects our results.

We run a FCR with a different state variable. Three different state variables are studied

here. The first is the one selected by SCAD, as our model describes. For the second state

variable, we run a linear regression of the asset returns on the instrument variables and

treat the fitted value as the state variable. The last is the state variable implied by the

time-varying betas in the FH model2. All of the state variables used in the smoothing

variable for FCR, and we report the MSE in Table 6. On average, SCAD gives us the

best state variable among these three. The majority of the MSE in the second column

are smaller than the ones in the other two columns. The reason is that SCAD selects

the variables to directly fit the asset returns and chooses the index that is the best for

the model estimation. This method avoids the problems caused by the separation of

instrument selection and model estimation.

When our model is employed, it produces the estimated coefficients of the instruments

summarized in Table 7. To avoid the identification problem, we search the estimates over

2The FH model allows the alpha and the beta to have different state variables.

18

Table 5: Mean Square ErrorWe use our model to estimate the state variable and the asset returns. The instruments are the spreadbetween the returns of a three-month and a one-month Treasury bill , the spread between Moody’s Baaand Aaa corporate bond yield, the spread between a ten-year and one-year Treasury bond yield and theone-month Treasury bill yield. In order to compare the relative performances, we also employ the modelin Ferson and Harvey (1999) (FH model) to fit the data. This table reports the mean square error. Thefirst column indicates the portfolios we study. The portfolio are sorted by the orders of the size (“S”) andthe book-to-market ratio (“B”). “S1” (and “B1”) denotes the lowest order and “S5” (and “B5”) denotesthe hightest order. The mean square error delivered by our model and by the FH model are reported inthe second and the third columns respectively. The first column indicates the portfolios we study. Thesecond and the third columns report the mean square error delivered by our model and the FH modelrespectively.

FCR+SCAD Ferson and Harvey

S1/B1 22.4948 23.3314S1/B2 17.5266 17.7231S1/B3 10.7763 12.048S1/B4 10.6107 11.4277S1/B5 12.3372 13.4097S2/B1 12.5011 13.3613S2/B2 8.1924 8.6243S2/B3 6.4449 7.3834S2/B4 7.3587 7.4214S2/B5 9.8359 10.8331S3/B1 8.9408 9.2193S3/B2 4.9812 5.1106S3/B3 4.9300 5.2432S3/B4 5.3756 5.812S3/B5 8.1348 9.2337S4/B1 4.8292 5.1875S4/B2 3.3935 3.5154S4/B3 4.1486 4.6364S4/B4 4.5217 5.0956S4/B5 7.5399 8.383S5/B1 2.3848 2.7164S5/B2 2.4869 2.6139S5/B3 3.6080 3.9479S5/B4 5.1640 5.3412S5/B5 8.4341 9.0278

19

Table 6: Mean Square ErrorWe use FCR to fit the data, but try different state variables. The second column reports the MSEwhen we use SCAD to find the state variable. The third column reports the MSE when we run a linearregression of the asset returns on the instrument variables and treat the fitted value as the state variable.The last one is the MSE when the state variables are chosen by the time-varying betas in the FH model.

SCAD OLS Ferson and Harvey

S1/B1 22.4948 23.4331 23.1556S1/B2 17.5266 17.4579 17.6291S1/B3 10.7763 11.7326 11.6373S1/B4 10.6107 11.1470 11.1239S1/B5 12.3372 13.4502 13.3691S2/B1 12.5011 12.9744 13.1612S2/B2 8.1924 8.3108 8.4618S2/B3 6.4449 7.2664 7.2051S2/B4 7.3587 7.2686 7.2975S2/B5 9.8359 10.6718 10.6696S3/B1 8.9408 8.8972 9.0156S3/B2 4.9812 4.9189 4.9750S3/B3 4.9300 5.0807 5.0757S3/B4 5.3756 5.6697 5.5941S3/B5 8.1348 9.0188 8.7943S4/B1 4.8292 4.9809 5.0273S4/B2 3.3935 3.5223 3.4804S4/B3 4.1486 4.3329 4.4692S4/B4 4.5217 4.8917 4.9957S4/B5 7.5399 8.4117 8.1457S5/B1 2.3848 2.6724 2.5847S5/B2 2.4869 2.6469 2.6192S5/B3 3.6080 3.9656 3.8562S5/B4 5.1640 5.3757 5.1527S5/B5 8.4341 8.7039 8.6781

20

a unit circle. Interestingly, the coefficients of the one-month interest rate are zero for

many portfolios. This is consistent with the empirical evidence in Bernanke (1990) that

the interest rate spreads play a more important role than the rates in financial markets.

This feature cannot be observed in Ferson and Harvey (1999).

Table 7: Coefficients for Instrument VariablesWe use our method to estimate the coefficients of the instrument variables. The portfolios are theFama-French 25 portfolios, and the instrument variables are the spread between Moody’s Baa and Aaacorporate bond yield (BmA), the spread between the returns of a three-month and a one-month Treasurybill (r3m-r1m), spread between a ten-year and one-year Treasury bond yield (r10y-r1y) and one-monthTreasury bill yield (r1m).

Coefficient BmA r3m-r1m r10y-r1y r1m

S1/B1 0.3009 0.5242 0.8009 0S1/B2 0.5549 0.5916 0.5849 0S1/B3 0.6663 0.7118 0.2268 0S1/B4 0.6049 0.6890 0.3393 0.2103S1/B5 0.6192 0.7541 0.1880 0.1123S2/B1 0.2910 0.5239 0.8006 0S2/B2 0.6811 0.7134 0.1682 0S2/B3 0.4812 0.6944 0.4446 0.2977S2/B4 0.2225 0.9749 0 0S2/B5 0.5210 0.5639 0.3908 0.5078S3/B1 0.2910 0.5123 0.8080 0S3/B2 0.7350 0.4576 0.3759 0.3303S3/B3 0.4736 0.5125 0.4434 0.5625S3/B4 0.3021 0.6616 0.5472 0.4143S3/B5 0.5759 0.5426 0.3492 0.5020S4/B1 0.6001 0.2166 0.4654 0.6134S4/B2 0.1821 0.5188 0.6673 0.5024S4/B3 0.6216 0.3452 0.2588 0.6539S4/B4 0.4373 0.4900 0.4528 0.6030S4/B5 0.5620 0.5216 0.3925 0.5080S5/B1 0.3137 0.4947 0.6700 0.4561S5/B2 0.6118 0.3644 0.5044 0.4884S5/B3 0.3182 0.6129 0.6187 0.3747S5/B4 0.5079 0.5639 0.3933 0.5191S5/B5 0.5081 0.4720 0.4500 0.5626

Whether or not the betas are time varying is crucial to the conditional CAPM. We plot

the estimates of the alphas and the betas with their 95% point-wise confidence intervals.

The confidence intervals are obtained by bootstrapping the data 1000 times. To save

space, we only report the S4/B2 portfolios. Figure 1 shows clearly that the betas are not

constant over the time period studied. This justifies the usage of the conditional CAPM.

21

And Figure 2 gives us a rough idea on the pricing errors, which fluctuate around zero.

Figure 1: Estimated betas with 95% point wise confidence intervals for S4/B2 portfolio

0 100 200 300 400 500 6000.4

0.5

0.6

0.7

0.8

0.9

1

1.1

1.2

1.3

1.4

time

beta

Figure 2: Estimated alphas with 95% point wise confidence intervals for S4/B2 portfolio

0 100 200 300 400 500 600−4

−3

−2

−1

0

1

2

3

time

alph

a

4.3 Model Testing

After estimating the time-varying betas and the alphas, we are curious about whether our

model specification is correct and whether our model is sufficient to explain the portfolio

returns. To answer these two questions, we test the model specified in Ferson and Harvey

(1999). In addition, we test the significance of the pricing errors produced by our model.

22

We do so with both time series analysis and cross-sectional analysis. If the conditional

CAPM is valid and our model estimates the betas correctly, then the pricing error should

be insignificant.

Nonlinearity

Ferson and Harvey (1999) use a linear model to describe the time-varying betas, while

our paper opts to use the nonparametric model to fulfill this job. It is necessary for us to

test the model specification. Namely, the null hypothesis is

H0 : αt = a0 + a′1Zt βt = b0 + b′1Zt. (20)

The alternative hypothesis is

H1 : αt = f1(c′Zt) βt = f2(c

′Zt), (21)

where f1 and f2 are two unknown functions. The p-values of 25 portfolios are summarized

by Table 8. The portfolios in each row are sorted by the book-to-market ratios and

in each column by the sizes. We find that we can reject the null hypothesis for 17

portfolios, which means that the majority of the Fama-French 25 portfolios do not satisfy

the linearity assumption imposed in Ferson and Harvey (1999). This nonlinearity is

even more prominent for the portfolios that have high book-to-market ratios or small

sizes. This nonlinearity implies the instruments capture the time lag information mainly

through their high-order term.

Pricing Error of Time Series Analysis

In the conditional CAPM, the alphas represent the abnormal returns of the assets. The

abnormal returns occur when there are other systematic risks in the market that the

model does not capture. Thus, we need to test the significance of the alphas. When

23

Table 8: p−ValuesThis table reports the p−values of the test on the linearity of the alphas and the betas. The nullhypothesis the linear model in Ferson and Harvey (1999); the alternative is our nonparametric model.The p−values are obtained by bootstrapping the data 1000 times. The portfolios in each row are sortedby the book-to-market ratios and in each column by the sizes.

B1 B2 B3 B4 B5

S1 0.2500 0.1954 0.0002 0 0

S2 0.2082 0.0125 0.0011 0.0001 0.0088

S3 0.2344 0.0412 0.0168 0.0248 0.0038

S4 0.0050 0.0002 0.0136 0.0184 0.0006

S5 0.2508 0.0360 0.0702 0.2628 0.0806

the estimated alphas are significant, then the model is rejected; however, if the model is

misspecified, the test results may be misleading. Because we have already rejected the

linear time-varying beta and alpha, we are warranted in testing the significance of the

alphas based on our model.

We estimate the time-varying alphas for the entire time period of the data and then

perform a significance test on the alphas. The null hypothesis is H0 : αt = 0 and

H1 : αt 6= 0 for t = 1, 2, · · · , T . Here, we only impose restrictions on the alphas, and

allow the betas to vary. The p-values are obtained using the bootstrapping procedure

mentioned previously. Table 9 summarizes the results. Out of 25 portfolios, 20 have

insignificant alphas at the level of 5%. This result is quite different from the results found

by Ferson and Harvey (1999). The difference is due to the model specification. As argued

in Ghysels (1998), the inferences made from the misspecified model can be very mislead-

ing. Thus, we believe that the conditional CAPM can deliver insignificant pricing errors.

The insignificant pricing errors also suggest that the conditional CAPM can capture the

systematic risk from period to period. Therefore, we do not need to use conditional

multi-factor models in asset pricing. This result also gives strong support for the models

that are derived or based on the conditional CAPM, such as the premium-labor model

proposed by Jagannathan and Wang (1996).

24

Table 9: p−ValuesThis table reports the p−values for the test on the significance of the alphas when the model has time-varying betas. The null hypothesis is H0 : αt = 0 and the alternative is H1 : αt 6= 0.

Portfolio B1 B2 B3 B4 B5

S1 0.6840 0.5480 0.0020 0 0.0002

S2 0.6520 0.3980 0.0100 0.1500 0.1060

S3 0.0900 0.2620 0.5300 0.0300 0.1480

S4 0.0860 0.0800 0.7020 0.0500 0.4880

S5 0.5400 0.0940 0.6160 0.6660 0.5480

Pricing Error of Cross Sectional Analysis

In addition, we also run a cross-sectional regression. Since the conditional CAPM model

is expressed as

E[Rit|It−1] = γ1t−1βit−1, (22)

where γ1t−1 is the conditional market risk premium, following Ferson and Harvey (1999),

the cross-sectional regression can be setup as

Rit = γ0,t−1 + γ1,t−1βit−1 + γ2,t−1δ′it−1Zt−1 + eit, (23)

where γ0,t−1 is the intercept and γ1,t−1, γ2,t−1 are the slope coefficients. δ′it−1Zt−1 is the

fitted conditional expected return, where δit−1 is estimated by regressing the return on

the lagged variable Z, using the data up to time t− 1. γ0,t−1 denotes the pricing errors in

the cross-sectional regression, and γ2,t−1 indicates how much information is not captured

by the conditional betas. So if the conditional CAPM is valid, γ0,t−1 and γ2,t−1 should be

insignificant.

We estimate the betas using two methods. One method uses the rolling 120-month

prior estimation period, and the other uses an expanding sample. The time-series averages

of the cross-sectional regression coefficients are shown along with their Fama-MacBeth

25

t−ratios in Table 10.

Table 10: Cross-sectional RegressionWe run the cross-sectional regression of the excess returns on constant, the time-varying beta and fittedconditional expected return. The time-series averages of the cross-sectional regression coefficients areshown in the first row in each panel and the corresponding Fama-MacBeth t−ratios are reported in thesecond row with parenthesis.

γ0 γ1 γ2Rolling Sample

0.9996 -0.3498(0.1787) (-0.0519)0.6815 -0.0057 0.8408

(0.1171) (-0.0010) (0.1622)

Expanding Sample

0.9749 -0.2360(0.1268) (-0.0300)0.8599 -0.3111 0.9330

(0.1166) (-0.0437) (0.2138)

The results suggest that γ0 is insignificant in both cases. Although the pricing errors of

some portfolios are significant in the time-series analysis, we can still obtain insignificant

pricing errors for the cross-sectional regression. This is because, in the time-series analysis,

pricing error is a function of the instruments, α = α(Zt). The realized value of this

function is not zero, but its expected value can be zero. Moreover, we find that γ2 is

insignificant as well. This implies that the public information has already been revealed

in the time-varying betas. These two findings can justify the conditional CAPM from the

cross-sectional point of view.

5 Conclusion

This paper uses a functional coefficient regression to estimate the time-varying betas and

the alphas in the conditional capital asset pricing model. Functional coefficient repre-

sentation relaxes the strict assumptions regarding the structure of betas and alphas by

26

combining the predictors into an index that best captures time variations in betas and

alphas and estimates them nonparametrically. This index of betas and alphas helps us

to determine which economic variables we should track and, more importantly, in what

combination. We select an appropriate index variable by using a smoothly clipped abso-

lute deviation penalty on functional coefficients. In this manner, estimation and variable

selection can be performed simultaneously.

Our findings are quite interesting. First, empirical results show that our model has

the best fit compared with other methods in the literature. Second, our results support

the conventional wisdom about the conditional CAPM, which is that it holds from period

to period. The pricing errors are not significant for most of the portfolios that we study.

27

References

Aıt-Sahalia, Y. and Brandt, W. (2001). Variable selection for portfolio choice, Journal of

Finance, 56, 1297-1351.

Ang, A. and Kristensen, D. (2011). Testing Conditional Factor Models. Working paper.

Akdeniz, L., Altay-Salih, A. and Caner, M. (2003). Time-varying betas help in asset

pricing: the threshold CAPM. Studies in Nonlinear Dynamics & Econometrics, 6, 1-

16.

Ang, A. and Chen, J. (2007). CAPM over the long run: 1926-2001. Journal Empirical

Finance, 14, 1-40.

Antoniadis, A. and Fan, J. (2001). Regularization of wavelet approximations. Journal of

American Statistical Association, 96, 939-955.

Bernanke, B. (1990). On the predictive power of interest rates and interest rate spreads.

New England Economic Review, 51-68.

Cai, Z. (2002). A two-stage approach to additive time series models. Statistica Neerlandica,

56, 415-433.

Cai, Z. (2007). Trending time varying coefficient time series models with serially correlated

errors. Journal of Econometrics, 136, 163-188.

Cai, Z., Fan, J. and Yao, Q. (2000). Functional-coefficient regression models for nonlinear

time series. Journal of American Statistical Association, 95, 941-956.

Cai, Z., Gu, J. and Li, Q. (2009). Some recent developments on nonparametric economet-

rics. Advances in Econometrics, 25, 495-549.

Cai, Z., Li, Q. and Park, J.Y. (2009). Functional-coefficient models for nonstationary time

series data. Journal of Econometrics, 148, 101-113.

28

Cai, Z., Li, X. and Ren, Y. (2011). A Direct Bootstrap Test on the Conditional CAPM.

Working paper.

Cai, Z. and Tiwari, R.C. (2000). Application of a local linear autoregressive model to

BOD time series. Environmetrics, 11, 341-350.

Campbell, J. and Vuolteenaho, T. (2004). Good beta, bad beta. American Economic

Review, 94, 1249-1275.

Cochrane, J. (2001). Asset Pricing. Princeton University Press, Princeton, NJ.

Fama, E. (1990). Stock returns, expected returns, and real activity. Journal of Finance,

45, 1089-1108.

Fama, E. and French, K. (1989). Business conditions and expected stock returns. Journal

of Financial Economics, 25, 23-50.

Fama, E. and French, K. (1997). Industry costs of equity. Journal of Financial Economics,

43, 153-193.

Fama, E. and French, K. (2005). Financing decisions: who issues stock? Journal of

Financial Economics, 76, 549-582.

Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its

oracle properties. Journal of American Statistical Association, 96, 1348-1360.

Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman

and Hall, London.

Ferson W. (1989). Changes in expected security returns, risk and the level of interest

rates. Journal of Finance, 44, 1191-1218.

Ferson, W. and Harvey, C. (1991). The variation of economic risk premiums. Journal of

Political Economy, 99, 385-415.

29

Ferson, W. and Harvey, C. (1999). Conditional variables and the cross section of stock

returns. Journal of Finance, 54, 1325-1360.

Ghysels, E. (1998). On Stable Factor Structures in the Pricing of Risk: Do Time-Varying

Betas Help or Hurt?. Journal of Finance, 53, 549-573.

Hall, P. and Hart, J.D. (1990). Nonparametric regression with long-range dependence.

Stochastic Processes and Their Applications, 36, 339-351.

Harvey, C. (2001). The specification of conditional expectations. Journal of Empirical

Finance, 8, 573-637.

Jagannathan, R. and Wang, Z. (1996). The conditional CAPM and the cross-section of

expected returns. Journal of Finance, 51, 3-53.

Johnstone, I.M. and Silverman, B.W. (1997). Wavelet threshold estimators for data with

correlated noise. Journal of the Royal Statistical Society, 59, 319-351.

Letta, M. and Ludvigson, S. (2001). Consumption, aggregate Wealth, and expected Stock

Returns. Journal of Finance, 56, 815-849.

Lewellen, J. and Nagel, S. (2001). The conditional CAPM does not explain asset-pricing

anomalies. Journal of Financial Economics, 82, 289-314.

Lin, C.F. and Terasvirta, T. (1994). Testing the constancy of regression parameters against

continuous structural change. Journal of Econometrics, 62, 211-228.

Robinston, P.M. (1997). Large-sample inference for nonparametric regression with depen-

dent errors. The Annals of Statistics, 25, 2054-2083.

Shanken, J. (1992). On the estimation of beta pricing models. Review of Financial Studies,

5, 1-34.

30

Wang, K. (2002). Nonparametric tests of conditional mean-variance efficiency of a bench-

mark portfolio. Journal of Empirical Finance, 9, 133-169.

Wang, K. (2003). Asset pricing with conditional information: A new test. Journal of

Finance, 58, 161-196.

Zhang, L. (2005). The value premium. Journal of Finance, 60, 67-103.

31

A New Estimation on Time-varying Betas in Conditional · PDF fileA New Estimation on...

Documents

Transcript of A New Estimation on Time-varying Betas in Conditional · PDF fileA New Estimation on...