Electronic copy available at: http://ssrn.com/abstract=1361926

Pricing Kernels with Coskewness and Volatility Risk

Fousseni CHABI-YO∗

Fisher College of Business, Ohio State University

This Version: March 16, 2009

Abstract

I investigate a pricing kernel in which coskewness and the market volatility risk factors are endogenously determined. I show that the prices of coskewness and market volatility risk are restricted by investor risk aversion and skewness preference. Risk aversion is estimated to be between two and five and is statistically significant. The price of volatility risk ranges from -1.5% to -0.15% per year. Consistent with theory, I find that the pricing kernel is decreasing in the aggregate wealth and increasing in the market volatility. When I project my estimated pricing kernel on a polynomial function of the market return, doing so produces the puzzling behaviors observed in the pricing kernel. Using pricing kernels, I examine the sources of the idiosyncratic volatility premium. I find that nonzero risk aversion and firms’ non-systematic coskewness determine the premium on idiosyncratic volatility risk. When I control for the non-systematic coskewness factor, I find no significant relation between idiosyncratic volatility and stock expected returns. My results are robust across different sample periods, different measures of market volatility and firm characteristics.

∗Fousseni Chabi-Yo is from the Fisher College of Business, The Ohio State University. I am grateful to Gurdip Bakshi, Turan Bali, Karl Diether, Rene Garcia, Eric Ghysels, Pete Kyle, Dietmar Leisen, Kewei Hou, Hong Liu, Andrew Karolyi, Mark Loewenstein, Lemma Sembet, Georgios Skoulakis, Rene Stulz, Liuren Wu, Harold Zhang, Guofu Zhou and seminar participants at Baruch College, the University of Maryland, HEC Montreal, Rutgers University, The Ohio State University, the University of Washington in St Louis, the University of Texas in Dallas, and the Stockholm School of Economics for helpful comments. As usual, they are exonerated with respect to the paper’s residual shortcomings. I thank Kenneth French for making a large amount of historical data publicly available in his online data library. I thank Tim Bollerslev, George Tauchen and Hao Zhou for making the monthly time-series of realized volatility available. I thank the Dice Center for Financial Economics for financial support. Correspondence Address: Fousseni Chabi-Yo, Fisher College of Business, Ohio State University, 840 Fisher Hall, 2100 Neil Avenue, Columbus, OH 43210-1144. email: chabi-yo 1@fisher.osu.edu


I. Introduction

In a setting with standard preferences and static prices of risk, the pricing kernel can be interpreted as a scaled marginal utility. Under these assumptions, the pricing kernel should be positive, to be consistent with positive marginal utility and the no-arbitrage condition, and it should be decreasing in the aggregate wealth (market return), to be consistent with decreasing absolute risk aversion. Empirically, however, over a reasonable range of wealth states, estimated pricing kernels increase as the aggregate wealth increases. Jackwerth (2000), Rosenberg et al. (2002) and Aït-Sahalia and Lo (2000) document this puzzling behavior of the pricing kernel, which affects the absolute risk-aversion function and renders it negative over a reasonable range of wealth levels. Previous studies ignore the possibility that market volatility risk could affect the behavior of pricing kernels. Recent papers such as Ang et al. (2006) find that market volatility risk is priced. Bakshi and Madan (2006) show that the market volatility premium is determined by a nonzero risk aversion and higher moments such as market skewness. Their finding suggests that investor skewness preferences should also affect the volatility risk premium. Ang et al. (2006) show that, in addition to the market volatility, the idiosyncratic volatility is priced. In contrast to Merton’s (1987) prediction, Ang et al. (2006) find that, on average, stocks with high idiosyncratic volatility earn low returns. They call their result a puzzle. This puzzle has attracted recent attention, and there has been increasing interest in explaining their results.

My contribution to these studies is three-fold. First, I build a partial equilibrium model in which investors trade in a multi-period market. I go beyond representative-agent utility models by allowing heterogeneity of preferences among agents. I make no assumptions about the functional form of investor utility functions or the distribution of asset returns. Instead, I provide a general framework that maps heterogeneity of preferences into the pricing kernel. Thus, I provide a structural interpretation of the pricing kernel involving coskewness and volatility risk. My pricing kernel is a function of two deep structural parameters: the average value of investor risk tolerances (the inverse of the Arrow-Pratt measure of investor risk aversion) and skewness preference. I extend the frameworks of Samuelson (1970) and Judd and Guu (2001) to a dynamic market model in which each investor maximizes his or her expected utility. The intertemporal portfolio choice is a dynamic problem in which each investor chooses asset allocations conditional on current wealth. My model setup is a dynamic extension of Harvey and Siddique (2000), Kraus and Litzenberger (1976), and Rubinstein (1973), and is also related to Brandt et al. (2005). As a result, the aggregate pricing kernel in equilibrium depends on both coskewness and market volatility risk. I show that the prices of coskewness and market volatility risk are restricted by investor preferences. Further, I provide a closed-form solution for the prices of coskewness and market volatility risk in terms of the averages of investor risk aversions and skewness preferences. The prices of coskewness and volatility risk are highly nonlinear functions of the mean risk aversion and mean skewness preference.

I use two sets of independent data: first, the 30 industry portfolio returns and, second, the 30 Dow Jones stock returns. I estimate both the risk aversion and skewness preference parameters with different proxies for the market volatility. I use the implied market volatility measures VIX and VXO, and the realized market volatility (RV) computed with high-frequency intraday squared returns. I also consider different sub-sample periods. When I use the 30 industry portfolio returns, the risk aversion is estimated to be between 2.5 and 4.75, while the skewness preference ranges from 1.05 to 2.25. The parameters are mostly statistically significant. The implied price of coskewness associated with these estimates ranges from -3.2% to -0.72% per year. When I use the VIX and VXO as my proxy for the market volatility, the implied price of the market volatility ranges from -1.2% to -0.19% per year. The implied price of the market volatility ranges from -1.51% to -0.77% per year when I use the realized volatility. When I use the 30 Dow Jones stock returns, the risk aversion is estimated to be between 3.75 and 5.75, while the skewness preference ranges from 1.003 to 1.71. The parameters are all statistically significant. The implied price of coskewness associated with these estimates ranges from -2.7% to -2.07% per year, and the implied price of the market volatility ranges from -1.30% to -0.15% per year. These estimates are in a reasonable range and consistent with the literature. I also investigate the impact of the risk aversion and skewness preference on the price of the market volatility risk over time. I find that periods of high price of volatility risk are sometimes associated with high risk aversion and low or stable skewness preference, sometimes with high skewness preference and low and stable risk aversion, and sometimes with both high risk aversion and high skewness preference.

Second, I examine the puzzling behaviors of the pricing kernel. I show that my estimated pricing kernel is consistent with economic theory: it is decreasing in the aggregate wealth and increasing in the market volatility. When I project my estimated pricing kernel on a polynomial function of the market return, doing so produces the puzzling behaviors observed in the pricing kernel. Two omissions in previous studies could be the cause of these puzzling behaviors: the missing priced market volatility factor in the pricing kernel, and the lack of a structural interpretation of the prices of coskewness and volatility risk in terms of investor risk aversion and skewness preference.

Third, I study the negative relation between idiosyncratic volatility and expected returns, and ask why firms with low idiosyncratic volatility earn higher future returns than firms with higher idiosyncratic volatility. To answer this question, I use pricing kernels to examine the sources of the idiosyncratic volatility premium. I find that the premium on idiosyncratic volatility risk is determined by a nonzero risk aversion and firms’ non-systematic coskewness. I define non-systematic coskewness as the non-systematic component of asset skewness that is related to the market portfolio’s skewness. I find that when this non-systematic component is positive, the difference in Fama-French (1993) alphas between the value-weighted decile portfolios with the highest and lowest non-systematic coskewness is a significant -1.30% per month (with a t-statistic of -2.82). In contrast, when the non-systematic coskewness is negative, a long-short portfolio holding the highest non-systematic coskewness decile of stocks and shorting the lowest non-systematic coskewness decile of stocks has a highly significant alpha of 1.30% per month (with a t-statistic of 2.94). To study the negative relation between idiosyncratic volatility and expected returns, I sort stocks on idiosyncratic volatility and form value-weighted decile portfolios. Consistent with previous studies, I find that the Fama-French (1993) alpha of the high-idiosyncratic-volatility decile exceeds the alpha of the low-idiosyncratic-volatility decile by -1.38% per month (with a t-statistic of -3.02). When I control for long-short portfolios that hold the highest non-systematic coskewness decile of stocks and short the lowest non-systematic coskewness decile of stocks, I find that the Fama-French (1993) alpha of the high-idiosyncratic-volatility decile exceeds the alpha of the low-idiosyncratic-volatility decile by 0.04% per month (with a t-statistic of 0.17). I also relate my findings to recent studies that use a GARCH specification of idiosyncratic volatility risk. I show that by assuming GARCH specifications, these papers restrict the relation between idiosyncratic volatility and stock returns to be positive. Therefore, given my sample, it is not possible to use a GARCH type of specification and arrive at a negative relation between idiosyncratic volatility and expected returns.

The paper is organized as follows. In Section II, I describe the features of my partial equilibrium model. In Section III, I discuss the equilibrium pricing kernel and its implications for asset pricing. In Section IV, I describe the data set and discuss the empirical results of my basic model. In Section V, I investigate the sources of the idiosyncratic volatility premium. In Section VI, I relate my findings to studies on idiosyncratic volatility that use a GARCH specification of idiosyncratic volatility risk. Section VII concludes.

II. The Model

I develop an approximate equilibrium model with heterogeneous investors. I do so to characterize the pricing kernel and to link it endogenously to both coskewness and volatility risk, with a structural interpretation of the market prices of aggregate volatility and coskewness risk. I keep the model standard: a finite number of investors make optimal allocation decisions in a multi-period market. I exclude intermediate consumption from the model. The heart of the model is the return decomposition. This decomposition is crucial for obtaining the closed-form solution for the optimal portfolio weight at any trading date and the aggregate pricing kernel in equilibrium.

A. Investor Preferences and Portfolio Optimization

I consider an economy in which there are many investors with heterogeneous preferences and endowments. In this economy, investors are indexed by $i = 1, \ldots, I$ and trade in $n$ risky assets and a safe asset at times $\tau = t, t+1, \ldots, T-1$. I use $R_{k\tau+1}$, $k = 1, \ldots, n$, to denote the return from investing \$1 at time $\tau$ in security $k$. All assets are traded in competitive markets without transaction costs and taxes. I consider the portfolio choice at time $t$ of investor $i$, who maximizes the expected utility of wealth at some terminal date $T < \infty$ by trading in the $n$ risky assets and the safe asset at times $\tau = t, \ldots, T-1$. Formally, the investor’s problem is:

$$V_t^{(i)} = \max_{\{\omega_\tau^{(i)}\}_{\tau=t}^{\tau=T-1}} E_t\left[u_i\left(W_T^{(i)}\right)\right], \quad i = 1, \ldots, I, \tag{1}$$

subject to the sequence of budget constraints

$$W_{\tau+1}^{(i)} = W_\tau^{(i)}\left(R_f + \omega_\tau^{(i)\top} R_{\tau+1}^e\right) \quad \forall\, \tau \geq t,$$

where $\omega_\tau^{(i)}$ is the vector of portfolio weights on the risky assets chosen at time $\tau$, and $R_{\tau+1}^e = R_{\tau+1} - R_f 1_n$ is the $n$-dimensional vector of excess returns on the risky assets from time $\tau$ to $\tau+1$. The function $u_i(\cdot)$ measures the investor’s utility of terminal wealth $W_T^{(i)}$. I assume that the individual asset allocation shares fulfill the market clearing conditions

$$\sum_{i=1}^{I} W_\tau^{(i)} \omega_\tau^{(i)} = \theta_\tau \quad \forall\, \tau \geq t, \tag{2}$$

where $\theta_{\tau-1}^{\top} R_\tau$ represents the aggregate future wealth. I use the small noise expansion approach to solve the portfolio choice problem (1). To solve for the optimal asset allocation, I follow Samuelson (1970) and Judd and Guu (2001) and assume that the distribution of the returns belongs to a family of “compact” or “small-risk” distributions. A family of distributions is small-risk if, as some specified parameter $\varepsilon$ goes to zero, all the distributions converge to a sure outcome. I can decompose any random return $R_{k\tau+1}$ that belongs to this family as follows:

$$R_{k\tau+1} = R_f + \varepsilon^2 a_{k\tau}(\varepsilon) + \varepsilon Y_{k\tau+1}. \tag{3}$$

Here, the coefficient $a_{k\tau}(\varepsilon)$ is a function of the parameter $\varepsilon$, which characterizes the scale of risk. In terms of Brownian motion, $\varepsilon$ is the square root of time, and the drift and diffusion terms are given by $\left(R_f + \varepsilon^2 a_{k\tau}(\varepsilon)\right)$ and $\left(\varepsilon Y_{k\tau+1}\right)$ respectively. Throughout this paper, $\varepsilon$ refers to the scale-of-risk parameter in the small noise expansion framework. I note that there are no restrictions on the distribution of $Y_{k\tau+1}$. Hence, the distribution of $R_{k\tau+1}$ is general and is not restricted to a specific distribution. Asset returns are not correlated through $\varepsilon$, since the correlation between any two assets $k$ and $j$ is independent of $\varepsilon$. The term $\varepsilon^2 a_{k\tau}(\varepsilon)$ is the risk premium on the risky asset. The function $a_{k\tau}(\varepsilon)$ that characterizes this premium is unknown.
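The two properties just described (risk shrinking with ε, and cross-asset correlation that does not depend on ε) can be checked numerically. The following is a minimal sketch, not part of the paper: the shock distribution, the constant coefficients `a`, and the value of `rf` are all hypothetical choices made only for illustration, and `a` is held fixed even though the paper treats it as an unknown function of ε.

```python
import numpy as np


def small_noise_returns(eps, a, shocks, rf=1.01):
    """Gross returns R = Rf + eps^2 * a + eps * Y, following equation (3).

    `a` is held fixed here for simplicity (an assumption); in the paper
    a_k(eps) is an unknown function of eps.
    """
    return rf + eps**2 * np.asarray(a) + eps * np.asarray(shocks)


rng = np.random.default_rng(0)
n_obs = 10_000
# Hypothetical shocks: a skewed common component plus Gaussian noise.
common = rng.exponential(size=n_obs) - 1.0
shocks = np.column_stack([
    common + 0.5 * rng.standard_normal(n_obs),
    0.8 * common + rng.standard_normal(n_obs),
])
a = np.array([0.05, 0.03])  # hypothetical premium coefficients

correlations = []
for eps in (0.5, 0.1, 0.01):
    returns = small_noise_returns(eps, a, shocks)
    correlations.append(np.corrcoef(returns, rowvar=False)[0, 1])
    print(f"eps={eps}: std={returns.std(axis=0)}, corr={correlations[-1]:.6f}")
```

The standard deviations scale with ε while the correlation between the two simulated assets stays the same, because each return is an affine transformation of its shock with positive slope ε.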

The return decomposition (3) shows that the return is a function of the scale-of-risk parameter $\varepsilon$. It follows that the first-order conditions of (1) implicitly define the portfolio weight as a function of $\varepsilon$. As with most Taylor series expansions,¹ I assume that the small noise expansion of $\omega_t^{(i)}$ converges, that is

$$\omega_t^{(i)} \approx \sum_{j=0}^{Q} \frac{1}{j!}\, \omega_t^{(i)[j]}, \tag{4}$$

where $Q = 1$ and $\omega_t^{(i)[j]}$ represents the $j$th derivative of the portfolio weight evaluated at $\varepsilon = 0$. I note that an approximation with $Q = 1$ works well when investors have homogeneous preferences. Approximating the portfolio weight with $Q = 1$ is sufficient to show that, in equilibrium, the pricing kernel depends on three factors: the market return, the coskewness factor and the volatility of the market return. My approach is more intuitive than the standard contingent-state approach to equilibrium. As shown in Hart (1975) and Elul (1995), the incomplete markets paradigm focuses on the difference between the number of contingent states and the number of assets. It depends on how many assets are missing and on the number of agents in the economy. It is difficult to interpret such indices of incompleteness. The main reason is that we can count neither the number of contingent states nor the number of different kinds of agents in a real economy.²

¹ See Brandt et al. (2005).

To characterize investor preferences, I assume that all agents have utility functions that exhibit non-satiation ($u_i' > 0$), risk aversion ($u_i'' < 0$), and a preference for positive skewness ($u_i''' > 0$). At $\varepsilon = 0$, the following parameters characterize the investor’s preferences:

$$\wp_i = \lim_{\varepsilon \to 0}\left[-\frac{u_i'\left(W_T^{(i)}\right)}{u_i''\left(W_T^{(i)}\right)}\right], \qquad \rho_i = \lim_{\varepsilon \to 0}\left[\frac{\wp_i^2}{2}\,\frac{u_i'''\left(W_T^{(i)}\right)}{u_i'\left(W_T^{(i)}\right)}\right], \tag{5}$$

where $1/\wp_i$ is the Arrow-Pratt absolute measure of risk aversion and $\rho_i$ represents the skew-tolerance.
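To make the two preference parameters in (5) concrete, they can be evaluated for specific utility functions. The worked example below is an illustration only: CRRA and CARA utilities are assumptions introduced here, since the model itself leaves $u_i$ unspecified, and $W$ stands for the sure wealth level reached as $\varepsilon$ goes to zero.

```latex
% Illustration only: CRRA and CARA utilities are assumptions made here;
% the model itself leaves u_i unspecified.
\begin{align*}
\text{CRRA: } u(W) &= \frac{W^{1-\gamma}}{1-\gamma},\ \gamma > 0,\ \gamma \neq 1: \\
u'(W) &= W^{-\gamma}, \quad u''(W) = -\gamma W^{-\gamma-1}, \quad u'''(W) = \gamma(\gamma+1) W^{-\gamma-2}, \\
\wp &= -\frac{u'(W)}{u''(W)} = \frac{W}{\gamma}, \qquad
\rho = \frac{\wp^2}{2}\,\frac{u'''(W)}{u'(W)}
     = \frac{W^2}{2\gamma^2}\cdot\frac{\gamma(\gamma+1)}{W^2}
     = \frac{\gamma+1}{2\gamma}. \\[6pt]
\text{CARA: } u(W) &= -e^{-aW},\ a > 0: \\
u'(W) &= a e^{-aW}, \quad u''(W) = -a^2 e^{-aW}, \quad u'''(W) = a^3 e^{-aW}, \\
\wp &= \frac{1}{a}, \qquad \rho = \frac{1}{2a^2}\cdot a^2 = \frac{1}{2}.
\end{align*}
```

Under these assumed utilities, $\rho$ does not depend on wealth. In the CRRA case it exceeds one only when $\gamma < 1$, and in the CARA case it equals one half for every level of risk aversion.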

Next, I consider the average values:

$$\wp = \frac{1}{I}\sum_{i=1}^{I} \wp_i, \qquad \rho = \frac{\sum_{i=1}^{I} \rho_i \wp_i}{\sum_{i=1}^{I} \wp_i}, \tag{6}$$

where $\wp$ is the cross-sectional average of investor risk tolerances and $\rho$ represents the weighted average of investor skewness preferences. The intertemporal portfolio choice in equation (1) is a dynamic problem. At time $\tau$, each investor chooses his or her asset allocation $\omega_\tau^{(i)}$ conditional on having wealth $W_\tau^{(i)}$. To make this decision at time $\tau$, investor $i$ takes into account the fact that at any future date the portfolio weight will be optimally revised conditional on the wealth at that date. I solve the dynamic portfolio choice in equation (1) backward by expressing the multiperiod problem as a sequence of single-period problems. In my model, all investors have homogeneous expectations. The source of heterogeneity in this economy comes from investor preferences and endowments.

² The impact of asset incompleteness on economic performance is related more to the diversity of investor objectives than to the number of states and the number of agents. Therefore, the number of different agents in an economy is a poor measure of agent diversity, because an economy with 1000 types of investors with different risk aversions and skewness preferences close to the mean risk aversion and mean skewness preference is less diverse than an economy with 20 types of investors with substantially different risk aversions and skewness preferences.
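The aggregation in equation (6) amounts to a simple average of risk tolerances and a risk-tolerance-weighted average of skew-tolerances. A minimal sketch follows; the function name and the three investors’ parameter values are hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np


def aggregate_preferences(risk_tolerances, skew_tolerances):
    """Cross-sectional averages of equation (6).

    Returns (wp_bar, rho_bar): the simple mean of the risk tolerances and
    the risk-tolerance-weighted mean of the skew-tolerances.
    """
    wp = np.asarray(risk_tolerances, dtype=float)
    rho = np.asarray(skew_tolerances, dtype=float)
    wp_bar = wp.mean()
    rho_bar = np.sum(rho * wp) / np.sum(wp)
    return wp_bar, rho_bar


# Three hypothetical investors (values chosen only for illustration).
wp_bar, rho_bar = aggregate_preferences([0.20, 0.25, 0.50], [1.50, 1.00, 0.75])
print(wp_bar, rho_bar)
```

Because the weights are the risk tolerances themselves, more risk-tolerant investors count more heavily in the aggregate skewness preference.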

III. Pricing Kernels with Coskewness and Volatility Risk

I use the small noise expansion approach proposed by Judd and Guu (2001), without knowledge of investor utility functions. This approach makes it possible to derive the closed-form solution for the optimal portfolio weights by assuming that risky asset returns can be decomposed according to equation (3). In the Appendix, I show how I derive the optimal shares of wealth invested in risky securities. I then examine the asset pricing implications of the investor optimization problem. By using the optimal portfolio weights, I can recover the functional form of the function $a_\tau(\cdot)$ appearing in the return decomposition, and use it to derive the functional form of the pricing kernel in equilibrium.³

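PROPOSITION 1: The pricing kernel for the period $[t, T]$ is

$$m_t^T = \prod_{\nu=t+1}^{T} m_{\nu-1,\nu} \tag{7}$$

with

$$m_{\nu-1,\nu} = \frac{1}{R_f} + D_0 r_{M\nu} + D_1\left[r_{M\nu}^2 - E_{\nu-1} r_{M\nu}^2\right] + \sum_{\tau=\nu}^{T-1} D_{2\tau}\left[E_\nu \sigma_{M\tau}^2 - E_{\nu-1}\sigma_{M\tau}^2\right], \tag{8}$$

where:

$$D_1 = \frac{\rho}{\wp^2} R_f^{[2(T-\nu)-1]}, \qquad D_{2\tau} = \frac{(\rho-1)}{\wp^2} R_f^{[2(T-\tau)-1]} 1_{\tau \leq T-1}, \qquad D_0 = -\frac{1}{\wp} R_f^{[T-\nu-1]} + \sum_{\tau=t+1}^{\nu-1} D_{\tau\nu} r_{M\tau}, \tag{9}$$

and

$$D_{\tau\nu} = \frac{(2\rho-1)}{\wp^2} R_f^{[(T-\tau)+2(T-\nu)-1]}, \tag{10}$$

where $r_{M\nu}$ is the demeaned market return. The indicator function $1_{\tau \leq T-1}$ equals one if $\tau \leq T-1$ and zero otherwise. $\sigma_{M\tau}^2$ represents the volatility of the market return.

Proof: See the Appendix.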
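The coefficients in equations (9) and (10) are simple functions of the two preference parameters, the risk-free return and the horizon, so they are easy to tabulate. The sketch below is an illustration under stated assumptions, not the paper’s estimation code: the function names, the argument layout, and the numerical inputs (risk tolerance 0.25, skewness preference 1.5, gross risk-free return 1.0) are all hypothetical.

```python
def kernel_coefficients(wp, rho, rf, T, nu, past_market_returns=()):
    """Coefficients of m_{nu-1,nu} from equations (9) and (10).

    wp, rho: average risk tolerance and skewness preference.
    past_market_returns: demeaned r_{M,tau} for tau = t+1, ..., nu-1, in order
    (empty when nu = t+1).
    Returns D0, D1 and a dict {tau: D2_tau} for tau = nu, ..., T-1.
    """
    t = nu - 1 - len(past_market_returns)
    D1 = rho / wp**2 * rf ** (2 * (T - nu) - 1)
    # range(nu, T) enforces the indicator tau <= T-1 in equation (9).
    D2 = {tau: (rho - 1) / wp**2 * rf ** (2 * (T - tau) - 1)
          for tau in range(nu, T)}
    D0 = -1.0 / wp * rf ** (T - nu - 1)
    for k, r_past in enumerate(past_market_returns):
        tau = t + 1 + k
        D_tau_nu = (2 * rho - 1) / wp**2 * rf ** ((T - tau) + 2 * (T - nu) - 1)
        D0 += D_tau_nu * r_past
    return D0, D1, D2


def one_step_kernel(D0, D1, D2, rf, r_m, expected_r_m_sq, vol_revisions):
    """Equation (8); vol_revisions[tau] = E_nu[sigma^2_tau] - E_{nu-1}[sigma^2_tau]."""
    m = 1.0 / rf + D0 * r_m + D1 * (r_m**2 - expected_r_m_sq)
    return m + sum(D2[tau] * vol_revisions[tau] for tau in D2)


# Hypothetical two-period example: t = 0, nu = 1, T = 2.
D0, D1, D2 = kernel_coefficients(wp=0.25, rho=1.5, rf=1.0, T=2, nu=1)
m_down = one_step_kernel(D0, D1, D2, 1.0, -0.02, 0.0025, {1: 0.0})
m_up = one_step_kernel(D0, D1, D2, 1.0, 0.02, 0.0025, {1: 0.0})
m_vol = one_step_kernel(D0, D1, D2, 1.0, 0.02, 0.0025, {1: 0.001})
print(D0, D1, D2, m_down, m_up, m_vol)
```

With these illustrative inputs the kernel is lower after a positive market return than after a negative one, and, because the assumed skewness preference exceeds one, it rises with an upward revision in expected market variance.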

The pricing kernel derived in Proposition 1 gives a structural interpretation of the market prices of risk in terms of the cross-sectional average of investor risk tolerance, $\wp$, and the cross-sectional average of investor skewness preferences, $\rho$. If investors have identical preferences and different endowments, and if the portfolio weight is approximated by (4) with $Q = 1$, then my model predicts that the aggregate pricing kernel will remain unchanged. Moreover, the aggregate pricing kernel is still valid if the representative agent assumption applies. What really matters in equilibrium is the average of investor preferences.⁴

³ In the previous version of the paper, I derived the pricing kernel with the restriction that the risk-free return satisfies $R_f^2 \simeq R_f$. In this version of the paper (see Proposition 1), there is no restriction on the risk-free return. As shown in the empirical section, this restriction has no impact on the estimated preference parameters.

⁴ When I include an additional high-order term in the portfolio weight (4), that is $Q = 2$, the aggregate pricing kernel is different from the one obtained under the representative agent assumption. The theoretical results are available on request. I will investigate this issue in future research.

In the nonlinear pricing kernel literature, the usual route is to assume the existence of a representative agent, expand the agent’s marginal utility in a Taylor series, and then drop the higher moments that are economically unimportant for explaining returns. This approach produces a pricing kernel that is a polynomial function of the market return and does not depend on the volatility of the market return. If the volatility of the market return is priced, it should be relevant for explaining the cross-section of risky returns. To my knowledge, this paper is the first to show how the market volatility risk factor enters the pricing kernel. To compute the risk premium on the risky asset from time $t$ to $t+1$, I assume that the investor invests at time $t$ in asset $k$ for $T - t$ periods. Then, the pricing kernel $m_t^T$ should price returns correctly, that is

$$E_t\left[m_t^T R_{kt+1} \cdots R_{kT}\right] = 1. \tag{11}$$

To examine the pricing implications of my model from time $t$ to $t+1$ in one-period and two-period markets, I assume that the investor invests in the risky asset from $t$ to $t+1$ and reinvests the payoff in the risk-free asset for the remaining period. I simplify the Euler equation (11) as:

$$E_t\left[m_{t,t+1} R_{kt+1}\right] = 1, \tag{12}$$

where

$$m_{t,t+1} = \frac{1}{R_f} + D_0 r_{Mt+1} + D_1\left[r_{Mt+1}^2 - E_t r_{Mt+1}^2\right] + \sum_{\tau=t+1}^{T-1} D_{2\tau}\left[E_{t+1}\sigma_{M\tau}^2 - E_t \sigma_{M\tau}^2\right], \tag{13}$$

where $D_0$, $D_1$ and $D_{2\tau}$ are defined in (9).
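The step from the multi-period Euler equation (11) to the one-period condition (12) can be sketched as follows. This is a reconstruction under one stated assumption, namely that the market return is demeaned conditionally, so that $E_{\nu-1} r_{M\nu} = 0$; under that reading every term of (8) other than $1/R_f$ has zero conditional mean.

```latex
% Sketch only; assumes E_{\nu-1} r_{M\nu} = 0 (conditionally demeaned market return).
\begin{align*}
E_{\nu-1}\left[m_{\nu-1,\nu}\right] &= \frac{1}{R_f}
  && \text{(all other terms in (8) have zero conditional mean)} \\
E_{t+1}\Big[\prod_{\nu=t+2}^{T} m_{\nu-1,\nu}\Big] &= R_f^{-(T-t-1)}
  && \text{(iterated expectations, from } \nu = T \text{ back to } t+2\text{)} \\
1 = E_t\left[m_t^T\, R_{kt+1}\, R_f^{\,T-t-1}\right]
  &= E_t\Big[m_{t,t+1} R_{kt+1}\, R_f^{\,T-t-1}\,
      E_{t+1}\Big[\prod_{\nu=t+2}^{T} m_{\nu-1,\nu}\Big]\Big] \\
  &= E_t\left[m_{t,t+1} R_{kt+1}\right].
\end{align*}
```

The payoff $R_{kt+1} R_f^{T-t-1}$ is the strategy described in the text: hold asset $k$ for one period, then roll the proceeds over at the risk-free rate until $T$.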

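In a one-period model, $T = t+1$, I use (12) to show that the risk premium on the risky asset from time $t$ to $t+1$ is

$$E_t R_{kt+1} - R_f = (\lambda_{Mt}/R_f)\,\beta_{Mt} + (\lambda_{SKDt}/R_f)\,\beta_{SKDt}, \tag{14}$$

with

$$\lambda_{Mt} = \frac{1}{\wp} R_f \sigma_{Mt}^2, \qquad \lambda_{SKDt} = -\frac{\rho}{\wp^2} R_f^2 \sigma_{Mt}^3, \tag{15}$$

and

$$\beta_{Mt} = \frac{Cov_t(r_{Mt+1}, R_{kt+1})}{\sigma_{Mt}^2}, \qquad \beta_{SKDt} = \frac{Cov_t(r_{Mt+1}^2, R_{kt+1})}{\sigma_{Mt}^3}. \tag{16}$$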
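The two loadings in equation (16) are ratios of covariances to powers of the market standard deviation, so sample analogues are straightforward to compute. The sketch below is a simplification, not the paper’s estimator: it replaces the conditional moments with unconditional sample moments, and the function name and return series are hypothetical.

```python
import numpy as np


def market_and_coskewness_betas(asset_returns, market_returns):
    """Sample analogues of beta_M and beta_SKD in equation (16).

    Unconditional sample moments stand in for the conditional moments
    (an assumption made only for this illustration).
    """
    asset = np.asarray(asset_returns, dtype=float)
    r_m = np.asarray(market_returns, dtype=float)
    r_m = r_m - r_m.mean()                 # demeaned market return
    asset_dm = asset - asset.mean()
    sigma2 = np.mean(r_m**2)               # market variance
    beta_m = np.mean(r_m * asset_dm) / sigma2
    # Cov(r_M^2, R_k), scaled by the cubed market standard deviation
    beta_skd = np.mean((r_m**2 - sigma2) * asset_dm) / sigma2**1.5
    return beta_m, beta_skd


# Hypothetical returns; the asset is an exact linear function of the market.
market = np.array([-0.06, 0.01, 0.02, -0.01, 0.03, 0.02, 0.015, -0.005])
asset = 0.002 + 1.2 * market
beta_m, beta_skd = market_and_coskewness_betas(asset, market)
print(beta_m, beta_skd)
```

For an asset that is exactly linear in the market return, the market beta equals the slope and the coskewness beta equals the slope times the market’s own skewness, which gives a quick sanity check on the implementation.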

Equation (14) is the asset pricing model derived in Rubinstein (1973), Kraus and Litzenberger (1976), and Harvey and Siddique (2000). The main contribution of my model is that, in a two-period model, the asset risk premium depends on the co-movement between the market volatility and the return on the risky asset. In a two-period model, $T = t+2$, I use (12) to show that the risk premium on the risky asset from time $t$ to $t+1$ is

$$E_t R_{kt+1} - R_f = \lambda_{Mt}\beta_{Mt} + \lambda_{SKDt}\beta_{SKDt} + \lambda_{VOLt}\beta_{VOLt}, \tag{17}$$

with

$$\lambda_{VOLt} = \frac{(1-\rho)}{\wp^2} R_f^2\, Var_t(\sigma_{Mt+1}^2), \qquad \beta_{VOLt} = \frac{Cov_t(\sigma_{Mt+1}^2, R_{kt+1})}{Var_t(\sigma_{Mt+1}^2)}, \tag{18}$$

where $\lambda_{Mt}$ and $\lambda_{SKDt}$ are defined in (15) and $\lambda_{VOLt}$ is defined in (18).
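The link between the pricing kernel (13) and the three-factor premium (17) can be sketched in a few lines. This is a reconstruction of the step rather than the Appendix proof, and it reads $\sigma_{Mt+1}^2$ as known at time $t+1$, so that $E_{t+1}\sigma_{Mt+1}^2 = \sigma_{Mt+1}^2$.

```latex
% Sketch only: T = t+2, \nu = t+1, with \sigma^2_{Mt+1} treated as known at t+1.
\begin{align*}
E_t[m_{t,t+1} R_{kt+1}] = 1,\quad E_t[m_{t,t+1}] = \frac{1}{R_f}
  \;&\Longrightarrow\;
  E_t R_{kt+1} - R_f = -R_f\, Cov_t(m_{t,t+1}, R_{kt+1}), \\
\text{from (9): } D_0 = -\frac{1}{\wp},\quad
  D_1 = \frac{\rho}{\wp^2} R_f,\quad
  D_{2,t+1} &= \frac{(\rho-1)}{\wp^2} R_f, \\
E_t R_{kt+1} - R_f
  &= \frac{R_f}{\wp}\, Cov_t(r_{Mt+1}, R_{kt+1})
   - \frac{\rho}{\wp^2} R_f^2\, Cov_t(r_{Mt+1}^2, R_{kt+1}) \\
  &\quad + \frac{(1-\rho)}{\wp^2} R_f^2\, Cov_t(\sigma_{Mt+1}^2, R_{kt+1}) \\
  &= \lambda_{Mt}\beta_{Mt} + \lambda_{SKDt}\beta_{SKDt} + \lambda_{VOLt}\beta_{VOLt}.
\end{align*}
```

The last line follows by multiplying and dividing each covariance by the scaling used in (16) and (18), namely $\sigma_{Mt}^2$, $\sigma_{Mt}^3$ and $Var_t(\sigma_{Mt+1}^2)$.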

For $T = t+2$, the resulting asset pricing model is a generalization of Rubinstein (1973), Kraus and Litzenberger (1976), Harvey and Siddique (2000), and Ang et al. (2006). Since the mean risk aversion $1/\wp$ and the mean skewness preference $\rho$ are positive, $\lambda_{SKDt}$ is negative, and assets with negative $\beta_{SKDt}$ (coskewness) have higher expected returns than assets with positive coskewness and identical characteristics. The asset’s risk premium decomposition shows that the prices of coskewness and volatility risk have a structural interpretation in terms of the average values of investor risk tolerance and skewness preferences. My results indicate that the prices of coskewness and volatility risk are restricted by investor risk aversion and skewness preference. Moreover, the sign of the price of market volatility risk depends on the average values of investor preferences. If the cross-sectional average of investor skewness preference $\rho$ is larger than one, then the price $\lambda_{VOLt}$ of the market volatility risk is negative.
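The sign restrictions discussed above follow directly from equations (15) and (18), and a short calculation makes them visible. The sketch below uses hypothetical inputs (risk aversion 4, market standard deviation 0.05, variance of market variance 1e-6, gross risk-free return 1.0, and illustrative betas); none of these numbers is taken from the paper’s estimates.

```python
def prices_of_risk(risk_aversion, rho, rf, sigma_m, var_of_var):
    """Prices of market, coskewness and volatility risk, equations (15) and (18).

    risk_aversion is 1/wp; var_of_var is Var_t(sigma^2_{M,t+1}).
    """
    wp = 1.0 / risk_aversion
    lam_m = rf * sigma_m**2 / wp
    lam_skd = -rho / wp**2 * rf**2 * sigma_m**3
    lam_vol = (1.0 - rho) / wp**2 * rf**2 * var_of_var
    return lam_m, lam_skd, lam_vol


def expected_excess_return(lambdas, betas):
    """Two-period risk premium of equation (17): sum of price times loading."""
    return sum(lam * beta for lam, beta in zip(lambdas, betas))


# Hypothetical inputs, chosen only to show how the sign of lam_vol flips at rho = 1.
for rho in (0.5, 1.0, 1.5):
    lambdas = prices_of_risk(4.0, rho, 1.0, 0.05, 1e-6)
    premium = expected_excess_return(lambdas, betas=(1.0, -0.2, -0.5))
    print(rho, lambdas, premium)
```

The price of coskewness risk is negative for any positive skewness preference, while the price of volatility risk is positive, zero or negative as the skewness preference is below, equal to or above one.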

The main prediction of this model is that stocks with different loadings on volatility risk have different average returns, and high volatility is associated with high future expected returns. In contrast, when $\rho$ is lower than one, the price of the market volatility risk is positive. Ang et al. (2006) use the risk premium specification to estimate the price of the market volatility. However, their approach leaves open the question of what values of investor risk aversion and skewness preference are consistent with the prices of coskewness and volatility risk. My theoretical model makes it possible to address this issue. I am interested in estimating investor risk aversion and skewness preference, and then using these structural preference parameters to estimate the market prices of both coskewness and volatility risk.

To examine the sources of the variance risk premium, I consider the pricing kernel (13) generated by my two-period model with $T = t+2$. I use (13) to derive the market variance premium as⁵

$$\sigma_{Mt}^2 - \sigma_{Mt}^{*2} = \alpha_{Mt}^2 + \lambda_{Mt}\sigma_{Mt}S_{Mt} + \lambda_{SKDt}\sigma_{Mt}(K_{Mt} - 1) + \lambda_{VOLt}\upsilon_{Mt}, \tag{19}$$

with

$$\upsilon_{Mt} = \frac{Cov_t(\sigma_{Mt+1}^2, r_{Mt+1}^2)}{Var_t(\sigma_{Mt+1}^2)},$$

where $\sigma_{Mt}^{*2}$ is the variance of the market return under the risk-neutral measure, $\alpha_{Mt}$ is the expected excess return on the market, and $\sigma_{Mt}$, $S_{Mt}$ and $K_{Mt}$ represent the standard deviation, the skewness and the kurtosis of the market return. Carr and Wu (2008) propose a direct and robust method for quantifying the variance risk premium on financial assets. As shown in equation (19), my model examines the sources of the variance premium. The variance premium can be attributed to exposure to higher moments such as skewness, kurtosis and the correlation between the market volatility and the squared market return; to the risk-averse behavior of investors; and to investor preferences for skewness. At a basic level, equation (19) shows that the variance premium is related to nonzero risk aversion ($1/\wp$). The result in equation (19) has special relevance. At a theoretical level, the equation states that there are three sources of a negative market variance premium. First, a negative skewness ($S_{Mt}$) and a high positive kurtosis ($K_{Mt}$) cause the variance premium to be more negative. Second, if the average skewness preference $\rho$ is higher than one, then a high correlation of the market volatility with the squared market return causes the variance risk premium to be even more negative. Third, for a given risk aversion level, raising the level of skewness preferences could generate a more pronounced negative variance premium.

⁵ See the proof in the Appendix.
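The decomposition in equation (19) can be evaluated term by term once the prices of risk and the market moments are given. The following sketch uses hypothetical inputs throughout (the prices of risk, the market moments and the value of υ are illustrative numbers, not estimates from the paper), and the function name is an assumption.

```python
def market_variance_premium(alpha_m, sigma_m, skew_m, kurt_m, upsilon_m,
                            lam_m, lam_skd, lam_vol):
    """Right-hand side of equation (19): physical minus risk-neutral variance."""
    return (alpha_m**2
            + lam_m * sigma_m * skew_m
            + lam_skd * sigma_m * (kurt_m - 1.0)
            + lam_vol * upsilon_m)


# Hypothetical prices of risk (consistent with risk aversion 4 and rho = 1.5
# under equations (15) and (18) with sigma_m = 0.05, Rf = 1, Var(sigma^2) = 1e-6).
lam_m, lam_skd, lam_vol = 0.01, -0.003, -8e-6

base = market_variance_premium(0.005, 0.05, -0.5, 4.0, 0.5,
                               lam_m, lam_skd, lam_vol)
more_skewed = market_variance_premium(0.005, 0.05, -1.0, 4.0, 0.5,
                                      lam_m, lam_skd, lam_vol)
fatter_tails = market_variance_premium(0.005, 0.05, -0.5, 6.0, 0.5,
                                       lam_m, lam_skd, lam_vol)
print(base, more_skewed, fatter_tails)
```

With these inputs the premium is negative, and it becomes more negative when the market return is more negatively skewed or has higher kurtosis, in line with the first source listed above.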

Comparing the expression in equation (19) to the expression for the equity premium in equation (17) suggests that the variance risk premium should serve as a useful predictor of actual realized returns. My paper is one of several recent papers that seek to explain the relation between volatility risk and expected returns. Bollerslev et al. (2008) present a stylized general equilibrium model designed to illuminate theoretical linkages between financial market volatility and expected returns. Their model involves a standard endowment economy with Epstein-Zin-Weil recursive preferences. Drechsler and Yaron (2008) propose a more elaborate model. Nyberg and Wilhelmsson (2008) test whether innovations in investor risk aversion are a priced factor in the stock market, as predicted by models incorporating habit formation in preferences. Their proxy for time-varying risk aversion is based on the volatility risk premium series constructed by Bollerslev et al. (2008).

My paper is also related to recent theories that examine the theoretical link between idiosyncratic
skewness and the asset’s expected excess return. To compare my model’s prediction to the
predictions in these papers, I assume that the assets $j = 1, \ldots, n$ with $j \neq k$ are normally
distributed but asset $k$ is not. I denote by $IS_{kt} = E_t\varepsilon_{kt+1}^{3}$ the idiosyncratic skewness and assume
that $E(R_{kt+1}-E_tR_{kt+1})(R_{jt+1}-E_tR_{jt+1})(R_{it+1}-E_tR_{it+1}) = IS_{kt}$ for $i = j = k$ and 0 otherwise.
Similar to Barberis and Huang (2007) and Mitton and Vorkink (2007), this restriction on the
asset’s skewness allows me to isolate the potential impact that idiosyncratic skewness may have on the
expected excess return. Under these restrictions, my two-period model predicts that the expected
excess return on asset $k$ is⁶

$$E_t R_{kt+1} - R_f = \lambda_{Mt}\beta_{Mt} - \frac{\rho}{\wp^{2}} R_f^{2}\,\omega_k^{2}\, IS_{kt} + \lambda_{VOLt}\beta_{VOLt} \qquad (20)$$

where ωk is the weight of asset k in the market portfolio. Since ρ > 0, equation (20) suggests that

stocks with high idiosyncratic skewness have low expected excess return on average. This is consis-

tent with Mitton and Vorkink (2007). Recent theories investigate whether idiosyncratic skewness is

related to the expected excess return. Each theory starts from a different set of assumptions. Boyer,

Mitton and Vorkink (2008) find that expected skewness helps explain the phenomenon that stocks

with high idiosyncratic volatility have low expected returns. Barberis and Huang (2007) show

that when investors have cumulative prospect theory preferences, stocks with greater idiosyncratic

skewness may have lower average returns. Brunnermeier and Parker (2005) and Brunnermeier,
Gollier, and Parker (2007) solve an endogenous probabilities model that produces similar asset
pricing implications for individual asset skewness.

⁶ The proof is straightforward from (17). I write the market return $R_{Mt+1}$ as $\sum_{j=1}^{n} \omega_j R_{jt+1}$. I replace the market return in the asset’s coskewness that appears in equation (17), and use the underlying assumptions to simplify this coskewness and obtain the final result.
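To make the sign of the idiosyncratic skewness term in equation (20) concrete, the following minimal Python sketch evaluates the right-hand side of (20) for given inputs. The function name, the argument names and all numerical values are illustrative assumptions rather than estimates from this paper; `wp` stands for the symbol ℘, whose inverse is the risk aversion.

```python
def expected_excess_return(lambda_m, beta_m, lambda_vol, beta_vol,
                           rho, wp, r_f, weight_k, idio_skew_k):
    """Right-hand side of equation (20) for asset k (illustrative helper)."""
    # Skewness term: (rho / wp^2) * R_f^2 * omega_k^2 * IS_kt
    skew_term = (rho / wp ** 2) * r_f ** 2 * weight_k ** 2 * idio_skew_k
    return lambda_m * beta_m - skew_term + lambda_vol * beta_vol


# Hypothetical inputs, chosen only to show the direction of the effect.
common = dict(lambda_m=0.005, beta_m=1.0, lambda_vol=-0.0003, beta_vol=0.5,
              rho=2.0, wp=0.25, r_f=1.003, weight_k=0.02)
no_skew = expected_excess_return(idio_skew_k=0.0, **common)
high_skew = expected_excess_return(idio_skew_k=0.001, **common)
```

With ρ > 0, raising `idio_skew_k` while holding everything else fixed lowers the value returned, which is the comparative static discussed above.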

IV. The Empirical Framework

A. Estimation Methods

My main goal is to estimate the cross-sectional average of risk aversions and skewness preferences,

then check whether these preference parameters are reasonable, and then use these values to recover

the price of coskewness and volatility risk. To compare my results with those in recent studies on

the pricing of volatility risk, I consider the two-period model with T = t + 2. I assume that the

investor invests at time t in the asset k for one period and then reinvests the payoff in the risk-free

asset for the remaining period. This assumption leads me to collect the vector of errors

$$\varepsilon_{t+1} = m_t^{T} R_t^{T} - I_n, \qquad (21)$$

where $R_t^{T}$ is the vector of risky asset returns over these two periods. This vector contains elements
of the form $R_{kt+1}R_f$ where $R_{kt+1}$ is the return from time $t$ to $t+1$.⁷ The Euler equation will be

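$$E_t\, m_{t,t+1}\, m_{t+1,t+2}\, R_{kt+1} R_f = 1 \qquad (22)$$

Because $E_{t+1} m_{t+1,t+2} R_f = 1$ for the risk-free asset, this allows me to simplify the Euler equation to $E_t m_{t,t+1} R_{kt+1} = 1$ with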

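$$m_{t,t+1} = \frac{1}{R_f} - \frac{1}{\wp}\, r_{Mt+1} + \frac{\rho}{\wp^{2}} R_f \left[ r_{Mt+1}^{2} - E_t r_{Mt+1}^{2} \right] + \frac{(\rho - 1)}{\wp^{2}} R_f \left[ \sigma_{Mt+1}^{2} - E_t \sigma_{Mt+1}^{2} \right], \qquad (23)$$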
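The pricing kernel in equation (23) can be evaluated directly once the preference parameters and the conditional expectations are given. The sketch below is only an illustration under stated assumptions: the function and argument names are hypothetical, `wp` stands for the symbol ℘, and the conditional expectations `exp_r_m_sq` and `exp_var_m` are passed in as known numbers rather than estimated.

```python
import numpy as np


def pricing_kernel(r_m, var_m, r_f, wp, rho, exp_r_m_sq, exp_var_m):
    """Evaluate equation (23) for market excess returns and market variances."""
    r_m = np.asarray(r_m, dtype=float)
    var_m = np.asarray(var_m, dtype=float)
    return (1.0 / r_f
            - (1.0 / wp) * r_m                                   # market term
            + (rho / wp ** 2) * r_f * (r_m ** 2 - exp_r_m_sq)    # coskewness term
            + ((rho - 1.0) / wp ** 2) * r_f * (var_m - exp_var_m))  # volatility term


# Hypothetical parameter values: risk aversion 1/wp = 4 and rho = 2.
params = dict(r_f=1.003, wp=0.25, rho=2.0, exp_r_m_sq=0.0025, exp_var_m=0.04)
kernel_on_grid = pricing_kernel(np.linspace(-0.05, 0.05, 5), 0.04, **params)
```

For these illustrative values the kernel falls as the market return rises over the grid and, because ρ exceeds one, rises with the market variance.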

Equation (21) implies $E[\varepsilon_{t+1} | Z_t] = 0$, which forms a set of moment conditions that I can utilize
to test the asset pricing model via Hansen’s (1982) generalized method of moments (GMM); $E[\cdot]$
denotes the unconditional expectation operator and $Z_t$ represents the set of instrumental variables. If
the pricing kernel prices returns correctly, then the sample version of the moment conditions is:

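$$g_T(\Theta) = \frac{1}{T}\sum_{\tau=1}^{T} \varepsilon_{\tau}^{(T)} \otimes Z_{\tau}, \qquad (24)$$
where $\Theta = \left(\frac{1}{R_f}, \frac{1}{\tau}, \rho\right)$ is the set of parameters to be estimated. As the sample size $T$ increases,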

gT (Θ) should be sufficiently close to zero. Hansen (1982) shows that a test of model specification

can be obtained by minimizing the quadratic form:

$$J = \arg\min_{\Theta}\; g_T^{\top}(\Theta)\, W_T\, g_T(\Theta),$$

where $W_T$ is the GMM weighting matrix. However, Chapman (1997) shows that the standard
GMM estimator in a Euler equation test may result in acceptance of the pricing kernel due to
noise in the pricing kernel. In a different framework, Hansen and Jagannathan (1997) use the same
criterion function as in the standard GMM approach but specify the weighting matrix as the inverse of the
second-moment matrix of the returns. I follow Dittmar (2002) and use the Hansen and Jagannathan weighting
matrix in the estimation process.

⁷ I also use two-period returns in the form $R_{kt+1}R_{jt+2}$. The results are qualitatively similar and are available on request.
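A minimal sketch of the criterion function may help fix ideas. It assumes a constant instrument (so the Kronecker product with $Z_\tau$ drops out), takes the pricing kernel series as given, and uses the inverse second-moment matrix of gross returns as the weighting matrix. The function name and inputs are hypothetical, and this is not the estimation code used for the tables.

```python
import numpy as np


def hj_objective(m, gross_returns):
    """Quadratic form g' W g with W the inverse second-moment matrix of returns."""
    m = np.asarray(m, dtype=float)               # kernel realizations, shape (T,)
    R = np.asarray(gross_returns, dtype=float)   # gross returns, shape (T, n)
    errors = m[:, None] * R - 1.0                # pricing errors m * R - 1
    g = errors.mean(axis=0)                      # sample moments, constant instrument
    W = np.linalg.inv(R.T @ R / R.shape[0])      # Hansen-Jagannathan weighting matrix
    return float(g @ W @ g)
```

In an estimation, this value would be minimized over the parameters that enter the kernel, for example with a numerical optimizer; that step is omitted here.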

B. Data

I use the 30 monthly industry portfolios obtained from Kenneth French’s website.⁸ The sample

period is from January 1986 to December 2006, and yields a total of 252 observations. For the

market portfolio, I use the value-weighted NYSE/AMEX/NASDAQ index. This index is also known

as the value-weighted index of the Center for Research in Security Prices (CRSP).

As a proxy for the volatility of the market return, I use option-implied volatility estimators.
The Chicago Board Options Exchange (CBOE)’s VXO implied volatility index provides investors

with up-to-the-minute market estimates of expected volatility by using the real-time S&P 100 index

option bid/ask quotes. The VXO is a weighted index of American implied volatilities calculated

from eight near-the-money, near-to-expiry, S&P 100 call and put options based on the Black-

Scholes (1973) pricing formula.

I also use the CBOE’s newer VIX index, which is obtained from the European style S&P 500

index option prices and which is based on “model-free” implied volatilities. The VIX incorporates

information from the volatility skew by using a wider range of strike prices rather than just at-the-

money series. I use historical monthly data on the VIX from 1990 to 2006.

As an alternative to the VXO and VIX indexes, I use the Realized Volatility (RV). Several

recent studies have argued for the use of so-called “model-free” realized variances computed by
the summation of high-frequency intraday squared returns. These measures generally afford much
more accurate ex post observations on the actual return variation than the more traditional sample
variances based on daily or coarser frequency return observations. I follow Bollerslev et al. (2008)
to construct my “model-free” realized volatility measure. Bollerslev et al. (2008) use intraday data
for the S&P 500 composite index to construct the realized variance measure.⁹
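The realized variance described above is simply a sum of squared high-frequency returns. The short sketch below illustrates the calculation; the function names and the sample returns are hypothetical, and it does not reproduce the sampling frequency or any adjustments used by Bollerslev et al. (2008).

```python
import math


def realized_variance(intraday_returns):
    """Sum of squared intraday returns over the chosen period."""
    return sum(r * r for r in intraday_returns)


def realized_volatility(intraday_returns):
    """Square root of the realized variance."""
    return math.sqrt(realized_variance(intraday_returns))


# Three hypothetical intraday returns.
example_rv = realized_variance([0.01, -0.02, 0.01])
```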

C. Results

I estimate the mean of the pricing kernel, the cross-sectional average of investor risk aversions

and the cross-sectional average of investor skewness preferences. I then use these parameters to

compute the price of the market, coskewness and volatility risk. I investigate three different sub-

sample periods.

Table I presents results of GMM tests when I estimate the pricing kernel (23). I estimate the

preference parameters by using the Hansen and Jagannathan (1997) weighting matrix. Column

(1) shows the mean of the pricing kernel and columns (2) and (3) present the risk aversion and
skewness preferences, respectively. Column (4) gives the Hansen and Jagannathan (1997) distance
measure with p-values for the test of model specification in parentheses. Columns (5) through (7)
present the implied price of market, coskewness and volatility risk that I obtain when I use the
estimated preference parameters. P-values for tests of the coefficients appear in parentheses.

⁸ I thank Kenneth French for making a large amount of historical data publicly available in his online data library.
⁹ I thank Bollerslev et al. (2008) for making the monthly time-series of realized volatility available.

The set of returns used covers different sample periods. As my proxy for the volatility of the

market return, I use the CBOE’s VXO and VIX implied volatilities and the realized volatility (RV).

Panel A presents the results when I use the VXO. I show that the estimated coefficients for the

risk aversion are reasonable and range from two to four. The p-values indicate that most of the

estimated risk aversions are statistically significant at the 5% level, except for the short sample

period January 1996 to December 2006. The distance measure and p-values suggest that the

estimated pricing kernel cannot be rejected at the 5% level when I use the sample periods January

1986 to December 2000 and January 1986 to December 2006. Panel A also reports the implied

price of the market, coskewness and market volatility risk. The price of the market is positive and

ranges from 6% to 10% per year. The price of coskewness risk is negative and ranges from -1.87%

to -1.3% per year and the price of the market volatility ranges from -0.4% to -0.2% per year. The

sign of the prices of market, coskewness, and volatility risk is consistent with my model’s prediction

and the results in previous studies. For the sample period from January 1986 to December 2000,

I find that the price of the volatility risk is about -0.38% annually. To test whether these prices

of risk are statistically significant, I use the Delta-method to compute t-statistics of the prices and

find that the price of the market risk is significant (with p-value of 0.044), but that the prices of

coskewness and volatility risk are not significant. The results are not reported.¹⁰

However, this result does not suggest that the prices of volatility and coskewness risks are not

significant. The main reason for this finding is that the Delta method is based on a linear approxi-

mation of the price of risk in terms of risk aversion and skewness preference. Linear approximations

do not incorporate nonlinear components of the price of risk. While the price of the market is linear

in the risk aversion, the price of coskewness and volatility risk are highly nonlinear functions of

both risk aversion and skewness preference. My results indicate that the price of the volatility risk,
$\lambda_{VOLt}$, is a nonlinear function of the risk aversion and skewness preference parameters, which turn
out to be significant at the 5% level. As a result, the volatility risk premium defined in equation (19)
is significantly different from zero and is time-varying.¹¹
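To illustrate why the Delta method is a first-order device, the sketch below computes a Delta-method standard error for a generic function of the estimated parameters using a numerical gradient. Everything here is an assumption made for illustration: the function name, the example function `g` (which only mimics the (ρ − 1)/℘² coefficient in equation (23) and is not the formula for the price of volatility risk), the parameter values and the covariance matrix.

```python
import numpy as np


def delta_method_se(g, theta, cov_theta, eps=1e-6):
    """First-order (Delta-method) standard error of g(theta)."""
    theta = np.asarray(theta, dtype=float)
    grad = np.empty_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        # Central finite difference for the i-th partial derivative.
        grad[i] = (g(theta + step) - g(theta - step)) / (2.0 * eps)
    return float(np.sqrt(grad @ np.asarray(cov_theta, dtype=float) @ grad))


# Hypothetical nonlinear function of (risk aversion, skewness preference).
def g(theta):
    risk_aversion, rho = theta
    return (rho - 1.0) * risk_aversion ** 2


se = delta_method_se(g, [4.0, 2.0], [[0.25, 0.0], [0.0, 0.04]])
```

Only the gradient at the point estimate enters the calculation, so any curvature of `g` in the parameters is ignored, which is the limitation discussed above.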

To investigate how changes in skewness and risk aversion parameters cause changes in the

prices of the volatility and coskewness risk, I examine ten-year windows, each yielding a total of 120
observations. Every year, from 1996 to 2006, I use the past ten years’ data and estimate the pricing
kernel. The risk aversion and skewness preference are mostly statistically significant.¹²

¹⁰ Results are not tabulated but are available on request.
¹¹ There are two sources of time-variation in the volatility premium (19): (i) time-variation in the price of the volatility risk $\lambda_{VOLt}$ and/or (ii) time-variation in the conditional moments of the market return.
¹² Results are not tabulated but are available on request.

Figure 1 plots the time series of the estimated risk aversion and skewness preference. As is
evident from the figure, both the risk aversion and the price of market volatility risk are somewhat

higher in magnitude during the 1998 to 2000 part of the sample. This result suggests that high risk

aversion may imply a high price of market variance risk and consequently a high volatility premium.

This figure also shows that the risk aversion estimated is stable during the 2002 to 2006 part of the

sample, but the price of the market volatility is somewhat higher in magnitude, particularly during

the 2004 to 2006 part of the period. Over the same period, the estimated skewness preference

is higher (ranging from one to 2.4). This result suggests that changes in the price of the market

volatility and hence in the volatility premium, could also be caused by changes in investor skewness

preference while their risk aversion is stable. Figure 1 also plots the price of coskewness risk, and
shows that the prices of coskewness and volatility risk tend to move together during the
1998 to 2001 and 2002 to 2006 periods. There is at least one explanation for this co-movement. The

price of volatility risk is a sum of two quantities: the first is the price of coskewness risk and the

second is the square of the risk aversion parameter.

Panel B in Table I reports the results when I use the VIX. With this measure, the risk aversion

parameter ranges from two to 4.75, which represents a marginal increase compared to the estimated

values with the old volatility measure VXO. Both the skewness preference and risk aversion are

statistically significant. The prices of both the coskewness and volatility risk are slightly higher in

magnitude compared to the implied prices of risks when I use the VXO, except for the sample period

January 1996 to December 2006. The price of the volatility risk ranges from -1.2% to -0.62% per

year, and the price of coskewness risk ranges from -2.86% to -1.7% per year. The distance measure

and p-values suggest that the estimated pricing kernel is rejected at the 5% level.

In Figure 2, I plot the time-series of risk aversion, skewness preference, prices of coskewness

and volatility risk. The results in this figure confirm my previous findings that periods of high
volatility premium are due to high risk aversion or high skewness preference or both.

Panel C in Table I reports the results when I use the realized volatility measure (RV). With

this new measure, the risk aversion parameter ranges from 2.50 to 4.75. The prices of coskewness

(volatility) risk are lower (higher) in magnitude compared to the implied prices of risks when I use

the VXO and VIX, except for the sample period January 1990 to December 2000. The distance

measure and p-values suggest that the estimated pricing kernel is rejected at the 5% level, except for

the sample period January 1990 to December 2006.

C.1. Controlling for Size, Book-to-Market, Momentum Factors

I investigate the robustness of my results by estimating an alternative specification of the pricing

kernel that incorporates the Fama and French (1993) size and book-to-market characteristics. I aug-

ment the pricing kernel with the size and book-to-market factors. I also control for the momentum

factor of Jegadeesh and Titman (1993). The augmented pricing kernel has the form

$$m_{t,t+1}^{*} = m_{t,t+1} + D_3\, r_{SMBt+1} + D_4\, r_{HMLt+1} + D_5\, r_{MOMt+1}. \qquad (25)$$


In this pricing kernel, $r_{SMBt+1}$ represents the excess return on a portfolio of small-cap stocks over
large-cap stocks, $r_{HMLt+1}$ represents the excess return on a portfolio of high book-to-market stocks
over low book-to-market stocks, and $r_{MOMt+1}$ represents the return on the momentum portfolio of
Jegadeesh and Titman (1993).¹³

Table II reports the estimated risk aversion and skewness preference, and the implied price of

market, coskewness and volatility risk.¹⁴ Panel A presents the results when I use the VXO as my

proxy for the market volatility. The table shows that the estimated risk aversion ranges from four to

4.3, and that the results are significant at the 10% level. The skewness preferences are also statistically

significant, except for the sample period January 1996 to December 2006. The implied price of

the volatility risk ranges from -1.04% to -0.19%, while the price of coskewness ranges from -3.2%

to -1.96% per year. Panel B reports the results when I use the VIX as my proxy for the market

volatility. The estimated risk aversion ranges from 3.4 to 4.6 and is significant at the 5% level

when the full sample is used in the estimation process. The implied price of coskewness risk ranges

from -2.95% to -1.7% per year, while the implied price of the market volatility risk ranges from

-0.94% to -0.89% per year. After controlling for the size, book-to-market and momentum factors,

I find that both the estimated risk aversions and skewness preferences and the implied prices of

coskewness and market volatility risk are in a reasonable range. Panel C in Table II reports the
results when I use the realized volatility measure (RV). With this new measure, the risk aversion
parameter ranges from 3.62 to 3.99. Both the skewness preference and risk aversion coefficients are
mostly statistically significant. The prices of coskewness (volatility) risk are lower (higher) in

magnitude compared to the implied prices of risks when I use the VXO and VIX, except for the

sample period January 1990 to December 2000. The distance measure and p-values suggest that

the estimated pricing kernel is rejected at the 5% level, except for the sample period January 1990 to

December 2006.

C.2. Explaining the Puzzling Behavior of Pricing Kernels

To gauge the ability of the estimated pricing kernel to explain recent puzzles documented in studies

mentioned earlier, I use a setting with standard preferences and static prices of risk. By doing so,

I can interpret the pricing kernel as a scaled marginal utility. Under these assumptions, to be

consistent with positive marginal utility and the no-arbitrage condition, the pricing kernel should
be positive, and to be consistent with decreasing absolute risk aversion it should be decreasing in
aggregate wealth (market return).

I plot my estimated pricing kernel as a function of the market return and the volatility of the market
return. The estimated mean pricing kernel, risk aversion and skewness preference are those
reported in Table I. Figures 3 and 4 depict the estimated pricing kernel when I use the VXO and
VIX as my proxies for market volatility. The support for the graphs is the range of the returns on
the value-weighted index and the implied volatility difference. These figures show that the pricing
kernel is decreasing as the market return increases and is increasing when the market volatility
increases. This result makes my pricing kernel consistent with preference theory.

¹³ I also use the Pastor and Stambaugh (2003) liquidity factor. The results are qualitatively similar. Therefore I do not report the results with the liquidity factor. These results are available on request.
¹⁴ I do not report the coefficients of the size, book-to-market and momentum priced factors. Results are available on request.

To further examine this suggestion, I project the estimated pricing kernel on a polynomial
function of the market return alone. The projected pricing kernel has the form
$m_{t,t+1} = \sum_{j=0}^{n} b_j r_{Mt+1}^{j}$. I use different values for $n$ (three, four, and five) and find that the
results remain unchanged. Therefore, I present only the result for $n$ equal to five. Figures 5 and 6
depict the projected pricing kernel when I use the VXO and the VIX. The support for the graphs
is the range of the returns. These figures show that, for various sub-samples, the projected pricing
kernel increases when the aggregate wealth (market return) increases.
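The projection described above amounts to a least-squares fit of the kernel series on powers of the market return. The following sketch shows one way to compute the coefficients $b_0, \ldots, b_n$ with NumPy; the function names and the data are hypothetical, and an ordinary least-squares fit is assumed here as the projection.

```python
import numpy as np


def project_on_market_polynomial(m, r_m, degree=5):
    """Least-squares coefficients b_0..b_n of m on powers of r_m."""
    # np.polyfit orders coefficients from the highest power down, so reverse them.
    return np.polyfit(r_m, m, degree)[::-1]


def evaluate_projection(coeffs, r_m):
    """Evaluate sum_j b_j * r_m**j at the given market returns."""
    return np.polynomial.polynomial.polyval(r_m, coeffs)


# Hypothetical kernel series that is exactly quadratic in the market return.
r_grid = np.linspace(-0.2, 0.2, 50)
m_series = 1.0 - 2.0 * r_grid + 3.0 * r_grid ** 2
b = project_on_market_polynomial(m_series, r_grid, degree=2)
```

Plotting `evaluate_projection(b, r_grid)` against `r_grid` gives the kind of curve shown in the figures, with the market volatility factor left out by construction.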

As an alternative measure to the VXO and VIX indexes, I also use the realized volatility measure
(RV). The results are qualitatively similar but are not reported.¹⁵ My finding suggests that the
missing market volatility factor in the pricing kernel and a lack of a structural interpretation of
the pricing kernel in terms of investor preferences are plausible explanations for the pricing kernel
puzzle. My estimated risk aversion and skewness preference are reasonable and, more importantly,
the implied prices of coskewness and market volatility have the expected sign and are within a
reasonable range.¹⁶

D. Robustness to the Test Assets

As a robustness check, I use the 30 stocks in the latest composition of the Dow Jones Industrial Average.

The company names and summary statistics are presented in Table III. I use monthly returns

on Dow Jones 30 stocks from January 1990 to December 2006. Table IV reports the estimated

risk aversion and skewness preference, and the implied price of market, coskewness and volatility

risk. Panel A presents the results when I use the VXO, VIX, and RV as my proxy for the market

volatility. The table shows that the estimated risk aversion ranges from 3.78 to 5.75, and that

the results are significant at the 5% level. The skewness preferences are also statistically significant.

The implied price of the volatility risk ranges from -0.91% to -0.14%, while the price of coskewness

ranges from -2.58% to -2.09% per year. Panel B reports the results when I control for the Fama

and French and momentum factors. The estimated risk aversion ranges from 4.61 to 5.4 and is

significant at the 5% level. The implied price of the volatility risk ranges from -1.30% to -0.87%,

while the price of coskewness ranges from -2.69% to -2.34% per year.

¹⁵ The results of the projected pricing kernels are available on request.
¹⁶ Chabi-Yo et al. (2008) argue that state dependence in preferences and fundamentals could be the cause of the pricing kernel puzzle. Brown and Jackwerth (2004) provide a model that generates the pricing kernel puzzle, albeit only for parameter constellations which are not typically observed in the real world.

V. Sources of the Idiosyncratic Volatility Premium

In the previous sections, I examine the sources of the market volatility risk premium. In this

section, I first examine the sources of the idiosyncratic volatility premium. In an economy where

the pricing kernel is a linear function of priced factors, I find that nonzero risk aversion and firms’

non-systematic coskewness determine the premium on idiosyncratic volatility risk. I interpret the

non-systematic coskewness as the non-systematic component of asset’s skewness that is related to

the market’s portfolio skewness. I empirically show the relevance of this non-systematic coskewness

in explaining the idiosyncratic volatility puzzle put forward in Ang et al. (2006, 2008).

I define idiosyncratic risk as the risk that is unique to a specific firm, so I also refer to it as

firm-specific risk. By definition, idiosyncratic risk is independent of the common movement of the

market. To understand the idiosyncratic volatility puzzle, it is first important to understand how

it is priced. Given an asset pricing model with ft+1 as the set of risky factors, the pricing kernel is

$$m_{t,t+1} = \frac{1}{R_f} - \frac{d_t}{R_f}\, f_{t+1} \qquad (26)$$

where the coefficient $d_t$ depends on investor preferences.¹⁷ In a single factor model, $d_t$ represents the
risk aversion coefficient. I define the idiosyncratic variance risk premium as the difference between
the idiosyncratic volatility under the objective and risk neutral measures.¹⁸ Given the pricing kernel

(26), the idiosyncratic variance premium is

$$ivp_{kt} = \sigma_{\varepsilon kt}^{2} - \sigma_{\varepsilon kt}^{*2} = -R_f\, Cov_t\!\left(m_{t,t+1}, \varepsilon_{kt+1}^{2}\right) \qquad (27)$$
where $\varepsilon_{kt+1}$ represents asset $k$’s idiosyncratic shock in the linear regression
$$r_{kt+1} = \alpha_{kt} + \beta_{kt} f_{t+1} + \varepsilon_{kt+1}. \qquad (28)$$
Here $\sigma_{\varepsilon kt}^{2}$ represents the idiosyncratic volatility of asset $k$ under the objective measure, and $\sigma_{\varepsilon kt}^{*2}$ is the idiosyncratic
volatility of asset $k$ under the risk neutral measure. If the idiosyncratic volatility is priced,
the component (27), which is the premium on the idiosyncratic volatility risk, should be significantly
different from zero. Proposition 2 below gives the conditions under which this idiosyncratic
volatility premium component is zero.

¹⁷ Notice that the pricing kernel derived in Section 1 is a linear function of the risky factors.
¹⁸ Carr and Wu (2008) adopt a similar definition for individual stock variance risk premium.

PROPOSITION 2: The idiosyncratic volatility premium, $ivp_{kt}$, is
$$ivp_{kt} = \lambda_t \gamma_{kt} \qquad (29)$$
with
$$\gamma_{kt} = [Var_t(f_{t+1})]^{-1}\, Cov_t(f_{t+1}, \varepsilon_{kt+1}^{2}) \quad \text{and} \quad \lambda_t = d_t\, [Var_t(f_{t+1})]. \qquad (30)$$
I refer to $\gamma_{kt}$ as the non-systematic coskewness. If the idiosyncratic shock $\varepsilon_{kt+1}$ and the risk factor
$f_{t+1}$ are jointly normally distributed, then the non-systematic coskewness is zero. As a result,
the non-systematic variance premium is zero and the idiosyncratic volatility risk is not priced.

Proof: See the Appendix.
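As a rough consistency check (not a substitute for the proof in the Appendix), equations (29) and (30) can be recovered by substituting the linear kernel (26) into the definition (27). The sketch below assumes a single scalar factor and treats $R_f$ and $d_t$ as known at time $t$, so that constants drop out of the conditional covariance:

```latex
\begin{aligned}
ivp_{kt} &= -R_f\, Cov_t\!\left(\tfrac{1}{R_f} - \tfrac{d_t}{R_f}\, f_{t+1},\ \varepsilon_{kt+1}^{2}\right) \\
         &= d_t\, Cov_t\!\left(f_{t+1},\ \varepsilon_{kt+1}^{2}\right) \\
         &= \underbrace{d_t\, Var_t(f_{t+1})}_{\lambda_t}\;
            \underbrace{[Var_t(f_{t+1})]^{-1}\, Cov_t\!\left(f_{t+1},\ \varepsilon_{kt+1}^{2}\right)}_{\gamma_{kt}} .
\end{aligned}
```

Written this way, the premium vanishes whenever $d_t = 0$ or the factor is uncorrelated with the squared idiosyncratic shock.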

In regression (28), the idiosyncratic shock is uncorrelated with the risk factor $f_{t+1}$. However, the
regression does not tell me whether the higher-order components of the idiosyncratic shock are
uncorrelated with the risk factor $f_{t+1}$. If the idiosyncratic shock and the stock return are jointly
normally distributed, then by using Stein’s Lemma,¹⁹ I can show that the idiosyncratic volatility
risk is not priced and that $\gamma_{kt} = 0$. Idiosyncratic volatility is priced due to the presence of higher
moments in the stock returns and a non-zero risk aversion via $d_t$. To assess how the asset’s
expected excess return varies across stocks with different levels of non-systematic coskewness, I
consider a single factor model in which the market excess return is the only priced factor.²⁰ I do
not make any assumption about the distribution of the market return. Under the restriction that
$cov(\varepsilon_{kt+1}\varepsilon_{jt+1}, r_{mt+1}) = 0$ for $j \neq k$, I show that
$$\frac{\partial \gamma_{kt}}{\partial \alpha_{kt}} = -\frac{2}{d_t\, \sigma_{Mt}^{2}}\, \varpi_k \gamma_k,$$
where $\varpi_k$ represents the weight of asset $k$ in the market portfolio and $d_t$ is the risk aversion
coefficient.²¹ This expression states that the non-systematic coskewness $\gamma_{kt}$ is negatively (positively)
related to the asset’s expected excess return if it is positive (negative). In the subsequent sections,
I show that the non-systematic coskewness is priced, and is helpful in explaining the idiosyncratic
volatility anomaly.

A. Estimating Idiosyncratic Volatility

To investigate this prediction, I use all the NASDAQ, AMEX, and NYSE stocks and consider

industrial firms. The sample period is from January 1971 to December 2006. To reduce the impact

of infrequent trading on idiosyncratic volatility estimates, I require that firms have a minimum of

120 trading days (non-zero observations) in a year. I also exclude equity prices lower than one dollar.

I first compute the idiosyncratic volatility at the end of each month using the past 12 months daily

observations. I use different models to compute the idiosyncratic volatility: the CAPM model, the

Fama and French (1993) three-factor model, the Fama and French three-factor model augmented

with the Jegadeesh and Titman (1993) momentum factor and the Harvey and Siddique (2000)

market coskewness model. I then rank stocks based on their idiosyncratic volatility to form value-

weighted decile portfolios and then hold the portfolios over the next month. I rank the stocks based

on their past idiosyncratic volatility risk into ten groups and form ten value-weighted portfolios in

10% increments from 10% to 100%. Figure 7 depicts the mean average return across deciles. The

figure shows that on average, regardless of the model used to compute the idiosyncratic volatility

risk, stocks with high idiosyncratic risk earn lower returns than do stocks with low idiosyncratic

volatility risk.

¹⁹ See the Stein Lemma in the Appendix.
²⁰ With two additional priced factors, the quantitative results are more complicated, but the conclusions are qualitatively similar. They are available on request.
²¹ See the proof in the Appendix.
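The two steps described in this subsection, estimating idiosyncratic volatility as the residual standard deviation from a factor regression and then sorting stocks into deciles, can be sketched as follows. The function names and inputs are hypothetical, the regression is plain OLS on whatever factors are supplied, and the decile rule shown is one simple convention rather than the exact breakpoints used for the tables.

```python
import numpy as np


def idiosyncratic_volatility(stock_excess, factor_excess):
    """Residual standard deviation from an OLS regression on the factors."""
    y = np.asarray(stock_excess, dtype=float)
    F = np.asarray(factor_excess, dtype=float)
    if F.ndim == 1:
        F = F[:, None]                       # single-factor (CAPM-style) case
    X = np.column_stack([np.ones(len(y)), F])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]                # degrees-of-freedom correction
    return float(np.sqrt(resid @ resid / dof))


def assign_deciles(values):
    """Map each value to a decile from 1 (lowest) to 10 (highest)."""
    values = np.asarray(values, dtype=float)
    ranks = values.argsort().argsort()       # 0 for the smallest value
    return ranks * 10 // len(values) + 1
```

Value-weighted portfolio returns would then be computed within each decile using market capitalizations as weights, a step left out of this sketch.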

The mean average returns reported in Panel A of Table V are strongly and almost monotonically

declining in idiosyncratic volatility risk regardless of which model I use. The mean average returns

for the value-weighted portfolio return with the lowest-idiosyncratic volatility risk (10% Low) are

positive and range from 1.04% to 1.07% per month. The mean average returns for the value-
weighted portfolio return with the highest-idiosyncratic volatility risk (10% High) are significantly

negative and range from -0.22% to -0.16% per month. A long-short portfolio holding the volatile

decile of stocks and shorting the safest decile has a mean average return ranging from -1.29% to

-1.19% with robust Newey-West (1987) t-statistics ranging from -2.64 to -2.48. Panel B reports

each model alpha, showing robust Newey-West t-statistics in square brackets. A long-short portfolio

holding the volatile decile of stocks and shorting the safest decile has an alpha ranging from -1.47%

to -1.37% and robust Newey-West t-statistics ranging from -3.15 to -3.10. When I correct for risk
using the CAPM, the Fama and French (1993) model, the Fama and French (1993) model augmented
with the momentum factor of Jegadeesh and Titman (1993), or the Harvey and Siddique (2000) model,
I worsen the anomalous poor performance of volatile stocks rather than correcting it.
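For readers who want to reproduce the style of inference used for the long-short portfolios, the sketch below computes a Newey-West (1987) t-statistic for the mean of a return series using Bartlett kernel weights. The function name and the default lag length are assumptions; the lag choice behind the reported t-statistics is not stated in this section.

```python
import numpy as np


def newey_west_tstat(x, lags=4):
    """t-statistic for the mean of x with a Newey-West long-run variance."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    u = x - x.mean()
    lrv = u @ u / T                                # lag-0 autocovariance
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)          # Bartlett kernel weight
        lrv += 2.0 * weight * (u[lag:] @ u[:-lag]) / T
    return float(x.mean() / np.sqrt(lrv / T))
```

Applied to the monthly return difference between the highest and lowest deciles, this yields the kind of robust t-statistic quoted above; with `lags=0` it collapses to the usual t-statistic based on the population variance.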

I also correct for the market volatility. I use different measures of the market volatility as

described in Section IV. I use the VXO from January 1986 to December 2006, the VIX from

January 1990 to December 2006 and the realized volatility RV from January 1990 to December

2006. The results are qualitatively similar to those reported in Table V. Consistent with Ang et al.

(2006), the market volatility cannot explain the anomalous poor performance of volatile stocks. I

do not report the results.²²

Since the idiosyncratic anomaly is robust to the models used including the market volatility,

I use the CAPM model as a benchmark to compute the idiosyncratic volatility in the rest of the

paper.

B. Non-Systematic Coskewness and Expected Returns

I find a strong positive cross-sectional relation between the average returns and non-systematic

coskewness when the non-systematic coskewness is negative and there is a strong negative cross-

sectional relation between the average returns and non-systematic coskewness when the non-

systematic coskewness is positive. This finding is consistent with my theoretical motivation. In

this section, I assess how average returns vary across stocks with different levels of non-systematic

coskewness.

B.1. Decile Portfolios in 10% increments from 10% to 100%

My model predicts that if stocks’ non-systematic coskewness is negative (positive), then on average,
stocks with high non-systematic coskewness earn higher (lower) returns than do stocks with low
non-systematic coskewness. To verify my prediction, I use the past 12 months of daily returns to compute
the non-systematic coskewness at the end of each month. I form two groups of stocks, those with
negative and those with positive non-systematic coskewness. Within each group, I rank the stocks based on
their past non-systematic coskewness into ten groups and form ten value-weighted portfolios in 10%
increments from 10% to 100%.

²² The results are available on request.
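A sample analogue of the non-systematic coskewness $\gamma_{kt}$ in equation (30) can be computed from the residuals of regression (28). The sketch below assumes a single factor (the market excess return) and uses population moments; the function name and the toy data are hypothetical.

```python
import numpy as np


def non_systematic_coskewness(stock_excess, market_excess):
    """Sample version of Cov(f, eps^2) / Var(f) using single-factor residuals."""
    y = np.asarray(stock_excess, dtype=float)
    f = np.asarray(market_excess, dtype=float)
    X = np.column_stack([np.ones(len(y)), f])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    eps = y - X @ coef                              # idiosyncratic shocks
    eps_sq = eps ** 2
    cov = np.mean((f - f.mean()) * (eps_sq - eps_sq.mean()))
    return float(cov / np.var(f))                   # np.var uses ddof=0, matching cov


# Toy data: residuals are larger when the factor is high, so gamma is positive.
f_toy = np.array([-1.0, -1.0, 1.0, 1.0])
y_toy = 0.5 + 2.0 * f_toy + np.array([1.0, -1.0, 3.0, -3.0])
gamma_toy = non_systematic_coskewness(y_toy, f_toy)
```

Stocks can then be split by the sign of this estimate and ranked within each group, as described above.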

When the non-systematic coskewness is positive, the mean average returns reported in
Panel A of Table VI are strongly and almost monotonically declining in non-systematic coskewness
risk. The mean average return for the value-weighted portfolio with the lowest non-systematic
coskewness risk (10% Low) is positive at 1.07% per month, and the mean average return for the
value-weighted portfolio with the highest non-systematic coskewness (10% High) is
substantially lower at 0.62% per month. However, a long-short portfolio holding the highest non-systematic
coskewness decile of stocks and shorting the lowest non-systematic coskewness decile has an average
return of -0.45% per month, which is not statistically significant (the t-statistic is -1.04).

When the non-systematic coskewness is negative, the mean average returns reported in Panel
A of Table VI are strongly and almost monotonically increasing in non-systematic coskewness risk.
The average return for the value-weighted portfolio with the lowest non-systematic coskewness risk
(10% Low) is positive at 0.32% per month, and the average return for the value-weighted portfolio
with the highest non-systematic coskewness (10% High) is significantly higher at 1.13% per month.
A long-short portfolio holding the highest non-systematic coskewness decile of stocks and shorting
the lowest non-systematic coskewness decile has an average return of 0.81% per month, which is
statistically significant (the t-statistic is 2.3). These results confirm my model’s prediction that,
when the non-systematic coskewness is negative, stocks with high non-systematic coskewness earn
on average higher returns than do stocks with low non-systematic coskewness.

To correct for the CAPM or the Fama and French (1993) three factors, for each of the ten value-weighted portfolio returns and the “10-1” portfolio returns formed by ranking stocks based on non-systematic coskewness, I run the regression:

rp = α + βf + η, (31)

where f represents the market excess return when I use the CAPM and the Fama and French (1993) three factors when I use the Fama and French (1993) model. I define 10-1 as the difference in returns between portfolio 10 and portfolio 1. When I correct for risk using either the CAPM or the Fama-French (1993) three-factor model, there is a striking variation in alpha across the portfolios in Table VI. First, when the non-systematic coskewness is positive, the value-weighted portfolios with low non-systematic coskewness have highly significant alphas, but the value-weighted portfolios with high non-systematic coskewness have nonsignificant alphas. In contrast, when the non-systematic coskewness is negative, the value-weighted portfolios with low non-systematic coskewness have nonsignificant alphas and the value-weighted portfolios with high non-systematic coskewness have highly significant alphas. Moreover, when the non-systematic coskewness is negative, a long-short portfolio holding the highest non-systematic coskewness decile of stocks and shorting the lowest non-systematic coskewness decile has a highly significant alpha of 1% per month (the t-statistic is 3.101) when I control for the CAPM, and a highly significant alpha of 0.96% per month (the t-statistic is 2.86) when I control for the Fama and French three factors. These results suggest that when the non-systematic coskewness is negative, the Fama-French alpha of the value-weighted portfolio with the highest non-systematic coskewness exceeds that of the value-weighted portfolio with the lowest non-systematic coskewness by about 1% per month.
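For the single-factor (CAPM) case, regression (31) has a closed-form solution, which the following sketch illustrates. It is an assumption-laden toy example rather than the estimation code behind Table VI: the function names are hypothetical, returns are plain Python lists of excess returns, and no standard errors are computed.

```python
def long_short(high, low):
    """Return series of the 10-1 portfolio: long portfolio 10, short portfolio 1."""
    return [h - l for h, l in zip(high, low)]


def capm_alpha_beta(portfolio_excess, market_excess):
    """OLS estimates of alpha and beta in r_p = alpha + beta * f + eta."""
    n = len(portfolio_excess)
    mean_p = sum(portfolio_excess) / n
    mean_f = sum(market_excess) / n
    # beta is the sample covariance of r_p and f divided by the variance of f.
    cov = sum((f - mean_f) * (p - mean_p)
              for f, p in zip(market_excess, portfolio_excess))
    var = sum((f - mean_f) ** 2 for f in market_excess)
    beta = cov / var
    # alpha is the part of the mean return not explained by the factor.
    alpha = mean_p - beta * mean_f
    return alpha, beta
```

Calling `capm_alpha_beta(long_short(r10, r1), market)` gives the alpha and beta of the 10-1 portfolio for hypothetical return series `r10`, `r1`, and `market`.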

B.2. Decile Portfolios and the Tails of the Distribution

Because the non-systematic coskewness is related to the skewness of the return distribution, I

construct value-weighted portfolios that pay greater attention to the tails of the stocks’ distribution. I

form two groups of stocks, those with negative and those with positive non-systematic coskewness.

Within each group, I rank the stocks in percentiles 0-5, 5-20, 20-30, 30-40, 40-50, 50-60, 60-70,

70-80, 80-95, and 95-100 based on their past non-systematic coskewness. Within each percentile, I

form the value-weighted return.
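The unequal percentile breakpoints above can be sketched as follows. This is a hypothetical illustration only: the percentile rank is computed as 100 × rank / n within one sign group, and the treatment of ties and boundary cases is an assumption, since the text does not specify it.

```python
import bisect

# Upper bounds of the percentile groups 0-5, 5-20, ..., 80-95, 95-100.
UPPER_BOUNDS = [5, 20, 30, 40, 50, 60, 70, 80, 95, 100]
LABELS = ["0-5", "5-20", "20-30", "30-40", "40-50",
          "50-60", "60-70", "70-80", "80-95", "95-100"]


def tail_bucket_labels(signals):
    """Assign each stock to a percentile group based on its past signal."""
    n = len(signals)
    order = sorted(range(n), key=lambda i: signals[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        pct = 100.0 * (rank + 1) / n  # percentile rank in (0, 100]
        # First bucket whose upper bound is at least the percentile rank.
        labels[i] = LABELS[bisect.bisect_left(UPPER_BOUNDS, pct)]
    return labels
```

The value-weighted return within each label can then be computed exactly as for the decile portfolios.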

When the non-systematic coskewness is positive, the average returns reported in Panel A of Table VII decline strongly and almost monotonically in non-systematic coskewness risk. The average return on the value-weighted portfolio with the lowest non-systematic coskewness risk (5% Low) is 1.17% per month, and the average return on the value-weighted portfolio with the highest non-systematic coskewness (5% High) is significantly lower, at -0.05% per month. A long-short portfolio holding the highest non-systematic coskewness decile of stocks and shorting the lowest non-systematic coskewness decile has an average return of -1.22% per month, which is statistically significant (the t-statistic is -2.65).

When the non-systematic coskewness is negative, the average returns reported in Panel A of Table VII increase strongly and almost monotonically in non-systematic coskewness risk. The average return on the value-weighted portfolio with the lowest non-systematic coskewness risk (5% Low) is -0.05% per month, and the average return on the value-weighted portfolio with the highest non-systematic coskewness (5% High) is significantly higher, at 1.24% per month. A long-short portfolio holding the highest non-systematic coskewness decile of stocks and shorting the lowest non-systematic coskewness decile has an average return of 1.29% per month, which is statistically significant (the t-statistic is 2.88).

When I correct for risk using either the CAPM or the Fama-French (1993) three-factor model, there is a striking variation in alpha across the portfolios in Table VII. First, when the non-systematic coskewness is positive, the value-weighted portfolios with low non-systematic coskewness have highly significant alphas, but the value-weighted portfolios with high non-systematic coskewness have nonsignificant alphas. In contrast, when the non-systematic coskewness is negative, the value-weighted portfolios with low non-systematic coskewness have nonsignificant alphas and the value-weighted portfolios with high non-systematic coskewness have highly significant alphas.

When the non-systematic coskewness is positive, a long-short portfolio holding the highest non-systematic coskewness decile of stocks and shorting the lowest non-systematic coskewness decile has a highly significant alpha of -1.37% per month (the t-statistic is -3.17) when I control for the CAPM, and a highly significant alpha of -1.30% per month (the t-statistic is -2.82) when I control for the Fama and French three factors. Further, when the non-systematic coskewness is negative, a long-short portfolio holding the highest non-systematic coskewness decile of stocks and shorting the lowest non-systematic coskewness decile has a highly significant alpha of 1.49% per month (the t-statistic is 3.52) when I control for the CAPM, and a highly significant alpha of 1.30% per month (the t-statistic is 2.94) when I control for the Fama and French three factors.

C. Explaining the Low Returns of High Idiosyncratic Volatility Stocks

In this section, I assess how the non-systematic coskewness explains the low returns of high id-

iosyncratic volatility stocks. I first consider only stocks with positive non-systematic coskewness.

Second, I consider stocks with negative non-systematic coskewness. Third, I consider all stocks.

C.1. Positive Non-Systematic Coskewness

To explain the anomalous underperformance of high idiosyncratic volatility stocks, I first consider stocks with positive non-systematic coskewness. Within this group, I construct a long-short portfolio on non-systematic coskewness by holding the highest non-systematic coskewness decile of stocks and shorting the lowest non-systematic coskewness decile. For each of the ten value-weighted portfolio returns, and the 10-1 portfolio returns formed by ranking stocks based on idiosyncratic volatility, I run the regression:

rp = α + βf + βν(ν+ − ν−) + η, (32)

where f represents the market excess return when the CAPM is used and the Fama and French (1993) three factors when I use the Fama and French (1993) three-factor model. I refer to ν− as the value-weighted portfolio return formed with the 5% lowest non-systematic coskewness stocks and to ν+ as the value-weighted portfolio return formed with the 5% highest non-systematic coskewness stocks. I then formulate the following hypothesis based on my theoretical model’s prediction.

H1: Because the expected excess return on the long-short portfolio (ν+ − ν−) is negative, my

model predicts that the non-systematic coskewness factor will explain the idiosyncratic volatility

anomaly if the value-weighted portfolio with high idiosyncratic volatility risk has significant pos-

itive loadings on the long-short portfolio (ν+ − ν−); and that the difference in returns between

the value-weighted portfolio with the highest idiosyncratic volatility risk and the value-weighted

portfolio with the lowest idiosyncratic volatility risk has a non significant alpha and significant

positive loading on the long-short portfolio (ν+ − ν−).

Panel A of Table VIII reports the average returns on value-weighted idiosyncratic portfolios. I

measure the statistics in the columns labeled Mean and Std Dev (standard deviation) in monthly

percentage terms. Standard errors appear in parentheses. Robust Newey-West (1987) t-statistics

appear in square brackets. A long-short portfolio holding the highest idiosyncratic volatility decile of stocks and shorting the lowest idiosyncratic volatility decile of stocks has an average return of -1.29% per month (with a t-statistic of -2.64).
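To make the robust t-statistics concrete, the sketch below computes a Newey-West style t-statistic for the mean of a return series using Bartlett kernel weights. It is a simplified illustration under stated assumptions: the lag length `lags` is a free choice not taken from the text, the function name is hypothetical, and only the mean (not regression coefficients) is handled.

```python
import math


def newey_west_tstat(returns, lags):
    """t-statistic of the sample mean with a Bartlett-kernel long-run variance."""
    n = len(returns)
    mean = sum(returns) / n
    dev = [r - mean for r in returns]

    def autocov(lag):
        # Sample autocovariance at the given lag (divided by n).
        return sum(dev[t] * dev[t - lag] for t in range(lag, n)) / n

    long_run_var = autocov(0)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)  # Bartlett weights decline linearly
        long_run_var += 2.0 * weight * autocov(lag)
    return mean / math.sqrt(long_run_var / n)
```

With `lags=0` the statistic reduces to the usual mean divided by its (population-variance) standard error; positive autocorrelation in the series widens the standard error and lowers the t-statistic.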

Panel B of Table VIII reports the regression alphas and betas when I use the market excess

return and the long-short portfolio (ν+ − ν−) as explanatory variables. The table reports robust

t-statistics in brackets. The alphas reported in the first row of Panel A are all positive. More

importantly, the alpha for the 10-1 portfolio return is -0.16% per month (the t-statistic is -0.39),

and therefore is not significant. Compared with the corresponding figure in Panel A, the difference between the portfolio with the highest idiosyncratic volatility risk and the value-weighted portfolio with the lowest idiosyncratic volatility risk shrinks in magnitude from -1.29% (the t-statistic is -2.64) to -0.16% (the t-statistic is -0.39) in Panel B. Panel B also shows that the alphas follow an inverted, roughly symmetric U-shape across deciles.

In contrast, almost all betas on the long-short portfolio (ν+ − ν−) are positive and highly significant. These betas increase as I move from the value-weighted portfolios with lower idiosyncratic volatility to the value-weighted portfolios with higher idiosyncratic volatility. The betas range from 0.03 to 1.05. With the exception of the value-weighted portfolio with the lowest idiosyncratic volatility, all t-statistics range from 2.60 to 16.34. More importantly, the 10-1 portfolio loads positively on the long-short portfolio (ν+ − ν−), with a beta of 1.01 and a t-statistic of 13.89.

These results indicate that stocks with high idiosyncratic volatility risk have high positive betas

on the long-short return. Since the expected return on the long-short portfolio is negative, it follows

that on average, stocks with high idiosyncratic volatility earn lower returns. Another interpretation

is that stocks with high idiosyncratic volatility have high positive betas on the long-short return and

hence highly positive non-systematic coskewness. As noted in Section B, when the non-systematic coskewness is positive, stocks with highly positive non-systematic coskewness earn, on average, lower returns. Panel C of Table VIII presents similar results when I control for the Fama and French

(1993) three factors.
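As a rough check on the sign argument in this subsection, taking expectations of Equation (32) gives the decomposition below. The numerical line is only a back-of-the-envelope illustration: it assumes, purely for the sake of the example, that the sample mean of (ν+ − ν−) equals the -1.22% per month long-short return reported for Table VII, which need not coincide with the sample used in Table VIII.

```latex
E[r_p] = \alpha + \beta\, E[f] + \beta_\nu\, E[\nu^+ - \nu^-],
\qquad
\beta_\nu\, E[\nu^+ - \nu^-] \approx 1.01 \times (-1.22\%) \approx -1.23\% \text{ per month}.
```

Under that assumption, the loading of the 10-1 portfolio on the long-short factor alone would be of the same order as its raw average return of -1.29% per month.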

C.2. Negative Non-Systematic Coskewness

In this section, I consider stocks with negative non-systematic coskewness. I construct a long-

short portfolio relative to non-systematic coskewness when it is negative. I construct this portfolio

by holding the highest non-systematic coskewness decile of stocks and shorting the lowest non-

systematic coskewness decile. For each of the ten value-weighted portfolio returns, and the 10-1

portfolio returns formed by ranking stocks based on the idiosyncratic volatility, I run the regression:

rp = α + βf + βυ(υ+ − υ−) + η, (33)

where I refer to υ− as the value-weighted portfolio return formed with the 5% lowest non-systematic coskewness stocks and to υ+ as the value-weighted portfolio return formed with the 5% highest non-systematic coskewness stocks. I then formulate the following hypothesis based on my theoretical model’s prediction.

H2: Because the expected excess return on the long-short portfolio (υ+ − υ−) is positive, my

model predicts that the non-systematic coskewness factor will explain the idiosyncratic volatility

anomaly if the value-weighted portfolio with high idiosyncratic volatility risk has significant neg-

ative loadings on the long-short portfolio (υ+ − υ−); and that the difference in returns between

the value-weighted portfolio with the highest idiosyncratic volatility risk and the value-weighted

portfolio with the lowest idiosyncratic volatility risk has a non significant alpha and significant

negative loading on the long-short portfolio (υ+ − υ−).

Panel A of Table IX reports the average returns on value-weighted idiosyncratic portfolios. I

measure the statistics in the columns labeled Mean and Std Dev (standard deviation) in monthly

percentage terms. Standard errors appear in parentheses. Robust Newey-West (1987) t-statistics

appear in square brackets. A long-short portfolio holding the highest idiosyncratic volatility decile of stocks and shorting the lowest idiosyncratic volatility decile of stocks has an average return of -1.35% per month (with a t-statistic of -2.44).

Panel B of Table IX reports the regression alphas and betas when I use the market excess

return and the long-short portfolio (υ+ − υ−) as explanatory variables. The table reports robust

t-statistics in brackets. The alpha for the 10-1 portfolio return is -0.54% per month (the t-statistic

is -1.24), and therefore is not significant. Compared with the corresponding figure in Panel A, the difference between the portfolio with the highest idiosyncratic volatility risk and the value-weighted portfolio with the lowest idiosyncratic volatility risk shrinks in magnitude from -1.35% (the t-statistic is -2.44) to -0.54% (the t-statistic is -1.24) in Panel B.

In contrast, almost all betas on the long-short portfolio (υ+ − υ−) are negative and highly significant. These betas decrease as I move from the value-weighted portfolios with lower idiosyncratic volatility to the value-weighted portfolios with higher idiosyncratic volatility. The betas range from -0.84 to -0.09. With the exception of the value-weighted portfolio with the lowest idiosyncratic volatility, all t-statistics range from -12.99 to -2.09. More importantly, the 10-1 portfolio loads negatively on the long-short portfolio (υ+ − υ−), with a beta of -0.84 and a t-statistic of -12.63.

These results indicate that stocks with high idiosyncratic volatility risk have negative betas on

the long-short return. Since the expected return on the long-short portfolio is positive, it follows

that on average, stocks with high idiosyncratic volatility earn lower returns. Panel C of Table IX

presents similar results when I control for the Fama and French (1993) three factors.

C.3. Negative and Positive Non-Systematic Coskewness

I consider all stocks with positive and negative non-systematic coskewness. I use the value-weighted

decile portfolios that I construct in Section A. For each of the ten value-weighted portfolio returns,

and the 10-1 portfolio returns formed by ranking stocks based on the idiosyncratic volatility, I run

the regression:

rp = α + βf + βν(ν+ − ν−) + βυ(υ+ − υ−) + η, (34)

where f represents the market excess returns when the CAPM is used and the Fama and French

(1993) three-factor when I use the Fama and French (1993) three-factor model. I then formulate

the following hypothesis based on my theoretical model’s prediction.

H3: My model predicts that the non-systematic coskewness factor will explain the idiosyn-

cratic volatility anomaly if the value-weighted portfolio with high idiosyncratic volatility risk has

significant positive loadings on the long-short portfolio (ν+ − ν−), significant negative loading on

the long-short portfolio (υ+ − υ−); and that the difference in returns between the value-weighted

portfolio with the highest idiosyncratic volatility risk and the value-weighted portfolio with the

lowest idiosyncratic volatility risk has a non significant alpha and significant positive loading on

the long-short portfolio (ν+ − ν−) and negative loading on the long-short portfolio (υ+ − υ−).

Table X reports the regression alphas and betas and robust t-statistics in brackets. Panel

A reports the result when I use the full sample of industry firms to construct the ten value-

weighted portfolio returns and the 10-1 portfolio returns based on idiosyncratic volatility. The

alphas reported in the first row of Panel B are all positive, and they follow an inverted, roughly symmetric U-shape across deciles. More importantly, the alpha for the 10-1 portfolio return is 0.04% per month (the t-statistic is 0.17), and therefore is not significant. In contrast, almost all betas on the long-short portfolios (ν+ − ν−) and (υ+ − υ−) have the expected sign and are highly significant. When I use the 10-1 value-weighted idiosyncratic volatility portfolio return as my dependent variable in the regression, the adjusted R-square is about 74%. The correlation between the two long-short portfolio returns (ν+ − ν−) and (υ+ − υ−) is 0.55.

Compared with the corresponding figure in Panel A, the difference between the portfolio with the highest idiosyncratic volatility risk and the value-weighted portfolio with the lowest idiosyncratic volatility risk moves from -1.47% (the t-statistic is -3.15) to 0.04% (the t-statistic is 0.17) in Panel B. Panel C of Table X presents similar results when I control for the Fama and French (1993) three factors.
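Regression (34) is an ordinary least squares regression with an intercept and several factors. The following sketch solves the normal equations directly in pure Python; it is a minimal illustration under the assumption of short, well-conditioned return series, the function name is hypothetical, and robust standard errors are omitted.

```python
def ols_with_intercept(y, columns):
    """Return [alpha, beta_1, ..., beta_k] for y = alpha + sum_k beta_k x_k + eta."""
    n = len(y)
    X = [[1.0] + [col[t] for col in columns] for t in range(n)]
    k = len(X[0])
    # Normal equations A b = c with A = X'X and c = X'y.
    A = [[sum(X[t][i] * X[t][j] for t in range(n)) for j in range(k)]
         for i in range(k)]
    c = [sum(X[t][i] * y[t] for t in range(n)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        c[i], c[p] = c[p], c[i]
        for r in range(i + 1, k):
            m = A[r][i] / A[i][i]
            for j in range(i, k):
                A[r][j] -= m * A[i][j]
            c[r] -= m * c[i]
    # Back substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (c[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b
```

Passing the market excess return and the two long-short return series as `columns` yields the alpha and the three betas of Equation (34) for a given portfolio.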

VI. Relation to other research on idiosyncratic volatility

Recent papers such as Fu (2008) and Brockman and Schutte (2007) assume that risky assets’ returns follow an asymmetric GARCH model. These papers use the EGARCH method to estimate conditional idiosyncratic volatility and confirm that the relation between stock returns and conditional idiosyncratic volatility is positive in both U.S. and international data. Similarly, Spiegel and Wang (2006) and Eiling (2006) adopt EGARCH models to estimate conditional idiosyncratic volatility and also find a positive relation in U.S. data.

To explain why these authors find a positive relation between idiosyncratic volatility and expected excess returns when GARCH models and their extensions are used to compute the idiosyncratic volatility risk, I begin by assuming that the asset’s return is described by Equation (28) with the market return as a single risky factor. In addition, I assume that the idiosyncratic risk $\varepsilon_{kt+1}$ is normally distributed with conditional variance described by a model that belongs to the family of GARCH models. Further, I assume that the idiosyncratic volatility is described by an asymmetric GARCH model based on Glosten, Jagannathan and Runkle (1993):

$$h_{kt+1} = \beta_0 + \sum_{j=1}^{p} \beta_j h_{kt+1-j} + \sum_{j=1}^{q} \alpha_j \varepsilon^2_{kt+1-j} + \sum_{j=1}^{q} \delta_j I_{t+1-j}\, \varepsilon^2_{kt+1-j} \tag{35}$$

where $\varepsilon_{kt+1-j} = \sqrt{h_{kt+1-j}}\, z_{t+1-j}$ with $z_{t+1-j} \sim N(0, 1)$ and $h_{kt+1-j} = \sigma^2_{\varepsilon kt-j}$. The indicator function $I_{t+1-j}$ equals 1 if $\varepsilon_{kt+1-j} < 0$, and zero otherwise. When $\delta_1 > 0$, the model (35) accounts for the leverage effect, that is, bad news ($\varepsilon_{kt+1-j} < 0$) raises future volatility more than does good news ($\varepsilon_{kt+1-j} \geq 0$) of the same absolute magnitude. Under assumption (35), the non-systematic coskewness $\gamma_{kt}$ is²³

$$\gamma_{kt} = \frac{1}{\alpha_1 + \frac{\delta_1}{2}}\, \frac{Cov_t\left(r_{Mt+1}, \sigma^2_{\varepsilon kt+1}\right)}{\sigma^2_{Mt}} \tag{36}$$

where the coefficients $\alpha_1$ and $\delta_1$ are both positive. Thus a negative correlation of the idiosyncratic volatility with the market return causes the non-systematic coskewness to be negative. In my sample, the average correlation of idiosyncratic volatility with the market return is negative (-2%). According to my model’s prediction, if the non-systematic coskewness is negative, then on average, stocks with high non-systematic coskewness earn high returns. In my sample, the average correlation of idiosyncratic volatility with the non-systematic coskewness is positive, which implies that stocks with high idiosyncratic volatility risk have high non-systematic coskewness. Thus, if I assume a GARCH specification for individual stock returns, the non-systematic coskewness in Equation (36) will be negative, and stocks with high idiosyncratic volatility will have high non-systematic coskewness and therefore earn on average higher returns. This finding could explain why, under GARCH specifications, recent studies find that stocks with high idiosyncratic volatility earn high expected returns. This reasoning is valid even under a simple GARCH model in which $\delta_1 = 0$.

²³ The proof of this expression appears in the Appendix.
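The asymmetric variance recursion in Equation (35) can be illustrated for the case p = q = 1 with the short sketch below. The parameter values in the usage note are arbitrary illustrative numbers, not estimates from the paper, and the function name and the starting value `h0` are assumptions.

```python
def gjr_garch_variance(shocks, beta0, beta1, alpha1, delta1, h0):
    """Conditional variance path for the p = q = 1 case of Equation (35).

    h_{t+1} = beta0 + beta1 * h_t + alpha1 * eps_t**2 + delta1 * I_t * eps_t**2,
    where I_t = 1 if eps_t < 0 and 0 otherwise.
    """
    h = [h0]
    for eps in shocks:
        indicator = 1.0 if eps < 0 else 0.0
        h.append(beta0 + beta1 * h[-1]
                 + alpha1 * eps ** 2
                 + delta1 * indicator * eps ** 2)
    return h
```

For example, with `beta0=0.01, beta1=0.8, alpha1=0.05, delta1=0.1, h0=0.04`, a shock of -0.1 raises next-period variance to 0.0435, whereas a shock of +0.1 raises it only to 0.0425, which is the leverage effect described in the text.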

VII. Conclusion

Recent papers such as Ang et al. (2006) estimate the price of the market volatility risk and find that the volatility of the market is priced. Its price is about -1% per annum. Bollerslev, Tauchen and Zhou (2008) use a model-free approach and show that the difference between the volatility of the market under the risk-neutral measure and the volatility of the market under the physical measure is significant, and that the return predictability of the variance premium easily dominates that afforded by standard predictor variables. These authors suggest that temporal variation in risk and risk aversion plays an important role in determining the variance premium, and also argue that periods of high volatility premium are intimately associated with high risk aversion. However, these papers leave unanswered the question of what value of risk aversion is consistent with the estimated price of market volatility or the observed volatility premium.

I build a partial equilibrium model in which investors trade in a multi-period market. As a result, the aggregate pricing kernel in equilibrium depends on both the coskewness and the market volatility risk factors. I show that the prices of coskewness and market volatility risk are restricted by investor risk aversion and skewness preference, and I provide a closed-form solution for these prices in terms of investor risk aversion and skewness preference. I use two independent sets of data: first, the 30 industry portfolio returns, and second, the 30 Dow Jones stock returns. When I use the 30 industry portfolio returns, the risk aversion is estimated to be between 2.5 and 4.75, while the skewness preference ranges from 1.05 to 2.25. The parameters are mostly statistically significant. The implied price of coskewness associated with these estimates ranges from -3.2% to -0.72% per year. When I use the VIX and VXO as my proxies for the market volatility, the implied price of the market volatility ranges from -1.2% to -0.19% per year. The implied price of the market volatility ranges from -1.51% to -0.77% per year when I use the realized volatility. When I use the 30 Dow Jones stock returns, the risk aversion is estimated to be between 3.75 and 5.75, while the skewness preference ranges from 1.003 to 1.71. The parameters are all statistically significant. The implied price of coskewness associated with these estimates ranges from -2.7% to -2.07% per year. When I use the VIX, VXO, and realized volatility (RV) as my proxies for the market volatility, the implied price of the market volatility ranges from -1.30% to -0.15% per year. These estimates are in a reasonable range and consistent with the literature. I also investigate the impact of the risk aversion and skewness preference on the price of the market volatility risk over time. I find that periods of a high price of volatility risk are sometimes associated with high risk aversion and low or stable skewness preference, sometimes with high skewness preference and low and stable risk aversion, and sometimes with both high risk aversion and high skewness preference.

I also examine the puzzling behaviors of the pricing kernel. I show that my estimated pricing kernel is consistent with economic theory, in that it is decreasing in the aggregate wealth (market return) and increasing in the market volatility. When I project my estimated pricing kernel on a polynomial function of the market return alone, doing so produces the puzzling behaviors observed in pricing kernels. I argue that the omission of the priced market volatility factor from the pricing kernel, together with the lack of a structural interpretation of the prices of coskewness and volatility risk in terms of investor risk aversion and skewness preference in previous studies, could be the cause of the puzzling behaviors of the pricing kernel.

Finally, I examine the negative relation between idiosyncratic volatility and expected returns, and ask why low idiosyncratic volatility firms earn higher future returns than firms with higher idiosyncratic volatility. To answer this question, I study the source of the idiosyncratic volatility premium by using pricing kernels. I find that the premium on idiosyncratic volatility risk is determined by a nonzero risk aversion and firms’ non-systematic coskewness. I define non-systematic coskewness as the non-systematic component of asset skewness that is related to the market portfolio’s skewness. I find two results. First, when this non-systematic component is positive, a long-short portfolio holding the value-weighted decile portfolio with the highest non-systematic coskewness and shorting the value-weighted decile portfolio with the lowest non-systematic coskewness has a significant Fama-French (1993) alpha of -1.30% per month. In contrast, when the non-systematic coskewness is negative, a long-short portfolio holding the highest non-systematic coskewness decile of stocks and shorting the lowest non-systematic coskewness decile of stocks has a highly significant alpha of 1.30% per month.

I also study the negative relation between idiosyncratic volatility and expected returns. My

results show that the non-systematic coskewness is helpful in solving the idiosyncratic volatility

anomaly. I relate my findings to recent studies that use GARCH specification of the idiosyncratic

volatility risk. I show that by assuming GARCH specifications, these studies restrict the relation

between idiosyncratic volatility and stock returns to be positive. Therefore, given my sample

period, it is not possible to use a GARCH type of specification and arrive at a negative relation

between idiosyncratic volatility and expected returns.

Appendix

Appendix A: The Optimal Asset Allocation

To give a formal proof of all propositions, I first use the bifurcation theorem (see Theorem 4, page 8 in Judd and Guu (2001)) to solve the optimization problem (1). Following the bifurcation theorem, the optimal portfolio weight is a function of the small noise expansion parameter $\varepsilon$ and is given by:

$$\omega^{(i)}_{\nu-1} = \omega^{(i)}_{\nu-1}(0) + \omega^{(i)\prime}_{\nu-1}(0)\,\varepsilon, \tag{A1}$$

where $\omega^{(i)}_{\nu-1}(0)$ and $\omega^{(i)\prime}_{\nu-1}(0)$ represent the level and slope of the portfolio weights. To determine these quantities, I solve the optimization problem (1) backward. Appendix A contains the proof of the optimal portfolio weight (A1) and the risk premium function $a_\tau(.)$ appearing in the return decomposition (3). Appendix B contains the proof of all propositions.

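Proof. I consider the first-order conditions (hereafter FOCs)

$$\omega^{(i)}_{\nu-1}: \quad E_{\nu-1}\, u^{(i)\prime}\left(W^{(i)}_T\right) \prod_{\tau=\nu+1}^{T}\left(R_f + \omega^{(i)\top}_{\tau-1} R\right)\left(\varepsilon a_{\nu-1}(\varepsilon) + Y_\nu\right) = 0 \tag{A2}$$

at time $\nu - 1 \in \{t, \ldots, T-1\}$. I discuss below the steps needed to solve (A2) for the optimal portfolio weight.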

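1. First Step: I proceed in an intuitive fashion to arrive at a solution validated by the bifurcation theorem. I want to solve for $\omega^{(i)}_{\nu-1}$ as a function of $\varepsilon$ near 0. I first compute the correct solution for $\omega^{(i)}_{\nu-1}$ in the $\varepsilon = 0$ case, that is, $\lim_{\varepsilon \to 0} \omega^{(i)}_{\nu-1}$. In the rest of this proof, I denote:

$$\omega^{(i)}_{\nu-1}(0) = \lim_{\varepsilon \to 0} \omega^{(i)}_{\nu-1}.$$

To solve the FOCs for $\omega^{(i)}_{\nu-1}(0)$, I consider the FOCs as shown in Equation (A2) and denote:

$$H\left(\omega^{(i)}_{\nu-1}, \varepsilon\right) = E_{\nu-1}\, u^{(i)\prime}\left(W^{(i)}_T\right) \prod_{\tau=\nu+1}^{T}\left(R_f + \omega^{(i)\top}_{\tau-1} R\right)\left(\varepsilon a_{\nu-1}(\varepsilon) + Y_\nu\right). \tag{A3}$$

The choice of $\omega^{(i)}_{\nu-1}$ is a function of $\varepsilon$ implicitly defined by $H\left(\omega^{(i)}_{\nu-1}(\varepsilon), \varepsilon\right) = 0$. Implicit differentiation of Equation (A3) with respect to $\varepsilon$ implies:

$$H_\omega\left(\omega^{(i)}_{\nu-1}(\varepsilon), \varepsilon\right) \omega^{(i)\prime}_{\nu-1}(\varepsilon) + H_\varepsilon\left(\omega^{(i)}_{\nu-1}(\varepsilon), \varepsilon\right) = 0. \tag{A4}$$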

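Differentiating $H\left(\omega^{(i)}_{\nu-1}, \varepsilon\right)$ with respect to $\varepsilon$ and $\omega^{(i)}_{\nu-1}$ respectively, I find:

$$\begin{aligned}
H_\varepsilon\left(\omega^{(i)}_{\nu-1}, \varepsilon\right) ={}& E_{\nu-1}\, u^{(i)\prime\prime}\left(W^{(i)}_T\right) \frac{\partial W^{(i)}_T}{\partial \varepsilon} \prod_{\tau=\nu+1}^{T}\left(R_f + \omega^{(i)\top}_{\tau-1} R\right)\left(\varepsilon a_{\nu-1}(\varepsilon) + Y_\nu\right) \\
&+ E_{\nu-1}\, u^{(i)\prime}\left(W^{(i)}_T\right) R_f^{T-\nu-1} \sum_{\tau=\nu+1}^{T}\left(\omega^{(i)\top}_{\tau-1} Y_\tau + \omega^{(i)\prime\top}_{\tau-1} R^e_\tau\right)\left(\varepsilon a_{\nu-1}(\varepsilon) + Y_\nu\right) \\
&+ E_{\nu-1}\, u^{(i)\prime}\left(W^{(i)}_T\right) \prod_{\tau=\nu+1}^{T}\left(R_f + \omega^{(i)\top}_{\tau-1} R\right)\left(a_{\nu-1}(\varepsilon) + \varepsilon a'_{\nu-1}(\varepsilon)\right),
\end{aligned} \tag{A5}$$

and

$$H_\omega\left(\omega^{(i)}_{\nu-1}, \varepsilon\right) = E_{\nu-1}\, u^{(i)\prime\prime}\left(W^{(i)}_T\right) \frac{\partial W^{(i)}_T}{\partial \omega^{(i)}_{\nu-1}} \prod_{\tau=\nu+1}^{T}\left(R_f + \omega^{(i)\top}_{\tau-1} R\right)\left(\varepsilon a_{\nu-1}(\varepsilon) + Y_\nu\right). \tag{A6}$$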

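Now, I look for a bifurcation point $\omega^{(i)}_{\nu-1}(0)$ defined by $H_\varepsilon\left(\omega^{(i)}_{\nu-1}(0), \varepsilon\right) = 0$. To do this, notice that:

$$\lim_{\varepsilon \to 0} \frac{\partial W^{(i)}_T}{\partial \varepsilon} = W_t R_f^{T-t-1} \sum_{\tau=t+1}^{T}\left(\omega^{(i)\top}_{\tau-1}(0) Y_\tau\right) \quad \text{and} \quad \lim_{\varepsilon \to 0} \frac{\partial W^{(i)}_T}{\partial \omega^{(i)}_{\nu-1}} = 0. \tag{A7}$$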

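I replace $\lim_{\varepsilon \to 0} \partial W^{(i)}_T / \partial \omega^{(i)}_{\nu-1}$ in (A6), and take the limit to get

$$E_{\nu-1}\, u^{(i)\prime\prime}\left(\lim_{\varepsilon \to 0} W^{(i)}_T\right) \frac{\partial W^{(i)}_T}{\partial \varepsilon} Y_\nu + E_{\nu-1}\, u^{(i)\prime}\left(\lim_{\varepsilon \to 0} W^{(i)}_T\right) a_{\nu-1}(0) = 0 \tag{A8}$$

for all $\omega^{(i)}_{\nu-1}$. In addition, I replace $\lim_{\varepsilon \to 0} \partial W^{(i)}_T / \partial \varepsilon$ in (A5) and substitute the result in Equation (A4) to derive:

$$E_{\nu-1}\, W_t R_f^{T-t-1}\left(\omega^{(i)\top}_{\nu-1}(0) Y_\nu\right) Y_\nu = \tau_i\, a_{\nu-1}(0),$$

which simplifies to:

$$R_f^{T-\nu}\, Cov_{\nu-1}\left(\lim_{\varepsilon \to 0} W^{(i)}_{\nu-1}\, \omega^{(i)\top}_{\nu-1}(0) Y_\nu,\; Y_\nu\right) = \tau_i\, a_{\nu-1}(0). \tag{A9}$$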

Recall that the market clearing conditions take the form expressed in Equation (2). Thus, near 0, I write the market clearing conditions as

$$\sum_{i=1}^{I} \lim_{\varepsilon \to 0} W^{(i)}_{\nu-1}\, \omega^{(i)}_{\nu-1}(0) = \theta_{\nu-1}. \tag{A10}$$

I take the sum of Equation (A9) for $i = 1, \ldots, I$ and use these market clearing conditions (see Equation (A10)) to obtain:

$$\frac{1}{\wp} R_f^{T-\nu}\, Cov_{\nu-1}\left(\omega^{\top}_{\nu-1} Y_\nu,\; Y_\nu\right) = a_{\nu-1}(0) \tag{A11}$$

with $\omega_{\nu-1} = \frac{1}{I}\theta_{\nu-1}$ and $\wp = \frac{1}{I}\sum_{i=1}^{I} \wp_i$.

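Now, I plug Equation (A11) in Equation (A9) and get:

$$\omega^{(i)}_{\nu-1}(0) = \frac{1}{\lim_{\varepsilon \to 0} W^{(i)}_{\nu-1}}\, \frac{\wp_i}{\wp}\, \Sigma^{-1}_{\nu-1}\, Cov_{\nu-1}\left(\omega^{\top}_{\nu-1} Y_\nu,\; Y_\nu\right) = \frac{1}{\lim_{\varepsilon \to 0} W^{(i)}_{\nu-1}}\, \frac{\wp_i}{\wp}\, \omega_{\nu-1}, \tag{A12}$$

where $\Sigma_{\nu-1}$ is the variance-covariance matrix defined by:

$$\Sigma_{\nu-1} = E_{\nu-1}\, Y_\nu Y_\nu^{\top}.$$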

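2. Second Step: I want to solve for the slope of the weights $\omega^{(i)}_{\nu-1}$ near $\varepsilon = 0$. Specifically, I want to solve for $\omega^{(i)\prime}_{\nu-1}$ near $\varepsilon = 0$. To do this, I consider again the FOCs at date $\nu - 1$.

For all $\omega^{(i)}_{\nu-1}$, it is straightforward to show that $\lim_{\varepsilon \to 0} H\left(\omega^{(i)}_{\nu-1}, \varepsilon\right) = H\left(\omega^{(i)}_{\nu-1}(0), 0\right) = 0$. Furthermore, for $\left(\omega^{(i)}_{\nu-1}(0), 0\right)$, I have:

$$H\left(\omega^{(i)}_{\nu-1}(0), 0\right) = 0, \tag{A13}$$

$$H_\varepsilon\left(\omega^{(i)}_{\nu-1}(0), 0\right) = 0, \tag{A14}$$

where $H_\varepsilon$ represents the first derivative of $H(., .)$ with respect to $\varepsilon$.

Now, I check whether $\det H_{\omega\varepsilon}\left(\omega^{(i)}_{\nu-1}(0), 0\right) \neq 0$, where $H_{\omega\varepsilon}(., .)$ represents the second derivative of $H$ with respect to $\omega$ and $\varepsilon$ respectively. It can be shown that

$$H_{\omega\varepsilon}\left(\omega^{(i)}_{\nu-1}(0), 0\right) = u^{(i)\prime\prime}\left(\lim_{\varepsilon \to 0} W^{(i)}_T\right) R_f^{2(T-\nu)} \lim_{\varepsilon \to 0} W^{(i)}_{\nu-1}\, \Sigma_{\nu-1} \tag{A15}$$

and

$$\det H_{\omega\varepsilon}\left(\omega^{(i)}_{\nu-1}(0), 0\right) = u^{(i)\prime\prime}\left(\lim_{\varepsilon \to 0} W^{(i)}_T\right) R_f^{2(T-\nu)} \lim_{\varepsilon \to 0} W^{(i)}_{\nu-1}\, \det \Sigma_{\nu-1} \neq 0. \tag{A16}$$

Since Equations (A13), (A14) and (A16) are satisfied, I use the bifurcation theorem (see Theorem 4 in Judd and Guu (2001)) to solve the FOCs for $\omega^{(i)\prime}_{\nu-1}$.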

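Following this theorem, there exists an open neighborhood $N$ of $\left(\omega^{(i)}_{\nu-1}(0), 0\right)$ and a function $h^{(i)}_{\nu-1}(\varepsilon): \mathbb{R} \to \mathbb{R}^n$, $h^{(i)}_{\nu-1}(\varepsilon) \neq 0$ for $\varepsilon \neq 0$, such that

$$H_{\varepsilon\varepsilon}\left(h^{(i)}_{\nu-1}(\varepsilon), \varepsilon\right) = 0 \quad \text{for} \quad \left(h^{(i)}_{\nu-1}(\varepsilon), \varepsilon\right) \in N, \tag{A17}$$

where $H_{\varepsilon\varepsilon}(., .)$ represents the second derivative of $H(., .)$ with respect to $\varepsilon$. Furthermore, $h^{(i)}_{\nu-1}$ is analytic and can be approximated by a Taylor series. In particular, the first-order derivative equals

$$h^{(i)\prime}_{\nu-1}(0) = -\frac{1}{2} H^{-1}_{\omega\varepsilon}\left(\omega^{(i)}_{\nu-1}(0), 0\right) H_{\varepsilon\varepsilon}\left(\omega^{(i)}_{\nu-1}(0), 0\right) = \lim_{\varepsilon \to 0} \omega^{(i)\prime}_{\nu-1}. \tag{A18}$$

In the rest of this proof, I denote

$$\omega^{(i)\prime}_{\nu-1}(0) = \lim_{\varepsilon \to 0} \omega^{(i)\prime}_{\nu-1}.$$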

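Now, I compute the second derivative of $H$ with respect to $\varepsilon$:

$$\begin{aligned}
H_{\varepsilon\varepsilon}\left(\omega^{(i)}_{T-1}, 0\right) ={}& E_{\nu-1}\left(u^{(i)\prime}\left(W^{(i)}_T\right)\right)'' \prod_{\tau=\nu+1}^{T}\left(R_f + \omega^{(i)\top}_{\tau-1} R\right)\left(\varepsilon a_{\nu-1}(\varepsilon) + Y_\nu\right) \\
&+ E_{\nu-1}\, u^{(i)\prime}\left(W^{(i)}_T\right)\left(\prod_{\tau=\nu+1}^{T}\left(R_f + \omega^{(i)\top}_{\tau-1} R\right)\right)''\left(\varepsilon a_{\nu-1}(\varepsilon) + Y_\nu\right) \\
&+ E_{\nu-1}\, u^{(i)\prime}\left(W^{(i)}_T\right) \prod_{\tau=\nu+1}^{T}\left(R_f + \omega^{(i)\top}_{\tau-1} R\right)\left(\varepsilon a_{\nu-1}(\varepsilon) + Y_\nu\right)'' \\
&+ 2E_{\nu-1}\left(u^{(i)\prime}\left(W^{(i)}_T\right)\right)'\left(\prod_{\tau=\nu+1}^{T}\left(R_f + \omega^{(i)\top}_{\tau-1} R\right)\right)'\left(\varepsilon a_{\nu-1}(\varepsilon) + Y_\nu\right) \\
&+ 2E_{\nu-1}\left(u^{(i)\prime}\left(W^{(i)}_T\right)\right)' \prod_{\tau=\nu+1}^{T}\left(R_f + \omega^{(i)\top}_{\tau-1} R\right)\left(\varepsilon a_{\nu-1}(\varepsilon) + Y_\nu\right)' \\
&+ 2E_{\nu-1}\, u^{(i)\prime}\left(W^{(i)}_T\right)\left(\prod_{\tau=\nu+1}^{T}\left(R_f + \omega^{(i)\top}_{\tau-1} R\right)\right)'\left(\varepsilon a_{\nu-1}(\varepsilon) + Y_\nu\right)'.
\end{aligned}$$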

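I expand the expression above and take its limit as $\varepsilon$ approaches zero to get:

$$\begin{aligned}
H_{\varepsilon\varepsilon}\left(\omega^{(i)}_{T-1}, 0\right) ={}& E_{\nu-1}\left(u^{(i)\prime\prime\prime}\left(\lim_{\varepsilon \to 0} W^{(i)}_T\right)\left(\lim_{\varepsilon \to 0} W^{(i)\prime}_T\right)^2 + u^{(i)\prime\prime}\left(\lim_{\varepsilon \to 0} W^{(i)}_T\right) \lim_{\varepsilon \to 0} W^{(i)\prime\prime}_T\right) R_f^{T-\nu}\, Y_\nu \\
&+ E_{\nu-1}\, u^{(i)\prime}\left(\lim_{\varepsilon \to 0} W^{(i)}_T\right)\Bigg(2R_f^{T-\nu-1} \sum_{\tau=\nu+1}^{T} \omega^{(i)\top}_{\tau-1}(0)\, a_{\tau-1}(0) + 2R_f^{T-\nu-1} \sum_{\substack{\tau=\nu+1 \\ \tau \neq \nu}}^{T} \omega^{(i)\prime\top}_{\tau-1}(0)\, Y_\tau \\
&\qquad + 2R_f^{T-\nu-2} \sum_{\nu+1 \leq \tau < \tau^* \leq T}^{T}\left(\omega^{(i)\top}_{\tau-1}(0) Y_\tau\right)\left(\omega^{(i)\top}_{\tau^*-1}(0) Y_{\tau^*}\right)\Bigg) \\
&+ 2E_{\nu-1}\, u^{(i)\prime}\left(\lim_{\varepsilon \to 0} W^{(i)}_T\right) R_f^{T-\nu}\, a'_{\nu-1}(0) \\
&+ 2E_{\nu-1}\left(u^{(i)\prime\prime}\left(\lim_{\varepsilon \to 0} W^{(i)}_T\right) \lim_{\varepsilon \to 0} W^{(i)\prime}_T\right)\left(R_f^{T-\nu-1} \sum_{\tau=\nu+1}^{T} \omega^{(i)\top}_{\tau-1}(0) Y_\tau\right) Y_\nu \\
&+ 2E_{\nu-1}\left(u^{(i)\prime\prime}\left(\lim_{\varepsilon \to 0} W^{(i)}_T\right) \lim_{\varepsilon \to 0} W^{(i)\prime}_T\right) R_f^{T-\nu}\, a_{\nu-1}(0) \\
&+ 2E_{\nu-1}\, u^{(i)\prime}\left(\lim_{\varepsilon \to 0} W^{(i)}_T\right)\left(R_f^{T-\nu-1} \sum_{\tau=\nu+1}^{T} \omega^{(i)\top}_{\tau-1}(0) Y_\tau\right) a_{\nu-1}(0).
\end{aligned} \tag{A19}$$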

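Notice that:

$$\lim_{\varepsilon\to 0}\frac{\partial W^{(i)}_{T}}{\partial\varepsilon}=W^{(i)}_{t}R_f^{T-t-1}\sum_{\tau=t+1}^{T}\left(\omega^{(i)\top}_{\tau-1}(0)Y_{\tau}\right)$$

$$\begin{aligned}
\lim_{\varepsilon\to 0}\frac{\partial^{2}W^{(i)}_{T}}{\partial\varepsilon^{2}}
&=2\sum_{\tau=t+1}^{T}R_f^{T-t-1}W^{(i)}_{t}\,\omega^{(i)\top}_{\tau-1}(0)\,a_{\tau-1}(0)+2\sum_{\substack{\tau=t+1\\ \tau\neq\nu}}^{T}R_f^{T-t-1}W^{(i)}_{t}\,\omega^{(i)\prime\top}_{\tau-1}(0)\,Y_{\tau}\\
&\quad+2\sum_{t+1\leq\tau<\tau^{*}\leq T}R_f^{T-t-2}W^{(i)}_{t}\left(\omega^{(i)\top}_{\tau-1}(0)Y_{\tau}\right)\left(\omega^{(i)\top}_{\tau^{*}-1}(0)Y_{\tau^{*}}\right).
\end{aligned}$$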


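I then substitute the analytical expressions of $\omega^{(i)}_{\tau-1}(0)$, $a_{\tau-1}(0)$ and $\omega^{(i)}_{\tau^{*}-1}(0)$ into the expressions above and deduce:

$$\lim_{\varepsilon\to 0}\frac{\partial W^{(i)}_{T}}{\partial\varepsilon}=\wp_{i}\sum_{\tau=t+1}^{T}R_f^{T-\tau}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right)$$

$$\begin{aligned}
\lim_{\varepsilon\to 0}\frac{\partial^{2}W^{(i)}_{T}}{\partial\varepsilon^{2}}
&=2\frac{\wp_{i}}{\wp^{2}}\sum_{\tau=t+1}^{T}R_f^{2(T-\tau)}\,Var_{\tau-1}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right)+2\sum_{\substack{\tau=t+1\\ \tau\neq\nu}}^{T}R_f^{T-t-1}W^{(i)}_{t}\,\omega^{(i)\prime\top}_{\tau-1}(0)\,Y_{\tau}\\
&\quad+2\sum_{t+1\leq\tau<\tau^{*}\leq T}R_f^{T-\tau^{*}+t-\tau}\,\frac{\wp_{i}^{2}/W^{(i)}_{t}}{\wp^{2}}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right)\left(\omega^{\top}_{\tau^{*}-1}Y_{\tau^{*}}\right).
\end{aligned}$$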

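I substitute the two equations above into Equation (A19) and use the definitions of the preference parameters specified in Equations (5) and (6) to get

$$\begin{aligned}
\left[u^{(i)\prime\prime}\left(\lim_{\varepsilon\to 0}W^{(i)}_{T}\right)\right]^{-1}H_{\varepsilon\varepsilon}\left(\omega^{(i)}_{T-1},0\right)
&=E_{\nu-1}\left[\left(-2\frac{\rho_{i}}{\wp_{i}}\left(\lim_{\varepsilon\to 0}W^{(i)\prime}_{T}\right)^{2}+\lim_{\varepsilon\to 0}W^{(i)\prime\prime}_{T}\right)R_f^{T-\nu}Y_{\nu}\right]\\
&\quad-E_{\nu-1}\wp_{i}\Bigg[2R_f^{T-\nu-1}\sum_{\tau=\nu+1}^{T}\omega^{(i)\top}_{\tau-1}(0)\,a_{\tau-1}(0)+2R_f^{T-\nu-1}\sum_{\tau=\nu+1}^{T}\omega^{(i)\prime\top}_{\tau-1}(0)\,Y_{\tau}\\
&\qquad\qquad+2R_f^{T-\nu-2}\sum_{\nu+1\leq\tau<\tau^{*}\leq T}\left(\omega^{(i)\top}_{\tau-1}(0)Y_{\tau}\right)\left(\omega^{(i)\top}_{\tau^{*}-1}(0)Y_{\tau^{*}}\right)\Bigg]\\
&\quad-2E_{\nu-1}\wp_{i}R_f^{T-\nu}a^{\prime}_{\nu-1}(0)+2E_{\nu-1}\lim_{\varepsilon\to 0}W^{(i)\prime}_{T}\left(R_f^{T-\nu-1}\sum_{\tau=\nu+1}^{T}\omega^{(i)\top}_{\tau-1}(0)Y_{\tau}\right)Y_{\nu}\\
&\quad+2E_{\nu-1}\lim_{\varepsilon\to 0}W^{(i)\prime}_{T}R_f^{T-\nu}a_{\nu-1}(0)-2E_{\nu-1}\wp_{i}\left(R_f^{T-\nu-1}\sum_{\tau=\nu+1}^{T}\omega^{(i)\top}_{\tau-1}(0)Y_{\tau}\right)a_{\nu-1}(0)
\end{aligned} \qquad \text{(A20)}$$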

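which simplifies to

$$\begin{aligned}
\left[u^{(i)\prime\prime}\left(\lim_{\varepsilon\to 0}W^{(i)}_{T}\right)\right]^{-1}H_{\varepsilon\varepsilon}\left(\omega^{(i)}_{T-1},0\right)
&=-2\frac{\rho_{i}\wp_{i}}{\wp^{2}}R_f^{3(T-\nu)}\,Cov_{\nu-1}\left(\left(\omega^{\top}_{\nu-1}Y_{\nu}\right)^{2},Y_{\nu}\right)\\
&\quad+\sum_{\tau=\nu+1}^{T}\left[\frac{2\wp_{i}(1-\rho_{i})}{\wp^{2}}R_f^{2(T-\tau)+(T-\nu)}\right]Cov_{\nu-1}\left(Var_{\tau-1}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right),Y_{\nu}\right)\\
&\quad+\sum_{\tau=t+1}^{\nu-1}\left[\frac{2\wp_{i}-4\rho_{i}\wp_{i}}{\wp^{2}}R_f^{(T-\tau)+2(T-\nu)}+2R_f^{2(T-\nu)+t-\tau}\,\frac{\wp_{i}^{2}/W^{(i)}_{t}}{\wp^{2}}\right]\left(\omega^{\top}_{\tau-1}Y_{\tau}\right)Cov_{\nu-1}\left(\left(\omega^{\top}_{\nu-1}Y_{\nu}\right),Y_{\nu}\right)\\
&\quad-2E_{\nu-1}\wp_{i}R_f^{T-\nu}a^{\prime}_{\nu-1}(0)
\end{aligned}$$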

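I substitute the expression above and Equation (A15) into Equation (A18) to deduce

$$\begin{aligned}
\lim_{\varepsilon\to 0}W^{(i)}_{\nu-1}\,\omega^{(i)\prime}_{\nu-1}(0)
&=\frac{\rho_{i}\wp_{i}}{\wp^{2}}R_f^{(T-\nu)}\,\Sigma^{-1}_{\nu-1}Cov_{\nu-1}\left(\left(\omega^{\top}_{\nu-1}Y_{\nu}\right)^{2},Y_{\nu}\right)\\
&\quad+\sum_{\tau=\nu+1}^{T}\frac{\wp_{i}(\rho_{i}-1)}{\wp^{2}}R_f^{2(T-\tau)-(T-\nu)}\,\Sigma^{-1}_{\nu-1}Cov_{\nu-1}\left(Var_{\tau-1}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right),Y_{\nu}\right)\\
&\quad+\sum_{\tau=t+1}^{\nu-1}\left(\frac{\wp_{i}(2\rho_{i}-1)}{\wp^{2}}R_f^{T-\tau}-\frac{\wp_{i}^{2}/W^{(i)}_{t}}{\wp^{2}}R_f^{t-\tau}\right)\Sigma^{-1}_{\nu-1}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right)Cov_{\nu-1}\left(\left(\omega^{\top}_{\nu-1}Y_{\nu}\right),Y_{\nu}\right)\\
&\quad+\wp_{i}R_f^{-(T-\nu)}\,\Sigma^{-1}_{\nu-1}a^{\prime}_{\nu-1}(0)
\end{aligned}$$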


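which simplifies to:

$$\begin{aligned}
\lim_{\varepsilon\to 0}W^{(i)}_{\nu-1}\,\omega^{(i)\prime}_{\nu-1}(0)
&=\pi^{(i)}_{2}\,\Sigma^{-1}_{\nu-1}Cov_{\nu-1}\left(\left(\omega^{\top}_{\nu-1}Y_{\nu}\right)^{2},Y_{\nu}\right)\\
&\quad+\sum_{\tau=\nu+1}^{T}\pi^{(i)}_{1\tau}\,\Sigma^{-1}_{\nu-1}Cov_{\nu-1}\left(Var_{\tau-1}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right),Y_{\nu}\right)\\
&\quad+\sum_{\tau=t+1}^{\nu-1}\pi^{(i)}_{0\tau}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right)\Sigma^{-1}_{\nu-1}Cov_{\nu-1}\left(\left(\omega^{\top}_{\nu-1}Y_{\nu}\right),Y_{\nu}\right)\\
&\quad+\wp_{i}R_f^{-(T-\nu)}\,\Sigma^{-1}_{\nu-1}a^{\prime}_{\nu-1}(0)
\end{aligned} \qquad \text{(A21)}$$

with:

$$\pi^{(i)}_{2}=\frac{\rho_{i}\wp_{i}}{\wp^{2}}R_f^{T-\nu},\qquad \pi^{(i)}_{1\tau}=\frac{\wp_{i}(\rho_{i}-1)}{\wp^{2}}R_f^{T-\nu}R_f^{2(\nu-\tau)},\qquad \pi^{(i)}_{0\tau}=\frac{\wp_{i}(2\rho_{i}-1)}{\wp^{2}}R_f^{T-\tau}-\frac{\wp_{i}^{2}/W^{(i)}_{t}}{\wp^{2}}\frac{1}{R_f^{\tau-t}}.$$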

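I assume that the market clearing conditions hold in the neighborhood $N$. I differentiate the market clearing conditions with respect to $\varepsilon$, evaluate the result at $\left(\omega^{(i)}_{\nu-1}(0),0\right)$, and get:

$$\lim_{\varepsilon\to 0}\sum_{i=1}^{I}W^{(i)\prime}_{\nu-1}\,\omega^{(i)}_{\nu-1}(\varepsilon)+\sum_{i=1}^{I}\lim_{\varepsilon\to 0}W^{(i)}_{\nu-1}\,\omega^{(i)\prime}_{\nu-1}(\varepsilon)=0 \qquad \text{(A22)}$$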

Notice that:

$$\lim_{\varepsilon\to 0}W^{(i)\prime}_{\nu-1}=\wp_{i}\sum_{\tau=t+1}^{\nu-1}R_f^{\nu-1-\tau}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right)$$

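I take the sum of Equation (A21) over $i=1,\ldots,I$ and use the market clearing conditions (A22) to get

$$\begin{aligned}
a^{\prime}_{\nu-1}(0)&=\delta_{2}\,Cov_{\nu-1}\left(\left(\omega^{\top}_{\nu-1}Y_{\nu}\right)^{2},Y_{\nu}\right)+\sum_{\tau=\nu+1}^{T}\delta_{1\tau}\,Cov_{\nu-1}\left(Var_{\tau-1}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right),Y_{\nu}\right)\\
&\quad+\sum_{\tau=t+1}^{\nu-1}\delta_{0\tau}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right)Cov_{\nu-1}\left(\left(\omega^{\top}_{\nu-1}Y_{\nu}\right),Y_{\nu}\right)
\end{aligned} \qquad \text{(A23)}$$

with

$$\delta_{2}=-\frac{\rho}{\wp^{2}}R_f^{2(T-\nu)},\qquad \delta_{1\tau}=\frac{(1-\rho)}{\wp^{2}}R_f^{2(T-\nu)}R_f^{2(\nu-\tau)},\qquad \delta_{0\tau}=\frac{(1-2\rho)}{\wp^{2}}R_f^{T-\tau}R_f^{2(T-\nu)}.$$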

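Now, I substitute Equation (A23) into Equation (A21) and get

$$\begin{aligned}
\lim_{\varepsilon\to 0}W^{(i)}_{\nu-1}\,\omega^{(i)\prime}_{\nu-1}(0)
&=\psi^{(i)}_{2}\,\Sigma^{-1}_{\nu-1}Cov_{\nu-1}\left(\left(\omega^{\top}_{\nu-1}Y_{\nu}\right)^{2},Y_{\nu}\right)\\
&\quad+\sum_{\tau=\nu+1}^{T}\psi^{(i)}_{1\tau}\,\Sigma^{-1}_{\nu-1}Cov_{\nu-1}\left(Var_{\tau-1}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right),Y_{\nu}\right)\\
&\quad+\sum_{\tau=t+1}^{\nu-1}\psi^{(i)}_{0\tau}\left(\omega^{\top}_{\tau-1}Y_{\tau}\right)\Sigma^{-1}_{\nu-1}Cov_{\nu-1}\left(\left(\omega^{\top}_{\nu-1}Y_{\nu}\right),Y_{\nu}\right)
\end{aligned} \qquad \text{(A24)}$$

with:

$$\psi^{(i)}_{2}=\frac{(\rho_{i}-\rho)\,\wp_{i}}{\wp^{2}}R_f^{T-\nu},\qquad \psi^{(i)}_{1\tau}=\frac{\wp_{i}(\rho_{i}-\rho)}{\wp^{2}}R_f^{T-\nu}R_f^{2(\nu-\tau)},\qquad \psi^{(i)}_{0\tau}=\frac{2\wp_{i}(\rho_{i}-\rho)}{\wp^{2}}R_f^{T-\tau}-\frac{\wp_{i}^{2}/W^{(i)}_{t}}{\wp^{2}}\frac{1}{R_f^{\tau-t}}.$$

In the following proofs, I will use the analytical expressions of $a_{\nu-1}(0)$ and $a^{\prime}_{\nu-1}(0)$ derived in Equations (A11) and (A23).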


Appendix B: Proof of Propositions

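Proof of Proposition 1. I first use the analytical expressions of $a_{\nu-1}(0)$ and $a^{\prime}_{\nu-1}(0)$ derived in Equations (A11) and (A23), respectively, to derive the risk premium on the risky assets from time $\nu-1$ to $\nu$ (see the return decomposition (3)). This premium is

$$\begin{aligned}
E_{\nu-1}R_{\nu}-R_f\mathbf{1}&=\varepsilon^{2}a_{\nu-1}(\varepsilon)\\
&=\varepsilon^{2}\left(a_{\nu-1}(0)+\varepsilon a^{\prime}_{\nu-1}(0)\right)\\
&=B_{2}\,Cov_{\nu-1}\left(\left(R_{M\nu}-E_{\nu-1}R_{M}\right)^{2},R_{\nu}\right)+\sum_{\tau=\nu+1}^{T}B_{1\tau}\,Cov_{\nu-1}\left(Var_{\tau-1}\left(R_{M\tau}\right),R_{\nu}\right)\\
&\quad+B_{0}\,Cov_{\nu-1}\left(R_{M\nu},R_{\nu}\right)
\end{aligned}$$

with:

$$B_{2}=-\frac{\rho}{\wp^{2}}R_f^{2[T-\nu]},\qquad B_{1\tau}=\frac{(1-\rho)}{\wp^{2}}R_f^{[2(T-\tau)]},\qquad B_{0}=\frac{1}{\wp}R_f^{T-\nu}+\sum_{\tau=t+1}^{\nu-1}\delta_{0\tau}\left(R_{M\tau}-E_{\tau-1}R_{M\tau}\right)$$

with $\delta_{0\tau}=\frac{(1-2\rho)}{\wp^{2}}R_f^{[(T-\tau)+2(T-\nu)]}$ and $R_{M\tau}=\omega^{\top}_{\tau-1}R_{\tau}$. Now, I use the risk premium result to derive the pricing kernel. The pricing kernel for the time period $[t,T]$ is

$$m^{T}_{t}=\prod_{\nu=t+1}^{T}m_{\nu-1,\nu},$$

where $m_{\nu-1,\nu}$ represents the pricing kernel for the period $[\nu-1,\nu]$. By definition, the risk premium on the risky assets for the time period $[\nu-1,\nu]$ is given by

$$E_{\nu-1}R_{\nu}-R_f\mathbf{1}=-R_f\,Cov_{\nu-1}\left(m_{\nu-1,\nu},R_{\nu}\right). \qquad \text{(B1)}$$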

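I match the risk premium in Equation (B1) with the analytical expression of the asset risk premium and deduce the analytical expression of the pricing kernel, which is of the form

$$m_{\nu-1,\nu}=\frac{1}{R_f}+D_{0}\,r_{M\nu}+D_{1}\left[r^{2}_{M\nu}-E_{\nu-1}r^{2}_{M\nu}\right]+\sum_{\tau=\nu}^{T-1}D_{2\tau}\left[E_{\nu}\sigma^{2}_{M\tau}-E_{\nu-1}\sigma^{2}_{M\tau}\right] \qquad \text{(B2)}$$

where:

$$D_{1}=\frac{\rho}{\wp^{2}}R_f^{[2(T-\nu)-1]},\qquad D_{2\tau}=\frac{(\rho-1)}{\wp^{2}}R_f^{[2(T-\tau)-1]}\,1_{\tau\leq T-1},\qquad D_{0}=-\frac{1}{\wp}R_f^{[T-\nu-1]}+\sum_{\tau=t+1}^{\nu-1}D_{\tau\nu}\,r_{M\tau} \qquad \text{(B3)}$$

and

$$D_{\tau\nu}=\frac{(-1+2\rho)}{\wp^{2}}R_f^{[(T-\tau)+2(T-\nu)-1]} \qquad \text{(B4)}$$

Since $R_f\approx 1$, the coefficients $D_{0}$, $D_{1}$, and $D_{2\tau}$ can be written as:

$$D_{1}=\frac{\rho}{\wp^{2}},\qquad D_{2\tau}=\frac{(\rho-1)}{\wp^{2}}\,1_{\tau\leq T-1},\qquad D_{0}=-\frac{1}{\wp}+\frac{(2\rho-1)}{\wp^{2}}\sum_{\tau=t+1}^{\nu-1}r_{M\tau} \qquad \text{(B5)}$$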
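To make Equations (B2) and (B5) concrete, the following is a minimal sketch that evaluates the approximate coefficients and the one-period kernel for given inputs. It assumes $R_f\approx 1$ as in (B5). The function names, argument names, and the reading of $\wp$ as a risk tolerance (so that $1/\wp$ acts as risk aversion) and of $\rho$ as the skewness preference are illustrative assumptions, not definitions taken from this appendix.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class KernelCoefficients:
    d0: float  # loading on the market return r_M
    d1: float  # loading on the squared-return innovation
    d2: float  # loading on each variance-news term (same for all tau <= T-1)


def kernel_coefficients(wp: float, rho: float,
                        past_market_returns: Sequence[float] = ()) -> KernelCoefficients:
    """Coefficients of (B5), i.e. (B3)-(B4) with R_f set to 1.

    wp  : the parameter written as a script "p" in the text (assumed > 0)
    rho : the parameter rho in the text
    past_market_returns : r_M,tau for tau = t+1, ..., nu-1 (empty if nu = t+1)
    """
    d1 = rho / wp ** 2
    d2 = (rho - 1.0) / wp ** 2
    d0 = -1.0 / wp + (2.0 * rho - 1.0) / wp ** 2 * sum(past_market_returns)
    return KernelCoefficients(d0=d0, d1=d1, d2=d2)


def one_period_kernel(coef: KernelCoefficients, rf: float, r_m: float,
                      expected_r_m_sq: float,
                      variance_news: Sequence[float] = ()) -> float:
    """Evaluate (B2) for one realization.

    variance_news holds E_nu[sigma^2_M,tau] - E_{nu-1}[sigma^2_M,tau]
    for tau = nu, ..., T-1.
    """
    return (1.0 / rf
            + coef.d0 * r_m
            + coef.d1 * (r_m ** 2 - expected_r_m_sq)
            + coef.d2 * sum(variance_news))


# Hypothetical inputs, chosen only to exercise the formulas.
coef = kernel_coefficients(wp=0.25, rho=1.1)
m = one_period_kernel(coef, rf=1.0, r_m=0.01, expected_r_m_sq=0.002,
                      variance_news=[0.0005])
```

With these hypothetical inputs the market loading is negative and the variance-news loading is positive, matching the signs that (B5) implies whenever $\wp>0$ and $\rho>1$.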

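Proof of the Analytical Expression of the Volatility Premium. I denote $r_{Mt+1}=R_{Mt+1}-R_f$ and note that

$$\begin{aligned}
\sigma^{*2}_{Mt}&=E^{*}_{t}\left(R_{Mt+1}-R_f\right)^{2}\\
&=E_{t}\left[\frac{m_{t,t+1}}{E_{t}m_{t,t+1}}\,r^{2}_{Mt+1}\right]\\
&=Cov_{t}\left(\frac{m_{t,t+1}}{E_{t}m_{t,t+1}},r^{2}_{Mt+1}\right)+E_{t}r^{2}_{Mt+1}\\
&=R_f\,Cov_{t}\left(m_{t,t+1},r^{2}_{Mt+1}\right)+E_{t}r^{2}_{Mt+1}\\
&=R_f\,Cov_{t}\left(m_{t,t+1},r^{2}_{Mt+1}\right)+E_{t}\left(R_{Mt+1}-R_f\right)^{2}\\
&=R_f\,Cov_{t}\left(m_{t,t+1},r^{2}_{Mt+1}\right)+E_{t}\left(R_{Mt+1}-E_{t}R_{Mt+1}+E_{t}R_{Mt+1}-R_f\right)^{2}\\
&=R_f\,Cov_{t}\left(m_{t,t+1},r^{2}_{Mt+1}\right)+E_{t}\left(R_{Mt+1}-E_{t}R_{Mt+1}\right)^{2}+\left(E_{t}R_{Mt+1}-R_f\right)^{2}\\
&=R_f\,Cov_{t}\left(m_{t,t+1},r^{2}_{Mt+1}\right)+\sigma^{2}_{Mt}+\left(E_{t}R_{Mt+1}-R_f\right)^{2}
\end{aligned}$$

Thus

$$\sigma^{2}_{Mt}-\sigma^{*2}_{Mt}=-\left(E_{t}R_{Mt+1}-R_f\right)^{2}-R_f\,Cov_{t}\left(m_{t,t+1},\left(R_{Mt+1}-R_f\right)^{2}\right)$$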

I then replace the excess market squared return by its expression

(RMt+1 −Rf )2 = (RMt+1 − EtRMt+1)2 + (EtRMt+1 −Rf )2 + 2(RMt+1 − EtRMt+1)(EtRMt+1 −Rf ),

and the pricing kernel mt,t+1 by its expression to get the final result.
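The identity for the spread between physical and risk-neutral market variance can be checked numerically on a small discrete-state example. The sketch below is illustrative only: the state probabilities, kernel values, and gross returns are made-up numbers, and the function name is hypothetical. It relies only on $E_t m_{t,t+1}=R_f^{-1}$ and on the change of measure $E^{*}_{t}[X]=E_{t}[m_{t,t+1}X]/E_{t}[m_{t,t+1}]$ used in the proof.

```python
from typing import Sequence, Tuple


def volatility_spread_check(probs: Sequence[float],
                            kernel: Sequence[float],
                            gross_returns: Sequence[float]) -> Tuple[float, float]:
    """Return both sides of
    sigma^2 - sigma*^2 = -(E[R] - Rf)^2 - Rf * Cov(m, (R - Rf)^2)
    for a finite-state economy."""
    def expect(values):
        return sum(p * v for p, v in zip(probs, values))

    e_m = expect(kernel)
    rf = 1.0 / e_m                                   # E[m] = 1 / Rf
    sq_excess = [(r - rf) ** 2 for r in gross_returns]

    mean_r = expect(gross_returns)
    var_physical = expect([(r - mean_r) ** 2 for r in gross_returns])

    # Risk-neutral probabilities q_s = p_s * m_s / E[m]
    q = [p * m / e_m for p, m in zip(probs, kernel)]
    var_risk_neutral = sum(qs * x for qs, x in zip(q, sq_excess))

    cov_m_sq = expect([m * x for m, x in zip(kernel, sq_excess)]) - e_m * expect(sq_excess)

    lhs = var_physical - var_risk_neutral
    rhs = -(mean_r - rf) ** 2 - rf * cov_m_sq
    return lhs, rhs


# Three hypothetical states: bad, normal, good.
lhs, rhs = volatility_spread_check(
    probs=[0.3, 0.4, 0.3],
    kernel=[1.2, 1.0, 0.75],
    gross_returns=[0.9, 1.05, 1.2],
)
```

The two returned values agree up to floating-point error for any strictly positive kernel, because the identity is purely algebraic once the risk-neutral measure is defined through the kernel.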

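Proof of Proposition 2. I note that the covariance of the pricing kernel and the square of the idiosyncratic shock is

$$Cov_{t}\left(m_{t,t+1},\varepsilon^{2}_{kt+1}\right)=\left[E_{t}m_{t,t+1}\right]E_{t}\left[\frac{m_{t,t+1}}{E_{t}m_{t,t+1}}\,\varepsilon^{2}_{kt+1}\right]-E_{t}m_{t,t+1}\,E_{t}\varepsilon^{2}_{kt+1}. \qquad \text{(B6)}$$

Since $E_{t}m_{t,t+1}=R_f^{-1}$ and $E_{t}\left[\frac{m_{t,t+1}}{E_{t}m_{t,t+1}}X\right]=E^{*}_{t}[X]$, where the operator $E^{*}_{t}[\cdot]$ denotes the expectation taken under the risk-neutral measure, I simplify (B6) and get:

$$R_f\,Cov_{t}\left(m_{t,t+1},\varepsilon^{2}_{kt+1}\right)=E^{*}_{t}\varepsilon^{2}_{kt+1}-E_{t}\varepsilon^{2}_{kt+1}=\sigma^{*2}_{\varepsilon kt}-\sigma^{2}_{\varepsilon kt} \qquad \text{(B7)}$$

I then substitute $m_{t,t+1}$ into (B7) and get the final result.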

Lemma B1 (Stein's Lemma): If $X$ and $Y$ are jointly normally distributed, then $Cov(g(X),Y)=E(g^{\prime}(X))\,Cov(X,Y)$.

If the idiosyncratic shock and the factor $f_{t+1}$ are jointly normally distributed, then, using Stein's Lemma, it can be shown that

$$Cov_{t}\left(f_{t+1},\varepsilon^{2}_{kt+1}\right)=2E_{t}\varepsilon_{kt+1}\,Cov_{t}\left(f_{t+1},\varepsilon_{kt+1}\right). \qquad \text{(B8)}$$

Since $E_{t}\varepsilon_{kt+1}=0$, it follows that $Cov_{t}\left(f_{t+1},\varepsilon^{2}_{kt+1}\right)=0$.
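Equation (B8) can be illustrated by simulation. The sketch below draws a jointly normal pair with a chosen correlation, using only the Python standard library, and compares the sample covariance of the factor with the squared shock against $2E[\varepsilon]\,Cov(f,\varepsilon)$. The sample size, seed, correlation, and function names are arbitrary illustrative choices; a Monte Carlo check of this kind illustrates the lemma but does not prove it.

```python
import random
from typing import List, Tuple


def sample_cov(xs: List[float], ys: List[float]) -> float:
    """Unbiased sample covariance of two equally long samples."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def simulate_joint_normal(mean_eps: float, corr: float, n: int,
                          seed: int) -> Tuple[List[float], List[float]]:
    """Draw (eps, f) jointly normal with unit variances and correlation corr.

    eps has mean mean_eps; f has mean zero.
    """
    rng = random.Random(seed)
    eps, f = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        eps.append(mean_eps + z1)
        f.append(corr * z1 + (1.0 - corr ** 2) ** 0.5 * z2)
    return eps, f


def stein_sides(eps: List[float], f: List[float]) -> Tuple[float, float]:
    """Left and right sides of (B8): Cov(f, eps^2) and 2 E[eps] Cov(f, eps)."""
    lhs = sample_cov(f, [e * e for e in eps])
    rhs = 2.0 * (sum(eps) / len(eps)) * sample_cov(f, eps)
    return lhs, rhs


# Zero-mean shock: both sides should be close to zero.
lhs0, rhs0 = stein_sides(*simulate_joint_normal(0.0, 0.5, 200_000, seed=1))
# Non-zero mean, for contrast: both sides should be close to 2 * 1 * 0.5 = 1.
lhs1, rhs1 = stein_sides(*simulate_joint_normal(1.0, 0.5, 200_000, seed=2))
```

The zero-mean case corresponds to the idiosyncratic shock in the text. The non-zero-mean case is included only to show that the covariance vanishes because of the zero mean, not because of normality alone.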

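Proof of the relationship between the nonsystematic coskewness and the expected excess return. I notice that in a single-factor model, the risk premium on asset $k$ is

$$\begin{aligned}
\alpha_{kt}&=E_{t}R_{kt+1}-R_f\\
&=-R_f\,cov_{t}\left(m_{t,t+1},R_{kt+1}\right)\\
&=-R_f\,cov_{t}\left(\frac{1}{R_f}-\frac{d_{t}}{R_f}\left(R_{Mt+1}-E_{t}R_{Mt+1}\right),R_{kt+1}\right)\\
&=d_{t}\,\sigma^{2}_{Mt}\,\beta_{kt}
\end{aligned}$$

where

$$\beta_{kt}=\frac{cov_{t}\left(R_{kt+1},R_{Mt+1}\right)}{\sigma^{2}_{Mt}}.$$

Let $IS_{kt}=E_{t}\varepsilon^{3}_{kt+1}$ denote the idiosyncratic skewness, with

$$\varepsilon_{kt+1}=R_{kt+1}-R_f-\alpha_{kt}-\beta_{kt}\left(R_{Mt+1}-E_{t}R_{Mt+1}\right).$$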


Therefore,

∂γkt

∂αkt=

covt

(ε2

kt+1, rMt+1

)

σ2Mt

= 2covt

(εkt+1

∂εkt+1∂αkt

, rMt+1

)

σ2Mt

= 2covt

(εkt+1

∂εkt+1∂αkt

, rMt+1

)

σ2Mt

= −2∂βkt+1

∂αkt

covt

(εkt+1, r

2Mt+1

)

σ2Mt

= − 2

dtσ2Mt

covt

(εkt+1, r

2Mt+1

)

σ2Mt

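Since

$$r_{Mt+1}=R_{Mt+1}-E_{t}R_{Mt+1}=\sum_{j=1}^{n}\varpi_{jt}\,\varepsilon_{jt+1},$$

where $\varpi_{jt}$ represents the weight of asset $j$ in the market portfolio, it follows that, under the restrictions $cov_{t}\left(\varepsilon_{jt+1}\varepsilon_{kt+1},r_{Mt+1}\right)=0$ for $j\neq k$, we have

$$\begin{aligned}
\frac{\partial\gamma_{kt}}{\partial\alpha_{kt}}&=-\frac{2}{d_{t}\sigma^{2}_{Mt}}\sum_{j=1}^{n}\varpi_{jt}\,\frac{cov_{t}\left(\varepsilon_{kt+1},r_{Mt+1}\varepsilon_{jt+1}\right)}{\sigma^{2}_{Mt}}\\
&=-\frac{2}{d_{t}\sigma^{2}_{Mt}}\left[\varpi_{k}\gamma_{k}+\sum_{j=1}^{n}\varpi_{j}\,\frac{cov_{t}\left(\varepsilon_{jt+1}\varepsilon_{kt+1},r_{Mt+1}\right)}{\sigma^{2}_{Mt}}\right]
\end{aligned}$$

$$\frac{\partial\gamma_{kt}}{\partial\alpha_{kt}}=-\frac{2}{d_{t}\sigma^{2}_{Mt}}\,\varpi_{k}\gamma_{k}$$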

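Proof of the Analytical Expression of Non-Systematic Coskewness with GARCH Specification. I take the expectation of (35) under the physical measure and the expectation of (35) under the risk-neutral measure, and show that the spread of these expected values equals

$$E_{t}\left[\sigma^{2}_{\varepsilon kt+1}\right]-E^{*}_{t}\left[\sigma^{2}_{\varepsilon kt+1}\right]=\left(\alpha_{1}+\frac{\delta_{1}}{2}\right)\sigma^{2}_{\varepsilon kt}\left(1-Var^{*}_{t}\left[z_{t+1}\right]\right). \qquad \text{(B9)}$$

I use the idiosyncratic risk $\varepsilon_{kt+1}=\sigma_{\varepsilon kt}z_{kt+1}$ and express the non-systematic volatility premium as

$$\sigma^{2}_{\varepsilon kt}-\sigma^{*2}_{\varepsilon kt}=\left(1-Var^{*}_{t}\left[z_{t+1}\right]\right)\sigma^{2}_{\varepsilon kt}. \qquad \text{(B10)}$$

I recover $1-Var^{*}_{t}\left[z_{t+1}\right]$ from (B9) and substitute the result into (B10) to get

$$\sigma^{2}_{\varepsilon kt}-\sigma^{*2}_{\varepsilon kt}=\frac{E_{t}\left[\sigma^{2}_{\varepsilon kt+1}\right]-E^{*}_{t}\left[\sigma^{2}_{\varepsilon kt+1}\right]}{\left(\alpha_{1}+\frac{\delta_{1}}{2}\right)} \qquad \text{(B11)}$$

I note that

$$E_{t}\left[\sigma^{2}_{\varepsilon kt+1}\right]-E^{*}_{t}\left[\sigma^{2}_{\varepsilon kt+1}\right]=-R_f\,Cov_{t}\left(m_{t,t+1},\sigma^{2}_{\varepsilon kt+1}\right)=\frac{1}{\wp}\,Cov_{t}\left(r_{Mt+1},\sigma^{2}_{\varepsilon kt+1}\right) \qquad \text{(B12)}$$

and I substitute this last expression into (B11) to get

$$\sigma^{2}_{\varepsilon kt}-\sigma^{*2}_{\varepsilon kt}=\frac{Cov_{t}\left(r_{Mt+1},\sigma^{2}_{\varepsilon kt+1}\right)}{\wp\left(\alpha_{1}+\frac{\delta_{1}}{2}\right)} \qquad \text{(B13)}$$

From equation (??), I can also write the non-systematic volatility risk premium as

$$\sigma^{2}_{\varepsilon kt}-\sigma^{*2}_{\varepsilon kt}=\frac{1}{\wp}\,\sigma^{2}_{Mt}\,\gamma_{kt} \qquad \text{(B14)}$$

I equate equations (B13) and (B14) to get the final result.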
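Equating (B13) and (B14) cancels $\wp$ and leaves $\gamma_{kt}=Cov_{t}\left(r_{Mt+1},\sigma^{2}_{\varepsilon kt+1}\right)/\left[\sigma^{2}_{Mt}\left(\alpha_{1}+\delta_{1}/2\right)\right]$. The sketch below carries out this one-step algebra numerically. The function and argument names are hypothetical, $\alpha_{1}$ and $\delta_{1}$ are the GARCH parameters of the specification referred to as (35), which lies outside this appendix, and the input values are made up.

```python
def nonsystematic_coskewness(cov_rm_idio_var: float, market_var: float,
                             alpha1: float, delta1: float) -> float:
    """gamma_kt implied by setting (B13) equal to (B14).

    cov_rm_idio_var : Cov_t(r_M,t+1, sigma^2_eps,k,t+1)
    market_var      : sigma^2_M,t
    alpha1, delta1  : GARCH parameters appearing in (B9)
    """
    return cov_rm_idio_var / (market_var * (alpha1 + delta1 / 2.0))


def premium_b13(cov_rm_idio_var: float, wp: float,
                alpha1: float, delta1: float) -> float:
    """Right-hand side of (B13)."""
    return cov_rm_idio_var / (wp * (alpha1 + delta1 / 2.0))


def premium_b14(market_var: float, gamma: float, wp: float) -> float:
    """Right-hand side of (B14)."""
    return market_var * gamma / wp


# Hypothetical inputs.
gamma = nonsystematic_coskewness(0.0006, 0.002, alpha1=0.1, delta1=0.1)
b13 = premium_b13(0.0006, wp=0.25, alpha1=0.1, delta1=0.1)
b14 = premium_b14(0.002, gamma, wp=0.25)
```

For these inputs the two premium expressions coincide, which confirms that the implied $\gamma_{kt}$ does not depend on the value of $\wp$.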


References

Aït-Sahalia, Yacine, and Andrew W. Lo, 2000, Nonparametric risk management and implied risk aversion, Journal of Econometrics 94, 9-51.

Ang, Andrew, Robert Hodrick, Yuhang Xing, and Xiaoyan Zhang, 2006, The cross-section of volatility and expected returns, Journal of Finance 61, 259-299.

Ang, Andrew, Robert Hodrick, Yuhang Xing, and Xiaoyan Zhang, 2008, High idiosyncratic volatility and low returns: International and further U.S. evidence, Journal of Financial Economics, forthcoming.

Bakshi, Gurdip, and Dilip Madan, 2006, A theory of volatility spread, Management Science 52, 1945-1956.

Barberis, Nicholas, and Ming Huang, 2007, Stocks as lotteries: The implications of probability weighting for security prices, American Economic Review, forthcoming.

Bollerslev, Tim, George Tauchen, and Hao Zhou, 2008, Expected stock return and variance risk premia, Review of Financial Studies, forthcoming.

Brandt, Michael, Amit Goyal, Pedro Santa-Clara, and Jonathan R. Stroud, 2005, A simulation approach to dynamic portfolio choice with an application to learning about return predictability, Review of Financial Studies 18, 831-873.

Brockman, Paul, and Maria G. Schutte, 2007, Is idiosyncratic volatility priced? The international evidence, Working paper, University of Missouri Columbia.

Brown, David P., and Jens C. Jackwerth, 2004, The Pricing Kernel Puzzle: Reconciling Index Option Data and Economic Theory, Working paper, University of Wisconsin at Madison, Finance Department, School of Business.

Carr, Peter, and Liuren Wu, 2008, Variance risk premia, Review of Financial Studies, forthcoming.

Chabi-Yo, Fousseni, Rene Garcia, and Eric Renault, 2008, State Dependence Can Explain Risk-Aversion Puzzle, The Review of Financial Studies 21 (2), 973-1011.

Chapman, David, 1997, Approximating the asset pricing kernel, Journal of Finance 52, 1383-1410.

Dittmar, Robert F., 2002, Nonlinear pricing kernels, kurtosis preference, and evidence from the cross section of equity returns, Journal of Finance 57, 369-403.

Drechsler, Itamar, and Amir Yaron, 2008, What's Vol Got to Do With It, Working paper, University of Pennsylvania.

Eiling, Esther, 2006, Can nontradable assets explain the apparent premium for idiosyncratic risk? The case of industry-specific human capital, Working paper, Tilburg University, Netherlands.

Elul, Ronel, 1995, Welfare effects of financial innovation in incomplete markets economies with several consumption goods, Journal of Economic Theory 65, 43-78.

Fama, Eugene F., and Kenneth R. French, 1993, Common risk factors in the returns on stocks and bonds, Journal of Financial Economics 33, 3-56.

Fu, Fangjian, 2008, Idiosyncratic risk and the cross-section of expected stock returns, Journal of Financial Economics, forthcoming.

Glosten, Lawrence R., Ravi Jagannathan, and David E. Runkle, 1993, On the relation between the expected value and the volatility of the nominal excess returns on stocks, Journal of Finance 48, 1779-1801.

Guo, Hui, and Robert F. Whitelaw, 2006, Uncovering the risk-return relation in the stock market, Journal of Finance 61, 1433-1463.

Hansen, Lars Peter, 1982, Large sample properties of generalized method of moments estimators, Econometrica 50, 1029-1054.

Hansen, Lars Peter, and Ravi Jagannathan, 1997, Assessing specification errors in stochastic discount factor models, Journal of Finance 52, 557-590.

Hart, Oliver, 1975, On the optimality of equilibrium when the market structure is incomplete, Journal of Economic Theory 11, 418-443.

Harvey, Campbell, and Akhtar Siddique, 2000, Conditional skewness in asset pricing tests, Journal of Finance 55, 1263-1295.

Jackwerth, Jens C., 2000, Recovering Risk Aversion From Option Prices and Realized Returns, Review of Financial Studies 13 (2), 433-451.

Jegadeesh, Narasimhan, and Sheridan Titman, 1993, Returns to buying winners and selling losers: Implications for stock market efficiency, Journal of Finance 48, 65-92.

Judd, K., and S. Guu, 2001, Asymptotic Methods for Asset Market Equilibrium Analysis, Economic Theory 18, 127-157.

Kraus, Alan, and Robert Litzenberger, 1976, Skewness preference and the valuation of risk assets, Journal of Finance 31, 1085-1100.

Merton, Robert, 1987, A simple model of capital market equilibrium with incomplete information, Journal of Finance 42, 483-510.

Mitton, Todd, and Keith Vorkink, 2007, Equilibrium underdiversification and the preference for skewness, Review of Financial Studies 20, 1255-1288.

Nyberg, Peter, and Anders Wilhelmsson, 2007, Volatility Risk Premium, Risk Aversion and the Cross-Section of Stock Returns, Working paper, Swedish School of Economics and Business Administration.

Rosenberg, Joshua, and Robert Engle, 2002, Empirical pricing kernels, Journal of Financial Economics 64, 341-372.

Rubinstein, Mark, 1973, The Fundamental Theorem of Parameter-Preference Security Valuation, Journal of Financial and Quantitative Analysis 8 (1), 61-69.

Samuelson, Paul A., 1970, The fundamental approximation theorem of portfolio analysis in terms of means, variances and higher moments, Review of Economic Studies 37, 537-542.

Spiegel, Matthew, and Xiaotong Wang, 2006, Cross-sectional variation in stock returns: liquidity and idiosyncratic risk, Unpublished working paper, Yale University.


Table I: Preference Parameters and Implied Prices of Risk Using Industry Portfolio Returns

Table I presents results of GMM tests of the Euler equation condition, $E\,m^{T}_{t}R_{t+1}R_f=1$, using the pricing kernel derived in Proposition 1 when the investment horizon is $h=T-t=2$. I estimate the preference parameters by using the Hansen and Jagannathan (1997) weighting matrix $E\,R_{t+1}R^{\prime}_{t+1}$. Column (1) presents the mean of the pricing kernel. Columns (2) and (3) present the risk aversion and skewness preference, respectively. Column (4) presents the Hansen-Jagannathan distance measure, with p-values for the test of model specification in parentheses. Columns (5) - (7) present the annualized price of market, coskewness and market volatility risk, using the estimated preference parameters. The p-values for tests of the coefficients appear in parentheses. The set of returns I use in my estimations are those of 30 industry-sorted portfolios augmented by the return on a one-month Treasury bill, covering the sample periods 01/1986-12/2000, 01/1996-12/2006 and 01/1986-12/2006. For the market portfolio, I use the value-weighted NYSE/AMEX/NASDAQ index, also known as the value-weighted index of the Center for Research in Security Prices (CRSP). As my proxy for the volatility of the market return, I use the Chicago Board Options Exchange (CBOE)'s VXO, the VIX implied volatilities and the realized volatility RV, respectively. In Panel A, I present the results when I use the VXO. Panel B presents the results when I use the VIX. Panel C presents the results when I use the realized volatility RV.

Panel A: Market Volatility: VXO

                             1/Rf    Risk aversion   ρ       HJ Dist   λMKT (%)   λSKD (%)   λVOL (%)
Subperiod: 01/1986-12/2000
  Coefficient                0.996   3.989           1.096   0.150     9.552      -1.861     -0.384
  P-value                    0.000   0.044           0.001   0.045
Subperiod: 01/1996-12/2006
  Coefficient                0.997   2.756           2.254   0.194     6.669      -1.857     -0.741
  P-value                    0.000   0.197           0.345   0.000
Subperiod: 01/1986-12/2006
  Coefficient                0.996   3.439           1.080   0.094     7.934      -1.290     -0.192
  P-value                    0.000   0.033           0.017   0.081

Panel B: Market Volatility: VIX

                             1/Rf    Risk aversion   ρ       HJ Dist   λMKT (%)   λSKD (%)   λVOL (%)
Subperiod: 01/1990-12/2000
  Coefficient                0.996   4.748           1.583   0.248     9.671      -2.857     -1.207
  P-value                    0.000   0.067           0.023   0.012
Subperiod: 01/1996-12/2006
  Coefficient                0.997   2.656           2.207   0.196     6.428      -1.689     -0.621
  P-value                    0.000   0.219           0.403   0.000
Subperiod: 01/1990-12/2006
  Coefficient                0.997   3.782           1.705   0.119     7.761      -2.066     -0.906
  P-value                    0.000   0.053           0.082   0.003

Panel C: Market Volatility: RV

                             1/Rf    Risk aversion   ρ       HJ Dist   λMKT (%)   λSKD (%)   λVOL (%)
Subperiod: 01/1990-12/2000
  Coefficient                0.996   4.753           1.105   0.272     9.178      -1.930     -0.978
  P-value                    0.000   0.049           0.000   0.036
Subperiod: 01/1996-12/2006
  Coefficient                0.997   2.504           1.183   0.202     5.629      -0.721     -1.320
  P-value                    0.000   0.239           0.003   0.000
Subperiod: 01/1990-12/2006
  Coefficient                0.997   3.236           1.175   0.124     6.277      -0.958     -1.380
  P-value                    0.000   0.098           0.000   0.060


Table II: Preference Parameters and Implied Prices of Risk Using Industry Portfolio Returns (Robustness to Size, Book-to-Value and Momentum Factors)

Table II presents results of GMM tests of the Euler equation condition, $E\,m^{T}_{t}R_{t+1}R_f=1$, using the pricing kernel derived in Proposition 1 when the investment horizon is $h=T-t=2$, augmented with the Fama and French (1993) size and book-to-market factors. I estimate the preference parameters by using the Hansen and Jagannathan (1997) weighting matrix $E\,R_{t+1}R^{\prime}_{t+1}$. Column (1) presents the mean of the pricing kernel. Columns (2) and (3) present the risk aversion and skewness preference, respectively. Column (4) presents the Hansen-Jagannathan distance measure, with p-values for the test of model specification in parentheses. Columns (5) - (7) present the annualized price of market, coskewness and market volatility risk, using the estimated preference parameters. The p-values for tests of the coefficients appear in parentheses. The set of returns I use in my estimations are those of 30 industry-sorted portfolios augmented by the return on a one-month Treasury bill, covering the sample periods 01/1986-12/2000, 01/1996-12/2006 and 01/1986-12/2006. For the market portfolio, I use the value-weighted NYSE/AMEX/NASDAQ index, also known as the value-weighted index of the Center for Research in Security Prices (CRSP). As my proxy for the volatility of the market return, I use the Chicago Board Options Exchange (CBOE)'s VXO, the VIX implied volatilities and the realized volatility RV, respectively. In Panel A, I present the results when I use the VXO. Panel B presents the results when I use the VIX. Panel C presents the results when I use the realized volatility RV.

Panel A: Market Volatility: VXO

                             1/Rf    Risk aversion   ρ       HJ Dist   λMKT (%)   λSKD (%)   λVOL (%)
Subperiod: 01/1986-12/2000
  Coefficient                0.996   4.211           1.095   0.116     10.082     -2.072     -0.424
  P-value                    0.000   0.032           0.002   0.162
Subperiod: 01/1996-12/2006
  Coefficient                0.997   4.009           1.827   0.175     9.701      -3.185     -1.033
  P-value                    0.000   0.169           0.182   0.002
Subperiod: 01/1986-12/2006
  Coefficient                0.996   4.318           1.050   0.085     9.961      -1.976     -0.187
  P-value                    0.000   0.023           0.001   0.081

Panel B: Market Volatility: VIX

                             1/Rf    Risk aversion   ρ       HJ Dist   λMKT (%)   λSKD (%)   λVOL (%)
Subperiod: 01/1990-12/2000
  Coefficient                0.996   3.403           1.741   0.199     6.931      -1.689     -0.896
  P-value                    0.000   0.268           0.270   0.037
Subperiod: 01/1996-12/2006
  Coefficient                0.997   3.755           1.913   0.177     9.086      -2.926     -0.939
  P-value                    0.000   0.206           0.264   0.003
Subperiod: 01/1990-12/2006
  Coefficient                0.997   4.608           1.482   0.113     9.457      -2.667     -0.920
  P-value                    0.000   0.061           0.027   0.006

Panel C: Market Volatility: RV

                             1/Rf    Risk aversion   ρ       HJ Dist   λMKT (%)   λSKD (%)   λVOL (%)
Subperiod: 01/1990-12/2000
  Coefficient                0.996   3.990           1.118   0.208     7.705      -1.376     -0.774
  P-value                    0.000   0.201           0.000   0.016
Subperiod: 01/1996-12/2006
  Coefficient                0.997   3.636           1.138   0.172     8.174      -1.461     -1.487
  P-value                    0.000   0.286           0.000   0.003
Subperiod: 01/1990-12/2006
  Coefficient                0.997   3.623           1.215   0.097     7.027      -1.241     -1.519
  P-value                    0.000   0.195           0.001   0.076


Table III: Descriptive Statistics on Dow 30 Stocks

This table presents summary statistics for the monthly returns on Dow 30 stocks. Maximum, minimum, mean, standard deviation (Std), skewness, and kurtosis are reported for each stock. The descriptive statistics are computed for the sample period from January 1990 to December 2006.

Dow Jones Stocks                  Minimum   Maximum   Mean     Std      Skewness   Kurtosis
Microsoft                         -0.3435   0.4078    0.0251   0.1024   0.3978     4.6155
Honeywell                         -0.3840   0.5105    0.0141   0.0901   -0.0367    9.3331
AT&T Inc                          -0.1876   0.2900    0.0097   0.0717   0.2064     4.1836
Coca Cola                         -0.1910   0.2228    0.0115   0.0660   -0.2103    4.0537
E.I. DuPont de Nemours            -0.1699   0.2174    0.0091   0.0673   0.1367     2.8528
Exxon Mobil                       -0.1165   0.2322    0.0127   0.0465   0.6183     5.4267
General Electric                  -0.1765   0.1924    0.0134   0.0626   0.1297     3.5668
General Motors                    -0.2403   0.2766    0.0072   0.0951   0.1984     3.1706
International Business Machines   -0.2619   0.3538    0.0123   0.0890   0.3456     4.3209
Altria (was Philip Morris)        -0.2656   0.3427    0.0164   0.0830   -0.2708    5.0895
United Technologies               -0.3202   0.2461    0.0154   0.0711   -0.6543    6.1925
Procter and Gamble                -0.3570   0.2509    0.0135   0.0631   -0.7790    8.7933
Caterpillar                       -0.2146   0.4079    0.0157   0.0836   0.4065     4.7004
Boeing                            -0.3457   0.1949    0.0120   0.0788   -0.5865    4.4729
Pfizer                            -0.1707   0.2655    0.0151   0.0735   0.1404     2.9849
Johnson & Johnson                 -0.1601   0.1881    0.0142   0.0629   0.0878     3.1813
3M Corporation                    -0.1578   0.2580    0.0108   0.0581   0.4040     4.8393
Merck                             -0.2577   0.2276    0.0114   0.0773   -0.0737    3.3652
Alcoa                             -0.2387   0.5114    0.0115   0.0907   0.7795     6.9529
Walt Disney Co.                   -0.2678   0.2415    0.0100   0.0775   -0.0686    3.9232
Hewlett-Packard                   -0.3199   0.3539    0.0176   0.1099   0.0874     3.6069
McDonalds                         -0.2567   0.1826    0.0114   0.0697   -0.2540    3.5148
JP Morgan Chase                   -0.3468   0.3257    0.0161   0.1004   -0.1720    4.8161
Wal-Mart Stores                   -0.2080   0.2643    0.0136   0.0730   0.1671     3.4579
Intel Corp                        -0.4449   0.3382    0.0223   0.1213   -0.3005    3.6254
Verizon Communications            -0.2099   0.3901    0.0076   0.0721   0.8210     6.7551
Home Depot                        -0.2059   0.3023    0.0193   0.0849   0.2564     3.5188
American Int'l Group              -0.2310   0.2387    0.0129   0.0665   0.0223     4.2054
CitiGroup                         -0.3401   0.2608    0.0214   0.0889   -0.1032    4.3008
American Express                  -0.2933   0.2031    0.0139   0.0754   -0.8057    4.6244


Table IV: Preference Parameters and Implied Prices of Risk Using the 30 Dow Jones Returns

Table IV presents results of GMM tests of the Euler equation condition, $E\,m^{T}_{t}R_{t+1}R_f=1$, using the pricing kernel derived in Proposition 1 when the investment horizon is $h=T-t=2$, augmented with the Fama and French (1993) size and book-to-market factors. I estimate the preference parameters by using the Hansen and Jagannathan (1997) weighting matrix $E\,R_{t+1}R^{\prime}_{t+1}$. Column (1) presents the mean of the pricing kernel. Columns (2) and (3) present the risk aversion and skewness preference, respectively. Column (4) presents the Hansen-Jagannathan distance measure, with p-values for the test of model specification in parentheses. Columns (5) - (7) present the annualized price of market, coskewness and market volatility risk, using the estimated preference parameters. The p-values for tests of the coefficients appear in parentheses. The set of returns I use in my estimations are those of 30 Dow Jones stocks augmented by the return on a one-month Treasury bill, covering the sample period 01/1990-12/2006. For the market portfolio, I use the value-weighted NYSE/AMEX/NASDAQ index, also known as the value-weighted index of the Center for Research in Security Prices (CRSP). As my proxy for the volatility of the market return, I use the Chicago Board Options Exchange (CBOE)'s VXO, the VIX implied volatilities and the realized volatility RV, respectively. In Panel A, I present the results when I use the VXO, the VIX and the realized volatility RV. Panel B presents the results when I control for the Fama and French and the momentum factors.

                   1/Rf     Risk aversion   ρ        HJ Dist   λMKT (%)   λSKD (%)   λVOL (%)

Panel A
VXO
  Coefficient      0.997    3.921           1.603    0.121     8.047      -2.088     -0.897
  P-value          0.000    0.039           0.045    0.003
VIX
  Coefficient      0.997    3.782           1.705    0.120     7.761      -2.066     -0.906
  P-value          0.000    0.053           0.082    0.003
RV
  Coefficient      0.997    5.750           1.003    0.101     11.154     -2.582     -0.148
  P-value          0.000    0.008           0.000    0.319

Panel B
VXO
  Coefficient      0.997    4.771           1.394    0.115     9.791      -2.689     -0.869
  P-value          0.000    0.046           0.008    0.004
VIX
  Coefficient      0.997    4.608           1.482    0.113     9.457      -2.667     -0.920
  P-value          0.000    0.061           0.028    0.006
RV
  Coefficient      0.9967   5.4074          1.0277   0.0913    10.4890    -2.3396    -1.2963
  P-value          0.000    0.0650          0.0000   0.4098

Table V: Portfolios Sorted on Idiosyncratic Volatility

I sort stocks into decile portfolios based on their idiosyncratic volatility using only NYSE/AMEX/NASDAQ industrial firms. I form value-weighted decile portfolios every month by sorting stocks based on idiosyncratic volatility relative to different models. To compute the idiosyncratic volatility, I use the CAPM model, the Fama and French (1993) model, the Fama and French (1993) model augmented with the Jegadeesh and Titman (1993) momentum factor, and the Harvey and Siddique (2000) coskewness model. Portfolio 1 (10) is the portfolio of stocks with the lowest (highest) idiosyncratic volatility risk. The column titled "10-1" refers to the difference in expected returns between portfolio 10 and portfolio 1. In Panel A, I measure the statistics in the columns labeled Mean and Std Dev (Standard Deviation) in monthly percentage terms. I use total, not excess, returns. Std Devs are in parentheses. Robust Newey-West (1987) t-statistics (t-stat) are in brackets. Panel B reports each of the model alphas. Robust Newey-West (1987) t-statistics appear in square brackets. The column titled 10-1 refers to the difference in alphas between portfolio 10 and portfolio 1. The sample period is from January 1971 to December 2006.

                1 Low     2         3         4         5         6         7         8         9         10 High    10-1

Panel A

CAPM  Mean      1.069     0.983     1.180     1.165     1.087     0.938     0.775     0.653     0.031     -0.222     -1.290
      Std Dev   (3.888)   (4.665)   (5.476)   (6.091)   (6.992)   (7.933)   (8.366)   (8.743)   (9.697)   (10.621)   [-2.642]
FF    Mean      1.049     1.056     1.125     1.141     1.100     0.922     0.835     0.612     0.333     -0.178     -1.227
      Std Dev   (3.897)   (4.703)   (5.492)   (6.133)   (7.067)   (7.870)   (8.335)   (8.792)   (9.559)   (10.457)   [-2.552]
FF-M  Mean      1.036     1.100     1.120     1.098     1.068     0.910     0.901     0.597     0.321     -0.157     -1.193
      Std Dev   (3.920)   (4.691)   (5.531)   (6.085)   (7.107)   (7.788)   (8.407)   (8.707)   (9.522)   (10.441)   [-2.480]
HS    Mean      1.062     1.001     1.163     1.187     1.037     0.945     0.797     0.645     0.293     -0.217     -1.278
      Std Dev   (3.884)   (4.671)   (5.493)   (6.115)   (6.951)   (7.928)   (8.365)   (8.758)   (9.668)   (10.628)   [-2.624]

Panel B

CAPM  Alpha     0.586     0.472     0.655     0.639     0.553     0.373     0.210     0.066     -0.265    -0.880     -1.466
      t-stat    [3.041]   [2.099]   [2.471]   [2.218]   [1.643]   [0.946]   [0.526]   [0.159]   [-0.583]  [-1.728]   [-3.153]
FF    Alpha     0.566     0.550     0.597     0.626     0.556     0.355     0.272     0.041     -0.275    -0.842     -1.409
      t-stat    [2.917]   [2.439]   [2.249]   [2.145]   [1.630]   [0.894]   [0.678]   [0.100]   [-0.616]  [-1.686]   [-3.081]
FF-M  Alpha     0.554     0.594     0.590     0.580     0.517     0.341     0.338     0.026     -0.291    -0.822     -1.376
      t-stat    [2.851]   [2.626]   [2.233]   [1.996]   [1.497]   [0.875]   [0.831]   [0.063]   [-0.647]  [-1.642]   [-3.016]
HS    Alpha     0.579     0.489     0.641     0.663     0.496     0.381     0.233     0.066     -0.293    -0.878     -1.457
      t-stat    [3.013]   [2.162]   [2.418]   [2.278]   [1.495]   [0.965]   [0.579]   [0.160]   [-0.644]  [-1.730]   [-3.146]

Table VI: Decile Portfolios Sorted on Non-Systematic Coskewness

I use NYSE/AMEX/NASDAQ industrial firms and form value-weighted decile portfolios every month by sorting stocks based on non-systematic coskewness relative to the CAPM model. My portfolios contain stocks in percentiles 0-10, 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90, and 90-100. At the end of each month, I split the sample into two groups, stocks with positive non-systematic coskewness and stocks with negative non-systematic coskewness. Within each group, I sort stocks into decile portfolios and then form value-weighted decile portfolios every month by sorting these stocks based on their non-systematic coskewness. Portfolio 1 (10) is the portfolio of stocks with the lowest (highest) non-systematic coskewness. In Panel A, I measure the statistics in the columns labeled Mean and Std Dev (standard deviation) in monthly percentage terms. I use total, not excess, returns. Standard errors appear in parentheses. Robust Newey-West (1987) t-statistics appear in square brackets. The column titled 10-1 refers to the difference in expected return between portfolio 10 and portfolio 1. Panel B reports the coefficients of the CAPM regression:

$$r_{p}=\alpha+\beta_{M}\left[R_{M}-R_f\right]+\eta, \qquad \text{(B.15)}$$

and Fama and French (1993) regressions:

$$r_{p}=\alpha+\beta_{M}\left[R_{M}-R_f\right]+\beta_{SMB}\,r_{SMB}+\beta_{HML}\,r_{HML}+\eta, \qquad \text{(B.16)}$$

when $\gamma_{k}>0$. Panel C reports the regression coefficients when $\gamma_{k}<0$. The sample period is from January 1971 to December 2006. I also report the R-Square of the regressions.

                 1 Low     2         3         4         5         6         7         8         9         10 High   10-1

Panel A

γk>0
      Mean       1.07      1.02      1.01      1.14      1.09      1.11      1.07      1.11      1.04      0.62      -0.45
      Std Dev    (4.16)    (4.47)    (4.70)    (5.09)    (5.72)    (6.11)    (6.67)    (7.74)    (8.84)    (9.93)    [-1.04]
γk<0
      Mean       0.32      0.95      0.78      0.79      0.96      0.65      1.25      1.02      1.01      1.13      0.81
      Std Dev    (8.61)    (7.38)    (6.68)    (6.46)    (6.10)    (5.85)    (5.35)    (4.96)    (4.58)    (4.40)    [2.30]

Panel B (γk>0)

CAPM  α          0.58      0.53      0.51      0.63      0.55      0.59      0.54      0.56      0.49      -0.01     -0.59
      t-stat     [2.76]    [2.44]    [2.24]    [2.55]    [1.94]    [2.01]    [1.75]    [1.45]    [1.10]    [-0.02]   [-1.49]
      βM         -0.01     0.01      0.03      0.05      0.10      0.07      0.09      0.12      0.14      0.28      0.28
      t-stat     [-0.16]   [0.15]    [0.59]    [0.96]    [1.49]    [1.07]    [1.31]    [1.63]    [1.60]    [2.70]    [3.35]
      R2(%)      -0.22     -0.23     -0.14     0.00      0.34      0.02      0.10      0.23      0.25      1.32      2.18
FF    α          0.62      0.57      0.54      0.68      0.58      0.65      0.56      0.68      0.61      0.12      -0.50
      t-stat     [2.85]    [2.56]    [2.31]    [2.66]    [1.97]    [2.09]    [1.73]    [1.70]    [1.36]    [0.23]    [-1.05]
      βM         -0.05     -0.04     -0.02     0.02      0.05      0.03      0.06      0.05      0.04      0.18      0.23
      t-stat     [-0.87]   [-0.60]   [-0.29]   [0.38]    [0.71]    [0.39]    [0.90]    [0.56]    [0.34]    [1.59]    [2.31]
      βSMB       0.10      0.09      0.14      0.02      0.14      0.06      0.05      0.06      0.15      0.13      0.02
      t-stat     [1.96]    [1.65]    [2.07]    [0.32]    [1.97]    [0.60]    [0.56]    [0.57]    [1.13]    [0.81]    [0.16]
      βHML       -0.07     -0.09     -0.07     -0.09     -0.06     -0.09     -0.04     -0.20     -0.23     -0.23     -0.16
      t-stat     [-1.01]   [-1.37]   [-0.99]   [-1.22]   [-0.70]   [-0.91]   [-0.34]   [-1.48]   [-1.27]   [-0.95]   [-0.66]
      R2(%)      0.30      0.16      0.57      -0.19     0.64      -0.15     -0.28     0.39      0.72      1.52      2.02

Panel C (γk<0)

CAPM  α          -0.34     0.41      0.19      0.23      0.46      0.11      0.77      0.50      0.49      0.66      1.00
      t-stat     [-0.87]   [1.17]    [0.64]    [0.72]    [1.61]    [0.37]    [3.05]    [2.11]    [2.16]    [3.11]    [3.10]
      βM         0.33      0.10      0.21      0.15      0.03      0.11      -0.01     0.08      0.07      -0.03     -0.36
      t-stat     [4.01]    [1.14]    [3.19]    [2.24]    [0.51]    [1.70]    [-0.24]   [1.39]    [1.25]    [-0.51]   [-4.56]
      R2(%)      2.80      0.13      1.70      0.84      -0.18     0.45      -0.22     0.31      0.27      -0.15     4.91
FF    α          -0.30     0.27      0.33      0.31      0.51      0.13      0.83      0.48      0.50      0.66      0.96
      t-stat     [-0.76]   [0.72]    [1.07]    [1.01]    [1.73]    [0.43]    [3.14]    [2.02]    [2.09]    [2.98]    [2.86]
      βM         0.27      0.17      0.11      0.08      -0.03     0.08      -0.07     0.06      0.03      -0.05     -0.32
      t-stat     [2.94]    [2.00]    [1.67]    [1.16]    [-0.51]   [1.21]    [-1.09]   [1.01]    [0.48]    [-0.88]   [-3.67]
      βSMB       0.17      -0.01     0.11      0.10      0.18      0.08      0.13      0.10      0.15      0.10      -0.07
      t-stat     [1.38]    [-0.13]   [1.20]    [1.35]    [2.41]    [0.94]    [2.04]    [1.39]    [2.25]    [1.83]    [-0.63]
      βHML       -0.09     0.24      -0.23     -0.15     -0.10     -0.04     -0.10     0.00      -0.04     -0.01     0.08
      t-stat     [-0.51]   [1.54]    [-2.05]   [-1.44]   [-0.91]   [-0.48]   [-1.44]   [0.05]    [-0.58]   [-0.22]   [0.50]
      R2(%)      2.91      0.46      2.64      1.18      0.59      0.24      0.38      0.24      0.97      -0.05     4.69


Table VII: Portfolios Sorted on Non-Systematic Coskewness in 10 Different Groups.

I use NYSE/AMEX/NASDAQ industrial firms and form value-weighted decile portfolios every month by sorting stocks based on non-systematic coskewness relative to the CAPM model. My portfolios contain stocks in percentiles 0-5, 5-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-95, and 95-100. My portfolio construction procedure pays greater attention to the tails of the stock distribution. At the end of each month, I split the sample into two groups, stocks with positive non-systematic coskewness and stocks with negative non-systematic coskewness. Within each group, I sort stocks into decile portfolios and then form value-weighted decile portfolios every month by sorting these stocks based on their non-systematic coskewness. Portfolio 1 (10) is the portfolio of stocks with the lowest (highest) non-systematic coskewness. In Panel A, I measure the statistics in the columns labeled Mean and Std Dev (standard deviation) in monthly percentage terms. I use total, not excess, returns. Standard errors appear in parentheses. Robust Newey-West (1987) t-statistics appear in square brackets. The column titled 10-1 refers to the difference in expected return between portfolio 10 and portfolio 1. Panel B reports the coefficients of the CAPM regression:

$$ r_p = \alpha + \beta_M [R_M - R_f] + \eta, \tag{B.17} $$

and Fama and French (1993) regressions:

$$ r_p = \alpha + \beta_M [R_M - R_f] + \beta_{SMB}\, r_{SMB} + \beta_{HML}\, r_{HML} + \eta, \tag{B.18} $$

when γk > 0. Panel C reports the regression coefficients when γk < 0. The sample period is from January 1971 to December 2006. I also report the R-Square of the regressions.

1 Low 2 3 4 5 6 7 8 9 10 High 10-1

Panel A

γk > 0
Mean 1.17 1.00 1.01 1.14 1.09 1.11 1.07 1.11 1.07 -0.05 -1.22
Std Dev (4.43) (4.28) (4.70) (5.09) (5.72) (6.11) (6.67) (7.74) (9.13) (10.18) [-2.65]

γk < 0
Mean -0.05 0.87 0.78 0.79 0.96 0.65 1.25 1.02 1.02 1.24 1.29
Std Dev (10.00) (7.28) (6.68) (6.46) (6.10) (5.85) (5.35) (4.96) (4.36) (4.66) [2.88]

Panel B

γk > 0
CAPM α 0.68 0.51 0.51 0.63 0.55 0.59 0.54 0.56 0.49 -0.69 -1.37
t-stat [3.15] [2.43] [2.24] [2.55] [1.94] [2.01] [1.75] [1.45] [1.08] [-1.38] [-3.17]
βM 0.00 0.00 0.03 0.05 0.10 0.07 0.09 0.12 0.18 0.31 0.30
t-stat [0.08] [-0.02] [0.59] [0.96] [1.49] [1.07] [1.31] [1.63] [1.94] [3.12] [3.22]
R2(%) -0.23 -0.23 -0.14 0.00 0.34 0.02 0.10 0.23 0.54 1.60 2.19

FF α 0.72 0.56 0.54 0.68 0.58 0.65 0.56 0.68 0.64 -0.58 -1.30
t-stat [3.22] [2.56] [2.31] [2.66] [1.97] [2.09] [1.73] [1.70] [1.31] [-1.11] [-2.82]
βM -0.04 -0.05 -0.02 0.02 0.05 0.03 0.06 0.05 0.08 0.21 0.25
t-stat [-0.68] [-0.81] [-0.29] [0.38] [0.71] [0.39] [0.90] [0.56] [0.67] [1.71] [2.12]
βSMB 0.11 0.10 0.14 0.02 0.14 0.06 0.05 0.06 0.12 0.18 0.07
t-stat [1.78] [1.92] [2.07] [0.32] [1.97] [0.60] [0.56] [0.57] [0.92] [1.06] [0.47]
βHML -0.09 -0.09 -0.07 -0.09 -0.06 -0.09 -0.04 -0.20 -0.26 -0.21 -0.13
t-stat [-1.04] [-1.42] [-0.99] [-1.22] [-0.70] [-0.91] [-0.34] [-1.48] [-1.23] [-1.09] [-0.68]
R2(%) 0.37 0.32 0.57 -0.19 0.64 -0.15 -0.28 0.39 1.01 1.91 2.00

Panel C

γk < 0
CAPM α -0.73 0.31 0.19 0.23 0.46 0.11 0.77 0.50 0.52 0.76 1.49
t-stat [-1.54] [0.91] [0.64] [0.72] [1.61] [0.37] [3.05] [2.11] [2.38] [3.33] [3.52]
βM 0.37 0.14 0.21 0.15 0.03 0.11 -0.01 0.08 0.03 -0.02 -0.38
t-stat [3.24] [1.71] [3.19] [2.24] [0.51] [1.70] [-0.24] [1.39] [0.57] [-0.29] [-3.15]
R2(%) 2.52 0.51 1.70 0.84 -0.18 0.45 -0.22 0.31 -0.12 -0.21 3.61

FF α -0.55 0.18 0.33 0.31 0.51 0.13 0.83 0.48 0.53 0.75 1.30
t-stat [-1.14] [0.50] [1.07] [1.01] [1.73] [0.43] [3.14] [2.02] [2.34] [3.18] [2.94]
βM 0.28 0.19 0.11 0.08 -0.03 0.08 -0.07 0.06 0.00 -0.04 -0.32
t-stat [2.34] [2.34] [1.67] [1.16] [-0.51] [1.21] [-1.09] [1.01] [-0.02] [-0.68] [-2.51]
βSMB 0.00 0.05 0.11 0.10 0.18 0.08 0.13 0.10 0.11 0.13 0.13
t-stat [0.02] [0.48] [1.20] [1.35] [2.41] [0.94] [2.04] [1.39] [1.81] [1.84] [0.77]
βHML -0.29 0.21 -0.23 -0.15 -0.10 -0.04 -0.10 0.00 -0.04 0.00 0.29
t-stat [-1.42] [1.37] [-2.05] [-1.44] [-0.91] [-0.48] [-1.44] [0.05] [-0.71] [0.02] [1.58]
R2(%) 2.69 0.66 2.64 1.18 0.59 0.24 0.38 0.24 0.17 0.10 4.02
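The sorting procedure described in the caption of Table VII can be sketched as follows. This is only an illustration, not the code behind the table: the function names, the input layout (one tuple of coskewness, market capitalization and next-month return per stock), and the mid-rank rule used to place stocks relative to the percentile breakpoints are all assumptions made here.

```python
from bisect import bisect_right

# Upper percentile bounds of the ten groups: 0-5, 5-20, 20-30, ..., 80-95, 95-100.
BREAKPOINTS = [5, 20, 30, 40, 50, 60, 70, 80, 95, 100]


def assign_groups(values):
    """Map each value to a group 1..10 by its percentile rank (assumed rule)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    groups = [0] * n
    for rank, i in enumerate(order):
        pct = 100.0 * (rank + 0.5) / n  # mid-rank percentile, strictly inside (0, 100)
        groups[i] = bisect_right(BREAKPOINTS, pct) + 1
    return groups


def value_weighted_returns(stocks):
    """stocks: list of (coskewness, market_cap, next_month_return) tuples for
    one month and one sign group. Returns {group: value-weighted return}."""
    groups = assign_groups([s[0] for s in stocks])
    cap_sum, weighted = {}, {}
    for g, (_, cap, ret) in zip(groups, stocks):
        cap_sum[g] = cap_sum.get(g, 0.0) + cap
        weighted[g] = weighted.get(g, 0.0) + cap * ret
    return {g: weighted[g] / cap_sum[g] for g in sorted(cap_sum)}


def split_by_sign(stocks):
    """Separate stocks with positive and negative non-systematic coskewness."""
    positive = [s for s in stocks if s[0] > 0]
    negative = [s for s in stocks if s[0] < 0]
    return positive, negative
```

Under these assumptions, the monthly 10-1 series is the return of group 10 minus the return of group 1, computed separately for each sign group.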


Table VIII: Explaining the Low Returns of High Idiosyncratic Volatility Stocks when the Non-Systematic Coskewness is Positive

I form value-weighted decile portfolios every month by sorting stocks based on non-systematic coskewness relative to the CAPM model. I use only NYSE/AMEX/NASDAQ industrial firms. I use only stocks with positive non-systematic coskewness. My portfolios contain stocks in percentiles 0-5, 5-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-95, and 95-100. In Panel A, at the end of each month, I sort stocks into decile portfolios and then form value-weighted decile portfolios every month by sorting these stocks based on their idiosyncratic volatility. I measure the statistics in the columns labeled Mean and Std Dev (standard deviation) in monthly percentage terms. I use total, not excess, returns. Standard errors appear in parentheses. Robust Newey-West (1987) t-statistics appear in square brackets. The column titled 10-1 refers to the difference in expected return between portfolio 10 and portfolio 1. In Panel B, at the end of each month, I also sort stocks into decile portfolios and then form value-weighted decile portfolios every month by sorting these stocks based on their non-systematic coskewness. When γkt is positive, I refer to the value-weighted portfolio formed with the lowest 5% non-systematic coskewness as ν− and the highest 5% non-systematic coskewness as ν+. I refer to 10-1 as the difference in expected return between portfolio 10 and portfolio 1. I use the 10-1 return ν+ − ν− as a control variable in the CAPM linear regression model. Panel B reports the coefficients of the regression:

$$ r_p = \alpha + \beta_M [R_M - R_f] + \beta_{\nu} (\nu^{+} - \nu^{-}) + \eta. \tag{B.19} $$

To run this regression, I consider stocks with positive non-systematic coskewness. I form value-weighted decile portfolios every month by sorting stocks based on their idiosyncratic volatility relative to the CAPM model. I then run a regression of each decile idiosyncratic volatility portfolio return on a constant, the excess market return and the 10-1 return (ν+ − ν−). Robust Newey-West (1987) t-statistics appear in square brackets. I also report the R-Square of the regression. In Panel C, I use the Fama and French factors as control variables. I run the regression:

$$ r_p = \alpha + \beta_M [R_M - R_f] + \beta_{SMB} [R_{SMB} - R_f] + \beta_{HML} [R_{HML} - R_f] + \beta_{\nu} (\nu^{+} - \nu^{-}) + \eta. \tag{B.20} $$

The sample period is from January 1971 to December 2006.

1 Low 2 3 4 5 6 7 8 9 10 High 10-1

Panel A

Mean 1.07 0.98 1.18 1.16 1.09 0.94 0.77 0.65 0.30 -0.22 -1.29
Std Dev (3.89) (4.66) (5.48) (6.09) (6.99) (7.93) (8.37) (8.74) (9.70) (10.62) [-2.64]

Panel B

α 0.63 0.73 0.94 1.08 1.30 1.33 1.07 1.01 0.83 0.47 -0.16
t-stat [3.04] [3.31] [3.79] [4.10] [4.25] [4.21] [3.07] [2.97] [2.42] [1.08] [-0.39]
βM -0.04 0.04 0.02 0.10 0.08 0.14 0.10 0.16 0.20 0.30 0.35
t-stat [-0.80] [0.67] [0.35] [1.54] [1.06] [2.04] [1.17] [1.89] [2.35] [2.39] [2.87]
βν 0.03 0.11 0.26 0.33 0.44 0.59 0.60 0.69 0.85 1.05 1.01
t-stat [0.84] [2.60] [6.54] [7.85] [7.54] [12.67] [9.74] [14.29] [14.79] [16.34] [13.89]
R2(%) 0.24 4.57 16.07 20.57 26.55 36.97 34.38 42.89 53.62 53.59 57.91

Panel C

α 0.65 0.77 0.92 1.12 1.28 1.46 1.07 1.11 0.88 0.55 -0.11
t-stat [3.03] [3.33] [3.56] [3.88] [3.97] [4.46] [2.89] [3.09] [2.46] [1.14] [-0.24]
βM -0.07 -0.01 0.00 0.06 0.07 0.05 0.07 0.13 0.17 0.26 0.33
t-stat [-1.10] [-0.20] [0.06] [0.85] [0.88] [0.65] [0.77] [1.37] [1.75] [1.59] [2.14]
βSMB 0.05 0.12 0.10 0.07 0.07 0.10 0.09 -0.08 0.01 0.02 -0.03
t-stat [0.84] [2.20] [1.59] [0.84] [0.86] [1.00] [0.95] [-0.67] [0.14] [0.15] [-0.20]
βHML -0.05 -0.08 0.01 -0.08 0.02 -0.23 -0.01 -0.14 -0.09 -0.13 -0.08
t-stat [-0.79] [-1.12] [0.17] [-0.78] [0.16] [-1.67] [-0.10] [-1.04] [-0.65] [-0.60] [-0.38]
βν 0.03 0.11 0.26 0.32 0.44 0.58 0.60 0.69 0.85 1.04 1.01
t-stat [0.80] [2.47] [6.37] [7.66] [7.47] [12.76] [9.65] [14.11] [14.58] [15.90] [13.65]
R2(%) 0.08 5.26 16.00 20.48 26.29 37.52 34.20 42.83 53.47 53.46 57.74
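The bracketed entry in the 10-1 column of Panel A is a Newey-West (1987) t-statistic for a mean return spread. A minimal sketch of that calculation for a single series is given below. The function name and the `lags` argument are assumptions made here, since the lag length is not stated in this section. The regression t-statistics in Panels B and C require the same correction applied to the full coefficient covariance matrix, which this sketch does not cover.

```python
import math


def newey_west_tstat(x, lags):
    """t-statistic for H0: mean(x) = 0, using a Newey-West (Bartlett kernel)
    estimate of the long-run variance of x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    # Lag-0 autocovariance, divided by n as in the usual HAC estimator.
    lrv = sum(d * d for d in dev) / n
    for k in range(1, lags + 1):
        gamma_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / n
        weight = 1.0 - k / (lags + 1.0)  # Bartlett kernel weight
        lrv += 2.0 * weight * gamma_k
    return mean / math.sqrt(lrv / n)
```

With `lags=0` the statistic reduces to the ordinary t-statistic of the mean (with the variance divided by n rather than n - 1).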


Table IX: Explaining the Low Returns of High Idiosyncratic Volatility Stocks when the Non-Systematic Coskewness is Negative

I form value-weighted decile portfolios every month by sorting stocks based on non-systematic coskewness relative to the CAPM model. I use only NYSE/AMEX/NASDAQ industrial firms. I use only stocks with negative non-systematic coskewness. My portfolios contain stocks in percentiles 0-5, 5-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-95, and 95-100. In Panel A, at the end of each month, I sort stocks into decile portfolios and then form value-weighted decile portfolios every month by sorting these stocks based on their idiosyncratic volatility. I measure the statistics in the columns labeled Mean and Std Dev (standard deviation) in monthly percentage terms. I use total, not excess, returns. Standard errors appear in parentheses. Robust Newey-West (1987) t-statistics appear in square brackets. The column titled 10-1 refers to the difference in expected return between portfolio 10 and portfolio 1. In Panel B, at the end of each month, I also sort stocks into decile portfolios and then form value-weighted decile portfolios every month by sorting these stocks based on their non-systematic coskewness. When γkt is negative, I refer to the value-weighted portfolio formed with the lowest 5% non-systematic coskewness as υ− and the highest 5% non-systematic coskewness as υ+. I refer to 10-1 as the difference in expected return between portfolio 10 and portfolio 1. I use the 10-1 return υ+ − υ− as a control variable in the CAPM linear regression model. Panel B reports the coefficients of the regression:

$$ r_p = \alpha + \beta_M [R_M - R_f] + \beta_{\upsilon} (\upsilon^{+} - \upsilon^{-}) + \eta. \tag{B.21} $$

To run this regression, I consider stocks with negative non-systematic coskewness. I form value-weighted decile portfolios every month by sorting stocks based on their idiosyncratic volatility relative to the CAPM model. I then run a regression of each decile idiosyncratic volatility portfolio return on a constant, the excess market return and the 10-1 return (υ+ − υ−). Robust Newey-West (1987) t-statistics appear in square brackets. I also report the R-Square of the regression. In Panel C, I use the Fama and French factors as control variables. I run the regression:

$$ r_p = \alpha + \beta_M [R_M - R_f] + \beta_{SMB} [R_{SMB} - R_f] + \beta_{HML} [R_{HML} - R_f] + \beta_{\upsilon} (\upsilon^{+} - \upsilon^{-}) + \eta. \tag{B.22} $$

The sample period is from January 1971 to December 2006.

1 Low 2 3 4 5 6 7 8 9 10 High 10-1

Panel A

Mean 1.05 1.08 1.08 1.17 1.23 1.10 0.79 0.64 0.27 -0.30 -1.35
Std Dev (3.94) (4.50) (5.58) (6.26) (7.34) (8.38) (8.88) (9.21) (10.05) (12.46) [-2.44]

Panel B

α 0.58 0.71 0.85 1.10 0.87 0.71 0.97 0.87 0.59 0.04 -0.54
t-stat [2.90] [2.93] [3.21] [3.70] [2.89] [2.24] [2.96] [2.46] [1.72] [0.08] [-1.24]
βM -0.01 0.01 0.11 0.07 0.08 0.13 0.23 0.24 0.15 0.55 0.55
t-stat [-0.10] [0.09] [1.63] [0.97] [1.07] [1.48] [2.85] [2.48] [1.88] [4.96] [5.77]
βυ 0.01 -0.09 -0.18 -0.30 -0.37 -0.39 -0.51 -0.59 -0.67 -0.84 -0.84
t-stat [0.20] [-2.09] [-4.77] [-5.95] [-9.81] [-8.27] [-11.58] [-10.31] [-15.22] [-12.99] [-12.63]
R2(%) -0.45 2.41 8.41 16.33 20.93 20.70 30.68 33.13 40.02 44.43 47.12

Panel C

α 0.56 0.73 0.87 1.07 0.86 0.73 0.93 0.83 0.53 0.01 -0.55
t-stat [2.72] [2.90] [3.20] [3.52] [2.73] [2.20] [2.80] [2.27] [1.49] [0.01] [-1.24]
βM -0.01 -0.04 0.06 0.05 0.04 0.07 0.20 0.22 0.12 0.50 0.50
t-stat [-0.14] [-0.58] [0.83] [0.72] [0.48] [0.69] [2.32] [2.17] [1.27] [3.81] [4.13]
βSMB 0.05 0.15 0.18 0.13 0.20 0.20 0.19 0.15 0.26 0.28 0.23
t-stat [0.86] [2.41] [2.57] [1.58] [2.06] [1.42] [1.83] [1.26] [1.80] [1.54] [1.22]
βHML 0.02 -0.05 -0.05 0.04 -0.01 -0.07 0.04 0.05 0.07 0.01 -0.01
t-stat [0.38] [-0.70] [-0.68] [0.38] [-0.08] [-0.48] [0.33] [0.36] [0.48] [0.07] [-0.06]
βυ 0.00 -0.09 -0.18 -0.30 -0.37 -0.39 -0.52 -0.59 -0.68 -0.84 -0.85
t-stat [0.16] [-2.14] [-4.95] [-6.06] [-10.14] [-8.65] [-11.85] [-10.34] [-14.98] [-12.85] [-12.36]
R2(%) -0.78 3.14 9.17 16.33 21.36 21.20 30.87 33.10 40.49 44.75 47.31


Table X: Explaining the Low Returns of High Idiosyncratic Volatility Stocks

I form value-weighted decile portfolios every month by sorting stocks based on non-systematic coskewness relative to the CAPM model. I use only NYSE/AMEX/NASDAQ industrial firms. My portfolios contain stocks in percentiles 0-5, 5-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-95, and 95-100. At the end of each month, I split the sample into two groups, stocks with positive non-systematic coskewness and stocks with negative non-systematic coskewness. Within each group, I sort stocks into decile portfolios and then form value-weighted decile portfolios every month by sorting these stocks based on their non-systematic coskewness. When γkt is positive, I refer to the value-weighted portfolio formed with the lowest 5% non-systematic coskewness as ν− and the highest 5% non-systematic coskewness as ν+. When γkt is negative, I refer to the value-weighted portfolio formed with the lowest 5% non-systematic coskewness as υ− and the highest 5% non-systematic coskewness as υ+. I refer to 10-1 as the difference in expected return between portfolio 10 and portfolio 1. I use the 10-1 returns ν+ − ν− and υ+ − υ− as control variables in the CAPM linear regression model. For comparison purposes, I report in Panel A the Fama and French (1993) alphas of idiosyncratic volatility decile portfolios when I use all stocks (see Panel B of Table III). Robust Newey-West (1987) t-statistics appear in square brackets. In Panel B, I report the coefficients of the regression:

$$ r_p = \alpha + \beta_M [R_M - R_f] + \beta_{\nu} (\nu^{+} - \nu^{-}) + \beta_{\upsilon} (\upsilon^{+} - \upsilon^{-}) + \eta. \tag{B.23} $$

To run this regression, I consider all stocks and form value-weighted decile portfolios every month by sorting stocks based on their idiosyncratic volatility relative to the CAPM model (see Table III). I then run a regression of each decile idiosyncratic volatility portfolio return on a constant, the excess market return, the Fama and French (1993) three factors and the 10-1 returns (ν+ − ν−) and (υ+ − υ−). Robust Newey-West (1987) t-statistics appear in square brackets. I also report the R-Square of the regression. In Panel C, I use the Fama and French factors as control variables. I run the regression:

$$ r_p = \alpha + \beta_M [R_M - R_f] + \beta_{SMB} [R_{SMB} - R_f] + \beta_{HML} [R_{HML} - R_f] + \beta_{\nu} (\nu^{+} - \nu^{-}) + \beta_{\upsilon} (\upsilon^{+} - \upsilon^{-}) + \eta. \tag{B.24} $$

The sample period is from January 1971 to December 2006.

1 Low 2 3 4 5 6 7 8 9 10 High 10-1

Panel A

Alpha 0.59 0.47 0.65 0.64 0.55 0.37 0.21 0.07 -0.26 -0.88 -1.47
t-stat [3.04] [2.10] [2.47] [2.22] [1.64] [0.95] [0.53] [0.16] [-0.58] [-1.73] [-3.15]

Panel B

α 0.65 0.72 1.09 1.20 1.29 1.30 1.22 1.17 1.11 0.69 0.04
t-stat [3.14] [3.09] [4.33] [4.46] [4.49] [4.62] [3.79] [3.50] [3.58] [2.12] [0.17]
βM -0.01 0.04 0.06 0.07 0.08 0.13 0.13 0.17 0.14 0.29 0.30
t-stat [-0.13] [0.76] [0.97] [1.02] [1.06] [1.82] [1.65] [1.91] [1.85] [2.94] [4.09]
βυ -0.03 -0.07 -0.13 -0.20 -0.21 -0.26 -0.28 -0.32 -0.37 -0.42 -0.39
t-stat [-0.88] [-1.76] [-2.70] [-3.88] [-4.34] [-4.74] [-4.82] [-5.22] [-6.87] [-6.47] [-7.01]
βν 0.01 0.10 0.17 0.20 0.31 0.39 0.43 0.46 0.60 0.68 0.68
t-stat [0.18] [2.01] [3.58] [3.75] [5.81] [8.48] [7.29] [7.52] [10.67] [11.13] [11.44]
R2(%) 0.05 7.49 17.79 23.71 31.82 40.03 42.73 46.83 59.24 65.72 74.19

Panel C

α 0.65 0.75 1.08 1.18 1.29 1.34 1.20 1.18 1.11 0.69 0.04
t-stat [3.07] [3.10] [4.13] [4.15] [4.23] [4.53] [3.57] [3.35] [3.41] [2.04] [0.15]
βM -0.03 -0.01 0.03 0.05 0.06 0.06 0.10 0.15 0.12 0.27 0.29
t-stat [-0.46] [-0.23] [0.42] [0.69] [0.70] [0.78] [1.16] [1.58] [1.41] [2.36] [3.44]
βSMB 0.07 0.15 0.15 0.10 0.08 0.18 0.15 0.03 0.10 0.10 0.04
t-stat [1.37] [2.69] [2.27] [1.30] [0.89] [1.82] [1.60] [0.33] [0.90] [1.05] [0.46]
βHML -0.02 -0.09 -0.01 0.01 -0.02 -0.10 0.01 -0.03 -0.01 -0.02 0.00
t-stat [-0.39] [-1.32] [-0.18] [0.12] [-0.16] [-0.85] [0.08] [-0.26] [-0.05] [-0.22] [-0.02]
βυ -0.03 -0.08 -0.13 -0.20 -0.21 -0.27 -0.29 -0.32 -0.38 -0.43 -0.39
t-stat [-0.92] [-1.89] [-2.88] [-4.04] [-4.48] [-4.99] [-5.03] [-5.34] [-6.99] [-6.62] [-7.03]
βν 0.01 0.09 0.17 0.19 0.30 0.38 0.42 0.46 0.59 0.68 0.68
t-stat [0.14] [1.84] [3.38] [3.62] [5.61] [8.13] [7.07] [7.39] [10.65] [11.07] [11.44]
R2(%) -0.06 8.59 18.22 23.59 31.64 40.51 42.77 46.61 59.15 65.66 74.08
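The point estimates in regression (B.23) are ordinary least squares coefficients of a decile portfolio return on a constant, the excess market return, and the two 10-1 spreads. A minimal sketch under that reading is given below. The helper names `solve` and `ols` and the list-based inputs are assumptions made here, and the Newey-West t-statistics reported in the table are not computed.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(y, regressors):
    """Regress y on a constant and the given regressor series.
    Returns [alpha, beta_1, ..., beta_k] from the normal equations."""
    rows = [[1.0] + [series[t] for series in regressors] for t in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)
```

For one decile portfolio, `alpha, b_m, b_nu, b_ups = ols(rp, [mkt_excess, nu_spread, ups_spread])`, where the four series are hypothetical monthly inputs of equal length.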


[Figure 1 panels: Risk Aversion (RA); Skewness Preference (SP); Price of Volatility Risk (%); Price of Coskewness Risk (%). Horizontal axis in each panel: years, 1996 to 2006.]

Figure 1: Preference Parameters and Prices of Risk

Figure 1 depicts the risk aversion, skewness preference, price of volatility risk, and price of coskewness risk when I estimate the pricing kernel with constant preference parameters for different sample periods [1986 + j, 1986 + 10 + j], with j = 0, ..., 10. I report the preference parameters for different years j. I estimate the preference parameters of the pricing kernel via GMM utilizing the Euler equation condition $E_t\, m_{t,t+1} R_{k,t+1} = 1$, where $m_{t,t+1}$ represents the pricing kernel. I estimate the parameters by using the Hansen and Jagannathan (1997) weighting matrix. The sets of returns I use in my estimations are those of 30 industry-sorted portfolios covering the period January 1986 through December 31, 2006, augmented by the return on a 30-day Treasury bill. I use the Chicago Board Options Exchange (CBOE)'s VXO as my proxy for the market volatility.
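The GMM criterion behind these estimates can be illustrated with a small sketch. It evaluates the sample pricing errors implied by the Euler equation and weights them by the inverse of the sample second-moment matrix of gross returns, which is the Hansen and Jagannathan (1997) weighting matrix. The pricing kernel specification is defined elsewhere in the paper and is not reproduced here, so the function takes an already computed kernel series `m` as input. The names `solve` and `hj_objective` and the data layout are assumptions, and the minimization over preference parameters is left out.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hj_objective(m, returns):
    """Quadratic form g' W g, where g_k = mean(m_t * R_kt) - 1 are the sample
    pricing errors and W is the inverse of the sample second-moment matrix
    of gross returns. `returns` is a list of T rows of N gross returns."""
    t_obs, n = len(returns), len(returns[0])
    g = [sum(m[t] * returns[t][i] for t in range(t_obs)) / t_obs - 1.0
         for i in range(n)]
    second = [[sum(r[i] * r[j] for r in returns) / t_obs for j in range(n)]
              for i in range(n)]
    w_g = solve(second, g)  # W g, without forming the inverse explicitly
    return sum(gi * wi for gi, wi in zip(g, w_g))
```

A kernel that prices every test asset exactly gives pricing errors of zero and hence an objective of zero. An estimation routine would recompute `m` for each candidate parameter vector and minimize this value.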


[Figure 2 panels: Risk Aversion (RA); Skewness Preference (SP); Price of Volatility Risk (%); Price of Coskewness Risk (%). Horizontal axis in each panel: years, 2000 to 2006.]

Figure 2: Preference Parameters and Prices of Risk

Figure 2 depicts the risk aversion, skewness preference, price of volatility risk, and price of coskewness risk when I estimate the pricing kernel with constant preference parameters for different sample periods [1990 + j, 1990 + 10 + j], with j = 0, ..., 6. I report the preference parameters for different years j. I estimate the preference parameters of the pricing kernel via GMM utilizing the Euler equation condition $E_t\, m_{t,t+1} R_{k,t+1} = 1$, where $m_{t,t+1}$ represents the pricing kernel. I estimate the parameters by using the Hansen and Jagannathan (1997) weighting matrix. The sets of returns I use in my estimations are those of 30 industry-sorted portfolios covering the period January 1990 through December 31, 2006, augmented by the return on a 30-day Treasury bill. I use the Chicago Board Options Exchange (CBOE)'s VIX as my proxy for the market volatility.


[Figure 3 panels: Pricing Kernel (PK) for the samples 01/1996-12/2006, 01/1986-12/2000, and 01/1986-12/2006. Axes in each panel: rm, ∆σ²m, and PK.]

Figure 3: Estimated Pricing Kernels

Figure 3 depicts point estimates of the pricing kernels estimated with constant preference parameters. The support for the graphs is the range of the return on the value-weighted index and the implied volatility difference. I estimate the preference parameters of the pricing kernel via GMM utilizing the Euler equation condition $E_t\, m_{t,t+1} R_{k,t+1} = 1$, where $m_{t,t+1}$ represents the pricing kernel. I estimate the parameters by using the Hansen and Jagannathan (1997) weighting matrix. The sets of returns I use in my estimations are those of 30 industry-sorted portfolios covering the period January 1986 through December 31, 2006, augmented by the return on a 30-day Treasury bill. I use the Chicago Board Options Exchange (CBOE)'s VXO as my proxy for the market volatility.


[Figure 4 panels: Pricing Kernel (PK) for the samples 01/1996-12/2006, 01/1990-12/2000, and 01/1990-12/2006. Axes in each panel: rm, ∆σ²m, and PK.]

Figure 4: Estimated Pricing Kernels

Figure 4 depicts point estimates of the pricing kernels estimated with constant preference parameters. The support for the graphs is the range of the return on the value-weighted index and the implied volatility difference. I estimate the preference parameters of the pricing kernel via GMM utilizing the Euler equation condition $E_t\, m_{t,t+1} R_{k,t+1} = 1$, where $m_{t,t+1}$ represents the pricing kernel. I estimate the parameters by using the Hansen and Jagannathan (1997) weighting matrix. The sets of returns I use in my estimations are those of 30 industry-sorted portfolios covering the period January 1990 through December 31, 2006, augmented by the return on a 30-day Treasury bill. I use the Chicago Board Options Exchange (CBOE)'s VIX as my proxy for the market volatility.


[Figure 5 panels: Projected Pricing Kernel for the samples 01/1996-12/2006, 01/1986-12/2000, and 01/1986-12/2006. Horizontal axis: rm; vertical axis: PK.]

Figure 5: Projected Pricing Kernels

Figure 5 depicts the projection of the pricing kernel estimated with constant preference parameters (see Figure 3) on a polynomial function of the market return, $m_{t,t+1} = \sum_{j=0}^{5} b_j r_{M,t+1}^{j}$. The support for the graphs is the observed range of the return on the value-weighted index.
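The projection on a fifth-order polynomial in the market return can be sketched as a least-squares fit. The code below is a hedged illustration, not the procedure used for the figure: it assumes the projection is an ordinary least-squares regression over the observed sample, and the names `solve`, `project_on_polynomial`, and `evaluate` are made up here. Returns are rescaled to [-1, 1] before fitting because raw powers of monthly returns are numerically very small.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def project_on_polynomial(m, r, degree=5):
    """Least-squares coefficients b_0..b_degree of the kernel series m on
    powers of the market return series r."""
    scale = max(abs(v) for v in r) or 1.0  # rescale returns for conditioning
    rows = [[(v / scale) ** j for j in range(degree + 1)] for v in r]
    k = degree + 1
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * mt for row, mt in zip(rows, m)) for i in range(k)]
    c = solve(xtx, xty)
    return [c[j] / scale ** j for j in range(k)]  # undo the rescaling


def evaluate(b, x):
    """Evaluate the fitted polynomial at market return x using Horner's rule."""
    value = 0.0
    for coef in reversed(b):
        value = value * x + coef
    return value
```

Plotting `evaluate(b, x)` over the observed range of the market return gives a curve of the kind described in the caption.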


[Figure 6 panels: Projected Pricing Kernel for the samples 01/1996-12/2006, 01/1990-12/2000, and 01/1990-12/2006. Horizontal axis: rm; vertical axis: PK.]

Figure 6: Projected Pricing Kernels

Figure 6 depicts the projection of the pricing kernel estimated with constant preference parameters (see Figure 4) on a polynomial function of the market return, $m_{t,t+1} = \sum_{j=0}^{5} b_j r_{M,t+1}^{j}$. The support for the graphs is the observed range of the return on the value-weighted index.


[Figure 7 plot: Expected Return (%) by Group (1 to 10). Series: CAPM, FF, FF-M, HS.]

Figure 7: Idiosyncratic Volatility and Expected Returns

Figure 7 plots the expected return across deciles when I use different measures of idiosyncratic volatility. CAPM indicates that the idiosyncratic volatility is computed using the CAPM model. FF indicates the Fama and French (1993) model, FF-M indicates the Fama and French model augmented with the momentum factor of Jegadeesh and Titman (1993), and HS indicates the Harvey and Siddique (2000) market coskewness model.
