Post on 24-Jul-2018
Trading Bond Convexity - A Model Agnostic Approach
A thesis submitted in partial fulfillment of the MSc in
Mathematical Finance
April 7, 2015
Candidate no. 900228
This thesis is dedicated to the multitude of Indian children who do not have the
means to afford the luxury of education.
Acknowledgements
I would like to express my sincere gratitude to my advisor Dr. Riccardo Rebonato
for introducing me to this fascinating topic and for coaching me throughout. This
thesis would not have been possible without his relentless patience in answering my
many trivial questions and in explaining to me the key concepts. I have learned a lot,
both from his classes at the Mathematical Institute and from our discussions over the
phone and email. I have been extremely lucky to have interacted with Dr. Vladimir
Putyatin throughout the course of this thesis. His insights, advice and review of my
calculations have gone a long way in helping me to complete the analysis and write
this thesis.
The interactions with the faculty members at the Mathematical Institute have
been extremely beneficial to my overall understanding of the nuances of Mathematical
Finance and I would like to thank them for their quality teaching and feedback.
My experience on each of the 7 visits to the Mathematical Institute has been very
satisfying and it would not have been possible without the hard work that the staff
of the Mathematical Institute put in prior to every visit. A big thank you to them.
As we are all aware, life is much easier to live when one has a supportive
and caring family. This is where I have been blessed with wonderful parents who
have been encouraging throughout and my wife Padma who has been a phenomenal
source of positive energy in my life. My deepest gratitude to all of them.
In this essay, we study bond portfolio Convexity and we do so from three different
perspectives. First, we introduce a model based representation of what the portfolio
convexity should be using a simple Vasicek setting followed by a general multi-factor
Affine set up. Second, we derive a novel model agnostic approach to extract the
value of portfolio convexity in terms of portfolio “Carry” and “Roll-Down”. Finally,
we develop a trading strategy which employs the model agnostic representation of
portfolio convexity to exploit discrepancies in implied and realized convexity using
the Treasury data provided by the US Federal Reserve[10] for the period 1987-2014.
Our intention to focus on portfolio convexity is ultimately linked to the belief that
mis-pricings in the fair value of convexity exist in today’s markets. These mis-pricings
provide us with a trading opportunity and motivate us to develop a model agnostic
approach to monetize convexity. The trading strategy is relatively easy to implement
and is overall profitable conditional on the quality of the estimates of future yield
volatility. Furthermore, we show that the profitability of the trading strategy is not due
to uncontrolled residual exposure to level, slope or curvature of the yields but is purely
due to the ability of the strategy to tap into the mis-pricings in convexity.
Contents
1 Introduction
2 Convexity
2.1 Monetizing Convexity
2.2 The Vasicek Setting
2.3 A General Affine Model
3 A Model Agnostic Approach
3.1 The Weights
4 The Trading Strategy
5 Strategy Implementation
5.1 Estimation of Weights
5.2 Estimation of Volatilities
5.3 Implementation Steps
6 Results
6.1 Estimated Weights
6.2 Estimated Volatilities
6.3 Portfolio Profit & Loss
7 Conclusion
8 Appendix 1 - Estimation of Portfolio Weights: MATLAB Code
9 Appendix 2 - Estimation of Volatilities: R Code
References
List of Figures
2.1 Bond Price vs. Yield
5.1 ACF plot for 10 year bond at t = 9th Feb 1987
5.2 PACF plot for 10 year bond at t = 9th Feb 1987
5.3 ACF plot for 20 year bond at t = 2nd Oct 2014
5.4 PACF plot for 20 year bond at t = 2nd Oct 2014
5.5 ACF plot for 30 year bond at t = 12th May 2014
5.6 PACF plot for 30 year bond at t = 12th May 2014
6.1 Estimated weights for the 10 (Blue line) and 20 (Orange line) year bonds for the portfolio 10/20/30
6.2 Estimated weights for the 5 (Blue line) and 10 (Orange line) year bonds for the portfolio 5/10/30
6.3 Estimated weights for the 5 (Blue line) and 15 (Orange line) year bonds for the portfolio 5/15/30
6.4 Estimated volatilities σ_5 (Yellow line), σ_10 (Orange line) and σ_15 (Green line)
6.5 Estimated volatilities σ_20 (Blue line) and σ_30 (Orange line)
6.6 P&L for the portfolios
7.1 Signal Strength vs. Average P&L for 10/20/30
7.2 Signal Strength vs. Average P&L for 5/15/30
7.3 Signal Strength vs. Average P&L for 5/10/30
Chapter 1
Introduction
It is well known that the forces that shape the yield curve¹ manifest themselves
through (i) expectations of future one-period interest rates, (ii) term premia and
(iii) convexity[7]. The expectations hypothesis asserts that yields on long-term bonds
must equal the expected future one-period interest rates. The term premium is the
compensation that an investor hopes to receive for bearing duration risk and, finally,
convexity arises from the nonlinear relationship between yields and bond prices. The
overall shape of the yield curve is, in reality, a trade-off between these three competing
effects. Each of the three has a maturity range in which it is most active: the
expectations component is material at the short end of the yield curve, term premia are
material at medium maturities and convexity ultimately dominates at the long end of
the yield curve.
In this essay, we study bond portfolio Convexity and we do so from three different
perspectives. First, we introduce a model based representation of what the portfolio
convexity should be using a simple Vasicek setting followed by a general multi-factor
Affine set up. Second, we derive a novel model agnostic approach to extract the
value of portfolio convexity in terms of portfolio “Carry” and “Roll-Down”. Finally,
we develop a trading strategy which employs the model agnostic representation of
portfolio convexity to exploit discrepancies in implied and realized convexity using
the Treasury data provided by the US Federal Reserve[10] for the period 1987-2014.
Our intention to focus on portfolio convexity is ultimately linked to the belief that
mis-pricings in the fair value of convexity exist in today’s markets. This mis-pricing
can be a result of the sudden changes in volatilities after the 2008 financial crisis
or of a net demand for long-term bonds by institutions such as pension funds².
Whatever the reason, these mis-pricings provide us with a trading opportunity and
motivate us to develop a model agnostic approach to monetize convexity. However,
harnessing this value of convexity requires an active dynamic strategy, similar to the
‘gamma’ trading that an option trader engages in. Whether we make a profit will
depend on the interplay between the market’s perception of today’s convexity and
the future realized convexity or, in other words, on the interplay between realized and
implied yield volatilities, much like ‘gamma’ trading for options.
¹ There are a number of other factors that can also affect a bond’s yield. For example, credit risk
could affect yields of defaultable bonds. Illiquidity is another factor affecting yields[7].
² Pension funds or insurance companies may have negatively convex long-dated liabilities and
may want to match their long-dated liabilities with long-dated assets. Their desire to match the
maturity profile of their liabilities, and to reduce the associated negative convexity, creates a net
institutional demand for these long-dated assets[21].
Academic literature that discusses the possibility of monetizing convexity is practically
non-existent. It is beyond doubt that many financial institutions, such as hedge funds,
employ proprietary trading strategies that attempt to exploit the mis-pricings in
convexity; however, information about those strategies is largely unavailable publicly.
This study is unique in that respect and attempts to introduce a trading strategy to
monetize convexity using a novel model agnostic representation of portfolio convexity.
The trading strategy is relatively easy to implement and is overall profitable conditional
on the quality of the estimates of future yield volatility. Furthermore, we show that the
profitability of the trading strategy is not due to uncontrolled residual exposure to the
level, slope or curvature of the yields but is purely due to the ability of the strategy to
tap into the mis-pricings in convexity.
The rest of this essay is organized as follows: Chapter 2 introduces model de-
pendent expressions of portfolio convexity using a Vasicek model and a multi-factor
Affine model. The main contributions of this essay are in Chapter 3 and Chapter 4
where we introduce the model agnostic representation of portfolio convexity and the
corresponding trading strategy. Chapter 5 and Chapter 6 discuss the implementation
steps and the results of running the strategy on US Treasury data for the period
1987-2014. Chapter 7 concludes the essay.
Chapter 2
Convexity
In this section, we study Convexity and its impact on the shape of the yield curve.
In particular, we show that (i) Convexity has the effect of depressing bond yields, (ii)
the effect of Convexity is larger for long dated bonds, and (iii) Convexity is related to
the volatility of the bond yields, in the sense that if there is no volatility, there shall
be no Convexity. Much of this analysis has been carried out in the celebrated paper of
Mark Fisher[7] under a discrete time setting where all the uncertainty in the bond
prices is resolved through the flip of an unbiased coin. Here we adopt a continuous
time approach but essentially arrive at similar conclusions.
Let y(t, T ) denote the time t yield of a zero coupon bond with maturity T . Recall
that the price of a zero coupon bond at time t with maturity T is given by
$$p(t,T) = \exp\left[-(T-t)\,y(t,T)\right]$$
and so
$$y(t,T) = -\frac{\ln p(t,T)}{T-t} \tag{2.1}$$
Figure (2.1) plots the relationship between bond yield and bond price as represented
by equation (2.1). Notice the convex nature of the curve that results in an average
price (the red line) greater than the price due to average yield (the yellow line). This
feature is called convexity which is nothing but Jensen’s inequality at play. For any
convex function f and a random variable X, Jensen’s inequality is represented as
E [f(X)] ≥ f(E [X]), whenever the expectation exists.
In our setup, exp [−(T − t)y(t, T )] is a convex function of the T maturity yields and
thus the above inequality holds.
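As a small numerical illustration of this inequality, the sketch below compares the average bond price across two equally likely yield scenarios with the price at the average yield. The maturity and the two yield levels are hypothetical values chosen only for illustration; they are not taken from the data used in this essay.

```python
import math

def zero_price(y, tau):
    """Zero coupon bond price for a continuously compounded yield y
    and time to maturity tau (in years)."""
    return math.exp(-tau * y)

tau = 30.0              # hypothetical time to maturity T - t
yields = [0.02, 0.06]   # two equally likely yield scenarios (hypothetical)

# E[f(X)]: average of the prices in the two scenarios
avg_price = sum(zero_price(y, tau) for y in yields) / len(yields)

# f(E[X]): price evaluated at the average yield
price_at_avg_yield = zero_price(sum(yields) / len(yields), tau)

print(f"average price:          {avg_price:.4f}")
print(f"price at average yield: {price_at_avg_yield:.4f}")
```

The first number exceeds the second, and the difference widens as either the maturity or the spread between the two yield scenarios grows.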
Figure 2.1: Bond Price vs. Yield
In pursuit of what drives convexity, we assume that a stochastic quantity X(t) has
the following simplistic underlying process
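$$dx_t = \mu_t\,dt + \sigma_t\,dW_t \tag{2.2}$$
where $W_t$ is a standard Brownian motion and $\mu_t$, $\sigma_t$ are deterministic. We know that
$$x_t = x_0 + \int_0^t \mu_t\,dt + \int_0^t \sigma_t\,dW_t$$
Let $x_0 = 0$, which is nothing but an assumption made to simplify calculations. Then
$$\exp\left(E\left[\int_0^t \mu_t\,dt + \int_0^t \sigma_t\,dW_t\right]\right) = \exp\left(\int_0^t \mu_t\,dt\right) \tag{2.3}$$
since $\mu_t$ is deterministic and the expectation of the Ito integral $\int_0^t \sigma_t\,dW_t$ is 0. Further,
since $\exp\left(-\int_0^t \frac{1}{2}\sigma_t^2\,dt + \int_0^t \sigma_t\,dW_t\right)$ is an exponential martingale, we have
$$E\left[\exp\left(-\int_0^t \frac{1}{2}\sigma_t^2\,dt + \int_0^t \sigma_t\,dW_t\right)\right] = 1 \tag{2.4}$$
From equation (2.4), we can write
\begin{align}
E\left[\exp\left(\int_0^t \mu_t\,dt + \int_0^t \sigma_t\,dW_t\right)\right]
&= \exp\left(\int_0^t \mu_t\,dt + \int_0^t \frac{1}{2}\sigma_t^2\,dt\right) \tag{2.5}\\
&> \exp\left(\int_0^t \mu_t\,dt\right) \tag{2.6}\\
&= \exp\left(E\left[\int_0^t \mu_t\,dt + \int_0^t \sigma_t\,dW_t\right]\right) \tag{2.7}
\end{align}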
Through the above equations, we have shown that indeed $E[\exp(X_t)] \geq \exp(E[X_t])$
but, crucially, it is the quantity $\exp\left(\int_0^t \frac{1}{2}\sigma_t^2\,dt\right)$ that drives the gap between the two
expressions. Thus, convexity is essentially driven by the square of the volatility of
the stochastic variable X(t), at least when dealing with exponential functions. In the
context of yields and prices, therefore, it is the yield volatilities that drive convexity
and if there is no volatility then there shall be no convexity.
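The size of this gap can be checked numerically. The following is a minimal sketch, assuming a constant drift and a constant volatility (both hypothetical numbers), in which case the process value at time t is normally distributed. It evaluates the expectation of the exponential by a simple trapezoid rule and compares it with the exponential of the expectation multiplied by the factor from equation (2.5).

```python
import math

def expected_exp(mean, std, n_steps=20000, width=10.0):
    """Trapezoid-rule estimate of E[exp(X)] for X ~ N(mean, std^2),
    integrating over +/- `width` standard deviations."""
    h = 2.0 * width / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        z = -width + k * h
        weight = 0.5 if k in (0, n_steps) else 1.0   # trapezoid end points
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * math.exp(mean + std * z) * density
    return total * h

mu, sigma, t = 0.05, 0.20, 1.0          # hypothetical constant drift, volatility, horizon
mean_x = mu * t                         # E[x_t] when x_0 = 0
std_x = sigma * math.sqrt(t)            # standard deviation of x_t

lhs = expected_exp(mean_x, std_x)       # E[exp(x_t)]
rhs = math.exp(mean_x)                  # exp(E[x_t])
gap = math.exp(0.5 * sigma ** 2 * t)    # convexity factor exp(0.5 * sigma^2 * t)

print(f"E[exp(x_t)]        = {lhs:.8f}")
print(f"exp(E[x_t]) * gap  = {rhs * gap:.8f}")
```

Setting `sigma` to zero makes the factor equal to one and the two sides coincide, which is the numerical counterpart of the statement that without volatility there is no convexity.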
We now ask how convexity affects the yield curve. Note that from figure
(2.1), for a downward move in the yield the bond price goes up by more than it goes
down for an upward move in the yield. The investors are aware of this feature of the
bonds and therefore are willing to pay more for long dated bonds because this feature
is most prevalent in bonds with longer maturities. For instance, since
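$$p(t,T) = \exp\left[-(T-t)\,y(t,T)\right]$$
we have
$$\frac{\partial^2 p(t,T)}{\partial y(t,T)^2} = (T-t)^2\,p(t,T) \tag{2.8}$$
and
$$\frac{E\left[dp(t,T)\right]}{p(t,T)} = \frac{1}{p(t,T)}\frac{\partial p(t,T)}{\partial t}\,dt + \frac{\partial p(t,T)}{\partial y(t,T)}\frac{E\left[dy(t,T)\right]}{p(t,T)} + \frac{1}{2}(T-t)^2\sigma_{y_t}^2\,dt \tag{2.9}$$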
The term $(T-t)^2\sigma_{y_t}^2$ is always positive and is precisely responsible for giving that
extra increment to the expected returns for long term bonds. Thus long dated bonds
command a higher price and hence a lower yield due to the effects of convexity.
In other words, convexity has the effect of lowering bond yields and this effect is
most pronounced for long dated bonds. The other way to understand this effect of
convexity is through the lens of the forward rate curve. Brown et al. (2000)[2] note that
the term structure of long term forward interest rates is generally downward sloping.
They show that this effect arises due to the combined effect of the term structure of
volatility of long term interest rates and the greater convexity of long term bonds[2].
Thus Convexity, once again, has the effect of lowering the long term forward rates.
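To make the maturity dependence of the convexity term in equation (2.9) concrete, the sketch below tabulates the quantity one half times squared time to maturity times squared yield volatility for a few maturities, together with the asymmetry between the price gain from a fall in yield and the price loss from an equal rise. The flat yield, the common yield volatility and the size of the shock are hypothetical inputs.

```python
import math

def price(y, tau):
    """Zero coupon bond price for yield y and time to maturity tau."""
    return math.exp(-tau * y)

def convexity_profile(taus, y0, sigma_y, shock):
    """For each time to maturity return (tau, convexity term, relative asymmetry)."""
    rows = []
    for tau in taus:
        term = 0.5 * tau ** 2 * sigma_y ** 2             # last term of (2.9), per unit time
        gain = price(y0 - shock, tau) - price(y0, tau)   # yield falls by `shock`
        loss = price(y0, tau) - price(y0 + shock, tau)   # yield rises by `shock`
        rows.append((tau, term, (gain - loss) / price(y0, tau)))
    return rows

# Hypothetical inputs: flat 4% yield, 1% yield volatility, 100 basis point shock
profile = convexity_profile((5.0, 10.0, 30.0), y0=0.04, sigma_y=0.01, shock=0.01)
for tau, term, asymmetry in profile:
    print(f"tau={tau:4.0f}  convexity term={term:.5f}  relative asymmetry={asymmetry:.5f}")
```

Both columns increase with maturity, in line with the statement that the effect of convexity is most pronounced for long dated bonds.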
2.1 Monetizing Convexity
Unlike other risk factors (duration risk for example), harnessing the value of convex-
ity requires an active dynamic strategy, similar to “gamma” trading that a delta-
neutralized option trader would engage in. Suppose we buy an option and engage
in accurate delta hedging. We know that when we buy an option we end up being
long “gamma” and long realized volatility. Whether we make money at option expiry
will then depend on whether the future realized volatility turns out to be bigger or
smaller than the implied volatility that determined the price of the option. Capturing
the value of convexity is also based on similar principles. By engaging in effective
duration hedging, the investor can tap into the difference between the realized and
the implied volatility. In particular, if the investor believes that the market implies
too low a volatility, he will build a duration-neutral portfolio which is long convexity.
This leads to what we call ‘yield give-up’ which is similar to ‘premium paid’ in the
context of “gamma” trading because the yield at which he buys a long dated bond
is lower than the yield in the absence of convexity. The investor then makes money
by continuously duration re-hedging his portfolio. Starting from a duration neutral
portfolio, as the yields move, the portfolio loses its duration neutrality. As the port-
folio is re-hedged, the investor makes money but the amount of money he makes will
depend on the difference between the implied and realized yield volatility. Note that
the money is made because the investor is a net buyer when prices have gone down
and a net seller when prices have gone up. Similarly, if the investor believes that
the market implies too high a volatility, he will build a duration-neutral portfolio with
negative convexity. This negative convexity portfolio will offer the investor a ‘yield
pick-up’ which is similar to ‘premium received’ in the context of “gamma” trading.
The investor will now hope that he has to duration hedge the portfolio as rarely as
possible because now to remain duration neutral he will have to buy when the prices
go up and sell when prices go down.
The strategy discussed above, however, requires immunization not just against
the parallel moves in the yield curve but also against the slope and curvature risks in
the yield curve. The other crucial factor is that this strategy requires a very active
re-balancing of the portfolio because each time the yield curve moves up or down and
comes back to the same level without the investor having re-balanced his position, he
would have lost some of the time value that he paid for. Thus monetizing convexity
requires a very active strategy and whether we make money really depends on how
well we immunize our portfolio, the precision of our “belief” on future volatility and
how actively we re-balance our position.
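The mechanics of a duration-neutral, long-convexity position can be illustrated with a minimal sketch. It assumes a flat yield curve, instantaneous parallel shifts and a 5/10/30 barbell against a bullet, all of which are simplifying assumptions made here for illustration; it ignores the yield give-up that accrues with the passage of time and any slope or curvature risk. The function names are hypothetical.

```python
import math

def price(y, tau):
    """Zero coupon bond price for a continuously compounded yield y
    and time to maturity tau."""
    return math.exp(-tau * y)

def barbell_units(y, tau_short, tau_mid, tau_long):
    """Units of the short and long bonds that, against a short position of one
    unit of the middle bond, give zero net value and zero net dollar duration."""
    p_short, p_mid, p_long = (price(y, tau) for tau in (tau_short, tau_mid, tau_long))
    value_long = p_mid * (tau_mid - tau_short) / (tau_long - tau_short)
    value_short = p_mid - value_long
    return value_short / p_short, value_long / p_long

def portfolio_value(y, units, taus):
    """Value of the barbell (long the short and long bonds) minus the bullet."""
    n_short, n_long = units
    tau_short, tau_mid, tau_long = taus
    return (n_short * price(y, tau_short) + n_long * price(y, tau_long)
            - price(y, tau_mid))

y0 = 0.04                    # hypothetical flat yield level
taus = (5.0, 10.0, 30.0)     # times to maturity of the three bonds
units = barbell_units(y0, *taus)

for shift in (-0.005, 0.005):    # parallel moves of 50 basis points either way
    pnl = portfolio_value(y0 + shift, units, taus) - portfolio_value(y0, units, taus)
    print(f"shift {shift:+.3f}: P&L {pnl:.6f}")
```

The position gains for a move in either direction, which is the bond analogue of being long ‘gamma’. Whether that gain outweighs the yield give-up is exactly the comparison between realized and implied volatility discussed above.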
2.2 The Vasicek Setting
Consider a Vasicek model for the short rate rt given by
drt = κ(θ − rt)dt+ σrdWt (2.10)
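where $\theta$ is the mean reversion level, $\kappa$ is the reversion speed, $\sigma_r$ is the volatility of the
short rate and $W_t$ is a standard Brownian motion. We know that in a Vasicek model
\begin{align}
p(t,T) &= \exp\left(A_t^T + B_t^T r_t\right) \quad \text{where} \tag{2.11}\\
A_t^T &= -\frac{1}{2}\sigma_r^2 \int_t^T \left(B_s^T\right)^2 ds - \kappa\theta \int_t^T B_s^T\,ds \tag{2.12}\\
B_t^T &= -\frac{1-\exp\left(-\kappa(T-t)\right)}{\kappa} \tag{2.13}
\end{align}
Now
$$y(t,T) = -\frac{1}{T-t}\log p(t,T) = -\frac{1}{T-t}\left(A_t^T + B_t^T r_t\right) = a_t^T + b_t^T r_t \ \text{ say.} \tag{2.14}$$
To calculate a process for the yields, we proceed as follows
\begin{align}
dy(t,T) &= \frac{\partial y(t,T)}{\partial t}\,dt + \frac{\partial y(t,T)}{\partial r}\,dr_t + \frac{\partial^2 y(t,T)}{\partial r^2}\,(dr_t)^2 \tag{2.15}\\
&= b_t^T\,dr_t, \ \text{assuming constant maturity} \tag{2.16}\\
&= b_t^T \kappa(\theta - r_t)\,dt + b_t^T \sigma_r\,dW_t \tag{2.17}\\
&= \mu_{y_t}^T\,dt + \sigma_{y_t}^T\,dW_t \tag{2.18}
\end{align}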
Note that our assumption of a constant maturity yield works well for reasonably
shaped yield curves, i.e., for yield curves which are not too steep at the long end. We
start by setting up a portfolio Πt at time t consisting of 1 unit of T1 maturity bonds
and w2 units of T2 maturity bonds. So we have
Πt = p(t, T1) + w2p(t, T2)
and the change in the portfolio value over the time interval dt is given by
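\begin{align}
d\Pi_t &= dp(t,T_1) + w_2\,dp(t,T_2) \tag{2.19}\\
&= \left[\mathcal{A}\left(p(t,T_1)\right) + w_2\,\mathcal{A}\left(p(t,T_2)\right)\right]dt
+ \left[\frac{\partial p(t,T_1)}{\partial y(t,T_1)}\sigma_{y_t}^{T_1} + w_2\frac{\partial p(t,T_2)}{\partial y(t,T_2)}\sigma_{y_t}^{T_2}\right]dW_t \tag{2.20}
\end{align}
where
$$\mathcal{A}\left(p(t,T_i)\right) = \frac{\partial p(t,T_i)}{\partial t} + \frac{\partial p(t,T_i)}{\partial y(t,T_i)}\mu_{y_t}^{T_i} + \frac{1}{2}\frac{\partial^2 p(t,T_i)}{\partial y(t,T_i)^2}\left(\sigma_{y_t}^{T_i}\right)^2$$
We choose $w_2$ so that the $dW_t$ term in equation (2.20) is 0 and so
$$w_2 = -\frac{\partial p(t,T_1)}{\partial y(t,T_1)}\,\frac{\partial y(t,T_2)}{\partial p(t,T_2)}\,\frac{\sigma_{y_t}^{T_1}}{\sigma_{y_t}^{T_2}}$$
Since the portfolio is now riskless, we must have
\begin{align}
d\Pi_t &= \Pi_t r_t\,dt \tag{2.21}\\
\Rightarrow \left[\mathcal{A}\left(p(t,T_1)\right) + w_2\,\mathcal{A}\left(p(t,T_2)\right)\right]dt &= \left[p(t,T_1) + w_2\,p(t,T_2)\right]r_t\,dt \tag{2.22}
\end{align}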
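$$\Rightarrow \sum_{i=1}^{2} w_i\left[\left(\frac{\partial p(t,T_i)}{\partial t} + \frac{\partial p(t,T_i)}{\partial y(t,T_i)}\mu_{y_t}^{T_i} - p(t,T_i)\,r_t\right) + \frac{1}{2}\frac{\partial^2 p(t,T_i)}{\partial y(t,T_i)^2}\left(\sigma_{y_t}^{T_i}\right)^2\right] = 0 \tag{2.23}$$
where $w_1 = 1$, since the portfolio holds 1 unit of the $T_1$ maturity bond.
Recall that $\sigma_{y_t}^{T_i} = \sigma_r b_t^{T_i}$ and so we can write equation (2.23) as
$$C\sigma_r^2 - Y = 0 \tag{2.24}$$
where
\begin{align}
C &= \frac{1}{2}\sum_{i=1}^{2} w_i\,\frac{\partial^2 p(t,T_i)}{\partial y(t,T_i)^2}\left(b_t^{T_i}\right)^2 \quad \text{and} \tag{2.25}\\
Y &= -\sum_{i=1}^{2} w_i\left(\frac{\partial p(t,T_i)}{\partial t} + \frac{\partial p(t,T_i)}{\partial y(t,T_i)}\mu_{y_t}^{T_i} - p(t,T_i)\,r_t\right) \tag{2.26}
\end{align}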
The interpretation of equation (2.24) is critical. If we define a break-even volatility
as $\sigma_r^2 = Y/C$ and if the realized volatility over the time interval dt, say 1 day or
1 week, is bigger than the break-even volatility, a portfolio which is long convexity
will make money. Similarly, if the realized volatility is smaller than the break-even
volatility, a long convexity portfolio will lose money. However, sitting at time t one
would have no way to know the realized volatility over time step dt unless we resort to
a forecasting technique that provides the best guess of the realized volatility over dt.
This is precisely why the estimation of future volatility is of paramount importance
and we discuss this aspect in detail in Chapter 5. Looking at equation (2.24), it is
straightforward to conclude that under the Vasicek model, the portfolio profitability
will depend on the volatility of the short rate rt and not on the volatilities of the yields
themselves. This is undesirable because we know that Convexity, if anything, is driven
by the yield volatilities and to have the portfolio P&L guided by the uncertainty in
the short rate is reason enough to warrant an improvement in our choice of a model.
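The hedge ratio of the two-bond Vasicek portfolio can be sketched in a few lines. The code below uses the loading implied by equations (2.13) and (2.14), the fact that the derivative of a zero coupon price with respect to its own yield is minus the time to maturity times the price, and the expression for $w_2$ derived above. The reversion speed, short rate volatility, maturities and bond prices are hypothetical inputs, and the function names are illustrative only.

```python
import math

def yield_loading(kappa, tau):
    """b_t^T = -B_t^T / (T - t), using B_t^T from (2.13)."""
    return (1.0 - math.exp(-kappa * tau)) / (kappa * tau)

def hedge_weight(p1, tau1, p2, tau2, kappa):
    """w_2 that sets the dW_t term of (2.20) to zero.
    The short rate volatility cancels in the ratio of yield volatilities."""
    dp1_dy1 = -tau1 * p1          # derivative of p(t, T1) with respect to y(t, T1)
    dp2_dy2 = -tau2 * p2          # derivative of p(t, T2) with respect to y(t, T2)
    vol_ratio = yield_loading(kappa, tau1) / yield_loading(kappa, tau2)
    return -(dp1_dy1 / dp2_dy2) * vol_ratio

kappa, sigma_r = 0.10, 0.01       # hypothetical reversion speed and short rate volatility
tau1, tau2 = 10.0, 30.0           # hypothetical times to maturity
p1, p2 = 0.70, 0.35               # hypothetical bond prices

w2 = hedge_weight(p1, tau1, p2, tau2, kappa)

# Residual exposure to dW_t after hedging; it should vanish up to rounding
sigma_y1 = sigma_r * yield_loading(kappa, tau1)
sigma_y2 = sigma_r * yield_loading(kappa, tau2)
diffusion = (-tau1 * p1) * sigma_y1 + w2 * (-tau2 * p2) * sigma_y2

print(f"w2 = {w2:.4f}, yield vols = {sigma_y1:.5f}, {sigma_y2:.5f}, residual = {diffusion:.2e}")
```

Note that the yield volatilities produced by the model are fixed multiples of the single short rate volatility, which is the limitation of the Vasicek setting pointed out above.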
2.3 A General Affine Model
In this section we consider a multi-factor affine model of $n-1$ state variables, $\tilde{x}_t$,
that follow a joint mean-reverting process given by
$$d\tilde{x}_t = \kappa\left(\tilde{\theta} - \tilde{x}_t\right)dt + S\,d\tilde{W}_t \tag{2.27}$$
where $\kappa$, $S$ are square matrices of order $n-1$ and $E\left[dW_t^i\,dW_t^j\right] = \delta_{ij}\,dt$, $\delta_{ij}$ being the
Kronecker delta. Let the short rate $r_t$ be of the form
$$r_t = u_r + \tilde{g}'\tilde{x}_t \tag{2.28}$$
Then the price of a zero coupon bond at time t and maturity T will be[5]
$$p(t,T) = \exp\left[A_t^T + {B_t^T}'\tilde{x}_t\right]$$
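In a manner similar to the Vasicek case, we proceed to obtain a process for the vector
of n yield changes given by
$$d\tilde{y}_t = \begin{bmatrix} dy(t,T_1) \\ dy(t,T_2) \\ \vdots \\ dy(t,T_n) \end{bmatrix}$$
We choose n yields because we have n − 1 factors and to create a risk-less portfolio
we will need n bonds and hence n yields of different maturities. Note that for the
yield with maturity $T_k$
$$y(t,T_k) = -\frac{1}{T_k - t}\log p(t,T_k) = a_t^{T_k} + {\tilde{b}_t^{T_k}}'\tilde{x}_t \tag{2.29}$$
where ${\tilde{b}_t^{T_k}}' = -{B_t^{T_k}}'/(T_k - t)$ is a row vector of length n − 1. Now
\begin{align}
dy(t,T_k) &= \frac{\partial y(t,T_k)}{\partial t}\,dt + \frac{\partial y(t,T_k)}{\partial \tilde{x}_t}\,d\tilde{x}_t + \frac{\partial^2 y(t,T_k)}{\partial \tilde{x}_t^2}\,(d\tilde{x}_t)^2 \tag{2.30}\\
&= {\tilde{b}_t^{T_k}}'\,d\tilde{x}_t \tag{2.31}\\
&= {\tilde{b}_t^{T_k}}'\kappa\left(\tilde{\theta} - \tilde{x}_t\right)dt + {\tilde{b}_t^{T_k}}' S\,d\tilde{W}_t \tag{2.32}\\
&= \mu_{y_k}\,dt + \tilde{\sigma}_{y_k}'\,d\tilde{W}_t \tag{2.33}
\end{align}
and so we can write
$$d\tilde{y}_t = \tilde{\mu}_y\,dt + S_y\,d\tilde{W}_t$$
where we have defined
$$\tilde{\mu}_y = \begin{bmatrix} \mu_{y_1} \\ \mu_{y_2} \\ \vdots \\ \mu_{y_n} \end{bmatrix}$$
as a column vector of length n and
$$S_y = \begin{bmatrix} \tilde{\sigma}_{y_1}' \\ \tilde{\sigma}_{y_2}' \\ \vdots \\ \tilde{\sigma}_{y_n}' \end{bmatrix}
= \begin{bmatrix} {\tilde{b}_t^{T_1}}' S \\ {\tilde{b}_t^{T_2}}' S \\ \vdots \\ {\tilde{b}_t^{T_n}}' S \end{bmatrix}$$
as a matrix of n rows and n − 1 columns.
As before, we set up a portfolio $\Pi_t$ at time t consisting of $w_k$ units of $T_k$ maturity
bonds, k = 1, 2, · · · , n. The change in the portfolio value over the time interval dt is
given by
$$d\Pi_t = \sum_{i=1}^{n} w_i\,dp(t,T_i) \tag{2.34}$$
Now note that
$$dp(t,T_i) = \frac{\partial p(t,T_i)}{\partial t}\,dt + \left[\mathrm{grad}\ p(t,T_i)\right]'d\tilde{y}_t + \frac{1}{2}\frac{\partial^2 p(t,T_i)}{\partial y(t,T_i)^2}\,{\tilde{b}_t^{T_i}}' S S'\,\tilde{b}_t^{T_i}\,dt \tag{2.35}$$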
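where
$$\left[\mathrm{grad}\ p(t,T_i)\right] = \begin{bmatrix} \dfrac{\partial p(t,T_i)}{\partial y(t,T_1)} \\ \vdots \\ \dfrac{\partial p(t,T_i)}{\partial y(t,T_n)} \end{bmatrix}$$
To calculate the weights, we set the $d\tilde{W}_t$ term in equation (2.34) to 0 and obtain the
optimal weights $w_i^*$ as solutions to the equations
$$\sum_{i=1}^{n} w_i\left[\mathrm{grad}\ p(t,T_i)\right]' S_y = 0 \tag{2.36}$$
Since the portfolio is now risk-less, we must have
$$d\Pi_t = \Pi_t r_t\,dt$$
which we re-write as
$$\tilde{w}^{*\prime}\left(c_1 + c_2 + c_3 + c_4\right) = 0 \tag{2.37}$$
where
$$\tilde{w}^{*\prime} = \left[w_1^*, \cdots, w_n^*\right] \tag{2.38}$$
$$c_1 = \begin{bmatrix} \dfrac{\partial p(t,T_1)}{\partial t}\,dt \\ \vdots \\ \dfrac{\partial p(t,T_n)}{\partial t}\,dt \end{bmatrix} \tag{2.39}$$
$$c_2 = \begin{bmatrix} \dfrac{\partial p(t,T_1)}{\partial y(t,T_1)}\,{\tilde{b}_t^{T_1}}'\kappa\left(\tilde{\theta} - \tilde{x}_t\right)dt \\ \vdots \\ \dfrac{\partial p(t,T_n)}{\partial y(t,T_n)}\,{\tilde{b}_t^{T_n}}'\kappa\left(\tilde{\theta} - \tilde{x}_t\right)dt \end{bmatrix}
= \begin{bmatrix} -(T_1 - t)\,p(t,T_1)\,{\tilde{b}_t^{T_1}}'\kappa\left(\tilde{\theta} - \tilde{x}_t\right)dt \\ \vdots \\ -(T_n - t)\,p(t,T_n)\,{\tilde{b}_t^{T_n}}'\kappa\left(\tilde{\theta} - \tilde{x}_t\right)dt \end{bmatrix} \tag{2.40}$$
$$c_3 = \begin{bmatrix} -p(t,T_1)\,r_t\,dt \\ \vdots \\ -p(t,T_n)\,r_t\,dt \end{bmatrix} \tag{2.41}$$
$$c_4 = \begin{bmatrix} \dfrac{1}{2}\dfrac{\partial^2 p(t,T_1)}{\partial y(t,T_1)^2}\,{\tilde{b}_t^{T_1}}' S S'\,\tilde{b}_t^{T_1}\,dt \\ \vdots \\ \dfrac{1}{2}\dfrac{\partial^2 p(t,T_n)}{\partial y(t,T_n)^2}\,{\tilde{b}_t^{T_n}}' S S'\,\tilde{b}_t^{T_n}\,dt \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{2}(T_1 - t)^2\,p(t,T_1)\left(\sigma_{y_1}\right)^2 dt \\ \vdots \\ \dfrac{1}{2}(T_n - t)^2\,p(t,T_n)\left(\sigma_{y_n}\right)^2 dt \end{bmatrix} \tag{2.42}$$
and we have defined
$$\left(\sigma_{y_i}\right)^2 = \tilde{\sigma}_{y_i}'\tilde{\sigma}_{y_i} = \left[\sigma_{1,y_i}, \sigma_{2,y_i}, \cdots, \sigma_{n-1,y_i}\right]\begin{bmatrix} \sigma_{1,y_i} \\ \sigma_{2,y_i} \\ \vdots \\ \sigma_{n-1,y_i} \end{bmatrix} \tag{2.43}$$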
Our intention here is to obtain an expression similar to equation (2.24) for the general
affine case. With that in mind, we proceed to simplify the $c_4$ term as follows. Note
that the volatilities $\sigma_{y_i}$ are the yield volatilities produced by the model and one can
express them as
$$\sigma_{y_i} = h_i\,\sigma_y \tag{2.44}$$
where
$$\sigma_y = \frac{1}{n}\sum_{i=1}^{n}\sigma_{y_i} \tag{2.45}$$
In equations (2.44) and (2.45) above we are alluding to the fact that the yield volatilities
$\sigma_{y_i}$ can be represented as a fraction $h_i$ of the average yield volatility $\sigma_y$. Crucially,
we now assume that over time what varies is the average yield volatility $\sigma_y$
but the fractions $h_i$ do not. In other words, what we are really articulating is that
the yield volatilities move in parallel and therefore the first principal component,
or the level, of the yield volatilities explains a majority of the variance in the yield
volatilities. Thus the term $\sigma_y$, which is nothing but the level of the yield volatilities,
accounts for almost all of the variability in the yield volatilities and the terms $h_i$ remain unchanged.
We can now re-write $c_4$ as
$$c_4 = \left(\sigma_y\right)^2\begin{bmatrix} \dfrac{1}{2}(T_1 - t)^2\,p(t,T_1)\,h_1^2\,dt \\ \vdots \\ \dfrac{1}{2}(T_n - t)^2\,p(t,T_n)\,h_n^2\,dt \end{bmatrix} \tag{2.46}$$
Finally, from equation (2.37), we have
$$\tilde{w}^{*\prime}\left(c_1 + c_2 + c_3\right) = -\tilde{w}^{*\prime}c_4 = -\left(\sigma_y\right)^2\tilde{w}^{*\prime}\begin{bmatrix} \dfrac{1}{2}(T_1 - t)^2\,p(t,T_1)\,h_1^2\,dt \\ \vdots \\ \dfrac{1}{2}(T_n - t)^2\,p(t,T_n)\,h_n^2\,dt \end{bmatrix} \tag{2.47}$$
and so
$$\left(\sigma_y\right)^2 = \frac{-\tilde{w}^{*\prime}\left(c_1 + c_2 + c_3\right)}{\tilde{w}^{*\prime}\begin{bmatrix} \dfrac{1}{2}(T_1 - t)^2\,p(t,T_1)\,h_1^2 \\ \vdots \\ \dfrac{1}{2}(T_n - t)^2\,p(t,T_n)\,h_n^2 \end{bmatrix}} = \frac{Y}{C} \tag{2.48}$$
Looking at equation (2.48), several comments are in order. Unlike the Vasicek model,
we now have the portfolio profitability depend on the yield volatilities. The convexity
contribution term C is, among other things, directly driven by the model implied
yield volatilities $\sigma_{y_k}$, as against the Vasicek case where, in equation (2.25), the
convexity contribution term C depends on the yield volatilities through the $b_t^{T_i}$’s. This
is an improvement but it is not without certain complications which we now discuss.
In an attempt to run this strategy one of the foremost tasks would be to calibrate
this multi-factor model to today’s yields. We might do so by fitting the κ and S
matrices to the best statistical estimate of the yield covariance matrix. This is a
plausible approach especially when one considers the Principal Component based
affine term structure model introduced by Rebonato et al.[20]¹. With the fitted κ
and S matrices, we can now use the remaining model parameters to fit to the shape
of today’s yield curve out to 10 years². Then the 10 to 30 year portion of the fitted
yield curve will tell us what the model thinks the convexity contribution to the yield
curve should be as compared to the market implied convexity contribution to the
yield curve. In case the fit to the 10 to 30 year portion of the yield curve is poor, we
might conclude that the model’s perception of the convexity contribution is different
to that of the market’s and this is exactly when we run into the question as to who
is right - the model or the market? Having seen that the fit to the 10 to 30 year
portion of the yield curve is poor, one might adjust the estimated κ and S matrices
to produce a better fit. However, this corrective action translates into model implied
yield volatilities being very different from the statistical yield covariance matrix. Since
different implied yield volatilities mean different implied convexity contribution, we
once again run into the question of who is right - the model or the market?
Thus we see that we have a battery of distinct estimation techniques available to
us to estimate a model implied convexity but none of those techniques shields
us from facing the question of whom to trust - the market or the model. This is
crucial for our strategy because we would like to have an unequivocal indicator of the
value of convexity and not be entangled in a motley of methodological abundance.
¹ The AFPC model introduced by Rebonato, Saroka and Putyatin [20] is known to recover very
well the best statistical estimate of the yield covariance matrix.
² We choose 10 years because convexity has very little effect on the shape of the yield curve at this
maturity and so the model’s view of convexity can be obtained from the higher maturity portion of
the fitted yield curve.
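The decomposition of the yield volatilities in equations (2.44) and (2.45) and the break-even relation (2.48) can be sketched as follows. All numerical inputs (yield volatilities, weights, prices, maturities) are hypothetical placeholders and the function names are illustrative; in particular, the numerator Y of (2.48) is taken as a given input rather than computed from a calibrated model.

```python
import math

def volatility_fractions(yield_vols):
    """Average yield volatility and the fractions h_i of (2.44)-(2.45)."""
    average = sum(yield_vols) / len(yield_vols)
    return average, [v / average for v in yield_vols]

def convexity_contribution(weights, prices, taus, fractions):
    """Denominator C of (2.48): sum_i w_i * 0.5 * tau_i^2 * p_i * h_i^2."""
    return sum(w * 0.5 * tau ** 2 * p * h ** 2
               for w, p, tau, h in zip(weights, prices, taus, fractions))

def break_even_vol(y_term, c_term):
    """Break-even average yield volatility sqrt(Y / C) implied by (2.48)."""
    ratio = y_term / c_term
    if ratio < 0:
        raise ValueError("Y and C must have the same sign for a real break-even volatility")
    return math.sqrt(ratio)

# Hypothetical inputs for a three-bond portfolio
yield_vols = [0.010, 0.009, 0.008]     # model implied yield volatilities
weights = [0.5, -1.0, 0.4]             # optimal weights w_i^*, assumed given
prices = [0.82, 0.67, 0.30]            # zero coupon bond prices
taus = [5.0, 10.0, 30.0]               # times to maturity T_i - t

avg_vol, h = volatility_fractions(yield_vols)
c_term = convexity_contribution(weights, prices, taus, h)

# If Y happened to equal C * avg_vol^2, the break-even volatility is avg_vol itself
y_term = c_term * avg_vol ** 2
print(f"h = {[round(x, 4) for x in h]}, C = {c_term:.4f}, "
      f"break-even = {break_even_vol(y_term, c_term):.5f}")
```

By construction the fractions average to one, so a single number, the average yield volatility, carries all of the time variation under the assumption stated above.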
Chapter 3
A Model Agnostic Approach
In this chapter we derive a model independent estimate of the fair value of portfolio
convexity. Doing so spares us the issues that arise with a model based estimate, as
discussed in the previous chapter, and will also prove extremely beneficial for
developing a trading strategy aimed at monetizing convexity.
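We start with a similar setup of $n-1$ factors $\tilde{x}_t$ with joint dynamics given by
$$d\tilde{x}_t = \kappa\left(\tilde{\theta} - \tilde{x}_t\right)dt + S\,d\tilde{W}_t \tag{3.1}$$
where $\kappa$, $S$ are square matrices of order $n-1$ and $E\left[dW_t^i\,dW_t^j\right] = \delta_{ij}\,dt$, $\delta_{ij}$ being the
Kronecker delta. Our intention here is to work with the bond prices p(t, T) directly
and not the yields y(t, T). Given that the short rate $r_t$ is of the form shown in
equation (2.28), the price of a zero coupon bond at time t and maturity T will be [5]
$$p(t,T) = \exp\left[A_t^T + {B_t^T}'\tilde{x}_t\right]$$
Further,
\begin{align}
dp(t,T) &= \frac{\partial p(t,T)}{\partial t}\,dt + \sum_{i=1}^{n-1}\frac{\partial p(t,T)}{\partial x_{i,t}}\,dx_{i,t} + \frac{1}{2}\sum_{i=1}^{n-1}\sum_{j=1}^{n-1}\frac{\partial^2 p(t,T)}{\partial x_{i,t}\,\partial x_{j,t}}\,dx_{i,t}\,dx_{j,t} \notag\\
&= \frac{\partial p(t,T)}{\partial t}\,dt + \left[\mathrm{grad}\ p(t,T)\right]'d\tilde{x}_t + \frac{1}{2}\mathrm{tr}\left[S' B_t^T {B_t^T}' S\right]p(t,T)\,dt \tag{3.2}
\end{align}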
We setup a portfolio Πt at time t consisting of wk units of Tk maturity bonds, k =
1, 2, · · · , n. The change in the portfolio value over the time interval dt is given by
$$d\Pi_t = \sum_{i=1}^{n} w_i\,dp(t,T_i)$$
We would like our portfolio to be first order immunized and to that effect one must
recover the weights $w_i$ so that the $d\tilde{W}_t$ terms in the above equation are all zero. In
other words, the optimal weights $w_i^*$ must satisfy
$$\sum_{i=1}^{n} w_i^*\left[\mathrm{grad}\ p(t,T_i)\right] = \tilde{0}$$
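Since the portfolio is now risk-less, we must have
$$d\Pi_t = \Pi_t r_t\,dt$$
and so,
$$\sum_{i=1}^{n} w_i^*\left(\frac{\partial p(t,T_i)}{\partial t}\,dt + \frac{1}{2}\mathrm{tr}\left[S' B_t^{T_i} {B_t^{T_i}}' S\right]p(t,T_i)\,dt\right) = r_t\sum_{i=1}^{n} w_i^*\,p(t,T_i)\,dt \tag{3.3}$$
recalling that the $w_i^*$ satisfy
$$\sum_{i=1}^{n} w_i^*\left[\mathrm{grad}\ p(t,T_i)\right] = \tilde{0}$$
Let $w_i = w_i^*\,p(t,T_i)$. Then, we can re-write equation (3.3) as
$$\sum_{i=1}^{n}\frac{w_i}{p(t,T_i)}\left(\frac{\partial p(t,T_i)}{\partial t}\,dt - r_t\,p(t,T_i)\,dt + \frac{1}{2}\mathrm{tr}\left[S' B_t^{T_i} {B_t^{T_i}}' S\right]p(t,T_i)\,dt\right) = 0 \tag{3.4}$$
If we define $D = \sum_{i=1}^{n} w_i\,B_t^{T_i}{B_t^{T_i}}'$, then we can express the portfolio convexity $C_P$ as
$$C_P = \frac{1}{2}\mathrm{tr}\left[S'DS\right] = -\sum_{i=1}^{n} w_i\left(\frac{1}{p(t,T_i)}\frac{\partial p(t,T_i)}{\partial t} - r_t\right) \tag{3.5}$$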
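Now note that the instantaneous forward rate at time t is given by
$$f_t^T = -\frac{\partial \log p(t,T)}{\partial T} = \frac{\partial \log p(t,T)}{\partial t} = \frac{1}{p(t,T)}\frac{\partial p(t,T)}{\partial t}$$
where we have assumed time homogeneity¹. Re-writing equation (3.5), we see that
$$C_P = \frac{1}{2}\mathrm{tr}\left[S'DS\right] = -\sum_{i=1}^{n} w_i\left(f_t^{T_i} - r_t\right) \tag{3.6}$$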
Equation (3.6) above is quite compelling. The expression on the left hand side,
$\frac{1}{2}\mathrm{tr}\left[S'DS\right]$, is purely a model dependent estimate of portfolio convexity. Let us call
it the theoretical portfolio convexity or $C_P^{th}$. It is essentially what the model thinks
the portfolio convexity should be. The expression on the right hand side, on the other
hand, is a model independent quantity that can be read off from the shape of today’s
yield curve, save the weights $w_i$. Before we dive deep into the implications of this
equation, let us look closely at the expression on the right, $-\sum_{i=1}^{n} w_i\left(f_t^{T_i} - r_t\right)$, and see
if we can glean a deeper intuition.
¹ In other words, if the system is time homogeneous then the maturity T and calendar time t
always appear as τ = T − t.
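Recall that given equation (2.28), we have
$$p(t,T) = \exp\left[A_t^T + {B_t^T}'\tilde{x}_t\right]$$
and so
$$\frac{1}{p(t,T)}\frac{\partial p(t,T)}{\partial t} = \frac{\partial A_t^T}{\partial t} + \frac{\partial {B_t^T}'}{\partial t}\tilde{x}_t \tag{3.7}$$
Also,
\begin{align}
-(T-t)\,y(t,T) &= \log p(t,T) \notag\\
\Rightarrow\ y(t,T) - (T-t)\frac{\partial y(t,T)}{\partial t} &= \frac{\partial A_t^T}{\partial t} + \frac{\partial {B_t^T}'}{\partial t}\tilde{x}_t \notag
\end{align}
and, further assuming time homogeneity, we have
$$y(t,T) + (T-t)\frac{\partial y(t,T)}{\partial T} = \frac{1}{p(t,T)}\frac{\partial p(t,T)}{\partial t} \tag{3.8}$$
Thus using equation (3.8), we can re-write equation (3.5) as
$$C_P^{th} = \frac{1}{2}\mathrm{tr}\left[S'DS\right] = -\sum_{i=1}^{n} w_i\left[y(t,T_i) - r_t + (T_i - t)\frac{\partial y(t,T_i)}{\partial T}\right] \tag{3.9}$$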
The quantity $y(t,T_i) - r_t$ is commonly known as bond carry in the trader’s jargon. It
is simply the excess yield available over and above the funding cost $r_t$ and portfolio
carry is nothing but $\sum_{i=1}^{n} w_i\left[y(t,T_i) - r_t\right]$. Similarly, the quantity $(T_i - t)\frac{\partial y(t,T_i)}{\partial T}$ is
called roll-down, which is the change in yield due to a small change in maturity multiplied
by the duration of a zero coupon bond. Thus, we can re-write equation (3.9) as
$$C_P^{th} = \frac{1}{2}\mathrm{tr}\left[S'DS\right] = -\left[\text{Carry} + \text{Roll-down}\right] \tag{3.10}$$
where
$$\text{Carry} = \sum_{i=1}^{n} w_i\left[y(t,T_i) - r_t\right]$$
$$\text{Roll-down} = \sum_{i=1}^{n} w_i\,(T_i - t)\frac{\partial y(t,T_i)}{\partial T}$$
Let us now discuss equation (3.10) and understand its implications. The expression on the left-hand side, $\frac{1}{2}\mathrm{tr}[S'DS]$, is, as discussed earlier, a model dependent quantity. It gives the model's view of what the portfolio convexity should be. Crucially, this quantity depends on what the bonds will do in the next time step dt. A straightforward way to estimate this quantity would be to calibrate the model to the yield covariance matrix and then compute $\frac{1}{2}\mathrm{tr}[S'DS]$ from the calibrated model. Quite interestingly, equation (3.10) declares that this model dependent quantity must equal the model independent quantity − [Carry + Roll-down], which can be easily read off from today's yield curve, apart from the optimal portfolio weights. Thus equation (3.10) really provides an almost model agnostic estimate of portfolio convexity that we had originally set out to explore. Further, consider any time homogeneous model. If this model is calibrated to the shape of the yield curve² then it will produce the same value of the theoretical portfolio convexity $C_P^{th}$. The crucial assumption behind this impressive result of equation (3.10) is, however, that of time homogeneity. We have assumed that all our equations depend on the time to maturity (T − t) alone and not on the calendar time t. Given the exceptional times that we dwell in, for example the crisis of September 2008, this assumption might seem untenable but one must note that such forward looking beliefs are limited only to the near future and for shorter maturities where the effect of convexity is hardly significant.
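To make the right-hand side of equation (3.10) concrete, the following is a minimal Python sketch that evaluates Carry, Roll-down and the implied theoretical convexity for a three-bond portfolio. The function name, the argument names and all numerical inputs are hypothetical and chosen purely for illustration; in particular the curve slopes ∂y/∂T are assumed to be supplied by the user rather than derived from a fitted curve.

```python
import numpy as np

def carry_and_roll_down(weights, times_to_maturity, yields, curve_slopes, funding_rate):
    """Evaluate Carry, Roll-down and -(Carry + Roll-down) as in equation (3.10).

    weights            : portfolio weights w_i
    times_to_maturity  : T_i - t in years
    yields             : zero coupon yields y(t, T_i)
    curve_slopes       : slopes dy/dT of the yield curve at each T_i (assumed given)
    funding_rate       : short funding rate r_t
    """
    w = np.asarray(weights, dtype=float)
    tau = np.asarray(times_to_maturity, dtype=float)
    y = np.asarray(yields, dtype=float)
    slope = np.asarray(curve_slopes, dtype=float)
    carry = float(np.sum(w * (y - funding_rate)))
    roll_down = float(np.sum(w * tau * slope))
    return carry, roll_down, -(carry + roll_down)

# Hypothetical inputs, for illustration only
carry, roll_down, c_th = carry_and_roll_down(
    weights=[1.0, -2.0, 1.0],
    times_to_maturity=[10.0, 20.0, 30.0],
    yields=[0.030, 0.035, 0.036],
    curve_slopes=[0.0008, 0.0003, 0.0001],
    funding_rate=0.01,
)
print(carry, roll_down, c_th)
```

With these made-up numbers both Carry and Roll-down are negative, so the implied theoretical convexity is positive: the portfolio gives up yield and is compensated through convexity.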
3.1 The Weights
We notice that in the expression − [Carry + Roll-down], the optimal portfolio weights
are not really model independent and hence this expression is not completely model
agnostic. In this section, we present an estimation technique to calculate the optimal
weights.
Let us go back to our factor model in equation (3.1) and choose the n− 1 factors
as yield principal components. Then we have,
$$d\tilde{y}_t = V' d\tilde{x}_t \qquad (3.11)$$
where V is the matrix of eigenvectors of the yield covariance matrix. Now, recall that
in the Vasicek case we had constructed a portfolio with two bonds and had obtained
the optimal weight w2 as
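$$w_2 = -\frac{\partial p(t, T_1)}{\partial y(t, T_1)} \frac{\partial y(t, T_2)}{\partial p(t, T_2)} \frac{\sigma_{y_t}^{T_1}}{\sigma_{y_t}^{T_2}}$$
which can be re-written as
$$w_2 = -\frac{\dfrac{\partial p(t, T_1)}{\partial y(t, T_1)} \sigma_{y_t}^{T_1}}{\dfrac{\partial p(t, T_2)}{\partial y(t, T_2)} \sigma_{y_t}^{T_2}} = -\frac{p(t, T_1)(T_1 - t)\, \sigma_{y_t}^{T_1}}{p(t, T_2)(T_2 - t)\, \sigma_{y_t}^{T_2}}$$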
² It is crucial that the fit to the yield curve is perfect at the long end.
But $\sigma_{y_t}^{T_i} = b_t^{T_i} \sigma_r$, $b_t^{T_i} = B_t^{T_i} / (T_i - t)$ and
$$\frac{\partial p(t, T_i)}{\partial r} = B_t^{T_i}\, p(t, T_i)$$
Thus, we have
$$w_2 = -\frac{\partial p(t, T_1) / \partial r}{\partial p(t, T_2) / \partial r} \qquad (3.12)$$
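For readers who want the intermediate step, the cancellation can be spelled out using only the three relations stated just before the result. This is a worked restatement, not an additional assumption:

```latex
w_2
= -\frac{p(t,T_1)\,(T_1-t)\,b_t^{T_1}\,\sigma_r}{p(t,T_2)\,(T_2-t)\,b_t^{T_2}\,\sigma_r}
= -\frac{p(t,T_1)\,B_t^{T_1}}{p(t,T_2)\,B_t^{T_2}}
= -\frac{\partial p(t,T_1)/\partial r}{\partial p(t,T_2)/\partial r}
```

The first equality substitutes $\sigma_{y_t}^{T_i} = b_t^{T_i}\sigma_r$, the second uses $b_t^{T_i}(T_i - t) = B_t^{T_i}$ and cancels the common factor $\sigma_r$, and the third uses $\partial p(t,T_i)/\partial r = B_t^{T_i}\, p(t,T_i)$.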
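Notice how the volatility factor has disappeared from equation (3.12). In the multi-factor case, the expression $\frac{\partial p(t, T_i)}{\partial r}$ generalizes to $\frac{\partial p(t, T_i)}{\partial x_k}$ and using equation (3.11), we know
$$\frac{\partial p(t, T_i)}{\partial x_k} = \frac{\partial p(t, T_i)}{\partial y(t, T_i)} \frac{\partial y(t, T_i)}{\partial x_k} = -(T_i - t)\, p(t, T_i)\, (V)'_{ik} \qquad (3.13)$$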
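In a multi-factor affine setup, we have already obtained that the optimal weights $w_i$ must satisfy
$$\sum_{i=1}^{n} w_i \frac{1}{p(t, T_i)} \left[ \mathrm{grad}\; p(t, T_i) \right] = \tilde{0}$$
which can be written as
$$\begin{pmatrix} w_1 \tau_1 V'_{11} \\ w_1 \tau_1 V'_{21} \\ \vdots \\ w_1 \tau_1 V'_{n-1,1} \end{pmatrix} + \begin{pmatrix} w_2 \tau_2 V'_{12} \\ w_2 \tau_2 V'_{22} \\ \vdots \\ w_2 \tau_2 V'_{n-1,2} \end{pmatrix} + \cdots + \begin{pmatrix} w_n \tau_n V'_{1,n} \\ w_n \tau_n V'_{2,n} \\ \vdots \\ w_n \tau_n V'_{n-1,n} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \qquad (3.14)$$
where $\tau_i = T_i - t$. Assuming $w_1 = 1$, we can write equation (3.14) in matrix form as
$$G w = g \qquad (3.15)$$
and obtain the optimal weights w as
$$w = G^{-1} g \qquad (3.16)$$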
Quite importantly, this technique of estimating the optimal portfolio weights is still model dependent. Further, since it involves matrix inversion, one can run into numerical instability when the matrix G is ill-conditioned. One of the approaches to deal with such ill-conditioned matrices would be to employ Singular Value Decomposition (SVD) to estimate the optimal weights. In Chapter 5.1, however, we present another methodology to estimate the portfolio weights which will be truly model agnostic and it shall be the one that we finally implement while executing our trading strategy.
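As an illustration of the SVD route mentioned above, the sketch below solves equation (3.14) for the remaining weights after fixing $w_1 = 1$. It is only one possible reading: the function name, the layout of the input array (rows are factors k, columns are bonds i, holding the entries $V'_{k,i}$) and the numbers are assumptions made for this example. It uses NumPy's `numpy.linalg.lstsq`, which relies on an SVD-based solver and discards singular values below the `rcond` threshold.

```python
import numpy as np

def duration_neutral_weights(V_t, tau, rcond=1e-10):
    """Solve equation (3.14) for w_2..w_n with w_1 fixed at 1.

    V_t : (n-1, n) array of the entries V'_{k,i} (factor k, bond i); assumed layout.
    tau : length-n array of times to maturity tau_i = T_i - t.
    """
    V_t = np.asarray(V_t, dtype=float)
    tau = np.asarray(tau, dtype=float)
    M = V_t * tau          # column i is scaled by tau_i
    G = M[:, 1:]           # coefficients of the unknown weights w_2..w_n
    g = -M[:, 0]           # known term moved to the right-hand side (w_1 = 1)
    # SVD-based least squares; small singular values are truncated via rcond
    w_rest, *_ = np.linalg.lstsq(G, g, rcond=rcond)
    return np.concatenate(([1.0], w_rest))

# Hypothetical factor loadings for a 3-bond, 2-factor example
V_t = np.array([[0.5, 0.6, 0.62],
                [0.7, 0.1, -0.7]])
tau = np.array([10.0, 20.0, 30.0])
w = duration_neutral_weights(V_t, tau)
residual = (V_t * tau) @ w   # should vanish if equation (3.14) holds
print(w, residual)
```

When G is well conditioned this agrees with the direct inverse in equation (3.16); when it is not, the truncation returns the minimum-norm least-squares solution instead of amplifying noise.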
Chapter 4
The Trading Strategy
In this section, we present a trading strategy that exploits the model agnostic rep-
resentation of portfolio convexity that was derived in equation (3.10). Through this
trading strategy, we seek to exploit those instances in time when Convexity has not
been fairly priced. Using data on the US Zero Coupon yield from 1987-2014, we show
that there have been several occasions of prolonged inconsistencies in the pricing of
Convexity and on these occasions our strategy has been able to exploit such incon-
sistencies.
Our approach in the previous section has allowed us to construct a duration-
neutral portfolio by suitably estimating the optimal weights wi’s for the n bonds.
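Thereafter, we were able to represent the theoretical portfolio convexity $C_P^{th}$ as
$$C_P^{th} = \frac{1}{2} \mathrm{tr}\left[S' D S\right] = -\left[ \text{Carry} + \text{Roll-down} \right]$$
where
$$\text{Carry} = \sum_{i=1}^{n} w_i \left[ y(t, T_i) - r_t \right]$$
$$\text{Roll-down} = \sum_{i=1}^{n} w_i (T_i - t) \frac{\partial y(t, T_i)}{\partial T}$$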
The left hand side of this equation is the theoretical portfolio convexity, which is
nothing but the model’s view of what the portfolio convexity should be. The right
hand side, on the other hand, is almost model independent and can be read off from
today’s yield curve. Let us for a moment re-visit the portfolio convexity. We know
that convexity is additive and if we could compute the convexity of every bond in the
portfolio, say Convi, then we can easily calculate the portfolio convexity CP from
equation (2.9) as
$$C_P = \sum_{i=1}^{n} w_i\, \mathrm{Conv}_i = \frac{1}{2} \sum_{i=1}^{n} w_i T_i^2 \sigma_i^2 \qquad (4.1)$$
The σi’s in equation (4.1) are the yield volatilities. One way to estimate them is
to employ a calibrated model but a more beneficial alternative would be to use a
statistical estimation technique that not only estimates the σi’s but also provides
the best estimate of the yield volatilities over the next time step dt. This is crucial
because an estimate of the future yield volatility over the next time step allows us to
generate an estimate of the future realized convexity over the next time step. One
can then compare this estimated realized convexity with the theoretical convexity
CPth that relies only on today’s yield curve. A gap in these two quantities is the all
important trading signal that we want our strategy to pick up.
More precisely, let $\sigma_i$ denote the best time t estimate of the ith yield volatility over the next time step dt. Then, an estimate of future realized convexity over the time step dt is given by,
$$C_P^{pred} = \frac{1}{2} \sum_{i=1}^{n} w_i T_i^2 \sigma_i^2 \qquad (4.2)$$
The trading strategy then consists of defining a trading signal S as
$$S = C_P^{pred} - C_P^{th} = \frac{1}{2} \sum_{i=1}^{n} w_i T_i^2 \sigma_i^2 + \left[ \text{Carry} + \text{Roll-down} \right] \qquad (4.3)$$
When S > 0, we expect the future realized portfolio convexity to be bigger than
today’s portfolio convexity and thus we buy a duration hedged long convexity portfolio
at time t. Similarly, when S < 0 the future realized portfolio convexity is expected
to be smaller than today’s portfolio convexity and so we sell a duration hedged long
convexity portfolio at time t. At time t+dt, we execute two actions. First, we unwind
our position and calculate the portfolio profit and loss (P&L) as the difference between
the portfolio values at time t and t+ dt over and above the funding cost rt. Second,
at the same time t + dt, we repeat the entire process by obtaining an estimate of
the future realized convexity over the next trading period dt, checking the trading
signal S and then entering into a trade based on the sign of the signal at time t+ dt.
We repeat our strategy over several trading intervals and hope to generate a positive
P&L every time the strategy encounters a clear signal.
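The decision rule described above can be sketched in a few lines of Python. This is an illustrative reading of equations (4.2) and (4.3) only: the function names and every numerical input are hypothetical, and treating S = 0 as "no trade" is an assumption, since the text does not discuss that case.

```python
import numpy as np

def trading_signal(weights, maturities, sigma_pred, c_th):
    """S = C_P^pred - C_P^th, with C_P^pred from equation (4.2)."""
    w = np.asarray(weights, dtype=float)
    T = np.asarray(maturities, dtype=float)
    s = np.asarray(sigma_pred, dtype=float)
    c_pred = 0.5 * float(np.sum(w * T**2 * s**2))
    return c_pred - c_th

def position(signal):
    """+1: buy the duration hedged long convexity portfolio,
    -1: sell it, 0: no trade (assumption for the S = 0 case)."""
    return int(np.sign(signal))

# Hypothetical inputs, for illustration only
S = trading_signal(
    weights=[1.0, -2.0, 1.0],
    maturities=[10.0, 20.0, 30.0],
    sigma_pred=[0.008, 0.0075, 0.007],
    c_th=0.005,
)
side = position(S)
print(S, side)
```

In this made-up case the predicted convexity falls short of the theoretical one, so the signal is negative and the rule sells the long convexity portfolio.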
It is important to note, however, that our trading strategy relies on several critical
components and it is, therefore, instructive to understand the factors that influence
the profitability of our strategy. Recall that convexity is a second order effect and to
capture this second order effect, the optimal portfolio weights must provide a complete
first order immunization to our portfolio. Thus, the estimation of the portfolio weights $w_i$ forms the crucial first step, as any residual first order exposure will have a bearing on our portfolio P&L. The estimated yield volatilities $\sigma_i$ form the other important
input to the trading strategy. These estimated volatilities are essentially our best
guesses of the future yield volatilities over the trading period dt and a significant
over or under estimation of the yield volatilities will have a negative impact on the
portfolio P&L. We discuss the estimation of the optimal portfolio weights and future
yield volatilities in Chapter 5.
Chapter 5
Strategy Implementation
In this section we discuss the steps that have been implemented to back-test our strategy on the daily historical US Zero Coupon yields from 1987-2014.¹ The daily yields and the forward rate information used in this strategy are an outcome of the seminal paper by Gurkaynak et al. (2006)[10]. We direct the interested readers to their paper[10] which describes in detail the estimation methodology adopted by the authors to derive the zero coupon yields and the forward rates across maturities. The funding cost $r_t$ is another important ingredient for our strategy and we use the daily historical US Federal Funds Effective Rate² as a proxy for the funding cost.
We implement the strategy on three typical bond portfolios:
1. a portfolio made up of 10 year, 20 year and 30 year zero coupon bonds. (Portfolio
10/20/30)
2. a portfolio made up of 5 year, 10 year and 30 year zero coupon bonds. (Portfolio
5/10/30)
3. a portfolio made up of 5 year, 15 year and 30 year zero coupon bonds. (Portfolio
5/15/30)
These stylized portfolios have the key advantage that they deliver a ‘yield give-up’ and ‘convexity pick-up’ or a ‘yield pick-up’ and ‘convexity give-up’ depending on the trading signal S. For example, if the trading signal S is positive, the trader can enter into a duration hedged long convexity portfolio by being long on the 10 and 30 year end of the yield curve and being short on the 20 year, say. Such a portfolio will have a higher convexity (‘convexity pick-up’) but a lower yield than the 20 year (‘yield give-up’).
¹ The daily data on yields and forward rates were downloaded from the US Federal Reserve Board website.
² The daily data on the Federal Funds Effective Rate were downloaded from US Federal Reserve Board Selected Interest Rates (Daily) - H.15.
While back-testing the strategy, we traded each portfolio in intervals of 5 business
days. In other words, we would create the duration-neutral portfolio at time t based
on the trading signal S at time t, hold the portfolio over 5 business days, unwind
our position at t + 5 and calculate our portfolio P&L. At t + 5, we would repeat
the same process. In Chapter 5.1 we discuss the methodology employed to estimate
the duration-neutral portfolio weights and in Chapter 5.2 we discuss the volatility
estimation procedure employed to obtain the predicted yield volatilities $\sigma_i$ over the
trading interval dt.
5.1 Estimation of Weights
In Chapter 3.1, we presented a methodology to estimate the optimal weights that render the portfolio duration neutral. This technique of estimating the weights assumes a factor model for the yields and is indeed model dependent, impairing the otherwise model agnostic representation of the theoretical portfolio convexity in equation (3.6)
$$C_P^{th} = \frac{1}{2} \mathrm{tr}\left[S' D S\right] = -\sum_{i=1}^{n} w_i \left( f_t^{T_i} - r_t \right)$$
In this section, we present a slightly different approach to estimate the optimal portfolio weights which does not rely on a specific model for the yields. In this sense, this method of estimating the weights is model agnostic.
Note that a duration neutral portfolio is immunized only against parallel shifts in the yield curve, but not against non-parallel shifts. Thus, duration neutrality can expose the portfolio to a considerable amount of risk from non-parallel movements of the yield curve [8, 13, 1]. One of the techniques widely used to guard against non-parallel shifts in the yield curve is to employ portfolio theory and determine those portfolio weights as optimal which minimize the portfolio variance.
Originally, Ederington (1979)[6] and Johnson (1960)[14] had used this technique to
derive the minimum variance hedge ratio (HR) as the average relationship between
the changes in the cash price and the changes in the futures price which minimizes the
net price change risk, where net price change risk is the variance of the price changes
of the hedged position[3]. In our setup, we adopt their approach and estimate the
portfolio weights as those which minimize the portfolio variance of the bond price
changes. Formally, let there be N zero coupon bonds in the portfolio with maturity $T_i$, $i = 1, 2, \ldots, N$. Let $r_i^t$ denote the return on the bond i from time t − 1 to t. Further, let Σ be the N × N return covariance matrix. Then the minimum variance portfolio is the bond portfolio with the lowest return variance and is the solution to the minimization problem,
$$\min_{\tilde{w} = (w_1, w_2, \cdots, w_N)} \; \tilde{w}' \Sigma \tilde{w} \quad \text{such that } \tilde{w} > 0 \qquad (5.1)$$
Memmel and Kempf [17] go on to show that the optimal weights obtained from the above minimization are equivalent to the estimated regression coefficients in the following regression model
$$r_N^t = \alpha + \beta_1 r_1^t + \beta_2 r_2^t + \cdots + \beta_{N-1} r_{N-1}^t + \epsilon_t \qquad (5.2)$$
where $\epsilon_t$ satisfies the assumptions of a classical linear regression model. The $\beta_i$'s are estimated using the ordinary least squares technique and we have $w_i = \beta_i$, $i = 1, 2, \cdots, N - 1$ and $w_N = 1$.
We exploit the regression setting in equation (5.2) to calculate the optimal port-
folio weights for our problem through the following steps.
1. Define the return on the bond i from time t − 1 to t as
$$r_i^t = -T_i \left[ y(t-1, T_i) - y(t, T_i) \right], \quad i = 1, 2, 3 \qquad (5.3)$$
Calculate $r_i^t$ for t = 1, 2, ⋯, 300, that is, for a period of 300 trading days starting from time t and going back. Notice that $T_i \left[ y(t-1, T_i) - y(t, T_i) \right]$ is a proxy for a clean price change of a bond. Also, recall that the three bond portfolios introduced under the trading strategy will each hold 3 zero coupon bonds with the 30 year zero coupon bond being common across all the three portfolios.
2. Let $r_3^t$ denote the return on the bond with maturity 30 years. Estimate the regression coefficients $\beta_i$, i = 1, 2 under the model
$$r_3^t = \beta_1 r_1^t + \beta_2 r_2^t + \epsilon_t \qquad (5.4)$$
3. The estimated optimal portfolio weights are then given by $w_1 = -\beta_1$, $w_2 = -\beta_2$ and $w_3 = 1$ where $w_3$ is the optimal weight for the 30 year zero coupon bond.
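The three steps above can be sketched in Python as follows. This is not the MATLAB code from the appendix; it is a minimal illustration under stated assumptions: the function name and the input layout (one row per day, one column per bond, 30 year bond last) are hypothetical, and the demonstration data are synthetic, built so that the 30 year return is an exact linear combination of the other two.

```python
import numpy as np

def min_variance_weights(yields, maturities, window=300):
    """Regression-based weights following equations (5.3) and (5.4).

    yields     : (n_days, 3) array of daily zero coupon yields, oldest first,
                 with the 30 year bond in the last column (assumed layout).
    maturities : the three maturities T_i in years.
    """
    y = np.asarray(yields, dtype=float)[-(window + 1):]   # window returns need window+1 yields
    T = np.asarray(maturities, dtype=float)
    # Equation (5.3): r_i^t = -T_i [y(t-1, T_i) - y(t, T_i)]
    r = -T * (y[:-1] - y[1:])
    X, target = r[:, :2], r[:, 2]
    # Ordinary least squares without intercept, as in equation (5.4)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return np.array([-beta[0], -beta[1], 1.0])

# Synthetic yields for illustration: r_3 = 0.5 r_1 + 0.8 r_2 by construction
rng = np.random.default_rng(0)
d1 = rng.normal(0.0, 5e-4, 400)
d2 = rng.normal(0.0, 5e-4, 400)
d3 = (0.5 * 10 * d1 + 0.8 * 20 * d2) / 30
demo_yields = 0.05 + np.cumsum(np.column_stack([d1, d2, d3]), axis=0)
w = min_variance_weights(demo_yields, [10, 20, 30])
print(w)
```

On real data the fit is of course not exact, and the regression residual is the part of the 30 year move that the two shorter bonds fail to hedge.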
Appendix 1 provides a MATLAB implementation of the estimation procedure discussed above. It is worthwhile to note that the above process of estimating the optimal weights assumes that the yields exhibit a Gaussian behavior. However, when the rates are high it can be argued that the yields exhibit a more Log-Normal behavior. In such scenarios, one may calculate the return on the bond i as
$$r_i^t = -T_i \frac{y(t-1, T_i) - y(t, T_i)}{y(t-1, T_i)}, \quad i = 1, 2, 3 \qquad (5.5)$$
and follow essentially the same procedure to estimate the optimal weights. We do not pursue this line of thought in this essay and leave the above implementation for future research.
5.2 Estimation of Volatilities
As discussed in Chapter 4, the estimated yield volatilities σi’s form the other impor-
tant input to the trading strategy. A straightforward procedure that can be employed
to estimate these volatilities is to use a fixed length window, say of 300 trading days,
and calculate the standard deviation of the daily yield changes. If we use a fixed
length window, we could end up with either a noisy, highly responsive estimator if
we use a smaller window or with a stable, slow to respond estimator if we use a long
window. However, recall that we are interested in obtaining a best estimate of future
volatility and ex ante we do not have a way to predict the unexpected; but we would
like our volatility estimation process to know immediately that future volatilities are
likely to be higher in case we have entered a regime of unexpected volatility. A long
fixed length window will not be able to provide such immediate update of volatilities.
Further, if we do underestimate the future volatility and remain short Convexity per
our trading strategy, we will make a severe loss.
In this study, we have used a GARCH(1, 1)[9] model to estimate the future volatil-
ities. Our decision to use a conditional heteroscedastic model is mainly driven by the
observed autocorrelations in squared returns and volatility clustering in financial time
series data. Possibilities exist to use other variants of the GARCH model like the
EGARCH model of Nelson (1991)[19] but we have embraced simplicity over model
richness. In what follows, we provide the general setup of our GARCH(1, 1) model.
For a zero coupon bond with maturity $T_i$, define the return series as
$$x_i^t = y(t, T_i) - y(t-1, T_i), \quad T_i = 5, 10, 15, 20 \text{ and } 30 \qquad (5.6)$$
Let the mean equation of the time series $x_i^t$ be given by an ARMA(m, n) process of autoregressive order m and moving average order n as
$$x_i^t = \mu + \sum_{r=1}^{m} a_r x_i^{t-r} + \sum_{s=1}^{n} b_s \varepsilon_{t-s} + \varepsilon_t \qquad (5.7)$$
where µ is the unconditional mean, $a_r$'s are the auto-regressive coefficients, $b_s$'s are the moving average coefficients and the innovation terms $\varepsilon_t$'s are uncorrelated with zero mean. The variance equation of the GARCH(1, 1) model can be expressed as,
$$\varepsilon_t = \sigma_{t,i}\, \eta_t \qquad (5.8)$$
$$\sigma_{t,i}^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1,i}^2 \qquad (5.9)$$
where $\sigma_{t,i}$ is the time t standard deviation of the yield changes with maturity $T_i$, $\eta_t$ is an iid process with zero mean and unit variance and ω, α, β are the unknown parameters that need to be estimated. Note that for second order stationarity of a GARCH(1, 1) model, we must have ω > 0, α > 0, β > 0 and α + β < 1. It is also worthwhile to observe that although the innovation terms $\varepsilon_t$ are serially uncorrelated, their conditional variance equals $\sigma_{t,i}^2$ and hence varies with time. We refer the reader to Francq et al. (2010)[9] for a detailed discussion on the statistical properties of a GARCH(1, 1) model.
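The variance recursion in equation (5.9) is simple enough to write out directly. The sketch below is illustrative only: the function name and the parameter values are hypothetical, and starting the recursion at the unconditional variance ω / (1 − α − β) is an assumption made here for convenience, not something the text prescribes.

```python
import numpy as np

def garch11_variance(eps, omega, alpha, beta):
    """Conditional variances implied by equation (5.9) for given innovations.

    Returns an array of length len(eps) + 1 whose entry k+1 is the variance
    for the step following innovation eps[k].
    """
    eps = np.asarray(eps, dtype=float)
    sigma2 = np.empty(len(eps) + 1)
    # Assumption: initialise at the unconditional variance (needs alpha + beta < 1)
    sigma2[0] = omega / (1.0 - alpha - beta)
    for k in range(len(eps)):
        sigma2[k + 1] = omega + alpha * eps[k] ** 2 + beta * sigma2[k]
    return sigma2

# Hypothetical parameters and innovations, for illustration only
path = garch11_variance([1.0, 2.0], omega=0.1, alpha=0.1, beta=0.8)
print(path)
```

A large innovation raises the next period's variance immediately, which is the responsiveness a long fixed window lacks.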
Although we have specified a GARCH(1, 1) model for the variance equation, we still need to determine the appropriate autoregressive and moving average orders m and n for the ARMA mean equation before we can proceed with parameter estimation. Our approach to determine suitable values for m and n relies on the plots of autocorrelation (ACF) and partial autocorrelation functions (PACF) of the series $x_i^t$. Recall that for a moving average process, MA(n)
$$z_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_n \varepsilon_{t-n}, \quad \varepsilon_t \sim IID(0, \sigma^2)$$
the ACF of lag h is given by
$$\mathrm{Corr}(z_{t+h}, z_t) = \rho(h) = \begin{cases} \dfrac{\sum_{j=0}^{n-h} \theta_j \theta_{j+h}}{\sum_{j=0}^{n} \theta_j^2}, & 0 \le h \le n \\ 0, & h > n \end{cases} \qquad (5.10)$$
with the convention $\theta_0 = 1$.
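Equation (5.10) can be evaluated numerically to see the sharp cutoff that the identification rule relies on. The following is a small sketch with a hypothetical function name; it takes the coefficients θ_1, ..., θ_n and prepends θ_0 = 1.

```python
import numpy as np

def ma_acf(theta, max_lag):
    """Theoretical ACF of an MA(n) process from equation (5.10)."""
    th = np.concatenate(([1.0], np.asarray(theta, dtype=float)))  # theta_0 = 1
    n = len(th) - 1
    denom = np.sum(th ** 2)
    rho = np.zeros(max_lag + 1)
    for h in range(min(n, max_lag) + 1):
        # sum over j = 0..n-h of theta_j * theta_{j+h}
        rho[h] = np.sum(th[: n - h + 1] * th[h:]) / denom
    return rho

# MA(1) with a hypothetical coefficient theta_1 = 0.5
acf = ma_acf([0.5], max_lag=3)
print(acf)
```

For an MA(1) process the autocorrelation is non-zero at lag 1 only and exactly zero from lag 2 onwards, which is the pattern one looks for in the sample ACF plot.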
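Similarly, for an autoregressive process, AR(m)
$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_m z_{t-m} + \varepsilon_t, \quad \varepsilon_t \sim IID(0, \sigma^2)$$
the PACF of lag h is given by
$$\phi_{h,h} = \mathrm{Corr}\left( z_{t+h}, z_t \mid z_{t+h-1}, \cdots, z_{t+1} \right) = 0, \quad \text{if } h > m \qquad (5.11)$$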
since $\varepsilon_{t+h}$ is uncorrelated with $z_{t+h-1}, \cdots, z_{t+1}$ and $z_t$. Thus, we can use equations (5.10) and (5.11) to determine the appropriate values of m and n from the plots of ACF and PACF. In particular, if the ACF plot of $x_i^t$ displays a sharp cutoff after lag h then we will consider the mean equation to be an MA(h) process. If the PACF plot displays a sharp cutoff after lag h then we will consider the mean equation to be an AR(h) process. In this relatively straightforward rule for determining the appropriate values of m and n, we exclude the possibility of incorporating both AR and MA terms in the mean equation. This is done primarily to ensure a simple setup for the volatility estimation procedure, one which is credible yet parsimonious.³
The following figures present the ACF and PACF plots of the 10 year zero coupon bond at t = 9th Feb 1987 which is the start date that we use for back-testing our strategy.

³ We realize that restricting the mean equation to either an AR or MA process could be considered as a model weakness. However, our aim here is to study the performance of the trading strategy and not dive into the depths of methodological riches. We further show that this procedure of estimating the volatility works appreciably well as far as our portfolio P&L is concerned.
Figure 5.1: ACF plot for 10 year bond at t = 9th Feb 1987
Figure 5.2: PACF plot for 10 year bond at t = 9th Feb 1987
The ACF plot shows a sharp cutoff after lag 1 whereas we see no such evidence in
the PACF plot. This behavior of the ACF and PACF plots for the 10 year bonds
is observed across all the trading days that are available in our data, starting from
t = 9th Feb 1987 to t = 3rd Oct 2014. Thus, we use an MA(1) mean equation for the
10 year zero coupon bonds given by,
$$x_{10}^t = \mu + b_1 \varepsilon_{t-1} + \varepsilon_t, \quad \forall t \qquad (5.12)$$
where µ is the unconditional mean, b1 is the moving average coefficient and the
innovation terms εt’s are uncorrelated with zero mean. The variance equation of the
10 year bonds continues to be GARCH(1, 1) as expressed in equation (5.8). The
ACF and PACF plots of 5 and 15 year yields show a similar behavior with the ACF
plot producing a sharp cutoff after lag 1 and the PACF plot being hardly significant
after lag 0. Hence, much like the 10 year yields, we opt for an MA(1) mean equation
for the 5 and 15 year zero coupon bonds. The 20 and 30 year zero coupon yields,
however, narrate a different story. The ACF and PACF profiles of these two yields
start off looking very similar to those of the 10 year yields, indicating an MA(1) mean
equation. Around the year 2014, the ACF and the PACF profiles change, indicating
that an MA(1) representation will not be an appropriate choice of the mean equation.
Figure 5.3: ACF plot for 20 year bond at t = 2nd Oct 2014
Figure 5.4: PACF plot for 20 year bond at t = 2nd Oct 2014
Figure 5.5: ACF plot for 30 year bond at t = 12th May 2014
Figure 5.6: PACF plot for 30 year bond at t = 12th May 2014
In figures (5.3), (5.4), (5.5) and (5.6), we see that instead of an MA(1) mean equation,
an AR(1) mean equation appears to be more appropriate for the 20 and 30 year yields.
This observation is largely driven by the sharp cutoff observed in PACF plot after lag
1. One could also consider a mixed MA and AR mean equation as the PACF plot
does not really demonstrate a uniform decay after lag 1. With these observations at
hand and the fact that the 20 and 30 year yields undergo a change in profile over
time, we employ a simple mean equation and assume that the innovations εt from
the mean equation are nothing but the returns $x_i^t$ themselves. Thus for the 20 and 30 year bonds,
$$x_i^t = \varepsilon_t, \quad \forall t \text{ and } i = 20, 30 \qquad (5.13)$$
where the innovation terms $\varepsilon_t$'s are uncorrelated with zero mean. The variance equation of the 20 and 30 year yields continues to be GARCH(1, 1) as expressed in equation (5.8).
For parameter estimation, we follow Wurtz et al. [23] and employ a quasi maximum likelihood (QML) technique to estimate the unknown parameters for each of the five volatility models corresponding to the five different maturities considered in our strategy. The QML technique infers the innovations $\varepsilon_t$'s and assumes that, conditional on some initial values of $\varepsilon_0$ and $\sigma_{0,i}^2$, the $\eta_t$'s are distributed as standard Gaussian. Appendix 2 provides an R implementation of the volatility estimation procedure using the package 'fGarch'. The estimation process uses all available daily historical data on the yields with a minimum of 300 days worth of data. For the bond with maturity $T_i$, the time t predicted yield volatility $\sigma_i^2$ over the trading period dt = 5 days is then calculated as the average of model predicted volatilities at times t + 1, t + 2, ⋯, t + 5.
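The averaging step at the end of this procedure can be illustrated with the standard multi-step forecast of a GARCH(1, 1) model. This Python sketch is not the R code from the appendix. The function name and parameter values are hypothetical, the mean equation is ignored, and "average of model predicted volatilities" is read here as the mean of the predicted standard deviations; averaging the predicted variances instead would be an equally plausible reading.

```python
import numpy as np

def garch11_forecast_vol(last_eps, last_sigma2, omega, alpha, beta, horizon=5):
    """Average of the 1..horizon step-ahead GARCH(1,1) volatility forecasts.

    last_eps    : most recent innovation eps_t
    last_sigma2 : most recent conditional variance sigma_t^2
    """
    # One step ahead follows equation (5.9) directly
    var = omega + alpha * last_eps ** 2 + beta * last_sigma2
    forecasts = [var]
    for _ in range(horizon - 1):
        # Beyond one step, the expected squared innovation equals the forecast variance
        var = omega + (alpha + beta) * var
        forecasts.append(var)
    # Assumption: average the standard deviations, not the variances
    return float(np.mean(np.sqrt(forecasts)))

# Hypothetical state and parameters, for illustration only
vol_5d = garch11_forecast_vol(1.0, 1.0, omega=0.1, alpha=0.1, beta=0.8)
vol_1d = garch11_forecast_vol(2.0, 1.0, omega=0.1, alpha=0.1, beta=0.8, horizon=1)
print(vol_5d, vol_1d)
```

Since α + β < 1, the multi-step forecasts decay geometrically towards the unconditional variance, so the five day average sits between the one step forecast and the long-run level.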
5.3 Implementation Steps
Let us consider the portfolio 10/20/30 and let w10, w20 and w30 (with w30 = 1) be
the optimal portfolio weights estimated at time t using the technique described in
Chapter 5.1. Denote by σ10, σ20 and σ30 the time t best estimates of yield volatilities
over the trading interval dt calculated using the methodology described in Chapter
5.2. In what follows, we list the computational steps required to run the strategy
from time t to t+ dt.
1. Let $f_t^T$ and $r_t$ denote respectively the instantaneous forward rates and the funding rate at time t where T = 10, 20, 30. Calculate the theoretical portfolio Convexity, $C_P^{th}$, at time t using equation (3.6),
$$C_P^{th} = -\left[ w_{30} \left( f_t^{30} - r_t \right) + w_{20} \left( f_t^{20} - r_t \right) + w_{10} \left( f_t^{10} - r_t \right) \right] \qquad (5.14)$$
2. Calculate the predicted portfolio Convexity, $C_P^{pred}$, using equation (4.2),
$$C_P^{pred} = \frac{1}{2} \left[ w_{30} T_{30}^2 \sigma_{30}^2 + w_{20} T_{20}^2 \sigma_{20}^2 + w_{10} T_{10}^2 \sigma_{10}^2 \right] \qquad (5.15)$$
where $T_i = i$, i = 10, 20, 30.
3. Calculate the trading signal, S, using equation (4.3),
S = CPpred − CPth (5.16)
4. If S > 0, we buy a duration hedged long convexity portfolio and if S < 0 we sell
a duration hedged long convexity portfolio. As far as strategy implementation
is concerned, we generate smoothed weights s10, s20, s30 as follows:
If S > 0 , obtain s10 = w10, s20 = −w20, s30 = w30 = 1
If S < 0 , obtain s10 = −w10, s20 = w20, s30 = −w30 = −1
5. Estimate the zero coupon yield and the corresponding bond price at the next
time step t+dt and maturity Ti−dt where Ti = i, i = 10, 20, 30 and dt = 1/52. In
other words, we calculate y(t+ dt, Ti − dt) using the parametric representation
of the yield curve provided in Gurkaynak et al.[10]. To calculate the price
p(t+ dt, Ti − dt), we simply use the standard relationship,
p(t+ dt, Ti − dt) = exp [− (Ti − dt) y(t+ dt, Ti − dt)]
6. At this stage, we have all the ingredients required to compute the bond excess return. Define ExRet(i) as the excess return on bond i calculated at time t + dt as
$$\mathrm{ExRet}(i) = s_i \left( \frac{p(t + dt, T_i - dt) - p(t, T_i)}{p(t, T_i)} - r_t\, dt \right), \quad i = 10, 20, 30 \qquad (5.17)$$
Note that in equation (5.17), the first term is the full return on the bond including Carry and the second term is the risk-free funding rate $r_t$ over the period dt = 1/52.
7. Calculate the portfolio P&L at time t+ dt as
MTM(t+ dt) = ExRet(10) + ExRet(20) + ExRet(30) (5.18)
The steps described above complete one iteration of running the strategy on historical
data from time t to t+dt on the portfolio 10/20/30. The same procedure is applicable
while using the strategy on the portfolios 5/10/30 or 5/15/30 with appropriate values
of durations wherever applicable. To generate a cumulative portfolio P&L profile over
the entire history of the available data, we repeat the above steps at time t = t+ dt,
cumulatively adding the MTM ’s generated at each time step to produce a running
portfolio P&L profile over time.
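Steps 4 to 7 for a single trading interval can be sketched as below. This is an illustrative outline rather than the implementation used for the results: the function name and all inputs are hypothetical, the yields at t + dt for maturities $T_i - dt$ are assumed to be supplied by the user (the text obtains them from the parametric yield curve), and a signal of exactly zero is treated as no position.

```python
import numpy as np

def one_period_pnl(signal, w, T, y_now, y_next, r, dt=1.0 / 52.0):
    """Portfolio MTM over one interval, following equations (5.17)-(5.18).

    w      : weights (w_10, w_20, w_30) from the estimation step
    T      : maturities in years
    y_now  : yields y(t, T_i)
    y_next : yields y(t + dt, T_i - dt), assumed to be given
    r      : funding rate r_t
    """
    w, T, y_now, y_next = (np.asarray(a, dtype=float) for a in (w, T, y_now, y_next))
    # Step 4: sign pattern (+, -, +) when S > 0, flipped when S < 0
    s = np.sign(signal) * np.array([1.0, -1.0, 1.0]) * w
    p_now = np.exp(-T * y_now)
    p_next = np.exp(-(T - dt) * y_next)                  # step 5
    ex_ret = s * ((p_next - p_now) / p_now - r * dt)     # equation (5.17)
    return float(np.sum(ex_ret))                         # equation (5.18)

# Hypothetical inputs, for illustration only
args = dict(w=[0.9, 1.8, 1.0], T=[10.0, 20.0, 30.0],
            y_now=[0.030, 0.035, 0.036], y_next=[0.0305, 0.0352, 0.0361], r=0.01)
pnl_long = one_period_pnl(+1.0, **args)
pnl_short = one_period_pnl(-1.0, **args)
print(pnl_long, pnl_short)
```

Repeating this call at each rebalancing date and accumulating the results gives the running P&L profile described above; the long and short positions produce equal and opposite MTM by construction.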
Chapter 6
Results
In this section, we present the results of our strategy implementation in terms of
the estimated portfolio weights, the estimated volatilities and the observed portfolio
P&L for the three strategies considered. We remind ourselves that while back-testing our strategy, we have taken t = 9th Feb, 1987 as the date when we first implement the strategy and t = 3rd Oct, 2014 as the last date. Between these two dates, the strategy has been run on 1,381 trading days with the trading interval dt being equal to 5 days.
6.1 Estimated Weights
We begin this section by presenting the estimated weights wi, i = 5, 10, 15, 20 for
the three portfolios considered in the strategy. The weights for the 30 year bonds
are always equal to 1 and thus we refrain from presenting them here. Recall that
the weights estimation procedure discussed in Chapter 5.1 has been executed 1,381 times, once for each trading day in our data. Figures (6.1), (6.2) and (6.3) show the estimated portfolio weights for the three portfolios used in the strategy.
Figure 6.1: Estimated weights for the 10 (Blue line) and 20 (Orange line) year bonds for the portfolio 10/20/30
Figure 6.2: Estimated weights for the 5 (Blue line) and 10 (Orange line) year bonds for the portfolio 5/10/30
Figure 6.3: Estimated weights for the 5 (Blue line) and 15 (Orange line) year bonds for the portfolio 5/15/30
We see that the estimated weights have magnitudes which are generally intuitive. For
example, the 5 year bonds tend to have weights relatively higher in magnitude. This
is expected since the duration of the 5 year bond is smaller and hence more of the
5 year bonds are necessary to hedge the movement in the 30 year bonds. Also, the
profiles of the optimal weights are highly anti-correlated.
6.2 Estimated Volatilities
In this section we present the results of the volatility estimation exercise as discussed in Chapter 5.2. We begin with figures that show the evolution of the estimated yield volatilities over the entire sample. Once again, the volatilities $\sigma_i$ have been estimated for each of the five maturities considered in the strategy across the 1,381 trading days in the data.
Figure 6.4: Estimated volatilities σ5 (Yellow line), σ10 (Orange line) and σ15 (Green line).
We notice three distinct spikes in the volatility estimates of the 5, 10 and 15 year yields. The first spike emerges around the period 1987-1988, which coincides with the stock market crash of 1987,¹ followed by a spike around 1994 which could have been a fallout of the Mexican peso crisis.² The final spike in estimated volatility appears around the year 2008, concurrent with the global financial crisis of 2008.
¹ The 1987 stock market crash was a major systemic shock and market functioning was severely impaired. Mark Carlson in his 2006 paper [16] discusses the events surrounding the crash.
² On December 20, 1994, the Mexican government devalued the peso. The financial crisis that followed cut the peso's value in half and sparked a severe recession in Mexico [15].
Figure 6.5: Estimated volatilities σ20 (Blue line) and σ30 (Orange line).
For the 20 and 30 year volatility estimates, we observe a similar pattern; however, the fluctuations in the volatility estimates are of a much smaller magnitude.
In the following tables, we present the parameter estimates and their significance for the five volatility models. The results are presented for a sample of 10 randomly chosen trading dates only, but the observations hold in general across all the 1,381 trading dates in the data.
5 Year Volatility Model
| Trading Date | σ5 | µ | b1 ⋄ | ω | α ⋄ | β ⋄ |
|---|---|---|---|---|---|---|
| 3-Oct-14 | 0.0067 | -1.2*10^-5 • | 0.0615 | 2.8*10^-9 ⋄ | 0.0452 | 0.9488 |
| 18-Jun-13 | 0.0070 | -1.2*10^-5 | 0.0661 | 3.0*10^-9 ⋄ | 0.0451 | 0.9489 |
| 5-Apr-12 | 0.0081 | -1.4*10^-5 • | 0.0705 | 5.7*10^-9 ⋄ | 0.0462 | 0.9411 |
| 15-Feb-11 | 0.0107 | -1.2*10^-5 | 0.0727 | 5.6*10^-9 ⋄ | 0.0449 | 0.9428 |
| 2-Aug-10 | 0.0093 | -1.3*10^-5 | 0.0763 | 5.4*10^-9 ⋄ | 0.0438 | 0.9444 |
| 13-Nov-07 | 0.0101 | -1.2*10^-5 | 0.0887 | 6.4*10^-9 ⋄ | 0.0437 | 0.9408 |
| 7-Sep-00 | 0.0083 | -1.4*10^-5 | 0.1123 | 1.3*10^-8 ⋄ | 0.0498 | 0.9195 |
| 27-May-97 | 0.0089 | -1.4*10^-5 | 0.1275 | 1.5*10^-8 ⋄ | 0.0504 | 0.9160 |
| 3-Oct-91 | 0.0079 | -9.9*10^-6 | 0.1573 | 2.6*10^-8 ⋄ | 0.0879 | 0.8540 |
| 17-Jul-87 | 0.0094 | -3.0*10^-5 | 0.1717 | 1.4*10^-8 | 0.1261 | 0.8566 |

Table 6.1: Estimated parameters for the 5 year volatility model. ⋄: significant at 1%, ⋆: significant at 5%, •: significant at 10%. A marker in a column header applies to every entry in that column.
10 Year Volatility Model
| Trading Date | σ10 | µ | b1 ⋄ | ω | α | β ⋄ |
|---|---|---|---|---|---|---|
| 3-Oct-14 | 0.0074 | -1.4*10^-5 ⋆ | 0.0467 | 3.3*10^-9 ⋄ | 0.0384 ⋄ | 0.9538 |
| 18-Jun-13 | 0.0089 | -1.5*10^-5 ⋆ | 0.0500 | 3.6*10^-9 ⋄ | 0.0391 ⋄ | 0.9524 |
| 5-Apr-12 | 0.0095 | -1.6*10^-5 ⋆ | 0.0511 | 3.8*10^-9 ⋄ | 0.0397 ⋄ | 0.9516 |
| 15-Feb-11 | 0.0113 | -1.5*10^-5 ⋆ | 0.0512 | 3.3*10^-9 ⋄ | 0.0373 ⋄ | 0.9552 |
| 2-Aug-10 | 0.0096 | -1.6*10^-5 ⋆ | 0.0562 | 3.3*10^-9 ⋄ | 0.0367 ⋄ | 0.9557 |
| 13-Nov-07 | 0.0082 | -1.5*10^-5 • | 0.0675 | 3.5*10^-9 ⋄ | 0.0354 ⋄ | 0.9556 |
| 7-Sep-00 | 0.0078 | -1.9*10^-5 • | 0.0806 | 6.2*10^-9 ⋄ | 0.0376 ⋄ | 0.9471 |
| 27-May-97 | 0.0081 | -2.0*10^-5 | 0.0933 | 6.0*10^-9 ⋄ | 0.0361 ⋄ | 0.9501 |
| 3-Oct-91 | 0.0078 | -2.1*10^-5 | 0.1126 | 8.5*10^-9 ⋆ | 0.0525 ⋄ | 0.9322 |
| 17-Jul-87 | 0.0110 | -2.2*10^-5 | 0.1764 | 1.3*10^-8 | 0.0785 ⋆ | 0.9028 |

Table 6.2: Estimated parameters for the 10 year volatility model. ⋄: significant at 1%, ⋆: significant at 5%, •: significant at 10%. A marker in a column header applies to every entry in that column.
15 Year Volatility Model
| Trading Date | σ15 | µ | b1 ⋄ | ω | α ⋄ | β ⋄ |
|---|---|---|---|---|---|---|
| 3-Oct-14 | 0.0075 | -1.3*10^-5 • | 0.0444 | 3.0*10^-9 ⋄ | 0.0375 | 0.9547 |
| 18-Jun-13 | 0.0088 | -1.3*10^-5 • | 0.0473 | 3.3*10^-9 ⋄ | 0.0387 | 0.9529 |
| 5-Apr-12 | 0.0092 | -1.5*10^-5 • | 0.0465 | 3.4*10^-9 ⋄ | 0.0395 | 0.9522 |
| 15-Feb-11 | 0.0108 | -1.4*10^-5 • | 0.0462 | 2.9*10^-9 ⋄ | 0.0363 | 0.9564 |
| 2-Aug-10 | 0.0099 | -1.5*10^-5 • | 0.0533 | 2.9*10^-9 ⋄ | 0.0359 | 0.9567 |
| 13-Nov-07 | 0.0077 | -1.5*10^-5 • | 0.0643 | 2.9*10^-9 ⋄ | 0.0338 | 0.9582 |
| 7-Sep-00 | 0.0072 | -1.9*10^-5 • | 0.0746 | 4.0*10^-9 ⋄ | 0.0363 | 0.9535 |
| 27-May-97 | 0.0081 | -2.0*10^-5 • | 0.0922 | 3.7*10^-9 ⋄ | 0.0351 | 0.9562 |
| 3-Oct-91 | 0.0073 | -2.2*10^-5 | 0.1363 | 4.2*10^-9 ⋆ | 0.0406 | 0.9522 |
| 17-Jul-87 | 0.0118 | -2.1*10^-5 | 0.2473 | 1.1*10^-8 | 0.0682 | 0.9158 |

Table 6.3: Estimated parameters for the 15 year volatility model. ⋄: significant at 1%, ⋆: significant at 5%, •: significant at 10%. A marker in a column header applies to every entry in that column.
We note that the estimated moving-average parameter b1 from the MA(1) mean equation
and the estimated parameters α and β from the GARCH(1, 1) model are all
significant, not only for these 10 trading dates but for the entire sample (see footnote 3).
Further, α + β < 1, indicating second order stationarity of the fitted GARCH(1, 1) process [9].
Footnote 3: We have also noticed that the GARCH(1, 1) residuals and their squares do not exhibit autocorrelation, although the residuals do not appear to be normally distributed.
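As a small illustration of the stationarity condition, the sketch below checks α + β < 1 and, when it holds, computes the long-run (unconditional) variance ω / (1 − α − β) that a stationary GARCH(1, 1) process implies. The inputs are the 3-Oct-14 row of Table 6.1; the function name and the annualization by √250 (the factor used in the R script of Appendix 2) are choices made for this sketch, and the result is a long-run level, not the predicted volatility σ5 reported in the table.

```python
import math


def garch11_long_run(omega, alpha, beta, periods_per_year=250):
    """Return (persistence, daily vol, annualized vol) of a stationary GARCH(1,1)."""
    persistence = alpha + beta
    if persistence >= 1.0:
        # The unconditional variance is only finite when alpha + beta < 1.
        raise ValueError("alpha + beta >= 1: process is not second order stationary")
    long_run_variance = omega / (1.0 - persistence)
    daily_vol = math.sqrt(long_run_variance)
    return persistence, daily_vol, daily_vol * math.sqrt(periods_per_year)


# Parameters from the 3-Oct-14 row of Table 6.1 (5 year model).
persistence, daily_vol, annual_vol = garch11_long_run(2.8e-9, 0.0452, 0.9488)
print(f"alpha + beta = {persistence:.4f}, long-run annualized vol = {annual_vol:.4f}")
```

With α + β this close to 1, volatility shocks decay slowly, which is why the long-run level is sensitive to small changes in the estimated parameters.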
20 Year Volatility Model
Trading Date   σ20      ω             α⋄       β⋄
3-Oct-14       0.0074   2.5*10^-9⋄    0.0364   0.9561
18-Jun-13      0.0084   2.7*10^-9⋄    0.0376   0.9546
5-Apr-12       0.0088   2.7*10^-9⋄    0.0382   0.9542
15-Feb-11      0.0101   2.4*10^-9⋄    0.0350   0.9583
2-Aug-10       0.0100   2.4*10^-9⋄    0.0348   0.9583
13-Nov-07      0.0076   2.3*10^-9⋄    0.0321   0.9606
7-Sep-00       0.0067   3.1*10^-9⋄    0.0343   0.9566
27-May-97      0.0077   2.8*10^-9⋄    0.0327   0.9598
3-Oct-91       0.0067   3.6*10^-9⋆    0.0424   0.9508
17-Jul-87      0.0113   1.3*10^-8•    0.0671   0.9152
Table 6.5: Estimated parameters for the 20 year volatility model. ⋄: significant at 1%, ⋆: significant at 5%, •: significant at 10%.
30 Year Volatility Model
Trading Date   σ30      ω             α⋄       β⋄
3-Oct-14       0.0077   3.7*10^-9⋄    0.0485   0.9421
18-Jun-13      0.0086   4.1*10^-9⋄    0.0500   0.9400
5-Apr-12       0.0093   4.1*10^-9⋄    0.0507   0.9397
15-Feb-11      0.0087   3.8*10^-9⋄    0.0479   0.9428
2-Aug-10       0.0097   3.8*10^-9⋄    0.0478   0.9430
13-Nov-07      0.0090   3.8*10^-9⋄    0.0468   0.9433
7-Sep-00       0.0067   3.9*10^-9⋄    0.0463   0.9453
27-May-97      0.0063   4.4*10^-9⋄    0.0463   0.9457
3-Oct-91       0.0097   5.6*10^-9⋆    0.0424   0.9489
17-Jul-87      0.0119   1.1*10^-8•    0.0600   0.9273
Table 6.6: Estimated parameters for the 30 year volatility model. ⋄: significant at 1%, ⋆: significant at 5%, •: significant at 10%.
For the 20 and 30 year volatility models, we have similar observations. The estimated
parameters α and β from the GARCH(1, 1) model are all significant, both for these
10 trading dates and for the entire sample (see footnote 4). Also, α + β < 1, indicating
second order stationarity of the fitted GARCH(1, 1) process [9].
Footnote 4: In this case too, we have noticed that the GARCH(1, 1) residuals and their squares do not exhibit autocorrelation, although the residuals do not appear to be normally distributed.
6.3 Portfolio Profit & Loss
In this section we present the P&L’s observed when we apply the strategy to the
three portfolios: 10/20/30, 5/10/30 and 5/15/30.
Figure 6.6: P&L for the portfolios
Figure (6.6) shows that the strategy is profitable on all three portfolios over the
full sample, although the 10/20/30 portfolio appears to lose money after 2008. The
Sharpe Ratios (see footnote 5) tell the same story: they are positive for all three
portfolios over the entire sample, but the ratio for the 10/20/30 portfolio is negative
after 2008 (see footnote 6).
Sharpe Ratio
Portfolio   Full Sample   Post 2008
10/20/30    0.31          -0.07
5/10/30     0.40          1.05
5/15/30     0.68          0.94
Table 6.7: Sharpe Ratios for the three portfolios.
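The Sharpe Ratio used here is √52 times the ratio of the average daily portfolio P&L to its standard deviation. A minimal sketch of that calculation follows; the function name and the toy P&L series are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics


def sharpe_ratio(pnl, scale=52):
    """sqrt(scale) * mean(pnl) / stdev(pnl), using the sample standard deviation."""
    if len(pnl) < 2:
        raise ValueError("need at least two P&L observations")
    spread = statistics.stdev(pnl)  # assumption: n - 1 denominator
    if spread == 0:
        raise ValueError("P&L series has zero variance")
    return math.sqrt(scale) * statistics.fmean(pnl) / spread


# Toy P&L series, invented purely for illustration.
example_pnl = [0.02, -0.01, 0.03, 0.00, 0.01]
example_sr = sharpe_ratio(example_pnl)
print(f"Sharpe Ratio of the toy series: {example_sr:.2f}")
```

Because the ratio depends on both the mean and the dispersion over the chosen window, restricting the same series to a shorter sub-period can change its sign, which is the caveat raised for the post 2008 figures.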
Footnote 5: The Sharpe Ratio (SR), developed by William F. Sharpe, is a measure of risk adjusted return. In our setup, SR = √52 * (ratio of the average daily portfolio P&L to the standard deviation of the daily portfolio P&L).
Footnote 6: The negative Sharpe Ratio observed for the 10/20/30 portfolio post 2008 does not necessarily imply that the strategy fails to work after 2008, because the computed values of the Sharpe Ratio depend on the chosen dates, and the period 2008-2014 might not be long enough to infer whether the strategy was profitable post 2008.
We also note that all three portfolios exhibit three distinct P&L regimes.
During the first period, the strategy is moderately profitable, followed by a period in
the middle when the strategy remains almost neutral. In the third period, the strategy
is clearly profitable. We recall that if the portfolio weights genuinely immunize
against the duration shocks and if the volatility estimation procedure provides correct
guesses of the future volatility, then our strategy should make money every time. The
reality represented by figure (6.6), however, motivates us to search
for reasons as to why the strategy fails to make money during the middle period. We
explore this question in detail in Chapter 7.
Chapter 7
Conclusion
We recognize that our strategy banks heavily on two moving parts - the optimal
portfolio weights wi and the best estimates of future yield volatilities σi. This implies
that there could be three potential reasons as to why the strategy might fail to make
money:
• In reality the yield curve is fairly curved and the signal strength |S| is small
indicating that there is in fact no money to be made.
• The optimal portfolio weights wi do not really render the portfolio duration-
immunized. Thus, there might be residual exposure to the level, slope or cur-
vature of the yields that impact the portfolio P&L.
• Finally, there is the potential danger of producing estimates of future volatility
that are far away from the reality.
As far as the first reason is concerned, it is important to note that if the absence of
profit on any particular trading day is really due to the fact that the yield curve
is fairly curved and hence |S| is small, then we would expect a strong correlation
between the daily P&L’s and the strength of the signal |S|. Whenever the signal
strength is small, we would know that there is no money to be made, but a moderate
to strong signal should imply a bigger P&L. To test this hypothesis, we calculated
the average P&L for each decile of the absolute signal strength |S|, expecting that a
strong relationship between signal strength and P&L would produce an average P&L
that increases monotonically with the deciles of |S|. Figures (7.1), (7.2) and (7.3) plot the
average P&L against the deciles of the absolute signal strength |S|.
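The decile test described above can be sketched as follows: sort the trading days by |S|, split them into ten equally sized buckets, average the P&L in each bucket, and check whether the averages increase from one bucket to the next. The function names and the synthetic data are hypothetical, and splitting by rank into equal-count buckets is an assumption about how the deciles were formed.

```python
def average_pnl_by_decile(signal, pnl, n_buckets=10):
    """Average P&L within equal-count buckets of days ranked by |signal|."""
    if len(signal) != len(pnl):
        raise ValueError("signal and pnl must have the same length")
    if len(signal) < n_buckets:
        raise ValueError("need at least one observation per bucket")
    # Indices of the days, ordered from weakest to strongest absolute signal.
    order = sorted(range(len(signal)), key=lambda k: abs(signal[k]))
    n = len(order)
    averages = []
    for b in range(n_buckets):
        bucket = order[b * n // n_buckets:(b + 1) * n // n_buckets]
        averages.append(sum(pnl[k] for k in bucket) / len(bucket))
    return averages


def is_monotone_increasing(values):
    """True when each bucket average is at least as large as the previous one."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Synthetic data in which P&L is exactly proportional to |S| (illustration only).
toy_signal = [(-1) ** k * k for k in range(1, 21)]
toy_pnl = [0.1 * abs(s) for s in toy_signal]
toy_averages = average_pnl_by_decile(toy_signal, toy_pnl)
print(is_monotone_increasing(toy_averages))
```

In this idealized case the bucket averages rise monotonically; with real P&L the same check makes any departure from that pattern explicit.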
Figure 7.1: Signal Strength vs. Average P&L for 10/20/30
Figure 7.2: Signal Strength vs. Average P&L for 5/15/30
Figure 7.3: Signal Strength vs. Average P&L for 5/10/30
The figures for the three portfolios reveal a complex picture. The relationship between
average P&L and absolute signal strength is certainly not monotonically increasing,
although for two of the last three deciles the average P&L is relatively large across all
three portfolios. It is also clear that, with the exception of the 5/15/30 portfolio, a
weaker signal strength indeed means a smaller P&L. Thus, signal strength appears
to have some bearing on the realized P&L, although there might be other factors
contributing to it.
The observed lack of a strong positive relationship between signal strength and
P&L could be due to the inability of the portfolio weights wi to deliver duration
immunization. In other words, there might be residual exposure to the level, slope or
curvature of the yields, and if this were true then we would expect to see significant
coefficients in the regression of P&L against the level, slope and curvature of the three
yields for the portfolio in question. Thus, we proceed to test the second potential
reason why the strategy fails to make money every time.
Yield Curve Level vs. P&L
Portfolio   R2        α        p Value   β         p Value
10/20/30    -0.0007   0.0013   0.5127    -0.0037   0.8462
5/10/30     -0.0003   0.0001   0.9673    0.0144    0.4658
5/15/30     0.0007    0.0003   0.8834    0.0274    0.1603
Table 7.1: Regression of P&L against the Level of the Yields.
In table (7.1), we see that the R2’s for the regression of the daily P&L on the
yield curve level are extremely low, with the coefficients being insignificant. Thus the three
portfolios appear to be sufficiently immunized against yield curve levels.
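For reference, a regression of this kind (P&L on a single yield curve factor, with intercept α and slope β) can be sketched with ordinary least squares as below. The function name and the toy data are hypothetical. The sketch also reports the adjusted R2: a plain R2 from an OLS fit with an intercept cannot be negative, so the small negative entries in the R2 column would be consistent with an adjusted R2, although the text does not state which one is tabulated.

```python
def ols_simple(x, y):
    """OLS of y on x with an intercept; returns (alpha, beta, r2, adjusted r2)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    ss_res = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R2 penalizes the one slope parameter; it can fall below zero.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return alpha, beta, r2, adj_r2


# Toy example: a factor with almost no explanatory power for the P&L.
toy_factor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_pnl = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
alpha, beta, r2, adj_r2 = ols_simple(toy_factor, toy_pnl)
print(f"R2 = {r2:.4f}, adjusted R2 = {adj_r2:.4f}")
```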
Yield Curve Slope vs. P&L
Portfolio   R2        α         p Value   β         p Value
10/20/30    0.0089    0.0075    0.0001    -0.7102   0.0003
5/10/30     0.0000    -0.0013   0.6438    0.1386    0.3194
5/15/30     -0.0007   0.0020    0.4819    0.0160    0.8993
Table 7.2: Regression of P&L against the Slope of the Yields.
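In table (7.2), the R2’s for the regression of the daily P&L on the yield curve slope are
once again extremely low. The coefficients are insignificant for the 5/10/30 and 5/15/30
portfolios; they are significant for the 10/20/30 portfolio, but with an R2 below 1% the
slope explains almost none of its P&L. Thus the
three portfolios appear to be sufficiently immunized against yield curve slope too.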
Yield Curve Curvature vs. P&L
Portfolio   R2        α         p Value   β         p Value
10/20/30    0.0173    -0.0106   0.0000    2.9675    0.0000
5/10/30     -0.0003   0.0010    0.2928    -0.3077   0.4606
5/15/30     0.0186    -0.0026   0.0236    1.8631    0.0000
Table 7.3: Regression of P&L against the Curvature of the Yields.
As far as the regressions of the daily P&L on the yield curve curvature are concerned, we
find that the R2’s for the portfolios 10/20/30 and 5/15/30 are marginally higher (0.0173
and 0.0186) but definitely not in a range that warrants caution, primarily because in this
case the exposure, if any, is to the yield curve curvature, which is the least erratic of the
three.
Thus, in general, the optimal portfolio weights appear to have immunized the
respective portfolios appreciably well against movements in the yield curve level, slope
and curvature. The observed lack of a strong positive relationship between signal
strength and P&L therefore points to weaknesses in our volatility estimation, the last
of the plausible reasons why the strategy failed to make money every time. To test
whether the estimates of future volatility are really far away from reality, we regress
the convexity contributed P&L against the difference between the predicted portfolio
convexity CPpred and the realized portfolio convexity CPreal. An estimated value of R2
significantly smaller than 1 would then indicate the degree of deviation of the convexity
contributed P&L from the signal strength and hence give a sense of how far the
volatility estimates have been from the truth. In particular, for the 5/15/30 portfolio,
the convexity contributed P&L is
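calculated as
$$\mathrm{CP\&L} = \mathrm{P\&L} - \sum_i s_i T_i \left( y(t, T_i) - y(t-1, T_i) \right), \quad i = 5, 15, 30 \tag{7.1}$$
where the s_i’s are the smoothed optimal weights. The realized portfolio convexity is
calculated as
$$\mathrm{CP}_{\mathrm{real}} = \frac{1}{2} \sum_i s_i T_i^2 \left( y(t, T_i) - y(t-1, T_i) \right)^2, \quad i = 5, 15, 30 \tag{7.2}$$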
For the 5/15/30 portfolio, the regression of CP&L on CPreal − CPpred produces an
R2 of 0.45, indicating that in general there is a positive relationship between the
convexity contributed P&L and the difference between realized and predicted convexities;
however, this relationship appears to be distorted whenever there are large swings in
the convexity contributed P&L. Thus the chosen volatility estimation model does not
provide us with a perfect predictive tool, and when moves of unexpected size occur,
the strategy might fail to generate a profit even though it picks up a strong signal.
Therefore, in reality, depending on the size of the unexpected move and whether the
portfolio was long or short convexity before the move, the strategy could either fail
or work exceedingly well.
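Equations (7.1) and (7.2) translate directly into code. The sketch below evaluates both for a single day; the function names, the weights and the yields are hypothetical and chosen only so that the arithmetic is easy to follow (the weights satisfy Σ s_i T_i = 0, so a parallel shift leaves the duration term in (7.1) at zero).

```python
def convexity_pnl(pnl, weights, maturities, y_now, y_prev):
    """Equation (7.1): P&L minus sum_i s_i * T_i * (y(t, T_i) - y(t-1, T_i))."""
    duration_term = sum(
        s * T * (y1 - y0)
        for s, T, y1, y0 in zip(weights, maturities, y_now, y_prev)
    )
    return pnl - duration_term


def realized_convexity(weights, maturities, y_now, y_prev):
    """Equation (7.2): 0.5 * sum_i s_i * T_i^2 * (y(t, T_i) - y(t-1, T_i))^2."""
    return 0.5 * sum(
        s * T ** 2 * (y1 - y0) ** 2
        for s, T, y1, y0 in zip(weights, maturities, y_now, y_prev)
    )


# Hypothetical inputs for a 5/15/30 portfolio and a parallel 10 basis point move.
toy_weights = [1.5, -2.5, 1.0]
toy_maturities = [5, 15, 30]
toy_y_prev = [0.020, 0.025, 0.030]
toy_y_now = [0.021, 0.026, 0.031]

toy_cpnl = convexity_pnl(2e-4, toy_weights, toy_maturities, toy_y_now, toy_y_prev)
toy_cp_real = realized_convexity(toy_weights, toy_maturities, toy_y_now, toy_y_prev)
print(f"CP&L = {toy_cpnl:.6f}, CPreal = {toy_cp_real:.6f}")
```

Repeating this for every trading day gives the two series whose relationship the regression above summarizes.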
In this essay, we have devoted our efforts to developing a novel model agnostic
representation of portfolio convexity based on the cross-sectional information of
“Carry” and “Roll-Down” of a suitably immunized portfolio. We went further and
introduced a criterion to determine whether the market yield curve is fairly curved, based
on our model agnostic representation. We tested this criterion as a trading strategy
on historical US Treasury yield curve data from 1987-2014, noting that the strategy
was profitable overall, with limitations as far as yield volatility estimation was
concerned. We also demonstrated that the profitability of the trading strategy is not due to
uncontrolled residual exposure to the level, slope or curvature of the yields but is purely
due to the ability of the strategy to tap into the mis-pricings in convexity. In particular,
we noted that when moves of unanticipated size occur, the GARCH volatility
estimates may not correctly estimate the future volatility, leading to situations in which
the portfolio incurs a loss or an unexpected gain.
We envision further research being directed towards exploiting more sophisticated
volatility estimation procedures, such as EGARCH or APARCH, to improve the quality of the
estimates of future yield volatility. The possibility of introducing a threshold on the
minimum absolute signal strength |S| above which the trade is executed could be
explored to manage the sensitivity of the P&L profile to the signal strength. As far
as the estimation of optimal portfolio weights is concerned, an area of further
investigation would be to assess the impact of weights that have been estimated
assuming that the yields are Log Normal rather than Gaussian.
Chapter 8
Appendix 1 - Estimation of Portfolio Weights: MATLAB Code
%%-------------------------------------------------------------------
%% Estimation of Optimal Portfolio Weights.
%%-------------------------------------------------------------------
clear all;
clc;
%%--------------------------------------------------------------------
%% Define Options
%%-------------------------------------------------------------------
project.weights = 'LogN'; % 'LogN' or 'Norm'
%% Read the daily yield data
[~,~,fulldata] = xlsread('1985to2014.xlsx', 1);
fulldata = cell2dataset(fulldata);
fullyields = fulldata;
fullyields.Index =[];
fullyields.Date =[];
DateString = fulldata.Date;
formatIn = 'yyyy-mm-dd';
temp1 = datenum(DateString,formatIn);
temp2 = datevec(temp1);
fulldata.Year = temp2(:,1);
fulldata.Month = temp2(:,2);
fulldata.Day = temp2(:,3);
maturity = [10 20 30];%5,15,30 or 5,10,30
yields = [fulldata.Index fulldata.Year10./100 fulldata.Year20./100 ...
fulldata.Year30./100];
clear('temp1','temp2');
%%-------------------------------------------------------------------
%% Create MTM & Weights using MVP
%%-------------------------------------------------------------------
nObs = size(yields,1); % number of daily observations
weights = zeros(nObs-300,3); % one weight vector per rolling 300-day window
MTM = zeros(nObs-1, 3);
if strcmp(project.weights,'Norm')
% Gaussian yields: mark-to-market change is -maturity * (yield change)
for i = 1:size(weights,1)
MTM = [-maturity(1)*diff(yields(i+1:300+i,2)) ...
-maturity(2)*diff(yields(i+1:300+i,3)) ...
-maturity(3)*diff(yields(i+1:300+i,4))];
% Regress the third bond on the first two without an intercept
mdl = fitlm([MTM(:,1) MTM(:,2)],MTM(:,3),'Intercept',false);
weights(i,:) = [-mdl.Coefficients.Estimate' 1];
end
elseif strcmp(project.weights,'LogN')
% Log Normal yields: use relative yield changes instead
for i = 1:size(weights,1)
MTM = [-maturity(1)*diff(yields(i+1:300+i,2))./yields(i+2:300+i,2) ...
-maturity(2)*diff(yields(i+1:300+i,3))./yields(i+2:300+i,3) ...
-maturity(3)*diff(yields(i+1:300+i,4))./yields(i+2:300+i,4)];
mdl = fitlm([MTM(:,1) MTM(:,2)],MTM(:,3),'Intercept',false);
weights(i,:) = [-mdl.Coefficients.Estimate' 1];
end
end
%%-------------------------------------------------------------------
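The core step of the MATLAB listing is a regression, without intercept, of the third bond's mark-to-market changes on those of the first two, after which the negated coefficients and a unit weight on the third bond form the portfolio weights. A minimal Python sketch of that single-window step is given below; it solves the two normal equations directly, and the function name and toy series are hypothetical.

```python
def hedge_weights(x1, x2, y):
    """Regress y on x1 and x2 through the origin; return [-b1, -b2, 1]."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("regressors are collinear")
    # Solution of the 2x2 normal equations by Cramer's rule.
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return [-b1, -b2, 1.0]


# Toy series in which the third leg is an exact combination of the first two.
toy_x1 = [1.0, 2.0, -1.0, 0.5]
toy_x2 = [0.5, -1.0, 2.0, 1.0]
toy_y = [0.4 * a + 0.3 * b for a, b in zip(toy_x1, toy_x2)]
toy_w = hedge_weights(toy_x1, toy_x2, toy_y)

# The weighted combination of the three legs is then (numerically) flat.
residuals = [toy_w[0] * a + toy_w[1] * b + toy_w[2] * c
             for a, b, c in zip(toy_x1, toy_x2, toy_y)]
print(toy_w)
```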
Chapter 9
Appendix 2 - Estimation of Volatilities: R Code
This script provides the estimation procedure for obtaining the predicted yield
volatilities of the 10, 20 and 30 year bonds. The script can be easily modified to do the
same for the 5 and 15 year bonds.
##----------------------------------------------------------------------------
## The required packages
require("fUnitRoots")
require(graphics)
require(fGarch)
require(tseries)
## Read the Data
fulldata <- read.csv("~/Data/1985to2014.csv",quote="'",as.is = TRUE)
yields <- cbind(fulldata$Year10/100,fulldata$Year20/100,fulldata$Year30/100)
temp = nrow(yields)
## Initialize some space
param10 = matrix(0,nrow=nrow(yields)-300,ncol=5)
param20 = matrix(0,nrow=nrow(yields)-300,ncol=3)
param30 = matrix(0,nrow=nrow(yields)-300,ncol=3)
predVols = matrix(0,nrow=nrow(yields)-300,ncol=3)
se10 = param10
se20 = param20
se30 = param30
tval10 = se10
tval20 = se20
tval30 = se30
pval10 = se10
pval20 = se20
pval30 = se30
## Begin the loop for estimating the 3 models across all data
for (i in 1:(nrow(yields)-300)){
## Take observations i+1 to the end of the sample and reverse their order
yields10 = rev(yields[(i+1):temp,1])
yields20 = rev(yields[(i+1):temp,2])
yields30 = rev(yields[(i+1):temp,3])
## Create Time Series object
yields10<- ts(yields10,start=1,end=length(yields10),frequency=1)
diff10<-diff(yields10,differences=1)
yields20<- ts(yields20,start=1,end=length(yields20),frequency=1)
diff20<-diff(yields20,differences=1)
yields30<- ts(yields30,start=1,end=length(yields30),frequency=1)
diff30<-diff(yields30,differences=1)
## MA(1) mean equation with GARCH(1,1) errors for the 10 year yield changes
gar10=garchFit(~arma(0,1)+garch(1,1),
data=diff10,trace=F,algorithm="lbfgsb")
## Pure GARCH(1,1) without a mean term for the 20 and 30 year yield changes
gar20=garchFit(~garch(1,1),
data=diff20,trace=F,algorithm="lbfgsb",include.mean=F)
gar30=garchFit(~garch(1,1),
data=diff30,trace=F,include.mean=F,algorithm="lbfgsb")
p10=predict(gar10,5)
p20=predict(gar20,5)
p30=predict(gar30,5)
param10[i,] = gar10@fit$coef
param20[i,] = gar20@fit$coef
param30[i,] = gar30@fit$coef
se10[i,] = gar10@fit$se.coef
se20[i,] = gar20@fit$se.coef
se30[i,] = gar30@fit$se.coef
tval10[i,] = gar10@fit$tval
tval20[i,] = gar20@fit$tval
tval30[i,] = gar30@fit$tval
pval10[i,] = gar10@fit$matcoef[,4]
pval20[i,] = gar20@fit$matcoef[,4]
pval30[i,] = gar30@fit$matcoef[,4]
res10 = residuals(gar10,standardize =TRUE)
res20 = residuals(gar20,standardize =TRUE)
res30 = residuals(gar30,standardize =TRUE)
## Average the 5-step-ahead standard deviations and annualize with sqrt(250)
predVols[i,]= sqrt(250)*c(mean(p10$standardDeviation),
mean(p20$standardDeviation), mean(p30$standardDeviation))
rm(gar10)
rm(gar20)
rm(gar30)
print(i)
}
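The last step of the R script averages the predicted standard deviations over a 5-day horizon and scales by √250. The sketch below reproduces that step with the standard GARCH(1, 1) variance recursion (one step ahead: ω + α ε² + β σ²; further steps: ω + (α + β) times the previous forecast). It is an illustration under stated assumptions, not a re-implementation of fGarch: the function name and inputs are hypothetical, and agreement with the output of `predict` has not been checked here.

```python
import math


def garch11_forecast_vol(omega, alpha, beta, last_resid, last_var,
                         horizon=5, periods_per_year=250):
    """Annualized mean of the 1..horizon step-ahead GARCH(1,1) standard deviations."""
    # One-step-ahead conditional variance uses the last residual and variance.
    var = omega + alpha * last_resid ** 2 + beta * last_var
    std_devs = []
    for _ in range(horizon):
        std_devs.append(math.sqrt(var))
        # Beyond one step, the expected squared residual equals the variance.
        var = omega + (alpha + beta) * var
    return math.sqrt(periods_per_year) * sum(std_devs) / horizon


# Parameters from the 3-Oct-14 row of Table 6.5 (20 year model).
omega, alpha, beta = 2.5e-9, 0.0364, 0.9561
long_run_var = omega / (1.0 - alpha - beta)

# Starting at the long-run level, the forecast stays there.
calm_vol = garch11_forecast_vol(omega, alpha, beta,
                                math.sqrt(long_run_var), long_run_var)
# A residual three times larger (hypothetical shock) lifts the forecast.
shocked_vol = garch11_forecast_vol(omega, alpha, beta,
                                   3.0 * math.sqrt(long_run_var), long_run_var)
print(f"calm: {calm_vol:.4f}, after shock: {shocked_vol:.4f}")
```

The comparison shows why a move of unanticipated size matters for the strategy: the forecast reacts only after the shock has entered the residual.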
References
[1] Barrett, W. Brian, Thomas F. Gosnell, Jr., and Andrea J. Heuson. (1995). Yield
Curve Shifts and the Selection of Immunization Strategies. Journal of Fixed Income,
September 1995.
[2] Brown R H, Schaefer M S, (2000). Why Long-Term Forward Rates (Almost)
Always Slope Downwards. London Business School working paper.
[3] Daigler, R.T. (1998). Comparing hedge ratio methodologies for fixed-income
investments. Working paper.
[4] Diebold and Rudebusch (2013). Yield Curve Modeling and Forecasting: The
Dynamic Nelson-Siegel Approach.
[5] Duffie, D. & Kan, R. (1996). A yield-factor model of interest rates. Mathematical
Finance, 6(4):379-406.
[6] Ederington, L.H. (1979). The Hedging Performance of the New Futures Markets.
The Journal of Finance, Vol. 34 No. 1 (March), pp. 157-170.
[7] Fisher, Mark. (2001). Forces that Shape the Yield Curve: Parts 1 and 2 (March
2001). FRB of Atlanta Working Paper No. 2001-3. Available at SSRN
[8] Fong, H.G., and O. Vasicek. (1983). The Tradeoff Between Return and Risk in
Immunized Portfolios. Financial Analysts Journal (September-October 1983),
pp. 73-78.
[9] Francq et al. (2010). GARCH Models: Structure, Statistical Inference and Financial
Applications.
[10] Gurkaynak, Refet S., Sack, Brian P. and Wright, Jonathan H. (2006). The U.S.
Treasury Yield Curve: 1961 to the Present. FEDS Working Paper No. 2006-28.
Available at SSRN
[11] Hamilton (1994). Time Series Analysis.
[12] Higham (2008). Functions Of Matrices: Theory and Computation.
[13] Ho, T.S.Y. (1992). Key Rate Durations: Measures of Interest Rate Risks. Journal
of Fixed Income, September 1992, pp. 29-44.
[14] Johnson, L. L. (1960). The Theory of Hedging and Speculation in Commodity
Futures. Review of Economic Studies, Vol. 27 No. 3, pp. 139-151
[15] Joseph A. Whitt, Jr. (1996). The Mexican Peso Crisis. Economic Review. Federal
Reserve Bank of Atlanta.
[16] Mark Carlson (2006). A Brief History of the 1987 Stock Market Crash with a
Discussion of the Federal Reserve Response. Finance and Economics Discussion
Series. Federal Reserve Board, Washington, D.C.
[17] Memmel, Christoph and Kempf, Alexander, (2006). Estimating the Global Min-
imum Variance Portfolio. Available at SSRN
[18] Meucci, Attilio (2009). Review of Statistical Arbitrage, Cointegration, and Mul-
tivariate Ornstein-Uhlenbeck. Available at SSRN
[19] Nelson, D.B. (1991). Conditional heteroskedasticity in asset returns: a new ap-
proach. Econometrica 59:347-370.
[20] Rebonato, Saroka and Putyatin (2014) Affine Principal-Component-Based Term
Structure Model. Available at SSRN
[21] Riccardo Rebonato (2014-2015) Private communication
[22] Vladimir Putyatin (2014-2015) Private communication
[23] Wurtz et al. Parameter Estimation of ARMA Models with GARCH/APARCH
Errors: An R and SPlus Software Implementation. Journal of Statistical Software.