Managing Swaption Risk with a Dynamic SABR Model

Amsterdam School of Economics
Faculty of Economics and Business
MSc in Econometrics, Financial Econometrics track

Frank de Zwart
10204245

Supervised by Dr. S.A. Broda
and, at ABN AMRO, by Ms. Hiltje Bijkersma

July 28, 2017

ABN AMRO Bank N.V. | CRM | Regulatory Risk | Model Validation
Statement of Originality
This document is written by Frank de Zwart, who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document are original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.
Abstract
This thesis focuses on models that can be used to estimate risk measures such as Value at Risk and Expected Shortfall. The displaced Black's model and the displaced SABR volatility model are used to price a portfolio of swaptions. The aim is to capture the dynamics of the SABR parameters in a time series model in order to obtain more accurate swaption risk estimates. This time series model is used to simulate the one-day-ahead profit and loss distribution, which is then compared to the Historical Simulation method. In an empirical study, we compute the Value at Risk and Expected Shortfall estimates based on the Historical Simulation method as well as the time series model. These models are analyzed with several backtests and diagnostic tests in order to answer the following research question: can one outperform the Historical Simulation Value at Risk and Expected Shortfall forecasts by fitting a time series model to the calibrated SABR model parameters instead?
A vector autoregressive model is used as well as a local level model. Based on these two models, we are not able to outperform the Historical Simulation estimates of the risk measures. Diagnostic tests show remaining significant autocorrelation as well as heteroskedasticity in the residuals of the vector autoregressive model. The backtests that are carried out likewise show that the vector autoregressive model performs worse than the Historical Simulation method.
Contents
1 Introduction
2 Preliminaries on financial notation
2.1 Interest rate instruments
2.2 Bootstrapping the zero curve
2.3 Swaptions
2.4 Martingales and Measures
3 Literature review
4 Models and method
4.1 Option pricing models
4.2 Time series analysis
4.3 Risk measurement
4.4 Backtests
5 Data
5.1 Calculating the implied volatilities
5.2 Leaving out some strikes
6 Empirical study and results
6.1 Calibrating the SABR model parameters
6.2 Fitting a model through the SABR parameters time series
6.3 Risk measurement
6.4 Backtests
6.5 Robustness check: Local level model
7 Conclusion
References
A Appendix
1 Introduction
The Basel Committee (2013) has introduced the Fundamental Review of the Trading Book (FRTB).
To contribute to a more resilient banking sector, they have decided to change the current framework’s
reliance on Value at Risk (VaR) to the Expected Shortfall (ES) measure to estimate market risk. On
the other hand, Pérignon and Smith (2010) state that most banks use Historical Simulation (HS) to estimate their VaR. The Historical Simulation method computes the VaR from past returns of the portfolio's present assets, so that one obtains the distribution of price changes that would have been realized had the current portfolio been held throughout the observation period. The decision described in the FRTB shows that it is becoming even more important for financial institutions to estimate their market risk accurately. At the same time, a relatively simple method is still used to obtain these risk measures. One of the main drawbacks of the Historical Simulation method is that it does not account for the decreasing predictive value of older returns.
Derivatives are traded extensively these days, and one of these products is a swap option, or swaption.
A swaption is an option on an interest rate swap. Swaptions are traded over-the-counter, so compared
to derivatives that are traded on an exchange, the information is more scarce and not publicly available.
This makes it an interesting challenge to find an accurate method to assess the risk of holding these
derivatives. Besides this, the negative interest rates also affect almost all the valuation methods for
these options. In the current interest rate environment, the Historical Simulation method that is used
to produce the VaR and ES estimates of market risk may not be reliable. Hence, finding a method to
get more reliable estimates for the VaR and ES, based on historical swaption data, is of interest.
This leads to the following research question, which defines the main purpose of this thesis: can one outperform the Historical Simulation Value at Risk and Expected Shortfall forecasts by fitting a time series model to the calibrated SABR model parameters instead? An empirical study is performed to answer this question. This study is based on an ICAP data set of swaption premiums, interest rate deposits, and interest rate swaps. The time series of the displaced SABR volatility model parameters, covering a time grid of approximately 2.5 years, are analyzed to obtain a one-day-ahead forecast of the price of a portfolio of swaptions. Finally, a backtesting procedure assesses the quality of this new method compared to the well-known Historical Simulation method.
The remainder of this research report is structured as follows. First, Section 2 discusses the necessary background theory for this research. This includes theory on interest rate instruments in general as well as a description of an interpolation method, called bootstrapping, that is used to obtain the zero and discount curves. We then describe swaptions and some of their relevant trading strategies. The section concludes with a description of martingales and measures. Section 3 briefly reviews the relevant literature for this research, and Section 4 continues with the theory that is used to price the swaptions. In that section, the well-known model of Black (1976) is described in detail. Besides this, we focus on the SABR volatility model of Hagan et al. (2002) and the correction to their work by Obłój (2008). We then discuss the implications of negative interest rates for these models. In Section 4.2, the basic time series models that are used in this research are discussed. We then continue with risk measurement concepts such as Value at Risk and Expected Shortfall. Finally, several different backtests are described; different backtests are used to assess the quality of our model estimates as thoroughly as possible. The data set is described in Section 5. Both the raw data and the different pre-processing techniques are explained. This section also contains information on some of the limitations and
argumentation on why some adjustments are made. Section 6 then describes the empirical study and results. This section follows the structure of Section 4: it starts with the calibrated SABR parameters and continues with the time series analysis, risk measurement, and the backtesting procedure. Besides the backtests themselves, several diagnostic tests are carried out to assess the quality of the fit of the time series analysis. The results are elaborated and discussed at every step. Finally, in Section 7, the main findings are summarized and a conclusion is drawn. The research question is answered and some limitations and recommendations for further research are provided.
2 Preliminaries on financial notation
Trading in derivatives has become an indispensable part of the financial industry. There are multiple
different derivatives for every type of investment asset. The magnitude of this market shows that it
is of great importance to understand how these derivatives work. Consequently, many researchers have focused on these derivatives. Numerous papers and books describe how derivatives
work and what risks the holder of an open position in them is taking. We will first explain some basic
but crucial concepts of interest rate instruments. Then in Section 4, we will describe the models and
methods that are applied in the empirical analysis of this research.
2.1 Interest rate instruments
Interest rates are crucial in the valuation of derivatives. Especially the ’risk-free’ rate is of concern
when evaluating derivatives. Hull (2012) explains that the interest rates implied by Treasury bills are
artificially low because of a favorable tax treatment and other regulations. For this reason the LIBOR
rate became commonly used instead. However, when rates rose sharply during the crisis of 2007, many derivatives dealers had to review their practices. The LIBOR rate is the short-term opportunity cost of capital of AA-rated financial institutions, and it spiked because banks were unwilling to lend to one another during the crisis. Many dealers have now switched to using overnight indexed swaps (OIS), because these are closer to being 'risk-free'. This research focuses on the Euro market and makes use of the Euro Interbank Offered Rate (Euribor), which is similar to LIBOR but is based on a panel of European banks: it is the rate at which these banks borrow funds from one another. Although Euribor is not theoretically risk-free, it is still considered a good benchmark against which to measure the risk and return trade-off.
There is one very important assumption that makes the risk-free rate even more crucial. This is
known as the assumption of a risk-neutral world. In a risk-neutral world it is assumed that all investors
are risk-neutral. In other words, they do not require a higher expected return from an investment that
is more risky.
Theorem 2.1. This leads to the following two characteristics of a risk-neutral world (Hull, 2012):
1. The expected return on an investment is the risk-free rate.
2. The discount rate used for the expected payoff on a financial instrument is the risk-free rate.
This makes the pricing of derivatives much simpler. The world is not actually risk-neutral; however, it can be shown that if we compute the price of a derivative under the risk-neutral world
assumption, we obtain the correct price for the derivative in all worlds. This makes a significant difference,
because there is still a lot unknown about the risk preferences of buyers and sellers of derivatives.
The main focus of this research is on swaptions. The underlying of this product is an interest rate
swap, and therefore this derivative will be discussed first. However, before swaps are considered, we briefly discuss forward rate agreements, which give insight into how a swap can be priced.
We will then continue with an interpolation method, known as bootstrapping, that is used to obtain
the zero curve. Then the swaption will be explained together with some of its most common trading
strategies. Finally we will continue with theory about martingales and measures. These measures are
used to compute the discounted expected value of a certain future payoff.
A forward rate agreement (FRA) is an agreement defined to ensure that a certain interest rate will
apply to either borrowing or lending a certain principal during a specified future period of time. We
define RK as the interest rate agreed to in the FRA and define RF as the forward of the reference rate
at time Tα for the period between times Tα and Tβ. We denote the value of an FRA at time t, where RK is received, as
$$V_{\mathrm{FRA}}(t) = L\,(R_K - R_F)(T_\beta - T_\alpha)\,P(t, T_\beta), \tag{2.1}$$
where L is the principal of the FRA and P(t,T) is the present value at time t of 1 Euro received at time
T (Brigo and Mercurio, 2007).
The forward interest rate that is used in FRAs is implied by current zero rates for periods of time in the future. An n-year zero rate is the rate of interest earned on an investment that starts today
and lasts for n years. All the interest and principal is realized at the end of n years. A curve of zero rates
can be created from market quotes by using a popular interpolation method known as the bootstrap
method. This method will be described in more detail in Section 2.2.
A fixed-for-floating swap, also known as a payer swap, is the most common type of swap. In this swap
an investor agrees to pay interest at a predetermined fixed rate on a notional principal for a predetermined
number of years. In return it receives interest at a floating rate on the same notional principal for the
same period of time. A swap can be characterized as a portfolio of forward rate agreements and this can
be used to determine its value. The value of the swap is simply the sum of multiple FRAs, so we find that the value of a payer swap is given by
$$V_{\mathrm{swap}}(t) = L \sum_{i=\alpha+1}^{\beta} (R_K - R_{F_i})(T_i - T_{i-1})\,P(t, T_i), \tag{2.2}$$
where the length of the swap, Tβ − Tα, is called the tenor, with n years between Tα and Tβ and m cash flows per year. Throughout this entire paper we denote by m the swap payment frequency per annum. So in total we have n × m cash flows, which can be valued like FRAs. This leads to the sum in (2.2), which runs over n × m different cash flows (Brigo and Mercurio, 2007).
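The swap-as-portfolio-of-FRAs valuation in (2.2) can be sketched directly. All rates, payment times, and discount factors below are hypothetical illustration values.

```python
def swap_value(L, R_K, fwd_rates, times, discount):
    """Value of the swap in (2.2) as a sum of FRA-like cash flows.
    times:     payment times T_alpha, ..., T_beta (length n*m + 1)
    fwd_rates: forward rates R_{F_i}, one per period (length n*m)
    discount:  discount factors P(t, T_i), one per payment time (length n*m)."""
    total = 0.0
    for i in range(len(fwd_rates)):
        tau = times[i + 1] - times[i]            # year fraction T_i - T_{i-1}
        total += (R_K - fwd_rates[i]) * tau * discount[i]
    return L * total

# Hypothetical 2-year swap with semi-annual payments (n = 2, m = 2 -> 4 cash flows).
times = [0.0, 0.5, 1.0, 1.5, 2.0]
fwd = [0.010, 0.012, 0.014, 0.016]
dfs = [0.995, 0.989, 0.982, 0.974]
print(round(swap_value(1_000_000, 0.013, fwd, times, dfs), 2))
```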
2.2 Bootstrapping the zero curve
Only spot rates are quoted in the market, so the bootstrapping method is used to obtain the forward
rates and forward swap rates. This method works by incrementally computing zero-coupon bonds in
order of increasing maturity.
As curve inputs, market quotes based on the Euribor rate are used first. More precisely, interest rate deposits with maturities varying from overnight up to 3 weeks are used. To expand
the time grid out to 30 years, swaps are used with a maturity varying from 1 month up to 30 years. The
interpolation over these market quotes will give us all zero-coupon prices, which we can use to compute
the forward rates and the forward swap rates.
Uri (2000) lists the payment frequencies, compounding frequencies, and day count conventions applicable to each currency-specific interest rate type. The Euro conventions are used in this research: the ACT/360 day count convention for Euro deposit rates and the 30/360 convention for Euro swap rates.
The deposit rates that are used for the time grid of the swap curve up to 3 weeks are inherently zero-
coupon rates. For this reason they only need to be converted to the base currency swap rate compounding
frequency and day count convention. The day count convention of the deposit is ACT/360, so we can
directly interpolate the data points to obtain the first part of the zero curve.
For the middle part of the curve one could use market quotes of forward rate agreements, as described by Uri (2000). This can be preferable because they carry a fixed time horizon to settlement and settle at maturity. However, FRAs can lack liquidity, which results in inaccurate market quotes. For this
reason only swaps and deposits are used. The annually compounded zero swap rate is used to construct
most of the zero curve. The different day count convention of the swaps is taken into account. The
discount rates are computed based on the deposit and swap rates. Brigo and Mercurio (2007) define the
zero curve at time t as a graph of the simply-compounded interest rates for maturities up to one year
and of the annually compounded rates for maturities larger than one year. The simply compounded and
the annually compounded interest rates are defined as follows:
$$L(t, T) = \frac{1 - P(t, T)}{(T - t)\,P(t, T)}, \tag{2.3}$$
$$Y(t, T) = \frac{1}{[P(t, T)]^{1/(T - t)}} - 1, \tag{2.4}$$
where L(t, T) represents the simply compounded interest rate at time t for maturity T and Y(t, T) represents the annually compounded interest rate. The simply compounded interest rates
that represent the first part of the zero curve are now combined with the annually compounded rates
that are used for the other part of the zero curve. To do so we first define R(ti) as the interest rate
corresponding to maturity ti, where i is the market observation index. Hence, R(ti) represents the simply
compounded interest rate if ti ≤ 1 and the annually compounded interest rate if ti > 1. There is no single
way to construct this complete zero curve correctly. It is, however, important that the derived yield curve is consistent, smooth, and closely tracks the observed market points. Uri (2000) also notes that over-smoothing the yield curve might eliminate valuable market pricing information.
Piecewise linear interpolation and piecewise cubic spline interpolation are two commonly used methods
that are appropriate for market pricing.
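Before turning to the interpolation methods in detail, note that the two compounding conventions (2.3) and (2.4) are straightforward to implement. A minimal sketch, with hypothetical discount factors:

```python
def simply_compounded(P, tau):
    """L(t, T) of (2.3): simple rate over year fraction tau = T - t."""
    return (1.0 - P) / (tau * P)

def annually_compounded(P, tau):
    """Y(t, T) of (2.4): annually compounded rate, used beyond one year."""
    return P ** (-1.0 / tau) - 1.0

# Hypothetical zero-coupon prices: a 6-month and a 5-year discount factor.
print(round(simply_compounded(0.998, 0.5), 6))   # roughly a 0.40% simple rate
print(round(annually_compounded(0.96, 5.0), 6))  # roughly a 0.82% annual rate
```

The round trip P = (1 + Y)^{-(T-t)} recovers the input discount factor, which is a useful sanity check when building the curve.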
The piecewise linear interpolation method is simple to implement, because the value of a new data
point is simply assigned according to its position along a straight line between observed market data
points. One drawback of this method, however, is that it produces kinks in areas where the yield curve changes slope. The piecewise linear interpolation can be constructed in closed form as follows:
$$R(t) = R(t_i) + \frac{t - t_i}{t_{i+1} - t_i}\,\bigl[R(t_{i+1}) - R(t_i)\bigr], \tag{2.5}$$
where $t_i \le t \le t_{i+1}$.
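Formula (2.5) is exactly what NumPy's `interp` implements; the maturities and zero rates below are hypothetical quotes, not the ICAP data.

```python
import numpy as np

# Hypothetical market maturities (years) and zero rates.
t_obs = np.array([0.5, 1.0, 2.0, 5.0, 10.0])
r_obs = np.array([-0.003, -0.002, 0.000, 0.004, 0.009])

def linear_zero_rate(t):
    """Piecewise linear interpolation of (2.5); np.interp computes
    R(t_i) + (t - t_i) / (t_{i+1} - t_i) * (R(t_{i+1}) - R(t_i))."""
    return np.interp(t, t_obs, r_obs)

print(linear_zero_rate(3.5))  # halfway between the 2y and 5y quotes
```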
To avoid the kinks produced by the linear method, one can choose to fit a polynomial function through
the observed market data points instead. It is possible to either use a single high-order polynomial or a
number of lower-order polynomials. The latter method is preferred, because the extra degrees of freedom
can be used to impose additional constraints to ensure smoothness of the curve. The piecewise cubic
spline technique goes through all observed data points and creates the smoothest curve that fits the
observations and avoids kinks.
We can construct a cubic polynomial for each of the n − 1 splines between the n market observations. Now let Qi(t) denote the cubic polynomial associated with the segment [ti, ti+1]:
$$Q_i(t) = a_i(t - t_i)^3 + b_i(t - t_i)^2 + c_i(t - t_i) + R(t_i), \tag{2.6}$$
where R(ti) again represents market observation point i and ti represents the time to maturity of market
observation i. With three coefficients per spline and n− 1 splines, we have 3n− 3 unknown coefficients
and we impose the following constraints:
$$\begin{aligned}
a_i(t_{i+1} - t_i)^3 + b_i(t_{i+1} - t_i)^2 + c_i(t_{i+1} - t_i) &= R(t_{i+1}) - R(t_i),\\
3a_{i-1}(t_i - t_{i-1})^2 + 2b_{i-1}(t_i - t_{i-1}) + c_{i-1} - c_i &= 0,\\
6a_{i-1}(t_i - t_{i-1}) + 2b_{i-1} - 2b_i &= 0,\\
b_1 &= 0,\\
6a_{n-1}(t_n - t_{n-1}) + 2b_{n-1} &= 0.
\end{aligned} \tag{2.7}$$
The first set of n − 1 constraints is imposed in order to force the polynomials to perfectly fit to each
other at the knot points. To also let the first and second order derivatives of the polynomials match, we
set the second and third sets of 2n − 2 constraints. Finally, two endpoint constraints are required to set the second derivative equal to zero at both ends. We end up with a linear system of 3n − 3 equations and 3n − 3
unknowns, which is solved to obtain the optimal piecewise cubic spline.
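The zero second derivatives at both endpoints in (2.7) make this a natural cubic spline. Assuming SciPy is available, `CubicSpline` with `bc_type='natural'` solves the same linear system; the quotes below are hypothetical illustration values.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical market maturities (years) and zero rates.
t_obs = np.array([0.5, 1.0, 2.0, 5.0, 10.0, 30.0])
r_obs = np.array([-0.003, -0.002, 0.000, 0.004, 0.009, 0.011])

# bc_type='natural' imposes a zero second derivative at both endpoints,
# matching b_1 = 0 and 6 a_{n-1}(t_n - t_{n-1}) + 2 b_{n-1} = 0 in (2.7).
spline = CubicSpline(t_obs, r_obs, bc_type='natural')

print(float(spline(3.5)))                             # smooth value between the knots
print(float(spline(0.5, 2)), float(spline(30.0, 2)))  # endpoint second derivatives, ~0
```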
Both methods are used and plotted in Figure 2.1 below. The main advantage of the linear interpo-
lation method is that it is closed form. The piecewise cubic spline interpolation method however takes
longer to compute. The figures show almost no difference between both methods. This relatively small
difference can be explained by the large number of data points and the smooth structure of the rates
with respect to their time to maturity.
[Figure 2.1 comprised four panels for 02-Jun-2016: the zero curve (market quotes) and the discount curve, each plotted against time to maturity from 0 to 30 years, once with the linear interpolated line and once with the cubic spline interpolated line.]
Figure 2.1: The interpolated zero and discount curves.
The zero curves as well as the discount curves are computed daily over the entire time grid. The
discount curves are used to price the swaptions and are obtained for a maturity up to 30 years. However,
we will only focus on swaptions with a maximum maturity of 10 years and a maximum underlying tenor
of the swap of 10 years.
As shown in Figure 2.1, the two interpolation methods differ very little. The computation time of the linear interpolation method is, however, significantly shorter. For this practical reason, and given the aim of
this research we have chosen to only use the linear interpolation method.
2.3 Swaptions
Swap options, or swaptions, are options on interest rate swaps. They give the holder the right to enter
into a certain interest rate swap at a certain time in the future. Depending on whether the swaption is a
call or a put option, we call it a payer swaption or a receiver swaption respectively. The swap rate of the
swap contract equals the strike of the swaption. In a payer swaption, the owner pays the fixed leg and
receives the floating leg and in a receiver swaption this is the other way around. For example, a ’2y10y’
European payer swaption with a strike of 1%, represents a contract in which the owner has the right to
enter a swap, with a tenor of ten years, in two years from now where he pays a fixed rate of 1%.
First we define the annuity factor A that is used for discounting:
$$A_{\alpha,\beta}(t) = \sum_{i=\alpha+1}^{\beta} (T_i - T_{i-1})\,P(t, T_i), \tag{2.8}$$
where we again have n × m cash flows, with n the number of years and m the number of cash flows per year. We also define the forward swap rate at time t for the set of times Ti, following Brigo and Mercurio (2007). The forward swap rate is the rate in the fixed leg of the interest rate swap that makes
the contract fair at the present time. We denote the forward swap rate as follows:
$$S_{\alpha,\beta}(t) = \frac{P(t, T_\alpha) - P(t, T_\beta)}{A_{\alpha,\beta}(t)}. \tag{2.9}$$
Now one can define the value of a payer swaption with strike K and resetting at Tα, . . . , Tβ−1 as follows:
$$V_{\mathrm{swaption}}(t) = A_{\alpha,\beta}(t)\,\mathbb{E}_t\bigl[(S_{\alpha,\beta}(T_\alpha) - K)^{+}\bigr]. \tag{2.10}$$
The value of the swaption clearly depends on the expected value of the difference between the forward
swap value and the strike rate. To obtain an arbitrage free price of a swaption, we need to define the
corresponding measure used to derive the expected value. This will be elaborated in more detail in
Section 2.4.
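The building blocks (2.8) and (2.9), and the payoff inside the expectation of (2.10), can be sketched as follows. Evaluating the expectation itself requires a model for the swap rate (Section 4); the discount factors and strike here are hypothetical illustration values.

```python
def annuity(times, discount):
    """A_{alpha,beta}(t) of (2.8): sum of year fractions times discount factors."""
    return sum((times[i + 1] - times[i]) * discount[i]
               for i in range(len(discount)))

def forward_swap_rate(P_t_Talpha, P_t_Tbeta, A):
    """S_{alpha,beta}(t) of (2.9)."""
    return (P_t_Talpha - P_t_Tbeta) / A

# Hypothetical 1y-into-2y swap with semi-annual payments.
times = [1.0, 1.5, 2.0, 2.5, 3.0]
dfs = [0.988, 0.982, 0.975, 0.967]     # P(t, T_i) for the payment dates
A = annuity(times, dfs)
S = forward_swap_rate(0.993, 0.967, A)

# Payoff inside the expectation of (2.10) for a payer swaption with strike K.
K = 0.01
payer_payoff = A * max(S - K, 0.0)
print(round(S, 6), round(payer_payoff, 6))
```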
Some swaptions and combinations of swaptions are briefly explained in this section because of their relevance for this research. An at-the-money (ATM) swaption is a swaption that has a strike equal to
the par swap rate of the underlying swap of the swaption. There are multiple trading strategies involving
swaptions. We define a straddle as the sum of an ATM payer swaption and an ATM receiver swaption
with the same ATM strike. If the interest rate is close to the strike rate at expiration of the options,
the straddle leads to a loss. However, if there is a sufficiently large move in either direction, a significant
profit will be the result.
We also define a strangle, which is the sum of a receiver swaption with a strike of ’ATM - offset’
and a payer swaption with a strike of ’ATM + offset’. The market normally refers to strangles as a ’2Y
into 10Y 100 out/wide skew strangle’ in which 100 is the width (in basis points) between the payer and
receiver strike and the offset from the ATM to both the payer and receiver swaption is thus width/2.
For example if we assume an ATM strike of 1%, the receiver strike is thus 0.5% and the payer strike is
1.5%. A strangle is a similar strategy to a straddle. The investor is betting that there will be a large
movement in the interest rate, but is uncertain whether it will be an increase or a decrease. If we compare the payoffs of both strategies, we see that the interest rate has to move farther in a strangle than
in a straddle for the investor to make a profit. However, the downside risk if the interest rate ends up at a central value is smaller with a strangle.
Finally, a collar is also defined. A collar is a payer swaption with a strike of ’ATM + offset’ minus
a receiver swaption with a strike of ’ATM - offset’. A collar is normally quoted as a ’2Y into 10Y 100
out/wide skew collar’ in which the width of 100 basis points is again the width between the payer and
receiver strike. So, you will pay floating if the swap rate is within the interval of ’ATM ± offset’ and pay
a fixed rate for the range of the swap rate outside this interval.
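The expiry payoffs of the three strategies above can be compared in a short sketch (per unit of annuity; the ATM strike and width are hypothetical). It shows that the rate must move farther in a strangle than in a straddle before the position pays off.

```python
def payer(S, K):       # payer swaption payoff at expiry (per unit annuity)
    return max(S - K, 0.0)

def receiver(S, K):    # receiver swaption payoff at expiry
    return max(K - S, 0.0)

atm, width = 0.01, 0.01            # hypothetical: ATM strike 1%, 100 bp wide
lo, hi = atm - width / 2, atm + width / 2

def straddle(S): return payer(S, atm) + receiver(S, atm)
def strangle(S): return payer(S, hi) + receiver(S, lo)
def collar(S):   return payer(S, hi) - receiver(S, lo)

for S in (0.000, 0.005, 0.010, 0.015, 0.020):
    print(f"S={S:.3f}  straddle={straddle(S):.4f}  "
          f"strangle={strangle(S):.4f}  collar={collar(S):+.4f}")
```

Note that the strangle payoff is zero anywhere inside [ATM − offset, ATM + offset], while the straddle pays |S − ATM| there; the collar changes sign across the interval.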
2.4 Martingales and Measures
The models that are used to price derivatives try to estimate the expected payoff of the derivative. These
models are based on a stochastic process, which is simply a variable whose value changes over time in
an uncertain way. Processes in which only the current value of a variable is relevant for predicting the future are called Markov processes. The Markov property is very useful, because it states that the future value of a variable is independent of the path it has followed in the past. This corresponds to the assumption of weak market efficiency: all the relevant information is captured in the current value of the variable (Hull, 2012).
We now focus on a particular kind of Markov process, known as a Wiener process (or a Brownian motion). Formally, we define a P-Wiener process as stated in the theorem below (Tsay, 2005).
Theorem 2.2. A real-valued stochastic process {Wt}t≥0 is a P-Wiener process if, for some real constant σ, under P,
1. for each s ≥ 0 and t ≥ 0 the random variable Wt+s − Ws has the normal distribution with mean zero and variance σ²t,
2. for each n ≥ 1 and any times 0 ≤ t0 ≤ t1 ≤ · · · ≤ tn, the random variables Wtr − Wtr−1 are independent,
3. W0 = 0,
4. Wt is continuous in t ≥ 0.
The probability measure P assigns a probability to each event A ∈ F. We can think of F as a collection of subsets of the entire sample space. Finally, Ft contains all the information about the evolution of the stochastic process up until time t.
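Properties 1 and 2 of Theorem 2.2 can be checked by simulation: a Wiener path is the cumulative sum of independent Gaussian increments with variance σ² dt. A minimal sketch, with a hypothetical σ, horizon, and grid:

```python
import numpy as np

rng = np.random.default_rng(42)
sigma, T, n_steps, n_paths = 1.0, 2.0, 500, 20_000
dt = T / n_steps

# Properties 1-2 of Theorem 2.2: independent N(0, sigma^2 * dt) increments.
increments = rng.normal(0.0, sigma * np.sqrt(dt), size=(n_paths, n_steps))
W = np.cumsum(increments, axis=1)   # W_t along each path; W_0 = 0 implicitly

# The sample variance of W_T should be close to sigma^2 * T and the mean close to 0.
print(round(W[:, -1].mean(), 3), round(W[:, -1].var(), 2))
```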
The price of a non-dividend paying stock is often modelled as a Geometric Brownian motion. Before
we define this Geometric Brownian motion we first define a standard Brownian motion as a Wiener
process with zero drift and a variance proportional to the length of the time interval. This corresponds
to a rate of change in the expectation that is equal to zero and a rate of change in the variance that
is equal to one. We now consider a generalized Wiener process, where the expectation has a drift rate
equal to µ and the rate of change in the variance is equal to σ2 (Tsay, 2005). This leads to the following
generalized Wiener process:
$$dx_t = \mu(x_t, t)\,dt + \sigma(x_t, t)\,dW_t, \tag{2.11}$$
where Wt is a standard Brownian motion. We then consider the modelled change in price of a non-dividend paying stock over time and this results in the following Geometric Brownian motion:
$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t \quad\Rightarrow\quad \frac{dS_t}{S_t} = \mu\,dt + \sigma\,dW_t, \tag{2.12}$$
where µ and σ are constant. Now Ito’s lemma can be used to derive the process followed by the logarithm
of St (Itô, 1951). First consider the general case for the continuous-time stochastic process xt of (2.11).
We also define G(xt, t) as a differentiable function of xt and t and find
$$dG = \left(\frac{\partial G}{\partial x}\,\mu(x_t, t) + \frac{\partial G}{\partial t} + \frac{1}{2}\,\frac{\partial^2 G}{\partial x^2}\,\sigma^2(x_t, t)\right)dt + \frac{\partial G}{\partial x}\,\sigma(x_t, t)\,dW_t. \tag{2.13}$$
We then apply Ito’s lemma to obtain a continuous-time model for the logarithm of the stock price. The
differentiable function is now defined as G(St, t) = ln(St). This leads to
$$d\ln S_t = \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dW_t. \tag{2.14}$$
This stochastic process has a constant drift rate of µ− σ2/2 and a constant variance of σ2. This implies
that the price of a stock at some future time T is log-normally distributed, given the current value of
the stock at time t
$$\ln S_T \sim \phi\!\left[\ln S_t + \left(\mu - \frac{\sigma^2}{2}\right)\Delta,\; \sigma^2\Delta\right], \tag{2.15}$$
where ∆ is the fixed time interval T − t. Black’s model is based on this lognormal property together
with the property of a risk-neutral world as will be further explained in Section 4.1.
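Equations (2.12) through (2.15) suggest a direct simulation check: draw ln S_T exactly via (2.14) and compare the sample moments with (2.15). A sketch with hypothetical parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
S0, mu, sigma, Delta, n_paths = 100.0, 0.05, 0.2, 1.0, 200_000

# Exact simulation over one interval Delta using (2.14):
# ln S_T = ln S_t + (mu - sigma^2 / 2) * Delta + sigma * sqrt(Delta) * Z.
Z = rng.standard_normal(n_paths)
log_ST = np.log(S0) + (mu - 0.5 * sigma ** 2) * Delta + sigma * np.sqrt(Delta) * Z

# Compare against the lognormal moments in (2.15).
print(round(log_ST.mean(), 3))  # close to ln(100) + 0.03
print(round(log_ST.var(), 3))   # close to sigma^2 * Delta = 0.04
```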
In order to be able to normalize different asset prices, one can use a numeraire Z as reference asset.
A numeraire is defined as any positive non-dividend-paying asset. A key result used in the pricing of
derivatives is the relation between the concept of absence of arbitrage and the existence of a probability
measure like the martingale measure (or risk-neutral measure). Brigo and Mercurio (2007) denote this
relation as follows based on a numeraire Z
$$\frac{S_t}{Z_t} = \mathbb{E}^{Z}\!\left[\frac{S_T}{Z_T}\,\Big|\,\mathcal{F}_t\right], \quad 0 \le t \le T, \tag{2.16}$$
where the price of any traded asset S (without intermediate payments) relative to Z is a martingale under the probability measure QZ. This probability measure Q is equivalent to the real-world probability measure P. A martingale is a zero-drift stochastic process, so under probability measure QZ we have, for a sequence of random variables S0, S1, . . . ,
$$\mathbb{E}^{Z}[S_i \mid S_{i-1}, S_{i-2}, \ldots, S_0] = S_{i-1}, \qquad \forall\, i > 0. \tag{2.17}$$
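The defining property (2.17) can be illustrated with the simplest example of a martingale, a symmetric random walk: conditional on the previous value, the expected next value equals the previous value. A small simulation sketch (all numbers hypothetical):

```python
import numpy as np

rng = np.random.default_rng(7)
n_paths, n_steps = 100_000, 5

# Symmetric +/-1 steps form a discrete-time martingale.
steps = rng.choice([-1.0, 1.0], size=(n_paths, n_steps))
S = np.cumsum(steps, axis=1)

# Empirical version of (2.17): E[S_i | S_{i-1} = s] should equal s.
s_prev, s_next = S[:, 2], S[:, 3]
for s in (-1.0, 1.0, 3.0):
    cond = s_prev == s
    print(s, round(s_next[cond].mean(), 2))  # conditional mean close to s
```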
The preferred numeraire depends on the derivative that is priced. Two frequently used numeraires are now briefly described: first a numeraire based on the zero-coupon bond, and second a numeraire based on the annuity of a swap.
A zero-coupon bond, with a maturity T equal to that of the derivative, is commonly used as a
numeraire. We denote the value of this numeraire at time t as Zt and note that ZT = P (T, T ) = 1. We
also denote the measure associated with this numeraire as the T-forward measure QT with expectation
ET . This way we are able to price a derivative by computing the expectation of its payoff under this
measure. This leads to the following price of a derivative at time t
$$V(t) = P(t, T)\,\mathbb{E}^{T}\!\left[\frac{V(T)}{P(T, T)}\,\Big|\,\mathcal{F}_t\right] = P(t, T)\,\mathbb{E}^{T}[V(T)], \tag{2.18}$$
for 0 ≤ t ≤ T (Brigo and Mercurio, 2007). Notice that the forward rate is a martingale under this measure, which makes the forward measure convenient to work with.
The annuity of a swap is a linear combination of zero-coupon bonds. A numeraire is defined as a
positive non-dividend paying asset, so the annuity of a swap can also be used as a numeraire. The
numeraire in this case will be the following portfolio of zero coupon bonds:
Z_T = A_{α,β}(T) = Σ_{i=α+1}^{β} (T_i − T_{i−1}) P(T, T_i),    (2.19)
which leads to the swap measure Q^{α,β}. Under this measure we find that the swap rate S_{α,β}(t) is a
martingale:
S_{α,β}(t) = ( P(t, T_α) − P(t, T_β) ) / A_{α,β}(t),    (2.20)

⇒  ( P(t, T_α) − P(t, T_β) ) / Z_t = E^{α,β}[ ( P(T, T_α) − P(T, T_β) ) / Z_T | F_t ],    0 ≤ t ≤ T.    (2.21)
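As an illustration of (2.19) and (2.20), the annuity and par swap rate can be computed directly from zero-coupon bond prices. The flat 2% discount curve and the payment schedule below are illustrative assumptions, not market data:

```python
import math

def annuity(discounts, tenors):
    """A_{alpha,beta}(t) = sum_i (T_i - T_{i-1}) P(t, T_i) over the fixed-leg dates."""
    intervals = zip(tenors[:-1], tenors[1:])
    return sum((t1 - t0) * p for (t0, t1), p in zip(intervals, discounts[1:]))

def par_swap_rate(discounts, tenors):
    """S_{alpha,beta}(t) = (P(t, T_alpha) - P(t, T_beta)) / A_{alpha,beta}(t)."""
    return (discounts[0] - discounts[-1]) / annuity(discounts, tenors)

# Toy example: flat 2% continuously compounded curve, annual payments T_alpha = 1y .. T_beta = 6y.
tenors = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
discounts = [math.exp(-0.02 * T) for T in tenors]   # P(0, T_i)
S = par_swap_rate(discounts, tenors)
```

On a flat continuously compounded curve the resulting par rate is close to the annually compounded equivalent of the flat rate, which is a quick sanity check on the implementation.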
These numeraires and their related measures are used in arbitrage-free pricing, which is an essential part
of the option pricing models that are used. These models and their assumptions are further explained in
Section 4.1. However, first we will review several studies that are relevant for this research in the next
section.
3 Literature review
This study combines several methods with different underlying assumptions. First of all, the risk-neutral
world assumption is used and both the Black model and the SABR volatility model are used to price a
swaption on an interval of strike rates. When pricing these swaptions we also take negative interest rates
into account by using a displacement parameter. The risk measures Value at Risk and Expected
Shortfall are then computed. The underlying profit and loss distribution is estimated under real-world
probabilities to obtain valid risk measures. Hence, we bridge the Q-measure and
the P-measure. To obtain the forecasts of the risk measures, we use two different methods. The quality
of the estimates of these two methods is evaluated by several different backtests. The methods that are
used differ in several ways. The Historical Simulation method, for instance, gives all historical returns
in the estimation window an equal weight and uses them to construct the profit and loss distribution
of the portfolio. The time series analysis that is performed on the other hand simulates one-day-ahead
forecasts of the SABR model parameters. These SABR parameters represent the characteristics of the
volatility structure of the individual swaptions. As a result of this we estimate the risk measures based
on one-day-ahead simulations of this volatility structure instead of the portfolio value itself.
We will now discuss some studies that have also focused on the aspects that we are looking at.
Pérignon and Smith (2010), for instance, compare the disclosed quantitative information and VaR
estimates of up to fifty international commercial banks in their paper. They use panel data over the
period 1996-2005 and find that VaR estimates are in general excessively conservative and also note that
there is no improvement in the estimates of the VaR over time. Besides this, they find that the most
popular VaR method is the Historical Simulation method. Then they also conclude that this method
helps little in forecasting the volatility of future trading revenues. Pérignon and Smith (2010) use the
Unconditional Coverage test of Kupiec (1995) to test whether the proportion of VaR violations equals
the desired proportion p. The number of VaR violations that is found is extremely small and the null
hypothesis of unconditional coverage is rejected for every year except for 1998 at the 5% significance level.
This study clearly shows the relevance of finding an improved and less conservative method to estimate
risk measures like the Value at Risk.
There are multiple improvements proposed in the literature that are based on the Historical Simula-
tion method such as the Filtered Historical Simulation method (FHS) as described by Barone-Adesi et al.
(2002). While the Historical Simulation method was found to be excessively conservative by Pérignon
and Smith (2010), it is also known to underestimate risk in some particular situations. This is because
the method is based on the assumption that the risks do not change over time. Hence, when the market
conditions change and the market becomes more volatile, the risk is underestimated by the method. This
can fortunately be solved by first standardizing the historical returns and then scaling them to the current
volatility as is done with the Filtered Historical Simulation method. In this method a GARCH model
is fitted to the historical data and the residuals are divided by their corresponding volatility estimates.
These standardized residuals are then randomly drawn and used to simulate the one-day-ahead profit
and loss distribution. Even though this method overcomes a shortcoming of the Historical Simulation
method it still needs some care. According to the work of Gurrola and Murphy (2015) the filtering
process changes the return distribution in ways that may not be intuitive. Furthermore, care is needed
in choosing the applications in which the FHS method is used, and re-calibration and re-testing are
essential to ensure that the model remains relevant. Finally, Pritsker (2001) also shows
that one has to be careful when dealing with limited data sets. He shows for example that two years of
historical data is not sufficient for the FHS method to estimate the Value at Risk accurately at a 10-day
horizon.
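The filtering step described above can be sketched as follows. This is a minimal illustration with simulated returns and fixed, illustrative GARCH(1,1) parameters; a real application would estimate ω, a and b by maximum likelihood, as Barone-Adesi et al. (2002) do:

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.standard_normal(500) * 0.01          # placeholder historical P&L series

# GARCH(1,1) variance recursion with illustrative fixed parameters.
omega, a, b = 1e-6, 0.08, 0.90
sigma2 = np.empty_like(returns)
sigma2[0] = returns.var()
for t in range(1, len(returns)):
    sigma2[t] = omega + a * returns[t - 1] ** 2 + b * sigma2[t - 1]

std_resid = returns / np.sqrt(sigma2)              # filtering: standardize by fitted vol

# One-day-ahead variance forecast, then rescale bootstrapped standardized residuals.
sigma2_next = omega + a * returns[-1] ** 2 + b * sigma2[-1]
simulated_pnl = np.sqrt(sigma2_next) * rng.choice(std_resid, size=10_000)

var_99 = -np.quantile(simulated_pnl, 0.01)         # 99% one-day FHS VaR (loss positive)
```

The rescaling step is what makes FHS responsive to current market volatility, in contrast to the equal-weight Historical Simulation method.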
The Historical Simulation method is based on historical returns; to obtain these returns, however, we
first need to price the swaptions. There are numerous models that can be used to price a swaption.
However, we also need to take the smile risk into account due to the fact that the volatility of the
swaptions differs for different strike rates. To capture this smile risk in the derivatives market Hagan
et al. (2002) introduce the SABR volatility model. West (2005) calibrates the parameters of the SABR
model in a situation where input data is very scarce. The calibration is based on equity futures which are
traded at the South African Futures Exchange. The study focuses on packages of options that combine
multiple derivatives, like a collar or a butterfly. Some of these packages are traded about 800 times in
total, while there are more than double that number of strike combinations. West (2005)
compares two cases. First, he estimates all of the SABR model parameters daily and then in the second
case he keeps one of the parameters (β) fixed while he still estimates the other parameters daily. This is
because hedging efficiency can be ensured by changing the parameters only once a month while changing
the input values of F and σATM daily. West (2005) finds that the calibrated parameters of the model
only change infrequently when the value for β is fixed. Strictly speaking, the parameters always change
when viewed at very high precision, but they remain unchanged up to a fairly high precision. For this reason, he finds that keeping
the value for β fixed leads to an infrequent change of the other SABR parameters. These infrequent
changes result in the end in lower hedging costs. Hence, this research shows a robust algorithm to capture
the volatility smile based on the SABR model while the input data is very scarce and also shows the
advantages of keeping the parameter β fixed.
Bogerd (2015) also uses the SABR volatility model, but he combines it with the Historical Simulation
method. He focuses specifically on the volatility structure of swaptions. He uses daily observations of
the calibrated SABR model parameters and also uses a displacement parameter to deal with negative
interest rates. He simulates 1000 one-day-ahead estimates of the profit and loss distribution based on
historical changes in the SABR model parameters. A distinction is made here between the curvature
and the level of the volatility structure. Only varying one of the SABR parameters (i.e. α) results in
just a vertical shift of the volatility skew. Bogerd (2015) notes that this is a reasonable approximation,
because most of the variation in the swaption volatility over time is caused by vertical movements of the
volatility smile. He performs an unconditional coverage test as well as an independence test and only
rejects the independence property for the Historical Simulation method applied to all of the SABR model
parameters. The independence property is tested here with the backtest of Du and Escanciano (2015),
which is based on the Ljung-Box statistic. These results imply that there are possibilities to obtain valid
forecasts of the risk measures based on estimates of the one-day-ahead volatility structure. We note
however that the SABR parameters that represent the volatility structure are dependent on each other
as described in Section 4.1.2. When dealing with such a time series of interdependent parameters, a
multivariate time series model can be used to capture the dynamics of the parameters over time. This
makes it interesting to investigate whether it is possible to improve the one-day-ahead forecasts of the
volatility structure by using a time series analysis.
There are however some difficulties when applying the Historical Simulation method on the SABR
model parameters. Moni (2014) explains that it is questionable if it is meaningful to add past changes
in the SABR parameters to their current values. A change in the SABR parameters changes the entire
volatility structure. Such a change may not always be valid, especially if the values of the historical
SABR parameters are significantly different from the current values of the SABR parameters. For this
reason, the Historical Simulation method will not be applied to the SABR parameters in this study. We
will compare estimated risk measures of the Historical Simulation method based on the portfolio returns
with the estimated risk measures based on a time series analysis of the SABR model parameters.
In this study we make use of two different measures, each with its own underlying assumptions.
The risk-neutral world assumption makes it possible for us to compute the expected value of future
payoffs without having to deal with the different risk preferences of buyers and sellers of derivatives.
Giordano and Siciliano (2013) clarify in their paper that this risk-neutral hypothesis is acceptable for
pricing derivatives. However, they also note that the risk-neutral assumption cannot be used to forecast
the future value of a financial product. So, if we estimate the one-day-ahead value of a swaption we need
to take the risk premium into account. Hence, we compute the estimated profit and loss distribution
based on the real-world probability measure P. Therefore we use the risk-neutral world assumption only
to compute the volatility structure of the derivatives based on the quoted historical swaption premiums.
These volatility structures are then used together with the risk-neutral assumption to price the swaptions
up to and including the last day of the estimation window. The methods that are then used to estimate
the one-day-ahead profit and loss distribution do not depend on the risk-neutral assumption. The one-
day-ahead forecasts of the price of the swaptions are estimated based on the real-world probabilities.
The risk measures are then computed based on these estimates of the profit and loss distribution.
The adequacy of the forecasts based on these models will be assessed by several backtests. Piontek
(2009) reviews various backtests that assess the quality of models that produce VaR estimates. He
analyzes some commonly used backtesting methods in his research and focuses on the problems regarding
limited data sets and low power of the tests. The simulations are performed for different sample sizes
with the number of observations between 100 and 1000. He finds a low power for the backtest of Kupiec
(1995) for all of these sample sizes. For example, with 250 observations and an inaccurate model that
produces 3% or 7% violations instead of the chosen tolerance level of 5%, the backtest rejects the model
in only 35% of the draws; in other words, such an inaccurate model goes undetected in 65% of the cases
at a 5% significance level. A low power is also found for other backtests, which shows that we cannot
assume that a model is correct merely because it is not rejected by a
backtest. In the empirical study of this research we also have to deal with a limited backtesting sample
size of 363 observations. For this reason, we apply numerous different backtests that enable us to assess
the quality of our methods more extensively.
In the next section we will first discuss the models and methods that are used in the empirical part
of this research. We will then continue with a description of the data and then also discuss the results
of the empirical study.
4 Models and method
The SABR volatility model that is used will be explained in more detail in Section 4.1.2. It will be
used to convert the quoted market swaption premiums into a volatility surface that allows us to price
swaptions for arbitrary non-quoted strikes. This will be done for a selected combination of the expiry
and tenor, so not the entire surface will be taken into account.
4.1 Option pricing models
Under the right corresponding measure, we have seen that both the forward rate as well as the swap
rate are martingales. In this research we use the Euribor forward rate, which is a martingale under the
forward measure Q^T. Forward swap rates are likewise martingales under their measure Q^{α,β}. The option pricing models are based on the following stochastic process
dFt = c(t, . . . )dWt. (4.1)
Here W_t is a Brownian motion and the coefficient c(t, . . .) can be deterministic or random. Note that the dynamics do
not have a drift term, since the forward rate is a martingale under its corresponding measure.
4.1.1 Black’s model
Black (1976) introduced a model which gives a closed form solution for the price of an option under the
assumption that price movements of the forward rate Ft follow a log-normal distribution. The dynamics
in Black’s model depend on the current value of the forward rate Ft and one parameter σB called Black’s
volatility and are given by the following equation
dF_t = σ_B F_t dW_t,    F_0 = F > 0.    (4.2)
The standard continuous-time stochastic process is given in (2.11). Notice that the drift parameter
μ has dropped out of Black's differential equation, which implies that the equation is independent of risk
preferences. Black, Scholes and Merton use in their analysis the fact that a riskless portfolio can be set up from
the stock and the derivative. This portfolio is riskless only for an instantaneously short period, but it can be
rebalanced frequently. One can therefore assume that investors are risk-neutral and use the following
results: the expected return on all securities is the risk-free interest rate r, and the present
value of any cash flow can be obtained by discounting its expected value at the risk-free rate (Tsay,
2005).
The expected payoff of a European call option on a futures contract under the forward measure is

E^T[ max(V(T) − K, 0) ],    (4.3)

where E^T denotes the expected value under the forward measure and V(T) is the value of the underlying
of the option at time t = T. We denote the price of this call option at time t as

c_t = P(t, T) E^T[ max(V(T) − K, 0) ].    (4.4)
Using the dynamics of (4.1), the following well-known solution for the price of a European call option
on a futures contract can be derived
c_0(F_0, K, T; σ_B) = P(0, T) [ F_0 Φ(d_1) − K Φ(d_2) ],

d_1 = ( ln(F_0/K) + σ_B² T/2 ) / ( σ_B √T ),

d_2 = d_1 − σ_B √T.    (4.5)
Besides this general formula, one can also compute the price of a payer swaption with Black’s formula,
as described in Hull (2012)
d_1 = ( ln(S_{α,β}(T_α)/K) + σ² T/2 ) / ( σ √T ),

d_2 = d_1 − σ √T,

V_Swaption(t) = L A_{α,β}(t) [ S_{α,β}(T_α) N(d_1) − K N(d_2) ],    (4.6)
where L is the notional principal value of the contract. In this formula the swap rate is used instead of
the discounted futures price; based on this swap rate and the swap measure, we can price a swaption in
a manner similar to an option on a futures contract.
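The two pricing formulas above translate directly into code. The sketch below uses illustrative inputs (a 1-year ATM payer swaption on a 2% swap rate, 30% lognormal volatility, an assumed annuity of 4.6 and a notional of one million), not values from this study:

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_call(F, K, T, sigma, discount=1.0):
    """Black's formula (4.5): P(0,T) [F Phi(d1) - K Phi(d2)]."""
    d1 = (math.log(F / K) + 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return discount * (F * norm_cdf(d1) - K * norm_cdf(d2))

def payer_swaption(L, A, S, K, T, sigma):
    """Black payer swaption price (4.6): L A(t) [S N(d1) - K N(d2)]."""
    return L * A * black_call(S, K, T, sigma)

price = payer_swaption(L=1_000_000, A=4.6, S=0.02, K=0.02, T=1.0, sigma=0.30)
```

Note that the swaption price is simply a forward Black call scaled by the annuity, mirroring the change of numeraire to Q^{α,β} discussed in Section 2.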
4.1.2 SABR volatility model
One of the assumptions of the Black model is that a fractional change in the futures price over any
interval follows a lognormal distribution (Black, 1976). If this assumption is violated, some of the
resulting prices change accordingly. If, for example, the probability of a large positive movement in the
interest rate would actually be significantly higher than implied by the lognormal property, this would
lead to a higher expected payoff of an out-of-the-money (OTM) payer swaption with a strike rate in
this region. The corresponding price of such a swaption will subsequently also need to be higher than
the price based on the lognormal assumption. This phenomenon is observed in the market and leads to
a volatility that varies for different strike rates, as opposed to the constant Black’s volatility. For this
reason, we introduce a volatility model to take this volatility skew into account.
The Stochastic Alpha Beta Rho model, like derived by Hagan et al. (2002), is given by a system of
two stochastic differential equations. The state variables Ft and αt are defined as the forward interest
rate and a volatility parameter respectively. The dynamics of the model are as follows
dF_t = α_t F_t^β dW_t^(1),    F_0 = F > 0,

dα_t = ν α_t dW_t^(2),    α_0 = α > 0,

dW_t^(1) dW_t^(2) = ρ dt,    (4.7)
where the power parameter β ∈ [0, 1] and ν > 0 is the volatility of α_t, i.e. the volatility of the volatility
of the forward rate. dW_t^(1) and dW_t^(2) are two ρ-correlated Brownian motions. The factors F and α are
stochastic, while the parameters β, ρ and ν are not.
West (2005) describes the parameters in more detail. α is a 'volatility-like' parameter: not equal to
the volatility, but there is a functional relationship between this parameter and the at-the-money
volatility. Including the constant ν acknowledges that volatility obeys well-known clustering in time.
The parameter β ∈ [0, 1] defines the relationship between the futures spot and the at-the-money volatility. A
value of β close to one indicates that the user believes that if the market were to move up or down in an
orderly fashion, the at-the-money volatility level would not be affected significantly, whereas a value of
β ≪ 1 indicates that if the market were to move, the at-the-money volatility would move in the opposite
direction; the closer β is to zero, the more distinct this effect. Moreover, the value of β also gives insight
into the distribution of the underlying: if β is close to one the stochastic model is said to be more
lognormal, and the closer β is to zero the more closely it follows the normal distribution instead.
Hagan et al. (2002) show that the price of a vanilla option under the SABR model is given by the
appropriate Black’s formula, provided the correct implied volatility is used. For given α, β, ρ, ν and τ ,
this volatility is given by
σ(K, F, τ) = α / { (FK)^((1−β)/2) [ 1 + ((1−β)²/24) ln²(F/K) + ((1−β)⁴/1920) ln⁴(F/K) ] }
             × ( 1 + [ ((1−β)²/24) α²/(FK)^(1−β) + ρβνα / (4 (FK)^((1−β)/2)) + ((2−3ρ²)/24) ν² ] τ ) × z/χ(z),    (4.8)

where z = (ν/α) (FK)^((1−β)/2) ln(F/K),    (4.9)

and χ(z) = ln( ( √(1 − 2ρz + z²) + z − ρ ) / (1 − ρ) ),    (4.10)
for an option with strike K, given that the current value of the forward price is F . Here we note that in
our case we have that the forward value is equal to the par swap rate. Hence, we have F = Sα,β(Tα) and
note that if F = K the swaption is said to be at-the-money. For the ATM strike rate, the factor z/χ(z)
drops out of the equation, because in the limit K → F we have z/χ(z) → 1. So for the at-the-money
volatility, one can rewrite the equation as follows
σ_ATM(F, τ) = [ ((1−β)²τ / (24 F^(2−2β))) α³ + (ρβντ / (4 F^(1−β))) α² + (1 + ((2−3ρ²)/24) ν²τ) α ] / F^(1−β),    (4.11)
where τ is the year fraction to maturity. This formula is closed form, which makes the model very
convenient for the pricing of an option.
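Equation (4.11) can be implemented directly; the parameter values in the example below are illustrative, not calibrated:

```python
def sabr_atm_vol(alpha, beta, rho, nu, F, tau):
    """At-the-money SABR implied volatility, eq. (4.11)."""
    Fb = F ** (1.0 - beta)
    correction = ((1.0 - beta) ** 2 / 24.0 * alpha ** 2 / Fb ** 2
                  + rho * beta * nu * alpha / (4.0 * Fb)
                  + (2.0 - 3.0 * rho ** 2) / 24.0 * nu ** 2)
    return alpha / Fb * (1.0 + correction * tau)

# Illustrative parameter values, not calibrated ones.
vol = sabr_atm_vol(alpha=0.02, beta=0.5, rho=-0.2, nu=0.4, F=0.03, tau=1.0)
```

A useful sanity check is that for β = 1 and ν = 0 the correction vanishes and the ATM volatility collapses to α, the lognormal Black case.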
There is however one main drawback of Hagan's formula: it is known to produce wrong prices in the
region of small strikes for large maturities. Obłój (2008) therefore proposes an improvement to the
original formulas that compute the volatility as defined by Hagan et al.
(2002). In his paper he gives several arguments to use the formula derived by Berestycki et al. (2004).
To understand why we use the formula of Berestycki et al. (2004), we consider the Taylor expansion of
the implied volatility surface
σ(K, F, τ) = σ_0(K, F) ( 1 + σ_1(K, F) τ ) + O(τ²).    (4.12)
Obłój (2008) then compares the explicit expressions of Hagan et al. (2002) and Berestycki et al. (2004)
for σ0(K,F ) and σ1(K,F ). It can be shown that both expressions for σ0(K,F ) and σ1(K,F ) are exactly
the same when either K = F , ν = 0 or β = 1. However, when β < 1 the results of σ0(K,F ) of the
two papers differ and Obłój (2008) argues that the formula of Berestycki et al. (2004) is correct and
should be used. This conclusion is based on two arguments. First, Hagan's formula is inconsistent
as β → 0. Secondly, the formula suggested by Obłój (2008) produces, in most cases, correct prices in
the region of small strikes for large maturities, unlike Hagan’s formula.
The formula for the implied volatility is now obtained by combining σ0(K,F ) from Berestycki et al.
(2004) and σ1(K,F ) from Hagan et al. (2002). We define the fine-tuned implied volatility as follows
σ(K, F, τ) = [ ν ln(F/K) / χ(z) ] × ( 1 + [ ((1−β)²/24) α²/(FK)^(1−β) + ρβνα / (4 (FK)^((1−β)/2)) + ((2−3ρ²)/24) ν² ] τ ),    (4.13)

where z = (ν/α) ( F^(1−β) − K^(1−β) ) / (1 − β),    (4.14)

and χ(z) = ln( ( √(1 − 2ρz + z²) + z − ρ ) / (1 − ρ) ),    (4.15)
which is used instead of (4.8) if there is reason to assume that β < 1.
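A minimal sketch of the fine-tuned formula (4.13)–(4.15), again with illustrative parameter values; it is only intended for K ≠ F and β < 1, since the factor involving χ(z) needs its limit treatment at the money:

```python
import math

def sabr_vol_obloj(alpha, beta, rho, nu, F, K, tau):
    """Fine-tuned SABR implied volatility, eqs. (4.13)-(4.15), for K != F, beta < 1."""
    z = nu / alpha * (F ** (1 - beta) - K ** (1 - beta)) / (1 - beta)            # (4.14)
    chi = math.log((math.sqrt(1 - 2 * rho * z + z * z) + z - rho) / (1 - rho))   # (4.15)
    fk = (F * K) ** ((1 - beta) / 2)
    correction = 1 + ((1 - beta) ** 2 / 24 * alpha ** 2 / fk ** 2
                      + rho * beta * nu * alpha / (4 * fk)
                      + (2 - 3 * rho ** 2) / 24 * nu ** 2) * tau
    return nu * math.log(F / K) * correction / chi                               # (4.13)

# Near the money the result should approach the ATM level alpha / F^(1-beta).
vol = sabr_vol_obloj(alpha=0.02, beta=0.5, rho=-0.2, nu=0.4, F=0.03, K=0.031, tau=1.0)
```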
We will now discuss the method that is used to calibrate the SABR model parameters. In the
empirical part of this research we will only find values of β < 1, so as a result we will only work with
(4.13) instead of (4.8). Nevertheless Obłój (2008) showed that the expressions from Hagan et al. (2002)
and Berestycki et al. (2004) are exactly the same for the volatility of an at-the-money swaption. For this
reason, (4.11) remains valid. We now follow the steps from West (2005) and notice the following relation
ln σ_ATM = ln α − (1 − β) ln F + . . . ,    (4.16)
so the right value of β can be estimated from a log-log plot of σATM and F . Hagan et al. (2002) suggest
that it is appropriate to fit this parameter in advance and never change it. So the appropriate value for
β is chosen first. Then (4.11) is inverted to obtain an expression of α in the other SABR parameters and
the at-the-money volatility. This is done by setting the equation equal to zero and selecting the smallest
positive real root. In the final step we minimize the difference between the market volatilities and the
volatilities computed with the SABR model
min_{ρ,ν} | σ_M − σ_SABR(α, β, ρ, ν, τ) |,    (4.17)

where β has already been estimated and α = α(σ_ATM, β, ρ, ν, τ) follows from the inverted at-the-money
relation. The time to maturity τ is also known, so we calibrate ρ and ν by minimizing this difference.
In this method, we calibrate the parameters so that the
produced at-the-money volatilities are exactly equal to the market quotes. The at-the-money volatilities
are important to match, because they are traded most frequently. Finally, when all of the parameters
are calibrated and we have estimated the SABR volatility for a swaption, we can use (4.6) to price this
swaption. The steps to calibrate the SABR model parameters are all applied and described in more
detail in Section 6.1.
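The inversion of (4.11) for α, the middle calibration step, amounts to selecting the smallest positive real root of a cubic. The round trip below uses illustrative parameters; the subsequent minimization over ρ and ν would wrap this inversion inside an optimizer:

```python
import numpy as np

def atm_vol(alpha, beta, rho, nu, F, tau):
    """Forward evaluation of eq. (4.11)."""
    Fb = F ** (1 - beta)
    return alpha / Fb * (1 + ((1 - beta) ** 2 / 24 * alpha ** 2 / Fb ** 2
                              + rho * beta * nu * alpha / (4 * Fb)
                              + (2 - 3 * rho ** 2) / 24 * nu ** 2) * tau)

def alpha_from_atm(sigma_atm, beta, rho, nu, F, tau):
    """Invert (4.11) for alpha: smallest positive real root of the cubic."""
    Fb = F ** (1 - beta)
    coeffs = [(1 - beta) ** 2 * tau / (24 * Fb ** 2),     # alpha^3 coefficient
              rho * beta * nu * tau / (4 * Fb),           # alpha^2 coefficient
              1 + (2 - 3 * rho ** 2) / 24 * nu ** 2 * tau,  # alpha coefficient
              -sigma_atm * Fb]                            # constant term
    roots = np.roots(coeffs)
    real_pos = roots.real[(np.abs(roots.imag) < 1e-9) & (roots.real > 0)]
    return real_pos.min()

# Round trip: re-applying (4.11) to the recovered alpha reproduces the quoted ATM vol.
beta, rho, nu, F, tau = 0.5, -0.2, 0.4, 0.03, 1.0
alpha = alpha_from_atm(0.1168, beta, rho, nu, F, tau)
```

Choosing the smallest positive real root mirrors the selection rule stated above; the round trip holds to machine precision because the root condition is exactly the cubic rearrangement of (4.11).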
4.1.3 Pricing in a negative interest rate environment
Before we continue with the time series analysis, we first need to consider a method that enables us to
price derivatives in a negative interest rate environment. The option pricing models that are used in this
research do not allow interest rates to become negative. However, a lot has changed since these models
were constructed, and we need to adjust our models to be able to deal with the negative interest rates
that have occurred over the past years. Frankema (2016) describes the Displaced Black’s model as well
as the displaced SABR model, which allow interest rates to be negative. The shifted models with shift
s > 0 allow rates larger than −s to be modelled. This leads to the following adjusted dynamics of Black’s
model, which is also known as a displaced diffusion process
dF_t = d(F_t + s) = σ_B (F_t + s) dW_t,    (4.18)
where s is the constant displacement (or shift) parameter. Note that F̃_t ≡ (F_t + s) follows a lognormal (or
Black) process. This fact, together with the fact that the payoff of a European call option max(F_T − K, 0)
can be written as

max(F_T − K, 0) = max( (F_T + s) − (K + s), 0 ) ≡ max(F̃_T − K̃, 0),    (4.19)

leads to the conclusion that European calls and puts can be valued under the displaced diffusion model
by plugging in F̃_0 ≡ (F_0 + s) and K̃ ≡ (K + s) into Black's model.
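The displacement trick is a one-line wrapper around Black's formula; the shift s = 2% and the negative forward below are illustrative assumptions:

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_call(F, K, T, sigma):
    d1 = (math.log(F / K) + 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))
    return F * norm_cdf(d1) - K * norm_cdf(d1 - sigma * math.sqrt(T))

def displaced_black_call(F, K, T, sigma, s):
    """Displaced Black: apply Black's formula to the shifted forward and strike."""
    return black_call(F + s, K + s, T, sigma)

# A negative forward of -0.5% becomes priceable after a 2% shift.
price = displaced_black_call(F=-0.005, K=-0.002, T=1.0, sigma=0.30, s=0.02)
```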
A similar adjustment leads to the following dynamics of the displaced SABR model
dF_t = α_t (F_t + s)^β dW_t^(1),

dα_t = ν α_t dW_t^(2),

E[ dW_t^(1) dW_t^(2) ] = ρ dt.    (4.20)
Hence, we use the formulas from Black's model (4.6) and the SABR model (4.13) with the displaced
values F̃_0 and K̃ instead of F_0 and K. A drawback of the displaced models however is that the shift
parameter needs to be selected a priori. So an assumption has to be made on the minimum of the interest
rate. To overcome this drawback, Antonov et al. (2015) describe the free boundary
model. For this research, however, the displaced SABR model is preferred.
4.2 Time series analysis
The SABR volatility model parameters are estimated on a daily basis. The aim of this research is to
estimate the risk related to a portfolio of swaptions. Therefore an analysis of these SABR parameters
over time is of interest to be able to forecast the one-day-ahead volatility structure. In this section, some
models will be discussed that are used to capture the dynamics of the parameters αt, ρt and νt over
time.
4.2.1 Vector Autoregressive model
A time series γ_t is called white noise if all of its autocorrelations are equal to zero, so for a white
noise series all sample ACFs should be close to zero. To obtain this, we
need to apply some time series models to model the dynamic structure of our time series. Tsay (2005)
denotes first the simple autoregressive model of order 1 or simply AR(1) model. This model is defined
as follows:
γ_t = φ_0 + φ_1 γ_{t−1} + a_t,    (4.21)

where a_t is assumed to be a white noise series with mean zero and variance σ_a².
The model described above could make sense for the individual parameters, but we have to obtain
a forecast of all of the SABR parameters together. These parameters clearly depend on each other as
described in (4.7). Hence, a model that takes the correlation between these time series into account
is desired. The vector autoregressive model (VAR) is a model that can be used for this kind of linear
dynamic structures of a multivariate time series. We fit a VAR model to the three time series α, ρ and ν
Γ_t = φ_0 + Φ Γ_{t−1} + a_t,    where Γ_t = (α_t, ρ_t, ν_t)′,    (4.22)
The vectors Γt and φ0 are k-dimensional, Φ is a k×k matrix, and at is a sequence of serially uncorrelated
random vectors with mean zero and co-variance matrix Σ. Note that we are modelling three different
SABR parameters over time and for this reason have k = 3.
For our VAR(p) model estimation, we have to decide how many lags p to include. A vector
autoregressive model of lag length p is one in which the current value depends on the first p lagged
values. There are several tools that can be used to decide which lag length to include.
Firstly a sample autocorrelation function (ACF) of the parameters can be used to check their level of
autocorrelation. If we have a weakly stationary return series γt, we define the lag-l autocorrelation of
γt, ACFl, as the correlation coefficient between γt and γt−l. We define ACFl as follows (Tsay, 2005)
ACF_l = Cov(γ_t, γ_{t−l}) / √( Var(γ_t) Var(γ_{t−l}) ) = Cov(γ_t, γ_{t−l}) / Var(γ_t).    (4.23)
Another method to determine the optimal selection of lags to include is to use information criteria.
These criteria like the Akaike information criterion (AIC), Bayes information criterion (BIC) and Hannan-
Quinn criterion (HQC) can be used to measure the relative quality of statistical models for a given set
of data. Liew (2004) compares these different criteria in a simulation study to obtain the best choice of
lag length criteria for an autoregressive model. He finds that for a relatively large sample, with 120
or more observations, the Hannan-Quinn criterion outdoes the rest in correctly identifying the
true lag length.
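A sketch of estimating the VAR(1) in (4.22) by equation-wise OLS, on a simulated three-dimensional series standing in for (α_t, ρ_t, ν_t); the true coefficients below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a stable 3-dimensional VAR(1); coefficients are illustrative.
Phi_true = np.diag([0.9, 0.8, 0.7])
phi0_true = np.array([0.002, -0.01, 0.03])
G = np.zeros((600, 3))
for t in range(1, 600):
    G[t] = phi0_true + Phi_true @ G[t - 1] + 0.01 * rng.standard_normal(3)

# Equation-by-equation OLS for Gamma_t = phi0 + Phi Gamma_{t-1} + a_t, eq. (4.22).
Y = G[1:]
X = np.column_stack([np.ones(len(G) - 1), G[:-1]])
B = np.linalg.lstsq(X, Y, rcond=None)[0]      # rows: intercept, then lagged coefficients
phi0_hat, Phi_hat = B[0], B[1:].T

forecast = phi0_hat + Phi_hat @ G[-1]         # one-day-ahead point forecast
```

In the simulation stage of the empirical study, draws of a_t would be added to this point forecast to build a one-day-ahead distribution of the parameters.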
4.2.2 Local level model
A local level model is a type of state space model which, like the VAR model, can be used for a
time series analysis. In a classical regression model, a trend and an intercept are estimated. However,
when focusing on a time series this intercept might in reality not be fixed over time. When this level
component changes over time it is applied locally and for this reason this model is known as the local
level model. The local level model allows this intercept to change over time and is defined as follows
μ_{t+1} = I_m μ_t + B η_t,    where μ_t = (μ_t^(1), μ_t^(2), μ_t^(3))′,    (4.24)

Γ_t = C μ_t + D ε_t,    (4.25)
where Γt is the vector of SABR parameters that is defined in (4.22). Moreover the observation or
measurement equation (4.25) contains the values of the three observed time series at time t. Besides
this, we also have a m × 1 vector of unobserved variables µt. Three unobserved variables are used in
this research, so we have here m = 3. These unobserved variables represent the unknown fixed effects
and we define (4.24) as the state equation. We define ε_t and η_t as the observation and state
disturbances, respectively. These disturbances are independent and follow the standard normal
distribution.
The state disturbance coefficient matrix B is here defined as a 3 × 3 matrix. This results in a
covariance matrix equal to BB′. The observation innovation coefficient matrix D is defined in a similar way
as a 3 × 3 matrix, which leads to an observation innovation covariance matrix equal to DD′. Both the
state disturbance coefficient matrix and the observation innovation coefficient matrix are defined as
diagonal matrices, whose diagonal elements are estimated by maximum likelihood.
Furthermore, we note that Im is the identity matrix of size m = 3. Finally, the 3 × 3 matrix C links
the unobservable factors of the state vector µt with the observation vector Γt. All the coefficients of the
matrix C are also estimated by using maximum likelihood.
The state equation is defined as a random walk and in the measurement equation an irregular
component ε_t is added, which makes this model a random walk plus noise. The state equation is essential in
time series analysis, because the time dependencies in the observed time series are dealt with by letting
the state at time t+ 1 depend on the state at time t (Commandeur and Koopman, 2007).
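A univariate special case of (4.24)–(4.25) makes the mechanics concrete. The sketch below runs a hand-written Kalman filter for a random walk plus noise, with assumed (not estimated) disturbance variances and simulated data:

```python
import numpy as np

def local_level_filter(y, sigma_eps2, sigma_eta2, a0=0.0, p0=1e7):
    """Kalman filter for the univariate local level model
    mu_{t+1} = mu_t + eta_t,  y_t = mu_t + eps_t  (cf. (4.24)-(4.25))."""
    a, p = a0, p0                      # filtered state mean and variance (diffuse prior)
    filtered = np.empty(len(y))
    for t, obs in enumerate(y):
        f = p + sigma_eps2             # prediction-error variance
        k = p / f                      # Kalman gain
        a = a + k * (obs - a)          # measurement update of the level
        p = p * (1 - k) + sigma_eta2   # variance update plus state noise
        filtered[t] = a
    return filtered

rng = np.random.default_rng(2)
level = np.cumsum(0.05 * rng.standard_normal(300)) + 1.0   # latent random walk
y = level + 0.2 * rng.standard_normal(300)                 # noisy observations
mu_hat = local_level_filter(y, sigma_eps2=0.04, sigma_eta2=0.0025)
```

The filtered level tracks the latent random walk much more closely than the raw observations do, which is exactly the role the state equation plays in the multivariate version used for the SABR parameters.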
4.3 Risk measurement
The option pricing models are based on a probability measure Q that is related to a risk-neutral world.
On the other hand the real probability P is used to estimate the risk of a portfolio. These two measures
give different weights to the same possible outcomes for the same derivatives. The risk measures are
based on estimates of the profit and loss distribution. The probability of a certain value occurring in
this profit and loss distribution needs to be equivalent to the real-world probability P to obtain a valid risk
measure. In this research, we will use option pricing models together with the risk neutral measure to
price the swaptions. These swaption prices as well as the calibrated parameters of the SABR model are
then used to derive the profit and loss distribution under the probability in the real world. In this section
the concepts of financial risk and some methods of measuring risk will be introduced. This includes a
definition of Value at Risk and Expected Shortfall as well as their limitations.
4.3.1 Risk measures
Financial risk can be seen as the chance of a loss in a financial position, caused by an unexpected change in the underlying risk factor. In this research we focus on a portfolio of swaptions, so the relevant risk is that of losses arising from movements in market prices. The risk we are trying to measure is called market risk, and in our case specifically, interest rate risk.
Now a formal definition of a risk measure is provided. We have a finite set of states of nature Ω, a set of all risks χ, and real-valued functions X ∈ χ, which represent the final net worth of an instrument for each element of Ω. We now define a risk measure ρ(X) as a mapping of χ into R (Roccioletti, 2016).
To assess whether a risk measure is acceptable, the axioms of a coherent risk measure are defined. In
other words, a risk measure is said to be coherent if it satisfies the following four properties.
Axiom 1. Translation Invariance
For all X ∈ χ and for all m ∈ R, we have
ρ(X +m) = ρ(X)−m (4.26)
Translation invariance implies in words that the addition of a sure amount of capital reduces the risk by
the same amount.
Axiom 2. Sub-additivity
For all X1 ∈ χ and X2 ∈ χ, we have
ρ(X1 +X2) ≤ ρ(X1) + ρ(X2) (4.27)
So, the risk of two portfolios together cannot be any worse than the sum of the two risks taken separately.
Axiom 3. Positive Homogeneity
For all X ∈ χ and for all τ > 0, we have
ρ(τX) = τρ(X) (4.28)
Again in words, positive homogeneity implies the risk of a position is proportional to its size.
Axiom 4. Monotonicity
For all X1 ∈ χ and X2 ∈ χ with X1 ≤ X2, we have
ρ(X1) ≥ ρ(X2) (4.29)
Finally, as described by Roccioletti (2016), the monotonicity axiom states that if, in each state of the world, the position X2 performs better than position X1, then the risk associated with X1 should be higher than that associated with X2.
The Value at Risk measure is a single estimate of the amount by which an institution’s position in a risk category could decline due to general market movements during a given holding period. Define ∆Vl as the change in value of the assets of a financial position from time t to t + l. This quantity is measured in euros and is a random variable at time index t. The cumulative distribution function of ∆Vl is denoted Fl(x). The Value at Risk measure is defined such that a loss will not exceed the VaR with probability 1 − p over a given time horizon (Tsay, 2005). The VaR is given by

p = Pr[∆Vl ≤ −VaR] = Fl(−VaR). (4.30)
Although VaR is widely used among banks, it also has several limitations. First of all, as described
by the Basel Committee (2013), the VaR measure does not capture tail risk. As it is a single estimate of
the minimal potential loss in an adverse market outcome, it will underestimate the actual potential loss.
The Value at Risk measure gives no estimate of the magnitude of the loss in such an event. Besides this, the sub-additivity property fails to hold for VaR in general, meaning that it is not a coherent risk measure, and we can have
V aR(X1 + · · ·+Xd) > V aR(X1) + · · ·+ V aR(Xd). (4.31)
Whereas portfolio diversification in general leads to risk reduction, this need not be reflected by the VaR measure. This is especially a problem when we consider the capital adequacy requirements for a financial institution made up of several businesses. With a decentralized approach, where the VaR number is calculated separately for every branch, we cannot be sure that the aggregated overall risk is an accurate estimate. We note, however, that although VaR is not sub-additive in general, whether sub-additivity holds in a particular case depends on the properties of the joint loss distribution.
To overcome the shortcomings of the Value at Risk measure, the Expected Shortfall measure can be
used instead. Expected Shortfall is the expected return of the portfolio given that a loss has exceeded
the VaR. We define the ES as
−ES(1−p) = E[∆Vl | ∆Vl ≤ −VaR(1−p)], (4.32)

−ES(1−p) = (1/p) ∫_{−∞}^{−VaR(1−p)} x fl(x) dx, (4.33)
where fl(x) is the probability distribution function of ∆Vl. In these formulas we assume a long position
in the portfolio, but the same can be derived for a short position.
Expected Shortfall fulfills all four axioms above, so it is a coherent risk measure, and the tail risk is taken into account. There are, however, still some issues with this measure. To obtain the ES forecast, we first need to ascertain the VaR estimate and subsequently compute the tail expectation, which introduces additional estimation uncertainty. There is also some difficulty with the validation of risk models’ ES forecasts. As shown by Gneiting (2011), Expected Shortfall is not elicitable; a functional is elicitable if there exists a scoring function that is strictly consistent for it. The difficulty with ES forecasts is that they measure all risk in the tail of the return distribution, and some losses far out in the tail will not be observed in regular backtesting. Despite these drawbacks, ES is still proposed as a replacement for the VaR measure.
To assess the risk related to the swaptions, we want to compute the Value at Risk and Expected
Shortfall forecasts. The Historical Simulation method will now be described. This procedure uses historical returns to predict the VaR. It is easy to implement, but has some shortcomings: all returns are given the same weight, so the procedure does not account for the decreasing relevance of observations further in the past.
Let rt, rt−1, . . . , rt−K be the returns of a portfolio in the sample period. So first the changes in swaption price over our sample are computed. Then we sort the returns in ascending order: r[1], r[2], . . . , r[K]. The one-day-ahead Value at Risk is given by

−VaR(1−p) = r[k], (4.34)

where k = Kp. The Expected Shortfall follows from the previous steps and can be computed as

−ES(1−p) = (1/k) Σ_{i=1}^{k} r[i]. (4.35)
We note that classical HS is only valid in theory when the volatility and the correlation are constant over time; when dealing with a time-varying volatility, another method is needed.
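The Historical Simulation steps (4.34)–(4.35) can be sketched as follows; the data below are synthetic standard-normal returns for illustration, not the thesis portfolio:

```python
import numpy as np

def hs_var_es(returns, p=0.025):
    """Historical Simulation VaR and ES at level 1-p, following (4.34)-(4.35):
    sort returns ascending, take the k-th worst return with k = K*p, and
    average the k worst returns for the ES."""
    r = np.sort(np.asarray(returns))          # ascending: r[0] is the worst
    k = max(int(len(r) * p), 1)               # k = K*p, at least one observation
    var = -r[k - 1]                           # -VaR(1-p) = r[k] (1-based index)
    es = -r[:k].mean()                        # -ES(1-p) = mean of the k worst returns
    return var, es

# Synthetic standard-normal P&L: VaR(97.5%) should be near 1.96, ES near 2.34.
rng = np.random.default_rng(1)
var975, es975 = hs_var_es(rng.normal(0.0, 1.0, 10_000), p=0.025)
```

By construction the ES estimate always lies at or beyond the VaR estimate, reflecting that ES averages the losses in the tail rather than reading off a single quantile.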
4.4 Backtests
Backtesting can be described as checking whether realizations are in line with the model forecasts.
Financial institutions base their decisions partly on their estimates of risk measures. Therefore it is very
important to test whether these estimates are accurate. Various tests have been developed over time to assess the quality of the models that produce these estimates. Even though it may seem like a simple task, there are some complications. The main difficulty is that the methods produce a daily estimate of the profit and loss distribution, while only one true profit or loss is observed each day to assess the quality of that estimated distribution. Especially the evaluation of the accuracy of an Expected Shortfall estimate is challenging. If we focus on ES(0.975), for example, we in theory only incur a loss that exceeds the VaR(0.975) in 2.5% of the cases. With this minimal number of actual losses that exceed the
VaR, we need to assess whether the ES forecast actually represents the true expected value of the tail
loss. Besides this, the tail loss is also estimated based on a different profit and loss distribution for every
new forecast. Fortunately there are some methods to backtest the models we are using. We will mainly
focus however on backtests based on the VaR estimates, but we will also perform a backtest to assess
the performance of the models with regard to their Expected Shortfall estimates.
Campbell (2007) reviews a variety of backtests. He defines a hit function that creates a sequence such as (0, 0, 0, 1, 0, 0, . . . , 1), where a 1 stands for a loss that exceeds the VaR measure. Determining
the accuracy of the VaR measure can be reduced to determining whether the hit sequence satisfies two
properties. First of all, the probability of receiving a loss that exceeds the (1− p)% VaR measure must
be p. Secondly, any two elements of the hit sequence must be independent from each other. Only hit
sequences that satisfy both properties can be described as evidence of an accurate VaR model. Let this
hit function be defined as follows

It = 1 if rt+1 < −VaR(1−p), and It = 0 if rt+1 ≥ −VaR(1−p). (4.36)
The hit function is used to test the unconditional coverage property with the backtest proposed by
Kupiec (1995) and also to test the independence property with the backtest proposed by Christoffersen
(1998). In addition also the magnitude of losses that exceed the VaR can be taken into account with a
magnitude-based test.
4.4.1 Unconditional coverage backtesting
The unconditional coverage backtest, proposed by Kupiec (1995), tests the null hypothesis of E[It] = p.
The hit function defined at the beginning of this section is used and we first compute the total number
of hits
n1 = Σ_{t=1}^{T} It, (4.37)

and we also define n0 = T − n1 as the total number of returns larger than −VaR(1−p). The estimated probability now becomes

π = n1 / (n0 + n1). (4.38)
So this corresponds to the following hypothesis based on the returns and the Value at Risk measure
H0 : π = p, H1 : π ≠ p. (4.39)
The likelihood under the null hypothesis is defined as
L(p; I1, I2, . . . , IT ) = (1− p)n0pn1 , (4.40)
and under the alternative hypothesis as
L(π; I1, I2, . . . , IT ) = (1− π)n0πn1 . (4.41)
This can be tested with a standard likelihood ratio test
LRuc = −2 log[L(p; I1, I2, . . . , IT) / L(π; I1, I2, . . . , IT)]  asy∼  χ²(m − 1). (4.42)
The variable m is the number of possible outcomes of the hit sequence, so in this case we have m = 2.
The LR-statistic converges under the null hypothesis to the chi-squared distribution with one degree of
freedom
LRuc = 2 log[((1 − π)/(1 − p))^{n0} (π/p)^{n1}]  →d  χ²(1). (4.43)
4.4.2 Magnitude-based test
Frequency tests do not take the magnitude of the losses into account, so it is desirable to also perform a magnitude-based test. Consider, for example, two different banks, each with a VaR(0.99) estimate, and suppose both banks encounter three losses that exceed their Value at Risk estimate within the same time period. The unconditional coverage test would then indicate that the performance of both models is similar. However, it could be the case that Bank A has incurred three losses that exceed the VaR by one million euros, while Bank B has incurred losses that exceed the VaR by one billion euros. This difference in risk is obvious, so for that reason a multivariate version of the unconditional coverage test will be applied.
Colletaz et al. (2013) describe a method to validate risk models. The test is based on the intuition
that a large loss will not only exceed the V aR(1−p), but is also likely to exceed the V aR(1−p′) with
p′ < p. A standard Value at Risk violation is defined as an exception and a super exception is defined as rt < −VaR(1−p′). Based on these two concepts the following null hypothesis is defined
H0 : E[It(p)] = p and E[It(p′)] = p′. (4.44)
To test this hypothesis, we define two hit functions to indicate the frequency of returns that fall in each interval

J1,t = It(p) − It(p′) = 1 if −VaR(1−p′) < rt < −VaR(1−p), and 0 otherwise, (4.45)

J2,t = It(p′) = 1 if rt < −VaR(1−p′), and 0 otherwise, (4.46)

and J0,t = 1 − J1,t − J2,t = 1 − It(p). The hit functions {Ji,t}, i = 0, 1, 2, are Bernoulli random variables equal to one with probability 1 − p, p − p′, and p′, respectively. The hit functions are not independent of each other. We now denote ni = Σ_{t=1}^{T} Ji,t for i = 0, 1, 2, and define the proportions of exceptions as follows

π0 = n0 / (n0 + n1 + n2), π1 = n1 / (n0 + n1 + n2), and π2 = n2 / (n0 + n1 + n2). (4.47)
The likelihood ratio test can now also be defined for the multivariate case

LRmuc(p, p′) = 2 ln[(π0/(1 − p))^{n0} (π1/(p − p′))^{n1} (π2/p′)^{n2}]  →d  χ²(2), (4.48)

where the χ² distribution has m − 1 degrees of freedom, with in this case m = 3.
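A sketch of this multivariate coverage test is given below; it assumes each of the three cells contains at least one observation (otherwise the log-likelihood degenerates), and the returns and VaR levels are synthetic:

```python
import numpy as np
from scipy.stats import chi2

def lr_muc(returns, var_p, var_pp, p, pp):
    """Multivariate unconditional coverage LR test (4.48) of Colletaz et al.
    var_p, var_pp: the VaR(1-p) and VaR(1-p') forecasts, with p' < p.
    Assumes each of the three cells has at least one observation."""
    r = np.asarray(returns)
    T = len(r)
    n2 = int(np.sum(r < -var_pp))             # super exceptions
    n1 = int(np.sum(r < -var_p)) - n2         # ordinary exceptions only
    n0 = T - n1 - n2                          # no exception
    pi0, pi1, pi2 = n0 / T, n1 / T, n2 / T
    lr = 2.0 * (n0 * np.log(pi0 / (1 - p))
                + n1 * np.log(pi1 / (p - pp))
                + n2 * np.log(pi2 / pp))
    return lr, chi2.sf(lr, df=2)

# Exceptions occur at exactly the nominal rates -> LR = 0, p-value 1.
rets = np.array([-3.0] * 2 + [-1.5] * 8 + [0.5] * 990)
lr0, pval0 = lr_muc(rets, var_p=1.0, var_pp=2.0, p=0.01, pp=0.002)
```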
4.4.3 Independence backtesting
The next step is to test whether any two outcomes of the hit sequence are independent of each other.
Christoffersen (1998) proposed a test that examines whether the likelihood of a VaR violation today is
dependent on a violation yesterday. The hypotheses are constructed as follows
πij = P(It = j | It−1 = i), i, j = 0, 1,
H0 : π01 = π11 = p, H1 : π01 ≠ π11. (4.49)
Christoffersen (1998) tests the independence property against an explicit first-order Markov alternative.
First a transition probability matrix is defined based on the binary first-order Markov chain It
Π1 = [ 1 − π01   π01
       1 − π11   π11 ]. (4.50)
We now define nij as the number of observations with value i followed by j and this leads to the following
likelihood function
L(Π1; I1, I2, . . . , IT) = (1 − π01)^{n00} π01^{n01} (1 − π11)^{n10} π11^{n11}. (4.51)
Conditioned on the first observation, the log likelihood can be maximized and the parameters are ratios
of the counts of the appropriate cells
Π̂1 = [ n00/(n00 + n01)   n01/(n00 + n01)
       n10/(n10 + n11)   n11/(n10 + n11) ]. (4.52)
We now consider a similar interval model, with the same output sequence It. This Markov chain model
has the independence property and is given by
Π2 = [ 1 − π2   π2
       1 − π2   π2 ]. (4.53)
This gives us the likelihood under the null hypothesis
L(Π2; I1, I2, . . . , IT) = (1 − π2)^{n00+n10} π2^{n01+n11}, (4.54)
where we can again maximize the likelihood function and estimate the parameters. This leads to
Π̂2 = π̂2 = (n01 + n11) / (n00 + n10 + n01 + n11), (4.55)
now the likelihood ratio test follows and is like the unconditional coverage test asymptotically χ2 dis-
tributed with (m− 1)2 degrees of freedom
LRind = −2 log[L(Π̂2; I1, I2, . . . , IT) / L(Π̂1; I1, I2, . . . , IT)]  asy∼  χ²((m − 1)²). (4.56)
This leads again to a χ² distribution with one degree of freedom, because we again have m = 2:

LRind = 2 log[(1 − π̂01)^{n00} π̂01^{n01} (1 − π̂11)^{n10} π̂11^{n11} / ((1 − π̂2)^{n00+n10} π̂2^{n01+n11})]  →d  χ²(1). (4.57)
4.4.4 Duration-based test
In addition to the tests described above, one could also assess the duration between two consecutive hits.
The baseline idea is that if the one-day-ahead Value at Risk is correctly specified for a coverage rate
p, then the durations between two consecutive hits must have a geometric distribution with a success
probability equal to p (Candelon et al., 2010). When the model satisfies the unconditional coverage
property (UC) as well as the independence property (IND), the VaR forecasts are said to have a correct
conditional coverage (CC). Under this property, the VaR violation process is a martingale difference
E[It(p)− p|Ft−1] = 0. (4.58)
The hit series It(p) is a random sample from a Bernoulli distribution with a success probability equal
to p. We denote the duration between two consecutive violations as
di = ti − ti−1, (4.59)
where ti represents the date of the ith violation. A GMM moment condition test is used to backtest
the UC, IND and CC properties, but now based on the duration. First we define the orthonormal
polynomials associated with a geometric distribution with success probability p as follows

M_{k+1}(d, p) = [((1 − p)(2k + 1) + p(k − d + 1)) / ((k + 1)√(1 − p))] M_k(d, p) − (k/(k + 1)) M_{k−1}(d, p), (4.60)
for any order k ∈ N, with M−1(d, p) = 0 and M0(d, p) = 1. If the true distribution is a geometric
distribution with a success probability p, then we have
E[Mk(d, p)] = 0, ∀ k ∈ N∗, ∀ d ∈ N∗. (4.61)
This leads to the following hypotheses for each property
H0,uc : E[M1(di, p)] = 0,
H0,ind : E[Mk(di, q)] = 0, k = 1, . . . ,K,
H0,cc : E[Mk(di, p)] = 0, k = 1, . . . ,K,
(4.62)
where K is defined as the number of moment conditions. The unconditional coverage property is tested
with the first hypothesis. This hypothesis states that the expected value of the first moment condition
is equal to zero for the sequence of durations d1, . . . , dN. The second hypothesis is used to test the
independence property. This hypothesis states in words that the expected value for every moment
condition is equal to zero. There is however one difference, the probability q in the moment conditions
does not has to be equal to the true success probability p. Finally the conditional coverage property is
tested with the final hypothesis, which is a combination of the other two hypotheses. Now the statistics
of the three different tests are defined
GMMuc = ( (1/√N) Σ_{i=1}^{N} M1(di, p) )²  →d  χ²(1), (4.63)

GMMind(K) = ( (1/√N) Σ_{i=1}^{N} M(di, q) )′ ( (1/√N) Σ_{i=1}^{N} M(di, q) )  →d  χ²(K), (4.64)

GMMcc(K) = ( (1/√N) Σ_{i=1}^{N} M(di, p) )′ ( (1/√N) Σ_{i=1}^{N} M(di, p) )  →d  χ²(K), (4.65)

where M(di, ·) denotes the vector of the first K orthonormal polynomials.
Note however that in the second equation the value of q is not known, so it has to be estimated. Candelon et al. (2010) show that the distribution of the GMM statistic GMMind based on Mk(di, q̂) is similar to the one based on Mk(di, q), and this leads to

GMMind(K) = ( (1/√N) Σ_{i=1}^{N} M(di, q̂) )′ ( (1/√N) Σ_{i=1}^{N} M(di, q̂) )  →d  χ²(K − 1), (4.66)
because the first polynomial is used to estimate the maximum likelihood estimator q̂. The first polynomial M1(di, q) is strictly proportional to the score that defines the maximum likelihood estimator, so we solve M1(di, q̂) = 0 to obtain our estimate of q.
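The recursion (4.60) for the orthonormal polynomials is easy to implement; the sketch below evaluates M_1, ..., M_K at a vector of durations and checks, on synthetic geometric durations, that each sample moment is close to zero as (4.61) predicts:

```python
import numpy as np

def geo_orthonormal_poly(d, p, K):
    """Evaluate the orthonormal polynomials M_1..M_K of (4.60) at the
    durations d, starting the recursion from M_{-1} = 0 and M_0 = 1."""
    d = np.asarray(d, dtype=float)
    m_prev, m_curr = np.zeros_like(d), np.ones_like(d)
    rows = []
    for k in range(K):
        m_next = (((1 - p) * (2 * k + 1) + p * (k - d + 1))
                  / ((k + 1) * np.sqrt(1 - p)) * m_curr
                  - (k / (k + 1)) * m_prev)
        rows.append(m_next)
        m_prev, m_curr = m_curr, m_next
    return np.vstack(rows)          # shape (K, N): row k-1 holds M_k

# Under H0 the durations are geometric and each E[M_k(d, p)] should be ~0.
rng = np.random.default_rng(3)
d = rng.geometric(0.1, size=100_000)
M = geo_orthonormal_poly(d, 0.1, K=3)
```

For k = 0 the recursion reduces to M_1(d, p) = (1 − pd)/√(1 − p), whose expectation is zero because E[d] = 1/p for a geometric duration.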
4.4.5 Kolmogorov Smirnov test
To assess the goodness of fit of a statistical model, one can use the Kolmogorov Smirnov test. The
test, like described in Massey (1951), is based on the maximum difference between an empirical and a
hypothetical cumulative distribution. The first distribution is a specified cumulative distribution function
F0(x). This is compared with an observed cumulative step-function of the sample SN (x) = k/N , where
k is the number of observations less than or equal to x. This results in the following test statistic
DN = max |F0(x)− SN (x)|. (4.68)
When (x1, x2, . . . , xN) are mutually independent and all come from the same distribution function F0(x), the distribution of DN does not depend on F0(x). This means that a table used to test the hypothesis that numbers come from a uniform distribution may also be used to test the hypothesis that numbers come from a normal distribution, or from any completely specified continuous distribution (Miller, 1956). The statistic DN is used to test the null hypothesis that the observations come from F0(x) against the alternative that they come from another distribution. Based on formulas noted in Miller (1956), one can derive the values of ε for a given sample size N and desired level of significance 1 − α. These values of ε define the distribution of the statistic DN: P = Prob(DN ≤ ε).
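In practice DN and its p-value are available directly from `scipy.stats.kstest`; the sketch below uses synthetic data to contrast a correctly specified null with a mis-specified one:

```python
import numpy as np
from scipy.stats import kstest, norm

rng = np.random.default_rng(42)
x = rng.normal(0.0, 1.0, 500)

# Sample drawn from the hypothesized N(0, 1): D_N should be small.
stat_null, p_null = kstest(x, norm.cdf)      # D_N = max |F0(x) - S_N(x)|

# Mis-specified sample (mean shifted by one): D_N is large, H0 rejected.
stat_alt, p_alt = kstest(x + 1.0, norm.cdf)
```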
4.4.6 Expected Shortfall backtesting
We mainly focus on backtests based on the estimated Value at Risk measure. However, we also estimate
the Expected Shortfall measure and therefore also want to assess the quality of our methods based on
these ES estimates. Acerbi and Szekely (2014) describe three different backtests based on the Expected
Shortfall measure. They only make the assumption that the profit and loss distributions are continuous.
This way the Expected Shortfall can be written as
ES(1−p)(t) = −E[∆Vl(t)|∆Vl(t) + V aR(1−p)(t) < 0]. (4.69)
The tests that are used are model independent: besides continuity, no assumption is made on the true distribution of the returns. The general hypothesis of the Expected Shortfall backtests is constructed as follows

H0 : Fl(t) = Pl(t),
H1 : ES(1−p)_F(t) > ES(1−p)_P(t), (4.70)

where Fl(t) is the unknown true distribution of the returns ∆Vl(t) and Pl(t) is the forecasted distribution of the returns ∆Vl(t) based on the model. Furthermore, we define ES(1−p)_F(t) as the Expected Shortfall based on the unknown true distribution Fl(t) and ES(1−p)_P(t) as the Expected Shortfall estimate based on the model distribution Pl(t).
We perform one of the proposed backtests that is sensitive to both the magnitude as well as the
frequency of exceptions. Besides this, we only estimate one-day-ahead forecasts and for this reason set
l = 1. The test is based on the returns of a portfolio rt in a sample period with T observations in total.
Acerbi and Szekely (2014) base the test statistic of this test on the following relation
ES(1−p)_F(t) = −E[rt It / p], (4.71)

where It is the indicator function as defined in (4.36). This leads to the following test statistic

Z(r) = Σ_{t=1}^{T} [rt It / (T p ES(1−p)_F(t))] + 1. (4.72)
The hypothesis of this specific test is defined as follows

H0 : F_1^{[1−p]}(t) = P_1^{[1−p]}(t) ∀ t,
H1 : ES(1−p)_F(t) ≥ ES(1−p)_P(t) for all t, with strict inequality for at least one t,
     and VaR(1−p)_F(t) ≥ VaR(1−p)_P(t) for all t. (4.73)
So under the null hypothesis we have a model that estimates the tail risk correctly, while if the null
hypothesis is rejected we have a model that underestimates the tail risk. The expected value of this test
statistic Z is under the null hypothesis equal to zero and under the alternative hypothesis strictly smaller
than zero. We perform this test at a significance level of 5%, and Acerbi and Szekely (2014) show that we do not need to perform a Monte Carlo simulation to compute the p-value for Z: the p-values are remarkably stable across all financially realistic cases. This leads to a critical value of the test statistic equal to −0.7.
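The statistic (4.72) can be computed directly from the return series and the daily VaR and ES forecasts; the sketch below uses stylized constant forecasts and synthetic returns:

```python
import numpy as np

def z_statistic(returns, var_fc, es_fc, p=0.025):
    """Acerbi-Szekely test statistic (4.72), sensitive to both the
    frequency and the magnitude of VaR exceptions.
    var_fc, es_fc: per-day VaR(1-p) and ES(1-p) forecasts."""
    r = np.asarray(returns)
    hits = (r < -np.asarray(var_fc)).astype(float)   # indicator I_t of (4.36)
    T = len(r)
    return np.sum(r * hits / (T * p * np.asarray(es_fc))) + 1.0

# Stylized check: 25 exceptions out of T = 1000 (the expected 2.5%), each
# exactly equal to the forecasted tail mean of -2.0, gives Z = 0.
rets_ok = np.array([-2.0] * 25 + [0.5] * 975)
z_ok = z_statistic(rets_ok, np.full(1000, 1.0), np.full(1000, 2.0))

# Same exception frequency but double the loss size: Z falls below -0.7.
rets_bad = np.array([-4.0] * 25 + [0.5] * 975)
z_bad = z_statistic(rets_bad, np.full(1000, 1.0), np.full(1000, 2.0))
```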
5 Data
Two data sets from different sources are combined for this research. The swaption data is provided by ICAP and the zero curve data is collected from Thomson Reuters Eikon and Bloomberg. All of the data is available from 13/Jan/2015 up to and including 1/Jun/2017. The data contains only trading days, which leads to a total of 613 observations for each variable.
Little pre-processing is done to obtain the necessary zero curve data. The interest rates and interest
rate swaps, that are used to construct the zero and discount curves, are based on the Euribor rate. The
quotes are end-of-day and based on a floating tenor of three months. Furthermore we use so called mid
rates, which are computed based on the bid and ask quotes as observed in the market. The day count
convention is also quoted for each product and this is all used together with the bootstrapping method
described in Section 2.2 to obtain the zero and discount curves.
We then start with the pre-processing of the ICAP swaption data. The initial data consists of two raw
ICAP end-of-day data files. The first file contains the ATM data, including the ATM straddle premiums
for various swaption expiry and tenor combinations. The second file contains the skew data including
payer, receiver, collar and strangle premiums for various expiry, tenor and relative strike combinations.
First the relevant data is extracted from these raw files, then we convert them into files that are used
as input to the calibration. We store the ATM straddle premiums in a separate file. The premiums are
stored in an expiry-tenor grid. In another file we store the payer and receiver swaption premiums for
different relative strikes. For some strikes no payer and receiver premiums are available, but only collars
and strangles. In this case the payer and receiver premiums are derived using the relationship

payer = (collar + strangle) / 2,    receiver = strangle − payer. (5.1)
We end up with two files with payer and receiver swaption premiums. These files have the same structure; we have only separated the premiums for expiries up to one year from the premiums for expiries of one year and beyond, purely to follow the set-up of the raw ICAP input data. Finally, we also create a file which contains all of the ICAP displacement values for every expiry-tenor combination.
The descriptive statistics of these deposit rates, swap rates, and the ’10y10y’ swaption premiums are shown in Table 5.1. The displacement parameter is excluded from this table, because it only takes on a small number of discrete values on the entire time grid. A plot of the magnitude of the displacement parameter for the ’5y5y’ and the ’10y10y’ swaption is shown instead in Figure 6.3. The value for the standard deviation that is shown in the table is the average of the standard deviations over the different tenors and strike rates for the Euribor data and the swaption data, respectively. Aggregating the data gives a clearer view of its main characteristics; on the other hand, some information is lost in the aggregation. For this reason, we show boxplots of both the Euribor data and the swaption data in Section A.1.
                         Euribor deposit rate   Euribor swap rate    Swaption premium
Tenor                    Overnight - 3 weeks    1 month - 60 years   10 years
Maturity                 -                      -                    10 years
Min                      -0.3320 %              -0.3980 %            57.85 euro
Max                      0.0710 %               1.7965 %             875.64 euro
Mean                     -0.1787 %              0.2781 %             551.49 euro
Median                   -0.2420 %              -0.1490 %            582.98 euro
Std. Dev.                0.0014                 0.0020               25.23
Number of observations   3678                   23907                10421

Table 5.1: Descriptive statistics of the data.
5.1 Calculating the implied volatilities
Next, we have to convert the premiums to volatilities, which we can then use to calibrate the SABR
model. To obtain the volatilities, we will use the displaced Black’s model as described in (4.18). First
we will use the ATM implied volatility to compute the correct principal value of the contract. This way we link the correct volatilities to the ICAP premiums. The ATM volatility is given in our dataset, so this makes a good starting point. We compute the principal value of the contract L as follows

d1,ATM = σATM √T / 2,
d2,ATM = −σATM √T / 2,
L = P_swaption,ATM / ( A_{α,β}(0) [ S_{α,β}(Tα) N(d1,ATM) − K N(d2,ATM) ] ), (5.2)
this notional principal is then used in the next step to compute the out-of-the-money volatilities. The
premiums for both receiver and payer OTM swaptions are quoted in the data set. The interval of these
strikes relative to the par swap rate of the underlying swap of the swaption is as follows
Receiver -3% -2% -1.5% -1% -0.75% -0.5% -0.25% -0.125% -0.0625%
ATM 0%
Payer +0.0625% +0.125% +0.25% +0.5% +0.75% +1% +1.5% +2% +3%
Table 5.2: Available strikes relative to the par swap rate.
Now we also compute the absolute rates of the ATM strikes based on the par swap rates. The ICAP
data strikes are all relative to the ATM strike, so to get the absolute strikes we need to compute the par
swap rate. To do so we use (2.9) together with the bootstrapped discount curve based on the Euribor
rate. The OTM volatilities are now computed by inverting (4.6) and solving for σ. We make use of the displaced variant of Black’s model, so we use the displaced forward and strike, as described in Section 4.1.3. This way we obtain the market points of the implied volatilities of the swaption.
5.2 Leaving out some strikes
Firstly, there are some strikes missing in our data set. We only focus on the most frequently traded
expiry-tenor combinations to minimize the amount of missing values, but still some premiums are missing.
Especially the receiver swaptions with strikes of -3% and -2% relative to the par swap rate are often
missing. For this reason, we choose to exclude those two strikes on the entire interval. Furthermore there
is one day in particular (25/Mar/2015) where the premiums of only 11 out of the 19 strikes are available.
Fortunately this day is the only exception: for the ’10y10y’ swaption at least 17 out of the 19 premiums are available on all other days. The missing premiums here are the receiver swaptions with
strikes of -3% and -2%, which are excluded from our calibration. This results in a complete premium
vector for our interval of strikes for all days except for 25/Mar/2015. To obtain a more stable time series
of SABR parameters, we choose to exclude the quotes on 25/Mar/2015 from our data set.
Secondly, as will be described in Section 6.1, the shape of the volatility structure depends on the
chosen level of displacement. This volatility structure is then used to calibrate the SABR model. The
SABR model can, however, have difficulty calibrating to both the low and the high strikes simultaneously. Some of the low strike receiver swaptions will be removed in the calibration to obtain a better calibration to the higher
strikes, in which practitioners have the most exposure. The impact of these low strike receiver swaptions,
with a high volatility, on the SABR parameters is too big in relation to their importance. Leaving them
out will not only result in a better calibration for the other strikes, but also prevent calibrated SABR
smiles that result in big repricing differences. We remove a strike K[1] from the range we use for the calibration in one of the following two cases:

1. |σ_{K[1]} − σ_{K[2]}| > 0.2,

2. σ_{K[1]} < σ_{K[2]},
where the strikes are ordered in ascending order from the receiver swaption with the lowest strike up to the payer swaption with the highest strike, and K[i] represents the ith strike in this sorted range. By removing strikes with either a too high (case 1) or a too low (case 2) volatility, we improve our overall calibration. These two cases only occur in the period from 13/Jan/2015 until 25/Mar/2015, and in total no more than 23 strikes are removed on this interval. Note that, for example, in Figure 6.2 a strike is removed from the interval for the lower two displacements.
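A loose sketch of this filtering rule is given below. It applies the two cases repeatedly to the lowest remaining strike, which is an assumption on our part (the thesis states the rule for K[1] only); the strike/volatility values are synthetic:

```python
def filter_low_strikes(strikes, vols, jump=0.2):
    """Drop the lowest strike while its volatility either differs from the
    next strike's by more than `jump` (case 1) or lies below the next
    strike's volatility (case 2). Inputs are sorted by ascending strike."""
    k, v = list(strikes), list(vols)
    while len(v) > 1 and (abs(v[0] - v[1]) > jump or v[0] < v[1]):
        k.pop(0)
        v.pop(0)
    return k, v

# A spuriously high lowest-strike volatility (case 1) is removed.
ks, vs = filter_low_strikes([-0.01, -0.005, 0.0, 0.005], [0.9, 0.5, 0.45, 0.4])
```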
6 Empirical study and results
The models and theory described in the previous sections will now be applied to our data set. First, we will argue which values for β and the displacement parameter are preferred. We will then calibrate the other SABR model parameters and subsequently start with the time series analysis. The vector autoregressive model is estimated and analyzed, and the risk measure estimates based on this model are compared to those of the Historical Simulation method through multiple backtests. Finally, the section ends with the estimation of the local level model, which is used as a robustness check of the vector autoregressive model.
6.1 Calibrating the SABR model parameters
Section 5 described how the implied volatilities are obtained from the input data. These volatilities are now used as inputs for the SABR volatility model. The model is calibrated daily, and the resulting time series of parameters are stored and analyzed, as described in Section 6.2.
The first step in calibrating the SABR parameters is to determine which value for β fits the data
best. Our main focus in this research is on a swaption with 10 years to maturity and an underlying swap
tenor of 10 years as well. In Figure 6.1 the log-log plot of σATM and F is displayed. This can be used
together with the theoretical relation described in (4.16). Now one can estimate the value for β and
we use a simple OLS regression to do so. The linear approximations are plotted as well and the OLS
estimates are shown in Table 6.1.
Figure 6.1: Log-log plot for the ’10y10y’ swaption (log σATM against log F, with an OLS approximation).
The OLS estimation gives us the following results:

               Log α      −(1 − β)   α        β
OLS estimate   −3.5563    −0.5262    0.0285   0.4738

Table 6.1: OLS estimates for α and β.
So, as mentioned before, one fixed value for β can be used for the entire time grid. This method of estimating β is, however, not the only one in use: another common approach is simply to set β equal to 0.5. We note that for the ’10y10y’ swaption the estimated β lies close to this value of 0.5. On the other hand, this does not hold for every swaption: if we make the log-log plot for the ’5y5y’ swaption, we find an optimal value of β = 0.7191. The log-log plot and OLS estimates for the ’5y5y’ swaption are displayed in appendix Section A.2.
Before we start with the calibration of the other SABR parameters, we first need to select the level of
the displacement parameter. The level of displacement has no impact in itself when repricing a single
swaption: if a given displacement is used to imply the volatility, then recomputing the premium results
in an identical premium regardless of the size of the displacement. However, the displacement parameter
does affect the underlying volatility structure across strikes. We therefore need to take two things into
account when choosing the displacement parameter. First, the displacement parameter s needs to be
larger than the absolute value of the lowest strike K; this is necessary to be able to use Black’s model
for the entire range of strikes. Moreover, if K + s > 0 but very close to zero, this results in very high
implied volatilities. Second, a large displacement parameter flattens the volatility structure or even
produces a frown. This effect is clearly shown in Figure 6.2 based on our data set.
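The displacement-invariance claim for a single swaption can be checked numerically: imply the shifted-Black volatility from one premium under several displacements and reprice. All numbers below (forward, strike, expiry, premium, unit annuity) are hypothetical, chosen only to resemble the ’10y10y’ setting.

```python
import math

def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def displaced_black_payer(F, K, T, sigma, s):
    """Black-76 payer price on the shifted forward F+s and strike K+s (annuity = 1)."""
    f, k = F + s, K + s
    d1 = (math.log(f / k) + 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return f * norm_cdf(d1) - k * norm_cdf(d2)

def implied_vol(price, F, K, T, s, lo=1e-6, hi=5.0):
    """Bisection for the displaced-Black volatility matching a given premium."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if displaced_black_payer(F, K, T, mid, s) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

F, K, T = 0.0162, 0.0262, 10.0        # hypothetical forward, strike, expiry
premium = 0.0030                      # hypothetical market premium
prices = []
for s in (0.01, 0.0125, 0.03):        # different displacement levels
    vol_s = implied_vol(premium, F, K, T, s)
    prices.append(displaced_black_payer(F, K, T, vol_s, s))
```

Each displacement yields a different implied volatility, yet repricing returns the same premium every time, which is exactly the invariance described above.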
[Figure: market and SABR-implied Black volatilities on 10-Mar-2017 for displacements of 1%, 1.6%, and 3%.]
Figure 6.2: Implied volatilities and SABR calibration for different levels of displacement.
The interest rate and the par swap rate vary over time. For this reason it also makes sense to
vary the magnitude of the displacement parameter over time. The proposed magnitude of this variable
displacement parameter is provided with the data and shown in Figure 6.3. A fixed displacement value
of 1.25% is proposed for the ’10y10y’ swaption, because a larger value results in a worse calibration
for the positive interest rate period, while a smaller value forces us to remove some of the lowest strikes
in the negative interest rate period. Figure 6.3 also shows a dynamic displacement; these values are
used by the data supplier ICAP. The dynamic displacement parameter ensures a well-behaved volatility
structure on the entire interval.
Figure 6.3: Magnitude of displacement for the ’5y5y’ and ’10y10y’ swaption respectively.
Once we have obtained the optimal value for β and the displacement parameter s, we can calibrate
the other SABR parameters: α, ρ, and ν. Figure 6.4 shows the effect of a change in one of these
parameters while the other parameters remain unchanged. Again, the relationship between α and β,
as given in (4.16), is clearly visible: an increase (decrease) in α or a decrease (increase) in β leads to an
increase (decrease) in all of the implied volatilities. So a shift in one of these two parameters results in
a vertical shift of the entire volatility structure.
The lower-left panel shows that a change in ρ leads to a tilt in the volatility skew: an increase
(decrease) in ρ results in a decrease (increase) of the implied volatility for the OTM receiver swaption
strikes and an increase (decrease) for the OTM payer swaption strikes. Finally, a shift in ν affects the
structure in yet another way: an increase (decrease) in ν leads to a more (less) curved volatility structure.
These responses to a change in one of the SABR parameters hold in general for every swaption. The
plots are based on the ’5y5y’ swaption, but for this reason also apply to swaptions with another
expiry-tenor combination, such as the ’10y10y’ swaption.
[Figure panels (10-Mar-2017, ’5y5y’): market volatilities with SABR smiles for α ∈ {0.0344, 0.0844, 0.1344}, β ∈ {0.6227, 0.7227, 0.8227}, ρ ∈ {−0.6896, −0.1896, 0.3104}, and ν ∈ {0.0669, 0.2669, 0.4669}.]
Figure 6.4: Effect of changes in SABR parameters for the ’5y5y’ swaption.
6.2 Fitting a model to the SABR parameter time series
Now that we have calibrated the SABR volatility model, we obtain the volatility structure for our
expiry-tenor combination. Our input strikes are relative to the par swap rate, so they differ over our
time period. For the next step in our research, we focus on one fixed interval of strikes for the entire
time period. We recompute the volatilities with the formula suggested by Obłój (2008) and our calibrated
SABR parameters for 100 strikes equally spaced on the interval between 0.1% and 3.0%.
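The thesis evaluates the smile with Obłój’s (2008) refinement; the closely related Hagan et al. (2002) lognormal approximation, sketched below on shifted forwards and strikes, reproduces the same qualitative behaviour. The parameter values are taken from the panels of Figure 6.4 and the displacement of 1.25%; the forward is illustrative.

```python
import math

def sabr_vol(F, K, T, alpha, beta, rho, nu):
    """Hagan et al. (2002) lognormal SABR implied-volatility approximation."""
    if abs(F - K) < 1e-12:                      # ATM limit of the formula
        term = (((1 - beta) ** 2 / 24) * alpha ** 2 / F ** (2 - 2 * beta)
                + 0.25 * rho * beta * nu * alpha / F ** (1 - beta)
                + (2 - 3 * rho ** 2) / 24 * nu ** 2)
        return alpha / F ** (1 - beta) * (1 + term * T)
    logFK = math.log(F / K)
    FK = (F * K) ** ((1 - beta) / 2)            # geometric-mean factor (F K)^((1-beta)/2)
    z = (nu / alpha) * FK * logFK
    xz = math.log((math.sqrt(1 - 2 * rho * z + z ** 2) + z - rho) / (1 - rho))
    denom = FK * (1 + (1 - beta) ** 2 / 24 * logFK ** 2
                  + (1 - beta) ** 4 / 1920 * logFK ** 4)
    term = (((1 - beta) ** 2 / 24) * alpha ** 2 / FK ** 2
            + 0.25 * rho * beta * nu * alpha / FK
            + (2 - 3 * rho ** 2) / 24 * nu ** 2)
    return (alpha / denom) * (z / xz) * (1 + term * T)

s = 0.0125                                       # displacement
F = 0.0162 + s                                   # shifted forward (illustrative)
params = dict(alpha=0.0344, beta=0.7227, rho=-0.1896, nu=0.2669)
strikes = [0.001 + i * (0.03 - 0.001) / 99 for i in range(100)]
smile = [sabr_vol(F, K + s, 10.0, **params) for K in strikes]
# a higher vol-of-vol nu should lift the wing of the smile
wing_low_nu = sabr_vol(F, strikes[0] + s, 10.0, 0.0344, 0.7227, -0.1896, 0.0669)
wing_high_nu = sabr_vol(F, strikes[0] + s, 10.0, 0.0344, 0.7227, -0.1896, 0.4669)
```

The wing comparison reproduces the ν-panel of Figure 6.4: a larger vol-of-vol curves the smile upwards away from the money.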
The calibrated SABR parameters for a fixed displacement of 1.25% are displayed in the left part
of Figure 6.5. As can be seen from this plot, the parameter ρ is very unstable up to 25/Mar/2015.
We expect that these unstable results are due to the relatively high displacement for this period. The
right part of Figure 6.5 shows the calibrated SABR parameters again, but now with the dynamic
displacement. The dynamic displacement is significantly lower in the first months of 2015, which solves
the problem of ρ being unstable. This clearly shows the importance of using the right magnitude of the
displacement parameter.
The same steps are followed for the ’5y5y’ swaption and the results are similar. The calibrated
SABR parameters are displayed in Section A.3. Different magnitudes of the fixed and dynamic displace-
ment parameter are proposed for the ’5y5y’ swaption, but again the first months of our time grid are
calibrated with a relatively high value for the fixed displacement parameter. The dynamic displacement
parameter, which is related to the level of the interest rates in the current period, also results in more
stable SABR parameters for the ’5y5y’ swaption.
Figure 6.5: SABR parameters for the ’10y10y’ swaption.
We will now try to capture the linear interdependencies among our variables α, ρ, and ν. We focus
on the ’10y10y’ swaption with a dynamic displacement parameter. Again the decision to focus on
the ’10y10y’ swaption is based on the fact that its quoted premiums are the most reliable and
complete. The dynamic displacement is preferred because it results in a more stable time series of
the parameters. This combination is therefore the most promising in terms of leading to reliable
estimates of our risk measures.
As discussed in Section 6.1, the volatility surface depends on the magnitude of the displacement
parameter. This results in some shocks in our calibrated SABR parameters: a shift in the dynamic
displacement parameter changes the volatility structure and therefore yields slightly different
calibrated SABR parameters. In Figure A.5 of the appendix, the SABR parameters and the level of the
dynamic displacement parameter are displayed in one figure. These plots give a clear view of the effect
of the level of displacement on the calibrated SABR parameters. For now we do not adjust our time
series analysis to deal with these small shocks, but we note their occurrence.
To estimate the one-day-ahead forecasts, we use a moving window of n = 250 observations. The
first estimation is based on the interval t1, . . . , tn, where t1 is the first day of our data set,
namely 13/Jan/2015, and tn represents 27/Jan/2016. This results in the first estimated profit or loss
on 28/Jan/2016. We fit a new autoregressive model for every day between 28/Jan/2016 and
01/Jun/2017. The moving window method implies that we use the interval t2, . . . , tn+1 to estimate
tn+2, and so on. The SABR parameters α, ρ, and ν are shown in Figure 6.6, together with their first
differences.
Figure 6.6: SABR parameters and first differences for the ’10y10y’ swaption with dynamic displacement.
We now determine how many lags p to include in our vector autoregression. To do so, we first
check the sample ACFs of the parameters and of the parameters in first differences. Plots of
the ACFs for different intervals can be found in Section A.4. These ACFs are by themselves not enough
to decide how many lags to include: for the parameters in first differences, some autocorrelation remains
significant up to lag 200. It does not make sense to include this many lags in our estimation, so we
turn to the lag order selection criteria. The values of several criteria are compared in Table A.2 in the
appendix. Following the argumentation of Liew (2004), we base our selection on the Hannan-Quinn
criterion because of our large sample. We check lags between zero and twenty, and the Hannan-Quinn
criterion reaches its minimum on this interval at three lags. For this reason we include p = 3
lags in our vector autoregressive model.
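The lag search can be sketched as follows, assuming the common definition HQ = log|Σ̂| + 2 log(log T)·n/T with n the number of estimated coefficients. The data below is a simulated three-variable VAR(1) of the thesis’s window length, not the calibrated parameters, so the sketch only illustrates the mechanics (note also that the effective sample shrinks slightly with p, a standard caveat when comparing criteria).

```python
import math
import numpy as np

def fit_var(y, p):
    """Equation-by-equation OLS fit of a VAR(p) with intercept."""
    T, k = y.shape
    X = np.ones((T - p, 1 + k * p))
    for lag in range(1, p + 1):
        X[:, 1 + (lag - 1) * k: 1 + lag * k] = y[p - lag: T - lag]
    Y = y[p:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B
    return B, resid.T @ resid / (T - p)          # ML residual covariance

def hannan_quinn(y, p):
    """HQ criterion: log|Sigma_hat| + 2 log(log T_eff) * n_coef / T_eff."""
    _, sigma = fit_var(y, p)
    t_eff, k = y.shape[0] - p, y.shape[1]
    n_coef = k * (1 + k * p)                     # intercepts plus lag matrices
    return np.linalg.slogdet(sigma)[1] + 2.0 * math.log(math.log(t_eff)) * n_coef / t_eff

# simulated 3-variable VAR(1) sample of the window length used in the thesis
rng = np.random.default_rng(1)
y = np.zeros((250, 3))
for t in range(1, 250):
    y[t] = 0.5 * y[t - 1] + rng.standard_normal(3)
hq = {p: hannan_quinn(y, p) for p in range(1, 6)}
```

For this VAR(1) data the criterion is minimised at one lag, mirroring how the criterion trades the log-determinant of the residual covariance against the parameter count.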
We estimate the parameters of our VAR(3) model based on our moving window of 250 observations.
The results of this estimation for the first period are given in Section A.5. One can use the VAR model
to obtain forecasts of the SABR parameters; a 10-day-ahead forecast, showing the estimated trend of
the parameters based on our fitted VAR(3) model, can be found in Section A.7. However, we are
especially interested in the one-day-ahead forecasts. For this reason we simulate
20000 different one-day-ahead forecasts of our SABR parameters.
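This simulation step can be sketched as follows: fit the VAR(3) by OLS on a 250-day window and generate one-step-ahead draws by adding resampled residuals to the conditional mean (a residual bootstrap; the thesis may instead draw Gaussian innovations from the fitted covariance). The window below is synthetic stand-in data.

```python
import numpy as np

rng = np.random.default_rng(42)

def var_one_step_draws(window, p=3, n_sims=20000):
    """Fit a VAR(p) by OLS and simulate one-step-ahead draws via a residual bootstrap."""
    T, k = window.shape
    X = np.ones((T - p, 1 + k * p))
    for lag in range(1, p + 1):
        X[:, 1 + (lag - 1) * k: 1 + lag * k] = window[p - lag: T - lag]
    Y = window[p:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B
    # regressor vector for t+1: [1, y_T, y_{T-1}, y_{T-2}] in lag order
    x_next = np.concatenate([[1.0], window[-1:-p - 1:-1].ravel()])
    mean = x_next @ B
    return mean, mean + resid[rng.integers(0, len(resid), size=n_sims)]

window = rng.standard_normal((250, 3))   # synthetic stand-in for (alpha, rho, nu)
mean, draws = var_one_step_draws(window)
```

Each of the 20000 draws is a candidate one-day-ahead parameter vector; pushing every draw through the SABR pricing formulas then yields the simulated profit and loss distribution used below.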
We first want to check the fit of our vector autoregressive model to the data, and multiple
diagnostic tests are performed. These tests show that the VAR model does not fit the data as well
as required. The VAR models are stable, which can be verified by checking the inverse roots of the
characteristic polynomial. However, the preferred number of lags differs if we compare the
Hannan-Quinn information criterion across estimation windows over time: instead of the three
lags used now, some estimation windows prefer one lag and others as many as thirteen.
We also perform some other diagnostic tests based on the VAR(3) model and
show these results in Section A.6.
Recall that we want to model the dynamic structure of the time series such that the remaining
residuals are white noise. First we check whether significant autocorrelation remains in the
residuals by performing the Portmanteau test and the LM test for serial correlation. The null hypothesis
of the Portmanteau test states that there is no serial correlation up to lag h; this hypothesis is rejected
for h > 6 at a significance level of 1%. Furthermore, the null hypothesis of the LM test states that
there is no serial correlation, and at the same significance level this hypothesis is rejected for lags 6, 12,
13 and 19 if we take up to 20 lags into account. Subsequently, the White test is performed to check for
heteroskedasticity in the errors. The test is carried out both with and without cross terms and rejects
the null hypothesis for every combination of the individual components. The test without cross terms
tests for heteroskedasticity only, while the test with cross terms also tests for a specification error. Both
show that we have not captured the dynamics of our parameters as intended. In the final test,
we assess whether the residuals follow a multivariate normal distribution. This Jarque-Bera test uses
the square root of the correlation matrix as orthogonalization method and rejects the null hypothesis of
normality. Only the test on the skewness of the residuals of the equation for ν does not reject the null
hypothesis; for every other component the null is rejected. We conclude that the VAR(3)
model does not capture the dynamics of the SABR parameters over time.
The diagnostic tests show that the vector autoregressive model is not able to capture all the dynamics
of the SABR parameters. We note these findings, but nevertheless compute the risk measures
based on these simulations and compare them to the estimates of the
Historical Simulation method. We then perform the backtests and, in addition, one
robustness check. In Section 6.5, we estimate the local level model and compare the
simulations based on this model to those generated by the VAR(3) model.
6.3 Risk measurement
Now we use the simulated SABR parameters for the 28/Jan/2016 - 01/Jun/2017 period to compute 20000
volatility structures for each day. Based on these volatility structures, we then recompute the premiums
of the swaptions on the fixed range of strikes. These premiums are used to create the profit and
loss distribution. We focus on a portfolio of three different swaptions. A strangle is one of the most
popular trading strategies, so we focus on this strategy and complete the portfolio by adding an ATM
payer swaption. This results in the following portfolio:
Π = SwaptionReceiver(K1) + SwaptionPayer(K2) + SwaptionPayer(K3),
where K1 = 0.62%, K2 = 1.62%, and K3 = 2.62%.
This portfolio is based on the par swap rate on the first day of our data set, K2 = 1.62%. The strangle
is a combination of a receiver swaption with strike K1 = ATM − offset and a payer swaption with strike
K3 = ATM + offset, where we have chosen an offset of 1%. A plot of the par swap rates together with
K1, K2, K3, and the used range of strikes is displayed in Section A.8.
We compute the profit and loss distribution for each of the individual strikes based on the Historical
Simulation method, which uses the 249 most recent past returns. We also compute a profit and
loss distribution for every strike based on the 20000 simulated swaption prices. These distributions are
used to compute the 99% VaR by simply selecting the value of the sorted profit and loss distribution that
represents the lowest one percentile. For the 97.5% ES we compute the 97.5% VaR and then take
the expected value within this tail. We compare the Value at Risk and Expected Shortfall estimates
of the two methods in Figure 6.7.
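The quantile selection just described can be sketched as follows (sign convention assumed here: profits positive, VaR and ES reported as positive losses; the P&L sample is a constructed toy example):

```python
import math
import numpy as np

def var_es(pnl, p):
    """VaR_p: the loss at the lowest p-percentile of the sorted P&L sample;
    ES_p: the average loss in the tail at or beyond that quantile."""
    s = np.sort(np.asarray(pnl))                  # ascending: worst outcomes first
    idx = max(int(math.floor(p * len(s))) - 1, 0)
    return -s[idx], -s[: idx + 1].mean()

# toy sample of 100 simulated one-day P&L outcomes
pnl = np.concatenate([[-120.0, -80.0, -30.0], np.linspace(-20.0, 50.0, 97)])
var99, es99 = var_es(pnl, 0.01)        # -> 120.0, 120.0 (single worst outcome)
var975, es975 = var_es(pnl, 0.025)     # -> 80.0, 100.0 (mean of the two worst)
```

With 20000 simulated outcomes per day, the 1% tail contains 200 draws rather than one, so the same indexing carries over directly to the thesis setting.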
Figure 6.7: VAR(3) and Historical Simulation −99% VaR and −97.5% ES.
The graph clearly shows that using a vector autoregressive model together with the SABR model
to estimate the risk measures gives far less stable results. The Historical Simulation method reacts
relatively slowly to changes in the market, because all of the 249 historical returns are taken into account
with equal weights. The VAR(3) model, however, focuses on the more recent dynamics of the SABR model
parameters. We also see that the somewhat less stable first part of our calibrated SABR parameters
results in higher estimates of the VaR and ES. The figures on the next page show the losses of our
portfolio over time together with the estimates of the two risk measures. The percentage of violations is
also given in the title of these plots.
[Figure panels: portfolio losses over time with, respectively, the −99% HS VaR (1.105% of returns violated), the −99% VAR(3) VaR (0.55096% violated), the −97.5% HS ES (0.82873% violated), and the −97.5% VAR(3) ES (0.55096% violated).]
Figure 6.8: Losses over time for both methods with estimated VaR and ES.
We can now compare the proportion of violations with the theoretical value p. The Historical Sim-
ulation method gives the results we would expect: the 99% VaR results in four violations on the
estimated interval, which is close to the expected one percent of the total number of estimations. The
values for the 97.5% ES are shown in the lower two plots of Figure 6.8. The titles of these plots also
show a percentage of violations, but we note that this cannot be used to assess the accuracy of the
Expected Shortfall estimates; the models that produce the ES forecasts are backtested in Section
6.4.6. The VAR(3) model, on the other hand, results in only two losses larger than the 99% VaR. This is
in itself not that strange, but the values of the risk measures themselves are. Some estimates do not make
sense, because they are either extremely high or far too low. For the 99% Value at Risk, for example, we
find values ranging from 354.0662 down to −31.8681. A VaR of 354.0662 corresponds to a very large loss
of our portfolio and is not very likely to be correct, let alone the value of −31.8681, which would mean
that we are 99% sure that the return of the portfolio over this one day is at least 31.87 euros. The simulated
returns are displayed in gray in Figure 6.9. The mean of each set of simulated returns is also displayed
and compared to the actual returns based on the data. Figure A.9 plots the difference between
the mean of the simulation and the actual return. These errors are compared to the errors of the
Historical Simulation method, defined as the difference between the mean of the Historical Simulation
profit and loss distribution and the actual return. The mean squared error (MSE) is 563.7987 for the
VAR(3) model and 378.7922 for the Historical Simulation method.
Figure 6.9: VAR(3)-model simulations compared to the data set.
In the next section the results of several backtests are displayed. These statistical tests give us a
better way to assess the quality of the models, which is hard to evaluate based on the number of
violations alone: because we are looking only at the tails of the distributions, we do not have enough
data available to draw conclusions about the quality of these models with enough certainty.
6.4 Backtests
In this section the quality of the two models is assessed by applying several backtests. The
Value at Risk as well as the Expected Shortfall is computed for four different probabilities p =
[0.05, 0.025, 0.01, 0.001]. The Value at Risk and Expected Shortfall are estimated for 363 days. The
number and proportion of Value at Risk violations for both methods are given for the different
probabilities p in Table 6.2.
Historical Simulation VAR(3)-model
p Risk measure Violations Proportion Violations Proportion
0.05 VaR 18 4.9587% 9 2.4793%
0.025 VaR 10 2.7548% 4 1.1019%
0.01 VaR 4 1.1049% 2 0.5510%
0.001 VaR 0 0% 1 0.2755%
Table 6.2: Proportions and total number of violations for different values of p.
The Historical Simulation method results in a number of violations that is very close to the theoretical
values for all four values of p. The vector autoregressive model, on the other hand, deviates
from the theoretical values. The 99.9% Value at Risk, for example, has in theory a probability of one in
a thousand of a loss greater than VaR(0.999). We would therefore not expect to find a violation among
363 estimations in total, but despite this small probability we still find one violation of the VAR(3)
model VaR.
6.4.1 Kupiec
The unconditional coverage test can be used to evaluate this more formally. The test is performed with a
significance level of 5% and the results can be found in Table 6.3. The test confirms the deviation of the
VAR(3) model and rejects the null hypothesis for p = 0.05. Recall the null hypothesis of
this test, E[It] = p: in words, the expected proportion of losses that exceed the VaR is equal to p.
Unlike for the VAR(3) model, the null hypothesis for the Historical Simulation method is
rejected in none of the five cases. We also note that the null hypothesis for the
VAR(3) model VaR estimate with p = 0.025 is close to being rejected: with a significance
level of 6%, the VAR(3) model would also be rejected based on the VaR estimates with p = 0.025.
Historical VaR VAR(3) model VaR
p LR-statistic p-value Reject H0 LR-statistic p-value Reject H0
0.05 0.0013 0.9711 False 5.9146 0.0150 True
0.025 0.0937 0.7596 False 3.6686 0.0554 False
0.01 0.0369 0.8477 False 0.8830 0.3474 False
0.005 0.0183 0.8923 False 0.0183 0.8923 False
0.002 0 1 False 0.0926 0.7609 False
Table 6.3: Kupiec unconditional coverage test, with a significance level of 5%.
The Historical Simulation method satisfies the unconditional coverage property according to Kupiec’s
backtest. The LR-statistics based on the HS method are close to zero for every value of p, which shows
that there is little reason to suspect that H0 does not hold. Moreover, a well-known drawback of these
backtests based on the Value at Risk is that they often have low power, particularly for a small
number of observations, as in the data set used in this research. Nevertheless, this backtest
still indicates that the HS method VaR, unlike the VAR(3) model VaR, satisfies the unconditional
coverage property.
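The LR statistics in Table 6.3 can be reproduced from the violation counts of Table 6.2 (18 for HS and 9 for the VAR(3) model at p = 0.05, over T = 363 days); under H0: E[It] = p the statistic is asymptotically χ²(1).

```python
import math

def kupiec_lr(x, T, p):
    """Kupiec unconditional coverage LR statistic for x violations in T days."""
    def loglik(q):
        # binomial log-likelihood up to a constant; guard against log(0) when x = 0
        out = (T - x) * math.log(1.0 - q)
        if x > 0:
            out += x * math.log(q)
        return out
    return -2.0 * (loglik(p) - loglik(x / T))

lr_hs = kupiec_lr(18, 363, 0.05)      # Historical Simulation at p = 0.05
lr_var3 = kupiec_lr(9, 363, 0.05)     # VAR(3) model at p = 0.05
```

The two values match the 0.0013 and 5.9146 reported in Table 6.3, and the latter exceeds the χ²(1) critical value of 3.84, reproducing the rejection.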
6.4.2 Magnitude-based test
As a next step, we also take the magnitude of the losses into account. This is done by performing a
multivariate backtest on both the normal exceptions and the super exceptions, as described in
Section 4.4.2. The results of this test, shown in the table below, are in line with what we have found so
far. As stated before, the power of most of these tests is in theory relatively low for our data set.
The magnitude-based test rejects the null hypothesis in none of the six cases. However, we again find
very low LR-statistics for the HS VaR. With a larger significance level (e.g. 10%), we would
again reject the null hypothesis in two of the three cases for the VAR(3) model VaR estimates. On the
other hand, we note a different outcome for the p = 0.01 and p′ = 0.002 coverage rates: here the
LR-statistic based on the Historical Simulation method is even larger than the statistic based on the
VAR(3) model. This can possibly be explained by the very small number of VaR violations at these
coverage rates, which makes the results of this test very inaccurate and the model more difficult
to assess.
Historical VaR VAR(3) model VaR
p | p′ LR-statistic p-value Reject H0 LR-statistic p-value Reject H0
0.050 | 0.01 0.0554 0.9727 False 5.9417 0.0513 False
0.025 | 0.005 0.0937 0.9543 False 5.4537 0.0654 False
0.010 | 0.002 1.8220 0.4021 False 1.7756 0.4116 False
Table 6.4: Magnitude-based test, with a significance level of 5%.
To assess the quality of our methods further, we also test the independence property and apply
a duration-based test.
6.4.3 Christoffersen
The third test checks whether the occurrences of a loss greater than the Value at Risk on
two different dates for the same coverage rate are independently distributed. We test with a significance
level of 5% and reject the null hypothesis of independent outcomes in none of the eight cases. This
indicates that the models in general do not violate the independence property.
Historical VaR VAR(3) model VaR
p LR-statistic p-value Reject H0 LR-statistic p-value Reject H0
0.05 1.8791 0.1704 False 0.4577 0.4987 False
0.025 0.5666 0.4516 False 0.0891 0.7653 False
0.01 0.0891 0.7653 False 0.0222 0.8817 False
0.005 0.0222 0.8817 False 0.0222 0.8817 False
0.002 0 1 False 0.0055 0.9407 False
Table 6.5: Christoffersen independence property test, with a significance level of 5%.
Also note that, based on the LR-statistics here, the VAR(3) model actually performs better than the
Historical Simulation method. One of the drawbacks of the HS method Value at Risk estimation
is that it does not always satisfy the independence property. In this case we are not able to reject the null
hypothesis of independence, but we note that the Historical Simulation method performs somewhat
worse than the VAR(3) model.
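A minimal implementation of the independence test from a 0/1 hit sequence (first-order Markov alternative, asymptotically χ²(1)) can be sketched as follows; the two hit sequences below are constructed examples, not the thesis data, chosen to show how clustering drives the statistic.

```python
import math

def christoffersen_ind(hits):
    """Christoffersen independence LR statistic from a 0/1 hit sequence."""
    n = [[0, 0], [0, 0]]
    for prev, cur in zip(hits, hits[1:]):
        n[prev][cur] += 1                       # first-order transition counts
    n00, n01, n10, n11 = n[0][0], n[0][1], n[1][0], n[1][1]
    pi01 = n01 / (n00 + n01)                    # P(hit | no hit yesterday)
    pi11 = n11 / (n10 + n11) if (n10 + n11) > 0 else 0.0
    pi = (n01 + n11) / (n00 + n01 + n10 + n11)  # pooled hit probability

    def ll(prob, stays, jumps):
        out = stays * math.log(1.0 - prob) if stays > 0 else 0.0
        if jumps > 0:
            out += jumps * math.log(prob)
        return out

    l_alt = ll(pi01, n00, n01) + ll(pi11, n10, n11)
    l_null = ll(pi, n00 + n10, n01 + n11)
    return -2.0 * (l_null - l_alt)

clustered = [0] * 50 + [1, 1] + [0] * 50        # back-to-back violations
spread = ([0] * 49 + [1]) * 2 + [0] * 2         # isolated violations
lr_clustered = christoffersen_ind(clustered)
lr_spread = christoffersen_ind(spread)
```

Both sequences contain two violations, yet only the clustered one pushes the statistic past the χ²(1) critical value of 3.84, which is exactly the behaviour the test is designed to detect.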
6.4.4 Duration-based test
The duration-based test is based on a GMM framework (Candelon et al., 2010). The test is performed
for several numbers of moment conditions, but the results are shown only for K = 6, the same number
that Candelon et al. (2010) imposed in their empirical research; the results for other numbers of moment
conditions were similar. The duration-based test can be used to test the unconditional coverage property,
the independence property, and the conditional coverage property. The results for the conditional
coverage property are shown in the table below, and again we are not able to reject the null hypothesis
in any case. However, we note again the difference in GMM-statistics between the two methods, and we
are able to reject the UC property for the VAR(3) model VaR(0.95) with a significance level of 1%. We
do not display the results for p = 0.002, because this results in too few violations: to test the independence
property, we need at least two durations between at least three violations. The results for the tests of
the UC and IND properties separately are shown in Section A.9.
Historical VaR VAR(3) model VaR
p GMM-statistic p-value Reject CC H0 GMM-statistic p-value Reject CC H0
0.05 3.0972 0.9281 False 15.2962 0.0536 False
0.025 0.6814 0.9996 False 12.8338 0.1177 False
0.01 0.9753 0.9984 False 2.9089 0.9399 False
0.005 1.3723 0.9946 False 4.5482 0.8046 False
Table 6.6: Duration-based CC property test, with a significance level of 5% and K = 6.
The HS method performs well in the duration-based test on the conditional coverage property. Again
the results show that the VAR(3) model does not produce better VaR estimates than the HS method,
as also found with the unconditional coverage test and the magnitude-based test.
6.4.5 Kolmogorov-Smirnov
The next step in assessing the quality of the Historical Simulation method is a goodness-of-fit test.
We have 363 different estimation samples, from which we compute 363 different Value
at Risk estimates. The profit and loss distributions, which we use to estimate the upcoming returns,
are composed of 249 equally weighted historical returns. We use the Kolmogorov-Smirnov test to
assess whether the actual returns we observe are uniform draws from the estimation samples. This gives
us the opportunity to test a crucial assumption in our analysis, namely whether the historical returns can
be used with equal weights to obtain an accurate estimate of the one-day-ahead return. The output of
the test is displayed in Table 6.7.
Significance level ks-statistic p-value Reject H0
0.05 0.0386 0.9461 False
Table 6.7: Kolmogorov-Smirnov test for Historical Simulation method.
The Historical Simulation method is used to estimate the profit and loss distribution, and the true
return is compared to this sample. First, we sort the returns of the estimated profit and loss distribution
in ascending order. Then we determine the rank of the observed return in this sorted distribution
and convert it to a relative rank by dividing by the total number of observations in the estimation
window. For the Historical Simulation method to be valid, the observed returns need to be random
draws from the 249 historical returns that represent the estimated profit and loss distribution for every
value of t. We check whether this is the case by evaluating whether the relative rank of the actual return
with respect to the values of the profit and loss distribution is a random draw from the uniform
distribution. The two-sample Kolmogorov-Smirnov test is therefore used to compare the sample of
relative ranks to the theoretical values from the uniform distribution.
The theoretical values from the uniform distribution are plotted together with the sample of relative
ranks in Figure A.10. Table 6.7 already showed that the null hypothesis is not rejected at a
significance level of 5%, and the graph also shows little difference between the two CDFs. To conclude,
based on the Kolmogorov-Smirnov test we are not able to reject the assumption that
the historical returns can be used to estimate current returns.
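The rank construction and the KS distance can be sketched as follows. For brevity this sketch uses the one-sample distance against the U(0,1) CDF rather than the two-sample variant applied in the thesis; the rank computation itself is the same.

```python
import numpy as np

def relative_ranks(samples, actual):
    """Relative rank of each observed return within its estimation sample."""
    return np.array([(np.asarray(s) < a).mean() for s, a in zip(samples, actual)])

def ks_uniform(u):
    """One-sample Kolmogorov-Smirnov distance between ranks u and the U(0,1) CDF."""
    u = np.sort(np.asarray(u))
    n = len(u)
    hi = np.arange(1, n + 1) / n                 # empirical CDF just after each point
    return float(max(np.max(hi - u), np.max(u - (hi - 1.0 / n))))

# a perfectly uniform rank sample of 363 points has KS distance 1/(2n)
d_uniform = ks_uniform((np.arange(363) + 0.5) / 363)
rank = relative_ranks([np.array([1.0, 2.0, 3.0, 4.0])], [3.5])[0]   # -> 0.75
```

If the HS assumption holds, the 363 relative ranks behave like such a uniform sample and the KS distance stays below the 5% critical value of roughly 1.36/√n.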
6.4.6 Expected Shortfall backtest
Next, the backtest based on the Expected Shortfall measure is performed. We focus again on
different values of p and expect to find a value of Z equal to zero under the null hypothesis. The
results of the backtest are shown in Table 6.8; we find values of the Z statistic close to zero for the
Historical Simulation method. The Z statistic is not defined for the Historical Simulation method with
p = 0.001, because the indicator function is in this case equal to zero for all t.
Historical ES VAR(3) model ES
p Z-statistic Reject H0 Z-statistic Reject H0
0.05 0.0292 False 0.4616 False
0.025 -0.0595 False 0.9380 False
0.01 0.0624 False -0.3730 False
0.005 0.1305 False -0.2686 False
0.001 - - -0.8821 True
Table 6.8: Expected Shortfall backtest, with a significance level of 5%.
The Z statistic is strictly negative for the VAR(3) model estimates with p ≤ 0.01, but we only
reject the null hypothesis for the VAR(3) model with p = 0.001. For the other values of p we are not
able to reject the null hypothesis, although the results are in line with what we found based on the
Value at Risk estimates. In terms of Expected Shortfall forecasts, the VAR(3) model thus performs
worse than the Historical Simulation method.
6.5 Robustness check: Local level model
The results that we find based on the VAR(3) model are less stable than we had hoped for. The
main goal is to find an accurate risk measure, but stability is also valued, because a stable risk
measure is more suitable for actual use by a financial institution. Large shifts in the level of the
risk measures make it more difficult and more expensive to adjust the required amount of capital
that needs to be held.
We now also apply the local level model to the time series of SABR parameters in first differences.
The model specification and its parameter estimates are given in Section A.10.
Based on this model, we find the following simulations and VaR estimates over time.
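For reference, the VaR and Expected Shortfall estimates behind such a plot are simple functionals of the simulated P&L sample. A minimal sketch, with a made-up normal sample standing in for the repriced portfolio returns:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative one-day simulated P&L distribution; in the thesis these
# values come from repricing the swaption portfolio under simulated
# one-day-ahead SABR parameters.
simulated_pnl = rng.normal(loc=0.0, scale=40.0, size=10_000)

p = 0.01                                    # 99% confidence level
var_99 = -np.quantile(simulated_pnl, p)     # VaR reported as a positive number
es_99 = -simulated_pnl[simulated_pnl <= -var_99].mean()  # mean tail loss
print(var_99, es_99)
```
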
[Figure: VAR(3) and LLM Value-at-Risk estimates over time; 99% VaR of both models plotted from Q4-2015 through Q3-2017. Legend: VAR(3), LLM.]
Figure 6.10: Local level model simulations.
The results based on the local level model are similar to those found with the VAR(3) model. If
we compare the simulated profit and loss distributions, shown in gray in the right part of Figure
6.10, we see outcomes that are close to the VAR(3) model simulations. Hence, applying the local
level model does not result in more stable simulations. This is unfortunate, but it does corroborate
the results that we found based on the VAR(3) model.
7 Conclusion
The main goal of this research was to find more accurate risk measures for swaptions, based on a time
series analysis of SABR model parameters. An empirical study is used to assess the quality of this
new method compared to the commonly used Historical Simulation method. Before we were able to
apply the time series analysis, the SABR volatility model had to be calibrated. The optimal value for
β was first estimated from the log-log plot of σATM against F, and we found a value close to 0.5 for the
’10y10y’ swaption. Some studies, however, do not make use of this log-log plot and simply set β = 0.5.
When we estimated the optimal value for β based on the ’5y5y’ swaption, we found a value significantly
different from 0.5. Estimating β from the log-log plot is straightforward and takes little time to carry
out. For this reason, we recommend always checking this estimate of β before deciding to set β equal to 0.5.
We then noticed the relationship between the magnitude of the displacement parameter and the
volatility structure. We chose the displaced SABR model to be able to deal with negative interest
rates, and we find that a high value for the displacement parameter can lead to unstable calibrated
SABR parameters. We therefore conclude that it is preferable to use a dynamic displacement
parameter: a displacement that reflects the interest rate environment of the current period results
in a more stable SABR calibration.
In the next step, a time series analysis was applied to the SABR model parameters. After
several diagnostic tests, we conclude that a vector autoregressive model is not able to capture the
dynamic structure of the time series. As a result, we obtain unstable estimates of our risk measures over
time. This is unfavorable, and based on the VAR model we are not able to improve on the
Historical Simulation method. We then also use a local level model to analyze the time series, but find
results similar to those of the VAR model. We recall the research question of this thesis: can one
outperform the Historical Simulation Value at Risk and Expected Shortfall forecasts by fitting a time
series model to the calibrated SABR model parameters instead? Based on our empirical study, we
conclude that we were not able to improve on the Historical Simulation estimates of the risk measures
by using a vector autoregressive model or a local level model.
If we compare the results of the numerous backtests, we conclude that the Historical Simulation
method performs relatively well. The independence property is in general sometimes violated when the
HS method is used, but in our case we do not reject the null hypothesis of independence. However, we
note that the LR statistics are somewhat elevated, and we keep in mind that the power of our backtest
may be too low to reject in our case. Nevertheless, the HS method performs relatively well here, even
though its estimates of the risk measures respond slowly to changes in the profit and loss
distribution. The vector autoregressive model, on the other hand, performs worse in the unconditional
coverage test, the magnitude-based test and the duration-based test. Also, when we test the
estimates of the Expected Shortfall measure, we find that the estimates based on the HS method are
more accurate than those based on the VAR(3) model. The backtests are in line with what we
observed from the estimated risk measures themselves and confirm that the Historical Simulation method
outperforms the vector autoregressive model in the estimation of the risk measures.
In our conclusion, we distinguish between two possibilities. First, it could be the case
that another, more advanced time series model is able to produce better estimates of the risk measures.
This would be interesting for follow-up research. We saw, for example, that the shifts
in the dynamic displacement parameter caused a shift in the calibrated SABR model parameters. These
shifts are ignored in this study, but it would be interesting to check to what extent the estimates of the
risk measures could be improved by taking these shocks into account. On the other hand, it could also
be the case that the uncertainty in the simulated one-day-ahead SABR model parameters simply has too
large an impact on the volatility structure, and as a result also on the price of the swaptions. In that
case, the time series analysis itself is not the main issue. In follow-up research, it would be
interesting to investigate whether one can find better estimates with a more advanced time series model,
and if that does not work, to investigate why this is the case.
References
Acerbi, C. and Szekely, B. (2014). Backtesting expected shortfall. Risk.
Antonov, A., Konikov, M., and Spector, M. (2015). The free boundary SABR: natural extension to
negative rates. Risk.
Barone-Adesi, G., Giannopoulos, K., and Vosper, L. (2002). Backtesting derivative portfolios with filtered
historical simulation (FHS). European Financial Management, 8(1):31–58.
Basel Committee (2013). Fundamental Review of the Trading Book: A revised market risk framework.
Bank for International Settlements.
Berestycki, H., Busca, J., and Florent, I. (2004). Computing the implied volatility in stochastic volatility
models. Communications on Pure and Applied Mathematics, 57(10):1352–1373.
Black, F. (1976). The pricing of commodity contracts. Journal of Financial Economics, 3(1-2):167–179.
Bogerd, K. (2015). Smile risk in expected shortfall estimation for interest rate options. Utrecht University.
Brigo, D. and Mercurio, F. (2007). Interest rate models - theory and practice: with smile, inflation and
credit. Springer.
Campbell, S. (2007). A review of backtesting and backtesting procedures. The Journal of Risk, 9(2):1–17.
Candelon, B., Colletaz, G., Hurlin, C., and Tokpavi, S. (2010). Backtesting value-at-risk: A GMM
duration-based test. Journal of Financial Econometrics, 9(2):314–343.
Christoffersen, P. F. (1998). Evaluating interval forecasts. International Economic Review, 39(4):841–862.
Colletaz, G., Hurlin, C., and Perignon, C. (2013). The risk map: A new tool for validating risk models.
Journal of Banking and Finance, 37(10):3843–3854.
Commandeur, J. J. F. and Koopman, S. J. (2007). An introduction to state space time series analysis.
Oxford University Press.
Du, Z. and Escanciano, J. C. (2015). Backtesting expected shortfall: Accounting for tail risk.
Management Science.
Frankema, L. (2016). Pricing and hedging options in a negative interest rate environment. Delft University
of Technology.
Giordano, L. and Siciliano, G. (2013). Real-world and risk-neutral probabilities in the regulation on the
transparency of structured products. SSRN Electronic Journal.
Gneiting, T. (2011). Making and evaluating point forecasts. Journal of the American Statistical Associ-
ation, 106(494):746–762.
Gurrola, P. and Murphy, D. (2015). Filtered historical simulation value-at-risk models and their
competitors. Bank of England.
Hagan, P., Kumar, D., Lesniewski, S., and Woodward, D. (2002). Managing smile risk. Wilmott Magazine,
1:84–108.
Hull, J. (2012). Options, futures, and other derivatives. Prentice Hall.
Itô, K. (1951). On stochastic differential equations. Memoirs of the American Mathematical Society,
4:1–51.
Kupiec, P. H. (1995). Techniques for verifying the accuracy of risk measurement models. The Journal
of Derivatives, 3(2):73–84.
Liew, V. K.-S. (2004). Which lag length selection criteria should we employ? Economics Bulletin,
3(33):1–9.
Massey, F. J. (1951). The Kolmogorov-Smirnov test for goodness of fit. Journal of the American Statistical
Association, 46(253):68.
Miller, L. H. (1956). Table of percentage points of Kolmogorov statistics. Journal of the American
Statistical Association, 51(273):111.
Moni, C. (2014). Risk managing smile risk with the SABR model. WBS Interest Rate Conference.
Obłój, J. (2008). Fine-tune your smile: correction to Hagan et al. Wilmott Magazine, 1.
Pérignon, C. and Smith, D. R. (2010). The level and quality of value-at-risk disclosure by commercial
banks. Journal of Banking & Finance, 34(2):362–377.
Piontek, K. (2009). The analysis of power for some chosen VaR backtesting procedures: Simulation
approach. Advances in Data Analysis, Data Handling and Business Intelligence, Studies in Classification,
Data Analysis, and Knowledge Organization, pages 481–490.
Pritsker, M. G. (2001). The hidden dangers of historical simulation. Journal of Banking and Finance,
30(2):561–582.
Roccioletti, S. (2016). Backtesting value at risk and expected shortfall. Springer Gabler.
Tsay, R. S. (2005). Analysis of financial time series. Wiley Series in Probability and Statistics.
Uri, R. (2000). A practical guide to swap curve construction. Bank of Canada.
West, G. (2005). Calibration of the SABR model in illiquid markets. Applied Mathematical Finance,
12(4):371–385.
A Appendix
A.1 Data
Boxplots of the data are displayed below. The Euribor data consists of Deposits for all tenors up to and
including three weeks and of Swaps for all of the remaining tenors. The boxplots show the minimum, the
quantiles, and the outliers of the data for every tenor and strike rate, respectively. A value is drawn
as an outlier if it is larger than q3 + w(q3 − q1) or smaller than q1 − w(q3 − q1), with whisker
w = 1.5 and q1 and q3 equal to the 25th and the 75th percentiles of the sample data. We
note that the Euribor data is shown in a more compact way, but this boxplot would still show outliers
if they existed in the data. The boxplot of the swaption premiums, on the other hand, shows multiple
outliers. If the data were normally distributed, we would expect 0.7% of the observations per strike
rate to be outliers, which amounts to 4.3 outliers per strike rate in our sample. However, we notice more
outliers for the swaptions with a strike rate that is further out-of-the-money. This indicates that the
swaption premiums are not identically distributed across strike rates.
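The whisker rule and the 0.7% figure can be checked numerically: for normal data the fences at q1 − 1.5(q3 − q1) and q3 + 1.5(q3 − q1) sit at about ±2.70 standard deviations, so roughly 0.7% of observations fall outside them. A sketch (the function name is illustrative):

```python
import numpy as np

def tukey_outliers(x, w=1.5):
    """Flag values outside [q1 - w*(q3 - q1), q3 + w*(q3 - q1)]."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - w * iqr) | (x > q3 + w * iqr)

rng = np.random.default_rng(2)
frac = tukey_outliers(rng.normal(size=100_000)).mean()
print(frac)  # close to 0.007 for normally distributed data
```
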
[Figure: boxplot of deposit or swap rates in %, per tenor from ON through 60Y.]
Figure A.1: Boxplot of Euribor data.
[Figure: boxplot of swaption premiums in euros per relative strike rate, for strikes from −1.5% to 3%.]
Figure A.2: Boxplot of premiums for the ’10y10y’ swaption.
A.2 Determining the optimal value for β
[Figure: log-log plot of log σATM against log F, showing the data points and an OLS approximation.]
Figure A.3: Log-log plot for the ’5y5y’ swaption.
The OLS estimation gives us the following results:
Log α -(1-β) α β
OLS estimate -2.7005 -0.2809 0.0672 0.7191
Table A.1: OLS estimates for α and β.
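The table follows from a single OLS regression: since σATM ≈ αF^(β−1) at the money, log σATM = log α − (1 − β) log F, so the intercept gives log α and the slope gives −(1 − β). A sketch on simulated data (the "true" values below are made up to mimic Table A.1):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data obeying sigma_ATM = alpha * F**(beta - 1) plus noise.
true_alpha, true_beta = 0.067, 0.72
log_F = rng.uniform(-5.4, -3.8, size=250)
log_sigma = np.log(true_alpha) + (true_beta - 1.0) * log_F \
    + rng.normal(scale=0.02, size=250)

# OLS of log sigma_ATM on log F: intercept = log alpha, slope = -(1 - beta).
slope, intercept = np.polyfit(log_F, log_sigma, 1)
alpha_hat = np.exp(intercept)
beta_hat = 1.0 + slope
print(alpha_hat, beta_hat)
```
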
A.3 Time series of SABR parameters
[Figure: time series of the SABR parameters from Q4-2014 through Q3-2017, left panel with fixed displacement and right panel with dynamic displacement.]
Figure A.4: SABR parameters for the ’5y5y’ swaption.
[Figure: SABR parameters (left axis) and the dynamic displacement in % (right axis) over time, for the ’5y5y’ swaption (left panel) and the ’10y10y’ swaption (right panel).]
Figure A.5: SABR parameters with dynamic displacement.
A.4 Lag length selection
[Figure: sample autocorrelation functions of the SABR parameter series, plotted for lags up to 60 and up to 600.]
Figure A.6: Sample ACFs for SABR parameters with dynamic displacement.
Below, the results for several selection criteria are shown. LR stands for the sequential modified
likelihood-ratio test statistic, FPE is the final prediction error, AIC is the Akaike information criterion,
SC is the Schwarz information criterion and HQ is the Hannan-Quinn information criterion.
The tests in the table are at a 5% level and the optimal lag order selected by each criterion is denoted
with an asterisk.
Lag LogL LR FPE AIC SC HQ
0 2261,784 NA 5,92E-13 -19,64160 -19,59675 -19,62351
1 2290,584 56,59829 4,98E-13 -19,81377 -19,63440* -19,74142
2 2303,829 25,68476 4,80E-13 -19,85069 -19,53678 -19,72407
3 2322,641 35,98778 4,41E-13 -19,93601 -19,48757 -19,75512*
4 2329,173 12,32593 4,51E-13 -19,91455 -19,33157 -19,67939
5 2337,536 15,56151 4,53E-13 -19,90901 -19,19150 -19,61958
6 2349,805 22,51040 4,41E-13 -19,93743 -19,08539 -19,59373
7 2361,361 20,90162 4,31E-13 -19,95966 -18,97308 -19,56169
8 2369,328 14,20298 4,36E-13 -19,95068 -18,82957 -19,49845
9 2391,234 38,47807 3,90E-13 -20,06291 -18,80726 -19,55640
10 2398,282 12,19566 3,97E-13 -20,04593 -18,65575 -19,48516
11 2409,427 18,99526 3,90E-13 -20,06458 -18,53987 -19,44954
12 2419,645 17,14795 3,87e-13* -20,07517* -18,41592 -19,40587
13 2425,063 8,95164 4,00E-13 -20,04402 -18,25024 -19,32045
14 2433,286 13,37230 4,04E-13 -20,03727 -18,10896 -19,25943
15 2438,372 8,13667 4,19E-13 -20,00323 -17,94039 -19,17112
16 2449,252 17,12415 4,14E-13 -20,01958 -17,82220 -19,13320
17 2464,643 23,82352 3,93E-13 -20,07516 -17,74325 -19,13451
18 2468,343 5,62935 4,13E-13 -20,02907 -17,56262 -19,03415
19 2471,675 4,98327 4,36E-13 -19,97978 -17,37880 -18,93060
20 2483,380 17,20143* 4,29E-13 -20,00330 -17,26779 -18,89985
Table A.2: VAR lag order selection criteria.
A.5 Vector Autoregression
AR-Stationary 3-Dimensional VAR(3) Model
Effective Sample Size: 246
Number of Estimated Parameters: 30
LogLikelihood: 2432.32
AIC: -4804.63    BIC: -4699.47

              Value        StandardError  TStatistic  PValue
              ___________  _____________  __________  __________
Constant(1)   -4.9616e-05  6.4909e-05     -0.7644     0.44463
Constant(2)   -0.0007553   0.0025274      -0.29885    0.76506
Constant(3)   -0.0003377   0.0017135      -0.19708    0.84377
AR1(1,1)      -0.019815    0.07939        -0.2496     0.8029
AR1(2,1)      -3.6426      3.0912         -1.1784     0.23865
AR1(3,1)      -2.6018      2.0958         -1.2415     0.21444
AR1(1,2)      0.001924     0.0021331      0.90198     0.36707
AR1(2,2)      -0.34306     0.083057       -4.1305     3.62e-05
AR1(3,2)      -0.055376    0.056311       -0.98341    0.32541
AR1(1,3)      0.0019307    0.0028546      0.67634     0.49883
AR1(2,3)      -0.048717    0.11115        -0.43829    0.66118
AR1(3,3)      0.016625     0.075359       0.22061     0.8254
AR2(1,1)      0.11722      0.078275       1.4976      0.13425
AR2(2,1)      -10.905      3.0478         -3.578      0.00034629
AR2(3,1)      5.5086       2.0664         2.6658      0.0076798
AR2(1,2)      -0.0036061   0.0021917      -1.6453     0.099902
AR2(2,2)      -0.14852     0.08534        -1.7403     0.081801
AR2(3,2)      -0.033632    0.057859       -0.58128    0.56105
AR2(1,3)      -0.0113      0.0028421      -3.9758     7.0137e-05
AR2(2,3)      0.30217      0.11067        2.7304      0.0063252
AR2(3,3)      -0.24748     0.07503        -3.2985     0.00097215
AR3(1,1)      0.050474     0.079474       0.6351      0.52537
AR3(2,1)      -7.2978      3.0945         -2.3583     0.018359
AR3(3,1)      -0.36743     2.098          -0.17513    0.86098
AR3(1,2)      -0.0016881   0.0021352      -0.79059    0.42918
AR3(2,2)      -0.0095256   0.083142       -0.11457    0.90879
AR3(3,2)      -0.1261      0.056368       -2.2371     0.025282
AR3(1,3)      -0.012139    0.0028895      -4.2012     2.6556e-05
AR3(2,3)      0.54294      0.11251        4.8257      1.3953e-06
AR3(3,3)      -0.29708     0.07628        -3.8946     9.835e-05

Innovations covariance matrix:
 0.0000  -0.0000   0.0000
-0.0000   0.0016  -0.0006
 0.0000  -0.0006   0.0007

Innovations correlation matrix:
 1.0000  -0.5863   0.4330
-0.5863   1.0000  -0.5310
 0.4330  -0.5310   1.0000
A.6 Evaluating the time series analysis
The results of the performed diagnostic tests are summarized below. The lag length selection criteria are
shown for different samples and the other tests are all based on the first estimation window (13/Jan/2015
- 06/Jan/2016).
Portmanteau test LM test (χ2(9))
Lags Q-Stat Prob. Adj Q-Stat Prob. df LM-Stat Prob
1 0,395149 NA* 0,396762 NA* NA* 10,44895 0,3154
2 1,948446 NA* 1,96279 NA* NA* 11,24182 0,2595
3 4,111961 NA* 4,153016 NA* NA* 16,09898 0,0648
4 7,868684 0,5474 7,971833 0,537 9 4,45073 0,8793
5 11,26682 0,8827 11,44047 0,8747 18 3,858909 0,9205
6 44,47365 0,0185 45,47747 0,0145 27 37,52362 0
7 58,42196 0,0105 59,83431 0,0076 36 13,75203 0,1314
8 78,40643 0,0015 80,49053 0,0009 45 20,5727 0,0147
9 93,57406 0,0007 96,23414 0,0004 54 15,46203 0,079
10 102,7398 0,0012 105,7882 0,0006 63 9,46408 0,3956
11 115,5032 0,0009 119,149 0,0004 72 13,45859 0,1429
12 139,618 0,0001 144,5006 0 81 24,38307 0,0037
13 173,1128 0 179,8642 0 90 36,33583 0
14 177,9179 0 184,9593 0 99 5,150915 0,821
15 184,9171 0 192,4129 0 108 7,294864 0,6064
16 195,7306 0 203,9787 0 117 11,4214 0,2479
17 212,7453 0 222,2564 0 126 17,5525 0,0407
18 224,2442 0 234,6632 0 135 12,33173 0,1952
19 243,6628 0 255,7071 0 144 21,86051 0,0093
20 252,3717 0 265,1867 0 153 8,914652 0,4452
Table A.3: Portmanteau test and LM test for auto correlation.
Without cross terms
Dependent R-squared F(18,227) Prob. χ2(18) Prob.
res1*res1 0,646044 23,01793 0 158,9268 0
res2*res2 0,546893 15,22139 0 134,5356 0
res3*res3 0,283992 5,001986 0 69,86214 0
res2*res1 0,657318 24,19009 0 161,7002 0
res3*res1 0,619349 20,51922 0 152,3597 0
res3*res2 0,588867 18,06292 0 144,8612 0
Joint test (χ2(108)): 312,5637, with prob. 0.
With cross terms
Dependent R-squared F(18,227) Prob. χ2(18) Prob.
res1*res1 0,929864 46,89429 0 228,7466 0
res2*res2 0,910374 35,92741 0 223,952 0
res3*res3 0,465922 3,085659 0 114,6168 0
res2*res1 0,981364 186,2624 0 241,4156 0
res3*res1 0,924792 43,49284 0 227,4988 0
res3*res2 0,927955 45,55747 0 228,2768 0
Joint test (χ2(324)): 720,2421, with prob. 0.
Table A.4: White heteroskedasticity test.
Component Skewness χ2 df Prob.
1 -2,294449 92,91814 1 0
2 -0,433951 7,546527 1 0,006
3 -1,176436 40,00863 1 0
Joint test 140,4733 3 0
Component Kurtosis χ2 df Prob.
1 20,48321 20,32191 1 0
2 18,2069 444,9114 1 0
3 22,57069 417,4646 1 0
Joint test 882,698 3 0
Component Jarque-Bera df Prob.
1 113,24 2 0
2 452,458 2 0
3 457,4732 2 0
Joint test 1023,171 6 0
Table A.5: Normality test.
A.7 Forecasts
Figure A.7: Forecasts based on VAR(3) model fitted to ’10y10y’ swaption with dynamic displacement.
A.8 Risk measurement
[Figure: range of strikes and the par swap rate in % from Q4-2014 through Q3-2017. Legend: par swap rate, fixed strike interval, K1, K2, K3.]
Figure A.8: Range of strikes and the par swap rate over time.
[Figure: overestimate in euros over time for the VAR(3) model (MSE: 563.7987) and the Historical Simulation method (MSE: 378.7922).]
Figure A.9: Comparison between actual and mean of estimated returns.
A.9 Backtests
Historical VaR VAR(3) model VaR
p GMM-statistic p-value Reject UC H0 GMM-statistic p-value Reject UC H0
0.05 0.2006 0.6542 False 7.5003 0.0062 True
0.025 0.2234 0.6365 False 1.4720 0.2250 False
0.01 5.3872e-04 0.9815 False 0.8001 0.3711 False
0.005 4.0201e-04 0.9840 False 0.8975 0.3434 False
Table A.6: Duration-based UC property test, with a significance level of 5% and K = 6 moment
conditions.
Historical VaR VAR(3) model VaR
p GMM-statistic p-value Reject IND H0 GMM-statistic p-value Reject IND H0
0.05 2.8965 0.9407 False 7.7958 0.4537 False
0.025 0.4581 0.9999 False 11.3618 0.1820 False
0.01 0.9747 0.9984 False 2.1088 0.9775 False
0.005 1.3719 0.9946 False 3.6506 0.8872 False
Table A.7: Duration-based IND property test, with a significance level of 5% and K = 6 moment
conditions.
[Figure: empirical CDF of the relative ranks plotted against the Uniform(0,1) CDF.]
Figure A.10: Comparison between the empirical and theoretical distribution for the Historical Simulation method.
A.10 Local level model
The local level model that is used is specified as follows:
µ(1)_t = µ(1)_{t−1} + c1 ε(1)_t
µ(2)_t = µ(2)_{t−1} + c2 ε(2)_t
µ(3)_t = µ(3)_{t−1} + c3 ε(3)_t
α_t = c4 µ(1)_t + c7 µ(2)_t + c10 µ(3)_t + c13 η_t
ρ_t = c5 µ(1)_t + c8 µ(2)_t + c11 µ(3)_t + c14 η_t
ν_t = c6 µ(1)_t + c9 µ(2)_t + c12 µ(3)_t + c15 η_t     (A.1)
The following initial state mean and covariance matrix estimates are also used:
Initial state means:
      x1            x2            x3
 -6.5489e-05   5.2572e-04   -4.7769e-05

Initial state covariance matrix:
       x1          x2          x3
x1   1.66e-07   -3.25e-06   -3.92e-08
x2  -3.25e-06    6.65e-04    2.24e-05
x3  -3.92e-08    2.24e-05    3.71e-05
The coefficients of the model equations are estimated by maximum likelihood and this results in the
following values:
Coeff Std Err t Stat Prob.
c(1) -0.00009 0.00624 -0.01466 0.98830
c(2) -0.00005 0.30904 -0.00015 0.99988
c(3) -0.00004 0.36125 -0.00011 0.99991
c(4) 0.08108 3.01323 0.02691 0.97853
c(5) 0.46985 20.65594 0.02275 0.98185
c(6) 0.43666 23.87897 0.01829 0.98541
c(7) 0.02826 0.58728 0.04812 0.96162
c(8) 0.40308 15.57251 0.02588 0.97935
c(9) 0.42118 11.34973 0.03711 0.97040
c(10) 0.02609 5.85120 0.00446 0.99644
c(11) 0.43940 79.70170 0.00551 0.99560
c(12) 0.42112 87.15293 0.00483 0.99614
c(13) -0.00109 0.00004 -26.26307 0
c(14) -0.04227 0.00086 -49.09493 0
c(15) -0.02838 0.00100 -28.50648 0
Final State Std Dev t Stat Prob.
x(1) 0.00011 0.00161 0.07058 0.94373
x(2) -0.00676 0.02566 -0.26360 0.79209
x(3) 0.00535 0.02502 0.21382 0.83069
Table A.8: Parameter estimates of the local level model.
This is based on the first differences of the SABR parameters in the first estimation window, with a
sample size of 249. We also find a log-likelihood of 2307.61, an Akaike information criterion of
-4585.21 and a Bayesian information criterion of -4532.45.
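For intuition, the filtering recursions behind this maximum-likelihood estimation can be sketched for a univariate local level model; the noise levels and data below are made up, and the thesis model is the three-dimensional specification above:

```python
import numpy as np

def local_level_filter(y, sigma_state, sigma_obs, a0=0.0, p0=1e4):
    """Kalman filter for a univariate local level model:
       mu_t = mu_{t-1} + state noise,   y_t = mu_t + observation noise."""
    a, p = a0, p0                   # filtered state mean and variance
    filtered = np.empty_like(y)
    loglik = 0.0
    for t, obs in enumerate(y):
        p = p + sigma_state ** 2    # predict: state variance grows
        f = p + sigma_obs ** 2      # innovation variance
        v = obs - a                 # innovation
        loglik += -0.5 * (np.log(2 * np.pi * f) + v ** 2 / f)
        k = p / f                   # Kalman gain
        a = a + k * v               # update state mean
        p = (1 - k) * p             # update state variance
        filtered[t] = a
    return filtered, loglik

# Simulated example: a random-walk level observed with noise.
rng = np.random.default_rng(5)
level = np.cumsum(rng.normal(scale=0.1, size=500))
y_obs = level + rng.normal(scale=0.5, size=500)
filtered, loglik = local_level_filter(y_obs, sigma_state=0.1, sigma_obs=0.5)
```

Maximizing the returned log-likelihood over the noise parameters yields estimates analogous to the c coefficients in Table A.8.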