
What Can We Learn From High Frequency Data in Finance - Thirty Years Later?

Michel M. Dacorogna

Partner at Prime Re Solutions, Zug, Switzerland

ETH Risk-Center Seminar Series, Zurich, March 20, 2018

M. Dacorogna Prime Re Solutions

High Frequency Data in Finance 1


Introduction

Tribute

The historical work presented here today is the result of teamwork done at Olsen & Associates from 1986 to 2000.

I am indebted to all these very creative team members:

Ulrich A. Muller

Richard B. Olsen, who made it all possible

Olivier V. Pictet

Gilles Zumbach

And all the visitors, like Ramazan Gencay, Gennady Samorodnitsky, or Gerhard Stahl, and students like Fulvio Corsi or Philippe Hartman

I would like to thank F. Corsi and R. Gencay for providing me with updates on the research in this field


Introduction

An Unusual Way at the Time

In the late eighties, high-frequency data meant daily data. The usual frequencies used in models were monthly or quarterly

The approach for most of the work done in economics and finance was to first develop models and then look for data to justify them

Most academics considered data beyond daily frequency as noise, not very useful for understanding financial markets

Quite to the contrary, we set up our research program first to collect data, then to find regularities, and only afterwards to develop models that would reproduce these regularities


Introduction

Timeline

The company Olsen & Associates started in 1985 and I joined it in September 1986.

Years marked on the timeline axis: 1986, 1988, 1991, 1993, 1995, 1996, 1997, 2001

Milestones shown on the timeline:

Collecting HF FX rates

Real-Time Information System

First scientific article in JBF (scaling laws)

First international conference HFDF-I in Zurich

Publication of θ-time in JIMF

Foundation of OANDA, FX Currency Converter on the Internet

Publication of the HARCH model in JEF

First visit of: B. Mandelbrot, H. Bühlmann

First visit of: P. Embrechts, A. Shiryaev

First visit of: R. Engle, T. Bollerslev, F. Diebold

First visit of: G. Samorodnitsky, G. Stahl


Introduction

Selected O&A References

1 U. A. Muller, M. M. Dacorogna, R. B. Olsen, O. V. Pictet, M. Schwarz, and C. Morgenegg, 1990, Statistical study of foreign exchange rates, empirical evidence of a price change scaling law, and intraday analysis, Journal of Banking and Finance, 14, pp. 1189-1208.

2 M. M. Dacorogna, U. A. Muller, R. J. Nagler, R. B. Olsen, and O. V. Pictet, 1993, A geographical model for the daily and weekly seasonal volatility in the FX market, Journal of International Money and Finance, 12, pp. 413-438.

3 U. A. Muller, M. M. Dacorogna, R. D. Dave, R. B. Olsen, O. V. Pictet, and J. E. von Weizsacker, 1997, Volatilities of different time resolutions - analyzing the dynamics of market components, Journal of Empirical Finance, 4, pp. 213-239.

4 M. M. Dacorogna, P. Embrechts, U. A. Muller, and G. Samorodnitsky, 1998, How heavy are the tails of a stationary HARCH(k) process? In Stochastic Processes and Related Topics, Y. Rajput and M. Taqqu, Eds., pp. 1-31, Birkhauser, Boston.

5 U. A. Muller, M. M. Dacorogna, and O. V. Pictet, 1998, Heavy tails in high frequency financial data, In A Practical Guide to Heavy Tails: Statistical Techniques for Analysing Heavy Tailed Distributions, R. J. Adler, R. E. Feldman and M. S. Taqqu, Eds., pp. 55-77, Birkhauser, Boston, MA.

6 G. O. Zumbach, M. M. Dacorogna, J. L. Olsen and R. B. Olsen, 2000, Measuring Shocks in Financial Markets, International Journal of Theoretical and Applied Finance, 3(3), pp. 347-355.

7 H. A. Haukson, M. M. Dacorogna, T. Domenig, U. A. Muller, and G. Samorodnitsky, 2001, Multivariate extremes, aggregation and risk estimation, Quantitative Finance, 1, pp. 79-95.

8 G. O. Zumbach and P. Lynch, 2001, Heterogeneous Volatility Cascade in Financial Markets, available on SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=269188

9 M. M. Dacorogna, U. A. Muller, R. B. Olsen and O. V. Pictet, 2001, Defining efficiency in heterogeneous markets, Quantitative Finance, 1(2), pp. 198-201.


Introduction

Data Frequency and Research in Finance

Relating the type of data available for researchers to the effects and the models that are discovered and developed with these different samples provides insight into the development of research in finance

[Figure 1.1: log-log plot of available sample size (1e+02 to 1e+07) against time scale in days (0.001 to 100), with 10 min, 1 hour, 1 month, and 1 year marked. Annotated regions: Price Formation Process; Intraday Effects (Seasonality); Breakdown of Permanence Hypothesis.]

FIGURE 1.1 Available data samples with their typical sizes and frequency. The sample size and the frequency are plotted on a logarithmic scale. The first point corresponds to the O&A database, the last one to the 700 years of yearly data analyzed by Froot et al. (1995), the second to its left to the cotton price data of Mandelbrot (1963), and the daily data are computed from the sample used in Ding et al. (1993) to show long memory in the S&P 500. The text refers to the effects discovered and analyzes these in the different segments of these samples.

double logarithmic scale makes the points lie almost on a straight line. The data sample with the lowest frequency is the one used by Froot et al. (1995) of 700 years of annual commodity price data from England and Holland. Beyond 700 years, one is unlikely to find reliable economic or financial data (footnote 3). The data with the highest frequency is the Olsen & Associates (O&A) dataset of more than 14 years of high-frequency foreign exchange data. The tick-by-tick data are the highest frequency available. Between those two extremes, one finds the daily series of the Standard & Poors 500 from 1928 to 1991 used by Ding et al. (1993) or the monthly cotton prices used by Mandelbrot (1963) from 1880 to 1940. On this graph, we superimpose those effects that have been identified at these different time scales. One of the questions with data collected over very long periods is whether they really refer to the same phenomenon. Stock indices, for example, change their composition through time due to mergers or the demise of companies. When analyzing the price history of stock indices, the impact of these changes in

3 Data can be found in natural sciences such as weather data up to a few hundred thousand years.


Introduction

Model and Data Frequency

The high-frequency data have opened great possibilities to test market microstructure models, while traditionally low-frequency data are used for testing macroeconomic models. In between lies the whole area of financial and time series modeling, which is typically studied with daily or monthly data as, for instance, option pricing or GARCH models

[Figure 1.2: log-log plot of available sample size (1e+02 to 1e+07) against time scale in days (0.001 to 100), with 10 min, 1 hour, 1 month, and 1 year marked. Annotated regions: Market Microstructure Models; Time Series and Financial Models; Macroeconomic Models.]

FIGURE 1.2 Available data samples with their typical sizes and frequency. The sample size and the frequency are plotted on a logarithmic scale. The text refers to the models developed and tested in the different segments of these samples.

composition is not obvious. We call this phenomenon the “breakdown of the permanence hypothesis.” It is difficult to assess the quality of any inference as the underlying process is not stationary over decades or centuries. At the other end of the frequency spectrum (i.e. with high-frequency data), we are confronted with the details of the price generation process, where other effects, such as how the data are transmitted and recorded in the database (see Chapter 4), have an impact. With data at frequencies of the order of one hour, a new problem arises, due to the fact that the earth turns and the impact of time zones, where the seasonality of volatility becomes very important (as we shall see in Chapter 5) and overshadows all other effects.

Figure 1.2 relates the data to the models that are typically developed and tested with them. The high-frequency data have opened great possibilities to test market microstructure models, while traditionally low-frequency data are used for testing macroeconomic models. In between lies the whole area of financial and time series modeling, which is typically studied with daily or monthly data as, for instance, option pricing or GARCH models. It is clear from this figure that we have a continuum of both samples and models. The antagonism that is sometimes encountered between time series and market microstructure approaches should slowly vanish with more and more studies combining both with high-frequency


Framework and Notations

The FXFX Page of Reuters

Financial markets are the source of high-frequency data. The original form of market prices is tick-by-tick data: each “tick” is one logical unit of information arriving at a time t_j, like a quote or a transaction price

TABLE 2.1 The traditional FXFX page of Reuters.

On this traditional page, the first column gives the time in GMT (for example, for the first line, “07:27”), the second column gives the name of a traded currency (“DEM” for USD-DEM), the third column the name of the bank subsidiary that publishes the quote as a mnemonic (“RABO”), the fourth column the name of the bank (“Rabobank”), the fifth column the location of the bank as a mnemonic (“UTR” for Utrecht), the sixth column gives the bid price with five digits (“1.6290”) and the two last digits of the ask price (“00” which means 1.6300), the seventh column repeats the currency (“DEM”), and the last two columns give the highest (“1.6365”) and the lowest (“1.6270”) quoted prices of the day.

0727 CCY PAGE NAME * REUTER SPOT RATES * CCY HI*EURO*LO FXFX

0727 DEM RABO RABOBANK UTR 1.6290/00 * DEM 1.6365 1.6270

0727 GBP MNBX MOSCOW LDN 1.5237/42 * GBP 1.5245 1.5207

0727 CHF UBZA U B S ZUR 1.3655/65 * CHF 1.3730 1.3630

0727 JPY IBJX I.B.J LDN 102.78/83 * JPY 103.02 102.70

0727 FRF BUEX UE CIC PAR 5.5620/30 * FRF 5.5835 5.5582

0726 NLG RABO RABOBANK UTR 1.8233/38 * NLG 1.8309 1.8220

0727 ITL BCIX B.C.I. MIL 1592.00/3.00 * ITL 1596.00 1591.25

0727 ECU NWNT NATWEST LDN 1.1807/12 * ECU 1.1820 1.1774

XAU SBZG 387.10/387.60 * ED3 4.43/ 4.56 * FED PREB * GOVA 3OY

XAG SBCM 5.52/ 5.53 * US30Y YTM 7.39 * 4.31- 4.31 * 86.14-15

characteristics of the FX markets are reflected by the statistical properties of the data.

Actual trading prices and volumes are not known from the over-the-counter spot market. However, reputation considerations prevent market makers from quoting prices at which they would actually not be willing to trade. Therefore, real transaction prices tend to be contained within the quoted bid-ask spread (Petersen and Fialkowski, 1994). This is also shown by a comparison to simultaneous transaction prices of electronic dealing systems (where the bid-ask spread is narrower).

The growing volume of FX transactions has been increasingly made up of short-term, intraday transactions and results from the interaction of traders with different time-horizons, risk-profiles, or regulatory constraints. Nonfinancial corporations, institutional investors (mutual funds, pension funds, insurance companies), and hedge funds (footnote 9) have shifted their FX activities from long-term (buy and hold) investment to short-term (profit-making) transactions. This movement is both enabled and enhanced by the development of real-time information systems and the decrease of transaction costs following the liberalization of cross-border financial flows. This flow of short and long-term transactions initiated by nonfinancial institutions on the retail market is the origin of an even larger, by a factor

9 The high leverage and unregulated aspects of hedge funds distinguish their investors from other institutional investors.


Framework and Notations

Tick-by-Tick and Time Series

In the late eighties, time series in economics and finance were homogeneous (equally spaced in time). Tick-by-tick data are inhomogeneous and the time between two ticks contains in itself information

Prices on the FX market come in pairs, p_bid and p_ask, as they are “quotes” from the market makers

The most important variable under study is the logarithmic middle price x. At time t_j it is defined as:

$$x(t_j) = \frac{\ln p_{bid}(t_j) + \ln p_{ask}(t_j)}{2} = \ln \sqrt{p_{bid}(t_j) \cdot p_{ask}(t_j)}$$

The inhomogeneous time series x(t_j) can be transformed to a homogeneous one by using an interpolation method; we use the index i for the homogeneous time series:

$$x(t_i) = x(\Delta t; t_i) = \frac{\ln p_{bid}(t_i) + \ln p_{ask}(t_i)}{2}$$
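As a minimal sketch of the logarithmic middle price defined on this slide, the snippet below evaluates both forms of the formula for the USD-DEM quote shown on the FXFX page (bid 1.6290, ask 1.6300). The function name `log_mid_price` is hypothetical, chosen only for this illustration.

```python
import math

def log_mid_price(p_bid, p_ask):
    """Logarithmic middle price x = (ln p_bid + ln p_ask) / 2."""
    if p_bid <= 0 or p_ask <= 0:
        raise ValueError("bid and ask prices must be positive")
    return 0.5 * (math.log(p_bid) + math.log(p_ask))

# USD-DEM quote 1.6290/00 from the FXFX page: bid 1.6290, ask 1.6300.
x = log_mid_price(1.6290, 1.6300)

# Equivalent form: logarithm of the geometric mean of bid and ask.
x_geo = math.log(math.sqrt(1.6290 * 1.6300))
```

The two values agree up to floating-point rounding, which is the identity stated by the second equality in the definition.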


Framework and Notations

Interpolation Methods for Tick Data

There are essentially two interpolation methods:

1 linear interpolation

2 previous tick

[Figure 3.2: schematic of raw ticks along a time axis with equally spaced grid points t0, t1, t2, t3 and the interpolated values at each grid point.]

FIGURE 3.2 Interpolation methods to obtain a homogeneous time series: selecting values at equally spaced time points t_i, indicated by dotted vertical lines. The inhomogeneous time sequence of raw observations is indicated by ticks below the horizontal time axis and by dashed vertical lines (only for the observations bracketing the time points t_i). Two important interpolation methods are illustrated by empty circles: linear interpolation (big circles) and previous-tick interpolation (small circles).

The transformation of an inhomogeneous time series to a homogeneous one can also be understood as the result of a special microscopic time series operator which is discussed in Sections 3.3.1 and 3.4.2.

3.2.2 Price

Prices of assets are the most important variables explored in finance. Depending on the market structure and the data supplier, prices are available as quotes in different forms:

Bid-ask price pairs: p_bid and p_ask

Transaction prices (which may or may not be former bid or ask quotes)

Bid, ask, transaction prices in irregular sequence (not in pairs, not synchronous)

Middle prices

One individual observation at a time t_j, also in the case of bid-ask pairs, is called a tick.

Bid-ask price pairs are discussed first. FX prices and other asset prices, as well as nonprice variables such as spot interest rates and implied volatility figures from option markets, are quoted as bid-ask pairs. The most important variable

Both methods have their merits. Previous-tick interpolation respects causality, as it exclusively uses information already known at time t_0 + iΔt, whereas linear interpolation uses information from time t_{j+1}, which lies in the future of time t_0 + iΔt. Linear interpolation, however, is the appropriate method for a random process with independent and identically distributed (i.i.d.) increments
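The two interpolation methods can be sketched as follows. This is an illustration under stated assumptions, not the O&A implementation: the function names are hypothetical, tick times are assumed sorted in increasing order, and when no later tick exists the linear method falls back to the last known value.

```python
from bisect import bisect_right

def previous_tick(times, values, t):
    """Value of the last tick at or before t (uses only past information)."""
    j = bisect_right(times, t) - 1
    if j < 0:
        raise ValueError("no tick at or before t")
    return values[j]

def linear_interpolation(times, values, t):
    """Interpolate linearly between the two ticks bracketing t.

    Needs the tick after t, i.e. information from the future of t.
    Assumption: if there is no later tick, return the last known value.
    """
    j = bisect_right(times, t) - 1
    if j < 0:
        raise ValueError("no tick at or before t")
    if times[j] == t or j == len(times) - 1:
        return values[j]
    w = (t - times[j]) / (times[j + 1] - times[j])
    return values[j] + w * (values[j + 1] - values[j])

def homogeneous_series(times, values, t0, dt, n, method=previous_tick):
    """Sample the tick series at the grid points t0 + i*dt, i = 0..n-1."""
    return [method(times, values, t0 + i * dt) for i in range(n)]

# Three irregular ticks sampled on the grid t = 0, 5, 10.
tick_times = [0.0, 4.0, 10.0]
tick_values = [1.0, 2.0, 5.0]
prev = homogeneous_series(tick_times, tick_values, 0.0, 5.0, 3)
lin = homogeneous_series(tick_times, tick_values, 0.0, 5.0, 3,
                         method=linear_interpolation)
```

At the grid point t = 5 the previous-tick value is the quote from t = 4, while the linear value already moves toward the quote that only arrives at t = 10, which is the causality issue described above.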


Framework and Notations

Variables of Interest (1/2)

Return and Realized Volatility

The return at time t_i is defined as

$$r(t_i) = r(\Delta t; t_i) = x(t_i) - x(t_i - \Delta t)$$

where x(t_i) is a homogeneous sequence of logarithmic prices

The realized volatility v(t_i) at time t_i is computed from historical data and is also called historical volatility. It is defined as

$$v(t_i) = v(\Delta t, n, p; t_i) = \left\{ \frac{1}{n-1} \sum_{k=1}^{n} \left| r(\Delta t; t_{i-n+k}) - \mu \right|^p \right\}^{1/p}$$

where the regularly spaced returns r are defined as above, and n is the number of return observations. There are two time intervals: the return interval Δt and the size of the total sample, nΔt. The exponent p is often set to 2 so that v^2 is the realized variance of the returns about the mean

$$\mu = \frac{1}{n} \sum_{l=1}^{n} r(\Delta t; t_{i-n+l})$$
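The return and realized-volatility definitions translate almost line by line into code. The sketch below uses hypothetical function names and a made-up four-point log-price series; for p = 2 the result coincides with the sample standard deviation, which the last line checks against Python's `statistics.stdev`.

```python
import statistics

def returns(x):
    """Returns r(t_i) = x(t_i) - x(t_i - dt) of a homogeneous log-price series."""
    return [x[i] - x[i - 1] for i in range(1, len(x))]

def realized_volatility(r, p=2.0):
    """{ 1/(n-1) * sum_k |r_k - mu|^p }^(1/p), with mu the mean of the n returns."""
    n = len(r)
    if n < 2:
        raise ValueError("need at least two returns")
    mu = sum(r) / n
    return (sum(abs(rk - mu) ** p for rk in r) / (n - 1)) ** (1.0 / p)

# Made-up homogeneous log prices, for illustration only.
x = [0.0, 0.01, 0.0, 0.02]
r = returns(x)
v = realized_volatility(r, p=2.0)

# For p = 2 this is the usual sample standard deviation of the returns.
difference = abs(v - statistics.stdev(r))
```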


Framework and Notations

Variables of Interest (2/2)

Bid-Ask Spread and Tick Frequency

In bid-ask price pairs, the ask price is higher than the bid price. The bid-ask spread is their difference. A suitable variable for research studies is the relative spread s(t_j):

$$s(t_j) = \ln p_{ask}(t_j) - \ln p_{bid}(t_j)$$

where j is still the index of the original inhomogeneous time series. The nominal spread, p_ask - p_bid, is in units of the underlying price, whereas the relative spread is dimensionless; relative spreads from different markets can directly be compared to each other

The tick frequency f(t_i) at time t_i is defined as

$$f(t_i) = f(\Delta t; t_i) = \frac{1}{\Delta t} N\{x(t_j) \mid t_i - \Delta t < t_j \le t_i\}$$

where N{x(t_j)} is the counting function and Δt is the size of the time interval in which ticks are counted
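A short sketch of the two variables on this slide, with hypothetical function names. The spread example reuses the USD-DEM quote from the FXFX page; the tick times in the frequency example are invented for illustration.

```python
import math

def relative_spread(p_bid, p_ask):
    """Relative spread s = ln p_ask - ln p_bid (dimensionless)."""
    return math.log(p_ask) - math.log(p_bid)

def tick_frequency(tick_times, t_i, dt):
    """Number of ticks with t_i - dt < t_j <= t_i, divided by dt."""
    count = sum(1 for t_j in tick_times if t_i - dt < t_j <= t_i)
    return count / dt

# USD-DEM quote 1.6290/00: bid 1.6290, ask 1.6300.
s = relative_spread(1.6290, 1.6300)

# Invented tick times; the window (5, 10] contains the ticks at 7 and 10.
f = tick_frequency([1.0, 2.0, 2.5, 7.0, 10.0], t_i=10.0, dt=5.0)
```

Note the half-open window: a tick exactly at t_i - Δt belongs to the previous interval, so consecutive windows never count the same tick twice.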


Empirical Regularities

A Changing Distribution Shape

[Figure 5.6: cumulative frequency on a Gaussian probability scale (0.001 to 0.999) against normalized return (-10 to 10), with curves labeled 10 min, 1 day, and 1 week.]

FIGURE 5.6 The cumulative distributions for 10-min, 1-day, and 1-week USD-JPY returns shown against the Gaussian probability on the y-axis. On the x-axis the returns normalized to their mean absolute value are shown. The mean absolute return for 10 min is 2.62 × 10^-4, for 1 day 3.76 × 10^-3, and for 1 week 1.14 × 10^-2. The three curves are S-shaped as typical of fat-tailed distributions. The S-shapes of the three curves are very differently pronounced.

on whether one uses the linear interpolation method or the previous tick to obtain price values at fixed time intervals at such frequencies. This is an example of the difficulty of making reliable analyses of quoted prices at frequencies higher than 10 min. The divergence of the fourth moment explains why absolute values of the returns are often found to be the best choice of a definition of the volatility (i.e., the one that exhibits the strongest structures) (footnote 4). Indeed, because the fourth moment of the distribution enters the computation of the autocorrelation function of the variance, the autocorrelation values will systematically decrease with a growing number of observations.

To complement Tables 5.1 and 5.2, we plot on Figure 5.6 the cumulative frequency of USD-JPY for returns measured at 10 min, 1 day, and 1 week on the

4 We shall see some evidence of this in Section 5.6.1 and in Chapter 7.

We plot here the cumulative frequency of USD-JPY for returns measured at 10 min, 1 day, and 1 week on the scale of the cumulative Gaussian probability distribution (Q-Q plot)

Normal distributions have the form of a straight line, which is approximately the case for the weekly returns with a moderate (excess) kurtosis of approximately 1.3

The distribution of 10-min returns, however, has a distinctly S-shaped form, which is a sign of fat tails

We are in the presence of a non-stable distribution with fat tails
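The change of distribution shape under aggregation can be illustrated with the excess kurtosis, which is zero for a Gaussian. The sketch below is an assumption-laden toy: the returns are simulated from an i.i.d. mixture of two normal distributions (not FX data), and the function names are hypothetical. Because real returns have dependent volatility, their convergence toward the Gaussian is slower than in this i.i.d. toy.

```python
import random

def excess_kurtosis(r):
    """Sample excess kurtosis m4 / m2^2 - 3 (zero for a Gaussian)."""
    n = len(r)
    mean = sum(r) / n
    m2 = sum((v - mean) ** 2 for v in r) / n
    m4 = sum((v - mean) ** 4 for v in r) / n
    return m4 / (m2 * m2) - 3.0

def aggregate(r, m):
    """Sum non-overlapping blocks of m consecutive returns."""
    return [sum(r[i:i + m]) for i in range(0, len(r) - m + 1, m)]

# Toy fat-tailed returns: standard deviation 3 with probability 0.1, else 1.
rng = random.Random(1)
r = [rng.gauss(0.0, 3.0 if rng.random() < 0.1 else 1.0) for _ in range(50_000)]

k_short = excess_kurtosis(r)                 # clearly positive: fat tails
k_long = excess_kurtosis(aggregate(r, 50))   # much closer to zero
```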


Empirical Regularities

Fat Tails

TABLE 5.7 Extreme risks in the FX market.

Extreme risks over 6 hr for model distributions produced by Monte-Carlo simulations of synthetic time series fitted to USD-DEM, compared to empirical FX data studied through a tail estimation.

Probabilities (p)   1/1 year  1/5 year  1/10 year  1/15 year  1/20 year  1/25 year

Models:
Normal              0.4%      0.5%      0.6%       0.6%       0.7%       0.7%
Student 3           0.5%      0.8%      1.0%       1.1%       1.2%       1.2%
GARCH(1,1)          1.5%      2.1%      2.4%       2.6%       2.7%       2.9%
HARCH               1.8%      2.9%      3.5%       4.0%       4.3%       4.6%

USD rates:
USD-DEM             1.7%      2.5%      3.0%       3.3%       3.5%       3.7%
USD-JPY             1.7%      2.4%      2.9%       3.2%       3.4%       3.6%
GBP-USD             1.6%      2.3%      2.6%       2.9%       3.1%       3.2%
USD-CHF             1.8%      2.7%      3.1%       3.5%       3.7%       4.0%
USD-FRF             1.6%      2.3%      2.8%       3.0%       3.3%       3.4%
USD-ITL             1.8%      2.8%      3.4%       3.8%       4.1%       4.4%
USD-NLG             1.7%      2.5%      2.9%       3.2%       3.4%       3.6%

Cross rates:
DEM-JPY             1.3%      1.9%      2.2%       2.5%       2.6%       2.8%
GBP-DEM             1.1%      1.7%      2.1%       2.3%       2.5%       2.6%
GBP-JPY             1.6%      2.3%      2.7%       3.0%       3.2%       3.4%
DEM-CHF             0.7%      1.0%      1.2%       1.3%       1.4%       1.5%
GBP-FRF             1.1%      1.8%      2.2%       2.5%       2.7%       2.9%

An interesting piece of information displayed in Table 5.7 is the comparison of empirical results and results obtained from theoretical models (footnote 16). The model’s parameters, including the variance of the normal and Student-t distributions, result from fitting USD-DEM 30-min returns. For the GARCH(1,1) model (Bollerslev, 1986), the standard maximum likelihood fitting procedure is used (Guillaume et al., 1994) and the GARCH equation is used to generate synthetic time series. The same procedure is used for the HARCH model (Muller et al., 1997a). The model’s results are computed using the average m and γ obtained by estimating the tail index of 10 sets of synthetic data for each of the models for the aggregated time series over 6 hr. As expected, the normal distribution model fares poorly as far as the extreme risks are concerned. Surprisingly, this is also the case for

16 The theoretical processes such as GARCH and HARCH are discussed in detail in Chapter 8. Here they simply serve as examples for extreme risk estimation.

Extreme risks over 6 hr for model distributions produced by Monte-Carlo simulations of synthetic time series fitted to USD-DEM, compared to empirical FX data studied through a tail estimation
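The slide does not say which tail estimator produced these figures. As one common choice, and purely as an assumption for illustration, the sketch below uses the Hill estimator of the tail index on the k largest absolute values. The function name, the choice of k, and the simulated Pareto sample (true tail index 3) are all hypothetical.

```python
import math
import random

def hill_tail_index(sample, k):
    """Hill estimate of the tail index alpha from the k largest absolute values.

    Assumption: Hill's estimator stands in for the unspecified "tail
    estimation" of the slide; the choice of k is left to the user.
    """
    order = sorted((abs(v) for v in sample), reverse=True)
    if not 0 < k < len(order):
        raise ValueError("k must satisfy 0 < k < len(sample)")
    threshold = order[k]  # the (k+1)-th largest value
    gamma = sum(math.log(order[i] / threshold) for i in range(k)) / k
    return 1.0 / gamma

# Simulated Pareto sample with tail index 3, for illustration only.
rng = random.Random(2)
sample = [rng.paretovariate(3.0) for _ in range(20_000)]
alpha = hill_tail_index(sample, k=1000)
```

On this synthetic sample the estimate lands near the true value of 3; on real returns the result depends noticeably on k, which is why the estimate is usually examined over a range of k values.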


Empirical Regularities

Scaling Laws for Mean Volatility

There is no privileged time interval at which the data and the generating process should be investigated

We analyze the dependence of mean volatility on the time interval on which the returns are measured

The scaling law is empirically found, in good approximation, for a wide range of financial data and time intervals:

$$\{E[|r|^p]\}^{1/p} = c(p)\,\Delta t^{D(p)}$$

where E is the expectation operator, and c(p) and D(p) are deterministic functions of p

The left-hand side of the equation takes this form so that a Gaussian random walk yields a constant drift exponent of 0.5, whatever the choice of p
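The statement about the Gaussian random walk can be checked numerically. The sketch below, with hypothetical function names and arbitrary simulation parameters, estimates the drift exponent D(p) as the least-squares slope of log volatility against log interval size; for a simulated Gaussian random walk the estimate should come out close to 0.5 for both p = 1 and p = 2.

```python
import math
import random

def scaled_volatility(x, step, p=1.0):
    """{E[|r|^p]}^(1/p) for non-overlapping returns over `step` grid points."""
    r = [x[i] - x[i - step] for i in range(step, len(x), step)]
    return (sum(abs(v) ** p for v in r) / len(r)) ** (1.0 / p)

def drift_exponent(x, steps, p=1.0):
    """Least-squares slope of log volatility against log interval size."""
    lx = [math.log(s) for s in steps]
    ly = [math.log(scaled_volatility(x, s, p)) for s in steps]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx

# Simulated Gaussian random walk of log prices (illustrative parameters).
rng = random.Random(0)
x = [0.0]
for _ in range(50_000):
    x.append(x[-1] + rng.gauss(0.0, 1e-4))

steps = [1, 2, 4, 8, 16, 32, 64]
D1 = drift_exponent(x, steps, p=1.0)
D2 = drift_exponent(x, steps, p=2.0)
```

Applied to empirical returns instead of the simulated walk, the same regression gives the empirical drift exponents discussed on the following slide.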


Empirical Regularities

Empirical Scaling Laws

[Figure 5.8: two panels, labeled USD-JPY and GBP-USD. x-axis: log (time interval in seconds), 6.0 to 16.0; y-axis: -8.00 to -3.00; the 10-min, 1-day, and 2-month intervals are marked.]

FIGURE 5.8 Scaling law for USD-JPY (right) and GBP-USD (left). On the y-axis, the natural logarithm of the mean absolute return (p = 1 in Equation 5.10) is reported. The error bars correspond to the mode described in Section 5.5.3. The sample period is January 1, 1987, to December 31, 1995.

Small deviations for the extreme interval sizes can be explained. The data errors grow on both sides, as discussed in Section 5.5.3. For long intervals, the number of observations in the sample becomes smaller and smaller, leading to a growing stochastic error. For very short intervals, the price uncertainty within the bid-ask spread becomes important. In fact, researchers such as Moody and Wu (1995) studied the scaling law at very high frequencies and obtained different exponents because they did not take into account the problem of price uncertainty. Recently, Fisher et al. (1997) also found a break of the scaling law around 2 hr. It is clear on both plots of Figure 5.8 that for time intervals shorter than 1 hour the points start to depart from a straight line. This deviation can be treated in two ways. In a first approach, we treat the price uncertainty as a part of the measurement error, leading to error bars that are as wide as to easily include the observed deviation at short time intervals. In a second approach, the deviation is identified as a bias that can be explained, modeled, and even eliminated by a correction. Both approaches are presented in Section 5.5.3.

In Table 5.8, we report the values of the drift exponent for four of the major FX rates against the USD and for gold for three different measures of volatility. Each of these measures treats extreme events differently. The interquartile range completely ignores them. The measure with p = 2 gives more emphasis to the tails than p = 1. The scaling law exponents D are around 0.57 for all rates and for p = 1, very close to 0.5 for p = 2, and around 0.73 for the interquartile range.

Empirical Scaling Laws. On the y-axis, the natural logarithm of the mean absolute return (p = 1) is reported. The sample period is January 1, 1987, to December 31, 1995

All results indicate a very general scaling law that relates time intervals and realized volatility and applies to different currencies as well as to commodities such as gold and silver or stock indices


Empirical Regularities

Seasonal Volatility

[Figure 5.12: six histograms. Intraday panels (x-axis: intraday, in hours GMT, 0 to 24): Intraday Volatility (absolute return, × 0.0001), Intraday Spread (relative spread, × 0.0001), Intraday Tick Activity (number of ticks). Intraweek panels (x-axis: days of the week, 1 to 7): Intraweek Volatility, Intraweek Spread, Intraweek Tick Activity.]

FIGURE 5.12 Hourly intraday and intraweek distribution of the absolute return, the spread and the tick frequency: a sampling interval of Δt = 1 hour is chosen. The day is subdivided into 24 hours from 0:00 – 1:00 to 23:00 – 24:00 (GMT) and the week is subdivided into 168 hours from Monday 0:00 – 1:00 to Sunday 23:00 – 24:00 (GMT) with index i. Each observation of the analyzed variable is made in one of these hourly intervals and is assigned to the corresponding subsample with the correct index i. The sample pattern does not account for bank holidays and daylight saving time. The FX rate is USD-DEM and the sampling period covers the 6 years from 1987 to 1992.

Intraday volatility in terms of mean absolute returns is plotted in the two top histograms of the figure for USD-DEM

Both histograms indicate distinctly uneven intraday-intraweek volatility patterns. The daily maximum of average volatility is roughly four times higher than the minimum

The pattern can be explained by considering the structure of the world market, which consists of three main parts with different time zones: America, Europe, and East Asia

Similar patterns are found on the stock markets (U-shape) that have opening and closing hours
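The histograms described above come from assigning each hourly observation to its hour of the day (or of the week) and averaging within each bin. A minimal sketch of that binning, with a hypothetical function name and made-up data, is shown below; `period=24` gives the intraday pattern and `period=168` the intraweek pattern, assuming the series starts at Monday 0:00 GMT and has no gaps.

```python
def seasonal_pattern(hourly_returns, period=24):
    """Mean absolute return in each of `period` hourly bins.

    Assumption: the series is contiguous and starts at bin 0
    (e.g. Monday 0:00 - 1:00 GMT); holidays and daylight saving
    time are ignored, as in the figure caption.
    """
    sums = [0.0] * period
    counts = [0] * period
    for i, r in enumerate(hourly_returns):
        b = i % period
        sums[b] += abs(r)
        counts[b] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]

# Two made-up days of hourly returns with a volatility peak at 14:00 GMT.
day1 = [0.001] * 24
day1[14] = 0.004
day2 = [-0.001] * 24
day2[14] = -0.002
pattern = seasonal_pattern(day1 + day2, period=24)
peak_hour = max(range(24), key=lambda h: pattern[h])
```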


Modelling the Data

Θ-Time to Account for the Seasonality of Volatility

[Figure 6.6: ϑ-time (hours in a week, 0 to 192) plotted against physical time (hours in a week, 0 to 168), with the days Mon to Sun marked along the time axis.]

FIGURE 6.6 The time mapping function between physical time and ϑ time. The week chosen to draw this mapping function is a week with no market holidays (September 25 to October 1, 1995). The thin line represents the flow of physical time.

It is difficult to take into account the different holidays of each market accurately (footnote 5). In the framework of the three markets of Table 6.1, our approach is an approximate solution. A holiday is considered if it is common to a large part of one of the three markets of the model. On such holidays, the activity a_{1,k} is set to zero for this market. The holiday is treated like a weekend day in Equation 6.10.

In some countries, there are half-day holidays. Their treatment would require the splitting of the daily activity functions into morning and afternoon parts. This splitting could also be used to model the few Saturday mornings in Japan (until 1989) when the banks were open. These modifications have not been made as they are beyond our objective of modeling the main features of the FX activity patterns.

The daylight saving time observed in two of the markets, Europe and America, has an influence on the activity pattern and thus on ϑ. The presence of local markets depends on local time rather than on GMT. One way to deal with this is to convert

5 Future holidays are not always known in advance as, for instance, the Islamic holidays. Thus, ϑ might no longer be predictable in those special cases.

The intradaily and intraweekly seasonality of volatility is a dominant effect that overshadows many further stylized facts of high-frequency data. In order to continue the research for stylized facts, we need a powerful treatment of this seasonality

Our model for the seasonal volatility fluctuations introduces a new time scale such that the transformed data in this new time scale do not possess intraday seasonalities

The construction of this time scale utilizes two components: the directing process, $\theta(t)$, and a subordinated price process generated from the directing process, $x(t) = x^*[\theta(t)]$. The process $x^*$ does not have intraday seasonality
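The time deformation above can be sketched in a few lines. This is only an illustration under stated assumptions: the directing process is taken to be the cumulative sum of a strictly positive activity profile, rescaled so that a full period has the same length in both time scales, and the names `theta_time` and `sample_in_theta_time` are hypothetical. The activity model of the source (three geographical markets, Table 6.1) is not reproduced here.

```python
import numpy as np

def theta_time(activity, dt):
    """Directing process theta(t) on the grid points 0, dt, 2*dt, ...

    Assumption: theta is the cumulative sum of a strictly positive
    activity profile, rescaled so that theta covers the same total
    length as physical time over the sample.
    """
    activity = np.asarray(activity, dtype=float)
    cumulative = np.concatenate(([0.0], np.cumsum(activity * dt)))
    total_physical = dt * len(activity)
    return cumulative * total_physical / cumulative[-1]

def sample_in_theta_time(x, theta, n_points):
    """Sample x (observed on the physical grid) at points equally spaced in theta.

    x must have the same length as theta. Linear interpolation is used
    both to invert theta(t) and to read off x.
    """
    theta_grid = np.linspace(theta[0], theta[-1], n_points)
    t_index = np.arange(len(theta), dtype=float)
    # Invert theta(t): physical grid position at which each theta value is reached.
    t_at_theta = np.interp(theta_grid, theta, t_index)
    return np.interp(t_at_theta, t_index, x)
```

Sampling a logarithmic price series with `sample_in_theta_time` takes more points from active hours and fewer from quiet hours, which is the sense in which the transformed series loses its intraday seasonality.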


Modelling the Data

Long Memory of Volatility

[Figure: two panels showing the autocorrelation against the lag (in 20-min intervals); the left panel covers lags up to about 700, the right panel lags up to about 4000. A lag of 1 week is marked in both panels.]

FIGURE 7.5 The autocorrelation function of the USD-DEM returns and the absolute returns at 20-min data frequency in ϑ-time. The number of lags is up to 10 ϑ days. The first lag is marked by an empty circle. The exponential decay is shown with a dashed line. The hyperbolic decay fits best to the autocorrelation function of the absolute returns. The figure on the right is the same autocorrelation function for the absolute returns extended to a much larger number of lags with the superimposition of the hyperbolic decay.

in the underlying data-generating process of returns. The rate of decline in the autocorrelation is, however, slower than an exponential decline, which would be expected for a low-order GARCH process, Bollerslev (1986).

The autocorrelation function of volatility (Figure 7.5) is not completely free of seasonalities. A narrow peak can be identified at a lag of 1 week. This peak might be due to the day of the week effects. In our framework, the activity is assumed to be the same for all working days, which may exhibit slight variations across the working days. A small local maximum at a lag of around 1 average business day (one-fifth of a week in ϑ); a small local maximum at a lag of 2 business days and maxima at 3 and 4 business days also exist. A plausible reason for these remaining autocorrelation peaks is a market-dependent persistence of absolute returns. Autocorrelations with a lag of 1 business day compare with the behaviors of the same market participants, whereas autocorrelations with lags of one half or 1 1/2 business days compare with the behaviors of different market participants (on opposite sides of the globe). The market-dependent persistence decreases after 2 business days. The predominance of the “meteor shower hypothesis” found by Engle et al. (1990) is confirmed by the fact that the autocorrelation curve in

The autocorrelation function of volatility decays at a hyperbolic rate rather than an exponential rate

To illustrate the presence of the long memory, two curves, one hyperbolic and one exponential, are drawn in the figure together with the empirical autocorrelation functions

The hyperbolic curve approximates the autocorrelation function much more closely than the exponential curve

Small local maxima also exist at a lag of around 1 average business day (one-fifth of a week in θ-time) and at lags of 2, 3 and 4 business days: they reflect a market-dependent persistence of absolute returns
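The comparison of the two decay laws can be sketched as follows. This is a minimal illustration, not the fitting procedure of the source: both curves are fitted by ordinary least squares in log space, which assumes that all autocorrelation values used are positive, and the function names are hypothetical.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of x at lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    variance = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / variance
                     for k in range(1, max_lag + 1)])

def fit_decays(lags, acf):
    """Fit an exponential and a hyperbolic decay to a positive ACF.

    Exponential: log(rho) = a + b * lag
    Hyperbolic:  log(rho) = a + b * log(lag)
    Returns the two fitted curves and their root mean squared errors.
    """
    lags = np.asarray(lags, dtype=float)
    acf = np.asarray(acf, dtype=float)
    log_acf = np.log(acf)  # assumption: every ACF value is positive

    slope_e, intercept_e = np.polyfit(lags, log_acf, 1)
    exp_fit = np.exp(intercept_e + slope_e * lags)

    slope_h, intercept_h = np.polyfit(np.log(lags), log_acf, 1)
    hyp_fit = np.exp(intercept_h + slope_h * np.log(lags))

    def rmse(fit):
        return float(np.sqrt(np.mean((acf - fit) ** 2)))

    return exp_fit, hyp_fit, rmse(exp_fit), rmse(hyp_fit)
```

Applied to `autocorrelation(np.abs(returns), max_lag)`, a clearly smaller error for the hyperbolic fit than for the exponential fit is the signature of long memory described above.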


Modelling the Data

The Heterogeneous Market Hypothesis

Financial markets are made of traders with different trading horizons. In the heart of the trading mechanisms are the market makers (now also High Frequency Traders, HFT)

A next level up are the intraday traders who carry out trades only within a given trading day

Then there are day traders who may carry positions overnight, short-term traders and long-term traders

Each of these classes of traders may have their own trading tool sets and may possess a homogeneous appearance within their own classes

What we see in the price time series is the sum of these activities

[Figure: market participants arranged by time horizon (price changes): market makers, medium-term traders, long-term traders.]

FIGURE 7.1 Financial markets are made of traders with different trading horizons. In the heart of the trading mechanisms are the market makers. A next level up are the intraday traders who carry out trades only within a given trading day. Then there are day traders who may carry positions overnight, short-term traders and long-term traders. Each of these classes of traders may have their own trading tool sets and may possess a homogeneous appearance within their own classes. Overall, it is the sum of the activities of all traders for all horizons that generates the market prices. Therefore, market activity is heterogeneous with each trading horizon dynamically providing feedback across the distributions of trading classes.

using Equation 3.8, we can take advantage of high-frequency data by choosing a short time interval $\Delta t$ of the analyzed returns. This leads to a large number of observations within a given sample and thus a low stochastic error. At the same time, it leads to a considerable bias in most cases.

In the following bias study, Equation 3.8 is considered in the following form:

$$ v(t_i) = v(\Delta t, n, 2; t_i) = \left\{ \frac{1}{n} \sum_{j=1}^{n} \left[ r(\Delta t; t_{i-n+j}) \right]^2 \right\}^{1/2} \qquad (7.1) $$

The choice of the exponent p = 2 has some advantages here. In Section 5.5, we found that the empirical drift exponent of v is close to the Gaussian value 0.5 if v is defined with an exponent p = 2. Assuming such a scaling behavior and a fixed

M. Dacorogna Prime Re Solutions

High Frequency Data in Finance 20

Page 21: What Can We Learn From High Frequency Data in Finance ... · 0727 frf buex ue cic par 5.5620/30 * frf 5.5835 5.5582 0726 nlg rabo rabobank utr 1.8233/38 * nlg 1.8309 1.8220 0727 itl

Modelling the Data

A Signature of Market Heterogeneity: Fine and Coarse Volatility

For exploring the behavior of various traders, we analyze volatility measured at different frequencies. We define a fine and a coarse volatility as:

$$ v_f(t_i) = \sum_{k=1}^{n} \left| r(\Delta t'; t_{i-1} + k\,\Delta t') \right| \quad \text{and} \quad v_c(t_i) = \left| \sum_{k=1}^{n} r(\Delta t'; t_{i-1} + k\,\Delta t') \right| $$

where $\Delta t' \equiv \Delta t / n$

In the figure below, we illustrate this definition where at every time point, $t_i = t_{i-1} + 6\,\Delta t'$, both quantities are simultaneously defined

[Figure: over the interval from t − 1 to t, the fine volatility is v_f(t) = |r1| + |r2| + |r3| + |r4| + |r5| + |r6| = Σ|rj| and the coarse volatility is v_c(t) = |r1 + r2 + r3 + r4 + r5 + r6| = |Σ rj|.]

FIGURE 7.7 The coarse volatility, v_c(t), captures the view and actions of long-term traders while the fine volatility, v_f(t), captures the view and actions of short-term traders. The two volatilities are calculated at the same time points and are synchronized.
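The two definitions translate directly into code. The following is a minimal sketch assuming the returns are already sampled at the fine resolution Δt′ and are grouped into non-overlapping blocks of n observations; the function name is hypothetical.

```python
import numpy as np

def fine_and_coarse_volatility(returns, n):
    """Fine and coarse volatility over non-overlapping blocks of n returns.

    v_fine   = sum of absolute returns within a block
    v_coarse = absolute value of the summed returns of the block
    Trailing returns that do not fill a complete block are dropped.
    """
    r = np.asarray(returns, dtype=float)
    n_blocks = len(r) // n
    blocks = r[:n_blocks * n].reshape(n_blocks, n)
    v_fine = np.abs(blocks).sum(axis=1)
    v_coarse = np.abs(blocks.sum(axis=1))
    return v_fine, v_coarse
```

With n = 6, a block of returns that alternate in sign has a large fine volatility and a coarse volatility near zero, whereas a block that trends in one direction has both equal. By the triangle inequality the coarse value never exceeds the fine value.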

a synchronous information flow, they would have a symmetric lagged correlation function, $\rho_\tau = \rho_{-\tau}$. The symmetry would be violated only by insignificantly small, purely stochastic deviations. As soon as the deviations between $\rho_\tau$ and $\rho_{-\tau}$ become significant, there is asymmetry in the information flow and a causal relation that requires an explanation.

In a first analysis, we consider a working-daily time series where weekends are omitted. The variables under study are the “fine volatility” and the “coarse volatility.” Fine volatility is the mean absolute working-daily returns averaged over five observations, so covering a full (working) week. Coarse volatility is the absolute return over a full weekly interval.

The correlation between fine volatility and coarse volatility is a function of the number of lags. When the number of lags is zero, the fine and coarse volatilities are completely identical. In the case of first positive or negative lag, the two intervals do not overlap but follow each other immediately.

The panel on the left hand side of Figure 7.8 shows the lagged correlation function for the USD-DEM in a sample longer than 21 years. The correlation maximum is found at lag zero, which is expected. For the nonzero lags, there is an asymmetry where the coarse volatility predicts fine volatility better than the other way around. The asymmetry is significant for the first two lags where the difference $\rho_\tau - \rho_{-\tau}$, represented by the thin curve in Figure 7.8, is distinctly outside the confidence interval for identically and independently distributed observations.

This result can be explained in terms of the heterogeneous market hypothesis presented earlier in this section. For short-term traders, the level of coarse volatility matters because it determines the expected size of trends and thus the scope of trading opportunities. On one hand, short-term traders react to clusters of coarse volatility by changing their trading behavior and so causing clusters of fine volatility. On the other hand, the level of fine volatility does not affect the trading


Modelling the Data

Lead-Lag Correlation

[Figure: two panels showing the lagged correlation between fine and coarse volatility (y-axis from -0.2 to 0.8) against the lag (x-axis from -20 to 20); left panel with the lag in weeks, right panel with the lag in multiples of 3 hr.]

FIGURE 7.8 Asymmetric lagged correlation of fine and coarse volatilities for USD-DEM. The left figure is for working-daily return in a week. The right graph is for high resolution study with half-hourly returns within 3 hr (in ϑ-time). The negative lags indicate that the coarse volatility was lagged compared to the fine volatility. The thin curve indicates the asymmetry. The 95% confidence intervals are for identically and independently distributed observations. The sampling period for the left figure is 21 years and 8 months, from June 6, 1973, to February 1, 1995. The sampling period for the right figure is 8 years, from January 1, 1987, to January 1, 1995.

The intradaily behavior of the lagged correlation is similar for other FX rates and gold (see Table 7.1). The empirical findings are similar across the different rates. The first lag difference is around -0.11 and the second lag difference is around -0.06, which are close to the corresponding values of Table 7.1. In the right panel of Figure 7.8, there is also a weak, rather wide local maximum around lag -11, corresponding to -33 hr in ϑ-time. This corresponds to a lag of about 1 working day (because a working day is 1/5 rather than 1/7 of a business week). The difference $\rho_\tau - \rho_{-\tau}$ also has a significant (negative) peak around lag 11. This effect has been identified in the right panel of Figure 7.8 and discussed in Section 7.3. Following Engle et al. (1990), we call it a “heat wave” effect where traders have a better memory of the events approximately 1 working day ago (when they were active) than a broken number of working days ago (when other traders on different continents, with different time zones, were active).

The peak around lag -11 can be explained by a residual seasonality that the ϑ-scale is unable to capture. However, the ϑ-scale is well able to treat ordinary seasonality as indicated by the lack of an analogous peak around the positive lag

Lagged correlation reveals causal relations and information flow structures in the sense of Granger causality

If two time series were generated on the basis of a synchronous information flow, they would have a symmetric lagged correlation function, $\rho_{-\tau} = \rho_\tau$

Here the deviations between $\rho_{-\tau}$ and $\rho_\tau$ are significant: there is an asymmetry in the information flow and a causal relation, in which coarse volatility predicts fine volatility better than the other way around
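The asymmetry test can be sketched as below. The sign convention is an assumption chosen to agree with the figure caption (negative lags mean that the coarse volatility is taken from an earlier period than the fine volatility), and the function names are hypothetical. Confidence bands are not computed here.

```python
import numpy as np

def lagged_correlation(v_fine, v_coarse, max_lag):
    """rho[tau] = corr(v_fine(t), v_coarse(t + tau)) for tau = -max_lag..max_lag.

    Assumed convention: a negative tau pairs the current fine volatility
    with an earlier coarse volatility.
    """
    f = np.asarray(v_fine, dtype=float)
    c = np.asarray(v_coarse, dtype=float)
    n = len(f)
    rho = {}
    for tau in range(-max_lag, max_lag + 1):
        if tau >= 0:
            a, b = f[:n - tau], c[tau:]
        else:
            a, b = f[-tau:], c[:n + tau]
        rho[tau] = float(np.corrcoef(a, b)[0, 1])
    return rho

def asymmetry(rho, max_lag):
    """Difference rho[tau] - rho[-tau] for tau = 1..max_lag."""
    return {tau: rho[tau] - rho[-tau] for tau in range(1, max_lag + 1)}
```

Under this convention, a negative value of the difference at small positive lags means that past coarse volatility is more strongly correlated with current fine volatility than past fine volatility is with current coarse volatility.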


Modelling the Data

Modelling Market Heterogeneity: the HARCH Model

The idea is to define a set of partial volatilities, $\sigma_j$, measured at different frequencies:

$$ \sigma_{j,t}^2 = \mu_j \, \sigma_{j,t-1}^2 + (1 - \mu_j) \left( \sum_{i=1}^{k_j} r_{t-i} \right)^2 $$

where $k_j = p^{\,j-2} + 1$ for $j > 1$ with $k_1 \equiv 1$, $\mu_j = e^{-2/(k_{j+1} - k_j)}$, and $p$ can be chosen arbitrarily but here is 4

Then, we define the HARCH process as a linear combination of these partial volatilities:

$$ \sigma_t^2 = C_0 + \sum_{j=1}^{n} C_j \, \sigma_{j,t}^2 \quad \text{with} \quad r_t = \sigma_t \, \varepsilon_t $$

and $\varepsilon_t$ is iid $N(0, 1)$

It is easy to prove that a necessary stationarity condition is $\sum_{j=1}^{n} k_j C_j < 1$

We define the impact of the component $j$ as $I_j = k_j C_j$
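A simulation sketch of the process defined above may help to fix the notation. It is an illustration under stated assumptions rather than the estimation code behind the results: the coefficient values in any call are arbitrary, the function names are hypothetical, and the recursion is started from the unconditional variance C_0 / (1 − Σ I_j), which assumes uncorrelated returns.

```python
import numpy as np

def component_sizes(n, p=4):
    """Aggregation sizes k_1..k_{n+1}: k_1 = 1 and k_j = p**(j-2) + 1 for j > 1."""
    return [1] + [p ** (j - 2) + 1 for j in range(2, n + 2)]

def simulate_ema_harch(c0, c, n_steps, p=4, seed=0):
    """Simulate r_t = sigma_t * eps_t with n = len(c) partial volatilities.

    Returns the simulated returns and the impacts I_j = k_j * C_j.
    """
    n = len(c)
    k = component_sizes(n, p)  # 0-based list: k[j] holds k_{j+1}
    mu = [np.exp(-2.0 / (k[j + 1] - k[j])) for j in range(n)]
    impacts = [k[j] * c[j] for j in range(n)]
    if sum(impacts) >= 1.0:
        raise ValueError("stationarity condition sum_j k_j C_j < 1 is violated")

    rng = np.random.default_rng(seed)
    # Assumption: initialise at the unconditional variance of the returns.
    var0 = c0 / (1.0 - sum(impacts))
    burn = k[n - 1]  # longest aggregation window, k_n
    r = np.zeros(burn + n_steps)
    r[:burn] = np.sqrt(var0) * rng.standard_normal(burn)
    partial = np.array([k[j] * var0 for j in range(n)])

    for t in range(burn, burn + n_steps):
        for j in range(n):
            aggregated = r[t - k[j]:t].sum()  # sum of the last k_j returns
            partial[j] = mu[j] * partial[j] + (1.0 - mu[j]) * aggregated ** 2
        sigma2 = c0 + float(np.dot(c, partial))
        r[t] = np.sqrt(sigma2) * rng.standard_normal()
    return r[burn:], impacts
```

For seven components and p = 4, `component_sizes(7)` gives the aggregation sizes 1, 2, 5, 17, 65, 257 and 1025 (plus one further value used only to compute the last decay factor), so the components span horizons from a single return interval to more than a thousand intervals.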


Modelling the Data

Quantifying the Impact of Market Components by Optimizing the HARCH Model

The result of the optimization procedure is a set of $C_j$ coefficients from which the component impacts $I_j$ are calculated. The sum of impacts $I_j$ must be below one for stationarity of the process

The impact of the fourth component is the weakest among all impacts. The fourth component has a typical time horizon of around 12 hr, too long for intraday dealers and too short for other traders. This naturally explains the weakness of that component

The short-term components have, in all cases, the largest impacts. These short-term components model essentially the intraday dealers and the market makers who are known to dominate the markets

The similarity in the impacts of USD-CHF and USD-DEM is plausible, as it is well known that Swiss National Bank policy tightly tied the USD-CHF rate to the USD-DEM rate

The relative weakness of the longer-term components for the GBP-USD is another relevant piece of information. In the late years of the twentieth century, and since 1992 in particular, long-term investors were reluctant to invest in this market and concentrated more on the GBP-DEM cross rate

[Figure: four bar-chart panels (USD-DEM, USD-JPY, USD-CHF, GBP-USD) showing the impact of each market component (y-axis from 0.0 to 0.2) for components 1 to 7, ordered from intraday to medium/long-term.]

FIGURE 8.5 Impacts of market components of HARCH processes with components as defined in Table 8.3. Each HARCH model has been made for a particular FX rate by fitting a half-hourly time series equally spaced in ϑ-time over 7 years. The differences between the impacts, in particular the low values of the fourth component, are highly significant (see the error values of Table 8.4). The values for USD-DEM are those presented in Table 8.4 and they are not fundamentally different from those of other FX rates.

whether we obtain similar features as in the case of foreign exchange rates. To avoid a systematic, deterministic decrease of volatility as explained by Section 5.6.4, we use forward rates for fixed time intervals. The forward rates are labeled according to the market conventions for forward rate agreements. The IxJ forward rate (e.g., the 3x6 forward rate) is the forward rate quoted at time t and applicable for the interval starting at time (t + I) and ending at time (t + J) (I and J are expressed in months). The corresponding time-to-start is I months and the maturity is (J − I) months.

The results of the EMA-HARCH process estimation for 3-hr ϑ-time intervals for the different forward rates for the LIFFE Three-Month Euromark are given


Modelling the Data

HARCH to Forecast Short-Term Volatility (1/2)

We construct a time series of realized hourly volatility, $v_{h,t}$, from our time series of returns as follows:

$$ v_{h,t} = \sum_{i=1}^{a_h} r_{t-i}^2 $$

where $a_h$ is the aggregation factor. In this case, we use data points every 10 min in θ-time, so the aggregation factor is $a_h = 6$
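The aggregation can be sketched as follows, assuming the 10-min returns are already equally spaced in θ-time and that the hourly values are taken on non-overlapping blocks; the function name is hypothetical.

```python
import numpy as np

def realized_volatility(returns, a_h=6):
    """Sum of squared returns over non-overlapping blocks of a_h observations.

    With 10-min returns and a_h = 6, each output value covers one hour.
    Trailing returns that do not fill a complete block are dropped.
    """
    r = np.asarray(returns, dtype=float)
    n_blocks = len(r) // a_h
    squared = r[:n_blocks * a_h] ** 2
    return squared.reshape(n_blocks, a_h).sum(axis=1)
```

As in the definition above, the result is a sum of squared returns, so it is expressed in variance units; taking a square root would be a separate, optional step.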

Forecasts of four different models are compared to the realized volatility. The one-step ahead forecasts are based on hourly returns in θ-time

We use an out-of-sample period of 5 years of hourly data, which represents more than 43,000 observations, to compare the accuracy of four forecasting models to the realized hourly volatility with three quality measures (Qd, the direction quality; Qr, the realized potential; Qf, the improvement of the absolute forecasting errors)


Modelling the Data

HARCH to Forecast Short-Term Volatility (2/2)

TABLE 8.6 Forecasting performance for USD-DEM.

Forecasting accuracy of various models in predicting short-term market volatility. The performance is measured every hour over 5 years, from January 1, 1992, to December 31, 1996, with 43,230 observations. In parentheses, the accuracy of rescaled forecasts is shown.

USD-DEM           Qd               Qr               Qf

Static Optimization
Benchmark         67.7% (67.6%)    54.2% (54.3%)    0.000
GARCH(1,1)        67.8% (67.3%)    58.5% (59.7%)    0.085 (0.072)
HARCH(7c)         69.2% (68.7%)    58.3% (59.2%)    0.134 (0.129)
EMA-HARCH(7)      69.4% (68.8%)    60.7% (62.5%)    0.140 (0.128)

Dynamic Optimization
Benchmark         67.7% (67.4%)    54.2% (54.6%)    0.000
GARCH(1,1)        67.0% (66.0%)    59.5% (59.8%)    0.074 (0.057)
HARCH(7c)         67.7% (66.8%)    60.1% (60.8%)    0.113 (0.102)
EMA-HARCH(7)      68.8% (67.7%)    62.4% (62.9%)    0.133 (0.117)

other currencies, but it should be noted that the early half of the sample has been synthetically computed from USD-DEM and USD-JPY. This may lead to noise in the computation of hourly volatility and affect the forecast quality.

The forecast accuracy is remarkable for all ARCH-type models. In more than two-thirds of the cases, the forecast direction is correctly predicted and the mean absolute errors are smaller than the benchmark errors for all models

For all measures, three parameter models perform better than the benchmark and the HARCH performs the best

The realized potential Qr is the only measure that consistently improves with dynamic optimization


Thirty Years Later, the Future

After Thirty Years, What Has Changed?

Nowadays High Frequency Trading (HFT) represents 80% of the activity of the main exchanges and probably also of the FX market

The markets have experienced a few flash crashes with quick recoveries, but an episode like 1987 (a sort of precursor) cannot be excluded

All trading is now fast, with technological improvements originally attached to HFTs permeating throughout the market place

One might have expected that when things are fast the market structure becomes irrelevant. The opposite is actually the case: at very fast speeds, only the microstructure matters

All HFTs are strategic because their goal is generally to be the “first in line” to trade


Thirty Years Later, the Future

After Thirty Years, What Remains?

Financial markets remain the privileged place of the price discovery process

Volatility clustering and fat tails are still present

The market makers dictate the conditions in the market

Short-term volatility continues to increase, favored by the growth of HFT (J. Hasbrouck, 2016)

Volume continues to increase, reaching unprecedented levels

Gaussian assumptions still prevail among practitioners and academics


Thirty Years Later, the Future

After Thirty Years, New Research Developments

Development of models along the HARCH approach (MC-HARCH, Zumbach and Lynch 2001; HAR-RV, Corsi 2009)

Research concentrates on market microstructure and the effects of HFT (O’Hara 2015, Hasbrouck 2016, Mahmoodzadeh and Gencay 2017)

The flash crashes have also attracted a lot of attention among researchers on HFD (Gencay et al. 2016)

Modelling correlation in asynchronous data (Buccheri and Corsi 2017, Buccheri and Koopman 2018, Buccheri et al. 2018)

Studies on the effects of regulation on HFT (in Canada, Malinova et al. 2018)


Thirty Years Later, the Future

Conclusions

Collecting and analyzing HFD in the late eighties was a real innovation: big data 25 years before big data!

Olsen & Associates was an exceptional place for developing creative research recognized by both academics and professionals

Access to HFD was at the origin of a new dynamics in research and in the markets that gave birth to HFT and to refined mathematical methods

Our understanding of markets has evolved: heterogeneous market agents

HFT represents a challenge to research but also generally to society. The role of financial markets as the place of the price discovery process is put in danger

The goal of HFT is to buy and sell at the same time and thus pocket the spread, rather than intelligently discover the information

Academic analyses show that the nominal spread has decreased thanks to HFT. They neglect the fact that the short-term volatility has increased and thus the effective spread for long-term traders has actually increased. Researchers should concentrate on studying this effective spread rather than looking at the nominal spread


References

Some Recent References*

1 F. Corsi, 2009, A Simple Approximate Long-Memory Model of Realized Volatility, Journal of Financial Econometrics, 7, pp. 174-196.

2 T. G. Andersen, T. Bollerslev, P. F. Christoffersen, and F. X. Diebold, Financial Risk Measurement for Financial Risk Management, in G. Constantinides, M. Harris and R. Stulz (eds.), Handbook of the Economics of Finance, Vol. 2, Part B, Elsevier, pp. 1127-1220.

3 M. O’Hara, 2015, High Frequency Market Microstructure, Journal of Financial Economics, 116(2), pp. 257-270.

4 J. Hasbrouck, 2016, High Frequency Quoting: Short-Term Volatility in Bids and Offers, Journal of Financial and Quantitative Analysis (JFQA), forthcoming.

5 R. Gencay, S. Mahmoodzadeh, J. Rojcek, and M. C. Tseng, 2016, Price Impact and Bursts in Liquidity Provision, available on SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2745342

6 S. Mahmoodzadeh and R. Gencay, 2017, Human vs. High-Frequency Traders, Penny Jumping, and Tick Size, Journal of Banking and Finance, forthcoming.

7 G. Buccheri, G. Bormetti, F. Corsi, and F. Lillo, 2017, A Score-Driven Conditional Correlation Model for Noisy and Asynchronous Data: An Application to High-Frequency Covariance Dynamics, available on SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2912438

*) Subjective and by no means exhaustive
