Conquering Big Data in Volatility Inference and Risk Management
Jian (Frank) Zou
Worcester Polytechnic Institute
Jian (Frank) Zou (WPI) Volatility Inference & Risk Management May 18, 2016 1 / 29
Introduction
Volatility Modeling and Estimation
- Volatility is the conditional variance of the asset price.
- Volatility modeling is concerned with studying the evolution of the volatility over time.
- Critical role in finance.
Examples:
- Portfolio allocation;
- Derivative pricing and hedging;
- Risk management using measures like VaR.
Low-Frequency Model Features
- Black-Scholes: mathematically attractive
- GARCH and SV work well for low-frequency data
- Stationary returns
- Do not fit high-frequency data
High-Frequency Financial Data
High-frequency financial data possess unique features absent in data measured at lower frequencies:
- Microstructure noise
- Nonstationary with jumps
- Irregularly spaced and random numbers of observations
- Nonsynchronous trading
Figure: Top 100 by volume (ticker labels; volume axis from 0e+00 to 4e+10).
Figure: Log return against time (time axis from 0 to 20000; log return axis from −0.04 to 0.06).
Problems and Challenges
There are major difficulties facing portfolio allocation and volatility matrix estimation in high-frequency financial data:
- Both the number of observations (n) and the number of assets (p) are large;
- Existing estimators (similar to the MLE for covariance estimation) perform poorly;
- Existing dimension reduction methods fail due to the non-synchronous data structure.
Computation is very challenging due to large data sets and the vast number of iterations in simulations and optimizations.
Portfolio Allocation and Risk Management
- Portfolio allocation is one of the most fundamental problems in finance.
- The process of determining the optimal mix of assets to hold in the portfolio is a critical issue in risk management.
- Dividing an investment portfolio among different assets based on the volatilities of the asset returns
- Ideal scenario: portfolio with maximum return and minimum risk
Modern Portfolio Theory
Markowitz (1952) was the original milestone paper for modern portfolio theory on the mean-variance analysis by solving an unconstrained quadratic optimization problem. It was later expanded in the book Markowitz (1959).
- Tradeoff between risk and expected return
- Aim to select a collection of investment assets that has lower risk than any individual asset
- Provide ways to find the best possible diversification strategy
- Sharpe (1966) introduced the Sharpe ratio for the performance of mutual funds, which is a direct measure of reward-to-risk.
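As a rough illustration of the reward-to-risk idea, the sketch below computes a Sharpe ratio as the mean excess return divided by the sample standard deviation of excess returns. The return series, the zero risk-free rate, and the function name `sharpe_ratio` are illustrative assumptions, not values or code from the talk.

```python
import statistics


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


# Hypothetical per-period returns, for illustration only.
toy_returns = [0.01, -0.02, 0.03, 0.00, 0.02]
ratio = sharpe_ratio(toy_returns)
print(round(ratio, 4))
```

A higher ratio means more average return per unit of return variability over the chosen period.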
Modern Portfolio Theory - Cont’d
Limitations:
- Very sensitive to errors in the estimates of the expected return and the conditional covariance of daily returns (often called the volatility matrix)
- Works well only if the portfolio size is small
- Unstable performance when the portfolio size is large
Methodology
The proposed methodology consists of three steps:
1. Estimate the integrated volatility matrix for each day by average realized volatility matrix (ARVM) estimators.
2. Regularize the inverse ARVM estimator using the smoothly clipped absolute deviation (SCAD) penalty to obtain the ARVM-SCAD volatility estimator.
3. Make portfolio allocation based on the ARVM-SCAD volatility estimator.
High Performance Computing
We exploit a variety of HPC techniques, including
- parallel R
- Intel Math Kernel Library (MKL)
- automatic offloading to Intel Xeon Phi SE10P Co-processor
to speed up the simulation and optimization procedures in our statistical investigations.
Price Model
Suppose that there are $p$ assets and their log price process $X(t) = \{X_1(t), \cdots, X_p(t)\}^T$ obeys an Itô process governed by
$$dX(t) = \mu_t \, dt + \sigma_t^T \, dB_t, \quad t \in [0, L]. \tag{1}$$
Our goal is to estimate the integrated volatility matrix for the $\ell$-th day, which is defined as
$$\Sigma_x(\ell) = \int_{\ell-1}^{\ell} \sigma_s \sigma_s^T \, ds, \quad \ell = 1, \cdots, L. \tag{2}$$
Portfolio Allocation Problem
- For the portfolio with allocation vector $w$ and a holding period $T$, the variance (risk) of the portfolio return is given by $R(w, \Sigma) = w^T \Sigma w$.
- However, it is well known that the estimation error in the mean vector $\mu_t$ could severely affect the portfolio weights and produce suboptimal portfolios.
- This motivates us to adopt another popular portfolio strategy: the global minimum variance portfolio, which is the minimum risk portfolio with weights that sum to one. These weights are usually estimated proportional to the inverse covariance matrix, i.e., $w \propto f(\Sigma^{-1})$.
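To make the risk formula and the inverse-covariance weights concrete, the sketch below uses the standard closed form for the unconstrained global minimum variance portfolio, $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^T \Sigma^{-1} \mathbf{1})$, which is one common choice of $f$. The 2 x 2 covariance matrix and all function names are illustrative assumptions; the talk does not specify an implementation.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gmv_weights(sigma):
    """Weights proportional to Sigma^{-1} 1, rescaled to sum to one."""
    x = solve(sigma, [1.0] * len(sigma))
    total = sum(x)
    return [v / total for v in x]


def portfolio_variance(w, sigma):
    """R(w, Sigma) = w^T Sigma w."""
    n = len(w)
    return sum(w[i] * sigma[i][j] * w[j] for i in range(n) for j in range(n))


# Hypothetical covariance matrix of two assets, for illustration only.
sigma = [[0.04, 0.01], [0.01, 0.09]]
w = gmv_weights(sigma)
risk = portfolio_variance(w, sigma)
print(w, risk)
```

Because no expected return enters the computation, the weights depend only on the quality of the covariance estimate, which is the motivation for regularizing that estimate.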
Global Minimum Variance Portfolio
Following Jagannathan and Ma (2003) and Fan, Zhang and Yu (2012), we consider the following risk optimization with two different constraints:
$$\min_w \; w^T \Sigma w, \quad \text{s.t. } \|w\|_1 \le c \text{ and } w^T \mathbf{1} = 1 \tag{3}$$
where $c$ is the gross exposure parameter which specifies the total exposure allowed in the portfolio. Here we consider two cases:
- $c = 1$ corresponds to the no short sale restriction.
- $c = \infty$ is the global minimum risk portfolio without any short sale constraint.
Other cases with varying $c$ can be easily generalized in our methodology.
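The role of $c$ can be checked numerically. When the weights sum to one, the total short position equals $(\|w\|_1 - 1)/2$, so $\|w\|_1 \le 1$ rules out short sales entirely. The sketch below verifies this on two hypothetical weight vectors; the helper names and the tolerance are assumptions made for illustration.

```python
def gross_exposure(w):
    """L1 norm of the weight vector."""
    return sum(abs(v) for v in w)


def short_fraction(w):
    """Total size of the short (negative) positions."""
    return sum(-v for v in w if v < 0)


def satisfies_constraints(w, c, tol=1e-9):
    """Check w^T 1 = 1 and ||w||_1 <= c, up to a numerical tolerance."""
    return abs(sum(w) - 1.0) <= tol and gross_exposure(w) <= c + tol


# Hypothetical portfolios, for illustration only.
long_only = [0.5, 0.3, 0.2]
long_short = [0.8, 0.5, -0.3]

print(satisfies_constraints(long_only, c=1.0))
print(satisfies_constraints(long_short, c=1.0))
print(satisfies_constraints(long_short, c=1.6))
```

Intermediate values of $c$ therefore cap the total short position at $(c - 1)/2$.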
ARVM Estimator
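Let $\tau = \{\tau_r, r = 1, \cdots, m\}$ be the pre-determined sampling frequency. For asset $i$, define previous-tick times
$$\tau_{i,r} = \max\{t_{i\ell} \le \tau_r, \ \ell = 1, \cdots, n_i\}, \quad r = 1, \cdots, m.$$
Based on $\tau$ we define realized co-volatility between assets $i_1$ and $i_2$ by
$$\Sigma_y(1, \tau)[i_1, i_2] = \sum_{r=1}^{m} \left[Y_{i_1}(\tau_{i_1,r}) - Y_{i_1}(\tau_{i_1,r-1})\right] \left[Y_{i_2}(\tau_{i_2,r}) - Y_{i_2}(\tau_{i_2,r-1})\right], \tag{4}$$
and realized volatility matrix by
$$\Sigma(1, \tau) = \left(\Sigma_y(1, \tau)[i_1, i_2]\right). \tag{5}$$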
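The previous-tick construction in (4) can be sketched for a single sampling grid as follows. This is a minimal illustration only: it treats the first grid point as $\tau_0$, uses made-up observation times and log prices, and does not perform the averaging over several grids that the ARVM estimator involves. The function names are hypothetical.

```python
import bisect


def previous_tick(times, values, grid):
    """For each grid time, return the value observed at the last time <= that grid time."""
    out = []
    for tau in grid:
        k = bisect.bisect_right(times, tau) - 1
        if k < 0:
            raise ValueError("no observation at or before grid time")
        out.append(values[k])
    return out


def realized_covolatility(t1, y1, t2, y2, grid):
    """Sum over the grid of products of previous-tick increments of two assets."""
    a = previous_tick(t1, y1, grid)
    b = previous_tick(t2, y2, grid)
    return sum((a[r] - a[r - 1]) * (b[r] - b[r - 1]) for r in range(1, len(grid)))


# Hypothetical non-synchronous observations of two log price series.
t1, y1 = [0.0, 0.2, 0.45, 0.9], [0.00, 0.01, 0.03, 0.02]
t2, y2 = [0.0, 0.3, 0.6, 1.0], [0.00, 0.02, 0.01, 0.04]
grid = [0.0, 0.5, 1.0]  # grid[0] plays the role of tau_0

cov = realized_covolatility(t1, y1, t2, y2, grid)
var1 = realized_covolatility(t1, y1, t1, y1, grid)
print(cov, var1)
```

Aligning both assets to the same grid is what allows increments to be multiplied even though the two series are observed at different times.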
ARVM-SCAD Estimator
With the estimated volatility matrix ARVM $\Sigma$, we define the ARVM-SCAD estimator as follows:
- Consider penalized estimation of the covariance matrix $\Sigma$ and its inverse, the precision matrix $\Omega = \Sigma^{-1}$. Denote their $(i, j)$-elements by $\sigma_{ij}$ and $\omega_{ij}$, respectively.
- Apply the SCAD penalty $p_\lambda(\cdot)$ to obtain a penalized estimator by solving the following optimization problem:
$$\min_{\Omega} \; -\log|\Omega| + \mathrm{tr}(\Sigma \Omega) + \sum_{i \ne j} p_\lambda(\omega_{ij}). \tag{6}$$
Note that (6) is not a convex program, so we use the local linear approximation algorithm. At the end of the $t$-th step, denote the solution by $\Omega^{(t)} = (\omega_{ij}^{(t)})$. By using the local linear approximation, at the next step we solve the following optimization problem
$$\min_{\Omega} \; -\log|\Omega| + \mathrm{tr}(\Sigma \Omega) + \sum_{i \ne j} p'_\lambda(|\omega_{ij}^{(t)}|) \, |\omega_{ij}| \tag{7}$$
and denote its solution by $\Omega^{(t+1)} = (\omega_{ij}^{(t+1)})$. We repeat this step until convergence.
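The weights that appear in (7) come from the derivative of the SCAD penalty. The sketch below uses the standard form from Fan and Li (2001), $p'_\lambda(\theta) = \lambda$ for $\theta \le \lambda$ and $(a\lambda - \theta)_+ / (a - 1)$ otherwise, with the conventional $a = 3.7$. The matrix `omega_t`, the value of `lam`, and the function names are illustrative assumptions. The code only builds the penalty weights for one local linear approximation step; solving the resulting weighted problem requires a separate solver that is not shown.

```python
def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty evaluated at |theta|."""
    theta = abs(theta)
    if theta <= lam:
        return lam
    return max(a * lam - theta, 0.0) / (a - 1.0)


def lla_weights(omega, lam, a=3.7):
    """Off-diagonal L1 weights p'_lambda(|omega_ij|); the diagonal is not penalized."""
    p = len(omega)
    return [
        [0.0 if i == j else scad_derivative(omega[i][j], lam, a) for j in range(p)]
        for i in range(p)
    ]


# Hypothetical current iterate of the precision matrix, for illustration only.
omega_t = [
    [2.0, 0.05, -1.0],
    [0.05, 1.5, 0.2],
    [-1.0, 0.2, 3.0],
]
weights = lla_weights(omega_t, lam=0.1)
for row in weights:
    print(row)
```

Small entries receive the full penalty `lam`, while entries larger than `a * lam` receive none, which is how the procedure shrinks weak entries without biasing strong ones.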
Asymptotic Theory
Theorem 1
Under some regularity conditions, if $\max\{|p'_{\lambda_n}(\theta_{j0})| : \theta_{j0} \ne 0\} \to 0$, then there exists a local maximizer $\hat{\theta}$ of $Q(\theta)$ such that $\|\hat{\theta} - \theta_0\| = O_P(e_n + b_n)$, where $a_n = \max\{|p'_{\lambda_n}(\theta_{j0})| : \theta_{j0} \ne 0\}$ and $b_n = d \, a_n$, and we suppose $b_n \to 0$ as $\lambda_n \to 0$. Here $e_n \sim n^{-1/6}$ for the case with microstructure noise and $e_n \sim n^{-1/3}$ for the noiseless case.
Theorem 2
Under some regularity conditions, if $n^{-1}/(e_n \lambda_n) \to 0$ as $n \to \infty$, and
$$\liminf_{n \to \infty} \; \liminf_{\theta \to 0^+} \; p'_{\lambda_n}(\theta)/\lambda_n > 0,$$
then our estimator in Theorem 1 satisfies
$$P(\hat{\theta}_2 = 0) \to 1, \quad \text{as } n \to \infty,$$
where $e_n$ follows the same rate as in Theorem 1.
Numerical Studies
Simulation Model
Assume the true log price $X(t)$ of $p$ assets follows the diffusion model
$$dX(t) = \sigma_t^T \, dW_t, \quad t \in [0, 1],$$
where we take $\sigma_t$ as a Cholesky decomposition of $\gamma(t) = \sigma_t \sigma_t^T = (\gamma_{ij}(t))_{1 \le i, j \le p}$.
The diagonal elements of $\gamma(t)$ are generated from four common stochastic volatility models with leverage effect:
- Geometric Ornstein-Uhlenbeck process
- Sum of two CIR processes
- The volatility process in Nelson's GARCH diffusion limit model
- Two-factor log-linear stochastic volatility process
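The mechanics of simulating such a diffusion can be sketched with an Euler scheme. The example below is a deliberate simplification: it holds $\gamma$ constant instead of driving its diagonal with the stochastic volatility models listed above, and it uses a Cholesky factor of $\gamma$ so that each increment has covariance $\gamma \, dt$. The matrix values, step count, seed, and function names are illustrative assumptions.

```python
import math
import random


def cholesky(g):
    """Lower-triangular factor low with low * low^T = g, for symmetric positive definite g."""
    p = len(g)
    low = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(g[i][i] - s)
            else:
                low[i][j] = (g[i][j] - s) / low[j][j]
    return low


def simulate_log_prices(gamma, n_steps, seed=0):
    """Euler path on [0, 1] whose increments have covariance gamma * dt."""
    rng = random.Random(seed)
    p = len(gamma)
    low = cholesky(gamma)
    dt = 1.0 / n_steps
    x = [0.0] * p
    path = [x[:]]
    for _ in range(n_steps):
        z = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(p)]  # Brownian increments
        x = [x[i] + sum(low[i][k] * z[k] for k in range(p)) for i in range(p)]
        path.append(x[:])
    return path


# Hypothetical constant volatility matrix, for illustration only.
gamma = [[0.04, 0.01], [0.01, 0.09]]
path = simulate_log_prices(gamma, n_steps=2000, seed=1)

# Realized variance of the first asset should be close to gamma[0][0].
realized_var = sum((path[r][0] - path[r - 1][0]) ** 2 for r in range(1, len(path)))
print(realized_var)
```

In the noiseless, synchronous setting of this sketch the realized variance recovers the integrated variance closely; the difficulties described in the talk arise once noise and non-synchronous sampling are added.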
Parallel R
While most features in R are implemented as single-threaded processes, efforts have been made to enable parallelism in R over the past decade. Parallel package development coincides with technological advances in parallel system development. For computing clusters, available packages include:
- Rmpi
- rparallel
- Snow
Intel MKL
R enables linking to other shared mathematics libraries to speed up many basic computation tasks. One option for linear algebra computation is to use the Intel Math Kernel Library (MKL). MKL includes a wealth of routines to accelerate application performance and reduce development time, such as highly vectorized and threaded linear algebra (e.g., BLAS and LAPACK), fast Fourier transforms (FFT), vector math, and statistics functions. Furthermore, MKL has been optimized to utilize the multiple processing cores, wider vector units, and more varied architectures available in a high-end system. Unlike parallel packages, MKL can provide parallelism transparently and speed up programs that use supported math routines without code changes. It has been reported that compiling R with MKL can provide a threefold improvement out of the box.
Offloading to Phi Coprocessor
The basis of the Xeon Phi is a light-weight x86 core with in-order instruction processing, coupled with heavy-weight 512-bit SIMD registers and instructions. With these two features the Xeon Phi die can support 60+ cores, and can execute 8 double precision (DP) vector instructions. The core count and vector lengths are basic extensions of an x86 processor, and allow the same programming paradigms (serial, threaded and vector) used on other Xeon (E5) processors. Unlike the GPGPU accelerator model, the same program code can be used efficiently on the host and the coprocessor. Also, the same Intel compilers, tools, libraries, etc. that you use on Intel and AMD systems are available for the Xeon Phi. R with MKL can utilize both the CPU and the Xeon Phi co-processor. In this model, R is compiled and built with MKL. Offloading to the Xeon Phi can be enabled by setting environment variables, as opposed to making modifications to existing R programs.
# enable MKL MIC automatic offloading
export MKL_MIC_ENABLE=1
# work division between host and MIC, each from 0.0 to 1.0
export MKL_HOST_WORKDIVISION=0.3
export MKL_MIC_WORKDIVISION=0.7
# make the offload report verbose enough to be visible
export OFFLOAD_REPORT=2
# set the number of threads on the host
export OMP_NUM_THREADS=16
export MKL_NUM_THREADS=16
# set the number of threads on the MIC
export MIC_OMP_NUM_THREADS=240
export MIC_MKL_NUM_THREADS=240
Figure: Configuring environment variables to enable automatic offloading to the Intel Xeon Phi coprocessor. In this sample script, 70% of the computation is offloaded to the Phi, while only 30% is done on the host.
Simulation Results
Figure: Risk profile with high noise level. Two panels plot the L1 norm and the L2 norm against sparsity level (1 to 6) for ARVM, TSRV, and SCAD.
Dow 30 Portfolio
We applied our methodology to a portfolio consisting of 30 Dow Jones Industrial Average (DJIA) constituent stocks. The purpose of our empirical study is twofold: to demonstrate the applicability of our approach to a real high-frequency financial data set, and to provide some insights into regularization in portfolio allocation using high-frequency data.
                  Mean    Median   SD (%)
ARVM              0.084   0.094    4.659
ARVM (no short)   0.207   0.153    4.019
SCAD              0.101   0.133    4.603
SCAD (no short)   0.212   0.165    4.011
Table: Portfolio performance based on the Sharpe ratio
Summary
- Large portfolio allocation is very challenging due to the complexity of the problem.
- Volatility matrix modeling and estimation using high-frequency data pose additional difficulties.
- We proposed a new methodology to perform portfolio allocation based on the regularized version of the estimated integrated volatility matrix.
- Theoretical and numerical studies indicate that the methodology works effectively.
- We exploit a variety of HPC techniques, including parallel R, Intel Math Kernel Library, and in particular automatic offloading to the Intel Xeon Phi coprocessor, to speed up the simulation and optimization procedures in our statistical investigations.
Thank you!