An Algorithm for Computing Risk Parity Weights - F. Spinu
Transcript of An Algorithm for Computing Risk Parity Weights - F. Spinu
-
7/29/2019 An Algorithm for Computing Risk Parity Weights - F. Spinu
1/6
AN ALGORITHM FOR COMPUTING RISK PARITY WEIGHTS
FLORIN SPINU
Abstract. In this paper we introduce an algorithm for computing the solutionto the risk budget allocation problem. The algorithm is based on Newtons
method for self-concordant functions.
1. Introduction
For N
2 a positive integer, let RN+ =
{x = (x1, . . . , xN)
T
RN, xi > 0
}the
positive cone, and SN+ the set of symmetric, positive definite NN matrices. Forx, y RN, xy := (x1y1, . . . , xNyN)T, and diag(x) is the diagonal NN matrix withxi on the diagonal. For x RN+ , x1 := (1/x1, . . . , 1/xN) RN+ . We will also makeuse of the norms x2 := (
i x
2i )
1/2, x := maxi |xi|, and xC := (xTCx)1/2.For a given C SN+ and b RN+ , we consider the following equation
(1) Cx = bx1, x RN+ .Theorem 1.1. The equation (1) has a unique solution x RN+ .
As a consequence, we have the following corollary (cf. [BR12]):
Corollary 1.2. The vector x/
i xi is the unique solution to the system
(2)
(Cx)ixi
xTCx =
biNj=1 bj , 1 i N ,
satisfying
(3) xi > 0,Ni=1
xi = 1 .
1.1. Financial interpretation. Given a set of N financial assets with covariancematrix C, a long-only, fully invested portfolio in the N assets is determined by a setof allocations x = (x1, . . . , xN)
T satisfying (3). The risk of the portfolio is usuallydefined as the standard deviation of the portfolio returns
(4) P(x) :=
xTCx .
As a homogeneous function of degree one, P(x) satisfies Eulers identity
(5)Ni=1
1
Pxi
Pxi
(x) = 1 .
The ith summation term is sometimes considered to be the contribution of asset ito the risk of the total portfolio. Since
(6)1
Pxi
Pxi
(x) =(Cx)ixi
xTCx,
1
-
7/29/2019 An Algorithm for Computing Risk Parity Weights - F. Spinu
2/6
2 FLORIN SPINU
equation (2) matches the risk contribution of asset i with the predetermined weightbi/
j bj . For this reason, the solution to (2) is referred to as the risk budget
allocation in [BR12]. When the weights bi are equal, the term risk parity wasintroduced earlier in [Q05]. The reader is referred to these papers for an in-depthdiscussion of the properties of portfolios constructed using these weights.
2. Proof of Theorem 1.1
The proof, as well as the choice of the function below, is similar to the one in[BR12]. Let
(7) F(x) :=1
2xTCx
Ni=1
bi log(xi) .
The gradient and Hessian of F are given by, respectively,
(8)
F(x) = Cx
bx1,
2F(x) = C+ diag(bx2) .
In particular, a solution to (1) is a critical point of F. Since F is strictly convex(2F(x) C), it has at most one critical point (this proves the uniqueness part ofthe Theorem). To prove that F has at least one critical point, it suffices to showthat F has a a point of global minimum x RN+ . This is a consequence of the factthat F(x) + as x RN+ (see Lemma 4.1 in the Appendix).
2.1. Proof of Corollary 1.2. If x satisfies (1), then x
i x
isatisfies (2) and (3).
Conversely, if x satisfies (2), 1x satisfies (1), with = (xTCx)1/2
(
i bi)1/2 . This implies
that 1x = x, the unique solution to (1). Furthermore, the constraint (3) impliesthat = 1
i x
i, which determines x uniquely.
3. Newtons algorithm
3.1. Simplifying assumptions. Let D the diagonal ofC, and R = D1/2CD1/2
the associated correlation matrix: R is positive definite with 1 on the diagonal. Ifx is the solution to (1), y := D1/2x is the unique solution to the equation
(9) Ry = by1
Therefore, we may assume from now on that C is a correlation matrix:
(10) C SN+ , Cii = 1 .In particular, |Cij | 1, i, j, by the Cauchy-Schwartz inequality.
The next observation is that if x satisfies (1) with weights bi, t1/2x satisfiesthe same equation with bi replaced by tbi. Therefore we may rescale bi so that
(11) min1iNbi = 1 .
In what follows, we refer to [N98, Chap.4] for the notion of self-concordant func-tions. The set of self-concordant functions is closed under addition and multiplica-tion by a scalar greater than one, and it counts among its members all the quadraticfunctions, as well as the functions log(xi). We conclude thatProposition 3.1. Under the assumptions (10) and (11), F(x) is self-concordant.
-
7/29/2019 An Algorithm for Computing Risk Parity Weights - F. Spinu
3/6
AN ALGORITHM FOR COMPUTI NG RISK PARITY WEIGHTS 3
3.2. Convex optimization. We remarked in Section 2 that the solution to (1) isthe global minimum of the convex function F, defined at (7)
(12) x = argminxRN+ F(x) .
Let x := (2F(x))1F(x) the iteration step in Newtons algorithm. If x0 isclose enough to x, the sequence
(13) xk+1 = xk + xk, k 0,converges quadratically to x. A difficulty arises when an appropriate choice ofthe initial point x0 is not available. Since F is self-concordant, this issue canbe dealt with as in [N98, Chap 4.]. Let F(x) := (x)TF(x). As long ask > := 3
5
2 , the above step is replaced with the damped iteration
(14) xk+1 = xk +1
1 + kxk, k := F(xk) .
This step is guided by the key inequality [N98, Thm 4.10]
(15) F(xk+1) F(xk) (k),where (t) := t log(1 + t). This ensures a decrease in the objective function ofat least () per iteration. In particular, less than
F(x0)F(x)()
damped iterations
are required to arrive at k < , at which point the quadratic phase (13) sets in.Moreover, we can significantly reduce the number of damped iterations by exploitingthe explicit form of F. The following theorem is proved in the Appendix.
Theorem 3.2. For x RN+ , let := max1iN |(x)i|xi . With h = 11+ , we have(16) F(x + hx) F(x) (2F(x)/2) () .
Consequently, we may replace the iteration (14) with
(17) xk+1 = xk +1
1 + kxk, k := xk/xk .
This step is greater than (14), since k < k. On the other hand, the decrease inthe objective function is as good as (15) since (2k/
2k)(k) (k), by Lemma
4.3 in the Appendix. Let u1 := (1, . . . , 1)T and S :=
i bi. We conclude with
Theorem 3.3. Letx0 :=S
uT1Cu1
u1, := 0.95 3
52 , and Tol > 0 a termina-
tion threshold. Consider the following algorithm: for k 0, Compute
uk := F(xk) = Cxk bx1kHk :=
2F(xk) = C+ diag(bx
2k )
xk := H1k ukk := xk/xk
(Damped Phase) While k > , do xk+1 = xk + 11+k xk . (Quadratic Phase) While k > Tol, do xk+1 = xk + xk .The number of iterations is less than 9.4 Slog(CN) in the damped phase, and(log2 log2(1/Tol) + 2.6) in the quadratic phase, withC the condition number ofC.The terminal value xend of the algorithm satisfiesxend xC Tol1Tol .
-
7/29/2019 An Algorithm for Computing Risk Parity Weights - F. Spinu
4/6
4 FLORIN SPINU
Proof. The bound on the number of iterations of the quadratic phase is standard,and can be derived from [N98, Thm 4.1.12]. To bound the number of damped
iterations, we start from the upper bound
F(x0)
F(x)
() . First, we remark thatxT0 Cx0 = (x
)TCx = S. Let 1 and N be the smallest and the largest eigenvaluesof C. Since uT1 Cu1 Nu122 = N N, F(x0) = 12 S Slog( S
1/2
(uT1Cu1)1/2
) 12 S12
Slog( SNN). On the other hand, S = (x)TCx 1x22. This implies xi
S/1, i, hence F(x) 12 S 12 Slog( S1 ). Taking the difference, we obtainF(x0)F(x) 12 Slog(CN), where C = N1 . Finally 12() = 9.385 < 9.4. Thelast term satisfies xk xC xk xxk Tol1Tol (cf. [N98, Thm 4.1.11]). 3.3. Practical Implementation. a) For N = 50 and T ol = 106, the theoreticalupper bound for the number of quadratic iterations is 8, while the stated upperbound for the number of damped iterations can be quite large (depending on S).We find in practice however that the total number of iterations is significantly lower
than that: for each of the 107
trials with random weights b (uniform distribution)and random covariance matrix C (Wishart distribution), the algorithm terminatedafter less than 16 iterations. b) We have also tested the algorithm on the risk parityproblem in dimension N = 1400. That is, bi = 1/N and T ol = 10
6. We foundthat the number of iterations required was less than 6 for each of 2105 trials withrandom covariance matrix C.
4. Appendix
Lemma 4.1. F(x) tends to + as x approaches the boundary ofRN+ :(18) lim
xRN+
F(x) = + .
Proof. Let 1 the lowest eigenvalue of C. Since xTCx 1i x2i , we have
(19) F(x) Ni=1
(1
21x
2i bi log(xi)) =:
Ni=1
gi(xi) ,
with gi(t) =12 1t
2 bi log(t). The function gi is convex when t > 0, has a globalminimum at ti =
bi/1, and satisfies
(20) limt+
gi(t) = limt0+
gi(t) = + .Therefore for each i, F(x) gi(xi)+
j=i gj(t
j). Let B > 0 an arbitrary threshold.
By (20) there exist i > i > 0 (depending on B) such that
(21) xi / [i, i] F(x) B .Therefore F(x) B outside the box
Ni=1[i, i], which proves the Lemma.
Lemma 4.2. (x) > x22(x+1) , when x > 0.
Proof. Let f(x) := (x) x22(x+1) . Since f(x) = x2
2(x+1)2 > 0, f is increasing.
Therefore f(x) > f(0) = 0, when x > 0.
Lemma 4.3. The function (x)x2 is decreasing on (0,).Proof. ddx(
(x)x2 ) =
2x3 [
x2
2(x+1) (x)] < 0, by Lemma 4.2.
-
7/29/2019 An Algorithm for Computing Risk Parity Weights - F. Spinu
5/6
AN ALGORITHM FOR COMPUTI NG RISK PARITY WEIGHTS 5
4.1. Proof of Theorem 3.2. We follow the argument in [N98, Chap.4] and adaptit to our situation. From there, we borrow the notations
(22) ux := (uT2F(x)u)1/2, F(x) := xx .The starting point is the second order Taylor expansion, valid when x, x + u RN+
(23) F(x + u) F(x) uTF(x) =1
0
10
t uT2F(x + tu)u ddt .
Consider the case u := hx. To ensure x + hx RN+ , we need 0 < h < 1 , where
(24) := max1iN
|(x)i|xi
.
With := F(x), uTF(x) = h2 and (23) is re-written as
(25) F(x + hx) F(x) + 2h =
1
0
1
0
h2tx2x+htx ddt .
We note that 2 = (x)TC(x) +
i bix2ix2i
and
x2x+hx = (x)TC(x) +Ni=1
bix2i
x2i
1
(1 + h xixi )2
.
Since |1 + h xixi | 1 h, it follows that
x2x+hx 2
(1 h)2 , 0 < h