An Algorithm for Computing Risk Parity Weights - F. Spinu

7/29/2019 An Algorithm for Computing Risk Parity Weights - F. Spinu

1/6

AN ALGORITHM FOR COMPUTING RISK PARITY WEIGHTS

FLORIN SPINU

Abstract. In this paper we introduce an algorithm for computing the solutionto the risk budget allocation problem. The algorithm is based on Newtons

method for self-concordant functions.

1. Introduction

For N

2 a positive integer, let RN+ =

{x = (x1, . . . , xN)

T

RN, xi > 0

}the

positive cone, and SN+ the set of symmetric, positive definite NN matrices. Forx, y RN, xy := (x1y1, . . . , xNyN)T, and diag(x) is the diagonal NN matrix withxi on the diagonal. For x RN+ , x1 := (1/x1, . . . , 1/xN) RN+ . We will also makeuse of the norms x2 := (

i x

2i )

1/2, x := maxi |xi|, and xC := (xTCx)1/2.For a given C SN+ and b RN+ , we consider the following equation

(1) Cx = bx1, x RN+ .Theorem 1.1. The equation (1) has a unique solution x RN+ .

As a consequence, we have the following corollary (cf. [BR12]):

Corollary 1.2. The vector x/

i xi is the unique solution to the system

(2)

(Cx)ixi

xTCx =

biNj=1 bj , 1 i N ,

satisfying

(3) xi > 0,Ni=1

xi = 1 .

1.1. Financial interpretation. Given a set of N financial assets with covariancematrix C, a long-only, fully invested portfolio in the N assets is determined by a setof allocations x = (x1, . . . , xN)

T satisfying (3). The risk of the portfolio is usuallydefined as the standard deviation of the portfolio returns

(4) P(x) :=

xTCx .

As a homogeneous function of degree one, P(x) satisfies Eulers identity

(5)Ni=1

1

Pxi

Pxi

(x) = 1 .

The ith summation term is sometimes considered to be the contribution of asset ito the risk of the total portfolio. Since

(6)1

Pxi

Pxi

(x) =(Cx)ixi

xTCx,

1


2/6

2 FLORIN SPINU

equation (2) matches the risk contribution of asset i with the predetermined weightbi/

j bj . For this reason, the solution to (2) is referred to as the risk budget

allocation in [BR12]. When the weights bi are equal, the term risk parity wasintroduced earlier in [Q05]. The reader is referred to these papers for an in-depthdiscussion of the properties of portfolios constructed using these weights.

2. Proof of Theorem 1.1

The proof, as well as the choice of the function below, is similar to the one in[BR12]. Let

(7) F(x) :=1

2xTCx

Ni=1

bi log(xi) .

The gradient and Hessian of F are given by, respectively,

(8)

F(x) = Cx

bx1,

2F(x) = C+ diag(bx2) .

In particular, a solution to (1) is a critical point of F. Since F is strictly convex(2F(x) C), it has at most one critical point (this proves the uniqueness part ofthe Theorem). To prove that F has at least one critical point, it suffices to showthat F has a a point of global minimum x RN+ . This is a consequence of the factthat F(x) + as x RN+ (see Lemma 4.1 in the Appendix).

2.1. Proof of Corollary 1.2. If x satisfies (1), then x

i x

isatisfies (2) and (3).

Conversely, if x satisfies (2), 1x satisfies (1), with = (xTCx)1/2

(

i bi)1/2 . This implies

that 1x = x, the unique solution to (1). Furthermore, the constraint (3) impliesthat = 1

i x

i, which determines x uniquely.

3. Newtons algorithm

3.1. Simplifying assumptions. Let D the diagonal ofC, and R = D1/2CD1/2

the associated correlation matrix: R is positive definite with 1 on the diagonal. Ifx is the solution to (1), y := D1/2x is the unique solution to the equation

(9) Ry = by1

Therefore, we may assume from now on that C is a correlation matrix:

(10) C SN+ , Cii = 1 .In particular, |Cij | 1, i, j, by the Cauchy-Schwartz inequality.

The next observation is that if x satisfies (1) with weights bi, t1/2x satisfiesthe same equation with bi replaced by tbi. Therefore we may rescale bi so that

(11) min1iNbi = 1 .

In what follows, we refer to [N98, Chap.4] for the notion of self-concordant func-tions. The set of self-concordant functions is closed under addition and multiplica-tion by a scalar greater than one, and it counts among its members all the quadraticfunctions, as well as the functions log(xi). We conclude thatProposition 3.1. Under the assumptions (10) and (11), F(x) is self-concordant.


3/6

AN ALGORITHM FOR COMPUTI NG RISK PARITY WEIGHTS 3

3.2. Convex optimization. We remarked in Section 2 that the solution to (1) isthe global minimum of the convex function F, defined at (7)

(12) x = argminxRN+ F(x) .

Let x := (2F(x))1F(x) the iteration step in Newtons algorithm. If x0 isclose enough to x, the sequence

(13) xk+1 = xk + xk, k 0,converges quadratically to x. A difficulty arises when an appropriate choice ofthe initial point x0 is not available. Since F is self-concordant, this issue canbe dealt with as in [N98, Chap 4.]. Let F(x) := (x)TF(x). As long ask > := 3

5

2 , the above step is replaced with the damped iteration

(14) xk+1 = xk +1

1 + kxk, k := F(xk) .

This step is guided by the key inequality [N98, Thm 4.10]

(15) F(xk+1) F(xk) (k),where (t) := t log(1 + t). This ensures a decrease in the objective function ofat least () per iteration. In particular, less than

F(x0)F(x)()

damped iterations

are required to arrive at k < , at which point the quadratic phase (13) sets in.Moreover, we can significantly reduce the number of damped iterations by exploitingthe explicit form of F. The following theorem is proved in the Appendix.

Theorem 3.2. For x RN+ , let := max1iN |(x)i|xi . With h = 11+ , we have(16) F(x + hx) F(x) (2F(x)/2) () .

Consequently, we may replace the iteration (14) with

(17) xk+1 = xk +1

1 + kxk, k := xk/xk .

This step is greater than (14), since k < k. On the other hand, the decrease inthe objective function is as good as (15) since (2k/

2k)(k) (k), by Lemma

4.3 in the Appendix. Let u1 := (1, . . . , 1)T and S :=

i bi. We conclude with

Theorem 3.3. Letx0 :=S

uT1Cu1

u1, := 0.95 3

52 , and Tol > 0 a termina-

tion threshold. Consider the following algorithm: for k 0, Compute

uk := F(xk) = Cxk bx1kHk :=

2F(xk) = C+ diag(bx

2k )

xk := H1k ukk := xk/xk

(Damped Phase) While k > , do xk+1 = xk + 11+k xk . (Quadratic Phase) While k > Tol, do xk+1 = xk + xk .The number of iterations is less than 9.4 Slog(CN) in the damped phase, and(log2 log2(1/Tol) + 2.6) in the quadratic phase, withC the condition number ofC.The terminal value xend of the algorithm satisfiesxend xC Tol1Tol .


4/6

4 FLORIN SPINU

Proof. The bound on the number of iterations of the quadratic phase is standard,and can be derived from [N98, Thm 4.1.12]. To bound the number of damped

iterations, we start from the upper bound

F(x0)

F(x)

() . First, we remark thatxT0 Cx0 = (x

)TCx = S. Let 1 and N be the smallest and the largest eigenvaluesof C. Since uT1 Cu1 Nu122 = N N, F(x0) = 12 S Slog( S

1/2

(uT1Cu1)1/2

) 12 S12

Slog( SNN). On the other hand, S = (x)TCx 1x22. This implies xi

S/1, i, hence F(x) 12 S 12 Slog( S1 ). Taking the difference, we obtainF(x0)F(x) 12 Slog(CN), where C = N1 . Finally 12() = 9.385 < 9.4. Thelast term satisfies xk xC xk xxk Tol1Tol (cf. [N98, Thm 4.1.11]). 3.3. Practical Implementation. a) For N = 50 and T ol = 106, the theoreticalupper bound for the number of quadratic iterations is 8, while the stated upperbound for the number of damped iterations can be quite large (depending on S).We find in practice however that the total number of iterations is significantly lower

than that: for each of the 107

trials with random weights b (uniform distribution)and random covariance matrix C (Wishart distribution), the algorithm terminatedafter less than 16 iterations. b) We have also tested the algorithm on the risk parityproblem in dimension N = 1400. That is, bi = 1/N and T ol = 10

6. We foundthat the number of iterations required was less than 6 for each of 2105 trials withrandom covariance matrix C.

4. Appendix

Lemma 4.1. F(x) tends to + as x approaches the boundary ofRN+ :(18) lim

xRN+

F(x) = + .

Proof. Let 1 the lowest eigenvalue of C. Since xTCx 1i x2i , we have

(19) F(x) Ni=1

(1

21x

2i bi log(xi)) =:

Ni=1

gi(xi) ,

with gi(t) =12 1t

2 bi log(t). The function gi is convex when t > 0, has a globalminimum at ti =

bi/1, and satisfies

(20) limt+

gi(t) = limt0+

gi(t) = + .Therefore for each i, F(x) gi(xi)+

j=i gj(t

j). Let B > 0 an arbitrary threshold.

By (20) there exist i > i > 0 (depending on B) such that

(21) xi / [i, i] F(x) B .Therefore F(x) B outside the box

Ni=1[i, i], which proves the Lemma.

Lemma 4.2. (x) > x22(x+1) , when x > 0.

Proof. Let f(x) := (x) x22(x+1) . Since f(x) = x2

2(x+1)2 > 0, f is increasing.

Therefore f(x) > f(0) = 0, when x > 0.

Lemma 4.3. The function (x)x2 is decreasing on (0,).Proof. ddx(

(x)x2 ) =

2x3 [

x2

2(x+1) (x)] < 0, by Lemma 4.2.


5/6

AN ALGORITHM FOR COMPUTI NG RISK PARITY WEIGHTS 5

4.1. Proof of Theorem 3.2. We follow the argument in [N98, Chap.4] and adaptit to our situation. From there, we borrow the notations

(22) ux := (uT2F(x)u)1/2, F(x) := xx .The starting point is the second order Taylor expansion, valid when x, x + u RN+

(23) F(x + u) F(x) uTF(x) =1

0

10

t uT2F(x + tu)u ddt .

Consider the case u := hx. To ensure x + hx RN+ , we need 0 < h < 1 , where

(24) := max1iN

|(x)i|xi

.

With := F(x), uTF(x) = h2 and (23) is re-written as

(25) F(x + hx) F(x) + 2h =

1

0

1

0

h2tx2x+htx ddt .

We note that 2 = (x)TC(x) +

i bix2ix2i

and

x2x+hx = (x)TC(x) +Ni=1

bix2i

x2i

1

(1 + h xixi )2

.

Since |1 + h xixi | 1 h, it follows that

x2x+hx 2

(1 h)2 , 0 < h

An Algorithm for Computing Risk Parity Weights - F. Spinu

Documents

Transcript of An Algorithm for Computing Risk Parity Weights - F. Spinu