On Approaches in Mathematical Optimization with...

J. Brust, R. Marcia

Introduction

Subproblem Solver

Shape-Changing

ParameterInference

NonlinearEigenvalues

Summary andConclusion

References

1/37

On Approaches in Mathematical Optimization with Eigenvalues

Johannes Brust

SDSC, California, USA

August 11, 2016


Outline

Introduction

Subproblem Solver

Shape-Changing

Parameter Inference

Nonlinear Eigenvalues

Summary and Conclusion

References


Problem Formulation

We solve the minimization problem

$$\underset{x \in \mathbb{R}^n}{\text{minimize}} \; f(x),$$

where $f : \mathbb{R}^n \to \mathbb{R}$, using a trust-region algorithm.

Assumptions:
- The second derivative is difficult to compute,
- $n$ is large.


About Us

Our team consists of Prof. R. Marcia, Prof. J. Erway, and me.

[Photos: (R.M.), (J.E.)]


Notation

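- A bold lowercase letter denotes a vector: $\mathbf{x}$.
- A bold uppercase letter denotes a matrix: $\mathbf{X}$.
- $k$ is the iteration index.
- $n$ is the problem dimension.
- $\|\cdot\|_2$ is the Euclidean norm:

$$\|x\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}.$$

- $\nabla$ is the first derivative (gradient) operator:

$$\nabla = \begin{bmatrix} \dfrac{\partial}{\partial x_1} & \dfrac{\partial}{\partial x_2} & \cdots \end{bmatrix}^T.$$

- $\nabla^2$ is the second derivative (Hessian) operator:

$$\nabla^2 = \begin{bmatrix} \dfrac{\partial^2}{\partial x_1^2} & \dfrac{\partial^2}{\partial x_1 \partial x_2} & \cdots \\ \dfrac{\partial^2}{\partial x_2 \partial x_1} & \dfrac{\partial^2}{\partial x_2^2} & \\ \vdots & & \ddots \end{bmatrix}.$$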


Iterative Methods

To solve the minimization problem, repeatedly update

$$x_{k+1} = x_k + s_k,$$

where $s_k$ is a step. The step satisfies

$$f(x_{k+1}) \approx Q(s_k),$$

where $Q(s_k)$ is a quadratic approximation.

Two methods:
(1) Line-search Newton,
(2) Trust-Region.


The Trust-Region Method

Case: $f(x_{k+1}) = Q(s_k)$

[Figure: plot over $[-6, 6] \times [-6, 6]$ with the labels "Newton step", "Global minimum", "Local minimum", and the trust-region step $s^*$.]

The trust-region method remains applicable in the presence of saddle points.


Trust-Region Method

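For $k = 1, 2, \ldots$

(1) Solve the trust-region subproblem to obtain $s_k$:

$$\underset{\|s\| \le \Delta_k}{\text{minimize}} \; Q(s) = s^T g_k + \tfrac{1}{2} s^T B_k s.$$

(2) Set $x_{k+1} = x_k + s_k$.
(3) Update $\Delta_k$ (the trust-region radius).
(4) Update $B_k$, $g_k$, $x_k$.

Here

$$g_k = \nabla f(x_k), \quad B_k \approx \nabla^2 f(x_k), \quad \text{and} \quad \Delta_k > 0.$$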
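The four steps above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the solver from the talk: the subproblem is solved only approximately with a Cauchy point (a swapped-in, simpler technique), $B_k$ is maintained with an SR1 update starting from the identity, and the acceptance and radius constants (0.1, 0.75, 0.8, 1e-3) are common textbook-style choices. The names `cauchy_step` and `trust_region` are hypothetical.

```python
import numpy as np

def cauchy_step(g, B, delta):
    """Minimizer of Q(s) = s^T g + 0.5 s^T B s along -g within ||s||_2 <= delta."""
    gnorm = np.linalg.norm(g)
    gBg = g @ B @ g
    # Negative curvature along -g: go to the boundary; otherwise clip the 1-D minimizer.
    tau = 1.0 if gBg <= 0 else min(1.0, gnorm**3 / (delta * gBg))
    return -tau * (delta / gnorm) * g

def trust_region(f, grad, x0, delta=1.0, tol=1e-6, max_iter=500):
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                       # assumed initial Hessian approximation
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        s = cauchy_step(g, B, delta)         # (1) approximate subproblem solution
        pred = -(s @ g + 0.5 * s @ B @ s)    # predicted decrease, -Q(s) > 0
        rho = (f(x) - f(x + s)) / pred       # actual decrease over predicted decrease
        g_trial = grad(x + s)
        # (4) SR1 update of B; skipped when the denominator is too small
        r = (g_trial - g) - B @ s
        if abs(s @ r) > 1e-8 * np.linalg.norm(s) * np.linalg.norm(r):
            B = B + np.outer(r, r) / (s @ r)
        # (3) update the trust-region radius
        if rho < 0.1:
            delta *= 0.5
        elif rho > 0.75 and np.linalg.norm(s) >= 0.8 * delta:
            delta *= 2.0
        # (2) accept the step only if it decreased f sufficiently
        if rho > 1e-3:
            x, g = x + s, g_trial
    return x
```

Calling `trust_region(f, grad, x0)` with a smooth `f` and its gradient returns an approximate stationary point; replacing `cauchy_step` with an exact subproblem solver is the topic of the next part of the talk.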


The Subproblem Solution

The subproblem solution lies either in the interior of the trust region or on its boundary:

[Figure: (a) interior solution, (b) boundary solution.]


Subproblem Optimality Conditions

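Moré and Sorensen (1983): For some $\sigma^* \in \mathbb{R}$, the solution $s^*$ of the trust-region subproblem satisfies

$$\begin{aligned}
(B_k + \sigma^* I_n) s^* &= -g_k, \\
\sigma^* \cdot (\|s^*\|_2 - \Delta_k) &= 0, \\
\|s^*\|_2 &\le \Delta_k, \\
\sigma^* &\ge 0, \\
B_k + \sigma^* I_n &\succeq 0.
\end{aligned}$$

In the remainder we drop the iteration index $k$.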


Subproblem Solver.


Methodology

Orthonormal Basis subproblem solver (OBS):

(1) Eigendecompose B,

(2) Transform subproblem by change of variables,

(3) Solve transformed optimality conditions,

(4) Change variables back.


Eigendecomposition of B

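The compact quasi-Newton matrix has the structure

$$B = \gamma I_n + V W V^T, \qquad V \in \mathbb{R}^{n \times m}, \quad W \in \mathbb{R}^{m \times m},$$

where $m \ll n$, $\gamma \in \mathbb{R}$, and $V$, $W$ depend on the method.

Then the eigendecomposition is

$$B = \gamma I_n + V W V^T = P \underbrace{\begin{bmatrix} \hat{\Lambda} & \\ & \gamma I_{n-m} \end{bmatrix}}_{\equiv \Lambda} P^T,$$

where $P = \begin{bmatrix} P_\parallel & P_\perp \end{bmatrix}$ is the orthogonal matrix of eigenvectors and $\lambda_1$ is the smallest eigenvalue.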
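One way to obtain $\hat{\Lambda}$ and $P_\parallel$ without forming the $n \times n$ matrix $B$ is sketched below. It is an assumption about how the decomposition could be computed, not a statement of the authors' implementation: take a thin QR factorization $V = QR$, eigendecompose the small matrix $R W R^T = U \,\mathrm{diag}(\mu)\, U^T$, and set $\hat{\Lambda} = \gamma I_m + \mathrm{diag}(\mu)$, $P_\parallel = QU$. The function name `compact_eig` is hypothetical, and $W$ is assumed symmetric.

```python
import numpy as np

def compact_eig(V, W, gamma):
    """Eigenvalues lam_hat and eigenvectors P_par of B = gamma*I + V W V^T on span(V).

    The remaining n - m eigenvalues of B all equal gamma; their eigenvectors
    (P_perp) are never formed.
    """
    Q, R = np.linalg.qr(V)          # thin QR: Q is n x m, R is m x m
    mu, U = np.linalg.eigh(R @ W @ R.T)   # small symmetric m x m eigenproblem
    lam_hat = gamma + mu            # shift by gamma
    P_par = Q @ U                   # orthonormal eigenvectors, n x m
    return lam_hat, P_par
```

The smallest eigenvalue is then `min(lam_hat.min(), gamma)` whenever $m < n$, and the dominant cost is the QR factorization of the tall, thin matrix $V$.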


Subproblem Optimality

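For the first subproblem optimality condition we have

$$\begin{aligned}
-g &= (B + \sigma I) s \\
   &= (P \Lambda P^T + \sigma I) s \\
   &= (P \Lambda P^T + \sigma P P^T) s \\
   &= P (\Lambda + \sigma I) P^T s \\
   &= P (\Lambda + \sigma I) v,
\end{aligned}$$

where $v = P^T s$. Then

$$-P^T g = (\Lambda + \sigma I) v.$$

For the second condition, since $P$ is orthogonal and since $s = P v$,

$$\|s\|_2 = \|P v\|_2 = \|v\|_2.$$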


Subproblem Optimality

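Now the subproblem optimality conditions can be expressed equivalently in terms of $(s^*, \sigma^*)$ or $(v^*, \sigma^*)$:

$$\begin{array}{ll}
\text{In terms of } (s^*, \sigma^*) & \text{In terms of } (v^*, \sigma^*) \\
(B + \sigma^* I) s^* = -g & (\Lambda + \sigma^* I) v^* = -P^T g \\
\sigma^* (\|s^*\|_2 - \Delta) = 0 & \sigma^* (\|v^*\|_2 - \Delta) = 0 \\
\|s^*\|_2 \le \Delta & \|v^*\|_2 \le \Delta \\
\sigma^* \ge 0 & \sigma^* \ge 0 \\
B + \sigma^* I \succeq 0 & \lambda_i + \sigma^* \ge 0 \text{ for all } i.
\end{array}$$

The optimality conditions on the right have a simplified form, mainly because $\Lambda + \sigma^* I$ is a diagonal matrix.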


Solving the Optimality Conditions

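Interior solution: If $\lambda_1 > 0$, try $\sigma^* = 0$ so that

$$v^* = -\Lambda^{-1} (P^T g).$$

If $\|v^*\|_2 = \|\Lambda^{-1} (P^T g)\|_2 < \Delta$, then the solution is found.

Boundary solution: For $\sigma^* > 0$ set

$$v^* = -(\Lambda + \sigma^* I)^{-1} (P^T g),$$

where $\sigma^*$ is the scalar that satisfies

$$\|v^*\|_2 = \left\| (\Lambda + \sigma^* I)^{-1} (P^T g) \right\|_2 = \Delta, \qquad \sigma^* + \lambda_1 \ge 0.$$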


Root solving

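We use Newton's method to solve the equation

$$\phi(\sigma^*) \equiv \frac{1}{\left\| (\Lambda + \sigma^* I)^{-1} (P^T g) \right\|_2} - \frac{1}{\Delta} = 0.$$

[Figure: plot of $\phi(\sigma)$ against $\sigma$, marking $-\lambda_1 = -\lambda_{\min}$, $-\lambda_2$, and the root $\sigma^*$.]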
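The interior test and the Newton iteration on $\phi$ can be sketched on the diagonal (transformed) problem as follows. This is an illustrative sketch under assumptions: `lam` holds the diagonal of $\Lambda$ and `g_hat` stands for $P^T g$; the starting value $\sigma_0 = \max(0, \|\hat g_1\|_2/\Delta - \lambda_1)$, with $\hat g_1$ the components belonging to $\lambda_1$, is chosen here so that $\phi(\sigma_0) \le 0$; and the so-called hard case ($\lambda_1 \le 0$ with $\hat g_1 = 0$) is not handled. The function name is hypothetical.

```python
import numpy as np

def solve_diag_subproblem(lam, g_hat, delta, tol=1e-12, max_iter=100):
    """Return (v, sigma) with (Lambda + sigma I) v = -g_hat and ||v||_2 <= delta."""
    lam = np.asarray(lam, dtype=float)
    g_hat = np.asarray(g_hat, dtype=float)
    lam1 = lam.min()
    # Interior solution: sigma = 0 if Lambda is positive definite and the step fits.
    if lam1 > 0:
        v = -g_hat / lam
        if np.linalg.norm(v) <= delta:
            return v, 0.0
    # Norm of the components of g_hat that belong to the smallest eigenvalue.
    g1 = np.linalg.norm(g_hat[lam == lam1])
    if lam1 <= 0 and g1 == 0.0:
        raise NotImplementedError("hard case is outside the scope of this sketch")
    # Starting point to the left of the root, where phi <= 0 and lam + sigma > 0.
    sigma = max(0.0, g1 / delta - lam1)
    for _ in range(max_iter):
        d = lam + sigma
        vnorm = np.linalg.norm(g_hat / d)
        phi = 1.0 / vnorm - 1.0 / delta
        if abs(phi) < tol:
            break
        dphi = np.sum(g_hat**2 / d**3) / vnorm**3   # phi'(sigma) > 0
        sigma -= phi / dphi                          # Newton step
    return -g_hat / (lam + sigma), sigma
```

Because $\phi$ is concave and increasing to the right of $-\lambda_1$, Newton iterates started to the left of the root increase monotonically toward $\sigma^*$, which is why the sketch needs no safeguarding in the cases it covers.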


Back Transformation

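Since $v = P^T s$, we recover the step as

$$s^* = P v^*.$$

The numerical tests use the following criteria and parameters.

Criteria:
- First optimality condition: $(B + \sigma^* I) s^* = -g$.
- Boundary solution: $\phi(\sigma^*) \equiv \dfrac{1}{\|s^*\|_2} - \dfrac{1}{\Delta} = 0$.
- Curvature condition: $\lambda_1 + \sigma^* \ge 0$.

Parameters:
- Convergence tolerance: $\varepsilon = 1.0 \times 10^{-10}$.
- $m = 5$.
- $10^3 \le n \le 10^7$.
- Randomly generated $B$.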


Results of Numerical Experiments: Boundary Solution

Let opt 1 $= \|(B + \sigma^* I) s^* + g\|_2$.

n       opt 1        λ1 + σ*      |φ(σ*)|      Time
10^3    3.727e-15    5.608e+00    2.775e-17    7.688e-04
10^4    1.145e-14    2.608e+01    2.220e-16    1.343e-03
10^5    3.792e-14    2.626e+01    3.330e-16    8.755e-03
10^6    1.108e-13    1.166e+01    1.873e-16    8.486e-02
10^7    4.001e-13    1.102e+01    8.716e-17    8.723e-01

Note that
- $\|(B + \sigma^* I) s^* + g\|_2 \approx 0$,
- $|\phi(\sigma^*)| \approx 0$,
- $\lambda_1 + \sigma^* \ge 0$.

In collaboration with Prof. Marcia and Prof. Erway, we submitted (1) a manuscript (second revision), (2) a conference presentation, and (3) a poster.


Shape-Changing Norms.


Shape-Changing Norms

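Burdakov et al. (2015): Instead of the 2-norm, use the so-called shape-changing norms

$$\|s\|_{P,2} = \max\left( \|P_\parallel^T s\|_2, \; \|P_\perp^T s\|_2 \right),$$

$$\|s\|_{P,\infty} = \max\left( \|P_\parallel^T s\|_\infty, \; \|P_\perp^T s\|_2 \right),$$

for the trust-region subproblem

$$\underset{\|s\|_{P,\cdot} \le \Delta}{\text{minimize}} \; Q(s) = s^T g + \tfrac{1}{2} s^T B s.$$

The norms depend on the eigendecomposition of $B$.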
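Both norms can be evaluated from $P_\parallel$ alone, which matters when $n$ is large and $P_\perp$ is never formed. The sketch below relies on the identity $\|P_\perp^T s\|_2^2 = \|s\|_2^2 - \|P_\parallel^T s\|_2^2$, which holds because $P$ is orthogonal. The function name and argument layout are assumptions made for illustration.

```python
import numpy as np

def shape_changing_norms(s, P_par):
    """Return (||s||_{P,2}, ||s||_{P,inf}) using only the n x m block P_par."""
    v_par = P_par.T @ s
    # Orthogonality of P gives ||P_perp^T s||^2 = ||s||^2 - ||P_par^T s||^2;
    # clip at zero to guard against rounding.
    perp = np.sqrt(max(s @ s - v_par @ v_par, 0.0))
    norm_p2 = max(np.linalg.norm(v_par), perp)
    norm_pinf = max(np.linalg.norm(v_par, np.inf), perp)
    return norm_p2, norm_pinf
```

The cost is one product with the thin matrix $P_\parallel^T$ plus two inner products, so it scales linearly in $n$ for fixed $m$.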


Shape-Changing Subproblems

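Idea: Separate the subproblem after a change of variables. Let

$$v = \begin{bmatrix} v_\parallel \\ v_\perp \end{bmatrix}, \qquad P^T g = \begin{bmatrix} g_\parallel \\ g_\perp \end{bmatrix}, \qquad B = P \begin{bmatrix} \hat{\Lambda} & \\ & \gamma I_{n-m} \end{bmatrix} P^T,$$

then using $s = P v$ find

$$Q(s) = s^T g + \tfrac{1}{2} s^T B s = \cdots \equiv q_\parallel(v_\parallel) + q_\perp(v_\perp),$$

where

$$q_\parallel(v_\parallel) = v_\parallel^T g_\parallel + \tfrac{1}{2} v_\parallel^T \hat{\Lambda} v_\parallel, \qquad q_\perp(v_\perp) = v_\perp^T g_\perp + \tfrac{1}{2} \gamma \|v_\perp\|_2^2.$$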


(P, 2)-Norm

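The problems decouple, e.g.,

$$\underset{\|s\|_{P,2} \le \Delta}{\text{minimize}} \; Q(s) = \underbrace{\underset{\|v_\parallel\|_2 \le \Delta}{\text{minimize}} \; q_\parallel(v_\parallel)}_{\text{P1}} + \underbrace{\underset{\|v_\perp\|_2 \le \Delta}{\text{minimize}} \; q_\perp(v_\perp)}_{\text{P2}}.$$

Since $q_\perp(v_\perp) = v_\perp^T g_\perp + \tfrac{1}{2} \gamma \|v_\perp\|_2^2$, then

$$v_\perp^* = \beta \cdot g_\perp, \qquad \beta = \begin{cases} -\dfrac{1}{\gamma} & \text{if } \gamma > 0 \text{ and } \left\| \dfrac{g_\perp}{\gamma} \right\|_2 \le \Delta, \\[2ex] -\dfrac{\Delta}{\|g_\perp\|_2} & \text{otherwise.} \end{cases}$$

P1 is a low-dimensional problem.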
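The closed form for P2 translates directly into code. The sketch below is a minimal illustration of the case distinction for $\beta$; the function name `solve_perp` is hypothetical, and the boundary branch assumes $g_\perp \ne 0$.

```python
import numpy as np

def solve_perp(g_perp, gamma, delta):
    """Minimize q_perp(v) = v^T g_perp + 0.5*gamma*||v||^2 subject to ||v||_2 <= delta."""
    gnorm = np.linalg.norm(g_perp)
    if gamma > 0 and gnorm / gamma <= delta:
        beta = -1.0 / gamma        # unconstrained minimizer lies inside the region
    else:
        beta = -delta / gnorm      # step to the boundary along -g_perp (needs g_perp != 0)
    return beta * g_perp
```

For example, with `g_perp = [3, 4]` and `delta = 1`, a value `gamma = 10` gives the interior step `[-0.3, -0.4]`, while `gamma = 1` gives the boundary step `[-0.6, -0.8]`.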


Shape-Changing Transformation

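With $v^* = \begin{bmatrix} v_\parallel^* \\ v_\perp^* \end{bmatrix}$, transform back:

$$s^* = P v^* = P_\parallel v_\parallel^* + P_\perp v_\perp^*,$$

where $P = \begin{bmatrix} P_\parallel & P_\perp \end{bmatrix}$.

Synopsis:

- A similar analysis holds for the $(P, \infty)$-norm,

- Performed numerical experiments,

- Published a manuscript in collaboration with Prof. Erway, Prof. Burdakov, and Prof. Yuan.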


Parameter Inference for Stochastic Differential Equations.


Parameter Inference

Let a stochastic differential equation be given by

$$dX_t = \underbrace{f(X_t; \theta)}_{\text{drift}} \, dt + \underbrace{g(X_t; \theta)}_{\text{diffusion}} \, dW_t,$$

where $dW_t$ is a Wiener increment and $\theta$ is a vector of parameters.

Goal: Infer the probabilities

$$P(X_t = x).$$


Parameter Inference

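Bhat, Madushani (2016): The DTQ (density tracking by quadrature) method finds approximations to the probabilities by discretizing in space and time,

$$P(X_t = x) \approx P(\tilde{X}_{t_n} = x_i) = [p_n]_i,$$

where $p_n \in \mathbb{R}^{2M+1}$ for $M \ge 1$. A recursion exists,

$$p_n = A p_{n-1},$$

where $A \in \mathbb{R}^{(2M+1) \times (2M+1)}$ and

$$[A]_{ij} = \frac{\Delta x}{\sqrt{2 \pi g^2(x_j; \theta) \Delta t}} \exp\left( -\frac{\left( x_i - x_j - f(x_j; \theta) \Delta t \right)^2}{2 g^2(x_j; \theta) \Delta t} \right),$$

and $\Delta x$, $\Delta t$ denote the space and time discretization steps.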
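To make the formula for $[A]_{ij}$ concrete, the sketch below assembles $A$ on a uniform grid. Several details are assumptions for illustration only: the grid $x_i = i \, \Delta x$ for $i = -M, \ldots, M$ is centered at zero, the function name `dtq_matrix` is hypothetical, and the drift and diffusion callables take the grid and the parameter vector as arguments. The example functions use the drift $f(x; \theta) = \theta_2(\theta_1 - x)$ and diffusion $g(x; \theta) = 1$.

```python
import numpy as np

def dtq_matrix(f, g, theta, M, dx, dt):
    """Assemble the (2M+1) x (2M+1) DTQ matrix A on the grid x_i = i*dx, i = -M..M."""
    x = dx * np.arange(-M, M + 1)
    mean = x + f(x, theta) * dt            # x_j + f(x_j; theta) * dt, one per column j
    var = g(x, theta) ** 2 * dt            # g^2(x_j; theta) * dt, one per column j
    diff = x[:, None] - mean[None, :]      # x_i - x_j - f(x_j; theta) * dt
    A = dx * np.exp(-diff**2 / (2.0 * var[None, :])) / np.sqrt(2.0 * np.pi * var[None, :])
    return x, A

# Example drift and diffusion; theta = (theta_1, theta_2).
def drift(x, theta):
    return theta[1] * (theta[0] - x)

def diffusion(x, theta):
    return np.ones_like(x)
```

Each column of $A$ is a Gaussian density sampled on the grid and scaled by $\Delta x$, so columns whose mass lies well inside the grid sum to approximately one. One step of the recursion is then `p = A @ p`.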


Parameter Inference

Example: $f(x; \theta) = \theta_2 (\theta_1 - x)$ and $g(x; \theta) = 1$.

[Figure: left, nonzeros in $A$; right, spectrum of $A$.]


Parameter Inference

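Since the recursion unwinds as

$$p_n = A^n p_0,$$

and since the eigenvalues tend to zero quickly, let the eigendecomposition be

$$A = S \begin{bmatrix} \Lambda_1 & \\ & \Lambda_2 \end{bmatrix} S^{-1},$$

where $\Lambda_1 \in \mathbb{R}^{m \times m}$, $\Lambda_2 \in \mathbb{R}^{(2M+1-m) \times (2M+1-m)}$, and $S \in \mathbb{R}^{(2M+1) \times (2M+1)}$. If $|[\Lambda_2]_{jj}| \ll 1$ for all $j$, then

$$A^n \approx S \begin{bmatrix} \Lambda_1^n & \\ & 0 \end{bmatrix} S^{-1}.$$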
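The truncation can be sketched as follows. This is an illustration under assumptions rather than the implementation behind the reported timings: it computes a full eigendecomposition with `numpy.linalg.eig` for clarity (a practical code would compute only the $m$ dominant eigenpairs), it assumes $A$ is diagonalizable with a reasonably conditioned $S$, and the function name is hypothetical.

```python
import numpy as np

def propagate_truncated(A, p0, n, m):
    """Approximate p_n = A^n p_0 using the m eigenvalues of largest magnitude."""
    w, S = np.linalg.eig(A)
    order = np.argsort(-np.abs(w))      # sort eigenvalues by decreasing magnitude
    w, S = w[order], S[:, order]
    c = np.linalg.solve(S, p0)          # coordinates S^{-1} p0 in the eigenbasis
    pn = S[:, :m] @ (w[:m] ** n * c[:m])  # drop the block Lambda_2^n ~ 0
    return pn.real                      # discard rounding-level imaginary parts
```

The approximation error is governed by $|\lambda_{m+1}|^n$, so it improves as the number of time steps $n$ grows, which matches the regime in which the recursion is most expensive to apply directly.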


Parameter Inference

Numerical experiment: $n = 2000$, $\theta_1 = 1.52$, $\theta_2 = 1.71$, $m = 10$.

The computed densities are close to the true solution. Time DTQ: 0.163 s, Time Eig: 0.082 s.


Nonlinear Eigenvalue Problem

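From electronic structure computations:

$$H(X) X = X \Lambda,$$

where $\Lambda \in \mathbb{R}^{m \times m}$ contains the eigenvalues, $X \in \mathbb{R}^{n \times m}$ contains the eigenvectors, and $H(X) \in \mathbb{R}^{n \times n}$ is the Hamiltonian.

Observations:
- Solved via a fixed-point iteration,
- $H(X)$ is sparse and symmetric,
- Only $m \ll n$ eigenvalues are needed.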


Summary

Research on optimization using linear algebra and eigenvalues:

- Eigenvalue method for the compact trust-region subproblem,

- Shape-changing norms,

- Eigenvalues for the matrix exponential,

- Nonlinear eigenvalues.


Conclusion

Our goals:

- Develop a method for the eigenvalues of $B = \gamma I + V W V^T$,

- Develop a method for the rank and eigenvalues of a banded, symmetric $A$,

- Approximate solutions in the nonlinear eigenvalue problem.


Questions

$Q \Lambda Q^T$


References:

(1) H. Bhat, R.W.M.A. Madushani, and S. Rawat. "Parameter Inference for Stochastic Differential Equations with Density Tracking by Quadrature", 2016.

(2) J.J. Brust, J.B. Erway, and R.F. Marcia. "On Solving L-SR1 Trust-Region Subproblems", 2015.

(3) O. Burdakov, L. Gong, Y.-X. Yuan, and S. Zikrin. "On Efficiently Combining Limited Memory and Trust-Region Techniques". Technical Report 2013:13, Linköping University, Optimization, 2015.

(4) R.H. Byrd, J. Nocedal, and R.B. Schnabel. "Representations of Quasi-Newton Matrices and their use in Limited-Memory Methods". Math. Program., 63:129-156, 1994.

(5) J.J. Moré and D.C. Sorensen. "Computing a Trust-Region Step". SIAM J. Sci. and Statist. Comput., 4:553-572, 1983.
