Qualifier Exam in HPC, February 10th, 2010. Quasi-Newton methods. Alexandru Cioaca.


Page 1:

Qualifier Exam in HPC

February 10th, 2010

Page 2:

Quasi-Newton methods

Alexandru Cioaca

Page 3:

Quasi-Newton methods (nonlinear systems)

Nonlinear systems: F(x) = 0, F : R^n → R^n

F(x) = [ fi(x1, …, xn) ]^T

Such systems appear in the simulation of processes (physical, chemical, etc.)

Iterative algorithm to solve nonlinear systems

Newton’s method != Nonlinear least-squares

Page 4:

Quasi-Newton methods (nonlinear systems)

Standard assumptions:
1. F – continuously differentiable in an open convex set D
2. F' – Lipschitz continuous on D
3. There is x* in D s.t. F(x*) = 0, F'(x*) nonsingular

Newton's method: starting from x0 (initial iterate),

xk+1 = xk – F'(xk)^-1 * F(xk), {xk} → x*

until a termination criterion is satisfied

Page 5:

Quasi-Newton methods (nonlinear systems)

Linear model around xn:

Mn(x) = F(xn) + F'(xn)*(x – xn)

Mn(x) = 0 ⇒ xn+1 = xn – F'(xn)^-1 * F(xn)

Iterates are computed as:
F'(xn) * sn = F(xn)
xn+1 = xn – sn
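The iteration above can be sketched in a few lines; the test system F(x) = (x1^2 + x2^2 – 1, x1 – x2), the starting point, and the tolerances are hypothetical choices, and the Jacobian is assumed to be available explicitly.

```python
import numpy as np

def newton(F, J, x0, tol=1e-10, max_iter=50):
    """Newton iteration: solve F'(xn) * sn = F(xn), then xn+1 = xn - sn."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = F(x)
        if np.linalg.norm(fx) < tol:      # termination criterion
            break
        s = np.linalg.solve(J(x), fx)     # direct solve of the linear model
        x = x - s
    return x

# hypothetical example: unit circle intersected with the line x1 = x2
F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
J = lambda x: np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])
root = newton(F, J, [1.0, 0.5])           # converges to (1/sqrt(2), 1/sqrt(2))
```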

Page 6:

Quasi-Newton methods (nonlinear systems)

Evaluate F'(xn):
Symbolically
Numerically with finite differences
Automatic differentiation

Solve the linear system F'(xn) * sn = F(xn):
Direct solve: LU, Cholesky
Iterative methods: GMRES, CG
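As a minimal sketch of the finite-difference option, the Jacobian can be approximated column by column at the cost of n extra F-evaluations; the test function and the step size h are hypothetical choices.

```python
import numpy as np

def fd_jacobian(F, x, h=1e-7):
    """Forward-difference approximation of the Jacobian, one column per variable."""
    x = np.asarray(x, dtype=float)
    fx = F(x)
    n = x.size
    Jac = np.empty((fx.size, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        Jac[:, j] = (F(x + e) - fx) / h   # j-th column: dF/dx_j
    return Jac

# hypothetical test: F(x) = (x1^2 + x2, x1*x2); exact Jacobian at (2,3) is [[4,1],[3,2]]
F = lambda x: np.array([x[0]**2 + x[1], x[0] * x[1]])
Jac = fd_jacobian(F, np.array([2.0, 3.0]))
```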

Page 7:

Quasi-Newton methods (nonlinear systems)

Computation:

F(xk): n scalar function evaluations
F'(xk): n^2 scalar function evaluations

LU: O(2n^3/3), Cholesky: O(n^3/3)

Krylov methods: cost depends on the condition number

Page 8:

Quasi-Newton methods (nonlinear systems)

LU and Cholesky are useful when we want to reuse the factorization (quasi-implicit)
Difficult to parallelize and balance the workload
Cholesky is faster and more stable but needs SPD (!)

For n large (n ~ 10^6), factorization is very impractical
Krylov methods contain easily parallelizable elements (vector updates, inner products, matrix-vector products)
CG is faster and more stable but needs SPD
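A minimal sketch of CG, built only from the easily parallelizable kernels listed above (vector updates, inner products, matrix-vector products); the 2x2 SPD system is a hypothetical example.

```python
import numpy as np

def cg(A, b, tol=1e-10, max_iter=1000):
    """Conjugate gradient for A*x = b, A assumed SPD."""
    x = np.zeros_like(b)
    r = b - A @ x                    # residual
    p = r.copy()                     # search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p                   # matrix-vector product
        alpha = rs / (p @ Ap)        # inner products
        x += alpha * p               # vector updates
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # hypothetical SPD matrix
b = np.array([1.0, 2.0])
x = cg(A, b)
```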

Page 9:

Quasi-Newton methods (nonlinear systems)

Advantages:

Under standard assumptions, Newton’s method converges locally and quadratically

There exists a domain of attraction S which contains the solution

Once the iterates enter S, they stay in S and eventually converge to x*

The algorithm is memoryless (self-corrective)

Page 10:

Quasi-Newton methods (nonlinear systems)

Disadvantages:

Convergence depends on the choice of x0

F’(x) has to be evaluated for each xk

Computation can be expensive: F(xk), F’(xk), sk

Page 11:

Quasi-Newton methods (nonlinear systems)

Implicit schemes for ODEs

y’ = f(t,y)

Forward Euler: yn+1 = yn + h*f(tn, yn) (explicit)

Backward Euler: yn+1 = yn + h*f(tn+1, yn+1) (implicit)

Implicit schemes need the solution of a nonlinear system

(also Crank–Nicolson, Runge–Kutta, linear multistep formulas)
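One backward Euler step can be sketched by applying Newton's method to the implicit equation g(y) = y – yn – h*f(tn+1, y) = 0; the scalar test problem y' = –y, the step size, and the tolerances are hypothetical. For f(t, y) = –y the system is linear, so the exact step is yn+1 = yn / (1 + h), which the Newton loop reproduces.

```python
def backward_euler_step(f, dfdy, t_next, y_n, h, tol=1e-12, max_iter=20):
    """One implicit Euler step for scalar y' = f(t, y), solved with Newton."""
    y = y_n                               # initial guess for the implicit equation
    for _ in range(max_iter):
        g = y - y_n - h * f(t_next, y)    # residual of the nonlinear equation
        if abs(g) < tol:
            break
        gp = 1.0 - h * dfdy(t_next, y)    # g'(y)
        y = y - g / gp                    # Newton update
    return y

# hypothetical test problem: y' = -y, y(0) = 1, one step of size h = 0.1
f = lambda t, y: -y
dfdy = lambda t, y: -1.0
y1 = backward_euler_step(f, dfdy, 0.1, 1.0, 0.1)   # exact value: 1/1.1
```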

Page 12:

Quasi-Newton methods (nonlinear systems)

How to circumvent evaluating F'(xk)? Broyden's method:

Bk+1 = Bk + (yk – Bk*sk)*sk^T / <sk, sk>

xk+1 = xk – Bk^-1 * F(xk)

Inverse update (Sherman–Morrison formula):

Hk+1 = Hk + (sk – Hk*yk)*sk^T*Hk / <sk, Hk*yk>

xk+1 = xk – Hk * F(xk)

( sk = xk+1 – xk, yk = F(xk+1) – F(xk) )
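A minimal sketch of Broyden's method with the inverse update, so each step needs only matrix-vector products and a rank-one correction instead of a linear solve; H0 = I, the starting point, and the test system are hypothetical choices.

```python
import numpy as np

def broyden(F, x0, tol=1e-12, max_iter=200):
    """Broyden's method with the inverse (Sherman-Morrison) update, H0 = I."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(x.size)                 # approximation of F'(x)^-1
    fx = F(x)
    for _ in range(max_iter):
        if np.linalg.norm(fx) < tol:
            break
        s = -H @ fx                    # sk = xk+1 - xk
        x_new = x + s
        f_new = F(x_new)
        y = f_new - fx                 # yk = F(xk+1) - F(xk)
        Hy = H @ y
        # Hk+1 = Hk + (sk - Hk*yk)*sk^T*Hk / <sk, Hk*yk>
        H = H + np.outer(s - Hy, s @ H) / (s @ Hy)
        x, fx = x_new, f_new
    return x

# hypothetical test system: unit circle intersected with the line x1 = x2
F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0, x[0] - x[1]])
root = broyden(F, [0.7, 0.7])
```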

Page 13:

Quasi-Newton methods (nonlinear systems)

Advantages:
No need to compute F'(xk)
For the inverse update – no linear system to solve

Disadvantages:
Only superlinear convergence (vs. quadratic for Newton)
No longer memoryless

Page 14:

Quasi-Newton methods (unconstrained optimization)

Problem: find the global minimizer of a cost function

f : R^n → R, x* = arg min f

If f is differentiable, the problem can be attacked by looking for zeros of the gradient

Page 15:

Quasi-Newton methods (unconstrained optimization)

Descent methods: xk+1 = xk – λk*Pk*∇f(xk)

Pk = In – steepest descent

Pk = ∇²f(xk)^-1 – Newton's method

Pk = Bk^-1 – quasi-Newton

Angle between Pk*∇f(xk) and ∇f(xk) less than 90°

Bk has to mimic the behavior of the Hessian

Page 16:

Quasi-Newton methods (unconstrained optimization)

Global convergence

Line search
Step length: backtracking, interpolation
Sufficient decrease: Wolfe conditions

Trust regions
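A backtracking line search can be sketched as follows: shrink the step length until the sufficient-decrease (Armijo, first Wolfe) condition holds. The constants c and rho and the quadratic test function are hypothetical choices.

```python
import numpy as np

def backtracking(f, grad_f, x, p, c=1e-4, rho=0.5, lam0=1.0):
    """Shrink lambda until f(x + lam*p) <= f(x) + c*lam*<grad f(x), p>."""
    lam = lam0
    fx = f(x)
    slope = grad_f(x) @ p            # negative for a descent direction
    while f(x + lam * p) > fx + c * lam * slope:
        lam *= rho                   # backtrack
    return lam

# hypothetical example: f(x) = ||x||^2, steepest-descent direction
f = lambda x: float(x @ x)
grad_f = lambda x: 2.0 * x
x = np.array([1.0, 1.0])
p = -grad_f(x)
lam = backtracking(f, grad_f, x, p)
```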

Page 17:

Quasi-Newton methods (unconstrained optimization)

For quasi-Newton, Bk has to resemble ∇²f(xk)

Single-rank (SR1):
Bk+1 = Bk + (yk – Bk*sk)*(yk – Bk*sk)^T / <yk – Bk*sk, sk>

Symmetry (PSB):
Bk+1 = Bk + [ (yk – Bk*sk)*sk^T + sk*(yk – Bk*sk)^T ] / <sk, sk> – <yk – Bk*sk, sk> * sk*sk^T / <sk, sk>^2

Positive def. (DFP):
Bk+1 = Bk + [ (yk – Bk*sk)*yk^T + yk*(yk – Bk*sk)^T ] / <yk, sk> – <yk – Bk*sk, sk> * yk*yk^T / <yk, sk>^2

Inverse update (BFGS):
Hk+1 = (I – sk*yk^T / <yk, sk>) * Hk * (I – yk*sk^T / <yk, sk>) + sk*sk^T / <yk, sk>
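As a sketch, the BFGS inverse update can be applied once and checked against the secant condition Hk+1*yk = sk, which it satisfies by construction; H = I and the vectors s, y (with <y, s> > 0) are hypothetical.

```python
import numpy as np

def bfgs_inverse_update(H, s, y):
    """BFGS update of the inverse-Hessian approximation H."""
    rho = 1.0 / (y @ s)              # requires the curvature condition <y, s> > 0
    I = np.eye(len(s))
    return (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
           + rho * np.outer(s, s)

# hypothetical step and gradient-difference vectors
H = np.eye(2)
s = np.array([0.3, -0.1])
y = np.array([0.5, 0.2])
H_new = bfgs_inverse_update(H, s, y)   # satisfies H_new @ y == s
```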

Page 18:

Quasi-Newton methods (unconstrained optimization)

Computation:
Matrix updates, inner products
DFP, PSB: 3 matrix-vector products
BFGS: 2 matrix-matrix products

Storage:
Limited-memory versions (L-BFGS)
Store {sk, yk} for the last m iterations and recompute H
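The storage scheme above is usually paired with the two-loop recursion, which applies the inverse-Hessian approximation to a vector directly from the stored pairs, without ever forming H. This sketch assumes H0 = I and a single hypothetical stored pair, in which case the result must agree with one dense BFGS inverse update.

```python
import numpy as np

def lbfgs_direction(g, pairs):
    """Two-loop recursion: return H*g from stored (s, y) pairs, with H0 = I."""
    q = g.copy()
    alphas = []
    for s, y in reversed(pairs):              # first loop: newest pair first
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        alphas.append(a)
        q = q - a * y
    r = q                                     # apply H0 = I
    for (s, y), a in zip(pairs, reversed(alphas)):   # second loop: oldest first
        rho = 1.0 / (y @ s)
        b = rho * (y @ r)
        r = r + (a - b) * s
    return r

# hypothetical stored pair and gradient
s = np.array([0.3, -0.1])
y = np.array([0.5, 0.2])
g = np.array([1.0, 2.0])
r = lbfgs_direction(g, [(s, y)])
```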

Page 19:

Further improvements

Preconditioning the linear system

For faster convergence one may solve K*Bk*pk = K*F(xk)

If Bk is SPD (and sparse), we can use sparse approximate inverses to generate the preconditioner

This preconditioner can be refined on a subspace of Bk using an algebraic multigrid technique

We need to solve the eigenvalue problem

Page 20:

Further improvements

Model reduction

Sometimes the dimension of the system is very large

Build a smaller model that captures the essence of the original

An approximation of the model variability can be retrieved from an ensemble of forward simulations

The covariance matrix gives the subspace

We need to solve the eigenvalue problem

Page 21:

QR/QL algorithms for symmetric matrices

Solves the eigenvalue problem
Iterative algorithm
Uses a QR/QL factorization at each step
(A = Q*R, Q unitary, R upper triangular)

for k = 1, 2, ...
  Ak = Qk*Rk
  Ak+1 = Rk*Qk
end

Diagonal of Ak converges to the eigenvalues of A
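The unshifted iteration above can be sketched directly; the symmetric 2x2 matrix and the iteration count are hypothetical, and the diagonal of Ak approaches the eigenvalues (5 ± sqrt(5))/2.

```python
import numpy as np

# hypothetical symmetric test matrix
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

Ak = A.copy()
for _ in range(50):
    Q, R = np.linalg.qr(Ak)   # Ak = Qk*Rk
    Ak = R @ Q                # Ak+1 = Rk*Qk, similar to Ak
eigs = np.sort(np.diag(Ak))   # diagonal -> eigenvalues
```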

Page 22:

QR/QL algorithms for symmetric matrices

The matrix A is reduced to upper Hessenberg form before starting the iterations

Householder reflections (U = I – 2*v*v'/(v'*v))
Reduction is made column-wise
If A is symmetric, it is reduced to tridiagonal form

Page 23:

QR/QL algorithms for symmetric matrices

Convergence to triangular form can be slow
Origin shifts are used to accelerate it

for k = 1, 2, ...
  Ak – zk*I = Qk*Rk
  Ak+1 = Rk*Qk + zk*I
end

Wilkinson shift
QR makes heavy use of matrix-matrix products

Page 24:

Alternatives to quasi-Newton

Inexact Newton methods
Inner iteration – determine a search direction by solving the linear system to a certain tolerance
Only Hessian-vector products are necessary
Outer iteration – line search along the search direction

Nonlinear CG
Residual replaced by the gradient of the cost function
Line search
Different flavors

Page 25:

Alternatives to quasi-Newton

Direct search

Does not involve derivatives of the cost function

Uses a structure called a simplex to search for decrease in f

Stops when further progress cannot be achieved

Can get stuck in a local minimum
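The slide describes the simplex-based method; as a minimal sketch of the same derivative-free idea, here is compass (pattern) search instead: poll the coordinate directions, shrink the step when no decrease is found, and stop when the step falls below a tolerance. Like the simplex method, it stops when no further progress is achieved and can stall at a local minimum. The quadratic test function is hypothetical.

```python
import numpy as np

def compass_search(f, x0, step=1.0, tol=1e-8, max_iter=10_000):
    """Derivative-free compass search: poll +/- coordinate directions."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(max_iter):
        if step < tol:                       # no further progress achievable
            break
        improved = False
        for d in np.vstack([np.eye(n), -np.eye(n)]):
            trial = x + step * d
            if f(trial) < f(x):              # accept any decrease
                x, improved = trial, True
                break
        if not improved:
            step *= 0.5                      # shrink the pattern
    return x

# hypothetical cost function with minimizer (1, -2)
f = lambda x: (x[0] - 1.0)**2 + (x[1] + 2.0)**2
xmin = compass_search(f, [0.0, 0.0])
```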

Page 26:

More alternatives

Monte Carlo

A computational method relying on repeated random sampling

Can be used for optimization (MDO) and inverse problems, using random walks

In the case of multiple correlated variables, the correlation matrix is SPD, so we can use Cholesky to factorize it
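The Cholesky use mentioned above can be sketched as follows: factor a hypothetical SPD correlation matrix C = L*L^T and map independent standard normal samples z to L*z, which then have covariance approximately C.

```python
import numpy as np

rng = np.random.default_rng(0)

C = np.array([[1.0, 0.8],
              [0.8, 1.0]])               # hypothetical SPD correlation matrix
L = np.linalg.cholesky(C)                # C = L @ L.T
z = rng.standard_normal((2, 100_000))    # independent N(0, 1) samples
xsamp = L @ z                            # correlated samples, covariance ~ C
emp_cov = np.cov(xsamp)                  # empirical covariance for a sanity check
```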

Page 27:

Conclusions

Newton's method is a powerful method with many applications and uses (solving nonlinear systems, finding minima of cost functions). It can be used together with many other numerical algorithms (factorizations, linear solvers).

The optimization and parallelization of matrix-vector and matrix-matrix products, decompositions, and other numerical methods can have a significant impact on overall performance.

Page 28:

Thank you for your time!