10.34: Numerical Methods Applied to Chemical Engineering
Lecture 11: Unconstrained Optimization
Newton-Raphson and trust region methods
Recap
• Optimization
• Steepest descent
Recap
• Method of steepest descent:
  $x_{i+1} = x_i - \alpha_i g(x_i)$
• Estimating an optimal $\alpha_i$ with a Taylor expansion:
  $f(x_{i+1}) = f(x_i) - \alpha_i g(x_i)^T g(x_i) + \tfrac{1}{2}\alpha_i^2 g(x_i)^T H(x_i) g(x_i) + \dots$
• This is quadratic in $\alpha_i$, so find the critical point:
  $\alpha_i = \dfrac{g(x_i)^T g(x_i)}{g(x_i)^T H(x_i) g(x_i)}$
[Figure: contour plot of the objective with steepest-descent iterates; axes $x_1$ and $x_2$, each from −1 to 1.]
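As a concrete illustration, here is a minimal NumPy sketch of steepest descent with this optimal step size on a quadratic test function; the test problem, starting point, and tolerance are my choices, not from the lecture.

```python
import numpy as np

# Steepest descent with the optimal step from the quadratic Taylor model:
# alpha_i = g^T g / (g^T H g). Test problem and tolerances are illustrative.
H = np.array([[2.0, 0.0], [0.0, 20.0]])   # constant Hessian of a quadratic f
grad = lambda x: H @ x                     # g(x) = H x for f(x) = 1/2 x^T H x

x = np.array([1.0, 1.0])
for i in range(200):
    g = grad(x)
    if np.linalg.norm(g) < 1e-10:
        break
    alpha = (g @ g) / (g @ H @ g)          # optimal step along -g
    x = x - alpha * g
print(i, x)                                # converges to the minimum at the origin
```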
Recap
• Damped dynamics driven by a force: $m\ddot{x} = -\gamma \dot{x} + F$
• With a force derived from a potential: $m\ddot{x} = -\gamma \dot{x} - \nabla U$
Recap
• Method of steepest descent:
  $x_{i+1} = x_i - \alpha_i g(x_i)$
[Figure: contour plot (levels −5 to 2) with steepest-descent iterates; axes $x_1$ and $x_2$, each from −3 to 3.]
Unconstrained Optimization
• Conjugate gradient method:
• Consider the minimization of: $f(x) = \tfrac{1}{2} x^T A x - b^T x$
• This has a minimum when?
  • $g(x) = \nabla f(x) = Ax - b = 0$, i.e. $Ax = b$
  • the Hessian, $H(x) = A$, is symmetric, positive definite
• Iterative method: $x_{i+1} = x_i + \alpha_i p_i$
• $p_i$ is a descent direction, but not necessarily the steepest one.
• Let's determine the optimal $\alpha_i$ for a given direction $p_i$.
Unconstrained Optimization
• Conjugate gradient method:
• $f(x_{i+1}) = f(x_i) + \alpha_i g(x_i)^T p_i + \tfrac{1}{2}\alpha_i^2 p_i^T A p_i$ is quadratic in $\alpha_i$.
• $f(x_{i+1})$ is minimized when $\alpha_i = -\dfrac{g(x_i)^T p_i}{p_i^T A p_i}$
• For a given direction $p_i$ there is an optimal step size $\alpha_i$.
• How can we choose the optimal direction?
• $f(x_{i+1})$ is already minimized along $p_i$: $g(x_{i+1})^T p_i = 0$
• Can this hold for $f(x_{i+2})$ also?
• Let $g(x_{i+2})^T p_i = 0$, then:
  $[A(x_{i+1} + \alpha_{i+1} p_{i+1}) - b]^T p_i = 0 \;\Rightarrow\; p_{i+1}^T A p_i = 0$
• This is satisfied by choosing the new direction as
  $p_{i+1} = -g(x_{i+1}) + \beta_{i+1} p_i, \qquad \beta_{i+1} = \dfrac{g(x_{i+1})^T A p_i}{p_i^T A p_i}$
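To make the iteration concrete, here is a minimal NumPy sketch of the conjugate gradient method just derived. The function name, tolerance, and test problem are my choices, and the $\beta$ update uses the residual-ratio form common in implementations, which is algebraically equivalent to the $g^T A p / p^T A p$ form above for a quadratic in exact arithmetic.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Minimize f(x) = 1/2 x^T A x - b^T x for symmetric positive definite A,
    i.e. solve Ax = b, using the conjugate gradient iteration derived above."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    max_iter = n if max_iter is None else max_iter
    r = b - A @ x                 # residual r = -g(x) = b - Ax
    p = r.copy()                  # first direction: steepest descent
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)     # optimal step: alpha_i = -g^T p / (p^T A p)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p  # enforces p_{i+1}^T A p_i = 0
        rs = rs_new
    return x

# Usage: the lecture's A = [[1, 0], [0, 10]] with b = 0 is trivial, so take a
# nonzero b; CG converges in at most 2 iterations for this 2x2 problem.
A = np.array([[1.0, 0.0], [0.0, 10.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))   # expect [1.0, 0.2]
```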
Unconstrained Optimization
• Method of steepest descent/conjugate gradient:
• Example: contours for the function $f(x) = x_1^2 + 10x_2^2$, steepest descent with fixed step $\alpha_i = 0.015$; here $A = \begin{pmatrix} 1 & 0 \\ 0 & 10 \end{pmatrix}$, $b = 0$.
[Figure: contour plot with the iterates of both methods; axes $x_1$ and $x_2$, each from −10 to 10.]
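A minimal sketch reproducing this experiment in NumPy; the starting point and stopping tolerance are my choices, while the step size matches the slide.

```python
import numpy as np

# Fixed-step steepest descent on the slide's example f(x) = x1^2 + 10*x2^2.
f = lambda x: x[0]**2 + 10.0 * x[1]**2
grad = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])

alpha = 0.015
x = np.array([9.0, 9.0])
for i in range(1000):
    g = grad(x)
    if np.linalg.norm(g) < 1e-8:
        break
    x = x - alpha * g              # x_{i+1} = x_i - alpha * g(x_i)
print(i, x, f(x))  # hundreds of iterations: the stiff x2 axis forces a small step
```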
Unconstrained Optimization
• Conjugate gradient method:
• Used to solve linear equations in $O(N)$ iterations.
• Requires only the ability to compute the product $Ay$.
• The actual matrix $A$ is never needed. We only need to compute its action on different vectors, $y$!
• Only for symmetric, positive definite matrices.
• More sophisticated minimization methods exist for arbitrary matrices.
• Optimization applied to linear equations is the state of the art for solving large linear systems.
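Because only the product $Ay$ is needed, the matrix can be represented by a routine. Here is a minimal matrix-free sketch using SciPy's `cg` with a `LinearOperator`; the 1D stencil example and all names are mine, not from the lecture, and `cg` is called with its defaults to stay version-agnostic.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

# Matrix-free CG: a 1D discrete (negative) Laplacian applied as a stencil,
# so the N x N matrix is never formed -- only its action A @ y.
n = 100

def apply_A(y):
    Ay = 2.0 * y
    Ay[:-1] -= y[1:]
    Ay[1:] -= y[:-1]
    return Ay

A = LinearOperator((n, n), matvec=apply_A)   # symmetric positive definite
b = np.ones(n)
x, info = cg(A, b)                           # info == 0 means CG converged
print(info, np.linalg.norm(apply_A(x) - b))
```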
Newton-Raphson
• Finding local minima in unconstrained optimization problems involves solving the equation:
  $g(x) = \nabla f(x) = 0$
• $g(x) = 0$ at minima of $f(x)$.
• If we begin close enough to a minimum, can we expect the NR method to converge to that minimum?
• Yes! NR is locally convergent.
• Accuracy of the iterates will improve quadratically!
• Newton-Raphson iteration: $x_{i+1} = x_i - H(x_i)^{-1} g(x_i)$
• What is the Jacobian of $g(x)$? The Hessian, $H(x)$.
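A minimal sketch of this iteration, assuming the gradient and Hessian are available as functions; the names and tolerances are my choices.

```python
import numpy as np

# Newton-Raphson for minimization: solve g(x) = grad f(x) = 0 by iterating
# x <- x - H(x)^{-1} g(x). Function names are illustrative.
def newton_raphson(grad, hess, x0, tol=1e-10, max_iter=50):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        d = np.linalg.solve(hess(x), g)  # solve H d = g; never form H^{-1}
        x = x - d
    return x

# On the quadratic f(x) = x1^2 + 10 x2^2 the Hessian is constant,
# so NR converges in a single step from any starting point.
grad = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])
hess = lambda x: np.diag([2.0, 20.0])
print(newton_raphson(grad, hess, [9.0, 9.0]))   # -> [0, 0] after one iteration
```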
Unconstrained Optimization
• Method of steepest descent/Newton-Raphson:
• Example: contours for the function $f(x) = x_1^2 + 10x_2^2$, steepest descent with fixed step $\alpha_i = 0.015$.
[Figure: contour plot comparing the two methods; axes $x_1$ and $x_2$, each from −10 to 10.]
Unconstrained Optimization
• Method of steepest descent/Newton-Raphson:
• Example: contours of $\log f(x)$ for $f(x) = x_1^2 + 10x_2^2$, steepest descent with the optimal step $\alpha_i = \dfrac{g(x_i)^T g(x_i)}{g(x_i)^T H(x_i) g(x_i)}$.
[Figure: contour plot comparing the two methods; axes $x_1$ and $x_2$, each from −1 to 1.]
Newton-Raphson
• Compare:
• Optimized steepest descent: $x_{i+1} = x_i - \alpha_i g(x_i)$ with $\alpha_i = \dfrac{g(x_i)^T g(x_i)}{g(x_i)^T H(x_i) g(x_i)}$
• Newton-Raphson: $x_{i+1} = x_i - H(x_i)^{-1} g(x_i)$
• What is the difference?
• What are the strengths of Newton-Raphson?
• What are the weaknesses of Newton-Raphson?
• What are the strengths of steepest descent?
• What are the weaknesses of steepest descent?
Trust-Region Methods
• Both Newton-Raphson and the optimized steepest descent methods assume the objective function can be described locally by a quadratic function.
• That quadratic approximation may be good or bad.
[Figures (two slides): steps from $x_i$ to $x_{i+1}$ illustrating the local quadratic approximation.]
Trust-Region Methods
• Trust region methods choose the Newton-Raphson direction when the quadratic approximation is good and the steepest descent direction when it is not.
• This choice is based on whether the Newton-Raphson step is too large.
[Figure: Newton-Raphson and steepest descent steps drawn relative to a trust region of radius $R_i$.]

Trust-Region Methods
• Newton step: $d_i^{NR} = -H(x_i)^{-1} g(x_i)$
• Steepest descent step: $d_i^{SD} = -\alpha_i g(x_i)$
• If $\|d_i^{NR}\|_2 < R_i$ and $f(x_i + d_i^{NR}) < f(x_i)$:
  • Take the Newton-Raphson step.
• Else, take a step in the steepest descent direction:
  • If $\|d_i^{SD}\|_2 < R_i$ and $f(x_i + d_i^{SD}) < f(x_i)$ with the optimal step size:
    • Take the optimal steepest descent step.
  • Else, step to the trust boundary using $\alpha_i = R_i / \|g(x_i)\|_2$.
(A code sketch of this step selection follows below.)
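The step-selection logic above might be coded as follows; all function and variable names are illustrative, and this is a sketch rather than the lecture's implementation.

```python
import numpy as np

# Sketch of the trust-region step selection described above.
def trust_region_step(f, grad, hess, x, R):
    g = grad(x)
    H = hess(x)
    d_nr = -np.linalg.solve(H, g)                 # Newton-Raphson step
    if np.linalg.norm(d_nr) < R and f(x + d_nr) < f(x):
        return d_nr, "newton"
    # Optimal steepest descent step: alpha = g^T g / (g^T H g)
    alpha = (g @ g) / (g @ H @ g)
    d_sd = -alpha * g
    if np.linalg.norm(d_sd) < R and f(x + d_sd) < f(x):
        return d_sd, "steepest"
    # Otherwise step to the trust boundary: alpha = R / ||g||
    return -(R / np.linalg.norm(g)) * g, "boundary"
```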
Trust-Region Methods
• The size of the trust region can be set arbitrarily initially.
• The trust region grows or shrinks depending on which of the two steps we choose.
• If the Newton-Raphson step was chosen:
  • The quadratic approximation has minimum value $m_i = f(x_i) + g(x_i)^T d_i + \tfrac{1}{2} d_i^T H(x_i) d_i$
  • GROW the trust radius when $m_i > f(x_i + d_i)$, because the function was smaller than predicted;
  • otherwise, SHRINK the trust radius.
• If the steepest descent step was chosen, keep the trust radius the same.
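Continuing the sketch above, the radius update could look like the following; the grow/shrink factors of 2 and 1/2 are my assumption, since the slide does not give specific values.

```python
def update_radius(f, grad, hess, x, d, kind, R, grow=2.0, shrink=0.5):
    # Quadratic-model value at the chosen step d (kind comes from trust_region_step)
    g, H = grad(x), hess(x)
    m = f(x) + g @ d + 0.5 * d @ H @ d
    if kind == "newton":
        # Function decreased more than the model predicted: trust the model more.
        return grow * R if f(x + d) < m else shrink * R
    return R  # steepest-descent step: keep the trust radius unchanged
```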
Trust-Region Methods
• What is a good initial value of the trust-region radius?
• MATLAB uses a radius of one initially!
• Variations on the trust-region method exist as well.
• MATLAB uses the dog-leg step instead of the optimal steepest descent step:
[Figure: trust region of radius $R_i$ showing the Newton-Raphson step, the optimal steepest descent step, and the dog-leg step.]
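For illustration, here is a sketch of the standard (Powell) dog-leg step. The lecture does not spell out its formula, so the details below are the textbook construction, not the slide's code.

```python
import numpy as np

# Dog-leg step: follow steepest descent to the Cauchy point, then bend toward
# the Newton point, truncating the path at the trust boundary ||d|| = R.
def dogleg_step(g, H, R):
    d_nr = -np.linalg.solve(H, g)                  # full Newton step
    if np.linalg.norm(d_nr) <= R:
        return d_nr
    d_sd = -((g @ g) / (g @ H @ g)) * g            # Cauchy (optimal SD) point
    if np.linalg.norm(d_sd) >= R:
        return -(R / np.linalg.norm(g)) * g        # SD direction to the boundary
    # Walk from the Cauchy point toward the Newton point until ||d|| = R:
    v = d_nr - d_sd
    a, b, c = v @ v, 2 * (d_sd @ v), d_sd @ d_sd - R**2
    t = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
    return d_sd + t * v
```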
Unconstrained Optimization
• Method of steepest descent/Newton-Raphson/Trust-Region:
• Example: contours of $\log f(x)$ for $f(x) = (x_1^2 + 10x_2^2)^2$, with the optimal step $\alpha_i = \dfrac{g(x_i)^T g(x_i)}{g(x_i)^T H(x_i) g(x_i)}$.
[Figure: contour plot comparing the three methods; axes $x_1$ and $x_2$, each from −1 to 1.]
Unconstrained Optimization
• Method of steepest descent/Newton-Raphson/Trust-Region:
• Example: contours of $\log f(x)$ for $f(x) = (x_1^2 + 10x_2^2)^2$, with the optimal step $\alpha_i = \dfrac{g(x_i)^T g(x_i)}{g(x_i)^T H(x_i) g(x_i)}$ (detail view).
[Figure: zoomed contour plot; axes $x_1$ and $x_2$, each from 0.9 to 1.1.]
MIT OpenCourseWare
https://ocw.mit.edu

10.34 Numerical Methods Applied to Chemical Engineering
Fall 2015
For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.