10.34: Numerical Methods Applied to Chemical Engineering
Lecture 11: Unconstrained Optimization
Newton-Raphson and trust region methods
Recap
• Optimization
• Steepest descent
Recap
• Method of steepest descent:
  $x_{i+1} = x_i - \alpha_i g(x_i)$
• Estimating an optimal $\alpha_i$ with a Taylor expansion:
  $f(x_{i+1}) = f(x_i) - \alpha_i g(x_i)^T g(x_i) + \tfrac{1}{2}\alpha_i^2 g(x_i)^T H(x_i) g(x_i) + \dots$
• This is quadratic in $\alpha_i$, so find the critical point:
  $\alpha_i = \dfrac{g(x_i)^T g(x_i)}{g(x_i)^T H(x_i) g(x_i)}$
[Figure: contour plot of the objective with steepest-descent iterates; axes $x_1$ and $x_2$, each from −1 to 1.]
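As a concrete illustration, here is a minimal NumPy sketch of steepest descent with this optimal step size on a quadratic test function; the test problem, starting point, and tolerance are my choices, not from the lecture.

```python
import numpy as np

# Steepest descent with the optimal step from the quadratic Taylor model:
# alpha_i = g^T g / (g^T H g). Test problem and tolerances are illustrative.
H = np.array([[2.0, 0.0], [0.0, 20.0]])   # constant Hessian of a quadratic f
grad = lambda x: H @ x                     # g(x) = H x for f(x) = 1/2 x^T H x

x = np.array([1.0, 1.0])
for i in range(200):
    g = grad(x)
    if np.linalg.norm(g) < 1e-10:
        break
    alpha = (g @ g) / (g @ H @ g)          # optimal step along -g
    x = x - alpha * g
print(i, x)                                # converges to the minimum at the origin
```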
Recap
• Damped dynamics driven by a force: $m\ddot{x} = -\gamma \dot{x} + F$
• With a force derived from a potential: $m\ddot{x} = -\gamma \dot{x} - \nabla U$
Recap
• Method of steepest descent:
  $x_{i+1} = x_i - \alpha_i g(x_i)$
[Figure: contour plot (levels −5 to 2) with steepest-descent iterates; axes $x_1$ and $x_2$, each from −3 to 3.]
Unconstrained Optimization
• Conjugate gradient method:
• Consider the minimization of: $f(x) = \tfrac{1}{2} x^T A x - b^T x$
• This has a minimum when?
  • $g(x) = \nabla f(x) = Ax - b = 0$, i.e. $Ax = b$
  • the Hessian, $H(x) = A$, is symmetric, positive definite
• Iterative method: $x_{i+1} = x_i + \alpha_i p_i$
• $p_i$ is a descent direction, but not necessarily the steepest one.
• Let's determine the optimal $\alpha_i$ for a given direction $p_i$.
Unconstrained Optimization
• Conjugate gradient method:
• $f(x_{i+1}) = f(x_i) + \alpha_i g(x_i)^T p_i + \tfrac{1}{2}\alpha_i^2 p_i^T A p_i$ is quadratic in $\alpha_i$.
• $f(x_{i+1})$ is minimized when $\alpha_i = -\dfrac{g(x_i)^T p_i}{p_i^T A p_i}$
• For a given direction $p_i$ there is an optimal step size $\alpha_i$.
• How can we choose the optimal direction?
• $f(x_{i+1})$ is already minimized along $p_i$: $g(x_{i+1})^T p_i = 0$
• Can this hold for $f(x_{i+2})$ also?
• Let $g(x_{i+2})^T p_i = 0$, then:
  $[A(x_{i+1} + \alpha_{i+1} p_{i+1}) - b]^T p_i = 0 \;\Rightarrow\; p_{i+1}^T A p_i = 0$
• This is satisfied by choosing the new direction as
  $p_{i+1} = -g(x_{i+1}) + \beta_{i+1} p_i, \qquad \beta_{i+1} = \dfrac{g(x_{i+1})^T A p_i}{p_i^T A p_i}$
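To make the iteration concrete, here is a minimal NumPy sketch of the conjugate gradient method just derived. The function name, tolerance, and test problem are my choices, and the $\beta$ update uses the residual-ratio form common in implementations, which is algebraically equivalent to the $g^T A p / p^T A p$ form above for a quadratic in exact arithmetic.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Minimize f(x) = 1/2 x^T A x - b^T x for symmetric positive definite A,
    i.e. solve Ax = b, using the conjugate gradient iteration derived above."""
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    max_iter = n if max_iter is None else max_iter
    r = b - A @ x                 # residual r = -g(x) = b - Ax
    p = r.copy()                  # first direction: steepest descent
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)     # optimal step: alpha_i = -g^T p / (p^T A p)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p  # enforces p_{i+1}^T A p_i = 0
        rs = rs_new
    return x

# Usage: the lecture's A = [[1, 0], [0, 10]] with b = 0 is trivial, so take a
# nonzero b; CG converges in at most 2 iterations for this 2x2 problem.
A = np.array([[1.0, 0.0], [0.0, 10.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))   # expect [1.0, 0.2]
```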
Unconstrained Optimization
• Method of steepest descent/conjugate gradient:
• Example: contours for the function $f(x) = x_1^2 + 10x_2^2$, steepest descent with fixed step $\alpha_i = 0.015$; here $A = \begin{pmatrix} 1 & 0 \\ 0 & 10 \end{pmatrix}$, $b = 0$.
[Figure: contour plot with the iterates of both methods; axes $x_1$ and $x_2$, each from −10 to 10.]
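A minimal sketch reproducing this experiment in NumPy; the starting point and stopping tolerance are my choices, while the step size matches the slide.

```python
import numpy as np

# Fixed-step steepest descent on the slide's example f(x) = x1^2 + 10*x2^2.
f = lambda x: x[0]**2 + 10.0 * x[1]**2
grad = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])

alpha = 0.015
x = np.array([9.0, 9.0])
for i in range(1000):
    g = grad(x)
    if np.linalg.norm(g) < 1e-8:
        break
    x = x - alpha * g              # x_{i+1} = x_i - alpha * g(x_i)
print(i, x, f(x))  # hundreds of iterations: the stiff x2 axis forces a small step
```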
Unconstrained Optimization
• Conjugate gradient method:
• Used to solve linear equations in $O(N)$ iterations.
• Requires only the ability to compute the product $Ay$.
• The actual matrix $A$ is never needed. We only need to compute its action on different vectors, $y$!
• Only for symmetric, positive definite matrices.
• More sophisticated minimization methods exist for arbitrary matrices.
• Optimization applied to linear equations is the state of the art for solving large linear systems.
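Because only the product $Ay$ is needed, the matrix can be represented by a routine. Here is a minimal matrix-free sketch using SciPy's `cg` with a `LinearOperator`; the 1D stencil example and all names are mine, not from the lecture, and `cg` is called with its defaults to stay version-agnostic.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

# Matrix-free CG: a 1D discrete (negative) Laplacian applied as a stencil,
# so the N x N matrix is never formed -- only its action A @ y.
n = 100

def apply_A(y):
    Ay = 2.0 * y
    Ay[:-1] -= y[1:]
    Ay[1:] -= y[:-1]
    return Ay

A = LinearOperator((n, n), matvec=apply_A)   # symmetric positive definite
b = np.ones(n)
x, info = cg(A, b)                           # info == 0 means CG converged
print(info, np.linalg.norm(apply_A(x) - b))
```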
Newton-Raphson
• Finding local minima in unconstrained optimization problems involves solving the equation:
  $g(x) = \nabla f(x) = 0$
• $g(x) = 0$ at minima of $f(x)$.
• If we begin close enough to a minimum, can we expect the NR method to converge to that minimum?
• Yes! NR is locally convergent.
• Accuracy of the iterates will improve quadratically!
• Newton-Raphson iteration: $x_{i+1} = x_i - H(x_i)^{-1} g(x_i)$
• What is the Jacobian of $g(x)$? The Hessian, $H(x)$.
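A minimal sketch of this iteration, assuming the gradient and Hessian are available as functions; the names and tolerances are my choices.

```python
import numpy as np

# Newton-Raphson for minimization: solve g(x) = grad f(x) = 0 by iterating
# x <- x - H(x)^{-1} g(x). Function names are illustrative.
def newton_raphson(grad, hess, x0, tol=1e-10, max_iter=50):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        d = np.linalg.solve(hess(x), g)  # solve H d = g; never form H^{-1}
        x = x - d
    return x

# On the quadratic f(x) = x1^2 + 10 x2^2 the Hessian is constant,
# so NR converges in a single step from any starting point.
grad = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])
hess = lambda x: np.diag([2.0, 20.0])
print(newton_raphson(grad, hess, [9.0, 9.0]))   # -> [0, 0] after one iteration
```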
Unconstrained Optimization
• Method of steepest descent/Newton-Raphson:
• Example: contours for the function $f(x) = x_1^2 + 10x_2^2$, steepest descent with fixed step $\alpha_i = 0.015$.
[Figure: contour plot comparing the two methods; axes $x_1$ and $x_2$, each from −10 to 10.]
Unconstrained Optimization
• Method of steepest descent/Newton-Raphson:
• Example: contours of $\log f(x)$ for $f(x) = x_1^2 + 10x_2^2$, steepest descent with the optimal step $\alpha_i = \dfrac{g(x_i)^T g(x_i)}{g(x_i)^T H(x_i) g(x_i)}$.
[Figure: contour plot comparing the two methods; axes $x_1$ and $x_2$, each from −1 to 1.]
Newton-Raphson
• Compare:
• Optimized steepest descent: $x_{i+1} = x_i - \alpha_i g(x_i)$ with $\alpha_i = \dfrac{g(x_i)^T g(x_i)}{g(x_i)^T H(x_i) g(x_i)}$
• Newton-Raphson: $x_{i+1} = x_i - H(x_i)^{-1} g(x_i)$
• What is the difference?
• What are the strengths of Newton-Raphson?
• What are the weaknesses of Newton-Raphson?
• What are the strengths of steepest descent?
• What are the weaknesses of steepest descent?
Trust-Region Methods
• Both Newton-Raphson and the optimized steepest descent methods assume the objective function can be described locally by a quadratic function.
• That quadratic approximation may be good or bad.
[Figures (two slides): steps from $x_i$ to $x_{i+1}$ illustrating the local quadratic approximation.]
Trust-Region Methods
• Trust region methods choose the Newton-Raphson direction when the quadratic approximation is good and the steepest descent direction when it is not.
• This choice is based on whether the Newton-Raphson step is too large.
[Figure: Newton-Raphson and steepest descent steps drawn relative to a trust region of radius $R_i$.]

Trust-Region Methods
• Newton step: $d_i^{NR} = -H(x_i)^{-1} g(x_i)$
• Steepest descent step: $d_i^{SD} = -\alpha_i g(x_i)$
• If $\|d_i^{NR}\|_2 < R_i$ and $f(x_i + d_i^{NR}) < f(x_i)$:
  • Take the Newton-Raphson step.
• Else, take a step in the steepest descent direction:
  • If $\|d_i^{SD}\|_2 < R_i$ and $f(x_i + d_i^{SD}) < f(x_i)$ with the optimal step size:
    • Take the optimal steepest descent step.
  • Else, step to the trust boundary using $\alpha_i = R_i / \|g(x_i)\|_2$.
(A code sketch of this step selection follows below.)
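The step-selection logic above might be coded as follows; all function and variable names are illustrative, and this is a sketch rather than the lecture's implementation.

```python
import numpy as np

# Sketch of the trust-region step selection described above.
def trust_region_step(f, grad, hess, x, R):
    g = grad(x)
    H = hess(x)
    d_nr = -np.linalg.solve(H, g)                 # Newton-Raphson step
    if np.linalg.norm(d_nr) < R and f(x + d_nr) < f(x):
        return d_nr, "newton"
    # Optimal steepest descent step: alpha = g^T g / (g^T H g)
    alpha = (g @ g) / (g @ H @ g)
    d_sd = -alpha * g
    if np.linalg.norm(d_sd) < R and f(x + d_sd) < f(x):
        return d_sd, "steepest"
    # Otherwise step to the trust boundary: alpha = R / ||g||
    return -(R / np.linalg.norm(g)) * g, "boundary"
```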
Trust-Region Methods
• The size of the trust region can be set arbitrarily initially.
• The trust region grows or shrinks depending on which of the two steps we choose.
• If the Newton-Raphson step was chosen:
  • The quadratic approximation has minimum value $m_i = f(x_i) + g(x_i)^T d_i + \tfrac{1}{2} d_i^T H(x_i) d_i$
  • GROW the trust radius when $m_i > f(x_i + d_i)$, because the function was smaller than predicted;
  • otherwise, SHRINK the trust radius.
• If the steepest descent step was chosen, keep the trust radius the same.
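Continuing the sketch above, the radius update could look like the following; the grow/shrink factors of 2 and 1/2 are my assumption, since the slide does not give specific values.

```python
def update_radius(f, grad, hess, x, d, kind, R, grow=2.0, shrink=0.5):
    # Quadratic-model value at the chosen step d (kind comes from trust_region_step)
    g, H = grad(x), hess(x)
    m = f(x) + g @ d + 0.5 * d @ H @ d
    if kind == "newton":
        # Function decreased more than the model predicted: trust the model more.
        return grow * R if f(x + d) < m else shrink * R
    return R  # steepest-descent step: keep the trust radius unchanged
```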
Trust-Region Methods
• What is a good initial value of the trust-region radius?
• MATLAB uses a radius of one initially!
• Variations on the trust-region method exist as well.
• MATLAB uses the dog-leg step instead of the optimal steepest descent step:
[Figure: trust region of radius $R_i$ showing the Newton-Raphson step, the optimal steepest descent step, and the dog-leg step.]
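For illustration, here is a sketch of the standard (Powell) dog-leg step. The lecture does not spell out its formula, so the details below are the textbook construction, not the slide's code.

```python
import numpy as np

# Dog-leg step: follow steepest descent to the Cauchy point, then bend toward
# the Newton point, truncating the path at the trust boundary ||d|| = R.
def dogleg_step(g, H, R):
    d_nr = -np.linalg.solve(H, g)                  # full Newton step
    if np.linalg.norm(d_nr) <= R:
        return d_nr
    d_sd = -((g @ g) / (g @ H @ g)) * g            # Cauchy (optimal SD) point
    if np.linalg.norm(d_sd) >= R:
        return -(R / np.linalg.norm(g)) * g        # SD direction to the boundary
    # Walk from the Cauchy point toward the Newton point until ||d|| = R:
    v = d_nr - d_sd
    a, b, c = v @ v, 2 * (d_sd @ v), d_sd @ d_sd - R**2
    t = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
    return d_sd + t * v
```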
Unconstrained Optimization
• Method of steepest descent/Newton-Raphson/Trust-Region:
• Example: contours of $\log f(x)$ for $f(x) = (x_1^2 + 10x_2^2)^2$, with the optimal step $\alpha_i = \dfrac{g(x_i)^T g(x_i)}{g(x_i)^T H(x_i) g(x_i)}$.
[Figure: contour plot comparing the three methods; axes $x_1$ and $x_2$, each from −1 to 1.]
Unconstrained Optimization
• Method of steepest descent/Newton-Raphson/Trust-Region:
• Example: contours of $\log f(x)$ for $f(x) = (x_1^2 + 10x_2^2)^2$, with the optimal step $\alpha_i = \dfrac{g(x_i)^T g(x_i)}{g(x_i)^T H(x_i) g(x_i)}$ (detail view).
[Figure: zoomed contour plot; axes $x_1$ and $x_2$, each from 0.9 to 1.1.]
MIT OpenCourseWare
https://ocw.mit.edu

10.34 Numerical Methods Applied to Chemical Engineering
Fall 2015
For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.