
Page 1:

Steady-State Optimization
Lecture 3: Unconstrained Optimization Problems,

Numerical Methods and Applications

Dr. Abebe Geletu

Ilmenau University of Technology
Department of Simulation and Optimal Processes (SOP)

Page 2:

3.1. Unconstrained Optimization Problems

• Consider the unconstrained optimization problem

(UNLP)   min_{x ∈ Rn} f(x).

Aim: to find a point x∗ from the whole of Rn so that f has a minimum value at x∗.

• The function f in the problem UNLP is known as the objective function or performance criterion.

• A vector x∗ is a minimum point of the function f (equivalently, x∗ is a solution of the optimization problem UNLP) if

f(x) ≥ f(x∗) for all x ∈ Rn.

Page 3:

3.1. Unconstrained Optimization Problems...

• Similarly, we can also have an unconstrained maximization problem

(UNLP)   max_{x ∈ Rn} f(x).

• But a maximization problem can also, equivalently, be written as a minimization problem:

max_{x ∈ Rn} f(x) = − min_{x ∈ Rn} (−f(x)).

• Therefore, an optimization problem is either a maximization or a minimization problem.
• The discussions in this lecture are limited to minimization problems.

Page 4:

3.1. Unconstrained Optimization Problems...

Example: The optimization problem

(UNLP)   min_{x=(x1,x2)} { (1/2)(x1 − 2)² + x2² − 5 }

has x∗ = (2, 0)> as a minimum point. In particular, the minimum value is f(x∗) = −5 ≤ f(x) for all x ∈ R2.

Page 5:

3.1. Unconstrained Optimization Problems...

• In the example above, in any direction d we move from the minimum point x∗ = (2, 0)>, the value f(x∗) = −5 cannot be reduced. That is,

f(x∗ + d) ≥ f(x∗) for any direction vector d ∈ Rn.

Question: How do we know whether a given point x∗ is a minimum point of a function f or not?

First-order Taylor approximation at the point x∗:

f(x∗ + d) ≈ f(x∗) + ∇f(x∗)>d  ⇒  f(x∗ + d) − f(x∗) ≈ ∇f(x∗)>d.

In general, if x∗ is a minimum point of UNLP, then

f(x∗ + d) ≥ f(x∗)  ⇒  ∇f(x∗)>d ≥ 0 for any vector d ∈ Rn.

In particular, if we take d = −∇f(x∗), then it follows that

∇f(x∗)>[−∇f(x∗)] ≥ 0  ⇒  ‖∇f(x∗)‖² ≤ 0  ⇒  ∇f(x∗) = 0.

Page 6:

3.1. Unconstrained Optimization Problems...

First-order optimality condition for unconstrained optimization problems

If x∗ is a minimum point of UNLP, then

∇f(x∗) = 0.

Remark: For an unconstrained optimization problem, if ∇f(x) ≠ 0, then x is not a minimum point of f(x).

Therefore, we look for the minimum points of a function f(x) (i.e. solutions of UNLP) among those points that satisfy the equation

∇f(x) = 0.

• Points that satisfy the equation ∇f(x) = 0 are commonly known as stationary points and they are candidates for optimality.

Page 7:

3.1. Unconstrained Optimization Problems...

Example 1: (see the example above) Let f(x) = (1/2)(x1 − 2)² + x2² − 5. Then

∇f(x) = 0  ⇒  ( x1 − 2 , 2x2 )> = ( 0 , 0 )>  ⇒  x1 = 2, x2 = 0.

Figure: Surface plot of f(x1, x2) = 0.5(x1 − 2)² + x2² − 5 (generated by the MATLAB code below)

function my2DPlot2
x = -10:0.1:10; y = -10:0.1:10;
[X,Y] = meshgrid(x,y);
Z = 0.5*(X-2).^2 + Y.^2 - 5;
meshc(X,Y,Z)
xlabel('x - axis');
ylabel('y - axis');
hold on
%plot3(2,0,0,'sk','markerfacecolor',[0,0,0]);
title('Plot for the function f(x_{1},x_{2})=0.5(x_{1}-2)^{2}+ x_{2}^{2} - 5')

Page 8:

3.1. Unconstrained Optimization Problems...

Example 2: Find the solution(s) of the optimization problem

min_{(x1,x2)} { f(x1, x2) = x1³ − 3x1 − x2³ + 3x2 }

Solution: First compute ∇f(x) to obtain

∇f(x) = ( ∂f/∂x1 , ∂f/∂x2 )> = ( 3x1² − 3 , −3x2² + 3 )>.

Next find the stationary points:

∇f(x) = ( 3x1² − 3 , −3x2² + 3 )> = ( 0 , 0 )>  ⇒  3x1² − 3 = 0 and −3x2² + 3 = 0.

We obtain x1 = ±1 and x2 = ±1.

• Hence, the points (1, 1), (1, −1), (−1, 1), (−1, −1) solve the equation ∇f(x) = 0 and they are candidates for optimality.
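As a quick check, the stationary points can also be computed symbolically; a minimal sketch, assuming the Symbolic Math Toolbox is available:

% Stationary points of f(x1,x2) = x1^3 - 3*x1 - x2^3 + 3*x2
syms x1 x2
f = x1^3 - 3*x1 - x2^3 + 3*x2;
g = gradient(f, [x1, x2]);       % symbolic gradient of f
S = solve(g == 0, [x1, x2]);     % solve grad f(x) = 0
disp([S.x1, S.x2])               % lists the four candidate points (+-1, +-1)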

Page 9:

3.1. Unconstrained Optimization Problems...

Note that: The point (1, −1) is the only minimum point.
• Even if ∇f(x) = 0 at the remaining points (1, 1), (−1, 1) and (−1, −1), they are not minimum points.
• In fact, (−1, 1) is a maximum point.

That a point x ∈ Rn satisfies the equation ∇f(x) = 0 is not enough (i.e. not sufficient) to conclude that x is a minimum point.

Page 10:

Sufficient optimality condition for unconstrained optimization

• ∇f(x) = 0 is not sufficient for x to be a minimum point. So, we need an additional criterion!
• Suppose that ∇f(x) = 0. Consider the 2nd-order Taylor approximation of f at the point x:

f(x + d) ≈ f(x) + d>∇f(x) + (1/2) d>H(x)d,   where d>∇f(x) = 0,

for any direction vector d. It then follows that

f(x + d) ≈ f(x) + (1/2) d>H(x)d.

• If d>H(x)d > 0, then we have f(x + d) > f(x).
⇒ There is no direction vector d which is a descent direction for f at the point x. Hence, x is a minimum point.

3.2. Sufficient Optimality Condition for UNLP
Suppose that ∇f(x) = 0. If in addition the Hessian matrix H(x) (of f at x) is positive definite, i.e.

d>H(x)d > 0 for all d ∈ Rn, d ≠ 0,

then x is a minimum point.

Recall:
• A square matrix is positive definite if all its eigenvalues are positive.
• For a diagonal matrix, the diagonal elements are its eigenvalues.

Page 11:

3.2. Sufficient optimality condition for unconstrained optimization...

Example 3: Consider again the optimization problem

min_{(x1,x2)} { f(x1, x2) = x1³ − 3x1 − x2³ + 3x2 }

We know that each of the points (1, 1), (1, −1), (−1, 1), (−1, −1) satisfies the equation

∇f(x) = ( 3x1² − 3 , −3x2² + 3 )> = ( 0 , 0 )>.

The Hessian matrix of f is

H(x1, x2) = [ 6x1  0 ;  0  −6x2 ].

Hence,

H(1, 1) is indefinite          ⇒ (1, 1) is neither a maximum nor a minimum (saddle) point
H(1, −1) is positive definite  ⇒ (1, −1) is a minimum point
H(−1, 1) is negative definite  ⇒ (−1, 1) is a maximum point
H(−1, −1) is indefinite        ⇒ (−1, −1) is neither a maximum nor a minimum (saddle) point
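This classification can be reproduced in a few lines of MATLAB by inspecting the eigenvalues of H at each stationary point; a small sketch:

H = @(x1,x2) [6*x1, 0; 0, -6*x2];        % Hessian of f at (x1,x2)
pts = [1 1; 1 -1; -1 1; -1 -1];          % stationary points
for i = 1:size(pts,1)
    ev = eig(H(pts(i,1), pts(i,2)));     % eigenvalues of the (diagonal) Hessian
    fprintf('(%2d,%2d): eigenvalues %d and %d\n', pts(i,1), pts(i,2), ev(1), ev(2));
end
% all eigenvalues > 0 -> minimum, all < 0 -> maximum, mixed signs -> saddle point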

Page 12:

3.3. Unconstrained Optimization Problems... Example

Application (A nonlinear spring system):

Figure: A nonlinear spring system

• The applied forces are F1 = 0, F2 = 2 N and the spring constants are k1 = k2 = 1 N/m.

Page 13:

3.3. Unconstrained Optimization Problems... Example

There is a shift in the x and y direction of the junction of the two springs due to the applied forces. What are the values of these shifts that minimize the potential energy of the system?

The potential energy is given by

P(x1, x2) = (1/2) k1 (∆L1)² + (1/2) k2 (∆L2)² − F1 x1 − F2 x2,

where ∆L1 and ∆L2 are the changes in the lengths of the springs and x1 and x2 are the x and y shifts, respectively. Hence,

∆L1 = √((x1 + 10)² + (x2 − 10)²) − 10√2
∆L2 = √((x1 − 10)² + (x2 − 10)²) − 10√2.

Therefore, solve the optimization problem

(UNLP)   min_x { P(x) = (1/2) k1 (∆L1)² + (1/2) k2 (∆L2)² − F1 x1 − F2 x2 }.
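Alternatively, the potential energy can be minimized numerically; a minimal sketch using MATLAB's derivative-free fminsearch (the starting point x0 is an arbitrary choice), which can be compared with the analytic treatment on the next slide:

k1 = 1; k2 = 1; F1 = 0; F2 = 2;                        % given data
dL1 = @(x) sqrt((x(1)+10)^2 + (x(2)-10)^2) - 10*sqrt(2);
dL2 = @(x) sqrt((x(1)-10)^2 + (x(2)-10)^2) - 10*sqrt(2);
P   = @(x) 0.5*k1*dL1(x)^2 + 0.5*k2*dL2(x)^2 - F1*x(1) - F2*x(2);
x0  = [1; 1];                                          % arbitrary starting point
xmin = fminsearch(P, x0);                              % numerical minimizer of P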

Page 14:

3.3. Unconstrained Optimization Problems... Example

First-order optimality condition: ∇P(x) = 0.

∂P/∂x1 = k1 ∆L1 (∂∆L1/∂x1) + k2 ∆L2 (∂∆L2/∂x1) − F1    (1)
∂P/∂x2 = k1 ∆L1 (∂∆L1/∂x2) + k2 ∆L2 (∂∆L2/∂x2) − F2,   (2)

where

∂∆L1/∂x1 = (x1 + 10)/√((x1 + 10)² + (x2 − 10)²),   ∂∆L1/∂x2 = (x2 − 10)/√((x1 + 10)² + (x2 − 10)²)    (3)
∂∆L2/∂x1 = (x1 − 10)/√((x1 − 10)² + (x2 − 10)²),   ∂∆L2/∂x2 = (x2 − 10)/√((x1 − 10)² + (x2 − 10)²).   (4)

Substituting (3) and (4) into (1) and (2), we get

∂P/∂x1 = k1(x1 + 10) + k2(x1 − 10) − F1 = 0  ⇒  x1 = (F1 + 10(k2 − k1))/(k1 + k2)    (5)
∂P/∂x2 = k1(x2 − 10) + k2(x2 − 10) − F2 = 0  ⇒  x2 = (F2 + 10(k2 + k1))/(k1 + k2).   (6)

Replacing the given values of F1 and F2 we obtain (x1, x2) = (0, 11.5).

Page 15:

3.3. Unconstrained Optimization Problems... Example

The Hessian matrix of the function P(x) is

H(x) = [ k1 + k2  0 ;  0  k1 + k2 ].

Since k1 and k2 are positive numbers, the matrix H(x) is always positive definite.
⇒ (x1, x2) = (0, 11.5) is a minimum point.

• If the Hessian H(x) is a positive definite matrix for every x, then the function f(x) is a convex function.

Page 16:

3.4. Convex Sets and Convex Functions

A. Convex Set
A set S ⊂ Rn is said to be a convex set if for any x1, x2 ∈ S and any λ ∈ [0, 1] we have

λx1 + (1 − λ)x2 ∈ S.

• For a set S to be convex, the line segment joining any two points x1, x2 in S should be completely contained in S.

Figure: A convex and a non-convex set

• The set S = {x ∈ Rn | Ax ≤ a, Bx = b} is a convex set, where A, B are matrices and a, b are vectors.

Page 17:

3.4. Convex Sets and Convex Functions

B. Convex Functions
A function f : Rn → R is said to be a convex function if for any x1, x2 ∈ Rn and any λ ∈ [0, 1] we have

f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2).

• The segment connecting any two points on the graph of f lies on or above the graph of f.

Examples: The following are convex functions: f1(x) = x², f2(x) = e^x, f3(x1, x2) = x1² + x2².

Question: Is there a simple method to verify whether a function is convex or not?

Page 18:

3.4. Convex Sets and Convex Functions

Verifying convexity of a function
A function f : Rn → R is a convex function if and only if the Hessian matrix H(x) is positive semi-definite for every x ∈ Rn.

Example: For a quadratic function f(x) = (1/2) x>Qx + q>x, the Hessian matrix is H(x) = Q. Hence, f is a convex function if Q is positive semi-definite.

If f is a convex function and ∇f(x) = 0, then x is a minimum point of f.

⇒ Therefore, for a convex function, the condition ∇f(x) = 0 is sufficient for x to be a minimum point of f.

Remark: Convex optimization problems have extensive applications in signal processing and control engineering.
(See the book of S. Boyd and L. Vandenberghe: Convex Optimization. URL: http://www.stanford.edu/~boyd/cvxbook/)

Page 19:

3.5. Numerical methods for unconstrained optimization problems

• In general, it is not easy to solve the system of equations

∇f(x) = ( ∂f/∂x1(x), ∂f/∂x2(x), ..., ∂f/∂xn(x) )> = 0

analytically. In many cases this system of equations is nonlinear and the number of equations can also be very large.
• Therefore, any of the algorithms Newton, modified Newton, quasi-Newton or inexact Newton can be used to solve the nonlinear equation

F(x) = ∇f(x) = 0.

Page 20:

3.5. Numerical methods ...

A general algorithm for unconstrained optimization problems:
Start: Choose an initial iterate x0
       Set k ← 0
Repeat:
  • Determine a step-length αk > 0
  • Solve the system of linear equations: H(xk)dk = −∇f(xk)
  • Set xk+1 = xk + αkdk
  • k ← k + 1
Until: (termination criterion is satisfied)

• There are different types of algorithms, depending on how the search direction dk and the step-length αk are determined.
• Such algorithms are commonly known as gradient-based algorithms.

Page 21:

3.5. Numerical methods ...

Requirements on the search direction dk:
• Determine dk in such a way that the value f(xk + αkdk) is not greater than the value f(xk).
⇒ If xk is not already a solution of UNLP, then dk should be a descent direction.
• It follows that

f(xk + αkdk) ≤ f(xk).

• Using the approximation f(xk + αkdk) ≈ f(xk) + αk dk>∇f(xk) we obtain

(a) −αk dk>∇f(xk) ≈ f(xk) − f(xk + αkdk) ≥ 0
(b) dk>∇f(xk) ≤ 0.

• Hence, the expression dk>∇f(xk) is a measure of the decrease of the function f at the point xk in the direction of the vector dk.

Page 22:

3.5. Numerical methods ...

• A vector d is called a descent direction for the function f at the point x if d>∇f(x) ≤ 0.
• Note that

−dk>∇f(xk) = −‖dk‖ ‖∇f(xk)‖ cos θ,

where θ is the angle between the vectors dk and ∇f(xk).
• The reduction f(xk) − f(xk + αkdk) is largest when cos θ = −1; i.e., θ = 180°.
⇒ The reduction f(xk) − f(xk + αkdk) is largest when the vectors dk and ∇f(xk) point in opposite directions.
⇒ That is, when dk = −∇f(xk).

Steepest Descent

The direction −∇f(xk) is known as the steepest descent direction.

Page 23:

3.5. Numerical methods ...

(A1) Method of the steepest descent direction
Start: • Choose an initial iterate x0 and a termination tolerance ε
       • Set k ← 0
Repeat:
  • Compute ∇f(xk)
  • Set xk+1 = xk − αk∇f(xk)
  • k ← k + 1
Until: (‖∇f(xk)‖ ≤ ε)

Advantages:
• There is no need to solve a linear system of equations to determine a search direction.
• Easy to implement.

Disadvantages:
• The convergence speed is only linear (‖xk+1 − x∗‖ ≤ C‖xk − x∗‖, C ∈ (0, 1) constant → slow convergence).
• The steepest descent direction −∇f(xk) is not, in general, a good search direction.

Page 24:

3.5. Numerical methods ...

Matlab code
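A small illustrative sketch of algorithm (A1) with a constant step length, applied to the example function f(x) = (x1 − 2)² + x2² − 5 (the step length 0.1 and the starting point are arbitrary choices):

gradFun = @(x) [2*(x(1)-2); 2*x(2)];   % gradient of f(x) = (x1-2)^2 + x2^2 - 5
xk    = [10; 10];                      % arbitrary initial iterate
alpha = 0.1;                           % fixed step length (arbitrary choice)
tol   = 1e-6;
while norm(gradFun(xk)) > tol
    xk = xk - alpha*gradFun(xk);       % steepest descent step
end
disp(xk)                               % converges to the minimizer (2, 0)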

Page 25:

3.5. Steepest Descent with Exact Line Search

• Once the search direction dk is known, the step-length αk can be chosen in such a way that we obtain a maximum reduction in the direction of dk.
• That is, determine αk so that it solves

min_α f(xk + αdk).

Hence, for dk = −∇f(xk), we have f(xk + αkdk) ≤ f(xk).
• The method of finding a step-length by solving the optimization problem min_α f(xk + αdk) is known as exact line search.

Matlab code

Page 26:

3.5. Steepest Descent Algorithm with Exact Line Search

Algorithm 1: Steepest Descent with Exact Line Search

1: Choose an initial iterate x0;
2: Set k ← 0;
3: while (‖∇f(xk)‖ > tol) do
4:   Compute ∇f(xk);
5:   Determine αk by solving the one-dimensional optimization problem min_α f(xk − α∇f(xk))
6:   Compute the next iterate: xk+1 = xk − αk∇f(xk)
7:   Set k ← k + 1;
8: end while

Page 27:

3.5. Steepest Descent Algorithm with Exact Line Search...

function xsol=SteepstExactLS(x0,tolf,maxIter)
% A Matlab implementation of steepest descent with exact line search
% User should supply: x0      - initial iterate
%                     tolf    - tolerance for norm(grad f(x_k)) < tolf
%                     maxIter - the maximum number of iterations
fprintf('==================================================================== \n')
fprintf('iteration      alphak      norm(grad) \n')
fprintf('==================================================================== \n')
%Constants
%sigma=10e-4;
%lambda = 0.5;
%Initialization
x0=x0(:);
alpha0=1;
k=0;
datasave=[];
xk=x0;
alphak=alpha0;
grad=gradFun(x0);
while (norm(grad)>=tolf) && (k<=maxIter)
    dk=-gradFun(xk);             % steepest descent direction dk = -grad f(xk)
    dk=dk(:);
    alphak=lineSearch(xk,dk);    % exact line search along dk
    xk=xk+alphak*dk;
    grad=gradFun(xk);
    k=k+1;
    datasave=[datasave; k alphak norm(grad)];
end %end while

Page 28:

3.5. Steepest Descent Algorithm with Exact Line Search...

xsol=xk;

disp(datasave)

end

function F=myfun(x)

F=(x(1)-2)^2 + x(2)^2-5;

end

function grad=gradFun(x)

grad(1) = 2*(x(1)-2);

grad(2) = 2*x(2);

end

function alpha=lineSearch(xk,dk)

x=xk;

d=dk;% The parameter.

alpha=fminbnd(@(y) fun4LS(y,x,d),0,1);

end

function f=fun4LS(y,x,d)

f = ((x(1)+y*d(1)-2))^2+(x(2)+y*d(2))^2-5;

end

Page 29:

3.5. Steepest Descent Algorithm for Quadratic Functions

• Let f(x) = (1/2) x>Qx + q>x.
Assumption: Q is symmetric and positive definite. Then
• steepest descent direction: dk = −∇f(xk) = −Qxk − q;
• exact line search:

min_α f(xk + αdk) = min_α { (1/2)(xk + αdk)>Q(xk + αdk) + q>(xk + αdk) }

Then, using dk = −Qxk − q, we obtain

αk = ‖Qxk + q‖² / ((Qxk + q)>Q(Qxk + q)).

Page 30:

3.5. Steepest Descent for QP ... Algorithm

Algorithm 2: Steepest Descent with Exact Line Search

1: Choose an initial iterate x0;
2: Set k ← 0;
3: while (‖∇f(xk)‖ > tol) do
4:   Compute dk = −Qxk − q;
5:   Determine αk = ‖Qxk + q‖² / ((Qxk + q)>Q(Qxk + q)).
6:   Set xk+1 = xk + αkdk
7:   Set k ← k + 1;
8: end while

Exercise: Implement this algorithm under MATLAB.
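One possible sketch for this exercise (the test data Q and q in the comments are arbitrary; Q must be symmetric positive definite):

function x = sdQuadratic(Q, q, x0, tol, maxIter)
% Steepest descent with exact line search for f(x) = 0.5*x'*Q*x + q'*x
    x = x0(:);
    for k = 1:maxIter
        g = Q*x + q;                 % gradient at the current iterate
        if norm(g) <= tol, break; end
        d = -g;                      % steepest descent direction
        alpha = (g'*g)/(g'*Q*g);     % exact step length for a quadratic
        x = x + alpha*d;
    end
end

% Example call (arbitrary test data):
% Q = [4 1; 1 3];  q = [-1; -2];
% xsol = sdQuadratic(Q, q, [0;0], 1e-8, 1000);   % compare with -Q\q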

Page 31:

3.5. Steepest Descent Algorithm ... Properties

• The convergence properties of the steepest descent algorithm depend strongly on the properties of the Hessian matrix H(x) = ∇²f(x) (for a quadratic function (1/2) x>Qx + q>x we have H(x) = Q).
⇒ If H(x) is positive definite, then we have good convergence properties.

• If a step-length αk is exact (optimal), then necessarily we have

d f(xk + αdk)/dα |_{α=αk} = 0  ⇒  ∇f(xk + αkdk)> · d(xk + αdk)/dα = 0.

This implies

∇f(xk + αkdk)>dk = 0.

Since the next steepest descent direction is dk+1 = −∇f(xk + αkdk), it follows that

−dk+1>dk = 0  ⇒  dk+1>dk = 0.

• When using exact line-search, each new steepest descent direction is orthogonal to the previous one.
⇒ This causes the so-called zig-zag problem.
⇒ So, the steepest descent algorithm with exact line-search may take too long to converge.

Page 32:

3.6. The Conjugate Gradient Algorithm

Question: How to avoid the zig-zag problem of the steepest (gradient) descent algorithm?
One solution: Use the Hessian matrix H(x) of f(x) in the determination of dk (and αk).
• Choose the search direction dk in such a way that

[ d/dα ∇f(xk + αdk−1) ]> dk = 0.   (∗∗)

• First-order Taylor approximation:

∇f(xk + αdk−1) ≈ ∇f(xk) + α∇²f(xk)dk−1 = ∇f(xk) + αH(xk)dk−1 = ∇f(xk) + αHk dk−1.

Hence, d/dα [∇f(xk + αdk−1)] ≈ Hk dk−1.
• Consequently, according to (∗∗),

[ Hk dk−1 ]> dk = 0  ⇒  dk> Hk dk−1 = 0. We say that dk is conjugate to dk−1.

Page 33:

3.6. Conjugate Gradient ... Quadratic Problems

• The CG algorithm is commonly used to solve unconstrained quadratic programming problems

(QP)   min_x { f(x) = (1/2) x>Qx + q>x },

where Q is a symmetric positive definite matrix.
• Starting from an initial search direction d0, the CG method for (QP) generates search directions d1, d2, ..., dn−1 so that

dk>Q dk−1 = 0,  k = 1, ..., n − 1.

• Given xk and dk, we use exact line-search to determine αk by solving

min_α f(xk + αdk).

⇒ We solve the equation d f(xk + αdk)/dα = 0 to obtain

αk = − dk>gk / (dk>Q dk),

where gk = ∇f(xk) = Qxk + q.

Page 34:

3.6. Conjugate Gradient ... Quadratic Problems

Question: Given dk, how to determine dk+1?

• Observe that d f(xk + αdk)/dα = 0 also implies

gk+1>dk = 0,

where gk+1 = ∇f(xk + αkdk) = Q(xk + αkdk) + q.
• In addition,

dk+1>Q dk = 0 (since dk and dk+1 are conjugate vectors).

• Hence, determine dk+1 from dk+1 = −gk+1 + βkdk.
• The expression βkdk is a correction term for the steepest descent direction −gk+1.
• There are three well-known methods for the determination of the parameter βk.

Page 35:

3.6. Conjugate Gradient ... Quadratic Problems

Fletcher-Reeves:

βk = gk+1>gk+1 / (gk>gk)

Polak-Ribiere:

βk = gk+1>(gk+1 − gk) / (gk>gk)

Hestenes-Stiefel:

βk = gk+1>(gk+1 − gk) / (dk>(gk+1 − gk))
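In code the three formulas differ only in one line each; a small sketch, where gk, gk1 and dk stand for gk, gk+1 and dk:

function [bFR, bPR, bHS] = cgBetas(gk, gk1, dk)
% The three classical beta formulas; gk, gk1, dk are column vectors
bFR = (gk1'*gk1)/(gk'*gk);              % Fletcher-Reeves
bPR = (gk1'*(gk1-gk))/(gk'*gk);         % Polak-Ribiere
bHS = (gk1'*(gk1-gk))/(dk'*(gk1-gk));   % Hestenes-Stiefel
end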

Page 36:

3.6. CG ... Quadratic Problems... Algorithm

Algorithm 3: Conjugate Gradient with Exact Line Search

1: Choose an initial iterate x0;
2: Set g0 = ∇f(x0) and d0 = −g0;
3: Set k ← 0;
4: while (‖∇f(xk)‖ > tol) do
5:   Determine the step length αk = − dk>gk / (dk>Q dk).
6:   Set xk+1 = xk + αkdk and compute gk+1 = Qxk+1 + q
7:   Determine βk by one of the methods Fletcher-Reeves, Polak-Ribiere or Hestenes-Stiefel.
8:   Set dk+1 = −gk+1 + βkdk
9:   Set k ← k + 1;
10: end while

Page 37:

3.6. Conjugate Gradient ... Quadratic Problems... Matlab

function xsol=myCG_PR(x0,tolf,maxIter)
% A Matlab implementation of the conjugate gradient method (Polak-Ribiere)
% with exact line search for a quadratic objective
% User should supply: x0      - initial iterate
%                     tolf    - tolerance for norm(grad f(x_k)) < tolf
%                     maxIter - the maximum number of iterations
fprintf('=============================================================================== \n')
fprintf('iteration      alphak      norm(grad) \n')
fprintf('=============================================================================== \n')
x0=x0(:);
k=0;
datasave=[];
xk=x0;
grad=gradFun(x0); grad=grad(:);
dk=-grad;                                      % first search direction: steepest descent
while (norm(grad)>=tolf) && (k<=maxIter)
    Q=hess(xk);
    alphak=-(dk'*grad)/(dk'*Q*dk);             % exact line search step length
    xk=xk+alphak*dk;
    gradOld=grad;
    grad=gradFun(xk); grad=grad(:);
    betak=grad'*(grad-gradOld)/(gradOld'*gradOld);   % Polak-Ribiere
    dk=-grad + betak*dk;
    k=k+1;
    datasave=[datasave; k alphak norm(grad)];
end %end while

Page 38:

3.6. Conjugate Gradient ... Quadratic Problems... Matlab...

xsol=xk;

disp(datasave)

end

function F=myfun(x)

F=(x(1)-2)^2 + x(2)^2-5;

end

function grad=gradFun(x)

grad(1) = 2*(x(1)-2);

grad(2) = 2*x(2);

end

function Q=hess(x)

Q(1,1)=2;

Q(1,2)=0;

Q(2,1)=0;

Q(2,2)=2;

end

Page 39:

3.6. Conjugate Gradient ... Quadratic Problems

Example: (parameter estimation) For a chemical process, the pressure measured at different temperatures is given in the following table. Formulate an optimization problem to determine the best values of the parameters in the exponential model p = α·e^(βT) of the data. Choose an arbitrary starting point and find the optimum values of the parameters using the steepest descent and CG algorithms.

Temperature T (°C)    Pressure p (mm of Mercury)
20                    14.45
25                    19.23
30                    26.54
35                    34.52
40                    48.32
50                    68.11
60                    98.34
70                    120.45
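One possible formulation is the nonlinear least-squares problem min over (α, β) of the sum of (α·e^(β·Ti) − pi)². A minimal sketch using MATLAB's fminsearch on this objective (the starting guess a0 is an arbitrary choice; the steepest descent or CG codes above could equally be applied to the same objective):

T = [20 25 30 35 40 50 60 70]';                           % temperatures (deg C)
p = [14.45 19.23 26.54 34.52 48.32 68.11 98.34 120.45]';  % pressures (mm Hg)
ssr = @(a) sum((a(1)*exp(a(2)*T) - p).^2);                % sum of squared residuals, a = [alpha; beta]
a0  = [10; 0.05];                                         % arbitrary starting guess
aopt = fminsearch(ssr, a0);                               % fitted parameters [alpha; beta]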

Page 40:

3.6. Conjugate Gradient ... Properties

• For QPs with a symmetric positive definite n × n matrix Q, CG requires at most n steps to converge.
• For a general nonlinear unconstrained optimization problem min_x f(x), convergence depends
  - on the properties of the Hessian matrix
    ⇒ good convergence if the Hessian matrix H(x) is positive definite (i.e. f(x) is a strictly convex function)
  - since in CG the Hessian matrix H(x) changes from iteration to iteration, H(x) may be ill-conditioned
    ⇒ preconditioning techniques are required
    ⇒ as a result we obtain preconditioned Conjugate Gradient (PCG) methods.

Page 41:

3.5. Numerical methods ...

(A2) The Newton Algorithm:
Start: • Choose an initial iterate x0 and a termination tolerance ε
       • Set k ← 0
Repeat:
  • Compute ∇f(xk)
  • Determine a search direction: dk = −[H(xk)]⁻¹∇f(xk).
  • Set xk+1 = xk + αkdk
  • k ← k + 1
Until: (‖∇f(xk)‖ ≤ ε)

Advantages:
• The convergence is quadratic: for αk = 1, ‖xk+1 − x∗‖ ≤ C‖xk − x∗‖², C > 0 constant → fast convergence.
• Easy to implement if [H(xk)]⁻¹ is easy to obtain.

Disadvantages:
• In every iteration the system of equations H(xk)dk = −∇f(xk) has to be solved to determine the search direction dk.
• If the matrix H(xk) is not positive definite or is ill-conditioned, the solution of H(xk)dk = −∇f(xk) may not be accurate.
• If the initial iterate x0 is not chosen properly, convergence is not guaranteed (only local convergence).
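A minimal MATLAB sketch of the Newton iteration (A2) with full step αk = 1; gradFun and hessFun are assumed user-supplied handles for ∇f and H (illustrative only):

function xk = newtonMin(gradFun, hessFun, x0, tol, maxIter)
% Basic Newton iteration for unconstrained minimization (full step alpha = 1)
    xk = x0(:);
    for k = 1:maxIter
        g = gradFun(xk);
        if norm(g) <= tol, break; end
        dk = -(hessFun(xk)\g);     % solve H(xk)*dk = -grad f(xk)
        xk = xk + dk;              % Newton step with alpha_k = 1
    end
end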

Page 42:

3.5. Numerical methods ...

• In the Newton algorithm the matrix H(xk) may become ill-conditioned from step to step.
• In general, the following difficulties could arise:
  - the direct computation of H(xk) can be very hard
  - the inverse of H(xk) may not be easily available ⇒ use an approximate Hessian ⇒ quasi-Newton methods
  - for unconstrained optimization with several variables, the equation H(xk)dk = −∇f(xk) may consume too much CPU time ⇒ use an inexact Newton method.

Some known methods that guarantee that the matrix H(xk) remains well-conditioned at each iteration step:

(i) Levenberg-Marquardt method
Replace H(xk) by the approximation

H̃(xk) = H(xk) + βI,  β ≥ 0.

(ii) Quasi-Newton method (BFGS = Broyden-Fletcher-Goldfarb-Shanno update)
Approximate the inverse [H(xk+1)]⁻¹ by

(BFGS)   Bk+1 = Bk + (1 + γ>Bkγ/(δ>γ)) · δδ>/(δ>γ) − (δγ>Bk + Bkγδ>)/(δ>γ),

where δ = xk+1 − xk, γ = ∇f(xk+1) − ∇f(xk) and B0 = I. Then dk+1 = −Bk+1∇f(xk+1).
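The BFGS update of Bk translates directly into MATLAB; a small sketch (delta and gamma as defined above; illustrative only):

% One BFGS update of the inverse-Hessian approximation B
% delta = x_{k+1} - x_k,  gamma = grad f(x_{k+1}) - grad f(x_k)
bfgsUpdate = @(B, delta, gamma) B ...
    + (1 + (gamma'*B*gamma)/(delta'*gamma)) * (delta*delta')/(delta'*gamma) ...
    - (delta*(gamma'*B) + (B*gamma)*delta')/(delta'*gamma);
% New quasi-Newton direction: d = -bfgsUpdate(B, delta, gamma)*gradNew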

Page 43:

The Matlab Optimization Toolbox - functions for unconstrained optimization

• fminunc   - multidimensional unconstrained nonlinear minimization
• lsqnonlin - nonlinear least squares with upper and lower bounds.

Using fminunc.m to solve unconstrained optimization problems:

[xsol,fopt,exitflag,output,grad,hessian] = fminunc(fun,x0,options)

Input arguments:
fun      a Matlab function m-file that contains the function to be minimized
x0       start vector for the algorithm, if known, else [ ]
options  options are set using the optimset function; they determine which algorithm to use, etc.

Output arguments:
xsol      optimal solution
fopt      optimal value of the objective function, i.e. f(xsol)
exitflag  tells whether the algorithm converged or not; exitflag > 0 means convergence
output    a struct with the number of iterations, the algorithm used and the PCG iterations (when LargeScale is 'on')
grad      gradient vector at the optimal point xsol
hessian   Hessian matrix at the optimal point xsol

To display the parameters that you can set for fminunc.m use the command:
>> optimset('fminunc')

Page 44:

The Matlab Optimization Toolbox - functions for unconstrained optimization

Use the Matlab fminunc to solve:

min_x { f(x) = x1² + 3x2² + 5x3² }

Problem definition

function [f,g]=fun1(x)
%Objective function for example (a)
%Defines an unconstrained optimization problem to be solved with fminunc
f=x(1)^2+3*x(2)^2+5*x(3)^2;
if nargout > 1
    g(1)=2*x(1);
    g(2)=6*x(2);
    g(3)=10*x(3);
end

Main program

function [xopt,fopt,exitflag]=unConstEx1
options=optimset('fminunc');
options.LargeScale='off';
options.HessUpdate='bfgs';
%assuming the function is defined in the m-file fun1.m
%we call fminunc with a starting point x0
x0=[1,1,1];
[xopt,fopt,exitflag]=fminunc(@fun1,x0,options);

Page 45:

The Matlab Optimization Toolbox - functions for unconstrained optimization

If you decide to use the Large-Scale option in fminunc.m to solve the problem

min_x { f(x) = x1² + 3x2² + 5x3² }

edit the main program as follows:

function [xopt,fopt,exitflag]=unConstEx1
options=optimset('fminunc');
options.LargeScale='on';
options.GradObj='on';     % the gradient is supplied by fun1.m
%assuming the function is defined as in fun1.m
%we call fminunc with a starting point x0
x0=[1,1,1];
[xopt,fopt,exitflag]=fminunc(@fun1,x0,options);
