Lec 22 - 23 KKT Conditions


  • 8/9/2019 Lec 22 - 23 KKT Conditions


Topic: Karush-Kuhn-Tucker Optimality Criteria

    Dr. Nasir M Mirza

Optimization Techniques

    Email: [email protected]


• Important question: How do we know that we have found the “optimum” for f(x)?

• Answer: Test the solution for the “necessary and sufficient conditions”.

    Optimality Criteria


The first order optimality condition for the minimum of f(x) can be derived by considering a linear expansion of the function around the optimum point x* using the Taylor series:

    Necessary Condition for Optimality:

f(x) ≈ f(x*) + ∇f(x*)ᵀ (x − x*)

f(x) − f(x*) ≈ ∇f(x*)ᵀ (x − x*)

where ∇f(x*) is the gradient of the function f(x) and x − x* is the displacement from the optimum.
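As a quick numerical check of this linear expansion (a sketch, not part of the lecture; the sample function f(x, y) = x² + 2y² is our own illustrative choice), the error of the first-order approximation should shrink quadratically with the displacement:

```python
import numpy as np

# Sketch: verify f(x) ~ f(x*) + grad f(x*)^T (x - x*) for a sample
# function f(x, y) = x^2 + 2y^2 (an assumed example, not from the slides).
def f(v):
    x, y = v
    return x**2 + 2*y**2

def grad_f(v):
    x, y = v
    return np.array([2*x, 4*y])

x_star = np.array([1.0, -0.5])
d = np.array([1e-3, 2e-3])                  # small displacement x - x*
linear = f(x_star) + grad_f(x_star) @ d     # first-order Taylor estimate
error = abs(f(x_star + d) - linear)         # remainder is O(||d||^2)
print(error)
```

The remainder equals ½ dᵀ∇²f d here, which is why it is tiny for a small d.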


    Unconstrained Problems:

• If x* is a minimum point, then this condition can only be ensured if ∇f(x) = 0; the gradient of f(x) must vanish at the optimum.

• Thus the first order necessary condition for the minimum of a function is that its gradient is zero at the optimum.

• This condition is also true at a maximum point and at any other point where the slope is zero.

• Therefore, it is only a necessary condition and is not a sufficient condition.

    Conditions for Optimality

Graph of f(x) = x(-cos(1) – sin(1) + sin(x))

[Figure: plot of f(x) for −6 ≤ x ≤ 6, marking a local max, two local minima, and an inflection point.]


Conditions for Optimality
• Graphs of the function and its derivative

[Figure: f(x) and df(x)/dx plotted for −6 ≤ x ≤ 6; the zeros of df(x)/dx correspond to the local max, local minima, and inflection point of f(x).]


    Unconstrained Problems

    1. ∇F(x)=0; The gradient of F(x) must vanish at the optimum

2. The second order condition for the minimum of f(x) can be derived by considering the quadratic expansion of the function around the optimum point (x*) using the Taylor series, as follows:

    ∇2f(x*) is a Hessian Matrix of function f(x) and d = x – x*.

For x* to be a local minimum, f(x) − f(x*) must be greater than or equal to zero in the neighborhood of x*. So, we must have

    Sufficient Condition for Optimality:

f(x) ≈ f(x*) + ∇f(x*)ᵀ d + ½ dᵀ ∇²f(x*) d + ⋯

½ dᵀ ∇²f(x*) d ≥ 0


    Unconstrained Problems

• A positive definite Hessian at the minimum ensures only that a local minimum has been found.

• The minimum is the global minimum only if it can be shown that the Hessian is positive definite for all possible values of x. This would imply a convex design space.

• Very hard to prove in practice!

    Conditions for Optimality


    Optimality Conditions – Unconstrained Case

• Let x* be the point that we think is the minimum for f(x)
• Necessary condition (for optimality): ∇f(x*) = 0
• A point that satisfies the necessary condition is a stationary point
• It can be a minimum, maximum, or saddle point
• How do we know that we have a minimum?
• Answer: Sufficiency condition:

The sufficient conditions for x* to be a strict local minimum are:
∇f(x*) = 0
∇²f(x*) is positive definite


    Example 1:

Find all stationary points for the following function. Using optimality conditions, classify them as minimum, maximum or inflection points.

The objective function is: f(x, y) = -2x + x² – xy + 2y²

The gradient vector:

( ∂f/∂x , ∂f/∂y )ᵀ = ( -2 + 2x – y , -x + 4y )ᵀ

The Hessian matrix:

[ ∂²f/∂x²    ∂²f/∂x∂y ]   [  2  -1 ]
[ ∂²f/∂y∂x   ∂²f/∂y²  ] = [ -1   4 ]


Example 1: The first order optimality conditions:

Necessary conditions:

-2 + 2x – y = 0
-x + 4y = 0 ; so x = 4y, and then

-2 + 2(4y) – y = 0; 7y = 2; y = 2/7 and x = 8/7.

The possible solution point is x = 1.14286 and y = 0.285714.

Then let us apply the second order optimality condition: the Hessian matrix must be positive definite at the minimum. Let us find the principal minors:

A1 = |a11| = 2 ; A2 = det H = 8 – 1 = 7 ; both are positive;

So, H is positive definite.
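The arithmetic above is easy to double-check numerically; a minimal sketch in Python/NumPy (rather than the lecture's MATLAB):

```python
import numpy as np

# Double-check Example 1 at the candidate point (x, y) = (8/7, 2/7)
# for f(x, y) = -2x + x^2 - xy + 2y^2.
x, y = 8/7, 2/7
grad = np.array([-2 + 2*x - y, -x + 4*y])   # first order conditions
H = np.array([[2.0, -1.0],
              [-1.0, 4.0]])                 # constant Hessian of f
A1 = H[0, 0]                                # first leading principal minor
A2 = np.linalg.det(H)                       # second leading principal minor
print(grad)      # ~ (0, 0): stationary point
print(A1, A2)    # 2.0, 7.0: both positive, so H is positive definite
```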


Example 1: The function f(x, y) at the point x* = 1.14286, y* = 0.285714 is

f = -1.14286 ;

The point is a minimum point. Since the Hessian matrix is positive definite, we know the function is convex.

Therefore any minimum is a global minimum.

% Matlab program to draw contour of function
[X,Y] = meshgrid(-1:.1:2);
Z = -2.*X + X.*X - X.*Y + 2.*Y.*Y;
contour(X,Y,Z,100)


Example 1:
• Contour graph using MATLAB

Graphical presentation of the function and the minimum at point (x*, y*).


Example 1:
• Three-D plot

It confirms the solution, as well as that a global minimum exists here at point (x*, y*).

% Matlab program to draw function
[X,Y] = meshgrid(-1:.1:2);
Z = -2.*X + X.*X - X.*Y + 2.*Y.*Y;
mesh(Z);


Example 1:
• Important observations:
• The minimum point does not change if we add a constant to the objective function.
• The minimum point does not change if we multiply the objective function by a positive constant.
• The problem changes from a minimization to a maximization problem if we multiply the objective function by a negative sign.
• The unconstrained problem is a convex problem if the objective function is convex. For convex problems any local minimum is also a global minimum.


Find values of the variables that minimize or maximize the objective function while satisfying the constraints.

The standard form of the constrained optimization problem can be written as:

Minimize: F(x)  objective function

Subject to: gj(x) ≤ 0, j = 1, . . . , m  inequality constraints
hk(x) = 0, k = 1, . . . , l  equality constraints
xi(lower) ≤ xi ≤ xi(upper), i = 1, . . . , n  side constraints

where x = (x1, x2, . . . , xn) are the design variables.

What is an optimization problem:


    Unconstrained Problems

    1. ∇F(x)=0; The gradient of F(x) must vanish at the optimum

2. The Hessian matrix must be positive definite (i.e. all eigenvalues positive at the optimum point).

H = ∇²F(x) =

[ ∂²F/∂x1²     ∂²F/∂x1∂x2   …   ∂²F/∂x1∂xn ]
[ ∂²F/∂x2∂x1   ∂²F/∂x2²     …   ∂²F/∂x2∂xn ]
[     ⋮             ⋮        ⋱       ⋮     ]
[ ∂²F/∂xn∂x1   ∂²F/∂xn∂x2   …   ∂²F/∂xn²   ]

    Conditions for Optimality
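One practical way to carry out the positive-definiteness test on the Hessian described above is to check its eigenvalues numerically; a small sketch (Python/NumPy, not from the slides; the helper name is our own):

```python
import numpy as np

# Sketch: a Hessian is positive definite iff all eigenvalues are positive.
def is_positive_definite(H, tol=1e-10):
    H = np.asarray(H, dtype=float)
    eigvals = np.linalg.eigvalsh((H + H.T) / 2)  # symmetrize, then eigenvalues
    return bool(eigvals.min() > tol)

print(is_positive_definite([[2, -1], [-1, 4]]))  # True (Example 1's Hessian)
print(is_positive_definite([[1, 0], [0, -1]]))   # False (a saddle)
```

For a 2x2 matrix this agrees with the principal-minor test used in Example 1.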



• To prove a claim of optimality in constrained minimization (or maximization), we have to check the found point with respect to the Karush-Kuhn-Tucker conditions.

• Kuhn and Tucker extended the Lagrangian theory to include the general classical single-objective nonlinear programming problem:

Minimize: f(x)

Subject to: gj(x) ≥ 0 for j = 1, 2, ..., J

hk(x) = 0 for k = 1, 2, ..., K
x = (x1, x2, ..., xN)

    Constrained Case – KKT Conditions


Interior versus Exterior Solutions
• Interior:

If no constraints are active and (thus) the solution lies in the interior of the feasible space, then the necessary condition for optimality is the same as for the unconstrained case: ∇f(x*) = 0

• Exterior:

If the solution lies at the exterior, then the condition ∇f(x*) = 0 does not apply, because some constraints will block movement to this minimum.

• Some constraints will (thus) be active.

• We cannot get any more improvement (in this case) if for x* there does not exist a vector d that is both a descent direction and a feasible direction.

• In other words: the possible feasible directions do not intersect the possible descent directions at all.
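The "no direction both descent and feasible" test can be sketched in code. The helper below is an illustrative assumption of ours (not from the lecture), using the g(x) ≥ 0 convention so that, to first order, a feasible direction at an active constraint satisfies ∇gⱼᵀd ≥ 0:

```python
import numpy as np

# Sketch: d is a descent direction if grad_f . d < 0, and (to first order)
# feasible if grad_g_j . d >= 0 for every active constraint g_j(x) = 0.
def is_descent_and_feasible(grad_f, active_grad_g, d):
    descent = float(np.dot(grad_f, d)) < 0
    feasible = all(float(np.dot(gg, d)) >= 0 for gg in active_grad_g)
    return descent and feasible

# At a constrained minimum no such d exists; e.g. grad_f = (1, 0) with an
# active constraint of gradient (1, 0) blocks the descent step d = (-1, 0).
print(is_descent_and_feasible(np.array([1.0, 0.0]),
                              [np.array([1.0, 0.0])],
                              np.array([-1.0, 0.0])))  # False: blocked
```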


Necessary KKT Conditions
For the problem: Minimize the objective function f(x)

subject to: gj(x) ≥ 0; j = 1, 2, 3, . . . J

hk(x) = 0; k = 1, 2, 3, . . . K

xi(L) ≤ xi ≤ xi(U) ; i = 1, 2, 3, . . . N

This is the most general form of a single-objective constrained optimization problem.

Here gj(x) are the inequality constraint functions (J in total);

hk(x) are the equality constraint functions (K in total).

A point is feasible if all constraints and bounds are satisfied.


Necessary KKT Conditions
For the problem:

    Minimize: f(x)

    subjected to: g j (x) ≥ 0; j = 1, 2, 3, . . . J

    h k (x) = 0; k = 1, 2, 3, . . . K

    x i (L) ≤ x i ≤ x i (U) ; i = 1, 2, 3, . . . N

The necessary conditions are:

∇f(x) − Σj uj ∇gj(x) − Σk vk ∇hk(x) = 0  (optimality)

gj(x) ≥ 0 for j = 1, 2, ..., J  (feasibility)

hk(x) = 0 for k = 1, 2, ..., K  (feasibility)

uj gj(x) = 0 for j = 1, 2, ..., J  (complementary slackness)

uj ≥ 0 for j = 1, 2, ..., J  (non-negativity)
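The condition groups above can be bundled into one numerical check. The helper below is a sketch of ours for the g(x) ≥ 0 case with no equality constraints (the function name and the tolerance handling are assumptions, not from the text); the sample values are those of point x(3) = (3, 2) from Exercise 4.1.1:

```python
import numpy as np

# Sketch: residuals of the necessary KKT conditions for
#   minimize f(x)  subject to  g_j(x) >= 0,  j = 1..J   (no h_k here).
# grad_f: gradient of f at x; g_vals: g_j(x); grad_g: Jacobian (J x n);
# u: candidate multipliers.
def kkt_check(grad_f, g_vals, grad_g, u, tol=1e-8):
    stationarity = grad_f - grad_g.T @ u            # must be ~ 0
    feasible = np.all(g_vals >= -tol)               # g_j(x) >= 0
    slackness = np.all(np.abs(u * g_vals) <= tol)   # u_j g_j(x) = 0
    nonneg = np.all(u >= -tol)                      # u_j >= 0
    return bool(np.all(np.abs(stationarity) <= tol)
                and feasible and slackness and nonneg)

# Point x(3) = (3, 2) of Exercise 4.1.1 with u* = (0, 0, 0, 0):
grad_f = np.array([0.0, 0.0])
g_vals = np.array([18.0, 6.0, 3.0, 2.0])
grad_g = np.array([[4.0, -4.0], [-4.0, -1.0], [1.0, 0.0], [0.0, 1.0]])
u = np.zeros(4)
print(kkt_check(grad_f, g_vals, grad_g, u))   # True: (3, 2) is a K-T point
```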


Necessary KKT Conditions (if g(x) ≥ 0)

• If the definition of feasibility changes, the optimality and feasibility conditions change.

• The necessary conditions become:

∇f(x) − Σ ui ∇gi(x) + Σ vj ∇hj(x) = 0  (optimality)

gj(x) ≥ 0 for j = 1, 2, ..., J  (feasibility)

hk(x) = 0 for k = 1, 2, ..., K  (feasibility)

ui gi(x) = 0 for i = 1, 2, ..., J  (complementary slackness)

ui ≥ 0 for i = 1, 2, ..., J  (non-negativity)


Exercise 4.1.1
Let us take the following function to be minimized:

f(x) = (x² + y – 11)² + (x + y² – 7)²

Subject to:

g1(x) = 26 - (x - 5)² – y² ≥ 0,
g2(x) = 20 – 4x - y ≥ 0,

Here, not every point in the search space is feasible. The feasible points are those that satisfy the above two constraints and the variable bounds.

Let us also choose four points x(1) = (1, 5)ᵀ, x(2) = (0, 0)ᵀ, x(3) = (3, 2)ᵀ, and x(4) = (3.396, 0)ᵀ to investigate whether each point is a K-T point.

The feasible search space and these four points are shown on a contour plot of the objective function in Figure 4.1.


Exercise 4.1.1
The region on the other side of the hatched portion of a constraint line is feasible.

The combination of the two constraints and the variable bounds makes the interior region feasible, as depicted in the figure.


Exercise 4.1.1
• At first, we transform the variable bounds into two inequality constraints: g3(x) = x ≥ 0, and g4(y) = y ≥ 0.
• Thus, the above problem has four inequality constraints (J = 4) and no equality constraint (K = 0).
• There are two problem variables: N = 2.
• Thus, for each point a total of 2 + 3 × 4 + 0 = 14 Kuhn-Tucker conditions need to be checked.
• To formulate all K-T conditions, we first calculate the gradient of the objective function.
• In Table 4.1, we compute these gradients numerically and also compute the constraint values at all four points.


Exercise 4.1.1
• For point (1, 5)ᵀ,

g1(x, y) = 26 - (x - 5)² – y² = 26 – (1 – 5)² – 5² = 26 – 16 – 25 = –15.0 ,

g2(x, y) = 20 – 4x - y = 20 – 4(1) – 5 = 11.0

g3(x, y) = x = 1.0

g4(x, y) = y = 5.0

Now we have f(x) = (x² + y – 11)² + (x + y² – 7)²

fx = 4x³ + 4xy – 42x + 2y² – 14

fy = 2x² + 4y³ + 4xy – 26y – 22

∇f(x, y) = ( fx , fy )ᵀ = ( 4x³ + 4xy – 42x + 2y² – 14 , 2x² + 4y³ + 4xy – 26y – 22 )ᵀ = (18, 370)ᵀ
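The gradient and constraint values used in Table 4.1 can be reproduced in a few lines; a sketch (Python, our own code, not from the text):

```python
import numpy as np

# Sketch: gradient and constraint values of Exercise 4.1.1 for
# f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2.
def grad_f(x, y):
    fx = 4*x**3 + 4*x*y - 42*x + 2*y**2 - 14
    fy = 2*x**2 + 4*y**3 + 4*x*y - 26*y - 22
    return np.array([fx, fy])

def constraints(x, y):
    g1 = 26 - (x - 5)**2 - y**2
    g2 = 20 - 4*x - y
    return np.array([g1, g2, x, y])        # g3 = x, g4 = y

print(grad_f(1, 5))       # (18, 370)
print(constraints(1, 5))  # (-15, 11, 1, 5): g1 < 0, so (1, 5) is infeasible
print(grad_f(3, 2))       # (0, 0): (3, 2) is a stationary point of f
```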


Exercise 4.1.1
• For point (1, 5)ᵀ,

g1(x, y) = 26 - (x - 5)² – y² = 26 – (1 – 5)² – 5² = 26 – 16 – 25 = –15.0 ,

( g1 )x = –2(x – 5) ; ( g1 )y = –2y
∇g1(x, y) = ( g1x , g1y )ᵀ = ( -2x + 10 , -2y )ᵀ = (8, -10)ᵀ

g2 = 20 – 4x – y ; ( g2 )x = –4 ; ( g2 )y = –1
∇g2(x, y) = ( g2x , g2y )ᵀ = ( -4 , -1 )ᵀ

g3 = x ; ( g3 )x = 1 ; ( g3 )y = 0
∇g3(x, y) = ( g3x , g3y )ᵀ = ( 1 , 0 )ᵀ

g4 = y ; ( g4 )x = 0 ; ( g4 )y = 1
∇g4(x, y) = ( g4x , g4y )ᵀ = ( 0 , 1 )ᵀ


Exercise 4.1.1
• For the first point, substituting the values into the optimality condition

∇f(x) − Σj uj ∇gj(x) − Σk vk ∇hk(x) = 0 ,

that is, componentwise,

fx − u1 (g1)x − u2 (g2)x − u3 (g3)x − u4 (g4)x − v1 (h1)x − ⋯ = 0 ,
fy − u1 (g1)y − u2 (g2)y − u3 (g3)y − u4 (g4)y − v1 (h1)y − ⋯ = 0 ,

with the constraint gradients

( g1x , g1y )ᵀ = ( 8 , -10 )ᵀ
( g2x , g2y )ᵀ = ( -4 , -1 )ᵀ
( g3x , g3y )ᵀ = ( 1 , 0 )ᵀ
( g4x , g4y )ᵀ = ( 0 , 1 )ᵀ

we get the following KKT conditions:

18 – (8)u1 – (– 4)u2 – (1)u3 – (0)u4 = 0 ;

370 – (-10)u1 – (–1)u2 – (0)u3 – (1)u4 = 0 ;


Exercise 4.1.1
• For the first point, the remaining KKT conditions are:

Optimality (x-component): 18 – (8)u1 – (– 4)u2 – (1)u3 – (0)u4 = 0 ;

Feasibility, gj(x) ≥ 0 for j = 1, 2, ..., J:
g1 = –15, not greater than or equal to zero;
g2 = +11 > 0 ;
g3 = 1 > 0 ;
g4 = 5 > 0

(There are no equality constraints hk(x) = 0 to check, since K = 0.)

Complementary slackness, uj gj(x) = 0 for j = 1, 2, ..., J:
(-15)u1 = 0 ; (11)u2 = 0 ; (1)u3 = 0 ; (5)u4 = 0

Non-negativity: u1 , u2 , u3 , u4 ≥ 0


Exercise 4.1.1
• It is clear that the third condition, gj(x) ≥ 0, is not satisfied.
• This is enough to conclude that the point x(1) is not a K-T point.
• In fact, since the first constraint value is negative, this constraint is violated at this point and the point x(1) is not a feasible point, as shown in Figure 4.1.
• If a point is infeasible, the point cannot be an optimal point.


    Exercise 4.1.1


Exercise 4.1.1
• Since u1 = 0 and u2 = 0, we get u3 = -14 and u4 = -22.
• The final set of conditions is not satisfied with these values, because u3 and u4 are negative.
• Thus, the point x(2) is not a K-T point.
• Since the constraints are not violated, the point is a feasible point, as shown in Figure 4.1.
• Thus, the point x(2) cannot be an optimal point.
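The multiplier values quoted above can be recovered by solving the stationarity equations. A sketch for x(2) = (0, 0): only g3 = x and g4 = y are active there, so complementary slackness forces u1 = u2 = 0 and a small linear system remains (code is ours, not from the text):

```python
import numpy as np

# At x(2) = (0, 0): grad f = (-14, -22); only g3, g4 are active, so
# stationarity reduces to grad_f - u3*grad_g3 - u4*grad_g4 = 0.
grad_f = np.array([-14.0, -22.0])
G = np.column_stack([np.array([1.0, 0.0]),    # grad g3
                     np.array([0.0, 1.0])])   # grad g4
u3, u4 = np.linalg.solve(G, grad_f)
print(u3, u4)   # -14.0 -22.0: negative multipliers, so not a K-T point
```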


Exercise 4.1.1
Similarly, the conditions for the third point x(3) = (3, 2)ᵀ are the following:

-4u1 + 4u2 – u3 = 0 ;
+4u1 + u2 – u4 = 0 ;
18 > 0 ; 6 > 0 ; 3 > 0 ; 2 > 0 ,
(18)u1 = 0 , (6)u2 = 0 , (3)u3 = 0 , (2)u4 = 0 ,

The vector u* = (0, 0, 0, 0)ᵀ satisfies all the above conditions.


Exercise 4.1.1
• The vector u* = (0, 0, 0, 0)ᵀ satisfies all the above conditions.
• Thus, the point x(3) is a K-T point (Figure 4.1).
• As mentioned earlier, K-T points are likely candidates for minimal points.
• To conclude, we may say that the optimality of a point requires the satisfaction of more conditions.


Exercise 4.1.1
The K-T conditions obtained for the point x(4) = (3.396, 0)ᵀ are

-3.21u1 + 4u2 - u3 = 0 ,
1 + u2 - u4 = 0 ,
23.427 > 0 ; 6.416 > 0 ; 3.396 > 0 ; 0 = 0 ,
(23.427)u1 = 0 , (6.416)u2 = 0 , (3.396)u3 = 0 , (0)u4 = 0 ,

The solution to the above conditions is the vector u* = (0, 0, 0, 1)ᵀ.


Exercise 4.1.1
• The solution to the above conditions is the vector u* = (0, 0, 0, 1)ᵀ.
• Thus, the point x(4) is also a K-T point.
• It is clear from the figure that the point x(3) is the minimum point, but the point x(4) is not a minimum point.
• Thus, we may conclude from the above exercise problem that a K-T point may or may not be a minimum point.
• But if a point is not a K-T point (point x(1) or x(2)), then it cannot be an optimum point.


    Limitations


• The necessity theorem helps identify points that are not optimal. A point is not optimal if it does not satisfy the Kuhn–Tucker conditions.

• On the other hand, not all points that satisfy the Kuhn-Tucker conditions are optimal points.

• The Kuhn–Tucker sufficiency theorem gives conditions under which a point becomes an optimal solution to a single-objective NLP.


    Sufficiency Condition


• Sufficient conditions that a point x* is a strict local minimum of the classical single-objective NLP problem, where f, gj, and hk are twice differentiable functions, are that

1) The necessary KKT conditions are met.

2) The Hessian matrix ∇²L(x*) = ∇²f(x*) + Σ μi ∇²gi(x*) + Σ λj ∇²hj(x*) is positive definite on a subspace of Rⁿ, as defined by the condition:

yᵀ ∇²L(x*) y ≥ 0 for every vector y (1×N) satisfying:

∇gj(x*) y = 0 for j belonging to I1 = { j | gj(x*) = 0, uj* > 0 } (active constraints)

∇hk(x*) y = 0 for k = 1, ..., K and y ≠ 0


    KKT Sufficiency Theorem (Special Case)


• Consider the classical single-objective NLP problem:

minimize: f(x)

Subject to: gj(x) ≥ 0 for j = 1, 2, ..., J

hk(x) = 0 for k = 1, 2, ..., K

x = (x1, x2, ..., xN)

• Let the objective function f(x) be convex, the inequality constraints gj(x) all be convex functions for j = 1, ..., J, and the equality constraints hk(x) for k = 1, ..., K be linear.

• If this is true, then the necessary KKT conditions are also sufficient.

• Therefore, in this case, if there exists a solution x* that satisfies the KKT necessary conditions, then x* is an optimal solution to the NLP problem.

• In fact, it is a global optimum.


    Some Remarks


• The Kuhn-Tucker conditions are an extension of the Lagrangian function and method.

• They provide a powerful means to verify solutions.

• But there are limitations …

• Sufficiency conditions are difficult to verify.

• Practical problems do not have the required nice properties.
• For example, you will have problems if you do not know the explicit constraint equations (e.g., in FEM).

• If you have a multi-objective formulation, then we suggest testing each priority level separately.
