Lec 22 - 23 KKT Conditions


  • 8/9/2019 Lec 22 - 23 KKT Conditions


Topic: Karush-Kuhn-Tucker Optimality Criteria

    Dr. Nasir M Mirza

Optimization Techniques

    Email: [email protected]


• Important question: How do we know that we have found the “optimum” for f(x)?

• Answer: Test the solution for the “necessary and sufficient conditions”.

    Optimality Criteria


The first order optimality condition for the minimum of f(x) can be derived by considering a linear expansion of the function around the optimum point x* using the Taylor series:

    Necessary Condition for Optimality:

f(x) ≈ f(x*) + ∇f(x*)ᵀ (x − x*)

f(x) − f(x*) ≈ ∇f(x*)ᵀ (x − x*)

where ∇f(x*) is the gradient of the function f(x) and x − x* is the displacement from the optimum.
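As a quick numerical check of this linear expansion (a sketch, not part of the lecture; the sample function f(x, y) = x² + 2y² is our own illustrative choice), the error of the first-order approximation should shrink quadratically with the displacement:

```python
import numpy as np

# Sketch: verify f(x) ~ f(x*) + grad f(x*)^T (x - x*) for a sample
# function f(x, y) = x^2 + 2y^2 (an assumed example, not from the slides).
def f(v):
    x, y = v
    return x**2 + 2*y**2

def grad_f(v):
    x, y = v
    return np.array([2*x, 4*y])

x_star = np.array([1.0, -0.5])
d = np.array([1e-3, 2e-3])                  # small displacement x - x*
linear = f(x_star) + grad_f(x_star) @ d     # first-order Taylor estimate
error = abs(f(x_star + d) - linear)         # remainder is O(||d||^2)
print(error)
```

The remainder equals ½ dᵀ∇²f d here, which is why it is tiny for a small d.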


    Unconstrained Problems:

• If x* is a minimum point, then this condition can only be ensured if ∇f(x) = 0; the gradient of f(x) must vanish at the optimum.

• Thus the first order necessary condition for the minimum of a function is that its gradient is zero at the optimum.

• This condition is also true at a maximum point and at any other point where the slope is zero.

• Therefore, it is only a necessary condition and is not a sufficient condition.

    Conditions for Optimality

Graph of f(x) = x(-cos(1) – sin(1) + sin(x))

[Figure: plot of f(x) for −6 ≤ x ≤ 6, marking a local max, two local minima, and an inflection point.]


Conditions for Optimality
• Graphs of the function and its derivative

[Figure: f(x) and df(x)/dx plotted for −6 ≤ x ≤ 6; the zeros of df(x)/dx correspond to the local max, local minima, and inflection point of f(x).]


    Unconstrained Problems

    1. ∇F(x)=0; The gradient of F(x) must vanish at the optimum

2. The second order condition for the minimum of f(x) can be derived by considering the quadratic expansion of the function around the optimum point (x*) using the Taylor series, as follows:

    ∇2f(x*) is a Hessian Matrix of function f(x) and d = x – x*.

For x* to be a local minimum, f(x) − f(x*) must be greater than or equal to zero in the neighborhood of x*. So, we must have

    Sufficient Condition for Optimality:

f(x) ≈ f(x*) + ∇f(x*)ᵀ d + ½ dᵀ ∇²f(x*) d + ⋯

½ dᵀ ∇²f(x*) d ≥ 0


    Unconstrained Problems

• A positive definite Hessian at the minimum ensures only that a local minimum has been found.

• The minimum is the global minimum only if it can be shown that the Hessian is positive definite for all possible values of x. This would imply a convex design space.

• Very hard to prove in practice!

    Conditions for Optimality


    Optimality Conditions – Unconstrained Case

• Let x* be the point that we think is the minimum for f(x)
• Necessary condition (for optimality): ∇f(x*) = 0
• A point that satisfies the necessary condition is a stationary point
• It can be a minimum, maximum, or saddle point
• How do we know that we have a minimum?
• Answer: Sufficiency condition:

The sufficient conditions for x* to be a strict local minimum are:
∇f(x*) = 0
∇²f(x*) is positive definite


    Example 1:

Find all stationary points for the following function. Using optimality conditions, classify them as minimum, maximum or inflection points.

The objective function is: f(x, y) = -2x + x² – xy + 2y²

The gradient vector:

( ∂f/∂x , ∂f/∂y )ᵀ = ( -2 + 2x – y , -x + 4y )ᵀ

The Hessian matrix:

[ ∂²f/∂x²    ∂²f/∂x∂y ]   [  2  -1 ]
[ ∂²f/∂y∂x   ∂²f/∂y²  ] = [ -1   4 ]


Example 1: The first order optimality conditions:

Necessary conditions:

-2 + 2x – y = 0
-x + 4y = 0 ; so x = 4y, and then

-2 + 2(4y) – y = 0; 7y = 2; y = 2/7 and x = 8/7.

The possible solution point is x = 1.14286 and y = 0.285714.

Then let us apply the second order optimality condition: the Hessian matrix must be positive definite at the minimum. Let us find the principal minors:

A1 = |a11| = 2 ; A2 = det H = 8 – 1 = 7 ; both are positive;

So, H is positive definite.
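The arithmetic above is easy to double-check numerically; a minimal sketch in Python/NumPy (rather than the lecture's MATLAB):

```python
import numpy as np

# Double-check Example 1 at the candidate point (x, y) = (8/7, 2/7)
# for f(x, y) = -2x + x^2 - xy + 2y^2.
x, y = 8/7, 2/7
grad = np.array([-2 + 2*x - y, -x + 4*y])   # first order conditions
H = np.array([[2.0, -1.0],
              [-1.0, 4.0]])                 # constant Hessian of f
A1 = H[0, 0]                                # first leading principal minor
A2 = np.linalg.det(H)                       # second leading principal minor
print(grad)      # ~ (0, 0): stationary point
print(A1, A2)    # 2.0, 7.0: both positive, so H is positive definite
```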


Example 1: The function f(x, y) at the point x* = 1.14286, y* = 0.285714 is

f = -1.14286 ;

The point is a minimum point. Since the Hessian matrix is positive definite, we know the function is convex.

Therefore any minimum is a global minimum.

% Matlab program to draw contour of function
[X,Y] = meshgrid(-1:.1:2);
Z = -2.*X + X.*X - X.*Y + 2.*Y.*Y;
contour(X,Y,Z,100)


Example 1:
• Contour graph using MATLAB

Graphical presentation of the function and the minimum at point (x*, y*).


Example 1:
• Three-D plot

It confirms the solution, as well as that a global minimum exists here at point (x*, y*).

% Matlab program to draw function
[X,Y] = meshgrid(-1:.1:2);
Z = -2.*X + X.*X - X.*Y + 2.*Y.*Y;
mesh(Z);


Example 1:
• Important observations:
• The minimum point does not change if we add a constant to the objective function.
• The minimum point does not change if we multiply the objective function by a positive constant.
• The problem changes from a minimization to a maximization problem if we multiply the objective function by a negative sign.
• The unconstrained problem is a convex problem if the objective function is convex. For convex problems any local minimum is also a global minimum.


Find values of the variables that minimize or maximize the objective function while satisfying the constraints.

The standard form of the constrained optimization problem can be written as:

Minimize: F(x)  objective function

Subject to: gj(x) ≤ 0, j = 1, . . . , m  inequality constraints
hk(x) = 0, k = 1, . . . , l  equality constraints
xi(lower) ≤ xi ≤ xi(upper), i = 1, . . . , n  side constraints

where x = (x1, x2, . . . , xn) are the design variables.

What is an optimization problem:


    Unconstrained Problems

    1. ∇F(x)=0; The gradient of F(x) must vanish at the optimum

2. The Hessian matrix must be positive definite (i.e. all eigenvalues positive at the optimum point).

H = ∇²F(x) =

[ ∂²F/∂x1²     ∂²F/∂x1∂x2   …   ∂²F/∂x1∂xn ]
[ ∂²F/∂x2∂x1   ∂²F/∂x2²     …   ∂²F/∂x2∂xn ]
[     ⋮             ⋮        ⋱       ⋮     ]
[ ∂²F/∂xn∂x1   ∂²F/∂xn∂x2   …   ∂²F/∂xn²   ]

    Conditions for Optimality
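One practical way to carry out the positive-definiteness test on the Hessian described above is to check its eigenvalues numerically; a small sketch (Python/NumPy, not from the slides; the helper name is our own):

```python
import numpy as np

# Sketch: a Hessian is positive definite iff all eigenvalues are positive.
def is_positive_definite(H, tol=1e-10):
    H = np.asarray(H, dtype=float)
    eigvals = np.linalg.eigvalsh((H + H.T) / 2)  # symmetrize, then eigenvalues
    return bool(eigvals.min() > tol)

print(is_positive_definite([[2, -1], [-1, 4]]))  # True (Example 1's Hessian)
print(is_positive_definite([[1, 0], [0, -1]]))   # False (a saddle)
```

For a 2x2 matrix this agrees with the principal-minor test used in Example 1.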



• To prove a claim of optimality in constrained minimization (or maximization), we have to check the found point with respect to the Karush-Kuhn-Tucker conditions.

• Kuhn and Tucker extended the Lagrangian theory to include the general classical single-objective nonlinear programming problem:

Minimize: f(x)

Subject to: gj(x) ≥ 0 for j = 1, 2, ..., J

hk(x) = 0 for k = 1, 2, ..., K
x = (x1, x2, ..., xN)

    Constrained Case – KKT Conditions


Interior versus Exterior Solutions
• Interior:

If no constraints are active and (thus) the solution lies in the interior of the feasible space, then the necessary condition for optimality is the same as for the unconstrained case: ∇f(x*) = 0

• Exterior:

If the solution lies at the exterior, then the condition ∇f(x*) = 0 does not apply, because some constraints will block movement to this minimum.

• Some constraints will (thus) be active.

• We cannot get any more improvement (in this case) if for x* there does not exist a vector d that is both a descent direction and a feasible direction.

• In other words: the possible feasible directions do not intersect the possible descent directions at all.
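The "no direction both descent and feasible" test can be sketched in code. The helper below is an illustrative assumption of ours (not from the lecture), using the g(x) ≥ 0 convention so that, to first order, a feasible direction at an active constraint satisfies ∇gⱼᵀd ≥ 0:

```python
import numpy as np

# Sketch: d is a descent direction if grad_f . d < 0, and (to first order)
# feasible if grad_g_j . d >= 0 for every active constraint g_j(x) = 0.
def is_descent_and_feasible(grad_f, active_grad_g, d):
    descent = float(np.dot(grad_f, d)) < 0
    feasible = all(float(np.dot(gg, d)) >= 0 for gg in active_grad_g)
    return descent and feasible

# At a constrained minimum no such d exists; e.g. grad_f = (1, 0) with an
# active constraint of gradient (1, 0) blocks the descent step d = (-1, 0).
print(is_descent_and_feasible(np.array([1.0, 0.0]),
                              [np.array([1.0, 0.0])],
                              np.array([-1.0, 0.0])))  # False: blocked
```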


Necessary KKT Conditions
For the problem: Minimize the objective function f(x)

subject to: gj(x) ≥ 0; j = 1, 2, 3, . . . J

hk(x) = 0; k = 1, 2, 3, . . . K

xi(L) ≤ xi ≤ xi(U) ; i = 1, 2, 3, . . . N

This is the most general form of a single-objective constrained optimization problem.

Here gj(x) are the inequality constraint functions (J in total);

hk(x) are the equality constraint functions (K in total).

A point is feasible if all constraints and bounds are satisfied.


Necessary KKT Conditions
For the problem:

    Minimize: f(x)

    subjected to: g j (x) ≥ 0; j = 1, 2, 3, . . . J

    h k (x) = 0; k = 1, 2, 3, . . . K

    x i (L) ≤ x i ≤ x i (U) ; i = 1, 2, 3, . . . N

The necessary conditions are:

∇f(x) − Σj uj ∇gj(x) − Σk vk ∇hk(x) = 0  (optimality)

gj(x) ≥ 0 for j = 1, 2, ..., J  (feasibility)

hk(x) = 0 for k = 1, 2, ..., K  (feasibility)

uj gj(x) = 0 for j = 1, 2, ..., J  (complementary slackness)

uj ≥ 0 for j = 1, 2, ..., J  (non-negativity)
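The condition groups above can be bundled into one numerical check. The helper below is a sketch of ours for the g(x) ≥ 0 case with no equality constraints (the function name and the tolerance handling are assumptions, not from the text); the sample values are those of point x(3) = (3, 2) from Exercise 4.1.1:

```python
import numpy as np

# Sketch: residuals of the necessary KKT conditions for
#   minimize f(x)  subject to  g_j(x) >= 0,  j = 1..J   (no h_k here).
# grad_f: gradient of f at x; g_vals: g_j(x); grad_g: Jacobian (J x n);
# u: candidate multipliers.
def kkt_check(grad_f, g_vals, grad_g, u, tol=1e-8):
    stationarity = grad_f - grad_g.T @ u            # must be ~ 0
    feasible = np.all(g_vals >= -tol)               # g_j(x) >= 0
    slackness = np.all(np.abs(u * g_vals) <= tol)   # u_j g_j(x) = 0
    nonneg = np.all(u >= -tol)                      # u_j >= 0
    return bool(np.all(np.abs(stationarity) <= tol)
                and feasible and slackness and nonneg)

# Point x(3) = (3, 2) of Exercise 4.1.1 with u* = (0, 0, 0, 0):
grad_f = np.array([0.0, 0.0])
g_vals = np.array([18.0, 6.0, 3.0, 2.0])
grad_g = np.array([[4.0, -4.0], [-4.0, -1.0], [1.0, 0.0], [0.0, 1.0]])
u = np.zeros(4)
print(kkt_check(grad_f, g_vals, grad_g, u))   # True: (3, 2) is a K-T point
```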


Necessary KKT Conditions (if g(x) ≥ 0)

• If the definition of feasibility changes, the optimality and feasibility conditions change.

• The necessary conditions become:

∇f(x) − Σ ui ∇gi(x) + Σ vj ∇hj(x) = 0  (optimality)

gj(x) ≥ 0 for j = 1, 2, ..., J  (feasibility)

hk(x) = 0 for k = 1, 2, ..., K  (feasibility)

ui gi(x) = 0 for i = 1, 2, ..., J  (complementary slackness)

ui ≥ 0 for i = 1, 2, ..., J  (non-negativity)


Exercise 4.1.1
Let us take the following function to be minimized:

f(x) = (x² + y – 11)² + (x + y² – 7)²

Subject to:

g1(x) = 26 - (x - 5)² – y² ≥ 0,
g2(x) = 20 – 4x - y ≥ 0,

Here, not every point in the search space is feasible. The feasible points are those that satisfy the above two constraints and the variable bounds.

Let us also choose four points x(1) = (1, 5)ᵀ, x(2) = (0, 0)ᵀ, x(3) = (3, 2)ᵀ, and x(4) = (3.396, 0)ᵀ to investigate whether each point is a K-T point.

The feasible search space and these four points are shown on a contour plot of the objective function in Figure 4.1.


Exercise 4.1.1
The region on the other side of the hatched portion of a constraint line is feasible.

The combination of the two constraints and the variable bounds makes the interior region feasible, as depicted in the figure.


Exercise 4.1.1
• At first, we transform the variable bounds into two inequality constraints: g3(x) = x ≥ 0, and g4(y) = y ≥ 0.
• Thus, the above problem has four inequality constraints (J = 4) and no equality constraint (K = 0).
• There are two problem variables: N = 2.
• Thus, for each point a total of 2 + 3 × 4 + 0 = 14 Kuhn-Tucker conditions need to be checked.
• To formulate all K-T conditions, we first calculate the gradient of the objective function.
• In Table 4.1, we compute these gradients numerically and also compute the constraint values at all four points.


Exercise 4.1.1
• For point (1, 5)ᵀ,

g1(x, y) = 26 - (x - 5)² – y² = 26 – (1 – 5)² – 5² = 26 – 16 – 25 = –15.0 ,

g2(x, y) = 20 – 4x - y = 20 – 4(1) – 5 = 11.0

g3(x, y) = x = 1.0

g4(x, y) = y = 5.0

Now we have f(x) = (x² + y – 11)² + (x + y² – 7)²

fx = 4x³ + 4xy – 42x + 2y² – 14

fy = 2x² + 4y³ + 4xy – 26y – 22

∇f(x, y) = ( fx , fy )ᵀ = ( 4x³ + 4xy – 42x + 2y² – 14 , 2x² + 4y³ + 4xy – 26y – 22 )ᵀ = (18, 370)ᵀ
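The gradient and constraint values used in Table 4.1 can be reproduced in a few lines; a sketch (Python, our own code, not from the text):

```python
import numpy as np

# Sketch: gradient and constraint values of Exercise 4.1.1 for
# f(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2.
def grad_f(x, y):
    fx = 4*x**3 + 4*x*y - 42*x + 2*y**2 - 14
    fy = 2*x**2 + 4*y**3 + 4*x*y - 26*y - 22
    return np.array([fx, fy])

def constraints(x, y):
    g1 = 26 - (x - 5)**2 - y**2
    g2 = 20 - 4*x - y
    return np.array([g1, g2, x, y])        # g3 = x, g4 = y

print(grad_f(1, 5))       # (18, 370)
print(constraints(1, 5))  # (-15, 11, 1, 5): g1 < 0, so (1, 5) is infeasible
print(grad_f(3, 2))       # (0, 0): (3, 2) is a stationary point of f
```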


Exercise 4.1.1
• For point (1, 5)ᵀ,

g1(x, y) = 26 - (x - 5)² – y² = 26 – (1 – 5)² – 5² = 26 – 16 – 25 = –15.0 ,

( g1 )x = –2(x – 5) ; ( g1 )y = –2y
∇g1(x, y) = ( g1x , g1y )ᵀ = ( -2x + 10 , -2y )ᵀ = (8, -10)ᵀ

g2 = 20 – 4x – y ; ( g2 )x = –4 ; ( g2 )y = –1
∇g2(x, y) = ( g2x , g2y )ᵀ = ( -4 , -1 )ᵀ

g3 = x ; ( g3 )x = 1 ; ( g3 )y = 0
∇g3(x, y) = ( g3x , g3y )ᵀ = ( 1 , 0 )ᵀ

g4 = y ; ( g4 )x = 0 ; ( g4 )y = 1
∇g4(x, y) = ( g4x , g4y )ᵀ = ( 0 , 1 )ᵀ


Exercise 4.1.1
• For the first point, substituting the values into the optimality condition

∇f(x) − Σj uj ∇gj(x) − Σk vk ∇hk(x) = 0 ,

that is, componentwise,

fx − u1 (g1)x − u2 (g2)x − u3 (g3)x − u4 (g4)x − v1 (h1)x − ⋯ = 0 ,
fy − u1 (g1)y − u2 (g2)y − u3 (g3)y − u4 (g4)y − v1 (h1)y − ⋯ = 0 ,

with the constraint gradients

( g1x , g1y )ᵀ = ( 8 , -10 )ᵀ
( g2x , g2y )ᵀ = ( -4 , -1 )ᵀ
( g3x , g3y )ᵀ = ( 1 , 0 )ᵀ
( g4x , g4y )ᵀ = ( 0 , 1 )ᵀ

we get the following KKT conditions:

18 – (8)u1 – (– 4)u2 – (1)u3 – (0)u4 = 0 ;

370 – (-10)u1 – (–1)u2 – (0)u3 – (1)u4 = 0 ;


Exercise 4.1.1
• For the first point, the remaining KKT conditions are:

Optimality (x-component): 18 – (8)u1 – (– 4)u2 – (1)u3 – (0)u4 = 0 ;

Feasibility, gj(x) ≥ 0 for j = 1, 2, ..., J:
g1 = –15, not greater than or equal to zero;
g2 = +11 > 0 ;
g3 = 1 > 0 ;
g4 = 5 > 0

(There are no equality constraints hk(x) = 0 to check, since K = 0.)

Complementary slackness, uj gj(x) = 0 for j = 1, 2, ..., J:
(-15)u1 = 0 ; (11)u2 = 0 ; (1)u3 = 0 ; (5)u4 = 0

Non-negativity: u1 , u2 , u3 , u4 ≥ 0


Exercise 4.1.1
• It is clear that the third condition, gj(x) ≥ 0, is not satisfied.
• This is enough to conclude that the point x(1) is not a K-T point.
• In fact, since the first constraint value is negative, this constraint is violated at this point and the point x(1) is not a feasible point, as shown in Figure 4.1.
• If a point is infeasible, the point cannot be an optimal point.


    Exercise 4.1.1


Exercise 4.1.1
• Since u1 = 0 and u2 = 0, we get u3 = -14 and u4 = -22.
• The final set of conditions is not satisfied with these values, because u3 and u4 are negative.
• Thus, the point x(2) is not a K-T point.
• Since the constraints are not violated, the point is a feasible point, as shown in Figure 4.1.
• Thus, the point x(2) cannot be an optimal point.
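The multiplier values quoted above can be recovered by solving the stationarity equations. A sketch for x(2) = (0, 0): only g3 = x and g4 = y are active there, so complementary slackness forces u1 = u2 = 0 and a small linear system remains (code is ours, not from the text):

```python
import numpy as np

# At x(2) = (0, 0): grad f = (-14, -22); only g3, g4 are active, so
# stationarity reduces to grad_f - u3*grad_g3 - u4*grad_g4 = 0.
grad_f = np.array([-14.0, -22.0])
G = np.column_stack([np.array([1.0, 0.0]),    # grad g3
                     np.array([0.0, 1.0])])   # grad g4
u3, u4 = np.linalg.solve(G, grad_f)
print(u3, u4)   # -14.0 -22.0: negative multipliers, so not a K-T point
```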


Exercise 4.1.1
Similarly, the conditions for the third point x(3) = (3, 2)ᵀ are the following:

-4u1 + 4u2 – u3 = 0 ;
+4u1 + u2 – u4 = 0 ;
18 > 0 ; 6 > 0 ; 3 > 0 ; 2 > 0 ,
(18)u1 = 0 , (6)u2 = 0 , (3)u3 = 0 , (2)u4 = 0 ,

The vector u* = (0, 0, 0, 0)ᵀ satisfies all the above conditions.


Exercise 4.1.1
• The vector u* = (0, 0, 0, 0)ᵀ satisfies all the above conditions.
• Thus, the point x(3) is a K-T point (Figure 4.1).
• As mentioned earlier, K-T points are likely candidates for minimal points.
• To conclude, we may say that the optimality of a point requires the satisfaction of more conditions.


Exercise 4.1.1
The K-T conditions obtained for the point x(4) = (3.396, 0)ᵀ are

-3.21u1 + 4u2 - u3 = 0 ,
1 + u2 - u4 = 0 ,
23.427 > 0 ; 6.416 > 0 ; 3.396 > 0 ; 0 = 0 ,
(23.427)u1 = 0 , (6.416)u2 = 0 , (3.396)u3 = 0 , (0)u4 = 0 ,

The solution to the above conditions is the vector u* = (0, 0, 0, 1)ᵀ.


Exercise 4.1.1
• The solution to the above conditions is the vector u* = (0, 0, 0, 1)ᵀ.
• Thus, the point x(4) is also a K-T point.
• It is clear from the figure that the point x(3) is the minimum point, but the point x(4) is not a minimum point.
• Thus, we may conclude from the above exercise problem that a K-T point may or may not be a minimum point.
• But if a point is not a K-T point (point x(1) or x(2)), then it cannot be an optimum point.


    Limitations


• The necessity theorem helps identify points that are not optimal. A point is not optimal if it does not satisfy the Kuhn–Tucker conditions.

• On the other hand, not all points that satisfy the Kuhn-Tucker conditions are optimal points.

• The Kuhn–Tucker sufficiency theorem gives conditions under which a point becomes an optimal solution to a single-objective NLP.


    Sufficiency Condition


• Sufficient conditions that a point x* is a strict local minimum of the classical single-objective NLP problem, where f, gj, and hk are twice differentiable functions, are that

1) The necessary KKT conditions are met.

2) The Hessian matrix ∇²L(x*) = ∇²f(x*) + Σ μi ∇²gi(x*) + Σ λj ∇²hj(x*) is positive definite on a subspace of Rⁿ, as defined by the condition:

yᵀ ∇²L(x*) y ≥ 0 for every vector y (1×N) satisfying:

∇gj(x*) y = 0 for j belonging to I1 = { j | gj(x*) = 0, uj* > 0 } (active constraints)

∇hk(x*) y = 0 for k = 1, ..., K and y ≠ 0


    KKT Sufficiency Theorem (Special Case)


• Consider the classical single-objective NLP problem:

minimize: f(x)

Subject to: gj(x) ≥ 0 for j = 1, 2, ..., J

hk(x) = 0 for k = 1, 2, ..., K

x = (x1, x2, ..., xN)

• Let the objective function f(x) be convex, the inequality constraints gj(x) all be convex functions for j = 1, ..., J, and the equality constraints hk(x) for k = 1, ..., K be linear.

• If this is true, then the necessary KKT conditions are also sufficient.

• Therefore, in this case, if there exists a solution x* that satisfies the KKT necessary conditions, then x* is an optimal solution to the NLP problem.

• In fact, it is a global optimum.


    Some Remarks


• The Kuhn-Tucker conditions are an extension of the Lagrangian function and method.

• They provide a powerful means to verify solutions.

• But there are limitations …

• Sufficiency conditions are difficult to verify.

• Practical problems do not have the required nice properties.
• For example, you will have problems if you do not know the explicit constraint equations (e.g., in FEM).

• If you have a multi-objective formulation, then we suggest testing each priority level separately.
