A Framework for Accommodating Infeasible Starts in Convex Quadratic Optimization, with Application to Constraint-Reduced Interior Point
M. Paul Laiu1 and (presenter) Andre L. Tits2
1 Oak Ridge National Laboratory, [email protected]
2 Department of ECE and ISR, University of Maryland, College Park, [email protected]
ICCOPT, Berlin, August 5–8, 2019
With thoughts to the memory of Andy Conn, and his early, fundamental contributions to exact penalty function theory
Prolog: Glance at Numerical Results

CR-MPC: Constraint-Reduced Mehrotra Predictor-Corrector [LT19:COAP]
- Target: "imbalanced" CQPs: many inequality constraints
- New rule (Rule R) for selection of working set
- BUT requires a feasible starting point

NEW: IS-CRMPC: Infeasible Start, using the proposed general framework
                        Randomly generated    Data fitting w/ random noise
Algorithm               H ≻ 0    H = 0        sin(10t)cos(25t²)   sin(5t³)cos²(10t)
                        time     time         time                time
SDPT3 (cvx)             14.8     10.7         21.2                19.4
SeDuMi (cvx)            2.1      2.4          2.7                 2.6
r.p. simplex w/ p.p.¹   –        6.6          –                   –
CR-MPC: ALL             3.1      3.3          10.0                7.1
CR-MPC: Rule R          0.2      0.2          0.6                 0.6
IS-CRMPC: Rule R        0.4      0.5          1.0                 0.8

Table: CPU time (sec) averaged over 20 runs and 5 problem sizes: m = 10 000 and n = 10, 20, 50, 100, 200.

¹ Luke Winternitz's 2010 Matlab implementation [W10] of the algorithm taken from [BT97]
Outline
Framework: Basics
Framework: Convergence
Application to Constraint Reduction: CR-MPC
Numerical Results
Motivation and Preliminaries

We consider the convex quadratic program (CQP)

    (P)   min_x  f(x) := (1/2) x^T H x + c^T x   s.t.  Ax ≥ b,  Cx = d,

where x ∈ R^n, b ∈ R^m, d ∈ R^p, H = H^T ⪰ 0, with dual given by

    (D)   max_{x,λ,µ}  −(1/2) x^T H x + b^T λ + d^T µ   s.t.  Hx + c − A^T λ − C^T µ = 0,  λ ≥ 0.
Assumptions:
A1. (P) and (D) strictly feasible; optimal solution set F∗ of (P) is bdd;
A2. C and [H AT CT ] full rank.
Contribution:
- Framework for accommodating infeasible starts and equality constraints, given a user-provided feasible-start, inequality-constraint algorithm (constraint-reduced IP, simplex-like, dual simplex, ...)
- Application: Infeasible-Start CR-MPC w/ equality constraints.
Key: An Augmented, ℓ1-Penalized Problem

Given ϕ > 0,

    (Pϕ)   min_{x,z,y}  Fϕ(x, z, y) := (1/2) x^T H x + c^T x + ϕ(1^T z + 1^T y)
           s.t.  Ax + z ≥ b,  z ≥ 0,  Cx + y ≥ d,  −Cx + y ≥ −d,

where Cx = d has been split into Cx ≥ d, Cx ≤ d, and constraint violation is penalized for both.

Remarks:
1. y ≥ 0 is left out: it is implied by (Cx + y ≥ d, −Cx + y ≥ −d).
2. Note that, while (Cx ≥ d, Cx ≤ d) is degenerate, (Cx + y ≥ d, −Cx + y ≥ −d) is not! In particular, as long as C has full row rank, the rows of [C I; −C I] are linearly independent.

Note: The associated dual variable will be denoted Λ = (π, ξ, η, ζ).
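As a concrete illustration, (Pϕ) can be assembled mechanically from the data of (P). Below is a minimal NumPy sketch (our own code, not from [LT19]; function names and the unit offsets in the starting point are arbitrary choices): it builds the stacked objective and constraint data of (Pϕ), and completes an arbitrary x0 to a feasible (x0, z0, y0), which is precisely what the augmentation buys us.

```python
import numpy as np

def augment(H, c, A, b, C, d, phi):
    """Build the l1-penalized problem (P_phi) in variables X = [x; z; y],
    so that (P_phi) reads  min (1/2) X^T Hbar X + cbar^T X  s.t.  Abar X >= bbar."""
    n, m, p = H.shape[0], A.shape[0], C.shape[0]
    Hbar = np.zeros((n + m + p, n + m + p))
    Hbar[:n, :n] = H                                # diag(H, 0, 0)
    cbar = np.concatenate([c, phi * np.ones(m), phi * np.ones(p)])
    Abar = np.block([
        [A,                np.eye(m),        np.zeros((m, p))],  # Ax + z >= b
        [np.zeros((m, n)), np.eye(m),        np.zeros((m, p))],  # z >= 0
        [C,                np.zeros((p, m)), np.eye(p)],         # Cx + y >= d
        [-C,               np.zeros((p, m)), np.eye(p)],         # -Cx + y >= -d
    ])
    bbar = np.concatenate([b, np.zeros(m), d, -d])
    return Hbar, cbar, Abar, bbar

def feasible_start(A, b, C, d, x0):
    """Any x0 yields a feasible X0: take z0, y0 just large enough."""
    z0 = np.maximum(b - A @ x0, 0.0) + 1.0      # Ax0 + z0 > b and z0 > 0
    y0 = np.abs(C @ x0 - d) + 1.0               # Cx0 + y0 > d and -Cx0 + y0 > -d
    return np.concatenate([x0, z0, y0])
```

A user-provided feasible-start algorithm can then be launched from `feasible_start(A, b, C, d, x0)` for any x0.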
Framework: Master Algorithm
Initialization: x0 ∈ R^n, z0 ∈ R^m, y0 ∈ R^p, satisfying z0 ≥ 0, Ax0 + z0 ≥ b, Cx0 + y0 ≥ d, and Cx0 − y0 ≤ d; Λ0 ≥ 0; ϕ0 > 0; k := 0.

Iteration k:
  If Λ_k ≥ 0 is not available, provide an appropriate (nonnegative) estimate of an optimal dual variable; see below.
  If User's Stopping Criterion (for (P)) is satisfied, stop.
  Penalty update:
    Input: ϕ_k > 0; x_k, z_k, y_k, Λ_k.
    Output: ϕ_{k+1} ≥ ϕ_k.
  Base iteration (applied to (P_ϕ_{k+1})–(D_ϕ_{k+1})):
    Input: X_k := [x_k; z_k; y_k].
    Output: [x_{k+1}; z_{k+1}; y_{k+1}] := X_{k+1}, (Λ_{k+1} ≥ 0).
  If F_ϕ_{k+1}(X_{k+1}) > F_ϕ_{k+1}(X_k), set X_k := X_{k+1} and go back to Base iteration;
  otherwise, set k := k + 1 and go to Iteration k.
Outline
Framework: Basics
Framework: Convergence
Application to Constraint Reduction: CR-MPC
Numerical Results
Framework: Requirements for Base Iteration
When applied to (Pϕ) for some ϕ > 0, the Base Iteration must

- [feasibility] produce a feasible X_{k+1} when given a feasible X_k,

and, when applied iteratively (w/o ϕ updates), it must

- [convergence] generate sequences (primal) {X_k} and (dual) {Λ_k} (the latter possibly produced by the Master Algorithm) which, if (z_k, y_k) is bounded, "converge to primal(–dual) optimality" on some subsequence, in the sense that
  - if (Pϕ) is bounded then

        lim inf_{k→∞} max{ ‖diag(s_k, z_k, t_k^+, t_k^−) Λ_k‖, ‖H̄ X_k + c̄_ϕ − Ā^T Λ_k‖ } = 0,

    where s := Ax + z − b, t^+ := Cx + y − d, and t^− := −Cx + y + d (slack variables),
  - and otherwise {X_k} is unbounded.

Here H̄ := diag(H, 0, 0),

    Ā := [  A   I   0
            0   I   0
            C   0   I
           −C   0   I ],     c̄_ϕ := [ c; ϕ1; ϕ1 ].
Framework: Requirements for Penalty Update
The Penalty Update rule must see to it that

- {ϕ_k} is positive, nondecreasing, and either unbounded or eventually constant.
- {ϕ_k} can be bounded only if {(z_k, y_k)} also is; and ϕ_k's eventual value ϕ̄ satisfies ϕ̄ > lim inf_{k→∞} ‖[π_k; η_k; ζ_k]‖∞.
- If {ϕ_k} is unbounded, then either (i) {(z_k, y_k)} is unbounded, or (ii) there exists an infinite index set K ⊆ {1, 2, ...} such that
  - ϕ_k / max{1, ‖[π_k; η_k − ζ_k]‖},
  - diag(s_k, z_k, t_k^+, t_k^−) Λ_k,
  - H̄ X_k + c̄_{ϕ_k} − Ā^T Λ_k, and (H x_k + c − A^T π_k − C^T (η_k − ζ_k))^T x_k

  are bounded on K.
Framework: Possible Penalty Update Rule
Given σ1 > 0, σ2 > 1, γ > 0, ρ > 0,

- Set ϕ+ := ϕ;
- If ϕ < ‖(1/γ)[z; y]‖∞, then ϕ+ := σ2 ‖(1/γ)[z; y]‖∞;
- If ϕ+ < ‖[π; η; ζ]‖∞ + σ1 and
  - max{ ‖diag(s, z, t^+, t^−) Λ‖, ‖H̄ X + c̄_ϕ − Ā^T Λ‖ } < ρ (near optimal),
  - |(Hx + c − A^T π − C^T (η − ζ))^T x| < ρ (even if unbounded x sequence),

  then ϕ+ := σ2 (‖[π; η; ζ]‖∞ + σ1).

This rule satisfies all Requirements for Penalty Update.
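In code, the rule above is only a few lines. A hedged Python sketch (the two residuals are assumed to be computed by the caller from the current iterate; the parameter defaults σ1 = 1, σ2 = 2, γ = 1, ρ = 10⁻² are illustrative, not taken from the paper):

```python
import numpy as np

def penalty_update(phi, z, y, pi, eta, zeta, kkt_err, grad_err,
                   sigma1=1.0, sigma2=2.0, gamma=1.0, rho=1e-2):
    """One pass of the penalty-update rule.

    kkt_err:  max{ ||diag(s,z,t+,t-) Lam||, ||Hbar X + cbar_phi - Abar^T Lam|| }
    grad_err: |(Hx + c - A^T pi - C^T (eta - zeta))^T x|
    (both supplied by the caller)."""
    phi_plus = phi
    zy_norm = np.linalg.norm(np.concatenate([z, y]) / gamma, np.inf)
    if phi < zy_norm:                           # penalty too small vs. violation
        phi_plus = sigma2 * zy_norm
    mult_norm = np.linalg.norm(np.concatenate([pi, eta, zeta]), np.inf)
    if phi_plus < mult_norm + sigma1 and kkt_err < rho and grad_err < rho:
        phi_plus = sigma2 * (mult_norm + sigma1)  # near optimal: exceed multipliers
    return phi_plus
```

Since σ2 > 1, both branches can only increase ϕ, so {ϕ_k} is automatically nondecreasing, as the requirements demand.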
Convergence of Master Algorithm
Theorem: Suppose A1–A2 hold and the Base Iteration and Penalty Update satisfy the listed requirements. Then

- ϕ_k is eventually constant;
- as k → ∞, (z_k, y_k) → 0 and x_k → F∗.
Outline
Framework: Basics
Framework: Convergence
Application to Constraint Reduction: CR-MPC
Numerical Results
Application to CR-MPC: Outline of CR-MPC Iteration
Given X primal-feasible, Λ ≥ 0:

1. Select working set Q ⊆ {1, ..., m};
2. Solve reduced Newton–KKT system (affine scaling) → (∆X^a, ∆Λ^a);
3. With q := |Q|, set µ(Q) := s_Q^T Λ_Q / q;
4. Solve reduced modified Newton–KKT system (corrector/centering) → (∆X^c, ∆Λ^c);
5. Set γ ∈ [0, 1]; set (∆X, ∆Λ) := (∆X^a, ∆Λ^a) + γ (∆X^c, ∆Λ^c);
6. Set α_p ∈ (0, 1]; set X+ := X + α_p ∆X primal-feasible.
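Step 6 is what keeps the iterate strictly feasible. One standard way to choose α_p is a fraction-to-the-boundary step along ∆X; the sketch below illustrates that idea only (the exact step-size rule in [LT19] involves additional safeguards we omit here).

```python
import numpy as np

def primal_step(X, dX, A_in, b_in, tau=0.995):
    """Largest step alpha in (0, 1] along dX keeping the slacks
    s = A_in X - b_in strictly positive, damped by tau < 1
    (a generic fraction-to-the-boundary rule)."""
    s = A_in @ X - b_in           # current slacks, assumed > 0
    ds = A_in @ dX
    blocking = ds < 0             # only decreasing slacks limit the step
    if blocking.any():
        alpha = min(1.0, tau * np.min(s[blocking] / -ds[blocking]))
    else:
        alpha = 1.0
    return X + alpha * dX, alpha
```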
Infeasible-Start CR-MPC: Check Requirements
[feasibility] Primal feasibility is obviously satisfied.

Additional assumption (so that the LI Assumption of [LT19] holds for (Pϕ)):

A3. ∀x ∈ R^n, {a_i : b_i − z̄ ≤ a_i^T x ≤ b_i} ∪ {c_i : i = 1, ..., p} is LI, with z̄ := max_k ‖z_k‖∞.

Note: While we (the authors of [LT19]) were not able to do away with an LI assumption in the analysis of [LT19], intuition and extensive numerical testing suggest that A3 can be dropped.

[convergence] Under A1–A3, suppose (z_k, y_k) is bounded. Then

- If (Pϕ) is bounded, then
  - Assumption A1 implies that {X_k} is bounded;
  - the claim then follows from [LT19].
- In the case of CR-MPC, the second alternative is vacuous: boundedness of {X_k} implies boundedness of (Pϕ).
Convergence of infeasible-start CR-MPC
With the stopping criterion

    ‖diag(s_k) π_k‖ < ε,   ‖H x_k + c − A^T π_k − C^T (η_k − ζ_k)‖ < ε,
    ‖[b − A x_k]^+‖ < ε,   ‖C x_k − d‖ < ε,

under A1–A3, the Master Algorithm with CR-MPC as base iteration (endowed with a constraint-selection rule that satisfies Condition CSR of [LT19], e.g., Rule R of [LT19]) enjoys the following properties:

- If ε = 0, {x_k} converges to F∗, while if ε > 0 the stopping criterion is satisfied after finitely many iterations.
- If (P)–(D) has a unique solution (x∗, π∗) at which SOSC holds with strict complementarity, then (X_k, Λ_k) → (X∗, Λ∗), and convergence is locally q-quadratic in (X_k, Λ_k), hence r-quadratic in x_k.
Outline
Framework: Basics
Framework: Convergence
Application to Constraint Reduction: CR-MPC
Numerical Results
Randomly Generated Problems
Problem setting:

    minimize_{x ∈ R^n}  (1/2) x^T H x + c^T x
    subject to  Ax ≥ b,  Cx = d.

- A ∼ N(0, 1), C ∼ N(0, 1), and c ∼ N(0, 1)
- x_feas ∼ U(0, 1), s_feas ∼ U(1, 2)
- b := A x_feas − s_feas, d := C x_feas (ensures strict feasibility)
- m = 10000, n between 10 and 200, and p = n/2

We consider the following two classes of Hessian matrices:

1. Strongly convex QP: diagonal H, diag(H) ∼ U(0, 1).
2. LP: H = 0.

We solved 20 randomly generated problems for each class of H and for each problem size, and report the results averaged over the 20 problems.
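The recipe above is easy to reproduce. A NumPy sketch (our own code, following the slide's distributions; `x_feas` is returned only so that strict feasibility can be checked):

```python
import numpy as np

def random_cqp(m, n, strongly_convex=True, seed=None):
    """Random test CQP per the slide: Gaussian A, C, c; a feasible point
    x_feas ~ U(0,1) with slack s_feas ~ U(1,2) baked into b, and d chosen
    so that C x_feas = d holds exactly."""
    rng = np.random.default_rng(seed)
    p = n // 2
    A = rng.standard_normal((m, n))
    C = rng.standard_normal((p, n))
    c = rng.standard_normal(n)
    x_feas = rng.uniform(0.0, 1.0, n)
    s_feas = rng.uniform(1.0, 2.0, m)
    b = A @ x_feas - s_feas               # A x_feas - b = s_feas >= 1 > 0
    d = C @ x_feas                        # equality constraints hold exactly
    H = np.diag(rng.uniform(0.0, 1.0, n)) if strongly_convex else np.zeros((n, n))
    return H, c, A, b, C, d, x_feas
```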
Randomly Generated Problems
[Figure: iteration count and CPU time (sec) vs. n, for (a) strongly convex QP and (b) LP.]
Randomly Generated Problems w/o Equality Constraints
[Figure: iteration count and CPU time (sec) vs. n, for (a) strongly convex QP and (b) LP.]
Data Fitting Problems
Regularized minimax data fitting problem (taken from [WNTO12]):

    minimize_{x ∈ R^n}  ‖Ax − b‖∞ + (1/(2α)) x^T H x

    ⇒   minimize_{x ∈ R^n, u ∈ R}  u + (1/(2α)) x^T H x
        subject to  Ax − b ≥ −u1,  −Ax + b ≥ −u1.

- b: noisy data measurement from a target function g.
- A: trigonometric basis; x: expansion coefficients.
- H: regularization matrix; α: regularization parameter.
- m = 10000, n from 10 to 200.

For each choice of g and for each problem size, we solved the problem 20 times and report the results averaged over the 20 problems.
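The conversion to standard CQP form is mechanical: one extra variable u and a doubled inequality block. A NumPy sketch (our own code; we read the regularization weight as 1/(2α), which is our interpretation of the slide):

```python
import numpy as np

def minimax_to_cqp(A, b, H, alpha):
    """Rewrite  min ||Ax - b||_inf + (1/(2*alpha)) x^T H x  as a CQP in
    w = [x; u]:
        min (1/2) w^T H_qp w + c_qp^T w   s.t.  A_in w >= b_in,
    where the constraints encode  -u*1 <= Ax - b <= u*1."""
    m, n = A.shape
    ones = np.ones((m, 1))
    A_in = np.block([[A, ones],         #  Ax - b >= -u1
                     [-A, ones]])       # -Ax + b >= -u1
    b_in = np.concatenate([b, -b])
    H_qp = np.zeros((n + 1, n + 1))
    H_qp[:n, :n] = H / alpha            # (1/2) w^T H_qp w = (1/(2 alpha)) x^T H x
    c_qp = np.zeros(n + 1)
    c_qp[n] = 1.0                       # linear term picks out u
    return H_qp, c_qp, A_in, b_in
```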
Data Fitting Problems
[Figure: iteration count and CPU time (sec) vs. n, for (a) g(t) = sin(10t) cos(25t²) and (b) g(t) = sin(5t³) cos²(10t).]
Concluding remarks
- An exact-penalty-function-based framework was proposed that allows "feasible" CQP algorithms to accept infeasible starts, and pure-inequality algorithms to handle equality constraints.
- The framework caters to a broad class of algorithms, and allows for inexact computation of the search direction.
- Preliminary numerical testing suggests that, on imbalanced problems, an infeasible-start version of a constraint-reduced algorithm (IS-CRMPC) often performs better than broadly used codes.
- Constraint-reduced IP could be viewed as a middle ground between IP and Simplex.
THANK YOU !!
These slides are available from Andre Tits's home page: https://user.eng.umd.edu/~andre/
[LT19: COAP, Vol. 72, No. 3, 727–768, 2019]
[BT97: Bertsimas, Tsitsiklis: Introduction to Linear Optimization, 1997]
[W10: Winternitz, PhD Thesis, 2010: http://hdl.handle.net/1903/10400]
[WNTO12: Winternitz et al.: COAP, Vol. 51, No. 1, 1001–1036, 2012]

Some other recent work that combines penalty functions and interior point:

[Benson, Shen, Shanno, Vanderbei: COAP, Vol. 34, No. 2, 155–182, 2006]
[Curtis: Math. Prog. Comp., Vol. 4, 181–209, 2012]

Early, fundamental work by the late Andy Conn:

[Coleman, Conn: SINUM, Vol. 10, No. 4, 760–784, 1973]
How to select Q?
- Many constraint selection rules have been proposed.
- We give a condition on the selection rule that guarantees convergence of the regularized CR-MPC algorithm.
- All selection rules that we are aware of satisfy this condition.

Condition (CSR):

1. When {(x_k, λ_k)} is bounded away from optimality, Q_k includes the indexes of every active constraint at limit points x′ when such limit points are approached;
2. when {x_k} converges to a solution point x∗, Q_k eventually includes the indexes of every active constraint at x∗.
Proposed Constraint Selection Rule

Rule R:

- Parameters: δ̄ > 0, 0 < β < θ < 1.
- Input: iteration k; slack variable s_k; errors E_min (when k > 0) and E_k := E(x_k, λ_k); threshold δ_{k−1}.
- Output: working set Q_k; threshold δ_k; error E_min.
- If k = 0: δ_k := δ̄, E_min := E_k;
  else if E_k ≤ β E_min: δ_k := θ δ_{k−1}, E_min := E_k;
  else: δ_k := δ_{k−1}.
- Select Q_k := {i ∈ {1, ..., m} : s_i^k ≤ δ_k}.
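Rule R is cheap: one error comparison plus a threshold test per constraint. A Python sketch (parameter defaults are illustrative, not the paper's; E stands for whatever optimality-error measure the algorithm tracks):

```python
import numpy as np

def rule_R(k, s, E_k, E_min, delta_prev, delta_bar=1.0, beta=0.4, theta=0.8):
    """Rule R: shrink the slack threshold delta_k geometrically each time
    the optimality error E_k drops below beta * E_min; the working set Q_k
    collects the near-active constraints."""
    if k == 0:
        delta_k, E_min = delta_bar, E_k
    elif E_k <= beta * E_min:
        delta_k, E_min = theta * delta_prev, E_k
    else:
        delta_k = delta_prev
    Q_k = np.flatnonzero(s <= delta_k)    # indexes i with s_i^k <= delta_k
    return Q_k, delta_k, E_min
```

As the error shrinks, δ_k shrinks too, so far-from-active constraints are progressively dropped from the reduced Newton–KKT systems.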