A Framework for Accommodating Infeasible Starts in Convex Quadratic Optimization, with Application to Constraint-Reduced Interior Point
M. Paul Laiu1 and (presenter) Andre L. Tits2
1 Oak Ridge National Laboratory, [email protected]
2 Department of ECE and ISR, University of Maryland, College Park, [email protected]
ICCOPT, Berlin, August 5–8, 2019
With thoughts to the memory of Andy Conn, and his early, fundamental contributions to exact penalty function theory
Prolog: Glance at Numerical Results

CR-MPC: Constraint-Reduced Mehrotra Predictor-Corrector [LT19:COAP]
- Target: "imbalanced" CQPs: many inequality constraints
- New rule (Rule R) for selection of working set
- BUT requires a feasible starting point

NEW: IS-CRMPC: Infeasible Start, using the proposed general framework
                        Randomly generated    Data fitting w/ random noise
Algorithm               H ≻ 0    H = 0        sin(10t)cos(25t²)   sin(5t³)cos²(10t)
                        time     time         time                time
SDPT3 (cvx)             14.8     10.7         21.2                19.4
SeDuMi (cvx)            2.1      2.4          2.7                 2.6
r.p. simplex w/ p.p.¹   –        6.6          –                   –
CR-MPC: ALL             3.1      3.3          10.0                7.1
CR-MPC: Rule R          0.2      0.2          0.6                 0.6
IS-CRMPC: Rule R        0.4      0.5          1.0                 0.8

Table: CPU time (sec) averaged over 20 runs and 5 problem sizes: m = 10 000 and n = 10, 20, 50, 100, 200.

¹ Luke Winternitz's 2010 Matlab implementation [W10] of the algorithm taken from [BT97]
Outline
Framework: Basics
Framework: Convergence
Application to Constraint Reduction: CR-MPC
Numerical Results
Motivation and Preliminaries

We consider the convex quadratic program (CQP)

    (P)   min_x  f(x) := (1/2) x^T H x + c^T x   s.t.  Ax ≥ b,  Cx = d,

where x ∈ R^n, b ∈ R^m, d ∈ R^p, H = H^T ⪰ 0, with dual given by

    (D)   max_{x,λ,µ}  −(1/2) x^T H x + b^T λ + d^T µ   s.t.  Hx + c − A^T λ − C^T µ = 0,  λ ≥ 0.
Assumptions:
A1. (P) and (D) strictly feasible; optimal solution set F∗ of (P) is bdd;
A2. C and [H AT CT ] full rank.
Contribution:
- Framework for accommodating infeasible starts and equality constraints, given a user-provided feasible-start, inequality-constraint algorithm (constraint-reduced IP, simplex-like, dual simplex, ...)
- Application: Infeasible-Start CR-MPC w/ equality constraints.
Key: An Augmented, ℓ1-Penalized Problem

Given ϕ > 0,

    (Pϕ)   min_{x,z,y}  Fϕ(x, z, y) := (1/2) x^T H x + c^T x + ϕ(1^T z + 1^T y)
           s.t.  Ax + z ≥ b,  z ≥ 0,  Cx + y ≥ d,  −Cx + y ≥ −d,

where Cx = d has been split into Cx ≥ d, Cx ≤ d, and constraint violation is penalized for both.

Remarks:
1. y ≥ 0 is left out: it is implied by (Cx + y ≥ d, −Cx + y ≥ −d).
2. Note that, while (Cx ≥ d, Cx ≤ d) is degenerate, (Cx + y ≥ d, −Cx + y ≥ −d) is not! In particular, as long as C has full row rank, the rows of [C I; −C I] are linearly independent.

Note: The associated dual variable will be denoted Λ = (π, ξ, η, ζ).
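As a concrete illustration, (Pϕ) can be assembled mechanically from the data of (P). Below is a minimal NumPy sketch (our own code, not from [LT19]; function names and the unit offsets in the starting point are arbitrary choices): it builds the stacked objective and constraint data of (Pϕ), and completes an arbitrary x0 to a feasible (x0, z0, y0), which is precisely what the augmentation buys us.

```python
import numpy as np

def augment(H, c, A, b, C, d, phi):
    """Build the l1-penalized problem (P_phi) in variables X = [x; z; y],
    so that (P_phi) reads  min (1/2) X^T Hbar X + cbar^T X  s.t.  Abar X >= bbar."""
    n, m, p = H.shape[0], A.shape[0], C.shape[0]
    Hbar = np.zeros((n + m + p, n + m + p))
    Hbar[:n, :n] = H                                # diag(H, 0, 0)
    cbar = np.concatenate([c, phi * np.ones(m), phi * np.ones(p)])
    Abar = np.block([
        [A,                np.eye(m),        np.zeros((m, p))],  # Ax + z >= b
        [np.zeros((m, n)), np.eye(m),        np.zeros((m, p))],  # z >= 0
        [C,                np.zeros((p, m)), np.eye(p)],         # Cx + y >= d
        [-C,               np.zeros((p, m)), np.eye(p)],         # -Cx + y >= -d
    ])
    bbar = np.concatenate([b, np.zeros(m), d, -d])
    return Hbar, cbar, Abar, bbar

def feasible_start(A, b, C, d, x0):
    """Any x0 yields a feasible X0: take z0, y0 just large enough."""
    z0 = np.maximum(b - A @ x0, 0.0) + 1.0      # Ax0 + z0 > b and z0 > 0
    y0 = np.abs(C @ x0 - d) + 1.0               # Cx0 + y0 > d and -Cx0 + y0 > -d
    return np.concatenate([x0, z0, y0])
```

A user-provided feasible-start algorithm can then be launched from `feasible_start(A, b, C, d, x0)` for any x0.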
Framework: Master Algorithm
Initialization: x0 ∈ R^n, z0 ∈ R^m, y0 ∈ R^p, satisfying z0 ≥ 0, Ax0 + z0 ≥ b, Cx0 + y0 ≥ d, and Cx0 − y0 ≤ d; Λ0 ≥ 0; ϕ0 > 0; k := 0.

Iteration k:
  If Λ_k ≥ 0 is not available, provide an appropriate (nonnegative) estimate of an optimal dual variable; see below.
  If User's Stopping Criterion (for (P)) is satisfied, stop.
  Penalty update:
    Input: ϕ_k > 0; x_k, z_k, y_k, Λ_k.
    Output: ϕ_{k+1} ≥ ϕ_k.
  Base iteration (applied to (P_ϕ_{k+1})–(D_ϕ_{k+1})):
    Input: X_k := [x_k; z_k; y_k].
    Output: [x_{k+1}; z_{k+1}; y_{k+1}] := X_{k+1}, (Λ_{k+1} ≥ 0).
  If F_ϕ_{k+1}(X_{k+1}) > F_ϕ_{k+1}(X_k), set X_k := X_{k+1} and go back to Base iteration;
  otherwise, set k := k + 1 and go to Iteration k.
Outline
Framework: Basics
Framework: Convergence
Application to Constraint Reduction: CR-MPC
Numerical Results
Framework: Requirements for Base Iteration
When applied to (Pϕ) for some ϕ > 0, the Base Iteration must

- [feasibility] produce a feasible X_{k+1} when given a feasible X_k,

and, when applied iteratively (w/o ϕ updates), it must

- [convergence] generate sequences (primal) {X_k} and (dual) {Λ_k} (the latter possibly produced by the Master Algorithm) which, if (z_k, y_k) is bounded, "converge to primal(–dual) optimality" on some subsequence, in the sense that
  - if (Pϕ) is bounded then

        lim inf_{k→∞} max{ ‖diag(s_k, z_k, t_k^+, t_k^−) Λ_k‖, ‖H̄ X_k + c̄_ϕ − Ā^T Λ_k‖ } = 0,

    where s := Ax + z − b, t^+ := Cx + y − d, and t^− := −Cx + y + d (slack variables),
  - and otherwise {X_k} is unbounded.

Here H̄ := diag(H, 0, 0),

    Ā := [  A   I   0
            0   I   0
            C   0   I
           −C   0   I ],     c̄_ϕ := [ c; ϕ1; ϕ1 ].
Framework: Requirements for Penalty Update
The Penalty Update rule must see to it that

- {ϕ_k} is positive, nondecreasing, and either unbounded or eventually constant.
- {ϕ_k} can be bounded only if {(z_k, y_k)} also is; and ϕ_k's eventual value ϕ̄ satisfies ϕ̄ > lim inf_{k→∞} ‖[π_k; η_k; ζ_k]‖∞.
- If {ϕ_k} is unbounded, then either (i) {(z_k, y_k)} is unbounded, or (ii) there exists an infinite index set K ⊆ {1, 2, ...} such that
  - ϕ_k / max{1, ‖[π_k; η_k − ζ_k]‖},
  - diag(s_k, z_k, t_k^+, t_k^−) Λ_k,
  - H̄ X_k + c̄_{ϕ_k} − Ā^T Λ_k, and (H x_k + c − A^T π_k − C^T (η_k − ζ_k))^T x_k

  are bounded on K.
Framework: Possible Penalty Update Rule
Given σ1 > 0, σ2 > 1, γ > 0, ρ > 0,

- Set ϕ+ := ϕ;
- If ϕ < ‖(1/γ)[z; y]‖∞, then ϕ+ := σ2 ‖(1/γ)[z; y]‖∞;
- If ϕ+ < ‖[π; η; ζ]‖∞ + σ1 and
  - max{ ‖diag(s, z, t^+, t^−) Λ‖, ‖H̄ X + c̄_ϕ − Ā^T Λ‖ } < ρ (near optimal),
  - |(Hx + c − A^T π − C^T (η − ζ))^T x| < ρ (even if unbounded x sequence),

  then ϕ+ := σ2 (‖[π; η; ζ]‖∞ + σ1).

This rule satisfies all Requirements for Penalty Update.
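In code, the rule above is only a few lines. A hedged Python sketch (the two residuals are assumed to be computed by the caller from the current iterate; the parameter defaults σ1 = 1, σ2 = 2, γ = 1, ρ = 10⁻² are illustrative, not taken from the paper):

```python
import numpy as np

def penalty_update(phi, z, y, pi, eta, zeta, kkt_err, grad_err,
                   sigma1=1.0, sigma2=2.0, gamma=1.0, rho=1e-2):
    """One pass of the penalty-update rule.

    kkt_err:  max{ ||diag(s,z,t+,t-) Lam||, ||Hbar X + cbar_phi - Abar^T Lam|| }
    grad_err: |(Hx + c - A^T pi - C^T (eta - zeta))^T x|
    (both supplied by the caller)."""
    phi_plus = phi
    zy_norm = np.linalg.norm(np.concatenate([z, y]) / gamma, np.inf)
    if phi < zy_norm:                           # penalty too small vs. violation
        phi_plus = sigma2 * zy_norm
    mult_norm = np.linalg.norm(np.concatenate([pi, eta, zeta]), np.inf)
    if phi_plus < mult_norm + sigma1 and kkt_err < rho and grad_err < rho:
        phi_plus = sigma2 * (mult_norm + sigma1)  # near optimal: exceed multipliers
    return phi_plus
```

Since σ2 > 1, both branches can only increase ϕ, so {ϕ_k} is automatically nondecreasing, as the requirements demand.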
Convergence of Master Algorithm
Theorem: Suppose A1–A2 hold and the Base Iteration and Penalty Update satisfy the listed requirements. Then

- ϕ_k is eventually constant;
- as k → ∞, (z_k, y_k) → 0 and x_k → F∗.
Outline
Framework: Basics
Framework: Convergence
Application to Constraint Reduction: CR-MPC
Numerical Results
Application to CR-MPC: Outline of CR-MPC Iteration
Given X primal-feasible, Λ ≥ 0:

1. Select working set Q ⊆ {1, ..., m};
2. Solve reduced Newton–KKT system (affine scaling) → (∆X^a, ∆Λ^a);
3. With q := |Q|, set µ(Q) := s_Q^T Λ_Q / q;
4. Solve reduced modified Newton–KKT system (corrector/centering) → (∆X^c, ∆Λ^c);
5. Set γ ∈ [0, 1]; set (∆X, ∆Λ) := (∆X^a, ∆Λ^a) + γ (∆X^c, ∆Λ^c);
6. Set α_p ∈ (0, 1]; set X+ := X + α_p ∆X primal-feasible.
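Step 6 is what keeps the iterate strictly feasible. One standard way to choose α_p is a fraction-to-the-boundary step along ∆X; the sketch below illustrates that idea only (the exact step-size rule in [LT19] involves additional safeguards we omit here).

```python
import numpy as np

def primal_step(X, dX, A_in, b_in, tau=0.995):
    """Largest step alpha in (0, 1] along dX keeping the slacks
    s = A_in X - b_in strictly positive, damped by tau < 1
    (a generic fraction-to-the-boundary rule)."""
    s = A_in @ X - b_in           # current slacks, assumed > 0
    ds = A_in @ dX
    blocking = ds < 0             # only decreasing slacks limit the step
    if blocking.any():
        alpha = min(1.0, tau * np.min(s[blocking] / -ds[blocking]))
    else:
        alpha = 1.0
    return X + alpha * dX, alpha
```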
Infeasible-Start CR-MPC: Check Requirements
[feasibility] Primal feasibility is obviously satisfied.

Additional assumption (so that the LI Assumption of [LT19] holds for (Pϕ)):

A3. ∀x ∈ R^n, {a_i : b_i − z̄ ≤ a_i^T x ≤ b_i} ∪ {c_i : i = 1, ..., p} is LI, with z̄ := max_k ‖z_k‖∞.

Note: While we (the authors of [LT19]) were not able to do away with an LI assumption in the analysis of [LT19], intuition and extensive numerical testing suggest that A3 can be dropped.

[convergence] Under A1–A3, suppose (z_k, y_k) is bounded. Then

- If (Pϕ) is bounded, then
  - Assumption A1 implies that {X_k} is bounded;
  - the claim then follows from [LT19].
- In the case of CR-MPC, the second alternative is vacuous: boundedness of {X_k} implies boundedness of (Pϕ).
Convergence of infeasible-start CR-MPC
With the stopping criterion

    ‖diag(s_k) π_k‖ < ε,   ‖H x_k + c − A^T π_k − C^T (η_k − ζ_k)‖ < ε,
    ‖[b − A x_k]^+‖ < ε,   ‖C x_k − d‖ < ε,

under A1–A3, the Master Algorithm with CR-MPC as base iteration (endowed with a constraint-selection rule that satisfies Condition CSR of [LT19], e.g., Rule R of [LT19]) enjoys the following properties:

- If ε = 0, {x_k} converges to F∗, while if ε > 0 the stopping criterion is satisfied after finitely many iterations.
- If (P)–(D) has a unique solution (x∗, π∗) at which SOSC holds with strict complementarity, then (X_k, Λ_k) → (X∗, Λ∗), and convergence is locally q-quadratic in (X_k, Λ_k), hence r-quadratic in x_k.
Outline
Framework: Basics
Framework: Convergence
Application to Constraint Reduction: CR-MPC
Numerical Results
Randomly Generated Problems
Problem setting:

    minimize_{x ∈ R^n}  (1/2) x^T H x + c^T x
    subject to  Ax ≥ b,  Cx = d.

- A ∼ N(0, 1), C ∼ N(0, 1), and c ∼ N(0, 1)
- x_feas ∼ U(0, 1), s_feas ∼ U(1, 2)
- b := A x_feas − s_feas, d := C x_feas (ensures strict feasibility)
- m = 10000, n between 10 and 200, and p = n/2

We consider the following two classes of Hessian matrices:

1. Strongly convex QP: diagonal H, diag(H) ∼ U(0, 1).
2. LP: H = 0.

We solved 20 randomly generated problems for each class of H and for each problem size, and report the results averaged over the 20 problems.
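The recipe above is easy to reproduce. A NumPy sketch (our own code, following the slide's distributions; `x_feas` is returned only so that strict feasibility can be checked):

```python
import numpy as np

def random_cqp(m, n, strongly_convex=True, seed=None):
    """Random test CQP per the slide: Gaussian A, C, c; a feasible point
    x_feas ~ U(0,1) with slack s_feas ~ U(1,2) baked into b, and d chosen
    so that C x_feas = d holds exactly."""
    rng = np.random.default_rng(seed)
    p = n // 2
    A = rng.standard_normal((m, n))
    C = rng.standard_normal((p, n))
    c = rng.standard_normal(n)
    x_feas = rng.uniform(0.0, 1.0, n)
    s_feas = rng.uniform(1.0, 2.0, m)
    b = A @ x_feas - s_feas               # A x_feas - b = s_feas >= 1 > 0
    d = C @ x_feas                        # equality constraints hold exactly
    H = np.diag(rng.uniform(0.0, 1.0, n)) if strongly_convex else np.zeros((n, n))
    return H, c, A, b, C, d, x_feas
```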
Randomly Generated Problems
[Figure: iteration count and CPU time (sec) vs. n, for (a) strongly convex QP and (b) LP.]
Randomly Generated Problems w/o Equality Constraints
[Figure: iteration count and CPU time (sec) vs. n, for (a) strongly convex QP and (b) LP.]
Data Fitting Problems
Regularized minimax data fitting problem (taken from [WNTO12]):

    minimize_{x ∈ R^n}  ‖Ax − b‖∞ + (1/(2α)) x^T H x

    ⇒   minimize_{x ∈ R^n, u ∈ R}  u + (1/(2α)) x^T H x
        subject to  Ax − b ≥ −u1,  −Ax + b ≥ −u1.

- b: noisy data measurement from a target function g.
- A: trigonometric basis; x: expansion coefficients.
- H: regularization matrix; α: regularization parameter.
- m = 10000, n from 10 to 200.

For each choice of g and for each problem size, we solved the problem 20 times and report the results averaged over the 20 problems.
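The conversion to standard CQP form is mechanical: one extra variable u and a doubled inequality block. A NumPy sketch (our own code; we read the regularization weight as 1/(2α), which is our interpretation of the slide):

```python
import numpy as np

def minimax_to_cqp(A, b, H, alpha):
    """Rewrite  min ||Ax - b||_inf + (1/(2*alpha)) x^T H x  as a CQP in
    w = [x; u]:
        min (1/2) w^T H_qp w + c_qp^T w   s.t.  A_in w >= b_in,
    where the constraints encode  -u*1 <= Ax - b <= u*1."""
    m, n = A.shape
    ones = np.ones((m, 1))
    A_in = np.block([[A, ones],         #  Ax - b >= -u1
                     [-A, ones]])       # -Ax + b >= -u1
    b_in = np.concatenate([b, -b])
    H_qp = np.zeros((n + 1, n + 1))
    H_qp[:n, :n] = H / alpha            # (1/2) w^T H_qp w = (1/(2 alpha)) x^T H x
    c_qp = np.zeros(n + 1)
    c_qp[n] = 1.0                       # linear term picks out u
    return H_qp, c_qp, A_in, b_in
```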
Data Fitting Problems
[Figure: iteration count and CPU time (sec) vs. n, for (a) g(t) = sin(10t) cos(25t²) and (b) g(t) = sin(5t³) cos²(10t).]
Concluding remarks
- An exact-penalty-function-based framework was proposed that allows "feasible" CQP algorithms to accept infeasible starts, and pure-inequality algorithms to handle equality constraints.
- The framework caters to a broad class of algorithms, and allows for inexact computation of the search direction.
- Preliminary numerical testing suggests that, on imbalanced problems, an infeasible-start version of a constraint-reduced algorithm (IS-CRMPC) often performs better than broadly used codes.
- Constraint-reduced IP could be viewed as a middle ground between IP and Simplex.
THANK YOU !!
These slides are available from Andre Tits's home page: https://user.eng.umd.edu/~andre/
[LT19: COAP, Vol. 72, No. 3, 727–768, 2019]
[BT97: Bertsimas, Tsitsiklis: Introduction to Linear Optimization, 1997]
[W10: Winternitz, PhD Thesis, 2010: http://hdl.handle.net/1903/10400]
[WNTO12: Winternitz et al.: COAP, Vol. 51, No. 1, 1001–1036, 2012]

Some other recent work that combines penalty functions and interior point:

[Benson, Shen, Shanno, Vanderbei: COAP, Vol. 34, No. 2, 155–182, 2006]
[Curtis: Math. Prog. Comp., Vol. 4, 181–209, 2012]

Early, fundamental work by the late Andy Conn:

[Coleman, Conn: SINUM, Vol. 10, No. 4, 760–784, 1973]
How to select Q?
- Many constraint selection rules have been proposed.
- We give a condition on the selection rule that guarantees convergence of the regularized CR-MPC algorithm.
- All selection rules that we are aware of satisfy this condition.

Condition (CSR):

1. When {(x_k, λ_k)} is bounded away from optimality, Q_k includes the indexes of every active constraint at limit points x′ when such limit points are approached;
2. when {x_k} converges to a solution point x∗, Q_k eventually includes the indexes of every active constraint at x∗.
Proposed Constraint Selection Rule

Rule R:

- Parameters: δ̄ > 0, 0 < β < θ < 1.
- Input: iteration k; slack variable s_k; errors E_min (when k > 0) and E_k := E(x_k, λ_k); threshold δ_{k−1}.
- Output: working set Q_k; threshold δ_k; error E_min.
- If k = 0: δ_k := δ̄, E_min := E_k;
  else if E_k ≤ β E_min: δ_k := θ δ_{k−1}, E_min := E_k;
  else: δ_k := δ_{k−1}.
- Select Q_k := {i ∈ {1, ..., m} : s_i^k ≤ δ_k}.
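Rule R is cheap: one error comparison plus a threshold test per constraint. A Python sketch (parameter defaults are illustrative, not the paper's; E stands for whatever optimality-error measure the algorithm tracks):

```python
import numpy as np

def rule_R(k, s, E_k, E_min, delta_prev, delta_bar=1.0, beta=0.4, theta=0.8):
    """Rule R: shrink the slack threshold delta_k geometrically each time
    the optimality error E_k drops below beta * E_min; the working set Q_k
    collects the near-active constraints."""
    if k == 0:
        delta_k, E_min = delta_bar, E_k
    elif E_k <= beta * E_min:
        delta_k, E_min = theta * delta_prev, E_k
    else:
        delta_k = delta_prev
    Q_k = np.flatnonzero(s <= delta_k)    # indexes i with s_i^k <= delta_k
    return Q_k, delta_k, E_min
```

As the error shrinks, δ_k shrinks too, so far-from-active constraints are progressively dropped from the reduced Newton–KKT systems.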