Duality Theory for Convexification of Control Problems and Applications
Behçet Açıkmeşe
Department of Aerospace Engineering and Engineering Mechanics
University of Texas at Austin
May, 2014
Convexification
Control Problems
Non-convex Formulation → Convex Formulation
Infinite dimensional: Pontryagin’s Maximum Principle
Finite dimensional: Duality theory of convex optimization
Guaranteed Computation of Optimal Solutions
Convexity Enables Reliable Automated Solutions
Convex Optimization
- Convex cost, convex constraints
- Solved by IPMs (Interior Point Methods)
- Guaranteed global optimum
- Polynomial-time complexity
- No human in the loop needed

Non-Convex Optimization
- Non-convex cost, non-convex constraints
- Solved by Sequential QP, Trust Region methods, Simulated Annealing, Genetic Programming, ...
- No guarantees of convergence or complexity
- Requires an expert in the loop
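The contrast above can be illustrated with a small sketch. It runs plain gradient descent (a stand-in chosen for brevity, not one of the solvers named on the slide) on two hypothetical one-dimensional cost functions: a convex one, where every starting point reaches the same minimizer, and a non-convex one, where the result depends on the starting point.

```python
def gradient_descent(grad, x0, step=0.01, iters=5000):
    """Plain fixed-step gradient descent on a scalar function."""
    x = float(x0)
    for _ in range(iters):
        x -= step * grad(x)
    return x

def convex_grad(x):
    # Gradient of the convex cost f(x) = (x - 1)^2 (illustrative choice).
    return 2.0 * (x - 1.0)

def nonconvex_grad(x):
    # Gradient of the non-convex cost f(x) = x^4 - 3x^2 + x (illustrative choice),
    # which has two separate local minima.
    return 4.0 * x**3 - 6.0 * x + 1.0

starts = [-2.0, 2.0]
convex_results = [gradient_descent(convex_grad, s) for s in starts]
nonconvex_results = [gradient_descent(nonconvex_grad, s) for s in starts]

print("convex:", convex_results)          # both starts reach the same point
print("non-convex:", nonconvex_results)   # each start reaches a different local minimum
```

In the non-convex case nothing in the algorithm reveals which of the two stationary points is the better one, which is the sense in which an expert is needed in the loop.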
Convexification: main contribution
Soft Landing Control Problem
Past Landing Applications
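[Figure: entry, descent, and landing sequence toward the TARGET. Altitude markers: 125 km, ~10 km, 1-3 km. Phases: Entry Phase, Parachute Phase, Powered Descent Phase. Event: backshell separation.]
Precision landing: < 1-2 km precision
Meditch, 64; Klumpp (Apollo), 74; NP methods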
Precision Landing
[Figure: Entry Phase, Parachute Phase, and Powered Descent (PD) Phase leading to the landing location.]
Error accumulated in the entry and parachute phases
Divert distance
Problem Description
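Find the fuel-optimal trajectory from a given initial state to a given final state.

Dynamics:
$\ddot{r} = \frac{T_c}{m} + g + 2\,\dot{r} \times \omega + (\omega \times r) \times \omega$
$\dot{m} = -\alpha \|T_c\|$

Control Constraints:
$0 < \rho_1 \le \|T_c\| \le \rho_2$

Velocity bound:
$\|\dot{r}\| \le V_{max}$

Glide slope cone:
$\tan\gamma\, \|r_d\| \le r_v$

[Figure: glide slope cone with angle $\gamma$ and gravity vector $g$; thrust magnitude versus time, bounded between $\rho_1$ and $\rho_2$.]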
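A minimal sketch of the dynamics on this slide, assuming they read $\ddot{r} = T_c/m + g + 2\,\dot{r} \times \omega + (\omega \times r) \times \omega$ and $\dot{m} = -\alpha\|T_c\|$, with $\omega$ the planet's angular velocity. The function names and all numerical values are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def translational_acceleration(r, v, thrust, mass, g, omega):
    """Return r_ddot = Tc/m + g + 2 v x omega + (omega x r) x omega."""
    r, v, thrust, g, omega = (np.asarray(a, dtype=float)
                              for a in (r, v, thrust, g, omega))
    coriolis = 2.0 * np.cross(v, omega)                 # equals -2 omega x v
    centrifugal = np.cross(np.cross(omega, r), omega)   # equals -omega x (omega x r)
    return thrust / mass + g + coriolis + centrifugal

def mass_rate(thrust, alpha):
    """Return m_dot = -alpha * ||Tc||."""
    return -alpha * np.linalg.norm(thrust)

# Hypothetical numbers, for illustration only.
g = np.array([0.0, 0.0, -3.71])
accel_no_rotation = translational_acceleration(
    r=[100.0, 0.0, 1500.0], v=[10.0, 0.0, -50.0],
    thrust=[0.0, 0.0, 8000.0], mass=1000.0, g=g, omega=[0.0, 0.0, 0.0])
print(accel_no_rotation)
print(mass_rate([3.0, 4.0, 0.0], alpha=0.5))
```

The mass equation is where the non-convexity of the thrust bounds matters: fuel use depends on $\|T_c\|$, which is bounded away from zero by $\rho_1$.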
Thrust Pointing Constraint
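[Figure: pointing cone about the direction $n$ with half-angle $\theta$, containing the thrust vector $T_c(t)$.]
Pointing cone: required due to camera pointing.
The lander vehicle must turn to burn to obtain the desired thrust vector.
Pointing envelope
Intersection of the thrust bounds (inner radius $\rho_1$) with the pointing envelope: non-convex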
Lossless Convexification with Thrust Bound and Pointing Constraints

Original Problem:
$0 < \rho_1 \le \|T_c(t)\| \le \rho_2$
$n^T T_c(t) \ge \cos\theta\, \|T_c(t)\|$

Relaxed Problem (slack variable $\Gamma(t)$):
$\|T_c(t)\| \le \Gamma(t)$, $\rho_1 \le \Gamma(t) \le \rho_2$
$n^T T_c(t) \ge \cos\theta\, \Gamma(t)$

Optimal solutions of the two problems are the same.

[Figure: convexification without pointing and convexification with pointing; with pointing, the relaxed set is the convex intersection with the pointing half-space.]
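The relation between the two constraint sets can be checked numerically. The sketch below assumes the original constraints are the thrust bounds plus the pointing cone, and the relaxed ones replace $\|T_c\|$ by a slack variable $\Gamma$ (called `gamma` in the code). Function names and numbers are hypothetical. It shows that the relaxed set admits points the original set rejects, and that the two agree once the slack is tight, $\|T_c\| = \Gamma$.

```python
import numpy as np

def in_original_set(T, n, theta, rho1, rho2):
    """rho1 <= ||T|| <= rho2 and n^T T >= cos(theta) * ||T||."""
    norm = np.linalg.norm(T)
    return bool(rho1 <= norm <= rho2 and n @ T >= np.cos(theta) * norm)

def in_relaxed_set(T, gamma, n, theta, rho1, rho2):
    """||T|| <= gamma, rho1 <= gamma <= rho2 and n^T T >= cos(theta) * gamma."""
    return bool(np.linalg.norm(T) <= gamma
                and rho1 <= gamma <= rho2
                and n @ T >= np.cos(theta) * gamma)

n = np.array([0.0, 0.0, 1.0])
theta = np.deg2rad(60.0)
rho1, rho2 = 2.0, 10.0

# Slack not tight: feasible for the relaxation, infeasible for the original problem.
T_loose = np.array([0.0, 0.0, 1.5])
print(in_relaxed_set(T_loose, 2.0, n, theta, rho1, rho2),
      in_original_set(T_loose, n, theta, rho1, rho2))

# Slack tight (gamma = ||T||): feasible for both.
T_tight = np.array([1.0, 0.0, 2.0])
gamma_tight = np.linalg.norm(T_tight)
print(in_relaxed_set(T_tight, gamma_tight, n, theta, rho1, rho2),
      in_original_set(T_tight, n, theta, rho1, rho2))
```

Losslessness is the statement that optimal solutions of the relaxed problem always have a tight slack, so the extra points never appear in an optimum.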
Lossless Convexification
Proof-1
Assumption: $\omega \times n \ne 0$ and $N(\omega \times (\omega \times n)) \ne 0$.
When the above does not hold, we have the lossless convexification for the same optimal control problem with a different [...] that is arbitrarily close to [...].
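Pontryagin’s Maximum Principle (Necessary Conditions), with Hamiltonian $H$:
(i) Co-state conditions: $\forall t \in [0, t_f^*]$,
$(\lambda_0, \lambda(t), \eta(t)) \ne 0$, where $\lambda_0$ is the scalar cost multiplier
$\dot{\lambda}(t) = -A(\omega)^T \lambda(t)$
$\dot{\eta}(t) = \frac{\lambda(t)^T B\, T_c(t)}{m(t)^2}$
(ii) Pointwise Maximum Principle:
$T_c^*(t) = \arg\max_{T_c}\; y(t)^T T_c$, where $y(t)^T := \lambda(t)^T B$, a.e. $[0, t_f^*]$
(iii) Transversality Conditions:
$\eta(t_f^*) = 0$ and $H|_{t = t_f^*} = 0$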
Proof-2
Main Technical Lemma: for
$\dot{\lambda}(t) = -A(\omega)^T \lambda(t)$
$y(t) = B^T \lambda(t)$,
(i) $y$ is analytic, and $y(t) = 0$
– either $\forall t \in [0, t_f]$
– or for a countable number of instances
(ii) $y(t) = -\beta(t)\, n$, $\beta(t) > 0$, at most at a countable number of instances in $[0, t_f]$

• $y = 0$ implies $\lambda = 0$ from observability of $(B^T, -A(\omega)^T)$
• $\lambda = 0$ implies $\eta = 0$ and $\lambda_0 = 0$ (the cost multiplier) using co-state dynamics and transversality
Therefore: $y = 0$ on all of $[0, t_f]$ contradicts the co-state not being zero.

From the Lemma:
1. $y(t) \ne 0$ a.e. $[0, t_f^*]$
2. $y(t) \ne -\beta(t)\, n$ a.e. $[0, t_f^*]$ for $\beta(t) > 0$
Proof-3
We use the Pointwise Maximum Principle:
$T_c^*(t) = \arg\max_{T_c \in U(\Gamma)}\; y(t)^T T_c(t)$

An optimal solution of the following problem is an extreme point of $U(\Gamma)$:
$\max_{T_c}\; y^T T_c$ s.t. $T_c \in U(\Gamma)$
where $U(\Gamma) := \{T_c : \|T_c\| \le \Gamma,\ n^T T_c \ge \cos\theta\, \Gamma\}$, and $y \ne 0$ and $y \ne -\beta n$ for any $\beta > 0$. Consequently an optimal solution $T_c^*$ satisfies $\|T_c^*\| = \Gamma$.
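The extreme-point claim can be checked numerically in two dimensions. The sketch assumes $U(\Gamma)$ is the set of thrust vectors with norm at most $\Gamma$ (`gamma`) and $n^T T_c \ge \cos\theta\,\Gamma$, with $n = (0, 1)$. Because a linear objective over a convex set attains its maximum on the boundary, only the boundary (a circular arc plus a flat face) is sampled. All names and numbers are illustrative assumptions.

```python
import numpy as np

def boundary_points(gamma, theta, num=2001):
    """Sample the boundary of U(gamma) in 2-D with n = (0, 1)."""
    # Arc: directions within angle theta of n, scaled to radius gamma.
    phi = np.linspace(-theta, theta, num)
    arc = gamma * np.column_stack([np.sin(phi), np.cos(phi)])
    # Flat face: the chord n^T T = gamma * cos(theta).
    half_width = gamma * np.sin(theta)
    xs = np.linspace(-half_width, half_width, num)
    chord = np.column_stack([xs, np.full(num, gamma * np.cos(theta))])
    return np.vstack([arc, chord])

def maximizer(y, gamma, theta):
    """Return the sampled boundary point maximizing y^T T."""
    pts = boundary_points(gamma, theta)
    return pts[np.argmax(pts @ np.asarray(y, dtype=float))]

gamma, theta = 2.0, np.deg2rad(40.0)
for y in [(1.0, -3.0), (-2.0, 0.5), (0.0, 1.0), (0.3, -1.0)]:
    T_star = maximizer(y, gamma, theta)
    print(y, np.linalg.norm(T_star))   # norm equals gamma in every case

# Excluded case y = -beta * n: every point of the flat face is optimal,
# including the midpoint, whose norm is strictly below gamma.
midpoint = np.array([0.0, gamma * np.cos(theta)])
```

The excluded direction is the one case where the maximizer is not unique and the slack can fail to be tight, which is why the lemma needs $y \ne -\beta n$ almost everywhere.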
Further Generalizations
• Linear time-varying systems + non-convex control constraints (Automatica 2011)
• Nonlinear systems + non-convex control constraints (Sys&Cont Letters 2012)
• Linear systems + active state constraints + non-convex control constraints (Automatica 2014)
Needed geometric control theory:
– Controlled invariant subspaces
– Strong controllability/observability
Custom IPMs for Onboard Optimization
Primal: $\min\; c^T x$ s.t. $Ax = b$, $x \in K$
Dual: $\max\; b^T y$ s.t. $A^T y + s = c$, $s \in K$
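For intuition on the primal-dual pair, the sketch below takes the simplest self-dual cone, the nonnegative orthant (so the pair is a linear program), and checks the identity $c^T x - b^T y = s^T x$ that an IPM drives to zero. The problem data are a made-up two-variable example, not from the talk.

```python
import numpy as np

# Hypothetical LP data: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0.
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])

def duality_gap(x, y):
    """Return (c^T x - b^T y, s^T x) for a primal-feasible x and dual-feasible y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    s = c - A.T @ y                      # dual slack from A^T y + s = c
    assert np.allclose(A @ x, b) and np.all(x >= 0), "x must be primal feasible"
    assert np.all(s >= 0), "y must be dual feasible"
    return float(c @ x - b @ y), float(s @ x)

gap_interior = duality_gap([0.5, 0.5], [0.0])   # feasible but not optimal
gap_optimal = duality_gap([1.0, 0.0], [1.0])    # optimal pair
print(gap_interior, gap_optimal)
```

Both entries of each pair agree, and the gap is zero exactly at the optimal pair. For a second-order cone $K$ the same identity holds with $s^T x \ge 0$ following from self-duality of the cone.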
Solution via Generic Solvers: Problem instance → Generic IPM solver → Optimal solution. Computation time: T secs.
Solution via Custom Solvers: Problem class → Solver customization → Custom IPM solver; Problem instance → Custom IPM solver → Optimal solution. Computation time: T/100 secs.
[Figure: log mean runtime (s), from 10^-4 to 10^0, versus solution variable size, from 0 to 15000, for SDPT3-v4.0, SeDuMi, ECOS, General Bsocp, and Custom Bsocp.]
Method summary:
- Primal-dual IPM
- Homogeneous self-dual embedding
- Newton search directions with NT scalings
- Mehrotra’s heuristic
- Central path following method
Minimum Time Rendezvous with Differential Drag
Minimum Time Rendezvous using Differential Drag
$u_e = u_k - u_0 \in \{-1, 0, 1\}$
Non-convex control constraints: $u_k \in \{-1, 0\}$

$\min\; t_f$ subject to
$\dot{x}_i = A x_i + B(u_i - u_0)$, $i = 1, 2, ..., N$
$x_i(0) = x_{i,0}$, $x_i(t_f) = 0$, $i = 0, 1, ..., N$
$u_i(t) \in \{-1, 0\}$, $i = 0, 1, ..., N$

Convexified control constraint: $u_i(t) \in [-1, 0]$
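Why relaxing $\{-1, 0\}$ to $[-1, 0]$ can be lossless is visible in the pointwise maximization step: a linear function of $u$ on an interval peaks at an endpoint. The sketch below assumes a hypothetical scalar switching function $p(t)$ (a name introduced here, standing in for the co-state term multiplying $u$) and compares the closed-form maximizer with a brute-force search over the relaxed interval.

```python
import numpy as np

def relaxed_maximizer(p):
    """Maximize p*u over u in [-1, 0]: u = -1 where p < 0, u = 0 where p > 0."""
    p = np.asarray(p, dtype=float)
    return np.where(p < 0.0, -1.0, 0.0)

def brute_force_maximizer(p, grid_size=101):
    """Search a grid over the relaxed interval [-1, 0] for each value of p."""
    p = np.asarray(p, dtype=float)
    grid = np.linspace(-1.0, 0.0, grid_size)
    return grid[np.argmax(np.outer(p, grid), axis=1)]

p_samples = [-2.0, -0.1, 0.3, 1.5]      # hypothetical nonzero switching values
u_closed_form = relaxed_maximizer(p_samples)
u_brute_force = brute_force_maximizer(p_samples)
print(u_closed_form, u_brute_force)     # both take values only in {-1, 0}
```

At $p = 0$ every $u$ in the interval is optimal, so the argument needs $p(t) \ne 0$ almost everywhere, which mirrors the role of the technical lemma in the soft landing proof.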
Minimum Time Rendezvous using Differential Drag
Method                                 CPU Time (msec), 2-vehicle   CPU Time (msec), 5-vehicle
Custom IPM                             6                            29
Gurobi (best commercial MICP solver)   278                          >300,000

Lossless Convexification
Custom IPMs for fast computation
Minimum Time Rendezvous
Fmincon, DIDO, and GPOPS could not converge when the MILP formulation is used rather than the convexified problem description. Note that DIDO and GPOPS are parsers of trajectory optimization problems that call 3rd party NLP solvers.

Table 1: Comparison of methods.
Method       Comp. Time (s)   Flight Time (hr)   Switches   Guarantee
Analytical   0                5.47               5          yes
Custom LP    0.006            4.09               3          yes
Gurobi       0.031            4.09               3          yes
Linprog      0.283            4.09               3          yes
Fmincon      1.32             4.09               3          no
DIDO         1.73             4.09               3          no
GPOPS        1.87             4.09               3          no
Vehicle Swarms - Coordination with Minimal Communication
Guiding Vehicle Swarms
Swarm Density Evolves as a Markov chain:
$x(t+1) = M x(t)$
$\lim_{t \to \infty} x(t) = v$ s.t. $Mv = v$

The core idea behind control of swarms is controlling the swarm density, rather than each individual agent, to achieve mission goals.
• Can be fully decentralized
• Converges to a desired density distribution
• Can repair damage to the desired distribution
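A minimal sketch of the density evolution $x(t+1) = M x(t)$, using a hypothetical 3-bin column-stochastic matrix $M$ and desired density $v$ chosen by hand so that $Mv = v$. Neither comes from the talk.

```python
import numpy as np

# Hypothetical 3-bin example: columns of M sum to one and M v = v.
M = np.array([[0.50, 0.25, 0.00],
              [0.50, 0.50, 0.50],
              [0.00, 0.25, 0.50]])
v = np.array([0.25, 0.50, 0.25])

def propagate(M, x0, steps):
    """Iterate x(t+1) = M x(t) for the given number of steps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = M @ x
    return x

x_final = propagate(M, [1.0, 0.0, 0.0], steps=60)   # whole swarm starts in bin 0
print(x_final)                                       # approaches v
```

Restarting the iteration from any other probability vector, for example the density left after part of the swarm is lost, converges to the same $v$, which is the self-repair property listed above.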
Guiding the Ensemble
$x_k[i](t) := \mathrm{prob}(r_k(t) \in R_i)$, $i = 1, ..., m$, $k = 1, ..., N$
Probability of finding agent “k” in bin “i” ($m$: number of bins, $N$: number of agents)

$M_k[i, j](t) := \mathrm{prob}(r_k(t+1) \in R_i \mid r_k(t) \in R_j)$, $\forall i, j = 1, \ldots, m$, $k = 1, \ldots, N$, $t = 0, 1, 2, \ldots$
Probability of agent “k” transitioning from bin “j” to “i”
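The agent-level view can be sketched as a Monte Carlo simulation: each agent knows only its own bin and samples its next bin from the matching column of $M$, with no communication. The matrix, the target density, the swarm size, and the seed below are hypothetical choices for illustration, and all agents share the same constant $M$.

```python
import numpy as np

# Hypothetical 3-bin example: M[i, j] = prob(next bin is i | current bin is j).
M = np.array([[0.50, 0.25, 0.00],
              [0.50, 0.50, 0.50],
              [0.00, 0.25, 0.50]])
v = np.array([0.25, 0.50, 0.25])   # satisfies M v = v

def step_agents(bins, M, rng):
    """Each agent independently samples its next bin from the column for its current bin."""
    new_bins = np.empty_like(bins)
    for j in range(M.shape[0]):
        idx = np.flatnonzero(bins == j)
        if idx.size:
            new_bins[idx] = rng.choice(M.shape[0], size=idx.size, p=M[:, j])
    return new_bins

def simulate(num_agents=20000, steps=30, seed=0):
    """Return the empirical swarm density after the given number of steps."""
    rng = np.random.default_rng(seed)
    bins = np.zeros(num_agents, dtype=int)      # every agent starts in bin 0
    for _ in range(steps):
        bins = step_agents(bins, M, rng)
    return np.bincount(bins, minlength=M.shape[0]) / num_agents

density = simulate()
print(density)   # close to v, up to sampling noise
```

The empirical density matches $v$ only up to sampling noise that shrinks as the number of agents grows.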
Ergodicity and Motion Constraints
We use Perron-Frobenius theory of nonnegative matrices and Lyapunov theory to derive the following LMIs.

Stochasticity and steady-state:
$\mathbf{1}^T M = \mathbf{1}^T$, $M \ge 0$, $Mv = v$

Motion (transition) constraints:
$(\mathbf{1}\mathbf{1}^T - A_a^T) \odot M = 0$

Convergence:
$\begin{bmatrix} \lambda^2 P & (M - v\mathbf{1}^T)^T G^T \\ G(M - v\mathbf{1}^T) & G + G^T - P \end{bmatrix} \succeq 0$, $P = P^T \succ 0$
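A small checker for a candidate Markov matrix, as a sketch. It tests the stochasticity, steady-state, and motion constraints directly. For convergence it does not solve the LMI from the slide. Instead it reports the spectral radius of $M - v\mathbf{1}^T$, which is below one exactly when $x(t) \to v$ for every initial probability vector. The matrix, density, and adjacency pattern are hypothetical.

```python
import numpy as np

def check_markov_design(M, v, adjacency, tol=1e-9):
    """Check a candidate M against the design constraints for target density v."""
    m = M.shape[0]
    ones = np.ones(m)
    return {
        # 1^T M = 1^T and M >= 0
        "column_stochastic": bool(np.allclose(ones @ M, ones, atol=tol)
                                  and np.all(M >= -tol)),
        # M v = v
        "steady_state": bool(np.allclose(M @ v, v, atol=tol)),
        # (1 1^T - A_a^T) o M = 0: no probability on disallowed transitions
        "motion": bool(np.allclose((1.0 - adjacency.T) * M, 0.0, atol=tol)),
        # x(t) - v = (M - v 1^T)^t (x(0) - v), so this must be < 1
        "spectral_radius": float(np.max(np.abs(
            np.linalg.eigvals(M - np.outer(v, ones))))),
    }

M = np.array([[0.50, 0.25, 0.00],
              [0.50, 0.50, 0.50],
              [0.00, 0.25, 0.50]])
v = np.array([0.25, 0.50, 0.25])
A_a = np.array([[1, 1, 0],
                [1, 1, 1],
                [0, 1, 1]])        # bins 0 and 2 are not adjacent

report = check_markov_design(M, v, A_a)
print(report)
```

The LMI form matters for synthesis rather than analysis: it is convex in $M$ for fixed $G$ and $\lambda$, which a spectral radius constraint is not.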
Safety Constraints
$H(t)\,x(t+1) + L(t)\,x(t) \le q(t)$, $\forall\, D(t)\,x(t) \le p(t)$

Density Upper Bounds:
$D(t) = H(t) = I$, $L(t) = 0$, $p(t) = d(t)$, $q(t) = d(t+1)$
$\Rightarrow$ $x(t) \le d(t)$

Bounds on Rate of Change of Density:
$D(t) = H(t) = I$, $L(t) = -I$, $p(t) = d(t)$, $q(t) = f(t)$
$D(t) = I$, $H(t) = -I$, $L(t) = I$, $p(t) = d(t)$, $q(t) = f(t)$
$\Rightarrow$ $|x(t+1) - x(t)| \le f$
Safety Constraints
Necessary and sufficient conditions for safety, with dual variables $S(t)$ and $y(t)$:
$S(t) \ge 0$, $\left[H(t)M(t) + L(t) + S(t) + y(t)\mathbf{1}^T\right] D^{-1}(t) \ge 0$,
$y(t) + q(t) \ge \left[H(t)M(t) + L(t) + S(t) + y(t)\mathbf{1}^T\right] D^{-1}(t)\, p(t)$

Now we have linear inequality constraints capturing the safety constraints.
These are the first convex synthesis conditions to design Markov chains with safety constraints.
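A sketch of how the conditions can be used as a certificate, specialized to a constant density upper bound (assuming $D = H = I$, $L = 0$, $p = q = d$). Given hand-picked dual variables $S$ and $y$, it checks the linear inequalities, then samples probability vectors $x \le d$ to confirm that $Mx \le d$ holds for all of them. The matrix, the bound $d$, and the certificate values are hypothetical.

```python
import numpy as np

def certifies_upper_bound(M, d, S, y, tol=1e-12):
    """Check the safety conditions for D = H = I, L = 0, p = q = d."""
    m = M.shape[0]
    K = M + S + np.outer(y, np.ones(m))          # H M + L + S + y 1^T
    return bool(np.all(S >= -tol)
                and np.all(K >= -tol)
                and np.all(y + d + tol >= K @ d))

def worst_sampled_violation(M, d, num_samples=20000, seed=0):
    """Largest value of (M x - d) over sampled probability vectors with x <= d."""
    rng = np.random.default_rng(seed)
    x = rng.dirichlet(np.ones(M.shape[0]), size=num_samples)   # rows sum to one
    feasible = x[np.all(x <= d, axis=1)]
    return float(np.max(feasible @ M.T - d)), feasible.shape[0]

M = np.array([[0.50, 0.25, 0.00],
              [0.50, 0.50, 0.50],
              [0.00, 0.25, 0.50]])
d = np.array([0.6, 0.7, 0.6])       # hypothetical density upper bound
S = np.zeros((3, 3))                # hand-picked certificate
y = np.array([0.0, -0.5, 0.0])

certified = certifies_upper_bound(M, d, S, y)
violation, num_feasible = worst_sampled_violation(M, d)
print(certified, violation, num_feasible)
```

The certificate works because, for any probability vector $x$ with $0 \le x \le d$, $Mx = Kx - Sx - y \le Kd - y \le d$, where $K = M + S + y\mathbf{1}^T \ge 0$. In a synthesis problem $M$, $S$, and $y$ would all be decision variables of one linear program rather than fixed by hand.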
The Proof Idea
Consider the density upper bound example:
argmax over $M \in \mathcal{M}$ (all feasible $M$ satisfying the other convex constraints), subject to the density upper bound
$e_i^T M x \le d[i]$, $\forall x \in [0, d]$, $i = 1, ..., m$

PRIMAL: $\min_{A\eta = b,\ \eta(x) \ge 0}\; c_i(M)^T \eta(x)$ (cost and constraints are linear in their arguments)
DUAL: $\max_{A^T y + s = c_i(M),\ s \ge 0}\; b^T y$ ($x$ does not show up)

The equivalent condition for safety is that there is a feasible dual with cost equal to or more than $-d[i]$, using ZERO DUALITY GAP.
Mathematics Behind Density Guidance
Swarm Density Control → Markov Chain theory for density evolution → Markov Chain Design → Convexification → Markov Chain Design via SDP → IPMs to solve the SDP design problem

Objectives:
(O1, O2, O6) Convergence, self-repair
(O3) Resource efficiency
(O4) Motion constraints
(O5) Density bounds for conflict avoidance

Method of Convexification:
Generalized Perron-Frobenius theory
Lyapunov theory
Matrix theory
Graph theory
Duality theory of convex optimization
Illustration of Evolving Swarm
[Figure: $\|x(t) - v\|_1$ versus time (0 to 150) for six cases: constant M with density constraints, time-varying M with density constraints, constant M with flow constraints, time-varying M with flow constraints, constant M unconstrained, time-varying M unconstrained.]
Illustration of Density Upper Bound
[Figure: density versus time (0 to 100) in ten panels, bin #1 through bin #10, with zoomed insets. Legend: time-varying M with density constraints, time-varying M unconstrained, density upper bound, desired density.]
Illustration of Density Rate
[Figure: density versus time (0 to 150) in ten panels, bin #1 through bin #10, with zoomed insets. Legend: flow rate for time-varying M with flow constraints, flow rate for time-varying M unconstrained, flow rate for constant M with flow constraints, bound on the flow rate.]