Living on the Edge - KU Leuven€¦ · Living on the Edge, ROKS 2013, Leuven, 10 July 2013 8....
Transcript of Living on the Edge - KU Leuven€¦ · Living on the Edge, ROKS 2013, Leuven, 10 July 2013 8....
Living on the Edge¦
Phase Transitions in Random Convex Programs
Joel A. TroppMichael B. McCoy
Computing + Mathematical Sciences
California Institute of Technology
Joint with Dennis Amelunxen and Martin Lotz (Manchester),
including work of Samet Oymak and Babak Hassibi (Caltech)
Research supported in part by ONR, AFOSR, DARPA, and the Sloan Foundation 1
.
Phase .Transitions
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 2
What is a Phase Transition?
Definition. A phase transition is a sharp change in the
behavior of a computational problem as its parameters vary.
Example: Sparse linear inverse problem with random data
§ Suppose x\ ∈ Rd has s nonzero entries
§ Acquire m random linear measurements of x\
zi =⟨gi, x
\⟩
for i = 1, . . . ,m
§ Solve a convex optimization problem to reconstruct x\ from the data
minimize ‖x‖1 subject to 〈gi, x〉 = zi for i = 1, . . . ,m
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 3
Example: Sparse Linear Inversion
0 25 50 75 1000
25
50
75
100
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 4
Research Challenge...
Understand and predictphase transitions
in random convex programs
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 5
Random Convex Programs
Examples...
§ Sensing. Collect random measurements; reconstruct via optimization
§ Statistics. Random data models; fit model via optimization
§ Coding. Random channel models; decode via optimization
Motivations...
§ Average-case analysis. Randomness describes “typical” behavior
§ Fundamental bounds. Opportunities and limits for convex methods
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 6
.
Warmup:.Regularized Denoising
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 7
Setup for Regularized Denoising
§ Let x\ ∈ Rd be “structured” but unknown
§ Let f : Rd → R be a convex function that measures “structure”
§ Observe z = x\ + σw where w ∼ normal(0, I)
§ Remove noise by solving the convex program*
minimize1
2‖z − x‖22 subject to f(x) ≤ f(x\)
§ Hope: The minimizer x approximates x\
*We assume the side information f(x\) is available. This is equivalent** to knowing the
optimal choice of Lagrange multiplier for the constraint.
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 8
Geometry of Denoising
{x : f(x) ≤ f(x\)}
x\ + D(f,x\)
z − x\σw
x
error
x\
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 9
The Risk of Regularized Denoising
Theorem 1. [Oymak & Hassibi 2013] Assume
§ We observe z = x\ + σw where w is standard normal
§ The vector x solves
minimize1
2‖z − x‖22 subject to f(x) ≤ f(x\)
Then
supσ>0
E ‖x− x\‖2
σ2= δ
(D(f,x\)
)where δ(D(f,x\)) denotes the statistical dimension of the descent cone
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 10
.
Statistical.Dimension
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 11
The Statistical Dimension
Definition. [Amelunxen, Lotz, McCoy, T 2013]
The statistical dimension of a closed, convex cone K is
δ(K) := E[‖ΠK(g)‖22
]where
§ ΠK is the Euclidean projection onto K
§ g is a standard normal vector
Intuition...
In stochastic geometry, a convex cone K with statistical dimension δ(K)
behaves like a subspace with dimension [δ(K)]
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 12
Basic Examples
Cone Notation Statistical Dimension
Subspace Lj j
Nonnegative orthant Rd+ 12d
Second-order cone Ld+1 12(d+ 1)
Real psd cone Sd+ 14d(d− 1)
Complex psd cone Hd+ 12d
2
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 13
Circular Cones
00
1/4
1/2
3/4
1
1
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 14
Descent Cones
Definition. The descent cone of a function f at a point x is
D(f,x) := {h : f(x+ εh) ≤ f(x) for some ε > 0}
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 15
Descent Cone of `1 Norm at Sparse Vector
0 1/4 1/2 3/4 10
1/4
1/2
3/4
1
1
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 16
Descent Cone of S1 Norm at Low-Rank Matrix
0 1/4 1/2 3/4 1
1/4
1/2
3/4
1
0
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 17
Statistical Dimension & Phase Transitions
§ Key Question: When do two randomly oriented cones strike?
§ Intuition: When do randomly oriented subspaces strike?
The Approximate Kinematic Formula
[Amelunxen, Lotz, McCoy, T 2013]
Let C and K be closed convex cones in Rd
δ(C) + δ(K) . d =⇒ P {C ∩QK = {0}} ≈ 1
δ(C) + δ(K) & d =⇒ P {C ∩QK = {0}} ≈ 0
where Q is a random orthogonal matrix
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 18
.
Regularized LinearInverse Problems
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 19
Setup for Linear Inverse Problems
§ Let x\ ∈ Rd be a structured, unknown vector
§ Let A ∈ Rm×d be a measurement operator
§ Observe z = Ax\
§ Find estimate x by solving convex program
minimize f(x) subject to Ax = z
§ Hope: x = x\
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 20
Geometry of Linear Inverse Problems
x\ + null(A)
{x : f(x) ≤ f(x\)}
x\
x\ + D(f,x\)
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 21
Linear Inverse Problems with Random Data
Theorem 2. [Amelunxen, Lotz, McCoy, T 2013] Assume
§ The vector x\ ∈ Rd is unknown
§ The observation z = Ax\ where A ∈ Rm×d is standard normal
§ The vector x solves
minimize f(x) subject to Ax = z
Then
m & δ(D(f,x\)
)=⇒ x = x\ whp
m . δ(D(f,x\)
)=⇒ x 6= x\ whp.
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 22
Sparse Reconstruction via `1 Minimization
0 25 50 75 1000
25
50
75
100
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 23
Low-Rank Recovery via S1 Minimization
0 10 20 300
300
600
900
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 24
.
DemixingStructured Signals
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 25
Setup for Demixing Problems
§ Let x\ ∈ Rd and y\ ∈ Rd be structured, unknown vectors
§ Let U ∈ Rd×d be a known orthogonal matrix
§ Observe z = x\ +Uy\
§ Reconstruct via convex program
minimize f(x) subject to g(y) ≤ g(y\)
x+Uy = z
§ Hope: (x, y) = (x\,y\)
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 26
Geometry of Demixing Problems
x\
{x : g(U∗(z − x)
)≤ g(y\)}
{x : f(x) ≤ f(x\)}
x\ + D(f,x\)
x\ −UD(g,y\)
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 27
Demixing Problems with Random Incoherence
Theorem 3. [Amelunxen, Lotz, McCoy, T 2013] Assume
§ The vectors x\ ∈ Rd and y\ ∈ Rd are unknown
§ The observation z = x\ +Qy\ where Q is random orthogonal
§ The pair (x, y) solves
minimize f(x) subject to g(y) ≤ g(y\)
x+Qy = z
Then
δ(D(f,x\)
)+ δ(D(g,y\)
). d =⇒ (x, y) = (x\,y\) whp
δ(D(f,x\)
)+ δ(D(g,y\)
)& d =⇒ (x, y) 6= (x\,y\) whp
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 28
Sparse + Sparse via `1 + `1 Minimization
0 25 50 75 1000
25
50
75
100
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 29
Low-Rank + Sparse via S1 + `1 Minimization
0 7 14 21 28 350
245
490
735
980
1225
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 30
.
Cone Programs withRandom Constraints
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 31
Cone Program with Random Constraints
Theorem 4. [Amelunxen, Lotz, McCoy, T 2013] Assume
§ The cone K is proper
§ The vectors u ∈ Rd and b ∈ Rm are standard normal
§ The matrix A ∈ Rm×d is standard normal
Consider the cone program
minimize 〈u, x〉 subject to Ax = b and x ∈ K
Then
m . δ(K) =⇒ the cone program is unbounded whp
m & δ(K) =⇒ the cone program is infeasible whp
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 32
Example: Some Random SOCPs
0 30 60 90 1200%
25%
50%
75%
100%
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 33
To learn more...
E-mail: [email protected]
Web: http://users.cms.caltech.edu/~mccoy
http://users.cms.caltech.edu/~jtropp
Papers:
§ MT, “Sharp recovery bounds for convex deconvolution, with applications.” arXiv cs.IT
1205.1580
§ ALMT, “Living on the edge: A geometric theory of phase transitions in convex
optimization.” arXiv cs.IT 1303.6672
§ Oymak & Hassibi, “Asymptotically exact denoising in relation to compressed sensing,”
arXiv cs.IT 1305.2714
§ More to come!
Living on the Edge, ROKS 2013, Leuven, 10 July 2013 34