From Seismology to Compressed Sensing and Back,
a Brief History of Optimization-Based Signal Processing
Carlos Fernandez-Granda, www.cims.nyu.edu/~cfgranda
Signal and Information Processing Seminar, Rutgers University
4/27/2016
Deconvolution in seismology
Compressed sensing
Back to deconvolution: the super-resolution problem
Super-resolution via semidefinite programming
Demixing of sines and spikes
Deconvolution in seismology
Seismology
Reflection seismology
[Figure: geological section, acoustic impedance, reflection coefficients]
Reflection seismology
[Figure: sensing setup; reflection coefficients, pulse, data]
Data ≈ convolution of pulse and reflection coefficients
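Sensing model for reflection seismology
Time domain: reflection coefficients ∗ pulse = data
Frequency domain: spectrum of reflection coefficients × spectrum of pulse = spectrum of data
Convolution in time = Pointwise multiplication in frequency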
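The identity on this slide can be checked numerically. The following is a minimal sketch, assuming circular convolution and the discrete Fourier transform so that the identity holds exactly; the toy sequences `reflection` and `pulse` are hypothetical and not taken from the talk.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a sequence x."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def circular_convolution(a, b):
    """Circular convolution of two sequences of equal length."""
    n = len(a)
    return [sum(a[m] * b[(t - m) % n] for m in range(n)) for t in range(n)]

# Hypothetical toy data: sparse reflection coefficients and a short pulse.
reflection = [0.0, 1.0, 0.0, 0.0, -0.5, 0.0, 0.0, 0.3]
pulse = [1.0, 0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0]

data = circular_convolution(reflection, pulse)

# Spectrum of the data versus pointwise product of the two spectra.
lhs = dft(data)
rhs = [r * p for r, p in zip(dft(reflection), dft(pulse))]
max_gap = max(abs(u - v) for u, v in zip(lhs, rhs))
```

Here `max_gap` is at the level of floating-point rounding. Frequencies at which the spectrum of the pulse is close to zero are lost in the data, which is one way to see why the deconvolution problem is ill posed.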
Ill-posed problem! How do we choose between signals consistent with data?
Geophysicists: Minimize the ℓ1 norm
Minimum ℓ1-norm estimate
minimize ||estimate||_1 subject to estimate ∗ pulse = data
[Figure: reflection coefficients and estimate]
It works, but under what conditions?
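One standard way to see that the minimum ℓ1-norm estimate is computable is to rewrite it as a linear program. The sketch below assumes real-valued signals; the auxiliary variables t_i are introduced here for illustration and do not appear in the talk.

```latex
\begin{aligned}
\underset{x,\;t}{\text{minimize}} \quad & \sum_i t_i \\
\text{subject to} \quad & -t_i \le x_i \le t_i \quad \text{for all } i, \\
& x \ast \text{pulse} = \text{data}.
\end{aligned}
```

At any solution t_i = |x_i|, so the objective equals the ℓ1 norm of x and the two problems have the same minimizers in x.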
Compressed sensing
Magnetic resonance imaging
Images are sparse/compressible
Wavelet coefficients
Magnetic resonance imaging
Data: Samples from spectrum
Problem: Sampling is time consuming (annoying, patient might move)
Images are compressible (≈ sparse)
Can we recover compressible signals from less data?
Compressed sensing
1. Undersample the spectrum randomly
[Figure: randomly undersampled data, 1D and 2D examples]
Compressed sensing
2. Solve the optimization problem
minimize ||estimate||_1 subject to frequency samples of estimate = data
[Figure: signal and estimate]
Compressed sensing in MRI
x2 Undersampling
Theoretical questions
1. Is the problem well posed?
2. Does ℓ1-norm minimization work?
Is the problem well posed?
[Figure: data = measurement operator applied to the spectrum of x]
Measurement operator = random frequency samples
What is the effect of the measurement operator on sparse vectors?
Are sparse submatrices always well conditioned?
Restricted isometry property (RIP)
An m × n matrix A satisfies the restricted isometry property if there is 0 < δ < 1 such that for any s-sparse vector x
$(1-\delta)\,\|x\|_2 \le \|Ax\|_2 \le (1+\delta)\,\|x\|_2$
Random Fourier matrices satisfy the RIP with high probability if m ≥ O(s) up to log factors (Candès, Tao 2006)
2s-RIP implies that for any s-sparse signals x1, x2 with data y1 = A x1, y2 = A x2
$\|y_2 - y_1\|_2 \ge (1-\delta)\,\|x_2 - x_1\|_2$
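The last implication follows in one line from the definition. A short derivation, writing y_i = A x_i for the data generated by each signal (notation assumed here) and using the fact that the difference of two s-sparse vectors has at most 2s nonzero entries:

```latex
\begin{align*}
y_2 - y_1 &= A x_2 - A x_1 = A\,(x_2 - x_1), \\
\|x_2 - x_1\|_0 &\le \|x_1\|_0 + \|x_2\|_0 \le 2s, \\
\|y_2 - y_1\|_2 &= \|A\,(x_2 - x_1)\|_2 \;\ge\; (1-\delta)\,\|x_2 - x_1\|_2 .
\end{align*}
```

In particular, two distinct s-sparse signals cannot produce the same data, which is the sense in which the problem is well posed.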
Theoretical questions
1. Is the problem well posed?
2. Does ℓ1-norm minimization work?
Characterizing the minimum ℓ1-norm estimate
• Aim: Show that the original signal x is the solution of
$\text{minimize } \|x'\|_1 \quad \text{subject to } A x' = y$
• This is guaranteed by the existence of a dual certificate
Dual certificate
v ∈ R^m is a dual certificate associated to x if
$q := A^T v$
satisfies
$q_i = \operatorname{sign}(x_i)$ if $x_i \neq 0$
$|q_i| < 1$ if $x_i = 0$
Dual certificate
q is a subgradient of the ℓ1 norm at x
For any x + h such that Ah = 0
$\|x + h\|_1 \ge \|x\|_1 + q^T h = \|x\|_1 + v^T A h = \|x\|_1$
If $A_T$ (the submatrix of columns of A indexed by the support T of x) is injective, x is the unique solution
Dual certificate for compressed sensing
Aim: Show that a dual certificate exists for any sparse support and sign pattern
Certificate for compressed sensing
Idea: Minimum-energy interpolator has closed-form solution
Certificate for compressed sensing
Valid certificate if m ≥ O (s) up to log factors
(Candès, Romberg, Tao 2006)
Back to deconvolution: the super-resolution problem
Limits of resolution in imaging
The resolving power of lenses, however perfect, is limited (Lord Rayleigh)
Diffraction imposes a fundamental limit on the resolution of optical systems
Fluorescence microscopy
Data
Point sources Low-pass blur
(Figures courtesy of V. Morgenshtern)
Super-resolution
• Optics: Data-acquisition techniques to overcome the diffraction limit
• Image processing: Methods to upsample images onto a finer grid while preserving edges and hallucinating textures
• This talk: Estimation/deconvolution from low-pass measurements
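Sensing model for super-resolution
Signal domain: point sources ∗ point-spread function = data
Frequency domain: spectrum of point sources × spectrum of point-spread function = spectrum of data
Deconvolution problem as in reflection seismology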
Minimum ℓ1-norm estimate
minimize ||estimate||_1 subject to estimate ∗ psf = data
[Figure: point sources and estimate]
Mathematical model
• Signal: superposition of Dirac measures with support T
$x = \sum_j a_j \delta_{t_j}, \quad a_j \in \mathbb{C}, \; t_j \in T \subset [0, 1]$
• Data: low-pass Fourier coefficients with cut-off frequency $f_c$
$y = F_c x$
$y(k) = \int_0^1 e^{-i 2\pi k t} \, x(dt) = \sum_j a_j e^{-i 2\pi k t_j}, \quad k \in \mathbb{Z}, \; |k| \le f_c$
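The measurement model can be evaluated directly from the sum above. The following is a minimal sketch; the function name `lowpass_measurements` and the example amplitudes and locations are hypothetical.

```python
import cmath

def lowpass_measurements(amplitudes, locations, fc):
    """Return {k: y(k)} for |k| <= fc, with y(k) = sum_j a_j exp(-i 2 pi k t_j)."""
    return {k: sum(a * cmath.exp(-2j * cmath.pi * k * t)
                   for a, t in zip(amplitudes, locations))
            for k in range(-fc, fc + 1)}

# Hypothetical example: three spikes in [0, 1] and cut-off frequency fc = 5.
amplitudes = [1.0, -0.5 + 0.5j, 2.0]
locations = [0.1, 0.45, 0.8]
fc = 5

y = lowpass_measurements(amplitudes, locations, fc)
```

The data consist of 2 fc + 1 complex numbers regardless of how finely the spikes are located in [0, 1]. The coefficient y(0) is simply the sum of the amplitudes.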
Compressed sensing vs super-resolution
Compressed sensing: spectrum interpolation
Super-resolution: spectrum extrapolation
Total-variation norm
• Continuous counterpart of the ℓ1 norm
• If $x = \sum_j a_j \delta_{t_j}$ then $\|x\|_{TV} = \sum_j |a_j|$
• Not the total variation of a piecewise-constant function
• Formal definition: For a complex measure ν
$\|\nu\|_{TV} = \sup \sum_{j=1}^{\infty} |\nu(B_j)|$,
(supremum over all finite partitions $B_j$ of [0, 1])
Theoretical questions
1. Is the problem well posed?
2. Does TV-norm minimization work?
Is the problem well posed?
[Figure: data = measurement operator applied to the spectrum of x]
Measurement operator = low-pass samples with cut-off frequency fc
Effect of measurement operator on sparse vectors?
Submatrix can be very ill conditioned!
If support is spread out there is hope
Minimum separation
The minimum separation ∆ of the support of x is
$\Delta = \inf_{(t, t') \in \operatorname{support}(x) \,:\, t \neq t'} |t - t'|$
Conditioning of submatrix with respect to ∆
• If ∆ < 1/fc the problem is ill posed
• If ∆ > 1/fc the problem becomes well posed
• Proved asymptotically by Slepian and non-asymptotically by Moitra
1/fc is the diameter of the main lobe of the point-spread function (twice the Rayleigh distance)
Lower bound on ∆
• Above what minimum distance ∆ is the problem well posed?
• Numerical lower bound on ∆:
1. Compute singular values of restricted operator for different values of ∆diff
2. Find ∆diff under which the restricted operator is ill conditioned
3. Then ∆ ≥ 2∆diff
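The first step of this experiment can be imitated in the simplest case of a support with two points, where the singular values are available in closed form. This is only a sketch: the function name is hypothetical, and with two spikes the conditioning degrades gradually, so it does not reproduce the sharp transition reported for larger numbers of spikes.

```python
import cmath
import math

def two_spike_singular_values(delta, fc):
    """Singular values of the low-pass operator restricted to spikes at 0 and delta."""
    n = 2 * fc + 1
    # Columns are c_j(k) = exp(-i 2 pi k t_j), |k| <= fc, with t_1 = 0, t_2 = delta.
    # Their Gram matrix is [[n, g], [conj(g), n]] with off-diagonal entry g.
    g = sum(cmath.exp(-2j * cmath.pi * k * delta) for k in range(-fc, fc + 1))
    # Eigenvalues of the Gram matrix are n + |g| and n - |g|;
    # the singular values are their square roots.
    return math.sqrt(n + abs(g)), math.sqrt(max(n - abs(g), 0.0))

fc = 1000
# Separations expressed as multiples of 1/fc.
ratios = (0.05, 0.2, 0.5)
smallest = [two_spike_singular_values(r / fc, fc)[1] for r in ratios]
```

The smallest singular value shrinks as the two points approach each other, which means that a difference of two nearby spikes is strongly attenuated by the measurement operator.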
Singular values of the restricted operator
Number of spikes = s, fc = 10^3
[Figure: singular values σ_j (log scale, 10^−17 to 10^−1) against ∆ fc (0.1 to 0.6) for j = 1, 0.25 s, 0.5 s, 0.75 s, s; panels for s = 40, 100, 200 and 500]
Phase transition at ∆diff = 1/(2 fc) → ∆ = 1/fc
Example: 25 spikes, fc = 10^3, ∆ = 0.8/fc
[Figure: Signal 1 and Signal 2, shown as signals and as data (in signal space)]
The difference is almost in the null space of the measurement operator
[Figure: difference of the two signals and its spectrum (log scale, 10^−8 to 10^0, frequencies from −2,000 to 2,000)]
Theoretical questions
1. Is the problem well posed?
2. Does TV-norm minimization work?
Estimation via convex programming
For data of the form y = Fc x, we solve
$\min_x \|x\|_{TV} \quad \text{subject to} \quad F_c x = y$,
over all finite complex measures x supported on [0, 1]
Dual certificate of TV norm
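A dual certificate of the TV norm at
$x = \sum_j a_j \delta_{t_j}, \quad a_j \in \mathbb{C}, \; t_j \in T$
guarantees that x is the unique solution. It is a low-pass trigonometric polynomial
$q := F_c^* v = \sum_{|k| \le f_c} v_k e^{i 2\pi k t}$
such that
$q(t_j) = \operatorname{sign}(a_j)$ if $t_j \in T$
$|q(t)| < 1$ if $t \notin T$
Range of $F_c^*$ is spanned by low-pass sinusoids instead of random sinusoids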
Certificate for super-resolution
Aim: Interpolate sign pattern
[Figure: sign pattern on the support, vertical axis from −1 to 1]
1st idea: Interpolation with a low-frequency fast-decaying kernel K
$q(t) = \sum_{t_j \in T} \alpha_j K(t - t_j)$
Problem: Magnitude of certificate locally exceeds 1
Solution: Add correction term and force the derivative of the certificate to equal zero on the support
$q(t) = \sum_{t_j \in T} \alpha_j K(t - t_j) + \beta_j K'(t - t_j)$
Similar construction for bandpass point-spread functions relevant to reflection seismology
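The two requirements on the corrected certificate, interpolating the sign pattern and having zero derivative on the support, can be written as a linear system in the coefficients. This is a sketch assuming K is twice differentiable; a_k denotes the amplitude of the spike at t_k, and the index k is introduced here for illustration.

```latex
\begin{aligned}
q(t_k) &= \sum_{t_j \in T} \alpha_j K(t_k - t_j) + \beta_j K'(t_k - t_j) = \operatorname{sign}(a_k), \\
q'(t_k) &= \sum_{t_j \in T} \alpha_j K'(t_k - t_j) + \beta_j K''(t_k - t_j) = 0,
\qquad \text{for all } t_k \in T.
\end{aligned}
```

These are 2|T| linear equations in the 2|T| unknowns α_j and β_j. The second set of equations makes every point of the support a critical point of q, which is the purpose of the correction term.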
Sketch of proof: Interpolation kernel
Key step: Designing a good interpolation kernel
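[Figure: the kernel is a product of three low-pass kernels with cut-off frequencies 0.273 fc, 0.36 fc and 0.367 fc; in the frequency domain the three spectra are convolved, giving a spectrum with cut-off frequency fc]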
Trade-off between spikiness at the origin and asymptotic decay
Sketch of proof: Non-asymptotic bounds on kernel
[Figure: upper and lower bounds on the kernel and its first, second and third derivatives, for fc = 10^3, fc = 5·10^3 and fc = 10^4]
Guarantees for super-resolution
Theorem [Candès, F. 2012]
If the minimum separation of the signal support obeys
∆ ≥ 2/fc
then recovery via convex programming is exact
Theorem [Candès, F. 2012]
In 2D convex programming super-resolves point sources with a minimum separation of
∆ ≥ 2.38/fc
where fc is the cut-off frequency of the low-pass kernel
Guarantees for super-resolution
Theorem [F. 2016]
If the minimum separation of the signal support obeys
∆ ≥ 1.26/fc,
then recovery via convex programming is exact
Numerical evaluation of minimum separation
[Figure: three panels (fc = 30, fc = 40, fc = 50); horizontal axis: minimum separation ∆min fc (0.2 to 1); vertical axis: number of spikes; color scale from 0 to 1]
Conjecture: TV-norm minimization succeeds if ∆ ≥ 1/fc
Dual certificate as theoretical tool
Subsequent work builds on our construction to analyze
• Stability of super-resolution [Candès, F. 2013], [F. 2013], [Azais, De Castro, Gamboa 2013], [Duval, Peyré 2013]
• Denoising of line spectra [Tang, Bhaskar, Recht 2013]
• Compressed sensing off the grid [Tang, Bhaskar, Shah, Recht 2013]
• Recovery of splines from their projection onto spaces of algebraic polynomials [Bendory, Dekel, Feuer 2013], [De Castro, Mijoule 2014]
• Recovery of point sources from spherical harmonics [Bendory, Dekel, Feuer 2013]
Super-resolution via semidefinite programming
Practical implementation
• Primal problem:
$\min_x \|x\|_{TV} \quad \text{subject to} \quad F_c x = y$
Infinite-dimensional variable x (measure in [0, 1])
First option: Discretizing + ℓ1-norm minimization
• Dual problem:
$\max_{u \in \mathbb{C}^n} \operatorname{Re}[y^* u] \quad \text{subject to} \quad \|F_c^* u\|_\infty \le 1, \qquad n := 2 f_c + 1$
Finite-dimensional variable u, but infinite-dimensional constraint
$F_c^* u = \sum_{|k| \le f_c} u_k e^{i 2\pi k t}$
Second option: Solving the dual problem
Lemma: Semidefinite representation
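The Fejér-Riesz Theorem and the semidefinite representation of polynomial sums of squares imply that
$\|F_c^* u\|_\infty \le 1$
is equivalent to
There exists a Hermitian matrix $Q \in \mathbb{C}^{n \times n}$ such that
$\begin{bmatrix} Q & u \\ u^* & 1 \end{bmatrix} \succeq 0, \qquad \sum_{i=1}^{n-j} Q_{i,i+j} = \begin{cases} 1, & j = 0, \\ 0, & j = 1, 2, \ldots, n-1. \end{cases}$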
Consequence: The dual problem is a tractable semidefinite program
Support-locating polynomial
How do we obtain an estimator from the dual solution?
Dual solution vector: Fourier coefficients of low-pass polynomial that interpolates the sign of the primal solution (follows from strong duality)
Idea: Use the polynomial to locate the support of the signal
Super-resolution via semidefinite programming
1. Solve semidefinite program to obtain dual solution
2. Locate points at which corresponding polynomial has unit magnitude
[Figure: signal and estimate]
3. Estimate amplitudes via least squares
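Step 2 can be illustrated on a toy case without solving the semidefinite program. In the sketch below the dual vector `u` is built by hand for a single spike at `t0` with sign +1 (a normalized Dirichlet kernel centered at `t0`); it is an assumption standing in for the output of step 1, and the grid search is a simplification chosen for brevity, not necessarily the procedure used in the talk.

```python
import cmath

def dual_polynomial(u, fc, t):
    """Evaluate q(t) = sum_{|k| <= fc} u_k exp(i 2 pi k t); u is indexed by k."""
    return sum(u[k] * cmath.exp(2j * cmath.pi * k * t) for k in range(-fc, fc + 1))

def locate_support(u, fc, grid_size=2000, tol=1e-6):
    """Grid points in [0, 1) at which |q(t)| is within tol of 1."""
    grid = [i / grid_size for i in range(grid_size)]
    return [t for t in grid if abs(dual_polynomial(u, fc, t)) > 1 - tol]

# Hypothetical dual vector for one spike at t0 = 0.3 with sign +1.
fc, t0 = 10, 0.3
n = 2 * fc + 1
u = {k: cmath.exp(-2j * cmath.pi * k * t0) / n for k in range(-fc, fc + 1)}

support = locate_support(u, fc)
```

For this hand-built vector the polynomial has magnitude one only at `t0`, so `support` contains that single grid point. Once the support is known, the amplitudes in step 3 solve an overdetermined linear system in the data.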
Support-location accuracy
fc               25             50             75              100
Average error    6.66 × 10^−9   1.70 × 10^−9   5.58 × 10^−10   2.96 × 10^−10
Maximum error    1.83 × 10^−7   8.14 × 10^−8   2.55 × 10^−8    2.31 × 10^−8
For each fc, 100 random signals with |T| = fc/4 and ∆(T) ≥ 2/fc
Demixing of sines and spikes
Spectral super-resolution
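• Signal: Multisinusoidal signal
$g(t) := \sum_{f_j \in T} c_j e^{-i 2\pi f_j t}$
$\hat{g} = \sum_{f_j \in T} c_j \delta_{f_j}$ (spectrum of g)
• Data: n samples measured at Nyquist rate
$g(k) := \sum_{f_j \in T} c_j e^{-i 2\pi k f_j}, \quad 1 \le k \le n$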
Equivalent to our super-resolution model!
Spectral Super-resolution
[Figure: spectrum, signal and data]
Demixing of sines and spikes
Aim: Super-resolving the spectrum of a multi-sinusoidal signal (sines) in the presence of impulsive events (spikes)
y = Fc x + s
Demixing of sines and spikes
[Figure: sines + spikes = data; x is the spectrum of the sines, and the model is Fc x + s = y]
Demixing of sines and spikes
Estimator: Solution to
$\min_{x, s} \|x\|_{TV} + \gamma \|s\|_1 \quad \text{subject to} \quad F_c x + s = y$
Dual problem:
$\max_{u \in \mathbb{C}^n} \operatorname{Re}[y^* u] \quad \text{subject to} \quad \|F_c^* u\|_\infty \le 1, \; \|u\|_\infty \le \gamma$
Dual solution: u
• u interpolates the sign of the primal solution s
• $F_c^* u$ interpolates the sign of the primal solution x
Demixing of sines and spikes
[Figure: dual solution (u and $F_c^* u$) and the resulting estimates of the spikes s and of the sines (spectrum) x]
Conclusion
• Geophysicists pioneered the use of ℓ1-norm regularization for underdetermined inverse problems
• Mathematicians and statisticians developed theoretical tools to understand compressed sensing
• Adapting these insights allows us to analyze the potential and limitations of convex programming for super-resolution
Deconvolution with the ℓ1 norm (Taylor, Banks, McCoy ’79)
[Figure: data, fit, pulse and estimate]
References: Reflection seismology
• Robust modeling with erratic data. J. F. Claerbout and F. Muir. Geophysics, 1973
• Deconvolution with the ℓ1 norm. H. L. Taylor, S. C. Banks and J. F. McCoy. Geophysics, 1979
• Reconstruction of a sparse spike train from a portion of its spectrum and application to high-resolution deconvolution. S. Levy and P. K. Fullagar. Geophysics, 1981
• Linear inversion of band-limited reflection seismograms. F. Santosa and W. W. Symes. SIAM J. Sci. Stat. Comp., 1986
References: Compressed sensing
• Stable signal recovery from incomplete and inaccurate measurements. E. J. Candès, J. Romberg and T. Tao. Comm. Pure Appl. Math., 2005
• Decoding by linear programming. E. J. Candès and T. Tao. IEEE Trans. Inform. Theory, 2004
• Sparse MRI: The application of compressed sensing for rapid MR imaging. M. Lustig, D. Donoho and J. M. Pauly. Magn Reson Med., 2007
References: Super-resolution
• Prolate spheroidal wave functions, Fourier analysis, and uncertainty V - The discrete case. D. Slepian. Bell System Technical Journal, 1978
• Super-resolution, extremal functions and the condition number of Vandermonde matrices. A. Moitra. Symposium on Theory of Computing (STOC), 2015
• Towards a mathematical theory of super-resolution. E. J. Candès and C. Fernandez-Granda. Comm. on Pure and Applied Math., 2013
• Super-resolution of point sources via convex programming. C. Fernandez-Granda. Information and Inference, 2016