Compressive Sensing: - Theory, Applications, Challenges...

Compressive Sensing:Theory, Applications, Challenges and Future

Dr. Mohammad Eslami and Seyed Hamid Safavi

Shahid Beheshti UniversityDigital Signal Processing LabarotarY(DiSPLaY)

http://display.sbu.ac.ir

m [email protected], h [email protected]

21 May 2017

Compressive Sensing DiSPLaY Group, Shahid Beheshti University, Faculty of Electrical Engineering 1 / 78


Outline

1 Introduction

2 Compressive SensingRecovery Constraints

SparkNSPRIPMutual Coherence

Recovery AlgorithmsMPOMPBPS`0

3 Challenges4 Applications5 Solvers

SparseLabCVX`1-MagicSPGL1SL0ISTA, FISTAADMM

6 Summary and Future Directions


Introduction


x ∈ Rn

x = Dα D ∈ Rn×m α ∈ Rm

Atom Composition

‖α‖0 = k0 k0 � m k0 − sparse

x = Dα + e ‖e‖2 6 ε M (D, k0, ε)


Atomic Decomposition

P0 (D, x, δ) : minα‖α‖0 subject to ‖x−Dα‖2 < δ

P1 (D, x, δ) : minα‖α‖1 subject to ‖x−Dα‖2 < δ


Inverse Problem

y = Hx + υ ‖υ‖2 < δ

P0 (HD, y, δ + ε) x̂ = Dαδ+ε0


Inverse Problem

Synthesis Based

αδ+ε0 = min

α‖α‖0 subject to ‖y −HDα‖2 < δ + ε

x̂ = Dαδ+ε0

Analysis Based

x̂ = minx‖D∗α‖0 subject to ‖y −Hx‖2 < δ + ε


Compressive Sensing

x, N = 512, 20− sparse

y = Φx, L = 50


Compressive Sensing

x̃ = Φ−1y


Compressive Sensing

x̂


Problem Definition

Linear Inverse Problems

System of linear equations:y1 = a11x1 + a12x2 + ...+ a1NxN

y2 = a21x1 + a22x2 + ...+ a2NxN...

yM = aM1x1 + aM2x2 + ...+ aMNxN

Or in a mtrix form:y = Ax

Applications of Linear inverse problems: optics, radar, acoustics, com-munication theory, signal processing, medical imaging, computer vision, geo-physics, oceanography, astronomy, remote sensing, natural language pro-cessing, machine learning, nondestructive testing, and many other fields.


Problem Definition

Linear Inverse Problems

y: Measurements,A ∈ RM×N : Linear Measurement Matrix

M = N, In this case the A is square matrix and the system has aunique solution if det(A) 6= 0.

M < N, the problem is underdetermind (fewer equations thanunknowns). An underdetermined linear system has either no solutionor infinitely many solutions.

M > N, the problem is overdetermind (more equations thanunknowns). An overdetermined system is almost always inconsistent(it has no solution) when constructed with random coefficients.Homogeneous case has a trivial all zero solution.


Problem Definition

ill-posedness


Problem Definition

ill-posedness

Q. what is ill-posedness?A. unstability with respect to measurement errors.Q. How to deal with ill-posed inverse problems?

A. It depends! There is no universal method for solving ill-posed problems.In every specific case, the main “trouble” — instability — has to be tackledin its own way.“All happy families resemble one another, each unhappy family is unhappy

in its own way”. Leo Tolstoy

Solution: Adding regularization such as Tikhonov regularization, `1-regularization, etc.


Introduction

Finding fake coin between coins?


Introduction


Introduction

Finding fake coin between coins?


Introduction


Compressive Sensing


Why CS?

Imaging at wavelengths where silicon is blind is considerably more com-plicated, bulky, and expensive. Thus, for comparable resolution, a US$500digital camera for the visible becomes a US$50,000 camera for the infrared.


Dimensionality Reduction

Data with high-frequency content is often not intrisically high-dimensional.

Signals often obey low-dimensional models such as:

Sparsity

Low-rank matrices

Manifolds

The intrinsic dimension can be much less than the ambient dimension.

Courtesy of Dr. Mark Davenport


Why CS?

Success has many fathers ...

Sampling Theorem: sampling attwice the highest frequency.

Compressive Sensing: sampling atsub-Nyquist rate!

Whittaker, Nyquist, Kotelnikov, Shannon Donoho, Candes, Romberg, Tao

“Can we not just directly measure the part that will not end up being thrownaway?”[Donoho, 2006]

“why spend so much effort acquiring all the data when we know that most of itwill be discarded?” [Candes, 2006]


Problem Statement

“Measure what should be measured ...”

Compressive Sensing

1

3

2

Transform

Sparsity

Non-adaptive

Linear Sampling

Reconstruction

• Spark

• Null Space Property

• RIP

• Mutual Coherence

• Greedy Algorithms (MP, OMP,

StOMP, …)

• Basis Pursuit (BP), Basis Pursuit

Denoising (BPDN)

• LASSO

• Smoothed L0 (SL0)

• …

Enable solving underdetermined

linear inverse problems


Single Pixel Camera

Courtesy of Dr. Marco F. Duarte and Dr. Richard G. Baraniuk


Problem Statement

Definition:

We say that a signal x is K -sparse when it has at most K nonzeros, i.e.,‖x‖0 6 k . Also, we let Σk = {x : ‖x‖0 6 k} denote the set of all K -sparsesignals.

Figure: (a) Measurement process with a random Gaussian matrix Φ and discrete cosinetransform (DCT) matrix Ψ. (b) The measurement vector y is a linear combination of 4 columns

in dictionary [Baraniuk2007]


Recovery Constraints



P0 : mins‖s‖0 subject to y = ΦΨs

P1 : mins‖s‖1 subject to y = ΦΨs

Figure: (a)The set of all k-sparse (k=2) vectors in R3. (b)Visualization of `2 minimization. (c)Visualization of `1 minimization. [Baraniuk, 2007]


Recovery Constraints [Davenport, 2011]

Spark (for sparse signals) σ (Φ) > 2k .

Null Space Property (for approximately sparse signals)‖hΛ‖2 <

CNSP√k‖hΛc‖1.

Restricted Isometry Property (for approximate sparse signals withnoise ) (1− δk) ‖x‖2

2 6 ‖Φx‖22 6 (1 + δk) ‖x‖2

2.

Mutual Coherence µ (Φ) = maxi 6=j

|φTi φj |‖φi‖2‖φj‖2

Spark (Φ) > 1 + 1µ(Φ)


Spark [Davenport, 2011]

Definition:

The spark of a given matrix Φ is the smallest number of columns of Φthat are linearly dependent.

Spark (ΦM×N) , min {k; ∃ 1 6 i1 6 ... 6 ik 6 N, α1, ..., αk ∈ R,

‖α‖2 6= 0,k∑

j=1

αjΦij = 0


Spark [Davenport, 2011]

Theorem:

For any vector y ∈ RM , there exists at most one signal x ∈ Σk such thaty = Φx if and only if σ (Φ) > 2k.

Proof:

If we wish to be able to recover all sparse signals x from the measurementsΦx, then it is immediately clear that for any pair of distinct vectorsx1, x2 ∈ Σk , we must have Φx1 6= Φx2, since otherwise it would beimpossible to distinguish x1 from x2 based solely on the measurements.Mathematically

ifx1 6= x2, and x1, x2k-sparse⇒ Φx1 6= Φx2 ⇒ Φ (x1 − x2) 6= 0

Note:Spark (ΦM×N) ∈ [2,M + 1]


Null Space Property [Davenport, 2011]

Definition:

A matrix Φ satisfies the null space property (NSP) of order K if thereexists a constant CNSP > 0 such that,

∀h ∈ NΦ, ∀Λ ⊆ {1, ...,N} , |Λ| = k , Λc = {1, ...,N} \Λ

‖hΛ‖2 <CNSP√

k‖hΛc‖1

Note:

we must also ensure that NΦ does not contain any vectors that are toocompressible in addition to vectors that are sparse.

Theorem:

Any K -sparse vector x is a unique solution to P1 if and only if matrix Φsatisfies NSP of order K with CNSP = 1.


Restricted Isometry Property [Davenport, 2011]

Definition:

A matrix Φ satisfies the restricted isometry property (RIP) of order K ifthere exists a δk ∈ (0, 1) such that

(1− δk) ‖x‖22 6 ‖Φx‖2

2 6 (1 + δk) ‖x‖22

holds for all x ∈ Σk .

Theorem: Noiseless recovery

Assume that δ2k <√

2− 1 . Then the solution x∗N×1 to `1 minimizationobeys

‖x∗ − x‖2 6C0√k

∥∥∥x− x(k)∥∥∥

1

where x(k)N×1 is the best K -sparse approximate of x and C0 is a constant .


Restricted Isometry Property [Davenport, 2011]

Theorem: Recoery in the presence of bounded noise

Assume that δ2k <√

2− 1 and let y = Φx + e, where ‖e‖2 6 ε. Then thesolution x∗N×1 to `1 minimization obeys

‖x∗ − x‖2 6C0√k

∥∥∥x− x(k)∥∥∥

1+ C1ε

where x(k)N×1 is the best K -sparse approximate of x and C0 and C1 is a

constant as follows:

C0 = 21−

(1−√

2)δ2k

1−(1 +√

2)δ2k

, C1 = 4

√1 + δ2k

1−(1 +√

2)δ2k


RIP and NSP [Davenport, 2011]

Theorem: The relationship between the RIP and the NSP

Suppose that Φ satisfies the RIP of order 2k with δ2k <√

2− 1. Then Φsatisfies the NSP of order 2k with constant

C0 =

√2δ2k

1−(1 +√

2)δ2k


Recovery Constraints [Davenport, 2011]

Mutual Coherence

Definition: The coherence of a matrix Φ, µ (Φ), is the largest absoluteinner product between any two columns φi , φj of Φ:

µ (Φ) = maxi 6=j

|φTi φj |‖φi‖2‖φj‖2

Lemma: For any matrix Φ

Spark (Φ) > 1 +1

µ (Φ)

Theorem:If k < 12 (1 + 1/µ (Φ)) then for each measurement vector

y ∈ RM there exists at most one signal x ∈ Σk such that y = Φx.

Welch Bound:

It is possible to show that the lower bound for coherence of a matrix is:

µ >√

N−MM(N−1)



Courtesy of the concept by Dr. Arash Amini


Measurement Matrices

How should we design the measurement matrix so that the required numberof measurements becomes as small as possible?

Gaussian random matricesBernulli random matricesHadamard matricesFourier matricesDeterministic matrices

Measurement bounds

Let Φ be an M × N matrix that satisfies the RIP of order 2k withconstant δ ∈

(0, 1

2

]. Then

M > C (Klog (N/K ))

where C = 1/2 log(√

24 + 1)≈ 0.28.

How can we recover the desired signal from the measurements?Compressive Sensing DiSPLaY Group, Shahid Beheshti University, Faculty of Electrical Engineering 37 / 78

Recovery Algorithms


Recovery Algorithms

Designing Criteria:

Minimal number of measurements

Robustness to measurement noise and model mismatch

Speed

Performance guarantees

Greedy Algorithms Convex Optimization based Methods

Matching Pursuit (MP) Basis Pursuit (BP)Orthogonal MP (OMP) Basis Pursuit Denoising (BPDN)Stage-wise OMP LASSOSubspace Pursuit (SP)CoSaMPIterative Hard Thresholding (IHT)

Smoothed `0Compressive Sensing DiSPLaY Group, Shahid Beheshti University, Faculty of Electrical Engineering 39 / 78

Matching Pursuit (MP)

y = Dx⇒ y =k∑

i=1

x(i)

a(i) da(i) + rk = y(k) + r(k)


Orthogonal Matching Pursuit (OMP)


BP, BPDN and LASSO

`1-minimization:

Basis Pursuit (BP)

P1 : mins∈RN

‖s‖1

s.t. y = ΦΨs

Basis Pursuit Denoising (BPDN)

mins∈RN

‖s‖1

s.t. ‖y −ΦΨs‖22 6 δ

mins∈RN

12 ‖y −ΦΨs‖2

2 + λ‖s‖1

Least Absolute Shrinkage and Selection Operator (LASSO)

mins∈RN

‖y −ΦΨs‖22

s.t. ‖s‖1 6 τ


S`0: Smoothed `0 Algorithm

limσ→0

fσ (xi ) = limσ→0

exp

(−

x2i

2σ2

)=

{1 if xi = 0

0 if xi 6= 0

Fσ (x) =N∑i=1

fσ (xi )→ ‖x‖0 ≈ N − Fσ (x)

SL0 is an algorithm for finding the sparsest solutions of anunderdetermined system of linear equations.SL0 is a very fast algorithm. For example, it is about 2 to 3 orders ofmagnitude faster than `1-magic.SL0 tries to directly minimize the L0 norm. This is contrary to mostof other existing algorithms (e.g. Basis Pursuit), which replace L0norm by other cost functions (like L1). Note also that the equivalenceof minimizing L1 and L0 is only assymptotic, and does not alwayshold.

Software Link: http://ee.sharif.edu/~SLzeroCompressive Sensing DiSPLaY Group, Shahid Beheshti University, Faculty of Electrical Engineering 43 / 78

http://ee.sharif.edu/~SLzero

Challenges


Challenges [Str12, Ela12]

Galileo Galilei

“Measure what can be measured and make measurable what cannot bemeasured” This quote attributed to Galileo Galilei.

Thomas Strohmer

“Measure what should be measured”

Challenges:

Structured Sensing Matrices

Structured Sparsity and Prior Information

Non-Linear Compressive Sensing

Better Bounds for Compressive Sensing

Hardware Implementation


Structured Sensing Matrices

Question:

Can we use a novel structure for sensing matrices that results in agood performance?

We usually do not have luxury to choose measurement matrix as weplease.

Measurement matrix is often dictated by the physical properties ofthe measurement process. e.g. the laws of wave propagation.

Specific structure in measurement matrix can give rise to fastalgorithms which will significantly speed up recovery algorithms.Structure: Block based compressive sensing, Separable matrices likeKronecker product matrices

Remark: The typical measurement matrix is not Gaussian or Bernoulli,but one with a very specific structure.


Structured Sparsity and Prior Information

Question:

How can we best take advantage of the knowledge that all sparsitypatterns may not be equally likely in a signal?

Structured Sparsity

Block SparsityTree Structure

of Wavelet Coefficients


Non-Linear Compressive Sensing

Question:

For which types of nonlinear measurements can we build aninteresting and relevant compressive sensing theory?

In many applications we can only take nonlinear measurements.

An important example is the case where we observe signalintensities. So, the phase information is missing.

Phase retrieval: X-ray crystallography, diffraction imaging,astronomy, and quantum tomography.

Quantized samples, and in the extreme case, 1-bit measurements.


Weighted CS

Figure: Enhancing Sparsity: At the origin, the canonical`0 sparsity count f0(t) is better approximated by the

log-sum penalty function flog,ε(t) than by the traditionalconvex `1 relaxation f1(t) [Candes2008].

P`1 : mins∈RN

‖s‖1 s.t. y = θs PRW `1 : mins∈RN

N∑i=1

log (|si |+ ε) s.t. y = θs

Solve using Majorization Minimization:

PRW `1 : mins∈RN

‖Ws‖1 s.t. y = θs, PRW `1 : minz∈RN

‖z‖1 s.t. y = θW−1z

w(l+1)i =

1∣∣∣s(l)i

∣∣∣+ ε


Perceptual Compressive Sensing

Redundancies

1

2

Statistical

Perceptual

Avoided by using

Transform Coding

and Variable

Length Coding

Avoided by

assigning weights

to the Transform

Coding coefficients

Question

How we can avoid acquisition of perceptual redundancies in compressivesensing?

Answer: Perceptual Weighted Compressive Sensing

# S.H. Safavi, F. Torkamani-Azar, “Perceptual Compressive Sensing based on Contrast Sensitivity Function: Can we avoidnon-visible redundancies acquisition?”, ICEE, 2017.


Perceptual Compressive Sensing

0 10 20 30 40 50 60

Spatial Frequency

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Rel

ativ

e S

ensi

tivi

ty

Luminance Contrast sensitivity Function (CSF)

Figure: Contrast Sensitivity Function (CSF)

H (fi,j) = 2.6 (0.0192 + 0.114fi,j) e−(0.114fi,j )

1.1

fi,j = 12N

√(iθx

)2

+(

jθy

)2

(cpd)Figure: illustration of spatial

frequency

# S.H. Safavi, F. Torkamani-Azar, “Perceptual Compressive Sensing based on Contrast Sensitivity Function: Can we avoidnon-visible redundancies acquisition?”, ICEE, 2017.


Diffusive Compressive Sensing

Applications:

Thermal monitoring of CPUhot spots to avoid leakageand reduce the CPU timefailure

Climate change study

Enviromental Monitoring

...

150

100

50

00

50

100

0

0.5

1

1.5

-0.5

150

Figure: Thermal Field example: LocalizedSource

# A. Hashemi, S.H. Safavi, M. Rostami, N.M. Cheung, “Efficient Sampling Scheme for Reconstruction of Diffiusion Fields”,Preprint, 2017.



Thermal Field Reconstruction

Side Information: 2D Partial Differential Equation (PDE)

∂f (x , y , t)

∂t= γ

(∂2f (x , y , t)

∂x2+∂2f (x , y , t)

∂y2

), t, γ > 0.

Frame Processing:

f t0t (x , y) = γ

(∂2ft0(x , y)

∂x2+∂2ft0(x , y)

∂y2

)where, f t0

t (x , y) = ∂f (x ,y ,t)∂t |t=t0 . The above equation can be discretized as

follows:b = γ (DxDx + DyDy ) f.




Figure: Geometrical depiction of the proposed method. This figure intuitively explains whyadding the additional constraint can help to decrease overcompleteness of the CS problem.




x0 20 40 60 80 100 120 128

y

0

20

40

60

80

100

120

128

(a)

x0 20 40 60 80 100 120 128

y

0

20

40

60

80

100

120

128

(b)

x0 20 40 60 80 100 120 128

y

0

20

40

60

80

100

120

128

(c)

x0 20 40 60 80 100 120 128

y

0

20

40

60

80

100

120

128

(d)

x0 20 40 60 80 100 120 128

y

0

20

40

60

80

100

120

128

(e)

x0 20 40 60 80 100 120 128

y

0

20

40

60

80

100

120

128

(f)

x0 20 40 60 80 100 120 128

y

0

20

40

60

80

100

120

128

(g)

x0 20 40 60 80 100 120 128

y

0

20

40

60

80

100

120

128

(h)Visualization of 10’th frame of reconstructed fields when xd = yd = 4 for

setting A(first row) and setting B (second row): (a,e) Original field,reconstructed fields for (b,f) CS , (c,g) DCS-I, (d,h) DCS-II.


Applications


Solvers


SparseLab

Functions

SolveBP(A, y, N, maxIters, lambda, OptTol)

SolveLasso(A, y, N, algType, maxIters, lambdaStop, resStop, solFreq,verbose, OptTol)

SolveMP(A, y, N, maxIters, lambdaStop, solFreq, verbose, OptTol)

SolveOMP(A, y, N, maxIters, lambdaStop, solFreq, verbose, OptTol)

SolveStOMP(A, y, N, thresh, param, maxIters, verbose, OptTol)

Software Link: https://sparselab.stanford.edu


https://sparselab.stanford.edu

SparseLab

SolveBP

A: Either an explicit M × N matrix, with rank(A) = min(M,N) byassumption, or a string containing the name of a functionimplementing an implicit matrix.

y: Measurement vector of length M.

N: length of the solution vector s

maxIters: maximum number of PDCO iterations to perform, default20.

lambda: If 0 or omitted, Basis Pursuit is applied to the data,otherwise, Basis Pursuit Denoising is applied with parameter lambda(default 0).

OptTol: Error tolerance, default 1e−3.


SparseLab

SolveLasso

A: Either an explicit M × N matrix, with rank(A) = min(M,N) byassumption, or a string containing the name of a functionimplementing an implicit matrix.



algType: ’lars’ for the Lars algorithm, ’lasso’ for lars with the lassomodification (default). Add prefix ’nn’ (i.e. ’nnlars’ or ’nnlasso’) toadd a non-negativity constraint (omitted by default)

maxIters: maximum number of Lars iterations to perform. If notspecified, runs to stopping condition (default)

lambdaStop: If specified (and > 0), the algorithm terminates whenthe Lagrange multiplier <= lambdaStop.


SparseLab

SolveLasso (Cont.)

resStop: If specified (and > 0), the algorithm terminates when the L2norm of the residual <= resStop.

solFreq: if = 0 returns only the final solution, if > 0, returns an arrayof solutions, one every solFreq iterations (default 0).

verbose: 1 to print out detailed progress at each iteration, 0 for nooutput (default)

OptTol: Error tolerance, default 1e−5.


SparseLab

SolveMP and SolveOMP

A: Either an explicit M × N matrix, with rank(A) = min(M,N) byassumption, or a string containing the name of a functionimplementing an implicit matrix



maxIters: maximum number of iterations to perform. If not specified,runs to stopping condition (default)

lambdaStop: If specified, the algorithm stops when the last coefficiententered has residual correlation <= lambdaStop.

solFreq: if = 0 returns only the final solution, if > 0, returns an arrayof solutions, one every solFreq iterations (default 0).

verbose: 1 to print out detailed progress at each iteration, 0 for nooutput (default)

OptTol: Error tolerance, default 1e−5.Compressive Sensing DiSPLaY Group, Shahid Beheshti University, Faculty of Electrical Engineering 62 / 78

CVX: Matlab Software for Disciplined Convex Programming

CVX is a Matlab-based modeling system for convex optimization. CVX turnsMatlab into a modeling language, allowing constraints and objectives to bespecified using standard Matlab expression syntax. Website link: http:

//cvxr.com/cvx

Notation

BP:

N = constantcvx beginvariables s(N)minimize (norm(s,1))subject toy == ΦΨscvx end

BPDN:

N = constantcvx beginvariables s(N)minimize (0.5 * norm(y −ΦΨs,2) + λ norm(s,1))cvx end

Software Link: http://cvxr.com/cvxCompressive Sensing DiSPLaY Group, Shahid Beheshti University, Faculty of Electrical Engineering 63 / 78

http://cvxr.com/cvx

http://cvxr.com/cvx

http://cvxr.com/cvx

`1-Magic

L1-MAGIC is a collection of MATLAB routines for solving the convex op-timization programs central to compressive sampling. The algorithms arebased on standard interior-point methods, and are suitable for large-scaleproblems.

l1decode pd: Decoding via linear programming. Solveminx ||b − Ax ||1 . Recast as the linear program and solve usingprimal-dual interior point method.

l1eq pd: Solve minx ||x ||1 s.t.Ax = b. Recast as linear program anduse primal-dual interior point method.

l1qc logbarrier: Solve quadratically constrained l1 minimization:min||x ||1 s.t.||Ax − b||2 <= ε. Reformulate as the Second-OrderCone Program (SOCP) and use a log barrier algorithm.

Software Link: https://statweb.stanford.edu/~candes/l1magic


https://statweb.stanford.edu/~candes/l1magic

SPGL1: A solver for large-scale sparse reconstruction

SPGL1 is a Matlab solver for large-scale one-norm regularized least squares.It is designed to solve any of the following three problems:

1 BP: minimize ||s||1 subject to As = b,

2 BPDN: minimize ||s||1 subject to ||As− b|| ≤ σ,

3 LASSO: minimize ||As− b||2 subject to ||s||1 ≤ τ .

SPGL1 relies only on matrix-vector operations As and ATy and acceptsboth explicit matrices and functions that evaluate these products. SPGL1is suitable for problems that are in the complex domain.

Software Link:http://www.cs.ubc.ca/~mpf/spgl1/downloads/spgl1-1.9.zip


http://www.cs.ubc.ca/~mpf/spgl1/downloads/spgl1-1.9.zip

SPGL1: A solver for large-scale sparse reconstruction

spg bp: Solve the basis pursuit (BP) problem.[s, r, g, info] = spg bp(A,y,options )

spg bpdn: Solve the basis pursuit Denoising (BPDN) problem.[s, r, g, info] = spg bpdn(A,y,sigma,options )

spg lasso: Solve the LASSO problem.[s, r, g, info] = spg lasso(A,y,tau,options )

spg group: Solve jointly-sparse BPDN.[s, r, g, info] = spg group(A,y,groups,sigma,options )

spg mmv: Solve multi-measurement BPDN.[s, r, g, info] = spg mmv(A,y,sigma,options )

spgl1: Solve BP, BPDN, and LASSO.[s, r, g, info] = spgl1(A,y,tau, sigma, s, options )


SL0: Smoothed L0 (SL0) Algorithm

Function

s = SL0(A, y, sigma min, sigma decrease factor, mu 0, L, A pinv, true s)

A: The Dictionary.


sigma min: The first element of the sequence of sigma is calculatedautomatically. The last element is given by ’sigma min’, and thechange factor for decreasing sigma is given by ’sigma decrease factor’.

sigma decrease factor: The default value of ’sigma decrease factor’ is0.5. Larger value gives better results for less sparse sources, but ituses more steps on sigma to reach sigma min, and hence it requireshigher computational cost.

µ0: The value of µ0 scales the sequence of mu. For each vlue ofsigma, the value of mu is chosen via µ = µ0 ∗ sigma2. Note that thisvalue effects Convergence. The default value is µ0 = 2 (see thepaper).

Software Link: http://ee.sharif.edu/~SLzeroCompressive Sensing DiSPLaY Group, Shahid Beheshti University, Faculty of Electrical Engineering 67 / 78


SL0: Smoothed L0 (SL0) Algorithm

Function

L: number of iterations of the internal (steepest ascent) loop. Thedefault value is L=3.

A pinv: is the pseudo-inverse of matrix A defined byA pinv = A′ ∗ inv(A ∗ A′). If it is not provided, it will be calculatedwithin the function.

true s: is the true value of the sparse solution. This argument is forsimulation purposes. If it is provided by the user, then the functionwill calculate the SNR of the estimation for each value of sigma andit provides a progress report.

Software Link: http://ee.sharif.edu/~SLzero



ISTA:Iterative Shrinkage-Thresholding Algorithm

Which type of optimization problem is of interest to solve?

min{F (x) ≡ f (x) + g (x) : x ∈ RN

}g : RN → R is a continuous convex function which is possiblynonsmooth.

f : RN → R is a smooth convex function, i.e., continuouslydifferentiable with Lipschitz continuous gradient L (f ):

‖∇f (x)−∇f (y)‖ 6 L (f ) ‖x− y‖ for every x, y ∈ RN

where L (f ) is the Lipschitz constant of ∇f

The `1 regularization problem is obviously a special instance of the aboveproblem by substituting f (x) = ‖Ax− b‖2, g (x) = ‖x‖1. The (smallest)Lipschitz constant of the gradient ∇f is L (f ) = 2λmax

(ATA

).



Quadratic approximation of F (x) := f (x) + g (x) at a given point y:

QL (x, y) := f (y) + 〈x− y,∇f (y)〉+L

2‖x− y‖2 + g (x)

It admits a unique minimizer:

pL (y) := arg min{QL (x, y) : x ∈ RN

}By using simple linear algebra and ignoring constant terms in y:

pL (y) = arg minx

{g (x) +

L

2

∥∥∥∥x−(

y − 1

L∇f (y)

)∥∥∥∥2}

Hence, the basic step of ISTA becomes:

xk = pL (xk−1)



Solving `1-regularization:

xk+1 = Tλt (G (xk))

where G (.) stands for a gradient step of the fit-to-data LS term and ISTAis an extension of the classical gradient method.

xk+1 = Tλt(

xk − 2tAT (Axk − b))

where t = 1L(f ) is an appropriate stepsize and Tα : RN → RN is the

shrinkage operator defined by

Tα(x)i = (|xi | − α)+ sgn (xi )



A possible drawback of this basic scheme is that the Lipschitz constantL (f ) is not always known or computable. For instance, the Lipschitzconstant in the `1-regularization problem depends on the maximumeigenvalue of ATA. For large-scale problems, this quantity is not alwayseasily computable. We therefore also analyze ISTA with a backtrackingstepsize rule.

ISTA with backtracking

Find the smallest nonnegative integers ik such that with L̄ = ηikLk−1,

F (pL̄ (xk−1)) 6 QL̄ (pL̄ (xk−1) , xk−1)

Set Lk = ηikLk−1 and compute

xk = pLk (xk−1)


FISTA:Fast Iterative Shrinkage-Thresholding Algorithm

The ISTA approach has a convergence rate of O(

1k

). i.e it has a sublinear

global rate of convergence. Practically, this rate could be somehow slow.Therefore, the question is how we can improve this convergence rate?

FISTA

L = L (f ) - A Lipschitz constant of ∇fy1 = x0 ∈ RN , t1 = 1.

xk = pL (yk)

tk+1 =1+√

1+4t2k

2

yk+1 = xk +(tk−1tk+1

)(xk − xk−1)

The FISTA approach has a convergence rate of O(

1k2

). We can develop

FISTA with backtracing similar to the ISTA approach.


Alternating Direction Method of Multipliers

The alternating direction method of multipliers (ADMM) is an algorithmthat solves convex optimization problems by breaking them into smallerpieces, each of which are then easier to handle.

Main References

“Distributed optimization and statistical learning via the alternatingdirection method of multipliers”,S. Boyd, N. Parikh, E. Chu, B.Peleato, and J. Eckstein, 2011

“Proximal algorithms”, N. Parikh and S. Boyd, 2014

Optimization Problem:

min f (x) + g (z)

s.t. Ax + Bz = c

Software Link: http://stanford.edu/~boyd/admm.html


http://stanford.edu/~boyd/admm.html

Alternating Direction Method of Multipliers

Augmented Lagrangian:

Lρ (x , y , z) = f (x) + g (z) + yT (Ax + Bz − c) +ρ

2‖Ax + Bz − c‖2

2

Solution:

xk+1 = arg minx

Lρ(x , zk , yk

)x −minimization step

zk+1 = arg minz

Lρ(xk+1, z , yk

)z −minimization step

yk+1 = yk + ρ(Axk+1 + Bzk+1 − c

)dual variable update

BP:

minx

f (x) + ‖z‖1

s.t. x − z = 0Compressive Sensing DiSPLaY Group, Shahid Beheshti University, Faculty of Electrical Engineering 75 / 78

Summary and Future Directions


Summary

1 Linear Inverse Problem

2 Compressive Sensing

3 Recovery Constraints such as: Spark, NSP, RIP, and MutualCoherence.

4 Recovery Algorithms such as: MP, OMP, BP, BPDN, and SL0.

5 CS Challenges

6 CS Applications

7 CS Solvers


References

1 E. J. Candes, J. K. Romberg, and T. Tao, “Stable signal recovery from incomplete and inaccurate measurements,”Communications on pure and applied mathematics, vol. 59, no. 8, pp. 1207–1223, 2006.

2 D. L. Donoho, “Compressed sensing,” IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.

3 R. G. Baraniuk,“Compressive sensing,” IEEE Signal Processing Mag., vol. 24, no. 4, 2007.

4 M. Duarte, M. Davenport, D. Takhar, J. Laska, S. Ting, K. Kelly, R.G. Baraniuk, “Single-Pixel Imaging via CompressiveSampling.”, IEEE Signal Process. Mag. vol. 25, no. 2, pp. 83–91, 2008.

5 M. A. Davenport, M. F. Duarte, Y. C. Eldar, and G. Kutyniok, “Introduction to Compressed Sensing,” in CompressedSensing: Theory and Applications, Cambridge University Press, 2012.

6 H. Mohimani, M. Babaie-Zadeh, and C. Jutten, “A fast approach for overcomplete sparse decomposition based onsmoothed `0 norm,” IEEE Transactions on Signal Processing, vol. 57, no. 1, pp. 289–301, 2009.

7 E. J. Candes, M. B. Wakin, and S. P. Boyd, “Enhancing sparsity by reweighted `1 minimization,” Journal of Fourieranalysis and applications, vol. 14, no. 5-6, pp. 877–905, 2008.

8 L. Gan, “Block compressed sensing of natural images,” in 15th IEEE Int. Conf. on Digital Signal Processing, 2007, pp.403–406.

9 M. F. Duarte and R. G. Baraniuk, “Kronecker compressive sensing,” IEEE Trans. on Image Process., vol. 21, no. 2, pp.494–504, 2012.

10 M. Elad, “Sparse and redundant representation modeling—what next?” Signal Processing Letters, IEEE, vol. 19, no.12, pp. 922-928, 2012.

11 T. Strohmer, “Measure what should be measured: progress and challenges in compressive sensing,” Signal ProcessingLetters, IEEE, vol. 19, no. 12, pp. 887-893, 2012.

12 S. H. Safavi, F. Torkamani-Azar, “Perceptual Compressive Sensing based on Contrast Sensitivity Function: Can weavoid non-visible redundancies acquisition?”, The 25th Iranian Conference on Electrical Engineering (ICEE), 2017.

13 A. Hashemi, S.H. Safavi, M. Rostami, N.M. Cheung, “Efficient Sampling Scheme for Reconstruction of DiffiusionFields”, Preprint, 2017.


Thanks for attention.http://display.sbu.ac.ir

http://students.sbu.ac.ir/h_safavi

Questions and Comments?h [email protected], m [email protected]

Sabalan Mountain, Ardabil, Iran



http://students.sbu.ac.ir/h_safavi

Compressive Sensing: - Theory, Applications, Challenges...

Documents

Transcript of Compressive Sensing: - Theory, Applications, Challenges...