Olga V. Holtz UC Berkeley & TU Berlinoholtz/Talks/CS.pdf · An Introduction to Compressive Sensing...

An Introduction to Compressive Sensing

Olga V. HoltzUC Berkeley & TU Berlin

StanfordJanuary 2009

History and Introduction Main idea Constructions Approximation theory

Compressed Sensing: History

Compressed Sensing (CS)

People involved, (Right to left: J. Claerbout, B. Logan, D. Donoho, E. Candés, T. Tao and R. DeVore)

Compressed Sensing: Introduction

Old-fashioned Thinking

Collect data at grid points

For n pixels, take nobservations

Compressed Sensing (CS)

(CS camera at Rice)

Takes only O(n1/4 log5(n))random measurementsinstead of n

Traditional signal processing

Model signals as band-limited functions x(t)

Support of x is contained in [−Ωπ,Ωπ]

Shannon-Nyquist

Uniform time sampling with spacing h ≤ 1/Ω gives exactreconstructon

A/D converters: sample and quantize

Problem: if Ω is very large, one cannot build circuits tosample at the desired rate

Signal processing using CS

Compressive sensing seeks a way out of this dilemma

Two new components:

New model classes for signals: signals are sparse in somerepresentation system (basis/frame)

New meaning of samples: sample is a linear functional appliedto the signal

Given x ∈ IRn with n large, ask m non-adaptive questionsabout x

Question means inner product v · x with v ∈ IRn meanssample

Such sampling is described by an m × n linear systemΦx = y

Signal processing using CS

Compressive sensing seeks a way out of this dilemma

Two new components:

New model classes for signals: signals are sparse in somerepresentation system (basis/frame)

New meaning of samples: sample is a linear functional appliedto the signal

Given x ∈ IRn with n large, ask m non-adaptive questionsabout x

Question means inner product v · x with v ∈ IRn meanssample

Such sampling is described by an m × n linear systemΦx = y

Structure of signals

With no additional information on x cannot say anything

But we are interested in those x that have structure

Typically x can be represented by sparse linearcombinations of certain building blocks (e.g., a basis)

Issue: in many problems, we do not know the basis

Here we assume the basis is known (for now)

Ansatz: look for k -sparse solutions:

x ∈ Σk that is # supp(x) ≤ k .

Sparsest Solutions of Linear equations

Find a sparsest solution of linear system

(P0) min‖x‖0 : Φx = b, x ∈ IRn

where ‖x‖0 = number of nonzeros of xand Φ ∈ IRm×n with m < n.

The solution is in generalnot unique.

Moreover, this problem isNP-Hard

Basis Pursuit

Main idea:Use the convex relaxation

(P1) min‖x‖1 : Φx = b, x ∈ IRn

Basis Pursuit [Chen, Donoho, and Saunders (1999)]

Solving (P1) in polynomial time

Can be solved by linearprogramming:

min 1Ty

s.t.Φx = b−y ≤ x ≤ y

Sparse Recovery and Mutual Incoherence

Mutual incoherence:

M(Φ) = maxi 6=j

i φj |

where Φ = [φ1 . . . φn] ∈ IRm×n and ‖φi‖2 = 1.

Theorem (Elad and Bruckstein (2002))

Suppose that for the sparsest solution x? we have

‖x?‖0 <(√

2− 12)

M(Φ).

Then the solution of (P1) is equal to the solution of (P0).

Sparse Recovery and RIP

Restricted Isometry Property of Order k [Candès, Romberg,Tao (2006)]: Let δk be the smallest number such that

(1− δk )‖x‖22 ≤ ‖Φx‖2

2 ≤ (1 + δk )‖x‖22

for all k -sparse vectors x ∈ IRn where Φ = [φ1 . . . φn] ∈ IRm×n.

Theorem (E. J. Candès (2008))

If δ2k <√

2− 1, then for all k-sparse vectors x such thatΦx = b, the solution of (P1) is equal to the solution of (P0).

Approximate Recovery and RIP

Basis Pursuit De-Noising (BPDN):

(Pε1) min‖x‖1 : ‖Φx − b‖2 ≤ ε

[Chen, Donoho, and Saunders (1999)]

Theorem (E. J. Candès (2008))

Suppose that the matrix Φ is given and b = Φx + e where‖e‖2 ≤ ε. If δ2k <

√2− 1, then

‖x? − x‖2 ≤ C0k−1/2σk (x)1 + C1ε,

where x? is the solution of (Pε1) and

σk (x)1 = minz∈Σk

‖x − z‖1.

Other Heuristics: Orthogonal Matching Pursuit, Mangasarian’sapproach, Bilinear formulation, etc.

End of Part I.

Construction of CS Matrices

Good compressive sensing (CS) matrices:

Known Result for Random matrices

Known reconstruction bounds for matrices with entriesdrawn at random from various probability distributions:

k ≤ Cm/ log(n/m).

Specific recipes include Gaussian, Bernoulli and otherclassical matrix ensembles.

Particular case: there is a probabilistic construction ofmatrices Φ of size m× n with entries ± 1√

m satisfying RIP

of order k with the above bound.

Probabilistic Construction of CS Matrices

Introduce the Concentration of Measure Inequality (CMI)property on a probability space (Ω, %)

Suppose Φ = Φ(ω) is a collection of random m × nmatrices

Property PO(δ): the collection is said to have CMI if, foreach x ∈ IRn, there is a set Ω0(x , δ) ⊂ Ω s.t.

(1− δ)‖x‖2 ≤ ‖Φx‖2 ≤ (1 + δ)‖x‖2, ω ∈ Ω(x , δ)

and %(Ω(x , δ)c) ≤ C0e−c0mδ2

Gaussian, Bernoulli and many other families have thisproperty

Property PO and the JL Lemma

Johnson-Lindenstrauss Lemma.

Given ε ∈ (0, 1), a set X of points in IRn such that#X =:σ > σ0 = O(ln m/ε2), there is a Lipschitz functionΦ : IRn → IRm s.t.

(1− ε)‖u − v‖2 ≤ ‖Φ(u)− Φ(v)‖2 ≤ (1 + ε)‖u − v‖2.

If X is a set of points and m > c ln(#X )ε−2 with c sufficientlylarge, then the set Ω0 :=∩x ,x ′∈X Ω(x − x ′, ε) satisfies

%(Ωc0) ≤ C0(#X )2e−c0nε2

= e2 ln(#X)−c0mε2+ln C0

If m ≥ (2 ln(#X ) + ln C0)/c0ε2, then the measure is < 1 hence

we get the JL lemma.

The JL Lemma and RIP

If k ≤ cm/ ln(n/m) and Φ satisfies JL, then we have RIP oforder k

For c sufficiently small, the probability that a Gaussianemsemble satisfies RIP is 1− Ce−cm

If we have a collection of O(ecm) bases, then a randomdraw of Gaussian will satisfy RIP with respect to all ofthese bases simultaneously

Basic reason why this works

Pr[∣∣∣‖Φ(ω)x‖2

`2− ‖x‖2

∣∣∣ ≥ ε‖x‖2`2

]≤ 2e−mc0(ε), 0 < ε < 1.

Deterministic Construction of CS Matrices

Difficulties in Deterministic Case

[DeVore]: Proposes a deterministic constructions of order

k ≤ C√

m log n/ log(n/m)

which is still far from probabilistic results.

Very recent ideas

Find deterministic constructions with better bounds

using bipartite expander graphs [Indyk, Hassibi, Xu];

using structured matrices such as Toeplitz, cyclic,generalized Vandermonde matrices.

Subgoal: deterministic polynomial time algorithm forconstructing good CS matrices.

Holy grail: k ≤ Cm/ log(n/m) – achievable for randommatrices but not yet in deterministic constructions.

Connections with approximation theory

Cohen, Dahmen, DeVore [2009]: Compressed sensing andbest k -term approximation, JAMS.

Best k -term approximation error:

σk (x)X := infz∈Σk

‖x − z‖X .

Encoder-decoder viewpoint

The matrix Φ serves as an encoder producing y = Φx . Toextract x / approximation to x , use a decoder ∆ (notnecessarily linear). Thus

∆(y) = ∆(Φx) approximates x .

Performance of encoders-decoders

Performance of encoder-decoder pairs

Ask for the largest value of k s.t.

x ∈ Σk =⇒ ∆(Φx) = x .

Ask for the largest value of k s.t., for a given class K ,

En(K )X ≤ Cσk (K )X , where

En(K )X := inf(Φ,∆)

supx∈K

‖x −∆(Φx)‖,

σk (K )X := supx∈K

σk (X ).

A pair (Φ,∆) is called instance-optimal of order k with constantC for the space X is

‖x −∆(Φx)‖X ≤ Cσk (x)X

for all x ∈ X with a constant C independent of k and n.History and Introduction Main idea Constructions Approximation theory

Connection with Gelfand widths

Gelfand widths

For K a compact set in X and m ∈ IN, the Gelfand width of K oforder m is

dm(K )X := infcodimY≤m

sup‖x‖X : x ∈ K ∩ Y.

Basic result

Lemma. Let K ⊂ IRn be symmetric, i.e., K = −K , and satisfyK + K ⊂ C0K for some C0. If X ⊆ IRn is any normed space,then

dm(K )X ≤ Em(K )X ≤ C0dm(K )X , 1 ≤ m ≤ n.

Orders of Gelfand widths

Orders of Gelfand widths of `q balls

Theorem [Gluskin, Garnaev, Kashin (1977,1984)]

C1Ψ(m, n, q, p) ≤ dm(U(`nq)) ≤ C2Ψ(m, n, q, p) where

Ψ(m, n, q, p) :=(

min1, n1−1/qm−1/2) 1/q−1/p

1/q−1/2,

Ψ(m, n, 1, 2) := min1,

√log(n/m)

Corollary. The necessary number of measurements k satisfies

k ≤ c0m/ log(n/m).

Instance optimality and the null space of Φ

Denote N :=N (Φ) :=x : Φx = 0.

Uniqueness of recovery

Lemma. For an m × n matrix Φ and for 2k ≤ m, the followingare equivalent:

There is a decoder ∆ s.t. ∆(Φx) = x for all x ∈ Σk .

Σ2k ∩N = 0.For any set T with #T = 2k , the matrix ΦT has rank 2k .

For any T as above, the matrix Φ∗T ΦT is positive definite.

Approximate recovery

Approximation to accuracy σk

Theorem [Cohen, Dahmen, DeVore (2009)]. Given an m × nmatrix Φ, a norm ‖ · ‖X and a value of k , a sufficient conditionthat there exists a decoder ∆ s.t.

is that‖η‖X ≤ C/2 · σ2k (η)X , η ∈ N .

A necessary condition is that

‖η‖X ≤ C · σ2k (η)X , η ∈ N .

This gives rise to the null space property (in X of order 2k ):

‖η‖X ≤ C · σ2k (η)X , η ∈ N .

The null space property

Approximation and the null space property

Corollary [Cohen, Dahmen, DeVore (2009)].Suppose that X is an lnp space, k ∈ IN and Φ is an encodingmatrix. If Φ has the null space property in X of order 2k withconstant C/2, then there exists a decoder ∆ so that

‖x −∆(Φx)‖X ≤ Cσk (x)X .

Conversely, the validity of the above condition for some decoder∆ implies that Φ has the null space property in X of order 2kwith constant C.

One of the main results, revisited

RIP and good encoder-decoder pairs

Theorem [Candès-Romberg-Tao (2006)]. Let Φ be any matrixwith satisfies the RIP of order 3k with δ3k ≤ δ < (

√2− 1)2/3.

Define the decoder ∆ by

∆(y) := argminΦz=y ‖z‖`1 .

Then (Φ,∆) satisfies

in X = `1 with C = 2√

2+2−(2√

2−2)δ√2−1−(

√2+1)δ

The End.

Olga V. Holtz UC Berkeley & TU Berlinoholtz/Talks/CS.pdf · An Introduction to Compressive Sensing...

Documents

Transcript of Olga V. Holtz UC Berkeley & TU Berlinoholtz/Talks/CS.pdf · An Introduction to Compressive Sensing...

Rabbi David K. Holtz Cantor Margot E.B. Goldberg

CAMPANILE WAY 2813 NOTICES MAILED OUT 12-22-17 NAME1 … · 2017. 12. 22. · matich olga tr 1717 la loma ave 4 berkeley ca 94709 kolenkow robert j tr 1717 la loma ave n-1 berkeley

Queen Olga - Venus Ceramica · 144 25x70 10x273/4" Queen Olga Riga Queen Olga Decore Queen Olga Boiserie Cornisa Queen Olga, 5x25 Cenefa Queen Olga Mix, 7x25 T Queen Olga Zócalo

Holtz Appendix 2010 Dinosaurs)

, the Jackson Bond Citizens’ Advisory Committee Holtz ...

Holtz & Kovacs - An Introduction to Geotechnical Engineering

An introduction to geotechnical engineering (holtz and kovacs)

Rion by DESIGN - Holtz Furniture Studio

Mobile Marketing Crash Course with Shel Holtz

Zonotopal algebra · Zonotopal algebra Olga Holtz∗ Departments of Mathematics University of California-Berkeley & Technische Universit¨at Berlin Amos Ron† Department of Mathematics

Shel Holtz

Tim Holtz 2015

Holtz& Kovacs - An Introduction to Geotechnical Engineering.docx

Holtz - Paper

fiLECOpy - USDA · 1 plocat. f4 calib4.f4 mapdex.f4 getxy.mac lowup.f4 string.mac getdig.r tran.r rbmap.mac simeg.f4 lsg.rel olga olga olga olga olga olga olga olga olga olga olga

Employee Benefits Social Media Employee Benefits and Social Media Shel Holtz, ABC Holtz Communication + Technology.

Holtz 2nd.pdf

Fast and stable matrix multiplication - EECS at UC Berkeleyoholtz/Talks/mit.pdf · Fast and stable matrix multiplication Olga Holtz Department of Mathematics University of California-Berkeley

Tim Holtz CHAW13 Catalog

Tim Holtz Winter 2012.pdf