IPAM MGA Tutorial on Feature Extraction and Denoising: A Saga of u + v Models
Naoki Saito
http://www.math.ucdavis.edu/~saito/
Department of Mathematics
University of California, Davis
Sep. 2004 – p.1
Outline
What Are Features and What Are Noise?
Some History
u + v Models
Very Briefly, Harlan-Claerbout-Rocca Model
Mumford-Shah Model
Rudin-Osher-Fatemi Model and Total Variation
DeVore-Lucier Model and Besov Spaces
Sparsity
Outline . . .
Basis Pursuit Denoising
Sparsity vs. Statistical Independence
BV via Cohen-Dahmen-Daubechies-DeVore
BV vs Besov
Very Briefly, Meyer-Vese-Osher Model for Texture
My Comments and Summary
Acknowledgment
Yves Meyer
Hyeokho Choi (Rice)
Other authors of articles
IPAM/UCLA
NSF & ONR
What Are Features and What Are Noise?
To answer those questions, we need to specify our aim:
Approximation
Compression
Noise Removal (Denoising)
Object Detection
Classification/Discrimination
Regression
Some History
Satosi Watanabe (circa 1981) characterized pattern recognition as a quest for minimum entropy by saying, “the essential nature of pattern recognition is . . . a conceptual adaptation to the empirical data in order to see a form in them. The form means a structure which always entails small entropy values.”
Raphy Coifman (circa 1991) suggested that “noise” should be defined as the incoherent components of data relative to a waveform library used to represent the data, whereas “signal” or “features” are the coherent components. coherent ≈ sparse ≈ focused
u + v Models
Yves Meyer (2001) describes the so-called u + v models
The u component is aimed at modeling the objects or important features
The v component represents textures and noise
The refined model is the u + v + w model, where v and w represent textures and noise, respectively.
Examples include:
Harlan, Claerbout, & Rocca (1984)
Mumford & Shah (1985)
Rudin, Osher, & Fatemi (1992)
u + v Models . . .
DeVore & Lucier (1992)
Chen, Donoho, & Saunders (1995)
Olshausen & Field (1996)
Coifman & Sowa (1998)
Donoho, Huo, & Starck (2000)
Cohen, Dahmen, Daubechies, DeVore (2000)
Meyer, Vese, Osher (2002)
many others . . .
Common Intuitions
Should be able to represent recognizable patterns/structures in data efficiently and compactly via some invertible transform
u = signal ≈ features ⇐⇒ sharply focused ≈ sparse
v = noise ⇐⇒ defocused/diffused
What is u?
Requires some regularity (e.g., smoothness), i.e., ‖u‖_B < C, where B is some appropriate function space and C > 0.
A more general approach: ‖Au‖_{B′} < C, where A : B → B′ is some invertible transform (e.g., A = the Radon transform).
An important (modeling) problem is what B should be for various natural images.
Various Viewpoints
Harmonic Analysis approach (Cohen, Coifman, Daubechies, DeVore, Donoho, Meyer, . . . )
PDE approach (Chan, Osher, Meyer, Morel, Sapiro, Vese, . . . )
Deterministic approach (Cohen, DeVore, Donoho, Osher, Terzopoulos, . . . )
Stochastic approach (Mumford, Grenander, Donoho, Zhu, Wu, . . . )
Highly active area with more and more interactions among the various schools
Very Briefly, Harlan, Claerbout, & Rocca (1984)
A stacked seismic section = the sum of:
Geologic component ≈ linear events
Diffraction component ≈ hyperbolic events
Noise component ≈ white Gaussian noise + α.
Very Briefly, Harlan, Claerbout, & Rocca (1984) . . .
[Figure: (a) Original (b) Geology (c) Diffraction (d) Noise]
Mumford & Shah (1985)
Motivation: simultaneous image segmentation and denoising
Let Ω = [0, 1] × [0, 1] ⊂ R²
Mumford & Shah (1985) . . .
The u component is smooth everywhere except on a compact set K ⊂ Ω, which is unknown.
Find u and K from the data f = u + v by minimizing:
J_MS(u, K) = ∫_Ω |f(x) − u(x)|² dx + λ ∫_{Ω\K} |∇u(x)|² dx + µ H¹(K),
where λ, µ are positive weights and H¹(K) is the 1D Hausdorff measure (total length) of K.
The three terms measure fidelity, smoothness of u, and simplicity of K, respectively.
Mumford & Shah (1985) . . .
Since u|_{Ω\K} ∈ H¹(Ω \ K) = W^{1,2}(Ω \ K), the objective is: find u ∈ L²(Ω) and K ⊂ Ω s.t.
inf_{u∈L²(Ω)} ‖f − u‖_{L²(Ω)} subject to ‖u‖_{H¹(Ω\K)} < C and H¹(K) < C′.
Mumford & Shah (1985) . . .
Very influential in Computer Vision; refined by Ivan Leclerc using the MDL formalism; faster numerical algorithm called the ‘Graduated Non-Convexity’ (GNC) algorithm by Andrew Blake and Andrew Zisserman, . . .
Numerical optimization was and still is an issue
Choice of λ and µ
Representation/basis functions for u were not used
Rudin, Osher, & Fatemi (1992)
Motivation: image enhancement and denoising
The MS functional is modified to:
J_ROF(u) = ∫_Ω |f(x) − u(x)|² dx + λ ∫_Ω |∇u(x)| dx
         = ‖f − u‖²_{L²(Ω)} + λ |u|_{BV(Ω)},
where |u|_{BV(Ω)} is the so-called total variation of u.
In other words, find u ∈ L²(Ω) s.t.
inf_{u∈L²(Ω)} ‖f − u‖_{L²(Ω)} subject to |u|_{BV(Ω)} < C.
Rudin, Osher, & Fatemi (1992) . . .
Solve the following Euler-Lagrange equation
u = f + (λ/2) ∇·(∇u/|∇u|),
by forming the evolution equation and computing the solution as t → ∞.
The coarea formula links MS to ROF: if |u|_{BV(Ω)} < ∞,
|u|_{BV(Ω)} = ∫_{−∞}^{∞} H¹(∂E_t[u]) dt,
where E_t[u] ≜ {x ∈ Rⁿ : u(x) > t} is the level set of u at t and ∂E_t[u] is its perimeter.
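The evolution-equation idea can be sketched numerically. Below is a toy explicit gradient-descent scheme for the ROF functional, not the original Rudin-Osher-Fatemi time-marching code: the step size tau, the smoothing parameter eps (which regularizes |∇u| near zero), and the periodic boundary handling via np.roll are all illustrative choices.

```python
import numpy as np

def tv_denoise(f, lam=0.1, tau=0.05, eps=1e-3, n_iter=200):
    """Explicit gradient descent on ||f - u||^2 + lam * TV(u),
    with |grad u| smoothed by eps to avoid division by zero."""
    u = f.copy()
    for _ in range(n_iter):
        # forward differences (periodic boundaries)
        ux = np.roll(u, -1, axis=1) - u
        uy = np.roll(u, -1, axis=0) - u
        mag = np.sqrt(ux**2 + uy**2 + eps**2)
        # divergence of the unit gradient field grad(u)/|grad(u)|
        px, py = ux / mag, uy / mag
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        # descent step: fidelity pulls toward f, curvature term smooths
        u = u + tau * (2.0 * (f - u) + lam * div)
    return u
```

Starting from u = f, the curvature term drives the total variation of u down while the fidelity term keeps u near the data, mirroring the t → ∞ behavior described above.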
Rudin, Osher, & Fatemi (1992) . . .
Boundaries K are not explicit.
Choice of λ: dynamic, i.e., λ(t)
Strictly speaking, total variation is defined more generally, and |u|_{BV(Ω)} = ∫_Ω |∇u| dx is true for u ∈ L¹₁(Ω) = W^{1,1}(Ω) ⊂ BV(Ω).
More about BV(Ω) (the space of functions of bounded variation) will be discussed later.
Comments on MS and ROF models
Very influential; generated a new field, “PDE-based image processing”
Characterization of the constraint space via basis functions was not used
Improved over the years; yet still computationally intensive
DeVore & Lucier (1992)
Now, the functional becomes:
J_DL(u) = ‖f − u‖²_{L²(Ω)} + λ ‖c[u]‖_{ℓ¹},
where c[u] = (c_ν[u])_{ν∈Γ} are the expansion coefficients of u relative to an orthonormal wavelet basis {ψ_ν}_{ν∈Γ}, i.e., c_ν[u] = ⟨u, ψ_ν⟩.
In other words, find u ∈ L²(Ω) s.t.
inf_{u∈L²(Ω)} ‖f − u‖_{L²(Ω)} subject to ‖c[u]‖_{ℓ¹} < C.
DeVore & Lucier (1992) . . .
Optimization leads to soft thresholding (or wavelet shrinkage) of the empirical wavelet coefficients
Let f(x) = Σ_{ν∈Γ} c_ν[f] ψ_ν(x). Then,
J_DL(u) = Σ_{ν∈Γ} ( (c_ν[f] − c_ν[u])² + λ |c_ν[u]| ),
whose minimization leads to:
c_ν[u] = c_ν[f] + λ/2 if c_ν[f] < −λ/2,
c_ν[u] = 0 if |c_ν[f]| ≤ λ/2,
c_ν[u] = c_ν[f] − λ/2 if c_ν[f] > λ/2.
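The three cases above are exactly coefficient-wise soft thresholding at level λ/2, which takes only a couple of lines; a minimal numpy sketch (the function name is ours):

```python
import numpy as np

def soft_threshold(c_f, lam):
    """Coefficient-wise minimizer of (c_f - c)^2 + lam*|c|:
    kill coefficients with |c_f| <= lam/2, shrink the rest toward
    zero by lam/2."""
    t = lam / 2.0
    return np.sign(c_f) * np.maximum(np.abs(c_f) - t, 0.0)
```

One can check that this single expression reproduces all three branches of the piecewise formula.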
DeVore & Lucier (1992) . . .
[Figure: (a) Noisy Lena (b) Linear (c) Nonlinear]
DeVore & Lucier (1992) . . .
In the Besov space language:
‖c[u]‖_{ℓ¹} ≍ ‖u‖_{B^{1,1}_1(Ω)}.
Thus, the constraint in the optimization is ‖u‖_{B^{1,1}_1(Ω)} < C.
If v is WGN with mean 0 and variance σ², then the choice of λ ≈ const · (σ/N) √(log N²), where N is the number of samples in each direction in Ω.
Lots of effort has gone into deriving (near-)optimal thresholds, e.g., DeVore-Lucier, Donoho-Johnstone, and others, most recently Johnstone-Silverman.
Besov Spaces
Let f ∈ L^p(Ω), 0 < p ≤ ∞. Let 0 < α < ∞ and 0 < q ≤ ∞.
Then, roughly speaking, functions belonging to the homogeneous Besov space Ḃ^{α,q}_p(Ω) have “α derivatives” measured in L^p(Ω). The parameter q makes finer distinctions in smoothness
The inhomogeneous Besov space:
B^{α,q}_p(Ω) = Ḃ^{α,q}_p(Ω) ∩ L^p(Ω)
‖f‖_{B^{α,q}_p(Ω)} = ‖f‖_{L^p(Ω)} + ‖f‖_{Ḃ^{α,q}_p(Ω)}
Besov Spaces . . .
Generalization of Lipschitz/Hölder and L²-Sobolev spaces because:
B^{α,2}_2(Ω) = W^{α,2}(Ω) = H^α(Ω)
B^{α,∞}_∞(Ω) = Λ^α(Ω) = C^α(Ω)
Easy to characterize via wavelet coefficients:
‖f‖_{B^{α,τ}_τ} ≍ ‖c[f]‖_{ℓ^τ} for τ = 2/(1 + α) in 2D.
Besov Spaces . . .
Thus, in the specific DL model with α = 1 = τ, we have:
‖u‖_{B^{1,1}_1} ≍ ‖c[u]‖_{ℓ¹}.
This norm equivalence means that the DL model is really seeking a function whose wavelet expansion is sparse, since ‖c[u]‖_{ℓ¹} < C is a form of sparsity constraint.
Sparsity via `p (quasi-) norm (0 < p ≤ 1)
Consider a vector or sequence x = (x_j)_{j∈N}.
Then consider the so-called ℓ⁰ (quasi-)norm as the measure of the sparsity of x:
‖x‖_{ℓ⁰} = #{j ∈ N : x_j ≠ 0}.
This counts the number of nonzero components of x.
Thus, under, say, ‖x‖_{ℓ²} = 1, the smaller ‖x‖_{ℓ⁰} is, the sparser x is; a precise definition of sparsity.
However, this norm is too fragile to use (e.g., sensitive to noise).
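The fragility is easy to see numerically: an arbitrarily small perturbation drives ‖x‖_{ℓ⁰} of a sparse vector to its maximum possible value. A toy illustration (the vector and noise level are made up):

```python
import numpy as np

x = np.zeros(100)
x[:3] = [5.0, -2.0, 1.0]            # a 3-sparse vector
l0 = int(np.count_nonzero(x))       # ||x||_0 = 3

# a tiny perturbation destroys the nonzero count entirely
noise = 1e-9 * np.random.default_rng(0).standard_normal(100)
l0_noisy = int(np.count_nonzero(x + noise))   # jumps to 100
```

With probability one, every entry of the perturbed vector is nonzero, so the ℓ⁰ count carries no information about the underlying sparsity once any noise is present.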
Sparsity via `p (quasi-) norm (0 < p ≤ 1) . . .
Thus, consider the ℓ^p (quasi-)norm, 0 < p ≤ 1, instead:
‖x‖_{ℓ^p} = ( Σ_{j∈N} |x_j|^p )^{1/p}.
Instead of explicitly counting the number of nonzeros in the sequence via the ℓ⁰ quasi-norm, we can say:
‖x‖_{ℓ^p} ≤ C ⟹ |x|_(k) ≤ C k^{−1/p},
(∵ k |x|^p_(k) ≤ Σ_{j=1}^{k} |x|^p_(j) ≤ C^p), which relates sparsity to the decay of the magnitudes of the rearranged sequence. The smaller p, the faster the decay.
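The implication ‖x‖_{ℓ^p} ≤ C ⟹ |x|_(k) ≤ C k^{−1/p} can be checked mechanically for any concrete vector, taking C = ‖x‖_{ℓ^p} itself. A small sketch (the function name is ours):

```python
import numpy as np

def decay_bound_holds(x, p):
    """Check |x|_(k) <= ||x||_p * k**(-1/p) for all k, i.e., the
    sparsity-to-decay implication for the l^p (quasi-)norm."""
    mags = np.sort(np.abs(x))[::-1]           # nonincreasing rearrangement
    C = np.sum(np.abs(x) ** p) ** (1.0 / p)   # l^p (quasi-)norm
    k = np.arange(1, len(mags) + 1)
    return bool(np.all(mags <= C * k ** (-1.0 / p) + 1e-12))
```

The bound holds for every vector and every 0 < p ≤ 1 by the chain of inequalities above; running it on random data is just a sanity check of the algebra.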
Sparsity via `p (quasi-) norm (0 < p ≤ 1) . . .
wℓ^p, i.e., the weak ℓ^p space, is defined as
wℓ^p(N) ≜ { x = (x₁, x₂, . . .) : |x|_(k) ≤ C k^{−1/p}, ∃C > 0, ∀k ∈ N }.
Clearly, ℓ^p ⊊ wℓ^p; e.g., x_n = n^{−1/p} ∈ wℓ^p, but not in ℓ^p.
Later, wℓ¹ will be used to “almost” characterize the space BV(Ω).
Disadvantages of B^{1,1}_1
Even very simple cartoon-like images such as χ_E(x), where ∂E is a smooth closed curve, do not belong to B^{1,1}_1
Oscillatory patterns do not belong to B^{1,1}_1 either
Disadvantages of B^{1,1}_1 . . .
Choi and Baraniuk on the Besov spaces vs. wavelet-based statistical models (the parameters of the generalized Gaussian distribution ⇐⇒ the Besov parameters)
[Figure: (a) Original u (b) Random shuffles of c_{j,·} (c) Random sign flips of c]
Basis Pursuit Denoising (Chen, Donoho, & Saunders, 1995)
In the discrete setting, assume that f = u + v ∈ R^n, where v = σz is a WGN vector with variance σ². The functional is similar to J_DL:
J_BP(u) = ‖f − u‖²_{ℓ²} + λ ‖α[u]‖_{ℓ¹}.
The coefficient vector α[u] ∈ R^p is of the form:
u = Σ_{γ∈Γ} α_γ[u] φ_γ,
where {φ_γ}_{γ∈Γ}, |Γ| = p ≥ n, is a dictionary of bases (i.e., redundant), such as stationary wavelets, wavelet packets, local Fourier bases, or other frames.
Basis Pursuit Denoising . . .
Choice of λ = σ √(2 log p).
Solution of this convex, non-quadratic optimization by “primal-dual log-barrier linear programming”
Specific combinations of basis dictionaries, the uncertainty principle, equivalence with the ℓ⁰ minimization problem ⟹ ask Donoho, Candès, Huo, Elad, Starck, who are all participating in this program!
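Chen-Donoho-Saunders solve the problem by primal-dual log-barrier linear programming; a much simpler (if slower) way to minimize the same functional J_BP is iterative soft thresholding. The sketch below is that alternative, with a made-up random dictionary and our own parameter choices:

```python
import numpy as np

def ista(Phi, f, lam=0.1, n_iter=500):
    """Iterative soft-thresholding for ||f - Phi a||_2^2 + lam*||a||_1,
    a simple proximal-gradient alternative to the log-barrier LP."""
    L = np.linalg.norm(Phi, 2) ** 2           # 2L bounds the fidelity gradient's Lipschitz constant
    a = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        a = a + Phi.T @ (f - Phi @ a) / L     # gradient step on the fidelity term
        a = np.sign(a) * np.maximum(np.abs(a) - lam / (2 * L), 0.0)  # shrink
    return a
```

Each iteration alternates a gradient step on the quadratic fidelity term with the soft-thresholding proximal step for the ℓ¹ penalty, monotonically decreasing J_BP.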
Sparsity vs. Statistical Independence
Independent Component Analysis
Stochastic setting; a collection of images
Can find a basis (the LSDB) that provides the least statistical dependence (the best one) out of the wavelet packet library or local Fourier library
Better off pursuing sparsity than independence, except for problems that really have statistically independent sources
Read my articles as well as Donoho & Flesia for more info.
BV(Ω): Functions of Bounded Variation
Definition of total variation:
|u|_{BV(Ω)} ≜ sup { ∫_Ω u ∇·g dx : g ∈ C¹_c(Ω; R²), |g(x)| ≤ 1 ∀x ∈ Ω }
BV(Ω) ⊂ L¹(Ω) is a Banach space with the norm:
‖u‖_{BV(Ω)} = ‖u‖_{L¹(Ω)} + |u|_{BV(Ω)}.
If u ∈ W^{1,1}(Ω) ⊂ BV(Ω), then
|u|_{BV(Ω)} = ∫_Ω |∇u(x)| dx,
via integration by parts.
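A discrete analogue makes the geometry concrete: the (anisotropic) total variation of an indicator image χ_E equals the perimeter of E, which is exactly why BV admits cartoon images. A small numpy sketch (discrete_tv and the grid sizes are our own illustrative choices):

```python
import numpy as np

def discrete_tv(u):
    """Anisotropic discrete total variation: sum of absolute
    forward differences in x and y (a toy stand-in for |u|_BV)."""
    return np.abs(np.diff(u, axis=0)).sum() + np.abs(np.diff(u, axis=1)).sum()

# indicator of a 10x10 square inside a 32x32 grid
chi_E = np.zeros((32, 32))
chi_E[8:18, 8:18] = 1.0
# its discrete TV equals the square's perimeter: 4 * 10 = 40
```

Each row crossing the square contributes two unit jumps (entering and leaving E), and likewise each column, so the sum of jumps reproduces the perimeter H¹(∂E).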
BV (Ω) . . .
In this tutorial, Ω ⊂ R² is bounded. Thus,
W^{1,1}(Ω) ⊂ BV(Ω) ⊂ L²(Ω) ⊂ L¹(Ω).
Unfortunately, W^{1,1}(Ω) does not contain cartoon-like images such as χ_E(x), where E ⊂ Ω and ∂E is smooth, whereas BV(Ω) does.
A minimizer u* exists in BV(Ω) for the corrected version of J_ROF(u) (Vese, 2001).
However, the original version of the ROF criterion does not guarantee getting (u, v) = (χ_E, 0) from f = χ_E (Y. Meyer, 2001).
Besov vs BV
Embedding (DeVore-Lucier, Donoho, Meyer, . . . ):
B^{1,1}_1(Ω) ⊂ BV(Ω) ⊂ B^{1,∞}_1(Ω)
B^{1,1}_1 does not contain cartoon-like images, while BV(Ω) does.
On the other hand, BV(Ω) does not possess any unconditional basis, while B^{1,1}_1 does.
Unconditional Bases
Let {φ_ν}_{ν∈Γ} be a basis for a Banach space B.
Let f = Σ_{ν∈Γ} c_ν φ_ν ∈ B.
Let ‖f‖_B be a functional norm of f ∈ B and let ‖c[f]‖_b be a discrete sequence norm of c[f] = (c_ν[f]).
Suppose ‖f‖_B ≍ ‖c[f]‖_b.
This is already a nontrivial condition because . . .
Fourier is not an unconditional basis for L^p(T), p ≠ 2
Let f ∈ B = L^p(T), T = [0, 2π), and φ_ν(x) = e^{iνx}. Let c_ν[f] be the Fourier coefficients of f.
Then, ‖f‖_{L²(T)} = ‖c[f]‖_{ℓ²(Z)} (Plancherel)
However, ‖c[f]‖_{ℓ^p(Z)} does not tell us anything about ‖f‖_{L^p(T)} if p ≠ 2.
For example, ‖f‖_{L⁴(T)} tells you some info about the distribution of the energy of f over T (∼ kurtosis).
However, |c_ν[f]| does not tell you anything about ‖f‖_{L⁴(T)}.
⟹ the Littlewood-Paley theory
Unconditional Bases . . .
Then, {φ_ν}_{ν∈Γ} is called an unconditional basis of B if any sequence c = (c_ν) satisfying |c_ν| ≤ |c_ν[f]|, ∀ν ∈ Γ, yields a new function f̃ = Σ_{ν∈Γ} c_ν φ_ν that belongs to B.
In other words, operations on the coefficients, such as shrinking or sign flips, do not change membership in B.
Examples: Fourier: L²; wavelets: L^p, 1 < p < ∞, and B^{α,q}_p, α > 0, 1 ≤ p, q ≤ ∞, . . .
Unconditional Bases . . .
Advantages: {φ_ν}_{ν∈Γ} ⇐⇒ axes of symmetry of the ball in B, e.g., {‖f‖_B < C}.
“Rotation” into a coordinate system where the norm is “diagonalized,” even if the norm is not quadratic.
Read articles by Donoho as well as Meyer’s books!
Cohen, Dahmen, Daubechies, DeVore, Meyer, Petrushev, & Xu
BV(Ω) does not have an unconditional basis; but we can say the following:
u ∈ BV(Ω) ⟹ c[u] ∈ wℓ¹(Γ).
In other words, the sorted wavelet coefficients decay as O(k^{−1}).
This implies that the k-term approximation error of a BV function using wavelets is of O(k^{−1/2}).
Cohen, Dahmen, Daubechies, DeVore, Meyer, Petrushev, & Xu . . .
Embeddings in the sequence spaces:
ℓ¹(Γ) ⊂ bv(Γ) ⊂ wℓ¹(Γ),
where bv(Γ) is the space of vectors consisting of the wavelet coefficients of BV(Ω) functions, and its norm is defined to be the BV norm of the corresponding function.
This allows wavelet shrinkage on the coefficients.
Cohen, Dahmen, Daubechies, DeVore, Meyer, Petrushev, & Xu . . .
The Haar case by C-De-P-X (1998), and the general wavelet case by Meyer (1998).
The stronger versions by C-Dah-Dau-De (2000).
Of course, using other methods such as ridgelets and curvelets, one can get better decay ⟹ lectures by Donoho and Candès tomorrow
BV for Image Modeling?
Study of Gousseau & Morel
BV(Ω) may be well adapted for large-scale geometric structures
But natural images are not in BV(Ω).
∵ Natural images often contain too many small objects and textures ⟹ the sum of the lengths of the perimeters of the level sets may blow up, i.e., ∫_{−∞}^{∞} H¹(∂E_t[u]) dt = ∞.
BV for Image Modeling? . . .
[Figure: level sets of a natural image, panels (a)–(j), for the gray-level ranges (23,43), (43,63), (63,83), (83,103), (103,123), (123,143), (143,163), (163,183), (183,203), (203,226).]
Very Briefly, Meyer, Vese, & Osher (2002)
All the previous models with ‖f − u‖²_{L²(Ω)} anticipated WGN for the v component; they are also very much related to statistical estimation methods such as MLE, Bayes, MDL, etc., with prior information on the u component.
Improve the ROF model by changing the L² norm of the v = f − u component:
J_MVO(u) = ‖f − u‖_{G(Ω)} + λ |u|_{BV(Ω)},
where G(Ω) is a dual space of W^{1,1}(Ω).
Very Briefly, Meyer, Vese, & Osher (2002) . . .
G(Ω) contains oscillatory patterns (textures).
Y. Meyer’s book for the precise definition of G(Ω)
Vese-Osher (2003) for a numerical algorithm.
Compare with the ROF model . . .
Summary
Reviewed u + v models
The u component is often modeled by the constraint ‖u‖_B < C for some function space B.
This constraint corresponds to the prior information in Bayesian statistics and the MDL formalism, and to the regularization term in inverse problems.
The v component is often assumed to be i.i.d. WGN, yielding the L² fidelity term in the functional to be optimized ⟹ not good for texture
Summary . . .
Wavelet shrinkage: works very well for B = B^{1,1}_1, reasonably well for B = BV, and is computationally very fast.
PDE-based approach: works well for B = W^{1,1}, is more computationally intensive, but allows more flexible modeling via non-L² error criteria for v.
My Comments
Use of function spaces for image modeling:
Mathematically sound
Can get deep results
Extremely hard to find a good one for natural images
Use of orthonormal bases:
Mathematically tractable
Good affinity with function spaces
Fast algorithms
But too restrictive
My Comments . . .
Use of overcomplete dictionaries
Mathematically more challenging
Can get better results
Can develop fast algorithms
Still in the form of linear combinations in most cases
Sep. 2004 – p.57
My Comments . . .
‖u‖_B < C is mathematically great, but restrictive for image modeling
Need more interaction with the stochastic modeling community
Explore more about wavelet shrinkage after an invertible transform à la Harlan-Claerbout-Rocca
Yves Meyer: “Sparsity does not open the gate tofeature extraction.”
My reaction: “Sparsity can still open the gate to featureextraction.”
References: Books and Survey Articles
R. DeVore: “Nonlinear approximation,” in Acta Numerica, Cambridge Univ. Press, 1998.
D. Donoho, M. Vetterli, R. DeVore, & I. Daubechies: “Data compression and harmonic analysis,” IEEE Trans. Info. Theory, vol.44, pp.2435–2476, 1998.
S. Jaffard, Y. Meyer, & R. D. Ryan: Wavelets: Tools for Science & Technology, SIAM, 2001.
S. Mallat: A Wavelet Tour of Signal Processing, 2nd ed., Academic Press, 1999.
Y. Meyer: Oscillating Patterns in Image Processing and Nonlinear Evolution Equations, University Lecture Series Vol.22, AMS, 2001.
References: Articles
B. Bénichou & N. Saito: “Sparsity vs. statistical independence in adaptive signal representations: A case study of the spike process,” in Beyond Wavelets (G. V. Welland, ed.), Chap.9, pp.225–257, Academic Press, 2003.
H. Choi & R. Baraniuk: “Wavelet statistical models and Besov spaces,” in Nonlinear Estimation and Classification (D. Denison, ed.), Springer-Verlag, 2003.
A. Cohen, R. DeVore, P. Petrushev, & H. Xu: “Nonlinear approximation and the space BV(R²),” Amer. J. Math., vol.121, pp.587–628, 1999.
References: Articles . . .
A. Cohen, W. Dahmen, I. Daubechies, & R. DeVore: “Harmonic analysis of the space BV,” Revista Matemática Iberoamericana, vol.19, pp.235–263, 2003.
S. Chen, D. L. Donoho, & M. A. Saunders: “Atomic decomposition by basis pursuit,” SIAM J. Sci. Comput., vol.20, pp.33–61, 1999.
R. A. DeVore & B. J. Lucier: “Fast wavelet techniques for near-optimal image processing,” IEEE Military Communications Conference Record, pp.1129–1135, 1992.
D. L. Donoho: “Unconditional bases are optimal bases for data compression and for statistical estimation,” Appl. Comput. Harm. Anal., vol.1, pp.100–115, 1993.
References: Articles . . .
D. L. Donoho: “Sparse components of images and optimal atomic decomposition,” Constr. Approx., vol.17, pp.353–382, 2001.
D. L. Donoho & A. G. Flesia: “Can recent innovations in harmonic analysis ‘explain’ key findings in natural image statistics?” Network: Comput. Neural Syst., vol.12, pp.371–393, 2001.
Y. Gousseau & J.-M. Morel: “Are natural images of bounded variation?” SIAM J. Math. Anal., vol.33, pp.634–648, 2001.
W. S. Harlan, J. F. Claerbout, & F. Rocca: “Signal/noise separation and velocity estimation,” Geophysics, vol.49, pp.1869–1880, 1984.
References: Articles . . .
D. Mumford & J. Shah: “Boundary detection by minimizing functionals, I,” IEEE Conf. Computer Vision & Pattern Recognition, pp.22–26, 1985.
L. Rudin, S. Osher, & E. Fatemi: “Nonlinear total variation based noise removal algorithms,” Physica D, vol.60, pp.259–268, 1992.
N. Saito: “Simultaneous noise suppression and signal compression using a library of orthonormal bases and the minimum description length criterion,” in Wavelets in Geophysics (E. Foufoula-Georgiou and P. Kumar, eds.), Chap.XI, pp.299–324, Academic Press, 1994.
References: Articles . . .
N. Saito: “Image approximation and modeling via least statistically-dependent bases,” Pattern Recognition, vol.34, pp.1765–1784, 2001.
N. Saito: “The generalized spike process, sparsity, and statistical independence,” in Modern Signal Processing (D. Rockmore and D. Healy, Jr., eds.), MSRI Publications, vol.46, pp.317–340, Cambridge Univ. Press, 2004.
L. Vese: “A study in the BV space of a denoising-deblurring variational problem,” Applied Mathematics & Optimization, vol.44, pp.131–161, 2001.
References: Articles . . .
L. Vese & S. J. Osher: “Modeling textures with total variation minimization and oscillating patterns in image processing,” J. Sci. Comput., vol.19, pp.553–572, 2003.
S. Watanabe: “Pattern recognition as a quest for minimum entropy,” Pattern Recognition, vol.13, no.5, pp.381–387, 1981.