IPAM MGA Tutorial on Feature Extraction and Denoising: A Saga of u + v Models
Naoki Saito
http://www.math.ucdavis.edu/~saito/
Department of Mathematics
University of California, Davis
Sep. 2004 – p.1
Outline
What Are Features and What Are Noise?
Some History
u + v Models
Very Briefly, Harlan-Claerbout-Rocca Model
Mumford-Shah Model
Rudin-Osher-Fatemi Model and Total Variation
DeVore-Lucier Model and Besov Spaces
Sparsity
Outline . . .
Basis Pursuit Denoising
Sparsity vs. Statistical Independence
BV via Cohen-Dahmen-Daubechies-DeVore
BV vs Besov
Very Briefly, Meyer-Vese-Osher Model for Texture
My Comments and Summary
Acknowledgment
Yves Meyer
Hyeokho Choi (Rice)
Other authors of articles
IPAM/UCLA
NSF & ONR
What Are Features and What Are Noise?
To answer those questions, we need to specify our aim:
Approximation
Compression
Noise Removal (Denoising)
Object Detection
Classification/Discrimination
Regression
Some History
Satosi Watanabe (circa 1981) characterized pattern recognition as a quest for minimum entropy by saying, “the essential nature of pattern recognition is . . . a conceptual adaptation to the empirical data in order to see a form in them. The form means a structure which always entails small entropy values.”
Raphy Coifman (circa 1991) suggested that “noise” should be defined as the incoherent components of data relative to a waveform library used to represent the data, whereas “signal” or “features” are the coherent components. coherent ≈ sparse ≈ focused
u + v Models
Yves Meyer (2001) describes the so-called u + v models
The u component is aimed at modeling the objects or important features
The v component represents textures and noise
The refined model is the u + v + w model, where v and w represent textures and noise, respectively.
Examples include:
Harlan, Claerbout, & Rocca (1984)
Mumford & Shah (1985)
Rudin, Osher, & Fatemi (1992)
u + v Models . . .
DeVore & Lucier (1992)
Chen, Donoho, & Saunders (1995)
Olshausen & Field (1996)
Coifman & Sowa (1998)
Donoho, Huo, & Starck (2000)
Cohen, Dahmen, Daubechies, DeVore (2000)
Meyer, Vese, Osher (2002)
many others . . .
Common Intuitions
Should be able to represent recognizable patterns/structures in data efficiently and compactly via some invertible transform
u = signal ≈ features ⇐⇒ sharply focused ≈ sparse
v = noise ⇐⇒ defocused/diffused
What is u?
Requires some regularity (e.g., smoothness), i.e., ‖u‖_B < C, where B is some appropriate function space and C > 0.
A more general approach: ‖Au‖_{B′} < C, where A : B → B′ is some invertible transform (e.g., A = the Radon transform).
An important (modeling) problem is what B should be for various natural images.
Various Viewpoints
Harmonic Analysis approach (Cohen, Coifman, Daubechies, DeVore, Donoho, Meyer, . . . )
PDE approach (Chan, Osher, Meyer, Morel, Sapiro, Vese, . . . )
Deterministic approach (Cohen, DeVore, Donoho, Osher, Terzopoulos, . . . )
Stochastic approach (Mumford, Grenander, Donoho, Zhu, Wu, . . . )
Highly active area with more and more interactions among the various schools
Very Briefly, Harlan, Claerbout, & Rocca (1984)
A stacked seismic section = the sum of:
Geologic component ≈ linear events
Diffraction component ≈ hyperbolic events
Noise component ≈ white Gaussian noise + α.
Very Briefly, Harlan, Claerbout, & Rocca (1984) . . .
[Figure: (a) Original (b) Geology (c) Diffraction (d) Noise]
Mumford & Shah (1985)
Motivation: simultaneous image segmentation and denoising
Let Ω = [0, 1] × [0, 1] ⊂ R²
Mumford & Shah (1985) . . .
The u component is smooth everywhere except on a compact set K ⊂ Ω, which is unknown.
Find u and K from the data f = u + v by minimizing:
J_MS(u, K) = ∫_Ω |f(x) − u(x)|² dx + λ ∫_{Ω\K} |∇u(x)|² dx + µ H¹(K),
where λ, µ are positive weights and H¹(K) is the 1D Hausdorff measure (total length) of K.
The three terms measure fidelity, smoothness of u, and simplicity of K, respectively.
Mumford & Shah (1985) . . .
Since u|_{Ω\K} ∈ H¹(Ω \ K) = W^{1,2}(Ω \ K), the objective is: find u ∈ L²(Ω) and K ⊂ Ω s.t.
inf_{u∈L²(Ω)} ‖f − u‖_{L²(Ω)} subject to ‖u‖_{H¹(Ω\K)} < C and H¹(K) < C′.
Mumford & Shah (1985) . . .
Very influential in Computer Vision; refined by Ivan Leclerc using the MDL formalism; faster numerical algorithm called the ‘Graduated Non-Convexity’ (GNC) algorithm by Andrew Blake and Andrew Zisserman, . . .
Numerical optimization was and still is an issue
Choice of λ and µ
Representation/basis functions for u were not used
Rudin, Osher, & Fatemi (1992)
Motivation: image enhancement and denoising
The MS functional is modified to:
J_ROF(u) = ∫_Ω |f(x) − u(x)|² dx + λ ∫_Ω |∇u(x)| dx
         = ‖f − u‖²_{L²(Ω)} + λ |u|_{BV(Ω)},
where |u|_{BV(Ω)} is the so-called total variation of u.
In other words, find u ∈ L²(Ω) s.t.
inf_{u∈L²(Ω)} ‖f − u‖_{L²(Ω)} subject to |u|_{BV(Ω)} < C.
Rudin, Osher, & Fatemi (1992) . . .
Solve the following Euler-Lagrange equation
u = f + (λ/2) ∇·(∇u/|∇u|),
by forming the evolution equation and computing the solution as t → ∞.
The coarea formula links MS to ROF: if |u|_{BV(Ω)} < ∞,
|u|_{BV(Ω)} = ∫_{−∞}^{∞} H¹(∂E_t[u]) dt,
where E_t[u] ≜ {x ∈ Rⁿ : u(x) > t} is the level set of u at t and ∂E_t[u] is its perimeter.
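The evolution-equation idea can be sketched numerically. Below is a toy explicit gradient-descent scheme for the ROF functional, not the original Rudin-Osher-Fatemi time-marching code: the step size tau, the smoothing parameter eps (which regularizes |∇u| near zero), and the periodic boundary handling via np.roll are all illustrative choices.

```python
import numpy as np

def tv_denoise(f, lam=0.1, tau=0.05, eps=1e-3, n_iter=200):
    """Explicit gradient descent on ||f - u||^2 + lam * TV(u),
    with |grad u| smoothed by eps to avoid division by zero."""
    u = f.copy()
    for _ in range(n_iter):
        # forward differences (periodic boundaries)
        ux = np.roll(u, -1, axis=1) - u
        uy = np.roll(u, -1, axis=0) - u
        mag = np.sqrt(ux**2 + uy**2 + eps**2)
        # divergence of the unit gradient field grad(u)/|grad(u)|
        px, py = ux / mag, uy / mag
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        # descent step: fidelity pulls toward f, curvature term smooths
        u = u + tau * (2.0 * (f - u) + lam * div)
    return u
```

Starting from u = f, the curvature term drives the total variation of u down while the fidelity term keeps u near the data, mirroring the t → ∞ behavior described above.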
Rudin, Osher, & Fatemi (1992) . . .
Boundaries K are not explicit.
Choice of λ: dynamic, i.e., λ(t)
Strictly speaking, total variation is defined more generally, and |u|_{BV(Ω)} = ∫_Ω |∇u| dx is true for u ∈ L¹₁(Ω) = W^{1,1}(Ω) ⊂ BV(Ω).
More about BV(Ω) (the space of functions of bounded variation) will be discussed later.
Comments on MS and ROF models
Very influential; generated a new field, “PDE-based image processing”
Characterization of the constraint space via basis functions was not used
Improved over the years; yet still computationally intensive
DeVore & Lucier (1992)
Now, the functional becomes:
J_DL(u) = ‖f − u‖²_{L²(Ω)} + λ ‖c[u]‖_{ℓ¹},
where c[u] = (c_ν[u])_{ν∈Γ} are the expansion coefficients of u relative to an orthonormal wavelet basis {ψ_ν}_{ν∈Γ}, i.e., c_ν[u] = ⟨u, ψ_ν⟩.
In other words, find u ∈ L²(Ω) s.t.
inf_{u∈L²(Ω)} ‖f − u‖_{L²(Ω)} subject to ‖c[u]‖_{ℓ¹} < C.
DeVore & Lucier (1992) . . .
Optimization leads to soft thresholding (or wavelet shrinkage) of the empirical wavelet coefficients
Let f(x) = Σ_{ν∈Γ} c_ν[f] ψ_ν(x). Then,
J_DL(u) = Σ_{ν∈Γ} ( (c_ν[f] − c_ν[u])² + λ |c_ν[u]| ),
whose minimization leads to:
c_ν[u] = c_ν[f] + λ/2 if c_ν[f] < −λ/2,
c_ν[u] = 0 if |c_ν[f]| ≤ λ/2,
c_ν[u] = c_ν[f] − λ/2 if c_ν[f] > λ/2.
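The three cases above are exactly coefficient-wise soft thresholding at level λ/2, which takes only a couple of lines; a minimal numpy sketch (the function name is ours):

```python
import numpy as np

def soft_threshold(c_f, lam):
    """Coefficient-wise minimizer of (c_f - c)^2 + lam*|c|:
    kill coefficients with |c_f| <= lam/2, shrink the rest toward
    zero by lam/2."""
    t = lam / 2.0
    return np.sign(c_f) * np.maximum(np.abs(c_f) - t, 0.0)
```

One can check that this single expression reproduces all three branches of the piecewise formula.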
DeVore & Lucier (1992) . . .
[Figure: (a) Noisy Lena (b) Linear (c) Nonlinear]
DeVore & Lucier (1992) . . .
In the Besov space language:
‖c[u]‖_{ℓ¹} ≍ ‖u‖_{B^{1,1}_1(Ω)}.
Thus, the constraint in the optimization is ‖u‖_{B^{1,1}_1(Ω)} < C.
If v is WGN with mean 0 and variance σ², then the choice of λ ≈ const · (σ/N) √(log N²), where N is the number of samples in each direction in Ω.
Lots of effort has gone into deriving (near-)optimal thresholds, e.g., DeVore-Lucier, Donoho-Johnstone, and others, most recently Johnstone-Silverman.
Besov Spaces
Let f ∈ L^p(Ω), 0 < p ≤ ∞. Let 0 < α < ∞ and 0 < q ≤ ∞.
Then, roughly speaking, functions belonging to the homogeneous Besov space Ḃ^{α,q}_p(Ω) have “α derivatives” measured in L^p(Ω). The parameter q makes finer distinctions in smoothness
The inhomogeneous Besov space:
B^{α,q}_p(Ω) = Ḃ^{α,q}_p(Ω) ∩ L^p(Ω)
‖f‖_{B^{α,q}_p(Ω)} = ‖f‖_{L^p(Ω)} + ‖f‖_{Ḃ^{α,q}_p(Ω)}
Besov Spaces . . .
Generalization of Lipschitz/Hölder and L²-Sobolev spaces because:
B^{α,2}_2(Ω) = W^{α,2}(Ω) = H^α(Ω)
B^{α,∞}_∞(Ω) = Λ^α(Ω) = C^α(Ω)
Easy to characterize via wavelet coefficients:
‖f‖_{B^{α,τ}_τ} ≍ ‖c[f]‖_{ℓ^τ} for τ = 2/(1 + α) in 2D.
Besov Spaces . . .
Thus, in the specific DL model with α = 1 = τ, we have:
‖u‖_{B^{1,1}_1} ≍ ‖c[u]‖_{ℓ¹}.
This norm equivalence means that the DL model is really seeking a function whose wavelet expansion is sparse, since ‖c[u]‖_{ℓ¹} < C is a form of sparsity constraint.
Sparsity via `p (quasi-) norm (0 < p ≤ 1)
Consider a vector or sequence x = (x_j)_{j∈N}.
Then consider the so-called ℓ⁰ (quasi-)norm as the measure of the sparsity of x:
‖x‖_{ℓ⁰} = #{j ∈ N : x_j ≠ 0}.
This counts the number of nonzero components of x.
Thus, under, say, ‖x‖_{ℓ²} = 1, the smaller ‖x‖_{ℓ⁰} is, the sparser x is; a precise definition of sparsity.
However, this norm is too fragile to use (e.g., sensitive to noise).
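The fragility is easy to see numerically: an arbitrarily small perturbation drives ‖x‖_{ℓ⁰} of a sparse vector to its maximum possible value. A toy illustration (the vector and noise level are made up):

```python
import numpy as np

x = np.zeros(100)
x[:3] = [5.0, -2.0, 1.0]            # a 3-sparse vector
l0 = int(np.count_nonzero(x))       # ||x||_0 = 3

# a tiny perturbation destroys the nonzero count entirely
noise = 1e-9 * np.random.default_rng(0).standard_normal(100)
l0_noisy = int(np.count_nonzero(x + noise))   # jumps to 100
```

With probability one, every entry of the perturbed vector is nonzero, so the ℓ⁰ count carries no information about the underlying sparsity once any noise is present.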
Sparsity via `p (quasi-) norm (0 < p ≤ 1) . . .
Thus, consider the ℓ^p (quasi-)norm, 0 < p ≤ 1, instead:
‖x‖_{ℓ^p} = ( Σ_{j∈N} |x_j|^p )^{1/p}.
Instead of explicitly counting the number of nonzeros in the sequence via the ℓ⁰ quasi-norm, we can say:
‖x‖_{ℓ^p} ≤ C ⟹ |x|_(k) ≤ C k^{−1/p},
(∵ k |x|^p_(k) ≤ Σ_{j=1}^{k} |x|^p_(j) ≤ C^p), which relates sparsity to the decay of the magnitudes of the rearranged sequence. The smaller p, the faster the decay.
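The implication ‖x‖_{ℓ^p} ≤ C ⟹ |x|_(k) ≤ C k^{−1/p} can be checked mechanically for any concrete vector, taking C = ‖x‖_{ℓ^p} itself. A small sketch (the function name is ours):

```python
import numpy as np

def decay_bound_holds(x, p):
    """Check |x|_(k) <= ||x||_p * k**(-1/p) for all k, i.e., the
    sparsity-to-decay implication for the l^p (quasi-)norm."""
    mags = np.sort(np.abs(x))[::-1]           # nonincreasing rearrangement
    C = np.sum(np.abs(x) ** p) ** (1.0 / p)   # l^p (quasi-)norm
    k = np.arange(1, len(mags) + 1)
    return bool(np.all(mags <= C * k ** (-1.0 / p) + 1e-12))
```

The bound holds for every vector and every 0 < p ≤ 1 by the chain of inequalities above; running it on random data is just a sanity check of the algebra.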
Sparsity via `p (quasi-) norm (0 < p ≤ 1) . . .
wℓ^p, i.e., the weak ℓ^p space, is defined as
wℓ^p(N) ≜ { x = (x₁, x₂, . . .) : |x|_(k) ≤ C k^{−1/p}, ∃C > 0, ∀k ∈ N }.
Clearly, ℓ^p ⊊ wℓ^p; e.g., x_n = n^{−1/p} ∈ wℓ^p, but not in ℓ^p.
Later, wℓ¹ will be used to “almost” characterize the space BV(Ω).
Disadvantages of B^{1,1}_1
Even very simple cartoon-like images such as χ_E(x), where ∂E is a smooth closed curve, do not belong to B^{1,1}_1
Oscillatory patterns do not belong to B^{1,1}_1 either
Disadvantages of B^{1,1}_1 . . .
Choi and Baraniuk on the Besov spaces vs. wavelet-based statistical models (the parameters of the generalized Gaussian distribution ⇐⇒ the Besov parameters)
[Figure: (a) Original u (b) Random shuffles of c_{j,·} (c) Random sign flips of c]
Basis Pursuit Denoising (Chen, Donoho, & Saunders, 1995)
In the discrete setting, assume that f = u + v ∈ R^n, where v = σz is a WGN vector with variance σ². The functional is similar to J_DL:
J_BP(u) = ‖f − u‖²_{ℓ²} + λ ‖α[u]‖_{ℓ¹}.
The coefficient vector α[u] ∈ R^p is of the form:
u = Σ_{γ∈Γ} α_γ[u] φ_γ,
where {φ_γ}_{γ∈Γ}, |Γ| = p ≥ n, is a dictionary of bases (i.e., redundant), such as stationary wavelets, wavelet packets, local Fourier bases, or other frames.
Basis Pursuit Denoising . . .
Choice of λ = σ √(2 log p).
Solution of this convex, non-quadratic optimization by “primal-dual log-barrier linear programming”
Specific combinations of basis dictionaries, the uncertainty principle, equivalence with the ℓ⁰ minimization problem ⟹ ask Donoho, Candès, Huo, Elad, Starck, who are all participating in this program!
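Chen-Donoho-Saunders solve the problem by primal-dual log-barrier linear programming; a much simpler (if slower) way to minimize the same functional J_BP is iterative soft thresholding. The sketch below is that alternative, with a made-up random dictionary and our own parameter choices:

```python
import numpy as np

def ista(Phi, f, lam=0.1, n_iter=500):
    """Iterative soft-thresholding for ||f - Phi a||_2^2 + lam*||a||_1,
    a simple proximal-gradient alternative to the log-barrier LP."""
    L = np.linalg.norm(Phi, 2) ** 2           # 2L bounds the fidelity gradient's Lipschitz constant
    a = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        a = a + Phi.T @ (f - Phi @ a) / L     # gradient step on the fidelity term
        a = np.sign(a) * np.maximum(np.abs(a) - lam / (2 * L), 0.0)  # shrink
    return a
```

Each iteration alternates a gradient step on the quadratic fidelity term with the soft-thresholding proximal step for the ℓ¹ penalty, monotonically decreasing J_BP.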
Sparsity vs. Statistical Independence
Independent Component Analysis
Stochastic setting; a collection of images
Can find a basis (the LSDB) that provides the least statistical dependence (the best one) out of the wavelet packet library or local Fourier library
Better off pursuing sparsity than independence, except for problems that really have statistically independent sources
Read my articles as well as Donoho & Flesia for more info.
BV(Ω): Functions of Bounded Variation
Definition of total variation:
|u|_{BV(Ω)} ≜ sup { ∫_Ω u ∇·g dx : g ∈ C¹_c(Ω; R²), |g(x)| ≤ 1 ∀x ∈ Ω }
BV(Ω) ⊂ L¹(Ω) is a Banach space with the norm:
‖u‖_{BV(Ω)} = ‖u‖_{L¹(Ω)} + |u|_{BV(Ω)}.
If u ∈ W^{1,1}(Ω) ⊂ BV(Ω), then
|u|_{BV(Ω)} = ∫_Ω |∇u(x)| dx,
via integration by parts.
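A discrete analogue makes the geometry concrete: the (anisotropic) total variation of an indicator image χ_E equals the perimeter of E, which is exactly why BV admits cartoon images. A small numpy sketch (discrete_tv and the grid sizes are our own illustrative choices):

```python
import numpy as np

def discrete_tv(u):
    """Anisotropic discrete total variation: sum of absolute
    forward differences in x and y (a toy stand-in for |u|_BV)."""
    return np.abs(np.diff(u, axis=0)).sum() + np.abs(np.diff(u, axis=1)).sum()

# indicator of a 10x10 square inside a 32x32 grid
chi_E = np.zeros((32, 32))
chi_E[8:18, 8:18] = 1.0
# its discrete TV equals the square's perimeter: 4 * 10 = 40
```

Each row crossing the square contributes two unit jumps (entering and leaving E), and likewise each column, so the sum of jumps reproduces the perimeter H¹(∂E).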
BV (Ω) . . .
In this tutorial, Ω ⊂ R² is bounded. Thus,
W^{1,1}(Ω) ⊂ BV(Ω) ⊂ L²(Ω) ⊂ L¹(Ω).
Unfortunately, W^{1,1}(Ω) does not contain cartoon-like images such as χ_E(x), where E ⊂ Ω and ∂E is smooth, whereas BV(Ω) does.
A minimizer u* exists in BV(Ω) for the corrected version of J_ROF(u) (Vese, 2001).
However, the original version of the ROF criterion does not guarantee getting (u, v) = (χ_E, 0) from f = χ_E (Y. Meyer, 2001).
Besov vs BV
Embedding (DeVore-Lucier, Donoho, Meyer, . . . ):
B^{1,1}_1(Ω) ⊂ BV(Ω) ⊂ B^{1,∞}_1(Ω)
B^{1,1}_1 does not contain cartoon-like images, while BV(Ω) does.
On the other hand, BV(Ω) does not possess any unconditional basis, while B^{1,1}_1 does.
Unconditional Bases
Let {φ_ν}_{ν∈Γ} be a basis for a Banach space B.
Let f = Σ_{ν∈Γ} c_ν φ_ν ∈ B.
Let ‖f‖_B be a functional norm of f ∈ B and let ‖c[f]‖_b be a discrete sequence norm of c[f] = (c_ν[f]).
Suppose ‖f‖_B ≍ ‖c[f]‖_b.
This is already a nontrivial condition because . . .
Fourier is not an unconditional basis for L^p(T), p ≠ 2
Let f ∈ B = L^p(T), T = [0, 2π), and φ_ν(x) = e^{iνx}. Let c_ν[f] be the Fourier coefficients of f.
Then, ‖f‖_{L²(T)} = ‖c[f]‖_{ℓ²(Z)} (Plancherel)
However, ‖c[f]‖_{ℓ^p(Z)} does not tell us anything about ‖f‖_{L^p(T)} if p ≠ 2.
For example, ‖f‖_{L⁴(T)} tells you some info about the distribution of the energy of f over T (∼ kurtosis).
However, |c_ν[f]| does not tell you anything about ‖f‖_{L⁴(T)}.
⟹ the Littlewood-Paley theory
Unconditional Bases . . .
Then, {φ_ν}_{ν∈Γ} is called an unconditional basis of B if any sequence c = (c_ν) satisfying |c_ν| ≤ |c_ν[f]|, ∀ν ∈ Γ, yields a new function f̃ = Σ_{ν∈Γ} c_ν φ_ν that belongs to B.
In other words, operations on the coefficients, such as shrinking or sign flips, do not change membership in B.
Examples: Fourier: L²; wavelets: L^p, 1 < p < ∞, and B^{α,q}_p, α > 0, 1 ≤ p, q ≤ ∞, . . .
Unconditional Bases . . .
Advantages: {φ_ν}_{ν∈Γ} ⇐⇒ axes of symmetry of the ball in B, e.g., {‖f‖_B < C}.
“Rotation” into a coordinate system where the norm is “diagonalized,” even if the norm is not quadratic.
Read articles by Donoho as well as Meyer’s books!
Cohen, Dahmen, Daubechies, DeVore, Meyer, Petrushev, & Xu
BV(Ω) does not have an unconditional basis; but we can say the following:
u ∈ BV(Ω) ⟹ c[u] ∈ wℓ¹(Γ).
In other words, the sorted wavelet coefficients decay as O(k^{−1}).
This implies that the k-term approximation error of a BV function using wavelets is of O(k^{−1/2}).
Cohen, Dahmen, Daubechies, DeVore, Meyer, Petrushev, & Xu . . .
Embeddings in the sequence spaces:
ℓ¹(Γ) ⊂ bv(Γ) ⊂ wℓ¹(Γ),
where bv(Γ) is the space of vectors consisting of the wavelet coefficients of BV(Ω) functions, and its norm is defined to be the BV norm of the corresponding function.
This allows wavelet shrinkage on the coefficients.
Cohen, Dahmen, Daubechies, DeVore, Meyer, Petrushev, & Xu . . .
The Haar case by C-De-P-X (1998), and the general wavelet case by Meyer (1998).
The stronger versions by C-Dah-Dau-De (2000).
Of course, using other methods such as ridgelets and curvelets, one can get better decay ⟹ lectures by Donoho and Candès tomorrow
BV for Image Modeling?
Study of Gousseau & Morel
BV(Ω) may be well adapted for large-scale geometric structures
But natural images are not in BV(Ω).
∵ Natural images often contain too many small objects and textures ⟹ the sum of the lengths of the perimeters of the level sets may blow up, i.e., ∫_{−∞}^{∞} H¹(∂E_t[u]) dt = ∞.
BV for Image Modeling? . . .
[Figure: level sets of a natural image, panels (a)–(j), for the gray-level ranges (23,43), (43,63), (63,83), (83,103), (103,123), (123,143), (143,163), (163,183), (183,203), (203,226).]
Very Briefly, Meyer, Vese, & Osher (2002)
All the previous models with ‖f − u‖²_{L²(Ω)} anticipated WGN for the v component; they are also very much related to statistical estimation methods such as MLE, Bayes, MDL, etc., with prior information on the u component.
Improve the ROF model by changing the L² norm of the v = f − u component:
J_MVO(u) = ‖f − u‖_{G(Ω)} + λ |u|_{BV(Ω)},
where G(Ω) is a dual space of W^{1,1}(Ω).
Very Briefly, Meyer, Vese, & Osher (2002) . . .
G(Ω) contains oscillatory patterns (textures).
Y. Meyer’s book for the precise definition of G(Ω)
Vese-Osher (2003) for a numerical algorithm.
Compare with the ROF model . . .
Summary
Reviewed u + v models
The u component is often modeled by the constraint ‖u‖_B < C for some function space B.
This constraint corresponds to the prior information in Bayesian statistics and the MDL formalism, and to the regularization term in inverse problems.
The v component is often assumed to be i.i.d. WGN, yielding the L² fidelity term in the functional to be optimized ⟹ not good for texture
Summary . . .
Wavelet shrinkage: works very well for B = B^{1,1}_1, reasonably well for B = BV, and is computationally very fast.
PDE-based approach: works well for B = W^{1,1}, is more computationally intensive, but allows more flexible modeling via non-L² error criteria for v.
My Comments
Use of function spaces for image modeling:
Mathematically sound
Can get deep results
Extremely hard to find a good one for natural images
Use of orthonormal bases:
Mathematically tractable
Good affinity with function spaces
Fast algorithms
But too restrictive
My Comments . . .
Use of overcomplete dictionaries
Mathematically more challenging
Can get better results
Can develop fast algorithms
Still in the form of linear combinations in most cases
Sep. 2004 – p.57
My Comments . . .
‖u‖_B < C is mathematically great, but restrictive for image modeling
Need more interaction with the stochastic modeling community
Explore more about wavelet shrinkage after an invertible transform à la Harlan-Claerbout-Rocca
Yves Meyer: “Sparsity does not open the gate tofeature extraction.”
My reaction: “Sparsity can still open the gate to featureextraction.”
References: Books and Survey Articles
R. DeVore: “Nonlinear approximation,” in Acta Numerica, Cambridge Univ. Press, 1998.
D. Donoho, M. Vetterli, R. DeVore, & I. Daubechies: “Data compression and harmonic analysis,” IEEE Trans. Info. Theory, vol.44, pp.2435–2476, 1998.
S. Jaffard, Y. Meyer, & R. D. Ryan: Wavelets: Tools for Science & Technology, SIAM, 2001.
S. Mallat: A Wavelet Tour of Signal Processing, 2nd ed., Academic Press, 1999.
Y. Meyer: Oscillating Patterns in Image Processing and Nonlinear Evolution Equations, University Lecture Series Vol.22, AMS, 2001.
References: Articles
B. Bénichou & N. Saito: “Sparsity vs. statistical independence in adaptive signal representations: A case study of the spike process,” in Beyond Wavelets (G. V. Welland, ed.), Chap.9, pp.225–257, Academic Press, 2003.
H. Choi & R. Baraniuk: “Wavelet statistical models and Besov spaces,” in Nonlinear Estimation and Classification (D. Denison, ed.), Springer-Verlag, 2003.
A. Cohen, R. DeVore, P. Petrushev, & H. Xu: “Nonlinear approximation and the space BV(R²),” Amer. J. Math., vol.121, pp.587–628, 1999.
References: Articles . . .
A. Cohen, W. Dahmen, I. Daubechies, & R. DeVore: “Harmonic analysis of the space BV,” Revista Matemática Iberoamericana, vol.19, pp.235–263, 2003.
S. Chen, D. L. Donoho, & M. A. Saunders: “Atomic decomposition by basis pursuit,” SIAM J. Sci. Comput., vol.20, pp.33–61, 1999.
R. A. DeVore & B. J. Lucier: “Fast wavelet techniques for near-optimal image processing,” IEEE Military Communications Conference Record, pp.1129–1135, 1992.
D. L. Donoho: “Unconditional bases are optimal bases for data compression and for statistical estimation,” Appl. Comput. Harm. Anal., vol.1, pp.100–115, 1993.
References: Articles . . .
D. L. Donoho: “Sparse components of images and optimal atomic decomposition,” Constr. Approx., vol.17, pp.353–382, 2001.
D. L. Donoho & A. G. Flesia: “Can recent innovations in harmonic analysis ‘explain’ key findings in natural image statistics?” Network: Comput. Neural Syst., vol.12, pp.371–393, 2001.
Y. Gousseau & J.-M. Morel: “Are natural images of bounded variation?” SIAM J. Math. Anal., vol.33, pp.634–648, 2001.
W. S. Harlan, J. F. Claerbout, & F. Rocca: “Signal/noise separation and velocity estimation,” Geophysics, vol.49, pp.1869–1880, 1984.
References: Articles . . .
D. Mumford & J. Shah: “Boundary detection by minimizing functionals, I,” IEEE Conf. Computer Vision & Pattern Recognition, pp.22–26, 1985.
L. Rudin, S. Osher, & E. Fatemi: “Nonlinear total variation based noise removal algorithms,” Physica D, vol.60, pp.259–268, 1992.
N. Saito: “Simultaneous noise suppression and signal compression using a library of orthonormal bases and the minimum description length criterion,” in Wavelets in Geophysics (E. Foufoula-Georgiou and P. Kumar, eds.), Chap.XI, pp.299–324, Academic Press, 1994.
References: Articles . . .
N. Saito: “Image approximation and modeling via least statistically-dependent bases,” Pattern Recognition, vol.34, pp.1765–1784, 2001.
N. Saito: “The generalized spike process, sparsity, and statistical independence,” in Modern Signal Processing (D. Rockmore and D. Healy, Jr., eds.), MSRI Publications, vol.46, pp.317–340, Cambridge Univ. Press, 2004.
L. Vese: “A study in the BV space of a denoising-deblurring variational problem,” Applied Mathematics & Optimization, vol.44, pp.131–161, 2001.
References: Articles . . .
L. Vese & S. J. Osher: “Modeling textures with total variation minimization and oscillating patterns in image processing,” J. Sci. Comput., vol.19, pp.553–572, 2003.
S. Watanabe: “Pattern recognition as a quest for minimum entropy,” Pattern Recognition, vol.13, no.5, pp.381–387, 1981.