Variable Groupwise Structured Wavelets · VARIABLE GROUPWISE STRUCTURED WAVELETS 3 77 will be only...
Transcript of Variable Groupwise Structured Wavelets · VARIABLE GROUPWISE STRUCTURED WAVELETS 3 77 will be only...
VARIABLE GROUPWISE STRUCTURED WAVELETS∗1
Y. FAROUJ† , J.M. FREYERMUTH‡ , L. NAVARRO§ , M. CLAUSEL¶, AND P.2
DELACHARTRE‖3
Abstract. At the origins of the wavelet transform, there were basically two ways to build multi-4dimensional transform from the unidimensional ones. The first and most commonly used construction5construction generalizes the concept of multiresolution analysis (MRA) to multi-dimensional settings6by taking a tensor product of the MRA. The second and most mathematically appealing construc-7tion is formed by the tensor products of one-dimensional wavelet basis functions. These transforms8have plenty of different denominations in the literature, we refer to them as standard and hyperbolic9wavelet transforms respectively. These have been studied and used in a very wide range of applica-10tions from image compression to denoising and classification. In this paper we introduce a generalized11hyperbolic wavelet basis which mixes both standard and hyperbolic ways of building basis accord-12ingly to the nature of the underlying variables in the multidimensional objects. We demonstrate13its enhanced practical performances for denoising images, image sequences and incompressible flows.14Moreover, we adapt this transform to deal with real life data denosing showing how naturally this15generalized hyperbolic wavelet transform can be used with divergence-free wavelets or jointly with16variance stabilization techniques.17
Key words. Structured sparsity, Non-parametric estimation, Hyperbolic wavelets, Data-driven18denoising, Hyperspectral datas, Images sequence, Flow denoising19
AMS subject classifications. 65T60, 94A0820
1. Introduction and Motivation. We consider the problem of recovering an21
unknown multidimensional function f from a noisy observation fε. This task is com-22
mon in various problems related to image processing. We are interested in the follow-23
ing Additive White Gaussian noise (AWGN) model24
(1) fε = f + εξ25
where fε is the observed data, ε ∈ (0,∞) the noise level and ξ ∼ N (0, 1) is a white26
noise. We are interested in cases where the multidimensional function f depends27
on variables with different physical meaning along the different coordinate axes (e.g28
spatial, temporal, spectral, hyperspectral,. . .).29
Non parametric methods for denoising based on wavelets expansions of the func-30
tion fε have been widely developed over the last two decades since the seminal work31
∗Submitted to the editorsFunding: This work was supported by “Region Rhone-Alpes” under the ARC 6. M. Clausel’s
research is supported by the French Agence Nationale de la Recherche (ANR) under reference ANR-13-BS03-0002-01 (ASTRES) and PERSYVAL-Lab (ANR-11-LABX-0025-01). L. Navarro’s researchis supported by the (ANR) under reference ANR-15-CE19-0002 (LBSMI). P. Delachartre is withinthe framework of the LabEX PRIMES (ANR-11-LABX-0063) of the Universite de Lyon. J.-M.Freyermuth’s research was supported by the Engineering and Physical Sciences Research Council[EP/K021672/2]† Medical Image Processing Lab (MIP:Lab), Institute of Bioengineering, EPFL, CH-1015 Lau-
sanne, Switzerland ([email protected]).‡KU Leuven, ORSTAT and Leuven Statistics Research centre, Naamsestraat 69, 3000 Leuven,
Belgium ([email protected]).§University of Lyon, CNRS UMR 6158 LIMOS, INSERM U1059 F4200, SAINBIOSE F-42023
([email protected]).¶University of Grenoble and CNRS, Laboratoire Jean Kuntzmann, UMR 5224 du CNRS, 51 rue
des Mathematiques 38010 Saint Martin d’Heres, France ([email protected]).‖ Universite de Lyon, CREATIS, UMR5220, INSERM U1044, INSA-Lyon, Lyon 69621, France
1
This manuscript is for review purposes only.
2 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
of Donoho and Johnstone [19] who introduced the celebrated wavelet shrinkage proce-32
dure. Thereafter, we propose a novel extension of these wavelet denoising techniques33
to deal with images or volumes of images. Whenever the variables have the same phys-34
ical meaning, it is natural to use the standard (isotropic) multidimensional wavelet35
bases. However, in many applications one is confronted to data in which the vari-36
ables have different natures and hence, the signal or image of interest have different37
properties according to these different variables. In such an anisotropic case the use38
of hyperbolic wavelet bases is natural. We give here some examples:39
40
Spectral, Multispectral and Hyperspectral data A first example is the41
evolutionary spectrum of possibly non-stationary process which is a bidimensional42
function of frequency and time. The evolutionary-spectrum is consistently estimated43
by smoothing the empirical Wigner-Ville distribution, so called prepriodogram in [38].44
In this context emerges the theory of estimation of anisotropic functions based on the45
thresholding of hyperbolic wavelet coefficients of the preperiodogram. Other exam-46
ples include multispectral and hyperspectral imaging; The 3D stack of images have47
completely different regularities in spectral and spatial variables as the data consists48
of the same image with different spectral bands or wavelengths [14, 45].49
50
Image and volume sequences An image or volume temporal sequence in which51
voxel intensities are preserved over time is governed by the following conservation52
equation53
(2) ∂tI + u.∇xI = 0 ,54
where I refers to the sequence and u is the velocity field of the voxels. When the un-55
known is the velocity u, Equation (2) is the so-called optical flow equation and plays a56
major role in computer vision since it is the reference model for motion estimation [29].57
From a PDE’s point of view this equation can be seen as a simple advection equa-58
tion on the scalar field I assuming that the flow is incompressible. Solutions of such59
time–dependent partial differential equations have different degrees of smoothness in60
space and time and they are not well characterized in isotropic spaces [18].61
Velocity fields Now, consider that the velocity field u is given. This situation oc-62
curs for example in blood flow measurement provided by Phase Contrast MRI [35] or63
Ultrasound Doppler [26]. The output u is computed from phase transitions. It usually64
suffers from low velocity–to–noise ratio (VNR) and a denoising step is often needed.65
The velocity field u can also be computed solving some PDE modeling the underlying66
motion. For example, a blood flow verifies variants of Navier-Stokes equations [40].67
In this case, the difference between the temporal and the spatial directions is related68
to their physical nature in the sense that the divergence over spatial dimensions is null.69
70
Partial Data-dependent noise intensity In many applications, the observed71
data cannot be properly modelled by the equation (1). In particular the noise intensity72
can be proportional to the underlying unknown function. As a consequence wavelet73
thresholding methods need to be adapted via variance stabilization techniques. Such74
data–dependent noises models are often purely related to spatial domain because of75
the imaging systems. Hereafter the variance stabilization technique developed in [23]76
This manuscript is for review purposes only.
VARIABLE GROUPWISE STRUCTURED WAVELETS 3
will be only applied on the spatial variables.77
In this work, we describe a novel construction of a so–called structured wavelet78
basis that can faithfully represent multivariate functions presenting different charac-79
teristics along the different coordinate axis. Our construction extend the so–called80
hyperbolic wavelet basis introduced in [17]. In the same spirit, J. Bigot and T. Sap-81
atinas adressed in [9] the problem of the denoising of time-dependent multivariate82
functions. It has been shown that the hyperbolic wavelet basis is well adapted for83
functions having different degrees of smoothness along the directions [37]. In partic-84
ular, interesting recent results [6] brought to light that classical wavelet estimators85
cannot achieve optimal reconstructions of functions with different regularities on the86
different dimensions. We present a structured wavelets construction which consider87
sparsity assumptions on groups of variables and not only on single variables using a88
generalization of the hyperbolic construction of wavelets. We describe a relevant type89
of functional classes which are described by these structured wavelets basis, and we90
demonstrate that the minimax lower bound of the L2-loss of the estimator is driven by91
the dimension of the largest group, breaking, “partially”, the curse of dimensionality.92
We show that this generalization do not only allow to take regularity features into ac-93
count but also physical characteristic such as divergence-freedom. We also show that94
the obtained construction can handle situations in which the noise characteristics are95
data-dependent but only on a part of the variables.96
There is no reason for the standard multidimensional wavelet basis to be system-97
atically outperformed by the hyperbolic wavelet basis. If the theoretical justification98
does not exists yet, from an empirical point of view we will explain the advantages of99
the primer.100
The rest of the paper is organized as follows. In Section 2, we give a reminder on101
wavelets and their extensions in multivariate cases. The proposed wavelet construction102
is exposed and motivated in Section 3, along with a derivation of the lower bound of103
the L2-loss for a particular functional space. Extensive numerical experiments and104
comparisons are presented in Section 4105
2. Limits of usual wavelet estimation procedures. First, for the sake of106
clarity and ease of reading, we recall some definitions. We begin by introducing107
one–dimensional wavelet bases, then we describe two different extensions in higher108
dimensions.109
One dimensional wavelet bases are defined from a one-dimensional function ψ,110
called mother wavelet and its dilated and translated versions ψj,k(.) = 2j/2ψ(2j .− k)111
with (j, k) ∈ N × Z and a scaling function ϕ defined with its dilated and translated112
versions ϕj,k(.) = 2j/2ϕ(2j .−k). It is then well–known that {ϕ0,k}k∈Z∪{ψj,k}j≤0,k∈Z113
forms an orthogonal basis of L2(R). In the sequel we shall denote ψ−1,k = ϕ0,k.114
In the multivariate setting, one defines multidimensional wavelets as115
ψj1,··· ,jd,k1,··· ,kd(x) = ψj1,k1(x1)⊗ · · · ⊗ ψjd,kd(xd).(3)116
Two possible multidimensional extensions of wavelet bases come from this defi-117
nition. The first one consists in taking all possible combinations of the multiindices118
j = (j1, · · · , jd) ∈ (N ∪ {−1})d, k = (k1, · · · , kd) ∈ Zd leading to the so–called119
“hyperbolic”1 wavelet basis [17] Bhyp,d = {ψj,k}j∈(N∪{−1})d,k∈Zd . The second possi-120
bility is to fix the directional dilation indices to be the same along each dimension121
1This wavelets appears under other names names in the literature, such as mixing scales [41] orrectangular [50].
This manuscript is for review purposes only.
4 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
j = j1 = · · · = jd. Then one first define multidimensional wavelets ψ(i) for any122
i ∈ {0, 1}d \ {(0, · · · , 0)} as123
(4) ψ(i) = ψ(i1) ⊗ · · · ⊗ ψ(id) with ψ(0) = φ, ψ(1) = ψ124
and for any j ≥ −1, k ∈ Zd and i ∈ {0, 1}d \ {(0, · · · , 0)}, one set125
Ψ(i)j,k(x) = 2jd/2ψ(i)(2jx− k)126
We also set for any k ∈ Zd, Ψ(0,··· ,0)−1,k (x) = [φ ⊗ · · · ⊗ φ](x − k). The family Biso,d =127
{Ψ(i)k , j,k} is then an orthonormal wavelet basis of L2(Rd), which is said to be128
“isotropic” [16]. See Figure 1 for a comparison of hyperbolic and classical wavelet129
decompositions on a two-dimensional example.130
In the sequel, we denote as I the indices of a particular wavelet or scaling function.131
The set of all combinations of I will be referred as I. If not specified I can refer both132
to standard or hyperbolic constructions. In the hyperbolic case one has133
I = {I = (j,k) with j = (j1, · · · , jd) ∈ (N ∪ {−1})d, k = (k1, · · · , kd) ∈ Zd}134
whereas in the isotropic case135
136
I = {I = (i, j,k) with i ∈ {0, 1}d\{(0, · · · , 0)}, j ∈ N∪{−1}, k = (k1, · · · , kd) ∈ Zd}137
∪ {I = ((0, · · · , 0),−1,k)k = (k1, · · · , kd) ∈ Zd}138139
The wavelet decomposition of a function f ∈ L2([0, 1]d) is then given by140
(5) f =∑I∈I
βI(f)ΨI ,141
where142
(6) βI(f) =
∫(0,1)d
f(x)ΨI(x) dx.143
Denoising methods based on thresholding the wavelet coefficients has proven its144
effectiveness since the seminal work of Donoho and Johnstone [19]. Under model (1),145
it is possible to recover the unknown function f from its noisy observation fε by146
different procedures on the wavelet empirical wavelet coefficients βI(fε). This results147
in the following estimator148
f =∑I∈Iε
βI(fε)ΨI ,149
where Iε ⊂ I is to the set of multiindices corresponding to the coefficients that are150
kept in the reconstruction. The choice of Iε can be performed following different rules.151
In the sequel we shall consider the most classical one, the so called hard thresholding152
rule where153
Iε = {I ∈ I s.t |j| ≤ j(tε) and |βI(fε)| > tε} ,154
where tε is a threshold calibrated in function of the level of noise ε : tε = 4√
2ε√
log ε−1.155
The integer j(tε) is chosen such that 2−j(tε) ≤ t2ε < 21−tε . The notation |j| allow to156
encompass both the hyperbolic and the isotropic case in the definition of the hard157
This manuscript is for review purposes only.
VARIABLE GROUPWISE STRUCTURED WAVELETS 5
House Isotropic Hyperbolic
Fig. 1: Wavelets decomposition in isotropic and hyperbolic settings
thresholding estimator, since we set |j| = j1 + · · · + jd in the case of a multidimen-158
sional index j = (j1, · · · , jd) whereas one has |j| = j in the case of an unidimensional159
scaling index j. Depending on the considered wavelet basis (isotropic or hyperbolic),160
one has then two possible estimations procedures which can be derived from the hard161
thresholding rule : the isotropic hard thresholding estimator and the hyperbolic one.162
We want to compare the numerical performances of these two procedures ac-163
cording to the anisotropic nature of the data. From the theoretical point of view,164
this question has already been addressed and appears in the work of Neumann and165
Von Sachs [38] where the extension of wavelet thresholding techniques to multivari-166
ate anisotropic scenarios is introduced. It is shown that one has better performances167
for hyperbolic wavelets whenever the unknown function has some anisotropic fea-168
tures. Moreover, due to the adaptive nature of the hard thresholding and the fact169
that isotropy is a particular case of anisotropy, it has been proved recently that even170
if the considered data are isotropic, hyperbolic wavelet thresholding gives the same171
theoretical guarantees as isotropic wavelets [6].172
However, empirical evidences contradict this result. We give here a motivational173
experiment using the sequence “Ayiko”. In Figure 2, we consider a spatial frame (i.e,174
we fix time). The denoising experiment uses the hard thresholding rule. Isotropic175
wavelets are slightly better in terms of PSNR. Moreover, undesirable axis-aligned176
artifacts are visible on the reconstruction from noisy data in the hyperbolic settings.177
This results are in contradiction with theoretical estimations. A possible explanation178
for this phenomenon is the biased definition of anisotropic functional spaces which179
consider only axis-aligned regularities, and the fact that the used image, and natural180
images, in general do not have strong differences in terms of smoothness along the181
vertical and the horizontal directions, that is are not strongly anisotropic.182
In Figure 3, a temporal cross section of the sequence is considered (i.e, we fix183
one of the spatial dimensions). The resulting 2D images (see Figure 3) has different184
regularities in time and space and it then is highly anisotropic. In particular, the185
resulting image is highly regular in the temporal dimension2. In this case, the hyper-186
bolic wavelets have clearly superior performance. To have a better understanding of187
this phenomenon, the degree of anisotropy can be seen as maximal when the degree188
of interaction between the variables is small. This degree is called the atomic dimen-189
2Of course, this is not a generalization. It is also possible to have some irregularity in the temporaldimension, for example, when new objects are appearing in the scene.
This manuscript is for review purposes only.
6 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
Original Noisy Isotropic Hyperbolic
26.67 dB 29.45 dB 28.93 dB
Fig. 2: Denoising of one frame of the Ayiko sequence.
Original Noisy Isotropic Hyperbolic
24.53 dB 30.97 dB 33.68 dB
Fig. 3: Denoising of a temporal cross section of the sequence Ayiko
sion. Simple examples where the atomic dimension is one, are additive models. For190
example in the two dimensional case, these functions are of the form191
(7) f(x1, x2) = f1(x1) + f2(x2), (x1, x2) ∈ R2.192
Such functions are known to allow rates of convergence corresponding to the one-193
dimensional case [6]. Figure 4 shows an example of denoising under the model (7)194
where f1 and f2 are respectively the standard test functions, Doppler and Blocks [4].195
The results show that the hyperbolic setting outperforms the isotropic setting in this196
case.197
To the best of our knowledge, existing research focuses on cases when the data198
are fully anisotropic, that is all the regularities are different according each direction.199
In this work, we investigate the cases when variables can be grouped in sub-ensembles200
with the same physical meaning. This is motivated by the conclusions of the numerical201
results in this section and by the fact that functions having a group behavior arises202
in many applications such as spatio-temporal and multi-spectral data. This naturally203
suggests the use of wavelet atoms that consider both features. In the next section,204
we discuss the construction of such wavelets, which are an instance of the so-called205
composite wavelets [27] and the definition of associated estimation procedures.206
3. Estimation procedures based on tensored wavelet basis.207
3.1. The tensored wavelet basis and associated estimation procedures.208
We now introduce the structured wavelet basis, which arise as a case of composite209
This manuscript is for review purposes only.
VARIABLE GROUPWISE STRUCTURED WAVELETS 7
Original Noisy Isotropic Hyperbolic
28.11 dB 32.87 dB 37.72 dB
Fig. 4: Denoising of an additive model.
wavelets defined in [27]. The starting point are N multidimensional wavelet bases210
{ΨI1,1, I1 ∈ I1}, · · · , {ΨIN ,N , IN ∈ IN} of L2(Rd1), · · · , L2(RdN ) respectively. For211
each I = (I1, · · · , IN ) ∈ I1 × · · · × IN , one then set212
ΨI = ΨI1,1 ⊗ · · · ⊗ΨIN ,N213
The family {ΨI, I ∈ I1 × · · · × IN} is then a basis of L2(Rd) with d = d1 + · · ·+ dN .214
Note that if N = d, we recover anhyperbolic wavelet basis as a special case whereas215
in the case N = 1 this construction corresponds to isotropic wavelet basis.216
Any function f ∈ L2(Rd) can then be decomposed in217
f =∑I
βIΨI218
One then define the hard thresholding estimator as219
f =∑I∈Iε
βI(f) ΨI220
with Iε = {I ∈ I1 × · · · × IN , |j| = j1 + · · · + jN ≤ j(λε)} where j(λε) is such that221
2−j(λε) ≤ λ2ε < 21−j(λε) with λε = mε√
log(ε−1).222
The next section aims at proving the optimality of this procedure in the minimax223
sense.224
3.2. Minimax results. To state and prove our minimax results about the hard225
thresholding procedure in the tensored wavelet basis, we first need to introduce226
some appropriate functional spaces, which appear as approximation spaces for our227
structured wavelet bases. These spaces extend the space of functions with dominat-228
ing mixed derivatives introduced in [42] as approximation spaces for the hyperbolic229
wavelet basis230
For a vector α = (α1, · · ·αd) ∈ Nd and function f ∈ L2([0, 1]d), let us define as231
usual its partial derivatives in the distribution sense232
Dαf =∂|α|f
∂xα11 · · · ∂x
αdd
,233
where |α| = α1+· · ·+αd. Given R = (R1, · · · , RN ) ∈ NN , we now define the following234
functional spaces235
FRd,N = {f ∈ L2((0, 1)d) ‖f‖FR
d,N=
N∑i=1
∑r=(r1,··· ,rN )∈Nd1×···×NdN ,|ri|≤Ri
‖Dr1 · · ·DrN f‖L2((0,1)d) <∞}236
This manuscript is for review purposes only.
8 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
We equip these spaces with the norm237
‖f‖FRd,N
= ‖f‖L2((0,1)d) +
N∑i=1
∑r=(r1,··· ,rN )∈Nd1×···×NdN ,|ri|≤Ri
‖Dr1 · · ·DrN f‖L2((0,1)d) .238
We then denote FRd,N (K) the ball of FR
d,N of radius K.239
The space FRd,N contains functions with dominating mixed derivatives which have240
for any ` ∈ {1, · · · , N}, d` variables having the same regularity order R`. Two special241
cases of interest arise from this definition:242
• If ∀i = 1, · · · , N we have r` ∈ R, that is if N = d, we find the so–called spaces243
with dominating mixed derivatives defined in [42]244
FRN = {f ∈ L2((0, 1)N )|
∑(r1,··· ,rN ), ri≤Ri
‖Dr1 · · ·DrN f‖L2((0,1)N ) <∞}245
• If N = 1 and d1 = d, we recover the classical isotropic Sobolev spaces.246
HR = {f ∈ L2((0, 1)d)|∑
r1∈Nd, |r1|≤R
‖Dr1f‖L2((0,1)d) <∞}247
We can now state our main result giving the rate of convergence of the procedure248
defined in Section 3.1 and proving its optimality in the minimax sense :249
Theorem 3.1. Assume that R1 = · · · = RN = R. Let Jε be such as ε22JεdmaxJε =250
2−2JεR, with dmax = max(di). Then251
inff
supf∈Fd,NR (K)
E‖f − f‖22 =(ε4R/(2R+dmax)
) (| log(ε)|N−1
).252
Note that this rate of convergence shows a lower dimension which is different from253
one in the mimimax estimate and thus breaking “partially” the curse of dimensional-254
ity. The lower bound, in this case, is of the order of ε2R/(R+dmax), while it is ε2R/(R+d)255
in the isotropic case and of ε2R/(R+1) in the fully anisotropic case. It is interesting to256
note that the order of the logarithmic term is related to the degrees of freedom of the257
scales vector instead of the usual dimension.258
3.3. Extension of the method to other settings. The construction (3) al-259
lows to consider different families of wavelets along the different dimensions. One260
could for example consider a representation of a piecewise stationary spectrum with261
Haar wavelets along the time and a smooth wavelet along the spatial frequency. The262
hard thresholding can also be modified considering possibly random threshold depend-263
ing not only on the level of noise but also on the index I. It will be the case when264
considering stabilisation of variance approaches. Here, we extend the approach intro-265
duced in Section 3 by pointing out other situations where specific types of wavelets266
and estimation procedure are needed.267
3.3.1. Denoising of spatio temporal incompressible flows. We consider268
the problem of denoising spatio-temporal vector fields with null divergence in space269
such as incompressible flows. Denoising such type of data gains a lot of attention with270
the emergence of Phase contrast MRI imaging [35]. A velocity field is a function of271
(x, t) where x is the spatial variable of dimension d ≥ 2. The wavelet paradigm for272
such data consists in two types of algorithms based on a decomposing the flow in a273
This manuscript is for review purposes only.
VARIABLE GROUPWISE STRUCTURED WAVELETS 9
divergence free wavelet frame. The classical spatial denoising [35] do not consider the274
time variable and a thresholding is done on the divergence-free wavelet transform of275
each spatial stack. A more interesting procedure was introduced recently by Bostan276
et al. [11, 10] which considers spatio-temporal regularization. The algorithm considers277
two regularization terms in a variational framework278
(8) f = argmin{||WBdivf ||1 +Rt(f) +1
2||gε − f ||22},279
where WBdivf denotes the vector of the coefficients resulting from the decompo-280
sition of f in free-divergence wavelet basis Bdiv and Rt(f) is a term aiming at reg-281
ularizing f in the temporal dimension on which incompressibility is not maintained.282
Many choices for Rt(f) are possible. The presence of two norms makes the mini-283
mization process challenging. In particular, the direct use of simple soft-thresholding284
algorithms is not possible. Moreover, the convexity is lost with the result of the non-285
uniqueness of a minimum. If Rt(f) is not differentiable, as in the case of the `1-norm,286
the problem needs to be divided into two sub-problems by considering one norm at287
the time in an iterative fashion.288
Here, we propose an alternative approach which consists in using one single norm289
which sparsifies the complete vector. This can be achieved by the the structured290
wavelet construction. Let us define the following spatio temporal wavelets :291
Ψ = ψdivx (x)⊗ ψt(t)(9)292
where ψdiv is a divergence-free spatial wavelet and ψt is a one-dimensional temporal293
wavelet. With our construction, we can replace the estimation procedure (8) by a294
simpler one as a thresholding on the wavelet coefficients of the flow of interest in295
the structured wavelet basis. We also propose to consider other possible bases. For296
example, if temporal discontinuities are not allowed, ψt can be replaced by a Fourier297
basis: family of complex exponentials {eint}.298
3.3.2. Partial data-dependent noise. Another advantage of the tensor con-299
struction given is its ability to deal with the case where the noise dependss on the data.300
Such situations occurs, for example, in dynamic imaging when the noise has a specific301
spatial characteristic due to imaging system which is not preserved in time. We are,302
of course interested in cases where the noise can be removed via known thresholding303
methods. Consider, for example, the following noise model304
(10) fε = f + F (f)ξ305
where F is a non-decreasing function and ξ a white noise. Many noise models are of306
the form (10). Examples are speckle noise in Ultrasound imaging [34] and Poisson307
noise in Photon imaging systems [39, 25]. When an image sequence is considered, each308
image of the sequence follows model (10). For each image, this model can be solved by309
wavelet techniques after a Gaussianizing process on the wavelet coefficients [21, 24].310
The corresponding transform is called the Wavelet-Fisz transform and consists in311
stabilizing the variance by dividing wavelet coefficients by local estimations of the noise312
standard deviation. Now consider the problem of denoising a sequence in which each313
image follows model (10). Using a construction similar to (9) comes naturally. Here,314
instead of a divergence free wavelet transformation, a wavelet-Fisz transformation315
is considered; the spatio-temporal coefficients are stabilized in the spatial dimension316
This manuscript is for review purposes only.
10 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
before applying a hard thresholding procedure. The next section is devoted to an317
exhaustive experimental study on the performance of structured wavelet construction318
in practice.319
4. Experiments. In this section, we present some experiments to illustrate the320
effectiveness of the structured variable grouping presented earlier. We start with321
the simple problem of sequence denoising under model (1), then we consider the322
denoising of hyperspectral data, velocity flows and finally denoising under partially323
data-dependent noise models. We aim at showing the merits of using structured324
wavelets presented above compared to the classical constructions. Therefore we com-325
pare our results with those obtained by wavelet thresholding which do not take all326
variables into account (2D-Wavelets) and isotropic multivariate wavelet thresholding327
techniques (3D-Wavelets) which acts in the same manner on all variables. We also328
use simple universal hard thresholding rules. Except the velocity flows, all data have329
a [0−255] grey-values range. At each noise level, we compared results of the different330
methods using the classical Peak Signal to Noise Ratio (PSNR) as a criteria. In the331
case of data-dependent noise, we also show the difference between the true image and332
the denoised result of every method. This is known in the literature as the method333
noise [12]. For the wavelet filters, Daubechies wavelets with 6 vanishing moments are334
used in the different directions for the 2D-Wavelets and the Stuctured wavelets, while335
the 3D-Wavelets are complex isotropic filters [44].336
In each case, we first define sets of variables that we shall group. On these vari-337
ables the appropriate wavelet transform is taken (isotropic, divergence-free) and the338
appropriate estimator is defined (isotropic hard thresholding, hyperbolic hard thresh-339
olding, Fisz, etc· · · ). The rest of variables are taken into account by the standard340
tensor product. It is interesting to note that the complexity of the construction in341
dimension d is O(Nd) as for classical isotropic wavelets, no matter what is the or-342
der of the variables. Figure 5 shows a particular case construction for d = 3 in which343
isotropy in {x1, x2} is considered. This latter case will be of special interest, as x3 will344
refer to the time variable. It can be seen that the obtained atoms have an isotropic345
support along x1 and x2 while it is different along x3.346
The 3D atom x1 − x2 plane x1 − x3 plane
Fig. 5: Example of a Structured Wavelet atoms.
We now detail the different cases that we considered.347
4.1. Image Sequence denoising. The utility of considering isotropy in space348
and anisotropy in time was discussed and motivated in Section 2. We also want to show349
This manuscript is for review purposes only.
VARIABLE GROUPWISE STRUCTURED WAVELETS 11
the merit of the spatio-temporal treatment in general. We consider two images for our350
synthetic experiments. The first sequence “Ayiko” available on the web 3. The second351
sequence “Heart” is generated by the ASSESS software [15]. This software applies a352
simple kinematic model of the heart deformation on an initial image. The result is a353
non-corrupted realistic simulation of a cardiac MRI sequence. We corrupted the two354
sequences with respect to noise model (7) by picking a noise level ε ∈ {5, 10, 15, 20}.355
PSNR results are given in 1. Results obtained by the structured wavelets are clearly356
superior to those obtained by 2D-Wavelets and 3D-Wavelets at all noise levels. The357
benefit of taking the temporal dimension into account can be observed as 3D-Wavelets358
give better results than 2D-Wavelets. As the visualization is done frame by frame, we359
show in figure the temporal evolution of the PSNR for each method. It can be observed360
that, in the case of the 2D-Wavelet approach it imposes a local treatment in time, as361
a consequence there is no temporal evolution of the PSNR. It can be observed that362
the results of 3D-Wavelets and structured wavelets demonstrate a time-dependent363
behaviour. In particular, the PSNR starts and finishes with low values as there is364
not enough information in the temporal domain and has higher values in the rest of365
the sequence. In the case of structured wavelets some sudden drops in the PSNR are366
observed due to temporal discontinuities. Note that an interesting phenomenon is367
observed in the case of the “Heart” sequence as the periodicity of the cardiac cycles368
can be seen in the PSNR results. This is due to the variations in deformation and369
discontinuities between cycles as the heart motion changes direction introducing a370
strong temporal discontinuity.371
Original Noisy (σ = 10) 2D-wavelets 3D-Wavelets StructuredWavelets
Fig. 6: Results of various methods applied to the 20th image of the sequence “Ayiko”.Quantitative evaluation is given in Table 1.
4.2. Spectral Denoising. We considered also two examples of hyperspectral372
and multispectral data. The first sample “Uncle Bens” is taken from the Multispectral373
database4. It consists of 31 images with a resolution reflectance ranging from 400nm374
to 700nm. The second sample “Indian Pine” from the AVIRIS hyperspectral sensor375
database5. It contains 220 images, each one representing a specific wavelength. The376
multispectral data was modified to have a dyadic size in the spectral dimension for377
the wavelet transform. Note that the Structured Wavelet approach does not require378
to have the same size in all dimension as the 3D-Wavelet approach. We show the379
results only for the original 31 samples.380
3http://see.xidian.edu.cn/vipsl/database Video.html4http://www2.cmp.uea.ac.uk/Research/compvis/MultiSpectralDB.htm5https://purr.purdue.edu/publications/1947
This manuscript is for review purposes only.
12 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
Original Noisy (σ = 10) 2D-wavelets 3D-Wavelets StructuredWavelets
Fig. 7: Results of various methods applied to the 20th image of the sequence “Heart”.Quantitative evaluation is given in Table 1.
0 20 40 60 80 100 120 140
24
26
28
30
32
34
36
Frame
PSNR
(dB)
Noisy 2D−Wavelets 3D−Wavelets Structured Wavelets
“Ayiko”
0 20 40 60 80 100 120 140
24
26
28
30
32
34
36
Frame
PSNR
(dB)
Noisy 2D−Wavelets 3D−Wavelets Structured Wavelets
“Heart”
Fig. 8: PSNR evolution for different methods applied to the sequences “Ayiko” and“Heart”.
“Ayiko” “Heart”σ 2D 3D Structured 2D 3D Structured5 35.05 38.12 40.48 35.60 38.54 41.2310 30.33 33.62 36.16 31.41 34.53 37.1225 27.75 30.92 33.68 29.00 32.24 34.7020 26.03 29.11 31.88 27.33 30.70 32.96
Table 1: Quantitative comparison (PSNR) for different methods applied to “Ayiko”and “Heart” at different noise levels.
This manuscript is for review purposes only.
VARIABLE GROUPWISE STRUCTURED WAVELETS 13
“Uncle Bens” “Indian Pines”
Fig. 9: 3D Stacks : “Indian Pines” and “Uncle Bens”.
Original Noisy (σ = 10) 2D-wavelets 3D-Wavelets StructuredWavelets
Fig. 10: Results of various methods applied to the 20th image of the stack “UncleBens”. Quantitative evaluation is given in Table 2.
Original Noisy (σ = 10) 2D-wavelets 3D-Wavelets StructuredWavelets
Fig. 11: Results of various methods applied to the image 10th image of the stack“Indian Pines”. Quantitative evaluation is given in Table 2.
4.3. Incompressible flows Denoising. The experiments were done on a syn-381
thetic velocity map which was initiated by a null divergence vector flow (u, v) where382
the horizontal and vertical component are given respectively by u0(x, y) = sin(2πx)2 sin(4πy)383
and v0(x, y) = − sin(2πy)2 sin(4πx). The temporal evolution was governed by a rota-384
This manuscript is for review purposes only.
14 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
0 5 10 15 20 25 30 35
15
20
25
30
35
Image
PSNR
(dB)
Noisy 2D−Wavelets 3D−Wavelets Structured Wavelets
“Uncle Bens”
0 20 40 60 80 100 120 140
−20
−10
0
10
20
30
40
Image
PSNR
(dB)
Noisy 2D−Wavelets 3D−Wavelets Structured Wavelets
“Indian Pines”
Fig. 12: PSNR evolution for different methods applied to the data “Uncle Bens” and“Indian Pines”.
“Uncle Bens” “Indian Pines”σ 2D 3D Structured 2D 3D Structured5 36.14 37.78 39.41 35.26 35.64 37.3810 32.26 34.24 36.24 31.81 32.22 34.3615 30.11 32.17 34.16 30.08 30.55 32.7020 28.67 30.65 32.53 28.94 29.36 31.42
Table 2: Quantitative comparison (PSNR) for different methods applied to “UncleBens” and “Indian Pines” at different noise levels.
tion matrix u(x, y, t + 1) = u(x, y, t) + hu(x, y, t) and v(x, y) = v(x, y, t)− hv(x, y, t)385
where h is fixed. The result is a velocity map in form of 4 vertices with increas-386
ing velocity (e.g Figure 13). We corrupted this flow with gaussian noise where387
σ = 0.1, 0.3, 0.5. We tested spatial two-dimensional divergence free wavelets and388
spatio-temporal structured divergence free wavelets. Here, the 3D construction is not389
appropriate as divergence freedom cannot be imposed in time. We used divergence free390
wavelets with periodic boundary condition introduced by Harouna and Perrier [30].391
A visual comparison of the results is given in Figure 14 and Figure 15. The vector392
flow obtained using the spatio-temporal structured wavelets are visually smoother393
and still preserve divergence freedom. As the velocity is increasing and the the noise394
variance is constant the PSNR is increasing within time Figure 16. Note that the395
spatio-temporal approach fails in some time steps because of the discontinuities but396
its overall performance is better than the one of the spatial approach as it can be seen397
in Table 3.398
This manuscript is for review purposes only.
VARIABLE GROUPWISE STRUCTURED WAVELETS 15
0 5 10 15 20 250
5
10
15
20
25
x-axis
y-axis
(a)
x-axisy-axis
10 20 30 40 50 60
10
20
30
40
50
60−1.5
−1
−0.5
0
0.5
1
1.5
(b)
x-axis
Tim
e
10 20 30 40 50 60
10
20
30
40
50
60−4
−3
−2
−1
0
1
2
3
4
(c)
Fig. 13: 2D+t Flow data : (a) Vector field at the 10th time step. (b) Horizontalvelocity at the 10th time step. (c) Temporal evolution of the horizontal velocity for afixed vertical plan.
Horizontal velocity Vertical velocityσ 2D Structured 2D Structured
0, 1 46.64 49.00 46.65 48.900, 3 39.65 43.83 39.51 43.530, 5 35.49 39.65 35.47 39.36
Table 3: Quantitative comparison (PSNR) for different methods for horizontal andvertical velocity at different noise levels.
4.4. Mixed models Denoising. Here we consider the model (10), we have a399
particular interest in models arising from ultrasound imaging [34] in which h(f) =√f .400
Our experiments are performed with respect to this model for σ = {1, 2, 3, 4}. We401
used spatial isotropic wavelets with Fisz variance stabilization [22, 24]. We compared402
the spatial approach to the spatio-temporal approach. Note that again classical 3D403
wavelets are not applicable for this problem. Visual comparisons for the sequences404
“Ayiko” and “Heart” are given in Figure 17 and Figure 18. Note that the spatio-405
temporal approach provided by the structured wavelet construction provides a consid-406
erable visual improvement compared to the spatial approach. In the PSNR evolution407
in Figure 20, we can observe again the temporal structure linked to the heart motion408
and the superiority of structured wavelets locally and and globally (e.g Table 4).409
5. Discussion & Perspectives. We give in this section a discussion about the410
positioning of our work compared to existing works on structured group-sparsity. We411
also mention some orientations for future works on the wavelet construction presented412
in this part of the thesis.413
5.1. Some related work on structured group-sparsity. First, we want to414
note that a similar construction to the one presented in the paragraph 3.1 was pro-415
posed in [20] for compressed sensing which is also motivated by video acquisition and416
hyperspectral imaging. There, the purpose was to study the efficiency of considering417
Kronecker products of different bases for multivariate signals with different regulari-418
This manuscript is for review purposes only.
16 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
0 5 10 15 20 250
5
10
15
20
25
x-axis
y-axis
(a)
0 5 10 15 20 250
5
10
15
20
25
x-axisy-axis
(b)
0 5 10 15 20 250
5
10
15
20
25
x-axis
y-axis
(c)
x-axis
y-axis
10 20 30 40 50 60
10
20
30
40
50
60
−2
−1
0
1
2
(d)
x-axis
y-axis
10 20 30 40 50 60
10
20
30
40
50
60−1.5
−1
−0.5
0
0.5
1
1.5
2
(e)
x-axis
y-axis
10 20 30 40 50 60
10
20
30
40
50
60
−2
−1.5
−1
−0.5
0
0.5
1
1.5
(f)
Fig. 14: Example of results obtained by the two methods for σ = 0.3 at the 10th
time step. (a) Noisy flow, (b) 2D-Wavelets reconstruction of the flow, (c)StructuredWavelets reconstruction of the flow, (d) Noisy horizontal velocity, (e) 2D-Waveletsreconstruction of the horizontal velocity., (f) Stuctured Wavelets reconstruction ofthe horizontal velocity.
Test Kidney“Ayiko” “Heart”
σ 2D Structured 2D Structured1 28.82 29.66 32.93 34.312 27.13 29.26 30.44 33.283 25.64 28.82 28.48 32.274 24.42 28.39 26.90 31.34
Table 4: Quantitative comparison (PSNR) for different methods applied to “Ayiko”and “Heart” at different noise levels.
ties. In particular, the authors studied the sparsifying properties and conditions for419
which these bases can be used in the compressed sensing theory. Structured sparsity420
have proven to be useful in denoising, existing structures consider groupe-wise spar-421
sity as in Block Thresholding [13], Hierarchical sparsity as in Tree Thresholding [8]422
or combinations the two [7]. Imposing sparsity in a variational framework can be423
This manuscript is for review purposes only.
VARIABLE GROUPWISE STRUCTURED WAVELETS 17
0 5 10 15 20 250
5
10
15
20
25
x-axis
y-axis
(a)
x-axis
y-axis
10 20 30 40 50 60
10
20
30
40
50
60
−2
−1
0
1
2
(b)
0 5 10 15 20 250
5
10
15
20
25
x-axis
y-axis
(c)
x-axis
y-axis
10 20 30 40 50 60
10
20
30
40
50
60−1.5
−1
−0.5
0
0.5
1
1.5
2
(d)
0 5 10 15 20 250
5
10
15
20
25
x-axis
y-axis
(e)
x-axis
y-axis
10 20 30 40 50 60
10
20
30
40
50
60
−2
−1.5
−1
−0.5
0
0.5
1
1.5
(f)
Fig. 15: Example of results obtained by the two methods for σ = 0.3 at the 10th timestep. (a) Noisy flow, (b) Noisy horizontal velocity, (c) 2D-Wavelets reconstructionof the flow, (d) 2D-Wavelets reconstruction of the horizontal velocity, (c) StructuredWavelets reconstruction of the flow, (d) Stuctured Wavelets reconstruction of thehorizontal velocity.
done via minimizing a `1-regularized least squares functional. This is known as the424
Lasso problem [46]. Structuring the regularization term in groups is known as the425
group-lasso [36] and has other extensions such as the Fused Lasso [47] and the Group426
Fused Lasso [3]. All these paradigms consider structures directly on the sparse rep-427
This manuscript is for review purposes only.
18 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
0 10 20 30 40 50 60 70
10
15
20
25
30
35
40
45
Spatial step
PSNR
(dB)
Noisy 2D−Wavelets Structured Wavelets
Horizontal velocity
0 10 20 30 40 50 60 70
10
15
20
25
30
35
40
45
Image
PSNR
(dB)
Noisy 2D−Wavelets Structured Wavelets
Vertical velocity
Fig. 16: PSNR evolution for the two methods.
Original Noisy (σ = 2) 2D-wavelets StructuredWavelets
Fig. 17: Example of results obtained by the two methods for the 20th image of thesequence “Ayiko” with σ = 0.3.
resentation (i.e after transformation). In our work, we aimed at showing the merit428
of considering the group selection on variables before transformation. As mentioned429
before, many works in the statistical literature demonstrated the advantages of using430
hyperbolic wavelets in multivariate analysis. Overcoming the curse of dimensionality431
has been pointed first in [37] with recent contributions in [5]. In all these works,432
anisotropy was considered on single variables and the authors aimed at dropping the433
error to dimension one.434
5.2. Methodological extensions. The minimax results obtained in this pa-435
per were computed assuming that the same regularity parameter along he different436
groups of variables. As mentioned by [37], it would be also interesting to consider the437
different regularity parameters. This is, actually, more appealing when considering438
problems with different regularities such as the experiments presented in Section 4.439
This manuscript is for review purposes only.
VARIABLE GROUPWISE STRUCTURED WAVELETS 19
Original Noisy (σ = 2) 2D-wavelets StructuredWavelets
Fig. 18: Example of results obtained by the two methods for the 20th image of thesequence “Heart” with σ = 0.3.
Original 2D-wavelets StructuredWavelets
Fig. 19: Method noise: various methods applied to the 20th image of the sequence“Heart”. Quantitative evaluation is given in Table 1.
One can predict that the hardest regularity (smallest parameter) appears in the rate440
of convergence. Still, this is an open question, particularly, how this parameter will441
be combined with the dimension of the largest group. For Functional Data Analysis442
within the functional analysis of variance (FANOVA) framework [2, 1], we believe443
that the structured wavelet construction might give new ways of gathering variables444
presenting similar behaviour, especially, in very high dimensions. Finally, similarly to445
considering divergence-free wavelets on some variables, it is also possible to consider446
operator-like wavelets [48, 33]. This type of wavelets can be used, for instance, to447
invert convolution operators for joint deconvolution/denoising [32]. The structured448
construction can deal with problems in which the data is blurred only on some of the449
variables.450
6. Appendix.451
6.1. Preliminary results on structured wavelets and function spaces.452
In what follows, we shall need a structured wavelet characterization of the functional453
spaces FRN . We have the following result, which extend Theorem 3.1 of [43]454
Theorem 6.1. Assume that the structured wavelet basis is sufficiently regular.455
This manuscript is for review purposes only.
20 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
0 20 40 60 80 100 120 140
16
18
20
22
24
26
28
30
Image
PSNR
(dB)
Noisy 2D−Wavelets Structured Wavelets
“Ayiko”
0 20 40 60 80 100 120 140
18
20
22
24
26
28
30
32
34
Image
PSNR
(dB)
Noisy 2D−Wavelets Structured Wavelets
“Heart”
Fig. 20: PSNR evolution for different methods applied to the sequences “Ayiko” and“Heart”.
Then f ∈ FRN,d iff456
‖f‖2hyp,FRN,d
=∑i
∑j
(1 + 22(
∑` j`R`)
) ∑k∈ZN
|βI|2 <∞ .457
In addition, the two norms ‖ · ‖FRN,d
and ‖ · ‖hyp,FRN,d
are equivalent.458
These functional spaces can be related in a rather classical way to their weak coun-459
terpart. More precisely, let us define the following weak space :460
WN,d(r) = {f ∈ L2, supλ>0
λr−2∑I
|βI(f)|21|βI(f)|<λ <∞} .461
As in Lemma 2.2 of [31], we can prove that we have an alternative definition of these462
spaces :463
Proposition 6.2. Let r ∈ (0, 2). Then464
WN,d(r) = {f ∈ L2, supλ>0
λr∑I
1|βI(f)|>λ <∞} .465
Proof466
Let us first prove that if f ∈WN,d(r) then supλ>0 λr∑
I 1|βI(f)|>λ <∞. Indeed,467 ∑I
1|βI(f)|>λ =∑I
∑`≥0
12`λ≤|βI(f)|<2`+1λ468
≤∑`≥0
∑I
(2`λ)−2|βI(f)|1|βI(f)|<2`+1λ469
This manuscript is for review purposes only.
VARIABLE GROUPWISE STRUCTURED WAVELETS 21
We now use the assumption that f ∈WN,d(r) which implies that470 ∑I
|βI(f)|21|βI(f)|<2`+1λ ≤ (2`+1λ)2−r471
Hence472 ∑I
1|βI(f)|>λ ≤∑`≥0
(2`λ)−2(2`+1λ)2−r ≤ 22−rλ−r
∑`≥0
2−`r
473
Since r > 0, the last sum converges and we get the first inclusion.474
Conversely, let us assume that supλ>0 λr∑
I 1|βI|>λ < ∞ and let us prove that475
f ∈WN,d(r). Indeed,476 ∑I
|βI|2 1|βI|<λ =∑I
∑`≥0
|βI(f)|2 12−`−1λ<|βI(f)|<2−`λ477
≤∑`≥0
∑I
(2−`λ)2 12−`−1λ<|βI(f)|<2−`λ478
We now use the assumption supλ>0 λr∑
I 1|βI(f)|>λ <∞ and deduce that479 ∑I
|βI(f)|2 1|βI|<λ ≤∑`≥0
(2−`λ)2(2−`−1λ)−r480
≤ λ2−r∑`≥0
2−`(2−r)
481
Since 2− r > 0 the last sum converges and we can conclude.482
The following proposition gives some details about the embeddings between the483
spaces FRN,d and WN,d(r) :484
Proposition 6.3. Assume that R1 = · · · = RN = R. Set r = 2dmax/(dmax+2R).485
Then486
FRN,d ⊂WN,d,logN−1(r) .487
Proof488
Let f ∈ FRN,d. Define Jλ such that 2−Jλ ≤ λ
22R+dmax ≤ 2−Jλ+1. Observe that489 ∑
I
|βI(f)|21|βI(f)|≤λ ≤∑|j|≤Jλ
∑k
|βI(f)|21|βI(f)|≤λ +∑|j|>Jλ
∑k
|βI(f)|21|βI(f)|≤λ490
≤ λ2∑|j|≤Jλ
∑k
1 +∑`>Jλ
∑|j|=`
2−2R|j|491
≤ λ2∑|j|≤Jλ
2j1d1+···+jNdN +∑`>Jλ
2−2R``N−1492
≤ λ2∑`≤Jλ
∑|j|=`
2|j|dmax +∑
`>Jλ,R
2−2R``N−1493
≤ λ22JλdmaxJN−1λ + 2−2RJλJN−1λ494
where in the two last inequalities we used the assumption f ∈ FRN,d and Theorem 6.1.495
The conclusion comes from the definition of the index Jλ,N and of that of the space496
WN,d(r) for r = 2dmax/(dmax + 2R).497
We can now use all these results to deduce the minimax results stated in Sec-498
tion 3.2.499
This manuscript is for review purposes only.
22 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
6.2. Proof of upper bound. We fix some function f ∈ FRN,d and bound as500
usual the quadratic risk E‖f − f‖2L2 :501
E‖fε − f‖2L2 ≤∑i
∑j1+···+jN>Jε
[βI(f)]2502
+∑i
∑j1+···+jN≤Jε
βI(f)2E[1|βI(fε)|≤λε ]503
+∑i
∑j1+···+jN≤Jε
E[|βI(fε)− βI(f)|21|βI(fε)|>λε ]504
We first bound the sum∑
i
∑j1+···+jN>Jε [βI(f)]2 using the assumption f ∈ FRN,d and505
Theorem 6.1. Since R1 = · · · = RN = R and j1 + · · ·+ jN > Jε, one deduces that506 ∑i
∑j1+···+jN>Jε
[β2I ] ≤ C2−2RJεJN−1ε .507
We now bound the sum∑
i
∑j1+···+jN≤Jε
∑k βI(f)2E[1|βI(fε)|≤λε ]. Observe that if508
|βI(fε)| ≤ λε, either |βI(f)| ≤ 2λε either |βI(fε)− βI(f)| > λε. It implies that509 ∑i
∑j1+···+jN≤Jε
∑k
βI(f)2E[1|βI(fε)|≤λε ] ≤∑i
∑j1+···+jN≤Jε
∑k
βI(f)2E[1|βI(f)|≤λε ]510
+∑i
∑j1+···+jN≤Jε
∑k
βI(f)2E[1|βI(fε)−βI(f)|>λε ]511
The sum∑
i
∑j1+···+jN≤Jε βI(f)2E[1|βI(f)|≤λε ] is deterministic and equals∑
i
∑j1+···+jN≤Jε
βI(f)21|βI(f)|≤λε .
It can then be bounded using the embedding proved in Proposition 6.3 and the defi-512
nition of the spaces WN,d(r).513
The bound of the sum∑i
∑j1+···+jN≤Jε
∑k βI(f)2E1|βI(fε)−βI(f)|>λε comes from514
the Gaussian assumption on the noise and the classical concentration inequality515
P(|Z| > λ) ≤ Ce−λ2/2 valid for any standard Gaussian random variable Z which516
implies that517 ∑i
∑j1+···+jN≤Jε
βI(f)2E[1|βI(fε)−βI(f)|>λε ] =∑i
∑j1+···+jN≤Jε
βI(f)2P[|(βI(fε)− βI(f))/ε| > λε/ε]518
≤ ε∑i
∑j1+···+jN≤Jε
βI(f)2 = Cεm2/2
519
We then deduce that520 ∑i
∑j1+···+jN≤Jε
∑k
βI(f)2E1|βI(fε)|≤λε ≤ Cεm2/2 + λ4R/(2R+dmax)
ε ≤ Cλ4R/(2R+dmax)ε521
as soon as m > 2√
2.522
Finally the bound of the sum∑i
∑j1+···+jN≤Jε
∑k E|βI(fε)−βI(f)|21|βI(fε)|>λε523
shall follow from Propositions 6.2 and 6.3. Indeed, one has by the Cauchy Schwartz524
This manuscript is for review purposes only.
VARIABLE GROUPWISE STRUCTURED WAVELETS 23
inequality and as in the bound of the last sum525 ∑i
∑j1+···+jN≤Jε
∑k
E|βI(fε)− βI(f)|21|βI(fε)|>λε526
≤ ε2∑
1|βI(f)|>λε + 2Jε/2ε2∑
P1/2[|βI(fε)− βI(f)| > λε/2]527
≤ Cλ4R/(2R+dmax)ε | log(λε))|N−1 + 2Jε/2εm
2/16 ≤ Cλ4R/(2R+dmax)ε | log(λε))|N−1528
as soon as m > 8. The last display is deduced from Proposition 6.3 and the classi-529
cal concentration inequality for standard Gaussian random variables. Gathering the530
bound of the three sums∑
i
∑j1+···+jN≤Jε βI(f)2E[1|βI(f)|≤λε ],
∑i
∑j1+···+jN≤Jε
∑k βI(f)2E1|βI(fε)−βI(f)|>λε531
and∑i
∑j1+···+jN≤Jε
∑k E|βI(fε) − βI(f)|21|βI(fε)|>λε , we get the upper bound532
stated in Theorem 3.1.533
6.3. Proof of lower bound. The derivation of a lower bound for wavelet func-534
tion estimation on FRN,d(K) relies on a classical procedure for constructing minimax535
lower bounds in non-parametric estimation known as the Assouad’s method. We give536
here a version of this lemma when functional spaces and quadratic L2 distances are537
considered (see Lemma 2 in [49] and Lemma 10.2 in [28]).538
Lemma 6.4 (Assouad’s lemma). Let V be a functional space containing a set of539
functions {gτ}τ with τ ∈ {0, 1}m. For each couple τ and τ ′, we write τ ∼ τ ′ if τ and540
τ ′ differs in only one coordinate, and τ ∼kτ ′ if it is the kth coordinate. If we assume541
that for any k542
infτ∼kτ ′E‖gτ − gτ ′‖2L2 ≥ δ543
Then, any estimator f of a function f ∈ V verifies544
maxgτ
E‖f − gτ‖2L2 ≥ mδ
2minτ∼τ ′{Λ(Pτ , Pτ ′)},545
where Pτ is the probability measure associated to gτ and the affinity Λ is given by546
Λ(Pτ , Pτ ′) = 1− |Pτ − Pτ′ |1
2.547
In order to use this lemma for constructing the lower bound, we need first to reduce548
the problem to a parametric family of the form {gτ}τ . The value of m depends on549
a fixed scale which calibrates the risk in order to have an optimal convergence. One550
expects that such scale defines the limit between finest scales for which the coefficients551
are small and so dominated by the smoothness (i.e encode noise on some coefficients)552
and coarse scales. As coarse scales do not contribute to the risk, the error is driven by553
the error made on the hardest scale of fine scales which we denote Jε. We are given554
C0 > 0 not depending on ε > 0 and Jε. Thererafter, we define the following family555
{gτ}τ depending on ε, C0. For any τ = (τI) ∈ {0, 1}m with m = #{I s.t |j| = Jε},556
we set557
gτ =∑i
∑j, j1+···+jN=Jε
βI,τψI with βI,τ =
{0 if τI = 0C0ε otherwise.
,558
We first check that this family belongs to the class V = FRN,d(K) :559
This manuscript is for review purposes only.
24 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
Lemma 6.5. Assume that ε22JεdmaxJε � 2−2JεR, with dmax = max(di). Then we560
have561
{gτ}τ ⊆ FRN,d(K)562
Proof563
It directly comes from the definition of the class FRN,d(K) and from Theorem 6.1.564
All that remains, now, is to apply lemma 6.4 for the class FRN,d(K) and the565
parametric family {gτ}τ . First, note that for the family {gτ}τ , we have566
δ = C20ε
2.567
Note also that the hypercube dimension m is given by the cardinality of the568
coefficients at the scale Jε.569
(11) m = #{I s.t |j| = Jε} = JN−1ε 2(∑ni=1 diji),570
Hence,571
maxgτ
E‖f − gτ‖2L2 ≥ C20ε
2JN−1ε 2(∑ni=1 diji) min
τ∼τ ′{Λ(Pτ , Pτ ′)},572
≥ C20ε
2JN−1ε 2Jεdmax minτ∼τ ′{Λ(Pτ , Pτ ′)},573
As usual, the calibration ε22JεdmaxJε � 2−2JεR imposes that Jε is of the same574
order as log(ε−1) which yields to575
(12) 2Jε =(ε2[log(ε−1)]N−1
)−1/(2R+dmax)
.576
Thus577
maxgτ
E‖f − gτ‖2L2 ≥ C20ε
2(ε2 log(ε−1)]N−1
)−dmax/(2R+dmax)
minτ∼τ ′{Λ(Pτ , Pτ ′)}.578
Finally, by definition the affinity Λ takes only positive values, which gives579
maxgτ
E‖f − gτ‖2L2 ≥ C(ε2 log(ε−1)]N−1
)2R/(2R+dmax)
.580
with C = C20 minτ∼τ ′{Λ(Pτ , Pτ ′)}, which ends the proof.581
REFERENCES582
[1] F. Abramovich and C. Angelini, Testing in mixed-effects fanova models, Journal of statistical583planning and inference, 136 (2006), pp. 4326–4348.584
[2] F. Abramovich, A. Antoniadis, T. Sapatinas, and B. Vidakovic, Optimal testing in a585fixed-effects functional analysis of variance model, International Journal of Wavelets, Mul-586tiresolution and Information Processing, 2 (2004), pp. 323–349.587
[3] C. M. Alaız, A. Barbero, and J. R. Dorronsoro, Group fused lasso, in Artificial Neural588Networks and Machine Learning–ICANN 2013, Springer, 2013, pp. 66–73.589
[4] A. Antoniadis, J. Bigot, and T. Sapatinas, Wavelet estimators in nonparametric regression:590a comparative simulation study, Journal of Statistical Software, 6 (2001), pp. pp–1.591
This manuscript is for review purposes only.
VARIABLE GROUPWISE STRUCTURED WAVELETS 25
[5] F. Autin, G. Claeskens, and J. Freyermuth, Hyperbolic wavelet thresholding methods and592the curse of dimensionality through the maxiset approach, Appl. Comput. Harmon. Anal.,59336 (2014), pp. 239–255.594
[6] F. Autin, G. Claeskens, and J. Freyermuth, Asymptotic performance of projection estima-595tors in standard and hyperbolic wavelet bases, To appear in Electronic journal of statistics,596(2015).597
[7] F. Autin, J.-M. Freyermuth, and R. von Sachs, Combining thresholding rules: a new way598to improve the performance of wavelet estimators, Journal of Nonparametric Statistics, 24599(2012), pp. 905–922.600
[8] R. G. Baraniuk, Optimal tree approximation with wavelets, in SPIE’s International Sym-601posium on Optical Science, Engineering, and Instrumentation, International Society for602Optics and Photonics, 1999, pp. 196–207.603
[9] J. Bigot and T. Sapatinas, Nonparametric adaptive time-dependent multivariate function604estimation, (2012).605
[10] E. Bostan, M. Unser, and J. P. Ward, Divergence-free wavelet frames, Signal Processing606Letters, IEEE, 22 (2015), pp. 1142–1146.607
[11] E. Bostan, O. Vardoulis, D. Piccini, P. D. Tafti, N. Stergiopulos, and M. Unser,608Spatio-temporal regularization of flow-fields, in Biomedical Imaging (ISBI), 2013 IEEE60910th International Symposium on, Ieee, 2013, pp. 836–839.610
[12] A. Buades, B. Coll, and J.-M. Morel, A non-local algorithm for image denoising, in Pro-611ceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern612Recognition (CVPR’05), vol. 2, 2005, pp. 60–65.613
[13] T. T. Cai, On block thresholding in wavelet regression: Adaptivity, block size, and threshold614level, Statistica Sinica, (2002), pp. 1241–1273.615
[14] C.-I. Chang, Hyperspectral imaging: techniques for spectral detection and classification, vol. 1,616Springer Science & Business Media, 2003.617
[15] P. Clarysse, J. Tafazzoli, P. Delachartre, and P. Croisille, Simulation based eval-618uation of cardiac motion estimation methods in tagged-mr image sequences, Journal619of Cardiovascular Magnetic Resonance, 13 (2011), p. P360, https://doi.org/10.1186/6201532-429X-13-S1-P360, http://jcmr-online.com/content/13/S1/P360.621
[16] I. Daubechies, Ten lectures on wavelets, vol. 61, SIAM, 1992.622[17] R. DeVore, S. Konyagin, and V. Temlyakov, Hyperbolic wavelet approximation, Construc-623
tive Approximation, 14 (1998), pp. 1–26.624[18] R. DeVore, G. Petrova, and P. Wojtaszczyk, Anisotropic smoothness spaces via level sets,625
Communications on Pure and Applied Mathematics, 61 (2008), pp. 1264–1297.626[19] D. L. Donoho and I. M. Johnstone, Ideal spatial adaptation by wavelet shrinkage, Biometrika,627
81 (1994), pp. 425–455.628[20] M. F. Duarte and R. G. Baraniuk, Image Processing, IEEE Transactions on, 21 (2012),629
pp. 494–504.630[21] J. M. Fadili, J. Mathieu, B. Romaniuk, and M. Desvignes, Bayesian wavelet-based poisson631
intensity estimation of images using the fisz transformation, in International conference632on image and signal processing, vol. 1, 2003, pp. 242–253.633
[22] J. M. Fadili, J. Mathieu, B. Romaniuk, and M. Desvignes, Bayesian wavelet-based pois-634son intensity estimation of images using the fisz transformation, in Proc.International635Conference on Image and Signal Processing, Agadir, Morocco, 2003, pp. 242–253.636
[23] Y. Farouj, J.-M. Freyermuth, L. Navarro, M. Clausel, and P. Delachartre, IEEE637Transactions on Computational Imaging, 3 (2017), pp. 1–10.638
[24] P. Fryzlewicz, Data-driven wavelet-fisz methodology for nonparametric function estimation,639Electronic journal of statistics, 2 (2008), pp. 863–896.640
[25] P. Fryzlewicz and G. P. Nason, A haar-fisz algorithm for poisson intensity estimation,641Journal of computational and graphical statistics, 13 (2004), pp. 621–638.642
[26] D. Garcia, J. C. del Alamo, D. Tanne, R. Yotti, C. Cortina, E. Bertrand, J. C. An-643toranz, E. Perez-David, R. Rieu, F. Fernandez-Aviles, et al., Two-dimensional intra-644ventricular flow mapping by digital processing conventional color-doppler echocardiography645images, Medical Imaging, IEEE Transactions on, 29 (2010), pp. 1701–1713.646
[27] K. Guo, D. Labate, W.-Q. Lim, G. Weiss, and E. Wilson, Wavelets with composite dila-647tions, Electronic research announcements of the American Mathematical Society, 10 (2004),648pp. 78–87.649
[28] W. Hardle, G. Kerkyacharian, D. Picard, and A. Tsybakov, Wavelets, approximation,650and statistical applications, vol. 129, Springer Science & Business Media, 2012.651
[29] B. K. P. Horn and B. G. Schunck, Determining optical flow, Artificial Intelligence, 17 (1981),652pp. 185–203.653
This manuscript is for review purposes only.
26 Y. FAROUJ, J.M. FREYERMUTH, L. NAVARRO, M. CLAUSEL, P. DELACHARTRE
[30] S. Kadri Harouna and V. Perrier, Divergence-free wavelet projection method for incom-654pressible viscous flow on the square, Multiscale Modeling & Simulation, 13 (2015), pp. 399–655422.656
[31] G. Kerkyacharian and D. Picard, Thresholding algorithms, maxisets and well-concentrated657bases, Test, 9 (2000), pp. 283–344.658
[32] I. Khalidov, J. Fadili, F. Lazeyras, D. Van De Ville, and M. Unser, Activelets:659Wavelets for sparse representation of hemodynamic responses, Signal Processing, 91660(2011), pp. 2810–2821.661
[33] I. Khalidov, D. Van De Ville, T. Blu, and M. Unser, Construction of wavelet bases that662mimic the behaviour of some given operator, in Optical Engineering+ Applications, Inter-663national Society for Optics and Photonics, 2007, pp. 67010S–67010S.664
[34] T. Loupas, W. McDicken, and P. Allan, An adaptive weighted median filter for speckle665suppression in medical ultrasonic images, Circuits and Systems, 36 (1989), pp. 129–135.666
[35] M. Markl, F. P. Chan, M. T. Alley, K. L. Wedding, M. T. Draney, C. J. Elkins,667D. W. Parker, R. Wicker, C. A. Taylor, R. J. Herfkens, et al., Time-resolved668three-dimensional phase-contrast mri, Journal of Magnetic Resonance Imaging, 17 (2003),669pp. 499–506.670
[36] L. Meier, S. Van De Geer, and P. Buhlmann, The group lasso for logistic regression, Journal671of the Royal Statistical Society: Series B (Statistical Methodology), 70 (2008), pp. 53–71.672
[37] M. H. Neumann, Multivariate wavelet thresholding in anisotropic function spaces, Statistica673Sinica, 10 (2000), pp. 399–431.674
[38] M. H. Neumann and R. Von Sachs, Wavelet thresholding in anisotropic function classes and675application to adaptive estimation of evolutionary spectra, The Annals of Statistics, 25676(1997), pp. 38–76.677
[39] R. D. Nowak and R. G. Baraniuk, Wavelet-domain filtering for photon imaging systems,678Image Processing, IEEE Transactions on, 8 (1999), pp. 666–678.679
[40] A. Quarteroni, A. Veneziani, and P. Zunino, Mathematical and numerical modeling of680solute dynamics in blood flow and arterial walls, SIAM Journal on Numerical Analysis, 39681(2002), pp. 1488–1511.682
[41] N. Remenyi, O. Nicolis, G. Nason, and B. Vidakovic, Image denoising with 2d scale-mixing683complex wavelet transforms, Image Processing, IEEE Transactions on, 23 (2014), pp. 5165–6845174.685
[42] H.-J. Schmeisser, Recent developments in the theory of function spaces with dominating mixed686smoothness, Nonlinear Analysis, Function Spaces and Applications, (2007), pp. 145–204.687
[43] H.-J. Schmeisser and W. Sickel, Spaces of functions of mixed smoothness and ap-688proximation from hyperbolic crosses, Journal of Approximation Theory, 128 (2004),689pp. 115 – 150, https://doi.org/http://dx.doi.org/10.1016/j.jat.2004.04.007, http://www.690sciencedirect.com/science/article/pii/S0021904504000693.691
[44] I. W. Selesnick and K. Y. Li, Video denoising using 2d and 3d dual-tree complex wavelet692transforms, in Optical Science and Technology, SPIE’s 48th Annual Meeting, International693Society for Optics and Photonics, 2003, pp. 607–618.694
[45] G. A. Shaw and H.-h. K. Burke, Spectral imaging for remote sensing, Lincoln Laboratory695Journal, 14 (2003), pp. 3–28.696
[46] R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical697Society. Series B (Methodological), (1996), pp. 267–288.698
[47] R. Tibshirani, M. Saunders, S. Rosset, J. Zhu, and K. Knight, Sparsity and smoothness via699the fused lasso, Journal of the Royal Statistical Society: Series B (Statistical Methodology),70067 (2005), pp. 91–108.701
[48] D. Van De Ville, T. Blu, B. Forster, and M. Unser, Semi-orthogonal wavelets that behave702like fractional differentiators, in Optics & Photonics 2005, International Society for Optics703and Photonics, 2005, pp. 59140C–59140C.704
[49] B. Yu, Assouad, fano, and le cam, in Festschrift for Lucien Le Cam, Springer, 1997, pp. 423–705435.706
[50] V. Zavadsky, Image approximation by rectangular wavelet transform, Journal of Mathematical707Imaging and Vision, 27 (2007), pp. 129–138.708
This manuscript is for review purposes only.