
INdAM Intensive Period “Computational Methods for Inverse Problems in Imaging”

Conclusive Workshop, Como, Italy, August 16-18, 2018

(Numerical linear algebra, regularization in Banach spaces and) variable exponent Lebesgue spaces

for adaptive regularization

Claudio Estatico, Dipartimento di Matematica, Università di Genova, Italy

Related joint works:

• Image restoration, acceleration and preconditioning: Paola Brianzi, Fabio Di Benedetto; Pietro Dell’Acqua; Marco Donatelli (DIMA, Univ. Genova; DISAT, Univ. dell’Aquila; Univ. Insubria, Como).

• Microwave inverse scattering: Alessandro Fedeli, Matteo Pastorino and Andrea Randazzo (Dip. di Ingegneria DITEN, Univ. Genova).

• Remote sensing: Flavia Lenti, Serge Gratton; David Titley-Peloquin; Maurizio Migliaccio, Ferdinando Nunziata; Federico Benvenuto, Alberto Sorrentino (CERFACS and ENSEEIHT, Université de Toulouse; Dep. of Bioresource Engineering, McGill University, Canada; Dip. per le Tecnologie, Univ. Napoli Parthenope; Dip. di Matematica (DIMA), Univ. Genova).

• Subsurface prospecting: Gian Piero Deidda; Patricia Diaz De Alba, Caterina Fenu, Giuseppe Rodriguez (Dip. di Ing. Civile, Dip. di Matematica e Informatica, Univ. Cagliari).

Introduction:

linear equations and inverse problems

in Hilbert vs Banach space settings

Inverse Problem

From the knowledge of some “observed” data y (i.e., the effect),

find an approximation of some model parameters x (i.e., the cause).

Given the (noisy) data y ∈ Y, find (an approximation of) the unknown x ∈ X such that

Ax = y

where A : X −→ Y is a known linear operator, and X, Y are two functional (here Hilbert or Banach) spaces.

True image Blurred (noisy) image Restored image

Inverse problems are usually ill-posed; they require regularization techniques.

Numerical linear algebra lives in Hilbert spaces

A Hilbert space is a complete vector space endowed with a scalar product that allows (lengths and) angles to be defined and computed.

In any Hilbert space, what happens in our Euclidean world (that is, in our part of the universe...) always holds:

• Pythagorean theorem: if u ⊥ v, then ‖u + v‖^2 = ‖u‖^2 + ‖v‖^2

• Parallelogram identity: ‖u + v‖^2 + ‖u − v‖^2 = 2(‖u‖^2 + ‖v‖^2)

• Gradient of (the square of) the norm: ∇(1/2 ‖u‖^2) = u

The scalar product allows also for a natural definition of:

• orthogonal projection (best approximation) of u on v: P(u) = 〈u, v〉 v / ‖v‖^2

• SVD decomposition (or eigenvalue/eigenvector decomposition) of linear operators (for separable Hilbert spaces, such as any finite dimensional Hilbert space...)

Regularization: the “classical” framework in Hilbert spaces

In general, all the regularization methods for ill-posed functional equations have been deeply investigated in the context of Hilbert spaces.

Benefits: Any linear (or linearized) operator in Hilbert spaces can be decomposed into a set of eigenfunctions by using conventional spectral theory. This way, convergence and regularization properties of any solving method can be analyzed by considering the behavior of any single eigenfunction (i.e., we can use the SVD decomposition...).

Drawback: Regularization methods in Hilbert spaces usually give rise to smooth (and over-smooth) solutions. In image deblurring, regularization methods in Hilbert spaces do not allow a good localization of the edges.

Regularization: the “recent” framework in Banach spaces

More recently, some regularization methods have been introduced and investigated in Banach spaces. In any Banach space, only distances between its elements can be defined and measured, but no scalar product (thus no “angle”) is defined.

Benefits: Due to the geometrical properties of Banach spaces, these regularization methods allow us to obtain solutions with lower over-smoothness, which results, for instance, in a better localization and restoration of the discontinuities in imaging applications. Another useful property of regularization in Banach spaces is that solutions are sparser, that is, in general they can be represented by very few components.

Drawback: The “mathematics” is much more involved (spectral theory cannot be used...). Convex analysis is required.

Regularization: Hilbert VS Banach spaces

            Hilbert spaces                    Banach spaces
            L^2(R^n), H^s = W^{s,2}           L^p(R^n), W^{s,p}, p ≥ 1; BV
Benefits    Easier computation                Better restoration of the discontinuities;
            (spectral theory,                 sparse solutions
            eigencomponents)
Drawbacks   Over-smoothness                   Theoretically tricky
            (bad localization of edges)       (convex analysis required)

Recap:

In Banach spaces we have no scalar product (so, no orthogonal projection), no Pythagorean theorem, no SVD...

The (sub-)gradient of (the square of) the norm is not linear, so that computing the least squares solution (of a linear problem) is no longer a linear problem...

Examples: L1 for sparse recovery or Lp, 1 < p < 2, for edge restoration.

[Figure: graphs of 1/2 ‖x‖_2^2 on L^2(R^2), 1/1.2 ‖x‖_1.2^1.2 on L^1.2(R^2), 1/5 ‖x‖_5^5 on L^5(R^2), and of their gradients ∇(1/2 ‖x‖_2^2), ∇(1/1.2 ‖x‖_1.2^1.2), ∇(1/5 ‖x‖_5^5).]

Regularization in Hilbert and Banach spaces

I Iterative minimization algorithms

Gradient-like methods: Landweber, CG

II Iterative projection algorithms

SIRT: Cimmino, DROP, ...

III Preconditioning

Trigonometric matrix algebra preconditioners

IV Multi-parameter and Adaptive regularization

The Variable exponent Lebesgue spaces

Regularization in Hilbert and Banach spaces

I

Iterative minimization algorithms

Iterative minimization algorithms

In the variational approach, instead of solving the operator equation Ax = y directly, we minimize the residual functional H : X −→ [0,+∞)

H(x) = (1/r) ‖Ax − y‖_Y^r .

Basic iterative scheme:

xk+1 = xk − λkΦA(xk, y)

where the operator Φ_A : X × Y −→ X returns a value

Φ_A(x_k, y) ≈ ∇H(x_k) = ∇( (1/r) ‖Ax_k − y‖_Y^r ),

that is, (an approximation of) the “gradient” of the functional (1/r) ‖Ax − y‖_Y^r at the point x_k, and λ_k > 0 is the step length. This way, the iterative schemes are all different generalizations of the basic gradient descent method.

The Landweber algorithm in Hilbert spaces

In the conventional case (i.e., both X and Y are Hilbert spaces) we consider the least squares functional

H_2(x) = (1/2) ‖Ax − y‖_Y^2 .

Direct computation of ∇H_2 by the chain rule for derivatives (in R^n)

∇H_2(x) = ( ((∇ 1/2 ‖u‖^2)|_{u=Ax−y})^* J(Ax − y) )^* = ((Ax − y)^* A)^* = A^*(Ax − y)

leads to the “simplest” iterative method: the Landweber algorithm

xk+1 = xk − λA∗(Axk − y)

Since H_2 is convex, there exists a non-empty set of stationary points, i.e. ∇H_2(x) = 0, which are all minimum points of H_2.

A constant step size λ ∈ (0, 2/‖A^*A‖) yields H_2(x_{k+1}) < H_2(x_k) and also guarantees the convergence of the iterations.
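For concreteness, the following is a minimal NumPy sketch (not from the talk) of the Landweber iteration above for a discretized problem, with A a real matrix so that A^* = A^T; the function name and the default step size are illustrative choices.

```python
import numpy as np

def landweber_hilbert(A, y, n_iter=200, lam=None):
    """Landweber iteration x_{k+1} = x_k - lam * A^T (A x_k - y) in R^n."""
    m, n = A.shape
    if lam is None:
        # constant step size in (0, 2/||A^T A||): here 1/||A||_2^2
        lam = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(n)
    for _ in range(n_iter):
        x = x - lam * (A.T @ (A @ x - y))
    return x
```

With noisy data, the loop would be stopped early (e.g., by a discrepancy criterion) instead of running a fixed number of iterations.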

Towards the Landweber algorithm in Banach spaces (I)

xk+1 = xk − λA∗(Axk − y)

Formally, A^* is the dual operator of A, that is, the operator A^* : Y^* −→ X^* such that

y^*(Ax) = (A^* y^*)(x) , ∀x ∈ X and ∀y^* ∈ Y^* ,

where X^* and Y^* are the dual spaces of X and Y.

If X and Y are Hilbert spaces, then X is isometrically isomorphic to X^* and Y is isometrically isomorphic to Y^* (by virtue of the Riesz theorem), so that the operator A^* : Y^* −→ X^* can be identified with A^* : Y −→ X.

However, in general Banach spaces are not isometrically isomorphic to their duals. This way, the Landweber iteration above is well defined in Hilbert spaces (...only!)

The key point: To generalize from Hilbert to Banach spaces we have toconsider the so-called duality maps.

Towards the Landweber algorithm in Banach spaces (II)

The computation of the (sub-)gradient of the residual functional requires the (sub-)gradient of the norm of the Banach space involved. Here the key tool is the so-called “duality map”, which maps a Banach space B into its dual B^*.

Theorem (Asplund, Acta Math., 1968). Let B be a Banach space and let r > 1. The duality map J_r^B is the (sub-)gradient of the convex functional f : B −→ R defined as f(u) = (1/r) ‖u‖_B^r (with abuse of notation...):

J_r^B = ∇f = ∇( (1/r) ‖·‖_B^r ).

Again by the chain rule, the (sub-)gradient of the residual (1/r) ‖Ax − y‖_Y^r can be computed by means of the duality map J_r^Y as follows:

∇( (1/r) ‖Ax − y‖_Y^r ) = A^* J_r^Y (Ax − y) .

The Landweber method in Banach spaces

Let r > 1 be a fixed weight value. Let x_0 ∈ X be an initial guess (the null vector x_0 = 0 ∈ X is often used in applications), and set x*_0 = J_r^X(x_0) ∈ X^*.

For k = 0, 1, 2, . . .

x*_{k+1} = x*_k − λ_k A^* J_r^Y (Ax_k − y) ,

x_{k+1} = J_{r^*}^{X^*} (x*_{k+1}) ,

where r^* is the Hölder conjugate of r, that is, 1/r + 1/r^* = 1, and the step sizes λ_k are suitably chosen.

Here the duality map J_r^X : X −→ 2^{X^*} acts on the iterates x_k ∈ X, and the duality map J_{r^*}^{X^*} : X^* −→ 2^{X^{**}} acts on the iterates x*_k ∈ X^*. In order for the scheme to be well defined, it is only required that the space X is reflexive, that is, X^{**} is isometrically isomorphic to X, so that J_{r^*}^{X^*} maps X^* into (subsets of) X.

Landweber iterative method in Hilbert spaces

A : X −→ Y ,   A^* : Y −→ X ,   H_2(x) = (1/2) ‖Ax − y‖_Y^2

x_{k+1} = x_k − λ A^*(Ax_k − y)

Landweber iterative method in Banach spaces

A : X −→ Y ,   A^* : Y^* −→ X^* ,   H_r(x) = (1/r) ‖Ax − y‖_Y^r

x_{k+1} = J_{r^*}^{X^*} ( J_r^X x_k − λ_k A^* J_r^Y (Ax_k − y) )

Some remarks:

(i) Any duality map is in general nonlinear (and multi-valued...), so that, unlike in the Hilbert space case, the Landweber method is not linear.

(ii) In the Banach space L^p, p ∈ (1,+∞), by direct computation we have

J_r^{L^p}(x) = ‖x‖_p^{r−p} |x|^{p−1} sgn(x) .

J_r^{L^p} is a non-linear, single-valued, diagonal operator whose cost is O(n). It does not increase the global numerical complexity O(n log n) of image deblurring with (FFT-based) structured matrices. Note that J_2^{L^2} = I.
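As an illustration of the two formulas above (a sketch under the stated assumptions, not the authors' code), the componentwise duality map of ℓ^p and the dual-space Landweber iteration can be written as follows; the constant step size lam is a hypothetical simplification of the “suitably chosen” λ_k.

```python
import numpy as np

def duality_map(v, p, r):
    """J_r^{l^p}(v) = ||v||_p^{r-p} |v|^{p-1} sgn(v), applied componentwise (O(n) cost)."""
    nrm = np.linalg.norm(v, p)
    if nrm == 0:
        return np.zeros_like(v)
    return nrm ** (r - p) * np.abs(v) ** (p - 1) * np.sign(v)

def landweber_banach(A, y, p=1.5, r=2.0, n_iter=200, lam=1e-3):
    """Dual-space Landweber iteration with X = Y = l^p, 1 < p < 2:
    x*_{k+1} = x*_k - lam * A^T J_r^Y(A x_k - y),  x_{k+1} = J_{r*}^{X*}(x*_{k+1})."""
    p_star = p / (p - 1.0)      # Hoelder conjugate of p, so X* = l^{p*}
    r_star = r / (r - 1.0)      # Hoelder conjugate of r
    x = np.zeros(A.shape[1])
    xs = duality_map(x, p, r)   # x*_0 = J_r^X(x_0) = 0
    for _ in range(n_iter):
        xs = xs - lam * (A.T @ duality_map(A @ x - y, p, r))
        x = duality_map(xs, p_star, r_star)   # back to the primal space
    return x
```

For p = r = 2 both duality maps reduce to the identity and the scheme coincides with the Hilbert-space Landweber iteration.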

A convergence result for the Landweber method in Banach spaces

Proposition [Schopfer, Louis, Schuster, Inv. Prob., 2006]. Let X be a reflexive Banach space, and Y an (arbitrary) Banach space. Let y ∈ R(A) and let x† be the minimum-norm pseudo-solution of Ax = y.

If λ_k > 0 is suitably (...) chosen for all k, then the sequence of the iterates (x_k) converges strongly to x†, that is,

lim_{k→+∞} ‖x_k − x†‖_X = 0

If the data y is noisy, an early stop of the iterations (by the discrepancy principle) gives rise to an iterative regularization method.

In the Hilbert setting, the iterations are defined in the (primal) space X.

In the Banach setting, the iterations are defined in the dual space X^* and are linked to the (“wide”...) Banach fixed point theory.

Duality maps are the basic tool for generalizing classical iterative methods for linear systems to Banach spaces: Landweber method, CG, Mann iterations, Gauss-Newton gradient-type iterations (for nonlinear problems).

Basic hypotheses for the convergence (and regularization behavior): uniformly smooth and (uniformly) convex Banach spaces. See also: Scherzer, Kaltenbacher, Fornasier, Hofmann, Kazimierski, Poschl, Hein, Q. Jin, Tautenhahn, Neubauer, Tomba, ...

To reduce over-smoothness, these methods have been implemented in the context of L^p Lebesgue spaces with 1 < p ≤ 2.

p ≳ 1:  low “regularization”; good recovery of edges and discontinuities; improved sparsity.

p ≈ 2:  high “regularization”; higher stability; over-smoothness.

Numerical evidence in L^p Lebesgue spaces, 1 < p ≤ 2

The classical 2D image restoration problem, with the Landweber method (200 iterations)

True image x PSF (A) Blurred and noisy image y

Hilbert Restoration (p = 2) Banach Restoration (p = 1.5) Banach Restoration (p = 1.2)

A classical image restoration example

Convergence history:

Relative restoration errors ‖x − x_k‖_2^2 / ‖x‖_2^2 versus iteration index k

The Conjugate Gradient in Lp Banach spaces

The Landweber algorithm in the Banach space setting gives better restorations, but it is still slow. Instead of −∇H(x_k), we consider a different anti-gradient descent direction, based on the same idea as the classical CG.

The CG(NR) method for the minimization of H(x) = (1/2) ‖Ax − y‖_2^2, in Hilbert spaces:

x_{k+1} = x_k + α_k p_k
p_{k+1} = −∇H(x_{k+1}) + β_k p_k = −A^*(Ax_{k+1} − y) + β_k p_k

where p_0 = −∇H(x_0) = −A^*(Ax_0 − y), and

α_k = (Ap_k)^T (y − Ax_k) / ((Ap_k)^T (Ap_k)) ,   β_k = (∇H(x_{k+1})^T ∇H(x_{k+1})) / (∇H(x_k)^T ∇H(x_k)) .

The step size α_k is called optimal, since α := α_k solves the linear equation (d/dα) H(x_k + α p_k) = 0 .
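A minimal NumPy sketch of this CG(NR) recursion, written with exactly the α_k and Fletcher-Reeves β_k above (illustrative names; A is a real matrix, so A^* = A^T):

```python
import numpy as np

def cgnr_hilbert(A, y, n_iter=50):
    """CG(NR): x_{k+1} = x_k + alpha_k p_k,  p_{k+1} = -grad H(x_{k+1}) + beta_k p_k."""
    x = np.zeros(A.shape[1])
    g = A.T @ (A @ x - y)                       # grad H(x_0) = A^T (A x_0 - y)
    p = -g
    for _ in range(n_iter):
        Ap = A @ p
        alpha = (Ap @ (y - A @ x)) / (Ap @ Ap)  # optimal step length
        x = x + alpha * p
        g_new = A.T @ (A @ x - y)
        if np.linalg.norm(g_new) < 1e-14:       # stationary point reached
            break
        beta = (g_new @ g_new) / (g @ g)        # Fletcher-Reeves coefficient
        p = -g_new + beta * p
        g = g_new
    return x
```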

Some facts about the conventional CG method.

Definition: x_k is said to be optimal with respect to the direction v if H(x_k) ≤ H(x_k + λv) for all λ ∈ R.

By virtue of the optimal choice of the step length α_k, the new iterate x_{k+1} = x_k + α_k p_k is optimal with respect to the direction p_k.

The key idea of the CG method is that the new iterate x_{k+1} must be optimal w.r.t. the (previous) direction p_{k−1} too. This leads to the computation of the special coefficient β_k (i.e., the so-called Fletcher-Reeves formula), and it can be proven that x_{k+1} is automatically optimal w.r.t. all the previous directions p_0, p_1, ..., p_{k−2}.

The optimality of x_{k+1} w.r.t. the previous descent direction p_{k−1} is important to achieve (finite step) convergence. Indeed, in the two dimensional Hilbert (i.e., L^2 or Euclidean...) case, all the points which are optimal with respect to a fixed direction lie on a line passing through the minimum point.

[Figure: level sets and points optimal with respect to a fixed direction in L^2(R^n), L^1.2(R^n), and L^5(R^n).]

The Conjugate Gradient method in Lp Banach spaces

The CG(NR) method for the minimization of H(x) = (1/p) ‖Ax − y‖_Y^p, in Y = L^p:

x*_{k+1} = x*_k + α_k p*_k

x_{k+1} = J_{p^*}^{X^*} (x*_{k+1})

p*_{k+1} = −∇H(x_{k+1}) + β_k p*_k = −A^* J_p^Y (Ax_{k+1} − y) + β_k p*_k

where p*_0 = −∇H(x_0) = −A^* J_p^Y (Ax_0 − y), the step size α_k solves (approximately) the nonlinear equation

(d/dα) H( J_{p^*}^{X^*} (x*_k + α p*_k) ) = 0 ,

and the coefficient β_k satisfies a “Fletcher-Reeves”-like formula

β_k = γ ‖Ax_{k+1} − y‖_p^p / ‖Ax_k − y‖_p^p ,   with γ < 1/2 .
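A possible NumPy/SciPy sketch of the recursion above for X = Y = ℓ^p with r = p (an illustration, not the authors' implementation): the 1D line search for α_k is delegated to a generic scalar minimizer as a stand-in for the approximate solve mentioned on the slide, and γ = 0.4 is an illustrative choice.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def duality_map(v, p, r):
    """J_r^{l^p}(v) = ||v||_p^{r-p} |v|^{p-1} sgn(v), componentwise."""
    nrm = np.linalg.norm(v, p)
    return np.zeros_like(v) if nrm == 0 else nrm ** (r - p) * np.abs(v) ** (p - 1) * np.sign(v)

def cg_lp(A, y, p=1.5, n_iter=30, gamma=0.4):
    """Conjugate-gradient-like method in l^p, following the dual-space recursion above."""
    p_star = p / (p - 1.0)                                # Hoelder conjugate of p
    x = np.zeros(A.shape[1])
    xs = np.zeros(A.shape[1])                             # x*_0 = J_p^X(x_0) = 0
    ps = -A.T @ duality_map(A @ x - y, p, p)              # p*_0 = -A^T J_p^Y(A x_0 - y)
    H = lambda z: np.linalg.norm(A @ z - y, p) ** p / p   # residual functional
    res_old = np.linalg.norm(A @ x - y, p) ** p
    for _ in range(n_iter):
        phi = lambda a: H(duality_map(xs + a * ps, p_star, p_star))
        alpha = minimize_scalar(phi).x                    # approximate optimal step length
        xs = xs + alpha * ps
        x = duality_map(xs, p_star, p_star)               # back to the primal space
        res_new = np.linalg.norm(A @ x - y, p) ** p
        beta = gamma * res_new / max(res_old, 1e-30)      # "Fletcher-Reeves"-like coefficient
        ps = -A.T @ duality_map(A @ x - y, p, p) + beta * ps
        res_old = res_new
    return x
```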

Convergence of CG in Lp Banach spaces

Theorem [E., Gratton, Lenti, Titley-Peloquin; Num. Math., 2018]. (...) the sequence (x_k)_k of the CG method in L^p converges strongly to the minimum norm pseudo-solution x† of Ax = y, i.e.

‖x_k − x†‖_p −→ 0   (k −→ +∞) .

If the data y is noisy, an early stop of the iterations (by the discrepancy principle) gives rise to an iterative regularization method.

Remark: Differing from the conventional CG in Hilbert spaces, there is NO finite (n steps) convergence in R^n. However, finite-step convergence in Banach spaces still holds in a simplified setting [Herzog R., Wollner W.; J. Inverse Ill-Posed Probl., 2017].

[Figure: relative errors vs. iterations in L^2, L^1.5, L^1.2; red: CG; magenta: Landweber.]

Some comments about the results:

For p = 2 the CG method is too fast (i.e., not enough regularization), and does not provide the same good quality as the slower Landweber method.

For the smaller p = 1.5, 1.2 the CG decelerates, and it gives the same quality of restoration given by the Landweber method, in far fewer iterations.

Regularization in Hilbert and Banach spaces

II

Iterative projection algorithms

Iterated projections (Row Action Methods)

The points x satisfying the i-th row equation 〈a_i, x〉 = b_i of the m × n linear system Ax = b define a hyperplane

Q_i = { x ∈ R^n : 〈a_i, x〉 = b_i } .

The solution x of the linear system belongs to all the hyperplanes Q_i, for i = 1, ..., m.

In Hilbert spaces, BECAUSE

• the orthogonal projection P_i(z) of a point z onto one hyperplane Q_i is easy to perform,

P_i(z) = z + (b_i − 〈a_i, z〉) / ‖a_i‖_2^2 · a_i^T ,

• and the projection P_i(z) is closer to the solution x than z itself,

THEN iteratively projecting x_{k+1} = P_i(x_k) onto different hyperplanes Q_i gives rise to a “low-cost” sequence (x_k)_{k=1}^{+∞} converging to the solution x.

Some families of projection methods (in Hilbert spaces)

• ART (Algebraic Reconstruction Techniques) or Kaczmarz‘s methods

x_{k+1} = x_k + λ_k (b_i − 〈a_i, x_k〉) / ‖a_i‖^2 · a_i^T

where the row index i depends on the iteration index k (in many ways, for instance i = k mod m, where m is the number of rows of A, or even randomly (!)), and λ_k ∈ R is a relaxation parameter (a minimal code sketch of this row-action scheme is given after this list).

• SIRT (Simultaneous Iterative Reconstruction Techniques)

The same “fixed” iterate x_k is used (i.e., simultaneously) for a “complete” set of m different projections, that is, for a full set of indexes i = 1, 2, ..., m.

In this case, a matrix form iteration holds

xk+1 = xk + λkSA∗M(b− Axk)

SIRT: Cimmino method, DROP (Diagonally Relaxed Orthogonal Projection), CAV (Component Averaging), BIP (Block-Iterative Projections), ...
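Before detailing the Cimmino and DROP members of the SIRT family on the next slide, here is a minimal sketch of the cyclic ART/Kaczmarz row-action iteration recalled above (illustrative code, with i = k mod m and a fixed relaxation parameter):

```python
import numpy as np

def kaczmarz(A, b, n_sweeps=10, lam=1.0):
    """ART/Kaczmarz: x <- x + lam * (b_i - <a_i, x>) / ||a_i||^2 * a_i, cycling over the rows."""
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(n_sweeps):
        for i in range(m):                  # one full sweep over the m rows
            a = A[i]
            x = x + lam * (b[i] - a @ x) / (a @ a) * a
    return x
```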

Cimmino and DROP methods

In the SIRT family xk+1 = xk + λkSA∗M(b− Axk) , we have:

• the Cimmino method (1938), where the new iterate is the average of a complete set of m projections

x_{k+1} = (1/m) Σ_{i=1}^m P_i(x_k) = x_k + (1/m) Σ_{i=1}^m (b_i − 〈a_i, x_k〉) / ‖a_i‖^2 · a_i^T ,

so that

S = I ,   M = (1/m) D = (1/m) diag(1/‖a_1‖^2, 1/‖a_2‖^2, ..., 1/‖a_m‖^2) .

• the DROP method (Diagonally Relaxed Orthogonal Projection), where

S = diag(1/r_1, 1/r_2, ..., 1/r_n) ,   M = D = diag(1/‖a_1‖^2, 1/‖a_2‖^2, ..., 1/‖a_m‖^2) ,

where r_j is the number of nonzero elements of the j-th column of A.
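A compact sketch of the SIRT iteration x_{k+1} = x_k + λ_k S A^T M (b − A x_k) with the Cimmino and DROP choices of the diagonal matrices S and M just given (illustrative code, constant relaxation parameter):

```python
import numpy as np

def sirt(A, b, variant="cimmino", n_iter=100, lam=1.0):
    """SIRT iteration with diagonal S and M stored as vectors (componentwise products)."""
    m, n = A.shape
    row_norms2 = np.sum(A ** 2, axis=1)                  # ||a_i||^2
    if variant == "cimmino":
        S = np.ones(n)                                   # S = I
        M = (1.0 / m) / row_norms2                       # M = (1/m) diag(1/||a_i||^2)
    elif variant == "drop":
        col_nnz = np.count_nonzero(A, axis=0)            # r_j: nonzero entries in column j
        S = 1.0 / np.maximum(col_nnz, 1)                 # S = diag(1/r_j)
        M = 1.0 / row_norms2                             # M = diag(1/||a_i||^2)
    else:
        raise ValueError("variant must be 'cimmino' or 'drop'")
    x = np.zeros(n)
    for _ in range(n_iter):
        x = x + lam * S * (A.T @ (M * (b - A @ x)))
    return x
```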

[Figure: Cimmino iteration in R^2 for the two lines −x + 3y = 8 (a_1 = (−1, 3), b_1 = 8) and −2x + (4/3)y = 8/3 (a_2 = (−2, 4/3), b_2 = 8/3): the projections P_i(x) = x + (b_i − 〈a_i, x〉) a_i / ‖a_i‖^2 of x_0, and the new iterate x_1 = [P_1(x_0) + P_2(x_0)] / 2.]

DROP method (in Hilbert spaces)

A : X −→ Y ,   A^* : Y −→ X ,   H_M(x) = (1/2) ‖Ax − y‖_M^2

x_{k+1} = S (S^{−1} x_k − λ_k A^* M (Ax_k − y))

Landweber iterative method in Banach spaces

A : X −→ Y ,   A^* : Y^* −→ X^* ,   H_r(x) = (1/r) ‖Ax − y‖_Y^r

x_{k+1} = J_{r^*}^{X^*} ( J_r^X x_k − λ_k A^* J_r^Y (Ax_k − y) )

Some remarks about J_r^Y, J_{r^*}^{X^*} (in L^p Banach spaces) VS M, S (in L^2 Hilbert spaces):

• J_r^Y is non-linear and M is linear; J_{r^*}^{X^*} is non-linear and S is linear;

• all are diagonal and positive operators;

• all cost O(n) operations;

• the action of the matrix M in DROP is “similar” to that of the duality map J_r^Y = J_2^{L^p} with 1 < p < 2 in Landweber-Banach.


In summary, the Landweber iterative method in Banach spaces can also be viewed as a non-linear generalization of well-known projection algorithms for linear systems.

The oblique geometry of the M-induced norm ‖·‖_M in the Hilbert space setting is replaced by the L^p-norm ‖·‖_{L^p} in the L^p Banach space setting.

Regularization in Hilbert and Banach spaces

III

Preconditioning

Preconditioning in Banach Space

Preconditioned system in Hilbert space, with (invertible) preconditioner D

A∗Ax = A∗y ⇐⇒ DA∗Ax = DA∗y

Preconditioned Landweber method in Hilbert spaces

xk+1 = xk − λDA∗(Axk − y)

where the preconditioner D is a regularized (polynomial, rational, circulant) “low-cost” approximation (a structured and/or sparse matrix) of the inverse of A^*A, i.e.,

D ≈ (A∗A)−1 .

Now, in Banach spaces A^*A is not well defined (...it cannot even be written!)

We generalized preconditioning schemes to Banach space in two ways

• Primal preconditioning

• Dual preconditioning

[Brianzi, Di Benedetto, E., Comp. Opt. Appl., 2013]

Primal Preconditioning in Banach Space

Remember, in Hilbert space:

xk+1 = xk − λDA∗(Axk − y) with D ≈ (A∗A)−1

Primal preconditioner: D : X −→ X

The primal preconditioner is a regularized approximation of ( J_{r^*}^{X^*} A^* J_r^Y A )^{−1}.

From Hilbert space:

xk+1 = D(D−1xk − λA∗(Axk − y))

To Banach space:

x_{k+1} = D J_{r^*}^{X^*} ( J_r^X D^{−1} x_k − λ A^* J_r^Y (Ax_k − y) )

Recalling that A : X −→ Y, A^* : Y^* −→ X^*, J_r^X : X −→ X^*, J_{r^*}^{X^*} : X^* −→ X, J_r^Y : Y −→ Y^*, the iterations are well-defined.

Dual Preconditioning in Banach Space

Remember, in Hilbert space:

xk+1 = xk − λDA∗(Axk − y) with D ≈ (A∗A)−1 .

Dual preconditioner: D : X^* −→ X^*

The dual preconditioner is a regularized approximation of ( A^* J_r^Y A J_{r^*}^{X^*} )^{−1}.

From Hilbert space:

xk+1 = xk − λDA∗(Axk − y)

To Banach space:

x_{k+1} = J_{r^*}^{X^*} ( J_r^X x_k − λ D A^* J_r^Y (Ax_k − y) )

Recalling that A : X −→ Y, A^* : Y^* −→ X^*, J_r^X : X −→ X^*, J_{r^*}^{X^*} : X^* −→ X, J_r^Y : Y −→ Y^*, the iterations are well-defined.
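The following sketch assembles the dual-preconditioned iteration above for a small discretized problem. As a naive stand-in for the regularized (circulant, structured) preconditioners used in the talk, D is taken here as the dense Tikhonov-type matrix (A^T A + αI)^{-1}; all parameter values are illustrative.

```python
import numpy as np

def duality_map(v, p, r):
    """J_r^{l^p}(v) = ||v||_p^{r-p} |v|^{p-1} sgn(v), componentwise."""
    nrm = np.linalg.norm(v, p)
    return np.zeros_like(v) if nrm == 0 else nrm ** (r - p) * np.abs(v) ** (p - 1) * np.sign(v)

def dual_prec_landweber(A, y, p=1.8, r=2.0, alpha=1e-2, n_iter=100, lam=1.0):
    """Dual-preconditioned Landweber in l^p:
    x_{k+1} = J_{r*}^{X*}( J_r^X x_k - lam * D A^T J_r^Y (A x_k - y) )."""
    p_star = p / (p - 1.0)
    r_star = r / (r - 1.0)
    n = A.shape[1]
    D = np.linalg.inv(A.T @ A + alpha * np.eye(n))   # naive regularized approximation of (A^T A)^{-1}
    x = np.zeros(n)
    for _ in range(n_iter):
        xs = duality_map(x, p, r) - lam * (D @ (A.T @ duality_map(A @ x - y, p, r)))
        x = duality_map(xs, p_star, r_star)
    return x
```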

Convergence analysis of preconditioning in Lp Banach Space

These preconditioned algorithms can be read as fixed-point iterations in the dual space X^*, that is,

x*_{k+1} = g(x*_k) ,

where g : X^* −→ X^* in the dual preconditioning is

g(t) = t − λ D A^* J_r^Y ( A J_{r^*}^{X^*}(t) − y ) .

By simple direct computation, its Jacobian J_g in L^p Banach spaces is

J_g(x^*) = I − λ D A^* A + E ,

where E is a matrix of small norm for p close to (the Hilbert value) p = 2.

This way, by a Bauer-Fike argument, the eigenvalues of J_g(x^*) are “perturbations” of those of the iteration matrix I − λDA^*A of the conventional Hilbert case. It follows that, for p ≈ 2, the spectral radius of J_g(x^*) is essentially the same as that of the Hilbert case.

In particular, choosing D as an optimal approximation of (A^*A)† leads to an acceleration of the convergence, at least for p ≈ 2.

(Regularized) preconditioning for image restoration. Preconditioner: Tikhonov-filtered T. Chan optimal circulant preconditioner.

[Figure: true image x; no preconditioner, p = 1.5, 200 iterations, rel. err. 0.4182; Tikhonov prec., p = 1.8, 93 iterations, rel. err. 0.3482; Tikhonov prec., p = 1.5, 33 iterations, rel. err. 0.3383.]

[Figure: relative error vs. iterations for p = 1.5 with Tikhonov filter (alpha = 0.02), p = 1.5 with no preconditioner, and p = 2.0 with Tikhonov filter (alpha = 0.02).]

Regularization in (Hilbert and) Banach spaces

IV

Multi-parameter and Adaptive regularization

A “new” framework: variable exponent Lebesgue spaces Lp(·)

In image restoration, different regions of the image often require a different “amount of regularization”. Setting different levels of regularization is useful because background, low intensity, and high intensity values require different filtering levels [Nagy, Pauca, Plemmons, Torgersen, J. Opt. Soc. Am. A, 1997].

The idea: the ill-posed functional equation Af = g is solved in L^{p(·)} Banach spaces, namely, the variable exponent Lebesgue spaces, a special case of the so-called Musielak-Orlicz functional spaces (first proposed in two seminal papers in 1931 and 1959, but intensively studied only in the last 10 years).

In a variable exponent Lebesgue space, to measure a function f, instead of a constant exponent p over the whole domain, we have a pointwise variable exponent (i.e., a function) 1 ≤ p(·) ≤ +∞.

This way, different values of the exponent, i.e. different kinds of regularization, on different regions of the image to restore, can be automatically and adaptively assigned.

A sketch on variable exponent Lebesgue spaces Lp(·)

L^p(Ω)                                          L^{p(x)}(Ω)
1 ≤ p ≤ ∞, p is constant                        p(x) : Ω → [1,∞], p(x) is a measurable function
‖f‖_p = ( ∫_Ω |f(x)|^p dx )^{1/p}               ‖f‖_{p(·)} = ( ∫_Ω |f(x)|^{p(x)} dx )^{1/???}
‖f‖_∞ = ess sup |f(x)|                          ...
f ∈ L^p(Ω) ⟺ ∫_Ω |f(x)|^p dx < ∞                f ∈ L^{p(·)}(Ω) ⟺ ???

In the following, Ω_∞ = { x ∈ Ω : p(x) = ∞ } has zero measure.

Example: Ω = [−5, 5],   p(x) = 2 if −5 ≤ x ≤ 0,   p(x) = 3 if 0 < x ≤ 5.

Then f(x) = 1/|x+1|^{1/3} ∈ L^{p(·)}([−5, 5]), while f(x) = 1/|x−1|^{1/3} ∉ L^{p(·)}([−5, 5]).

The norm of variable exponent Lebesgue spaces

In the conventional case L^p, the norm is ‖f‖_{L^p} = ( ∫_Ω |f(x)|^p dx )^{1/p} .

In L^{p(·)} Lebesgue spaces, the definition and computation of the norm is not straightforward, since we do not have a constant exponent for computing the (“mandatory”) radical:

‖f‖_{L^{p(·)}} = ( ∫_Ω |f(x)|^{p(x)} dx )^{1/???} .

The solution: compute first the modular (for 1 ≤ p(·) < +∞)

ρ(f) = ∫_Ω |f(x)|^{p(x)} dx ,

and then obtain the (so-called Luxemburg [1955]) norm by solving a 1D minimization problem

‖f‖_{L^{p(·)}} = inf { λ > 0 : ρ(f/λ) ≤ 1 } .
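In the discrete setting, the Luxemburg norm can be evaluated by bisection, since ρ(x/λ) is nonincreasing in λ. A minimal sketch (illustrative, with p a vector of pointwise exponents):

```python
import numpy as np

def modular(x, p):
    """Discrete modular rho(x) = sum_i |x_i|^{p_i} of the space l^{p(.)}."""
    return np.sum(np.abs(x) ** p)

def luxemburg_norm(x, p, tol=1e-10):
    """Luxemburg norm ||x|| = inf{ lam > 0 : rho(x/lam) <= 1 }, by bracketing and bisection."""
    if not np.any(x):
        return 0.0
    lo, hi = 1e-12, 1.0
    while modular(x / hi, p) > 1.0:     # grow the bracket until rho(x/hi) <= 1
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if modular(x / mid, p) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi
```

With a constant exponent vector p, luxemburg_norm(x, p) returns the usual p-norm of x.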

The elements of a variable exponent Lebesgue space

ρ(f) = ∫_Ω |f(x)|^{p(x)} dx ,      ‖f‖_{L^{p(·)}} = inf { λ > 0 : ρ(f/λ) ≤ 1 } .

The Lebesgue space

L^{p(·)}(Ω) = { f : Ω → R | ‖f‖_{L^{p(·)}} < ∞ }

is a Banach space.

In the case of a constant exponent p(x) = p, this norm is exactly the classical one ‖f‖_p. Indeed,

ρ(f/λ) = ∫_Ω ( |f(x)| / λ )^p dx = (1/λ^p) ∫_Ω |f(x)|^p dx = (1/λ^p) ‖f‖_p^p

and

inf { λ > 0 : (1/λ^p) ‖f‖_p^p ≤ 1 } = ‖f‖_p .

Modular VS Norm

In (classical) L^p, norm and modular are “the same” apart from a p-th root:

‖f‖_p < ∞ ⟺ ∫_Ω |f(x)|^p dx < ∞

In L^{p(·)}, norm and modular are really different: ρ(f) < ∞ implies ‖f‖_{p(·)} < ∞, but the converse implication does not hold. Indeed, the following holds:

‖f‖_{p(·)} < ∞ ⟺ there exists a λ > 0 s.t. ρ(f/λ) < ∞

(and notice that λ can be chosen large enough ...).

A simple example of a strange behavior

Ω = [1, ∞) ,   f(x) ≡ 1 ,   p(x) = x

ρ(f) = ∫_1^∞ 1^x dx = ∞    BUT    ‖f‖_{p(·)} ≈ 1.763

Indeed, ρ(f/λ) = ∫_1^∞ (1/λ)^x dx = 1/(λ log λ) < ∞, for any λ > 1.
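A quick numerical check of this example (the Luxemburg norm is the root of λ log λ = 1), just to verify the value 1.763:

```python
import numpy as np

# rho(f/lam) = 1/(lam * log(lam)) for lam > 1, so ||f|| solves lam * log(lam) = 1.
lo, hi = 1.0 + 1e-12, 3.0
for _ in range(60):                     # bisection on g(lam) = lam*log(lam) - 1
    mid = 0.5 * (lo + hi)
    if mid * np.log(mid) < 1.0:
        lo = mid
    else:
        hi = mid
print(round(hi, 3))                     # -> 1.763
```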

The vector case: the Lebesgue spaces of sequences ℓ^{p(·)}, with exponent sequence (p_n)_n

[Figure: unit circles of x = (x_1, x_2) in R^2 for the norms ‖x‖_p and ‖x‖_{p(·)}, with variable exponents p = (p_1, p_2).]

For ‖x‖_p there is inclusion of the unit balls if (p_1, p_2) ≥ (q_1, q_2) (as in the classical case); for ‖x‖_{p(·)} there is no inclusion in general.

Properties of variable exponent Lebesgue spaces Lp(·)

Let p_− = ess inf_Ω |p(x)| , and p_+ = ess sup_Ω |p(x)| .

If p_+ = ∞ , then L^{p(·)}(Ω) is a “bad” (although very interesting) Banach space, with poor geometric properties (i.e., it is not useful for our methods).

If 1 < p_− ≤ p_+ < ∞ , then L^{p(·)}(Ω) is a “good” Banach space, since many properties of classical Lebesgue spaces L^p still hold: L^{p(·)} is uniformly smooth, uniformly convex, and reflexive; we know the duality map of L^{p(·)}.

Its dual space is well defined, ( L^{p(·)} )^* ≅ L^{q(·)}, where 1/p(x) + 1/q(x) = 1, and

L_g(f) = ∫_Ω f(x) g(x) dx ∈ ( L^{p(·)} )^*   ⟺   g ∈ L^{q(·)}(Ω) .

Unfortunately, ( L^{p(·)} )^* and L^{q(·)} are isomorphic but not isometric:

‖L_g‖_{(L^{p(·)})^*} ≠ ‖g‖_{L^{q(·)}} .

We know the duality map of L^{p(·)}, but we do not know the duality map of ( L^{p(·)} )^*.

The duality map of the variable exponent Lebesgue space

By extending the duality maps, we can define in L^{p(·)} all the iterative methods developed in L^p (Landweber, steepest descent, CG, Mann iterations).

For any constant 1 < r < +∞, we recall that the duality map, that is, the (sub-)differential of the functional (1/r) ‖f‖_{L^p}^r in the classical Banach space L^p, with constant 1 < p < +∞, is defined as follows:

( J_r^{L^p}(f) )(x) = |f(x)|^{p−1} sgn(f(x)) / ‖f‖_p^{p−r} .

By generalizing a result of P. Matei [2012], we have that the corresponding duality map in a variable exponent Lebesgue space is defined as follows:

( J_r^{L^{p(·)}}(f) )(x) = [ ∫_Ω ( p(x) |f(x)|^{p(x)} / ‖f‖_{p(·)}^{p(x)} ) dx ]^{−1} · p(x) |f(x)|^{p(x)−1} sgn(f(x)) / ‖f‖_{p(·)}^{p(x)−r} ,

where any product and any ratio have to be considered pointwise.
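A discrete sketch of this duality map (the Luxemburg norm is obtained by the bisection of the earlier sketch; names and the helper are illustrative):

```python
import numpy as np

def luxemburg_norm(f, p, tol=1e-10):
    """Luxemburg norm of f in l^{p(.)} (bracketing plus bisection, as sketched earlier)."""
    if not np.any(f):
        return 0.0
    lo, hi = 1e-12, 1.0
    while np.sum(np.abs(f / hi) ** p) > 1.0:
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if np.sum(np.abs(f / mid) ** p) > 1.0 else (lo, mid)
    return hi

def duality_map_lpvar(f, p, r=2.0):
    """Discrete version of the duality map of l^{p(.)} above: all products and ratios are
    componentwise; the leading factor is the scalar normalization in square brackets."""
    nf = luxemburg_norm(f, p)
    if nf == 0:
        return np.zeros_like(f)
    c = 1.0 / np.sum(p * np.abs(f) ** p / nf ** p)
    return c * p * np.abs(f) ** (p - 1) * np.sign(f) / nf ** (p - r)
```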

The adaptive algorithm in variable exponent Lebesgue spaces

There is numerical evidence that, in L^p image deblurring,

• dealing with small 1 ≈ p << 2 improves sparsity and allows a better restoration of the edges of the image and of the zero-background,

• dealing with p ≈ 2 (even p > 2) allows a better restoration of the regions of pixels with the highest intensities.

The idea: use a version of the (re-)blurred data y, rescaled into [1, 2], as the distribution of the exponent p(·) for the variable exponent Lebesgue space L^{p(·)} in which the solution is computed. Example (linear interpolation):

p(·) = 1 + [ A^*y(·) − min(A^*y) ] / [ max(A^*y) − min(A^*y) ]

The Landweber (i.e., fixed point) iterative scheme in this L^{p(·)} Banach space can be turned into an adaptive iterative algorithm by recomputing, after each fixed number of iterations, the exponent distribution p_k(·) by means of the k-th restored image x_k (instead of the first re-blurred data A^*y), that is

p_k(·) = 1 + [ x_k(·) − min(x_k) ] / [ max(x_k) − min(x_k) ]
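A minimal sketch of the exponent update used in the adaptive scheme (pmin and pmax are illustrative parameters; with pmin = 1 and pmax = 2 this is exactly the linear interpolation formula above):

```python
import numpy as np

def exponent_map(v, pmin=1.0, pmax=2.0):
    """p(.) = pmin + (pmax - pmin) * (v - min v) / (max v - min v), computed pointwise."""
    v = np.asarray(v, dtype=float)
    span = v.max() - v.min()
    if span == 0:
        return np.full(v.shape, 0.5 * (pmin + pmax))
    return pmin + (pmax - pmin) * (v - v.min()) / span

# p = exponent_map(A.T @ y)   # initial exponent from the re-blurred data
# ... every m iterations:
# p = exponent_map(x_k)       # recompute from the current iterate x_k
```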

The conjugate gradient method in Lp(·) for image restoration

Let p(·) = 1 + [ A^*y(·) − min(A^*y) ] / [ max(A^*y) − min(A^*y) ]

φ*_0 = −A^* J_r^{L^{p(·)}} (Ax_0 − y)

For k = 0, 1, 2, ...

λ_k = arg min_α (1/r) ‖A(x_k + α φ_k) − y‖^r_{L^{p(·)}}

x*_{k+1} = x*_k + λ_k φ*_k ,    x_{k+1} = J_{s′}^{(L^{p(·)})^*} (x*_{k+1})

β_{k+1} = γ ‖Ax_{k+1} − y‖^r_{L^{p(·)}} / ‖Ax_k − y‖^r_{L^{p(·)}} ,   with γ < 1/2

φ*_{k+1} = −A^* J_r^{L^{p(·)}} (Ax_{k+1} − y) + β_{k+1} φ*_k

and recompute p(·) = 1 + [ x_k(·) − min(x_k) ] / [ max(x_k) − min(x_k) ] every m iterations by using the last iterate x_k.

[Proof of convergence of CG for Lp (constant) Lebesgue spaces in 2017].

Numerical results

True image Point Spread Function Blurred image (noise = 4.7%)

Special thanks to Brigida Bonino (Univ. Genova).

[Figure: true image; blurred image (noise = 4.7%); restorations with relative errors in parentheses: p = 2 (0.2692), p = 1.3 (0.2681), variable p = 1.3-1.6 (0.2473), variable p = 1.3-1.6 with irreg. (0.2307).]


A remote sensing application: spatial resolution enhancement of microwave radiometer data

It is an under-determined problem (with a special structured matrix), since we want to reconstruct the high frequency components reduced by the low pass filter of the radiometer.

Courtesy: monde-geospatial.com

Joint work with: Matteo Alparone, Flavia Lenti, Maurizio Migliaccio, Nando Nunziata (Dep. of Engineering, Univ. Napoli Parthenope)

[Figure: resolution-enhanced radiometer profiles for p = 2, p = 1.2, and variable p (pmin = 1.2, pmax = 2, lambda = 0.3, maxiter = 2000, eThr = 0.12, fKind = 1).]

[Figure: RECT and DOUBLE RECT test profiles, and the corresponding variable p reconstruction (pmin = 1.2, pmax = 2, lambda = 0.3, maxiter = 2000, eThr = 0.12, fKind = 1).]

[Figure: measured signal and resolution-enhanced TB37 reconstructions with constant p = 1.2 (xconst) and variable p (xvar).]

An inverse scattering application: microwave tomography

Microwave imaging is a noninvasive and nondestructive technique to inspect materials by using incident electromagnetic waves generated in the microwave range (300 MHz - 300 GHz). It is a nonlinear, implicit and ill-posed problem.

Joint work with: Alessandro Fedeli, Matteo Pastorino, Andrea Randazzo (Dep. of Engineering, Univ. Genoa)

[Figure: reconstructions in L^2, L^1.2, and L^{p(·)} (p ranging over 1.4-2) for targets with ε_r = 2, SNR = 15 dB, and ε_r = 3, SNR = 30 dB.]

Thank you for your attention

References

[1] F. Schopfer, A.K. Louis, T. Schuster, Nonlinear iterative methods for linear ill-posed problems in Banach spaces, Inverse Problems, vol. 22, pp. 311–329, 2006.

[2] O. Scherzer, M. Grasmair, H. Grossauer, M. Haltmeier, F. Lenzen, Variational Methods in Imaging, Springer, Berlin, 2008.

[3] T. Schuster, B. Kaltenbacher, B. Hofmann, K. S. Kazimierski, Regularization Methods in Banach Spaces, Radon Series on Computational and Applied Mathematics, vol. 10, De Gruyter, 2012.

[4] L. Diening, P. Harjulehto, P. Hasto, M. Ruzicka, Lebesgue and Sobolev Spaces with Variable Exponents, Lecture Notes in Mathematics, vol. 2017, Springer, 2011.

[5] C. Estatico, M. Pastorino, A. Randazzo, A novel microwave imaging approach based on regularization in Lp Banach spaces, IEEE Transactions on Antennas and Propagation, vol. 60, pp. 3373–3381, 2012.

[6] P. Brianzi, F. Di Benedetto, C. Estatico, Preconditioned iterative regularization in Banach spaces, Computational Optimization and Applications, vol. 54, pp. 263–282, 2013.

[7] F. Lenti, F. Nunziata, C. Estatico, M. Migliaccio, On the resolution enhancement of radiometer data in Banach spaces, IEEE Trans. Geoscience and Remote Sensing, vol. 52, pp. 1834–1842, 2014.

[8] P. Dell'Acqua, C. Estatico, Acceleration of multiplicative iterative algorithms for image deblurring by duality maps in Banach spaces, Applied Numerical Mathematics, vol. 99, pp. 121–136, 2016.

[9] C. Estatico, S. Gratton, F. Lenti, D. Titley-Peloquin, A conjugate gradient like method for p-norm minimization in functional spaces, Numerische Mathematik, vol. 137, pp. 895–922, 2017.

[10] P. Brianzi, F. Di Benedetto, C. Estatico, L. Surace, Irregularization accelerates iterative regularization, Calcolo, vol. 2/18, 2018.