
GRADIENT PROJECTION FOR SPARSE RECONSTRUCTION:

APPLICATION TO COMPRESSED SENSING AND OTHER INVERSE PROBLEMS

MÁRIO A. T. FIGUEIREDO

ROBERT D. NOWAK

STEPHEN J. WRIGHT

BACKGROUND

PREVIOUS ALGORITHMS

Interior-point method

SparseLab: a Matlab software package designed to find sparse solutions to systems of linear equations

L1_ls: a Matlab implementation of the interior-point method for L1-regularized least squares

L1-MAGIC: a collection of MATLAB routines for solving the convex optimization programs central to compressive sampling

GRADIENT PROJECTION FOR SPARSE RECONSTRUCTION

Formulation
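
For reference, a reconstruction of the formulation following the notation of the original GPSR paper: the ℓ1-regularized least-squares problem and its bound-constrained quadratic reformulation are

\[
\min_{x}\ \tfrac{1}{2}\,\|y - A x\|_2^2 + \tau \|x\|_1 .
\]

Splitting $x = u - v$ with $u, v \ge 0$ and stacking $z = [u;\, v]$ gives the bound-constrained quadratic program

\[
\min_{z \ge 0}\ F(z) \equiv c^\top z + \tfrac{1}{2}\, z^\top B z,
\qquad
c = \tau \mathbf{1}_{2n} + \begin{bmatrix} -A^\top y \\ \phantom{-}A^\top y \end{bmatrix},
\qquad
B = \begin{bmatrix} A^\top A & -A^\top A \\ -A^\top A & A^\top A \end{bmatrix}.
\]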

GRADIENT DESCENT

Gradient descent is a first-order optimization algorithm. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.
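
As a minimal illustrative sketch (not code from the paper), a plain gradient-descent loop in Python might look like this; the step size and example function are arbitrary:

```python
import numpy as np

def gradient_descent(grad, x0, step=0.1, tol=1e-6, max_iter=1000):
    """Take steps proportional to the negative gradient until the
    gradient norm is small or the iteration budget is exhausted."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - step * g
    return x

# Example: minimize f(x) = ||x - 1||^2, whose gradient is 2*(x - 1).
x_min = gradient_descent(lambda x: 2.0 * (x - 1.0), np.zeros(3))
```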

BASIC GRADIENT PROJECTION

GPSR-BASIC
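
A minimal sketch of the underlying idea: project a gradient step onto the nonnegative orthant and backtrack until a sufficient-decrease condition holds. This is an illustration only, not the paper's exact GPSR-Basic step-selection rule:

```python
import numpy as np

def projected_gradient(F, grad, z0, alpha0=1.0, beta=0.5, mu=0.1,
                       max_iter=500, tol=1e-8):
    """Generic projected-gradient sketch for min F(z) subject to z >= 0."""
    z = np.maximum(np.asarray(z0, dtype=float), 0.0)
    for _ in range(max_iter):
        g = grad(z)
        alpha = alpha0
        while True:
            z_new = np.maximum(z - alpha * g, 0.0)   # project onto z >= 0
            # Armijo-style sufficient decrease along the projected step
            if F(z_new) <= F(z) + mu * g @ (z_new - z) or alpha < 1e-12:
                break
            alpha *= beta                            # backtrack
        if np.linalg.norm(z_new - z) < tol:
            return z_new
        z = z_new
    return z
```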

GPSR-BB

An approach due to Barzilai and Borwein (BB) in which F may not decrease at every iteration; a sketch of the BB step-length rule appears after the two variants below.

Non-monotone variant: take the BB step directly, eliminating the backtracking line search, so F may occasionally increase.

Monotone variant: additionally choose the step length along the BB direction so that F decreases at every iteration.
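
As a rough sketch of the BB step-length rule mentioned above (the generic BB1 formula with a simple safeguard, not the paper's exact variant):

```python
import numpy as np

def bb_step_length(s, y, alpha_min=1e-30, alpha_max=1e30):
    """Barzilai-Borwein (BB1) step length: alpha = (s^T s) / (s^T y),
    with s = z_k - z_{k-1} and y = grad F(z_k) - grad F(z_{k-1}).
    Clipped to [alpha_min, alpha_max] as a safeguard."""
    sy = float(s @ y)
    if sy <= 0.0:
        return alpha_max
    return float(np.clip((s @ s) / sy, alpha_min, alpha_max))
```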

CONVERGENCE

Theorem 1: The sequence of iterates {z^(k)} generated by either the GPSR-Basic or the GPSR-BB algorithm either terminates at a solution of the bound-constrained quadratic program above or converges to a solution at an R-linear rate.

T. Serafini, G. Zanghirati, L. Zanni. “Gradient projection methods for large quadratic programs and applications in training support vector machines,” Optimization Methods and Software, vol. 20, pp. 353–378, 2004.

TERMINATION

Several termination criteria are presented; all of these options perform well on the data sets considered.

The criterion the authors use in this paper is motivated by perturbation results for linear complementarity problems (LCP).
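
One common form of such an LCP-based check, shown here only as an assumed illustration rather than the paper's exact formula or tolerance:

```python
import numpy as np

def lcp_residual(z, grad_F_z):
    """Residual of the complementarity conditions for min F(z) s.t. z >= 0:
    at a solution, z >= 0, grad F(z) >= 0, and z_i * grad F(z)_i = 0,
    which is equivalent to min(z, grad F(z)) = 0 componentwise."""
    return np.linalg.norm(np.minimum(z, grad_F_z))

# Terminate when the residual drops below a small tolerance, e.g.
# when lcp_residual(z, grad_F(z)) <= tolP.
```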

DEBIASING

The ℓ1-regularized solution is biased: the selected coefficients are shrunk relative to the least-squares fit. We can therefore fix the zero components and solve a standard least-squares problem over the remaining nonzero components to obtain a debiased solution.
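
A minimal sketch of this debiasing step (illustration only; the paper solves the restricted least-squares problem iteratively rather than with a dense solver):

```python
import numpy as np

def debias(A, y, x_gpsr, thresh=0.0):
    """Fix the zero pattern found by GPSR and re-fit the nonzero
    coefficients by ordinary least squares on the selected columns."""
    support = np.abs(x_gpsr) > thresh              # nonzero components to keep
    x_debiased = np.zeros_like(x_gpsr, dtype=float)
    if support.any():
        sol, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x_debiased[support] = sol
    return x_debiased
```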

It is also worth pointing out that debiasing is not always desirable. Shrinking the selected coefficients can mitigate unusually large noise deviations, a desirable effect that may be undone by debiasing.

WARM STARTING AND CONTINUATION

After solving the problem with a given τ, we could use the solution to initialize GPSR for a nearby value of τ.

It has been noted that the speed of GPSR may degrade considerably for smaller values of the regularization parameter τ. However, if we first run GPSR with a larger value of τ and then decrease τ in steps toward its desired value, warm-starting each run from the previous solution, efficient performance is recovered.
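
A sketch of such a continuation loop; the `solve_gpsr` interface is a hypothetical placeholder for a GPSR solver:

```python
import numpy as np

def gpsr_continuation(solve_gpsr, A, y, tau_target, tau_factor=0.5):
    """Start from a large tau (for which the solution is easy) and decrease
    it geometrically toward tau_target, warm-starting each solve from the
    previous solution. `solve_gpsr(A, y, tau, x0)` is an assumed interface."""
    tau = 0.8 * np.linalg.norm(A.T @ y, np.inf)  # for tau >= ||A^T y||_inf the solution is x = 0
    x = np.zeros(A.shape[1])
    while tau > tau_target:
        x = solve_gpsr(A, y, tau, x0=x)          # warm start from previous solution
        tau = max(tau * tau_factor, tau_target)
    return solve_gpsr(A, y, tau_target, x0=x)    # final solve at the desired tau
```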

To benefit from a warm start, IP methods require the initial point to be not only close to the solution but also sufficiently interior to the feasible set and close to a “central path,” which is difficult to satisfy in practice.

EXPERIMENTS

CONCLUSIONS

Significantly faster than the state-of-the-art algorithms in experimental comparisons

Performance degrades when the regularization parameter τ is small, but the continuation heuristic can be used to recover efficient practical performance

It is not yet obvious why GPSR performs so well; this remains an open question