IEEE SIGNAL PROCESSING LETTERS, VOL. 21, NO. 2, FEBRUARY 2014 225

Sparsity-Aware Adaptive Algorithms Based on Alternating Optimization and Shrinkage

Rodrigo C. de Lamare, Senior Member, IEEE, and Raimundo Sampaio-Neto

Abstract—This letter proposes a novel sparsity-aware adaptive filtering scheme and algorithms based on an alternating optimization strategy with shrinkage. The proposed scheme employs a two-stage structure that consists of an alternating optimization of a diagonally-structured matrix that speeds up the convergence and an adaptive filter with a shrinkage function that forces the coefficients with small magnitudes to zero. We devise alternating optimization least-mean square (LMS) algorithms for the proposed scheme and analyze their mean-square error. Simulations for a system identification application show that the proposed scheme and algorithms outperform existing sparsity-aware algorithms in convergence and tracking.

Index Terms—Adaptive filters, iterative methods, sparse signal processing.

I. INTRODUCTION

IN THE last few years, there has been a growing interest in adaptive algorithms that can exploit the sparsity present in various signals and systems that arise in applications of adaptive signal processing [1]–[10]. The basic idea is to exploit prior knowledge about the sparsity present in the data that need to be processed for applications in system identification, communications and array signal processing. Several algorithms based on the least-mean square (LMS) [1], [2] and the recursive least-squares (RLS) [3], [4], [5], [6] techniques have been reported in the literature along with different penalty or shrinkage functions. These penalty functions perform a regularization that attracts to zero the coefficients of the adaptive filter that are not associated with the weights of interest. With this objective in mind, several penalty functions that account for the sparsity of the signal data have been considered, namely: an approximation of the l0-norm [1], [6], the l1-norm penalty [2], [5], and the log-sum penalty [2], [5], [8]. These algorithms solve problems with sparse features without relying on the computationally complex oracle algorithm, which requires an exhaustive search for the location of the non-zero coefficients of the system. However, the available algorithms in the literature also exhibit a performance degradation as compared to the oracle algorithm, which might affect the performance of some applications of adaptive algorithms.

Manuscript received November 19, 2013; accepted December 30, 2013. Date of publication January 09, 2014; date of current version January 14, 2014. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Eric Moreau.

R. C. de Lamare is with CETUC-PUC-Rio, 22453-900 Rio de Janeiro, Brazil, and also with the Communications Research Group, Department of Electronics, University of York, York Y010 5DD, U.K.

R. Sampaio-Neto is with CETUC/PUC-RIO, 22453-900 Rio de Janeiro, Brazil (e-mail: [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/LSP.2014.2298116

Motivated by the limitation of existing sparse adaptive techniques, we propose a novel sparsity-aware adaptive filtering scheme and algorithms based on an alternating optimization strategy with shrinkage. The proposed scheme employs a two-stage structure that consists of an alternating optimization of a diagonally-structured matrix that accelerates the convergence and an adaptive filter with a shrinkage function that attracts the coefficients with small magnitudes to zero. The diagonally-structured matrix aims to perform what the oracle algorithm does and helps to accelerate the convergence of the scheme and improve its steady-state performance. We devise sparsity-aware alternating optimization least-mean square (SA-ALT-LMS) algorithms for the proposed scheme and derive analytical formulas to predict their mean-square error (MSE) upon convergence. Simulations for a system identification application show that the proposed scheme and algorithms outperform the state-of-the-art sparsity-aware algorithms in convergence and tracking.

II. PROBLEM STATEMENT AND THE ORACLE ALGORITHM

In this section, we state the sparse system identification problem and describe the optimal strategy known as the oracle algorithm, which knows the positions of the non-zero coefficients of the sparse system.

A. Sparse System Identification Problem

In the sparse system identification problem of interest, the system observes a complex-valued signal represented by an N x 1 vector x(i) at time instant i, performs filtering and obtains the output d(i) = w_o^H x(i), where w_o is an N-length finite-impulse-response (FIR) filter that represents the actual system. For system identification, an adaptive filter w(i) with N coefficients is employed in such a way that it observes x(i) and produces an estimate d_hat(i) = w^H(i) x(i). The system identification scheme then compares the output of the actual system d(i) and the adaptive filter d_hat(i), resulting in an error signal e(i) = d(i) + n(i) - d_hat(i), where n(i) is the measurement noise. In this context, the goal of an adaptive algorithm is to identify the system by minimizing the MSE defined by

MSE = E[ |e(i)|^2 ]     (1)

A key problem in electronic measurement systems which are modeled by sparse adaptive filters, where the number of non-zero coefficients S is much smaller than the filter length N, is that most adaptive algorithms do not exploit their sparse structure to obtain performance benefits and/or a computational complexity reduction. If an adaptive algorithm can identify and exploit the non-zero coefficients of the system to be identified, then it can obtain performance improvements and a reduction in the computational complexity.
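To make the setup concrete, here is a minimal Python sketch of the sparse system identification model described above; the sizes N and S, the noise level, and the names (w_o, system_output) are illustrative assumptions rather than values from the letter.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 16          # filter length (illustrative value, not from the letter)
S = 2           # number of non-zero coefficients, S much smaller than N
sigma_n = 0.01  # measurement-noise standard deviation (assumed)

# Sparse "actual system" w_o: only S of its N taps are non-zero.
w_o = np.zeros(N, dtype=complex)
support = rng.choice(N, size=S, replace=False)
w_o[support] = rng.standard_normal(S) + 1j * rng.standard_normal(S)

def system_output(x):
    """d(i) = w_o^H x(i): output of the actual sparse FIR system."""
    return np.vdot(w_o, x)

# Error signal e(i) = d(i) + n(i) - w^H x(i) for a candidate filter w.
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
n = sigma_n * (rng.standard_normal() + 1j * rng.standard_normal())
w = w_o.copy()                 # perfect identification, for illustration
e = system_output(x) + n - np.vdot(w, x)
# With w = w_o, the error reduces to the measurement noise alone.
```

With a perfect estimate the residual error equals the measurement noise, which is the floor that any adaptive algorithm for this problem can hope to reach.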

1070-9908 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


Fig. 1. Proposed adaptive filtering scheme.

B. The Oracle Algorithm

The optimal algorithm for processing sparse signals and systems is known as the oracle algorithm. It can identify the positions of the non-zero coefficients and fully exploit the sparsity of the system under consideration. In the context of sparse system identification and other linear filtering problems, we can state the oracle algorithm as

{P_or, w_o} = arg min_{P, w} E[ |d(i) + n(i) - w^H P x(i)|^2 ]     (2)

where P is an N x N diagonal matrix with the actual positions of the non-zero coefficients. It turns out that the oracle algorithm requires an exhaustive search over all the possible positions of the non-zero coefficients, which is an NP-hard problem with extremely high complexity if N is large. Moreover, the oracle algorithm also requires the computation of the optimal filter, which is a continuous optimization problem. For these reasons, it is fundamental to devise low-complexity algorithms that can cost-effectively process sparse signals.
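The combinatorial nature of the oracle search can be sketched as follows, assuming the number of non-zero taps S is known; the batch least-squares fit per candidate support and all sizes are illustrative assumptions, not the letter's formulation.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
N, S, K = 8, 2, 200   # small sizes: the search visits C(N, S) supports

# A sparse system and K noisy input/output snapshots (illustrative).
w_o = np.zeros(N)
w_o[[1, 5]] = [0.8, -0.5]
X = rng.standard_normal((K, N))
d = X @ w_o + 0.01 * rng.standard_normal(K)

def oracle_search(X, d, S):
    """Exhaustive oracle: fit a least-squares filter on every size-S
    support and keep the support with the smallest residual. The cost
    grows combinatorially with N, which is why the oracle is
    impractical for large filters."""
    best_res, best_supp, best_w = np.inf, None, None
    for supp in itertools.combinations(range(X.shape[1]), S):
        cols = X[:, list(supp)]
        w, *_ = np.linalg.lstsq(cols, d, rcond=None)
        res = np.sum((d - cols @ w) ** 2)
        if res < best_res:
            best_res, best_supp, best_w = res, supp, w
    return best_supp, best_w

supp_hat, w_hat = oracle_search(X, d, S)
```

Even at these toy sizes the loop already visits 28 supports; for a filter of realistic length the count explodes, which motivates the low-complexity scheme of the next section.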

III. PROPOSED ALTERNATING OPTIMIZATION WITH SHRINKAGE SCHEME

In this section, we present an adaptive filtering scheme that employs an alternating optimization strategy with shrinkage that exploits the sparsity in the identification of linear systems. Unlike existing methods, the proposed technique introduces two adaptive filters that are optimized in an alternating fashion, as illustrated in Fig. 1. The first adaptive filter p(i) with N coefficients is applied as a diagonal matrix P(i) = diag(p(i)) to x(i) and performs the role of the oracle algorithm, which was defined as P_or in the previous section. The second adaptive filter w(i) with N coefficients is responsible for the system identification. Both w(i) and p(i) employ shrinkage techniques to attract to zero the coefficients that have small magnitudes. The output of the proposed adaptive filtering scheme is given by

d_hat(i) = w^H(i) P(i) x(i)     (3)

A. Adaptive Algorithms

In order to devise adaptive algorithms for this scheme, we need to cast an optimization problem with a cost function that depends on w(i), p(i) and a shrinkage function f(a), where f(a) represents this function applied to a generic parameter vector a with N coefficients. Let us consider the following cost function

C(w(i), p(i)) = E[ |d(i) + n(i) - w^H(i) P(i) x(i)|^2 ] + λ1 f(w(i)) + λ2 f(p(i))     (4)

where λ1 and λ2 are the regularization terms. In order to derive an adaptive algorithm to minimize the cost function in (4) and perform system identification, we employ an alternating optimization strategy. We compute the instantaneous gradient of (4) with respect to w(i) and p(i) and devise LMS-type algorithms:

w(i+1) = w(i) + μ_w e*(i) P(i) x(i) - μ_w λ1 ∂f(w(i))     (5)

p(i+1) = p(i) + μ_p e*(i) (w*(i) ⊙ x(i)) - μ_p λ2 ∂f(p(i))     (6)

where e(i) = d(i) + n(i) - w^H(i) P(i) x(i) is the error signal and μ_w and μ_p are the step sizes of the LMS recursions, which are used in an alternating way. In Table I, different shrinkage functions are shown with their partial derivatives and other features. A key requirement of the proposed scheme is the initialization, which results in the adjustment of p(i) to shrink the coefficients corresponding to zero elements of the system and of w(i) to estimate the non-zero coefficients. Specifically, p(0) is initialized as an all-one vector and w(0) is initialized as an all-zero vector. When p(i) is fixed, the scheme is equivalent to a standard shrinkage algorithm. The two-step approach outperforms the single-step method since p(i) strives to perform the role of the oracle algorithm (P_or) by decreasing the values of its entries in the positions of the zero coefficients. This helps the recursion that adapts w(i) to perform the estimation of the non-zero coefficients. This process is then alternated over the iterations, resulting in better performance. When P_or is employed, the scheme has the information about the actual positions of the zero coefficients.
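A minimal real-valued sketch of the alternating recursions with an l1-norm shrinkage (whose sub-gradient is the sign function) might look like the following; the step sizes, regularization weight, and problem sizes are assumed tuning values, not the letter's.

```python
import numpy as np

rng = np.random.default_rng(2)

N, S = 32, 4               # filter length and sparsity (illustrative)
mu_w, mu_p = 0.01, 0.01    # step sizes (assumed tuning values)
lam = 1e-3                 # regularization weight of the shrinkage

# Sparse system to identify.
w_o = np.zeros(N)
w_o[rng.choice(N, S, replace=False)] = rng.standard_normal(S)

# Initialization as described: p all-ones, w all-zeros.
p = np.ones(N)
w = np.zeros(N)

for i in range(5000):
    x = rng.standard_normal(N)
    d = w_o @ x + 0.01 * rng.standard_normal()   # d(i) + n(i)
    e = d - w @ (p * x)                          # two-stage error signal
    # Alternating LMS-type updates; np.sign is the sub-gradient of the
    # l1-norm, so each update combines a gradient step and shrinkage.
    w = w + mu_w * e * (p * x) - mu_w * lam * np.sign(w)
    p = p + mu_p * e * (w * x) - mu_p * lam * np.sign(p)

# The combined filter p * w should approach the sparse system w_o.
mse_taps = np.mean((p * w - w_o) ** 2)
```

Other shrinkage functions from Table I can be swapped in by replacing np.sign with the corresponding derivative.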

B. Computational Complexity

We detail the computational complexity in terms of arithmetic operations of the proposed and some existing algorithms. Specifically, we consider the conventional LMS algorithm, sparsity-aware LMS (SA-LMS) algorithms, and the proposed SA-ALT-LMS algorithm. The details are shown in Table II.

IV. MEAN-SQUARE ERROR ANALYSIS

In this section, we develop an MSE analysis of the proposed SA-ALT-LMS algorithm and devise analytical expressions to describe the transient and steady-state performances. By defining w_o as the optimal filter and p_or as the oracle vector (P_or = diag(p_or)) with the non-zero coefficients, we can write

d(i) = w_o^H P_or x(i)  and  d_hat(i) = w^H(i) P(i) x(i)     (7)

The error signal can then be rewritten as

e(i) = e_o(i) + ( w_o^H P_or - w^H(i) P(i) ) x(i)     (8)

DE LAMARE AND SAMPAIO-NETO: SPARSITY-AWARE ADAPTIVE ALGORITHMS 227

TABLE I
SHRINKAGE FUNCTIONS

TABLE II
COMPUTATIONAL COMPLEXITY OF ALGORITHMS

where e_o(i) = d(i) + n(i) - w_o^H diag(p_or) x(i) is the error signal of the optimal sparse filter. The MSE is written as

MSE = E[ |e(i)|^2 ]     (9)

Using the independence assumption between x(i), w(i) and p(i), the MSE can be expanded into terms that involve the diagonal matrices diag(p(i)) and diag(p_or):

(10)

The expectation of the scalar values that are functions of triple vector products can be rewritten [11] and the MSE expressed by

(11)

where ⊙ is the Hadamard product. Using (5) and (6) in (11), we obtain

(12)

(13)

where ∂f(·) denotes the partial derivative with respect to the variable of the argument. In Table I, we use a generic variable that plays the role of w(i) or p(i). We obtained approximations for E[∂f(·)], where f is a generic function, to compute the matrices in (12) and (13) for a given shrinkage function, as shown in the 3rd column of Table I.

To simplify the analysis, we assume that the samples of the input signal are uncorrelated, i.e., E[x(i) x^H(i)] = σ_x^2 I with σ_x^2 being the variance. Using the resulting diagonal matrices, we can write

(14)

(15)

Due to the structure of the above equations, the approximations and the quantities involved, we can decouple them into

(16)

(17)

where the decoupled quantities are the jth elements of the main diagonals of the matrices in (14) and (15), respectively. By taking the limits of (16) and (17) as the number of iterations grows, we obtain

(18)

(19)

For stability, the factors that multiply the recursions in (18) and (19) must have magnitude smaller than one, which results in bounds on the step sizes

(20)

The MSE is then given by

(21)


This MSE analysis is valid for uncorrelated input data, whereas a model for correlated input data remains an open problem, which is highly involved due to the triple products in (11). However, the SA-ALT-LMS algorithms work very well for both correlated and uncorrelated input data.

V. SIMULATIONS

In this section, we assess the performance of the existing LMS, SA-LMS, and the proposed SA-ALT-LMS algorithms with different shrinkage functions. The shrinkage functions considered are the ones shown in Table I, which give rise to the SA-LMS with the l1-norm [2], the SA-LMS with the log-sum penalty [2], [5], [8] and the l0-norm approximation [1], [6]. We consider system identification examples with both time-invariant and time-varying parameters in which there is a sparse system with a significant number of zeros to be identified. The input signal x(i) and the noise n(i) are drawn from independent and identically distributed complex Gaussian random variables with zero mean and variances σ_x^2 and σ_n^2, respectively, resulting in a signal-to-noise ratio (SNR) given by SNR = σ_x^2 / σ_n^2. The filters are initialized as w(0) = 0 and p(0) = 1. In the first experiment, there are N coefficients in a time-invariant system, only S coefficients are non-zero when the algorithms start, and the input signal is applied to a first-order autoregressive filter, which results in correlated samples that are normalized. After 1000 iterations, the sparse system is suddenly changed to a system with N coefficients in which S coefficients are non-zero. The positions of the non-zero coefficients are chosen randomly for each independent simulation trial. The curves are averaged over 200 independent trials and the parameters are optimized for each example. We consider the log-sum penalty [2], [5], [8] and the l0-norm approximation [1], [6] because they have shown the best performances. The results of the first experiment are shown in Fig. 2,

where the existing LMS and SA-LMS algorithms are compared with the proposed SA-ALT-LMS algorithm. The curves show that the MSE performance of the proposed SA-ALT-LMS algorithms is significantly superior to that of the existing LMS and SA-LMS algorithms for the identification of sparse systems. The SA-ALT-LMS algorithms can approach the performance of the Oracle-LMS algorithm, which has full knowledge about the positions of the non-zero coefficients. A performance close to the Oracle-LMS algorithm was verified for various situations of interest, including different values of SNR, degrees of sparsity, and both small and large sparse systems. In a second experiment, we have assessed the validity of the MSE analysis and the formulas obtained to predict the MSE as indicated in (21) and in Table I for uncorrelated input data. In the evaluation of (18) and (19), we made approximations for the expectations of the shrinkage derivatives. We have considered a scenario where the input signal and the observed noise are white Gaussian random sequences with variance 1 and σ_n^2, respectively. There are N coefficients in a time-invariant system that are randomly generated and only S coefficients are non-zero. The positions of the non-zero coefficients are again chosen randomly for each independent

Fig. 2. MSE performance against the number of iterations for correlated input data.

Fig. 3. MSE performance against the step size for uncorrelated input data.

simulation trial. The curves are averaged over 200 independent trials and the algorithms operate for 1000 iterations in order to ensure their convergence. We have compared the simulated and analytical curves obtained for the SA-ALT-LMS strategy using the l1-norm [2], the log-sum penalty [2], [5], [8] and the l0-norm approximation [1], [6]. The results in Fig. 3 indicate that there is a close match between the simulated and the analytical curves for the shrinkage functions employed, suggesting that the formulas obtained and the simplifications made are valid and result in accurate methods to predict the MSE performance of the proposed SA-ALT-LMS algorithms.
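The correlated input used in the first experiment, a first-order autoregressive process that is then normalized, can be generated as in this sketch; the AR coefficient rho = 0.8 is an assumed value, since the letter's parameter is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(3)

rho = 0.8   # AR(1) coefficient: an assumed value for illustration
K = 10000   # number of samples

# White Gaussian noise driving a first-order autoregressive filter,
# followed by normalization to unit power, as in the experiment.
u = rng.standard_normal(K)
x = np.empty(K)
x[0] = u[0]
for i in range(1, K):
    x[i] = rho * x[i - 1] + u[i]
x /= x.std()

# The lag-1 sample autocorrelation of the normalized process is close
# to rho, confirming that the input samples are indeed correlated.
r1 = np.mean(x[1:] * x[:-1])
```

Such correlated inputs slow down LMS-type convergence, which is why the correlated case is the more demanding of the two experiments.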

VI. CONCLUSION

We have proposed a novel sparsity-aware adaptive filtering scheme and algorithms based on an alternating optimization strategy that is general and can operate with different shrinkage functions. We have devised alternating optimization LMS algorithms, termed SA-ALT-LMS, for the proposed scheme and developed an MSE analysis, which resulted in analytical formulas that can predict the performance of the SA-ALT-LMS algorithms. Simulations for a system identification application show that the proposed scheme and SA-ALT-LMS algorithms outperform existing sparsity-aware algorithms.


REFERENCES

[1] Y. Gu, J. Jin, and S. Mei, "l0 norm constraint LMS algorithm for sparse system identification," IEEE Signal Process. Lett., vol. 16, pp. 774–777, 2009.

[2] Y. Chen, Y. Gu, and A. O. Hero, "Sparse LMS for system identification," in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, Apr. 19–24, 2009, pp. 3125–3128.

[3] B. Babadi, N. Kalouptsidis, and V. Tarokh, "SPARLS: The sparse RLS algorithm," IEEE Trans. Signal Process., vol. 58, no. 8, pp. 4013–4025, 2010.

[4] D. Angelosante, J. A. Bazerque, and G. B. Giannakis, "Online adaptive estimation of sparse signals: Where RLS meets the l1-norm," IEEE Trans. Signal Process., vol. 58, no. 7, pp. 3436–3447, 2010.

[5] E. M. Eksioglu, "Sparsity regularized RLS adaptive filtering," IET Signal Process., vol. 5, no. 5, pp. 480–487, Aug. 2011.

[6] E. M. Eksioglu and A. L. Tanc, "RLS algorithm with convex regularization," IEEE Signal Process. Lett., vol. 18, no. 8, pp. 470–473, Aug. 2011.

[7] N. Kalouptsidis, G. Mileounis, B. Babadi, and V. Tarokh, "Adaptive algorithms for sparse system identification," Signal Process., vol. 91, no. 8, pp. 1910–1919, Aug. 2011.

[8] E. J. Candes, M. Wakin, and S. Boyd, "Enhancing sparsity by reweighted l1 minimization," J. Fourier Anal. Applicat., 2008.

[9] R. C. de Lamare and R. Sampaio-Neto, "Adaptive reduced-rank MMSE filtering with interpolated FIR filters and adaptive interpolators," IEEE Signal Process. Lett., vol. 12, no. 3, Mar. 2005.

[10] R. C. de Lamare and R. Sampaio-Neto, "Adaptive reduced-rank processing based on joint and iterative interpolation, decimation, and filtering," IEEE Trans. Signal Process., vol. 57, no. 7, pp. 2503–2514, Jul. 2009.

[11] S. Haykin, Adaptive Filter Theory, 4th ed. Upper Saddle River, NJ, USA: Prentice-Hall, 2002.