Distributed Lasso for In-Network Linear Regression
Transcript of Distributed Lasso for In-Network Linear Regression
Juan Andrés Bazerque, Gonzalo Mateos, and Georgios B. Giannakis
March 16, 2010
Acknowledgements: ARL/CTA grant DAAD19-01-2-0011, NSF grants CCF-0830480 and ECCS-0824007
Distributed sparse estimation

Data y_j acquired by J agents; agent j obeys the linear model y_j = X_j \beta + \epsilon_j with a sparse common parameter \beta

(P1)  \hat{\beta}_{\text{lasso}} = \arg\min_{\beta} \; \frac{1}{2}\sum_{j=1}^{J} \|y_j - X_j\beta\|_2^2 + \lambda\|\beta\|_1

Zou, H., "The Adaptive Lasso and its Oracle Properties," Journal of the American Statistical Association, vol. 101, no. 476, pp. 1418-1429, 2006.
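For concreteness, (P1) can be solved centrally with a few lines of proximal-gradient (ISTA) code. This is a minimal reference sketch, not the in-network algorithm the slides develop; function names are illustrative:

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding: the prox operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, iters=1000):
    """Centralized Lasso via proximal gradient (ISTA):
    minimizes (1/2)||y - X b||^2 + lam * ||b||_1."""
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        b = soft_threshold(b + X.T @ (y - X @ b) / L, lam / L)
    return b
```

A centralized solver like this serves as the benchmark that the distributed iterates of later slides are shown to match.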
Network structure

Centralized: fusion center
Decentralized: ad-hoc topology, favored for scalability, robustness, and lack of infrastructure

Problem statement: given data y_j and regression matrices X_j available locally at agents j = 1, …, J, solve (P1) using only local communications among neighbors (in-network processing)
Motivating application: spectrum cartography

Goal: find the PSD map across space and frequency
Specification: a coarse approximation suffices
Approach: basis expansion of the PSD
Scenario: wireless communications
[Figure: aggregate PSD measurements vs. frequency (MHz)]

J.-A. Bazerque and G. B. Giannakis, "Distributed Spectrum Sensing for Cognitive Radio Networks by Exploiting Sparsity," IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1847-1862, March 2010.
Modeling

[Figure: sources, sensing radios, frequency bases, and sensed frequencies]
Sparsity is present in both space and frequency
Space-frequency basis expansion

Each source PSD is expanded over known frequency bases b_\nu(f): \Phi_s(f) = \sum_{\nu} \beta_{s\nu} b_\nu(f)
Superimposed Tx spectra measured at sensing radio R_j: \Phi_j(f) = \sum_{s} \gamma_{js} \Phi_s(f), with \gamma_{js} the average path loss from source s to R_j
The measurements are thus linear in the coefficient vector \beta = [\beta_{s\nu}], so the sparse linear model of (P1) applies
Consensus-based optimization

Consider local copies \beta_j of \beta and enforce consensus among neighbors
Introduce auxiliary (bridge) variables \gamma_j^{j'} per link for decomposition

(P2)  \min_{\{\beta_j\}} \; \frac{1}{2}\sum_{j=1}^{J} \|y_j - X_j\beta_j\|_2^2 + \frac{\lambda}{J}\sum_{j=1}^{J} \|\beta_j\|_1
      s.t. \beta_j = \gamma_j^{j'}, \; \gamma_j^{j'} = \beta_{j'}, \; j' \in N_j, \; j = 1, …, J

For a connected network, (P1) is equivalent to (P2), which admits a distributed implementation
Towards closed-form iterates

Idea: reduce the per-agent \ell_1 subproblem to an orthogonal one, so that it is solved in closed form by soft-thresholding
Introduce additional splitting variables, yielding (P3)
AD-MoM 1st step: minimize the augmented Lagrangian w.r.t. the first block of primal variables
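The closed form that the extra splitting variables buy is the soft-thresholding operator, i.e., the prox of the \ell_1 norm. A minimal sketch with a brute-force sanity check (all names illustrative):

```python
import numpy as np

def soft_threshold(v, t):
    """Closed-form prox of t * ||.||_1: the unique minimizer over d of
    t*||d||_1 + (1/2)||d - v||^2, solved coordinate-wise."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Sanity check: brute-force scalar search over a fine grid
v, t = 1.7, 1.0
grid = np.linspace(-3.0, 3.0, 60001)          # step 1e-4
brute = grid[np.argmin(t * np.abs(grid) + 0.5 * (grid - v) ** 2)]
```

Because the prox is separable per coordinate, an orthogonal \ell_1 subproblem is solved exactly in one vectorized pass, which is what makes each AD-MoM iterate cheap.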
Alternating-direction method of multipliers

Augmented Lagrangian in the primal variables (local estimates, bridge variables, and splitting variables), with one multiplier per constraint
AD-MoM 2nd and 3rd steps: minimize the augmented Lagrangian w.r.t. the remaining primal blocks, each in turn
AD-MoM 4th step: update the multipliers via dual ascent

D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, 2nd ed., Athena Scientific, 1999.
D-Lasso algorithm

Agent j initializes its local variables and locally runs:
FOR k = 1, 2, …
   Exchange current local estimates with agents in the neighborhood N_j
   Update multipliers and local estimates
END FOR
The required N_j x N_j matrix inversion is performed once, offline
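The loop above can be sketched in simplified form. The code below is a textbook global-consensus ADMM Lasso, not the paper's exact single-hop iterates (the network-wide average stands in for the neighbor exchanges), but it shows the same ingredients: an offline per-agent matrix inversion, a soft-thresholding step, and multiplier updates. All names are illustrative:

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding: the prox operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def consensus_lasso(ys, Xs, lam, c=1.0, iters=500):
    """Global-consensus ADMM Lasso across J agents (illustrative sketch).

    Each agent j repeats a small linear solve whose matrix is inverted once
    ("offline"); the shared variable z is soft-thresholded; u holds the
    scaled multipliers.
    """
    J, p = len(ys), Xs[0].shape[1]
    # Offline: invert (Xj' Xj + c I) once per agent
    inv = [np.linalg.inv(X.T @ X + c * np.eye(p)) for X in Xs]
    rhs = [X.T @ y for X, y in zip(Xs, ys)]
    beta = np.zeros((J, p))
    u = np.zeros((J, p))
    z = np.zeros(p)
    for _ in range(iters):
        for j in range(J):                       # local quadratic solves
            beta[j] = inv[j] @ (rhs[j] + c * (z - u[j]))
        z = soft_threshold((beta + u).mean(axis=0), lam / (c * J))
        u += beta - z                            # multiplier (dual) update
    return z, beta
```

At convergence all local estimates beta[j] agree with z, mirroring the consensus guarantee stated on the next slide.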
D-Lasso: Convergence

Proposition: for any value of the constant step-size, the local estimates generated by D-Lasso converge, as k → ∞, to \hat{\beta}_{\text{lasso}}, the solution of (P1)

Attractive features:
Consensus achieved across the network
Affordable communication of sparse estimates with neighbors
Network-wide data percolates through local exchanges
Fully distributed numerical operation
Power spectrum cartography

[Figures: error evolution vs. iteration; aggregate spectrum map]
Setup: 5 sources, Ns = 121 candidate locations, J = 50 sensing radios, p = 969
D-Lasso localizes all sources through variable selection
Convergence to the centralized counterpart
Conclusions and future directions

Setting: sparse linear model with distributed data; Lasso estimator; ad-hoc network topology
D-Lasso: guaranteed convergence for any constant step-size; linear operations per iteration
Application: spectrum cartography; a map of interference across space and frequency; multi-source localization as a byproduct
Future directions: online distributed version; asynchronous updates

Thank You!

D. Angelosante, J.-A. Bazerque, and G. B. Giannakis, "Online Adaptive Estimation of Sparse Signals: Where RLS Meets the ℓ1-Norm," IEEE Transactions on Signal Processing, vol. 58, 2010 (to appear).
Leave-one-agent-out cross-validation

Agent j is set aside in round-robin fashion; the remaining agents estimate the model and agent j computes the prediction error on its own data
Repeat for λ = λ1, …, λN and select λmin to minimize the cross-validation error
Requires a sample mean to be computed in distributed fashion
[Figures: c-v error vs. λ; path of solutions]
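The round-robin procedure can be sketched as follows. The `fit` callback stands in for whatever estimator is run on the remaining agents (here it would be D-Lasso); all names are illustrative:

```python
import numpy as np

def loo_agent_cv(ys, Xs, lambdas, fit):
    """Leave-one-agent-out cross-validation sketch.

    Agent j is set aside in round-robin fashion; the remaining agents fit
    a model via `fit(ys, Xs, lam)`, and agent j scores the prediction
    error on its own data.  Returns the error-minimizing lambda.
    """
    J = len(ys)
    errs = []
    for lam in lambdas:
        e = 0.0
        for j in range(J):
            keep = [k for k in range(J) if k != j]
            b = fit([ys[k] for k in keep], [Xs[k] for k in keep], lam)
            e += np.mean((ys[j] - Xs[j] @ b) ** 2)   # held-out agent's error
        errs.append(e / J)
    return lambdas[int(np.argmin(errs))], errs
```

In the distributed setting the averaged error itself must be computed via local exchanges, which is the sample-mean step the slide refers to.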
Test case: prostate cancer antigen

67 patients organized into J = 7 groups; y_j(n) measures the level of antigen for patient n in group j
p = 8 factors: lcavol, lweight, age, lbph, svi, lcp, gleason, pgg45; the rows of X_j store the factors measured per patient

[Figure: Lasso vs. D-Lasso coefficient estimates]
Centralized and distributed solutions coincide
The volume of cancer (lcavol) predominantly affects the level of antigen
Distributed elastic net

Ridge regression: \hat{\beta}_{\text{ridge}} = \arg\min_{\beta} \sum_{j=1}^{J} \|y_j - X_j\beta\|_2^2 + \lambda_2\|\beta\|_2^2
Elastic net: \hat{\beta}_{\text{en}} = \arg\min_{\beta} \sum_{j=1}^{J} \|y_j - X_j\beta\|_2^2 + \lambda_1\|\beta\|_1 + \lambda_2\|\beta\|_2^2

H. Zou and H. H. Zhang, "On the Adaptive Elastic-Net with a Diverging Number of Parameters," Annals of Statistics, vol. 37, no. 4, pp. 1733-1751, 2009.
The quadratic term regularizes the solution; a centralized algorithm appears in [Zou-Zhang'09]
The elastic net achieves variable selection on ill-conditioned problems
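One standard way to see the connection to the Lasso machinery above: the elastic net equals a Lasso on augmented data (a well-known identity, not from the slides). A minimal sketch, names illustrative, including the grouping behavior on perfectly correlated columns that makes it useful for ill-conditioned problems:

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, iters=2000):
    # Proximal gradient for (1/2)||y - Xb||^2 + lam * ||b||_1
    L = np.linalg.norm(X, 2) ** 2
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        b = soft_threshold(b + X.T @ (y - X @ b) / L, lam / L)
    return b

def elastic_net(X, y, lam1, lam2):
    """Elastic net via data augmentation: the minimizer of
    (1/2)||y - Xb||^2 + lam1*||b||_1 + (lam2/2)*||b||^2
    equals the Lasso on the augmented pair (Xa, ya) below."""
    p = X.shape[1]
    Xa = np.vstack([X, np.sqrt(lam2) * np.eye(p)])
    ya = np.concatenate([y, np.zeros(p)])
    return lasso_ista(Xa, ya, lam1)
```

The augmentation adds one ridge-like row per coefficient, so any Lasso solver, centralized or distributed, can be reused unchanged; the ℓ2 term also makes the problem strictly convex, which is where the stabilization on ill-conditioned designs comes from.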