On the trade-off between energy efficiency and estimation error in compressive sensing

Donglin Hu (a), Shiwen Mao (a,*), Nedret Billor (b), Prathima Agrawal (a)

(a) Department of Electrical and Computer Engineering, Auburn University, Auburn, AL, USA
(b) Department of Mathematics and Statistics, Auburn University, Auburn, AL, USA
(*) Corresponding author.
E-mail address: agrawpr@auburn.edu (P. Agrawal).

Ad Hoc Networks 11 (2013) 1848–1857. Journal homepage: www.elsevier.com/locate/adhoc
http://dx.doi.org/10.1016/j.adhoc.2013.04.008
1570-8705/$ - see front matter © 2013 Elsevier B.V. All rights reserved.

This paper was presented in part at IEEE GLOBECOM 2012, Anaheim, CA, USA, December 2012 [1].

Article history: Received 3 December 2012; received in revised form 20 March 2013; accepted 8 April 2013; available online 28 April 2013.

Keywords: Bayesian estimation; Cognitive radio; Compressive sensing; Isotonic regression; Polynomial regression; Spectrum sensing

Abstract

Compressive sensing (CS) refers to the process of reconstructing a signal that is supposed to be sparse or compressible. CS has wide applications, such as in cognitive radio networks. In this paper, we investigate effective CS schemes for the trade-off between energy efficiency and estimation error. We propose an enhancement to a Bayesian estimation approach and an enhancement to the isotonic regression approach that is based on nearly isotonic regression. We also show how to compute the routing matrix for selecting active sensor nodes. The proposed enhancements are evaluated with trace-driven simulations. Considerable gaps are observed between the original approaches and the proposed enhancements in the simulation results. The nearly isotonic regression method achieves the best performance among all the CS schemes examined in this paper.

1. Introduction

Compressive sensing (CS) (a.k.a. compressed sensing) refers to the process of reconstructing a signal that is supposed to be sparse or compressible [2]. It has found wide applications in communications and networking. For example, in cognitive radio networks, spectrum sensing is a critical component in dynamic spectrum access and enforcement of spectrum usage and sharing [3–7]. Given the wide range of activities in space, time, and frequency, it would be extremely challenging and costly to have a full range and dense sampling. In such situations, compressive sensing becomes a powerful tool for efficient spectrum sensing. Generally in a wireless sensor network (WSN), spatially distributed sensors are used to monitor physical or environmental conditions [8,9]. The sparse signals in wireless sensor networks are obtained by collecting readings from sensor nodes by a server (or, a data processing center) through wireless transmissions. The server will then use CS to process the sparse data to achieve certain design goals (e.g., recovering the missing data).

CS has received considerable interest from the wireless community recently. For example, several CS papers have focused on network optimization and scheduling, where two important factors, i.e., network performance and power consumption, are considered in the design of CS schemes. The objectives of the schemes presented in these papers are to prolong the lifetime of wireless sensor nodes, while meeting certain network performance requirements. Energy can be conserved by turning off some sensor nodes while keeping the rest active. Network performance can be measured in terms of surveillance quality such as the number of active nodes [10,11], network coverage [12], the minimum degree of connection [13,14], and surveillance delay [15,16]. The proposed schemes usually aim to achieve a trade-off between network performance and power consumption.



Fig. 1. The wireless sensor network model.


Other CS papers have laid emphasis on signal compression and reconstruction. These papers seek a trade-off between compression efficiency and reconstruction quality. For instance, the conventional CS approach is based on an orthonormal basis, an important example being the wavelet transform. The original sensor data is compressed and delivered [17]; therefore, fewer network resources are needed for transmitting the compressed data [18–20].

In this paper, we investigate effective CS schemes for addressing the fundamental trade-off between energy efficiency and estimation error. As discussed, a compressed signal can be obtained by collecting readings from a subset of the deployed sensor nodes, so that the rest of the nodes can be turned off to save energy. At the data processing center, the intact signal is reconstructed via signal estimation. Therefore, there is a fundamental trade-off between how many active nodes to choose (thus how much energy to spend) and the corresponding accuracy of prediction. So far, only a few papers have investigated the problem of joint network optimization and signal processing. In [21], sensor nodes are divided into subgroups and all the readings can be recovered from one of the subgroups using isotonic regression. The objective is to maximize the number of subgroups while keeping the estimation error below a tolerable threshold. In [22], nodes are grouped into pairs. In each pair, the node with lower battery capacity goes to sleep, while the node with higher battery capacity estimates the measurement of the sleeping node using linear regression. In [23,24], the authors investigate the problem of reconstructing a distributed signal through the collection of a small number of sensor readings, where CS is applied in conjunction with principal component analysis. The data of the entire network is recovered with Bayesian estimation.

In particular, we aim to develop effective CS schemes in WSNs and seek to balance the trade-off between energy saving and recovery accuracy, which are measured in terms of the number of active sensor nodes and the corresponding estimation error. We examine three classes of CS approaches from the related work discussed above. We first implement the Bayesian estimation approach presented in [23], and propose an enhancement to this approach by scaling the variance of the sensor readings. We next revisit the isotonic regression approach presented in [21], and propose an improved method based on nearly isotonic regression. Finally, we briefly introduce two polynomial regression approaches, i.e., linear regression (LR) and quadratic regression (QR), which are used as benchmarks in the performance evaluation. Bayesian estimation selects active nodes randomly, while isotonic regression chooses active nodes based on estimation errors, which results in better reconstruction quality. Unlike LR and QR, isotonic regression considers the distribution of readings for more accurate estimation. The proposed enhancements are evaluated with trace-driven simulations. We find considerable gaps between the original approaches and the proposed enhancements in the simulation results. We also find that the nearly isotonic regression method achieves the best performance among all the CS schemes examined in this paper.

The remainder of this paper is organized as follows. In Section 2, we describe the system model and assumptions. We introduce Masiero's method and its enhancement in Section 3. In Section 4, we discuss Koushanfar's method and the nearly isotonic regression based enhancement. In Section 5, we present two polynomial regression schemes. We evaluate the proposed enhancements with trace-driven simulations in Section 6. Section 7 concludes this paper.

2. System model and assumptions

The network model considered in this paper consists of autonomous sensor nodes that are deployed in an area to monitor physical conditions (e.g., spectrum availability) and a data processing center (or, a server), as illustrated in Fig. 1. We assume that the sensor nodes collect information data following a synchronized slot structure. Once a node obtains its data in a time slot, it transmits the data to the data processing center through its wireless interface in the same time slot.

We assume the process of sensing and transmission of one unit of sensor data consumes a certain amount of energy, denoted as $E$, which is a constant. If there are $N$ sensor nodes deployed, the total amount of energy consumed in each time slot is $N \cdot E$. It is easy to see that a convenient way to conserve energy for the WSN is to reduce the number of active sensor nodes. On the other hand, the data from the idle nodes has to be recovered through CS to ensure certain precision of detection. The estimation errors of the recovered data should meet the minimum precision requirement of the target sensing application. Therefore, there is a fundamental trade-off between the number of active nodes and estimation error.

In the following sections, we introduce three classes of CS approaches and propose enhanced algorithms. The first class is Bayesian Estimation approaches, which focus on energy saving by managing the number of active sensor nodes. Specifically, $L$ active sensor nodes will be chosen from the set of $N$ sensor nodes to collect sensing data in each time slot. The second class includes the Isotonic Regression approaches and the third class is Polynomial Regression approaches, which guarantee the quality of recovered data by controlling the estimation error. With these techniques, we choose as few active sensor nodes as possible, while keeping the estimation error below a prescribed threshold. The notations used in this paper and their definitions are summarized in Table 1.

3. Bayesian estimation approaches

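In Bayesian Estimation (BE), the parameter set $\theta$ is obtained by maximizing the posterior probability density function (pdf)

$$ p(\theta \mid \mathcal{D}, \mathcal{M}) = \frac{p(\mathcal{D} \mid \theta, \mathcal{M})\, p(\theta \mid \mathcal{M})}{p(\mathcal{D} \mid \mathcal{M})}, \tag{1} $$

where $\mathcal{D}$ is the data set and $\mathcal{M}$ the plausible model. Since $p(\mathcal{D} \mid \mathcal{M})$ does not include $\theta$, maximizing the posterior probability $p(\theta \mid \mathcal{D}, \mathcal{M})$ is equivalent to maximizing the product of $p(\mathcal{D} \mid \theta, \mathcal{M})$ and $p(\theta \mid \mathcal{M})$.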

In [24], Masiero et al. propose an approach in which Principal Component Analysis (PCA) and CS are jointly considered. In the following, we first introduce this method in Section 3.1 for the sake of completeness. We then propose an enhanced joint PCA and CS method in Section 3.2.

Table 1
Notation

Symbol                      Definition
$\mathcal{D}$               The data set
$\theta$                    The sensor parameter set
$\mathcal{M}$               The plausible model
$N$                         The total number of sensor nodes
$L$                         The number of active nodes
$K$                         The number of sampling times
$E$                         Energy consumption for sensing and transmission of one unit of data
$s_X$                       A sensor node $X$ in the network
$s_Y$                       A sensor node $Y$ in the network
$x_k$                       A reading from sensor $s_X$ at time $k$
$y_k$                       A reading from sensor $s_Y$ at time $k$
$\hat{y}_k$                 An estimate of $y_k$
$\mathbf{x}_k$              The readings from all sensor nodes at time $k$
$\hat{\mathbf{x}}_k$        The estimate of $\mathbf{x}_k$
$\mathbf{x}'_k$             The readings from active nodes at time $k$
$\mathbf{s}_k$              The principal component of $\mathbf{x}_k$
$s_k^i$                     The $i$th element of $\mathbf{s}_k$
$\mathbf{z}_k$              A known data set at time $k$
$\bar{\mathbf{x}}$          The sample mean vector of $\mathbf{x}_k$ over time $k$
$\beta$                     A weighted least-square fit to a vector $y$
$\hat{\beta}$               The optimal solution to (16)
$g$                         The weight vector
$\Sigma$                    The sample covariance matrix of $\mathbf{x}_k$ over time $k$
$\lambda_i$                 The $i$th eigenvalue of $\Sigma$
$U$                         Orthonormal matrix consisting of eigenvectors of $\Sigma$
$\mathcal{G}$               The Gaussian distribution assumption
$\mathcal{L}$               The Laplace distribution assumption
$\Phi$                      The routing matrix
$\epsilon_{X,Y}$            The estimation error of $s_Y$ based on the readings of $s_X$
$\mathbf{R}$                The relative importance matrix
$\mathbf{E}$                The error matrix
$\mathbf{C}$                The cumulative error matrix

3.1. Joint PCA and CS

In the network, the readings of sensor nodes, e.g., the illumination of an outdoor environment, are collected with a fixed sampling rate at discrete times $k = 1, 2, \ldots, K$. Let $\mathbf{x}_k \in \mathbb{R}^N$ be an $N \times 1$ vector that contains the measurements collected at time $k$. Then $\mathbf{x}_k$ can be considered as an $N$-dimensional stochastic process.

With PCA, the centered $\mathbf{x}_k$ can be projected to a new $N \times 1$ variable $\mathbf{s}_k$, which is written as

$$ \mathbf{s}_k = U^T (\mathbf{x}_k - \bar{\mathbf{x}}), \tag{2} $$

where $\bar{\mathbf{x}}$ is the mean of $\mathbf{x}_k$ and $U$ is an orthonormal matrix whose column vectors are the eigenvectors of the covariance matrix of $\mathbf{x}_k$. The eigenvectors are placed in decreasing order of the corresponding eigenvalues $\lambda_i$. The mean vector and covariance matrix can be replaced by the sample mean vector and the sample covariance matrix as

$$ \bar{\mathbf{x}} = \frac{1}{K} \sum_{k=1}^{K} \mathbf{x}_k, \tag{3} $$

$$ \Sigma = \frac{1}{K} \sum_{k=1}^{K} (\mathbf{x}_k - \bar{\mathbf{x}})(\mathbf{x}_k - \bar{\mathbf{x}})^T. \tag{4} $$

According to the properties of PCA, it can be shown that $s_k^i$, the $i$th element of $\mathbf{s}_k$, is uncorrelated with $s_k^j$. That is,

$$ E\left[ s_k^i s_k^j \right] = E\left[ s_k^i \right] E\left[ s_k^j \right] = 0, \quad \forall i \neq j. \tag{5} $$
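The projection in (2) to (5) can be illustrated with a small sketch. It is restricted to $N = 2$ sensors so that the eigen-decomposition of the sample covariance matrix has a closed form and no numerical library is needed; the function name `pca_2d` and the toy readings are assumptions made for illustration, not part of the original method.

```python
import math

def pca_2d(readings):
    """PCA for N = 2 sensors (illustrative restriction).

    readings: list of (x1, x2) pairs, one pair per time slot k.
    Returns (mean, eigenvalues, U, components), with the eigenvalues in
    decreasing order and the eigenvectors stored as the columns of U.
    """
    K = len(readings)
    # Sample mean vector, Eq. (3).
    mean = [sum(r[d] for r in readings) / K for d in range(2)]
    centered = [(r[0] - mean[0], r[1] - mean[1]) for r in readings]
    # Entries of the symmetric sample covariance matrix [[a, b], [b, d]], Eq. (4).
    a = sum(v[0] * v[0] for v in centered) / K
    b = sum(v[0] * v[1] for v in centered) / K
    d = sum(v[1] * v[1] for v in centered) / K
    # Closed-form eigenvalues of a symmetric 2 x 2 matrix.
    half_trace = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    lam = [half_trace + radius, half_trace - radius]
    # Eigenvector of the largest eigenvalue; (b, lam1 - a) solves (Sigma - lam1 I) u = 0.
    if abs(b) > 1e-12:
        u1 = [b, lam[0] - a]
    else:
        u1 = [1.0, 0.0] if a >= d else [0.0, 1.0]
    norm = math.hypot(u1[0], u1[1])
    u1 = [u1[0] / norm, u1[1] / norm]
    u2 = [-u1[1], u1[0]]  # unit vector orthogonal to u1
    U = [[u1[0], u2[0]], [u1[1], u2[1]]]
    # Projection s_k = U^T (x_k - mean), Eq. (2).
    comps = [(u1[0] * v[0] + u1[1] * v[1], u2[0] * v[0] + u2[1] * v[1])
             for v in centered]
    return mean, lam, U, comps

# Toy readings (hypothetical values).
data = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 4.0), (5.0, 7.0)]
mean, lam, U, s = pca_2d(data)
cross = sum(p * q for p, q in s) / len(s)  # sample E[s^1 s^2], expected near 0
```

On such data the cross term is zero up to rounding, in line with (5), and the sample variance of each component equals the corresponding eigenvalue.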

Furthermore, the elements of $\mathbf{s}_k$ are independent of each other if they follow the Gaussian distribution. The joint probability density function (pdf) of $\mathbf{s}_k$ can then be decomposed into the product of the individual pdfs of its elements. In CS, not all the elements of $\mathbf{x}_k$ can be obtained at time $k$ because a portion of the sensor nodes are turned off to reduce power consumption. Similarly, we only have partial information on $\mathbf{s}_k$, obtained from the projection of the available elements of the centered $\mathbf{x}_k$. Assume there are $L$ active nodes in the WSN. Let $\mathbf{x}'_k$ be an $L \times 1$ vector that collects measurements from the $L$ active nodes. The $L$ active nodes are selected randomly from the $N$ sensor nodes with an $L \times N$ routing matrix $\Phi$, which is a binary matrix with entries 0 and 1. Each row of $\Phi$ has a single 1 element, whose position corresponds to the index of the selected sensor node.

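The relationship between $\mathbf{x}'_k$ and $\mathbf{x}_k$ can be written as

$$ \mathbf{x}'_k = \Phi \mathbf{x}_k. \tag{6} $$

Let $\mathbf{z}_k = \mathbf{x}'_k - \Phi \bar{\mathbf{x}}$. Recall that $\mathbf{x}'_k$ is a vector containing the measurements from active nodes and $\bar{\mathbf{x}}$ is the sample mean vector. Both of them are available at time $k$. Thus, $\mathbf{z}_k$ is a known data set at time $k$. By combining Eqs. (2) and (6), $\mathbf{z}_k$ can also be obtained from $\mathbf{s}_k$, as

$$ \mathbf{z}_k = \Phi (\mathbf{x}_k - \bar{\mathbf{x}}) = \Phi U \mathbf{s}_k. \tag{7} $$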

The objective of this method is to estimate the parameter set $\mathbf{s}_k$ from the data set $\mathbf{z}_k$. The estimate of the parameter set $\mathbf{s}_k$ is denoted as $\hat{\mathbf{s}}_k$. Once $\hat{\mathbf{s}}_k$ is obtained, the intact signal $\mathbf{x}_k$ can be recovered from $\hat{\mathbf{s}}_k$ according to (2), as

$$ \hat{\mathbf{x}}_k = \bar{\mathbf{x}} + U \hat{\mathbf{s}}_k, \tag{8} $$

where $\hat{\mathbf{x}}_k$ is the estimate of $\mathbf{x}_k$. Two types of distribution assumptions of $\mathbf{s}_k$ were made in [23], i.e., Gaussian distribution and Laplace distribution. They are good models to represent the principal components of typical sensing data. Interested readers are referred to [23] for details.
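As a concrete picture of (6) and of the known data vector $\mathbf{z}_k$, the following sketch builds a random routing matrix as a list of rows and applies it to a reading vector. The helper names `random_routing_matrix` and `apply_routing`, the seed, and the numbers are hypothetical choices for illustration.

```python
import random

def random_routing_matrix(n_nodes, n_active, rng=None):
    """Pick n_active distinct nodes at random; return (Phi, chosen indices).

    Phi is an n_active x n_nodes binary matrix with a single 1 per row.
    """
    rng = rng or random.Random()
    chosen = sorted(rng.sample(range(n_nodes), n_active))
    phi = [[1 if col == node else 0 for col in range(n_nodes)] for node in chosen]
    return phi, chosen

def apply_routing(phi, x):
    """Compute Phi x, i.e. keep only the entries of x at the active nodes, Eq. (6)."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]

rng = random.Random(0)                        # fixed seed, only for repeatability
phi, chosen = random_routing_matrix(5, 3, rng)
x_k = [10.0, 11.0, 12.0, 13.0, 14.0]          # hypothetical readings of all N = 5 nodes
x_bar = [9.0, 9.5, 10.0, 10.5, 11.0]          # hypothetical sample mean vector
x_active = apply_routing(phi, x_k)            # x'_k
z_k = [a - b for a, b in zip(x_active, apply_routing(phi, x_bar))]  # z_k = x'_k - Phi x_bar
```

Because each row of the matrix contains a single 1, multiplying by it is equivalent to indexing the reading vector at the chosen nodes; the explicit matrix is kept here only to mirror the notation of the text.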

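3.1.1. Gaussian distribution

The authors first assume the elements of $\mathbf{s}_k$ follow a Gaussian distribution, denoted as $\mathcal{N}(\mu_0, \sigma_0^2)$. Since the $s_k^i$'s are uncorrelated, they are independent of each other due to the Gaussian assumption. Therefore $p(\theta \mid \mathcal{M})$, the prior probability, can be computed as:

$$ p(\theta \mid \mathcal{M}) = p(\mathbf{s}_k \mid \mathcal{N}) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma_0^2}} \exp\left\{ -\frac{(s_k^i - \mu_0)^2}{2\sigma_0^2} \right\} = \frac{1}{(2\pi\sigma_0^2)^{N/2}} \exp\left\{ -\frac{\sum_{i=1}^{N} (s_k^i - \mu_0)^2}{2\sigma_0^2} \right\}. \tag{9} $$

The mean $\mu_0$ and variance $\sigma_0^2$ of the Gaussian distribution can be estimated from the past $\mathbf{s}_l$ ($l < k$). According to (7), the relationship between $\mathbf{z}_k$ and $\mathbf{s}_k$ can be translated into a likelihood of the form:

$$ p(\mathcal{D} \mid \theta, \mathcal{M}) = p(\mathbf{z}_k \mid \mathbf{s}_k, \mathcal{N}) = \delta(\mathbf{z}_k, \Phi U \mathbf{s}_k), \tag{10} $$

where $\delta(x, y)$ is 1 if $x = y$ and 0 otherwise. As discussed, the parameter set $\mathbf{s}_k$ is estimated by maximizing the posterior probability density, which can be written as:

$$ \hat{\mathbf{s}}_k = \arg\max\, p(\theta \mid \mathcal{D}, \mathcal{M}) = \arg\max\, p(\theta \mid \mathcal{M})\, p(\mathcal{D} \mid \theta, \mathcal{M}) = \arg\max\, \frac{\delta(\mathbf{z}_k, \Phi U \mathbf{s}_k)}{(2\pi\sigma_0^2)^{N/2}} \exp\left\{ -\frac{\sum_{i=1}^{N} (s_k^i - \mu_0)^2}{2\sigma_0^2} \right\} = \arg\min \sum_{i=1}^{N} (s_k^i - \mu_0)^2, \quad \text{given that } \mathbf{z}_k = \Phi U \mathbf{s}_k. \tag{11} $$

Problem (11) is a typical convex problem with linear equality constraints. It can be solved with mathematical packages such as CVX [25,26].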
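For intuition, problem (11) also admits a closed-form solution. The derivation below is a sketch under two assumptions that go slightly beyond the text: the $L$ active nodes are distinct, so that $\Phi\Phi^T = I_L$, and $U$ is a square orthonormal matrix. The shorthand $A$ and $\mathbf{m}$ is introduced here only for compactness.

```latex
% Shorthand (introduced for this sketch): A = \Phi U, \mathbf{m} = \mu_0 \mathbf{1}.
\begin{align*}
  &\min_{\mathbf{s}_k} \; \|\mathbf{s}_k - \mathbf{m}\|_2^2
     \quad \text{subject to} \quad A \mathbf{s}_k = \mathbf{z}_k, \\
  &A A^T = \Phi U U^T \Phi^T = \Phi \Phi^T = I_L, \\
  &\hat{\mathbf{s}}_k = \mathbf{m} + A^T (A A^T)^{-1} (\mathbf{z}_k - A \mathbf{m})
     = \mathbf{m} + U^T \Phi^T (\mathbf{z}_k - \Phi U \mathbf{m}), \\
  &\hat{\mathbf{x}}_k = \bar{\mathbf{x}} + U \hat{\mathbf{s}}_k
     = \bar{\mathbf{x}} + U \mathbf{m} + \Phi^T (\mathbf{z}_k - \Phi U \mathbf{m}).
\end{align*}
```

Multiplying the last line by $\Phi$ gives $\Phi \hat{\mathbf{x}}_k = \Phi \bar{\mathbf{x}} + \mathbf{z}_k = \mathbf{x}'_k$, so under these assumptions the estimate reproduces the active readings exactly, while each inactive node receives the corresponding entry of $\bar{\mathbf{x}} + U \mathbf{m}$.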

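3.1.2. Laplace distribution

The authors also assume that the elements of $\mathbf{s}_k$ follow a Laplace distribution $\mathcal{L}(\mu_1, \sigma_1)$, where $\mu_1$ is a location parameter and $\sigma_1$ is a scale parameter. The Laplace distribution with zero mean is widely used in the literature [24] to statistically model sparse random vectors. Similarly, it follows that

$$ \hat{\mathbf{s}}_k = \arg\max\, p(\mathbf{s}_k \mid \mathbf{z}_k, \mathcal{L}) = \arg\max\, \frac{\delta(\mathbf{z}_k, \Phi U \mathbf{s}_k)}{(2\sigma_1^2)^{N/2}} \exp\left\{ -\frac{\sum_{i=1}^{N} |s_k^i - \mu_1|}{\sigma_1} \right\} = \arg\min \sum_{i=1}^{N} |s_k^i - \mu_1|, \quad \text{given that } \mathbf{z}_k = \Phi U \mathbf{s}_k, \tag{12} $$

where $\mu_1$ is estimated using the median of $\mathbf{s}_l$ ($l < k$). Once $\hat{\mathbf{s}}_k$ is obtained, $\hat{\mathbf{x}}_k$ can be recovered as shown in (8).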
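The $\ell_1$ objective in (12) is not differentiable, but it can be handed to any linear programming solver through a standard reformulation. The auxiliary variables $t_i$ below are introduced only for this sketch and do not appear in the original method.

```latex
\begin{align*}
  \min_{\mathbf{s}_k,\, \mathbf{t}} \quad & \sum_{i=1}^{N} t_i \\
  \text{subject to} \quad & -t_i \le s_k^i - \mu_1 \le t_i, \qquad i = 1, \ldots, N, \\
  & \Phi U \mathbf{s}_k = \mathbf{z}_k.
\end{align*}
```

At an optimum each $t_i$ equals $|s_k^i - \mu_1|$, so the linear program attains the same minimum as (12).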

3.2. Enhanced PCA-CS scheme

The major disadvantage of the above approach is that the difference in the variances of the principal components is not considered. It is assumed that the elements of $\mathbf{s}_k$ follow the same distribution with identical variance. However, we show in the following proposition that the above assumption does not generally hold true.

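Proposition 1. For any given pair of eigenvalues $\lambda_i > \lambda_j$, the corresponding variance of $s_k^i$ is greater than that of $s_k^j$, i.e., $E\left[(s_k^i)^2\right] > E\left[(s_k^j)^2\right]$ if $\lambda_i > \lambda_j$.

Proof. From (2), it follows that

$$ E\left[\mathbf{s}_k \mathbf{s}_k^T\right] = U^T E\left[(\mathbf{x}_k - \bar{\mathbf{x}})(\mathbf{x}_k - \bar{\mathbf{x}})^T\right] U = U^T \Sigma U = \mathrm{diag}(\lambda_1, \ldots, \lambda_N), \tag{13} $$

where $\mathrm{diag}(\lambda_1, \ldots, \lambda_N)$ is a diagonal matrix with diagonal elements $\lambda_1, \ldots, \lambda_N$. Then we have $E\left[(s_k^i)^2\right] = \lambda_i$ and the statement follows. □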

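To remove this generally unrealistic assumption, the main idea of our enhanced method is to scale $s_k^i$ according to its corresponding variance $\lambda_i$. We substitute $U$ with another matrix $V = U\Lambda$, where

$$ \Lambda = \mathrm{diag}\left(1/\sqrt{\lambda_1},\, 1/\sqrt{\lambda_2},\, \ldots,\, 1/\sqrt{\lambda_N}\right) $$

is a diagonal matrix. The columns of $V$ remain mutually orthogonal, although they are no longer of unit norm in general. Defining

$$ \mathbf{d}_k \overset{\mathrm{def}}{=} V^T (\mathbf{x}_k - \bar{\mathbf{x}}), \tag{14} $$

it follows that

$$ E\left[\mathbf{d}_k \mathbf{d}_k^T\right] = V^T E\left[(\mathbf{x}_k - \bar{\mathbf{x}})(\mathbf{x}_k - \bar{\mathbf{x}})^T\right] V = \Lambda^T U^T \Sigma U \Lambda = \Lambda^T \mathrm{diag}(\lambda_1, \ldots, \lambda_N) \Lambda = I, \tag{15} $$

where $I$ is the $N \times N$ identity matrix. In our proposed algorithm, we replace matrix $U$ with the new matrix $V$. As a result, the elements of the new projected vector $\mathbf{d}_k$ will have identical variance.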

The two distribution assumptions used in [23] (described in Sections 3.1.1 and 3.1.2) can also be adopted for the enhanced method. In both models, the elements are assumed to follow the same distribution. Since the elements of $\mathbf{d}_k$ now have identical variance, $\mathbf{d}_k$ fits the model assumption much better than $\mathbf{s}_k$ and, consequently, the accuracy of the recovered data will be improved, as will be shown in our simulation study.
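A small worked example may help to see the effect of the scaling in (14) and (15). The $2 \times 2$ covariance matrix below is a hypothetical choice made for illustration.

```latex
\begin{align*}
  \Sigma &= \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
  \lambda_1 = 3, \quad \lambda_2 = 1, \qquad
  U = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \\
  \Lambda &= \mathrm{diag}\left(1/\sqrt{3},\, 1\right), \qquad
  V = U \Lambda = \frac{1}{\sqrt{2}}
      \begin{pmatrix} 1/\sqrt{3} & 1 \\ 1/\sqrt{3} & -1 \end{pmatrix}, \\
  V^T \Sigma V &= \Lambda\, (U^T \Sigma U)\, \Lambda
      = \Lambda\, \mathrm{diag}(3, 1)\, \Lambda = I, \qquad
  V^T V = \Lambda^2 = \mathrm{diag}(1/3,\, 1).
\end{align*}
```

The unscaled components have variances 3 and 1, whereas both scaled components have unit variance, which is the property that the identical-variance prior relies on. The last identity shows that the columns of $V$ stay orthogonal while their norms are rescaled.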

4. Isotonic regression approaches

With Bayesian Estimation, we are able to manage $L$, the number of active nodes, and the percentage of power saved is computed as $(N - L)/N$. However, there is no consideration of the distance between the true measurements $\mathbf{x}_k$ and the recovered measurements $\hat{\mathbf{x}}_k$. In many applications, the quality of reconstruction is no less important than energy saving. In this section, we introduce Isotonic Regression (IR), which is able to address this problem.

Isotonic regression aims to find a weighted least-square fit $\beta \in \mathbb{R}^n$ to a vector $y \in \mathbb{R}^n$ with weight vector $g \in \mathbb{R}^n$, subject to a set of monotonicity constraints. The IR problem is defined as follows:

$$ \text{minimize:} \quad \sum_{i=1}^{n} g_i (y_i - \beta_i)^2 \tag{16} $$
$$ \text{subject to:} \quad \beta_1 \le \cdots \le \beta_n, $$

where $y_i$, $g_i$ and $\beta_i$ are the $i$th elements of vectors $y$, $g$ and $\beta$, respectively. The optimal solution $\hat{\beta}$ is a function of $y$.

This is motivated by the fact that the readings of neighboring sensor nodes are usually positively correlated. Suppose two sensors $s_X$ and $s_Y$ are exposed to the same stimulus source. Intuitively, increasing the stimulus value may result in higher readings at both sensors. It is thus possible to predict the reading of $s_Y$ from the reading of $s_X$, and vice versa.
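Problem (16) can be solved with the classical pool-adjacent-violators algorithm (PAVA). This is a standard technique and not the method of [21] discussed next; the sketch below is a minimal version that assumes strictly positive weights, and the function name is a hypothetical choice.

```python
def isotonic_regression(y, g=None):
    """Weighted isotonic least-squares fit of Eq. (16) via pool-adjacent-violators.

    y: observed values, ordered by the predictor.
    g: positive weights (defaults to all ones).
    Returns the fitted values beta with beta[0] <= ... <= beta[n-1].
    """
    weights = [1.0] * len(y) if g is None else list(g)
    blocks = []  # each block: [weighted mean, total weight, number of points]
    for value, weight in zip(y, weights):
        blocks.append([float(value), float(weight), 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            w = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / w, w, c1 + c2])
    beta = []
    for block_mean, _, count in blocks:
        beta.extend([block_mean] * count)
    return beta

print(isotonic_regression([1, 3, 2]))           # [1.0, 2.5, 2.5]
print(isotonic_regression([3, 1], g=[3, 1]))    # [2.5, 2.5]
```

Each pooled block is replaced by its weighted mean, which is the least-squares optimum for a group of points forced to share one fitted value.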

4.1. Combinatorial isotonic regression

In [21], the authors propose a method, named Combinatorial Isotonic Regression (CIR), for solving the isotonic regression problem. A four-step method is presented. The first two steps are used to convert isotonic regression into a combinatorial problem. Dynamic programming is then used to solve the combinatorial problem.

Denote the time series of readings from two sensors as $X = \{x_k\}$ and $Y = \{y_k\}$, $k = 1, 2, \ldots, K$. CIR sorts the values of $X$ and $Y$ in strictly increasing order and eliminates redundancies by grouping the identical values. We then obtain $x_{(1)} < \cdots < x_{(i)} < \cdots < x_{(n)}$ and $y_{(1)} < \cdots < y_{(j)} < \cdots < y_{(m)}$. The objective is to find a mapping $X \to Y$ by solving

$$ \text{minimize:} \quad \sum_{i=1}^{n} \sum_{j=1}^{m} g_{i,j} \left(y_{(j)} - \beta_i\right)^2 \tag{17} $$
$$ \text{subject to:} \quad \beta_1 \le \cdots \le \beta_n, $$

where $g_{i,j}$ is the number of $\{x_{(i)}, y_{(j)}\}$ pairs of readings from sensors $s_X$ and $s_Y$. The optimal solution $\hat{\beta}_i$ is an estimate of $y_k$ when the reading $x_k$ at time $k$ is equal to $x_{(i)}$. Note that the L-1 norm is used in [21] and their method also works in the case of the L-2 norm. To be consistent with the other methods, we adopt the L-2 norm in the remainder of this paper.

The CIR method consists of four steps, which are executed for each pair of sensor nodes. The algorithm is given below.

Step 1: Build the relative importance matrix $\mathbf{R}$, whose element $r_{i,j}$ captures the number of the pairs $\{x_{(i)}, y_{(j)}\}$.

Step 2: Build the error matrix $\mathbf{E}$, whose element $e_{i,j}$ describes the error that would be made if the prediction of $Y$ was chosen as $y_{(j)}$ while the reading $x_{(i)}$ was obtained from $s_X$. The error element $e_{i,j}$ is defined as follows:

$$ e_{i,j} = \sum_{l=1}^{m} r_{i,l} \left(y_{(l)} - y_{(j)}\right)^2 \quad \text{for } i = 1, \ldots, n, \; j = 1, \ldots, m, \tag{18} $$

where $r_{i,l}$ is the element of matrix $\mathbf{R}$ in the $i$th row and $l$th column. Recall that $n$ and $m$ are the number of grouped $x_k$ and the number of grouped $y_k$, respectively.

Step 3: Build a cumulative error matrix $\mathbf{C}$. Each element $c_{i,j}$ of the matrix $\mathbf{C}$ describes the accumulated errors from the left column to the right. CIR computes $c_{i,j}$ as follows:

$$ c_{i,j} = \begin{cases} e_{i,j}, & \text{if } j = 1, \\ e_{i,j} + \min\limits_{1 \le i' \le i} \{c_{i',j-1}\}, & \text{otherwise.} \end{cases} \tag{19} $$

For each value in $\mathbf{C}$, keep the index

$$ d_{i,j} = \arg\min_{1 \le i' \le i} \{c_{i',j-1}\}, $$

that tracks the minimum value in its previous column, which leads to the current estimate of the cumulative error.

Step 4: Once the cumulative matrix $\mathbf{C}$ is obtained, find the minimum value in the last column of $\mathbf{C}$ and the path leading to this value by backtracking the indices.

Steps 3 and 4 are presented in Algorithm 1. The complexity of this algorithm is $O(nm^2)$.
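The four steps can be sketched as a short dynamic program. This is an illustrative reading rather than a transcription of Algorithm 1, which is not reproduced in the text: the sketch runs the recursion over the grouped $x$ values and keeps a running minimum over the candidate $y$ levels, an orientation assumed here so that the fitted values are non-decreasing in $x$ as (17) requires. The function name `cir_fit` is hypothetical.

```python
def cir_fit(xs, ys):
    """Combinatorial isotonic fit: map each distinct x to one of the observed y levels.

    Returns a dict {x_level: predicted y}, non-decreasing in x and minimizing
    the total squared error among such mappings.
    """
    x_levels = sorted(set(xs))
    y_levels = sorted(set(ys))
    n, m = len(x_levels), len(y_levels)
    x_index = {v: i for i, v in enumerate(x_levels)}
    y_index = {v: j for j, v in enumerate(y_levels)}

    # Step 1: relative importance matrix R (pair counts).
    R = [[0] * m for _ in range(n)]
    for x, y in zip(xs, ys):
        R[x_index[x]][y_index[y]] += 1

    # Step 2: error matrix, E[i][j] = sum_l R[i][l] * (y_(l) - y_(j))^2.
    E = [[sum(R[i][l] * (y_levels[l] - y_levels[j]) ** 2 for l in range(m))
          for j in range(m)] for i in range(n)]

    # Step 3: cumulative errors. Assumed orientation: accumulate over x groups,
    # allowing the predicted level index to stay equal or increase.
    C = [E[0][:]] + [[0.0] * m for _ in range(n - 1)]
    back = [[0] * m for _ in range(n)]
    for i in range(1, n):
        best_j, best = 0, C[i - 1][0]
        for j in range(m):
            if C[i - 1][j] < best:
                best_j, best = j, C[i - 1][j]
            C[i][j] = E[i][j] + best
            back[i][j] = best_j

    # Step 4: pick the smallest final cumulative error and backtrack.
    j = min(range(m), key=lambda col: C[n - 1][col])
    prediction = [None] * n
    for i in range(n - 1, -1, -1):
        prediction[i] = y_levels[j]
        j = back[i][j]
    return dict(zip(x_levels, prediction))

print(cir_fit([1, 2, 3], [5, 3, 4]))  # {1: 4, 2: 4, 3: 4}
```

Building the error matrix dominates the cost at $O(nm^2)$, which matches the complexity stated above; the running minimum keeps the recursion itself at $O(nm)$.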

4.2. Nearly isotonic regression

CIR is based on dynamic programming to find the optimal value of the objective function [21]. However, in Step 2, it can be seen that the estimate of the reading from sensor $s_Y$ can only be selected from the set of $y_{(j)}$'s. To improve the CIR method, we propose a Nearly Isotonic Regression (NIR) based method in this section.

In particular, we adopt a method originally proposed for solving NIR problems in [27] to estimate the reading of $s_Y$ from that of $s_X$. Again, we group the identical values of $X$ and average the values of the readings at $s_Y$ when $X$ takes the value $x_{(i)}$. Denote $\bar{y}_i$ as the average reading at $s_Y$ when $X$ takes $x_{(i)}$. The problem can be formulated as:

$$ \text{minimize:} \quad \sum_{i=1}^{n} (\bar{y}_i - \beta_i)^2 $$
$$ \text{subject to:} \quad \beta_1 \le \cdots \le \beta_n. \tag{20} $$

The optimal solution $\hat{\beta}_i$ of this problem is an estimate of $y_k$ when the reading $x_k$ is $x_{(i)}$ at sensor $s_X$. After obtaining $\hat{\beta}_i$, the reading from sensor $s_X$ is used to estimate that of sensor $s_Y$. If the reading of $s_X$ is $x_{(i)}$, the estimate of the reading of $s_Y$ is $\hat{\beta}_i$.

In [27], Tibshirani et al. seek a nearly-monotone approximation and consider the following problem:

$$ \hat{\beta}_\lambda = \arg\min_{\beta} \left\{ \frac{1}{2} \sum_{i=1}^{n} (\bar{y}_i - \beta_i)^2 + \lambda \sum_{i=1}^{n-1} (\beta_i - \beta_{i+1})_+ \right\}, \tag{21} $$

where $x_+$ indicates the positive part of $x$, i.e., $x_+ = x \cdot 1[x > 0]$. This is a convex problem for each fixed $\lambda \ge 0$. We have Proposition 2 from [27] to solve this problem.

Proposition 2. Suppose that, for some $\lambda$, two adjacent coordinates of the solution satisfy $\hat{\beta}_{\lambda,i} = \hat{\beta}_{\lambda,i+1}$. Then we have $\hat{\beta}_{\lambda',i} = \hat{\beta}_{\lambda',i+1}$ for all $\lambda' \ge \lambda$ [27].

Proposition 2 states that the pieces in the solution can only be joined together, not split apart, as $\lambda$ increases. Consider a parameter value $\lambda$ with $K_\lambda$ joined pieces, which are represented by groups of coordinates $A_1, \ldots, A_{K_\lambda}$. Problem (21) can be rewritten as

$$ \frac{1}{2} \sum_{i=1}^{K_\lambda} \sum_{j \in A_i} \left(\bar{y}_j - \beta_{\lambda,A_i}\right)^2 + \lambda \sum_{i=1}^{K_\lambda - 1} \left(\beta_{\lambda,A_i} - \beta_{\lambda,A_{i+1}}\right)_+. \tag{22} $$

To find the optimal values $\hat{\beta}_{\lambda,A_i}$, we examine the subgradient of (22), as

$$ -\sum_{j \in A_i} \bar{y}_j + |A_i|\, \hat{\beta}_{\lambda,A_i} + \lambda (w_i - w_{i-1}) = 0 \quad \text{for } i = 1, \ldots, K_\lambda, \tag{23} $$

where $w_i = 1[\hat{\beta}_{\lambda,A_i} - \hat{\beta}_{\lambda,A_{i+1}} > 0]$ and $w_0 = w_{K_\lambda} = 0$. We then differentiate (23) with respect to $\lambda$ to get the slope $m_i$ as

$$ m_i = \frac{d \hat{\beta}_{\lambda,A_i}}{d\lambda} = \frac{w_{i-1} - w_i}{|A_i|}. \tag{24} $$

The slope $m_i$ is a constant, indicating that each $\hat{\beta}_{\lambda,A_i}$ is a linear function of $\lambda$.

Using this slope, it can be shown that groups $A_i$ and $A_{i+1}$ will merge at

$$ t_{i,i+1} = \frac{\beta_{i+1} - \beta_i}{m_i - m_{i+1}}, \quad \text{for } i = 1, \ldots, K_\lambda - 1, \tag{25} $$

where $\beta_i = \frac{\sum_{j \in A_i} \bar{y}_j}{|A_i|}$. We can move to the next value of $\lambda$

$$ \lambda^* = \min_{i:\, t_{i,i+1} > \lambda} \{t_{i,i+1}\}, \tag{26} $$

and merge groups $A_{i^*}$ and $A_{i^*+1}$, where

$$ i^* = \arg\min_{i:\, t_{i,i+1} > \lambda} \{t_{i,i+1}\}. \tag{27} $$

Note that the minimization problems (26) and (27) are only taken over the values of $t_{i,i+1}$ that are larger than $\lambda$. If none of the $t_{i,i+1}$'s are larger than $\lambda$, then none of the existing groups will merge and the algorithm terminates. The enhancement algorithm is presented in Algorithm 2. The worst case complexity of this algorithm is $O(n)$.
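The path-following procedure described by (23) to (27) can be sketched as follows. This is an illustrative implementation, not a transcription of Algorithm 2: it rescans all groups at every step, so it runs in $O(n^2)$ time rather than the complexity quoted above. The function name, the optional `lam_max` stopping value and the tolerance `tol` are assumptions made for this sketch.

```python
def nearly_isotonic(y_bar, lam_max=float("inf"), tol=1e-9):
    """Follow the solution path of (21) from lambda = 0 up to lam_max.

    y_bar: averaged readings, ordered by the grouped x values.
    With lam_max = inf the result is the isotonic solution of (20).
    """
    # Each group: [mean of y_bar over the group, group size, current fitted value].
    groups = [[float(v), 1, float(v)] for v in y_bar]
    lam = 0.0
    while True:
        # Join adjacent groups whose fitted values coincide (Proposition 2).
        i = 0
        while i < len(groups) - 1:
            a, b = groups[i], groups[i + 1]
            if abs(a[2] - b[2]) <= tol:
                size = a[1] + b[1]
                groups[i:i + 2] = [[(a[0] * a[1] + b[0] * b[1]) / size, size, a[2]]]
            else:
                i += 1
        k = len(groups)
        # Violation indicators w_0..w_k, with w_0 = w_k = 0, as below Eq. (23).
        w = [0] + [1 if groups[i][2] > groups[i + 1][2] else 0
                   for i in range(k - 1)] + [0]
        slopes = [(w[i] - w[i + 1]) / groups[i][1] for i in range(k)]  # Eq. (24)
        # Merge points of Eq. (25); keep only those beyond the current lambda.
        times = []
        for i in range(k - 1):
            denom = slopes[i] - slopes[i + 1]
            if abs(denom) > tol:
                t = (groups[i + 1][0] - groups[i][0]) / denom
                if t > lam + tol:
                    times.append(t)
        lam_next = min(times) if times else float("inf")  # Eq. (26)
        lam_stop = min(lam_next, lam_max)
        if lam_stop == float("inf"):
            break  # no violations remain and no finite target was requested
        for group, slope in zip(groups, slopes):
            group[2] = group[0] + lam_stop * slope  # fitted value is linear in lambda
        lam = lam_stop
        if lam_next > lam_max:
            break
    beta = []
    for _, size, value in groups:
        beta.extend([value] * size)
    return beta

print(nearly_isotonic([1, 3, 2]))                # [1.0, 2.5, 2.5]
print(nearly_isotonic([1, 3, 2], lam_max=0.25))  # [1.0, 2.75, 2.25]
```

Stopping at a finite `lam_max` returns a fit that still tolerates some violations of monotonicity, which is the sense in which the regression is nearly isotonic; letting the path run to the end gives a fully monotone fit whose values, unlike those of CIR, are not restricted to the observed $y_{(j)}$'s.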

4.3. Compute routing matrix

Recall that the routing matrix $\Phi$ in Bayesian Estimation is a binary matrix that randomly picks $L$ active nodes from the set of $N$ sensors. In Isotonic Regression, the routing matrix is not a random matrix; it chooses active nodes based on the estimation errors of each pair of sensor nodes.

Define $\epsilon_{X,Y}$ as the estimation error at sensor $s_Y$ when its readings are reconstructed from the readings of sensor $s_X$, as

$$ \epsilon_{X,Y} = \frac{1}{K} \sum_{k=1}^{K} \left(y_k - f(x_k)\right)^2 \quad \text{for } X \neq Y \text{ and } X, Y \in \{1, \ldots, N\}, \tag{28} $$

where $y_k$ is a reading from $s_Y$ and $f(x_k)$ is the estimate of $y_k$ based on the reading of $s_X$. We assume $\epsilon_{X,X} = 0$ for all nodes. For the $N$ sensor nodes, we can construct an $N \times N$ matrix $W$ whose elements are the estimation errors of each pair of sensors. We compare the elements of matrix $W$ with a preset threshold $\xi$ and choose the pairs whose corresponding estimation error is not greater than $\xi$.

The algorithm for finding the routing matrix $\Phi$ is presented in Algorithm 3. Define $\Omega$ as the set of all sensor nodes and $\Omega_1$ as the set of active sensor nodes. In the beginning of the algorithm, we let $\Omega = \{1, \ldots, N\}$ and $\Omega_1 = \emptyset$. In Lines 3–12, we find the minimum estimation error $\epsilon_{i^*,j^*}$ and compare it with the threshold $\xi$. If it is not greater than $\xi$, we remove $i^*$ from $\Omega$ and move $j^*$ from $\Omega$ to $\Omega_1$. These lines are executed until all nodes in $\Omega$ are removed. Once the algorithm is terminated, the set $\Omega_1$ is the set of nodes we would like to turn on. The nodes not in $\Omega_1$ are scheduled to be switched off. The complexity of this algorithm is $O(N^3)$.
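Since Algorithm 3 itself is not reproduced in the text, the following is only a greedy sketch in the spirit of the description, not the authors' exact procedure. It assumes `err[i][j]` holds the error of estimating node `j` from node `i`, keeps the predicting node active, switches the estimated node off, and activates every remaining node once no pair meets the threshold. All names are hypothetical.

```python
def select_active_nodes(err, threshold):
    """Greedy threshold-based selection of active nodes (illustrative sketch).

    err[i][j]: estimation error at node j when it is predicted from node i
               (an assumed orientation of the pairwise error matrix).
    Returns the sorted list of nodes to keep switched on.
    """
    n = len(err)
    remaining = set(range(n))  # nodes not yet decided
    active = set()             # nodes to keep switched on
    while remaining:
        best = None
        # A predictor may be undecided or already active; the estimated
        # node must still be undecided.
        for i in sorted(remaining | active):
            for j in sorted(remaining):
                if i != j and (best is None or err[i][j] < best[0]):
                    best = (err[i][j], i, j)
        if best is None or best[0] > threshold:
            active |= remaining  # nothing left can be estimated well enough
            break
        _, i, j = best
        active.add(i)            # predictor stays on
        remaining.discard(i)
        remaining.discard(j)     # estimated node is switched off
    return sorted(active)

def routing_matrix(active, n_nodes):
    """Binary matrix with one row per active node and a single 1 per row."""
    return [[1 if col == node else 0 for col in range(n_nodes)] for node in active]

# Hypothetical pairwise errors for N = 4 nodes.
err = [[0.0, 0.1, 5.0, 5.0],
       [0.2, 0.0, 5.0, 5.0],
       [5.0, 5.0, 0.0, 0.3],
       [5.0, 5.0, 0.4, 0.0]]
print(select_active_nodes(err, threshold=0.5))  # [0, 2]
```

Raising the threshold lets more nodes sleep at the price of a larger tolerated estimation error, which is the trade-off studied in the paper; each pass scans all pairs, so the sketch costs $O(N^3)$ in the worst case.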

Fig. 2. The scatter plot of the trace data (horizontal axis: time index $k$; vertical axis: measured value $x_k$).

Fig. 3. Comparison of the BE approaches (horizontal axis: number of active nodes; vertical axis: mean square error; curves: BEL, BEG, IBEL, IBEG).

5. Polynomial regression approaches

If we treat the readings from $s_X$ as predictor variables and those from $s_Y$ as response variables, it is natural to use polynomial regression to estimate the readings of $s_Y$ from those of $s_X$. In this section, we examine how to use Linear Regression (LR) and Quadratic Regression (QR) in CS.

5.1. Linear regression

As in [22], we also adopt the simple LR method in CS. The reading of sensor node $s_Y$ can be estimated from the reading of sensor node $s_X$ with an LR model as follows.

$$\hat{y}_k = a_0 + a_1 x_k, \quad \text{for } k = 1, \ldots, K, \tag{29}$$

where $\hat{y}_k$ is the estimate of $y_k$, and $a_0$ and $a_1$ are the intercept and slope of the linear regression model.

Once $\hat{y}_k$ is obtained, the estimation error $\epsilon_{X,Y}$ and the estimation error matrix $W$ can be computed as shown in Section 4.3. Similarly, the routing matrix for LR is obtained as described in Algorithm 3.
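As an illustration of how the LR coefficients and the resulting pairwise error might be computed, the following sketch assumes an ordinary least-squares fit; the paper does not state the fitting criterion in this section, and the function names are hypothetical.

```python
def fit_linear(x, y):
    """Ordinary least-squares intercept a0 and slope a1 for y ~ a0 + a1 * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xk - mean_x) * (yk - mean_y) for xk, yk in zip(x, y))
    sxx = sum((xk - mean_x) ** 2 for xk in x)  # zero if x is constant
    a1 = sxy / sxx
    a0 = mean_y - a1 * mean_x
    return a0, a1


def linear_estimation_error(x, y):
    # Pairwise error in the sense of Eq. (28), with f given by the fitted line.
    a0, a1 = fit_linear(x, y)
    return sum((yk - (a0 + a1 * xk)) ** 2 for xk, yk in zip(x, y)) / len(x)
```

For example, `fit_linear([1, 2, 3, 4], [3, 5, 7, 9])` yields an intercept of 1 and a slope of 2, and the corresponding estimation error is 0 because the data lie exactly on a line.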

5.2. Quadratic regression

To compare with the LR method, we also choose a QR model as follows.

$$\hat{y}_k = a_0 + a_1 x_k + a_2 (x_k)^2, \quad \text{for } k = 1, \ldots, K, \tag{30}$$

where $\hat{y}_k$ is the estimate of $y_k$ and $a_2$ is the coefficient of the second-order term of $x_k$. The estimation error and routing matrix are obtained as in the case of LR. The complexity of LR and QR is O(1) because the coefficients can be computed with explicit mathematical expressions.
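A quadratic fit can be sketched in the same way. The version below assumes a least-squares criterion and solves the 3 x 3 normal equations directly; the function name is hypothetical, and the sketch needs at least three distinct values of x.

```python
def fit_quadratic(x, y):
    """Least-squares a0, a1, a2 for y ~ a0 + a1 * x + a2 * x**2."""
    # Power sums S[p] = sum of x**p and moments T[p] = sum of y * x**p
    S = [sum(xk ** p for xk in x) for p in range(5)]
    T = [sum(yk * xk ** p for xk, yk in zip(x, y)) for p in range(3)]
    # Augmented normal equations: sum_j S[i + j] * a_j = T[i]
    A = [[S[i + j] for j in range(3)] + [T[i]] for i in range(3)]
    for col in range(3):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, 3):
            factor = A[r][col] / A[col][col]
            A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    coeffs = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        known = sum(A[i][j] * coeffs[j] for j in range(i + 1, 3))
        coeffs[i] = (A[i][3] - known) / A[i][i]
    return tuple(coeffs)
```

Because the linear model is the special case $a_2 = 0$, a least-squares quadratic fit never has a larger error than the linear fit on the same training window; whether that advantage carries over to later samples depends on the data.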

6. Performance evaluation

In this section, we evaluate the performance of all the methods with trace-driven simulations. The simulation code was written in MATLAB. We use the traces from the sensors deployed in an outdoor environment [28]. There are N = 16 sensors that measure the illumination in a field. The range of the measurements is from 0 to 255 and they are collected from time k = 1 to 150. The scatter plot of the measured data is presented in Fig. 2.

We define Mean Square Error (MSE) as:

$$\mathrm{MSE} = \frac{1}{K \times N} \sum_{k=1}^{K} \sum_{i=1}^{N} \left( x_k^i - \hat{x}_k^i \right)^2, \tag{31}$$

where $x_k^i$ is the reading at the $i$th sensor at time $k$ and $\hat{x}_k^i$ is the estimate of $x_k^i$. Note that $\hat{x}_k^i = x_k^i$ if the $i$th sensor is active. MSE is computed for $k = 51$ to 150. At each time $k$, the previous 50 data vectors are used for parameter estimation.

We first compare the performance of the BE approaches. We name the BE approaches with Gaussian and Laplace assumptions as BEG and BEL, respectively. Our proposed improvements are named IBEG and IBEL. In the simulation, we increase the number of active nodes from 5 to 9. Four curves of the BE approaches are plotted in Fig. 3. As expected, the more active nodes, the more accurate the prediction and the smaller the estimation error. We find that our proposed enhancements yield lower MSE than the original schemes. IBEL has the best performance among the four BE-based methods, indicating that the Laplace assumption is preferred in our proposed method.


Fig. 6. Comparison of three approaches. [Plot: Mean Square Error versus Number of Active Nodes for IBEL, NIR, and LR.]

Fig. 5. Comparison of the PR approaches. [Two plots for LR and QR: Mean Square Error versus Threshold of Estimation Error (ξ), and Number of Active Nodes versus Threshold of Estimation Error (ξ).]

Fig. 4. Comparison of the IR approaches. [Two plots for CIR and NIR: Mean Square Error versus Threshold of Estimation Error (ξ), and Number of Active Nodes versus Threshold of Estimation Error (ξ).]


Next, we investigate the impact of the estimation error threshold ξ on CIR and NIR. In Fig. 4, we increase ξ from 1 to 10 and plot MSE and the number of active nodes. Intuitively, a larger ξ means a larger tolerance on estimation error, allowing more nodes to be turned off to save more energy. It can be seen from the figure that the MSE of NIR is up to 0.13 lower than that of CIR. Furthermore, we find that NIR requires fewer active nodes than CIR. Therefore, NIR is superior to CIR with respect to both estimation error and energy savings. We thus choose NIR for comparison with the other methods in the following simulations.

In Fig. 5, we examine the impact of the estimation error threshold ξ on LR and QR. We increase ξ from 1 to 10 and plot MSE and the number of active nodes in the figure. We find that although the MSE of QR is lower than that of LR, QR requires more active nodes and thus consumes more power for computation and communication. Compared with QR, LR may be a sufficiently good choice for CS.

Finally, we choose one method from each of the three CS classes discussed above and compare their performance. Specifically, we choose IBEL, NIR and LR in this simulation. For each threshold of estimation error in Figs. 4 and 5, the Y-axis value is the corresponding MSE and the X-axis value is the corresponding number of active nodes. Three curves are plotted in Fig. 6 for the three selected schemes, respectively. Although IBEL requires as few as five active nodes, its MSE is very high. Among these three methods, NIR achieves the best performance because it has the smallest MSE and requires the minimum number of active nodes.

7. Conclusion

In this paper, we investigated three classes of CS approaches with a focus on balancing the trade-off between energy efficiency and estimation error in a WSN environment. We first introduced Bayesian Estimation based approaches and proposed an enhanced scheme by scaling the variance of projected random variables. We then adopted NIR to improve the performance of CIR. LR and QR approaches were also examined for comparison. We compared the three classes of CS approaches with trace-driven simulations. Our simulation results showed that NIR is the best choice among all the CS approaches considered in this paper.

Acknowledgments

This work is supported in part by the US National Science Foundation (NSF) under Grant CNS-0953513, and through the NSF Broadband Wireless Access and Applications (BWAC) Center site at Auburn University.

References

[1] D. Hu, S. Mao, N. Billor, On balancing energy efficiency and estimation error in compressed sensing, in: Proc. IEEE GLOBECOM 2012, Anaheim, CA, 2012, pp. 1–6.

[2] E.J. Candès, M.B. Wakin, An introduction to compressive sampling, IEEE Signal Process. Mag. 25 (2) (2008) 21–30.

[3] I.F. Akyildiz, B.F. Lo, R. Balakrishnan, Cooperative spectrum sensing in cognitive radio networks: a survey, Phys. Commun. 4 (1) (2011) 40–62.

[4] I.F. Akyildiz, W.-Y. Lee, M.C. Vuran, S. Mohanty, A survey on spectrum management in cognitive radio networks, IEEE Commun. Mag. 46 (4) (2008) 40–48.


[5] Q. Zhao, B. Sadler, A survey of dynamic spectrum access, IEEE Signal Process. Mag. 24 (3) (2007) 79–89.

[6] Y. Zhao, S. Mao, J.O. Neel, J.H. Reed, Performance evaluation of cognitive radios: metrics, utility functions, and methodology, Proceedings of the IEEE 97 (4) (2009) 642–659.

[7] D. Hu, S. Mao, Cooperative relay with interference alignment for video over cognitive radio networks, in: Proc. IEEE INFOCOM 2012, Orlando, FL, 2012, pp. 2014–2022.

[8] I.F. Akyildiz, W. Su, Y. Sankarasubramaniam, E. Cayirci, A survey on sensor networks, IEEE Commun. Mag. 40 (8) (2002) 102–114.

[9] I.F. Akyildiz, I.H. Kasimoglu, Wireless sensor and actor networks: research challenges, Elsevier Ad Hoc Networks Journal 2 (4) (2004) 351–367.

[10] F. Ye, G. Zhong, J. Cheng, S. Lu, L. Zhang, PEAS: a robust energy conserving protocol for long-lived sensor networks, in: Proc. 23rd Int. Distributed Computing Systems Conf., 2003, pp. 28–37, http://dx.doi.org/10.1109/ICDCS.2003.1203449.

[11] H. Ling, T. Znati, Energy efficient adaptive sensing for dynamic coverage in wireless sensor networks, in: Proc. IEEE WCNC’09, 2009, pp. 1–6, http://dx.doi.org/10.1109/WCNC.2009.4917708.

[12] C.-F. Hsin, M. Liu, Network coverage using low duty-cycled sensors: random and coordinated sleep algorithms, in: Proc. ACM IPSN’04, 2004, pp. 433–442, http://dx.doi.org/10.1109/IPSN.2004.1307365.

[13] X. Wang, G. Xing, Y. Zhang, C. Lu, R. Pless, C. Gill, Integrated coverage and connectivity configuration in wireless sensor networks, in: Proc. ACM SenSys’03, Los Angeles, CA, 2003, pp. 28–39.

[14] T. Yan, T. He, J.A. Stankovic, Differentiated surveillance for sensor networks, in: Proc. ACM SenSys’03, Los Angeles, CA, 2003, pp. 51–62.

[15] T. He, S. Krishnamurthy, J.A. Stankovic, T. Abdelzaher, L. Luo, R. Stoleru, T. Yan, L. Gu, J. Hui, B. Krogh, Energy-efficient surveillance system using wireless sensor networks, in: Proc. ACM MobiSys’04, Boston, MA, 2004, pp. 270–283.

[16] Q. Cao, T. Abdelzaher, T. He, J. Stankovic, Towards optimal sleep scheduling in sensor networks for rare-event detection, in: Proc. ACM IPSN’05, Los Angeles, CA, 2005, pp. 20–27, http://dx.doi.org/10.1109/IPSN.2005.1440887.

[17] P. Wang, R. Dai, I.F. Akyildiz, Collaborative data compression using clustered source coding for wireless multimedia sensor networks, in: Proc. IEEE INFOCOM 2010, San Diego, CA, 2010, pp. 1–9.

[18] D. Donoho, Compressed sensing, IEEE Trans. Inform. Theory 52 (4) (2006) 1289–1306.

[19] E. Candes, T. Tao, Near optimal signal recovery from random projections: universal encoding strategies?, IEEE Trans. Inform. Theory 52 (12) (2006) 5406–5425.

[20] E. Candes, J. Romberg, T. Tao, Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inform. Theory 52 (2) (2006) 489–509.

[21] F. Koushanfar, N. Taft, M. Potkonjak, Sleeping coordination for comprehensive sensing using isotonic regression and domatic partitions, in: Proc. IEEE INFOCOM’06, Barcelona, Spain, 2006, pp. 1–13, http://dx.doi.org/10.1109/INFOCOM.2006.276.

[22] G. Ollos, R. Vida, Adaptive regression algorithm for distributed dynamic clustering in wireless sensor networks, in: Proc. 2nd IFIP Wireless Days (WD), Paris, France, 2009, pp. 1–5, http://dx.doi.org/10.1109/WD.2009.5449687.

[23] R. Masiero, G. Quer, D. Munaretto, M. Rossi, J. Widmer, M. Zorzi, Data acquisition through joint compressive sensing and principal component analysis, in: Proc. IEEE Globecom 2009, Honolulu, HI, 2009.

[24] R. Masiero, G. Quer, M. Rossi, M. Zorzi, A Bayesian analysis of compressive sensing data recovery in wireless sensor networks, in: Proc. 2009 Int. Conf. Ultra Modern Telecommun. Workshops, St.-Petersburg, Russia, 2009, pp. 1–6.

[25] M. Grant, S. Boyd, Graph implementations for nonsmooth convex programs, in: V. Blondel, S. Boyd, H. Kimura (Eds.), Recent Advances in Learning and Control, Lecture Notes in Control and Information Sciences, Springer-Verlag Limited, 2008, pp. 95–110.

[26] M. Grant, S. Boyd, CVX: Matlab Software for Disciplined Convex Programming, Version 1.21, April 2011. <http://cvxr.com/cvx>.

[27] R. Tibshirani, H. Hoefling, R. Tibshirani, Nearly-isotonic regression, Technometrics 53 (1) (2010) 54–61.

[28] Wireless Sensor Network Illumination Data Trace. <http://lightning.med.monash.edu/Illumination.zip>.

Donglin Hu received the M.S. degree from Tsinghua University, Beijing, China, in 2007 and the B.S. degree from Nanjing University of Posts and Telecommunications, Nanjing, China, in 2004, both in electrical engineering. He received the M.S. degree in Probability and Statistics from Auburn University, Auburn, AL, in 2011, and the Ph.D. degree in Electrical and Computer Engineering from Auburn University in 2012. Currently he is a postdoctoral research fellow in the Department of Electrical and Computer Engineering at Auburn University. His research interests include cognitive radio networks, femtocell networks, network modeling, cross-layer design, performance analysis, and algorithm optimization for wireless networks and multimedia communications.

Shiwen Mao received the Ph.D. degree in electrical and computer engineering from Polytechnic University, Brooklyn, NY, USA (now Polytechnic Institute of New York University) in 2004. He was a research staff member with IBM China Research Lab from 1997 to 1998. He was a Postdoctoral Research Fellow/Research Scientist in the Bradley Department of Electrical and Computer Engineering at Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, VA, USA from 2003 to 2006. Currently, he is the McWane Associate Professor in the Department of Electrical and Computer Engineering, Auburn University, Auburn, AL, USA. His research interests include cross-layer optimization of wireless networks and multimedia communications, with current focus on cognitive radios, femtocell networks, 60 GHz mmWave networks, free space optical networks, and smart grid. He is on the Editorial Board of IEEE Transactions on Wireless Communications, IEEE Communications Surveys and Tutorials, Elsevier Ad Hoc Networks Journal, Wiley International Journal of Communication Systems, and ICST Transactions on Mobile Communications and Applications. He is the Director of E-Letter of the Multimedia Communications Technical Committee (MMTC), IEEE Communications Society for 2012–2014. He is a coauthor of TCP/IP Essentials: A Lab-Based Approach (Cambridge University Press, 2004). He was awarded the McWane Endowed Professorship in the Samuel Ginn College of Engineering for the Department of Electrical and Computer Engineering, Auburn University in August 2012. He received the US National Science Foundation Faculty Early Career Development Award (CAREER) in 2010. He is a co-recipient of The 2004 IEEE Communications Society Leonard G. Abraham Prize in the Field of Communications Systems and The Best Paper Runner-up Award at The Fifth International Conference on Heterogeneous Networking for Quality, Reliability, Security and Robustness (QShine) in 2008. He was named 2012 Exemplary Editor of IEEE Communications Surveys and Tutorials. He also received Auburn Alumni Council Research Awards for Excellence–Junior Award in 2011 and two Auburn Author Awards in 2011. Dr. Mao holds one US patent.

Nedret Billor received her Ph.D. from the University of Sheffield, UK. She is a professor in the Department of Mathematics and Statistics, Auburn University, Auburn, AL, USA. Her research interests include outlier detection in regression and multivariate data, and robust multivariate and functional data analysis. She has authored numerous publications in journals, which have garnered many citations. She is a fellow of the Royal Statistical Society and an elected member of the International Statistical Institute.


Prathima Agrawal is the Samuel Ginn distinguished professor of electrical and computer engineering and the director of the Wireless Engineering Research and Education Center, Auburn University, Auburn, Alabama. Prior to her present positions, she worked at Telcordia Technologies (formerly Bellcore), Morristown, New Jersey, and at AT&T/Lucent Bell Laboratories, Murray Hill, New Jersey. She created and served as head of the Networked Computing Research Department in Murray Hill. She is widely published and holds 51 US patents. She is a Fellow of IEEE. She received the BE and ME degrees in electrical communication engineering from the Indian Institute of Science, Bangalore, India, and the PhD degree in electrical engineering from the University of Southern California in 1977.