A privacy preserving-location_monitoring_system_for_wireless_sensor_networks

1

A Privacy-Preserving Location MonitoringSystem for Wireless Sensor Networks

Chi-Yin Chow, Student Member, IEEE, Mohamed F. Mokbel, Member, IEEE, and Tian He, Member, IEEE

Abstract—Monitoring personal locations with a potentially untrusted server poses privacy threats to the monitored individuals. Tothis end, we propose a privacy-preserving location monitoring system for wireless sensor networks. In our system, we design two in-network location anonymization algorithms, namely, resource- and quality-aware algorithms, that aim to enable the system to providehigh quality location monitoring services for system users, while preserving personal location privacy. Both algorithms rely on the wellestablished k-anonymity privacy concept, that is, a person is indistinguishable among k persons, to enable trusted sensor nodes toprovide the aggregate location information of monitored persons for our system. Each aggregate location is in a form of a monitoredarea A along with the number of monitored persons residing in A, where A contains at least k persons. The resource-aware algorithmaims to minimize communication and computational cost, while the quality-aware algorithm aims to maximize the accuracy of theaggregate locations by minimizing their monitored areas. To utilize the aggregate location information to provide location monitoringservices, we use a spatial histogram approach that estimates the distribution of the monitored persons based on the gathered aggregatelocation information. Then the estimated distribution is used to provide location monitoring services through answering range queries.We evaluate our system through simulated experiments. The results show that our system provides high quality location monitoringservices for system users and guarantees the location privacy of the monitored persons.

Index Terms—Location privacy, wireless sensor networks, location monitoring system, aggregate query processing, spatial histogram

F

1 INTRODUCTION

The advance in wireless sensor technologies has resultedin many new applications for military and/or civilianpurposes. Many cases of these applications rely on theinformation of personal locations, for example, surveil-lance and location systems. These location-dependentsystems are realized by using either identity sensorsor counting sensors. For identity sensors, for example,Bat [1] and Cricket [2], each individual has to carrya signal sender/receiver unit with a globally uniqueidentifier. With identity sensors, the system can pinpointthe exact location of each monitored person. On theother hand, counting sensors, for example, photoelectricsensors [3], [4], and thermal sensors [5], are deployedto report the number of persons located in their sensingareas to a server.

Unfortunately, monitoring personal locations with apotentially untrusted system poses privacy threats tothe monitored individuals, because an adversary couldabuse the location information gathered by the system toinfer personal sensitive information [2], [6], [7], [8]. Forthe location monitoring system using identity sensors,the sensor nodes report the exact location information ofthe monitored persons to the server; thus using identitysensors immediately poses a major privacy breach. To

• Chi-Yin Chow, Mohamed F. Mokbel, and Tian He are with the Departmentof Computer Science and Engineering, University of Minnesota, Min-neapolis, MN 55455, USA, email: {cchow, mokbel, tianhe}@cs.umn.edu.

• This research was partially supported by NSF Grants IIS-0811998, IIS-0811935, CNS-0708604, CNS-0720465, and CNS-0626614, and a Mi-crosoft Research Gift.

tackle such a privacy breach, the concept of aggregatelocation information, that is, a collection of location datarelating to a group or category of persons from whichindividual identities have been removed [8], [9], has beensuggested as an effective approach to preserve locationprivacy [6], [8], [9]. Although the counting sensors by na-ture provide aggregate location information, they wouldalso pose privacy breaches.

Figure 1 gives an example of a privacy breach in alocation monitoring system with counting sensors. Thereare 11 counting sensor nodes installed in nine rooms R1

to R9, and two hallways C1 and C2 (Figure 1a). The non-zero number of persons detected by each sensor node isdepicted as a number in parentheses. Figures 1b and 1c

��

��

��

��

��

��

��

��

��

��

��

�

�

��

��

(a) At time ti

��

��

��

��

��

��

��

��

��

��

�

�

��

��

(b) At time ti+1

��

��

��

��

��

��

��

��

��

��

�

�

��

��

(c) At time ti+2

Fig. 1: A location monitoring system using countingsensors.

2

give the numbers reported by the same set of sensornodes at two consecutive time instances ti+1 and ti+2,respectively. If R3 is Alice’s office room, an adversaryknows that Alice is in room R3 at time ti. Then theadversary knows that Alice left R3 at time ti+1 and wentto C2 by knowing the number of persons detected bythe sensor nodes in R3 and C2. Likewise, the adversarycan infer that Alice left C2 at time ti+2 and went to R7.Such knowledge leakage may lead to several privacythreats. For example, knowing that a person has visitedcertain clinical rooms may lead to knowing the herhealth records. Also, knowing that a person has visiteda certain bar or restaurant in a mall building may revealconfidential personal information.

This paper proposes a privacy-preserving locationmonitoring system for wireless sensor networks to pro-vide monitoring services. Our system relies on the wellestablished k-anonymity privacy concept, which requireseach person is indistinguishable among k persons. In oursystem, each sensor node blurs its sensing area into acloaked area, in which at least k persons are residing. Eachsensor node reports only aggregate location information,which is in a form of a cloaked area, A, along with thenumber of persons, N , located in A, where N ≥ k, to theserver. It is important to note that the value of k achievesa trade-off between the strictness of privacy protectionand the quality of monitoring services. A smaller k indi-cates less privacy protection, because a smaller cloakedarea will be reported from the sensor node; hence bettermonitoring services. However, a larger k results in alarger cloaked area, which will reduce the quality ofmonitoring services, but it provides better privacy pro-tection. Our system can avoid the privacy leakage inthe example given in Figure 1 by providing low qualitylocation monitoring services for small areas that theadversary could use to track users, while providing highquality services for larger areas. The definition of a smallarea is relative to the required anonymity level, becauseour system provides better quality services for the samearea if we relax the required anonymity level. Thus theadversary cannot infer the number of persons currentlyresiding in a small area from our system output withany fidelity; therefore the adversary cannot know thatAlice is in room R3.

To preserve personal location privacy, we proposetwo in-network aggregate location anonymization algo-rithms, namely, resource- and quality-aware algorithms.Both algorithms require the sensor nodes to collaboratewith each other to blur their sensing areas into cloakedareas, such that each cloaked area contains at least kpersons to constitute a k-anonymous cloaked area. Theresource-aware algorithm aims to minimize communi-cation and computational cost, while the quality-awarealgorithm aims to minimize the size of the cloaked areas,in order to maximize the accuracy of the aggregatelocations reported to the server. In the resource-awarealgorithm, each sensor node finds an adequate numberof persons, and then it uses a greedy approach to find a

cloaked area. On the other hand, the quality-aware algo-rithm starts from a cloaked area A, which is computed bythe resource-aware algorithm. Then A will be iterativelyrefined based on extra communication among the sensornodes until its area reaches the minimal possible size. Forboth algorithms, the sensor node reports its cloaked areawith the number of monitored persons in the area as anaggregate location to the server.

Although our system only knows the aggregate lo-cation information about the monitored persons, it canstill provide monitoring services through answering ag-gregate queries, for example, “What is the number ofpersons in a certain area?” To support these monitoringservices, we propose a spatial histogram that analyzes thegathered aggregate locations to estimate the distributionof the monitored persons in the system. The estimateddistribution is used to answer aggregate queries.

We evaluate our system through simulated experi-ments. The results show that the communication andcomputational cost of the resource-aware algorithmis lower than the quality-aware algorithm, while thequality-aware algorithm provides more accurate moni-toring services (the average accuracy is about 90%) thanthe resource-aware algorithm (the average accuracy isabout 75%). Both algorithms only reveal k-anonymousaggregate location information to the server, but they aresuitable for different system settings. The resource-awarealgorithm is suitable for the system, where the sensornodes have scarce communication and computationalresources, while the quality-aware algorithm is favorablefor the system, where accuracy is the most importantfactor in monitoring services.

The rest of this paper is organized as follows. Oursystem model is outlined in Section 2. Section 3 presentsthe resource- and quality-aware location anonymizationalgorithms. Section 4 describes the aggregate query pro-cessing. We describe an attacker model and the experi-ment setting of our system in Section 5. The experimentalresults are given in Section 6. Section 7 highlights therelated work. Finally, Section 8 concludes the paper.

2 SYSTEM MODEL

Figure 2 depicts the architecture of our system, wherethere are three major entities, sensor nodes, server, andsystem users. We will define the problem addressed byour system, and then describe the detail of each entityand the privacy model of our system.

Problem definition. Given a set of sensor nodes s1, s2,. . . , sn with sensing areas a1, a2, . . . , an, respectively,a set of moving objects o1, o2, . . . , om, and a requiredanonymity level k, (1) we find an aggregate locationfor each sensor node si in a form of Ri = (Areai, Ni),where Areai is a rectangular area containing the sensingarea of a set of sensor nodes Si and Ni is the numberof objects residing in the sensing areas of the sensornodes in Si, such that Ni ≥ k, Ni = | ∪sj∈Si

Oj | ≥ k,Oj = {ol|ol ∈ aj}, 1 ≤ i ≤ n, and 1 ≤ l ≤ m; and (2) we

3

��

��

��

��

��

��

��

��

��

��

��

��

��

��

Fig. 2: System architecture.

build a spatial histogram to answer an aggregate queryQ that asks about the number of objects in a certain areaQ.Area based on the aggregate locations reported fromthe sensor nodes.

Sensor nodes. Each sensor node is responsible for deter-mining the number of objects in its sensing area, blurringits sensing area into a cloaked area A, which includesat least k objects, and reporting A with the number ofobjects located in A as aggregate location informationto the server. We do not have any assumption aboutthe network topology, as our system only requires acommunication path from each sensor node to the serverthrough a distributed tree [10]. Each sensor node is alsoaware of its location and sensing area.

Server. The server is responsible for collecting the ag-gregate locations reported from the sensor nodes, usinga spatial histogram to estimate the distribution of themonitored objects, and answering range queries basedon the estimated object distribution. Furthermore, theadministrator can change the anonymized level k of thesystem at anytime by disseminating a message with anew value of k to all the sensor nodes.

System users. Authenticated administrators and userscan issue range queries to our system through either theserver or the sensor nodes, as depicted in Figure 2. Theserver uses the spatial histogram to answer their queries.

Privacy model. In our system, the sensor nodes con-stitute a trusted zone, where they behave as defined inour algorithm and communicate with each other througha secure network channel to avoid internal networkattacks, for example, eavesdropping, traffic analysis, andmalicious nodes [6], [11]. Since establishing such a securenetwork channel has been studied in the literature [6],[11], the discussion of how to get this network channel isbeyond the scope of this paper. However, the solutionsthat have been used in previous works can be appliedto our system. Our system also provides anonymouscommunication between the sensor nodes and the serverby employing existing anonymous communication tech-niques [12], [13]. Thus given an aggregate location R,the server only knows that the sender of R is one of thesensor nodes within R. Furthermore, only authenticatedadministrators can change the k-anonymity level and thespatial histogram size. In emergency cases, the admin-istrators can set the k-anonymity level to a small value

to get more accurate aggregate locations from the sensornodes, or even set it to zero to disable our algorithm toget the original readings from the sensor nodes, in orderto get the best services from the system. Since the serverand the system user are outside the trusted zone, theyare untrusted.

We now discuss the privacy threat in existing locationmonitoring systems. In an identity-sensor location mon-itoring system, since each sensor node reports the exactlocation information of each monitored object to theserver, the adversary can pinpoint each object’s exact lo-cation. On the other hand, in a counting-sensor locationmonitoring system, each sensor node reports the numberof objects in its sensing area to the server. The adversarycan map the monitored areas of the sensor nodes to thesystem layout. If the object count of a monitored area isvery small or equal to one, the adversary can infer theidentity of the monitored objects based on the mappedmonitored area, for example, Alice is in her office roomat time instance ti in Figure 1.

Since our system only allows each sensor node toreport a k-anonymous aggregate location to the server,the adversary cannot infer an object’s exact location withany fidelity. The larger the anonymity level, k, the moredifficult for the adversary to infer the object’s exactlocation. With the k-anonymized aggregate locationsreported from the sensor nodes, the underlying spatialhistogram at the server provides low quality locationmonitoring services for a small area, and better qualityservices for larger areas. This is a nice privacy-preservingfeature, because the object count of a small area ismore likely to reveal personal location information. Thedefinition of a small area is relative to the requiredanonymity level, because our system provides lowerquality services for the same area if the anonymizedlevel gets stricter. We will also describe an attack model,where we stimulate an attacker that could be a systemuser or the server attempting to infer the object count ofa particular sensor node in Section 5.1. We evaluate oursystem’s resilience to the attack model and its privacyprotection in Section 6.

3 LOCATION ANONYMIZATION ALGORITHMS

In this section, we present our in-network resource- andquality-aware location anonymization algorithms that isperiodically executed by the sensor nodes to report theirk-anonymous aggregate locations to the server for everyreporting period.

3.1 The Resource-Aware AlgorithmAlgorithm 1 outlines the resource-aware locationanonymization algorithm. Figure 3 gives an example toillustrate the resource-aware algorithm, where there areseven sensor nodes, A to G, and the required anonymitylevel is five, k = 5. The dotted circles represent thesensing area of the sensor nodes, and a line betweentwo sensor nodes indicates that these two sensor nodes

4

��

��

� �

��

��

��

��

��

��

��

� �

��

��

��

��

� �

��

��

��

�

��

��

��

��

�

�

�

�

�

�

(a) PeerLists after the first broadcast

��

��

� �

��

��

��

��

� �

��

��

��

��

��

� �

��

��

� �

��

��

��

� �

��

��

��

�

��

��

��

� �

��

��

�

�

�

�

�

�

(b) Rebroadcast from sensor node F

��

��

� �

��

��

��

��

� �

��

��

��

��

��

� �

��

��

� �

��

��

��

� �

��

��

��

�

��

��

��

� �

��

��

�

�

�

�

�

�

(c) Resource-aware cloaked area of sensor node A

Fig. 3: The resource-aware location anonymization algorithm (k = 5).

can communicate directly with each other. In general,the algorithm has three steps.

Step 1: The broadcast step. The objective of this step isto guarantee that each sensor node knows an adequatenumber of objects to compute a cloaked area. To reducecommunication cost, this step relies on a heuristic thata sensor node only forwards its received messages toits neighbors when some of them have not yet foundan adequate number of objects. In this step, after eachsensor node m initializes an empty list PeerList (Line 2in Algorithm 1), m sends a message with its identitym.ID, sensing area m.Area, and the number of objectslocated in its sensing area m.Count, to its neighbors(Line 3). When m receives a message from a peer p,i.e., (p.ID, p.Area, p.Count), m stores the message in itsPeerList (Line 5). Whenever m finds an adequate numberof objects, m sends a notification message to its neighbors(Line 7). If m has not received the notification messagefrom all its neighbors, some neighbor has not found anadequate number of objects; therefore m forwards thereceived message to its neighbors (Line 10).

Figures 3a and 3b illustrate the broadcast step. Whena reporting period starts, each sensor node sends amessage with its identity, sensing area, and the numberof objects located in its sensing area to its neighbors.After the first broadcast, sensor nodes A to F have foundan adequate number of objects (represented by blackcircles), as depicted in Figure 3a. Thus sensor nodes A toF send a notification message to their neighbors. Sincesensor node F has not received a notification messagefrom its neighbor G, F forwards its received messages,which include the information about sensor nodes D andE, to G (Figures 3b). Finally, sensor node G has foundan adequate number of objects, so it sends a notificationmessage to its neighbor, F . As all the sensor nodes havefound an adequate number of objects, they proceed tothe next step.

Step 2: The cloaked area step. The basic idea of thisstep is that each sensor node blurs its sensing areainto a cloaked area that includes at least k objects, inorder to satisfy the k-anonymity privacy requirement.To minimize computational cost, this step uses a greedy

Algorithm 1 Resource-aware location anonymization1: function RESOURCEAWARE (Integer k, Sensor m, List R)2: PeerList← {∅}

// Step 1: The broadcast step3: Send a message with m’s identity m.ID, sensing area m.Area, and object

count m.Count to m’s neighbor peers4: if Receive a message from a peer p, i.e., (p.ID, p.Area, p.count) then5: Add the message to PeerList6: if m has found an adequate number of objects then7: Send a notification message to m’s neighbors8: end if9: if Some m’s neighbor has not found an adequate number of objects then

10: Forward the message to m’s neighbors11: end if12: end if

// Step 2: The cloaked area step13: S ← {m}14: Compute a score for each peer in PeerList15: Repeatedly select the peer with the highest score from PeerList to S until the

total number of objects in S is at least k16: Area← a minimum bounding rectangle of the senor nodes in S17: N ← the total number of objects in S

// Step 3: The validation step18: if No containment relationship with Area and R ∈ R then19: Send (Area, N) to the peers within Area and the server20: else if m’s sensing area is contained by some R ∈ R then21: Randomly select a R′ ∈ R such that R′.Area contains m’s sensing area22: Send R′ to the peers within R′.Area and the server23: else24: Send Area with a cloaked N to the peers within Area and the server25: end if

approach to find a cloaked area based on the informationstored in PeerList. For each sensor node m, m initializesa set S = {m}, and then determines a score for each peerin its PeerList (Lines 13 to 14 in Algorithm 1). The scoreis defined as a ratio of the object count of the peer to theEuclidean distance between the peer and m. The ideabehind the score is to select a set of peers from PeerListto S to form a cloaked area that includes at least k objectsand has an area as small as possible. Then we repeatedlyselect the peer with the highest score from the PeerListto S until S contains at least k objects (Line 15). Finally,m determines the cloaked area (Area) that is a minimumbounding rectangle (MBR) that covers the sensing area ofthe sensor nodes in S, and the total number of objectsin S (N ) (Lines 16 to 17).

An MBR is a rectangle with the minimum area (whichis parallel to the axes) that completely contains alldesired regions, as illustrated in Figure 3c, where thedotted rectangle is the MBR of the sensing area of sensor

5

nodes A and B. The major reasons of our algorithmsaligning with MBRs rather than other polygons are thatthe concept of MBRs have been widely adopted byexisting query processing algorithms and most databasemanagement systems have the ability to manipulateMBRs efficiently.

Figure 3c illustrates the cloaked area step. The PeerListof sensor node A contains the information of three peers,B, D, and E. The object count of sensor nodes B, D, andE is 3, 1, and 2, respectively. We assume that the distancefrom sensor node A to sensor nodes B, D, and E is 17,18, and 16, respectively. The score of B, D, and E is3/17 = 0.18, 1/18 = 0.06, and 2/16 = 0.13, respectively.Since B has the highest score, we select B. The sum ofthe object counts of A and B is six which is larger thanthe required anonymity level k = 5, so we return theMBR of the sensing area of the sensor nodes in S, i.e., Aand B, as the resource-aware cloaked area of A, whichis represented by a dotted rectangle.

Step 3: The validation step. The objective of this stepis to avoid reporting aggregate locations with a con-tainment relationship to the server. Let Ri and Rj betwo aggregate locations reported from sensor nodes iand j, respectively. If Ri’s monitored area is included inRj ’s monitored area, Ri.Area ⊂ Rj .Area or Rj .Area ⊂Ri.Area, they have a containment relationship. We donot allow the sensor nodes to report their aggregatelocations with the containment relationship to the server,because combining these aggregate locations may poseprivacy leakage. For example, if Ri.Area ⊂ Rj .Areaand Ri.Area 6= Rj .Area, an adversary can infer thatthe number of objects residing in the non-overlappingarea, Rj .Area − Ri.Area, is Rj .N − Ri.N . In case thatRj .N −Ri.N < k, the adversary knows that the numberof objects in the non-overlapping is less than k, whichviolates the k-anonymity privacy requirement. As thisstep ensures that no aggregate location with the contain-ment relationship is reported to the server, the adversarycannot obtain any deterministic information from theaggregate locations.

In this step, each sensor node m maintains a list R tostore the aggregate locations sent by other peers. Whena reporting period starts, m nullifies R. After m findsits aggregate location Rm, m checks the containmentrelationship between Rm and the aggregate locationsstored in R. If there is no containment relationshipbetween Rm and the aggregate locations in R, m sendsRm to the peers within Rm.Area and the server (Line 19in Algorithm 1). Otherwise, m randomly selects an ag-gregate location Rp from the set of aggregate locations inR that contain m’s sensing area, and m sends Rp to thepeers within Rp.Area and the server (Lines 21 to 22). Incase that no aggregate location in R contains m’s sensingarea, we find a set of aggregate locations in R that arecontained by Rm, R′, and N ′ is the number of monitoredpersons in Rm that is not covered by any aggregatelocation in R′. If N ′ ≥ k, the containment relationshipdoes not violate the k-anonymity privacy requirement;

Algorithm 2 Quality-aware location anonymization1: function QUALITYAWARE (Integer k, Sensor m, Set init solution, List R)2: current min cloaked area← init solution

// Step 1: The search space step3: Determine a search space S based on init solution4: Collect the information of the peers located in S

// Step 2: The minimal cloaked area step5: Add each peer located in S to C[1] as an item6: Add m to each itemset in C[1] as the first item7: for i = 1; i ≤ 4; i ++ do8: for each itemset X = {a1, . . . , ai+1} in C[i] do9: if Area(MBR(X)) < Area(current min cloaked area) then

10: if N(MBR(X)) ≥ k then11: current min cloaked area← {X}12: Remove X from C[i]13: end if14: else15: Remove X from C[i]16: end if17: end for18: if i < 4 then19: for each itemset pair X={x1,. . . ,xi+1}, Y ={y1,. . . ,yi+1} in C[i]

do20: if x1 = y1, . . . , xi = yi and xi+1 6= yi+1 then21: Add an itemset {x1, . . . , xi+1, yi+1} to C[i + 1]22: end if23: end for24: end if25: end for26: Area← a minimum bounding rectangle of current min cloaked area27: N ← the total number of objects in current min cloaked area

// Step 3: The validation step28: Lines 18 to 25 in Algorithm 1

therefore m sends Rm to the peers within Rm.Area andthe server. However, if N ′ < k, m cloaks the number ofmonitored persons of Rm, Rm.N , by increasing it by aninteger uniformly selected between k and 2k, and sendsRm to the peers within Rm.Area and the server (Line 24).Since the server receives an aggregate location from eachsensor node for every reporting period, it cannot tellwhether any containment relationship takes place amongthe actual aggregate locations of the sensor nodes.

3.2 The Quality-Aware Algorithm

Algorithm 2 outlines the quality-aware algorithm thattakes the cloaked area computed by the resource-awarealgorithm as an initial solution, and then refines it untilthe cloaked area reaches the minimal possible area,which still satisfies the k-anonymity privacy require-ment, based on extra communication between otherpeers. The quality-aware algorithm initializes a variablecurrent minimal cloaked area by the input initial solution(Line 2 in Algorithm 2). When the algorithm terminates,the current minimal cloaked area contains the set of sen-sor nodes that constitutes the minimal cloaked area. Ingeneral, the algorithm has three steps.

Step 1: The search space step. Since a typical sensornetwork has a large number of sensor nodes, it is toocostly for a sensor node m to gather the informationof all the sensor nodes to compute its minimal cloakedarea. To reduce communication and computational cost,m determines a search space, S, based on the inputinitial solution, which is the cloaked area computedby the resource-aware algorithm, such that the sensornodes outside S cannot be part of the minimal cloaked

6

� �

�

�

�

�

�

��

��

��

��

(a) The MBR of A’s sensing area

� �

�

�

�

�

�

�

�

��

(b) The extended MBR1 of the edge e1

� �

�

�

�

�

�

��

��

��

��

(c) The extended MBRi (1 ≤ i ≤ 4)

��

��

� �

��

� �

�

�

�

�

�

��

��

��

��

(d) The search space S

Fig. 4: The search space S of sensor node A.

area (Line 3 in Algorithm 2). We will describe how todetermine S based on the example given in Figure 4.Thus gathering the information of the peers residing inS is enough for m to compute the minimal cloaked areafor m (Line 4).

Figure 4 illustrates the search space step, in which wecompute S for sensor node A. Let Area be the area ofthe input initial solution. We assume that Area = 1000.We determine S for A by two steps. (1) We find theminimum bounding rectangle (MBR) of the sensing areaof A. It is important to note that the sensing area canbe in any polygon or irregular shape. In Figure 4a, theMBR of the sensing area of A is represented by a dottedrectangle, where the edges of the MBR are labeled bye1 to e4. (2) For each edge ei of the MBR, we computean MBRi by extending the opposite edge such that thearea of the extended MBRi is equal to Area. S is theMBR of the four extended MBRi. Figure 4b depictsthe extended MBR1 of the edge e1 by extending theopposite edge e3, where MBR1.x is the length of MBR1,MBR1.y = Area/MBR1.x and Area = 1000. Figure 4cshows the four extended MBRs, MBR1 to MBR4, whichare represented by dotted rectangles. The MBR of thefour extended MBRs constitutes S, which is representedby a rectangle (Figure 4d). Finally, the sensor node onlyneeds the information of the peers within S.

Step 2: The minimal cloaked area step. This step takes aset of peers residing in the search space, S, as an inputand computes the minimal cloaked area for the sensornode m. Although the search space step already prunesthe entire system space into S, exhaustively searchingthe minimal cloaked area among the peers residing inS, which needs to search all the possible combinationsof these peers, could still be costly. Thus we propose twooptimization techniques to reduce computational cost.

The basic idea of the first optimization technique isthat we do not need to examine all the combinationsof the peers in S; instead, we only need to considerthe combinations of at most four peers. The rationalebehind this optimization is that an MBR is defined byat most four sensor nodes because at most two sensor

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

Fig. 5: The lattice structure of a set of four items.

nodes define the width of the MBR (parallel to thex-axis) while at most two other sensor nodes definethe height of the MBR (parallel to the y-axis). Thusthis optimization mainly reduces computational cost byreducing the number of MBR computations among thepeers in S. The correctness of this optimization techniquewill be discussed in Section 3.2.2.

The second optimization technique has two properties,lattice structure and monotonicity property. We first de-scribe these two properties, and then present a progressiverefinement approach for finding a minimal cloaked area.

A. Lattice structure. In a lattice structure, a data set thatcontains n items can generate 2n−1 itemsets excluding anull set. In the sequel, since the null set is meaninglessto our problem, it will be neglected. Figure 5 shows thelattice structure of a set of four items S = {s1, s2, s3, s4},where each black line between two itemsets indicatesthat an itemset at a lower level is a subset of an itemsetat a higher level. For our problem, given a set of sensornodes S = {s1, s2, . . . , sn}, all the possible combinationsof these sensor nodes are the non-empty subsets ofS; thus we can use a lattice structure to generate thecombinations of the sensor nodes in S. In the latticestructure, since each itemset at level i has i items in S,each combination at the lowest level, level 1, contains adistinct item in S; therefore there are n itemsets at thelowest level. We generate the lattice structure from the

7

��

��

��

��

��

(a) The full lattice structure

��

��

��

��

��

��

(b) Pruning itemset {A,B}

��

��

��

��

��

��

(c) Pruning itemset {A,D}

��

��

��

��

��

��

(d) The minimal cloaked area

Fig. 6: The quality-aware cloaked area of sensor node A.

lowest level based on a simple generation rule: given twosorted itemsets X = {x1, . . . , xi} and Y = {y1, . . . , yi} inincreasing order, where each itemset has i items (1 ≤ i <n), if all item pairs but the last one in X and Y are thesame, x1 = y1, x2 = y2, . . ., xi−1 = yi−1, and xi 6= yi, wegenerate a new itemset with i + 1 items, {x1, . . . , xi, yi}.In the example, we use bold lines to illustrate the con-struction of the lattice structure based on the generationrule. For example, the itemset {s1, s2, s3, s4} at level 4is combined by the itemsets {s1, s2, s3} and {s1, s2, s4}at level 3, so there is a bold line from {s1, s2, s3, s4} to{s1, s2, s3} and another one to {s1, s2, s4}.

B. Monotonicity property. Let S be a set of items, and Pbe the power set of S, 2S . The monotonicity property ofa function f indicates that if X is a subset of Y , thenf(X) must not exceed f(Y ), i.e., ∀X, Y ∈ P : (X ⊆Y ) → f(X) ≤ f(Y ). For our problem, the MBR of aset of sensor nodes S has the monotonicity property,because adding sensor nodes to S must not decrease thearea of the MBR of S or the number of objects withinthe MBR of S. Let Area(MBR(X)) and N(MBR(X))be two functions that return the area of the MBR of anitemset X and the number of monitored objects locatedin the MBR, respectively. Thus, given two itemsets X andY , if X ⊆ Y , then Area(MBR(X)) ≤ Area(MBR(Y ))and N(MBR(X)) ≤ N(MBR(Y )). By this property, wepropose two pruning conditions in the lattice structure.(1) If a combination C gives the current minimal cloakedarea, other combinations that contain C at the higherlevels of the lattice structure should be pruned. Thisis because the monotonicity property indicates that thepruned combinations cannot constitute a cloaked areasmaller than the current minimal cloaked area. (2) Similarly,if a combination C constitutes a cloaked area that is thesame or larger than the current minimal cloaked area, othercombinations that contain C at the higher levels of thelattice structure should be pruned.

C. Progressive refinement. Since the monotonicity prop-erty shows that we would not need to generate a com-plete lattice structure to compute a minimal cloaked area,we generate the lattice structure of the peers in the searchspace, S, progressively from the lowest level of the latticestructure to its higher levels, in order to minimize the

computational and storage overhead. To compute theminimal cloaked area for the sensor node m, we firstgenerate an itemset for each peer in S at the lowest levelof the lattice structure, C[1] (Line 5 in Algorithm 2). Toaccommodate with our problem, we add m to each item-set in C[1] as the first item (Line 6). Such accommodationdoes not affect the generation of the lattice structure, buteach itemset has an extra item, m. For each itemset Xin C[1], we determine the MBR of X , MBR(X). If thearea of MBR(X) is less than the current minimal cloakedarea and the total number of objects in MBR(X) is atleast k, we set X to the current minimal cloaked area, andremove X from C[1] based on the first pruning conditionof the monotonicity property (Lines 11 to 12). However,if the area of MBR(X) is equal to or larger than thearea of the current minimal cloaked area, we also remove Xfrom C[1] based on the second pruning condition of themonotonicity property (Line 15). Then we generate theitemsets, where each itemset contains two items, at thesecond lowest level of the lattice structure, C[2], based onthe remaining itemsets in C[1] based on the generationrule of the lattice structure. We repeat this procedureuntil we produce the itemsets at the highest level of thelattice structure, C[4], or all the itemsets at the currentlevel are pruned (Lines 19 to 23). After we examine allnon-pruned itemsets in the lattice structure, the currentminimal cloaked area stores the combination giving theminimal cloaked area (Lines 26 to 27).

Figure 6 illustrates the minimal cloaked area step thatcomputes the minimal cloaked area for sensor node A.The set of peers residing in the search space is S ={B, D, E}. We assume that the area of the MBR of {A, B},{A, D}, and {A, E} is 1000, 1200, and 900, respectively.The number of objects residing in the MBR of {A, B},{A, D}, and {A, E} is six, four, and five, respectively, asdepicted in Figure 3. Figure 6a depicts the full latticestructure of S where A is added to each itemset asthe first item. Initially, the current minimal cloaked areais set to the initial solution, which is the MBR of {A, B}’computed by the resource-aware algorithm. The area ofthe MBR of {A, B}, Area(MBR({A, B})), is 1000 andthe total number of monitored objects in MBR({A, B}),N(MBR({A, B})), is six. It is important to note that

8

the progressive refinement approach may not requireour algorithm to compute the full lattice structure. Asdepicted in Figure 6b, we construct the lowest levelof the lattice structure, where each itemset contains apeer in S. Since the area of MBR({A, B}) is the currentminimal cloaked area, we remove {A, B} from the latticestructure; hence the itemsets at the higher levels thatcontain {A, B}, {A, B, D}, {A, B, E}, and {A, B, D, E}(enclosed by a dotted oval), will not be considered by thealgorithm. Then, we consider the next itemset {A, D}.Since the area of MBR({A, D}) is larger than the currentminimal cloaked area, this itemset is removed from thelattice structure. After pruning {A, D}, the itemsets atthe higher levels that contain {A, D}, {A, D, E} (enclosedby a dotted oval), will not be considered (Figure 6c).We can see that all itemsets beyond the lowest level ofthe lattice structure will not be considered by the algo-rithm. Finally, we consider the last itemset {A, E}. Sincethe area of MBR({A, E}) is less than current minimalcloaked area and the total number of monitored objectsin MBR({A, E}) is k = 5, we set {A, E} to the currentminimal cloaked area (Figure 6d). As the algorithm cannotgenerate any itemsets at the higher level of the latticestructure, it terminates. Thus the minimal cloaked areais the MBR of sensor nodes A and E, and the numberof monitored objects in this area is five.

Step 3: The validation step. This step is exactly the sameas in the resource-aware algorithm (Section 3.1).

3.2.1 AnalysisA brute-force approach of finding the minimal cloakedarea of a sensor node has to examine all the combinationsof its peers. Let N be the number of sensor nodes inthe system. Since each sensor node has N − 1 peers,we have to consider

∑N−1i=1 CN−1

i = 2N−1 − 1 MBRsto find the minimal cloaked area. In our algorithm, thesearch space step determines a search space, S, andprunes the peers outside S. Let M be the number ofpeers in S, where M ≤ N − 1. Thus the computationalcost is reduced to

∑Mi=1 CM

i = 2M − 1. In the minimalcloaked area step, the first optimization technique in-dicates that an MBR can be defined by at most fourpeers. As we need to consider the combinations of atmost four peers, the computational cost is reduced to∑4

i=1 CMi = (M4 − 2M3 + 11M2 + 14M)/24 = O(M4).

Furthermore, the second optimization technique uses themonotonicity property to prune the combinations, whichcannot give the minimal cloaked area. In our example,the brute-force approach considers all the combinationsof six peers; hence this approach computes 26 − 1 = 63MBRs to find the minimal cloaked area of sensor nodeA. In our algorithm, the search space step reduces theentire space into S, which contains only three peers;hence this step needs to compute 23 − 1 = 7 MBRs.After examining the three itemsets at the lowest levelof the lattice structure, all other itemsets at the higherlevels are pruned. Thus the progressive refinement ap-proach considers only three combinations. Therefore our

algorithm reduces over 95% computational cost of thebrute-force approach, as it reduces the number of MBRcomputations from 63 to 3.

3.2.2 Proof of CorrectnessIn this section, we show the correctness of the quality-aware location anonymization algorithm.

Theorem 1: Given a resource-aware cloaked area ofsize Area of a sensor node s, a search space, S, computedby the quality-aware algorithm contains the minimalcloaked area.

Proof: Let X be the minimal cloaked area of sizeequal to or less than Area. We know that X must totallycover the sensing area of s. Suppose X is not totallycovered by S, X must contain at least one extendedMBR, MBRi, where 1 ≤ i ≤ 4 (Figure 4c). This meansthat the area of X is larger than the area of an extendedMBR, Area. This contradicts to the assumption that X isthe minimal cloaked area; thus X is included in S.

Theorem 2: A minimum bounding rectangle (MBR) canbe defined by at most four sensor nodes.

Proof: By definition, given an MBR, each edge of theMBR touches the sensing area of some sensor node. Inan extreme case, there is a distinct sensor node touchingeach edge of the MBR but not other edges. The MBRis defined by four sensor nodes, which touch differentedges of the MBR. For any edge e of the MBR, if multiplesensor nodes touch e but not other edges, we can simplypick one of these sensor nodes, because any one of thesesensor nodes gives the same e. Thus an MBR is definedby at most four sensor nodes.

4 SPATIAL HISTOGRAMIn this section, we present a spatial histogram that isembedded inside the server to estimate the distributionof the monitored objects based on the aggregate locationsreported from the sensor nodes. Our spatial histogramis represented by a two-dimensional array that modelsa grid structure G of NR rows and NC columns; hence,the system space is divided into NR×NC disjoint equal-sized grid cells. In each grid cell G(i, j), we maintain afloat value that acts as an estimator H[i, j] (1 ≤ i ≤ NC ,1 ≤ j ≤ NR) of the number of objects within its area.We assume that the system has the ability to knowthe total number of moving objects M in the system.The value of M will be used to initialize the spatialhistogram. In practice, M can be computed online forboth indoor and outdoor dynamic environments. For theindoor environment, the sensor nodes can be deployedat each entrance and exit to count the number of usersentering or leaving the system [4], [5]. For the outdoorenvironment, the sensor nodes have been already usedto count the number of people in a predefined area [3].We use the spatial histogram to provide approximatelocation monitoring services. The accuracy of the spa-tial histogram that indicates the utility of our privacy-preserving location monitoring system will be evaluatedin Section 6.

9

Algorithm 3 Spatial histogram maintenance1: function HISTOGRAMMAINTENANCE (AggregateLocationSet R)2: for each aggregate location R ∈ R do3: if there is an existing partition P = {R1, . . . , R|P |} such that R.Area∩

Rk.Area = ∅ for every Rk ∈ P then4: Add R to P5: else6: Create a new partition for R7: end if8: end for9: for each partition P do

10: for each aggregate location Rk ∈ P do11: Rk.N ←

∑G[i,j]∈Rk.Area

H[i, j]

12: For every cell G(i, j) ∈ Rk .Area, H[i, j] ←Rk.N

No. of cells within Rk.Area

13: end for14: P.Area← R1.Area ∪ . . . ∪ R|P |.Area15: For every cell G(i, j) /∈ P.Area,

H[i, j] = H[i, j] +

∑Rk∈P

Rk.N−Rk.N

No. of cells outside P.Area16: end for

Algorithm 3 outlines our spatial histogram approach.Initially, we assume that the objects are evenly dis-tributed in the system, so the estimated number ofobjects within each grid cell is H[i, j] = M/(NR × NC).The input of the histogram is a set of aggregate loca-tions R reported from the sensor nodes. Each aggregatelocation R in R contains a cloaked area, R.Area, andthe number of monitored objects within R.Area, R.N .First, the aggregate locations in R are grouped into thesame partition P = {R1, R2, . . . , R|P |} if their cloakedareas are not overlapping with each other, which meansthat for every pair of aggregate locations Ri and Rj

in P , Ri.Area ∩ Rj .Area = ∅ (Lines 2 to 8. Then, foreach partition P , we update its entire set of aggregatelocations to the spatial histogram at the same time. Foreach aggregate location R in P , we record the estimationerror, which is the difference between the sum of theestimators within R.Area, R.N , and R.N , and then R.Nis uniformly distributed among the estimators withinR.Area; hence, each estimator within R.Area is set toR.N divided by the total number of grid cells withinR.Area (Lines 10 to 13). After processing all the aggre-gate locations in P , we sum up the estimation error ofeach aggregate location in P ,

∑|P |k=1 Rk.N − Rk.N , that

is uniformly distributed among the estimators outsideP.Area, where P.Area is the area covered by some ag-gregate location in P , P.Area = ∪Rk∈P Rk.Area (Line 15).Formally, for each partition P that contains |P | aggregatelocations Rk (1 ≤ k ≤ |P |), every estimator in thehistogram is updated as follows:

H[i, j] =

Rk.NNo. of cells within Rk .Area , for G(i, j) ∈ Rk .Area

H[i, j] +

∑|P |

k=1Rk .N−Rk .N

No. of cells outside P.Area , for G(i, j) /∈ P.Area

5 SYSTEM EVALUATION

In this section, we discuss an attacker model, the exper-iment setting of our privacy-preserving location mon-itoring system in a wireless sensor network, and theperformance metrics.

5.1 Attacker Model

To evaluate the privacy protection of our system, wesimulate an attacker attempting to infer the number ofobjects residing in a sensor node’s sensing area. We willanalyze the evaluation result in Section 6.1. The keyidea of the attacker model is that if the attacker cannotinfer the exact object count of the sensor node fromour system output, the attacker cannot infer the locationinformation corresponding to an individual object. Weconsider the worst-case scenario where the attacker hasthe background knowledge about the system, i.e., themap layout of the system, the location of each sensornode, the sensing area of each sensor node, the totalnumber of objects currently residing in the system, andthe aggregate locations reported from the sensor nodes.In general, the attacker model is defined as: Given an areaA (that corresponds to the monitored area of a sensor node)and a set of aggregate locations R = {R1, R2, . . . , R|R|}overlapping with A, the attacker estimates the number ofpersons within A. Since the validation step in our locationanonymization algorithms does not allow the aggregatelocations with the containment relationship to be re-ported to the server, no aggregate location is includedin other aggregate locations in R.

Without loss of generality, we use the Poisson distribu-tion as a concrete exemplary distribution for the attackermodel [14]. Under the Poisson distribution, objects areuniformly distributed in an area within intensity of λ.The probability of n distinct objects in a region S of sizes is: P (N(S) = n) = e−λs(λs)n

n! , where λ is computed asthe number of objects in the system divided by the areaof the system.

Suppose that the object count of each aggregate lo-cation Ri is ni, where 1 ≤ i ≤ |R|, and the aggregatelocations in R and A constitute m non-overlappingsubregions Sj , where 1 ≤ j ≤ m; hence, N(Ri) =∑

Sj∈RiN(Sj) = ni. Each subregion must either intersect

or not intersect A, and it intersects one or more aggregatelocations. If a subregion Sk intersects A, but none of theaggregate locations in R, then N(Sk) = 0. The probabil-ity mass function of the number of distinct objects in Abeing equal to na, N = na, given the aggregate locationsin R can be expressed as follows:

P (N = na|N(R1) = n1, . . . , N(R|R|) = n|R|)

=P (N = na, N(R1) = n1, . . . , N(R|R|) = n|R|)

P (N(R1) = n1, . . . , N(R|R|) = n|R|)

=

∑Vi∈(VS∩VA) < vi

1, vi2, . . . , v

im >

∑Vj∈VS

< vj1, v

j2, . . . , v

jm >

, (1)

where the notation V =< v1, v2, . . . , vm > represents thejoint probability that there are vi objects in a subregionSi (1 ≤ i ≤ m); the joint probability is computed as∏

1≤i≤m P (N(Si) = vi). The lower and upper bounds ofvi (denoted as LB(vi) and UB(vi), respectively) are zeroand the minimum nj of the aggregate locations inter-

10

TABLE 1: Parameter settings

Description Default Value RangeHistogram size (NR × NC) 200 × 200 502 to 2502

Query region size ratio [0.001, 0.032] 0.001 to 0.256Number of moving objects 5,000 2,000 to 10,000

k-anonymity level 20 10 to 30Object mobility speed [0,5] [0,5] to [0,30]

secting Si, respectively. Thus, the possible value of vi iswithin a range of [0, minRj∩Si 6=∅∧1≤i≤m∧1≤j≤|R|(nj)]. VS

is the set of < v1, v2, . . . , vm > that is a solution to thefollowing equations: VS :

∑Si∈R1

vi = n1,∑

Si∈R2vi =

n2, . . . ,∑

Si∈R|R|vi = n|R|, where vi ≥ 0 for 1 ≤ i ≤ m.

VA is the set of < v1, v2, . . . , vm > that satisfies thefollowing equation: VA :

∑1≤i≤m vi = na.

The attacker uses an exhaustive approach to findall possible solutions to VS , in order to compute theexpected value E(N ) of Equation 1 as the estimatedvalue of na. The complexity of computing E(N ) isO(

∏1≤i≤m UB(vi)). Since the complexity of the attacker

model is an exponential function of m and m wouldbe much larger than |R|, such exponential complexitymakes it prohibitive for the attacker model to be used toprovide online location monitoring services; and there-fore, we use our spatial histogram to provide online ser-vices in the experiments. We will evaluate the resilienceof our system to the attacker model in Section 6.1.

5.2 Simulation Settings

In all experiments, we simulate 30 × 30 sensor nodesthat are uniformly distributed in a 600 × 600 systemspace. Each sensor node is responsible for monitoringa 20 × 20 space. We generate a set of moving objectsthat freely roam around the system space. Unless men-tioned otherwise, the experiments consider 5,000 movingobjects that move at a random speed within a rangeof [0, 5] space unit(s) per time unit, and the requiredanonymity level is k = 20. The spatial histogram containsNR×NC = 200×200 grid cells, and we issue 1,000 rangequeries whose query region size is specified by a ratio ofthe query region area to the system area, that is, a queryregion size ratio. The default query region size ratiois uniformly selected within a range of [0.001, 0.032].Table 1 gives a summary of the parameter settings.

5.3 Performance Metrics

We evaluate our system in terms of five performancemetrics. (1) Attack model error. This metric measures theresilience of our system to the attacker model by therelative error between the estimated number of objects Nin a sensor node’s sensing area and the actual one N . Theerror is measured as |N−N |

N . When N = 0, we considerN as the error. (2) Communication cost. We measure thecommunication cost of our location anonymization algo-rithms in terms of the average number of bytes sent by

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

10 15 20 25 30

Att

ack

Mod

el E

rror

Anonymity Levels

Resource-AwareQuality-Aware

(a) Anonymity levels

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

2 4 6 8 10

Att

ack

Mod

el E

rror

Number of Objects (in thousands)


(b) Number of objects

Fig. 7: Attacker model error.

each sensor node per reporting period. This metric alsoindicates the network traffic and the power consumptionof the sensor nodes. (3) Cloaked area size. This metricmeasures the quality of the aggregate locations reportedby the sensor nodes. The smaller the cloaked area, thebetter the accuracy of the aggregate location is. (4) Com-putational cost. We measure the computational cost ofour location anonymization algorithms in terms of theaverage number of minimum bounding rectangle (MBR)computations that are needed to determine a resource- orquality-aware cloaked area. We compare our algorithmswith a basic approach that computes the MBR for eachcombination of the peers in the required search spaceto find the minimal cloaked area. The basic approachdoes not employ any optimization techniques proposedfor our quality-aware algorithm. (5) Query error. Thismetric measures the utility of our system, in terms ofthe relative error between the query answer M , which isthe estimated number of objects within the query regionbased on a spatial histogram, and the actual answer

M , respectively. The error is measured as |M−M |M . When

M = 0, we consider M as the error.

6 EXPERIMENTAL RESULTS AND ANALYSIS

In this section, we show and analyze the experimentalresults with respect to the privacy protection and thequality of location monitoring services of our system.

6.1 Anonymization Strength

Figure 7 depicts the resilience of our system to the at-tacker model with respect to the anonymity level and thenumber of objects. In the figure, the performance of theresource- and quality-aware algorithms is represented byblack and gray bars, respectively. Figure 7a depicts thatthe stricter the anonymity level, the larger the attackermodel error will be encountered by an adversary. Whenthe anonymity level gets stricter, our algorithms generatelarger cloaked areas, which reduce the accuracy of theaggregate locations reported to the server. Figure 7bshows that the attacker model error reduces, as thenumber of objects gets larger. This is because whenthere are more objects, our algorithms generate smaller

11

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.001 0.002 0.004 0.008 0.016 0.032 0.064 0.128 0.256

Que

ry A

nsw

er E

rror

Query Region Size Ratio

k=10k=15k=20k=25k=30

(a) Resource-aware algorithm

0

0.1

0.2

0.3

0.4

0.5

0.6

0.001 0.002 0.004 0.008 0.016 0.032 0.064 0.128 0.256

Que

ry A

nsw

er E

rror

Query Region Size Ratio

k=10k=15k=20k=25k=30

(b) Quality-aware algorithm

Fig. 8: Query region size.

cloaked areas, which increase the accuracy of the aggre-gate locations reported to the server. It is difficult to set ahard quantitative threshold for the attacker model error.However, it is evident that the adversary cannot inferthe number of objects in the sensor node’s sensing areawith any fidelity.

6.2 Effect of Query Region Size

Figure 8 depicts the privacy protection and the qualityof our location monitoring system with respect to in-creasing the query region size ratio from 0.001 to 0.256,where the query region size ratio is the ratio of the queryregion area to the system area and the query region sizeratio 0.001 corresponds to the size of a sensor node’ssensing area. The results give evidence that our systemprovides low quality location monitoring services for therange query with a small query region, and better qualityservices for larger query regions. This is an importantfeature to protect personal location privacy, becauseproviding the accurate number of objects in a small areacould reveal individual location information; thereforean adversary cannot use our system output to track themonitored objects with any fidelity. The definition of asmall query region is relative to the required anonymitylevel k. For example, we want to provide low qualityservices, such that the query error is at least 0.2, forsmall query regions. For the resource-aware algorithm,Figure 8a shows that when k = 10, a query region is saidto be small if its query region size is not larger than 0.002(it is about two sensor nodes’ sensing area). However,when k = 30, a query region is only considered as small

if its query region size is not larger than 0.016 (it isabout 16 sensor nodes’ sensing area). For the quality-aware algorithm, Figure 8b shows that when k = 10,a query region is said to be small if its query regionsize is not larger than 0.002, while when k = 30, a queryregion is only considered as small if its query region sizeis not larger than 0.004. The results also show that thequality-aware algorithm always performs better than theresource-aware algorithm.

6.3 Effect of the Number of ObjectsFigure 9 depicts the performance of our system withrespect to increasing the number of objects from 2,000 to10,000. Figure 9a shows that when the number of objectsincreases, the communication cost of the resource-awarealgorithm is only slightly affected, but the quality-awarealgorithm significantly reduces the communication cost.The broadcast step of the resource-aware algorithm ef-fectively allows each sensor node to find an adequatenumber of objects to blur its sensing area. When thereare more objects, the sensor node finds smaller cloakedareas that satisfy the k-anonymity privacy requirement,as given in Figure 9b. Thus the required search space ofa minimal cloaked area computed by the quality-awarealgorithm becomes smaller; hence the communicationcost of gathering the information of the peers in sucha smaller required search space reduces. Likewise, sincethere are less peers in the smaller required search spaceas the number of objects increases, finding the minimalcloaked area incurs less minimum bounding rectangle(MBR) computation (Figure 9c). Since our algorithmsgenerate smaller cloaked areas when there are moreusers, the spatial histogram can gather more accurateaggregate locations to estimate the object distribution;therefore the query answer error reduces (Figure 9d). Theresult also shows that the quality-aware algorithm al-ways provides better quality services than the resource-aware algorithm.

6.4 Effect of Privacy RequirementsFigure 10 depicts the performance of our system withrespect to varying the required anonymity level k from10 to 30. When the k-anonymity privacy requirementgets stricter, the sensor nodes have to enlist more peersfor help to blur their sensing areas; therefore the commu-nication cost of our algorithms increases (Figure 10a). Tosatisfy the stricter anonymity levels, our algorithms gen-erate larger cloaked areas, as depicted in Figure 10b. Forthe quality-aware algorithm, since there are more peersin the required search space when the input (resource-aware) cloaked area gets larger, the computational costof computing the minimal cloaked area by the quality-aware algorithm and the basic approach gets worse (Fig-ure 10c). However, the quality-aware algorithm reducesthe computational cost of the basic approach by at leastfour orders of magnitude. Larger cloaked areas givemore inaccurate aggregate location information to the

12

3

4

5

6

7

8

9

10

2 4 6 8 10

Num

ber

of B

ytes

(in

thou

sand

s)



(a) Communication cost

0.8

1

1.2

1.4

1.6

1.8

2

2.2

2.4

2.6

2 4 6 8 10

Clo

aked

Are

a S

ize

(in th

ousa

nds)



(b) Cloaked area size

10-1100101102103104105106107108

2 4 6 8 10

No.

of M

BR

Com

puta

tions



Basic

(c) Computational cost

0

0.05

0.1

0.15

0.2

0.25

2 4 6 8 10

Que

ry A

nsw

er E

rror



(d) Estimation error

Fig. 9: Number of objects.

3 4 5 6 7 8 9

10 11 12 13

10 15 20 25 20

Num

ber

of B

ytes

(in

thou

sand

s)

Anonymity Levels



0.5

1

1.5

2

2.5

3

3.5

10 15 20 25 30

Clo

aked

Are

a S

ize

(in th

ousa

nds)

Anonymity Levels



10-1100101102103104105106107108109

10 15 20 25 30

No.

of M

BR

Com

puta

tions

Anonymity Levels


Basic


0

0.05

0.1

0.15

0.2

0.25

0.3

10 15 20 25 30

Que

ry A

nsw

er E

rror

Anonymity Levels



Fig. 10: Anonymity levels.

system, so the estimation error increases as the requiredk-anonymity increases (Figure 10d). The quality-awarealgorithm provides much better quality location moni-toring services than the resource-aware algorithm, whenthe required anonymity level gets stricter.

6.5 Effect of Mobility SpeedsFigure 11 gives the performance of our system withrespect to increasing the maximum object mobility speedfrom [0, 5] and [0, 30]. The results show that increas-ing the object mobility speed only slightly affects thecommunication cost and the cloaked area size of ouralgorithms, as depicted in Figures 11a and 11b, respec-tively. Since the resource-aware cloaked areas are slightlyaffected by the mobility speed, the object mobility speedhas a very small effect on the required search spacecomputed by the quality-aware algorithm. Thus thecomputational cost of the quality-aware algorithm isalso only slightly affected by the object mobility speed(Figure 11c). Although Figure 11d shows that queryanswer error gets worse when the objects are movingfaster, the query accuracy of the quality-aware algorithmis consistently better than the resource-aware algorithm.

7 RELATED WORK

Straightforward approaches for preserving users’ loca-tion privacy include enforcing privacy policies to re-strict the use of collected location information [15], [16]and anonymizing the stored data before any disclo-sure [17]. However, these approaches fail to prevent

internal data thefts or inadvertent disclosure. Recently,location anonymization techniques have been widelyused to anonymize personal location information beforeany server gathers the location information, in orderto preserve personal location privacy in location-basedservices. These techniques are based on one of the threeconcepts. (1) False locations. Instead of reporting themonitored object’s exact location, the object reports ndifferent locations, where only one of them is the object’sactual location while the rest are false locations [18].(2) Spatial cloaking. The spatial cloaking technique blurs auser’s location into a cloaked spatial area that satisfy theuser’s specified privacy requirements [19], [20], [21], [22],[23], [24], [25], [26], [27], [28]. (3) Space transformation.This technique transforms the location information ofqueries and data into another space, where the spatialrelationship among the query and data are encoded [29].

Among these three privacy concepts, only the spatialcloaking technique can be applied to our problem. Themain reasons for this are that (a) the false location tech-niques cannot provide high quality monitoring servicesdue to a large amount of false location information;(b) the space transformation techniques cannot provideprivacy-preserving monitoring services as it reveals themonitored object’s exact location information to thequery issuer; and (c) the spatial cloaking techniquescan provide aggregate location information to the serverand balance a trade-off between privacy protection andthe quality of services by tuning the specified privacyrequirements, for example, k-anonymity and minimumarea privacy requirements [17], [27]. Thus we adopt the

13

0

2

4

6

8

10

5 10 15 20 25 30

Num

ber

of B

ytes

(K)

Maximum Mobility Speed



0

0.5

1

1.5

2

2.5

5 10 15 20 25 30

Clo

aked

Are

a S

ize

(K)




100

102

104

106

108

5 10 15 20 25 30

No.

of M

BR

Com

puta

tions



Basic


0

0.05

0.1

0.15

0.2

0.25

0.3

5 10 15 20 25 30

Que

ry A

nsw

er E

rror




Fig. 11: Object mobility speeds.

spatial cloaking technique to preserve the monitored ob-ject’s location privacy in our location monitoring system.

In terms of system architecture, existing spatial cloak-ing techniques can be categorized into centralized [19],[20], [22], [25], [26], [27], [28], distributed [23], [24], andpeer-to-peer [21] approaches. In general, the centralizedapproach suffers from the mentioned internal attacks,while the distributed approach assumes that mobileusers communicate with each other through base sta-tions is not applicable to the wireless sensor network.Although the peer-to-peer approach can be applied tothe wireless sensor network, the previous work usingthis approach only focuses on hiding a single user loca-tion with no direct applicability to sensor-based locationmonitoring. Also, the previous peer-to-peer approachesdo not consider the quality of cloaked areas and discusshow to provide location monitoring services based onthe gathered aggregate location information.

In the wireless sensor network, Cricket [2] is the onlyprivacy-aware location system that provides a decen-tralized positioning service for its users where eachuser can control whether to reveal her location to thesystem. However, when many users decide not to revealtheir locations, the location monitoring system cannotprovide any useful services. This is in contrast to oursystem that aims to enable the sensor nodes to providethe privacy-preserving aggregate location informationof the monitored objects. The closest work to ours isthe hierarchical location anonymization algorithm [6]that divides the system space into hierarchical levelsbased on the physical units, for example, sub-rooms,rooms and floors. If a unit contains at least k users, thealgorithm cloaks the subject count by rounding the valueto the nearest multiple of k. Otherwise, the algorithmcloaks the location of the physical unit by selecting asuitable space containing at least k users at the higherlevel of the hierarchy. This work is not applicable tosome landscape environments, for example, shoppingmall and stadium, and outdoor environments. Our workdistinguishes itself from this work, as (1) we do notassume any hierarchical structures, so it can be appliedto all kinds of environments, and (2) we consider theproblem of how to utilize the anonymized location datato provide privacy-preserving location monitoring ser-

vices while the usability of anonymized location datawas not discussed in [6].

Other privacy related works include: anonymous com-munication that provides anonymous routing betweenthe sender and the receiver [12], source location privacythat hides the sender’s location and identity [13], aggre-gate data privacy that preserves the privacy of the sensornode’s aggregate readings during transmission [30], datastorage privacy that hides the data storage location [31],and query privacy that avoids disclosing the personalinterests [32]. However, none of these previous works isapplicable to our problem.

8 CONCLUSIONIn this paper, we propose a privacy-preserving loca-tion monitoring system for wireless sensor networks.We design two in-network location anonymization al-gorithms, namely, resource- and quality-aware algorithms,that preserve personal location privacy, while enablingthe system to provide location monitoring services. Bothalgorithms rely on the well established k-anonymity pri-vacy concept that requires a person is indistinguishableamong k persons. In our system, sensor nodes executeour location anonymization algorithms to provide k-anonymous aggregate locations, in which each aggre-gate location is a cloaked area A with the number ofmonitored objects, N , located in A, where N ≥ k,for the system. The resource-aware algorithm aims tominimize communication and computational cost, whilethe quality-aware algorithm aims to minimize the size ofcloaked areas in order to generate more accurate aggre-gate locations. To provide location monitoring servicesbased on the aggregate location information, we proposea spatial histogram approach that analyzes the aggregatelocations reported from the sensor nodes to estimatethe distribution of the monitored objects. The estimateddistribution is used to provide location monitoring ser-vices through answering range queries. We evaluateour system through simulated experiments. The resultsshow that our system provides high quality locationmonitoring services (the accuracy of the resource-awarealgorithm is about 75% and the accuracy of the quality-aware algorithm is about 90%), while preserving themonitored object’s location privacy.

14

REFERENCES

[1] A. Harter, A. Hopper, P. Steggles, A. Ward, and P. Webster, “Theanatomy of a context-aware application,” in Proc. of MobiCom,1999.

[2] N. B. Priyantha, A. Chakraborty, and H. Balakrishnan, “Thecricket location-support system,” in Proc. of MobiCom, 2000.

[3] B. Son, S. Shin, J. Kim, and Y. Her, “Implementation of the real-time people counting system using wireless sensor networks,”IJMUE, vol. 2, no. 2, pp. 63–80, 2007.

[4] Onesystems Technologies, “Counting people in buildings.http://www.onesystemstech.com.sg/index.php?option=comcontent&task=view%&id=10.”

[5] Traf-Sys Inc., “People counting systems. http://www.trafsys.com/products/people-counters/thermal-sensor.aspx.”

[6] M. Gruteser, G. Schelle, A. Jain, R. Han, and D. Grunwald,“Privacy-aware location sensor networks,” in Proc. of HotOS, 2003.

[7] G. Kaupins and R. Minch, “Legal and ethical implications ofemployee location monitoring,” in Proc. of HICSS, 2005.

[8] “Location Privacy Protection Act of 2001, http://www.techlawjournal.com/cong107/privacy/location/s1164is.asp.”

[9] “Title 47 United States Code Section 222 (h) (2),http://frwebgate.access.gpo.gov/cgi-bin/getdoc.cgi?dbname=browse usc&do%cid=Cite:+47USC222.”

[10] D. Culler and M. S. Deborah Estrin, “Overview of sensor net-works,” IEEE Computer, vol. 37, no. 8, pp. 41–49, 2004.

[11] A. Perrig, R. Szewczyk, V. Wen, D. E. Culler, and J. D. Tygar,“SPINS: Security protocols for sensor netowrks,” in Proc. of Mo-biCom, 2001.

[12] J. Kong and X. Hong, “ANODR: Anonymous on demand routingwith untraceable routes for mobile adhoc networks,” in Proc. ofMobiHoc, 2003.

[13] P. Kamat, Y. Zhang, W. Trappe, and C. Ozturk, “Enhancing source-location privacy in sensor network routing,” in Proc. of ICDCS,2005.

[14] S. Guo, T. He, M. F. Mokbel, J. A. Stankovic, and T. F. Abdelzaher,“On accurate and efficient statistical counting in sensor-basedsurveillance systems,” in Proc. of MASS, 2008.

[15] K. Bohrer, S. Levy, X. Liu, and E. Schonberg, “Individualizedprivacy policy based access control,” in Proc. of ICEC, 2003.

[16] E. Snekkenes, “Concepts for personal location privacy policies,”in Proc. of ACM EC, 2001.

[17] L. Sweeney, “Achieving k-anonymity privacy protection usinggeneralization and suppression,” IJUFKS, vol. 10, no. 5, pp. 571–588, 2002.

[18] H. Kido, Y. Yanagisawa, and T. Satoh, “An anonymous communi-cation technique using dummies for location-based services,” inProc. of ICPS, 2005.

[19] B. Bamba, L. Liu, P. Pesti, and T. Wang, “Supporting anonymouslocation queries in mobile environments with privacygrid,” inProc. of WWW, 2008.

[20] C. Bettini, S. Mascetti, X. S. Wang, and S. Jajodia, “Anonymity inlocation-based services: Towards a general framework,” in Proc.of MDM, 2007.

[21] C.-Y. Chow, M. F. Mokbel, and X. Liu, “A peer-to-peer spatialcloaking algorithm for anonymous location-based services,” inProc. of ACM GIS, 2006.

[22] B. Gedik and L. Liu, “Protecting location privacy with person-alized k-anonymity: Architecture and algorithms,” IEEE TMC,vol. 7, no. 1, pp. 1–18, 2008.

[23] G. Ghinita, P. Kalnis, and S. Skiadopoulos, “PRIVE: Anonymouslocation-based queries in distributed mobile systems,” in Proc. ofWWW, 2007.

[24] G. Ghinita1, P. Kalnis, and S. Skiadopoulos, “MobiHide: A mobilepeer-to-peer system for anonymous location-based queries,” inProc. of SSTD, 2007.

[25] M. Gruteser and D. Grunwald, “Anonymous usage of location-based services through spatial and temporal cloaking,” in Proc. ofMobiSys, 2003.

[26] P. Kalnis, G. Ghinita, K. Mouratidis, and D. Papadias, “Preventinglocation-based identity inference in anonymous spatial queries,”IEEE TKDE, vol. 19, no. 12, pp. 1719–1733, 2007.

[27] M. F. Mokbel, C.-Y. Chow, and W. G. Aref, “The New Casper:Query procesing for location services without compromising pri-vacy,” in Proc. of VLDB, 2006.

[28] T. Xu and Y. Cai, “Exploring historical location data for anonymitypreservation in location-based services,” in Proc. of Infocom, 2008.

[29] G. Ghinita, P. Kalnis, A. Khoshgozaran, C. Shahabi, and K.-L. Tan,“Private queries in location based services: Anonymizers are notnecessary,” in Proc. of SIGMOD, 2008.

[30] W. He, X. Liu, H. Nguyen, K. Nahrstedt, and T. Abdelzaher,“PDA: Privacy-preserving data aggregation in wireless sensornetworks,” in Proc. of Infocom, 2007.

[31] M. Shao, S. Zhu, W. Zhang, and G. Cao, “pDCS: Security andprivacy support for data-centric sensor networks,” in Proc. ofInfocom, 2007.

[32] B. Carbunar, Y. Yu, W. Shi, M. Pearce, and V. Vasudevan, “Queryprivacy in wireless sensor networks,” in Proc. of SECON, 2007.

Chi-Yin Chow (M.Sc., 2008, University of Min-nesota; M.Phil., 2005, B.A., 2002, The HongKong Polytechnic University) is a Ph.D. candi-date in the Department of Computer Scienceand Engineering, University of Minnesota. Hismain research interests are in spatial and spatio-temporal databases, mobile data management,wireless sensor networks, and data privacy. Hereceived the best paper award of MDM 2009.He was an intern at the IBM Thomas J. WatsonResearch Center during the summer of 2008. He

is a student member of ACM and IEEE.

Mohamed F. Mokbel (Ph.D., Purdue University,2005, MS, B.Sc., Alexandria University, 1999,1996) is an assistant professor in the Depart-ment of Computer Science and Engineering,University of Minnesota. His main research inter-ests focus on advancing the state of the art in thedesign and implementation of database enginesto cope with the requirements of emerging ap-plications (e.g., location-aware applications andsensor networks). Mohamed was the co-chairof the first and second workshops on privacy-

aware location-based mobile services, PALMS, 2007 (Mannheim, Ger-many) and 2008 (Beijing, China). He is also the PC co-chair for theACM SIGSPATIAL GIS 2008 and 2009 conferences. Mokbel has spentthe summers of 2006 and 2008 as a visiting researcher at Hong KongPolytechnic University and Microsoft Research, respectively. He is amember of ACM and IEEE.

Dr. Tian He is currently an assistant professorin the Department of Computer Science andEngineering at the University of Minnesota-TwinCities. He received the Ph.D. degree under Pro-fessor John A. Stankovic from the University ofVirginia, Virginia in 2004. Dr. He is the authorand co-author of over 90 papers in premiersensor network journals and conferences withover 4000 citations. His publications have beenselected as graduate-level course materials byover 50 universities in the United States and

other countries. Dr. He has received a number of research awards inthe area of sensor networking, including four best paper awards (MSN2006 and SASN 2006, MASS 2008, MDM 2009). Dr. He is also therecipient of the NSF CAREER Award 2009 and McKnight Land-GrantProfessorship 2009-2011. Dr. He served a few program chair positionsin international conferences and on many program committees, andalso currently serves as an editorial board member for five internationaljournals including ACM Transactions on Sensor Networks. His researchincludes wireless sensor networks, intelligent transportation systems,real-time embedded systems and distributed systems, supported byNSF and other agencies. Dr. He is a member of ACM and IEEE.

A privacy preserving-location_monitoring_system_for_wireless_sensor_networks

Documents

Transcript of A privacy preserving-location_monitoring_system_for_wireless_sensor_networks