DIKNN: An Itinerary-based KNN Query Processing Algorithm for Mobile Sensor Networks
Shan-Hung Wu Kun-Ta Chuang Chung-Min Chen Ming-Syan Chen
Outline
• Introduction• Related Work• Design of DIKNN• KNN Boundary Estimation• Performance Evaluation• Conclusion
Introduction
• The problem of efficient KNN search has been a major research topic– traditional KNN query processing techniques assume location data are available in a centralized database and
focus on improving the index performance.– “in-network” KNN query processing techniques for sensor
networks rely on certain in-network infrastructure (index or data structure)distributed among the sensor nodes.
• Drawbacks of the in-network approaches in large-scale mobile sensor network: distributed indexing structrues, supernodes, fixed network
Introduction
• A Density-aware Itinerary KNN query processing (DIKNN) for mobile sensor networks – key idea: let sensor nodes collect partial results and
propagate the query along a well-devised ,conceptual itinerary structure
– the first KNN processing technique dose not rely on any in-network indexing structure support: no constant
maintenance or fixed data aggregation point
• Several challenging issues arising in the design of DIKNN: estimate of search radius and design of efficient itinerary
Related Work
• The centralized approach performs the queries in a centralized database containing locations of all the sensor nodes.
• The in-network approach propagates the query directly among the sensor nodes in the
network and collects relevant data to form the final result.
Related Work• The Peer-tree and DSI decentralize the index structures
(R-tree) to distributed environments. A network is partitioned into a hierarchy of Minimum Bounding Rectangles (MBRs).
• For KNN, index nodes become system bottlenecks easily and there are many unnecessary hops .
Related Work
• KPT is proposed to handle the KNN query without fixed indexing.
The KNN Perimeter Tree (KPT) builds upon GPSR for processing KNN queries.Two serious drawbacks: considerable overhead and conservative boundaery
Design of DIKNN• Definitions and Network Model
– Definition 1 (k nearest neighbor problem) Given a set of sensor nodes S, a geographical location q (i.e., query point), and valid time T, find a subset S’ of S with k nodes (i.e., S’⊆ S, |S’| = k) such that at time T,∀n1 ∈ S’,n2 ∈S−S’:DIST(n1, q)≤DIST(n2, q), where DIST denotes the Euclidean distance function.
– Network is under ad-hoc mode; all sensor nodes can store data locally and answer the queries individually; moving speed and directions are arbitrary; each sensor node is aware of its geo-location and maintains a table of IDs and locations of neighbor nodes falling within its radio range.
Design of DIKNN
• Execution Phases– Routing phase: Q is geographically routed
from s to the nearest neighbor around q– KNN boundary estimation phase:the home
node estimates KNN boundary with radius R using KNNB algorithm
– Query dissemination phase: the home node disseminates the query message to all sensor nodes inside the KNN boundary
KNN Boundary Estimation
• Routing Phase– Q is routed from s to the nearest neighbor
np around q utilizing the geographic face routing protocol (e.g.GPSR). An additional list L about information of the sensor network is sent along with Q.
– On the ith hop to the destination ,the corresponding node appends its own location loci and the number of newly encountered neighbors enci to L.
KNN Boundary Estimation
• Linear KNNB Algorithm– On receiving the query and list L,Nq
estimats the KNN boundary by determining its radius length R. Determination of R must balance two conflicting factors.
– A weaker assumption adopted by KNNB: sensor node are uniformly distributed only within the optimal KNN boundary (the boundary containing exactly k nearest neighbors).
KNN Boundary Estimation
2i p
._
Area covered from n to n
p
jj i
i
L encest kD
loc q
: the radius of the optimal KNN boundary ni : the corresponding node of the ith hop in the routing path D :density of nodes (nodes/m2) within the optimal KNN boundary
KNN Boundary Estimation
Design of DIKNN
• Itinerary-Based Query Dissemination– On the KNN boundary is determined, the home
node enters the query dissemination phase.
Q-nodes are chosen for query disseminationD-nodes: neighbor nodes that are qualified to reply the query
Design of DIKNN
• Primitives of itinerary-based solution– The itinerary width w specifies how close
between segments of an itinerary. w= r/2– Data collection from multiple D-nodes
needs to be better scheduled to avoid collisions and delays.
timer=– Discussions on the other issues (i.e.,
itinerary void)
3
2im
Design of DIKNN
• Concurrent itinerary structures
KNN boundary is partitioned into multiple sectors.In each sector ,the query is propagated along a sub-itinerary.
The distance between sub-itineraries in adjacent sectors is w to ensure full coverageof the KNN boundary when w ≤ r/2.3
Design of DIKNN
• Each sub-itinerary consists of three segments: the init-,adj- and peri-segments.
( ) /
1
2initR l w
peri i
iwl
S
/adj initl R l w w
min{ /(2sin( / )), }initl w S R
KNN Boundary Estimation
• Interaction with Environments– Spatial Irregularity :when k is large,
the sensor nodes tend to irregularity and their density becomes unpredictable.
Inverse the direction of peri-segments inevery interseptal sector.In such a configuration, the face-to-face adj-segments of different sub-itineraries together form rendezvous.
KNN Boundary Estimation
• Interaction with Environments– Mobile Concern: Mobility of sensor nodes
degrades query accuracy because nodes may move in or move out the KNN boundary during dissemination.
– DIKNN address this issue in the query dissemination phase, where the last Q-node is obligated to determine how much farther a sub-itinerary should continue.
g :assurance gain,0≤g≤1 μ:the fastest moving speedR’= R+g(te − ts)μ, ts and te denote timestamps for the start and end of the query dissemination.
Performance Evaluation
• Settings and Performance Metrics– Query Latency: The elapsed time (in
second) between the time a query is issued by the sink and the time the query responses are returned.
– Energy Consumption: Amount of energy (in Joule) consumed in a simulation run.
– Query Accuracy: the pre-accuracy and post-accuracy are measured separately in experiments.
Performance Evaluation
• Scalability
Performance Evaluation
• Impact of Network Dynamics
Conclusion
• DIKNN integrates query propagation with data collection along a well-designed itinerary traversal.
• A simple and effective KNNB algorithm has been proposed to estimate the KNN boundary under the trade-off between query accuracy and energy efficiency.
• Dynamic adjustment of the KNN boundary has also been addressed to cope with spatial irregularity and mobility of sensor nodes.
Finish
• Thank you!
Top Related