The Design of an Acquisitional Query Processor for Sensor Networks
CS851 Presentation 2005
Presented by: Gang Zhou
University of Virginia
Outline
  Application Structure & Design Goals
  Acquisitional Query Language
  Power-Aware Optimization
  Power Sensitive Dissemination and Routing
  Processing Queries
  Conclusions and Future Work
  Discussion
Application Structure
  Queries are submitted at the PC
  Queries are parsed and optimized at the PC
  Queries are disseminated and processed in the network
  Results flow back through the routing tree
Design Goals
  Provide a query processor-like interface to sensor networks
  Use acquisitional techniques to reduce power consumption compared to traditional passive systems
How?
  What is meant by acquisitional techniques?
    Where, when, and how often data is acquired and delivered to query processing operators
  Four related questions
    When should samples be taken?
    What sensors have relevant data?
    In what order should samples be taken?
    Is it worth processing and relaying samples?
What’s the big deal?
  Radio is expensive
  Sensing takes significant energy
  Four energy levels:
    Snoozing
    Processing
    Processing and receiving
    Transmitting
An Acquisitional Query Language
  SQL-like queries in the form of SELECT-FROM-WHERE
    SELECT nodeid, light, temp
    FROM sensors
    SAMPLE INTERVAL 1s FOR 10s
  Sensors viewed as a single table
    Columns are sensor data
    Rows are individual sensors
    Unbounded, continuous data stream of values
Why Windows?
  The sensors table is an unbounded, continuous data stream
  Operations such as sort and symmetric join are not allowed on streams
  They are allowed on bounded subsets of the stream (windows)
Windows
  Windows in TinyDB are fixed-size materialization points over sensor streams.
  Materialization points can be used in queries
  Example
    CREATE
    STORAGE POINT recentlight SIZE 8
    AS (SELECT nodeid, light FROM sensors
    SAMPLE INTERVAL 10s)

    SELECT COUNT(*)
    FROM sensors AS s, recentlight AS r1
    WHERE r1.nodeid = s.nodeid
    AND s.light < r1.light
    SAMPLE INTERVAL 10s
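
The storage point above behaves like a fixed-size buffer that keeps only the most recent rows. A minimal Python sketch of that idea follows. The class `StoragePoint` and the function `count_brighter` are hypothetical names for illustration and are not part of TinyDB; the readings are made-up values.

```python
from collections import deque


class StoragePoint:
    """Fixed-size buffer of the most recent rows (hypothetical helper, not TinyDB's API)."""

    def __init__(self, size):
        # A deque with maxlen drops the oldest row once the buffer is full.
        self.rows = deque(maxlen=size)

    def insert(self, row):
        self.rows.append(row)


def count_brighter(storage_point, current):
    """Count buffered rows from the same node whose light exceeds the current reading."""
    return sum(
        1
        for r in storage_point.rows
        if r["nodeid"] == current["nodeid"] and current["light"] < r["light"]
    )


# Made-up light readings for node 1; only the last 8 stay in the window.
recentlight = StoragePoint(size=8)
for light in [10, 50, 30, 70, 20, 90, 40, 60, 80, 15]:
    recentlight.insert({"nodeid": 1, "light": light})

brighter = count_brighter(recentlight, {"nodeid": 1, "light": 45})
```

Bounding the buffer is what makes the join well defined: the current reading is compared against at most 8 stored rows rather than an unbounded stream.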
Temporal Aggregation
  Why aggregation?
    Reduce the quantity of data that must be transmitted through the network
  Example
    SELECT WINAVG(volume, 30s, 5s)
    FROM sensors
    SAMPLE INTERVAL 1s
  Report the average volume over the last 30 seconds once every 5 seconds, sampling once per second
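
The WINAVG semantics above (window of 30 samples, one report every 5 samples, one sample per second) can be sketched in Python. The function name `winavg` and the choice to report over a partial window before 30 samples have arrived are assumptions made for illustration; the slide does not say how the first reports are handled.

```python
from collections import deque


def winavg(samples, window, slide):
    """Yield the mean of the last `window` samples after every `slide` new samples.

    Assumes one sample per time unit. Until `window` samples have arrived,
    the mean is taken over the samples seen so far (an assumption).
    """
    buf = deque(maxlen=window)
    for i, value in enumerate(samples, start=1):
        buf.append(value)
        if i % slide == 0:
            yield sum(buf) / len(buf)


# 40 seconds of made-up volume readings, one per second.
volume = list(range(1, 41))
reports = list(winavg(volume, window=30, slide=5))
```

Forty samples produce eight reports, so the node sends one value every 5 seconds instead of five raw readings.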
  How about spatial aggregation or spatial-temporal aggregation?
Event-Based Queries
  An alternative to continuous polling for data
  Example
    ON EVENT bird-detector(loc):
    SELECT AVG(light), AVG(temp), event.loc
    FROM sensors AS s
    WHERE dist(s.loc, event.loc) < 10m
    SAMPLE INTERVAL 2s FOR 30s
  Currently, events are only signaled on the local node.
  How about a fully distributed event propagation system?
    What is the gain?
    What is the cost?
    Is it worth doing?
Lifetime-Based Queries
  Example
    SELECT nodeid, accel
    FROM sensors
    LIFETIME 30 days
  The query specifies that the network should
    Run for at least 30 days
    Sample the light and acceleration sensors as quickly as possible while still meeting the lifetime goal
Lifetime-Based Queries
  Nodes perform a cost-based analysis to determine the data rate for each node
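
One way to picture this cost-based analysis is as an energy budget: divide the remaining energy by the requested lifetime, subtract the baseline cost, and spend the rest on samples. The sketch below assumes this simple model. The function `max_sample_rate` and every number in it are hypothetical and are not taken from the presentation or from TinyDB.

```python
def max_sample_rate(remaining_mj, lifetime_hours, idle_mj_per_hour, mj_per_sample):
    """Return the highest samples-per-hour rate that still meets the lifetime goal.

    Assumed model: a fixed idle cost per hour plus a fixed cost per sample
    (sensing and transmitting combined). Returns 0.0 if the idle cost alone
    already exceeds the hourly budget.
    """
    budget_per_hour = remaining_mj / lifetime_hours
    spare = budget_per_hour - idle_mj_per_hour
    return max(spare, 0.0) / mj_per_sample


# Hypothetical mote: 720 J left, 30-day lifetime, 400 mJ/h idle, 2 mJ per sample.
rate = max_sample_rate(720_000.0, 30 * 24, 400.0, 2.0)
interval_s = 3600.0 / rate  # seconds between samples
```

With these made-up figures the node could sample once every 12 seconds; a smaller battery or a longer lifetime lengthens the interval.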
Lifetime-Based Queries
  Tested a mote with a 24-week query
  Sample rate was 15.2 seconds per sample
  Took 9 voltage readings over 12 days
  Is it reasonable to drop the first two data points?
  Is it reasonable to use data from the first 12 days to fit a line that covers 168 days?
Power-Aware Optimization
  Where?
    Queries are optimized by the base station before dissemination
  Why?
    Cost-based optimization to yield the lowest overall power consumption
    Cost is dominated by sampling and transmitting
  How?
    The optimizer focuses on ordering joins, selections, and sampling on individual nodes
Reordering Sampling and Predicates
  Consider the query
    SELECT accel, mag
    FROM sensors
    WHERE accel > c1
    AND mag > c2
    SAMPLE INTERVAL 1s
  Three options
    1. Measure accel and mag; then process the selection
    2. Measure mag; filter; then measure accel
    3. Measure accel; filter; then measure mag
  The first option is always more expensive than the other two.
  The second option can be an order of magnitude more expensive than the third.
  The second option can still be cheaper if the mag predicate is highly selective.
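
The comparison of the three options can be made concrete with an expected-cost calculation: the second sensor is only sampled for the fraction of readings that pass the first predicate. The sketch below assumes that model. The function names, the unit costs (mag ten times the cost of accel), and the pass rates are hypothetical values chosen for illustration.

```python
def plan_costs(c_accel, c_mag, pass_accel, pass_mag):
    """Expected sampling cost per epoch for each of the three orderings.

    pass_accel / pass_mag are the fractions of readings that satisfy
    accel > c1 and mag > c2 respectively (assumed independent).
    """
    return {
        "sample both, then filter": c_accel + c_mag,
        "mag first": c_mag + pass_mag * c_accel,
        "accel first": c_accel + pass_accel * c_mag,
    }


def cheapest(costs):
    """Return the name of the plan with the lowest expected cost."""
    return min(costs, key=costs.get)


# Hypothetical unit costs: sampling mag costs 10x sampling accel.
typical = plan_costs(c_accel=1.0, c_mag=10.0, pass_accel=0.5, pass_mag=0.5)
# mag > c2 almost never holds, accel > c1 almost always holds.
selective_mag = plan_costs(c_accel=1.0, c_mag=10.0, pass_accel=0.95, pass_mag=0.01)
```

Under these assumed numbers, sampling accel first wins in the typical case, while sampling mag first wins only when the mag predicate rejects almost everything.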
Example 2
  Another example
    SELECT WINMAX(light, 8s, 8s)
    FROM sensors
    WHERE mag > x
    SAMPLE INTERVAL 1s
  Unless mag > x is very selective, it is cheaper to first check whether the current light reading is greater than the current max
  This reordering is called exemplary aggregate pushdown
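
The reordering above can be sketched as follows: within one window, the expensive mag sample is skipped whenever the light reading could not raise the running maximum. The function `winmax_pushdown` and the readings are hypothetical; pre-recorded (light, mag) pairs stand in for real sensor calls, and the counter only records how many mag samples would have been taken.

```python
def winmax_pushdown(readings, x):
    """Compute max(light) over readings with mag > x, sampling mag lazily.

    readings: list of (light, mag) pairs for one window.
    Returns (max_light, mag_samples_taken).
    """
    best = None
    mag_samples = 0
    for light, mag in readings:
        if best is not None and light <= best:
            continue  # cannot change the max, so skip the expensive mag sample
        mag_samples += 1  # mag would be sampled here
        if mag > x:
            best = light
    return best, mag_samples


# One 8-sample window of made-up (light, mag) readings.
window = [(5, 9), (3, 9), (8, 1), (7, 9), (6, 9), (9, 9), (2, 9), (4, 9)]
result = winmax_pushdown(window, x=5)
naive_max = max(light for light, mag in window if mag > 5)
```

In this window the pushdown version takes 4 mag samples instead of 8 and returns the same maximum as the straightforward evaluation.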
Event Query Batching
  Consider the query
    ON EVENT e(nodeid)
    SELECT a1
    FROM sensors AS s
    WHERE s.nodeid = e.nodeid
    SAMPLE INTERVAL d FOR k
  Every time e occurs, an instance of the internal query is started.
  Multiple independent instances may run at the same time, each sampling and delivering data independently
  Solution:
    Convert event e into an event stream
    Rewrite the internal query as a sliding window join between the event stream and sensors
  Original query
    ON EVENT e(nodeid)
    SELECT a1
    FROM sensors AS s
    WHERE s.nodeid = e.nodeid
    SAMPLE INTERVAL d FOR k
  Rewritten query
    SELECT s.a1
    FROM sensors AS s, events AS e
    WHERE s.nodeid = e.nodeid
    AND e.type = e
    AND s.time - e.time <= k AND s.time > e.time
    SAMPLE INTERVAL d
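
The rewritten query joins a single shared sample stream with the buffered events, so overlapping events no longer start separate sampling loops. A minimal Python sketch of that join follows. The function `window_join` and the tuple layouts are assumptions for illustration, and the filter on e.type is omitted because the sketch uses only one event type.

```python
def window_join(samples, events, k):
    """Join samples with events from the same node that occurred at most k earlier.

    samples: list of (time, nodeid, a1), taken once every d time units.
    events:  list of (time, nodeid).
    Returns one (sample_time, event_time, a1) row per matching pair.
    """
    rows = []
    for s_time, s_node, a1 in samples:
        for e_time, e_node in events:
            if s_node == e_node and 0 < s_time - e_time <= k:
                rows.append((s_time, e_time, a1))
    return rows


# Made-up data: node 1 samples every time unit (d = 1); events fire at t = 2 and t = 4.
samples = [(t, 1, t * 10) for t in range(1, 9)]
events = [(2, 1), (4, 1)]
rows = window_join(samples, events, k=3)
sample_times = sorted({s_time for s_time, _, _ in rows})
```

The two overlapping events yield six joined rows but draw on only five distinct samples, which is the saving batching is after: the sample at t = 5 is taken once and shared.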
Semantic Routing Trees
  Why SRT?
    It is a routing tree designed to allow each node to efficiently determine whether any of the nodes below it will need to participate in a given query over some constant attributes.
    Used to prune the routing tree.
  What is an SRT?
    An SRT is an index over a constant attribute A that can be used to locate nodes that have data relevant to the query.
    It is an overlay on the network.
How to use an SRT?
  When a query q with a predicate over A arrives at node n, n checks whether any child’s range of A values overlaps the query range of A in q:
    If yes, prepare to receive results and forward the query
    If no, do not forward q
  Does query q apply locally?
    If yes, execute the query
    If not, ignore it
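
The forwarding decision above reduces to an interval-overlap test against the range each child has reported. A small Python sketch follows. The dictionary layout for a node and the function names `overlaps` and `handle_query` are hypothetical, and the query is assumed to be a closed range [q_lo, q_hi] over A.

```python
def overlaps(lo1, hi1, lo2, hi2):
    """True if the closed intervals [lo1, hi1] and [lo2, hi2] intersect."""
    return lo1 <= hi2 and lo2 <= hi1


def handle_query(node, q_lo, q_hi):
    """Decide what node n does with a range query on attribute A.

    node: {"value": own value of A, "child_ranges": {child: (lo, hi)}}
    Returns (run_locally, children_to_forward_to).
    """
    forward = [
        child
        for child, (lo, hi) in node["child_ranges"].items()
        if overlaps(lo, hi, q_lo, q_hi)
    ]
    run_locally = q_lo <= node["value"] <= q_hi
    return run_locally, forward


# Made-up node with three children and the ranges they reported.
n = {"value": 12, "child_ranges": {"a": (1, 5), "b": (8, 20), "c": (30, 40)}}
run_locally, forward = handle_query(n, q_lo=10, q_hi=32)
```

Here the subtree under child a is pruned, since its range (1, 5) cannot contain any value between 10 and 32.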
How to build an SRT?
  Flood the SRT build request down the network
    Re-transmitted by every mote until every mote hears it
  If a node has no children
    Choose a parent p; report the value of A to p
  If a node has children
    Forward the request, and wait for replies
    Upon replies from the children, choose a parent p; report to p the range of values of A that it and its descendants cover
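
The reply phase of the build can be sketched as a bottom-up pass: a leaf reports its own value, and an interior node reports the smallest range covering itself and its descendants. The sketch below assumes parent selection has already happened, so the tree is given as input. The function `build_ranges` and the example tree are hypothetical.

```python
def build_ranges(children, values, root):
    """Return {node: (lo, hi)}, the range of A covered by each node's subtree.

    children: {node: [child, ...]} for nodes that have children.
    values:   {node: value of the constant attribute A}.
    """
    ranges = {}

    def visit(node):
        lo = hi = values[node]  # a leaf reports just its own value
        for child in children.get(node, []):
            c_lo, c_hi = visit(child)  # wait for the child's reply
            lo, hi = min(lo, c_lo), max(hi, c_hi)
        ranges[node] = (lo, hi)  # this is what the node reports to its parent
        return lo, hi

    visit(root)
    return ranges


# Made-up five-node tree.
children = {"root": ["a", "b"], "a": ["c", "d"]}
values = {"root": 10, "a": 4, "b": 15, "c": 1, "d": 7}
ranges = build_ranges(children, values, "root")
```

Each parent only stores one (lo, hi) pair per child, which keeps the per-node state small for a single indexed attribute.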
  Since each constant attribute A may have a separate SRT, is the scheme scalable?
Evaluation of SRT
  SRTs are limited to constant attributes
    Even so, maintenance is required
  Possible to use for non-constant attributes, but the cost can be prohibitive
Evaluation of SRT
  Compared three different strategies for building the tree: random, closest, and cluster
    Random: pick a random parent from the nodes with reliable communication
    Closest: pick the parent whose attribute value (index attribute) is closest
    Cluster: by snooping on siblings’ parent selections, each node tries to pick the parent that minimizes the spread of attribute values underneath all of its available parents
  Reported results for two different sensor value distributions: random and geographic
    Random: each attribute value is randomly selected from the interval [0,1000]
    Geographic: values among neighbors are highly correlated
SRT Results
  The cluster scheme is superior to the random scheme and the closest scheme.
  With the geographic distribution, the performance of the cluster scheme is close to optimal.
  Where is the data on the SRT’s overhead?
Processing Queries
  Queries have been optimized and distributed; what more can we do?
  Aggregate data that is sent back to the root
  Prioritize data that needs to be sent
    Naïve: FIFO
    Winavg: average the two results at the queue’s head to make room for the new data
    Delta: send the result with the largest change
  Adapt data rates and power consumption
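
The three prioritization schemes differ mainly in what happens when the outgoing queue is full. The sketch below is one possible reading, not TinyDB's implementation. Dropping the new result under the naive scheme and evicting the least-changed result under delta are assumptions the slide does not spell out, change is measured against a hypothetical `last_sent` value, and all function names and readings are made up.

```python
def enqueue(queue, value, capacity, policy, last_sent=0.0):
    """Add a result to a bounded queue (capacity >= 2) under the given policy."""
    if len(queue) < capacity:
        queue.append(value)
        return
    if policy == "naive":
        return  # queue full: the new result is dropped (assumption)
    if policy == "winavg":
        # Merge the two results at the head into their average to free a slot.
        head = (queue.pop(0) + queue.pop(0)) / 2
        queue.insert(0, head)
        queue.append(value)
    elif policy == "delta":
        # Keep the results that differ most from the last sent value (assumption).
        queue.append(value)
        queue.remove(min(queue, key=lambda v: abs(v - last_sent)))


def dequeue(queue, policy, last_sent=0.0):
    """Pick the next result to send: largest change for delta, FIFO otherwise."""
    if policy == "delta":
        best = max(queue, key=lambda v: abs(v - last_sent))
        queue.remove(best)
        return best
    return queue.pop(0)


def fill(policy, values, capacity, last_sent=0.0):
    """Feed all values through enqueue and return the resulting queue."""
    queue = []
    for v in values:
        enqueue(queue, v, capacity, policy, last_sent)
    return queue


readings = [10, 12, 11, 30]  # made-up results; the spike arrives when the queue is full
naive_q = fill("naive", readings, 3)
winavg_q = fill("winavg", readings, 3)
delta_q = fill("delta", readings, 3, last_sent=10)
```

With these readings the naive queue loses the spike of 30, winavg keeps it behind an averaged head, and delta both keeps it and sends it first.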
Prioritization Comparison
  Sample rate was K times faster than delivery rate.
  Readings generated by shaking the sensor
  In this example, K = 4
  Delta seems to be better
Adaptation
  Not safe to assume that the network channel is uncontested
  TinyDB reduces packets sent as channel contention rises
Conclusions & Future Work
  Conclusions:
    Design of an acquisitional query processor for data collection in sensor networks
    Evaluation in the context of TinyDB
  Future Work:
    Selectivity of operators based upon range of sensor
    Exemplary aggregate pushdown
    More sophisticated prioritization schemes
    Better re-optimization of sample rate based upon acquired data
Discussion
  Is this the best way (right way?) to look at a sensor network?
  Is their approximation of battery lifetime sufficient?
  Was their evaluation of SRT good enough?