Implementation and Research Issues in Query Processing for Wireless Sensor Networks. Wei Hong, Intel Research, Berkeley (whong@intel-research.net); Sam Madden, MIT. ICDE 2004.

1

Implementation and Research Issues in Query Processing for Wireless

Sensor Networks

Wei Hong, Intel Research, Berkeley

whong@intel-research.net

Sam Madden, MIT

ICDE 2004

2

Motivation

• Sensor networks (aka sensor webs, emnets) are here
  – Several widely deployed HW/SW platforms
    • Low-power radio, small processor, RAM/Flash
  – Variety of (novel) applications: scientific, industrial, commercial
  – Great platform for mobile + ubicomp experimentation
• Real, hard research problems to be solved
  – Networking, systems, languages, databases
• We will summarize:
  – The state of the art
  – Our experiences building TinyDB
  – Current and future research directions

Berkeley Mote

3

Sensor Network Apps

Traditional monitoring apparatus.

Earthquake monitoring in shake-test sites.

Vehicle detection: sensors along a road, collect data about passing vehicles.

Habitat Monitoring: Storm petrels on Great Duck Island, microclimates on James Reserve.

4

Declarative Queries

• Programming apps is hard
  – Limited power budget
  – Lossy, low-bandwidth communication
  – Require long-lived, zero-admin deployments
  – Distributed algorithms
  – Limited tools, debugging interfaces
• Queries abstract away much of the complexity
  – Burden on the database developers
  – Users get:
    • Safe, optimizable programs
    • Freedom to think about apps instead of details

5

Motes

4 MHz, 8-bit Atmel RISC microprocessor

40 kbit/s radio

4 KB RAM, 128 KB program flash, 512 KB data flash

AA battery pack

Based on TinyOS

Mica Mote

Mica2Dot

6

History of Motes

• Initial research goal wasn’t hardware
  – Has since become more of a priority with emerging hardware needs, e.g.:
    • Power consumption
    • (Ultrasonic) ranging + localization
      – MIT Cricket, NEST Project
    • Connectivity with diverse sensors
      – UCLA sensor board
  – Even so, now on the 5th generation of devices
    • Costs down to ~$50/node (Moteiv, Dust)
    • Greatly improved radio quality
    • Multitude of interfaces: USB, Ethernet, CF, etc.
    • Variety of form factors, packages

7

Motes vs. Traditional Computing

• Lossy, ad hoc radio communication
• Sensing hardware
• Severe power constraints

8

Types of Sensors

• Sensors attach via daughtercard
• Weather
  – Temperature
  – Light x 2 (high intensity PAR, low intensity, full spectrum)
  – Air pressure
  – Humidity
• Vibration
  – 2 or 3 axis accelerometers
• Tracking
  – Microphone (for ranging and acoustic signatures)
  – Magnetometer
• GPS

9

Power Consumption and Lifetime

• Power typically supplied by a small battery
  – 1000-2000 mAh
  – 1 mAh = 1 milliamp of current for 1 hour
    • Typically at optimum voltage, current drain rates
  – Power = Watts (W) = Amps (A) * Volts (V)
  – Energy = Joules (J) = W * time
• Lifetime and power consumption vary by application
  – Processor: 5 mA active, 1 mA idle, 5 uA sleeping
  – Radio: 5 mA listen, 10 mA xmit/receive, ~20 ms / packet
  – Sensors: 1 uA -> 100s of mA, 1 us -> 1 s / sample
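
The figures above suggest a back-of-the-envelope lifetime estimate. The sketch below is illustrative only: the function name, the 2% duty cycle, and the choice to count only processor and radio-listen current are assumptions, not measurements from the talk.

```python
def lifetime_days(capacity_mah, active_ma, sleep_ma, duty_cycle):
    """Estimate battery lifetime in days from the average current draw.

    duty_cycle is the fraction of time the node is awake (0..1).
    """
    avg_ma = duty_cycle * active_ma + (1.0 - duty_cycle) * sleep_ma
    hours = capacity_mah / avg_ma
    return hours / 24.0

# Assumed scenario: 2000 mAh battery. Awake, the processor (5 mA) and the
# listening radio (5 mA) draw 10 mA in total; asleep, the node draws 5 uA.
ALWAYS_ON = lifetime_days(2000, 10.0, 0.005, 1.0)
DUTY_2PCT = lifetime_days(2000, 10.0, 0.005, 0.02)
print(round(ALWAYS_ON, 1), round(DUTY_2PCT, 1))
```

Under these assumptions an always-on node lasts roughly 8 days, while a 2% duty cycle stretches the same battery past a year, which is why sleeping dominates the design.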

10

Programming Sensornets: TinyOS

• Component-based programming model
• Suite of software components
  – Timers, clocks, clock synchronization
  – Single and multi-hop networking
  – Power management
  – Non-volatile storage management

11

Programming Philosophy

• Component based
  – “Wiring” components together via interfaces, configurations
• Split-phased
  – Nothing blocks, ever.
  – Instead, completion events are signaled.
• Highly concurrent
  – Single thread of “tasks”, posted and scheduled FIFO
  – Events “fired” asynchronously in response to interrupts.
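
A minimal Python sketch of the split-phase idea, not TinyOS code: the `Scheduler` and `Sensor` classes and the `data_ready` callback are hypothetical names, and the hardware interrupt is modeled as a posted task.

```python
from collections import deque

class Scheduler:
    """Single FIFO queue of tasks, each run to completion in turn."""
    def __init__(self):
        self.tasks = deque()

    def post(self, task):
        self.tasks.append(task)

    def run(self):
        while self.tasks:
            self.tasks.popleft()()

class Sensor:
    """Split-phase read: get_data() returns at once, data_ready fires later."""
    def __init__(self, sched, value):
        self.sched = sched
        self.value = value

    def get_data(self, data_ready):
        # Request phase: never block; post a task that signals completion.
        self.sched.post(lambda: data_ready(self.value))

log = []
sched = Scheduler()
sensor = Sensor(sched, 455)
sensor.get_data(lambda v: log.append(("data_ready", v)))
log.append(("request_returned", None))  # the caller continues immediately
sched.run()                             # completion event is delivered here
```

The request returns before the data arrives, so the caller never waits on the sensor; the result shows up only when the scheduler runs the completion task.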

12

NesC

• C-like programming language with component model support
  – Compiles into GCC-compatible C
• 3 types of files:
  – Interfaces
    • Set of function prototypes; no implementations or variables
  – Modules
    • Provide (implement) zero or more interfaces
    • Require zero or more interfaces
    • May define module variables, scoped to functions in module
  – Configurations
    • Wire (connect) modules according to requires/provides relationship

13

TinyOS: Getting Started

• The TinyOS home page:
  – http://webs.cs.berkeley.edu/tinyos
  – Start with the tutorials!
• The CVS repository
  – http://sf.net/projects/tinyos
• The NesC Project Page
  – http://sf.net/projects/nescc
• Crossbow motes (hardware):
  – http://www.xbow.com
• Intel Imote
  – www.intel.com/research/exploratory/motes.htm

14

Part 2

The Design and Implementation of TinyDB

15

Part 2 Outline

• TinyDB Overview
• Data Model and Query Language
• TinyDB Java API and Scripting
• Demo with TinyDB GUI
• TinyDB Internals
• Extending TinyDB
• TinyDB Status and Roadmap

16

TinyDB Revisited

SELECT MAX(mag)
FROM sensors
WHERE mag > thresh
SAMPLE PERIOD 64ms

• High level abstraction:
  – Data centric programming
  – Interact with sensor network as a whole
  – Extensible framework
• Under the hood:
  – Intelligent query processing: query optimization, power efficient execution
  – Fault mitigation: automatically introduce redundancy, avoid problem areas

[Figure: App, TinyDB, and Sensor Network, connected by arrows labeled “Query, Trigger” and “Data”]

17

Feature Overview

• Declarative SQL-like query interface
• Metadata catalog management
• Multiple concurrent queries
• Network monitoring (via queries)
• In-network, distributed query processing
• Extensible framework for attributes, commands and aggregates
• In-network, persistent storage

18

Architecture

[Figure: PC side: TinyDB GUI, TinyDB Client API, DBMS (via JDBC). Mote side: TinyDB query processor on sensor network nodes 0-8.]

19

Data Model

• Entire sensor network as one single, infinitely-long logical table: sensors
• Columns consist of all the attributes defined in the network
• Typical attributes:
  – Sensor readings
  – Meta-data: node id, location, etc.
  – Internal states: routing tree parent, timestamp, queue length, etc.
• Nodes return NULL for unknown attributes
• On server, all attributes are defined in catalog.xml
• Discussion: other alternative data models?
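
A small Python sketch of the single-table model, assuming a hypothetical four-attribute catalog: each node's reading is projected onto every column, and attributes the node does not know come back as NULL (`None` here).

```python
# Hypothetical catalog; in TinyDB the real attribute list lives in catalog.xml.
CATALOG = ["nodeid", "light", "temp", "mag"]

def to_row(reading):
    """Project one node's reading onto every catalog column; unknown -> None."""
    return tuple(reading.get(attr) for attr in CATALOG)

# A node with only a light sensor still yields a full-width row.
row = to_row({"nodeid": 2, "light": 389})
```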

20

Query Language (TinySQL)

SELECT <aggregates>, <attributes>
[FROM {sensors | <buffer>}]
[WHERE <predicates>]
[GROUP BY <exprs>]
[SAMPLE PERIOD <const> | ONCE]
[INTO <buffer>]
[TRIGGER ACTION <command>]

21

Comparison with SQL

• Single table in FROM clause
• Only conjunctive comparison predicates in WHERE and HAVING
• No subqueries
• No column alias in SELECT clause
• Arithmetic expressions limited to column op constant
• Only fundamental difference: SAMPLE PERIOD clause

22

TinySQL Examples

“Find the sensors in bright nests.”

SELECT nodeid, nestNo, light
FROM sensors
WHERE light > 400
EPOCH DURATION 1s

Sensors

Epoch  Nodeid  nestNo  Light
0      1       17      455
0      2       25      389
1      1       17      422
1      2       25      405
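
To make the selection concrete, here is a Python sketch that applies the WHERE clause to the four sample rows on this slide. It runs centrally over a list; TinyDB itself evaluates the predicate on each mote, and the `Reading` type and `bright_nests` function are illustrative names.

```python
from collections import namedtuple

Reading = namedtuple("Reading", "epoch nodeid nestNo light")

# The sample rows of the sensors table shown on the slide.
SENSORS = [
    Reading(0, 1, 17, 455),
    Reading(0, 2, 25, 389),
    Reading(1, 1, 17, 422),
    Reading(1, 2, 25, 405),
]

def bright_nests(readings, threshold=400):
    """SELECT nodeid, nestNo, light FROM sensors WHERE light > threshold."""
    return [(r.nodeid, r.nestNo, r.light)
            for r in readings if r.light > threshold]

result = bright_nests(SENSORS)
```

Node 2's epoch-0 reading (389) fails the predicate, so only three of the four rows are reported.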

23

TinySQL Examples (cont.)

“Count the number of occupied nests in each loud region of the island.”

SELECT region, CNT(occupied), AVG(sound)
FROM sensors
GROUP BY region
HAVING AVG(sound) > 200
EPOCH DURATION 10s

Regions w/ AVG(sound) > 200

Epoch  region  CNT(…)  AVG(…)
0      North   3       360
0      South   3       520
1      North   3       370
1      South   3       520

SELECT AVG(sound)
FROM sensors
EPOCH DURATION 10s
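
A Python sketch of the GROUP BY / HAVING semantics for one epoch. The per-node readings are hypothetical, chosen so the North and South groups match the epoch-0 rows above; the East region is invented to show a group that HAVING filters out.

```python
from collections import defaultdict

# Hypothetical per-node readings for one epoch: (region, occupied, sound).
READINGS = [
    ("North", 1, 300), ("North", 1, 360), ("North", 1, 420),
    ("South", 1, 500), ("South", 1, 520), ("South", 1, 540),
    ("East", 1, 100), ("East", 1, 150),
]

def loud_regions(readings, min_avg=200):
    """GROUP BY region, then keep groups whose AVG(sound) > min_avg."""
    groups = defaultdict(list)
    for region, _occupied, sound in readings:
        groups[region].append(sound)
    out = {}
    for region, sounds in groups.items():
        avg = sum(sounds) / len(sounds)
        if avg > min_avg:
            # Row count stands in for CNT(occupied): no reading here is NULL.
            out[region] = (len(sounds), avg)
    return out

result = loud_regions(READINGS)
```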

24

Event-based Queries

• ON event SELECT …
• Run query only when interesting events happen
• Event examples
  – Button pushed
  – Message arrival
  – Bird enters nest
• Analogous to triggers but events are user-defined

25

Query over Stored Data

• Named buffers in Flash memory
• Store query results in buffers
• Query over named buffers
• Analogous to materialized views
• Example:
  – CREATE BUFFER name SIZE x (field1 type1, field2 type2, …)
  – SELECT a1, a2 FROM sensors SAMPLE PERIOD d INTO name
  – SELECT field1, field2, … FROM name SAMPLE PERIOD d

26

Inside TinyDB

[Figure: TinyDB on top of TinyOS, with Schema, Query Processor, and Multihop Network components]

Queries: SELECT AVG(temp) WHERE light > 400
Query Processor operators: Filter (light > 400), Agg (avg(temp))
Results: T:1, AVG: 225; T:2, AVG: 250
Tables, Samples: get(‘temp’) calls getTempFunc(…)

Schema entry:
  Name: temp
  Time to sample: 50 us
  Cost to sample: 90 uJ
  Calibration Table: 3
  Units: Deg. F
  Error: ± 5 Deg F
  Get f: getTempFunc()
  …

TinyDB:
  ~10,000 lines embedded C code
  ~5,000 lines (PC-side) Java
  ~3200 bytes RAM (w/ 768 byte heap)
  ~58 kB compiled code (3x larger than 2nd largest TinyOS program)

27

Extending TinyDB

• Why extend TinyDB?
  – New sensor attributes
  – New control/actuation commands
  – New data processing logic (aggregates)
  – New events
• Analogous to concepts in object-relational databases

28

Adding Attributes

• Types of attributes
  – Sensor attributes: raw or cooked sensor readings
  – Introspective attributes: parent, voltage, RAM usage, etc.
  – Constant attributes: constant values that can be statically or dynamically assigned to a mote, e.g., nodeid, location, etc.

29

TinyDB Status

• Latest release ships with TinyOS 1.1 (9/03)
  – Install the task-tinydb package in the TinyOS 1.1 distribution
  – First release in TinyOS 1.0 (9/02)
  – Widely used by research groups as well as industry pilot projects
• Successful deployments in Intel Berkeley Lab and redwood trees at UC Botanical Garden
  – Largest deployment: ~80 weather station nodes
  – Network longevity: 4-5 months

30

Part 3

Database Research Issues in Sensor Networks

31

Sensor Network Research

• Very active research area
  – Can’t summarize it all
• Focus: database-relevant research topics
  – Some outside of Berkeley
  – Other topics that are itching to be scratched
  – But, some bias towards work that we find compelling

32

Topics

• In-network aggregation
• Acquisitional Query Processing
• Heterogeneity
• Intermittent Connectivity
• In-network Storage
• Statistics-based summarization and sampling
• In-network Joins
• Adaptivity and Sensor Networks
• Multiple Queries

34

Tiny Aggregation (TAG)

• In-network processing of aggregates
  – Common data analysis operation
    • Aka gather operation or reduction in parallel programming
  – Communication reducing
    • Operator dependent benefit
  – Across nodes during same epoch
• Exploit query semantics to improve efficiency!

Madden, Franklin, Hellerstein, Hong. Tiny AGgregation (TAG), OSDI 2002.
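
A Python sketch of in-network aggregation for AVG, in the spirit of TAG: each node merges its children's partial state records (sum, count) with its own reading and forwards a single record to its parent. The routing tree and readings are hypothetical, and the recursion stands in for messages flowing up the tree.

```python
def merge(a, b):
    """Combine two partial state records (sum, count) for AVG."""
    return (a[0] + b[0], a[1] + b[1])

def aggregate(node, children, reading):
    """Return the partial state record for the subtree rooted at node."""
    state = (reading[node], 1)
    for child in children.get(node, []):
        state = merge(state, aggregate(child, children, reading))
    return state

# Hypothetical 5-node routing tree rooted at node 0, with temperature readings.
CHILDREN = {0: [1, 2], 1: [3, 4]}
READING = {0: 20, 1: 22, 2: 24, 3: 26, 4: 28}

total, count = aggregate(0, CHILDREN, READING)
average = total / count
```

Every non-root node transmits exactly one record per epoch, regardless of subtree size. AVG needs (sum, count) rather than a running average because averages of averages are not mergeable.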

35

Acquisitional Query Processing (ACQP)

• TinyDB acquires AND processes data

– Could generate an infinite number of samples

• An acquisitional query processor controls

– when,

– where,

– and with what frequency data is collected!

• Versus traditional systems where data is provided a priori

Madden, Franklin, Hellerstein, and Hong. The Design of an Acquisitional Query Processor for Sensor Networks. SIGMOD, 2003.

36

ACQP: What’s Different?

• How should the query be processed?
  – Sampling as a first class operation
• How does the user control acquisition?
  – Rates or lifetimes
  – Event-based triggers
• Which nodes have relevant data?
  – Index-like data structures
• Which samples should be transmitted?
  – Prioritization, summary, and rate control

37

Operator Ordering: Interleave Sampling + Selection

SELECT light, mag
FROM sensors
WHERE pred1(mag)
AND pred2(light)
EPOCH DURATION 1s

• E(sampling mag) >> E(sampling light): 1500 uJ vs. 90 uJ
• At 1 sample / sec, total power savings could be as much as 3.5 mW. Comparable to processor!

[Figure: three query plans over the mag (costly) and light (cheap) samples.
  Traditional DBMS: sample mag and light together, then apply pred1 and pred2.
  ACQP: sample one attribute, apply its predicate, and sample the other only if the first predicate passes.
  Correct ordering (unless pred1 is very selective and pred2 is not): sample light and apply pred2 first, then sample mag and apply pred1.]
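
The ordering argument can be checked with a short expected-cost calculation. In this Python sketch the per-sample energies come from the slide; the selectivities (fraction of tuples passing each predicate) are assumed values, and `expected_energy` is an illustrative helper.

```python
def expected_energy(first_cost, first_selectivity, second_cost):
    """Expected energy per epoch (uJ) when the second sensor is sampled
    only if the predicate on the first sample passes."""
    return first_cost + first_selectivity * second_cost

MAG_UJ, LIGHT_UJ = 1500.0, 90.0  # per-sample energies from the slide

# Assumed case 1: both predicates pass half the time.
mag_first = expected_energy(MAG_UJ, 0.5, LIGHT_UJ)    # 1500 + 0.5 * 90
light_first = expected_energy(LIGHT_UJ, 0.5, MAG_UJ)  # 90 + 0.5 * 1500

# Assumed case 2: pred1(mag) is very selective, pred2(light) is not.
mag_first_sel = expected_energy(MAG_UJ, 0.01, LIGHT_UJ)
light_first_sel = expected_energy(LIGHT_UJ, 0.99, MAG_UJ)
```

In case 1, sampling light first costs 840 uJ per epoch versus 1545 uJ for mag first. In case 2 the order flips, which is the exception noted on the slide.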

38

Exemplary Aggregate Pushdown

SELECT WINMAX(light,8s,8s)
FROM sensors
WHERE mag > x
EPOCH DURATION 1s

• Novel, general pushdown technique
• Mag sampling is the most expensive operation!

[Figure: two query plans.
  Traditional DBMS: sample mag and light, apply (mag > x), then WINMAX.
  ACQP: sample light, apply (light > MAX), then sample mag, apply (mag > x), then WINMAX.]
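
A Python sketch of the pushdown for a single window: the costly mag sample is taken only when the light reading could raise the current maximum. The `winmax` function, the callable used to model on-demand mag sampling, and the window contents are all hypothetical.

```python
def winmax(samples, x):
    """Max light over tuples with mag > x, skipping needless mag samples.

    samples: list of (light, read_mag) pairs; read_mag is a callable so the
    costly magnetometer is only sampled on demand.
    Returns (max_light or None, number_of_mag_samples_taken).
    """
    best, mag_reads = None, 0
    for light, read_mag in samples:
        if best is not None and light <= best:
            continue  # cannot change the max, so skip the mag sample
        mag_reads += 1
        if read_mag() > x:
            best = light
    return best, mag_reads

# Hypothetical window of (light, mag) readings.
WINDOW = [(300, 12), (250, 15), (410, 5), (500, 11)]
result = winmax([(l, (lambda m=m: m)) for l, m in WINDOW], x=10)
```

With these readings the answer is 500 after three mag samples instead of four; the saving grows with the window length, since most light readings fall below the running maximum.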

39

Occasionally Connected Sensornets

[Figure: TinyDB query processors (QP) connected to the TinyDB Server over the internet through fixed gateways (GTWY) and mobile gateways (Mobile GTWY)]

40

Adaptivity In Sensor Networks

• Queries are long running
• Selectivities change
  – E.g. night vs day
• Network load and available energy vary
• All suggest that some adaptivity is needed
  – Of data rates or granularity of aggregation when optimizing for lifetimes
  – Of operator orderings or placements when selectivities change (cf. conditional plans for correlations)
• As far as we know, this is an open problem!

41

Multiple Queries and Work Sharing

• As sensornets evolve, users will run many queries simultaneously
  – E.g., traffic monitoring
• Likely that queries will be similar
  – But have different end points, parameters, etc.
• Would like to share processing, routing as much as possible
• But how? Again, an open problem.

42

Concluding Remarks

• Sensor networks are an exciting emerging technology, with a wide variety of applications

• Many research challenges in all areas of computer science
  – Database community included
  – Some agreement that a declarative interface is right

• TinyDB and other early work are an important first step

• But there’s lots more to be done!