INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks...

68
INF5100 Autumn 2007 © Ell en Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg

Transcript of INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks...

Page 1: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

1

Data Management in Sensor Networks

Ellen Munthe-Kaas

Jarle Søberg

Page 2: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

2

Outline

• Sensor networks– Characteristics– Motes– Application domains– Data management

• TinyOS

• TinyDB

Page 3: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

3

Sensor Networks

Base station(gateway)

Motes (sensors)

Page 4: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

4

Sensor Network Characteristics

• Autonomous nodes– Small, low-cost, low-power, multifunctional– Sensing, data processing, and communicating

components• Sensor network is composed of large number of

sensor nodes– Proximity to physical phenomena

• Deployed inside the phenomenon or very close to it

• Monitoring and collecting physical data• No human interaction for weeks or months at a

time– Long-term, low-power nature

Page 5: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

5

Motes

Spec smart dust; total size 5 mm2

Mica2 mote with 2 AA batteries(provide power for one year’s use)

Mica2DOT mote. Powered with button battery

Mote: Short for remote.Refers to a wireless transceiver that is also a remote sensor

Page 6: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

6

Mote Hardware

• Made up of four basic components– sensing unit

• usually two subunits: sensor and ADC

– processing unit• makes the sensor collaborate with the other nodes

to carry out the assigned sensing tasks

– transceiver unit• connects the node to the network

– power unit• small, standard batteries

Page 7: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

7

Motes in the DMMS Laboratory

Mica2• Processor: MPR400CB based on Atmel ATmega128L• Radio: 900 MHz multi-channel transceiver• Memory: 4 kB Configuration EEPROM

128 kB Program Flash Memory512 kB Measurement (Serial) Flash

• Power: 2 x AA • OS: TinyOS v1.0• Weight: 18g (excluding batteries)• Sensors:

– Light– Temperature– Acoustic

• Actuators:– Sounder

Page 8: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

8

Motes vs. Traditional Computing

• Embedded OS– Usually an image flashed to the device

• Lossy, ad hoc radio communication– E.g. wrt slow CPU

• Sensing hardware

• Severe power constraints

Page 9: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

9

Application Domains

• Environmental

• Health

• Military

• Commercial

Page 10: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

10

Environmental Applications

• Tracking the movements of birds, animals, insects

• Monitoring environmental conditions that affect crops and livestock

• Chemical/biological detection• Biological, earth, and environmental monitoring

in marine, soil, and atmospheric contexts• Meteorological or geophysical research• Pollution study, precision agriculture, irrigation• Biocomplexity mapping of environment• Flood detection, forest fire detection

Page 11: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

11

Health Applications

• Integrated patient monitoring– Body sensor networks

• Telemonitoring of human physiological data

• Tracking and monitoring doctors and patients inside a hospital

• Tracking and monitoring patients and rescue personnel during rescue operations

Page 12: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

12

Military Applications

• Monitoring friendly forces, equipment and ammunition

• Battlefield surveillance

• Reconnaissance of opposing forces and terrain

• Nuclear, biological and chemical (NBC) attack detection and reconnaissance

Page 13: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

13

Commercial Applications

• Monitoring product quality• Constructing smart office spaces• Interactive toys• Smart structures with sensor nodes embedded inside• Machine diagnostics• Interactive museums• Managing inventory control• Environmental control in office buildings• Detecting and monitoring car thefts• Vehicle tracking and detection• Tracking goods with RFID

Page 14: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

14

Application Examples

Traditional monitoring apparatus.

Earthquake monitoring in shake-test sites.Vehicle detection: sensors

along a road, collect data about passing vehicles.

Habitat Monitoring: Storm petrels on Great Duck Island, microclimates on James Reserve.

Page 15: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

15

Managing Data

• Purpose of sensor network: Obtain real-world data– Extract and combine data from the network

• But: Programming sensor networks is hard!– Months of lifetime required from small batteries– Lossy, low-bandwidth, short range communication– Highly distributed environment– Application development– Application deployment administration

Page 16: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

16

Data Management Systems for Sensor Networks

DB

sensor network

Page 17: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

17

Data Management Systems for Sensor Networks

• Motivation:– Implement data

access• Sensor tasking• Data processing• Possibly support for

data model and query language

• Goals:– Adaptive

• Network conditions• Varying/unplanned

stimuli

– Energy efficient• In-network processing• Flexible tasking• Duty cycling

Page 18: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

18

Data Management System Challenges

• Routing

• Resource allocation

• Deployment

• Query language, query optimization

Page 19: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

19

Outline

• Sensor networks

• TinyOS

• TinyDB

Page 20: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

20

TinyOS

• Operating system for managing and accessing mote HW• Characteristics:

– Energy-efficient– Programming model: Components– Only one application running at a time– No process isolation or scheduling– No kernel– No protection domains– No memory manager– No multithreading

• Programming language: nesC

Page 21: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

21

Outline

• Sensor networks• TinyOS• TinyDB

– Overview– Data model– Query language– Architecture– Network administration– Aggregates (TAG: Tiny Aggregations)

Page 22: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

22

TinyDB

• High level abstraction– Data centric programming– Interact with sensor network as

a whole– Extensible framework

• Under the hood:– Intelligent query processing:

query optimization, power efficient execution

– Fault mitigation: automatically introduce redundancy, avoid problem areas

SELECT nodeid, light FROM sensors WHERE light > 400SAMPLE PERIOD 1s

query, trigger

data

Appli-cation

TinyDB

sensor network

Page 23: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

23

Feature Overview

• Declarative SQL-like query interface

• Metadata catalog management

• Multiple concurrent queries

• Network monitoring (via queries)

• In-network, distributed query processing

• Extensible framework for attributes, commands and aggregates

• In-network, persistent storage

Page 24: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

24

Query Language Essentials

• Declarative queries– Simple, SQL-like queries– Users specify the data they want and the rate at

which data should be refreshed– Using predicates, not specific addresses

• TinyDB collects data from motes in the environment, filters it, aggregates it, and routes it out to a PC

• TinyDB does this with power-efficient in-network processing algorithms

Page 25: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

25

Outline

• Sensor networks• TinyOS• TinyDB

– Overview– Data model– Query language– Architecture– Network administration– Aggregates (TAG: Tiny Aggregations)

Page 26: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

26

Data Model

• Relational model• Single table sensors

– One column (attribute) per type of value that a device can produce (light, temperature,...)

– One row (tuple) per node per instant in time– Physically partitioned across all nodes in the network– Tuples are materialized only at need and stored only

for a short period or delivered directly to the network– Projections and transformations of tuples from

sensors may be stored in materialization points

Page 27: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

27

The sensors Table

sensors(nodeid, temp, light)

sensors(nodeid, light)sensors(nodeid,

volume)

acoustic sensorlight sensor

light and temperature sensor

sensors(epoch, nodeid, ) volume , temp, light , ... SELECT nodeid, light

FROM sensors WHERE light > 400SAMPLE PERIOD 1s

Page 28: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

28

TinyDB Routing Tree

TinyDB GUI

TinyDB Client API PostgreSQLDBMS

TinyDB query processor

0

3

1

4

2

5

6

Mote side

PC side

7

Sensor network

Page 29: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

29

SELECT nodeid, light FROM sensors WHERE light > 400SAMPLE PERIOD 1s

sensors(epoch, nodeid, ) , temp, lightvolume , ...

epoch timeStamp

nodeid

volume

light temp ...

1 ... 2 null 422 null null

epoch timeStamp

nodeid

volume

light temp ...

2 ... 1 null 410 null null

2 ... 2 null 460 null null

sensors(nodeid, light)

nodeid light

1 305

2 422

nodeid light

1 410

2 460

sensors(nodeid, temp, light)

sensors(nodeid, volume)

TinySQL Routing Example

nodeid light

3 null

nodeid

light

2 422

nodeid

light

2 460

1

2

3

Page 30: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

30

Outline

• Sensor networks• TinyOS• TinyDB

– Overview– Data model– Query language– Architecture– Network administration– Aggregates (TAG: Tiny Aggregations)

Page 31: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

31

TinySQL Example 1• Sample interval: Interval during which exactly one tuple of sensors is

produced per node for the purpose of executing the query, and during which the query is executed once. The sampling itself does not take much time.

• Epoch: Period of time between the start of each sample interval. Numbered consecutively

• Data collection period: Period of time over which query is running.

”Report light and temperature readings once per second for 10 seconds.”

SELECT nodeid, light, tempFROM sensorsSAMPLE PERIOD 1 s FOR 10 s

time

sample interval

data collection period

epoch: 1 2 3 4 5 6 7 8 9 10

Page 32: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

32

TinySQL Example 2• Materialization point: Stored table in the nodes.

Cf. materialized views in traditional RDBMSs and windows in data stream management systems (DSMSs, will be explained in a later lecture).

”Store the latest eight light readings, doing one reading every 10 seconds (forever).”

CREATE STORAGE POINT recentLight SIZE 8AS (SELECT nodeid, light FROM sensors SAMPLE PERIOD 10 s)

... later:DROP STORAGE POINT recentLight

Page 33: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

33

TinySQL Example 3• Joins are allowed between two materialization points or between a

materialization point and the sensors table.– New sensors tuples are joined with tuples of the materialization point on their

time of arrival

”Count the number of recent light readings (from zero to eight samples in the past) that were brighter than the current reading, each current reading collectedduring a time span of ten seconds.”

SELECT COUNT(*)FROM sensors AS s, recentLight AS rlWHERE rl.nodeid = s.nodeid AND s.light < rl.lightSAMPLE PERIOD 10 s

new reading

nodeid light1 3452 3471 3503 5402 5331 5002 5143 412

nodeid light ...

... ... ...

... ... ...

... ... ...

2 435 ...

sensors

recentLight

Page 34: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

34

TinySQL Example 4• Aggregation can be performed on grouped values as in ordinary SQL.

Grouping and aggregation take place over the tuples collected during each sample interval.

”Find the rooms on floor 6 where the average volume is over some threshold (assuming each room can have multiple sensors). Do this every 30 seconds.”

SELECT room, AVG(volume)FROM sensorsWHERE floor = 6GROUP BY roomHAVING AVG(volume) > thresholdSAMPLE PERIOD 30 s

... later:STOP QUERY id

Page 35: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

35

TinySQL Example 5• An event can be used for initiating data collection.

– Generated by another query or by a lower-level part of the OS

”When a bird-detect event occurs, report the average light and temperature levels at sensors near the event’s location. Do this every 2 seconds for a period of 30 seconds (then go to sleep again).”

ON EVENT bird-detect(loc):SELECT event.loc, AVG(light), AVG(temp)FROM sensors AS sWHERE dist(s.loc, event.loc) < 10 mSAMPLE PERIOD 2 s for 30 s

Page 36: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

36

TinySQL Example 6• Generating an event from a query:

”Signal the event hot whenever the temperature goes above some threshold. Read the temperature every 10 seconds.”

SELECT nodeid, tempFROM sensorsWHERE temp > thresholdOUTPUT ACTION SIGNAL hot(nodeid, temp)SAMPLE PERIOD 10 s

Page 37: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

37

TinySQL Example 7• To make sure the network runs for a guaranteed period, users may request

a specific query lifetime.

”Get the temperature, but space out the readings to make sure that the network will survive at least 30 days.”

SELECT nodeid, tempFROM sensorsLIFETIME 30 days

Page 38: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

38

TinySQL Example 8• Network health queries are metaqueries over the network itself.

”Report all sensors whose current battery voltage is less than k.”

SELECT nodeid, voltageFROM sensorsWHERE voltage < kSAMPLE PERIOD 10 minutes

Page 39: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

39

TinySQL Example 9• Actuation queries can be used to perform some physical action in

response to a query.

”Turn on the fan if the temperature is rising above a certain level.”

SELECT nodeid, tempFROM sensorsWHERE temp > thresholdOUTPUT ACTION power-on(nodeid)SAMPLE PERIOD 30 s

Page 40: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

40

Outline

• Sensor networks• TinyOS• TinyDB

– Overview– Data model– Query language– Architecture– Network administration– Aggregates (TAG: Tiny Aggregations)

Page 41: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

41

Inside TinyDB

TinyOS

Schema

Query Processor

Multihop Network

Filterlight > 400

get (‘temp’)

Aggavg(tem

p)

QueriesSELECT AVG(temp) WHERE light > 400

Results

T: 1, AVG: 225T: 2, AVG: 250

Tables Samples got(‘temp’)Name: tempTime to sample: 50 uSCost to sample: 90 uJCalibration Table: 3Units: Deg. FError: ± 5 Deg FGet f : getTempFunc()…

getTempFunc()

TinyDBTinyDB

~10,000 Lines Embedded C Code

~5,000 Lines (PC-Side) Java

~3200 Bytes RAM (w/ 768 byte heap)

~58 kB compiled code

(3x larger than 2nd largest TinyOS Program)

Page 42: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

42

Metadata Management

• Each node maintains a metadata catalog containing– local attributes

• name• cost: power, sample time

– events• name• signature• cost: frequency estimate

– user-defined functions and predicates

• Periodically copied to root for use by query optimizer• Registered via static linking at compile time using nesC

Page 43: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

43

Outline

• Sensor networks• TinyOS• TinyDB

– Overview– Data model– Query language– Architecture– Network administration– Aggregates (TAG: Tiny Aggregations)

Page 44: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

44

TinyDB Routing Tree

TinyDB GUI

TinyDB Client API PostgreSQLDBMS

TinyDB query processor

0

3

1

4

2

5

6

Mote side

PC side

7

Sensor network

Page 45: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

45

Routing

• A routing tree is established– Root is a gateway– Spanning tree– No multiple paths– Tree construction and maintenance

• Query fragments are disseminated down the routing tree– Query optimization performed centrally, outside the

sensor network, using metadata obtained from the nodes

– Semantic routing tree to avoid flooding

Page 46: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

46

Routing Tree Creation

1. One mote is appointed the root (usually used as the interface/gateway of the network)

2. The root broadcasts a message <ID, distanceFromRoot> asking motes to organize into a routing tree

3. Any mote without an assigned level that hears this message assign its level as the received+1 and chooses the sender as its parent through which it will route messages to the root

4. Motes re-broadcast the routing message, inserting their own IDs and levels... and so on

Page 47: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

47

Communication Scheduling

• Data processed up the routing tree• Per node per epoch: Sleep; data sampling;

receiving; processing; transmitting– Sleep period defined based on number of children

• Awakening just in time to receive results– Sampling– Receiving– Processing:

• Filtering, partial aggregate– Network transmission

• Adaptation to network contention and power consumption• Expensive (sending 1 bit takes approx 800 instructions)

Page 48: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

48

Communication Scheduling• A mote upon receiving a request to

perform a query:– awakens– synchronizes its clock– chooses the sender of the msg as its parent– forwards the query, setting the delivery

interval for children to be slightly before the time its parent expects to see the partial state tuple

Page 49: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

49

The Epoch

• Long enough to allow all nodes to report

• Epoch length = Tree depth * (Sample interval + receive time + send timeSets a lower bound to the epoch)– Limits the maximum

sample interval of the network

• The sample interval is increased by pipelining the communication schedule

Page 50: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

50

Outline

• Sensor networks• TinyOS• TinyDB

– Overview– Data model– Query language– Architecture– Network administration– Aggregates (TAG: Tiny Aggregations)

Page 51: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

51

Aggregates: Centralized Approach

• Server-based approach: All sensor readings are sent to the base station, which then computes the aggregates

• Example:SELECT COUNT(*)FROM sensors

How many transmissions?

Page 52: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

52

Aggregates: Centralized Approach

11

1

1 11

Page 53: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

53

Aggregates: Distributed Approach

• In TinyDB aggregates are computed in-network whenever possible.– Lower number of transmissions– Lower latency– Lower power consumption

• Example:SELECT COUNT(*)FROM sensors

How may transmissions?

Page 54: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

54

Aggregates: Distributed Approach

11

1

3 2

6

Page 55: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

55

Implementation of agg• Each aggregation agg is implemented via a partial state

tuple and three functions:– the partial state tuple (partial state record in the paper) at each

node holds an intermediate aggregate value– i – initializer: Specifies how to instantiate a partial state tuple for

a single sensor value – f – merging function: Specifies how to compute a combined

intermediate aggregate from two intermediate aggregates– e – evaluator: Takes a partial state tuple and computes the

actual value of the aggregate• Example: avg

– partial state tuple <s,c> where s is sum and c count– i(x) = <x,1> for a sensor value x– f(<s1,c1>, <s2,c2>) = <s1+s2, c1+c2>– e(<s,c>) = s/c

Page 56: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

56

Taxonomy of TAG

• We classify aggregates according to four properties:

– Duplicate sensitivity– Tolerance to losses– Monotonicity– State requirements

• Yield a general set of optimizations that can be automatically applied

• Can extend the original five aggregates

Page 57: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

57

Duplicate Sensitivity

• duplicate insensitive aggregates are unaffected by duplicate readings from a single device– max, min, count distinct

• duplicate sensitive aggregates will change when a duplicate reading is reported– count, sum, average, median,...

Page 58: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

58

Loss-Tolerance

• exemplary aggregates return one or more representative values from the set of all values– behave unpredictably in the face of loss– max, min, median,...

• summary aggregates compute some property over all values– the aggregate applied to a subset can be treated as a

robust approximation of the true aggregate value– count, sum, average, count distinct,...

Page 59: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

59

Monotonicity

• Monotonicity: If a b, then f(a) f(b)• Monotonic aggregates: For two partial state tuples s1

and s2 combined via a function f, s1,s2, e(f(s1,s2)) max(e(s1),e(s2))

ors1,s2, e(f(s1,s2)) min(e(s1),e(s2))

• Important when determining whether some predicates (such as HAVING) can be applied in-network, before the final value of the aggregate is known– Early predicate evaluation saves messages by reducing the

distance that partial state tuples must flow up the aggregation tree.

Page 60: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

60

State Requirements

• Distributive aggregates

• Algebraic aggregates

• Holistic aggregates

• Unique aggregates

• Content-sensitive aggregates

Page 61: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

61

State Requirements:Distributive aggregates

• The partial state is the aggregate for the partition of data over which data are computed

• The size of the partial state tuples is the same as the size of the final aggregate– max, min, count, sum

Page 62: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

62

Properties and Optimizations

Property Examples Affects

Duplicate sensitivity Min: dup. insensitive

Avg: dup. sensitive

Routing redundancy

Loss-tolerance Max: exemplary

Count: summary

Effect of loss

Monotonicity Count: monotonic

Avg: non-monotonic

Snooping,

hypothesis testing

Partial state Median: unbounded

Max: 1 tuple

Effectiveness of TAG

Page 63: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

63

State Requirements:Algebraic aggregates

• The partial state tuples are not themselves aggregates for the partitions, but are of constant size– Average (consists of two values)

s1, c1 s2, c2

(s1+s2)/(c1+c2)

Page 64: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

64

State Requirements:Holistic aggregates

• The partial state tuples are propotional in size to the set of data in the partition

• No useful partial aggregation can be done, and all the data must be brought together to be aggregated by the evaluator– median

Page 65: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

65

State Requirements:Unique aggregates

• Similar to holistic aggregates, except that the amount of state that must be propagated is proportional to the number of distinct values in the partition– count distinct

Page 66: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

66

State Requirements:Content-sensitive aggregates

• The partial state tuples are proportional in size to some (perhaps statistical) property of the data values in the partition

Page 67: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

67

Beyond TinyDB: LifeUnderYourFeet

• 10 motes in the soil of an urban forest environment– MicaZ motes– Slanted grid, approx. 2m apart– A small stream runs through the middle of the grid; depth depends on

recent rain events– Collecting air and soil temperature and soil moisture data

• Sampling and data collection– NOT TinyDB! ”Sample-and-collect schemes can loose up to 50% of

collected measurements”– Sample every minute– Store on on-board flash

• 23 kB/day• 512 kB flash means flash is overwritten after 22 days

– Gather results once a week or fortnight• Wireless basestation connected to PC is travelled to the perimeter of the

deployment site to collect measures– Simple sliding window ARQ protocol

Page 68: INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg 1 Data Management in Sensor Networks Ellen Munthe-Kaas Jarle Søberg.

INF5100 Autumn 2007 © Ellen Munthe-Kaas and Jarle Søberg

68

Literature

• Samuel R. Madden, Michael J. Franklin, Joseph M. Hellerstein, Wei Hong: TinyDB: An Acquisitional Query Processing System for Sensor Networks,ACM Transactions on Database Systems (TODS),  Volume 30, Issue 1, 2005.

(Available through the ACM Digital Library; cf. http://x-port.uio.no)

• Some additional reading:– Samuel R. Madden , Michael J. Franklin, Joseph M. Hellerstein,

Wei Hong: TAG: a Tiny Aggregation Service for Ad-Hoc Sensor NetworksOSDI, 2002