Overview of Data Management in Sensor Networks

Deepak Ganesan (UMass)

Data Management Basics

Sensor networks are data-centric: a significant amount of data is generated within the network.

Data management: how do you manage (store and process) the data in the network?

Different data management approaches apply depending on:
  Sensor: data rate or event rate.
  Resource: local storage, processing, bandwidth, and power capacity.
  Query: type, arrival rate, complexity, and latency requirement.

Key Challenges in Data Management

Where should the data be stored?

How should queries be routed to the stored data?

Where and how should aggregation be performed?

How should queries for sensor networks be expressed?

Data Management Challenges

Where should data be stored and query processing be performed?
  Inside the network: dealing with storage limitations, query processing overhead, and distributed query processing.
  Outside the network: dealing with bandwidth, scheduling, reliability, and power issues.

How should queries be routed to data?
  Inside: flooding, geographic routing, gradient-based routing.
  Outside: tree-based routing.

Where and how should aggregation be performed?
  Opportunistically along the routing path.
  Cluster-based or gossip-based.

How should queries on sensor data be expressed?
  Declarative querying for users.
  Macroprogramming for developers.

Where should data be stored?

[Figure: Spectrum of data storage and processing. Axis: communication for data storage. Approaches shown: local storage with hierarchical index; local storage with flooding or geography-based query processing; multi-resolution storage and indexing; centralized storage and querying.]

[Figure: Spectrum of data storage and processing. Axis: communication for query processing. Approaches shown: local storage with hierarchical index; multi-resolution storage and indexing; local storage with flooding or geography-based query processing.]

Centralized Storage and Querying

Method: archive nothing locally; transmit everything of interest.
  When a data item of interest is detected, send all useful information to the base station.

Advantages:
  Persistent centralized storage.
  Intelligence sits at a more resource-rich node.
  Complicated signal processing can easily be done outside the network; sensor nodes perform only very simple filtering of data.

Disadvantages:
  Power-inefficient.
  Not applicable to applications where a large amount of data is potentially useful.

[Figure labels: Query, Trigger]

When is centralized storage and querying appropriate?
  First-generation data collection/acquisition systems: James Reserve, Great Duck Island, structural monitoring (Wisden), etc.
  Scientific applications where users need all the data.
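The division of labour described above (very simple filtering on the node, everything else at the base station) can be sketched as follows. This is only an illustration: the `Reading` layout, the threshold filter, and the `BaseStation` class are assumptions made for the example, not part of the original slides.

```python
from typing import Iterable, List, Tuple

Reading = Tuple[float, float]  # (timestamp, value); hypothetical record layout


def filter_of_interest(readings: Iterable[Reading], threshold: float) -> List[Reading]:
    """Very simple node-side filter: keep readings whose value exceeds a threshold."""
    return [(t, v) for t, v in readings if v > threshold]


class BaseStation:
    """Resource-rich sink that persistently stores everything it receives."""

    def __init__(self) -> None:
        self.archive: List[Tuple[int, Reading]] = []

    def receive(self, node_id: int, readings: Iterable[Reading]) -> None:
        # Nodes archive nothing locally; the sink keeps the only copy.
        for reading in readings:
            self.archive.append((node_id, reading))

    def query(self, min_value: float) -> List[Tuple[int, Reading]]:
        # Queries run entirely outside the network, over the central archive.
        return [(n, r) for n, r in self.archive if r[1] >= min_value]
```

Every reading that passes the filter costs a radio transmission, which is where the power inefficiency noted above comes from.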

Multi-resolution Storage and Indexing

Method: store data in a multi-resolution hierarchy.
  Raw data at the leaves, processed summaries of data at clusterheads (which may be higher-power nodes).

Advantages:
  The root has a multi-resolution view of the data in the network. This can be used to make intelligent decisions about which nodes to query and to perform complicated processing.
  Data is replicated at multiple devices.
  Even if raw data is phased out, summaries can be stored.

Disadvantages:
  Processing and hierarchical storage require power, although not as much as centralized storage.

When is distributed storage and indexing appropriate?
  Second-generation data collection/acquisition systems.
  Scientific applications where data sizes are large and users need to find patterns in sensor data.
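A minimal sketch of a multi-resolution hierarchy, assuming that each level summarizes the one below it by averaging groups of consecutive samples. Averaging is only a stand-in chosen for brevity; the slides do not say which summarization is used, and the function names are hypothetical.

```python
from typing import List


def summarize(values: List[float], factor: int = 2) -> List[float]:
    """Coarsen a series by averaging consecutive groups of `factor` samples."""
    groups = [values[i:i + factor] for i in range(0, len(values), factor)]
    return [sum(g) / len(g) for g in groups]


def build_hierarchy(raw: List[float], levels: int, factor: int = 2) -> List[List[float]]:
    """Return [raw, summary_1, ..., summary_levels], with the coarsest summary last."""
    hierarchy = [list(raw)]
    for _ in range(levels):
        hierarchy.append(summarize(hierarchy[-1], factor))
    return hierarchy
```

In this picture the leaves hold `hierarchy[0]`, clusterheads hold the intermediate levels, and the root holds the coarsest one, so old raw data can be dropped while the summaries remain queryable.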

Local Storage and Distributed Indexing

Method: store data locally at each node and construct distributed index structures to make search efficient.

Advantages:
  Makes search efficient and requires low communication overhead.

Disadvantages:
  Data is lost if a node fails.
  Index structures can only deal with specific attribute-based search, not with arbitrary signal processing functions over data.

When is local storage and distributed indexing appropriate?
  When search can be effectively scoped using simple attributes. For example, if temperature is a good indicator of some other activity, it can be used to limit the scope of the search.

[Figure label: 1 <= Event Attribute <= 8]
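The attribute-scoped search above (for example, 1 <= Event Attribute <= 8) can be sketched with a small hierarchical index. This is an illustrative assumption rather than the slides' actual index structure: each interior node is assumed to keep only the minimum and maximum attribute value stored beneath each child, and `IndexNode` and `range_query` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IndexNode:
    name: str
    value: Optional[int] = None          # event attribute stored locally (leaves only)
    children: List["IndexNode"] = field(default_factory=list)
    lo: int = 0                          # smallest attribute in this subtree
    hi: int = 0                          # largest attribute in this subtree

    def build(self) -> None:
        """Compute the [lo, hi] attribute range covered by this subtree."""
        if not self.children:
            self.lo = self.hi = self.value  # assumes every leaf has a value
            return
        for child in self.children:
            child.build()
        self.lo = min(c.lo for c in self.children)
        self.hi = max(c.hi for c in self.children)


def range_query(node: IndexNode, lo: int, hi: int, visited: List[str]) -> List[str]:
    """Return leaves with lo <= value <= hi, descending only into overlapping subtrees."""
    visited.append(node.name)
    if not node.children:
        return [node.name] if lo <= node.value <= hi else []
    matches: List[str] = []
    for child in node.children:
        if child.hi >= lo and child.lo <= hi:  # prune using the child's indexed range
            matches.extend(range_query(child, lo, hi, visited))
    return matches
```

The `visited` list records which nodes the query actually touched, which makes the communication saving from pruning visible. The index answers range predicates on the stored attribute only, matching the limitation listed above.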

Local Storage and Distributed Querying

Method: store data locally at each node; the query is flooded out to the network or geographically routed. Query processing is performed on demand.

Advantages:
  Only on-demand processing, therefore energy efficient.

Disadvantages:
  Data is lost if a node fails.
  Puts significant complexity into a network of very low-power devices.
  Frequent queries incur high overhead.

When is local storage and distributed querying appropriate?
  When queries are simple and have limited scope.
  When schemes can deal with node failure.

How should queries be routed to stored data?

Data-centric routing techniques

Push-based query routing

Tree-based routing

Query Flooding or Geographic Routing

Gradient-based routing

Flooding queries into the network

Flood the query throughout the network. Nodes with matching attributes/parameters respond to the query.

Pros:
  Very simple and reliable.

Cons:
  Inefficient if queries are posed frequently or the network is large-scale.

When is it useful?
  A large fraction of current deployments, and possibly future deployments, will be flooding-based simply because of the inherent simplicity and reliability of flooding.
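Query flooding can be sketched as a breadth-first broadcast with duplicate suppression. The adjacency-dictionary representation and the `matches` predicate are assumptions made for this example.

```python
from collections import deque
from typing import Callable, Dict, List, Set, Tuple


def flood_query(
    neighbors: Dict[int, List[int]],
    origin: int,
    matches: Callable[[int], bool],
) -> Tuple[Set[int], int]:
    """Flood a query from `origin`; return (responding nodes, broadcasts sent)."""
    seen = {origin}
    queue = deque([origin])
    responders: Set[int] = set()
    broadcasts = 0
    while queue:
        node = queue.popleft()
        if matches(node):
            responders.add(node)
        broadcasts += 1                  # each reached node rebroadcasts the query once
        for nxt in neighbors[node]:
            if nxt not in seen:          # duplicate suppression
                seen.add(nxt)
                queue.append(nxt)
    return responders, broadcasts
```

The broadcast count equals the number of reachable nodes regardless of how many of them match, which is why the cost grows with network size and query frequency.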

Geographic routing to known locations

If the query explicitly specifies a location, selectively route the query to the particular locations of interest.
  E.g.: “Find the average temperature in the west corridor of the CS Building.”

Pros:
  Can reduce query routing overhead by selectively choosing nodes.

Cons:
  Complex routing strategy. Needs special mechanisms to route around communication holes.
  Lack of redundancy might result in the query being lost.
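A minimal sketch of greedy geographic forwarding, assuming every node knows its own coordinates and those of its neighbors. It deliberately omits the special hole-recovery mechanisms mentioned above, so it simply reports failure when it gets stuck. The function name and the `reach` parameter (how close to the target counts as delivered) are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def greedy_route(
    positions: Dict[int, Point],
    neighbors: Dict[int, List[int]],
    src: int,
    dest_pos: Point,
    reach: float,
) -> Optional[List[int]]:
    """Forward to the neighbor closest to dest_pos; return None if stuck at a hole."""

    def dist(n: int) -> float:
        return math.dist(positions[n], dest_pos)

    path = [src]
    current = src
    while True:
        closer = [n for n in neighbors[current] if dist(n) < dist(current)]
        if not closer:
            break                        # local minimum: no neighbor makes progress
        current = min(closer, key=dist)
        path.append(current)
    # Delivered only if the local minimum lies within `reach` of the target location.
    return path if dist(current) <= reach else None
```

Because the distance to the target strictly decreases at every hop, the loop always terminates; a `None` result corresponds to the query being lost at a communication hole.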

Gradient-based routing

Set up gradients in the network that lead queries towards the areas of interest. Also called publish-subscribe schemes.

Pros:
  Resilient to failures and packet loss (similar to gossip-based schemes).
  Not restricted to location-based queries. Can be used for any spatially correlated attribute.

Cons:
  Incurs more overhead than geo-routing schemes.
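One simple way to picture a gradient, sketched under stated assumptions: nodes holding data of interest advertise themselves, every node records its hop distance to the nearest advertiser, and a query walks downhill. Hop count as the gradient metric, symmetric links, a connected network, and the function names are all assumptions of this example, not details from the slides.

```python
from collections import deque
from typing import Dict, List


def setup_gradient(neighbors: Dict[int, List[int]], sources: List[int]) -> Dict[int, int]:
    """Hop distance from each node to the nearest advertising source (multi-source BFS)."""
    height = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in height:
                height[nxt] = height[node] + 1
                queue.append(nxt)
    return height


def follow_gradient(
    neighbors: Dict[int, List[int]], height: Dict[int, int], start: int
) -> List[int]:
    """Walk downhill from `start` until a source (height 0) is reached."""
    path = [start]
    while height[path[-1]] > 0:
        current = path[-1]
        # With symmetric links, some neighbor is always exactly one hop closer.
        path.append(min(neighbors[current], key=lambda n: height.get(n, float("inf"))))
    return path
```

Setting up the gradient touches every node, which reflects the extra overhead relative to geo-routing, but afterwards any node with several downhill neighbors has alternative next hops if one of them fails.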

Tree-based routing

In push-based systems, the query can remain at the base station and the data can be routed to it.

Pros:
  Query processing can be complex. Decisions can be made at the intelligent node rather than the resource-constrained one.
  Periodic push is synchronous and can be optimized through better scheduling policies.

Cons:
  Pure push is rather inefficient, since decision making is solely at the central location. Usually a combination of push and pull is more appropriate.

Where and how should query results be aggregated?

General aggregation

Let $H_k$ be the information from $k$ sources. It generally satisfies the following conditions:
  It is non-decreasing with respect to $k$.
  It is concave with respect to $k$.

Uncorrelated sources: $H_k = k$
Fully correlated sources: $H_k = 1$
Intermediate correlation: $H_k$ lies between these two extremes.

[Figure: aggregate information versus number of sources.]
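To make the two conditions concrete, here is a small numerical sketch using an illustrative model that is not taken from the slides: the i-th additional source is assumed to contribute $(1-\rho)^i$ units of new information, where $\rho$ is a hypothetical correlation parameter. With $\rho = 0$ this gives $H_k = k$ and with $\rho = 1$ it gives $H_k = 1$, matching the two extremes above.

```python
from typing import List


def aggregate_information(k: int, rho: float) -> float:
    """Illustrative model: the i-th additional source contributes (1 - rho)**i units."""
    return sum((1.0 - rho) ** i for i in range(k))


def is_nondecreasing_and_concave(h: List[float]) -> bool:
    """Check that successive increments are non-negative and never grow."""
    increments = [b - a for a, b in zip(h, h[1:])]
    nondecreasing = all(d >= 0 for d in increments)
    concave = all(d2 <= d1 + 1e-12 for d1, d2 in zip(increments, increments[1:]))
    return nondecreasing and concave
```

Any other model with shrinking, non-negative increments would satisfy the same two properties; the geometric form is chosen here only because it is easy to verify by hand.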

Aggregation of query results

Opportunistic aggregation
  Build shortest-path trees. Aggregate at the junction nodes.

Cluster close to the source of data
  Force query results to be aggregated close to the data.

Query-optimized trees
  Build trees that are optimized for particular kinds of queries.
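Opportunistic aggregation on a shortest-path tree can be sketched as follows. The example assumes perfect aggregation, meaning a junction node merges everything it receives into a single outgoing packet; that assumption, the hop-count tree construction, and the function names are illustrative only.

```python
from collections import deque
from typing import Dict, List, Optional


def shortest_path_tree(neighbors: Dict[int, List[int]], sink: int) -> Dict[int, Optional[int]]:
    """BFS from the sink; parent[n] is n's next hop towards the sink."""
    parent: Dict[int, Optional[int]] = {sink: None}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return parent


def transmissions(parent: Dict[int, Optional[int]], sources: List[int], aggregate: bool) -> int:
    """Count packets sent when every source reports once to the sink."""
    edges_used = set()
    total_hops = 0
    for source in sources:
        node = source
        while parent[node] is not None:
            edges_used.add((node, parent[node]))
            total_hops += 1
            node = parent[node]
    # With aggregation, each tree edge carries one merged packet.
    return len(edges_used) if aggregate else total_hops
```

The saving comes entirely from edges shared by several sources, so it grows when the paths merge early, which is the motivation for clustering close to the data.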

Query processing language and optimization

Query Processing Challenges

Intended audience:
  Users who pose queries.
  Application developers.

How much complexity to expose?
  Complex inter-resource constraints.
  Distributed computation.
  Data fusion/collaborative signal processing.

How much run-time vs. compile-time query optimization?

Query Processing Language for Users

Expressing spatio-temporal queries
  When and where did an event occur?
  Scoping the spatial or temporal region of interest.

Addressing individual sensors
  Ability to specify which sensors and which sensor parameters you are interested in.
  Confidence intervals or other measures of error tolerance.

Addressing events
  Ability to specify “events” of interest transparently, independent of the event processing.
  Confidence interval, error tolerance.

Specifying query processing constraints
  Latency of result.

Hide the distributed nature of computation from naïve users. Enable extensive query expression over sensor data, events, and query constraints.

Programming Language for Developers

Addressing groups of sensors and data fusion
  Combine data from a motion detector, vibration sensor, and camera (that may not be co-located) into a “detection event”.
  Aggregation: data of type ‘vibration signal’ can be combined by looking at the FFT and picking the 4 dominant frequencies.

Specifying the routing structure
  Cluster the area into groups of nodes that observe correlated events.

Allowing user-defined signal processing
  Each application has different aggregation needs.

Expressing resource constraints
  Energy: do not expend more than J joules in trying to get the result.

Expose the distributed nature of computation, but provide a composable library of primitives for easier development.

Runtime query optimization

Energy constraints pose difficult query optimization requirements.
  Every sensor sample incurs an energy cost, with different sensors incurring different overhead.
  Processing and storage consume power as well.

Consider the query: sample vibration and magnetometer, and report if vibration > Threshold1 and magnetic flux > Th2.
  Vibration sampling requires less energy than magnetometer sampling, hence it should be done first.

The ordering of sampling, processing, and communication can matter for energy reasons. How should runtime query optimization be performed?
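The vibration/magnetometer example can be worked through numerically. The sketch below assumes the predicates are independent and that evaluation stops at the first failed predicate; the energy costs and pass probabilities used with it are made-up illustrative numbers, not measurements.

```python
from itertools import permutations
from typing import Dict, Sequence, Tuple


def expected_energy(
    order: Sequence[str], cost: Dict[str, float], pass_prob: Dict[str, float]
) -> float:
    """Expected energy of sampling sensors in `order`, stopping at the first failed predicate."""
    energy = 0.0
    reach = 1.0                      # probability that evaluation gets as far as this sensor
    for sensor in order:
        energy += reach * cost[sensor]
        reach *= pass_prob[sensor]
    return energy


def best_order(cost: Dict[str, float], pass_prob: Dict[str, float]) -> Tuple[str, ...]:
    """Exhaustively pick the sampling order with the lowest expected energy."""
    return min(permutations(cost), key=lambda o: expected_energy(o, cost, pass_prob))
```

With hypothetical costs of 1 unit for vibration and 5 units for the magnetometer, and each predicate passing half the time, sampling vibration first costs 1 + 0.5 × 5 = 3.5 units on average versus 5 + 0.5 × 1 = 5.5 the other way round. Under this model the cheaper sensor is not always the right first choice: a slightly more expensive sensor whose predicate almost always fails can be the better one to sample first, which is one reason the ordering may need to be decided at run time.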
