1-1 CMPE 259 Sensor Networks Katia Obraczka Winter 2005 Storage and Querying II.

1-1

CMPE 259 Sensor Networks

Katia Obraczka

Winter 2005

Storage and Querying II

1-2

Announcements

Hw3 is up. Exams. Sign-up for project presentations. Schedule. Course evaluation: Mon, 03.14.

Need volunteer.

1-3

Today

Storage. Querying.

1-4

Data-Centric Storage

1-5

DCS

Data dissemination for sensor networks. Naming-based storage.

1-6

Background

Sensornet
♦ A distributed sensing network comprised of a large number of small sensing devices equipped with
• processor
• memory
• radio
♦ Large volume of data

Data dissemination algorithm
♦ Scalable.
♦ Self-organizing.
♦ Energy efficient.

1-7

Some definitions

Observation
♦ Low-level output from sensors.
♦ E.g., detailed temperature and pressure readings.

Event
♦ Constellations of low-level observations.
♦ E.g., elephant sighting, fire, intruder.

Query
♦ Used to elicit the event information from sensornets.
♦ E.g., Is there an intruder? Where is the fire?

1-8

Data dissemination schemes

♦ External Storage (ES)
♦ Local Storage (LS)
♦ Data-Centric Storage (DCS)

1-9

External Storage (ES)

1-10

Local Storage (LS)

[Figure labels: Event, Data]

1-11

Local Storage (LS)

[Figure labels: Event, Data]

1-12

Data-Centric Storage (DCS)

Events are named with keys. DCS stores data as (key, value) pairs. DCS supports two operations:
♦ Put(k, v) stores v (the observed data) according to the key k, the name of the data.
♦ Get(k) retrieves whatever value is stored under key k.

Hash function
♦ Hashes a key k into geographic coordinates.
♦ Put() and Get() operations on the same key k hash k to the same location.
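The Put/Get interface above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field size, the SHA-256 based `geo_hash`, and the `DataCentricStore` class are hypothetical names, and "routing" is reduced to picking the node closest to the hashed location rather than running GPSR.

```python
import hashlib

# Assumed size of the deployment area (not given on the slides).
FIELD_WIDTH, FIELD_HEIGHT = 100, 100


def geo_hash(key: str) -> tuple[int, int]:
    """Hash a key to (x, y) coordinates inside the field (hypothetical hash)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    x = int.from_bytes(digest[:4], "big") % FIELD_WIDTH
    y = int.from_bytes(digest[4:8], "big") % FIELD_HEIGHT
    return x, y


class DataCentricStore:
    """Toy DCS: each key lives at the node closest to its hashed location."""

    def __init__(self, node_positions):
        # One small key-value table per node, indexed by node position.
        self.nodes = {pos: {} for pos in node_positions}

    def _home_node(self, key):
        hx, hy = geo_hash(key)
        return min(self.nodes, key=lambda p: (p[0] - hx) ** 2 + (p[1] - hy) ** 2)

    def put(self, key, value):
        """Store value under key at the key's home node."""
        self.nodes[self._home_node(key)].setdefault(key, []).append(value)

    def get(self, key):
        """Return every value stored under key (empty list if none)."""
        return list(self.nodes[self._home_node(key)].get(key, []))
```

Because `put` and `get` hash the same key to the same coordinates, a query for "elephant" reaches the node that stored the elephant sightings without flooding the network.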

1-13

DCS – Example

[Figure: Put(“elephant”, data) is routed to location (11, 28)]

(11, 28) = Hash(“elephant”)

1-14

DCS – Example

[Figure: a PDA issues Get(“elephant”), which is routed to location (11, 28)]

(11, 28) = Hash(“elephant”)

1-15

DCS – Example (cont’d)

[Figure labels: PDA, elephant, fire]

1-16

Geographic Hash Table (GHT)

Builds on
♦ Peer-to-peer lookup systems.
♦ Greedy Perimeter Stateless Routing (GPSR).

[Figure: GHT, built on GPSR and a peer-to-peer lookup system]

1-17

Comparison study

Metrics
♦ Total Messages
• Total packets sent in the sensor network.
♦ Hotspot Messages
• Maximal number of packets sent by any particular node.

1-18

Comparison study (cont’d)

Assume
♦ n is the number of nodes.
♦ Asymptotic costs of $O(n)$ for floods and $O(\sqrt{n})$ for point-to-point routing.

                     ES              LS              DCS
Cost for Storage     $O(\sqrt{n})$   0               $O(\sqrt{n})$
Cost for Query       0               $O(n)$          $O(\sqrt{n})$
Cost for Response    0               $O(\sqrt{n})$   $O(\sqrt{n})$

1-19

Comparison Study (cont’d)

♦ Dtotal: the total number of events detected.
♦ Q: the number of event types queried for.
♦ Dq: the number of detected events of the event types queried for.

No more than one query for each event type, so there are Q queries in total.

Assume the hotspot occurs on packets sent to the access point.

1-20

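Comparison Study (cont’d)

Scheme           Total                                              Hotspot
ES               $D_{total}\sqrt{n}$                                $D_{total}$
LS               $Qn + D_q\sqrt{n}$                                 $Q + D_q$
DCS              $Q\sqrt{n} + D_{total}\sqrt{n} + D_q\sqrt{n}$      $Q + D_q$
DCS (summary)    $Q\sqrt{n} + D_{total}\sqrt{n} + Q\sqrt{n}$        $2Q$

DCS is preferable if
♦ the sensor network is large, and
♦ $D_{total} \gg \max[D_q, Q]$.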
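To get a feel for when DCS wins, the cost expressions on this slide can be evaluated numerically. The sketch below assumes the slide's formulas read as follows: ES total Dtotal·√n, hotspot Dtotal; LS total Q·n + Dq·√n, hotspot Q + Dq; DCS total Q·√n + Dtotal·√n + Dq·√n, hotspot Q + Dq; DCS with summaries total Q·√n + Dtotal·√n + Q·√n, hotspot 2Q. The function name and the sample parameters are hypothetical.

```python
import math


def message_costs(n, d_total, d_q, q, summarize=False):
    """Approximate total and hotspot message counts for ES, LS, and DCS.

    n: number of nodes, d_total: events detected, d_q: detected events of
    queried types, q: number of event types queried for.
    """
    root_n = math.sqrt(n)  # cost of one point-to-point route
    es = {"total": d_total * root_n, "hotspot": d_total}
    ls = {"total": q * n + d_q * root_n, "hotspot": q + d_q}
    if summarize:
        # With summaries, each query gets back a single summary response.
        dcs = {"total": q * root_n + d_total * root_n + q * root_n,
               "hotspot": 2 * q}
    else:
        dcs = {"total": q * root_n + d_total * root_n + d_q * root_n,
               "hotspot": q + d_q}
    return {"ES": es, "LS": ls, "DCS": dcs}


if __name__ == "__main__":
    for name, cost in message_costs(10_000, 1_000, 50, 10, summarize=True).items():
        print(name, cost)
```

With many detected events and few queries, the ES hotspot grows with Dtotal while the DCS hotspot stays near Q, which is the regime the slide singles out.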

1-21

Summary

In DCS, relevant data are stored by name at nodes within the sensornet.

GHT hashes a key k into geographic coordinates; the key-value pair is stored at a node in the vicinity of the location to which its key hashes.

To ensure robustness and scalability, DCS uses the Perimeter Refresh Protocol (PRP) and Structured Replication (SR).

Compared with ES and LS, DCS is preferable in large sensornets.

1-22

Multi-Resolution Storage

1-23

Goals

Provide storage and search for raw sensor data in data-intensive scientific operations.

Previous work:
♦ Aggregation and querying.
♦ Focus on applications whose interests are known a priori.

1-24

Approach

Lossy, progressively degrading storage.

1-25

Constructing the hierarchy

Initially, nodes fill up their own storage with raw sampled data.

Data

1-26

Constructing the hierarchy

Organize network into grids, and hash in each to determine location of clusterhead (ref: DCS).

Send compressed local time-series to clusterhead.

1-27

Processing at each level

[Figure: summary data with axes x, y, and time]

Get compressed summaries from children.

Decode, re-encode at lower resolution, and forward to parent.

Store incoming summaries locally for future search.
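The per-level processing can be illustrated with a deliberately simplified stand-in. The slides use a wavelet codec; the sketch below instead halves resolution by averaging 2x2 blocks of a spatial grid, which is an assumption made only to keep the example short. The function names are hypothetical.

```python
def coarsen(grid):
    """Halve the resolution of a 2-D grid by averaging each 2x2 block.

    Stand-in for "decode, re-encode at lower resolution"; a real system
    would use a wavelet codec rather than block averaging.
    """
    rows, cols = len(grid), len(grid[0])
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def build_hierarchy(raw, levels):
    """Return [raw, level-1 summary, level-2 summary, ...].

    Each level is what a clusterhead would store locally and then forward,
    at lower resolution, to its parent.
    """
    summaries = [raw]
    for _ in range(levels):
        summaries.append(coarsen(summaries[-1]))
    return summaries


if __name__ == "__main__":
    raw = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    for level, summary in enumerate(build_hierarchy(raw, 2)):
        print(level, summary)
```

Each step up the hierarchy covers a larger area with fewer values, which is what lets coarse summaries be kept for longer than raw data.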

1-28

Constructing the hierarchy

Recursively send data to higher levels of the hierarchy.

1-29

Distributing storage load

Hash to different locations over time to distribute load among nodes in the network.
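One way to picture "hashing to different locations over time" is to include an epoch counter in the hash input, so that the clusterhead location for a grid cell moves from epoch to epoch. The sketch below is an assumption about how this could be done, not the scheme used in the original system; the function name, the SHA-256 hash, and the epoch parameter are all hypothetical.

```python
import hashlib


def clusterhead_location(cell_origin, cell_size, level, epoch):
    """Pick a point inside a grid cell from a hash of (cell, level, epoch).

    cell_origin: (x, y) of the cell's lower-left corner.
    cell_size:   edge length of the (square) cell.
    epoch:       counter that advances over time, moving the clusterhead.
    """
    x0, y0 = cell_origin
    token = f"{x0},{y0}|{level}|{epoch}".encode("utf-8")
    digest = hashlib.sha256(token).digest()
    # Map two 32-bit slices of the digest to offsets in [0, cell_size).
    dx = int.from_bytes(digest[:4], "big") / 2**32 * cell_size
    dy = int.from_bytes(digest[4:8], "big") / 2**32 * cell_size
    return (x0 + dx, y0 + dy)
```

Every node can compute the same location for a given cell, level, and epoch without coordination, and the node nearest that location would act as clusterhead for that epoch.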

1-30

What happens when storage fills up?

Eventually, all available storage gets filled, and we have to decide when and how to drop summaries.

Allocate storage to each resolution and use each allocated storage block as a circular buffer.

[Figure: local storage allocation; the local storage capacity is divided among Res 1, Res 2, Res 3, and Res 4]
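The circular-buffer idea maps directly onto a bounded deque: once a resolution's block is full, appending a new summary silently drops the oldest one. The sketch below is a minimal illustration; the class name, the slot-based capacity, and the share values are hypothetical.

```python
from collections import deque


class ResolutionStore:
    """Local storage split into one circular buffer per resolution."""

    def __init__(self, capacity, shares):
        # capacity: total number of summary slots on this node (assumed unit).
        # shares:   relative share of the capacity given to each resolution.
        total = sum(shares.values())
        self.buffers = {
            res: deque(maxlen=max(1, capacity * share // total))
            for res, share in shares.items()
        }

    def add(self, res, summary):
        """Append a summary; the oldest one is evicted if the block is full."""
        self.buffers[res].append(summary)

    def contents(self, res):
        """Summaries currently held for a resolution, oldest first."""
        return list(self.buffers[res])
```

For example, with 8 slots and shares of 1, 1, 2, and 4, the finest block keeps the last four summaries it received, while each of the two smallest blocks keeps only the most recent one.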

1-31

Tradeoff between storage requirements and query quality

Graceful query degradation: providing more accurate responses to queries on recent data and less accurate responses to queries on older data.

[Figure: query accuracy (low to high) vs. storage time for Level 0, Level 1, and Level 2 summaries, ranging from high query accuracy with low compactness to low query accuracy with high compactness]

How to allocate storage at each node to summaries at different resolutions to provide gracefully degrading storage and search capability?

1-32

Match system performance to user requirements

Objective: Minimize the worst-case difference between user-desired query quality (blue curve) and the query quality that the system can provide (red step function).

User provides a function, Quser, that represents desired query quality degradation over time.

System provides a step function, Qsystem, with steps at times when summaries are aged.

[Figure: query accuracy (50% to 95%) vs. time (present to past), showing the quality difference between Quser and the Qsystem step function, with steps at Age_i]
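The objective can be made concrete by evaluating both curves at a set of sample ages and taking the largest shortfall. This is a sketch under assumptions: the step representation, the choice to measure only how far the system falls below the user's curve, and the example numbers are all hypothetical.

```python
def q_system(age, steps):
    """Step function: accuracy the system offers for data of a given age.

    steps: list of (age_limit, accuracy), sorted by age_limit. Data older
    than the last limit has been aged out, so accuracy is 0.
    """
    for age_limit, accuracy in steps:
        if age <= age_limit:
            return accuracy
    return 0.0


def worst_case_gap(q_user, steps, ages):
    """Largest shortfall of the system's quality below the user's curve."""
    return max(q_user(age) - q_system(age, steps) for age in ages)


if __name__ == "__main__":
    # Hypothetical user curve: quality may fall linearly with age.
    user = lambda age: 1.0 - 0.1 * age
    print(worst_case_gap(user, [(2, 0.9), (5, 0.6)], range(9)))
```

An aging strategy would then choose the step positions, that is, how long each resolution is retained, to make this gap as small as storage allows.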

1-33

For how long should summaries be stored?

To achieve desired query quality given system’s constraints.

Given
♦ N sensor nodes.
♦ Each node has storage capacity S.
♦ Users ask a set of typical queries, T.
♦ Data is generated at resolution i at rate Ri.
♦ D(q, k): query error when drilldown for query q terminates at level k.
♦ Quser: user-desired quality degradation.

1-34

Aging strategy with limited information

Solve constraint optimization.

♦ Omniscient strategy (baseline): when the entire dataset is available.
♦ Training strategy: when a small training dataset from the initial deployment is available.
♦ Greedy strategy: when no data is available, use a simple weighted allocation to summaries (Coarse : Finer : Finest = 1 : 2 : 4).

[Figure: the strategies range from no a priori information to full a priori information]
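The greedy strategy's weighted allocation is simple arithmetic, sketched below. The 1 : 2 : 4 weights come from the slide; the capacity, the data rates, and the function names are hypothetical. The retention time follows from treating each block as a circular buffer: a block of size s filled at rate R holds the last s / R time units of summaries.

```python
def greedy_allocation(capacity, weights):
    """Split a node's storage among resolutions in proportion to weights."""
    total = sum(weights.values())
    return {level: capacity * w / total for level, w in weights.items()}


def retention_time(allocation, rates):
    """Time span each circular buffer covers: allocated size / fill rate."""
    return {level: allocation[level] / rates[level] for level in allocation}


if __name__ == "__main__":
    weights = {"coarse": 1, "finer": 2, "finest": 4}   # ratio from the slide
    rates = {"coarse": 1, "finer": 4, "finest": 16}    # assumed data rates
    alloc = greedy_allocation(700, weights)
    print(alloc)
    print(retention_time(alloc, rates))
```

Even though the finest summaries receive the largest share here, their higher data rate means they age out first, which yields the graceful degradation described earlier.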

1-35

Distributed trace-driven implementation

Linux implementation.
♦ Uses EmStar (J. Elson et al.), a Linux-based emulator/simulator for sensor networks.
♦ 3D wavelet codec.
♦ Query processing.

Geo-spatial precipitation dataset.
♦ 15x12 grid (50 km edge) of precipitation data from 1949-1994, from the Pacific Northwest.

System parameters
♦ Compression ratio: 6:12:24:48.
♦ Training set: 6% of total dataset.

1-36

How efficient is search?

Search is very efficient (<5% of network queried) and accurate for different queries studied.

1-37

Comparing aging strategies

Training performs within 1% of optimal. Careful selection of parameters for the greedy algorithm can provide surprisingly good results (within 2-5% of optimal).

1-38

Summary

Progressive aging of summaries can be used to support long-term spatio-temporal queries in resource-constrained sensor network deployments.

We describe two algorithms: a training-based algorithm that relies on the availability of training datasets, and a greedy algorithm that can be used in the absence of such data.

Our results show that
♦ training performs close to optimal for the dataset that we study.
♦ the greedy algorithm performs well for a well-chosen summary weighting parameter.

1-39

Continuously Adaptive Continuous Queries (CACQ)

1-40

CACQ Introduction

Proposed continuous query (CQ) systems are based on static plans.
♦ But CQs are long running.
♦ Initially valid assumptions become less so over time.

CACQ insight: apply continuous adaptivity.
♦ Dynamic operator ordering.
♦ Process multiple queries simultaneously.
♦ Enables sharing of work & storage.

1-41

Outline

Background
♦ Motivation
♦ Continuous Queries
♦ Eddies

CACQ
♦ Contributions
♦ Example-driven explanation

Results & Experiments

1-42

Motivating applications

Building monitoring.
♦ Variety of sensors (e.g., light, temperature, vibration, strain, etc.).
♦ Variety of users with different interests (e.g., structural engineers, building managers, building users, etc.).

1-43

Continuous queries

Long running, “standing queries”.
♦ From various users.
♦ On a number of sensor streams.

Installed; continuously produce results until removed.

Lots of queries over the same data sources.
♦ Opportunity for work sharing.

1-44

Eddies & adaptivity

Eddies (Avnur & Hellerstein, SIGMOD 2000): continuous adaptivity.

No static ordering of operators.

Routing policy dynamically orders operators on a per-tuple basis.

Done and ready bits encode where a tuple has been and where it can go.

1-45

CACQ contributions

Adaptivity.

Tuple lineage.
♦ In addition to ready and done, encode the path a tuple takes through the operators.
• Enables sharing of work and state across queries.

Grouped filter.
♦ Efficiently compute selections over multiple queries.

Join sharing through State Modules (SteMs).
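The grouped-filter idea can be illustrated for one predicate shape. The sketch below indexes predicates of the form `attr > constant` from many queries in a sorted list, so that one binary search per tuple finds every query the tuple satisfies. It is an assumption-laden illustration, not CACQ's implementation: the class name is hypothetical and only the greater-than case is covered.

```python
import bisect


class GreaterThanGroupedFilter:
    """Indexes `attr > constant` predicates from many queries on one attribute."""

    def __init__(self):
        self._constants = []   # predicate constants, kept sorted
        self._query_ids = []   # query owning the constant at the same index

    def add(self, query_id, constant):
        """Register the predicate `attr > constant` for a query."""
        i = bisect.bisect_right(self._constants, constant)
        self._constants.insert(i, constant)
        self._query_ids.insert(i, query_id)

    def matching_queries(self, value):
        """Queries whose predicate is satisfied by this attribute value."""
        # Every constant strictly below the value is a satisfied predicate.
        i = bisect.bisect_left(self._constants, value)
        return set(self._query_ids[:i])
```

With predicates `> 10`, `> 20`, and `> 5` registered for three queries, a tuple with value 15 matches the first and third in a single lookup instead of three separate filter evaluations.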

1-46

Eddies & CACQ: Single Query, Single Source

SELECT *
FROM R
WHERE R.a > 10 AND R.b < 15

• Use ready bits to track what to do next.
  • All 1’s in single source.
• Use done bits to track what has been done.
  • Tuple can be output when all bits set.
• Routing policy dynamically orders tuples.

[Figure: source R feeds an eddy that routes tuples R1 (a = 5, b = 25) and R2 (a = 15, b = 0) through the filters (R.a > 10) and (R.b < 15); each tuple carries ready and done bits for a and b]
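The single-query, single-source case can be mimicked with two bitmasks per tuple. The sketch below is a toy under assumptions: a random choice stands in for the eddy's routing policy, and the operator list and function name are hypothetical. It uses the query and the two tuples from this slide.

```python
import random

# The two selection operators of: SELECT * FROM R WHERE R.a > 10 AND R.b < 15
OPERATORS = [
    ("a", lambda t: t["a"] > 10),
    ("b", lambda t: t["b"] < 15),
]


def run_eddy(tuples, operators=OPERATORS, rng=None):
    """Route each tuple through the operators in a per-tuple order."""
    rng = rng or random.Random(0)
    all_set = (1 << len(operators)) - 1
    output = []
    for t in tuples:
        ready = all_set  # single source: every operator may run immediately
        done = 0         # no operator has processed this tuple yet
        while done != all_set:
            candidates = [
                i for i in range(len(operators))
                if ((ready >> i) & 1) and not ((done >> i) & 1)
            ]
            i = rng.choice(candidates)  # stand-in for the routing policy
            _, predicate = operators[i]
            if not predicate(t):
                break                   # tuple fails a filter and is dropped
            done |= 1 << i
        else:
            output.append(t)            # all done bits set: emit the tuple
    return output


if __name__ == "__main__":
    r1 = {"a": 5, "b": 25}
    r2 = {"a": 15, "b": 0}
    print(run_eddy([r1, r2]))
```

Whatever order the operators are tried in, R1 is dropped by the first filter it meets and R2 is output once both done bits are set, so the result does not depend on the routing order.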

1-47

Evaluation

Real Java implementation on top of Telegraph QP.
♦ 4,000 new lines of code in a 75,000-line codebase.

Server platform
♦ Linux 2.4.10.
♦ Pentium III 733, 756 MB RAM.

Queries posed from a separate workstation.
♦ Output suppressed.

Lots of experiments in the paper, just a few here.

1-48

CACQ vs. NiagaraCQ Graph

[Figure: NiagaraCQ (Option 1) vs. CACQ, 100 queries, 2250 stocks; running time in seconds (log scale, 1 to 10000) vs. |Result| / |Stocks| (1 to 100); series: Niagara, CACQ]

1-49

Conclusion

CACQ: sharing and adaptivity for high-performance monitoring queries over data streams.

Features
♦ Adaptivity.
• Adapt to changing query workload without costly multi-query reoptimization.
♦ Work sharing via tuple lineage.
• Without constraining the available plans.
♦ Computation sharing via grouped filter.
♦ Storage sharing via SteMs.