Multi-resolution Data Communication in Wireless Sensor Networks
Frieder Ganz, Payam Barnaghi, Francois Carrez
Centre for Communication Systems Research (CCSR), University of Surrey, Guildford, United Kingdom
Seoul, Korea, March 2014
Sensors
Wireless Sensor Networks (WSN)
[Figure labels: sink node, gateway, core network (e.g. Internet), end-user, computer services]
− The networks typically run low-power devices.
− They consist of one or more sensors, which could be different types of sensors (or actuators).
Image courtesy: the Economist
Data Processing
[Figure labels: WSNs and network-enabled devices; gateways; data collection and processing within the networks; network services/storage and processing units]
Data aggregation and reduction methods
− The Symbolic Aggregate Approximation (SAX) is a widely used
dimensionality reduction mechanism for time-series data.
− However, not all time series are alike, as they come from a variety of
different application domains. SAX was first developed for static
databases; in this work we extend it for use in sensor
domain applications.
− SAX consists of two steps:
− the aggregation phase, using Piecewise Aggregate Approximation
(PAA) and
− the discretisation of the aggregated data.
− This work limits the extension to the PAA phase.
Data aggregation and reduction methods
1. SAX applies z-normalisation to the data (left: original data in blue,
normalised data in green).
2. It then reduces the data to a vector of smaller length
by taking the mean of each window (below left: mean
values).
3. Finally, it discretises the data, based on the Gaussian
distribution, into SAX words represented as strings
according to the quartiles of the data (below right).
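The three steps above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names, the four-letter alphabet "abcd" and the sample series are assumptions made here. With four symbols, the breakpoints are the quartiles of the standard normal distribution.

```python
from statistics import NormalDist, mean, pstdev

def z_normalise(series):
    """Step 1: rescale a series to zero mean and unit standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]  # constant series: nothing to scale
    return [(x - mu) / sigma for x in series]

def paa(series, window):
    """Step 2: Piecewise Aggregate Approximation, the mean of each fixed-size window."""
    return [mean(series[i:i + window]) for i in range(0, len(series), window)]

def discretise(values, alphabet="abcd"):
    """Step 3: map each value to a symbol using equiprobable Gaussian breakpoints."""
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    # The symbol index is the number of breakpoints lying below the value.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in values)

def sax(series, window, alphabet="abcd"):
    return discretise(paa(z_normalise(series), window), alphabet)

# Illustrative data (not from the presentation): 8 samples reduced to a 4-letter word.
word = sax([1, 1, 1, 1, 5, 5, 9, 9], window=2)
print(word)  # aacd
```

Here every window has the same length (2 samples), which is the fixed-resolution behaviour that the rest of the presentation sets out to relax.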
Symbolic Aggregate Approximation
The constant relation between input length n and output length m leads to a fixed reduced window size.
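With a fixed window, each PAA value is the mean of n/m consecutive samples. In the standard PAA formulation this reads as follows, where n and m are the input and output lengths from the slide, the symbols x_j (input samples) and \bar{x}_i (aggregated values) are introduced here for illustration, and n is assumed to be divisible by m:

```latex
\bar{x}_i = \frac{m}{n} \sum_{j = \frac{n}{m}(i-1) + 1}^{\frac{n}{m} i} x_j, \qquad i = 1, \dots, m
```

For example, n = 8 and m = 4 give a window of n/m = 2 samples, regardless of how active the data is.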
Multi Resolution Data Communication
− A variable granularity selection is required that chooses the right window length based on the data activity.
− How to measure and quantify data activity?
− To measure the activity in the data, we pre-selected four statistical measures that can give insight into it: variability (measured as variance), maximum, minimum and mean.
− Each of these has advantages and disadvantages that can lead to different interpretations.
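As a small illustration of the four candidate measures, the sketch below computes them for one frame of samples. The function name, the dictionary keys and the sample frames are assumptions for this example; the presentation does not specify how a frame is delimited.

```python
from statistics import mean, pvariance

def activity_measures(frame):
    """Return the four candidate activity statistics for one frame of samples."""
    return {
        "variance": pvariance(frame),  # spread of the values around their mean
        "maximum": max(frame),
        "minimum": min(frame),
        "mean": mean(frame),
    }

# Two illustrative frames with the same mean but very different activity.
quiet = activity_measures([20.0, 20.1, 19.9, 20.0])
busy = activity_measures([20.0, 26.0, 14.0, 20.0])
print(quiet["variance"] < busy["variance"])  # True
```

Both frames have a mean of about 20, so the mean alone cannot tell them apart, while the variance, maximum and minimum all can. This is one way the measures lead to different interpretations.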
Multi Granularity
− Using SAX we can define different window/string sizes; but what is the best choice?
[Figure: windows W1, W2, W3, … with sizes m1, m2, m3]
Window Selection
− Maximum:
− An upper bound of the historical data is identified. If the observed data in the current frame is close to or higher than the maximum m, high granularity is sent.
− However, this method is only useful for data with interesting outliers whose magnitude exceeds a certain threshold; for example, it could be applied to presence data, where presence could be identified using local maxima.
− Minimum:
− Selecting m based on the minimum has the same applications as choosing the maximum value discussed above;
− however, it is applicable where a higher granularity should be achieved for small values.
Window Selection
− Mean:
− Taking the average to select the granularity results in a higher granularity for data values that are stationary around a certain value. This reduces the granularity in cases where there are many outliers.
− Variance:
− The variability measure defines how far values are spread out. This can be used to create a higher granularity for values that are more distant from the mean of the data.
− This includes the features of the min and max approaches. However, it does not favour values that are around the mean.
− In this work, we assume that values far from the mean are more interesting and should be represented with a higher granularity than data that is close to the mean.
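A variance-based selection could look like the sketch below: the variance of the current frame is compared against a list of thresholds, and a higher variance maps to a smaller window (higher granularity). The threshold values, the candidate window lengths and the function name are hypothetical; the presentation does not give the concrete values used.

```python
from statistics import pvariance

def select_window(frame, thresholds, windows):
    """Pick a PAA window length for one frame from its variance.

    thresholds: ascending variance limits (hypothetical values).
    windows: candidate window lengths, largest first, with one more entry
             than thresholds, so higher variance maps to a smaller window.
    """
    activity = pvariance(frame)
    level = sum(activity > t for t in thresholds)  # number of thresholds exceeded
    return windows[level]

THRESHOLDS = [0.5, 4.0]  # assumed for illustration only
WINDOWS = [8, 4, 2]      # assumed for illustration only

calm = [20.0, 20.1, 19.9, 20.0, 20.1, 19.9, 20.0, 20.0]
active = [20.0, 26.0, 14.0, 20.0, 25.0, 15.0, 20.0, 20.0]

print(select_window(calm, THRESHOLDS, WINDOWS))    # 8
print(select_window(active, THRESHOLDS, WINDOWS))  # 2
```

The calm frame is summarised by a single mean over 8 samples, while the active frame is sent as four means over 2 samples each.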
Multi Resolution Data Communication
− Which method suits sensor data?
− To select a method, we compare the similarity of the original and reconstructed datasets using Pearson correlation, and we also compare their sizes.
− With variance as the selection method, the dataset is reduced by 36% with a correlation factor of 0.94.
− For mean: 27% and 0.95;
− For max: 0.68% and 0.92;
− And for min: 29% and 0.99, respectively.
− Reduction and reconstruction strongly depend on the underlying dataset.
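The two evaluation criteria can be computed as in the sketch below. It assumes that the reconstruction repeats each window mean over the length of its window and that size is measured as the number of transmitted values; neither detail is stated in the presentation. The data is illustrative and does not reproduce the reported figures.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (neither may be constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def size_reduction(original_len, reduced_len):
    """Fraction of values saved by sending the reduced representation."""
    return 1 - reduced_len / original_len

WINDOW = 4  # fixed window, assumed for illustration
original = [1.0, 3.0, 2.0, 4.0, 10.0, 12.0, 11.0, 13.0]
reduced = [mean(original[i:i + WINDOW]) for i in range(0, len(original), WINDOW)]
reconstructed = [v for v in reduced for _ in range(WINDOW)]

r = pearson(original, reconstructed)
saving = size_reduction(len(original), len(reduced))
print(round(r, 2), saving)  # 0.97 0.75
```

A larger window increases the saving but usually lowers the correlation, which is the trade-off the comparison of the four selection methods explores.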
Multi Resolution Data Communication
Deciding on the window length
− How to represent the different window lengths?
− To reconstruct the data, the window length of each segment has to be known, as there is no longer a constant window length. Therefore we introduce a multi-resolution message that reflects the different window lengths.
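One possible shape for such a message is a list of (window length, mean) pairs, sketched below. The actual message format used in the work is not given in the presentation, so the layout and the function names here are hypothetical.

```python
from statistics import mean

def encode(series, window_lengths):
    """Build a hypothetical multi-resolution message: (window length, mean) pairs."""
    message, start = [], 0
    for w in window_lengths:
        segment = series[start:start + w]
        if not segment:
            break  # ran out of samples before using every window
        message.append((len(segment), mean(segment)))
        start += w
    return message

def decode(message):
    """Rebuild a series of the original length by repeating each segment mean."""
    return [value for length, value in message for _ in range(length)]

# A calm first half sent as one window, an active second half as two windows.
series = [5.0, 5.0, 5.0, 5.0, 1.0, 3.0, 8.0, 10.0]
message = encode(series, window_lengths=[4, 2, 2])
restored = decode(message)
print(message)  # [(4, 5.0), (2, 2.0), (2, 9.0)]
```

Carrying the length with each value costs extra bytes per segment, so the saving from larger windows has to outweigh that overhead.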
Implementation results
− We ran our method on a data set consisting of 55000 samples.
− Based on the variance, a different window size is chosen, as shown below:
Correlation and data size evaluation
Conclusions
− We use a SAX-based technique to reduce the size of data communication from WSN nodes to the gateways.
− The method uses a variance function and a variable set of window sizes.
− For data with higher activity, smaller window sizes are chosen (assuming the SAX pattern size is fixed).
− For data with less activity, a larger window size is chosen.
− The initial thresholds are defined by processing a set of existing samples.
− We have presented evaluation results based on size and correlation for a sample streaming sensor data set.
− Limitations and future work:
− Changing the size of SAX patterns (variable string size)
− Adjusting the thresholds over time
− Deciding on the number and size of the windows based on the characteristics of the data.
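The conclusions state that the initial thresholds come from processing a set of existing samples, without saying how. One possible approach, shown here purely as an assumption, is to split the historical samples into frames, compute the variance of each frame, and use quantiles of those variances as thresholds. The function name and parameters are hypothetical.

```python
from statistics import pvariance, quantiles

def initial_thresholds(history, frame_len, levels=3):
    """Derive variance thresholds from historical samples (assumed approach).

    Splits history into non-overlapping frames, computes each frame's
    variance, and returns levels - 1 cut points that divide those
    variances into equally populated groups.
    """
    frames = [history[i:i + frame_len]
              for i in range(0, len(history) - frame_len + 1, frame_len)]
    variances = [pvariance(f) for f in frames]
    return quantiles(variances, n=levels)

# Illustrative history: four frames with increasing activity.
history = [0, 0, 0, 0,  0, 2, 0, 2,  0, 4, 0, 4,  0, 8, 0, 8]
thresholds = initial_thresholds(history, frame_len=4)
print(len(thresholds))  # 2
```

Three granularity levels need two thresholds. Recomputing them periodically on recent data would be one way to approach the "adjusting the thresholds over time" item listed as future work.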
Q&A
− Thank you.
− CityPulse Project:
− http://www.ict-citypulse.eu/
− Twitter: @ictcitypulse
− Supported by: