Information extraction from large scale sensor networks
• Network data collection
• Dimensionality reduction
• dwMDS cooperative self-localization
• GEM anomaly detection
• Conclusions
Alfred Hero, University of Michigan, Ann Arbor MI, USA
IPAM Jan. 2007
IPAM, Jan. 2007 © 2007 Alfred Hero Slide 2
Acknowledgements
Those past and present in my networking group:
(PG) Raviv Raich, Neal Patwari, Jose Costa, Doron Blatt, Clyde Shih, Derek Justice
(G) Raghu Rangarajan, Kevin Carter, Xing Zhou
(UG) Adam Pocholsky, Jionglin Wu, Bobby Li
(K12) Panna Felsen, Abiola Adatero
Networking sponsors:
• NSF ITR program (John Cozzens)
• DARPA ISP program (Doug Cochran, Carey Schwartz)
• AFOSR MURI program (John Tagney)
• Motorola (Jim Correal)
• Raytheon (Harry Schmitt)
Networking collaborators: Rob Nowak, Eric Kolaczyk, Mark Crovella, Paul Barford, Demos Teneketzis, Stephane Lafortune, Mark Coates, Mike Rabat, Randy Moses, Bin Yu …
I. Network data collection: active/passive
Sensor pair (active, ping)
Single sensor (passive, netflow)
[Figure: Three node network pairwise RSS. x-axis: time (sec 2); y-axis: Received Signal Strength.]
Sensor network information processing challenges
Collected data is frequently of high dimension
• Response variables Y:
  – 8 x 1,000,000 samples of an rf field
  – 8 x 1024² pixels of a projected image on IR cameras
• Latent variables X:
  – 25 targets, 6 dimensional states, one of 10 labels
  – 1024³ image volume
Limited collection, aggregation and computation infrastructure
• Memory constraints
• Bandwidth constraints
• Inadequate training data for refined model fitting
Energy constraints limit SNR
• Limited transmission power on sensor
• Limited computation power on sensor
Real-time computation is often required
Enabling hypothesis
Most signals exhibit strong correlation across inputs.
Snapshots do not carry independent information.
8x1000 sa/sec spatio-temporal time series evolves in lower dimension than:
[Figure: snapshots at t=1, t=2, t=3, t=4.]
Dimensionality reduction 101
n points in a d-dimensional linear subspace of R^D
Objective: discover intrinsic structure (dimension, hyperplane)
Multidimensional scaling (MDS): Richardson 38, Young & Householder 42, Torgerson 51
Our focus: dimensionality reduction in a networked setting
[Figure: points x1 to x5 plotted against x, y, z axes.]
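A minimal sketch of what "discovering the intrinsic dimension" can mean in the linear case: count the significant singular values of the centered data. The function name `intrinsic_dimension`, the tolerance, and the synthetic plane in R^3 are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def intrinsic_dimension(points, tol=1e-8):
    """Count singular values of the centered data above a relative tolerance."""
    centered = points - points.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

# Hypothetical data: n = 5 points lying in a 2-dimensional subspace of R^3.
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 3))    # two spanning directions in R^3
coords = rng.normal(size=(5, 2))   # coordinates within the plane
points = coords @ basis
dim = intrinsic_dimension(points)  # number of significant directions
```

With noisy data the tolerance would have to be chosen relative to the noise level, which this sketch does not attempt.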
Dimensionality reduction 101
In MDS we observe distances between points
n x n interpoint Euclidean distance matrix:
Can recover X up to rotation/translation by solving
[Figure: points x1 to x5 plotted against x, y, z axes.]
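For points x_1, ..., x_n the interpoint distance matrix has entries d_ij = ||x_i - x_j||. One standard way to recover a configuration from it is classical MDS: double-center the squared distances and take the top eigenvectors. The sketch below assumes that route; it is not necessarily the optimization problem shown on the slide, and `classical_mds` and `distance_matrix` are illustrative names.

```python
import numpy as np

def distance_matrix(X):
    """n x n matrix of Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.linalg.norm(diff, axis=2)

def classical_mds(D, d):
    """Recover n points in R^d from an n x n Euclidean distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # Gram matrix of centered points
    evals, evecs = np.linalg.eigh(B)
    idx = np.argsort(evals)[::-1][:d]          # d largest eigenvalues
    scale = np.sqrt(np.clip(evals[idx], 0.0, None))
    return evecs[:, idx] * scale

# Hypothetical configuration of five points in the plane.
X = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 2.0], [0.0, 2.0], [0.5, 3.0]])
D = distance_matrix(X)
X_hat = classical_mds(D, d=2)
```

`X_hat` generally differs from `X` by a rotation, translation, and possibly a reflection, but its interpoint distances match `D`.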
Application to cooperative self-localization
Use measurements made between pairs of unknown-location devices to self localize
[Figure: network of wireless sensors with nodes labeled 1 to 9 and A to C. Legend: Unknown Location, Known Location.]
Time-of-Arrival (TOA)
Angle-of-Arrival (AOA)
Received Signal Strength (RSS)
Connectivity (Proximity)
Quantized RSS (QRSS)
Passive spatial correlation
Decentralized computation: scalable
Pairwise measurement modes
For TOA probing, the measurements are
while for RSS probing, the measurements are
where the path loss exponent is
(PatwariHPCO:03)
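A common form of these two models, assumed here following the cited PatwariHPCO:03 paper, is Gaussian TOA with mean d_ij/c, and Gaussian RSS in dB (log-normal in linear units) with mean P_0 - 10 n_p log10(d_ij/d_0), where n_p is the path loss exponent. The sketch below simulates both. The function names, the reference power `p0_db`, and the noise levels are hypothetical.

```python
import numpy as np

C = 299_792_458.0  # propagation speed (speed of light), m/s

def simulate_toa(d, sigma_t, rng):
    """TOA in seconds: Gaussian around d / c with standard deviation sigma_t."""
    return d / C + sigma_t * rng.normal(size=np.shape(d))

def simulate_rss_db(d, p0_db, n_p, sigma_db, rng, d0=1.0):
    """RSS in dB: Gaussian around the log-distance path loss mean."""
    mean = p0_db - 10.0 * n_p * np.log10(d / d0)
    return mean + sigma_db * rng.normal(size=np.shape(d))

def rss_to_distance(p_db, p0_db, n_p, d0=1.0):
    """Plain inversion of the mean path loss curve (not bias corrected)."""
    return d0 * 10.0 ** ((p0_db - p_db) / (10.0 * n_p))

rng = np.random.default_rng(0)
d = np.array([1.0, 5.0, 10.0])      # hypothetical true distances, meters
rss = simulate_rss_db(d, p0_db=-40.0, n_p=2.3, sigma_db=4.0, rng=rng)
d_from_rss = rss_to_distance(rss, p0_db=-40.0, n_p=2.3)
```

Because the noise is additive in dB, distance errors from RSS are multiplicative, so range errors grow with distance.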
MDS for sensor self localization
[Figure: two panels, "Sensor field" and "MDS data-passage graph".]
Scalable version of MDS?
[Figure: two panels, "Sensor field" and "Local data passage graph", with node label y_i.]
Distributed weighted MDS (CostaPH:06)
LOESS approximation to MDS with weighting w
dwMDS with anchor node regularization
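One plausible reading of the cost named above, assumed here from the cited CostaPH:06 work rather than taken from the slide, is a weighted stress over measured pairs plus a quadratic penalty pulling each unknown node toward prior coordinates: S = 2 Σ_{i≤n} Σ_{j>i} w_ij (δ_ij - ||x_i - x_j||)^2 + Σ_{i≤n} r_i ||x̄_i - x_i||^2. The sketch evaluates that expression; the function and argument names are hypothetical and a single range measurement per pair is assumed.

```python
import numpy as np

def dwmds_cost(X, n_unknown, delta, W, prior_pos, r):
    """
    Weighted stress with a prior (regularization) term.
    X         : (N, dim) coordinates; the first n_unknown rows are unknown
                nodes, the remaining rows are anchors.
    delta     : (N, N) measured ranges.
    W         : (N, N) nonnegative weights; 0 means "no measurement".
    prior_pos : (n_unknown, dim) prior coordinates of the unknown nodes.
    r         : (n_unknown,) prior weights.
    """
    N = X.shape[0]
    cost = 0.0
    for i in range(n_unknown):
        for j in range(i + 1, N):
            if W[i, j] > 0:
                d_ij = np.linalg.norm(X[i] - X[j])
                cost += 2.0 * W[i, j] * (delta[i, j] - d_ij) ** 2
        cost += r[i] * np.sum((prior_pos[i] - X[i]) ** 2)
    return cost

# Hypothetical layout: two unknown nodes followed by two anchors.
X_true = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
delta = np.linalg.norm(X_true[:, None, :] - X_true[None, :, :], axis=2)
W = np.ones((4, 4)) - np.eye(4)
r = np.array([0.5, 0.5])
X_bad = X_true.copy()
X_bad[0] = [0.3, 0.0]
cost_true = dwmds_cost(X_true, 2, delta, W, X_true[:2], r)
cost_bad = dwmds_cost(X_bad, 2, delta, W, X_true[:2], r)
```

Setting `W[i, j]` to zero for out-of-range pairs is what keeps each node's share of the cost dependent on its neighbors only.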
Implementation issues for dwMDS
Stress criterion is non-quadratic.
However, it has a locally additive decomposition,
and each summand has the “local data passage” property.
Optimization transfer algorithm:
Iterative/distributed dwMDS algorithm
For each local stress, one can define a quadratic surrogate function Q.
Monotone iteration
[Figure: Cost functions and surrogate for various iterations. x-axis: x-coordinate of one sensor; y-axis: cost function. Legend: cost function at iteration k, surrogate at iteration k, original cost function.]
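The surrogate idea can be sketched for a single node with its neighbors held fixed. The local stress Σ_j w_j (δ_j - ||x - x_j||)^2 is bounded above by a quadratic that touches it at the current iterate (using ||x - x_j|| ≥ (x - x_j)·u_j for the unit vector u_j from x_j to the current iterate), and minimizing that quadratic gives a closed-form update. This is a generic majorization step under those assumptions, not the exact update of the cited work; all names are illustrative.

```python
import numpy as np

def local_cost(x, neighbors, delta, w):
    """Local stress of one node given fixed neighbor positions."""
    d = np.linalg.norm(neighbors - x, axis=1)
    return float(np.sum(w * (delta - d) ** 2))

def majorize_step(x, neighbors, delta, w, eps=1e-12):
    """Minimize the quadratic surrogate built at the current iterate x."""
    diff = x - neighbors
    d = np.maximum(np.linalg.norm(diff, axis=1), eps)   # avoid divide by zero
    # Each neighbor "votes" for the point at range delta_j along its direction.
    targets = neighbors + (delta / d)[:, None] * diff
    return (w[:, None] * targets).sum(axis=0) / w.sum()

# Hypothetical example: three fixed neighbors and exact ranges to (1, 1).
neighbors = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])
x_true = np.array([1.0, 1.0])
delta = np.linalg.norm(neighbors - x_true, axis=1)
w = np.ones(3)

x = np.array([3.0, 3.0])                                # poor initial guess
costs = [local_cost(x, neighbors, delta, w)]
for _ in range(200):
    x = majorize_step(x, neighbors, delta, w)
    costs.append(local_cost(x, neighbors, delta, w))
```

Each step needs only the node's own ranges and its neighbors' current coordinates, which is the local data passage property; the recorded `costs` never increase.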
dwMDS simulation: RSS measurements
Key: 1-σ uncertainty ellipses (estimator, CRB); actual location; estimator mean; reference device.
When initialized with an NN oracle, dwMDS is unbiased and comes close to the CRB.
Without an oracle, NNs are estimated by in-range neighbors. First-stage dwMDS location estimates have high bias.
Two-stage dwMDS attains performance similar to single-stage dwMDS with an NN oracle.
Data simulated with path loss exponent 2.3 and a log-normal measurement model.
dwMDS experiments
Challenging outdoor propagation environment
Experiment 1 (no freq hopping) Experiment 3 (freq hopping)
dwMDS embedded implementation
• PHY: 915 MHz FSK transceivers
• MAC: Carrier sense (unscheduled)
• Path loss integer recorded with each reception
• Network: Token-passing in a cycle
  – Node k transmits to node k+1 to pass token
  – All nodes in range timestamp any token passing
    • (Node id, timestamp of most recent token)
  – Fairness passing: Token is passed to the neighbor which hasn’t had it for the longest time.
  – Low energy passing: In case of tie, choose lowest-path-loss link.
• Transport (cycle reliability): Retransmission ensures continuation of the cycle
• Application: dwMDS
  – Nodes update their own coordinate estimate when they have the token.
  – Random hopping among 16 frequencies (requires coarse sync)
[Diagram: Task 1, robust token passing; Task 2, dwMDS calculation.]
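The two token-passing rules above (fairness first, lowest path loss as tie-break) can be sketched as a single selection function. The data structures and the name `next_token_holder` are hypothetical, and treating a neighbor that has never been seen holding the token as having waited longest is an assumption.

```python
def next_token_holder(current, neighbors, last_token_time, path_loss):
    """
    Pick the neighbor that should receive the token next.
    neighbors       : node ids in range of `current`
    last_token_time : dict node id -> timestamp of its most recent token
    path_loss       : dict (current, neighbor) -> recorded path loss integer
    """
    def priority(n):
        # Older timestamp wins (fairness); lower path loss breaks ties.
        return (last_token_time.get(n, float("-inf")), path_loss[(current, n)])
    return min(neighbors, key=priority)

# Hypothetical state at node 1: nodes 3 and 4 tie on waiting time.
times = {2: 10.0, 3: 5.0, 4: 5.0}
loss = {(1, 2): 30, (1, 3): 50, (1, 4): 40}
chosen = next_token_holder(1, [2, 3, 4], times, loss)
```

Retransmission on failure (the transport rule) would wrap this choice in a retry loop, which the sketch omits.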
Experiment 2 and 3 Results
Experiment 2 RMSE = 25.6 cm
Experiment 3 RMSE = 55.3 cm
Key (both plots): reference / anchor nodes; actual node coordinate; final estimated node coordinate.
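For reference, a minimal sketch of a localization RMSE, assuming it is the root of the mean squared Euclidean error over nodes; the slides do not state which RMSE convention was used, and the function name is illustrative.

```python
import numpy as np

def localization_rmse(estimated, actual):
    """Root mean squared Euclidean error between estimated and true positions."""
    err = np.linalg.norm(np.asarray(estimated) - np.asarray(actual), axis=1)
    return float(np.sqrt(np.mean(err ** 2)))

# Hypothetical two-node example: one node off by (3, 4), one exact.
rmse = localization_rmse([[3.0, 4.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]])
```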
GEM activity detection
• 14 Mica2 motes randomly distributed inside and outside lab
• 14*13=182 pairs of RSSI measurements over 30 minute period
• 1 sample acquired every ½ sec.
• TDMA broadcast of 24 measurements every 12 secs
• Students walk into and out of lab at random times over period
• Positions of motes unknown
• Webcam recorded activity for ground truth
[Figure: Experiment.]
Traces of 182 RSSI signals
[Figure: Intersensor RSS measurements. Top: ground truth anomaly indicator. x-axis: time sample/0.5sec; y-axis: RSS integer.]
RSS pairwise scatter
[Figure: 2D projections of nominal (no activity) 182D RSS. Both axes: sensor pair.]
Anomaly detection and level sets
[Figure: two panels, "Nominal probability density function" and "Thresholded density function".]
Prob =
Upper Epigraph = acceptance region
Lower Epigraph = rejection region
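When the nominal density is known, the thresholding idea can be sketched as follows: pick the density threshold so that the acceptance region (density above the threshold) holds a chosen probability mass. Writing that mass as 1 - alpha, and using a standard Gaussian as the nominal density, are assumptions for illustration only.

```python
import numpy as np

def gaussian_pdf(x):
    """Standard normal density, used here as a stand-in nominal density."""
    return np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)

def density_threshold(density_at_samples, alpha):
    """Threshold such that about 1 - alpha of nominal samples lie above it."""
    return np.quantile(density_at_samples, alpha)

rng = np.random.default_rng(0)
samples = rng.normal(size=100_000)             # draws from the nominal density
eta = density_threshold(gaussian_pdf(samples), alpha=0.05)
accept_fraction = np.mean(gaussian_pdf(samples) >= eta)
```

A test point is accepted as nominal when its density is at least `eta`; for this Gaussian that corresponds to roughly |x| ≤ 1.96.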
GEM
Level set is equivalent to:
• Minimum volume set of specified probability
• Minimum entropy set of specified probability
Geometric entropy minimization (GEM) provides estimation of epigraphs using minimal graphs
• Density estimation not required
• K-point MST
• K-point kNNG
Asymptotic performance analysis of GEM
GEM anomaly detection algorithm
GEM anomaly detection:
• Training: for a large set of n training samples (assumed nominal), construct a k-MST over the training samples with k = (1-α)n points, 0 < α < 1.
• Test: for a singleton test sample, merge the test and training samples together. Declare an anomaly at level α if the k-MST does not “capture” the test sample.
Example: nominal bivariate Gaussian mixture density
[Figure: New point X is in capture region of k-MST.]
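Constructing an exact k-MST is combinatorially hard, so the sketch below swaps in a simpler nearest-neighbor surrogate in the spirit of the kNN graph variant: each nominal sample gets a leave-one-out score (sum of distances to its k nearest neighbors), and a test point is flagged when its score exceeds the (1 - alpha) quantile of those scores. This is an illustrative stand-in, not the GEM algorithm itself; the names, `k`, and `alpha` are assumptions.

```python
import numpy as np

def knn_score(point, reference, k):
    """Sum of distances from `point` to its k nearest points in `reference`."""
    d = np.sort(np.linalg.norm(reference - np.asarray(point), axis=1))
    return float(d[:k].sum())

def fit_threshold(train, k, alpha):
    """Leave-one-out scores on nominal data, thresholded at the 1 - alpha quantile."""
    scores = np.array([
        knn_score(train[i], np.delete(train, i, axis=0), k)
        for i in range(len(train))
    ])
    return float(np.quantile(scores, 1.0 - alpha))

def is_anomaly(x, train, k, threshold):
    """Flag x when it sits in a sparser region than most nominal samples."""
    return knn_score(x, train, k) > threshold

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 2))          # hypothetical nominal samples
threshold = fit_threshold(train, k=5, alpha=0.05)
far_flag = is_anomaly([8.0, 8.0], train, 5, threshold)
center_flag = is_anomaly([0.0, 0.0], train, 5, threshold)
```

No density estimate is formed at any point; only interpoint distances are used.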
GEM vs. UMP test
Experimental results: Pfa=0.25%
[Figure: L1O kNN scores vs. sample number, with training segment marked. rho=0.99751, Pf=0.0046247, detection rate=0.090186. y-axis: score.]
Experimental results: Pfa=1%
[Figure: L1O kNN scores vs. sample number. rho=0.9901, Pf=0.017088, detection rate=0.20424. y-axis: score.]
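Quantities like the reported Pf and detection rate can be computed from per-sample scores and a ground-truth indicator. The sketch assumes that a larger score means "more anomalous" and that an alarm is raised when the score exceeds a threshold; the slides do not spell out either convention, and the names and toy numbers are hypothetical.

```python
import numpy as np

def empirical_rates(scores, threshold, is_anomalous):
    """Return (false alarm rate, detection rate) for a score threshold."""
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(is_anomalous, dtype=bool)
    alarms = scores > threshold
    p_fa = float(np.mean(alarms[~truth]))   # alarms among nominal samples
    p_d = float(np.mean(alarms[truth]))     # alarms among anomalous samples
    return p_fa, p_d

# Hypothetical scores: four nominal samples followed by three anomalous ones.
scores = [0.1, 0.2, 0.9, 0.3, 0.8, 0.7, 0.4]
truth = [False, False, False, False, True, True, True]
p_fa, p_d = empirical_rates(scores, 0.5, truth)
```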
Conclusions
Non-parametric framework for reliable extraction of information from high dimensional data
• Model free algorithms
• Robust performance
• Scalability through local implementation
dwMDS: dimensionality reduction
GEM: anomaly detection
References
J. Costa, N. Patwari and A. O. Hero, "Distributed multidimensional scaling with adaptive weighting for node localization in sensor networks," ACM Journal on Sensor Networking, vol. 2, no. 1, pp. 39-64, Feb. 2006.
N. Patwari, A. O. Hero and J. Costa, "Learning Sensor Location from Signal Strength and Connectivity," in Secure Localization and Time Synchronization for Wireless Sensor and Ad Hoc Networks, Eds. Radha Poovendran, Cliff Wang, and Sumit Roy, Advances in Information Security series, vol. 30, Springer, Dec. 2006, ISBN 978-0-387-32721-1.
N. Patwari and A. O. Hero III, "Demonstrating Distributed Signal Strength Location Estimation," in Proceedings of the 4th ACM Conference on Embedded Networked Sensor Systems (SenSys06), Boulder, CO, November 1-3, 2006.
N. Patwari and A. O. Hero, "Manifold learning visualization of network traffic data," SIGCOMM 2005 Workshop on Mining Network Data, Philadelphia, Aug. 2005.
N. Patwari, A. O. Hero, M. Perkins, N. S. Correal and R. J. O'Dea, "Relative Location Estimation in Wireless Sensor Networks," IEEE Trans. on Signal Processing, vol. 51, no. 8, pp. 2137-2148, Aug. 2003.