Towards a Model-Based Data Collection Framework for Environmental Monitoring Networks Research...

Towards a Model-Based Data Collection Framework for Environmental Monitoring Networks Research Proposal Jayant Gupchup Department of Computer Science, Johns Hopkins University


  • Towards a Model-Based Data Collection Framework for Environmental Monitoring Networks

    Research Proposal

    Jayant Gupchup

    Department of Computer Science, Johns Hopkins University

  • Background II (motes)

    Mote components: computing, storage, communication (radio), sensors

    "Sending one packet costs the same energy as thousands of CPU cycles" (Matt Welsh, Harvard)

    Sensor                                 Power
    Barometric Pressure                    10 µW
    Humidity/Temperature                   80 µW
    Soil Moisture                          19.6 mW

    Component                              Power
    Radio (CC2420) RX                      38 mW
    Radio TX                               35 mW
    Microcontroller (TI MSP430, 8 MHz)     3 mW

  • All data are not equal

  • Task list

    Define informative periods

    Algorithm: Find informative (or interesting) periods

    Algorithm: Sampling planner based on the interesting periods

    Evaluation

  • Initial Direction & Main Results

    Principal Component Analysis (PCA) based approach

    Classification-based approach towards detecting events.

  • PCA based approach: Motivation

    Observations:

    Well-behaved days show a typical signature (bell-shaped pattern)

    Rainy days (or periods) deviate from this signature

    Strong trend component from one day to the next

    Diurnal and trend features are seen in most environmental modalities

    PCA is good at capturing variation in a collection of similar curves
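    A minimal sketch of the idea, on synthetic data: fit a mean curve and one principal component to bell-shaped days, then compare the residual of a well-behaved day with that of a day that deviates from the signature. The curve shape, noise level, amplitudes, and the 6-degree drop are all assumptions made for illustration; none come from the deployment data. Power iteration stands in for a full eigendecomposition.

```python
import math
import random

HOURS = 24

def bell_day(amplitude, rng, noise=0.3):
    """Toy diurnal curve: bell-shaped warm-up peaking at 14:00 plus sensor noise."""
    return [15.0 + amplitude * math.exp(-((h - 14) ** 2) / 18.0) + rng.gauss(0, noise)
            for h in range(HOURS)]

def fit_model(days, iters=100):
    """Return (mean curve, leading principal component) via power iteration."""
    mean = [sum(col) / len(days) for col in zip(*days)]
    centered = [[x - m for x, m in zip(day, mean)] for day in days]
    v = [1.0 / math.sqrt(HOURS)] * HOURS
    for _ in range(iters):
        # Apply the (unnormalized) covariance: w = sum_k (x_k . v) x_k
        w = [0.0] * HOURS
        for row in centered:
            score = sum(r * c for r, c in zip(row, v))
            w = [wi + score * r for wi, r in zip(w, row)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return mean, v

def residual(day, mean, v):
    """Norm of what is left after projecting the centered day onto v."""
    c = [x - m for x, m in zip(day, mean)]
    score = sum(ci * vi for ci, vi in zip(c, v))
    return math.sqrt(sum((ci - score * vi) ** 2 for ci, vi in zip(c, v)))

rng = random.Random(0)
train = [bell_day(rng.uniform(6, 14), rng) for _ in range(30)]
mean, v = fit_model(train)

normal = bell_day(10.0, rng)
# Hypothetical rain event: temperature drops by 6 degrees from 15:00 onward
event = [x - 6.0 if h >= 15 else x for h, x in enumerate(bell_day(10.0, rng))]
print(round(residual(normal, mean, v), 2), round(residual(event, mean, v), 2))
```

    The well-behaved day leaves a residual on the order of the noise, while the event day leaves a much larger one, because the drop does not lie along the bell-shaped component.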

  • PCA Toy Example

  • Eigenmodes for Air Temperature

  • Discriminating event days from well-behaved days [5]

    Well-behaved days: Fits model well

    Event day: Large residuals

    Precision    Recall
    51.28%       80%

    [5] : J. Gupchup, R. Burns, A. Terzis, and A. Szalay, Model-Based Event Detection in Wireless Sensor Networks, Proceedings of Workshop on Data Sharing and Interoperability on the World-Wide Sensor Web (DSI), ACM/IEEE, 2007
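    For reading the two figures: precision is the fraction of flagged days that were real events, and recall is the fraction of real events that were flagged. The sketch below uses hypothetical counts, chosen only because they reproduce the reported percentages; the actual counts are not given here.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Hypothetical counts (not from the source): 20 true detections,
# 19 false alarms, 5 missed events.
p, r = precision_recall(tp=20, fp=19, fn=5)
print(f"precision={p:.2%} recall={r:.2%}")  # precision=51.28% recall=80.00%
```

    Under these counts, roughly every second flagged day is a false alarm, while four out of five event days are caught.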

  • Offline to Online

    Offline
    Basis locked from midnight to midnight
    Access to complete 24 hour signal

    Online
    Access to signal up to the current hour d
    Basis locked from hour d to hour d
    Vectors cyclically shifted by d
    Eigenvalues remain the same
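    Why the eigenvalues survive the shift can be sketched as follows, under the assumption that re-windowing from hour d to hour d amounts to applying a cyclic permutation P_d to every (centered) day vector. The symbols N (number of days), C (covariance), Λ, and P_d are introduced here for illustration.

```latex
C = \frac{1}{N-1} X^{\top} X = V \Lambda V^{\top},
\qquad X_d = X P_d, \qquad P_d^{\top} P_d = I

C_d = \frac{1}{N-1} X_d^{\top} X_d
    = P_d^{\top} C \, P_d
    = \left(P_d^{\top} V\right) \Lambda \left(P_d^{\top} V\right)^{\top}
```

    So the basis for the shifted window is the original basis with its entries cyclically shifted, and Λ is unchanged.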

  • Online Prediction Residuals

  • Summary

    PCA model effective in finding informative periods

    Need to know the shift value, d: Sundial [6]

    But why not use Barometric Pressure too?

    [6] : Jayant Gupchup, Razvan Musaloiu-E., Alex Szalay, Andreas Terzis. Sundial: Using Sunlight to Reconstruct Global Timestamps, To appear in the proceedings of the 6th European Conference on Wireless Sensor Networks (EWSN 2009)

  • Classification-Based Approach

    2-class problem {Rainy, Sunny}

    Most classifiers provide probabilities

    Sample based on those probabilities
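    One way to read "sample based on those probabilities", as a sketch only: either shorten the sampling interval as P(Rainy) rises, or flip a biased coin each period. The function names, interval limits, and the 5% floor are hypothetical choices, not part of the proposal.

```python
import random

def sampling_interval(p_event, min_interval_s=30, max_interval_s=1800):
    """Map P(event) to a sampling interval: higher probability -> shorter interval."""
    p = min(max(p_event, 0.0), 1.0)
    return max_interval_s - p * (max_interval_s - min_interval_s)

def should_sample(p_event, rng, floor=0.05):
    """Bernoulli sampling: sample with probability max(p_event, floor).

    The floor keeps a trickle of samples even when the classifier is confident
    that nothing is happening.
    """
    return rng.random() < max(p_event, floor)

rng = random.Random(42)
for p in (0.1, 0.5, 0.9):
    taken = sum(should_sample(p, rng) for _ in range(1000))
    print(p, sampling_interval(p), taken)
```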

  • Future Work - I

    Task 1: Model Improvement
    Study effect (or correlation) of event magnitude and inter-arrival time
    Explore Incremental and Robust PCA [7], [8]
    Explore label-based classifiers
    Combine Air Temp, Barometric Pressure and Light modalities (joint work with Zhiliang Ma, Dept. of Applied Math and Statistics)

    Task 2: Sampling Planner
    Prediction error and/or Probability of Event (PoE)
    Neighbor opinion(s)
    Acquisition cost of each sensor

    [7] : Reliable Eigenspectra for New Generation Surveys, Tamas Budavari, Vivienne Wild, Alexander S. Szalay, Laszlo Dobos, Ching-Wa Yip, MNRAS. Accepted for publication

    [8] : A Robust Classification of Galaxy Spectra: Dealing with Noisy and Incomplete Data, A.J. Connolly, A.S. Szalay, Astronomical Journal
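    A rough sketch of how the three planner inputs from Task 2 might be combined. Everything here is hypothetical: the weights, the linear blend, the greedy cheapest-first selection, and the cost units are placeholders for a design that the proposal leaves open.

```python
def event_score(pred_error, poe, neighbor_poes, w_err=0.4, w_poe=0.4, w_nbr=0.2):
    """Blend normalized prediction error, own PoE, and mean neighbor PoE into [0, 1]."""
    nbr = sum(neighbor_poes) / len(neighbor_poes) if neighbor_poes else 0.0
    err = min(max(pred_error, 0.0), 1.0)
    return w_err * err + w_poe * poe + w_nbr * nbr

def plan(score, sensor_costs, budget):
    """Pick sensors cheapest-first; the spendable budget scales with the score."""
    allowed = score * budget
    chosen, spent = [], 0.0
    for name, cost in sorted(sensor_costs.items(), key=lambda kv: kv[1]):
        if spent + cost <= allowed:
            chosen.append(name)
            spent += cost
    return chosen

# Hypothetical acquisition costs in arbitrary energy units
costs = {"pressure": 1.0, "humidity_temp": 2.0, "soil_moisture": 10.0}
quiet = event_score(pred_error=0.05, poe=0.1, neighbor_poes=[0.1, 0.2])
busy = event_score(pred_error=0.9, poe=0.95, neighbor_poes=[0.9, 1.0])
print(plan(quiet, costs, budget=15.0), plan(busy, costs, budget=15.0))
```

    The point of the sketch is only the shape of the decision: a low score buys the cheap sensors, a high score also justifies the expensive one.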

  • Future Work - II

    Task 3: Evaluation
    Define cost and benefit functions
    Compare proposed approach with existing systems

    Task 4: Application and Extensions
    Identify class of applications where the framework can be used

  • Questions?

  • Overview: Proposed Framework

  • Properties of our PCA model

    Transformation: Y = X*V
    Projected variables are uncorrelated

    Compression / Multi-resolution
    Achieves massive compression
    From previous slide, keeping 4 of 96 values gives a compression ratio of 96/4 = 24X

    Online Basis
    Basis for any hour d to hour d window using cyclic shifting

    Re-projection Error Bounds
    Sum of left-out eigenvalues
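    The first and last properties follow from the eigendecomposition of the covariance. A short derivation, assuming X holds N centered day vectors as rows and V_k holds the first k eigenvectors (N, C, Λ, and V_k are notation introduced here):

```latex
C = \frac{1}{N-1} X^{\top} X = V \Lambda V^{\top},
\qquad Y = X V

\operatorname{Cov}(Y) = \frac{1}{N-1} Y^{\top} Y = V^{\top} C \, V = \Lambda
\quad \text{(diagonal, so the projected variables are uncorrelated)}

\hat{X} = X V_k V_k^{\top},
\qquad
\frac{1}{N-1} \left\lVert X - \hat{X} \right\rVert_F^{2} = \sum_{i > k} \lambda_i
```

    The average squared re-projection error is therefore exactly the sum of the eigenvalues that were left out.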

  • Preliminary Results

    Rain prediction
    Use Barometric Pressure

    Simple linear classifiers perform well

    Classification accuracy approaches 76%

  • Eigenvector 5

  • Online Prediction

  • Literature Survey

    Barbie-Query (BBQ) [1]
    Approximate query answering (range, value queries)
    Sensing cost differential: energy saving opportunities!
    When predictions fall outside the confidence interval, collect samples
    Shortcomings
    NOT collecting long-term environmental data
    Does not consider the role played by events

    PRESTO [2]
    Reduce storage costs => reduce communication costs
    Seasonal AutoRegressive Integrated Moving Average (S-ARIMA) [3] model for predictions
    Model known to node and basestation
    When predictions are within confidence bounds, do not store collected samples
    Basestation can reconstruct missing samples
    Shortcomings
    No adaptive sampling on interesting events

    [1] : Model-Driven Data Acquisition in Sensor Networks; Amol Deshpande, et al. VLDB 2004

    [2] : PRESTO: Feedback-driven Data Management in Sensor Networks; Ming Li, Deepak Ganesan, and Prashant Shenoy; USENIX 2006

    [3] : P.J. Brockwell, R.A. Davis. Introduction to time series and forecasting. 2002.
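    The shared-model idea behind PRESTO can be sketched in a few lines. This is not PRESTO's implementation: a seasonal-naive predictor (same slot one season earlier) is swapped in for S-ARIMA, and a fixed threshold is swapped in for confidence bounds. Function names and the toy readings are hypothetical.

```python
def node_transmissions(readings, season, threshold):
    """Return {index: value} for readings the shared model fails to predict."""
    sent = {}
    known = []  # the node's copy of what the basestation will reconstruct
    for t, x in enumerate(readings):
        pred = known[t - season] if t >= season else None
        if pred is None or abs(x - pred) > threshold:
            sent[t] = x          # model miss (or no history yet): keep the sample
            known.append(x)
        else:
            known.append(pred)   # model hit: the basestation will use the prediction
    return sent

def reconstruct(sent, n, season):
    """Basestation side: fill every suppressed slot with the model's prediction."""
    series = []
    for t in range(n):
        series.append(sent[t] if t in sent else series[t - season])
    return series

readings = [10, 12, 15, 11] * 3
readings[6] = 20  # one anomalous reading breaks the seasonal pattern
sent = node_transmissions(readings, season=4, threshold=1.0)
print(sorted(sent))  # [0, 1, 2, 3, 6, 10]
print(reconstruct(sent, len(readings), season=4) == readings)  # True
```

    Half of the samples are suppressed here, and every reconstructed value is within the threshold of the true reading. Note that the sampling rate itself never changes, which is the shortcoming listed above.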

  • Related Work

    Near-Optimal Sensor Placement [4]
    Find most informative locations to place sensors
    At the same time, keep the network connected
    Solution: Information-theoretic (entropy) & Steiner tree approximation

    Differences
    Focus is finding informative locations in an offline fashion
    Solution addresses spatial variability
    Sampling rate does not change once locations are fixed

    [4] : A. Krause, C. Guestrin, A. Gupta, J. Kleinberg. "Near-optimal Sensor Placements: Maximizing Information while Minimizing Communication Cost". In Proc. of Information Processing in Sensor Networks (IPSN) 2006

    Mention modalities
    Highlight POWER. Merge into next slide by dropping informative periods at this point.
    Try to change the plot. Make titles better
    Define informative periods
    Notation for x and X
    Make this graphical
    Conclusion slide
    Reduce Reduce Reduce!