Data Provenance in Remote Environmental Monitoring
Dr. Christian Skalka, University of Vermont, USA

Data Provenance in Remote Environmental Monitoring (REM)
REM = automated collection of data from the natural environment in remote settings.
Central points:
  Data provenance is fundamental to REM.
    Data source, times, ownership are intrinsic.
  REM hardware and software architectures pose unique challenges for establishing provenance.
    Heterogeneous, distributed, low-power systems.

Outline
Two REM case studies and problem statements:
1. Snowpack monitoring (SnowMAN)
  The SnowMAN project summary.
  Microcosmic provenance issues, challenges.
  SnowMAN provenance “coping mechanisms”.
2. Sagehen Creek Field Station network
  Overview of project setting.
  Macrocosmic provenance issues, challenges.
  Possible approaches to central challenges.

How Much Snow is Out There?
Snow/Water Equivalent (SWE): measurement of water content in snowpack.
  Not the same as snow height.
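
To make the distinction from snow height concrete, here is a minimal sketch using the standard relation SWE = snow depth × (snow density / water density). The function name and the two example densities are illustrative assumptions, not values from the talk.

```python
WATER_DENSITY_KG_M3 = 1000.0  # density of liquid water


def swe_mm(snow_depth_m: float, snow_density_kg_m3: float) -> float:
    """Return snow water equivalent in millimetres of liquid water."""
    return snow_depth_m * (snow_density_kg_m3 / WATER_DENSITY_KG_M3) * 1000.0


# Two snowpacks of the same height can hold very different amounts of water.
# The densities below are illustrative only.
fresh = swe_mm(1.0, 100.0)    # light, recently fallen snow
settled = swe_mm(1.0, 400.0)  # denser, settled snow
print(fresh, settled)
```

Both calls use a 1 m snowpack, yet the denser one holds four times the water, which is why height alone is not enough.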

How Much Snow is Out There?
Regional snowpack profiles are critically important to natural resource planning, public safety.
Real world measurement is complicated by terrain, forest canopies, wind, exposure.
Accurate realtime SWE measurement is a “holy grail” of REM.

The UVM SnowMAN Project
A new approach to SWE measurement
Use modern computer technology for data acquisition and retrieval
A multi-modal approach to SWE approximation
Lightweight, low cost, robust, adaptable
Improved spatial and temporal resolution

Multimodal Sensor Fusion
Algorithms on sensing nodes combine multiple sensing technologies of variable power cost:
1. Snow height via ultrasound (cheap)
2. Snow density via microwave absorption (moderate)
3. Snow density via gamma ray attenuation (expensive)
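
The talk does not spell out the fusion algorithm, so the following is only a sketch of one way a node might spend power in tiers: always take the cheap ultrasound reading, and escalate to the costlier density sensors when something warrants it. The thresholds, the `SamplingPlan` type, and the cause labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SamplingPlan:
    modalities: list  # sensors to fire, cheapest first
    cause: str        # why this plan was chosen, kept for later analysis


def plan_sampling(height_change_cm: float, hours_since_gamma: float) -> SamplingPlan:
    """Pick sensors for this cycle (hypothetical thresholds, not SnowMAN's)."""
    modalities = ["ultrasound"]  # cheap: sampled every cycle
    causes = []
    if abs(height_change_cm) >= 5.0:
        # Snow height moved noticeably: spend moderate power on density.
        modalities.append("microwave")
        causes.append("height_change")
    if hours_since_gamma >= 24.0:
        # Expensive reference measurement, taken rarely.
        modalities.append("gamma")
        causes.append("gamma_recalibration")
    return SamplingPlan(modalities, "+".join(causes) or "routine")


print(plan_sampling(1.0, 2.0))
print(plan_sampling(8.0, 30.0))
```

Recording the cause alongside the chosen sensors is what later makes it possible to ask why a given reading was taken.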

SnowMAN System Architecture
Multiple data gathering-and-processing nodes connected via a Wireless Sensor Network (WSN)
Arduino-based on-site gateway provides datalogging via SD card, data processing
Remote data retrieval via TCP/IP over cellmodem

Provenance Issues in SnowMAN
Data reported by sensors meaningless without provenance information:
  Time of sampling event
  Location of sample
  Type and ADC conversion formula of sensor
Refinement of multimodal fusion algorithm requires history/cause of sampling event.
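
As an illustration of why a bare ADC count is meaningless on its own, the sketch below attaches the provenance items listed above to each reading. The class names, the linear calibration, and the example coordinates are assumptions made for this example, not SnowMAN's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorSpec:
    kind: str      # sensor type, e.g. "ultrasound"
    unit: str      # unit of the converted value
    scale: float   # hypothetical linear calibration: value = scale * raw + offset
    offset: float

    def convert(self, raw_adc: int) -> float:
        """Apply this sensor's ADC conversion formula."""
        return self.scale * raw_adc + self.offset


@dataclass(frozen=True)
class Reading:
    raw_adc: int        # the number the sensor actually reported
    sampled_at: int     # time of sampling event (Unix seconds)
    location: tuple     # (latitude, longitude) of the sample
    sensor: SensorSpec  # type and conversion formula
    cause: str          # what triggered the sampling event

    def value(self) -> float:
        return self.sensor.convert(self.raw_adc)


# Illustrative values only.
ultrasound = SensorSpec(kind="ultrasound", unit="cm", scale=0.5, offset=0.0)
reading = Reading(300, 1700000000, (44.5, -72.8), ultrasound, "routine")
print(reading.value(), reading.sensor.unit)
```

Without the `SensorSpec`, the count 300 could be a height, a density, or a voltage; with it, the same count becomes a value with a unit.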

Provenance Challenges in SnowMAN
Low-bandwidth requirements in WSNs
  Messages must be small, infrequent.
Volatility of low-cost devices
  WSN node failures require data reliability solutions
Heterogeneous network architecture
  Data formats must be converted in network communications
Time synchronization

Managing Provenance in SnowMAN
Reliability ensured by datalogging on gateway, replication within WSN.
  Requires data source, time to be stored with readings.
Provenance information reported with data readings.
  Component of packet format; not onerously large.
Data converted at “protocol boundaries”.
  802.15.4 to RS232 to TCP/IP to SQL.
Time synchronization handled by simple protocols.
  Low precision sufficient; cellmodem provides “true” time.
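
The actual SnowMAN packet format is not given in the slides. As a rough sketch of how provenance can ride along with a reading without becoming large, the hypothetical layout below packs source, sensor type, time, and raw value into a few bytes, and shows a simple clock-offset correction of the kind a gateway with access to "true" time could apply. Field names and widths are assumptions.

```python
import struct

# Hypothetical layout, little-endian with no padding:
#   node_id (uint16), sensor_type (uint8), timestamp (uint32), raw_adc (uint16)
PACKET = struct.Struct("<HBIH")


def encode(node_id: int, sensor_type: int, timestamp: int, raw_adc: int) -> bytes:
    """Pack one reading together with its provenance fields."""
    return PACKET.pack(node_id, sensor_type, timestamp, raw_adc)


def decode(payload: bytes) -> dict:
    """Unpack a payload into named fields."""
    node_id, sensor_type, timestamp, raw_adc = PACKET.unpack(payload)
    return {"node_id": node_id, "sensor_type": sensor_type,
            "timestamp": timestamp, "raw_adc": raw_adc}


def to_true_time(node_timestamp: int, node_clock_now: int, true_time_now: int) -> int:
    """Shift a node timestamp by the offset between node clock and true time."""
    return node_timestamp + (true_time_now - node_clock_now)


payload = encode(node_id=7, sensor_type=1, timestamp=1000, raw_adc=512)
print(PACKET.size, decode(payload))
```

In this layout the provenance fields add 7 bytes to a 2-byte reading. For comparison, a classic IEEE 802.15.4 PHY frame carries at most 127 bytes, so overhead of this size is modest.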

Outstanding Provenance Issues in SnowMAN
How to verify that data is converted properly at protocol boundaries?
How to encode history of multi-modal readings, for analysis and refinement of algorithms?
How to detect errors in data readings, due to sensor, time synchronization, node failure?
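
These questions are left open in the talk. One modest, partial answer to the first and third is sketched below: a round-trip check that the fields of a reading survive two format conversions unchanged, plus simple plausibility checks on the values. The binary layout, the comma-separated serial format, and the check names are all hypothetical.

```python
import struct

FRAME = struct.Struct("<HBIH")  # hypothetical radio-side layout: node, sensor, time, raw


def radio_to_serial(frame: bytes) -> str:
    """Convert a binary radio frame to a text line for a serial link."""
    node, sensor, ts, raw = FRAME.unpack(frame)
    return f"{node},{sensor},{ts},{raw}\n"


def serial_to_row(line: str) -> tuple:
    """Convert a serial text line to a tuple ready for a database insert."""
    node, sensor, ts, raw = (int(field) for field in line.strip().split(","))
    return (node, sensor, ts, raw)


def conversion_preserves(frame: bytes) -> bool:
    """True if all fields survive radio -> serial -> row unchanged."""
    return serial_to_row(radio_to_serial(frame)) == FRAME.unpack(frame)


def plausibility_problems(raw: int, ts: int, last_ts: int, adc_bits: int = 10) -> list:
    """Return labels for readings that cannot be right."""
    problems = []
    if not 0 <= raw < 2 ** adc_bits:
        problems.append("adc_out_of_range")         # possible sensor fault
    if ts < last_ts:
        problems.append("timestamp_went_backwards")  # possible clock or sync fault
    return problems


frame = FRAME.pack(7, 1, 1700000000, 512)
print(conversion_preserves(frame), plausibility_problems(2000, 10, 20))
```

A round-trip test only shows that nothing was lost or reordered for the inputs tried; it does not prove the converters correct in general, which is the harder problem the slide raises.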

REM in Macrocosm: Sagehen Creek Field Station
Sagehen Creek Field Station and Experimental Forest located near Truckee, CA
Research and Teaching Facility of UC Berkeley
9,000 acres of undisturbed wilderness, extensive REM technology

REM in Macrocosm: Sagehen Creek Field Station
Literally hundreds of various sensor devices
  Temperature, wind, humidity
  Streamflow, Stream temperature
  Snow height, SWE
  Video
9 hubs with (programmable) dataloggers, power, wireless transmission
Goal: wireless connectivity to field house and internet, off-site data warehousing
Multiple user, administration groups

Sagehen Creek Field Station

Provenance Issues at Sagehen
Inherits microcosmic issues (time, location, sensor modality essential to data).
Video triggering events should be reported.
Group data ownership now important to report (and maintain through data cycle).
Sagehen provenance should be credited in myriad end-uses of data.
Diagnostics of network functionality and services.

Provenance Challenges at Sagehen
Inherits microcosmic challenges, but:
  Increased sampling rates, network traffic
  Time synchronization much more complex
  GPS auto-location for some sensors, manual for others
  Much greater diversity of devices, communications mediums (wired, wireless)
  More protocol boundaries
  Multimedia

Sagehen Provenance Issues: Scalability
Sagehen network modeled as source-to-sink dataflow, from sensors to end-users.
Sources extensible by user groups
  New sensors, sensor networks (e.g. WSNs)
  New remote datalogging/replication architecture
Sink usable by end-user groups
  Arbitrary visualization technologies
  Diverse research and education applications

Sagehen Network: The Current Reality
Establishing data communications backbone over IEEE 802.11 wireless LAN.
Limited data collection over network (one-hop) via canned proprietary software.
Most data collection being done manually from dataloggers.
Sensors hardwired to dataloggers, no WSNs in the field.
Some one-hop connectivity between hubs.

Sagehen Network: The Vision
Seamless source-to-sink dataflow.
  From sensors in the field to off-site, permanent data warehouse.
  Also accessible onsite at remote hubs (reliable).
Wireless sensor network capabilities in the field.
Attribution of data to source groups and Sagehen.
Easy extensibility of network at source end, to allow addition of new sensors (and WSNs).

Some Ideas for Supporting Provenance in the Sagehen Software Architecture
Treating data like messages on a protocol stack.
Stack defined across device (protocol) boundaries:
  Sensor data is “raw”, collects more provenance information as it moves towards the sink.
  Higher layers of provenance (time, ownership) encapsulate lower layers.
Allows compositional (principled) treatment of cross-protocol data transformation.
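
The protocol-stack idea can be sketched as nested envelopes: each boundary wraps what it received and adds the provenance it knows, leaving inner layers untouched. The layer names, metadata keys, and helper functions below are hypothetical and only illustrate the encapsulation pattern, not a Sagehen implementation.

```python
def wrap(inner, layer: str, **provenance) -> dict:
    """Encapsulate a lower-layer message and attach this layer's provenance."""
    return {"layer": layer, "provenance": provenance, "payload": inner}


def layers(message) -> list:
    """List layer names from outermost to innermost."""
    names = []
    while isinstance(message, dict) and "layer" in message:
        names.append(message["layer"])
        message = message["payload"]
    return names


def raw_payload(message):
    """Strip every envelope and return the original raw datum."""
    while isinstance(message, dict) and "layer" in message:
        message = message["payload"]
    return message


# A raw ADC count gathers provenance as it moves from source to sink.
raw = 512
at_node = wrap(raw, "sensor", sensor_type="ultrasound", node_id=7)
at_hub = wrap(at_node, "hub", received_at=1700000000)
at_sink = wrap(at_hub, "warehouse", owner_group="example-group", site="Sagehen")

print(layers(at_sink), raw_payload(at_sink))
```

Because each layer only adds an envelope, a transformation at one boundary can be reasoned about without knowing what other layers did, which is the compositional property the slide points to.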

Some Ideas for Supporting Provenance in the Sagehen Software Architecture
Watermarking data to establish Sagehen and group ownership.
Easily done for video media.
  Video retrieved only from the internet; watermarking performed on traditional platform.
Watermarking sensor data??
  The need to preserve data may not tolerate traditional techniques.
  In-the-field retrieval requires in-the-field watermarking.

Conclusion
Remote environmental monitoring requires provenance for correct interpretation of data.
REM networks heterogeneous, some components computationally “weak”.
  Power, cost restrictions.
  Protocol hodgepodge!
Adapting to REM environment a unique challenge for provenance in software.

Conclusion
Two case studies:
  SnowMAN: lightweight, low cost SWE monitoring.
  Sagehen Creek Field Station: REM in macrocosm.
http://www.cs.uvm.edu/~skalka
http://sagehen.ucnrs.org/