Design and demonstration of a distributed sensor array for predicting water flow and nitrate flux in...



[Figure 4 (C) plot: daily flow (millions of m3) versus date, Jan-01 to Dec-06. Series: measured flow at USGS 02322500; WAM predicted flow at USGS 02322500.]

[Figure 4 (B) plot: daily flow (millions of m3) versus date, Jan-01 to Dec-06. Series: measured flow at USGS 02321500; WAM predicted flow at USGS 02321500.]

Design and demonstration of a distributed sensor array for predicting water flow and nitrate flux in the Santa Fe River Watershed

W. Graham¹, M. Cohen², J. Delfino³, C. Foster², J. Gulley⁵, J. Heffernan², C. Krekeler⁴, J. Martin⁵, K. McKee¹, K. Nagarajan⁴, T. Rayfield³, C. Slatton⁴, V. Srivastava⁶, R. Thomas⁵

¹Water Institute, ²School of Forest Resources and Conservation, ³Department of Environmental Engineering Sciences, ⁴Department of Electrical and Computer Engineering, ⁵Department of Geological Sciences, ⁶Department of Agricultural and Biological Engineering, University of Florida.

INTRODUCTION & OBJECTIVES OF STUDY

CUAHSI HIS

WATERSHED ASSESSMENT MODEL

Figure 4. Comparison of WAM performance at USGS 02321000 (A), USGS 02321500 (B), and USGS 02322500 (C) using mean parameter values estimated using GLUE technique.

SITE DESCRIPTION

BAYESIAN NETWORK

Time series analysis of observed and WAM predicted behavior

Figure 5. Cross-correlation function values for daily rainfall at Starke and (a) USGS measured flow (left); (b) WAM predicted flow (right).

Figure 6. Auto-correlation function values for measured daily flow (left) and WAM predicted daily flow (right).

Figure 1. Santa Fe River watershed in North Central Florida with USGS gage stations & NCDC raingages used in this study.

The SFRW covers 3,585 km² in North Central Florida. The basin comprises three hydrogeological units: an upper confined region, where the Hawthorne formation separates the Floridan aquifer from the surface system; a lower unconfined region, where the Hawthorne formation is missing; and a middle semi-confined region. Major land uses are pine plantation (32.2%), wetlands (16.2%), upland forest (14.7%), improved pasture (14.0%), urban area (8.8%), and forest regeneration (6.0%). The climate is warm and humid, with a mean annual daytime temperature of 21 °C and mean annual precipitation of 130 to 150 cm.

METHODS

CUAHSI HIS. The SFRW testbed team has implemented the CUAHSI hydrologic information system, including web services for rainfall, groundwater level, water quality and streamflow data from other agencies as well as the investigator-collected nitrate sensor and CTD sonde data. Raw data are available over the internet through web services or several geographical user interfaces.

Physical modeling. The Watershed Assessment Model (WAM), a physically-based hydrologic model, was developed for the SFRW using mission agency data. WAM is designed to predict surface and groundwater flows and nutrient chemistry based on soils, land use, topography and climate.

Bayesian network model. Bayesian networks were developed to fuse diverse spatio-temporal datasets (features) to forecast the behavior of the hydrologic system using probabilistic inference based on historical behavior. Information-theoretic concepts were applied to extract the most informative features for predicting particular behaviors.

Table 1. Statistical measures for WAM performance at three gage locations.

[Figure 6 plots: auto-correlation function (ACF) versus lags (days), 0 to 360. Left panel series: measured flow at USGS 02321000, USGS 02321500 and USGS 02322500, with 95% confidence interval. Right panel series: WAM predicted flow at USGS 02321000, USGS 02321500 and USGS 02322500, with 95% confidence interval.]

[Figure 5 plots: cross-correlation function (CCF) versus lags (days), -360 to 360. Left panel series: raingage at Starke vs. flow at USGS 02321000, USGS 02321500 and USGS 02322500, with 95% confidence interval. Right panel series: rainfall at Starke vs. WAM predicted flow at USGS 02321000, USGS 02321500 and USGS 02322500, with 95% confidence interval.]

The University of Florida WATERS Testbed group has focused research and sensor array development in the Santa Fe River watershed (SFRW) in north-central Florida (Fig. 1). The Santa Fe River crosses the boundary between the confined and unconfined Floridan Aquifer and consequently exhibits significant transitions in the manner and magnitude of surface and ground water interactions, with important implications for the fate and transport of mass and energy in this karst system. Our long-term science goal is to improve predictive understanding of hydrologic flow paths, nutrient sources, and transport and transformation processes within the SFRW. This study focused on characterizing the hydrologic behavior of the SFRW, with these specific objectives:

• Test off-the-shelf high-resolution optical (Satlantic MBARI-ISUS) and cadmium-reduction (YSI 9600) nitrate sensors for background interferences in the high dissolved organic carbon (DOC) Santa Fe River water.

• Use the CUAHSI Hydrologic Information System framework to develop a web-accessible geodatabase of federal, state, local and researcher hydrogeospatial and time series data for use by regional stakeholders and by local and non-local researchers.

• Develop a probabilistic estimation framework focused on predicting surface water flow along the Santa Fe stream network using legacy agency data as well as predictions from simplified deterministic physical models such as the Watershed Assessment Model (WAM).

[Figure 7 diagram node labels: CN, TNN, WAM, F, GWL, Q, R.]

A probabilistic modeling approach was developed to estimate streamflow along the Santa Fe River, complement our WAM analysis, and explore how different data availability scenarios affect flow estimate uncertainty. To that end, a Bayesian network was developed to forecast streamflow along the Santa Fe River using observed streamflow, groundwater level and rainfall time series, and streamflow predictions from WAM. Conditional entropy was used to rank features according to the information gained from their inclusion in the network. Spatially explicit predictions and prediction uncertainties were produced along the river.

Figure 8. Expected value of flow (left) and posterior probability of expected flow (right) at points along the Santa Fe River on October 4, 2005, from the Bayesian model.

Figure 7. Architecture of Bayesian network model for forecasting flow at USGS gage locations.

RESULTS

• WAM under-predicts base flow conditions in the unconfined region (USGS 02322500) and over-predicts base flow conditions in the confined regions (USGS 02321000 and 02321500).

• Time series analysis of observed streamflow shows that response times and persistence increase in the downstream direction as flow accumulates in the river system by both surface and subsurface mechanisms.

• Streamflow response to rainfall shows three peaks: one at approximately 2-5 days attributable to rapid surface contributions to streamflow, one at approximately 22-26 days attributable to groundwater stormflow, and one at approximately 3 months attributable to groundwater base flow.

• Time series analysis results show that WAM is not capturing observed geographic differences in system response times.

• Performance of WAM may be improved by enhancing the flow routing algorithms to more realistically reflect spatial heterogeneity of surface and subsurface velocities, travel paths and travel times to the river.

The deterministic physically-based Watershed Assessment Model (WAM) was developed for the SFRW using mission agency data.

WAM partitions the watershed into cells according to a user-specified cell size. Based on the land use in each cell, WAM applies one of three field-scale models (agricultural, urban, wetland) to generate surface and subsurface flows and related constituents from that cell.

Generated flow is routed to the nearest stream using a travel time based on the topographic distance between each cell and the stream and an assumed constant average surface and subsurface flow velocity.
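The routing idea described above (travel time equals distance divided by a constant velocity) can be illustrated with a minimal sketch. This is not WAM code: the function name `route_to_stream`, the daily time step, the rounding of travel time to whole days, and the example distances and velocity are all assumptions made for illustration. It handles a single flow pathway; surface and subsurface flow would each be routed with their own velocity.

```python
def route_to_stream(runoff_by_cell, distance_m, velocity_m_per_day, n_days):
    """Sum cell runoff at the stream after a distance/velocity travel delay."""
    hydrograph = [0.0] * n_days
    for series, dist in zip(runoff_by_cell, distance_m):
        # Travel time in whole days for this cell (assumed daily time step).
        delay_days = round(dist / velocity_m_per_day)
        for day, volume in enumerate(series):
            arrival = day + delay_days
            if arrival < n_days:  # water arriving after the last day is dropped
                hydrograph[arrival] += volume
    return hydrograph


# Two hypothetical cells: one on the stream bank, one 2 km away.
runoff = [[10.0, 0.0, 0.0, 0.0], [5.0, 0.0, 0.0, 0.0]]
flow = route_to_stream(runoff, distance_m=[0.0, 2000.0],
                       velocity_m_per_day=1000.0, n_days=4)
print(flow)  # [10.0, 0.0, 5.0, 0.0]
```

Because every cell shares one velocity per pathway, differences in arrival time come only from distance to the stream.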

The Generalized Likelihood Uncertainty Estimation (GLUE) procedure was used to estimate the surface and subsurface velocities using USGS streamflow data.
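As a rough illustration of how a GLUE-style estimate can be obtained, the sketch below samples a single parameter from a uniform range, scores each sample with the Nash-Sutcliffe coefficient, keeps the "behavioral" samples above a threshold, and returns their likelihood-weighted mean. The uniform prior, the choice of Nash-Sutcliffe as the likelihood measure, the threshold of 0.5, the toy linear model, and all function names are assumptions; the poster does not state which choices were made for the SFRW, where two velocities (surface and subsurface) were estimated.

```python
import random


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe coefficient: 1 minus error variance over observed variance."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def glue_mean_parameter(model, observed, low, high,
                        n_samples=2000, threshold=0.5, seed=0):
    """Likelihood-weighted mean of behavioral parameter samples."""
    rng = random.Random(seed)
    behavioral = []
    for _ in range(n_samples):
        theta = rng.uniform(low, high)          # sample from the assumed prior
        score = nash_sutcliffe(observed, model(theta))
        if score > threshold:                   # keep behavioral samples only
            behavioral.append((theta, score))
    if not behavioral:
        raise ValueError("no behavioral samples; relax the threshold")
    total = sum(score for _, score in behavioral)
    return sum(theta * score for theta, score in behavioral) / total


# Toy example: flow is a fixed fraction k of rainfall, with true k = 0.3.
rain = [0.0, 5.0, 2.0, 0.0, 8.0, 1.0, 0.0, 3.0]
observed = [0.3 * r for r in rain]
estimate = glue_mean_parameter(lambda k: [k * r for r in rain],
                               observed, low=0.0, high=1.0)
print(round(estimate, 2))
```

The spread of the behavioral samples, not only their mean, is what GLUE normally reports as parameter uncertainty; the sketch returns only the mean to mirror the "mean parameter values" mentioned in the Figure 4 caption.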

In addition to traditional model performance statistics, time series analysis techniques were used to compare observed versus predicted hydrologic behavior throughout the basin.
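The time series comparison relies on auto- and cross-correlation functions. A minimal sketch of how these can be computed for daily series is given below. The normalization by the full-series standard deviations and the white-noise 95% bound of 1.96/sqrt(n) are common conventions assumed here, not details given in the poster, and the rainfall and flow values are made up.

```python
import math


def cross_correlation(x, y, lag):
    """Sample correlation between x[t] and y[t + lag] for equal-length series."""
    if lag < 0:
        # x[t] paired with y[t - |lag|] is y[s] paired with x[s + |lag|].
        return cross_correlation(y, x, -lag)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    scale_x = math.sqrt(sum((v - mean_x) ** 2 for v in x))
    scale_y = math.sqrt(sum((v - mean_y) ** 2 for v in y))
    cov = sum((x[t] - mean_x) * (y[t + lag] - mean_y) for t in range(n - lag))
    return cov / (scale_x * scale_y)


def autocorrelation(x, lag):
    """Auto-correlation is the cross-correlation of a series with itself."""
    return cross_correlation(x, x, lag)


def confidence_bound(n):
    """Approximate 95% bound for the correlation of uncorrelated series."""
    return 1.96 / math.sqrt(n)


# Hypothetical rainfall pulses and a flow response delayed by 3 days.
rain = [0, 0, 5, 0, 0, 0, 2, 0, 0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0]
flow = [0, 0, 0] + rain[:-3]
best_lag = max(range(0, 7), key=lambda k: cross_correlation(rain, flow, k))
print(best_lag)  # 3
```

A peak in the cross-correlation at a positive lag indicates how many days flow trails rainfall, which is how response times such as those listed in the results can be read off.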

RESULTS

CONCLUSIONS


Figure 3. Time series of dissolved oxygen (DO), NO3-N, pH, and specific conductivity (SC) from sensors we are testing in the Ichetucknee Spring system run. Notice diurnal patterns. These data are available in the DASH and Hydroseek web pages.


All WATERS Test Bed sites are beta-testing a standardized hydrologic information system (HIS) and server that store many kinds of water data (and metadata) in a standardized relational database schema and publish them on the World Wide Web.

The Santa Fe Watershed Test Bed currently hosts rainfall, groundwater, water quality and streamflow data from other agencies as well as our team's nitrate sensor and CTD sonde data. All data are accessible to anyone on the internet for visualization in a geographical user interface and are downloadable. The aim of the HIS project is to demonstrate the usefulness of a standardized central database for water data that can advance water research.

Figure 2. Data Access System for Hydrology (DASH) showing Santa Fe River watershed & data locations available for downloading and charting of a station’s time series.

The DASH geographic interface shows a test bed map with stations from multiple data networks whose data can be visualized and downloaded. It accesses data from a server over the WWW using web services, one for each data network. The geographical interface is provided by ArcGIS Server software on the server. Databases are stored in SQL Server 2005. The tool buttons at the top, which enable data query and download, are part of the HIS customization.

Each dataset is stored in a standardized relational database schema called the Observational Data Model (ODM). Tools for importing data to the ODM, analyzing ODM data and manipulating data in the ODM have been created and are well documented.

Santa Fe Test bed DASH = http://ees-his06.ad.ufl.edu/dash

Data also in HydroSeek = http://www.hydroseek.net/search/

Table 3. Statistical measures for Bayesian network model performance at three gage locations.

[Figure 4 (A) plot: daily flow (millions of m3) versus date, Jan-01 to Dec-06. Series: measured flow at USGS 02321000; WAM predicted flow at USGS 02321000.]

Compared with using WAM predictions alone, the Bayesian network improved the Nash-Sutcliffe coefficients and reduced the error entropy.

The Bayesian network produces more accurate predictions in the confined region (USGS 02321000 and USGS 02321500). Prediction uncertainty measures indicate that subsurface interactions in the unconfined region (USGS 02322500) necessitate a more complex network to sufficiently capture flow variability.

R = Rainfall and ET by cell
CN = Curve number by cell
Q = Surface/subsurface runoff by cell
WAM = Predicted WAM streamflow
TNN = Observed nearest-neighbor flow
GWL = Observed groundwater elevation

                                    USGS 02321000   USGS 02321500   USGS 02322500
Calibration (Jan '01 - Dec '04)
  NSC                               0.78            0.92            0.67
  E                                 1.71            1.83            3.09
Validation (Jan '05 - Dec '06)
  NSC                               0.61            0.70            0.27
  E                                 2.07            2.80            2.08

NSC is the Nash-Sutcliffe coefficient of model efficiency. E is the entropy value for the prediction error distribution.
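The two statistics reported in the performance tables can be computed as in the sketch below. The Nash-Sutcliffe coefficient follows its standard definition. For the error entropy, the poster gives only the general form, so the histogram estimate of the error distribution, the bin width, and the base-2 logarithm are assumptions, and the flow values are invented.

```python
import math
from collections import Counter


def nash_sutcliffe(obs, sim):
    """1 - sum of squared errors / sum of squared deviations from the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def error_entropy(obs, sim, bin_width, base=2.0):
    """Shannon entropy of the binned prediction errors (observed minus simulated)."""
    errors = [o - s for o, s in zip(obs, sim)]
    # Assumed estimator: equal-width histogram bins give the probabilities p(e_i).
    bins = Counter(math.floor(e / bin_width) for e in errors)
    n = len(errors)
    return -sum((c / n) * math.log(c / n, base) for c in bins.values())


obs = [1.0, 2.0, 3.0, 4.0]
perfect = list(obs)          # a model that reproduces the observations
mean_only = [2.5] * 4        # a model that always predicts the observed mean

print(nash_sutcliffe(obs, perfect), error_entropy(obs, perfect, bin_width=1.0))
print(nash_sutcliffe(obs, mean_only), error_entropy(obs, mean_only, bin_width=1.0))
```

A perfect model scores NSC = 1 with zero error entropy, while the mean-only model scores NSC = 0 and spreads its errors evenly over four bins. Entropy values depend on the bin width and logarithm base, so they are comparable only when those are held fixed.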

Table 2. Feature ranking based on conditional entropy H(C|F). The set of available features reduces by one with each column, represented by "-" entries.

USGS 02321000
Rank   First feature   Second feature   Third feature
1      TNN: 1.58       GWL: 1.55        WAM: 1.54
2      WAM: 1.83       WAM: 1.56        -
3      GWL: 1.88       -                -

USGS 02321500
Rank   First feature   Second feature   Third feature
1      TNN: 1.90       GWL: 1.79        WAM: 1.77
2      WAM: 1.90       WAM: 1.84        -
3      GWL: 1.92       -                -

USGS 02322500
Rank   First feature   Second feature   Third feature
1      WAM: 2.00       GWL: 1.95        TNN: 1.88
2      GWL: 2.07       TNN: 1.95        -
3      TNN: 2.17       -                -

                                    USGS 02321000   USGS 02321500   USGS 02322500
Calibration (Jan '01 - Dec '04)
  NSC                               0.36            0.45            0.67
  E                                 1.91            2.83            2.51
Validation (Jan '05 - Dec '06)
  NSC                               0.38            0.50            -1.52
  E                                 2.42            1.96            2.18

Estimating error entropy at three gage locations

Estimating mean flow and posterior probabilities of mean flow at points along the river

Conditional entropy (H(C|F)) metric used for feature ranking

$$H(C \mid F) = -\sum_{f \in F} p(f) \sum_{c \in C} p(c \mid f)\, \log p(c \mid f); \qquad f = \text{feature}, \quad c = \text{flow classes}$$
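A minimal sketch of ranking features by conditional entropy is shown below. It estimates the probabilities from counts of paired, already-discretized samples; the discretization, the base-2 logarithm, the function names and the toy data are assumptions rather than details from the poster.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(classes, feature):
    """H(C|F) = -sum_f p(f) sum_c p(c|f) log2 p(c|f), estimated from counts."""
    n = len(classes)
    counts_by_feature = defaultdict(Counter)
    for c, f in zip(classes, feature):
        counts_by_feature[f][c] += 1
    h = 0.0
    for counts in counts_by_feature.values():
        n_f = sum(counts.values())              # samples with this feature value
        for n_cf in counts.values():
            p_c_given_f = n_cf / n_f
            h -= (n_f / n) * p_c_given_f * math.log2(p_c_given_f)
    return h


def rank_features(classes, features):
    """Feature names ordered from most informative (lowest H(C|F)) to least."""
    return sorted(features, key=lambda name: conditional_entropy(classes, features[name]))


# Toy data: the first feature determines the flow class, the second does not.
flow_class = [0, 0, 1, 1]
features = {"TNN": [0, 0, 1, 1], "GWL": [0, 1, 0, 1]}
ranking = rank_features(flow_class, features)
print(ranking)  # ['TNN', 'GWL']
```

To score a second or third feature given ones already selected, the same function can be applied to tuples of feature values, for example `list(zip(features["TNN"], features["GWL"]))`.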

$$\text{Error entropy} = -\sum_{i} p(e_i)\, \log p(e_i), \qquad \text{where } e = \text{obs. flow} - \text{est. flow}$$

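$$P(\text{flow}_i \mid w) = \max_{j} \frac{P(f_j)\; p(\mathrm{WAM}, \mathrm{TNN}, \mathrm{GWL} \mid \text{class}_j, \text{station}_i)}{\sum_{j} P(f_j)\; p(\mathrm{WAM}, \mathrm{TNN}, \mathrm{GWL} \mid \text{class}_j, \text{station}_i)}$$

$$E(\text{flow}_i \mid w) = \max_{j} \frac{E(f_j)\; p(\mathrm{WAM}, \mathrm{TNN}, \mathrm{GWL} \mid \text{class}_j, \text{station}_i)}{\sum_{j} E(f_j)\; p(\mathrm{WAM}, \mathrm{TNN}, \mathrm{GWL} \mid \text{class}_j, \text{station}_i)}$$

$$w = P(\mathrm{WAM}, \mathrm{TNN}, \mathrm{GWL} \mid \text{station}_i), \qquad E(f_j) = \text{mean flow level in class } j, \qquad P(f_j) = \text{prior probability of flow being class } j$$

$$j \in [1, 15], \qquad i \in [2321000,\ 2321500,\ 2322500]$$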
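The posterior calculation can be sketched as a direct application of Bayes' rule over flow classes. In the sketch below the class priors, the class-conditional likelihoods of the observed features, and the class mean flows are supplied as plain lists with invented values; how those likelihoods are estimated is not shown. Reporting the mean flow of the most probable class as the flow estimate is an assumption about how the expected flow is read off, and the function name is hypothetical.

```python
def map_flow_class(priors, likelihoods, mean_flows):
    """Most probable flow class, its posterior probability, and its mean flow.

    priors[j]      : prior probability of flow class j, P(f_j)
    likelihoods[j] : p(WAM, TNN, GWL | class j, station i) for the observed features
    mean_flows[j]  : mean flow level in class j, E(f_j)
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)                       # normalizing denominator
    posterior = [j / evidence for j in joint]
    best = max(range(len(posterior)), key=posterior.__getitem__)
    return best, posterior[best], mean_flows[best]


# Three hypothetical flow classes at one station.
best, prob, flow = map_flow_class(priors=[0.5, 0.3, 0.2],
                                  likelihoods=[0.1, 0.6, 0.3],
                                  mean_flows=[1.0, 4.0, 9.0])
print(best, round(prob, 3), flow)  # 1 0.621 4.0
```

A low maximum posterior means the probability is spread across several classes, which is one way the prediction uncertainty mapped along the river can be interpreted.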