Erin E. Peterson Postdoctoral Research Fellow CSIRO Mathematical and Information Sciences Division

www.csiro.au

Predicting Water Quality Impaired Stream Segments using Landscape-scale Data and a Regional Geostatistical Model

Erin E. Peterson

Postdoctoral Research Fellow

CSIRO Mathematical and Information Sciences Division

March 3, 2006

The work reported here was developed under STAR Research Assistance Agreement CR-829095 awarded by the U.S.

Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by

EPA. EPA does not endorse any products or commercial services mentioned in this presentation.

Space-Time Aquatic Resources Modeling and Analysis Program

This research is funded by

U.S. EPA Science To Achieve Results (STAR) Program Cooperative Agreement # CR-829095


Collaborators

Dr. David M. Theobald, Natural Resource Ecology Lab, Department of Recreation & Tourism, Colorado State University, USA

Dr. N. Scott Urquhart, Department of Statistics, Colorado State University, USA

Dr. Jay M. Ver Hoef, National Marine Mammal Laboratory, Seattle, USA

Andrew A. Merton, Department of Statistics, Colorado State University, USA

Overview

Introduction

Background

Patterns of spatial autocorrelation in stream water chemistry

Visualizing model predictions

Current and future research in

Water Quality Monitoring Goals

Create a regional water quality assessment

Identify water quality impaired stream segments

Purpose

Demonstrate a geostatistical methodology based on

Coarse-scale GIS data

Field surveys

Predict water quality characteristics for stream segments throughout a region

Purpose of Our Research

How are geostatistical models different from traditional statistical models?

Traditional statistical models (non-spatial)

Residual error (ε) is assumed to be uncorrelated

ε = unexplained variability in the data

Geostatistical models

Residual errors are correlated through space

Spatial patterns in residual error resulting from unidentified process(es)

Model spatial structure in the residual error

Explain additional variability in the data

Generate predictions at unobserved sites

$$Y(s) = X(s)\beta + \varepsilon(s)$$

Geostatistical Modelling

Fit an autocovariance function to data

Describes relationship between observations based on separation distance

[Figure: fitted function plotted against separation distance, with the nugget and range marked]

3 Autocovariance Parameters

1) Nugget: variation between sites as separation distance approaches zero

2) Sill: delineated where semivariance asymptotes

3) Range: distance within which spatial autocorrelation occurs
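The three parameters can be illustrated with a short sketch. It assumes an exponential semivariogram, in which the sill is approached asymptotically; the function name and the parameter values are hypothetical and are not taken from the study.

```python
import math

def exponential_semivariogram(h, nugget, sill, range_param):
    """Semivariance at separation distance h under an exponential model.

    nugget: semivariance as h approaches zero
    sill: value the semivariance approaches at large h
    range_param: distance scale of the decay (an assumed parameterization)
    """
    if h < 0:
        raise ValueError("separation distance must be non-negative")
    if h == 0:
        return 0.0  # a site has zero semivariance with itself
    partial_sill = sill - nugget
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))

# Hypothetical parameter values, for illustration only.
for h in (0.0, 1.0, 10.0, 30.0, 100.0):
    value = exponential_semivariogram(h, nugget=0.2, sill=1.0, range_param=10.0)
    print(h, round(value, 3))
```

Under this parameterization the curve reaches about 95% of the way from nugget to sill at three times `range_param`, which is one common way to report a practical range for an exponential model.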

Distance Measures and Spatial Relationships

Straight Line Distance (SLD)

As the crow flies

Symmetric Hydrologic Distance (SHD)

As the fish swims

Weighted asymmetric hydrologic distance (WAHD)

As the water flows

Incorporate flow direction & flow volume

Ver Hoef, J.M., Peterson, E.E., and Theobald, D.M. (2006) Spatial Statistical Models that Use Flow and Stream Distance, Environmental and Ecological Statistics, to appear.

Challenge: Spatial autocovariance models developed for SLD may not be valid for hydrologic distances

– Covariance matrix is not positive definite
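A small numerical check makes the challenge concrete. The sketch below uses a hypothetical four-site network and an assumed range parameter. It applies a Gaussian covariance function, which is valid for straight-line distance, to along-stream distances and inspects the smallest eigenvalue of the resulting matrix.

```python
import numpy as np

def min_eigenvalue(matrix):
    """Smallest eigenvalue of a symmetric matrix; positive means positive definite."""
    return float(np.linalg.eigvalsh(matrix).min())

# Hypothetical toy network: one site at a confluence (index 0) and one site
# 1 km along each of the three segments that meet there (indices 1 to 3).
# Along-stream distance: 1 km to the confluence, 2 km between any two arms.
stream_dist = np.array([
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 2.0, 2.0],
    [1.0, 2.0, 0.0, 2.0],
    [1.0, 2.0, 2.0, 0.0],
])

range_km = 3.0  # assumed range parameter

# Gaussian model: valid for straight-line distance, not guaranteed on a network.
gaussian_cov = np.exp(-(stream_dist / range_km) ** 2)
# Exponential model on the same distances, for comparison.
exponential_cov = np.exp(-stream_dist / range_km)

print("Gaussian min eigenvalue:", min_eigenvalue(gaussian_cov))
print("Exponential min eigenvalue:", min_eigenvalue(exponential_cov))
```

In this configuration the Gaussian matrix has a negative eigenvalue, so it is not a valid covariance matrix, while the exponential matrix stays positive definite. The example only shows that validity has to be checked for each combination of model and distance measure.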

Asymmetric Autocovariance Models for Stream Networks

Ver Hoef, J.M., Peterson, E.E., and Theobald, D.M., Spatial Statistical Models that Use Flow and Stream Distance, Environmental and Ecological Statistics. In Press.

Weighted asymmetric hydrologic distance (WAHD)

Developed by Jay Ver Hoef, National Marine Mammal Laboratory, Seattle, WA, USA

Moving average models

Incorporate flow volume, flow direction, and use hydrologic distance

Positive definite covariance matrices

Evaluate 8 chemical response variables

1. pH measured in the lab (PHLAB)

2. Conductivity (COND) measured in the lab, μmho/cm

3. Dissolved oxygen (DO), mg/l

4. Dissolved organic carbon (DOC), mg/l

5. Nitrate-nitrogen (NO3), mg/l

6. Sulfate (SO4), mg/l

7. Acid neutralizing capacity (ANC), μeq/l

8. Temperature (TEMP), °C

Determine which distance measure is most appropriate: SLD, SHD, WAHD? More than one?

Find the range of spatial autocorrelation

Objectives

Maryland Biological Stream Survey (MBSS) Data

Maryland Department of Natural Resources

Maryland, USA

1995, 1996, 1997

Stratified probability-based random survey design

1st, 2nd, and 3rd order non-tidal streams

955 sites

881 sites after pre-processing

17 interbasins

[Map: study area in Maryland, USA, with Baltimore, Annapolis, Washington D.C., and Chesapeake Bay labeled]

Study Area

Spatial Distribution of MBSS Data

Create data for geostatistical modelling

1. Calculate watershed covariates for each stream segment

2. Calculate separation distances between sites: SLD, SHD, asymmetric hydrologic distance (AHD)

3. Calculate the spatial weights for the WAHD

4. Convert GIS data to a format compatible with statistics software

FLoWS website: http://www.nrel.colostate.edu/projects/starmap

[Figure panels: SHD and AHD]

Functional Linkage of Watersheds and Streams (FLoWS)

Spatial Weights for WAHD

Proportional influence (PI): influence of each neighboring survey site on a downstream survey site

Weighted by catchment area: surrogate for flow volume

1. Calculate the PI of each upstream segment on segment directly downstream

2. Calculate the PI of one survey site on another site

Flow-connected sites

Multiply the segment PIs

Segment PI of A = Watershed Area A / Watershed Area (A + B)

[Figure: stream network showing Watershed Segment A, Watershed Segment B, survey sites, and stream segments]

Site PI = B * D * F * G
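The two steps can be sketched in a few lines. The watershed areas and segment PIs below are hypothetical values chosen for illustration, and the function names are not from FLoWS.

```python
from math import prod

def segment_pi(area, sibling_area):
    """Proportional influence of a segment on the segment directly downstream,
    using watershed area as a surrogate for flow volume."""
    return area / (area + sibling_area)

def site_pi(segment_pis):
    """Proportional influence of an upstream site on a flow-connected
    downstream site: the product of the segment PIs along the path."""
    return prod(segment_pis)

# Hypothetical watershed areas (km^2) for two segments meeting at a confluence.
pi_a = segment_pi(area=30.0, sibling_area=10.0)
pi_b = segment_pi(area=10.0, sibling_area=30.0)
print(pi_a, pi_b, pi_a + pi_b)

# Hypothetical segment PIs standing in for B, D, F, and G.
print(site_pi([0.5, 0.8, 0.25, 1.0]))
```

Because the segment PIs at a confluence sum to one, the influence of upstream sites is partitioned rather than double counted as water moves downstream.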

Data for Geostatistical Modelling

Distance matrices

SLD, SHD, AHD

Spatial weights matrix

Contains flow dependent weights for WAHD

Watershed covariates

Lumped watershed covariates

Mean elevation, % Urban

Observations

MBSS survey sites

Validation Set

Unique for each chemical response variable

Initial Covariate Selection

5 covariates

Model Development

Restricted model space to all possible linear models

4 model sets

Response          Significant Covariates
ANC (μeq/l)       PASTUR, LOWURB, WOODYWET, YR96, YR97
COND (μmho/cm)    HIGHURB, LOWURB, COALMINE, YR96, NORTHING
DOC (mg/l)        WOODYWET, CONIFER, MIXEDFOR, LOWURB, NORTHING
DO (mg/l)         DECIDFOR, HIGHURB, WOODYWET, YR96, YR97
NO3 (mg/l)        PASTUR, PROBCROP, ROWCROP, LOWURB, WATER
pH Lab            PROBCROP, DECIDFOR, WOODYWET, ACREAGE, CONIFER
SO4 (mg/l)        LOWURB, COALMINE, NORTHING, ER67, ER69
TEMP (°C)         PROBCROP, LOWURB, WATER, YR96, YR97

Geostatistical Modeling Methods

Geostatistical model parameter estimation

Maximize the profile log-likelihood function

Geostatistical Modelling Methods

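Log-likelihood function of the parameters ($\theta, \beta, \sigma^2$) given the observed data Z is:

$$\ell(\theta,\beta,\sigma^2;Z) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2}\log|\Sigma_\theta| - \frac{1}{2\sigma^2}(Z-X\beta)'\Sigma_\theta^{-1}(Z-X\beta)$$

Maximizing the log-likelihood with respect to $\beta$ and $\sigma^2$ yields:

$$\hat\beta = (X'\Sigma_\theta^{-1}X)^{-1}X'\Sigma_\theta^{-1}Z$$

$$\hat\sigma^2 = \frac{(Z-X\hat\beta)'\Sigma_\theta^{-1}(Z-X\hat\beta)}{n}$$

Both maximum likelihood estimators can be written as functions of $\theta$ alone

Derive the profile log-likelihood function by substituting the MLEs ($\hat\beta, \hat\sigma^2$) back into the log-likelihood function

$$\ell_{profile}(\theta;\hat\beta,\hat\sigma^2,Z) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\hat\sigma^2) - \frac{1}{2}\log|\Sigma_\theta| - \frac{n}{2}$$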
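The profiling step can be sketched as follows, assuming the covariance of Z is the variance times a correlation matrix that depends only on the autocorrelation parameters. The data are synthetic, the exponential correlation with a single range parameter is an assumption made for brevity, and the grid of range values is arbitrary.

```python
import numpy as np

def profile_log_likelihood(Z, X, corr):
    """Profile log-likelihood for a fixed correlation matrix.

    Returns the profile log-likelihood together with the closed-form
    estimates beta_hat and sigma2_hat.
    """
    n = len(Z)
    corr_inv_X = np.linalg.solve(corr, X)
    corr_inv_Z = np.linalg.solve(corr, Z)
    # Generalized least squares estimate of beta.
    beta_hat = np.linalg.solve(X.T @ corr_inv_X, X.T @ corr_inv_Z)
    resid = Z - X @ beta_hat
    sigma2_hat = float(resid @ np.linalg.solve(corr, resid)) / n
    _, logdet = np.linalg.slogdet(corr)
    loglik = (-0.5 * n * np.log(2 * np.pi) - 0.5 * n * np.log(sigma2_hat)
              - 0.5 * logdet - 0.5 * n)
    return float(loglik), beta_hat, sigma2_hat

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
n = 50
coords = rng.uniform(0, 100, size=(n, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = X @ np.array([2.0, 0.5]) + rng.normal(scale=0.3, size=n)

# Evaluate the profile over a small grid of candidate range values.
for range_km in (5.0, 20.0, 80.0):
    corr = np.exp(-dist / range_km)
    loglik, beta_hat, sigma2_hat = profile_log_likelihood(Z, X, corr)
    print(range_km, round(loglik, 2))
```

In practice a numerical optimizer would replace the grid, but the structure is the same: only the autocorrelation parameters are searched, and the regression coefficients and variance follow in closed form.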

Correlation matrix for SLD and SHD models

Fit exponential autocorrelation function

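$$C_1(h;\theta_1,\theta_2) = \begin{cases} 1 & \text{if } h = 0 \\ (1-\theta_1)\exp(-h/\theta_2) & \text{if } h > 0 \end{cases}$$

where C1 is the correlation based on the distance between two sites, h, given the autocorrelation parameter estimates: nugget ($\theta_0$), sill ($\theta_1$), and range ($\theta_2$).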

Correlation matrix for WAHD model

Fit exponential autocorrelation function (C1)

Hadamard (element-wise) product of C1 & square root of spatial weights matrix forced into symmetry

$$C(s_1,s_2\mid\theta) = \begin{cases} 0 & \text{if locations are not flow connected} \\ C_1(0) & \text{if location 1 = location 2} \\ \prod_{j\in B}\sqrt{w_j}\,C_1(h) & \text{otherwise} \end{cases}$$
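The construction of the WAHD correlation matrix can be sketched with a three-site example. The network, the weights, and the nugget and range values are hypothetical, and the parameterization of the exponential function (a nugget proportion and a range) is an assumption made for illustration.

```python
import numpy as np

def wahd_correlation(hydro_dist, weights, flow_connected, nugget, range_param):
    """Sketch of a WAHD correlation matrix.

    hydro_dist: symmetric matrix of along-stream distances between sites
    weights: spatial weights (site PIs) already forced into symmetry
    flow_connected: boolean matrix, True where two sites are flow connected
    nugget, range_param: assumed parameters of the exponential function
    """
    c1 = (1.0 - nugget) * np.exp(-hydro_dist / range_param)
    corr = c1 * np.sqrt(weights)           # Hadamard (element-wise) product
    corr = np.where(flow_connected, corr, 0.0)
    np.fill_diagonal(corr, 1.0)            # identical locations: correlation of 1
    return corr

# Hypothetical network: sites 0 and 1 sit on two tributaries,
# site 2 is downstream of both.
pi = np.array([[0.0, 0.0, 0.75],
               [0.0, 0.0, 0.25],
               [0.0, 0.0, 0.0]])
weights = pi + pi.T + np.eye(3)            # force into symmetry; diagonal weight of 1
flow_connected = weights > 0
hydro_dist = np.array([[0.0, 6.0, 4.0],
                       [6.0, 0.0, 2.0],
                       [4.0, 2.0, 0.0]])

corr = wahd_correlation(hydro_dist, weights, flow_connected,
                        nugget=0.1, range_param=10.0)
print(np.round(corr, 3))
```

Sites 0 and 1 are on different tributaries, so their correlation is zero even though they are only 6 km apart along the network.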

Model selection between model types

100 Predictions: Universal kriging algorithm

Mean square prediction error (MSPE)

Cannot use AICC to compare models based on different distance measures

Model comparison r2 for observed vs. predicted values
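Both comparison statistics are simple to compute once predictions exist at the validation sites. The sketch below uses hypothetical values and assumes r2 means the squared Pearson correlation between observed and predicted values, which the slides do not state explicitly.

```python
import numpy as np

def mspe(observed, predicted):
    """Mean square prediction error over a set of validation sites."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((observed - predicted) ** 2))

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values
    (an assumed definition of r2)."""
    return float(np.corrcoef(observed, predicted)[0, 1] ** 2)

# Hypothetical validation values (mg/l), not taken from the MBSS data.
observed = [1.2, 1.7, 2.7, 1.9, 0.6]
predicted = [1.4, 1.6, 2.2, 2.0, 0.9]
print(mspe(observed, predicted), r_squared(observed, predicted))
```

Unlike a likelihood-based criterion, both statistics depend only on the predictions, so they can be compared across models built on different distance measures.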

Model selection within model set

GLM: Akaike Information Corrected Criterion (AICC)

Geostatistical models: Spatial AICC (Hoeting et al., in press)

$$AICC = -2\,\ell_{profile}(\theta;\hat\beta,\hat\sigma^2,Z) + 2n\,\frac{p+k+1}{n-p-k-2}$$

where n is the number of observations, p-1 is the number of covariates, and k is the number of autocorrelation parameters.
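A minimal sketch of the criterion, assuming the corrected form in which the penalty is 2n(p + k + 1)/(n - p - k - 2). The function name and the input values are hypothetical.

```python
def spatial_aicc(profile_loglik, n, p, k):
    """Spatial AICC from a maximized profile log-likelihood.

    n: number of observations
    p: number of regression coefficients (p - 1 covariates plus an intercept)
    k: number of autocorrelation parameters
    """
    denom = n - p - k - 2
    if denom <= 0:
        raise ValueError("too few observations for this many parameters")
    return -2.0 * profile_loglik + 2.0 * n * (p + k + 1) / denom

# Hypothetical values: two candidate covariate sets under the same distance measure.
print(spatial_aicc(profile_loglik=-100.0, n=50, p=3, k=2))
print(spatial_aicc(profile_loglik=-99.0, n=50, p=6, k=2))
```

Smaller values are preferred, and the comparison is only meaningful among models fitted to the same data with the same distance measure.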

http://www.stat.colostate.edu/~jah/papers/spavarsel.pdf

Results

Summary statistics for distance measures

Spatial neighborhood differs

Affects number of neighboring sites

Affects median, mean, and maximum separation distance

Summary statistics for distance measures in kilometers using DO (n=826).

Distance Measure                         N Pairs   Min    Median   Mean     Max
Straight Line Distance                   340725    0.05   101.02   118.16   385.53
Symmetric Hydrologic Distance            62625     0.05   156.29   187.10   611.74
Pure Asymmetric Hydrologic Distance *    1117      0.05   4.49     5.83     27.44

* Asymmetric hydrologic distance is not weighted here

Results

Range of spatial autocorrelation differs

Shortest for SLD

TEMP = shortest range values

DO = largest range values

Mean Range Values

SLD = 28.2 km

SHD = 88.03 km

WAHD = 57.8 km

[Figure: range values for ANC, COND, DOC, DO, NO3, PHLAB, SO4, and TEMP; labeled values: 180.79, 301.76]

[Figure: three panels, each comparing the GLM, SL, SH, and WAH models]

Distance Measures

GLM always has less predictive ability

More than one distance measure usually performed well

– SLD, SHD, WAHD: PHLAB & DOC

– SLD and SHD: ANC, DO, NO3

– WAHD & SHD: COND, TEMP

– SLD: SO4

Results

r2 Predictive ability of models

Strong: ANC, COND, DOC, NO3, PHLAB

Weak: DO, TEMP, SO4

[Figure: r2 by model for ANC, COND, DOC, DO, NO3, PHLAB, SO4, and TEMP]

Discussion

Distance measure influences how spatial relationships are represented in a stream network

Site’s relative influence on other sites

Dictates form and size of spatial neighborhood

Important because…

Impacts accuracy of the geostatistical model predictions

[Figure panels: SLD, SHD, WAHD]

Geostatistical models describe more variability than GLM

Patterns of spatial autocorrelation found at relatively coarse scale

More than one distance measure performed well

SLD never substantially inferior

Do not represent movement through network

Different range of spatial autocorrelation?

Larger SHD and WAHD range values

Separation distance larger when restricted to network

SLD, SHD, and WAHD represent spatial autocorrelation in continuous coarse-scale variables

Discussion

Probability-based random survey design negatively affected WAHD

Maximize spatial independence of sites

Does not represent spatial relationships in networks

Validation sites randomly selected

[Figure: histogram of the number of neighboring sites per survey site; x-axis: Number of Neighboring Sites (0 to 17); bar labels include 149, 133, 35, 19, 15, 13, 6, 1, 0]

244 sites did not have neighbors

Sample Size = 881

Number of sites with ≤1 neighbor: 393

Mean number of neighbors per site: 2.81

Discussion

WAHD models explained more variability as neighboring sites increased

Not when neighbors had:

Similar watershed conditions

Significantly different chemical response values

[Figure: WAHD and GLM compared by number of neighboring sites (0 to 17)]

Discussion

GLM predictions improved as number of neighbors increased

Clusters of sites in space have similar watershed conditions

– Statistical regression pulled towards the cluster

GLM contained hidden spatial information

– Explained additional variability in data with more neighbors

[Figure: WAHD and GLM compared by number of neighboring sites (0 to 17)]

Predictive Ability of Geostatistical Models

[Figure: predictive ability shown on a scale from 0 to 1.0; surviving labels: Coarse, ANC]

Conclusions

1) Spatial autocorrelation exists in stream chemistry data at a relatively coarse scale

2) Geostatistical models improve the accuracy of water chemistry predictions

3) Patterns of spatial autocorrelation differ between chemical response variables

Ecological processes acting at different spatial scales affect conditions at the survey site

4) SLD is the most suitable distance measure in Maryland for these chemical response variables at this time

Unsuitable survey designs

SHD: GIS processing time is prohibitive

Conclusions

5) Results are scale specific

Spatial patterns change with survey scale

Other patterns may emerge at shorter separation distances

6) Further research is needed at finer scales

Watershed or small stream network

Demonstrate how a geostatistical methodology can be used to complement regional water quality monitoring efforts

1) Predict regional water quality conditions

2) Identify the spatial location of potentially impaired stream segments

Visualization of Model Predictions

MBSS 1996 DOC

[Map with north arrow and scale bar, 0 to 20 kilometers]

n     Min   1st Qu.   Median   Mean   3rd Qu.   Max    σ2
312   0.6   1.2       1.7      1.9    2.7       15.9   1.8

Spatial Patterns in Model Fit

Squared Prediction Error (SPE)

Generate Model Predictions

Prediction sites

Study area

– 1st, 2nd, and 3rd order non-tidal streams

– 3083 segments = 5973 stream km

ID downstream node of each segment

– Create prediction site

More than one site at each confluence

Generate predictions and prediction variances

SLD Mariah model

Universal kriging algorithm

Assigned predictions and prediction variances back to stream segments in GIS

DOC Predictions (mg/l)

Weak Model Fit

Strong Model Fit

Water Quality Attainment by Stream Kilometres

Threshold values for DOC

Set by Maryland Department of Natural Resources

High DOC values may indicate biological or ecological stress

Threshold   DOC (mg/l)   Stream Kilometers   Percent
Low         < 5.0        5387.67             90.2
Medium      5.0 - 8.0    400.19              6.7
High        > 8.0        185.16              3.1
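A summary of this kind can be produced by classifying each segment's predicted DOC and summing segment lengths. The sketch below uses the three threshold classes from the table with hypothetical segment predictions and lengths. The function name is illustrative, and treating values of exactly 5.0 or 8.0 as Medium is an assumption.

```python
def attainment_by_length(segments, low_max=5.0, high_min=8.0):
    """Sum stream length by DOC class.

    segments: iterable of (predicted_doc_mg_per_l, length_km) pairs
    Classes: Low < 5.0, Medium 5.0 - 8.0, High > 8.0
    Returns {class: (stream_km, percent_of_total)}.
    """
    totals = {"Low": 0.0, "Medium": 0.0, "High": 0.0}
    for doc, length_km in segments:
        if doc < low_max:
            totals["Low"] += length_km
        elif doc <= high_min:
            totals["Medium"] += length_km
        else:
            totals["High"] += length_km
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no stream length to summarize")
    return {name: (km, 100.0 * km / grand_total) for name, km in totals.items()}

# Hypothetical segment predictions (mg/l) and lengths (km).
example_segments = [(1.7, 2.0), (6.0, 1.0), (9.5, 1.0)]
for name, (km, pct) in attainment_by_length(example_segments).items():
    print(name, km, round(pct, 1))
```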

Different ways to capture spatial information

1) Geostatistical models

Attempt to explain spatial relationship between response variables

May represent another ecological process that is affecting them

2) Spatial location of covariates

Does the spatial location of landuse within the watershed affect the response?

Does the spatial configuration of landuse affect the response?

3) Stream network configuration and connectivity

How does the configuration of the network affect the response?

Are stream segments within one network really connected?

Current and Future Research in SEQ

$$Y(s) = \mu(s) + \int_{r} K(|u-s|)\,\frac{\omega(u)}{\omega(s)}\,x(u)\,du$$

$\mu(s)$: mean, constant here but might incorporate other covariates

$\omega(u)/\omega(s)$: weight function for relative stream orders or watershed areas

$x(u)$: independent Gaussian process

$K(|u-s|)$: kernel function, governs spatial dependence

|u-s| = river distance d

Covariance Matched Constrained Kriging (CMCK)

Geostatistical Models

Cressie, N., Frey, J., Harch, B., and Smith, M.: 2006, ‘Spatial Prediction on a River Network’, Journal of Agricultural, Biological, and Environmental Statistics, to appear.

Covariance Matched Constrained Kriging (CMCK)

Combination of distance measures


Geostatistical Models

Invertebrates

Develop geostatistical models

Individual indices and multivariate indicators

Physical/Chemical

Nutrients

Ecosystem Processes

Determine which distance measure(s) to use

One distance measure: SLD, SHD, WAHD

More than one distance measure: CMCK (covariance matched constrained kriging)

Based on statistical evidence, ecological expertise, and survey design

Make model predictions

Geostatistical Models and the EHMP

Spatial Location of Watershed Attributes

Lumped non-spatial watershed attributes

Covariate    Description                                                                 Resolution or scale
AREA         Catchment area (ha)                                                         30 meter
URBAN        % Urban                                                                     30 meter
BARREN       % Barren                                                                    30 meter
WATER        % Open Water                                                                30 meter
CONIFER      % Conifer or evergreen forest type                                          30 meter
DECIDFOR     % Deciduous forest type                                                     30 meter
MIXEDFOR     % Mixed forest type                                                         30 meter
EMERGWET     % Emergent Herbaceous Wetlands                                              30 meter
WOODYWET     % Woody or shrubby wetlands                                                 30 meter
COALMINE     % Coalmine                                                                  30 meter
EASTING      Easting - Albers Equal Area Conic                                           1 foot
NORTHING     Northing - Albers Equal Area Conic                                          1 foot
ER63-ER69    Omernik's Level III Ecoregion                                               1:7,500,000
MEANELEV     Mean elevation in the watershed                                             30 meter
SLOPE        Mean slope in the watershed                                                 30 meter
ARGPERC      % Argillaceous rock type in watershed                                       1:250,000
CARPERC      % Carbonic rock type in watershed                                           1:250,000
FELPERC      % Felsic rock type in watershed                                             1:250,000
MAFPERC      % Mafic rock type in watershed                                              1:250,000
SILPERC      % Siliceous rock type in watershed                                          1:250,000
MEANK        Mean soil erodibility factor in watershed (adjusted for rock fragments)     1 kilometer
MAXTEMP      Mean annual maximum temperature (°C)                                        4 kilometer
MINTEMP      Mean minimum temperature for January - April (°C)                           4 kilometer
PRECIP       Mean precipitation for January - April (mm)                                 4 kilometer
ANPRECIP     Mean annual precipitation                                                   4 kilometer

Buffer streams using straight-line distance

Straight-line distance from stream outlet

Overland hydrologic distance + instream distance to stream outlet

Overland hydrologic distance to stream

Spatial Location of Watershed Attributes

How large or small are patches of landuse?

How complex is the shape?

Is landuse clumped or dissected?

Is landuse adjacent to stream?

Spatial Configuration of Watershed Attributes

Network Configuration

Network Connectivity

[Figure legend: survey site; barrier]

Represent connectivity on a regional scale

Define individual networks

Measure network size and complexity

Network Configuration and Connectivity


Questions? Comments?

Erin E. Peterson

Phone: +61 7 3214 2914

Email: Erin.Peterson@csiro.au
