Kerry J. Ritter Molly Leecaster N. Scott Urquhart Ken Schiff
Two-Phase Sampling Approach for Augmenting Fixed Grid Designs to Improve Local Estimation for Mapping Aquatic Resources
Kerry J. Ritter
Molly Leecaster
N. Scott Urquhart
Ken Schiff
Project Funding
• The work reported here was developed under the STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA. The views expressed here are solely those of the presenter and STARMAP, the Program they represent. EPA does not endorse any products or commercial services mentioned in this presentation.
• Southern California Coastal Water Research Project (SCCWRP)
Background
• Maps of sediment condition are important for making decisions regarding pollutant discharge
• Maps in marine systems are rare
• Special study by San Diego Municipal Wastewater Treatment Plant
• Objective: To build statistically defensible maps of chemical constituents and biological indices around two sewage outfalls
– Point Loma
– South Bay
Point Loma and South Bay Outfalls
TYPICAL DESIGN SITUATION
• Many features of the real situation are unknown.
– Here: The nature of the semivariogram
• Multiple responses: what is a good design for one response may not be a good design for another!
• Time constraint
– Answer was required by this past Monday
Two-Phase Approach
• Phase I: Model spatial variability at various spatial scales (e.g., variogram) – this summer
• Phase II: Use information from Phase I to design a survey that meets accuracy requirements – next summer (2005)
How Should We Add Sites to Existing Grid in Order to Estimate Variogram?
• What is best design configuration?
• More sites with less intensity or fewer sites with more intensity?
• Shorter sample spacing or larger sample spacing?
Variogram
[Figure: schematic variogram, gamma (0 to 2.5) vs. distance (0 to 50), with the nugget, sill, and range marked]
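The three parameters marked on the schematic can be made concrete with a small function. This is a minimal sketch that assumes a spherical model (the form named in the simulation study) and assumes that the sill is the total sill, nugget included; the function and parameter names are illustrative and do not come from the slides.

```python
def spherical_variogram(h, nugget, sill, rng):
    """Semivariance at separation distance h under a spherical model.

    Assumption: `sill` is the total sill (nugget included), so the curve
    jumps to `nugget` just above h = 0 and levels off at `sill` once h
    reaches the range `rng`.
    """
    if h < 0:
        raise ValueError("distance must be non-negative")
    if h == 0:
        return 0.0   # gamma(0) is zero by definition
    if h >= rng:
        return sill  # flat beyond the range
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


# Evaluate the curve at a few distances for nugget 0.10, sill 1, range 2.
for h in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(h, round(spherical_variogram(h, nugget=0.10, sill=1.0, rng=2.0), 4))
```

If the sill were instead meant as a partial sill, the plateau would sit at nugget + sill and the last line of the function would change accordingly.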
Empirical Variograms (Point Loma 2000 Regional Survey)
[Figure: four empirical variogram panels, gamma vs. distance (0 to 8), each annotated with fitted range R, sill S, and nugget N]
• CHROMIUM: R = 5.09, S = 36.27, N = 0.00
• TOC: R = 8.8, S = 0.077, N = 0.0242
• COPPER: R = 2.75, S = 22.53, N = 0.00
• ZINC: R = 6.14, S = 218.55, N = 0.00
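For readers unfamiliar with how an empirical variogram is computed, the sketch below uses the classical (Matheron) estimator: half the mean squared difference over all site pairs in a lag bin. The slides do not say which estimator or binning was used, so this is an assumption, and the function name and arguments are illustrative.

```python
import math


def empirical_variogram(sites, values, bin_width, n_bins):
    """Classical (Matheron) semivariogram estimate per lag bin.

    sites  : list of (x, y) coordinates
    values : list of measurements, in the same order as sites
    Returns a list of (bin_midpoint, gamma_hat, n_pairs); gamma_hat is
    None for bins that contain no pairs.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(sites)):
        for j in range(i + 1, len(sites)):
            d = math.dist(sites[i], sites[j])
            k = int(d // bin_width)  # bin k covers [k*w, (k+1)*w)
            if k < n_bins:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    result = []
    for k in range(n_bins):
        mid = (k + 0.5) * bin_width
        gamma = sums[k] / (2 * counts[k]) if counts[k] else None
        result.append((mid, gamma, counts[k]))
    return result
```

Bins with few pairs give unstable estimates, which is why the number of pairs per lag matters for the design question.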
Lag Distribution
[Figure: panel titled “Variogram”; number of pairs (10 to 50) vs. lag distance (km), 2 to 8]
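A lag distribution is a histogram of pairwise distances between sites. The sketch below counts pairs per lag bin; the 3 x 3 grid with 2 km spacing is a made-up example and does not represent the Point Loma station coordinates.

```python
import math
from collections import Counter


def lag_distribution(sites, bin_width):
    """Count site pairs falling in each lag bin [k*w, (k+1)*w)."""
    counts = Counter()
    for i in range(len(sites)):
        for j in range(i + 1, len(sites)):
            d = math.dist(sites[i], sites[j])
            counts[int(d // bin_width)] += 1
    return dict(sorted(counts.items()))


# Hypothetical 3 x 3 grid with 2 km spacing (illustrative coordinates only).
grid = [(2.0 * i, 2.0 * j) for i in range(3) for j in range(3)]
print(lag_distribution(grid, bin_width=1.0))
```

On a regular grid no pair is closer than the grid spacing, so the bins below that distance stay empty. That gap at short lags is what added clusters are meant to fill.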
Design Considerations for Modeling the Variogram
• Sufficient replication at various spatial scales
– Variogram model
– Parameter estimates
• Adequate spatial coverage
– Stationarity
– Isotropy vs. anisotropy
– Strata
• Allow for multiple responses
Choosing the Best Design
Case Study: Point Loma
• Three design configurations
– S, STAR, and S with satellites
• Two sets of lag classes
– Shorter vs. larger sample spacing
• Compare lag distributions
• Simulation study
– Simulate response
– Consider different models of spatial variability
• Compare relative performance of designs for estimating parameters
“STAR” and “S” Cluster Designs
[Figure: two panels, S DESIGN and STAR DESIGN; Ykm vs. Xkm, both axes 0 to 100]
“S” and “S with Satellites” Design
[Figure: two panels, S DESIGN and S with SATELLITES DESIGN; Ykm vs. Xkm, both axes 0 to 100]
Sample Allocation
Star
• Grid stations = 12
• 5 “STAR” clusters of size 17
– 3 grid stations
– 2 sites of interest
• 1 “S” cluster of size 9
• Field duplicates = 9
• Total samples = 12 + 3*(17-1) + 2*(17) + 9 + 9 = 112
S
• Grid stations = 12
• 11 “S” clusters of size 9
– 5 grid stations
– 6 sites of interest
• Field duplicates = 6
• Total samples = 12 + 5*(9-1) + 6*(9) + 6 = 112
S with Satellites
• Grid stations = 12
• 8 “S” clusters of size 9
– 4 grid stations
– 4 sites of interest
• 8 satellites added to each of 3 “S” clusters (24 samples)
• Field duplicates = 8
• Total samples = 12 + 4*(9-1) + 4*(9) + 3*(8) + 8 = 112
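The three totals can be checked with a few lines of arithmetic. The sketch below assumes that a cluster centered on a grid station adds one sample fewer than its size, because the center is already counted among the 12 grid stations. The S-with-satellites call follows the itemized counts for that design: 4 clusters on grid stations, 4 on sites of interest, 8 satellites on each of 3 clusters, and 8 field duplicates. The helper name is illustrative.

```python
def total_samples(grid, clusters, extras):
    """Total = grid stations + cluster samples + extra samples.

    clusters: list of (n_clusters, size, centered_on_grid_station).
    A cluster centered on a grid station adds size - 1 new samples,
    because its center is already counted among the grid stations.
    extras: other sample counts (satellites, field duplicates).
    """
    added = sum(n * (size - 1 if on_grid else size)
                for n, size, on_grid in clusters)
    return grid + added + sum(extras)


star = total_samples(12, [(3, 17, True), (2, 17, False), (1, 9, False)], [9])
s = total_samples(12, [(5, 9, True), (6, 9, False)], [6])
s_sat = total_samples(12, [(4, 9, True), (4, 9, False)], [3 * 8, 8])
print(star, s, s_sat)
```

All three allocations come to the same sampling effort, so the designs are compared on equal cost.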
“Star” Cluster Design
[Figure: two panels, each titled “Point Loma 5 Star + 1 S Cluster”; Ykm (3610 to 3625) vs. Xkm (466 to 474)]
“S” Cluster Design
[Figure: two S DESIGN panels; Ykm (3610 to 3625) vs. Xkm (466 to 474); panels captioned Lag = 0.05, 0.10, 0.20, 0.50 and Lag = 0.05, 0.25, 1.00, 3.00]
“S” Cluster with Satellites
[Figure: two S with SATELLITES DESIGN panels; Ykm (3610 to 3625) vs. Xkm (466 to 474)]
Omnidirectional Lag Dist.
[Figure: two omnidirectional lag distribution panels; number of pairs (0 to 400) vs. pairwise lag distance (0 to 8); legend entries SD3, StarD5, SSATD3 in the first panel and S, Star, SSAT in the second; panels captioned Lag = 0.05, 0.10, 0.20, 0.50 and Lag = 0.05, 0.25, 1.00, 3.00]
Directional Lag Dist.
Lag = 0.05, 0.10, 0.20, 0.50 (Lag = 0.05, 0.25, 1.00, 3.00 is similar)
[Figure: four panels, Direction = 0, 45, 90, and 135; number of pairs (0 to 120) vs. pairwise lag distance (0 to 8) for the S, STAR, and SSAT designs]
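Directional lag distributions split the site pairs by orientation before counting them. The sketch below is one way to do that split; the angle convention (counterclockwise from the +x axis) and the 22.5 degree tolerance are assumptions, since the slides give only the four directions.

```python
import math


def directional_pair_counts(sites, directions=(0, 45, 90, 135), tol=22.5):
    """Count site pairs whose orientation lies within tol degrees of each
    direction. Angles are measured counterclockwise from the +x axis and
    folded into [0, 180), since a pair has no head or tail."""
    counts = {d: 0 for d in directions}
    for i in range(len(sites)):
        for j in range(i + 1, len(sites)):
            dx = sites[j][0] - sites[i][0]
            dy = sites[j][1] - sites[i][1]
            angle = math.degrees(math.atan2(dy, dx)) % 180.0
            for d in directions:
                diff = abs(angle - d)
                # Compare on the circle of orientations (period 180).
                if min(diff, 180.0 - diff) <= tol:
                    counts[d] += 1
                    break  # assign each pair to one direction only
    return counts
```

Binning each direction's pairs by distance, as in the omnidirectional case, would then give one lag distribution per direction. A design with few pairs in some direction gives little information about anisotropy along it.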
Simulation Study
• 3 grid enhancements: S, STAR, S with Satellites
• Two sets of lag classes of size 4
– 0.05, 0.10, 0.20, 0.50 (km)
– 0.05, 0.25, 1, 3 (km)
• Spherical variogram
– Range = 1, 2, 4, 6
– Nugget = 0.00, 0.10
– Sill = 1
• 1000 sims
• Fit using automated procedure in S-PLUS
– This may have introduced artifacts
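The simulation step can be sketched as drawing a correlated response at the design sites. The slides do not describe how the responses were generated, so everything here is an assumption: a zero-mean Gaussian field, a sill that includes the nugget, and Cholesky factorization of the covariance matrix as the sampling technique. Variogram fitting is not shown.

```python
import math
import random


def spherical_cov(h, nugget, sill, rng):
    """Covariance implied by a spherical variogram with total sill `sill`."""
    if h == 0:
        return sill
    if h >= rng:
        return 0.0
    r = h / rng
    return (sill - nugget) * (1.0 - 1.5 * r + 0.5 * r ** 3)


def cholesky(a):
    """Lower-triangular factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate_field(sites, nugget, sill, rng, seed=None):
    """Draw one zero-mean Gaussian realization at distinct site coordinates."""
    gen = random.Random(seed)
    cov = [[spherical_cov(math.dist(p, q), nugget, sill, rng) for q in sites]
           for p in sites]
    low = cholesky(cov)
    z = [gen.gauss(0.0, 1.0) for _ in sites]
    # Multiply the factor by independent normals to impose the covariance.
    return [sum(low[i][k] * z[k] for k in range(i + 1))
            for i in range(len(sites))]
```

Repeating `simulate_field` 1000 times for each range and nugget setting, then fitting a variogram to each draw, would give the distribution of parameter estimates for a design. Co-located samples such as field duplicates would need separate handling when the nugget is zero, because the covariance matrix is then singular.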
Percent Difference from Target Range (Median Range), S=1, N=0.10
[Figure: two panels; percent of target (-10 to 40) vs. true range (1 to 6) for the S, Star, and SSAT designs; panels captioned Lag = 0.05, 0.25, 1.00, 3.00 and Lag = 0.05, 0.10, 0.20, 0.50]
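The plotted quantity appears to be the median of the fitted values across simulations, expressed as a percent difference from the true parameter. The sketch below shows that calculation under this reading; the formula is an assumption, and the fitted values are invented for illustration and are not results from the study.

```python
from statistics import median


def percent_diff_from_target(estimates, target):
    """100 * (median(estimates) - target) / target."""
    if target == 0:
        raise ValueError("percent difference is undefined for a zero target")
    return 100.0 * (median(estimates) - target) / target


# Hypothetical fitted ranges from five simulated data sets (not real results).
fitted_ranges = [1.8, 2.1, 2.4, 2.2, 1.9]
print(percent_diff_from_target(fitted_ranges, target=2.0))
```

A zero target has no percent difference, which fits the N=0 nugget slide being titled as a plain difference from target.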
Percent Difference from Target Sill (Median Sill), S=1, N=0.10
[Figure: two panels; percent of target (-10 to 20) vs. true range (1 to 6) for the S, Star, and SSAT designs; panels captioned Lag = 0.05, 0.25, 1.00, 3.00 and Lag = 0.05, 0.10, 0.20, 0.50]
Percent Difference from Target Nugget (Median Nugget), S=1, N=0.10
[Figure: two panels; median (-100 to 100) vs. true range (1 to 6) for the S, STAR, and SSAT designs; panels captioned Lag = 0.05, 0.25, 1.00, 3.00 and Lag = 0.05, 0.10, 0.20, 0.50]
Summary
• STAR: performed better than S and S with Satellites for estimating variogram parameters; robust to different lag classes
• S: lacks sufficient information at short distances for estimating nugget
• S with Satellites: better than S design for estimating nugget, not as good as STAR
• Larger lag classes generally did better than shorter lag classes
Further Research
• Choose another variogram model
– Exponential
• Choose another variogram fitting algorithm
– REML
• Simulate anisotropy
• Investigate robustness to model misspecification
• Explore other designs
END OF PLANNED PRESENTATIONS
• Questions and suggestions are welcome
Note
• Note that the rest of the slides show simulation results for N=0, S=1. They will not be included in the presentation.
Percent Difference from Target (Median Range), S=1, N=0
[Figure: two panels; percent of target (-10 to 40) vs. true range (1 to 6) for the S, Star, and SSAT designs; panels captioned Lag = 0.05, 0.25, 1.00, 3.00 and Lag = 0.05, 0.10, 0.20, 0.50]
Percent Difference from Target (Median Sill), S=1, N=0
[Figure: two panels; percent of target (-5 to 10) vs. true range (1 to 6) for the S, Star, and SSAT designs; panels captioned Lag = 0.05, 0.25, 1.00, 3.00 and Lag = 0.05, 0.10, 0.20, 0.50]
Difference from Target (Median Nugget), S=1, N=0
[Figure: two panels; median (0.0 to 0.008) vs. true range (1 to 6) for the S, STAR, and SSAT designs; panels captioned Lag = 0.05, 0.25, 1.00, 3.00 and Lag = 0.05, 0.10, 0.20, 0.50]
“S” Cluster Design
• 12 grid stations: 12
• 11 “S” clusters of size 9: 99 - 5 = 94
– 5 grid stations
– 6 sites of interest (some old stations, some Bight stations, some new)
• 6 field duplicates: 6
• Total samples = 112
“STAR” Cluster Design
• 12 grid stations: 12
• 5 “STAR” clusters of size 16 (17): 80
– 3 grid stations
– 2 sites of interest (one Bight station, one old station): 2
• 1 “S” cluster of size 8 (9): 9
– new station
• 9 field duplicates: 9
• Total samples = 112
“S” Cluster with Satellites
• 12 grid stations: 12
• 8 “S” clusters of size 8 (9)
– 4 grid stations (8): 32
– 4 sites of interest (some old stations, some Bight stations, some new) (9): 36
• 8 satellites added to 3 clusters: 24
• 8 field duplicates: 8
• Total samples = 112