Application of Statistical Techniques to Interpretation of Water Monitoring Data
![Page 1: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/1.jpg)
Application of Statistical Techniques to Interpretation
of Water Monitoring Data
Eric Smith, Golde Holtzman, and Carl Zipper
![Page 2: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/2.jpg)
Outline
I. Water quality data: program design (CEZ, 15 min)
II. Characteristics of water-quality data (CEZ, 15 min)
III. Describing water quality (GIH, 30 min)
IV. Data analysis for making decisions
A. Compliance with numerical standards (EPS, 45 min)
Dinner Break
B. Locational / temporal comparisons (“cause and effect”) (EPS, 45 min)
C. Detection of water-quality trends (GIH, 60 min)
![Page 3: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/3.jpg)
III. Describing water quality (GIH, 30 min)
• Rivers and streams are an essential component of the biosphere
• Rivers are alive
• Life is characterized by variation
• Statistics is the science of variation
• Statistical Thinking/Statistical Perspective
– Thinking in terms of variation
– Thinking in terms of distribution
![Page 4: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/4.jpg)
The present problem is multivariate
• WATER QUALITY as a function of
• TIME, under the influence of co-variates like
• FLOW, at multiple
• LOCATIONS
![Page 5: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/5.jpg)
WQ variable versus time
[Figure: water variable (vertical axis) plotted against time in years (horizontal axis)]
![Page 6: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/6.jpg)
Bear Creek below Town of Wise STP
[Figure: time series of pH (vertical axis, 6.5 to 9) against DATE (horizontal axis, 1973/12/14 to 1993/12/14)]
![Page 7: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/7.jpg)
Univariate WQ Variable
[Figure: water quality (vertical axis) plotted against time (horizontal axis)]
![Page 8: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/8.jpg)
Univariate WQ Variable
[Figure: water quality plotted against time, with further panels whose axes are each labeled “Water Quality”]
![Page 9: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/9.jpg)
Univariate Perspective, Real Data (pH below STP)
[Figure: pH below the STP shown on axes running from 6.5 to 9]
![Page 10: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/10.jpg)
The three most important pieces of information in a sample:
• Central Location
– Mean, Median, Mode
• Dispersion
– Range, Standard Deviation, Inter-Quartile Range
• Shape
– Symmetry, skewness, kurtosis
– No mode, unimodal, bimodal, multimodal
![Page 11: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/11.jpg)
Central Location: Sample Mean
• (Sum of all observations) / (sample size)
• Center of gravity of the distribution
• Depends on each observation
• Therefore sensitive to outliers
![Page 12: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/12.jpg)
![Page 13: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/13.jpg)
![Page 14: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/14.jpg)
![Page 15: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/15.jpg)
![Page 16: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/16.jpg)
![Page 17: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/17.jpg)
Central Location: Sample Median
• Center of the ordered array
• I.e., the (0.5)(n + 1)th observation in the ordered array.
If sample size n is odd, then the median is the middle value in the ordered array.
Example A:
1, 1, 0, 2, 3
Order:
0, 1, 1, 2, 3
n = 5, odd
(0.5)(n + 1) = 3
Median = 1
If sample size n is even, then the median is the average of the two middle values in the ordered array.
Example B:
1, 1, 0, 2, 3, 6
Order:
0, 1, 1, 2, 3, 6
n = 6, even
(0.5)(n + 1) = 3.5
Median = (1 + 2)/2 = 1.5
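The location rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `sample_median` is hypothetical.

```python
def sample_median(values):
    """Median via the (0.5)(n + 1) location in the ordered array."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    location = 0.5 * (n + 1)      # 1-based position in the ordered array
    lower = int(location)         # whole part of the location
    if location == lower:         # n odd: a single middle value
        return ordered[lower - 1]
    # n even: average the two values on either side of the location
    return (ordered[lower - 1] + ordered[lower]) / 2


example_a = [1, 1, 0, 2, 3]
example_b = [1, 1, 0, 2, 3, 6]
print(sample_median(example_a))   # Example A
print(sample_median(example_b))   # Example B
```

Both examples reproduce the medians worked out by hand, 1 and 1.5.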
![Page 18: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/18.jpg)
Central Location: Sample Median
• Center of the ordered array
• Depends on the magnitude of the central observations only
• Therefore NOT sensitive to outliers
![Page 19: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/19.jpg)
![Page 20: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/20.jpg)
![Page 21: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/21.jpg)
![Page 22: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/22.jpg)
![Page 23: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/23.jpg)
![Page 24: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/24.jpg)
Central Location: Mean vs. Median
• Mean is influenced by outliers
• Median is robust against (resistant to) outliers
• Mean “moves” toward outliers
• Median represents bulk of observations almost always
Comparison of mean and median tells us about outliers
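A small numerical sketch of this contrast, using Python's standard `statistics` module. The pH readings are invented for illustration and are not monitoring data.

```python
from statistics import mean, median

# Hypothetical pH readings, invented for illustration only.
readings = [7.4, 7.5, 7.6, 7.6, 7.7, 7.8]
with_outlier = readings + [12.0]   # one implausibly high reading added

print(mean(readings), median(readings))
print(mean(with_outlier), median(with_outlier))

# How far each summary moved when the outlier was added
mean_shift = mean(with_outlier) - mean(readings)
median_shift = median(with_outlier) - median(readings)
print(mean_shift, median_shift)
```

The mean moves toward the outlier by more than half a pH unit, while the median stays where the bulk of the readings lie.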
![Page 25: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/25.jpg)
Dispersion
• Range
• Standard Deviation
• Inter-quartile Range
![Page 26: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/26.jpg)
Dispersion: Range
• Maximum − Minimum
• Easy to calculate
• Easy to interpret
• Depends on sample size (biased)
• Therefore not good for statistical inference
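The dependence on sample size can be seen by taking the range of the first k observations of one series: adding observations can only widen it. The sketch below uses simulated values (an assumed normal distribution with mean 7.6 and SD 0.4), not real data.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable
# Simulated readings from one fixed distribution; not real monitoring data.
readings = [random.gauss(7.6, 0.4) for _ in range(200)]


def sample_range(values):
    """Maximum minus minimum."""
    return max(values) - min(values)


sizes = [5, 20, 50, 200]
ranges = [sample_range(readings[:k]) for k in sizes]
for k, r in zip(sizes, ranges):
    print(f"first {k:3d} observations: range = {r:.2f}")
```

The underlying spread never changes, yet the reported range never shrinks as the sample grows, which is why the range is a poor basis for inference.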
![Page 27: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/27.jpg)
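Dispersion: Standard Deviation
$$SD = \sqrt{\frac{\sum (Y - \bar{Y})^2}{n - 1}}$$
[Figure: example distributions with different spreads, annotated with their standard deviations; axis marks at ±1 and ±2]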
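The sample standard deviation can be computed directly from its definition. The sketch below assumes the usual n − 1 divisor and checks the result against `statistics.stdev`; the data values are invented for illustration.

```python
from math import sqrt
from statistics import stdev


def sample_sd(values):
    """Square root of the sum of squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    y_bar = sum(values) / n
    squared_deviations = sum((y - y_bar) ** 2 for y in values)
    return sqrt(squared_deviations / (n - 1))


sample = [2, 4, 4, 4, 5, 5, 7, 9]    # invented values, mean 5
print(sample_sd(sample))             # agrees with statistics.stdev
print(stdev(sample))
print(sample_sd([3, 3, 3]))          # no variation gives SD = 0
```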
![Page 28: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/28.jpg)
Dispersion: Properties of SD
• SD ≥ 0 for all data
• SD = 0 if and only if all observations are the same (no variation)
• Familiar intervals for a normal distribution:
– 68% expected within 1 SD,
– 95% expected within 2 SD,
– 99.7% expected within 3 SD,
– Exact for the normal distribution, ballpark for any distribution
• For any distribution, nearly all observations lie within 3 SD (at least 8/9, about 89%, by Chebyshev's inequality)
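The normal-distribution percentages can be checked numerically, since P(|Z| < k) = erf(k/√2) for a standard normal Z. The sketch also prints Chebyshev's lower bound, 1 − 1/k², which holds for any distribution; both function names are illustrative.

```python
from math import erf, sqrt


def normal_coverage(k):
    """P(|Z| < k) for a standard normal Z."""
    return erf(k / sqrt(2))


def chebyshev_lower_bound(k):
    """Minimum fraction within k SD of the mean, for any distribution."""
    return 1 - 1 / k**2


for k in (1, 2, 3):
    print(f"normal, within {k} SD: {normal_coverage(k):.4f}")
for k in (2, 3):
    print(f"any distribution, within {k} SD: at least {chebyshev_lower_bound(k):.4f}")
```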
![Page 29: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/29.jpg)
Interpretation of SD
[Figure: distribution plotted on an axis from 6.5 to 9]
n = 200
SD = 0.41
Median = 7.6
Mean = 7.6
![Page 30: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/30.jpg)
Quartiles, Percentiles, Quantiles, Five Number Summary, Boxplot

| Name | Quartile | Percentile | Quantile |
|---|---|---|---|
| Maximum | 4th quartile | 100th percentile | 1.00 quantile |
| | 3rd quartile | 75th percentile | 0.75 quantile |
| Median | 2nd quartile | 50th percentile | 0.50 quantile |
| | 1st quartile | 25th percentile | 0.25 quantile |
| Minimum | 0th quartile | 0th percentile | 0.00 quantile |
![Page 31: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/31.jpg)
Quartiles (undergrad classes)
E.g., Sample: 0, −3.1, −0.4, 0, 2.2, 5.1, 3.8, 3.8, 3.9, 2.3, n = 10

| Rank | Value | Label |
|---|---|---|
| 10 | 5.1 | Maximum |
| 9 | 3.9 | |
| 8 | 3.8 | 3rd Quartile |
| 7 | 3.8 | |
| 6 | 2.3 | Median, 2nd Quartile (between ranks 5 and 6) |
| 5 | 2.2 | |
| 4 | 0 | |
| 3 | 0 | 1st Quartile |
| 2 | −0.4 | |
| 1 | −3.1 | Minimum |

Max = 5.1
Q3 = 3.8
Q2 = (2.2 + 2.3)/2 = 2.25
Q1 = 0
Min = −3.1
Note: Quartiles Q0, Q1, Q2, Q3, Q4 = Quantiles Q0.00, Q0.25, Q0.50, Q0.75, Q1.00
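The slide does not spell out the rule behind these quartiles. One rule that reproduces them, assumed here, is to take Q1 and Q3 as the medians of the lower and upper halves of the ordered array. The function name is hypothetical.

```python
from statistics import median


def quartiles_by_halves(values):
    """Q1, Q2, Q3 as medians of the lower half, whole sample, and upper half.

    Assumed rule: when n is odd the middle value belongs to neither half.
    Requires at least two observations.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[n - half:]
    return median(lower), median(ordered), median(upper)


sample = [0, -3.1, -0.4, 0, 2.2, 5.1, 3.8, 3.8, 3.9, 2.3]
q1, q2, q3 = quartiles_by_halves(sample)
print(min(sample), q1, q2, q3, max(sample))
```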
![Page 32: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/32.jpg)
5-Number Summary and Boxplot (undergrad perspective)

| Min | Q1 | Q2 | Q3 | Max |
|---|---|---|---|---|
| −3.10 | 0.00 | 2.25 | 3.80 | 5.10 |

Median = Q2 = 2.25
Range = Max − Min = 5.10 − (−3.10) = 8.20
IQR = Q3 − Q1 = 3.80 − 0.00 = 3.80
![Page 33: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/33.jpg)
Terminology Warning:
Quartiles, a.k.a. Percentiles, a.k.a. Quantiles
Note: Quartiles Q0, Q1, Q2, Q3, Q4 = Quantiles Q0.00, Q0.25, Q0.50, Q0.75, Q1.00

| Quartiles | Percentiles | Quantiles |
|---|---|---|
| Q4 = 4th quartile = Max | 100th percentile | Q1.00 = 1.00 quantile |
| Q3 = 3rd quartile | 75th percentile | Q0.75 = 0.75 quantile |
| Q2 = 2nd quartile = Med | 50th percentile | Q0.50 = 0.50 quantile |
| Q1 = 1st quartile | 25th percentile | Q0.25 = 0.25 quantile |
| Q0 = 0th quartile = Min | 0th percentile | Q0.00 = 0.00 quantile |
![Page 34: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/34.jpg)
Terminology Warning:
But Percentiles and Quantiles are more general
Note: Quartiles Q0, Q1, Q2, Q3, Q4 = Quantiles Q0.00, Q0.25, Q0.50, Q0.75, Q1.00

| Quartiles | Percentiles | Quantiles |
|---|---|---|
| Q4 = 4th quartile = Max | 100th percentile | Q1.00 = 1.00 quantile |
| | 95th percentile | Q0.95 = 0.95 quantile |
| Q3 = 3rd quartile | 75th percentile | Q0.75 = 0.75 quantile |
| | 60th percentile | Q0.60 = 0.60 quantile |
| Q2 = 2nd quartile = Med | 50th percentile | Q0.50 = 0.50 quantile |
| | 34th percentile | Q0.34 = 0.34 quantile |
| Q1 = 1st quartile | 25th percentile | Q0.25 = 0.25 quantile |
| | 2.5th percentile | Q0.025 = 0.025 quantile |
| Q0 = 0th quartile = Min | 0th percentile | Q0.00 = 0.00 quantile |
![Page 35: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/35.jpg)
Quantile Location and Quantiles by weighted averages (graduate classes)
E.g., Sample: 0, −3.1, −0.4, 0, 2.2, 5.1, 3.8, 3.8, 3.9, 2.3, n = 10
Step 1: The qth quantile location is $L = q(n + 1)$
Step 2: The qth quantile is $Q = a + w(b - a)$, where a and b are the observations on either side of location L in the ordered array and w is the fractional part of L
Example: Find the 20th percentile of the sample above.
Step 1:
q = 0.20, n = 10
L = 0.20(10 + 1) = 2.2
indicating the “2.2th” observation in the ordered array.
Step 2: Therefore the 0.20 quantile is a weighted average of the 2nd and 3rd observations in the ordered array, which are
a = −0.4, b = 0
and the weight is
w = 0.2
Q = −0.4 + 0.2(0 − (−0.4)) = −0.40 + 0.08 = −0.32
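The two steps translate directly into code. The sketch below is one possible reading: the function name is hypothetical, and clamping locations that fall outside 1..n to the minimum or maximum is an assumption.

```python
from math import floor


def quantile(values, q):
    """q-th quantile using location L = q(n + 1) and Q = a + w(b - a)."""
    if not 0 <= q <= 1:
        raise ValueError("q must lie between 0 and 1")
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("empty sample")
    location = q * (n + 1)                # Step 1: 1-based quantile location
    # Assumption: locations outside 1..n are clamped to the minimum/maximum.
    location = min(max(location, 1), n)
    rank = floor(location)                # rank of observation a
    w = location - rank                   # fractional part is the weight
    a = ordered[rank - 1]
    if w == 0:
        return a
    b = ordered[rank]                     # next observation in the ordered array
    return a + w * (b - a)                # Step 2: weighted average


sample = [0, -3.1, -0.4, 0, 2.2, 5.1, 3.8, 3.8, 3.9, 2.3]
print(round(quantile(sample, 0.20), 3))   # the 20th percentile worked above
```

Up to floating-point rounding, the 0.20 quantile comes out at −0.32, matching the hand calculation.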
![Page 36: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/36.jpg)
Quantile Location and Quantiles by weighted averages (graduate classes)
E.g., Sample: 0, −3.1, −0.4, 0, 2.2, 5.1, 3.8, 3.8, 3.9, 2.3, n = 10
Step 1: The qth quantile location is $L = q(n + 1)$
Step 2: The qth quantile is $Q = a + w(b - a)$
Step 2:
a = −0.4, b = 0, w = 0.2
Q = a + w(b − a)
= −0.4 + 0.2(0 − (−0.4))
= −0.4 + 0.2(0.4)
= −0.40 + 0.08
= −0.32
[Figure: number line from −0.4 to 0 (an interval of length 0.4) with Q = −0.32 marked]
![Page 37: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/37.jpg)
Quantile Location and Quantiles
Example: 0, −3.1, −0.4, 0, 2.2, 5.1, 3.8, 3.8, 3.9, 2.3, n = 10

| Value | Rank |
|---|---|
| 5.1 | 10 |
| 3.9 | 9 |
| 3.8 | 8 |
| 3.8 | 7 |
| 2.3 | 6 |
| 2.2 | 5 |
| 0 | 4 |
| 0 | 3 |
| −0.4 | 2 |
| −3.1 | 1 |

| Quantile rank, q | Quantile Location, L | Quantile, Q | Common Name |
|---|---|---|---|
| 1.00 | n = 10 | 5.1 | Maximum |
| 0.75 | 0.75(10 + 1) = 8.25 | 3.8 + 0.25(3.9 − 3.8) = 3.825 | 3rd Quartile |
| 0.50 | 0.5(10 + 1) = 5.5 | 2.2 + 0.5(2.3 − 2.2) = 2.25 | Median, or 2nd Quartile |
| 0.25 | 0.25(10 + 1) = 2.75 | −0.4 + 0.75[0 − (−0.4)] = −0.1 | 1st Quartile |
| 0.00 | 1 | −3.1 | Minimum |
![Page 38: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/38.jpg)
5-Number Summary and Boxplot using weighted averages for quantiles

| Min | Q1 | Q2 | Q3 | Max |
|---|---|---|---|---|
| −3.10 | −0.10 | 2.25 | 3.825 | 5.10 |

Median = Q2 = 2.25
Range = Max − Min = 5.10 − (−3.10) = 8.20
IQR = Q3 − Q1 = 3.825 − (−0.10) = 3.925
Note slightly different results by using weighted averages.
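Python's standard library offers a convenient cross-check: `statistics.quantiles` with `method="exclusive"` places quantiles at location q(n + 1) and interpolates linearly, so it should reproduce this weighted-average summary. This is a sketch for comparison only; other software may default to different quantile rules.

```python
from statistics import quantiles

sample = [0, -3.1, -0.4, 0, 2.2, 5.1, 3.8, 3.8, 3.9, 2.3]

# method="exclusive" uses location q(n + 1) with linear interpolation
# between neighbouring observations in the ordered array.
q1, q2, q3 = quantiles(sample, n=4, method="exclusive")

five_number = (min(sample), q1, q2, q3, max(sample))
iqr = q3 - q1
sample_range = max(sample) - min(sample)

print([round(x, 3) for x in five_number])
print(round(sample_range, 3), round(iqr, 3))
```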
![Page 39: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/39.jpg)
Dispersion: IQR (Inter-Quartile Range)
• (3rd Quartile) − (1st Quartile)
• Robust against outliers
![Page 40: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/40.jpg)
Interpretation of IQR
[Figure: distribution plotted on an axis from 6.5 to 9]
n = 200
SD = 0.41
Median = 7.6
Mean = 7.6
IQR = 0.54
For a Normal distribution, Median ± 2 IQR includes 99.3%
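The 99.3% figure can be verified for a standard normal distribution with `statistics.NormalDist` (Python 3.8 or later). This is a numerical check, not part of the original slides.

```python
from statistics import NormalDist

z = NormalDist()                              # standard normal: mean 0, SD 1
iqr = z.inv_cdf(0.75) - z.inv_cdf(0.25)       # IQR in SD units, about 1.349
coverage = z.cdf(2 * iqr) - z.cdf(-2 * iqr)   # P(median - 2 IQR < Z < median + 2 IQR)

print(round(iqr, 3))
print(round(coverage, 4))
```

For a normal distribution the IQR is about 1.349 SD, so SD = 0.41 would imply an IQR of about 0.55, close to the 0.54 reported on the slide.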
![Page 41: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/41.jpg)
Shape: Symmetry and Skewness
• Symmetry means bilateral symmetry
![Page 42: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/42.jpg)
Shape: Symmetry and Skewness
• Symmetry means bilateral symmetry
• Positive Skewness (asymmetric “tail” in positive direction)
![Page 43: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/43.jpg)
Shape: Symmetry and Skewness
• “Symmetry” means bilateral symmetry, skewness = 0
• Mean = Median (approximately)
• Positive Skewness (asymmetric “tail” in positive direction)
• Mean > Median
• Negative Skewness (asymmetric “tail” in negative direction)
• Mean < Median
Comparison of mean and median tells us about shape
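A short sketch of how the mean, the median, and a skewness coefficient line up. The moment coefficient m3 / m2^1.5 used here is one common definition, chosen as an assumption; the samples are invented for illustration.

```python
from statistics import mean, median


def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (divisor n)."""
    n = len(values)
    y_bar = mean(values)
    m2 = sum((y - y_bar) ** 2 for y in values) / n
    m3 = sum((y - y_bar) ** 3 for y in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


# Invented samples for illustration only.
right_tailed = [1, 1, 2, 2, 3, 10]
left_tailed = [-y for y in right_tailed]
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right-tailed", right_tailed),
                   ("left-tailed", left_tailed),
                   ("symmetric", symmetric)]:
    print(f"{name}: mean={mean(data):.2f} median={median(data):.2f} "
          f"skewness={skewness(data):.2f}")
```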
![Page 44: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/44.jpg)
Bear Creek below Town of Wise STP
[Figure: pH data shown on axes running from 6.5 to 9]
![Page 45: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/45.jpg)
Outlier Box Plot
[Figure: box plot on an axis from 6.5 to 9, with labels for Outliers, upper Whisker, 75th percentile = 3rd Quartile, Median, 25th percentile = 1st Quartile, lower Whisker, and the IQR spanning the box]
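The slide does not state how far the whiskers reach. A common convention, assumed here, flags as outliers any points more than 1.5 × IQR beyond the quartiles and draws each whisker to the most extreme point that is not flagged. The function name and the pH values are hypothetical.

```python
from statistics import quantiles


def boxplot_summary(values, k=1.5):
    """Quartiles, whisker ends, and outliers under the k x IQR convention."""
    ordered = sorted(values)
    q1, q2, q3 = quantiles(ordered, n=4)     # default "exclusive" method
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inside = [y for y in ordered if low_fence <= y <= high_fence]
    outliers = [y for y in ordered if y < low_fence or y > high_fence]
    return {
        "q1": q1, "median": q2, "q3": q3, "iqr": iqr,
        "whiskers": (inside[0], inside[-1]),  # most extreme non-outlying values
        "outliers": outliers,
    }


# Invented pH readings for illustration; not the Bear Creek data.
ph = [7.2, 7.3, 7.4, 7.5, 7.5, 7.6, 7.6, 7.7, 7.8, 7.9, 8.9]
summary = boxplot_summary(ph)
print(summary)
```

With these invented readings, only the value 8.9 falls beyond the upper fence and would be drawn as an outlier.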
![Page 46: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/46.jpg)
Wise, VA, below STP
[Figure: two panels, pH (axis 6.5 to 9) and TKN in mg/l (axis 0 to 13)]
![Page 47: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/47.jpg)
Wise, VA below STP
[Figure: two panels, DO in % saturation (axis 10 to 130) and BOD in mg/l (axis 0 to 25)]
![Page 48: Application of Statistical Techniques to Interpretation of Water Monitoring Data](https://reader036.fdocuments.in/reader036/viewer/2022062315/56816149550346895dd0c7b1/html5/thumbnails/48.jpg)
Wise, VA below STP
[Figure: two panels, Total Phosphorus in mg/l (axis 0 to 5) and Fecal Coliforms (axis 0 to 60000)]