Hydrologic Forecast Verification (transcript)
Hydrologic Forecast Verification
Holly C. Hartmann, Department of Hydrology and Water Resources, University of Arizona
Supported by: HyDIS (NASA/Raytheon Synergy), NOAA CLIMAS, NWS CSD, NSF SAHRA, NOAA GAPP
Goals
• General concepts of verification
• Think about how to apply to your operations
• Be able to respond to and influence the NWS verification program
• Be prepared as new tools become available
• Be able to do some of your own verification
• Be able to work with researchers on verification projects
• Contribute to development of verification tools (e.g., look at various options)
• Avoid some typical mistakes
Why Do Verification?
Administrative: logistics, selected quantitative criteria
Operations: inputs, model states, outputs, quick!
Research: sources of error, targeting research
Users: making decisions, exploit skill, avoid mistakes
Concerns about verification?
Common across all groups:
• Uninformed or mistaken about forecast interpretation
• Use of forecasts limited by lack of demonstrated forecast skill
• Have difficulty specifying required accuracy

Unique among stakeholders:
• Relevant forecast variables, regions (location & scale), seasons, lead times, performance characteristics
• Technical sophistication: base probabilities, distributions, math
• Role of forecasts in decision making

Common across many, but not all, stakeholders:
• Have difficulty distinguishing between “good” & “bad” products
• Have difficulty placing forecasts in historical context

Stakeholder Use of HydroClimate Info & Forecasts
Probability of Exceedance Forecasts: These forecasts say something about the entire range of possibilities (not just at tercile boundaries). They provide probabilities and quantities for individual locations.
Although these forecasts are more difficult to understand, they contain much more information than any of the previously available forecast formats. They allow customized forecasts via tradeoffs between ‘confidence’ and ‘precision’.
www.cpc.ncep.noaa.gov/pacdir/NFORdir/
Prob. Forecasts: User preferences influence verification
ESP Forecasts: User preferences influence verification
From: California-Nevada River Forecast Center
From: A. Hamlet, University of Washington
Probabilistic ESP Forecasts
“Today’s high will be 76 degrees, and it will be partly cloudy, with a 30% chance of rain.”
Deterministic
Categorical
Probabilistic
How would you evaluate each of these?
Different Forecasts, Information, Evaluation
Deterministic
• Bias
• Correlation
• RMSE
  • Standardized RMSE
  • Nash-Sutcliffe
• Linear Error in Probability Space

Categorical
• Hit Rate
• Surprise Rate
• Threat Score
• Gerrity Score
• Success Ratio
• Post-agreement
• Percent Correct
• Peirce Skill Score
• Gilbert Skill Score
• Heidke Skill Score
• Critical Success Index
• Percent N-class errors
• Modified Heidke Skill Score
• Hanssen and Kuipers Score
• Gandin and Murphy Skill Scores…

Probabilistic
• Brier Score
• Ranked Probability Score
• Distributions-oriented Measures
  • Reliability
  • Discrimination
  • Sharpness

So Many Evaluation Criteria!
Accuracy - overall correspondence between forecasts and observations
Bias - difference between average forecast and average observation
Consistency - forecasts don’t waffle around
Sharpness/Refinement – ability to make bullish forecast statements
(Figure: a sharp vs. a not-sharp forecast distribution)
Possible Performance Criteria
Bias: Mean forecast = Mean observed?
Correlation Coefficient: Variance shared between forecast and observed. Says nothing about bias or whether forecast variance = observed variance.
Root Mean Squared (Standard) Error: Distance between forecast/observation values. Better than correlation, but does poorly when error is heteroscedastic. Emphasizes performance for high flows. Alternative: Mean Absolute Error (MAE).
(Scatterplot: Forecast vs. Observed values)
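The bias, correlation, and RMSE measures above are straightforward to compute for paired forecast/observation series. A minimal illustrative sketch in plain Python (no particular verification library assumed):

```python
import math

def verification_stats(fcst, obs):
    """Basic deterministic verification metrics for paired forecast/observation series."""
    n = len(fcst)
    bias = sum(f - o for f, o in zip(fcst, obs)) / n                     # mean error
    rmse = math.sqrt(sum((f - o) ** 2 for f, o in zip(fcst, obs)) / n)   # penalizes large errors
    mae = sum(abs(f - o) for f, o in zip(fcst, obs)) / n                 # robust alternative
    mf, mo = sum(fcst) / n, sum(obs) / n
    cov = sum((f - mf) * (o - mo) for f, o in zip(fcst, obs)) / n
    sf = math.sqrt(sum((f - mf) ** 2 for f in fcst) / n)
    so = math.sqrt(sum((o - mo) ** 2 for o in obs) / n)
    corr = cov / (sf * so)   # shared variance only: blind to bias and scale differences
    return {"bias": bias, "corr": corr, "rmse": rmse, "mae": mae}
```

A constant offset leaves the correlation at 1.0 while the bias and RMSE expose it, which is exactly why no single measure tells the whole story.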
1943-99 April 1 Forecasts for Apr-Sept Streamflow at Stehekin R at Stehekin, WA
(Scatterplot: Forecast vs. Observed, 1000’s ac-ft)
Bias = -87.5, Corr = 0.58, RMSE = 228.3

1954-97 January 1 Forecasts for Jan-May Streamflow at Verde R blw Tangle Crk, AZ
(Scatterplot: Forecast vs. Observed, 1000’s ac-ft)
Bias = 22, Corr = 0.92, RMSE = 74.4
False Alarms: warning without event. Surprises: event without warning.
“False Alarm Rate” vs. “Probability of Detection”: a forecaster’s fundamental challenge is balancing these two. Which is more important?
Depends on the specific decision context…
Forecasting Tradeoffs
Forecast performance is multi-faceted
                        Flood Observed?
                        Yes    No    Total
Flood Forecast?  Yes     10    20      30
                 No      35    35      70
                 Total   45    55     100
Probability of Detection: 10/45 = 22%. How often were you not ‘surprised’?
False Alarm Rate: 20/30 = 66%. How often were you ‘led astray’?
But what did you expect by chance alone?
User Perspective: Only one category is relevant
Example: Flood forecast
Contingency Table Evaluations: Ignore Probabilities
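The two rates can be read off the contingency table programmatically. A small sketch (a hypothetical helper, not an operational NWS tool) using the table’s counts:

```python
def pod_far(hits, misses, false_alarms):
    """POD and FAR from a 2x2 contingency table.
    hits: event forecast and observed; misses: event observed but not forecast;
    false_alarms: event forecast but not observed."""
    pod = hits / (hits + misses)                  # fraction of observed events that were warned
    far = false_alarms / (hits + false_alarms)    # fraction of warnings that were false
    return pod, far

# The slide's table: 10 hits, 35 misses, 20 false alarms
pod, far = pod_far(10, 35, 20)   # pod = 10/45, far = 20/30
```

Note that FAR here is computed over issued warnings (20/30), matching the slide; some texts instead divide false alarms by the number of non-events.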
Skill: (0.50 - 0.54)/(1.00 - 0.54) = -8.6% ~worse than guessing~
Skill Score = (Forecast - Baseline) / (Perfect - Baseline)
How Good? Compared to What?
What is the appropriate Baseline?
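The skill-score formula is one line of code; the subtlety is always in the choice of baseline. A sketch using the slide’s numbers (a 0.50 percent-correct forecast against a 0.54 chance baseline):

```python
def skill_score(forecast, baseline, perfect):
    """Improvement over a baseline, scaled by the improvement a perfect forecast would give."""
    return (forecast - baseline) / (perfect - baseline)

ss = skill_score(0.50, 0.54, 1.00)   # negative: worse than guessing
```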
Probabilistic Forecast Evaluation: “Brier” Score
Forecast: “80% chance of rain”
Observed Outcome: Rain (Good) or No Rain (Not Good)
Climatology (baseline chances): 20% rain / 80% no rain
With this forecast, what outcome would you prefer?
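For a binary event, the Brier score is just the mean squared error of the probability forecast against the 0/1 outcome. A sketch for the rain example (a single forecast, with rain observed), using climatology as the baseline:

```python
def brier_score(probs, outcomes):
    """Mean squared error of probability forecasts vs. binary outcomes (1 = event occurred).
    Lower is better; 0 is perfect."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

bs_fcst = brier_score([0.8], [1])   # (0.8 - 1)^2 = 0.04
bs_clim = brier_score([0.2], [1])   # (0.2 - 1)^2 = 0.64
skill = (bs_fcst - bs_clim) / (0.0 - bs_clim)   # positive: beats climatology
```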
Probabilistic Evaluation: Ranked Probability Score
Observed Outcome: High flows (categories: Low / Midflows / High)
Bold Forecaster (High/Mid/Low probabilities: 0.80/0.17/0.03): Really Good
Conservative Forecaster (High/Mid/Low probabilities: 0.40/0.33/0.27): Not Good
(Squared differences of cumulative forecast vs. observed probabilities, summed over categories:)
RPSB = (0.03 - 0)² + (0.20 - 0)² + (1 - 1)² = 0.04
RPSC = (0.27 - 0)² + (0.60 - 0)² + (1 - 1)² = 0.43
RPSclim = (0.30 - 0)² + (0.70 - 0)² + (1 - 1)² = 0.58
SSBrps = (0.04 - 0.58)/(0 - 0.58) = 0.931 = 93%
SSCrps = (0.43 - 0.58)/(0 - 0.58) = 0.259 = 26%
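The RPS numbers above come from squared differences of *cumulative* forecast and observed probabilities, summed across the ordered categories. A sketch reproducing them (probabilities listed low/mid/high, with high flows observed):

```python
from itertools import accumulate

def rps(forecast_probs, observed_category):
    """Ranked Probability Score over ordered categories (e.g. low/mid/high)."""
    observed = [0.0] * len(forecast_probs)
    observed[observed_category] = 1.0
    cum_f = list(accumulate(forecast_probs))   # cumulative forecast distribution
    cum_o = list(accumulate(observed))         # cumulative observed distribution
    return sum((f - o) ** 2 for f, o in zip(cum_f, cum_o))

rps_bold = rps([0.03, 0.17, 0.80], 2)   # ~0.04
rps_cons = rps([0.27, 0.33, 0.40], 2)   # ~0.43
rps_clim = rps([0.30, 0.40, 0.30], 2)   # ~0.58
ss_bold = (rps_bold - rps_clim) / (0.0 - rps_clim)   # ~0.93
```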
Reliability Diagrams
“When you say 80% chance of high flows,
how often do high flows happen?”
P(O|F)
Forecast Reliability
(Axes: Forecasted Probability vs. Relative frequency of observed)
If the forecast saysthere’s a 50% chance of high flows…
(Axes: Forecasted Probability vs. Relative frequency of observed)
If the forecast saysthere’s a 50% chance of high flows…
High flows should happen 50% of the time
Forecast Reliability
(Axes: Forecasted Probability vs. Relative frequency of observed)
Flow Climatology
If the forecast saysthere’s a 50% chance of high flows…
High flows should happen 50% of the time
Perfect
Flow “climatology”: Median value
Forecast Reliability
Forecasts “better” than expected. Probabilities could have been more extreme and maintained quality.
Interpretation of Reliability Diagrams
Perfect reliability
Over-confidence
Under-confidence
Anti-skill
No skill
Low Sample Size
Reliability
P[O|F]
Does the frequency of occurrence match your probability statement?
Identifies conditional bias
(Axes: Forecasted probability vs. Relative frequency of observations)
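A reliability curve is built by binning issued probabilities and counting how often the event followed in each bin, i.e. estimating P[O|F]. A minimal sketch (the bin edges are an arbitrary choice for illustration):

```python
def reliability_curve(probs, outcomes, edges=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0)):
    """Points for a reliability diagram: (mean forecast prob, observed frequency, count) per bin."""
    points = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = [(p, o) for p, o in zip(probs, outcomes)
                  if lo <= p < hi or (hi == edges[-1] and p == hi)]  # last bin closed
        if in_bin:
            mean_p = sum(p for p, _ in in_bin) / len(in_bin)
            freq_o = sum(o for _, o in in_bin) / len(in_bin)
            points.append((mean_p, freq_o, len(in_bin)))
    return points
```

Perfectly reliable forecasts put every point on the diagonal (mean probability equals observed frequency); departures above or below the diagonal reveal the conditional biases (over- and under-confidence) sketched in the interpretation diagrams.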
Reliability: CPC forecasts & water management
CPC forecast performance varies among regions, with important
implications for resource management.
Seasonal climate forecasts have been much better for the Lower Colorado
Basin than for the Upper Basin.
Lower Basin
Upper Basin
Upper Colorado River Basin
Lower Colorado River Basin
Forecasts “better” than expected
(Axes: Forecast probability for “wet” vs. Relative frequency of observations)
Precipitation forecasts accurately reflect expected performance
perfect reliability
~1995-2001 winter season, summer/fall outlooks
Reliability: Colorado Basin ESP Seasonal Supply Outlooks
(Axes: Forecast probability vs. Relative Frequency of Observations; categories: high 30% / mid 40% / low 30%)
Panels: LC JM (5 mo. lead), LC MM (3 mo. lead), LC AM (2 mo. lead), UC JJy (7 mo. lead), UC AJy (4 mo. lead), UC JnJy (2 mo. lead); forecast dates Jan 1, Mar 1, Apr 1, Jun 1.
1) Few high-probability forecasts; good reliability between 10-70% probability; reliability improves.
2) Tendency to assign too much probability; these months show best reliability.
3) Reliability decreases for later forecasts as resolution increases; UC good at extremes.
Discrimination: CPC Climate Outlooks
Discrimination: P[F|O]. Can the forecasts distinguish among different events?
(Axes: Forecasted Probability (0.00 / 0.33 / 1.00) vs. Relative frequency of indicated forecast; curves: Probability of dry, Probability of wet; reference: Climatology)
Panel 1: Good discrimination!
Panel 2: Not much discrimination!
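Discrimination conditions the other way around from reliability: it looks at the distribution of issued forecasts given what was later observed, P[F|O]. A sketch comparing the two conditional samples by their means:

```python
def discrimination(probs, outcomes):
    """Separation of forecast probabilities conditioned on the observed outcome.
    Forecasts discriminate when P(F|event) and P(F|no event) are well separated."""
    f_event = [p for p, o in zip(probs, outcomes) if o == 1]
    f_no_event = [p for p, o in zip(probs, outcomes) if o == 0]
    mean_event = sum(f_event) / len(f_event)
    mean_no_event = sum(f_no_event) / len(f_no_event)
    return mean_event - mean_no_event   # large gap = good discrimination

# High separation: high probabilities were issued when events actually occurred
sep = discrimination([0.8, 0.7, 0.2, 0.1], [1, 1, 0, 0])
```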
Discrimination: Lower Colorado ESP Supply Outlooks
(Axes: Forecast probability vs. Relative Frequency of Forecasts; categories: High, Mid-, Low; panel: Jan 1 forecast for Jan-May)
When unusually low flows happened… P(F|Low flows)
There is some discrimination… Early forecasts warned “High flows less likely”
From K. Franz (2002)
Discrimination: Lower Colorado ESP Supply Outlooks
(Axes: Forecast probability vs. Relative Frequency of Forecasts; categories: High, Mid-, Low; panels: Jan 1 forecast for Jan-May, Apr 1 forecast for Apr-May)
When unusually low flows happened… P(F|Low flows)
There is some discrimination… Early forecasts warned “High flows less likely”
Good Discrimination… Forecasts were saying: 1) high and mid-flows less likely; 2) low flows more likely.
From K. Franz (2002)
Discrimination: Colorado Basin ESP Supply Outlooks
For observed flows in lowest 30% of historic distribution
(Axes: Forecast probability vs. Relative Frequency of Forecasts; categories: high 30%, mid 40%, low 30%)
Panels: Lower Colorado Basin, Jan 1 (Jan-May, 5 mo. lead) and Apr 1 (April-May, 2 mo. lead); Upper Colorado Basin, Jan 1 (Jan-July, 7 mo. lead) and Jun 1 (June-July, 2 mo. lead)
1) High flows less likely.
2) No discrimination between mid and low flows.
3) Both UC and LC show good discrimination for low flows at 2-month lead time.
(Franz, 2001)
Deterministic forecasts
• traditional in hydrology
• sub-optimal for decision making
Common perspective
“Deterministic model simulations and probabilistic forecasts … are two entirely different types of products. Direct comparison of probabilistic forecasts with deterministic single valued forecasts is extremely difficult”
Comparing Deterministic & Probabilistic Forecasts
What’s wrong with using ‘deterministic’ metrics?
Metrics that use only the central tendency of each forecast pdf will fail to distinguish between the red, green, and aqua forecasts, but will identify the purple forecast as inferior. Example metric: MSE of ensemble mean compared to MSE of long-term mean of observations (variance of obs.).
(Figure: forecast PDFs plotted against Obs Value)
From: A. Hamlet, U. Washington
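The slide’s example metric can be written down directly, and doing so makes the failure mode obvious: each ensemble collapses to its mean before any error is computed, so spread never enters. A sketch (the data shapes, a list of ensembles with a matching list of observations, are hypothetical):

```python
def central_tendency_skill(ensembles, obs):
    """MSE of each ensemble mean vs. MSE of always forecasting the long-term
    observed mean (which equals the variance of obs). Spread never enters."""
    n = len(obs)
    means = [sum(members) / len(members) for members in ensembles]
    mse_fcst = sum((m - o) ** 2 for m, o in zip(means, obs)) / n
    obs_mean = sum(obs) / n
    mse_clim = sum((o - obs_mean) ** 2 for o in obs) / n   # variance of obs
    return 1.0 - mse_fcst / mse_clim   # > 0 means beating climatology

# A tight and a wide ensemble with the same means score identically:
tight = [[0.9, 1.1], [2.9, 3.1]]
wide = [[0.0, 2.0], [2.0, 4.0]]
assert central_tendency_skill(tight, [1.0, 3.0]) == central_tendency_skill(wide, [1.0, 3.0])
```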
(Figure: forecast PDFs plotted against Obs Value)
More sophisticated metrics that reward accuracy but punish spread will rank the forecast skill from highest to lowest as aqua, green, red, purple. Example metric: average RMSE of ALL ensemble members compared to average RMSE of ALL climatological observations.
From: A. Hamlet, U. Washington
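The spread-punishing variant pools the error over every ensemble member rather than the ensemble mean. A sketch, again with hypothetical data shapes:

```python
import math

def pooled_member_rmse(ensembles, obs):
    """RMSE pooled over ALL ensemble members: rewards accuracy, punishes spread."""
    sq_sum, count = 0.0, 0
    for members, o in zip(ensembles, obs):
        for m in members:
            sq_sum += (m - o) ** 2
            count += 1
    return math.sqrt(sq_sum / count)

# Same ensemble mean, different spread: the wider ensemble now scores worse.
narrow = pooled_member_rmse([[1.0, 3.0]], [2.0])   # 1.0
broad = pooled_member_rmse([[0.0, 4.0]], [2.0])    # 2.0
```

To form the slide’s skill comparison, the same pooled RMSE would be computed with the climatological observations standing in as the ensemble.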
(Figure legend: Climatology distribution; Forecast distribution; Tercile boundaries (equal probability); Deterministic forecast; Jack-knife standard error)
Deterministic vs. Probabilistic Forecasts
Multi-dimensional, distributions-oriented evaluation of probabilistic forecasts.
Compare by converting deterministic forecasts to probabilistic form.
Better estimation of naturalized flows.
Cooperation of forecasting agencies and groups.
Archives of forecasts and forecasting information.
Address small sample sizes for operational forecasts: Evaluate hindcasts for individual forecast techniques, objective forecast combinations, or pseudo-forecasts.
Communication of forecast performance to users.
Forecast Evaluation: Critical Needs
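One of the needs listed above, comparing by converting deterministic forecasts to probabilistic form, can be sketched by treating the deterministic value as the center of a normal error distribution (with spread taken, e.g., from a jack-knife standard error) and integrating over the climatological tercile boundaries. The function and its arguments are hypothetical illustrations, not an operational procedure:

```python
import math

def tercile_probs(det_forecast, std_error, lower_bound, upper_bound):
    """Convert a deterministic forecast plus an error estimate into
    below/near/above-normal probabilities, assuming normally distributed errors."""
    def cdf(x):  # normal CDF centered on the deterministic forecast
        return 0.5 * (1.0 + math.erf((x - det_forecast) / (std_error * math.sqrt(2.0))))
    p_below = cdf(lower_bound)
    p_near = cdf(upper_bound) - p_below
    p_above = 1.0 - cdf(upper_bound)
    return p_below, p_near, p_above
```

The resulting three-category probabilities can then be scored with the same Brier or ranked probability machinery used for ESP forecasts, which is the point of the conversion.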
http://fet.hwr.arizona.edu/ForecastEvaluationTool/
Initially for NWS CPC climate forecasts; adding water supply forecasts, station forecasts.
Six elements in our webtool:
• Forecast Interpretation – Tutorials
• Exploring Forecast Progression
• Historical Context
• Forecast Performance
• Use in Decision Making
• Details: Forecast Techniques, Research