
Real-time Verification of Operational Precipitation Forecasts using Hourly Gauge Data

Andrew Loughe Judy Henderson

Jennifer Mahoney Edward Tollerud

Real-time Verification System (RTVS) of NOAA / FSL, Boulder, Colorado, USA

Outline

Some approaches to objective verification

How we perform automated precipitation verification

What we mean by "real-time"

Forecasts + Obs --> Results disseminated over the web. (The steps involved)

QC, Model comparisons, Statistical displays

Future direction

If you don't have objective data, you are just another person with an opinion

Our Approach?

Basically, we're gross!

No really, we are...

We process 4,500 gauge measurements each hour of every day. On average we retain 2,800 "good" reports. That's 67,000 observations per day, 2 million per month, and over 6 million per season.

The Real-time Verification System

An independent, real-time, automated data ingest and management system

Gauge observations received each hour of every day (~4500)

Gross error check on observations is performed

Model forecasts interpolated to the observation points

Results stored in 2 x 2 contingency tables of forecast / observation pairs (YY, YN, NY, NN)

Graphics, skill scores and contingency information disseminated over the World Wide Web
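The slides do not say which interpolation scheme RTVS uses to bring model forecasts to the gauge locations. As a minimal sketch, assuming bilinear interpolation on a regular grid and a gauge position already converted to fractional grid indices (the function name `bilinear` and its arguments are hypothetical), the step could look like this:

```python
def bilinear(grid, x, y):
    """Interpolate a 2-D grid at fractional grid indices (x, y).

    grid is a list of rows, so grid[j][i] is the value at column i, row j.
    Assumption: the point lies inside the grid (0 <= x <= ncols - 1, same for y).
    """
    i0, j0 = int(x), int(y)
    # Clamp the upper neighbours so points on the last row/column still work.
    i1 = min(i0 + 1, len(grid[0]) - 1)
    j1 = min(j0 + 1, len(grid) - 1)
    fx, fy = x - i0, y - j0
    # Interpolate along x on the two bracketing rows, then along y.
    row0 = grid[j0][i0] * (1 - fx) + grid[j0][i1] * fx
    row1 = grid[j1][i0] * (1 - fx) + grid[j1][i1] * fx
    return row0 * (1 - fy) + row1 * fy
```

A nearest-grid-point lookup would be an equally plausible choice; bilinear weighting is used here only because it is simple to state.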

Alternative Approaches (Should be objective)

Grid-to-grid verification

We're game... but not yet!

More fair to the modelers

Less fair to the end-users of the forecast?

More representative of the areal coverage of precipitation

Can do pattern matching and partitioning of the error (Ebert, et al.) or studies of representativeness error (Foufoula et al.)

What about Case Studies?

Do you fish with a pole or do you fish with a net?

We fish with a net

Case studies are insufficient for evaluating national-scale forecast systems

Subjective analyses often focus on where forecasts work well, and not on where they work poorly

There exists a need to assess variability on many time and space scales (from daily to seasonal)

Timely and objective information is needed for decision making

Realtime or Near-Realtime?

Realtime processing... Monthly and Seasonal dissemination of results (for now)

Gauge data stored in hourly bins

Model data interpolated once the observations catch up (Models initialized as late as 18Z, and then 24h forecasts are made)

Data collected over numerous accumulation periods

Go with the flow...

I. Obtain gauge data and collect them into hourly bins; match the data against the list of "good" stations (QC'd list)

II. Interpolate model data to "good" observation points

III. Accumulate precipitation over 3, 6, 12, 24 hours

IV. Compute contingency pairs (YY, YN, NY, NN)

V. Process these contingency data to create plots of ESS and Bias for Eta and RUC2

VI. Make these displays and the associated statistical information available through the web
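Step III can be sketched as follows. This is an illustration only, assuming hourly amounts held in a list with `None` marking a missing hour, and assuming non-overlapping accumulation windows; the function name `accumulate` and the missing-data rule are hypothetical, not taken from RTVS.

```python
def accumulate(hourly, window):
    """Sum consecutive, non-overlapping blocks of `window` hourly amounts.

    A block containing any missing hour (None) yields None instead of a
    partial sum. Trailing hours that do not fill a whole block are ignored.
    """
    totals = []
    for start in range(0, len(hourly) - window + 1, window):
        block = hourly[start:start + window]
        if any(value is None for value in block):
            totals.append(None)
        else:
            totals.append(sum(block))
    return totals
```

Calling it with `window` set to 3, 6, 12 and 24 on the same hourly series gives the four accumulation periods listed in step III.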

A Point-Specific Approach

(Eta at 40 km)

Gauge Data Checked for Accuracy

Hourly gauge data are checked for accuracy vs. radar, 24h totals, nearest neighbor

Further data are included through in-house QC efforts
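The slides name the checks (radar, 24h totals, nearest neighbor) but not how they are applied. As one hedged illustration of the 24h-total check, assuming amounts in inches and a hypothetical tolerance of 0.1 inch (both the function `consistent_with_daily` and the tolerance are assumptions made for this sketch):

```python
def consistent_with_daily(hourly, daily_total, tolerance=0.1):
    """Return True if 24 hourly amounts sum to within `tolerance` of the 24h total.

    A station-day with a missing hour (None) or without exactly 24 reports
    is treated as failing the check.
    """
    if len(hourly) != 24 or any(value is None for value in hourly):
        return False
    return abs(sum(hourly) - daily_total) <= tolerance
```

Stations that fail such a check repeatedly would be candidates for removal from the list of "good" stations.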

Forecast / Observation Comparisons

Comparisons made at numerous thresholds from 0.1 to 5.0 inches

Comparisons made over 3, 6, 12, and 24h accumulation periods
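A minimal sketch of how forecast / observation pairs might be sorted into the four contingency categories at a given threshold. It assumes that an amount counts as an event when it is greater than or equal to the threshold (the slides do not state whether the comparison is strict), and the function name `contingency` is hypothetical. The first letter of each key refers to the forecast and the second to the observation.

```python
def contingency(pairs, threshold):
    """Count YY, YN, NY, NN for (forecast, observed) pairs at one threshold.

    Key convention: first letter = forecast, second letter = observed.
    Assumption: an amount >= threshold counts as a "yes".
    """
    counts = {"YY": 0, "YN": 0, "NY": 0, "NN": 0}
    for fcst, obs in pairs:
        key = ("Y" if fcst >= threshold else "N") + ("Y" if obs >= threshold else "N")
        counts[key] += 1
    return counts
```

Running the same pairs through several thresholds (for example a loop over values between 0.1 and 5.0 inches) yields one 2x2 table per threshold and accumulation period.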

2x2 Contingency Tables

Dichotomous Forecasting

Basic Definitions

An "event" is one of:

hit               = YY   YES Forecast, YES Observed
false_alarm       = YN   YES Forecast, NOT Observed
detection_failure = NY   NOT Forecast, YES Observed
null_event        = NN   NOT Forecast, NOT Observed

From which these basic terms may be defined:

numevents   = YY + YN + NY + NN   Number of events
yes_obs     = YY + NY             Number of observed events
yes_fcst    = YY + YN             Number of forecast events
not_obs     = YN + NN             Number of events not observed
not_fcst    = NY + NN             Number of events not forecast
fcst_or_obs = YY + YN + NY        Number of events forecast or observed
correct     = YY + NN             Number of events correctly forecast

Skill Scores

* POD = hits / yes_obs Probability of detection

FOM = detection_failures / yes_obs Frequency of misses (1 - POD)

* PON = null_events / not_obs Probability of null event

POFD = false_alarms / not_obs Probability of false detection (probability of false alarm) (1 - PON)

FOH = hits / yes_fcst Frequency of hits (1 - FAR)

* FAR = false_alarms / yes_fcst False alarm ratio

FOCN = null_events / not_fcst Frequency of correct null forecasts (1 - DFR)

DFR = detection_failures / not_fcst Detection failure ratio

* BIAS = yes_fcst / yes_obs Frequency Bias, a measure of over- or under- forecasting

* CSI = hits / fcst_or_obs Critical Success Index (CSI or Threat Score)

* TSS = POD - POFD True Skill Statistic [ hits/yes_obs - false_alarms/not_obs ]

* HSS = (correct - chance) / (numevents - chance) Heidke Skill Score; where chance = (yes_fcst * yes_obs + not_fcst * not_obs) / numevents

* ESS = (hits - chance) / (fcst_or_obs - chance) Equitable Skill Score; where chance = (yes_fcst * yes_obs) / numevents
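The starred scores above can be computed directly from the four contingency counts. The sketch below follows the definitions on these slides; the function name `skill_scores` is hypothetical, and no guard against empty denominators is included, so it assumes every marginal total is non-zero.

```python
def skill_scores(yy, yn, ny, nn):
    """Compute selected scores from a 2x2 contingency table.

    yy = hits, yn = false alarms, ny = detection failures, nn = null events.
    Assumption: all denominators are non-zero.
    """
    numevents = yy + yn + ny + nn
    yes_obs, yes_fcst = yy + ny, yy + yn
    not_obs, not_fcst = yn + nn, ny + nn
    fcst_or_obs = yy + yn + ny
    correct = yy + nn
    # HSS and ESS use different "chance" terms, as defined on the slide.
    hss_chance = (yes_fcst * yes_obs + not_fcst * not_obs) / numevents
    ess_chance = (yes_fcst * yes_obs) / numevents
    pod = yy / yes_obs
    pofd = yn / not_obs
    return {
        "POD": pod,
        "FAR": yn / yes_fcst,
        "BIAS": yes_fcst / yes_obs,
        "CSI": yy / fcst_or_obs,
        "TSS": pod - pofd,
        "HSS": (correct - hss_chance) / (numevents - hss_chance),
        "ESS": (yy - ess_chance) / (fcst_or_obs - ess_chance),
    }
```

For a made-up table with YY = 30, YN = 20, NY = 10, NN = 140, this gives POD = 0.75, FAR = 0.4, BIAS = 1.25, CSI = 0.5, TSS = 0.625 and ESS = 0.4. A BIAS above 1 indicates that the event was forecast more often than it was observed.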

Results Available over the Web

www-ad.fsl.noaa.gov/afra/rtvs/precip

Specify parameters... obtain graphical result

View contingency tables stored on disk

The Future! Access and Displays via Database

(Model Icing Forecasts)

Specify parameters... display results (gnuplot) via database query (MySQL)

Are these methods sufficient?

Trade off between dealing with the specifics and dealing with the general (rifle vs. shotgun)

Method is not discretized by region or event

Density of observations is not smooth

Although the method is straightforward, there is still a lack of understanding of what the skill scores represent

May tell you which forecast system is "better", but not why

Future Plans

Add more models to this point-specific approach, and provide a measure of confidence

Perform verification using a gridded, analyzed precipitation field (Stage IV Precipitation)

Verify the probabilistic forecasts of ensembles

Move verification data into the relational database and compute results on-the-fly

Relate verification results geographically

Access verification results as soon as the forecast period ends (timeliness)

Future Plans (continued)

Test and extend QC of the observations

Currently we are:

Assessing skill using East-only and West-only hourly station data

Assessing skill using full RFC and the in-house QC methods

Assessing skill using no QC methods whatsoever

Comparing these four experimental results

Problem: Not Reporting "Zero" Precipitation?

The Effect on Precipitation Verification