Cell identification and verification of QPF ensembles using ...€¦ · Cell identification and...

Journal of Hydrology (2007) 343, 105–116

ava i lab le a t www.sc iencedi rec t . com

journal homepage: www.elsevier .com/ locate / jhydrol

Cell identification and verification of QPF ensemblesusing shape analysis techniques

Athanasios Christou Micheas a,*, Neil I. Fox b, Steven A. Lack b,Christopher K. Wikle c

a Department of Statistics, 146 Middlebush Hall, University of Missouri, Columbia, United Statesb Department of Soil, Environmental and Atmospheric Sciences, University of Missouri, Columbia, United Statesc Department of Statistics, University of Missouri, Columbia, United States

Received 28 February 2006; received in revised form 17 April 2007; accepted 17 May 2007

Summary This paper introduces a new verification technique designed for, but not lim-
ited to, quantitative precipitation forecasts. It is tested on, and examples are given for,an intercomparison of very short-period nowcasting schemes. One of these nowcasters is aBayesian scheme that is used in an extensive ensemble formulation, and the verificationscheme is uniquely capable of treating both the ensemble members and the mean fore-cast. The verification scheme uses Procrustes shape analysis methods that are well estab-lished in statistics but have not, to date, been applied to meteorological forecastassessment. The Procrustes methodology allows for a decomposition of the forecast errorinto any number of components such as location (displacement), shape, size, orientationand intensity. Each error component can be afforded a separate weighting such that a costor value of the forecast can be calculated that accounts for different error types. Forexample, a forecaster who is concerned with the location of a storm would place greateremphasis on correct location in the forecast than other attributes. This ability to applyweights makes the system particularly suited to real-time verification applications whereconfidence in the performance of the forecast translates into improved dissemination tousers. In addition, the decomposition of the error into parts enables diagnosis of the errorsources that can lead to model adjustment and improvement.
ª 2007 Elsevier B.V. All rights reserved.

0022-1694/$ - see front matter ª 2007 Elsevier B.V. All rights reserved.doi:10.1016/j.jhydrol.2007.05.036

* Corresponding author. Tel.: +1 573 884 8828.E-mail address: [email protected] (A.C. Micheas).

mailto:[email protected]

106 A.C. Micheas et al.

Introduction

Typical forecast verification schemes, particularly in thefield of QPF (Quantitative Precipitation Forecasts), usuallyemploy categorical statistics such as the critical success in-dex (CSI) or Heidke Skill Score. Particularly in the case ofspatially dispersed precipitation areas, several authors havecriticized such scores as not representing the true value ofthe forecast to the forecaster (see for example Ebert andMcBride, 2000; Grams et al., 2006).

Two issues that have arisen due to current trends in fore-casting are the need to verify forecasts in real-time and theneed to verify ensemble member forecasts. Some of thenewer proposed multiscale verification methods are veryuseful, but may suffer in the real-time and ensemble situa-tions because of the multi-dimensional nature of the fore-cast verification problem (e.g. Casati et al., 2005). Thesolution is to develop a system that is both robust in that itproduces useful and objective measures of forecast accuracy(and preferably indications of forecast strengths and weak-nesses), while reducing the dimensionality of the verificationto an extent whereby multiple realizations of an ensemblemember can be verified individually and overall.

Starting with Ebert and McBride (2000), methods havebeen developed that account for different aspects of fore-cast errors. In this case, it was noted that QPF may have avalue to a forecaster even when the categorical skill scoresare very low because the forecast may contain a small errorin location (propagation) such that the forecast and actualprecipitation areas hardly overlap. On the other hand, theforecast could represent the size, shape and intensity ofthe precipitation area (cell) accurately, and the forecasterwith knowledge of the meteorology or behavior of the fore-cast model may be able to either adjust the forecast or tointerpret it with expert knowledge by including a subjectiveuncertainty. Therefore, Ebert and McBride (2000) developeda verification methodology that differentiated the locationerror from all other errors. The work presented herein ex-tends that concept by separating error terms due to propa-gation, dilation, intensity, rotation and shape.

There are a number of aspects to the rationale behindthe approach taken. Firstly, valuable QPFs contain moreinformation than simply the location and intensity offuture precipitation. Structural factors such as the shapeof the precipitation area, the change of intensity, the align-ment and the size all provide information of the type of pre-cipitation, its likely persistence and future development.Therefore, an analysis of model predictions may provideinformation on model aspects of storm behavior, leadingto improvements in user interpretation and value of fore-casts and, potentially, improvements to the model itself.

Secondly, following Ebert and McBride (2000), the meth-odology employs an object-oriented approach with flexibil-ity in the determination of a boundary threshold intensity todefine objects. One important point is that there is norestriction on the number of objects in either the forecastor the actual (truth). Therefore, it is critical to penalize aforecast that does not produce the same number of objects(cells) as exist in the ‘‘truth’’. Within this methodology sucha penalty is introduced. This allows the verification schemeto reward forecasts that provide good representations of

storm initiation, dissipation, merging and splitting, QPF as-pects that are notoriously difficult to parameterize.

Thirdly, in order to fully exploit the benefits of the veri-fication scheme for model interpretation we choose to workin radar reflectivity rather than precipitation rate. Thisgives a number of advantages of which two are dominant.One is that one avoids the error introduced in the conver-sion from reflectivity to rainrate. This further retains thestructural information contained in reflectivity that a fore-caster may use in assessments of storm type and develop-ment. However, this means that we are not truly verifyingprecipitation forecasts, but rather the quality of the fore-cast radar reflectivity fields compared to actual fields. Ineach case there is no rainfall (precipitation) truth observa-tion, but this is common with many verifications.

This paper introduces the new verification scheme thatuses a shape analysis methodology that reduces the dimen-sionality of the verification problem by identifying precipi-tation objects and assigning their characteristics through anumber of angular components that specify the shape ofthe object in the forecast and corresponding true image.In the following section, the ensemble verification schemeis introduced and described. In Section ‘‘Nowcast schemesand data’’, we discuss the case study we consider along withthe nowcasting scheme. Section ‘‘Application of cell identi-fication and verification methods’’ presents an example ofhow the method may be applied using forecasts from a num-ber of different short-period quantitative precipitationforecast systems. This section illustrates the variety of met-rics available through the methodology. Based on this exam-ple, Section ‘‘Concluding remarks’’ provides a discussion ofthe observed applicability of the scheme and suggestions forimplementation and broader research.

Ensemble verification methods

The verification of nowcasts is critical for evaluating anyproposed forecast, and replacements for the older categor-ical statistical measures that do not reflect the real value ofprecipitation forecasts need to be developed. Verification isnot best accomplished using a pixel by pixel comparison, asmany forecast fields, particularly those that include fore-casts of heavy convective rain, contain features (such aspixel clusters or cells) that can easily be identified as sepa-rate objects, with grid values in a specific cell being highlycorrelated. These identifiable features certainly vary frombeing different in intensity, and a simple way of quantifyinguncertainty in nowcasts would be a pixel by pixel compari-son (of intensity). Although simplistic, this approach doesnot account for other effects such as rotation, scale andlocation changes that alter the intensities of grid points.For an object-based verification of precipitation forecasts,see for example Davis et al. (2006).

In essence, the purpose is to assess differences betweenthe observed image, zT+s, of size K · L, and the M nowcastrealizations from a distribution of predictions that can berepresented by yð1ÞTþs; . . . ; yðMÞTþs. That is, the aim is to assesshow well each prediction matches the actual image. Typicalapproaches to this verification problem utilize all intensitiesto create a penalty function that measures the error. Forexample, Ebert and McBride (2000), considered the usual

Cell identification and verification of QPF ensembles using shape analysis techniques 107

quadratic penalty function between the truth and the ithforecast as

SSðiÞtot ¼1

KLðvecðzTþsÞ � vecðyðiÞTþsÞÞ

0ðvecðzTþsÞ � vecðyðiÞTþsÞÞ;

ð1Þwhere vec(X) denotes the vector with the columns of thematrix X stacked. The authors proposed a further decompo-sition utilizing a shifted forecast that accounts for changesin location and average intensity. Other event-based verifi-cation techniques can be found in Ebert et al. (2004).

Herein, we propose a penalty function that accounts forall effects of location, dilation and rotation separately, aswell as the difference in levels of intensity, that enjoys agreat reduction of dimensionality. The problem with usingall intensities (apart from high dimensionality) is that suchapproaches cannot identify specific cells in an ensemblethat may be of particular interest to the forecaster. Thefact that from nowcast to nowcast we may have a differentnumber of cells moving across the area in comparison withthe observed, dictates that a method of verification shouldpenalize a forecast on a cell-by-cell basis.

Several verification tasks naturally arise, including thedevelopment and implementation of cell identification algo-rithms, which we discuss first.

Cell identification within nowcasting ensemblemembers

We first clarify how we may identify corresponding cells be-tween the observed truth and forecasted ensemble mem-bers (or between any two ensemble members). Thisproblem has not been addressed before in the hydrometeo-rological literature, and provides researchers with a novelapproach to verification as well as forecasting. Assume thateach ensemble member has Ni > 0, identifiable cells (clus-ters of intensities), where i = 1,2, . . . ,M, denotes theensemble member, such that in a grid of K · L pixels, wecould have Ni taking values from 0 (no precipitation) to KL

4,

assuming K and L even. We will consider clusters of fouror more pixels as valid cells. Also notice that we do not con-sider the case of forecasts with no cells.

To provide a mathematical model for cells in eachensemble member, for each one of the Ni identifiable cells,we first obtain a centroid on the grid, denoted by ðcijx ; cijy Þ,j = 1,2 . . . ,Ni, the jth cell within the ith ensemble member,and x and y the horizontal and vertical axis. Thus, cijx takesvalues in {1,2, . . . ,L} while cijy takes values in {1,2, . . . ,K}.For simplicity, we may assume that the lower left cornerof the grid is the point (0,0), for all i and j. Once a centroidhas been identified, a radial is drawn that starts at thecentroid and crosses the cell’s edge at some point(s). Weconsider a set of fixed angles hl 2 H ¼ hl 2 ½0; 2pÞ :fhl ¼ 2pl

A ; l ¼ 0; 1; 2; . . . ;A� 1g, with A taking values from 1to 360, and we obtain the radial distance from the centroidto the edge along the lth angle, denoted by rijhl . Typically,one would choose a value of A� 360 to reduce dimension-ality. Clearly, for convex cells we will obtain a completedescription of the shape of the cell (only one edge point),while for non-convex cells, we choose rijhl to be the largestdistance among such distances. This means that concave

areas, or holes, may be included in the cell parameteriza-tion, but it ensures that no parts of the cell are ever lost.

Since we want to reduce the dimensionality of the prob-lem, without losing vital information, the intensity of a cellshould be represented by a small number of summary statis-tics of intensity. For example, we choose three values thatsummarize intensity in a cell, namely the average, the low-est and the highest intensities, denoted by cijavg; c

ijmin and

cijmax, respectively.Thus, each ensemble member yðiÞTþs, is described as a col-

lection of Ni cells, where each cell is described by an (A + 5)-dimensional vector Sij ¼ ðcijx ; cijy ; c

ijmin; c

ijmax; c

ijavg; r

ijh0; . . . ; rijhA�1Þ;

j ¼ 1; 2; . . . ;Ni (cell number), i = 1,2, . . . ,M (ensemble mem-ber number), which contains a parameterization of the cellin terms of centroid coordinates, three intensity measuresand values of centroid-edge distance for A different angles.Hence, from the original dimension of the realizations beingM · K · L we have a dimensionality of d ¼ ðAþ 5Þ

PMi¼1Ni,

where Ni 6KL4 . Although at first glance, the upper bound

for the new dimension of the problem appears to be large,there are usually only a small number of observed cellsper ensemble member, and hence there is a great reductionin dimension using this approach.

Some discussion is in order about the actual selectionprocess of the cells we use to describe each ensemble mem-ber. Typically, meteorologists have a strong sense of whatintensity level is representative of the weather phenomenaof interest, and more precisely, they can provide a thresh-old value. Intensities below this value indicate precipitationthat is not significant in terms of severe weather or heavyprecipitation. Values above the threshold are of extremeimportance in short-term forecasting, as they may indicatecells of imminent danger to public safety. Therefore, byconsideration of the physical process underlying the rela-tionship between the observed radar reflectivity and stormcharacteristics (precipitation rate, hail size) a suitablethreshold can be selected that we will call F.

Now all values below F are set to the minimum value ob-served. A boundary algorithm is then applied to the result-ing image, that yields the grid coordinates of each cell.Once we have identified each cell’s coordinates on theplane, the aforementioned approach is used to obtain the(A + 5)-dimensional vectors that describe each cell in theensemble.

Cell correspondence between ensemble members

There are several ways in which cell correspondence can beaccomplished. Herein, we will consider matching cellsbased on either their location (centroid based), or theirshape differences (Procrustes matching).

Assume that the true image at time T + s contains N cellsdescribed by vectors

Sj ¼ ðcjx; cjy ; cjmin; c

jmax; c

javg; r

jh0; . . . ; rjhA�1Þ; ð2Þ

j = 1,2, . . . ,N. Since the ensemble member might have fewer(Ni < N) or more (Ni > N) cells than the true image, the fore-cast is inappropriate and should be penalized appropriately.Even in the case of Ni = N, the prediction maybe be unac-ceptable depending on the values of the vectors of the cellsinvolved. For a perfect prediction, we should have Ni = N,

1 In Greek mythology, Damastes (which means ‘tamer’ in Greek),was a robber who operated on the road from Eleusis to Athens. Heused to go by the name Procrustes, and he would offer travellers abed to rest and sleep. Once asleep, he would fit them to the bed bystretching them if they were too short, or cutting off their limbs iftoo tall. In our context, we can think of the bed as the shape of anobject and the traveller as the shape of the object that needs to betranslated, dilated and rotated in order for it to fit as closely aspossible to the bed.


and Sij = Sj, for all j = 1,2, . . . ,N, which may not occur in realapplications. In this case, a forecast with fewer or moreidentifiable cells from the truth is penalized naturally bythe approach we undertake, namely, a cell from the truthis always forced to match with a cell from the forecast.

In the context of verification, we know exactly the truenumber N of cells in the true image. Thus, a natural estima-tor of the cell Sj is the mean vector of all the particular cellvectors from each of the ensemble members,bSj ¼ 1

M

PMi¼1Sij; j ¼ 1; 2; . . . ;N. The obvious drawback with

these estimators is that, depending on how the cell indicesare attached to a cell, it is possible to compute the wrongcells even in the perfect prediction case. Hence, indicesare assigned to cells according to how close, in some sense,the cells are to each other. That is, we order the systems inthe ensemble members depending on their proximity to thecells in the true image. For example, taking only location inmind, if the true image contains only one cell S1 andan ensemble member has two, then we would define themas S11 and S12, provided that ðc1x � c11x Þ

2 þ ðc1y � c11y Þ2<

ðc1x � c12x Þ2 þ ðc1y � c12y Þ

2.In order to take into consideration the effects of loca-

tion, dilation and rotation in each cell, we consider a statis-tical shape analysis framework (e.g., Bookstein, 1991;Stoyan and Stoyan, 1994; Dryden and Mardia, 1998; Micheasand Dey, 2003; Micheas and Dey, 2005; Micheas et al.,2006). To accomplish this, the coordinates of the cells arerewritten using complex coordinates, by considering thegrid as being the first quadrant of the complex plane. Thus,since any cell’s boundary coordinates can be describedby an A · 2 matrix of x–y coordinates ðxjhl ; y

jhlÞ; l ¼ 0;

1; . . . ;A� 1, where xjhl ¼ cjx þ rjhl sin hl, yjhl¼ cjy þ rjhl cos hl,

the x and y grid coordinates of the boundary point on theaxis of angle hl, of the jth cell in the true image, andxkjhl ¼ ckjx þ rkjhl sin hl, ykj

hl¼ ckjy þ rkjhl cos hl, the corresponding

grid coordinates for the jth cell in the kth ensemble mem-ber, then the jth true cell is described by the vector of com-plex numbers

zj ¼ ðxjh0 þ iyjh0; . . . ; xjhA�1 þ iyj

hA�1Þ; ð3Þ

and similarly for the jth cell in the kth ensemble member wedefine

zkj ¼ ðxkjh0 þ iykjh0; . . . ; xkjhA�1 þ iykj

hA�1Þ: ð4Þ

We refer to zj and zkj as the icons of the storm systems. Wedefine zjH ¼ Hzj, the ‘‘Helmertized’’ landmarks of zj, whereH stands for the last ðk� 1Þ-rows of the Helmert matrix, aspecial matrix of independent contrasts (see Dryden andMardia (1998, p. 34)) that help remove location from zj.These rows of the matrix form an orthonormal basis. In or-der to remove scaling effects, we simply take the standard-

ized version of zjH, i.e., we define zjs ¼zjH

kzjHk, where

kzjHk2 ¼ zj�H z

jH, and zj�H denotes the conjugate transpose of

zjH. We call zjs the pre-shape of the shape zj. The pre-shapecan be thought of as the standardized shape of the cell, interms of both location (the average of the landmarks iszero) and scale (size is one).

Hence, let zjs and zkjs be the corresponding pre-shapes ofzj and zkj. Then we define an ordering of pre-shapes (cells)

in each ensemble member. Within the kth ensemble mem-ber we will use the jth cell to estimate the true jth cellwhen

zkjs ¼ argminzkls

dðzjs; zkls Þ ð5Þ

for l = 1,2, . . . ,Ni and where dðzjs; zkls Þ ¼ ðzjs � zkls Þ�ðzjs � zkls Þ

and arg min denotes the argument that minimizes the dis-tance. Notice that for the ordering of the cells in eachensemble member we only take into consideration shapeinformation and not intensity this way. We can easily incor-porate intensity using distances between the intensity mea-sures cjmin; c

jmax, and cjavg, but as far as identifying the

corresponding cells between truth and forecast, intensityis not essential. Thus, when it comes to application of themethods, the first task is to apply the cell identification ap-proach above to obtain the proper pairing between systemsin realizations and corresponding systems in the trueimages.

Shape-analysis techniques in ensemble memberverification

Our goal is to extract shape and intensity information fromeach cell in the true image and its corresponding estimatorfrom the kth ensemble member, and then penalize the now-cast using these measurements. In essence, we will esti-mate the amount of the error associated with dilation,rotation and translation of a forecast cell to a cell fromthe truth. To accomplish this, we first assume that everycell Sj (or z

j) in the true image, is predicted by the jth cellfrom the kth ensemble member, namely Skj (or zkj),j = 1,2, . . . ,N, k = 1,2, . . . ,M. In the case where a forecastcontains fewer cells than the truth then a cell might be usedmore than once to estimate a truth cell, and similarly in theopposite case. We treat the problem of estimating intensityand that of shape separately.

In order to capture shape variation (translation, dilationand rotation) we employ a ‘‘Procrustes’’1 approach. Drydenand Mardia (1998) provide a good overview of Procrustesmethods.

Suppose that there exist a number of dilation parametersrjk > 0, rotation parameters ujk > 0, and translation parame-ter complex vectors bjk = (bjk0, . . . ,bjkA�1), (often all thecoordinates of bjk are equal to a single complex parameterb0jkÞ, such that we may dilate, rotate and translate a given

storm system from some ensemble member and obtain(recreate) the corresponding true storm system. The trans-formations required to match a storm in an ensemble mem-ber to an actual storm, provide the verification measure.That is, zj ¼ rjkeujkizkj þ bjk þ ej, or


xjha þ iyjha¼ rjke

ujkiðxkjha þ iykjhaÞ þ bjka þ eja;

a ¼ 0; 1; . . . ;A� 1; j ¼ 1; 2; . . . ;N; k ¼ 1; 2; . . . ;M;ð6Þ

where eja is an error term from (typically) the complex nor-mal distribution, with mean zero, and variancer2j ¼ r2

j1 þ r2j2i. Thus, there are N complex regression equa-

tions that can be used to estimate the parameters of inter-est for each one of the N true systems. The least squareestimators of rjk, ujk, and bjk, can be denoted by crjk ; cujk,and cbjk. In a statistical shape analysis context, Eq. (6) isthe regression equation used in Procrustes Analysis, wherethe two shapes are matched through similarity transforma-tions, and the differences between zj and the fittedczjk, indi-cate the magnitude of the difference in shape between zj

and zkj. The estimator

czjk ¼ cbjk þcrjkeicujk zkj; ð7Þ

is often referred to as the Full Procrustes fit (superimposi-tion) of zkj onto zj. Now it can be shown that

crjk ¼ jðzkjc Þ�zjcj

ðzkjc Þ�zkjc;

cujk ¼ argððzkjc Þ�zjcÞ;

cbjk ¼ zj �crjkeicujk zkj; or

cb0jk ¼

1

A� 1

XA�1a¼0ðxkjha þ iykj

haÞ �

crjkeicujk

A� 1

XA�1a¼0ðxjha þ iyj

haÞ;

ð8Þ

where

zkj ¼ 1

A� 1

XA�1a¼0ðxkjha þ iykj

haÞ1;

zj ¼ 1

A� 1

XA�1a¼0ðxjha þ iyj

haÞ1;

ð9Þ

are the natural centroids of each cell, 1T = [1, . . . ,1], andzkjc ¼ zkj � zkj; zjc ¼ zj � zj, the centered cells. Notice that

the Procrustes residual sum of squares (RSS) between zj

and zkj (i.e., when estimating the jth true cell with thejth cell from the kth ensemble member), is defined as

RSSjk ¼ ðzj �czjkÞ�ðzj �czjkÞ: ð10Þ

Now, if the forecast cell fits perfectly the corresponding truecell (i.e., crjk ¼ 1; cujk ¼ 0, and cbjk ¼ 0, hence zj = zkj), thenno mistake was made in estimating this cell by the forecastand thus the Procrustes RSS is zero as expected. The overallProcrustes RSS for the kth ensemble member becomes

RSSk ¼XNj¼1ðzj �czjkÞ�ðzj �czjkÞ

¼XNj¼1ðzj � cbjk �crjkeicujk zkjÞ�ðzj � cbjk

�crjkeicujk zkjÞ: ð11Þ

Moreover, an overall estimator of the jth true cell based onall k realizations (i.e., an estimator based on the ensemble)is given by

ezj ¼ 1

M

XMk¼1

czjk : ð12Þ

Penalty function between ensembles

To obtain an objective penalty function we first penalizeforecasts for their deviations from the truth, according tothe following quadratic function that accounts for location,dilation and rotation effects (together), as well as intensityeffects between cells from the truth and ensembles. Thepenalty function between the truth and the kth ensemblemember is defined as

DðzTþs; yðkÞTþsÞ ¼

XNj¼1ðzj � brjkebujk izkj � bbjkÞ�ðzj � brjkebujk izkj � bbjkÞ

þXNj¼1ðcjavg � ckjavgÞ

2 þXNj¼1ðcjmin � ckjminÞ

2

þXNj¼1ðcjmax � ckjmaxÞ

2 ¼ RSSk þ SSðkÞavg þ SSðkÞmin þ SSðkÞmax;

ð13Þ

where the components RSSk, SSðkÞavg; SS

ðkÞmin and SSðkÞmax penalize

differences in shape, average, minimum and maximumintensities, respectively. Several similar penalty functionsmay be created that utilize shape and intensity informationfrom all forecast realizations.

Now, using the estimators crjk ; cujk, andcb0jk, from the full

Procrustes fit on the jth cell from the kth ensemble memberto the jth true cell, we can create a penalty function thatpenalizes an ensemble member according to the individualerrors in location, rotation and dilation. Within the kthensemble member define

SSðkÞshape ¼ SSðkÞloc þ SSðkÞrot þ SSðkÞscale

¼XNj¼1

cb0jk

2

þXNj¼1bu2

jk þXNj¼1

br2jk; ð14Þ

where SSðkÞloc; SSðkÞrot and SSðkÞscale denote the differences (over all

cells within the ensemble member) of location, rotation anddilation, respectively. Then MSSðkÞshape ¼ 1

NSSðkÞshape, denotes the

average effects of location, rotation and dilation for the kthensemble member, and hence a new penalty function is de-fined by

D�ðzTþs; yðkÞTþsÞ ¼ MSSðkÞshape þ SSðkÞavg þ SSðkÞmin þ SSðkÞmax

¼ 1

NSSðkÞloc þ

1

NSSðkÞrot þ

1

NSSðkÞscale þ SSðkÞavg

þ SSðkÞmin þ SSðkÞmax: ð15Þ

Both penalty functions can be used to assess how well anensemble member is performing in estimating the truth.The dimensionality reduction is clearly great, but perhapsthe most interesting consequence of these measures is thatit allows an experimenter to choose the ‘‘best’’ ensemblemember forecast among a set of forecasts being compared,based on several characteristics including location, rotationand scale effects and average, minimum or maximum inten-sities in cells. Eqs. (13) and (15) provide overall measures ofthe differences between the truth and a forecast. For a per-fect forecast, DðzTþs; y

ðkÞTþsÞ should be zero, by the very def-

inition of the quantities in (13), since in this case RSSk,SSðkÞavg; SSðkÞmin and SSðkÞmax, would all be zero. Since we aim tobreak down the error due to differences in shape betweentruth and forecast, we take for each forecast the average


differences across all cells in location, rotation and scale(1NSSðkÞloc,

1NSSðkÞrot, and

1NSSðkÞscale). A perfect forecast in this case

would yield values close to zero for 1NSSðkÞloc and 1

NSSðkÞrot, and

close to one for 1NSSðkÞscale. Hence, D�ðzTþs; y

ðkÞTþsÞ should be

close to one for a perfect forecast.

Nowcast schemes and data

Each of the nowcasting schemes used in this study works ona different principle. For this reason they are likely to pro-duce nowcasts that vary in their treatment of precipitationfields containing structures of different shape, spatial scale,and organization and that move in more or less regularways. In each case, for the verification system to be ap-plied, the nowcast scheme must generate forecast productsin the form of a pseudo-reflectivity grid.

The Warning Decision Support System-Integrated Infor-mation (WDSS-II) has been developed by the National SevereStorms laboratory (NSSL). Now in its second major version,it contains two cell tracking algorithms: The Storm CellIdentification and Tracking (SCIT) system (Johnson et al.,1993) and the K-Means cluster analysis scheme (Lakshmananet al., 2003). The first of these is a centroid tracking schemethat produces motion vectors for identified cells. The latteruses a variable cell clustering technique to identify motionof single cells or groups dependent upon reflectivity thresh-olds set by the user. This latter scheme provides forecasts inthe form of pseudo-reflectivity images and was thereforesuitable for use in this project.

The spectral prognosis (SPROG) nowcasting system wasdeveloped at the Australian Bureau of Meteorology ResearchCentre (BMRC) and uses a spectral decomposition techniqueto selectively identify and track features of different spatialscales. Smaller features are decayed in forecast time inkeeping with the knowledge of persistence and predictabil-ity of rainfall structures in time (Seed, 2003). The resultingforecast spectral components are then recombined into a fi-nal nowcast product in the form of a reflectivity image.Notably, the efficiency of this scheme allows it to be runoperationally in an ensemble member stochastic format asis currently being tested in the Met Office (Pierce et al.,2005). For this study this was not done.

The University of Missouri Bayesian Hierarchical Model(BHM) uses a discretized integrodifference equation ap-proach to forecast pixel-by-pixel motion and intensity ofreflectivity fields (Xu et al., 2005; Fox and Wikle, 2005).The Bayesian nature of the model allows one to obtain pre-diction distributions, and thus, ensemble members. For thisstudy we extracted 20 ensemble members from the poster-ior distribution sample. The BHM used for a demonstrationin this study runs on a 32 · 32 pixel grid because it is ineffi-cient and currently would not produce a viable ensemblemember in a reasonable time if run on a larger grid. Weknow that this is not adequate as an operational forecastscheme and are currently working on an improved versionthat will provide nowcasts on a larger grid. This paper dealswith the verification issue, for which production of ensem-ble member forecasts was required. The effectiveness ofthe verification is not impacted by the size of the grid andwould be the same for a larger grid and number of cells.Description of the BHM, including its limitations, can befound in Xu et al. (2005) and Fox and Wikle (2005).

The verification technique explored in this paper wasdeveloped to work with ensemble member forecast sys-tems, but is not restricted to these and can be used forany spatial forecast field whether it is an ensemble memberor not. The concept is not only to provide objective statis-tical measures of overall forecast skill, but also to identifyindividual ensemble member performance such that differ-ent treatments of the data or configurations of the forecastmodel can be assessed. The separation of errors into thevarious components should provide a tool whereby thecause of the error can be identified and improvements canbe made to the nowcast system based on this knowledge.The verification scheme can be run in real time such thatassessments of nowcast performance can be made and fore-casters could then have confidence in the products or havethe ability to choose between different products (even fromdifferent systems) depending on relative performancestatistics.

The example case was taken using data from the SanAntonio, TX area on 5 July 2002. This day saw a series of in-tense convective storm cells track over the area resulting insome flash flooding. The individual storms were disorga-nized and their tracks influenced by the proximity of thecoast and complex orography inland. Although the stormswere not severe, forecast knowledge of their tracks wouldhave helped in forecasting small catchment response toheavy rainfall.

In order to ensure comparability of the nowcast schemesand, in particular, the performance of the verification, eachscheme was run on the same data. In the example case, thisincludes NEXRAD LEVEL II data reduced to a 32 · 32 grid of4-km resolution pixels from a mosaic of three radars includ-ing Corpus Christi, TX (KCRP), Brownsville, TX (KBRO) andSan Antonio, TX (KEWX) to capture the event. Although thisresolution does not allow any of the nowcast schemes toperform optimally, it does allow for the BHM scheme tobe run in full ensemble mode as a test of the verificationscheme’s handling of a multirealization product.

Application of cell identification andverification methods

Initially, we considered nowcasts from the BHM, WDSS, andSPROG models based on observed radar reflectivities at 10-min intervals from 0130 UTC to 0220 UTC. Specifically, weconsidered nowcasts at 10-min intervals up to 60 min:0230 UTC to 0320 UTC. All forecasts were based on reflectiv-ity images with dimension 32 by 32 pixels. Note, since theBHM model provides ensemble members (20 in this applica-tion), the results presented relate to the posterior meannowcast, unless stated otherwise.

Fig. 1 shows the truth and forecasts from the three mod-els for a 10-min lead-time nowcast verifying at 0230 UTC.The left side of this figure shows the full reflectivity fieldfor the truth and various nowcasts, and the right side showsthe corresponding images after intensities lower than30 dBZ are removed and the boundary finding algorithmhas been applied. The verification methodology then deter-mines the number of ‘‘cells’’ in the truth image, based inthis case on 20 angles. For example, the truth image at0230 UTC is found to contain two cells, with corresponding

Figure 1 (a) Truth at time 0230 UTC. (b) Posterior mean fromthe BHM forecast for time 0230 UTC, lead 10 min. (c) WDSSforecast at time 0230 UTC, lead 10 min. (d) SPROG forecast attime 0230 UTC, lead 10 min.


x-, y-coordinates, minimum, average, and maximumintensities:

True cell 1 : 8:3;�27:8; 30:4; 36:3; 52:1:True cell 2 : 11:6;�6:9; 30:0; 37:0; 50:6:

Similarly, cells are also estimated for the correspondingnowcast images. These cells, relative to the truth cells at0230 UTC, are shown for each nowcast model in Fig. 2. Toquantify the relationship between the cells for the truthand nowcasts, we consider verification summaries. Specifi-cally, we consider the total Procrustes residual sum ofsquares (RSS) and the sum of squares of the intensities(SStot). A summary of these statistics for the nowcasts veri-fying at 0230 UTC to 0320 UTC (10-min lead time nowcastsending at the times) is given in Table 1 where only locationis used for cell correspondence between truth and nowcast,as discussed in Section ‘‘Cell correspondence betweenensemble members’’. Although the primary purpose hereis to demonstrate the methodology, it is interesting tocompare the different nowcast methods based on thesemeasures. In terms of the intensities (SStot), up through40-min lead times, BHM performs best, followed by SPROGand WDSS. For 50- and 60-min lead times, SPROG performsbest, followed by BHM and WDSS. Regarding the total Pro-crustes residual sum of squares, for lead times of 20–50 min, WDSS performs best, followed by SPROG and BHM.At 10-min leads, SPROG is best followed by BHM and WDSS,and at 60 min, BHM is best followed by WDSS and SPROG.Table 2 shows the same RSS statistics but for the case whereProcrustes residuals are used to establish correspondencebetween cells from the truth and nowcasts (as discussedin Section ‘‘Shape-analysis techniques in ensemble memberverification’’). The results are somewhat different, withBHM showing the best performance for medium range leadtimes.

We now look more closely at a specific verification time,0340 UTC, for lead times of 10–60 min. Note that these re-sults are based on a threshold intensity of 20 dBZ and 15 an-gles. This choice, although somewhat subjective, is basedon the ability of the cell detection algorithm to identify spe-cific cells in the truth image. The flexibility of choosing dif-ferent thresholds and number of angles is a strength of themethodology. Table 3 shows the summary statistics for thiscase when only location is used for cell correspondence. Inthis example, SPROG typically performs best at shorter leadtimes, followed by BHM and WDSS. There seems to be someindication that BHM tends to perform better at longer leadtimes. Table 4 shows the comparable verification resultswhen the Procrustes residuals are used for cell correspon-dence. Using this measure, the BHM performs the worst atthe early lead times while the WDSS is best, followed bySPROG. At the longer lead times, SPROG is best, followedby BHM and WDSS.

Recall that a strength of the Procrustes verification ap-proach is that the overall sum of squares statistics can befurther decomposed relative to intensity, shape and scale.Tables 5 and 6 show sum of squares for the minimum,average, and maximum intensity, as well as scale andlocation for the case when cell correspondence is basedon location only, and Procrustes residuals, respectively.These results suggest that each nowcast system has its

Figure 2 (a) Comparing truth to BHM posterior mean ensem-ble, time 0230 UTC. (b) Comparing truth to WDSS ensemblemember, time 0230 UTC. (c) Comparing truth to SPROGensemble member, time 0230 UTC.


strengths and weaknesses. Clearly, more case studiesshould be considered before one can say with confidencethat one method is performing consistently well or poor inany one area. However, the Procrustes-based statisticsprovide a promising suite of measures for such acomparison.

As demonstrated by the above results it is possible tofind an over all error for each nowcast ensemble memberand time-step. More interestingly, it is possible to expressthis error through contributions due to location (SSloc),dilation (SSscale), and intensity (SSmax, SSmin, SSavg). Notshown in the above tables are errors due to rotation(SSrot), as in this case all such errors were negligible. Sucha decomposition allows one to determine how each modelis performing in a variety of ways. For example, fromTables 5 and 6 it is seen that WDSS outperforms bothSPROG and BHM in terms of intensity, but is itself clearlyoutperformed in terms of location, particularly at longerlead-times. This is not surprising given that SPROG andBHM have in-built tendencies to smooth high intensities,whereas WDSS retains current patterns. However, SPROGand BHM seem to account well for the combination ofadvection and propagation of precipitation, whereas WDSShas a simpler advection function. There is no evidencethat the superior location forecasts come at the expenseof dilation, i.e., that location is improved simply byforecasting cell growth, as all the dilation errors aresimilar.

A further comment is in order about the robustness ofthe methodology. Sensitivity analysis was performed withrespect to both the choice of threshold and the number ofangles used. Increasing the number of angles, tends to givesmoother estimates of cells, but still in some cases they canbe more star shaped. Using a very large number defeats thegoal of dimension reduction of the problem. Meanwhile,using only a few, say five, gives crude estimates of the cellsin the forecasts. We found that 10–20 angles seem to workquite well in most cases. The choice of threshold is muchmore subjective. It is expected that the expert meteorolo-gists have a well-developed knowledge of the intensityrange that they need to observe, in order to identify a se-vere storm system. Nonetheless, we investigated both highand lower intensities.

In general, the penalty functions are of similar magni-tudes for small changes in threshold and number of angles.We reached the same results, when it came to small orlarge changes in the number of angles. However, thiswas not the case with the threshold value. Greatly increas-ing the threshold, sometimes yields no identifiable cells insome forecasts. Very low thresholds tend to yield verylarge cells. Changes such as these, greatly affected bothProcrustes as well as the intensity sum of squares. Sincethis is a subjective choice, we advise a careful selectionof threshold that satisfies the needs of the experimenterand the analysis.

There is no reason why one particular forecast shouldproduce the best verification when the measures assess dif-ferent attributes of the forecast. Indeed one major advan-tage of this verification procedure is that it can identifywhich forecast (or ensemble member) produces the mostaccurate representation of the various characteristics ofthe storm. For ensembles that are made up of members

Table 1 Results per time period for all cells, using location only for cell correspondence between cells from truth and forecast

Time period SStot RSS # of cells in truth

WDSS SPROG BHM WDSS SPROG BHM

0230 UTC 25.6253 20.2261 19.459 96.5847 37.9386 48.5222 20240 UTC 51.4803 45.3927 43.5694 75.3826 92.3019 139.9885 30250 UTC 78.85 66.0539 61.7467 52.929 68.2372 69.6679 30300 UTC 93.5189 74.5149 72.0801 33.2543 44.0658 39.3487 30310 UTC 98.4979 69.8259 76.5728 51.9803 60.6706 68.8061 40320 UTC 111.2577 74.439 88.7457 34.6858 51.8965 29.8134 4

Ten-minute lead time nowcasts ending at the times.

Table 2 Results per time period for all cells, using Procrustes residuals for cell correspondence between cells from truth andforecast

Time period RSS # of cells in truth

WDSS SPROG BHM

0230 UTC 113.4805 36.5387 71.2615 20240 UTC 68.9181 75.3691 89.4667 30250 UTC 69.5937 75.2025 61.2549 30300 UTC 60.4894 53.9560 44.9139 30310 UTC 69.4926 59.6525 47.7261 40320 UTC 50.3868 45.7965 54.3680 4

Ten-minute lead time nowcasts ending at the times.

Table 3 Verification results for the 0340 UTC time period, using location only for cell correspondence between cells from truthand forecast

Verifying 0340 UTClead time (min)

SStot RSS

WDSS SPROG BHM WDSS SPROG BHM

10 32.4169 30.3376 32.9085 42.8283 41.9378 39.191320 49.1405 36.3965 41.4114 36.2465 50.5711 48.939830 62.829 44.9979 51.2791 49.8125 46.7081 47.916340 90.9051 51.7074 56.2427 65.1625 44.8942 54.574750 100.567 63.9995 59.6349 93.3849 59.5873 58.246760 118.645 64.8223 62.6202 525.2878 59.0587 89.6373

Table 4 Verification results for the 0340 UTC time period, using Procrustes residuals for cell correspondence between cells fromtruth and forecast

Verifying 0340 UTClead in time (min)

RSS

WDSS SPROG BHM

10 30.4847 34.6459 35.324520 27.3061 29.2164 30.525630 25.0882 35.4505 41.677340 45.1607 33.7056 40.756350 70.4598 46.1167 51.256860 525.2878 58.6487 66.3705


generated by different models (e.g., models containingdifferent convective parameterization schemes or, in thenowcasting realm, different advection/dispersion formula-

tions) the decomposition of the error may allow a determi-nation of which model is handling the situation best. Thiswould best be achieved in a post-event study rather than

Table 5 Verification results for the 0340 UTC time period for individual components, using location only for cell correspondencebetween cells from truth and forecast


Breaking down intensity and shape to specific elements

10 SSmin SSavg SSmax1N SSscale

1N SSloc

WDSS 0.48 49.17 344.99 0.49 61.118SPROG 0.46 26.42 396.15 0.83 4.17BHM 0.16 18.12 429.01 0.83 10.84

20WDSS 0.21 37.3 703.66 0.77 26.82SPROG 0.17 71.78 805.37 1.36 10.43BHM 1.44 64.39 810.4 0.84 8.2

30WDSS 0.5 82.1 566.9 0.5 97.2SPROG 1.0 104.6 991.6 1.4 9.5BHM 0.9 106.9 1074.5 0.9 5.0

40WDSS 1.3 39.8 345.2 0.5 106.9SPROG 1.0 149.7 1177.5 0.9 5.7BHM 1.0 147.2 1211.4 0.9 5.3

50WDSS 1.0 54.2 373.2 0.5 115.4SPROG 1.1 212.0 1602.7 2.4 86.1BHM 1.0 204.7 1573.5 1.8 93.9

60WDSS 0.9 25.4 185.6 0.2 169.2SPROG 0.5 172.0 1401.8 0.7 36.6BHM 0.9 155.3 1258.6 0.4 84.0


as a real-time application, and could then be used to assessmodel applicability to storm type in an analogous way tothat described in Grams et al. (2006). However, the verifica-tion scheme is efficient enough to provide a variety of real-time verification measures. The ’best’ forecast can still beidentified from the total error, but this may depend onthe judgment of the forecaster as to what particular aspectof the storm is the most important and would require a suit-able means of weighting the various components of thetotal error appropriately.

Concluding remarks

This paper introduces a new approach to assessing errors inforecast reflectivity fields that is not limited to the short-period forecasts illustrated herein. One can also envision anumber of ways in which the verifications can be refined,either systematically, by an individual forecaster, or on anevent by event basis. One refinement may be to applyweighting factors to each error term prior to the summa-tion. If one wished to stress storm location as the mostimportant forecast parameter then one could add weightto this term, increasing the penalty for a poor predictionof storm location. Alternatively one may increase theweighting in SSmax to reward forecasts that retain high-

intensity features that may accurately represent areas ofsevere weather, or the error in shape may reflect the abilityof the forecast to maintain or predict structures such asbows and hook-echoes that are indicative of threats ofdownburst or tornado.

As can be seen in the purely mathematical treatmentpresented herein, it is apparent that the different errorterms can result in numerical values that are orders of mag-nitude apart. Although this indicates the relative values ofthe errors it makes intercomparison of the different errorterms difficult and it may be desirable to normalize the er-ror values of each source. This could enable one to comparethe different components and identify from one model tothe next, or one ensemble member to the next, how welleach is dealing with the various morphological factors inthe storm’s behavior.

Finally, there are many avenues of investigation andpotential refinements that can be made to the verificationmethodology. It is planned that further research will beundertaken using the nowcast schemes mentioned hereand others, on a number of cases representing a varietyof meteorological scenarios. Furthermore, we plan toimplement the verification as the standard in our ownnowcast scheme developments as we consider detailedperformance evaluations of the BHM that reveal how ithandles the different aspects of precipitation forecasting

Table 6 Verification results for the 0340 UTC time period for individual components, using Procrustes residuals for cellcorrespondence between cells from truth and forecast


Breaking down intensity and shape to specific elements

10 SSmin SSavg SSmax1N SSscale

1N SSloc

WDSS 0.64 34.65 483.36 0.38 112.83SPROG 0.07 39.16 455.82 0.36 113.61BHM 0.08 52.79 477.52 0.38 112.52

20WDSS 0.26 66.72 874.72 0.62 95.49SPROG 1.11 57.88 653.08 0.34 112.78BHM 1.09 92.75 765.02 0.36 109.17

30WDSS 0.83 48.92 575.37 0.44 117.99SPROG 1.11 86.66 808.88 0.36 112.89BHM 0.99 89.77 987.43 0.73 21.27

40WDSS 0.5 28.8 479.7 0.5 127.6SPROG 1.1 129.6 1004.8 0.4 112.5BHM 1.1 118.8 960.7 0.4 108.4

50WDSS 0.6 25.6 363.1 0.5 128.4SPROG 0.9 124.9 1111.7 0.4 109.2BHM 1.0 142.1 1171.1 0.4 108.7

60WDSS 0.9 25.4 185.6 0.2 169.2SPROG 1.0 149.6 1246.4 0.5 54.4BHM 1.1 96.5 1028.1 0.3 113.3


involved in a good prediction of location structure andintensity.

Acknowledgements

This research was made possible by National Science Foun-dation Grant ATM-0434213. We are also grateful to two ref-erees for their constructive comments and suggestions.

References

Bookstein, F.L., 1991. Morphometric Tools for Landmark Data:Geometry and Biology. Cambridge University Press, Cambridge.

Casati, B., Ross, G., Stephenson, D.B., 2005. A new intensity-scaleapproach for the verification of spatial precipitation forecasts.Meteorol. Appl. 11, 141–154.

Davis, C., Brown, B., Bullock, R., 2006. Object-based verification ofprecipitation forecasts. Part I: Methodology and application tomesoscale rain areas. Mon. Wea. Rev. 34, 1772–1784.

Dryden, I.L., Mardia, K.V., 1998. Statistical Shape Analysis. JohnWiley & Sons.

Ebert, E.E., McBride, K., 2000. Verification of precipitation inweather systems: determination of systematic errors. J. Hydrol.239, 179–202.

Ebert, E.E., Wilson, L.J., Brown, B.G., Nurmi, P., Brooks, H.E.,Bally, J., Jaeneke, M., 2004. Verification of nowcasts from theWWRP Sydney 2000 Forecast Demonstration Project. Wea.Forecasting 19, 73–96.

Fox, N.I., Wikle, C.K., 2005. A Bayesian precipitation nowcastscheme. Wea. Forecasting 20, 264–275.

Grams, J.S., Gallus, W.A., Koch, S.E., Wharton, L.S., Loughe, A.,Ebert, E.E., 2006. The use of a modified Ebert-McBride tech-nique to evaluate mesoscale model QPF as a function ofconvective system morphology during IHOP 2002. Wea. Fore-casting 21, 288–306.

Johnson, J.T., MacKeen, P.L., Witt, A., Mitchell, E.D., Stumpf,G.J., Eilts, M.D., Thomas, K.W., 1993. The storm cell identifi-cation and tracking algorithm: an enhanced WSR-88D algorithm.Wea. Forecasting 13, 263–276.

Lakshmanan, V., Rabin, R., DeBrunner, V., 2003. Multiscale stormidentification and forecast. Atmos. Res., 367–380.

Micheas, A.C., Dey, D.K., 2003. Shape classification procedures withapplications. In: Proceedings of the Section on Bayesian Statis-tical Science, American Statistical Association, San Francisco,2003.

Micheas, A.C., Dey, D.K., 2005. Modeling shape distributions andinferences for assessing differences in shapes. J. MultivariateAnal. 92, 257–280.

Micheas, A.C., Dey, D.K., Mardia, K.V., 2006. Complex ellipticaldistributions with applications to shape analysis. J. Statist.Planning Inference 136, 2961–2982.


Pierce, C.E., Bowler, N., Seed, A., Jones, A., Jones, D., Moore, R.,2005. Use of a stochastic precipitation nowcast scheme forfluvial flood forecasting and warning. Atmos. Sci. Lett. 6, 78–83.

Seed, A.W., 2003. A dynamic and spatial scaling approach toadvection forecasting. J. Appl. Meteor. 42, 381–388.

Stoyan, D., Stoyan, H., 1994. Fractals, Random Shapes and PointFields: Methods of Geometric Statistics. Wiley, Chichester.

Xu, K., Wikle, C.K., Fox, N.I., 2005. A kernel-based spatio-temporaldynamical model for nowcasting weather radar reflectivities. J.Am. Stat. Assoc. 100, 1133–1144.

Cell identification and verification of QPF ensembles using ...€¦ · Cell identification and...

Documents

Transcript of Cell identification and verification of QPF ensembles using ...€¦ · Cell identification and...