This article was downloaded by: [Politecnico di Torino] On: 05 August 2013, At: 11:19 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Hydrological Sciences Journal Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/thsj20 An approach to propagate streamflow statistics along the river network D. Ganora a , F. Laio a & P. Claps a a Department of Environment, Land and Infrastructure Engineering, Politecnico di Torino, I-10129, Torino, Italy Published online: 30 Nov 2012. To cite this article: D. Ganora , F. Laio & P. Claps (2013) An approach to propagate streamflow statistics along the river network, Hydrological Sciences Journal, 58:1, 41-53, DOI: 10.1080/02626667.2012.745643 To link to this article: http://dx.doi.org/10.1080/02626667.2012.745643 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. 
Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http:// www.tandfonline.com/page/terms-and-conditions

Hydrological Sciences Journal – Journal des Sciences Hydrologiques, 58 (1) 2013
http://dx.doi.org/10.1080/02626667.2012.745643

An approach to propagate streamflow statistics along the river network

D. Ganora, F. Laio and P. Claps

Department of Environment, Land and Infrastructure Engineering, Politecnico di Torino, I-10129 Torino, Italy

Received 3 August 2011; accepted 18 April 2012; open for discussion until 1 July 2013

Editor D. Koutsoyiannis; Associate editor S. Grimaldi

Citation Ganora, D., Laio, F., and Claps, P., 2013. An approach to propagate streamflow statistics along the river network. Hydrological Sciences Journal, 58 (1), 41–53.

Abstract Streamflow at ungauged sites is often predicted by means of regional statistical procedures. The standard regional approaches do not preserve the information related to the hierarchy among gauged stations deriving from their location along the river network. However, this information is important when estimating runoff at a site located immediately upstream or downstream of a gauging station. We propose here a novel approach, referred to as the Along-Stream Estimation (ASE) method, to improve runoff estimation at ungauged sites. The ASE approach starts from the regional estimate at an ungauged (target) site, and corrects it based on regional and sample estimates of the same variable at a donor site, where sample data are available. A criterion to define the domain of application of the ASE approach around each donor site is proposed, and the uncertainty inherent in the estimates obtained is evaluated. This allows one to compare the variance of the along-stream estimates to that of other models that may be available for application (e.g. regional models), and thus to choose the most accurate method (or to combine different estimates). The ASE model was applied in the northwest of Italy in connection with an existing regional model for flood frequency analysis. The analysed variables are the first L-moments of the annual discharge maxima. The application demonstrates that the ASE approach can be used effectively to improve the regional estimates for the L-moment of order one (the index flood), particularly when the area ratio of a pair of donor–target basins is less than or equal to ten. However, in this case study, the method does not provide significant improvements to the estimation of higher-order L-moments.

Key words streamflow statistics; river network; information propagation; uncertainty

Approach to the propagation of streamflow statistics along a river network

Résumé Streamflow at ungauged river sections is often estimated by regional statistical procedures. Classical regional methods retain no information about the geographical hierarchy of the stations located along the river network. Yet this information is important when estimating streamflow for a site located immediately upstream or downstream of a gauging station. We propose here a new approach, called Estimation au fil de l'eau (EFE), to improve streamflow estimation at ungauged sites. The EFE approach starts from the regional estimate at an ungauged site (target site), which is then corrected using the regional and sample estimates of the same variable at a donor site where observations are available. A specific criterion is proposed to define the domain of application of the EFE approach around each donor site, and to evaluate the uncertainty of the resulting estimates. This makes it possible to compare the variance of the EFE estimates with that of other statistical models that may be applicable (for example, classical regional models), and thus to choose the most accurate method (or to combine different estimates). The EFE model was applied in northwest Italy, within an existing method for regional flood frequency analysis. The variables estimated at ungauged sites are the first L-moments of the annual flood. The application demonstrates that the EFE approach can be used effectively to improve the regional estimates of the L-moment of order one (the index flood). This is particularly true when the ratio of the basin areas of a donor–target pair is less than or equal to ten. In our case study the method does not show a significant improvement in the estimation of higher-order L-moments.

Mots clefs streamflow statistics; river network; information propagation; uncertainty

© 2013 IAHS Press


1 INTRODUCTION

Prediction of streamflow statistics in ungauged basins is often performed through the use of regional models (e.g. Grimaldi et al. 2011); a procedure able to exploit local information and to improve regional estimates would thus be useful for many purposes.

Several types of regional models have been proposed in the literature based on the underlying idea to “substitute time for space” (US National Research Council 1988), i.e. to compensate for the lack of a data record at a certain location by transferring information from other gauged sites. These models differ in terms of the regionalized variable and the mathematical framework used for the information transfer, while their common feature is that the transfer of hydrological information takes place in the so-called descriptor space. The dimensions of the descriptor space are the catchment characteristics (usually topographic, morphological, pedological or climatic) that can be computed for each basin without resorting to hydrologic data. Suitable relationships are then built to relate some of these characteristics to the desired hydrological variable, thus providing a tool for estimating the variable in ungauged basins.

Differing from regional approaches, the basic concept underpinning the model developed in this work is the transfer of hydrological information to an ungauged site located upstream or downstream of existing gauging stations. This “propagation of information” involves a supporting variable calculated at a gauged (or donor) basin on the basis of sample data that are used to propagate the information towards the ungauged (target) site. The target and the donor site are directly connected by the drainage network, i.e. the two drainage basins are nested.

Given this perspective, this information transfer can be supposed to be helpful only if the two sites are close enough. The estimation of the uncertainty of the propagated prediction is thus a key element, since it allows one to evaluate whether the propagated prediction is better than the regional model prediction. This model is named Along-Stream Estimation (ASE) in the following, to underline that it is based on the river network structure. Any discharge-related variable can in principle be propagated with the ASE approach. The procedure is applied here to the first L-moments (e.g. Hosking and Wallis 1997) computed on the record of annual streamflow maxima. These statistics can be profitably used to estimate flood frequency curves (Laio et al. 2011).

The issue of prediction or interpolation of hydrological variables along the river network is not frequently discussed in the literature, although some notable exceptions are found. Kjeldsen and Jones (2007) adopted a procedure which, analogous to our model, implements the idea of locally correcting the regional estimates on the basis of proximal sources of information. However, this procedure does not account for the river network structure. We will return to the similarities and differences with the work of Kjeldsen and Jones (2007) later in this paper (Section 4).

Gottschalk (1993a, 1993b) approached a similar problem considering the network structure and introduced the issue of correlation and covariance of runoff, adapting the theory of stochastic processes to the hierarchical structure of nested catchments. This approach was later extended by Gottschalk et al. (2006, 2011), and similar concepts are used by Skøien et al. (2006) in the development of a kriging procedure that accounts for the river structure, named topological kriging or top-kriging. These approaches, differing from the ASE model, do not aim to correct regional estimates already available for the ungauged sites, but provide independent estimators of design floods.

2 ALONG-STREAM INFORMATION PROPAGATION METHOD

2.1 Method and assumptions

In the ASE approach, a generic hydrological variable (e.g. the mean of annual streamflow maxima) is represented as $S$ when it is computed on the empirical sample, while the same variable computed through the propagation of information is denoted by $P$. Moreover, to allow generalizations, the gauged or donor site is denoted by the subscript $d$, and the ungauged or target site by the subscript $t$.

The approach is based on the following hypotheses:

• Proximity: the target site is located on the same stream path as the donor station, upstream or downstream, i.e. the two basins $d$ and $t$ are nested.

• Transferability: the variable $S_d$, computed at the donor site, is used as the support of propagation in the information transfer scheme, $P_t = f(S_d, \theta)$, where $P_t$ is the propagated variable at the target site, $\theta$ is an additional (optional) set of parameters and $f$ is a function to be defined.

• Congruence: when the distance between the donor and the ungauged catchments becomes null, the two basins coincide and the along-stream estimate at the ungauged site matches the at-site estimate at the gauged basin, i.e. $P_t \to S_d$ for $t \to d$. The distance is intended in a general sense, and does not necessarily represent the geographical distance or the length of the drainage path between the two points.

Setting the validity domain is very important for assessing the reliability of the ASE method. Here the point is treated in an intuitive way: the idea is to consider the ASE model applicable only where the uncertainty, i.e. the standard deviation $\sigma_{P_t}$ of the propagated prediction, can be suitably estimated. The value of $\sigma_{P_t}$ can be statistically evaluated by considering the residuals obtained during the calibration phase of the method: the more efficient the function for the information transfer, the smaller the residuals and the larger the domain of validity.

The prediction, $P_t$, is obtained through the function $f$, and its standard deviation can be obtained as the combination of a model error, due to the approximate form of $f$, and a sample error, owing to the use of the variable $S_d$, which is computed from the data. Considering these sources of uncertainty in detail is outside the scope of this paper; however, the variance of $P_t$ can be quantified in a simplified way: given the particular function $f$ for the information transfer, together with its corresponding domain of validity, the variance of the along-stream prediction is assumed to increase moving away from the donor site, but still within the validity domain. Outside this domain, the along-stream prediction is deemed unreliable and it is therefore no longer necessary to compute its variance. A sketch representing this aspect is shown in Fig. 1(c) and details are reported in Section 2.3.

Fig. 1 Sketch of the along-stream propagation of information: (a) a hydrological variable calculated at the donor (gauged) site is used to predict the value of the same variable at the target locations located upstream or downstream. (b) Different functions can be adopted to achieve this aim; however, each function has a particular domain of validity around the donor station. (c) The variance of the new predictions is assumed to increase moving away from the donor station, within the domain of validity. This is no longer applicable outside the validity domain. [Panel labels: donor basin; target sites; information transfer: function 1; information transfer: function 2; different domain of validity; increasing estimation variance; variance computation not applicable.]

The ASE approach is meant to be applied to an area for which a regional flood frequency model has already been developed. The regional model is used: (a) as a reference model for results comparison; (b) as a substitute for $P_t$ where the ASE approach is not applicable; and (c) as a source of information for the definition of the additional set of parameters $\theta$. For instance, the regional model used here (presented in detail by Laio et al. 2011) allows one to evaluate the flood frequency curve in ungauged basins through the estimation of three L-moments on the basis of a set of three regional relationships: the index flood ($Q_{\mathrm{ind}}$, average of annual maxima), and the coefficients of L-variation (LCV) and L-skewness (LCA).

2.2 Propagation of information

The first step to implement the along-stream estimation procedure is to define a suitable function $f$ to compute the variable $P$ at the target site $t$, given the value $S_d$ at the donor site. Here we adopt an equation proposed in the Flood Estimation Handbook (Institute of Hydrology 1999) and re-analysed by Kjeldsen and Jones (2007). The function $f$ reads:

$$ f(S_d, \theta) = \frac{R_t}{R_d}\, S_d \tag{1} $$

where $R$ refers to the estimates obtained from the regional procedure and $S_d$ is the at-site sample value of the variable at the donor site. Equation (1) can be interpreted as follows: the regional estimate $R_t$ in $t$ is corrected by a factor equal to the relative error that the regional model produces in $d$ (i.e. $S_d/R_d$). Note that here all the symbols $P$, $R$ and $S$ represent the same hydrological variable of interest (e.g. the index flood, LCV, or LCA).

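The propagated estimate of $P_t$ can then be written as:

$$ P_t = \begin{cases} \dfrac{R_t}{R_d}\, S_d & \text{if } D \le D_{\mathrm{lim}} \\[4pt] R_t & \text{if } D > D_{\mathrm{lim}} \end{cases} \tag{2} $$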

where $D$ is the generalized distance between $t$ and $d$ and $D_{\mathrm{lim}}$ is the threshold distance beyond which the propagation is no longer effective. For $D \to 0$ it is straightforward to verify that $P_t \to S_d$. In this context, since we already have an alternative model (the regional model) available for the prediction of the variable at the ungauged site, it seems appropriate to use the pure regional estimates in the cases where $D > D_{\mathrm{lim}}$.
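Equation (2) can be illustrated with a minimal Python sketch. The function name, argument names and numerical values below are hypothetical and only serve to show the two branches and the congruence property; they are not taken from the case study.

```python
def propagate(S_d, R_d, R_t, D, D_lim):
    """Propagated estimate P_t following equation (2).

    S_d   : at-site sample value at the donor site
    R_d   : regional estimate at the donor site
    R_t   : regional estimate at the target site
    D     : generalized distance between donor and target
    D_lim : threshold distance (hypothetical value in the example)
    """
    if D <= D_lim:
        # Regional estimate at the target, rescaled by the donor ratio S_d / R_d.
        return R_t / R_d * S_d
    # Beyond the threshold, fall back on the pure regional estimate.
    return R_t


# Hypothetical numbers: the regional model underestimates the donor by 20%.
inside = propagate(S_d=120.0, R_d=100.0, R_t=50.0, D=0.5, D_lim=2.3)
outside = propagate(S_d=120.0, R_d=100.0, R_t=50.0, D=3.0, D_lim=2.3)
# Congruence: when target and donor coincide (R_t = R_d, D = 0), P_t equals S_d.
same_site = propagate(S_d=120.0, R_d=100.0, R_t=100.0, D=0.0, D_lim=2.3)
```

Within the threshold the 20% correction observed at the donor is carried over to the target; outside it the regional value is returned unchanged.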

2.3 Model reliability: operational estimate and prediction uncertainty

The framework introduced in Section 2.1, which highlights the idea that the model is applicable only in a limited neighbourhood, is common also to other approaches for local correction of regional estimates. However, in our methodology, we explicitly evaluate the effectiveness of such correction. From a practical point of view, this introduces a further rule with respect to the propagated estimate of equation (2) that is formalized by defining the operational or along-stream estimate $\mathrm{ASE}_t$ as:

$$ \mathrm{ASE}_t = \begin{cases} P_t & \text{if } D \le D_{\mathrm{lim}} \text{ and } \sigma_{P_t} \le \sigma_{R_t} \\[4pt] R_t & \text{otherwise} \end{cases} \tag{3} $$

where $\sigma_{R_t}$ is the standard deviation of the regional prediction, and $\sigma_{P_t}$ is the standard deviation (to be evaluated) of the propagated variable. This means that, even for $D \le D_{\mathrm{lim}}$, the propagated prediction $P_t$ is accepted only if $\sigma_{P_t}$ is not greater than the corresponding regional uncertainty $\sigma_{R_t}$. The standard deviation of the regional prediction should be available from the regional model used (see e.g. Laio et al. 2011, for our case study).
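The selection rule of equation (3) can be sketched as follows. This is only an illustration: the function name and the numerical values are hypothetical, and the standard deviations are assumed to be already available.

```python
def operational_estimate(P_t, R_t, sigma_P, sigma_R, D, D_lim):
    """Operational (along-stream) estimate ASE_t following equation (3).

    The propagated value P_t is accepted only if the pair lies within the
    threshold distance AND its standard deviation does not exceed that of
    the regional prediction; otherwise the regional value R_t is used.
    """
    if D <= D_lim and sigma_P <= sigma_R:
        return P_t
    return R_t


# Hypothetical values illustrating the three possible outcomes.
accepted = operational_estimate(P_t=60.0, R_t=50.0, sigma_P=7.5, sigma_R=12.0, D=0.5, D_lim=2.3)
too_uncertain = operational_estimate(P_t=60.0, R_t=50.0, sigma_P=15.0, sigma_R=12.0, D=0.5, D_lim=2.3)
too_far = operational_estimate(P_t=60.0, R_t=50.0, sigma_P=7.5, sigma_R=12.0, D=3.0, D_lim=2.3)
```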

The basics of the ASE method can then be summarized in three steps: (a) choice of a suitable framework for the information transfer, (b) definition of the threshold distance, $D_{\mathrm{lim}}$, and (c) evaluation of the uncertainty of the propagated estimate. In Section 2.2, a practical formula (equation (2)) is proposed without providing a quantitative assessment of $D_{\mathrm{lim}}$. In this section, we investigate the suitability of a simplified approach for $D_{\mathrm{lim}}$ quantification (point (b)), together with an overall evaluation of the performance of the along-stream estimation approach (point (c)). These two steps are carried out jointly through an iterative procedure to provide an optimal estimate of $D_{\mathrm{lim}}$. This iterative framework is general, and can also be applied to propagation equations that are different from equation (2).

To implement the ASE framework, it is necessary to define a suitable relationship that represents the uncertainty of $P_t$. As mentioned earlier, an analytical equation for $\sigma_{P_t}$ could be derived on the basis of equation (1), although the effect of the model error on $\sigma_{P_t}$ is not easy to define. Consequently, in our approach, we resort to a simple model of the $P_t$ uncertainty:

$$ \mathrm{CV}_{P_t} = (1 + \alpha D)\, \mathrm{CV}_{S_d} \tag{4} $$

where CV is the coefficient of variation of the variable indicated in the subscript, i.e. the ratio between its standard deviation and its mean. This model for predicting $\mathrm{CV}_{P_t}$ can be interpreted as follows: the coefficient of variation of $P_t$ equals the coefficient of variation of the at-site estimate at the gauged site (which includes the sample uncertainty), augmented in proportion to a factor $\alpha$ that accounts both for the inexactness of the ASE transfer function (model error) and for the variance of the other variables involved in equation (2) (in this case the regional values $R$). This can also be thought of as a first-order approximation of a more complex function for variance propagation.

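Considering the definition of $P_t$ given in equation (2), and the definition of CV in equation (4), we obtain for $D \le D_{\mathrm{lim}}$:

$$ \sigma_{P_t}\, \frac{R_d}{R_t\, S_d} = (1 + \alpha D)\, \frac{\sigma_{S_d}}{S_d} \tag{5} $$

and thus:

$$ \sigma_{P_t} = (1 + \alpha D)\, \sigma_{S_d}\, \frac{R_t}{R_d} \tag{6} $$

For $D \to 0$, it is straightforward to verify that $\sigma_{P_t} \to \sigma_{S_d}$, confirming the congruence hypothesis.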
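As a check on the step from equation (4) to equation (6), the short Python sketch below computes the standard deviation of the propagated estimate and verifies numerically that the coefficient of variation scales by the factor (1 + alpha * D). The name `alpha` stands for the coefficient of equation (4); all numbers are hypothetical.

```python
import math


def sigma_propagated(sigma_Sd, R_d, R_t, D, alpha):
    """Standard deviation of the propagated estimate, equation (6)."""
    return (1.0 + alpha * D) * sigma_Sd * R_t / R_d


# Hypothetical inputs.
S_d, sigma_Sd = 120.0, 10.0      # donor sample value and its standard deviation
R_d, R_t = 100.0, 50.0           # regional estimates at donor and target
D, alpha = 1.0, 0.5              # generalized distance and coefficient of equation (4)

P_t = R_t / R_d * S_d                                  # equation (2), within the threshold
sigma_P = sigma_propagated(sigma_Sd, R_d, R_t, D, alpha)

cv_P = sigma_P / P_t             # coefficient of variation of the propagated estimate
cv_S = sigma_Sd / S_d            # coefficient of variation of the donor estimate
# Equation (4) is recovered: cv_P equals (1 + alpha * D) * cv_S.
consistent = math.isclose(cv_P, (1.0 + alpha * D) * cv_S)
```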

The evaluation of the uncertainty of the propagated estimate using equation (6) first requires estimation of the parameter $\alpha$. As a first attempt, we calibrated $\alpha$ on the basis of the available data set, rearranged to account for each donor–target correspondence (details in Section 3) and considering only the basin pairs within the threshold distance. For each pair of basins, the residual between $P_t$ and its corresponding at-site value $S_t$ is:

$$ \delta_t = P_t - S_t \tag{7} $$

$P_t$ and $S_t$ are assumed to be independent random variables, i.e. the covariance between $S_d$ and $S_t$ is neglected. This hypothesis allows one to keep the framework simple and is justified by the fact that the covariance of flood statistics rapidly declines in orographically complex areas and cannot be robustly estimated in the study area (Laio et al. 2011). Using equation (6) one obtains:

$$ \sigma_{\delta}^2 = \sigma_{P_t}^2 + \sigma_{S_t}^2 = (1 + \alpha D)^2\, \sigma_{S_d}^2 \left( \frac{R_t}{R_d} \right)^2 + \sigma_{S_t}^2 \tag{8} $$

The coefficient $\alpha$ can thus be estimated by means of a maximum-likelihood approach: the residuals $\delta_t$ are assumed to be normally distributed with zero mean and a variance that changes from site to site according to equation (8). The likelihood function is obtained as the product of the (normal) marginal probability distributions.
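The maximum-likelihood step can be sketched in Python as below. The sketch assumes zero-mean normal residuals with the variance of equation (8) and, for simplicity, replaces a numerical optimizer with a plain grid search over non-negative values of the coefficient; the grid bounds, the tuple layout and the sample pair are hypothetical choices, not part of the original procedure.

```python
import math


def residual_variance(alpha, D, sigma_Sd, ratio, sigma_St):
    """Variance of the residual delta_t, equation (8); ratio stands for R_t / R_d."""
    return ((1.0 + alpha * D) * sigma_Sd * ratio) ** 2 + sigma_St ** 2


def negative_log_likelihood(alpha, pairs):
    """Sum of the negative log-densities of independent zero-mean normal residuals."""
    nll = 0.0
    for delta, D, sigma_Sd, ratio, sigma_St in pairs:
        var = residual_variance(alpha, D, sigma_Sd, ratio, sigma_St)
        nll += 0.5 * math.log(2.0 * math.pi * var) + delta ** 2 / (2.0 * var)
    return nll


def estimate_alpha(pairs, alpha_max=5.0, step=0.01):
    """Grid-search estimate of alpha on [0, alpha_max] (assumed search range)."""
    n = int(round(alpha_max / step))
    candidates = [i * step for i in range(n + 1)]
    return min(candidates, key=lambda a: negative_log_likelihood(a, pairs))


# One hypothetical pair, built so that the squared residual equals the
# variance of equation (8) exactly when alpha = 1: (2 * 3)^2 + 4^2 = 52.
pairs = [(math.sqrt(52.0), 1.0, 3.0, 1.0, 4.0)]
alpha_hat = estimate_alpha(pairs)
```

With real data each tuple would hold the residual of equation (7) and the quantities needed by equation (8) for one donor–target pair within the threshold distance.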

Note that, even if equation (6) relates a standard deviation to a distance like a variogram, it is conceptually different from a variogram because the calibration is performed using the residuals $\delta_t$, and this does not require the availability of simultaneous observations. This is particularly important in the context of limited data availability, where variograms cannot be robustly estimated.

The values of $D_{\mathrm{lim}}$ and $\alpha$ (which are not known a priori) are optimized by means of a trial-and-error procedure:

(a) a tentative value of $D_{\mathrm{lim}}$ is selected;

(b) the propagated estimate $P_t$ is computed as in equation (2);

(c) the residuals $\delta_t$ are computed and the parameter $\alpha$ is evaluated in the maximum-likelihood framework, only for basin pairs within $D_{\mathrm{lim}}$;

(d) based on $\alpha$, the variance of the $P_t$ prediction is computed with equation (6) and it is compared against the variance of the regional prediction at the same location;

(e) the operational estimate $\mathrm{ASE}_t$ is obtained by following the rules in equation (3);

(f) an error index (see equation (11) in the following) is computed both for the ASE model and the regional model, and the two error indexes are compared;

(g) the procedure is repeated, changing the tentative value of $D_{\mathrm{lim}}$; and

(h) the $D_{\mathrm{lim}}$ value that minimizes the overall error of $\mathrm{ASE}_t$ is assumed as the distance threshold.
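Steps (a) to (h) can be sketched as a single loop. The Python code below is a simplified illustration under several assumptions: the coefficient of equation (4) (called `alpha`) is fitted by grid search rather than by a numerical optimizer, the error index is the RMSE of normalized errors evaluated only on the pairs within the tentative threshold, and the dictionary keys and the two sample pairs are hypothetical.

```python
import math


def sigma_propagated(alpha, p):
    # Equation (6): standard deviation of the propagated estimate.
    return (1.0 + alpha * p["D"]) * p["sigma_Sd"] * p["R_t"] / p["R_d"]


def fit_alpha(pairs, alpha_grid):
    # Step (c): grid-search maximum likelihood with the variance of equation (8).
    def nll(alpha):
        total = 0.0
        for p in pairs:
            delta = p["R_t"] / p["R_d"] * p["S_d"] - p["S_t"]
            var = sigma_propagated(alpha, p) ** 2 + p["sigma_St"] ** 2
            total += 0.5 * math.log(2.0 * math.pi * var) + delta ** 2 / (2.0 * var)
        return total
    return min(alpha_grid, key=nll)


def rmse(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


def search_D_lim(pairs, D_lim_grid, alpha_grid):
    best = None
    for D_lim in D_lim_grid:                                    # steps (a) and (g)
        inside = [p for p in pairs if p["D"] <= D_lim]
        if not inside:
            continue
        alpha = fit_alpha(inside, alpha_grid)                   # step (c)
        err_ase, err_reg = [], []
        for p in inside:
            P_t = p["R_t"] / p["R_d"] * p["S_d"]                # step (b)
            accept = sigma_propagated(alpha, p) <= p["sigma_Rt"]  # step (d)
            ASE_t = P_t if accept else p["R_t"]                 # step (e)
            err_ase.append((ASE_t - p["S_t"]) / p["sigma_St"])  # step (f)
            err_reg.append((p["R_t"] - p["S_t"]) / p["sigma_St"])
        result = {"D_lim": D_lim, "alpha": alpha,
                  "rmse_ase": rmse(err_ase), "rmse_reg": rmse(err_reg)}
        if best is None or result["rmse_ase"] < best["rmse_ase"]:  # step (h)
            best = result
    return best


# Two hypothetical donor-target pairs: a close one and a distant one.
example_pairs = [
    {"D": 0.5, "S_d": 120.0, "R_d": 100.0, "R_t": 50.0, "S_t": 60.0,
     "sigma_Sd": 10.0, "sigma_St": 5.0, "sigma_Rt": 20.0},
    {"D": 3.0, "S_d": 200.0, "R_d": 100.0, "R_t": 40.0, "S_t": 40.0,
     "sigma_Sd": 10.0, "sigma_St": 5.0, "sigma_Rt": 100.0},
]
best = search_D_lim(example_pairs, D_lim_grid=[1.0, 4.0], alpha_grid=[0.0, 0.5, 1.0])
```

In this toy example the smaller threshold is retained, because including the distant pair propagates a donor correction that does not hold at the target.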

This procedure considers a unique $D_{\mathrm{lim}}$ value which is valid for the whole case study. A “global” $D_{\mathrm{lim}}$ value is necessary, from the computational point of view, to perform the estimation of $\alpha$. The search for $D_{\mathrm{lim}}$ is based on a trial-and-error procedure that roughly indicates the maximum distance within which appreciable improvements are obtained. A more precise estimation of $D_{\mathrm{lim}}$ would be superfluous because, even when $D < D_{\mathrm{lim}}$, the propagated estimate is further compared to the regional one to choose the most appropriate estimate (see equation (3)).

3 CASE STUDY

3.1 Organization of nested basins

The regional models employed in this work are in the form of multiple regression models calibrated over a data set of 70 basins located in northwest Italy; geomorphologic and climatic indexes available for any basin are used as explanatory variables (Laio et al. 2011). The basins belong mainly to mountainous areas, have areas ranging between 22 and 3320 km² and mean elevation from 471 to 2719 m a.s.l. To reduce any effect of upstream lakes and/or reservoirs, basins whose catchment area is more than 10% covered by lakes are discarded. The investigated region presents basins subject to various climate regimes, from purely nivo-glacial to almost temperate-Mediterranean. Further details can be found in Claps and Laio (2008). Each prediction model takes the form:

$$ Y = \mathbf{x}^{\mathrm{T}} \boldsymbol{\beta} \tag{9} $$

where $Y$ is the regionalized variable (L-moment), $\mathbf{x}$ is the vector containing the descriptors for the considered basin, with one as the first element, and $\boldsymbol{\beta}$ is the vector of regression coefficients, obtained following a generalized least-squares procedure modified after Stedinger and Tasker (1985). Laio et al. (2011) found that, for the estimation of the index flood, it is more appropriate to apply equation (9) to the log-transformed data; consequently, a back-transformation is required to obtain $Q_{\mathrm{ind}}$ from $Y$. The flood frequency curve is then reconstructed considering the regionalized L-moments. The regional model of Laio et al. (2011) also allows one to evaluate the standard deviation of each L-moment predicted at an ungauged site.
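The linear predictor of equation (9) can be illustrated with a few lines of Python. The descriptor values and coefficients below are invented for the example, and the back-transformation is assumed here to be a plain exponential of a natural-log model without any bias correction, which may differ from the treatment in Laio et al. (2011).

```python
import math


def regional_prediction(descriptors, beta, log_transformed=False):
    """Evaluate Y = x^T beta, with x = [1, descriptor_1, descriptor_2, ...].

    If log_transformed is True, the model is assumed to predict the natural
    logarithm of the variable and exp(Y) is returned (assumption: no bias
    correction is applied in the back-transformation).
    """
    x = [1.0] + list(descriptors)          # leading one for the intercept
    if len(x) != len(beta):
        raise ValueError("beta must have one more element than descriptors")
    y = sum(xi * bi for xi, bi in zip(x, beta))
    return math.exp(y) if log_transformed else y


# Hypothetical descriptors and coefficients.
y_linear = regional_prediction([2.0, 3.0], [1.0, 0.5, -0.2])
q_ind = regional_prediction([2.0, 3.0], [1.0, 0.5, -0.2], log_transformed=True)
```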

The suitability of the ASE approach is evaluated considering as a case study the same set of 70 basins used by Laio et al. (2011); however, here the data are organized in a different way, since it is more appropriate to work in terms of pairs of basins $\{t,d\}$ rather than with a single catchment at a time. Figure 2 shows a schematic representation of the hierarchical dependence of nested catchments, the connections being represented by a line. Note that there are several multi-connected basins, as well as basins with no connections. All the connected (nested) catchments have been considered as possible pairs of donor–target sites (e.g. in Fig. 2, Basin 1 is nested with Basin 15 although Basin 13 lies between the two).

The connections are considered in both directions, e.g. if Basin 9 is upstream of Basin 10, we first consider Basin 9 as the donor site and Basin 10 as the target (ungauged) site; then the procedure is repeated using Basin 10 as the donor site and Basin 9 as the target (ungauged) site. Considering all of the possible connections of two stations along the same drainage path (nested basins), there is a total of 142 connections. Every pair is characterized by a generalized distance $D$ between them.
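The construction of donor–target pairs can be sketched as follows. The sketch assumes that the network is described by a mapping from each gauged basin to the next gauged basin downstream (a hypothetical data layout, with made-up basin labels), and it enumerates every nested pair in both directions, including pairs with intermediate stations.

```python
def donor_target_pairs(next_downstream):
    """Return all ordered (donor, target) pairs of nested basins.

    next_downstream maps a basin label to the label of the next gauged basin
    downstream; basins without a downstream gauge are simply absent as keys.
    The network is assumed to be tree-like (no cycles).
    """
    pairs = []
    for basin in next_downstream:
        current = next_downstream.get(basin)
        while current is not None:
            pairs.append((basin, current))    # upstream basin as donor
            pairs.append((current, basin))    # downstream basin as donor
            current = next_downstream.get(current)
    return pairs


# Hypothetical network: A drains to B, B drains to C; X drains to Y.
pairs = donor_target_pairs({"A": "B", "B": "C", "X": "Y"})
```

Here A and C form a pair even though B lies between them, mirroring the treatment of intermediate stations described above.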

The distance $D$ between two catchments can be defined in different ways, but it is preferable to avoid both the geographical distance and the length of the drainage path linking the two closing sections. For instance, such definitions would not correctly represent the abrupt change in basin characteristics that is expected between two points located just upstream or downstream of a tributary. We propose a definition of the distance based on the ratio of basin areas $A$:

$$ D = \log\left( A_{\max} / A_{\min} \right) \tag{10} $$

with $A_{\max} = \max[A_t, A_d]$ and $A_{\min} = \min[A_t, A_d]$. Under the proximity hypothesis (but not in general), two basins with the same area have null distance (they are the same basin). Consequently, their estimates must coincide (congruence hypothesis). Other variables may be included in the representation of the generalized distance; for example, the mean basin elevation can be useful when the data set is composed of basins from both mountainous and plain areas.
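Equation (10) translates directly into code. The sketch below assumes that the logarithm is natural, which appears consistent with the correspondence reported later in the text between a distance limit of 5.03 and an area ratio of about 150; the function name and the areas are hypothetical.

```python
import math


def generalized_distance(A_t, A_d):
    """Generalized distance of equation (10), assuming a natural logarithm."""
    A_max = max(A_t, A_d)
    A_min = min(A_t, A_d)
    return math.log(A_max / A_min)


# The distance is symmetric and vanishes for basins of equal area.
d_up = generalized_distance(A_t=50.0, A_d=500.0)
d_down = generalized_distance(A_t=500.0, A_d=50.0)
d_same = generalized_distance(A_t=120.0, A_d=120.0)

# Area ratio corresponding to a distance of 5.03 under the natural-log assumption.
ratio_at_limit = math.exp(5.03)
```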

3.2 Application of the iterative procedure

The trial-and-error procedure described in Section 2.3 was applied to the index flood, the LCV and the LCA statistics, using equation (10) as the distance measure between basins. The variable $\log(Q_{\mathrm{ind}})$, i.e. the log-transformed index flood, was also considered, using a slightly modified version of the ASE approach in order to compare our model with that proposed by Kjeldsen and Jones (2007). Details of this comparison are reported in Section 4.

The main results of the application of the ASEmethod with a number of tentative threshold distancesare summarized for the index flood in Fig. 3, where

Dow

nloa

ded

by [P

olite

cnic

o di

Tor

ino]

at 1

1:19

05

Aug

ust 2

013

Page 8: An approach to propagate streamflow statistics along …claps/Papers/HSJ_2013.pdf · lorsqu’on estime des débits pour un site situé immédiatement en amont ou en aval d’une

An approach to propagate streamflow statistics along the river network 47

Fig

.2G

augi

ngst

atio

nsly

ing

onth

esa

me

drai

nage

path

,eith

erup

stre

amor

dow

nstr

eam

,tha

tare

dire

ctly

conn

ecte

d(n

este

dba

sins

)are

sche

mat

ical

lylin

ked

with

alin

e.So

me

catc

hmen

tsha

vem

ultip

leco

nnec

tions

,oth

ers

are

isol

ated

.


[Figure 3: RMSE (y-axis) against the area ratio limit (x-axis, ticks from 2 to 200). Legend: Regional (reference), RMSE_REG; Operational, RMSE_ASE; Propagated only, RMSE_PRO.]

Fig. 3 Average error (RMSE of dimensionless errors $\varepsilon_{\{t,d\}}$) within the domain of validity considering only the propagated estimate (solid line) and the operative ASE prediction (dashed line). The global average error of the regional model is indicated, for reference, by a dotted line.

the root mean squared errors (RMSE) of the normalized prediction errors are plotted as a function of the area ratio of the donor and target basins. The normalized errors are defined as:

$$\varepsilon_{\{t,d\}} = \frac{(\text{prediction})_{\{t,d\}} - S_t}{\sigma_{S_t}} \qquad (11)$$

where “prediction” indicates the approach used to make the estimation; the residuals are normalized by $\sigma_{S_t}$ to account for the sample uncertainty at the target site, which can be relevant if the donor has a short record.
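The normalized error and its RMSE can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical and the three (prediction, sample value, sample standard deviation) triples are made-up numbers, not data from the case study.

```python
import math


def normalized_error(prediction: float, sample_target: float,
                     sigma_sample_target: float) -> float:
    """Residual at the target site scaled by the sample standard deviation."""
    return (prediction - sample_target) / sigma_sample_target


def rmse(errors: list[float]) -> float:
    """Root mean squared error of a list of (normalized) errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Made-up triples: (prediction, sample estimate at target, its std deviation).
pairs = [(110.0, 100.0, 5.0), (95.0, 100.0, 5.0), (42.0, 40.0, 4.0)]
errors = [normalized_error(p, s, sd) for p, s, sd in pairs]
score = rmse(errors)
```

Because each residual is divided by the sample standard deviation at the target, a large residual at a poorly sampled site weighs less than the same residual at a well-sampled one.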

In the first instance, Fig. 3 allows one to compare the behaviour, in terms of RMSE, of the operational ASE estimator of equation (3) (in the following referred to as RMSE_ASE), against the simple propagation of information of equation (2) (RMSE_PRO). Both approaches were applied over all possible pairs {t,d}, but only within the distance limit. The last point of the trial-and-error procedure is relative to a distance limit of 5.03 (equivalent to an area ratio of about 150) that includes all the available basin pairs, i.e. it is equivalent to an unbounded domain of validity. In Fig. 3, the global RMSE computed considering only the regional predictions over the whole data set, RMSE_REG, is also reported for comparison.

Some important results can be deduced from the RMSE_PRO curve: it presents a clear increasing trend with increasing threshold distance, and RMSE_REG is equalled for an area ratio between 10 and 20. This indicates that the use of a simple propagation of information as in equation (2) is effective only for relatively short distances.

More important, Fig. 3 shows the effectiveness of the ASE method relative to the simple propagation approach; in fact, the RMSE_ASE is always lower than the RMSE_PRO, meaning that the selection criterion in equation (3), based on the standard deviation of the propagated and regional estimates, works properly; in other words, this is confirmation that, on average, the operational model is able to correctly select the best approach (regional or propagated). As expected, for large area ratios, the ASE performances approach those of the regional model because $\sigma_{P_t}$ increases and the regional model is selected most of the time in equation (3). Thus, the ASE model has better performances compared to the regional model alone, even when there is no distance limit.
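Equation (3) is not reproduced in this section, so the following sketch rests on an assumption: that the operational estimator keeps whichever of the two estimates (regional or propagated) has the smaller standard deviation, and falls back to the regional one outside the distance limit. All names are hypothetical.

```python
def select_operational(regional: float, sigma_regional: float,
                       propagated: float, sigma_propagated: float,
                       distance: float, d_lim: float):
    """Return (estimate, sigma, label) for the operational prediction.

    Assumed rule (not taken verbatim from the paper): inside the domain
    of validity, prefer the estimate with the smaller standard deviation;
    outside it, always use the regional estimate.
    """
    if distance <= d_lim and sigma_propagated < sigma_regional:
        return propagated, sigma_propagated, "propagated"
    return regional, sigma_regional, "regional"


# Made-up numbers: a close donor with a precise propagated estimate,
# and a distant donor for which the regional estimate is retained.
close_case = select_operational(100.0, 30.0, 120.0, 15.0, distance=0.5, d_lim=2.3)
far_case = select_operational(100.0, 30.0, 120.0, 15.0, distance=3.0, d_lim=2.3)
```

Under this rule the regional model is chosen more often as the standard deviation of the propagated estimate grows, which matches the behaviour described for large area ratios.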

These results highlight that the use of a restricted domain of validity improves the effectiveness of the propagation of information and, as a consequence, the whole ASE framework. However, a restricted domain of validity limits the applicability of the ASE method to only the closest target basins.

The optimal threshold distance can thus be seen as the best compromise between two opposite effects: on the one hand, the use of a small threshold distance Dlim leads to better estimation results, but the applicability of the ASE approach turns out to be limited to only a few basins. On the other hand, larger domains of validity increase the errors and decrease the effectiveness of the operational estimator.

The search for an optimal Dlim value has been performed iteratively for this case study, considering the calibration set of a basin as representative of the real application context. For instance, very good performances can be achieved with Dlim = 0.81 (equal to an area ratio of about 2.25), but in this case only 11.3% of the considered basins would benefit from the along-stream model. The remaining 88.7% of the basins would not be considered. Given this perspective, we selected the “optimal” distance as a balance between these two effects; this corresponds to extending the area of influence to basins that have an area of between 1/10 and 10 times the area of the donor basin, i.e. to pairs of basins whose areas differ by, at most, one order of magnitude.
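The trade-off between accuracy and coverage can be explored with a sweep over candidate thresholds. The sketch below is only an outline of such a trial-and-error loop: the function name, the tuple layout and the three basin pairs are hypothetical, and the error is the RMSE of whatever normalized errors are supplied.

```python
import math


def sweep_thresholds(pairs, n_basins, area_ratio_limits):
    """For each area ratio limit, return (limit, coverage, rmse).

    pairs: iterable of (target_id, distance, normalized_error) tuples,
           with distance defined as in equation (10).
    coverage: fraction of the n_basins that have a donor within the limit.
    """
    rows = []
    for ratio in area_ratio_limits:
        d_lim = math.log(ratio)
        inside = [(t, e) for t, d, e in pairs if d <= d_lim]
        if not inside:
            rows.append((ratio, 0.0, None))
            continue
        coverage = len({t for t, _ in inside}) / n_basins
        rmse = math.sqrt(sum(e * e for _, e in inside) / len(inside))
        rows.append((ratio, coverage, rmse))
    return rows


# Made-up pairs: errors grow with the area-based distance.
pairs = [("A", math.log(1.5), 1.0), ("B", math.log(4.0), 2.0), ("C", math.log(30.0), 4.0)]
rows = sweep_thresholds(pairs, n_basins=4, area_ratio_limits=[2, 10, 150])
```

With these invented numbers the coverage rises from 25% to 75% as the limit is relaxed while the RMSE grows, which is the kind of compromise the choice of Dlim has to resolve.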

The results reported in Fig. 3 show the global performances of the method. A more detailed investigation is represented in Fig. 4(a), in which each normalized error of the operational model is compared to that of the regional (reference) model. The


[Figure 4: two scatter plots, panels (a) and (b), of Operational Error (y-axis, 0 to 15) against Regional Error (x-axis, 0 to 15). Legend: D ≤ Dlim; D > Dlim.]

Fig. 4 Absolute errors (equation (11)) of regional estimates of the index flood compared with the errors produced by the ASE model. Open circles represent the errors obtained for basin pairs closer than the threshold distance; filled circles are relative to the more distant catchments. All the points below the solid line represent basins where the index flood estimates are improved by the use of the along-stream information transfer procedure: (a) relative to a threshold distance Dlim = log(10); (b) no limitation on distances.

points on the graph can be divided into four different classes:

• filled circles on the bisector represent basins outside the validity domain, where only the regional model is applicable;

• empty circles on the bisector are basins within Dlim for which the regional model has been selected as the operational model;

• empty circles below the bisector are basins within Dlim for which the propagated estimate has been selected, and the propagated estimates provide an improvement over the regional ones;

• empty circles above the bisector are basins within Dlim for which the propagated estimate has been selected, but the propagated errors are greater than the regional ones.

Most of the points off the diagonal are in the lower part of the plot, which demonstrates that, when the propagated estimate is suitable for use, it provides better performances than the corresponding regional estimation. Only for a few basins is there a moderate increase in the operational error. These results are positive when compared to those of Fig. 4(b), in which no threshold distance was applied. Although the comprehensive operational error (RMSE_ASE) still suggests use of the ASE model, the dispersion of the points highlights the fact that the variance of the operational predictions is no longer appropriate to describe the reliability of the ASE model. This again confirms that, for basins beyond the threshold distance, the regional model is the most appropriate.
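The four classes of points in Fig. 4 can be expressed as a small classification rule. The sketch is illustrative only: the function name, the labels and the tolerance used to decide whether a point lies on the bisector are assumptions, and the errors are taken as absolute values, as in the figure.

```python
def classify_point(regional_error: float, operational_error: float,
                   within_limit: bool, tol: float = 1e-9) -> str:
    """Assign a (regional error, operational error) point to one of four classes."""
    if not within_limit:
        # Filled circles: only the regional model is applicable.
        return "outside domain, regional only"
    if abs(operational_error - regional_error) <= tol:
        # Open circles on the bisector: regional model selected.
        return "within domain, regional selected"
    if operational_error < regional_error:
        # Open circles below the bisector: propagation improves the estimate.
        return "within domain, propagated improves"
    # Open circles above the bisector: propagation worsens the estimate.
    return "within domain, propagated worsens"


labels = [
    classify_point(3.0, 3.0, within_limit=False),
    classify_point(3.0, 3.0, within_limit=True),
    classify_point(3.0, 1.2, within_limit=True),
    classify_point(3.0, 4.1, within_limit=True),
]
```

Note that a point inside the domain whose propagated error happens to equal the regional error exactly would be labelled as "regional selected" by this rule; the plot alone cannot distinguish the two situations either.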

The same procedure was applied for the LCV and LCA estimations, but no conclusive results were reached. Figure 5 clearly shows, for LCV, that the ASE model does not produce reliable results and, when applicable, produces a deterioration of the regional estimates. Similar results apply to LCA, for which the method seems not applicable at all. These negative

[Figure 5: two scatter plots of Operation Error (y-axis) against Regional Error (x-axis), with panel titles L-CV and L-CA. Legend: D ≤ Dlim; D > Dlim.]

Fig. 5 ASE vs regional errors for (a) LCV and (b) LCA with optimal threshold distance Dlim = log(5).


results can be ascribed in part to the high uncertainty of the sample higher-order L-moments estimated on short data records. This uncertainty prevents correct estimation of the parameter $\alpha$ and of the bounds of the validity domain, thus deteriorating the quality of the results obtained with the ASE approach: the same effect influences the size of the domain of validity of the ASE approach, with Dlim decreasing with increasing order of the L-moment. In our case study, the domain of validity becomes so small that there are not enough pairs of basins included within the threshold distance that can be used for a robust calibration. Lack of data not only affects the sample uncertainty, but also makes it difficult to investigate the complex mechanisms of propagation of the second- and third-order L-moments. The available database cannot support a detailed analysis of such mechanisms, making the uncertainty related to the “model error” impossible to estimate, and hampering the applicability of the procedure.

4 MODEL COMPARISON

Kjeldsen and Jones (2007) developed a similar approach (hereafter the KJ approach) to locally improve the predictions coming from a regional model. This approach was later re-examined (Kjeldsen and Jones 2009) and applied in Kjeldsen and Jones (2010). Although the equation we use to transfer the information is basically the same as that of the KJ model, the two implementations are based on rather different ideas. In particular, Kjeldsen and Jones (2007) propose the model:

$$P_t = R_t \left(\frac{S_d}{R_d}\right)^{\alpha_{KJ}} \qquad (12)$$

where $\alpha_{KJ}$ is an exponent dependent on the geographical distance of the centroids of the donor and target basins. The donor basin is always selected as the geographically closest gauged basin. To evaluate the suitability of these approaches for application in the present case study, a comparison was carried out.

To evaluate $\alpha_{KJ}$, the KJ model requires the estimation of the cross-correlation coefficient of the model errors. As a first approximation, and for practical purposes (see Kjeldsen and Jones 2007), it can be assumed that $\alpha_{KJ}$ depends on the distance from the donor site following the cross-correlation of annual maxima rt,d. This approach applies to all the target sites, even if for large donor–target distances the correction is negligible, because $\alpha_{KJ}$ tends to zero.

A special case of equation (12), reported by Kjeldsen and Jones (2007), considers $\alpha_{KJ}$ = 1, provided the correction applies only for basins within a limit-distance (i.e. only for highly correlated basin pairs). Beyond the limit-distance, defined on the basis of the correlation function, only the regional model is used.
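The two versions of the KJ transfer can be sketched as follows. The sketch assumes that the exponent is approximated by an exponentially decaying correlation of the form exp(-k DC), with the decay rate k = 0.016 per km that the text attributes to Kjeldsen and Jones (2007) as the default; function and argument names are hypothetical and the numbers are made up.

```python
import math


def kj_general(regional_target: float, sample_donor: float,
               regional_donor: float, centroid_distance_km: float,
               decay_per_km: float = 0.016) -> float:
    """Equation (12) with the exponent taken as exp(-decay * distance)."""
    alpha_kj = math.exp(-decay_per_km * centroid_distance_km)
    return regional_target * (sample_donor / regional_donor) ** alpha_kj


def kj_simplified(regional_target: float, sample_donor: float,
                  regional_donor: float, centroid_distance_km: float,
                  limit_km: float) -> float:
    """Special case: exponent equal to 1 within the limit-distance, else regional."""
    if centroid_distance_km <= limit_km:
        return regional_target * sample_donor / regional_donor
    return regional_target


# Made-up values: the donor sample is 20% above its regional estimate.
near = kj_general(100.0, 60.0, 50.0, centroid_distance_km=0.0)
far = kj_general(100.0, 60.0, 50.0, centroid_distance_km=1000.0)
inside = kj_simplified(100.0, 60.0, 50.0, centroid_distance_km=5.0, limit_km=8.0)
outside = kj_simplified(100.0, 60.0, 50.0, centroid_distance_km=20.0, limit_km=8.0)
```

At zero distance the full donor correction is applied, while at large distances the exponent tends to zero and the estimate returns to the regional value, as stated in the text.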

In our case study, the regional model does not provide the cross-correlation function of the model errors, and the cross-correlation of annual maxima cannot be safely estimated over the considered area because the samples used are sparse in space and not completely overlapping in time. Moreover, the cross-correlation function of annual maxima is expected to decay very quickly, due to the high topographic and climatic heterogeneity in the case study area. To overcome this problem, an iterative procedure is adopted to calibrate the KJ model: the limit-distance is assumed to be known, with varying values from 1 to 200 km; the model is applied correcting only the within-limit pairs of target–donor basins; finally a comprehensive error index is computed. In this way, the most appropriate limit-distance is found to be 8 km, which is the distance that allows one to improve most of the estimates. This limit allows us also to roughly reconstruct the correlation function in the form of a negative exponential. Kjeldsen and Jones (2007) found the correlation function rt,d = exp(−0.016DC) (DC being the distance between basin centroids) valid for their case study, with the maximum distance for which the model applies corresponding to rt,d = 0.5. Assuming this value is valid also in our case study, and considering the limit-distance of 8 km, the correlation function is re-evaluated as rt,d = exp(−0.087DC), showing a faster decay than the Kjeldsen and Jones case study (which may be sensible, due to the larger meteorological variability in the study area compared to the UK). This result is necessary for applying the general version of the KJ model (equation (12)).
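The decay rate quoted above can be checked with a short worked derivation. It assumes only the negative exponential form and the threshold correlation of 0.5 stated in the text; the symbol $k$ for the decay rate is introduced here for convenience.

```latex
% Negative exponential correlation with decay rate k (per km)
r_{t,d} = \exp(-k\,D_C)

% Imposing r_{t,d} = 0.5 at the limit-distance D_C = 8 km:
0.5 = \exp(-8k)
\quad\Longrightarrow\quad
k = \frac{\ln 2}{8} \approx 0.087\ \mathrm{km^{-1}}

% For comparison, the rate k = 0.016 km^{-1} reaches r_{t,d} = 0.5 at
D_C = \frac{\ln 2}{0.016} \approx 43\ \mathrm{km}
```

Under these assumptions, the correlation in the present case study halves over a distance roughly five times shorter than with the 0.016 rate.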

At this point, some clarifications about the ASE approach are necessary before performing the comparison. In fact, the ASE and the KJ models are based on rather different hypotheses, and slight modifications of the ASE approach are necessary:

• while the KJ model is designed to work with log-transformed variables, our method can be directly applied to the native regionalized variable (e.g. the index flood in the application of Section 3). To make the comparison more direct, here the reference variable for the ASE model is set to log(Qind).


• in our approach, the selection of donor basin is based on the hierarchical organization of nested basins; in general, this introduces more than one ASE estimator for each target basin, as well as cases in which no donor basins are available because the target site is not connected to any gauged one. To make the two approaches comparable, when more than one estimator is present for the same target site, we consider only one value, taking the average of the available ASE values. If no ASE estimates are available, the regional value is adopted.
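The reduction to a single value per target basin described in the second point can be sketched as follows; the function name and the numbers are hypothetical.

```python
from statistics import mean


def single_value_per_target(ase_estimates: list[float],
                            regional_estimate: float) -> float:
    """Average the available ASE estimates; use the regional value if none exist."""
    if ase_estimates:
        return mean(ase_estimates)
    return regional_estimate


# Made-up values of log(Qind) for one target basin.
two_donors = single_value_per_target([2.0, 4.0], regional_estimate=3.5)
no_donor = single_value_per_target([], regional_estimate=3.5)
```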

The calibration procedure for the ASE model confirmed that a threshold distance of log(10) can be considered appropriate, even when the model is applied to the logarithmic index flood. The results reported in Fig. 6(a) appear to be quite similar to those obtained for the untransformed Qind. In this plot, each point represents a single basin, different from Fig. 4 which (more generally) reports a circle for each connection {t,d}. The results obtained calibrating the KJ approach are reported in Fig. 6(b) (simplified version of the model) and in Fig. 6(c) (generalized version). The generalized version appears slightly more accurate than the simplified one.

The legends in Fig. 6 report some useful statistics, in particular the percentage of processed basins, i.e. how many regional predictions (computed during the calibration phase) are suitable to be improved. For the ASE model, it includes all the basins with D ≤ Dlim. For the KJ simplified version, it includes all the points having at least one neighbour within 8 km (i.e. points off the bisector of panel (b)), while it has a trivial meaning for the KJ generalized version, since all the points are actually processed because a threshold distance does not exist. The higher percentage of basins processed by the ASE approach compared to the KJ model reflects the fact that, in this case study, the ASE method has a wider range of application. A comparison of these results also shows that our model has, on average, better performances than the

[Figure 6: three scatter plots, panels (a), (b) and (c), of Operational error (y-axis, 0 to 15) against Regional error (x-axis, 0 to 15). Legend: D ≤ Dlim; D > Dlim. Statistics printed on the panels, in the order extracted:]

% processed basins = 83; MAE(R) = 2.92; MAE(O) = 2.42; RMSE(R) = 4.13; RMSE(O) = 3.35
% processed basins = 41.4; MAE(R) = 2.92; MAE(O) = 2.71; RMSE(R) = 4.13; RMSE(O) = 3.96
% processed basins = 100; MAE(R) = 2.92; MAE(O) = 2.69; RMSE(R) = 4.13; RMSE(O) = 3.94

Fig. 6 Operational vs regional errors for the estimation of the log-transformed index flood compared: (a) ASE model with Dlim = log(10); (b) KJ model with $\alpha_{KJ}$ = 1; and (c) KJ model with $\alpha_{KJ}$ = exp(−0.087DC).


Fig. 7 Maps of applicability of ASE and KJ (simplified type) models for the case study area. The highlighted part of the drainage network represents the points where the models are applicable making use of the donor stations represented by the code numbers.

KJ approach (both versions), since the overall errors (see MAE and RMSE reported over the plots) are smaller. All the models are able to reduce the overall error with respect to the pure regional approach, as is apparent from the MAE and the RMSE (labelled R) in Fig. 6.
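The size of these improvements can be quantified from the statistics printed in Fig. 6. The sketch below assumes that the three sets of statistics correspond, in order, to the ASE model, the simplified KJ model and the generalized KJ model (an assignment consistent with the percentages of processed basins discussed in the text); the percentage reductions are derived here and are not reported in the paper.

```python
# MAE and RMSE of the regional model (labelled R in Fig. 6).
REGIONAL = {"mae": 2.92, "rmse": 4.13}

# Operational errors (labelled O); the model-to-panel assignment is assumed.
OPERATIONAL = {
    "ASE": {"mae": 2.42, "rmse": 3.35},
    "KJ simplified": {"mae": 2.71, "rmse": 3.96},
    "KJ generalized": {"mae": 2.69, "rmse": 3.94},
}


def percent_reduction(regional: float, operational: float) -> float:
    """Relative error reduction with respect to the regional model, in percent."""
    return 100.0 * (regional - operational) / regional


reductions = {
    model: {name: round(percent_reduction(REGIONAL[name], value), 1)
            for name, value in stats.items()}
    for model, stats in OPERATIONAL.items()
}
```

With these figures the RMSE reduction is about 19% for the ASE model against roughly 4 to 5% for the two KJ versions.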

The very different nature and applicability of the ASE and KJ models can be examined considering the river reaches wherein the models can actually be applied. For the river network in the study region, the results are mapped in Fig. 7, where the domain of applicability is represented as a thicker line. This representation highlights the different results obtained for the propagation of information: for a highly heterogeneous area like the case study, the along-stream information propagation appears more suitable because it has a larger area of applicability.

5 DISCUSSION AND CONCLUSIONS

The Along-Stream Estimation (ASE) approach proposed herein hinges on the river structure to perform an information transfer towards ungauged basins. This complements standard regional procedures because it is based on local relationships, as the estimation is performed considering only nested catchments.

Along-stream and regional estimates can therefore be combined to develop a general framework for improved evaluation of a given hydrological variable, as well as its variance at ungauged locations.

In general, when two or more models are available for the same purpose, one can consider one of the following scenarios:

• Model competition: the results of different models (in our work “propagated” and regional predictions) can be evaluated separately and then compared, in order to identify which model is more efficient in the reconstruction of the variable of interest. In the case study presented here, propagated and regional predictions show different reliability, depending on the location of the target site and, in particular, on its distance from the donor site. From this perspective, the aim of the propagation of information is to identify an alternative procedure that is more appropriate for the analysis at some ungauged basins.

• Model cooperation: the output of one model is used to initialize the other model. In this work, for instance, the regional estimate is used as an additional parameter in the propagation function and thus contributes to the final along-stream


prediction. This approach can be interpreted as follows: the propagation of information can be used to locally improve the regional model estimate, accounting for specific information at a donor site.

• Model combination: given different estimates of the same variable, one combines them through suitable relationships aiming to minimize the variance of the resulting estimator.
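As an illustration of the third scenario only, one standard way to combine estimates is inverse-variance weighting, which gives the minimum-variance linear combination when the estimates are unbiased and independent. This is a generic textbook technique shown under those assumptions; it is not part of the ASE procedure, and the function name and numbers are hypothetical.

```python
def combine_inverse_variance(estimates: list[float], variances: list[float]):
    """Return (combined estimate, its variance) using inverse-variance weights.

    Assumes unbiased, mutually independent estimates with known variances.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    combined = sum(w * x for w, x in zip(weights, estimates)) / total
    return combined, 1.0 / total


# Made-up example: the more precise estimate (variance 1) dominates.
combined, variance = combine_inverse_variance([10.0, 20.0], [1.0, 4.0])
```

The combined variance is never larger than the smallest input variance, which is the sense in which such a combination "minimizes the variance of the resulting estimator".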

The application of the ASE procedure is based on the ideas of both cooperation and competition with the regional model. In particular, the regional model tries to catch the “global” variability of hydrological variables, without considering the “local” structure of the river that can be accounted for by the propagation method. An important feature of the method, compared to other approaches for local correction, is that, even if the target site is close enough to the donor, the propagation of information is done only if the donor is suitable. This approach demonstrated its feasibility in a case study characterized by many data-scarce basins, allowing one to exploit all the available information concerning the index flood estimation.

To conclude, the along-stream approach is suitable for application wherever a regional model is available and the uncertainty of the regional predictions is provided. It exploits the local (sample) information to improve the regional estimates, and is particularly suited to areas with short data records, because it does not require the data-demanding estimation of cross-correlation or variogram functions to represent the spatial variation of discharge.

Acknowledgements The study was supported by the Italian Ministry of Education through grant no. 2008KXN4K8. The authors acknowledge Alberto Viglione and an anonymous reviewer for their insightful comments.

REFERENCES

Claps, P. and Laio, F., 2008. Aggiornamento delle procedure di valutazione delle piene in Piemonte, con particolare riferimento ai bacini sottesi da invasi artificiali, vol. I (Updating of the procedures for flood evaluation in Piemonte, vol. I. Technical report, in Italian). Italy: Politecnico di Torino. http://www.idrologia.polito.it/piene/PienePiemonte08_Volume1.pdf.

Gottschalk, L., 1993a. Correlation and covariance of runoff. Stochastic Hydrology and Hydraulics, 7 (2), 85–101.

Gottschalk, L., 1993b. Interpolation of runoff applying objective methods. Stochastic Hydrology and Hydraulics, 7 (4), 269–281.

Gottschalk, L., et al., 2006. Mapping mean and variance of runoff in a river basin. Hydrology and Earth System Sciences, 10 (4), 469–484.

Gottschalk, L., Leblois, E., and Skoien, J.O., 2011. Correlation and covariance of runoff revisited. Journal of Hydrology, 398, 76–90.

Grimaldi, S., et al., 2011. Statistical hydrology. In: P. Wilderer, ed. The science of hydrology, volume 2 of Treatise on Water Science, 479–517. Elsevier, ISBN: 978-0-444-53199-5.

Hosking, J.R.M. and Wallis, J.R., 1997. Regional frequency analysis: An approach based on L-moments. Cambridge: Cambridge University Press.

Institute of Hydrology, 1999. Flood estimation handbook. Wallingford, UK: Institute of Hydrology.

Kjeldsen, T.R. and Jones, D.A., 2007. Estimation of an index flood using data transfer in the UK. Hydrological Sciences Journal, 52 (1), 86–98.

Kjeldsen, T.R. and Jones, D.A., 2009. An exploratory analysis of error components in hydrological regression modeling. Water Resources Research, 45, W02407.

Kjeldsen, T.R. and Jones, D.A., 2010. Predicting the index flood in ungauged UK catchments: on the link between data-transfer and spatial model error structure. Journal of Hydrology, 387 (1–2), 1–9.

Laio, F., et al., 2011. Spatially smooth regional estimation of the flood frequency curve (with uncertainty). Journal of Hydrology, 408, 67–77.

Skoien, J.O., Merz, R., and Bloeschl, G., 2006. Top-kriging - geostatistics on stream networks. Hydrology and Earth System Sciences, 10 (2), 277–287.

Stedinger, J.R. and Tasker, G.D., 1985. Regional hydrologic analysis. 1. Ordinary, weighted, and generalized least-squares compared. Water Resources Research, 21 (9), 1421–1432.

US National Research Council, 1988. Estimating probabilities of extreme floods. Washington DC, USA: National Academy Press.
