Homogeneous region delineation based on annual flood ...hydrologie.org/hsj/380/hysj_38_02_0103.pdfto...


Hydrological Sciences -Journal- des Sciences Hydrologiques, 38, 2, 4/1993, p. 103

Homogeneous region delineation based on annual flood generation mechanisms

DENIS GINGRAS & KAZ ADAMOWSKI Civil Engineering Department, University of Ottawa, Ottawa, Ontario, Canada K1N 6N5

Abstract Flood distributions can have unimodal or multimodal densities due to different flood generation mechanisms such as snowmelt and rainfall in the annual flood series. When applying nonparametric frequency analysis to annual flood data from the province of New Brunswick in Canada, unimodal, bimodal and heavy-tailed distribution shapes were found. By grouping basins with similarly shaped densities on a geographical basis, homogeneous regions were delineated. Regional equations derived for a homogeneous region gave lower integral square errors than those of province-wide equations.

Délimitation de régions homogènes basée sur le mécanisme de la genèse de la crue annuelle Résumé Les distributions des crues peuvent avoir des densités unimodales ou multimodales par suite de différents mécanismes tels que la fonte des neiges et la pluie dans la série annuelle des crues. En appliquant l'analyse de fréquence non-paramétrique aux crues de la province du Nouveau-Brunswick au Canada, des distributions unimodales, bimodales et avec grande queue ont été découvertes. En groupant des bassins avec des densités de même forme sur une base géographique, des régions homogènes ont été délimitées. Des équations régionales pour une région homogène ont donné des erreurs carrées intégrées plus petites que celles des équations pour la province entière.

INTRODUCTION

The estimation of design floods at ungauged sites is a common task for water resources engineers. The information from gauged sites can be transferred to ungauged sites through regionalization techniques and thus improve at-site estimation for ungauged locations. These regionalization techniques involve three basic steps: homogeneous region delineation, at-site flood frequency analysis and regional relationship development (Kite, 1977).

The identification of homogeneous regions is one of the more serious obstacles to obtaining a satisfactory regional solution. It is known that the division of larger geographical areas into smaller homogeneous regions leads to regional flood regression equations with smaller standard errors (Moin & Shaw, 1986). Of the many approaches which have been suggested for delineating homogeneous regions, some are based on residual analysis, others on standardized flow statistics (Burn, 1990), some on defining an identical distribution (Hosking & Wallis, 1991), and still others on multivariate analysis techniques (Bhaskar & O'Connor, 1989; Cavadias, 1990).

Open for discussion until 1 October 1993

Those methods seeking an identical parametric distribution or regional values of parameters must contend with the limitation and subjectivity of parametric flood frequency analysis. An alternative to parametric flood frequency analysis which has been successfully employed is nonparametric frequency analysis (Adamowski, 1989). Nonparametric methods do not depend on an assumption regarding the type of distribution or on the estimation of many parameters. In fact, in nonparametric frequency analysis, a single parameter, the smoothing factor, is calculated based on available data. This smoothing factor has no regional interpretation.

It has been shown (Waylen & Woo, 1982; 1984) that many rivers in Canada generate floods from a compounded probability density function corresponding to different flood producing mechanisms. It has also been found (Alila, 1988) that some annual flood distributions in Canada are bimodal with each mode corresponding to a different flood generation process. As it is known that a number of mechanisms can lead to the annual maximum floods in North America (Hirschboeck, 1987), it is proposed to employ the probability density function shape, which corresponds to flood generating mechanisms, to delineate homogeneous regions.

The province of New Brunswick, where floods occur from a variety of mechanisms, is used herein for numerical analysis. The probability density function shapes needed to define homogeneous regions can be determined from nonparametric frequency analysis (Adamowski, 1985). Simulation studies show that the sampling variability of a unimodal distribution will not produce a bimodal density if analysed by nonparametric frequency analysis. Comparison of the various density shapes on a regional basis yields a homogeneous bimodal region for New Brunswick. Regional regression analysis is performed to demonstrate that lower integral square errors result for a regional equation based on that homogeneous region.

METHODOLOGY

Homogeneous regions

Homogeneous regions are usually defined as regions having similar hydrological, climatic and physiographic characteristics, and/or flood-related variables such as coefficient of skew. The subdivision of a large area into smaller homogeneous regions for the purpose of flood frequency analysis as well as the determination of regionalized skew coefficient (US Water Resources Council, 1982) is often quite controversial. For quantile estimation, regional equations are developed or regional parameters evaluated which then apply to this homogeneous region.

There seems to be no uniquely objective approach to the delineation of homogeneous regions. It is generally agreed, however, that grouping basins within a homogeneous region will yield regional relationships with lower standard errors than those for the entire study area (Kite, 1977). In some cases, the difference can be somewhat significant as for the two-year flood in New Brunswick (Inland Waters, 1987), where the province-wide equation had a standard error of 18% but the equation for homogeneous regions had standard errors of 14, 12, 11 and 15%. In Ontario (Moin & Shaw, 1986), the province-wide two-year flood equation had a standard error of 35% while the regional equations had standard errors of 38, 20 and 32%.

A common approach in homogeneous region delineation is to examine the residual pattern from a linear regression of a given design flood for the entire study area. Based on the geographical proximity of positive and negative residuals, homogeneous regions are then proposed. For a New Brunswick regional flood study (Inland Waters, 1987), it was found that the residual pattern corresponded in some instances to the province's climatic zones (Inland Waters, 1988a).

Homogeneous regions delineated on a geographical basis, however, are often not hydrologically similar (Linsley, 1982; Cunnane, 1987). Consequently, some researchers define homogeneity in terms of flood-related variables such as normalized annual flood and coefficient of variation, which leads to non-geographical regions (Wiltshire, 1985; Burn, 1989). The use of multivariate analysis techniques resulted in non-geographical regions for the state of Kentucky (Bhaskar & O'Connor, 1989) and the island of Newfoundland (Cavadias, 1990) that were very different from those obtained using residual analysis.

Another approach to homogeneous region delineation involves assuming that all single station flood frequencies within the region follow the same probability distribution or that they have some constant distribution parameters (Hosking & Wallis, 1991). Selecting a single distribution for a region or defining a set of constant distribution parameters can be subjective due to the limitations of parametric frequency analysis. Additionally, there may be more than one suitable parametric distribution within a region and the assumption of constant parameters may not be valid. Nonparametric methods do not require assumptions as to the distribution shape or type and depend on a single parameter having no constant regional value. Furthermore, the density shape obtained from nonparametric frequency analysis is a reflection of the generating mechanisms or the climatic causes behind the floods.

Essentially, all of the approaches discussed above for homogeneous region delineation are based either on some sort of geographical consideration (on the basis of general location, physiography, weather regimes) or flood data characteristics (probability distribution, regional statistical flood parameters). It is possible to combine geographical considerations and flood data characteristics for homogeneous region delineation through the use of the probability density function shape, which is dependent on the processes that generate floods in the watershed.


A basin with a single mechanism for flood generation will have a unimodal density while a basin with two or more mechanisms, for example autumn floods from rainfall and spring snowmelt floods, may have a bimodal probability density function. This shape, determined from the nonparametric frequency density, is then compared on a geographical basis and used for the delineation of homogeneous regions. This technique does not exclude the possibility of non-geographical regions. Such an approach, which groups basins following similar probability distributions, is essentially a reflection of climatic considerations.

Nonparametric frequency analysis

The estimation of design floods can be accomplished by the more conventional parametric methods or by nonparametric methods (Adamowski, 1985). The parametric approach involves fitting an assumed theoretical probability distribution to a series of observed maximum annual flows. Numerous probability distributions have been proposed, the most commonly used in Canada being the lognormal, the three-parameter lognormal, the generalized extreme value, and the log-Pearson III (Pilon et al., 1985).

Parametric fitting of distributions, however, presents numerous difficulties. Design floods are typically required for large return periods at the right-hand tail of the distribution where little information is available. In all parametric methods, the entire data set determines the shape of the tail and it has been shown that a small difference in the lower flood values can severely affect the right-hand tail (Klemeš, 1987). Other problems are the impact of outliers and the assumption of a unimodal distribution.

Floods in Canada and many other parts of the world are known to be the result of a variety of generating mechanisms such as snowmelt, frontal precipitation, local thunderstorms and hurricanes, and thus are not drawn from a single statistical population. Also, watersheds are not time invariant, as they produce different flood outputs for the same storm input at different times of the year. Obviously, in a cold climate, floods occurring in the spring on frozen ground with snowmelt are different from summer floods.

Throughout the North American continent, the large return period floods are very often caused by a process different from that of the smaller return period floods (Stoddart & Watt, 1970; Diehl & Potter, 1987; Hirschboeck, 1987; Waylen & Woo, 1984) and yet it is these smaller floods, along with the large ones, that are used together to assess the larger design floods in parametric flood frequency analysis. The many drawbacks of parametric frequency analysis make distribution selection a difficult task, as exemplified by the complexity of an expert system for flood frequency analysis (Chow & Watt, 1990).

In many instances, the probability density function of annual maximum floods is known to be bimodal (Alila, 1988; Labatiuk & Adamowski, 1987). Some researchers have suggested the use of a mixed distribution, or the use of a heavy-tailed distribution, to define properly the distribution of annual floods where they are multimodal (Waylen & Woo, 1982; Rossi et al., 1984; Singh, 1987; Ahmad et al., 1988). The problem with a mixed distribution approach again concerns distribution selection along with the increased number of parameters to be estimated.

An alternative approach is based on nonparametric frequency analysis, a technique which does not require an assumption (which may turn out to be totally erroneous) with regard to distribution shape or type. In the nonparametric kernel method, the probability density function f(x) can be defined as (Adamowski, 1985):

$$f(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left[\frac{x - x_i}{h}\right] \qquad (1)$$

where the x_i are the observations, K( ) is the selected kernel function (which is itself a probability density function such as a normal or rectangular distribution), n is the number of data points and h is a smoothing factor to be estimated from the data.

The process of constructing a density function is similar to that of building a histogram with class interval h. In a histogram, one adds a rectangular block of height 1/nh at the centre of the class interval to which a given data point belongs. The final histogram frequency distribution is the sum total of all blocks, each one of area 1/n. In nonparametric frequency analysis, one adds a kernel function of area 1/n centred at the data point location itself. This kernel may have a variety of shapes, such as a rectangle or a normal distribution. The final nonparametric density is the sum total of all kernels.
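The construction described above can be sketched in a few lines of Python. This is a minimal illustration of equation (1) with a normal kernel, not the authors' program: the flow values and the smoothing factor `h` are hypothetical, and `h` is simply assumed rather than calibrated from the data.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_density(x, data, h):
    """Equation (1): f(x) = (1 / (n h)) * sum over i of K((x - x_i) / h)."""
    n = len(data)
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (n * h)

# Hypothetical annual maximum flows (m3/s), for illustration only.
flows = [42.0, 55.0, 61.0, 48.0, 120.0, 58.0, 131.0, 50.0, 66.0, 45.0]
h = 10.0  # assumed smoothing factor, not calibrated from the data

# Each kernel contributes an area of 1/n, so the sum should integrate
# to about 1. Check with the trapezoidal rule on a wide grid.
lo, hi, steps = min(flows) - 8.0 * h, max(flows) + 8.0 * h, 2000
dx = (hi - lo) / steps
dens = [kernel_density(lo + k * dx, flows, h) for k in range(steps + 1)]
area = dx * (sum(dens) - 0.5 * (dens[0] + dens[-1]))
print("area under the estimated density:", round(area, 4))
```

Unlike a histogram, the estimate has no class boundaries: moving a single observation shifts only its own kernel.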

The choice of the kernel function is not crucial to the performance of the method as most kernels lead to almost identical estimates. In this study, a Gaussian kernel has been used and is given by (Adamowski, 1989):

$$K\left[\frac{x - x_i}{h}\right] = \frac{1}{(2\pi)^{1/2}} \exp\left[-\frac{(x - x_i)^2}{2h^2}\right] \qquad (2)$$

where h is the smoothing factor to be calibrated from the data.

The selection of the smoothing factor h in nonparametric frequency analysis is crucial as it defines the smoothness of the density function (Labatiuk & Adamowski, 1987). One method for h selection has been the minimizing, by a cross-validation technique, of the integrated mean square error (IMSE) which is expressed by (Silverman, 1986):

$$\mathrm{IMSE} = \int E\left[f_n(x) - f(x)\right]^2 \, dx \qquad (3)$$

where f_n(x) is an estimate of the unknown density f(x). Scott & Terrell (1987) have shown that the cross-validation procedure leads to consistent and asymptotically optimal nonparametric density estimates. Studies by Adamowski (1989), Bardsley (1989) and Guo (1991) have shown that the use of nonparametric methods in flood analysis gives results that are accurate, uniform, and particularly suitable for multimodal data.
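As an illustration of choosing h by cross-validation, the sketch below uses the standard least-squares cross-validation score for a Gaussian kernel. The text does not give the exact procedure used in the study, so the score, the candidate grid and the flow values are all assumptions made for this example.

```python
import math

def normal_pdf(d, s):
    """Density at d of a zero-mean normal with standard deviation s."""
    return math.exp(-0.5 * (d / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def lscv_score(data, h):
    """Least-squares cross-validation score for a Gaussian kernel.

    score(h) = integral of fhat^2  -  (2 / n) * sum_i fhat_{-i}(x_i),
    which estimates the integrated square error up to a term that
    does not depend on h.
    """
    n = len(data)
    # Integral of fhat^2: two Gaussian kernels of width h convolve
    # to a Gaussian of width h * sqrt(2).
    int_f2 = sum(normal_pdf(a - b, h * math.sqrt(2.0))
                 for a in data for b in data) / (n * n)
    # Sum over i of the leave-one-out density evaluated at x_i.
    loo = sum(normal_pdf(a - b, h)
              for i, a in enumerate(data)
              for j, b in enumerate(data) if i != j) / (n - 1)
    return int_f2 - 2.0 * loo / n

def select_h(data, candidates):
    """Return the candidate smoothing factor with the lowest score."""
    return min(candidates, key=lambda h: lscv_score(data, h))

# Hypothetical annual maximum flows (m3/s), for illustration only.
flows = [42.0, 55.0, 61.0, 48.0, 120.0, 58.0, 131.0, 50.0, 66.0, 45.0]
candidates = [2.0 + 0.5 * k for k in range(80)]  # assumed search grid
h_best = select_h(flows, candidates)
print("selected smoothing factor:", h_best)
```

A grid search is used only for transparency. Tied observations make this score degenerate as h approaches zero, so real records with repeated values would need extra care.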


NUMERICAL ANALYSIS

Data selection

The province of New Brunswick in Atlantic Canada, which is subject to floods from various mechanisms, was selected for numerical analysis. A total of 53 natural flow hydrometric stations with at least ten years of record exist for New Brunswick and the surrounding areas. The station locations are shown on Fig. 1.

Fig. 1 New Brunswick hydrometric stations. Map legend: unimodal, bimodal, heavy-tailed.

The climate of New Brunswick is very variable throughout the province (Inland Waters, 1988a). As shown by Fig. 2, the province is divided into four climatic regions: the Bay of Fundy, eastern New Brunswick, the New Brunswick Lowlands and the New Brunswick Highlands, corresponding to numbers 1 to 4 on Fig. 2. The Bay of Fundy region (Region 1) has high precipitation, which is in part due to the region's proximity to the major storm tracks on the east coast of North America, and low snow accumulation. Dying tropical storms will on occasion affect this area. Eastern New Brunswick (Region 2) and the New Brunswick Lowlands (Region 3) are areas of low precipitation and moderate snow accumulation. The New Brunswick Highlands (Region 4), which cover roughly half of the province, form an area of moderate to high snow accumulation. The climate is actually very variable throughout that whole area as the wide range of snow parameters on Fig. 2 indicates. The most northern part of this region is subject to large quantities of snowfall and thus to very large snow accumulations in the spring. It should be noted that a small range in mean annual precipitation, and not different flood generating mechanisms, was the basis for delineating the climatic regions.

Fig. 2 Climatic regions of New Brunswick (taken from Inland Waters, 1988a). Each region is annotated with AP, mean annual precipitation (mm); SN, mean annual snowfall (cm); and SP, mean snowpack as of 31 March (cm).

Figure 3 is a map of the average water content of snowpack on 31 March as simplified from a more complex map (Inland Waters, 1987). There is a climatologically different area in the north with average snowpack above 200 mm as well as another climatologically different area in the southwest with average snowpack below 120 mm.

Fig. 3 Simplified snowpack map showing average water content of snowpack on 31 March (mm).

A partitioning of annual maximum floods in New Brunswick by seasons indicated a strong, although imperfect, correlation between seasonal occurrence and flood mechanism (Gingras & Adamowski, 1992). Sample sizes of less than 20 years for many stations, along with poor coordination between meteorological and hydrometric stations, made some of the analysis uncertain. Spring floods are commonly due to rain-on-snow events, usually within a prolonged snowmelt period. However, on occasion, a rainfall-only flood will occur in late spring after all the snow has melted. The limited availability of snow cover data makes the division between spring floods with snowmelt and spring floods without snowmelt uncertain for some stations. Thus the division of the spring flood series into generating mechanisms is imperfect.

The division by mechanisms for floods outside the spring season is more straightforward. Winter floods occur from rain on snow while autumn floods are due to frontal precipitation events as are some of the few summer floods. Some hydrometric stations in the south of the province, however, are also affected on occasion by dying tropical storms during the summer (Inland Waters, 1988a).

In the northern part of the province, the majority of floods occur in the spring, from rain on snow. As one goes southward, a greater proportion of floods come in the summer, autumn or winter. In the southernmost parts of the province, spring floods may account for less than 50% of all annual maximum floods in some instances. All of the above indicates that different mechanisms are at play in different parts of the province; snowmelt is the dominant annual flood factor in some areas of the north while in the remainder of the province a combination of rain-on-snow and rainfall-only floods occur.


Single station frequency analysis

Using nonparametric frequency analysis, the 2, 10, 20, 50 and 100 year floods were estimated, employing historical information at three sites where it was available. Figure 4 is an example of the resulting parametric and nonparametric cumulative frequency functions for station 01AQ001 on the Lepreau River. The parametric distribution is the three-parameter lognormal which was employed in a previous study (Inland Waters, 1987). It is important to note how much the flood estimates for return periods greater than ten years differ between the two methods. The parametric and the nonparametric approaches provide ten-year flood estimates, respectively, of 131 and 142 m³ s⁻¹, while at the one hundred year flood level, the estimates are, respectively, 266 and 335 m³ s⁻¹.

Fig. 4 Cumulative frequency functions for Lepreau River at Lepreau: (a) three-parameter lognormal distribution, parameters estimated by maximum likelihood; and (b) nonparametric method. Horizontal axis of both panels: recurrence interval in years.
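One way to read a T-year flood from a nonparametric estimate is to invert the kernel cumulative distribution function at a non-exceedance probability of 1 - 1/T. The sketch below assumes a Gaussian kernel, a fixed smoothing factor and bisection as the root finder. The flows are hypothetical, and the text does not state how the quantiles were actually extracted or how historical information was incorporated.

```python
import math

def kernel_cdf(x, data, h):
    """CDF of a Gaussian kernel density: the mean of normal CDFs centred on the data."""
    root2 = math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - xi) / (h * root2)))
               for xi in data) / len(data)

def design_flood(T, data, h, tol=1e-6):
    """Flow whose annual non-exceedance probability is 1 - 1/T, found by bisection."""
    target = 1.0 - 1.0 / T
    lo, hi = min(data) - 10.0 * h, max(data) + 10.0 * h  # brackets the root
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, data, h) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical annual maximum flows (m3/s) and an assumed smoothing factor.
flows = [42.0, 55.0, 61.0, 48.0, 120.0, 58.0, 131.0, 50.0, 66.0, 45.0]
h = 10.0

quantiles = {T: design_flood(T, flows, h) for T in (2, 10, 20, 50, 100)}
for T, q in quantiles.items():
    print(T, "year flood:", round(q, 1), "m3/s")
```

Because the kernel CDF is strictly increasing, bisection always converges. Quantiles well beyond the largest observation depend mostly on h and on the kernel tails, which is one reason the choice of smoothing factor matters.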

The probability density functions obtained from nonparametric analysis were plotted and compared visually with one another. The 53 stations were classified into three distinct groupings based on probability density function shape: a unimodal group, a bimodal group and a heavy-tailed group. It should be noted here that, for the bimodal and heavy-tailed groups, parametric flood frequency analysis would have fitted a unimodal, and therefore unsuitable, distribution to the data (Gingras & Adamowski, 1992).

It was discovered that basins with floods of similar probability density function shape were subject to similar flood generating mechanisms. The majority of New Brunswick rivers have annual maximum floods following a bimodal distribution such as that shown in Fig. 5 for the Canaan River (station 01AP002). The maximum density of the second mode is usually less than half that of the first mode. These rivers are subject to spring floods and occasional winter, autumn or summer floods. The winter, autumn and summer floods usually account for at least ten and rarely more than thirty percent of all annual maximum floods.
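The study classified density shapes by visual comparison. As a rough automated stand-in, the sketch below counts local maxima of a Gaussian kernel density on a grid and reports the height of the second mode relative to the first. The flows, the smoothing factor and the `min_ratio` cut-off for ignoring negligible bumps are hypothetical choices, and the code makes no attempt to recognise the heavy-tailed class.

```python
import math

def density_on_grid(data, h, points=400):
    """Gaussian kernel density evaluated on an even grid spanning the data."""
    lo, hi = min(data) - 4.0 * h, max(data) + 4.0 * h
    dx = (hi - lo) / points
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    xs = [lo + k * dx for k in range(points + 1)]
    fs = [sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm
          for x in xs]
    return xs, fs

def find_modes(data, h, min_ratio=0.05):
    """Local maxima of the density as (location, height), highest first.

    Peaks lower than min_ratio times the highest peak are ignored.
    """
    xs, fs = density_on_grid(data, h)
    top = max(fs)
    modes = [(xs[k], fs[k]) for k in range(1, len(fs) - 1)
             if fs[k] > fs[k - 1] and fs[k] >= fs[k + 1]
             and fs[k] >= min_ratio * top]
    return sorted(modes, key=lambda m: m[1], reverse=True)

# Hypothetical record with a main cluster and a smaller cluster of large floods.
flows = [40.0, 45.0, 50.0, 55.0, 60.0, 150.0, 155.0, 160.0]
modes = find_modes(flows, h=10.0)
ratio = modes[1][1] / modes[0][1] if len(modes) > 1 else 0.0
print("number of modes:", len(modes))
print("second mode height relative to the first:", round(ratio, 2))
```

The number of modes found depends strongly on h, so a count like this is only meaningful once the smoothing factor has been calibrated.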

A small number of rivers have a smooth unimodal density, such as that of the Upsalquitch River (station 01BE001) in Fig. 6, where a great majority of floods occur from the spring snowmelt. Summer and autumn floods usually account for less than 10% of annual maximum floods. Another small number have a heavy-tailed density, as for the Lepreau River (station 01AQ001) in Fig. 7. These rivers are usually mainly subject to spring floods but also on many occasions to autumn and winter floods. In the case of the Lepreau River, floods took place during the mid-August to mid-September period, an indication of tropical storms. Two of the five stations with heavy-tailed densities have over 50% of their annual maximum floods occurring at a time other than the spring.

Because of the long tail of station 01AQ001 (Fig. 7), an unrealistic gap appears in the figure, which causes no problem when exceedance probability is estimated. A density function using only the last 30 years instead of the full 70 years has a smoother, but nonetheless heavy, tail, as shown in Fig. 8. With the availability of a much longer record, the unrealistic gap in the long tail is expected to disappear.

Simulations

Due to the presence of a large grouping of bimodal densities, it was decided to investigate the possibility of their being caused simply by sampling variability. Monte Carlo simulations were therefore undertaken with the purpose of verifying that nonparametric methods would properly identify unimodal densities if present in the data. No attempt was made here to simulate the existing flood characteristics of the data being analysed in New Brunswick.

Fig. 5 Probability density function for Canaan River. Horizontal axis: discharge (m³/s).

Simulations from an extreme value type I (EV1) distribution resulted in roughly half of the densities from nonparametric frequency analysis being unimodal, about one-third being unimodal with some distortion in the tail and the remainder being multimodal. A bimodal distribution with a very high second peak, as found for some New Brunswick floods, and as shown in Fig. 5, was never encountered. One could argue that some of the New Brunswick bimodal distributions with a low second peak were due to sampling variability; however, their repeated occurrences cannot be accounted for solely by sampling variability. As well, out of the one hundred simulations, only six had a distortion in the tail in any way similar to the ones seen in Figs 7 and 8 for the heavy-tailed region.

Data were then generated from a bimodal distribution produced from a combination of two EV1s. Over 40% of the densities from nonparametric frequency analysis were bimodal with two high distinct peaks, close to 10% were unimodal, and close to 50% were multimodal, many of them being bimodal distributions with a distortion in the tail.
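A Monte Carlo experiment of this kind can be sketched as follows. The sample length, the EV1 parameters, the mixing weight and the rule-of-thumb smoothing factor are all assumptions made for this illustration (the study calibrated h by cross-validation and does not list its simulation settings), so the mode counts produced here are not expected to reproduce the percentages reported above.

```python
import math
import random
import statistics

def gumbel(rng, loc, scale):
    """Draw from an EV1 (Gumbel) distribution by inverting its CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return loc - scale * math.log(-math.log(u))

def rule_of_thumb_h(data):
    """Rule-of-thumb smoothing factor (assumed here in place of cross-validation)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = min(statistics.stdev(data), (q3 - q1) / 1.34)
    return 0.9 * spread * len(data) ** (-0.2)

def count_modes(data, h, points=300, min_ratio=0.05):
    """Count local maxima of the (unnormalised) Gaussian kernel density."""
    lo, hi = min(data) - 4.0 * h, max(data) + 4.0 * h
    dx = (hi - lo) / points
    fs = [sum(math.exp(-0.5 * ((lo + k * dx - xi) / h) ** 2) for xi in data)
          for k in range(points + 1)]
    top = max(fs)
    return sum(1 for k in range(1, points)
               if fs[k] > fs[k - 1] and fs[k] >= fs[k + 1]
               and fs[k] >= min_ratio * top)

def simulate(draw, n_years=30, replicates=100, seed=1):
    """Tally the number of modes found in each simulated record."""
    rng = random.Random(seed)
    tally = {}
    for _ in range(replicates):
        sample = [draw(rng) for _ in range(n_years)]
        m = count_modes(sample, rule_of_thumb_h(sample))
        tally[m] = tally.get(m, 0) + 1
    return tally

def single_ev1(rng):
    """One flood population (hypothetical parameters)."""
    return gumbel(rng, 100.0, 30.0)

def mixed_ev1(rng):
    """Mixture of two EV1 populations (hypothetical parameters and weight)."""
    if rng.random() < 0.7:
        return gumbel(rng, 100.0, 20.0)
    return gumbel(rng, 250.0, 25.0)

unimodal_tally = simulate(single_ev1)
bimodal_tally = simulate(mixed_ev1)
print("modes found, single EV1:", sorted(unimodal_tally.items()))
print("modes found, two EV1s:  ", sorted(bimodal_tally.items()))
```

Comparing the two tallies shows how often sampling variability alone produces extra modes under these assumed settings.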

It was thus concluded that only data from a bimodal distribution would produce a density function with two high peaks by nonparametric frequency analysis. Thus, some of the New Brunswick annual maximum floods must be generated from a bimodal distribution and those two peaks are not due to sampling variability. As for the heavy-tailed densities, it was concluded that the probability was very remote that many distributions of such similar shape could randomly be found so close to each other given their low chance of occurrence due to sampling variability.

Fig. 6 Probability density function for Upsalquitch River. Horizontal axis: discharge (m³/s).

Fig. 7 Probability density function for Lepreau River. Horizontal axis: discharge (m³/s).

RESULTS

Homogeneous region delineation

As shown in Fig. 1, most rivers with unimodal probability density functions come from the same geographical area in northern New Brunswick, while those with a heavy tail tend to come from the same part of southern New Brunswick. These two areas correspond very closely to the climatologically different areas shown in Fig. 3. Therefore, based on these findings, a homogeneous bimodal region excluding these two areas with different density shapes was established.

As a consequence, grouped together are basins subject to the same flood generating mechanisms which appear to respond in a similar fashion to these mechanisms because of their similar density shape. A small number of unimodal densities were found in the bimodal region; since these had floods resulting from different mechanisms, it was assumed that the two modes had been very close to each other, leading to a unimodal density.

In order to test whether a group of sites contain data all drawn from the same probability distribution, a statistical test employing L-moments was used. For the New Brunswick annual maximum floods, three heterogeneity measures, the details of which are available elsewhere (Hosking & Wallis, 1991), were computed. Hosking & Wallis (1991) suggest that values of these measures less than unity mean a region is acceptably homogeneous, values between 1 and 2 mean possibly heterogeneous, while values above 2 mean definitely heterogeneous. There is no inconsistency in using a parametric test to ascertain the homogeneity of regions established by a nonparametric approach because the latter was merely used as a screening tool in order to group basins of similar probability density shape.

Fig. 8 Probability density function for Lepreau River 1959-1988. Horizontal axis: discharge (m³/s).

The test was applied (Gingras & Adamowski, 1992) to all 53 stations, resulting in three heterogeneity measures between 1 and 2, implying that the New Brunswick flood data set was possibly heterogeneous. The same test was then applied to the 44 stations from the delineated homogeneous bimodal region, resulting in three heterogeneity measures less than 1. According to the statistical test, this region was therefore considered homogeneous, i.e. its data could be regarded as drawn from the same probability distribution. While the delineated bimodal region could perhaps be further subdivided based on some other criteria, it could not be reduced in size based on the selection of an identical distribution.
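The interpretation rule quoted from Hosking & Wallis (1991) can be written as a small helper. This is only a sketch: it does not compute the heterogeneity measures themselves, the function names are hypothetical, the values passed in the example are made up for illustration (the paper reports only the ranges), and taking the least favourable label across the three measures is an assumption about how they are combined.

```python
# Labels ordered from most to least favourable.
LABELS = (
    "acceptably homogeneous",
    "possibly heterogeneous",
    "definitely heterogeneous",
)


def classify_heterogeneity(h):
    """Map one heterogeneity measure to the label quoted in the text."""
    if h < 1.0:
        return LABELS[0]
    if h <= 2.0:
        return LABELS[1]
    return LABELS[2]


def classify_region(measures):
    """Return the least favourable label among a region's measures (assumed rule)."""
    return max((classify_heterogeneity(h) for h in measures), key=LABELS.index)


# Hypothetical values, chosen only to fall in the ranges reported in the text.
province_wide = [1.2, 1.6, 1.1]
bimodal_region = [0.3, 0.7, 0.9]
print(classify_region(province_wide))
print(classify_region(bimodal_region))
```

With three measures between 1 and 2 the helper reports a possibly heterogeneous data set, and with three measures below 1 it reports an acceptably homogeneous one, which mirrors the two outcomes described above.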

Regional analysis

The unimodal shape and heavy-tailed shape regions that were found contained, respectively, only four and five hydrometric stations, numbers too small to develop useful regression equations. Each of these groupings probably belongs to a larger homogeneous region, only a small portion of which is located in New Brunswick. The set of 53 stations minus the nine stations having a different density shape, i.e. a total of 44 stations, may thus be called the homogeneous bimodal region. Regional analysis was performed both for the homogeneous bimodal region and on a province-wide basis, and the results were compared in order to determine whether the delineation of a homogeneous region would result in more accurate regional equations.

Multiple linear regression, which involves developing a series of regional equations for floods of various return periods, was the chosen regionalization approach. The 2, 10, 20, 50 and 100 year flood estimates were subjected to a linear regression against physiographic and climatic parameters such as drainage area, mean annual precipitation, percentage of lakes and swamps, and average water content of snow on 31 March as computed for the basins draining to the hydrometric stations. The following relationships were used:

$$\log Q = a + b \log x_1 \qquad (4)$$

or:

$$\log Q = c + d \log x_1 + e \log x_2 \qquad (5)$$

where $Q$ is the design flood of a given return period and $x_1$ and $x_2$ are physiographic or climatic variables, while $a$, $b$, $c$, $d$ and $e$ are regression coefficients.
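Relationships of the form (4) and (5) are linear in the logarithms, so they can be fitted by ordinary least squares. The following is a minimal sketch under stated assumptions: base-10 logarithms are assumed, the function name `fit_log_regression` is hypothetical, and the data are synthetic and noise-free (generated from known coefficients), not the New Brunswick station data.

```python
import numpy as np


def fit_log_regression(flood, *predictors):
    """Fit log10(Q) = intercept + sum_k coef_k * log10(x_k) by least squares."""
    y = np.log10(np.asarray(flood, dtype=float))
    columns = [np.ones_like(y)]
    columns += [np.log10(np.asarray(x, dtype=float)) for x in predictors]
    design = np.column_stack(columns)
    coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefficients


# Synthetic example: 44 hypothetical basins with made-up predictor ranges.
rng = np.random.default_rng(0)
area = rng.uniform(50.0, 5000.0, size=44)       # stands in for x1
precip = rng.uniform(900.0, 1400.0, size=44)    # stands in for x2
flood = 10.0 ** (-6.0) * area ** 0.93 * precip ** 2.0  # known c, d, e

c, d, e = fit_log_regression(flood, area, precip)   # form of equation (5)
a, b = fit_log_regression(flood, area)              # form of equation (4)
print(round(c, 3), round(d, 3), round(e, 3))
```

Because the synthetic floods contain no noise, the two-variable fit recovers the generating coefficients; with real data the residual scatter would determine the standard errors of the coefficients.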

Ordinary least squares was used to determine the regression coefficients, given in Table 1, for the entire province and for the homogeneous bimodal region. The most significant factor to enter the equation was always drainage area, with mean annual precipitation always coming second. The addition of further variables was not statistically significant.

Table 1 Regression coefficients for New Brunswick ($\log Q = a + b \log x_1$ or $\log Q = c + d \log x_1 + e \log x_2$)

Flood       Region            a        b        c        d        e
2 year      Province       -0.336    0.871   -5.246    0.919    1.572
2 year      Homogeneous    -0.346    0.876   -5.849    0.932    1.759
10 year     Province       -0.043    0.850   -6.792    0.915    2.160
10 year     Homogeneous    -0.049    0.852   -7.075    0.925    2.245
20 year     Province        0.040    0.844   -7.204    0.914    2.319
20 year     Homogeneous     0.024    0.849   -7.253    0.924    2.326
50 year     Province       -0.005    0.881   -6.594    0.930    2.124
50 year     Homogeneous    -0.009    0.882   -6.156    0.930    1.980
100 year    Province        0.036    0.877   -6.897    0.928    2.235
100 year    Homogeneous     0.011    0.883   -6.060    0.931    1.955

For comparison of goodness of fit between the equations, the integral square error (ISE), in percentage, was calculated. It is defined by (Adamowski & Middleton, 1977):

$$\mathrm{ISE} = \frac{\left[\sum_{i=1}^{n} (O_i - P_i)^2\right]^{0.5}}{\sum_{i=1}^{n} O_i} \times 100 \qquad (6)$$

where $O_i$ are the single station flood estimates, $P_i$ the predicted flood estimates from the regression equation and $n$ the sample size. Tables 2 and 3 list the ISEs for the regressions of flood versus drainage area as well as flood versus drainage area and mean annual precipitation.
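The integral square error can be computed in a few lines. The sketch below assumes that the ISE is the square root of the summed squared differences between single station and predicted estimates, divided by the sum of the single station estimates and expressed as a percentage; the function name and the numbers in the example are hypothetical.

```python
import math


def integral_square_error(observed, predicted):
    """Return the ISE in percent under the form assumed in the lead-in.

    observed  : single station flood estimates
    predicted : flood estimates from the regression equation
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    root_sum_sq = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)))
    return 100.0 * root_sum_sq / sum(observed)


# Made-up flood estimates for three hypothetical stations (same units for both lists).
single_station = [120.0, 340.0, 95.0]
regression = [110.0, 360.0, 100.0]
print(round(integral_square_error(single_station, regression), 2))
```

A perfect fit gives an ISE of zero, and larger departures of the regression estimates from the single station estimates give larger percentages.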

Table 2 Integral square error (%) for flood vs drainage area

Flood       Province-wide    Homogeneous region
2 year          3.85               4.07
10 year         4.37               4.41
20 year         4.61               4.49
50 year         4.13               3.87
100 year        3.98               3.69

Table 3 Integral square error (%) for flood vs drainage area and mean annual precipitation

Flood       Province-wide    Homogeneous region
2 year          3.33               3.40
10 year         3.30               3.18
20 year         3.40               3.15
50 year         3.14               2.94
100 year        2.99               2.75

Discussion

The regression coefficients of Table 1 for the homogeneous bimodal region and for the province-wide equations are very similar, in some instances with identical or almost identical coefficients. A computation of the standard errors of the regression parameters indicates that the equations for the bimodal region and for the entire province are not statistically different. Yet a comparison of the goodness of fit of the regional equations shows a difference between the province-wide ISEs and the bimodal region ISEs for various return periods.
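To see how close the two sets of equations are in practice, the single-variable coefficients can be evaluated at a common drainage area. This sketch rests on several assumptions: the coefficient columns of Table 1 are read in the order Province, Homogeneous for each return period, the logarithms are base 10, and the drainage area of 1000 is a hypothetical value whose units (like those of the resulting flood) are not stated in this section.

```python
import math

# (a, b) pairs for logQ = a + b*log(x1), read from Table 1 as
# return period -> (province-wide, homogeneous bimodal region).
COEFFICIENTS = {
    2: ((-0.336, 0.871), (-0.346, 0.876)),
    10: ((-0.043, 0.850), (-0.049, 0.852)),
    20: ((0.040, 0.844), (0.024, 0.849)),
    50: ((-0.005, 0.881), (-0.009, 0.882)),
    100: ((0.036, 0.877), (0.011, 0.883)),
}


def flood_estimate(a, b, drainage_area):
    """Evaluate Q from logQ = a + b*log(x1), assuming base-10 logarithms."""
    return 10.0 ** (a + b * math.log10(drainage_area))


def relative_difference(return_period, drainage_area):
    """Relative difference of the homogeneous estimate from the province-wide one."""
    (a_p, b_p), (a_h, b_h) = COEFFICIENTS[return_period]
    q_province = flood_estimate(a_p, b_p, drainage_area)
    q_homogeneous = flood_estimate(a_h, b_h, drainage_area)
    return (q_homogeneous - q_province) / q_province


for period in COEFFICIENTS:
    print(period, f"{100.0 * relative_difference(period, 1000.0):+.1f}%")
```

At this hypothetical drainage area the two sets of estimates differ by only a few percent for every return period, consistent with the statement that the equations are not statistically different.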

For the smaller return periods (2 year and 10 year floods in Table 2, and 2 year flood in Table 3), the ISEs are lower for the province-wide equations than for the bimodal region equations. This is as expected because the province-wide equations were developed with a larger sample of stations. However, for the other return periods, the bimodal region equations have lower ISEs. In other words, once the stations with non-bimodal probability density function shapes are removed, the resulting regression equations fit better for these return periods. The reason for this is that the large return period floods of the non-bimodal stations were caused by processes different from those for the remainder of the province. Defining a homogeneous bimodal region, in other words removing some stations from where they did not belong, eliminated their adverse impact on the regressions.
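The comparison described above can be checked directly from the tabulated values. The snippet below is only an illustration: it copies the ISEs from Tables 2 and 3 and lists the return periods for which the homogeneous region has the lower value; the variable and function names are hypothetical.

```python
RETURN_PERIODS = (2, 10, 20, 50, 100)  # years

# ISE (%) from Table 2: flood vs drainage area.
ISE_AREA = {
    "province": (3.85, 4.37, 4.61, 4.13, 3.98),
    "homogeneous": (4.07, 4.41, 4.49, 3.87, 3.69),
}

# ISE (%) from Table 3: flood vs drainage area and mean annual precipitation.
ISE_AREA_PRECIP = {
    "province": (3.33, 3.30, 3.40, 3.14, 2.99),
    "homogeneous": (3.40, 3.18, 3.15, 2.94, 2.75),
}


def periods_where_homogeneous_is_lower(table):
    """Return the return periods whose homogeneous-region ISE is the smaller one."""
    rows = zip(RETURN_PERIODS, table["province"], table["homogeneous"])
    return [period for period, province, homogeneous in rows if homogeneous < province]


print(periods_where_homogeneous_is_lower(ISE_AREA))         # [20, 50, 100]
print(periods_where_homogeneous_is_lower(ISE_AREA_PRECIP))  # [10, 20, 50, 100]
```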

These results suggest that different homogeneous regions could be defined for floods of different return periods. In New Brunswick, for return periods of 20 years and above, a homogeneous region different from that of the lower return period floods could be delineated. This is not unexpected as it was previously found that floods and low flows can have different homogeneous regions (Inland Waters, 1988b).

CONCLUSION

In New Brunswick, the shape of the probability density function, as estimated from nonparametric frequency analysis, is highly correlated with the presence of different flood generating mechanisms. Simulation studies have shown that nonparametric frequency analysis is reliable in assessing the probability density shape. The density shape has been shown to be a useful tool in delineating homogeneous flood regions by grouping basins of similar density shape. For large return period floods, the regional equations of a homogeneous region gave lower integral square errors than the province-wide equations.

Acknowledgements Financial assistance provided by the Natural Sciences and Engineering Research Council of Canada and the Ontario Graduate Scholarship is gratefully acknowledged.

REFERENCES

Adamowski, K. (1985) Nonparametric kernel estimation of flood frequencies. Wat. Resour. Res. 21(11), 1585-1590.

Adamowski, K. (1989) A Monte Carlo comparison of parametric and non-parametric estimation of flood frequencies. J. Hydrol. 108, 295-308.

Adamowski, K. & Middleton, A. C. (1977) Steady-state dissolved oxygen model for the Rideau River. Can. J. Civ. Engng 4(4), 471-481.

Ahmad, M. I., Sinclair, C. D. & Werritty, A. (1988) Log-logistic flood frequency analysis. J. Hydrol. 98, 205-224.

Alila, Y. (1988) Nonparametric flood frequency analysis with historical information and hydro-climatically defined mixed distributions. MASc thesis, University of Ottawa, Canada.

Bardsley, W. E. (1989) A simple parameter-free flood magnitude estimation. Hydrol. Sci. J. 34(2), 129-137.

Bhaskar, N. R. & O'Connor, C. A. (1989) Comparison of method of residuals and cluster analysis for flood regionalization. ASCE J. Wat. Resour. Plan. Manage. 115(6), 793-808.

Burn, D. H. (1989) Delineation of groups for regional flood frequency analysis. J. Hydrol. 104, 345-361.

Burn, D. H. (1990) An appraisal of the region of influence approach to flood frequency analysis. Hydrol. Sci. J. 35(2), 149-165.

Cavadias, G. S. (1990) The canonical correlation approach to regional flood estimation. In: Regionalization in Hydrology ed. M. A. Beran, M. Brilly, A. Brilly, A. Becker & O. Bonacci, Proc. Ljubljana Symp. April 1990, IAHS Publ. no. 191, 171-178.

Chow, K. C. A. & Watt, W. E. (1990) A knowledge-based expert system for flood frequency analysis. Can. J. Civ. Engng 17, 597-609.

Cunnane, C. (1987) Review of statistical models for flood frequency estimation. In: Hydrologic Frequency Modelling ed. V. P. Singh, 49-96, D. Reidel Publishing Co., Dordrecht, The Netherlands.

Diehl, T. & Potter, K. W. (1987) Mixed distributions in Wisconsin. In: Hydrologic Frequency Modelling ed. V. P. Singh, 213-226, D. Reidel Publishing Co., Dordrecht, The Netherlands.

Gingras, D. & Adamowski, K. (1992) Coupling of nonparametric frequency and L-moment analyses for mixed distribution identification. Wat. Resour. Bull. 28(2), 263-272.

Guo, S. L. (1991) Nonparametric variable kernel estimation with historical floods and paleoflood information. Wat. Resour. Res. 27(1), 91-98.

Hirschboeck, K. K. (1987) Hydroclimatically-defined mixed distributions in partial duration flood series. In: Hydrologic Frequency Modelling ed. V. P. Singh, 199-212, D. Reidel Publishing Co., Dordrecht, The Netherlands.

Hosking, J. R. M. & Wallis, J. R. (1991) Some statistics useful in regional frequency analysis. IBM Research Report RC 17096, Yorktown Heights, New York, USA.

Inland Waters (1987) Flood frequency analyses, New Brunswick. Inland Waters/Lands Directorate, Environment Canada & New Brunswick Department of Municipal Affairs and Environment, New Brunswick, Canada.

Inland Waters (1988a) New Brunswick hydrometric network evaluation, background report no. 2: Delineation of physiographic-climatic zones. Inland Waters/Lands, Environment Canada & New Brunswick Department of Municipal Affairs and Environment, New Brunswick, Canada.

Inland Waters (1988b) New Brunswick hydrometric network evaluation, background report no. 3: Regional hydrology and data transfer methodologies. Inland Waters/Lands, Environment Canada & New Brunswick Department of Municipal Affairs and Environment, New Brunswick, Canada.

Kite, G. W. (1977) Frequency and risk analysis in hydrology. Water Resources Publications, Littleton, Colorado, USA.

Klemeš, V. (1987) Hydrological and engineering relevance of flood frequency analysis. In: Hydrologic Frequency Modelling ed. V. P. Singh, 1-18, D. Reidel Publishing Co., Dordrecht, The Netherlands.

Labatiuk, C. & Adamowski, K. (1987) Application of nonparametric density estimation to computation of flood magnitude/frequency. In: Stochastic Hydrology ed. I. B. MacNeill & G. J. Umphrey, 161-180, D. Reidel Publishing Co., Dordrecht, The Netherlands.

Linsley, R. K. (1982) Flood estimates. How good are they? Wat. Resour. Res. 22(9), 159-164.

Moin, S. M. A. & Shaw, M. A. (1986) Regional flood frequency analysis for Ontario streams (three volumes). Canada/Ontario Flood Damage Reduction Program, Environment Canada.

Pilon, P. J., Condie, R. & Harvey, K. D. (1985) Consolidated frequency analysis package (CFA). User manual for version 1. DEC Pro Series, Water Resources Branch, Inland Waters Directorate, Environment Canada, Ottawa, Canada.

Rossi, F., Fiorentino, M. & Versace, P. (1984) Two-component extreme value distribution for flood frequency analysis. Wat. Resour. Res. 20(7), 847-856.

Scott, D. W. & Terrell, G. R. (1987) Biassed and unbiassed cross-validation in density estimation. J. Am. Statist. Assoc. 82, 1131-1146.

Silverman, B. W. (1986) Density Estimation for Statistics and Data Analysis. Chapman and Hall, London, UK.

Singh, K. P. (1987) Development of a versatile flood frequency methodology and its application to flood series from different countries. In: Hydrologic Frequency Modelling ed. V. P. Singh, 183-197, D. Reidel Publishing Co., Dordrecht, The Netherlands.

Stoddart, R. B. L. & Watt, W. E. (1970) Flood frequency prediction for intermediate drainage basins in southern Ontario. Civil Engineering Research Report no. 66, Queen's University, Kingston, Ontario, Canada.

US Water Resources Council (1982) Guidelines for determining flood flow frequency. Bulletin 17B, US Water Resources Council, Washington, DC, USA.

Waylen, P. & Woo, M. K. (1982) Prediction of annual floods generated by mixed processes. Wat. Resour. Res. 18(4), 1283-1286.

Waylen, P. & Woo, M. K. (1984) Areal prediction of annual floods generated by two distinct processes. Hydrol. Sci. J. 29(1), 75-88.

Wiltshire, S. E. (1985) Grouping basins for regional flood frequency analysis. Hydrol. Sci. J. 30(1),

Received 25 November 1991; accepted 20 November 1992
