Multivariate generalized Pareto distribution in practice...

Multivariate generalized Pareto distribution in practice: models and estimation

December 6, 2010

Pál Rakonczai
Probability Theory and Statistics Department, Eotvos University, Budapest, Hungary
& Department of Mathematical Statistics, Lund Institute of Technology, Sweden
e-mail: [email protected]

Contents
1 Introduction
2 Models
  2.1 Univariate extreme value models
  2.2 Multivariate extreme value models
  2.3 Multivariate threshold exceedance models
  2.4 Parametric families of dependence structures
  2.5 Construction of new non-exchangeable models
3 Statistical inference on wind speed data
  3.1 Computational aspects of estimation and prediction
  3.2 Results for bivariate observations
  3.3 Results for trivariate observations
4 Discussion
A Appendix: "classical BGPD" models fitted by 'fbvpot'


  • Abstract

    Extreme values are of substantial interest in fields as diverse as finance, environmental science and engineering, because they are associated with rare but hazardous events (such as flooding, mechanical failure or severe financial loss). There is often interest in understanding how the extremes of two different processes are related to each other and, in particular, in estimating the probability that both processes will become extreme simultaneously. Analyses of the extremes of two variables are typically based on fitting a specific parametric subfamily of the multivariate extreme value distribution to component-wise maxima. An alternative approach, which potentially uses the available data much more efficiently, involves fitting the multivariate generalised Pareto distribution (MGPD) to data that exceed a suitably high threshold. Several non-equivalent definitions of the MGPD are in use: one covers exceedances over a threshold in at least one of the components (Rootzén and Tajvidi, 2006), while others cover exceedances that are over the threshold in all components. Here we investigate the applicability of the first definition under different underlying dependence models and compare the performance of the two substantially different types of MGPD model. Although these dependence models are used intensively in other extreme value models, to the best of our knowledge none of them has been discussed or applied to data in the new MGPD framework, apart from the bivariate (symmetric) logistic model. Because only a few of these families both allow asymmetry between the components and produce absolutely continuous models in the new MGPD case, we introduce a general transformation for creating such models from symmetric ones. We apply the proposed models to wind speed data, modeling exceedances occurring at locations in northern Germany, and outline methods for calculating prediction regions and for evaluating goodness-of-fit.

    Key words: Multivariate threshold exceedances, parametric dependence models, wind speed data.


  • 1 Introduction

    Multivariate extreme value theory involves describing the statistical properties of the extreme values of two or more processes or, in the spatial context, of the same process at two or more sites. It is concerned, in particular, with quantifying the probability that extreme values will occur simultaneously in more than one process, or at more than one location. Such probabilities are of substantial interest in a wide range of fields, including the environmental sciences, finance and internet traffic monitoring. Even when interest ultimately lies in describing the extreme values of a single process at a single location, there will still often be value in using data from other processes or locations to improve the efficiency of estimates relating to the location and process of interest (Ribatet et al., 2007). Efficiency can sometimes also be improved by utilizing data that have been collected on the same process at different temporal resolutions (e.g. daily and hourly; Nadarajah et al., 1998), and this also requires the use of a multivariate model.

    A sophisticated mathematical theory describing the characteristics of multivariate extreme events has been developed (e.g. Pickands, 1981 and Rootzén & Tajvidi, 2006), but there are substantial computational and statistical challenges in applying the resulting asymptotic models to data. Univariate extreme value methods, in contrast, are routinely used for data analysis within finance, the environmental sciences, and a range of other scientific applications (e.g. Coles, 2001). Multivariate extremes are often modeled by applying block maxima methods (BMM) to the component-wise maxima of the series being studied (e.g. Tawn, 1988). In environmental applications this commonly means focusing on the annual maximum value of each process. If the blocks are sufficiently long, and under certain other conditions, theoretical results suggest that the distribution of these component-wise maxima can be approximated by a multivariate extreme value distribution (MEVD). The practical application of this result typically involves selecting a particular parametric model from within the MEVD class, and then drawing statistical inferences about the parameters of this model using maximum likelihood estimation.

    An important drawback of the BMM approach is that it ignores information on whether or not the extremes of the different processes actually occur simultaneously. The annual maxima of the processes may all have arisen on the same day, for example, but it is also possible that they all occurred in completely different months. The basic implementation of the MEVD model does not distinguish between these situations (although an extension of the approach does allow some information on timing to be incorporated into the modeling: Stephenson and Tawn, 2005). This problem can be avoided by modeling the sizes of all observations that exceed a given high threshold, rather than modeling the highest value within a particular block of time. This method usually uses more data than the block maxima approach, and so also leads to efficiency gains (i.e. more precise estimates of the quantities of interest).

    Methods for analyzing multivariate threshold exceedances were originally developed by Tajvidi (1996), and have been further developed for more general cases in Rootzén and Tajvidi (2006). These papers have demonstrated that, under rather mild conditions and given an appropriately high threshold vector, the multivariate exceedances of a random vector over the threshold can be approximated by a multivariate generalized Pareto distribution (MGPD). Note that the definition of the MGPD in these papers differs substantially from the definitions found in other textbooks (except the recent book of Beirlant et al. (2004), where this definition is mentioned but not discussed). The main advantage of this approach is that it includes all observations that are extreme in at least one component. Although the mathematical theory for the MGPD model has been developed, the statistical properties of this model are not yet fully understood: the only paper discussing the performance of the model in practice is, to the best of our knowledge, that of Rakonczai and Tajvidi (2009). Similar BGPD models have also been used in a recent paper by Brodin and Rootzén (2009) to study wind storm losses in Sweden. Throughout this paper only stationary distributions are investigated. A non-stationary extension of the BGPD model, which allows for the possibility that the characteristics of extreme events change over time (or depend upon the value of some other covariate), is discussed by Rakonczai et al. (2010).

    Standard models for univariate and multivariate extremes are presented in Section 2, together with their role and properties in MGPD models. A new general method for constructing asymmetric versions of known symmetric models is also presented. In Section 3 we discuss the practical issues involved in estimating the parameters via maximum likelihood, and outline a procedure for constructing prediction regions as well as model validation techniques. Some useful conclusions are summarised in Section 4.

    2 Models

    In this section, after a short introduction to univariate extreme value models, we present the key models that are currently used for modeling maxima of multivariate observations and discuss their applicability within models for multivariate threshold exceedances. The presented models [1] are available in the 'mgpd' package of R, which was produced to give statisticians a tool for applying the proposed methods and models and to illustrate practical applications on wind speed data.

    [1] These models are different from those available in the 'evd' package of R; the main differences are discussed later on.

    2.1 Univariate extreme value models

    Univariate extreme value theory provides the limit results for the distribution of extremes of a single process: either the maximum of the observations or the exceedances of the observations over a high threshold. Maxima (daily, weekly, yearly etc.) can be modeled using the generalized extreme value distribution (EVD), which has a distribution function (df.) of the form

    $$G(x) = \exp\left\{-\left(1+\xi\,\frac{x-\mu}{\sigma}\right)^{-1/\xi}\right\}, \tag{1}$$

    where $1+\xi(x-\mu)/\sigma>0$, $\mu\in\mathbb{R}$ is called the location parameter, $\sigma>0$ the scale parameter and $\xi\in\mathbb{R}$ the shape parameter. Similarly, threshold exceedances (e.g. over a high quantile of the observations) can be approximated by the generalized Pareto distribution (GPD), which has a df. of the form

    $$H(x) = 1-\left(1+\xi\,\frac{x}{\sigma}\right)^{-1/\xi}, \tag{2}$$

    where $1+\xi x/\sigma>0$ and $\sigma>0$. The two limit distributions are strongly linked in the sense that, as the threshold tends to the right endpoint of the underlying distribution, the conditional distribution of the exceedances converges to the GPD if and only if the distribution of the maximum converges to the EVD.
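    The link between (1) and (2) can be illustrated numerically. The following is a minimal Python sketch, not code from the 'mgpd' package: the function names and the parameter values are hypothetical, and the check relies on the standard fact that exceedances of a GEV-distributed variable over a high threshold $u$ are approximately GPD with scale $\sigma+\xi(u-\mu)$.

```python
import math


def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.2):
    """GEV distribution function G(x) from eq. (1)."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))      # Gumbel limit as xi -> 0
    s = 1.0 + xi * z
    if s <= 0.0:
        # Outside the support: below the lower endpoint when xi > 0,
        # above the upper endpoint when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-s ** (-1.0 / xi))


def gpd_cdf(x, sigma=1.0, xi=0.2):
    """GPD distribution function H(x) from eq. (2), for exceedances x >= 0."""
    if x <= 0.0:
        return 0.0
    if abs(xi) < 1e-12:
        return 1.0 - math.exp(-x / sigma)   # exponential limit as xi -> 0
    s = 1.0 + xi * x / sigma
    if s <= 0.0:
        return 1.0                          # beyond the upper endpoint (xi < 0)
    return 1.0 - s ** (-1.0 / xi)


def conditional_excess_cdf(x, u, mu=0.0, sigma=1.0, xi=0.2):
    """P(Y - u <= x | Y > u) when Y follows the GEV above."""
    gu = gev_cdf(u, mu, sigma, xi)
    return (gev_cdf(u + x, mu, sigma, xi) - gu) / (1.0 - gu)


if __name__ == "__main__":
    u, x = 10.0, 2.0                        # hypothetical threshold and excess
    sigma_u = 1.0 + 0.2 * (u - 0.0)         # sigma + xi * (u - mu)
    print(conditional_excess_cdf(x, u), gpd_cdf(x, sigma=sigma_u, xi=0.2))
```

    For a high threshold the two printed values should be close; the agreement improves as $u$ increases.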

    2.2 Multivariate extreme value models

    Analogously to the univariate case, it can be shown that if the componentwise maxima have a limit distribution, it is necessarily a multivariate extreme value distribution (MEVD). Additionally, all of its univariate margins are univariate EVDs. There are several different (but equivalent) ways of characterising the underlying dependence structure. One of these is given by Resnick (1987), assuming unit Fréchet margins:

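    $$G_{\text{Fréchet}}(t_1,\dots,t_d) = \exp\left(-V(t_1,\dots,t_d)\right), \tag{3}$$

    with

    $$V(t_1,\dots,t_d) = \nu\left(([0,t_1]\times\cdots\times[0,t_d])^c\right) = \int_{S_d}\bigvee_{i=1}^{d}\left(\frac{w_i}{t_i}\right)S(dw), \tag{4}$$

    where $S$ is a finite measure on the $d$-dimensional simplex $S_d = \{w\in\mathbb{R}^d : |w| = 1\}$, which satisfies

    $$\int_{S_d} w_i\,S(dw) = 1 \quad\text{for } i = 1,\dots,d,$$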

    where $V$ and $S$ are called the exponent measure and the spectral measure, respectively. In particular, the total mass of $S$ is always $S(S_d) = d$. If $G$ is absolutely continuous, then we can reconstruct the densities of $S$ from the derivatives of $V$. In this context it is better to speak of "densities" rather than "density", as $S$ can have a density not only on the interior of $S_d$ but also on each of the lower-dimensional subspaces of $S_d$. E.g. if $d = 2$, the unit simplex is partitioned into two vertices and the interior of the interval, and if $d = 3$ into three vertices, three edges and the interior of the triangle. Hence, even if $G$ is absolutely continuous, the spectral measure $S$ can put positive mass on the vertices; for instance, when the margins of $G$ are independent, $S(\{e_i\}) = 1$ for all $i = 1,\dots,d$. More details about this characterization and its properties can be found in recent textbooks, e.g. Chapter 3 of Kotz and Nadarajah (2000) or Chapter 8 of Beirlant et al. (2004).

    Further formulas relevant for maximum likelihood inference are the following. Coles and Tawn (1991) showed how to compute the spectral densities on all subspaces from the partial derivatives of $V$. For example, on the interior the spectral density $s$ can be expressed through

    $$\frac{\partial^d V}{\partial t_1\cdots\partial t_d}(t_1,\dots,t_d) = -\left(\sum_{i=1}^{d} t_i\right)^{-(d+1)} s\left(\frac{t_1}{\sum_{i} t_i},\dots,\frac{t_d}{\sum_{i} t_i}\right).$$

    Another characterisation of the MEVD, due to Pickands (1981), is possible through the so-called dependence function. In the bivariate setting the dependence function $A(t)$ must satisfy the following three properties, which we denote by (P).

    i. $A(t)$ is convex

    ii. $\max\{(1-t),\,t\} \le A(t) \le 1$

    iii. $A(0) = A(1) = 1$.

    The lower bound in (ii.) corresponds to complete dependence, whereas the upper bound corresponds to independence. The exponent measure and its mixed partial derivative can be written as

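    $$V(t_1,t_2) = \left(\frac{1}{t_1}+\frac{1}{t_2}\right)A\left(\frac{t_1}{t_1+t_2}\right),$$

    $$\frac{\partial^2 V}{\partial t_1\partial t_2}(t_1,t_2) = \left(\frac{1}{t_1}+\frac{1}{t_2}\right)\left(A''(\zeta)\,\zeta'_{t_1}\zeta'_{t_2}+A'(\zeta)\,\zeta''_{t_1t_2}\right) - A'(\zeta)\left(\frac{\zeta'_{t_2}}{t_1^2}+\frac{\zeta'_{t_1}}{t_2^2}\right),$$

    where

    $$\zeta = \frac{t_1}{t_1+t_2},\qquad \zeta'_{t_1} = \frac{t_2}{(t_1+t_2)^2},\qquad \zeta'_{t_2} = \frac{-t_1}{(t_1+t_2)^2},\qquad \zeta''_{t_1t_2} = \frac{t_1-t_2}{(t_1+t_2)^3}.$$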
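    The formula for the mixed partial derivative can be checked numerically. The sketch below is illustrative only: it assumes the symmetric logistic dependence function $A(t) = (t^\alpha+(1-t)^\alpha)^{1/\alpha}$ with a hypothetical value $\alpha = 2$, and all function names are made up for this example. It evaluates the expression above and compares it with a central finite difference of $V$.

```python
import math

ALPHA = 2.0  # hypothetical logistic dependence parameter (alpha >= 1)


def A(t):
    """Symmetric logistic dependence function."""
    return (t ** ALPHA + (1 - t) ** ALPHA) ** (1 / ALPHA)


def A1(t):
    """First derivative A'(t)."""
    base = t ** ALPHA + (1 - t) ** ALPHA
    return base ** (1 / ALPHA - 1) * (t ** (ALPHA - 1) - (1 - t) ** (ALPHA - 1))


def A2(t):
    """Second derivative A''(t)."""
    base = t ** ALPHA + (1 - t) ** ALPHA
    return (ALPHA - 1) * (t * (1 - t)) ** (ALPHA - 2) * base ** (1 / ALPHA - 2)


def V(t1, t2):
    """Exponent measure written through the dependence function."""
    return (1 / t1 + 1 / t2) * A(t1 / (t1 + t2))


def V12(t1, t2):
    """Mixed partial derivative of V, using the zeta terms from the text."""
    s = t1 + t2
    z = t1 / s
    z1 = t2 / s ** 2            # d zeta / d t1
    z2 = -t1 / s ** 2           # d zeta / d t2
    z12 = (t1 - t2) / s ** 3    # d^2 zeta / (d t1 d t2)
    return ((1 / t1 + 1 / t2) * (A2(z) * z1 * z2 + A1(z) * z12)
            - A1(z) * (z2 / t1 ** 2 + z1 / t2 ** 2))


def V12_fd(t1, t2, h=1e-4):
    """Central finite-difference approximation of the same derivative."""
    return (V(t1 + h, t2 + h) - V(t1 + h, t2 - h)
            - V(t1 - h, t2 + h) + V(t1 - h, t2 - h)) / (4 * h * h)


if __name__ == "__main__":
    print(V12(1.3, 0.7), V12_fd(1.3, 0.7))
```

    The two values should agree to several digits, and both are negative, as expected for a mixed partial derivative of an exponent measure with an interior spectral density.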

    2.3 Multivariate threshold exceedance models

    The multivariate extension of the GPD has also been intensively investigated in recent decades and, as a result, several different definitions have developed along separate lines. Owing to its longer history, the multivariate GPD that uses exactly the same construction as the MEVD became the more popular one; we refer to it as the "classical MGPD". It is also implemented in the evd package of R. It concentrates on those exceedances that are greater than the threshold in every single component. Its univariate margins are GPDs, which are linked by a dependence model. Using the following transformation for the GPD margins

    $$\tilde t_i = \tilde t_i(x_i) = \frac{-1}{\log H_{\xi_i,\sigma_i}(x_i)} = \frac{-1}{\log\left\{1-\left(1+\xi_i\,\frac{x_i}{\sigma_i}\right)^{-1/\xi_i}\right\}},$$

    where $1+\xi_i x_i/\sigma_i>0$ and $\sigma_i>0$, the classical MGPD has the form

    $$\tilde H(x_1,\dots,x_d) = \tilde H_{\text{Fréchet}}(\tilde t_1,\dots,\tilde t_d) = \exp\left(-V(\tilde t_1,\dots,\tilde t_d)\right), \tag{5}$$

    where $\tilde t_i\ge 0$ and $V$ is defined analogously to (4).

    The other definition was proposed by Rootzén and Tajvidi (2006). Let $Y = (Y_1,\dots,Y_d)$ denote a random vector, let $u = (u_1,\dots,u_d)$ be a suitably high threshold vector and let $X = Y-u = (Y_1-u_1,\dots,Y_d-u_d)$ be the vector of exceedances. Following Rootzén and Tajvidi (2006), the multivariate generalized Pareto distribution (MGPD) for the exceedances $X$ is defined through a multivariate extreme value distribution (MEVD) $G$ with non-degenerate margins as

    $$H(x_1,\dots,x_d) = \frac{1}{-\log G(0,\dots,0)}\,\log\frac{G(x_1,\dots,x_d)}{G(x_1\wedge 0,\dots,x_d\wedge 0)}, \tag{6}$$

    where $0 < G(0,\dots,0) < 1$ and $\wedge$ denotes the minimum. By this definition, the model, which we shall simply call MGPD, provides a model for observations that are extreme in at least one component. On the other hand, the distribution is clearly uniquely determined by the underlying MEVD model. In the related literature there are several equivalent approaches to characterising an MEVD, assuming different marginal transformations and different representations of the dependence structure. Throughout this paper we assume unit Fréchet margins with df. $\Phi(x) = \exp(-x^{-1})$, and to achieve this we apply the following marginal transformations

    $$t_i = t_i(x_i) = \frac{-1}{\log G_{\xi_i,\mu_i,\sigma_i}(x_i)} = \left(1+\xi_i\,\frac{x_i-\mu_i}{\sigma_i}\right)^{1/\xi_i}, \tag{7}$$

    with $1+\xi_i(x_i-\mu_i)/\sigma_i > 0$ and $\sigma_i > 0$ as in (1), for $i = 1,\dots,d$. In the following, the definition in (6) is called the MGPD and (5) is called the "classical MGPD". It is important to notice that even if the dependence structure of the two models is the same, the models are substantially different. The most conspicuous difference is that (6) gives (parametric) probabilistic inference about exceedances that are above the threshold in at least one component, in contrast to (5), which models those that are above the threshold in every component. For an illustration on real observations, compare the upper and lower panels of Fig. 7, where the two substantially different approaches to constructing BGPD models are displayed, assuming the symmetric logistic family as dependence structure in both cases. A further difference between (6) and (5) lies in the interpretation of the parameters: the parameters $\xi_i$, $\mu_i$ and $\sigma_i$ of (6) can no longer be considered marginal parameters of the model, because the term $1/\log G(0,\dots,0)$ remains in the distribution even if $d-1$ components of $x$ tend to infinity. Moreover, if $H_1(x) = H(x,\infty)$ then $H_1$ is not a one-dimensional GPD. It is shown in Rootzén and Tajvidi (2006), however, that if $X_1$ has distribution $H_1$ then the conditional distribution of $X_1 \mid X_1 > 0$ is GPD. Similar properties hold for all marginal distributions regardless of the dimension of the MGPD. Further, the MGPD density can be obtained as

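    $$h(x) = \frac{\partial^d H}{\partial x_1\cdots\partial x_d}(x) = \frac{\partial^d}{\partial x_1\cdots\partial x_d}\left(1-\frac{\log G(x)}{\log G(0)}\right) = -\frac{\prod_{i=1}^{d} t_i'(x_i)}{V\left(t_1(0),\dots,t_d(0)\right)}\times\frac{\partial^d V}{\partial t_1\cdots\partial t_d}\left(t_1(x_1),\dots,t_d(x_d)\right). \tag{8}$$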


    Although the logarithm in (6) cancels the exponential in (3), the MGPD density $h(x)$ cannot be computed directly from the spectral density, because the constant term $G(0)$ still contains $V$. Additionally, for MGPD models it is reasonable to require that no positive probability mass is put on the boundary of the distribution; otherwise the model is no longer absolutely continuous, which is rarely realistic and further complicates maximum likelihood estimation.

    Lemma 1. Let $H$ be an MGPD represented by an absolutely continuous MEVD $G$ with spectral measure $S$. The distribution $H$ is absolutely continuous if and only if $S(S_d) = S(\mathrm{int}(S_d)) = d$, i.e. all mass is in the interior of the simplex.

    Corollary. In such cases the MGPD decays to zero when approaching the boundaries:

    $$\lim_{x_i\to x_i^{+}} H(x_1,\dots,x_d) = 0 \quad\text{for any } x_j > 0,\ j\ne i,$$

    where $x_i^{+} = \mu_{x_i}-\sigma_{x_i}/\xi_{x_i}$ is the finite lower endpoint of the $i$-th univariate GEV margin of the underlying MEVD model.

    Proof. It is easy to see that the subspaces of the unit simplex $S_d$ represent the boundaries of the space containing the unit Fréchet coordinates. E.g. if $d = 2$, then the point $(w, 1-w)\in\mathrm{int}(S_2)$ represents the line $y = \frac{1-w}{w}\,x$ on the unit Fréchet scale, and the limiting cases are $\{1,0\}$ and $\{0,1\}$, representing the $x$ and $y$ axes, respectively. This terminology remains the same in higher dimensions. Therefore, when the distribution function is calculated at any point of the interior on the unit Fréchet scale, the masses originating from the faces are always accumulated, which causes a jump in the function and so breaks its continuity.

    In the next subsections we summarize which specific parametric families include such a model. As we shall see, there are only a few such models, so in Section 2.5 we propose a general approach for constructing asymmetric models from symmetric ones.

    2.4 Parametric families of dependence structures

    Since MGPD models are defined by the underlying MEVD model, and MEVD models are in practice defined by their dependence structures (apart from the margins), the most popular parametric families for MEVD models have been considered as building blocks for the MGPD models. The most important characteristics of these models are summarized in Table 1. The list is not exhaustive but covers a rather wide range of families. Although these models are used intensively in other extreme value models, to the best of our knowledge none of them has been discussed or applied to data in the MGPD framework, apart from the bivariate (symmetric) logistic model. As characterization we give both the exponent measure function and the spectral density of each model. Additionally, we discuss which parameter settings allow non-exchangeability and put all mass in the interior of [0,1], providing absolutely continuous models within the BGPD framework by Lemma 1. More details about these models can be found in the respective papers indicated, which give the first appearance of each model. Graphical illustrations of the bivariate distribution function for some of the models can be found in Fig 5.


    Model                  Non-Exchangeable   All mass in int(S_d)   Complications
    Sym. Logistic          −                  X                      −
    Asym. Logistic         X                  −                      −
    Sym. Neg. Logistic     −                  X                      −
    Asym. Neg. Logistic    X                  −                      −
    Bilogistic             X                  X                      not explicit
    Neg. Bilogistic        X                  X                      not explicit
    Gen. Sym. Logistic     −                  X                      −
    Dirichlet              X                  X                      only s(w) is given
    Mixed                  −                  only if ψ = 1          −
    Asym. Mixed            X                  −                      −

    Table 1: Summary of bivariate dependence models. The asymmetric logistic, asymmetric negative logistic and asymmetric mixed models are considered excluding their symmetric special cases, because of their different continuity properties. Absolutely continuous BGPD models can be obtained if all mass is in the interior of [0, 1]; these cases are highlighted in bold. The non-exchangeable cases among them are the bilogistic, negative bilogistic and Dirichlet models.

    Most of the bivariate cases are straightforward to extend to higher dimensions, but these extensions rarely result in valid MGPD models under the full parameterisation. As an example, the multivariate logistic model provides a rich set of model parameters, but by Lemma 1 absolute continuity holds only in a very limited parameter setting (which coincides with the symmetric, exchangeable case). The multivariate negative logistic and nested logistic models behave similarly. As an alternative, the multivariate extension of the Coles-Tawn model (Dirichlet model) provides a flexible non-exchangeable model if one is needed, but fitting it requires sophisticated numerical integration tools, as only the spectral density is given explicitly. For a detailed description of the above models see Sections 2.4.1 and 2.4.2.

    2.4.1 Bivariate models

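    • Asymmetric logistic model (Tawn, 1988b).

      $$V_{\text{Log}}(x,y) = \frac{1-\psi_1}{x}+\frac{1-\psi_2}{y}+\left(\left(\frac{\psi_1}{x}\right)^{\alpha}+\left(\frac{\psi_2}{y}\right)^{\alpha}\right)^{1/\alpha}$$

      and

      $$s_{\text{Log}}(w) = (\alpha-1)\,\psi_1^{\alpha}\psi_2^{\alpha}\,(w(1-w))^{\alpha-2}\left((\psi_2 w)^{\alpha}+(\psi_1(1-w))^{\alpha}\right)^{1/\alpha-2},$$

      where $\alpha\ge 1$ and $0\le\psi_1,\psi_2\le 1$. The model is non-exchangeable unless $\psi_1 = \psi_2$. In the special case $\psi_1 = \psi_2 = 1$ it is called the symmetric logistic model. This is the only case in which the model has all its mass in the interior; otherwise $S(\{0\}) = 1-\psi_2$ and $S(\{1\}) = 1-\psi_1$. (See the upper left panel of Fig 5.)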

    • Asymmetric negative logistic model (Joe, 1990). The negative logistic model is similar in structure to the logistic:

      $$V_{\text{Neglog}}(x,y) = \frac{1}{x}+\frac{1}{y}-\left(\left(\frac{\psi_1}{x}\right)^{-\alpha}+\left(\frac{\psi_2}{y}\right)^{-\alpha}\right)^{-1/\alpha}$$

      and $s_{\text{Neglog}}(w) = -s_{\text{Log}}(w)$ with $\alpha$ replaced by $-\alpha$, where $\alpha > 0$ and $0 < \psi_1,\psi_2\le 1$. Analogously, $S(\{0\}) = 1-\psi_2$ and $S(\{1\}) = 1-\psi_1$, and so $\psi_1 = \psi_2 = 1$ gives the only symmetric version with all its mass in the interior. We refer to it as the symmetric negative logistic model. (See the upper right panel of Fig 5.)

    • The bilogistic model (Smith, 1990). The bilogistic model is an asymmetric extension of the logistic model in which the spectral measure $S$ does not put positive mass on the boundary points. Its exponent measure function is

      $$V_{\text{Bilog}}(x,y) = \int_{[0,1]}\max\left\{\frac{(\psi_1-1)\,z^{-1/\psi_1}}{\psi_1 x},\ \frac{(\psi_2-1)\,(1-z)^{-1/\psi_2}}{\psi_2 y}\right\}dz,$$

      where $\psi_1,\psi_2 > 1$. A disadvantage, though, is that there is only an implicit formula for its spectral density on (0, 1), in terms of the root of an equation:

      $$s_{\text{Bilog}}(w) = \frac{(1-\psi_1)(1-q)\,q^{1-\psi_1}}{(1-w)\,w^2\left((1-q)\psi_1+q\psi_2\right)},$$

      where $q = q(x,y;\psi_1,\psi_2)$ is the root of the equation

      $$(1-\psi_1)\,x\,(1-q)^{\psi_2}-(1-\psi_2)\,y\,q^{\psi_1} = 0.$$

      (See the lower left panel of Fig 5.) For valid input parameters the equation has a unique root in [0,1], which makes numerical root finding quite convenient in this case.

    • Negative bilogistic model (Coles and Tawn, 1994). The negative bilogistic model has the same exponent measure function as the bilogistic model, except that in this case $\psi_1,\psi_2 < 0$. The spectral density is $s_{\text{Negbilog}}(w) = -s_{\text{Bilog}}(w)$ and similarly $S(\{0\}) = S(\{1\}) = 0$.

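    • Tajvidi's generalized symmetric logistic model (Tajvidi, 1996).

      $$V_{\text{Tajvidi}}(x,y) = \left(\left(\frac{1}{x}\right)^{2\alpha}+2(1+\psi)\left(\frac{1}{xy}\right)^{\alpha}+\left(\frac{1}{y}\right)^{2\alpha}\right)^{\frac{1}{2\alpha}},$$

      where $1\le\alpha$ and $1 < \psi\le 2\alpha-2$.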

    • Dirichlet model (Coles and Tawn, 1991). The Dirichlet model is non-exchangeable like the two bilogistic models above, and has all mass of the spectral measure confined to the interior.

      $$V_{\text{CT}}(x,y) = \frac{1}{x}\left(1-\beta(q;\psi_1+1,\psi_2)\right)+\frac{1}{y}\,\beta(q;\psi_1,\psi_2+1)$$

      and

      $$s_{\text{CT}}(w) = \frac{\psi_1^{\psi_1}\psi_2^{\psi_2}\,\Gamma(\psi_1+\psi_2+1)}{\Gamma(\psi_1)\Gamma(\psi_2)}\cdot\frac{w^{\psi_1-1}(1-w)^{\psi_2-1}}{\left(\psi_1 w+\psi_2(1-w)\right)^{1+\psi_1+\psi_2}},$$

      where $q = \psi_1 y/(\psi_1 y+\psi_2 x)$, $\psi_1,\psi_2 > 0$, and $\beta$ is the normalized incomplete beta function. (See the upper right panel of Fig 5.)


    • Polynomial model (Klüppelberg and May, 1999).

      $$V_{\text{Poli}}(x,y) = \frac{1}{x}+\frac{1}{y}-\sum_{k=2}^{m}\psi_k\sum_{l=0}^{m-k}\binom{m-k}{l}\frac{x^{l+k-1}\,y^{m-k-l-1}}{(x+y)^{m-1}}$$

      and

      $$s_{\text{Poli}}(w) = m(m-1)\psi_m w^{m-2}+(m-1)(m-2)\psi_{m-1}w^{m-3}+\dots+2\psi_2,$$

      where $\psi_2 > 0$, $\sum_{k=2}^{m}\psi_k\ge 0$, $0\le\sum_{k=2}^{m}(k-1)\psi_k\le 1$ and $\sum_{k=2}^{m}k(k-1)\psi_k\ge 0$. The masses of the spectral measure $S$ on the boundary points are $S(\{0\}) = 1-\sum_{k=2}^{m}\psi_k$ and $S(\{1\}) = 1-\sum_{k=2}^{m}(k-1)\psi_k$.

      A special case is the asymmetric mixed model (Tawn, 1988b).

      $$V_{\text{AsyMix}}(x,y) = \frac{1}{x}+\frac{1}{y}-xy\,(x+y)^{-2}\left(\frac{\psi_1+\psi_2}{x}+\frac{\psi_1+2\psi_2}{y}\right)$$

      and $s_{\text{AsyMix}}(w) = \psi_1 w^3+\psi_2 w^2-(\psi_1+\psi_2)w+1$, where $\psi_1\ge 0$, $\psi_1+\psi_2\le 1$, $\psi_1+2\psi_2\le 1$ and $\psi_1+3\psi_2\ge 0$. When $\psi_1 = 0$, the asymmetric mixed model reduces to the symmetric mixed model; in that case $S(\{0\}) = 1-\psi_2$ and $S(\{1\}) = 1-\psi_2$. Consequently $\psi_2 = 1$ is the only symmetric case in which all mass is in the interior of [0,1].
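    The boundary masses quoted for the asymmetric logistic model above can be checked numerically against the requirement that the total mass of $S$ equals $d = 2$. The Python sketch below is illustrative only: the interior density is the closed form obtained by differentiating $V_{\text{Log}}$ twice (with outer exponent $1/\alpha-2$), the parameter values are hypothetical, and a plain midpoint rule stands in for a proper quadrature routine.

```python
def s_log(w, alpha, psi1, psi2):
    """Interior spectral density of the asymmetric logistic model,
    i.e. minus the mixed partial derivative of V_Log at (w, 1 - w)."""
    core = (psi2 * w) ** alpha + (psi1 * (1 - w)) ** alpha
    return ((alpha - 1) * (psi1 * psi2) ** alpha
            * (w * (1 - w)) ** (alpha - 2) * core ** (1 / alpha - 2))


def integrate(f, n=20000):
    """Midpoint rule on (0, 1); a simple stand-in for a quadrature routine."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


def spectral_masses(alpha, psi1, psi2):
    """Return (interior mass, mass at w = 0, mass at w = 1)."""
    interior = integrate(lambda w: s_log(w, alpha, psi1, psi2))
    atom0 = 1 - psi2   # S({0})
    atom1 = 1 - psi1   # S({1})
    return interior, atom0, atom1


if __name__ == "__main__":
    alpha, psi1, psi2 = 2.5, 0.6, 0.9   # hypothetical parameter values
    interior, atom0, atom1 = spectral_masses(alpha, psi1, psi2)
    print("interior mass:", interior)                  # close to psi1 + psi2
    print("total mass:   ", interior + atom0 + atom1)  # close to d = 2
```

    With $\psi_1 = \psi_2 = 1$ both atoms vanish and the whole mass sits in the interior, which is the absolutely continuous case singled out by Lemma 1.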

    2.4.2 Multivariate models

    • Logistic models and variations (Tawn, 1990). Let $B$ be the set of all nonempty subsets of $\{1,\dots,d\}$ and let $B_1 = \{b\in B : |b| = 1\}$. The $d$-dimensional logistic model can be represented by

      $$V_{\text{Logistic}}(t_1,\dots,t_d) = \sum_{b\in B}\left(\sum_{j\in b}\left(\frac{\psi_{j,b}}{t_j}\right)^{\alpha_b}\right)^{1/\alpha_b},$$

      where the dependence parameters satisfy $\alpha_b\ge 1$ for all $b\in B\setminus B_1$ and the asymmetry parameters $\psi_{j,b}$ are in $[0,1]$ for all $b\in B$ and $j\in b$. The constraints

      $$\sum_{b\in B_{(j)}}\psi_{j,b} = 1, \tag{9}$$

      for $j = 1,\dots,d$ ensure that the univariate margins are of the correct form, where we define $B_{(j)} = \{b\in B : j\in b\}$. As pointed out previously for the bivariate case, an absolutely continuous model requires both asymmetry parameters to be equal to 1. Since every bivariate margin must satisfy the same criterion, all asymmetry parameters must necessarily be equal to 1, and so almost all terms, which assign different dependence parameters to subsets of the components, drop out of the formula, apart from the one that captures all $d$ components together. Following this logic, the only realistic logistic model within an MGPD has the form

      $$V_{\text{Logistic}}(t_1,\dots,t_d) = \left(\sum_{i=1}^{d}\left(\frac{\psi_i}{t_i}\right)^{\alpha}\right)^{1/\alpha},$$

      having only one parameter to capture the dependence among the dimensions. Of course we should also emphasize that this simplicity appears only in the dependence structure, and the correct choice of the marginal parameters gives more flexibility to the model. Moreover, we should notice that the marginal parameters of the underlying MEVD model are not marginal parameters in the usual sense for the MGPD model, because of the constant term $\log G(0,\dots,0)$ in its definition (6).

    • Dirichlet model (Coles and Tawn, 1991). The spectral density of the model is

      $$s_{\text{Dirichlet}}(w) = \frac{\Gamma\left(\sum_{i=1}^{d}\psi_i+1\right)}{\left(\sum_{i=1}^{d}\psi_i w_i\right)^{d+1}}\prod_{i=1}^{d}\frac{\psi_i}{\Gamma(\psi_i)}\prod_{i=1}^{d}\left(\frac{\psi_i w_i}{\sum_{j=1}^{d}\psi_j w_j}\right)^{\psi_i-1},$$

      where $\psi_i > 0$ for $i = 1,\dots,d$. Although the corresponding exponent measure $V$ is complicated, it can be obtained by numerical integration.
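    To illustrate what obtaining $V$ by numerical integration involves, the following Python sketch treats the bivariate case ($d = 2$) only, using the spectral density $s_{\text{CT}}$ of Section 2.4.1 and the representation $V(x,y) = \int_0^1\max\{w/x,(1-w)/y\}\,s(w)\,dw$ from (4). It is a simplified sketch: the parameter values and function names are hypothetical, and the midpoint rule is only adequate here because $\psi_1,\psi_2 > 1$ keeps the integrand bounded at the endpoints.

```python
import math


def s_dirichlet(w, psi1, psi2):
    """Bivariate Dirichlet (Coles-Tawn) spectral density on (0, 1)."""
    const = (psi1 ** psi1 * psi2 ** psi2 * math.gamma(psi1 + psi2 + 1)
             / (math.gamma(psi1) * math.gamma(psi2)))
    return (const * w ** (psi1 - 1) * (1 - w) ** (psi2 - 1)
            / (psi1 * w + psi2 * (1 - w)) ** (1 + psi1 + psi2))


def integrate(f, n=20000):
    """Midpoint rule on (0, 1); a simple stand-in for a quadrature routine."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


def V_dirichlet(x, y, psi1, psi2):
    """Exponent measure from the spectral density, cf. eq. (4) with d = 2."""
    return integrate(
        lambda w: max(w / x, (1 - w) / y) * s_dirichlet(w, psi1, psi2))


if __name__ == "__main__":
    psi1, psi2 = 2.0, 3.5   # hypothetical asymmetry parameters
    print("total mass:  ", integrate(lambda w: s_dirichlet(w, psi1, psi2)))
    print("first moment:", integrate(lambda w: w * s_dirichlet(w, psi1, psi2)))
    print("V(1, 1):     ", V_dirichlet(1.0, 1.0, psi1, psi2))
```

    The total mass should be close to 2 and the first moment close to 1, matching the constraints on $S$, while $V(1,1)$ lies between 1 (complete dependence) and 2 (independence).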

    2.5 Construction of new non-exchangeable models

    From the second and third columns of Table 1 we can see that there is a lack of easily computable non-exchangeable models, especially if all probability mass is required to lie in the interior of $S_2$, which ensures the absolute continuity of the model. The bilogistic and negative bilogistic models are available, but without an explicit formula for their exponent measures; for the Dirichlet model there is an explicit formula only for the spectral density. To solve this problem, we propose a methodology for constructing new dependence models with extra asymmetry parameter(s) from any valid model. As a result of this method we may obtain more flexible non-exchangeable models, defining a new class of absolutely continuous MGPDs. For mathematical simplicity we illustrate the method in the bivariate setting using the dependence function, but the same methodology can be extended to higher dimensions as well. The algorithm is fairly simple:

    i. take a differentiable baseline dependence model from Table 1 (except the asym. logistic or asym. negative logistic) and switch the characterisation from exponent measure to dependence function by $A(t) = V\left(\frac{1}{1-t},\frac{1}{t}\right)$;

    ii. take a strictly monotonic transformation $\Psi(x) : [0,1]\to[0,1]$ such that $\Psi(0) = 0$, $\Psi(1) = 1$;

    iii. construct a new dependence model from the baseline model, $A_\Psi(t) = A(\Psi(t))$;

    iv. check the constraints: $A_\Psi'(0) = -1$, $A_\Psi'(1) = 1$ and $A_\Psi''\ge 0$ (convexity).

    For the construction of a feasible transformation we can assume, e.g., that it has the form

    $$\Psi(t) = t+f(t),$$

    hence

    $$(A(\Psi(t)))' = A'(\Psi(t))\,\Psi'(t) = A'(\Psi(t))\times[1+f'(t)]. \tag{10}$$

    [Figure 1: three panels plotted against t on [0, 1], showing t^ψ2 (1 − t)^ψ2, its first derivative and its second derivative, for ψ2 = 2, 2.25, 2.5 and 2.75.]

    Figure 1: Examples of the function f(t) in (11) and its derivatives, assuming fixed ψ1 = 1 and different ψ2 parameters.

    Obviously by choosing the second term ofΨ(t) such thatf ′(0) = f ′(1) = 0, theA′Ψ(0) = −1 andA

    ′Ψ(1) = 1 constrains are fulfilled for the new model if and only if

    A′(0) = −1 andA′(1) = 1 is true for the baseline model. The only problem which canoccur, is that the new dependence functionAΨ is not necessarily convex. In generalwe found that the following functional form leads to valid models

    $$f_{\psi_1,\psi_2}(t) = \psi_1\,(t(1-t))^{\psi_2}, \quad \text{for } t \in [0, 1], \qquad (11)$$

    where $\psi_1 \in \mathbb{R}$ and $\psi_2 \ge 2$ are asymmetry parameters (see Fig 1 for illustration). If $\psi_1 = 0$ we get back the baseline model. The convexity inequality below provides the valid range for the asymmetry parameter $\psi_1$. Assuming a fixed baseline $A(t)$ and fixed $\psi_2$, differentiating (10) once more gives

    $$(A(\Psi(t)))'' = A''(t + f(t))\,(1 + f'(t))^2 + A'(t + f(t))\,f''(t) \ge 0.$$

    Graphical illustrations of some valid model parameterisations can be found in Fig 2. Later, when using these transformations, we put the prefix "Ψ-" before the name of the baseline dependence model, e.g. Ψ-logistic model. The difference caused by the new asymmetry parameters can be seen in Fig 3 for three Ψ-logistic BGPD models.
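
The convexity check of step iv can be explored numerically. The sketch below assumes the logistic baseline in the form $A(t) = ((1-t)^\alpha + t^\alpha)^{1/\alpha}$ with $\alpha = 2$ and uses $\psi_2 = 2$; the function names, the grid size and the tolerance are illustrative choices. Convexity is tested with central second differences rather than with an analytic second derivative.

```python
def a_logistic(t, alpha=2.0):
    """Baseline logistic dependence function (assumed form)."""
    return ((1.0 - t) ** alpha + t ** alpha) ** (1.0 / alpha)

def psi(t, psi1, psi2=2.0):
    """Transformation Psi(t) = t + psi1 * (t * (1 - t)) ** psi2, cf. (11)."""
    return t + psi1 * (t * (1.0 - t)) ** psi2

def a_psi(t, psi1, psi2=2.0, alpha=2.0):
    """New dependence function A_Psi(t) = A(Psi(t))."""
    return a_logistic(psi(t, psi1, psi2), alpha)

def min_second_difference(psi1, psi2=2.0, alpha=2.0, n=1000):
    """Smallest central second difference of A_Psi on an equidistant grid of [0, 1]."""
    h = 1.0 / n
    values = [a_psi(i / n, psi1, psi2, alpha) for i in range(n + 1)]
    return min((values[i - 1] - 2.0 * values[i] + values[i + 1]) / h ** 2
               for i in range(1, n))

def is_convex(psi1, **kwargs):
    """True if no second difference is negative beyond a small numerical tolerance."""
    return min_second_difference(psi1, **kwargs) >= -1e-6

for psi1 in (-0.6, -0.3, 0.0, 0.3, 0.6):
    print(psi1, is_convex(psi1))
```

Under these assumptions the check passes for $\psi_1 = \pm 0.3$, the largest values shown in Fig 2, and fails for $\psi_1 = \pm 0.6$, where convexity breaks down near one of the endpoints. Scanning a finer set of $\psi_1$ values gives an approximate valid range for a given baseline.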

    [Figure 2, four panels over $t \in [0, 1]$: first and second derivatives of $A(\Psi(t))$ for the Ψ-logistic model (baseline $\alpha = 2$) and for the Ψ-negative logistic model (baseline $\alpha = 1.2$), each for $\psi_1 = -0.3, -0.15, 0, 0.15, 0.3$.]

    Figure 2: First and second derivatives of the Ψ-logistic and Ψ-negative logistic dependence functions, where the transformation $\Psi(t) = t + f_{\psi_1,\psi_2}(t)$ has fixed $\psi_2 = 2$ and different $\psi_1$ parameters.

    [Figure 3, three panels titled "Transformed logistic BGPD density", both axes from −4 to 6: $\psi_1 = 0.3$, $\psi_1 = 0$ and $\psi_1 = -0.3$, each with $\psi_2 = 2$.]

    Figure 3: BGPD density plots using Ψ-logistic dependence models with fixed $\psi_2 = 2$ and $\psi_1 = -0.3$, $0$, $0.3$.

  • 3 Statistical inference on wind speed data

    We illustrate the practical application of the proposed methods using data on wind speeds collected at four locations in northern Germany over five decades (1957-2007): Hannover, Bremerhaven, Schleswig and Fehmarn. The entire period covers 18061 days, of which 17926 have complete observations (the remaining 135 have missing coordinates). The observations are treated as stationary, but non-stationarity can also be handled, e.g. by choosing time-dependent model parameters as in Rakonczai et al. (2010). After choosing a relatively high quantile (95%) as the threshold level, the main aim is to model threshold exceedances occurring simultaneously at any pair or triple of measuring stations. The bivariate data and the threshold levels are displayed in Fig 6. The wind speed data for these cities are rather strongly correlated, due to the relatively short distances between them (Kendall's correlations are in the range 0.45-0.55 for pairwise comparisons between the four stations), and previous studies have suggested that BGPD models can perform reasonably for this level of association (Rakonczai and Tajvidi, 2010).

    3.1 Computational aspects of estimation and prediction

    The parameters can be estimated by numerical maximum likelihood or in a Bayesian way, using MCMC simulations with a conditional resampling algorithm. An effective parameterisation of the MCMC algorithm proved more complicated and required longer computation times, so we found the maximum likelihood method preferable. Although the maximum likelihood estimator is complicated to derive analytically, it is fairly easy to find by numerical optimisation using the Nelder-Mead algorithm (as implemented in R in the optim function). Some complications may arise, as the $\mu_i$, $\sigma_i$ parameters usually show high correlation within the estimated MGPD models. Our general finding is that setting one of these parameters (e.g. $\mu_1$) to zero has a negligible effect on the parameter estimates for $\xi_i$ and $\alpha$ and on the maximum value of the log-likelihood, suggesting that the models with and without fixed $\mu_1$ are effectively equivalent. Fixing $\mu_1$ at zero removes the strong correlations between the remaining parameters. Another problem is that the parameterisation of the MGPD has no automatic constraints ensuring positivity on the original scale of the observations. Hence, theoretically, the fitted model can assign positive probability to negative regions even for wind speed data, which are non-negative by nature. This discrepancy can be avoided by assuming that the margins have finite left endpoints not less than the theoretical minimum for the given application. Similar assumptions have been used in Dryden and Zempléni (2006).
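
To illustrate the kind of numerical maximum likelihood search described above, here is a small Python stand-in. It is not the MGPD likelihood and not R's optim: it fits a univariate GPD (scale and shape only) to simulated exceedances with a hand-written, simplified Nelder-Mead search. The data, starting values and all function names are assumptions made for the example.

```python
import math
import random

def gpd_nll(params, ys):
    """Negative log-likelihood of the univariate GPD for exceedances ys >= 0."""
    sigma, xi = params
    if sigma <= 0:
        return math.inf
    total = len(ys) * math.log(sigma)
    for y in ys:
        z = 1.0 + xi * y / sigma
        if z <= 0:
            return math.inf  # outside the support for this parameter pair
        total += y / sigma if abs(xi) < 1e-9 else (1.0 + 1.0 / xi) * math.log(z)
    return total

def nelder_mead(fun, x0, step=0.1, tol=1e-9, max_iter=500):
    """Simplified Nelder-Mead: reflection, expansion, inside contraction, shrink."""
    n = len(x0)
    points = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        points.append(p)
    simplex = [(fun(p), p) for p in points]
    for _ in range(max_iter):
        simplex.sort(key=lambda fp: fp[0])
        f_best, best = simplex[0]
        f_worst, worst = simplex[-1]
        if f_worst - f_best < tol:
            break
        centroid = [sum(p[j] for _, p in simplex[:-1]) / n for j in range(n)]

        def move(coef):
            # Point on the line through the centroid and the worst vertex.
            return [c + coef * (c - w) for c, w in zip(centroid, worst)]

        reflected = move(1.0)
        f_reflected = fun(reflected)
        if f_reflected < f_best:
            expanded = move(2.0)
            simplex[-1] = min((fun(expanded), expanded), (f_reflected, reflected))
        elif f_reflected < simplex[-2][0]:
            simplex[-1] = (f_reflected, reflected)
        else:
            contracted = move(-0.5)
            f_contracted = fun(contracted)
            if f_contracted < f_worst:
                simplex[-1] = (f_contracted, contracted)
            else:
                # Shrink every vertex halfway towards the best one.
                shrunk = [[b + 0.5 * (x - b) for b, x in zip(best, p)]
                          for _, p in simplex[1:]]
                simplex = [simplex[0]] + [(fun(q), q) for q in shrunk]
    return min(simplex, key=lambda fp: fp[0])

# Simulated exceedances by inverse-CDF sampling (illustrative parameter values).
rng = random.Random(1)
true_sigma, true_xi = 2.0, 0.1
ys = [true_sigma / true_xi * ((1.0 - rng.random()) ** -true_xi - 1.0) for _ in range(500)]

start = [sum(ys) / len(ys), 0.05]
nll_hat, (sigma_hat, xi_hat) = nelder_mead(lambda p: gpd_nll(p, ys), start)
print(round(sigma_hat, 3), round(xi_hat, 3))
```

Returning infinity outside the parameter space is one simple way to keep a derivative-free search inside the admissible region; a left-endpoint restriction on the margins could be imposed in the same manner.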

    3.2 Results for bivariate observations

    The large number of model parameters (7-9), which are hard to interpret directly (the location and scale parameters of the underlying MEVDs of the MGPD are no longer location and scale parameters for the MGPD margins), requires some transparent illustration. Hence we used bivariate prediction regions as an alternative way of visualising the estimates. A prediction region (Hall and Tajvidi, 2004) for a given probability $\gamma$ is a region bounded by a horizontal level curve of the bivariate density over which the integral of the density equals $\gamma$. The dependence parameters for 4 pairs of locations can be found in Table 2, and the prediction regions of some models are displayed in Fig 4. In the new (Ψ) models the asymmetry parameters seem to be non-zero, providing some evidence of asymmetry in the data, whereas the dependence parameters are very close to those estimated for the baseline models. The effect of the asymmetry parameters can be clearly seen in the top right panel of Fig 4, where the regions of the Ψ-NegLog model show a noticeable torsion to the left. A similar torsion occurs for the regions calculated from the Coles-Tawn dependence model.

    It is still difficult to distinguish between visually very similar regions, hence a formal test would be very useful. The test we propose can be performed as a simple byproduct of the prediction region method. For any region it is known how many observations are expected to fall inside or outside it, as well as between two neighbouring regions, so by comparing these theoretical frequencies with the observed ones we can perform a $\chi^2$ test. The results for Bremerhaven and Schleswig can be found in Table 3. The critical value for the test is 11.07 ($p = 0.95$, df = 5), hence the Coles-Tawn ($T_{\text{CT}} = 9.1$) and Ψ-NegLog ($T_{\Psi\text{-NegLog}} = 10.7$) models may be accepted. In general the new asymmetric (Ψ) models fit better than their symmetric baseline versions: $T_{\text{Log}} = 21.5$ reduces to $T_{\Psi\text{-Log}} = 16.9$ and $T_{\text{NegLog}} = 14$ reduces to $T_{\Psi\text{-NegLog}} = 10.7$. Tajvidi's generalized symmetric logistic model ($T_{\text{Tajvidi's}} = 11.5$) is the best among the symmetric ones.
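
The $\chi^2$ comparison can be sketched as below. The band probabilities follow from the $\gamma$ levels of nested prediction regions; the sixth band ($\gamma < 0.5$) is an assumption chosen to give six cells and hence df = 5, and the observed counts are hypothetical numbers used only to show the arithmetic.

```python
def chi2_statistic(observed, probs):
    """Pearson chi-square statistic for observed counts against cell probabilities."""
    n = sum(observed)
    expected = [n * p for p in probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Probability mass between neighbouring prediction regions:
# gamma > 0.99, 0.95-0.99, 0.9-0.95, 0.75-0.9, 0.5-0.75, and (assumed) below 0.5.
BAND_PROBS = [0.01, 0.04, 0.05, 0.15, 0.25, 0.50]
CRITICAL_VALUE = 11.07  # 0.95 quantile of the chi-square distribution with df = 5

observed = [14, 45, 48, 160, 240, 493]  # hypothetical counts, not taken from the data
t_stat = chi2_statistic(observed, BAND_PROBS)
print(round(t_stat, 2), t_stat < CRITICAL_VALUE)
```

A model would be accepted at this level whenever its statistic stays below the critical value, which is how the $T$ values quoted above are read.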

    [Figure 4, four panels (NegLog, Psi-NegLog, Coles-Tawn, Tajvidi's): prediction regions for $\gamma = 0.99, 0.95, 0.9$; horizontal axis Bremerhaven (m/s), vertical axis Schleswig (m/s).]

    Figure 4: Prediction regions at Bremerhaven and Schleswig for 4 different models (negative logistic, Ψ-negative logistic, Coles-Tawn and Tajvidi's generalized symmetric).
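
A prediction region of the kind shown in Fig 4 can be approximated on a grid: sort the grid cells by density and accumulate their mass until it reaches $\gamma$; the density of the last cell added is the level of the bounding curve. The sketch below uses an independent standard bivariate normal density purely as a stand-in, because the BGPD density is not reproduced here; the grid limits and function names are assumptions.

```python
import math

def density_level(density, gamma, lo=-5.0, hi=5.0, n=200):
    """Density level c such that the region {density >= c} has probability about gamma."""
    h = (hi - lo) / n
    cells = []
    for i in range(n):
        for j in range(n):
            x = lo + (i + 0.5) * h  # cell midpoints
            y = lo + (j + 0.5) * h
            cells.append(density(x, y))
    cells.sort(reverse=True)
    mass = 0.0
    for d in cells:
        mass += d * h * h
        if mass >= gamma:
            return d
    return cells[-1]

def std_normal_2d(x, y):
    """Stand-in density: independent standard bivariate normal, not a BGPD."""
    return math.exp(-0.5 * (x * x + y * y)) / (2.0 * math.pi)

levels = {g: density_level(std_normal_2d, g) for g in (0.9, 0.95, 0.99)}
for g, c in levels.items():
    print(g, c)
```

For this stand-in density the exact level is $(1-\gamma)/(2\pi)$, which gives a convenient check of the grid approximation. Counting the observations whose density value exceeds each level yields the observed frequencies needed for a goodness-of-fit comparison.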

    Models      Dependence (Hannover and Schleswig)    Dependence (Bremerhaven and Fehmarn)
    Log         α = 1.66                               α = 1.985
    Ψ-Log       α = 1.662, ψ1 = 0.207, ψ2 = 2          α = 2.049, ψ1 = −0.423, ψ2 = 2
    NegLog      α = 0.94                               α = 1.257
    Ψ-NegLog    α = 0.963, ψ1 = 0.354, ψ2 = 2          α = 1.287, ψ1 = −0.249, ψ2 = 2
    Mix         ψ1 = 0, ψ2 = 1                         ψ1 = 0, ψ2 = 1
    C-T         ψ1 = 1.42, ψ2 = 0.69                   ψ1 = 1.12, ψ2 = 2.39
    Tajvidi’s   ψ1 = 1.84, ψ2 = 0.37                   ψ1 = 2.24, ψ2 = 0.44
    Bilog       ψ1 = 0.63, ψ2 = 0.55                   ψ1 = 0.43, ψ2 = 0.57
    Negbilog    ψ1 = 0.8, ψ2 = 1.37                    ψ1 = 1.02, ψ2 = 0.6

    Models      Dependence (Bremerhaven and Schleswig) Dependence (Fehmarn and Schleswig)
    Log         α = 2.06                               α = 1.95
    Ψ-Log       α = 2.072, ψ1 = 0.334, ψ2 = 2          α = 1.992, ψ1 = 0.369, ψ2 = 2
    NegLog      α = 1.34                               α = 1.22
    Ψ-NegLog    α = 1.303, ψ1 = 0.231, ψ2 = 2          α = 1.226, ψ1 = 0.315, ψ2 = 2
    Mix         ψ1 = 0, ψ2 = 1                         ψ1 = 0, ψ2 = 1
    C-T         ψ1 = 2.23, ψ2 = 1.25                   ψ1 = 2.2, ψ2 = 1.06
    Tajvidi’s   ψ1 = 2.13, ψ2 = 0.1                    ψ1 = 2.26, ψ2 = 0.59
    Bilog       ψ1 = 0.54, ψ2 = 0.42                   ψ1 = 0.56, ψ2 = 0.45
    Negbilog    ψ1 = 0.59, ψ2 = 0.93                   ψ1 = 0.62, ψ2 = 1.05

    Table 2: Some fitted models. The dependence parameters are listed for 4 pairs of locations. The results are difficult to compare by parameter values alone; the differences can be shown by quantiles or prediction regions (see Fig 4).

    GoF with χ2-test: Bremerhaven and Schleswig
    γ-range   0.99<   0.95-0.99   0.9-0.95   0.75-0.9   0.5-0.75

  • 3.3 Results for trivariate observations

    3D logistic and negative logistic models were estimated for three triplets of stations. The shape parameters and the dependence parameter are summarized in Table 4. By investigating the estimated marginal quantile curves (Table 5) we found that the logistic models significantly overestimate the high quantiles. The rates for the negative logistic model are closer to their nominal levels. Further improvements may be obtained by allowing non-exchangeability in the dependence structure. To this end our proposed modification in Section 2.5 is the most promising, as fitting the Dirichlet model becomes significantly more complicated than in the bivariate case: the main difficulty is that the spectral density must be numerically integrated over the interior of the unit simplex. The 3-dimensional modeling based on the proposed asymmetric extension of the above models is in progress.

    Hannover, Bremerhaven and Fehmarn
    Models    ξ1        ξ2        ξ3        Dependence
    Log       -0.146    -0.027    0.038     2.157
    NegLog    0.093     0.110     0.137     0.770

    Hannover, Fehmarn and Schleswig
    Models    ξ1        ξ2        ξ3        Dependence
    Log       -0.099    -0.026    0.015     2.031
    NegLog    0.088     0.098     0.101     0.817

    Bremerhaven, Fehmarn and Schleswig
    Models    ξ1        ξ2        ξ3        Dependence
    Log       -0.083    0.031     -0.006    2.189
    NegLog    0.075     0.157     0.111     0.927

    Table 4: Fitted 3D models with logistic or negative logistic dependence structures.

    4 Discussion

    We have presented several models for multivariate exceedances and illustrated their practical application using a real dataset on wind speed. An R library for fitting the proposed models, mgpd, has been developed and will, in due course, be submitted to CRAN. For a visual (and empirical) model evaluation we suggested comparing the coverage rates of different prediction regions. While this is a valuable tool, the development of more formal tests, like the ones known for copula modeling (Rakonczai and Zempléni, 2008), would definitely be a further improvement. Alternative methods for comparing models, assessing the statistical significance of individual terms in the model, and assessing predictive performance should also be explored.

    The suggested asymmetric models are promising, as they make it possible to fit non-exchangeable MGPD families without a heavy computational burden. Our results have shown that they are real competitors of the previously known models of this kind in terms of coverage probabilities.

    Hannover, Bremerhaven and Fehmarn
              3D Logistic                          3D Negative Logistic
    Quant.    H(x,y,∞)   H(x,∞,z)   H(∞,y,z)     H(x,y,∞)   H(x,∞,z)   H(∞,y,z)
    0.99      0.000      0.000      0.001        0.009      0.009      0.008
    0.95      0.004      0.003      0.005        0.030      0.026      0.034
    0.90      0.012      0.009      0.018        0.058      0.049      0.068

    Hannover, Fehmarn and Schleswig
              3D Logistic                          3D Negative Logistic
    Quant.    H(x,y,∞)   H(x,∞,z)   H(∞,y,z)     H(x,y,∞)   H(x,∞,z)   H(∞,y,z)
    0.99      0.000      0.001      0.001        0.014      0.015      0.011
    0.95      0.004      0.005      0.005        0.030      0.032      0.027
    0.90      0.011      0.014      0.019        0.054      0.064      0.046

    Bremerhaven, Fehmarn and Schleswig
              3D Logistic                          3D Negative Logistic
    Quant.    H(x,y,∞)   H(x,∞,z)   H(∞,y,z)     H(x,y,∞)   H(x,∞,z)   H(∞,y,z)
    0.99      0.002      0.002      0.002        0.003      0.008      0.004
    0.95      0.007      0.009      0.008        0.033      0.034      0.023
    0.90      0.024      0.026      0.022        0.071      0.077      0.043

    Table 5: GoF for 3D models by marginal quantile curves. There is a significant overestimation of the high quantiles by the logistic models in general. The rates for the negative logistic are closer to their nominal level. Further improvement could be obtained by more complex non-exchangeable models.

    There are a number of ways in which this work could be developed further. The most obvious extension would be to assume time-dependent model parameters or parameters depending on multiple explanatory variables. There are no additional conceptual difficulties involved in extending the model in this way, but the computational issues involved in maximising the likelihood function would probably be even more pronounced than for the current model. This may motivate the consideration of alternative methods of inference, such as Markov chain Monte Carlo (e.g., in the context of extreme value modeling, Fawcett & Walshaw, 2008).

    The possible presence of residual dependence (caused by clustering, which implies that a single extreme event may be associated with multiple extreme values) may also be investigated. Residual dependence occurs in the extremes of many environmental (and financial) time series, and methods that ignore this will tend to underestimate standard errors and other measures of uncertainty. The block bootstrap, or extensions thereof, may provide a strategy for constructing confidence intervals in a way that accounts for residual dependence.
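
As a rough illustration of the block bootstrap idea, the sketch below builds a percentile confidence interval for an empirical 95% quantile by resampling overlapping blocks of consecutive observations. The simulated autoregressive series, the block length and the function names are all assumptions for the example; it is not applied to the wind speed data.

```python
import random

def moving_block_bootstrap_ci(series, stat, block_len, n_boot=500, level=0.95, seed=0):
    """Percentile confidence interval for stat from a moving block bootstrap."""
    rng = random.Random(seed)
    n = len(series)
    n_blocks = -(-n // block_len)  # ceiling division
    last_start = n - block_len
    replicates = []
    for _ in range(n_boot):
        resampled = []
        for _ in range(n_blocks):
            start = rng.randint(0, last_start)  # blocks keep short-range dependence intact
            resampled.extend(series[start:start + block_len])
        replicates.append(stat(resampled[:n]))
    replicates.sort()
    lower = replicates[int((1.0 - level) / 2.0 * n_boot)]
    upper = replicates[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper

def q95(xs):
    """Empirical 95% quantile."""
    return sorted(xs)[int(0.95 * len(xs))]

# Simulated dependent series (AR(1) with coefficient 0.7), purely illustrative.
rng = random.Random(42)
series = [0.0]
for _ in range(999):
    series.append(0.7 * series[-1] + rng.gauss(0.0, 1.0))

lower, upper = moving_block_bootstrap_ci(series, q95, block_len=20)
print(lower, q95(series), upper)
```

The block length governs how much of the serial dependence is preserved; in practice it would have to be chosen with the typical cluster length of the extremes in mind.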

    Acknowledgements

    The European Union and the European Social Fund have provided financial support to the project under the grant agreement no. TÁMOP 4.2.1./B-09/KMR-2010-0003.

  • References

    [1] Beirlant, J., Goegebeur, Y., Segers, J. and Teugels, J. 2004. Statistics of Extremes: Theory and Applications. Wiley: West Sussex, England.

    [2] Brodin, E. and Rootzén, H. 2009. Univariate and Bivariate GPD Methods for Predicting Extreme Wind Storm Losses. Insurance: Mathematics and Economics 44: 345-356.

    [3] Coles, S. 2001. An Introduction to Statistical Modelling of Extreme Values. Springer-Verlag.

    [4] Coles, S. and Tawn, J. 1991. Modelling extreme multivariate events. J. R. Statist. Soc. Ser. B 53: 377-392.

    [5] Coles, S. and Tawn, J. 1994. Statistical Methods for Multivariate Extremes: an Application to Structural Design. Appl. Statistics 43: 1-48.

    [6] Dryden, I.L. and Zempléni, A. 2006. Extreme shape analysis. J. Roy. Stat. Soc. Ser. C 55(1): 103-121.

    [7] Fawcett, L. and Walshaw, D. 2008. Bayesian Inference for Clustered Extremes. Extremes 11: 217-233.

    [8] Fawcett, L. and Walshaw, D. 2007. Improved Estimation for Temporally Clustered Extremes. Environmetrics 18(2): 173-188.

    [9] Hall, P. and Tajvidi, N. 2004. Prediction regions for bivariate extreme events. Aust. N. Z. J. Stat. 46(1): 99-112.

    [10] Hall, P. and Tajvidi, N. 2000. Distribution and dependence function estimation for bivariate extreme value distributions. Bernoulli 6(5): 835-844.

    [11] Kotz, S. and Nadarajah, S. 2000. Extreme Value Distributions. Imperial College Press: London.

    [12] Leadbetter, M.R., Lindgren, G. and Rootzén, H. 1983. Extremes and related properties of stationary sequences and processes. Springer: New York.

    [13] Nadarajah, S., Anderson, C. W. and Tawn, J. A. 1998. Ordered multivariate extremes. J. Roy. Statist. Soc. Ser. B 60: 473-496.

    [14] Pickands, J. 1981. Multivariate extreme value distributions. In Bulletin of the International Statistical Institute, Proceedings of the 43rd Session: 859-878.

    [15] Rakonczai, P. and Tajvidi, N. 2010. On prediction of bivariate extremes. International Journal of Intelligent Technologies and Applied Statistics 3(2): 115-139.

    [16] Rakonczai, P., Butler, A. and Zempléni, A. 2010. Modeling temporal trend within bivariate generalized Pareto models of logistic type (in progress, http://www.math.elte.hu/~paulo/pdf/nonstat_bgpd_rakonczai.pdf).

    [17] Rakonczai, P. and Zempléni, A. 2008. Copulas and goodness of fit tests. Recent Advances in Stochastic Modeling and Data Analysis, p. 198-206.

    [18] Ribatet, M., Sauquet, E., Grésillon, J. M. and Ouarda, T. B. J. M. 2007. A Regional Bayesian POT Model for Flood Frequency Analysis. Stochastic Environmental Research and Risk Assessment 21(4): 327-339.

    [19] Rootzén, H. and Tajvidi, N. 2006. The multivariate generalized Pareto distribution. Bernoulli 12: 917-930.

    [20] Tajvidi, N. 1996. Multivariate generalized Pareto distributions. Article in PhD thesis, Department of Mathematics, Chalmers, Göteborg.

    [21] Tawn, J. A. 1988. Bivariate extreme value theory: Models and estimation. Biometrika 75(3): 397-415.

    [Figure 5, four contour-plot panels (Logistic, Negative Logistic, Bilogistic, Dirichlet), axes $x$ and $y$ from −4 to 4.]

    Figure 5: Examples. BGPD distribution functions with the same $\mu_1 = \mu_2 = 0$, $\sigma_1 = \sigma_2 = 1$ and $\xi_1 = \xi_2 = 0.1$ parameters, assuming different dependence structures. In the logistic and negative logistic models $\alpha = 0.5$, in the bilogistic model $\psi_1 = 1.25$, $\psi_2 = 2$, and in the Dirichlet model $\psi_1 = 0.6$, $\psi_2 = 0.2$.

    [Figure 6, four scatter-plot panels, axes from 0 to 30 m/s: Hannover vs Schleswig, Bremerhaven vs Fehmarn, Bremerhaven vs Schleswig, Fehmarn vs Schleswig.]

    Figure 6: Example. Observations of daily wind speeds over the 95% threshold level for the period 1957-2007. Observations that exceed the threshold in both components are shown in blue.

    [Figure 7, two panels of density curves, "Logistic bgpd by 'evd'" (left) and "Logistic bgpd by 'mgpd'" (right); horizontal axis Fehmarn (m/s), vertical axis Hannover (m/s); regions without inference are marked "No Inference!".]

    Figure 7: "classical BGPD" and BGPD models fitted to exceedances above the 95% marginal quantiles. The shape and dependence parameters are $(\hat\xi_1, \hat\xi_2, \hat\alpha) = (-0.001, -0.06, 1.49)$ and $(\hat\xi_1, \hat\xi_2, \hat\alpha) = (0.18, 0.16, 1.59)$ for the left and right panels respectively.

    [Figure 8, three 2D scatter-plot panels (Bremerhaven vs Fehmarn, Bremerhaven vs Schleswig, Fehmarn vs Schleswig, in m/s) and one 3D scatter plot with axes Bremerhaven, Fehmarn and Schleswig.]

    Figure 8: Exceedances in 3D. As only one component must be over its threshold, the 2D margins can contain observations that are below both of their thresholds. In these cases the third component exceeds its threshold.

    [Figure 9, four panels, two titled "Margins of 3D Logistic" and two titled "Margins of 3D NegLog.": quantile curves at levels between 0.2 and 0.99; horizontal axis Bremerhaven (m/s), vertical axis Fehmarn (m/s).]

    Figure 9: Examples. 2D margins of the 3D logistic and negative logistic models for Bremerhaven, Fehmarn and Schleswig. The two models are rather different: the logistic model leads to noticeably higher quantile curves than the negative logistic. The comparison with the observations shows that there is a strong overestimation in the logistic and a slight overestimation in the negative logistic model.

  • A Appendix: "classical BGPD" models fitted by ’fbvpot’

    Section 2.3 describes two different approaches to modelling multivariate exceedances, and the difference between the "classical BGPD" in (5) and the BGPD in (6) is illustrated in Fig 7. Further details about the "classical BGPD" are presented here, making the results comparable with those obtained for the other definition. The parameter estimates, produced by the fbvpot routine of the evd package, are summarised in Table 6. As with the results in Table 2, there is a slight asymmetry present in the data. In this "classical" case there is parametric inference only on the upper right quadrant of $\mathbb{R}^2$. The density plots are presented in Fig 10.

    Hannover and Schleswig
    Models      σ1       ξ1        σ2       ξ2       Dependence
    Log         2.359    -0.059    1.529    0.025    α′ = 0.635
    NegLog      2.366    -0.056    1.517    0.028    α′ = 0.858
    Bilog       2.391    -0.068    1.505    0.035    ψ1 = 0.694, ψ2 = 0.559
    Negbilog    2.389    -0.065    1.499    0.038    ψ1 = 0.872, ψ2 = 1.517
    C-T         2.388    -0.065    1.500    0.037    ψ1 = 1.123, ψ2 = 0.562

    Bremerhaven and Fehmarn
    Models      σ1       ξ1        σ2       ξ2       Dependence
    Log         2.381    -0.072    1.654    0.001    α′ = 0.535
    NegLog      2.361    -0.062    1.651    0.004    α′ = 1.160
    Bilog       2.352    -0.063    1.661    0.003    ψ1 = 0.489, ψ2 = 0.575
    Negbilog    2.333    -0.052    1.656    0.004    ψ1 = 1.030, ψ2 = 0.712
    C-T         2.323    -0.047    1.653    0.005    ψ1 = 1.018, ψ2 = 1.756

    Bremerhaven and Schleswig
    Models      σ1       ξ1        σ2       ξ2       Dependence
    Log         2.291    -0.060    1.518    0.023    α′ = 0.514
    NegLog      2.293    -0.054    1.507    0.028    α′ = 1.239
    Bilog       2.346    -0.073    1.484    0.034    ψ1 = 0.581, ψ2 = 0.430
    Negbilog    2.337    -0.066    1.478    0.039    ψ1 = 0.581, ψ2 = 1.071
    C-T         2.315    -0.062    1.470    0.040    ψ1 = 2.268, ψ2 = 1.032

    Fehmarn and Schleswig
    Models      σ1       ξ1        σ2       ξ2       Dependence
    Log         1.653    -0.002    1.534    0.026    α′ = 0.552
    NegLog      1.654    0.001     1.520    0.030    α′ = 1.101
    Bilog       1.682    -0.010    1.496    0.044    ψ1 = 0.630, ψ2 = 0.450
    Negbilog    1.675    -0.007    1.488    0.049    ψ1 = 0.617, ψ2 = 1.269
    C-T         1.671    -0.006    1.485    0.050    ψ1 = 2.087, ψ2 = 0.803

    Table 6: "classical BGPD" models fitted by fbvpot in evd. Prediction regions are not available by default, although the density plot is given (see Fig 10 below). In the logistic and negative logistic models $\alpha = 1/\alpha'$.
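
The reparameterisation $\alpha = 1/\alpha'$ can be applied to the logistic rows of Table 6 to put them on the scale of Table 2. The short sketch below only performs this arithmetic with the values listed in the two tables; the dictionary keys and variable names are illustrative.

```python
# Logistic dependence estimates alpha' from Table 6 ("classical BGPD", fitted by fbvpot).
alpha_prime = {
    "Hannover-Schleswig": 0.635,
    "Bremerhaven-Fehmarn": 0.535,
    "Bremerhaven-Schleswig": 0.514,
    "Fehmarn-Schleswig": 0.552,
}

# Logistic alpha estimates from Table 2 (BGPD), listed for comparison.
alpha_table2 = {
    "Hannover-Schleswig": 1.66,
    "Bremerhaven-Fehmarn": 1.985,
    "Bremerhaven-Schleswig": 2.06,
    "Fehmarn-Schleswig": 1.95,
}

converted = {pair: 1.0 / value for pair, value in alpha_prime.items()}
for pair, alpha in converted.items():
    print(pair, round(alpha, 3), alpha_table2[pair])
```

The two columns come from different model definitions fitted on different regions, so the converted values are not expected to coincide exactly with those of Table 2.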

    [Figure 10, four density contour panels (Logistic, Negative Logistic, Negative Bilogistic, Coles-Tawn); horizontal axis Bremerhaven, vertical axis Schleswig.]

    Figure 10: "classical BGPD" contour plots