Testing extreme value conditions an overview and recent approaches

International Conference on Mathematical and Statistical

Modeling in Honor of Enrique Castillo

(ICMSM 2006) University of Castilla-La Mancha

Ciudad Real (SPAIN) June 28-30, 2006


Isabel Fraga Alves, CEAUL & DEIO, University of Lisbon, Portugal

Cláudia Neves, UIMA & DM, University of Aveiro, Portugal


Contents

• Introduction

• Preliminaries and notation

• Testing extremes

Parametric Approaches

Annual Maxima (AM)

Peaks Over Threshold (POT)

Largest Observations (LO)

Semi-Parametric Approaches

Testing EV Conditions

• PORT approach: Three Tests

• A case study: S&P500 data


Introduction

• In the analysis of extreme large (or small) values, the model assumptions on the right (or left) tail of the distribution function (d.f.) F underlying the sample data are of great importance.

• We focus on the problem of extreme large values. By an obvious transformation, the problem of extreme small values is analogous.

• Statistical inference about rare events can clearly be deduced only from those observations which are extreme in some sense:

classical Gumbel method of block of annual maxima (AM)

peaks-over-threshold (POT) methods

peaks-over-random-threshold (PORT) methods.

• Statistical inference is clearly improved if one makes an a priori statistical choice about the most appropriate tail decay for the underlying d.f.: light tails with finite right endpoint, exponential, or polynomial.

• This is supported by Extreme Value Theory (EVT).


Theory and Extreme Values Analysis

• Extreme Values Analysis: models for extreme values, not central values; modelling the tail of the underlying distribution.

• Problem: How to make inference beyond the sample data?

• One Answer: use techniques based on EVT in such a way that it is possible to make statistical inference about rare events, using only a limited amount of data!

• Notation:

Sample: $(X_1, X_2, \ldots, X_n)$ iid r.v.'s with d.f. $F(x)$.

Order statistics: $X_{1,n} \le X_{2,n} \le \cdots \le X_{n,n} =: M_n$.

Tail of $F$: $\bar{F}(x) := P[X > x] = 1 - F(x)$.


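Basic Theory – distribution of the Maximum

• $P[M_n \le x] = P[X_1 \le x, \ldots, X_n \le x] = \prod_{i=1}^{n} P[X_i \le x] = F^n(x)$.

• Consequently, $M_n \xrightarrow{a.s.} x^F$, as $n \to \infty$, with $x^F := \sup\{x : F(x) < 1\}$.

• Suppose there exist $a_n > 0$ and $b_n \in \mathbb{R}$ such that

$P[M_n \le a_n x + b_n] \xrightarrow[n \to \infty]{} G(x)$, for every $x \in \mathbb{R}$.

• Then

$$G(z) = G_\gamma(z) := \begin{cases} \exp\left(-(1 + \gamma z)^{-1/\gamma}\right), & \text{for } 1 + \gamma z > 0, \text{ if } \gamma \neq 0 \\ \exp(-\exp(-z)), & \text{for } z \in \mathbb{R}, \text{ if } \gamma = 0 \end{cases}$$

$F \in \mathcal{D}(G_\gamma)$, $\gamma \in \mathbb{R}$ [GEV - Generalized Extreme Value]

von Mises-Jenkinson Representation

• Gnedenko (1943)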
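The convergence of the normalized maximum can be illustrated numerically. The following is a minimal sketch, not part of the original slides: it assumes the von Mises-Jenkinson form $G_\gamma(z) = \exp(-(1+\gamma z)^{-1/\gamma})$ and uses the standard exponential d.f., for which the normalizing constants $a_n = 1$, $b_n = \log n$ lead to the Gumbel limit ($\gamma = 0$). The function names are illustrative only.

```python
import math


def gev_cdf(z: float, gamma: float) -> float:
    """G_gamma(z) in the von Mises-Jenkinson form (standard location and scale)."""
    if gamma == 0.0:
        return math.exp(-math.exp(-z))
    t = 1.0 + gamma * z
    if t <= 0.0:
        # Outside the support: below the left endpoint (gamma > 0)
        # or above the right endpoint (gamma < 0).
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))


def max_cdf_exponential(x: float, n: int) -> float:
    """Exact P[M_n <= a_n x + b_n] = F^n(x + log n) for F = Exp(1), a_n = 1, b_n = log n."""
    y = x + math.log(n)
    return (1.0 - math.exp(-y)) ** n if y > 0.0 else 0.0


# The gap between the exact d.f. of the normalized maximum and G_0 shrinks with n.
for n in (10, 100, 10_000):
    print(n, abs(max_cdf_exponential(1.0, n) - gev_cdf(1.0, 0.0)))
```

The printed differences decrease as n grows, which is the statement $F^n(a_n x + b_n) \to G_0(x)$ for this particular F.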


Extreme Value Distributions (maxima)

• The GEV(γ) incorporates the 3 types [Fisher-Tippett]:

• Fréchet: $\Phi_\alpha(z) = \exp(-z^{-\alpha})$, $z > 0$, $\alpha > 0$; $\gamma = 1/\alpha > 0$; limit for heavy tailed distributions.

• Weibull: $\Psi_\alpha(z) = \exp(-(-z)^{\alpha})$, $z < 0$, $\alpha > 0$; $\gamma = -1/\alpha < 0$; limit for short tailed distributions with $x^F < \infty$.

• Gumbel: $\Lambda(z) = \exp(-\exp(-z))$, $z \in \mathbb{R}$; $\gamma = 0$; limit for exponential tailed distributions.
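The link between the three types and the single GEV family can be checked by a change of location and scale. This is a short worked derivation, assuming the von Mises-Jenkinson form $G_\gamma(z) = \exp(-(1+\gamma z)^{-1/\gamma})$ and the conventional names $\Phi_\alpha$ (Fréchet) and $\Psi_\alpha$ (Weibull) for the d.f.s of the first two types:

```latex
\begin{aligned}
\gamma = \tfrac{1}{\alpha} > 0:\quad
G_\gamma\big(\alpha(z-1)\big)
  &= \exp\!\Big(-\big(1 + \tfrac{1}{\alpha}\,\alpha(z-1)\big)^{-\alpha}\Big)
   = \exp\!\big(-z^{-\alpha}\big) = \Phi_\alpha(z), \quad z > 0, \\
\gamma = -\tfrac{1}{\alpha} < 0:\quad
G_\gamma\big(\alpha(1+z)\big)
  &= \exp\!\Big(-\big(1 - \tfrac{1}{\alpha}\,\alpha(1+z)\big)^{\alpha}\Big)
   = \exp\!\big(-(-z)^{\alpha}\big) = \Psi_\alpha(z), \quad z < 0, \\
\gamma = 0:\quad
G_0(z) &= \exp\!\big(-e^{-z}\big), \quad z \in \mathbb{R}.
\end{aligned}
```

So each type is a GEV d.f. up to location and scale, with the sign of $\gamma$ deciding the type.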


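Parametric approaches: Fitting GEV(γ) to Annual Maxima (AM) – GUMBEL METHOD

[Figure: sample divided into Block 1, Block 2, Block 3, Block 4, Block 5]

• Inclusion of location and scale parameters in the GEV(γ) d.f.:

$G_\gamma(x; \lambda, \delta) := G_\gamma\left(\frac{x - \lambda}{\delta}\right)$, $\lambda \in \mathbb{R}$, $\delta > 0$, $\gamma \in \mathbb{R}$

where $\gamma$ is the tail index (shape), $\lambda$ the location and $\delta$ the scale.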
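The data reduction step of the annual maxima method can be sketched as follows. This is an illustration only: the block size of 250 observations per "year" and the simulated exponential data are assumptions, and `block_maxima` is a hypothetical helper name. Fitting the GEV(γ) model to the resulting maxima is not shown.

```python
import random


def block_maxima(data: list[float], block_size: int) -> list[float]:
    """Split data into consecutive blocks and return the maximum of each complete block."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    n_blocks = len(data) // block_size  # an incomplete trailing block is dropped
    return [max(data[b * block_size:(b + 1) * block_size]) for b in range(n_blocks)]


random.seed(0)
# Five hypothetical "years" of 250 daily observations each (simulated, not real data).
daily = [random.expovariate(1.0) for _ in range(5 * 250)]
annual_max = block_maxima(daily, 250)
print(annual_max)  # one maximum per block; these values would be modelled by GEV
```

Only one value per block is kept, which is why the method discards much of the information contained in the other large observations.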


Testing problem in GEV(γ)

The shape parameter γ determines the weight of the tail.

Choice between Gumbel, Weibull or Fréchet:

$G_\gamma: \gamma = 0$ vs. $G_\gamma: \gamma \neq 0$, or vs. $G_\gamma: \gamma < 0$, or vs. $G_\gamma: \gamma > 0$

• Van Montfort (1970)
• Bardsley (1977)
• Otten and Van Montfort (1978)
• Tiago de Oliveira (1981)
• Gomes (1982)
• Tiago de Oliveira (1984)
• Tiago de Oliveira and Gomes (1984)
• Hosking (1984)
• Marohn (1994)
• Wang, Cooke, and Li (1996)
• Marohn (2000)


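Generalized Pareto distribution GP(γ)

$$H_\gamma(x; \lambda, \delta) := \begin{cases} 1 - \left(1 + \gamma \frac{x - \lambda}{\delta}\right)^{-1/\gamma}, & \text{if } \gamma \neq 0 \\ 1 - \exp\left(-(x - \lambda)/\delta\right), & \text{if } \gamma = 0 \end{cases} \qquad \lambda \in \mathbb{R},\ \delta > 0,$$

for $x - \lambda \ge 0$ and $1 + \gamma (x - \lambda)/\delta > 0$.

$H_\gamma(x; \lambda, \delta) = 1 + \log G_\gamma(x; \lambda, \delta)$

• GP(γ) d.f. includes the models:

• Pareto: $W_{1,\alpha}(x) = 1 - x^{-\alpha}$, $\alpha > 0$, $x \ge 1$ (heavy tail)

• Exponential: $W_0(x) = 1 - \exp(-x)$, $x \ge 0$ (exponential tail)

• Beta: $W_{2,\alpha}(x) = 1 - (-x)^{-\alpha}$, $\alpha < 0$, $-1 \le x \le 0$ (bounded support)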
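The three special cases can be recovered numerically from a single GP function. The sketch below assumes the parametrization $H_\gamma(x;\lambda,\delta) = 1 - (1+\gamma(x-\lambda)/\delta)^{-1/\gamma}$; the argument names `loc` and `scale` are illustrative choices, not notation from the slides. With $\gamma = 1/\alpha$, location 1 and scale $1/\alpha$ it reduces to the Pareto d.f. $1 - x^{-\alpha}$, and with $\gamma = -1/\alpha$, location $-1$ and scale $1/\alpha$ it reduces to the Beta d.f. $1 - (-x)^{\alpha}$ on $[-1, 0]$.

```python
import math


def gp_cdf(x: float, gamma: float, loc: float = 0.0, scale: float = 1.0) -> float:
    """GP d.f. with shape gamma, location loc and scale > 0 (assumed parametrization)."""
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    y = (x - loc) / scale
    if y <= 0.0:
        return 0.0                      # below the left endpoint
    if gamma == 0.0:
        return 1.0 - math.exp(-y)       # exponential case
    t = 1.0 + gamma * y
    if t <= 0.0:
        return 1.0                      # gamma < 0: beyond the finite right endpoint
    return 1.0 - t ** (-1.0 / gamma)


alpha = 2.0
pareto = gp_cdf(3.0, gamma=1 / alpha, loc=1.0, scale=1 / alpha)    # 1 - 3**(-2)
exponential = gp_cdf(1.2, gamma=0.0)                               # 1 - exp(-1.2)
beta = gp_cdf(-0.3, gamma=-1 / alpha, loc=-1.0, scale=1 / alpha)   # 1 - 0.3**2
print(pareto, exponential, beta)
```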


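Excesses over high thresholds – POT (Peaks Over Thresholds)

Excesses over $u$: $X_i - u \mid X_i > u$

$P[X - u \le y \mid X > u] = \frac{F(u + y) - F(u)}{1 - F(u)}$, $y \ge 0$

$P[X - u \le y \mid X > u] \approx H_\gamma(y; \sigma(u))$

• Balkema-de Haan '74 + Pickands '75:

$F \in \mathcal{D}(G_\gamma) \iff \lim_{u \to x^F} \sup_{0 < x < x^F - u} \left| P[X - u \le x \mid X > u] - H_\gamma(x; \sigma(u)) \right| = 0$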
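Extracting the excesses over a fixed threshold is a one-line operation. The sketch below is an illustration under stated assumptions: the data are simulated from the standard exponential d.f. and the threshold u = 2 is arbitrary. For this F the conditional d.f. of the excesses is again exponential, so the empirical value at y = 1 should be close to $1 - e^{-1}$. The helper names are hypothetical.

```python
import random


def excesses_over_threshold(data, u):
    """Return the excesses X_i - u of the observations with X_i > u."""
    return [x - u for x in data if x > u]


def empirical_excess_cdf(excesses, y):
    """Empirical version of P[X - u <= y | X > u]."""
    return sum(1 for e in excesses if e <= y) / len(excesses)


random.seed(1)
sample = [random.expovariate(1.0) for _ in range(20_000)]  # simulated data
exc = excesses_over_threshold(sample, u=2.0)
# Number of exceedances and empirical conditional d.f. at y = 1
# (to be compared with 1 - exp(-1), about 0.632, for this F).
print(len(exc), empirical_excess_cdf(exc, 1.0))
```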


Testing problem in GP(γ)

Choice between Exponential, Beta or Pareto:

$H_\gamma: \gamma = 0$ vs. $H_\gamma: \gamma \neq 0$, or vs. $H_\gamma: \gamma < 0$, or vs. $H_\gamma: \gamma > 0$

• Van Montfort and Witter (1985)
• Gomes and Van Montfort (1986)
• Brilhante (2004)
• Marohn (2000) AM & POT

Fitting GP d.f. to data
• Castillo and Hadi (1997)

Goodness-of-fit tests for the GP d.f. model
• Choulakian and Stephens (2001)

Goodness-of-fit problem for heavy tailed Pareto-type d.f.s
• Beirlant, de Wet and Goegebeur (2006)


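LO (Largest Observations)

[Figure: the k largest observations $X_{(1)}, X_{(2)}, X_{(3)}, X_{(4)}, \ldots, X_{(k)}$ of the sample]

The k largest observations of the sample, $X_{(1)} \ge X_{(2)} \ge \cdots \ge X_{(k)}$, are modeled by the joint pdf of the GEV(γ) extremal process. With $Z_{(i)} := (X_{(i)} - \lambda)/\delta$, $i = 1, \ldots, k$,

$f(z_1, \ldots, z_k) = g_\gamma(z_k) \prod_{i=1}^{k-1} \frac{g_\gamma(z_i)}{G_\gamma(z_i)}$, $z_1 > \cdots > z_k$, where $g_\gamma(z) := \partial G_\gamma(z) / \partial z$.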
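As a worked special case, take the Gumbel model $\gamma = 0$. This sketch assumes the joint density has the product form $f(z_1,\ldots,z_k) = g_\gamma(z_k)\prod_{i=1}^{k-1} g_\gamma(z_i)/G_\gamma(z_i)$ for $z_1 > \cdots > z_k$, with $g_\gamma$ the derivative of $G_\gamma$:

```latex
\begin{aligned}
G_0(z) &= \exp\!\big(-e^{-z}\big), \qquad
g_0(z) = \frac{\partial G_0(z)}{\partial z} = e^{-z}\, G_0(z), \\
f(z_1,\ldots,z_k)
  &= g_0(z_k) \prod_{i=1}^{k-1} \frac{g_0(z_i)}{G_0(z_i)}
   = e^{-z_k} \exp\!\big(-e^{-z_k}\big) \prod_{i=1}^{k-1} e^{-z_i} \\
  &= \exp\!\Big(-e^{-z_k} - \sum_{i=1}^{k} z_i\Big), \qquad z_1 > \cdots > z_k .
\end{aligned}
```

In this case the density depends on the data only through the smallest of the k values and the sum of all of them.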


Testing problem in GEV(γ): GEV(γ)-extremal process

Choice between Gumbel, Weibull or Fréchet:

$G_\gamma: \gamma = 0$ vs. $G_\gamma: \gamma \neq 0$, or vs. $G_\gamma: \gamma < 0$, or vs. $G_\gamma: \gamma > 0$

• Gomes and Alpuim (1986)
• Gomes (1989) LO & AM

Goodness-of-fit tests
• Gomes (1987)


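Semi-Parametric Approach – Upper Order Statistics

$F \in \mathcal{D}(G_\gamma)$

[Figure: the upper order statistics $X_{n,n}, X_{n-1,n}, X_{n-2,n}, X_{n-3,n}, \ldots, X_{n-k,n}$ of the sample]

Upper order statistics: $X_{n,n} \ge X_{n-1,n} \ge \cdots \ge X_{n-k,n}$, where $X_{n-k,n}$ is an upper intermediate o.s.: $k = k_n \to \infty$, $k/n \to 0$, as $n \to \infty$.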


Peaks Over Random Threshold - PORT

[Figure: excesses of the k top observations over the random threshold $X_{n-k,n}$]

Excesses over the random threshold $X_{n-k,n}$: $Z_i := X_{n-i+1,n} - X_{n-k,n}$, $i = 1, \ldots, k$.
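The PORT excesses can be computed directly from the sorted sample. This is a minimal sketch assuming the definition $Z_i := X_{n-i+1,n} - X_{n-k,n}$, $i = 1, \ldots, k$, where $X_{j,n}$ denotes the j-th ascending order statistic; `port_excesses` is a hypothetical helper name.

```python
def port_excesses(data, k):
    """Return [Z_1, ..., Z_k], the excesses of the k top observations over X_{n-k,n}."""
    n = len(data)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    xs = sorted(data)              # xs[j - 1] is the order statistic X_{j,n}
    threshold = xs[n - k - 1]      # random threshold X_{n-k,n}
    # Z_1 is the excess of the maximum, Z_k that of X_{n-k+1,n}.
    return [xs[n - i] - threshold for i in range(1, k + 1)]


print(port_excesses([3, 1, 10, 7, 5], k=2))
```

Unlike the POT setting, the threshold here is itself an order statistic of the sample, so the number of excesses k is fixed in advance and the threshold is random.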


Testing Problem: Max-Domains of Attraction

Choice between Domains of Attraction:

$F \in \mathcal{D}(G_0)$ vs. $F \in \mathcal{D}(G_\gamma)_{\gamma \neq 0}$, or vs. $F \in \mathcal{D}(G_\gamma)_{\gamma < 0}$, or vs. $F \in \mathcal{D}(G_\gamma)_{\gamma > 0}$

• Galambos (1982)
• Castillo, Galambos and Sarabia (1989)
• Hasofer and Wang (1992)
• Falk (1995)
• Fraga Alves and Gomes (1996)
• Fraga Alves (1999)
• Marohn (1998a,b)
• Segers and Teugels (2000)

PORT approach
• Neves, Picek and Fraga Alves (2006)
• Neves and Fraga Alves (2006)


Testing EV conditions

$F \in \mathcal{D}(G_\gamma)$, for any real $\gamma$

Upper order statistics $X_{n,n} \ge X_{n-1,n} \ge \cdots \ge X_{n-k,n}$, where $X_{n-k,n}$ is an upper intermediate o.s.

Adapted Goodness-of-fit tests (Kolmogorov-Smirnov & Cramér-von Mises type)

• Dietrich, de Haan and Hüsler (2002)

• Drees, de Haan and Li (2006)


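PORT approach: Three Tests for $F \in \mathcal{D}(G_0)$ vs. $F \in \mathcal{D}(G_\gamma)_{\gamma \neq 0}$

Largest Observations: $X_{n,n} \ge X_{n-1,n} \ge \cdots \ge X_{n-k,n}$, with $k = k_n \to \infty$, $k/n \to 0$, as $n \to \infty$.

Excesses over the Random Threshold $X_{n-k,n}$: $Z_i := X_{n-i+1,n} - X_{n-k,n}$, $i = 1, \ldots, k$.

Define the r-Moment of Excesses:

$M_n^{(r)} := \frac{1}{k} \sum_{i=1}^{k} \left(X_{n-i+1,n} - X_{n-k,n}\right)^r = \frac{1}{k} \sum_{i=1}^{k} Z_i^r$, $r = 1, 2$.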


NPFA test statistic: Ratio between the Maximum and the Mean of Excesses

Motivation: different behaviour of the ratio between the maximum and the mean for light and heavy tails.

$T_n(k) := \frac{Z_1}{M_n^{(1)}} = \frac{X_{n,n} - X_{n-k,n}}{M_n^{(1)}}$

The distribution of $T_n(k)$ does NOT depend on the location and scale.

(Neves, Picek & Fraga Alves '06)


Gt test statistic: Greenwood-type Statistic

Motivation: based on the Greenwood ('46) statistic.

$R_n(k) := \frac{M_n^{(2)}}{\left(M_n^{(1)}\right)^2}$

(Neves & Fraga Alves '06)


HW test statistic: Hasofer and Wang Statistic

Motivation: based on the Shapiro-Wilk ('65) goodness-of-fit statistic.

$W_n(k) := \frac{1}{k} \, \frac{\left(M_n^{(1)}\right)^2}{M_n^{(2)} - \left(M_n^{(1)}\right)^2} = \frac{1}{k} \, \frac{1}{R_n(k) - 1}$

(Hasofer & Wang '92; Neves & Fraga Alves '06)
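The three statistics can be computed together from the k top excesses. The sketch below is a plain transcription under stated assumptions: it takes $T_n(k)$ as the maximum excess over the mean excess, $R_n(k)$ as the second moment of the excesses over the squared first moment, and $W_n(k) = 1/(k(R_n(k) - 1))$; `port_statistics` is a hypothetical name. It assumes the excesses are not all equal, otherwise a division by zero occurs.

```python
def port_statistics(data, k):
    """Return (T_n(k), R_n(k), W_n(k)) computed from the k top excesses over X_{n-k,n}."""
    xs = sorted(data)
    n = len(xs)
    if not 2 <= k < n:
        raise ValueError("k must satisfy 2 <= k < n")
    threshold = xs[n - k - 1]                   # random threshold X_{n-k,n}
    z = [x - threshold for x in xs[n - k:]]     # the k excesses, in ascending order
    m1 = sum(z) / k                             # first moment of the excesses
    m2 = sum(v * v for v in z) / k              # second moment of the excesses
    t_stat = max(z) / m1                        # ratio maximum / mean (NPFA)
    r_stat = m2 / m1 ** 2                       # Greenwood-type ratio (Gt)
    w_stat = 1.0 / (k * (r_stat - 1.0))         # Hasofer-Wang ratio (HW)
    return t_stat, r_stat, w_stat


sample = [1.0, 3.0, 5.0, 7.0, 10.0]
print(port_statistics(sample, k=2))
# A change of location and scale leaves the three statistics unchanged.
print(port_statistics([2.0 * x + 3.0 for x in sample], k=2))
```

Because the statistics are ratios of excesses over an order statistic, shifting or rescaling the data does not change them, which matches the invariance noted for the NPFA statistic.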


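NPFA - Test at asymptotic level α

Under $H_0$ (+ extra second order conditions on the upper tail of F, + extra conditions on the convergence rate of k to infinity):

$T^*_{k,n} := T_n(k) - \log k \xrightarrow[n \to \infty]{d} G_0$

$g_\varepsilon := -\ln(-\ln \varepsilon)$: Gumbel quantile

• $H_0: F \in \mathcal{D}(G_0)$ vs. $H_1: F \in \mathcal{D}(G_\gamma)_{\gamma \neq 0}$. Reject H0 (light tails) in favour of H1 (bilateral) if: $T^*_{k,n} < g_{\alpha/2}$ or $T^*_{k,n} > g_{1-\alpha/2}$

• $H_0: F \in \mathcal{D}(G_0)$ vs. $H_1: F \in \mathcal{D}(G_\gamma)_{\gamma < 0}$. Reject H0 (light tails) in favour of H1 (short tails) if: $T^*_{k,n} < g_{\alpha}$

• $H_0: F \in \mathcal{D}(G_0)$ vs. $H_1: F \in \mathcal{D}(G_\gamma)_{\gamma > 0}$. Reject H0 (light tails) in favour of H1 (heavy tails) if: $T^*_{k,n} > g_{1-\alpha}$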
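The NPFA decision rule can be sketched as follows. This is an illustration under assumptions: it takes the centred statistic as $T^* = T_n(k) - \log k$, compares it with Gumbel quantiles $g_\varepsilon = -\ln(-\ln\varepsilon)$, and rejects for large values against heavy tails and for small values against short tails. The function names and the `alternative` labels are hypothetical, and the value 19.62 is an arbitrary input, not a result from the slides.

```python
import math


def gumbel_quantile(eps: float) -> float:
    """g_eps = -ln(-ln(eps)), the eps-quantile of the Gumbel d.f."""
    return -math.log(-math.log(eps))


def npfa_decision(t_stat: float, k: int, alpha: float = 0.05,
                  alternative: str = "two-sided") -> tuple[float, bool]:
    """Return (T*, reject) with T* = T_n(k) - log k, at asymptotic level alpha."""
    t_star = t_stat - math.log(k)
    if alternative == "heavy":          # H1: gamma > 0
        reject = t_star > gumbel_quantile(1.0 - alpha)
    elif alternative == "short":        # H1: gamma < 0
        reject = t_star < gumbel_quantile(alpha)
    elif alternative == "two-sided":    # H1: gamma != 0
        reject = (t_star < gumbel_quantile(alpha / 2.0)
                  or t_star > gumbel_quantile(1.0 - alpha / 2.0))
    else:
        raise ValueError("alternative must be 'heavy', 'short' or 'two-sided'")
    return t_star, reject


print(npfa_decision(19.62, k=20, alternative="heavy"))
```

Note that the Gumbel quantiles are not symmetric about zero, so the two-sided critical region uses two different cut-offs rather than a single absolute value.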


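Gt & HW - Tests at asymptotic level α

Under $H_0$ (+ extra second order conditions on the upper tail of F, + extra conditions on the convergence rate of k to infinity):

$R^*_n(k) := \sqrt{k/4} \, \left(R_n(k) - 2\right) \xrightarrow[n \to \infty]{d} N(0,1)$

$W^*_n(k) := \sqrt{k/4} \, \left(k \, W_n(k) - 1\right) \xrightarrow[n \to \infty]{d} N(0,1)$

$z_\varepsilon := \Phi^{-1}(\varepsilon)$: Normal quantile

• $H_0: F \in \mathcal{D}(G_0)$ vs. $H_1: F \in \mathcal{D}(G_\gamma)_{\gamma < 0}$. Reject H0 (light tails) in favour of H1 (short tails) if:

Gt-test: $R^*_n(k) < -z_{1-\alpha}$

HW-test: $W^*_n(k) > z_{1-\alpha}$

• $H_0: F \in \mathcal{D}(G_0)$ vs. $H_1: F \in \mathcal{D}(G_\gamma)_{\gamma > 0}$. Reject H0 (light tails) in favour of H1 (heavy tails) if:

Gt-test: $R^*_n(k) > z_{1-\alpha}$

HW-test: $W^*_n(k) < -z_{1-\alpha}$

• $H_0: F \in \mathcal{D}(G_0)$ vs. $H_1: F \in \mathcal{D}(G_\gamma)_{\gamma \neq 0}$. Reject H0 (light tails) in favour of H1 (bilateral) if:

Gt-test: $|R^*_n(k)| > z_{1-\alpha/2}$

HW-test: $|W^*_n(k)| > z_{1-\alpha/2}$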
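The standardization and the decision rules for the Gt and HW tests can be sketched in a few lines. The code assumes the normalizations $R^* = \sqrt{k/4}\,(R_n(k) - 2)$ and $W^* = \sqrt{k/4}\,(k\,W_n(k) - 1)$, both compared with standard normal quantiles, with the Gt-test rejecting for large values against heavy tails and the HW-test rejecting for small values. The function name, the dictionary keys and the numeric inputs are illustrative assumptions.

```python
import math
from statistics import NormalDist


def gt_hw_decisions(r_stat: float, w_stat: float, k: int, alpha: float = 0.05,
                    alternative: str = "two-sided") -> dict:
    """Standardize R_n(k) and W_n(k) and apply the asymptotic level-alpha rules."""
    r_star = math.sqrt(k / 4.0) * (r_stat - 2.0)
    w_star = math.sqrt(k / 4.0) * (k * w_stat - 1.0)
    z = NormalDist().inv_cdf                      # standard normal quantile function
    if alternative == "heavy":                    # H1: gamma > 0
        reject_gt = r_star > z(1.0 - alpha)
        reject_hw = w_star < -z(1.0 - alpha)
    elif alternative == "short":                  # H1: gamma < 0
        reject_gt = r_star < -z(1.0 - alpha)
        reject_hw = w_star > z(1.0 - alpha)
    elif alternative == "two-sided":              # H1: gamma != 0
        reject_gt = abs(r_star) > z(1.0 - alpha / 2.0)
        reject_hw = abs(w_star) > z(1.0 - alpha / 2.0)
    else:
        raise ValueError("alternative must be 'heavy', 'short' or 'two-sided'")
    return {"R*": r_star, "W*": w_star, "Gt": reject_gt, "HW": reject_hw}


# Hypothetical inputs: R_n(k) = 3 with k = 100, and the matching W_n(k) = 1 / (k (R - 1)).
print(gt_hw_decisions(3.0, 1.0 / 200.0, k=100, alternative="heavy"))
```

The two statistics move in opposite directions because $W_n(k)$ is a decreasing function of $R_n(k)$, which is why the inequality signs are mirrored between the two tests.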


Exact Properties of NPFA, Gt & HW - Tests

An extensive simulation study of the proposed procedures allows us to conclude that:

The Gt-test performs to good advantage when the aim is to detect heavy-tailed distributions.

While the Gt-test barely detects small negative values of γ, the HW-test is the most powerful of the tests under study against alternatives in the Weibull domain of attraction.

Since the NPFA-test, based on the very simple $T_n$-statistic, tends to be conservative and yet retains reasonable power, it proves to be a valuable complement to the remaining procedures.


Financial data: stock index log-returns

EVT offers a powerful framework to characterize financial market crashes and booms.

The exact distribution of financial returns remains an open question.

Heavy tails are consistent with a variety of financial theories.

In financial studies, the following question is relevant:

are return distributions symmetric in the tails?

Differences in the behavior of extreme positive and negative tail movements within the same market constitute a point of investigation.

The aforementioned tests can be seen as a first test for symmetry between the positive and negative tails of the log-returns of some stock index.


S&P500: left and right tails of stock index log-returns

S&P500 data: n = 6985 observations

Series of closing prices, $\{S_i,\ i = 1, \ldots, n\}$, of the S&P500 stock index, taken from 4 January 1960 up to Friday, 16 October 1987 (the last trading day before the crash of Black Monday, 19 October 1987), from which we use the daily log-returns (assumed to be stationary and weakly dependent).

Study of the left tail of the distribution of the returns: negative log-returns, i.e.,

$L_i := -\log(S_{i+1} / S_i)$, $i = 1, \ldots, n-1$.

Study of the right tail of the distribution of the returns: positive log-returns, defined as

$X_i := \log(S_{i+1} / S_i) = -L_i$, $i = 1, \ldots, n-1$.
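The two samples used for the left and right tails can be built from a price series as follows. This is a minimal sketch with made-up prices, not the S&P500 data; it assumes the right-tail sample is the log-returns $\log(S_{i+1}/S_i)$ and the left-tail sample is their negatives. `tail_samples` is a hypothetical helper name.

```python
import math


def log_returns(prices):
    """Return the n - 1 log-returns log(S_{i+1} / S_i) of a series of n prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def tail_samples(prices):
    """Return (left, right): negated log-returns for the left tail, log-returns for the right."""
    right = log_returns(prices)
    left = [-x for x in right]
    return left, right


prices = [100.0, 110.0, 99.0]          # hypothetical closing prices
left, right = tail_samples(prices)
print(left, right)
```

Negating the returns turns large losses into large positive values, so the same upper-tail procedures apply to both samples.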


S&P500: percentage log-returns $X_i := \log(S_{i+1} / S_i)$

[Figure: S&P500 log-returns $X_i$, 5 Jan 1960 to 16 Oct 1987; vertical axis from -8 to 6, horizontal axis: date]


S&P500 (Left tail)

[Figure: sample paths of the statistics T* (NPFA-test), R* (Gt-test) and W* (HW-test), plotted against k = 5, …, 1200, applied to the S&P500 negative log-returns $L_i := -\log(S_{i+1} / S_i)$, with critical values $g_{0.95}$, $z_{0.95}$ and $z_{0.05}$]

$F_L \in \mathcal{D}(G_\gamma)$, $\gamma > 0$: Fréchet Domain, Heavy Tail!


S&P500 (Right tail)

[Figure: sample paths of the statistics T* (NPFA-test), R* (Gt-test) and W* (HW-test), plotted against k = 5, …, 1200, applied to the S&P500 positive log-returns $X_i := \log(S_{i+1} / S_i)$, with critical values $g_{0.975}$, $g_{0.025}$, $z_{0.975}$ and $z_{0.025}$]

$F_X \in \mathcal{D}(G_0)$: Gumbel Domain, light/exponential Tail!


S&P500: left and right tails of stock index log-returns

The NPFA, HW and Gt testing procedures under the PORT approach yielded the sample path plots presented.

This analysis suggests the consideration of the Fréchet and Gumbel domains of attraction, respectively, for the left and right tails of the returns distribution.

This may have the following interpretation: in this stock index, crashes are much more likely than large gains.


Main References

Neves, C., Picek, J. and Fraga Alves, M.I. (2006). Contribution of the maximum to the sum of excesses for testing max-domains of attraction. JSPI, 136, 4, 1281-1301.

Neves, C. and Fraga Alves, M.I. (2006). Semi-parametric Approach to Hasofer-Wang and Greenwood Statistics in Extremes. To appear in TEST.
