Hypothesis testing and confidence intervals by resampling

Hypothesis testing and confidence intervals by resampling by J. Kárász


Page 1: Hypothesis testing and confidence intervals by resampling

Hypothesis testing and confidence intervals by resampling

by J. Kárász

Page 2: Hypothesis testing and confidence intervals by resampling

Contents

• The bootstrap method

• The bootstrap analysis of the Kolmogorov-Smirnov test

• The bootstrap analysis of the GEV parameter investigation

• The bootstrap analysis of the moving window method

Page 3: Hypothesis testing and confidence intervals by resampling

Testing of homogeneity

• Kolmogorov-Smirnov test

• Investigating the time dependence of the GEV parameter

Page 4: Hypothesis testing and confidence intervals by resampling

Bootstrap method: conditions of use

$X_1, X_2, \ldots, X_n \overset{\text{iid}}{\sim} F$

is a random sample from the unknown probability distribution function $F$ with finite variance.

$\theta = \theta(F)$ is the unknown parameter, a function of $F$.

$\hat{\theta} = \hat{\theta}(X_1, X_2, \ldots, X_n)$

is the non-parametric estimate of the parameter, a function of the random sample.

Page 5: Hypothesis testing and confidence intervals by resampling

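Bootstrap method: bootstrap estimate of the standard error

$\sigma(F) = \left[ \mathrm{Var}_F\, \hat{\theta}(X_1, X_2, \ldots, X_n) \right]^{1/2}$

is the standard error of the estimate.

Then the bootstrap estimate is $\hat{\sigma} = \sigma(\hat{F})$, where $\hat{F}$ is the empirical distribution function.

Unfortunately, in most cases it is impossible to express it as a simple function of $\hat{F}$ or the random sample, so we have to use a numerical approximation.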

Page 6: Hypothesis testing and confidence intervals by resampling

Bootstrap method: bootstrap sample and bootstrap replicate

To approximate the bootstrap estimate, the bootstrap algorithm takes random samples from the empirical distribution function:

$X_1^*, X_2^*, \ldots, X_n^* \overset{\text{iid}}{\sim} \hat{F}$

where

$P(X_i^* = x_j) = \frac{1}{n}, \quad i, j = 1, 2, \ldots, n$

This is the bootstrap sample. It is simply a random sample drawn with replacement from $x_1, x_2, \ldots, x_n$. By evaluating the statistic of interest we get a bootstrap replicate:

$\hat{\theta}^* = \hat{\theta}(X_1^*, X_2^*, \ldots, X_n^*) = \hat{\theta}(X^*)$

Page 7: Hypothesis testing and confidence intervals by resampling

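Bootstrap method: approximation with the Monte Carlo method

1. Independently draw a large number $B$ of bootstrap samples:

$X^*(1), X^*(2), \ldots, X^*(B)$

2. Evaluate the statistic of interest, so we get $B$ bootstrap replicates:

$\hat{\theta}^*(b) = \hat{\theta}(X^*(b)), \quad b = 1, 2, \ldots, B$

3. Calculate the sample mean and sample standard deviation of the replicates:

$\hat{\theta}^*(\cdot) = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^*(b)$

$\hat{\sigma}_B = \left[ \frac{1}{B-1} \sum_{b=1}^{B} \left( \hat{\theta}^*(b) - \hat{\theta}^*(\cdot) \right)^2 \right]^{1/2}$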
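The three Monte Carlo steps can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the slides: the function names, the default B = 1000, the fixed seed, and the data values are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_replicates(sample, statistic, n_boot=1000, seed=0):
    """Return n_boot bootstrap replicates of statistic(sample)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    n = len(sample)
    replicates = []
    for _ in range(n_boot):
        # Step 1: draw n values with replacement (an iid sample from F-hat).
        resample = rng.choices(sample, k=n)
        # Step 2: evaluate the statistic of interest on the bootstrap sample.
        replicates.append(statistic(resample))
    return replicates


def bootstrap_standard_error(sample, statistic, n_boot=1000, seed=0):
    """Step 3: sample standard deviation of the replicates (divisor B - 1)."""
    return statistics.stdev(bootstrap_replicates(sample, statistic, n_boot, seed))


# Made-up illustrative values, not data from the presentation.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9]
se_of_mean = bootstrap_standard_error(data, statistics.mean)
```

For the sample mean the result can be checked against the closed form, which is roughly the sample standard deviation divided by the square root of n; for most other statistics no such formula exists, which is why the simulation is needed.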

Page 8: Hypothesis testing and confidence intervals by resampling

Bootstrap method: confidence intervals

The histogram of the bootstrap replicates is an empirical density function for $\hat{\theta}$, so the $\alpha$ and $1-\alpha$ histogram percentiles are suitable limit estimates for the $100(1-2\alpha)$ percent confidence interval.
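A percentile interval of this kind might be computed as follows. The sketch assumes a simple nearest-rank rule for picking the percentiles; the function name, the default tail probability of 0.05 per side, and the data are hypothetical choices for illustration.

```python
import random
import statistics


def percentile_interval(sample, statistic, alpha=0.05, n_boot=2000, seed=0):
    """Percentile bootstrap interval with tail probability alpha on each side."""
    rng = random.Random(seed)
    n = len(sample)
    # Sorted bootstrap replicates play the role of the histogram.
    replicates = sorted(
        statistic(rng.choices(sample, k=n)) for _ in range(n_boot)
    )
    # Nearest-rank positions of the lower and upper percentiles (an assumption;
    # other interpolation rules are equally defensible).
    lower = replicates[round(alpha * (n_boot - 1))]
    upper = replicates[round((1 - alpha) * (n_boot - 1))]
    return lower, upper


# Made-up illustrative values, not data from the presentation.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9]
low, high = percentile_interval(data, statistics.mean)
```

With `alpha=0.05` the pair `(low, high)` is a 90 percent interval, since 5 percent of the replicates are cut off at each end.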

Page 9: Hypothesis testing and confidence intervals by resampling

Bootstrap method: hypothesis testing

H0: $X_1, X_2, \ldots, X_n$ are iid random variables.

H1: $X_1, X_2, \ldots, X_n$ are not iid random variables.

The bootstrap samples are drawn independently from the same distribution ($\hat{F}$), so if H0 holds, then $\hat{\theta}$ and the replicates come from quite similar distributions, because $F$ and $\hat{F}$ are similar for large sample sizes.

If $\hat{\theta}$ is outside the empirical confidence interval, we accept H1; otherwise, we accept H0.
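The decision rule can be sketched as below. Because resampling destroys the time order, the statistic has to be sensitive to order for the test to detect anything; the half-mean difference used here is a hypothetical stand-in, not the statistic used in the presentation, and the names, defaults, and series are likewise assumptions.

```python
import random


def half_mean_difference(series):
    """Hypothetical order-sensitive statistic: mean of second half minus first half."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return sum(second) / len(second) - sum(first) / len(first)


def bootstrap_iid_test(series, statistic, alpha=0.05, n_boot=2000, seed=0):
    """Compare the observed statistic with the bootstrap percentile interval."""
    rng = random.Random(seed)
    n = len(series)
    observed = statistic(series)
    replicates = sorted(
        statistic(rng.choices(series, k=n)) for _ in range(n_boot)
    )
    lower = replicates[round(alpha * (n_boot - 1))]
    upper = replicates[round((1 - alpha) * (n_boot - 1))]
    # Accept H1 when the observed value falls outside the empirical interval.
    reject_h0 = not (lower <= observed <= upper)
    return observed, (lower, upper), reject_h0


# A made-up series with an obvious trend, for illustration only.
trending = [float(i) for i in range(40)]
observed, interval, reject_h0 = bootstrap_iid_test(trending, half_mean_difference)
```

For the trending series the observed difference lies far outside the interval built from the resampled, order-free series, so H0 is rejected.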

Page 10: Hypothesis testing and confidence intervals by resampling

Kolmogorov-Smirnov test

H0: $X_1, X_2, \ldots, X_n$ are iid random variables.

H1: $X_1, X_2, \ldots, X_n$ are not iid random variables.

Our assumption was that the annual maximum water levels are independent, so if we reject H0 we have to accept that $X_1, X_2, \ldots, X_n$ are not from the same distribution. This may even indicate a trend.
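One way to turn the Kolmogorov-Smirnov idea into a statistic for a single time series is to split the series at a cutpoint and compare the empirical distribution functions of the two parts. The sketch below assumes exactly that; the slides do not say how the cutpoint parameter is defined, so `cut` is treated here as a plain index, and both function names are hypothetical.

```python
from bisect import bisect_right


def ks_two_sample_statistic(first, second):
    """Largest absolute gap between the two empirical distribution functions."""
    a, b = sorted(first), sorted(second)
    d = 0.0
    # The gap between two step functions can only change at observed values.
    for x in a + b:
        fa = bisect_right(a, x) / len(a)  # share of `first` that is <= x
        fb = bisect_right(b, x) / len(b)  # share of `second` that is <= x
        d = max(d, abs(fa - fb))
    return d


def ks_split_statistic(series, cut):
    """K-S distance between the part before index `cut` and the part after it."""
    return ks_two_sample_statistic(series[:cut], series[cut:])


# Made-up series with a clear level shift after the third value.
shifted = [1, 2, 3, 10, 11, 12]
distance = ks_split_statistic(shifted, 3)
```

A statistic like `ks_split_statistic` depends on the order of the observations, so it can be passed to a resampling test of the iid hypothesis.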

Page 11: Hypothesis testing and confidence intervals by resampling

Results (columns: t=1, t=2, t=3)

namh
namq
polh 1
polq
szegh
szolh 1 1
tbh
tivh
tivq
zahh
zahq

1 means H0 was rejected.

t=1, 2, 3: parameter for the cutpoint in the K-S test.

Page 12: Hypothesis testing and confidence intervals by resampling

Results - example

Annual maximum water level at Szolnok, t=1

akonf (lower confidence limit) 0.08170000
stat 0.40400000
mean 0.58684509
fkonf (upper confidence limit) 0.98890000

Page 13: Hypothesis testing and confidence intervals by resampling

Results - example

Annual maximum water level at Szolnok, t=2

akonf (lower confidence limit) 0.07790000
stat 0.00960000
mean 0.58174930
fkonf (upper confidence limit) 0.98710000
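The two Szolnok examples can be checked against the decision rule directly. The sketch reads akonf and fkonf as the lower and upper limits of the empirical confidence interval, which is an interpretation of the abbreviations; the function and variable names are hypothetical.

```python
def rejects_h0(stat, lower, upper):
    """H0 is rejected when the statistic lies outside [lower, upper]."""
    return not (lower <= stat <= upper)


# Values copied from the two Szolnok slides (t=1 and t=2).
szolnok = {
    1: {"akonf": 0.0817, "stat": 0.4040, "fkonf": 0.9889},
    2: {"akonf": 0.0779, "stat": 0.0096, "fkonf": 0.9871},
}

decisions = {
    t: rejects_h0(row["stat"], row["akonf"], row["fkonf"])
    for t, row in szolnok.items()
}
```

Under this reading H0 is kept for t=1, where the statistic lies inside the interval, and rejected for t=2, where it falls below the lower limit.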

Page 14: Hypothesis testing and confidence intervals by resampling

Examining the GEV shape parameter

H0: $X_1, X_2, \ldots, X_n$ are iid random variables.

H1: $X_1, X_2, \ldots, X_n$ are not iid random variables.

Our assumption is that the annual maximum water levels are independent, so if we reject H0 we have to accept that $X_1, X_2, \ldots, X_n$ are not from the same distribution. This may even indicate a trend.

Page 15: Hypothesis testing and confidence intervals by resampling

Results

namh
namq
polh
polq
szegh
szolh
tbh
tivh
tivq 1
zahh
zahq

1 means H0 was rejected.

No dataset was found for which H0 was rejected by both the K-S test and the GEV parameter test.

Page 16: Hypothesis testing and confidence intervals by resampling

Results – data table

dataset  mean           stat           a. kvant (lower quantile)  f. kvant (upper quantile)
namh     0.001840892    0.001009485    0.0009822134               0.002242123
namq     0.003384007    0.002375165    0.002281889                0.00474283
polh     0.002357022    0.002284407    0.002236172                0.002665589
polq     -0.0008121447  0.0002126318   -0.005037174               0.0002659811
szeg     0.0005446647   0.000524554    0.0005190951               0.000650231
szol     0.002080454    0.002240719    0.001939278                0.002299348
tbh      -0.001831871   -0.002128239   -0.002172273               0.001530627
tbq      0.000863550    0.0003681706   0.000368170646947404       0.00456312439698707
tivh     0.001255741    0.0006564334   0.0006503345               0.002444052
tivq     0.002279426    0.008236962    0.00144411                 0.008093537
zah      0.0009532596   0.0009549216   0.0008029558               0.001073905
zah      0.002086978    0.002254521    0.00196992                 0.002347529

Page 17: Hypothesis testing and confidence intervals by resampling

Results - example

dataset  mean         stat         a. kvant (lower quantile)  f. kvant (upper quantile)
polh     0.002357022  0.002284407  0.002236172                0.002665589
tivq     0.002279426  0.008236962  0.00144411                 0.008093537

Page 18: Hypothesis testing and confidence intervals by resampling

Further questions: two-peaked (bimodal) bootstrap empirical distributions

Page 19: Hypothesis testing and confidence intervals by resampling

Permutation testing: similar to the bootstrap method

Same as the bootstrap algorithm, except that the permutation sample is drawn without replacement.

The hypothesis testing is similar too: we examine the estimate and the empirical confidence interval.
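The only change from the bootstrap sketch is the sampling step: each permutation sample is a reshuffling of the original values, so every value appears exactly once. The function name, the default number of permutations, and the example series below are assumptions for illustration.

```python
import random


def permutation_replicates(series, statistic, n_perm=2000, seed=0):
    """Evaluate the statistic on random reorderings of the series."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_perm):
        # Sampling all n values without replacement yields a random permutation.
        shuffled = rng.sample(series, k=len(series))
        replicates.append(statistic(shuffled))
    return replicates


# Made-up series; the statistic (last value minus first) depends on the order.
series = [3.0, 1.0, 4.0, 1.5, 5.0, 9.0]
reps = permutation_replicates(series, lambda s: s[-1] - s[0], n_perm=500)
```

The replicates can then be sorted and compared with the observed statistic through the same percentile interval as in the bootstrap case.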

Page 20: Hypothesis testing and confidence intervals by resampling

Moving window method forecast: analysis by the permutation method

Our aims were:

• Simulating the original dataset by permutation.

• Checking the quality of the forecast.

Page 21: Hypothesis testing and confidence intervals by resampling

Results - example

Page 22: Hypothesis testing and confidence intervals by resampling

Results - example

Page 23: Hypothesis testing and confidence intervals by resampling

Results - example

Page 24: Hypothesis testing and confidence intervals by resampling

Results - example