On Fitting Models for Danish Fire Data

On fitting models for Danish fire data, 1981.
Reported to: Professor Jose Garrido
April 30, 2007
Presented and reported by: I. Mendoza & S. Mozumder
Submitted in partial fulfilment of the requirements for the course MAST 726/4


This is a work done at Concordia University for an Actuarial Mathematics class.



1 Introduction

We have a portfolio of dwellings. In the year 1981 we received 793 claims, where each claim amount and the corresponding floor space (in square meters) are given. The underlying policies have no deductible.

Given the data set, our task is to obtain a feasible model for the portfolio. While searching for models we found some good fits that are not feasible from various points of view, and we decided to include only the PP-plots for those "good fit but not feasible" models, describing just one of them in detail. Finally we discuss two feasible, but not best-fitting, models.

At first we used regression analysis to study the independence of the two components of the data set, namely the space data and the claim data.

When estimating the parameters for the different models we used the utilities included in R packages. We should mention that, although we tried our best to find the best model, we could not try some models because the optimization failed while estimating their parameters.

Our efforts were consistent in obtaining the sensitivity curve, plotting the likelihood region for some parameters, and finally producing box-plots to compare the two feasible models. For each model we produced graphs of the probability density function (PDF) over the histogram, the cumulative distribution function (CDF), the limited expected value (LEV), and the mean residual life (MRL). We plotted the empirical and theoretical graphs together to judge the goodness of fit. The PP-plots we produced support, in particular, our conclusion that to obtain a feasible model we must sacrifice the best-fitting model, which has infinite mean for the original claims.

For the goodness-of-fit tests of the different models we ran our own routines in R. We carried out the Kolmogorov-Smirnov test for individual data, the chi-square goodness-of-fit test for grouped data, the Anderson-Darling test for heavy-tailed models, and the Akaike criterion for model selection. We also programmed code to compute Cramer-von Mises distances for both individual and grouped data with weights; for the weights we used the mean of each class. All the code is included in the Appendix.

Regarding the plots of the likelihood region, we fix one parameter to obtain the plot in two dimensions. At the same time, we produce 3D plots of the likelihood region to visualize the optimal region containing the parameters.

Our last strenuous effort is devoted to the calculation of the premium, to which we devote a complete section.

2 Empirical tools for data analysis

In this section we define the empirical versions of the different tools that we will use in later sections to compare with their theoretical counterparts. These tools are so common that we content ourselves with simply defining them, without elaboration.

2.1 Tools for individual data

For a random sample X1, X2, · · · , Xn the empirical CDF is defined as:

F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I_{[X_i \le x]}   (1)


for x > 0, where I is the indicator function. By definition F_n is a step function that jumps by 1/n at each observation. Since it is discrete, the corresponding empirical probability function is defined as:

f_n(x) = \frac{1}{n} \sum_{i=1}^{n} I_{[X_i = x]}   (2)

The kth sample moment is defined as:

\mu'_k = \int_0^{\infty} x^k \, dF_n(x) = \frac{1}{n} \sum_{i=1}^{n} X_i^k   (3)

The empirical limited expected value is defined as:

E(X \wedge d) = \int_0^d x \, dF_n(x) + d\,[1 - F_n(d)] = \frac{1}{n} \Big[ \sum_{X_i < d} X_i + d \sum_{X_i \ge d} 1 \Big]   (4)

Setting k = 1 in (3) and letting d → ∞ in (4) implies that:

\lim_{d \to \infty} E(X \wedge d) = E(X)

And then the empirical MRL is defined as:

e(d) = \frac{E(X) - E(X \wedge d)}{1 - F_n(d)}   (5)

= \frac{ \frac{1}{n} \sum_{i=1}^{n} X_i - \frac{1}{n} \big[ \sum_{X_i < d} X_i + d \sum_{X_i \ge d} 1 \big] }{ 1 - \frac{1}{n} \sum_{i=1}^{n} I_{[X_i \le d]} }

where all the functions on the right hand side are defined earlier.
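The empirical tools above translate directly into code. The report's own routines were written in R and live in its Appendix (not reproduced here); the following Python sketch of equations (1), (4) and (5), with a made-up four-claim sample, is illustrative only:

```python
def ecdf(sample, x):
    """Empirical CDF F_n(x) = (1/n) * #{X_i <= x}, equation (1)."""
    return sum(xi <= x for xi in sample) / len(sample)

def empirical_lev(sample, d):
    """Empirical limited expected value E(X ^ d), equation (4)."""
    return (sum(xi for xi in sample if xi < d)
            + d * sum(1 for xi in sample if xi >= d)) / len(sample)

def empirical_mrl(sample, d):
    """Empirical mean residual life e(d), equation (5)."""
    surv = 1.0 - ecdf(sample, d)        # 1 - F_n(d)
    mean = sum(sample) / len(sample)    # k = 1 in equation (3)
    return (mean - empirical_lev(sample, d)) / surv

claims = [1.0, 2.0, 3.0, 10.0]          # hypothetical claim amounts
print(ecdf(claims, 2.5))                # 0.5
print(empirical_lev(claims, 3))         # 2.25
print(empirical_mrl(claims, 3))         # 7.0
```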

2.2 Tools for grouped data

The tools we defined in the previous section need to be modified for grouped data. Let

n_j = \sum_{i=1}^{n} I_{[c_{j-1} < X_i \le c_j]}   (6)

be the frequency in the interval (c_{j-1}, c_j] for j = 1, ..., r. At a boundary point c_j the empirical CDF is defined as (possibly with c_r = ∞):

F_n(c_j) = \frac{1}{n} \sum_{i=1}^{j} n_i, \qquad F_n(c_0) = 0   (7)

where n = \sum_{j=1}^{r} n_j. A smooth version of F_n is obtained (so that the derivative exists) by simply connecting the jump points by straight lines:

\tilde{F}_n(x) =
\begin{cases}
0 & \text{if } x \le c_0 \\
\dfrac{(c_j - x) F_n(c_{j-1}) + (x - c_{j-1}) F_n(c_j)}{c_j - c_{j-1}} & \text{if } c_{j-1} < x \le c_j \\
1 & \text{if } x > c_r
\end{cases}   (8)


Clearly \tilde{F}_n(x) is not defined for x > c_{r-1} if c_r = ∞ and n_r ≠ 0. The derivative of the above function, where it exists, is known as the histogram and is defined as:

f_n(x) =
\begin{cases}
0 & \text{if } x \le c_0 \\
\dfrac{F_n(c_j) - F_n(c_{j-1})}{c_j - c_{j-1}} = \dfrac{n_j}{n (c_j - c_{j-1})} & \text{if } c_{j-1} < x \le c_j \\
0 & \text{if } x > c_r
\end{cases}   (9)

Hence

f_n(x) = \sum_{i=1}^{n} \frac{I_{[c_{j-1} < X_i \le c_j]}}{n (c_j - c_{j-1})} = \frac{n_j}{n (c_j - c_{j-1})} \quad \text{if } c_{j-1} < x \le c_j.

The kth sample moment for grouped data can be obtained from the empirical CDF as:

\mu'_k = \int_{c_0}^{c_r} x^k \, dF_n(x) = \sum_{j=1}^{r} c_j^k \, \frac{n_j}{n}   (10)

and using the ogive version we get:

\mu'_k = \int_{c_0}^{c_r} x^k \, d\tilde{F}_n(x) = \sum_{j=1}^{r} \int_{c_{j-1}}^{c_j} x^k f_n(x) \, dx   (11)

= \sum_{j=1}^{r} \frac{n_j (c_j^{k+1} - c_{j-1}^{k+1})}{n (k+1)(c_j - c_{j-1})}

The limited expected value, for c_{j-1} < d \le c_j, can be expressed as:

E(X \wedge d) = \int_0^d x \, dF_n(x) + d\,[1 - F_n(d)] = \sum_{i=1}^{j-1} c_i \frac{n_i}{n} + d \sum_{i=j}^{r} \frac{n_i}{n}   (12)

And in terms of ogive, for cj−1 < d ≤ cj , it becomes:

E(X \wedge d) = \int_0^d x \, d\tilde{F}_n(x) + d\,[1 - \tilde{F}_n(d)]   (13)

= \sum_{i=1}^{j-1} \int_{c_{i-1}}^{c_i} x f_n(x) \, dx + \int_{c_{j-1}}^{d} x f_n(x) \, dx + \int_{d}^{c_j} d \, f_n(x) \, dx + \sum_{i=j+1}^{r} \int_{c_{i-1}}^{c_i} d \, f_n(x) \, dx

= \sum_{i=1}^{j-1} \frac{n_i (c_i + c_{i-1})}{2n} + \frac{n_j (2 d c_j - c_{j-1}^2 - d^2)}{2n (c_j - c_{j-1})} + \sum_{i=j+1}^{r} \frac{n_i d}{n}.


And then the empirical MRL is defined as:

e(d) = \frac{E(X) - E(X \wedge d)}{1 - F_n(d)}   (14)

and

e(d) = \frac{E(X) - E(X \wedge d)}{1 - \tilde{F}_n(d)}   (15)

depending on whether the discontinuous or the continuous (ogive) version of the CDF is used, where all other terms on the right-hand side are defined earlier.
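The grouped-data ogive (8) and the ogive LEV (13) can be sketched in Python as well; this is again an illustrative re-implementation, with hypothetical class boundaries and counts, not the report's R code:

```python
import bisect

def ogive(c, counts, x):
    """Ogive F~_n(x) of equation (8): the empirical CDF interpolated
    linearly between finite class boundaries c[0] < ... < c[r]."""
    n = sum(counts)
    if x <= c[0]:
        return 0.0
    if x >= c[-1]:
        return 1.0
    j = bisect.bisect_left(c, x)                 # c[j-1] < x <= c[j]
    f_prev = sum(counts[:j - 1]) / n             # F_n(c_{j-1})
    f_j = sum(counts[:j]) / n                    # F_n(c_j)
    return ((c[j] - x) * f_prev + (x - c[j - 1]) * f_j) / (c[j] - c[j - 1])

def grouped_lev(c, counts, d):
    """Ogive-based limited expected value of equation (13), c[j-1] < d <= c[j]."""
    n = sum(counts)
    j = bisect.bisect_left(c, d)
    # classes entirely below d contribute their midpoint times class probability
    total = sum(counts[i - 1] * (c[i] + c[i - 1]) / (2 * n) for i in range(1, j))
    # the class containing d, split at d
    total += (counts[j - 1] * (2 * d * c[j] - c[j - 1] ** 2 - d ** 2)
              / (2 * n * (c[j] - c[j - 1])))
    # classes above d contribute d times their probability
    total += sum(counts[i - 1] * d / n for i in range(j + 1, len(c)))
    return total

print(ogive([0, 2], [4], 1.0))               # 0.5
print(grouped_lev([0, 1, 2], [2, 2], 1.5))   # 0.9375
```

The two printed values agree with the uniform-within-class interpretation of the histogram: a single class (0, 2] with 4 observations behaves like a uniform density on (0, 2).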

3 Analysis of outlier and independence

At first look we observe that the loss values (the response variable) are very dispersed. This becomes evident from Figure 2, which contains the plot of space vs. losses.

According to the graph in Figure 1, it is clear that the loss 3,408,712.19 represents an outlier in the data. In fact we can check this by computing the sensitivity curve with this value as the outlier. The sensitivity curve (SC) is given by:

SC_n(x; x_1, \ldots, x_{n-1}, T_n) = n \, [T_n(x_1, \ldots, x_{n-1}, x) - T_{n-1}(x_1, \ldots, x_{n-1})].

With T_n(x_1, \ldots, x_n) = \bar{x}_n (the sample mean) we obtain

SC_n(x; x_1, \ldots, x_{n-1}, \bar{x}_n) = x - \bar{x}_{n-1}.

Using x = 3,408,712.19 we obtain the graph of the sensitivity curve in Figure 3. The plot of the sensitivity curve with the same outlier, but the median as the estimator, appears in Figure 4. We can see that the range of sensitivity is much smaller for the median than for the mean, which means that the value x, considered as an outlier, is affecting the estimation.

Even when we do not consider the outlier, the data itself is very dispersed, as shown in Figure 2. That is, in addition to having an outlier, a close look reveals that the data is by nature very dispersed. Possible reasons are the heterogeneous dwelling standards of the people in Denmark and the varying extent of damage caused by fire: the insured dwellings vary in quality, and different sorts of damage produce different claims.
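The contrast between the two estimators can be reproduced numerically. A small Python sketch (the base sample and outlier value are hypothetical; the report's actual computation was done in R): for the mean, the SC reduces to x minus the mean of the remaining data, so it is unbounded in the outlier, while for the median it stays bounded.

```python
import statistics

def sensitivity_curve(data, x, estimator):
    """SC_n(x) = n [ T_n(x_1, ..., x_{n-1}, x) - T_{n-1}(x_1, ..., x_{n-1}) ]."""
    n = len(data) + 1
    return n * (estimator(data + [x]) - estimator(data))

base = [1.0, 2.0, 3.0, 4.0]      # hypothetical claims
outlier = 1000.0
# mean: SC = x - mean(base), grows without bound in the outlier
print(sensitivity_curve(base, outlier, statistics.mean))     # 997.5
# median: the effect of the outlier is bounded
print(sensitivity_curve(base, outlier, statistics.median))   # 2.5
```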

In addition, we ran a normal regression analysis to check that losses and spaces are uncorrelated; this means that the estimation of losses does not depend on space. For the t-test (the test usually used in regression analysis to check whether there is any relation between the underlying variables) we get a p-value of 0.4873 when including the outlier and 0.4447 without it.

Since our data is very dispersed we decided to apply a transformation to the response variable in order to reduce the scale of the data and hence analyze it better. The idea is to choose a monotone function that preserves all properties of the original data, and we use the logarithmic transformation for that job.

In the following sections we will analyze the log of the data, henceforth log(data), to find a good distribution fitting the empirical one. That is, we fit the original data with a transformed distribution.


Figure 1: Dispersion with outlier

3.1 Transformation of variables

The following theorem ensures that finding a distribution for log(data) is equivalent to finding a distribution for the original data:

Theorem 3.1 Let X be a continuous random variable with PDF f_X(x). Let Y = g(X) be a transformation of X such that Y is a continuous random variable and g is monotone. Then the PDF of Y is given by:

f_Y(y) = \frac{f_X(x)}{|g'(x)|}, \qquad x = g^{-1}(y).

Thus if Y = \log X has density f_Y, the density of the original variable X is

f_X(x) = \frac{f_Y(\log x)}{x}.

4 Initial Model selection

4.1 Descriptive statistics

As discussed earlier, we apply the logarithm to our dispersed data. We derived summary statistics for the original and transformed data (which give a first idea of the data), with and without the outlier. We present them in Figure 5.

From the table of summary statistics we see that the distribution of losses is heavily skewed to the right, most evidently for the original data with the outlier. Right skewness is observable even on the log scale. The effect of the outlier is apparent from the large differences in mean and standard deviation for the original data.


Figure 2: Dispersion without outlier

4.2 Empirical analysis for initial model selection

Initial model selection is important because erroneous initial choices may exclude the best-fitting model from the start. However, it is more of an artistic selection than a scientific one.

For the initial model selection we plot the empirical PDF, CDF, LEV and MRL for log(data) in Figure 6.

The corresponding graphs for the original losses appear in Figure 7. With these empirical graphs at hand, we then try to fit some well-known right-skewed distributions, e.g. Gumbel, Burr-3, generalized Pareto, inverse transformed gamma, log-normal, and inverse Gaussian, and by trial and error we chose the relatively better fits. We also take into account the theoretical means E(X) = \int_0^{\infty} x \, dF(x), for each F, and compare them with the empirical mean \mu' = \int_0^{\infty} x \, dF_n(x) = \frac{1}{n} \sum_{i=1}^{n} X_i. We chose the models for which the difference between the empirical and theoretical means is smallest; this is the basic concern of premium calculation.

After the initial selection we decided to describe only the log-normal and inverse Gaussian feasible models. Since all other models are not feasible on the grounds just mentioned, we describe only one of them, namely the Gumbel model. Though some of them fit really well, we refer to them as the "garbage models".

5 Description of feasible models

When we say "feasible model" we mean that we managed to get a finite theoretical mean for the original claims. Our first feasible model is the log-normal. This model performs better than the inverse Gaussian model presented in the next section, but because of the consistency between


Figure 3: Sensitivity curve with the estimator mean

empirical and theoretical means, we declared the inverse Gaussian our best model, because a relevant and consistent theoretical mean is vital for premium calculation.

5.1 Log-normal fit for log(data)

5.1.1 Theoretical functions

The probability density function of a log-normal distribution with parameters µ and σ is given by:

f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{\log x - \mu}{\sigma} \right)^2}   (16)

The corresponding CDF is given by:

F(x) = \int_0^x \frac{1}{y \sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{\log y - \mu}{\sigma} \right)^2} dy   (17)

To get the closed form we use the transformation z = \frac{\log y - \mu}{\sigma} \sim N(0, 1), so \frac{dz}{dy} = \frac{1}{\sigma y}. Thus

F(x) = \int_{-\infty}^{\frac{\log x - \mu}{\sigma}} \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2} z^2} dz = \Phi\!\left( \frac{\log x - \mu}{\sigma} \right)   (18)

The theoretical LEV for the log-normal model is given by:

E(X \wedge d) = \int_0^d x \cdot \frac{1}{x \sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{\log x - \mu}{\sigma} \right)^2} dx + d \left[ 1 - \Phi\!\left( \frac{\log d - \mu}{\sigma} \right) \right]   (19)


Figure 4: Sensitivity curve with the estimator median

To see how this reduces to closed form, use the transformation z = \frac{\log x - \mu}{\sigma} \sim N(0, 1). Then x = e^{\mu + z\sigma}, which implies dx = \sigma e^{\mu + z\sigma} dz. Then (19) becomes:

E(X \wedge d) = e^{\mu + \frac{1}{2}\sigma^2} \int_{-\infty}^{\frac{\log d - \mu}{\sigma}} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(z - \sigma)^2} dz + d \left[ 1 - \Phi\!\left( \frac{\log d - \mu}{\sigma} \right) \right]

= e^{\mu + \frac{1}{2}\sigma^2} \, P\!\left\{ N(\sigma, 1) < \frac{\log d - \mu}{\sigma} \right\} + d \left[ 1 - \Phi\!\left( \frac{\log d - \mu}{\sigma} \right) \right]

= e^{\mu + \frac{1}{2}\sigma^2} \, P\!\left\{ N(0, 1) < \frac{\log d - \mu}{\sigma} - \sigma \right\} + d \left[ 1 - \Phi\!\left( \frac{\log d - \mu}{\sigma} \right) \right]

= e^{\mu + \frac{1}{2}\sigma^2} \, \Phi\!\left( \frac{\log d - \mu - \sigma^2}{\sigma} \right) + d \left[ 1 - \Phi\!\left( \frac{\log d - \mu}{\sigma} \right) \right]   (20)

With the form of the LEV given by (20), the MRL function can be expressed in closed form as follows:

e(d) = \frac{E(X) - E(X \wedge d)}{1 - F(d)}   (21)

= \frac{ e^{\mu + \frac{1}{2}\sigma^2} - \left\{ e^{\mu + \frac{1}{2}\sigma^2} \Phi\!\left( \frac{\log d - \mu - \sigma^2}{\sigma} \right) + d \left[ 1 - \Phi\!\left( \frac{\log d - \mu}{\sigma} \right) \right] \right\} }{ 1 - \Phi\!\left( \frac{\log d - \mu}{\sigma} \right) }

Here we mention that E(X) can easily be obtained from (20) by letting d → ∞:

E(X) = e^{\mu + \frac{1}{2}\sigma^2}
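Equations (20) and (21) are easy to evaluate numerically once Φ is available; one way, using only the standard library's error function, is sketched below (an illustration, not the report's R code):

```python
from math import erf, exp, log, sqrt

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def lognormal_lev(d, mu, sigma):
    """Closed-form limited expected value, equation (20)."""
    return (exp(mu + 0.5 * sigma ** 2) * Phi((log(d) - mu - sigma ** 2) / sigma)
            + d * (1.0 - Phi((log(d) - mu) / sigma)))

def lognormal_mrl(d, mu, sigma):
    """Closed-form mean residual life, equation (21)."""
    mean = exp(mu + 0.5 * sigma ** 2)    # E(X) = e^{mu + sigma^2/2}
    return (mean - lognormal_lev(d, mu, sigma)) / (1.0 - Phi((log(d) - mu) / sigma))

# as d grows, the LEV increases toward E(X) = e^{mu + sigma^2/2}
print(lognormal_lev(10.0, 0.0, 1.0), exp(0.5))
```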


Figure 5: Summary statistics with and without outlier

5.1.2 Parameter estimation

We estimated the parameters of the log-normal distribution by maximum likelihood. The log-likelihood function (LLF) is given, for µ ∈ R and σ > 0, by:

l(\mu, \sigma^2) = -\frac{n}{2} (\ln \sigma^2 + \ln 2\pi) - \sum_{i=1}^{n} \ln x_i - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (\ln x_i - \mu)^2

And the corresponding score functions are:

S_1(\mu, \sigma^2) = \frac{\partial l(\mu, \sigma^2)}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (\ln x_i - \mu)

S_2(\mu, \sigma^2) = \frac{\partial l(\mu, \sigma^2)}{\partial \sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma^3} \sum_{i=1}^{n} (\ln x_i - \mu)^2

Solving these equations we got the maximum likelihood estimators (mle’s):

\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} \ln x_i   (22)

\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (\ln x_i - \hat{\mu})^2   (23)

We used a package included in R to estimate these parameters (see the source code in the Appendix). Since the package needs initial values we used method of moments (MoM) estimates as initial


Figure 6: Empirical model tools for log(data)

values. The method of moments estimators are given by:

\sigma^2_{mom} = \ln\!\left( \frac{\sum_{i=1}^{n} x_i^2}{n} \right) - 2 \ln\!\left( \frac{\sum_{i=1}^{n} x_i}{n} \right)

\mu_{mom} = \ln\!\left( \frac{\sum_{i=1}^{n} x_i}{n} \right) - \frac{1}{2} \sigma^2_{mom}

Finally the MLEs we obtained are:

\hat{\mu} = 2.089585 \quad \text{and} \quad \hat{\sigma}^2 = 0.187487

Regarding the effect of the outlier on our estimates, we obtained almost the same parameter values with and without the outlier, possibly because of the log scale. The estimates from the original claims might show a difference with and without the outlier, but except for one model (Burr-4, not reported here, which has infinite mean) we were not able to estimate parameters for the original claims because the optimization package failed.

Regarding the robustness of the estimates, we see from (22) and (23) that \hat{\mu} and \hat{\sigma}^2 go to ∞ when one of the x_i goes to ∞. Thus \hat{\mu} and \hat{\sigma}^2 are sensitive to outliers, though less sensitive than the estimates that would come from the original claims.

However, there is a quantitative way to check the robustness of the estimated parameters. One well-known method is the "bootstrap". We take a sample of fixed size, k < 793,


Figure 7: Empirical model tools for the original data

from the 793 claims, an arbitrarily large number of times. For each sample we estimate the parameters separately; if the mean of those estimated parameters is very close to the original estimate, and the standard deviation of the estimates is small enough that the original estimate falls in a 95% confidence interval, we can say the parameters are robust.
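The bootstrap scheme just described might be sketched as follows in Python (the eight-claim sample, `n_boot` and `seed` below are our own illustrative choices; the report's code is in R):

```python
import random
import statistics
from math import log

def lognormal_mle(sample):
    """MLEs (22)-(23): mean and (biased) variance of the log-claims."""
    logs = [log(x) for x in sample]
    mu = statistics.fmean(logs)
    sigma2 = sum((v - mu) ** 2 for v in logs) / len(logs)
    return mu, sigma2

def bootstrap_check(sample, k, n_boot=1000, seed=1):
    """Resample k claims with replacement n_boot times; return the mean and
    standard deviation of each parameter estimate across resamples."""
    rng = random.Random(seed)
    mus, s2s = [], []
    for _ in range(n_boot):
        mu, s2 = lognormal_mle(rng.choices(sample, k=k))
        mus.append(mu)
        s2s.append(s2)
    return (statistics.fmean(mus), statistics.stdev(mus),
            statistics.fmean(s2s), statistics.stdev(s2s))

claims = [1.2, 2.0, 5.5, 10.0, 2.5, 3.0, 7.1, 1.5]   # hypothetical claims
mu_hat, s2_hat = lognormal_mle(claims)
mu_b, mu_sd, s2_b, s2_sd = bootstrap_check(claims, k=len(claims), n_boot=200)
```

If `mu_hat` lies within roughly two bootstrap standard deviations of `mu_b` (and similarly for the variance), the estimate can be called robust in the sense described above.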

We should also mention that we incorporate "Huber estimates" (see the Appendix for the code) in the method of moments estimates supplied as initial values to our MLE estimation, which we believe has a robustifying effect.

There is another way to check the robustness of our parameters: the variance-covariance matrix of the log-normal MLEs. To estimate it, we used the log-likelihood function and computed the second derivatives.

\frac{\partial^2 l(\mu, \sigma^2)}{\partial \mu^2} = -\frac{n}{\sigma^2}

\frac{\partial^2 l(\mu, \sigma^2)}{\partial \sigma^2} = \frac{n}{\sigma^2} - \frac{3}{\sigma^4} \sum_{i=1}^{n} (\ln x_i - \mu)^2

\frac{\partial^2 l(\mu, \sigma^2)}{\partial \sigma \, \partial \mu} = -\frac{2}{\sigma^3} \sum_{i=1}^{n} (\ln x_i - \mu)


So the covariance matrix can be computed by inverting the negative Hessian (the observed information matrix). At the MLEs the off-diagonal term vanishes, since \sum_{i=1}^{n} (\ln x_i - \hat{\mu}) = 0, and we obtain:

\begin{pmatrix} 0.0000443293 & 0 \\ 0 & 0.00002216616 \end{pmatrix}

Hence we can compute a 95% confidence interval around the parameters, that is, approximately 1.96 standard deviations on both sides of each estimate:

\mu: 2.0859586 \pm 1.96\,(0.0000443293)^{1/2} \Rightarrow \mu \in (2.072909, 2.099008)
\sigma: 0.1874917 \pm 1.96\,(0.00002216616)^{1/2} \Rightarrow \sigma \in (0.1782638, 0.1967196)

We can see that our estimates fall in the 95% confidence intervals.

Now we turn to the log-relative likelihood function to obtain the graphs of the likelihood region. It is defined by:

r(\mu, \sigma^2) = l(\mu, \sigma^2) - l(\hat{\mu}, \hat{\sigma}^2)

Evaluating the log-likelihood function at our estimates of µ and σ² we obtain:

r(\mu, \sigma^2) = -\frac{n}{2} (\ln \sigma^2 + \ln 2\pi) - \sum_{i=1}^{n} \ln x_i - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (\ln x_i - \mu)^2 - (-1451.867)

The set of values of µ and σ² such that r(µ, σ²) > ln p is called a 100p% likelihood region (LR). To show the LR for different values of p (say 10% and 50%), we plot the LLF in two dimensions (fixing one parameter) and in three dimensions without fixing any parameter.

We generated possible values of µ for fixed σ to find the feasible region of µ. In Figure 8 we can see the plausible region from which we can choose the most likely value of µ when σ is fixed.

We also generated possible values of µ and σ to find the feasible region around the optimal parameter values. In Figure 9 we can see the plausible surface from which we can choose the most likely values of µ and σ.

5.1.3 Comparison by graphs

In this section we plot the graphs of this model together with the corresponding empirical ones, using the parameters estimated in the previous section. The graphs of PDF, CDF, LEV, and MRL appear in Figure 10.

We end this section with the PP-plot, which gives an idea of our fit (we remind the reader it is not the best fit, but a feasible one), in Figure 11.


Figure 8: Likelihood region of the feasible models

5.2 Log(log normal) fit for the original data

In this section we present the model tools for the original data. In the previous section we fit a log-normal to log(data), so here the model for the original data is called log(lognormal).

5.2.1 Theoretical functions

The PDF for this model is given by:

f^o(x) = \frac{1}{x} \cdot \frac{ e^{-\frac{1}{2} \left( \frac{\log(\log x) - \mu}{\sigma} \right)^2} }{ \log(x)\, \sigma \sqrt{2\pi} }   (24)

where x ∈ (1, ∞). The CDF of this model can be obtained as

F^o(x) = \int_1^x \frac{1}{y} \cdot \frac{ e^{-\frac{1}{2} \left( \frac{\log(\log y) - \mu}{\sigma} \right)^2} }{ \log(y)\, \sigma \sqrt{2\pi} } \, dy   (25)

The theoretical LEV for this model, with the density given by (24), can be obtained as:

E^o(X \wedge d) = \int_1^d x \cdot \frac{1}{x} \cdot \frac{ e^{-\frac{1}{2} \left( \frac{\log(\log x) - \mu}{\sigma} \right)^2} }{ \log(x)\, \sigma \sqrt{2\pi} } \, dx + d\,[1 - F^o(d)]   (26)


Figure 9: 3D Likelihood region of log normal model

The theoretical MRL for this model can be computed as:

e(d) = \frac{E^o(X) - E^o(X \wedge d)}{1 - F^o(d)}   (27)

where E^o(X) = \lim_{d \to \infty} E^o(X \wedge d) and the functions on the right-hand side are given above.

5.2.2 Parameter estimation

Since MLEs are invariant under transformation, the estimated parameters for this model remain the same: \hat{\mu} = 2.08958523 and \hat{\sigma} = 0.187487455.

5.2.3 Comparison by graphs

In this section we plot the graphs of this model for the original losses, using the parameters estimated in the previous section. The graphs of PDF, CDF, LEV and MRL appear in Figure 12.

5.3 Inverse Gaussian fit for log(data)


Figure 10: Log normal model graphs

5.3.1 Theoretical functions

The PDF of the inverse Gaussian distribution is given by:

f(x) = \left( \frac{\theta}{2\pi x^3} \right)^{1/2} e^{ -\frac{\theta \left( \frac{x - \mu}{\mu} \right)^2}{2x} }   (28)

And the corresponding CDF for this model can be obtained as:

F(x) = \int_0^x \left( \frac{\theta}{2\pi y^3} \right)^{1/2} e^{ -\frac{\theta \left( \frac{y - \mu}{\mu} \right)^2}{2y} } \, dy   (29)

The LEV for the inverse Gaussian model can be expressed as:

E(X \wedge d) = \int_0^d x \left( \frac{\theta}{2\pi x^3} \right)^{1/2} e^{ -\frac{\theta \left( \frac{x - \mu}{\mu} \right)^2}{2x} } \, dx + d\,[1 - F(d)]   (30)

Finally, the expression of the MRL for the inverse Gaussian distribution is:

e(d) = \frac{E(X) - E(X \wedge d)}{1 - F(d)}   (31)

where the functions on the right-hand side are defined above.


Figure 11: PP-plot of the feasible models

5.3.2 Parameter estimation

As before we estimated the parameters of the inverse Gaussian distribution by maximum likelihood. The log-likelihood function is given, for µ > 0 and θ > 0, by:

l(\mu, \theta) = \frac{n}{2} (\ln \theta - \ln 2\pi) - \frac{3}{2} \sum_{i=1}^{n} \ln x_i - \frac{\theta}{2\mu^2} \sum_{i=1}^{n} \frac{(x_i - \mu)^2}{x_i}

And the corresponding score functions are:

S_1(\mu, \theta) = \frac{\partial l(\mu, \theta)}{\partial \mu} = \frac{\theta}{\mu^3} \sum_{i=1}^{n} (x_i - \mu)

S_2(\mu, \theta) = \frac{\partial l(\mu, \theta)}{\partial \theta} = \frac{n}{2\theta} - \frac{1}{2\mu^2} \sum_{i=1}^{n} \frac{(x_i - \mu)^2}{x_i}

Solving these equations we get the maximum likelihood estimators (mle’s):

µ =1n

n∑i=1

xi (32)

θ =nµ2∑n

i=1(xi−µ)2

xi

(33)
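Since (32) and (33) are closed-form, no optimizer is actually needed for this model; a minimal Python sketch (the four-point sample is hypothetical, for illustration):

```python
def invgauss_mle(sample):
    """Closed-form MLEs of equations (32)-(33) in the (mu, theta)
    parametrization used in the report."""
    n = len(sample)
    mu = sum(sample) / n
    theta = n * mu ** 2 / sum((x - mu) ** 2 / x for x in sample)
    return mu, theta

mu_hat, theta_hat = invgauss_mle([1.0, 2.0, 3.0, 4.0])  # hypothetical data
print(mu_hat)  # 2.5
```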


Figure 12: Log(lognormal) model graphs

We used a package included in R to estimate these parameters (see the source code in the Appendix). Because R needs initial values, we used method of moments estimates, which are given by:

\mu_{mom} = \frac{\sum_{i=1}^{n} x_i}{n}

\theta_{mom} = \frac{ \mu_{mom}^3 }{ \frac{\sum_{i=1}^{n} x_i^2}{n} - \mu_{mom}^2 }

Finally the MLEs we obtained are:

\hat{\mu} = 8.19756 \quad \text{and} \quad \hat{\theta} = 229.18479

Regarding the robustness of the estimates, we see from (32) and (33) that \hat{\mu} and \hat{\theta} go to ∞ when one of the x_i goes to ∞. Thus \hat{\mu} and \hat{\theta} are sensitive to outliers, though less sensitive than the estimates that would come from the original claims.

For a quantitative analysis of robustness we can apply the bootstrap method as discussed for the log-normal model. We should also mention that we incorporate "Huber estimates" (see the Appendix for the code) in the method of moments estimates supplied as initial values to our MLE estimation, which we believe has a robustifying effect.

Now we talk about the Variance-Covariance Matrix of inverse Gaussian mle’s. To estimate


the variance-covariance matrix, we used the log-likelihood function and computed the second derivatives.

\frac{\partial^2 l(\mu, \theta)}{\partial \mu^2} = -\frac{3\theta \sum_{i=1}^{n} x_i}{\mu^4} + \frac{2n\theta}{\mu^3}

\frac{\partial^2 l(\mu, \theta)}{\partial \theta^2} = -\frac{n}{2\theta^2}

\frac{\partial^2 l(\mu, \theta)}{\partial \theta \, \partial \mu} = \frac{\sum_{i=1}^{n} x_i - n\mu}{\mu^3}

So the covariance matrix can be computed by inverting the negative Hessian (the observed information matrix), which gives:

\begin{pmatrix} 0.003028186 & -0.0001363386 \\ -0.0001363386 & 132.7429 \end{pmatrix}

Hence we can compute a 95% confidence interval around the parameters, that is, approximately 1.96 standard deviations on both sides of each estimate:

\mu: 8.197649 \pm 1.96\,(0.003028186)^{1/2} \Rightarrow \mu \in (8.089792, 8.305506)
\theta: 229.417857 \pm 1.96\,(132.7429)^{1/2} \Rightarrow \theta \in (206.8359, 251.9998)

We can see that our estimates fall in the 95% confidence intervals.

The log-relative likelihood function is defined as:

r(\mu, \theta) = l(\mu, \theta) - l(\hat{\mu}, \hat{\theta})

Evaluating the log-likelihood function at the estimates of µ and θ we obtain:

r(\mu, \theta) = \frac{n}{2} (\ln \theta - \ln 2\pi) - \frac{3}{2} \sum_{i=1}^{n} \ln x_i - \frac{\theta}{2\mu^2} \sum_{i=1}^{n} \frac{(x_i - \mu)^2}{x_i} - (-1451.670)

The set of values of µ and θ such that r(µ, θ) > ln p is called a 100p% likelihood region (LR). To show the LR for different values of p (say 10% and 50%), we plot the LLF in two dimensions (fixing one parameter) and in three dimensions without fixing any parameter.

We generated possible values of µ with θ fixed to find the feasible region for µ. In Figure 8 we can see the plausible region from which we can choose the most likely value of µ when θ is fixed.

We also generated possible values of µ and θ to find the feasible region around the optimal parameter values. In Figure 13 we can see the plausible surface from which we can choose the most likely values of µ and θ.


Figure 13: 3D likelihood region for Inverse gaussian model

5.3.3 Comparison by Graphs

In this section we plot the graphs of the inverse Gaussian model for the transformed data, using the parameters estimated in the previous section. The graphs of PDF, CDF, LEV and MRL appear in Figure 14.

We produce the PP-plot for this model in Figure 11.

5.4 Log(inverse gaussian) model for the original data

5.4.1 Theoretical functions

The PDF of the log(inverse Gaussian) fit is given by:

f^o(x) = \frac{1}{x} \left( \frac{\theta}{2\pi (\log x)^3} \right)^{1/2} e^{ -\frac{\theta \left( \frac{\log x - \mu}{\mu} \right)^2}{2 \log x} }   (34)

The corresponding CDF is obtained as:

F^o(x) = \int_1^x \frac{1}{y} \left( \frac{\theta}{2\pi (\log y)^3} \right)^{1/2} e^{ -\frac{\theta \left( \frac{\log y - \mu}{\mu} \right)^2}{2 \log y} } \, dy   (35)

The LEV for this model can be obtained as:

E^o(X \wedge d) = \int_1^d x \cdot \frac{1}{x} \left( \frac{\theta}{2\pi (\log x)^3} \right)^{1/2} e^{ -\frac{\theta \left( \frac{\log x - \mu}{\mu} \right)^2}{2 \log x} } \, dx + d\,[1 - F^o(d)]   (36)


Figure 14: Inverse gaussian model graphs

The MRL of this model is given by:

e(d) = \frac{E^o(X) - E^o(X \wedge d)}{1 - F^o(d)}   (37)

5.4.2 Parameter estimation

The invariant MLEs of the parameters are \hat{\mu} = 8.19756 and \hat{\theta} = 229.18479, as obtained in the previous section.

5.4.3 Comparison by graphs

In this section we plot the graphs of the log(inverse Gaussian) model for the original data, using the parameters estimated in the previous section. The graphs of PDF, CDF, LEV and MRL appear in Figure 15.

We produce the PP-plot for this model in Figure 11.

6 A non-feasible good fit model

In this section we present a nice fit, both for log(data) and the original data, which is ultimately of no use because of its infinite mean for the original claims.


Figure 15: log(inverse gaussian) model graphs

6.1 Gumbel fit for log(data)

6.1.1 Theoretical functions

The PDF of the Gumbel distribution is defined as:

f(x) = \frac{1}{\beta} \, e^{\frac{\alpha - x}{\beta}} \, e^{-e^{\frac{\alpha - x}{\beta}}}

where β > 0. The corresponding CDF is defined as:

F(x) = e^{-e^{\frac{\alpha - x}{\beta}}}

Then the theoretical LEV can be expressed as:

E(X \wedge d) = \int_0^d \frac{x}{\beta} \, e^{\frac{\alpha - x}{\beta}} \, e^{-e^{\frac{\alpha - x}{\beta}}} \, dx + d\,[1 - F(d)]

from which it can be shown that E(X) = α + γβ, where γ ≈ 0.5772 is the Euler-Mascheroni constant. Finally, the theoretical MRL can be written as:

e(d) = \frac{ \alpha + \gamma\beta - \int_0^d \frac{x}{\beta} e^{\frac{\alpha - x}{\beta}} e^{-e^{\frac{\alpha - x}{\beta}}} dx - d\,[1 - F(d)] }{ 1 - e^{-e^{\frac{\alpha - d}{\beta}}} }


6.1.2 Parameter estimation

As usual we estimate the parameters of the Gumbel distribution by maximum likelihood. The log-likelihood function is given, for α ∈ R and β > 0, by:

l(\alpha, \beta) = -n \ln \beta + \frac{1}{\beta} \sum_{i=1}^{n} (\alpha - x_i) - \sum_{i=1}^{n} \exp\!\left( \frac{\alpha - x_i}{\beta} \right)

And the corresponding score functions are:

S_1(\alpha, \beta) = \frac{\partial l(\alpha, \beta)}{\partial \alpha} = \frac{n}{\beta} - \frac{1}{\beta} \sum_{i=1}^{n} \exp\!\left( \frac{\alpha - x_i}{\beta} \right)

S_2(\alpha, \beta) = \frac{\partial l(\alpha, \beta)}{\partial \beta} = -\frac{n}{\beta} + \frac{1}{\beta^2} \sum_{i=1}^{n} (x_i - \alpha) - \frac{1}{\beta^2} \sum_{i=1}^{n} (x_i - \alpha) \exp\!\left( \frac{\alpha - x_i}{\beta} \right)

We have to solve the following equations to get the maximum likelihood estimators (mle’s):

\sum_{i=1}^{n} \exp\!\left( \frac{\alpha - x_i}{\beta} \right) - n = 0

-n\beta + \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} x_i \exp\!\left( \frac{\alpha - x_i}{\beta} \right) = 0

Since we cannot find an explicit form for the estimators, we had to use the package built into R to estimate them numerically. As initial values we supply the method of moments estimates. From experience we learned that supplying method of moments estimates as initial values sometimes leads to absurd estimates in R, but fortunately this time the MLEs given by R with these initial values are consistent. The theoretical first and second moments are:

E[X] = \alpha + \gamma\beta

V[X] = E[X^2] - E^2[X] = \frac{(\pi\beta)^2}{6}

So we matched the empirical moments with the theoretical ones to get the initial values needed for the MLEs:

\hat{\alpha} = 7.464233 \quad \text{and} \quad \hat{\beta} = 1.294363
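Because the system above has no explicit solution, a numerical scheme is needed. One standard approach (not necessarily what R's optimizer does internally) eliminates α using the first score equation and iterates the resulting one-dimensional fixed-point equation for β, started, as in the report, from the method of moments value. The data below is a hypothetical sample of log-claims:

```python
from math import exp, log, pi, sqrt
from statistics import fmean, pstdev

def gumbel_mle(data, tol=1e-10, max_iter=500):
    """Fixed-point solution of the Gumbel score equations: eliminating alpha
    from the first equation gives beta = mean(x) - sum(x*w)/sum(w) with
    w_i = exp(-x_i/beta); iterate from the method of moments start
    beta = s * sqrt(6)/pi (from V[X] = (pi*beta)^2/6)."""
    xbar = fmean(data)
    beta = pstdev(data) * sqrt(6.0) / pi
    for _ in range(max_iter):
        w = [exp(-x / beta) for x in data]
        new_beta = xbar - sum(x * wi for x, wi in zip(data, w)) / sum(w)
        if abs(new_beta - beta) < tol:
            beta = new_beta
            break
        beta = new_beta
    # alpha from the first score equation: sum(exp((alpha - x_i)/beta)) = n
    alpha = beta * log(len(data) / sum(exp(-x / beta) for x in data))
    return alpha, beta

data = [1.1, 2.3, 0.7, 3.5, 2.0, 1.8, 2.9, 1.2]   # hypothetical log-claims
alpha_hat, beta_hat = gumbel_mle(data)
# residuals of the two score equations, both close to 0 at the MLE
s1 = sum(exp((alpha_hat - x) / beta_hat) for x in data) - len(data)
s2 = (-len(data) * beta_hat + sum(data)
      - sum(x * exp((alpha_hat - x) / beta_hat) for x in data))
```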

6.1.3 Comparison by graphs

In this section we present the four basic graphs of this model, namely PDF, CDF, LEV, and MRL, using the parameters estimated in the previous section. The graphs appear in Figure 16. We end this section with the PP-plot of this model, together with the other models which did not work. In Figure 17 we show the PP-plots of the so-called "garbage models" for log(data), and in Figure 18 the PP-plots of the corresponding models for the original data. As expected, the PP-plots are identical for the original and transformed data for each model.


Figure 16: Gumbel fit for log(data)

6.2 Log(Gumbel) fit for the original data

6.2.1 Theoretical functions

The PDF of this distribution is given by:

f^o(x) = \frac{1}{\beta x} \, e^{\frac{\alpha - \log x}{\beta}} \, e^{-e^{\frac{\alpha - \log x}{\beta}}}

where β > 0. And the corresponding CDF is given by:

F^o(x) = \int_1^x \frac{1}{\beta y} \, e^{\frac{\alpha - \log y}{\beta}} \, e^{-e^{\frac{\alpha - \log y}{\beta}}} \, dy

The theoretical LEV for this model can be expressed as:

E^o(X \wedge d) = \int_1^d x \cdot \frac{1}{\beta x} \, e^{\frac{\alpha - \log x}{\beta}} \, e^{-e^{\frac{\alpha - \log x}{\beta}}} \, dx + d\,[1 - F^o(d)]

And, finally, the theoretical MRL can be expressed as:

e^o(d) = \frac{E^o(X) - E^o(X \wedge d)}{1 - F^o(d)}

where the functions on the right-hand side are defined earlier.


Figure 17: PP plots of non-feasible models for log(data)

6.2.2 Parameter estimation

The estimated parameter values are the invariant MLEs obtained in the previous section.

6.2.3 Comparison by graphs

In this subsection we plot the four basic graphs for the original claims, using the parameters estimated in the previous subsection. The graphs appear in Figure 19.

We already produced the PP-plot of this model for the original claims in Figure 18.

7 Goodness of fit and model selection

In the previous sections we produced all the graphs giving an idea of the goodness of our fits; in particular, we mention the PP-plots of both feasible and non-feasible models shown earlier.

Here we use quantitative tests to assess the goodness of fit. We now define the test statistics used in this section. In the Appendix we include code for some other goodness-of-fit tests not covered here.

The Kolmogorov-Smirnov test statistic is defined as :

K.S. = \max_x | F_n(x) - F(x; \hat{\theta}) |   (38)

where F_n(x) is the empirical CDF and F(x; \hat{\theta}) is the model CDF.


Figure 18: PP plots of non-feasible models for original data

The Anderson-Darling test statistic is defined as:

A^2 = -n F(u) + n \sum_{j=0}^{k} (1 - F_n(y_j))^2 \{ \log(1 - F(y_j)) - \log(1 - F(y_{j+1})) \} + n \sum_{j=1}^{k} F_n(y_j)^2 \{ \log F(y_{j+1}) - \log F(y_j) \}   (39)

where the y_j are class boundaries with t = y_0 < y_1 < y_2 < \cdots < y_k < y_{k+1} = u. The Cramer-von Mises distance for individual data is defined as:

C_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left[ F_n(x_i) - F(x_i; \theta) \right]^2   (40)

The Akaike value is defined as:

AIC = 2K - 2\,\text{loglikelihood}   (41)

where K is the number of estimated parameters.
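The statistics (38), (40) and (41) are straightforward to code; a Python sketch assuming individual, distinct observations (the uniform toy model and its sample below are ours, for checking, and the report's own routines were in R):

```python
def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov statistic (38). For a step empirical CDF the
    supremum over x is attained at an observation, comparing the model CDF
    with both i/n and (i-1)/n."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        d = max(d, abs(i / n - fx), abs((i - 1) / n - fx))
    return d

def cvm_distance(sample, cdf):
    """Cramer-von Mises distance (40), with F_n at the i-th sorted
    observation equal to i/n for distinct observations."""
    xs = sorted(sample)
    n = len(xs)
    return sum((i / n - cdf(x)) ** 2 for i, x in enumerate(xs, start=1)) / n

def aic(loglikelihood, k):
    """Akaike criterion (41): AIC = 2K - 2 loglikelihood."""
    return 2 * k - 2 * loglikelihood

uniform_cdf = lambda x: min(max(x, 0.0), 1.0)   # toy model for checking
sample = [0.25, 0.5, 0.75, 1.0]                 # its exact quantiles
print(ks_statistic(sample, uniform_cdf))        # 0.25
print(cvm_distance(sample, uniform_cdf))        # 0.0
```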

All the above test statistics and the selection criterion were applied to some of our non-feasible models and to the two feasible models. The results are shown in tabular form in Figure 20. Since both feasible models passed the tests, we have to choose the better of the two. It might have been better to compare the models on further empirical statistics, but because of computational difficulties most of the theoretical statistics are


Figure 19: log(Gumbel) fit for the original data

not available for comparison. However, comparing theoretical and empirical means (see Figure 21) we can clearly opt for the inverse Gaussian model, and we are more interested in doing so because a consistent mean is the vital concern of premium calculation. The boxplot (in Figure 22) also strongly supports the choice of the inverse Gaussian model. First, we can see from the boxplot that the claims treated as outliers (the dots outside the whiskers) are very few in the inverse Gaussian model compared to the log-normal or even the Gumbel. That means the inverse Gaussian model can accommodate more large claims with less premium (because of the smaller mean) than the log-normal. Also, the length of the box, the IQR (interquartile range), is a good basis of comparison between models. From this point of view too, the inverse Gaussian quartiles match the empirical quartiles better than the log-normal quartiles do (the two ends of the box being the 1st and 3rd quartiles and the line inside the box the median). So we believe our model selection is strongly justified.

8 Simulating the rate and getting the premium

We simulated the “number of claims arriving in a year” 10000 times (see the source code in the Appendix), and then used the mean of those rates in our premium calculation. The premium P is calculated according to:

P = (1 + θ)λE(X) (42)


Figure 20: Comparison of different models

where θ is the security loading. In most cases it is taken as 1%, 2% or 5%. The calculated premiums with the different security loadings appear in Figure(23).
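As a quick numerical sketch of equation (42) (Python for illustration; the annual rate λ comes from the 793 claims observed in one year, while the mean claim size E(X) below is a placeholder, not our fitted value):

```python
# Premium sketch: P = (1 + theta) * lambda * E[X], as in eq. (42).
def premium(theta, lam, ex):
    """Expected-value premium with security loading theta."""
    return (1 + theta) * lam * ex

lambda_hat = 793   # annual claim rate: 793 claims observed in the year
ex = 25_000        # hypothetical mean claim size, for illustration only
for theta in (0.01, 0.02, 0.05):
    print(f"theta={theta:.0%}: P = {premium(theta, lambda_hat, ex):,.0f}")
```

The loading θ scales the pure premium λE(X) up by a small safety margin, so the three loadings simply produce three proportionally larger premiums.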

Though we did not include it in our premium calculation, we also tried another way to simulate the claim arrival times. Since our data are missing the recording or arrival times of the claims, we simulated the arrival times. With those simulated arrival times one can apply the Poisson model to estimate the parameter λ of the associated Poisson process.

Talking about simulating claim arrival times, we should mention the idea we applied. We simulate 1000 times. For each simulation we generate exponential inter-arrival times with a fixed rate, so the arrival times can be treated as those of a Poisson process with the same rate. The rates, which vary from one simulation to another, are generated randomly from U(1.17, 3.17). We chose this interval because, originally, 793 claims were observed in one year, which means the daily rate is 793/365 ≈ 2.17, and we deviated one unit on each side. Finally we take the mean of the 1000 simulated arrival times for each of the 793 claims as the arrival time of that claim.

We came up with some seemingly nice code. We were able to write a function which gives, for a sorted vector of claims, the number of repetitions of each claim. We also wrote code to compute "n.claimsday", where n = 1, 2, . . . , which gives how many days, out of 365, we received n claims. See the source code in the Appendix. A typical output appears in Figure(24).


Figure 21: Final model selection

9 Scope, limitations and conclusion

The model we chose is especially suitable for premium calculation. We accommodate the influence of outliers in the parameter estimation. This has an increasing influence on the premium, which goes in favor of the company, with the price being evenly paid by all policyholders. Alternatively, one can estimate the parameters disregarding one or a few outliers. The premium obtained with the new estimates will be less than that obtained previously, which goes in favor of the policyholders but throws the company into a riskier position in case such outlying claims occur. There may be a compromise solution to this problem, namely segregating the big claims from the small claims, estimating the parameters separately, and then projecting a more consistent premium for each group.

However, in situations where the premium is not the main concern, the log(log-normal) model may turn out to be more suitable than the log(inverse Gaussian). Furthermore, if a finite mean is not a requirement, then any of the log(Gumbel), log(inverse transformed gamma) or log(generalized Pareto) models may serve well.

As for limitations, we should mention that beyond the models we reported or discussed there are some other models which we could not try because the optimization failed when using the R packages for estimating the parameters. A more advanced optimization package might help at that point, and could lead to a more suitable model, perhaps even one with a finite mean.


Figure 22: Comparison by box plots

10 Appendix

This section is completely devoted to the source code. Here we put all the R code we used for the parameter estimation, the various numerical computations, the graphics, etc.

fires.x=matrix(scan("C:/Ivan/Master/Winter Term 2007/MAST726 - LossDistributions/Project/Data/Danish_Fires_2.txt"),byrow=T,ncol=2)
space=fires.x[,1]
losses=fires.x[,2]
lnloss=log(losses)
origfunc=losses
func=lnloss
y=func
classes=c(0,4.27,5.25,6.23,7.21,8.19,9.17,10.2,11.1,12.1,13.1,15.1)
hbreak=classes
summary(func)
cvar=function(x){sd(x)/mean(x)}
cvar(func)
skew<-function(x){mean((x-mean(x))^3)/sd(x)^3}
skew(func)
kurt<-function(x){mean((x-mean(x))^4)/sd(x)^4}


Figure 23: Premiums with different security loadings

Figure 24: A typical output for Poisson model

kurt(func)
quantile(func, c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9))
func.mu=mean(func)
func.std=sd(func)
func.mu
func.std

10.1 Codes for empirical tools

Empirical CDF:

func.s<-sort(func)
ecdf<-function(x){
  a<-x
  for(i in 1:length(x)){a[i]<-sum(x<=x[i])/length(x)}
  return(a)
}
func.fn<-ecdf(func.s)
plot(func.s, func.fn, type="s", xlab="ln(Losses)", ylab="F_n",
     main="Empirical CDF", col="red")

Empirical MRL:

emrl<-function(x){
  a<-x
  for(i in 1:length(x))
  {
    if(x[i] < max(x)) a[i] <- sum(x[x-x[i] > 0] - x[i])/(length(x))/(1-func.fn[i]) else a[i] <- 0
  }
  return(a)
}
func.en<-emrl(func.s)
plot(func.s, func.en, type="l", xlab="ln(Losses)", ylab="Mean Residual Life",
     main="Empirical MRL of the Individual Ln(Losses)", col="red")

Empirical LEV:

func.lev<-mean(func.s)-func.en*(1-func.fn)
plot(func.s, func.lev, type="l", xlab="ln(Losses)", ylab="Limited Expected Value",
     main="Empirical LEV of the Individual Ln(Losses)", col="red")

Histogram:

hist(func, prob=T, breaks=hbreak, xlim=c(4,14.5), xlab="Ln(losses)",
     main="Histogram ln(Losses)", col="gray")
n=length(func)


10.2 Selective codes to get the model tools for log(data)

Here we put some selective codes for one of the non-feasible models and the two feasible models. The codes for the other models are exactly the same; only the respective inputs need to change.

10.2.1 Gumbel tools

Libraries used:

func<-lnloss
library(stats4)
library(MASS)
library(lattice)

To obtain MLE:


# MLE
ll<-function(alpha,beta){n*log(beta)-sum((alpha-y)/beta)+sum(exp((alpha-y)/beta))}
est<-mle(minuslog=ll,start=list(alpha=alpha.mom,beta=beta.mom))
summary(est)
alpha_gumbel=7.464233
beta_gumbel=1.294363
alpha=7.464233
beta=1.294363

# PDF
pdfgumbel=function(x,alpha,beta){(beta^(-1))*exp((alpha-x)/beta)*exp(-exp((alpha-x)/beta))}
pdf7=pdfgumbel(func,alpha,beta)
hist(func, prob=T, breaks=hbreak, col="gray", xlim=c(4,14.5), xlab="ln(losses)",
     main="Histogram ln(losses) vs Gumbel Distribution")
lines(func,pdf7,type="o",xpd=T,col="red")

# CDF
cdfgumbel=function(x,alpha,beta){exp(-exp((alpha-x)/beta))}
alpha=alpha_gumbel
beta=beta_gumbel
cumgumbel=cdfgumbel(func,alpha,beta)
plot(func.s, func.fn, type="s", xlab="ln(losses)", ylab="F_n",
     main="Empirical vs Theoretical Gumbel CDF", col="blue")
lines(func.s, cumgumbel, type="o", col="red")

# LEV
n=length(func)
alpha=alpha_gumbel
beta=beta_gumbel
int=rep(0,793)
levgumbel=rep(0,n)
for(i in 1:n){
  integrand=function(z){z*((1/beta)*exp((alpha-z)/beta))*exp(-exp((alpha-z)/beta))}
  int=integrate(integrand, lower = 0, upper = func[i])
  levgumbel[i]=int$va+(func.s[i]*(1-cumgumbel[i]))
}
plot(func.s, func.lev, type="l", xlab="ln(Losses)", ylab="Limited Expected Value",
     main="Empirical-Gumbel LEV of the Ln(Losses)", col="blue")
lines(func.s, levgumbel, type="o", xpd=T, col="red")

# MRL
euler=0.577215665
mean.gumbel=alpha+(euler*beta)
mrl3=(mean.gumbel-levgumbel)/(1-cumgumbel)
plot(func.s, func.en, type="l", xlab="ln(Losses)", ylab="Mean Residual Life",
     main="Empirical-Gumbel MRL of the Ln(Losses)", col="blue")
lines(func.s, mrl3, type="o", xpd=T, col="red")

# P-P Plot
qqplot(cumgumbel, func.fn, xlab="CDF Theoretical", ylab="CDF Empirical",
       main="P-P Plot Gumbel vs ln(Losses)", col="blue")
abline(0,1)
# Q-Q Plot and simulating "Gumbel"
u=runif(793)
simgumbel=alpha-(beta*log(-log(u)))
qgumbel=quantile(simgumbel, probs = func.fn, na.rm = FALSE, names = TRUE, type = 7)
qfunc=quantile(func, probs = func.fn, na.rm = FALSE, names = TRUE, type = 7)
plot(qgumbel, qfunc, xlab="Theoretical Quantiles", ylab="Sample Quantiles",
     xlim=c(min(func),max(func)), col="red", main="Q-Q Plot Gumbel vs ln(Losses)")
abline(0,1)

10.2.2 Lognormal tools

Here we have something to mention about the parameter estimation. Instead of using the first moment (fm) and second moment (sm) directly, we use Huber's robust estimates of the mean and standard deviation, which are then used in the method-of-moments estimates and hence as starting values for the MLEs. Though because of the log scale the difference is hardly observable, we believe it has a robustifying effect, and applied to the original large-scale data the difference would be quite noticeable. The useful code is listed below:

# MLE
robmean=huber(y, k=1.5, tol=1e-06)$mu
robstd=huber(y, k=1.5, tol=1e-06)$s
#fm=mean(y)
#sm=sum(y^2)/n
fm=robmean
sm=robstd^2+robmean^2
sigma.mom=sqrt(log(sm)-2*log(fm))
mu.mom=log(fm)-0.5*sigma.mom^2
ll<-function(mu,sigma){n*log(sigma)+(n/2)*log(2*pi)+sum(log(y))+0.5*sum(((log(y)-mu)/sigma)^2)}
est<-mle(minuslog=ll,start=list(mu=mu.mom,sigma=sigma.mom))
summary(est)


# Not Robust MEAN
#mu=2.0859586
#sigma=0.1874896
# Robust MEAN
mu=2.0859586
sigma=0.1874917

# PDF
pdflognorm=function(x,mu,sigma){(1/(x*sigma*sqrt(2*pi)))*exp(-(((log(x)-mu)/sigma)^2)/2)}
denlognorm=pdflognorm(func,mu,sigma)
hist(func, prob=T, breaks=hbreak, col="gray", xlim=c(4,15), xlab="ln(Losses)",
     main="Histogram ln(Losses) vs LogNormal Distribution")
lines(func,denlognorm,type="o",xpd=T,col="red")

# CDF
cdflognorm=function(x,mu,sigma){pnorm((log(x)-mu)/sigma,0,1)}
cumlognorm=cdflognorm(func,mu,sigma)
plot(func.s, func.fn, type="s", xlab="ln(Losses)", ylab="F_n",
     main="Empirical vs Theoretical LogNormal CDF", col="blue")
lines(func.s, cumlognorm, type="o", col="red")

# LEV
int=rep(0,793)
levlognorm=rep(0,n)
for(i in 1:n){
  integrand=function(z){z*((1/(z*sigma*sqrt(2*pi)))*exp(-(((log(z)-mu)/sigma)^2)/2))}
  int=integrate(integrand, lower = 0, upper = y[i])
  levlognorm[i]=int$va+(y[i]*(1-cumlognorm[i]))
}
plot(func.s, func.lev, type="l", xlab="ln(Losses)", ylab="Limited Expected Value",
     main="Empirical-LogNormal LEV of the ln(Losses)", col="blue")
lines(func.s, levlognorm, type="o", col="red")

# MRL
mean.lognorm=exp(mu+((sigma^2)/2))
mrllognorm=(mean.lognorm-levlognorm)/(1-cumlognorm)
plot(func.s, func.en, type="l", xlab="ln(Losses)", ylab="Mean Residual Life",
     main="Empirical-Lognormal MRL of the Ln(Losses)", col="blue")
lines(func.s, mrllognorm, type="o", xpd=T, col="red")

# P-P Plot
plot(plnorm(func.s,mu,sigma), func.fn, xlab="CDF Theoretical", ylab="CDF Empirical",
     main="P-P Plot LogNormal vs ln(Losses)", col="blue")
abline(0,1)
# Q-Q Plot
plot(qlnorm(func.fn,mu,sigma), func.s, xlab="Theoretical Quantiles", ylab="Sample Quantiles",
     xlim=c(min(func),max(func)), col="red", main="Q-Q Plot LogNormal vs ln(Losses)")
abline(0,1)

# Simulating LogNormal random numbers (Box-Muller)
u=runif(n)
v=runif(n)
simstdnormal=sqrt(-2*log(u))*cos(2*pi*v)
simnormal=mu+sigma*simstdnormal
simlognorm=exp(simnormal)

# Likelihood Region (2D - Fixing Sigma)
ll<-function(mu){-(n*log(sigma)+(n/2)*log(2*pi)+sum(log(y))+0.5*sum(((log(y)-mu)/(sigma))^2))}
loglike=ll(mu)
rpar=function(x){ll(x)-loglike}
mu=seq(2,2.15,length=n)
rpar=rep(0,n)
for(i in 1:n){ rpar[i]=ll(mu[i])-loglike }
prob1=rep(log(0.10),n)
prob2=rep(log(0.50),n)
plot(mu, rpar, type="l", col="black", ylim=c(-50,0), xlab="mu", ylab="r(mu)",
     main="Log-relative Likelihood Function for LogNormal (Given sigma)")
lines(mu,prob1,col="red")
lines(mu,prob2,col="blue")

# Likelihood Region (3D)
ll<-function(mu,sigma){-(n*log(sigma)+(n/2)*log(2*pi)+sum(log(y))+0.5*sum(((log(y)-mu)/(sigma))^2))}
mu_ini=2.0859586
sigma_ini=0.1874917
loglike=ll(mu_ini,sigma_ini)
rpar=function(x,y){ll(x,y)-loglike}
nrep=100
mu=seq(1.9,2.25,length=nrep)
sigma=seq(0.15,0.21,length=nrep)
llr_m=matrix(rep(0,nrep*nrep),ncol=nrep)
for(i in 1:nrep){for(j in 1:nrep){llr_m[i,j]=rpar(mu[i],sigma[j])}}
wireframe(llr_m, shade = TRUE, list(arrows = TRUE), more=TRUE,
          aspect = c(50/97, 0.4), drape=TRUE, colorkey = FALSE,
          screen = list(z=40, x=-80), light.source = c(10,0,10),
          xlab="Mu", ylab="Sigma", zlab="Loglikelihood F.",
          main="Loglikelihood Function LogNormal")


# Covariance Matrix
var.mu=-1/(-n/sigma^2)
var.sigma=-1/((n/sigma^2)-((3/sigma^4)*sum((log(y)-mu)^2)))
cov.mu.sigma=(-2/sigma^3)*(sum(log(y)-mu))
sqrt(var.mu)
sqrt(var.sigma)
cov.mu.sigma
rho.mu.sigma=cov.mu.sigma/(sqrt(var.mu)*sqrt(var.sigma))
rho.mu.sigma
# Confidence Interval (95%)
lim.inf.mu=mu-(1.96*sqrt(var.mu))
lim.sup.mu=mu+(1.96*sqrt(var.mu))
lim.inf.sigma=sigma-(1.96*sqrt(var.sigma))
lim.sup.sigma=sigma+(1.96*sqrt(var.sigma))
lim.inf.mu
lim.sup.mu
lim.inf.sigma
lim.sup.sigma

10.2.3 Inverse Gaussian Tools

Like the "lognormal" model, here too we incorporate the Huber estimate in the method-of-moments estimates and hence in the MLEs.

#mu.mom=mean(y)
#theta.mom=(n*(mean(y)^3))/(sum(y^2)-(n*(mean(y)^2)))
robmean=huber(y, k=1.5, tol=1e-06)$mu
robstd=huber(y, k=1.5, tol=1e-06)$s
mu.mom=robmean
theta.mom=robmean^3/robstd^2
# MLE
library(stats4)
ll<-function(mu,theta){-((n*log(theta)/2)-(n*log(2*pi)/2)-(1.5*sum(log(y)))-(theta*sum((((y-mu)/mu)^2)/(2*y))))}
est<-mle(minuslog=ll,start=list(mu=mu.mom,theta=theta.mom))
summary(est)
mu=8.197649
theta=229.417857

# PDF
pdfinvnorm=function(x,mu,theta){((theta/(2*pi*x^3))^(1/2))*exp(-theta*((x-mu)^2)/(2*x*mu^2))}
deninvnorm=pdfinvnorm(y,mu,theta)
hist(func, prob=T, breaks=hbreak, col="gray", xlim=c(4,15), xlab="ln(Losses)",
     main="Histogram ln(Losses) vs Inverse Gaussian Distribution")
lines(func,deninvnorm,type="o",xpd=T,col="red")

# CDF
int=rep(0,n)
cdfinvnorm=rep(0,n)
for(i in 1:n){
  integrand=function(x){pdfinvnorm(x,mu,theta)}
  int=integrate(integrand, lower = 0, upper = y[i])
  cdfinvnorm[i]=int$va
}
plot(func.s, func.fn, type="l", xlab="ln(Losses)", ylab="F_n",
     main="Empirical vs Theoretical Inverse Gaussian CDF", col="blue")
lines(func.s, cdfinvnorm, type="o", col="red")

# LEV
int=rep(0,793)
levinvnorm=rep(0,n)
for(i in 1:n){
  integrand=function(z){z*pdfinvnorm(z,mu,theta)}
  int=integrate(integrand, lower = 0, upper = y[i])
  levinvnorm[i]=int$va+(y[i]*(1-cdfinvnorm[i]))
}
plot(func.s, func.lev, type="l", xlab="ln(Losses)", ylab="Limited Expected Value",
     main="Empirical-Inverse Gaussian LEV of the ln(Losses)", col="blue")
lines(func.s, levinvnorm, type="o", col="red")

# MRL
mean.invnorm=mu
mrlinvnorm=(mean.invnorm-levinvnorm)/(1-cdfinvnorm)
plot(func.s, func.en, type="l", xlab="ln(Losses)", ylab="Mean Residual Life",
     main="Empirical-Inverse Gaussian MRL of the Ln(Losses)", col="blue")
lines(func.s, mrlinvnorm, type="o", xpd=T, col="red")

# P-P Plot
qqplot(cdfinvnorm, func.fn, xlab="CDF Theoretical", ylab="CDF Empirical",
       main="P-P Plot Inverse Gaussian vs ln(Losses)", col="blue")
abline(0,1)

Here we simulate from the inverse Gaussian distribution. We refer to [7] for the simulation algorithm.

# Simulating inverse Gaussian
rinvgauss<-function(n, mu=stop("no shape arg"), lambda=1){
  if(any(mu<=0)) stop("mu must be positive")
  if(any(lambda<=0)) stop("lambda must be positive")
  if(length(n)>1) n <- length(n)
  if(length(mu)>1 && length(mu)!=n) mu <- rep(mu,length=n)
  if(length(lambda)>1 && length(lambda)!=n) lambda <- rep(lambda,length=n)
  y2 <- rchisq(n,1)
  u <- runif(n)
  r1 <- mu/(2*lambda) * (2*lambda + mu*y2 - sqrt(4*lambda*mu*y2 + mu^2*y2^2))
  r2 <- mu^2/r1
  ifelse(u < mu/(mu+r1), r1, r2)
}
siminvgaussian=rinvgauss(n, mu, lambda)

# Likelihood Region (2D - Fixing theta)
theta=229.417857
ll<-function(mu){((n*log(theta)/2)-(n*log(2*pi)/2)-(1.5*sum(log(y)))-(theta*sum((((y-mu)/mu)^2)/(2*y))))}
mu=8.197649
loglike=ll(mu)
rpar=function(x){ll(x)-loglike}
mu=seq(8,8.5,length=n)
rpar=rep(0,n)
for(i in 1:n){ rpar[i]=ll(mu[i])-loglike }
prob1=rep(log(0.10),n)
prob2=rep(log(0.50),n)
plot(mu, rpar, type="l", col="black", ylim=c(-20,0), xlab="mu", ylab="r(mu)",
     main="Log-relative Likelihood Function for InvGaussian (Given theta)")
lines(mu,prob1,col="red")
lines(mu,prob2,col="blue")

# Likelihood Region (3D)
ll<-function(mu,theta){((n*log(theta)/2)-(n*log(2*pi)/2)-(1.5*sum(log(y)))-(theta*sum((((y-mu)/mu)^2)/(2*y))))}
mu_ini=8.197649
theta_ini=229.417857
loglike=ll(mu_ini,theta_ini)
rpar=function(x,y){ll(x,y)-loglike}
nrep=100
mu=seq(8,8.5,length=nrep)
theta=seq(200,250,length=nrep)
llr_m=matrix(rep(0,nrep*nrep),ncol=nrep)
for(i in 1:nrep){for(j in 1:nrep){llr_m[i,j]=rpar(mu[i],theta[j])}}
wireframe(llr_m, shade = TRUE, list(arrows = TRUE), more=TRUE,
          aspect = c(60/97, 0.4), drape=TRUE, colorkey = FALSE,
          screen = list(z=40, x=-70), light.source = c(10,0,10),
          xlab="Mu", ylab="Theta", zlab="Loglikelihood F.",
          main="Loglikelihood Function InvGaussian")
#mu.1=rep(0,nrep)
plane=matrix(rep(0,nrep*nrep),ncol=nrep)
p.l=log(0.5)
k=((-1/theta)*(p.l-n*log(theta)+n*log(2*pi)+3*sum(log(y))))
a.eq=k-sum(1/y)
b.eq=2*n
c.eq=n*mean(y)
root1=(-b.eq-(sqrt(b.eq^2-(4*a.eq*c.eq))))/(2*a.eq)
root2=(-b.eq+(sqrt(b.eq^2-(4*a.eq*c.eq))))/(2*a.eq)
for(i in 1:nrep){for(j in 1:nrep){plane[i,j]=rpar(root1[i],theta[j])}}
win.graph()
wireframe(llr_m, shade = TRUE, list(arrows = TRUE), more=TRUE,
          aspect = c(60/97, 0.4), drape=TRUE, colorkey = FALSE,
          screen = list(z=40, x=-70), light.source = c(10,0,10),
          xlab="Mu", ylab="Theta", zlab="Loglikelihood F.",
          main="Loglikelihood Function InvGaussian")
wireframe(plane, shade = FALSE, list(arrows = TRUE), more=TRUE,
          aspect = c(60/97, 0.4), drape=TRUE, colorkey = FALSE,
          screen = list(z=40, x=-70), light.source = c(10,0,10),
          xlab="Mu", ylab="Theta", zlab="Loglikelihood F.",
          main="Loglikelihood Function InvGaussian")

10.3 Selective codes to get the model tools for original data

In this section we provide mostly the same codes as in the previous section, but for the original data.

10.3.1 Log(lognormal) tools

Histogram and classes

losses=fires.x[,2]
# Transformation:
origfunc=losses
y=origfunc
# Classes used in this analysis
classes=c(0, 229, 337, 727, 1068, 1569, 2304, 3383, 4968, 7295, 10712, 15730,
          23098, 33917, 49805, 73134, 107391, 157693, 231559, 340023, 688498, 3470555)
hbreak=classes


hist(origfunc, prob=T, breaks=classes, xlim=c(0,40000), xlab="Losses",
     main="Histogram - Losses")

# Some summary statistics:
summary(y)
cvar=function(x){sd(x)/mean(x)}
cvar(y)
skew<-function(x){mean((x-mean(x))^3)/sd(x)^3}
skew(y)
kurt<-function(x){mean((x-mean(x))^4)/sd(x)^4}
kurt(y)
quantile(y, c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9))
sd(y)

# Sensitivity Curve (ordinary mean)
y=origfunc
n=793
t=rep(0,n)
for(i in 3:n){ t[i]=y[793]-mean(y[c(1:(i-1))]) }
plot(y[c(3:792)], t[c(3:792)])
# Sensitivity Curve (robust mean)
y=origfunc
n=793
t=rep(0,n)
for(i in 3:n){ t[i]=y[793]-median(y[c(1:(i-1))]) }
plot(y[c(3:792)], t[c(3:792)])

We do not report the codes for the empirical PDF, CDF, LEV and MRL for the original data, as they are exactly the same as for the log(data) once the respective inputs are changed.

# Parameters from LogData
mu=2.0859586
sigma=0.1874917
pdflognorm=function(x,mu,sigma){(1/(x*sigma*sqrt(2*pi)))*exp(-(((log(x)-mu)/sigma)^2)/2)}
# It is a density!!!
integrand=function(x){(1/x)*pdflognorm(log(x),mu,sigma)}
int=integrate(integrand, lower = 1, upper = Inf)

# PDF
pdflnlognormal=function(x,mu,sigma){(1/x)*pdflognorm(log(x),mu,sigma)}
denlnlognormal=pdflnlognormal(origfunc,mu,sigma)
hist(origfunc, prob=T, breaks=classes, xlim=c(0,40000), xlab="Losses",
     main="Histogram - Empirical vs Theoretical LnLogNormal", col="gray")
lines(origfunc,denlnlognormal,type="p",xpd=T,col="red")

# CDF
int=rep(0,793)
cdflnlognormal=rep(0,n)
for(i in 1:n){
  integrand=function(x){pdflnlognormal(x,mu,sigma)}
  int=integrate(integrand, lower = 1, upper = origfunc.s[i])
  cdflnlognormal[i]=int$va
}
plot(origfunc.s, origfunc.fn, type="s", xlab="Losses", ylab="F_n",
     main="Empirical vs Theoretical lnlognormal CDF", col="blue")
lines(origfunc.s, cdflnlognormal, type="p", col="red")

# LEV
int=rep(0,n)
levlnlognormal=rep(0,n)
for(i in 1:n){
  integrand=function(x){x*pdflnlognormal(x,mu,sigma)}
  int=integrate(integrand, lower = 1, upper = origfunc[i])
  levlnlognormal[i]=int$va+(origfunc.s[i]*(1-cdflnlognormal[i]))
}
plot(origfunc.s, origfunc.lev, type="l", xlab="Losses", ylab="Limited Expected Value",
     ylim=c(0,30000), xlim=c(0,600000),
     main="Empirical-lnlognormal LEV of Losses", col="blue")
lines(origfunc.s, levlnlognormal, type="p", xpd=T, col="red")
# Computing the mean:
integrand=function(x){x*pdflnlognormal(x,mu,sigma)}
int=integrate(integrand, lower=1, upper=Inf)

# MRL
mean_lnlognormal=int$va
mrllnlognormal=(mean_lnlognormal-levlnlognormal)/(1-cdflnlognormal)
plot(origfunc.s, origfunc.en, type="l", xlab="Losses",
     xlim=c(0,origfunc[n-1]), ylim=c(0,2500000), ylab="Mean Residual Life",
     main="Empirical-lnlognormal MRL of Losses", col="blue")
lines(origfunc.s, mrllnlognormal, type="p", xpd=T, col="red")

# P-P Plot
qqplot(cdflnlognormal, origfunc.fn, xlab="CDF Theoretical", ylab="CDF Empirical",
       main="P-P Plot Log-LogNormal vs Losses", col="blue")
abline(0,1)

# Simulating Log-LogNormal random numbers
u=runif(n)
v=runif(n)
simstdnormal=sqrt(-2*log(u))*cos(2*pi*v)
simnormal=mu+sigma*simstdnormal
simlognorm=exp(simnormal)
simloglognorm=exp(exp(simnormal))
qloglognorm=quantile(simloglognorm, probs = origfunc.fn, na.rm = FALSE, names = TRUE, type = 7)
qfunc=quantile(origfunc, probs = origfunc.fn, na.rm = FALSE, names = TRUE, type = 7)
plot(qloglognorm, origfunc, xlab="Theoretical Quantiles", ylab="Sample Quantiles",
     xlim=c(min(origfunc),100000), ylim=c(min(origfunc),100000), col="red",
     main="Q-Q Plot Log-LogNormal vs Losses")
abline(0,1)

10.3.2 Log(inverse Gaussian) tools

mu=8.19756
theta=229.18479
pdfinvnorm=function(x,mu,theta){((theta/(2*pi*x^3))^(1/2))*exp(-theta*((x-mu)^2)/(2*x*mu^2))}
# PDF
y=origfunc
pdflninvnorm=function(x,mu,theta){(1/x)*pdfinvnorm(log(x),mu,theta)}
denlninvnorm=pdflninvnorm(y,mu,theta)
hist(y, prob=T, breaks=classes, xlim=c(0,40000), xlab="Losses",
     main="Histogram - Empirical vs Theoretical Ln(Inverse Gaussian)", col="gray")
lines(y,denlninvnorm,type="p",xpd=T,col="red")

# CDF
int=rep(0,n)
cdflninvnorm=rep(0,n)
for(i in 1:n){
  integrand=function(x){pdflninvnorm(x,mu,theta)}
  int=integrate(integrand, lower = 1, upper = y[i])
  cdflninvnorm[i]=int$va
}
plot(origfunc, origfunc.fn, type="l", xlab="Losses", ylab="F_n",
     main="Empirical vs Theoretical Ln(Inverse Gaussian) CDF", col="blue")
lines(origfunc.s, cdflninvnorm, type="p", col="red")

# LEV
int=rep(0,n)
levlninvnorm=rep(0,n)
for(i in 1:n){
  integrand=function(x){x*pdflninvnorm(x,mu,theta)}
  int=integrate(integrand, lower = 1, upper = origfunc[i])
  levlninvnorm[i]=int$va+(origfunc.s[i]*(1-cdflninvnorm[i]))
}
plot(origfunc.s, origfunc.lev, type="l", xlab="Losses", ylab="Limited Expected Value",
     ylim=c(0,30000), xlim=c(0,y[n-1]),
     main="Empirical-Ln(Inverse Gaussian) LEV of Losses", col="blue")
lines(origfunc.s, levlninvnorm, type="p", xpd=T, col="red")
# Computing the mean:
integrand=function(x){x*pdflninvnorm(x,mu,theta)}
int=integrate(integrand, lower = 1, upper = Inf)

# MRL
mean_lninvnorm=int$va
mrllninvnorm=(mean_lninvnorm-levlninvnorm)/(1-cdflninvnorm)
plot(origfunc.s, origfunc.en, type="l", xlab="Losses",
     xlim=c(0,y[n-2]), ylim=c(0,origfunc.en[n-1]), ylab="Mean Residual Life",
     main="Empirical-Ln(Inverse Gaussian) MRL of Losses", col="blue")
lines(origfunc.s, mrllninvnorm, type="p", xpd=T, col="red")
# P-P Plot
qqplot(cdflninvnorm, origfunc.fn, xlab="CDF Theoretical", ylab="CDF Empirical",
       main="P-P Plot Log-InvGaussian vs Losses", col="blue")
abline(0,1)

10.4 Codes for Goodness of fit tests

Kolmogorov-Smirnov test:
Here “mcdf” stands for “model CDF”, which, for each model, is defined earlier.

mu=2.08
sigma=0.18
mcdf=function(x){cdflognorm(x,mu,sigma)}
# Kolmogorov-Smirnov test:
ks.n<-rep(0,n)
ks.n[1]<-abs(ecdf(y)[1]-mcdf(y[1]))
for(i in 2:n){
  ks.n[i]<-max(abs(ecdf(y)[i-1]-mcdf(y[i])), abs(ecdf(y)[i]-mcdf(y[i])))
}
ks.statistic<-max(ks.n)
# Asymptotic critical value at the 5% significance level
ks.cri.val<-(1.36/sqrt(n))
ks.statistic
ks.cri.val

For the inverse Gaussian model the code is the same; only the parameter values and “mcdf” differ.
Anderson-Darling test:
Here we report the code for the inverse Gaussian model; for the other models it is the same, with the respective inputs changed.


mu=8.197649
theta=229.417857
z=vector()
integrand=function(x){pdfinvnorm(x,mu,theta)}
mcdf=function(z){integrate(integrand, lower = 0, upper = z)}
fsum=0
for(i in 1:(n-1)){
  fsum=(((1-ecdf(y)[i])^(2))*(log(1-mcdf(y[i])$va)-log(1-mcdf(y[i+1])$va)))+fsum
}
fsum
ssum=0
for(i in 2:(n-1)){
  ssum=(((ecdf(y)[i])^(2))*(log(mcdf(y[i+1])$va)-log(mcdf(y[i])$va)))+ssum
}
ssum
adtest=(-n*mcdf(y[n])$va)+n*fsum+n*ssum
adcrit=1.933
adtest
adcrit

The code for the log-normal model is the same; only the parameters and “mcdf” need to change.
Cramér-von Mises (individual data) test:

mu=8.197649
theta=229.417857
z=vector()
integrand=function(x){pdfinvnorm(x,mu,theta)}
mcdf=function(z){integrate(integrand, lower = 0, upper = z)}
crv.test=0
for(i in 1:n){ crv.test=(1/n)*((ecdf(y)[i]-mcdf(y[i])$va)^2)+crv.test }
crv.test

Cramér-von Mises (grouped data) test:
Though we did not report it for any of our models, here is a useful code, set up for the Gumbel model. It can be applied to any other model with the respective changes made to the parameters and “mcdf”.

par=2
alpha=7.464233
beta=1.294363
mcdf=function(x){cdfgumbel(x,alpha,beta)}
freq<-rep(0,k)
total<-rep(0,k)
for(i in 2:k){
  for(j in 1:n)
  {
    if(y[j]<=p[i]) total[i]<-total[i]+y[j]
  }
}
classsum<-rep(0,k-1)
for(mx in 1:(k-1)){ classsum[mx]<-total[mx+1]-total[mx] }
classfreq<-rep(0,k-1)
for(nx in 1:(k-1)){ classfreq[nx]<-table(ybreaks)[nx] }
classmean=classsum/classfreq
cvm=0
for(i in 2:k){
  cvm=(classmean[i-1]*((((cumsum(classfreq)[i-1])/n)-mcdf(p[i]))^2))+cvm
}
cvm

Chi-square test:
Here is another code which we did not use but which may be helpful to many people. We set it up for the Gumbel model, but it can easily be modified for other models by changing the inputs.

par=2
alpha=7.464233
beta=1.294363
mcdf=function(x){cdfgumbel(x,alpha,beta)}
# Computing expected frequencies:
n<-length(y)
ybreaks=cut(y,breaks=classes)
p=classes
k=length(p)
f.ex<-rep(0,k-1)
for(i in 1:(k-1)){ f.ex[i]<-(mcdf(p[i+1])-mcdf(p[i]))*n }
# Getting observed frequencies & computing chi-square:
f.ob=rep(0,k-1)
for(i in 1:(k-1)){ f.ob[i]=table(ybreaks)[[i]] }
# Empirical chi-square statistic
chi.sq<-sum((f.ob-f.ex)^2/f.ex)
chi.sq
# Getting degrees of freedom
df=(k-1)-par-1
# Getting p-value
p.value=1-pchisq(chi.sq,df)
p.value
t_value=qchisq(0.95,df)
e_value=chi.sq
t_value
e_value

10.4.1 Code for estimating λ from simulated claim count:

Here is the code which we used in our premium calculation.

n.sim<-10000
t.period<-365
rate<-rep(0,n.sim)
lambda<-rep(0,n.sim)
claimnumbers<-rep(0,t.period)
for(i in 1:n.sim){
  lambda[i]<-runif(1,2,2.34)
  claimnumbers<-rpois(t.period,lambda[i])
  rate[i]<-sum(claimnumbers)/t.period
}
rate.estimate<-mean(rate)
rate.estimate
num.claims=rpois(365,rate.estimate)
sum(num.claims)

10.4.2 Simulating arrival times

Here is the code for simulating arrival times. We did not use it in the premium calculation, but we report a typical output from it so that the Poisson model can be applied to estimate λ. The output appears in Figure(24), from which we can obtain an estimate of λ. Inside the code there are some other useful pieces; here are their outputs:
(*) “f” gives the frequency of each arrival time in the vector of final arrival times (y=f.ar.time);
(*) “cardinality.y” gives how many distinct days, out of 365, claims occur;
(*) “n.claimsday” gives how many days, out of 365, exactly n claims occur; and finally
(*) “no.claimsday” gives how many days, out of 365, no claims occur.

n<-793
n.sim<-50
mat<-matrix(c(rep(0,n.sim*n)),nrow=n.sim,ncol=n,byrow=TRUE)
lambda<-rep(0,n.sim)
for(j in 1:n.sim){
  i.ar.time<-rep(0,n)
  ar.time2<-rep(0,n)
  ar.time<-rep(0,n)
  lambda[j]<-runif(1,1.17,3.17)
  i.ar.time<-rexp(n,lambda[j])
  ar.time2[1]<-i.ar.time[1]
  for(i in 1:(n-1)){ ar.time2[i+1]<-ar.time2[i]+i.ar.time[i+1] }
  c<-365/max(ar.time2)
  ar.time<-c*ar.time2
  mat[j,]<-ar.time
}
f.ar.time<-rep(0,n)
for(k in 1:n){ f.ar.time[k]<-ceiling(sum(mat[,k])/n.sim) }
## Frequency counter (my best efforts succeeded):
y<-f.ar.time
f<-rep(0,length(y))
for(i in 1:length(y)){
  t1<-sum(y<=y[i])
  t2<-sum(y>=y[i])
  if(t1+t2!=length(y)) f[i]<-(t1+t2-length(y))
}
distinct.total=1
for(i in 1:(length(y)-1)){
  temp1<-y[i]
  temp2<-y[i+1]
  if(temp1!=temp2) distinct.total<-distinct.total+1
}
max(f)
cardinality.y<-distinct.total
cardinality.y
cl.count<-rep(0,max(f))
j<-1
repeat{  ## repeat can help nicely in place of repeated use of "for" loops
  if(j>max(f)){break}
  i<-1
  repeat{
    if(i>n){break}
    if(f[i]==j){cl.count[j]<-cl.count[j]+1}
    i<-(i+1)
  }
  j<-(j+1)
}
n.claimsday<-rep(0,max(f))
for(i in 1:max(f)){ n.claimsday[i]<-((cl.count[i])/i) }
no.claimsday<-(365-cardinality.y)
cardinality.y
cl.count
sum(cl.count)
n.claimsday
no.claimsday
sum(n.claimsday)+no.claimsday

References

[1] Garrido, J. (2007). Loss Distributions lecture notes. Unpublished.

[2] Klugman, S. A., Panjer, H. H., Willmot, G. E. (2004). Loss Models: From Data to Decisions. Wiley Series in Probability and Statistics.

[3] McLaughlin, M. P. (2001). A Compendium of Common Probability Distributions.

[4] The R Development Core Team (2003). R: A Language and Environment for Statistical Computing.

[5] Daly, P. W. (1998). Graphics and Colour with LaTeX.

[6] Sáez Castillo, A. J. Funciones de una variable aleatoria (Functions of a Random Variable).

[7] Chhikara, R. S. and Folks, J. L. (1989). The Inverse Gaussian Distribution. Marcel Dekker, p. 53.
