ChJS-03!01!05 Uncorrected Proofs (transcript, 8/2/2019)
Chilean Journal of Statistics (ChJS), www.soche.cl/chjs, ISSN 0718-7912 (print) / 0718-7920 (online)

    Dear author:

    Please find attached the proofs of your article.

You can submit your corrections via e-mail to [email protected], Cc to [email protected], or by fax to (56)(32)2508322.

    Please always indicate the line number to which the correction refers.

You can insert your corrections in the proof PDF and e-mail the annotated PDF to us.

For fax submission, or for scanned copies sent by e-mail, please ensure that your corrections are clearly legible. Use a fine black pen and write the correction in the margin, not too close to the edge of the page.

Remember to indicate the article title, article number, and your name when sending your response via e-mail or fax.

Check the questions that may have arisen during copy editing and insert your answers/corrections.

Check that the text is complete and that all figures, tables and their legends are included. Also check the accuracy of special characters, equations, and electronic supplementary material if applicable. If necessary, refer to the edited manuscript.

    Please take particular care that all such details are correct.

Please do not make changes that involve only matters of style. We have followed ChJS style. Substantial changes in content, e.g., new results, corrected values, title and authorship, are not allowed without the approval of the responsible editor. In such a case, please contact the Editorial Office and return his/her consent together with the proof.

Your article will be published Online First (iFirst) approximately one week after receipt of your corrected proofs. This is the official first publication. Further changes are, therefore, not possible.

    The printed version will follow in a forthcoming issue.

Please note that after online publication, subscribers (personal/institutional) to this journal will have access to the complete article using the URL: http://www.soche.cl/chjs.

    Kind regards,

Editorial Office
Chilean Journal of Statistics


    Journal: Chilean Journal of Statistics

    Article: ChJS Vol. 03 No. 01 Art. 05

    Author Query Form

    Please ensure you fill out your response to the queries raised below and

    return this form along with your corrections

    Dear author:

During the typesetting process of your article, the following queries have arisen. Please check your typeset proof carefully against the queries listed below and mark the necessary changes either directly on the proof/online grid or in the Author's response area provided below.

Query  Details required  Author's response

1  Please send the corresponding author's mailing address.
2  Line 142, Cook (1982) is not in the reference list. Please add or remove.
3  Line 444, Reference Henderson (1975) is not cited. Please cite or remove.
4  Line 484, Reference Wei et al. (1998) is not cited. Please cite or remove.


Chilean Journal of Statistics
Vol. 3, No. 1, April 2012, 1–18

UNCORRECTED PROOFS

Statistical Modeling
Research Paper

On linear mixed models and their influence diagnostics applied to an actuarial problem

Luis Gustavo Bastos Pinho, Juvencio Santos Nobre and Sílvia Maria de Freitas

Department of Applied Mathematics and Statistics, Federal University of Ceará, Fortaleza, Brazil

(Received: 10 June 2011 · Accepted in final form: 23 September 2011)

Abstract

In this paper, we motivate the use of linear mixed models and diagnostic analysis in practical actuarial problems. Linear mixed models are an alternative to traditional credibility models. Frees et al. (1999) showed that some mixed models are equivalent to some widely used credibility models. The main advantage of linear mixed models is the use of diagnostic methods. These methods may help to improve the model choice and to identify outliers or influential subjects which deserve better attention by the insurer. As an application example, the well known data set in Hachemeister (1975) is modeled by a linear mixed model. We can conclude that this approach is superior to the traditional credibility one since the former is more flexible and allows the use of diagnostic methods.

Keywords: Credibility models · Hachemeister model · Linear mixed models · Diagnostics · Local influence · Residual analysis.

Mathematics Subject Classification: 62J05 · 62J20.

1. Introduction

One of the main concerns in actuarial science is to predict the future behavior of the aggregate amount of claims of a certain contract based on its past experience. By accurately predicting the severity of the claims, the insurer is able to provide a fairer and thus more competitive premium.

Statistical analysis in actuarial science generally belongs to the class of repeated measures studies, where each subject may be observed more than once. By subject we mean each element of the observed set which we want to investigate. Workers of a company, classes of employees, and different states are possible examples of subjects in actuarial science. To model actuarial data, a large variety of statistical models can be used, but it is usually difficult to choose a model due to the data structure, in which within-subject correlation is often seen. Correlation misspecification may lead to erroneous analysis, and in some cases this error is very severe. A clear example may be seen in Demidenko (2004, pp. 2-3), and a similar artificial situation is reproduced in Figure 1, which shows the relation between the number of claims and the number of policy holders of an insurer for nine different regions within a country. In each region the two variables are measured once a year on the

    Corresponding author. Email: [email protected]

    ISSN: 0718-7912 (print)/ISSN: 0718-7920 (online)

© Chilean Statistical Society – Sociedad Chilena de Estadística, http://www.soche.cl/chjs


same day for three consecutive years. In Figure 1(a) we do not consider the within-region (within-subject) correlation. The dashed line is a simple linear regression and suggests that the more policy holders there are, the fewer claims occur. In Figure 1(b) we joined the observations for each region by a solid line. It is clear now that the number of claims increases with the number of policy holders.

[Figure 1 here: two panels of number of claims (500–600) versus number of policy holders (12–20 thousand).]

    Figure 1. (a) Not considering the within-subject correlation, (b) considering the within-subject correlation.

It is necessary to take into consideration that each region may have a particular behavior which should be modeled, but this alone is usually not enough. Techniques summarized under the name of diagnostic procedures may help to identify issues of concern, such as highly influential observations, which may distort the analysis. For linear homoskedastic models, a well known diagnostic procedure is the residual plot. For linear mixed models, better types of residuals are defined. Besides residual techniques, which are useful, there is a less used class of diagnostic procedures, which includes case deletion and measuring changes in the likelihood of the adjusted model under minor perturbations. Several important issues may not be noticed without the aid of these last diagnostic methods.

For introductory information regarding regression models and the respective diagnostic analysis, see Cook and Weisberg (1982) or Draper and Smith (1998). For a comprehensive introduction to linear mixed models, see Verbeke and Molenberghs (2000), McCulloch and Searle (2001) and Demidenko (2004). Diagnostic analysis of linear mixed models was presented and discussed in Beckman et al. (1987), Christensen and Pearson (1992), Hilden-Minton (1995), Lesaffre and Verbeke (1998), Banerjee and Frees (1997), Tan et al. (2001), Fung et al. (2002), Demidenko (2004), Demidenko and Stukel (2005), Zewotir and Galpin (2005), Gumedze et al. (2010) and Nobre and Singer (2007, 2011).

The seminal work of Frees et al. (1999) showed some similarities and equivalences between mixed models and some well known credibility models. Applications to data sets in an actuarial context may be seen in Antonio and Beirlant (2006). Our contribution is to show how to use diagnostic methods for linear mixed models applied to actuarial science. We illustrate how to identify outliers and influential observations and subjects. We also show how to use diagnostics as a tool for model selection. These methods are very important and usually overlooked by most actuaries.

This paper is divided as follows. In Section 2 we present a motivational example using a well known data set. In Section 3 we briefly present linear mixed models. Section 4 contains a short introduction to the diagnostic methods used in the example. In Section 5 we present an application based on the motivational example. Section 6 shows some conclusions. Finally, in an Appendix, we present mathematical details of some formulas and expressions used in the text.


2. Motivational Example

For a practical example, consider the Hachemeister (1975) data on private passenger bodily injury insurance. The data were collected from five states (subjects) in the US, through twelve trimesters between July 1970 and June 1973, and show the mean claim amount and the total number of claims in each trimester. The data may be found in the actuar package (see Dutang et al., 2008) for R (R Development Core Team, 2009) and are partially shown in Table 1.

Table 1. Hachemeister's data.

Trimester  State  Mean claim amount  Number of claims
1          1      1738               7861
1          2      1364               1622
1          3      1759               1147
1          4      1223               407
1          5      1456               2902
2          1      1642               9251
...        ...    ...                ...
12         1      2517               9077
12         2      1471               1861
12         3      2059               1121
12         4      1306               342
12         5      1690               3425

In Figure 2 we plot the individual profiles for each state and the mean profile. It suggests that the claims have a different behavior along the trimesters for each state. One may notice that the claims from state 1 are greater than those from the other states for almost every observation, and the claims from states 2 and 3 seem to grow more slowly than those from state 1. If the insurer wants to accurately predict the severity, the subjects' individual behavior must also be modeled. Traditionally this is possible with the aid of credibility models; see, e.g., Bühlmann (1967), Hachemeister (1975) and Dannenburg et al. (1996). These models assign weights, known as credibility factors, to a pair of different estimates of the severity.

[Figure 2 here: average claim amount (1000–2500) versus trimester (2–12), one profile per state (states 1–5) plus the average profile.]

    Figure 2. Individual profiles and mean profile for Hachemeister (1975) data.

    Credibility models may be functionally defined as

A = ZB + (1 − Z)C,


where A represents the severity in a given state, Z is a credibility factor restricted to [0, 1], B is an a priori estimate of the expected severity for the same state, and C is an a posteriori estimate, also of the expected severity. Considering a particular state, B may be equal to the sample mean of the severity of its observations and C equal to the overall sample mean of the data in the same period.

Frees et al. (1999) showed that it is possible to find linear mixed models equivalent to some known credibility models, such as the Bühlmann (1967) and Hachemeister (1975) models. Information about linear mixed models is provided in the next section.
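As a toy illustration (ours, not from the paper), the credibility blend above is a one-line convex combination; the function name and all numbers below are made up:

```python
# Illustrative sketch of the credibility estimate A = Z*B + (1 - Z)*C,
# where B is the state's own mean severity and C the collective mean.
# All numbers are invented for illustration.

def credibility_estimate(z, state_mean, overall_mean):
    """Blend a state's own experience with the collective experience."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("credibility factor Z must lie in [0, 1]")
    return z * state_mean + (1.0 - z) * overall_mean

# A credibility factor of 0.7 puts most of the weight on the state's own mean.
print(credibility_estimate(0.7, 1738.0, 1500.0))
```

With Z = 1 the insurer trusts only the state's own experience; with Z = 0, only the collective estimate.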

3. Linear Mixed Models

Linear mixed models are a popular alternative to analyze repeated measures. Such models may be functionally expressed as

y_i = X_i β + Z_i b_i + e_i,   i = 1, ..., k,   (1)

where y_i = (y_{i1}, y_{i2}, ..., y_{in_i})^T is an n_i × 1 vector of the observed values of the response variable for the ith subject, X_i is an n_i × p known full rank matrix, β is a p × 1 vector of unknown parameters, also known as fixed effects, which are used to model E[y_i], Z_i is an n_i × q known full rank matrix, b_i is a q × 1 vector of latent variables, also known as random effects, used to model the within-subject correlation structure, and e_i = (e_{i1}, e_{i2}, ..., e_{in_i})^T is the n_i × 1 random vector of (within-subject) measurement errors. It is usually also assumed that e_i ~ind N_{n_i}(0, σ²I_{n_i}), where I_{n_i} denotes the identity matrix of order n_i for i = 1, ..., k, that b_i ~iid N_q(0, σ²G) for i = 1, ..., k, in which G is a q × q positive definite matrix, and that e_i and b_j are independent for all i, j. Under these assumptions, this is called a homoskedastic conditional independence model. It is possible to rewrite the model given in Equation (1) in a more concise way as

y = Xβ + Zb + e,   (2)

where y = (y_1^T, ..., y_k^T)^T, X = (X_1^T, ..., X_k^T)^T, Z = ⊕_{i=1}^k Z_i, b = (b_1^T, ..., b_k^T)^T and e = (e_1^T, ..., e_k^T)^T, with ⊕ representing the direct sum.

It can be shown that, conditional on known covariance parameters of the model, that is, conditional on the elements of G and σ², the best linear unbiased estimator (BLUE) for β and the best linear unbiased predictor (BLUP) for b are given by

β̂ = (X^T V^{-1} X)^{-1} X^T V^{-1} y,   (3)

and

b̂ = D Z^T V^{-1} (y − X β̂),

respectively, where D = σ²G and V = σ²(I_n + Z G Z^T), with n = Σ_{i=1}^k n_i; see Hachemeister (1975).

Maximum likelihood (ML) and restricted maximum likelihood (RML) methods can be used to estimate the variance components of the model. The latter, proposed in Patterson and Thompson (1971), is usually chosen since it often generates less biased estimators of the variance structure. When estimates of V are used in Equation (3) to obtain β̂ and b̂, they are called the empirical BLUE (EBLUE) and the empirical BLUP (EBLUP), respectively. Usually the estimation of the parameters involves iterative methods for maximizing the likelihood function.
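The BLUE/BLUP formulas can be evaluated directly once the variance components are fixed. The following NumPy sketch (our illustration, not the authors' code; all data simulated) builds a small random-intercept design and applies Equation (3), assuming σ² and G known:

```python
# Sketch of the BLUE (Eq. 3) and BLUP with known variance components.
# Dimensions and data are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
k, n_i, q = 3, 4, 1                       # subjects, obs per subject, random effects
n = k * n_i

X = np.column_stack([np.ones(n), np.tile(np.arange(n_i), k)])  # intercept + time
Z = np.kron(np.eye(k), np.ones((n_i, q)))                      # random intercepts
sigma2, G = 1.0, np.array([[2.0]])

b_true = rng.normal(scale=np.sqrt(sigma2 * G[0, 0]), size=(k, q))
y = X @ np.array([10.0, 0.5]) + Z @ b_true.ravel() + rng.normal(size=n)

# V = sigma^2 (I_n + Z G Z^T) and D = sigma^2 G, as in the text
V = sigma2 * (np.eye(n) + Z @ np.kron(np.eye(k), G) @ Z.T)
D = sigma2 * np.kron(np.eye(k), G)
Vinv = np.linalg.inv(V)

beta_hat = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)   # BLUE
b_hat = D @ Z.T @ Vinv @ (y - X @ beta_hat)                  # BLUP
print(beta_hat, b_hat)
```

Note that the BLUE satisfies the generalized normal equations, so X^T V^{-1}(y − Xβ̂) = 0.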


Linear mixed models are not the only way to deal with repeated measures studies. Other popular alternatives are the generalized estimating equations (see Liang and Zeger, 1986; Diggle et al., 2002) and multivariate models, as seen in Johnson and Wichern (1982) and Vonesh and Chinchilli (1997). But usually these alternatives are more restrictive than linear mixed models, as they only model the marginal expected value of the response variable.

4. Diagnostic Methods

Diagnostic methods comprise techniques whose purpose is to investigate the plausibility and robustness of the assumptions made when choosing a model. It is possible to divide the techniques shown here into two classes: residual analysis, which investigates the assumptions on the distribution of the errors and the presence of outliers; and sensitivity analysis, which analyzes the sensitivity of a statistical model when subjected to minor perturbations. Usually, it would be far more difficult, or even impossible, to observe these aspects in a traditional credibility model.

In the context of traditional linear models (homoskedastic and independent), examples of diagnostic methods may be seen in Hoaglin and Welsch (1978), Belsley et al. (1980) and Cook (1982). Linear mixed models, extensions and generalizations are briefly discussed here and may be seen in Beckman et al. (1987), Christensen and Pearson (1992), Hilden-Minton (1995), Lesaffre and Verbeke (1998), Banerjee and Frees (1997), Tan et al. (2001), Fung et al. (2002), Demidenko (2004), Demidenko and Stukel (2005), Zewotir and Galpin (2005), Nobre and Singer (2007, 2011) and Gumedze et al. (2010).

4.1 Residual analysis

In the linear mixed models class, three different kinds of residuals may be considered: the conditional residuals, ê = y − Xβ̂ − Zb̂; the EBLUP, Zb̂; and the marginal residuals, ξ̂ = y − Xβ̂. These predict, respectively, the conditional error e = y − E[y|b] = y − Xβ − Zb, the random effects Zb = E[y|b] − E[y], and the marginal error ξ = y − E[y] = y − Xβ. Each of the mentioned residuals is useful to verify some assumption of the model, as seen in Nobre and Singer (2007) and briefly presented next.

4.1.1 Conditional residuals

To identify cases with a possible high influence on σ̂² in linear mixed models, Nobre and Singer (2007) suggested the standardization of the conditional residual given by

ê*_i = ê_i / (σ̂ √q_ii),

where q_ii represents the ith element in the main diagonal of the matrix Q, defined as

Q = σ²(V^{-1} − V^{-1} X (X^T V^{-1} X)^{-1} X^T V^{-1}).

Under normality assumptions on e, this standardization identifies outlier observations and subjects; see Nobre and Singer (2007). To do so, the same authors consider the quadratic form M_I = y^T Q U_I (U_I^T Q U_I)^{-1} U_I^T Q y, where U_I = (u_{ij})_{(n×k)} = (U_{i_1}, ..., U_{i_k}), with U_i representing the ith column of the identity matrix of order n. To identify an outlier subject, let I be the index set of the subject's observations and evaluate M_I for this subset.


Table 2. Diagnostic techniques involving residuals.

Diagnostic                                    Graph
Linearity of fixed effects                    ξ̂ vs. explanatory variables (fitted values)
Presence of outliers                          ê* vs. observation index
Homoskedasticity of the conditional errors    ê* vs. fitted values
Normality of the conditional errors           QQ plot for the least confounded residuals
Presence of outlier subjects                  Mahalanobis distance vs. observation index
Normality of the random effects               weighted QQ plot for b̂_i

4.1.2 Confounded residuals

It can be shown that, under the assumptions made by the model given in Equation (1), we have

ê = RQe + RQZb   and   Zb̂ = ZGZ^T QZb + ZGZ^T Qe,

where R = σ²I_n. These identities tell us that ê and Zb̂ depend on both b and e, and thus are called confounded residuals; see Hilden-Minton (1995). Verifying the normality of the conditional errors using only ê may be misleading because of the presence of b in the above formulas. Hilden-Minton (1995) defined the confounding fraction as the proportion of variability in ê due to the presence of b, and suggested the use of a linear transformation L such that Lê has the least confounding fraction possible. The suggested transformation also generates uncorrelated homoskedastic residuals. It is more appropriate to analyze the assumption of normality of the conditional errors using Lê instead of ê, as suggested by Hilden-Minton (1995) and verified by simulation in Nobre and Singer (2007).

4.1.3 EBLUP

The EBLUP is useful to identify outlier subjects, given that it represents the distance between the population mean value and the value predicted for the ith subject. A way of using the EBLUP to search for outlier subjects is the Mahalanobis distance (see Waternaux et al., 1989), ζ_i = b̂_i^T (Var[b̂_i − b_i])^{-1} b̂_i. It is also possible to use the EBLUP to verify the random effects normality assumption. For more information, see Nobre and Singer (2007). In Table 2 we summarize the diagnostic techniques involving residuals discussed in Nobre and Singer (2007).

4.2 Sensitivity analysis

Influence diagnostic techniques are used to detect observations that may exert excessive influence on the parameter estimates. There are two main approaches to such techniques: global influence, which is usually based on case deletion; and local influence, which introduces small perturbations in different components of the model.

In normal homoskedastic linear regression, examples of sensitivity measures are the Cook distance, DFFITS and COVRATIO; see Cook (1977), Belsley et al. (1980) and Chatterjee and Hadi (1986, 1988).

4.2.1 Global influence

A simple way to verify the influence of a group of observations on the parameter estimates is to remove the group and observe the changes in the estimates. The group of observations is influential if the changes are considerably large. However, in linear mixed models, it may not be practical to reestimate the parameters every time a set of observations is removed. To avoid doing so, Hilden-Minton (1995) presented an update formula for the


BLUE and BLUP. Let I = {i_1, ..., i_k} be the index set of the removed observations and U_I = (U_{i_1}, ..., U_{i_k}). Hilden-Minton (1995) showed that

β̂ − β̂_(I) = (X^T M X)^{-1} X^T M U_I ê_(I)   and   b̂ − b̂_(I) = D Z^T Q U_I ê_(I),

where the subscript (I) indicates that the estimates were obtained without the observations indexed by I, and ê_(I) = (U_I^T Q U_I)^{-1} U_I^T Q y.

A suggestion to measure the influence on the parameter estimates in linear mixed models is to use the Cook distance (see Cook, 1977), given by

D_I = (β̂ − β̂_(I))^T (X^T V^{-1} X)(β̂ − β̂_(I)) / c = (ŷ − ŷ_(I))^T V^{-1} (ŷ − ŷ_(I)) / c,

such as seen in Christensen and Pearson (1992) and Banerjee and Frees (1997), where c is a scale factor. However, it was pointed out by Tan et al. (2001) that D_I is not always able to measure the influence on the estimation properly in the mixed models class. The same authors suggest the use of a measure similar to the Cook distance, but conditional on the BLUP (b̂). The conditional Cook distance is defined for the ith observation as

D_i^cond = Σ_{j=1}^k P_{j(i)}^T Var[y|b]^{-1} P_{j(i)} / ((n − 1)k + p),   i = 1, ..., k,

where P_{j(i)} = ŷ_j − ŷ_{j(i)} = (X_j β̂ + Z_j b̂_j) − (X_j β̂_(i) + Z_j b̂_{j(i)}). The same authors decomposed D_i^cond = D_{i1}^cond + D_{i2}^cond + D_{i3}^cond and commented on the interpretation of each part of the decomposition: D_{i1}^cond is related to the influence on the fixed effects, D_{i2}^cond is related to the influence on the predicted values, and D_{i3}^cond to the covariance of the BLUE and the BLUP, which should be close to zero if the model is valid.

When all the observations from a subject are deleted, it is not possible to obtain the BLUP for the random effects of that subject, making it impossible to obtain D_I^cond as stated above. For this purpose, Nobre (2004) suggested using D̄_I^cond = (n_i)^{-1} Σ_{j∈I} D_j^cond, where I indexes the observations from a subject, as a way to measure the influence of a subject on the parameter estimates when its observations are deleted.
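As a case-deletion sketch (our illustration with simulated data, not the authors' code): holding V fixed, a Cook-type distance for single-observation deletion can be computed by refitting the GLS estimate of β with each observation removed in turn. Taking c equal to the number of fixed effects is an assumption made here for illustration:

```python
# Sketch: Cook-type distance D_i under single-observation deletion,
# refitting the GLS estimate of beta with V held fixed. Made-up data.
import numpy as np

rng = np.random.default_rng(2)
n = 10
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
V = np.eye(n)                                  # identity V for simplicity
y = X @ np.array([2.0, 1.0]) + rng.normal(size=n)
y[4] += 8.0                                    # plant one influential point

Vinv = np.linalg.inv(V)

def gls(X, Vinv, y):
    """Generalized least squares estimate of beta."""
    return np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)

beta_hat = gls(X, Vinv, y)
c = X.shape[1]                                 # illustrative choice of scale factor
D = np.empty(n)
for i in range(n):
    keep = np.delete(np.arange(n), i)
    beta_i = gls(X[keep], Vinv[np.ix_(keep, keep)], y[keep])
    d = beta_hat - beta_i
    D[i] = d @ (X.T @ Vinv @ X) @ d / c

print(np.argmax(D))  # the planted observation should stand out
```

In a real mixed-model analysis one would use the update formulas above instead of refitting, and the conditional version D_i^cond when interest lies in the BLUP as well.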

There are natural extensions of leverage measures for linear mixed models. These can be seen in Banerjee and Frees (1997), Fung et al. (2002), Demidenko (2004) and Nobre (2004). However, they only provide information about leverage regarding the fitted marginal values. This has two main limitations, as commented in Nobre and Singer (2011). First, we may be interested in detecting high-leverage within-subject observations. Second, in some cases the presence of high-leverage within-subject observations does not imply that the subject itself is detected as a high-leverage subject. Suggestions of how to evaluate the within-subject leverage may be seen in Demidenko and Stukel (2005) and Nobre and Singer (2011).

4.2.2 Local influence

The concept of local influence was proposed by Cook (1986) and consists in analyzing the sensitivity of a statistical model when subjected to small perturbations. It is suggested to use an influence measure called the likelihood displacement. Considering the model described in Equation (2), the log-likelihood function may be written as

L(θ) = Σ_{i=1}^k L_i(θ) = −(1/2) Σ_{i=1}^k [ ln|V_i| + (y_i − X_i β)^T V_i^{-1} (y_i − X_i β) ].


The likelihood displacement is defined as LD(ω) = 2{L(θ̂) − L(θ̂_ω)}, where ω is an l × 1 perturbation vector in an open set Ω ⊂ R^l; θ is the parameter vector of the model, including the covariance parameters; θ̂ is the ML estimate of θ; and θ̂_ω is the ML estimate when the model is perturbed. It is necessary to assume that ω_0 ∈ Ω exists such that L(θ|ω_0) = L(θ) for all θ, and that LD has its first and second derivatives in a neighborhood of (θ̂^T, ω_0^T)^T. Cook (1986) considered the surface in R^{l+1} formed by the influence function α(ω) = (ω^T, LD(ω))^T and the normal curvature in the vicinity of ω_0 in the direction of a vector d, denoted by C_d. In this case, the normal curvature is given by

C_d = 2|d^T H^T L̈^{-1} H d|,

where L̈ = ∂²L(θ)/∂θ∂θ^T and H = ∂²L(θ|ω)/∂θ∂ω^T, both evaluated at θ = θ̂ and ω = ω_0; see Cook (1986). It can be shown that C_d always lies between the minimum and maximum eigenvalues of the matrix F = H^T L̈^{-1} H, so d_max, the eigenvector associated with the largest eigenvalue, gives information about the direction that exhibits the greatest sensitivity of LD(ω) in a neighborhood of ω_0. Beckman et al. (1987) made some comments on the effectiveness of the local influence approach. Lesaffre and Verbeke (1998) and Nobre (2004) showed some examples of perturbation schemes in the linear mixed models context.

Perturbation scheme for the covariance matrix of the conditional errors. To verify the sensitivity of the model to the conditional homoskedasticity assumption, perturbations are inserted in the covariance matrix of the conditional errors. This can be done by considering Var[e] = σ²Λ(ω), where Λ(ω) = diag(ω), with ω = (ω_1, ..., ω_n)^T the perturbation vector. For this case we have ω_0 = 1_n. The log-likelihood function in this case is given by

L_ω = L(θ|ω) = −(1/2)[ ln|V(ω)| + (y − Xβ)^T V(ω)^{-1} (y − Xβ) ],

where V(ω) = ZDZ^T + σ²Λ(ω).

Perturbation scheme for the response. For the local influence approach, Beckman et al. (1987) proposed the perturbation scheme

y(ω) = y + sω,

where s represents a scale factor and ω is an n × 1 perturbation vector. For this scheme we have ω_0 = 0, with 0 representing the n × 1 null vector. In this case, the perturbed log-likelihood function is proportional to

L(θ|ω) ∝ −(1/2)(y + sω − Xβ)^T V^{-1} (y + sω − Xβ).

Perturbation scheme for the random effects covariance matrix. It is possible to assess the sensitivity of the model in relation to the random effects homoskedasticity assumption by perturbing the matrix G. Nobre (2004) suggested the use of Var[b_i] = ω_i G as a perturbation scheme. In this case, ω is a q × 1 vector and ω_0 = 1_q. The perturbed log-likelihood function is proportional to

L(θ|ω) ∝ −(1/2) Σ_{i=1}^k [ ln|V_i(ω)| + (y_i − X_i β)^T V_i(ω)^{-1} (y_i − X_i β) ].


Case-weight perturbation. Verbeke (1995) and Lesaffre and Verbeke (1998) suggested perturbing the log-likelihood function as

L(θ|ω) = Σ_{i=1}^k ω_i L_i(θ).

Such a perturbation scheme is appropriate for measuring the influence of the ith subject using the normal curvature in its direction, which is given by

C_i = 2|d_i^T H^T L̈^{-1} H d_i|,

where d_i is a vector whose entries are 1 in the ith coordinate and zero everywhere else. Verbeke (1995) showed that if C_i has a high value, then the ith subject has great influence on the value of θ̂. A threshold of twice the mean value of all the C_j helps to decide whether or not the observation is influential.

Lesaffre and Verbeke (1998) extracted from C_i some interpretable measures. In particular, they propose using ‖X̃_i X̃_i^T‖², ‖R̃_i‖² ‖Z̃_i Z̃_i^T‖², ‖I_{n_i} − R̃_i R̃_i^T‖² and ‖V_i^{-1}‖², where X̃_i = V_i^{-1/2} X_i, Z̃_i = V_i^{-1/2} Z_i and R̃_i = V_i^{-1/2} ê_i, to evaluate the influence of the ith subject on the model parameter estimates. The actual interpretation of each of these terms can be seen in the original paper.

4.2.3 Conform local influence

The C_d measure proposed by Cook (1986) is not invariant to scale re-parametrization. To obtain a similar standardized measure that is more comparable, Poon and Poon (1999) used the conform normal curvature instead of the normal curvature, given by

B_d(θ) = 2|d^T H^T L̈^{-1} H d| / ‖2 H^T L̈^{-1} H‖.

It can be shown that 0 ≤ B_d(θ) ≤ 1 for every direction d, and that B_d is invariant under conform scale re-parametrization. A re-parametrization is said to be conform if its Jacobian J is such that J^T J = tI_s, for some real t and integer s. They showed that if λ_1, ..., λ_l are the eigenvalues of the matrix F, with v_1, ..., v_l representing the respective normalized eigenvectors, then the value of the conform normal curvature in the direction v_i is equal to λ_i / √(Σ_{i=1}^l λ_i²), and that Σ_{i=1}^l B²_{v_i}(θ) = 1. If every eigenvector has the same conform normal curvature, its value is equal to 1/√l. Poon and Poon (1999) proposed to use this value as a reference to measure the intensity of the local influence of an eigenvector. It can also be shown that when d has the direction of d_max the conform normal curvature also attains its maximum. In this way, the normal curvature and the conform normal curvature are equivalent methods.
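A tiny numeric sketch of the normalization above (eigenvalues made up): the conform normal curvatures along the eigenvector directions are λ_i/√(Σλ_i²), so their squares sum to one:

```python
# Sketch: conform normal curvatures from the (made-up) eigenvalues of F,
# normalized so the squared curvatures over all eigenvector directions sum to 1.
import numpy as np

lams = np.array([4.0, 2.0, 1.0, 0.5])          # illustrative eigenvalues of F
B = lams / np.sqrt(np.sum(lams**2))            # B_{v_i} = lambda_i / sqrt(sum lambda^2)
print(np.round(B, 3), np.sum(B**2))
```

The largest entry of B corresponds to v_max, the direction of greatest local sensitivity.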

5. Application

According to Frees et al. (1999), random coefficient models are equivalent to the Hachemeister linear regression model, which is used for the example data in Hachemeister (1975). The random coefficient model for the data in Table 1 may be described as

y_ij = α_i + β_i j + e_ij,   i = 1, ..., 5,   j = 1, ..., 12,


where y_ij represents the average claim amount for state i in the jth trimester, α_i = α + a_i and β_i = β + b_i, with α and β fixed, and (a_i, b_i)^T ~ N_2(0, D), in which D is a 2 × 2 covariance matrix. Before adjusting the model to the data, we used R to apply the asymptotic likelihood ratio test described in Giampaoli and Singer (2009) to compare the suggested random coefficients model with a random intercept model. The p-value obtained from the test was 0.0514. It indicates that it may be enough to consider the random effect for the intercept only. This decision is also supported by the Bayesian information criterion (BIC), which is equal to 808.3 for the single random effect model and 811.6 for the model with two random effects. We could also use another set of tests, involving bootstrap, Monte Carlo and permutational methods, to investigate whether or not we should prefer the random intercept model. These tests may be seen in Crainiceanu and Ruppert (2004), Greven et al. (2008) and Fitzmaurice et al. (2007). However, this is far from our goals and is not discussed here. For the sake of simplicity, and based on the reasons presented, we shall use the random intercept model, which differs a little from the model proposed by Frees et al. (1999). Thus, the model to be adjusted for the data in this example is

y_ij = α_i + βj + ε_ij,   i = 1, ..., 5,   j = 1, ..., 12,   (4)

where α_i = α + a_i, and α and β are the same as defined before. Assume also that Var[ε_ij] = σ² and Var[a_i] = σ²_a.

The model parameter estimates were obtained by the RML method using the lmer() function from the lme4 package in R. The standard errors were obtained from SAS (SAS Institute Inc., 2004) using proc MIXED. The estimates are shown in Table 3.

    Table 3. Model parameter estimates.

    Parameter    α          β        σ²          σ²_a
    Estimate     1460.32    32.41    32981.53    73398.25
    SE           131.07     6.79     6347.17     24088.00
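    As an aside, the estimates in Table 3 determine how strongly each state's observed mean is shrunk toward the overall intercept. For a balanced random intercept model, the BLUP of a_i scales the state-level deviation by σ²_a/(σ²_a + σ²/n_i), which is exactly the classical credibility factor; this is the standard balanced-case expression rather than a formula quoted from the paper. A minimal sketch with the Table 3 estimates:

```python
# Credibility (shrinkage) factor for the BLUP of the random intercept a_i in a
# balanced random intercept model: Z = sigma_a^2 / (sigma_a^2 + sigma^2 / n_i).
# Variance estimates below are those reported in Table 3.

sigma2 = 32981.53    # conditional (within-state) error variance
sigma2_a = 73398.25  # random intercept variance
n_i = 12             # trimesters observed per state

Z = sigma2_a / (sigma2_a + sigma2 / n_i)
print(round(Z, 3))  # → 0.964
```

    A factor this close to 1 means each state's own experience dominates its conditional line, in line with the well separated lines seen in Figure 3.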

    Figure 3 shows the five conditional regression lines obtained from the linear mixed model given in Equation (4). The adjusted model clearly suggests that the claim amount is higher in state 1. It also suggests a similarity between the claim amounts from states 2 and 4. Besides that, we can expect a smaller risk from policies in state 5, since their observations are much closer to the respective adjusted conditional line. Further information is explored by the diagnostic analysis discussed next.

    Figure 3. Conditional regression lines. [Plot of aggregate claim amount (1000 to 2500) against observation (0 to 60), with one conditional line for each of states 1 to 5.]


    5.1 Diagnostic analysis

    The standardized residuals proposed by Nobre and Singer (2007) suggest that observation 4.7 (obtained from state 4 in the seventh trimester) may be considered an outlier, as shown in Figure 4(a). According to the QQ plot in Figure 4(b), it is reasonable to assume that the conditional errors are normally distributed. The Mahalanobis distance in Figure 4(c) was normalized to fit the interval [0, 1] and suggests that the first state may be an outlier. The measure MI proposed by Nobre and Singer (2007), shown in Figure 4(d) and also normalized, suggests that none of the states have outlier observations. The Mahalanobis distance should not be confounded with MI: the former is based on the EBLUP and the latter on the conditional errors, so they have different meanings. For both analyses, an observation is highlighted if it is greater than twice the mean of the measures.
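    The highlighting rule just described (flag any measure greater than twice the mean) and the normalization to [0, 1] are simple to reproduce. A sketch in plain Python; the distance values below are hypothetical, not the ones behind Figure 4:

```python
# Normalize a set of nonnegative diagnostic measures to [0, 1] and flag
# those exceeding twice the mean, as done in the plots of Figure 4.

def normalize(xs):
    """Scale nonnegative measures so the largest equals 1."""
    m = max(xs)
    return [x / m for x in xs]

def flagged(xs):
    """Indices of measures greater than twice the mean of all measures."""
    threshold = 2 * sum(xs) / len(xs)
    return [i for i, x in enumerate(xs) if x > threshold]

measures = [0.90, 0.15, 0.12, 0.30, 0.08]  # hypothetical Mahalanobis distances
print(normalize(measures))  # largest becomes 1.0
print(flagged(measures))    # → [0], i.e., the first state stands out
```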

    Figure 4. Residual analysis: (a) standardized residuals, (b) least confounded residuals, (c) EBLUP, (d) values for MI. [Panels (a), (c) and (d) plot the measure by state, with observation 4.7 flagged in (a); panel (b) is a normal QQ plot of the standardized least confounded residuals against N(0,1) quantiles.]

    The conditional Cook distance is shown in Figure 5. The distances were normalized for comparison. Figure 5(a) suggests that observation 4.7 is influential in the model estimates. The first term of the distance decomposition suggests that no observations were influential in the estimate of β, as shown in Figure 5(b). The second term of the decomposition suggests that observation 4.7 is potentially influential in the prediction of b, as seen in Figure 5(c). The last term, D_i3, is as close to zero as expected and is omitted.


    Figure 5. (a) Conditional Cook distance, (b) D_i1, (c) D_i2. [Each panel plots normalized values by state; observation 4.7 is flagged in panels (a) and (c).]

    Figure 6 shows the local influence analysis under three different perturbation schemes. The first, in Figure 6(a), is related to the conditional errors covariance matrix, as suggested in Beckman et al. (1987), and indicates that the observations from the fourth state, especially 4.7, are possibly influential on the homoskedasticity and independence assumption for the conditional errors. Notice that it is possible to explain the influence of observation 4.7 by analyzing Figure 2: this observation has a value considerably higher than the others from the same state. Figure 6(b) shows the perturbation scheme for the covariance matrix associated with the random effects, as in Nobre (2004); alternative perturbation schemes for this case can be seen in Beckman et al. (1987). This scheme suggests that all states are equally influential on the random effects covariance matrix estimate. Finally, there is evidence that the observations in the fourth state may not be well predicted by the model; see Figure 6(c).
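    In Cook's (1986) local influence framework, the d_max plotted in Figures 6(a) and 6(b) is the direction of maximum normal curvature, i.e., the dominant eigenvector of the curvature matrix assembled from H; the plots display the absolute values of its components. A hedged sketch using power iteration on a small symmetric stand-in matrix (the entries are illustrative, not computed from the claim data):

```python
# Direction of maximum normal curvature (d_max) as the dominant eigenvector of
# a symmetric curvature matrix B; the absolute values of its components are
# what plots like Figures 6(a)-(b) display. B below is an arbitrary stand-in.

def dmax(B, iters=200):
    """Power iteration for the dominant eigenvector of a symmetric matrix B."""
    n = len(B)
    v = [1.0 / n] * n
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [abs(x) for x in v]  # the sign of an eigenvector is arbitrary

B = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
print(dmax(B))  # unit-length; the largest component marks the most influential unit
```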

    After the diagnostics we proceed to a confirmatory analysis by removing the observations from states 1 and 4, one at a time and then both at the same time. The new estimates are shown in Table 4. For each parameter, we calculate the relative change in the estimated values, defined for a parameter θ as

    RC(θ) = |(θ̂ − θ̂_(i)) / θ̂| × 100%,

    where θ̂_(i) denotes the estimate obtained after removing the observations from state i.


    Figure 6. Perturbation schemes: (a) conditional covariance matrix, (b) random effects covariance matrix, (c) values for ‖I_{n_i} − R_i R_i′‖². [Panels (a) and (b) plot the absolute values of the d_max components by state, with observations 4.7, 4.1, 4.11, 4.12 and 1.2 flagged in (a).]

    Table 4. Estimates and relative changes (in %, in parentheses) for the parameters of the model given in Equation (4), with and without states 1 and 4.

    Situation                 α                β               σ²                 σ²_a
    Complete data             1460.32          32.41           32981.53           73398.25
    Without state 1           1408.63 (3.67)   25.26 (22.06)   34666.31 (5.11)    34335.64 (53.22)
    Without state 4           1530.94 (4.61)   33.50 (3.36)    24940.12 (24.38)   59214.50 (19.32)
    Without states 1 and 4    1485.56 (1.70)   24.32 (24.96)   24497.48 (25.72)   23707.07 (67.70)
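    The relative changes in Table 4 follow directly from the definition of RC. For instance, the 53.22% change in σ²_a after removing state 1, and the 24.38% change in σ² after removing state 4, can be reproduced as follows (values taken from Tables 3 and 4):

```python
# Relative change of a parameter estimate after deleting the observations of
# one state: RC = |(full_estimate - reduced_estimate) / full_estimate| * 100%.

def relative_change(full, reduced):
    return abs((full - reduced) / full) * 100

# sigma_a^2: complete data vs. without state 1 (Table 4)
print(round(relative_change(73398.25, 34335.64), 2))  # → 53.22

# sigma^2: complete data vs. without state 4 (Table 4)
print(round(relative_change(32981.53, 24940.12), 2))  # → 24.38
```

    Only the first of these exceeds the 40% threshold used in the text.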

    If all five states were equally influential, we would expect the RC value to lie around 1/5 = 20% after removing a state. If RC(θ) exceeds twice this value, that is, 40%, for some parameter, we consider that the state was potentially influential. It is possible to conclude that the observations from state 1 were influential in the random effect variance estimate σ²_a. From Figure 2, one can explain this influence by noticing that all the observations from state 1 had higher values compared to the other states. Notice that such influence was not detected in Figure 5(b), but was pointed out by the Mahalanobis distance in Figure 4(c). Removing state 1 from our analysis and running every diagnostic procedure again, we detect no excessive influence, and the only remaining issue is observation 4.7, which is still an outlier. From this result the model is validated and it is assumed to be robust and ready for use.


    6. Conclusions

    The use of linear mixed models in actuarial science should be encouraged, given their capability to model the within-subject correlation, their flexibility, and the availability of diagnostic tools. Insurers should not use a model without validating it first. For the specific example seen here, decision makers may consider a different approach for state 1. After removing the observations from state 1, there was a relative change of more than 50% in the random effect variance estimate, which reflects significantly on the premium estimate. Such an analysis would not be possible under the traditional credibility models approach. This illustrates how the model can be used to identify different sources of risk and can be used in portfolio management. Linear mixed models are also usually easier to understand and to present when compared to standard actuarial methods, such as the credibility models and the Bayesian approach for determining the fair premium. A natural extension of this work is to repeat the estimation and diagnostic procedures, adapting what is necessary, for generalized linear mixed models, which are also useful in actuarial science; some work has already been done in this area, see, e.g., Antonio and Beirlant (2006). It would also be interesting to continue the analysis of the example in Hachemeister (1975), applying the diagnostic procedures again when weights are introduced into the covariance matrix of the conditional residuals in the random coefficient models, and to evaluate the robustness of the linear mixed models equivalent to the other classic credibility models. Again, this care is justified because the fairest premium is more competitive in the market.

    Appendix

    We present here expressions for the matrix H and the derivatives used in the different perturbation schemes presented in Section 4.2.2. These calculations are taken from Nobre (2004) and are presented here to make this text more self-contained.

    Appendix A. Perturbation Scheme for the Covariance Matrix of the Conditional Errors

    Let H^(k) be the kth column of H and f the number of distinct components of the matrix D. Then

    H^(k) = (∂²L(θ|ω)/∂ω_k∂β′, ∂²L(θ|ω)/∂ω_k∂σ², ∂²L(θ|ω)/∂ω_k∂θ_1, …, ∂²L(θ|ω)/∂ω_k∂θ_f)′,

    where

    ∂²L(θ|ω)/∂ω_k∂β |_(θ=θ̂; ω=ω₀) = X′𝒟_k ê,

    ∂²L(θ|ω)/∂ω_k∂σ² |_(θ=θ̂; ω=ω₀) = −(1/2)[σ̂² tr(𝒟_k Z D̂ Z′) − 2 ê′𝒟_k V̂⁻¹ ê + 2 ê′𝒟_k ê],

    ∂²L(θ|ω)/∂ω_k∂θ_i |_(θ=θ̂; ω=ω₀) = −(1/2)[tr(𝒟_k Z Ḋ_i Z′) − 2 ê′𝒟_k Z Ḋ_i Z′ ê],

    with

    𝒟_k = ∂V⁻¹(ω)/∂ω_k |_(θ=θ̂; ω=ω₀) = −2 V^(k)(V^(k))′  and  Ḋ_i = ∂D/∂θ_i |_(θ=θ̂; ω=ω₀),

    and V^(k) representing the kth column of V̂⁻¹.

    Appendix B. Perturbation Scheme for the Response

    It can be shown that

    ∂²L(θ|ω)/∂ω∂β′ |_(θ=θ̂; ω=ω₀) = s V̂⁻¹ X,

    ∂²L(θ|ω)/∂ω∂σ² |_(θ=θ̂; ω=ω₀) = s V̂⁻¹ V̂⁻¹ ê,

    ∂²L(θ|ω)/∂ω∂θ_i |_(θ=θ̂; ω=ω₀) = s V̂⁻¹ Z Ḋ_i Z′ V̂⁻¹ ê,

    implying

    H = s [V̂⁻¹X, V̂⁻¹V̂⁻¹ê, V̂⁻¹ZḊ_1Z′V̂⁻¹ê, …, V̂⁻¹ZḊ_fZ′V̂⁻¹ê].

    Appendix C. Perturbation Scheme for the Random Effects Covariance Matrix

    For this scheme we have

    H^(k) = (∂²L(θ|ω)/∂ω_k∂β′, ∂²L(θ|ω)/∂ω_k∂σ², ∂²L(θ|ω)/∂ω_k∂θ_1, …, ∂²L(θ|ω)/∂ω_k∂θ_f)′.

    It can be shown that

    ∂²L(θ|ω_k)/∂ω_k∂β |_(θ=θ̂; ω=ω₀) = X_k′ V̂_k⁻¹ Z_k G Z_k′ V̂_k⁻¹ ê_k,

    ∂²L(θ|ω_k)/∂ω_k∂σ² |_(θ=θ̂; ω=ω₀) = tr(V̂_k⁻¹ Z_k G Z_k′) − 2 ê_k′ V̂_k⁻¹ Z_k G Z_k′ V̂_k⁻¹ V̂_k⁻¹ ê_k,

    ∂²L(θ|ω_k)/∂ω_k∂θ_i |_(θ=θ̂; ω=ω₀) = tr(V̂_k⁻¹ Z_k G Z_k′ V̂_k⁻¹ Z_k Ḋ_i Z_k′) − ê_k′ V̂_k⁻¹ Z_k G Z_k′ V̂_k⁻¹ Z_k Ḋ_i Z_k′ V̂_k⁻¹ ê_k.


    Acknowledgements

    We are grateful to Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq project # 564334/2008-1) and Fundação Cearense de Apoio ao Desenvolvimento Científico e Tecnológico (FUNCAP), Brazil, for partial financial support. We also thank an anonymous referee and the executive editor for their careful and constructive review.

    References

    Antonio, K., Beirlant, J., 2006. Actuarial statistics with generalized linear mixed models. Insurance: Mathematics and Economics, 75, 643-676.

    Banerjee, M., Frees, E.W., 1997. Influence diagnostics for linear longitudinal models. Journal of the American Statistical Association, 92, 999-1005.

    Beckman, R.J., Nachtsheim, C.J., Cook, R.D., 1987. Diagnostics for mixed-model analysis of variance. Technometrics, 29, 413-426.

    Belsley, D.A., Kuh, E., Welsch, R.E., 1980. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. John Wiley & Sons, New York.

    Bühlmann, H., 1967. Experience rating and credibility. ASTIN Bulletin, 4, 199-207.

    Chatterjee, S., Hadi, A.S., 1986. Influential observations, high leverage points, and outliers in linear regression (with discussion). Statistical Science, 1, 379-393.

    Chatterjee, S., Hadi, A.S., 1988. Sensitivity Analysis in Linear Regression. John Wiley & Sons, New York.

    Christensen, R., Pearson, L.M., 1992. Case-deletion diagnostics for mixed models. Technometrics, 34, 38-45.

    Cook, R.D., 1977. Detection of influential observation in linear regression. Technometrics, 19, 15-28.

    Cook, R.D., 1986. Assessment of local influence (with discussion). Journal of The Royal Statistical Society Series B - Statistical Methodology, 48, 117-131.

    Cook, R.D., Weisberg, S., 1982. Residuals and Influence in Regression. Chapman and Hall, London.

    Crainiceanu, C.M., Ruppert, D., 2004. Likelihood ratio tests in linear mixed models with one variance component. Journal of The Royal Statistical Society Series B - Statistical Methodology, 66, 165-185.

    Dannenburg, D.R., Kaas, R., Goovaerts, M.J., 1996. Practical Actuarial Credibility Models. Institute of Actuarial Science and Economics, University of Amsterdam, Amsterdam.

    Demidenko, E., 2004. Mixed Models - Theory and Applications. Wiley, New York.

    Demidenko, E., Stukel, T.A., 2005. Influence analysis for linear mixed-effects models. Statistics in Medicine, 24, 893-909.

    Diggle, P.J., Heagerty, P., Liang, K.Y., Zeger, S.L., 2002. Analysis of Longitudinal Data. Oxford Statistical Science Series, Oxford.

    Draper, N.R., Smith, H., 1998. Applied Regression Analysis. Wiley, New York.

    Dutang, C., Goulet, V., Pigeon, M., 2008. actuar: an R package for actuarial science. Journal of Statistical Software, 25, 1-37.

    Fitzmaurice, G.M., Lipsitz, S.R., Ibrahim, J.G., 2007. A note on permutation tests for variance components in multilevel generalized linear mixed models. Biometrics, 63, 942-946.

    Frees, E.W., Young, V.R., Luo, Y., 1999. A longitudinal data analysis interpretation of credibility models. Insurance: Mathematics and Economics, 24, 229-247.

    Fung, W.K., Zhu, Z.Y., Wei, B.C., He, X., 2002. Influence diagnostics and outlier tests for semiparametric mixed models. Journal of The Royal Statistical Society Series B - Statistical Methodology, 64, 565-579.


    Giampaoli, V., Singer, J.M., 2009. Restricted likelihood ratio testing for zero variance components in linear mixed models. Journal of Statistical Planning and Inference, 139, 1435-1448.

    Greven, S., Crainiceanu, C.M., Küchenhoff, H., Peters, A., 2008. Likelihood ratio tests for variance components in linear mixed models. Journal of Computational and Graphical Statistics, 17, 870-891.

    Gumedze, F.N., Welham, S.J., Gogel, B.J., Thompson, R., 2010. A variance shift model for detection of outliers in the linear mixed model. Computational Statistics and Data Analysis, 54, 2128-2144.

    Hachemeister, C.A., 1975. Credibility for regression models with application to trend. Proceedings of the Berkeley Actuarial Research Conference on Credibility, pp. 129-163.

    Henderson, C.R., 1975. Best linear unbiased estimation and prediction under a selection model. Biometrics, 31, 423-447.

    Hilden-Minton, J.A., 1995. Multilevel diagnostics for mixed and hierarchical linear models. Ph.D. Thesis. University of California, Los Angeles.

    Hoaglin, D.C., Welsch, R.E., 1978. The hat matrix in regression and ANOVA. The American Statistician, 32, 17-22.

    Johnson, R.A., Wichern, D.W., 1982. Applied Multivariate Statistical Analysis. Sixth edition. Prentice Hall, pp. 273-332.

    Lesaffre, E., Verbeke, G., 1998. Local influence in linear mixed models. Biometrics, 54, 570-582.

    Liang, K.Y., Zeger, S.L., 1986. Longitudinal data analysis using generalized linear models. Biometrika, 73, 13-22.

    McCulloch, C.E., Searle, S.R., 2001. Generalized, Linear, and Mixed Models. Wiley, New York.

    Nobre, J.S., 2004. Métodos de diagnóstico para modelos lineares mistos. Unpublished Master's Thesis (in Portuguese). IME/USP, São Paulo.

    Nobre, J.S., Singer, J.M., 2007. Residual analysis for linear mixed models. Biometrical Journal, 49, 863-875.

    Nobre, J.S., Singer, J.M., 2011. Leverage analysis for linear mixed models. Journal of Applied Statistics, 38, 1063-1072.

    Patterson, H.D., Thompson, R., 1971. Recovery of inter-block information when block sizes are unequal. Biometrika, 58, 545-554.

    Poon, W.Y., Poon, Y.S., 1999. Conformal normal curvature and assessment of local influence. Journal of The Royal Statistical Society Series B - Statistical Methodology, 61, 51-61.

    R Development Core Team, 2009. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.

    SAS Institute Inc., 2004. SAS 9.1.3 Help and Documentation. SAS Institute Inc., Cary, North Carolina.

    Tan, F.E.S., Ouwens, M.J.N., Berger, M.P.F., 2001. Detection of influential observations in longitudinal mixed effects regression models. The Statistician, 50, 271-284.

    Verbeke, G., 1995. The linear mixed model. A critical investigation in the context of longitudinal data analysis. Ph.D. Thesis. Catholic University of Leuven, Faculty of Science, Department of Mathematics, Leuven, Belgium.

    Verbeke, G., Molenberghs, G., 2000. Linear Mixed Models for Longitudinal Data. Springer, New York.

    Vonesh, E.F., Chinchilli, V.M., 1997. Linear and Nonlinear Models for the Analysis of Repeated Measurements. Marcel Dekker, New York.

    Waternaux, C., Laird, N.M., Ware, J.H., 1989. Methods for analysis of longitudinal data: blood-lead concentrations and cognitive development. Journal of the American Statistical Association, 84, 33-41.


    Wei, B.C., Hu, Y.Q., Fung, W.K., 1998. Generalized leverage and its applications. Scandinavian Journal of Statistics, 25, 25-37.

    Zewotir, T., Galpin, J.S., 2005. Influence diagnostics for linear mixed models. Journal of Data Science, 3, 153-177.