Power-Law Adjusted Failure-Time Models · 2012. 5. 23. · Power-Law Adjusted Failure-Time Models...

Power-Law Adjusted Failure-Time ModelsWilliam J. Reed

Abstract—A simple adjustment to parametric failure-timedistributions, which allows for much greater flexibility in theshape of the hazard-rate function, is considered. Analyticalexpressions for the distributions of the power-law adjustedWeibull, gamma, log-gamma, generalized gamma, lognormaland Pareto distributions are given. Most of these allow forbathtub shaped and other multi-modal forms of the hazardrate. The new distributions are fitted to real failure-time datawhich exhibit a multi-modal hazard-rate function and the fitsare compared.

Index Terms—survival analysis; bathtub hazard; acceleratedfailure time (AFT) regression; power-law distribution.

I. INTRODUCTIONParametric distributions play an important role in the

analysis of lifetime data especially in accelerated failure time(AFT) regression models. Generally speaking analysis basedon a parametric model will be more precise than that basedon a nonparametric or semi-parametric model, because it willhave fewer unknown parameters. However this is contingenton it being possible to find a suitable parametric model to fitthe data. Unfortunately for most of the common distributionsemployed there is very little flexibility in the shape of thehazard rate function. In particular none of the two-parameterdistributions customarily employed can be used to model abathtub-shaped hazard.

There are a number of three-parameter distributions whichallow a bathtub-shaped hazard including the exponentiatedWeibull [3], the generalized Weibull [4] and the generalizedgamma (see e.g. [1]) distributions. An addition to these wasproposed in a recent article by Reed [5]. This distribution,which is a special case of a double Pareto-lognormal distri-bution [6], can be characterised as the product of independentrandom variables, one with a lognormal distribution and theother with a power-law distribution on [0, 1]. For this reasonthe new distribution was called the lognormal-power functiondistribution. It can be thought of as an extension of thelognormal distribution.

In this article it is shown how any simple parametricfailure-time distribution can be extended in a similar wayto allow for much greater flexibility in its form, includingin most cases the possibility of bathtub shaped hazard-ratefunctions. Precisely, the failure time T is modelled as theproduct T d= T0U , where T0 follows the “simple” failure-time distribution and U follows the power-law distributionwith density λuλ−1 on [0, 1]. Alternatively this can beexpressed as T d= T0/V where V has a Pareto distribution,with density λ/vλ+1 on [1,∞).

As might be expected, it is not possible for every paramet-rically specified distribution (of T0) to obtain an analytical

Manuscript received March 9, 2012; revised March, 2012. This work wassupported in part by NSERC Grant OGP 7252.

W. J. Reed is emeritus professor at Department of Mathematics andStatistics, University of Victoria, PO Box 3060 STN CSC, Victoria, B.C.,Canada V8W 3R4 e-mail:[email protected]

expression for the resulting power-law modified density.However it turns out to be possible to do so for a numberof the more common failure-time distributions including thelognormal (Reed, 2011), exponential, Weibull, gamma, log-gamma, Pareto and generalized gamma distributions. Thesedistributions are considered in this article. In all cases, exceptthe lognormal and Pareto, the resulting power-function mod-ified densities can be expressed in terms of an incompletegamma function.

In Sec.2 the distribution theory associated with the power-law modification is presented, and in Sec.3 maximum likeli-hood estimation discussed. In Sec.4 the results of fitting thevarious power-law modified failure-time distributions to datawith a multi-modal shaped hazard rate, are presented.

II. THEORY

Let T0 be a random variable with a known continuousfailure-time distribution. The power-law modified form ofthis distribution can be represented by a random variable Twith

Td= T0U

where U , independent of T0, follows the power-law distri-bution with density λuλ−1 (λ > 0) on the interval [0, 1].Taking logarithms leads to

X = log(T )d= Z0 −

1

λE

where Z0 = log T0 (with survivor function and density S0(z)and f0(z), say) and E is a standard (unit mean) exponentialrandom variable. The survivor function for X can be foundas a convolution as follows:

SX(x) = P(Z0 − E/λ ≥ x)= P(E ≤ λ(Z0 − x))= E{P(E ≤ λ(Z0 − x))|Z0}= E{[1− e−λ(Z0−x)] I[Z0 − x > 0]}

=

� ∞

x[1− e−λ(z−x)]f0(z)dz

= S0(x)− eλx� ∞

xe−λzf0(z)dz (1)

where the expectation E is with respect to Z0 and I is aBernoulli indicator random variable. Upon integrating byparts one obtains

SX(x) = λeλx

� ∞

xe−λzS0(z)dz. (2)

From this, by differentiation and using (1), one obtains thecorresponding formula for the density of X

fX(x) = λeλx

� ∞

xe−λzf0(z)dz. (3)

Proceedings of the World Congress on Engineering 2012 Vol I WCE 2012, July 4 - 6, 2012, London, U.K.

ISBN: 978-988-19251-3-8 ISSN: 2078-0958 (Print); ISSN: 2078-0966 (Online)

WCE 2012

From (2) and (3) the survivor function and density of Tin terms of those of T0 (ST0(t) and fT0(t)) can be easilyobtained:

ST (t) = λtλ

� ∞

tu−λ−1ST0(u)du. (4)

fT (t) = λtλ−1

� ∞

tu−λfT0(u)du. (5)

We now consider power-law modified forms of some specificfailure-time distributions.

Weibull and exponential model. If T0 has a Weibulldistribution with hazard rate function hT0(t) = αβtβ−1, itssurvivor function and density are ST0(t) = exp(−αtβ) andfT0(t) = αβt

β−1 exp(−αtβ). The hazard rate is monotoneincreasing for β > 1 and monotone decreasing for β < 1.In the case β = 1 it is constant and the Weibull distributionreduces to an exponential distribution. The survivor functionand density for Z0 = log T0 are

S0(z) = exp(−αeβz) and f0(z) = αβ exp(βz−αeβz).

From (2) and (3), the survivor function and density ofX = log T , where T follows the power-law adjusted Weibulldistribution, are

SX(x) =λαλ/β

βeλx I(αeβx,−λ/β)

fX(x) = λαλ/β eλx I(αeβx, 1− λ/β)

where I is the incomplete gamma function

I(y,θ ) =

� ∞

yuθ−1e−udu. (6)

Note that although the ordinary gamma function can beexpressed as the integral Γ(θ) =

�∞0 u

θ−1e−udu only forθ > 0, the incomplete gamma function I(y,θ ) evaluated aty > 0 converges for all real θ. Thus SX(x) and fX(x) aboveare well-defined since αeβx > 0.

The survivor function, density and hazard-rate function forT are easily computed from the above as

ST (t) = SX(log t); fT (t) =1

tfX(log t); hT (t) =

fT (t)

ST (t)

Fig.1 (top row) illustrates three shapes that the hazard ratefunction of the power-law adjusted Weibull distribution canassume.

Gamma model. If T0 follows a gamma distribution withscale parameter θ−1 and shape parameter κ, then the densityand survivor function of Z0 = log T0 are

S0(z) =I(θez,κ)

Γ(κ)and f0(z) =

θκ

Γ(κ)exp(κz−θez)

From (2) and (3), the survivor function and density ofX = log T , where T follows the power-law adjusted gammadistribution, are

SX(x) =1

Γ(κ)

�I(θex,κ)− θλeλxI(θex,κ− λ)

�

fX(x) =λθλ

Γ(κ)eλxI(θex,κ− λ)

0.0 0.5 1.0 1.5 2.0

02

46

8

Time

Haz

ard

rate

0.0 0.5 1.0 1.5 2.0

02

46

8

Time

Haz

ard

rate

0.0 0.5 1.0 1.5 2.0

02

46

8

Time

Haz

ard

rate

0.0 0.5 1.0 1.5 2.0

02

46

8

Time

Haz

ard

rate

0.0 0.5 1.0 1.5 2.0

02

46

8

Time

Haz

ard

rate

0.0 0.5 1.0 1.5 2.0

02

46

8

Time

Haz

ard

rate

0.0 0.5 1.0 1.5 2.0

02

46

8

Time

Haz

ard

rate

0 1 2 3 4 5

02

46

8

Time

Haz

ard

rate

0.0 0.5 1.0 1.5 2.0

02

46

8

Time

Haz

ard

rate

0.0 0.5 1.0 1.5 2.0

02

46

8

Haz

ard

rate

0 1 2 3 4 5

02

46

8

Haz

ard

rate

0.0 0.5 1.0 1.5 2.0

02

46

8

Haz

ard

rate

Fig. 1. Some shapes of the hazard rate function for for various power-lawadjusted distributions. Top row: Weibull distribution with α = 1: (l.hand)β = 1 (exponential distribution) and λ = 0.02; (centre) β = 2 and λ = 2;r.hand β = 3 and λ = .02. Second row: gamma distribution with θ = 0.25:(l.hand) κ = .01 and λ = 1; (centre) κ = .01 and λ = 2.5; (r.hand)κ = .1 and λ = 7. Third row: log-gamma distribution with θ = 20:(l.hand) κ = 50 and λ = .01; (centre) κ = 10 and λ = .01; (r.hand)κ = 5 and λ = .5. Bottom row: Pareto distribution with τ0 = 1.5: (l.hand)α = 1 and λ = 0.1; (centre) α = 15 and λ = 2; (r.hand): α = 15 andλ = 0.2

Fig.1 (second row) illustrates some shapes that the hazardrate function of the power-law adjusted gamma distributioncan assume.

Log-gamma model. If Z0 = log T0 follows agamma distribution, so that T0 has density fT0(t) =θκ

Γ(κ) t−(θ+1)(log t)κ−1 with support on [1,∞) then from (2)

and (3), it is easy to show that the power-law adjusted randomvariable T has support on (0,∞) and that X = log T hassurvivor function and density

SX(x) =

1− eλx

�θ

θ+λ

�κif x ≤ 0

1Γ(κ)

�I(θx,κ )−

�θ

θ+λ

�κeλxI([θ + λ]x,κ )

�if x > 0

and

fX(x) =

λeλx

�θ

θ+λ

�κif x ≤ 0

λeλx�

θθ+λ

�κI([θ+λ]x,κ)

Γ(κ) if x > 0

Fig.1 (third row) illustrates some shapes that the hazard ratefunction of the power-law adjusted log-gamma distributioncan assume.

Pareto model. If T0 follows a Pareto distribution withsupport on (τ0,∞) and pdf fT0(t) = ατ0

�tτ0

�−(α+1)thereon,

one can show that the power-law adjusted form has supporton (0,∞) and (using (4)) that the survivor function of thepower-law adjusted form is

ST (t) =

1− αα+λ�

tτ0

�λif t ≤ τ0

λα+λ

�tτ0

�−αif t > τ0



WCE 2012

and using (5) that the corresponding pdf is

fT (t) =

αλα+λ

1τ0

�tτ0

�λ−1if t ≤ τ0

αλα+λ

1τ0

�tτ0

�−α−1if t > τ0

Fig.1 (bottom row) illustrates some shapes that the hazardrate function of the power-law adjusted Pareto distributioncan assume.

Lognormal model. Consider the case where Z0 = log T0follows a normal distribution with mean µ and variance σ2.Reed (2011) The power-law adjusted version of this distri-bution (the lognormal-power function or lNpf distribution)was considered in [5] where it is shown that the survivorfunction and density of X = log T , where T follows thelNpf distribution, are

SX(x) = φ

�x− µσ

��R

�x− µσ

�−R

�λσ +

x− µσ

��

andfX(x) = λφ

�x− µσ

�R

�λσ +

x− µσ

�

where R is Mills’ ratio of the complementary cumulativedistribution function (cdf) to the pdf of a standard normaldistribution:

R(z) =Φc(z)

φ(z).

Generalized gamma model. The three-parameter general-ized gamma distribution includes the Weibull, gamma andlognormal models as special or limiting cases. It has density

fT0(t) = αθκtακ−1 exp(−θtα)/Γ(κ)

With some work using (2) and (3), the survivor functionand density of X = log T , where T follows the power-lawadjusted gamma distribution, can be shown to be

SX(x) =1

Γ(κ)

�I(θeαx,κ)− θλ/αeλxI(θeαx,κ− λ/α)

�

fX(x) =λθλ/α

Γ(κ)eλxI(θeαx,κ− λ/α)

It should be noted that while the (unadjusted) log-gammaand Pareto distributions have support bounded away fromzero, their power law adjusted versions have support on[0,∞) as indeed occurs in all of the power law adjustedmodels discussed in this paper. Thus in these models thereare no problems with the range of support depending ona parameter, as occurs for example with the generalizedWeibull distribution.

III. PARAMETER ESTIMATION BY MAXIMUMLIKELIHOOD.

The parametric likelihood for much failure-time data isproportional to

n�

i=1

[fTi(ti)]δi [STi(ti)]

1−δi

where δi is an indicator variable with value 1 for an observedfailure time, and value 0 for a right-censored observation. Ifthere are no covariates and the failure times are considered

0 2000 4000 6000 8000 10000

0e+0

02e−0

44e−0

46e−0

48e−0

4

Time (# of cycles)

Smoo

thed

est

imat

ed h

azar

d ra

te

Fig. 2. Kernel smoothed non-parametric estimate of the hazard rate functionfor electrical appliances data. The Epanechnikov kernel with a bandwith of1500 was used. Note that the right-hand part (> 6000) of the estimatedhazard is unreliable, being based on only two observations.

to be identically distributed following a power-law adjusteddistribution with pdf and survivor function fT and ST , thenup to an additive constant the log-likelihood is

n�

i=1

δi log fT (ti) +n�

i=1

(1− δi) logST (ti)

which is the same asn�

i=1

δi log fX(log ti)+n�

i=1

(1− δi) logSX(log ti)−n�

i=1

log ti

Thus for each of the models discussed above an analyticalexpression for the log-likelihood can be obtained. This willneed to be maximized numerically to obtain maximumlikelihood estimates using an optimization routine such asoptim in R. For starting values one can use the MLEs of thetwo parameters of the unadjusted distribution and an arbitraryvalue (say 1) for λ.

Covariates ZT = (Z1, Z2, . . . , Zp) can be incorporated inan accelerated failure time (AFT) regression model:

log T = β0 + βTZ +X (7)

where X is a random variable with one of the power-law adjusted distributions of the previous section. Note thatfor all but the log-gamma these distributions can be re-parameterized in terms of a location parameter and two otherparameters. In these cases the intercept term β0 in (7) is notneeded (and indeed will result in a non-identifiable model ifit is included).

IV. AN EXAMPLE.

Electrical appliances. Lawless (p. 256) [2] presents data onthe numbers of cycles to failure for 60 electrical appliancesput on test. All of the sixty appliances eventually failed,the largest failure times being 6065 and 9701 cycles. Fig.2shows a kernel-smoothed non-parametric estimate of thehazard rate for these data. There is clearly a suggestion ofmulti-modality. To assess and compare the various power-law adjusted models discussed in the previous section eachwas fitted to these data. Maximization of the log-likelihood



WCE 2012

0 2000 4000 6000 8000 10000

0.00

000.

0004

0.00

080.

0012

Time(cycles)

Fitte

d ha

zard

rate

0 2000 4000 6000 8000 10000

0.00

000.

0004

0.00

080.

0012

Time(cycles)

Fitte

d ha

zard

rate

0 2000 4000 6000 8000 10000

0.00

000.

0004

0.00

080.

0012

Time(cycles)

Fitte

d ha

zard

rate

0 2000 4000 6000 8000 10000

0.00

000.

0004

0.00

080.

0012

Time(cycles)

Fitte

d ha

zard

rate

Fig. 3. Maximum likelihood estimates of various power-law adjusteddistributions for the electrical appliance data. They are (clockwise fromupper left) Weibull, log gamma, lognormal and Pareto.

was performed in R using the Nelder-Mead method in theroutine optim and in all cases required only a minute or twoof computation.

The values of the maximized log-likelihood and of theAkaike Information Criterion (AIC) for the power-law ad-justed forms of the two-parameter models are given inTable 1. In all cases, the improvement in fit obtained byincluding the power-law adjustment was highly significant(P

in evaluating the incomplete gamma functions which occurin the distributions discussed in this paper and so the extracomputation involved might not be too great.

REFERENCES[1] Cox, C., Chu, H., Schneider, M. & Muñoz, A. “Parametric survival

analysis and taxonomy of hazard functions for the generalized gammadistribution,”’ Statist. Med. 26, pp. 4352-4374, 2007

[2] Lawless, J. F. Statistical Models and Methods for Lifetime Data. NewYork: John Wiley and Sons. 1982

[3] Muldholkar, G. S. and D. K. Srivastava, “Exponentiated Weibull familyfor analyzing bathtub failure-rate data,”’ IEEE Trans. Rel., 42, pp. 299-302. 1993

[4] Muldholkar, G. S., Srivastava, D. K. & Kollia, G. D. “A Generalizationof the Weibull distribution with application to the analysis of survivaldata. J. Amer. Stat. Assoc.”’ 1996. 91, pp. 1575-1583.

[5] Reed, W. J. “A flexible parametric survival model which allows abathtub shaped hazard rate function”’. J. Appl. Stat. 38, 2011. pp. 1665-1680.

[6] Reed, W. J & Jorgensen, M. “The double Pareto-lognormal distribution- A new parametric model for size distributions,”’ Comm. Stats - Theory& Methods, 33, 2011. pp. 1733-1753



WCE 2012

Power-Law Adjusted Failure-Time Models · 2012. 5. 23. · Power-Law Adjusted Failure-Time Models...

Documents

Transcript of Power-Law Adjusted Failure-Time Models · 2012. 5. 23. · Power-Law Adjusted Failure-Time Models...