An optimal completion of the product limit estimator


Statistics & Probability Letters 76 (2006) 913–919

www.elsevier.com/locate/stapro


Zhiqiang Chen*,1, Eswar Phadia

Department of Mathematics, The William Paterson University of New Jersey, Wayne, NJ 07470, USA

Received 4 June 2003; accepted 7 September 2005

Available online 14 November 2005

Abstract

It is well known that the product limit estimator is undefined beyond the largest observation if it is censored. Some completion methods are suggested in the literature (see e.g. [Efron, 1967. The two sample problem with censored data. Proceedings of the 5th Berkeley Symposium] and [Gill, 1980. Censoring and stochastic integrals. Mathematical Centre Tract No. 124, Mathematisch Centrum, Amsterdam]). In this note, we propose a completion method that is optimal in the sense that the expected value of the integrated squared error loss function is minimized. This method yields an estimator that falls between the above two extremes and possesses the same large sample properties. New bounds for the biases are also derived for the above-mentioned cases.

© 2005 Elsevier B.V. All rights reserved.

MSC: primary 62C15; 62D05

Keywords: Bias bound; Censored data; Kaplan–Meier estimator; Loss function; Proportional hazard model

1. Introduction

The product limit (PL) estimator introduced by Kaplan and Meier (1958) has been used extensively in practice even though it is known that the estimator is undefined beyond the last observation (in the sense of ordered observations) if it happens to be a right censored observation. To overcome this shortfall, Efron (1967) suggested that it be defined as zero beyond the last observation, whether it is censored or not. He termed such an estimator self-consistent. Gill (1980), on the other hand, recommended that it be defined as the value of the estimator at the last observation. Some other completion methods have been suggested in the literature without adequate justification. In this note, we propose a completion method that is optimal in the sense that the expected value of the integrated squared error loss function is minimized. In our treatment, we use the exact formula for the kth moment derived by Phadia and Shao (1999). This approach yields an estimator that falls between the above two extremes but has the same large sample properties. New bounds for the biases are also derived. Although our focus is on the proportional hazard model, a general approach is adopted.

0167-7152/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.spl.2005.10.023

*Corresponding author. Tel.: +1 973 720 3382; fax: +1 973 720 3622. E-mail address: [email protected] (Z. Chen).

1 Partially supported by CFR, College of Science and Health, William Paterson University.


The organization of the paper is as follows. In Section 2 we introduce some notation and preliminaries. The main result is presented in Section 3. Bias bounds are then presented in Section 4.

2. Notations and preliminaries

Let $X_i$, $i = 1, 2, \ldots, n$, be an i.i.d. sample from $F$, and let $Y_i$, $i = 1, 2, \ldots, n$, i.i.d. from $G$, be the censoring variables, independent of the $X_i$. As in the context of survival analysis, we observe $Z_i = \min\{X_i, Y_i\}$ and $\delta_i = I(X_i \le Y_i)$, $i = 1, 2, \ldots, n$. We assume that both $F$ and $G$ are continuous and defined on the interval $[0, \infty)$. If we further assume that $\delta_i$ and $Z_i$ are independent, we have the proportional hazard model. Kaplan and Meier (1958) introduced the PL estimator for the survival function $S(t) = 1 - F(t)$, defined as

$$\hat S(t) = \prod_{i=1}^{n-1} \left(1 - \frac{\delta_{i:n}}{n - i + 1}\right)^{I(Z_{i:n} \le t)},$$

where $Z_{i:n}$ denotes the ith ordered observation among the $Z_i$'s and $\delta_{i:n}$ corresponds to $Z_{i:n}$. However, for $t > Z_{n:n}$ and $\delta_{n:n} = 0$, the estimator was undefined. Efron (1967) introduced the notion of self-consistency and suggested that it be defined as 0, while Gill (1980) defined it as

$$(1 - \delta_{n:n}) \prod_{i=1}^{n-1} \left(1 - \frac{\delta_{i:n}}{n - i + 1}\right)^{I(Z_{i:n} \le t)}, \quad \text{for } t > Z_{n:n}.$$

It is well known that $E\hat S_E(t) \le S(t) \le E\hat S_G(t)$, where $\hat S_E$ and $\hat S_G$ stand for Efron's and Gill's versions, respectively. Since the survival function is monotone, any sensible modification of the last term of the PL estimator will be of the form

$$\hat S_c(t) = c\,(1 - \delta_{n:n}) \prod_{i=1}^{n-1} \left(1 - \frac{\delta_{i:n}}{n - i + 1}\right)^{I(Z_{i:n} \le t)}, \quad \text{for } t > Z_{n:n},$$

where $c$ will be a quantity between 0 and 1. Clearly, the extreme values $c = 0$ and $c = 1$ yield Efron's and Gill's versions, respectively. The goal of this article is to suggest a completion which minimizes the mean squared error loss $E(\hat S - S)^2 = E(\hat F - F)^2$. But instead of point-wise consideration, we will consider the loss function $\int_0^\infty (\hat F(t) - F(t))^2\, dF(t)$ and try to determine a constant $c$ which minimizes

$$E\int_0^\infty (\hat F(t) - F(t))^2\, dF(t) = -\int_0^\infty E(\hat S - S)^2\, dS.$$

The results of Phadia and Shao (1999) on the exact formula for $E\hat S^k(t)$ will be used.
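The completed estimator $\hat S_c(t)$ defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `completed_pl`, its argument names, and the use of NumPy are not from the paper, and ties are ignored because $F$ and $G$ are assumed continuous.

```python
import numpy as np

def completed_pl(z, delta, t, c):
    """Product limit estimate of S(t) with completion constant c.

    z     : observed times Z_i = min(X_i, Y_i)
    delta : indicators, 1 if uncensored (X_i <= Y_i), else 0
    c     : value in [0, 1]; c = 0 gives Efron's version, c = 1 Gill's
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=float)
    order = np.argsort(z, kind="stable")
    z, delta = z[order], delta[order]
    n = z.size
    i = np.arange(1, n)                       # i = 1, ..., n-1
    factors = 1.0 - delta[:-1] / (n - i + 1)  # 1 - delta_{i:n} / (n - i + 1)
    # Product over the ordered observations Z_{i:n} <= t, i <= n-1.
    value = float(np.prod(factors[z[:-1] <= t]))
    if t > z[-1]:
        # Beyond the largest observation: multiply by c * (1 - delta_{n:n}).
        value *= c * (1.0 - delta[-1])
    return value
```

For example, with `z = [3, 1, 4, 2]` and `delta = [1, 1, 0, 0]` the largest observation is censored, so the value returned for `t = 5` scales linearly in `c` between Efron's 0 and Gill's last attained value.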

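For this purpose, let $H_1(t) = P(Z \le t, X > Y) = \int_0^t S\, dG$, $H_2(t) = P(Z \le t, X \le Y) = \int_0^t \bar G\, dF$, and $H(t) = P(Z \le t)$, where $\bar G = 1 - G$ and, similarly, $\bar H = 1 - H$. For positive integer $k$, define $\phi_j^{(k)}(t) = H_1(t) + ((n - j)/(n - j + 1))^k H_2(t)$. For $i = 1, 2, \ldots, n$, denote $Z_{i:n} = Z_i$, such that $Z_1 \le Z_2 \le \cdots \le Z_n$ are the ordered observations and $\delta_i$ corresponds to $Z_i$. Then Phadia and Shao (1999) showed that

$$E\hat S^k(t) = \sum_{i} \binom{n}{i} \bar H^{n-i}(t)\; i! \int I(z_i < t) \prod_{j=1}^{i} d\big(\phi_j^{(k)}(z_j)\big),$$

where the summation extends to $n - 1$ for Efron's version and to $n$ for Gill's version. They also used a linear approximation for the integral and obtained

$$i! \int I(z_i < t) \prod_{j=1}^{i} d\big(\phi_j^{(k)}(z_j)\big) \approx \prod_{j=1}^{i} \phi_j^{(k)}(t), \qquad (2.1)$$

yielding $E\hat S^k(t) \approx \sum_{i} \binom{n}{i} \bar H^{n-i}(t) \prod_{j=1}^{i} \phi_j^{(k)}(t)$.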

3. Optimal completion results

We are now ready to derive the main result.


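Theorem 1. The critical value of $c$ such that the corresponding completion minimizes $E\int_0^\infty (\hat F(t) - F(t))^2\, dF(t)$ is

$$c = \frac{\int_0^\infty S(t) \left[\int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big)\right] dS(t)}{\int_0^\infty \left[\int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(2)}(z_j)\big)\right] dS(t)}, \qquad (3.1)$$

which is approximately

$$\frac{\int_0^\infty S(t) \prod_{j=1}^{n} \phi_j^{(1)}(t)\, dS(t)}{\int_0^\infty \prod_{j=1}^{n} \phi_j^{(2)}(t)\, dS(t)}.$$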

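Proof. Differentiating the loss function with respect to $c$, we get

$$\frac{d}{dc}\left\{-\int_0^\infty E(\hat S(t) - S(t))^2\, dS(t)\right\} = \int_0^\infty 2S(t) \frac{d}{dc}\{E(\hat S(t))\}\, dS(t) - \int_0^\infty \frac{d}{dc}\{E(\hat S^2(t))\}\, dS(t).$$

Since

$$\int_0^\infty 2S(t) \frac{d}{dc}\{E(\hat S(t))\}\, dS(t) = 2\int_0^\infty S(t)\, n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big)\, dS(t)$$

and

$$\int_0^\infty \frac{d}{dc}\{E(\hat S^2(t))\}\, dS(t) = 2c\int_0^\infty n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(2)}(z_j)\big)\, dS(t),$$

we get the critical value

$$c = \frac{\int_0^\infty S(t) \left[\int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big)\right] dS(t)}{\int_0^\infty \left[\int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(2)}(z_j)\big)\right] dS(t)} > 0.$$

The approximate value is obtained trivially by making the substitution given in (2.1). □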

Since $E\int_0^\infty (\hat F_c(t) - F(t))^2\, dF(t) = -\int_0^\infty E(\hat S_c - S)^2\, dS$ is a quadratic function of $c$, we see from the above proof that, if $c \in (0, 1]$, then $\hat S_c$ is the optimal completion. However, if $c > 1$, Gill's version is optimal, because the constant $c$ has to lie between 0 and 1 for the completion to make sense. It is believed that the above critical value $c$ is between 0 and 1, but it is rather difficult to prove this in general. However, it is true for the proportional hazard model, as shown in Corollary 3. In any case, we define the completion to be optimal for values of $c$ in the interval $(0, 1]$.
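As an illustration of how the approximate formula in Theorem 1 might be evaluated once $F$ and $G$ are specified, the following Python sketch computes the ratio by a trapezoid rule with respect to $S$. The exponential lifetime and censoring distributions, the grid, and the names `stieltjes_trapezoid` and `approx_c` are assumptions made for this example only; they are not part of the paper.

```python
import numpy as np

def stieltjes_trapezoid(y, s):
    """Trapezoid approximation of the integral of y with respect to s."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(s)))

def approx_c(n, S, H1, H2):
    """Approximate critical value of Theorem 1 on a common increasing t-grid.

    S, H1, H2 are arrays holding S(t), H_1(t), H_2(t) at the grid points.
    """
    j = np.arange(1, n + 1)
    r = (n - j) / (n - j + 1.0)                      # (n-j)/(n-j+1)
    # phi_j^{(k)}(t) = H_1(t) + r_j^k H_2(t), one column per j.
    phi1 = H1[:, None] + r[None, :] * H2[:, None]
    phi2 = H1[:, None] + (r[None, :] ** 2) * H2[:, None]
    num = stieltjes_trapezoid(S * phi1.prod(axis=1), S)
    den = stieltjes_trapezoid(phi2.prod(axis=1), S)
    return num / den   # both integrals are negative (dS < 0); the ratio is positive

# Illustrative assumption: X ~ Exp(rate a), Y ~ Exp(rate b), independent.
a, b, n = 1.0, 0.5, 5
t = np.linspace(0.0, 40.0, 200_001)      # the tail beyond t = 40 is negligible here
S = np.exp(-a * t)
H = 1.0 - np.exp(-(a + b) * t)           # H(t) = P(Z <= t)
H1 = b / (a + b) * H                     # P(Z <= t, X > Y)
H2 = a / (a + b) * H                     # P(Z <= t, X <= Y)
c_approx = approx_c(n, S, H1, H2)
print(c_approx)
```

In this exponential example $H_1$ and $H_2$ are proportional to $H$, so the model is a proportional hazard one and the printed value can be compared with the closed form (3.2) of Theorem 2 at $\gamma = a/(a+b)$.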

Remark 1. Our optimal completion of the PL estimator and the two versions mentioned above differ only in the value beyond the last observation. In fact they are all very close and the differences tend to zero exponentially fast. This can be seen for instance as follows (see Phadia and Van Ryzin, 1980). With $E_i = \{Z_i \le t < Z_{i+1}\}$ and $Z_{n+1} = \infty$, we have

$$E\{|\hat S_E(t) - \hat S_c(t)|^2\} = E\{|\hat S_E(t) - \hat S_c(t)|^2 \mid E_n\} \cdot P(E_n).$$

The first factor on the right-hand side is finite and

$$P(E_n) = P(Z_n \le t) = (1 - \bar H(t))^n = \exp\left[-n \ln\big((1 - \bar H(t))^{-1}\big)\right] \longrightarrow 0 \quad \text{if } \bar H(t) > 0.$$

Thus all of their asymptotic properties should be the same. Similar results hold for $\hat S_G$.

Under the proportional hazard model, $Z$ and $\delta$ are independent and therefore $H_2(t) = P(Z \le t, \delta = 1) = P(\delta = 1)P(Z \le t) = \gamma H(t)$ and $H_1(t) = (1 - \gamma)H(t)$, where $\gamma = P(\delta = 1) = P(X \le Y)$. The next theorem gives a neat expression for $c$.

Theorem 2. Under the proportional hazard model, the critical value of $c$ in (3.1) reduces to

$$c = \frac{1}{2} \prod_{i=1}^{n} \left[\frac{i(i - \gamma)(i + \gamma)}{(i + 2\gamma)\,[(i - \gamma)^2 + \gamma(1 - \gamma)]}\right]. \qquad (3.2)$$

Proof. By substituting the values of $H_1(t)$ and $H_2(t)$, it is easy to see that the constant $c$ derived in Theorem 1 reduces to

$$c = \frac{\int_0^\infty S(t)H^n(t)\, dS(t)}{\int_0^\infty H^n(t)\, dS(t)} \prod_{j=1}^{n} \frac{(1 - \gamma) + ((n - j)/(n - j + 1))\gamma}{(1 - \gamma) + ((n - j)/(n - j + 1))^2\gamma}.$$


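To evaluate $\int_0^\infty H^n(t)\, dS(t)$ and $\int_0^\infty S(t)H^n(t)\, dS(t)$, note that $S\bar G = \bar H = 1 - H$ and, since $dH_2 = \bar G\, dF = \gamma\, dH$, that $dH = -(1/\gamma)\,\bar G\, dS$. Now using the technique of integration by parts, we have

$$\int_0^\infty H^n(t)\, dS(t) = -\int_0^\infty nS(t)H^{n-1}(t)\, dH(t) = \frac{n}{\gamma}\int_0^\infty H^{n-1}(t)S(t)\bar G(t)\, dS(t) = \frac{n}{\gamma}\left[\int_0^\infty H^{n-1}(t)\, dS(t) - \int_0^\infty H^n(t)\, dS(t)\right].$$

Therefore, by induction,

$$\int_0^\infty H^n(t)\, dS(t) = \frac{n}{n + \gamma}\int_0^\infty H^{n-1}(t)\, dS(t) = \frac{n}{n + \gamma} \cdot \frac{n - 1}{(n - 1) + \gamma}\int_0^\infty H^{n-2}(t)\, dS(t) = \frac{n!}{\prod_{i=1}^{n}(i + \gamma)}\int_0^\infty dS(t) = -\frac{n!}{\prod_{i=1}^{n}(i + \gamma)}.$$

Similarly,

$$\int_0^\infty S(t)H^n(t)\, dS(t) = -\frac{1}{2}\int_0^\infty S^2(t)\,nH^{n-1}(t)\, dH(t) = \frac{n}{2\gamma}\int_0^\infty S(t)H^{n-1}(t)S(t)\bar G(t)\, dS(t) = \frac{n}{2\gamma}\int_0^\infty S(t)\,[H^{n-1}(t) - H^n(t)]\, dS(t),$$

and by induction, we obtain

$$\int_0^\infty S(t)H^n(t)\, dS(t) = \frac{n}{n + 2\gamma}\int_0^\infty SH^{n-1}\, dS = \frac{n!}{\prod_{i=1}^{n}(i + 2\gamma)}\int_0^\infty S\, dS = -\frac{n!}{2\prod_{i=1}^{n}(i + 2\gamma)}.$$

Hence

$$\frac{\int_0^\infty S(t)H^n(t)\, dS(t)}{\int_0^\infty H^n(t)\, dS(t)} = \frac{1}{2}\prod_{i=1}^{n} \frac{i + \gamma}{i + 2\gamma}.$$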
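The two closed forms just obtained can be checked numerically. The sketch below is a sanity check rather than part of the proof. It relies on one assumption that is not stated in the paper: under the proportional hazard model the censoring survival function has the form $\bar G = S^\beta$, so that $\bar H = S\bar G = S^{1/\gamma}$ with $\gamma = 1/(1+\beta)$, and both integrals can be written in the variable $s = S(t)$. The function names and the midpoint rule are likewise illustrative choices.

```python
import math

def closed_forms(n, gamma):
    """Closed forms for int H^n dS and int S H^n dS over t in (0, inf)."""
    fact = math.factorial(n)
    i1 = -fact / math.prod(i + gamma for i in range(1, n + 1))
    i2 = -fact / (2 * math.prod(i + 2 * gamma for i in range(1, n + 1)))
    return i1, i2

def numeric_integrals(n, gamma, steps=200_000):
    """Midpoint rule in s = S(t), assuming H = 1 - s**(1/gamma)."""
    h = 1.0 / steps
    i1 = 0.0
    i2 = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        hn = (1.0 - s ** (1.0 / gamma)) ** n
        i1 += hn * h
        i2 += s * hn * h
    # As t runs from 0 to infinity, s runs from 1 down to 0: hence the minus signs.
    return -i1, -i2

if __name__ == "__main__":
    print(closed_forms(5, 0.6))
    print(numeric_integrals(5, 0.6))
```

The ratio of the two closed forms is the factor $\frac{1}{2}\prod_{i=1}^{n}(i+\gamma)/(i+2\gamma)$ used in the remainder of the proof.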

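Now we will simplify the remaining product:

$$\prod_{j=1}^{n} \frac{(1 - \gamma) + ((n - j)/(n - j + 1))\gamma}{(1 - \gamma) + ((n - j)/(n - j + 1))^2\gamma} = \prod_{j=1}^{n} \frac{(n - j + 1)^2(1 - \gamma) + (n - j + 1)(n - j)\gamma}{(n - j + 1)^2(1 - \gamma) + (n - j)^2\gamma}$$

$$= \prod_{i=1}^{n} \frac{i^2(1 - \gamma) + i(i - 1)\gamma}{i^2(1 - \gamma) + (i - 1)^2\gamma} = \prod_{i=1}^{n} \frac{i(i - \gamma)}{(i - \gamma)^2 + \gamma(1 - \gamma)},$$

where in the second equality above, we substituted $n - j + 1 = i$. Putting together these terms we get (3.2). □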

Thus, the optimality constant $c$ depends on the proportional hazard rate $\gamma = P(\delta = 1)$. Since $\delta$ is observable, $\gamma$, and hence $c$ and $\hat S_c$, can easily be estimated from the given data.
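A minimal sketch of this plug-in step follows. It evaluates (3.2) directly and, as an assumption made here rather than a prescription from the paper, estimates $\gamma$ by the sample proportion of uncensored observations. The names `optimal_c` and `estimate_c` are hypothetical.

```python
def optimal_c(n, gamma):
    """Evaluate the closed form (3.2) for sample size n and 0 <= gamma < 1."""
    if not 0.0 <= gamma < 1.0:
        # At gamma = 1 the i = 1 factor of (3.2) is 0/0 as written.
        raise ValueError("gamma must satisfy 0 <= gamma < 1")
    c = 0.5
    for i in range(1, n + 1):
        num = i * (i - gamma) * (i + gamma)
        den = (i + 2 * gamma) * ((i - gamma) ** 2 + gamma * (1 - gamma))
        c *= num / den
    return c

def estimate_c(delta):
    """Plug-in estimate of c from the censoring indicators delta_i.

    Assumption for illustration: gamma is estimated by the fraction of
    uncensored observations.
    """
    n = len(delta)
    gamma_hat = sum(delta) / n
    return optimal_c(n, gamma_hat)

if __name__ == "__main__":
    print(estimate_c([1, 0, 1, 0, 1, 1, 0, 1]))
```

When every observation is uncensored the largest observation is uncensored too, so no completion is needed. The sketch raises an error in that case instead of returning a value.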

Our next corollary shows that $c$ is within the $(0, 1)$ range as we expected, and therefore $\hat S_c$ is optimal.

Corollary 3. Under the proportional hazard model, the optimality constant $c$, a function of $\gamma$, satisfies

$$\frac{1}{4} \cdot \frac{4 - \gamma^2}{4 - \gamma^2 + 5\gamma(1 - \gamma)} < c < \frac{4 - \gamma^2}{4 - \gamma^2 + 5\gamma(1 - \gamma)},$$

and hence $c \in (0, 1)$.

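Proof. Separating the factors for $i = 1$ and $i = 2$ in (3.2), we have

$$c(\gamma) = \frac{(1 + \gamma)}{2(1 + 2\gamma)} \cdot \frac{2(2 - \gamma)(2 + \gamma)}{(2 + 2\gamma)(4 - 3\gamma)} \cdot \prod_{i=3}^{n} \frac{i(i - \gamma)(i + \gamma)}{(i + 2\gamma)\,[(i - \gamma)^2 + \gamma(1 - \gamma)]}$$

$$= \frac{4 - \gamma^2}{4 - \gamma^2 + 5\gamma(1 - \gamma)} \cdot \frac{1}{2}\prod_{i=3}^{n} \frac{i(i - \gamma)(i + \gamma)}{(i + 2\gamma)\,[(i - \gamma)^2 + \gamma(1 - \gamma)]} = \frac{4 - \gamma^2}{4 - \gamma^2 + 5\gamma(1 - \gamma)} \cdot f(\gamma), \text{ say.}$$

Denote

$$L(\gamma) \triangleq \frac{1}{2} \cdot \prod_{i=3}^{n} \frac{i(i - 2\gamma)}{(i - \gamma)^2 + \gamma(1 - \gamma)} \quad \text{and} \quad U(\gamma) \triangleq \frac{1}{2} \cdot \prod_{i=3}^{n} \frac{i(i + \gamma)}{(i + 2\gamma)(i - \gamma)}.$$

Then we have

$$L(\gamma) \le f(\gamma) = \frac{1}{2}\prod_{i=3}^{n} \frac{i(i - \gamma)(i + \gamma)}{(i + 2\gamma)\,[(i - \gamma)^2 + \gamma(1 - \gamma)]} \le U(\gamma).$$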

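Simple calculation shows that

$$[\ln 2U(\gamma)]' = \sum_{i=3}^{n} \left(\frac{1}{i + \gamma} - \frac{2}{i + 2\gamma} + \frac{1}{i - \gamma}\right) = \sum_{i=3}^{n} \frac{2\gamma(2i + \gamma)}{(i - \gamma)(i + \gamma)(i + 2\gamma)} > 0,$$

so $U(\gamma)$ is an increasing function of $\gamma$ when $\gamma \in (0, 1)$, and therefore

$$f(\gamma) \le U(\gamma) \le U(1) = \frac{1}{2}\prod_{i=3}^{n} \frac{i(i + 1)}{(i + 2)(i - 1)} = \frac{1}{2} \cdot \frac{n}{2} \cdot \frac{4}{n + 2} = \frac{n}{n + 2} < 1.$$

Similarly, $[\ln 2L(\gamma)]' < 0$, so $f(\gamma) \ge L(\gamma) \ge L(1) = n/(4(n - 1)) > \frac{1}{4}$. □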
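The sandwich $L(1) \le L(\gamma) \le f(\gamma) \le U(\gamma) \le U(1)$ used in this proof can be spot-checked numerically. The sketch below is a check on a grid, not a substitute for the argument. The function names mirror the symbols of the proof, while the helper `_half_product` and the grid are illustrative assumptions.

```python
def _half_product(n, term):
    """Return (1/2) * product of term(i) for i = 3, ..., n (1/2 if n < 3)."""
    value = 0.5
    for i in range(3, n + 1):
        value *= term(i)
    return value

def f(n, g):
    return _half_product(
        n,
        lambda i: i * (i - g) * (i + g)
        / ((i + 2 * g) * ((i - g) ** 2 + g * (1 - g))),
    )

def L(n, g):
    return _half_product(
        n, lambda i: i * (i - 2 * g) / ((i - g) ** 2 + g * (1 - g))
    )

def U(n, g):
    return _half_product(n, lambda i: i * (i + g) / ((i + 2 * g) * (i - g)))

if __name__ == "__main__":
    n = 10
    for g in (0.1, 0.5, 0.9):
        print(g, L(n, 1.0), L(n, g), f(n, g), U(n, g), U(n, 1.0))
```

At $\gamma = 1$ the products telescope, which is how the closed forms $U(1) = n/(n+2)$ and $L(1) = n/(4(n-1))$ in the proof arise.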

4. Bias bounds

Let $\hat S_E(t)$ and $\hat S_G(t)$ be as defined earlier, Efron's and Gill's completions, respectively, and let $\hat S_c(t)$ be the optimal completion considered in this article. Further, let $B(t) = E\hat S(t) - S(t)$ be the bias, with a suffix indicating the bias of a particular completion. Efron (1967) showed that $E\hat S_E(t) \le S(t)$, and Klein (1991) proved that $S(t) \le E\hat S_G(t)$. A special case of Zhou (1988) shows that $|B_G| \le \int_0^t H^n(x)\, dF(x)$. We now give a different bias bound.


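Proposition 4. For both Efron's and Gill's completion, the bound on the bias $B$ is

$$|B| < n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big) \approx \prod_{j=1}^{n} \left(H(t) - \frac{1}{j}H_2(t)\right),$$

and for the completion discussed in this article, the bound is

$$|B_c| \le \max\{c, 1 - c\}\, n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big) \approx \max\{c, 1 - c\} \prod_{j=1}^{n} \left(H(t) - \frac{1}{j}H_2(t)\right).$$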

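Proof. We have

$$0 < B_G = E\hat S_E(t) - S(t) + n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big) < n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big),$$

and

$$0 > B_E = E\hat S_E(t) - S(t) > -n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big).$$

Therefore, $|B| < n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big)$ for both Efron's and Gill's versions. Now

$$B_c = E\hat S_G(t) - S(t) + (c - 1)\, n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big) \ge -(1 - c)\, n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big),$$

and

$$B_c = E\hat S_E(t) - S(t) + c\, n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big) \le c\, n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big).$$

So, $|B_c| \le \max\{c, 1 - c\}\, n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big)$. □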

In the case of the proportional hazard model,

$$n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big) = H^n(t) \prod_{i=1}^{n} \left(1 - \frac{1}{i}\gamma\right)$$

(see Chen et al., 1982), therefore we have the following.

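Corollary 5. In the case of the proportional hazard model, the bounds on the biases reduce to

$$|B| < H^n(t) \prod_{i=1}^{n} \left(1 - \frac{1}{i}\gamma\right) \quad \text{and} \quad |B_c| \le \max\{c, 1 - c\}\, H^n(t) \prod_{i=1}^{n} \left(1 - \frac{1}{i}\gamma\right).$$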

The above bias bounds are easy to obtain. However, they are new to the best of our knowledge. To see how good these bounds are, we compare them with Zhou's (1988) result. In the case of the proportional hazard model, the earlier computation shows that $\int_0^t H^n(x)\, dF(x) \to n!/\prod_{i=1}^{n}(i + \gamma)$ as $t \to \infty$. Since $H^n(t)\prod_{i=1}^{n}(1 - (1/i)\gamma) \to (1/n!)\prod_{i=1}^{n}(i - \gamma)$ as $t \to \infty$, and clearly $(1/n!)\prod_{i=1}^{n}(i - \gamma) < n!/\prod_{i=1}^{n}(i + \gamma)$ when $\gamma \in (0, 1)$, the new bias bound given above is sharper for $t \to \infty$. The question whether the new bound is always better (that is, whether $\int_0^t H^n(x)\, dF(x) \ge n! \int I(z_n < t) \prod_{j=1}^{n} d\big(\phi_j^{(1)}(z_j)\big)$ for all $t > 0$) is still open.
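The limiting comparison above, together with the bounds of Corollary 5, can be evaluated with a short Python sketch. The function names are hypothetical, and the code only restates the formulas of this section under the proportional hazard model. It says nothing about the open question for finite $t$.

```python
import math

def new_bound_limit(n, gamma):
    """Limit of H^n(t) * prod(1 - gamma/i) as t -> infinity."""
    return math.prod(i - gamma for i in range(1, n + 1)) / math.factorial(n)

def zhou_bound_limit(n, gamma):
    """Limit of the integral of H^n dF over (0, t) as t -> infinity."""
    return math.factorial(n) / math.prod(i + gamma for i in range(1, n + 1))

def ph_bias_bound(h_t, n, gamma, c=None):
    """Corollary 5 bound at a point where H(t) = h_t.

    With c=None this is the bound for Efron's and Gill's versions;
    otherwise it is the bound for the completion with constant c.
    """
    base = h_t ** n * math.prod(1 - gamma / i for i in range(1, n + 1))
    return base if c is None else max(c, 1 - c) * base

if __name__ == "__main__":
    for gamma in (0.2, 0.5, 0.8):
        print(gamma, new_bound_limit(10, gamma), zhou_bound_limit(10, gamma))
```

The inequality between the two limits follows from $\prod_{i=1}^{n}(i^2 - \gamma^2) < (n!)^2$ for $\gamma \in (0, 1)$.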


References

Chen, Y.Y., Hollander, M., Langberg, N.A., 1982. Small-sample results for the Kaplan–Meier estimator. J. Amer. Statist. Assoc. 77, 141–144.

Efron, B., 1967. The two sample problem with censored data. Proceedings of the 5th Berkeley Symposium vol. 4, pp. 831–852.

Gill, R.D., 1980. Censoring and stochastic integrals. Mathematical Centre Tract No. 124. Mathematisch Centrum, Amsterdam.

Kaplan, E.L., Meier, P., 1958. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 53, 457–481.

Klein, J.P., 1991. Small sample moments of some estimators of the variance of the Kaplan–Meier and Nelson–Aalen estimators. Scand. J. Statist. 18, 333–340.

Phadia, E., Shao, Y., 1999. Exact moments of the product limit estimator. Statist. Probab. Lett. 41, 277–286.

Phadia, E.G., Van Ryzin, J., 1980. A note on convergence rates for the product limit estimator. Ann. Statist. 8, 673–678.

Zhou, M., 1988. Two sided bias bounds of the Kaplan–Meier estimator. Probab. Theory Related Fields 79, 165–173.
