Applied Mathematics
Vol.10 No.07(2019), Article ID:93786,16 pages
10.4236/am.2019.107038
Asymptotic Normality of the Nelson-Aalen and the Kaplan-Meier Estimators in Competing Risks
Didier Alain Njamen Njomen
Department of Mathematics and Computer Science, Faculty of Science, University of Maroua, Maroua, Cameroon
Copyright © 2019 by author(s) and Scientific Research Publishing Inc.
This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).
http://creativecommons.org/licenses/by/4.0/
Received: December 30, 2018; Accepted: July 19, 2019; Published: July 22, 2019
ABSTRACT
This paper studies the asymptotic normality of the Nelson-Aalen and the Kaplan-Meier estimators in a competing risks context in the presence of independent right-censoring. To prove our results, we use Rebolledo’s theorem, which makes it possible to apply the central limit theorem to certain types of martingales. From the results obtained, confidence bounds for the hazard and the survival functions are provided.
Keywords:
Censored Data, Right-Censoring, Counting Process, Competing Risks, Nelson-Aalen and Kaplan-Meier Estimators, Asymptotic Properties of Estimators, Confidence Bands
1. Introduction and Background
The model of competing risks has been widely studied in the literature; see, e.g., Heckman and Honoré [1], Commenges [2], Com-Nougué [3], Fine and Gray [4], Crowder [5], Fermanian [6], Latouche [7], Geffray [8], Belot [9], Njamen and Ngatchou ([10], [11]), Njamen ([12], [13]). In most approaches, the competing risks are assumed to be either all independent or all dependent. Here, the independent component of the potential risks constitutes an independent censoring variable while the other risks are kept as possibly dependent. This approach is used by Geffray [8]. Namely, we consider a population in which each subject is exposed to m mutually exclusive competing risks which may be dependent. For $j \in \{1, \ldots, m\}$, the failure time from the jth cause is a non-negative random variable (r.v.) $T_j$. The competing risks model postulates that only the smallest failure time is observable; it is given by the r.v. $T = \min(T_1, \ldots, T_m)$ with distribution function (d.f.) denoted by F. The cause of failure associated with T is then indicated by a r.v. $\eta$ which takes the value j if the failure is due to the jth cause for a $j \in \{1, \ldots, m\}$, i.e. $\eta = j$ if $T = T_j$. The following modeling technique is taken from Njamen and Ngatchou [10]: we assume that T is, in its turn, at risk of being independently right-censored by a non-negative r.v. C with d.f. G. Consequently, the observable r.v. are
$$(Z, \xi)$$
where $Z = \min(T, C)$ and $\xi = \eta\, \mathbb{1}_{\{T \le C\}}$, and where $\mathbb{1}_{\{\cdot\}}$ denotes the indicator function. As T and C are independent, the r.v. Z has d.f. H given by $1 - H = (1 - F)(1 - G)$. Let $\tau_H = \sup\{t : H(t) < 1\}$ denote the right-endpoint of H beyond which no observation is possible. The subdistribution functions pertaining to the different risks or causes of failure are defined for $j \in \{1, \ldots, m\}$ and $t \ge 0$ by
$$F^{(j)}(t) = P[T \le t, \eta = j]. \qquad (1)$$
When the independence of the different competing risks may not be assumed, the functions $F^{(j)}$ for $j \in \{1, \ldots, m\}$ are the basic estimable quantities.
The Kaplan-Meier estimator was developed for situations in which only one cause of failure and independent right-censoring are considered. Aalen and Johansen [14] were the first to extend the Kaplan-Meier estimator to several causes of failure in the presence of independent censoring. In the present situation, the d.f. F may be consistently estimated by the Kaplan-Meier estimator, denoted by $F_n$. For $j \in \{1, \ldots, m\}$, the subdistribution functions $F^{(j)}$ may be consistently estimated by means of the Aalen-Johansen estimators, denoted respectively by $F_n^{(j)}$. Indeed, when the process of the states occupied by an individual over time is a time-inhomogeneous Markov process, Aalen and Johansen [14] introduced an estimator of the transition probabilities between states in the presence of independent random right-censoring. The competing risks set-up corresponds to the case of a time-inhomogeneous Markov process with only one transient state and several absorbing states (that can be labeled $1, \ldots, m$). Aalen and Johansen [14] obtained the joint consistency of $F_n^{(j)}$ to $F^{(j)}$ for $j \in \{1, \ldots, m\}$, uniformly over fixed compact intervals $[0, \sigma]$ for $\sigma < \tau_H$. They also obtained the joint weak convergence of the processes $\sqrt{n}\,(F_n^{(j)} - F^{(j)})$ on fixed compact intervals $[0, \sigma]$ for $\sigma < \tau_H$.
The asymptotic properties of the Kaplan-Meier estimator of the distribution function have been studied by several authors (see Peterson [15], Andersen et al. [16], Shorack and Wellner [17], Breslow and Crowley [18]).
In this paper, in a region where there is at least one observation, we are interested in the asymptotic properties of the Nelson-Aalen and Kaplan-Meier nonparametric estimators of the cumulative hazard and survival functions, in the presence of independent right-censoring and in the competing risks setting laid out in Njamen and Ngatchou ([10], [11]).
The rest of the paper is organized as follows. Section 2 presents preliminary results and reminders used in the paper. Section 3 contains two limit laws: in Section 3.1, we give the limit law of the nonparametric Nelson-Aalen estimator for competing risks as defined in Njamen and Ngatchou [10] and Njamen [12]; in Section 3.2, we give the limit law of the nonparametric Kaplan-Meier estimator in competing risks as defined in Njamen and Ngatchou [10] and Njamen [13]. In Section 4, we give confidence bands, including the Hall-Wellner confidence bands and the Nair equal precision bands.
2. Preliminaries and Reminders
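For $t \ge 0$, we introduce the following subdistribution functions $H^{(0)}$ and $H^{(1)}$ of H by:
$$H^{(0)}(t) = P[Z \le t, \xi = 0]$$
and
$$H^{(1)}(t) = P[Z \le t, \xi \ne 0],$$
and for $j \in \{1, \ldots, m\}$
$$H^{(1,j)}(t) = P[Z \le t, \xi = j].$$
The relations $H^{(1)}(t) = \sum_{j=1}^{m} H^{(1,j)}(t)$ and $H(t) = H^{(0)}(t) + H^{(1)}(t)$ hold for $t \ge 0$ since the different risks are mutually exclusive. The relation $F(t) = \sum_{j=1}^{m} F^{(j)}(t)$ is also valid for $t \ge 0$. The relations that connect the observable distribution functions $H^{(0)}$, $H^{(1)}$ and $H^{(1,j)}$ to the unobservable distributions F, G and $F^{(j)}$ are given by:
$$H^{(0)}(t) = \int_0^t \bigl(1 - F(s)\bigr)\, dG(s), \qquad H^{(1)}(t) = \int_0^t \bigl(1 - G(s^-)\bigr)\, dF(s)$$
and
$$H^{(1,j)}(t) = \int_0^t \bigl(1 - G(s^-)\bigr)\, dF^{(j)}(s).$$
The cumulative hazard function of T and the partial cumulative hazard function of T related to cause j for $j \in \{1, \ldots, m\}$ are given for $t \ge 0$ respectively by the following expressions:
$$\Lambda(t) = \int_0^t \frac{dF(s)}{1 - F(s^-)} = \int_0^t \frac{dH^{(1)}(s)}{1 - H(s^-)} \qquad (2)$$
$$\Lambda^{(j)}(t) = \int_0^t \frac{dF^{(j)}(s)}{1 - F(s^-)} = \int_0^t \frac{dH^{(1,j)}(s)}{1 - H(s^-)} \qquad (3)$$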
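Let us now define estimators for the different quantities. Let $(Z_i, \xi_i)_{i = 1, \ldots, n}$ be n independent copies of the random vector $(Z, \xi)$. We define the empirical counterparts of $H^{(0)}$, $H^{(1)}$, $H^{(1,j)}$ and H, for $t \ge 0$, by:
$$H_n^{(0)}(t) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}_{\{Z_i \le t,\, \xi_i = 0\}}, \qquad H_n^{(1)}(t) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}_{\{Z_i \le t,\, \xi_i \ne 0\}},$$
$$H_n^{(1,j)}(t) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}_{\{Z_i \le t,\, \xi_i = j\}}, \qquad H_n(t) = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}_{\{Z_i \le t\}}.$$
The relations $H_n^{(1)}(t) = \sum_{j=1}^{m} H_n^{(1,j)}(t)$ and $H_n(t) = H_n^{(0)}(t) + H_n^{(1)}(t)$ are valid for $t \ge 0$. As T is independently randomly right-censored by C, a well-known estimator for F is the Kaplan-Meier estimator defined for $t \ge 0$ by:
$$1 - F_n(t) = \prod_{i:\, Z_i \le t} \left(1 - \frac{\mathbb{1}_{\{\xi_i \ne 0\}}}{n\,\bigl(1 - H_n^-(Z_i)\bigr)}\right)$$
where the left-continuous modification of any d.f. L is denoted by $L^-$. The Nelson-Aalen estimators of $\Lambda$ and of $\Lambda^{(j)}$ for $j \in \{1, \ldots, m\}$ respectively are defined for $t \ge 0$ by:
$$\Lambda_n(t) = \int_0^t \frac{dH_n^{(1)}}{1 - H_n^-} \qquad (4)$$
$$\Lambda_n^{(j)}(t) = \int_0^t \frac{dH_n^{(1,j)}}{1 - H_n^-} \qquad (5)$$
The Aalen-Johansen estimator for $F^{(j)}$ is defined for $t \ge 0$ by:
$$F_n^{(j)}(t) = \int_0^t \frac{1 - F_n^-}{1 - H_n^-}\, dH_n^{(1,j)}$$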
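For all $t \ge 0$, the following equalities hold:
$$1 - H_n(t) = \bigl(1 - F_n(t)\bigr)\bigl(1 - G_n(t)\bigr), \qquad F_n^{(j)}(t) = \int_0^t \frac{dH_n^{(1,j)}}{1 - G_n^-}$$
where $G_n$, the Kaplan-Meier estimator of G, is defined for $t \ge 0$ by:
$$1 - G_n(t) = \prod_{i:\, Z_i \le t} \left(1 - \frac{\mathbb{1}_{\{\xi_i = 0\}}}{n\,\bigl(1 - H_n^-(Z_i)\bigr)}\right)$$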
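The estimators of this section can be computed directly from the observed pairs (observed time, cause label). The following is a minimal sketch, assuming NumPy is available and that a cause label of 0 codes a censored observation; the function name `competing_risks_estimators` and the toy data are illustrative and do not come from the paper. Subjects censored at a time t are treated as still at risk at t, which is one common convention for handling ties.

```python
import numpy as np

def competing_risks_estimators(z, xi, cause):
    """Kaplan-Meier, cause-specific Nelson-Aalen and Aalen-Johansen estimates.

    z     : observed times min(T_i, C_i)
    xi    : cause labels, 0 = censored, j >= 1 = failure from cause j
    cause : the cause j of interest
    Returns the sorted distinct observed times and the three estimates there.
    """
    z = np.asarray(z, dtype=float)
    xi = np.asarray(xi, dtype=int)
    times = np.unique(z)  # sorted distinct observed times
    # Number of subjects still at risk just before each time
    at_risk = np.array([(z >= t).sum() for t in times])
    # Failures from any cause, and from the cause of interest, at each time
    d_all = np.array([((z == t) & (xi != 0)).sum() for t in times])
    d_j = np.array([((z == t) & (xi == cause)).sum() for t in times])

    km_surv = np.cumprod(1.0 - d_all / at_risk)       # estimate of 1 - F
    na_cause = np.cumsum(d_j / at_risk)               # cumulative hazard of cause j
    km_left = np.concatenate(([1.0], km_surv[:-1]))   # survival just before each time
    aj_cif = np.cumsum(km_left * d_j / at_risk)       # subdistribution of cause j
    return times, km_surv, na_cause, aj_cif

# Toy sample with two causes; one censored observation is tied with a failure at time 3.
z = [2, 3, 3, 5, 7, 8]
xi = [1, 0, 2, 1, 0, 2]
times, km, na1, aj1 = competing_risks_estimators(z, xi, cause=1)
_, _, _, aj2 = competing_risks_estimators(z, xi, cause=2)
print(times, km, na1, aj1, aj2)
```

On such a sample, the Aalen-Johansen estimates over all causes add up to one minus the Kaplan-Meier survival estimate, which is the empirical counterpart of the decomposition of F into its subdistribution functions.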
3. Results
In this section, we continue the work of Njamen and Ngatchou [10], Njamen [12] and Njamen and Ngatchou [11]. Njamen and Ngatchou ([10], p. 9) study the consistency of the nonparametric Nelson-Aalen estimator in competing risks, while Njamen ([12], pp. 11-12) studies its pointwise convergence and its uniform convergence in probability. Njamen and Ngatchou ([11], p. 13) study the bias and the uniform convergence of the nonparametric estimator of the survival function in a competing risks context, and show that this estimator is asymptotically unbiased. As in those papers, we use the martingale approach.
3.1. Limit Law of Nelson-Aalen’s Nonparametric Estimator for Competing Risks
In what follows, we study the asymptotic normality of the nonparametric Nelson-Aalen estimator in competing risks. To that end, for all and , we consider the Nelson-Aalen type estimator of the cumulative hazard function (Nelson [19]; Aalen [20]; Njamen and Ngatchou [10]) defined by
(6)
where .
The cumulative risk in a region where there is at least one observation is given for all , by (see Njamen, [12] . p. 9)
(7)
with which indicates whether the individual i is still at risk just before time t (the individual has not yet undergone the event). Its estimator was defined in Njamen and Ngatchou ( [10] , p. 7).
The following theorem gives the limit law of the Nelson-Aalen estimator in competing risks of Njamen ([12], p. 9). This is the first main result of this article.
Theorem 1.
In a region where there is at least one observation, it is assumed that for and . Then, for all ,
(8)
where is a centered Gaussian martingale of variance such that:
(9)
where for all ,
(10)
with standing for the distribution function of and the instantaneous hazard function.
To prove this theorem, we need Rebolledo's theorem, stated below, which makes it possible to apply the central limit theorem to certain types of martingales.
Theorem 2. (Rebolledo’s Theorem)
Let a sequence of martingales where , denotes a counting process and its compensator. Consider the processes , and for all ,. Suppose that and f are predictable and locally bounded processes such that
Suppose also that the processes are bounded. Set, for all ,. If
1) ;
2) for all ,.
Then,
where denotes weak convergence in the space of right-continuous functions with left-hand limits, endowed with the Skorokhod topology, and where W is a Brownian motion.
To prove Theorem 1, it is sufficient to check whether the previous conditions of Rebolledo’s Theorem are satisfied:
Proof. For all and , also decomposes into
which in turn can be written in terms of by
which finally, can be rewritten as
where can be seen as a random noise process. The martingale above represents the difference between the number of failures due to a specific cause j observed in the time interval , i.e. (see Njamen, [12] , p.6), and the number of failures predicted by the model for the jth cause. This definition fulfills the Doob-Meyer decomposition.
This martingale is used in Fleming and Harrington ( [21] , p. 26) and in Breuils ( [22] , p. 25).
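The decomposition of the counting process into a compensator plus a martingale can be checked numerically. The sketch below is a Monte Carlo illustration under assumptions that are not in the paper: two independent exponential latent failure times with constant hazards `lam_j` and `lam_other`, and an independent exponential censoring time with rate `mu`. In that setting the compensator of the cause-j counting process at time t is `lam_j` times the total time at risk on [0, t], and the difference should have mean zero and variance equal to the expected compensator.

```python
import numpy as np

def martingale_draws(t, n, lam_j, lam_other, mu, reps, seed=0):
    """Simulate 'reps' independent values of N_j(t) - A_j(t) for samples of size n."""
    rng = np.random.default_rng(seed)
    draws = np.empty(reps)
    for r in range(reps):
        t_j = rng.exponential(1.0 / lam_j, n)       # latent time for cause j
        t_o = rng.exponential(1.0 / lam_other, n)   # latent time for the other cause
        c = rng.exponential(1.0 / mu, n)            # censoring time
        z = np.minimum(np.minimum(t_j, t_o), c)     # observed time
        is_cause_j = (t_j <= t_o) & (t_j <= c)      # failure observed, due to cause j
        n_j = np.sum(is_cause_j & (z <= t))         # observed cause-j failures on [0, t]
        a_j = lam_j * np.sum(np.minimum(z, t))      # compensator: lam_j * time at risk
        draws[r] = n_j - a_j
    return draws

draws = martingale_draws(t=1.0, n=200, lam_j=0.5, lam_other=0.3, mu=0.2, reps=2000)
theta = 0.5 + 0.3 + 0.2
expected_var = 200 * 0.5 * (1.0 - np.exp(-theta * 1.0)) / theta  # expected compensator
print(draws.mean(), draws.var(), expected_var)
```

The empirical mean should be close to zero relative to the standard deviation of the draws, and the empirical variance close to `expected_var`.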
Now, to make the asymptotic nature of the results explicit, we set, for all ,, the following:
In a subgroup , where there is at least one observation, the survival function of is defined for all by:
Recall also that is the distribution function of , is that of ’s and that of the ’s. From the Glivenko-Cantelli theorem, one has:
(11)
Moreover,
one has:
from which one obtains (see Theorem 3, p. 11 of Njamen, [12] ),
Differentiating the martingale , one has:
and from
one obtains
Consequently, the increasing process (predictable variation) of
is given by
Next, for all and , one has
Also, for all and for all , the process
is a martingale. We apply the central limit theorem for martingales (Rebolledo’s Theorem). To this end, we show that the conditions of this theorem are satisfied by .
One has, for all ,
and also, by the proof of Theorem 3 of Njamen ([12], p. 11), we have:
So that, for all , when ,
which is deterministic. Thus, the first condition of Rebolledo's Theorem holds.
To check the second condition, for all and , define
where for all ,.
We have to show that as , converges to 0 in probability.
One has, for all ,
because
Then
Thus, the second condition of Rebolledo's Theorem holds.
The conditions of Rebolledo's Theorem are verified and consequently, for all ,
with .
Finally, for all ,
This ends the proof of Theorem 1.
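The asymptotic normality of the cause-specific Nelson-Aalen estimator can be illustrated by simulation. The sketch below rests on assumptions that are not in the paper: two independent exponential causes with rates `lam1` and `lam2`, independent exponential censoring with rate `mu`, and the classical counting-process limit variance, namely the integral over [0, t] of the cause-1 hazard divided by the probability of still being at risk. With constant rates this integral is `lam1 * (exp(theta * t) - 1) / theta`, where `theta = lam1 + lam2 + mu`. All function names are hypothetical.

```python
import numpy as np

def simulate_sample(n, lam1, lam2, mu, rng):
    """Draw n observed (time, cause) pairs; cause 0 codes a censored observation."""
    t1 = rng.exponential(1.0 / lam1, n)
    t2 = rng.exponential(1.0 / lam2, n)
    c = rng.exponential(1.0 / mu, n)
    t = np.minimum(t1, t2)
    z = np.minimum(t, c)
    xi = np.where(t <= c, np.where(t1 <= t2, 1, 2), 0)
    return z, xi

def nelson_aalen_at(z, xi, cause, t0):
    """Cause-specific Nelson-Aalen estimate at t0 (continuous data, no ties)."""
    order = np.argsort(z)
    z_sorted, xi_sorted = z[order], xi[order]
    at_risk = len(z) - np.arange(len(z))   # n, n-1, ..., 1 at the ordered times
    jumps = (xi_sorted == cause) & (z_sorted <= t0)
    return np.sum(jumps / at_risk)

def scaled_errors(n, reps, t0, lam1, lam2, mu, seed=0):
    """sqrt(n) * (estimate - true cumulative hazard of cause 1) over 'reps' samples."""
    rng = np.random.default_rng(seed)
    out = np.empty(reps)
    for r in range(reps):
        z, xi = simulate_sample(n, lam1, lam2, mu, rng)
        out[r] = np.sqrt(n) * (nelson_aalen_at(z, xi, 1, t0) - lam1 * t0)
    return out

def limit_variance(t0, lam1, lam2, mu):
    """Classical limit variance under the constant-rate assumptions above."""
    theta = lam1 + lam2 + mu
    return lam1 * (np.exp(theta * t0) - 1.0) / theta

errors = scaled_errors(n=400, reps=500, t0=1.0, lam1=0.5, lam2=0.3, mu=0.2)
print(errors.mean(), errors.var(), limit_variance(1.0, 0.5, 0.3, 0.2))
```

A histogram or normal quantile plot of `errors` gives a visual check of the Gaussian limit; the empirical variance should be close to the value returned by `limit_variance`.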
The following subsection gives the asymptotic law of the nonparametric Kaplan-Meier estimator of the survival function in competing risks of Njamen and Ngatchou ([10], p. 13).
3.2. Limit Law of Kaplan-Meier’s Nonparametric Estimator in Competing Risks
The Kaplan-Meier estimator of the survival function (Kaplan and Meier, [23] ) is defined by
where is the Nelson-Aalen estimator and where, for a right-continuous process with left-hand limits such that
For all , an estimator of the variance of , where is the survival function associated with the subgroup is given by
The variance of approximated by that of is:
(12)
The estimator of the corresponding variance of is given by
(13)
The following result concerns the asymptotic law of the nonparametric Kaplan-Meier estimator and constitutes the second main result of this paper:
Theorem 3.
In a region where there is at least one observation, assume that for all and ,
1) for all ,
2) for all ,
3) for all ,
Then, for all and , the nonparametric estimator satisfies
where is a centered Gaussian martingale and where denotes weak convergence in the space of right-continuous functions with left-hand limits, endowed with the Skorokhod topology.
Proof. To prove this theorem, it suffices to show that it satisfies the conditions of the Rebolledo Theorem.
In a region where there is at least one observation, we set, for all ,,
where .
For and , we have for all and ,
By the proof of Theorem 3 of Njamen ( [12] , p.11), we deduce that
Hence the first condition of Rebolledo’s Theorem holds.
For the second condition of Rebolledo’s Theorem, we argue as in the proof of Theorem 1 above, using condition 2); we find that for all ,
So, for each ,
where and where
Finally,
The fact that , for all , together with condition 3), implies:
As when , we deduce that:
It follows that:
This ends the proof of the theorem.
4. Confidence Bands of Survival Function
4.1. Confidence Intervals
For , we wish to find two random functions and such that ,
Recall from the previous sections that, for all , converges in distribution to a centered Gaussian martingale (see Theorem 3 above). As a consequence, is asymptotically Gaussian and centered on . Given the above results, the estimated standard deviation of , denoted , is given for all by:
(14)
Therefore a threshold confidence level can be built for all and , by:
(15)
Here is the percentile of a standard normal distribution.
A threshold confidence interval can also be obtained for all , by:
(16)
where is the fractile of rank of the standard normal distribution.
A drawback of building the confidence interval (CI) with the previous formula is that its bounds may fall outside the interval $[0, 1]$. A solution is to consider a transformation via a continuous, differentiable and invertible function g such that belongs to a wider, ideally unbounded, space and is better approximated by a Gaussian random variable. The delta method then allows for the estimation of the standard deviation of the object thus created, defined by . The confidence interval associated with the risk threshold is built as for all ,
The most common transformation is $g(x) = \log(-\log(x))$, and in this case we have: for all ,
Remark 1. It is also possible to use the log, square-root or logit-type transformations available in most software, defined respectively by, for all ,
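The effect of a transformation on pointwise confidence limits can be seen on a small example. The sketch below assumes the usual Greenwood variance estimate for the Kaplan-Meier survival estimate and the log(-log) transformation with the delta method; it works with a single all-cause event indicator rather than the subgroup notation of the paper, and the function name `km_pointwise_ci` and the toy data are illustrative.

```python
import numpy as np
from statistics import NormalDist

def km_pointwise_ci(z, event, alpha=0.05):
    """Kaplan-Meier estimate with plain and log-log pointwise confidence limits."""
    z = np.asarray(z, dtype=float)
    event = np.asarray(event, dtype=bool)
    times = np.unique(z[event])                        # distinct failure times
    at_risk = np.array([(z >= t).sum() for t in times])
    deaths = np.array([((z == t) & event).sum() for t in times])
    # Greenwood's sum is undefined once the estimate reaches 0 (only possible
    # at the last failure time), so that time point is dropped.
    keep = at_risk > deaths
    times, at_risk, deaths = times[keep], at_risk[keep], deaths[keep]

    surv = np.cumprod(1.0 - deaths / at_risk)
    gw = np.cumsum(deaths / (at_risk * (at_risk - deaths)))   # Greenwood's sum
    q = NormalDist().inv_cdf(1.0 - alpha / 2.0)               # normal quantile

    # Plain interval: estimate +/- q * estimated standard deviation
    half = q * surv * np.sqrt(gw)
    plain = (surv - half, surv + half)

    # log(-log) interval: delta-method standard error on the transformed scale,
    # then back-transformation, which keeps the limits inside (0, 1)
    se_g = np.sqrt(gw) / np.abs(np.log(surv))
    loglog = (surv ** np.exp(q * se_g), surv ** np.exp(-q * se_g))
    return times, surv, plain, loglog

z = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
event = [1, 1, 0, 1, 1, 0, 1, 1, 0, 0]
times, surv, plain, loglog = km_pointwise_ci(z, event)
print(np.column_stack([times, surv, plain[0], plain[1], loglog[0], loglog[1]]))
```

With only ten observations, the plain upper limit exceeds 1 at the first failure times, whereas the back-transformed limits stay strictly between 0 and 1.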
4.2. The Confidence Bands
The challenge now is to find a region containing the survival function with probability , or a set of bounds and which, with probability , contains for all and . Among the proposed solutions, the two most commonly used are the Hall and Wellner [24] bands and the Nair [25] bands (“equal precision bands”). If is the maximum event time observed in the sample, then for the Nair bands we have the following restrictions , whereas the Hall-Wellner bands allow to be zero, i.e. . Obtaining these bands is technically complex, and their practical usefulness relative to the pointwise intervals is not obvious.
Remark 2. The starting point is the fact that, for all , converges to a centered Gaussian martingale. A transformation then brings out a Brownian bridge , weighted by in Nair's case, from which the suitable critical value is retrieved.
In particular, because of their simultaneous nature, for a given t the bands are wider than the corresponding pointwise CI. In what follows we give the expressions obtained in the absence of transformation.
4.2.1. The Hall-Wellner Confidence Bands
Under the assumption of continuity of the survival functions and related respectively to the event time and the censoring time, Hall and Wellner show that for every , the simultaneous CI at risk threshold is given for all and by:
(17)
where and are given by
and is the bound satisfying
4.2.2. The Nair Equal Precision Bands
Using a weighted Brownian bridge notably modifies the bounds of the CI. For , and all , they are then given by:
(18)
where satisfies
If we compare (12) and (14), we see that the bounds of the Nair [25] bands are proportional to the bounds of the pointwise CI and simply correspond to an adjustment of the risk threshold used for the latter.
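The two kinds of bands can be sketched in their untransformed form as follows. This is an illustration under stated assumptions, not the paper's exact expressions: it uses the textbook forms in which the Hall-Wellner half-width is the critical value times (1 + n times Greenwood's sum) times the survival estimate divided by the square root of n, and the equal precision half-width is the critical value times the estimated standard deviation. The critical values depend on the range over which the band is built and are normally read from tables, so they are passed as arguments. The value 1.358 used below is the 95% quantile of the supremum of the absolute value of a Brownian bridge over the whole unit interval, and 3.0 is a purely illustrative placeholder.

```python
import numpy as np

def km_and_greenwood(z, event):
    """Kaplan-Meier estimate and Greenwood's sum at the distinct failure times."""
    z = np.asarray(z, dtype=float)
    event = np.asarray(event, dtype=bool)
    times = np.unique(z[event])
    at_risk = np.array([(z >= t).sum() for t in times])
    deaths = np.array([((z == t) & event).sum() for t in times])
    keep = at_risk > deaths                 # drop a final time where the estimate is 0
    times, at_risk, deaths = times[keep], at_risk[keep], deaths[keep]
    surv = np.cumprod(1.0 - deaths / at_risk)
    gw = np.cumsum(deaths / (at_risk * (at_risk - deaths)))
    return times, surv, gw

def hall_wellner_band(surv, gw, n, crit):
    """Untransformed Hall-Wellner band for a user-supplied critical value."""
    half = crit * (1.0 + n * gw) * surv / np.sqrt(n)
    return surv - half, surv + half

def equal_precision_band(surv, gw, crit):
    """Untransformed equal precision band for a user-supplied critical value."""
    half = crit * np.sqrt(gw) * surv
    return surv - half, surv + half

z = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
event = [1, 1, 0, 1, 1, 0, 1, 1, 0, 0]
times, surv, gw = km_and_greenwood(z, event)
hw_lo, hw_hi = hall_wellner_band(surv, gw, n=len(z), crit=1.358)
ep_lo, ep_hi = equal_precision_band(surv, gw, crit=3.0)  # placeholder critical value
print(np.column_stack([times, surv, hw_lo, hw_hi, ep_lo, ep_hi]))
```

The equal precision half-width is a constant multiple of the pointwise CI half-width at every time point, which is the proportionality noted above; the untransformed limits may leave the unit interval and are usually truncated or transformed in practice.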
5. Conclusions and Perspectives
In this paper we have studied the asymptotic normality of the Nelson-Aalen and Kaplan-Meier type estimators in the presence of independent right-censoring, as defined in Njamen and Ngatchou ([10], [11]) and Njamen [12], using Rebolledo’s theorem, which allows the central limit theorem to be applied to certain types of martingales. From the results obtained, confidence bounds for the hazard and the survival functions are provided.
As a perspective, obtaining real data would allow us to carry out numerical studies to gauge the robustness of the estimators obtained.
Acknowledgements
We thank the editor and the referees for their comments, which considerably improved this article.
Conflicts of Interest
The author declares no conflicts of interest regarding the publication of this paper.
Cite this paper
Njomen, D.A.N. (2019) Asymptotic Normality of the Nelson-Aalen and the Kaplan-Meier Estimators in Competing Risks. Applied Mathematics, 10, 545-560. https://doi.org/10.4236/am.2019.107038
References
- 1. Heckman, J.J. and Honoré, B.E. (1989) The Identifiability of the Competing Risks Models. Biometrika, 76, 325-330. https://www.jstor.org/stable/2336666 https://doi.org/10.1093/biomet/76.2.325
- 2. Commenges, D. (2017) Risques compétitifs et modèles multi-états en épidémiologie. Revue d’épidémiologie et de santé publique Elsevier Masson, 77, 605-611.
- 3. Com-Nougué, C., Guérin, S. and Rey, A. (1999) Estimation des risques associés à des événements multiples. Revue d’épidémiologie et de Santé Publique, 47, 75-85.
- 4. Fine, J.P. and Gray, R.J. (1999) A Proportional Hazards Model for the Subdistribution of a Competing Risk. Journal of the American Statistical Association, 94, 496-509. https://www.jstor.org/stable/2670170 https://doi.org/10.1080/01621459.1999.10474144
- 5. Crowder, M. (2001) Classical Competing Risks. Chapman and Hall, London.
- 6. Fermanian, J.D. (2003) Nonparametric Estimation of Competing Risks Models with Covariates. Journal of Multivariate Analysis, 85, 156-191. https://doi.org/10.1016/S0047-259X(02)00069-6
- 7. Latouche, A. (2004) Modèles de régression en présence de compétition. Thèse de doctorat, Université Paris 6, Paris. https://tel.archives-ouvertes.fr/tel-00129238
- 8. Geffray, S. (2009) Strong Approximations for Dependent Competing Risks with Independent Censoring with Statistical Applications. Test, 18, 76-95. https://doi.org/10.1007/s11749-008-0113-y
- 9. Belot, A. (2009) Modélisation flexible des données de survie en présence de risques concurrents et apports de la méthode du taux en excès. Thèse de doctorat, Université de la Méditerranée, Marseille.
- 10. Njamen, N.D.A. and Ngatchou, W.J. (2014) Nelson-Aalen and Kaplan-Meier Estimators in Competing Risks. Applied Mathematics, 5, 765-776. https://doi.org/10.4236/am.2014.54073
- 11. Njamen, N.D.A. and Ngatchou, W.J. (2018) Consistency of the Kaplan-Meier Estimator of the Survival Function in Competiting Risks. The Open Statistics and Probability Journal, 9, 1-17. https://benthamopen.com/TOSPJ/home https://doi.org/10.2174/1876527001809010001
- 12. Njamen, N.D.A. (2017) Convergence of the Nelson-Aalen Estimator in Competing Risks. International Journal of Statistics and Probability, 6, 9-23. https://doi.org/10.5539/ijsp.v6n3p9
- 13. Njamen, N.D.A. (2018) Study of the Nonparametric Kaplan-Meier Estimator of the Cumulative Incidence Function in Competiting Risks. Journal of Advanced Statistics, 3, 1-13. https://doi.org/10.22606/jas.2018.31001
- 14. Aalen, O.O. and Johansen, S. (1978) An Empirical Transition Matrix for Non-Homogeneous Markov Chains Based on Censored Observations. Scandinavian Journal of Statistics, 5, 141-150. https://www.jstor.org/stable/4615704
- 15. Peterson, G.L. (1977) A Simplification of the Protein Assay Method of Lowry et al. Which Is More Generally Applicable. Analytical Biochemistry, 83, 346-356. https://doi.org/10.1016/0003-2697(77)90043-4
- 16. Andersen, P.K., Borgan, Ø., Gill, R.D. and Keiding, N. (1993) Statistical Models Based on Counting Processes. Springer Series in Statistics, Springer-Verlag, New York.
- 17. Shorack, G.R. and Wellner, J.A. (1986) Empirical Processes with Applications to Statistics. John Wiley and Sons, Inc., New York.
- 18. Breslow, N. and Crowley, J. (1974) A Large Sample Study of the Life Table and Product-Limit Estimates under Random Censorship. The Annals of Statistics, 2, 437-453. https://www.jstor.org/stable/2958131 https://doi.org/10.1214/aos/1176342705
- 19. Nelson, W. (1972) A Short Life Test for Comparing a Sample with Previous Accelerated Test Results. Technometrics, 14, 175-185. https://www.jstor.org/stable/1266929 https://doi.org/10.1080/00401706.1972.10488894
- 20. Aalen, O.O. (1978) Nonparametric Inference for a Family of Counting Processes. The Annals of Statistics, 6, 701-726. https://www.jstor.org/stable/2958850 https://doi.org/10.1214/aos/1176344247
- 21. Fleming, T.R. and Harrington, D.P. (1990) Counting Processes and Survival Analysis. John Wiley and Sons, Hoboken.
- 22. Breuils, C. (2003) Analyse de Durées de Vie: Analyse Séquentielle du Modèle des Risques Proportionnels et Tests d’Homogénéité. Thèse de doctorat, Université de Technologie de Compiègne, Compiègne. https://tel.archives-ouvertes.fr/tel-00005524
- 23. Kaplan, E.L. and Meier, P. (1958) Nonparametric Estimation from Incomplete Observations. Journal of the American Statistical Association, 53, 457-481. https://www.jstor.org/stable/2281868 https://doi.org/10.1080/01621459.1958.10501452
- 24. Hall, W.J. and Wellner, J.A. (1980) Confidence Bands for a Survival Curve. Biometrika, 67, 133-143. https://www.jstor.org/stable/2335326 https://doi.org/10.1093/biomet/67.1.133
- 25. Nair, V.N. (1984) Confidence Bands for Survival Functions with Censored Data: A Comparative Study. Technometrics, 26, 265-275. https://www.jstor.org/stable/1267553 https://doi.org/10.1080/00401706.1984.10487964