Asymptotic Normality of the Nelson-Aalen and the Kaplan-Meier Estimators in Competing Risks

doi:10.4236/am.2019.107038

Applied Mathematics
Vol.10 No.07(2019), Article ID:93786,16 pages

Didier Alain Njamen Njomen


Department of Mathematics and Computer Science, Faculty of Science, University of Maroua, Maroua, Cameroon

This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).

http://creativecommons.org/licenses/by/4.0/

Received: December 30, 2018; Accepted: July 19, 2019; Published: July 22, 2019

ABSTRACT

This paper studies the asymptotic normality of the Nelson-Aalen and the Kaplan-Meier estimators in a competing risks context in the presence of independent right-censoring. To prove our results, we use Rebolledo’s theorem, which makes it possible to apply the central limit theorem to certain types of martingales. From the results obtained, confidence bounds for the hazard and the survival functions are provided.

Keywords:

Censored Data, Right-Censoring, Counting Process, Competing Risks, Nelson-Aalen and Kaplan-Meier Estimators, Asymptotic Properties of Estimators, Confidence Bands

1. Introduction and Background

The competing risks model has been widely studied in the literature; see, e.g., Heckman and Honoré [1] , Commenges [2] , Com-Nougué [3] , Fine and Gray [4] , Crowder [5] , Fermanian [6] , Latouche [7] , Geffray [8] , Belot [9] , Njamen and Ngatchou ( [10] , [11] ), Njamen ( [12] , [13] ). In most approaches, the competing risks are assumed to be either all independent or all dependent. Here, the independent component of the potential risks constitutes an independent censoring variable while the other risks are allowed to be dependent. This approach is used by Geffray [8] . Namely, we consider a population in which each subject is exposed to m mutually exclusive competing risks which may be dependent. For $j \in \{1, \dots, m\}$ , the failure time from the j-th cause is a non-negative random variable (r.v.) $τ_{j}$ . The competing risks model postulates that only the smallest failure time is observable; it is given by the r.v. $T = \min (τ_{1}, \dots, τ_{m})$ with distribution function (d.f.) denoted by F. The cause of failure associated with T is indicated by a r.v. $η$ which takes the value j if the failure is due to the j-th cause, i.e. $η = j$ if $T = τ_{j}$ . The following modeling technique is taken from Njamen and Ngatchou [10] : we assume that T is, in turn, at risk of being independently right-censored by a non-negative r.v. C with d.f. G. Consequently, the observable r.v. are

$(Z = \min (T, C), ξ = η δ),$

where $δ = \mathbb{1}_{\{T \leq C\}}$ and where $\mathbb{1}_{\{\cdot\}}$ denotes the indicator function. As T and C are independent, the r.v. Z has d.f. H given by $1 - H = (1 - F) (1 - G)$ . Let $τ_{H} = \sup \{t : H (t) < 1\}$ denote the right-endpoint of H beyond which no observation is possible. The subdistribution functions $F^{(j)}$ pertaining to the different risks or causes of failure are defined for $j = 1, \dots, m$ and $t \geq 0$ by

$F^{(j)} (t) = ℙ [T \leq t, η = j], j = 1, \dots, m$ (1)

When the independence of the different competing risks may not be assumed, the functions $F^{(j)}$ for $j = 1, \dots, m$ are the basic estimable quantities.
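The observation scheme $(Z, ξ)$ described above can be sketched with a small simulation. This is only an illustration: the function name `simulate_competing_risks`, the exponential latent failure times and the rate values are assumptions made here, and the latent times are drawn independently for simplicity although the model allows dependent risks.

```python
import random

def simulate_competing_risks(n, rates, censor_rate, seed=0):
    """Draw n observations (Z, xi) with Z = min(T, C) and xi = eta * delta."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        # Latent failure times tau_1..tau_m (independent exponentials here,
        # an illustrative assumption; the model allows dependence).
        taus = [rng.expovariate(r) for r in rates]
        t = min(taus)                       # T = min(tau_1, ..., tau_m)
        eta = taus.index(t) + 1             # cause of failure, in 1..m
        c = rng.expovariate(censor_rate)    # independent censoring time C
        delta = 1 if t <= c else 0          # delta = 1{T <= C}
        sample.append((min(t, c), eta * delta))
    return sample

data = simulate_competing_risks(n=1000, rates=[0.5, 1.0], censor_rate=0.5, seed=1)
```

Each pair holds the observed time and the label $ξ$, which is 0 for a censored observation and j for a failure from cause j.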

The Kaplan-Meier estimator was developed for situations in which only one cause of failure and independent right-censoring are considered. Aalen and Johansen [14] were the first to extend the Kaplan-Meier estimator to several causes of failure in the presence of independent censoring. In the present situation, the d.f. F may be consistently estimated by the Kaplan-Meier estimator denoted by ${\hat{F}}_{n}$ . For $j = 1, \dots, m$ , the subdistribution functions $F^{(j)}$ may be consistently estimated by means of the Aalen-Johansen estimators denoted respectively by ${\hat{F}}_{n}^{(j)}$ . Indeed, when the process of the states occupied by an individual over time is a time-inhomogeneous Markov process, Aalen and Johansen [14] introduced an estimator of the transition probabilities between states in the presence of independent random right-censoring. The competing risks set-up corresponds to the case of a time-inhomogeneous Markov process with only one transient state and several absorbing states (that can be labeled $1, \dots, m$ ). Aalen and Johansen [14] obtained the joint consistency of ${\hat{F}}_{n}^{(j)}$ to $F^{(j)}$ for $j = 1, \dots, m$ uniformly over fixed compact intervals $[0, σ]$ for $σ < τ_{H}$ . They also obtained the joint weak convergence of the processes $\sqrt{n} ({\hat{F}}_{n}^{(j)} - F^{(j)})$ on fixed compact intervals $[0, σ]$ for $σ < τ_{H}$ .

The asymptotic properties of the Kaplan-Meier estimator of the distribution function have been studied by several authors (see Peterson [15] , Andersen et al. [16] , Shorack and Wellner [17] , Breslow and Crowley [18] ).

In this paper, in a region where there is at least one observation, we are interested in the asymptotic properties of the Nelson-Aalen and Kaplan-Meier nonparametric estimators of the functions $Λ^{* (j)}$ and $S^{* (j)}$ , for $j = 1, \dots, m$ , in the presence of independent right-censoring, in the competing risks framework set out in Njamen and Ngatchou ( [10] , [11] ).

The rest of the paper is organized as follows. Section 2 presents preliminary results and reminders used in the paper. Section 3 establishes two limit laws: Section 3.1 gives the limit law of the Nelson-Aalen nonparametric estimator for competing risks as defined in Njamen and Ngatchou [10] and Njamen [12] , and Section 3.2 gives the limit law of the Kaplan-Meier nonparametric estimator in competing risks as defined in Njamen and Ngatchou [10] and Njamen [13] . Section 4 gives confidence bands, including the Hall-Wellner bands and the Nair equal precision bands.

2. Preliminaries and Reminders

For $t \geq 0$ , we introduce the following subdistribution functions $H^{(0)}$ and $H^{(1)}$ of H by:

$H^{(0)} (t) = ℙ [Z \leq t, ξ = 0],$

and

$H^{(1)} (t) = ℙ [Z \leq t, ξ \neq 0]$

and for $j = 1, \dots, m$

$H^{(1, j)} (t) = ℙ [Z \leq t, ξ = j] .$

The relations $F (t) = \sum_{j = 1}^{m} F^{(j)} (t)$ and $H^{(1)} (t) = \sum_{j = 1}^{m} H^{(1, j)} (t)$ hold for $t \geq 0$ since the different risks are mutually exclusive. The relation $H (t) = H^{(0)} (t) + H^{(1)} (t)$ is also valid for $t \geq 0$ . The relations that connect the observable distribution functions $H^{(0)}$ , $H^{(1)}$ and $H^{(1, j)}$ to the unobservable distributions F, G and $F^{(j)}$ are given by:

$H^{(0)} (t) = \int_{0}^{t} (1 - F) d G,$

$H^{(1)} (t) = \int_{0}^{t} (1 - G^{-}) d F,$

and

$H^{(1, j)} (t) = \int_{0}^{t} (1 - G^{-}) d F^{(j)} .$

The cumulative hazard function of T and the partial cumulative hazard function of T related to cause j for $j \in {1, \dots, m}$ are given for $t \geq 0$ respectively by the following expressions:

$Λ (t) = \int_{0}^{t} \frac{d F}{1 - F^{-}} = \int_{0}^{t} \frac{d H^{(1)}}{1 - H^{-}},$ (2)

$Λ^{(1, j)} (t) = \int_{0}^{t} \frac{d F^{(j)}}{1 - F^{-}} = \int_{0}^{t} \frac{d H^{(1, j)}}{1 - H^{-}} .$ (3)

Let us define estimators for the different quantities. Let ${(Z_{i}, ξ_{i})}_{i = 1, \dots, n}$ be n independent copies of the random vector $(Z, ξ)$ . We define the empirical counterparts of $H^{(0)}$ , $H^{(1)}$ , $H^{(1, j)}$ and H, for $j \in \{1, \dots, m\}$ , by:

$H_{n}^{(0)} (t) = \frac{1}{n} \sum_{i = 1}^{n} \mathbb{1}_{\{Z_{i} \leq t, ξ_{i} = 0\}},$

$H_{n}^{(1)} (t) = \frac{1}{n} \sum_{i = 1}^{n} \mathbb{1}_{\{Z_{i} \leq t, ξ_{i} \neq 0\}},$

$H_{n}^{(1, j)} (t) = \frac{1}{n} \sum_{i = 1}^{n} \mathbb{1}_{\{Z_{i} \leq t, ξ_{i} = j\}},$

$H_{n} (t) = \frac{1}{n} \sum_{i = 1}^{n} \mathbb{1}_{\{Z_{i} \leq t\}} .$
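As a minimal sketch, the four empirical functions can be evaluated at a fixed time t by direct counting. The helper name `empirical_subdistributions` and the five-point toy sample are hypothetical and serve only to illustrate the definitions.

```python
def empirical_subdistributions(sample, t, m):
    """Return (H_n^(0)(t), H_n^(1)(t), [H_n^(1,j)(t) for j = 1..m], H_n(t))."""
    n = len(sample)
    # Censored observations (xi = 0) with Z_i <= t.
    h0 = sum(1 for z, xi in sample if z <= t and xi == 0) / n
    # Failures from cause j with Z_i <= t, for each cause j.
    h1j = [sum(1 for z, xi in sample if z <= t and xi == j) / n
           for j in range(1, m + 1)]
    h1 = sum(h1j)
    # All observations with Z_i <= t.
    h = sum(1 for z, _ in sample if z <= t) / n
    return h0, h1, h1j, h

# Toy sample of (Z_i, xi_i) pairs; xi = 0 marks a censored observation.
sample = [(0.5, 1), (0.8, 0), (1.2, 2), (1.5, 1), (2.0, 0)]
h0, h1, h1j, h = empirical_subdistributions(sample, t=1.3, m=2)
```

On this sample the decomposition of the empirical d.f. into its censored and uncensored parts can be checked numerically.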

The relations $H_{n} (t) = H_{n}^{(0)} (t) + H_{n}^{(1)} (t)$ and $H_{n}^{(1)} (t) = \sum_{j = 1}^{m} H_{n}^{(1, j)} (t)$ are valid for $t \geq 0$ . As T is independently randomly right-censored by C, a well-known estimator for F is the Kaplan-Meier estimator defined for $t \geq 0$ by:

${\hat{F}}_{n} (t) = 1 - \prod_{i = 1}^{n} \left(1 - \frac{\mathbb{1}_{\{Z_{i} \leq t, ξ_{i} \neq 0\}}}{n (1 - H_{n}^{-} (Z_{i}))}\right),$

where the left-continuous modification of any d.f. L is denoted by $L^{-}$ . The Nelson-Aalen estimators of $Λ$ and of $Λ^{(1, j)}$ for $j = 1, \dots, m$ respectively are defined for $t \geq 0$ by:

$Λ_{n} (t) = \int_{0}^{t} \frac{d H_{n}^{(1)}}{1 - H_{n}^{-}},$ (4)

$Λ_{n}^{(1, j)} (t) = \int_{0}^{t} \frac{d H_{n}^{(1, j)}}{1 - H_{n}^{-}} .$ (5)
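Because $H_{n}^{(1, j)}$ jumps by $1 / n$ at each failure from cause j and $n (1 - H_{n}^{-} (Z_{i}))$ counts the observations with $Z_{k} \geq Z_{i}$ , the integrals (4) and (5) reduce to finite sums. The following sketch computes them that way; the function name `nelson_aalen` and the toy sample are assumptions made for illustration.

```python
def nelson_aalen(sample, t, cause=None):
    """Nelson-Aalen estimate at time t.

    cause=None gives Lambda_n(t) (all causes); cause=j gives Lambda_n^(1,j)(t).
    """
    total = 0.0
    for z, xi in sample:
        is_event = (xi != 0) if cause is None else (xi == cause)
        if z <= t and is_event:
            # n * (1 - H_n^-(z)) is the number of observations with Z_k >= z.
            at_risk = sum(1 for zk, _ in sample if zk >= z)
            total += 1.0 / at_risk
    return total

sample = [(0.5, 1), (0.8, 0), (1.2, 2), (1.5, 1), (2.0, 0)]
overall = nelson_aalen(sample, t=1.6)
cause_1 = nelson_aalen(sample, t=1.6, cause=1)
cause_2 = nelson_aalen(sample, t=1.6, cause=2)
```

The cause-specific estimates add up to the overall estimate, mirroring the relation $H_{n}^{(1)} = \sum_{j} H_{n}^{(1, j)}$ .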

The Aalen-Johansen estimator for $F^{(j)}$ is defined for $t \geq 0$ by:

${\hat{F}}_{n}^{(j)} (t) = \int_{0}^{t} \frac{1 - {\hat{F}}_{n}^{-}}{1 - H_{n}^{-}} d H_{n}^{(1, j)} .$
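A minimal sketch of the Aalen-Johansen estimator, computed jointly with the Kaplan-Meier estimator of F, is given below. It assumes there are no tied observation times, and the name `aalen_johansen` and the toy sample are illustrative choices rather than part of the original text.

```python
def aalen_johansen(sample, t, m):
    """Return (F_hat_n(t), [F_hat_n^(j)(t) for j = 1..m]), assuming no ties."""
    data = sorted(sample)
    n = len(data)
    surv_left = 1.0          # 1 - F_hat_n(z^-), updated after each failure
    cif = [0.0] * m          # running values of F_hat_n^(j)
    for rank, (z, xi) in enumerate(data):
        if z > t:
            break
        at_risk = n - rank   # n * (1 - H_n^-(z)) when there are no ties
        if xi != 0:
            # Jump of F_hat_n^(j): (1 - F_hat_n^-) / (n (1 - H_n^-)).
            cif[xi - 1] += surv_left / at_risk
            # Kaplan-Meier product-limit update.
            surv_left *= 1.0 - 1.0 / at_risk
    return 1.0 - surv_left, cif

sample = [(0.5, 1), (0.8, 0), (1.2, 2), (1.5, 1), (2.0, 0)]
f_hat, cif = aalen_johansen(sample, t=1.6, m=2)
```

In this construction each failure moves the same mass out of the Kaplan-Meier survival estimate and into one cause-specific estimate, so the cause-specific estimates sum to ${\hat{F}}_{n} (t)$ .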

For all $t \geq 0$ , the following equalities hold:

$1 - H_{n} (t) = (1 - {\hat{F}}_{n} (t)) (1 - {\hat{G}}_{n} (t))$

$Λ_{n} (t) = \int_{0}^{t} \frac{d {\hat{F}}_{n}}{1 - {\hat{F}}_{n}^{-}},$

where ${\hat{G}}_{n}$ , the Kaplan-Meier estimator of G, is defined for $t \geq 0$ by:

${\hat{G}}_{n} (t) = 1 - \prod_{i = 1}^{n} \left(1 - \frac{\mathbb{1}_{\{Z_{i} \leq t, ξ_{i} = 0\}}}{n (1 - H_{n}^{-} (Z_{i}))}\right) .$
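The factorization $1 - H_{n} = (1 - {\hat{F}}_{n}) (1 - {\hat{G}}_{n})$ can be checked numerically on a small sample. The sketch below assumes no tied observation times; `km_product` is a hypothetical helper returning $1 - {\hat{F}}_{n} (t)$ or, with `censoring=True`, $1 - {\hat{G}}_{n} (t)$ .

```python
def km_product(sample, t, censoring=False):
    """Product-limit value at t: 1 - F_hat_n(t), or 1 - G_hat_n(t) if censoring=True."""
    data = sorted(sample)
    n = len(data)
    prod = 1.0
    for rank, (z, xi) in enumerate(data):
        if z > t:
            break
        counted = (xi == 0) if censoring else (xi != 0)
        if counted:
            # n - rank observations satisfy Z_k >= z when there are no ties.
            prod *= 1.0 - 1.0 / (n - rank)
    return prod

sample = [(0.5, 1), (0.8, 0), (1.2, 2), (1.5, 1), (2.0, 0)]
t = 1.3
surv_f = km_product(sample, t)                   # 1 - F_hat_n(t)
surv_g = km_product(sample, t, censoring=True)   # 1 - G_hat_n(t)
surv_h = sum(1 for z, _ in sample if z > t) / len(sample)  # 1 - H_n(t)
```

Without ties the two products telescope, which is why their product equals the empirical survival function of the $Z_{i}$ .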

3. Results

In this section, we continue the work of Njamen and Ngatchou [10] , Njamen [12] and Njamen and Ngatchou [11] . Indeed, Njamen and Ngatchou ( [10] , p. 9) study the consistency of the Nelson-Aalen nonparametric estimator in competing risks, while Njamen ( [12] , pp. 11-12) studies the pointwise convergence and the uniform convergence in probability of the same estimator; and Njamen and Ngatchou ( [11] , p. 13) study the bias and the uniform convergence of the nonparametric estimator of the survival function in a competing risks context. It is also shown there that this estimator is asymptotically unbiased. As in the works cited above, we use the martingale approach.

3.1. Limit Law of Nelson-Aalen’s Nonparametric Estimator for Competing Risks

In what follows, we study the asymptotic normality of the Nelson-Aalen nonparametric estimator in competing risks. To that end, for all $j \in \{1, \dots, m\}$ and $t \geq 0$ , we consider the Nelson-Aalen type cumulative hazard function estimator (Nelson, [19] ; Aalen, [20] ; Njamen and Ngatchou, [10] ) defined by

${\hat{Λ}}_{n} (t) = \int_{0}^{t} \frac{J (u)}{Y (u)} d N (u),$ (6)

where $J (t) = \mathbb{1}_{\{Y (t) > 0\}}$ .

The cumulative risk in a region where there is at least one observation is given for all $j \in {1, \dots, m}$ , by (see Njamen, [12] . p. 9)

$Λ^{* (j)} (t) = \int_{0}^{t} L^{* (j)} (s) λ^{* (j)} (s) d s,$ (7)

with $L_{i}^{* (j)} (t) = \mathbb{1}_{\{Z_{i} \geq t\}}$ , which indicates whether individual i is still at risk just before time t (the individual has not yet undergone the event). Its estimator was defined in Njamen and Ngatchou ( [10] , p. 7).

The following theorem gives the limit law of the Nelson-Aalen estimator ${\hat{Λ}}_{n}^{* (j)}$ in competing risks of Njamen (2017, p. 9). This is the first fundamental result of this article.

Theorem 1.

In a region where there is at least one observation, it is assumed that $F_{i}^{* (j)} (t) < 1$ for $i \in {1, \dots, n}$ and $j \in {1, \dots, m}$ . Then, for all $t \geq 0$ ,

$\sqrt{n} ({\hat{Λ}}_{n}^{* (j)} (t) - Λ^{* (j)} (t)) \overset{L}{\to} U_{i}^{* (j)} (t),$ (8)

where $U_{i}^{* (j)}$ is a centered Gaussian martingale of variance such that:

$\begin{cases} U_{i}^{* (j)} (0) = 0 \\ V (U_{i}^{* (j)} (t)) = \int_{0}^{t} \frac{α_{i}^{* (j)} (u)}{y_{i}^{* (j)} (u)} d u, \end{cases}$ (9)

where for all $s \geq 0$ ,

$y_{i}^{* (j)} (s) = [1 - F_{i}^{* (j)} (s)] [1 - G_{i}^{* (j)} (s^{-})]$ (10)

with $G_{i}^{* (j)}$ standing for the distribution function of $C_{i}^{* (j)}$ and $α_{i}^{* (j)}$ the instant risk function.
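To get a feel for the limiting variance in (9), it can be evaluated in a simple parametric case. The sketch below assumes, purely for illustration, a constant hazard a with $1 - F (u) = e^{- a u}$ and exponential censoring with rate c, so that $y (u) = e^{- (a + c) u}$ and the integral has the closed form $a (e^{(a + c) t} - 1) / (a + c)$ . The function name `limit_variance` and the numerical values are hypothetical.

```python
import math

def limit_variance(t, hazard, y, steps=10_000):
    """Midpoint-rule approximation of V(t) = int_0^t hazard(u) / y(u) du."""
    h = t / steps
    return h * sum(hazard(h * (k + 0.5)) / y(h * (k + 0.5)) for k in range(steps))

a, c = 0.5, 0.3   # assumed constant hazard and censoring rate
t = 2.0
numeric = limit_variance(t, lambda u: a, lambda u: math.exp(-(a + c) * u))
closed_form = a * (math.exp((a + c) * t) - 1.0) / (a + c)
```

In this example the variance grows exponentially in t, reflecting the shrinking proportion $y (u)$ of subjects still at risk.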

To prove this theorem, we need Rebolledo’s theorem, stated below, which makes it possible to apply the central limit theorem to certain types of martingales.

Theorem 2. (Rebolledo’s Theorem)

Let $M^{n} = \sum_{i = 1}^{n} M_{i}$ be a sequence of martingales, where $M_{i} = K_{i} - A_{i}$ , $K_{i}$ denotes a counting process and $A_{i}$ its compensator. Consider the processes $I_{n} (t) = \int_{0}^{t} f_{n} (s) d M^{n} (s)$ and, for all $ε > 0$ , $I_{n, ε} (t) = \int_{0}^{t} f_{n} (s) \mathbb{1}_{\{| f_{n} (s) | > ε\}} d M^{n} (s)$ . Suppose that $f_{n}$ and f are predictable and locally bounded $\mathcal{F}_{s^{-}}$ -processes such that

$\sup_{s} | f_{n} (s) - f (s) | \to 0 (n \to \infty) .$

Suppose also that the processes $K_{i}, A_{i}, f_{n}$ are bounded. For all $t > 0$ , let $α (t) = \int_{0}^{t} f^{2} (s) d s$ . If

1) ${〈 I_{n} 〉}_{t} \overset{ℙ}{\to} α (t), (n \to \infty)$ ;

2) for all $ε > 0$ , ${〈 I_{n, ε} 〉}_{t} \overset{ℙ}{\to} 0, (n \to \infty)$ .

Then,

$(I_{n} (t), t > 0) \Rightarrow (\int_{0}^{t} f (s) d W (s), t > 0), (n \to \infty),$

where $\Rightarrow$ denotes weak convergence in the space of right-continuous functions with left-hand limits, endowed with the Skorokhod topology, and where W is a Brownian motion.

To prove Theorem 1, it is sufficient to check whether the previous conditions of Rebolledo’s Theorem are satisfied:

Proof. For all $j \in {1, \dots, m}$ and $t \geq 0$ , $M_{i}^{* (j)} (t)$ also decomposes into

$M_{i}^{* (j)} (t) = K_{i}^{* (j)} (t) - \int_{0}^{t} d Λ_{i}^{* (j)} (s),$

which in turn can be written in terms of $α_{i}^{* (j)} (t)$ as

$M_{i}^{* (j)} (t) = K_{i}^{* (j)} (t) - \int_{0}^{t} α_{i}^{* (j)} (s) L_{i}^{* (j)} (s) d s,$

which finally, can be rewritten as

$d K_{i}^{* (j)} (t) = α_{i}^{* (j)} (t) L_{i}^{* (j)} (t) d t + d M_{i}^{* (j)} (t),$

where $d M_{i}^{* (j)} (t)$ can be seen as a random noise process. The martingale $M_{i}^{* (j)} (t)$ above represents the difference between the number of failures due to a specific cause j observed in the time interval $[0, t]$ , i.e. $K_{i}^{* (j)} (t)$ (see Njamen, [12] , p.6), and the number of failures predicted by the model for the j^th cause. This definition fulfills the Doob-Meyer decomposition.

This martingale is used in Fleming and Harrington ( [21] , p. 26) and in Breuils ( [22] , p. 25).

Now, to state the asymptotic results, we set, for all $t \geq 0$ and $j \in \{1, \dots, m\}$ :

$N^{(n)} (t) = \sum_{i = 1}^{n} K_{i}^{* (j)} (t), Y^{(n)} (t) = \sum_{i = 1}^{n} L_{i}^{* (j)} (t), J^{(n)} (t) = \mathbb{1}_{\{Y^{(n)} (t) > 0\}} .$

In a subgroup $A^{(j)}$ , where there is at least one observation, the survival function of $Z_{i} = \min (T_{i}, C_{i})$ is defined for all $t \geq 0$ by:

$S_{Z}^{* (j)} (t) = (1 - F_{i}^{* (j)} (t)) (1 - G_{i}^{* (j)} (t^{-})) .$

Recall also that $F_{i}^{* (j)}$ is the distribution function of the $T_{i}$ ’s, $G_{i}^{* (j)}$ is that of the $C_{i}$ ’s and $1 - [1 - F_{i}^{* (j)}] [1 - G_{i}^{* (j)}]$ that of the $Z_{i}$ ’s. From the Glivenko-Cantelli theorem, one has:

$\sup_{s \in [0, t]} | \frac{Y^{(n)} (s)}{n} - [1 - F_{i}^{* (j)} (s)] [1 - G_{i}^{* (j)} (s^{-})] | \overset{ℙ}{\to} 0 (n \to \infty) .$ (11)

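Moreover, since

$J^{(n)} (t) = \mathbb{1}_{\{Y^{(n)} (t) > 0\}},$

one has, with $B (n, p)$ denoting a binomial random variable with parameters n and p:

$1 - J^{(n)} (t) = \mathbb{1}_{\{Y^{(n)} (t) = 0\}} = \mathbb{1}_{\{B (n, [1 - F_{i}^{* (j)} (t)] [1 - G_{i}^{* (j)} (t^{-})]) = 0\}} \overset{ℙ}{\to} 0 (n \to \infty),$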

from which one obtains (see Theorem 3, p. 11 of Njamen, [12] ),

$J^{(n)} (t) \overset{ℙ}{\to} 1 (n \to \infty) .$

Differentiating the martingale $M_{i}^{* (j)} (t) = K_{i}^{* (j)} - \int_{0}^{t} L_{i}^{* (j)} (s) α_{i}^{* (j)} (s) d s$ , one has:

$d M_{i}^{* (j)} (t) = d K_{i}^{* (j)} (t) - L_{i}^{* (j)} (t) α_{i}^{* (j)} (t) d t,$

and from

$d {〈 M_{i}^{* (j)} 〉}_{t} = \operatorname{Var} (d M_{i}^{* (j)} (t) \mid \mathcal{F}_{t^{-}}),$

one obtains

$\begin{matrix} d {〈 M_{i}^{* (j)} 〉}_{t} = \operatorname{Var} (d K_{i}^{* (j)} (t) - L_{i}^{* (j)} (t) α_{i}^{* (j)} (t) d t \mid \mathcal{F}_{t^{-}}) \\ = \operatorname{Var} (d K_{i}^{* (j)} (t) \mid \mathcal{F}_{t^{-}}) = L_{i}^{* (j)} (t) α_{i}^{* (j)} (t) d t . \end{matrix}$

Consequently, the increasing process of

$D_{t} = \int_{0}^{t} \frac{J^{(n)} (u)}{Y^{(n)} (u)} d M_{i}^{* (j)} (u), t \geq 0,$

is given by

${〈 D 〉}_{t} = \int_{0}^{t} \frac{{(J^{(n)})}^{2} (u)}{{(Y^{(n)})}^{2} (u)} d {〈 M_{i}^{* (j)} 〉}_{u}, t \geq 0.$

Next, for all $t \geq 0$ and $j \in \{1, \dots, m\}$ , one has

$\begin{matrix} {〈 \sqrt{n} \sum_{i = 1}^{n} \int_{0}^{t} \frac{J^{(n)} (u)}{Y^{(n)} (u)} d M_{i}^{* (j)} (u) 〉}_{t} = \sum_{i = 1}^{n} n \int_{0}^{t} \frac{{(J^{(n)})}^{2} (u)}{{(Y^{(n)})}^{2} (u)} L_{i}^{* (j)} (u) α_{i}^{* (j)} (u) d u \\ = \int_{0}^{t} n \frac{{(J^{(n)})}^{2} (u)}{{(Y^{(n)})}^{2} (u)} \sum_{i = 1}^{n} L_{i}^{* (j)} (u) α_{i}^{* (j)} (u) d u \\ = \int_{0}^{t} n \frac{{(J^{(n)})}^{2} (u)}{{(Y^{(n)})}^{2} (u)} Y^{(n)} (u) α_{i}^{* (j)} (u) d u \\ = \int_{0}^{t} n \frac{J^{(n)} (u)}{Y^{(n)} (u)} α_{i}^{* (j)} (u) d u . \end{matrix}$

Also, for all $t \geq 0$ and for all $j \in {1, \dots, m}$ , the process

$\sqrt{n} ({\hat{Λ}}_{n}^{* (j)} (t) - Λ^{* (j)} (t)) = \sqrt{n} \sum_{i = 1}^{n} \int_{0}^{t} \frac{J^{(n)} (u)}{Y^{(n)} (u)} d M_{i}^{* (j)} (u) = R_{n} (t), \forall i \in {1, \dots, n},$

is a martingale. We apply the central limit theorem for martingales (Rebolledo’s Theorem). To this end, we show that the conditions of this theorem are satisfied by $R_{n} (t)$ .

One has, for all $i \in {1, \dots, n}$ ,

${〈 R_{n} 〉}_{t} = \int_{0}^{t} n \frac{J^{(n)} (u)}{Y^{(n)} (u)} α_{i}^{* (j)} (u) d u, \forall j \in {1, \dots, m},$

and also by the proof of the Theorem 3 of Njamen ( [12] , p. 11), we have:

$\frac{Y^{(n)} (u)}{n} \overset{ℙ}{\to} (1 - F_{i}^{* (j)} (u)) (1 - G_{i}^{* (j)} (u^{-})), J^{(n)} (u) \overset{ℙ}{\to} 1, (n \to \infty) .$

So that, for all $j \in {1, \dots, m}$ , when $n \to \infty$ ,

$\begin{array}{l} {〈 R_{n} 〉}_{t} = \int_{0}^{t} \frac{J^{(n)} (u)}{\frac{Y^{(n)} (u)}{n}} α_{i}^{* (j)} (u) d u \\ \overset{ℙ}{\to} \int_{0}^{t} \frac{α_{i}^{* (j)} (u) d u}{(1 - F_{i}^{* (j)} (u)) (1 - G_{i}^{* (j)} (u^{-}))} = β (t), (n \to \infty), \end{array}$

which is deterministic. Thus, the first condition of Rebolledo’s Theorem holds.

To check the second condition, for all $ε > 0$ and $t \geq 0$ , define

$R_{n, ε} (t) = \int_{0}^{t} \sqrt{n} \frac{J^{(n)} (u)}{Y^{(n)} (u)} \mathbb{1}_{\{| \sqrt{n} \frac{J^{(n)} (u)}{Y^{(n)} (u)} | > ε\}} d M^{(n)} (u),$

where for all $j = 1, \dots, m$ , $M^{(n)} (u) = \sum_{i = 1}^{n} M_{i}^{* (j)} (u)$ .

We have to show that, as $n \to \infty$ , ${〈 R_{n, ε} 〉}_{t}$ converges to 0 in probability.

One has, for all $t \geq 0$ ,

$\begin{matrix} {〈 R_{n, ε} 〉}_{t} = \int_{0}^{t} n \frac{J^{(n)} (u)}{{(Y^{(n)} (u))}^{2}} \mathbb{1}_{\{| \sqrt{n} \frac{J^{(n)} (u)}{Y^{(n)} (u)} | > ε\}} d {〈 M^{(n)} 〉}_{u} \\ = \int_{0}^{t} n \frac{J^{(n)} (u)}{{(Y^{(n)} (u))}^{2}} \mathbb{1}_{\{| \sqrt{n} \frac{J^{(n)} (u)}{Y^{(n)} (u)} | > ε\}} Y^{(n)} (u) α_{i}^{* (j)} (u) d u \\ = \int_{0}^{t} n \frac{J^{(n)} (u)}{Y^{(n)} (u)} \mathbb{1}_{\{| \sqrt{n} \frac{J^{(n)} (u)}{Y^{(n)} (u)} | > ε\}} α_{i}^{* (j)} (u) d u \\ \overset{ℙ}{\to} 0, (n \to \infty), \end{matrix}$

because

$n \frac{J^{(n)} (u)}{Y^{(n)} (u)} \overset{ℙ}{\to} \frac{1}{(1 - F_{i}^{* (j)} (u)) (1 - G_{i}^{* (j)} (u^{-}))}, (n \to \infty) .$

Then

$\sqrt{n} \frac{J^{(n)} (u)}{Y^{(n)} (u)} = \frac{1}{\sqrt{n}} n \frac{J^{(n)} (u)}{Y^{(n)} (u)} \overset{ℙ}{\to} 0, (n \to \infty) .$

Thus, the second condition of Rebolledo’s Theorem holds.

The conditions of Rebolledo’s Theorem are verified and consequently, for all $t \geq 0$ ,

$(R_{n} (t), t > 0) \Rightarrow (\int_{0}^{t} f (s) d W (s), t > 0), (n \to \infty),$

with $γ (t) = \int_{0}^{t} f^{2} (s) d s$ .

Finally, for all $t > 0$ , the convergence (8) follows.

This ends the proof of the Theorem 1.

The following subsection gives the asymptotic law of the nonparametric Kaplan-Meier estimator of the survival function in competing risks of Njamen and Ngatchou ( [10] , p. 13).

3.2. Limit Law of Kaplan-Meier’s Nonparametric Estimator in Competing Risks

The Kaplan-Meier estimator of the survival function (Kaplan and Meier, [23] ) is defined by

${\hat{S}}_{n} (t) = \prod_{s \leq t} (1 - Δ {\hat{Λ}}_{n} (s)) = \prod_{s \leq t} (1 - \frac{J^{(n)} (s) Δ N^{(n)} (s)}{Y^{(n)} (s)}),$

where ${\hat{Λ}}_{n} (t)$ is the Nelson-Aalen estimator and where, for a right-continuous process $X (t)$ with left-hand limits,

$Δ X (t) = X (t) - X (t^{-}) .$

For all $j = 1, \dots, m$ , an estimator of the variance of ${\hat{S}}_{n}^{(j)} (t) / S^{* (j)} (t)$ , where $S^{* (j)}$ is the survival function associated with the subgroup $A^{(j)}$ is given by

${\hat{σ}}^{(j) 2} (t) = \int_{0}^{t} \frac{J^{(n)} (s)}{{(Y^{(n)})}^{2} (s)} d N^{(n)} (s) .$

The variance of ${\hat{S}}_{n}^{(j)} (t) / S^{(j)} (t)$ approximated by that of ${\hat{S}}^{(j)} (t) / S^{* (j)} (t)$ is:

$\begin{matrix} V [\frac{{\hat{S}}_{n}^{(j)} (t)}{S^{* (j)} (t)} - 1] = E [〈 \frac{{\hat{S}}_{n}^{(j)}}{S^{* (j)}} - 1 〉 (t)] \\ = \int_{0}^{t} {\frac{{\hat{S}}_{n}^{(j)} (s^{-})}{S^{* (j)} (s)}}^{2} \times \frac{J^{(n)} (s)}{Y^{(n)} (s)} α_{i}^{* (j)} (s) d s \forall i \in {1, \dots, n} . \end{matrix}$ (12)

The estimator of the corresponding variance of ${\hat{S}}_{n}^{(j)} (t)$ is given by

$\hat{V} ({\hat{S}}_{n}^{(j)} (t)) = {[{\hat{S}}_{n}^{(j)} (t)]}^{2} {\hat{σ}}_{i}^{(j) 2} (t) \forall i \in {1, \dots, n} .$ (13)
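The product-limit estimator and the variance estimator (13) can be computed in a single pass over the ordered sample. The sketch below is a simplification: it pools all causes (any $ξ_{i} \neq 0$ counts as an event), assumes no tied times, and uses the hypothetical name `km_with_variance`. Since $N^{(n)}$ jumps by one at each event, the integral defining ${\hat{σ}}^{2}$ becomes a sum of $1 / {(Y^{(n)})}^{2}$ over event times.

```python
def km_with_variance(sample, t):
    """Return (S_hat(t), sigma_hat^2(t), V_hat(S_hat(t))), assuming no ties."""
    data = sorted(sample)
    n = len(data)
    s_hat, sigma2 = 1.0, 0.0
    for rank, (z, xi) in enumerate(data):
        if z > t:
            break
        y = n - rank                     # Y^(n)(z): number still at risk at z
        if xi != 0:
            s_hat *= 1.0 - 1.0 / y       # factor 1 - J * dN / Y
            sigma2 += 1.0 / (y * y)      # increment J / Y^2 of sigma_hat^2
    return s_hat, sigma2, s_hat ** 2 * sigma2

sample = [(0.5, 1), (0.8, 0), (1.2, 2), (1.5, 1), (2.0, 0)]
s_hat, sigma2, v_hat = km_with_variance(sample, t=1.6)
```

Note that this follows the $1 / Y^{2}$ form written above rather than the classical Greenwood form $1 / (Y (Y - 1))$ ; the two differ in small samples.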

The following result, concerning the asymptotic law of the nonparametric Kaplan-Meier estimator, constitutes the second fundamental result of this paper:

Theorem 3.

In an area where there is at least one observation, if we assume that for all $j \in {1, \dots, m}$ and $i \in {1, \dots, n}$ ,

1) for all $s \in [0, t]$ ,

$n \int_{0}^{s} \frac{J^{(n)} (u)}{Y^{(n)} (u)} α_{i}^{* (j)} (u) d u \overset{ℙ}{\to} σ_{i}^{* (j) 2} (s) (n \to \infty),$

2) for all $ε > 0$ ,

$n \int_{0}^{t} \frac{J^{(n)} (u)}{Y^{(n)} (u)} α_{i}^{* (j)} (u) \mathbb{1}_{\{\sqrt{n} | \frac{J^{(n)} (u)}{Y^{(n)} (u)} | > ε\}} d u \overset{ℙ}{\to} 0 (n \to \infty),$

3) for all $t > 0$ ,

$\sqrt{n} \int_{0}^{t} (1 - J^{(n)} (u)) α_{i}^{* (j)} (u) d u \overset{ℙ}{\to} 0 (n \to \infty) .$

Then, for all $t > 0$ and $j \in \{1, \dots, m\}$ , the nonparametric estimator ${\hat{S}}_{n}^{* (j)}$ satisfies

$\sqrt{n} ({\hat{S}}_{n}^{* (j)} (t) - S^{* (j)} (t)) \Rightarrow - U_{i}^{* (j)} (t) \times S^{* (j)} (t), (n \to \infty),$

where $U_{i}^{* (j)}$ is the centered Gaussian martingale and where $\Rightarrow$ denotes weak convergence in the space of right-continuous functions with left-hand limits, endowed with the Skorokhod topology.

Proof. To prove this theorem, it suffices to show that it satisfies the conditions of the Rebolledo Theorem.

In an area where there is at least one observation, set, for all $j = 1, \dots, m$ and $i = 1, \dots, n$ ,

${\tilde{S}}_{n}^{* (j)} (t) = \exp (- {\tilde{Λ}}_{n}^{* (j)} (t)),$

where ${\tilde{Λ}}_{n}^{* (j)} (t) = \int_{0}^{t} J^{(n)} (u) α_{i}^{* (j)} (u) d u$ .

For $t \in [0, τ [$ and $τ > 0$ , we have for all $j = 1, \dots, m$ and $i = 1, \dots, n$ ,

$\begin{array}{l} \sqrt{n} {〈 (\frac{{\hat{S}}_{n}^{(j)}}{{\tilde{S}}_{n}^{* (j)}} - 1) 〉}_{t} = n \int_{0}^{t} \frac{{\hat{S}}_{n}^{(j)} {(u^{-})}^{2}}{{\tilde{S}}_{n}^{* (j)} {(u)}^{2}} \frac{J^{(n)} (u)}{Y^{(n)} (u)} α_{i}^{* (j)} (u) d u \\ \overset{ℙ}{\to} σ_{i}^{* (j) 2}, (n \to \infty) . \end{array}$

By the proof of Theorem 3 of Njamen ( [12] , p.11), we deduce that

$\frac{{\hat{S}}_{n}^{* (j)} (u^{-})}{{\tilde{S}}_{n}^{* (j)} (u)} \overset{ℙ}{\to} 1, (n \to \infty) .$

Hence the first condition of Rebolledo’s Theorem holds.

For the second condition of Rebolledo’s Theorem, arguing as in the proof of Theorem 1 above and using condition 2), we find that for all $ε > 0$ ,

$n \int_{0}^{t} \frac{{\hat{S}}_{n}^{(j)} {(u^{-})}^{2}}{{\tilde{S}}_{n}^{* (j)} {(u)}^{2}} \frac{J^{(n)} (u)}{Y^{(n)} (u)} \mathbb{1}_{\{\sqrt{n} | \frac{J^{(n)} (u)}{Y^{(n)} (u)} | > ε\}} α_{i}^{* (j)} (u) d u \to 0, (n \to \infty) .$

So, for each $t > 0$ ,

$\sqrt{n} \int_{0}^{t} \frac{{\hat{S}}_{n}^{(j)} (u^{-})}{{\tilde{S}}_{n}^{* (j)} (u)} \frac{J^{(n)} (u)}{Y^{(n)} (u)} d M^{(n)} (u) \Rightarrow U_{i}^{* (j)} (t),$

where $M^{(n)} (u) = \sum_{i = 1}^{n} M_{i}^{* (j)} (u)$ and where

Finally,

$\sqrt{n} (\frac{{\hat{S}}_{n}^{(j)} (t)}{{\tilde{S}}_{n}^{* (j)} (t)} - 1) \Rightarrow - U_{i}^{* (j)} (t) .$

The fact that $S^{* (j)} (u) \leq {\tilde{S}}_{n}^{* (j)} (u)$ for all $u \in [0, t [$ , together with condition 3), implies:

$\begin{matrix} \sqrt{n} | \frac{S^{* (j)} (t)}{{\tilde{S}}_{n}^{* (j)} (t)} - 1 | \leq \sqrt{n} \int_{0}^{t} \frac{S^{* (j)} (u)}{{\tilde{S}}_{n}^{* (j)} (u)} d (Λ^{* (j)} - {\tilde{Λ}}_{n}^{* (j)}) (u) \\ \leq \sqrt{n} \int_{0}^{t} (1 - J^{(n)} (u)) α_{i}^{* (j)} (u) d u \\ \overset{ℙ}{\to} 0 (n \to \infty) . \end{matrix}$

As ${\tilde{S}}_{n}^{* (j)} (t) \to S^{* (j)} (t)$ when $n \to \infty$ , we deduce that:

$\sqrt{n} ({\tilde{S}}_{n}^{* (j)} (t) - S^{* (j)} (t)) \overset{ℙ}{\to} 0, n \to \infty .$

It follows that:

$\begin{array}{l} \sqrt{n} ({\hat{S}}_{n}^{* (j)} (t) - S^{* (j)} (t)) \\ = \sqrt{n} ({\hat{S}}_{n}^{* (j)} (t) - {\tilde{S}}_{n}^{* (j)} (t)) + \sqrt{n} ({\tilde{S}}_{n}^{* (j)} (t) - S^{* (j)} (t)) \\ = \frac{\sqrt{n} ({\hat{S}}_{n}^{* (j)} (t) - {\tilde{S}}_{n}^{* (j)} (t))}{{\tilde{S}}_{n}^{* (j)} (t)} {\tilde{S}}_{n}^{* (j)} (t) + \sqrt{n} ({\tilde{S}}_{n}^{* (j)} (t) - S^{* (j)} (t)) \\ \Rightarrow - U_{i}^{* (j)} (t) S^{* (j)} (t), (n \to \infty) . \end{array}$

This ends the proof of the theorem.

4. Confidence Bands of Survival Function

4.1. Confidence Intervals

For $α \in (0,1)$ , we wish to find two random functions $b_{L}$ and $b_{U}$ such that $\forall t > 0$ ,

$ℙ [b_{U} (t) \geq S (t) \geq b_{L} (t)] = 1 - α .$

Recall from the previous sections that, for all $j \in \{1, \dots, m\}$ , $\sqrt{n} ({\hat{S}}_{n}^{* (j)} (t) - S^{* (j)} (t)) / S^{* (j)} (t)$ converges in distribution to a centered Gaussian martingale (see Theorem 3 above). As a consequence, ${\hat{S}}_{n}^{* (j)} (t)$ is asymptotically Gaussian and centered on $S^{* (j)} (t)$ . Given the above results, the estimated standard deviation, denoted ${\hat{σ}}_{S_{t}^{*}}^{* (j)} (t)$ , is defined through its square, given for all $t \geq 0$ by:

${\hat{σ}}_{S_{t}^{*}}^{* 2} (t) = \frac{\hat{V} ({\hat{S}}_{n}^{* (j)} (t))}{{[{\hat{S}}_{n}^{* (j)} (t)]}^{2}} .$ (14)

Therefore, a confidence interval at level $100 (1 - α) \%$ can be built, for all $t \geq 0$ and $j \in \{1, \dots, m\}$ , as:

$[{\hat{S}}_{n}^{* (j)} (t) - z_{1 - α / 2} {\hat{σ}}_{S_{t}^{*}}^{* (j)} (t) {\hat{S}}_{n}^{* (j)} (t), {\hat{S}}_{n}^{* (j)} (t) + z_{1 - α / 2} {\hat{σ}}_{S_{t}^{*}}^{* (j)} (t) {\hat{S}}_{n}^{* (j)} (t)] .$ (15)

Here $z_{1 - α / 2}$ is the quantile of order $1 - α / 2$ of the standard normal distribution.

A confidence interval at level $100 (1 - α) \%$ can also be obtained, for all $j \in \{1, \dots, m\}$ , by:

${\hat{S}}_{n}^{* (j)} (t) \pm z_{α / 2} {\hat{σ}}_{S_{t}^{*}}^{* (j)},$ (16)

where $z_{α / 2}$ is the quantile of order $α / 2$ of the standard normal distribution.
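As an illustrative sketch of the interval (15), the bounds can be computed from an estimate of the survival function and of its relative standard deviation. The function name `pointwise_ci` and the input values are hypothetical; the normal quantile comes from the Python standard library.

```python
from statistics import NormalDist

def pointwise_ci(s_hat, sigma_hat, alpha=0.05):
    """Interval (15): s_hat -/+ z_{1 - alpha/2} * sigma_hat * s_hat."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # standard normal quantile
    half_width = z * sigma_hat * s_hat
    return s_hat - half_width, s_hat + half_width

# Hypothetical inputs: survival estimate 0.9 with relative standard deviation 0.1.
lower, upper = pointwise_ci(0.9, 0.1)
```

With these inputs the upper bound exceeds 1, which shows that an untransformed interval is not constrained to $[0,1]$ .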

A disadvantage of constructing the confidence interval (CI) with the previous formula is that its bounds may fall outside the interval $[0,1]$ . A solution is to transform $S^{* (j)} (t)$ $(j \in \{1, \dots, m\})$ via a continuous, differentiable and invertible function g such that $g (S^{* (j)} (t))$ takes values in a wider, ideally unbounded, space and is better approximated by a Gaussian random variable. The delta method then yields an estimate of the standard deviation of the transformed quantity, denoted ${\hat{σ}}_{g (S_{t}^{*})}^{* (j)}$ and defined by ${\hat{σ}}_{g (S_{t}^{*})}^{* (j)} (t) = g^{'} ({\hat{S}}_{n}^{* (j)}) {\hat{σ}}_{S_{t}^{*}}^{* (j)} (t)$ . The confidence interval associated with the risk threshold $α$ is then built, for all $j \in \{1, \dots, m\}$ , as

$g^{- 1} (g ({\hat{S}}_{n}^{* (j)}) \pm z_{α / 2} g^{'} ({\hat{S}}_{n}^{* (j)}) {\hat{σ}}_{S_{t}^{*}}^{* (j)} (t)) .$

The most common transformation is $g (S_{t}^{*}) = \log [- \log (S_{t}^{*})]$ , and in this case we have, for all $j \in \{1, \dots, m\}$ ,

${\hat{σ}}_{\log [- \log (S_{t}^{*})]}^{* (j)} = \frac{{\hat{σ}}_{S_{t}^{*}}^{* (j)}}{{\hat{S}}_{n}^{* (j)} \log {\hat{S}}_{n}^{* (j)}}$ and the interval ${({\hat{S}}_{n}^{* (j)})}^{\exp (\pm z_{α / 2} \frac{{\hat{σ}}_{S_{t}^{*}}^{* (j)}}{{\hat{S}}_{n}^{* (j)} \log ({\hat{S}}_{n}^{* (j)})})} .$
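The log-log interval can be sketched as follows. Here `sd_s` is taken to be the estimated standard deviation of the survival estimate itself, which is an assumption about how ${\hat{σ}}_{S_{t}^{*}}^{* (j)}$ is scaled; the function name `loglog_ci` and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def loglog_ci(s_hat, sd_s, alpha=0.05):
    """Interval s_hat ** exp(+/- z * sd_s / (s_hat * log(s_hat))) for 0 < s_hat < 1."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # theta is negative because log(s_hat) < 0 on (0, 1); both signs are used below.
    theta = z * sd_s / (s_hat * math.log(s_hat))
    bounds = (s_hat ** math.exp(theta), s_hat ** math.exp(-theta))
    return min(bounds), max(bounds)

# Hypothetical inputs: survival estimate 0.9 with standard deviation 0.09.
lower, upper = loglog_ci(0.9, 0.09)
```

Because a number in $(0,1)$ raised to a positive power stays in $(0,1)$ , both bounds remain admissible values for a survival probability.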

Remark 1. It is also possible to use the log, arcsine-square-root or logit-type transformations available in most software, defined respectively, for all $j \in \{1, \dots, m\}$ , by

$g (S_{t}^{* (j)}) = \log [S_{t}^{* (j)}], g (S_{t}^{* (j)}) = \sin^{- 1} [\sqrt{S_{t}^{* (j)}}], g (S_{t}^{* (j)}) = \log [\frac{S_{t}^{* (j)}}{1 - S_{t}^{* (j)}}] .$

4.2. The Confidence Bands

The challenge now is to find a region containing the survival function with probability $1 - α$ , that is, a pair of bounds $b_{L} (t)$ and $b_{U} (t)$ which, with probability $1 - α$ , contain $S^{* (j)} (t)$ for all $t \in [t_{L}, t_{U}]$ and $j \in \{1, \dots, m\}$ . Among the proposed solutions, the two most commonly used are the Hall and Wellner ( [24] ) bands and the Nair ( [25] ) bands (“equal precision bands”). If $t_{k}$ is the largest event time observed in the sample, then the Nair bands require $0 < t_{L} < t_{U} \leq t_{k}$ , whereas the Hall-Wellner bands allow $t_{L} = 0$ , that is, $0 \leq t_{L} < t_{U} \leq t_{k}$ . Obtaining these bands is technically complex, and their practical utility relative to pointwise intervals is not obvious.

Remark 2. The starting point is the fact that, for all $j \in \{1, \dots, m\}$ , $\sqrt{n} (\frac{{\hat{S}}_{n}^{* (j)} (t)}{S^{* (j)} (t)} - 1)$ converges to a centered Gaussian martingale. A transformation then brings out a Brownian bridge $\{W^{0} (x), x \in [0,1]\}$ , weighted by $\frac{1}{\sqrt{x (1 - x)}}$ in Nair's construction, from which the suitable critical value is retrieved.

In particular, because the bands hold simultaneously over the whole interval, for a given t they are wider than the corresponding pointwise CI. In what follows we give the expressions obtained in the absence of transformation.

4.2.1. The Hall-Wellner Confidence Bands

Under the assumption of continuity of the survival functions $S^{* (j)} (t)$ and $C^{* (j)} (t)$ , related respectively to the event time and the censoring time, Hall and Wellner show that, for every $t \in [t_{L}, t_{U}]$ , the simultaneous confidence band at risk threshold $α$ is given for all $j = 1, \dots, m$ by:

${\hat{S}}_{n}^{* (j)} (t) \pm h_{α} (x_{L}, x_{U}) n^{\frac{- 1}{2}} [1 + n {\hat{σ}}_{S_{t}^{*}}^{* (j) 2} (t)] {\hat{S}}_{n}^{* (j)} (t),$ (17)

where $x_{L}$ and $x_{U}$ are given by

$x_{i} = \frac{n {\hat{σ}}_{S_{t_{i}}^{*}}^{* (j) 2} (t)}{1 + n {\hat{σ}}_{S_{t_{i}}^{*}}^{* (j) 2} (t)},$ for $i = L, U$ ,

and $h_{α} (x_{L}, x_{U})$ is the bound satisfying

$α = ℙ [\sup_{x_{L} \leq x \leq x_{U}} | W^{0} (x) | > h_{α} (x_{L}, x_{U})] .$
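For general $(x_{L}, x_{U})$ the critical value is usually read from tables or computed numerically. In the special case $x_{L} = 0$ , $x_{U} = 1$ , which is an assumption made for this sketch, the supremum of the absolute Brownian bridge follows the Kolmogorov distribution, whose tail is $2 \sum_{k \geq 1} {(- 1)}^{k - 1} e^{- 2 k^{2} h^{2}}$ , so the equation can be solved by bisection. The function names below are hypothetical.

```python
import math

def bridge_sup_tail(h, terms=100):
    """P(sup_{0 <= x <= 1} |W0(x)| > h), from the Kolmogorov series."""
    return 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * h * h)
                     for k in range(1, terms + 1))

def critical_value_full_range(alpha, lo=0.3, hi=3.0, tol=1e-10):
    """Solve bridge_sup_tail(h) = alpha by bisection (the tail decreases in h)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bridge_sup_tail(mid) > alpha:
            lo = mid      # tail still too large: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)

h_05 = critical_value_full_range(0.05)   # approximately 1.358
```

Restricting the supremum to a sub-interval $[x_{L}, x_{U}]$ can only lower the required critical value, so this full-range value is conservative for the band (17).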

4.2.2. The Nair Equal Precision Bands

Using a weighted Brownian bridge notably modifies the bounds of the CI. For $α \in (0,1)$ , $t \in [t_{L}, t_{U}]$ and all $j \in \{1, \dots, m\}$ , they are then given by:

${\hat{S}}_{n}^{* (j)} (t) \pm e_{α} (x_{L}, x_{U}) {\hat{σ}}_{S_{t}^{*}}^{* (j)},$ (18)

where $e_{α} (x_{L}, x_{U})$ satisfies

$α = ℙ [\sup_{x_{L} \leq x \leq x_{U}} \frac{| W^{0} (x) |}{\sqrt{x (1 - x)}} > e_{α} (x_{L}, x_{U})] .$

If we compare (16) and (18), we see that the bounds of the Nair ( [25] ) bands are proportional to the bounds of the pointwise CI and simply correspond to an adjustment of the risk threshold used there.

5. Conclusions and Perspectives

In this paper we have studied the asymptotic normality of Nelson-Aalen and Kaplan-Meier type estimators in the presence of independent right-censoring as defined in Njamen and Ngatchou ( [10] , [11] ) and Njamen [12] , using Rebolledo’s theorem, which allows the central limit theorem to be applied to certain types of martingales. From the results obtained, confidence bounds for the hazard and the survival functions are provided.

As a perspective, obtaining actual data would allow us to perform numerical simulations to gauge the robustness of our obtained estimators.

Acknowledgements

We thank the publisher and the referees for their comments, which considerably improved this article.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

Cite this paper

Njomen, D.A.N. (2019) Asymptotic Normality of the Nelson-Aalen and the Kaplan-Meier Estimators in Competing Risks. Applied Mathematics, 10, 545-560. https://doi.org/10.4236/am.2019.107038

References

1. Heckman, J.J. and Honoré, B.E. (1989) The Identifiability of the Competing Risks Model. Biometrika, 76, 325-330. https://www.jstor.org/stable/2336666 https://doi.org/10.1093/biomet/76.2.325

2. Commenges, D. (2017) Risques compétitifs et modèles multi-états en épidémiologie. Revue d’épidémiologie et de santé publique Elsevier Masson, 77, 605-611.

3. Com-Nougué, C., Guérin, S. and Rey, A. (1999) Estimation des risques associés à des événements multiples. Revue d’épidémiologie et de Santé Publique, 47, 75-85.

4. Fine, J.P. and Gray, R.J. (1999) A Proportional Hazards Model for the Subdistribution of a Competing Risk. Journal of the American Statistical Association, 94, 496-509. https://www.jstor.org/stable/2670170 https://doi.org/10.1080/01621459.1999.10474144

5. Crowder, M. (2001) Classical Competing Risks. Chapman and Hall, London.

6. Fermanian, J.D. (2003) Nonparametric Estimation of Competing Risks Models with Covariates. Journal of Multivariate Analysis, 85, 156-191. https://doi.org/10.1016/S0047-259X(02)00069-6

7. Latouche, A. (2004) Modèles de régression en présence de compétition. Thèse de doctorat, Université Paris 6, Paris. https://tel.archives-ouvertes.fr/tel-00129238

8. Geffray, S. (2009) Strong Approximations for Dependent Competing Risks with Independent Censoring with Statistical Applications. Test, 18, 76-95. https://doi.org/10.1007/s11749-008-0113-y

9. Belot, A. (2009) Modélisation flexible des données de survie en présence de risques concurrents et apports de la méthode du taux en excès. Thèse de doctorat, Université de la Méditerranée, Marseille.

10. Njamen, N.D.A. and Ngatchou, W.J. (2014) Nelson-Aalen and Kaplan-Meier Estimators in Competing Risks. Applied Mathematics, 5, 765-776. https://doi.org/10.4236/am.2014.54073

11. Njamen, N.D.A. and Ngatchou, W.J. (2018) Consistency of the Kaplan-Meier Estimator of the Survival Function in Competiting Risks. The Open Statistics and Probability Journal, 9, 1-17. https://benthamopen.com/TOSPJ/home https://doi.org/10.2174/1876527001809010001

12. Njamen, N.D.A. (2017) Convergence of the Nelson-Aalen Estimator in Competing Risks. International Journal of Statistics and Probability, 6, 9-23. https://doi.org/10.5539/ijsp.v6n3p9

13. Njamen, N.D.A. (2018) Study of the Nonparametric Kaplan-Meier Estimator of the Cumulative Incidence Function in Competiting Risks. Journal of Advanced Statistics, 3, 1-13. https://doi.org/10.22606/jas.2018.31001

14. Aalen, O.O. and Johansen, S. (1978) An Empirical Transition Matrix for Non-Homogeneous Markov Chains Based on Censored Observations. Scandinavian Journal of Statistics, 5, 141-150. https://www.jstor.org/stable/4615704

15. Peterson, G.L. (1977) A Simplification of the Protein Assay Method of Lowry et al. Which Is More Generally Applicable. Analytical Biochemistry, 83, 346-356. https://doi.org/10.1016/0003-2697(77)90043-4

16. Andersen, P.K., Borgan, Ø., Gill, R.D. and Keiding, N. (1993) Statistical Models Based on Counting Processes. Springer Series in Statistics, Springer-Verlag, New York.

17. Shorack, G.R. and Wellner, J.A. (1986) Empirical Processes with Applications to Statistics. John Wiley and Sons, Inc., New York.

18. Breslow, N. and Crowley, J. (1974) A Large Sample Study of the Life Table and Product-Limit Estimates under Random Censorship. The Annals of Statistics, 2, 437-453. https://www.jstor.org/stable/2958131 https://doi.org/10.1214/aos/1176342705

19. Nelson, W. (1972) A Short Life Test for Comparing a Sample with Previous Accelerated Test Results. Technometrics, 14, 175-185. https://www.jstor.org/stable/1266929 https://doi.org/10.1080/00401706.1972.10488894

20. Aalen, O.O. (1978) Nonparametric Inference for a Family of Counting Processes. The Annals of Statistics, 6, 701-726. https://www.jstor.org/stable/2958850 https://doi.org/10.1214/aos/1176344247

21. Fleming, T.R. and Harrington, D.P. (1990) Counting Processes and Survival Analysis. John Wiley and Sons, Hoboken.

22. Breuils, C. (2003) Analyse de Durées de Vie: Analyse Séquentielle du Modèle des Risques Proportionnels et Tests d’Homogénéité. Thèse de doctorat, Université de Technologie de Compiègne, Compiègne. https://tel.archives-ouvertes.fr/tel-00005524

23. Kaplan, E.L. and Meier, P. (1958) Nonparametric Estimation from Incomplete Observations. Journal of the American Statistical Association, 53, 457-481. https://www.jstor.org/stable/2281868 https://doi.org/10.1080/01621459.1958.10501452

24. Hall, W.J. and Wellner, J.A. (1980) Confidence Bands for a Survival Curve. Biometrika, 67, 133-143. https://www.jstor.org/stable/2335326 https://doi.org/10.1093/biomet/67.1.133

25. Nair, V.N. (1984) Confidence Bands for Survival Functions with Censored Data: A Comparative Study. Technometrics, 26, 265-275. https://www.jstor.org/stable/1267553 https://doi.org/10.1080/00401706.1984.10487964
