Vol.6 No.10(2014), Article ID:46952,11 pages DOI:10.4236/ns.2014.610074
Statistical Diagnosis for General Transformation Model with Right Censored Data Based on Empirical Likelihood
Shuling Wang1, Xiaohong Deng1, Lin Zheng2
1Department of Fundamental Course, Air Force Logistics College, Xuzhou, China
2School of Statistics and Applied Mathematics, Anhui University of Finance and Economics, Bengbu, China
Copyright © 2014 by authors and Scientific Research Publishing Inc.
This work is licensed under the Creative Commons Attribution International License (CC BY).
Received 1 May 2014; revised 21 May 2014; accepted 8 June 2014
In this work, we consider statistical diagnostics for general transformation models with right censored data based on empirical likelihood. These models form a flexible class of semiparametric survival models and include many popular survival models as special cases. Based on the empirical likelihood methodology, we define several diagnostic statistics. Simulation studies show that our proposed procedure works fairly well.
Keywords: Random Right Censorship, Empirical Likelihood, Outliers, Influence Analysis
Statistical diagnostics emerged as a new branch of statistics in the mid-1970s. Over the past 40 years, diagnostics and influence analysis for the linear regression model have been fully developed (R. D. Cook and S. Weisberg, Bocheng Wei, Guobin Lu & Jianqing Shi). Influence diagnostics have also been developed for the proportional hazards model (L. A. Weissfeld) and for other survival models, for example the proportional odds model, the heteroscedastic linear transformation model, the generalized linear transformation model and the generalized transformation model.
The empirical likelihood method originates from Thomas & Grunkemeier. Owen first proposed the definition of empirical likelihood and gave a systematic account of the method. The empirical CDF of $X_1, \ldots, X_n$ is defined as $F_n(x) = \frac{1}{n}\sum_{i=1}^{n} I(X_i \le x)$ for $-\infty < x < \infty$. The empirical likelihood of the CDF $F$ is $L(F) = \prod_{i=1}^{n}\{F(X_i) - F(X_i-)\}$. Zhu and Ibrahim utilized this method for statistical diagnostics: they developed diagnostic measures for assessing the influence of individual observations when using empirical likelihood with general estimating equations, and used these measures to construct goodness-of-fit statistics for testing possible misspecification in the estimating equations. Liugen Xue and Lixing Zhu summarized the applications of the empirical likelihood method.
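The empirical CDF and the empirical likelihood described above can be illustrated with a small Python sketch. The function names and the toy sample are hypothetical and are not taken from the paper; the sketch assumes a sample without ties, so that the empirical likelihood of a distribution is the product of the probability masses it places on the observations.

```python
from bisect import bisect_right
from math import log

def empirical_cdf(sample):
    """Return F_n, where F_n(x) is the fraction of observations <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def F_n(x):
        return bisect_right(ordered, x) / n

    return F_n

def log_empirical_likelihood(masses):
    """Log of L(F) = prod p_i, with p_i the mass F puts on observation i."""
    return sum(log(p) for p in masses)

sample = [2.1, 0.4, 3.3, 1.7, 0.9]        # toy data without ties (assumption)
n = len(sample)
F_n = empirical_cdf(sample)

uniform = [1.0 / n] * n                   # masses of the empirical CDF
tilted = [0.30, 0.10, 0.20, 0.25, 0.15]   # another distribution on the same points

print(F_n(1.7))
print(log_empirical_likelihood(uniform), log_empirical_likelihood(tilted))
```

Among all distributions supported on the observations, the empirical CDF (equal masses 1/n) gives the largest empirical likelihood, which is why the second printed value is smaller than the first.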
Many authors have successfully applied empirical likelihood to the analysis of survival data. For example, Qin and Jing investigated empirical likelihood confidence intervals for Cox’s regression models with right censored data; He studied the goodness-of-fit of Cox’s regression models with various types of censored data; Gu et al. considered inferences for Cox’s regression models with time-dependent coefficients; Zhou, Zheng and Yu, and Zhou et al. studied empirical likelihood for accelerated failure time models, multivariate accelerated failure time models and heteroscedastic accelerated failure time models, respectively. Li et al. reviewed some applications of empirical likelihood in survival analysis; Lu and Liang discussed an empirical likelihood procedure based on estimating equations for a flexible class of survival models, the linear transformation models, which includes the popular proportional hazards and proportional odds regression models as special cases. Jianbo Li et al. studied empirical likelihood inference for general transformation models with right censored data.
In this paper, we consider statistical diagnostics for a very general class of survival models, namely general transformation models with right censored data, of the form
$$S_Z(t) = \Phi(S_0(t), Z, \theta), \qquad (1)$$
where $S_Z(t)$ is the conditional survival function of the failure time variable $T$ given the covariate vector $Z$; $S_0(t)$ is a completely unspecified baseline survival function when $Z = 0$; $\Phi(u, v, w)$ is a known monotonically increasing function with respect to $u$ satisfying $\Phi(0, v, w) = 0$ and $\Phi(1, v, w) = 1$ for any $v$ and $w$; $\theta$ is a parameter vector including regression coefficients and possible model transformation parameters in $\Phi$. Model (1) includes many popular survival models, for example heteroscedastic linear transformation models, as special cases. Note that when
So far, diagnostics for the general transformation model with random right censorship based on the empirical likelihood method have not appeared in the literature, and this paper attempts to fill that gap. One advantage of the procedure is that it is free of the baseline survival function and the censoring distribution. The class of models we investigate is also more general than those in previous studies of survival models.
The rest of the paper is organized as follows. Empirical likelihood and estimation equation are presented in Section 2. The main results are given in Section 3 and Section 4. Section 5 contains some simulation studies as well as applications. Conclusions with discussions are given in Section 6.
2. Empirical Likelihood and Estimation Equation
Let $C$ be the censoring variable, $X = \min(T, C)$ be the censored event time variable and $\delta = I(T \le C)$ be the censoring indicator. Suppose are i.i.d. copies of. Denote by the total number of uncensored failure times and by the partial ranking among the uncensored failure times and the censored observations between each neighboring pair of uncensored observations. Given the partial ranking and the covariate observations, Jianbo Li et al. proposed the empirical log-likelihood ratio function for, defined by
The empirical log-likelihood ratio statistic is equal to the maximum
Regard and as independent variables and define
Obviously, the maximum empirical likelihood estimates and are the solutions of the following equations
3. Case-Deletion Influence Measures
Consider Model (1) with the j-th case deleted.
This model is called the case-deletion model. Let be the maximum empirical likelihood estimate of in model (2). To study the influence of the j-th case, we compare the difference between and
. The key result is given in the following theorem.
By Zhu, et al., for model (2), the maximum empirical likelihood estimator of is
3.1. Empirical Cook Distance
Zhu, et al. proposed the empirical Cook distance. Let M be a nonnegative definite matrix. The empirical Cook distance is defined as follows
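As a rough illustration of a case-deletion Cook-type distance, the Python sketch below assumes the usual quadratic form: the squared difference between the full-data estimate and the estimate with the j-th case deleted, weighted by M. It does not use the transformation model of this paper. Instead it uses the toy estimating equation g(x, theta) = x - theta, for which the maximum empirical likelihood estimate is the sample mean. The function names, the data and the choice M = 1 are all hypothetical.

```python
def el_estimate(sample):
    """Maximum empirical likelihood estimate for g(x, theta) = x - theta.

    In this just-identified toy case the estimate is the sample mean.
    """
    return sum(sample) / len(sample)

def empirical_cook_distances(sample, M=1.0):
    """Case-deletion distances (theta_(j) - theta_hat)^2 * M, scalar parameter."""
    theta_hat = el_estimate(sample)
    distances = []
    for j in range(len(sample)):
        reduced = sample[:j] + sample[j + 1:]     # delete the j-th case
        theta_j = el_estimate(reduced)
        distances.append((theta_j - theta_hat) ** 2 * M)
    return distances

# Hypothetical data: the fifth value is an artificial outlier.
sample = [1.0, 1.2, 0.9, 1.1, 5.0, 1.0]
cook = empirical_cook_distances(sample)
most_influential = cook.index(max(cook)) + 1      # 1-based case number
print(most_influential, cook)
```

For a vector parameter the square would be replaced by the quadratic form in M; the scalar case is used here only to keep the sketch short.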
3.2. Empirical Likelihood Distance
The empirical likelihood distance is motivated from the viewpoint of data fitting: it measures the influence of deleting the j-th case on the fit. To eliminate the influence of scale, it is also necessary to divide by the variance of the estimator.
Because the key point is to assess the influence of deleting the j-th case, is substituted by. Then, the W-K statistic can be expressed as follows
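The likelihood-distance idea of this subsection can be sketched in Python for a much simpler problem than the transformation model: empirical likelihood for a mean. The sketch assumes the distance has the usual form, twice the drop in the empirical log-likelihood ratio when the full-data estimate is replaced by the case-deleted estimate. The empirical log-likelihood ratio for a mean is computed with the standard Lagrange-multiplier representation, solved here by bisection. All names and data are hypothetical.

```python
from math import log

def el_log_ratio(sample, theta, iters=200):
    """Empirical log-likelihood ratio for the mean at theta.

    l(theta) = -sum(log(1 + lam * d_i)), d_i = x_i - theta, where lam solves
    sum(d_i / (1 + lam * d_i)) = 0 with every 1 + lam * d_i positive.
    """
    d = [x - theta for x in sample]
    if min(d) >= 0.0 or max(d) <= 0.0:
        raise ValueError("theta must lie strictly inside the range of the data")
    lo, hi = -1.0 / max(d), -1.0 / min(d)   # interval keeping all weights positive
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        score = sum(di / (1.0 + lam * di) for di in d)
        if score > 0.0:                     # score is decreasing in lam
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return -sum(log(1.0 + lam * di) for di in d)

def el_distances(sample):
    """2 * {l(theta_hat) - l(theta_(j))}, using l(theta_hat) = 0 at the sample mean."""
    n, total = len(sample), sum(sample)
    distances = []
    for j in range(n):
        theta_j = (total - sample[j]) / (n - 1)   # estimate without case j
        distances.append(-2.0 * el_log_ratio(sample, theta_j))
    return distances

# Hypothetical data: the fifth value is an artificial outlier.
sample = [1.0, 1.2, 0.9, 1.1, 5.0, 1.0]
eld = el_distances(sample)
print(eld.index(max(eld)) + 1, eld)
```

Unlike the Cook-type distance, this measure needs no weight matrix, because the empirical likelihood itself supplies the scale.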
4. Local Influence Analysis of Model
We consider the local influence method for a case-weight perturbation, for which the empirical log-likelihood function is defined by. In this case, , defined to be an
vector with all elements equal to 1, represents no perturbation to the empirical likelihood, because. Thus, the empirical likelihood displacement is defined as
where is the maximum empirical likelihood estimator of based on. Let with and
where is a direction in. Thus, the normal curvature of the influence graph is given by
where in which is a
matrix with -th element given by.
We consider two local influence measures based on the normal curvature as follows. Let
be the ordered eigenvalues of the matrix and let
be the associated orthonormal basis, that is,. Thus, the spectral decomposition of is given by
The most popular local influence measures include, which corresponds to the largest eigenvalue, as well as, where is an vector with j-th component 1 and 0 elsewhere. The represents the most influential perturbation to the empirical likelihood function, whereas the j-th observation with a large can be regarded as influential.
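The two local influence measures just described can be sketched numerically. The Python sketch below assumes the normal curvature in a unit direction h is the quadratic form h'Hh for a symmetric nonnegative definite matrix H. The 4 x 4 matrix is hypothetical and is not derived from the model. The leading eigenvector is found by power iteration, and the curvature in the coordinate direction e_j reduces to the j-th diagonal element of H.

```python
def normal_curvature(H, h):
    """Quadratic form h' H h for a direction h (assumed to have unit length)."""
    n = len(H)
    return sum(h[i] * H[i][k] * h[k] for i in range(n) for k in range(n))

def leading_eigenpair(H, iters=200):
    """Largest eigenvalue and unit eigenvector of symmetric H by power iteration."""
    n = len(H)
    v = [1.0 + 0.01 * i for i in range(n)]            # arbitrary starting vector
    for _ in range(iters):
        w = [sum(H[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return normal_curvature(H, v), v

# Hypothetical curvature matrix; the fourth case dominates by construction.
H = [[0.20, 0.05, 0.00, 0.02],
     [0.05, 0.30, 0.01, 0.04],
     [0.00, 0.01, 0.10, 0.03],
     [0.02, 0.04, 0.03, 2.50]]

lam_max, v_max = leading_eigenpair(H)
C_e = [H[j][j] for j in range(len(H))]                # curvature along each e_j
print(lam_max, v_max, C_e)
```

In practice a full symmetric eigendecomposition routine such as numpy.linalg.eigh would normally be used; power iteration is chosen here only to keep the sketch free of third-party dependencies.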
Following the discussion of Zhu et al., for the general transformation regression model with random right censorship, we can deduce that
5. Numerical Studies
In this section, we simulate data with sample sizes from the following transformation model
where, , , , , where denotes the Bernoulli distribution and denotes the uniform distribution. For the simulation studies, we consider two choices of: 1) standard exponential survival function 2). Note that when takes the standard exponential survival function and, Model (1) corresponds to the proportional hazards Cox regression model and the proportional odds regression model. For both models, we generate censoring times from. By properly choosing the values of, we consider three censoring proportions for all the cases (Qian Jun, et al.). The survival data, simulated with SAS, are shown in Table 1.
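The data-generating scheme described above can be sketched in Python for the proportional hazards case with a standard exponential baseline. This is not the SAS program used for Table 1. The regression coefficients, the Bernoulli success probability, the range of the uniform covariate and the upper limit `c_max` of the uniform censoring distribution are all hypothetical values chosen for illustration.

```python
import random
from math import exp, log

def simulate_ph_data(n, beta, c_max, seed=0):
    """Simulate right censored data from a Cox model with S_0(t) = exp(-t).

    Returns tuples (x, delta, z1, z2): observed time, censoring indicator
    (1 = failure observed, 0 = censored) and two covariates.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z1 = 1.0 if rng.random() < 0.5 else 0.0   # Bernoulli(0.5), assumed
        z2 = rng.uniform(0.0, 1.0)                # Uniform(0, 1), assumed
        rate = exp(beta[0] * z1 + beta[1] * z2)
        u = 1.0 - rng.random()                    # uniform on (0, 1]
        t = -log(u) / rate                        # S_Z(t) = exp(-t) ** rate
        c = rng.uniform(0.0, c_max)               # censoring time
        data.append((min(t, c), 1 if t <= c else 0, z1, z2))
    return data

def censoring_proportion(data):
    """Fraction of observations that are censored."""
    return sum(1 - delta for _, delta, _, _ in data) / len(data)

# Smaller c_max gives heavier censoring; c_max is tuned to hit a target proportion.
for c_max in (1.0, 2.0, 5.0):
    sample = simulate_ph_data(100, beta=(1.0, -0.5), c_max=c_max)
    print(c_max, censoring_proportion(sample))
```

Because the same seed is reused, the failure times are identical across the three runs and only the censoring times change, so the effect of `c_max` on the censoring proportion can be read off directly.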
To check the validity of our proposed methodology, we change the response variable values of the third, 20th, 54th, 80th and 99th observations.
For every case, it is easy to obtain. For the parameters and, using the samples, we evaluated their maximum empirical likelihood estimators for the two models.
Consequently, it is easy to calculate the values of and. The results for are shown in the following figures.
From all the figures, we can see that in most cases the values of are reasonably close to one fixed value. Following the definition and properties of, we can diagnose the strong influence points as those whose values deviate seriously from the average. From Figures 1-3, the values of indicate that the third, 20th, 54th and 80th observations are strong influence points. From Figures 4-6, the values of indicate that the third, 20th, 54th, 80th and 99th observations are strong influence points. These are exactly the observations whose responses were altered (with the 99th detected only in Figures 4-6), which illustrates the validity of the proposed approach.
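The paper reads the influential points off the figures and gives no numerical cutoff. As one possible way to automate that visual check, the Python sketch below flags diagnostic values that lie far from the bulk, using the median and the median absolute deviation. The rule, the multiplier `k` and the toy values are assumptions made for illustration and are not part of the proposed method.

```python
from statistics import median

def flag_influential(values, k=5.0):
    """Return 1-based case numbers whose value is far from the bulk.

    A case is flagged when |value - median| > k * MAD, where MAD is the
    median absolute deviation; k is a hypothetical tuning constant.
    """
    center = median(values)
    deviations = [abs(v - center) for v in values]
    mad = median(deviations)
    return [j + 1 for j, dev in enumerate(deviations) if dev > k * mad]

# Toy diagnostic values: cases 4 and 9 are artificially inflated.
values = [0.10, 0.12, 0.11, 2.50, 0.09, 0.10, 0.13, 0.11, 1.80, 0.10]
print(flag_influential(values))
```

A median-based rule is used because the mean and standard deviation are themselves pulled toward the influential cases being looked for.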
In this paper, we considered statistical diagnostics for general transformation models with right censored data based on empirical likelihood. We also studied in detail the method of simulating survival data under three different censoring proportions. Simulation studies illustrate that our proposed method works fairly well.
Zhensheng Huang analyzed empirical likelihood for the varying-coefficient single-index model with right censored data. In addition, Huang and Zhang studied profile empirical likelihood inferences for the single-index-coefficient regression model. All of these will be topics for our further research.
Table 1. Survival data (Note: a star in the top right corner denotes a censored observation).
Figure 1. The influence of Model (1).
Figure 2. The influence of Model (1).
Figure 3. The influence of Model (1).
Figure 4. The influence of Model (2).
Figure 5. The influence of Model (2).
Figure 6. The influence of Model (2).
- Cook, R.D. and Weisberg, S. (1982) Residuals and Influence in Regression. Chapman and Hall, New York.
- Wei, B.C., Lu, G.B. and Shi, J.Q. (1990) Statistical Diagnostics. Publishing House of Southeast University, Nanjing.
- Weissfeld, L.A. (1990) Influence Diagnostics for the Proportional Hazards Model. Statistics & Probability Letters, 10, 411-417. http://dx.doi.org/10.1016/0167-7152(90)90022-Y
- Thomas, D.R. and Grunkemeier, G.L. (1975) Confidence Interval Estimation of Survival Probabilities for Censored Data. Journal of the American Statistical Association, 70, 865-871. http://dx.doi.org/10.1080/01621459.1975.10480315
- Owen, A. (2001) Empirical Likelihood. Chapman and Hall, New York. http://dx.doi.org/10.1201/9781420036152
- Zhu, H.T., Ibrahim, J.G., Tang, N.S., et al. (2008) Diagnostic Measures for Empirical Likelihood of General Estimating Equations. Biometrika, 95, 489-507. http://dx.doi.org/10.1093/biomet/asm094
- Xue, L.G. and Zhu, L.X. (2010) Empirical Likelihood in Nonparametric and Semiparametric Models. Science Press, Beijing.
- Qin, G. and Jing, B. (2001) Empirical Likelihood for Cox Regression Model under Random Censorship. Communications in Statistics - Simulation and Computation, 30, 79-90.
- He, B. (2006) Application of the Empirical Likelihood Method in Proportional Hazards Model. Ph.D. Thesis, University of Central Florida, Orlando.
- Gu, M., Sun, L. and Zuo, G. (2005) A Baseline-Free Procedure for Transformation Models under Interval Censorship. Lifetime Data Analysis, 11, 437-488. http://dx.doi.org/10.1007/s10985-005-5235-x
- Zhou, M. (2005) Empirical Likelihood Analysis of the Rank Estimator for the Censored Accelerated Failure Time Model. Biometrika, 92, 492-498. http://dx.doi.org/10.1093/biomet/92.2.492
- Zheng, M. and Yu, W. (2011) Empirical Likelihood Method for the Multivariate Accelerated Failure Time Models. Journal of Statistical Planning and Inference, 141, 972-983. http://dx.doi.org/10.1016/j.jspi.2010.09.001
- Zhou, M., Kim, M. and Bathke, A. (2012) Empirical Likelihood Analysis for the Heteroscedastic Accelerated Failure Time Model. Statistica Sinica, 22, 295-316.
- Li, G., Li, R. and Zhou, M. (2005) Empirical Likelihood in Survival Analysis. In: Fan, J. and Li, G., Eds., Contemporary Multivariate Analysis and Design of Experiments, World Scientific, Singapore City, 337-350. http://dx.doi.org/10.1142/9789812567765_0020
- Lu, W. and Liang, Y. (2006) Empirical Likelihood Inference for Linear Transformation Models. Journal of Multivariate Analysis, 97, 1586-1599. http://dx.doi.org/10.1016/j.jmva.2005.09.007
- Li, J.B., Huang, Z.S. and Lian, H. (2013) Empirical Likelihood Inference for General Transformation Models with Right Censored Data. Statistics and Computing, 8, 1-10. http://dx.doi.org/10.1007/s11222-013-9415-3
- Clayton, D. and Cuzick, J. (1985) Multivariate Generalizations of the Proportional Hazards Model. Journal of the Royal Statistical Society, Series A, 148, 82-117. http://dx.doi.org/10.2307/2981943
- Dabrowska, D. and Doksum, K. (1988) Partial Likelihood in Transformation Models with Censored Data. Scandinavian Journal of Statistics, 15, 1-23.
- Bickel, P. (1998) Efficient and Adaptive Estimation for Semiparametric Models. Springer, Berlin.
- Cheng, S., Wei, L. and Ying, Z. (1995) Analysis of Transformation Models with Censored Data. Biometrika, 82, 835-845. http://dx.doi.org/10.1093/biomet/82.4.835
- Fine, J., Ying, Z. and Wei, L. (1998) On the Linear Transformation Model for Censored Data. Biometrika, 85, 980-986. http://dx.doi.org/10.1093/biomet/85.4.980
- Qin, J. and Lawless, J. (1994) Empirical Likelihood and General Estimating Equations. Annals of Statistics, 22, 300-325. http://dx.doi.org/10.1214/aos/1176325370
- Jun, Q., et al. (2013) Simulate Survival Data under Different Censored Proportions. Journal of Mathematical Medicine, 26, 644-646.
- Huang, Z.S. (2012) Empirical Likelihood for Varying-Coefficient Single-Index Model with Right-Censored Data. Metrika, 75, 55-71. http://dx.doi.org/10.1007/s00184-010-0314-8
- Huang, Z.S. and Zhang, R.Q. (2013) Profile Empirical-Likelihood Inferences for the Single-Index-Coefficient Regression Model. Statistics and Computing, 23, 455-465. http://dx.doi.org/10.1007/s11222-012-9322-z