Natural Science
Vol.6 No.10(2014), Article ID:46952,11 pages DOI:10.4236/ns.2014.610074

Statistical Diagnosis for General Transformation Model with Right Censored Data Based on Empirical Likelihood

Shuling Wang1, Xiaohong Deng1, Lin Zheng2

1Department of Fundamental Course, Air Force Logistics College, Xuzhou, China

2School of Statistics and Applied Mathematics, Anhui University of Finance and Economics, Bengbu, China

Email: 155328313@qq.com

Copyright © 2014 by authors and Scientific Research Publishing Inc.

This work is licensed under the Creative Commons Attribution International License (CC BY).

http://creativecommons.org/licenses/by/4.0/

Received 1 May 2014; revised 21 May 2014; accepted 8 June 2014

ABSTRACT

In this work, we consider statistical diagnostics for general transformation models with right censored data based on empirical likelihood. These models form a class of flexible semiparametric survival models and include many popular survival models as special cases. Based on the empirical likelihood methodology, we define several diagnostic statistics. Simulation studies show that our proposed procedure works fairly well.

Keywords: Random Right Censorship, Empirical Likelihood, Outliers, Influence Analysis

1. Introduction

Statistical diagnostics emerged as a new branch of statistics in the mid-1970s. Over the past 40 years, diagnostics and influence analysis for the linear regression model have been fully developed (R. D. Cook and S. Weisberg [1], Bocheng Wei, Guobin Lu & Jianqing Shi [2]). Influence diagnostics have also been developed for the proportional hazards model (L. A. Weissfeld [3]) and for other survival models, for example, the proportional odds model, the heteroscedastic linear transformation model, the generalized linear transformation model and the generalized transformation model.

The empirical likelihood method originates from Thomas & Grunkemeier [4]. Owen [5] first proposed the definition of empirical likelihood and systematically expounded its theory. The empirical CDF of $X_1, \ldots, X_n$ is defined as $F_n(x) = \frac{1}{n}\sum_{i=1}^{n} I(X_i \le x)$ for $-\infty < x < \infty$. The empirical likelihood of the CDF $F$ is $L(F) = \prod_{i=1}^{n}\{F(X_i) - F(X_i-)\}$. Zhu and Ibrahim [6] utilized this method for statistical diagnostics: they developed diagnostic measures for assessing the influence of individual observations when using empirical likelihood with general estimating equations, and used these measures to construct goodness-of-fit statistics for testing possible misspecification in the estimating equations. Liugen Xue and Lixing Zhu [7] summarized the applications of the empirical likelihood method.
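As a small illustration of the two objects just described, the sketch below evaluates an empirical CDF and Owen's empirical likelihood ratio statistic for a mean. It is only a minimal sketch under stated assumptions: the function names `empirical_cdf` and `el_stat_for_mean` are hypothetical, the data are uncensored, and the Lagrange multiplier is found by plain bisection rather than by any method used in the cited works.

```python
import numpy as np

def empirical_cdf(sample, x):
    """Empirical CDF F_n(x): fraction of observations <= x."""
    return float(np.mean(np.asarray(sample, dtype=float) <= x))

def el_stat_for_mean(sample, mu, n_iter=200):
    """Return -2 log R(mu), the empirical likelihood ratio statistic for a mean.

    The optimal weights are p_i = 1 / (n * (1 + lam * (X_i - mu))), where the
    Lagrange multiplier lam solves sum_i z_i / (1 + lam * z_i) = 0, z_i = X_i - mu.
    """
    z = np.asarray(sample, dtype=float) - mu
    if z.min() >= 0.0 or z.max() <= 0.0:
        # mu lies outside the convex hull of the data: no valid weights exist.
        return float("inf")
    # lam must keep every 1 + lam * z_i strictly positive.
    lo, hi = -1.0 / z.max(), -1.0 / z.min()
    for _ in range(n_iter):
        lam = 0.5 * (lo + hi)
        score = np.sum(z / (1.0 + lam * z))  # strictly decreasing in lam
        if score > 0.0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return float(2.0 * np.sum(np.log1p(lam * z)))
```

The statistic is zero at the sample mean and grows as `mu` moves away from it, which is what makes it usable for confidence regions and, later, for influence measures.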

Many authors have successfully applied empirical likelihood to the analysis of survival data. For example, Qin and Jing [8] investigated empirical likelihood confidence intervals for Cox’s regression models with right censored data; He [9] studied the goodness-of-fit of Cox’s regression models with various types of censored data; Gu et al. [10] considered inferences for Cox’s regression models with time-dependent coefficients; Zhou [11], Zheng and Yu [12] and Zhou et al. [13] studied empirical likelihood for accelerated failure time models, multivariate accelerated failure time models and heteroscedastic accelerated failure time models, respectively. Li et al. [14] gave an overview of some applications of empirical likelihood in survival analysis; Lu and Liang [15] discussed an empirical likelihood procedure based on estimating equations for a class of flexible survival models, linear transformation models, which includes the popular proportional hazards and proportional odds regression models as special cases. Jianbo Li et al. [16] studied empirical likelihood inference for general transformation models with right censored data.

In this paper, we consider statistical diagnostics for a very general class of survival models, general transformation models with right censored data, of the form

$$S_Z(t) = \Phi(S_0(t), Z, \theta), \qquad (1)$$

where $S_Z(t)$ is the conditional survival function of the failure time variable $T$ given the covariate vector $Z$; $S_0(t)$ is a completely unspecified baseline survival function when $Z = 0$; $\Phi(u, z, \theta)$ is a known monotonically increasing function with respect to $u$ satisfying $\Phi(0, z, \theta) = 0$ and $\Phi(1, z, \theta) = 1$ for any $z$ and $\theta$; $\theta$ is a parameter vector including regression coefficients and possible model transformation parameters in $\Phi$. Model (1) includes many popular survival models, for example heteroscedastic linear transformation models, as special cases. Note that when

where is a survival function, Model (1) reduces to the popular linear transformation models (Clayton and Cuzick [17]; Dabrowska and Doksum [18]; Bickel [19]; Cheng et al. [20]; Fine et al. [21]).

So far, diagnostics for the general transformation model with random right censorship based on the empirical likelihood method have not yet appeared in the literature, and this paper attempts to fill that gap. One advantage of this procedure is that it is free of the baseline survival function and the censoring distribution. The class of models we investigate is also more general than those in previous studies of survival models.

The rest of the paper is organized as follows. Empirical likelihood and estimation equation are presented in Section 2. The main results are given in Section 3 and Section 4. Section 5 contains some simulation studies as well as applications. Conclusions with discussions are given in Section 6.

2. Empirical Likelihood and Estimation Equation

Let $C$ be the censoring variable, $X = \min(T, C)$, with $T$ the failure time, be the censored event time variable and $\delta = I(T \le C)$ be the censoring indicator. Suppose $(X_i, \delta_i, Z_i)$, $i = 1, \ldots, n$, are i.i.d. copies of $(X, \delta, Z)$, where $Z$ is the covariate vector. Consider the total number of uncensored failure times and the partial ranking among the uncensored failure times and the censored observations between each neighboring pair of uncensored observations. Given the partial ranking and covariate observations, Jianbo Li et al. [16] proposed an empirical log-likelihood ratio function for $\theta$, defined by

where,

,

,

,

.

By Qin and Lawless [22] , Owen [5] , when

the empirical log-likelihood ratio statistic is equal to the maximum

where and.

Regard $\theta$ and the Lagrange multiplier $\lambda$ as independent variables and define

.

Obviously, the maximum empirical likelihood estimates $\hat\theta$ and $\hat\lambda$ are the solutions of the following equations

.
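For orientation, the generic construction of Qin and Lawless [22] can be written out as below. This is a sketch of the standard formulation under assumed notation, not the specific estimating function used for Model (1): $g_i(\theta)$ stands for a hypothetical estimating function evaluated at the i-th observation, $p_i$ for the probability weight on that observation, and $\lambda$ for the Lagrange multiplier.

```latex
% Generic empirical likelihood with estimating equations (assumed notation).
\begin{align*}
R(\theta) &= \max\Big\{ \prod_{i=1}^{n} n p_i \;:\; p_i \ge 0,\
  \sum_{i=1}^{n} p_i = 1,\ \sum_{i=1}^{n} p_i\, g_i(\theta) = 0 \Big\}, \\
p_i &= \frac{1}{n}\,\frac{1}{1 + \lambda^{T} g_i(\theta)}, \qquad
  \sum_{i=1}^{n} \frac{g_i(\theta)}{1 + \lambda^{T} g_i(\theta)} = 0, \\
-2 \log R(\theta) &= 2 \sum_{i=1}^{n} \log\{1 + \lambda^{T} g_i(\theta)\}.
\end{align*}
```

The second line is the constraint that determines the multiplier for a given parameter value, which is why the two can be treated as joint unknowns of one system of equations.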

3. Case-Deletion Influence Measures

Consider Model (1) with the j-th case deleted:

. (2)

This model is called the case-deletion model. Let $\hat\theta_{(j)}$ be the maximum empirical likelihood estimate of $\theta$ in model (2). To study the influence of the j-th case, we compare $\hat\theta_{(j)}$ with the full-sample estimate $\hat\theta$. The key result is as follows.

By Zhu et al. [6], for model (2), the maximum empirical likelihood estimator of $\theta$ is

, (3)

where,.

3.1. Empirical Cook Distance

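Zhu et al. [6] proposed the empirical Cook distance. Let $M$ be a nonnegative definite matrix. With $\hat\theta_{(j)}$ and $\hat\theta$ denoting the estimates obtained without and with the j-th case, the empirical Cook distance is defined as follows

$$ECD_j(M) = (\hat\theta_{(j)} - \hat\theta)^T M (\hat\theta_{(j)} - \hat\theta), \qquad (4)$$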

where.
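The case-deletion idea can be illustrated with a brute-force sketch: refit the model without each case and measure the shift of the estimate through a quadratic form. The code below is only an illustration under stated assumptions. It uses ordinary least squares as a stand-in estimator (not the maximum empirical likelihood estimator of Model (1)), the default weighting matrix `X.T @ X` is an assumption, and the function names are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares estimate; a stand-in for the estimator of interest."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def case_deletion_distances(X, y, M=None):
    """Quadratic-form distance between the full and case-deleted estimates.

    For each case j, the model is refitted without case j and the distance
    (theta_j - theta)^T M (theta_j - theta) is returned.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    theta_full = fit_ols(X, y)
    if M is None:
        M = X.T @ X  # assumed weighting matrix, nonnegative definite
    distances = np.empty(n)
    for j in range(n):
        keep = np.arange(n) != j          # drop the j-th case
        diff = fit_ols(X[keep], y[keep]) - theta_full
        distances[j] = diff @ M @ diff
    return distances
```

A case whose distance is far larger than the rest is a candidate influential point. Refitting n times is expensive for large samples, which is why closed-form or one-step approximations to the case-deleted estimate are usually preferred in practice.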

3.2. Empirical Likelihood Distance

The empirical likelihood distance is proposed from the viewpoint of data fitting: it considers the influence of deleting the j-th case on the fit. To eliminate the influence of scale, the difference also needs to be divided by the variance of the estimator.

Because the focus is on the influence of deleting the j-th case, the full-sample estimate is replaced by its case-deleted counterpart. Then, the W-K statistic can be expressed as follows

(5)

4. Local Influence Analysis of Model

We consider the local influence method for a case-weight perturbation $\omega = (\omega_1, \ldots, \omega_n)^T$, for which the empirical log-likelihood function is denoted by $l(\theta \mid \omega)$. In this case, $\omega_0$, defined to be an $n \times 1$ vector with all elements equal to 1, represents no perturbation to the empirical likelihood, because $l(\theta \mid \omega_0) = l(\theta)$. Thus, the empirical likelihood displacement is defined as

where is the maximum empirical likelihood estimator of based on. Let with and

where is a direction in. Thus, the normal curvature of the influence graph is given by

where in which is a

matrix with -th element given by.

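We consider two local influence measures based on the normal curvature as follows. Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ be the ordered eigenvalues of the matrix in the normal curvature and let $\{v_1, \ldots, v_n\}$ be the associated orthonormal basis of eigenvectors, that is, the matrix maps each $v_k$ to $\lambda_k v_k$. Thus, the spectral decomposition of this matrix is given by

$$\sum_{k=1}^{n} \lambda_k v_k v_k^T.$$

The most popular local influence measures include $v_{\max}$, the eigenvector corresponding to the largest eigenvalue $\lambda_1$, as well as $C_{e_j}$, where $e_j$ is an $n \times 1$ vector with j-th component 1 and 0 otherwise. The vector $v_{\max}$ represents the most influential perturbation to the empirical likelihood function, whereas the j-th observation with a large $C_{e_j}$ can be regarded as influential.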
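The two local influence measures can be sketched numerically once the symmetric matrix in the normal curvature is available. The code below is a minimal sketch under stated assumptions: the input `B` is a hypothetical name for that matrix, assumed to be already computed so that the curvature in a unit direction h is the quadratic form of h with `B`, and the function name is also hypothetical.

```python
import numpy as np

def local_influence_measures(B):
    """Spectral decomposition of a symmetric curvature matrix B.

    Returns the eigenvalues in decreasing order, the matching orthonormal
    eigenvectors (as columns), the direction of largest curvature, and the
    curvature in each coordinate direction e_j.
    """
    B = np.asarray(B, dtype=float)
    B = 0.5 * (B + B.T)                       # guard against rounding asymmetry
    eigvals, eigvecs = np.linalg.eigh(B)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    v_max = eigvecs[:, 0]                     # most influential perturbation
    c_e = np.diag(B).copy()                   # e_j^T B e_j for each case j
    return eigvals, eigvecs, v_max, c_e
```

Large absolute components of `v_max` and large entries of `c_e` point to the same kind of observation, but the first is a joint direction while the second is a case-by-case measure, so it is common to plot both against the case index.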

Following the discussion of Zhu et al. [6], for the general transformation regression model with random right censorship, we can deduce that

(6)

where,

,

.

5. Numerical Studies

In this section, we simulate data with sample sizes from the following transformation model

where, , , , , where denotes the Bernoulli distribution and denotes the uniform distribution. For the simulation studies, we consider two choices of: 1) the standard exponential survival function; 2). Note that when takes the standard exponential survival function and, Model (1) corresponds to the proportional hazards Cox regression model and the proportional odds regression model, respectively. For both models, we generate censoring times from. By properly choosing values of, we consider three censoring proportions for all the cases (Qian Jun, et al. [23]). The survival data, simulated with SAS, are given in Table 1.
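A data-generating step of this kind can be sketched as follows. This is an illustrative Python sketch, not the SAS program used for the study, and every numerical setting in it is an assumption: the coefficients `beta`, the Bernoulli success probability 0.5, the uniform covariate range, the log-logistic baseline for the proportional odds case, and the uniform censoring bound `c_max` are all hypothetical stand-ins for values that are not recoverable here.

```python
import numpy as np

def simulate_right_censored(n, beta=(1.0, 1.0), c_max=2.0, model="ph", seed=0):
    """Simulate right censored data (x, delta, z) by inverting the survival function.

    model="ph": S(t|z) = exp(-t * exp(lin))        (standard exponential baseline)
    model="po": S(t|z) = 1 / (1 + t * exp(lin))    (assumed log-logistic baseline)
    """
    rng = np.random.default_rng(seed)
    z1 = rng.binomial(1, 0.5, size=n)          # assumed Bernoulli(0.5) covariate
    z2 = rng.uniform(0.0, 1.0, size=n)         # assumed Uniform(0, 1) covariate
    lin = beta[0] * z1 + beta[1] * z2
    s = 1.0 - rng.uniform(size=n)              # survival level in (0, 1]
    if model == "ph":
        t = -np.log(s) / np.exp(lin)
    elif model == "po":
        t = (1.0 / s - 1.0) * np.exp(-lin)
    else:
        raise ValueError("model must be 'ph' or 'po'")
    c = rng.uniform(0.0, c_max, size=n)        # assumed uniform censoring times
    x = np.minimum(t, c)                       # observed time
    delta = (t <= c).astype(int)               # 1 = failure observed, 0 = censored
    return x, delta, np.column_stack([z1, z2])
```

Raising `c_max` lowers the censoring proportion, so a target proportion can be reached by trying a few values and checking `1 - delta.mean()`.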

To check the validity of our proposed methodology, we alter the response values of the 3rd, 20th, 54th, 80th and 99th observations.

For every case, it is easy to obtain. For the parameters and, using the samples, we evaluated their maximum empirical likelihood estimators for the two models.

Consequently, it is easy to calculate the value of and. The result of is shown in the following figures.

From all figures, we can see that in most cases the values of are reasonably close to one fixed value. Following the definition and properties of, we can diagnose the strong influence points as those whose values deviate seriously from the average. From Figures 1-3, we can see from the value of that the 3rd, 20th, 54th and 80th observations are strong influence points. From Figures 4-6, we can see from the value of that the 3rd, 20th, 54th, 80th and 99th observations are strong influence points. These results illustrate the proposed approach.
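Reading influential points off a plot can be complemented by a simple numerical rule. The sketch below flags cases whose diagnostic value lies far from the bulk of the values. It is only one possible rule and an assumption of this illustration: the robust score based on the median and the median absolute deviation, the threshold `k = 3`, and the function name are not taken from the study, which relies on visual inspection of the figures.

```python
import numpy as np

def flag_influential(values, k=3.0):
    """Return 0-based indices of diagnostic values far from the bulk.

    A value is flagged when its distance from the median exceeds k times
    the scaled median absolute deviation (MAD).
    """
    values = np.asarray(values, dtype=float)
    center = np.median(values)
    mad = np.median(np.abs(values - center))
    if mad == 0.0:
        # No spread in the bulk of the values: the rule is not applicable.
        return np.array([], dtype=int)
    score = np.abs(values - center) / (1.4826 * mad)
    return np.flatnonzero(score > k)
```

Because the indices are 0-based, index 2 corresponds to the 3rd observation, index 19 to the 20th, and so on.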

6. Discussion

In this paper, we considered statistical diagnostics for general transformation models with right censored data based on empirical likelihood. We also studied in detail the method of simulating survival data under three different censoring proportions. Through simulation studies, we illustrated that our proposed method works fairly well.

Zhensheng Huang [24] analyzed empirical likelihood for the varying-coefficient single-index model with right censored data. In addition, Huang and Zhang [25] studied profile empirical likelihood inferences for the single-index-coefficient regression model. All of these will be topics for our further research.

Table 1. Survival data (Note: a star in the top right corner marks censored data).

Figure 1. The influence of Model (1).

Figure 2. The influence of Model (1).

Figure 3. The influence of Model (1).

Figure 4. The influence of Model (2).

Figure 5. The influence of Model (2).

Figure 6. The influence of Model (2).

References

  1. Cook, R.D. and Weisberg, S. (1982) Residuals and Influence in Regression. Chapman and Hall, New York.
  2. Wei, B.C., Lu, G.B. and Shi, J.Q. (1990) Statistical Diagnostics. Publishing House of Southeast University, Nanjing.
  3. Weissfeld, L.A. (1990) Influence Diagnostics for the Proportional Hazards Model. Statistics & Probability Letters, 10, 411-417. http://dx.doi.org/10.1016/0167-7152(90)90022-Y
  4. Thomas, D.R. and Grunkemeier, G.L. (1975) Confidence Interval Estimation of Survival Probabilities for Censored Data. Journal of the American Statistical Association, 70, 865-871. http://dx.doi.org/10.1080/01621459.1975.10480315
  5. Owen, A. (2001) Empirical Likelihood. Chapman and Hall, New York. http://dx.doi.org/10.1201/9781420036152
  6. Zhu, H.T., Ibrahim, J.G. and Tang, N.S., et al. (2008) Diagnostic Measures for Empirical Likelihood of General Estimating Equations. Biometrika, 95, 489-507. http://dx.doi.org/10.1093/biomet/asm094
  7. Xue, L.G. and Zhu, L.X. (2010) Empirical Likelihood in Nonparametric and Semiparametric Models. Science Press, Beijing.
  8. Qin, G. and Jing, B. (2001) Empirical Likelihood for Cox Regression Model under Random Censorship. Communications in Statistics - Simulation and Computation, 30, 79-90.
  9. He, B. (2006) Application of the Empirical Likelihood Method in Proportional Hazards Model. Ph.D. Thesis, University of Central Florida, Orlando.
  10. Gu, M., Sun, L. and Zuo, G. (2005) A Baseline-Free Procedure for Transformation Models under Interval Censorship. Lifetime Data Analysis, 11, 437-488. http://dx.doi.org/10.1007/s10985-005-5235-x
  11. Zhou, M. (2005) Empirical Likelihood Analysis of the Rank Estimator for the Censored Accelerated Failure Time Model. Biometrika, 92, 492-498. http://dx.doi.org/10.1093/biomet/92.2.492
  12. Zheng, M. and Yu, W. (2011) Empirical Likelihood Method for the Multivariate Accelerated Failure Time Models. Journal of Statistical Planning and Inference, 141, 972-983. http://dx.doi.org/10.1016/j.jspi.2010.09.001
  13. Zhou, M., Kim, M. and Bathke, A. (2012) Empirical Likelihood Analysis for the Heteroscedastic Accelerated Failure Time Model. Statistica Sinica, 22, 295-316.
  14. Li, G., Li, R. and Zhou, M. (2005) Empirical Likelihood in Survival Analysis. In: Fan, J. and Li, G., Eds., Contemporary Multivariate Analysis and Design of Experiments, World Scientific, Singapore City, 337-350. http://dx.doi.org/10.1142/9789812567765_0020
  15. Lu, W. and Liang, Y. (2006) Empirical Likelihood Inference for Linear Transformation Models. Journal of Multivariate Analysis, 97, 1586-1599. http://dx.doi.org/10.1016/j.jmva.2005.09.007
  16. Li, J.B., Huang, Z.S. and Lian, H. (2013) Empirical Likelihood Inference for General Transformation Models with Right Censored Data. Statistics and Computing, 8, 1-10. http://dx.doi.org/10.1007/s11222-013-9415-3
  17. Clayton, D. and Cuzick, J. (1985) Multivariate Generalizations of the Proportional Hazards Model. Journal of the Royal Statistical Society, Series A, 148, 82-117. http://dx.doi.org/10.2307/2981943
  18. Dabrowska, D. and Doksum, K. (1988) Partial Likelihood in Transformation Models with Censored Data. Scandinavian Journal of Statistics, 15, 1-23.
  19. Bickel, P. (1998) Efficient and Adaptive Estimation for Semiparametric Models. Springer, Berlin.
  20. Cheng, S., Wei, L. and Ying, Z. (1995) Analysis of Transformation Models with Censored Data. Biometrika, 82, 835-845. http://dx.doi.org/10.1093/biomet/82.4.835
  21. Fine, J., Ying, Z. and Wei, L. (1998) On the Linear Transformation Model for Censored Data. Biometrika, 85, 980-986. http://dx.doi.org/10.1093/biomet/85.4.980
  22. Qin, J. and Lawless, J. (1994) Empirical Likelihood and General Estimating Equations. Annals of Statistics, 22, 300-325. http://dx.doi.org/10.1214/aos/1176325370
  23. Jun, Q., et al. (2013) Simulate Survival Data under Different Censored Proportions. Journal of Mathematical Medicine, 26, 644-646.
  24. Huang, Z.S. (2012) Empirical Likelihood for Varying-Coefficient Single-Index Model with Right-Censored Data. Metrika, 75, 55-71. http://dx.doi.org/10.1007/s00184-010-0314-8
  25. Huang, Z.S. and Zhang, R.Q. (2013) Profile Empirical-Likelihood Inferences for the Single-Index-Coefficient Regression Model. Statistics and Computing, 23, 455-465. http://dx.doi.org/10.1007/s11222-012-9322-z