Journal of Modern Physics
Vol.08 No.03(2017), Article ID:74452,16 pages
10.4236/jmp.2017.83024
An Application of Generalized Entropy Optimization Methods in Survival Data Analysis
Aladdin Shamilov1, Cigdem Kalathilparmbil1*, Sevda Ozdemir2
1Faculty of Science, Department of Statistics, Anadolu University, Eskişehir, Turkey
2Ozalp Vocational School, Accountancy and Tax Department, Yuzuncu Yil University, Van, Turkey

Copyright © 2017 by authors and Scientific Research Publishing Inc.
This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).
http://creativecommons.org/licenses/by/4.0/



Received: August 5, 2016; Accepted: February 25, 2017; Published: February 28, 2017
ABSTRACT
In this paper, survival data analysis is carried out by applying Generalized Entropy Optimization Methods (GEOM). It is known that all statistical distributions can be obtained as MaxEnt distributions by choosing corresponding moment functions. However, Generalized Entropy Optimization Distributions (GEOD) in the form of MinMaxEnt and MaxMaxEnt distributions, which are obtained on the basis of the Shannon measure and supplementary optimization with respect to characterizing moment functions, represent the given statistical data more exactly. For this reason, survival data analysis by GEOD acquires a new significance. In this research, the data of the life table for engine failure data (1980) is examined. The performance of GEOD is established by the Chi-Square criterion, the Root Mean Square Error (RMSE) criterion, the Shannon entropy measure and the Kullback-Leibler measure. Comparison of GEOD with each other in these different senses shows that, among these distributions,
is better in the sense of both the Shannon measure and the Kullback-Leibler measure. It is shown that,
is more suitable for statistical data among
. Moreover,
is better for statistical data than
in the sense of the RMSE criterion. From the obtained distribution, estimators of the Probability Density Function, Cumulative Distribution Function, Survival Function and Hazard Rate are evaluated and graphically illustrated. The results are obtained using the statistical software MATLAB.
Keywords:
Survival Function, Censored Observation, Generalized Entropy Optimization Methods,
Distributions

1. Introduction
Entropy Optimization Methods (EOM) have important applications, especially in statistics, economics and engineering. There are several examples in the literature in which known statistical distributions do not conform to the statistical data, whereas entropy optimization distributions conform well. Generalized Entropy Optimization Methods (GEOM) suggest distributions in the form of MinMaxEnt, which is the closest to the statistical data, and MaxMaxEnt, which is the furthest from the data, in the sense of information theory [1] [2]. For this reason, GEOM can be applied more successfully in survival data analysis.
Different aspects and methods of survival data analysis are considered in [3]-[8].
In particular, paper [6] investigates several problems of hazard rate function estimation based on the maximum entropy principle. The potential applications include developing several classes of maximum entropy distributions which can be used to model different data-generating distributions that satisfy certain information constraints on the hazard rate function.
To present the results of our investigation, we first give some auxiliary concepts and facts.
2. Survival Analysis
Survival time can be defined broadly as the time to the occurrence of a given event. This event can be the development of a disease, response to a treatment, relapse or death [9] .
Censoring: Techniques for reducing experimental time are known as censoring. In survival analysis, the observations are lifetimes, which can be indefinitely long, so the experiment is often designed in such a way that the time required for collecting the data is reduced to manageable levels.
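Let $T$ be a continuous, non-negative random variable representing the lifetime of a unit. This is the time for which an individual (or unit) carries out its appointed task satisfactorily before passing into a "failed" or "dead" state [10].
The probabilistic properties of the random variable are studied through its cumulative distribution function $F(t)$ or the other equivalent functions defined below [9]:
Cumulative Distribution Function:
$$F(t) = P(T \le t)$$
Survival Function: This function, denoted by $S(t)$, is defined as the probability that an individual survives longer than $t$:
$$S(t) = P(T > t) = 1 - F(t)$$
Probability Density Function: Like any other continuous random variable, the survival time $T$ has a probability density function $f(t)$, defined as the limit of the probability that an individual fails in the short interval from $t$ to $t + \Delta t$, per unit width $\Delta t$:
$$f(t) = \lim_{\Delta t \to 0} \frac{P(t < T \le t + \Delta t)}{\Delta t} = \frac{dF(t)}{dt}$$
Hazard Rate: This function is defined as the probability of failure during a very small time interval, assuming that the individual has survived to the beginning of the interval, or as the limit of the probability that an individual fails in a very short interval, from $t$ to $t + \Delta t$, per unit time, given that the individual has survived to time $t$:
$$h(t) = \lim_{\Delta t \to 0} \frac{P(t < T \le t + \Delta t \mid T > t)}{\Delta t} = \frac{f(t)}{S(t)}$$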
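The four functions of this section can be illustrated on grouped data. The following is a minimal sketch, assuming equal-width intervals and a list of interval failure probabilities; the function name `survival_quantities` and the sample numbers are hypothetical and are not taken from the paper.

```python
from itertools import accumulate

def survival_quantities(probs, width=1.0):
    """Discrete estimates of F, S, f and h on equal-width intervals.

    probs[i] is the probability of failure in interval i (assumed to
    sum to at most 1); width is the common interval width.
    """
    cdf_end = list(accumulate(probs))                 # F at the end of each interval
    surv_end = [1.0 - c for c in cdf_end]             # S = 1 - F at the end of each interval
    surv_start = [1.0] + surv_end[:-1]                # S at the start of each interval
    density = [p / width for p in probs]              # f ~ P(fail in interval) / width
    # hazard ~ P(fail in interval | alive at its start) / width = f / S_start
    hazard = [f / s if s > 0 else float("inf")
              for f, s in zip(density, surv_start)]
    return cdf_end, surv_end, density, hazard

# Made-up interval probabilities, for illustration only.
F, S, f, h = survival_quantities([0.1, 0.2, 0.3, 0.4], width=2.0)
for i, row in enumerate(zip(F, S, f, h)):
    print(i, [round(v, 4) for v in row])
```

The hazard is evaluated against the survival probability at the start of each interval, which mirrors the conditional definition above; other life-table conventions (for example, mid-interval survival) would give slightly different values.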

3. Generalized Entropy Optimization Methods (GEOM)
The Entropy Optimization Problem (EOP) [11] and the Generalized Entropy Optimization Problem (GEOP) [10] can be formulated in the following form.
EOP: Let 






GEOP: Let 

















The method of solving GEOP is called GEOM.
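The idea of GEOM described in the Introduction, choosing from a given set of moment functions the one whose MaxEnt distribution has the least entropy (MinMaxEnt) and the one with the greatest entropy (MaxMaxEnt), can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' MATLAB procedure: each candidate uses a single moment function, the MaxEnt distribution is taken in the usual exponential form, and the names `maxent_one_moment`, `select_geod`, the candidate set `K` and the data are all hypothetical.

```python
import math

def maxent_one_moment(g_vals, mu, lo=-50.0, hi=50.0, iters=200):
    """MaxEnt distribution p_i proportional to exp(-lam * g_i) whose
    moment sum(p_i * g_i) equals mu, found by bisection on lam.
    Assumes min(g_vals) < mu < max(g_vals) and |lam| <= 50."""
    def dist(lam):
        exps = [-lam * g for g in g_vals]
        top = max(exps)                       # shift for numerical stability
        w = [math.exp(e - top) for e in exps]
        z = sum(w)
        return [wi / z for wi in w]

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        moment = sum(p * g for p, g in zip(dist(mid), g_vals))
        if moment > mu:                       # the moment decreases as lam grows
            lo = mid
        else:
            hi = mid
    return dist(0.5 * (lo + hi))

def shannon_entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def select_geod(x, p_obs, candidates):
    """Return the (name, entropy, distribution) triples with the
    smallest and the largest MaxEnt entropy among the candidates."""
    results = []
    for name, g in candidates.items():
        g_vals = [g(xi) for xi in x]
        mu = sum(p * gv for p, gv in zip(p_obs, g_vals))  # observed moment
        p = maxent_one_moment(g_vals, mu)
        results.append((name, shannon_entropy(p), p))
    min_max_ent = min(results, key=lambda r: r[1])
    max_max_ent = max(results, key=lambda r: r[1])
    return min_max_ent, max_max_ent

# Made-up data and candidate moment functions, for illustration only.
x = [1, 2, 3, 4, 5]
p_obs = [0.40, 0.30, 0.15, 0.10, 0.05]
K = {"x": lambda t: t, "x^2": lambda t: t * t, "ln x": math.log}
closest, furthest = select_geod(x, p_obs, K)
print("MinMaxEnt candidate:", closest[0], round(closest[1], 4))
print("MaxMaxEnt candidate:", furthest[0], round(furthest[1], 4))
```

Because the observed distribution itself satisfies every moment constraint, each MaxEnt entropy is at least the entropy of the observed data; the candidate with the least MaxEnt entropy is therefore the one that comes closest to it in this sense.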
3.1. 
The problem of maximizing entropy function

subject to constraints

where
has solution

where 

If (3) is substituted into (1), the maximum entropy value is obtained:
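For orientation, the standard discrete form of this problem, as treated in [11], can be written as below. The notation ($p_i$, $g_j$, $\mu_j$, $\lambda_j$, $n$, $m$) is an assumption chosen for this sketch, with equation numbers matched to the references to (1) and (3) in the text; it is not necessarily the authors' exact notation.

```latex
% Shannon entropy to be maximized
H(p) = -\sum_{i=1}^{n} p_i \ln p_i                                   \tag{1}

% Moment constraints; j = 0 is the normalization condition
\sum_{i=1}^{n} p_i \, g_j(x_i) = \mu_j, \qquad j = 0, 1, \dots, m,
\qquad g_0(x) \equiv 1, \quad \mu_0 = 1                               \tag{2}

% Solution obtained with Lagrange multipliers \lambda_0, \dots, \lambda_m
p_i = \exp\Bigl( -\sum_{j=0}^{m} \lambda_j \, g_j(x_i) \Bigr),
\qquad i = 1, \dots, n                                                \tag{3}

% Substituting (3) into (1) and using (2)
H_{\max} = \sum_{i=1}^{n} p_i \sum_{j=0}^{m} \lambda_j \, g_j(x_i)
         = \sum_{j=0}^{m} \lambda_j \, \mu_j                          \tag{4}
```

The last line follows because $\ln p_i = -\sum_j \lambda_j g_j(x_i)$ under (3), so the entropy becomes a sum of the multipliers weighted by the prescribed moments.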

If distribution 








3.2. 

Let 

Consequently,
Distributions 






Let 







Solving the 

































4. Application of 

4.1. 

In the present research, the data of the life table for engine failure data (1980) given in Table 1 is considered [10] .
In our investigation, the experiment is planned for 200 patients surviving at the beginning of the interval; because of censoring, however, 97 of the planned individuals drop out of the experiment. This situation is taken into account in Table 2.
Table 1. The data of the life table for engine failure data (1980).
Table 2. Observed and corrected probabilities.
It should be noted that the presence of censoring in the survival times leads to a situation where the sum of the observation probabilities is less than 1 for the survival data. For this reason, in solving many problems, it is required to supplement the sum of the observation probabilities up to 1. Since the sum of observed probabilities 


As noted above, 




Consequently,

gives the least value to 
gives the greatest value to
The 









To assess the performance of the mentioned distributions, we use several criteria: Root Mean Square Error (RMSE), Chi-Square, and the entropy values of the distributions. The results are presented in Table 9 and Table 10.
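The three criteria named above can be sketched in a few lines. This is only an illustration under assumptions: the Chi-Square statistic is taken in the usual Pearson form on expected counts (`n_total` times the predicted probability), natural logarithms are used for the entropy, and the function names and numbers are hypothetical rather than taken from the paper's tables.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two probability vectors."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def chi_square(observed, predicted, n_total):
    """Pearson statistic: sum of (O - E)^2 / E with O = n_total * observed
    and E = n_total * predicted (assumed form)."""
    return sum(n_total * (o - p) ** 2 / p
               for o, p in zip(observed, predicted) if p > 0)

def shannon_entropy(probs):
    """Shannon entropy -sum(p * ln p), skipping zero probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Made-up observed and fitted probabilities, for illustration only.
observed = [0.5, 0.5]
fitted = [0.25, 0.75]
print(rmse(observed, fitted))
print(chi_square(observed, fitted, n_total=100))
print(shannon_entropy(fitted))
```

Smaller RMSE and Chi-Square values indicate a fitted distribution that is closer to the observed probabilities, while the entropy value is the quantity compared across the candidate distributions.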
All 
In the sense of RMSE criteria each 



Table 3. The predicted probabilities for the 
Table 4. The predicted probabilities for the 
Table 5. The predicted probabilities for the 
Table 6. The predicted probabilities for the 
Table 7. Distributions of
Table 8. Distributions of















Although the distribution with the largest number of moment functions tends to fit better, it should be noted that in some cases a set of moment functions with fewer elements is more informative than a different set of moment functions with more elements.
Table 9. The obtained results for

Table 10. The obtained results for


Figure 1. Graphic of 


Figure 2. Graphic of 


Figure 3. Graphic of 


Figure 4. Graph of 

4.2. Availability of GEOD to Survival Data in the Sense of Shannon Measure
In order to establish availability of GEOD to survival data in the sense of Shannon measure it is required to consider entropy values of GEOD.
From Table 3 it is seen that the 



From Table 4 it is seen that the 



From Table 5 it is seen that the 



From Table 6 it is seen that the 

and
Comparison of GEOD with each other in the sense of the Shannon measure shows that, among these distributions, 
The results of our investigation according to using known characterizing moment vector functions from 
Corollary 1. If by 




is fulfilled, when



Moreover for any 
takes place.
4.3. Availability of GEOD to Survival Data in the Sense of Kullback-Leibler Measure
Now, we calculate the distance between observed distribution


It is known that the Kullback-Leibler distance between distributions



Starting from this formula, the Kullback-Leibler measures for the distance between the observed distribution 

From Table 11 and Table 12 it follows that, among the GEOD, 
The results of our investigation according to using known characterizing moment vector functions from 
Corollary 2. If 





is fulfilled, when



Moreover for any 
takes place.
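The Kullback-Leibler comparison used in this subsection can be sketched as follows, assuming the standard discrete form $D(p \| q) = \sum_i p_i \ln(p_i / q_i)$ with the observed distribution as $p$ and a fitted distribution as $q$. The direction of the measure, the function name `kl_divergence` and the numbers are assumptions of this sketch.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler measure D(p || q) = sum_i p_i * ln(p_i / q_i).

    Terms with p_i = 0 contribute nothing; if q_i = 0 while p_i > 0
    the measure is infinite.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

# Made-up observed distribution and two fitted candidates.
observed = [0.5, 0.5]
candidates = {"A": [0.25, 0.75], "B": [0.45, 0.55]}
distances = {name: kl_divergence(observed, q) for name, q in candidates.items()}
best = min(distances, key=distances.get)
print(distances, "closest:", best)
```

The candidate with the smallest value is the one closest to the observed distribution in this sense; the measure is zero only when the two distributions coincide.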
Table 11. Kullback-Leibler measure of 
Table 12. Kullback-Leibler measure of 
4.4. Survival Expression of Distributions
In this section survival data analysis is conducted by










On the basis of the results given in Table 13 and Table 14, graphs of


Table 13. Survival analysis by
Table 14. Survival analysis by


Figure 5. Survival expression of distribution
5. Conclusion
In this study, survival data analysis is carried out by applying Generalized Entropy Optimization Methods (GEOM). Generalized Entropy Optimization Distributions (GEOD) in the form of




Figure 6. Survival expression of distribution
is shown that, 








Cite this paper
Shamilov, A., Kalathilparmbil, C. and Ozdemir, S. (2017) An Application of Generalized Entropy Optimization Methods in Survival Data Analysis. Journal of Modern Physics, 8, 349-364. https://doi.org/10.4236/jmp.2017.83024
References
1. Shamilov, A. (2006) A Development of Entropy Optimization Methods. WSEAS Transactions on Mathematics, 5, 568-575.
2. Shamilov, A. (2007) Generalized Entropy Optimization Problems and the Existence of Their Solutions. Physica A: Statistical Mechanics and Its Applications, 382, 465-472. https://doi.org/10.1016/j.physa.2007.04.014
3. Kaminski, D. and Geisler, C. (2012) Survival Analysis of Faculty Retention in Science and Engineering by Gender. Science, 335, 864-866. https://doi.org/10.1126/science.1214844
4. Reingold, E.M., Reichle, E.D., Glaholt, M.G. and Sheridan, H. (2012) Direct Lexical Control of Eye Movements in Reading: Evidence from a Survival Analysis of Fixation Durations. Cognitive Psychology, 65, 177-206.
5. Wang, H. and Dai, H.S. (2012) Accelerated Failure Time Models for Censored Survival Data under Referral Bias. Biostatistics, 14, 313-326.
6. Ebrahimi, N. (2000) The Maximum Entropy Method for Lifetime Distributions. Sankhya: The Indian Journal of Statistics, Series A, 62, 236-243.
7. Guyot, P., Ades, A., Ouwens, M.J. and Welton, N.J. (2012) Enhanced Secondary Analysis of Survival Data: Reconstructing the Data from Published Kaplan-Meier Survival Curves. BMC Medical Research Methodology, 12, 9. https://doi.org/10.1186/1471-2288-12-9
8. Joly, P., Gerds, T.A., Qvist, V., Commenges, D. and Keiding, N. (2012) Estimating Survival of Dental Fillings on the Basis of Interval-Censored Data and Multi-State Models. Statistics in Medicine, 31, 11-12.
9. Lee, E.T. and Wang, J.W. (2003) Statistical Methods for Survival Data Analysis. Wiley-Interscience, Oklahoma.
10. Deshpande, J.V. and Purohit, S.G. (2005) Life Time Data: Statistical Models and Methods, Series on Quality. Vol. 11, Reliability and Engineering Statistics, India.
11. Kapur, J.N. and Kesavan, H.K. (1992) Entropy Optimization Principles with Applications.
12. Shamilov, A. (2009) Entropy, Information and Entropy Optimization. T.C. Anadolu University Publication, Eskisehir.
13. Shamilov, A. (2010) Generalized Entropy Optimization Problems with Finite Moment Functions Sets. Journal of Statistics and Management Systems, 13, 595-603. https://doi.org/10.1080/09720510.2010.10701489
14. Shamilov, A., Giriftinoglu, C., Usta, I. and Mert Kantar, Y. (2008) A New Concept of Relative Suitability of Moment Function Sets. Applied Mathematics and Computation, 206, 521-529. https://doi.org/10.1016/j.amc.2008.05.063
























