Applied Mathematics, 2011, 2, 93-105
doi:10.4236/am.2011.21011 Published Online January 2011 (http://www.SciRP.org/journal/am)
Copyright © 2011 SciRes. AM

Average Life Prediction Based on Incomplete Data*

Tang Tang1, Lingzhi Wang2, Faen Wu1, Lichun Wang1
1Department of Mathematics, Beijing Jiaotong University, Beijing, China
2School of Mechanical, Beijing Jiaotong University, Beijing, China
E-mail: wlc@amss.ac.cn
Received May 4, 2010; revised November 18, 2010; accepted November 22, 2010

Abstract

The two-parameter exponential distribution can often be used to describe the lifetime of products, for example electronic components and engines. This paper considers a prediction problem arising in the life test of key parts in high speed trains. Employing the Bayes method, a joint prior is used to describe the variability of the parameters, but the form of the prior is not specified and only several moment conditions are assumed. Under the condition that the observed samples are randomly right censored, we define a statistic to predict the average life of a set of future (second-round) samples, first when the censoring distribution is known and second when it is unknown. For several different priors and life data sets, we demonstrate the coverage frequencies of the proposed prediction intervals as the observed sample size and the censoring proportion change. The numerical results show that the prediction intervals are efficient and applicable.

Keywords: Prediction Interval, Incomplete Data, Bayes Method, Two-Parameter Exponential Distribution

1. Introduction

Prediction problems arise frequently and are useful in many fields of application. The general prediction problem can be regarded as that of using the results of previous data to infer the results of future data from the same population.
The lifetime of the second-round sample is an important index in life testing experiments, and in many situations people want to forecast the lifetimes of these samples as well as of the system composed of them (see [1,2], among others). For more details on the history of statistical prediction, analysis and application, see [3,4].

As we know, many quality characteristics are not normally distributed, especially the lifetimes of products such as electronic components and engines. Assume that the lifetime of a component follows the two-parameter exponential distribution, whose probability density function (pdf) is given by

$$f(x;\theta,\mu)=\frac{1}{\theta}\exp\left(-\frac{x-\mu}{\theta}\right)I(x>\mu), \tag{1.1}$$

where $\theta>0$ and $\mu\ge 0$ are called the scale parameter and the location parameter, respectively, and $I(A)$ denotes the indicator function of the set $A$. The reader is referred to [5,6] for some practical applications of the two-parameter exponential distribution. Recent relevant studies on the two-parameter exponential distribution can be found in [7-9], etc.

In this paper, we adopt the following testing scheme. For $n$ groups of components, which come from $n$ different manufacturing units possessing the same technology and regulations, we sample $m$ components from each group and put them to use at time $t=0$. To economize, the experiment on a group is terminated as soon as one of its $m$ components fails, where $m$ is a predetermined integer. Denote the lifetime of the failed component by $X_i$ $(1\le i\le n)$. Obviously, $X_i=\min(X_{i1},X_{i2},\ldots,X_{im})$, where $X_{ij}$ $(1\le j\le m)$ is the life of the $j$-th component of the $i$-th group. Hence, we obtain $n$ lifetime data $X_1,X_2,\ldots,X_n$. If $X_i\ge a$, where $a>0$ is a known constant, then we again sample one component from the $i$-th group and denote its unknown lifetime by $Y_i$. In this paper, our interest is to predict the average life of the second-round sample, i.e., $\left[\sum_{i=1}^{n}I(X_i\ge a)\right]^{-1}\sum_{i=1}^{n}I(X_i\ge a)Y_i$.
For instance, $k\left[\sum_{i=1}^{n}I(X_i\ge a)\right]^{-1}\sum_{i=1}^{n}I(X_i\ge a)Y_i$ approximately describes the average lifetime of a system of $k$ components, drawn from the second-round samples and connected in active-parallel, which fails only when all $k$ components fail.

*Sponsored by the Scientific Foundation of BJTU (2007XM046)

Normally, there are two different views on prediction problems: the frequentist approach and the Bayes approach. The Bayesian viewpoint has received considerable attention in data analysis over the past several decades and has often been proposed as a valid alternative to traditional statistical perspectives (see [10-12], etc.). A main difference between the Bayes approach and the frequentist approach is that in Bayesian analysis we use not only the sample information but also the prior information on the parameters.

To adopt the Bayes approach, we regard the parameters $\theta$ and $\mu$ as the realized values of a random pair with joint prior distribution $G(\theta,\mu)$. Let $(\theta_1,\mu_1),(\theta_2,\mu_2),\ldots,(\theta_n,\mu_n)$ be independent and identically distributed (i.i.d.) with the prior distribution $G(\theta,\mu)$, and conditionally on $(\theta_i,\mu_i)$ assume that $X_{ij}$ has the pdf (1.1), which will be denoted by $f(x|\theta,\mu)$ in the following sections. Set

$$S=\frac{\sum_{i=1}^{n}I(X_i\ge a)Y_i}{\sum_{i=1}^{n}I(X_i\ge a)}. \tag{1.2}$$

Our problem is how to construct a function $g(X_1,X_2,\ldots,X_n)$ to predict $S$.

As we know, many statistical experiments result in incomplete samples, even under well-controlled conditions. This is because individuals may experience other competing events that cause them to be removed. In life testing experiments, the experimenter may not always be in a position to observe the lifetimes of all components put on test, due to time limitations or other restrictions on data collection (such as money or material resources; see [9,13] and others). Hence, censored samples may arise in practice.
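As a rough illustration of the testing scheme and of the target quantity $S$ in (1.2), the following Python sketch simulates $n$ groups without censoring. The uniform prior ranges, the threshold `a`, the sample sizes and all function names are illustrative assumptions and are not prescribed by the scheme itself.

```python
import random

def draw_lifetime(theta, mu, rng):
    """One draw from the two-parameter exponential: location mu plus an Exp(mean theta) excess."""
    return mu + rng.expovariate(1.0 / theta)

def second_round_average(n, m, a, draw_prior, rng):
    """Simulate n groups and return S as in (1.2), or None if no group reaches the threshold a."""
    total, count = 0.0, 0
    for _ in range(n):
        theta, mu = draw_prior(rng)          # group-specific parameters from the prior
        # First round: the test stops at the first failure among m components.
        x_i = min(draw_lifetime(theta, mu, rng) for _ in range(m))
        if x_i >= a:
            # Second round: one more component from the same group.
            total += draw_lifetime(theta, mu, rng)
            count += 1
    return total / count if count else None

def uniform_prior(rng):
    """Hypothetical prior: independent uniform scale and location (illustrative ranges)."""
    return rng.uniform(800.0, 1400.0), rng.uniform(500.0, 1300.0)

rng = random.Random(0)
s_value = second_round_average(n=200, m=3, a=1000.0, draw_prior=uniform_prior, rng=rng)
```

In a real experiment the $Y_i$ are not observed, which is why a predictor based on the first-round data alone is needed.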
In this paper we assume furthermore that $X_{ij}$ $(1\le i\le n,\ 1\le j\le m)$ are censored from the right by nonnegative independent random variables $V_i$ $(1\le i\le n)$ with distribution function $W$. It is assumed that $X_{ij}$ $(1\le i\le n,\ 1\le j\le m)$ are independent of $V_i$ $(1\le i\le n)$. In the random censorship model, the true lifetimes $X_i=\min(X_{i1},X_{i2},\ldots,X_{im})$ $(1\le i\le n)$ are not always observable. Instead, we observe only $Z_i=\min(X_i,V_i)$ and $\delta_i=I(X_i\le V_i)$.

The paper is organized as follows. In Section 2, based on $(Z_i,\delta_i)$ $(1\le i\le n)$, we define a predictive statistic for $S$ and simulate its prediction results under the condition that the censoring distribution $W$ is known. In Section 3, when the censoring distribution $W$ is unknown, we obtain a similar result for a corresponding predictive statistic of $S$ and demonstrate its prediction performance. Some conclusions and remarks are presented in Section 4, and Section 5 contains the proofs of the main theorems.

2. Predictive Statistic for S with Known W

Since $X_{ij}$ has the conditional pdf $f(x|\theta,\mu)$, we know that, given $(\theta_i,\mu_i)$, $X_i=\min(X_{i1},X_{i2},\ldots,X_{im})$ has the pdf

$$l(x|\theta,\mu)=\frac{m}{\theta}\exp\left(-\frac{m(x-\mu)}{\theta}\right)I(x>\mu). \tag{2.1}$$

Since $X_i$ and $Y_i$ (if $X_i\ge a$) come from the same group, $(X_i,Y_i)$ $(1\le i\le n)$ are i.i.d. with common marginal pdf

$$p(x,y)=\iint_{U}l(x|\theta,\mu)f(y|\theta,\mu)\,dG(\theta,\mu), \tag{2.2}$$

where $U$ denotes the support of the prior distribution $G(\theta,\mu)$. Rewrite

$$S=\frac{\sum_{i=1}^{n}I(X_i\ge a)Y_i}{\sum_{i=1}^{n}I(X_i\ge a)}=\frac{n^{-1}\sum_{i=1}^{n}I(X_i\ge a)Y_i}{n^{-1}\sum_{i=1}^{n}I(X_i\ge a)}\triangleq\frac{S_1}{S_2}. \tag{2.3}$$

By Fubini's theorem, we know that

$$\begin{aligned}E(S_1)&=E[I(X_1\ge a)Y_1]=\int_0^\infty\!\!\int_0^\infty I(x\ge a)\,y\,p(x,y)\,dx\,dy\\&=\iint_U\left[\int_0^\infty I(x\ge a)\,l(x|\theta,\mu)\,dx\right]\left[\int_0^\infty y\,f(y|\theta,\mu)\,dy\right]dG(\theta,\mu)\\&=\iint_U(\theta+\mu)\exp\left(-\frac{m(a-\mu)}{\theta}\right)dG(\theta,\mu)=E_{(\theta,\mu)}\left[(\theta+\mu)\exp\left(-\frac{m(a-\mu)}{\theta}\right)\right]\end{aligned} \tag{2.4}$$

and

$$E(S_2)=E\left[\frac1n\sum_{i=1}^{n}I(X_i\ge a)\right]=E_{(\theta,\mu)}E[I(X\ge a)\,|\,(\theta,\mu)]=E_{(\theta,\mu)}\left[\exp\left(-\frac{m(a-\mu)}{\theta}\right)\right], \tag{2.5}$$

where $E_{(\theta,\mu)}$ denotes the expectation with respect to $(\theta,\mu)$. Based on $(Z_i,\delta_i)$ $(1\le i\le n)$, we define

$$\tilde S_1=\frac1n\sum_{i=1}^{n}\frac{1}{n-1}\sum_{j\ne i}\frac{\delta_jZ_j}{1-W(Z_j)}\cdot\frac{\delta_iI(Z_i\ge a)}{1-W(Z_i)}+\frac{m-1}{n}\sum_{i=1}^{n}\frac{\delta_i(Z_i-a)I(Z_i\ge a)}{1-W(Z_i)} \tag{2.6}$$
and

$$\tilde S_2=\frac1n\sum_{i=1}^{n}\frac{\delta_iI(Z_i\ge a)}{1-W(Z_i)}. \tag{2.7}$$

Note that, conditionally on $(\theta_i,\mu_i)$, all $Z_i$ $(1\le i\le n)$ are i.i.d. with the distribution function $H(z)=1-[1-W(z)][1-L(z|\theta,\mu)]$, where $L(z|\theta,\mu)=\int_0^z l(x|\theta,\mu)\,dx$. We have Equation (2.8) and

$$\begin{aligned}E(\tilde S_2)&=E_{(\theta,\mu)}E\left[\frac{\delta_1I(Z_1\ge a)}{1-W(Z_1)}\,\Big|\,(\theta,\mu)\right]=E_{(\theta,\mu)}\iint_{x\le v}\frac{I(x\ge a)}{1-W(x)}\,dW(v)\,l(x|\theta,\mu)\,dx\\&=E_{(\theta,\mu)}\left[\exp\left(-\frac{m(a-\mu)}{\theta}\right)\right]=E(S_2).\end{aligned} \tag{2.9}$$

Hence, the statistics $\tilde S_i$ $(i=1,2)$ have the same expectations as $S_i$ $(i=1,2)$. Set

$$\tilde S=\frac{\tilde S_1}{\tilde S_2}. \tag{2.10}$$

Remark 1. Note that it is almost impossible that all $\delta_i$'s are equal to zero, so $\tilde S_1$ and $\tilde S_2$ are reasonable estimators for $S_1$ and $S_2$, respectively.

The main result in this section can be formulated in the following theorem.

Theorem 1. If the following conditions are satisfied:
1) $E\theta^2<\infty$, $E\mu^2<\infty$;
2) $E_{(\theta,\mu)}[\varphi(\theta,\mu)\psi(\theta,\mu)]<\infty$;
3) $E\left[\dfrac{X^2I(X\ge a)}{[1-W(X)]^2}\right]<\infty$;
then

$$\tilde S-S\xrightarrow{p}0,\quad n\to\infty,$$

where $\varphi(\theta,\mu)=E\left[\dfrac{X^2}{1-W(X)}\,\Big|\,(\theta,\mu)\right]$, $\psi(\theta,\mu)=E\left[\dfrac{I(X\ge a)}{1-W(X)}\,\Big|\,(\theta,\mu)\right]$, and $\xrightarrow{p}$ denotes convergence in probability.

Clearly, $\tilde S$ can be used as a predictive statistic for $S$ in this case. In particular, when there is no censorship ($Z_i=X_i$, $\delta_i=1$), $\tilde S_1$ and $\tilde S_2$ become, respectively,

$$S_{10}=\frac1n\sum_{i=1}^{n}\frac{1}{n-1}\sum_{j\ne i}X_jI(X_i\ge a)+\frac{m-1}{n}\sum_{i=1}^{n}(X_i-a)I(X_i\ge a) \tag{2.11}$$

and

$$S_{20}=\frac1n\sum_{i=1}^{n}I(X_i\ge a). \tag{2.12}$$

Consequently, we use $S_0=S_{10}/S_{20}$ as a predictive statistic for $S$.

Normally, a Gamma prior is chosen for the parameters $(\theta,\mu)$; however, it is easy to see that Theorem 1 does not depend on any specific prior distribution. That is, for any prior distribution satisfying the conditions of Theorem 1, the conclusion of Theorem 1 holds.
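The computation of the predictive statistic from censored data can be sketched in Python as follows. The sketch assumes that $\tilde S_1$ is the leave-one-out average $\frac{1}{n}\sum_i\big[\frac{1}{n-1}\sum_{j\ne i}\frac{\delta_jZ_j}{1-W(Z_j)}+(m-1)(Z_i-a)\big]\frac{\delta_iI(Z_i\ge a)}{1-W(Z_i)}$ and that $\tilde S_2$ is as in (2.7). The function name, the toy data and the exponential censoring law with rate `c` are hypothetical choices made only for illustration.

```python
import math

def predictive_statistic(z, delta, a, m, W):
    """Return (S1, S2, S1/S2) for censored data (z_i, delta_i) and a known censoring cdf W."""
    n = len(z)
    # Inverse-probability-of-censoring weights delta_i / (1 - W(Z_i)).
    w = [d / (1.0 - W(zi)) for zi, d in zip(z, delta)]
    zw = [zi * wi for zi, wi in zip(z, w)]
    total_zw = sum(zw)
    s1 = 0.0
    s2 = 0.0
    for i in range(n):
        if z[i] >= a:
            # Leave-one-out mean of the weighted lifetimes of the other groups.
            others = (total_zw - zw[i]) / (n - 1)
            s1 += (others + (m - 1) * (z[i] - a)) * w[i]
            s2 += w[i]
    s1 /= n
    s2 /= n
    return s1, s2, (s1 / s2 if s2 > 0 else None)

# Uncensored toy data: W = 0 and every delta_i = 1, so the result reduces to S_10 / S_20.
s1, s2, s = predictive_statistic([1.0, 2.0, 3.0], [1, 1, 1], a=2.0, m=3, W=lambda v: 0.0)

# Censored toy data with a hypothetical exponential censoring law W(v) = 1 - exp(-c v).
c = 0.1
s1_c, s2_c, s_c = predictive_statistic(
    [1.0, 2.0, 3.0], [1, 0, 1], a=2.0, m=3, W=lambda v: 1.0 - math.exp(-c * v)
)
```

Censored observations ($\delta_i=0$) receive weight zero, and uncensored ones are inflated by $1/[1-W(Z_i)]$ to compensate; this is the only place where the known censoring distribution enters.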
So in the simulation study, we let the prior distributions of the parameters be

$$\mu\sim\text{Uniform}(500,1300), \tag{2.13}$$

$$\theta\sim\text{Uniform}(800,1400), \tag{2.14}$$

and the censoring distribution be

$$W(v)=1-\exp(-cv),\quad v>0, \tag{2.15}$$

where $c=0.0001$, $0.0002$ and $0.00025$ can be used to describe the censorship proportion (CP) $P(X>V)$, which denotes the probability that $X$ is larger than $V$. In the censorship model, if the probability $P(X>V)$ gets larger, then more $V_i$'s are likely to be observed rather than $X_i$'s. Also, let $m=3$. Under the above assumptions, it is not difficult to check that conditions 1), 2) and 3) of Theorem 1 are satisfied. Note that $EX=E\mu+E\theta/m=900+1100/3=3800/3$, which shows that the mean time to failure (MTTF) of the minimum lifespan of the $m$ components is $3800/3$.

Equation (2.8), referred to above, reads

$$E(\tilde S_1)=[\ldots]=E_{(\theta,\mu)}\left[\left(\mu+\frac{\theta}{m}\right)\exp\left(-\frac{m(a-\mu)}{\theta}\right)\right]+\frac{m-1}{m}E_{(\theta,\mu)}\left[\theta\exp\left(-\frac{m(a-\mu)}{\theta}\right)\right]=E(S_1). \tag{2.8}$$

Firstly, we generate $n$ random values from the priors (2.13) and (2.14), and denote them by $(\theta_i,\mu_i)$ $(1\le i\le n)$. Secondly, by Equations (2.1) and (2.15) we obtain $X_i$ $(1\le i\le n)$ and $V_i$ $(1\le i\le n)$ and, accordingly, $Z_i=\min(X_i,V_i)$ $(1\le i\le n)$ and $\delta_i=I(X_i\le V_i)$ $(1\le i\le n)$. Thirdly, letting the predetermined constant $a$ be equal to the MTTF, we compute the frequencies of the event [...]

[...] which may be a finite mixture of any two life distributions, as occurs when two different causes of failure are present (see [...] and others).

5. Proofs

5.1. The Proof of Theorem 1

Proof.
In order to obtain the conclusion of Theorem 1, we first prove

$$\tilde S_1-S_1\xrightarrow{p}0,\quad n\to\infty. \tag{5.1}$$

Note that

$$E(\tilde S_1-S_1)^2=E(\tilde S_1^2)-2E(\tilde S_1S_1)+E(S_1^2). \tag{5.2}$$

Firstly, it is easy to see that

$$\begin{aligned}E(S_1^2)&=\frac{1}{n^2}E\left[\sum_{i=1}^{n}I(X_i\ge a)Y_i^2+\sum_{i\ne j}I(X_i\ge a)I(X_j\ge a)Y_iY_j\right]\\&=\frac1nE_{(\theta,\mu)}\left[(2\theta^2+2\theta\mu+\mu^2)\exp\left(-\frac{m(a-\mu)}{\theta}\right)\right]+\frac{n-1}{n}\left\{E_{(\theta,\mu)}\left[(\theta+\mu)\exp\left(-\frac{m(a-\mu)}{\theta}\right)\right]\right\}^2.\end{aligned} \tag{5.3}$$

Secondly, we have

$$E(\tilde S_1S_1)=[\ldots] \tag{5.4}$$

Thirdly, $\tilde S_1$ can be represented as

$$\tilde S_1=\frac1n\sum_{i=1}^{n}\left[\frac{1}{n-1}\sum_{j\ne i}\frac{\delta_jZ_j}{1-W(Z_j)}+(m-1)(Z_i-a)\right]\frac{\delta_iI(Z_i\ge a)}{1-W(Z_i)}. \tag{5.5}$$

We know
$$E(\tilde S_1^2)=\frac1nQ_1+\frac{n-1}{n}Q_2, \tag{5.6}$$

where

$$Q_1=E\left\{\left[\frac{1}{n-1}\sum_{j\ne i}\frac{\delta_jZ_j}{1-W(Z_j)}+(m-1)(Z_i-a)\right]^2\left[\frac{\delta_iI(Z_i\ge a)}{1-W(Z_i)}\right]^2\right\}=[\ldots] \tag{5.7}$$

and, for $i\ne k$,

$$Q_2=E\left\{\left[\frac{1}{n-1}\sum_{j\ne i}\frac{\delta_jZ_j}{1-W(Z_j)}+(m-1)(Z_i-a)\right]\frac{\delta_iI(Z_i\ge a)}{1-W(Z_i)}\left[\frac{1}{n-1}\sum_{l\ne k}\frac{\delta_lZ_l}{1-W(Z_l)}+(m-1)(Z_k-a)\right]\frac{\delta_kI(Z_k\ge a)}{1-W(Z_k)}\right\}=[\ldots] \tag{5.8}$$

Along with Equations (5.3)-(5.4) and (5.6)-(5.8), we have

$$E(\tilde S_1-S_1)^2=[\ldots] \tag{5.9}$$

Combining Equation (5.9) and using the following facts:
1) 
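[...];
2) $E\left[\dfrac{X^2}{1-W(X)}\right]\le\dfrac{a^2}{1-W(a)}+E\left[\dfrac{X^2I(X\ge a)}{[1-W(X)]^2}\right]$;
3) $E\left[\dfrac{(X-a)^2I(X\ge a)}{[1-W(X)]^2}\right]<E\left[\dfrac{X^2I(X\ge a)}{[1-W(X)]^2}\right]<\infty$; (5.10)
and also by the Cauchy-Schwarz inequality, we easily see that under conditions 1), 2) and 3),

$$\lim_{n\to\infty}E(\tilde S_1-S_1)^2=0. \tag{5.11}$$

Then by Markov's inequality, we conclude that (5.1) holds. On the other hand, note that as $n\to\infty$,

$$\tilde S_2-S_2=\frac1n\sum_{i=1}^{n}\left[\frac{\delta_iI(Z_i\ge a)}{1-W(Z_i)}-I(X_i\ge a)\right]\to0\quad\text{with probability 1}, \tag{5.12}$$

and

$$S_2\to P(X\ge a)\quad\text{with probability 1}. \tag{5.13}$$

From Equations (5.1), (5.12) and (5.13), Theorem 1 follows.

5.2. The Proof of Theorem 2

Proof. To prove Theorem 2, it is enough to show that

$$\hat S-\tilde S\xrightarrow{p}0. \tag{5.14}$$

Firstly, represent $\hat S_1$ as

$$\hat S_1=\frac{n}{n-1}\left[\frac1n\sum_{j=1}^{n}\frac{\delta_jZ_j}{1-\hat W_n(Z_j)}\right]\left[\frac1n\sum_{i=1}^{n}\frac{\delta_iI(Z_i\ge a)}{1-\hat W_n(Z_i)}\right]-\frac{1}{n(n-1)}\sum_{i=1}^{n}\frac{\delta_iZ_iI(Z_i\ge a)}{[1-\hat W_n(Z_i)]^2}+\frac{m-1}{n}\sum_{i=1}^{n}\frac{\delta_i(Z_i-a)I(Z_i\ge a)}{1-\hat W_n(Z_i)}. \tag{5.15}$$

Note that

$$\left|\frac1n\sum_{i=1}^{n}\frac{\delta_iI(Z_i\ge a)}{1-\hat W_n(Z_i)}-\frac1n\sum_{i=1}^{n}\frac{\delta_iI(Z_i\ge a)}{1-W(Z_i)}\right|\le\sup_{Z_i\le Z_{(n)}}\left|\frac{\hat W_n(Z_i)-W(Z_i)}{1-\hat W_n(Z_i)}\right|\cdot\frac1n\sum_{i=1}^{n}\frac{\delta_iI(Z_i\ge a)}{1-W(Z_i)}, \tag{5.16}$$

where the last factor converges to $P(X\ge a)$ with probability 1, since

$$\frac1n\sum_{i=1}^{n}\frac{\delta_iI(Z_i\ge a)}{1-W(Z_i)}\to E\left[\frac{\delta_1I(Z_1\ge a)}{1-W(Z_1)}\right]=P(X\ge a),\quad n\to\infty, \tag{5.17}$$

with probability 1. By Equation (5.16) and the following result (see [...]),

$$\sup_{Z_i\le Z_{(n)}}\left|\hat W_n(Z_i)-W(Z_i)\right|\xrightarrow{p}0, \tag{5.18}$$

we know that

$$\frac1n\sum_{i=1}^{n}\frac{\delta_iI(Z_i\ge a)}{1-\hat W_n(Z_i)}-\frac1n\sum_{i=1}^{n}\frac{\delta_iI(Z_i\ge a)}{1-W(Z_i)}\xrightarrow{p}0. \tag{5.19}$$

Similarly, we have

$$\frac1n\sum_{j=1}^{n}\frac{\delta_jZ_j}{1-\hat W_n(Z_j)}-\frac1n\sum_{j=1}^{n}\frac{\delta_jZ_j}{1-W(Z_j)}\xrightarrow{p}0, \tag{5.20}$$

$$\frac1n\sum_{i=1}^{n}\frac{\delta_iZ_iI(Z_i\ge a)}{[1-\hat W_n(Z_i)]^2}-\frac1n\sum_{i=1}^{n}\frac{\delta_iZ_iI(Z_i\ge a)}{[1-W(Z_i)]^2}\xrightarrow{p}0, \tag{5.21}$$

$$\frac1n\sum_{i=1}^{n}\frac{\delta_i(Z_i-a)I(Z_i\ge a)}{1-\hat W_n(Z_i)}-\frac1n\sum_{i=1}^{n}\frac{\delta_i(Z_i-a)I(Z_i\ge a)}{1-W(Z_i)}\xrightarrow{p}0. \tag{5.22}$$

Combining Equation (5.15) with (5.19)-(5.22), we conclude that

$$\hat S_1-\tilde S_1\xrightarrow{p}0, \tag{5.23}$$

and

$$\hat S_2-\tilde S_2\xrightarrow{p}0. \tag{5.24}$$

Hence, Equation (5.14) has been proved. Together with the conclusion of Theorem 1, Theorem 2 holds.

6. Acknowledgements

The authors would like to thank an anonymous referee for his helpful comments.

7. References

[1] H. Robbins, “Some Thoughts on Empirical Bayes Estimation,” Annals of Statistics, Vol. 11, No. 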
3, 1983, pp. 713-723. doi:10.1214/aos/1176346239
[2] S. Zacks, “Introduction to Reliability Analysis,” Springer, New York, 1991.
[3] J. Aitchison and I. R. Dunsmore, “Statistical Prediction Analysis,” Cambridge University Press, Cambridge, 1975. doi:10.1017/CBO9780511569647
[4] S. Geisser, “Predictive Inference: An Introduction,” Chapman and Hall, London, 1993.
[5] L. J. Bain and M. Engelhardt, “Interval Estimation for the Two-Parameter Double Exponential Distribution,” Technometrics, Vol. 15, No. 4, 1973, pp. 875-887. doi:10.2307/1267397
[6] R. G. Easterling, “Exponential Responses with Double Exponential Distribution Measurement Error-A Model for Steam Generator Inspection,” Proceedings of the DOE Statistical Symposium, U.S. Department of Energy, 1978, pp. 90-110.
[7] M. T. Madi and T. Leonard, “Bayesian Estimation for Shifted Exponential Distribution,” Journal of Statistical Planning and Inference, Vol. 55, No. 6, 1996, pp. 345-351. doi:10.1016/S0378-3758(95)00199-9
[8] W. T. Huang and H. H. Huang, “Empirical Bayes Estimation of the Guarantee Lifetime in a Two-Parameter Exponential Distribution,” Statistics and Probability Letters, Vol. 76, No. 16, 2006, pp. 1821-1829. doi:10.1016/j.spl.2006.04.034
[9] J. W. Wu, H. M. Lee and C. L. Lei, “Computational Testing Algorithmic Procedure of Assessment for Lifetime Performance Index of Products with Two-Parameter Exponential Distribution,” Applied Mathematics and Computation, Vol. 190, No. 5, 2007, pp. 116-125. doi:10.1016/j.amc.2007.01.010
[10] J. O. Berger, “Statistical Decision Theory and Bayesian Analysis,” Springer, New York, 1985.
[11] R. S. Singh, “Bayes and Empirical Bayes Procedures for Selecting Good Populations from a Translated Exponential Family,” Empirical Bayes and Likelihood Inference, Springer, New York, 2001.
[12] H. Lian, “Consistency of Bayesian Estimation of a Step Function,” Statistics and Probability Letters, Vol. 77, No. 1, 2007, pp. 19-24. doi:10.1016/j.spl.2006.05.007
[13] Q. H. 
Wang, “Empirical Likelihood for a Class of Functions of Survival Distribution with Censored Data,” Annals of the Institute of Statistical Mathematics, Vol. 53, No. 3, 2001, pp. 517-527. doi:10.1023/A:1014617112870
[14] E. L. Kaplan and P. Meier, “Nonparametric Estimation from Incomplete Observations,” Journal of the American Statistical Association, Vol. 53, No. 282, 1958, pp. 457-481. doi:10.2307/2281868
[15] R. Y. Zheng, “Improvement on Grey Modelling Approach and Its Applications to Fatigue Reliability Research,” Master Dissertation, National University of Defense Technology, Chang Sha, 1989.
[16] K. O. Bowman and L. R. Shenton, “Weibull Distributions When the Shape Parameter is Defined,” Computational Statistics and Data Analysis, Vol. 36, No. 4, 2001, pp. 299-310. doi:10.1016/S0167-9473(00)00048-7
[17] U. Singh, P. K. Gupta and S. K. Upadhyay, “Estimation of Parameters for Exponentiated-Weibull Family Under Type-II Censoring Scheme,” Computational Statistics and Data Analysis, Vol. 48, No. 3, 2005, pp. 509-523. doi:10.1016/j.csda.2004.02.009
[18] L. F. Zhang, M. Xie and L. C. Tang, “A Study of Two Estimation Approaches for Parameters of Weibull Distribution Based on WPP,” Reliability Engineering and System Safety, Vol. 92, No. 3, 2007, pp. 360-368. doi:10.1016/j.ress.2006.04.008
[19] Z. F. Jaheen, “Bayesian Prediction Under a Mixture of Two-Component Gompertz Lifetime Model,” Test, Vol. 12, No. 1, 2003, pp. 413-426. doi:10.1007/BF02595722
[20] M. Zhou, “Asymptotic Normality of the Synthetic Data Regression Estimation for Censored Survival Data,” Annals of Statistics, Vol. 20, No. 2, 1992, pp. 1002-1021. doi:10.1214/aos/1176348667