Open Journal of Statistics
Vol.04 No.09(2014), Article ID:50509,8 pages
10.4236/ojs.2014.49069
Two-Sample Bayesian Predictive Analyses for an Exponential Non-Homogeneous Poisson Process in Software Reliability
Albert Orwa Akuno, Luke Akong’o Orawo, Ali Salim Islam
Department of Mathematics, Egerton University, Egerton, Kenya
Email: orwaakuno@gmail.com, orawo2000@yahoo.com, asislam54@yahoo.com
Copyright © 2014 by authors and Scientific Research Publishing Inc.
This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/



Received 4 August 2014; revised 9 September 2014; accepted 29 September 2014
ABSTRACT
The Goel-Okumoto software reliability model is one of the earliest attempts to use a non-homogeneous Poisson process (NHPP) to model failure times observed during a software test interval. The model is known as the exponential NHPP model because it describes an exponential software failure curve. Parameter estimation, model fit and predictive analyses based on one sample have been conducted on the Goel-Okumoto software reliability model. However, predictive analyses based on two samples have not been conducted on the model. In two-sample prediction, the parameters and characteristics of the first sample are used to analyze and make predictions for the second sample. This helps save time and resources during the software development process. This paper presents some results on predictive analyses for the Goel-Okumoto software reliability model based on two samples. We address three issues in two-sample prediction that are closely associated with the software development testing process. Bayesian methods based on non-informative priors are adopted to develop solutions to these issues. The developed methodologies are illustrated using two sets of software failure data simulated from the Goel-Okumoto software reliability model.
Keywords:
Nonhomogeneous Poisson Process, Software Reliability Models, Non-Informative Priors, Bayesian Approach

1. Introduction
Software reliability is defined as the probability of failure-free software operation for a specified period of time in a specified environment [1] . The reliability of any software is of great interest to software developers before a decision is made to release the software into the market. Developers need correct and concise information about how reliable the software is, because a single software defect can cause system failure, and reliable software is required to avoid such failures [2] . Software reliability is achieved through testing during the software development stage [3] . The usual way of removing bugs from a software system is to run test cases on it in a manner similar to the way users will operate it in their particular environment. However, emulating the end-user environment during the test interval is difficult, expensive and time consuming, especially when there are multiple types of end-users in different environments. Software reliability modeling can be used to address this dilemma, especially when reliability testing on two software systems can be achieved in one testing period. Software reliability modeling can provide the basis for planning reliability growth tests, monitoring progress, estimating current reliability, and forecasting and predicting future reliability improvements [4] . Predictive analyses support such forecasting and prediction. A prediction interval is usually constructed to provide the time frame within which the future failure observation will occur with a pre-determined confidence level [5] .
An Exponential Nonhomogeneous Poisson Process with intensity function
(1)
is one of the earliest software reliability models to be developed. Such a model is an NHPP and is commonly referred to as the Goel-Okumoto (1979) software reliability model, after Goel and Okumoto, who introduced it in 1979.
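As a point of reference, the Goel-Okumoto model is commonly written with mean value function m(t) = a(1 - exp(-bt)) and intensity function lambda(t) = a b exp(-bt). The sketch below evaluates both quantities. The symbols `a` and `b` and the function names are assumptions made for illustration and may differ from the notation used in Equation (1).

```python
import math


def go_intensity(t, a, b):
    """Intensity (failure rate) of the Goel-Okumoto NHPP at time t.

    Assumed parameterization: lambda(t) = a * b * exp(-b * t),
    where a is the expected total number of failures and b > 0
    is the per-fault detection rate.
    """
    return a * b * math.exp(-b * t)


def go_mean_value(t, a, b):
    """Expected cumulative number of failures in (0, t]: a * (1 - exp(-b * t))."""
    # -expm1(-x) equals 1 - exp(-x) and is accurate for small x.
    return a * (-math.expm1(-b * t))


def go_expected_failures(s, t, a, b):
    """Expected number of failures in the interval (s, t], with s <= t."""
    return go_mean_value(t, a, b) - go_mean_value(s, a, b)
```

Because the intensity decreases with time, the expected number of failures in a window of fixed length shrinks as testing proceeds, which is the reliability-growth behaviour the model is meant to capture.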
The model described in Equation (1) has been applied in a number of software testing environments, and its usefulness in describing and assessing software failures has been studied by various authors. For instance, [6] used the Kolmogorov-Smirnov goodness-of-fit test to check the adequacy of the software reliability model and also presented software failure data which showed that the failure rate, i.e. the number of failures per hour, seemed to decrease with time. One-sample Bayesian predictive analysis on the model has also been conducted [7] . However, there is no literature on two-sample Bayesian predictive analyses on the model.
This paper therefore focuses on two-sample Bayesian predictive analyses on the model whose intensity function is described in Equation (1). First, three issues in two-sample prediction that may be experienced during the development testing stage of the software are identified, and their corresponding predictive distributions are developed, in Section 2. The main results for the two-sample prediction are presented in Section 3. The developed methodologies are illustrated in Section 6 using failure data simulated for two software systems. A discussion is given in Section 7 and, finally, mathematical proofs are given in the Appendix.
2. Issues in Two-Sample Software Reliability Prediction
In this section, three issues closely associated with the software development testing process are presented and their predictive distributions are developed using a Bayesian approach. For the purposes of the three predictive issues, it is assumed that reliability growth testing is performed on a software system and the cumulative number of failures of the software in the time interval
, denoted by
is observed. It is further assumed that
follows the NHPP with intensity function given in Equation (1).
Let
be the observed failure times. Failure data is said to be failure-truncated when testing stops after a predetermined
number of failures occur. The
failure times are denoted by
where
. Failure data is said to be time truncated if testing stops at a predetermined time
. The corresponding observed failure data is denoted by
, where
. Now, let us consider two software systems and assume that their cumulative inter-failure times obey the Goel-Okumoto (1979) software reliability model with observed data being either
or
. Based on
or
A1: How to predict the 
B1: How to predict the number of failures that will occur in the time interval 
C1: How to predict the 



Posterior and Predictive Distributions
Let 




Case 1: when the shape parameter 


Thus, the posterior distribution of 

Let 


Hence the Bayesian UPL of 



3. Main Results for the Two-Sample Prediction
Proposition 1 (for issue A1)
The Bayesian UPL of 




Proposition 2 (for issue B1)
The probability that the number of failures 




Proposition 3 (for issue C1)
Given that the number of failures in 





4. Data Simulation
In this section, two software failure data sets are generated from the Goel-Okumoto (1979) software reliability model. The two data sets are simulated using the same model and parameters. The simulated data are used to illustrate the methodologies developed for the two-sample Bayesian predictive analyses. The simulation procedure is as follows. The Goel-Okumoto (1979) model is as given in Equation (1).
The values of 







Step 1:
Step 2: Generate a random number
Step 3:

Step 4: Generate a random number U.
Step 5: If


Step 6: Go to step 2.
In the above steps, 







Software one: 8.9345, 27.0177, 34.5816, 54.8606, 83.5715, 111.4006, 139.8851, 157.4743, 181.0868, 182.8410.
Software two: 2.3159, 16.2530, 20.5721, 23.3416, 42.8030, 46.7417, 61.0926, 63.8807, 75.1330, 80.7768, 97.3435, 117.9091, 129.3157, 138.0590, 169.3410, 172.7516, 186.0293, 193.1918, 198.5999.
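The simulation steps listed above appear consistent with the thinning (acceptance-rejection) method for non-homogeneous Poisson processes. The sketch below is a minimal illustration of that method under the assumed parameterization lambda(t) = a b exp(-bt), for which the intensity is largest at t = 0. The function name, the parameter names `a`, `b` and `t_end`, and the values shown in the usage note are hypothetical and are not the values used to generate the two data sets above.

```python
import math
import random


def simulate_goel_okumoto(a, b, t_end, rng):
    """Simulate failure times on (0, t_end] by thinning.

    Assumes intensity lambda(t) = a * b * exp(-b * t), which is bounded
    above by lam_max = a * b on the whole interval.
    """
    lam_max = a * b
    t = 0.0
    times = []
    while True:
        # Candidate event from a homogeneous Poisson process with rate lam_max.
        u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is finite
        t -= math.log(u1) / lam_max
        if t > t_end:
            return times
        # Accept the candidate with probability lambda(t) / lam_max = exp(-b * t).
        u2 = rng.random()
        if u2 <= math.exp(-b * t):
            times.append(t)
```

For example, `simulate_goel_okumoto(50.0, 0.01, 200.0, random.Random(1))` returns an increasing list of failure times in (0, 200]. Passing the generator explicitly keeps the simulation reproducible.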
5. Maximum Likelihood Estimation
Suppose the observation of the failure times occurred in the time interval 



Differentiating 




Solving Equation (11) and Equation (12) we obtain


A necessary and sufficient condition for Equation (13) and Equation (14) to have a unique and positive solution 



two times the mean failure time is less than



The Newton-Raphson method, a numerical procedure, can be used to solve Equation (13) and Equation (15). The Newton-Raphson method requires choosing initial values of 





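A minimal sketch of the estimation step is given below, assuming time-truncated data on (0, t_end] and the parameterization lambda(t) = a b exp(-bt). Under these assumptions the log-likelihood is n ln(a) + n ln(b) - b sum(t_i) - a(1 - exp(-b t_end)), so that a = n / (1 - exp(-b t_end)) and b solves a single score equation. The sketch uses bisection in place of the Newton-Raphson method because bisection needs no initial guess; this is a substitution made for robustness, and the function and parameter names are hypothetical.

```python
import math


def score_b(b, times, t_end):
    """Profile score for b after substituting a = n / (1 - exp(-b * t_end))."""
    n = len(times)
    decay = math.exp(-b * t_end)
    # -expm1(-x) equals 1 - exp(-x) and avoids cancellation for small x.
    return n / b - sum(times) - n * t_end * decay / (-math.expm1(-b * t_end))


def fit_goel_okumoto(times, t_end, tol=1e-12, max_iter=500):
    """Return (a_hat, b_hat) for time-truncated failure data, by bisection."""
    n = len(times)
    total = sum(times)
    # A finite positive solution requires twice the mean failure time < t_end.
    if n == 0 or t_end <= 2.0 * total / n:
        raise ValueError("no finite positive MLE: need 2 * mean(times) < t_end")
    lo = 1e-8 / t_end   # the score tends to n * t_end / 2 - total > 0 as b -> 0
    hi = n / total      # the score is negative here because n / hi - total = 0
    if score_b(lo, times, t_end) <= 0.0:
        raise ValueError("data too close to the existence boundary")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score_b(mid, times, t_end) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * hi:
            break
    b_hat = 0.5 * (lo + hi)
    a_hat = n / (-math.expm1(-b_hat * t_end))
    return a_hat, b_hat
```

The existence check mirrors the condition stated above that two times the mean failure time must be less than the truncation time. When the data lie close to that boundary, the estimate of b is close to zero and the estimate of a becomes large.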
6. Real Example for Two-Sample Bayesian Prediction
Here, we use the two software data sets simulated in Section 4 to illustrate the propositions developed in Section 3 for two-sample Bayesian prediction problems. Assuming that the two software systems were observed in the time interval
Software one: 8.9345, 27.0177, 34.5816, 54.8606, 83.5715, 111.4006, 139.8851, 157.4743, 181.0868, 182.8410.
Software two: 2.3159, 16.2530, 20.5721, 23.3416, 42.8030, 46.7417, 61.0926, 63.8807, 75.1330, 80.7768, 97.3435, 117.9091, 129.3157, 138.0590, 169.3410, 172.7516, 186.0293, 193.1918, 198.5999.
The two sets of software failure times are simulated from the same Goel-Okumoto (1979) software reliability model. The three issues in two-sample prediction identified in Section 2 are addressed as follows:
Issue A1: First, we assume that the failure times of the second software were not observed. Based on the failure data of software one, the maximum likelihood estimate of 





Issue B1: If



Issue C1: Suppose that the number of observed failures of the second software during 




7. Discussion
Several issues may arise during development testing of a software system, especially when the Goel-Okumoto (1979) software reliability model has been used to model the failure process of the software system. This paper has provided solutions to three issues closely associated with the software development testing process. A Bayesian approach with a non-informative prior has been used to address the three issues, and explicit solutions have been obtained. These solutions may prove useful to software engineers in determining when to modify, debug and terminate the software development testing process.
A non-informative prior has been used in this paper to develop the methodologies for the three issues. However, informative priors may also prove useful in deriving the methodologies. We leave this open for future research. Further, this paper has only derived the methodologies for known shape parameter

References
[1] Nuria, T.R. (2011) Stochastic Comparisons and Bayesian Inference in Software Reliability. Ph.D. Thesis, Universidad Carlos III de Madrid, Madrid.
[2] Satya, P., Bandla, S.R. and Kantham, R.R.L. (2011) Assessing Software Reliability Using Inter Failures Time Data. International Journal of Computer Applications, 18, 975-978.
[3] Daniel, R.J. and Hoang, P. (2001) On the Maximum Likelihood Estimates for the Goel-Okumoto Software Reliability Model. The American Statistician, 55, 219-222. http://dx.doi.org/10.1198/000313001317098211
[4] Meth, M. (1992) Reliability Growth Myths and Methodologies: A Critical View. Proceedings of the Annual Reliability and Maintainability Symposium, New York, 230-238.
[5] Yu, J.-W., Tian, G.-L. and Tang, M.-L. (2007) Predictive Analyses for Nonhomogeneous Poisson Processes with Power Law Using Bayesian Approach. Computational Statistics & Data Analysis, 51, 4254-4268. http://dx.doi.org/10.1016/j.csda.2006.05.010
[6] Razeef, M. and Mohsin, N. (2012) Software Reliability Growth Models: Overview and Applications. Journal of Emerging Trends in Computing and Information Sciences, 3, 1309-1320.
[7] Akuno, A.O., Orawo, L.A. and Islam, A.S. (2014) One-Sample Bayesian Predictive Analyses for an Exponential Non-Homogeneous Poisson Process in Software Reliability. Open Journal of Statistics, 4, 402-411. http://dx.doi.org/10.4236/ojs.2014.45039
[8] Musa, J. (1987) Software Reliability: Measurement, Prediction, Application. McGraw-Hill, New York.
[9] Sheldon, R. (2002) Simulation. 3rd Edition, Academic Press, Waltham.
[10] Hossain, S.A. and Dahiya, R.C. (1993) Estimating the Parameters of a Non-Homogenous Poisson-Process Model for Software Reliability. IEEE Transactions on Reliability, 42, 604-612.
Appendix (Proofs of Propositions 1-3)
The following identity is used in proving some of the propositions. The identity is given without proof.

where 




Proof of Proposition 1
We know that given









The joint density of 


Replacing 



From Equation (5) and Equation (A.4) we have

From Equation (6) and Equation (A.5), we have

Equation (A.6) implies the formula in Equation (7) .
Proof of Proposition 2
The study is interested in predicting the number of failures (denoted by


For any level



Here, an equivalent problem is considered. For any given positive integer


When 

Rearranging Equation (A.9) we obtain

Equation (A.9) implies the formula in Equation (8) .
Proof of Proposition 3
First, we want to find the conditional density of 


After integrating Equation (A.11) with respect to 

Further integrating Equation (A.12) with respect to

Therefore, the conditional density of 


Which is independent of

where 


Given




If

Solving the integral part of Equation (A.16), we obtain

Thus, the Bayesian UPL of 





