Let us call the model with this prior specification Model I(1).

In Model I with the second set of prior specifications, we assume a Gamma prior distribution for α given by Γ(0.001, 0.001). For the parameter k, we studied the sensitivity of the results to different priors and hyperparameter values, because it is difficult to obtain convergence of the algorithm used to simulate samples when this parameter is estimated. The prior distributions tested for k include Uniform(0, T), Γ(0.001, 0.001) and Γ(10,000), as well as other hyperparameter values for these distributions. The transformation log(k) was also tested, with log(k) ~ N(a1, b1) and many values of the hyperparameters a1 and b1, including N(0, 0.01), N(10, 10) and N(0, 1). In most tests the convergence of the Markov chain simulations was not satisfactory. The best result was obtained with a truncated exponential distribution with mean parameter 0.95, truncated at 3. Let us call the model with this prior distribution Model I(2). Table 1 shows posterior summaries of the estimates for Models I, II and III.
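To make the prior chosen for k concrete, the following minimal Python sketch evaluates the density of a truncated exponential distribution and draws from it by inverting its distribution function. It assumes that the "mean parameter" 0.95 is the mean of the exponential before truncation and that 3 is an upper truncation point; the function names are hypothetical and are not part of the original analysis.

```python
import math
import random

def trunc_exp_pdf(k, mean=0.95, upper=3.0):
    """Density of an exponential with the given mean, restricted to 0 < k <= upper."""
    if k <= 0.0 or k > upper:
        return 0.0
    rate = 1.0 / mean
    norm = 1.0 - math.exp(-rate * upper)  # probability mass kept after truncation
    return rate * math.exp(-rate * k) / norm

def trunc_exp_sample(rng, mean=0.95, upper=3.0):
    """Draw one value by inverting the truncated distribution function."""
    rate = 1.0 / mean
    norm = 1.0 - math.exp(-rate * upper)
    u = rng.random()  # uniform on [0, 1)
    return -math.log(1.0 - u * norm) / rate

rng = random.Random(0)
draws = [trunc_exp_sample(rng) for _ in range(10000)]
```

Because the support is bounded, every draw of k stays below the truncation point, which may be one reason such a prior stabilizes the chains compared with a flat or improper choice.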

For Model I(1), which uses a non-informative prior distribution for the parameter k, the graphs of the MCMC chains are shown in Figure 1 (letters (a), (c), (e) and (g)). Only for the parameter θ (letter (a)) do we appear to have clear convergence, while for the parameter β (letter (c)) there is clear nonconvergence, and for the parameter k (letter (g)) there is possible nonconvergence. Figure 2 (letters (a), (c), (e) and (g)) presents the posterior distributions from the two chains. Only for the parameter θ (letter (a)) is there any indication of convergence. These results are confirmed by the Gelman-Rubin statistics, which are equal to 1.25 and 1.15 for the parameters β and k, respectively. That is, we did not obtain convergence of the MCMC chains with these prior distributions.

Table 1. Posterior summaries of the estimates of the parameters for Models I, II and III. True model: θ = 300, β = 0.02, α = 1 and k = 1. Sample size of 300. Priors: for the parameters θ, β and α the prior distributions are the ones in Section 2. For the parameter k in Model I: (1) non-informative prior distribution proposed in [1]; (2) exponential prior distribution with mean parameter 0.95, truncated at 3. For Model I, the 1st row for each parameter corresponds to the estimate from chain 1, starting from a given value; the 2nd row corresponds to the estimate from chain 2, starting from the true values.

Figure 1. Convergence of chains 1 and 2, for the parameters of Model I(1), for which the graphs are labeled (a), (c), (e) and (g), and Model I(2), for which the graphs are labeled (b), (d), (f) and (h).

Figure 2. Posterior distribution of the parameters for chains 1 and 2 of Model I, where graphs (a), (c), (e) and (g) correspond to the first set of priors, and graphs (b), (d), (f) and (h) correspond to the second set of priors. The true value of each parameter is represented by a vertical line.

In Model I(2) we use, as the new proposed prior distribution for the parameter k, an exponential distribution with mean parameter 0.95, truncated at 3. The chains are presented in Figure 1 (letters (b), (d), (f) and (h)), and there is an indication of convergence. Figure 2 (letters (b), (d), (f) and (h)) shows the posterior distributions of the parameters for chains 1 and 2, which also indicate convergence. The convergence is confirmed by the Gelman-Rubin statistics, the largest of which is equal to 1.01.
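For readers who wish to reproduce this kind of convergence check, the sketch below computes the basic Gelman-Rubin potential scale reduction factor from several chains of equal length. It is a simplified illustration under stated assumptions: it uses the plain ratio of pooled to within-chain variance without any further correction that software packages may apply, the function name is hypothetical, and the chains are synthetic normal draws rather than output from the models in this paper.

```python
import math
import random
import statistics

def gelman_rubin(chains):
    """Basic potential scale reduction factor for chains of equal length."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length")
    means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between = n * statistics.variance(means)                             # B
    pooled = (n - 1) / n * within + between / n  # pooled variance estimate
    return math.sqrt(pooled / within)

rng = random.Random(1)
# Two synthetic chains that explore the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(2)]
# Two synthetic chains centred at different values (no agreement).
stuck = [[rng.gauss(0.0, 1.0) for _ in range(2000)],
         [rng.gauss(1.0, 1.0) for _ in range(2000)]]
r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
```

Values close to 1 indicate that the chains agree, whereas values clearly above 1, as for the second synthetic pair, indicate that the chains have not yet mixed.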

The main conclusion of this sensitivity analysis is that the proposed Model I(2) is close to the process that generates the data; in other words, it yields a good approximation of the posterior distribution of the parameters. Even with prior distributions for the parameters α, β and θ that are not very informative, Model I(2) can recover the true values of these parameters. This does not hold for Model I(1).

4. An Application to Pollution Data

This section applies Models I(1) and I(2) to the maximum daily mean measurements of ozone gas from the northeast (NE) region of Mexico City. The sample has 981 observations, which correspond to the times when a certain threshold established for the air quality standard is violated in the time interval (0, T]. These data are available at www.sma.df.gob.mx/simat, and we took eighteen years of observations, from January 1st, 1990 to December 31st, 2008 [14].
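The construction of the sample can be sketched as follows: given a series of daily measurements, the event times are the days on which the threshold is exceeded. The sketch is only illustrative. The series and the threshold are toy values, not the Mexico City measurements or the regulatory standard, the function name is hypothetical, and a strict inequality is assumed for "violated".

```python
def exceedance_times(daily_max, threshold):
    """Return the 1-based day indices at which the daily value exceeds the threshold."""
    return [day for day, value in enumerate(daily_max, start=1) if value > threshold]

# Toy values only; the threshold is hypothetical.
toy_series = [0.08, 0.12, 0.19, 0.10, 0.21, 0.17, 0.09]
times = exceedance_times(toy_series, threshold=0.15)
n_events = len(times)  # number of exceedances observed in (0, T]
T = len(toy_series)    # length of the observation window in days
```

In the application, the analogous list of event times contains 981 entries.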

For these models and this data set we used a burn-in of 15,000 iterations, after which we obtained 7000 samples from the posterior distributions, keeping one sample every 100 iterations.
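One reading of this sampling scheme, assumed here, is that 7000 draws were retained per chain, one at every 100th iteration after the burn-in. Under that reading each chain runs for 15,000 + 7000 × 100 = 715,000 iterations. The short sketch below lists the retained iteration indices; the function name is hypothetical.

```python
def retained_iterations(burn_in, n_keep, thin):
    """1-based iteration indices kept after discarding burn-in and thinning."""
    return [burn_in + thin * j for j in range(1, n_keep + 1)]

kept = retained_iterations(burn_in=15000, n_keep=7000, thin=100)
total_iterations = kept[-1]  # last iteration that must be simulated
```

Thinning of this kind reduces the autocorrelation between retained draws at the cost of a much longer run.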

Table 2 shows the posterior estimates of the parameters for Models I(1) and I(2).

Figure 3 shows the trace plots of chains 1 and 2 for the parameters of the models, where graphs (a), (c), (e) and (g) correspond to Model I(1), and graphs (b), (d), (f) and (h) correspond to Model I(2). For Model I(1), which uses a non-informative prior distribution for the parameter k, only the parameter θ appears to converge (graph (a)), and there is evidence of nonconvergence for the other parameters. For Model I(2), graphs (b), (d), (f) and (h) all appear to show convergence. In this model we proposed as the prior distribution for the parameter k an exponential distribution with mean parameter 0.99, truncated at 6.

The largest value of the Gelman-Rubin statistics for the parameters of Model I(2) is equal to 1.00, indicating convergence of the MCMC chains. In Model I(1) the Gelman-Rubin statistics are equal to 1.22, 1.23 and 1.19 for the parameters α, β and k, respectively. Thus we did not obtain convergence of the MCMC chains using the non-informative prior for the parameter k.

Figure 4 shows the posterior distributions of the parameters for chains 1 and 2 for Models I(1) and I(2). The graphs labelled (a), (c), (e) and (g) correspond to the first set of priors, and the graphs labelled (b), (d), (f) and (h) correspond to the second set of priors. The results confirm convergence in Model I(2) and nonconvergence in Model I(1).

Table 2. Posterior estimates of the parameters, using a sample of real data corresponding to the NE region of Mexico City. Priors: for the parameters θ, β and α the prior distributions are the ones in Section 2. Priors for the parameter k: Model I(1), non-informative prior distribution proposed in [1]; Model I(2), exponential prior distribution with mean parameter 0.99, truncated at 6. For each parameter, the 1st row presents the estimates corresponding to chain 1, and the 2nd row presents the estimates corresponding to chain 2.

Figure 3. Convergence of chains 1 and 2, for the parameters of Model I(1), in which the graphs are labelled (a), (c), (e) and (g), and Model I(2), in which the graphs are labelled (b), (d), (f) and (h).

5. Conclusions

In this article we carried out a sensitivity analysis with various specifications of prior distributions for the model previously introduced in [1], which was developed using the Bayesian approach. We studied the effect of the prior distributions on the convergence and accuracy of the results, and in this way we were able to propose a prior distribution for the parameter k that gives convergence of the simulation algorithm. Observe that using improper prior distributions for the parameter k could not guarantee that the posterior distribution was proper, as this depended on the data set (as was observed in our application with real data). After trying several prior specifications for this parameter, it was possible to propose a prior distribution that considerably improves the convergence of the chains. These improvements can be seen in the credible intervals, which are narrower under the truncated exponential prior distribution, and in the graphical analysis, where the chains converge when a truncated exponential prior distribution is used for the parameter k.

It is important to point out that other informative priors for the parameter k could be considered, such as a Gamma distribution. This possibility is a goal for a future study.

Figure 4. Posterior distributions of the parameters for chains 1 and 2 of Model I, where graphs (a), (c), (e) and (g) correspond to the first set of priors and graphs (b), (d), (f) and (h) correspond to the second set of priors.

6. Acknowledgements

This work was partially supported by grants from Capes, CNPq and FAPESP. We would like to thank one referee and Laboratório Epifisma.

REFERENCES

  1. J. A. Achcar, D. K. Dey and M. Niverthi, “A Bayesian Approach Using Nonhomogeneous Poisson Process for Software Reliability Models,” In: A. S. Basu, S. K. Basu and S. Mukhopadhyay, Eds., Frontiers in Reliability, World Scientific Publishing Co., Singapore City, 1998, pp. 1-18. doi:10.1142/9789812816580_0001
  2. A. L. Goel and K. Okumoto, “An Analysis of Recurrent Software Errors in a Real-Time Control System,” Proceedings of the 1978 Annual Conference, ACM’78, Washington DC, 4-6 December 1978, pp. 496-501. doi:10.1145/800127.804160
  3. A. L. Goel, “A Guidebook for Software Reliability Assessment,” Technical Report, Syracuse University, Syracuse, 1983.
  4. G. S. Mudholkar, D. K. Srivastava and M. Freimer, “The Exponentiated-Weibull Family: A Reanalysis of the Bus-Motor-Failure Data,” Technometrics, Vol. 37, No. 4, 1995, pp. 436-445. doi:10.1080/00401706.1995.10484376
  5. V. G. Cancho, H. Bolfarine and J. A. Achcar, “A Bayesian Analysis of the Exponentiated-Weibull Distribution,” Journal of Applied Statistical Science, Vol. 8, No. 4, 1999, pp. 227-242.
  6. J. E. R. Cid and J. A. Achcar, “Bayesian Inference for Nonhomogeneous Poisson Processes in Software Reliability Models Assuming Nonmonotonic Intensity Functions,” Computational Statistics and Data Analysis, Vol. 32, No. 2, 1999, pp. 147-159. doi:10.1016/S0167-9473(99)00028-6
  7. A. Gelman and D. B. Rubin, “A Single Series from the Gibbs Sampler Provides a False Sense of Security,” In: J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, Eds., Bayesian Statistics 4, Oxford University Press, Oxford, 1992, pp. 625-631.
  8. S. P. Brooks and A. Gelman, “General Methods for Monitoring Convergence of Iterative Simulations,” Journal of Computational and Graphical Statistics, Vol. 7, No. 4, 1998, pp. 434-455. doi:10.2307/1390675
  9. A. Gelman, “Inference and Monitoring Convergence,” In: W. R. Gilks, S. Richardson and D. J. Spiegelhalter, Eds., Markov Chain Monte Carlo in Practice, Chapman and Hall, London, 1996, pp. 131-143.
  10. D. R. Cox and P. A. Lewis, “Statistical Analysis of Series of Events,” Methuen, London, 1966.
  11. N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller and E. Teller, “Equation of State Calculations by Fast Computing Machines,” Journal of Chemical Physics, Vol. 21, No. 6, 1953, pp. 1087-1092. doi:10.1063/1.1699114
  12. W. K. Hastings, “Monte Carlo Sampling Methods Using Markov Chains and Their Applications,” Biometrika, Vol. 57, No. 1, 1970, pp. 97-109. doi:10.1093/biomet/57.1.97
  13. G. Casella and R. L. Berger, “Statistical Inference,” 2nd Edition, Duxbury Press, Pacific Grove, 2001.
  14. J. A. Achcar, E. R. Rodrigues, C. D. Paulino and P. Soares, “Non-homogeneous Poisson Models With a Change-Point: An Application to Ozone Peaks in Mexico City,” Environmental and Ecological Statistics, Vol. 17, No. 4, 2010, pp. 521-541. doi:10.1007/s10651-009-0114-3

NOTES

*Corresponding author.
