Estimation of Hazard Function for Censoring Random Variable by Using Wavelet Decomposition and Evaluation of MISE, AMSE with Simulation

Journal of Data Analysis and Information Processing
Vol. 2, No. 1 (2014), Article ID: 43245, 5 pages. DOI: 10.4236/jdaip.2014.21001

Mahmoud Afshari*, Saeed Tahmasebi

Department of Statistics, College of Science, Persian Gulf University, Bushehr, Iran

Email: *Afshari@pgu.ac.ir, Tahmasebi@pgu.ac.ir

Received November 15, 2013; revised December 17, 2013; accepted January 25, 2014

KEYWORDS

Wavelet; Estimator; Censoring Random Variable; Mean Square Integral Error; Average Mean Square Error; Simulation

ABSTRACT

Wavelet analysis is one of the relatively new methods in pure and applied mathematics. In this paper, we use the wavelet method to estimate the hazard function for a censored random variable. We consider the convergence rate of the given estimator. We also present a simulation study to test the proposed estimator by calculating the mean integrated squared error (MISE) and the average mean squared error (AMSE).

1. Introduction

One type of data of great interest to researchers is the time elapsed until a certain event, such as death, occurs. Every process that waits for a specific event produces survival data. In survival analysis, failure means the occurrence of the event being waited for. The time point from which survival is measured is called the start time.

The failure time, denoted by $T$, $T \ge 0$, is the time at which failure occurs for an individual. It is not always possible to observe the failure time of every individual; in such cases censoring occurs.

The survival function, denoted by $S(t) = P(T > t)$, gives the proportion of individuals who survive from the base time, the point at which they enter the experiment, up to time $t$. For a continuous failure time with density $f$, the hazard function is defined as follows:

$$\lambda(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)} \qquad (1)$$
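The ratio form of the hazard, density divided by survival function, can be checked numerically. The following is a minimal sketch under an assumed Weibull failure-time model; the model, the function names and the parameter values are illustrative choices and are not taken from the paper.

```python
import math

def weibull_density(t, shape, scale):
    """Density f(t) of a Weibull(shape, scale) failure time, for t > 0."""
    z = t / scale
    return (shape / scale) * z ** (shape - 1.0) * math.exp(-(z ** shape))

def weibull_survival(t, shape, scale):
    """Survival function S(t) = P(T > t)."""
    return math.exp(-((t / scale) ** shape))

def hazard_from_ratio(t, shape, scale):
    """Hazard obtained as the ratio f(t) / S(t)."""
    return weibull_density(t, shape, scale) / weibull_survival(t, shape, scale)

def weibull_hazard_closed_form(t, shape, scale):
    """Known closed form of the Weibull hazard, used only as a cross-check."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)
```

With shape equal to 1 the model reduces to the exponential distribution, for which the hazard is constant and equal to the reciprocal of the scale.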

In this paper, we obtain an estimator of the hazard function for censored data by the wavelet method, and we evaluate the convergence rate of this estimator by simulation.

2. Estimation of Hazard Function by Using Wavelet Method

Wavelets are useful for analyzing transient phenomena and functions that change rapidly. They are of limited duration, and some families are symmetric or nearly so. Wavelet coefficients are closely tied to certain function spaces through the orthogonality of wavelet bases, and this and other properties simplify the computational algorithms. As a result, numerous articles on wavelets have been published in the statistical literature.

The mathematical theory of wavelets and its application in statistics have been studied as a technique for density estimation by Haar [1], Doukhan [2] and Antoniadis [3], and for nonparametric curve estimation by Mallat [4], Meyer [5], Daubechies [6], Donoho and Johnstone [7], and Kerkyacharian and Picard [8]. Hall and Patil [9] derived a formula for the mean integrated squared error of nonlinear wavelet-based density estimators. Antoniadis et al. [10] obtained estimators of the density and hazard functions for right-censored data with wavelets. Daubechies [11] studied compactly supported wavelets that generate orthogonal bases. Afshari et al. [12-14] studied estimators of the density, the derivative of the density, and the regression function for mixing random variables.

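Let the nested sequence of closed subspaces

$$\cdots \subset V_{-1} \subset V_0 \subset V_1 \subset \cdots$$

be a multiresolution approximation to $L^2(\mathbb{R})$. Define $W_j$ to be the orthogonal complement of $V_j$ in $V_{j+1}$. A wavelet basis is built from a scaling function $\varphi$ and a mother wavelet $\psi$ such that $\{\varphi(x-k), k \in \mathbb{Z}\}$ forms an orthonormal basis for $V_0$ and $\{\psi(x-k), k \in \mathbb{Z}\}$ forms an orthonormal basis for $W_0$. The other functions in the basis are then generated by translations and dilations of the scaling function and the mother wavelet, using the relationships:

$$\varphi_{j,k}(x) = 2^{j/2}\varphi(2^j x - k), \qquad \psi_{j,k}(x) = 2^{j/2}\psi(2^j x - k) \qquad (2)$$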
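The dilation and translation step can be illustrated with the Haar system, the simplest orthonormal wavelet basis. The sketch below is only an illustration: the Haar choice, the helper names and the midpoint-rule inner product on [0, 1] are assumptions made for the example, not part of the paper.

```python
def haar_phi(x):
    """Haar scaling function: indicator of [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def haar_psi(x):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def dilate_translate(base, j, k):
    """Return x -> 2^(j/2) * base(2^j * x - k)."""
    scale = 2.0 ** (j / 2.0)
    def member(x):
        return scale * base((2.0 ** j) * x - k)
    return member

def inner_product(f, g, lo=0.0, hi=1.0, steps=4096):
    """Midpoint-rule approximation of the integral of f * g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) * g(lo + (i + 0.5) * h)
                   for i in range(steps))
```

For example, `inner_product(dilate_translate(haar_psi, 1, 0), dilate_translate(haar_psi, 1, 1))` is zero because the two wavelets have disjoint supports, while each has unit norm.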

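Given the above wavelet basis, a function $f \in L^2(\mathbb{R})$ can be written as a formal expansion:

$$f(x) = \sum_{k} \alpha_{j_0,k}\,\varphi_{j_0,k}(x) + \sum_{j \ge j_0} \sum_{k} \beta_{j,k}\,\psi_{j,k}(x) \qquad (3)$$

where $\alpha_{j_0,k} = \int f(x)\,\varphi_{j_0,k}(x)\,dx$ and $\beta_{j,k} = \int f(x)\,\psi_{j,k}(x)\,dx$.

As for a general orthogonal series estimator, Daubechies [6], the density estimator based on a sample $X_1, \ldots, X_n$ from $f$ can be written, for a finite resolution level $j_1 \ge j_0$, as:

$$\hat{f}(x) = \sum_{k} \hat{\alpha}_{j_0,k}\,\varphi_{j_0,k}(x) + \sum_{j = j_0}^{j_1} \sum_{k} \hat{\beta}_{j,k}\,\psi_{j,k}(x) \qquad (4)$$

where the obvious coefficient estimators can be written:

$$\hat{\alpha}_{j_0,k} = \frac{1}{n} \sum_{i=1}^{n} \varphi_{j_0,k}(X_i), \qquad \hat{\beta}_{j,k} = \frac{1}{n} \sum_{i=1}^{n} \psi_{j,k}(X_i) \qquad (5)$$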
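The coefficient estimator, an empirical average of a basis function over the sample, can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the Haar scaling function, keeps only the scaling part of the expansion at a single level j (the detail terms are omitted), and assumes the sample lies in [0, 1). All function names are hypothetical.

```python
def haar_phi_jk(x, j, k):
    """Haar scaling function at level j and shift k: 2^(j/2) on [k/2^j, (k+1)/2^j)."""
    y = (2 ** j) * x - k
    return 2.0 ** (j / 2.0) if 0.0 <= y < 1.0 else 0.0

def scaling_coefficients(sample, j):
    """Empirical coefficients: the sample mean of phi_{j,k}(X_i) for each k."""
    n = len(sample)
    return [sum(haar_phi_jk(x, j, k) for x in sample) / n
            for k in range(2 ** j)]

def density_estimate(x, coeffs, j):
    """Linear estimate at x: sum over k of coefficient_k * phi_{j,k}(x)."""
    return sum(a * haar_phi_jk(x, j, k) for k, a in enumerate(coeffs))
```

With the Haar scaling function this estimator coincides with a histogram of bin width 2^(-j), which makes it easy to verify by hand on a small sample.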

In this article, we divide the time axis into a dyadic number of small intervals and count the number of events in each interval. We determine the number of events and the hazard function from the observations. We then smooth these quantities separately over the whole time axis via linear wavelet estimation, compute the hazard function estimator, and evaluate its asymptotic behaviour.

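Suppose $T_1, \ldots, T_n$ are the failure times of the $n$ individuals under study. They are non-negative, independent and identically distributed, with density function $f$ and distribution function $F$. Also suppose that $C_1, \ldots, C_n$ are the corresponding censoring times, which are non-negative, independent and identically distributed, with density function $g$ and distribution function $G$.

Assuming independence of the failure times and the censoring times, the observed random variables and the hazard function are as follows:

$$X_i = \min(T_i, C_i), \qquad \delta_i = I(T_i \le C_i), \qquad \lambda(t) = \frac{f(t)}{1 - F(t)}, \quad F(t) < 1,$$

where $I(A)$ is the indicator function of $A$. For censored data, if $G(t) < 1$,

$$\lambda(t) = \frac{f(t)\{1 - G(t)\}}{\{1 - F(t)\}\{1 - G(t)\}}.$$

We assume that $L(t) = P(X_i \le t)$, so that $1 - L(t) = \{1 - F(t)\}\{1 - G(t)\}$, and define the sub-density

$$f^*(t) = f(t)\{1 - G(t)\}.$$

Then $\lambda$ can be written as follows:

$$\lambda(t) = \frac{f^*(t)}{1 - L(t)}, \quad L(t) < 1 \qquad (6)$$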
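The censoring mechanism and the resulting ratio form of the hazard can be illustrated by simulation. The sketch below assumes exponential failure and censoring times, writes X = min(T, C) for the observed time and delta for the indicator of an uncensored failure, and uses hypothetical function names; none of these choices comes from the paper.

```python
import math
import random

def simulate_censored(n, fail_rate, cens_rate, seed=0):
    """Draw n pairs (X, delta) with X = min(T, C) and delta = 1 if T <= C."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t = rng.expovariate(fail_rate)   # failure time T
        c = rng.expovariate(cens_rate)   # censoring time C, independent of T
        data.append((min(t, c), 1 if t <= c else 0))
    return data

def sub_density(t, fail_rate, cens_rate):
    """Sub-density f(t) * (1 - G(t)) for exponential T and C."""
    return fail_rate * math.exp(-fail_rate * t) * math.exp(-cens_rate * t)

def observed_survival(t, fail_rate, cens_rate):
    """P(X > t) = (1 - F(t)) * (1 - G(t)) for exponential T and C."""
    return math.exp(-(fail_rate + cens_rate) * t)
```

In this model the ratio `sub_density(t, a, b) / observed_survival(t, a, b)` equals the constant failure rate `a` for every t, which matches the exponential hazard.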

To estimate the hazard function, we need estimators of the sub-density and of the distribution function of the observed times.

To estimate the sub-density, we divide the time axis into a dyadic number of small intervals, count the events (0 or 1) in each interval, and then divide these counts by the length of the intervals.

The estimation procedure for the sub-density can be summarized as follows:

Select a bin width, collect the observed failures in intervals of that length, and apply wavelet estimation to the binned data to obtain an estimate of the sub-density. That is, we compute the wavelet coefficients of the binned data at the scale of the bins, choose the decomposition level, and then estimate the sub-density. The following notation is needed to state the details:

We figure estimators on the finite interval in which. Note that if is the ordinal order statistic of the sequence then,

. In fact we suppose.

Suppose that N is an integer that may depend on n, and that the estimation points are as follows:

Suppose that and we divide the interval of the time axis into intervals of length

The k-th interval is marked by so: for,.

We now define the following indicator function, which marks the uncensored failures in the time interval. We take the proportion of observed failures in the interval, in other words:

We smooth these data with an appropriate wavelet smoother to obtain an estimate of the sub-density.
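The binning step that produces the raw data for the wavelet smoother can be sketched as follows. This is a minimal illustration, assuming equal-width bins on a finite interval [0, tau) and observations stored as (time, indicator) pairs; the names `tau`, `num_bins` and `binned_subdensity` are hypothetical.

```python
def binned_subdensity(observations, tau, num_bins):
    """Raw sub-density values: uncensored failures per bin, divided by n * bin width.

    observations is a list of (x, delta) pairs, where delta = 1 marks an
    uncensored failure. Only times in [0, tau) are binned.
    """
    n = len(observations)
    width = tau / num_bins
    counts = [0] * num_bins
    for x, delta in observations:
        if delta == 1 and 0.0 <= x < tau:
            # Guard against a floating-point index equal to num_bins.
            k = min(int(x / width), num_bins - 1)
            counts[k] += 1
    return [c / (n * width) for c in counts]
```

Censored observations still count towards n, so the binned values estimate a sub-density whose total mass is the probability of an uncensored failure rather than 1.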

We can write as the following:

(7)

where,.

The multiresolution analysis structure leads to an efficient tree-structured algorithm for decomposing functions in $V_N$ whose theoretical scaling coefficients are known. However, the integrals defining these coefficients are not readily available, and an initial value is needed for the fast wavelet transform. Antoniadis et al. [10] suggested the following initial values:

As a result, a reasonable estimate of the projection of the sub-density at resolution N is:

We assume that the function estimated by the binned values belongs to a Sobolev space and that the wavelet is sufficiently regular. To smooth the data at a good rate in the sample size, we estimate the unknown function as follows:

(8)

That is, the estimator is the orthogonal projection of the binned estimate onto the smoother approximation space.
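Projection onto a coarser approximation space amounts to running the fast wavelet transform, discarding the finest detail coefficients and reconstructing. The sketch below illustrates this with the Haar filter, which is swapped in for whatever wavelet family is used in practice because it is short enough to write out by hand. The function names are hypothetical, and the input length is assumed to be divisible by 2 raised to the number of discarded levels.

```python
import math

def haar_decompose(values):
    """One step of the Haar transform: pairwise scaled sums and differences."""
    s = math.sqrt(2.0)
    half = len(values) // 2
    approx = [(values[2 * i] + values[2 * i + 1]) / s for i in range(half)]
    detail = [(values[2 * i] - values[2 * i + 1]) / s for i in range(half)]
    return approx, detail

def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def haar_project(values, levels):
    """Project onto the space `levels` steps coarser by zeroing the details."""
    current = list(values)
    detail_lengths = []
    for _ in range(levels):
        current, detail = haar_decompose(current)
        detail_lengths.append(len(detail))
    for length in reversed(detail_lengths):
        current = haar_reconstruct(current, [0.0] * length)
    return current
```

For Haar, the projection replaces each block of 2^levels neighbouring values by their average, so `haar_project([1.0, 3.0, 5.0, 7.0], 1)` gives values close to `[2.0, 2.0, 6.0, 6.0]`.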

We now consider an appropriate consistent estimator of the distribution function of the observed times, and finally we estimate the hazard function.

We assume that has distribution function and density function.

To estimate this distribution function, we use the empirical distribution as follows:

Such that is Histogram estimator of. Suppose that, , we can write:

Suppose that as, then we define:

so we can write as the following:

(9)

By substituting Equation (9) in Equation (8), we obtain the estimator
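The final plug-in step, a sub-density estimate divided by one minus an estimate of the distribution function of the observed times, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses the plain empirical distribution function evaluated at bin midpoints and unsmoothed binned sub-density values, which may differ from the modified estimator used in the paper. All function names are hypothetical.

```python
def empirical_cdf(observed_times, t):
    """Fraction of observed times that are <= t."""
    return sum(1 for x in observed_times if x <= t) / len(observed_times)

def plug_in_hazard(sub_density_value, cdf_value):
    """Ratio estimate: sub-density / (1 - distribution function)."""
    if cdf_value >= 1.0:
        raise ValueError("hazard is undefined where the distribution function is 1")
    return sub_density_value / (1.0 - cdf_value)

def hazard_on_bins(observations, tau, num_bins):
    """Raw hazard estimate at the midpoint of each of num_bins bins on [0, tau)."""
    n = len(observations)
    width = tau / num_bins
    times = [x for x, _ in observations]
    estimates = []
    for k in range(num_bins):
        lo = k * width
        hi = lo + width
        uncensored = sum(1 for x, d in observations if d == 1 and lo <= x < hi)
        raw = uncensored / (n * width)
        midpoint = lo + 0.5 * width
        estimates.append(plug_in_hazard(raw, empirical_cdf(times, midpoint)))
    return estimates
```

In practice the raw values would be smoothed before taking the ratio; the unsmoothed version is shown only because it can be checked by hand.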

Theorem: Suppose that the sub density is a continuous function on and it’s times differentiable, If and

then,

,

Proof:

By the Chung-Smirnov property and Taylor's theorem, we can write:

(10)

(11)

By using Equations (10) and (11), we can write:

which completes the proof.

3. Numerical Computation and Simulation

In this section, we simulate the proposed estimators on samples of a given size using the Symmlet wavelet. We assess the convergence rate of the estimator by computing its average mean squared error. We use the R software and a wavelet package for the simulations.
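The error measures can be approximated by Monte Carlo over repeated simulated samples. The sketch below assumes that the estimator and the true hazard are evaluated on a common equally spaced grid, takes AMSE to be the squared error averaged over grid points and replications, and takes MISE to be a Riemann-sum approximation of the integrated squared error averaged over replications. These definitions and the function names are assumptions made for illustration.

```python
def average_squared_error(estimate, truth):
    """Mean over grid points of the squared difference."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)

def integrated_squared_error(estimate, truth, step):
    """Riemann-sum approximation of the integral of the squared difference."""
    return step * sum((e - t) ** 2 for e, t in zip(estimate, truth))

def monte_carlo_amse(replicated_estimates, truth):
    """Average of average_squared_error over the replications."""
    errors = [average_squared_error(est, truth) for est in replicated_estimates]
    return sum(errors) / len(errors)

def monte_carlo_mise(replicated_estimates, truth, step):
    """Average of integrated_squared_error over the replications."""
    errors = [integrated_squared_error(est, truth, step)
              for est in replicated_estimates]
    return sum(errors) / len(errors)
```

Each element of `replicated_estimates` is the estimated hazard on the grid for one simulated sample; increasing the number of replications reduces the Monte Carlo noise in both measures.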

Example 1: We generate and from the samples of size and with K = 16, , and for optimal surface

The solid line in Figure 1 displays the wavelet estimate of the hazard function, with the dotted line representing the true hazard rate. The results in Table 1 display the average mean squared errors of the hazard function estimator for the two sample sizes considered.

Example 2: Suppose, where and. We generate from sample size of n = 400 and n = 600 with K = 16, K = 32, K = 64 and.

The solid line in Figure 2 displays the wavelet estimate of the hazard function, with the dotted line representing the true hazard rate.

The results in Table 2 display the average mean squared errors of the hazard function estimator for sample sizes n = 400 and n = 600.

Figure 1. The solid line in the panel displays the wavelet estimate of the hazard function, with the dotted line representing the true hazard rate.

Table 1. Average mean squared errors of the hazard function estimator.

Figure 2. The solid line in the panel displays the wavelet estimate of the hazard function, with the dotted line representing the true hazard rate.

Table 2. Average mean squared errors of the hazard function estimator.

Acknowledgements

The support of Research Committee of Persian Gulf University is greatly acknowledged.

REFERENCES

1. A. Haar, “Zur Theorie der Orthogonalen Funktionensysteme,” Mathematische Annalen, Vol. 69, No. 3, 1910, pp. 331-371. http://dx.doi.org/10.1007/BF01456326
2. P. Doukhan, “Mixing: Properties and Examples,” Springer-Verlag, New York, 1995.
3. A. Antoniadis, “Smoothing Noisy Data with Tapered Coiflet Series,” Scandinavian Journal of Statistics, Vol. 23, 1996, pp. 313-330.
4. S. G. Mallat, “A Theory for Multiresolution Signal Decomposition: The Wavelet Representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, No. 7, 1989, pp. 674-693. http://dx.doi.org/10.1109/34.192463
5. Y. Meyer, “Ondelettes et Opérateurs,” Hermann, Paris, 1990.
6. I. Daubechies, “Ten Lectures on Wavelets,” SIAM, Philadelphia, 1992.
7. D. L. Donoho and I. M. Johnstone, “Ideal Spatial Adaptation by Wavelet Shrinkage,” Biometrika, Vol. 81, No. 3, 1994, pp. 425-455. http://dx.doi.org/10.1093/biomet/81.3.425
8. G. Kerkyacharian and D. Picard, “Density Estimation by Kernel and Wavelets Methods: Optimality of Besov Spaces,” Statistics & Probability Letters, Vol. 18, No. 4, 1993, pp. 327-336. http://dx.doi.org/10.1016/0167-7152(93)90024-D
9. P. Hall and P. Patil, “Formula for Mean Integrated Squared Error of Non-Linear Wavelet Based Density Estimators,” Annals of Statistics, Vol. 23, No. 3, 1995, pp. 905-928. http://dx.doi.org/10.1214/aos/1176324628
10. A. Antoniadis, G. Gregoire and G. P. Nason, “Density and Hazard Rate Estimation for Right Censored Data Using Wavelet Methods,” Journal of the Royal Statistical Society, Series B, Vol. 61, No. 1, 1999, pp. 63-84. http://dx.doi.org/10.1111/1467-9868.00163
11. I. Daubechies, “Orthogonal Bases of Compactly Supported Wavelets,” Communications on Pure and Applied Mathematics, Vol. 41, No. 7, 1988, pp. 909-996. http://dx.doi.org/10.1002/cpa.3160410705
12. M. Afshari, “A Fast Wavelet Algorithm for Analyzing of Signal Processing and Empirical Distribution of Wavelet Coefficients with Numerical Example and Simulation,” Communications in Statistics-Theory and Methods, Vol. 42, No. 22, 2013, pp. 4156-4169. http://dx.doi.org/10.1080/03610926.2011.642917
13. M. Afshari, “Wavelet Density Estimation of Censoring Data and Evaluate of Mean Integral Square Error with Convergence Ratio and Empirical Distribution of Given Estimator,” 2013, in press.
14. H. Doosti, M. Afshari and H. A. Niroomand, “Wavelets for Nonparametric Stochastic Regression with Mixing Stochastic Process,” Communications in Statistics-Theory and Methods, Vol. 37, No. 3, 2008, pp. 373-385. http://dx.doi.org/10.1080/03610920701653003

NOTES

*Corresponding author.