Open Journal of Statistics
Vol.08 No.04(2018), Article ID:86874,16 pages
10.4236/ojs.2018.84048
Asymptotic Results for Goodness-of-Fit Tests Using a Class of Generalized Spacing Methods with Estimated Parameters
Andrew Luong
École d’actuariat, Université Laval, Ste Foy, Québec, Canada

Copyright © 2018 by author and Scientific Research Publishing Inc.
This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).
http://creativecommons.org/licenses/by/4.0/



Received: June 8, 2018; Accepted: August 24, 2018; Published: August 27, 2018
ABSTRACT
A class of pseudo distances is used to derive test statistics, based on transformed data or spacings, for testing goodness-of-fit for parametric models. These statistics can be considered density based statistics and are expressible as simple functions of spacings. It is known that when the null hypothesis is simple, the statistics follow asymptotic normal distributions without unknown parameters. In this paper we emphasize results for the composite null hypothesis: the parameters are first estimated by a generalized spacing (GSP) method, which is equivalent to minimizing a pseudo distance from the class considered; the estimated parameters then replace the parameters in the pseudo distance used for estimation; goodness-of-fit statistics for the composite hypothesis can then be constructed and shown again to have an asymptotic normal distribution without unknown parameters. Since these statistics are related to a discrepancy measure, the tests can be shown to be consistent in general. Furthermore, because these statistics are simple and come at no extra cost after fitting the model, they can be considered alternatives to chi-square statistics, which require a choice of intervals, and to statistics based on the empirical distribution function (EDF) using the original data, whose complicated null distribution might depend on the parametric family being considered and on the vector of true parameters. EDF tests might, however, be more powerful against some specific models specified by the alternative hypothesis.
Keywords:
Density Based Tests, EDF Tests, Anderson-Darling Statistic, Hellinger Distance Statistic, Pseudo-Distance, Maximum Spacing Method

1. Introduction
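Let $X_1, \dots, X_n$ be a sample of size $n$ from a continuous distribution $F_\theta$, let $X_{(1)} < \dots < X_{(n)}$ be the order statistics, and let the transformed data be defined as $U_{(i)}(\theta) = F_\theta(X_{(i)})$, $i = 1, \dots, n$, and define $U_{(0)}(\theta) = 0$ and $U_{(n+1)}(\theta) = 1$. The spacings are given by $D_i(\theta) = U_{(i)}(\theta) - U_{(i-1)}(\theta)$, $i = 1, \dots, n+1$.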
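The construction of spacings described above can be sketched in a few lines. This is a minimal illustration, not the author's code: the function name `spacings`, the exponential model `exp_cdf`, and the small data set are hypothetical choices made only for the example.

```python
import math

def spacings(sample, cdf):
    """Return the n+1 spacings of the transformed, ordered sample."""
    # Transform each observation with the model distribution function
    # and sort, which gives the transformed order statistics.
    u = sorted(cdf(x) for x in sample)
    # Boundary conventions: the transformed values are padded with 0 and 1.
    u = [0.0] + u + [1.0]
    return [u[i] - u[i - 1] for i in range(1, len(u))]

# Hypothetical model for illustration: exponential distribution with rate 1.
def exp_cdf(x, rate=1.0):
    return 1.0 - math.exp(-rate * x)

data = [0.3, 1.2, 0.7, 2.5]      # made-up observations
d = spacings(data, exp_cdf)      # n + 1 = 5 spacings that sum to 1
```

A sample of size n always yields n + 1 nonnegative spacings that sum to one, which is a convenient sanity check when the model distribution function is coded by hand.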
Ghosh and Jammalamadaka [1] and Luong [2] have studied generalized spacing methods (GSP) of estimation with the vector of GSP estimators given by the vector which minimizes
with (1)
Using this class of , it is shown that the asymptotic covariance matrix of is given by
with
and depends on but does not depend on the parametric family and is the usual information matrix of maximum likelihood (ML) estimation.
Furthermore, by letting , . This result is interesting, as it means if we set we then have and therefore, the loss of efficiency compared with ML estimation or the maximum spacing (MSP) method is around two percent no matter which parametric model is used. Luong [2] has also shown that this loss of efficiency is compensated by a gain in robustness, so GSP estimation might be preferred if ML and MSP estimation are not robust; see Remark 2 of Luong [2] (p 632). Furthermore, tied observations imply that some spacings are equal to 0, and the logarithm of these spacings is undefined, so we might want to use GSP methods instead of the MSP method; see Section 5 for tied observations. The MSP method is also called the maximum product of spacings method; see Cheng and Stephens [3] .
In this paper, we focus on using this class of GSP methods for construction of goodness-of-fit tests statistics for testing the simple null hypothesis:
H0: data come from a distribution ; is specified and for testing the composite null hypothesis.
H0: data come from the parametric family ; is unspecified. For the composite H0, Cheng and Stephens [3] have shown that Moran’s statistic with parameters estimated by the MSP method has an asymptotic normal distribution which does not depend on the parametric family, and we shall show that similar properties hold for the class of test statistics constructed using the class of GSP methods considered in this paper. In a previous paper, we considered estimation and parameter hypothesis testing using this class of GSP methods. In this paper, we focus on model validation using this class of GSP methods, since testing goodness-of-fit for the composite H0 using GSP methods has not received much attention in the literature.
We adopt an approach using pseudo distances by showing the class of induces a class of pseudo distances which we shall denote by , the function , and are densities and is a measure to quantify how close these densities are. Implicitly, for methods using spacings we work with transform data and if is the true parameter then the transform data
will follow a standard uniform distribution with density and , elsewhere.
Using the transformed data we can obtain an easily constructed elementary density estimate without requiring a kernel of the usual density estimate, this empirical density estimate is denoted by , see expression (6) and for testing the simple null hypothesis, a test statistic can be constructed which is based on
(2)
with the restriction on and can be reexpressed equivalently as a simple function of spacings and numerically simple to compute; the statistic will follow an asymptotic normal distribution which does not depend on the parametric family. For the statistic to have good power for large samples, it appears that we should choose the scaling factor so that an asymptotic distribution exists for the statistic given by expression (2) and at the same time so that as and if can be used to discriminate whether the sample is drawn from an assumed distribution, the test will be consistent and it is an advantage over chi-square tests which do not have the consistency property, in general.
For the composite hypothesis, we use a GSP method to obtain the GSP estimators given by the vector first but we shall see that minimizing expression (1) is equivalent to minimizing the following pseudo distance based on a function , the expression up to a positive multiplicative constant is given by , is defined by expression (11) in Section (4).
Subsequently the statistic is based on
(3)
and after simplifications, it reduces to a simple function of spacings with estimated parameters. It will be shown again that the statistic equivalent to the one given by expression (3) follows an asymptotic normal distribution without unknown parameters; this property facilitates goodness-of-fit testing. Using this unified presentation, we would like to show that these statistics are density based and that they parallel traditional test statistics based on the empirical distribution function (EDF) such as the Anderson-Darling statistic, see Anderson and Darling [4] , Boos [5] , Stephens [6] , or chi-square goodness-of-fit statistics with parameters estimated by minimum chi-square methods as discussed by Greenwood and Nikulin [7] (p 70-159).
The approach used in this paper will hopefully unify estimation and model testing and facilitate the comparison of these density based statistics with traditional EDF statistics and chi-square statistics, which are used more often. We note that these statistics can be computed easily and that their asymptotic null distribution is normal without unknown parameters, which makes them easy to use. Compared with the related chi-square statistics, these statistics do not need a choice of intervals, and they come as by-products when fitting models using the corresponding GSP methods. This feature is not shared by maximum likelihood (ML) methods.
We also note that power analysis using theoretical works might not give a complete picture for these density based statistics as the analysis is often based on only one sequence of functions which belongs to the alternative hypothesis converging to the functions specified by the null distribution and there are so many sequences that can approach the functions of the hypothesis in a functional space; see Sethuraman and Rao [8] for Pitman efficiency analysis for these statistics for the simple null hypothesis.
In this paper, we shall concentrate on the asymptotic distributions of goodness-of-fit test statistics based on GSP methods, emphasizing a class of GSP methods; this completes the results on estimation and parameter testing given in a previous paper. Implicitly, GSP methods in this paper mean GSP methods restricted to the class considered in this paper. Furthermore, we do not touch upon the question of power analysis, which might need extensive simulation studies with many models chosen for the alternative hypothesis, as we do not have enough computing facilities and resources for such large scale simulation studies; see Cheng and Stephens [3] (p 386) on the power of Moran’s statistic with the MSP method, and Zhang [9] for simulation studies assessing the power of some EDF tests.
The paper is organized as follows.
In Section 2, a class of pseudo distances which generate the related GSP methods for estimation and model testing is introduced; the inference methods are based on spacings or equivalently on transformed data. The elementary density estimate introduced by Kale [10] is presented in Section 3, and a pseudo distance between the elementary density estimate and the standard uniform density is used to construct a goodness-of-fit statistic for testing the simple null hypothesis; the statistic is shown to be expressible as a simple function of spacings which follows an asymptotic normal distribution without unknown parameters under the simple null hypothesis. In Section 4, for testing the composite hypothesis, we choose a GSP method within the class considered by minimizing a corresponding pseudo distance, which implicitly defines a GSP method for estimation, and then use the obtained estimators to replace the unknown parameters in the pseudo distance to construct a goodness-of-fit statistic; after simplification the statistic is expressible as a simple function of spacings with an asymptotic normal distribution which does not depend on the parameters, as in the simple null hypothesis case. In Section 5, tied observations are discussed; it might be more practical to use a GSP method instead of the MSP method, as tied observations do not cause numerical difficulties for GSP methods but extra care is needed if the MSP method is used, see Cheng and Stephens [3] (p 391) for the treatment of tied observations with the MSP method. Section 6 gives some discussion of power analysis using theoretical works, highlighting that theoretical power analysis might not give a complete picture of the power of the statistics because functions are involved under the null and alternative hypotheses, in contrast to the classical setup which only involves scalars.
2. Discrepancy Measures or Pseudo Distances
We shall see that pseudo-distances can be created using a convex function with and being respectively its first and second derivatives with . We focus on pseudo distances defined by using as and let . The GSP estimators given by the vector can be seen are based on this class as they are obtained by minimizing the following objective function with respect to and by choosing a value for ,
,
i.e., specifying .
We shall see that using this class of using spacings is equivalent to use a class of pseudo distances for densities defined using . It has been shown in our previous paper that GSP methods can attain high efficiency for estimation using values for being positive and near 0.
Note that by letting we obtain full efficiency and with , the asymptotic relative efficiency is around 0.98 for all parametric families compared with fully efficient methods such as the MSP method, the ML method, or the Hellinger method based on a density estimate using the original data introduced by Beran [11] . The elementary density estimate, which makes use of spacings, is based on transformed data and is easily obtainable without requiring a kernel. The elementary density estimate is due to Kale [10] . We shall introduce it after the definition of pseudo distance and give an interpretation of GSP methods as minimum distance methods based on pseudo distances which are density based measures of discrepancy. Presented from this point of view, the approach parallels the Hellinger methods introduced by Beran [11] , which use the Hellinger distance and the original data. It might be more complicated for practitioners to implement Beran’s minimum Hellinger distance methods, which require a kernel density estimate with a choice of window, than to implement these GSP methods.
This will also make the GSP methods parallel to EDF methods such as the Cramér-von Mises methods or weighted Cramér-von Mises distance methods such as the Anderson-Darling distance methods, which also make use of the original data. For the Anderson-Darling (AD) distance, see Anderson and Darling [4] . The Anderson-Darling distance is also a pseudo distance: it is always nonnegative, it measures the discrepancy between two distribution functions, and it need not obey the triangle inequality. Minimizing the discrepancy between the usual empirical distribution and the distribution of the parametric family gives the minimum Anderson-Darling (MAD) estimators, see Boos [5] .
In general, the MAD estimators are robust and have high efficiencies, but for some parametric families the overall relative efficiency compared with maximum likelihood (ML) estimators can fall below 0.80, see Boos [5] (p 2754). Once the MAD estimators given by the vector $\hat{\theta}$ are obtained, the Anderson-Darling distance can be used to form the AD statistic which is given by
$$A_n^2 = n \int_{-\infty}^{\infty} \frac{[F_n(x) - F_{\hat{\theta}}(x)]^2}{F_{\hat{\theta}}(x)\,[1 - F_{\hat{\theta}}(x)]}\, dF_{\hat{\theta}}(x) \qquad (4)$$
to test the validity of the model specified by the composite H0, i.e., the data are drawn from a distribution F which belongs to the parametric family, and $F_n$ is the usual empirical distribution function using the original data. Expression (4) can also be re-expressed in a form more suitable for calculations, see Boos [5] (p 2748). It has been shown that statistics based on the empirical distribution function (EDF), such as the AD statistic defined by expression (4), do not have a unique asymptotic null distribution, as it depends on the parametric family and possibly also on the vector of true parameters, see Boos [5] (p 2759-2766), Pollard [12] (p 61). Even in the case where the null hypothesis is simple, the null distribution is still quite complicated, and often extensive simulations are needed to calculate the p-value of such tests, or extensive tables are needed to use these EDF tests. We shall see that this is not the case for the GSP methods based on the class of functions defined earlier. We focus on this class of functions as it can give high efficiency for estimation, and the pseudo distances used for estimation can also be used to construct goodness-of-fit statistics. Unlike the EDF test statistics, statistics using these pseudo distances have an asymptotic normal null distribution regardless of the value of the vector of true parameters and regardless of the parametric model being tested. The goodness-of-fit statistics are easily obtainable as they are based on the same pseudo distances used for estimation, and the statistics can be expressed in an equivalent form as simple functions of the spacings. In this paper, we relate estimation and goodness-of-fit by considering them as inference methods based on pseudo distances; the approach might provide more insight into the methods using spacings which have appeared in the literature, as methods for estimation and testing using spacings are usually presented separately.
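For comparison with the spacing-based statistics, the AD statistic can be evaluated with the classical computational form $-n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)[\ln U_{(i)} + \ln(1 - U_{(n+1-i)})]$, where the $U_{(i)}$ are the ordered values of the fitted or hypothesized distribution function at the observations. The sketch below assumes this standard form; the function name `anderson_darling` is hypothetical, and the code is not claimed to reproduce the expression used by Boos [5].

```python
import math

def anderson_darling(sample, cdf):
    """Classical computational form of the Anderson-Darling statistic."""
    u = sorted(cdf(x) for x in sample)   # ordered transformed data
    n = len(u)
    total = 0.0
    for i in range(1, n + 1):
        # u[i - 1] is U_(i); u[n - i] is U_(n+1-i) in 1-based notation.
        total += (2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
    return -n - total / n

# Single observation at 0.5 tested against the standard uniform distribution.
a2 = anderson_darling([0.5], lambda x: x)
```

For this one-point example the defining integral can be evaluated by hand and equals $2\ln 2 - 1$, which the computational form reproduces.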
Before introducing these goodness-of-fit statistics, first we shall define a ф-discrepancy measure which induces a ф-pseudo-distance. The definitions have been given by Ali and Silvey [13] , Pardo [14] (p 5-7) and reproduced below.
Definition (ф -pseudo-distance)
The ф-pseudo-distance or ф-divergence measure between two densities
and is defined by , is the expectation using
, is a convex function with defined for and the second derivative exists and .
We have , if and only if except on a set of measure 0. The discrepancy measure needs not be symmetric as and it does not need to obey the triangle inequality and unless otherwise stated, we focus on the class of and let .
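The definition can be illustrated numerically. The sketch below assumes the common convention $D_\phi(f, g) = E_g[\phi(f/g)] = \int \phi(f(x)/g(x))\, g(x)\, dx$ and approximates the integral with a midpoint rule on (0, 1); the function `phi_divergence`, the density $2x$, and the convex function $\phi(t) = t \ln t$ are illustrative choices, not taken from the paper.

```python
import math

def phi_divergence(phi, f, g, lo=0.0, hi=1.0, m=20000):
    """Midpoint-rule approximation of E_g[phi(f/g)]."""
    step = (hi - lo) / m
    total = 0.0
    for k in range(m):
        x = lo + (k + 0.5) * step
        total += phi(f(x) / g(x)) * g(x)
    return total * step

def uniform(x):
    return 1.0           # standard uniform density on (0, 1)

def tilted(x):
    return 2.0 * x       # an alternative density on (0, 1)

def kl_phi(t):
    return t * math.log(t)   # convex, with phi(1) = 0

d_fg = phi_divergence(kl_phi, tilted, uniform)   # about ln 2 - 1/2
d_gf = phi_divergence(kl_phi, uniform, tilted)   # about 1 - ln 2
```

The two values differ, which illustrates that the discrepancy measure need not be symmetric; both are positive because the two densities are not equal.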
Using the above function, the pseudo distance can be expressed as
and we shall use these pseudo distances to construct goodness-of-fit test statistics using transformed data or equivalently spacings, and relate them to results on spacings which have already appeared in the literature. The advantage of this approach is that a unified treatment can be given to estimation and model testing, and it can reveal which tests based on spacings statistics might not be powerful for large samples when used for testing goodness-of-fit.
Note that Hellinger distance (HD) which is a true distance as used by Beran [11] can be expressed in a similar form with
.
In the next section, we shall present an elementary density estimate using transformed data and we aim to test the following simple H0 which specifies that the random sample of observation is drawn from a distribution function , is specified and has a closed form expression.
We assume that we have a random sample of size $n$ which consists of $X_1, \dots, X_n$, and these observations are independent and identically distributed (iid) as X, which follows a distribution $F_\theta$; $\{F_\theta\}$ is the parametric model used, and the order statistics are denoted by $X_{(1)} < \dots < X_{(n)}$. The vector of parameters is denoted by $\theta$, and $\theta_0$ is the true vector of parameters.
If we want to test the simple null hypothesis which specifies that the data come from $F_0$, let $U_i = F_0(X_i)$, $i = 1, \dots, n$, be the transformed data, let the order statistics based on the transformed data be $U_{(1)} < \dots < U_{(n)}$, and let the spacings be defined as $D_i = U_{(i)} - U_{(i-1)}$, $i = 1, \dots, n+1$, with $U_{(0)} = 0$ and $U_{(n+1)} = 1$. It is clear that the transformed data will follow a uniform distribution under the null hypothesis. Now using the transformed data, and instead of constructing the usual empirical distribution function which is a step function, we use line segments to join the points where there are jumps so that it becomes a piecewise linear function, i.e., define the following smoothed empirical distribution as given by Kale [10] (p 44),
(5)
The density function of
is
and
, elsewhere. (6)
The procedure to smooth the empirical distribution using transformed data is similar to the procedure of constructing an ogive function when data have been grouped into intervals and we need to smooth the empirical distribution function, see Klugman et al. [15] (p 212) for the ogive function.
The smoothed empirical distribution function admits the following elementary density estimate as density,
(7)
and it can be obtained easily without requiring a kernel and specifying a window.
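A minimal sketch of such an elementary density estimate follows. It assumes that the smoothed empirical distribution rises linearly by $1/(n+1)$ over each spacing, so that the density on the i-th interval has height $1/((n+1)D_i)$; this reading of the construction and the function name `elementary_density` are assumptions, and distinct transformed values are required so that no spacing is zero.

```python
def elementary_density(u_sorted):
    """Return breakpoints and heights of the piecewise-constant density."""
    n = len(u_sorted)
    pts = [0.0] + list(u_sorted) + [1.0]
    heights = []
    for i in range(1, n + 2):
        d = pts[i] - pts[i - 1]                  # i-th spacing
        heights.append(1.0 / ((n + 1) * d))      # slope of the smoothed EDF
    return pts, heights

pts, heights = elementary_density([0.1, 0.4, 0.8])
# Total mass: each interval carries probability 1 / (n + 1).
mass = sum(h * (pts[i + 1] - pts[i]) for i, h in enumerate(heights))
```

No kernel or window has to be chosen: the estimate is fully determined by the spacings, and it integrates to one by construction.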
3. Density Based Statistics for Simple Null Hypothesis
It is not difficult to see that under the simple null hypothesis the transformed data follow the uniform distribution with density function given by
and
elsewhere and an appropriate goodness-of-fit statistic can be based on

since
and
elsewhere.
Therefore, if we can find a real number
so that
(8)
has an asymptotic distribution which no longer depends on the functional form of
, the statistic for testing goodness-of-fit can be based on the statistic
and the test will have power since with
, this will imply
and with 

In fact, we do not need to require

1) If the sample is drawn from a distribution F and 




2) If the sample is drawn from a distribution G and 

we shall use the notation 
Then, we should have 


Furthermore, if we can simplify the expression of V so that we can have an equivalent statistic which serves the same purpose and it is simpler to compute then it is interesting to use its equivalent form. It turns out that this is the case as the statistic can be expressed as a simple function of spacings. However, by relating to the discrepancy measure, the test based on such a statistic can be seen to be consistent. This statistic parallels the one proposed by Beran [11] (p 458) which uses the Hellinger distance with the original data and a kernel density estimate. It is simpler to obtain this statistic than the one given by Beran.
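The reduction to a function of spacings can be checked numerically. The sketch below assumes the elementary density has height $1/((n+1)D_i)$ on the i-th interval and takes the expectation under that density, in which case $\int_0^1 \phi(1/f_n(u))\, f_n(u)\, du = \frac{1}{n+1}\sum_{i=1}^{n+1}\phi((n+1)D_i)$. The orientation of the divergence, the choice $\phi(t) = -\ln t$, and the function names are assumptions made for illustration, not the paper's own expressions.

```python
import bisect
import math

def spacing_sum(u_sorted, phi):
    """Closed form: average of phi((n+1) * D_i) over the n+1 spacings."""
    n = len(u_sorted)
    pts = [0.0] + list(u_sorted) + [1.0]
    return sum(phi((n + 1) * (pts[i] - pts[i - 1]))
               for i in range(1, n + 2)) / (n + 1)

def divergence_by_quadrature(u_sorted, phi, m=1000):
    """Midpoint rule for the integral of phi(1/f_n) * f_n over (0, 1)."""
    n = len(u_sorted)
    pts = [0.0] + list(u_sorted) + [1.0]
    total = 0.0
    for k in range(m):
        x = (k + 0.5) / m
        i = bisect.bisect_left(pts, x)           # x lies in (pts[i-1], pts[i])
        fn = 1.0 / ((n + 1) * (pts[i] - pts[i - 1]))
        total += phi(1.0 / fn) * fn
    return total / m

def neg_log(t):
    return -math.log(t)

u = [0.1, 0.4, 0.8]
closed_form = spacing_sum(u, neg_log)
by_quadrature = divergence_by_quadrature(u, neg_log)
```

Both routes give the same number for this example, so in practice only the closed-form sum over spacings needs to be computed.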
Now we shall examine the component 

and it can be re-expressed as

see Kirmani and Alam [16] for goodness-of-fit tests using statistics of the form
Using results as given in section 2 by Luong [2] , we can conclude that

Theorem can be applied to the expression. By letting the mean and variance of 


we have 


so that we have an asymptotic normal distribution for the test statistic P defined below and if we need to emphasize the dependence on



Therefore, if we look for the scaling factor k using expression (9) we should consider


and with



The asymptotic distribution of the statistic
which can also be represented as 

follows a Normal distribution with mean 



and hence, 

with 


or Hellinger distance with 
given by expression (11) might give the test low power for large samples when the null hypothesis is composite; see Kirmani and Alam [16] and Kirmani [17] for the statistic using

efficient for estimation. Testing for the null hypothesis which is composite will be considered subsequently.
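To illustrate the asymptotic normality under the simple null hypothesis, the sketch below standardizes the spacing statistic for the illustrative choice $h(x) = -x^{\alpha}$ with $\alpha = 0.5$. The centering and scaling are assumptions taken from the classical limit theory for sums of functions of uniform spacings, with mean $E[h(W)]$ and variance $\mathrm{Var}(h(W)) - \mathrm{Cov}(h(W), W)^2$ for a standard exponential $W$; they are not copied from the expressions of this section, and the function name `gsp_z_statistic` is hypothetical.

```python
import math
import random

def gsp_z_statistic(u_sample, alpha=0.5):
    """Standardized spacing statistic for h(x) = -x**alpha (illustrative)."""
    pts = [0.0] + sorted(u_sample) + [1.0]
    big_n = len(pts) - 1                       # number of spacings, n + 1
    mean_h = -sum((big_n * (pts[i] - pts[i - 1])) ** alpha
                  for i in range(1, big_n + 1)) / big_n
    # Moments for W ~ Exp(1): E[W**a] = Gamma(1 + a), Cov(W**a, W) = a * Gamma(1 + a).
    mu = -math.gamma(1 + alpha)
    var = math.gamma(1 + 2 * alpha) - (1 + alpha ** 2) * math.gamma(1 + alpha) ** 2
    return math.sqrt(big_n) * (mean_h - mu) / math.sqrt(var)

# Monte Carlo check under the simple null: transformed data are uniform.
rng = random.Random(12345)
z_values = [gsp_z_statistic([rng.random() for _ in range(200)])
            for _ in range(500)]
z_mean = sum(z_values) / len(z_values)
z_var = sum((z - z_mean) ** 2 for z in z_values) / (len(z_values) - 1)
```

Under these assumptions the simulated values should have mean near 0 and variance near 1. Because the function is convex, departures from uniformity push the statistic upward, so with this sign convention large positive values would count as evidence against the null hypothesis.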
4. Density Based Statistics for Null Composite Hypothesis
For testing the null composite hypothesis which specifies that data come from the parametric family 



Parallel to the simple null hypothesis case, we transform the data and let 



with 


Since the transformed data 


using the notations

We estimate first 




The estimators given by the vector 


The goodness-of-fit test statistic can be based on 


Arguing as in the case of the simple null hypothesis, we are led to consider the equivalent statistic

We shall show subsequently that we have the equality in distribution





The statistic is similar to the one used for the simple null hypothesis case. All we need is to replace 

Observe that we can expand the expression

around 








with



by Luong [2] (p 629-630) and 




which justifies the use of expression (15).
The same type of property has been shown to hold for the asymptotic distribution of Moran’s statistic with maximum spacing estimators for testing goodness-of-fit for parametric models, see Cheng and Stephens [3] (p 390).
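The two-step procedure for the composite hypothesis, estimation by minimizing a spacing objective followed by evaluation of the same objective at the estimate, can be sketched for a one-parameter model. Everything specific here is an assumption for illustration: the exponential model, the choice $h(x) = -x^{0.5}$, the golden-section search, and the function names. The centering and scaling needed to turn the minimized value into a test statistic are not reproduced.

```python
import math
import random

def gsp_objective(rate, data, alpha=0.5):
    """Average of h((n+1) * D_i(rate)) with h(x) = -x**alpha."""
    u = sorted(1.0 - math.exp(-rate * x) for x in data)
    pts = [0.0] + u + [1.0]
    big_n = len(pts) - 1
    return -sum((big_n * (pts[i] - pts[i - 1])) ** alpha
                for i in range(1, big_n + 1)) / big_n

def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function of one variable on [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:                    # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = f(c)
        else:                          # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = f(d)
    return (a + b) / 2

rng = random.Random(2018)
data = [rng.expovariate(2.0) for _ in range(500)]    # simulated, true rate 2
rate_hat = golden_section_min(lambda r: gsp_objective(r, data), 0.1, 10.0)
fitted_value = gsp_objective(rate_hat, data)          # by-product of the fit
```

The value `fitted_value` is obtained at no extra cost once the minimization has been carried out, which is the practical point made in the text.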
The GSP methods with 



5. Tied Observations
In this section, we point out that for a data set which is not large and contains many tied observations, a GSP method might be preferred to the MSP method, as the MSP method is based on minimizing 

Cheng and Stephens [3] (p 391) also proposed methods to handle tied observations when the MSP method is used. Tied measurements, however, do not cause numerical difficulties for the GSP method as discussed; there is little loss of efficiency in using a GSP method, and a GSP method might be more robust than the MSP method, see Remark 2 of Luong [2] .
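The numerical difference between the two criteria in the presence of ties can be seen in a small example. The sketch assumes the MSP criterion is the negative sum of log spacings and uses $h(x) = -x^{0.5}$ as an illustrative member of the GSP class; the spacings listed are made up, with one zero spacing standing for a pair of tied observations.

```python
import math

def msp_objective(spacings):
    """Negative sum of log spacings; a zero spacing gives +infinity."""
    return -sum(math.log(d) if d > 0 else float("-inf") for d in spacings)

def gsp_objective(spacings, alpha=0.5):
    """Sum of h(N * D_i) with h(x) = -x**alpha; finite for zero spacings."""
    big_n = len(spacings)
    return -sum((big_n * d) ** alpha for d in spacings)

tied = [0.2, 0.0, 0.3, 0.5]          # the zero spacing comes from a tie
msp_value = msp_objective(tied)      # infinite, unusable for optimization
gsp_value = gsp_objective(tied)      # finite
```

With the logarithmic criterion a single tie makes the objective infinite for every parameter value, whereas the power-type criterion stays finite and can be minimized without special handling.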
6. Discussions
In this section, we touch upon the question of power analysis for these density based tests. Power analysis for null hypothesis which specifies functions is more complicated than Pitman efficiency analysis when parameters are scalars, see Lehmann [19] (p 158-187) for the classical set up with scalars as parameters instead of functions.
Here, under the null hypothesis a function or functions are specified, this makes the study of power more complicated even for the simplest case when the null hypothesis H0 is simple which specifies the data comes from F0 or equivalently the transformed data comes from a standard uniform density with density function 


For power study, often a sequence of tests based on a sequence of functions which belongs to the alternative hypothesis 
Rao [8] , used the following sequence of functions 



For theoretical works and Pitman efficiencies, the focus is on best tests based on a chosen sequence of functions, but this might not provide a complete answer for applications, as an optimum statistic might no longer be optimum if another sequence of functions is chosen. In applications, the useful and commonly used distributions belonging to the alternative hypothesis might not have been included in the theoretical analysis. This makes the assessment of power by theoretical analysis difficult, especially when parameters are functions instead of scalars; see Lehmann [19] for the classical setup of Pitman efficiency analysis with scalar parameters belonging to the real line. The functional space is more complicated than the real line.
Cheng and Stephens [3] (p 386) recognized this problem and pointed out that power depends on the alternative hypothesis. To get some idea of the power of these tests, large scale simulations often seem to be needed, and many parametric families should be considered under the alternative hypothesis; these are the techniques that Zhang [9] used to conduct power studies for some EDF tests. We do not have the resources for such large scale simulation studies. These tests have not been used extensively; if in the future they are used more frequently and concomitantly with GSP methods for estimation in applications, we will have a better idea of their power.
7. Conclusion
In a previous paper, we studied estimation, asymptotic properties, robustness and parameter hypothesis testing using GSP methods. In this paper we have adopted the view that GSP methods are minimum density based distance methods using transformed data or equivalently spacings, so that estimation and model testing can be treated in a unified way. Model validation via goodness-of-fit tests and the construction of density based tests are treated in this paper. We have shown that these test statistics come at no extra cost once a GSP method is used for fitting a parametric model and might be useful for assessing the model in practice. These tests are simple to perform, and practitioners might want to use them concomitantly with GSP estimation, especially when sample sizes are relatively large. For some real life data sets, GSP methods might be preferred over the MSP method for estimation and chosen for their robustness, efficiency and flexibility in handling tied observations; finally, test statistics for goodness-of-fit can be constructed at no extra cost. The last feature is not shared by the maximum likelihood (ML) method.
Acknowledgements
The helpful and constructive comments of a referee, which led to an improvement of the presentation of the paper, and the support of the editorial staff of Open Journal of Statistics in processing the paper are gratefully acknowledged.
Conflicts of Interest
The author declares no conflicts of interest regarding the publication of this paper.
Cite this paper
Luong, A. (2018) Asymptotic Results for Goodness-of-Fit Tests Using a Class of Generalized Spacing Methods with Estimated Parameters. Open Journal of Statistics, 8, 731-746. https://doi.org/10.4236/ojs.2018.84048
References
- 1. Ghosh, K. and Jammalamadaka, S.R. (2001) A General Estimation Method Using Spacings. Journal of Statistical Planning and Inference, 93, 71-82. https://doi.org/10.1016/S0378-3758(00)00160-9
- 2. Luong, A. (2018) Unified Asymptotic Results for Maximum Spacing and Generalized Spacing Method for Continuous Model. Open Journal of Statistics, 8, 614-639. https://doi.org/10.4236/ojs.2018.83040
- 3. Cheng, R.C.H. and Stephens, M.A. (1989) A Goodness-of-Fit Test Using Moran’s Statistic with Estimated Parameters. Biometrika, 76, 385-392. https://doi.org/10.1093/biomet/76.2.385
- 4. Anderson, T.W. and Darling, D.A. (1954) A Test of Goodness of Fit. Journal of the American Statistical Association, 49, 765-769. https://doi.org/10.1080/01621459.1954.10501232
- 5. Boos, D. (1982) Minimum Anderson Darling Estimation. Communications in Statistics, Theory and Methods, 11, 2747-2774. https://doi.org/10.1080/03610928208828420
- 6. Stephens, M.A. (1986) Tests Based on EDF Statistics. In: D’Agostino, R.B. and Stephens, M.A., Eds., Goodness-of-Fit Techniques, Marcel Dekker, New York.
- 7. Greenwood, P.E. and Nikulin, M.S. (1996) A Guide to Chi-Squared Testing. Wiley, New York.
- 8. Sethuraman, J. and Rao, J.S. (1970) Pitman Efficiencies of Tests Based on Spacings. In: Puri, M.L., Ed., Nonparametric Techniques in Statistical Inference, Cambridge University Press, Cambridge.
- 9. Zhang, J. (2002) Powerful Goodness-of-Fit Tests Based on the Likelihood Ratio. Journal of the Royal Statistical Society, Series B, 64, 281-294. https://doi.org/10.1111/1467-9868.00337
- 10. Kale, B.K. (1969) Unified Derivation of Tests of Goodness of Fit Based on Spacings. Sankhya, Series A, 31, 43-48.
- 11. Beran, R. (1977) Minimum Hellinger Distance Estimates for Parametric Models. Annals of Statistics, 5, 445-463. https://doi.org/10.1214/aos/1176343842
- 12. Pollard, D. (1980) The Minimum Distance Method of Testing. Metrika, 27, 43-70. https://doi.org/10.1007/BF01893576
- 13. Ali, S.M. and Silvey, S.D. (1966) A Generalized Class of Coefficients of Divergence of One Distribution from Another. Journal of the Royal Statistical Society, 28, 813-828.
- 14. Pardo, L. (2006) Statistical Inference Based on Divergence Measures. Chapman and Hall, Boca Raton.
- 15. Klugman, S.A., Panjer, H.H. and Willmot, G.E. (2012) Loss Models: From Data to Decisions. 4th Edition, Wiley, New York.
- 16. Kirmani, S.N.U.A. and Alam, S.N. (1974) On Goodness of Fit Tests Based on Spacings. Sankhya, Series A, 36, 197-203.
- 17. Kirmani, S.N.U.A. (1973) On a Goodness of Fit Test Based on Matusita’s Distance. Annals of the Institute of Statistical Mathematics, 24, 493-500. https://doi.org/10.1007/BF02479394
- 18. Carrasco, M. and Florens, J.-P. (2000) Generalization of GMM to a Continuum of Moment Conditions. Econometric Theory, 16, 797-834. https://doi.org/10.1017/S0266466600166010
- 19. Lehmann, E.L. (1999) Elements of Large Sample Theory. Springer, New York. https://doi.org/10.1007/b98855





