Applied Mathematics
Vol.4 No.8(2013), Article ID:35095,6 pages DOI:10.4236/am.2013.48150
On Some Procedures Based on Fisher’s Inverse Chi-Square Statistic
Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University, Washington DC, USA
Email: khm33@georgetown.edu
Copyright © 2013 Kepher H. Makambi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Received February 1, 2013; revised March 1, 2013; accepted March 9, 2013
Keywords: P-Values; Weighting; Linear Combination; Correlation Coefficient; Estimated Degrees of Freedom
ABSTRACT
We present approximations to the distribution of the weighted combination of independent and dependent $p$-values, $\sum_{i=1}^{k} w_i(-2\ln P_i)$. When independence of the $P_i$’s is not assumed, it is argued that this quantity is implicitly dominated by positive definite quadratic forms that induce a chi-square distribution. This leads to an approximation of the associated degrees of freedom using the method of Satterthwaite (1946) or Patnaik (1949). An approximation by Brown (1975) is used to estimate the covariance between the log-transformed $p$-values. The performance of the approximations is compared using simulations. For both the independent and dependent cases, the approximations are shown to yield probability values close to the nominal level, even for arbitrary weights $w_i$.
1. Introduction
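Let $P_1, P_2, \ldots, P_k$ be tail probabilities or probability values from continuous distributions. Associate null hypotheses $H_{0i}$, $i = 1, \ldots, k$, to these $k$ probability values. Using the probability integral transform, we know that $P_i \sim U(0, 1)$ when $H_{0i}$ is true. For $x \ge 0$, $\Pr(-2\ln P_i \le x) = \Pr(P_i \ge e^{-x/2}) = 1 - e^{-x/2}$, which is the cumulative distribution function of a chi-square variable with 2 degrees of freedom. That is, $-2\ln P_i \sim \chi^2_2$, and the decision rule is to reject $H_{0i}$ if $-2\ln P_i$ exceeds $\chi^2_{2,\alpha}$, the upper $\alpha$ critical value of the $\chi^2_2$ distribution.
Define a combined statistic by $\sum_{i=1}^{k}(-2\ln P_i)$. For independent $P_i$’s, the variable $\sum_{i=1}^{k}(-2\ln P_i) \sim \chi^2_{2k}$. The overall test procedure is to reject the combined null hypothesis $H_0$ if $\sum_{i=1}^{k}(-2\ln P_i) > \chi^2_{2k,\alpha}$. This is Fisher’s Inverse Chi-square method. We notice that in this statistic all the $P_i$’s are weighted equally, which may not be acceptable in some situations, so unequal weighting may be necessary. A number of authors have attempted to derive the distribution of a weighted form of such sums of chi-square variables.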
For instance, let
where
has a non-central
distribution with non-centrality parameter
Solomon and Stephens [1] approximated the distribution of
by a random variable of the form
matching the first three moments. The disadvantage of this approximation is that there is no closed-form formula for computing the parameters. Buckley and Eagleson [2] instead approximate this distribution
using a variable that takes the form
and matching the first three cumulants of
and
Zhang [3] showed that by equating the first three cumulants of
and
the distribution of
can be approximated by
Zhang [3] also proposed a chi-square approximation to the distribution of
Other authors have approximated the null distribution of
by intensive bootstrap [4-8].
In this article, we concentrate on linear combinations of the variables $-2\ln P_i$ (functions of the $P_i$’s), each of which has a central chi-square distribution, allowing for dependent or independent $P_i$’s and arbitrary weights $w_i$. For dependent $P_i$’s, we use simulations to investigate the performance of the approach by Makambi [9] when the correlation coefficient is assumed to be the same for every pair.
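Fisher's equally weighted method described at the start of this section can be sketched in a few lines. This is a minimal illustration, not code from the article: the function names `chi2_sf_even` and `fisher_combined_pvalue` are hypothetical, and the chi-square tail is computed with the closed form that holds for even degrees of freedom, so only the standard library is needed.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2m the tail is exp(-x/2) * sum_{j<m} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # (x/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def fisher_combined_pvalue(pvalues):
    """Return (statistic, combined p-value) for independent p-values."""
    if not pvalues or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    return statistic, chi2_sf_even(statistic, 2 * len(pvalues))
```

With a single p-value the combined value equals that p-value, which is a convenient sanity check on the 2-degrees-of-freedom result above.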
2. Distribution of Independent and Dependent Weighted $P_i$’s
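Let’s focus on the mixture $\sum_{i=1}^{k} w_i X_i$, where $X_i = -2\ln P_i$ has a central $\chi^2$-distribution with 2 degrees of freedom and $w_1, \ldots, w_k$ are arbitrary weights. For independent $P_i$’s, Good [10] provided the following approximation:
$$\Pr\left(\sum_{i=1}^{k} w_i X_i > x\right) = \sum_{i=1}^{k} \Lambda_i e^{-x/(2w_i)},$$
where $\Lambda_i = w_i^{k-1} / \prod_{j \ne i}(w_i - w_j)$. This approximation is usually regarded as the exact distribution of $\sum_{i=1}^{k} w_i X_i$. The approximation has been criticized because the calculations become ill-conditioned when any two weights, $w_i$ and $w_j$,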
are equal. To avoid this problem, Bhoj [11] proposed the approximation
where denotes the incomplete gamma function. This approximation is also for independent probability values.
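As a rough illustration of the independent case, the sketch below evaluates the tail probability in the form Good's result is commonly stated for pairwise distinct positive weights: $\sum_i \Lambda_i e^{-x/(2w_i)}$ with $\Lambda_i = w_i^{k-1}/\prod_{j \ne i}(w_i - w_j)$. That this is the form intended here is an assumption, and the function name `good_tail_probability` is hypothetical.

```python
import math


def good_tail_probability(weights, x):
    """Pr(sum_i w_i * X_i > x) for independent X_i ~ chi-square(2).

    Requires positive, pairwise distinct weights; equal weights make the
    coefficients undefined and nearly equal weights make them very large.
    """
    k = len(weights)
    if k == 0 or any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    if x <= 0:
        return 1.0
    total = 0.0
    for i, wi in enumerate(weights):
        denom = 1.0
        for j, wj in enumerate(weights):
            if j != i:
                if wi == wj:
                    raise ValueError("weights must be pairwise distinct")
                denom *= wi - wj
        total += wi ** (k - 1) / denom * math.exp(-x / (2.0 * wi))
    return total
```

The coefficients alternate in sign and grow as two weights approach each other, so the sum loses precision through cancellation. This is the ill-conditioning that the text attributes to the formula.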
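For an alternative and more general approximation to the distribution of $T = \sum_{i=1}^{k} w_i(-2\ln P_i)$, where independence of the $P_i$’s is not assumed, it may be argued that $T$ is a quantity that is implicitly dominated by positive definite quadratic forms that induce a chi-square distribution. Thus, by Satterthwaite [12] or Patnaik [13], we have, approximately,
$$\frac{\nu T}{E(T)} \sim \chi^2_{\nu}.$$
It follows that
$$\mathrm{Var}(T) = \frac{2[E(T)]^2}{\nu}.$$
Therefore, the degrees of freedom can be obtained by solving the above equation for $\nu$, namely,
$$\nu = \frac{2[E(T)]^2}{\mathrm{Var}(T)}.$$
Now,
$$E(T) = 2\sum_{i=1}^{k} w_i$$
and
$$\mathrm{Var}(T) = 4\sum_{i=1}^{k} w_i^2 + 2\sum_{i<j} w_i w_j\, \mathrm{cov}(-2\ln P_i, -2\ln P_j),$$
where $\mathrm{cov}(-2\ln P_i, -2\ln P_j)$ denotes the covariance between $-2\ln P_i$ and $-2\ln P_j$ for $i < j$. An estimate of the degrees of freedom, $\hat{\nu}$, is given by
$$\hat{\nu} = \frac{2[E(T)]^2}{\widehat{\mathrm{Var}}(T)}$$
(see [9,14]).
We can now synthesize the probability values $P_1, \ldots, P_k$ based on the decision rule: reject $H_0$ if $\hat{\nu} T / E(T)$ exceeds the upper $\alpha$ critical value $\chi^2_{\hat{\nu},\alpha}$. For normalized weights, that is, $\sum_{i=1}^{k} w_i = 1$, the decision rule is: reject $H_0$ if $\hat{\nu} T / 2 > \chi^2_{\hat{\nu},\alpha}$, with an estimate of the degrees of freedom given by $\hat{\nu} = 8/\widehat{\mathrm{Var}}(T)$. Notice that for independent $P_i$ and $P_j$ ($i \ne j$) and normalized weights, Makambi [9] and Hou [14] utilize $\hat{\nu} = 2/\sum_{i=1}^{k} w_i^2$.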
For
and 4, Hou [14] presented simulation results indicating that the approximation given above attains probability values close to the nominal level, similar to the Good [10] and Bhoj [11] approximations.
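The moment-matching step can be sketched as follows, assuming the weighted statistic is $\sum_i w_i(-2\ln P_i)$, that its mean is $2\sum_i w_i$, and that the degrees of freedom are twice the squared mean divided by the variance. The names `gamma_q`, `chi2_sf` and `satterthwaite_combined_pvalue` are hypothetical, and the optional `cov` argument is an assumed matrix of covariances between the log-transformed p-values. The chi-square tail for non-integer degrees of freedom uses a standard series and continued-fraction evaluation of the regularized incomplete gamma function.

```python
import math


def gamma_q(a, x):
    """Regularized upper incomplete gamma function Q(a, x), a > 0."""
    if x <= 0.0:
        return 1.0
    log_pref = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series for the lower function P(a, x); return 1 - P.
        n, term = a, 1.0 / a
        total = term
        for _ in range(1000):
            n += 1.0
            term *= x / n
            total += term
            if abs(term) < abs(total) * 1e-14:
                break
        return 1.0 - total * math.exp(log_pref)
    # Modified Lentz continued fraction for Q(a, x).
    tiny = 1e-300
    b = x + 1.0 - a
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, 1000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return h * math.exp(log_pref)


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with real df > 0."""
    return gamma_q(df / 2.0, x / 2.0)


def satterthwaite_combined_pvalue(pvalues, weights, cov=None):
    """Return (estimated df, approximate combined p-value).

    cov[i][j] is an assumed covariance between -2 ln P_i and -2 ln P_j;
    leave it as None for independent p-values.
    """
    k = len(pvalues)
    x = [-2.0 * math.log(p) for p in pvalues]
    t = sum(w * xi for w, xi in zip(weights, x))
    mean = 2.0 * sum(weights)
    var = 4.0 * sum(w * w for w in weights)
    if cov is not None:
        for i in range(k):
            for j in range(i + 1, k):
                var += 2.0 * weights[i] * weights[j] * cov[i][j]
    nu = 2.0 * mean * mean / var
    return nu, chi2_sf(nu * t / mean, nu)
```

With equal weights and no covariance the estimated degrees of freedom equal $2k$ and the result coincides with Fisher's method, which gives a quick check of the sketch.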
For independent $p$-values, we use Table 1 in Hou [14] to obtain Table 1, solely to compare the performance of the approaches. We notice that using
(column 5, Table 1) yields results that are close to both the exact method by Good [10] and the method by Bhoj [11].
To illustrate the application of the methods for independent probability values, we use data from Canner [15] on four selected multicenter trials involving aspirin and post-myocardial infarction patients carried out in Europe and the United States in the period 1970-1979. Two of these trials, referred to as UK-1 and UK-2, were carried out in the United Kingdom; the other two are the Coronary Drug Project Aspirin Study (CDPA) and the Persantine-Aspirin Reinfarction Study (PARIS) (Table 2).
The $p$-values provided in column 4 of Table 2 are for the log odds ratio as the outcome measure of interest. Using the $p$-values in Table 2 and the weights from Table 1 of [14], we obtain the values in Table 3. We have also included results for normalized inverse variance weights determined from the data. The three approximations yield values that are close to each other, and are in good agreement with the exact method by Good [10].
If $P_i$ and $P_j$ are non-independent, the expression for the variance of $\sum_{i=1}^{k} w_i(-2\ln P_i)$ contains a covariance term between $-2\ln P_i$ and $-2\ln P_j$
that has to be estimated. Let
be the correlation between
and
i.e.,
An approximation of the variance of
is given by [16]
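Brown's approximation [16], as it is usually stated for one-sided tests based on normal statistics with correlation $\rho_{ij}$, replaces the covariance between $-2\ln P_i$ and $-2\ln P_j$ by $\rho_{ij}(3.25 + 0.75\rho_{ij})$ for $0 \le \rho_{ij} \le 1$ and by $\rho_{ij}(3.27 + 0.71\rho_{ij})$ for $-0.5 \le \rho_{ij} < 0$. The sketch below assumes that form and a common correlation for all pairs, and plugs it into the variance of the weighted sum and the resulting degrees of freedom. The function names are hypothetical.

```python
def brown_covariance(rho):
    """Approximate cov(-2 ln P_i, -2 ln P_j) from the correlation rho."""
    if 0.0 <= rho <= 1.0:
        return rho * (3.25 + 0.75 * rho)
    if -0.5 <= rho < 0.0:
        return rho * (3.27 + 0.71 * rho)
    raise ValueError("rho must lie in [-0.5, 1]")


def variance_and_df(weights, rho):
    """Variance of sum_i w_i * (-2 ln P_i) and matched degrees of freedom.

    Assumes the same correlation rho for every pair (i, j), i != j.
    """
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    cross = (s1 * s1 - s2) / 2.0          # sum over i < j of w_i * w_j
    var = 4.0 * s2 + 2.0 * cross * brown_covariance(rho)
    mean = 2.0 * s1
    return var, 2.0 * mean * mean / var
```

Two limiting cases make the behaviour easy to check for normalized weights: with zero correlation the degrees of freedom reduce to $2/\sum_i w_i^2$, and with correlation 1 the approximate covariance is 4 and the degrees of freedom drop to 2, as for a single test.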
3. A Procedure for Constant Correlation Coefficient
We require estimates of the correlation coefficients $\rho_{ij}$ to implement the procedures above for dependent $P_i$’s. Let’s consider the case of homogeneous nonnegative correlation coefficients, that is, $\rho_{ij} = \rho \ge 0$ for $i \ne j$.
Let
and define the quadratic form [9]
We can write
where $I$ is the identity matrix of order $k$ and $J$ is a square matrix of order $k$ with every element equal to unity. It can be shown that
Table 1. for independent
’s.
where
and
is the trace of the matrix
For homogeneous
and using results from Brown [16] we have
We can show that
Solving the preceding equation for
yields the approximate admissible solution
with an estimate for
given by
(1)
We investigate how well this approximation works compared with the other approximations by simulating data from a $k$-variate normal distribution with covariance matrix
with
and
Table 2. Data on total mortality in six aspirin trials (Number of Deaths/Number of Patients).
Table 3. for independent $P_i$’s using Canner (1987) data for .
Just as in Hou [14], we simulated 10,000 multivariate normal samples and computed the corresponding values of
For $k = 2$
and 4, we present values for
at selected nominal levels and weights (Tables 4-6).
Table 4. Simulated estimates of at selected nominal levels for non-independent
’s from bivariate normal distribution with
and covariance matrix
with
and
.
Table 5. Simulated estimates of at selected nominal levels for non-independent
’s from multivariate normal distribution with
and covariance matrix
with
and
.
Table 6. Simulated estimates of (with weights,
simulated from
) at selected nominal levels for non-independent
’s from bivariate normal distribution with
and covariance matrix
with
and
.
For $k = 2$ (Table 4), the proposed method attains probability levels that are close to the nominal level, similar to the Makambi/Hou method.
For $k = 4$ (Table 5), the proposed estimate of the constant correlation coefficient $\rho$ leads to attained probability levels that are close to the nominal level $\alpha$ for $\rho = 0.1$ and 0.9. However, for values of $\rho$ close to 0.5, the estimate leads to underestimation of the probability level.
Now, instead of using pre-defined weights, we simulated weights from a beta distribution with parameters and
That is, for
such that
Results are given in Table 6 for selected nominal levels.
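The data-generating step of such a simulation can be sketched as below. This is only an illustration under stated assumptions, not a reproduction of the tables: the normal variates have unit variance and a common correlation, the p-values are one-sided upper-tail values, the beta parameters passed to `beta_weights` are placeholders because the ones used in the study are not recoverable here, and the weights are rescaled to sum to one. All function names are hypothetical. The sketch checks the simulated variance of the weighted sum rather than attained significance levels.

```python
import math
import random


def equicorrelated_log_pvalues(k, rho, n_samples, rng):
    """Simulate rows of X_i = -2 ln P_i from k equicorrelated normals."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    rows = []
    for _ in range(n_samples):
        shared = rng.gauss(0.0, 1.0)
        row = []
        for _ in range(k):
            z = a * shared + b * rng.gauss(0.0, 1.0)  # corr(z_i, z_j) = rho
            p = 0.5 * math.erfc(z / math.sqrt(2.0))   # upper-tail p-value
            row.append(-2.0 * math.log(p))
        rows.append(row)
    return rows


def beta_weights(k, a, b, rng):
    """Draw k weights from Beta(a, b) and rescale them to sum to one."""
    raw = [rng.betavariate(a, b) for _ in range(k)]
    total = sum(raw)
    return [w / total for w in raw]


def sample_variance_of_weighted_sum(rows, weights):
    """Unbiased sample variance of sum_i w_i * X_i across simulated rows."""
    totals = [sum(w * x for w, x in zip(weights, row)) for row in rows]
    mean = sum(totals) / len(totals)
    return sum((t - mean) ** 2 for t in totals) / (len(totals) - 1)
```

A possible use is to draw weights with `beta_weights(4, 2.0, 2.0, random.Random(0))`, simulate rows for a chosen correlation, and compare the sample variance with the value implied by Brown's covariance approximation for the same weights.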
4. Conclusion
In this article, we have presented chi-square approximations to the distribution of Fisher’s inverse chi-square statistic for independent and dependent $p$-values. It has also been shown that, for dependent $p$-values, the proposed estimate of the constant correlation coefficient $\rho$ performs well, attaining probability levels close to the nominal level for correlation coefficients close to 0.1 and 0.9. We expect the proposed estimate to underestimate probability levels for relatively large numbers of studies, especially when $\rho$ is close to 0.5. However, for values close to 0.1 and 0.9, the proposed estimate works quite well and can be recommended.
REFERENCES
- H. Solomon and M. A. Stephens, “Distribution of a Weighted Sum of Chi-Squared Variables,” Journal of the American Statistical Association, Vol. 72, No. 360a, 1977, pp. 881-885.
- M. J. Buckley and G. G. Eagleson, “An Approximation to the Distribution of Quadratic Forms in Normal Random Variables,” Australian Journal of Statistics, Vol. 30A, No. 1, 1988, pp. 150-159. doi:10.1111/j.1467-842X.1988.tb00471.x
- J.-T. Zhang, “Approximate and Asymptotic Distributions of Chi-Squared-Type Mixtures with Applications,” Journal of the American Statistical Association, Vol. 100, No. 469, 2005, pp. 273-285. doi:10.1198/016214504000000575
- R. L. Eubank and C. H. Spiegelman, “Testing the Goodness of Fit of a Linear Model via Non-Parametric Regression Techniques,” Journal of the American Statistical Association, Vol. 85, No. 410, 1990, pp. 387-392. doi:10.1080/01621459.1990.10476211
- A. Azzalini and A. W. Bowman, “On the Use of Nonparametric Regression for Checking Linear Relationships,” Journal of the Royal Statistical Society Series B, Vol. 55, No. 2, 1993, pp. 549-557.
- J. C. Chen, “Testing the Goodness of Fit of Polynomial Models via Spline Smoothing Techniques,” Statistics and Probability Letters, Vol. 19, No. 1, 1994, pp. 65-76. doi:10.1016/0167-7152(94)90070-1
- W. Gonzalez-Manteiga and R. Cao, “Testing the Hypothesis of a General Linear Model Using Non-Parametric Regression Estimation,” Test, Vol. 2, No. 1-2, 1993, pp. 161-188. doi:10.1007/BF02562674
- J. Fan, C. Zhang and J. Zhang, “Generalized Likelihood Ratio Statistics and Wilks Phenomenon,” The Annals of Statistics, Vol. 29, No. 1, 2001, pp. 153-193. doi:10.1214/aos/996986505
- K. H. Makambi, “Weighted Inverse Chi-Square Method for Correlated Significance Tests,” Journal of Applied Statistics, Vol. 30, No. 2, 2003, pp. 225-234. doi:10.1080/0266476022000023767
- I. J. Good, “On the Weighted Combination of Significance Tests,” Journal of the Royal Statistical Society Series B, Vol. 17, No. 1, 1955, pp. 264-265.
- D. S. Bhoj, “On the Distribution of the Weighted Combination of Independent Probabilities,” Statistics & Probability Letters, Vol. 15, No. 1, 1992, pp. 37-40. doi:10.1016/0167-7152(92)90282-A
- F. E. Satterthwaite, “An Approximate Distribution of the Estimates of Variance Components,” Biometrics Bulletin, Vol. 2, No. 6, 1946, pp. 110-114. doi:10.2307/3002019
- P. B. Patnaik, “The Non-Central $\chi^2$- and $F$-Distributions and Their Applications,” Biometrika, Vol. 36, No. 1-2, 1949, pp. 202-232.
- C. D. Hou, “A Simple Approximation for the Distribution of the Weighted Combination of Nonindependent or Independent Probabilities,” Statistics and Probability Letters, Vol. 73, No. 2, 2005, pp. 179-187. doi:10.1016/j.spl.2004.11.028
- P. L. Canner, “An Overview of Six Clinical Trials of Aspirin in Coronary Heart Disease,” Statistics in Medicine, Vol. 6, No. 3, 1987, pp. 255-263.
- M. B. Brown, “A Method for Combining Non-Independent, One-Sided Tests of Significance,” Biometrics, Vol. 31, No. 4, 1975, pp. 987-992. doi:10.2307/2529826