so that with consisting of the first p columns of C, the second p columns of C, and so on.

Remark 7 When where with being the l-th column of, we have.

Set. Then

. Define the total variation of a random matrix as, i.e., the sum of the variances of all the entries of X.

Theorem 1 We have



are independent with. Furthermore,


Theorem 1 is important for the AHT test. It says that W is a Wishart mixture and it gives the mean matrix and the total variation of W.

2.2. The AHT Test

When were valid with, the random variable T given in (5) would follow, a Hotelling T2-distribution with parameters q and d. Theorem 1 shows that W is in general a Wishart mixture instead of a single Wishart random matrix. To overcome this difficulty, we may approximate the distribution of W by that of a single Wishart random matrix, say, where the unknown parameters d and are determined via matching the mean matrices and total variations of W and R. That is, we solve the following two equations for d and:


The solution is given in Theorem 2 below together with the range of d.

Theorem 2 The solution of (9) is given by 


Moreover, d satisfies the following inequalities: 


Remark 8 Theorem 2 indicates that provided. This guarantees that the distribution of the test statistic T defined in (5) can be approximated by. That is why the test proposed here is called the AHT (approximate Hotelling T2) test.

Remark 9 From (10), it is seen that when becomes large, d generally becomes large; and when, we have so that weakly tends to, the limit distribution of T as pointed out in Remark 5.

Remark 10 The technique used to approximate a Wishart mixture W by a single Wishart random matrix may be referred to as the Wishart-approximation method. The original version of the Wishart-approximation method is due to [8] who determined the unknown parameters d and via matching the first two moments of W and R. The article obtained a number of different solutions to d, with the simplest one being the same as the one presented in Theorem 2.

Remark 11 The key idea of the Wishart-approximation method is very similar to that of the well-known -approximation method developed by [19] who approximated the distribution of a -mixture (see [20]) using that of a -random variable multiplied by a constant via matching the first two moments.
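The moment-matching idea behind Remark 11 can be illustrated with a minimal sketch. It assumes the mixture has the form c_1 A_1 + ... + c_m A_m, where the A_i are independent chi-square random variables with 1 degree of freedom, and that it is approximated by beta times a chi-square random variable with d degrees of freedom. The function name `match_chi2_mixture` and the symbols beta and c_i are introduced here for illustration only and do not appear in the original text.

```python
def match_chi2_mixture(coeffs):
    """Match the first two moments of sum_i c_i * A_i (A_i independent
    chi-square variables with 1 df) with those of beta * chi-square(d).

    Mean:      sum(c_i)       = beta * d
    Variance:  2 * sum(c_i^2) = 2 * beta^2 * d
    Solving the two equations gives beta and d below.
    """
    s1 = sum(coeffs)                 # sum of the mixing coefficients
    s2 = sum(c * c for c in coeffs)  # sum of their squares
    if s1 <= 0:
        raise ValueError("the coefficients must have a positive sum")
    beta = s2 / s1       # scale constant
    d = s1 * s1 / s2     # approximate degrees of freedom
    return beta, d


# With equal unit coefficients the mixture is exactly chi-square(3),
# so the approximation recovers beta = 1 and d = 3.
print(match_chi2_mixture([1.0, 1.0, 1.0]))
```

The Wishart-approximation method follows the same pattern, with the mean matrix and the total variation playing the roles of the mean and the variance.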

Remark 12 The first application of the Wishart-approximation method may be due to [8] who obtained an approximate test for the multivariate two-sample BF problem. The resulting test is not affine-invariant, as pointed out by [5]. The authors of [5] then modified Nel and van der Merwe’s test, resulting in the so-called MNV test. Recent applications of the Wishart-approximation method were given by [17,18] who studied tests of linear hypotheses in heteroscedastic one-way and two-way ANOVA. The AHT test proposed in this paper is a new application of the Wishart-approximation method.

In real data applications, the parameter d has to be estimated based on the data. A natural estimator of d is obtained via replacing by their estimators:


so that


Notice that so that the range of d given in (11) is also the range of.

Remark 13 Under the assumption (6), it is standard to show that as, we have. In addition, we can show that and. That is, the means of T and are matched up to order while the variances of T and are matched only up to order. This is not bad since here we only use one tuning parameter and the distribution of is easy to use.

In summary, the AHT test is based on approximating the distribution of the Wald-type test statistic T (4) by. It can be conducted using the usual F-distribution since


where and throughout, the expression means “X and Y have the same distribution”. In other words, the critical value of the AHT test can be specified as

for the nominal significance level. We reject the null hypothesis in (3) when this critical value is exceeded by the observed test statistic T. The AHT test can also be conducted via computing the P-value based on the approximate distribution specified in (14).
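The conversion to the F-distribution can be sketched as follows. The sketch assumes that (14) is the standard relationship between a Hotelling T2-distribution with parameters q and d and the F-distribution with q and d - q + 1 degrees of freedom, namely that (d - q + 1) T / (d q) follows the F-distribution. The function names are hypothetical, and in practice d would be replaced by its estimate.

```python
def hotelling_to_f(t_stat, q, d):
    """Convert a statistic with an (approximate) Hotelling T2(q, d)
    distribution into an F statistic and its degrees of freedom.

    Assumes the standard identity T2(q, d) = d*q/(d - q + 1) * F(q, d - q + 1).
    """
    df2 = d - q + 1
    if df2 <= 0:
        raise ValueError("d must be larger than q - 1")
    f_stat = t_stat * df2 / (d * q)
    return f_stat, q, df2


def critical_value_from_f_quantile(f_quantile, q, d):
    """Scale an upper F(q, d - q + 1) quantile back to the T2 scale."""
    df2 = d - q + 1
    if df2 <= 0:
        raise ValueError("d must be larger than q - 1")
    return d * q / df2 * f_quantile
```

Given the returned values, the P-value is the upper tail probability of the F-distribution at `f_stat`; with SciPy installed, for example, it can be obtained from `scipy.stats.f.sf(f_stat, df1, df2)`, which accepts non-integer degrees of freedom.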

2.3. Minimum Sample Size Determination

Let denote the integer part of a. When, it is easy to show that X has up to finite moments:

In general, T has some finite moments. If its approximate Hotelling T2-distribution is good, it should also have the same number of finite moments. To assure that has up to r finite moments, by (14), the minimum sample size must satisfy 


which is obtained via using the lower bound of d (and as well) given in (11). The required minimum sample size may be defined as where a is the quantity given in the right-hand side of (15). It is seen that when p or r is large or when q is small, the required minimum sample size is also large. By Remark 2, we have . Thus, a sufficient condition to guarantee that the approximate Hotelling T2-distribution (14) has up to r finite moments is that.

Remark 14 When is too small, e.g., , the AHT test may not perform well since in this case, the first moment of is not finite although the first moment of T is usually finite.

2.4. Properties of the AHT Test

In practice, the observed response vectors in (1) are often re-centered or rescaled before any inference is conducted. It is desirable that the inference be invariant under the re-centering or rescaling transformation. These are two special cases of the following affine transformation of the observed response vectors:


where B is any nonsingular matrix and b is any constant vector. The proposed AHT test is affine-invariant as stated in the theorem below.

Theorem 3 The proposed AHT test is affine-invariant in the sense that both T and are invariant under the affine-transformation (16).
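The invariance of T stated in Theorem 3 can be checked numerically in a simple special case. The sketch below assumes that, for k = 2 bivariate samples and the hypothesis of equal mean vectors, the Wald-type statistic reduces to the quadratic form of the sample mean difference in the inverse of S1/n1 + S2/n2. The data, the matrix B, the vector b, and all function names are made up for illustration, and the invariance of the estimated degrees of freedom is not checked here.

```python
def mean_and_cov(sample):
    """Sample mean vector and unbiased sample covariance of 2-D points."""
    n = len(sample)
    m = [sum(x[j] for x in sample) / n for j in range(2)]
    s = [[sum((x[i] - m[i]) * (x[j] - m[j]) for x in sample) / (n - 1)
          for j in range(2)] for i in range(2)]
    return m, s


def wald_two_sample(sample1, sample2):
    """Quadratic form diff' (S1/n1 + S2/n2)^{-1} diff for 2-D samples."""
    m1, s1 = mean_and_cov(sample1)
    m2, s2 = mean_and_cov(sample2)
    n1, n2 = len(sample1), len(sample2)
    # Entries of the symmetric 2x2 matrix S1/n1 + S2/n2.
    a = s1[0][0] / n1 + s2[0][0] / n2
    b = s1[0][1] / n1 + s2[0][1] / n2
    c = s1[1][1] / n1 + s2[1][1] / n2
    d0, d1 = m1[0] - m2[0], m1[1] - m2[1]
    det = a * c - b * b
    # The inverse of [[a, b], [b, c]] is [[c, -b], [-b, a]] / det.
    return (c * d0 * d0 - 2.0 * b * d0 * d1 + a * d1 * d1) / det


def affine(sample, B, b):
    """Apply x -> B x + b to every 2-D observation."""
    return [(B[0][0] * x[0] + B[0][1] * x[1] + b[0],
             B[1][0] * x[0] + B[1][1] * x[1] + b[1]) for x in sample]


# Made-up data and an arbitrary nonsingular B (determinant 5.5).
sample1 = [(1.0, 2.0), (2.0, 1.5), (3.0, 3.5), (4.0, 3.0)]
sample2 = [(2.0, 4.0), (3.5, 5.0), (5.0, 4.5), (4.0, 6.5), (6.0, 7.0)]
B = [[2.0, 1.0], [0.5, 3.0]]
b = (10.0, -4.0)

t_original = wald_two_sample(sample1, sample2)
t_transformed = wald_two_sample(affine(sample1, B, b), affine(sample2, B, b))
# The two values agree up to floating-point rounding.
print(t_original, t_transformed)
```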

Remark 1 mentions that the contrast matrix C used to write (2) into the form of the GLHT problem (3) is not unique and the AHT test is invariant to various choices of the contrast matrix. This result follows from Theorem 4 below immediately if we notice a result from [21] (Ch. 5, Sec. 4), which states that for any two contrast matrices and C defining the same hypothesis, there is a nonsingular matrix P such that.

Theorem 4 The AHT test is invariant when the coefficient matrix C and the constant vector c in (3) are replaced with 


respectively where P is any nonsingular matrix.

Finally, we have the following result.

Theorem 5 The AHT test is invariant under different labeling schemes of the mean vectors.

3. Simulation Studies

In this section, intensive simulations are conducted to compare the AHT test against the test of [1] and the PB test of [2]. All three tests are affine-invariant. Reference [2] demonstrated that the PB test generally outperforms the test of [1] and the generalized F-test of [14] in terms of size control. The generalized F-test is generally very liberal and time-consuming. Therefore, we shall not include it in the comparison against the AHT test.

Following [2], for simplicity, we set and to be some positive definite matrices, where p, and other tuning parameters are specified later. Let denote the vector consisting of the k sample sizes. For given n and, we first generated k sample mean vectors and k sample covariance matrices by

where the population mean vectors with being the first population mean vector, u a constant unit vector specifying the direction of the population mean differences, and a tuning parameter controlling the amount of the population mean differences. Without loss of generality, we specified as 0 and u as where for any p and denotes the usual L2-norm of u0. We then applied the Johansen, PB, and AHT tests to the generated sample mean vectors and the sample covariance matrices, and recorded their P-values. The empirical sizes and powers of the Johansen, PB, and AHT tests were computed based on 10000 runs and the number of inner loops for the PB test is 1000. In all the simulations conducted, the significance level was specified as 5% for simplicity.

The empirical sizes (associated with) and powers (associated with) of the Johansen, PB, and AHT tests for the multivariate k-sample BF problem (2), together with the associated tuning parameters, are presented in Tables 1-3, in the columns labeled “Joh”, “PB”, and “AHT” respectively. As seen from the three tables, three sets of tuning parameters for the population covariance matrices are examined, with the first set specifying the homogeneous cases, and seven sets of sample sizes are specified, with the first three sets specifying the balanced sample size cases.

To measure the overall performance of a test in terms of maintaining the nominal size, we define the average relative error as where denotes the j-th empirical size for, and M is the number of empirical sizes under consideration. A smaller ARE value indicates better overall performance of the associated test. Usually, when ARE ≤ 10, the test performs very well; when, the test performs reasonably well; and when, the test does not perform well since its empirical sizes are either too liberal or too conservative. Notice that for a good test, the larger the sample sizes, the smaller the ARE values. For simplicity, in the specification of the covariance and sample size tuning parameters, we often use a_r to denote “a repeats r times”, e.g., (30)_2 = (30, 30) and (2_3, 4, 1_2) = (2, 2, 2, 4, 1, 1). Tables 1-3 show the empirical sizes and powers of the Johansen, PB, and AHT tests for a bivariate case with, a 3-variate case with and a 5-variate case with, respectively.

Table 1. Empirical sizes and powers of the Johansen, PB, and AHT tests for bivariate one-way MANOVA.
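The average relative error and the repeat notation used for the tuning parameters can be sketched as follows. The sketch assumes that ARE is the mean absolute deviation of the empirical sizes from the nominal size, relative to the nominal size and expressed in percent; this form is consistent with the threshold ARE ≤ 10 quoted in the text but is an assumption here. The function names are hypothetical.

```python
def average_relative_error(empirical_sizes, nominal=0.05):
    """Assumed form: ARE = 100 * mean(|size_j - nominal|) / nominal."""
    if not empirical_sizes:
        raise ValueError("at least one empirical size is required")
    m = len(empirical_sizes)
    total = sum(abs(size - nominal) for size in empirical_sizes)
    return 100.0 * total / (m * nominal)


def expand_repeats(spec):
    """Expand (value, repeats) pairs, e.g. (2_3, 4, 1_2) -> (2, 2, 2, 4, 1, 1)."""
    out = []
    for value, repeats in spec:
        out.extend([value] * repeats)
    return out


# The sample-size specification (2_3, 4, 1_2) from the text.
print(expand_repeats([(2, 3), (4, 1), (1, 2)]))  # [2, 2, 2, 4, 1, 1]
```

For instance, empirical sizes of 5%, 6%, and 4% at the 5% nominal level give an ARE of about 13.3 under this assumed definition.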

From Table 1, it is seen that for the two-sample BF problem, the Johansen, PB, and AHT tests performed very similarly, with the Johansen test slightly outperforming the other two tests. However, from Tables 2 and 3, it is seen that with k increasing to 3 and 5, the Johansen test performed much worse than the PB and AHT tests. The latter two tests were generally comparable for various sample sizes and parameter configurations. Since the PB test is much more computationally intensive, it is less attractive in real data analysis. The AHT test is then a nice alternative, especially when k is moderate or large.

4. Application to the Egyptian Skull Data

The Egyptian skull data set was recently analyzed by [2]. It can be downloaded freely at Statlib (http://lib.stat.cmu.edu/DASL/Stories/EgyptianSkullDevelopment.html).

There are five samples of 30 skulls from the early pre-dynastic period (circa 4000 BC), the late pre-dynastic period (circa 3300 BC), the 12th and 13th dynasties (circa 1850 BC), the Ptolemaic period (circa 200 BC), and the Roman period (circa AD 150). Four measurements are available on each skull, namely, maximum breadth, basibregmatic height, basialveolar length, and nasal height (all in mm). To compare the AHT test with the test of [1] and the PB test of [2] in various cases, we applied these three tests to check the significance of the mean vector differences of the first k samples, using only the first observations for and for k = 2, 3, 4 and 5. There are 12 cases in total under consideration. The number of bootstrap replications in the PB test is 10000 and hence the time spent by the PB test is about 10000 times that spent by the other two tests. The P-values of the three tests for various cases are presented in Table 4.

Table 2. Empirical sizes and powers of the Johansen, PB, and AHT tests for trivariate one-way MANOVA.

From Table 4, it is seen that the P-values of the three tests are close to each other, with the P-values of the Johansen test slightly smaller in almost all the cases. Reference [2] showed via intensive simulations that the PB test performed well for various parameter configurations. Therefore, we may use the P-values of the PB test as a benchmark to compare the AHT test with the Johansen test. It is seen from Table 4 that the P-values of the AHT test are closer to the P-values of the PB test than those of the Johansen test are. In this sense, the AHT test performed similarly to the PB test and outperformed the Johansen test. This is in agreement with the conclusions drawn from the simulation results presented in the previous section.

Table 3. Empirical sizes and powers of the Johansen, PB, and AHT tests for 5-variate one-way MANOVA.

Table 4. P-values of the Johansen, PB, and AHT tests for the Egyptian skull data example.

It is also seen that the first null hypothesis in Table 4 is not significant, with the P-values of the three tests larger than 60% and increasing as the sample sizes increase; the other three null hypotheses are significant, with the P-values of the three tests decreasing to less than 5% as the sample sizes increase. These results suggest that the Egyptian skulls had little change in the early and late pre-dynastic periods but experienced a significant change over the later three periods.

5. Technical Proofs

Proof of Theorem 1 Notice first that if, then we have and

. In addition, it is well known that for, we have

. Thus



Therefore, we have and

Since are independent, we have


as desired. The theorem is proved.

Proof of Theorem 2 By Theorem 1, and


By the proof of Theorem 1, we have and


Equating and leads to. It follows that. Equating and then leads to (10) as desired.

We first find the lower bound of d. This is equivalent to finding the upper bound of the denominator of d. For, set which is a full rank matrix. Then. It follows that are nonnegative, so are their eigenvalues. In addition, the matrix and the matrix have the same non-zero eigenvalues. Thus, has at most p nonzero eigenvalues. Denote the largest p eigenvalues of by which include all the nonzero eigenvalues of. By Theorem 1,. This leads to, which implies that is nonnegative. By singular value decomposition of, it is easy to show that. It follows that



It follows that. The first inequality in (11) is proved.

We now find the upper bound for d. This is equivalent to finding the minimum value of the denominator of d. Using the eigenvalues of defined above, we have

It follows that

For convenience, we now set. Then by Theorem 1, we have

Set where

are linearly independent but. Taking the partial derivatives of g with respect to and setting them to 0 lead to the following normal equation system:

Solving the above equation system with respect to, together with the fact, leads to 


where as defined before. Since for , we have

the associated Hessian matrix of g is positive definite. Thus, the function has minimum value when take the values in (18).

It follows that the upper bound of d is

as desired. The theorem is proved.

Proof of Theorem 3 Since and denote the mean vector and covariance matrix of, we let and denote the mean vector and covariance matrix of the affine-transformed responses given by (16). Then we have and It follows that. As we defined the long mean vector and the big covariance matrix in Section 2, we define and similarly. Then we have and where and. It follows that the GLHT problem (3) can be equivalently expressed as

where and.

Since and denote the unbiased estimators of and for the original responses, we define and as the unbiased estimators of and for the affine-transformed responses . Then by the affine-transformation (16), it is easy to see that, and Therefore, and. Using the above, we have and. The affine-invariance of T follows immediately.

To show that is affine-invariant, by (13), it is sufficient to show that and are affine-invariant. Let and. Then we have and . It follows that and. Since G is affine-invariant, we only need to show that are affine-invariant. Since implies and implies , the affine-invariance of follows immediately. The theorem is then proved.

Proof of Theorem 4 First of all, under the transformation (17), we have and . The invariance of T under (17) follows immediately.

To show that is invariant under the transformation (17), by (13), it is sufficient to show that and are invariant under (17). The transformation (17) implies that. Then we have and. It follows that so that

Similarly, we can show that. This proves that is invariant under the transformation (17). The theorem is then proved.

Proof of Theorem 5 Let be any permutation of. Then it is easy to see that

showing that, , and are invariant under different labeling schemes of the mean vectors and so is the Wald-type test statistic T.

To show that is invariant under different labeling schemes of the group mean vectors, by (13), it is sufficient to show that the denominator of has such a property. This is actually the case by noticing that the denominator of

This completes the proof of the theorem.

6. Acknowledgements

The work was supported by the National University of Singapore Academic Research Grant R-155-000-108-112. The author thanks the Editor for helpful comments and suggestions that helped improve the presentation of the paper.


References

  1. S. Johansen, “The Welch-James Approximation to the Distribution of the Residual Sum of Squares in a Weighted Linear Regression,” Biometrika, Vol. 67, No. 1, 1980, pp. 85-95. doi:10.1093/biomet/67.1.85
  2. K. Krishnamoorthy and F. Lu, “A Parametric Bootstrap Solution to the MANOVA under Heteroscedasticity,” Journal of Statistical Computation and Simulation, Vol. 80, No. 8, 2010, pp. 873-887. doi:10.1080/00949650902822564
  3. T. W. Anderson, “An Introduction to Multivariate Statistical Analysis,” Wiley, New York, 2003.
  4. K. Krishnamoorthy and Y. Xia, “On Selecting Tests for Equality of Two Normal Mean Vectors,” Multivariate Behavioral Research, Vol. 41, No. 4, 2006, pp. 533-548. doi:10.1207/s15327906mbr4104_5
  5. K. Krishnamoorthy and J. Yu, “Modified Nel and van der Merwe Test for the Multivariate Behrens-Fisher Problem,” Statistics and Probability Letters, Vol. 66, No. 2, 2004, pp. 161-169. doi:10.1016/j.spl.2003.10.012
  6. G. S. James, “Tests of Linear Hypotheses in Univariate and Multivariate Analysis When the Ratios of the Population Variances Are Unknown,” Biometrika, Vol. 41, No. 1-2, 1954, pp. 19-43.
  7. Y. Yao, “An Approximate Degrees of Freedom Solution to the Multivariate Behrens-Fisher Problem,” Biometrika, Vol. 52, 1965, pp. 139-147.
  8. D. G. Nel and C. A. van der Merwe, “A Solution to the Multivariate Behrens-Fisher Problem,” Communications in Statistics: Theory and Methods, Vol. 15, No. 12, 1986, pp. 3719-3735. doi:10.1080/03610928608829342
  9. S. Kim, “A Practical Solution to the Multivariate Behrens-Fisher Problem,” Biometrika, Vol. 79, No. 1, 1992, pp. 171-176. doi:10.1093/biomet/79.1.171
  10. H. Yanagihara and K. H. Yuan, “Three Approximate Solutions to the Multivariate Behrens-Fisher Problem,” Communications in Statistics: Simulation and Computation, Vol. 34, No. 4, 2005, pp. 975-988. doi:10.1080/03610910500308396
  11. A. Belloni and G. Didier, “On the Behrens-Fisher Problem: A Globally Convergent Algorithm and a Finite-Sample Study of the Wald, LR and LM Tests,” Annals of Statistics, Vol. 36, No. 5, 2008, pp. 2377-2408. doi:10.1214/07-AOS528
  12. W. F. Christensen and A. C. Rencher, “A Comparison of Type I Error Rates and Power Levels for Seven Solutions to the Multivariate Behrens-Fisher Problem,” Communications in Statistics: Theory and Methods, Vol. 26, 1997, pp. 1251-1273.
  13. B. L. Welch, “On the Comparison of Several Mean Values: An Alternative Approach,” Biometrika, Vol. 38, 1951, pp. 330-336.
  14. J. Gamage, T. Mathew and S. Weerahandi, “Generalized p-Values and Generalized Confidence Regions for the Multivariate Behrens-Fisher Problem and MANOVA,” Journal of Multivariate Analysis, Vol. 88, No. 1, 2004, pp. 177-189. doi:10.1016/S0047-259X(03)00065-4
  15. K. L. Tang and J. Algina, “Performance of Four Multivariate Tests under Variance-Covariance Heteroscedasticity,” Multivariate Behavioral Research, Vol. 28, No. 4, 1993, pp. 391-405. doi:10.1207/s15327906mbr2804_1
  16. K. Krishnamoorthy, F. Lu and T. Mathew, “A Parametric Bootstrap Approach for ANOVA with Unequal Variances: Fixed and Random Models,” Computational Statistics and Data Analysis, Vol. 51, No. 12, 2007, pp. 5731-5742. doi:10.1016/j.csda.2006.09.039
  17. J. T. Zhang, “Tests of Linear Hypotheses in the ANOVA under Heteroscedasticity,” Manuscript, 2012.
  18. J. T. Zhang, “An Approximate Degrees of Freedom Test for Heteroscedastic Two-Way ANOVA,” Journal of Statistical Planning and Inference, Vol. 142, 2012, pp. 336-346.
  19. F. E. Satterthwaite, “An Approximate Distribution of Estimates of Variance Components,” Biometrics Bulletin, Vol. 2, No. 6, 1946, pp. 110-114. doi:10.2307/3002019
  20. J. T. Zhang, “Approximate and Asymptotic Distribution of χ2-Type Mixtures with Application,” Journal of the American Statistical Association, Vol. 100, No. 469, 2005, pp. 273-285. doi:10.1198/016214504000000575
  21. A. M. Kshirsagar, “Multivariate Analysis,” Marcel Dekker, New York, 1972.
