﻿ Decompositions of Symmetry Using Generalized Linear Diagonals-Parameter Symmetry Model and Orthogonality of Test Statistic for Square Contingency Tables

Open Journal of Statistics
Vol.3 No.6A(2013), Article ID:41361,5 pages DOI:10.4236/ojs.2013.36A002

Decompositions of Symmetry Using Generalized Linear Diagonals-Parameter Symmetry Model and Orthogonality of Test Statistic for Square Contingency Tables

Kouji Yamamoto1, Motoki Ohama2, Sadao Tomizawa2

1Department of Medical Innovation, Osaka University Hospital, Osaka, Japan

2Department of Information Sciences, Faculty of Science and Technology, Tokyo University of Science, Chiba, Japan

Email: yamamoto-k@hp-crc.med.osaka-u.ac.jp, o.motoki1211@gmail.com, tomizawa@is.noda.tus.ac.jp

Copyright © 2013 Kouji Yamamoto et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. In accordance of the Creative Commons Attribution License all Copyrights © 2013 are reserved for SCIRP and the owner of the intellectual property Kouji Yamamoto et al. All Copyright © 2013 are guarded by law and by SCIRP as a guardian.

Received October 10, 2013; revised November 10, 2013; accepted November 17, 2013

Keywords: Cumulative Probability; Global Symmetry; Linear Diagonals-Parameter Symmetry; Mean Equality; Ordinal Category; Orthogonal Test Statistic

ABSTRACT

For square contingency tables with ordered categories, the present paper gives several theorems that the symmetry model holds if and only if the generalized linear diagonals-parameter symmetry model for cell probabilities and for cumulative probabilities and the mean nonequality model of row and column variables hold. It also shows the orthogonality of statistic for testing goodness-of-fit of the symmetry model. An example is given.

1. Introduction

Consider an square contingency table with the same row and column classifications. Let denote the probability that an observation will fall in the ith row and jth column of the table Bowker [1] considered the symmetry (S) model defined by

This model describes the structure of symmetry with respect to the cell probabilities As a model which indicates the structure of asymmetry for Agresti [2] considered the linear diagonals-parameter symmetry (LDPS) model defined by

A special case of this model obtained by putting is the S model. Yamamoto and Tomizawa [3] considered the generalized linear diagonals-parameter symmetry (LDPS(K)) model as follows; for a fixed

Especially the LDPS(0) model is equivalent to the LDPS model.

Let for

and

The S model may be expressed as

Thus the S model also has the structure of symmetry with respect to the cumulative probabilities Miyamoto et al. [4] considered the cumulative linear diagonals-parameter symmetry (CLDPS) model defined by

which indicates a structure of asymmetry for The CLDPS model is different from the LDPS model. Yamamoto and Tomizawa [3] considered the generalized cumulative linear diagonals-parameter symmetry (CLDPS(K)) model as follows; for a fixed

Especially the CLDPS(0) model is equivalent to the CLDPS model.

Let and denote the row and column variables, respectively. We consider the mean equality (ME) model as

where and and

Yamamoto et al. [5] gave Theorem 1. The S model holds if and only if both the LDPS and ME models hold.

Yamamoto and Tomizawa [6] gave Theorem 2. The S model holds if and only if both the CLDPS and ME models hold.

The present paper gives several decompositions of the S model using the LDPS(K) and CLDPS(K) models. It also proposes the mean nonequality model, and gives the orthogonal decomposition for testing goodness-of-fit of the S model. An example is given.

2. Decompositions of Symmetry Model

We shall give five kinds of decompositions of the S model using the LDPS(K) and CLDPS(K) models.

Theorem 3. For a fixed the S model holds if and only if both the LDPS(K) and ME models hold.

Proof. If the S model holds, then both the LDPS(K) and ME models hold. Conversely, assuming that the LDPS(K) and ME models hold and then we shall show that the S model holds. The ME model may be expressed as

From the LDPS(K) model, we see

Therefore we obtain. Namely the S model holds. The proof is completed.

Theorem 4. For a fixed the S model holds if and only if both the CLDPS(K) and ME models hold.

Considering the global symmetry (GS) model as

namely

we obtain Theorem 5. For a fixed the S model holds if and only if both the LDPS(K) and GS models hold.

We shall omit the proofs of Theorems 4 and 5 because these are obtained in a similar manner to the proof of Theorem 3.

For a fixed consider the mean nonequality (MNE(K)) model as follows:

which is

This model indicates that the difference between the means of and is times higher than the difference between the global symmetric probabilities. When the MNE(0) model is identical to the ME model. We obtain Theorem 6. For a fixed the S model holds if and only if both the LDPS(K) and MNE(K) models hold.

Theorem 7. For a fixed and for a fixed the S model holds if and only if both the LDPS(K) and MNE(L) models hold.

We shall omit the proofs of Theorems 6 and 7 because there are obtained in a similar manner to the proof of Theorem 3. Note that: 1) Theorem 6 is an extension of Theorem 1 because when Theorem 6 is identical to Theorem 1; 2) Theorem 7 is an extension of Theorem 3 because when Theorem 7 is identical to Theorem 3; and 3) Theorem 7 is an extension of Theorem 6 because when Theorem 7 is identical to Theorem 6.

3. Test Statistic and Orthogonality

Let denote the observed frequency in the ith row and jth column of the table with and let denote the corresponding expected frequency. Assume that has a multinomial distribution. The maximum likelihood estimates of expected frequencies under each model could be obtained, for example, using the Newton-Raphson method to the log-likelihood equations. Each model (say, model) can be tested for goodness-of-fit by the likelihood ratio chi-squared statistic with the corresponding degrees of freedom, defined by

where is the maximum likelihood estimate of under the model. The number of degrees of freedom for the S model is and that for each of the LDPS(K) and CLDPS(K) models is (being one less than that for the S model). That for each of ME, GS, and MNE(K) models is 1. Note that the number of degrees of freedom for the S model is equal to the sum of those for the decomposed models.

Lang and Agresti [7] and Lang [8] considered the simultaneous modeling of a model for the joint distribution and a model for the marginal distribution. Aitchison [9] discussed the asymptotic separability, which is equivalent to the orthogonality in Read [10] and the independence in Darroch and Silvey [11], of the test statistic for goodness-of-fit of two models (also see Tomizawa and Tahata [12], Tahata et al. [13], and Tahata and Tomizawa [14]). On the orthogonality of test statistic for models in Theorem 6, we obtain.

Theorem 8. For a fixed test statistic is asymptotically equivalent to the sum of and

Proof. The LDPS(K) model may be expressed as

(1)

where Let

where “t” denotes the transpose, and

is the vector. The LDPS(K) model is expressed as

where is the matrix with and is the vector with

where

and is matrix of 0 or 1 elements determined from (1). The matrix is full column rank which is In a similar manner to Haber [15], Lang and Agresti [7], and Tahata and Tomizawa [16], we denote the linear space spanned by columns of the matrix by with the dimension Note that where is the vector of 1 elements, and thus Let be an where full column rank matrix such that the linear space is the orthogonal component of the space Thus, where is the zero matrix. Therefore, the LDPS(K) model is expressed as

where is the zero matrix, and

The MNE(K) model may be expressed as

where

Note that From Theorem 6, the S model may be expressed as

where

Note that are the numbers of degrees of freedom for testing goodness-of-fit of the LDPS(K), MNE(K) and S models, respectively.

Let denote the matrix of partial derivatives of with respect to i.e., Let where denotes a diagonal matrix with ith component of as ith diagonal component. We see that

because and that

Thus we obtain

Therefore we obtain where

From the asymptotic equivalence of the Wald statistic and the likelihood ratio statistic (Rao [17], Darroch and Silvey [11], Aitchison [9]), we obtain Theorem 8. The proof is completed.

4. Analysis of Data

Table 1 taken directly from Agresti [18, p. 232] summarizes responses to the questions “How successful is the government in (1) providing health care for the sick? (2) Protecting the environment?”.

Table 2 gives the values of the likelihood ratio test statistic for models applied to these data. The S model does not fit these data so well. Also, each of the ME (i.e., MNE(0)), MNE(K) and the GS models does not fit these data so well. However each of the LDPS(K) models and the CLDPS(K) models fit these data very well. Using Theorems 3 through 7 (including Theorems 1 and 2), we shall consider the reason why the S model fits these data poorly. For the structure of cell probabilities we see from Theorems 3, 5, 6 and 7 that the poor fit of the S model is caused by the influence of the lack of structure of the ME model (the GS model or the MNE(K) model) rather than the LDPS(K) model For the structure of cumulative probabilities we see from Theorem 4 that the poor fit of the S model is caused by the influence of the lack of structure of the ME model rather than the CLDPS(K) model

Table 1. Data on success of US government in providing health care and protecting the environment; from Agresti [18, p. 232]. (Upper and lower parenthesized values are maximum likelihood estimates of expected frequencies under the LDPS(1) and CLDPS(4) models, respectively.)

Table 2. Values of likelihood ratio chi-squared statistic G2 for models applied to the data in Table 1.

5. Concluding Remarks

We have given the new five kinds of decompositions of the S model. Theorems 3 and 4 are extensions of Theorems 1 and 2, respectively. Theorem 5 is another decomposition of the S model. Theorem 6 is another extension of Theorem 1, and Theorem 7 is an extension of Theorem 3. These theorems may be useful for seeing in more details the reason for the poor fit when the S model fits the data poorly.

From the orthogonality of test statistic given by Theorem 8, we point out that for instance, the likelihood ratio chi-squared statistic for testing goodness-of-fit of the S model assuming that the LDPS(K) model holds true is and this is asymptotically equivalent to the likelihood ratio chi-squared statistic for testing goodness-of-fit of the MNE(K) model, i.e., We see that for the data in Table 1 the value of is very close to the sum of the values of and (see Table 2). The orthogonal decomposition of the S model into the LDPS(K) and MNE(K) models would guarantee that (1) if both the LDPS(K) and MNE(K) models are accepted (e.g., at the 0.05 significance level) with high probability, then the S model would be accepted, and (2) it would be impossible to arise such an incompatible situation that both the LDPS(K) and MNE(K) models are accepted with high probability but the S model is rejected with high probability. Therefore, in particular Theorems 6 and 8 would be useful for analyzing the data.

REFERENCES

1. A. H. Bowker, “A Test for Symmetry in Contingency Tables,” Journal of the American Statistical Association, Vol. 43, No. 244, 1948, pp. 572-574. http://dx.doi.org/10.1080/01621459.1948.10483284
2. A. Agresti, “A Simple Diagonals-Parameter Symmetry and Quasi-Symmetry Model,” Statistics and Probability Letters, Vol. 1, No. 6, 1983, pp. 313-316. http://dx.doi.org/10.1016/0167-7152(83)90051-2
3. K. Yamamoto and S. Tomizawa, “Statistical Analysis of Case-Control Data of Endometrial Cancer Based on New Asymmetry Models,” Journal of Biometrics and Biostatistics, Vol. 3, No. 5, 2012, pp. 1-4. http://dx.doi.org/10.4172/2155-6180.1000147
4. N. Miyamoto, W. Ohtsuka and S. Tomizawa, “Linear Diagonals-Parameter Symmetry and Quasi-Symmetry Models for Cumulative Probabilities in Square Contingency Tables with Ordered Categories,” Biometrical Journal, Vol. 46, No. 6, 2004, pp. 664-674. http://dx.doi.org/10.1002/bimj.200410066
5. H. Yamamoto, T. Iwashita and S. Tomizawa, “Decomposition of Symmetry into Ordinal Quasi-Symmetry and Marginal Equimoment for Multi-way Tables,” Austrian Journal of Statistics, Vol. 36, No. 4, 2007, pp. 291-306.
6. K. Yamamoto and S. Tomizawa, “Analysis of Unaided Vision Data Using New Decomposition of Symmetry,” American Medical Journal, Vol. 3, No. 1, 2012, pp. 37- 42. http://dx.doi.org/10.3844/amjsp.2012.37.42
7. J. B. Lang and A. Agresti, “Simultaneously Modeling Joint and Marginal Distributions of Multivariate Categorical Responses,” Journal of the American Statistical Association, Vol. 89, No. 426, 1994, pp. 625-632. http://dx.doi.org/10.1080/01621459.1994.10476787
8. J. B. Lang, “On the Partitioning of Goodness-of-Fit Statistics for Multivariate Categorical Response Models,” Journal of the American Statistical Association, Vol. 91, No. 435, 1996, pp. 1017-1023. http://dx.doi.org/10.1080/01621459.1996.10476972
9. J. Aitchison, “Large-Sample Restricted Parametric Tests,” Journal of the Royal Statistical Society: Series B, Vol. 24, No. 1, 1962, pp. 234-250.
10. C. B. Read, “Partitioning Chi-Square in Contingency Tables: A Teaching Approach,” Communications in Statistics-Theory and Methods, Vol. 6, No. 6, 1977, pp. 553- 562. http://dx.doi.org/10.1080/03610927708827513
11. J. N. Darroch and S. D. Silvey, “On Testing More than One Hypothesis,” Annals of Mathematical Statistics, Vol. 34, No. 2, 1963, pp. 555-567. http://dx.doi.org/10.1214/aoms/1177704168
12. S. Tomizawa and K. Tahata, “The Analysis of Symmetry and Asymmetry: Orthogonality of Decomposition of Symmetry into Quasi-Symmetry and Marginal Symmetry for Multi-Way Tables,” Journal de la Société Francaise de Statistique, Vol. 148, No. 3, 2007, pp. 3-36.
13. K. Tahata, H. Yamamoto and S. Tomizawa, “Orthogonality of Decompositions of Symmetry into Extended Symmetry and Marginal Equimoment for Multi-Way Tables with Ordered Categories,” Austrian Journal of Statistics, Vol. 37, No. 2, 2008, pp. 185-194.
14. K. Tahata and S. Tomizawa, “Orthogonal Decomposition of Point-Symmetry for Multiway Tables,” Advances in Statistical Analysis, Vol. 92, No. 3, 2008, pp. 255-269. http://dx.doi.org/10.1007/s10182-008-0070-5
15. M. Haber, “Maximum Likelihood Methods for Linear and Log-Linear Models in Categorical Data,” Computational Statistics and Data Analysis, Vol. 3, No. 1, 1985, pp. 1- 10. http://dx.doi.org/10.1016/0167-9473(85)90053-2
16. K. Tahata and S. Tomizawa, “Double Linear DiagonalsParameter Symmetry and Decomposition of Double Symmetry for Square Tables,” Statistical Methods and Applications, Vol. 19, No. 3, 2010, pp. 307-318. http://dx.doi.org/10.1007/s10260-009-0127-y
17. C. R. Rao, “Linear Statistical Inference and its Applications,” 2nd Edition, John Wiley, New York, 1973. http://dx.doi.org/10.1002/9780470316436
18. A. Agresti, “Analysis of Ordinal Categorical Data,” 2nd Edition, John Wiley, Hoboken, 2010. http://dx.doi.org/10.1002/9780470594001