Extended Diagonal Exponent Symmetry Model and Its Orthogonal Decomposition in Square Contingency Tables with Ordered Categories

doi:10.4236/ojs.2015.54028

Open Journal of Statistics
Vol.05 No.04(2015), Article ID:56873,10 pages
Kiyotaka Iki, Akira Shibuya, Sadao Tomizawa

Department of Information Sciences, Tokyo University of Science, Noda, Japan

Email: iki@is.noda.tus.ac.jp

This work is licensed under the Creative Commons Attribution International License (CC BY).

http://creativecommons.org/licenses/by/4.0/

Received 31 March 2015; accepted 31 May 2015; published 3 June 2015

ABSTRACT

For square contingency tables with ordered categories, this article proposes new models that extend Tomizawa’s [1] diagonal exponent symmetry model. It also gives a decomposition of the proposed model and shows the orthogonality of the test statistics for the decomposed models. Examples are given, together with simulation studies based on the bivariate normal distribution.

Keywords:

Diagonal Exponent Symmetry, Ordinal Category, Orthogonal Decomposition, Quasi-Symmetry, Square Contingency Table

1. Introduction

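Consider an $R \times R$ square contingency table with the same row and column classifications. Let $p_{ij}$ denote the probability that an observation will fall in the $i$th row and $j$th column of the table $(i = 1, \ldots, R;\ j = 1, \ldots, R)$. The symmetry (S) model is defined by

$$p_{ij} = \psi_{ij} \quad (i = 1, \ldots, R;\ j = 1, \ldots, R),$$

where $\psi_{ij} = \psi_{ji}$; see Bowker [2]. Caussinus [3] considered the quasi-symmetry (QS) model defined by

$$p_{ij} = \alpha_i \beta_j \psi_{ij} \quad (i = 1, \ldots, R;\ j = 1, \ldots, R),$$

where $\psi_{ij} = \psi_{ji}$. The marginal homogeneity (MH) model is defined by

$$p_{i \cdot} = p_{\cdot i} \quad (i = 1, \ldots, R),$$

where $p_{i \cdot} = \sum_{t=1}^{R} p_{it}$ and $p_{\cdot i} = \sum_{s=1}^{R} p_{si}$; see Stuart [4]. Caussinus [3] showed that the S model holds if and only if both the QS and MH models hold.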
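As an illustration of the simplest of these models, the fit of the S model can be assessed with a likelihood ratio statistic, using the fact that the MLE of the expected frequency in cell (i, j) under symmetry is the average of the (i, j) and (j, i) counts. The following Python sketch is not part of the original analysis: the function name `symmetry_g2` and the 3 × 3 counts are hypothetical. Bowker's test [2] is usually stated in its Pearson form; the likelihood ratio form is used here because it matches the statistics used later in this article.

```python
import math


def symmetry_g2(table):
    """Likelihood ratio statistic G^2 and df for the symmetry (S) model."""
    r = len(table)
    g2 = 0.0
    for i in range(r):
        for j in range(r):
            if i == j:
                continue  # diagonal cells are fitted exactly under S
            n_ij = table[i][j]
            # MLE under S: average of the two symmetric cells
            m_ij = (table[i][j] + table[j][i]) / 2.0
            if n_ij > 0:  # a zero count contributes nothing to G^2
                g2 += 2.0 * n_ij * math.log(n_ij / m_ij)
    df = r * (r - 1) // 2
    return g2, df


# Hypothetical 3 x 3 table of counts (not data from this article)
counts = [[20, 10, 5],
          [6, 30, 8],
          [3, 12, 25]]
g2, df = symmetry_g2(counts)
print(f"G2(S) = {g2:.3f} with {df} df")
```

A perfectly symmetric table gives a statistic of zero, and larger values indicate stronger departures from symmetry.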

Tomizawa [1] considered the diagonal exponent symmetry (DES) model defined by

By putting and this model is also expressed as

Note that the DES model implies the S model; thus the DES model implies the QS (MH) model. The DES model states that is times higher than; in other words, for fixed distance k from the main diagonal of the table, increase (decrease) exponentially along every subdiagonal of the table as the value i increase.

Iki, Yamamoto and Tomizawa [5] considered the quasi-diagonal exponent symmetry (QDES) model defined by

A special case of the QDES model obtained by putting is the DES model. Note that the QDES model implies the QS model. Let X and Y denote the row and column variables, respectively. We define the mean equality (ME) model as $E(X) = E(Y)$. Iki et al. [5] showed that the DES model holds if and only if both the QDES and ME models hold.

Iki et al. [5] described the relationship between the QDES model and a joint bivariate normal distribution, and showed that the QDES model may be appropriate for a square ordinal table if it is reasonable to assume an underlying bivariate normal distribution with equal marginal variances. We are interested in a new model that is appropriate for a square ordinal table when it is reasonable to assume an underlying bivariate normal distribution without equal marginal variances, and in a decomposition using the proposed models.

The present paper proposes two models and gives a decomposition using them. It also shows the orthogonality of the test statistics for the decomposed models.

2. New Models

Consider a model defined by

A special case of this model obtained by putting is the DES model. Thus we shall refer to this model as the extended diagonal exponent symmetry (EDES) model. The EDES model states that is times higher than; in other words, for fixed distance from the main diagonal of the table, the ratio of to increases (decreases) exponentially along every subdiagonal of the table. Note that the EDES model implies the S model.

Next, consider a model defined by

A special case of this model obtained by putting is the QDES model. Thus we shall refer to this model as the extended quasi-diagonal exponent symmetry (EQDES) model. A special case of the EQDES model obtained by putting and is the EDES model. The EQDES model states that is times higher than; in other words, for fixed distance from the main diagonal of the table, the ratio of to increases (decreases) exponentially along every subdiagonal of the table. Note that the EQDES model implies the QS model.

Under the EQDES model, we can see

where. This indicates that the odds that an observation will fall in the th cell, instead of the th cell is times higher than the odds that the observation will fall in the th cell, instead of the th cell. Also we can see

where and. If, for corresponding i and j, the structure of holds. Also if, the structure of holds.

In Figure 1, we show the relationships among the models. In the figure, $A \rightarrow B$ indicates that model A implies model B.

3. Decomposition

We refer to the model of equality of marginal means and variances, i.e., $E(X) = E(Y)$ and $\mathrm{Var}(X) = \mathrm{Var}(Y)$, as the MVE model. This model is also expressed as $E(X) = E(Y)$ and $E(X^2) = E(Y^2)$. We obtain the decomposition of the EDES model as follows:

Theorem 1. The EDES model holds if and only if the EQDES and MVE models hold.

Proof. If the EDES model holds, then the EQDES and MVE models hold. Assuming that both the EQDES and MVE models hold, then we shall show that the EDES model holds. Let denote the cell probabilities which satisfy both the EQDES and MVE models. Since the EQDES model holds, we see

(1)

Let with We denote that with Then, since satisfy the EQDES and MVE models, we see

(2)

and

(3)

where with and.

Figure 1. Relationships among models.

Then, we denote by and by

Consider the arbitrary cell probabilities satisfying

(4)

where and

From (2), (3) and (4), we see

(5)

Using Equation (5), we obtain

where

and is the Kullback-Leibler information between and. Since being a function of is fixed, we see

and then uniquely minimizes (see Bhapkar and Darroch [6] ).

Let for Then

(6)

Note that Equation (6) is also expressed as

(7)

From (3), (4) and (7), we see

(8)

Using Equation (8), we obtain

Since being a function of is fixed, we see

and then uniquely minimizes Therefore, we see Thus,

From (1) and (6), for, we see

Thus, we obtain and Namely, the EDES model holds. The proof is completed.
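The MVE conditions appearing in Theorem 1 can be inspected descriptively by comparing the sample means and variances of the row and column marginal distributions. The sketch below assumes integer scores 1, ..., R for both variables (an assumption made here for illustration); the function name `marginal_moments` and the counts are hypothetical.

```python
def marginal_moments(table):
    """Return (mean_X, mean_Y, var_X, var_Y) using integer scores 1..R.

    X is the row variable and Y is the column variable.
    """
    r = len(table)
    n = sum(sum(row) for row in table)
    # Row and column marginal proportions
    row_p = [sum(table[i]) / n for i in range(r)]
    col_p = [sum(table[i][j] for i in range(r)) / n for j in range(r)]
    scores = list(range(1, r + 1))  # assumed scores 1, ..., R
    mean_x = sum(s * p for s, p in zip(scores, row_p))
    mean_y = sum(s * p for s, p in zip(scores, col_p))
    # Var(X) = E(X^2) - E(X)^2, so equal means and equal second
    # moments together give equal variances
    var_x = sum(s * s * p for s, p in zip(scores, row_p)) - mean_x ** 2
    var_y = sum(s * s * p for s, p in zip(scores, col_p)) - mean_y ** 2
    return mean_x, mean_y, var_x, var_y


# Hypothetical 3 x 3 table of counts (not data from this article)
counts = [[20, 10, 5],
          [6, 30, 8],
          [3, 12, 25]]
print(marginal_moments(counts))
```

Large differences between the two means or the two variances suggest that the MVE model, and hence the EDES model, is unlikely to fit; a formal assessment still requires the goodness-of-fit test.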

4. Orthogonality of Test Statistics

Let $n_{ij}$ denote the observed frequency in the $(i, j)$th cell of the table with $n = \sum \sum n_{ij}$, and let $m_{ij}$ denote the corresponding expected frequency. Assume that $\{n_{ij}\}$ have a multinomial distribution. The maximum likelihood estimates (MLEs) of $\{m_{ij}\}$ under the EDES and EQDES models can be obtained using iterative procedures; see, for example, Darroch and Ratcliff [7]. The MLEs of $\{m_{ij}\}$ under the MVE model can be obtained by applying the Newton-Raphson method to the log-likelihood equations.

Let $G^2(M)$ denote the likelihood ratio chi-squared statistic for testing the goodness-of-fit of model M. The numbers of degrees of freedom (df) for the EDES and EQDES models are and, respectively.
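For reference, the likelihood ratio chi-squared statistic has the standard form below. The notation $G^2(M)$ and $\hat{m}_{ij}$ (the MLE of the expected frequency in the $(i, j)$th cell under model M) is assumed here, and terms with $n_{ij} = 0$ are taken to be zero.

```latex
G^2(M) = 2 \sum_{i=1}^{R} \sum_{j=1}^{R} n_{ij} \log\left( \frac{n_{ij}}{\hat{m}_{ij}} \right)
```

When a model $M_1$ is a special case of a model $M_2$, the difference $G^2(M_1) - G^2(M_2)$ is commonly used to test $M_1$ assuming that $M_2$ holds, with df equal to the difference between the df of the two models.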

The orthogonality (asymptotic separability or independence) of the test statistics for goodness-of-fit of two models is discussed by, e.g., Darroch and Silvey [8] and Read [9]. We obtain the following theorem:

Theorem 2. The test statistic $G^2(\mathrm{EDES})$ is asymptotically equivalent to the sum of $G^2(\mathrm{EQDES})$ and $G^2(\mathrm{MVE})$.

Proof. The EQDES model is expressed as

(9)

where Let

where “t” denotes the transpose, and

is the vector. The EQDES model is expressed as

where X is the matrix with (the R² × 1 vector), (the R² × 1 vector), (the R² × 1 vector), (the R² × 1 vector), and is the matrix of 1 or 0 elements determined from (9), is the vector of 1 elements, and denotes the Kronecker product. The matrix X is full column rank which is K. In a similar manner to Haber [10] , we denote the linear space spanned by the columns of the matrix X by with the dimension K.

Let U be an, where, full column rank matrix such that is the orthogonal complement of. Thus, , where is the s × t zero matrix. Therefore the EQDES model is expressed as

where is the zero vector, and. The MVE model is expressed as

where, and, with being the matrix. Namely,. Thus belongs to Hence From Theorem 1, the EDES model is expressed as

where and.

Let denote the matrix of partial derivative of with respect to p, i.e., Let, where denotes a diagonal matrix with ith component of p as ith diagonal component. Let denote p with replaced by. Then has asymptotically a normal distribution with mean and covariance matrix. Using the delta method, has asymptotically a normal distribution with mean and covariance matrix

Note that belongs to because. Thus. Since

and, we see

Thus, we obtain where

(10)

Under each, the Wald statistic has asymptotically a chi-squared distribution with degrees of freedom. From (10), we see that From the asymptotic equivalence of the Wald statistic and likelihood ratio statistic, we obtain Theorem 2.

5. Examples

Example 1. Consider the data in Table 1, taken from Bishop, Fienberg and Holland [11], which describe the cross-classification of father’s and son’s occupational status categories in Denmark. The rows give the father’s status category and the columns give the son’s status category. The categories are ordered from (1) to (5) (high to low). These data have also been analyzed by several authors; see, for example, Kullback [12], Haberman [13], Goodman [14], and Yamamoto, Tahata and Tomizawa [15].

Table 1. Occupational status for Danish father-son pairs; from Bishop et al. [11] . (The parenthesized values are MLEs of expected frequencies under the EQDES model.)

Note: Status (1) is high professionals, (2) white-collar employees of higher education, (3) white-collar employees of less high education, (4) upper working class, and (5) unskilled workers.

We see from Table 3 that the EQDES and QS models fit these data well, although the other models fit poorly. The EQDES model is a special case of the QS model. We shall test the hypothesis that the EQDES model holds assuming that the QS model holds for these data. Since with 6 df being the difference between the numbers of df for the EQDES and the QS models, this hypothesis is accepted at the 0.05 significance level. Therefore, the EQDES model would be preferable to the QS model.
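The conditional test described above compares the difference between two likelihood ratio statistics with a chi-squared distribution whose df is the difference between the df of the two models. A minimal Python sketch follows; it uses a closed form for the chi-squared upper-tail probability that is valid only for even df, which covers the df differences of 6 and 2 used in the examples. The function names and the numerical inputs are hypothetical and are not values from Table 3.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-squared variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # P(X > x) = exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def conditional_test(g2_restricted, df_restricted, g2_general, df_general,
                     alpha=0.05):
    """Test the restricted model assuming the more general model holds."""
    diff = g2_restricted - g2_general
    df = df_restricted - df_general
    p_value = chi2_sf_even_df(diff, df)
    return diff, df, p_value, p_value < alpha


# Hypothetical statistics: restricted model (12 df) versus general model (6 df)
diff, df, p_value, rejected = conditional_test(15.0, 12, 8.0, 6)
print(f"difference = {diff:.2f}, df = {df}, p = {p_value:.4f}, rejected = {rejected}")
```

If the restricted model is not rejected, it is usually preferred to the general model because it has fewer parameters.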

Under the EQDES model, the MLEs of and are and respectively. Therefore the probability that a father’s and his son’s status categories are and, respectively, is estimated to be times higher than the probability that those are i and j, respectively. Since the values of for and are greater than 1 and it for is less than 1 (see Table 2), the probability that a father’s and his son’s status categories are and, respectively, is estimated to be greater than the probability that those are i and j, respectively.

Also the MLEs of are, , , , ,

and, respectively. Therefore, it is estimated that there is the structure of for with and for with and 4.

We see from Table 3 that the poor fit of the EDES model is caused by the poor fit of the MVE model rather than of the EQDES model.

Example 2. Consider the data in Table 4, taken from Tomizawa [16]. These data describe the unaided distance vision of 3168 pupils, comprising nearly equal numbers of boys and girls aged 6 - 12, at elementary schools in Tokyo, Japan, examined in June 1984. These data have also been analyzed by Tomizawa [1], Tahata and Tomizawa [17], and Iki et al. [5]. The rows give the right eye grade and the columns give the left eye grade.

We see from Table 3 that the EDES and EQDES models fit these data well, although the MVE model fits poorly. The EDES model is a special case of the EQDES model. We shall test the hypothesis that the EDES model holds assuming that the EQDES model holds for these data. Since

with 2 df being the difference between the numbers of df for the EDES and the EQDES models, this hypothesis is rejected at the 0.05 significance level. Therefore, the EQDES model would be preferable to the EDES model.

Table 2. Values of, , under the EQDES model applied to Table 1.

Table 3. Likelihood ratio chi-squared values for models applied to Table 1 and Table 4.

* means significant at the 0.05 level.

Under the EQDES model, the MLEs of and are and respectively. Therefore the probability that a pupil’s right eye grade and his or her left eye grade are and, respectively, is estimated to be times higher than the probability that those are i and j, respectively. Since all values of, , are less than 1 (see Table 5), the probability that a pupil’s right eye grade and his or her left eye grade are and, respectively, is estimated to be less than the probability that those are i and j, respectively.

Also the MLEs of are, , , and, respectively. Therefore, it is estimated that there is the structure of for with and 4 and for with and 7.

6. Simulation Studies

Under the QDES model, we see the structure of which is the structure of Agresti’s [18] linear diagonals-parameter symmetry model, and under the EQDES model, we see the structure of which is the structure of Tomizawa’s [19] extended linear diagonals-parameter symmetry model. Also under the DES and EDES models, we see the structure of for.

Table 4. Unaided distance vision of 3168 pupils comprising nearly equal number of boys and girls aged 6 - 12 at elementary schools in Tokyo, Japan, examined in June 1984; from Tomizawa [16] . (Upper and lower parenthesized values are MLEs of expected frequencies under the EDES and EQDES models, respectively.)

Table 5. Values of, , under the EQDES model applied to Table 4.

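Consider now random variables U and V having a joint bivariate normal distribution with means $E(U) = \mu_1$ and $E(V) = \mu_2$, variances $\mathrm{Var}(U) = \sigma_1^2$ and $\mathrm{Var}(V) = \sigma_2^2$, and correlation $\rho$. Then the joint bivariate normal density function $f(u, v)$ satisfies

$$\frac{f(u, v)}{f(v, u)} = \exp\left\{ a (v - u) + b (v^2 - u^2) \right\}, \quad a = \frac{1}{1 - \rho^2} \left( \frac{\mu_2}{\sigma_2^2} - \frac{\mu_1}{\sigma_1^2} + \frac{\rho (\mu_2 - \mu_1)}{\sigma_1 \sigma_2} \right), \quad b = \frac{1}{2 (1 - \rho^2)} \left( \frac{1}{\sigma_1^2} - \frac{1}{\sigma_2^2} \right).$$

Namely, $f(u, v) / f(v, u)$ has the form $\xi^{v - u} \eta^{v^2 - u^2}$ for constants $\xi = e^a$ and $\eta = e^b$; when $\sigma_1^2 = \sigma_2^2$, we have $\eta = 1$. Agresti [18] described the relationship between the linear diagonals-parameter symmetry model and the joint bivariate normal distribution (see also Tomizawa [19]). We now consider the relationship between the QDES (DES) and EQDES (EDES) models and the joint bivariate normal distribution in terms of simulation studies.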
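The form of the density ratio can be checked numerically. The sketch below evaluates the log of the bivariate normal density directly and compares log f(u, v) - log f(v, u) with a(v - u) + b(v² - u²), where the constants a and b are obtained by expanding the quadratic form of the density. The symbols a and b, the function names and the parameter values are introduced here for illustration only.

```python
import math


def log_density(u, v, mu1, mu2, s1, s2, rho):
    """Log of the bivariate normal density at (u, v); s1, s2 are standard deviations."""
    zu = (u - mu1) / s1
    zv = (v - mu2) / s2
    q = zu * zu - 2.0 * rho * zu * zv + zv * zv
    log_norm = -math.log(2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - rho * rho))
    return log_norm - q / (2.0 * (1.0 - rho * rho))


def ratio_exponents(mu1, mu2, s1, s2, rho):
    """Constants a, b with log{f(u,v)/f(v,u)} = a(v - u) + b(v^2 - u^2)."""
    a = (mu2 / s2 ** 2 - mu1 / s1 ** 2
         + rho * (mu2 - mu1) / (s1 * s2)) / (1.0 - rho ** 2)
    b = (1.0 / s1 ** 2 - 1.0 / s2 ** 2) / (2.0 * (1.0 - rho ** 2))
    return a, b


# Hypothetical parameter values (not those used for Table 6)
params = dict(mu1=0.0, mu2=0.3, s1=1.0, s2=1.4, rho=0.3)
a, b = ratio_exponents(**params)

max_gap = 0.0
for u, v in [(0.5, -1.2), (-0.7, 0.0), (2.0, 1.1)]:
    direct = log_density(u, v, **params) - log_density(v, u, **params)
    closed = a * (v - u) + b * (v * v - u * u)
    max_gap = max(max_gap, abs(direct - closed))
print("largest discrepancy:", max_gap)
```

With equal standard deviations the constant b is zero, so only the linear term in v - u remains.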

Table 6 gives the tables of sample size 5000, formed by using cut points for each variable at $\mu_1$, $\mu_1 \pm 0.7\sigma_1$, from an underlying bivariate normal distribution with $\rho = 0.3$ under four settings of the marginal means and variances (Tables 6(a)-6(d)).
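A table of this kind can be generated as in the following Python sketch, which draws pairs from a bivariate normal distribution and classifies both variables with the three cut points μ₁ - 0.7σ₁, μ₁ and μ₁ + 0.7σ₁, giving a 4 × 4 table. Only the sample size, the cut points and ρ = 0.3 are taken from the text; the remaining parameter values, the seed and the function name are hypothetical, and the resulting counts will not reproduce Table 6.

```python
import bisect
import math
import random


def simulate_table(n, mu1, mu2, sigma1, sigma2, rho, seed=0):
    """Cross-classify n bivariate normal draws using common cut points."""
    rng = random.Random(seed)
    # The same cut points are applied to both variables
    cuts = [mu1 - 0.7 * sigma1, mu1, mu1 + 0.7 * sigma1]
    size = len(cuts) + 1
    table = [[0] * size for _ in range(size)]
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        u = mu1 + sigma1 * z1
        # Mixing z1 and z2 gives correlation rho between u and v
        v = mu2 + sigma2 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        i = bisect.bisect_right(cuts, u)  # row category of u
        j = bisect.bisect_right(cuts, v)  # column category of v
        table[i][j] += 1
    return table


# Hypothetical setting with unequal means and unequal variances
table = simulate_table(5000, mu1=0.0, mu2=0.2, sigma1=1.0, sigma2=1.2, rho=0.3)
for row in table:
    print(row)
```

Varying `mu2` and `sigma2` relative to `mu1` and `sigma1` gives settings with or without equal marginal means and variances, which is the contrast examined in the simulation studies.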

Table 6. The tables of sample size 5000, formed by using cut points for each variable at μ₁, μ₁ ± 0.7σ₁, from an underlying bivariate normal distribution with the conditions ρ = 0.3 and (a) and, (b) and, (c) and, (d) and.

Table 7. Likelihood ratio chi-squared values for models applied to Tables 6(a)-6(d).

* means significant at the 0.05 level.

We see from Table 7 that the EQDES model fits well for each of Tables 6(a)-6(d), whereas the QDES model fits well for Tables 6(a) and 6(b) but poorly for Tables 6(c) and 6(d). The DES and EDES models fit well for Table 6(a) and poorly for each of Tables 6(b)-6(d). Thus the EQDES model may be appropriate for a square ordinal table if it is reasonable to assume an underlying bivariate normal distribution (without the equality of marginal variances), the QDES model may be appropriate if equal marginal variances can also be assumed, and the DES and EDES models may be appropriate if both equal marginal means and equal marginal variances can be assumed.

7. Concluding Remarks

Theorem 1 may be useful for identifying the reason for a poor fit when the EDES model fits the data poorly; indeed, as seen in Example 1, the poor fit of the EDES model is caused by the poor fit of the MVE model rather than of the EQDES model.

From Theorem 2, we point out that the likelihood ratio statistic for the EDES model can be easily approximated using those for the EQDES and MVE models; indeed, as seen in Table 3, the value of $G^2(\mathrm{EDES})$ is very close to the sum of $G^2(\mathrm{EQDES})$ and $G^2(\mathrm{MVE})$.

From the simulation studies, the EQDES model may be appropriate for a square ordinal table if it is reasonable to assume an underlying bivariate normal distribution without requiring equal marginal means or equal marginal variances, whereas the QDES model may be appropriate if it is reasonable to assume equal marginal variances.

Acknowledgements

The authors would like to thank the editor and the referee for their helpful comments.

References

[1] Tomizawa, S. (1992) A Model of Symmetry with Exponents along Every Subdiagonal and Its Application to Data on Unaided Vision of Pupils at Japanese Elementary Schools. Journal of Applied Statistics, 19, 509-512. http://dx.doi.org/10.1080/02664769200000046
[2] Bowker, A.H. (1948) A Test for Symmetry in Contingency Tables. Journal of the American Statistical Association, 43, 572-574. http://dx.doi.org/10.1080/01621459.1948.10483284
[3] Caussinus, H. (1965) Contribution à l’analyse statistique des tableaux de corrélation. Annales de la Faculté des Sciences de l’Université de Toulouse, 29, 77-182.
[4] Stuart, A. (1955) A Test for Homogeneity of the Marginal Distributions in a Two-Way Classification. Biometrika, 42, 412-416. http://dx.doi.org/10.1093/biomet/42.3-4.412
[5] Iki, K., Yamamoto, K. and Tomizawa, S. (2014) Quasi-Diagonal Exponent Symmetry Model for Square Contingency Tables with Ordered Categories. Statistics and Probability Letters, 92, 33-38. http://dx.doi.org/10.1016/j.spl.2014.04.029
[6] Bhapkar, V.P. and Darroch, J.N. (1990) Marginal Symmetry and Quasi Symmetry of General Order. Journal of Multivariate Analysis, 34, 173-184. http://dx.doi.org/10.1016/0047-259X(90)90034-F
[7] Darroch, J.N. and Ratcliff, D. (1972) Generalized Iterative Scaling for Log-Linear Models. Annals of Mathematical Statistics, 43, 1470-1480. http://dx.doi.org/10.1214/aoms/1177692379
[8] Darroch, J.N. and Silvey, S.D. (1963) On Testing More than One Hypothesis. Annals of Mathematical Statistics, 34, 555-567. http://dx.doi.org/10.1214/aoms/1177704168
[9] Read, C.B. (1977) Partitioning Chi-Square in Contingency Table: A Teaching Approach. Communications in Statistics-Theory and Methods, 6, 553-562. http://dx.doi.org/10.1080/03610927708827513
[10] Haber, M. (1985) Maximum Likelihood Methods for Linear and Log-Linear Models in Categorical Data. Computational Statistics and Data Analysis, 3, 1-10. http://dx.doi.org/10.1016/0167-9473(85)90053-2
[11] Bishop, Y.M.M., Fienberg, S.E. and Holland, P.W. (1975) Discrete Multivariate Analysis: Theory and Practice. The MIT Press, Cambridge.
[12] Kullback, S. (1971) Marginal Homogeneity of Multidimensional Contingency Tables. Annals of Mathematical Statistics, 42, 594-606. http://dx.doi.org/10.1214/aoms/1177693409
[13] Haberman, S.J. (1974) The Analysis of Frequency Data. The University of Chicago Press, Chicago.
[14] Goodman, L.A. (1981) Association Models and the Bivariate Normal for Contingency Tables with Ordered Categories. Biometrika, 68, 347-355. http://dx.doi.org/10.1093/biomet/68.2.347
[15] Yamamoto, K., Tahata, K. and Tomizawa, S. (2012) Some Symmetry Models for the Analysis of Collapsed Square Contingency Tables with Ordered Categories. Calcutta Statistical Association Bulletin, 64, 21-36.
[16] Tomizawa, S. (1985) Analysis of Data in Square Contingency Tables with Ordered Categories Using the Conditional Symmetry Model and Its Decomposed Models. Environmental Health Perspectives, 63, 235-239. http://dx.doi.org/10.1289/ehp.8563235
[17] Tahata, K. and Tomizawa, S. (2006) Decompositions for Extended Double Symmetry Models in Square Contingency Tables with Ordered Categories. Journal of the Japan Statistical Society, 36, 91-106. http://dx.doi.org/10.14490/jjss.36.91
[18] Agresti, A. (1983) A Simple Diagonals-Parameter Symmetry and Quasi-Symmetry Model. Statistics and Probability Letters, 1, 313-316. http://dx.doi.org/10.1016/0167-7152(83)90051-2
[19] Tomizawa, S. (1991) An Extended Linear Diagonals-Parameter Symmetry Model for Square Contingency Tables with Ordered Categories. Metron, 49, 401-409.