An Extended Bivariate T-Distribution Type Symmetry Model for Square Contingency Tables

doi:10.4236/ojs.2018.82015

Open Journal of Statistics
Vol.08 No.02(2018), Article ID:83567,9 pages

Kiyotaka Iki^*, Masayuki Okada, Sadao Tomizawa


Department of Information Sciences, Faculty of Science and Technology, Tokyo University of Science, Chiba, Japan

This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).

http://creativecommons.org/licenses/by/4.0/

Received: November 2, 2017; Accepted: April 1, 2018; Published: April 4, 2018

ABSTRACT

The purpose of this paper is to propose a new model of asymmetry for square contingency tables with ordered categories. The new model may be appropriate for a square contingency table if it is reasonable to assume an underlying bivariate t-distribution with different marginal variances and any degrees of freedom. As the degrees of freedom become larger, the proposed model approaches the extended linear diagonals-parameter symmetry model, which may be appropriate for a square table if it is reasonable to assume an underlying bivariate normal distribution. A simulation study based on the bivariate t-distribution and an example are given.

Keywords:

Bivariate T-Distribution, Square Contingency Table, Symmetry, Underlying Distribution

1. Introduction

Consider an $R \times R$ square contingency table with the same row and column ordinal classifications. Let $p_{i j}$ denote the probability that an observation will fall in the ith row and jth column of the table ( $i = 1, \dots, R; j = 1, \dots, R$ ). The symmetry (S) model is defined by

$p_{i j} = p_{j i} (i < j);$ (1)

see Bowker [1] and Bishop et al. [2] . This model indicates a structure of symmetry of the probabilities with respect to the main diagonal of the table. Agresti [3] considered the linear diagonals-parameter symmetry (LDPS) model defined by

$p_{i j} = θ^{j - i} p_{j i} (i < j) .$ (2)

This indicates that the probability that an observation will fall in the $(i, j)$ th cell, $i < j$ , is $θ^{j - i}$ times higher than the probability that it falls in the $(j, i)$ th cell. A special case of the LDPS model obtained by putting $θ = 1$ is the S model. Furthermore, Tomizawa [4] proposed an extended linear diagonals-parameter symmetry (ELDPS) model defined by

$p_{i j} = θ_{1}^{j - i} θ_{2}^{j^{2} - i^{2}} p_{j i} (i < j) .$ (3)

This indicates that the probability that an observation will fall in the $(i, j)$ th cell, $i < j$ , is $θ_{1}^{j - i} θ_{2}^{j^{2} - i^{2}}$ times higher than the probability that it falls in the $(j, i)$ th cell.

Consider random variables X and Y having a joint bivariate normal distribution with means $E (X) = μ_{1}$ , $E (Y) = μ_{2}$ , variances $Var (X) = σ_{1}^{2}$ , $Var (Y) = σ_{2}^{2}$ and covariance $Cov (X, Y) = ρ σ_{1} σ_{2}$ , where ρ is the correlation coefficient $\mathrm{Corr} (X, Y)$ . The joint bivariate normal density $f (x, y)$ satisfies

$\frac{f (x, y)}{f (y, x)} = δ_{1}^{y - x} δ_{2}^{y^{2} - x^{2}},$ (4)

where

$δ_{1} = \exp [\frac{1}{1 - ρ^{2}} \{(\frac{μ_{2}}{σ_{2}^{2}} - \frac{μ_{1}}{σ_{1}^{2}}) + \frac{ρ (μ_{2} - μ_{1})}{σ_{1} σ_{2}}\}],$

$δ_{2} = \exp [\frac{1}{2 (1 - ρ^{2})} (\frac{1}{σ_{1}^{2}} - \frac{1}{σ_{2}^{2}})] .$

When $σ_{1}^{2} = σ_{2}^{2} (= σ^{2})$ , $f (x, y)$ satisfies

$\frac{f (x, y)}{f (y, x)} = δ^{y - x},$ (5)

where

$δ = \exp [\frac{1}{1 - ρ} (\frac{μ_{2} - μ_{1}}{σ^{2}})] .$

Agresti [3] [5] described the relationship between the LDPS model and the joint bivariate normal distribution as follows: the LDPS model may be appropriate for a square ordinal table if it is reasonable to assume an underlying bivariate normal distribution with equal marginal variances. Also, Tomizawa [4] pointed out that the ELDPS model may be appropriate for a square ordinal table if it is reasonable to assume an underlying bivariate normal distribution with different marginal variances.
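Relation (4) can be checked numerically. The following is a minimal sketch in Python, using only the standard library; the function names and the parameter values are illustrative assumptions, not part of the original study.

```python
import math

def bvn_pdf(x, y, mu1, mu2, s1, s2, rho):
    """Bivariate normal density with means mu1, mu2, standard deviations s1, s2, correlation rho."""
    zx = (x - mu1) / s1
    zy = (y - mu2) / s2
    q = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (1.0 - rho * rho)
    return math.exp(-0.5 * q) / (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - rho * rho))

def deltas(mu1, mu2, s1, s2, rho):
    """delta_1 and delta_2 as defined below Equation (4)."""
    c = 1.0 / (1.0 - rho * rho)
    d1 = math.exp(c * ((mu2 / s2**2 - mu1 / s1**2) + rho * (mu2 - mu1) / (s1 * s2)))
    d2 = math.exp(0.5 * c * (1.0 / s1**2 - 1.0 / s2**2))
    return d1, d2

# Illustrative parameter values (assumptions, not taken from the paper).
params = dict(mu1=0.0, mu2=0.2, s1=1.0, s2=1.3, rho=0.2)
x, y = -0.4, 0.9

lhs = bvn_pdf(x, y, **params) / bvn_pdf(y, x, **params)
d1, d2 = deltas(**params)
rhs = d1 ** (y - x) * d2 ** (y * y - x * x)
print(lhs, rhs)  # the two values should agree up to rounding error
```

Setting `s1 == s2` in this sketch makes `d2` equal to 1, which corresponds to the equal-variance case (5).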

Consider a bivariate t-distribution with m degrees of freedom. The limit of its joint probability density function as $m \to \infty$ is the bivariate normal density. Therefore, the LDPS model may be appropriate if it is reasonable to assume an underlying bivariate t-distribution with equal marginal variances and very large degrees of freedom m. Also, the ELDPS model may be appropriate if it is reasonable to assume an underlying bivariate t-distribution with different marginal variances and very large degrees of freedom m.

Consider the $R \times R$ square contingency table with ordered categories. For any fixed constant $m (m > 2)$ , Iki et al. [6] proposed the t-distribution type symmetry (TS(m)) model defined by

$p_{i j}^{- \frac{2}{m + 2}} - p_{j i}^{- \frac{2}{m + 2}} = η_{m} (j - i) (i < j) .$ (6)

A special case of this model obtained by putting $η_{m} = 0$ is the S model. The TS(m) model indicates that the difference between two symmetric probabilities, each raised to the power $- 2 / (m + 2)$ , is proportional to the distance from the main diagonal of the $R \times R$ table. The TS(m) model may be appropriate if it is reasonable to assume an underlying bivariate t-distribution with equal marginal variances having m degrees of freedom (see Iki et al. [6] ).

Now, we are interested in considering a new model, which is appropriate if it is reasonable to assume an underlying bivariate t-distribution with different marginal variances.

The purpose of the present paper is to introduce an extended TS(m) model. Section 2 proposes the extended TS(m) model and describes its properties. Section 3 examines the relationship between the extended TS(m) model and the bivariate t-distribution through a simulation study. Section 4 illustrates the use of the new model with data on father’s and son’s occupational structure. Section 5 provides some concluding remarks.

2. Extended Bivariate T-Distribution Type Symmetry Model

Consider random variables X and Y having a joint bivariate t-distribution with $m (m > 2)$ degrees of freedom, and means $E (X) = μ_{1}$ , $E (Y) = μ_{2}$ , variances $Var (X) = m σ_{1}^{2} / (m - 2)$ , $Var (Y) = m σ_{2}^{2} / (m - 2)$ , and correlation coefficient $\mathrm{Corr} (X, Y) = ρ$ . The probability density function $h (x, y)$ is

$h (x, y) = \frac{1}{2 π σ_{1} σ_{2} \sqrt{1 - ρ^{2}}} {(1 + \frac{Q (x, y)}{m})}^{- \frac{m + 2}{2}},$ (7)

where

$Q (x, y) = \frac{1}{1 - ρ^{2}} [{(\frac{x - μ_{1}}{σ_{1}})}^{2} - 2 ρ (\frac{x - μ_{1}}{σ_{1}}) (\frac{y - μ_{2}}{σ_{2}}) + {(\frac{y - μ_{2}}{σ_{2}})}^{2}] .$

See, e.g., Muirhead [7] . Another form of the probability density function $h (x, y)$ is expressed as

$h (x, y) = c {[1 + \frac{1}{m} (a_{1} x + b_{1} y + a_{2} x^{2} + b_{2} y^{2} + d (x, y))]}^{- \frac{m + 2}{2}},$ (8)

where

$\begin{array}{l} c = \frac{1}{2 π σ_{1} σ_{2} \sqrt{1 - ρ^{2}}}, \\ a_{1} = \frac{2}{σ_{1} (1 - ρ^{2})} (\frac{ρ μ_{2}}{σ_{2}} - \frac{μ_{1}}{σ_{1}}), b_{1} = \frac{2}{σ_{2} (1 - ρ^{2})} (\frac{ρ μ_{1}}{σ_{1}} - \frac{μ_{2}}{σ_{2}}), \end{array}$

$\begin{array}{l} a_{2} = \frac{1}{σ_{1}^{2} (1 - ρ^{2})}, b_{2} = \frac{1}{σ_{2}^{2} (1 - ρ^{2})}, \\ d (x, y) = \frac{1}{1 - ρ^{2}} (- \frac{2 ρ}{σ_{1} σ_{2}} x y + \frac{μ_{1}^{2}}{σ_{1}^{2}} + \frac{μ_{2}^{2}}{σ_{2}^{2}} - \frac{2 ρ μ_{1} μ_{2}}{σ_{1} σ_{2}}) . \end{array}$

We note that $d (x, y) = d (y, x)$ . Also, the probability density function $h (x, y)$ satisfies

${(h (x, y))}^{- \frac{2}{m + 2}} - {(h (y, x))}^{- \frac{2}{m + 2}} = k_{m} (y^{2} - x^{2}) + l_{m} (y - x) (x < y),$ (9)

where

$k_{m} = \frac{σ_{1}^{2} - σ_{2}^{2}}{m σ_{1}^{2} σ_{2}^{2} (1 - ρ^{2})} {(2 π σ_{1} σ_{2} \sqrt{1 - ρ^{2}})}^{\frac{2}{m + 2}},$

$l_{m} = \frac{2 \{σ_{2}^{2} μ_{1} - σ_{1}^{2} μ_{2} + ρ σ_{1} σ_{2} (μ_{1} - μ_{2})\}}{m σ_{1}^{2} σ_{2}^{2} (1 - ρ^{2})} {(2 π σ_{1} σ_{2} \sqrt{1 - ρ^{2}})}^{\frac{2}{m + 2}} .$
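Identity (9) can be verified numerically in the same spirit. The sketch below evaluates density (7) directly and compares both sides of (9); the helper names and the parameter values are illustrative assumptions.

```python
import math

def bvt_pdf(x, y, m, mu1, mu2, s1, s2, rho):
    """Bivariate t density of Equation (7) with m degrees of freedom."""
    zx = (x - mu1) / s1
    zy = (y - mu2) / s2
    q = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (1.0 - rho * rho)
    c = 1.0 / (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - rho * rho))
    return c * (1.0 + q / m) ** (-(m + 2) / 2.0)

def k_and_l(m, mu1, mu2, s1, s2, rho):
    """Coefficients k_m and l_m as defined below Equation (9)."""
    scale = (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - rho * rho)) ** (2.0 / (m + 2))
    denom = m * s1**2 * s2**2 * (1.0 - rho * rho)
    k = (s1**2 - s2**2) / denom * scale
    l = 2.0 * (s2**2 * mu1 - s1**2 * mu2 + rho * s1 * s2 * (mu1 - mu2)) / denom * scale
    return k, l

# Illustrative parameter values (assumptions, not taken from the paper).
params = dict(m=5, mu1=0.0, mu2=0.2, s1=1.0, s2=1.3, rho=0.2)
x, y = -0.4, 0.9  # any pair with x < y

power = -2.0 / (params["m"] + 2)
lhs = bvt_pdf(x, y, **params) ** power - bvt_pdf(y, x, **params) ** power
k, l = k_and_l(**params)
rhs = k * (y * y - x * x) + l * (y - x)
print(lhs, rhs)  # the two values should agree up to rounding error
```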

For continuous bivariate data, when we form the $R \times R$ square contingency table using $(R - 1)$ cut points for each of the row and column variables, we are interested in the structure of asymmetry of the bivariate discrete probabilities $\{p_{i j}\}$ . Consider the $R \times R$ square contingency table with ordered categories. For any fixed constant m, we propose a model defined by

$p_{i j} = γ {[1 + \frac{1}{m} (α_{1} i + β_{1} j + α_{2} i^{2} + β_{2} j^{2} + ψ (i, j))]}^{- \frac{m + 2}{2}},$ (10)

where $ψ (i, j) = ψ (j, i)$ . We shall refer to this model as an extended t-distribution type symmetry (ETS(m)) model. The ETS(m) model may be appropriate if it is reasonable to assume an underlying bivariate t-distribution with different marginal variances having m degrees of freedom. Under the ETS(m) model, setting $τ_{i j} = α_{1} i + β_{1} j + α_{2} i^{2} + β_{2} j^{2} + ψ (i, j)$ , we see that

$\begin{matrix} \lim_{m \to \infty} \frac{p_{i j}}{p_{j i}} = \lim_{m \to \infty} \frac{{(1 + \frac{τ_{i j}}{m})}^{- \frac{m + 2}{2}}}{{(1 + \frac{τ_{j i}}{m})}^{- \frac{m + 2}{2}}} = \lim_{m \to \infty} \frac{{[{(1 + \frac{τ_{i j}}{m})}^{\frac{m}{τ_{i j}}}]}^{- \frac{τ_{i j}}{2} (1 + \frac{2}{m})}}{{[{(1 + \frac{τ_{j i}}{m})}^{\frac{m}{τ_{j i}}}]}^{- \frac{τ_{j i}}{2} (1 + \frac{2}{m})}} = \frac{\exp [- \frac{τ_{i j}}{2}]}{\exp [- \frac{τ_{j i}}{2}]} \\ = \exp [\frac{1}{2} (α_{1} - β_{1}) (j - i) + \frac{1}{2} (α_{2} - β_{2}) (j^{2} - i^{2})] = θ_{1}^{j - i} θ_{2}^{j^{2} - i^{2}}, \end{matrix}$ (11)

where

$\begin{array}{l} θ_{1} = \exp [\frac{1}{2} (α_{1} - β_{1})], \\ θ_{2} = \exp [\frac{1}{2} (α_{2} - β_{2})] . \end{array}$

Namely, the ETS(m) model approaches the ELDPS model as m becomes larger, whereas the TS(m) model approaches the LDPS model as m becomes larger (see Appendix 1).
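The convergence in (11) can be illustrated numerically. The following sketch evaluates the ratio $p_{i j} / p_{j i}$ under the ETS(m) model for increasing m and compares it with the ELDPS limit. The parameter values and the symmetric function `psi` are hypothetical choices made only so that every term is well defined.

```python
import math

# Illustrative parameter values (assumptions chosen so that 1 + tau/m > 0).
alpha1, beta1, alpha2, beta2 = 0.3, 0.1, 0.05, 0.08

def psi(i, j):
    """Any symmetric function of (i, j) works here; this choice is hypothetical."""
    return 0.1 * abs(i - j)

def tau(i, j):
    """tau_ij as defined just before Equation (11)."""
    return alpha1 * i + beta1 * j + alpha2 * i**2 + beta2 * j**2 + psi(i, j)

def ratio(i, j, m):
    """p_ij / p_ji under the ETS(m) model; the constant gamma cancels."""
    return ((1.0 + tau(i, j) / m) / (1.0 + tau(j, i) / m)) ** (-(m + 2) / 2.0)

theta1 = math.exp(0.5 * (alpha1 - beta1))
theta2 = math.exp(0.5 * (alpha2 - beta2))

i, j = 1, 3
limit = theta1 ** (j - i) * theta2 ** (j**2 - i**2)
for m in (5, 50, 500, 1e6):
    # The gap between the ETS(m) ratio and the ELDPS value shrinks as m grows.
    print(m, ratio(i, j, m), limit)
```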

The ETS(m) model is also expressed as

$p_{i j}^{- \frac{2}{m + 2}} - p_{j i}^{- \frac{2}{m + 2}} = γ_{m} (j^{2} - i^{2}) + η_{m} (j - i) (i < j),$ (12)

where

$γ_{m} = \frac{γ^{- \frac{2}{m + 2}} (β_{2} - α_{2})}{m}, η_{m} = \frac{γ^{- \frac{2}{m + 2}} (β_{1} - α_{1})}{m} .$

A special case of this model obtained by putting $γ_{m} = 0$ is the TS(m) model.
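The step from (10) to (12) can be made explicit as a short worked derivation using only the notation above. Raising (10) to the power $- 2 / (m + 2)$ gives

$p_{i j}^{- \frac{2}{m + 2}} = γ^{- \frac{2}{m + 2}} [1 + \frac{τ_{i j}}{m}],$

and, since $ψ (i, j) = ψ (j, i)$ ,

$τ_{i j} - τ_{j i} = (β_{1} - α_{1}) (j - i) + (β_{2} - α_{2}) (j^{2} - i^{2}) .$

Subtracting the corresponding expression for $p_{j i}$ therefore yields

$p_{i j}^{- \frac{2}{m + 2}} - p_{j i}^{- \frac{2}{m + 2}} = \frac{γ^{- \frac{2}{m + 2}}}{m} (τ_{i j} - τ_{j i}) = γ_{m} (j^{2} - i^{2}) + η_{m} (j - i),$

which is (12) with the stated $γ_{m}$ and $η_{m}$ .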

The maximum likelihood estimates of expected frequencies under the ETS(m) model can be obtained by applying the Newton-Raphson method to the log-likelihood equations (see Appendix 2). For the ETS(m) model, $\{p_{i j}\}$ are determined by the $R (R - 1) / 2$ values $\{ψ (i, j), i < j\}$ , the R values $\{ψ (i, i)\}$ , $γ_{m}$ and $η_{m}$ , for a total of $(R^{2} + R + 4) / 2$ parameters. Therefore, the number of degrees of freedom (df) for the ETS(m) model is $R^{2} - (R^{2} + R + 4) / 2 = (R^{2} - R - 4) / 2$ , which is one less than that for the TS(m) model, and equal to that for the ELDPS model.
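As a small worked example of the count above, the sketch below computes the degrees of freedom for a given table size R. The function name is a hypothetical helper, and the values for the TS(m) and ELDPS models are derived from the relations stated in the text.

```python
def degrees_of_freedom(R):
    """Degrees of freedom for an R x R table under the models compared in the text."""
    n_params = (R * R + R + 4) // 2   # parameter count for the ETS(m) model
    ets = R * R - n_params            # equals (R^2 - R - 4) / 2
    return {
        "ETS(m)": ets,
        "ELDPS": ets,      # stated to equal the ETS(m) value
        "TS(m)": ets + 1,  # ETS(m) has one fewer df than TS(m)
    }

# For the 4 x 4 tables used in the simulation study and the example.
print(degrees_of_freedom(4))
```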

3. Simulation Study

As described in Section 2, the ETS(m) model may be appropriate for a square ordinal table if it is reasonable to assume an underlying bivariate t-distribution with different marginal variances having m degrees of freedom. We consider a simulation study based on the bivariate t-distribution. Consider random variables X and Y having a bivariate t-distribution with $m (m > 2)$ degrees of freedom, means $E (X) = μ_{1}$ , $E (Y) = μ_{2} (= μ_{1} + 0.2)$ , variances $Var (X) = m σ_{1}^{2} / (m - 2)$ , $Var (Y) = m σ_{2}^{2} / (m - 2)$ , and correlation coefficient $\mathrm{Corr} (X, Y) = ρ (= 0.2)$ . (Note that other values of ρ could also be used.) Such random numbers are generated from normal random numbers and chi-squared random numbers with m degrees of freedom. Suppose that there is an underlying bivariate t-distribution with $σ_{2}^{2} / σ_{1}^{2} = 1.2, 1.5$ or 1.7, and that a $4 \times 4$ table of sample size 1000 is formed using cut points for each variable at $μ_{1}$ and $μ_{1} \pm 0.6 σ_{1}$ .

Then, for each condition, we generate 10,000 $4 \times 4$ tables and count how often the hypothesis that the ETS(m) model, or the TS(m) model, holds is accepted (at the 0.05 significance level) by the likelihood ratio chi-squared statistic, where m is set to the degrees of freedom of the underlying t-distribution. From Table 1, we see that the ETS(m) model fits well, whereas the TS(m) model fits poorly. Thus, the simulation comparing the ETS(m) and TS(m) models suggests that, if it is reasonable to assume an underlying bivariate t-distribution with different marginal variances and the ratio of the marginal variances (i.e., $σ_{2}^{2} / σ_{1}^{2}$ ) is large, the corresponding ETS(m) model rather than the TS(m) model would fit the data well, regardless of the value of the degrees of freedom m.
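The data-generation step of this simulation can be sketched as follows, using only the Python standard library. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the seed, and the values $μ_{1} = 0$ , $σ_{1} = 1$ and $m = 5$ are hypothetical choices, and the model-fitting step (Appendix 2) is not included.

```python
import bisect
import math
import random

def simulate_table(n, m, mu1, mu2, sigma1, sigma2, rho, seed=0):
    """Draw n pairs from a bivariate t-distribution and cross-classify them into a 4 x 4 table."""
    rng = random.Random(seed)
    # Same three cut points for both variables: mu1 and mu1 +/- 0.6 * sigma1.
    cuts = [mu1 - 0.6 * sigma1, mu1, mu1 + 0.6 * sigma1]
    table = [[0] * 4 for _ in range(4)]
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Chi-squared variate with m degrees of freedom = Gamma(shape m/2, scale 2).
        w = rng.gammavariate(m / 2.0, 2.0)
        scale = math.sqrt(m / w)
        # Correlated normal pair, rescaled by the common chi-squared factor.
        x = mu1 + sigma1 * z1 * scale
        y = mu2 + sigma2 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2) * scale
        row = bisect.bisect_right(cuts, x)
        col = bisect.bisect_right(cuts, y)
        table[row][col] += 1
    return table

# One simulated table for the condition sigma2^2 / sigma1^2 = 1.5 (mu1, sigma1, m assumed).
table = simulate_table(n=1000, m=5, mu1=0.0, mu2=0.2,
                       sigma1=1.0, sigma2=math.sqrt(1.5), rho=0.2)
for row in table:
    print(row)
```

Repeating this 10,000 times per condition and fitting both models to each table would reproduce the structure of the study; the fitting itself requires the constrained maximization described in Appendix 2.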

Table 1. The frequencies of acceptance (at the 0.05 significance level), out of 10,000 simulated $4 \times 4$ tables, based on the likelihood ratio chi-squared statistic for testing the hypothesis that the ETS(m) model or the TS(m) model holds, with m equal to the degrees of freedom of the underlying t-distribution, for several variance ratios $σ_{2}^{2} / σ_{1}^{2}$ and degrees of freedom m.

4. Example

The data in Table 2 are taken from Tominaga [8] . These data describe the cross-classification of father’s and son’s occupational status categories in Japan, examined in 1955. From Table 3, we see that the TS(m) ( $m = 5, 20, 50, 100$ ) models fit these data poorly, whereas the ETS(m) ( $m = 5, 20, 50, 100$ ) and the ELDPS models fit these data well. We obtain similar results regardless of the value of the degrees of freedom m, although the details are omitted. We also see that the values of the likelihood ratio chi-squared statistic G² for the ETS(m) model approach the value of G² for the ELDPS model as the value of m increases.

Table 2. The cross-classification of father’s and son’s occupational status categories in Japan, which were examined in 1955 from Tominaga [8] . (The parenthesized values are the maximum likelihood estimates of expected frequencies under the ETS(5) model.)

Note: (1) is upper non-manual, (2) lower non-manual, (3) manual, and (4) agriculture.

Table 3. Likelihood ratio chi-squared values G² for models applied to Table 2.

*means significant at the 0.05 level.

Under the ETS(5) model, the maximum likelihood estimates of $γ_{5}$ and $η_{5}$ are ${\hat{γ}}_{5} = 0.335$ and ${\hat{η}}_{5} = - 1.405$ , respectively.

Therefore, the difference between the probability that the occupational status category of the father in a pair is i and that of his son is $j (i < j)$ , raised to the power $- \frac{2}{m + 2}$ [= −0.286 for $m = 5$ ], and the probability that the occupational status category of the father is j and that of his son is i, raised to the same power, is estimated to be ${\hat{γ}}_{5} (j^{2} - i^{2}) + {\hat{η}}_{5} (j - i)$ $[= 0.335 \times (j^{2} - i^{2}) - 1.405 \times (j - i)]$ . For example, comparing the pair in which the father is in (1) “upper non-manual” and his son is in (2) “lower non-manual” with the pair in which the father is in (2) “lower non-manual” and his son is in (1) “upper non-manual”, this difference is estimated to be −0.400 [ $= 0.335 \times 3 - 1.405 \times 1$ ]. Because the power is negative, a negative difference means that the first probability is the larger one. Namely, the probability that the occupational status category of the father is (1) “upper non-manual” and that of his son is (2) “lower non-manual” is estimated to be greater than the probability that the occupational status category of the father is (2) “lower non-manual” and that of his son is (1) “upper non-manual”. Also, the probability that the occupational status category of the father is (1) and that of his son is (3) is estimated to be greater than the probability that the occupational status category of the father is (3) and that of his son is (1). However, for the other pairs of probabilities that are symmetric with respect to the main diagonal, the probability that the occupational status category of the father is j and that of his son is $i (i < j)$ is estimated to be greater than the probability that the occupational status category of the father is i and that of his son is j.
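The arithmetic behind this interpretation can be reproduced for every pair of categories. The sketch below uses the reported estimates ${\hat{γ}}_{5} = 0.335$ and ${\hat{η}}_{5} = - 1.405$ ; the function name is a hypothetical helper.

```python
# Reported maximum likelihood estimates under the ETS(5) model.
gamma_hat = 0.335
eta_hat = -1.405

def power_difference(i, j):
    """Estimated p_ij^(-2/7) - p_ji^(-2/7) for i < j, from Equation (12) with m = 5."""
    return gamma_hat * (j * j - i * i) + eta_hat * (j - i)

for i in range(1, 5):
    for j in range(i + 1, 5):
        d = power_difference(i, j)
        # The power -2/7 is negative, so a negative difference means p_ij > p_ji.
        direction = "p_ij > p_ji" if d < 0 else "p_ij < p_ji"
        print((i, j), round(d, 3), direction)
```

Only the pairs (1, 2) and (1, 3) give a negative difference, which matches the reading above.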

5. Concluding Remarks

The simulation study suggests that the ETS(m) model would be appropriate for a square contingency table if it is reasonable to assume an underlying bivariate t-distribution with different marginal variances having m degrees of freedom, whereas the TS(m) model may be appropriate for a square contingency table if it is reasonable to assume an underlying bivariate t-distribution with equal marginal variances having m degrees of freedom.

Acknowledgements

The authors would like to thank the referees for their helpful comments.

Cite this paper

Iki, K., Okada, M. and Tomizawa, S. (2018) An Extended Bivariate T-Distribution Type Symmetry Model for Square Contingency Tables. Open Journal of Statistics, 8, 249-257. https://doi.org/10.4236/ojs.2018.82015

References

1. Bowker, A.H. (1948) A Test for Symmetry in Contingency Tables. Journal of the American Statistical Association, 43, 572-574. https://doi.org/10.1080/01621459.1948.10483284

2. Bishop, Y.M.M., Fienberg, S.E. and Holland, P.W. (1975) Discrete Multivariate Analysis: Theory and Practice. The MIT Press, Cambridge.

3. Agresti, A. (1983) A Simple Diagonals-Parameter Symmetry and Quasi-Symmetry Model. Statistics and Probability Letters, 1, 313-316. https://doi.org/10.1016/0167-7152(83)90051-2

4. Tomizawa, S. (1991) An Extended Linear Diagonals-Parameter Symmetry Model for Square Contingency Tables with Ordered Categories. Metron: International Journal of Statistics, 49, 401-409.

5. Agresti, A. (1984) Analysis of Ordinal Categorical Data. John Wiley & Sons, New York.

6. Iki, K., Ishihara, T. and Tomizawa, S. (2013) Bivariate t-Distribution Type Symmetry Model for Square Contingency Tables with Ordered Categories. Model Assisted Statistics and Applications: An International Journal, 8, 315-319.

7. Muirhead, R.J. (2005) Aspects of Multivariate Statistical Theory. John Wiley & Sons, New Jersey.

8. Tominaga, K. (1979) Nippon no Kaisou Kouzou (Japanese Hierarchical Structure). University of Tokyo Press, Tokyo. (In Japanese)

Appendix 1

The TS(m) model is also expressed as

$p_{i j} = γ {[1 + \frac{1}{m} (α i + β j + ϕ (i, j))]}^{- \frac{m + 2}{2}},$ (A.1)

where $ϕ (i, j) = ϕ (j, i)$ . Under the TS(m) model, we see that

$\lim_{m \to \infty} \frac{p_{i j}}{p_{j i}} = θ^{j - i},$ (A.2)

where

$θ = \exp [\frac{1}{2} (α - β)] .$

Namely, the TS(m) model approaches the LDPS model as m becomes larger.

Appendix 2

Let $n_{i j}$ denote the observed frequency in the ith row and jth column of the table ( $i = 1, \dots, R; j = 1, \dots, R$ ), with $n = \sum \sum n_{i j}$ . Assume that a multinomial distribution applies to the $R \times R$ table. We consider the maximum likelihood estimates of expected frequencies $\{m_{i j}\}$ under the ETS(m) model. We must maximize the Lagrangian

$\begin{matrix} L = \sum_{i = 1}^{R} \sum_{j = 1}^{R} n_{i j} \log p_{i j} - λ (\sum_{i = 1}^{R} \sum_{j = 1}^{R} p_{i j} - 1) \\ - \underset{i < j}{\sum \sum} ϕ_{i j} \{p_{i j}^{- \frac{2}{m + 2}} - p_{j i}^{- \frac{2}{m + 2}} - γ_{m} (j^{2} - i^{2}) - η_{m} (j - i)\}, \end{matrix}$ (A.3)

with respect to $\{p_{i j}\}, λ, \{ϕ_{i j}\}, γ_{m}$ and $η_{m}$ . Setting the partial derivatives of L equal to zero, we obtain the equations:

(A.4)

(A.5)

(A.6)

$p_{k l}^{- \frac{2}{m + 2}} - p_{l k}^{- \frac{2}{m + 2}} = γ_{m} (l^{2} - k^{2}) + η_{m} (l - k) (k < l),$ (A.7)

$\underset{i < j}{\sum \sum} ϕ_{i j} (j^{2} - i^{2}) = 0,$ (A.8)

$\underset{i < j}{\sum \sum} ϕ_{i j} (j - i) = 0.$ (A.9)

Using the Newton-Raphson method, we can solve Equations (A.4) to (A.9) with respect to $\{p_{i j}\}, γ_{m}$ and $η_{m}$ . Noting that $\{m_{i j} = n p_{i j}\}$ , we obtain the maximum likelihood estimates of $\{m_{i j}\}$ and the parameters $γ_{m}$ and $η_{m}$ under the ETS(m) model.
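The displayed forms of Equations (A.4) to (A.6) are not reproduced above. As a hedged sketch (a direct derivation from (A.3), not necessarily the authors' exact grouping or numbering), differentiating L with respect to $p_{k l}$ and $p_{l k}$ ( $k < l$ ), $p_{k k}$ and λ gives the stationarity conditions

$\frac{n_{k l}}{p_{k l}} - λ + \frac{2}{m + 2} ϕ_{k l} p_{k l}^{- \frac{m + 4}{m + 2}} = 0 (k < l),$

$\frac{n_{l k}}{p_{l k}} - λ - \frac{2}{m + 2} ϕ_{k l} p_{l k}^{- \frac{m + 4}{m + 2}} = 0 (k < l),$

$\frac{n_{k k}}{p_{k k}} - λ = 0 (k = 1, \dots, R),$

$\sum_{i = 1}^{R} \sum_{j = 1}^{R} p_{i j} = 1.$

Here the exponent arises from $\frac{\partial}{\partial p} p^{- \frac{2}{m + 2}} = - \frac{2}{m + 2} p^{- \frac{m + 4}{m + 2}}$ .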
