In this paper, we propose a generalized method and algorithms for estimating the parameters and best model fits of the log-linear model for an n-dimensional contingency table. For the purpose of this work, the method was used to provide parameter estimates of the log-linear model for a three-dimensional contingency table. Computer programs in R were developed to implement the algorithms, and the iterative proportional fitting procedure was used to find the parameter estimates and goodness-of-fit statistics of the log-linear model. The Akaike information criterion (AIC) and Bayesian information criterion (BIC) were used to check the adequacy of the best-fitting model. Secondary data were used for illustration, and the results showed that the best model fit for the three-dimensional contingency table had the generating class [CA, AB]; this model fits the data without loss of information and reveals that breed is independent of chick loss given age. The best model, in harmony with the hierarchy principle, is $\log m_{ijk} = \mu + \mu_{C(i)} + \mu_{A(j)} + \mu_{B(k)} + \mu_{CA(ij)} + \mu_{AB(jk)}$.
Contingency tables arise when a sample of observations is cross-classified according to two or more categorical variables, and log-linear models provide a standard framework for describing the association structure among the classifying variables.
Data for 510 chickens on breed, age and chick loss were collected from the poultry record book of Sambo Feeds, Awka, Nigeria. Breed, age, and chick loss were classified into two, three, and two levels, respectively.
For an $I \times J \times K$ contingency table with observed cell counts $n_{ijk}$ and expected frequencies $m_{ijk}$, the saturated log-linear model is

$$\log m_{ijk} = \mu + \mu_{C(i)} + \mu_{A(j)} + \mu_{B(k)} + \mu_{CA(ij)} + \mu_{CB(ik)} + \mu_{AB(jk)} + \mu_{CAB(ijk)},$$

where $\mu$ is the overall mean effect and the remaining terms are the main effects and interactions of the classifying variables $C$, $A$ and $B$. Conventionally, we take the parameters to sum to zero over each of their indices, and unsaturated models are obtained by setting higher-order terms to zero.
The iterative proportional fitting (IPF) procedure estimates the expected frequencies $\hat m_{ijk}$ of a log-linear model by repeatedly scaling a table of trial values so that it reproduces the marginal totals fixed by the model. The procedure assumes initial values $\hat m^{(0)}_{ijk} = 1$ for all cells. Each cycle adjusts the current estimates to one sufficient margin of the model at a time; for example, for the margin $n_{ij+}$ the update is

$$\hat m^{(t+1)}_{ijk} = \hat m^{(t)}_{ijk} \, \frac{n_{ij+}}{\hat m^{(t)}_{ij+}}.$$

The second cycle is of the same form as the first cycle above but uses the updated estimates from the end of the first cycle. The steps are repeated until convergence to the desired accuracy is attained.
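The cycle described above can be sketched in R for the model with generating class [CA, AB]; the function name `ipf_ca_ab`, the tolerance and the iteration cap are illustrative choices, not taken from the paper's program.

```r
# Sketch (not the paper's exact program) of the IPF cycle for the model
# with generating class [CA, AB]: alternately match the Chicksloss-by-Age
# margin n_{ij+} and the Age-by-Breed margin n_{+jk}.
ipf_ca_ab <- function(n, tol = 1e-8, max_iter = 100) {
  m <- array(1, dim = dim(n))                  # initial values m^(0) = 1
  for (cycle in 1:max_iter) {
    m_old <- m
    # step 1: scale so that the CA margin matches n_{ij+}
    m <- sweep(m, c(1, 2), apply(n, c(1, 2), sum) / apply(m, c(1, 2), sum), "*")
    # step 2: scale so that the AB margin matches n_{+jk}
    m <- sweep(m, c(2, 3), apply(n, c(2, 3), sum) / apply(m, c(2, 3), sum), "*")
    if (max(abs(m - m_old)) < tol) break       # desired accuracy attained
  }
  m
}

n <- array(c(55, 67, 16, 44, 8, 45, 48, 66, 20, 52, 18, 71), dim = c(2, 3, 2))
m_hat <- ipf_ca_ab(n)
```

For this decomposable model the procedure converges after a single cycle, and the fitted table reproduces both sufficient margins exactly.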
The goodness of fit can be assessed by a number of statistics. These include the Pearson chi-square statistic

$$X^2 = \sum_{i,j,k} \frac{(n_{ijk} - \hat m_{ijk})^2}{\hat m_{ijk}}$$

and the likelihood-ratio statistic

$$G^2 = 2 \sum_{i,j,k} n_{ijk} \log\left(\frac{n_{ijk}}{\hat m_{ijk}}\right),$$

where $n_{ijk}$ and $\hat m_{ijk}$ are the observed and fitted cell frequencies. Each statistic is referred to a chi-square distribution with

d.f. = number of cells in the table − number of independent parameters estimated.
Information criteria should also be considered for assessing the adequacy of the model fit. The Akaike information criterion (AIC) penalizes the fit statistic by the model's degrees of freedom, AIC = G² − 2(d.f.). Another information criterion is the Bayesian information criterion (BIC), BIC = G² − (d.f.) log N, where N is the total sample size. In both cases, smaller values indicate a more adequate model.
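As a sketch, the statistics and criteria above can be computed by one small R helper (the name `fit_stats` and its interface are illustrative); `obs` and `fit` are the observed and fitted cell counts and `df` the residual degrees of freedom.

```r
# Goodness-of-fit statistics and information criteria for a log-linear model.
# obs: observed counts; fit: fitted expected frequencies; df: residual d.f.
fit_stats <- function(obs, fit, df) {
  X2 <- sum((obs - fit)^2 / fit)          # Pearson chi-square
  G2 <- 2 * sum(obs * log(obs / fit))     # likelihood-ratio statistic
  c(X2 = X2, G2 = G2,
    AIC = G2 - 2 * df,                    # AIC = G^2 - 2(d.f.)
    BIC = G2 - df * log(sum(obs)),        # BIC = G^2 - (d.f.) log N
    p.value = pchisq(G2, df, lower.tail = FALSE))
}

fs <- fit_stats(c(10, 20, 30), c(10, 20, 30), 2)  # perfect fit: X2 = G2 = 0
```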
> I=2; J= 3; K= 2
>n=c(55, 67, 16, 44, 8, 45, 48, 66, 20, 52, 18, 71)
> n=array(n, dim=c(I, J, K))
> dimnames(n)=list(Chicksloss=c("Yes", "No"), Age=c(1, 2, 3), Breed=c("Broiler", "Old layer"))
> n
, , Breed = Broiler
Age
Chicksloss 1 2 3
Yes 55 16 8
No 67 44 45
, , Breed = Old layer
Age
Chicksloss 1 2 3
Yes 48 20 18
No 66 52 71
> mu=log(sum(n)/(I*J*K))
> mu_1=mu_2=mu_3=0
> mu_12=matrix(0, I, J); mu_13=matrix(0, I, K); mu_23=matrix(0, J, K)
> for (i in 1:I){
+ mu_1[i]=log(sum(n[i, ,])/(J*K))-mu
+ for (j in 1:J){
+ mu_2[j]=log(sum(n[,j,])/(I*K))-mu
+ for (k in 1:K){
+ mu_3[k]=log(sum(n[, , k])/(I*J))-mu
+ }
+ }
+ }
> for (i in 1:I){
+ for (k in 1:K){
+ mu_13[i, k]=log(sum(n[i, , k])/J)-mu-mu_1[i]-mu_3[k]
+ }
+ }
> for (i in 1:I){
+ for (j in 1:J){
+ mu_12[i, j]=log(sum(n[i, j, ])/K)-mu-mu_1[i]-mu_2[j]
+ }
+ }
> for (j in 1:J){
+ for (k in 1:K){
+ mu_23[j, k]=log(sum(n[, j, k])/I)-mu-mu_2[j]-mu_3[k]
+ }
+ }
> mu; mu_1; mu_2; mu_3; mu_13; mu_12; mu_23
[,1] [,2]
[1,] 0.0000000 0.00000000
[2,] -0.0188632 0.01584223
[,1] [,2] [,3]
[1,] 0.2993624 -0.17081773 -0.5692653
[2,] -0.1826164 0.07241258 0.1886294
[,1] [,2]
[1,] 0.11501445 -0.10999373
[2,] -0.01363215 0.01150382
[3,] -0.21070993 0.15044894
> m_1=m_2=m_3=m_4=array(0, dim=c(I, J, K))
> for(i in 1:I){
+ for (j in 1:J){
+ for (k in 1:K){
+ m_1[i, j, k]=exp(mu+mu_1[i]+mu_2[j]+mu_3[k])
+ m_2[i, j, k]=exp(mu+mu_1[i]+mu_2[j]+mu_3[k]+mu_13[i, k])
+ m_3[i, j, k]=exp(mu+mu_1[i]+mu_2[j]+mu_3[k]+mu_12[i, j])
+ m_4[i, j, k]=exp(mu+mu_1[i]+mu_2[j]+mu_3[k]+mu_23[j, k])
+ }
+ }
+ }
> m_1; m_2; m_3; m_4
, , 1
[,1] [,2] [,3]
[1,] 35.18224 19.67820 21.16897
[2,] 73.56286 41.14533 44.26240
, , 2
[,1] [,2] [,3]
[1,] 41.1707 23.02768 24.77220
[2,] 86.0842 48.14879 51.79642
, , 1
[,1] [,2] [,3]
[1,] 35.18224 19.67820 21.16897
[2,] 72.18824 40.37647 43.43529
, , 2
[,1] [,2] [,3]
[1,] 41.17070 23.02768 24.77220
[2,] 87.45882 48.91765 52.62353
, , 1
[,1] [,2] [,3]
[1,] 47.46078 26.54586 28.55691
[2,] 61.28431 34.27767 36.87446
, , 2
[,1] [,2] [,3]
[1,] 34.70588 19.41176 20.88235
[2,] 92.54902 51.76471 55.68627
, , 1
[,1] [,2] [,3]
[1,] 39.47059 19.41176 17.14706
[2,] 82.52941 40.58824 35.85294
, , 2
[,1] [,2] [,3]
[1,] 36.88235 23.29412 28.79412
[2,] 77.11765 48.70588 60.20588
> library(MASS)
> model1=loglm(~Chicksloss + Age + Breed, data=n)
> model2=loglm(~Chicksloss + Age+ Breed + Chicksloss*Breed, data=n)
> model3=loglm(~Chicksloss + Age+ Breed + Chicksloss*Age, data=n)
> model4=loglm(~Chicksloss + Age+ Breed + Age*Breed, data=n)
> model1; model2; model3; model4
Call:
loglm(formula = ~Chicksloss + Age + Breed, data = n)
Statistics:
X^2 df P(> X^2)
Likelihood Ratio 37.13769 7 4.41719e-06
Pearson 36.33465 7 6.26767e-06
Call:
loglm(formula = ~Chicksloss + Age + Breed + Chicksloss * Breed,
data = n)
Statistics:
X^2 df P(> X^2)
Likelihood Ratio 36.81976 6 1.909229e-06
Pearson 35.20502 6 3.932731e-06
Call:
loglm(formula = ~Chicksloss + Age + Breed + Chicksloss * Age,
data = n)
Statistics:
X^2 df P(> X^2)
Likelihood Ratio 8.280408 5 0.1414439
Pearson 8.181170 5 0.1465296
Call:
loglm(formula = ~Chicksloss + Age + Breed + Age * Breed, data = n)
Statistics:
X^2 df P(> X^2)
Likelihood Ratio 29.68738 5 1.699225e-05
Pearson 28.75595 5 2.588877e-05
> m_5a=m_5b=m_5c=m_6=array(0, dim=c(I, J, K))
> for(i in 1:I){
+ for (j in 1:J){
+ for (k in 1:K){
+ m_5a[i, j, k]=exp(mu+mu_1[i]+mu_2[j]+mu_3[k]+mu_12[i, j]+mu_13[i,k])
+ m_5b[i, j, k]=exp(mu+mu_1[i]+mu_2[j]+mu_3[k]+mu_12[i, j]+mu_23[j,k])
+ m_5c[i, j, k]=exp(mu+mu_1[i]+mu_2[j]+mu_3[k]+mu_13[i, k]+mu_23[j,k])
+ m_6[i, j, k]=exp(mu+mu_1[i]+mu_2[j]+mu_3[k]+mu_12[i, j]+mu_13[i,k]+mu_23[j, k])
+ }
+ }
+ }
> m_5a; m_5b; m_5c; m_6
, , 1
[,1] [,2] [,3]
[1,] 47.46078 16.58824 11.98039
[2,] 60.13913 43.40870 52.45217
, , 2
[,1] [,2] [,3]
[1,] 55.53922 19.41176 14.01961
[2,] 72.86087 52.59130 63.54783
, , 1
[,1] [,2] [,3]
[1,] 53.24576 16.36364 9.704225
[2,] 68.75424 43.63636 43.295775
, , 2
[,1] [,2] [,3]
[1,] 49.75424 19.63636 16.29577
[2,] 64.24576 52.36364 72.70423
, , 1
[,1] [,2] [,3]
[1,] 39.47059 19.41176 17.14706
[2,] 80.98723 39.82979 35.18298
, , 2
[,1] [,2] [,3]
[1,] 36.88235 23.29412 28.79412
[2,] 78.34909 49.48364 61.16727
, , 1
[,1] [,2] [,3]
[1,] 53.24576 16.36364 9.704225
[2,] 67.46947 42.82096 42.486732
, , 2
[,1] [,2] [,3]
[1,] 49.75424 19.63636 16.29577
[2,] 65.27166 53.19980 73.86519
> model5a=loglm(~Chicksloss + Age+ Breed + Chicksloss*Age + Chicksloss *Breed, data=n)
> model5b=loglm(~Chicksloss + Age+ Breed + Chicksloss*Age + Age*Breed, data=n)
> model5c=loglm(~Chicksloss + Age+ Breed + Chicksloss* Breed + Age*Breed, data=n)
> model6=loglm(~Chicksloss + Age+ Breed + Chicksloss*Age + Chicksloss *Breed + Age*Breed , data=n)
> model5a; model5b; model5c; model6
Call:
loglm(formula = ~Chicksloss + Age + Breed + Chicksloss * Age +
Chicksloss * Breed, data = n)
Statistics:
X^2 df P(> X^2)
Likelihood Ratio 7.962476 4 0.09296247
Pearson 7.853555 4 0.09709239
Call:
loglm(formula = ~Chicksloss + Age + Breed + Chicksloss * Age +
Age * Breed, data = n)
Statistics:
X^2 df P(> X^2)
Likelihood Ratio 0.8301007 3 0.8422546
Pearson 0.8172251 3 0.8453427
Call:
loglm(formula = ~Chicksloss + Age + Breed + Chicksloss * Breed +
Age * Breed, data = n)
Statistics:
X^2 df P(> X^2)
Likelihood Ratio 29.36945 4 6.576301e-06
Pearson 28.32066 4 1.073878e-05
Call:
loglm(formula = ~Chicksloss + Age + Breed + Chicksloss * Age +
Chicksloss * Breed + Age * Breed, data = n)
Statistics:
X^2 df P(> X^2)
Likelihood Ratio 0.8263105 2 0.6615596
Pearson 0.8145226 2 0.6654703
> model7=loglm(~Chicksloss + Age+ Breed + Chicksloss*Age *Breed, data=n); model7
Call:
loglm(formula = ~Chicksloss + Age + Breed + Chicksloss * Age *
Breed, data = n)
Statistics:
X^2 df P(> X^2)
Likelihood Ratio 0 0 1
Pearson 0 0 1
The best model that explains the observed data has generating class [CA, AB]. At the 5% significance level, the data provide sufficient evidence that this model fits (likelihood ratio $G^2 = 0.83$, d.f. = 3, $P = 0.84$; Pearson $X^2 = 0.82$, d.f. = 3, $P = 0.85$). The model implies that Breed and Chicksloss are conditionally independent given Age: within each age level there is no association between breed and chick loss.
The results of the AIC and BIC confirmed that the best model [CA, AB] adequately fits the data.
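Since [CA, AB] is a decomposable model, its fitted values also have the closed form $\hat m_{ijk} = n_{ij+}\, n_{+jk} / n_{+j+}$, which expresses directly that chick loss and breed are independent within each age level. A minimal R check (variable names are illustrative):

```r
n <- array(c(55, 67, 16, 44, 8, 45, 48, 66, 20, 52, 18, 71), dim = c(2, 3, 2))
n_ij <- apply(n, c(1, 2), sum)   # Chicksloss x Age margin, n_{ij+}
n_jk <- apply(n, c(2, 3), sum)   # Age x Breed margin, n_{+jk}
n_j  <- apply(n, 2, sum)         # Age margin, n_{+j+}
m <- array(0, dim = dim(n))
for (j in 1:3) m[, j, ] <- outer(n_ij[, j], n_jk[j, ]) / n_j[j]
G2 <- 2 * sum(n * log(n / m))    # likelihood-ratio statistic for [CA, AB]
```

The likelihood-ratio value obtained this way agrees with the `loglm` output for `model5b` above (0.8301).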
A generalized method and algorithms were developed for estimating the parameters and best model fits of the log-linear model for an n-dimensional contingency table, and a three-dimensional contingency table was considered in this paper. Computer programs in R were developed to implement the algorithms, and the iterative proportional fitting procedure was used to obtain the parameter estimates and goodness-of-fit statistics of the log-linear model.
The results of the analysis showed that the best model fit for the three-dimensional contingency table was the model with generating class [CA, AB].
Model | Likelihood ratio (G^2) | Pearson (X^2) | d.f | P-value |
---|---|---|---|---|
[C, A, B] | 37.14 | 36.34 | 7 | <0.001 |
[A, CB] | 36.82 | 35.21 | 6 | <0.001 |
[B, CA] | 8.28 | 8.18 | 5 | 0.14 |
[C, AB] | 29.69 | 28.76 | 5 | <0.001 |
[CA, CB] | 7.96 | 7.85 | 4 | 0.093 |
[CA, AB] | 0.83 | 0.82 | 3 | 0.84 |
[CB, AB] | 29.36 | 28.32 | 4 | <0.001 |
[CA, CB, AB] | 0.83 | 0.81 | 2 | 0.66 |
[CAB] | 0 | 0 | 0 | 1 |
P-values are for the likelihood ratio statistic.
Model | AIC | BIC | G^2 | X^2 | d.f | P-value |
---|---|---|---|---|---|---|
(C, A, B) | 23.14 | −6.50 | 37.14 | 36.33 | 7 | <0.001 |
(C, AB) | 19.69 | −1.48 | 29.69 | 28.76 | 5 | <0.001 |
(A, CB) | 24.82 | −0.59 | 36.82 | 35.21 | 6 | <0.001 |
(CB, AB) | 21.37 | 4.43 | 29.37 | 28.32 | 4 | <0.001 |
(CA, AB) | −5.17 | −17.87 | 0.83 | 0.82 | 3 | 0.84 |
(CA, CB) | −0.04 | −16.98 | 7.96 | 7.85 | 4 | 0.09 |
(CA, CB, AB) | −3.17 | −11.64 | 0.83 | 0.82 | 2 | 0.67 |
(CAB) | 0 | 0 | 0.00 | 0.00 | 0 | 1 |
The values of the goodness-of-fit statistics for this model are: likelihood ratio $G^2 = 0.83$ (d.f. = 3, $P = 0.84$) and Pearson $X^2 = 0.82$ (d.f. = 3, $P = 0.85$), and its AIC (−5.17) and BIC (−17.87) are the smallest among the competing models, confirming the model's adequacy for the data. The final model in harmony with the hierarchy principle is written as:

$$\log m_{ijk} = \mu + \mu_{C(i)} + \mu_{A(j)} + \mu_{B(k)} + \mu_{CA(ij)} + \mu_{AB(jk)}.$$
Okoli, C.N., Onyeagu, S.I. and Osuji, G.A. (2015) On the Estimation of Parameters and Best Model Fits of Log Linear Model for Contingency Table. Open Access Library Journal, 2, 1-11. doi: 10.4236/oalib.1101189