Y. Y. LIANG, K. C. CARRIERE 5

analyzing medical and health data with correlated binary

responses. To illustrate, we made use of alcohol abuse

data from a family study conducted in Edmonton, Can-

ada.

The non-linear mixed effects model assumes that the

error distribution is normal, while allowing for the het-

erogeneity of the data in the form of mixed effects of

some covariates. In this situation, multilevel models can

be viewed as a special case of non-linear mixed effects

models, but they are especially useful when the data have

more than two levels of hierarchies. Unlike these two

approaches, which assume a correlated binomial distri-

bution of the data, the GEE method does not require the

data to follow a particular parametric distribution. How-

ever, if the number of clusters is very large, all three me-

thods are expected to perform similarly.

Using the alcohol abuse data, we found that the non-

linear mixed model resulted in the best overall model

prediction with the additional advantage of requiring

only a moderate amount of computing time. However,

only a limited number of researches have been done to

check the model assumptions [9-11]. Further research is

needed to improve this aspect of the non-linear mixed

model.

The multilevel method allows for a model with several

levels, but for two-level hierarchical data such as was

used in this study, it is essentially the same as the usual

non-linear mixed effects model. A disadvantage of this

approach is that the algorithm may not converge, espe-

cially when the cluster size is small with few clusters.

Furthermore, the computation time to convergence is

relatively long.

In the GEE approach, we choose the exchangeable

structure of the working correlation matrix, assuming all

relatives of the proband have the same correlation. We

note that Liang and Zeger (1986) originally considered

the correlation among clustered observations as a nui-

sance, while the regression parameters are the primary

interest [2]. With the GEE approach, regression parame-

ters can be estimated consistently but not necessarily

with complete efficiency, whether the working correla-

tion structure is correct or not. This consistency is based

on the assumption that the regression parameters and the

association parameters are orthogonal to one another,

even when they are not [12]. However, the GEE method

may still be preferable in some cases, when a population

averaged level analysis is suitable for the research ques-

tions and objectives. Further, the computational algo-

rithm is relatively fast and it makes weaker assumptions

about the structure of the variance-covariance matrix of

the response vector.

Regarding the model’s prediction accuracy, in general,

the model’s specificity is quite high, but rather low in

sensitivity. In other words, the model has much more

difficulty predicting cases that have an alcohol abuse

problem, which is a common situation when there are no

strong risk factors for modelling. In the absence of ran-

dom effects, all models have an essentially equal ability

to predict the outcome. However, when there are signifi-

cant random effects, the prediction level improves by

properly accounting for the random effects in the model.

5. Conclusion

Overall, the non-linear mixed effects approach to analy-

sis of these data seems quite competitive with the multi-

level method in terms of convergence properties. The

random family effects were significant, reflecting het-

erogeneity among families, and both the non-linear

mixed effects model and the multilevel model captured

them effectively. The popular marginal modeling via the

GEE method may still be preferable because of its com-

putational ease and relaxed distribution assumptions.

However, caution is advised, as it might underestimate

the odds ratio and its standard error, as indicated in this

case study.

REFERENCES

[1] H. Goldstein, “Nonlinear Multilevel Models, with an

Application to Discrete Response Data,” Biometrika, Vol.

78, No. 1, 1991, pp. 45-51. doi:10.1093/biomet/78.1.45

[2] K. Y. Liang and S. L. Zeger, “Longitudinal Data-Analysis

Using Generalized Linear-Models,” Biometrika, Vol. 73,

No. 1, 1986, pp. 13-22. doi:10.1093/biomet/73.1.13

[3] S. C. Newman and R. C. Bland, “A Population-Based

Family Study of DSM-III Generalized Anxiety Disorder,”

Psychological Medicine, Vol. 36, No. 9, 2006, pp. 1275-

1281. doi:10.1017/S0033291706007732

[4] N. Mantel and W. Haenszel, “Statistical Aspects of the

Analysis of Data from Retrospective Studies of Disease,”

Journal of the National Cancer Institute, Vol. 22, No. 4,

1959, pp. 719-748.

[5] J. C. Pinheiro and D. M. Bates, “Approximations to the

Log-Likelihood Function in the Nonlinear Mixed-Effects

Model,” Journal of Computational and Graphical Statis-

tics, Vol. 4, No. 1, 1995, pp. 12-35.

[6] J. Nocedal and S. J. Wright, “Numerical Optimization,”

Springer-Verlag, New York, 1999. doi:10.1007/b98874

[7] N. E. Breslow and D. G. Clayton, “Approximate Infer-

ence in Generalized Linear Mixed Models,” Journal of

the American Statistical Association, Vol. 88, No. 421,

1993, pp. 9-25.

[8] H. Goldstein and J. Rasbash, “Improved Approximations

for Multilevel Models with Binary Responses,” Journal

of the Royal Statistical Society Series A—Statistics in So-

ciety, Vol. 159, 1996, pp. 505-513.

[9] Y. Yano, S. L. Beal and L. B. Sheiner, “Evaluating Phar-

macokinetic/Pharmacodynamic Models Using the Poste-

rior Predictive Check,” Journal of Pharmacokinetics and

Pharmacodynamics, Vol. 28, No. 2, 2001, pp. 171-192.

Copyright © 2013 SciRes. OJS