Open Journal of Statistics
Vol. 05 No. 07 (2015), Article ID: 62412, 15 pages
10.4236/ojs.2015.57082
Stochastic Restricted Maximum Likelihood Estimator in Logistic Regression Model
Varathan Nagarajah1,2, Pushpakanthie Wijekoon3
1Postgraduate Institute of Science, University of Peradeniya, Peradeniya, Sri Lanka
2Department of Mathematics and Statistics, University of Jaffna, Jaffna, Sri Lanka
3Department of Statistics and Computer Science, University of Peradeniya, Peradeniya, Sri Lanka

Copyright © 2015 by authors and Scientific Research Publishing Inc.
This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/



Received 2 November 2015; accepted 27 December 2015; published 30 December 2015
ABSTRACT
In the presence of multicollinearity in logistic regression, the variance of the Maximum Likelihood Estimator (MLE) becomes inflated. Şiray et al. (2015) [1] proposed a restricted Liu estimator for the logistic regression model with exact linear restrictions. However, in some situations the linear restrictions are stochastic. In this paper, we propose a Stochastic Restricted Maximum Likelihood Estimator (SRMLE) for the logistic regression model with stochastic linear restrictions to handle such situations. Moreover, a Monte Carlo simulation is conducted to compare the performance of the MLE, the Restricted Maximum Likelihood Estimator (RMLE), the Ridge Type Logistic Estimator (LRE), the Liu Type Logistic Estimator (LLE), and the SRMLE for the logistic regression model using the Scalar Mean Squared Error (SMSE).
Keywords:
Logistic Regression, Multicollinearity, Stochastic Restricted Maximum Likelihood Estimator, Scalar Mean Squared Error

1. Introduction
In many fields of study, such as medicine and epidemiology, it is very important to predict a binary response variable, or to compute the probability of occurrence of an event, in terms of the values of a set of explanatory variables related to it. For example, the probability of suffering a heart attack is computed in terms of the levels of a set of risk factors such as cholesterol and blood pressure. The logistic regression model serves this purpose well and is the most widely used model in such cases.
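The general form of the logistic regression model is
$$y_i = \pi_i + \varepsilon_i, \quad i = 1, \ldots, n, \tag{1}$$
which follows a Bernoulli distribution with parameter $\pi_i$ as
$$\pi_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}, \tag{2}$$
where $x_i$ is the $i$th row of $X$, which is an $n \times (p+1)$ data matrix with $p$ explanatory variables, $\beta$ is a $(p+1) \times 1$ vector of coefficients, and $\varepsilon_i$ is independent with mean zero and variance $\pi_i(1-\pi_i)$ of the response $y_i$. The maximum likelihood method is the most common technique for estimating the parameter $\beta$, and the Maximum Likelihood Estimator (MLE) of $\beta$ can be obtained as follows:
$$\hat\beta_{MLE} = C^{-1}X'\hat{W}Z, \tag{3}$$
where $C = X'\hat{W}X$; $Z$ is the column vector with $i$th element $\mathrm{logit}(\hat\pi_i) + \dfrac{y_i - \hat\pi_i}{\hat\pi_i(1-\hat\pi_i)}$, and $\hat{W} = \mathrm{diag}[\hat\pi_i(1-\hat\pi_i)]$. The MLE is asymptotically unbiased, and its asymptotic covariance matrix is
$$\mathrm{Cov}(\hat\beta_{MLE}) = (X'\hat{W}X)^{-1} = C^{-1}. \tag{4}$$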
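As an illustration of how the MLE in (3) can be computed, the following is a minimal Python sketch of iteratively reweighted least squares. It assumes the usual working response $z_i = x_i'\hat\beta + (y_i - \hat\pi_i)/(\hat\pi_i(1-\hat\pi_i))$ and weights $\hat\pi_i(1-\hat\pi_i)$. The function name `logistic_mle`, the simulated data and the stopping rule are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def logistic_mle(X, y, n_iter=50, tol=1e-10):
    """Iteratively reweighted least squares for logistic regression.

    Returns the estimate of beta and C = X' W X evaluated at that estimate.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                          # linear predictor x_i' beta
        pi = 1.0 / (1.0 + np.exp(-eta))         # fitted probabilities
        w = pi * (1.0 - pi)                     # diagonal of W
        z = eta + (y - pi) / w                  # working response Z
        C = X.T @ (w[:, None] * X)              # C = X' W X
        beta_new = np.linalg.solve(C, X.T @ (w * z))   # C^{-1} X' W Z
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = pi * (1.0 - pi)
    return beta, X.T @ (w[:, None] * X)

# Simulated data, chosen only for illustration.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.standard_normal((n, 2))])
beta_true = np.array([0.5, 1.0, -1.0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta_true)))).astype(float)

beta_hat, C = logistic_mle(X, y)
print(beta_hat)
```

At convergence the score $X'(y - \hat\pi)$ is numerically zero, and the inverse of the returned `C` estimates the asymptotic covariance matrix of the MLE.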






As many authors have stated (Hosmer and Lemeshow (1989) [2] and Ryan (1997) [3], among others), the logistic regression model becomes unstable when there is strong dependence among the explanatory variables (multicollinearity). For example, suppose that the probability of a person surviving 10 or more extra years is modelled using three predictors: sex, diastolic blood pressure and body mass index. Since the response, whether the person survives 10 or more extra years, is binary, the logistic regression model is appropriate for this problem. However, the predictors sex, diastolic blood pressure and body mass index may be inter-related within each person. In this case, the estimation of the model parameters becomes inaccurate because of the need to invert near-singular information matrices. As a result, the estimates have large variances and wide confidence intervals, and the interpretation of the relationship between the response and each explanatory variable in terms of odds ratios may be erroneous.
To overcome the problem of multicollinearity in logistic regression, many estimators have been proposed as alternatives to the MLE. The most popular approach is Ridge Logistic Regression (RLR), which was first proposed by Schaefer et al. (1984) [4]. Later, the Principal Component Logistic Estimator (PCLE) by Aguilera et al. (2006) [5], the Modified Logistic Ridge Regression Estimator (MLRE) by Nja et al. (2013) [6], the Liu estimator by Mansson et al. (2012) [7], and the Liu-type estimator by Inan and Erdogan (2013) [8] were proposed for logistic regression.
An alternative technique for resolving the multicollinearity problem is to estimate the parameters using prior linear restrictions on the unknown parameters, which may be exact or stochastic. In some practical situations, different sets of prior information are available from different sources, such as past experience, the long association of the experimenter with the experiment, or similar experiments conducted in the past. When exact linear restrictions are available in addition to the logistic regression model, many authors have proposed different estimators for the parameter $\beta$.
In this paper we propose a new estimator, called the Stochastic Restricted Maximum Likelihood Estimator (SRMLE), for the case where linear stochastic restrictions are available in addition to the logistic regression model. The rest of the paper is organized as follows. The proposed estimator and its asymptotic properties are given in Section 2. In Section 3, the mean square error matrix and the scalar mean square error of the new estimator are obtained. Section 4 describes some important existing estimators for the logistic regression model. In Section 5, the performance of the proposed estimator with respect to the Scalar Mean Squared Error (SMSE) is compared with that of some existing estimators in a Monte Carlo simulation study. The conclusion of the study is presented in Section 6.
2. The Proposed Estimator and its Asymptotic Properties
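First consider the multiple linear regression model
$$y = X\beta + \varepsilon, \quad E(\varepsilon) = 0, \ \mathrm{Cov}(\varepsilon) = \sigma^2 I, \tag{5}$$
where $y$ is an $n \times 1$ observable random vector, $X$ is an $n \times p$ known design matrix of rank $p$, $\beta$ is a $p \times 1$ vector of unknown parameters, and $\varepsilon$ is an $n \times 1$ vector of disturbances.
The Ordinary Least Squares Estimator (OLSE) of $\beta$ is given by
$$\hat\beta_{OLSE} = S^{-1}X'y, \tag{6}$$
where $S = X'X$.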
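In addition to the sample model (5), consider the following linear stochastic restriction on the parameter space $\beta$:
$$r = R\beta + v, \quad E(v) = 0, \ \mathrm{Cov}(v) = \Omega, \tag{7}$$
where $r$ is a $q \times 1$ stochastic known vector, $R$ is a $q \times p$ matrix of full rank $q \le p$ with known elements, and $v$ is a $q \times 1$ random vector of disturbances with mean $0$ and dispersion matrix $\Omega$, which is assumed to be a known $q \times q$ positive definite matrix. Further, $v$ is assumed to be stochastically independent of $\varepsilon$, i.e. $E(\varepsilon v') = 0$.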
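The Restricted Ordinary Least Squares Estimator (ROLSE) due to the exact prior restriction (i.e. $v = 0$ in (7)) is given by
$$\hat\beta_{ROLSE} = \hat\beta_{OLSE} + S^{-1}R'(RS^{-1}R')^{-1}(r - R\hat\beta_{OLSE}). \tag{8}$$
Theil and Goldberger (1961) [10] proposed the mixed regression estimator (ME) for the regression model (5) with the stochastic restricted prior information (7):
$$\hat\beta_{ME} = \hat\beta_{OLSE} + \sigma^2 S^{-1}R'(\Omega + \sigma^2 RS^{-1}R')^{-1}(r - R\hat\beta_{OLSE}), \tag{9}$$
where $\sigma^2$ denotes the variance of the disturbances $\varepsilon$.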
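To make the mixed estimator concrete, here is a minimal Python sketch that combines the OLSE with a stochastic restriction $r = R\beta + v$. It assumes that the disturbance variance $\sigma^2$ and the dispersion matrix $\Omega$ of $v$ are known. The function name `mixed_estimator` and all numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np

def mixed_estimator(X, y, R, r, Omega, sigma2):
    """Theil-Goldberger mixed estimator, written as a correction of the OLSE."""
    S = X.T @ X
    b_ols = np.linalg.solve(S, X.T @ y)            # OLSE S^{-1} X' y
    S_inv_Rt = np.linalg.solve(S, R.T)             # S^{-1} R'
    M = Omega + sigma2 * (R @ S_inv_Rt)            # Omega + sigma^2 R S^{-1} R'
    return b_ols + sigma2 * S_inv_Rt @ np.linalg.solve(M, r - R @ b_ols)

# Hypothetical data for illustration only.
rng = np.random.default_rng(1)
n, sigma2 = 50, 1.0
beta = np.array([1.0, 2.0, 3.0])
X = rng.standard_normal((n, 3))
y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)
R = np.array([[1.0, -1.0, 0.0]])                   # prior information on beta_1 - beta_2
Omega = np.array([[0.1]])
r = R @ beta + rng.multivariate_normal(np.zeros(1), Omega)

b_me = mixed_estimator(X, y, R, r, Omega, sigma2)
print(b_me)
```

The same estimate is obtained from the generalized least squares form $(\sigma^{-2}S + R'\Omega^{-1}R)^{-1}(\sigma^{-2}X'y + R'\Omega^{-1}r)$, which treats the restriction as additional observations.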

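Suppose that the following linear prior information is given in addition to the general logistic regression model (1):
$$h = H\beta + \upsilon, \quad E(\upsilon) = 0, \ \mathrm{Cov}(\upsilon) = \Psi, \tag{10}$$
where $h$ is a $q \times 1$ stochastic known vector, $H$ is a $q \times (p+1)$ matrix of full rank $q \le p+1$ with known elements, and $\upsilon$ is a $q \times 1$ random vector of disturbances with mean $0$ and dispersion matrix $\Psi$, which is assumed to be a known $q \times q$ positive definite matrix. Further, $\upsilon$ is assumed to be stochastically independent of the disturbances of the model (1).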
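Duffy and Santner (1989) [9] proposed the Restricted Maximum Likelihood Estimator (RMLE) for the logistic regression model (1) with the exact prior restriction (i.e. $\upsilon = 0$ in (10)):
$$\hat\beta_{RMLE} = \hat\beta_{MLE} + C^{-1}H'(HC^{-1}H')^{-1}(h - H\hat\beta_{MLE}). \tag{11}$$
Following the RMLE in (11) and the Mixed Estimator (ME) in (9) for the linear regression model, we propose a new estimator, named the Stochastic Restricted Maximum Likelihood Estimator (SRMLE), for the case where the linear stochastic restriction (10) is available in addition to the logistic regression model (1):
$$\hat\beta_{SRMLE} = \hat\beta_{MLE} + C^{-1}H'(\Psi + HC^{-1}H')^{-1}(h - H\hat\beta_{MLE}). \tag{12}$$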
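The following Python sketch shows one way the SRMLE could be computed once the MLE and $C = X'\hat{W}X$ are available. It assumes the form $\hat\beta_{MLE} + C^{-1}H'(\Psi + HC^{-1}H')^{-1}(h - H\hat\beta_{MLE})$, obtained by analogy with the mixed estimator. The function name `srmle` and the numerical inputs are hypothetical.

```python
import numpy as np

def srmle(beta_mle, C, H, h, Psi):
    """Stochastic restricted MLE and its asymptotic covariance matrix.

    beta_mle : MLE of beta;  C : X' W X (symmetric positive definite)
    H, h, Psi: restriction h = H beta + v with Cov(v) = Psi
    """
    C_inv_Ht = np.linalg.solve(C, H.T)             # C^{-1} H'
    M = Psi + H @ C_inv_Ht                         # Psi + H C^{-1} H'
    beta_sr = beta_mle + C_inv_Ht @ np.linalg.solve(M, h - H @ beta_mle)
    # C is symmetric, so C_inv_Ht.T equals H C^{-1}.
    cov_sr = np.linalg.inv(C) - C_inv_Ht @ np.linalg.solve(M, C_inv_Ht.T)
    return beta_sr, cov_sr

# Hypothetical inputs for illustration only.
beta_mle = np.array([1.2, -0.4, 0.3])
C = np.array([[5.0, 4.5, 0.5],
              [4.5, 5.0, 0.4],
              [0.5, 0.4, 2.0]])
H = np.array([[1.0, -1.0, 0.0]])
h = np.array([0.5])
Psi = np.array([[0.25]])

beta_sr, cov_sr = srmle(beta_mle, C, H, h, Psi)
print(beta_sr)
```

As $\Psi$ shrinks towards zero the correction forces $H\hat\beta = h$, which recovers the exactly restricted estimator. As $\Psi$ grows the correction vanishes and the MLE is returned.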

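Asymptotic Properties of SRMLE:
The $\hat\beta_{SRMLE}$ is asymptotically unbiased:
$$E(\hat\beta_{SRMLE}) = \beta. \tag{13}$$
The asymptotic covariance matrix of the SRMLE equals
$$\mathrm{Cov}(\hat\beta_{SRMLE}) = C^{-1} - C^{-1}H'(\Psi + HC^{-1}H')^{-1}HC^{-1}. \tag{14}$$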
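A short derivation may help to see where the asymptotic covariance matrix of the SRMLE comes from. The sketch below assumes that $C$ can be treated as fixed, that $\mathrm{Cov}(\hat\beta_{MLE}) = C^{-1}$ asymptotically, and that the restriction error $\upsilon$ is independent of $\hat\beta_{MLE}$. The gain matrix $K$ is notation introduced here for convenience.

```latex
\begin{align*}
K &= C^{-1}H'(\Psi + HC^{-1}H')^{-1}, \qquad
\hat\beta_{SRMLE} = (I - KH)\hat\beta_{MLE} + Kh, \\
E(\hat\beta_{SRMLE}) &= (I - KH)\beta + KH\beta = \beta, \\
\mathrm{Cov}(\hat\beta_{SRMLE})
  &= (I - KH)C^{-1}(I - KH)' + K\Psi K' \\
  &= C^{-1} - KHC^{-1} - C^{-1}H'K' + K(\Psi + HC^{-1}H')K' \\
  &= C^{-1} - KHC^{-1}
     \qquad \text{since } K(\Psi + HC^{-1}H') = C^{-1}H' \\
  &= C^{-1} - C^{-1}H'(\Psi + HC^{-1}H')^{-1}HC^{-1}
   = (C + H'\Psi^{-1}H)^{-1}.
\end{align*}
```

The last equality is the matrix inversion (Woodbury) identity. It shows that the stochastic restriction adds the information $H'\Psi^{-1}H$ to the sample information $C$.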

3. Mean Square Error Matrix Comparisons
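To compare different estimators of the same parameter vector $\beta$, the mean square error (MSE) matrix criterion can be used. The MSE matrix of an estimator $\hat\beta$ is
$$MSE(\hat\beta, \beta) = E[(\hat\beta - \beta)(\hat\beta - \beta)'] = D(\hat\beta) + B(\hat\beta)B'(\hat\beta), \tag{15}$$
where $D(\hat\beta)$ is the dispersion matrix and $B(\hat\beta) = E(\hat\beta) - \beta$ denotes the bias vector.
The Scalar Mean Square Error (SMSE) of the estimator $\hat\beta$ is defined as
$$SMSE(\hat\beta, \beta) = \mathrm{trace}[MSE(\hat\beta, \beta)]. \tag{16}$$
For two given estimators $\hat\beta_1$ and $\hat\beta_2$, the estimator $\hat\beta_2$ is said to be superior to $\hat\beta_1$ under the MSE criterion if and only if
$$M(\hat\beta_1, \hat\beta_2) = MSE(\hat\beta_1, \beta) - MSE(\hat\beta_2, \beta) \ge 0. \tag{17}$$
The MSE and SMSE of the proposed estimator SRMLE are
$$MSE(\hat\beta_{SRMLE}) = C^{-1} - C^{-1}H'(\Psi + HC^{-1}H')^{-1}HC^{-1}, \tag{18}$$
$$SMSE(\hat\beta_{SRMLE}) = \mathrm{trace}[C^{-1}] - \mathrm{trace}[C^{-1}H'(\Psi + HC^{-1}H')^{-1}HC^{-1}]. \tag{19}$$
Since the MLE is asymptotically unbiased with covariance matrix $C^{-1}$, the MSE difference between the MLE and the SRMLE is
$$MSE(\hat\beta_{MLE}) - MSE(\hat\beta_{SRMLE}) = C^{-1}H'(\Psi + HC^{-1}H')^{-1}HC^{-1}. \tag{20}$$
Note that the difference given in (20) is non-negative definite. Thus, by the MSE criterion, it follows that the SRMLE is superior to the MLE.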
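The non-negative definiteness of the MSE difference between the MLE and the SRMLE can also be checked numerically. The sketch below assumes that this difference equals $C^{-1}H'(\Psi + HC^{-1}H')^{-1}HC^{-1}$, which follows from the asymptotic covariance matrices of the two estimators. The function name `mse_difference` and the matrices `C`, `H` and `Psi` are hypothetical values chosen for illustration.

```python
import numpy as np

def mse_difference(C, H, Psi):
    """Asymptotic MSE(MLE) - MSE(SRMLE) for a symmetric positive definite C."""
    C_inv = np.linalg.inv(C)
    middle = np.linalg.inv(Psi + H @ C_inv @ H.T)
    return C_inv @ H.T @ middle @ H @ C_inv

# Hypothetical inputs for illustration only.
C = np.array([[5.0, 4.5, 0.5],
              [4.5, 5.0, 0.4],
              [0.5, 0.4, 2.0]])
H = np.array([[1.0, -1.0, 0.0]])
Psi = np.array([[0.25]])

diff = mse_difference(C, H, Psi)
# Symmetrize before the eigen-decomposition to remove rounding asymmetry.
eigenvalues = np.linalg.eigvalsh((diff + diff.T) / 2.0)
is_nnd = bool(np.all(eigenvalues >= -1e-12))
smse_gain = float(np.trace(diff))      # SMSE(MLE) - SMSE(SRMLE)
print(is_nnd, smse_gain)
```

The trace of the difference is the reduction in SMSE obtained by using the stochastic restriction, so a positive value corresponds to a smaller SMSE for the SRMLE.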


4. Some Existing Logistic Estimators
To examine the performance of the proposed estimator SRMLE over some existing estimators, the following estimators are considered.
1) Logistic Ridge Estimator
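Schaefer et al. (1984) [4] proposed a ridge estimator for the logistic regression model (1):
$$\hat\beta_{LRE} = (C + kI)^{-1}C\hat\beta_{MLE} = Z_k\hat\beta_{MLE},$$
where $Z_k = (C + kI)^{-1}C$ and $k \ge 0$ is the ridge parameter.
The asymptotic MSE and SMSE of $\hat\beta_{LRE}$ are
$$MSE(\hat\beta_{LRE}) = Z_kC^{-1}Z_k' + \delta_1\delta_1', \qquad SMSE(\hat\beta_{LRE}) = \mathrm{trace}[Z_kC^{-1}Z_k'] + \delta_1'\delta_1,$$
where $\delta_1 = (Z_k - I)\beta = -k(C + kI)^{-1}\beta$ is the bias vector.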
2) Logistic Liu Estimator
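Following Liu (1993) [11], Urgan and Tez (2008) [12] and Mansson et al. (2012) [7] examined the Liu estimator for the logistic regression model, which is defined as
$$\hat\beta_{LLE} = (C + I)^{-1}(C + dI)\hat\beta_{MLE} = Z_d\hat\beta_{MLE},$$
where $Z_d = (C + I)^{-1}(C + dI)$ and $0 < d < 1$ is the Liu parameter.
The asymptotic MSE and SMSE of $\hat\beta_{LLE}$ are
$$MSE(\hat\beta_{LLE}) = Z_dC^{-1}Z_d' + \delta_2\delta_2', \qquad SMSE(\hat\beta_{LLE}) = \mathrm{trace}[Z_dC^{-1}Z_d'] + \delta_2'\delta_2,$$
where $\delta_2 = (Z_d - I)\beta = -(1 - d)(C + I)^{-1}\beta$ is the bias vector.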
3) Restricted MLE
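As mentioned in Section 2, Duffy and Santner (1989) [9] proposed the Restricted Maximum Likelihood Estimator (RMLE) for the logistic regression model (1) with the exact prior restriction (i.e. $\upsilon = 0$ in (10)):
$$\hat\beta_{RMLE} = \hat\beta_{MLE} + C^{-1}H'(HC^{-1}H')^{-1}(h - H\hat\beta_{MLE}).$$
The asymptotic MSE and SMSE of $\hat\beta_{RMLE}$ are
$$MSE(\hat\beta_{RMLE}) = A + \delta_3\delta_3',$$
$$SMSE(\hat\beta_{RMLE}) = \mathrm{trace}[A] + \delta_3'\delta_3,$$
where $A = C^{-1} - C^{-1}H'(HC^{-1}H')^{-1}HC^{-1}$
and $\delta_3 = C^{-1}H'(HC^{-1}H')^{-1}(h - H\beta)$.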
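The three existing estimators listed in this section can be sketched in a few lines of Python. The sketch assumes the standard forms $(C + kI)^{-1}C\hat\beta_{MLE}$ for the ridge estimator, $(C + I)^{-1}(C + dI)\hat\beta_{MLE}$ for the Liu estimator, and $\hat\beta_{MLE} + C^{-1}H'(HC^{-1}H')^{-1}(h - H\hat\beta_{MLE})$ for the RMLE. The function names and the numerical inputs are hypothetical.

```python
import numpy as np

def logistic_ridge(beta_mle, C, k):
    """Ridge-type logistic estimator (C + kI)^{-1} C beta_mle, with k >= 0."""
    I = np.eye(C.shape[0])
    return np.linalg.solve(C + k * I, C @ beta_mle)

def logistic_liu(beta_mle, C, d):
    """Liu-type logistic estimator (C + I)^{-1} (C + dI) beta_mle, with 0 < d < 1."""
    I = np.eye(C.shape[0])
    return np.linalg.solve(C + I, (C + d * I) @ beta_mle)

def restricted_mle(beta_mle, C, H, h):
    """Restricted MLE under the exact restriction H beta = h."""
    C_inv_Ht = np.linalg.solve(C, H.T)             # C^{-1} H'
    return beta_mle + C_inv_Ht @ np.linalg.solve(H @ C_inv_Ht, h - H @ beta_mle)

# Hypothetical inputs for illustration only.
beta_mle = np.array([1.2, -0.4])
C = np.array([[4.0, 3.6],
              [3.6, 4.0]])                         # ill-conditioned information matrix
H = np.array([[1.0, -1.0]])
h = np.array([0.0])

b_lre = logistic_ridge(beta_mle, C, k=0.5)
b_lle = logistic_liu(beta_mle, C, d=0.5)
b_rmle = restricted_mle(beta_mle, C, H, h)
print(b_lre, b_lle, b_rmle)
```

Setting `k=0` or `d=1` returns the MLE unchanged, which is a convenient sanity check. The ridge and Liu estimators shrink the MLE, whereas the RMLE projects it onto the restriction.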
Mean Squared Error Comparisons
・ SRMLE versus LRE

where





By Theorem 1 (see Appendix 1), it is clear that





Theorem 4.1. The estimator SRMLE is superior to LRE if and only if
・ SRMLE versus LLE

where





definite matrices. Further, by Theorem 1 (see Appendix 1), it is clear that





Theorem 4.2. The estimator



・ SRMLE versus RMLE

where










Theorem 4.3. The estimator



Based on the above results, the new estimator SRMLE is superior to the other estimators in the mean squared error matrix sense under certain conditions. To check the superiority of the estimators numerically, we consider a simulation study in the next section.
5. A Simulation Study
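A Monte Carlo simulation is done to illustrate the performance of the new estimator SRMLE over the MLE, RMLE, LRE, and LLE by means of the Scalar Mean Square Error (SMSE). Following McDonald and Galarneau (1975) [13], the data are generated as follows:
$$x_{ij} = (1 - \rho^2)^{1/2}z_{ij} + \rho z_{i,p+1}, \quad i = 1, \ldots, n, \ j = 1, \ldots, p,$$
where $z_{ij}$ are pseudo-random numbers from the standard normal distribution and $\rho$ is specified so that the theoretical correlation between any two explanatory variables is $\rho^2$. The dependent variable $y_i$ in (1) is obtained from the Bernoulli($\pi_i$) distribution, where $\pi_i$ is given by (2).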
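The data-generating scheme of McDonald and Galarneau can be sketched in Python as follows. The sketch assumes $x_{ij} = (1-\rho^2)^{1/2}z_{ij} + \rho z_{i,p+1}$ with independent standard normal $z_{ij}$, which gives a pairwise correlation of $\rho^2$, and a Bernoulli response with logistic success probability. The function names, the choice of four predictors, the equal coefficients normalized to unit length, and the absence of an intercept are assumptions made for this example.

```python
import numpy as np

def generate_predictors(n, p, rho, rng):
    """Collinear predictors: each pair has theoretical correlation rho**2."""
    z = rng.standard_normal((n, p + 1))
    # The last column of z is shared by all predictors and induces the correlation.
    return np.sqrt(1.0 - rho**2) * z[:, :p] + rho * z[:, [p]]

def generate_response(X, beta, rng):
    """Bernoulli responses with pi_i = exp(x_i' beta) / (1 + exp(x_i' beta))."""
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return rng.binomial(1, pi)

# Illustrative settings (assumed, not taken from the paper).
rng = np.random.default_rng(42)
n, p, rho = 5000, 4, 0.9
beta = np.ones(p) / np.sqrt(p)          # equal coefficients with beta' beta = 1

X = generate_predictors(n, p, rho, rng)
y = generate_response(X, beta, rng)
corr = np.corrcoef(X, rowvar=False)
print(np.round(corr, 2))
```

With a large `n` the off-diagonal entries of `corr` are close to $\rho^2 = 0.81$, which confirms that $\rho$ controls the degree of collinearity.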




Moreover, for the restriction, we choose

Further for the ridge parameter k and the Liu parameter d, some selected values are chosen so that


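The experiment is replicated 3000 times by generating new pseudo-random numbers, and the estimated SMSE is obtained as
$$\widehat{SMSE}(\hat\beta^*) = \frac{1}{3000}\sum_{r=1}^{3000}(\hat\beta_r - \beta)'(\hat\beta_r - \beta),$$
where $\hat\beta_r$ is the value of the estimator $\hat\beta^*$ in the $r$th replication.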
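A reduced version of such a Monte Carlo experiment can be sketched in Python. The estimated SMSE is taken as the average of $(\hat\beta_r - \beta)'(\hat\beta_r - \beta)$ over the replications. This sketch compares only the MLE and the SRMLE and makes several assumptions for illustration: two predictors, 200 replications instead of 3000, a hypothetical restriction $H = (1, -1)$ with $\Psi = 0.1$, and a restriction vector $h$ drawn afresh as $H\beta + \upsilon$ in every replication.

```python
import numpy as np

def fit_mle(X, y, n_iter=25):
    """Newton-Raphson (equivalent to IRLS) for the logistic MLE; returns (beta, C)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        pi = np.clip(1.0 / (1.0 + np.exp(-(X @ beta))), 1e-10, 1.0 - 1e-10)
        C = X.T @ ((pi * (1.0 - pi))[:, None] * X)
        beta = beta + np.linalg.solve(C, X.T @ (y - pi))
    pi = np.clip(1.0 / (1.0 + np.exp(-(X @ beta))), 1e-10, 1.0 - 1e-10)
    return beta, X.T @ ((pi * (1.0 - pi))[:, None] * X)

# Illustrative settings (assumed, not the paper's design).
rng = np.random.default_rng(2015)
n, p, rho, reps = 100, 2, 0.9, 200
beta = np.ones(p) / np.sqrt(p)
H = np.array([[1.0, -1.0]])
Psi = np.array([[0.1]])

sq_err_mle = 0.0
sq_err_srmle = 0.0
for _ in range(reps):
    # Collinear predictors and Bernoulli responses.
    z = rng.standard_normal((n, p + 1))
    X = np.sqrt(1.0 - rho**2) * z[:, :p] + rho * z[:, [p]]
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta)))).astype(float)
    # Stochastic restriction h = H beta + v with Cov(v) = Psi.
    h = H @ beta + rng.multivariate_normal(np.zeros(1), Psi)

    b_mle, C = fit_mle(X, y)
    C_inv_Ht = np.linalg.solve(C, H.T)
    b_sr = b_mle + C_inv_Ht @ np.linalg.solve(Psi + H @ C_inv_Ht, h - H @ b_mle)

    sq_err_mle += np.sum((b_mle - beta) ** 2)
    sq_err_srmle += np.sum((b_sr - beta) ** 2)

smse_mle = sq_err_mle / reps
smse_srmle = sq_err_srmle / reps
print(smse_mle, smse_srmle)
```

The printed values depend on the seed and on the assumed settings, so they are not comparable with the tables of this paper. The sketch only illustrates how the estimated SMSE is accumulated across replications.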

The simulation results are listed in Tables A1-A16 (Appendix 3) and displayed in Figures A1-A4 (Appendix 2). From Figures A1-A4, it can be noticed that, in general, an increase in the degree of correlation between two explanatory variables





6. Concluding Remarks
In this research, we introduced the Stochastic Restricted Maximum Likelihood Estimator (SRMLE) for the logistic regression model when a linear stochastic restriction is available. The performance of the SRMLE relative to the MLE, LRE, RMLE, and LLE in the logistic regression model was investigated in a Monte Carlo simulation study. The study considered different degrees of correlation, different numbers of observations and different values of the parameters k and d. The SMSE of the MLE was inflated when multicollinearity was present, and the inflation was particularly severe for small samples. The simulation results showed that the proposed estimator SRMLE had a smaller SMSE than the MLE for all the values of n and

Acknowledgements
We thank the editor and the referee for their comments and suggestions, and the Postgraduate Institute of Science, University of Peradeniya, Sri Lanka for providing necessary facilities to complete this research.
Cite this paper
Varathan Nagarajah, Pushpakanthie Wijekoon (2015) Stochastic Restricted Maximum Likelihood Estimator in Logistic Regression Model. Open Journal of Statistics, 05, 837-851. doi: 10.4236/ojs.2015.57082
References
- 1. Şiray, G.U., Toker, S. and Kaçiranlar, S. (2015) On the Restricted Liu Estimator in Logistic Regression Model. Communications in Statistics - Simulation and Computation, 44, 217-232. http://dx.doi.org/10.1080/03610918.2013.771742
- 2. Hosmer, D.W. and Lemeshow, S. (1989) Applied Logistic Regression. Wiley, New York.
- 3. Ryan, T.P. (1997) Modern Regression Methods. Wiley, New York.
- 4. Schaefer, R.L., Roi, L.D. and Wolfe, R.A. (1984) A Ridge Logistic Estimator. Communications in Statistics - Theory and Methods, 13, 99-113. http://dx.doi.org/10.1080/03610928408828664
- 5. Aguilera, A.M., Escabias, M. and Valderrama, M.J. (2006) Using Principal Components for Estimating Logistic Regression with High-Dimensional Multicollinear Data. Computational Statistics & Data Analysis, 50, 1905-1924. http://dx.doi.org/10.1016/j.csda.2005.03.011
- 6. Nja, M.E., Ogoke, U.P. and Nduka, E.C. (2013) The Logistic Regression Model with a Modified Weight Function. Journal of Statistical and Econometric Methods, 2, 161-171.
- 7. Mansson, G., Kibria, B.M.G. and Shukur, G. (2012) On Liu Estimators for the Logit Regression Model. The Royal Institute of Technology, Centre of Excellence for Science and Innovation Studies (CESIS), Paper No. 259. http://dx.doi.org/10.1016/j.econmod.2011.11.015
- 8. Inan, D. and Erdogan, B.E. (2013) Liu-Type Logistic Estimator. Communications in Statistics - Simulation and Computation, 42, 1578-1586. http://dx.doi.org/10.1080/03610918.2012.667480
- 9. Duffy, D.E. and Santner, T.J. (1989) On the Small Sample Properties of Norm-Restricted Maximum Likelihood Estimators for Logistic Regression Models. Communications in Statistics - Theory and Methods, 18, 959-980. http://dx.doi.org/10.1080/03610928908829944
- 10. Theil, H. and Goldberger, A.S. (1961) On Pure and Mixed Estimation in Economics. International Economic Review, 2, 65-77. http://dx.doi.org/10.2307/2525589
- 11. Liu, K. (1993) A New Class of Biased Estimate in Linear Regression. Communications in Statistics - Theory and Methods, 22, 393-402. http://dx.doi.org/10.1080/03610929308831027
- 12. Urgan, N.N. and Tez, M. (2008) Liu Estimator in Logistic Regression When the Data Are Collinear. International Conference on Continuous Optimization and Knowledge-Based Technologies, Lithuania, Selected Papers, Vilnius, 323-327.
- 13. McDonald, G.C. and Galarneau, D.I. (1975) A Monte Carlo Evaluation of Some Ridge-Type Estimators. Journal of the American Statistical Association, 70, 407-416. http://dx.doi.org/10.1080/01621459.1975.10479882
- 14. Rao, C.R. and Toutenburg, H. (1995) Linear Models: Least Squares and Alternatives. 2nd Edition, Springer-Verlag, New York, Inc.
- 15. Rao, C.R., Toutenburg, H., Shalabh and Heumann, C. (2008) Linear Models and Generalizations. Springer, Berlin.
Appendix 1
Theorem 1. Let A:





Lemma 1. Let the two





Appendix 2
Figure A1. Estimated SMSE values for MLE, LRE, RMLE, LLE and SRMLE for n = 20.
Figure A2. Estimated SMSE values for MLE, LRE, RMLE, LLE and SRMLE for n = 50.
Figure A3. Estimated SMSE values for MLE, LRE, RMLE, LLE and SRMLE for n = 75.
Figure A4. Estimated SMSE values for MLE, LRE, RMLE, LLE and SRMLE for n = 100.
Appendix 3
Table A1. The estimated MSE values for different



Table A2. The estimated MSE values for different


Table A3. The estimated MSE values for different



Table A4. The estimated MSE values for different



Table A5. The estimated MSE values for different



Table A6. The estimated MSE values for different



Table A7. The estimated MSE values for different


Table A8. The estimated MSE values for different



Table A9. The estimated MSE values for different



Table A10. The estimated MSE values for different



Table A11. The estimated MSE values for different



Table A12. The estimated MSE values for different



Table A13. The estimated MSE values for different



Table A14. The estimated MSE values for different



Table A15. The estimated MSE values for different



Table A16. The estimated MSE values for different



Table A17. Summary of the Tables A1-A16.
Table A18. The best estimators and the corresponding



Table A19. The best estimators and the corresponding













