Open Journal of Statistics
Vol.04 No.09(2014), Article ID:50518,7 pages
10.4236/ojs.2014.49071

On Diagnostics in Stochastic Restricted Linear Regression Models

Shuling Wang, Man Liu, Xiaohong Deng

Department of Fundamental Course, Air Force Logistics College, Xuzhou, China

Email: 155328313@qq.com

Copyright © 2014 by authors and Scientific Research Publishing Inc.

This work is licensed under the Creative Commons Attribution International License (CC BY).

http://creativecommons.org/licenses/by/4.0/

Received 18 August 2014; revised 23 September 2014; accepted 2 October 2014

ABSTRACT

This paper proposes diagnostic methods for stochastic restricted linear regression models. We first review the model and its estimators. We then show that the case deletion model is equivalent to the mean shift outlier model for diagnostic purposes, and give several diagnostic statistics. Finally, an example is given to illustrate our results.

Keywords:

Stochastic Restricted Linear Regression Model, Stochastic Restricted Ridge Estimator, Statistical Diagnostics

1. Introduction

In linear regression, the ordinary least squares (LS) estimator is unbiased, has minimum variance among all linear unbiased estimators, and has long been treated as the best estimator. When additional stochastic linear restrictions on the unknown parameter vector are assumed to hold, Theil [1] proposed the ordinary mixed estimator (OME). Hubert and Wijekoon [2] proposed the stochastic restricted Liu estimator (SRLE), and Li and Yang [3] introduced the stochastic restricted ridge estimator (SRRE) by grafting the ordinary ridge estimator (ORE) into the mixed estimation procedure. Wu [4] discussed the stochastic restricted r-k class estimator and the stochastic restricted r-d class estimator in the linear regression model.

When the prior information and the sample information are not equally important, Schaffrin and Toutenburg [5] introduced the method of weighted mixed regression and developed the weighted mixed estimator (WME). Li and Yang [6] grafted the ORE into the weighted mixed estimation procedure and proposed the weighted mixed ridge estimator. Liu, et al. [7] proposed the stochastic weighted mixed almost unbiased ridge estimator by combining the WME with the almost unbiased ridge estimator (AURE), and the stochastic weighted mixed almost unbiased Liu estimator by combining the WME with the almost unbiased Liu estimator (AULE). He and Wu [8] proposed a new estimator to combat multicollinearity in the linear model when there are stochastic linear restrictions on the regression coefficients; it combines the OME with the principal components regression (PCR) estimator and is called the stochastic restricted principal components (SRPC) regression estimator. Liu, Yang and Wu [9] introduced the weighted mixed almost unbiased ridge estimator (WMAURE), based on the WME and the AURE, and discussed its superiority under the quadratic bias (QB) and mean square error matrix (MSEM) criteria. Wu and Liu [10] compared several stochastic restricted ridge regression estimators in a simulation study, which showed that they outperform the mixed estimator.

Over nearly forty years, diagnostics and influence analysis for linear regression models have been thoroughly developed (R.D. Cook and S. Weisberg [11], Wei, et al. [12]). Jiawei Wang [13] discussed the linear regression model with random constraints, introduced its residuals, and showed that the case deletion model (CDM) is equivalent to the mean shift outlier model (MSOM) for diagnostic purposes under the generalized least squares estimate. Lian Yang and Hu Yang [14] dealt with the data deletion model and the mean shift model under an ellipsoidal restriction and obtained the equivalence of the diagnostic statistics between the two models. Lu Wang [15] discussed statistical diagnostics for the multivariate linear regression model with linear restrictions.

In this paper, we study statistical diagnostics for stochastic restricted linear regression models based on the stochastic restricted ridge estimator (SRRE). The paper is organized as follows. Section 2 reviews the model and its estimators. Section 3 shows that the case deletion model is equivalent to the mean shift outlier model for diagnostic purposes. Section 4 gives some diagnostic statistics. Section 5 gives an example to illustrate our results.

2. Review of Stochastic Restricted Linear Regression Model

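Consider the following linear model:

$$ y = X\beta + \varepsilon, \tag{1} $$

where $y$ is an $n \times 1$ vector of observations, $X$ is an $n \times p$ design matrix of rank $p$, $\beta$ is a $p \times 1$ vector of unknown coefficients, and $\varepsilon$ is an $n \times 1$ random error vector with $E(\varepsilon) = 0$ and $\mathrm{Cov}(\varepsilon) = \sigma^2 I_n$.

Suppose that $\beta$ satisfies the following stochastic restriction:

$$ r = R\beta + e, \tag{2} $$

where $R$ is a $q \times p$ nonzero matrix with $\mathrm{rank}(R) = q$, $r$ is a known $q \times 1$ vector, and $e$ is a $q \times 1$ random vector with $E(e) = 0$ and $\mathrm{Cov}(e) = \sigma^2 W$ for a known positive definite matrix $W$. In this paper, we assume that $\varepsilon$ is independent of $e$. Model (1) together with restriction (2) is called the stochastic restricted linear regression model.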

2.1. Estimates of Model

Using the mixed approach, Durbin [16] and Theil and Goldberger [17] introduced the mixed estimator (ME), which is defined as follows:

$$ \hat\beta_{ME} = (X'X + R'W^{-1}R)^{-1}(X'y + R'W^{-1}r). \tag{3} $$

The mixed estimator is unbiased. However, when multicollinearity exists, it is no longer a good estimator.

Özkale [18] proposed the following stochastic restricted ridge estimator (SRRE):

$$ \hat\beta_{SRRE} = (X'X + R'W^{-1}R + kI_p)^{-1}(X'y + R'W^{-1}r), \tag{4} $$

where $k > 0$ is the ridge parameter.

Simulation results show that the SRRE outperforms the ME (see Wu and Liu [10]).
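As a rough illustration of the two estimators, the following sketch computes them with NumPy. It assumes the usual forms $(X'X + R'W^{-1}R)^{-1}(X'y + R'W^{-1}r)$ for the ME and the same expression with $kI$ added inside the inverse for the SRRE. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def mixed_estimator(X, y, R, r, W):
    """Mixed estimator: (X'X + R'W^{-1}R)^{-1} (X'y + R'W^{-1}r)."""
    RtWinv = np.linalg.solve(W, R).T            # R' W^{-1}, valid for symmetric W
    S = X.T @ X + RtWinv @ R
    return np.linalg.solve(S, X.T @ y + RtWinv @ r)

def srre(X, y, R, r, W, k):
    """Stochastic restricted ridge estimator (assumed form): ME with k*I added."""
    RtWinv = np.linalg.solve(W, R).T
    A = X.T @ X + RtWinv @ R + k * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y + RtWinv @ r)

# Hypothetical toy data: n = 50 cases, p = 3 coefficients, q = 1 restriction.
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
X[:, 2] = X[:, 1] + 0.01 * rng.normal(size=n)   # two nearly collinear columns
beta = np.array([1.0, 2.0, -1.0])
y = X @ beta + rng.normal(size=n)
R = np.array([[1.0, 1.0, 1.0]])                 # restriction on the coefficient sum
W = np.array([[1.0]])
r = R @ beta + rng.normal(size=1)

b_me = mixed_estimator(X, y, R, r, W)
b_srre = srre(X, y, R, r, W, k=0.5)
print("ME:  ", b_me)
print("SRRE:", b_srre)
```

With $k = 0$ the SRRE in this sketch reduces to the ME, and for $k > 0$ it shrinks the ME toward zero, which is what stabilizes the estimate when columns of $X$ are nearly collinear.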

2.2. Estimating k

The most classical ridge estimator for linear regression is the following:

,

proposed by Hoerl and Kennard [20] , where denote the maximum element of,

, , and is the estimator of. Hoerl, et al. [21] introduced

an alternative of the estimator of, which is defined as follows:

.

In Schaefer, et al. [22] , a modified version of this estimator is proposed as follows:

.

In Kibria, et al. [23] , a new estimator is proposed as follows:

.

This paper selects to estimate below.
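For orientation, the sketch below computes two ridge-parameter estimates that are standard in the ridge literature: the Hoerl-Kennard value $\hat\sigma^2 / \hat\alpha_{\max}^2$ and the Hoerl-Kennard-Baldwin value $p\hat\sigma^2 / (\hat\alpha'\hat\alpha)$, where $\hat\alpha$ is the LS estimate in canonical (eigenvector) coordinates. Whether these coincide with the exact expressions intended in this section is an assumption, as is the use of the LS fit to obtain $\hat\sigma^2$ and $\hat\alpha$. The function name and the toy data are hypothetical.

```python
import numpy as np

def ridge_k_candidates(X, y):
    """Return (k_HK, k_HKB) computed from the least squares fit (assumed forms)."""
    n, p = X.shape
    beta_ls = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta_ls
    sigma2 = resid @ resid / (n - p)            # usual unbiased estimate of sigma^2
    # Canonical form: X'X = Q diag(lam) Q', alpha = Q' beta
    lam, Q = np.linalg.eigh(X.T @ X)
    alpha = Q.T @ beta_ls
    k_hk = sigma2 / np.max(alpha ** 2)          # Hoerl-Kennard type
    k_hkb = p * sigma2 / (alpha @ alpha)        # Hoerl-Kennard-Baldwin type
    return k_hk, k_hkb

# Hypothetical toy data
rng = np.random.default_rng(2)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -0.5, 2.0]) + rng.normal(size=40)

k_hk, k_hkb = ridge_k_candidates(X, y)
print(k_hk, k_hkb)
```

Because $\hat\alpha'\hat\alpha \le p\,\hat\alpha_{\max}^2$, the second value is never smaller than the first, so it applies at least as much shrinkage.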

3. Diagnostic Methods

3.1. Case-Deletion Model

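Consider the stochastic restricted linear model with the $i$-th case deleted:

$$ y_j = x_j'\beta + \varepsilon_j, \quad j = 1, \ldots, n,\ j \neq i; \qquad r = R\beta + e, \tag{5} $$

where $x_j'$ denotes the $j$-th row of $X$.

This model is called the case-deletion model (CDM). Let $\hat\beta(i)$ denote the SRRE of the coefficient vector $\beta$ in model (5).

To study the influence of the $i$-th case, we compare $\hat\beta(i)$ with $\hat\beta$, the SRRE of $\beta$ computed from all $n$ cases. The key result is the following theorem.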

Theorem 1. For model (5), the SRRE of is

(6)

and

(7)

where,.

Proof: Let and corresponding to and to delete the cases which belong to. For model (5), we use the SRRE obtained that

Supposed that, , then

which leads to (6).

Because

and,

hence
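A numerical check can make the case-deletion result concrete. The sketch below assumes the SRRE has the form $A^{-1}c$ with $A = X'X + R'W^{-1}R + kI$ and $c = X'y + R'W^{-1}r$, and uses the Sherman-Morrison identity to obtain the deleted-case estimate as $\hat\beta - A^{-1}x_i \hat e_i / (1 - p_{ii})$, with $\hat e_i = y_i - x_i'\hat\beta$ and $p_{ii} = x_i'A^{-1}x_i$. This update formula is derived here under those assumptions and is not quoted from the theorem; the function names and toy data are hypothetical.

```python
import numpy as np

def srre_system(X, y, R, r, W, k):
    """Return A = X'X + R'W^{-1}R + kI and c = X'y + R'W^{-1}r."""
    RtWinv = np.linalg.solve(W, R).T            # R' W^{-1}, valid for symmetric W
    A = X.T @ X + RtWinv @ R + k * np.eye(X.shape[1])
    c = X.T @ y + RtWinv @ r
    return A, c

def srre_without_case(X, y, A, c, i):
    """SRRE with case i deleted, updated from the full fit via Sherman-Morrison."""
    b = np.linalg.solve(A, c)                   # full-data SRRE
    Ainv_x = np.linalg.solve(A, X[i])           # A^{-1} x_i
    p_ii = X[i] @ Ainv_x                        # x_i' A^{-1} x_i
    e_i = y[i] - X[i] @ b                       # residual of case i
    return b - Ainv_x * e_i / (1.0 - p_ii)

# Hypothetical toy data
rng = np.random.default_rng(1)
n, p, k, i = 30, 3, 0.3, 4
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)
R, W, r = np.array([[1.0, 0.0, 1.0]]), np.array([[2.0]]), np.array([1.4])

A, c = srre_system(X, y, R, r, W, k)
b_closed = srre_without_case(X, y, A, c, i)

# Direct refit with row i removed, for comparison
A_del, c_del = srre_system(np.delete(X, i, axis=0), np.delete(y, i), R, r, W, k)
b_refit = np.linalg.solve(A_del, c_del)
print(np.max(np.abs(b_closed - b_refit)))
```

The update needs only the full-data fit, so all $n$ deleted-case estimates can be obtained without refitting the model $n$ times.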

3.2. Mean Shift Outlier Model

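Another common diagnostic model is the mean shift outlier model (MSOM). For the stochastic restricted linear regression model, the corresponding MSOM is

$$ y_j = x_j'\beta + \varepsilon_j, \quad j \neq i; \qquad y_i = x_i'\beta + \gamma + \varepsilon_i; \qquad r = R\beta + e, \tag{8} $$

where $x_j'$ denotes the $j$-th row of $X$ and the scalar parameter $\gamma$ describes the outlier. Let $\hat\beta_m$ and $\hat\gamma_m$ denote the SRREs of $\beta$ and $\gamma$ in model (8). The matrix form of model (8) is as follows:

$$ y = X\beta + d_i\gamma + \varepsilon, \qquad r = R\beta + e, \tag{9} $$

where $d_i$ is an $n$-dimensional vector whose $i$-th component is 1 and whose other components are zero.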

Theorem 2. For model (8), there are, and.

Proof: By the matrix form of model (8), we obtained

On the other hand, by the formula of calculating the inverse matrix of partitioned matrix, we have

which leads to and.
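The equivalence between the mean shift outlier model and case deletion can also be checked numerically. The sketch below solves the partitioned normal equations of the MSOM for $(\beta, \gamma)$ and compares the result with a refit on the data without case $i$. It assumes that the ridge penalty $k$ applies to $\beta$ only and not to the shift parameter $\gamma$, and that the SRRE has the form $A^{-1}c$ with $A = X'X + R'W^{-1}R + kI$; function names and toy data are hypothetical.

```python
import numpy as np

def srre_system(X, y, R, r, W, k):
    """Return A = X'X + R'W^{-1}R + kI and c = X'y + R'W^{-1}r."""
    RtWinv = np.linalg.solve(W, R).T            # R' W^{-1}, valid for symmetric W
    A = X.T @ X + RtWinv @ R + k * np.eye(X.shape[1])
    c = X.T @ y + RtWinv @ r
    return A, c

def srre_msom(X, y, A, c, i):
    """Estimate (beta, gamma) in the MSOM; gamma is not penalized (assumption)."""
    p = X.shape[1]
    x_i = X[i]
    # Partitioned normal equations: [[A, x_i], [x_i', 1]] [beta; gamma] = [c; y_i]
    M = np.block([[A, x_i[:, None]],
                  [x_i[None, :], np.ones((1, 1))]])
    sol = np.linalg.solve(M, np.append(c, y[i]))
    return sol[:p], sol[p]

# Hypothetical toy data
rng = np.random.default_rng(3)
n, p, k, i = 30, 3, 0.3, 10
X = rng.normal(size=(n, p))
y = X @ np.array([0.5, 1.0, -1.5]) + rng.normal(size=n)
R, W, r = np.array([[1.0, 1.0, 0.0]]), np.array([[1.0]]), np.array([1.6])

A, c = srre_system(X, y, R, r, W, k)
beta_m, gamma_m = srre_msom(X, y, A, c, i)

# Case-deletion refit for comparison
A_del, c_del = srre_system(np.delete(X, i, axis=0), np.delete(y, i), R, r, W, k)
b_del = np.linalg.solve(A_del, c_del)
print(np.max(np.abs(beta_m - b_del)), gamma_m - (y[i] - X[i] @ b_del))
```

In this sketch the estimated shift equals the prediction error of case $i$ under the fit that excludes it, which is why a large estimated shift points to an outlying case.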

4. Diagnostic Statistics

4.1. Generalized Cook Distance

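Let $M$ be a nonnegative definite matrix and $c$ a positive real number, and let $\hat\beta$ and $\hat\beta(i)$ denote the SRREs computed from all cases and from the data without the $i$-th case, respectively. The generalized Cook distance of the $i$-th case is defined as follows:

$$ D_i(M, c) = \frac{(\hat\beta - \hat\beta(i))' M (\hat\beta - \hat\beta(i))}{c}. \tag{10} $$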
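The generalized Cook distance can be computed for all cases at once. The sketch below assumes the quadratic-form definition $(\hat\beta - \hat\beta(i))'M(\hat\beta - \hat\beta(i))/c$ and obtains the differences $\hat\beta - \hat\beta(i)$ from a Sherman-Morrison update of the full fit. The choices $M = X'X$ and $c = p\hat\sigma^2$ are the classical ones and are only an assumption here, as is the residual-based $\hat\sigma^2$; function names and toy data are hypothetical.

```python
import numpy as np

def srre(X, y, R, r, W, k):
    """SRRE in the assumed form (X'X + R'W^{-1}R + kI)^{-1}(X'y + R'W^{-1}r)."""
    RtWinv = np.linalg.solve(W, R).T
    A = X.T @ X + RtWinv @ R + k * np.eye(X.shape[1])
    return A, np.linalg.solve(A, X.T @ y + RtWinv @ r)

def generalized_cook(X, y, R, r, W, k, M, c):
    """Generalized Cook distances D_i(M, c) for every case i."""
    A, b = srre(X, y, R, r, W, k)
    Ainv_Xt = np.linalg.solve(A, X.T)               # column i is A^{-1} x_i
    p_ii = np.einsum("ij,ji->i", X, Ainv_Xt)        # x_i' A^{-1} x_i
    e = y - X @ b                                   # full-data residuals
    delta = Ainv_Xt * (e / (1.0 - p_ii))            # column i is b - b(i)
    return np.einsum("ji,jk,ki->i", delta, M, delta) / c

# Hypothetical toy data
rng = np.random.default_rng(4)
n, p, k = 40, 3, 0.2
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, 1.0, -1.0]) + rng.normal(size=n)
R, W, r = np.array([[1.0, -1.0, 0.0]]), np.array([[1.0]]), np.array([0.1])

_, b = srre(X, y, R, r, W, k)
s2 = np.sum((y - X @ b) ** 2) / (n - p)             # assumed estimate of sigma^2
D = generalized_cook(X, y, R, r, W, k, M=X.T @ X, c=p * s2)
print(np.argsort(D)[-3:])                           # indices of the three largest distances
```

Cases with unusually large values of `D` relative to the rest are the candidates for strong influence.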

Theorem 3. Supposed that, then the generalized Cook distance of the -th case is

where,.

Proof: Because

Substituting these results into (10) gives

4.2. W-K Statistic

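The W-K (Welsch-Kuh) statistic measures influence from the viewpoint of data fitting: it considers the influence of the $i$-th case on the fitted value at that case. To eliminate the influence of scale, the change in the fitted value must also be divided by the estimated variability of the estimator. Because the emphasis is on the influence of deleting the $i$-th case, $\hat\sigma$ is replaced by $\hat\sigma(i)$, the estimate of $\sigma$ computed without the $i$-th case. The W-K statistic can then be expressed as follows: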

(11)
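A minimal sketch of a W-K type statistic follows. It assumes the classical Welsch-Kuh form $x_i'(\hat\beta - \hat\beta(i)) / (\hat\sigma(i)\sqrt{p_{ii}})$ with SRRE quantities plugged in, where $p_{ii} = x_i'A^{-1}x_i$ and $\hat\sigma^2(i)$ is taken to be the residual mean square of the fit without case $i$. Both the scaling and the variance estimate are assumptions and may differ from expression (11); function names and toy data are hypothetical.

```python
import numpy as np

def srre(X, y, R, r, W, k):
    """SRRE in the assumed form (X'X + R'W^{-1}R + kI)^{-1}(X'y + R'W^{-1}r)."""
    RtWinv = np.linalg.solve(W, R).T
    A = X.T @ X + RtWinv @ R + k * np.eye(X.shape[1])
    return A, np.linalg.solve(A, X.T @ y + RtWinv @ r)

def wk_statistics(X, y, R, r, W, k):
    """W-K type statistic for every case, computed by explicit refits."""
    n, p = X.shape
    A, b = srre(X, y, R, r, W, k)
    wk = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        _, b_i = srre(X[keep], y[keep], R, r, W, k)     # fit without case i
        res_i = y[keep] - X[keep] @ b_i
        sigma_i = np.sqrt(res_i @ res_i / (n - 1 - p))  # assumed estimate of sigma(i)
        p_ii = X[i] @ np.linalg.solve(A, X[i])          # x_i' A^{-1} x_i
        wk[i] = X[i] @ (b - b_i) / (sigma_i * np.sqrt(p_ii))
    return wk

# Hypothetical toy data
rng = np.random.default_rng(5)
n, p, k = 30, 3, 0.2
X = rng.normal(size=(n, p))
y = X @ np.array([2.0, 0.0, -1.0]) + rng.normal(size=n)
R, W, r = np.array([[0.0, 1.0, 1.0]]), np.array([[1.0]]), np.array([-0.9])

_, b = srre(X, y, R, r, W, k)
wk = wk_statistics(X, y, R, r, W, k)
print(np.argsort(np.abs(wk))[-3:])                      # cases with the largest |W-K|
```

The explicit loop over refits keeps the sketch easy to read; for large $n$ the deleted-case fits could instead be obtained by a rank-one update of the full fit.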

4.3. Covariance Ratio Statistic

is to measure the superiorities of. The covariance ratio statistic is defined as follows:

,

which measures the influence of the $i$-th case.

5. Monte Carlo Experiments

To illustrate the validity of the above results, extensive Monte Carlo sampling experiments were conducted. To evaluate the finite-sample performance of the proposed method, we simulate 60 random samples from the following model:

The stochastic restrictions are as follows:

where,. To check the validity of the proposed methodology, we change the values of the first, 125th and 374th observations. For every case, it is easy to obtain the generalized Cook distance, the W-K statistic and the covariance ratio statistic.

Figures 1-3 show that in most cases the values of the diagnostic statistics are reasonably close to one fixed value. By the definition and properties of the diagnostic statistics, strong influence points can be identified as the points whose values deviate seriously from the average. Figure 2 and Figure 3 show that the first and the third data are strong influence points, which illustrates our results.

Figure 1. Generalized Cook distance.

Figure 2. W-K statistic.

Figure 3. Covariance ratio statistic.

6. Conclusion

In this paper, stochastic restricted linear regression models are revisited and useful diagnostic methods are derived. A simulation study illustrates that the proposed methods work fairly well.

References

  1. Theil, H. (1963) On the Use of Incomplete Prior Information in Regression Analysis. Journal of the American Statistical Association, 58, 401-414. http://dx.doi.org/10.1080/01621459.1963.10500854
  2. Hubert, M.H. and Wijekoon, P. (2006) Improvement of the Liu Estimator in Linear Regression Model. Statistical Papers, 47, 471-479. http://dx.doi.org/10.1007/s00362-006-0300-4
  3. Li, Y. and Yang, H. (2010) A New Stochastic Mixed Ridge Estimator in Linear Regression Model. Statistical Papers, 51, 315-323. http://dx.doi.org/10.1007/s00362-008-0169-5
  4. Wu, J. (2014) On the Stochastic Restricted r-k Class Estimator and Stochastic Restricted r-d Class Estimator in Linear Regression Model. Journal of Applied Mathematics, 2014, Article ID: 173836.
  5. Schaffrin, B. and Toutenburg, H. (1990) Weighted Mixed Regression. Zeitschrift für Angewandte Mathematik und Mechanik, 70, T735-T738.
  6. Li, Y. and Yang, H. (2011) A New Ridge-Type Estimator in Stochastic Restricted Linear Regression. Statistics, 45, 123-130. http://dx.doi.org/10.1080/02331880903573153
  7. Liu, C.L., Jiang, H.N., Shi, X.H. and Liu, D.L. (2014) Two Kinds of Weighted Biased Estimators in Stochastic Restricted Regression Model. Journal of Applied Mathematics, 2014, Article ID: 314875.
  8. He, D.J. and Wu, Y. (2014) A Stochastic Restricted Principal Components Regression Estimator in the Linear Model. The Scientific World Journal, 2014, Article ID: 231506.
  9. Liu, C.L., Yang, H. and Wu, J.B. (2014) On the Weighted Mixed Almost Unbiased Ridge Estimator in Stochastic Restricted Linear Regression. Journal of Applied Mathematics, 2014, Article ID: 902715.
  10. Wu, J.B. and Liu, C.L. (2014) Performance of Some Stochastic Restricted Ridge Estimator in Linear Regression Model. Journal of Applied Mathematics, 2014, Article ID: 508793.
  11. Cook, R.D. and Weisberg, S. (1982) Residuals and Influence in Regression. Chapman and Hall, New York.
  12. Wei, B., Lu, G. and Shi, J. (1990) Statistical Diagnostics. Publishing House of Southeast University, Nanjing.
  13. Wang, J. (2007) Statistical Diagnosis of Linear Regression Model with the Random Constraints and BAYES Method. Nanjing University of Science and Technology, Nanjing.
  14. Yang, L. and Yang, H. (2007) Influence of Linear Model under Ellipsoidal Restriction. Chinese Journal of Engineering Mathematics, 24, 60-64.
  15. Wang, L. (2008) Statistical Diagnosis of Linear Model under Linear Restriction. Nanjing University of Science and Technology, Nanjing.
  16. Durbin, J. (1953) A Note on Regression When There Is Extraneous Information about One of the Coefficients. Journal of the American Statistical Association, 48, 799-808. http://dx.doi.org/10.1080/01621459.1953.10501201
  17. Theil, H. and Goldberger, A.S. (1961) On Pure and Mixed Statistical Estimation in Economics. International Economic Review, 2, 65-78. http://dx.doi.org/10.2307/2525589
  18. Özkale, M.R. (2009) A Stochastic Restricted Ridge Regression Estimator. Journal of Multivariate Analysis, 100, 1706-1716.
  19. Hoerl, A.E. and Kennard, R.W. (1970) Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12, 55-67. http://dx.doi.org/10.1080/00401706.1970.10488634
  20. Hoerl, A.E. and Kennard, R.W. (1970) Ridge Regression: Applications to Nonorthogonal Problems. Technometrics, 12, 69-82. http://dx.doi.org/10.1080/00401706.1970.10488635
  21. Hoerl, A.E., Kennard, R.W. and Baldwin, K.F. (1975) Ridge Regression: Some Simulations. Communications in Statistics: Theory and Methods, 4, 105-123.
  22. Schaefer, R.L., Roi, L.D. and Wolfe, R.A. (1984) A Ridge Logistic Estimator. Communications in Statistics: Theory and Methods, 13, 99-113. http://dx.doi.org/10.1080/03610928408828664
  23. Kibria, B.M.G., Mansson, K. and Shukur, G. (2011) Performance of Some Logistic Ridge Regression Estimators. Computational Economics, 40, 401-414. http://dx.doi.org/10.1007/s10614-011-9275-x