Performance of Existing Biased Estimators and the Respective Predictors in a Misspecified Linear Regression Model

doi:10.4236/ojs.2017.75062

Open Journal of Statistics
Vol.07 No.05(2017), Article ID:80097,25 pages
10.4236/ojs.2017.75062

Manickavasagar Kayanan^1,2, Pushpakanthie Wijekoon³

●How to Cite this Article

¹Deparment of Physical Science, Vavuniya Campus of the University of Jaffna, Vavuniya, Sri Lanka

²Postgraduate Institute of Science, University of Peradeniya, Peradeniya, Sri Lanka

³Department of Statistics and Computer Science, University of Peradeniya, Peradeniya, Sri Lanka

This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).

http://creativecommons.org/licenses/by/4.0/

Received: September 19, 2017; Accepted: October 28, 2017; Published: October 31, 2017

ABSTRACT

In this paper, the performance of existing biased estimators (Ridge Estimator (RE), Almost Unbiased Ridge Estimator (AURE), Liu Estimator (LE), Almost Unbiased Liu Estimator (AULE), Principal Component Regression Estimator (PCRE), r-k class estimator and r-d class estimator) and the respective predictors were considered in a misspecified linear regression model when there exists multicollinearity among explanatory variables. A generalized form was used to compare these estimators and predictors in the mean square error sense. Further, theoretical findings were established using mean square error matrix and scalar mean square error. Finally, a numerical example and a Monte Carlo simulation study were done to illustrate the theoretical findings. The simulation study revealed that LE and RE outperform the other estimators when weak multicollinearity exists, and RE, r-k class and r-d class estimators outperform the other estimators when moderated and high multicollinearity exist for certain values of shrinkage parameters, respectively. The predictors based on the LE and RE are always superior to the other predictors for certain values of shrinkage parameters.

Keywords:

Misspecified Regression Model, Generalized Biased Estimator, Generalized Predictor, Mean Square Error Matrix, Scalar Mean Square Error

1. Introduction

It is well known that the misspecification of the linear model is unavoidable in practical situations. Misspecification may occur due to including some irrelevant explanatory variables or excluding some relevant explanatory variables. Excluding some relevant explanatory variables from a regression model causes these variables to become part of the error term. In this case the mean of error term of the model is not zero. Furthermore, the excluded variables may be correlated with the variables in the model. According to the assumptions of linear regression model, the error term of the model should be independently and identically normally distributed with mean zero and variance $σ^{2}$ . Therefore, one or more assumptions of the linear regression model will be violated when the model is misspecified, and hence the estimators become biased and inconsistent.

Further, it is well known that the ordinary least square estimator (OLSE) does not hold its desirable properties if multicollinearity exists among the explanatory variables in the regression model. To overcome this problem, biased estimators based on the sample model $y = X β + ε$ , or by combining sample model with the exact or stochastic restrictions have been used in the literature. The motivation of this article is to examine the performance of the existing biased estimators in the misspecified linear regression model when multicollinearity exists.

Sarkar [1] examined the consequences of omission of some relevant explanatory variables from a linear regression model when multicollinearity exists among the explanatory variables. Recently, Şiray [2] and Wu [3] discussed the efficiency of r-d class estimator and r-k class estimator over some existing estimators, respectively. Teräsvirta [4] was discussed the case of biased estimation with stochastic linear restrictions in the misspecified regression model due to including an irrelevant variable with incorrectly specified prior information. Later, the efficiency of Mixed Regression Estimator (MRE) under misspecified regression model due to excluding relevant variable with correctly specified prior information was discussed by Mittelhmmer [5] , Ohtani and Honda [6] , Kadiyala [7] and Trenkler and Wijekoon [8] . Further, the superiority of MRE over the Ordinary Least Squares Estimator (OLSE) under the misspecified regression model with incorrectly specified sample and prior information was discussed by Wijekoon and Trenkler [9] . Hubert and Wijekoon [10] have considered the improvement of Liu estimator under a misspecified regression model with stochastic restrictions.

In this paper, the performance of existing biased estimators of the linear regression model based on the sample information such as Principal Component Regression Estimator (PCRE) introduced by Massy [11] , Ridge Estimator (RE) defined by Hoerl and Kennard [12] , r-k class estimator proposed by Baye and Parker [13] , Almost Unbiased Ridge Estimator (AURE) proposed by Singh et al. [14] , Liu Estimator (LE) proposed by Liu [15] , Almost Unbiased Liu Estimator (AULE) proposed by Akdeniz and Kaçiranla r [16] , r-d class estimator proposed by Kaçıranlar and Sakallıoğlu [17] were examined under misspecified regression model without combining any prior information to the sample model. A generalized form to represent all the above estimators was used for comparing these estimators and their respective predictors easily.

The rest of this article is organized as follows. The model specification and respective OLSE are written in section 2. In section 3, generalized form to represent the estimators under the misspecified regression model is proposed. In section 4, the Mean Square Error Matrix (MSEM) and Scalar Mean Square Error (SMSE) comparison between two generalized estimators and their respective predictors are considered. In section 5, the numerical example and Monte Carlo simulation are given to illustrate the theoretical results in SMSE criterion. Finally, some concluding remarks are stated in section 6. The references and Appendix are given at the end of the paper.

2. Model Specification

Assume that the true regression model is given by

$y = X_{1} β_{1} + X_{2} β_{2} + ε = X_{1} β_{1} + δ + ε$ (2.1)

where $y$ is the $n \times 1$ vector of observations on the dependent variable, $X_{1}$ and $X_{2}$ are the $n \times l$ and $n \times p$ matrices of observations on the $m = l + p$ regressors, $β_{1}$ and $β_{2}$ are the $l \times 1$ and $p \times 1$ vectors of unknown coefficients, $ε$ is the $n \times 1$ vector of disturbances with mean vector zero $(E (ε) = 0)$ and dispersion matrix $σ^{2} I$ $(E (ε ε^{'}) = Ω = σ^{2} I)$ .

Let us say that the researcher misspecifies the regression model by excluding $p$ regressors as

$y = X_{1} β_{1} + u$ (2.2)

According to Singh et al. [14] , by applying the spectral decomposition of the symmetric matrix ${X^{'}}_{1} X_{1}$ (since ${X^{'}}_{1} X_{1}$ is a positive definite matrix) we have $T^{'} {X^{'}}_{1} X_{1} T = Λ = diag (λ_{1}, \dots, λ_{l})$ , where $T = (t_{1}, t_{2}, \dots, t_{l})$ is the orthogonal matrix and $λ_{i} > 0$ being the i^th eigenvalue of ${X^{'}}_{1} X_{1}$ . Let $T_{r} = (t_{1}, t_{2}, \dots, t_{r})$ be the remaining column of $T$ having deleted $l - r$ columns where $r \leq l$ . Hence, ${T^{'}}_{r} {X^{'}}_{1} X_{1} T_{r} = Λ_{r} = diag (λ_{1}, \dots, λ_{r})$ .

Let $Z = X_{1} T$ and $γ = T^{'} β_{1}$ then models (2.1) and (2.2) can be written in canonical form as

$y = Z γ + δ + ε$ (2.3)

$y = Z γ + u$ (2.4)

The OLS estimator of model (2.4) is given by

$\hat{γ} = {(Z^{'} Z)}^{- 1} Z^{'} y = Λ^{- 1} Z^{'} y$ (2.5)

Using (2.3), $\hat{γ}$ can be written as

$\hat{γ} = Λ^{- 1} Z^{'} (Z γ + δ + ε) = γ + Λ^{- 1} Z^{'} (δ + ε)$ (2.6)

Hence, the expectation vector and the dispersion matrix of $\hat{γ}$ are given by

$E (\hat{γ}) = γ + Λ^{- 1} Z^{'} δ$ (2.7)

and

$D (\hat{γ}) = σ^{2} {(Z^{'} Z)}^{- 1} = σ^{2} Λ^{- 1}$ (2.8)

respectively.

3. Modified Biased Estimators, Predictors and Its Generalized Form

To combat multicollinearity several researchers introduce different types of biased estimators in place of OLSE, and seven such estimators are RE, AURE, LE, ALUE, PCRE, r-k class estimator and r-d class estimator given bellow respectively:

${\hat{β}}_{k} = {(X^{'} X + k I)}^{- 1} X^{'} X \hat{β}$ (3.1)

${\hat{β}}_{A U R E} = (I - k^{2} {(X^{'} X + k I)}^{- 2}) \hat{β}$ (3.2)

${\hat{β}}_{d} = {(X^{'} X + I)}^{- 1} (X^{'} X + d I) \hat{β}$ (3.3)

${\hat{β}}_{A U L E} = (I - {(1 - d)}^{2} {(X^{'} X + I)}^{- 2}) \hat{β}$ (3.4)

${\hat{β}}_{P C R} = T_{r} {({T^{'}}_{r} X^{'} X T_{r})}^{- 1} {T^{'}}_{r} X^{'} y$ (3.5)

${\hat{β}}_{r k} = T_{r} {({T^{'}}_{r} X^{'} X T_{r} + k I_{r})}^{- 1} {T^{'}}_{r} X^{'} y$ (3.6)

${\hat{β}}_{r d} = T_{r} {({T^{'}}_{r} X^{'} X T_{r} + I_{r})}^{- 1} (I_{r} + d {({T^{'}}_{r} X^{'} X T_{r})}^{- 1}) {T^{'}}_{r} X^{'} y$ (3.7)

where $X = [X_{1} X_{2}]$ , $k > 0$ , $0 < d < 1$ and $\hat{β}$ is the OLS estimator of the model (2.1).

Further, Xu and Yang [18] showed that Equations (3.5)-(3.7) could be rewritten as

${\hat{β}}_{P C R} = T_{r} {T^{'}}_{r} \hat{β}$ (3.8)

${\hat{β}}_{r k} = T_{r} {T^{'}}_{r} {(X^{'} X + k I)}^{- 1} X^{'} X \hat{β}$ (3.9)

${\hat{β}}_{r d} = T_{r} {T^{'}}_{r} {(X^{'} X + I)}^{- 1} (X^{'} X + d I) \hat{β}$ (3.10)

In the case of misspecification, the RE, AURE, LE, AULE, PCRE, r-k class estimator and r-d class estimator for the model (2.4) can be written as

${\hat{γ}}_{k} = {(Z^{'} Z + k I)}^{- 1} Z^{'} Z \hat{γ} = {(Λ + k I)}^{- 1} Λ \hat{γ} = A_{k} \hat{γ}$ (3.11)

${\hat{γ}}_{A U R E} = (I - k^{2} {(Z^{'} Z + k I)}^{- 2}) \hat{γ} = (I - k^{2} {(Λ + k I)}^{- 2}) \hat{γ} = A_{k}^{*} \hat{γ}$ (3.12)

${\hat{γ}}_{d} = {(Z^{'} Z + I)}^{- 1} (Z^{'} Z + d I) \hat{γ} = {(Λ + I)}^{- 1} (Λ + d I) \hat{γ} = F_{d} \hat{γ}$ (3.13)

${\hat{γ}}_{A U L E} = (I - {(1 - d)}^{2} {(Z Z^{'} + I)}^{- 2}) \hat{γ} = (I - {(1 - d)}^{2} {(Λ + I)}^{- 2}) \hat{γ} = F_{d}^{*} \hat{γ}$ (3.14)

${\hat{γ}}_{P C R} = T_{r} {T^{'}}_{r} \hat{γ} = W_{r} \hat{γ}$ (3.15)

${\hat{γ}}_{r k} = T_{r} {T^{'}}_{r} {(Λ + k I)}^{- 1} Λ \hat{γ} = W_{r k} \hat{γ}$ (3.16)

${\hat{γ}}_{r d} = T_{r} {T^{'}}_{r} {(Λ + I)}^{- 1} (Λ + d I) \hat{γ} = W_{r d} \hat{γ}$ (3.17)

respectively.

where $A_{k} = {(Λ + k I)}^{- 1} Λ$ , $A_{k}^{*} = (I - k^{2} {(Λ + k I)}^{- 2})$ , $F_{d} = {(Λ + I)}^{- 1} (Λ + d I)$ , $F_{d}^{*} = (I - {(1 - d)}^{2} {(Λ + I)}^{- 2})$ , $W_{r} = T_{r} {T^{'}}_{r}$ , $W_{r k} = T_{r} {T^{'}}_{r} {(Λ + k I)}^{- 1} Λ$ and $W_{r d} = T_{r} {T^{'}}_{r} {(Λ + I)}^{- 1} (Λ + d I)$ .

It is clear that $A_{k}$ and $F_{d}$ are positive definite, and $W_{r}$ , $W_{r k}$ and $W_{r d}$ are non-negative definite.

Now consider $\begin{matrix} A_{k}^{*} = (I - k^{2} {(Λ + k I)}^{- 2}) = {(Λ + k I)}^{- 2} ({(Λ + k I)}^{2} - k^{2}) \\ = {(Λ + k I)}^{- 2} Λ (Λ + 2 k I) > 0 \end{matrix}$

and $\begin{matrix} F_{d}^{*} = (I - {(1 - d)}^{2} {(Λ + I)}^{- 2}) = {(Λ + I)}^{- 2} ({(Λ + I)}^{2} - {(1 - d)}^{2} I) \\ = {(Λ + I)}^{- 2} (Λ + d I) (Λ + (2 - d) I) > 0; 0 < d < 1 \end{matrix}$

Hence, $A_{k}^{*}$ and $F_{d}^{*}$ are also positive definite.

Since RE, AURE, LE, AULE, PCRE, r-k class estimator and r-d class estimator are based on $\hat{γ}$ so we can use the following generalized form:

${\hat{γ}}_{(j)} = G_{(j)} \hat{γ}$ (3.18)

where $G_{(j)}$ is positive definite matrix if it stands for $A_{k}$ , $A_{k}^{*}$ , $F_{d}$ and $F_{d}^{*}$ , and it is non-negative definite matrix if it stands for $W_{r}, W_{r k}$ and $W_{r d}$ .

The expectation vector, bias vector, dispersion matrix and the mean square error matrix can be calculated with

$E ({\hat{γ}}_{(j)}) = G_{(j)} E (\hat{γ}) = G_{(j)} (γ + Λ^{- 1} Z^{'} δ)$

$\begin{matrix} B i a s ({\hat{γ}}_{(j)}) = E ({\hat{γ}}_{(j)} - γ) = G_{(j)} (γ + Λ^{- 1} Z^{'} δ) - γ \\ = (G_{(j)} - I) γ + G_{(j)} Λ^{- 1} Z^{'} δ \end{matrix}$ (3.19)

$D ({\hat{γ}}_{(j)}) = G_{(j)} D (\hat{γ}) {G^{'}}_{(j)} = σ^{2} G_{(j)} Λ^{- 1} {G^{'}}_{(j)}$ (3.20)

$\begin{array}{l} M S E M ({\hat{γ}}_{(j)}) = E ({\hat{γ}}_{(j)} - γ) {({\hat{γ}}_{(j)} - γ)}^{'} = D ({\hat{γ}}_{(j)}) + B i a s ({\hat{γ}}_{(j)}) B i a s {({\hat{γ}}_{(j)})}^{'} \\ = σ^{2} G_{(j)} Λ^{- 1} {G^{'}}_{(j)} + ((G_{(j)} - I) γ + G_{(j)} Λ^{- 1} Z^{'} δ) {((G_{(j)} - I) γ + G_{(j)} Λ^{- 1} Z^{'} δ)}^{'} \end{array}$ (3.21) (3)

Based on 3.19 to 3.21, the respective expectation vector, bias vector and dispersion matrix of the RE, AURE, LE, AULE, PCR, r-k class estimator and r-d class estimator can easily be obtained and given in Table A1 in the Appendix.

By using the approach of Kadiyala [7] , and Equations ((2.3) and (2.4)), the generalized prediction function can be defined as follows:

$y_{0} = Z γ + δ$ (3.22)

${\hat{y}}_{(j)} = Z {\hat{γ}}_{(j)}$ (3.23)

where $y_{0}$ is the actual value and ${\hat{y}}_{(j)}$ is the corresponding predictor value.

The MSEM of the generalized predictor is given by

$\begin{array}{l} M S E M ({\hat{y}}_{(j)}) = E ({\hat{y}}_{(j)} - y_{0}) {({\hat{y}}_{(j)} - y_{0})}^{'} \\ = Z (M S E M ({\hat{γ}}_{(j)})) Z^{'} - Z (B i a s ({\hat{γ}}_{(j)})) δ^{'} - δ {(B i a s ({\hat{γ}}_{(j)}))}^{'} Z^{'} + δ δ^{'} \end{array}$ (3.24)

Note that the predictors based on the OLSE, RE, AURE, LE, AULE, PCRE, r-k class estimator and r-d class estimator are denoted by $\hat{y}, {\hat{y}}_{k}, {\hat{y}}_{A U R E}, {\hat{y}}_{d}, {\hat{y}}_{A U L E}, {\hat{y}}_{P C R}, {\hat{y}}_{r k}$ and ${\hat{y}}_{r d}$ respectively.

4. Mean Square Error Comparisons

4.1. Mean Square Error Matrix (MSEM) Comparison of Generalized Estimators

If two generalized biased estimators ${\hat{γ}}_{(i)}$ and ${\hat{γ}}_{(j)}$ are given, the estimator ${\hat{γ}}_{(j)}$ is said to be superior to ${\hat{γ}}_{(i)}$ with respect to MSEM sense if and only if $M S E M ({\hat{γ}}_{(i)}) - M S E M ({\hat{γ}}_{(j)}) \geq 0$ .

Let us consider

$\begin{array}{l} M S E M ({\hat{γ}}_{(i)}) - M S E M ({\hat{γ}}_{(j)}) \\ = ((G_{(i)} - I) γ + G_{(i)} Λ^{- 1} Z^{'} δ) {((G_{(i)} - I) γ + G_{(i)} Λ^{- 1} Z^{'} δ)}^{'} \\ - ((G_{(j)} - I) γ + G_{(j)} Λ^{- 1} Z^{'} δ) {((G_{(j)} - I) γ + G_{(j)} Λ^{- 1} Z^{'} δ)}^{'} \\ + σ^{2} G_{(i)} Λ^{- 1} {G^{'}}_{(i)} - σ^{2} G_{(j)} Λ^{- 1} {G^{'}}_{(j)} \end{array}$

Now let $D_{(i, j)} = σ^{2} (G_{(i)} Λ^{- 1} {G^{'}}_{(i)} - G_{(j)} Λ^{- 1} {G^{'}}_{(j)})$ , $b_{(i)} = ((G_{(i)} - I) γ + G_{(i)} Λ^{- 1} Z^{'} δ)$ , $b_{(j)} = ((G_{(j)} - I) γ + G_{(j)} Λ^{- 1} Z^{'} δ)$ , then the above difference can be written as

$Δ_{(i, j)} = M S E M ({\hat{γ}}_{(i)}) - M S E M ({\hat{γ}}_{(j)}) = D_{(i, j)} + b_{(i)} {b^{'}}_{(i)} - b_{(j)} {b^{'}}_{(j)}$ (4.1)

The following theorem can be stated for superiority of ${\hat{γ}}_{(j)}$ over ${\hat{γ}}_{(i)}$ with respect to the MSEM criterion.

Theorem 1: If $G_{(i)}$ is positive definite, ${\hat{γ}}_{(j)}$ is superior to ${\hat{γ}}_{(i)}$ in MSEM sense when the regression model is misspecified due to excluding relevant variables if and only if $λ_{*} < 1$ and ${b^{'}}_{(j)} {(D_{(i, j)} + b_{(i)} {b^{'}}_{(i)})}^{- 1} b_{(j)} \leq 1$ , where $λ_{*}$ is the largest eigenvalue of $(G_{(j)} Λ^{- 1} {G^{'}}_{(j)}) {(G_{(i)} Λ^{- 1} {G^{'}}_{(i)})}^{- 1}$ , $D_{(i, j)} = σ^{2} (G_{(i)} Λ^{- 1} {G^{'}}_{(i)} - G_{(j)} Λ^{- 1} {G^{'}}_{(j)})$ , $b_{(i)} = ((G_{(i)} - I) γ + G_{(i)} Λ^{- 1} Z^{'} δ)$ and $b_{(j)} = ((G_{(j)} - I) γ + G_{(j)} Λ^{- 1} Z^{'} δ)$ .

Proof: Assume that $G_{(i)}$ is positive definite, which implies $G_{(i)} Λ^{- 1} {G^{'}}_{(i)} > 0$ .

Due to Lemma 3 (see Appendix), $G_{(i)} Λ^{- 1} {G^{'}}_{(i)} > G_{(j)} Λ^{- 1} {G^{'}}_{(j)}$ if $λ_{*} < 1$ , where $λ_{*}$ is the largest eigenvalue of $(G_{(j)} Λ^{- 1} {G^{'}}_{(j)}) {(G_{(i)} Λ^{- 1} {G^{'}}_{(i)})}^{- 1}$ .

Hence, according to Lemma 2 (see Appendix), $Δ_{(i, j)}$ is non-negative if and only if ${b^{'}}_{(j)} {(D_{(i, j)} + b_{(i)} {b^{'}}_{(i)})}^{- 1} b_{(j)} \leq 1$ , which completes the proof.

4.2. Scalar Mean Square Error (SMSE) Comparison of Generalized Estimators

If two generalized biased estimators ${\hat{γ}}_{(i)}$ and ${\hat{γ}}_{(j)}$ are given, the estimator ${\hat{γ}}_{(j)}$ is said to be superior to ${\hat{γ}}_{(i)}$ with respect to SMSE sense if and only if $S M S E ({\hat{γ}}_{(i)}) - S M S E ({\hat{γ}}_{(j)}) > 0$ .

The following theorem can be stated for superiority of ${\hat{γ}}_{(j)}$ over ${\hat{γ}}_{(i)}$ with respect to the SMSE criterion.

Theorem 2: ${\hat{γ}}_{(j)}$ is superior to ${\hat{γ}}_{(i)}$ when the regression model is misspecified due to excluding relevant variable with respect to SMSE sense if and only if

$\frac{t r (b_{(j)} {b^{'}}_{(j)} - b_{(i)} {b^{'}}_{(i)})}{t r (D_{(i, j)})} < 1$

where $D_{(i, j)} = σ^{2} (G_{(i)} Λ^{- 1} {G^{'}}_{(i)} - G_{(j)} Λ^{- 1} {G^{'}}_{(j)})$ , $b_{(i)} = ((G_{(i)} - I) γ + G_{(i)} Λ^{- 1} Z^{'} δ)$ and

$b_{(j)} = ((G_{(j)} - I) γ + G_{(j)} Λ^{- 1} Z^{'} δ)$ .

Proof: Let us consider

$S M S E ({\hat{γ}}_{(i)}) - S M S E ({\hat{γ}}_{(j)}) = t r (M S E M ({\hat{γ}}_{(i)}) - M S E M ({\hat{γ}}_{(j)}))$

Using (4.1) we can write

$S M S E ({\hat{γ}}_{(i)}) - S M S E ({\hat{γ}}_{(j)}) = t r (Δ_{(i, j)}) = t r (D_{(i, j)} + b_{(i)} {b^{'}}_{(i)} - b_{(j)} {b^{'}}_{(j))}$

Then ${\hat{γ}}_{(j)}$ is superior to ${\hat{γ}}_{(i)}$ if $S M S E ({\hat{γ}}_{(i)}) - S M S E ({\hat{γ}}_{(j)}) > 0$ .

$S M S E ({\hat{γ}}_{(i)}) - S M S E ({\hat{γ}}_{(j)}) > 0$ if and only if

$t r (D_{(i, j)} + b_{(i)} {b^{'}}_{(i)} - b_{(j)} {b^{'}}_{(j)}) > 0$

$t r (D_{(i, j)}) > t r (b_{(j)} {b^{'}}_{(j)} - b_{(i)} {b^{'}}_{(i))}$

$1 > \frac{t r (b_{(j)} {b^{'}}_{(j)} - b_{(i)} {b^{'}}_{(i)})}{t r (D_{(i, j)})}$

which completes the proof.

4.3. Mean Square Error Matrix (MSEM) Comparison of Generalized Predictors

If two generalized predictors ${\hat{y}}_{(i)}$ and ${\hat{y}}_{(j)}$ are given, the estimator ${\hat{y}}_{(j)}$ is said to be superior to ${\hat{y}}_{(i)}$ with respect to MSEM sense if and only if $M S E M ({\hat{y}}_{(i)}) - M S E M ({\hat{y}}_{(j)}) \geq 0$ .

Let us consider

$\begin{array}{l} M S E M ({\hat{y}}_{(i)}) - M S E M ({\hat{y}}_{(j)}) \\ = Z (M S E M ({\hat{γ}}_{(i)}) - M S E M ({\hat{γ}}_{(j)})) Z^{'} - Z (B i a s ({\hat{γ}}_{(i)}) - B i a s ({\hat{γ}}_{(j)})) δ^{'} \\ - δ {(B i a s ({\hat{γ}}_{(i)}) - B i a s ({\hat{γ}}_{(j)}))}^{'} Z^{'} \end{array}$

The following theorem can be stated for superiority of ${\hat{y}}_{(j)}$ over ${\hat{y}}_{(i)}$ with respect to the MSEM criterion.

Theorem 3: ${\hat{y}}_{(j)}$ is superior to ${\hat{y}}_{(j)}$ in MSEM sense when the regression model is misspecified due to excluding relevant variables if and only if $A \geq 0$ , $θ \in ℜ (A)$ and $θ^{'} A^{- 1} θ \leq 1$ , where

$A = Z Δ_{(i, j)} Z^{'} + δ δ^{'} + Z (b_{(i)} - b_{(j)}) {(b_{(i)} - b_{(j)})}^{'} Z^{'}$ , $θ = δ + Z (b_{(i)} - b_{(j)})$ , $ℜ (A)$ stands for column space of $A$ and $A^{- 1}$ is an independent choice of g-inverse of $A$ .

Proof: Using (4.1) MSEM difference of the two generalized predictor can be written as

$M S E M ({\hat{y}}_{(i)}) - M S E M ({\hat{y}}_{(j)}) = Z Δ_{(i, j)} Z^{'} - Z (b_{(i)} - b_{(j)}) δ^{'} - δ {(b_{(i)} - b_{(j)})}^{'} Z^{'}$ (4.2)

After some straight forward calculation, equation (5.1) can be written as

$M S E M ({\hat{y}}_{(i)}) - M S E M ({\hat{y}}_{(j)}) = A - θ θ^{'}$

where $A = Z Δ_{(i, j)} Z^{'} + δ δ^{'} + Z (b_{(i)} - b_{(j)}) {(b_{(i)} - b_{(j)})}^{'} Z^{'}$ , $θ = δ + Z (b_{(i)} - b_{(j)})$ and $Δ_{(i, j)} = D_{(i, j)} + b_{(i)} {b^{'}}_{(i)} - b_{(j)} {b^{'}}_{(j)}$ .

Due to Lemma 1 (see Appendix), $M S E M ({\hat{y}}_{(i)}) - M S E M ({\hat{y}}_{(j)})$ is non-negative definite matrix if and only if $A \geq 0$ , $θ \in ℜ (A)$ and $θ^{'} A^{- 1} θ \leq 1$ , where $A = Z Δ_{(i, j)} Z^{'} + δ δ^{'} + Z (b_{(i)} - b_{(j)}) {(b_{(i)} - b_{(j)})}^{'} Z^{'}$ , $θ = δ + Z (b_{(i)} - b_{(j)})$ , $ℜ (A)$ stands for column space of $A$ and $A^{- 1}$ is an independent choice of g-inverse of $A$ , which completes the proof.

Note that, obviously the conditions derived under Theorem 1 are sufficient for $A \geq 0$ . Consequently we may say that there are situations where ${\hat{y}}_{(j)}$ is superior to ${\hat{y}}_{(i)}$ in MMSE sense.

4.4. Scalar Mean Square Error (SMSE) Comparison of Generalized Predictors

Using (4.2) SMSE difference of the two generalized predictor can be written as

$\begin{array}{l} S M S E ({\hat{y}}_{(i)}) - S M S E ({\hat{y}}_{(j)}) = t r (M S E M ({\hat{y}}_{(i)}) - M S E M ({\hat{y}}_{(j)})) \\ = t r (Z Δ_{(i, j)} Z^{'} - Z (b_{(i)} - b_{(j)}) δ^{'} - δ {(b_{(i)} - b_{(j)})}^{'} Z^{'}) \end{array}$

The following theorem can be stated for superiority of ${\hat{y}}_{(j)}$ over ${\hat{y}}_{(i)}$ with respect to the SMSE criterion.

Theorem 4: ${\hat{y}}_{(j)}$ is superior to ${\hat{y}}_{(i)}$ in SMSE sense when the regression model is misspecified due to excluding relevant variables if and only if

$\frac{t r (Z (b_{(i)} - b_{(j)}) δ^{'} + δ {(b_{(i)} - b_{(j)})}^{'} Z^{'})}{t r (Z Δ_{(i, j)} Z^{'})} < 1.$

Proof: ${\hat{y}}_{(j)}$ is superior to ${\hat{y}}_{(i)}$ if $S M S E ({\hat{y}}_{(i)}) - S M S E ({\hat{y}}_{(j)}) > 0$ .

$S M S E ({\hat{y}}_{(i)}) - S M S E ({\hat{y}}_{(j)}) > 0$ if and only if

$t r (Z Δ_{(i, j)} Z^{'} - Z (b_{(i)} - b_{(j)}) δ^{'} - δ {(b_{(i)} - b_{(j)})}^{'} Z^{'}) > 0$

$t r (Z Δ_{(i, j)} Z^{'}) > t r (Z (b_{(i)} - b_{(j)}) δ^{'} + δ {(b_{(i)} - b_{(j)})}^{'} Z^{'})$

$1 > \frac{t r (Z (b_{(i)} - b_{(j)}) δ^{'} + δ {(b_{(i)} - b_{(j)})}^{'} Z^{'})}{t r (Z Δ_{(i, j)} Z^{'})}$

which completes the proof.

Based on Theorem 1, Theorem 2, Theorem 3 and Theorem 4 we can obtain the corresponding results for each of the biased estimators and respective predictors by plugging the values for $G_{(i)}$ , $G_{(j)}$ , $D_{(i, j)}$ , $b_{(i)}$ and $b_{(j)}$ . The results are summarized in Tables A2-A6 in the Appendix.

5. Illustration of Theoretical Results

5.1. Numerical Example

To illustrate our theoretical results, we consider a dataset which gives total National Research and Development Expenditures―as a Percent of Gross National Product by Country: 1972-1986. It represents the relationship between the dependent variable $Y$ the percentage spent by the United States and the four other independent variables $X_{1}$ , $X_{2}$ , $X_{3}$ and $X_{4}$ . The variable $X_{1}$ represents the percent spent by former Soviet Union, $X_{2}$ that spent by France, $X_{3}$ that spent by West Germany, and $X_{4}$ that spent by the Japan. The data was discussed in Gruber [19] , and the data has been analysed by Akdeniz and Erol [20] , Li and Yang [21] and among others. Now we assemble the data as follows:

$X = (\begin{matrix} 1 & 1.9 & 2.2 & 1.9 & 3.7 \\ 1 & 1.8 & 2.2 & 2.0 & 3.8 \\ 1 & 1.8 & 2.4 & 2.1 & 3.6 \\ 1 & 1.8 & 2.4 & 2.2 & 3.8 \\ 1 & 2.0 & 2.5 & 2.3 & 3.8 \\ 1 & 2.0 & 2.6 & 2.4 & 3.7 \\ 1 & 2.1 & 2.6 & 2.6 & 3.8 \\ 1 & 2.2 & 2.6 & 2.6 & 4.0 \\ 1 & 2.3 & 2.8 & 2.8 & 3.7 \\ 1 & 2.3 & 2.7 & 2.8 & 3.8 \end{matrix})$ , $y = (\begin{matrix} 2.3 \\ 2.2 \\ 2.2 \\ 2.3 \\ 2.4 \\ 2.5 \\ 2.6 \\ 2.6 \\ 2.7 \\ 2.7 \end{matrix})$

Note that the eigenvalues of the $X^{'} X$ are 312.932, 0.754, 0.045, 0.037, 0.002, the condition number is 299 and Variance Inflation Factor (VIF) values are 6.91, 21.58, 29.75, 1.79. Since condition number is greater than 100 and first three VIF values are greater than 5, which implies the existence of serious multicollinearity in the data set.

After the standardization of the data, the corresponding OLS estimator is

$β = b_{s} = {({X^{'}}_{s} X_{s})}^{- 1} {X^{'}}_{s} y_{s} = {(0.6402, - 0.1179, 0.4733, 0.0139)}^{'}$

For the standardized data (since there are ten observations and four parameters), we obtain

${\hat{σ}}_{s}^{2} = \frac{{(y_{s} - X_{s} b_{s})}^{'} (y_{s} - X_{s} b_{s})}{10 - 4} = 0.003932$

Table 1 shows the estimated SMSE values of OLSE, RE, AURE, LE, AULE, PCRE, r-k class estimator and r-d class estimator for the regression model when $(l, p) = (4, 0)$ , $(l, p) = (3, 1)$ , and $(l, p) = (2, 2)$ with respect to shrinkage parameters (k/d), where $l$ denotes the number of variable in the model and p denotes the number of misspecified variables.

Table 2 shows the estimated SMSE values of the predictor of OLSE, RE, AURE, LE, AULE, PCRE, r-k class estimator and r-d class estimator for the regression

Table 1. Estimated SMSE values of the estimators.

From Table 1, it can be observed that the minimum SMSE of the estimators depends on the values of shrinkage parameters and the level of misspecification, which is agreed with our theoretical findings.

Table 2. Estimated SMSE values of the predictors.

According to Table 2, it can be observed that the predictors behave differently than the respective estimators, which is also agreed with our theoretical findings.

model when $(l, p) = (4, 0)$ , $(l, p) = (3, 1)$ , and $(l, p) = (2, 2)$ for some selected shrinkage parameters (k/d). For simplicity we choose shrinkage parameter values k and d in the range (0, 1).

5.2. Monte Carlo Simulation Study

For further clarification, a Monte Carlo simulation study is done under different levels of misspecification using R 3.2.5. Following McDonald and Galarneau [22] , we can get explanatory variables as follows:

$x_{i j} = {(1 - ρ^{2})}^{1 / 2} z_{i j} + ρ z_{i, p + 1}; i = 1, 2, \dots, n; j = 1, 2, \dots, m .$

where $z_{i j}$ is an independent standard normal pseudo random number, and $ρ$ is specified so that the theoretical correlation between any two explanatory variables is given by $ρ^{2}$ . A dependent variable is generated by using the equation

$y_{i} = β_{1} x_{i 1} + β_{2} x_{i 2} + β_{3} x_{i 3} + β_{4} x_{i 4} + β_{5} x_{i 5} + ε_{i}; i = 1, 2, \dots, n .$

where $ε_{i}$ is a normal pseudo random number with mean zero and variance one. In this study, we choose $β = (β_{1}, β_{2}, β_{3}, β_{4}, β_{5})$ as the normalized eigenvector corresponding to the largest eigenvalue of $X^{'} X$ for which $β^{'} β = 1$ . We consider the following set up to investigate the effects of different degrees of multicollinearity on the estimators:

$ρ = 0.9$ , condition number = 6.06 and VIF = (4.84, 4.83, 4.82, 4.81, 4.87)

$ρ = 0.99$ , condition number = 20.12 and VIF = (46.09, 46.12, 46.02, 45.97, 46.56)

$ρ = 0.999$ , condition number = 64 and VIF = (458.3, 459.2, 458.1, 457.8, 463.4)

Three different sets of observations are considered by selecting $(l, p) = (5, 0)$ , $(l, p) = (4, 1)$ and $(l, p) = (3, 2)$ when $n = 50$ , where $l$ denotes the number of variable in the model and p denotes the number of misspecified variables. For simplicity, we select values k and d in the range $(0, 1)$ .

The simulation is repeated 2000 times by generating new pseudo random numbers and the simulated SMSE values of the estimators and predictors are obtained using the following equations:

$S M S E ({\hat{γ}}_{(j) r}) = \frac{1}{2000} \sum_{r = 1}^{2000} t r (M S E M ({\hat{γ}}_{(j) r}))$

and $S M S E ({\hat{y}}_{(j) r}) = \frac{1}{2000} \sum_{r = 1}^{2000} t r (M S E M ({\hat{y}}_{(j) r}))$ respectively.

Tables 3-5 are showing the estimated SMSE values of the estimators for the regression model when $(l, p) = (5, 0)$ , $(l, p) = (4, 1)$ and $(l, p) = (3, 2)$ , and $ρ = 0.9$ , $ρ = 0.99$ and $ρ = 0.999$ for the selected values of shrinkage parameters (k/d), respectively. Tables 6-8 are showing the corresponding estimated SMSE values of the predictors for the regression model, respectively.

From Tables 3-8, we can summarise the results as shown in Table 9.

6. Conclusions

In this study, a common form of superiority conditions were obtained for comparison among the biased estimators (RE, AURE, LE, AULE, PCRE, r-k class estimator and r-d class estimator) and their predictors by using a generalized form for the misspecified linear regression model when explanatory variables are multicollinearity. Furthermore, the theoretical findings were analyzed by using a numerical example and a Monte Carlo simulation study.

The simulation study shows that the LE and RE outperform the other estimators when weak multicollinearity exist, and RE, r-k class and r-d class estimators

Table 3. Estimated SMSE values of the estimators when $n = 50$ and $ρ = 0.9.$

According to Table 3, it can be observed that, ${\hat{γ}}_{d}$ is superior over other estimators when $k \leq 0.4$ and ${\hat{γ}}_{k}$ is superior over other estimators when $k > 0.4$ under diferent levels of misspecifications for the weak multicollinearity.

Table 4. Estimated SMSE values of the estimators when $n = 50$ and $ρ = 0.99.$

According to Table 4, it can be observed that, ${\hat{γ}}_{r d}$ is superior over other estimators when $k \leq 0.4$ and ${\hat{γ}}_{r k}$ is superior over other estimators when $k > 0.4$ under different levels of misspecifications for the moderated multicollinearity.

Table 5. Estimated SMSE values of the estimators when $n = 50$ and $ρ = 0.999.$

According to Table 5, it can be observed that, ${\hat{γ}}_{r d}$ is superior over other estimators when $k \leq 0.2$ under all level of misspecifications, ${\hat{γ}}_{r k}$ is superior over other estimators when $0.2 < k \leq 0.5$ and ${\hat{γ}}_{k}$ is superior over other estimators when $k > 0.5$ under $(l, p) = (5, 0)$ and $(l, p) = (4, 1)$ , and ${\hat{γ}}_{r k}$ is superior over other estimators when $0.2 < k \leq 0.7$ and ${\hat{γ}}_{k}$ is superior over other estimators when $k > 0.7$ under $(l, p) = (3, 2)$ for high multicollinearity.

Table 6. Estimated SMSE values of the predictors when $n = 50$ and $ρ = 0.9.$

According to Table 6, it can be observed that, ${\hat{y}}_{d}$ is superior over other estimators when $k \leq 0.4$ and ${\hat{y}}_{k}$ is superior over other estimators when $k > 0.4$ under different levels of misspecifications.

Table 7. Estimated SMSE values of the predictors when $n = 50$ and $ρ = 0.99.$

According to Table 7, it can be observed that, ${\hat{y}}_{d}$ is superior over other estimators when $k \leq 0.4$ and ${\hat{y}}_{k}$ is superior over other estimators when $k > 0.4$ under different levels of misspecifications.

Table 8. Estimated SMSE values of the predictors when $n = 50$ and $ρ = 0.999.$

According to Table 8, it can be observed that, ${\hat{y}}_{d}$ is superior over other estimators when $k \leq 0.2$ and ${\hat{y}}_{k}$ is superior over other estimators when $k > 0.2$ under different levels of misspecifications.

Table 9. Shrinkage parameter ranges for superior estimators and predictors.

outperform the other estimators when moderated and high multicollinearity exist for selected values of shrinkage parameters, respectively. It can also be noted that, the predictors based on the LE and RE are always superior to the other predictors for selected values of shrinkage parameters when multicollinearity exists among explanatory variables.

One of the limitation of this study is that we assume the error variance is equal for all models even when the variables are omitted from the model.

Cite this paper

Kayanan, M. and Wijekoon, P. (2017) Performance of Existing Biased Estimators and the Respective Predictors in a Misspecified Linear Regression Model. Open Journal of Statistics, 7, 876-900. https://doi.org/10.4236/ojs.2017.75062

References

1. Sarkar, N. (1989) Comparisons among Some Estimators in Misspecified Linear Models with Multicollinearity. Annals of the Institute of Statistical Mathematics, 41, 717-724. https://doi.org/10.1007/BF00057737

2. Siray, G.ü. (2015) r-d Class Estimator under Misspecification. Communications in Statistics—Theory and Methods, 44, 4742-4756. https://doi.org/10.1080/03610926.2013.835421

3. Wu, J. (2016) Superiority of the r-k Class Estimator over Some Estimators in a Misspecified Linear Model. Communication in Statistics—Theory and Methods, 45, 1453-1458. https://doi.org/10.1080/03610926.2013.863934

4. Terasvirta, T. (1980) Linear Restrictions in Misspecified Linear Models and Polynomial Distributed Lag Estimation. Department of Statistics, University of Helsinki, Helsinki.

5. Mittelhammer, R.C. (1981) On Specification Error in the General Linear Model and Weak Mean Square Error Superiority of the Mixed Estimator. Communications in Statistics—Theory and Methods, 167-176. https://doi.org/10.1080/03610928108828027

6. Ohtani, K. and Honda, Y. (1984) On Small Sample Properties of the Mixed Regression Predictor under Misspecification. Communications in Statistics—Theory and Methods, 2817-2825. https://doi.org/10.1080/03610928408828863

7. Kadiyala, K. (1986) Mixed Regression Estimator under Misspecification. Economic Letters, 21, 27-30. https://doi.org/10.1016/0165-1765(86)90115-1

8. Trenkler, G. and Wijekoon, P. (1989) Mean Square Error Matrix Superiority of the Mixed Regression Estimator under Misspecification. Statistica, 49, 65-71.

9. Wijekoon, P. and Trenkler, G. (1989) Mean Square Error Matrix Superiority of Estimators under Linear Restrictions and Misspecification. Economics Letters, 30, 141-149. https://doi.org/10.1016/0165-1765(89)90052-9

10. Hubert, M. and Wijekoon, P. (2004) Superiority of the Stochastic Restricted Liu Estimator under Misspecification. Statistica, 64, 153-162.

11. Massy, F. (1965) Principal Components Regression in Exploratory Statistical Research. Journal of the American Statistical Association, 60, 234-266. https://doi.org/10.1080/01621459.1965.10480787

12. Hoerl, A. and Kennard, R. (1970) Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12, 55-67. https://doi.org/10.1080/00401706.1970.10488634

13. Baye, R. and Parker, F. (1984) Combining Ridge and Principal Component Regression: A Money Demand Illustration. Communications in Statistics—Theory and Methods, 13, 197-205. https://doi.org/10.1080/03610928408828675

14. Singh, B., Chaubey, Y.P. and Dwivedi, T.D. (1986) An Almost Unbiased Ridge Estimator. The Indian Journal of Statistics, 48, 342-346.

15. Liu, K. (1993) A New Class of Biased Estimate in Linear Regression. Communications in Statistics—Theory and Methods, 22, 393-402. https://doi.org/10.1080/03610929308831027

16. Akdeniz, F. and Kaciranlar, S. (1995) On the Almost Unbiased Generalized Liu Estimator and Unbiased Estimation of the Bias and MSE. Communications in Statistics—Theory and Methods, 24, 1789-1797. https://doi.org/10.1080/03610929508831585

17. Kaciranlar, S. and Sakallioglu, S. (2001) Combining the Liu Estimator and the Principal Component. Communications in Statistics—Theory and Methods, 30, 2699-2705. https://doi.org/10.1081/STA-100108454

18. Xu, W. and Yang, H. (2011) On the Restricted r-k Class Estimator and the Restricted r-d Class Estimator in Linear Regression. Journal of Statistical Computation and Simulation, 81, 679-691. https://doi.org/10.1080/00949650903471023

19. Gruber, M. (1998) Improving Efficiency by Shrinkage: The James-Stein and Ridge Regression Estimators. CRC Press, New York.

20. Akdeniz, F. and Erol, H. (2003) Mean Squared Error Matrix Comparisons of Some Biased Estimators in Linear Regression. Communications in Statistics—Theory and Methods, 2, 389-2413. https://doi.org/10.1081/STA-120025385

21. Li, Y. and Yang, H. (2010) A New Stochastic Mixed Ridge Estimator in Linear Regression Model. Statistical Papers, 51, 315-323. https://doi.org/10.1007/s00362-008-0169-5

22. McDonald, G.C. and Galarneau, D.I. (1975) A Monte Carlo Evaluation of Some Ridge-Type Estimators. Journal of the American Statistical Association, 70, 407-416. https://doi.org/10.1080/01621459.1975.10479882

23. Baksalary, J. and Kala, R. (1983) Partial Orderings between Matrices One of Which Is of Rank One. Bulletin of the Polish Academy of Sciences Mathematics, 31, 5-7.

24. Trenkler, G. and Toutenburg, H. (1990) Mean Square Error Matrix Comparisons between Biased Estimators: An Overview of Recent Results. Statistical Papers, 31, 165-179. https://doi.org/10.1007/BF02924687

25. Wang, S., et al. (2006) Matrix Inequalities. 2nd Edition, Chinese Science Press, Beijing.

Appendix

Lemma 1: (Baksalary and Kala [23] )

Let $B \geq 0$ of type $n \times n$ matrix, $b$ is a $n \times 1$ vector and $λ$ is a positive real number. Then the following conditions are equivalent.

i) $λ B - b b^{'} \geq 0$

2) $B \geq 0$ , $b \in ℜ (B)$ and $b^{'} B^{- 1} b \leq 1$ , where $ℜ (B)$ stands for column space of B and $B^{- 1}$ is a independent choice of g-inverse of B.

Lemma 2: (Trenkler and Toutenburg [24] )

Let ${\hat{β}}_{1}$ and ${\hat{β}}_{2}$ be two linear estimator of $β$ . Suppose that $D = D ({\hat{β}}_{1})$ $- D ({\hat{β}}_{2})$ is positive definite, then $Δ = M S E ({\hat{β}}_{1}) - M A S E ({\hat{β}}_{2})$ is non negative if and only if ${b^{'}}_{2} {(D + b_{1} {b^{'}}_{1})}^{- 1} b_{2} \leq 1$ , where $D ({\hat{β}}_{j})$ , $M S E ({\hat{β}}_{j})$ and $b_{j}$ denote dispersion matrix, mean square error matrix and bias vector of ${\hat{β}}_{j}$ respectively, $j = 1, 2$ .

Lemma 3: (Wang et al. [25] )

Let $n \times n$ matrices $M > 0, N \geq 0$ , then $M > N$ if and only if $λ_{*} < 1$ , where $λ_{*}$ is the largest eigenvalue of the matrix $N M^{- 1}$ .

Table A1. Expectation vector, Bias vector and Dispersion matrix of the estimators.

Journal Menu>>