Open Journal of Statistics, 2012, 2, 297-299
http://dx.doi.org/10.4236/ojs.2012.23035 Published Online July 2012 (http://www.SciRP.org/journal/ojs)
Unbalanced Regressions and Spurious Inference
Daniel Ventosa-Santaulària
Centro de Investigación y Docencia Económicas, CIDE, Mexico City, Mexico
Email: daniel@ventosa-santaularia.com
Received April 23, 2012; revised May 24, 2012; accepted June 5, 2012
ABSTRACT
Spurious regression has been extensively studied in time series econometrics since Granger and Newbold’s [1] seminal
paper. Recently, it has been advanced that this phenomenon is due to a mistreatment of short-range autocorrelation in
the residuals of the regression when at least one of the variables in a bivariate regression is stationary. HAC errors, fea-
sible GLS and Cochrane-Orcutt-type procedures are then proposed to draw correct inference. Such a proposal should be
cautiously considered, since nonsense inference might also be due to deterministic trend mechanisms, structural breaks,
and long range dependence. In these cases, standard autocorrelation correction procedures would not solve the problem
of spurious regression. We aim to make the latter argument clear.
Keywords: Spurious Regression; Stationarity; Unbalanced Regression; Unit Root
1. Unbalanced Spurious Regressions
Spurious regression has been extensively studied since
Granger and Newbold's [1] seminal paper, in which independent
nonstationary variables are simulated and then
used to estimate a simple bivariate regression. Phillips [2]
provided the theoretical framework to understand the
phenomenon in the simplest case (independent driftless
unit root processes). Since then, the spurious regression
phenomenon has been identified for many data-generating
processes (DGPs), such as unit roots with drift,
(broken-)trend stationary and long-range dependent processes, for example¹.
Here, we are concerned with the results presented in
Noriega and Ventosa-Santaulària [4] and Stewart [5]
pertaining to the spurious regression phenomenon under
the following conditions: 1) both variables, $y_t$ and $x_t$
(see Equation (1)), are stationary (i.e., integrated of order
0, I(0)), and 2) at least one of the variables (the regressor
or the regressand) is integrated of order 1, I(1). The latter
combinations result in an unbalanced regression, and
Noriega and Ventosa-Santaulària [4] found that, in a simple
regression specification,

$$y_t = \alpha + \beta x_t + u_t, \qquad (1)$$

where either $y_t$, $x_t$ (or both) is I(0)², the t-ratio associated
with $\hat{\beta}$, $t_{\hat{\beta}}$, does not diverge as the sample size
grows; i.e., $t_{\hat{\beta}} = O_p(1)$. Results in Noriega and
Ventosa-Santaulària [4] imply that the asymptotic spurious
regression phenomenon does not occur. Nevertheless, nonsense
inference cannot be fully ruled out. In a recent
paper, Stewart [5] argues that, although the t-ratio does
not diverge, it does not necessarily converge to a standard
normal distribution. Furthermore, in the absence of
autocorrelation in the DGP's innovations, the t-ratio behaves
asymptotically as a standard normal only when both
variables are iid I(0) processes. Other DGP combinations,
such as $y_t \sim I(0)$ and $x_t \sim I(1)$, and vice versa,
have t-ratios with nonstandard asymptotic distributions.
Nevertheless, the size distortions are better explained by the
presence of autocorrelation in the DGP innovations. This
point is illustrated by Stewart [5] through a number of
finite-sample experiments. The problem comes as no
surprise, since the estimated residuals behave as an
autocorrelated process and size distortion should be expected
in that case. Moreover, the use of heteroskedasticity and
autocorrelation consistent (HAC) errors considerably
reduces size distortions in some cases, as Stewart [5] argues.
Table 1 summarizes the relevant DGPs for both the
dependent and the explanatory variables, similar to those
used by Noriega and Ventosa-Santaulària [4] and Stewart
[5], to estimate a simple linear specification.
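The finite-sample behaviour described above can be illustrated with a minimal Monte Carlo sketch. It is not the design used in [4] or [5]: it assumes standard normal iid innovations, zero means, a sample size of 200, 1,000 replications and a nominal 5% two-sided test, and the helper names (`ols_t_ratio`, `simulate`, `rejection_rate`) are purely illustrative.

```python
import math
import random


def ols_t_ratio(y, x):
    """t-ratio of the slope in y_t = alpha + beta * x_t + u_t, estimated by OLS."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)  # residual variance
    return beta / math.sqrt(s2 / sxx)


def simulate(kind, n, rng):
    """'I0': z_t = e_t (zero mean assumed); 'I1': z_t = z_{t-1} + e_t."""
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if kind == "I0":
        return shocks
    path, level = [], 0.0
    for shock in shocks:
        level += shock
        path.append(level)
    return path


def rejection_rate(kind_y, kind_x, n=200, reps=1000, seed=0):
    """Share of replications in which |t| exceeds 1.96."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        y = simulate(kind_y, n, rng)
        x = simulate(kind_x, n, rng)
        if abs(ols_t_ratio(y, x)) > 1.96:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # (regressand, regressor) combinations; the middle two are unbalanced.
    for combo in [("I0", "I0"), ("I0", "I1"), ("I1", "I0"), ("I1", "I1")]:
        print(combo, rejection_rate(*combo))
```

With iid innovations, the two unbalanced combinations should reject at a rate close to the nominal level, whereas the balanced I(1) on I(1) case reproduces the classical spurious regression of Granger and Newbold [1].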
¹For a recent survey see Ventosa-Santaulària [3].
²In Noriega and Ventosa-Santaulària [4], the other variable may behave
as: 1) driftless I(1); 2) I(1) with drift (with a possible drift break);
3) I(2); and 4) (broken-)trend stationary.
Copyright © 2012 SciRes. OJS
D. VENTOSA-SANTAULÀRIA 298

Table 1. The DGPs for $z_t = x_t, y_t$.
Case   Name   Model
1      I(0)   $z_t = \mu_z + e_{zt}$
2      I(1)   $z_t = z_{t-1} + e_{zt}$

For simplicity, we assume that the innovations $e_{zt}$, for $z = x, y$,
are iid white noise. Following Noriega and
Ventosa-Santaulària [4] and using the aforementioned
DGPs, we present the following corollary:
Corollary. Let $y_t$ and $x_t$ be generated by DGPs $i$
and $j$ of Table 1. Denote by $C_{ij}$ the DGP combination
that generated $y$ and $x$, respectively, and use them to
estimate specification (1) by ordinary least squares. The
asymptotic distribution of the t-ratio associated with $\hat{\beta}$,
$t_{\hat{\beta}}$, is:

$$C_{12}:\quad t_{\hat{\beta}} \xrightarrow{d} \frac{\int_0^1 W_x(r)\,dW_y(r) - W_y(1)\int_0^1 W_x(r)\,dr}{\left[\int_0^1 W_x(r)^2\,dr - \left(\int_0^1 W_x(r)\,dr\right)^2\right]^{1/2}}$$

$$C_{21}:\quad t_{\hat{\beta}} \xrightarrow{d} \frac{\int_0^1 W_y(r)\,dW_x(r) - W_x(1)\int_0^1 W_y(r)\,dr}{\left[\int_0^1 W_y(r)^2\,dr - \left(\int_0^1 W_y(r)\,dr\right)^2\right]^{1/2}}$$

where $W_z$, for $z = x, y$, is a standard Brownian motion.
Proof: See Noriega and Ventosa-Santaulària [4]³.
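As a rough check on the corollary, the limit for combination $C_{12}$ can be simulated by replacing the Brownian motions with scaled partial sums of Gaussian shocks. The sketch below assumes that this limit is the ratio $[\int_0^1 W_x\,dW_y - W_y(1)\int_0^1 W_x\,dr] / [\int_0^1 W_x^2\,dr - (\int_0^1 W_x\,dr)^2]^{1/2}$ with independent $W_x$ and $W_y$. The function names, the 200-step grid and the 2,000 draws are illustrative choices and are not the settings behind Figure 1.

```python
import math
import random


def limit_draw(steps, rng):
    """One draw of the assumed C12 limit, using a discretized Brownian motion."""
    dt = 1.0 / steps
    scale = math.sqrt(dt)
    w_x = 0.0           # current value of W_x
    w_y = 0.0           # current value of W_y
    int_wx_dwy = 0.0    # Ito sum approximating the integral of W_x dW_y
    int_wx = 0.0        # approximates the integral of W_x dr
    int_wx2 = 0.0       # approximates the integral of W_x^2 dr
    for _ in range(steps):
        dw_x = scale * rng.gauss(0.0, 1.0)
        dw_y = scale * rng.gauss(0.0, 1.0)
        int_wx_dwy += w_x * dw_y   # integrand evaluated at the left endpoint
        int_wx += w_x * dt
        int_wx2 += w_x * w_x * dt
        w_x += dw_x
        w_y += dw_y
    numerator = int_wx_dwy - w_y * int_wx
    denominator = math.sqrt(int_wx2 - int_wx ** 2)
    return numerator / denominator


def tail_share(reps=2000, steps=200, seed=0, critical=1.96):
    """Share of simulated draws whose absolute value exceeds the critical value."""
    rng = random.Random(seed)
    draws = [limit_draw(steps, rng) for _ in range(reps)]
    return sum(abs(d) > critical for d in draws) / reps


if __name__ == "__main__":
    print("share of |draw| > 1.96:", tail_share())
```

Swapping the roles of x and y gives the corresponding draw for $C_{21}$. If the resemblance reported in insets (a) and (b) of Figure 1 holds, the tail share should sit near the nominal 5%.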
Results in the corollary reveal that the asymptotic distribution
of the t-ratio is nonstandard when the regression
is unbalanced. However, a simple simulation of the
asymptotic distribution shows a striking resemblance between
this distribution and a standard normal (insets (a) and (b)
in Figure 1). This resemblance fades in the presence
of autocorrelation (insets (c) and (d) in Figure 1). We
therefore confirm that the size distortions pointed out by
Stewart [5] are due to autocorrelation; the latter happens
to be an important source of spurious regression when at
least one of the variables is I(0), which confirms the results
of Granger, Hyung and Jeon [6] and Mikosch and de Vries
[7]. Nevertheless, short-range autocorrelation
should not be considered the sole source of spurious
inference. It is well documented that deterministic trends,
structural breaks, and long-range dependence also generate
nonsense inference (see Perron [8] and Tsay and
Chung [9]). It is important to note that the latter cannot
be prevented by using Cochrane-Orcutt or Feasible GLS.
In the corollary, $\xrightarrow{d}$ denotes convergence in distribution.

[Figure 1: four panels, (a) to (d)]
Figure 1. t-ratio asymptotic distribution for unbalanced regressions: insets (a) and (c) I(0) vs I(1); insets (b) and (d) I(1) vs
I(0); insets (a) and (b) iid innovations; insets (c) and (d) AR(1) innovations (AR(1) coefficients 0.4 for $x$ and 0.7 for $y$).
Number of replications: 10,000. The blue area corresponds to the standard normal distribution whilst the red dashed line
depicts the asymptotic distribution of the t-ratio.

³Noriega and Ventosa-Santaulària [4] only provide the order of convergence of the t-ratio. However, by following the instructions in the appendix, the
asymptotic expressions can also be obtained, as we demonstrated in this paper. The Mathematica code is available upon request.

Using standard correction procedures to deal with the
spurious regression phenomenon is tempting, even if such
procedures cannot always provide correct inference (see
Stewart [5] and McCallum [10], for example). Sun [11]
proposed a convergent t-statistic using modified HAC
errors with a bandwidth proportional to the sample size
when the variables are highly persistent. The author
acknowledges, however, that such a procedure cannot be
used in empirical applications, since the limit distribution
of the test depends on the memory parameter under the
null hypothesis and critical values cannot, therefore, be
tabulated. McCallum [10] and Kolev [12] also advocate
classical correction procedures to deal with spurious
regressions, such as the Cochrane-Orcutt procedure and
Feasible Generalized Least Squares. They argue that using
them reduces size distortions of the t-test. However,
Martínez-Rivera and Ventosa-Santaulària [13] proved that
such methods are not always effective and remain highly
dependent on the DGP of the series⁴.
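The role of autocorrelation, and of a classical correction, can be sketched for one specific DGP. The code below is a minimal illustration and is not the experiment of [5], [10] or [13]. It regresses an I(0) series on an I(1) series, both driven by AR(1) innovations, and compares plain OLS with a single Cochrane-Orcutt step. The AR(1) coefficients of 0.4 (for x) and 0.7 (for y) follow our reading of the caption of Figure 1; the sample size, number of replications and function names are assumptions.

```python
import math
import random


def ols(y, x):
    """OLS of y on a constant and x; returns (alpha, beta, t-ratio of beta)."""
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    return alpha, beta, beta / math.sqrt(rss / (n - 2) / sxx)


def ar1(n, rho, rng, burn=50):
    """AR(1) innovations e_t = rho * e_{t-1} + v_t, with v_t iid N(0, 1)."""
    e, out = 0.0, []
    for t in range(n + burn):
        e = rho * e + rng.gauss(0.0, 1.0)
        if t >= burn:  # discard the burn-in period
            out.append(e)
    return out


def cumulate(shocks):
    """Turn innovations into an I(1) series z_t = z_{t-1} + e_t."""
    path, level = [], 0.0
    for shock in shocks:
        level += shock
        path.append(level)
    return path


def cochrane_orcutt_t(y, x):
    """One Cochrane-Orcutt step: estimate rho from the OLS residuals,
    quasi-difference both series and re-run OLS on the transformed data."""
    alpha, beta, _ = ols(y, x)
    u = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    rho = sum(a * b for a, b in zip(u[1:], u[:-1])) / sum(b * b for b in u[:-1])
    y_star = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
    x_star = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    return ols(y_star, x_star)[2]


def rejection_rates(n=200, reps=1000, rho_x=0.4, rho_y=0.7, seed=1):
    """Rejection frequencies of |t| > 1.96 for OLS and for Cochrane-Orcutt."""
    rng = random.Random(seed)
    hits_ols = hits_co = 0
    for _ in range(reps):
        y = ar1(n, rho_y, rng)            # I(0) regressand
        x = cumulate(ar1(n, rho_x, rng))  # I(1) regressor
        hits_ols += int(abs(ols(y, x)[2]) > 1.96)
        hits_co += int(abs(cochrane_orcutt_t(y, x)) > 1.96)
    return hits_ols / reps, hits_co / reps


if __name__ == "__main__":
    print("OLS, Cochrane-Orcutt:", rejection_rates())
```

Under this particular DGP the error term is a genuine AR(1) process, so the quasi-differenced regression should bring the rejection frequency much closer to 5%. This says nothing about other DGPs, which is precisely the caveat raised in [13].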
2. Concluding Remarks
There is finite-sample evidence showing that spurious
inference in unbalanced regressions mostly occurs when
the innovations of the DGPs are not iid. In that sense,
standard autocorrelation-correction procedures, such as
HAC errors, Feasible GLS and Cochrane-Orcutt estimates,
have been advanced to eliminate or reduce the size
distortions and, thus, spurious inference. This approach
should, nevertheless, be reconsidered. First, there is
evidence that spurious regression using stationary series
cannot always be interpreted as a short-range autocorrelation
phenomenon: long-range dependence and structural
breaks (level shifts, for example) also cause spurious
inference; spurious regression cannot, therefore,
always be corrected using classical procedures. Second, an
unbalanced regression (in which the orders of integration
of the series involved are not the same) is an empirical
situation whose relevance remains to be proved. The
estimation of an unbalanced regression is not intuitive,
although there are cases, such as the predictive equation
in the finance literature, in which market returns
(usually found to be stationary) are regressed on the
dividend yield (stationary but highly persistent). Spurious
regression cannot simply be considered a short-memory
autocorrelation phenomenon and cannot, therefore,
be treated using standard procedures. The main conclusion
is, therefore, twofold: 1) practitioners should interpret
their results cautiously whenever they find evidence
of autocorrelation, since the inference could be spurious;
2) they should, however, be aware that spurious regressions
arise for many diverse reasons, autocorrelation being
only one of them. Standard autocorrelation correction
procedures are not to be considered the sole solution
to prevent spurious inference; on the contrary, parameter
stability, long memory and cointegration tests should
also always be considered.
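The point that level shifts can produce spurious inference without any autocorrelation in the innovations can be illustrated as follows. This is a minimal sketch under hypothetical settings: two independent series with iid N(0, 1) noise, each with a single mean shift of two standard deviations, at 30% of the sample for x and 60% for y. None of these values, nor the function names, come from the references.

```python
import math
import random


def ols_t_ratio(y, x):
    """t-ratio of the slope in y_t = alpha + beta * x_t + u_t, estimated by OLS."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    return beta / math.sqrt(rss / (n - 2) / sxx)


def level_shift_series(n, break_fraction, shift, rng):
    """iid N(0, 1) noise plus a one-time mean shift from the break date onwards."""
    break_date = int(break_fraction * n)
    return [rng.gauss(0.0, 1.0) + (shift if t >= break_date else 0.0)
            for t in range(n)]


def rejection_rate(n, reps=1000, shift=2.0, seed=2):
    """Share of replications with |t| > 1.96 when y and x are independent."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        y = level_shift_series(n, 0.6, shift, rng)  # hypothetical break at 60%
        x = level_shift_series(n, 0.3, shift, rng)  # hypothetical break at 30%
        hits += int(abs(ols_t_ratio(y, x)) > 1.96)
    return hits / reps


if __name__ == "__main__":
    for n in (50, 200, 800):
        print(n, rejection_rate(n), rejection_rate(n, shift=0.0))
```

Setting `shift=0.0` removes the breaks and gives a benchmark against which the effect of the level shifts can be read; comparing sample sizes shows whether the distortion fades or grows as the sample lengthens.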
REFERENCES
[1] C. W. J. Granger and P. Newbold, “Spurious Regressions
in Econometrics,” Journal of Econometrics, Vol. 2, No. 2,
1974, pp. 111-120. doi:10.1016/0304-4076(74)90034-7
[2] P. C. B. Phillips, “Understanding Spurious Regressions in
Econometrics,” Journal of Econometrics, Vol. 33, No. 3,
1986, pp. 311-340. doi:10.1016/0304-4076(86)90001-1
[3] D. Ventosa-Santaulària, “Spurious Regression,” Journal
of Probability and Statistics, Vol. 2009, No. 1, 2009, pp.
155-182. doi:10.1155/2009/802975
[4] A. E. Noriega and D. Ventosa-Santaulària, “Spurious Re-
gression and Trending Variables,” Oxford Bulletin of
Economics and Statistics, Vol. 69, No. 3, 2007, pp. 439-
444. doi:10.1111/j.1468-0084.2007.00481.x
[5] C. Stewart, “A Note on Spurious Significance in Regres-
sions Involving I(0) and I(1) Variables,” Empirical Eco-
nomics, Vol. 41, No. 3, 2011, pp. 565-571.
doi:10.1007/s00181-010-0404-5
[6] C. W. J. Granger, N. Hyung and Y. Jeon, “Spurious
Regressions with Stationary Series,” Applied Economics,
Vol. 33, No. 7, 2001, pp. 899-904.
[7] T. Mikosch and C. G. de Vries, “Tail Probabilities for
Regression Estimators,” Tinbergen Institute Discussion
Papers, TI 2006-085/2, 2006.
[8] P. Perron, “The Great Crash, the Oil Price Shock, and the
Unit Root Hypothesis,” Econometrica, Vol. 57, No. 6,
1989, pp. 1361-1401. doi:10.2307/1913712
[9] W. J. Tsay and C. F. Chung, “The Spurious Regression of
Fractionally Integrated Processes,” Journal of Economet-
rics, Vol. 96, No. 1, 2000, pp. 155-182.
doi:10.1016/S0304-4076(99)00056-1
[10] B. T. McCallum, “Is the Spurious Regression Problem
Spurious?” Economics Letters, Vol. 107, No. 3, 2010, pp.
321-323. doi:10.1016/j.econlet.2010.02.004
[11] Y. Sun, “A Convergent T-Statistic in Spurious Regres-
sions,” Econometric Theory, Vol. 20, No. 5, 2004, pp.
943-962. doi:10.1017/S0266466604205072
[12] G. I. Kolev, “The ‘Spurious Regression Problem’ in the
Classical Regression Model Framework,” Economics
Bulletin, Vol. 31, No. 1, 2011, pp. 925-937.
[13] B. Martínez-Rivera and D. Ventosa-Santaulària, “A Comment
on ‘Is the Spurious Regression Problem Spurious?’”
Economics Letters, Vol. 115, No. 2, 2012, pp. 229-231.
[14] R. Davidson and J. G. MacKinnon, “Econometric Theory
and Methods,” Oxford University Press, New York, 2004.

⁴There are other approaches worth mentioning. Davidson and
MacKinnon [14], for example, consider that the spurious
regression phenomenon is, at least partially, due to a misspecification of
the model. The authors argue that, instead of Equation (1), practitioners
should estimate $y_t = \alpha + \beta x_t + \gamma y_{t-1} + u_t$. Using this specification makes
the null hypothesis of the t-test valid. Simulations presented in Davidson
and MacKinnon [14] reveal, however, that even a correct specification
is unable to provide an adequate size of the t-test under the null
hypothesis when the variables are nonstationary.