Theoretical Economics Letters, 2011, 1, 15-17
doi:10.4236/tel.2011.12004 Published Online August 2011 (http://www.scirp.org/journal/tel)
Nonparametric Lag Selection for Additive Models Based
on the Smooth Backfitting Estimator
Zheng-Feng Guo1, Ling Yan Cao2, Ying He3
1Vanderbilt University, Nashville, USA
2University of Maryland, Maryland, USA
3Ninewell, Inc., Bethesda, Maryland, USA
E-mail: zhengfeng.guo@vanderbilt.edu; lycao@math.umd.edu; yheumd@yahoo.com
Received May 22, 2011; revised June 15, 2011; accepted June 22, 2011
Abstract
This paper proposes a nonparametric FPE-like procedure based on the smooth backfitting estimator when the
additive structure is a priori known. This procedure can be expected to perform well because of the
well-known good finite sample performance of the smooth backfitting estimator. Consistency of our procedure is
established under very general conditions, including heteroskedasticity.
Keywords: Lag Selection, FPE, Consistency, Additive Models
1. Introduction
With the wide application of nonparametric techniques in
the time series literature, many nonparametric lag selection
criteria based on kernel smoothing methods have
been proposed, such as the nonparametric FPE (Tjøstheim
and Auestad, 1994 [1]; Tschernig and Yang, 2000 [2])
and cross validation (Cheng and Tong, 1992 [3]). Under
very general assumptions, Tschernig and Yang (2000) [2]
show the asymptotic equivalence of cross validation and
the nonparametric FPE and the consistency of the latter procedure,
originally proposed by Tjøstheim and Auestad
(1994) [1]. Unfortunately, despite the desirable asymptotic
properties of the FPE procedure, Tschernig and Yang
(2000) [2] point out that overfitting models are selected
too often when the sample size is small.
More recently, Guo and Shintani (2011) [4] imposed the
additivity assumption and proposed a consistent FPE-like
lag selection procedure based on the marginal integration
method of Linton and Nielsen (1995) [5]. In contrast to
the unrestricted FPE procedure without the additivity
assumption, the additive nonparametric FPE-like procedure
performs reasonably well in small samples and
generally outperforms the unrestricted FPE due to the
reduction of overfitting.
As discussed in the conclusion of Guo and Shintani
(2011) [4], the better finite sample properties of the
backfitting method over marginal integration have been
reported in simulation studies (e.g., Sperlich, Linton and
Härdle, 1999 [6]). The possibility of developing a more
effective lag selection procedure based on smooth
backfitting remains to be studied. In this paper, we close
this gap and propose a consistent FPE-like procedure
based on the smooth backfitting estimator. We provide
the conditions required for consistency. In contrast to
the FPE-like procedure of Guo and Shintani (2011) [4],
our procedure can be expected to perform better in
finite samples because of the well-known desirable finite-sample
properties of the smooth backfitting estimator.
The remainder of the paper is organized as follows. In
Section 2, we introduce the model and discuss the
asymptotic properties of our procedure. Section 3 discusses
the implementation and the consistency of the criterion.
2. The Nonparametric FPE for Additive
Models
In this paper, we consider the problem of selecting the
combination of lags $S = \{i_1, i_2, \ldots, i_m\}$, where $i_j \neq i_k$ for
$j \neq k$, in an additive AR model of the form
$$Y_t = c + \sum_{i \in S} f_i(Y_{t-i}) + \sigma(X_t)\,\varepsilon_t \quad \text{for } t = 1, \ldots, n,$$
where $X_t = (Y_{t-i_1}, Y_{t-i_2}, \ldots, Y_{t-i_m})'$. Since the
convergence rate of additive regression estimators does
not depend on the dimension of the model, we do not
impose any restriction on $m$. Below are our
main assumptions.
Assumptions.
(i) For some integer $M \geq i_m$, the vector process $X_{t,M} = (Y_{t-1}, Y_{t-2}, \ldots, Y_{t-M})'$ is strictly stationary and $\beta$-mixing with $\beta(n) \leq c_0 n^{-(2+\delta)/\delta}$ for some $\delta > 0$ and $c_0 > 0$.
(ii) The stationary distribution of the process $X_{t,M}$ has a continuously differentiable density $\mu_M(x_M)$.
(iii) The autoregression function $f_i$ for $i \in S$ is twice continuously differentiable, while $\sigma(\cdot)$ is continuous and positive on the support of the weight function $w$.
(iv) $\{\varepsilon_t\}$ is a sequence of i.i.d. random variables with $E(\varepsilon_t) = 0$, $E(\varepsilon_t^2) = 1$ and a finite fourth moment.
(v) The support of the weight function $w$ is compact with nonempty interior; $w$ is continuous, nonnegative, and $\mu_M(x_M) > 0$ for $x_M \in \operatorname{supp}(w)$.
(vi) The kernel-based nonparametric additive regression estimator $\hat f_i(x_i)$ for $i \in S$ converges to $f_i(x_i)$ at the one-dimensional rate.
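To make the class of processes concrete, the following sketch simulates one member of this model class with two lags and conditionally heteroskedastic errors; the constant, the component functions and the volatility function are illustrative choices of ours, not specifications taken from the paper.

```python
import numpy as np

def simulate_additive_ar(n, burn=200, seed=0):
    """Simulate Y_t = c + f1(Y_{t-1}) + f2(Y_{t-2}) + sigma(Y_{t-1}) * eps_t.

    The component functions and the volatility function are illustrative
    choices, not those used in the paper.
    """
    rng = np.random.default_rng(seed)
    c = 0.1
    f1 = lambda y: 0.6 * np.tanh(y)            # smooth, bounded nonlinearity
    f2 = lambda y: -0.3 * y / (1.0 + y ** 2)   # smooth, bounded nonlinearity
    sigma = lambda y: 0.5 + 0.1 * np.abs(y)    # heteroskedastic error scale
    y = np.zeros(n + burn)
    eps = rng.standard_normal(n + burn)
    for t in range(2, n + burn):
        y[t] = c + f1(y[t - 1]) + f2(y[t - 2]) + sigma(y[t - 1]) * eps[t]
    return y[burn:]  # drop the burn-in so the retained draws are near-stationary

if __name__ == "__main__":
    series = simulate_additive_ar(500)
    print(series[:5])
```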
In estimating the additive AR model, we employ the
smooth backfitting estimator, which is a useful practical
variant of the classical backfitting estimator (see Mammen,
Linton and Nielsen, 1999 [7], and Nielsen and
Sperlich, 2005 [8]). By using an analogy to the asymptotic
FPE of Tschernig and Yang (2000) [2], the second
term in formula (7) of Tjøstheim and Auestad (1994) [1]
is decomposed as follows:
$$E\Big[\Big(\sum_{i\in S}\hat f_i(Y_{t-i})-\sum_{i\in S}f_i(Y_{t-i})\Big)^{2}w(X_{t,M})\Big]=E\big[(I+II)^{2}w(X_{t,M})\big],$$
where $I=\sum_{i\in S}\big(E[\hat f_i(Y_{t-i})]-f_i(Y_{t-i})\big)$ collects the bias terms, $II=\sum_{i\in S}\big(\hat f_i(Y_{t-i})-E[\hat f_i(Y_{t-i})]\big)$ collects the stochastic terms, the cross term is asymptotically negligible, and the bandwidth $h$ is of order $n^{-1/5}$.
We can show that $E[I^{2}w(X_{t,M})]$ converges to
$$\frac{\sigma_K^{4}h^{4}}{4}\sum_{i\in S}\int r_i^{2}(x_i)\,\mu_M(x_M)\,w(x_M)\,dx_M,$$
with $r_i(x_i)$ being the limit of the normalized bias of $\hat f_i(x_i)$: $r_i(x_i)=\tfrac{1}{2}f_i''(x_i)$ when the local linear estimator is used for $\hat f_i(x_i)$, while $r_i(x_i)$ contains an additional term involving $f_i'(x_i)$ and the marginal density $\mu_i(x_i)$ and its derivative when the local constant (Nadaraya-Watson) estimator is used for $\hat f_i(x_i)$.
Similarly, we can show that $E[II^{2}w(X_{t,M})]$ converges to
$$\frac{\|K\|_2^{2}}{nh}\sum_{i\in S}\int\frac{\sigma_i^{2}(x_i)}{\mu_i(x_i)}\,\mu_M(x_M)\,w(x_M)\,dx_M,$$
where $\sigma_i^{2}(x_i)=\operatorname{var}\big(Y_t-f_i(x_i)\mid Y_{t-i}=x_i\big)$ and $\mu_i(\cdot)$ denotes the marginal density of $Y_{t-i}$.
Therefore, we can define the AFPE as
$$\mathrm{AFPE}=A+\frac{\|K\|_2^{2}}{nh}B+\frac{\sigma_K^{4}h^{4}}{4}C \qquad (1)$$
with
$$A=\int\sigma^{2}(x_M)\,\mu_M(x_M)\,w(x_M)\,dx_M,$$
$$B=\sum_{i\in S}\int\frac{\sigma_i^{2}(x_i)}{\mu_i(x_i)}\,\mu_M(x_M)\,w(x_M)\,dx_M,$$
$$C=\sum_{i\in S}\int r_i^{2}(x_i)\,\mu_M(x_M)\,w(x_M)\,dx_M,$$
where $\|K\|_2^{2}=\int K^{2}(u)\,du$, $\sigma_K^{2}=\int K(u)u^{2}\,du$, and $r_i(x_i)$ is the term that
appears in the asymptotic bias of the estimator $\hat f_i(x_i)$. The optimal bandwidth, which minimizes
Equation (1), is given by
$$h_{\mathrm{opt}}=\big(\|K\|_2^{2}\,B\,\sigma_K^{-4}\,C^{-1}\big)^{1/5}\,n^{-1/5}.$$
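Minimizing (1) over $h$ is a standard bias-variance trade-off; for completeness, the first-order condition reads
$$\frac{\partial\,\mathrm{AFPE}}{\partial h}=-\frac{\|K\|_2^{2}B}{nh^{2}}+\sigma_K^{4}h^{3}C=0
\quad\Longrightarrow\quad
h^{5}=\frac{\|K\|_2^{2}B}{n\,\sigma_K^{4}C},$$
which yields the expression for $h_{\mathrm{opt}}$ above.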
3. Estimation and the Consistency of our
Criterion
Our criterion for additive AR models takes the form
$$FPE(S)=\hat A+\frac{2K(0)}{nh^{(1-\lambda)+\lambda m}}\,\hat B,$$
where $\lambda\in[0,1]$,
$$\hat A=n^{-1}\sum_{t=1}^{n}\Big(Y_t-\sum_{i\in S}\hat f_i(Y_{t-i})\Big)^{2}w(X_{t,M}),$$
and
$$\hat B=n^{-1}\sum_{t=1}^{n}\sum_{i\in S}\frac{\big(Y_t-\hat f_i(Y_{t-i})\big)^{2}}{\hat\mu_i(Y_{t-i})}\,w(X_{t,M}).$$
The first term in $FPE(S)$ corresponds to the measure of
regression fit in traditional information criteria for
model selection, while the second term serves as a penalty
to avoid overfitting, depending on a tuning parameter $\lambda$.
We follow Tschernig and Yang (2000) [2] and focus
on the case when the optimal bandwidth $h_{\mathrm{opt}}$ is used for
$\hat f_i(x)$ in $\hat A$, but any bandwidth of order $n^{-1/5}$ can be
used for $\hat f_i(x)$ in $\hat B$. We select the subset
$\hat S=\{\hat i_1,\hat i_2,\ldots,\hat i_{\hat m}\}$ which minimizes $FPE(S)$ among
all possible combinations of lags from $\{1,2,\ldots,M\}$. The selected
$\hat S$ overfits if $\hat S\supset S$ and $\hat S\neq S$; it underfits if
it does not overfit and $\hat S\neq S$.
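Since $M$ is finite, the selection step is a plain combinatorial search over lag subsets. The sketch below enumerates all nonempty subsets of $\{1,\ldots,M\}$ and keeps the minimizer of a supplied criterion; the function `fpe_criterion` is a deliberately simplified stand-in (an ordinary least-squares fit with an illustrative penalty), not the paper's $\hat A$/$\hat B$ construction based on the smooth backfitting fit.

```python
import itertools
import numpy as np

def fpe_criterion(y, lags, lam=0.5):
    """Placeholder criterion: penalized in-sample mean squared error.

    A faithful implementation would compute A_hat and B_hat from a smooth
    backfitting fit of the additive AR model on the chosen lags; an OLS fit
    in the lagged levels is used here only to make the loop runnable.
    """
    p = max(lags)
    X = np.column_stack([y[p - i:len(y) - i] for i in lags])
    X = np.column_stack([np.ones(X.shape[0]), X])
    target = y[p:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    n, m = resid.shape[0], len(lags)
    a_hat = np.mean(resid ** 2)
    penalty = 2.0 * m * n ** (-(4.0 - lam * (m - 1)) / 5.0)  # heavier for larger lag sets when lam > 0
    return a_hat * (1.0 + penalty)

def select_lags(y, M, criterion=fpe_criterion):
    """Exhaustive search over all nonempty lag subsets of {1, ..., M}."""
    best_lags, best_value = None, np.inf
    for m in range(1, M + 1):
        for lags in itertools.combinations(range(1, M + 1), m):
            value = criterion(y, lags)
            if value < best_value:
                best_lags, best_value = lags, value
    return best_lags, best_value
```

For instance, `select_lags(y, M=4)` applied to a simulated series from the earlier sketch returns the selected lag set together with its criterion value.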
The lag selection procedure is consistent if the probability
of $\hat S=S$ approaches one as $n\to\infty$.
Theorem 1: Under our assumptions and $\lambda\in(0,1]$, as
$n\to\infty$,
$$\frac{FPE(S^{+})-A}{FPE(S)-A}\to\infty$$
for any overfitting combination $S^{+}=\{i_1^{+},i_2^{+},\ldots,i_{m^{+}}^{+}\}$.
The overfitting $FPE(S^{+})$ asymptotically becomes larger
than the correctly specified $FPE(S)$ because the
penalty of the former converges at a rate slower than that of the
latter as long as $\lambda>0$. It should be noted that the $h_{\mathrm{opt}}$ used
for $FPE(S^{+})$ differs from the $h_{\mathrm{opt}}$ used for $FPE(S)$. Unlike the unrestricted
FPE, however, the convergence rates of the two bandwidths
are the same for additive models even if the dimensions
of the regressors are different, which is why $\lambda=0$ is not
desirable for excluding overfitting models. Following the
same argument as in Guo and Shintani (2011) [4], we
can easily show that the FPE in the underfitting case is larger
than that of a correctly fitting model. Then we have:
Theorem 2. Under our assumptions and $\lambda\in(0,1]$, as
$n\to\infty$, $P(\hat S=S)\to 1$.
Remarks
The consistency of our procedure holds for both local
linear and local constant estimators.
If $\lambda>0$, the probability of selecting the correct
model converges to one as the sample size increases.
If $\lambda=0$, our criterion is asymptotically equivalent
to the asymptotic FPE.
While the FPE-like procedure of Guo and Shintani
(2011) [4] and our procedure are both consistent,
the latter can be expected to perform better in finite
samples because of the well-known good finite-sample
performance of the smooth backfitting estimator.
4. Conclusions
The better finite sample properties of the backfitting method
over marginal integration have often been reported
in simulation studies. Guo and Shintani (2011) [4]
propose an FPE-like procedure based on the marginal
integration method due to its simplicity. Our paper proposes
a more effective lag selection criterion based on
the smooth backfitting estimator. The new criterion can
be expected to perform better in finite samples.
5. References
[1] D. Tjøstheim and B. Auestad, “Nonparametric Identifica-
tion of Nonlinear Time Series: Selecting Significant
Lags,” Journal of the American Statistical Association,
Vol. 89, No. 428, 1994, pp. 1410-1419.
doi:10.2307/2291003
[2] R. Tschernig and L. Yang, “Nonparametric Lag Selection
for Time Series,” Journal of Time Series Analysis, Vol.
21, No. 4, 2000, pp. 457-487.
doi:10.1111/1467-9892.00193
[3] B. Cheng and H. Tong, “On Consistent Nonparametric
Order Determination and Chaos,” Journal of the Royal
Statistical Society Series B (Methodological), Vol. 54, No.
2, 1992, pp. 427-449.
[4] Z. F. Guo and M. Shintani, “Nonparametric Lag Selec-
tion for Additive Models,” Economics Letters, Vol. 111, No.
2, 2011, pp. 131-134.
doi:10.1016/j.econlet.2011.01.014
[5] O. B. Linton and J. P. Nielsen, “A Kernel Method of
Estimating Structured Nonparametric Regression Based
on Marginal Integration,” Biometrika, Vol. 82, No. 1,
1995, pp. 93-100. doi:10.1093/biomet/82.1.93
[6] S. Sperlich, O. B. Linton and W. Härdle, “Integration and
Backfitting Methods in Additive Models-Finite Sample
Properties and Comparison,” Test, Vol. 8, No. 2, 1999, pp.
419-458. doi:10.1007/BF02595879
[7] E. Mammen, O. B. Linton and J. P. Nielsen, “The Exis-
tence and Asymptotic Properties of a Backfitting Projec-
tion Algorithm under Weak Conditions,” Annals of Sta-
tistics, Vol. 27, No. 5, 1999, pp. 1443-1490.
doi:10.1214/aos/1017939137
[8] J. P. Nielsen and S. Sperlich, “Smooth Backfitting in
Practice,” Journal of the Royal Statistical Society Series
B, Vol. 67, No. 1, 2005, pp. 43-61.
doi:10.1111/j.1467-9868.2005.00487.x