Open Journal of Statistics, 2012, 2, 383-388 http://dx.doi.org/10.4236/ojs.2012.24046 Published Online October 2012 (http://www.SciRP.org/journal/ojs)

Edgeworth Approximation of a Finite Sample Distribution for an AR(1) Model with Measurement Error

Shuichi Nagata
Department of Mathematical Sciences, Kwansei Gakuin University, Sanda, Japan
Email: nagatas@kwansei.ac.jp

Received July 25, 2012; revised August 27, 2012; accepted September 9, 2012

ABSTRACT

In this paper, we consider the finite sample properties of the ordinary least squares (OLS) estimator for an AR(1) model with measurement error. We present the Edgeworth approximation of the finite sample distribution of the OLS estimator up to O(T^(-1/2)). We also introduce an instrumental variable estimator that is consistent in the presence of measurement error. Finally, a simulation study is conducted to assess the theoretical results and to compare the finite sample performances of these estimators.

Keywords: Edgeworth Expansion; OLS; Measurement Error; Instrumental Variables Estimator

1. Introduction

The ordinary least squares (OLS) estimator for the AR(1) model is one of the most widely used estimators in econometrics, and its properties under various conditions have been studied by several authors. For example, it is well known that the OLS estimator for the AR(1) model has a non-negligible bias when the sample size T is not large. This is known as the small sample problem ([1,2]). Another problem is that the observed data are sometimes contaminated by noise, which also affects the estimation result. In this case, the OLS estimator in the AR(1) model is not consistent. This is commonly known as the measurement error problem. Following [3], which summarized the early results on this topic, numerous articles have been published in this area.
For example, with respect to time series analysis, some estimators for an AR model with measurement error are proposed in [4], and a statistical test for the existence of noise is proposed in [5].

In this paper, we deal with these two important problems simultaneously. In particular, we consider OLS estimation when the sample size T is not large and when a measurement error is present but ignored. To evaluate the effects of a small sample size and of ignoring the measurement error, we derive finite sample properties of the OLS estimator with noise using the Edgeworth expansion, a traditional technique in econometrics for approximating a finite sample distribution. For example, the OLS estimator was studied in [6] for the pure AR(1) model, in [7] for the AR(1) model with an unknown mean, and in [8] for the AR(1) model with exogenous variables. Following these studies, we apply the algorithm proposed in [9] to calculate the Edgeworth coefficients.

In our setting, some problems arise from the noise, which make the calculation difficult. First, if the data are affected by noise, it is difficult to obtain the variables that are related to the autocovariance function of the observed process. To obtain these variables, we use the result in [10], which shows that an AR(1) process plus noise can be represented by an ARMA(1, 1) process. Second, the OLS estimator is inconsistent in the presence of noise, and the formula in [9] cannot be applied directly in this case; hence, following [8] and [11], we use a corrected error function to avoid this problem.

In the simulation section, we also consider an instrumental variable (IV) estimator, which is consistent in our setting. We compare the finite sample performance of the OLS estimator with that of the proposed IV estimator using simulations.

This paper is organized as follows. In the next section, we provide our setting and the main result, the Edgeworth approximation of the OLS estimator up to O(T^(-1/2)).
In Section 3, we examine a Monte Carlo simulation and a graphical comparison. Finally, Section 4 concludes the paper.

2. Setup and Main Results

We consider the following measurement error model:

y_t = x_t + u_t,  x_t = β x_{t-1} + ε_t,  t = 1, ..., T,
ε_t ~ i.i.d. N(0, σ_ε²),  u_t ~ i.i.d. N(0, σ_u²),  (1)

where only y_t is observable, x_t is a stationary AR(1) process with |β| < 1, and u_t is the measurement error, or noise. For a given sample y = (y_0, y_1, ..., y_T)′, the OLS estimator can be written as

β̂ = (y′C₁y) / (y′C₂y),  (2)

where C₁ is the (T+1)×(T+1) symmetric matrix with 1/2 in the entries adjacent to the main diagonal and zeros elsewhere, and C₂ = diag(1, ..., 1, 0), so that y′C₁y = Σ_{t=1}^{T} y_t y_{t-1} and y′C₂y = Σ_{t=1}^{T} y_{t-1}².

The result of this paper relies on the following well known result given in [10]. If x_t is an AR(1) process with AR parameter β, and u_t is white noise with constant variance σ_u², then y_t follows an ARMA(1, 1) process given by

(1 - βL) y_t = (1 - γL) ξ_t,  ξ_t ~ i.i.d. N(0, σ_ξ²),  (3)

where L is the lag operator. The parameters σ_ξ² and γ (the MA parameter) are related to β, σ_ε², and σ_u² as follows:

γ = (q - √(q² - 4)) / 2,  σ_ξ² = β σ_u² / γ,  where q = (σ_ε² + (1 + β²) σ_u²) / (β σ_u²).

Then, we obtain the following theorem.

Theorem 1. The finite sample distribution of the OLS estimator up to O(T^(-1/2)) is given by

P( √T (β̂ - plim β̂)/ω ≤ w ) = I(w) - (c₀ + c₂ w²) i(w) + o(T^(-1/2)),

where i(w) = (2π)^(-1/2) exp(-w²/2), I(w) = ∫_{-∞}^{w} i(t) dt, ω is the asymptotic standard deviation of the OLS estimator, and the O(T^(-1/2)) coefficients c₀ and c₂ are ratios of polynomials P_i (i = 1, ..., 6) and Q in β and γ; their explicit expressions are lengthy and are omitted here to save space.

Proof. The proof is given in the Appendix.
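The parameter mapping in Equation (3) can be checked numerically. The following sketch (Python; the function name `arma_params` is ours) computes γ and σ_ξ² from q and verifies that the implied MA(1) part reproduces the variance and lag-one autocovariance of w_t = ε_t + u_t - β u_{t-1}.

```python
import numpy as np

def arma_params(beta, sig_eps2, sig_u2):
    """ARMA(1,1) parameters (gamma, sig_xi2) implied by beta, sig_eps2, sig_u2."""
    q = (sig_eps2 + (1 + beta**2) * sig_u2) / (beta * sig_u2)
    gamma = (q - np.sqrt(q**2 - 4.0)) / 2.0   # invertible root, 0 < gamma < 1
    sig_xi2 = beta * sig_u2 / gamma           # from the lag-one autocovariance
    return gamma, sig_xi2

# (1 - gamma L) xi_t must match the second moments of w_t = eps_t + u_t - beta u_{t-1}:
# variance (1 + gamma^2) sig_xi2 = sig_eps2 + (1 + beta^2) sig_u2,
# lag one  gamma sig_xi2 = beta sig_u2.
gamma, sig_xi2 = arma_params(0.5, 1.0, 0.2)
print(gamma, sig_xi2)
```

Since γ solves γ² - qγ + 1 = 0, both moment conditions hold exactly up to floating-point error for any admissible parameter values.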
Here, we also examine the IV estimator, defined as

β̂_IV = (Σ_{t=2}^{T} y_t y_{t-2}) / (Σ_{t=2}^{T} y_{t-1} y_{t-2}),  (4)

which uses y_{t-2} as the instrument. The IV estimator is consistent in the presence of noise. In the absence of noise, it is easy to show that the asymptotic variance of the IV estimator is (1 - β²)/β², whereas that of the OLS estimator is 1 - β². Therefore, the OLS estimator is more efficient than the IV estimator in the absence of noise.

3. Simulation and Graphical Comparison

In this section, we examine the finite sample properties of the OLS estimator and evaluate the approximate distribution derived in Section 2 by Monte Carlo simulation. Data were simulated from Equation (1) with σ_ε² = 1, so that the noise-to-signal ratio ρ = σ_u²/σ_ε² = σ_u² throughout this section. In addition to the OLS estimator, we also computed the IV estimator defined above. The number of replications was 20,000. We computed the mean (Mean) and the root mean squared error (RMSE) for each estimator. The simulation results are summarized in Tables 1-3.

Table 1. Simulation results (ρ = 0.2).

          β = 0.4                      β = 0.8
       Mean         RMSE           Mean         RMSE
T      OLS    IV    OLS    IV      OLS    IV    OLS    IV
20     0.31   0.91  0.24   68.73   0.68   0.71  0.23   1.68
40     0.33   0.58  0.17   45.50   0.71   0.75  0.16   0.22
100    0.34   0.20  0.12   23.74   0.73   0.78  0.11   0.10
500    0.34   0.40  0.07   0.13    0.74   0.80  0.07   0.04
800    0.34   0.40  0.07   0.10    0.74   0.80  0.06   0.03

Table 3. Simulation results (ρ = 0.7).

          β = 0.4                      β = 0.8
       Mean         RMSE           Mean         RMSE
T      OLS    IV    OLS    IV      OLS    IV    OLS    IV
20     0.23   0.62  0.28   109.62  0.56   0.59  0.32   12.6
40     0.24   0.99  0.22   72.55   0.60   0.77  0.26   2.89
100    0.25   0.37  0.18   4.39    0.62   0.78  0.20   0.12
500    0.25   0.40  0.16   0.18    0.64   0.80  0.17   0.05
800    0.25   0.40  0.15   0.14    0.64   0.80  0.17   0.04

From Tables 1-3, we confirm that the OLS estimator is biased. As expected, the smaller the sample size and the larger the noise variance, the larger the bias. On the other hand, since the IV estimator is consistent, it should converge to the true value.
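For concreteness, the contrast between the two estimators can be reproduced with a small Monte Carlo sketch (Python; the helper names are ours, and the design matches one cell of the tables: σ_ε² = 1, ρ = σ_u² = 0.2, β = 0.8). The OLS estimator centers near its probability limit β σ_x²/(σ_x² + σ_u²) rather than β, while the IV estimator centers near β.

```python
import numpy as np

rng = np.random.default_rng(12345)

def simulate_y(beta, sig_eps, sig_u, T, burn=200):
    """Draw y_t = x_t + u_t from model (1), discarding a burn-in."""
    eps = rng.normal(0.0, sig_eps, T + burn)
    x = np.zeros(T + burn)
    for t in range(1, T + burn):
        x[t] = beta * x[t - 1] + eps[t]
    return x[burn:] + rng.normal(0.0, sig_u, T)

def ols(y):
    """Equation (2): sum y_t y_{t-1} / sum y_{t-1}^2."""
    return np.sum(y[1:] * y[:-1]) / np.sum(y[:-1] ** 2)

def iv(y):
    """Equation (4): y_{t-2} as the instrument."""
    return np.sum(y[2:] * y[:-2]) / np.sum(y[1:-1] * y[:-2])

beta, sig_u2, T, reps = 0.8, 0.2, 500, 1000        # sig_eps2 = 1, rho = 0.2
sig_x2 = 1.0 / (1.0 - beta**2)                     # stationary variance of x_t
plim_ols = beta * sig_x2 / (sig_x2 + sig_u2)       # attenuated OLS limit
est = np.array([(ols(y), iv(y))
                for y in (simulate_y(beta, 1.0, np.sqrt(sig_u2), T)
                          for _ in range(reps))])
print("OLS mean:", est[:, 0].mean(), " IV mean:", est[:, 1].mean())
```

With these parameter values plim_ols ≈ 0.746, in line with the T = 500 row of Table 1.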
The simulation results are consistent with this hypothesis. However, in small samples such as T = 20 and 40, the RMSE of the IV estimator is rather large, as seen in all the tables.

Table 2. Simulation results (ρ = 0.4).

          β = 0.4                      β = 0.8
       Mean          RMSE           Mean         RMSE
T      OLS    IV     OLS    IV      OLS    IV    OLS    IV
20     0.27   -0.06  0.25   48.96   0.63   0.69  0.27   3.20
40     0.29   4.23   0.19   540.2   0.66   0.75  0.20   0.29
100    0.29   0.36   0.15   2.53    0.68   0.78  0.15   0.10
500    0.30   0.40   0.11   0.15    0.70   0.80  0.11   0.04
800    0.30   0.40   0.11   0.11    0.70   0.80  0.11   0.03

From the RMSE results in Tables 2 and 3, we find that the IV estimator is more efficient than the OLS estimator when T ≥ 800 (β = 0.4) and when T ≥ 100 (β = 0.8). Therefore, we can conclude that, if the sample size is not large (T = 20, 40), or if both β and ρ are small as in Table 1 (β = 0.4 and ρ = 0.2), the OLS estimator is better than the IV estimator in terms of the RMSE.

Next, we compare the exact cdf with the asymptotic normal distribution. Figure 1 depicts the approximate distributions of the OLS and IV estimators with β = 0.4, T = 20, and various values of ρ.

Figure 1. Exact distributions of OLS and IV.

Figure 1 indicates that the OLS estimates have a downward bias. The IV estimator exhibits good behavior in the central region of the distribution; however, its distribution has fatter tails than the normal distribution.

Finally, we evaluate the approximate distributions obtained in Section 2 and compare them with the exact cdf and the asymptotic normal distribution. To enable a comparison of the shapes of the distributions, the asymptotic bias of the OLS estimator is corrected hereinafter. Figure 2 shows the approximate distributions of the OLS estimator with T = 20, ρ = 0.2, and three different values of β. From this figure, we observe the same results as those obtained in [6].

Figure 2. Exact and approximate distributions of OLS.

Figure 3 depicts the approximate distributions of the OLS estimator with ρ = 0.7, where the other parameter values are the same as those for Figure 2.

Figure 3. Exact and approximate distributions of OLS.

We note that the shapes of the distributions are almost the same even if the noise ratio ρ is changed. From these figures, the noise variance has only a small effect on the shape of the OLS distribution.

4. Discussion

In this paper, we considered the finite sample properties of the OLS estimator for the AR(1) model with measurement error. Using the formula in [9], we obtained the Edgeworth expansion of the finite sample distribution of the OLS estimator up to O(T^(-1/2)).

In the simulation study, we compared the naive OLS estimator with the IV estimator, which is consistent in the presence of noise. We confirmed that, even when measurement error is present, the OLS estimator is more efficient than the IV estimator when the sample size is small, such as T = 20 and 40. If the noise-to-signal ratio is not so small (ρ ≥ 0.4), the IV estimator is more efficient than the OLS estimator when T ≥ 800 (β = 0.4) or T ≥ 100 (β = 0.8). From the graphs of the normalized OLS distributions, we find properties similar to those of the distributions in the no-noise situation examined by [6]. This result implies that measurement error mainly distorts the mean and variance of the OLS distribution; hence, the two problems of small sample size and observation error can be dealt with separately.

Recently, the differenced-AR(1) estimator was discussed in [12,13]. Even if the sample size T is relatively small, this estimator has a small bias. To obtain its finite sample distribution and to examine the robustness of such estimators with respect to observation errors, the technique of this paper can be applied; this will be dealt with in a future study.

5. Acknowledgements

I am grateful to Professor Koichi Maekawa for his guidance on this topic and his valuable comments on this paper.
I am also grateful to Professor Yasuyoshi Tokutsu for his valuable comments.

REFERENCES

[1] F. H. C. Marriott and J. A. Pope, “Bias in the Estimation of Autocorrelations,” Biometrika, Vol. 41, No. 3-4, 1954, pp. 390-402. doi:10.2307/2332719
[2] M. G. Kendall, “Note on the Bias in the Estimation of Autocorrelation,” Biometrika, Vol. 41, No. 3-4, 1954, pp. 403-404. doi:10.2307/2332720
[3] W. A. Fuller, “Measurement Error Models,” John Wiley, New York, 1987. doi:10.1002/9780470316665
[4] J. Staudenmayer and J. P. Buonaccorsi, “Measurement Error in Linear Autoregressive Models,” Journal of the American Statistical Association, Vol. 100, No. 471, 2005, pp. 841-851. doi:10.1198/016214504000001871
[5] K. Tanaka, “A Unified Approach to the Measurement Error Problem in Time Series Models,” Econometric Theory, Vol. 18, No. 2, 2002, pp. 278-296. doi:10.1017/S026646660218203X
[6] P. C. B. Phillips, “Approximations to Some Finite Sample Distributions Associated with a First Order Stochastic Difference Equation,” Econometrica, Vol. 45, No. 2, 1977, pp. 463-485. doi:10.2307/1911222
[7] K. Tanaka, “Asymptotic Expansions Associated with the AR(1) Model with Unknown Mean,” Econometrica, Vol. 51, No. 4, 1983, pp. 1221-1231. doi:10.2307/1912060
[8] K. Maekawa, “An Approximation to the Distribution of the Least Squares Estimator in an Autoregressive Model with Exogenous Variables,” Econometrica, Vol. 51, No. 1, 1983, pp. 229-238. doi:10.2307/1912258
[9] J. D. Sargan, “Econometric Estimators and the Edgeworth Approximation,” Econometrica, Vol. 44, No. 3, 1976, pp. 421-448. doi:10.2307/1913972
[10] C. W. J. Granger and M. J. Morris, “Time Series Modeling and Interpretation,” Journal of the Royal Statistical Society A, Vol. 139, No. 2, 1976, pp. 246-257. doi:10.2307/2345178
[11] K. Maekawa, “Edgeworth Expansion for the OLS Estimator in a Time Series Regression Model,” Econometric Theory, Vol. 1, No. 2, 1985, pp. 223-239. doi:10.1017/S0266466600011154
[12] K.
Hayakawa, “A Note on Bias in First-Differenced AR(1) Models,” Economics Bulletin, Vol. 3, No. 27, 2006, pp. 1-10.
[13] P. C. B. Phillips and C. Han, “Gaussian Inference in AR(1) Time Series with or without a Unit Root,” Econometric Theory, Vol. 24, No. 3, 2008, pp. 631-650. doi:10.1017/S0266466608080262

Appendix

Proof of Theorem 1

The OLS estimator of β is given by Equation (2). Introducing μ_i = E[y′C_i y] (i = 1, 2), the estimation error can be written as

β̂ - μ₁/μ₂ = (y′C₁y - (μ₁/μ₂) y′C₂y) / (y′C₂y).

In addition, we introduce q_i = (y′C_i y - μ_i)/√T and q = (q₁, q₂)′. Writing μ_i = T m_i and β* = m₁/m₂ = plim β̂, the error function becomes

√T (β̂ - β*) = (q₁ - β* q₂) / (m₂ + T^(-1/2) q₂).

In order to develop the Edgeworth expansion, we define a modified function e(q) through

√T e(q) = √T (β̂ - β*)/ω = (q₁ - β* q₂) / (ω (m₂ + T^(-1/2) q₂)).

The cumulant generating function of q is

ψ(θ) = -(1/2) log det(I - 2T^(-1/2)(θ₁C₁ + θ₂C₂)Σ) - T^(-1/2)(θ₁μ₁ + θ₂μ₂),

where θ_j = it_j, the matrix I is the identity matrix, and Σ is the covariance matrix of y.

The Edgeworth expansion requires the partial derivatives of e(q) and ψ(θ) up to the third order. In the current paper these derivatives are denoted e_j, e_jk, ψ_jk and ψ_jkl, which are all evaluated at the origin. Using the tensor summation convention, the Edgeworth coefficients of Sargan’s formula are obtained from these derivatives. Although we only use the Edgeworth coefficients related to the approximation up to O(T^(-1/2)), the original formula of Sargan can approximate up to O(T^(-1)); see [9]. After many calculations, we finally obtain the Edgeworth coefficients; to save space, we express the results only through the polynomials P_i and Q defined in Section 2. The approximation of the OLS estimator up to O(T^(-1/2)) is derived in [9] as follows.
P( √T e(q) ≤ x ) = I(x) - (c₀ + c₂ x²) i(x) + o(T^(-1/2)),  (5)

where i(x) is the standard normal density and c₀ and c₂ are O(T^(-1/2)) constants determined by the derivatives e_j, e_jk, ψ_jk and ψ_jkl. Using these results, c₀ and c₂ are obtained as ratios of the polynomials P₁, ..., P₆ and Q defined in Section 2. Substituting these expressions into Equation (5), we obtain Theorem 1.

Copyright © 2012 SciRes. OJS
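The structure of the correction in Equation (5) can be illustrated numerically. In the sketch below (Python), the coefficient values c₀ and c₂ are hypothetical placeholders chosen only for illustration, not the ones derived above; the point is that the O(T^(-1/2)) term tilts the standard normal cdf near the origin while vanishing in the far tails, since i(x) → 0 there.

```python
import math

def norm_pdf(x):
    """Standard normal density i(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal cdf I(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def edgeworth_cdf(x, c0, c2):
    """First-order correction I(x) - (c0 + c2 x^2) i(x), as in Equation (5)."""
    return norm_cdf(x) - (c0 + c2 * x * x) * norm_pdf(x)

# Hypothetical coefficient values, for illustration only.
c0, c2 = 0.15, 0.05
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(x, round(norm_cdf(x), 4), round(edgeworth_cdf(x, c0, c2), 4))
```

Setting c₀ = c₂ = 0 recovers the limiting normal cdf exactly, which is the sense in which the correction is of order T^(-1/2).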