Both FGNs and FI(d)s have been extensively applied as models for long memory time series, and their theoretical properties have been studied. See the volumes by    and the collections of  and  and the references therein.
2.2. Constrained Non-Stationary Models
 argued that long memory in hydrological time series was a statistical artifact caused by analyzing non-stationary time series with statistical tools that assume stationarity. Series which display the long memory property are often constrained for physical reasons to lie in a bounded range, but beyond that we have no reason to believe that they are stationary. For example, in the series we study in this paper (realized volatilities, see Section 6), as long as the companies remain in the index their stock price volatilities cannot have an unbounded increasing or decreasing trend.
Models of this type which have been proposed typically have stochastic shifts in the mean, but overall are mean reverting about some long term average. The most popular of these are the breaks models, which we define as follows:

$$y_t = \sum_{i=1}^{p} I_{t_{i-1} < t \le t_i}\, \mu_i + \epsilon_t \qquad (2)$$

where $y_t$ is the observation at time t, $I_{t_{i-1} < t \le t_i}$ is an indicator variable which is 1 only if $t_{i-1} < t \le t_i$ and 0 otherwise, t is the time, $t_i$, $i = 1, \ldots, p$, are the breakpoints, $\mu_i$ is the mean of regime i, and $\epsilon_t$ is a noise term. In this case, a regime is defined as the period between breakpoints.
It is important to note that Equation (2) is just a way to represent a sequence of different models (i.e., models subject to structural breaks). This model only deals with breaks in the mean, but it can be generalized to any kind of break. In series with a structural break the noise process, $\epsilon_t$, may also undergo a change. We present this class of model simply by way of example because it has been used by others when studying long memory processes. We do not regard it as the only alternative model capable of creating long memory-like properties.
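As an illustration of the mean-shift breaks model described above, the following minimal Python sketch simulates a series whose mean changes at given breakpoints. The function name `simulate_breaks_model`, the Gaussian noise, and the example breakpoints and regime means are assumptions made purely for illustration; they are not taken from the paper.

```python
import random

def simulate_breaks_model(n, breakpoints, means, noise_sd=1.0, seed=0):
    """Simulate y_t = mu_i + eps_t, where regime i covers t_{i-1} < t <= t_i.

    `breakpoints` holds the interior break times; t_0 = 0 and the final
    boundary n are added automatically. Gaussian noise is an assumption.
    """
    if len(means) != len(breakpoints) + 1:
        raise ValueError("need one mean per regime (len(breakpoints) + 1)")
    rng = random.Random(seed)
    bounds = [0] + sorted(breakpoints) + [n]
    series = []
    for i, mu in enumerate(means):
        # Times bounds[i] + 1, ..., bounds[i + 1] belong to regime i.
        for _ in range(bounds[i], bounds[i + 1]):
            series.append(mu + rng.gauss(0.0, noise_sd))
    return series

# Illustrative values only: three regimes with breaks after t = 100 and t = 200.
y = simulate_breaks_model(n=300, breakpoints=[100, 200], means=[0.0, 1.0, -0.5])
print(len(y))  # 300
```

Setting `noise_sd=0.0` returns the piecewise-constant mean function itself, which is a convenient check that the regimes line up with the breakpoints.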
The rationale behind our approach is that if we were to test a true long memory process for breaks we would observe some, but they would, in fact, all be spurious. However, we can estimate by simulation how many spurious breaks we might observe under the null, and this number will vary depending on d (or H). If the observed number differs significantly from that expected, we conclude that the process is unlikely to be a true long memory process, since our expectation of the number of (spurious) breaks has been exceeded. Unlike other authors we do not conclude that the data generating process (DGP) must be one of breaks, as there may be other alternatives to true long memory that exhibit long memory-like properties; we simply conclude “not the null”. For this procedure to have any power, we need to establish the distribution of breaks for the individual and bivariate breaks tests, which we do below.
3.1. The Distribution of Breaks Reported by ART
To obtain the null distribution of reported breaks for fractionally integrated series we simulated FI(d) series using farimaSim from the package fSeries  in R  with lengths from 1000 to 16,000 data points in steps of 1000 data points, and d values between 0.02 and 0.48 in steps of 0.02. ART was applied to each series using functions implemented in the tree package of  and the number of reported breaks, their locations and associated regime lengths were recorded. For each set of parameter values 1000 replications were run. This yielded a set of 384 simulations (16 series lengths by 24 d values) which we used as raw data for establishing the mean number of reported breaks and the distributions of those breaks for FI(d) series. The results are reported in Section 4 below.
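The simulation design above can be sketched in Python as follows. This is not the farimaSim routine used in the paper: as a stand-in it approximates an FI(d) series by a truncated moving average of Gaussian white noise, using the standard weights of $(1 - B)^{-d}$. The names `fi_ma_weights` and `simulate_fi` and the burn-in length are assumptions for illustration.

```python
import random

def fi_ma_weights(d, n_weights):
    """MA(infinity) weights of (1 - B)^(-d): psi_0 = 1, psi_k = psi_{k-1} (k - 1 + d) / k."""
    psi = [1.0]
    for k in range(1, n_weights):
        psi.append(psi[-1] * (k - 1 + d) / k)
    return psi

def simulate_fi(d, n, burn_in=500, seed=0):
    """Approximate FI(d) series via a truncated moving average (a stand-in, not farimaSim)."""
    rng = random.Random(seed)
    total = n + burn_in
    eps = [rng.gauss(0.0, 1.0) for _ in range(total)]
    psi = fi_ma_weights(d, total)
    # Discard the first `burn_in` values, which use the fewest past innovations.
    return [sum(psi[k] * eps[t - k] for k in range(t + 1))
            for t in range(burn_in, total)]

# Parameter grid described in the text: 16 lengths by 24 d values.
lengths = [1000 * i for i in range(1, 17)]
d_values = [round(0.02 * j, 2) for j in range(1, 25)]
grid = [(n, d) for n in lengths for d in d_values]
print(len(grid))  # 384

print(fi_ma_weights(0.5, 3))  # [1.0, 0.5, 0.375]
x = simulate_fi(d=0.4, n=200, burn_in=100)  # small illustrative run
```

The direct convolution costs on the order of n squared operations, so for the longer series in the grid a faster exact generator would be preferable; the sketch is meant only to make the design concrete.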
3.2. The Bivariate Distribution of ART vs CUSUM
We obtained a bivariate distribution of the number of breaks reported by ART  and the CUSUM range   for FI(d) series under a null hypothesis of true long memory through simulation. ART is an application of standard regression trees  to the problem of finding apparent mean shifts in univariate time series. Regression trees are widely used in many branches of statistics but have only infrequently been applied to time series, see  for an example. While they have been used to locate structural breaks in time series it must be stressed that we are only using them to obtain the statistic of the reported number of breaks. In the simulated series to which we are applying ART to obtain the reported number of breaks, all breaks are spurious for there are no real breaks in these series. The question of whether the reported breaks in the example data set are structural breaks or something else is not addressed in this paper.
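The idea of using a regression tree on the time index to report apparent mean shifts can be sketched in Python as below. This is a simplified greedy splitter, not the tree package implementation used in the paper. The stopping parameters `min_cut`, `min_size` and `min_dev` are illustrative assumptions, and no pruning step is included.

```python
def art_breaks(y, min_cut=5, min_size=10, min_dev=0.01):
    """Return sorted break positions found by recursive least-squares splitting in time.

    A node y[a:b] is split at the cut that minimises the summed within-segment
    deviance. Splitting stops when a node is smaller than `min_size` or its
    deviance is at most `min_dev` times the root deviance (assumed settings).
    """
    n = len(y)
    # Prefix sums give O(1) segment sums and sums of squares.
    s, q = [0.0], [0.0]
    for v in y:
        s.append(s[-1] + v)
        q.append(q[-1] + v * v)

    def deviance(a, b):
        tot = s[b] - s[a]
        return (q[b] - q[a]) - tot * tot / (b - a)

    root_dev = deviance(0, n)
    breaks, stack = [], [(0, n)]
    while stack:
        a, b = stack.pop()
        node_dev = deviance(a, b)
        if b - a < min_size or node_dev <= min_dev * root_dev:
            continue
        best_cut, best_dev = None, node_dev
        # Each child must keep at least `min_cut` observations.
        for c in range(a + min_cut, b - min_cut + 1):
            split_dev = deviance(a, c) + deviance(c, b)
            if split_dev < best_dev:
                best_cut, best_dev = c, split_dev
        if best_cut is not None:
            breaks.append(best_cut)
            stack.extend([(a, best_cut), (best_cut, b)])
    return sorted(breaks)

# A noiseless step series has exactly one reported break, at the step.
print(art_breaks([0.0] * 50 + [10.0] * 50))  # [50]
```

The statistic used in the procedure is simply `len(art_breaks(series))`. Applied to simulated long memory series, every reported break is spurious by construction.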
In the CUSUM test the residuals are standardized by dividing by the estimated series standard deviation and the cumulative sum of the residuals is plotted against time. That is,

$$W_T(r) = \frac{1}{\hat{\sigma}\sqrt{T}} \sum_{t=1}^{\lfloor rT \rfloor} e_t, \qquad 0 \le r \le 1, \qquad (3)$$

where $e_t$ is the residual at time t, $\hat{\sigma}$ is the estimated standard deviation and T is the length of the time series, is plotted against r. Under a null hypothesis of no structural breaks in the mean (i.e., that the series has a constant local mean), the cumulative sum forms a Brownian bridge, usually referred to as the Empirical Fluctuation Process (EFP). The range of the CUSUM test is simply the difference between the maximum and minimum values of the EFP.
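The CUSUM range can be computed directly from its definition. The Python sketch below takes the residuals $e_t$ to be deviations from the sample mean, scales the cumulative sum by $\hat{\sigma}\sqrt{T}$, and returns the difference between the maximum and minimum of the resulting process. It is a minimal illustration rather than the strucchange implementation; the function name `cusum_range` and the T - 1 divisor in $\hat{\sigma}$ are assumptions.

```python
import math

def cusum_range(y):
    """Return (range, efp) for a mean-only CUSUM of recursive sums of residuals."""
    n = len(y)
    mean = sum(y) / n
    resid = [v - mean for v in y]
    # Residual standard deviation with an n - 1 divisor (assumed convention).
    sigma = math.sqrt(sum(e * e for e in resid) / (n - 1))
    efp = [0.0]  # the process starts at zero
    running = 0.0
    for e in resid:
        running += e
        efp.append(running / (sigma * math.sqrt(n)))
    return max(efp) - min(efp), efp

rng_value, efp = cusum_range([1.0, -1.0, 1.0, -1.0])
print(rng_value)
```

Because the residuals sum to zero, the process returns to zero at the final time point, which is the tied-down property of a bridge. A constant series has zero standard deviation and would need to be handled separately.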
This type of method can be usefully thought of as a parametric bootstrap. A characteristic that is important for our procedure is that the bivariate distributions in later examples show low correlation between the two statistics (ART and CUSUM). This indicates that the information from the two is complementary, so that the combined test can be expected to perform better than either univariate test on its own.
3.3. The New Procedure
The procedure uses several packages for the R  statistical software and comprises the following steps.
1) Estimate d for the full series. Unless otherwise stated all estimates of d were obtained with the estimator of  as implemented in the R package fracdiff of  .
2) Through simulation, obtain the bivariate probability distribution of number of breaks reported by ART and the CUSUM range for the null distribution of an FI(d) series with d as estimated in the previous step and the same number of observations as the series under test.
a) Simulate a large number of FI(d) series (we used 1000 replications) with appropriate d value and length. FI(d) series were simulated with the function farimaSim in the R package fSeries of  .
b) Use ART to break the series into “regimes” and record the number of reported breaks.
c) Obtain the CUSUM range using the efp function in the R package strucchange of  .
d) Plot the bivariate distribution.
3) Apply ART to the full series to obtain the reported number of break points.
4) Use the efp function to obtain the CUSUM range of the full series.
5) Overplot the bivariate statistic of reported number of breaks with CUSUM range on the previously obtained null distribution.
6) Assess whether the bivariate statistic for the series is consistent with the null distribution obtained by simulation. This can be done either visually or by generating contours of significance for the bivariate distribution. Contours of significance can be obtained either directly from the simulated data or, if desired, by kernel smoothing.
For comparison purposes we provide the p-values for the univariate ART and CUSUM tests.
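Step 6 can be made concrete in several ways. The Python sketch below shows one possibility, assumed here for illustration and not prescribed by the procedure: smooth the simulated (number of breaks, CUSUM range) pairs with a Gaussian product kernel, and report the fraction of simulated points whose estimated density is no higher than the density at the observed point. The bandwidth rule and all function names are assumptions, and the null sample in the example is synthetic placeholder data, not simulation output from the paper.

```python
import math
import random

def scott_bandwidths(sample):
    """Per-axis bandwidths from Scott's rule for two dimensions: sd * n^(-1/6)."""
    n = len(sample)
    out = []
    for j in range(2):
        col = [p[j] for p in sample]
        m = sum(col) / n
        sd = math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
        out.append(sd * n ** (-1.0 / 6.0))
    return tuple(out)

def kde_density(point, sample, bandwidths):
    """Gaussian product-kernel density estimate at `point`."""
    hx, hy = bandwidths
    total = 0.0
    for sx, sy in sample:
        zx = (point[0] - sx) / hx
        zy = (point[1] - sy) / hy
        total += math.exp(-0.5 * (zx * zx + zy * zy))
    return total / (len(sample) * 2.0 * math.pi * hx * hy)

def bivariate_p_value(observed, null_sample):
    """Fraction of null points lying on or outside the density contour through `observed`."""
    h = scott_bandwidths(null_sample)
    f_obs = kde_density(observed, null_sample, h)
    f_null = [kde_density(p, null_sample, h) for p in null_sample]
    return sum(1 for f in f_null if f <= f_obs) / len(null_sample)

# Synthetic placeholder null sample (NOT results from the paper).
rng = random.Random(1)
null_pairs = [(rng.gauss(5.0, 2.0), rng.gauss(1.5, 0.3)) for _ in range(300)]
p_far = bivariate_p_value((20.0, 3.5), null_pairs)     # far outside the cloud
p_centre = bivariate_p_value((5.0, 1.5), null_pairs)   # near the centre
print(p_far, p_centre)
```

A small value indicates that the observed pair lies outside the bulk of the simulated null distribution. With 1000 replications the smallest non-zero value this estimate can take is 0.001, so it cannot resolve significance levels finer than that. If every simulated series reports the same number of breaks, the bandwidth on that axis is zero and this sketch does not apply.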
4. Univariate ART
 determined that the number of breaks reported by ART when applied to Fractional Gaussian Noises was well-described by a Poisson distribution when the series length and the H parameter were fixed. However, his work was quite limited and considered only two series lengths. His work led to two possible tests based on the number of reported breaks in a long memory series: one using the mean number of reported breaks and the Poisson distribution to determine significance levels, and the other using the simulated data as a bootstrap to estimate the distribution and determine significance levels.
We obtain the mean number of breaks under the null hypothesis of the series being an FI(d) process. If the number of reported breaks in a series under test exceeds the 95% (or other significance) level based on the Poisson distribution then we reject the null of an FI(d) process. It should be noted that  interpreted a rejection of the null as indicating the series had structural breaks. However, we do not propose a specific alternative to the null, and instead report whether the reported number of breaks is consistent or inconsistent with the series being fractionally integrated. Alternatively, we obtained sufficient empirical data to establish the 95% confidence interval through simulation.
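The Poisson-based version of the univariate test reduces to an upper-tail probability. The Python sketch below computes P(X >= k) for a Poisson variable with a given mean, and the smallest count that would be rejected at a chosen level. The function names are hypothetical, and the mean of 2 in the example is an arbitrary illustrative value, not an estimate from the paper.

```python
import math

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf, term = 0.0, math.exp(-lam)  # term starts at P(X = 0)
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)        # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)

def poisson_critical_count(lam, alpha=0.05):
    """Smallest k with P(X >= k) <= alpha, i.e. the first count that is rejected."""
    k = 0
    while poisson_upper_tail(k, lam) > alpha:
        k += 1
    return k

# Illustrative mean only: with an expected 2 spurious breaks under the null,
# 6 or more reported breaks would be rejected at the five percent level.
print(poisson_critical_count(2.0))  # 6
print(poisson_upper_tail(5, 2.0))
```

In practice the mean would be the expected number of reported breaks for the estimated d and the series length, obtained from the simulations.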
The method and results presented here are intended to extend his work by providing a reasonable estimate of the expected number of reported breaks for a wide range of d values and series lengths in fractionally integrated series.
For reasons of space we report a representative selection of results. The remainder are available on request from the authors.
The distribution of the number of breaks reported by ART for series with 4000 data points is presented in Figure 1. As  noted, as the d parameter increased, the simulated series underwent a transition from ART reporting no breaks to reporting multiple breaks for all replications.
Figure 1. Distribution of the number of breaks reported by ART for 1000 replications of different values of d in an FI(d) series of 4000 data points.
The mean number of reported breaks per series for various series lengths and d values is presented in Figure 2. As can be seen, as the series lengths increased the d values for which no breaks were reported increased. Further, for a fixed d value, the mean number of reported breaks decreased with increasing series length.
We fitted a function to the empirical data to obtain formulas for calculating the mean and various tail probabilities. The approximations are calculated by computing the following variables in the order given:
where are given in Table 1. The function denotes the maximum of q and zero, and similarly for other arguments. The function “floor(z)” returns the greatest integer less than or equal to z. When approximating the mean the step in Equation (4) is omitted; is used instead.
Each fit has been generated by minimizing a function measuring the error between the fitted function f and the known values at data points, where d ranges from 0.02 to 0.48 in steps of 0.02, and series length ranges from 1000 to 16,000 in steps of 1000. The minimization is with respect to the parameters.
The two columns on the left of Table 1 list the parameters for which the probability that the number of breaks is greater than or equal to f is at least 97.5% and 95% respectively. In these cases the relevant error function is given by
Figure 2. Mean number of breaks reported by ART for different values of d in an FI(d) series with lengths ranging from 1000 to 16,000 data points.
Table 1. Coefficients for the function approximating the mean, and various upper and lower quantiles.
where denotes the minimum of z and zero. This error function imposes no penalty when because the discrete nature of the floor function means. The use of F rather than f in Equation (5) gives a better measure of the fit in the region between data points. The errors with the optimal parameter values for these fits are 0.0332 and 0.0894 respectively.
The third column of Table 1 lists the parameter values which give the fitted approximation to the mean. In this case the least squares error is simply
and the approximation is given by F, not f. The residual sum of squares for the optimal fit was 0.4436.
The four columns on the right of Table 1 list the parameters for which the probability that the number of breaks is less than or equal to f is at least 90%, 95%, 97.5% and 99% respectively. For these, the relevant error function is Equation (5). The values of this error function with the optimal parameters are 0.0576, 0.0835, 0.2125, and 0.33 respectively. Clearly the 90% and 95% fits are rather better than the other two. These two fits (90% and 95%) have approximately the same final errors as the two upper quantile fits; however, the latter are zero over much larger areas than the 90% and 95% fits. Hence the 90% and 95% fits will have smaller relative errors.
The two upper quantiles given by the first two columns allow two sided tests to be performed. When the lower limit of the two sided test is zero, the lower limit conveys no information because the number of breaks cannot be negative. In such cases a one sided test should be used in place of the two sided test.
Figure 3 presents the differences between the fitted function's estimated upper 95 percent confidence interval and that estimated from the empirical data.
Figure 3. Errors in the fitted function to the empirically determined 95 percent confidence intervals using the formulas in the text. The horizontal axis is the series length, the vertical axis is the d value used in the simulations.
5. The Data Set
The data set analysed in Section 6 below comprises the realized volatilities of 16 Dow Jones Industrial Average (DJIA) index stocks and was provided by  . The 16 stocks are Alcoa (AA), American International Group (AIG), Boeing (BA), Caterpillar (CAT), General Electric (GE), General Motors (GM), Hewlett Packard (HP), IBM, Intel (INTC), Johnson and Johnson (JNJ), Coca-Cola (KO), Merck (MRK), Microsoft (MSFT), Pfizer (PFE), Walmart (WMT), and Exxon (XON). The period of analysis was from January 3, 1994 to December 31, 2003. Trading days with abnormally small trading volumes were excluded, leaving a total of 2539 daily observations. The daily realized volatility was estimated using the two time scale estimator of  with five-minute grids, which is a consistent estimator of the daily volatility. A fuller explanation of the dataset and how the realized volatilities were calculated can be found in  . It should be noted that all 16 stocks were major American corporations traded on the New York Stock Exchange. They are subject to correlated shocks and so cannot be considered to be independent series.
6. Application-Realized Volatilities
We applied the bivariate ART vs CUSUM range procedure as described in Section 3 to the 16 series in the data set. For reasons of space we present only a representative selection of results; the remainder are available on request from the authors. The four results from the new computational procedure are presented for series with d estimates of 0.36 (GM), 0.40 (JNJ), 0.42 (PFE), and 0.44 (IBM) in panel (a) of Figures 4-7. For comparison purposes we plot the corresponding univariate results for the reported number of breaks and the CUSUM range in panels (b) and (c) respectively. These four results include one example for which the null can be rejected by univariate ART (JNJ), one by univariate CUSUM (GM), and two (IBM, PFE) for which the null is not rejected by either univariate test but is rejected by the bivariate procedure.
In Figures 4-7 the vertical axis is the number of breaks reported by ART. When considered as a discrete univariate distribution the vertical axis is simply the test of  . We provide the results of the Zheng test for all series in the column labelled “ART Test p-value” in Table 2 below. As can be seen from the table the null hypothesis of true long memory was only rejected for one of the 16 series at the five percent level, a result which could easily have occurred by chance. With the exception of Johnson and Johnson, JNJ, (see Figure 5) the number of reported breaks in these series was not in the tails of the univariate distribution. Thus the null hypothesis of a fractionally integrated series would not be rejected on the basis of Zheng’s univariate test.
The horizontal axis in panel (a) of Figures 4-7 is the CUSUM range from the well-known CUSUM test. When taken alone some stocks such as GM (see Figure 4) did appear to have a CUSUM range in the tails of this continuous univariate distribution. On a univariate CUSUM test the null hypothesis of a fractionally integrated series with critical values obtained through simulation was rejected for 12 of the 16 series. However, once these two univariate distributions are combined into a bivariate distribution it is clear that, for the four results presented here, the data points from the realized volatility series lie in the tails of the null distribution obtained by simulation. For 15 of the 16 realised volatility series the null of a fractionally integrated series was rejected; the exception was Walmart (WMT).
Figure 4. (a) Bivariate distribution of the CUSUM range and number of breaks reported by ART for 1000 replications of an FI(0.36) series of 2539 data points. The letter “G” denotes the GM data point. (b) and (c) are the corresponding univariate distributions for the number of reported breaks and the CUSUM range respectively.
To summarise, the results of univariate ART and CUSUM tests are presented in columns “ART Test p-value” and “CUSUM Test p-value” respectively in Table 2. As can be seen the null of true long memory was rejected for 1 of the 16 series by the univariate ART test and for 12 of the 16 series by the univariate CUSUM test.
 have also proposed a new test based on comparing the long memory parameter of a series at varying levels of aggregation. In their test they used the GPH estimator  because of its robustness to short term correlations and its well understood asymptotic properties, which allowed them to derive critical values theoretically for varying levels of statistical significance. The results of their test are presented in the column labelled “ORT” of Table 2. The null of true long memory was rejected for one of the 16 series at the five percent level, a result which could easily have occurred by chance.
Figure 5. (a) Bivariate distribution of the CUSUM range and number of breaks reported by ART for 1000 replications of an FI(0.40) series of 2539 data points. The letter “J” denotes the JNJ data point. (b) and (c) are the corresponding univariate distributions for the number of reported breaks and the CUSUM range respectively.
As discussed in Section 1 the problem of distinguishing among models with true long memory and other models which display apparent long memory properties is difficult. This paper’s primary contribution is that we present a procedure based on the use of a bivariate distribution which, in the 16 series examined, appears to show readily that the realized volatilities of 15 of the series are not FI(d). Secondarily we have extended the work of  on univariate ART.
In Section 4 the change of behaviour seen in Figure 1 between values of d for which ART reported no breaks and values for which breaks were reported suggests that to distinguish between long memory and other processes exhibiting the long memory property at least two approaches are required. Tests or procedures involving ART, either alone or in conjunction with other established statistics, would only be useful when d was sufficiently high, and the series sufficiently short, that a reasonable number of breaks would be expected to be reported. When d was sufficiently low, or the series sufficiently long, that no breaks would be expected to be reported some alternative method would need to be used. For financial data with a typical d value of about 0.40 and several thousand observations ART should be useful.
Figure 6. (a) Bivariate distribution of the CUSUM range and number of breaks reported by ART for 1000 replications of an FI(0.42) series of 2539 data points. The letter “P” denotes the PFE data point. (b) and (c) are the corresponding univariate distributions for the number of reported breaks and the CUSUM range respectively.
Figure 7. (a) Bivariate distribution of the CUSUM range and number of breaks reported by ART for 1000 replications of an FI(0.44) series of 2539 data points. The letter “M” denotes the IBM data point. (b) and (c) are the corresponding univariate distributions for the number of reported breaks and the CUSUM range respectively.
Table 2. For the 16 stocks, the d estimate is that reported by the estimator of  . The actual and expected numbers of breaks reported by ART are given together with p-values calculated from the Poisson distribution as in the  test. The column “CUSUM Test p-value” is the p-value for the univariate CUSUM test with critical values obtained by simulation. The final column lists the test statistic for the  test. Results for which the null is rejected at the five percent level or better are marked by an asterisk (*).
The results reported by  were encouraging but our results with realised volatilities, reported in Table 2, indicated that the problem of the finite sample properties of FI(d) series and series with structural breaks being similar rendered the test of little help in practice.
With the exception of the  test, tests based on univariate distributions have, in general, not been successful in distinguishing among the proposed models. The results of the  test had the same rate of rejection of the null as the  test. The simple CUSUM test with null distributions obtained by simulation rejected the null of true long memory at a much higher rate than either the  or the  test.
The results of looking at the data with a bivariate breaks vs CUSUM range distribution, as in Figures 4-7, are promising and, we believe, point the way for future progress in this area. On these bivariate distributions the data point for each of the four real time series was clearly in the extremes of the tails of the distribution. Indeed, all four of the results presented here appear to be significant at close to the 0.001 level. For 15 of the 16 series, the null hypothesis of an FI(d) series with d as estimated for the full series was rejected.
A summary of the results for all four tests and procedures is presented in Table 3.
8. Conclusions and Future Research
Many other authors have expressed reservations about the reality of the long memory property apparently exhibited by many financial and economic time series. We have proposed a new method based on a bivariate check of the data which compares the real data with properties of the distribution obtained for simulated series. The particular properties we concentrate on are the number of breaks observed in the real data and their EFP range compared to what we would expect if the true DGP were a fractionally integrated process. The use of bivariate distributions to distinguish between true fractionally integrated series and other series displaying the long memory property appears to be a very promising avenue of future research. In the first application to realized volatilities this methodology rejected the null hypothesis of a true fractionally integrated series for 15 of the 16 series. This is a higher rate of rejection than that obtained by using either of the two statistics which form the bivariate distribution separately in its univariate form.
Table 3. A summary of the number of rejections of the null by test or procedure.
There are unresolved statistical issues which merit further research. In the bivariate approach we have estimated d but then proceeded as if the d value was known a priori. Clearly the bivariate distribution is dependent on d and further work needs to be done to establish the usefulness of the approach. Also tests of other models which display the long memory property need to be carried out.
Finally, it must be stressed that this new procedure uses the number of breaks to ascertain whether the series is likely to have been generated by a fractionally integrated process. In rejecting the null we do not conclude that the alternative is actually a breaks model. There may be other possible alternatives to the null; our procedure simply tests for fractional integration.
We would like to thank the participants in the MODSIM09 conference and two anonymous referees for helpful and constructive feedback.
Cite this paper
William Rea, Chris Price, Les Oxley, Marco Reale, Jennifer Brown (2016) A New Procedure to Test for Fractional Integration. Open Journal of Statistics, 06, 651-666. doi: 10.4236/ojs.2016.64055