Because of the uncertainty surrounding the variables that affect financial market behavior, forecasting future variations in a time series of the Brazilian stock market index (Ibovespa) is a difficult task. This article evaluates the performance of the ARIMA model for forecasting the Ibovespa time series. The research method was mathematical modeling, following the Box-Jenkins method. MAPE (Mean Absolute Percentage Error) was used as the evaluation measure to compare the results with those of smoothing models. The model obtained lower MAPE values, indicating greater suitability. This suggests that the ARIMA model can be used to forecast time series of stock market indices.

Economic crises in recent decades and the resulting financial losses show that markets, financial institutions and investors urgently need to improve their models for measuring and predicting the risks to which they are exposed. Equity investments are an attractive alternative to other investments, especially over long periods [

Predicting the future behavior of a time series of the Bovespa Index (the share index of the São Paulo Stock Exchange) is not an easy task, given the uncertainties about the variables that affect financial markets and about how they will affect future prices. Studies on forecasting financial time series of assets, indices and investment portfolios are an important decision-making tool in many areas, including investment management, asset pricing and risk management.

The genius of the work of Markowitz [

This paper aims to evaluate the performance of the ARIMA model in predicting the time series of the Bovespa Index, measured by MAPE (mean absolute percentage error), and to compare it with other models. Historical data of monthly Bovespa quotations from January 1995 to January 2013 were used. Single Exponential Smoothing and Double Exponential Smoothing were used as the comparison models.

This paper is organized as follows: In Section 2, a review of forecasting is presented. In Section 3, the research method is shown. In Section 4, analysis of historical data of the Bovespa Index, data transformation, the necessary adjustments and calculation of MAPE values are all examined based on the results found. Section 5 contains the conclusions of the study.

There are two main types of approaches to demand forecasting: qualitative methods and quantitative methods. Combining qualitative and quantitative methods comes closest to the ideal for producing a good demand forecast [

The main qualitative methods are: Panel data approach, Delphi method, scenario planning, educated guess, executive committee consensus, sales force survey, Historical Analysis and Market Research [

Quantitative methods are based on historical data (time series) and assume that past results are relevant for predicting the future [

Another critical effect on time series is the presence of seasonality, i.e. oscillations or disturbances in series occurring at regular intervals of less than one year. And, according to Bacci [

Some stationary random processes (those with a constant mean over time) can be modeled by a mixed autoregressive moving-average process, ARMA (p, q), in which Y_{t} depends on the past p values of Y and the past q values of the random error term.
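For reference, a common textbook statement of an ARMA (p, q) process is sketched here. The symbols φ_i (autoregressive coefficients), θ_j (moving-average coefficients), δ (constant) and ε_t (random error), as well as the sign convention on the moving-average terms, are assumptions, since they are not defined in the surrounding text.

$$Y_t = \delta + \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \dots - \theta_q \varepsilon_{t-q}$$

Under this convention, setting q = 0 gives a pure autoregressive process of order p, and setting p = 0 gives a pure moving-average process of order q.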

For the process shown in (1), the stationary sum

The first difference of the data is given by Equation (2):

$$\Delta Y_t = Z_t = Y_t - Y_{t-1} \qquad (2)$$

where:

Y_{t} = observation of the series Y in period t, before differencing;

Y_{t−1} = observation of the series Y in period t − 1, before differencing;

ΔY_{t} = Z_{t} = observation in period t of the series Z_{t}, obtained by differencing the series Y_{t} once.

The series is differenced for the first time as follows: the first observation is subtracted from the second, the second from the third, the third from the fourth, and so on. As a result, the once-differenced series Z_{t} has one observation fewer (n − 1 observations) than the original series Y_{t}.

The second difference of the data can be represented by Equation (3):

$$\Delta^{2} Y_t = \Delta Z_t = W_t = Z_t - Z_{t-1} \qquad (3)$$

According to Bacci [ ], the series Y_{t} differenced a second time, or equivalently the series Z_{t} differenced once, yields the series W_{t}.

The series Z_{t} is differenced as follows: the second observation minus the first gives the first observation of W_{t}, the third minus the second gives the second, and so on.
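The differencing procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study; the function name `difference` and the sample values are hypothetical and are not Ibovespa data.

```python
def difference(series):
    """Return the first difference z[t] = y[t] - y[t-1] of a sequence."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


# Hypothetical observations (not Ibovespa data).
y = [10.0, 12.0, 15.0, 14.0, 18.0]

z = difference(y)  # series differenced once: n - 1 observations
w = difference(z)  # series differenced twice: n - 2 observations

print(z)
print(w)
```

Each call shortens the series by one observation, so the series differenced twice has two observations fewer than the original.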

The twice differenced series W_{t} has two observations fewer (n − 2 observations) than the original series Y_{t}. Thus, after Y_{t} has been differenced one or more times to make it stationary, the resulting stationary series W_{t} can be modeled as an ARMA process.

According to Pindyck and Rubinfeld [_{t} is an autoregressive process of order

where:

d = the order of differencing, that is, the number of times that the non-stationary series Y_{t} was differenced until it became the stationary series W_{t};

_{t}.

The construction of an ARIMA model is based on a cycle with the following stages [

According to Bertrand and Fransoo [

Data were first collected with Economatica® software, using monthly closing prices of the Bovespa Index for the period January 2000 to December 2012.

The values obtained were plotted using Minitab® 16 Statistical Software for an initial evaluation of the data, as shown in

It can be seen that the data is not stationary and the series presents variance from one period to another. The analysis used in this series demonstrated the need for a logarithmic transformation on the data which generate

Tests for ACF (autocorrelation) and PACF (partial autocorrelation) indicated that the AR1 model and the ARIMA (0, 2, 1) model could be used to predict the behavior of the series, shown in

The residuals of both models, fitted to the log-transformed Bovespa Index series, were then checked. The checks showed no autocorrelation among the residuals, which allowed both models to be used to forecast the behavior of the series, as shown in Figures 6-9.

Term | Coefficient | Coefficient standard error | T statistic | P-value |
---|---|---|---|---|
AR 1 | 0.9996 | 0.0006 | 1697.01 | 0.000 |

Lag | 12 | 24 | 36 | 48 |
---|---|---|---|---|
Chi-square | 11.5 | 18.5 | 26.2 | 33.5 |
Degrees of freedom | 11 | 23 | 35 | 47 |
P-value | 0.399 | 0.728 | 0.858 | 0.931 |
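The chi-square values in the table above are modified Box-Pierce (Ljung-Box) statistics. As a rough illustration of how such a statistic is computed from residuals, the sketch below implements the standard formula Q = n(n + 2) Σ r_k² / (n − k). The function names and the residual values are hypothetical. The degrees of freedom in the table equal the lag minus one, which is consistent with subtracting one estimated coefficient; that reading is an assumption here.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )


def degrees_of_freedom(max_lag, n_params):
    """Lags tested minus the number of estimated model parameters (assumed rule)."""
    return max_lag - n_params


# Hypothetical residuals, not taken from the fitted models.
residuals = [0.3, -0.1, 0.2, -0.4, 0.1, 0.0, -0.2, 0.3, -0.1, -0.1]
q = ljung_box_q(residuals, 3)
df = degrees_of_freedom(3, 1)
print(q, df)
```

The statistic is compared with a chi-square distribution with `df` degrees of freedom; a large p-value, as in the table, means the hypothesis of no residual autocorrelation is not rejected. The p-value itself is not computed here because the Python standard library has no chi-square distribution function.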

The next step was to carry out tests to verify the accuracy of the models. Initially, each model was used to forecast 10 months ahead in several periods of the series, in order to compare the MAPE between these periods, as shown in

MAPE was chosen to evaluate the models because it measures the mean of the absolute percentage errors. The values of the series change over time, which influences the size of the raw error. Dividing each error by the value of the corresponding observation expresses the error as a percentage of that observation. This reduces the effect of changes in the level of the series and allows errors to be compared between observations of different magnitudes.
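The calculation can be sketched as follows. The function name `mape` and the sample numbers are hypothetical; the formula is the usual definition, the mean of |actual − forecast| / |actual| expressed as a percentage.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    # Each error is scaled by the observation it refers to.
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical values: the first two forecasts miss by 10 units each,
# but the observations are at different levels.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(mape(actual, forecast))
```

In this example the same absolute error of 10 units counts as 10% at a level of 100 and as 5% at a level of 200, which is the scaling effect described above.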

A test of forecasting 10 months ahead in five distinct periods showed that, in all models, the error tends to increase after the second period, significantly affecting the average error. As a result, one model could be chosen over another simply because its errors from the second period onward are lower. Such a choice could be valid if the forecast is used for medium-term decisions, in which the forecast for several periods ahead determines actions that can hardly be changed in the short term.

In the case of financial time series, significant changes in the forecasts can trigger immediate decisions and corrections within minutes, or at worst within a few days. These include repositioning strategies through hedging, rebalancing the portfolio, and even completely eliminating affected positions or entire investment strategies in a short space of time.

The forecast was therefore made one step ahead, which in this study means one month ahead. In the various models analyzed, this forecast horizon produced the lowest MAPE. To avoid relying on a single period of the series, the one-step-ahead forecast was evaluated over five periods, and the MAPE was computed as the mean of the absolute percentage errors of these forecasts.
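A one-step-ahead evaluation of this kind can be sketched as follows. This is an illustration under stated assumptions, not the procedure run in Minitab: the AR(1) forecast is taken without a constant, the smoothing constant `alpha` and the sample series are hypothetical, and all function names are invented for the example. The value 0.9996 is the AR 1 coefficient reported in the coefficient table.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


def ar1_one_step(history, phi):
    """One-step-ahead AR(1) forecast without a constant: phi * last value."""
    return phi * history[-1]


def ses_one_step(history, alpha):
    """One-step-ahead single exponential smoothing forecast."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def rolling_one_step_mape(series, n_test, forecaster):
    """Forecast each of the last n_test points using only the data before it."""
    start = len(series) - n_test
    actual = series[start:]
    forecasts = [forecaster(series[:t]) for t in range(start, len(series))]
    return mape(actual, forecasts)


# Hypothetical log-scale series, not Ibovespa data.
series = [10.0, 10.2, 10.1, 10.4, 10.6, 10.5, 10.9, 11.0]

ar1_mape = rolling_one_step_mape(series, 5, lambda h: ar1_one_step(h, 0.9996))
ses_mape = rolling_one_step_mape(series, 5, lambda h: ses_one_step(h, 0.5))
print(ar1_mape, ses_mape)
```

Each forecast uses only the observations available before the month being predicted, so the error of one month does not accumulate into the next, unlike a 10-months-ahead forecast.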

The results indicate that the model is effective in its forecasts. The coefficient statistics of the AR1 model and the modified Box-Pierce (Ljung-Box) chi-square statistics support this conclusion.

A MAPE (mean absolute percentage error) of 0.052% was obtained, lower than the values obtained with the other models used for comparison.

This study sought to obtain short-term forecasts for the next month (one step ahead) in order to minimize prediction errors. The model can be considered adequate for predicting the Bovespa Index series, and can be used as an aid to decision making.

Metric | AR1 | Single Exp smoothing | Double Exp smoothing | ARIMA (0, 2, 1) |
---|---|---|---|---|
MAPE | 0.052% | 0.086% | 0.118% | 0.064% |

The authors would like to express their gratitude to the Brazilian agencies CNPq (National Council for Scientific and Technological Development), CAPES (Post-Graduate Federal Agency), and FAPEMIG (Foundation for the Promotion of Science of the State of Minas Gerais), which have supported the development of this work in different ways and at different times.