**Open Journal of Statistics**

Vol.09 No.02(2019), Article ID:91965,13 pages

10.4236/ojs.2019.92018

Modeling Consumer Price Index in Zambia: A Comparative Study between Multicointegration and Arima Approach

Stanley Jere^{1*}, Alick Banda^{1}, Rodgers Chilyabanyama^{1}, Edwin Moyo^{2}

^{1}Department of Mathematics and Statistics, Mulungushi University, Kabwe, Zambia

^{2}Department of Mathematics and Statistics, Northrise University, Ndola, Zambia

Copyright © 2019 by author(s) and Scientific Research Publishing Inc.

This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).

http://creativecommons.org/licenses/by/4.0/

Received: July 24, 2018; Accepted: April 20, 2019; Published: April 23, 2019

ABSTRACT

Consumer Price Index (CPI) is an important indicator used to determine inflation. The main objective of this research was to compare the forecasting ability of two time-series models using the Zambia monthly Consumer Price Index. We used monthly CPI data collected from January 2003 to December 2017. The models compared are the Autoregressive Integrated Moving Average (ARIMA) model and the Multicointegration error correction model (ECM). Results show that the ECM was the better-fitting model of CPI in Zambia since it had the smallest error measures. Lastly, a forecast was produced using the ECM; it shows an average growth rate of 6.63% for food CPI and 7.41% for nonfood CPI. Forecasting CPI matters for any economy because it is essential for economic planning. Hence, identifying a more accurate forecasting model is a major contribution to the development of Zambia.

**Keywords:**

Consumer Price Index, Multicointegration, ARIMA, ECM, Forecast

1. Introduction

Rising prices reduce everyone's purchasing power, especially if wages remain constant, and this lowers living standards. It is generally difficult to detect changes in price levels across products without a systematic approach. The consumer price index (CPI) measures the weighted average of prices of a basket of goods and services purchased by households, including fuel, transport, food and medical care. CPI identifies price changes across product categories relevant to the consumer. According to [1], CPI is a weighted aggregate index that is computed and published monthly. According to [2], the CPI may not adequately explain actual movements in the cost of living. This may result from biases such as inaccurate data. The Engel curve method introduced by [3] addresses this bias.

In Zambia, the consumer price index is recorded monthly by the Central Statistics Office (CSO). To compute the monthly CPI, products essential to human needs, such as fuel, food and medical services, are placed in two major categories: food, meaning edible products needed to sustain humans, and nonfood, covering products such as fuel and education. The two groups are then used to calculate the monthly CPI as an average.

Forecasts of CPI are important because they affect many economic decisions. Without knowing future CPI rates, future inflation rates cannot be estimated, which would make it difficult for lenders to price loans, which in turn has a negative impact on the economy. Investors require good inflation forecasts, since the returns to stocks and bonds depend on what happens to inflation. Businesses need inflation forecasts to price their goods and services as well as to plan production. Modelling inflation is also important from the point of view of poverty alleviation and social justice [4].

2. Literature Review

The study by [5] stated that the CPI is one of the main indicators of economic performance and, because of its wide use as a measure of inflation, a key indicator of the results of a country's monetary policy. ARIMA (4, 1, 6) was selected as the model that best fit the data and produced accurate forecasts. A 12-month-ahead forecast was made for the year 2016, and the findings showed that the CPI was likely to continue rising over time.

The study by [6] described CPI as a measure of changes in the general level of prices of a group of commodities. The best model was found to be ARIMA (1, 1, 0), compared with ARIMA (0, 1, 1) and ARIMA (1, 1, 1).

The study by [7] examined the relationship between CPI and oil prices in Turkey using the Error Correction Model (ECM). It revealed that a 1% increase in fuel prices caused the CPI to rise by 1.26% with an approximate one-year lag.

According to [8], cointegration describes a long-run equilibrium relationship among different time series. It is a central idea in current econometrics and an important theoretical cornerstone of current research on combination forecasting with time series.

The paper by [9] modelled inflation using a structural cointegration approach. This paper used cointegration and error-correction models to analyze the relative impact of the monetary, labor and external sectors on Polish inflation from 1990 to 1999. Results showed that the labor and external sectors dominated the determination of Polish inflation during the above period, but their effects have been opposite since 1994. The monetary sector appears not to have exerted influence on inflation, suggesting monetary policy has been passive.

3. Methodology

To carry out this study, monthly food and nonfood CPI data collected from January 2003 to December 2017 were used. We used the monthly CPI (the average of the food and nonfood CPI) for the ARIMA model, and the food and nonfood CPI for Multicointegration to develop the error correction model. The statistical software package R (version 0.99.903) was used to obtain the results.

1) Variable Definition

Let Monthly CPI be denoted by ${U}_{t}$, Food by ${X}_{t}$ and Nonfood by ${Y}_{t}$.

2) Relationship among the Variables

${U}_{t}=\frac{{X}_{t}+{Y}_{t}}{2}$ (1)

3) ARIMA (Box and Jenkins) Model

George Box and Gwilym Jenkins developed a practical approach to building ARIMA models. The Box-Jenkins methodology uses a three-step approach of model identification, parameter estimation and diagnostic checking to determine the best model from the general class of ARIMA models. An ARIMA model fits a historical time series in terms of its own past values plus current and lagged values of an error term. Once the series is confirmed to be stationary, one may tentatively choose the order of the model by visually inspecting plots of the Autocorrelation Function (ACF) and the Partial Autocorrelation Function (PACF). The relevant properties are as follows. The series shows an AR (p) process if the ACF decays exponentially (either directly or in an oscillatory manner) and the PACF cuts off after lag p. The series shows an MA (q) process if the PACF decays exponentially (either directly or in an oscillatory manner) and the ACF cuts off after lag q. The series shows an ARMA (p, q) process if both the ACF and the PACF decay exponentially (either directly or in an oscillatory manner).
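The identification rules above rely on the sample ACF and PACF. A minimal Python sketch of how these two quantities can be computed is given here. The function names `sample_acf` and `sample_pacf` are illustrative and are not taken from the study, which used R; the PACF is obtained with the Durbin-Levinson recursion, one of several standard ways to compute it.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def sample_pacf(series, max_lag):
    """Partial autocorrelations at lags 1..max_lag (Durbin-Levinson recursion)."""
    r = sample_acf(series, max_lag)
    pacf = []
    prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1} from the previous step
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append phi_{k,k}.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Toy series, used only to show the calls.
toy = [1, 2, 3, 4, 5]
print(sample_acf(toy, 2))
print(sample_pacf(toy, 2))
```

In practice, spikes lying outside roughly $\pm 1.96/\sqrt{n}$ are usually read as significant at the 5% level, which is how a "cut off after lag p" or "cut off after lag q" is judged by eye.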

The MA, AR and ARMA are defined as follows:

AR model: ${Y}_{t}={\displaystyle \sum _{i=1}^{p}{\varphi}_{i}{Y}_{t-i}}+{\epsilon}_{t},$ (2)

MA model: ${Y}_{t}={\epsilon}_{t}+{\displaystyle \sum _{i=1}^{q}{\theta}_{i}{\epsilon}_{t-i}},$ (3)

The combination of AR and MA gives

ARMA model: ${Y}_{t}={\displaystyle \sum _{i=1}^{p}{\varphi}_{i}{Y}_{t-i}}+{\epsilon}_{t}+{\displaystyle \sum _{i=1}^{q}{\theta}_{i}{\epsilon}_{t-i}}$ (4)

where ${\varphi}_{i}$ is the i-th autoregressive parameter, ${\epsilon}_{t}$ is the error term at time t and ${\theta}_{i}$ is the i-th moving-average parameter.
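Equation (4) can be made concrete with a short simulation. The Python sketch below generates an ARMA (p, q) series from a given noise sequence. The function name `simulate_arma`, the coefficient values and the treatment of pre-sample values as zero are assumptions made for illustration only.

```python
import random


def simulate_arma(phi, theta, eps):
    """Generate Y_t from Equation (4) for a given noise sequence eps.

    Values of Y and eps before the start of the sample are taken to be zero.
    """
    p, q = len(phi), len(theta)
    y = []
    for t in range(len(eps)):
        ar_part = sum(phi[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma_part = sum(theta[i] * eps[t - 1 - i] for i in range(q) if t - 1 - i >= 0)
        y.append(ar_part + eps[t] + ma_part)
    return y


# Usage: an ARMA(1, 1) path driven by Gaussian noise (illustrative coefficients).
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
path = simulate_arma([0.5], [0.4], noise)
print(path[:5])
```

Setting `theta` to an empty list gives the AR model of Equation (2), and setting `phi` to an empty list gives a pure moving-average series.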

In order to build our tentative model, we will follow the three highlighted steps which are: Model Identification, Parameter Estimation and Diagnostic Checking.

4) Multicointegration Model

According to [10], cointegration occurs if two non-stationary variables ${X}_{t}$ and ${Y}_{t}$ are combined into a unique linear relationship. Under Multicointegration we consider two variables, food and nonfood, to model the consumer price index level. Therefore, let ${X}_{t}$ denote the food variable at time t and let ${Y}_{t}$ denote the nonfood variable at time t; we fit a short-run and long-run dynamic relationship and estimate an error correction model (ECM).

In order to build the tentative model, we will follow the two highlighted steps which are:

Step 1, Unit root test

To test for unit root for each variable ( ${X}_{t}$ ) and ( ${Y}_{t}$ ), we used the Augmented Dickey-Fuller test (ADF) based on the hypothesis that

H_{0}: the series has a unit root

H_{1}: the series has no unit root.
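To illustrate the logic of the unit root test, the following Python sketch computes the test statistic for the simplest Dickey-Fuller regression, $\Delta y_t = a + \gamma y_{t-1} + e_t$. This is a simplification of the ADF test used in the study: it includes no lagged differences, and the function name, the simulated series and the critical value of about −2.86 (the approximate large-sample 5% value for a regression with a constant) are assumptions for illustration, not values from the paper.

```python
import math
import random


def dickey_fuller_stat(series):
    """t-statistic of gamma in diff(y)_t = a + gamma * y_{t-1} + e_t."""
    lagged = series[:-1]
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxd = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    gamma = sxd / sxx
    intercept = mean_d - gamma * mean_x
    rss = sum((d - intercept - gamma * x) ** 2 for x, d in zip(lagged, diffs))
    se_gamma = math.sqrt(rss / (n - 2) / sxx)
    return gamma / se_gamma


# Simulated random walk (has a unit root) and its first difference (white noise).
random.seed(1)
walk = [0.0]
for _ in range(300):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))
first_diff = [walk[t] - walk[t - 1] for t in range(1, len(walk))]

CRITICAL_5PCT = -2.86  # assumed approximate 5% critical value, constant-only case
for name, data in (("level", walk), ("first difference", first_diff)):
    stat = dickey_fuller_stat(data)
    decision = "reject H0" if stat < CRITICAL_5PCT else "fail to reject H0"
    print(name, round(stat, 3), decision)
```

H_{0} is rejected when the statistic is more negative than the critical value. A series that fails the test in levels but passes after one difference is treated as integrated of order 1.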

Step 2, Two-step method

This is based on the idea that cointegration between ${X}_{t}$ and ${Y}_{t}$ is tested using standard cointegration techniques before testing for multicointegration. We test for a cointegrating relationship between ( ${X}_{t}$ ) and ( ${Y}_{t}$ ) using a proposed cointegrating regression of

${X}_{t}={\alpha}_{0}+{\alpha}_{1}{Y}_{t}+{z}_{t}$ (5)

where ${X}_{t}$ is food at time t, ${Y}_{t}$ is nonfood at time t, ${\alpha}_{0}$ and ${\alpha}_{1}$ are parameters and ${z}_{t}$ is the residual. If ${z}_{t}$ is stationary, then a cointegration relationship exists between ${X}_{t}$ and ${Y}_{t}$.
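The cointegrating regression in Equation (5) is an ordinary least-squares fit whose residuals are then checked for stationarity. A minimal Python sketch of the fitting step follows; the function name `cointegrating_regression` and the toy numbers are hypothetical and are not data from the study.

```python
def cointegrating_regression(x, y):
    """OLS fit of x_t = alpha0 + alpha1 * y_t + z_t.

    Returns (alpha0, alpha1, z), where z is the list of residuals.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    syy = sum((v - mean_y) ** 2 for v in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    alpha1 = sxy / syy
    alpha0 = mean_x - alpha1 * mean_y
    z = [a - alpha0 - alpha1 * b for a, b in zip(x, y)]
    return alpha0, alpha1, z


# Toy food (x) and nonfood (y) values, for illustration only.
nonfood = [40.0, 45.0, 52.0, 60.0, 71.0]
food = [50.0, 53.0, 60.0, 66.0, 75.0]
alpha0, alpha1, z = cointegrating_regression(food, nonfood)
print(alpha0, alpha1, z)
```

The residual series `z` would next be passed to a unit root test. Because `z` comes from an estimated regression, the appropriate critical values differ from those of the standard Dickey-Fuller tables.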

5) Error Correction Models (ECM)

Following the two-step method above, we estimate the error correction model for ${X}_{t}$ and ${Y}_{t}$. The ECM is given by

$\Delta {U}_{t}={\alpha}_{3}+{\beta}_{1}{z}_{t-1}+{\beta}_{2}{\epsilon}_{t-1}+{\mu}_{1}\Delta {Y}_{t}+\text{lagged}\left(\Delta {X}_{t},\Delta {Y}_{t}\right)+\text{residual}$ (6)

where ${z}_{t-1}$ is the residual from the first cointegrating relationship between ${X}_{t-1}$ and ${Y}_{t-1}$; ${\alpha}_{3}$, ${\beta}_{1}$, ${\beta}_{2}$ and ${\mu}_{1}$ are parameters; ${\epsilon}_{t-1}$ is the residual from the cointegrating relationship between CPI (${U}_{t}$) and ${Y}_{t}$; $\Delta {X}_{t}={X}_{t}-{X}_{t-1}$ and $\Delta {Y}_{t}={Y}_{t}-{Y}_{t-1}$ are first differences, and $\text{lagged}\left(\Delta {X}_{t},\Delta {Y}_{t}\right)$ denotes their lagged values.
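The mechanics of an error correction regression can be sketched in Python with a deliberately simplified model, $\Delta X_t = a + \beta z_{t-1} + \mu \Delta Y_t + e_t$, which has a single error-correction term and no lagged differences. This is not Equation (6) itself. The function names `ols` and `fit_simple_ecm` and the constructed toy data are assumptions made for illustration.

```python
def ols(rows, targets):
    """Least-squares coefficients via the normal equations (Gauss-Jordan elimination)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def fit_simple_ecm(x, y, z):
    """Fit diff(x)_t = a + beta * z_{t-1} + mu * diff(y)_t + e_t; returns [a, beta, mu]."""
    rows, targets = [], []
    for t in range(1, len(x)):
        rows.append([1.0, z[t - 1], y[t] - y[t - 1]])
        targets.append(x[t] - x[t - 1])
    return ols(rows, targets)


# Toy data built so that x = 2 + 1.5 * y + z, with z shrinking by 20% each period.
y = [0.0]
for t in range(1, 60):
    y.append(y[-1] + ((7 * t) % 5 - 2))
z = [0.8 ** t for t in range(60)]
x = [2.0 + 1.5 * yt + zt for yt, zt in zip(y, z)]
const, beta, mu = fit_simple_ecm(x, y, z)
print(const, beta, mu)
```

In this construction the coefficient on $z_{t-1}$ is negative, meaning that a positive deviation from the long-run relationship is followed by a downward correction in $X_t$. A negative error-correction coefficient is the usual sign of adjustment back toward equilibrium.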

4. Results

Table 1 shows the summary statistics of the variables Food, Nonfood and monthly CPI. For food CPI, the minimum was 48.4 and the maximum was 197.8.

Then 25% of the data was less than or equal to 74.53, 50% was less than or equal to 106.2 and 75% was less than or equal to 134.32. On average, the food CPI was 109.64 with a standard deviation of 41.93263.

Table 1. Summary statistics.

For non-food CPI, the minimum was 38.6 and the maximum was 205.1. Then 25% of the data was less than or equal to 72.58, 50% was less than or equal to 110.25 and 75% was less than or equal to 144.22. On average, the non-food CPI was 111.72 with a standard deviation of 46.34205.

For monthly CPI, the minimum was 44.2 and the maximum was 201.2. Then 25% of the data was less than or equal to 72.47, 50% was less than or equal to 108.2 and 75% was less than or equal to 110.53. On average, the monthly CPI was 110.53 with a standard deviation of 43.98698.

Figure 1 shows time plots for the variables considered in this study from January 2003 to December 2017. The figure clearly shows an upward trend in the monthly CPI, Food and Non Food.

Table 2 shows the ADF test results for the monthly CPI and the differenced monthly CPI. The results show that the monthly CPI becomes stationary after first differencing (d = 1).

Figure 2 shows the time plot of the differenced data of order 1.

Figure 3 shows the ACF (left) and PACF (right) respectively for d = 1.

Although there are several ways to determine the best forecasting model, this study used error measures to select the best-fitting model. The best-fitting model is the one with the smallest errors. The error indicators for our study are MPE, MAE, MASE, RMSE and MAPE, defined in Table 3.
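For readers who want to reproduce such error measures, the Python sketch below uses the commonly used definitions, with the error taken as actual minus forecast and MASE scaled by the in-sample error of a naive last-value forecast. These definitions are assumed to match Table 3; the function name `forecast_errors` and the sample numbers are hypothetical.

```python
import math


def forecast_errors(actual, forecast, train):
    """Return ME, RMSE, MAE, MPE, MAPE and MASE for a set of forecasts.

    `train` is the in-sample series used to scale MASE.
    """
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    # In-sample mean absolute error of the naive "previous value" forecast.
    naive_mae = sum(abs(train[t] - train[t - 1]) for t in range(1, len(train))) / (len(train) - 1)
    return {
        "ME": sum(errors) / n,
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": mae,
        "MPE": 100.0 * sum(e / a for e, a in zip(errors, actual)) / n,
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "MASE": mae / naive_mae,
    }


# Illustrative values only, not taken from the study.
measures = forecast_errors([100, 110, 120], [98, 112, 117], [80, 90, 100])
for name, value in measures.items():
    print(name, round(value, 4))
```

A MASE below 1 indicates that the forecasts beat the naive benchmark on average, which makes MASE convenient for comparing models fitted to series on different scales.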

Table 4 shows the measure of accuracy for selected ARIMA models. An ARIMA model with the smallest errors is the best model. The ARIMA (3, 1, 3) has been identified as the model with the smallest AIC, RMSE, MAE and MASE as can be seen in Table 4. Next, we proceed to estimate the parameters.

Table 5 shows the estimated parameters for ARIMA (3, 1, 3) model.

Table 6 shows the Box-Ljung test results for the residuals. Since the test fails to reject the null hypothesis at the 5% level of significance, we conclude that the model is a good fit, as the residuals show no significant autocorrelation.
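The Box-Ljung statistic behind this kind of test is $Q = n(n+2)\sum_{k=1}^{h} \hat{\rho}_k^2/(n-k)$, where $\hat{\rho}_k$ is the lag-k sample autocorrelation of the residuals. A minimal Python sketch of the computation is shown here; the function name `ljung_box_q` and the toy residuals are assumptions, and the study's own values come from R.

```python
def ljung_box_q(residuals, max_lag):
    """Box-Ljung Q statistic using autocorrelations at lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# Strongly alternating toy residuals give a large Q even at lag 1.
print(ljung_box_q([1, -1, 1, -1], 1))
```

Under the null hypothesis of no autocorrelation, Q is compared with a chi-square quantile. For residuals from a fitted ARMA model, the degrees of freedom are usually reduced by the number of estimated ARMA parameters.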

Figure 4 shows the ACF of residuals plot. It is clear that there is no significant spike. So there is no residual correlation left in our data.

Figure 5 shows that the residuals are approximately normally distributed, and there is no correlation in the residuals implying ARIMA (3, 1, 3) was successfully selected as the tentative model to be used for Forecasting.

1) Multicointegration

Table 7 shows the Augmented Dickey-Fuller test results for the food and nonfood variables before and after differencing. Results show that food and nonfood CPI are stationary after differencing.

Figure 1. Time plots for monthly CPI, food and nonfood.

Figure 2. Time plot of the differenced data (d = 1).

Figure 3. ACF (left) and PACF (right) for d = 1

Figure 4. ACF of residuals plot.

Figure 5. Histogram and q-q plot of residuals.

Table 2. Augmented dickey-fuller test.

Table 3. The error measures for ARIMA model selection.

Table 4. Error measures of tentative ARIMA models.

Table 5. Estimated parameters of ARIMA (3, 1, 3).

Table 6. Box-Ljung test of residuals.

Table 7. Augmented dickey-fuller test.

Figure 6 shows time plots for Food CPI and Non Food CPI after differencing; both time plots exhibit an upward trend.

2) Johansen Cointegration Test

The results from the ADF test showed that both variables (food and non-food) become stationary at first difference. We then used the Johansen cointegration test, which yielded a test statistic of 62.539 against a critical value of 8.18 at the 5% significance level. Since the test statistic exceeds the critical value, there is sufficient evidence to conclude that the two variables are cointegrated.

3) Estimation of the Error Correction Model

Having identified that both the food and nonfood variables were stationary at first difference, the Error Correction Model was developed as shown below.

$\text{food}=\text{food}\text{.l}1+\text{nonfood}\text{.l}1+\text{food}\text{.l}2+\text{nonfood}\text{.l}2+\text{const}$

$\text{nonfood}=\text{food}\text{.l}1+\text{nonfood}\text{.l}1+\text{food}\text{.l}2+\text{nonfood}\text{.l}2+\text{const}$

Next, parameters of the Error Correction Model were estimated.

Table 8 shows the estimated parameters for food and non-food.

4) Diagnostic Checking

We carried out an empirical fluctuation process test and found that our observations were dynamic, which implied that lagged observations should be included in our model to increase its accuracy. Further, Engle's ARCH test for residual heteroscedasticity was carried out, and we observed from the results that our model was significant for this research.

Results in Figure 7 show that the residuals are approximately normally distributed, and there is no correlation in the residuals, implying that the Error Correction Model was successfully selected as the tentative model to be used for forecasting.

Table 8. Estimated parameters for food and non-food.

Figure 6. Time plot for Food CPI and Non Food CPI after first differencing.

Figure 7. The ACF, Histogram and q-q plot of residuals for Error Correction Model.

5) Model Comparison

Finally, we compare the prediction accuracy of the ARIMA and ECM models; the model with the smallest errors is selected as the better forecasting model.

Table 9 shows the comparison of the two models. The ECM model shows the smallest errors as compared to the ARIMA (3, 1, 3) model. Thus, ECM is the better forecasting model.

Table 10 shows the forecast for food from the ECM for January 2018 to December 2019. The average growth rate for food CPI is at 6.63%.

Table 11 shows the forecast for nonfood of the ECM for January 2018 to December 2019. The average growth rate for nonfood CPI is at 7.41%.

Table 9. Comparison between ARIMA and ECM.

Table 10. Forecast for food from the ECM model.

Table 11. Forecast for nonfood from the ECM model.

5. Discussion

This paper aimed to compare two time-series models, ARIMA and Multicointegration, using the Zambia CPI data, which are recorded monthly. The data were collected from January 2003 to December 2017. The ARIMA (3, 1, 3) model was chosen from the candidate ARIMA models as it exhibited the smallest Mean Error (ME), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Percentage Error (MPE), Mean Absolute Percentage Error (MAPE) and Mean Absolute Scaled Error (MASE). Diagnostic checking was carried out using the q-q plot, the ACF plot and the histogram of the residuals. Results showed that the model was significant.

Multicointegration was also used to establish whether the two variables, food and nonfood, are cointegrated and whether they can be used to model CPI. We established that both variables were stationary at first difference, which enabled us to carry out a cointegration test as a special case. Results from the Johansen cointegration test showed that the variables were cointegrated and that it was appropriate to estimate an ECM. An ECM was estimated successfully. To check whether the model was significant, we further carried out ARCH and stability tests, and the results showed that the model was significant.

The ECM was selected as the better model to forecast CPI as it showed the smallest errors. The identified model was then used to forecast the CPI of Zambia using the relationship between the food CPI and the non-food CPI. The forecast showed an average growth rate of 6.63% for food CPI and 7.41% for nonfood CPI.

6. Conclusion

The main objective of this research was to compare the forecasting ability of two time-series models using Zambia Monthly Consumer Price Index. Multicointegration was identified as the more accurate model for forecasting compared to the ARIMA (3, 1, 3). The ECM forecast showed an average growth rate for food CPI at 6.63% and an average growth rate for nonfood CPI at 7.41%. The consumer price index plays a very important role as an economic indicator because it is key in the measurement of the inflation rate. Having the ability to forecast CPI is an important factor for any economy because forecasting is essential in economic planning for the future. Forecasts need to be accurate to avoid future dilemmas such as underestimating or overestimating economic flow variables; hence identifying a more accurate model to produce forecasts is a major contribution to the development of Zambia.

Acknowledgements

Many thanks go to Professor Douglas Kunda, Dean of the School of Science, Engineering and Technology, for his encouragement. We also thank Mulungushi University for providing the resources that made this research possible, and the many colleagues who made helpful comments on this paper.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

Cite this paper

Jere, S., Banda, A., Chilyabanyama, R. and Moyo, E. (2019) Modeling Consumer Price Index in Zambia: A Comparative Study between Multicointegration and Arima Approach. Open Journal of Statistics, 9, 245-257. https://doi.org/10.4236/ojs.2019.92018

References

- 1. Costa, D.L. (2001) Estimating Real Income in the United States from 1888 to 1994: Correcting CPI Bias Using Engel Curves. Journal of Political Economy, 109, 1288-1310. https://doi.org/10.1086/323279
- 2. Günther, I. and Grimm, M. (2007) Measuring Pro-Poor Growth When Relative Prices Shift. Journal of Development Economics, 82, 245-256. https://doi.org/10.1016/j.jdeveco.2005.07.002
- 3. Hamilton, B.W. (2001) Using Engel’s Law to Estimate CPI Bias. American Economic Review, 91, 619-630. https://doi.org/10.1257/aer.91.3.619
- 4. Gathingi, V.W. (2014) Modelling Inflation in Kenya Using ARIMA and VAR Models. https://www.academia.edu/37491717/Modeling_Inflation_in_Kenya_Using_ARIMA_and_VAR_Models
- 5. Norbert, H., Wanjoya, A. and Waititu, A. (2016) Modeling and Forecasting Consumer Price Index (Case of Rwanda). https://doi.org/10.11648/j.ajtas.20160503.14
- 6. Faiga, K., et al. (2015) Time Series Modeling and Forecasting of the Consumer Price Index Bandar Lampung. Lampung University, Lampung.
- 7. Celik, T. and Akgül, B. (2011) Changes in Fuel Prices in Turkey: An Estimation of the Inflation Effect Using VAR Analysis. Journal of Economic and Business, 14, 11-21.
- 8. Jiang, C., et al. (2014) Selecting Single Model in Combination Forecasting Based on Cointegration Test and Encompassing Test. The Scientific World Journal, 2014, Article ID: 621917. https://doi.org/10.1155/2014/621917
- 9. Kim, B.-Y. (2001) Determinants of Inflation in Poland: A Structural Cointegration Approach. BOFIT Discussion Paper No. 16/2001. https://ssrn.com/abstract=1015770 https://doi.org/10.2139/ssrn.1015770
- 10. Harrell, F.E., Lee, K.L. and Mark, D.B. (1996) Multivariable Prognostic Models: Issues in Developing Models, Evaluating Assumptions and Adequacy, and Measuring and Reducing Errors. Statistics in Medicine, 15, 361-387. https://doi.org/10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4