Zambia largely depends on the international second-hand car (SHC) market for its motor vehicle supply. The importation of second-hand cars into Zambia presents a time series problem. The data used in this paper are monthly SHC importation figures from 1st January, 2014 to 31st December, 2016. The data were analyzed using Exponential Smoothing (ES) and Autoregressive Integrated Moving Average (ARIMA) models. The results showed that ARIMA (2, 1, 2) was the best fit for SHC importation, since its errors were smaller than those of the simple (SES), double (DES) and triple (TES) exponential smoothing models. The four error measures used were the root-mean-square error (RMSE), mean absolute error (MAE), mean percentage error (MPE) and mean absolute percentage error (MAPE). Forecasts were produced using the ARIMA (2, 1, 2) model for the 18 months from January 2017. Although there was a percentage increase of 90.6% from November 2015 to December 2016, SHC importation in Zambia has generally been decreasing, with a percentage change of 59.5% from January 2014 to December 2016. The forecasts also show a gradual percentage decrease of 1.12% by June 2018. These results are useful to policy and decision makers in Government departments such as the Zambia Revenue Authority (ZRA) and the Road Development Agency (RDA) as they plan and execute their duties.
Over the last two decades private vehicle ownership in the developing countries has increased at an unprecedented pace. Between 1990 and 2005 the total number of registered vehicles in developing countries rose from 110 million to 210 million, and by some estimates it is forecast to reach 1.2 billion by 2030 [
The importation of second-hand cars in Zambia presents a time series problem. Several techniques exist for analysing time series, but this study concentrates only on the Exponential Smoothing (ES) and Autoregressive Integrated Moving Average (ARIMA) models. In [
Below is the flowchart of the methodology.
Two main classes of models are considered in this paper: the Exponential Smoothing (ES) and Autoregressive Integrated Moving Average (ARIMA) models. The first class comprises the SES, DES and TES models. The three models are analysed and the best-fitting one is chosen depending on whether the data exhibit a level, a trend and/or seasonality. The second class comprises the ARIMA models, built through the following process: tentative identification of a model, estimation of the parameters of the identified model, and diagnostic checks. The best fits from the two classes are finally compared to choose the model for forecasting (
Simple exponential smoothing (SES) is applied when the data pattern is nearly horizontal, that is, when previous data show no particular trend or seasonal variation. For the series $\phi_1, \phi_2, \cdots, \phi_t$, the forecast for the next value $\phi_{t+1}$, say $\bar{\phi}_{t+1}$, is a weighted average that assigns the weights $\alpha$ and $1 - \alpha$ to the most recent observation $\phi_t$ and the most recent forecast $\bar{\phi}_t$ respectively, where $\alpha$ is the smoothing constant, $\phi_t$ is the actual value for period $t$ and $\bar{\phi}_t$ is the forecast value for period $t$. The model is of the form
$$\bar{\phi}_{t+1} = \bar{\phi}_t + \alpha\,(\phi_t - \bar{\phi}_t), \qquad 0 < \alpha < 1,\ t > 0. \tag{1}$$
The value of α is chosen subjectively: a value close to zero smooths out unwanted cyclical and irregular components, whereas a value close to one makes the forecast respond quickly to the most recent observations.
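The recursion in Equation (1) can be sketched in a few lines of Python. This is only an illustration: the function name, the choice of the first observation as the starting forecast and the import figures are assumptions made here, not values or code from the study.

```python
def simple_exponential_smoothing(series, alpha, initial_forecast=None):
    """Apply Equation (1) repeatedly.

    Returns a list whose element i is the forecast for series[i]; the final
    element is the forecast for the first period after the data end.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Assumption: start from the first observation unless told otherwise.
    forecast = series[0] if initial_forecast is None else initial_forecast
    forecasts = [forecast]
    for observed in series:
        # New forecast = old forecast + alpha * (observation - old forecast)
        forecast = forecast + alpha * (observed - forecast)
        forecasts.append(forecast)
    return forecasts


# Hypothetical monthly import counts, for illustration only.
imports = [5500.0, 5200.0, 5600.0, 5000.0]
ses_forecasts = simple_exponential_smoothing(imports, alpha=0.5)
print(ses_forecasts[-1])
```

With α = 0.5 each new forecast sits halfway between the previous forecast and the latest observation, which makes the effect of the smoothing constant easy to follow by hand.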
Double exponential smoothing (DES) is used when the data exhibit a trend, that is, when the time series can be described by an additive model with an increasing or decreasing trend and no seasonality. The model is
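$$\bar{\phi}_t = \alpha \phi_t + (1 - \alpha)(\bar{\phi}_{t-1} + \beta_{t-1}), \qquad 0 < \alpha < 1, \tag{2}$$

$$\beta_t = \theta\,(\bar{\phi}_t - \bar{\phi}_{t-1}) + (1 - \theta)\,\beta_{t-1}, \qquad 0 < \theta < 1, \tag{3}$$

$$\hat{\phi}_{t+m} = \bar{\phi}_t + \beta_t\, m \tag{4}$$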
where $\phi_t$ is the actual value at time $t$, $\bar{\phi}_t$ is the level of the series at time $t$, $\beta_t$ is the slope (trend) of the series at time $t$ and $\hat{\phi}_{t+m}$ is the forecast $m$ periods ahead. $\alpha$ and $\theta$ ($= 0.1, 0.2, \cdots, 0.9$) are the smoothing coefficients for the level and the trend respectively. The best values of $\alpha$ and $\theta$ are those that give the minimum mean square error (MSE).
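A minimal sketch of Equations (2) to (4) follows, using θ for the trend smoothing coefficient as in Equation (3). The starting values (first observation for the level, first difference for the slope) and the example series are assumptions chosen for illustration; other initialisations are possible.

```python
def double_exponential_smoothing(series, alpha, theta, horizon=1):
    """Return forecasts 1..horizon steps ahead from Equations (2)-(4)."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Assumed initial states: level = first value, slope = first difference.
    level = series[0]
    slope = series[1] - series[0]
    for observed in series[1:]:
        previous_level = level
        # Equation (2): update the level.
        level = alpha * observed + (1 - alpha) * (previous_level + slope)
        # Equation (3): update the slope.
        slope = theta * (level - previous_level) + (1 - theta) * slope
    # Equation (4): extrapolate the last level along the last slope.
    return [level + slope * m for m in range(1, horizon + 1)]


# Hypothetical series with a perfectly linear trend of +2 per period.
des_forecasts = double_exponential_smoothing([10.0, 12.0, 14.0, 16.0],
                                             alpha=0.5, theta=0.5, horizon=2)
print(des_forecasts)
```

On a perfectly linear series the method reproduces the trend exactly, so the two forecasts continue the line; on real data the forecasts depend on the chosen α and θ.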
The TES model is applied when time series data exhibit seasonality. It incorporates three smoothing equations; first for the level, second for trend and third for seasonality. The Triple exponential smoothing model is:
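$$\bar{\phi}_t = \alpha\,\frac{\phi_t}{S_{t-p}} + (1 - \alpha)(\bar{\phi}_{t-1} + \beta_{t-1}), \qquad 0 < \alpha < 1, \tag{5}$$

$$\beta_t = \theta\,(\bar{\phi}_t - \bar{\phi}_{t-1}) + (1 - \theta)\,\beta_{t-1}, \qquad 0 < \theta < 1, \tag{6}$$

$$S_t = \gamma\,\frac{\phi_t}{\bar{\phi}_t} + (1 - \gamma)\,S_{t-p}, \qquad 0 < \gamma < 1, \tag{7}$$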
The prediction for time period $T + \tau$ is then

$$\hat{\phi}_{T+\tau} = (\bar{\phi}_T + \tau\,\beta_T)\,S_T \tag{8}$$

where $\bar{\phi}_T$ is the smoothed estimate of the level at time $T$, $\beta_T$ is the smoothed estimate of the trend at time $T$ and $S_T$ is the smoothed estimate of the appropriate seasonal component at $T$. $\alpha$, $\theta$ and $\gamma$ are the level, trend and seasonal smoothing parameters respectively, and $p$ is the number of seasons per year.
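The three smoothing equations and the forecast equation can be sketched as below, reading Equations (5) to (7) in their multiplicative form, in which the observation is divided by the seasonal index from one season earlier. The initial level, slope and seasonal indices are computed from the first two seasons; that initialisation, the function name and the toy series are assumptions of this sketch rather than the procedure used in the study.

```python
def triple_exponential_smoothing(series, period, alpha, theta, gamma, horizon):
    """Multiplicative Holt-Winters sketch following Equations (5)-(8)."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    first, second = series[:period], series[period:2 * period]
    # Assumed initial states taken from the first two seasons.
    level = sum(first) / period
    slope = (sum(second) - sum(first)) / period ** 2
    seasonals = [value / level for value in first]
    for t in range(period, len(series)):
        observed = series[t]
        previous_level = level
        seasonal_index = seasonals[t - period]  # S_{t-p}
        # Equation (5): deseasonalise the observation and update the level.
        level = alpha * observed / seasonal_index + (1 - alpha) * (previous_level + slope)
        # Equation (6): update the slope.
        slope = theta * (level - previous_level) + (1 - theta) * slope
        # Equation (7): update the seasonal index for this season.
        seasonals.append(gamma * observed / level + (1 - gamma) * seasonal_index)
    n = len(series)
    # Equation (8), using the latest index available for each target season.
    return [(level + tau * slope) * seasonals[n - period + (tau - 1) % period]
            for tau in range(1, horizon + 1)]


# Hypothetical series with two seasons per cycle and no trend.
tes_forecasts = triple_exponential_smoothing([10.0, 20.0, 10.0, 20.0, 10.0, 20.0],
                                             period=2, alpha=0.5, theta=0.5,
                                             gamma=0.5, horizon=2)
print(tes_forecasts)
```

For a series that simply alternates between a low and a high season, the forecasts reproduce that alternation, which is a quick sanity check on the seasonal bookkeeping.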
The ARIMA model has the following stages: identification, estimation, diagnosis and prediction. “I” stands for integrated, which implies that the series needs to be differenced and that, upon completion of the modelling, the results are integrated back (the differencing is reversed) to produce the final predictions and estimates [
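AR model:

$$\hat{Y}_t = \vartheta_1 Y_{t-1} + \vartheta_2 Y_{t-2} + \cdots + \vartheta_p Y_{t-p} + \varepsilon_t = \sum_{i=1}^{p} \vartheta_i Y_{t-i} + \varepsilon_t, \tag{9}$$

MA model:

$$\hat{Y}_t = \varphi_1 \varepsilon_{t-1} + \varphi_2 \varepsilon_{t-2} + \cdots + \varphi_q \varepsilon_{t-q} = \sum_{i=1}^{q} \varphi_i \varepsilon_{t-i}, \tag{10}$$

and ARMA model:

$$\hat{Y}_t = \sum_{i=1}^{p} \vartheta_i Y_{t-i} + \varepsilon_t + \sum_{i=1}^{q} \varphi_i \varepsilon_{t-i} \tag{11}$$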
where $\vartheta_i$ is the autoregressive parameter at lag $i$, $\varepsilon_t$ is the error term at time $t$ and $\varphi_i$ is the moving-average parameter at lag $i$.
The stationarity assumption implies that the mean, variance and autocorrelation structures do not change over time. Stationarity will mean a flat looking series, without trend, constant variance over time and no periodic fluctuations (seasonality). However, this assumption of stationarity applies to ARIMA models and not ES models. When the data is found to be non-stationary, the first difference (d = 1) will be used. Only in extreme cases will second difference (d = 2) be applied.
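Differencing and its reversal can be illustrated with a short sketch. The helper names and the import figures are hypothetical; the point is only to show what d = 1 and d = 2 do to a series and how the original levels are recovered by cumulative summation.

```python
def difference(series, d=1):
    """Apply first differencing d times: X_t = Y_t - Y_{t-1}."""
    differenced = list(series)
    for _ in range(d):
        differenced = [later - earlier
                       for earlier, later in zip(differenced, differenced[1:])]
    return differenced


def integrate(first_value, differences):
    """Undo one round of differencing by cumulative summation."""
    levels = [first_value]
    for step in differences:
        levels.append(levels[-1] + step)
    return levels


# Hypothetical downward-trending import counts.
imports = [5500, 5300, 5150, 4900, 4800]
first_diff = difference(imports, d=1)
second_diff = difference(imports, d=2)
restored = integrate(imports[0], first_diff)
print(first_diff, second_diff, restored)
```

Each round of differencing shortens the series by one observation, which is one reason the second difference is reserved for cases where the first difference is still clearly non-stationary.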
Four accuracy measures are used to evaluate the performance of the estimated Exponential Smoothing models and the estimated ARIMA model: the Root Mean Square Error (RMSE), the Mean Absolute Percentage Error (MAPE), the Mean Percentage Error (MPE) and the Mean Absolute Error (MAE). The best-fitting model is the one with the smallest error on the largest number of these measures.
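The four measures can be computed as in the sketch below. It assumes the common convention that the error is the actual value minus the fitted value and that percentage errors are taken relative to the actual value; the function name and the two-point example are illustrative only.

```python
import math


def accuracy_measures(actual, fitted):
    """Return RMSE, MAE, MPE and MAPE for paired actual and fitted values."""
    if len(actual) != len(fitted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumed convention: error = actual - fitted.
    errors = [a - f for a, f in zip(actual, fitted)]
    n = len(errors)
    return {
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs(e) for e in errors) / n,
        # Percentage errors are relative to the actual values.
        "MPE": 100 * sum(e / a for e, a in zip(errors, actual)) / n,
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


# Hypothetical actual and fitted values.
measures = accuracy_measures([100.0, 200.0], [110.0, 180.0])
print(measures)
```

In this example the positive and negative percentage errors cancel, so the MPE is zero while the MAPE is 10%. This is why MPE is better read as a measure of bias than of overall accuracy.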
The data collected were loaded into R version 3.3.3 to perform the analysis outlined in the subsections that follow.
Using the appropriate coding in R, the following output was automatically generated.
The R output for the SES model was as shown in
Criteria | Formula | Criteria | Formula |
---|---|---|---|
RMSE | | MAPE | |
MPE | | MAE | |
Model information: | ||||||
---|---|---|---|---|---|---|
Smoothing parameters: | Initial states: | sigma: | AIC | AICc | BIC | |
alpha = 0.9104 | l = 5521.3676 | 473.3778 | 578.5190 | 579.2690 | 583.2696 | |
Model information: | |||||
---|---|---|---|---|---|
Smoothing parameters: | Initial states: | sigma: | AIC | AICc | BIC |
alpha = 0.8006 beta = 0.0004 | l = 6033.9228 b = −118.9081 | 465.3511 | 581.2877 | 583.2877 | 589.2053 |
estimated at α = 0.9104 with initial state, l = 5521.3676 and AIC = 578.5190
The fitted model for this result took the form

$$\bar{\phi}_{t+1} = 0.9104\,\phi_t + 0.0896\,\bar{\phi}_t \tag{12}$$
The R output for the DES model was as shown in
The following equations constituted the fitted DES model for SHC importation, using Equations (2) and (3):
$$\bar{\phi}_t = 0.8006\,\phi_t + 0.1994\,(\bar{\phi}_{t-1} + \beta_{t-1}), \tag{13}$$

$$\beta_t = 0.0004\,(\bar{\phi}_t - \bar{\phi}_{t-1}) + 0.9996\,\beta_{t-1}. \tag{14}$$
Using Equations (5)-(7), we fitted the HW model for SHC imports as;
Model information: | |
---|---|
Smoothing parameters: | Coefficients: |
alpha: 0.7706147 beta: 0 gamma: 1 | [,1] a 2158.93914 b −116.18546 |
$$\bar{\phi}_t = 0.7706147\,\frac{\phi_t}{S_{t-p}} + 0.2293853\,(\bar{\phi}_{t-1} + \beta_{t-1}), \tag{15}$$

$$\beta_t = \beta_{t-1}, \tag{16}$$

$$S_t = \frac{\phi_t}{\bar{\phi}_t} \tag{17}$$
Clearly the AICs in
To model an ARIMA, a time plot is the first step.
Hence
Model selection requires that the ACF and PACF plots for d = 1 in
When estimating the parameters, R gave the following output for ARIMA (2, 1, 2) in
The parameters found significant were AR (1), AR (2), MA (1), and MA (2) at
Model | AIC | Ranking |
---|---|---|
SES | 578.5190 | 1 |
DES | 589.2877 | 2 |
Tentative model | ARIMA (0, 1, 1) | ARIMA (1, 1, 0) | ARIMA (2, 1, 0) | ARIMA (1, 1, 1) | ARIMA (0, 1, 2) | ARIMA (1, 1, 2) | ARIMA (2, 1, 2) | ARIMA (3, 1, 0) |
---|---|---|---|---|---|---|---|---|
AIC | 535.51 | 535.55 | 537.31 | 537.44 | 537.81 | 538.81 | 538.93 | 539.29 |
Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
** | **** | |||||||
Tentative model | ARIMA (2, 1, 1) | ARIMA (0, 1, 3) | ARIMA (1, 1, 3) | ARIMA (3, 1, 1) | ARIMA (0, 1, 4) | ARIMA (4, 1, 0) | ARIMA (4, 1, 1) | ARIMA (1, 1, 4) |
AIC | 539.3 | 539.29 | 540.77 | 540.78 | 541.12 | 541.28 | 542.72 | 543.28 |
Rank | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
Tentative model | ARIMA (3, 1, 2) | ARIMA (3, 1, 3) | ||||||
AIC | 543.29 | 544.17 | ||||||
Rank | 17 | 18 |
Note: *Number of significant parameters and lesser prediction errors
Model information: | ||||
---|---|---|---|---|
parameters: | s.e. | sigma2 | log likelihood | AIC |
ar1 = −1.0536 ar2 = −0.9947 ma1 = 1.0907 ma2 = 0.9983 | 0.0881 0.0258 0.1881 0.1942 | 198,508 | −264.47 | 538.93 |
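The AIC in this output can be cross-checked against the reported log likelihood. The sketch assumes the usual definition AIC = −2 log L + 2k and counts k = 5 estimated quantities (the four ARMA coefficients plus the innovation variance); that parameter count is an assumption that happens to agree with the reported figures.

```python
# Values copied from the ARIMA (2, 1, 2) output above.
log_likelihood = -264.47
reported_aic = 538.93

# Assumption: two AR terms, two MA terms and the innovation variance.
n_parameters = 5

aic = -2 * log_likelihood + 2 * n_parameters
print(round(aic, 2), reported_aic)
```

The recomputed value agrees with the reported AIC to within the rounding of the log likelihood to two decimal places.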
Variables | Coefficients | p-value |
---|---|---|
AR(1) | −1.0536 | 0* |
AR(2) | −0.9947 | 0* |
MA(1) | 1.0907 | 0.000000006720324* |
MA(2) | 0.9983 | 0.0000002724817* |
Note: *implies p-value < 0.05 hence significant coefficient.
5% significance level. Hence the fitted ARIMA (2, 1, 2) using equation 11 was;
$$\hat{X}_t = 1.0907\,\varepsilon_{t-1} + 0.9983\,\varepsilon_{t-2} - 1.0536\,X_{t-1} - 0.9947\,X_{t-2} \tag{18}$$
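To show how Equation (18) produces a forecast on the original scale, the sketch below assumes that $X_t$ denotes the first difference of the import series (since d = 1), applies the fitted coefficients to the two most recent differences and residuals, and then adds the predicted change to the last observed level. The import levels and residuals used here are hypothetical.

```python
# Coefficients from the fitted ARIMA (2, 1, 2) model, Equation (18).
AR_COEFFICIENTS = (-1.0536, -0.9947)  # multiply X_{t-1}, X_{t-2}
MA_COEFFICIENTS = (1.0907, 0.9983)    # multiply e_{t-1}, e_{t-2}


def forecast_next_level(last_three_levels, last_two_residuals):
    """One-step forecast of the level from Equation (18).

    last_three_levels:  (Y_{t-3}, Y_{t-2}, Y_{t-1})
    last_two_residuals: (e_{t-2}, e_{t-1})
    Assumes X_t = Y_t - Y_{t-1}, i.e. first differencing.
    """
    y3, y2, y1 = last_three_levels
    x_lag2, x_lag1 = y2 - y3, y1 - y2
    e_lag2, e_lag1 = last_two_residuals
    predicted_change = (AR_COEFFICIENTS[0] * x_lag1 + AR_COEFFICIENTS[1] * x_lag2
                        + MA_COEFFICIENTS[0] * e_lag1 + MA_COEFFICIENTS[1] * e_lag2)
    # Integrate back: add the predicted change to the last observed level.
    return y1 + predicted_change


# Hypothetical recent import counts and residuals.
next_level = forecast_next_level((2000.0, 2100.0, 2050.0), (0.0, 0.0))
print(round(next_level, 2))
```

With zero residuals the forecast is driven entirely by the autoregressive part: the recent changes of +100 and −50 imply a further predicted change of about −46.79.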
The model with best fit was identified by analysis of residuals to ensure they form a white noise process. The ACF of residual, the Q-Q plot and the histogram of residuals were used to show that the residuals of the fit form a white noise process.
The preceding sections revealed that, of the three Exponential Smoothing techniques used in this analysis (SES, DES and TES), SES fitted the SHC imports data better than DES and TES. Its fitted model was estimated to be
$$F_{t+1} = 0.9104\,Y_t + 0.0896\,F_t.$$
It was also revealed that ARIMA (2, 1, 2) fitted the data well as compared to other tentative ARIMA models suggested. ARIMA (2, 1, 2) was estimated to be
$$\hat{Y}_t = 1.0907\,\varepsilon_{t-1} + 0.9983\,\varepsilon_{t-2} - 1.0536\,Y_{t-1} - 0.9947\,Y_{t-2}.$$
The question remains, however, as to which of the four models considered in this report is the best fit for the SHC imports data, as highlighted in

Measures of accuracy (Errors) | SES | DES | TES | ARIMA |
---|---|---|---|---|
RMSE | 480.0931 | 465.3511 | 580.6247 | 439.3114* |
MAE | 358.1159 | 366.8639 | 372.0239 | 325.0751* |
MPE | −5.300821* | −0.09889986 | 0.7725993 | −4.472449 |
MAPE | 15.39877 | 15.67945 | 18.14711 | 13.46418* |
Model ranking | 2 | 3 | 4 | 1 |

Note: Smaller error (*) implies better fit.
The results indicate that the ARIMA model performs better than any of the other models for this time series. ARIMA (2, 1, 2) has the smallest prediction error on more measures than the SES, and so it was concluded that ARIMA (2, 1, 2) is the best-fitting model for the SHC imports data. It can therefore be used to forecast future imports of SHCs. Note, however, that although the SES model gives the second-best forecast after the ARIMA model, the performance of each model depends on the data used.
Here, it should be noted that differences between their performances are related to the differences between the methods of determining forecasts in the ES and in the ARIMA models. The forecasting method in the ES models relies on a weighted average of the past observed values in which the weights decline exponentially. This basically implies that the data for more recent observations contribute significantly more than the previous data does. The ARIMA model, however, has three parts: autoregression, integration and moving average, with the future value of a variable being a linear combination of the past values and the associated errors.
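The exponential decline of the weights can be made concrete. Unrolling the SES recursion shows that the observation k periods back receives the weight α(1 − α)^k; the sketch below evaluates these weights for the smoothing constant estimated for the SES model (α = 0.9104). The function name and the number of lags shown are illustrative choices.

```python
def ses_weights(alpha, n_lags):
    """Weight given to the observation k periods back, for k = 0..n_lags-1."""
    return [alpha * (1 - alpha) ** k for k in range(n_lags)]


# Smoothing constant taken from the fitted SES model; five lags shown.
weights = ses_weights(0.9104, 5)
for lag, weight in enumerate(weights):
    print(lag, round(weight, 6))
```

With α this close to one, roughly 91% of the weight falls on the latest observation and about 8% on the one before it, so the SES forecast is driven almost entirely by the most recent month.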
Forecasting is usually the last stage in time series analysis as stated in
Time (months) | Point Forecast | Lo 80 | Hi 80 | Lo 95 | Hi 95 |
---|---|---|---|---|---|
Jan 2017 | 2045.993 | 1463.52478 | 2628.462 | 1155.18458 | 2936.802 |
Feb 2017 | 2047.141 | 1217.04098 | 2877.242 | 777.61237 | 3316.670 |
Mar 2017 | 2237.907 | 1236.01893 | 3239.796 | 705.65127 | 3770.163 |
Apr 2017 | 2035.770 | 876.03471 | 3195.505 | 262.10811 | 3809.431 |
May 2017 | 2059.000 | 757.79151 | 3360.209 | 68.97329 | 4049.027 |
Jun 2017 | 2235.582 | 818.66531 | 3652.498 | 68.59512 | 4402.568 |
Jul 2017 | 2026.425 | 493.22452 | 3559.625 | −318.40260 | 4371.252 |
Aug 2017 | 2071.159 | 428.86885 | 3713.450 | −440.50729 | 4582.826 |
Sep 2017 | 2232.065 | 496.66342 | 3967.467 | −422.00285 | 4886.133 |
Oct 2017 | 2018.035 | 185.98431 | 3850.086 | −783.84482 | 4819.915 |
Nov 2017 | 2083.496 | 159.69855 | 4007.294 | −858.69851 | 5025.691 |
Dec 2017 | 2227.411 | 223.49336 | 4231.330 | −837.31682 | 5292.140 |
Jan 2018 | 2010.667 | −77.89577 | 4099.231 | −1183.51437 | 5204.849 |
Feb 2018 | 2095.888 | −73.16495 | 4264.941 | −1221.39227 | 5413.168 |
Mar 2018 | 2221.684 | −18.82067 | 4462.188 | −1204.87199 | 5648.239 |
Apr 2018 | 2004.377 | −312.46350 | 4321.218 | −1538.92478 | 5547.679 |
May 2018 | 2108.213 | −281.03488 | 4497.461 | −1545.82639 | 5762.253 |
Jun 2018 | 2214.954 | −239.45714 | 4669.366 | −1538.74416 | 5968.653 |

Zambia largely depends on the international second-hand car market for its motor vehicle supply. In this paper, monthly time series data on second-hand car (SHC) importation were analyzed using SES, DES, TES and ARIMA techniques. The quality of each technique was determined by comparing the predictive power of its fitted model with the observed data. The results showed that ARIMA (2, 1, 2) was the best fit for SHC importation because its errors were smaller than those of the SES, DES and TES. The four error measures used were RMSE, MAE, MPE and MAPE. Forecasts were produced using the ARIMA (2, 1, 2) model for the 18 months from January 2017. Although there was a percentage increase of 90.6% from November 2015 to December 2016, SHC importation in Zambia has generally been decreasing, with a percentage change of 59.5% from January 2014 to December 2016. The forecasts also show a gradual percentage decrease of 1.12% by June 2018. Ultimately, these results can be used by Government departments such as the Zambia Revenue Authority and the Road Development Agency as they plan and execute their duties.
Jere, S., Kasense, B. and Bwalya, B.B. (2017) Univariate Time-Series Analysis of Second-Hand Car Importation in Zambia. Open Journal of Statistics, 7, 718-730. https://doi.org/10.4236/ojs.2017.74050