This article analyzes time series data on the gross domestic product (GDP) of the Sudan. An econometric time series model with macroeconomic variables is constructed. Because a non-stationary time series must be made stationary before modeling, statistical tests are applied until the series becomes stationary. After these tests were applied, the series became stationary and integrated of order one. The Box-Jenkins procedure is used to determine the ARMA specification, and OLS is used to estimate the model parameters. The performance of the chosen ARIMA model is verified on the basis of classical statistical tests and forecasting. The model features are interpreted on the basis of standard measures of forecasting performance.
As a measure of an economy’s performance, gross domestic product (GDP) is the value of all final goods and services produced within a country in a year. GDP is among the most widely used economic data in time series modeling and analysis. GDP data serve a wide variety of needs in industry, finance, research institutions, and other fields. Economic forecasting is an essential component of a country’s economic decision-making process, and policy makers need GDP forecasts to anticipate economic conditions. For these reasons, this paper investigates the performance of a GDP model for the Sudan. It aims to analyze a time series econometric model of the macroeconomic variable GDP for the country.
Time series models and analysis have been discussed in [
Many statistical tests are used in time series modeling to make a series stationary and to determine its order of integration; the Box-Jenkins procedure is then used to determine the ARMA specification, and the OLS method is used to estimate the model parameters. The following sections identify the techniques that are useful for this analysis.
This paper is organized as follows: Chapter 2 is devoted to the proposed model of the study. Background on data collection and methodology is presented in Chapter 3, while Chapter 4 is devoted to data analysis and results, which are discussed in Chapter 5. A brief conclusion is given in Chapter 6.
The methodology of time series analysis is composed of two steps: constructing a data model for the time series, and forecasting future values.
For a regular time series pattern, the value of the series, $Y_t$, should be a function of previous values. If $Y$ is the target value that we are trying to model and predict, and $Y_t$ is the value of $Y$ at time $t$, then the goal is to build a model of the type:
$$Y_t = f(Y_{t-1}, Y_{t-2}, Y_{t-3}, \ldots, Y_{t-n}) + e_t \tag{1}$$
where $Y_{t-1}$ is the previous observation of $Y$, $Y_{t-2}$ is the value two observations ago, and so on, and $e_t$ (a random shock) represents noise that does not follow a predictable pattern. Values occurring prior to the current observation are called lag values. In a time series with a repeating pattern, the value of $Y_t$ is usually highly correlated with $Y_{t-\text{cycle}}$, the value one cycle earlier. Thus, the goal of constructing a time series model is to make the error between the predicted and actual values of the target variable as small as possible.
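The lag structure in Equation (1) can be illustrated with a minimal Python sketch. The helper name `make_lagged` and the toy series are hypothetical and are not taken from the GDP data; the function only pairs each observation with its n preceding values and leaves the choice of f open.

```python
def make_lagged(series, n_lags):
    """Pair each observation Y_t with its n_lags previous values.

    Returns (X, y), where X[i] = [Y_{t-1}, Y_{t-2}, ..., Y_{t-n}]
    (most recent lag first) and y[i] = Y_t.
    """
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append([series[t - k] for k in range(1, n_lags + 1)])
        y.append(series[t])
    return X, y


# Hypothetical toy series, used only to show the layout of the lag values.
toy = [1.0, 2.0, 4.0, 7.0, 11.0]
X, y = make_lagged(toy, n_lags=2)
# X == [[2.0, 1.0], [4.0, 2.0], [7.0, 4.0]] and y == [4.0, 7.0, 11.0]
```

Any regression method can then be fitted to `X` and `y` to play the role of f.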
Consider a time series $Y_t$. The ARMA model consists of two parts, an autoregressive (AR) part and a moving average (MA) part. Following [
$$Y_t = c + \sum_{i=1}^{p} \varphi_i Y_{t-i} + \varepsilon_t \tag{2}$$
where $\varphi_1, \ldots, \varphi_p$ are the parameters of the AR($p$) model, $c$ is a constant (which may be omitted for simplicity) and $\varepsilon_t$ is an error term. The MA($q$) notation stands for the moving average model of order $q$:
$$Y_t = \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} \tag{3}$$
where $\theta_1, \ldots, \theta_q$ are the parameters of the model and $\varepsilon_t, \varepsilon_{t-1}, \ldots$ are the error terms.
The notation ARMA (p, q) refers to the model with p autoregressive terms and q moving average terms. This model contains the AR (p) and MA (q) models,
$$Y_t = \varepsilon_t + \sum_{i=1}^{p} \varphi_i Y_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} \tag{4}$$
where the error terms $\varepsilon_t$ are assumed to be independent, identically distributed random variables with mean zero, $\varepsilon_t \sim N(0, \sigma^2)$, where $\sigma^2$ is the variance.
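Equation (4) can be simulated directly, which helps in seeing how the AR and MA terms feed into each observation. The sketch below is illustrative only: the function name `simulate_arma`, the parameter values and the seed are assumptions, and values before the start of the sample are treated as zero.

```python
import random


def simulate_arma(phi, theta, n, sigma=1.0, seed=0):
    """Simulate Y_t = eps_t + sum_i phi_i * Y_{t-i} + sum_i theta_i * eps_{t-i}.

    phi and theta are lists of AR and MA coefficients; eps_t ~ N(0, sigma^2).
    Terms that would reach before the first observation are dropped
    (an assumption made here for simplicity).
    """
    rng = random.Random(seed)
    y, eps = [], []
    for t in range(n):
        e_t = rng.gauss(0.0, sigma)
        ar = sum(p * y[t - i] for i, p in enumerate(phi, start=1) if t - i >= 0)
        ma = sum(q * eps[t - i] for i, q in enumerate(theta, start=1) if t - i >= 0)
        y.append(e_t + ar + ma)
        eps.append(e_t)
    return y, eps


# Hypothetical ARMA(1, 1) with phi_1 = 0.5 and theta_1 = 0.3.
y_sim, eps_sim = simulate_arma(phi=[0.5], theta=[0.3], n=56)
```

With empty `phi` and `theta` the function returns pure white noise, which corresponds to an ARMA(0, 0) process.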
The process $(Y_t)$ is said to be ARIMA($p$, $d$, $q$) if:
$$(1 - L)^d \varphi^*(L) Y_t = c + \theta(L) \varepsilon_t \tag{5}$$
where $L$ is the lag operator and $\varphi^*(L)$ is defined by
$$\varphi(L) = (1 - L)^d \varphi^*(L), \tag{6}$$
with $\varphi^*(z) \neq 0$ for all $|z| \le 1$, and $\theta(L)$ satisfies $\theta(z) \neq 0$ for all $|z| \le 1$.
The process $(Y_t)$ is stationary if and only if $d = 0$, in which case it reduces to an ARMA($p$, $q$) process:
$$\varphi(L) Y_t = c + \theta(L) \varepsilon_t \tag{7}$$
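The differencing factor raised to the power d in Equation (5) amounts to differencing the series d times. A minimal sketch, assuming a plain Python list as input (the function name `difference` and the toy series are hypothetical):

```python
def difference(series, d=1):
    """Difference the series d times: each pass replaces Y_t by Y_t - Y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out


# Hypothetical series with a quadratic trend: 0, 1, 4, 9, 16, 25.
quadratic = [t * t for t in range(6)]
first = difference(quadratic, d=1)   # [1, 3, 5, 7, 9], still trending
second = difference(quadratic, d=2)  # [2, 2, 2, 2], constant
```

Each pass shortens the series by one observation, which is why a differenced series has fewer observations than the original.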
The Box-Jenkins methodology [
1) Stationarity. A time series is said to be stationary if both its mean and its variance remain constant through time. Classical Box-Jenkins ARMA models only work satisfactorily with stationary time series.
2) Identification. Identify a (stationary) conditional mean model for the underlying data. The sample autocorrelation function (ACF) and partial autocorrelation function (PACF) can help with this selection. For an autoregressive (AR) process, the sample ACF decays gradually, but the sample PACF cuts off after a few lags. Conversely, for a moving average (MA) process, the sample ACF cuts off after a few lags, but the sample PACF decays gradually. If both the ACF and PACF decay gradually, consider an autoregressive moving average (ARMA) model.
3) Specification and estimation. Specify the model and estimate the required parameters.
4) Diagnostic checking. Check the model’s goodness of fit using measures such as the proportion of variance explained by the model or the correlation between actual and predicted values. Residuals should be uncorrelated, homoscedastic, and normally distributed with constant mean and variance.
5) Forecasting. After its goodness of fit and forecasting ability have been checked, the model can be used to forecast or to generate simulations over a period of time.
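The identification step relies on the sample ACF and PACF. The following sketch shows one common way to compute them in plain Python; the PACF uses the Durbin-Levinson recursion, which is a choice made here for illustration and is not stated in the text. The function names and the toy series are hypothetical.

```python
def acf(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag (deviations from the mean)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


def pacf(series, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = [1.0] + acf(series, max_lag)
    out, prev = [], []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk, curr = r[1], [r[1]]
        else:
            num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
            phi_kk = num / den
            curr = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            curr.append(phi_kk)
        out.append(phi_kk)
        prev = curr
    return out


# Hypothetical toy series, not the GDP data.
toy_series = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 8.0, 7.0, 9.0]
r_toy = acf(toy_series, 3)
p_toy = pacf(toy_series, 3)
```

An ACF that cuts off while the PACF decays points towards an MA model, and the reverse points towards an AR model, as described in step 2.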
When the ARIMA (autoregressive integrated moving average) method is applied iteratively to find the best fit to time series data, the autoregressive (AR) component is designated as p, the integrated (I) component as d, and the moving average (MA) component as q. The AR component represents the effects of previous observations. The I component represents trends, including seasonality. The MA component represents the effects of previous random shocks (or errors). To fit an ARIMA model to a time series, the order of each model component must be selected; usually a small integer value (0, 1, or 2) is chosen for each component.
GDP equals the total expenditure on all final goods and services produced within the country in a stipulated period of time. It is the sum of gross value added by all resident producers in the economy plus any product taxes and minus any subsidies not included in the value of the products [
The Sudan Central Bureau of Statistics (CBS) issues an annual report that includes all national accounts, while the Central Bank of the Sudan [
The annual percentage growth rate of GDP at market prices is based on constant local currency, while aggregates are based on constant U.S. dollars. As reported by the World Bank, GDP in Sudan was worth 97.156 billion US dollars in 2015, representing 0.14 percent of the world economy. GDP in Sudan averaged 18.774 billion US dollars from 1960 until 2015, reaching an all-time high of 97.156 billion US dollars in 2015 and a record low of 1.307 billion US dollars in 1960.
According to [
The resulting high external and internal deficits, coupled with the sustained American sanctions and the security concerns in the country, weakened the economic situation and led to measures to supplement the budget, including a 29% devaluation of the currency and the removal of fuel subsidies worth SDG 3.6 billion (Sudanese pounds), about 1.2% of GDP, which resulted in riots. Economic linkages and value addition were weakened during the period of oil-driven growth (1999-2011), mainly in agriculture (which provided 47.6% of total jobs in 2011). The largest area of government expenditure may be the security services, though no official figures have been published.
In addition, high taxes along the supply chains, the recent increase in tariffs on imported inputs, and the high costs of energy and infrastructure services raised domestic resource costs and reduced domestic value addition. During 2001-2007, 41% of all factories closed because of intense competition.
After oil production began in the fields of southern Sudan from 1998 onward, the economy grew rapidly, reaching growth rates of 8% per annum. However, the fall in oil revenues after the secession of what is now South Sudan in 2011 greatly affected GDP growth, which was negative (−6%) in 2013.
GDP per capita at current prices was estimated at US$1985 for 2014, GDP at purchasing power parity (PPP) was estimated at 168 billion international dollars for 2015, and GDP per capita at PPP was estimated at 4522 international dollars for 2014 (see [
Sudan’s trade suffers from several difficulties, despite persistent efforts by the government to liberalize trade. Import restrictions, discriminatory taxes, delays in customs clearance and non-transparent regulations are some of the factors impeding Sudanese trade.
Sudan’s chief import commodities include manufactured goods, transport equipment, medicines and chemicals. According to CIA World Factbook reports for 2009, Sudan’s main export partners by share of total trade are the UAE (32%), China (16%) and Saudi Arabia (15.5%), while its main import partners are China (26.3%), the UAE (10%), India (9%), Egypt (5.6%) and Turkey (4.7%).
Time series are analyzed in order to understand the underlying structure and mechanism of the process that produces the observations. This section investigates the GDP statistics of Sudan, at current and constant prices in million US$, for the period 1960-2015.
Figures 1-3 show line graphs of GDP levels over the period under consideration. Overall, the graphs show a clear long-term upward trend, suggesting that the time series is non-stationary in levels. A summary of the descriptive statistics for GDP is given in
 | N | Minimum | Maximum | Mean | Std. Deviation |
---|---|---|---|---|---|
GDP in billion US$ | 56 | 1.307 | 97.156 | 18.77350 | 23.207303 |
Valid N (listwise) | 56 | | | | |
 | Sum of Squares | df | Mean Square | F | Sig. |
---|---|---|---|---|---|
Between Groups | 29621.840 | 55 | 538.579 | | |
Within Groups | 0.000 | 0 | | | |
Total | 29621.840 | 55 | | | |
GDP in billion US$.
ANOVA (b)
Model | | Sum of Squares | df | Mean Square | F | Sig. |
---|---|---|---|---|---|---|
1 | Regression | 18257.653 | 1 | 18257.653 | 86.756 | 0.000 (a) |
 | Residual | 11364.187 | 54 | 210.448 | | |
 | Total | 29621.840 | 55 | | | |
(a) Predictors: (Constant), year; (b) Dependent Variable: GDP in billion US$.
Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
---|---|---|---|---|
1 | 0.785 (a) | 0.616 | 0.609 | 14.506822 |
(a) Predictors: (Constant), year.
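The descriptive, ANOVA and model summary figures reported above are linked by simple identities, which can be checked with a few lines of arithmetic. The sketch below only recombines the reported numbers; the variable names are illustrative, and small differences from the tables are due to rounding in the published values.

```python
# Values copied from the descriptive statistics and regression ANOVA tables.
n_obs = 56
std_dev = 23.207303
ss_regression = 18257.653
ss_residual = 11364.187
ss_total = 29621.840
df_regression = 1
df_residual = 54

# Total sum of squares equals (N - 1) times the sample variance.
ss_total_from_sd = (n_obs - 1) * std_dev ** 2

# R square is the share of the total sum of squares explained by the regression.
r_squared = ss_regression / ss_total                      # table: 0.616
adjusted_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual  # table: 0.609

# F is the ratio of the regression and residual mean squares.
ms_residual = ss_residual / df_residual                   # table: 210.448
f_stat = (ss_regression / df_regression) / ms_residual    # table: 86.756

# The standard error of the estimate is the root of the residual mean square.
std_error_estimate = ms_residual ** 0.5                   # table: 14.506822

print(round(r_squared, 3), round(adjusted_r_squared, 3),
      round(f_stat, 3), round(std_error_estimate, 3))
```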
Coefficients (a)
Model | | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. |
---|---|---|---|---|---|---|
1 | (Constant) | −2201.505 | 238.381 | | −9.235 | 0.000 |
 | year | 1.117 | 0.120 | 0.785 | 9.314 | 0.000 |
(a) Dependent Variable: GDP in billion US$.
Model summary (R Square, F, df1, df2, Sig.) and parameter estimates (Constant, b1):
Equation | R Square | F | df1 | df2 | Sig. | Constant | b1 |
---|---|---|---|---|---|---|---|
Linear | 0.616 | 86.756 | 1 | 54 | 0.000 | −2201.505 | 1.117 |
Dependent Variable: GDP in billion US$. The independent variable is year.
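The linear equation in the table above can be evaluated for any year. This is a sketch under the assumption that the published coefficients, which are rounded to three decimals, are used as they stand; because the slope is multiplied by a year near 2000, fitted values are only approximate. The function name `linear_trend` is hypothetical.

```python
# Coefficients copied from the table: GDP = constant + b1 * year (billion US$).
CONSTANT = -2201.505
B1 = 1.117


def linear_trend(year):
    """Fitted GDP in billion US$ from the rounded linear trend coefficients."""
    return CONSTANT + B1 * year


fitted_2015 = linear_trend(2015)  # roughly 49.25 billion US$
```

The fitted value for 2015 lies well below the observed 97.156 billion US$ reported for that year, which illustrates how a straight line understates the most recent levels of a series that rises faster towards the end of the sample.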
Model summary (R Square, F, df1, df2, Sig.) and parameter estimates (Constant, b1, b2):
Equation | R Square | F | df1 | df2 | Sig. | Constant | b1 | b2 |
---|---|---|---|---|---|---|---|---|
Linear | 0.616 | 86.756 | 1 | 54 | 0.000 | −2201.505 | 1.117 | |
Logarithmic | 0.614 | 85.776 | 1 | 54 | 0.000 | −16805.735 | 2215.325 | |
Quadratic | 0.619 | 87.750 | 1 | 54 | 0.000 | −1093.851 | 0.000 | 0.000 |
Exponential | 0.897 | 472.242 | 1 | 54 | 0.000 | 2.84E−059 | 0.069 | |
Dependent Variable: GDP in billion US$. The independent variable is year.
described in
When building a time series model, it is necessary to include lag values that have large positive or large negative autocorrelations. The partial autocorrelation is the autocorrelation of time series observations separated by a lag of k time units with the effects of the intervening observations eliminated. Autocorrelation and partial autocorrelation tables are also provided for the residuals (errors) between the actual and predicted values of the time series. The proportion of variance explained by the model is the best single measure of how well the predicted values match the original values. If the predicted values exactly matched the original values, the model would explain 100% of the variance. In practice this is rarely the case; here the model explains 61.6% of the variance (the R-squared value), as seen in
 | | | Model Type |
---|---|---|---|
Model ID | GDP in billion US$ | Model_1 | ARIMA (0, 0, 0) |
Fit statistic | Value |
---|---|
Stationary R-squared | 0.616 |
R-squared | 0.616 |
RMSE | 14.507 |
MAPE | 154.318 |
MaxAPE | 1014.104 |
MAE | 10.755 |
MaxAE | 47.662 |
Only one model was fitted, so each statistic takes a single value across the summary columns of the source table (Mean, SE, Minimum, Maximum, Percentiles 5, 10, 25, 50, 75, 90, 95).
Model | Number of Predictors | Stationary R-squared | Ljung-Box Q (18) Statistics | Ljung-Box Q (18) DF | Ljung-Box Q (18) Sig. | Number of Outliers |
---|---|---|---|---|---|---|
GDP in billion US$-Model_1 | 1 | 0.616 | 181.207 | 18 | 0.000 | 0 |
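The fit statistics reported for the model (RMSE, MAPE, MaxAPE, MAE and MaxAE) can be computed from actual and predicted values as sketched below. The function name, the `n_params` argument and the toy numbers are assumptions for illustration. Statistical packages differ on the RMSE divisor; the reported RMSE of 14.507 equals the square root of the residual mean square, 11364.187 / 54, so the sketch lets the divisor be reduced by the number of estimated parameters.

```python
def fit_statistics(actual, predicted, n_params=0):
    """Common forecast accuracy measures.

    n_params is subtracted from the sample size in the RMSE divisor
    (0 gives the plain root mean squared error). Percentage errors
    require every actual value to be non-zero.
    """
    errors = [a - p for a, p in zip(actual, predicted)]
    abs_err = [abs(e) for e in errors]
    pct_err = [100.0 * abs(e) / abs(a) for e, a in zip(errors, actual)]
    n = len(errors)
    return {
        "RMSE": (sum(e * e for e in errors) / (n - n_params)) ** 0.5,
        "MAE": sum(abs_err) / n,
        "MaxAE": max(abs_err),
        "MAPE": sum(pct_err) / n,
        "MaxAPE": max(pct_err),
    }


# Hypothetical toy values, not the GDP series.
stats = fit_statistics(actual=[2.0, 4.0, 5.0, 10.0],
                       predicted=[1.0, 5.0, 5.0, 8.0])
```

Percentage measures such as MAPE become very large when some actual values are close to zero, which is one reason a series that starts at low levels can show a MAPE above 100% alongside a moderate R-squared.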
Examining the autocorrelation table shown in
The autocorrelation ACF (
Lag | Autocorrelation | Std. Error (a) | Box-Ljung Statistic Value | df | Sig. (b) |
---|---|---|---|---|---|
1 | 0.056 | 0.131 | 0.182 | 1 | 0.670 |
2 | 0.060 | 0.130 | 0.392 | 2 | 0.822 |
3 | −0.020 | 0.129 | 0.416 | 3 | 0.937 |
4 | 0.141 | 0.128 | 1.630 | 4 | 0.803 |
5 | −0.181 | 0.126 | 3.675 | 5 | 0.597 |
6 | −0.050 | 0.125 | 3.839 | 6 | 0.699 |
7 | −0.070 | 0.124 | 4.162 | 7 | 0.761 |
8 | 0.174 | 0.122 | 6.173 | 8 | 0.628 |
9 | −0.034 | 0.121 | 6.250 | 9 | 0.715 |
10 | 0.102 | 0.120 | 6.978 | 10 | 0.728 |
11 | −0.161 | 0.118 | 8.833 | 11 | 0.637 |
12 | 0.025 | 0.117 | 8.880 | 12 | 0.713 |
13 | 0.042 | 0.116 | 9.012 | 13 | 0.772 |
14 | −0.143 | 0.114 | 10.574 | 14 | 0.719 |
15 | −0.313 | 0.113 | 18.229 | 15 | 0.251 |
16 | −0.064 | 0.112 | 18.561 | 16 | 0.292 |
Series: GDP in billion US$. (a) The underlying process assumed is independence (white noise); (b) Based on the asymptotic chi-square approximation.
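The Box-Ljung column of the autocorrelation table can be approximately reproduced from the autocorrelations themselves using the standard formula Q(m) = n(n + 2) Σ r_k² / (n − k), summed over k = 1, ..., m. In the sketch below, the series length n = 55 is an assumption: it is the value that reproduces the lag-1 statistic (0.182) from the tabulated autocorrelation. Because the tabulated autocorrelations are rounded to three decimals, later values agree with the table only approximately.

```python
def ljung_box(acf_values, n):
    """Cumulative Ljung-Box Q statistics for lags 1..len(acf_values)."""
    running, out = 0.0, []
    for k, r in enumerate(acf_values, start=1):
        running += r * r / (n - k)
        out.append(n * (n + 2) * running)
    return out


# Autocorrelations copied from the table (lags 1 to 16).
acf_table = [0.056, 0.060, -0.020, 0.141, -0.181, -0.050, -0.070, 0.174,
             -0.034, 0.102, -0.161, 0.025, 0.042, -0.143, -0.313, -0.064]

N_ASSUMED = 55  # assumption, see the lead-in above
q_stats = ljung_box(acf_table, N_ASSUMED)
print([round(q, 3) for q in q_stats])
```

Small Q values relative to their degrees of freedom, as in the table, indicate no evidence against the white noise assumption stated in note (a).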
Lag | Partial Autocorrelation | Std. Error |
---|---|---|
1 | 0.056 | 0.135 |
2 | 0.057 | 0.135 |
3 | −0.027 | 0.135 |
4 | 0.141 | 0.135 |
5 | −0.199 | 0.135 |
6 | −0.042 | 0.135 |
7 | −0.039 | 0.135 |
8 | 0.168 | 0.135 |
9 | −0.002 | 0.135 |
10 | 0.072 | 0.135 |
11 | −0.193 | 0.135 |
12 | −0.026 | 0.135 |
13 | 0.133 | 0.135 |
14 | −0.191 | 0.135 |
15 | −0.236 | 0.135 |
16 | −0.108 | 0.135 |
Series: GDP in billion US$.
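A common way to read the partial autocorrelation table is to compare each value with approximate 95% bounds of ±1.96 standard errors. The multiplier 1.96 is the usual normal approximation and is an assumption of this sketch, not something stated in the table.

```python
# Partial autocorrelations copied from the table (lags 1 to 16).
pacf_table = [0.056, 0.057, -0.027, 0.141, -0.199, -0.042, -0.039, 0.168,
              -0.002, 0.072, -0.193, -0.026, 0.133, -0.191, -0.236, -0.108]
STD_ERROR = 0.135  # the same standard error is reported for every lag

bound = 1.96 * STD_ERROR  # approximate 95% bound under a normal approximation
significant_lags = [lag for lag, value in enumerate(pacf_table, start=1)
                    if abs(value) > bound]
print(significant_lags)  # → []
```

Under this rule none of the sixteen partial autocorrelations falls outside the bounds; the largest in absolute value is −0.236 at lag 15.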
Lag | Cross Correlation | Std. Error (a) |
---|---|---|
−7 | 0.603 | 0.143 |
−6 | 0.631 | 0.141 |
−5 | 0.658 | 0.140 |
−4 | 0.685 | 0.139 |
−3 | 0.711 | 0.137 |
−2 | 0.737 | 0.136 |
−1 | 0.761 | 0.135 |
0 | 0.785 | 0.134 |
1 | 0.678 | 0.135 |
2 | 0.587 | 0.136 |
3 | 0.507 | 0.137 |
4 | 0.431 | 0.139 |
5 | 0.352 | 0.140 |
6 | 0.274 | 0.141 |
7 | 0.211 | 0.143 |
Series Pair: GDP in billion US$ with year. (a) Based on the assumption that the series are not cross correlated and that one of the series is white noise.
Thus, if we rely on this information, we may conclude that we have a good fit. From
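$y_i = \beta_0 + \beta_1 x_i$, or $\text{GDP} = -2201.505 + 1.117\,x_i$, with standard error (0.120) for $\beta_1$, where $x_i$ is the year; 9.314 is the t statistic of $\beta_1$ and 0.785 is its standardized coefficient.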
Based on the forecasting model results, the forecast values for Sudan GDP (in billion US$) are 99.51 for 2017, 101 for 2018, 106.58 for 2019 and 112.62 for 2020. The annual growth rates are estimated to be about 5.3%, 5.36%, 5.52% and 5.67% for these years, respectively.
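Year-over-year growth rates can be derived from consecutive forecast levels as (current / previous − 1) × 100. The sketch below applies this to the forecast values quoted above; the variable names are illustrative. The implied rates for 2019 and 2020 (about 5.52% and 5.67%) agree with the stated figures, whereas the rate implied for 2018 (about 1.5%) does not match the stated 5.36%, which may reflect rounding or a different base in the underlying output. The 2017 rate cannot be derived because no 2016 level is quoted.

```python
# Forecast GDP levels in billion US$, as quoted in the text.
forecast = {2017: 99.51, 2018: 101.0, 2019: 106.58, 2020: 112.62}

implied_growth = {}
years = sorted(forecast)
for prev_year, year in zip(years, years[1:]):
    # Percentage change relative to the previous year's level.
    implied_growth[year] = (forecast[year] / forecast[prev_year] - 1.0) * 100.0

for year, rate in implied_growth.items():
    print(year, round(rate, 2))
```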
We evaluate an autoregressive integrated moving average (ARIMA) model of the GDP series using the Box-Jenkins methodology with four different equations: linear, logarithmic, quadratic and exponential. We also successively eliminated the AR or the MA term while leaving the other term in, but still obtained higher values for all test parameters. Based on the parameter values, ARIMA (0, 0, 0) is the best model for the data, and it was selected as the final model in comparison with the other models. We provide a method for prediction and forecasting based on the data, which may be applicable and useful to government and business institutions.
Forecasts of the Sudan GDP annual growth rate, projected using an autoregressive integrated moving average (ARIMA) model, are 4.9 for 2017 and 4.9 for 2020, using analysis expectations. We model the past behavior of the Sudan GDP annual growth rate using historical data and adjust the coefficients of the econometric model by taking into account analysis assessments and future expectations. Time series are complex because each observation depends to some extent on the previous observation, and is often influenced by more than one previous observation. Random error also carries over from one observation to another. These influences are called autocorrelation, that is, dependent relationships between successive observations of the same variable. The challenge of time series analysis is to extract the autocorrelation elements of the data, either to understand the trend itself or to model the underlying mechanisms.
A word of caution about using multiple regression techniques with time series data: because time series are autocorrelated, they violate the assumption of independence of errors. Type I error rates increase substantially when autocorrelation is present. Also, inherent patterns in the data may dampen or enhance the effect of an intervention; in time series analysis, these patterns are accounted for within the analysis.
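One simple way to check regression residuals for first-order autocorrelation is the Durbin-Watson statistic. It is not used in this paper and is shown here only as an illustration of the caution above; the function name and the toy residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by their sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


# Hypothetical residual patterns.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # 3.0, negative autocorrelation
dw_persistent = durbin_watson([1.0, 1.0, -1.0, -1.0])   # 1.0, positive autocorrelation
```

Values near 2 suggest little first-order autocorrelation, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation.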
This article has discussed the analysis of GDP statistics for the Sudan. The ARIMA method used here may be appropriate only for a time series that is stationary (i.e., its mean, variance, and autocorrelation should be approximately constant through time), and it is recommended that the input data contain at least 50 observations (the underlying model has 55 observations). It is also assumed that the values of the estimated parameters are constant throughout the series. The article has discussed changes in GDP for the period 1960-2015. The results of the analysis indicate that the model provides useful information for identifying the GDP trend. An important policy consideration arising from the study is that the data show an increasing trend. More advanced future work can build on these investigations, particularly in residual analysis of the model.
The author declares no conflicts of interest regarding the publication of this paper.
Elsayir, H.A. (2018) An Econometric Time Series GDP Model Analysis: Statistical Evidences and Investigations. Journal of Applied Mathematics and Physics, 6, 2635-2649. https://doi.org/10.4236/jamp.2018.612219