North Dakota’s oil production has increased rapidly over the past several years. The state’s output in March 2013 was more than twice that of March 2011, and the estimated reserves of the Bakken Formation have been reported to be very large compared with those of the United Arab Emirates. This raises the question of how much oil can actually be extracted with currently available technologies. To answer this question, this paper forecasts future oil development trends in North Dakota using the Seasonal Autoregressive Integrated Moving Average (S-ARIMA) model. Nonstationarity arising from a stochastic trend and an abrupt structural change in the oil industry was a potentially serious problem, but the Quandt Likelihood Ratio (QLR) test identified break points, which allowed us to select a model-fitting period suitable for the S-ARIMA method and to obtain accurate statistical inference for the historical period. The seven major oil producing counties were investigated to determine whether the current oil boom is consistent across all oil fields in North Dakota. Empirical estimates show that North Dakota’s oil production will more than double in the next five years. What we can predict with great certainty is that North Dakota’s influence over domestic and global oil supply systems will increase in the near future, especially over the next five to six years. This is good news for those who are concerned about domestic energy security in the USA.
North Dakota is one of the lower 48 states of the USA. Historically, its economy had been highly dependent on agriculture, which accounted for over 87% of land cover in the state in 2007―more than 16 million hectares [
LeFever and Helms [
North Dakota produced approximately 24.2 million bbl. of oil in March 2013―more than doubling the 11.1 million bbl. produced in March 2011. The state’s oil production has been rapidly increasing during the past several years. North Dakota now accounts for 10% of the total USA crude oil production, rivaling the production levels of Texas and the USA Federal Offshore region [
Autoregressive Integrated Moving Average (ARIMA) and Seasonal Autoregressive Integrated Moving Average (S-ARIMA) techniques have been broadly applied to forecast how variables change over time. These techniques typically use (seasonal) autoregressive terms, (seasonal) moving average terms, and/or (seasonal) differencing to forecast the changes of a time series. They treat the preceding values of a variable and their associated error terms as the essential information for forecasting its future values. Given a large time series dataset, ARIMA and S-ARIMA methods show high forecast accuracy. Forecasting analyses in a variety of fields such as electricity demand, wheat prices, inflation, unemployment, reliability and fishery landings have demonstrated the validity of ARIMA and/or S-ARIMA models [
Numerous studies have used the ARIMA or S-ARIMA models to forecast oil prices, as well as production or consumption levels. For example, Ayeni and Pilat [
Rapidly increasing oil production in North Dakota has focused world attention on the Bakken Formation, especially in light of increasing oil prices in recent years and quickly increasing oil demand in emerging markets due to rapid economic growth and industrialization. In the near future, North Dakota will be one of the largest oil production regions in the world, having a significant effect on domestic and international production levels and prices.
The objectives of this paper are two-fold: 1) to forecast oil development and production for each of North Dakota’s major and minor oil producing counties through January 2020, as well as for North Dakota as a whole; 2) to determine whether ongoing oil booms are consistent across all the regions of North Dakota. Empirical results of this study will be useful to federal officials tasked with creating a national energy security plan, as well as for state officials with responsibilities related to the energy and mining sectors of North Dakota.
New Drilling Technology and Oil Production in the Bakken Formation
The Bakken Formation, located in the Williston Basin, is the richest and most productive oil reservoir in North Dakota. Initial oil production in the Bakken Formation began in the 1950s, but the Mission Canyon, Spearfish and other formations within the Williston Basin received more attention at that time because of their much higher productivity. However, despite their highly productive wells, these formations produced a small total volume because they covered relatively small portions of the Williston Basin. Because porosity and permeability in the Bakken Formation were not conducive to oil development given the technology of the time, oil extraction was generally limited to sites with natural fractures. By the 1980s, a combination of vertical and horizontal drilling technologies made natural fracture networks easier to locate, permitting temporarily increased oil production until the early 1990s. Oil production in the Bakken Formation then stagnated for a decade due to low oil prices and over-saturation of oil developments at natural fracture sites. Since the mid-2000s, however, the introduction of horizontal drilling with hydraulic fracturing has instigated an oil boom in North Dakota [
The oil counties in North Dakota are divided into two groups―major oil producing counties and minor oil producing counties. The major oil producing counties include McKenzie, Mountrail, Williams and Dunn, while the minor oil producing counties include Divide, Bowman and Burke.
The varied levels of productivity among these counties within the Bakken Formation are a function of several interacting factors, including the location of ground water, the extent and depth of oil source rocks, and the number of oil wells currently developed [
The original data for this research were downloaded from the online databases of North Dakota Drilling and Production Statistics, provided by the Oil and Gas Division of the North Dakota Industrial Commission’s Department of Mineral Resources [
Stock and Watson [
Additionally, serial correlation was present in the data, so the technique of differencing was applied in the S-ARIMA model. Gujarati and Porter [
S-ARIMA models were used to forecast oil production and to investigate whether the oil production trend in North Dakota is consistent across the counties in the assorted groups. According to Pindyck and Rubinfeld [
where p is the number of autoregressive terms; d is an integer indicating how many times the series must be differenced to achieve stationarity; q is the number of moving average terms; P is the number of seasonal autoregressive terms; D is the number of seasonal differences needed to achieve stationarity; Q is the number of seasonal moving average terms; and s denotes the length of the seasonal period (12 months for these data).
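For reference, the multiplicative S-ARIMA (p, d, q) (P, D, Q)s model that these orders describe is usually written in backshift notation as shown here. This is the standard textbook form, not a reproduction of the authors' own equation. The symbols B (backshift operator, with B y_t = y_{t-1}), y_t (monthly oil production) and ε_t (white-noise error) are notation assumed for this sketch, any constant or drift term is omitted, and sign conventions for the moving average polynomials vary between texts.

```latex
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\, y_t
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t ,
\qquad
\begin{aligned}
\phi_p(B) &= 1 - \phi_1 B - \cdots - \phi_p B^{p}, &
\Phi_P(B^{s}) &= 1 - \Phi_1 B^{s} - \cdots - \Phi_P B^{Ps},\\
\theta_q(B) &= 1 + \theta_1 B + \cdots + \theta_q B^{q}, &
\Theta_Q(B^{s}) &= 1 + \Theta_1 B^{s} + \cdots + \Theta_Q B^{Qs}.
\end{aligned}
```

With s = 12, an ARIMA (2, 1, 0) (1, 0, 0)s model, for instance, reduces under the same assumptions to

```latex
(1 - \phi_1 B - \phi_2 B^{2})\,(1 - \Phi_1 B^{12})\,(1 - B)\, y_t = \varepsilon_t .
```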
Counties | Observations | Mean | Std. Dev. | Minimum | Maximum |
---|---|---|---|---|---|
Major oil production group | | | | | |
McKenzie | 528 | 0.965 | 1.255 | 0.309 | 8.554 |
Mountrail | 528 | 0.596 | 1.540 | 0.011 | 7.368 |
Williams | 528 | 0.576 | 0.752 | 0.240 | 4.379 |
Dunn | 528 | 0.381 | 0.824 | 0 | 5.069 |
Minor oil production group | | | | | |
Divide | 528 | 0.112 | 0.202 | 0.012 | 0.291 |
Bowman | 528 | 0.384 | 0.398 | 0.047 | 1.541 |
Burke | 528 | 0.102 | 0.083 | 0.041 | 5.348 |
Major group | 528 | 2.518 | 4.258 | 0.684 | 24.880 |
Minor group | 528 | 0.598 | 0.542 | 0.154 | 2.457 |
North Dakota | 528 | 4.259 | 4.679 | 1.497 | 29.328 |

The S-ARIMA model is the product of a non-seasonal part and a seasonal part, and it can eliminate seasonally unstable effects (i.e. nonstationarity) through differencing. The modeling in this paper proceeded in three steps. First, we assessed whether the S-ARIMA model was appropriate for the data by analyzing plots of the autocorrelation and partial autocorrelation functions, the Akaike Information Criterion, and the QLR test described above. Second, we selected the S-ARIMA models with the lowest root mean square error as a measure of forecasting accuracy, and compared the p-values of the estimated coefficients with the 10%, 5% and 1% significance levels. Third, we forecasted future oil production with the estimated models.
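The differencing that removes trend and seasonal nonstationarity can be illustrated with a minimal sketch. The function names and the toy series are hypothetical and are not taken from the paper; the sketch only shows what regular (lag 1) and seasonal (lag s = 12) differencing do to a monthly series.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sarima_difference(series, d=1, D=1, s=12):
    """Apply D seasonal differences (lag s), then d regular differences (lag 1)."""
    out = list(series)
    for _ in range(D):
        out = difference(out, lag=s)
    for _ in range(d):
        out = difference(out, lag=1)
    return out


# Toy monthly series (made up): linear trend plus a repeating 12-month pattern.
pattern = [0, 1, 3, 2, 0, -1, -2, -1, 0, 2, 1, 0]
toy = [0.5 * t + pattern[t % 12] for t in range(48)]

# One seasonal difference removes the repeating pattern and leaves the
# constant 12-month trend increment; adding a regular difference removes that too.
seasonal_only = sarima_difference(toy, d=0, D=1, s=12)
both = sarima_difference(toy, d=1, D=1, s=12)
```

Each seasonal difference shortens the series by s observations and each regular difference by one, which is one reason a reasonably long fitting period is needed.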
The statistical results of the QLR tests to check for structural changes in the oil development trend are summarized in
The empirical estimation results of the S-ARIMA models of oil production for each county, for each production group, and for North Dakota as a whole are shown in Tables 3-5. The regression results in the three tables show model type, estimated model parameters, mean absolute error (MAE), R² (goodness of fit), Ljung-Box Chi-Square test for error autocorrelation (lag 2), Augmented Dickey-Fuller (ADF) test for trend, and Seasonal Dickey-Fuller (SDF) test for trend.
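Two of the diagnostics listed above can be sketched in a few lines. This is an illustrative implementation of the usual textbook definitions of the mean absolute error and the Ljung-Box Q statistic, not the authors' code; the function names are hypothetical, and the residual series in the example is made up.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between observed and fitted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((r - mean) ** 2 for r in residuals)
    num = sum(
        (residuals[t] - mean) * (residuals[t - lag] - mean)
        for t in range(lag, n)
    )
    return num / denom


def ljung_box_q(residuals, max_lag=2):
    """Ljung-Box Q = n(n+2) * sum_{k=1..max_lag} rho_k^2 / (n - k)."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k)
        for k in range(1, max_lag + 1)
    )


# Made-up residuals that alternate in sign, i.e. strongly autocorrelated.
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(20)]
q_stat = ljung_box_q(alternating, max_lag=2)
```

Q is compared with a chi-square critical value (about 5.99 at the 5% level for two degrees of freedom, before any adjustment for estimated parameters). A small Q, as reported in the tables, means the null hypothesis of no residual autocorrelation is not rejected.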
The S-ARIMA forecasts results for two production groups and for the state are given in
Major group | Break point | QLR test statistic | Minor group | Break point | QLR test statistic |
---|---|---|---|---|---|
McKenzie | August 2006 | 14.87 (<0.001) | Divide | September 2006 | 5.19 (0.015) |
Mountrail | September 2006 | 13.02 (<0.001) | Bowman | July 2006 | 25.98 (<0.001) |
Williams | September 2006 | 10.15 (<0.001) | Burke | July 2006 | 6.29 (<0.001) |
Dunn | September 2006 | 9.16 (<0.001) | | | |
Major group | September 2006 | 10.6 (<0.001) | Minor group | January 2004 | 7.34 (<0.001) |
North Dakota | January 2006 | 7.65 (<0.001) | | | |
Note: The null hypothesis of the QLR test is no break; the p-values are in parentheses.
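As a rough illustration of how a QLR (sup-F) statistic locates a break date, the sketch below scans candidate break points and keeps the largest Chow F-statistic. It is deliberately simplified: it tests only for a one-time shift in the mean, whereas the regression specification behind the statistics in the table is not shown in this text. The 15% trimming, the function names and the toy series are all assumptions made for the example.

```python
def rss_about_mean(values):
    """Residual sum of squares of a series around its own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def qlr_mean_shift(series, trim=0.15):
    """Return (break_index, sup_F) for a one-time shift in the mean.

    For each candidate break index in the central part of the sample,
    a Chow F-statistic compares one common mean (restricted model) with
    separate means before and after the break (unrestricted model).
    The QLR statistic is the largest of these F-statistics.
    """
    n = len(series)
    first = max(2, int(n * trim))
    last = min(n - 2, int(n * (1 - trim)))
    rss_restricted = rss_about_mean(series)
    best_index, best_f = None, float("-inf")
    for k in range(first, last + 1):
        rss_unrestricted = rss_about_mean(series[:k]) + rss_about_mean(series[k:])
        if rss_unrestricted == 0:
            continue  # degenerate perfect fit; F is undefined
        # One restriction; the unrestricted model has two parameters.
        f_stat = (rss_restricted - rss_unrestricted) / (rss_unrestricted / (n - 2))
        if f_stat > best_f:
            best_index, best_f = k, f_stat
    return best_index, best_f


# Toy series (made up): level 1.0 for 30 periods, then level 3.0, plus small noise.
noise = [0.1 * ((i * 7) % 5 - 2) for i in range(30)]
toy = [1.0 + e for e in noise] + [3.0 + e for e in noise]
break_index, sup_f = qlr_mean_shift(toy)
```

Because the statistic is a maximum over many F-statistics, it is not compared with ordinary F critical values; special QLR critical values that depend on the trimming are used instead.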
Item | Minor group | Major group | North Dakota |
---|---|---|---|
Model type | ARIMA (0, 1, 0) (0, 1, 1)s | ARIMA (2, 1, 0) (1, 0, 0)s | ARIMA (2, 1, 0) (1, 0, 0)s |
AR (1) | N/A | −0.03 (0.10) | −0.14 (0.10) |
AR (2) | N/A | 0.19* (0.11) | 0.20* (0.11) |
SAR (12)/SMA (12) | 0.86*** (0.14) | 0.69*** (0.11) | 0.72*** (0.10) |
No. of Obs. | 121 | 89 | 97 |
MAE | 53133 | 313431 | 347598 |
R² | 0.959 | 0.996 | 0.996 |
Ljung-Box χ² test for error autocorrelation | No at 5% | No at 5% | No at 5% |
ADF test for trend | No at 1% | No at 1% | No at 1% |
SDF test for trend | No at 1% | No at 1% | No at 1% |
Note: AR (1) is autoregressive lag 1; AR (2) is autoregressive lag 2; SAR (12) is seasonal autoregressive lag 12; SMA (12) is seasonal moving average lag 12; the standard errors are in parentheses; *, ** and *** indicate significance at 10%, 5% and 1%, respectively; the null hypothesis of Ljung-Box Chi-Square test (lag 2) is that the autocorrelations of lag 1 through 2 in the prediction error are zero; the ADF test and SDF test have the null hypothesis of a unit root, that is, a stochastic trend; “N/A” means that the results are not available.
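To show how an ARIMA (2, 1, 0) (1, 0, 0)s model of the kind reported for the major group and for the state turns into point forecasts, here is a minimal recursive sketch. It assumes no constant term and the sign convention given in the docstring, neither of which is confirmed by the text. The function name and the toy history are hypothetical; the coefficients in the example echo the North Dakota column purely for illustration, so the output is not a reproduction of the paper's forecasts.

```python
def forecast_sarima_210_100(history, phi1, phi2, seasonal_phi, steps, s=12):
    """Point forecasts for ARIMA(2,1,0)(1,0,0)_s without a constant.

    Assumed model:
        (1 - phi1*B - phi2*B^2)(1 - seasonal_phi*B^s)(1 - B) y_t = e_t
    so the first difference w_t = y_t - y_{t-1} satisfies
        w_t = phi1*w[t-1] + phi2*w[t-2] + seasonal_phi*w[t-s]
              - phi1*seasonal_phi*w[t-s-1] - phi2*seasonal_phi*w[t-s-2] + e_t.
    Future shocks e_t are replaced by their expected value, zero.
    """
    if len(history) < s + 3:
        raise ValueError("need at least s + 3 observations")
    levels = list(history)
    diffs = [levels[t] - levels[t - 1] for t in range(1, len(levels))]
    for _ in range(steps):
        next_diff = (
            phi1 * diffs[-1]
            + phi2 * diffs[-2]
            + seasonal_phi * diffs[-s]
            - phi1 * seasonal_phi * diffs[-s - 1]
            - phi2 * seasonal_phi * diffs[-s - 2]
        )
        diffs.append(next_diff)
        levels.append(levels[-1] + next_diff)  # undo the first difference
    return levels[len(history):]


# Made-up history of 36 monthly values with a mild upward drift.
toy_history = [20.0 + 0.3 * t + 0.5 * (t % 12 == 6) for t in range(36)]
toy_forecast = forecast_sarima_210_100(
    toy_history, phi1=-0.14, phi2=0.20, seasonal_phi=0.72, steps=12
)
```

The seasonal autoregressive term carries part of last year's month-to-month change forward into the same month of the forecast year, which is how a positive SAR (12) coefficient lets a recent upward trend persist in the forecasts.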
Item | McKenzie | Mountrail | Williams | Dunn |
---|---|---|---|---|
Model type | ARIMA (0, 1, 0) (1, 0, 0)s | ARIMA (2, 1, 0) (0, 1, 1)s | ARIMA (0, 1, 0) (1, 0, 0)s | ARIMA (0, 1, 0) (1, 0, 0)s |
AR (1) | N/A | −0.02 (0.14) | N/A | N/A |
AR (2) | N/A | 0.23** (0.11) | N/A | N/A |
SAR (12)/SMA (12) | 0.70*** (0.12) | 0.55*** (0.14) | 0.32*** (0.11) | 0.66*** (0.14) |
No. of Obs. | 90 | 89 | 89 | 89 |
MAE | 105222 | 129687 | 83597 | 79041 |
R² | 0.995 | 0.994 | 0.991 | 0.992 |
Ljung-Box χ² test for error autocorrelation | No at 5% | No at 5% | No at 5% | No at 5% |
ADF test for trend | No at 1% | No at 1% | No at 1% | No at 1% |
SDF test for trend | No at 1% | No at 1% | No at 1% | No at 1% |
Note: AR (1) is autoregressive lag 1; AR (2) is autoregressive lag 2; SAR (12) is seasonal autoregressive lag 12; SMA (12) is seasonal moving average lag 12; the standard errors are in parentheses; *, ** and *** indicate significance at 10%, 5% and 1%, respectively; the null hypothesis of Ljung-Box Chi-Square test (lag 2) is that the autocorrelations of lag 1 through 2 in the prediction error are zero; the ADF test and SDF test have the null hypothesis of a unit root, that is, a stochastic trend; “N/A” means that the results are not available.
In
The estimated models have been used to forecast oil production in each county (models from
Item | Divide | Bowman | Burke |
---|---|---|---|
Model type | ARIMA (2, 1, 0) (0, 1, 1)s | ARIMA (2, 1, 0) (0, 1, 0)s | ARIMA (2, 1, 0) (0, 1, 1)s |
AR (1) | 0.39*** (0.11) | −0.30** (0.11) | −0.06 (0.11) |
AR (2) | −0.32*** (0.12) | −0.18 (0.11) | 0.31*** (0.11) |
SMA (12) | 0.37*** (0.14) | N/A | 0.71*** (0.24) |
No. of Obs. | 89 | 91 | 91 |
MAE | 29413 | 38122 | 12685 |
R² | 0.985 | 0.967 | 0.974 |
Ljung-Box χ² test for error autocorrelation | No at 5% | No at 5% | No at 5% |
ADF test for trend | No at 1% | No at 1% | No at 1% |
SDF test for trend | No at 1% | No at 1% | No at 1% |
Note: AR (1) is autoregressive lag 1; AR (2) is autoregressive lag 2; SMA (12) is seasonal moving average lag 12; the standard errors are in parentheses; *, ** and *** indicate significance at 10%, 5% and 1%, respectively; the null hypothesis of Ljung-Box Chi-Square test (lag 2) is that the autocorrelations of lag 1 through 2 in the prediction error are zero; the ADF test and SDF test have the null hypothesis of a unit root, that is, a stochastic trend; “N/A” means that the results are not available.
Time | Divide | Bowman | Burke | McKenzie | Mountrail | Williams | Dunn | North Dakota |
---|---|---|---|---|---|---|---|---|
Jul. 2014 | 1.41 | 0.70 | 0.48 | 9.84 | 7.63 | 4.03 | 5.05 | 31.62 |
Jan. 2015 | 1.40 | 0.69 | 0.49 | 10.39 | 7.85 | 3.95 | 4.97 | 32.93 |
Jul. 2015 | 1.69 | 0.68 | 0.53 | 11.41 | 8.54 | 3.98 | 5.40 | 34.87 |
Jan. 2016 | 1.67 | 0.67 | 0.54 | 11.80 | 8.76 | 3.95 | 5.35 | 35.80 |
Jul. 2016 | 1.96 | 0.66 | 0.58 | 12.52 | 9.46 | 3.96 | 5.63 | 37.20 |
Jan. 2017 | 1.94 | 0.65 | 0.59 | 12.79 | 9.67 | 3.95 | 5.60 | 37.87 |
Jul. 2017 | 2.23 | 0.64 | 0.63 | 13.30 | 10.37 | 3.96 | 5.79 | 38.87 |
Jan. 2018 | 2.21 | 0.63 | 0.64 | 13.49 | 10.59 | 3.95 | 5.77 | 39.35 |
Jul. 2018 | 2.50 | 0.62 | 0.68 | 13.84 | 11.28 | 3.96 | 5.90 | 40.07 |
Jan. 2019 | 2.48 | 0.61 | 0.69 | 13.98 | 11.50 | 3.95 | 5.88 | 40.42 |
Jul. 2019 | 2.77 | 0.59 | 0.73 | 14.23 | 12.20 | 3.96 | 5.97 | 40.93 |
Jan. 2020 | 2.76 | 0.59 | 0.74 | 14.32 | 12.42 | 3.95 | 5.96 | 41.18 |
41.18 million bbl. in January 2020, which is a 40% increase relative to January 2014. It also amounts to 19% of current total U.S. crude oil production (222.224 million bbl.) and 1.8% of world oil production (2.29 billion bbl.) in March 2013 [
For each North Dakota county in the Bakken Formation, excluding Bowman County, oil production will continue to increase―especially in McKenzie and Mountrail Counties, where production is predicted to increase sharply.
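The county-level direction of the forecasts can be checked directly from the first and last rows of the forecast table. The short sketch below computes the percentage change between the July 2014 and January 2020 forecasts; the values are copied from that table, while the variable and function names are only illustrative.

```python
# Forecast values (million bbl.) copied from the forecast table:
# first row (Jul. 2014) and last row (Jan. 2020).
jul_2014 = {
    "Divide": 1.41, "Bowman": 0.70, "Burke": 0.48, "McKenzie": 9.84,
    "Mountrail": 7.63, "Williams": 4.03, "Dunn": 5.05, "North Dakota": 31.62,
}
jan_2020 = {
    "Divide": 2.76, "Bowman": 0.59, "Burke": 0.74, "McKenzie": 14.32,
    "Mountrail": 12.42, "Williams": 3.95, "Dunn": 5.96, "North Dakota": 41.18,
}


def percent_change(start, end):
    """Percentage change from start to end."""
    return 100.0 * (end - start) / start


changes = {
    name: percent_change(jul_2014[name], jan_2020[name]) for name in jul_2014
}
absolute_gains = {name: jan_2020[name] - jul_2014[name] for name in jul_2014}

for name in sorted(changes, key=changes.get, reverse=True):
    print(f"{name:13s} {changes[name]:+6.1f}%  ({absolute_gains[name]:+.2f} million bbl.)")
```

On these tabulated values, McKenzie and Mountrail show the largest absolute gains among the counties, Bowman declines, and Williams stays close to 4 million bbl. over the forecast horizon.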
The purpose of this paper was to forecast how much oil can be produced in North Dakota over the next five years using the S-ARIMA model. Nonstationarity arising from a stochastic trend and an abrupt structural change in the oil industry was a potentially serious problem for our monthly oil production time series. Through the QLR test, we found break points, which allowed us to select a model-fitting period suitable for the S-ARIMA method, giving accurate statistical inference for the historical period and, we hope, good forecasts of future production.
The forecasting results will be useful to the federal government in planning for domestic energy security, and to oil producers and state and local governments in the state of North Dakota as they plan production and infrastructure. The oil production forecasts were produced using separate time series data and models for North Dakota as a whole, for the major and minor oil production groups and for each of the groups’ constituent counties. Excluding Bowman County, the oil development trends for the individual counties of the North Dakota Bakken Formation―i.e. Burke, Divide, Dunn, McKenzie, Mountrail and Williams―and for North Dakota as a whole are consistently increasing and the overall trend is highly likely to continue.
One caveat is that structural changes in the transportation fuels markets could reduce the accuracy of time series methods for forecasting. For example, if a new technology for oil extraction were developed in the near future that made oil extraction in the Bakken Formation more efficient, this might lead to actual extraction levels during the forecasting period being much higher than our forecasts. On the other hand, if a new liquid transportation fuel became available that could readily replace gasoline at a lower cost, oil extraction in the Bakken Formation and other places would likely diminish. In either of these scenarios, S-ARIMA and other time series models would fail to predict the future accurately.
What we can predict with great certainty, however, is that North Dakota’s influence over domestic and global oil supply systems will increase in the near future, especially over the next five to six years. This is good news for those who are concerned about domestic energy security in the USA.
This research was supported by the Mountain-Plains Consortium, which is sponsored by the U.S. Department of Transportation through its University Transportation Centers Program. The contents are the sole responsibility of the authors. The authors would also like to thank the anonymous reviewers for their constructive comments.