Growing water scarcity is one of the major challenges of the 21st century, especially in arid and semi-arid climates such as that of our study area. Efficient, sustainable and integrated groundwater management plays a key role in conserving this vital resource. Such management requires an understanding of the behavior of the aquifer system, for which the areal piezometric level map is an essential tool. Because piezometric level data are available only at a limited number of sample points, spatial interpolation and geostatistics are the appropriate way to produce the needed map. Several methods exist that approximate the real values with varying degrees of accuracy. This work compares and evaluates spatial interpolation methods for the groundwater level of Haouz using a dataset of 39 piezometers. The deterministic methods used in this study are Inverse Distance Weighted (IDW) and Radial Basis Functions (RBF); the probabilistic ones are ordinary kriging (OK), simple kriging (SK) and universal kriging (UK). The study shows how difficult it is to state a general rule for choosing the suitable method for a given input dataset. The best model remains the one that, after several methods have been compared, offers the best accuracy as assessed by cross-validation and statistical indicators. The results reveal that ordinary kriging with the trend removal technique is the optimal method in this case, with a decrease in Root Mean Square Error (RMSE) of up to 61.67%. It underestimates the groundwater level by 2.8% on average, which is considered reliable. The areal piezometric level map and the associated prediction standard error map provide additional information and recommendations that characterize the studied aquifer system and will ultimately improve sustainable groundwater management.
Groundwater is a vital source of supply for urban and rural needs and socio-economic development across the world [
All the threats stated above are exacerbated in arid and semi-arid climates. Amongst those climates, the Middle East and North Africa (MENA) region and particularly Morocco have to address water-related challenges even more than anywhere else. Besides, MENA region has been a focal area in discussions about the impact of water scarcity on food security [
Understanding, analyzing and studying the behavior of aquifer systems is essential for making any management decision and for optimum exploitation and rational use of water [
However, the groundwater level is usually determined during in situ measurement campaigns, which makes data production costly. The data are therefore spatially limited to a network of georeferenced points. This keeps the cost of the measurements under control, but it is a major obstacle to the efficient study of the status of the aquifer and of the effects of hydrologic stresses on groundwater systems. To overcome this obstacle, spatial interpolation of the piezometric level is used to build the regionalized map.
Spatial interpolation, mainly geostatistics, is the key tool to create this map. The research work carried out since the advent of this discipline is more than ever an essential capital [
Using geostatistics for processing groundwater data extends to several fields [
In terms of interpolation, several methods exist that approximate the real values with varying degrees of accuracy. In the literature, every geostatistical method was created to respond to a particular situation of the variable studied. However, some studies have shown how difficult it is to give a general rule for selecting the optimal spatial interpolation method for a given dataset [
Among the work carried out by researchers in the field of groundwater level interpolation, the authors Kumar [
In Morocco, the geostatistical approach has been used by many researchers in different fields. For instance, Jarar Oulidi et al. [
Unfortunately, no comparison of spatial interpolation methods for the groundwater level of Haouz has yet been made. Therefore, the objectives of this study are 1) to show that there is no general rule for choosing the optimal method based only on the distribution of the input data, and 2) to compare the precision of the probabilistic methods OK, SK and UK and the deterministic methods IDW and RBF in mapping the groundwater level in the study area. All comparisons between the models were carried out and assessed on the basis of statistical accuracy indicators provided by cross-validation. This work has a direct bearing on economic decision-making, policy and strategies for the sustainable management of water. Indeed, the choice of the optimal method to interpolate the groundwater level in this region is a starting point for future studies to be conducted there.
The study area is located in the Haouz plain (
The groundwater of this area is overused [
number of pumping stations continuously growing. The area is one of the oldest irrigated regions [
The Hydraulic Basin Agency of Tensift manages the piezometric monitoring network in this area. It allows regular monitoring of the water level. This study involved a sample of 39 points measuring the groundwater level in May 2010. The positions of the piezometers are shown in
In order to achieve the objectives set for the present work, a subdivision into three phases was needed. As shown in the
The first step was to gather the data needed for this study and integrate them into a spatial database to make them easier to exploit and analyze. The data were then examined from different views with different mapping and statistical tools. This process helped to understand the distribution and spatial correlation of the variable. It also makes it possible to detect anomalies that may have slipped into the input dataset.
The next step was to compare candidate models and select the optimal one for each deterministic and probabilistic interpolation method: IDW, RBF, ordinary, simple and universal kriging. These five optimal models were then compared, and the best of them was selected and assessed. These methods are extensively described in the literature [
All spatial interpolation methods, whether deterministic or probabilistic, can be represented as weighted averages of the measured data. They have the same general estimation formula as in Equation (1):
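$$\hat{Z}(s_0) = \sum_{i=1}^{n} \lambda_i Z(s_i) \quad (1)$$

where $\hat{Z}(s_0)$ is the predicted value at the unsampled location $s_0$, $Z(s_i)$ is the value measured at the sample point $s_i$, $\lambda_i$ is the weight assigned to that measured value, and $n$ is the number of measured points used in the estimation.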
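As one concrete instance of this weighted-average form, the sketch below implements a basic IDW predictor in Python. It is only an illustration: the function name `idw_predict`, the default power of 2 and the four sample points are assumptions made for the example, not settings or data from this study.

```python
import math

def idw_predict(points, values, target, power=2.0):
    """Weighted average of measured values with weights 1 / distance**power."""
    weights = []
    for (x, y), z in zip(points, values):
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return z  # the target coincides with a sample point
        weights.append(1.0 / d ** power)
    total = sum(weights)
    # Dividing by the total makes the weights sum to one.
    return sum(w / total * z for w, z in zip(weights, values))

# Hypothetical piezometers: (x, y) in metres, groundwater level in metres.
points = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0), (1000.0, 1000.0)]
levels = [460.0, 470.0, 465.0, 480.0]
centre = idw_predict(points, levels, (500.0, 500.0))
```

At the centre of the square all four points are equidistant, so the weights are equal and the prediction reduces to the plain mean of the four levels.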
The major difference between all interpolation methods is mainly the way of calculating weight values (i.e.
The probabilistic methods, the “Kriging” family, take into account the spatial correlation between samples when calculating the weight values. These correlations are studied from the semivariogram (also commonly known as variogram [
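To construct the experimental variogram, it is first necessary to calculate the semivariance $\gamma(h)$, as in Equation (2):

$$\gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ Z(s_i) - Z(s_i + h) \right]^2 \quad (2)$$

where $Z(s_i)$ and $Z(s_i + h)$ are the values measured at two points separated by a distance h, and N(h) is the number of pairs separated by a distance h.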
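The calculation of the experimental variogram can be sketched in Python as follows. The sketch assumes the classical estimator (half the mean squared difference over the N(h) pairs falling in each lag bin); the function name, the binning by fixed lag width and the four-point transect are illustrative assumptions, not the tool or data used in this study.

```python
import math
from itertools import combinations

def experimental_variogram(points, values, lag_width, n_lags):
    """Return (mean distance, semivariance, pair count) for each non-empty lag bin."""
    sums = [0.0] * n_lags    # sum of squared differences per bin
    dists = [0.0] * n_lags   # sum of pair distances per bin
    counts = [0] * n_lags    # N(h): number of pairs per bin
    for i, j in combinations(range(len(points)), 2):
        h = math.dist(points[i], points[j])
        k = int(h // lag_width)
        if k < n_lags:
            sums[k] += (values[i] - values[j]) ** 2
            dists[k] += h
            counts[k] += 1
    return [(dists[k] / counts[k], sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_lags) if counts[k] > 0]

# Hypothetical transect: four points one unit apart with a linear rise in level.
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
vals = [0.0, 1.0, 2.0, 3.0]
variogram = experimental_variogram(pts, vals, lag_width=1.0, n_lags=4)
```

On this transect the semivariance keeps growing with distance instead of levelling off at a sill, which is the typical signature of a trend in the data.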
Once the experimental variogram is built, it must be modeled. This is a crucial step in the interpolation process. It consists in choosing the mathematical function (Spherical, Exponential, Gaussian, Bessel...) that best fits the experimental variogram. The parameters, namely the sill, the nugget effect and the range, are adjusted until the optimum model is found.
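The adjustment of the model parameters can be illustrated with a small Python sketch. It assumes a spherical model and a brute-force grid search that minimizes the squared misfit to the experimental points; both the search strategy and the synthetic variogram values are assumptions made for the example, and in practice the fit is often tuned visually or with weighted least squares.

```python
def spherical(h, nugget, sill, rng):
    """Spherical semivariogram model; `sill` is the total sill (nugget included)."""
    if h <= 0.0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def fit_spherical(lags, gammas, nuggets, sills, ranges):
    """Grid search for the (nugget, sill, range) triple with the smallest squared misfit."""
    best = None
    for c0 in nuggets:
        for c in sills:
            for a in ranges:
                sse = sum((spherical(h, c0, c, a) - g) ** 2
                          for h, g in zip(lags, gammas))
                if best is None or sse < best[0]:
                    best = (sse, c0, c, a)
    return best[1:]

# Synthetic experimental variogram generated from known parameters (hypothetical).
lags = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0]
gammas = [spherical(h, 0.0, 10.0, 500.0) for h in lags]
fitted = fit_spherical(lags, gammas,
                       nuggets=[0.0, 1.0, 2.0],
                       sills=[8.0, 10.0, 12.0],
                       ranges=[300.0, 500.0, 700.0])
```

Because the synthetic values were generated from a nugget of 0, a sill of 10 and a range of 500, the search recovers exactly that triple.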
To choose the optimal model and compare the interpolation methods, geostatistics relies on cross-validation, which also makes it possible to validate and assess the accuracy of each method and model. It consists in removing each point of the dataset in turn and predicting its value from the remaining data. The difference between the measured and the estimated value can then be calculated at every point, and several statistical accuracy indicators are derived from these differences.
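A minimal Python sketch of this leave-one-out procedure is given below. The `leave_one_out` helper accepts any predictor with the signature `predict(points, values, target)`; the nearest-neighbour predictor used here is only a simple stand-in chosen to keep the example short, not one of the methods compared in this study, and the sample data are hypothetical.

```python
import math

def leave_one_out(points, values, predict):
    """Predict each sample from all the others; return the list of predictions."""
    predictions = []
    for i in range(len(points)):
        rest_pts = points[:i] + points[i + 1:]
        rest_vals = values[:i] + values[i + 1:]
        predictions.append(predict(rest_pts, rest_vals, points[i]))
    return predictions

def nearest_neighbour(points, values, target):
    """Stand-in predictor: value of the closest remaining sample."""
    i = min(range(len(points)), key=lambda k: math.dist(points[k], target))
    return values[i]

# Hypothetical samples along a line.
pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
vals = [10.0, 20.0, 30.0, 40.0]
preds = leave_one_out(pts, vals, nearest_neighbour)
errors = [m - p for m, p in zip(vals, preds)]  # measured minus predicted
```

Swapping `nearest_neighbour` for an IDW or kriging predictor gives the cross-validation errors of that method on the same data.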
The quality assessment was carried out on the basis of three statistical indicators: Mean Error (ME) in Equation (3), Root Mean Square Error (RMSE) in Equation (4) and coefficient of determination R2 in Equation (5). The validity of the model can be asserted when the mean error is close to zero, the RMSE is low and the R2 is close to one.
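Mean Error is the average of the differences between the measured values and the predicted ones:

$$ME = \frac{1}{n} \sum_{i=1}^{n} \left( C_i - P_i \right) \quad (3)$$

Root Mean Square Error indicates how closely the model predicts the measured values:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( C_i - P_i \right)^2} \quad (4)$$

Coefficient of determination R2 is another statistical indicator for assessing the accuracy of the estimations. It measures the correlation between measured and predicted values. It is calculated as follows:

$$R^2 = \frac{\left[ \sum_{i=1}^{n} \left( P_i - P_{ave} \right) \left( C_i - C_{ave} \right) \right]^2}{\sum_{i=1}^{n} \left( P_i - P_{ave} \right)^2 \sum_{i=1}^{n} \left( C_i - C_{ave} \right)^2} \quad (5)$$

where $P_i$ is the estimated value and $C_i$ is the measured value at point $i$; Pave is the average of the estimated values; Cave is the average of the measured values; and n is the number of points used for estimation.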
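The three indicators can be computed with a few lines of Python. This sketch assumes that the error is taken as measured minus predicted and that R2 is the squared correlation between measured and predicted values, which is how the definitions above read; the function names and the four sample values are hypothetical.

```python
import math

def mean_error(measured, predicted):
    """Average of (measured - predicted); close to zero for an unbiased model."""
    return sum(c - p for c, p in zip(measured, predicted)) / len(measured)

def rmse(measured, predicted):
    """Square root of the mean squared difference."""
    n = len(measured)
    return math.sqrt(sum((c - p) ** 2 for c, p in zip(measured, predicted)) / n)

def r_squared(measured, predicted):
    """Squared correlation between measured and predicted values."""
    c_ave = sum(measured) / len(measured)
    p_ave = sum(predicted) / len(predicted)
    cov = sum((p - p_ave) * (c - c_ave) for c, p in zip(measured, predicted))
    var_p = sum((p - p_ave) ** 2 for p in predicted)
    var_c = sum((c - c_ave) ** 2 for c in measured)
    return cov ** 2 / (var_p * var_c)

# Hypothetical cross-validation output (levels in metres).
measured = [460.0, 470.0, 480.0, 490.0]
predicted = [462.0, 468.0, 481.0, 489.0]
me = mean_error(measured, predicted)
err = rmse(measured, predicted)
r2 = r_squared(measured, predicted)
```

In this example the positive and negative errors cancel, so the mean error is zero while the RMSE is not, which is why the indicators are read together.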
Exploratory analysis is an important step that makes it possible to identify and check abnormal samples that could distort the estimations. It also helps the interpolation to be as representative of reality as possible.
The Pearson kurtosis and the skewness provide information on the distribution of the studied variable. The Pearson kurtosis (3.9) is greater than 3, which indicates a leptokurtic distribution, more peaked than the normal distribution. The skewness (0.82) is greater than 0, which indicates that the distribution is asymmetric. The median (469.06) and the mean (470.62) are very close. This suggests that the data approximately follow a normal distribution.
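For reference, the two shape statistics can be computed as in the Python sketch below. It assumes the population (moment-based) definitions, in which a normal distribution has a skewness of 0 and a Pearson kurtosis of 3; statistical software may apply small-sample corrections, so values can differ slightly. The sample values are hypothetical, not the piezometric data.

```python
def shape_statistics(values):
    """Return (skewness, Pearson kurtosis) from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# A single high value stretches the right tail and gives a positive skewness.
skew, kurt = shape_statistics([1.0, 2.0, 3.0, 4.0, 10.0])
# A symmetric sample has zero skewness.
sym_skew, sym_kurt = shape_statistics([1.0, 2.0, 3.0, 4.0, 5.0])
```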
The sample map (
The same conclusion was provided by trend analysis tool (
In
In addition, the Semivariogram Cloud tool (
After this exploratory data analysis and prior to the interpolation of the groundwater level, we must compare, assess and choose the best model that can be the most accurate and representative of the study area.
Over 150 tests were conducted in order to find the optimal method for the given input dataset. For each of the five methods, 1) Ordinary Kriging (OK), 2) Simple Kriging (SK), 3) Universal Kriging (UK), 4) Inverse Distance Weighted (IDW) and 5) Radial Basis Functions (RBF), several parameter settings were tested to produce its optimal model. The five resulting optimal models were then compared and the best one was selected. Cross-validation was the key tool for selecting all the optimal models. The table below compares the optimal results for each method based on the Root Mean Square Error (RMSE) and the coefficient of determination R²:
Furthermore, the
According to both the table and the
The results indicate the superiority of the ordinary kriging technique, with a decrease in RMSE of up to 61.67%. In addition, OK ranks better than UK, even though our study variable is not stationary, as noted in the exploratory analysis.
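The 61.67% figure can be checked against the RMSE values reported in the comparison table, as in the short Python calculation below. The only assumption is that the decimal commas in the table are read as decimal points; the decrease is expressed relative to the RMSE of each competing method.

```python
# RMSE of the optimal model of each method, taken from the comparison table.
rmse = {"OK": 14.678, "SK": 16.017, "RBF": 18.034, "UK": 19.01, "IDW": 38.293}

best = min(rmse, key=rmse.get)
# Relative decrease in RMSE obtained by using the best method instead of each other one.
decrease = {method: 100.0 * (value - rmse[best]) / value
            for method, value in rmse.items() if method != best}
largest = max(decrease, key=decrease.get)
```

The largest decrease is obtained with respect to IDW, (38.293 - 14.678) / 38.293, which rounds to 61.67%.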
In literature, the OK method is designed specifically for stationary variables while UK method has been developed to interpolate non-stationary variables. However, the method called Trend Removal [
Indicator | Ordinary kriging (OK) | Simple kriging (SK) | Radial Basis Functions (RBF) | Universal kriging (UK) | Inverse Distance Weighted (IDW)
---|---|---|---|---|---
RMSE | 14.678 | 16.017 | 18.034 | 19.01 | 38.293
R² | 0.986 | 0.984 | 0.978 | 0.976 | 0.908
another way to predict this kind of variable. This method consists in isolating the drift and modeling it with a polynomial function (1st, 2nd or 3rd order); the stationary part is then kriged. Finally, the drift is added back before the final predictions are calculated.
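The three steps of this procedure (fit the drift, krige the residuals, add the drift back) can be sketched in Python as follows. This is an illustration under stated assumptions, not the configuration used in this study: the drift is a first-order polynomial fitted by least squares, the residuals are kriged with a spherical variogram whose sill and range are simply given rather than fitted, and the five sample points are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_plane(points, values):
    """Least-squares first-order drift z = a + b*x + c*y (normal equations)."""
    basis = [(1.0, x, y) for x, y in points]
    ata = [[sum(r[i] * r[j] for r in basis) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * z for r, z in zip(basis, values)) for i in range(3)]
    return solve(ata, atz)

def spherical(h, sill, rng):
    """Spherical semivariogram without nugget (assumed parameters, not fitted)."""
    if h >= rng:
        return sill
    r = h / rng
    return sill * (1.5 * r - 0.5 * r ** 3)

def ordinary_kriging(points, values, target, sill, rng):
    """Ordinary kriging; a Lagrange multiplier forces the weights to sum to one."""
    n = len(points)
    a = [[spherical(math.dist(points[i], points[j]), sill, rng) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [spherical(math.dist(p, target), sill, rng) for p in points] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * z for w, z in zip(weights, values))

def detrended_ok(points, values, target, sill, rng):
    """Remove the drift, krige the residuals, then add the drift back."""
    a, b, c = fit_plane(points, values)
    residuals = [z - (a + b * x + c * y) for (x, y), z in zip(points, values)]
    drift = a + b * target[0] + c * target[1]
    return drift + ordinary_kriging(points, residuals, target, sill, rng)

# Hypothetical samples: coordinates in km, levels in metres.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
plane_levels = [460.0 + 5.0 * x + 3.0 * y for x, y in pts]  # pure drift, no residual
levels = [460.0, 466.0, 462.0, 470.0, 473.0]
on_plane = detrended_ok(pts, plane_levels, (0.5, 0.5), sill=1.0, rng=3.0)
at_sample = detrended_ok(pts, levels, pts[1], sill=1.0, rng=3.0)
```

When the data lie exactly on a plane the residuals vanish and the prediction is the drift alone; with a zero nugget the predictor also returns the measured value at a sample location.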
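In our case, the Trend Removal method used in conjunction with OK was more accurate than modeling the non-stationary variable with the UK method. As the piezometric level has a trend, it can be split into two components, as follows:

$$Z(s) = m(s) + \varepsilon(s)$$

where $Z(s)$ is the piezometric level at location $s$, $m(s)$ is the drift (trend) and $\varepsilon(s)$ is the stationary residual that is kriged.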
This shows that there is no universal rule for choosing the optimal interpolation model. Comparing methods and assessing their accuracy using cross-validation and statistical indicators is the best way to choose. This finding agrees with the conclusion of Burrough et al. [
Assessing the accuracy of the best method chosen is based on the study of the correlation degree between the measured and the estimated values, which is conducted through Cross-validation and statistical indicators.
The result of this geostatistical study made it possible to find the optimal interpolation method and to produce the areal piezometric level map and the prediction standard error map.
The spatial distribution of uncertainties is illustrated in
Ordinary kriging with the trend removal technique was chosen for its relevance and accuracy in describing the reality of the groundwater level of Haouz. It is the optimal method for this study area and for the given input dataset. This result was reached after a comparison between deterministic and probabilistic interpolation methods. Indeed, over 150 individual tests on the parameters and mathematical models were needed to reach this conclusion. In addition, cross-validation played a decisive role in comparing and assessing the accuracy of every model.
This study also shows that there is no universal rule for choosing the optimal spatial interpolation model for a given dataset. Comparing methods and assessing their accuracy using cross-validation and statistical indicators is the best way to choose.
This optimal geostatistical method, in conjunction with mapping tools [
Lamiaa Khazaz, Hassane Jarar Oulidi, Saida El Moutaki, Abdessamad Ghafiri (2015) Comparing and Evaluating Probabilistic and Deterministic Spatial Interpolation Methods for Groundwater Level of Haouz in Morocco. Journal of Geographic Information System, 07, 631-642. doi: 10.4236/jgis.2015.76051