Journal of Water Resource and Protection, 2012, 4, 528539 http://dx.doi.org/10.4236/jwarp.2012.47062 Published Online July 2012 (http://www.SciRP.org/journal/jwarp) MultistepAhead River Flow Prediction Using LSSVR at Daily Scale Parag P. Bhagwat, Rajib Maity Department of Civil Engineering, Indian Institute of Technology, Kharagpur, India Email: rajib@civil.iitkgp.ernet.in Received April 5, 2012; revised May 5, 2012; accepted June 1, 2012 ABSTRACT In this study, potential of Least SquareSupport Vector Regression (LSSVR) approach is utilized to model the daily variation of river flow. Inherent complexity, unavailability of reasonably long data set and heterogeneous catchment response are the couple of issues that hinder the generalization of relationship between previous and forthcoming river flow magnitudes. The pr oblem complexity may ge t enhanced with the influence of upstream dam releases. These issues are investigated by exploiting the capability of LSSVR—an approach that considers Structural Risk Minimization (SRM) against the Empirical Risk Minimization (ERM)—used by other learning approaches, such as, Artificial Neural Network (ANN). This study is conducted in upper Narmada river basin in India having Bargi dam in its catchment, constructed in 1989. The river gauging station—Sandia is located few hundred kilometer downstream of Bargi dam. The model development is carried out with preconstruction flow regime and its performance is checked for both pre and postconstruction of the dam for any perceivable difference. It is found that the performances are similar for both the flow regimes, which indicates that the releases from the dam at daily scale for this gauging site may be ignored. In order to investigate the temporal horizon over which the prediction performance may be relied upon, a multistepahead prediction is carried out and the model performance is found to be reasonably good up to 5dayahead predictions though the p erformance is decreasing with the increase in leadtime. Skills of both LSSVR and ANN are reported and it is found that the former performs better than the latter for all the leadtimes in general, and shor ter lead times in par ticular. Keywords: MultistepAhead Prediction; KernelBased Learning; Least SquareSupport Vector Regression (LSSVR); Daily River Flow; Narmada River 1. Introduction River flow is an important component in hydrological cycle, which is directly available to the community. In hydrology, river flow plays an important role in estab lishing some of the critical interactions that occur be tween physical, ecological, social or economic processes. Accurate or at least reasonably reliable prediction of river flow is an important foundation for preventing flood, reducing natural disasters, and the optimum man agement of water resource. Constructions of major and minor dams are essential in order to effective use of avai lable water resources. This is more crucial for the mon soon dominated countries, where most of the annu al rain fall occurs during couple of months, and rest of the months are mostly norainfall months. Existence of dams adds to the complexity in th e river flow modelling, which is influenced by the releases from upstream reservoirs as well as the influence of the inputs from the catchment area between immediate downstream of the dam and the gauging site. Depending on the natural variability and complexity, traditional statistical methods, such as, transfer function model, BoxJenkins approach etc. may be inadequate due to their underlying assumptions. As a consequence many new methodologies have been introduced to understand the variations of hydrologic variables and to predict for the future time steps. Development of different techniques to predict river flow is having a long history. Among the parametric linear approaches, AutoRegressive Integrated Moving Average (ARIMA) model is one of the most popular approaches. Since last decade, machine learning techniques are being applied in this field. For instance, Artificial Neural Net works (ANN), fuzzy logic, genetic programming, etc., have been widely used in the modelling and prediction of hydrologic variables. More recently, kernelbased learn ing approaches have gained wide popularity. The statistical learning framework proposed by Vap nik [1] led to the introduction of the Support Vector Ma chine (SVM), which has been successfully applied for C opyright © 2012 SciRes. JWARP
P. P. BHAGWAT, R. MAITY 529 nonlinear classification and regression in learning prob lems. Kernel based approaches are expected to perform better due its nonlinear, even smooth enough, feature space development based on the available historical re cord. One such kernel based machines learning approach is the Least SquaresSupport Vector Machine (LSSVM) [2]. 1.1. Support Vector Machines (SVMs) and Kernel Based Learning SVMs are a kind of supervised machine learning system that use a linear high dimensional hypothesis space called feature space. These systems are trained using a learning algorithm, which is based on optimization theory. SVMs belong to a family of generalized linear classifier [3]. The SVMs can be applied to both classification and regression problems. Application of SVMs to regression problem was made in late nineties [1,4]. Popularity of SVM increases very rapidly since then in fields of text classification, pattern recognition, remote sensing and so on. The basic idea of working principle of SVMs is pro vided by the use of kernel functions that implicitly map the data to a higher dimensional space. According to Cover’s theorem, linear solution in the higher dimen sional feature space corresponds to a nonlinear solution in the original lower dimensional input space. This makes SVM a powerful tool for solving a variety of hydrologic problems, which are nonlinear in nature. There are me thods available which use nonlinear kernels in their app roach towards regression problems while applying SVMs. 1.2. Application of SVMs in Hydrologic Problems Application of SVMs in the field of hydrology is gaining wide popularity and the results are found to be encoura ging. Such applications range from remotely sensed im age classification [5], statistical downscaling [6], soil water forecasting [7], stream flow forecasting [8] and so on. Liong and Sivapragasam [9] compared SVM per formance with other machine learning model, such as, ANN in forecasting flood stage and reported a superior performance of SVM. Bray and Han [10] used SVM for rainfall runoff modelling and that model was compared with a transfer function model. The study outlin ed a pro mising area of research for further application of SVMs in unexplored areas. Samui [11] used LSSVM to deter mine evaporation loss of reservoir and it is established to be a powerful approach for the determination of evapo ration loss. She and Basketfield [12] forecasted spring and fall season stream flows in Pacific Northwest region of US using SVM and reported superior results in fore casting. Zhang et al. [13] studied two machine learning approaches—ANN and SVM and compared for appro ximating the Soil and Water Assessment Tool (SWAT) model. The results showed that SVM in general exhibited better generalization ability than ANN. Khadam and Kaluarachchi [14] discussed the impact of accuracy and reliability of hydrolog ical data o n model calib ration. Th is, coupled with application of SVMs, was used to identify the faulty model calibration, which would have been undetected otherwise. Applicability of SVMs was also demonstrated in downscaling Global Circulation Models (GCMs), which are among the most advanced tools for estimating future climate change scenarios. The results presented SVMs as a compelli ng alternative to tradition al Artificial Neural Networks (ANN) to conduct climate impact studies [10,11] downscaled monthly precipitation to basin scale using SVMs and reported the results to be encouraging in their accuracy while showing large pro mise for further applications. 1.3. Advancement of SVM Apart from the general benefit of SVM pointed out in the aforementioned studies, SVMs are sometimes criticized by its large number of parameters and high level of com putational effort, particularly in case of large dataset. Chunking is one of the proposed remedies to the latter problem. However, according to Suyken et al. [15], it is worthwhile to investigate the possibility of simplifying the approach to the extent possible without losing the any of its advantages. Thus, they proposed a modification over the SVM approach, which essentially leads to Least SquareSupport Vector Machines (LSSVM). The main advantage of LSSVM is in its higher com putational efficiency than that of standard SVM method, since the training of LSSVM requires only the solution of a set of linear equations instead of the long and com putationally demanding quadratic programming problem involved in the standard SVM [2]. Qin et al. [16] in vestigated the application of LSSVM for the modelling of water vapor and carbon diox ide fluxes and they fou nd the excellent generalization property of LSSVM and its potential for further applications in area of general hy drology. Maity et al. [13] investigated the potential of support vector regression, which is also based on LS SVM principle, for prediction of streamflow using en dogenous property of the monthly time series. In this study, potential of LSSVM for Regression (LSSVR) is investigated for the obj ective outlined as follows. 1.4. Objective of This Study Potential of LSSVM for Regression (LSSVR) is ex ploited for multistepahead river flow prediction at daily scale, to assess its performance with the increasing time Copyright © 2012 SciRes. JWARP
P. P. BHAGWAT, R. MAITY 530 horizon. Most of the major river basins are being po pulated with major and minor dams. River flow mode lling is supposed to be influenced by the effect of the release from these dams if the site location is on the downstream side of the dam. However, the effect of its existence will gradually reduce with increase of the dis tance from the dam location. The study is carried out with observed daily river flow in the upper Narmada River basin with Sandia gauging station at the outlet. Bargi dam exists few hundred km upstream from the out let of the watershed. Details of this dam are provided in th e “Study Area” section later. Investigation is carried out to assess the necessity of consideration of daily releases from upstream dam on the daily river flow variation at the outlet of the study area. Also, a multistepahead pre diction is carried out to assess maximum temporal hori zon over which the prediction results may be relied upon. The results are compared with the performance of neural networks approach that uses Empirical Risk Minimi zation (ERM). 2. Methodology 2.1. Data Normalization The observed river flow data, as commonly used in data driven models, is normalized to prevent the model from being dominated by the large values. The performance of LSSVM with normalized input data has shown to out perform the same with nonnormalized input data [16]. Therefore, the data is normalized and finally the model outputs are back transformed to their original form by denormalisation. The normalization (also back transfor mation) is carried out using 0.1 i y y 1.2max i i S S th i S 1 , (1) where i is the normalized data for day and is the observed value for day. i th i 2.2. Least SquareSupport Vector Regression (LSSVR) Let us consider a given training set ii i y x 1 , T i y y x y ()yf , where represents a 1 ,, in in yy n ndimensional input vector and i is a scalar measured output, which represents system output. Subscript i indicates the training pattern. In the context of multistep ahead river flow prediction, i is the vector comprising of th i i nprevious days normalized river flow values, i is the target river flow with certain leadtime in day(s) and subscript indicates the reference time from which n previous days and the leadtime are counted. The goal is to construct a function x yx T yb , which represents the dependence of the output on the input . The form of this function is as follows: wx wb (2) where is known as weight vector and as bias. This regression model can be constructed using a non linear mapping function . The function :nh is a mostly nonlinear function, which maps the data into a higher, possibly infinite, dimen sional feature space. The main difference from the stan dard SVM is that the LSSVR involves equality con straints instead of inequality constraints, and works with a least square cost function. The optimization problem and the equality constraints are defined by the following equations: 2 1 11 Minimize ,22 such that1,, N T i i T iii Je e be iN www wx e e (3) where i is the random error and is a regu larization parameter in optimizing the tradeoff between minimizing the training errors and minimizing the mo del’s complexity. The objective is now to find the opti mal parameters that minimize the prediction error of the regression model. The optimal model will be chosen by minimizing the cost function where the errors, i, are minimized. This formulation corresponds to the regre ssion in the feature space and, since the dimension of the feature space is high, possibly infinite, this problem is difficult to solve. Therefore, to solve this optimization problem, the following Lagrange function is given, 1 ,,;, NT iiii i LwbeJ we be y wx wbi ei (4) The solution of above can be obtained by partially dif ferentiating with respect to , , and , i.e. 1 0N ii i L w wx (5) 1 00 N i i L b (6) 0,1,, ii i Lei N e (7) 00,1,, T iii i Lbe yiN x wx w (8) From the set of Equations (5)(8), and e can be eliminated and finally, the estimated values of b and i , i.e. and i bˆ , can be obtained by solving the linear system. Replacing in Equation (2 ) from Equatio n (5 ), w Copyright © 2012 SciRes. JWARP
P. P. BHAGWAT, R. MAITY 531 2 the kernel trick may be applied as follows: ,Kxx T ii x x ˆ , ii (9) Here, the kernel trick means a way to map the obser vations to an inner product space, without actually com puting the mapping and it is expected that the obser vations will have meaningful linear structure in that inner product space. Thus, the resulting LSSVR model can be expressed as 1 ˆ ˆN i Kbxx ,Kxx (10) where is a kernel function. i In comparison with some other feasible kernel fun ctions, the RBF is a more compact and able to shorten the computational training process and improve the gene ralization performance of LSSVR (LSSVM, in general), a feature of great importance in designing a model [13]. Aksornsingchai and Srinilta [17] studied support vector machine with polynomial kernel (SVMPOL), and su pport vector machine with Radial Basis Function kernel (SVMRBF) and found SVMRBF is the accurate model for statistical downscaling methods. Also, many works have demonstrated the favorable performance of the ra dial basis function [9,15]. Therefore, the radial basis fun ction is adopted in this study. The nonlinear radial basis function (RBF) kernel is defined as: 2 2 1 i x x,e xp i Kxx (11) where is the kernel function parameter of the RBF kernel. The symbol is the norm of the vector and thus, 2 xx xx xˆ y yˆi y i is basically the Euclidean distance be tween the vectors and i. In the context of river flow prediction, i is the new vector of previous river flow, based on which multistep ahead prediction (i) is made with certain leadtime. i (observed) and are compared to assess the model performances. 2.3. Model Calibration and Parameter Estimation The regularization parameter determines the trade off between the fitting error minimization and smooth ness of the estimated function. It is not known before hand which and 2 are the best for a particular problem to achieve maximum performance with LSSVR models. Thus, the regularization parameter and the RBF kernel parameter 2 have to be calibrated during model development period. These parameters are inter dependent, and their (near) optimal values are often ob tained by a trialanderror method. Interrelationship is also coupled with the number of previous river flow values to be considered, which is denoted as . In order to find all these parameters ( n , and ) grid search method is employed in parameter space. Once the para meters are estimated from the training dataset, the ob tained LSSVR model is complete and ready to use for modelling new river flow data period. Performance of the developed model is then assessed with the data set during testing period. Different models (parameter sets) are developed for different prediction leadtimes in case of multistep ahead prediction. n 2.4. Comparison with Artificial Neural Networks (ANN) The flexible computing based ANN models have been extensively studied and used for time series forecasting in many hydrologic applications since late 1990s. This model has the capability to execute complex mapping between input and output and to form a network that approximates nonlinear functions. A single hidden layer feed forward network is the most widely used model form for time series modeling and forecasting [18]. This model usually consists of three layers: the first layer is the input layer where the data are introduced to the net work followed by the hidden layer where data are pro cessed and the final or output layer is where the results of the given input are produced. The number of input nodes and output nodes in an ANN are dependent on the problem to which the network is being applied. However, there is no fixed method to find out the number of hidden layer nodes. If there are too few nodes in the hidden layer, the network may have difficulty in generalizing the problems. On the other hand, if there are too many nodes in the hidden layer, the net work may take an unacceptably long time to learn any thing from the training set [19]. Increase in the number of parameters may slow the calibration process [20]. In a study by Zealand et al. [21], networks were initially con figured with both one and two hidden layers. However, the improvement in forecasting results was only marginal for the two hidden layer cases. Therefore, it is decided to use a single hidden layer in this study. In most of the cases, suitable number of neurons in the hidden layer is obtained based on the trialanderror method [22]. Maity and Nagesh Kumar, [23] proposed a GA based evolution ary approach to decide the complete network structure. The output t of an ANN, assuming a linear output neuron having a single hidden layer with h sigmoid hidden nodes, is given by: 1 h tjjk j gwfsb (12) where and k are the linear transfer function and bias respectively of the output neuron k, b w th j is the connection weights between neuron of hidden la Copyright © 2012 SciRes. JWARP
P. P. BHAGWAT, R. MAITY CopyrigSciRes. JWARP 532 f ht © 2012 yers and output units, is the transfer function of the hidden layer [24]. The transfer functions can take several forms and the most widely used transfer func tions are Logsigmoid, Linear, Hyperbolic tangent sig moid etc. In this study, hyperbolic tangent sigmoid is used: in study area is from 289 to 1134 m. The basin lies between east longitudes 78˚30' and 81˚45', and north latitudes 21˚20' and 23˚45'. Bargi dam (later renamed as Rani Avanti Bai Sagar Project) is a major structure in the basin up to Sandia, which is located few hundred km up stream of Sandia. The latitude and longitude of the dam are 22˚56'30''N and 79˚55'30''E, respectively. It was constructed in late eighties and being operated from early nineties. Thus, preconstruction period (19781986) is considered for training. For testing, two sets of data are used—preconstruction data set (19861990) and post construction period (19902000). However, out of this entire range four year s data is missing (1, 1981  31 May 1982; June 1, 1987  May 31, 1988; June 1, 1993  May 31, 1994; June 1, 1998  May 31, 1999). These periods are ignored from the analysis. River flow data from Sandia station, operated by the Water Resources Agency, is obtained from Central Water Commission, Govt. of India. Among these records, a daily data (June 1, 1978 to May 31, 1986) is used for training and (June 1, 1986 to May 31, 2000) is used to test the model performance. 21 xp2 i s i i 1e i fs (13) 0 n i i wx where is the input signal referred to as the weighted sum of incoming information. Several optimi zation algorithms can be used to train the ANN. Among the various training algorithms available, the back propagation is most popular and widely used algorithm [25]. Details of this techniques is well established in the literature and can be found elsewhere (ASCE 2000 and references therein) [26]. 3. Study Area and Data Sets Narmada River is the largest west flowing river of Indian peninsula. It is the fifth largest river in India. The study area is up to Sandia gauging station, which is in the upstream part of Narmada river basin as shown in Figure 1. The upstream part Narmada river basin is in the state of Madhya Pradesh, India. The river originates from Maikala ranges at Amarkantak and flows westwards over a length of 334 km up to Sandia. The elevation difference 4. Results and Discussions: Performance of LSSVR for River Flow Prediction 4.1. Data PreProcessing and Parameter Estimation Observed river flow data at Sandia is normalized as ex plained in the methodology before proceeding to parameter Figure 1. Narmada river basin with study area up to Sandia station.
P. P. BHAGWAT, R. MAITY 533 meter estimation. Optimum number of previous river flow values to be considered is denoted as n. This para meter along with the regularization parameter and the RBF kernel parameter 2 is calibrated during mo del development period (training period). To select best combination of n, and 2 , grid search method is used. Model performances for different combinations of these parameters are assessed based on statistical mea sures, such as, Correlation Coefficient (CC), Root Mean Square Error (RMSE) and NashSutcliffe Efficiency (NSE). Ten different values of n (1 through 10) are tested to decide the optimum number of previous daily river flow to be considered for the best possible results. Range of is considered to be 25 to 1000 with a resolution of 25 and 2 in the range of 0.01 to 1 with a resolution of 0.01. Approximately (because of different lag and lead times considered) 2556 data points are used for training purpose. Performance of each model is assessed with the remaining testing data points. Model performances stati stic is obtained between observed and modelled river flow values during training and testing period. The com bination that yields comparable performance during training and testing period is selected, which ensures the best parameter values without the fear of overfitting. Results are shown in Table 1 for prediction leadtime of 1 day. This is achieved in the following way: for each value of n, two (training period and testing period) 2D surfaces are obtained for each of the performance mea sures (CC, RMSE and NSE). The values of and 2 are identified for which the training and testing period performances are “closest”. Priority is given to CC and corresponding RMSE and NSE are reported for same values of and 2 . These values are named as “best ” and “best 2 ” in Table 1 for a particular n. It is found from that the best combination of n, and 2 is 5 days, 175 and 0.21 respectively. However, these parameters are for prediction leadtime of 1 day. Opti mum combinations of parameters are computed separa tely for different prediction leadtimes and reported later. 4.2. Model Performance during Pre and PostConstruction Testing Period Model performances are tested during both pre and postconstruction period. For this purpose, the model is trained with river flow data during preconstruction pe riod and the developed model is used to assess the per formance during both preconstruction and post con struction period. It is found that the performance remains almost similar in both pre and postconstruction period (Table 2). Flow regimes are supposed to be modified on the immediate downstream of a newly constructed dam due to the effect of human controlled releases from the dam. However, such effect is supposed to get diminished with increase in distance from the dam location towards downstream. This is being reflected in case of river flow at Sandia gauging station indicating that the station is sufficiently away towards downstream. Model perfor mance during training period is shown in Figure 2 and the model performances during testing period—both preconstruction and postconstruction period are shown in Figures 3 and 4 respectively. The immediate next question would be to find the temporal horizon up to which the prediction would be reliable. 4.3. Performance of River Flow Prediction for Different LeadTimes The multistepahead river flow prediction is carried out for time steps T, T + 1, T + 2, T + 3 and T + 4. In other words, five different prediction leadtimes are tested for prediction performance, i.e., 1 day, 2 days, 3 days, 4 days and 5 days in advance. As stated earlier, optimum combi nations of parameters are computed separately for diffe rent prediction leadtimes. Results are shown in Table 3 . Table 1. Model performances during training (testing) period for different numbers of previous daily river flows considered as inputs (Leadtime = 1 day) wi th corre sponding “be st ” and “best ” values. 2 σ Number of previous river flow values input (n) Best Best 2 CC RMSE NSE 1 265 0.01 0.810 (0.810) 0.025 (0.030) 0.657 (0.655) 2 960 0.03 0.829 (0.829) 0.024 (0.029) 0.687 (0.679) 3 865 0.11 0.827 (0.827) 0.024 (0.029) 0.683 (0.675) 4 845 0.31 0.825 (0.825) 0.024 (0.030) 0.681 (0.662) 5 175 0.21 0.827 (0.827) 0.0243 (0.029) 0.684 (0.665) 6 795 0.34 0.829 (0.829) 0.024 (0.029) 0.688 (0.674) 7 710 0.32 0.831 (0.831) 0.024 (0.029) 0.690 (0.681) 8 235 0.22 0.833 (0.833) 0.024 (0.029) 0.694 (0.681) 9 530 0.28 0.838 (0.838) 0.024 (0.028) 0.702 (0.688) 10 750 0.45 0.839 (0.839) 0.024 (0.028) 0.703 (0.696) Copyright © 2012 SciRes. JWARP
P. P. BHAGWAT, R. MAITY 534 Table 2. Comparison of performance during pre and postc onstr uc tion of Bar g i dam. Testing performance Statistical Measures Training Period (5Jun78 to 31May86) Before dam const ruc ti on (5Jun86 to 31May90) After dam construction (5Jun90 to 31May97) Correlation Coefficient 0.83 0.84 0.83 RMSE 0.024 0.017 0.033 NSE 0.68 0.71 0.66 Table 3. Optimum river flow lags and parameter estimation for different leadtimes. Leadtime (in days) Number of previous river flow value s input (n) Gamma (γ) Sigma (σ) 1 5 175 0.21 2 2 75 0.14 3 1 150 0.40 4 1 50 0.38 5 1 75 0.37 Figure 2. Comparison between observed and predicted river flow for training period (1dayahead). Figure 3. A plot between observed and predicted river flow (1dayahead) during testing period before construction of dam. Copyright © 2012 SciRes. JWARP
P. P. BHAGWAT, R. MAITY Copyright © 2012 SciRes. JWARP 535 Figure 4. A plot between observed and predicted river flow (1dayahead) during testing period after construction of dam. Table 4. Performance of LSSVR for multistepahead daily river flow prediction for different leadtimes. From these results, it is observed that for different pre diction leadtimes the best combination of model pa rameters varies. The optimum numbers of previous river values to be considered differ for different leadtimes. This is decreasing with the increase of leadtime. For instance, five previous days input is best for 1day ahead prediction; similarly for 2day leadtime best is two pre vious days inputs and for further leadtimes one previous day input is resulting the best river flow prediction. Leadtime (in days) CC RMSE NSE 1 0.83 0.029 0.67 2 0.71 0.036 0.49 3 0.64 0.040 0.39 4 0.60 0.041 0.35 5 0.58 0.042 0.32 Model performance is investigated to determine ability of model to predict multistepahead river flow through CC, RMSE and NSE. Results are shown in Table 4. It is found that the prediction performance decreases with the increase in leadtime. For instance, 69% (square of correlation coefficient) of daily variation is explained in case of 1dayahead prediction, whereas 50%, 41%, 36% and 34% of daily variation is explained in case of 2day, 3day, 4day and 5day ahead predictions respectively. RMSE and NSE values are also increasing and de creasing respectively with the increase in leadtimes, in dicating that the performance is better for the shorter prediction leadtimes. 4.4. Comparison of LSSVR and ANN Model Performances The relative skill of LSSVR is compared with the per formance of ANN. Basics of ANN approaches are dis cussed in methodology section. The architecture of ANN consists of different layers and combinations of neurons as discussed earlier. Architecture of ANN networks adopted in this study is having n neurons in the input layer as river flow values from n previous days are used as input. As explained earlier, n is the optimum number of previous river flow values to be considered. Output layer consists of one neuron as the target is to predict the daily river flow with a particular leadtime. The adopted networks for different leadtimes also differ from each other with respect to the number of neuron s in the hi dden layers. The trialanderror method is used to find out best combination of neur on for h idden layer. In trialanderror method, different combinations of hidden layer neurons are tried and performance is checked. Based upon the performance, optimum number of hidden layer neurons is selected. It is found that 5, 3, 3, 3 and 3 neurons in hidden layer provid e the best performance for leadtimes of 1, 2, 3, 4 and 5 days respectively. It might be noted Plots between observed and predicted river flow for Training and Testing periods for 1day leadtimes are shown in Figures 24 sequentially. It is observed that predicted river flow values well correspond to the observed one at low and medium range of river flows. However, the highest peak is not captured properly. As these types of peaks are very shortlived (at daily scale), the reason could be the effect of some other factor, such as, sudden flash rain, which is shorter than the daily temporal resolution and thus, not available from the information contained in the previous days river flows.
P. P. BHAGWAT, R. MAITY 536 here that back propagation algorithm (as explained ear lier) is used to train these networks using 2556 to 2552 (from leadtime 1day through 5day) training data set. The largest network (551) is having 25 + 5 + 5 + 1 = 31 parameters, which is much less than number of training patterns. Different networks for different leadtimes are trained separately. Statistical measures showing the prediction performances are obtained during training and testing period for different leadtime cases and the results are shown in Table 5. After comparison between LSSVR and ANN, it is found that, in general, LSSVR performs better than ANN for all the leadtimes. For instance, at 1day leadtime, 69% (CC = 0.83) and 53.2% (CC = 0.73) of daily river flow variation are captured by LSSVR and ANN respectively dur ing the entire testing period (Table 5). Also, for this case, the NSE values are found to be 0.67 and 0.52 in case of LSSVR and ANN respectively. The performance of ANN also decreases with the in crease in leadtime. It is also found that the difference in prediction performance obtained from LSSVR and ANN is more prominent for shorter lead times. As the lead time increases, the performance of LSSVR and ANN becomes comparable. Different error measures, i.e., Mean Absolute Error (MAE), Mean Absolute Relative Error (MARE) and Root Mean Square Error (RMSE) are also determined for LSSVR and ANN based predictions. Results are shown in Table 6. Based on these measures also, it is found that error measures for LSSVR are bet ter compared to that of ANN. These observations indi cate the higher capability of LSSVR approach to capture the river flow variation using the previous inf ormation. For a visual comparison of prediction performances by LSSVR and ANN simultaneously, both LSSVR and ANN model predicted river flow values are plotted with the observed river flow during the model testing period (Figure 5). This is shown in case of 1dayahead prediction—best for both LSSVR and ANN. It is ob served that predicted river flow values are very well corresponds to the low and medium range of river flow in case of LSSVR and ANN. Relatively better perfor mance of LSSVR is discussed with respect to the sta tistical measures as shown in Table 5. However, as discussed in case of LSSVR earlier, the peak river flow values are not captured with reasonable accuracy in case of ANN as well. Apart from the fact that these peaks are very shortlived (at daily scale), this is a shortcoming for both LSSVR and ANN based approaches. The reason could be the same as discussed before, i.e., these peaks might be the effect of some factor, such as, sudden flash rain, which is shorter than the daily temporal resolution and thus, not available from the information contained in the previous days river flows alone. Consideration and incorporation of such additional inputs could be future scope of study. However, the better performance of LS SVR, particularly for shorter prediction leadtimes, as compared to that of ANN, is noticed in this study. The better performance of LSSVR over ANN may be Table 5. Performance measures for LSSVR and ANN models for training and testing period (19782000). Testing period values are shown within parentheses. LSSVR ANN Leadtime (in days) Optimum number of previous daily flow(s) to be considered CC NSE CC NSE 1 5 0.83 (0.83) 0.68 (0.67) 0.76 (0.73) 0.58 (0.52) 2 2 0.71 (0.71) 0.51 (0.49) 0.71 (0.71) 0.50 (0.49) 3 1 0.64 (0.64) 0.40 (0.39) 0.63 (0.59) 0.39 (0.33) 4 1 0.61 (0.60) 0.36 (0.35) 0.66 (0.55) 0.43 (0.30) 5 1 0.58 (0.58) 0.33 (0.32) 0.62 (0.56) 0.39 (0.30) CC: Correlation Coefficient; NSE: NashSutcliffe Efficiency (NSE). Table 6. Error measures for LSSVR and ANN models for training and testing period. Testing period values are shown within parentheses. LSSVR ANN Leadtime (in days) MAE MARE RMSE MAE MARE RMSE 1 0.006 (0.008) 0.031 (0.041) 0.024 (0.029) 0.008 (0.009) 0.055 (0.049) 0.028 (0.035) 2 0.008 (0.010) 0.044 (0.059) 0.030 (0.036) 0.008 (0.010) 0.047 (0.056) 0.031 (0.036) 3 0.009 (0.012) 0.056 (0.065) 0.033 (0.040) 0.009 (0.011) 0.053 (0.058) 0.034 (0.042) 4 0.011 (0.012) 0.066 (0.070) 0.034 (0.041) 0.010 (0.012) 0.061 (0.061) 0.033 (0.043) 5 0.011 (0.013) 0.069 (0.073) 0.035 (0.042) 0.010 (0.012) 0.060 (0.067) 0.034 (0.042) MAE: Mean Absolute Error; MARE: Mean Absolute Relative Error; RMSE : Root Mean Square Error. Copyright © 2012 SciRes. JWARP
P. P. BHAGWAT, R. MAITY 537 Figure 5. Comparison between observed and predicted river flow from LSSVR and ANN (1dayahead). attributed to its fundamental approach towards error mi nimization. Fundamental difference in the working prin ciples of ANN and LSSVR (SVM, in general) lies in their approaches of risk minimization. ANN follows an Empirical Risk Minimization (ERM) approach, whereas Structural Risk Minimization (SRM) principle is fol lowed in LSSVR. The SRM minimizes an upper bound on the generalization error, as opposed to ERM which minimizes the error on the training data. Thus, the solu tions may fall in to local optima in case of ANN [27]. It is this difference which equips LSSVR with a greater potential to generalize the regression surface without the danger of overfitting the training data set and global opti mum solution is guaranteed in LSSVR [28]. With respect to the number of parameters, LSSVR is less complex than ANN. As discussed earlier, there are only three parameters in LSSVR based approach, namely—regularization parameter ( ) RBF kernel pa rameter ( ) and number of previous daily river flow values to be considered (n). However, in ANN, the num ber of hidden layers, number of hidden nodes, transfer functions and so on must be determined, which are com paratively more complex. On the other hand, LSSVR is able to provide better prediction with small sample size. This is because the decision function of LSSVR is only determined by supporting vectors. In general, the su pporting vectors are only a part of training pattern (from available river flow data) and the remaining pattern are not used in constructing the LSSVR model. Therefore the performance of LSSVR may still be acceptable, even if the sample size is small. In contrast, the decision fun ction in ANN is determined by all training data sets [29]. Thus, generalization of relationship between past and future river flow values is more likely in case of LSSVR as found in this study and thus, more suitable for mul tistepahead prediction. 5. Conclusions In this study, daily variation of river flow is modelled and potential Least SquareSupport Vector Regression (LSSVR) is used for multistepahead river flow predic tion. Daily river flow values from Sandia station at upper Narmada river basin in India are used for illustration. Bargi dam is located few hundred km upstream of Sandia. It is investigated whether it is required to consider the releases from the upstream dam to model the daily varia tion at Sandia. Parameters of LSSVR – (regulariza tion parameter) and (RBF kernel parameter) and optimum numbers of previous river flow values to be considered (n) are estimated based on the model per formance during model development period (June 1, 1978 to May 31, 1986 excluding the missing data). The model is tested for the period June 1, 1986 to May 31, 2000 and different statistical performance measures (CC, NSE, RMSE, MAE and MARE) are obtained. These sta tistical measures confirm the well correspondence be tween observed and predicted river flow values. This correspondence is found to be better for low and medium range of flow values. However, the peak river flow va lues are not captured with reasonable accuracy in case of both LSSVR and ANN. While comparing the performance during pre and postconstruction of dam, it is found that the prediction performances are similar for both the flow regimes, which indicates that we may ignore the releases from the dam at daily scale for this gauging site. In order to inves tigate the temporal horizon over which this may be relied upon, a multistepahead prediction is carried out and the model performance is investigated up to 5dayahead Copyright © 2012 SciRes. JWARP
P. P. BHAGWAT, R. MAITY 538 predictions. The performance is found to be decrease with the increase in leadtime. In other words, the per formance is better for shorter leadtimes. General comparison between LSSVR and ANN mo del reveals that the performance of LSSVR is better that that of ANN for all the leadtimes—1d ayahead through 5dayahead prediction. Better performance of LSSVR, in comparison with that of ANN, becomes more pro minent for shorter prediction leadtimes. Thus, the better performance of LSSVR, as compared to that of ANN, is noticed for multistepahead prediction. The superior per formance of LSSVR over ANN may be attributed to its fundamental approach towards error minimization, en sured global optimum solution and capability to gene ralize the relationship between past and future river flow values, even with short length of available data. Thus, it may be inferred that Structural Risk Minimization (SRM) is better approach as compared to Empirical Risk Mini mization (ERM) and LSSVR, being a SRM based app roach, may be used for multistepahead prediction to obtain better performance. Use of exogenous inputs may be of further research interests. REFERENCES [1] V. N. Vapnik, “Statistical Learning Theory,” John Wiley and Sons, New York, 1998. [2] J. A. K. Suykens and J. Vandewalle, “Least Squares Support Vector Machine Classifiers,” Neural Processing Letters, Vol. 9, No. 3, 1999, pp. 293300. doi:10.1023/A:1018628609742 [3] B. E. Boser, I. Guyon and V. Vapnik, “A Training Algo rithm for Optimal Margin Classifiers,” Proceedings Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, 1992, pp. 144152. [4] H. D. Drucker, C. J. C. Burges, L. Kaufman, A. Smola and V. Vapnik, “Support Vector Regression Machines,” In: M. C. Mozer, M. I. Jordan and T. Petsche, Eds., Ad vances in Neural Information Processing Systems, Vol. 9, Morgan Kaufmann, San Mateo, 1997, pp. 155161. [5] Y. B. Dibike, S. Velickov, D. Slomatine and M. B. Ab bott, “Model Induction with Support Vector Machines: Introduction and Applications,” Journal of Computing in Civil Engineering, Vol. 15, No. 3, 2001, pp. 208216. doi: 10.1061/(ASCE)08873801(2001)15:3(208) [6] S. Tripathi, V. V. Srinivas and R. S. Nanjundian, “Down scaling of Precipitation for Climate Change Scenarios: A Support Vector Machine Approach,” Journal of Hydrol ogy, Vol. 330 No. 34, 2006, pp. 621640. doi: 10.1016/j.jhydrol.2006.04.030 [7] W. Wu, X. Wang, D. Xie and H. Liu, “Soil Water Con tent Forecasting by Support Vector Machine in Purple Hilly Region,” International Federation for Information Processing, Vol. 258, 2008, pp. 223230. doi:10.1007/9780387772516_25 [8] R. Maity, P. P. Bhagwat and A. Bhatnagar, “Potential of Support Vector Regression for Prediction of Monthly Streamflow Using Endogenous Property,” Hydrological Processes, Vol. 24, No. 7, 2010, pp. 917923. doi: 10.1002/hyp.7535 [9] S.Y. Liong and C. Sivapragasam, “Flood Stage Forecas ting with Support Vector Machines,” Journal of the Ame rican Water Resources Association, Vol. 38, No. 1, 2002, pp. 173196. doi:10.1111/j.17521688.2002.tb01544.x [10] M. Bray and D. Han, “Identification of Support Vector Machines for Runoff Modelling,” Journal of Hydroin formatics, Vol. 6, No. 4, 2004, pp. 265280. [11] P. Samui, “Application of Least Square Support Vector Machine (LSSVM) for Determination of Evaporation Losses in Reservoirs,” Engineering, Vol. 3, No. 4, 2011, pp. 431434. doi:10.4236/eng.2011.34049 [12] N. She and D. Basketfield, “Long Range Forecast of Stream Flow Using Support Vector Machine,” Proceed ings of the World Water and Environment Resources Congress, ASCE, Anchorage, 2005. doi:10.1061/40792(173)481 [13] X. Zhang, R. Srinivasan and M. V. Liew, “Approxi mat ing SWAT Model Using Artificial Neural Network and Support Vector Machine,” Journal of the American Water Resources Association, Vol. 45, No. 2, 2009, pp. 460474. doi:10.1111/j.17521688.2009.00302.x [14] I. M. Khadam and J. J. Kaluarachchi, “Use of Soft Infor mation to Describe the Relative Uncertainty of Calibra tion Data in Hydrologic Models,” Water Resources Re search, Vol. 40, No. W11505, 2004, p. 15. doi: 10.1029/2003WR002939 [15] J. A. K. Suykens, J. De Brabanter, L. Lukas and J. Vandewalle, “Weighted Least Squares Support Vector Machines: Robustness and Sparse Approximation,” Neu rocomputing, Vol. 48, No. 14, 2002, pp. 85105. doi:10.1016/S09252312(01)006440 [16] Z. Qin, Q. Yu, J. Li, Z. Wu and B. Hu, “Application of Least Squares Vector Machines in Modelling Water Va por and Carbon Dioxide Fluxes over a Cropland,” Jour nal of Zhejiang University Science, Vol. B6, No. 6, 2005, pp. 491495, doi: 10.1631/jzus.2005.B0491 [17] P. Aksornsingchai and C. Srinilta, “Statistical Down scaling for Rainfall and Temperature Prediction in Thai land,” Proceeding of the International MultiConference of Engineers and Computer Scientists, Hong Kong, Vol. 1, 1618 March 2011. [18] G. Zhang, B. E. Patuwo and M. Y. Hu, “Forecasting with Artificial Neural Networks: The State of the Art,” Inter national Journal of Forecasting, Vol. 14, No. 1, 1998, pp. 3562. doi:10.1016/S01692070(97)000447 [19] T. Naes, K. Kvaal, T. Isaksson and C. Miller, “Artificial Neural Networks in Multivariate Calibration,” Journal of NearInfrared Spectroscopy, Vol. 1, 1993, pp. 111. doi:10.1255/jnirs.1 [20] A. Y. Shamseldin, “Application of a Neural Network Technique to RainfallRunoff Modeling,” Journal of Hy drology, Vol. 199, No. 3, 1997, pp. 272294. [21] C. M. Zealand, D. H. Burn and S. P. Simonovic, “Short Term Stream Flow Forecasting Using Artificial Neural Copyright © 2012 SciRes. JWARP
P. P. BHAGWAT, R. MAITY Copyright © 2012 SciRes. JWARP 539 Networks,” Journal of Hydrology, Vol. 214, No. 14, 1999, pp. 3248. doi:10.1016/S00221694(98)00242X [22] A. S. Weigend, D. E. Rumelhart and B. A. Huberman, “Predicting the Future: A Connectionist Approach,” In ternational Journal of Neural Systems, Vol. 1, No. 3, 1992, pp. 193209. [23] R. Maity and D. Nagesh Kumar, “BasinScale Stream Flow Forecasting Using the Information of LargeScale Atmospheric Circulation Phenomena,” Hydrological Pro cesses, Vol. 22, No. 5, 2008, pp. 643650. doi:10.1002/hyp.6630 [24] P. Coulibaly and N. D. Evora, “Comparison of Neural Network Methods for Infilling Missing Daily Weather Records,” Journal of Hydroogy, Vol. 341, No. 12, 2007, pp. 2741. doi:10.1016/j.jhydrol.2007.04.020 [25] H. F. Zou, G. P. Xia, F. T. Yang and H. Y. Wang, “An Investigation and Comparison of Artificial Neural Net work and Time Series Models for Chinese Food Grain Price Forecasting,” Neurocomputing, Vol. 70, No. 1618, 2007, pp. 29132923. doi:10.1016/j.neucom.2007.01.009 [26] ASCE, “Artificial Neural Networks in Hydrology. II: Hydrologic Applications,” ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, Journal of Hydrologic Engineering, Vol. 5, No. 2, 2000, pp. 124137. doi:10.1061/(ASCE)10840699(2000)5:2(124) [27] T. Van Gestel, J. A. K. Suykens, B. Baesens, S. Viaene, J. Vanthienen, G. Dedene, B. De Moor and J. Vandewalle, “Benchmarking Least Squares Support Vector Machine Classifiers,” Machine Learning, Vol. 54, No. 1, 2004, pp. 532. doi:10.1023/B:MACH.0000008082.80494.e0 [28] W. H. Chen, J. Y. Shih and S. Wu, “Comparison of Sup portVector Machines and Back Propagation Neural Networks in Forecasting the Six Major Asian Stock Mar kets,” International Journal of Electronic Finance, Vol. 1, No. 1, 2006 , pp. 4967. [29] R. M. Balabin and E. I. Lomakina, “Support Vector Ma chine Regression (LSSVM)—An Alternative to Artifi cial Neural Networks (ANNs) for the Analysis of Quan tum Chemistry Data?” Physical Chemistry Chemical Phy sics, Vol. 13, No. 24, 2011, pp. 1171011718. doi:10.1039/c1cp00051a
