
Problems

Application of SVMs in the field of hydrology is gaining wide popularity and the results are found to be encouraging. Such applications include remotely sensed image classification [5], statistical downscaling [6], soil water forecasting [7], stream flow forecasting [8], and so on. Liong and Sivapragasam [9] compared SVM performance with that of other machine learning models, such as ANN, in forecasting flood stage and reported a superior performance of SVM. Bray and Han [10] used SVM for rainfall-runoff modelling and compared the model with a transfer function model; the study outlined a promising area of research for further application of SVMs in unexplored areas. Samui [11] used LS-SVM to determine the evaporation loss of a reservoir and established it to be a powerful approach for the determination of evaporation loss. She and Basketfield [12] forecasted spring and fall season stream flows in the Pacific Northwest region of the US using SVM and reported superior results in forecasting. Zhang et al. [13] studied and compared two machine learning approaches, ANN and SVM, for approximating the Soil and Water Assessment Tool (SWAT) model; the results showed that SVM in general exhibited better generalization ability than ANN. Khadam and Kaluarachchi [14] discussed the impact of accuracy and reliability of hydrological data on model calibration. This, coupled with the application of SVMs, was used to identify faulty model calibration, which would otherwise have gone undetected. The applicability of SVMs was also demonstrated in downscaling Global Circulation Models (GCMs), which are among the most advanced tools for estimating future climate change scenarios; the results presented SVMs as a compelling alternative to traditional Artificial Neural Networks (ANN) for conducting climate impact studies [10,11]. Monthly precipitation was also downscaled to basin scale using SVMs, with results reported to be encouraging in their accuracy while showing large promise for further applications.

1.3. Advancement of SVM

Apart from the general benefits of SVM pointed out in the aforementioned studies, SVMs are sometimes criticized for their large number of parameters and the high level of computational effort involved, particularly in the case of large datasets. Chunking is one of the proposed remedies to the latter problem. However, according to Suykens et al. [15], it is worthwhile to investigate the possibility of simplifying the approach to the extent possible without losing any of its advantages. Thus, they proposed a modification of the SVM approach, which essentially leads to Least Square-Support Vector Machines (LS-SVM).

The main advantage of LS-SVM is its higher computational efficiency compared with the standard SVM method, since training an LS-SVM requires only the solution of a set of linear equations instead of the long and computationally demanding quadratic programming problem involved in the standard SVM [2]. Qin et al. [16] investigated the application of LS-SVM for the modelling of water vapor and carbon dioxide fluxes; they found excellent generalization properties of LS-SVM and noted its potential for further applications in general hydrology. Maity et al. [13] investigated the potential of support vector regression, which is also based on the LS-SVM principle, for prediction of streamflow using the endogenous property of the monthly time series. In this study, the potential of LS-SVM for Regression (LS-SVR) is investigated for the objective outlined as follows.

1.4. Objective of This Study

Potential of LS-SVM for Regression (LS-SVR) is exploited for multistep-ahead river flow prediction at the daily scale, to assess its performance with increasing time horizon.

Copyright © 2012 SciRes. JWARP
P. P. BHAGWAT, R. MAITY

Most of the major river basins are being populated with major and minor dams. River flow modelling is expected to be influenced by the releases from these dams if the site location is on the downstream side of a dam. However, the effect of a dam's existence will gradually reduce with increasing distance from the dam location. The study is carried out with observed daily river flow in the upper Narmada River basin, with Sandia gauging station at the outlet. Bargi dam exists a few hundred km upstream from the outlet of the watershed. Details of this dam are provided in the "Study Area" section later. An investigation is carried out to assess the necessity of considering daily releases from the upstream dam in modelling the daily river flow variation at the outlet of the study area. Also, a multistep-ahead prediction is carried out to assess the maximum temporal horizon over which the prediction results may be relied upon. The results are compared with the performance of a neural networks approach that uses Empirical Risk Minimization (ERM).

2. Methodology

2.1. Data Normalization

The observed river flow data, as commonly used in data-driven models, is normalized to prevent the model from being dominated by large values. The performance of LS-SVM with normalized input data has been shown to outperform that with non-normalized input data [16]. Therefore, the data is normalized, and the model outputs are finally back-transformed to their original form by denormalization. The normalization (and back-transformation) is carried out using

y_i = 0.1 + S_i / (1.2 max_i S_i),    (1)

where y_i is the normalized data for the i-th day and S_i is the observed value for the i-th day.
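A minimal sketch of this normalization and its back-transformation (our illustrative Python, not the authors' code; Equation (1)'s scaling is reconstructed here as y_i = 0.1 + S_i/(1.2 max S), and the function names are ours):

```python
import numpy as np

def normalize(flow, flow_max=None):
    """Normalize observed flows as in Equation (1): y_i = 0.1 + S_i / (1.2 * max(S)).

    `flow_max` should be the maximum of the *training* series, so that the
    same scaling can be reused unchanged for the testing period.
    """
    flow = np.asarray(flow, dtype=float)
    if flow_max is None:
        flow_max = flow.max()
    return 0.1 + flow / (1.2 * flow_max), flow_max

def denormalize(y, flow_max):
    """Back-transform normalized model outputs to the original flow units."""
    return (np.asarray(y, dtype=float) - 0.1) * 1.2 * flow_max
```

Fixing `flow_max` from the training period keeps the transformation invertible and consistent across training and testing data.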

2.2. Least Square-Support Vector Regression (LS-SVR)

Let us consider a given training set {(x_i, y_i), i = 1, …, N}, where x_i is an n-dimensional input vector and y_i is a scalar measured output, which represents the system output. The subscript i indicates the i-th training pattern. In the context of multistep-ahead river flow prediction, x_i is the vector comprising the n previous days' normalized river flow values, y_i is the target river flow with a certain lead-time in day(s), and the subscript i indicates the reference time from which the n previous days and the lead-time are counted. The goal is to construct a function y = f(x) that represents the dependence of the output y on the input x. The form of this function is

y = w^T φ(x) + b,    (2)

where w is known as the weight vector and b as the bias.
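For illustration, the pairs (x_i, y_i), with x_i holding the n previous days' flows and y_i the flow at the chosen lead-time, can be assembled as follows (a sketch with our own function name, not code from the paper):

```python
import numpy as np

def make_training_pairs(series, n, lead):
    """Build (x_i, y_i) pairs from a daily flow series:
    x_i holds the n previous daily flows, and y_i is the flow
    `lead` days after the last observed day in x_i."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for t in range(n, len(series) - lead + 1):
        X.append(series[t - n:t])        # n previous days
        y.append(series[t + lead - 1])   # target at the given lead-time
    return np.array(X), np.array(y)
```

Separate (X, y) sets built with different `lead` values yield the different lead-time models used in the multistep-ahead experiments.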

This regression model can be constructed using a nonlinear mapping function φ(·): R^n → R^h, which maps the data into a higher, possibly infinite, dimensional feature space. The main difference from the standard SVM is that LS-SVR involves equality constraints instead of inequality constraints, and works with a least squares cost function. The optimization problem and the equality constraints are defined by the following equations:

Minimize  J(w, e) = (1/2) w^T w + (γ/2) Σ_{i=1}^{N} e_i²,
such that  y_i = w^T φ(x_i) + b + e_i,  i = 1, …, N,    (3)

where e_i is the random error and γ is a regularization parameter governing the trade-off between minimizing the training errors and minimizing the model's complexity. The objective is now to find the optimal parameters that minimize the prediction error of the regression model. The optimal model will be chosen by minimizing the cost function in which the errors e_i are minimized. This formulation corresponds to regression in the feature space and, since the dimension of the feature space is high, possibly infinite, the problem is difficult to solve. Therefore, to solve this optimization problem, the following Lagrange function is formed:

L(w, b, e; α) = J(w, e) - Σ_{i=1}^{N} α_i { w^T φ(x_i) + b + e_i - y_i },    (4)

where α_i (i = 1, …, N) are the Lagrange multipliers.

The solution of the above can be obtained by partially differentiating L with respect to w, b, e_i and α_i, i.e.

∂L/∂w = 0  →  w = Σ_{i=1}^{N} α_i φ(x_i)    (5)

∂L/∂b = 0  →  Σ_{i=1}^{N} α_i = 0    (6)

∂L/∂e_i = 0  →  α_i = γ e_i,  i = 1, …, N    (7)

∂L/∂α_i = 0  →  w^T φ(x_i) + b + e_i - y_i = 0,  i = 1, …, N    (8)

From the set of Equations (5)-(8), w and e can be eliminated and, finally, the estimated values of b and α_i, i.e. b̂ and α̂_i, can be obtained by solving the resulting linear system. Replacing w in Equation (2) from Equation (5), the kernel trick may be applied as follows:

K(x, x_i) = φ(x)^T φ(x_i).    (9)

Here, the kernel trick means a way to map the observations to an inner product space without actually computing the mapping, and it is expected that the observations will have a meaningful linear structure in that inner product space.

Thus, the resulting LS-SVR model can be expressed as

ŷ = Σ_{i=1}^{N} α̂_i K(x, x_i) + b̂,    (10)

where K(x, x_i) is a kernel function.

In comparison with some other feasible kernel functions, the RBF is more compact and able to shorten the computational training process and improve the generalization performance of LS-SVR (LS-SVM, in general), a feature of great importance in designing a model [13]. Aksornsingchai and Srinilta [17] studied support vector machines with a polynomial kernel (SVM-POL) and with a Radial Basis Function kernel (SVM-RBF) and found SVM-RBF to be the more accurate model for statistical downscaling. Also, many works have demonstrated the favorable performance of the radial basis function [9,15]. Therefore, the radial basis function is adopted in this study. The nonlinear radial basis function (RBF) kernel is defined as:

K(x, x_i) = exp( -‖x - x_i‖² / (2σ²) ),    (11)

where σ is the kernel function parameter of the RBF kernel. The symbol ‖·‖ denotes the norm of a vector; thus, ‖x - x_i‖² is basically the squared Euclidean distance between the vectors x and x_i. In the context of river flow prediction, x is the new vector of previous river flow values, based on which the multistep-ahead prediction ŷ is made with a certain lead-time. The observed y_i and predicted ŷ_i are compared to assess the model performances.
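Under these definitions, the dual problem of Equations (5)-(8) collapses to a single linear system in (b, α). The following sketch (our illustrative Python, not the authors' code) trains and applies an RBF-kernel LS-SVR this way:

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """Equation (11): K(a, b) = exp(-||a - b||^2 / (2 * sigma2)), all row pairs."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma2))

def lssvr_fit(X, y, gamma, sigma2):
    """Solve the LS-SVM dual system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    N = len(y)
    K = rbf_kernel(X, X, sigma2)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0                      # constraint sum(alpha) = 0
    A[1:, 0] = 1.0                      # bias column
    A[1:, 1:] = K + np.eye(N) / gamma   # regularized kernel matrix
    sol = np.linalg.solve(A, np.concatenate(([0.0], np.asarray(y, float))))
    return sol[0], sol[1:]              # b_hat, alpha_hat

def lssvr_predict(Xnew, X, b, alpha, sigma2):
    """Equation (10): y_hat = sum_i alpha_i * K(x, x_i) + b."""
    return rbf_kernel(np.asarray(Xnew, float), X, sigma2) @ alpha + b
```

For N training points this solves one (N+1)×(N+1) linear system, which is the computational advantage over the quadratic programming of standard SVM noted above.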

2.3. Model Calibration and Parameter Estimation

The regularization parameter γ determines the trade-off between fitting error minimization and smoothness of the estimated function. It is not known beforehand which γ and σ² are best for a particular problem to achieve maximum performance with LS-SVR models. Thus, the regularization parameter γ and the RBF kernel parameter σ² have to be calibrated during the model development period. These parameters are interdependent, and their (near) optimal values are often obtained by a trial-and-error method. This interrelationship is also coupled with the number of previous river flow values to be considered, which is denoted as n. In order to find all these parameters (n, γ and σ²), the grid search method is employed in the parameter space. Once the parameters are estimated from the training dataset, the obtained LS-SVR model is complete and ready to use for modelling a new river flow data period. Performance of the developed model is then assessed with the data set of the testing period. Different models (parameter sets) are developed for different prediction lead-times in the case of multistep-ahead prediction.
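As an illustration of this calibration loop (our sketch, not the authors' code), a grid search over (n, γ, σ²) can be written as follows; candidates are scored here by RMSE, with NSE also shown, since both are standard measures used later in the paper:

```python
import itertools
import numpy as np

def rmse(obs, sim):
    """Root Mean Square Error between observed and simulated series."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return np.sqrt(np.mean((obs - sim) ** 2))

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - sum((obs-sim)^2) / sum((obs-mean(obs))^2)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - ((obs - sim) ** 2).sum() / ((obs - obs.mean()) ** 2).sum()

def grid_search(train_fn, n_grid, gamma_grid, sigma2_grid):
    """Return the (n, gamma, sigma2) combination minimizing validation RMSE.

    `train_fn(n, gamma, sigma2)` is user-supplied: it must train a model for
    that parameter set and return (obs, sim) on the validation period.
    """
    best, best_score = None, np.inf
    for n, g, s2 in itertools.product(n_grid, gamma_grid, sigma2_grid):
        obs, sim = train_fn(n, g, s2)
        score = rmse(obs, sim)
        if score < best_score:
            best, best_score = (n, g, s2), score
    return best, best_score
```

Selecting on a held-out validation period, rather than the training fit itself, is what keeps the chosen parameter set from simply memorizing the training data.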

2.4. Comparison with Artificial Neural Networks (ANN)

The flexible-computing based ANN models have been extensively studied and used for time series forecasting in many hydrologic applications since the late 1990s. This model has the capability to execute complex mapping between input and output and to form a network that approximates nonlinear functions. A single hidden layer feed-forward network is the most widely used model form for time series modeling and forecasting [18]. This model usually consists of three layers: the first is the input layer, where the data are introduced to the network; it is followed by the hidden layer, where the data are processed; and the final, or output, layer is where the results for the given input are produced.

The number of input nodes and output nodes in an ANN depends on the problem to which the network is being applied. However, there is no fixed method to determine the number of hidden layer nodes. If there are too few nodes in the hidden layer, the network may have difficulty in generalizing the problem. On the other hand, if there are too many nodes in the hidden layer, the network may take an unacceptably long time to learn anything from the training set [19]. An increase in the number of parameters may also slow the calibration process [20]. In a study by Zealand et al. [21], networks were initially configured with both one and two hidden layers; however, the improvement in forecasting results was only marginal for the two hidden layer cases. Therefore, it is decided to use a single hidden layer in this study. In most cases, a suitable number of neurons in the hidden layer is obtained by the trial-and-error method [22]. Maity and Nagesh Kumar [23] proposed a GA-based evolutionary approach to decide the complete network structure.

The output X_t of an ANN, assuming a linear output neuron having a single hidden layer with h sigmoid hidden nodes, is given by:

X_t = g( Σ_{j=1}^{h} w_j f(s_j) + b_k ),    (12)

where g and b_k are the linear transfer function and bias, respectively, of the k-th output neuron, w_j is the connection weight between the j-th neuron of the hidden layer and the output unit, and f is the transfer function of the hidden layer [24]. The transfer functions can take several forms; the most widely used are the log-sigmoid, linear and hyperbolic tangent sigmoid functions.

moid etc. In this study, hyperbolic tangent sigmoid is

used:

in study area is from 289 to 1134 m. The basin lies

between east longitudes 78˚30' and 81˚45', and north

latitudes 21˚20' and 23˚45'. Bargi dam (later renamed as

Rani Avanti Bai Sagar Project) is a major structure in the

basin up to Sandia, which is located few hundred km up-

stream of Sandia. The latitude and longitude of the dam

are 22˚56'30''N and 79˚55'30''E, respectively. It was

constructed in late eighties and being operated from early

nineties. Thus, pre-construction period (1978-1986) is

considered for training. For testing, two sets of data are

used—pre-construction data set (1986-1990) and post-

construction period (1990-2000). However, out of this

entire range four year s data is missing (1, 1981 - 31 May

1982; June 1, 1987 - May 31, 1988; June 1, 1993 - May

31, 1994; June 1, 1998 - May 31, 1999). These periods

are ignored from the analysis. River flow data from

Sandia station, operated by the Water Resources Agency,

is obtained from Central Water Commission, Govt. of

India. Among these records, a daily data (June 1, 1978 to

May 31, 1986) is used for training and (June 1, 1986 to

May 31, 2000) is used to test the model performance.

21

xp2

i

s

i

i

1e

i

fs (13)

0

n

i

i

s

wx

where is the input signal referred to as the

weighted sum of incoming information. Several optimi-

zation algorithms can be used to train the ANN. Among

the various training algorithms available, the back-

propagation is most popular and widely used algorithm

[25]. Details of this techniques is well established in the

literature and can be found elsewhere (ASCE 2000 and

references therein) [26].

3. Study Area and Data Sets

Narmada River is the largest west flowing river of Indian

peninsula. It is the fifth largest river in India. The study

area is up to Sandia gauging station, which is in the

upstream part of Narmada river basin as shown in Figure

1. The upstream part Narmada river basin is in the state

of Madhya Pradesh, India. The river originates from

Maikala ranges at Amarkantak and flows westwards over

a length of 334 km up to Sandia. The elevation difference

4. Results and Discussions: Performance of LS-SVR for River Flow Prediction

4.1. Data Pre-Processing and Parameter Estimation

Observed river flow data at Sandia is normalized as explained in the methodology before proceeding to parameter estimation.

Figure 1. Narmada river basin with study area up to Sandia station.

The optimum number of previous river

flow values to be considered is denoted as n. This parameter, along with the regularization parameter γ and the RBF kernel parameter σ², is calibrated during the model development (training) period. To select the best combination of n, γ and σ², the grid search method is used. Model performances for different combinations of these parameters are assessed based on statistical measures such as the Correlation Coefficient (CC), Root Mean Square Error (RMSE) and Nash-Sutcliffe Efficiency (NSE). Ten different values of n (1 through 10) are tested to decide the optimum number of previous daily river flow values to be considered for the best possible results. The range of γ considered is 25 to 1000 with a resolution of 25, and σ² is in the range 0.01 to 1 with a resolution of 0.01. Approximately (because of the different lag and lead-times considered) 2556 data points are used for training. Performance of each model is assessed with the remaining testing data points. Model performance statistics are obtained between observed and modelled river flow values during the training and testing periods. The combination that yields comparable performance during the training and testing periods is
Multistep-ahead River Flow Prediction using LS-SVR at Daily Scale