Linear genetic programming application for successive-station monthly streamflow prediction
Author's Accepted Manuscript
Ali Danandeh Mehr, Ercan Kahya, Cahit Yerdelen

PII: S0098-3004(14)00101-0
DOI: http://dx.doi.org/10.1016/j.cageo.2014.04.015
Reference: CAGEO 3373
To appear in: Computers & Geosciences
Received date: 9 November 2013
Revised date: 22 March 2014
Accepted date: 29 April 2014
Cite this article as: Ali Danandeh Mehr, Ercan Kahya, Cahit Yerdelen, Linear genetic programming application for successive-station monthly streamflow prediction, Computers & Geosciences, http://dx.doi.org/10.1016/j.cageo.2014.04.015
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting galley proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
www.elsevier.com/locate/cageo
Linear Genetic Programming Application for Successive-Station Monthly Streamflow Prediction

Short Title: Streamflow prediction using LGP

ALI DANANDEH MEHR (a), ERCAN KAHYA (b), CAHİT YERDELEN (c)

(a) Corresponding author. Istanbul Technical University, Civil Engineering Department, Hydraulics Division, 34469, Maslak, Istanbul, Turkey. Phone: +90 553 417 8028; Fax: +90 212 285 65 87; E-mail:
(b) Istanbul Technical University, Civil Engineering Department, Hydraulics Division, Istanbul, Turkey. E-mail: [email protected]
(c) Ege University, Civil Engineering Department, Hydraulics Division, Izmir, Turkey. E-mail:
Abstract

In recent decades, artificial intelligence (AI) techniques have been widely used to model a broad range of hydrological phenomena. Many studies continue to compare these techniques in search of approaches that are more effective in terms of accuracy and applicability. In this study, we examined the ability of the linear genetic programming (LGP) technique to model the successive-station monthly streamflow process as a practical alternative for streamflow prediction. A comparative efficiency study between LGP and three different artificial neural network algorithms, namely feed-forward back propagation (FFBP), generalized regression neural networks (GRNN), and radial basis function (RBF) networks, is also presented. To this end, we first put forward six different successive-station monthly streamflow prediction scenarios and trained them with LGP and FFBP using field data recorded at two gauging stations on the Çoruh River, Turkey. Based on the Nash-Sutcliffe and root mean square error measures, we then compared the efficiency of these techniques and selected the best prediction scenario. Finally, the GRNN and RBF algorithms were used to restructure the selected scenario and were compared with the corresponding FFBP and LGP models. Our results indicate the promising role of LGP for successive-station monthly streamflow prediction, providing more accurate results than all of the ANN algorithms. The best prediction model for the river was an explicit LGP-based expression, evolved using only the basic arithmetic functions, which uses the records of both the target and upstream stations.

Keywords: Artificial neural networks; linear genetic programming; streamflow prediction; successive stations.
Introduction

Artificial neural networks (ANNs) are among the most popular artificial intelligence (AI) techniques and are broadly used in various fields of geoscience. They are capable of using field data directly and modelling the corresponding phenomena without prior knowledge of them. Successful results of ANN applications in geoscience, particularly in hydrological prediction, have been extensively published in recent years (e.g. Minns and Hall 1996; Nourani et al. 2008; Besaw et al. 2010; Piotrowski et al. 2014). Our review of the application of different ANN structures in streamflow forecasting indicates that the topic has received tremendous research attention (e.g. Dawson and Wilby 1998; Dolling and Varas 2002; Cannas et al. 2006; Kerh and Lee 2006; Kisi and Cigizoglu 2007; Adamowski 2008; Kişi 2009; Shiri and Kisi 2010; Marti et al. 2010; Nourani et al. 2011; Abrahart et al. 2012; Can et al. 2012; Krishna 2013; Kalteh 2013; Danandeh Mehr et al. 2013).
Minns and Hall (1996) introduced ANNs as rainfall-runoff models and demonstrated that they are capable of identifying usable relationships between discharges and antecedent rainfalls. Kerh and Lee (2006) applied an ANN-based model using information at stations upstream of the Kaoping River to forecast flood discharge at a downstream station that lacks measurements. They found that the back-propagation ANN model performs relatively better than the conventional Muskingum method. Besaw et al. (2010) developed two different ANN models using time-lagged records of precipitation and temperature in order to forecast streamflow in an ungauged basin in the US. The authors explained that ANNs forecast daily streamflow in nearby ungauged basins as accurately as in the basin on which they were trained. Can et al. (2012) used streamflow records of nine gauging stations located in the Çoruh River basin, Turkey, to model daily streamflow. They compared the performance of their ANN-based models with those of autoregressive moving average (ARMA) models and demonstrated that the ANNs achieved higher performance than ARMA. A comprehensive review of the application of different ANN structures in river flow prediction has been presented by Abrahart et al. (2012).
In spite of providing satisfactory estimation accuracy, all of the aforementioned ANN-based models are implicit and are often criticized as 'ultimate black boxes' that are difficult to interpret (Babovic 2005). Depending on the number of hidden layers applied, they may produce a huge matrix of weights and biases. Consequently, further study aimed at developing models that are not only explicit but also precise still requires serious attention.
Genetic programming (GP) is a heuristic evolutionary computing technique (Koza 1992; Babovic 2005) that has been put forward as an explicit predictive modelling tool for hydrological studies (Babovic and Abbott 1997a; Babovic and Keijzer 2002). The capability of GP to model hydro-meteorological phenomena, as well as its degree of accuracy, is among the debated topics in recent hydroinformatics studies (e.g. Ghorbani et al. 2010; Kisi and Shiri 2012; Yilmaz and Muttil 2013; Wang et al. 2014).
After Babovic and Abbott (1997b), who presented GP as an advanced operational tool for solving a wide range of hydrological modelling problems, GP and its variants and advancements were applied broadly to different hydrological processes such as rainfall-runoff (Babovic and Keijzer 2002; Khu et al. 2001; Liong et al. 2001; Whigham and Crapper 2001; Nourani et al. 2012), sediment transport (Babovic 2000; Aytek and Kisi 2008; Kisi and Shiri 2012), sea level fluctuation (Ghorbani et al. 2010), precipitation (Kisi and Shiri 2011), evaporation (Kisi and Guven 2010), and others.
GP and its variants have also received remarkable attention in the most recent comparative studies among different AI techniques (e.g. Ghorbani et al. 2010; Kisi and Guven 2010; Kisi and Shiri 2012). In the field of streamflow forecasting, Guven (2009) applied linear genetic programming (LGP), an advancement of GP, and two versions of neural networks for daily flow prediction in the Schuylkill River, USA. The author demonstrated that the performance of LGP was moderately better than that of the ANNs. Wang et al. (2009) developed and compared several AI techniques comprising ANN, adaptive neural-based fuzzy inference system (ANFIS), GP, and support vector machine (SVM) for monthly streamflow forecasting using long-term observations. Their results indicated that the best performance in terms of different evaluation criteria was obtained by ANFIS, GP, and SVM. Londhe and Charhate (2010) used ANN, GP, and model trees (MT) to forecast river flow one day in advance at two gauging stations in India's Narmada catchment. The authors concluded that the ANN and MT techniques perform almost equally well, but GP performs better than its counterparts. Ni et al. (2010) applied GP to model the impact of climate change on the annual streamflow of the West Malian River, China. They compared the results of GP with those of ANN and multiple linear regression models and indicated that GP provides higher accuracy than the others. Yilmaz and Muttil (2013) used GP to predict river flows in different parts of the Euphrates River basin, Turkey. They compared the results of GP with those of ANN and ANFIS and demonstrated that GP is superior to ANN in the middle zone of the basin. Among the most recent comparative studies between different AI techniques, Wang et al. (2014) proposed singular spectrum analysis (SSA) to modify SVM, GP, and seasonal autoregressive (SAR) models. They applied the modified models to predict monthly inflow to the Three Gorges Reservoir and indicated that the modified GP is slightly superior to the modified SVM in predicting peak discharges. Although there are some other comparative studies between GP and different AI techniques, to the best of our knowledge, there is no research examining the performance of LGP for successive-station monthly streamflow prediction in comparison with different ANN structures and algorithms.
The main goals and motivations of our study are (i) to further enhance the available LGP modelling tool to provide an explicit expression for successive-station streamflow prediction and (ii), for the first time, to compare the efficiency of LGP with three different ANN algorithms for monthly streamflow prediction. To this end, at the first stage, we put forward six different successive-station prediction scenarios structured by the commonly used feed-forward back propagation neural network algorithm (FFBP). Then, using the LGP technique, a new set of explicit expressions was generated for these scenarios. We performed a comparative performance analysis between the proposed LGP and FFBP models using the Nash-Sutcliffe efficiency and root mean square error measures. As a consequence of the first stage of the study, the best scenario was identified and discussed. In the second stage, two other ANN algorithms, namely generalized regression neural networks (GRNN) and radial basis function (RBF) neural networks, were utilized to restructure the best prediction scenario. Ultimately, we put forward a discussion of both the accuracy and the applicability of the different ANN and LGP models.
Gauging stations all over the world are being closed down where they are no longer required or where funding to support continued operation is limited. Using the successive-station prediction strategy, once a plausible model is developed between a pair of upstream-downstream stations, the model can be used as a substitute for a station that is at risk of discontinuation. In addition, since the inputs of successive-station prediction models are only time-lagged streamflow observations, such models are also considered more useful for catchments with sparse rain gauge networks (Besaw et al. 2010). The successive-station strategy also tends to decrease the lagged-prediction effect of the commonly proposed single-station runoff-runoff models, which has been noted by several researchers (Chang et al. 2007; De Vos and Rientjes 2005; Muttil and Chau 2006; Wu et al. 2009a).
Overview of FFBP, GRNN, and RBF networks

ANNs are black-box regression methods commonly used to capture the behaviour of nonlinear systems. FFBP networks are probably the most popular ANNs in hydrological problems (Tahershamsi et al. 2012; Krishna 2013) and are considered general nonlinear approximators (Hornik et al. 1989). The primary goal of this algorithm is to minimize the estimation error by searching for a set of connection (synaptic) weights that cause the network to produce outputs closer to the targets. FFBP networks are typically composed of three parts: a) an input layer including a number of input nodes, b) one or more hidden layers, and c) a number of output layer nodes. The number of hidden layers and their nodes are two of the design parameters of FFBP networks. A neural network with too many nodes may overfit the data, causing poor generalization on data not used for training, while too few hidden units may underfit the model (Fletcher et al. 1998). The input nodes do not perform any transformation upon the input data sets; they only send their initial weighted values to the hidden layer nodes. The hidden layer nodes typically receive the weighted inputs from the input layer or a previous hidden layer, perform their transformations on them, and pass the output to the next adjacent layer, which is generally another hidden layer or the output layer. The output layer consists of nodes that receive the hidden layer outputs and deliver the result to the modeller. Initial synaptic weights are progressively corrected during the training process, which compares predicted outputs with the corresponding observations and back-propagates any errors to minimize them. The design issues, training mechanisms, and applications of FFBP in hydrological studies have been the subject of many studies (e.g. Abrahart et al. 2012; Nourani et al. 2013); therefore, to avoid duplication, we only introduce the main concepts of FFBP here. The FFBP modelling attributes used in the proposed models are stated in the following sections.
RBF is a variant of ANN that uses radial basis functions as activation functions. With the exception of a few studies (e.g. Azmathullah et al. 2005; Bateni et al. 2007; Kisi 2008; Tahershamsi et al. 2012), RBF networks have not been used broadly in hydrological studies. A typical RBF network has a feed-forward structure consisting of an input layer, a hidden layer with a radial basis activation function, and a linear output layer. Each hidden layer node calculates the Euclidean distance between the centre of its function and the network input and then passes the result to the radial basis function. Thus the hidden layer performs a fixed nonlinear transformation that maps the input space onto a new space. The output layer implements a weighted sum of the hidden layer outputs. In order to use RBF, it is necessary to specify the number of layers, a radial activation function, and a criterion for network training. More information on the application of RBF networks to streamflow prediction has been provided by Kisi (2008).
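The forward pass just described can be sketched in a few lines. The following Python snippet is our own illustration with a Gaussian basis function, not the MATLAB code used in the study; the helper name `rbf_predict` is an assumption:

```python
import math

# Sketch of an RBF forward pass: Gaussian hidden units on fixed centres,
# followed by a linear output layer.
def rbf_predict(x, centres, spread, weights, bias=0.0):
    """x and each centres[k] are input vectors; spread is the basis width."""
    hidden = []
    for c in centres:
        # Euclidean distance between the input and the function centre
        dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        hidden.append(math.exp(-dist2 / (2.0 * spread ** 2)))
    # linear output layer: weighted sum of the hidden-layer outputs
    return sum(w * h for w, h in zip(weights, hidden)) + bias

# An input sitting exactly on a centre activates that unit fully (h = 1)
y = rbf_predict([1.0, 2.0], centres=[[1.0, 2.0], [5.0, 5.0]],
                spread=1.0, weights=[3.0, 0.0])
```

The spread parameter controls how quickly a hidden unit's response decays with distance from its centre, which is why its value must be tuned, as discussed later for the GRNN and RBF applications.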
GRNN is a kind of radial basis network that derives its estimation function directly from the training data in a single pass, without iterative back-propagation training. It typically comprises four layers: input, pattern, summation, and output layers. The number of input units in the first layer is equal to the total number of input parameters. The first layer is fully connected to the pattern layer, where each unit represents a training pattern and its output is a measure of the distance of the input from the stored patterns. Each pattern layer unit is connected to the two neurons (i.e. the S-summation and D-summation neurons) in the summation layer. The S-summation neuron computes the sum of the weighted outputs of the pattern layer, whereas the D-summation neuron calculates the unweighted outputs of the pattern neurons (Cigizoglu and Alp 2004). Successful application of GRNN networks for daily flow forecasting in intermittent rivers has been reported by Cigizoglu (2005). More details on the basics of the GRNN algorithm can be found in the literature (Specht 1991).
Overview of GP and LGP

GP is a so-called symbolic regression technique (Babovic 2005) that automatically solves problems without a pre-specified form or structure for the solution (Koza 1992). In brief, GP is a systematic, domain-independent method for getting computers to solve problems automatically, starting from a high-level statement of what needs to be done (Poli et al. 2008). Unlike ANN, GP is self-parameterizing: it builds the model structure without any user tuning.
Individual solutions in GP are computer programs represented as parse trees (Fig. 1). The population of the initial generation is typically generated through a random process, and subsequent generations are evolved through the genetic operators of selection, reproduction, crossover, and mutation (Babovic and Keijzer 2002). The major inputs for a GP model are (1) patterns for learning, (2) a fitness function (e.g. minimizing the squared error), (3) functional and terminal sets, and (4) parameters for the genetic operators, such as the crossover and mutation probabilities (Sreekanth and Datta 2011). As shown in Fig. 1, in GP modelling the functions and terminals are chosen randomly from the user-defined sets to form a computer model in a tree-like structure, with a root node and branches extending from each function and ending in a leaf, or terminal. In many cases in GP, the leaves are the inputs to the program. More details on GP can be obtained from Koza (1992) and Babovic and Keijzer (2000).
Fig. 1. Tree representation of the function (x·y - x/y)
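The way such a parse tree is executed can be illustrated with a short recursive evaluator. This is a Python sketch of ours; the tuple encoding of nodes is an assumption, not part of the study:

```python
# Minimal sketch of evaluating a GP parse tree such as Fig. 1: (x*y - x/y).
# Each node is a tuple ('op', left, right) for functions, or a terminal name.
OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
       '*': lambda a, b: a * b, '/': lambda a, b: a / b}

def evaluate(node, env):
    """Recursively evaluate a parse tree against terminal values in env."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    return env[node]  # a leaf/terminal: look up its value

# The tree of Fig. 1: root '-' with subtrees (x*y) and (x/y)
tree = ('-', ('*', 'x', 'y'), ('/', 'x', 'y'))
result = evaluate(tree, {'x': 6.0, 'y': 2.0})  # 6*2 - 6/2 = 9.0
```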
Besides tree-based GP, which is also referred to as traditional Koza-style GP, there are newer variants of GP such as linear, graph-based, probabilistic, and multi-objective GP (Poli et al. 2008). LGP is a subset of GP that has emerged recently (Brameier and Banzhaf 2007). There are some main differences between LGP and traditional GP. The tree-based programs used in GP correspond to expressions from a functional programming language: functions are located at the root and inner nodes, while the leaves hold input values or constants. In contrast, LGP denotes a GP variant that evolves sequences of instructions from an imperative programming language or machine language (Brameier and Banzhaf 2007). The term "linear" refers to the imperative program representation; it does not mean that the method provides linear solutions. An example of an LGP-evolved program, the C code (with introns removed) of a model developed in this study, is illustrated as follows:
L0: f[0] += Input001;
L1: f[0] += -1.0632883443450928f;
L2: f[0] *= f[0];
L3: f[0] /= 1.216173887252808f;
L4: f[0] += -0.2360081672668457f;
L5: f[0] *= f[0];
L6: f[0] *= Input000;
L7: f[0] += Input000;
where f[0] represents the temporary computation variable created in the program by LGP. LGP uses such temporary computation variables to store values while performing calculations. In this program, f[0] takes the value of Input001 in the first instruction, and the output is the value remaining in f[0] after the last line of the code.
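To make the instruction semantics concrete, the sequence above can be transcribed line by line. This is our Python transcription, not part of the evolved code; Input000 and Input001 become ordinary arguments:

```python
def evolved_program(input000, input001):
    """Step-by-step transcription of the LGP instruction sequence above.
    input000/input001 correspond to Input000/Input001 in the C code."""
    f0 = 0.0
    f0 += input001                     # L0
    f0 += -1.0632883443450928          # L1
    f0 *= f0                           # L2: square
    f0 /= 1.216173887252808            # L3
    f0 += -0.2360081672668457          # L4
    f0 *= f0                           # L5: square again
    f0 *= input000                     # L6
    f0 += input000                     # L7
    return f0
```

Read as a closed form, the program computes Input000 · (((Input001 − 1.0633)² / 1.2162 − 0.2360)² + 1), i.e. the upstream input scales and offsets a nonlinear transform of the other input.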
Similar to the pseudo-algorithm of any GP variant, LGP generally solves a problem through six steps: (i) generation of an initial population of machine-code functions, at random, from the user-defined functions and terminals; (ii) random selection of two functions from the population, comparison of their outputs, and designation of the fitter function as winner_1 and the less fit as loser_1; (iii) random selection of two other functions from the population and designation of winner_2 and loser_2; (iv) application of transformation operators to winner_1 and winner_2 to create two similar but different evolved programs (i.e. offspring) as modified winners; (v) replacement of loser_1 and loser_2 in the population with the modified winners; and (vi) repetition of steps (ii)-(v) until the predefined run-termination criterion is met.
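The tournament loop of steps (i)-(vi) can be sketched as follows. This is a deliberately simplified Python illustration: it evolves plain Python functions rather than machine-code instructions, and the `mutate` operator only perturbs a constant, so it sketches the selection scheme, not Discipulus itself. All names here are our own:

```python
import random

def fitness(program, data):
    """Mean squared error of a program over (input, target) pairs (lower is fitter)."""
    return sum((program(x) - y) ** 2 for x, y in data) / len(data)

def mutate(program):
    """Toy transformation operator: perturb the program's constant term."""
    shift = random.uniform(-0.1, 0.1)
    return lambda x, p=program, s=shift: p(x) + s

def lgp_tournament(data, pop_size=100, generations=200, seed=42):
    random.seed(seed)
    # (i) random initial population: here, simple linear programs a*x + b
    pop = [(lambda x, a=random.uniform(-1, 1), b=random.uniform(-1, 1): a * x + b)
           for _ in range(pop_size)]
    for _ in range(generations):
        # (ii)-(iii) two random tournaments of two programs each
        i, j, k, l = random.sample(range(pop_size), 4)
        w1, l1 = (i, j) if fitness(pop[i], data) < fitness(pop[j], data) else (j, i)
        w2, l2 = (k, l) if fitness(pop[k], data) < fitness(pop[l], data) else (l, k)
        # (iv)-(v) losers are replaced by transformed copies of the winners
        pop[l1], pop[l2] = mutate(pop[w1]), mutate(pop[w2])
    # (vi) the loop ends at the termination criterion; return the best program
    return min(pop, key=lambda p: fitness(p, data))

# Toy usage: approximate y = 2x + 1 from ten samples
data = [(x / 10.0, 2 * x / 10.0 + 1.0) for x in range(10)]
best = lgp_tournament(data)
```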
Akin to GP, the user-specified functions in LGP can consist of the arithmetic operations (+, -, ×, ÷), Boolean logic functions, conditionals, and any other mathematical functions such as Sin, Ln, Exp, and others. The choice of the function set determines the complexity of the evolved program. For example, a function set with only addition and subtraction results in a linear model, whereas a function set that includes exponential functions will result in a highly nonlinear model. The terminal set contains the arguments for the functions and can consist of numerical constants, logical constants, variables, etc. More details on the application of LGP in predictive modelling can be obtained from Poli et al. (2008). The LGP modelling attributes and the relevant parameters used in this study are explained in the following sections.
Data preparation and efficiency criteria
As shown in Fig. 2, the data used in this study were selected from two gauging stations, namely station 2322 and station 2315, on the Çoruh River in the eastern Black Sea region of Turkey. The river rises in the Mescit Mountains in Bayburt and reaches the Black Sea at Batumi, Georgia, after a course of 431 km. The yearly mean flow of the river before it leaves Turkey's border is about 200 m3/s. The applied data are composed of 348 observations of mean monthly streamflow at each station. The statistical characteristics of the streamflow time series over the 29-year period (1972-2000) are presented in Table 1. Because there are only 348 monthly observations in our data source, only training and validation data sets were used in the present study. The training data were used for model fitting, and the validation data were used for testing as well as for evaluating the different models. Following previous similar studies (e.g. Nourani et al. 2011; Kalteh 2013), the training and validation data percentages were taken as 75% and 25%, respectively (Fig. 3), and the entire data set was divided into two subsets accordingly. The statistical parameters of each set are tabulated in Table 2. Details on data-splitting methods can be found in May et al. (2010) and Wu et al. (2012, 2013).
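The chronological 75/25 split can be sketched as follows. `split_series` is our illustrative name; note that int(348 × 0.75) gives the 261 training values reported in Table 2:

```python
def split_series(series, train_frac=0.75):
    """Chronological train/validation split: earlier observations train
    the model, later ones are held out for validation."""
    n_train = int(len(series) * train_frac)
    return series[:n_train], series[n_train:]

# 348 monthly observations -> 261 for training, the remainder for validation
flows = list(range(348))  # placeholder for the monthly streamflow record
train, valid = split_series(flows)
```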
Fig. 2. Location of the study area (Çoruh River Basin)
Table 1. The monthly statistical parameters of observed streamflow data

Statistical parameter        Upstream (2322)       Downstream (2315)
                             raw      normalized   raw      normalized
Number of data (X)           348      348          348      348
Xmax (m3/s)                  867.4    0.932        1018     0.940
Xmin (m3/s)                  33.5     0.132        48.9     0.140
Xmean (m3/s)                 158.6    0.252        207.1    0.271
Standard deviation (m3/s)    159.1    0.152        184.4    0.152
Coefficient of skewness      1.727    1.727        1.700    1.700
Table 2. Statistical parameters of subsets

Statistical parameter        Entire data   Training subset   Validation subset
Number of data (X)           348           261               96
Xmax (m3/s)                  1018          930               1018
Xmin (m3/s)                  33.5          36                48
Xmean (m3/s)                 158.6         158               223
Standard deviation (m3/s)    159.1         156               204
Coefficient of skewness      1.7271        1.71              1.71
Fig. 3. Mean monthly streamflow observations at Çoruh River
Normalization is a standard operating procedure in any data-driven modelling technique. In this study, before training the ANN and LGP models, normalization was applied to the data, making them dimensionless and confining them within a certain range. After the training process, the model that yielded the best results in terms of Nash-Sutcliffe efficiency (NSE) and root mean squared error (RMSE) over the validation period was selected as the most efficient model. NSE is a normalized statistic that indicates how well the plot of observed versus predicted data fits the 1:1 line (Eq. (1); Nash and Sutcliffe 1970). RMSE, Eq. (2), measures the square root of the average of the squared errors, where the error is the amount by which the predicted value differs from the target quantity to be estimated.
NSE = 1 - \frac{\sum_{i=1}^{n} (X_i^{obs} - X_i^{pre})^2}{\sum_{i=1}^{n} (X_i^{obs} - X_{mean}^{obs})^2}   (1)

RMSE = \sqrt{\frac{\sum_{i=1}^{n} (X_i^{obs} - X_i^{pre})^2}{n}}   (2)
where X_i^{obs} is the observed value of X, X_i^{pre} is the predicted value, X_{mean}^{obs} is the mean of the observed data, and n is the number of observed data. Obviously, a high value of NSE (up to one) and a small value of RMSE indicate high efficiency of the corresponding model. A number of studies have indicated that a hydrological model can be sufficiently assessed by these two statistics (Nourani et al. 2012).
Proposed models

In this section, the successive-station scenarios proposed for monthly streamflow prediction at the downstream station are explained. Two successive gauging stations on the Çoruh River, approximately 60 km apart, were selected, and six different scenarios (models (1) to (6)) were considered for training by the AI techniques. In other words, the streamflow at the downstream station is assumed to be a function of a finite set of concurrent or antecedent streamflow observations. The assumed scenarios can be expressed as follows:

Model (1) Dt = f (Dt-1)
Model (2) Dt = f (Ut)
Model (3) Dt = f (Ut, Dt-1)
Model (4) Dt = f (Ut-1, Ut, Dt-1)
Model (5) Dt = f (Ut-1, Dt-1)
Model (6) Dt = f (Ut-1, Ut, Dt-2, Dt-1)

where Dt and Ut represent the downstream and upstream monthly streamflow, respectively, and the indices t-1 and t-2 denote lags of one and two months, respectively.
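The six scenarios map naturally onto lagged input vectors; the following sketch shows how training patterns could be assembled (helper names are our own, not from the study):

```python
# Building input/target pairs for the six scenarios above.
# U and D are equal-length monthly series for the upstream and downstream
# stations; each lambda selects that scenario's predictors at month t.
SCENARIOS = {
    1: lambda U, D, t: [D[t-1]],
    2: lambda U, D, t: [U[t]],
    3: lambda U, D, t: [U[t], D[t-1]],
    4: lambda U, D, t: [U[t-1], U[t], D[t-1]],
    5: lambda U, D, t: [U[t-1], D[t-1]],
    6: lambda U, D, t: [U[t-1], U[t], D[t-2], D[t-1]],
}

def build_patterns(model_no, U, D):
    """Return (inputs, targets), one input vector per predictable month.
    Starts at t = 2 so the largest lag (t-2 in model 6) is available."""
    make = SCENARIOS[model_no]
    X = [make(U, D, t) for t in range(2, len(D))]
    y = [D[t] for t in range(2, len(D))]
    return X, y

U = [10.0, 12.0, 15.0, 11.0]
D = [20.0, 25.0, 30.0, 22.0]
X, y = build_patterns(6, U, D)  # X[0] = [U1, U2, D0, D1]
```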
Application of FFBP network

At the first stage of the study, we used the commonly proposed three-layer feed-forward back propagation (FFBP) network to construct the FFBP-based ANN models for all of the aforementioned scenarios. FFBP networks have been shown to be satisfactory for a wide range of forecasting problems in hydrology (ASCE Task Committee 2000; Nourani et al. 2008). In the FFBP, each input node value is first multiplied by a proper weight, then shifted by a constant value (i.e. a bias), and finally transformed using a predefined transfer function. A sigmoid transfer function in the hidden layer and a linear transfer function in the output layer were adopted in our FFBP. The expression for an output value of a three-layer FFBP network is given in Eq. (3) (Nourani et al. 2011).
y_C = f_o \left[ \sum_{j=1}^{M_N} W_{kj} \, f_h \left( \sum_{i=1}^{N_N} W_{ji} x_i + W_{jo} \right) + W_{ko} \right]   (3)
where Wji is a weight in the hidden layer connecting the ith neuron in the input layer and the jth neuron in the hidden layer, Wjo is the bias for the jth hidden neuron, fh is the activation function of the hidden neuron, Wkj is a weight in the output layer connecting the jth neuron in the hidden layer and the kth neuron in the output layer, Wko is the bias for the kth output neuron, fo is the activation function for the output neuron, xi is the ith input variable, and yC and yO are the computed and observed output variables, respectively. NN and MN are the numbers of neurons in the input and hidden layers, respectively. The weights differ between the hidden and output layers, and their values change during the training process.
After defining the three-layer structure, the training process was carried out using the Levenberg-Marquardt (LM) algorithm in MATLAB® software. The successful implementation of the LM algorithm to train ANN-based hourly, daily, and monthly streamflow prediction models has been frequently reported (e.g. Chau et al. 2005; Cannas et al. 2006; Kisi 2010; Danandeh Mehr et al. 2013). As mentioned previously, a critical issue in ANN modelling is avoiding under- and overfitting problems. In the former situation, the network may not fully detect all the attributes of a complex data set, while overfitting may reduce the model's generalization ability. Cross-validation, or selection of an appropriate number of hidden-layer neurons by trial and error with a confined number of training iterations (epochs), is commonly suggested to prevent these problems (Principe et al. 2000; Cannas et al. 2006; Nourani et al. 2008; Kisi 2008; Wu et al. 2009b; Elshorbagy et al. 2010; Krishna 2013). The trial-and-error method was implemented in this study to avoid overfitting during the training process. All of the proposed models possess fixed input nodes (varying from 1 to 4 for models (1) to (6)), a single hidden layer with 1 to 10 nodes, and a single node in the output layer. No significant improvement in model performance was found when the number of hidden neurons was increased beyond this limit, which is similar to the experience of other researchers (Cheng et al. 2005; Cannas et al. 2006; Wu et al. 2009b).
Application of LGP

We applied Discipulus®, the LGP software package developed by Francone (1998), to create the best programs for the proposed scenarios (i.e. models (1) to (6)). To generate the best programs, several decisions must be made by the modeller. The first consists of choosing the sets of terminals and functions used to create the initial population. The terminal set is obviously the program's external inputs, consisting of the independent variables of each model. A series of random constants between -1 and 1 was also included in our terminal set as part of potential solutions. The choice of the appropriate function set is not so clear; however, an appropriate guess helps to include all the necessary functions. In this study, we considered only the four basic arithmetic functions (addition, subtraction, multiplication, and division) in our function set. Other kinds of mathematical functions were excluded to avoid increasing the complexity of the final solution. In other words, the power of LGP for modelling the nonlinear streamflow process is assessed using only basic arithmetic functions and a confined program size.
Definition of the terminal and function sets by the modeller indirectly defines the search space in which LGP generates the initial population. In the next step, the generated programs (the initial population) must be ranked based on their fitness values, and new programs are then created using both crossover and mutation operators (Poli et al. 2008). Eventually, the best program is selected from the new generations. We used the RMSE measure of the validation period to select the best program for each scenario. The other adopted parameters for the LGP setting are tabulated in Table 3.

Table 3. Parameter settings for the LGP training

Parameter                          Value
Initial population (programs)      100
Mutation rate (%)                  5
Crossover rate (%)                 50
Initial program size               64 byte
Maximum program size               256 byte
Maximum number of runs             300
Generations without improvement    300
Maximum generations since start    1000
In addition to parameter selection, one of the main concerns of LGP modelling, as with ANN, is the overfitting problem. This is more likely to happen when a small data set or a large number of generations is used for the LGP runs. To prevent overfitting in this study, we first confined the maximum number of generations and the maximum program size to 1000 generations and 256 bytes, respectively (see Table 3). Then, as suggested by Nourani et al. (2013), we simultaneously monitored the error variations in both the training and validation sets in each LGP run, in order to stop the run as soon as the error of the validation set began to rise. Stopping the run after a certain number of generations or a certain length of wall-clock time has also been suggested by Babovic and Keijzer (2000).
411
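The stop-when-validation-error-rises rule can be sketched as a simple check applied after each generation; the error trace below is illustrative only, and the `patience` parameter is an assumption added for generality.

```python
def stop_generation(val_errors, max_generations=1000, patience=1):
    """Return the 0-based generation index at which training stops: either
    when validation error has risen for `patience` consecutive generations,
    or when the generation cap is reached."""
    rises = 0
    for g in range(1, min(len(val_errors), max_generations)):
        if val_errors[g] > val_errors[g - 1]:
            rises += 1
            if rises >= patience:
                return g
        else:
            rises = 0
    return min(len(val_errors), max_generations) - 1

# Validation error falls, then begins to rise at generation 4.
trace = [0.20, 0.15, 0.12, 0.11, 0.13, 0.16]
print(stop_generation(trace))  # -> 4
```

With `patience` greater than 1, a single noisy uptick in the validation error does not terminate the run prematurely.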
Application of GRNN and RBF networks
In the second stage of the present study, in order to generalize the results of the comparative study between LGP and ANN-based models for successive-station monthly streamflow prediction, we used the GRNN and RBF algorithms to restructure the best scenario designated in the first stage. To this end, two program codes were written in MATLAB® to generate the GRNN and RBF networks. In the GRNN modelling, the training and validation data sets were identical to those of the FFBP; in the RBF modelling, however, only the training part of the data was used to structure the new ANN models. Determining the optimum value of the spread parameter in both GRNN and RBF was one of the important aspects of efficient network design in this phase. We employed a trial-and-error method, guided by the RMSE and NSE measures of the validation step, to select the optimum spread value, as suggested by Cigizoglu (2005). To optimize the RBF model, different numbers of nodes in the hidden layer (RBF centres) were also examined in each trial: the training program adds neurons to the hidden layer of an RBF network until it meets the specified mean squared error or the maximum number of neurons. To make a fair comparison with the FFBP, the maximum number of hidden-layer neurons in the RBF networks was also confined to 10, the threshold already accepted for the FFBP networks.
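The trial-and-error selection of the spread parameter can be sketched as follows, using the GRNN prediction rule of Specht (1991), i.e. a Gaussian-kernel-weighted average of the training targets. This is a one-dimensional toy sketch, not the MATLAB code used in the study; the data set and candidate spreads are illustrative.

```python
import math

def grnn_predict(x, train_x, train_y, spread):
    # Specht's GRNN: Gaussian-kernel-weighted average of training targets.
    w = [math.exp(-((x - xi) ** 2) / (2.0 * spread ** 2)) for xi in train_x]
    return sum(wi * yi for wi, yi in zip(w, train_y)) / sum(w)

def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def best_spread(train_x, train_y, val_x, val_y, candidates):
    # Trial-and-error: keep the spread with the lowest validation RMSE.
    scores = {}
    for sp in candidates:
        pred = [grnn_predict(x, train_x, train_y, sp) for x in val_x]
        scores[sp] = rmse(val_y, pred)
    return min(scores, key=scores.get), scores

# Illustrative data following y = 2x; the validation set scores each trial.
train_x, train_y = [0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 6.0]
val_x, val_y = [0.5, 1.5, 2.5], [1.0, 3.0, 5.0]
sp, scores = best_spread(train_x, train_y, val_x, val_y, [0.1, 0.5, 1.0, 2.0])
print(sp)
```

The same validation-RMSE loop applies to the RBF network, with the number of hidden-layer centres varied alongside the spread in each trial.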
Results and discussion
Following Models (1) to (6), six different streamflow prediction scenarios were modelled with both the FFBP and LGP techniques and compared with each other using RMSE and NSE values. The efficiency results of the best FFBP and LGP models over the validation period are tabulated in Table 4. The results indicate that Model (1) yielded the lowest performance of all the scenarios, probably because of the insufficient inputs accepted in this scenario. Although LGP gave slightly higher performance in terms of both RMSE and NSE, it is clear that the single-step-ahead monthly streamflow prediction scenario could not provide a reliable prediction (NSE = 0.457) for the Çoruh River.
Table 4. Performance comparison of LGP and FFBP results at validation period

                                            FFBP            LGP
Model  Prediction scenario           NHL*   RMSE   NSE      RMSE   NSE
1      Dt = f(Dt-1)                   3     0.131  0.355    0.119  0.457
2      Dt = f(Ut)                     3     0.130  0.363    0.099  0.624
3      Dt = f(Ut, Dt-1)               4     0.079  0.766    0.064  0.845
4      Dt = f(Ut-1, Ut, Dt-1)         4     0.037  0.948    0.029  0.967
5      Dt = f(Ut-1, Dt-1)             5     0.039  0.943    0.029  0.968
6      Dt = f(Ut-1, Ut, Dt-2, Dt-1)   6     0.046  0.921    0.029  0.969
* Nodes in hidden layer
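The two efficiency measures used throughout are straightforward to compute; NSE follows Nash and Sutcliffe (1970), i.e. one minus the ratio of the model error variance to the variance of the observations. The short series below is illustrative only.

```python
import math

def rmse(obs, pred):
    # Root mean square error of predictions against observations.
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def nse(obs, pred):
    # Nash-Sutcliffe efficiency: 1 - SS_err / SS_tot; 1.0 is a perfect fit.
    mean_obs = sum(obs) / len(obs)
    ss_err = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_err / ss_tot

obs = [0.10, 0.30, 0.50, 0.30, 0.10]
pred = [0.12, 0.28, 0.48, 0.31, 0.11]
print(round(rmse(obs, pred), 4), round(nse(obs, pred), 4))
```

An NSE near 1 (as for Models (4)–(6) in Table 4) indicates that the model error is small relative to the natural variability of the observed flows.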
Based upon the aforementioned successive-station prediction strategy, the second scenario (i.e. Model (2)) used the concurrent upstream flow as the modelling input instead of increasing the lag time of the downstream station records. Given the remarkable difference between the NSE values of LGP and FFBP in this scenario (i.e. more than 40% improvement), LGP is superior to FFBP. Although LGP Model (2) shows a significant improvement over Model (1), it still does not provide a suitable streamflow prediction scenario (NSE = 0.624). The reason is apparently that the monthly flow in the study reach generally has a spatially increasing regime, which is also distinguishable in Fig. 3. The results of the first two models indicate the need for additional forecasting lead time and/or more effective input variables.
The first successive-station scenario, Model (3), demonstrated a dramatic improvement in the performance levels of both the FFBP and LGP models. The reason lies in the fact that the sub-basins between the stations have a considerable physical effect (i.e., increasing drainage area) on the occurrence of flow at the downstream station. This is why we intentionally kept the downstream flow at time (t-1), Dt-1, in constructing Models (4)–(6). The efficiency values of Model (3) imply that LGP remains superior to FFBP.
Models (4) to (6) represent three effective successive-station combinations with high performance levels for both LGP and FFBP. In all cases, LGP shows slightly higher performance than FFBP. Owing to the striking coherence between the observed and predicted values in these models, there is a very tight competition among them for selection as the optimum streamflow prediction model for the river. Models (4) and (5) both use the flows at time (t-1) at the downstream and upstream stations; however, the former also includes the streamflow at time (t) at the upstream station (Ut). Interestingly, removing this input variable (i.e. Ut) from Model (4) had a negligible effect on the efficiency of the corresponding FFBP and LGP models. Compared to Model (5), the two additional inputs (i.e. Ut and Dt-2) in Model (6) reduce the efficiency of the FFBP network, whereas LGP still improves progressively. The diminishing results of the FFBP in this case may relate to the fact that ANN-based models are unsatisfactory for highly non-stationary phenomena (Cannas et al., 2006) or highly noisy data (Nourani et al., 2011).
Considering all the aforementioned results, and taking simplicity and applicability as the main modelling criteria, Model (5) was evaluated as the best scenario in this study. The dimensionless explicit expression of the model produced by LGP is given in Eq. (4). It shows that LGP generated a quadratic equation in Dt-1, scaled and shifted by the upstream flow of the previous month (Ut-1).
Dt = Ut-1 + 1.063 Ut-1 [(Dt-1 − 0.236) / 1.217]^2    (4)
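The form of Eq. (4) described in the text — linear in the previous month's upstream flow Ut-1 and quadratic in the previous downstream flow Dt-1 — can be evaluated directly. The coefficients below follow the transcription of Eq. (4) and should be treated as illustrative rather than authoritative; the input values are arbitrary dimensionless flows.

```python
def predict_downstream(u_prev, d_prev, a=1.063, b=0.236, c=1.217):
    # D_t = U_{t-1} + a * U_{t-1} * ((D_{t-1} - b) / c) ** 2
    # Linear in u_prev (U_{t-1}); quadratic in d_prev (D_{t-1}).
    return u_prev + a * u_prev * ((d_prev - b) / c) ** 2

# Dimensionless (normalized) flows, as used throughout the study.
print(predict_downstream(0.30, 0.25))
```

When Dt-1 equals the shift constant, the quadratic term vanishes and the prediction reduces to the upstream flow of the previous month, which is consistent with the linear Dt–Ut-1 relation discussed below.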
Apart from the highly nonlinear relation between the downstream flow (Dt) and its value in the previous month (Dt-1), the model (i.e. Eq. (4)) reveals a linear relation between Dt and Ut-1. Regarding the linear and nonlinear terms of the equation, its plausibility was investigated through the dyadic scatterplots presented in Fig. 4 and the piecewise linear surface plot of Dt versus Ut-1 and Dt-1, with the observations superposed on it, illustrated in Fig. 5. These figures confirm a linear correlation between Dt and Ut-1 and a highly nonlinear relation between Dt and Dt-1. A three-dimensional surface plot of the proposed model, based on a mesh grid produced from the observed Ut-1 and Dt-1, is also presented in Fig. 6. Owing to the physical interrelation between Ut-1 and Dt-1, not every point on this surface provides a physically acceptable prediction for Dt; this is why some gaps can be seen within the surface (see Fig. 6b). According to the model, a one-month lag is sufficient for successive-station monthly streamflow prediction in our study reach.
Fig. 4. Scatterplots of the observed data used in the present study

Fig. 5. Three-dimensional scatterplots of the observed data

Fig. 6. Three-dimensional surface plot of Eq. (4): (a) front view and (b) top view
To assess the efficiency of Model (5) in detail, the LGP- and FFBP-predicted time series and their scatter plots are compared with the corresponding observations for the validation period in Fig. 7. The figure shows that both the FFBP and LGP models can successfully predict the low and medium monthly streamflows (Dt < 500 m3/s). LGP predicts the global maximum, global minimum, local maxima, and local minima more accurately than FFBP, which confirms its overall superiority.
Fig. 7. Predicted time series and scatter plots of the proposed (a) FFBP and (b) LGP prediction models
As mentioned previously, in order to investigate the efficiency of the LGP technique in comparison with different ANN algorithms, the selected scenario (i.e. Model (5)) was also restructured using RBF and GRNN. The efficiency results of the best RBF and GRNN networks developed for this scenario are compared with those of FFBP and LGP in Table 5. The corresponding predictions for the validation period, together with the relevant observations, are depicted in Fig. 8. The results indicate that both the RBF and GRNN networks reproduce the monthly streamflow at the target station more precisely than FFBP, and that the RBF network predicts the global maximum more reliably than the other ANN networks. The proposed LGP model, however, remains superior to both RBF and GRNN, not only in predicting extreme values but also in predicting low and medium flows.
Table 5. Efficiency results of LGP and different ANN models at validation period

Model   RMSE    NSE
LGP     0.029   0.968
FFBP    0.039   0.943
RBF     0.034   0.955
GRNN    0.036   0.950
All of the ANN algorithms assessed in this study not only produce implicit networks that are less accurate than the LGP technique, but also suffer from problems of optimum parameter selection. We observed that each FFBP trial produced different prediction values from the same FFBP structure, resulting in an unstable performance level. This drawback is mainly due to the random assignment of synaptic weights at the beginning of each trial. The efficiency of each RBF network developed in this study depended strongly on the number of hidden nodes, the RBF centres, and the spread constant chosen for the network. Appropriate selection of the latter was also crucial to developing the most efficient GRNN.
Fig. 8. Predicted time series and scatter plots of the (a) RBF and (b) GRNN models at validation period
Some of the training trials showed that a poor choice of RBF centres and spread constant may yield unsatisfactory results. Modellers should therefore choose these parameters very carefully; otherwise, implementing an optimization technique in parallel with the training algorithm is inevitable. In contrast, in LGP-based modelling the selected parameters (see Table 3) do not play a significant role in the efficiency of the best program evolved by LGP. The heuristic, evolutionary optimization feature of LGP lets modellers choose the required parameters over a wide range. Evidently, an informed selection, specifically of the function and terminal sets, based upon the physics of the phenomenon under study, is always recommended.
Summary and conclusion
In this study, based upon the spatial and temporal features of the historical streamflow records of the Çoruh River, we investigated the ability of LGP to model the nonlinear successive-station streamflow process and compared its efficiency with that of three different ANN algorithms, namely FFBP, RBF, and GRNN. Following the successive-station prediction strategy, we assumed that the streamflow at our target station is a function of the historical streamflow records at that station and at an upstream station. We therefore put forward six different scenarios between the stations and evaluated each as a candidate monthly streamflow prediction model for the river.
Our results demonstrate that LGP and ANNs are both able to handle the nonlinearity and non-stationarity of the successive-station streamflow process in general. Across all the scenarios examined here, the LGP approach outperformed all of the ANN algorithms, even though only the basic arithmetic functions (+, -, ×, ÷) were adopted in the LGP function set. The proposed LGP model was superior to the ANNs not only in extreme flow prediction but also for low and medium flows (Dt < 500 m3/s). We also found that an explicit LGP-based model comprising the one-month-lag records of both the target and upstream stations provides the best prediction model for the target station. The results further revealed a diminishing performance of the FFBP model when the number of input variables increases to four, whereas this is not the case for the LGP model. This may be related to the inability of ANNs to model highly noisy data.
Since the programs evolved by the LGP technique can be represented as explicit mathematical equations (e.g. Eq. (4)), they are preferable to ANNs not only for practical use but also for mining knowledge from the information contained in field data. Evidently, the empirical successive-station equation proposed in this study, with its strong expressivity, reveals more about the underlying streamflow process than the raw body of observations between the stations. However, this type of empirical equation sometimes becomes too complex to interpret easily; this is a major disadvantage of the LGP model and indicates the need for further studies to overcome such problems.
Acknowledgements

The authors would like to thank Dr. Vasily Demyanov, Associate Editor of Computers and Geosciences, for his helpful suggestions during the revision of the initial manuscript. We would also like to thank two anonymous reviewers for their fruitful critiques, which improved this paper.
References

Abrahart, R.J., Anctil, F., Coulibaly, P., Dawson, C.W., Mount, N.J., See, L.M., Shamseldin, A.Y., Solomatine, D.P., Toth, E., Wilby, R., 2012. Two decades of anarchy? Emerging themes and outstanding challenges for neural network modelling of surface hydrology. Progress in Physical Geography 36(4), 480-513.

Adamowski, J., 2008. Development of a short-term river flood forecasting method for snowmelt driven floods based on wavelet and cross-wavelet analysis. Journal of Hydrology 353, 247-266.

ASCE Task Committee, 2000. Artificial neural networks in hydrology: hydrologic applications. Journal of Hydrologic Engineering 5(2), 124-137.

Aytek, A., Kisi, O., 2008. A genetic programming approach to suspended sediment modelling. Journal of Hydrology 351, 288-298.

Azmathullah, H.M., Deo, M.C., Deolalikar, P.B., 2005. Neural network for estimation of scour downstream of a ski-jump bucket. Journal of Hydraulic Engineering 131(10), 898-908.

Babovic, V., 2000. Data mining and knowledge discovery in sediment transport. Computer-Aided Civil and Infrastructure Engineering 15, 383-389.

Babovic, V., 2005. Data mining in hydrology. Hydrological Processes 19(7), 1511-1515.

Babovic, V., Abbott, M.B., 1997a. The evolution of equations from hydraulic data. Part I: Theory. Journal of Hydraulic Research 35(3), 397-410.

Babovic, V., Abbott, M.B., 1997b. The evolution of equations from hydraulic data. Part II: Applications. Journal of Hydraulic Research 35(3), 411-430.

Babovic, V., Keijzer, M., 2000. Genetic programming as a model induction engine. Journal of Hydroinformatics 2, 35-60.

Babovic, V., Keijzer, M., 2002. Rainfall runoff modeling based on genetic programming. Nordic Hydrology 33, 331-346.

Bateni, S.M., Borghei, S.M., Jeng, D.S., 2007. Neural network and neuro-fuzzy assessments for scour depth around bridge piers. Engineering Applications of Artificial Intelligence 20(3), 401-414.

Brameier, M., Banzhaf, W., 2007. Linear Genetic Programming. Springer Science+Business Media, LLC, New York.

Besaw, L.E., Rizzo, D.M., Bierman, P.R., Hackett, W.R., 2010. Advances in ungauged streamflow prediction using artificial neural networks. Journal of Hydrology 386(1-4), 27-37.

Can, İ., Tosunogulu, F., Kahya, E., 2012. Daily streamflow modelling using autoregressive moving average and artificial neural networks models: case study of Çoruh basin, Turkey. Water and Environment Journal 26, 567-576.

Cannas, B., Fanni, A., See, L., Sias, G., 2006. Data preprocessing for river flow forecasting using neural networks: Wavelet transforms and data partitioning. Physics and Chemistry of the Earth, Parts A/B/C 31(18), 1164-1171.

Chang, F.J., Chiang, Y.M., Chang, L.C., 2007. Multi-step-ahead neural networks for flood forecasting. Hydrological Sciences Journal 52(1), 114-130.

Chau, K.W., Wu, C.L., Li, Y.S., 2005. Comparison of several flood forecasting models in Yangtze River. Journal of Hydrologic Engineering 10(6), 485-491.

Cheng, C.T., Chau, K.W., Sun, Y.G., Lin, J.Y., 2005. Long-term prediction of discharges in Manwan Reservoir using artificial neural network models. Lecture Notes in Computer Science 3498, 1040-1045.

Cigizoglu, H.K., Alp, M., 2004. Rainfall runoff modeling using three neural network methods. Lecture Notes in Computer Science 3070, 166-171.

Cigizoglu, H.K., 2005. Application of generalized regression neural networks to intermittent flow forecasting and estimation. Journal of Hydrologic Engineering 10(4), 336-341.

Danandeh Mehr, A., Kahya, E., Bagheri, F., Deliktas, E., 2013. Successive-station monthly streamflow prediction using neuro-wavelet technique. Earth Science Informatics (doi: 10.1007/s12145-013-0141-3, in press).

Dawson, C.W., Wilby, R., 1998. An artificial neural network approach to rainfall-runoff modelling. Hydrological Sciences Journal 43(1), 47-66.

De Vos, N.J., Rientjes, T.H.M., 2005. Constraints of artificial neural networks for rainfall-runoff modeling: trade-offs in hydrological state representation and model evaluation. Hydrology and Earth System Sciences 9(1,2), 111-126.

Dolling, O.R., Varas, E.A., 2002. Artificial neural networks for streamflow prediction. Journal of Hydraulic Research 40(5), 547-554.

Elshorbagy, A., Corzo, G., Srinivasulu, S., Solomatine, D.P., 2010. Experimental investigation of the predictive capabilities of data driven modeling techniques in hydrology - Part 1: Concepts and methodology. Hydrology and Earth System Sciences 14, 1931-1941.

Francone, F.D., 1998. Discipulus™ software owner's manual, version 3.0 DRAFT. Machine Learning Technologies Inc., Littleton, CO, USA.

Fletcher, L., Katkovnik, V., Steffens, F.E., Engelbrecht, A.P., 1998. Optimizing the number of hidden nodes of a feed forward artificial neural network. Proceedings of the International Joint Conference on Neural Networks, vol. 2, pp. 1608-1612.

Ghorbani, M.A., Khatibi, R., Aytek, A., Makarynskyy, O., Shiri, J., 2010. Sea water level forecasting using genetic programming and artificial neural networks. Computers and Geosciences 36(5), 620-627.

Guven, A., 2009. Linear genetic programming for time-series modeling of daily flow rate. Journal of Earth System Science 118(2), 137-146.

Hornik, K., Stinchcombe, M., White, M., 1989. Multi-layer feed forward networks are universal approximators. Neural Networks 2(5), 359-366.

Kalteh, A.M., 2013. Monthly river flow forecasting using artificial neural network and support vector regression models coupled with wavelet transform. Computers and Geosciences 54, 1-8.

Kerh, T., Lee, C.S., 2006. Neural networks forecasting of flood discharge at an unmeasured station using river upstream information. Advances in Engineering Software 37, 533-543.

Khu, S.T., Liong, S.-Y., Babovic, V., Madsen, H., Muttil, N., 2001. Genetic programming and its application in real-time runoff forecasting. Journal of the American Water Resources Association 37, 439-451.

Koza, J.R., 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA.

Kişi, Ö., 2009. Neural networks and wavelet conjunction model for intermittent streamflow forecasting. Journal of Hydrologic Engineering 14(8), 773-782.

Kisi, O., 2008. Stream flow forecasting using neuro-wavelet technique. Hydrological Processes 22, 4142-4152.

Kisi, O., 2010. Wavelet regression model for short-term streamflow forecasting. Journal of Hydrology 389, 344-353.

Kisi, O., Shiri, J., 2012. River suspended sediment estimation by climatic variables implication: Comparative study among soft computing techniques. Computers & Geosciences 43, 73-82.

Kisi, O., Guven, A., 2010. Evapotranspiration modeling using linear genetic programming technique. Journal of Irrigation and Drainage Engineering 136(10), 715-723.

Kisi, O., Shiri, J., 2011. Precipitation forecasting using wavelet-genetic programming and wavelet-neuro-fuzzy conjunction models. Water Resources Management 25(13), 3135-3152.

Kisi, O., Cigizoglu, H.K., 2007. Comparison of different ANN techniques in river flow prediction. Civil Engineering and Environmental Systems 24, 211-231.

Krishna, B., 2013. Comparison of wavelet based ANN and regression models for reservoir inflow forecasting. Journal of Hydrologic Engineering, 10.1061/(ASCE)HE.1943-5584.0000892.

Liong, S.-Y., Gautam, T.R., Khu, S.T., Babovic, V., Keijzer, M., Muttil, N., 2001. Genetic programming: a new paradigm in rainfall runoff modeling. Journal of the American Water Resources Association 38, 705-718.

Londhe, S., Charhate, S., 2010. Comparison of data-driven modelling techniques for river flow forecasting. Hydrological Sciences Journal 55(7), 1163-1174.

Marti, A.I., Yerdelen, C., Kahya, E., 2010. ENSO modulations on streamflow characteristics. Earth Science Research Journal 14(1), 31-42.

May, R.J., Maier, H.R., Dandy, G.C., 2010. Data splitting for artificial neural networks using SOM-based stratified sampling. Neural Networks 23(2), 283-294.

Minns, A.W., Hall, M.J., 1996. Artificial neural networks as rainfall-runoff models. Hydrological Sciences Journal 41(3), 399-417.

Muttil, N., Chau, K.W., 2006. Neural network and genetic programming for modelling coastal algal blooms. International Journal of Environment and Pollution 28(3-4), 223-238.

Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models: Part 1. A discussion of principles. Journal of Hydrology 10(3), 282-290.

Ni, Q., Wang, L., Ye, R., Yang, F., Sivakumar, M., 2010. Evolutionary modeling for streamflow forecasting with minimal datasets: a case study in the West Malian River, China. Environmental Engineering Science 27(5), 377-385.

Nourani, V., Mogaddam, A.A., Nadiri, A.O., 2008. An ANN based model for spatiotemporal groundwater level forecasting. Hydrological Processes 22, 5054-5066.

Nourani, V., Kisi, Ö., Komasi, M., 2011. Two hybrid artificial intelligence approaches for modeling rainfall-runoff process. Journal of Hydrology 402(1-2), 41-59.

Nourani, V., Komasi, M., Alami, M.T., 2012. Hybrid wavelet-genetic programming approach to optimize ANN modelling of rainfall-runoff process. Journal of Hydrologic Engineering 7(6), 724-741.

Nourani, V., Hosseini Baghanam, A., Adamowski, J., Gebremichael, M., 2013. Using self-organizing maps and wavelet transforms for space-time pre-processing of satellite precipitation and runoff data in neural network based rainfall-runoff modeling. Journal of Hydrology 476, 228-243.

Piotrowski, A.P., Osuch, M., Napiorkowski, M.J., Rowinski, P.M., Napiorkowski, J.J., 2014. Comparing large number of metaheuristics for artificial neural networks training to predict water temperature in a natural river. Computers and Geosciences 64, 136-151.

Poli, R., Langdon, W.B., McPhee, N.F., 2008. A Field Guide to Genetic Programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk (with contributions by J.R. Koza) (accessed October 8, 2012).

Shiri, J., Kisi, O., 2010. Short-term and long-term streamflow forecasting using a wavelet and neuro-fuzzy conjunction model. Journal of Hydrology 394, 486-493.

Specht, D.F., 1991. A general regression neural network. IEEE Transactions on Neural Networks 2(6), 568-576.

Sreekanth, J., Datta, B., 2011. Coupled simulation-optimization model for coastal aquifer management using genetic programming-based ensemble surrogate models and multiple-realization optimization. Water Resources Research 47(4), W04516.

Tahershamsi, A., Majdzade Tabatabai, M.R., Shirkhani, R., 2012. An evaluation model of artificial neural network to predict stable width in gravel bed rivers. International Journal of Environmental Science and Technology 9, 333-342.

Wang, W.C., Chau, K.W., Cheng, C.T., Qiu, L., 2009. A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series. Journal of Hydrology 374, 294-306.

Wang, Y., Guo, S., Chen, H., Zhou, Y., 2014. Comparative study of monthly inflow prediction methods for the Three Gorges Reservoir. Stochastic Environmental Research and Risk Assessment 28, 555-570.

Whigham, P.A., Crapper, P.F., 2001. Modeling rainfall runoff using genetic programming. Mathematical and Computer Modelling 33, 707-721.

Wu, W., May, R., Dandy, G.C., Maier, H.R., 2012. A method for comparing data splitting approaches for developing hydrological ANN models. In: The 6th International Congress on Environmental Modelling and Software (iEMSs), Leipzig, Germany.

Wu, C.L., Chau, K.W., Li, Y.S., 2009a. Methods to improve neural network performance in daily flows prediction. Journal of Hydrology 372, 80-93.

Wu, C.L., Chau, K.W., Li, Y.S., 2009b. Predicting monthly streamflow using data-driven models coupled with data-preprocessing techniques. Water Resources Research 45, W08432.

Wu, W., May, R.J., Maier, H.R., Dandy, G.C., 2013. A benchmarking approach for comparing data splitting methods for modeling water resources parameters using artificial neural networks. Water Resources Research 49(11), 7598-7614.

Yilmaz, A., Muttil, N., 2013. Runoff estimation by machine learning methods and application to Euphrates Basin in Turkey. Journal of Hydrologic Engineering, 10.1061/(ASCE)HE.1943-5584.0000869.
• We compared FFBP, GRNN, and RBF neural networks with LGP for successive-station monthly streamflow prediction.
• Both the ANN and LGP models are more reliable for low and medium flow prediction.
• LGP is more capable of capturing extreme values than ANNs.
• LGP is superior to ANNs in terms of overall accuracy and practical applicability.
• In contrast with the implicit ANNs, LGP provided an explicit equation for streamflow prediction.