Wavelet-ANFIS models for forecasting monsoon flows: Case study for the Gandak River (India)


WATER QUALITY AND PROTECTION: ENVIRONMENTAL ASPECTS

ISSN 0097-8078, Water Resources, 2014, Vol. 41, No. 5, pp. 574–582. © Pleiades Publishing, Ltd., 2014.

Rajeev Ranjan Sahay* and Vinit Sehgal**
Civil Engineering Department, Birla Institute of Technology, Mesra, 835215 India
E-mail: *[email protected], **[email protected]
Received July 17, 2013

The article is published in the original.

Abstract: WANFIS, a conjunction model of the discrete wavelet transform (DWT) and the adaptive neuro-fuzzy inference system (ANFIS), was developed for forecasting the current-day flow in a river when the only available data are historical flows. The discrete wavelet transform decomposed the observed flow time series (OFTS) into wavelet components that captured useful information on three resolution levels. A smoothened flow time series (SFTS) was formed by filtering out the noise wavelet components and recombining the effective wavelet components. The WANFIS model is essentially an ANFIS model with the SFTS hydrograph as input, while the ANFIS and autoregression (AR) models, developed for comparison, use the OFTS hydrograph as input. For performance evaluation, the developed models were used for predicting daily monsoon flows of the Gandak River in Bihar state of India. During the monsoon (June–October), this river carries large flows that make the entire North Bihar unsafe for habitation or cultivation. Based on various performance indices, it was concluded that WANFIS models simulate the monsoon flows in the Gandak more reliably than ANFIS and AR models. The best performing WANFIS model, with four previous days' flows as input, predicted the current-day Gandak flows with 80.7% accuracy, while the ANFIS and AR models predicted them with only 71.8 and 52.2% accuracy, respectively.

Keywords: ANFIS, Gandak River, discrete wavelet transform, flood forecasting, India, river flow, WANFIS

DOI: 10.1134/S0097807814050108

INTRODUCTION

Floods are among the most frequent and costly natural disasters in terms of human hardship and economic loss. As much as 90 percent of the damage related to natural disasters (excluding droughts) is caused by floods and associated mud and debris flows [16]. However, flood damage can be substantially reduced by timely flood warnings and river forecasts. Accurate flood forecasting is important for flood fighting, water diversion, human safety and ecosystem sustainability. Therefore, there is a need for models capable of efficiently forecasting high river discharges in real time. This need is felt all the more in India, which is the second most flood-affected country in the world after Bangladesh.

Most previous work on river flood forecasting reported in the literature can broadly be divided into two major groups: conceptual and data-based. Conceptual models, despite adequately describing the hydrological processes on the basis of physical laws, are not very popular because they require intensive data and complicated differential equations for their implementation. Moreover, these models are difficult to apply because a single observation, such as river flow at a gauge site, as is the case in this study, may not be sufficient to calibrate and verify them. In comparison, data-based models have recently gained popularity in hydrological applications owing to their rapid development, lower data requirements and ease of real-time implementation [1]. These models establish input–output relationships without regard to the physical laws that govern the hydrological processes. Statistical and artificial intelligence models are very popular data-based models. However, statistical models are found to be unsuitable for handling data with transitory characteristics such as drifts, trends and abrupt changes. The artificial neural network (ANN) and ANFIS are two of the most widely used artificial intelligence models. Although ANNs can learn complex and nonlinear relationships between inputs and outputs that conventional methods find difficult, they are incapable of dealing with imprecise data. In addition, the surface of the objective function of a neural network is non-convex and contains multiple local optima in which the network solution can easily get trapped. ANFIS, on the other hand, combines the human knowledge and reasoning ability of a fuzzy inference system (FIS) with the adaptive capability of an ANN to deal with unclear, imprecise and incomplete information.

In recent times, the wavelet transform, another data-based method, has become very popular in time series analysis, as wavelets effectively extract both time and frequency-like information from a series. Smith et al. [15] successfully used DWT for quantifying streamflow variability and classified streamflows into distinct hydroclimatic categories. Labat et al. [9] effectively applied wavelet methods to model rainfalls and runoffs measured at different sampling rates, from daily to half-hourly, and found that the wavelet-neuro-fuzzy model is superior to the classical neuro-fuzzy method, especially for the peak values. Golitsyn et al. [4] analyzed the annual and long-term variations in the hydrological regimes of the Ladoga and Onega lakes in Russia by spectral and wavelet analysis. Coulibaly and Burn [2] used wavelet analysis to identify variability in annual flows in Canadian rivers. Partal and Kucuk [11] satisfactorily used DWT for determining possible trends in the annual precipitation in Turkey. Based on wavelet and cross-wavelet constituent components of stream flows, Adamowski [1] developed flood forecasting models and showed that they predicted stream flows with greater accuracy when there were no significant trends in river flows. Rajee et al. [12] developed a conjunction model of ANN and wavelet for predicting sediment load in a river. Tiwari et al. [17, 18] developed a combined wavelet–bootstrap–ANN hybrid model for hourly and daily discharge forecasting in rivers and showed that these models are more reliable than wavelet–ANN and bootstrap–ANN models. Ip et al. [5] analyzed the variability and trends of flood/dryness in North China for the period 1470–2000 using the power spectrum and the continuous wavelet transform. Kisi [8] used wavelet regression as an alternative to neural networks for river stage forecasting. Sahay and Chakraborty [13] demonstrated the efficiency of the combined model of discrete wavelet transform and autoregression in estimating river flows when the only data available are historical flow series. Kisi and Shiri [7] developed a wavelet and neuro-fuzzy conjunction technique for forecasting groundwater depth and showed that it is superior to the neuro-fuzzy technique alone. Sahay and Sehgal [14] developed wavelet regression models for forecasting 1-day-ahead monsoon river stages and found them to be better predictive models than ANN and AR models.

The foregoing discussion suggests that the wavelet transform is an effective tool for analyzing hydrological time series. In this paper, taking advantage of the time–frequency description ability of DWT and the approximation ability of ANFIS, a new hybrid model, WANFIS, was developed for forecasting river flows during the monsoon period. Monsoon flows are difficult to model because they are characterized by irregularly spaced, spiky large events and sustained flows of varying duration. The developed models were applied to the Gandak River in eastern India. The WANFIS results were compared with those of AR and ANFIS models developed specifically for this purpose.

MODEL DEVELOPMENT

In this study, a new hybrid model, WANFIS, was developed by combining the discrete wavelet transform and the adaptive neuro-fuzzy inference system. DWT was used to decompose the flow time series into wavelet components, or sub-time series. Thereafter, a smoothened flow time series was constructed by ignoring the noise wavelet components and recombining the effective wavelet components. This wavelet-smoothened SFTS hydrograph formed the input for WANFIS, while the OFTS hydrograph was used for the stand-alone ANFIS and autoregression (AR) models developed for comparison.

Discrete Wavelet Transform

Time series analysis can be carried out using various techniques, the best known being the Fourier transform (FT). However, FT, which breaks down the series (here, the river discharge series) into constituent sinusoids of different frequencies, suffers from the serious drawback of losing time information when transforming the time series to the frequency domain, and is therefore not suitable for analyzing signals with transitory characteristics such as drifts, trends and abrupt changes. The wavelet transform (WT), on the other hand, resolves both time and scale (frequency) events better than FT. It breaks up a time series into shifted and scaled versions of wavelets, which are waveforms of effectively limited duration and zero mean, along the full signal in such a way that higher-frequency wavelets are very narrow and lower-frequency wavelets are wide. At each step, a generated wavelet coefficient measures the correlation of the wavelet to the signal in each section (Fig. 1). This capability of WT to focus on short time intervals for local (high-frequency) features and on long intervals for gross (low-frequency) features makes it well suited for approximating data with sharp changes and discontinuities. The convolution process, called the continuous wavelet transform (CWT) of a time series $f(t)$ with respect to a mother wavelet $\phi(t)$, is defined as the sum over all time of the signal multiplied by the scaled and shifted version of the mother wavelet $\phi(t)$:

$$T_{a,b} = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \phi'\!\left(\frac{t-b}{a}\right) dt. \qquad (1)$$

The CWT yields many wavelet coefficients. They are functions of scale and position and measure the correlation between the scaled and shifted wavelet and the original signal. The conjugate wavelet basis functions $\phi'\!\left(\frac{t-b}{a}\right)$ are derived from a common mother wavelet function $\phi_{0,0}(t)$ by scaling (or dilating) it by $a$ and translating it by $b$:

$$\phi_{a,b}(t) = 2^{-a/2}\, \phi_{0,0}\!\left(2^{-a} t - b\right). \qquad (2)$$

Determining wavelet coefficients at every possible scale is an enormous and time-consuming task. Moreover, actual river flow data are measured at specific time intervals and are discrete in nature. In such cases, the discrete wavelet transform is more suitable for analyzing the time series. Of the several available schemes, the DWT in this study uses the dyadic scheme of wavelet decomposition, in which alternate scales and positions are adopted for calculating the transform coefficients, thereby reducing the computational burden. Equation (2) is based on the dyadic scheme of wavelet decomposition. For a discrete time series $x_i$ with integer time steps, the DWT in the dyadic decomposition scheme is defined as

$$T_{m,n} = 2^{-m/2} \sum_{i=0}^{N-1} x_i\, \phi\!\left(2^{-m} t - n\right), \qquad (3)$$

where $T_{m,n}$ is the wavelet coefficient for scale $a = 2^m$ and location $b = 2^m n$, $m$ and $n$ being positive integers, and $t$ is the integer time step corresponding to $x_i$. $N$ is the data length of the time series and an integer power of 2, i.e., $N = 2^M$. This gives the ranges of $m$ and $n$ as $0 < n < 2^{M-m} - 1$ and $1 < m < M$, respectively.

The original time series may be reconstructed by the inverse discrete transform:

$$x_i = \bar{T} + \sum_{m=1}^{M} \sum_{n=0}^{2^{M-m}-1} T_{m,n}\, 2^{-m/2}\, \phi\!\left(2^{-m} t - n\right), \qquad (4)$$

or, in a simpler format, as

$$x_i = \bar{T}(t) + \sum_{m=1}^{M} W_m(t), \qquad (5)$$

where $\bar{T}(t)$ is called the approximation sub-time series (denoted by $A_M$ in this study) at level $M$, and $W_m(t)$ are the detail sub-time series (denoted by $D_m$ in this study) at levels $m = 1, 2, \ldots, M$.

Mallat [10] devised an efficient way of estimating DWT coefficients at every subset of scale and position by means of filters. The process consists of a number of successive filtering steps in which the time series is decomposed into approximation and detail sub-time series (wavelet coefficients). Approximation coefficients represent the slowly changing, coarse features of the time series and are obtained by correlating a stretched (low-frequency, high-scale) version of a wavelet with the original time series, while detail coefficients signify rapidly changing features of the time series and are obtained by correlating a compressed (high-frequency, low-scale) wavelet with the original time series. The decomposition process can be iterated, with successive decomposition of the approximations, so as to break the original signal into many lower-resolution components (Fig. 2).
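The filtering scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the Haar wavelet because its two filters are trivial to write down (the study itself reports using a Morlet mother wavelet), the function names are hypothetical, and the flow values are invented. It shows how repeated filtering yields full-length components whose sum reproduces the series, i.e., S = A3 + D1 + D2 + D3.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(series):
    """One filtering step: halve the series into approximation (scaled pairwise
    sums) and detail (scaled pairwise differences) coefficients."""
    approx = [(series[i] + series[i + 1]) / SQRT2 for i in range(0, len(series), 2)]
    detail = [(series[i] - series[i + 1]) / SQRT2 for i in range(0, len(series), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Undo one filtering step, doubling the series length."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def reconstruct(approx, details):
    """Inverse transform; `details` is ordered from finest (D1) to coarsest."""
    current = list(approx)
    for detail in reversed(details):
        current = haar_inverse_step(current, detail)
    return current

def decompose(series, levels):
    """Return full-length additive components A<levels>, D1, ..., D<levels>."""
    if len(series) % (2 ** levels) != 0:
        raise ValueError("series length must be a multiple of 2**levels")
    approx, details = list(series), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    zeros = [[0.0] * len(d) for d in details]
    # Keep only the approximation coefficients to obtain A<levels>.
    components = {f"A{levels}": reconstruct(approx, zeros)}
    # Keep only the level-m detail coefficients to obtain D<m>.
    for m in range(levels):
        kept = [details[k] if k == m else zeros[k] for k in range(levels)]
        components[f"D{m + 1}"] = reconstruct([0.0] * len(approx), kept)
    return components

# Hypothetical daily flows in m3/s (not data from the paper).
flows = [1200.0, 1500.0, 4100.0, 3900.0, 2600.0, 2400.0, 1800.0, 1700.0]
components = decompose(flows, 3)
recombined = [sum(values) for values in zip(*components.values())]
```

Because the transform is linear, `recombined` matches `flows` up to rounding. With only eight samples and three levels, the Haar approximation A3 collapses to the series mean, which makes the "coarse feature" role of the approximation easy to see.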

Adaptive Neuro-Fuzzy Inference System (ANFIS)

System modeling based on conventional mathematical tools is unsuitable for dealing with ill-defined and uncertain systems. ANFIS, on the other hand, uses human knowledge and reasoning processes without employing precise quantitative analyses. This is done by embedding the fuzzy inference system into the framework of adaptive networks. A detailed discussion of ANFIS is beyond the scope of this work; however, a good illustration of the working of ANFIS can be found in [6].

Hybrid Wavelet-ANFIS Model, WANFIS

WANFIS, as developed in this study, is a conjunction model of DWT and ANFIS. First, OFTS is decomposed into its sub-time series, or wavelet components, on three resolution levels (2–4–8) by DWT. The sub-time series D1, D2 and D3 represent detail components corresponding to scales (periodicities) of 2, 4 and 8 days, respectively, and A3 represents the approximation component at the 8-day scale. Based on statistical indices such as the coefficient of correlation and the root mean square error, D1 is generally found to be the noisiest wavelet component, i.e., the most rapidly varying and uncorrelated component of the measured time series. As the decomposition level increases, the irregularity in the wavelet components decreases (Fig. 3). Therefore, for accurate and reliable modeling of a time series, it is important to remove the noise component from it; the removal should help approximate the time series better. Hence, SFTS was constructed by combining components D2, D3 and A3. The SFTS hydrograph constituted the inputs for WANFIS models, while the OFTS hydrograph constituted the inputs for ANFIS and AR models. The working structure of WANFIS is shown in Fig. 4. In addition to the WANFIS class of models, AR and ANFIS classes of models were also developed for comparison.

Fig. 1. Correlating a time series with the shifting and translating wavelet. [Figure not reproduced; panel labels: C1, C2, C3.]
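The smoothing step, SFTS = D2 + D3 + A3, is equivalent to subtracting D1 from the observed series, since the four components sum to OFTS. The sketch below illustrates this and then pairs lagged inputs with targets. It rests on several assumptions that are not from the paper: a Haar wavelet is used for D1 (the study reports a Morlet wavelet), the target is taken to be the observed flow on day t, the inputs are the n preceding smoothed flows, and all names and numbers are hypothetical.

```python
def haar_d1(series):
    """Finest-scale (2-day) Haar detail component, same length as the input."""
    if len(series) % 2:
        raise ValueError("series length must be even")
    d1 = []
    for i in range(0, len(series), 2):
        half_diff = (series[i] - series[i + 1]) / 2.0
        d1.extend([half_diff, -half_diff])
    return d1

def smoothed_series(series):
    """SFTS = OFTS - D1, i.e. D2 + D3 + A3 of a 3-level additive decomposition."""
    return [x - d for x, d in zip(series, haar_d1(series))]

def lagged_samples(inputs, targets, n_lags):
    """Pair the n_lags values preceding day t (from `inputs`) with targets[t]."""
    return [(inputs[t - n_lags:t], targets[t]) for t in range(n_lags, len(targets))]

# Hypothetical observed flows in m3/s (not data from the paper).
ofts = [1200.0, 1500.0, 4100.0, 3900.0, 2600.0, 2400.0]
sfts = smoothed_series(ofts)

# WANFIS-style samples draw inputs from SFTS; ANFIS/AR-style samples from OFTS.
wanfis_samples = lagged_samples(sfts, ofts, n_lags=3)
anfis_samples = lagged_samples(ofts, ofts, n_lags=3)
```

With the Haar wavelet, removing D1 simply replaces each pair of consecutive days by their mean, which is why the smoothed hydrograph loses the day-to-day spikes while keeping the slower rises and recessions.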

Fig. 2. Decomposition of a time series into its wavelet components. [Figure not reproduced; a decomposition tree in which S splits into A1 and D1, A1 into A2 and D2, and A2 into A3 and D3.]

S = observed time series; Am = approximation at level m; Dm = detail at level m.

S = A1 + D1
S = A2 + D1 + D2
S = A3 + D1 + D2 + D3

Fig. 3. Decomposition of the Gandak flow time series into its wavelet components (derivation dataset). [Figure not reproduced; panels: original signal, D1, D2, D3, A3 and the recombined series D2 + D3 + A3.]


In each class, four models based on various input sets were constructed. To evaluate the developed models, the following performance indices were used:

$$CC = \frac{\sum_{i=1}^{N} Q_p Q_m - \frac{1}{N} \sum_{i=1}^{N} Q_p \sum_{i=1}^{N} Q_m}{N S_p S_m}, \qquad (6)$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{N} (Q_p - Q_m)^2}{N}}, \qquad (7)$$

$$DR = \log \frac{Q_p}{Q_m}, \qquad (8)$$

% Accuracy = percentage of predicted values with DR values lying between –0.06 and 0.06 (i.e., roughly ±15% deviation between Qm and Qp), (9)

where CC is the coefficient of correlation, RMSE is the root mean square error, DR is the discrepancy ratio, Qp and Qm are the predicted and measured flow rates in the river, respectively; Sp and Sm are the standard deviations of the predicted and measured values, respectively; and N is the number of observations. From Eq. (8), it follows that DR = 0 indicates an exact match between measured and predicted values; otherwise, there is either overprediction [DR > 0, i.e., Qp > Qm] or underprediction [DR < 0, i.e., Qp < Qm].
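The four indices in Eqs. (6)–(9) can be computed as in the following sketch. Two points are assumptions rather than statements from the paper: the logarithm in Eq. (8) is taken to base 10, which is the base for which a DR of 0.06 corresponds to a ratio of about 1.15, and Sp and Sm are taken as population standard deviations (dividing by N). The function names and sample values are hypothetical.

```python
import math

def correlation_coefficient(predicted, measured):
    """Eq. (6), with population standard deviations (an assumption)."""
    n = len(measured)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    numerator = sum(p * m for p, m in zip(predicted, measured)) - n * mean_p * mean_m
    s_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / n)
    s_m = math.sqrt(sum((m - mean_m) ** 2 for m in measured) / n)
    return numerator / (n * s_p * s_m)

def rmse(predicted, measured):
    """Eq. (7): root mean square error, in the units of the flows."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)

def discrepancy_ratio(predicted_value, measured_value):
    """Eq. (8), assuming a base-10 logarithm."""
    return math.log10(predicted_value / measured_value)

def percent_accuracy(predicted, measured, tolerance=0.06):
    """Eq. (9): share of predictions whose DR lies within +/- tolerance."""
    hits = sum(
        1 for p, m in zip(predicted, measured)
        if abs(discrepancy_ratio(p, m)) <= tolerance
    )
    return 100.0 * hits / len(measured)

# Hypothetical flows in m3/s (not data from the paper).
measured = [1000.0, 2000.0, 4000.0, 8000.0]
predicted = [1100.0, 1900.0, 4000.0, 10000.0]

cc = correlation_coefficient(predicted, measured)
error = rmse(predicted, measured)
accuracy = percent_accuracy(predicted, measured)
```

In this example the last prediction overshoots by 25%, so its DR of about 0.097 falls outside the ±0.06 band and the accuracy is three out of four.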

MODEL IMPLEMENTATION

The developed models were evaluated for forecasting the current-day flow of the Gandak River at Balmikinagar in Bihar state of India. The Gandak rises in the Great Himalayan Range of Nepal and flows southwestward through deep gorges towards the Indo-Gangetic Plain of India. The mountain peaks Dhaulagiri (8167 m) and Annapurna (8091 m) lie in its catchment. The Gandak is older than the Himalaya. Also known as the Narayani, it travels a winding course of about 765 km before falling into the Ganga River opposite Patna City. Floods resulting from any heavy downpour in the upper catchment in Nepal rush very fast toward the Indian border. These floods claim many lives and damage infrastructure, agriculture and industrial production. Shifting of courses is a regular feature of the North Bihar rivers. To confine the Gandak River and to control flood damage, long embankments were constructed on both sides of the river. Although the embankments have confined the lateral shift of the rivers to a large extent, frequent breaches and overtoppings of the embankments have made flooding a perpetual challenge in the area. The catchment map of the North Bihar rivers (India) is shown in Fig. 5. The recent monsoon flow data for 5 years (2004–2009) for the Gandak at Balmikinagar were used for deriving and verifying the developed models. Table 1 summarizes statistical information on the measured datasets.

The models were derived using 492 daily flow data of the Gandak River for the monsoon period (from June 15 to October 15) for the years 2004–2008. After the models had been derived satisfactorily, another 123 daily flow data for the monsoon period of 2009 were used for verifying them. Hydrometeorological data are collected in the region through a network of gauge and discharge stations by government agencies such as the Central Water Commission (New Delhi, India) and the Water Resources Department (Patna, Bihar, India). To derive the wavelet-based models, the OFTS was first decomposed at three resolution levels using the Morlet mother wavelet. This complex non-orthogonal function is found to be suitable for the analysis of signals with strong wavelike features, as is the case with monsoon flows in a river [1]. Based on different input combinations, four models each of the WANFIS class (i.e., WANFIS1, WANFIS2, WANFIS3 and WANFIS4), the ANFIS class (i.e., ANFIS1, ANFIS2, ANFIS3 and ANFIS4) and the AR class (i.e., AR1, AR2, AR3 and AR4) were developed. WANFIS3, for example, took three previous days' flows from SFTS, while ANFIS3 and AR3 took them from OFTS, for predicting the current-day river flows. In this study, WANFIS and ANFIS models had a five-layered structure, and in any assumed structure, the best combination of inputs and membership functions (MFs) was obtained by trial and error. The objective was to obtain the minimum deviation between the measured and the predicted values. Though many MFs, such as trimf, trapmf, gbellmf, gaussmf, gauss2mf, pimf, dsigmf and psigmf, were tried in this study, the generalized bell curve MF, i.e., gbellmf, with the hybrid learning rule, captured the input–output pattern of the given dataset most closely. It was also observed that more MFs complicated the models without significant improvement in their performance.

Fig. 4. Working structure of the developed WANFIS models. [Flowchart not reproduced; its steps, in order:]
1. Observed daily flow time series (OFTS).
2. Decomposition of OFTS by DWT at three resolution levels.
3. Wavelet components of OFTS (D1, D2, D3 and A3) obtained.
4. A wavelet-smoothened flow time series (SFTS) obtained by recombining the effective wavelets, i.e., (D2 + D3 + A3), and removing the noise wavelet component, D1.
5. ANFIS applied to SFTS.
6. Current-day river flows.

Table 1. Statistical parameters for the Gandak River at Triveni (India)

| Parameter | Derivation dataset (2004–2008) | Verification dataset (2009) |
|---|---|---|
| Max. daily discharge, m³/s | 13250.0 | 6710.0 |
| Min. daily discharge, m³/s | 1255.0 | 1450.0 |
| Mean daily discharge, m³/s | 4339.1 | 3643.4 |
| Std. dev., m³/s | 2069.6 | 1259.3 |
| Range, m³/s | 11995.0 | 5260.0 |

Fig. 5. Basin map of the Gandak and its adjoining rivers [3]. [Map not reproduced; labels include India, Nepal, Uttar Pradesh, West Bengal, the Gandak catchment, Bhagalpur, and the rivers Gandak, Burhi Gandak, Ghaghra, Mahi, Son, Bagmati, Baya, Adhwara, Kamla, Bhutani, Kosi, Ganga and Maharanda.]

Table 2. Performance indices of the models for the Gandak River at Triveni (India)

| Set | Input variables | Model | Derivation RMSE, m³/s | Verification RMSE, m³/s | Verification DR range | Verification accuracy, % |
|---|---|---|---|---|---|---|
| Set 1 | qt | WANFIS1 | 502.2 | 846.3 | –0.19 to 0.19 | 69.4 |
| Set 1 | Qt | ANFIS1 | 654.1 | 975.1 | –0.32 to 0.25 | 66.9 |
| Set 1 | Qt | AR1 | 678.5 | 1109.8 | –0.31 to 0.25 | 51.6 |
| Set 2 | qt and qt – 1 | WANFIS2 | 436.9 | 676.7 | –0.13 to 0.21 | 74.4 |
| Set 2 | Qt and Qt – 1 | ANFIS2 | 630.4 | 862.5 | –0.35 to 0.34 | 67.2 |
| Set 2 | Qt and Qt – 1 | AR2 | 672.7 | 1106.0 | –0.30 to 0.26 | 52.0 |
| Set 3 | qt, qt – 1 and qt – 2 | WANFIS3 | 390.4 | 646.4 | –0.13 to 0.29 | 79.8 |
| Set 3 | Qt, Qt – 1 and Qt – 2 | ANFIS3 | 627.9 | 772.4 | –0.34 to 0.37 | 70.5 |
| Set 3 | Qt, Qt – 1 and Qt – 2 | AR3 | 668.8 | 1104.4 | –0.30 to 0.26 | 52.2 |
| Set 4 | qt, qt – 1, qt – 2 and qt – 3 | WANFIS4 | 381.3 | 640.7 | –0.14 to 0.20 | 80.7 |
| Set 4 | Qt, Qt – 1, Qt – 2 and Qt – 3 | ANFIS4 | 625.4 | 744.9 | –0.32 to 0.40 | 71.8 |
| Set 4 | Qt, Qt – 1, Qt – 2 and Qt – 3 | AR4 | 676.4 | 1101.8 | –0.30 to 0.27 | 52.2 |

Qt – i and qt – i (i = 0 to 3) are the i-day-antecedent flows of the original and the smoothened flow series, respectively.
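To make the five-layered structure and the generalized bell membership function mentioned in the model implementation more concrete, here is a minimal forward-pass sketch of a one-input, first-order Sugeno-type ANFIS. It is not the authors' implementation: the two rules, all parameter values and the function names are hypothetical, and the hybrid learning step that would tune the parameters is not shown. The bell function uses the standard form 1 / (1 + |(x - c) / a|^(2b)).

```python
def gbell(x, a, b, c):
    """Generalized bell membership grade: width a, slope b, center c."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))

def anfis_forward(x, premise, consequent):
    """Forward pass of a one-input, first-order Sugeno ANFIS.

    premise:    one (a, b, c) triple per rule (bell-function parameters)
    consequent: one (p, r) pair per rule; the rule output is p * x + r
    """
    # Layer 1: fuzzification, one membership grade per rule.
    grades = [gbell(x, a, b, c) for a, b, c in premise]
    # Layer 2: firing strengths (with one input they equal the grades).
    strengths = grades
    # Layer 3: normalization of the firing strengths.
    total = sum(strengths)
    weights = [w / total for w in strengths]
    # Layer 4: weighted linear consequents.
    rule_outputs = [w * (p * x + r) for w, (p, r) in zip(weights, consequent)]
    # Layer 5: summation into a single crisp output.
    return sum(rule_outputs)

# Hypothetical parameters: a "low flow" and a "high flow" rule, flows in m3/s.
premise = [(1500.0, 2.0, 2000.0), (1500.0, 2.0, 6000.0)]
consequent = [(0.9, 150.0), (0.8, 700.0)]

forecast = anfis_forward(4000.0, premise, consequent)
```

At the center c the grade is 1 and at c ± a it is 0.5, so a sets the width of the fuzzy set and b the steepness of its shoulders. Adding more membership functions per input multiplies the number of rules and tunable parameters, which is consistent with the observation above that extra MFs complicated the models without a clear gain.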

RESULTS AND DISCUSSION

Table 2 summarizes the structure and performance of each model. It shows WANFIS models performing better than AR and ANFIS models, for the derivation as well as the verification dataset. To facilitate comparison, the developed models are divided into four sets, whose performance is discussed set by set below.

Set 1 consists of models WANFIS1, ANFIS1 and AR1. qt – 1, the previous-day river flow from SFTS, was the only input for WANFIS1, while Qt – 1, the previous-day river flow from OFTS, was the only input for ANFIS1 and AR1. The objective was to investigate the effectiveness of these simple single-input models in predicting the current-day flow. The results suggest that WANFIS1 is a fairly reliable model, as its prediction accuracy for the verification dataset is as high as 69.4% for the Gandak. The corresponding accuracies of ANFIS1 and AR1 are only 66.9 and 51.6%, respectively. WANFIS1 predicts river flows with the minimum RMSE for the verification as well as the derivation datasets. The DR range, another performance indicator that is commonly used as an error measure and shows the proximity between the predicted and the observed values, is better for WANFIS1 than for ANFIS1 and AR1. Its value of –0.19 to 0.19 for the Gandak River suggests an unbiased prediction, whereas ANFIS1 and AR1 show predictions slightly skewed towards the negative side, indicating underprediction.

Another input, the flow two days earlier, qt – 2/Qt – 2, was added to the models in Set 2 to see the effect of its inclusion on their performance. Though all models improve, the greatest improvement is seen in the WANFIS model. Its RMSE for the verification dataset is reduced by almost 20%, while the corresponding reductions for the ANFIS and AR models are 11.5 and 0.3%, respectively. The accuracy of every model also improves, and the DR range of the WANFIS model narrows. While WANFIS2 predicts the Gandak flows accurately in 74.4% of the cases for the verification dataset, ANFIS2 and AR2 do so in only 67.2 and 52% of the cases, respectively.

Set 3 considered three previous days' flow rates as input to predict the current-day flow rate. The performance indicators of most of the developed models improve, though not as markedly as in Set 2.

Another antecedent flow was added as input in the models of Set 4. Thus, WANFIS4 had qt – 3, qt – 2, qt – 1 and qt as inputs, while ANFIS4 and AR4 had Qt – 3, Qt – 2, Qt – 1 and Qt as their inputs. Table 2 shows that although the additional input improves the performance of the models, the improvement is not significant. Nevertheless, the Set 4 models predict the Gandak flows with the minimum verification RMSE and the maximum accuracy in their respective categories. WANFIS4 is found to be the most reliable among all the developed models. Its RMSE of 640.7 m3/s and DR range of –0.14 to 0.20 are the best performance indices among all the developed models. It predicts the Gandak flows accurately in 80.7% of the cases, while ANFIS4 and AR4 do so in only 71.8 and 52.2% of the cases, respectively. Apparently, adding yet another previous-day flow would not improve the models any further.
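The RMSE reductions quoted for the step from Set 1 to Set 2 follow directly from the verification RMSE column of Table 2. The short check below reproduces that arithmetic; the variable and function names are illustrative only.

```python
# Verification RMSE (m3/s) for Sets 1 to 4, copied from Table 2.
verification_rmse = {
    "WANFIS": [846.3, 676.7, 646.4, 640.7],
    "ANFIS": [975.1, 862.5, 772.4, 744.9],
    "AR": [1109.8, 1106.0, 1104.4, 1101.8],
}

def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before

# Reduction in verification RMSE when moving from Set 1 to Set 2.
reductions = {
    model: percent_reduction(values[0], values[1])
    for model, values in verification_rmse.items()
}

# Reduction when moving from Set 3 to Set 4, for comparison.
late_reductions = {
    model: percent_reduction(values[2], values[3])
    for model, values in verification_rmse.items()
}
```

Rounded to one decimal, `reductions` gives 20.0% for WANFIS, 11.5% for ANFIS and 0.3% for AR, matching the text. The values in `late_reductions` are all below 4%, which is the numerical basis for saying that a fourth input brings little further gain.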

Figure 6 shows the residuals of the observed and predicted flows by WANFIS4, ANFIS4 and AR4 in the Gandak River for the verification dataset. It can be observed from this figure that flows are better predicted by WANFIS4 than by ANFIS4 and AR4, as the deviation between the observed and the predicted flows is the least for this model. Figure 7 shows the measured and predicted peak flows (>4000 m3/s) by the developed models. It shows that peak flows are better predicted by WANFIS4 than by ANFIS4 and AR4. WANFIS4 successfully predicts the highest three flows of 6710, 6640 and 6410 m3/s in the Gandak River as 6492, 6415 and 6400 m3/s, respectively. In comparison, AR4 predicts the highest three flows as 5335, 6020 and 6197 m3/s, respectively, and ANFIS4 predicts them as 5612, 5577 and 6202 m3/s, respectively. The prediction of the low flows by WANFIS4 also seems to be in good agreement with the measured data. It estimates the lowest three flows of 1450, 1600 and 1630 m3/s in the Gandak River as 1382, 1634 and 1522 m3/s, respectively, while AR4 significantly overpredicts these flows as 2030, 2170 and 2182 m3/s, respectively, and ANFIS4 as 1589, 1769 and 1785 m3/s, respectively. This implies that WANFIS4 captured the input–output pattern well even for extreme values.
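As a rough check on the peak-flow comparison, the relative errors of the three highest flows can be worked out from the figures quoted above. The sketch below uses only those quoted values; the names are illustrative, and the mean absolute percentage error is an added summary statistic, not one reported in the paper.

```python
# Three highest measured flows and their predictions (m3/s), as quoted in the text.
measured_peaks = [6710.0, 6640.0, 6410.0]
predicted_peaks = {
    "WANFIS4": [6492.0, 6415.0, 6400.0],
    "ANFIS4": [5612.0, 5577.0, 6202.0],
    "AR4": [5335.0, 6020.0, 6197.0],
}

def percent_errors(predicted, measured):
    """Signed relative errors in percent; negative means underprediction."""
    return [100.0 * (p - m) / m for p, m in zip(predicted, measured)]

errors = {
    model: percent_errors(values, measured_peaks)
    for model, values in predicted_peaks.items()
}

# Mean absolute percentage error over the three peaks (added summary statistic).
mean_abs_error = {
    model: sum(abs(e) for e in values) / len(values)
    for model, values in errors.items()
}
```

All three models underpredict every one of these peaks, but WANFIS4 stays within about 3.4% of each measured value, whereas the largest misses of ANFIS4 and AR4 exceed 16% and 20%, respectively.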

Figure 8 shows the predicted flows falling into different discrepancy brackets for WANFIS4, ANFIS4 and AR4. The objective was to show how the predicted values compare with the measured values over the entire verification dataset. It can be observed from this figure that WANFIS4 gives an even prediction, showing little bias towards under- or overprediction. In comparison, predictions by AR4 and ANFIS4 are skewed either towards the positive side (in the case of ANFIS4) or the negative side (in the case of AR4).

The above discussion suggests that WANFIS models are efficient in forecasting river flows. However, the present study used only 5 years of daily flow data, which may not be representative of the complexity of a large river system like the Gandak, and the models may have overfitted the derivation data. If so, these models would give unsatisfactory forecasts for new data. Furthermore, the developed models are location and period specific; they are sensitive and may have significant phase problems if made to forecast flows for other periods, as the causes of floods may differ at other times of the year. For example, the monsoon greatly influences floods in North Bihar rivers during June–October, while glacier melt contributes significantly to their flows during January–May. Therefore, forecasting models should be developed for a specific period using the data for that period only.

Fig. 6. Residuals of the observed and predicted flows for the Gandak River (verification dataset). Axes: residuals (cumecs) versus days; series: AR4, ANFIS4, WANFIS4.

Fig. 7. Observed and predicted peak flows (>4000 m³/s) for the Gandak flows (verification dataset). Panel title: Gandak at Triveni; axes: Qm (m³/s) and Q0 (m³/s); series: AR4, ANFIS4, WANFIS4.

CONCLUSIONS

A new model, WANFIS, was developed by combining the discrete wavelet transform and the adaptive neuro-fuzzy inference system to predict monsoon flows in a river. Monsoon flows are irregular high flows with large variations, and modeling them poses a great challenge. Based on different inputs, four WANFIS models were developed and applied to the Gandak River at the Balmikinagar gauge site in Bihar State, India. The Gandak is infamous for bringing large flows almost every monsoon, which take a heavy toll on life and property. The performance of the WANFIS models was compared with that of ANFIS and AR models. Based on several statistical indices, it was concluded that WANFIS models, which are essentially ANFIS models with a wavelet-smoothened flow time series as input, predict the river flows with greater accuracy than ANFIS and AR models. Smoothening the observed flow time series by removing its noise wavelet components increases the predictability of the series. The best performing WANFIS model, with four previous days’ average flows as input, predicted the current-day monsoon flows with 80.7% accuracy, while the best performing ANFIS and AR models predicted them with only 71.8 and 51.2% accuracy, respectively. Even for extreme flows, the WANFIS prediction is closest to the observed values. By comparison, ANFIS and AR show substantial bias toward underprediction or overprediction.
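The smoothing idea summarised in the conclusions, decomposing a flow series on three resolution levels, discarding the components treated as noise and recombining the rest, can be illustrated with a minimal sketch. It assumes a Haar wavelet and treats only the finest detail level as noise; both choices, the function names and the sample flow values are hypothetical and are not taken from the study.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """One-level Haar DWT of an even-length sequence: (approximation, detail)."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def wavelet_smooth(signal, levels=3, drop_levels=(1,)):
    """Decompose, zero the detail levels in drop_levels (1 = finest), reconstruct."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []  # details[0] holds level 1, the finest scale
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)
    for level in range(levels, 0, -1):
        detail = details[level - 1]
        if level in drop_levels:
            detail = [0.0] * len(detail)  # discard this component as assumed noise
        approx = haar_idwt(approx, detail)
    return approx


# Hypothetical daily flows (m3/s), for illustration only.
flows = [2100.0, 2400.0, 3900.0, 3300.0, 5200.0, 6100.0, 4800.0, 4100.0]
smoothed = wavelet_smooth(flows, levels=3, drop_levels=(1,))
restored = wavelet_smooth(flows, levels=3, drop_levels=())
print([round(v, 1) for v in smoothed])
```

With a Haar wavelet, dropping the finest detail level replaces each consecutive pair of values by its mean, while keeping every level returns the original series. A different wavelet family or a different choice of discarded levels would give a different smoothed series.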

REFERENCES

1. Adamowski, J.F., River flow forecasting using wavelet and cross-wavelet transform models, Hydrol. Processes, 2008, vol. 22, no. 25, pp. 4877–4891.

2. Coulibaly, P. and Burn, H.D., Wavelet analysis of variability in annual Canadian streamflows, Water Resour. Res., 2004, vol. 40, no. 3, doi: 10.1029/2003WR002667

3. FMIS: Flood Management Information System, http://fmis.bih.nic.in, 2012.

4. Golitsyn, G.S., Efimova, L.K., Mokhov, I.I., Rumyantsev, V.A., Somova, N.G., and Khon, V.Ch., Ladoga and Onega hydrological regimes and their variations, Water Resour., 2002, vol. 29, no. 2, pp. 149–154.

5. Ip, W.Ch., Zhang, H.W., and Xia, J., Multi-scale variability and trends of precipitation in North China, Water Resour., 2011, vol. 38, no. 1, pp. 18–28.

6. Jang, J.S.R., Sun, C.T., and Mizutani, E., Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, New Delhi: Pearson Education, 2004.

7. Kisi, O. and Shiri, J., Wavelet and neuro-fuzzy conjunction model for predicting water table depth fluctuations, Hydrol. Res., 2012, vol. 43, no. 3, pp. 286–300.

8. Kisi, O., Wavelet regression model as an alternative to neural networks for river stage forecasting, Water Resour. Manage., 2011, vol. 25, no. 2, pp. 579–600.

9. Labat, D., Ababou, R., and Mangin, A., Rainfall–runoff relation for Karstic spring, Part 2: Continuous wavelet and discrete orthogonal multi resolution analyses, J. Hydrol., 2000, vol. 238, nos. 3–4, pp. 149–178.

10. Mallat, S.G., A theory for multi resolution signal decomposition: the wavelet representation, IEEE Trans. Pattern. Anal. Mach. Intell., 1989, vol. 11, no. 7, pp. 674–693.

11. Partal, T. and Kucuk, M., Long-term trend analysis using discrete wavelet components of annual precipitations measurements in Marmara Region Turkey, Phys. Chem. Earth, 2006, vol. 31, no. 18, pp. 1189–1200.

12. Rajaee, T., Mirbagheri, S.A., Nourani, V., and Alikhani, A., Prediction of daily suspended sediment load using wavelet and neuro-fuzzy combined model, Int. J. Environ. Sci. Tech., 2010, vol. 7, no. 1, pp. 93–110.

13. Sahay, R.R. and Chakraborty, A., Predicting river floods using discrete wavelet, J. Soil Water Sci., 2012, vol. IV, no. 1, pp. 29–41.

14. Sahay, R.R. and Sehgal, V., Wavelet regression models for predicting flood stages in rivers: a case study in Eastern India, J. Flood Risk Manage., 2012, vol. 6, no. 2, pp. 146–155.

15. Smith, L.C., Turcotte, D.L., and Isacks, B., Stream flow characterization and feature detection using a discrete wavelet transform, Hydrol. Processes, 1998, vol. 12, no. 2, pp. 233–249.

16. Stream Gaging and Flood Forecasting, U.S. Geological Survey (USGS), 1995.

17. Tiwari, M.K. and Chatterjee, C., Development of an accurate and reliable hourly flood forecasting model using wavelet–bootstrap–ANN (WBANN) hybrid approach, J. Hydrol., 2010, vol. 394, nos. 3–4, pp. 458–470.

18. Tiwari, M.K. and Chatterjee, C., A new wavelet–bootstrap–ANN hybrid model for daily discharge forecasting, J. Hydroinf., 2011, vol. 13, no. 3, pp. 500–519.

Fig. 8. Percentage predicted flows and discrepancy brackets (verification dataset). Axes: proportion (%) versus DR range, in brackets of width 0.02 from less than –0.12 to more than 0.12; series: AR, ANFIS, WANFIS.