Wavelet-ANN versus ANN-Based Model for Hydrometeorological...

21
water Article Wavelet-ANN versus ANN-Based Model for Hydrometeorological Drought Forecasting Md Munir H. Khan 1,2 ID , Nur Shazwani Muhammad 1, * and Ahmed El-Shafie 3 ID 1 Smart and Sustainable Township Research Centre, Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor Darul Ehsan, Malaysia; [email protected] 2 Faculty of Engineering & Quantity Surveying, INTI International University (INTI-IU), Persiaran Perdana BBN, Putra Nilai, Nilai 71800, Negeri Sembilan, Malaysia 3 Department of Civil Engineering, Faculty of Engineering, University of Malaya (UM), Kuala Lumpur 50603, Malaysia; elshafi[email protected] * Correspondence: [email protected]; Tel.: +60-3-8921-6226 Received: 4 June 2018; Accepted: 10 July 2018; Published: 27 July 2018 Abstract: Malaysia is one of the countries that has been experiencing droughts caused by a warming climate. This study considered the Standard Index of Annual Precipitation (SIAP) and Standardized Water Storage Index (SWSI) to represent meteorological and hydrological drought, respectively. The study area is the Langat River Basin, located in the central part of peninsular Malaysia. The analysis was done using rainfall and water level data over 30 years, from 1986 to 2016. Both of the indices were calculated in monthly scale, and two neural network-based models and two wavelet-based artificial neural network (W-ANN) models were developed for monthly droughts. The performance of the SIAP and SWSI models, in terms of the correlation coefficient (R), was 0.899 and 0.968, respectively. The application of a wavelet for preprocessing the raw data in the developed W-ANN models achieved higher correlation coefficients for most of the scenarios. This proves that the created model can predict meteorological and hydrological droughts very close to the observed values. Overall, this study helps us to understand the history of drought conditions over the past 30 years in the Langat River Basin. It further helps us to forecast drought and to assist in water resource management. Keywords: drought analysis; ANN model; drought indices; meteorological drought; SIAP; SWSI; hydrological drought; discrete wavelet 1. Introduction Drought gradually happens with a lack of rainfall for a long period of time (i.e., months or years). This natural disaster is considered to be the most complex and least understood by many scientists. The impact of drought varies with respect to the affected areas. The damage may include impacts on the social and agriculture sectors, and the economy [1]. In 2007, it was reported that, because of the tremendously hot temperature, heat waves, and heavy rainfalls, extreme events would accumulate and become more frequent [2]. Although Malaysia experiences a tropical climate and receives more than 2000 mm of total rainfall annually, over the recent years, the country has experienced several drought episodes. For example, the state of Melaka faced a serious water shortage when water levels in the dams fell under critical levels in 1991, and the Durian Tunggal dam, which serves as a major water supply dam, ran dry [3]. In 1998, an El Nino-related drought severely hit the states of Selangor, Kedah, and Penang, which caused severe social and environmental impacts across the country [3]. This drought caused water rationing and hardship for 1.8 million residents of Kuala Lumpur and other towns in Klang Valley. Water 2018, 10, 998; doi:10.3390/w10080998 www.mdpi.com/journal/water

Transcript of Wavelet-ANN versus ANN-Based Model for Hydrometeorological...

Page 1: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

water

Article

Wavelet-ANN versus ANN-Based Model forHydrometeorological Drought Forecasting

Md Munir H. Khan 1,2 ID , Nur Shazwani Muhammad 1,* and Ahmed El-Shafie 3 ID

1 Smart and Sustainable Township Research Centre, Faculty of Engineering and Built Environment,Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor Darul Ehsan, Malaysia;[email protected]

2 Faculty of Engineering & Quantity Surveying, INTI International University (INTI-IU),Persiaran Perdana BBN, Putra Nilai, Nilai 71800, Negeri Sembilan, Malaysia

3 Department of Civil Engineering, Faculty of Engineering, University of Malaya (UM),Kuala Lumpur 50603, Malaysia; [email protected]

* Correspondence: [email protected]; Tel.: +60-3-8921-6226

Received: 4 June 2018; Accepted: 10 July 2018; Published: 27 July 2018�����������������

Abstract: Malaysia is one of the countries that has been experiencing droughts caused by a warmingclimate. This study considered the Standard Index of Annual Precipitation (SIAP) and StandardizedWater Storage Index (SWSI) to represent meteorological and hydrological drought, respectively.The study area is the Langat River Basin, located in the central part of peninsular Malaysia.The analysis was done using rainfall and water level data over 30 years, from 1986 to 2016. Both ofthe indices were calculated in monthly scale, and two neural network-based models and twowavelet-based artificial neural network (W-ANN) models were developed for monthly droughts.The performance of the SIAP and SWSI models, in terms of the correlation coefficient (R), was 0.899and 0.968, respectively. The application of a wavelet for preprocessing the raw data in the developedW-ANN models achieved higher correlation coefficients for most of the scenarios. This proves thatthe created model can predict meteorological and hydrological droughts very close to the observedvalues. Overall, this study helps us to understand the history of drought conditions over the past30 years in the Langat River Basin. It further helps us to forecast drought and to assist in waterresource management.

Keywords: drought analysis; ANN model; drought indices; meteorological drought; SIAP; SWSI;hydrological drought; discrete wavelet

1. Introduction

Drought gradually happens with a lack of rainfall for a long period of time (i.e., months or years).This natural disaster is considered to be the most complex and least understood by many scientists.The impact of drought varies with respect to the affected areas. The damage may include impacts onthe social and agriculture sectors, and the economy [1]. In 2007, it was reported that, because of thetremendously hot temperature, heat waves, and heavy rainfalls, extreme events would accumulate andbecome more frequent [2]. Although Malaysia experiences a tropical climate and receives more than2000 mm of total rainfall annually, over the recent years, the country has experienced several droughtepisodes. For example, the state of Melaka faced a serious water shortage when water levels in the damsfell under critical levels in 1991, and the Durian Tunggal dam, which serves as a major water supply dam,ran dry [3]. In 1998, an El Nino-related drought severely hit the states of Selangor, Kedah, and Penang,which caused severe social and environmental impacts across the country [3]. This drought causedwater rationing and hardship for 1.8 million residents of Kuala Lumpur and other towns in Klang Valley.

Water 2018, 10, 998; doi:10.3390/w10080998 www.mdpi.com/journal/water

Page 2: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 2 of 21

The Langat River Basin also experienced a rise in temperature nearly 5◦ higher than usual on many daysin March and April 2016 [4]. A research study applied the standardized precipitation index (SPI) toevaluate dry conditions using the data from 10 gauging stations throughout peninsular Malaysia, andfound that extreme dry conditions are becoming more frequent than extreme wet conditions [5]. Thus,emphasis should be placed on measures to reduce the impact of dry conditions, although the authoritiesusually put more focus on reducing extreme wet conditions (i.e., floods).

Drought is generally analyzed by means of drought indices, which are effectually a functionof precipitation and other hydrometeorological variables [6]. Different drought indices have beendiscovered and are used in different nations [6]. Hydrologists have defined four major categories ofdrought, namely, meteorological drought, agricultural drought, hydrological drought, and socioeconomicdrought [1]. Drought monitoring by indices in specific areas must be based on the availability ofhydrometeorological data and the capability of the index to dependably detect spatial and temporaldifferences through a drought event. Nevertheless, no single indicator or index alone can preciselydescribe the onset and severity of the event. Numerous climate and water supply indices are used todescribe the severity of any drought event. Although none of the major indices is inherently superiorto the rest in all circumstances, some indices are better suited for certain uses than others [7]. In thisstudy, the first objective was to assess the drought using two drought indices (DIs), the StandardIndex of Annual Precipitation (SIAP) and the Standardized Water Storage Index (SWSI), to representmeteorological and hydrological droughts, respectively. The SIAP and SWSI were chosen for theirsimplicity, and they do not require parameter estimation. Gourabi [8] used SIAP and the dependablerainfall index (DRI) for the recognition of drought years in several areas in Iran, and to analyze theeffects on rice yield and water surface. Sing et al. [9] used SIAP and a few other indices to assess thedrought spells in the Almora district of Uttarakhand, India. On the other hand, to calculate SWSI,the Standardized Drought Assessment Toolbox (SDAT), developed by Farahmand and AghaKouchakin 2015 [10], is used. The SDAT methodology standardizes the marginal probability of drought-relatedvariables (e.g., precipitation, soil moisture, and relative humidity) using the empirical distributionfunction of the data. This approach does not require an assumption of the representativeness ofa parametric distribution function to describe drought-related variables. Additionally, the nonparametricframework does not require a parameter estimation and goodness-of-fit evaluation, which makes theSDAT framework computationally much more efficient. Wang et al. [11] used four drought indices,including SWSI, in order to assess the intensity and timing of drought events in the upper and middleYangtze River Basin in China. In the second objective of this study, artificial neural network (ANN)–basedmodels coupled with a wavelet were developed and their performance evaluation was carried out forboth SIAP and SWSI models.

Many researchers have developed and applied various models to predict hydrological events, whichcould be divided into two major types, conceptual models (CM) and data-based models (DDM) [12].The conceptual models usually incorporate simplified schemes of physical laws and are generallynonlinear, time-invariant, and deterministic, with parameters that are representative of watershedcharacteristics. However, when they are calibrated to a given set of hydrological signals (time series),there is no guarantee that the conceptual models can predict accurately when they are used to extrapolatebeyond the range of calibration or verification experience [13,14]. It was also a bit difficult to understandthe nature of these kind of models, so, in order to use such kind of models it was very importantthat, in order to get better results, one should have all of the knowledge about the models and itsparameters [15]. However, DDM, which are basically numerical and based on biological neuron systems,recently known as an artificial brain or intelligence, have received more attention in water relatedapplications because of their ease, fast progress time, and less data necessity. The ANN- or data-drivenmodels have become increasingly popular in hydrologic forecasting because they are effective at dealingwith the nonlinear characteristics of hydrological data [16]. Among the various machine learningmethods, artificial neural networks (ANNs), which include back-propagation neural network (BPNN),radial basis function (RBF) neural network, generalized regression neural network (GRNN), Elman

Page 3: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 3 of 21

neural network, and multilayer feed-forward (MLFF) network, are among the most popular techniquesfor hydrological time series forecasting [17]. Although data driven models have attained high levels inthe hydrological field, there is still space present to improve the forecasting methods [18]. Hydrologicalprocesses are non-linear and arbitrary. By simply applying such models on an original time series, thefacts of alteration are overlooked, so that prediction correctness is reduced [19].

In the last decade, wavelet transform has become a widely applied technique for analyzingvariations, periodicities, and trends in time series [20,21]. Wavelet transform, which can produce a goodlocal representation of the signal, in both the time and frequency domains, provides considerableinformation on the structure of the physical process to be modelled. Discrete wavelet transformationprovides a decomposition of original time series. Subseries decomposed by discrete wavelet transform,from original time series, provide detailed information about the data structure and its periodicity [22].The attributes of each subseries are different. The wavelet components of the original time seriesimprove on a forecasting model by giving useful information on various resolution levels [23]; however,not much research has applied a wavelet for drought forecasting. A major limitation of artificialneural networks (ANNs) is their inability to deal with nonstationary data. To overcome this limitation,researchers have increasingly begun to use a wavelet analysis to preprocess the inputs of the hydrologicdata. Shabri [24] proposed a hybrid wavelet–least square support vector machine (WLSSVM) modelthat combines the wavelet method and the LSSVM model for monthly stream flow forecasting.Belayneh and Adamowski [25] studied drought forecasting using machine learning techniques andfound that coupled wavelet neural network models were the most accurate for forecasting three monthSPI (SPI 3) and six month SPI (SPI 6) values over lead times of one and three months in the AwashRiver Basin in Ethiopia. Therefore, in this study, coupling wavelets with ANN was expected to providesignificant improvements in the model performance.

2. Materials and Methods

2.1. Standard Index of Annual Precipitation (SIAP)

The SIAP is known for transferring the raw data of precipitation to relative amounts, so that thedeviation of rainfall from mean can be divided to standard deviation. Khalili [26] developed the SIAPand applied it to the study the processes of drought and wet conditions in Iran [27]. The values of theSIAP can be computed by Equation (1), provided by Khalili [6,26], as follows:

SIAP =Pi − PPSD

(1)

where SIAP is the drought index, Pi is the annual precipitation, P is the mean of precipitation in theperiod, and PSD is the standard deviation of the period. SIAP classifies drought intensity into fivemajor categories, namely, extremely wet, wet, normal, drought, and extreme drought. Details on theSIAP classifications are given in Table 1 [9,27].

In this study, SIAP is applied for short-term/monthly drought analysis. The pattern of theraw rainfall data shows a normal distribution, which supports the concept behind using SIAP ona short-term/monthly scale. Hence, Equation (1) is rewritten as follows:

SIAP (M) =Pi − P

SD

where SIAP (M) is the drought index on a monthly time scale, Pi is the monthly rainfall in the ith month(i = 1, 2, 3, 4, . . . 360), P is the mean of the monthly rainfall data for the whole period of study, and SDis the standard deviation of the monthly rainfall for the duration of the study.

Page 4: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 4 of 21

Table 1. Classification of Standard Index of Annual Precipitation (SIAP) values.

Classes of Drought Intensity SIAP Values

Extremely wet 0.84 or moreWet 0.52 to 0.84

Normal −0.52 to 0.52Drought −0.52 to −0.84

Extreme drought −0.84 or less

2.2. Standardized Water Storage Index (SWSI)

The SWSI is used to assess the deficit in the terrestrial water reserves. The SWSI calculation isbased on Equation (2). It is calculated by the SDAT toolbox in MATLAB, which was developed byFarahmand and AghaKouchak [10].

SWSIi,j =Si,j − Sj,mean

Sj,sd(2)

where Si,j is the seasonal water level for year i and month j, Sj,mean is the mean water level of thecorresponding month for the duration of the study, and Sj,sd is the standard deviation. The details ofSWSI classification are given in Table 2.

Table 2. Classification of the Standardized Water Storage Index (SWSI).

SWSI Values Classification

2.0 or more Extremely wet1.5 to 1.99 Very wet1.0 to 1.49 Moderately wet−0.99 to 0.99 Near normal−1.49 to −1.00 Moderate drought−1.99 to −1.5 Severe drought−2 or less Extreme drought

2.3. Development of Forecasting Model Using ANN

An artificial neural network can be defined as a set of simple processing units working as a paralleldistributed processor [28]. These units, which are called neurons, are responsible for storing experimentalknowledge for later disposal. The ANNs mimic the biological nervous system, similar to the brain;they learn through examples and have acquired knowledge stored in the connection weights betweenneurons [29]. The data are introduced in the input layer and the network progressively processes the datathrough the subsequent layers, producing a result in the output layer. The input neurons are linked tothose in the intermediate layer through wji weights, and the neurons in the intermediate layer are linkedto those in the output layer through wki weights. The symbols i, j, and k represent the ith, jth, and kthneuron in input, hidden, and output layers, respectively. The network maps out the relation between theinput data and the output variables based on the nonlinear activation functions. The purpose of traininga network is to minimize the error between outputs of the network and the target values. The trainingalgorithm reduces the error by adjusting the weights and biases of the network. In training, the inputvalues are multiplied by the respective connection weights and then the biases are added. The sameprocess is repeated for the output layer, where the output of a hidden layer is used as an input the outputlayer. The combination of net weighted input and biases netj to the jth neuron of the hidden layer can beexpressed as [30] follows :

netj =l

∑i=1

(wjixi + bj

)(3)

Page 5: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 5 of 21

where xi is the input value to the ith neuron of the input layer, while wji is the weight of the jth neuron ofthe hidden layer connected to the ith neuron of the input layer, and bj is the bias of the jth hidden neuron.The net value, netj, is passed through a transfer or activation function in the hidden layer to producean output from the hidden neuron. The output from the hidden layer can be expressed as [30] follows:

yj = f(netj

)= fh

(p

∑i=1

(wjixi + bj

))(4)

where yj is the output from the jth hidden neuron. The output from the hidden layer, yj, is used asan input to the output layer, and the same process as in hidden layer is repeated in the output neuronsin order to produce an output from the output layer. The net weighted input to the output neuron canbe represented by [30] the following:

netk =q

∑j=1

fo(yj)wki + bk (5)

Similar to above, the output from the kth neuron in the output layer is given by [30] the following:

yk = f (netk) = fo

(q

∑j=1

wki fh

(p

∑i=1

(wjixi + bj

))+ bk

)(6)

The ANN weights are made and modified iteratively through a procedure called calibration.The ANN models used in this study have a feed-forward multilayer perceptron (MLP) architecturethat was trained with the Levenberg–Marquardt (LM) back-propagation algorithm. MLPs have oftenbeen used in hydrologic forecasting because of their simplicity. MLPs consist of an input layer, oneor more hidden layers, and an output layer. The Levenberg–Marquardt (LM) algorithm is used fortraining because it is considered one of the fastest methods for training ANNs. The major drawback offeed-forward network models, as used in this study, is their inability to mimic the temporal patterntrend during the model training stage. Therefore, this type of model may not be capable of providinga reliable and accurate forecasting solution [31]. The efficiency of the models may be assessed usingseveral statistical parameters, which describe the adhering degree among the data that are observedand predicted by the model [32,33]. A neuron computes and gives feedback based on the weighted sumof all of its inputs, according to an activation function based on its output [34]. The activation functionselected here is the sigmoidal activation function. Standard neural network training procedures adjustthe weights and biases in the network to minimize a measure of ‘error’ in the training cases, whichis most commonly the sum of the squared differences between the network outputs and the targets.Finding the weights and biases that minimize the chosen error function is commonly done by usingsome gradient-based optimization method, with derivatives of the error, with respect to the weightsand biases calculated by back-propagation. A detailed theory of the back-propagation algorithm isbeyond the scope of this research and can be found in Haykin [28]. In this study, the Neural NetworksToolbox of MATLAB® is used. Figure 1 shows a simple neural network structure.

The performance of the presented models is evaluated based on their correlation coefficient (R)and root mean-square error (RMSE). The estimation of R is done using Equation (7), as follows:

R =1n ∑n

t=1 (yot − yo

t )(y f

t − y ft)√

1n ∑n

t=1(yot − yo

t )2√

1n ∑n

t=1(y f

t − y ft)2

(7)

where yot and y f

t are the observed and forecasted values at time t, respectively, and n is the number ofdata points.

Page 6: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 6 of 21

Water 2018, 10, x FOR PEER REVIEW    5 of 21 

 

  (4)

where yj is the output from the jth hidden neuron. The output from the hidden layer, yj, is used as an 

input to the output layer, and the same process as in hidden layer is repeated in the output neurons 

in order to produce an output from the output layer. The net weighted input to the output neuron 

can be represented by [30] the following: 

  (5)

Similar  to  above,  the  output  from  the  kth  neuron  in  the  output  layer  is  given  by  [30]  the 

following:   

  (6)

The ANN weights are made and modified  iteratively through a procedure called calibration. 

The ANN models used in this study have a feed‐forward multilayer perceptron (MLP) architecture 

that was trained with the Levenberg–Marquardt (LM) back‐propagation algorithm. MLPs have often 

been used in hydrologic forecasting because of their simplicity. MLPs consist of an input layer, one 

or more hidden layers, and an output layer. The Levenberg–Marquardt (LM) algorithm is used for 

training because it is considered one of the fastest methods for training ANNs. The major drawback 

of feed‐forward network models, as used in this study, is their inability to mimic the temporal pattern 

trend during the model training stage. Therefore, this type of model may not be capable of providing 

a reliable and accurate forecasting solution [31]. The efficiency of the models may be assessed using 

several statistical parameters, which describe the adhering degree among the data that are observed 

and predicted by the model [32,33]. A neuron computes and gives feedback based on the weighted 

sum of all of its inputs, according to an activation function based on its output [34]. The activation 

function  selected  here  is  the  sigmoidal  activation  function.  Standard  neural  network  training 

procedures adjust  the weights and biases  in  the network  to minimize a measure of  ‘error’  in  the 

training cases, which  is most commonly  the sum of  the squared differences between  the network 

outputs and the targets. Finding the weights and biases that minimize the chosen error function is 

commonly done by using some gradient‐based optimization method, with derivatives of the error, 

with respect to the weights and biases calculated by back‐propagation. A detailed theory of the back‐

propagation algorithm is beyond the scope of this research and can be found in Haykin [28]. In this 

study, the Neural Networks Toolbox of MATLAB® is used. Figure 1 shows a simple neural network 

structure. 

 

Figure 1. Neural network structure. Figure 1. Neural network structure.

The correlation coefficient (R) measures how well the predicted values correlate with the observedvalues and shows the degree to which the two variables are linearly related. An R value close to unityindicates a satisfactory result, while a low value or one that is close to zero implies an inadequate result.

The RMSE provides information about the predictive capabilities of the model. The RMSE evaluateshow close the predictions match the observations, shown in Equation (8), as follows:

RMSE =

√1n ∑n

t=1

(yo

t − y ft)2 (8)

The criteria for deciding the best models are based on how small the RMSEs found in training,testing, and validation of the data are.

2.4. Discrete Wavelet

Wavelet analysis is a multi-decomposition analysis that provides information on both the time andfrequency domains of the signal, and is the important derivative of the Fourier transform. The waveletwill become an important tool in time series forecasting. The basic objective of wavelet transformationis analyzing the time series data, in both the time and frequency domains, by decomposing theoriginal time series in different frequency bands using wavelet functions. Compared to the Fouriertransform, time series are analyzed using sine and cosine functions. Wavelet transformationsprovide useful decompositions of the original time series by capturing useful information on variousdecomposition levels.

Nowadays, wavelet analysis is one of the most powerful tools in the study of time series. Wavelettransform can be divided into two categories, continuous wavelet transform (CWT) and discrete wavelettransform (DWT). CWT is not often used for forecasting because of its computational complexity andtime requirements [35]. Among the reviewed papers by Nourani et al., [36], only about 20% of thestudies used the CWT for decomposing the hydrological time series, and the majority of studies utilizedthe DWT. This is because real world observed hydrologic time series are measured and gathered indiscrete form, rather that in a continuous format [36]. DWT is often used in forecasting applications tosimplify numeric solutions. DWT requires less computation time and is simpler to apply. DWT is givenby the following:

ψm,n(t) =1√sm

{t− nτosm

osm

o

}(9)

where ψ (t) is the mother wavelet, and m and n are integers that control the scale and time, respectively.The most common choices for the parameters are So = 2 and τo = 1. According to Mallat’s theory,

Page 7: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 7 of 21

the original discrete time series can be decomposed into a series of linearity-independent approximationand detail signals, by using the inverse DWT. The inverse DWT is given by Mallat [37], as follows:

x(t) = T +M

∑m=1

2M−m−1

∑t=0

Wm,n 2−m2 ψ(2−m t− n

). (10)

where Wm,n = 2−m2 ∑N−1

t=0 . ψ(2−m t− n)x(t) is the wavelet coefficient for the discrete wavelet at scales = 2m and τ = 2m n.

2.5. W-ANN Model

The W-ANN model is obtained by combining the DWT and ANN models. The W-ANN modeluses the subseries obtained from using DWT on original data. The W-ANN model structure developedin this study can be described with the following steps:

1. Decompose the original time series for each input into subseries components (details andapproximations) by DWT.

2. Select the most important and effective of each subseries component for each input by thecorrelation coefficient.

3. Construct a W-ANN model using the new summed series obtained by adding the significantcomponents of details sub-time series and approximations sub-time series for each input as thenew input to the ANN, and the original output time series as the output of the ANN. Figure 2shows a schematic representation of the model.

Water 2018, 10, x FOR PEER REVIEW    7 of 21 

 

2.5. W‐ANN Model   

The W‐ANN model is obtained by combining the DWT and ANN models. The W‐ANN model 

uses  the  subseries  obtained  from  using  DWT  on  original  data.  The W‐ANN  model  structure 

developed in this study can be described with the following steps: 

1. Decompose  the  original  time  series  for  each  input  into  subseries  components  (details  and 

approximations) by DWT.   

2. Select  the most  important  and  effective  of  each  subseries  component  for  each  input  by  the 

correlation coefficient.   

3. Construct a W‐ANN model using the new summed series obtained by adding the significant 

components of details sub‐time series and approximations sub‐time series for each input as the 

new input to the ANN, and the original output time series as the output of the ANN. Figure 2 shows a schematic representation of the model. 

 

Figure 2. Schematic diagram of wavelet‐based artificial neural network (W‐ANN) model development. 

2.6. Study Area and Data Collection 

2.6.1. Langat River Basin   

The Langat River is situated in the state of Selangor, Malaysia. This river basin is located near 

Kuala Lumpur, the capital city of Malaysia. Therefore, the study area has been rapidly developed, 

which makes  it dependent  on  the Langat River  for water  supply  [38]. The Langat River has  an 

estimated total catchment area of 1817 km2 and is located at latitude 2°40′152″ N to 3°16′15″ N and 

longitude 101°19′20″ E to 102°1′10″ E [30], and the main river is 141 km in length. The Beranang River, 

Semenyih River, and Lui River are the main tributaries of the Langat River, as shown in Figure 3. 

There are two reservoirs  in the Langat River Basin, Hulu Langat and Semenyih. The northeastern 

part of the river basin has a reduced level (RL) of 960 m above the mean sea level and is mountainous. 

The temperature of the area varies from 23.5 °C to 33.5 °C all year, and the comparative humidity 

ranges from 63% to 95%, with an average of 81%. Heavier rainfall happens in the month of November, 

with a monthly average rainfall of 270 mm; the average annual rainfall of the study area is about 2400 

mm  [38].  The  area  also  sometimes  experiences  rainstorms,  and  these  usually  occur  in  the  early 

evening through the year, and are usually of a short duration with a high intensity.   

Figure 2. Schematic diagram of wavelet-based artificial neural network (W-ANN) model development.

2.6. Study Area and Data Collection

2.6.1. Langat River Basin

The Langat River is situated in the state of Selangor, Malaysia. This river basin is located nearKuala Lumpur, the capital city of Malaysia. Therefore, the study area has been rapidly developed,which makes it dependent on the Langat River for water supply [38]. The Langat River has an estimatedtotal catchment area of 1817 km2 and is located at latitude 2◦40′152” N to 3◦16′15” N and longitude101◦19′20” E to 102◦1′10” E [30], and the main river is 141 km in length. The Beranang River, SemenyihRiver, and Lui River are the main tributaries of the Langat River, as shown in Figure 3. There are tworeservoirs in the Langat River Basin, Hulu Langat and Semenyih. The northeastern part of the riverbasin has a reduced level (RL) of 960 m above the mean sea level and is mountainous. The temperatureof the area varies from 23.5 ◦C to 33.5 ◦C all year, and the comparative humidity ranges from 63% to95%, with an average of 81%. Heavier rainfall happens in the month of November, with a monthlyaverage rainfall of 270 mm; the average annual rainfall of the study area is about 2400 mm [38].The area also sometimes experiences rainstorms, and these usually occur in the early evening throughthe year, and are usually of a short duration with a high intensity.

Page 8: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 8 of 21Water 2018, 10, x FOR PEER REVIEW    8 of 21 

 

 

Figure 3. Langat River Basin. 

2.6.2. Data Collection 

Thirty years  (1986–2016) of  rainfall and water  level data of  stations were  collected  from  the 

Department of Irrigation and Drainage (DID), Malaysia. Table 3 gives details of the gauging stations, 

including the station name, station number, coordinates (latitude and longitude), data availability, 

and percentage of missing data. 

Table 3. Details on rainfall and water level gauging stations. 

Station  Station Name  Station No. Coordinates  Data Availability 

(Years) 

Missing 

Data (%) Latitude (N)  Longitude (E) 

1 Sg. Semenyih di 

Sg. Rincing WL 2918401  02°54′55″  101°49′25″ 

1986–2016 5.4% 

2  Ldg. Dominion  RF 3018107  03°00′13″  101°52′55″  6.5% 

For the simplicity of naming the stations, water level (WL) station Sg. Semenyih di Sg. Rincing 

(WL 2918401) will be referred as station 1, and rainfall station (RF) Ldg. Dominion (RF 3018107) as 

station 2. The missing rainfall data of station 2 is estimated using the normal ratio method from the 

observations of rainfall at some of the other stations, as close to and as evenly spaced around the 

station with the missing record as possible [39].   

2.6.3. Distribution of Rainfall and Water Level   

The mean and median values were estimated for 30 years of raw data, from October 1986  to 

September 2016, and are presented in Figure 4. The most likely time for drought to happen is when 

the rainfall is low. It can be seen that, for the distribution of the rainfall data for station 2, the highest 

mean and median were in November, at 369.4 mm and 317.0 mm, respectively. The lowest rainfall 

was  in  January, with a mean of 138.4 mm and median of 126.3 mm,  followed by  June,  July, and 

February. There are basically three different seasons in the Langat River Basin of Malaysia. The wet 

period of  the year  is  from October  through  to  the beginning of  January, and  the dry months are 

generally observed from January to March, and June to September. October and November are the 

wettest months, with an average rainfall of 321.5 mm and 369.4 mm, respectively.   

Figure 5 presents the water level data of station 1, and it can be seen that the water level steadily 

decreased for the second half of the duration of this study. 

Figure 3. Langat River Basin.

2.6.2. Data Collection

Thirty years (1986–2016) of rainfall and water level data of stations were collected from the Departmentof Irrigation and Drainage (DID), Malaysia. Table 3 gives details of the gauging stations, including thestation name, station number, coordinates (latitude and longitude), data availability, and percentage ofmissing data.

Table 3. Details on rainfall and water level gauging stations.

Station Station Name Station No.Coordinates Data Availability

(Years)MissingData (%)Latitude (N) Longitude (E)

1 Sg. Semenyih diSg. Rincing WL 2918401 02◦54′55” 101◦49′25”

1986–20165.4%

2 Ldg. Dominion RF 3018107 03◦00′13” 101◦52′55” 6.5%

For the simplicity of naming the stations, water level (WL) station Sg. Semenyih di Sg. Rincing(WL 2918401) will be referred as station 1, and rainfall station (RF) Ldg. Dominion (RF 3018107) asstation 2. The missing rainfall data of station 2 is estimated using the normal ratio method from theobservations of rainfall at some of the other stations, as close to and as evenly spaced around thestation with the missing record as possible [39].

2.6.3. Distribution of Rainfall and Water Level

The mean and median values were estimated for 30 years of raw data, from October 1986 toSeptember 2016, and are presented in Figure 4. The most likely time for drought to happen is whenthe rainfall is low. It can be seen that, for the distribution of the rainfall data for station 2, the highestmean and median were in November, at 369.4 mm and 317.0 mm, respectively. The lowest rainfall wasin January, with a mean of 138.4 mm and median of 126.3 mm, followed by June, July, and February.There are basically three different seasons in the Langat River Basin of Malaysia. The wet period of theyear is from October through to the beginning of January, and the dry months are generally observed

Page 9: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 9 of 21

from January to March, and June to September. October and November are the wettest months, withan average rainfall of 321.5 mm and 369.4 mm, respectively.

Figure 5 presents the water level data of station 1, and it can be seen that the water level steadilydecreased for the second half of the duration of this study.Water 2018, 10, x FOR PEER REVIEW    9 of 21 

 

 

Figure 4. Monthly rainfall distribution at station 2, estimated using data from 1986 to 2016. 

 

Figure 5. Water level data for 30 years at station 1 (1986–2016). 

3. Results and Discussion 

3.1. Assessment Using Standard Index of Annual Precipitation (SIAP) 

As illustrated in Figure 6, the highest SIAP value was 4.921 (October 1994) and the lowest value 

was  −1.591  (August 1990).  In 1988,  the drought period was 11 months;  followed by 1990, with a 

drought period of 10 months; and then 2015, with a nine month dry period. Figure 6 also shows that 

in 1988, there was a 10 month dry period from March to December. 

Figure 7 shows a categorization of the results for the five different classes of drought. It shows 

that 17% of the months were extremely wet, 7% were wet, 39% were normal, 17% had drought, and 

20% had very severe drought. Overall, drought happened during 37% of the total months, and wet 

periods  occurred  during  24%  of  the  total  months.  Table  4  shows  a  summary  of  the  drought 

classifications. 

0

50

100

150

200

250

300

350

400

Oct Nov Dec Jan Feb Mar Apr May Jun Jul Aug Sep

Rainfall (m

m)

Month

Mean Median

2020.420.821.221.622

22.422.823.223.624

24.424.8

Oct‐86

Nov‐87

Dec‐88

Jan‐90

Feb‐91

Mar‐92

Apr‐93

May‐94

Jun‐95

Jul‐96

Aug‐97

Sep‐98

Oct‐99

Nov‐00

Dec‐01

Jan‐03

Feb‐04

Mar‐05

Apr‐06

May‐07

Jun‐08

Jul‐09

Aug‐10

Sep‐11

Oct‐12

Nov‐13

Dec‐14

Jan‐16

Water Level (m)

Months

Figure 4. Monthly rainfall distribution at station 2, estimated using data from 1986 to 2016.

Water 2018, 10, x FOR PEER REVIEW    9 of 21 

 

 

Figure 4. Monthly rainfall distribution at station 2, estimated using data from 1986 to 2016. 

 

Figure 5. Water level data for 30 years at station 1 (1986–2016). 

3. Results and Discussion 

3.1. Assessment Using Standard Index of Annual Precipitation (SIAP) 

As illustrated in Figure 6, the highest SIAP value was 4.921 (October 1994) and the lowest value 

was  −1.591  (August 1990).  In 1988,  the drought period was 11 months;  followed by 1990, with a 

drought period of 10 months; and then 2015, with a nine month dry period. Figure 6 also shows that 

in 1988, there was a 10 month dry period from March to December. 

Figure 7 shows a categorization of the results for the five different classes of drought. It shows 

that 17% of the months were extremely wet, 7% were wet, 39% were normal, 17% had drought, and 

20% had very severe drought. Overall, drought happened during 37% of the total months, and wet 

periods  occurred  during  24%  of  the  total  months.  Table  4  shows  a  summary  of  the  drought 

classifications. 

0

50

100

150

200

250

300

350

400

Oct Nov Dec Jan Feb Mar Apr May Jun Jul Aug Sep

Rainfall (m

m)

Month

Mean Median

2020.420.821.221.622

22.422.823.223.624

24.424.8

Oct‐86

Nov‐87

Dec‐88

Jan‐90

Feb‐91

Mar‐92

Apr‐93

May‐94

Jun‐95

Jul‐96

Aug‐97

Sep‐98

Oct‐99

Nov‐00

Dec‐01

Jan‐03

Feb‐04

Mar‐05

Apr‐06

May‐07

Jun‐08

Jul‐09

Aug‐10

Sep‐11

Oct‐12

Nov‐13

Dec‐14

Jan‐16

Water Level (m)

Months

Figure 5. Water level data for 30 years at station 1 (1986–2016).

3. Results and Discussion

3.1. Assessment Using Standard Index of Annual Precipitation (SIAP)

As illustrated in Figure 6, the highest SIAP value was 4.921 (October 1994) and the lowest valuewas −1.591 (August 1990). In 1988, the drought period was 11 months; followed by 1990, with a droughtperiod of 10 months; and then 2015, with a nine month dry period. Figure 6 also shows that in 1988,there was a 10 month dry period from March to December.

Figure 7 shows a categorization of the results for the five different classes of drought. It shows that17% of the months were extremely wet, 7% were wet, 39% were normal, 17% had drought, and 20%had very severe drought. Overall, drought happened during 37% of the total months, and wet periodsoccurred during 24% of the total months. Table 4 shows a summary of the drought classifications.

Page 10: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 10 of 21

Water 2018, 10, x FOR PEER REVIEW    10 of 21 

 

 

Figure 6. Standard Index of Annual Precipitation (SIAP) values for 30 years at station 2. 

 

Figure 7. Distribution of SIAP values into classes (station 2). 

Table 4. Summary of drought classifications for station 2. 

Category  Number of Months  Percentage (%) 

Extremely wet  62  17 

Wet  25  7 

Normal  140  39 

Drought  61  17 

Very severe drought  72  20 

Total  360  100 

Artificial Neural Network (ANN) Model 

The  ANN  architecture  does  not  have  a  systematic  way  to  establish  suitable  architecture. 

Networks that are too small and simple can lead to underfitting, while networks that are too complex 

tend to overfit the training pattern [40]. Usually, nonlinear sigmoidal activation functions are used, 

as reported in the literature, which were also adopted in this study. The inputs to the ANN model 

were normalized and kept within  the  range of 0.1  to 0.9. Normalization or scaling  is not  really a 

functional  requirement  for  the NNs  to  learn, but  it  significantly helps  as  it  transposes  the  input 

variables into the data range that the sigmoid activation functions lie in (i.e., 0.1). The learning rate 

and momentum coefficient are influential parameters that control the convergence rate, but optimize 

them for the best output. Here, the two parameters were kept constant at 0.4 and 0.6, respectively, 

‐2‐1.5‐1

‐0.50

0.51

1.52

2.53

3.54

4.55

Oct‐86

Nov‐87

Dec‐88

Jan‐90

Feb‐91

Mar‐92

Apr‐93

May‐94

Jun‐95

Jul‐96

Aug‐97

Sep‐98

Oct‐99

Nov‐00

Dec‐01

Jan‐03

Feb‐04

Mar‐05

Apr‐06

May‐07

Jun‐08

Jul‐09

Aug‐10

Sep‐11

Oct‐12

Nov‐13

Dec‐14

Jan‐16

SIAP

Months

17%

7%

39%

17%

20%

Extremely wet Wet Normal Drought Very Severe Drought

Figure 6. Standard Index of Annual Precipitation (SIAP) values for 30 years at station 2.

Water 2018, 10, x FOR PEER REVIEW    10 of 21 

 

 

Figure 6. Standard Index of Annual Precipitation (SIAP) values for 30 years at station 2. 

 

Figure 7. Distribution of SIAP values into classes (station 2). 

Table 4. Summary of drought classifications for station 2. 

Category  Number of Months  Percentage (%) 

Extremely wet  62  17 

Wet  25  7 

Normal  140  39 

Drought  61  17 

Very severe drought  72  20 

Total  360  100 

Artificial Neural Network (ANN) Model 

The  ANN  architecture  does  not  have  a  systematic  way  to  establish  suitable  architecture. 

Networks that are too small and simple can lead to underfitting, while networks that are too complex 

tend to overfit the training pattern [40]. Usually, nonlinear sigmoidal activation functions are used, 

as reported in the literature, which were also adopted in this study. The inputs to the ANN model 

were normalized and kept within  the  range of 0.1  to 0.9. Normalization or scaling  is not  really a 

functional  requirement  for  the NNs  to  learn, but  it  significantly helps  as  it  transposes  the  input 

variables into the data range that the sigmoid activation functions lie in (i.e., 0.1). The learning rate 

and momentum coefficient are influential parameters that control the convergence rate, but optimize 

them for the best output. Here, the two parameters were kept constant at 0.4 and 0.6, respectively, 

‐2‐1.5‐1

‐0.50

0.51

1.52

2.53

3.54

4.55

Oct‐86

Nov‐87

Dec‐88

Jan‐90

Feb‐91

Mar‐92

Apr‐93

May‐94

Jun‐95

Jul‐96

Aug‐97

Sep‐98

Oct‐99

Nov‐00

Dec‐01

Jan‐03

Feb‐04

Mar‐05

Apr‐06

May‐07

Jun‐08

Jul‐09

Aug‐10

Sep‐11

Oct‐12

Nov‐13

Dec‐14

Jan‐16

SIAP

Months

17%

7%

39%

17%

20%

Extremely wet Wet Normal Drought Very Severe Drought

Figure 7. Distribution of SIAP values into classes (station 2).

Table 4. Summary of drought classifications for station 2.

Category Number of Months Percentage (%)

Extremely wet 62 17Wet 25 7

Normal 140 39Drought 61 17

Very severe drought 72 20Total 360 100

Artificial Neural Network (ANN) Model

The ANN architecture does not have a systematic way to establish suitable architecture. Networksthat are too small and simple can lead to underfitting, while networks that are too complex tend tooverfit the training pattern [40]. Usually, nonlinear sigmoidal activation functions are used, as reportedin the literature, which were also adopted in this study. The inputs to the ANN model were normalizedand kept within the range of 0.1 to 0.9. Normalization or scaling is not really a functional requirementfor the NNs to learn, but it significantly helps as it transposes the input variables into the data rangethat the sigmoid activation functions lie in (i.e., 0.1). The learning rate and momentum coefficient

Page 11: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 11 of 21

are influential parameters that control the convergence rate, but optimize them for the best output.Here, the two parameters were kept constant at 0.4 and 0.6, respectively, throughout the networkstructure for various numbers of hidden neurons. The network input models that were tested forthe forecasting were based on the SIAP at station 2, shown by Equations (11) and (12). The inputcombinations consisted of lagged data of the rainfall and drought index, and the output was kept asa single drought index variable.

SIAP (t) = f (Rt−1, Rt−2, Rt−3) Input model number 1 (11)

SIAP (t) = f (SIt−1, SIt−2, SIt−3) Input model number 2 (12)

where SIAP or SI is the drought index; R is the precipitation; n is the time lag, which is effectively thelead time of the forecast; and t is time in months. The input models based on the main parameter,rainfall, in calculating the index or a drought index itself as input, performed better in the forecastingusing ANN [41]. The same study also illustrated a lack of impact of the secondary parameters onthe performance of the networks. In the ANN model stated above, there are three classifications ofsamples, training, which was kept at 70% (252 samples); validation, 15% (54 samples); and testing, 15%(54 samples). In the majority of the cases, data division is carried out on an arbitrary basis. However,the way the data divided can have a significant effect on the model performance. Shahin et al. [42]investigated the issue of data division and its impact on the ANN model performance for a casestudy of predicting the settlement of shallow foundations on granular soils. The results indicatedthat the statistical properties of the data in the training, testing, and validation sets need to be takeninto account in order to ensure that the optimal model performance is achieved [42]. During training,it adjusts the network according to its final measured error. The validation process was used at theend of training as an extra check on the performance of the model. If the performance of the networkwas found to be consistently good on both the test and the validation samples, then it was reasonableto assume that the network would generalize well on unseen data. For testing, this does not affectthe training part, but it provides an independent measure of the network performance during andafter training. Each MLP was trained with 5 to 15 hidden neurons in a single hidden layer, as shown inTable 5, to select the most effective model by analyzing the performance. The three best-performingcombinations are shown for each input model.

Table 5. Correlation coefficient (R) of artificial neural network (ANN) network structure (SIAP).

Input Model Number Number of NeuronsR

Training Validation Testing Overall

1 10 0.907 0.865 0.908 0.8991 15 0.803 0.845 0.758 0.8001 12 0.796 0.765 0.801 0.7832 8 0.712 0.813 0.705 0.7412 9 0.737 0.799 0.782 0.7702 10 0.882 0.875 0.851 0.868

For the comparison between the output and target, it was found that for input model number 1,for training, validation, testing, and overall, the R values were 0.907, 0.865, 0.909, and 0.899, respectively.An R value of 1 means a close relationship, 0 means no relationship. So, this means that the relationshipbetween the two (output and target) are close and related, which is shown in Figure 8. The errors inthe training, validation, and testing stages are illustrated in Figure 9.

Figure 10 displays a section of time series from January 1987 to December 1989 of the SIAP observedvalues against the forecasted ones, by using input model number 1. The results effectively exemplify thehigh accuracy of the short-range forecasts of the droughts at station 2. Such studies may be a way toidentify the operational accuracy of forecasts, and have been used by others for similar purposes [43].

Page 12: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 12 of 21

The forecasted and actual index values were similar, so the model can be said to be reliable.Therefore, this ANN model can be used to predict short- to medium-term drought occurrences inMalaysia. In addition, SIAP is an effective index for the assessment of drought monitoring and thecharacteristics of drought conditions in the Langat River Basin. Authorities can render early warningsfor the timely implementation of preparedness based on predictions.

Water 2018, 10, x FOR PEER REVIEW    12 of 21 

 

characteristics of drought conditions in the Langat River Basin. Authorities can render early warnings 

for the timely implementation of preparedness based on predictions. 

 

Figure 8. Neural network training regression for input model 1. 

 

Figure 9. Error histogram of input model number 1. 

-3 -2 -1 0 1 2

-3

-2

-1

0

1

2

Target

Out

put ~

= 0.

81*T

arge

t + -0

.068

Training: R=0.90741

DataFitY = T

-2 -1 0 1 2

-2

-1

0

1

2

TargetO

utpu

t ~=

0.84

*Tar

get +

0.1

Validation: R=0.86503

DataFitY = T

-2 -1 0 1 2

-2

-1

0

1

2

Target

Out

put ~

= 0.

8*Ta

rget

+ -0

.088

Test: R=0.90857

DataFitY = T

-3 -2 -1 0 1 2

-3

-2

-1

0

1

2

Target

Out

put ~

= 0.

81*T

arge

t + -0

.046

All: R=0.89906

DataFitY = T

0

5

10

15

20

25

30

35

40

45

50

Error Histogram with 20 Bins

Inst

ance

s

Errors = Targets - Outputs

-1.1

59

-1.0

42

-0.9

262

-0.8

1

-0.6

938

-0.5

775

-0.4

613

-0.3

45

-0.2

288

-0.1

126

0.00

3662

0.11

99

0.23

61

0.35

24

0.46

86

0.58

48

0.70

11

0.81

73

0.93

36

1.05

TrainingValidationTestZero Error

Figure 8. Neural network training regression for input model 1.

Water 2018, 10, x FOR PEER REVIEW    12 of 21 

 

characteristics of drought conditions in the Langat River Basin. Authorities can render early warnings 

for the timely implementation of preparedness based on predictions. 

 

Figure 8. Neural network training regression for input model 1. 

 

Figure 9. Error histogram of input model number 1. 

-3 -2 -1 0 1 2

-3

-2

-1

0

1

2

Target

Out

put ~

= 0.

81*T

arge

t + -0

.068

Training: R=0.90741

DataFitY = T

-2 -1 0 1 2

-2

-1

0

1

2

TargetO

utpu

t ~=

0.84

*Tar

get +

0.1

Validation: R=0.86503

DataFitY = T

-2 -1 0 1 2

-2

-1

0

1

2

Target

Out

put ~

= 0.

8*Ta

rget

+ -0

.088

Test: R=0.90857

DataFitY = T

-3 -2 -1 0 1 2

-3

-2

-1

0

1

2

Target

Out

put ~

= 0.

81*T

arge

t + -0

.046

All: R=0.89906

DataFitY = T

0

5

10

15

20

25

30

35

40

45

50

Error Histogram with 20 Bins

Inst

ance

s

Errors = Targets - Outputs

-1.1

59

-1.0

42

-0.9

262

-0.8

1

-0.6

938

-0.5

775

-0.4

613

-0.3

45

-0.2

288

-0.1

126

0.00

3662

0.11

99

0.23

61

0.35

24

0.46

86

0.58

48

0.70

11

0.81

73

0.93

36

1.05

TrainingValidationTestZero Error

Figure 9. Error histogram of input model number 1.

Page 13: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 13 of 21

Water 2018, 10, x FOR PEER REVIEW    13 of 21 

 

 

Figure 10. Comparison of observed and forecasted SIAP at station 2 of input model number 1. 

3.2. Assessment Using SWSI for Hydrological Drought 

Figure 11 shows the time series created using SWSI for the 30 years of data at station 1. The data 

used for the analysis was the water level of the river. Initially, the values are seen to be classified as 

very wet or above, then, they slowly change to near normal. The trend of SWSI in Figure 11 is similar 

to the raw input water level data, seen in Figure 5, which shows that almost the first half of the whole 

period was very wet or above normal,  and  the  second half was below normal or had droughts. 

Climate change, rapid urbanization, environmental degradation, and  industrial development may 

have resulted in water and related resources within the basin becoming increasingly stressed. A study 

conducted  in Malaysia highlighted  that extreme dry conditions are becoming more  frequent  than 

extreme wet  conditions  [5]. With  reference  to Figure  5,  the  time  series  starts with one month of 

moderately wet conditions, which  follows 12 months of near‐normal conditions. From November 

1987  to May 1994  (months 14  to 92),  the  conditions are  classified as very wet,  extremely wet, or 

moderately wet. After this wet period, the values are observed to decrease gradually, from very wet 

conditions  to near normal conditions. From  June 1994  to November 2008  (months 93  to 266),  the 

conditions were near normal. However, there were few months that were moderately wet, very wet, 

or extremely wet. The first drought occurred in December 2008 (month 267), with an index value of 

−1.39. Drought started to occur more frequently from this point onward. Near‐normal conditions are 

observed from January 2009 (month 268) to February 2013 (month 317). However, within this period, 

the months with drought increased. From March 2013 (month 318) to September 2016 (month 360), 

all of the months experienced drought. The most frequent type of drought was moderate drought, 

followed by  extreme drought. The number of occurrences of  severe drought  is  less  than  that of 

moderate and extreme drought. 

Table 6 shows the number of months that each drought occurred, with percentages varying from 

2.50%  to 66.67%. The most observed condition within  the period of study was near normal. Near 

normal conditions occurred for 240 months, about 66.67%. Except for the near normal, all of the other 

conditions were below 11%. Moderate drought occurred  for 37 months  (10.28%). Moderately wet 

conditions occurred for a similar number of months (35 months; 9.72%). Very wet conditions were 

observed in 16 months (4.44%), followed by extremely wet conditions in 12 months (3.33%). Extreme 

drought occurred  in 11 months  (3.06%). The  least  frequent  condition was  severe drought, which 

occurred in 9 months (2.5%). 

‐2

‐1.5

‐1

‐0.5

0

0.5

1

1.5

2

2.5Jan‐87

Feb‐87

Mar‐87

Apr‐87

May‐87

Jun‐87

Jul‐87

Aug‐87

Sep‐87

Oct‐87

Nov‐87

Dec‐87

Jan‐88

Feb‐88

Mar‐88

Apr‐88

May‐88

Jun‐88

Jul‐88

Aug‐88

Sep‐88

Oct‐88

Nov‐88

Dec‐88

Jan‐89

Feb‐89

Mar‐89

Apr‐89

May‐89

Jun‐89

Jul‐89

Aug‐89

Sep‐89

Oct‐ 89

Nov‐89

Dec‐89

SIAP

Observed

Forcasted

Months

Figure 10. Comparison of observed and forecasted SIAP at station 2 of input model number 1.

3.2. Assessment Using SWSI for Hydrological Drought

Figure 11 shows the time series created using SWSI for the 30 years of data at station 1. The dataused for the analysis was the water level of the river. Initially, the values are seen to be classifiedas very wet or above, then, they slowly change to near normal. The trend of SWSI in Figure 11 issimilar to the raw input water level data, seen in Figure 5, which shows that almost the first half of thewhole period was very wet or above normal, and the second half was below normal or had droughts.Climate change, rapid urbanization, environmental degradation, and industrial development mayhave resulted in water and related resources within the basin becoming increasingly stressed. A studyconducted in Malaysia highlighted that extreme dry conditions are becoming more frequent thanextreme wet conditions [5]. With reference to Figure 5, the time series starts with one month ofmoderately wet conditions, which follows 12 months of near-normal conditions. From November 1987to May 1994 (months 14 to 92), the conditions are classified as very wet, extremely wet, or moderatelywet. After this wet period, the values are observed to decrease gradually, from very wet conditions tonear normal conditions. From June 1994 to November 2008 (months 93 to 266), the conditions werenear normal. However, there were few months that were moderately wet, very wet, or extremely wet.The first drought occurred in December 2008 (month 267), with an index value of −1.39. Droughtstarted to occur more frequently from this point onward. Near-normal conditions are observed fromJanuary 2009 (month 268) to February 2013 (month 317). However, within this period, the monthswith drought increased. From March 2013 (month 318) to September 2016 (month 360), all of themonths experienced drought. The most frequent type of drought was moderate drought, followedby extreme drought. The number of occurrences of severe drought is less than that of moderate andextreme drought.

Table 6 shows the number of months that each drought occurred, with percentages varying from2.50% to 66.67%. The most observed condition within the period of study was near normal. Nearnormal conditions occurred for 240 months, about 66.67%. Except for the near normal, all of the otherconditions were below 11%. Moderate drought occurred for 37 months (10.28%). Moderately wetconditions occurred for a similar number of months (35 months; 9.72%). Very wet conditions wereobserved in 16 months (4.44%), followed by extremely wet conditions in 12 months (3.33%). Extremedrought occurred in 11 months (3.06%). The least frequent condition was severe drought, whichoccurred in 9 months (2.5%).

Page 14: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 14 of 21

Water 2018, 10, x FOR PEER REVIEW    14 of 21 

 

 

Figure 11. Standardized Water Storage Index (SWSI) values for station 1 for 30 years (360 months). 

Table 6. SWSI drought percentage from station 1. 

Drought Classification  Condition  Number of Months  Percentage (%) 

Extremely wet  >2  12  3.33 

Very wet  1.5 to −2  16  4.44 

Moderately wet  1.0 to 1.5  35  9.72 

Near normal  −1.0 to 1.0  240  66.67 

Moderate drought  −1.5 to −1.0  37  10.28 

Severe drought  −2.0 to −1.5  9  2.50 

Extreme drought  <−2  11  3.06 

Artificial Neural Network Model for Hydrological Drought 

The network input models that were tested are based on SWSI at station 1, shown by Equations 

(13) and (14), as follows: 

SWSI (t) = f (Wt−1, Wt−2, Wt‐3)            Input model number 3  (13) 

SWSI (t) = f (SWt−1, SWt‐2, SWt‐3)      Input model number 4  (14) 

where SWSI or SW is the drought index, W is the water level, and n is the time lag, which is effectively 

the lead time of the forecasted SWSI model developed for station 1. Similar to the SIAP ANN model, 

in this case as well, each MLP was trained with 5 to 15 hidden neurons in a single hidden layer, as 

shown in Table 7, in order to select the most effective model by analyzing performance. The three 

best‐performing combinations are shown for each input model. 

Table 7. Correlation coefficient (R) of ANN network structure (SWSI). 

Input Model Number  Number of Neurons R 

Training  Validation  Testing  Overall 

3  10  0.968  0.967  0.969  0.968 

3  11  0.898  0.908  0.951  0.918 

3  7  0.767  0.801  0.822  0.796 

4  15  0.901  0.855  0.835  0.865 

4  10  0.911  0.899  0.853  0.888 

4  6  0.751  0.772  0.811  0.779 

‐2.5

‐2

‐1.5

‐1

‐0.5

0

0.5

1

1.5

2

2.5

Oct‐86

Nov‐87

Dec‐88

Jan‐90

Feb‐91

Mar‐92

Apr‐93

May‐94

Jun‐95

Jul‐96

Aug‐97

Sep‐98

Oct‐99

Nov‐00

Dec‐01

Jan‐03

Feb‐04

Mar‐05

Apr‐06

May‐07

Jun‐08

Jul‐09

Aug‐10

Sep‐11

Oct‐12

Nov‐13

Dec‐14

Jan‐16

SWSI

Months

Figure 11. Standardized Water Storage Index (SWSI) values for station 1 for 30 years (360 months).

Table 6. SWSI drought percentage from station 1.

Drought Classification Condition Number of Months Percentage (%)

Extremely wet >2 12 3.33Very wet 1.5 to −2 16 4.44

Moderately wet 1.0 to 1.5 35 9.72Near normal −1.0 to 1.0 240 66.67

Moderate drought −1.5 to −1.0 37 10.28Severe drought −2.0 to −1.5 9 2.50

Extreme drought <−2 11 3.06

Artificial Neural Network Model for Hydrological Drought

The network input models that were tested are based on SWSI at station 1, shown by Equations (13)and (14), as follows:

SWSI (t) = f (Wt−1, Wt−2, Wt−3) Input model number 3 (13)

SWSI (t) = f (SWt−1, SWt−2, SWt−3) Input model number 4 (14)

where SWSI or SW is the drought index, W is the water level, and n is the time lag, which is effectivelythe lead time of the forecasted SWSI model developed for station 1. Similar to the SIAP ANN model,in this case as well, each MLP was trained with 5 to 15 hidden neurons in a single hidden layer,as shown in Table 7, in order to select the most effective model by analyzing performance. The threebest-performing combinations are shown for each input model.

Table 7. Correlation coefficient (R) of ANN network structure (SWSI).

Input Model Number Number of NeuronsR

Training Validation Testing Overall

3 10 0.968 0.967 0.969 0.9683 11 0.898 0.908 0.951 0.9183 7 0.767 0.801 0.822 0.7964 15 0.901 0.855 0.835 0.8654 10 0.911 0.899 0.853 0.8884 6 0.751 0.772 0.811 0.779

Page 15: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 15 of 21

The output, which is the forecasted results, is plotted together with the observed results inFigure 12, using the SWSI input model (number 3). The dotted line shows the forecasted values andthe solid line shows the observed results, which were calculated by SWSI. In general, the two plots arenot very different. The forecasted values have only minor differences.

Water 2018, 10, x FOR PEER REVIEW    15 of 21 

 

The output, which is the forecasted results, is plotted together with the observed results in Figure 

12, using the SWSI input model (number 3). The dotted line shows the forecasted values and the solid 

line shows the observed results, which were calculated by SWSI. In general, the two plots are not 

very different. The forecasted values have only minor differences. 

 

Figure 12. SWSI observed and forecasted values (360 months) of station 1 for input model number 3. 

An error histogram for input model number 3 of SWSI was also plotted, and is shown in Figure 

13. The  error histogram  assists  in  authenticating  the performance  of  the network. The  blue part 

represents the training data and the green part represents the validation data. The biggest portion of 

data is surrounding the zero line. The zero line offers a way to confirm the outliers to determine if 

the data contains errors. It can also confirm that those data features are not like the leftovers of the 

dataset [44]. 

 

Figure 13. Error histogram for SWSI input model number 3. 

The R value shown in Figure 14 concludes the connection between the output or target values 

of the artificial neural network models. The R value is also known as the correlation coefficient. Strong 

-3.5-3

-2.5-2

-1.5-1

-0.50

0.51

1.52

2.53

0 30 60 90 120 150 180 210 240 270 300 330 360

SW

SI

Observed Forcasted

Months

Figure 12. SWSI observed and forecasted values (360 months) of station 1 for input model number 3.

An error histogram for input model number 3 of SWSI was also plotted, and is shown in Figure 13.The error histogram assists in authenticating the performance of the network. The blue part representsthe training data and the green part represents the validation data. The biggest portion of data issurrounding the zero line. The zero line offers a way to confirm the outliers to determine if the datacontains errors. It can also confirm that those data features are not like the leftovers of the dataset [44].

Water 2018, 10, x FOR PEER REVIEW    15 of 21 

 

The output, which is the forecasted results, is plotted together with the observed results in Figure 

12, using the SWSI input model (number 3). The dotted line shows the forecasted values and the solid 

line shows the observed results, which were calculated by SWSI. In general, the two plots are not 

very different. The forecasted values have only minor differences. 

 

Figure 12. SWSI observed and forecasted values (360 months) of station 1 for input model number 3. 

An error histogram for input model number 3 of SWSI was also plotted, and is shown in Figure 

13. The  error histogram  assists  in  authenticating  the performance  of  the network. The  blue part 

represents the training data and the green part represents the validation data. The biggest portion of 

data is surrounding the zero line. The zero line offers a way to confirm the outliers to determine if 

the data contains errors. It can also confirm that those data features are not like the leftovers of the 

dataset [44]. 

 

Figure 13. Error histogram for SWSI input model number 3. 

The R value shown in Figure 14 concludes the connection between the output or target values 

of the artificial neural network models. The R value is also known as the correlation coefficient. Strong 

-3.5-3

-2.5-2

-1.5-1

-0.50

0.51

1.52

2.53

0 30 60 90 120 150 180 210 240 270 300 330 360

SW

SI

Observed Forcasted

Months

Figure 13. Error histogram for SWSI input model number 3.

Page 16: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 16 of 21

The R value shown in Figure 14 concludes the connection between the output or target values ofthe artificial neural network models. The R value is also known as the correlation coefficient. Strongand random connections were identified when the R value was 1 and 0, respectively [45]. The linemust be at a 45◦ angle toward 1 to be a perfect fit. The 45◦ line means that the output value is equal tothe input target values. For SWSI, the R values for training, validation, testing, and overall, are thesame at 0.96. This indicates a strong correlation in the prediction of drought, based on the observedvalues and developed model [44].

A time series of the calculated indices plotted shows that drought is not increasing gradually,but occurs irregularly. The water level decreased and drought increased gradually every year. SWSIconsiders values between +1 and −1 as near normal, whereas other hydrological drought indices(e.g., SDI) consider all of the values below 0 as drought.

Water 2018, 10, x FOR PEER REVIEW    16 of 21 

 

and random connections were identified when the R value was 1 and 0, respectively [45]. The line 

must be at a 45° angle toward 1 to be a perfect fit. The 45° line means that the output value is equal 

to the input target values. For SWSI, the R values for training, validation, testing, and overall, are the 

same at 0.96. This indicates a strong correlation in the prediction of drought, based on the observed 

values and developed model [44]. 

A time series of the calculated indices plotted shows that drought is not increasing gradually, 

but occurs irregularly. The water level decreased and drought increased gradually every year. SWSI 

considers values between +1 and −1 as near normal, whereas other hydrological drought indices (e.g., 

SDI) consider all of the values below 0 as drought.   

 

Figure 14. Correlation coefficient for SWSI at station 1 for input model number 3. 

3.3. W‐ANN Model 

As seen in Table 8, there is a correlation between the DWT wavelet components D1, D2, D3, D4, 

D5, D6, D7, and D8 of the SIAP, SWSI, rainfall, and water level series, with the original series. It can 

be observed, in the case of SIAP, SWSI, and rainfall, that D1, D2, D3, and D8 show significantly higher 

correlations  than  the  average  of  correlations  among  them  compared  to  the  D4,  D5,  and  D6 

components. However,  for  the water  level, only D7 and D8  subseries  show higher  than  average 

correlations. According  to  the  correlation analysis,  the effective  components were  selected as  the 

dominant wavelet components, as stated above. Afterwards, the significant wavelet components and 

the approximation (A8) component were added to constitute the new series.   

   

Figure 14. Correlation coefficient for SWSI at station 1 for input model number 3.

3.3. W-ANN Model

As seen in Table 8, there is a correlation between the DWT wavelet components D1, D2, D3,D4, D5, D6, D7, and D8 of the SIAP, SWSI, rainfall, and water level series, with the original series.It can be observed, in the case of SIAP, SWSI, and rainfall, that D1, D2, D3, and D8 show significantlyhigher correlations than the average of correlations among them compared to the D4, D5, and D6components. However, for the water level, only D7 and D8 subseries show higher than averagecorrelations. According to the correlation analysis, the effective components were selected as thedominant wavelet components, as stated above. Afterwards, the significant wavelet components andthe approximation (A8) component were added to constitute the new series.

Page 17: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 17 of 21

Table 8. The correlation coefficient between each sub-time series and original drought indices/rawinput series.

Discrete WaveletComponents (db3)

Correlation between Detailed Sub-Time Series and ObservedDrought Index/Rainfall/Water Level Data

SIAP SWSI Rainfall Water Level * Dominant

D1 0.5291 0.1996 0.5291 0.1335√

D2 0.5732 0.2221 0.5732 0.1522√

D3 0.3645 0.1900 0.3645 0.1409√

D4 0.2130 0.1322 0.2130 0.1120 xD5 0.1616 0.1519 0.1616 0.1333 xD6 0.2576 0.1885 0.2576 0.0694 xD7 0.2015 0.0820 0.2015 0.3151 *D8 0.3201 0.3923 0.3201 0.4303 *

Average 0.3274 0.1948 0.3274 0.1858

* Only D7 and D8 subseries show higher than average correlations.√

indicates that the subseries is dominant;x indicates that the subseries is not dominant.

Secondly, the W-ANN models were developed for monthly drought prediction, using waveletsubseries. The most important part of this wavelet-based model is the selection of inputs for its formation.The summed wavelet components (the new series) instead of the original data were employed as inputsof the W-ANN model for drought prediction. Four different models based on combinations of differentinput data (SIAP, SWSI, rainfall, and water level) were evaluated. The forecasting performance of thewavelet–neural network models are presented in Table 9, in terms of RMSE and R. The table shows thatthe W-ANN model has a significant positive effect on the monthly drought forecast. As seen from thetable, model number 4, with three months of previous SWSI data, has the lowest RMSE and the highestcorrelation coefficients among all of the wavelet–neural network models. For meteorological droughtprediction, while the highest correlation coefficient (R) obtained by the ANN model is 0.899, with thewavelet-ANN model, this value increased to 0.940. Similarly, for the case of hydrological drought, whilethe R obtained by the ANN model is 0.968, with the wavelet-ANN model, this value increased to 0.973.The application of wavelet in the ANN model achieved higher correlation coefficients for all of themodels, except for input model number 3. In both types of drought forecasting, it was found that themodels based on preceding drought index values as inputs performed better than the models developedwith raw data, such as rainfall or water level as inputs. This proves that the created models can improvehydrologic and meteorological drought prediction close to the observed values.

Table 9. Root mean-square error (RMSE) and R statistics of different W-ANN models.

Input Model (After WaveletDecomposition) RMSE (Validation) R (Overall) Hidden Neurons

1 (Rt-1, Rt-2, Rt-3) 0.38 0.932 81 (Rt-1, Rt-2, Rt-3) 0.41 0.931 101 (Rt-1, Rt-2, Rt-3) 0.38 0.901 15

2 (SIt-1, SIt-2, SIt-3) 0.40 0.922 82 (SIt-1, SIt-2, SIt-3) 0.42 0.931 102 (SIt-1, SIt-2, SIt-3) 0.39 0.940 153 (Wt-1, Wt-2, Wt-3) 0.40 0.902 83 (Wt-1, Wt-2, Wt-3) 0.43 0.901 103 (Wt-1, Wt-2, Wt-3) 0.38 0.910 15

4 (SWt-1, SWt-2, SWt-3) 0.19 0.971 104 (SWt-1, SWt-2, SWt-3) 0.17 0.972 134 (SWt-1, SWt-2, SWt-3) 0.21 0.973 15

Page 18: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 18 of 21

Table 10 shows the performance improvement of the W-ANN models, and it can be seen thatthe models for meteorological drought forecasting improved by 3.67% and 8.29%; however, for thehydrological drought forecasting models, there was a decrease of R value by 5.99% for input modelnumber 3. Input model number 4 performed better than the other models that were considered in thisstudy, with a performance improvement of 9.57%.

Table 10. Performance improvement of R statistics of different W-ANN models.

Input Model R (With WaveletDecomposition)

R (Without WaveletDecomposition)

PerformanceImprovement (%)

1 (Rt-1, Rt-2, Rt-3) 0.932 0.899 +3.672 (SIt-1, SIt-2, SIt-3) 0.940 0.868 +8.293 (Wt-1, Wt-2, Wt-3) 0.910 0.968 −5.99

4 (SWt-1, SWt-2, SWt-3) 0.973 0.888 +9.57

Figure 15 shows a scatter plot using model number 4 (SWSI), and it shows that the W-ANN forecastsapproximate the general behavior of the observed data more satisfactorily for the drought months.

Figure 16 shows a scatter plot using model number 3 (SIAP), and it shows that the W-ANNforecasts do not linearly approximate the general behavior of the observed data, but the correlationcoefficient is 0.940.

Water 2018, 10, x FOR PEER REVIEW    18 of 21 

 

Table 10 shows the performance improvement of the W‐ANN models, and it can be seen that 

the models for meteorological drought forecasting improved by 3.67% and 8.29%; however, for the 

hydrological drought forecasting models, there was a decrease of R value by 5.99% for input model 

number 3. Input model number 4 performed better than the other models that were considered in 

this study, with a performance improvement of 9.57%. 

Table 10. Performance improvement of R statistics of different W‐ANN models. 

Input Model R (With Wavelet 

Decomposition) 

R (Without Wavelet 

Decomposition) 

Performance 

Improvement (%) 

1 (Rt‐1, Rt‐2, Rt‐3)  0.932  0.899  +3.67 

2 (SIt‐1, SIt‐2, SIt‐3)  0.940  0.868  +8.29 

3 (Wt‐1, Wt‐2, Wt‐3)  0.910  0.968  −5.99 

4 (SWt‐1, SWt‐2, SWt‐3)  0.973  0.888  +9.57 

Figure 15 shows a scatter plot using model number 4  (SWSI), and  it shows  that  the W‐ANN 

forecasts approximate the general behavior of the observed data more satisfactorily for the drought 

months. 

Figure 16 shows a scatter plot using model number 3  (SIAP), and  it shows  that  the W‐ANN 

forecasts do not linearly approximate the general behavior of the observed data, but the correlation 

coefficient is 0.940. 

 

Figure 15. Scatter plot comparing observed and forecasted hydrological drought using W‐ANN models. 

 

Figure 16. Scatter plot comparing observed and forecasted meteorological drought using W‐ANN models. 

y = ‐0.0082x + 1.4945R² = 0.7765

‐3‐2.5‐2

‐1.5‐1

‐0.50

0.51

1.52

2.53

0 24 48 72 96 120 144 168 192 216 240 264 288 312 336 360

SWSI

Months

ObservedForecastedLinear (Forecasted)

y = ‐0.0011x + 0.2287R² = 0.0145

‐3‐2.5‐2

‐1.5‐1

‐0.50

0.51

1.52

2.53

3.54

4.55

5.5

0 24 48 72 96 120 144 168 192 216 240 264 288 312 336 360

SIAP

Months

Observed

Forecasted

Linear (Forecasted)

Figure 15. Scatter plot comparing observed and forecasted hydrological drought using W-ANN models.

Water 2018, 10, x FOR PEER REVIEW    18 of 21 

 

Table 10 shows the performance improvement of the W‐ANN models, and it can be seen that 

the models for meteorological drought forecasting improved by 3.67% and 8.29%; however, for the 

hydrological drought forecasting models, there was a decrease of R value by 5.99% for input model 

number 3. Input model number 4 performed better than the other models that were considered in 

this study, with a performance improvement of 9.57%. 

Table 10. Performance improvement of R statistics of different W‐ANN models. 

Input Model R (With Wavelet 

Decomposition) 

R (Without Wavelet 

Decomposition) 

Performance 

Improvement (%) 

1 (Rt‐1, Rt‐2, Rt‐3)  0.932  0.899  +3.67 

2 (SIt‐1, SIt‐2, SIt‐3)  0.940  0.868  +8.29 

3 (Wt‐1, Wt‐2, Wt‐3)  0.910  0.968  −5.99 

4 (SWt‐1, SWt‐2, SWt‐3)  0.973  0.888  +9.57 

Figure 15 shows a scatter plot using model number 4  (SWSI), and  it shows  that  the W‐ANN 

forecasts approximate the general behavior of the observed data more satisfactorily for the drought 

months. 

Figure 16 shows a scatter plot using model number 3  (SIAP), and  it shows  that  the W‐ANN 

forecasts do not linearly approximate the general behavior of the observed data, but the correlation 

coefficient is 0.940. 

 

Figure 15. Scatter plot comparing observed and forecasted hydrological drought using W‐ANN models. 

 

Figure 16. Scatter plot comparing observed and forecasted meteorological drought using W‐ANN models. 

y = ‐0.0082x + 1.4945R² = 0.7765

‐3‐2.5‐2

‐1.5‐1

‐0.50

0.51

1.52

2.53

0 24 48 72 96 120 144 168 192 216 240 264 288 312 336 360

SWSI

Months

ObservedForecastedLinear (Forecasted)

y = ‐0.0011x + 0.2287R² = 0.0145

‐3‐2.5‐2

‐1.5‐1

‐0.50

0.51

1.52

2.53

3.54

4.55

5.5

0 24 48 72 96 120 144 168 192 216 240 264 288 312 336 360

SIAP

Months

Observed

Forecasted

Linear (Forecasted)

Figure 16. Scatter plot comparing observed and forecasted meteorological drought using W-ANN models.

Page 19: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 19 of 21

4. Conclusions

Drought occurrences in the Langat River catchment of peninsular Malaysia were characterizedusing meteorological and hydrological drought indices, SIAP and SWSI, respectively. Overall, SWSIand SIAP were found to be effective indices for the assessment of drought. The occurrence ofhydrological and meteorological droughts was found to be around 16% and 37% by SWSI and SIAP,respectively. Two neural network-based models and two wavelet-based ANN models were developedusing the values of SIAP and SWSI. For SWSI and SIAP, correlation coefficients of 0.96 and 0.90,respectively, were calculated. Therefore, it is concluded that both of the models are found to be reliable.However, with the W-ANN model, these values increased to 0.940 and 0.973 for meteorological andhydrological drought forecasting, respectively. This proves that the proposed models are able topredict hydrologic and meteorological drought very close to the observed values. This study can helpin the drought assessment and the prediction of drought occurrence in the study area. Authorities canissue an early warning for the timely implementation of preparedness, based on predictions.

Author Contributions: M.M.H.K. designed the research, carried out the analysis, and wrote the article. N.S.M.and A.E.-S. helped to conceive the research, review the overall results, and initiate the article. All of the authorscontributed substantially to the work reported.

Funding: Fundamental Research Grant Scheme (Grant No. FRGS/2/2014/TK02/UKM/03/2), provided by theMinistry of Education Malaysia

Acknowledgments: The authors would like to acknowledge support from the Ministry of Education (MOE)and the Universiti Kebangsaan Malaysia (UKM) for research grant FRGS/2/2014/TK02/UKM/03/2. They alsoacknowledge contributions from Adil Rassam Timimi and Ahmed Ali Jabir at the University of Nottingham,Malaysia, for help with the analysis. Appreciation is due to the Department of Irrigation and Drainage (DID) forthe data supplied to conduct this research study.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Basics, D.; Drought, T. Types of Drought. Available online: http://drought.unl.edu/DroughtBasics/TypesofDrought.aspx (accessed on 14 October 2017).

2. Deni, S.; Jemain, A.; Ibrahim, K. The spatial distribution of wet and dry spells over Peninsular Malaysia.Theor. Appl. Climatol. 2008, 94, 163–173. [CrossRef]

3. Jamalluddin, S.A.; Low, K.S. Droughts in Malaysia: A Look at Its Characteristics, Impacts, Related Policiesand Management Strategies. In Proceedings of the Water and Drainage 2003 Conference, Kuala Lumpur,Malaysia, 28–29 April 2003.

4. Staff, S.; Staff, S. Drought beyond India: Malaysia Faces a Massive Water Crisis as South East Asia Swelters.Available online: http://scroll.in/article/806812/drought-beyond-india-malaysia-faces-a-massive-water-crisis-as-south-east-asia-swelters (accessed on 25 November 2017).

5. Wan, W.Z.; Ibrahim, K.; Jemain, A.A. Evaluating dry conditions in Peninsular Malaysia using bivariatecopula. Anziam J. 2010, 51, 555–569.

6. Morid, S.; Smakhtin, V.; Moghaddasi, M. Comparison of seven meteorological indices for drought monitoringin Iran. Int. J. Climatol. 2006, 26, 971–985. [CrossRef]

7. Karavitis, C.A.; Alexandris, S.; Tsesmelis, D.E.; Athanasopoulos, G. Application of the StandardizedPrecipitation Index (SPI) in Greece. Water 2011, 3, 787–805. [CrossRef]

8. Gourabi, B.R. The Recognition of Drought with Dri and Siap Method and its Effects on Rice Yield and WaterSurface in Shaft, Gilan, south Western of Caspian Sea. Aust. J. Basic Appl. Sci. 2010, 4, 4374–4378.

9. Arvind Singh, T.; Bhawana, N.; Lokendra, S. Drought spells identification with indices for Almora district ofUttarakhand, India. Am. Int. J. Res. Sci. Technol. Eng. Math. 2015, 12, 1–5.

10. Farahmand, A.; AghaKouchak, A. A generalized framework for deriving nonparametric standardizeddrought indicators. Adv. Water Resour. 2015, 76, 140–145. [CrossRef]

11. Wang, W.; Wang, P.; Cui, W. A comparison of terrestrial water storage data and multiple hydrological data inthe Yangtze River basin. Adv. Water Sci. 2015, 26, 759–768.

Page 20: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 20 of 21

12. Nourani, V.; Komasi, M.; Mano, A. A Multivariate ANN-Wavelet Approach for Rainfall–Runoff Modeling.Water Resour. Manag. 2009, 23, 2877. [CrossRef]

13. Panagoulia, D. Hydrological modeling of a medium-sized mountainous catchment from incompletemeteoro-logical data. J. Hydrol. 1992, 137, 279–310. [CrossRef]

14. Panagoulia, D. Artificial neural networks and high and low flows in various climate regimes. Hydrol. Sci. J.2006, 51, 563–587. [CrossRef]

15. Govindaraju, R. Artificial neural networks in hydrology. II: hydrologic applications. J. Hydrol. Eng. 2000, 5,124–137.

16. Peng, T.; Zhou, J.; Zhang, C.; Fu, W. Streamflow Forecasting Using Empirical Wavelet Transform andArtificial Neural Networks. Water 2017, 9, 406. [CrossRef]

17. Zhou, J.; Peng, T.; Zhang, C.; Sun, N. Data Pre-Analysis and Ensemble of Various Artificial Neural Networksfor Monthly Streamflow Forecasting. Water 2018, 10, 628. [CrossRef]

18. Remesan, R.; Shamim, M.A.; Han, D.; Mathew, J. Runoff prediction using an integrated hybrid modellingscheme. J. Hydrol. 2009, 372, 48–60. [CrossRef]

19. Tayyab, M.; Zhoua, J.; Adnana, R.; Zenga, X.; Zenga, X. Application of Artificial Intelligence Method Coupledwith Discrete Wavelet Transform Method. Procedia. Comput. Sci. 2017, 107, 212–217. [CrossRef]

20. Zhou, T.; Wang, F.; Yang, Z. Comparative Analysis of ANN and SVM Models Combined with WaveletPreprocess for Groundwater Depth Prediction. Water 2017, 9, 781. [CrossRef]

21. Seo, Y.; Choi, Y.; Choi, J. River Stage Modeling by Combining Maximal Overlap Discrete Wavelet Transform,Support Vector Machines and Genetic Algorithm. Water 2017, 9, 525.

22. Wang, D.; Ding, J. Wavelet network model and its application to the prediction of hydrology. Nat. Sci. 2003,1, 67–71.

23. Kim, T.W.; Valdes, J.B. Nonlinear model for drought forecasting based on a conjunction of wavelet transformsand neural networks. J. Hydrol. Eng. 2003, 6, 319–328. [CrossRef]

24. Shabri, A. A hybrid model for stream flow forecasting using wavelet and least Squares support vectormachines. Jurnal Teknologi 2015, 73, 89–96. [CrossRef]

25. Belayneh, A.; Adamowski, J. Drought forecasting using new machine learning methods. J. Water Land Dev.2013, 18, 3–12. [CrossRef]

26. Khalili, A.; Bazrafshan, J. Assessing the efficiency of several drought indices in different climatic regions ofIran. Nivar J. 2003, 48, 79–93.

27. Najjar, S.; Rouhollah, R.Y. Studying & Comparing the Efficiency of 7 Meteorological Drought Indices inDroughts Risk Management (Case Study: North West Regions). Appl. Math. Eng. Manag. Technol. 2015, 3,131–142.

28. Haykin, S. Neural Networks: A Comprehensive Foundation, 2nd ed.; Prentice Hall: Upper Saddle River, NJ,USA, 1999.

29. Demuth, H.; Beale, M. Neural Network Toolbox: For Use with Matlab; The MathWorks, Inc.: Natick, MA,USA, 2005.

30. Bishop, C.M. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995; p. 482.31. El-Shafie, A.; Noureldin, A.; Taha, M.; Hussain, A. Dynamic versus static neural network model for rainfall

forecasting at Klang River Basin, Malaysia. Hydrol. Earth Syst. Sci. Discuss. 2011, 8, 6489–6532. [CrossRef]32. Dawson, C.W.; Abrahart, R.J.; See, L.M. HydroTest: A web-based toolbox of statistical measures for the

standardised assessment of hydrological forecasts. Environ. Modell. Softw. 2007, 27, 1034–1052. [CrossRef]33. Napolitano, G.; Serinaldi, F.; See, L. Impact of EMD decomposition and random initialisation of weights

in ANN hindcasting of daily stream flow series: An empirical examination. J. Hydrol. 2011, 406, 199–214.[CrossRef]

34. Barua, S.; Ng, A.; Perera, B. Artificial Neural Network—Based Drought Forecasting Using a NonlinearAggregated Drought Index. J. Hydrol. Eng. 2012, 17, 1408–1413. [CrossRef]

35. Kisi, O. Wavelet regression model as an altrnative to neural networks for river stage forecasting.Water Resour. Manag. 2011, 25, 579–600. [CrossRef]

36. Nourani, V.; Baghanam, A.H.; Adamowski, J.; Kisi, O. Applications of hybrid wavelet-Artificial Intelligencemodels in hydrology: A review. J. Hydrol. 2014, 514, 358–377. [CrossRef]

37. Mallat, S.G. A theory for multi decomposition signal decomposition: The wavelet representation. IEEE Trans.Pattern Anal. Mach. Intell. 1989, 11, 674–693. [CrossRef]

Page 21: Wavelet-ANN versus ANN-Based Model for Hydrometeorological …eprints.intimal.edu.my/1183/1/water-10-00998 (1).pdf · 2018. 10. 18. · Drought monitoring by indices in specific areas

Water 2018, 10, 998 21 of 21

38. Hasan, M.; Begum, M.; Al Mamun, A.; Haque Khan, Z. Selection of extreme drought event for Langat Basinand its consequence on salinity intrusion through Langat River System. IOSR J. Mech. Civ. Eng. 2014, 11,62–69. [CrossRef]

39. Singh, V.P. Elementary Hydrology; Prentice Hall of India: New Delhi, India, 1994.40. Dawson, C.W.; Abrahart, R.J.; Shamseldin, A.Y.; Wilby, R.C. Flood estimation at ungauged sites using

artificial neural networks. J. Hydrol. 2006, 319, 391–409. [CrossRef]41. Morid, S.; Smakhtin, V.; Bagherzadeh, K. Drought forecasting using artificial neural networks and time series

of drought indices. Int. J. Climatol. 2007, 27, 2103–2111. [CrossRef]42. Shahin, M.A.; Maier, H.R.; Jaksa, M.B. Data Division for Developing Neural Networks Applied to

Geotechnical Engineering. J. Comput. Civ. Eng. 2004, 18, 105–114. [CrossRef]43. Khadr, M. Forecasting of meteorological drought using Hidden Markov Model (case study: The upper Blue

Nile river basin, Ethiopia). Ain Shams Eng. J. 2016, 7, 47–56. [CrossRef]44. Claudio, G.; Joseph, Q.; Carmine, T.; Svetoslav, I.; Silviya, P. Public Transportation Energy Consumption

Prediction by means of Neural Network and Time Series Analysis Approaches. In Proceedings of the6th International Conference on Automotive and Transportation Systems, Slerno, Italy, 27–29 June 2015;pp. 64–70.

45. Amit, K.Y.; Hasmat, M.; Chandel, S.S. Selection of most relevant input parameters using WEKA for artificialNeural network based solar radiation prediction models. Renew. Sustain. Energy Rev. 2014, 31, 509–519.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open accessarticle distributed under the terms and conditions of the Creative Commons Attribution(CC BY) license (http://creativecommons.org/licenses/by/4.0/).