Neural and Fuzzy Neural Networks


  • 7/27/2019 Neural and Fuzzy Neural Networks


    2003 IEEE XIII Workshop on Neural Networks for Signal Processing

    NEURAL AND FUZZY NEURAL NETWORKS FOR NATURAL GAS CONSUMPTION PREDICTION

    Nguyen Hoang Viet

    Institute of Fundamental Technological Research, Polish Academy of Sciences, ul. Świętokrzyska 21, 00-049 Warsaw, Poland.

    e-mail: [email protected]

    Mańdziuk*

    Faculty of Mathematics and Information Science, Warsaw University of Technology, Pl. Politechniki 1, 00-661 Warsaw, Poland.

    e-mail: [email protected]

    Abstract. In this work several approaches to the prediction of natural gas consumption with neural and fuzzy neural systems for a certain region of Poland are analyzed and tested. Prediction strategies tested in the paper include: a single neural net module approach, a combination of three neural modules, a temperature clusterization based method, and an application of fuzzy neural networks. The results indicate the superiority of the temperature clusterization based method over modular and fuzzy neural approaches. One of the interesting issues observed in the paper is the relatively good performance of the tested methods in the case of long-term (four-week) prediction compared to mid-term (one-week) prediction. Generally, the results are significantly better than those obtained by the statistical methods currently used in the gas company under consideration.

    INTRODUCTION

    The paper discusses several neural and fuzzy neural approaches to the problem of prediction of natural gas consumption in a certain region of Poland. The area is mostly rural with several small cities, and therefore the consumers are mostly individual or belong to small industry (bakeries, restaurants, laundries, etc.).

    * The work was supported by the grant no. 503G 1120 0009 002 from the Warsaw University of Technology. Support from the Masovian Oil and Gas Company Ltd. is also gratefully acknowledged.

    0-7803-8178-5/03/$17.00 © 2003 IEEE


    Predicting the consumption of natural gas is an interesting, non-trivial and highly economically-motivated task [1, 2]. The conventional approach to solving it is based on applying statistical methods, which are usually efficient enough only if a large amount of past (historical) data is available. An alternative approach is to exploit soft-computing methods, especially neural and fuzzy neural network models.

    Neural networks are well known to be universal non-linear approximators [3, 4] and therefore are in general capable of close approximation of the prediction model without the need for its explicit (mathematical) formulation, in contrast to statistical approaches. Another advantage of neural networks over statistical methods in the application domain considered in this paper is their ability to cope with the volatility of the gas market in Poland in recent years. On one hand, several new consumers are attracted by the relatively low cost of this type of energy, but on the other hand some gas consumers (especially in the rural regions) switch to alternative, even cheaper energy sources, like wood or coal. This type of structural instability of natural gas consumers is very harmful for statistical approaches, which do not implement any adaptation mechanisms.

    NETWORK ARCHITECTURES

    Feed-forward networks

    Feed-forward networks applied here are two hidden layer architectures with sigmoidal units. Each prediction network takes the amount of daily gas consumption together with the daily average temperature measured in the given region as the inputs. Since the gas consumption is highly seasonal, the time factor coded in a special way is also included as part of the input.

    Let us denote by $G(t)$ the actual gas consumption on day $t$, by $\hat{G}(t)$ the prediction of gas consumption made on day $t$ and by $T(t)$ the average temperature on day $t$. It should be noted that all references to temperature values in this work concern the observable average temperatures in a given period in the past. No explicit temperature prediction was made, since this kind of prediction was virtually embedded in the gas load prediction network due to its specific architecture (see description below).

    The network architecture used for prediction is schematically presented in Fig. 1. The neurons in the input layer can be divided into three groups. The first $k$ neurons in the input layer represent the daily gas loads. The second group of $k$ neurons refers to the average daily temperatures from the $k$ previous days. The last input group, $\tau_1, \tau_2, \ldots$, denotes the time factor associated with the considered period (see description below for more details).

    The first hidden layer also consists of three groups of neurons. Neurons belonging to the first group are fully connected to those in the first group of the input layer (i.e. the gas units). Similarly, neurons in the second group are fully connected to those in the second input group (the temperature inputs).


    The neurons of the third group are connected to the rest of the inputs. The number of neurons in this group is fixed to two. All the first hidden layer neurons are fully connected to the neurons in the second hidden layer, which in turn are connected to the single neuron in the output layer.

    Figure 1: The architecture of the feed-forward network used for one-day prediction.

    Three types of prediction are considered in this work: one-day prediction (denoted by D-type), one-week prediction (denoted by W-type) and four-week prediction (denoted by 4W-type). The output $\hat{G}(t+1)$ of the network is equal to the predicted daily consumption for the next day in the case of D-type prediction, or to the average daily consumption for the following week or the following four weeks for W-type and 4W-type prediction, resp.

    The sizes of the input and the hidden layers were selected experimentally by some preliminary tests, depending on the prediction horizon. The gas loads and the average daily temperatures from the last three, five and seven days were taken as inputs to make D-type, W-type and 4W-type predictions, resp.
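The grouped connectivity described above can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the class name `GroupedNet`, the hidden-group sizes `h_gas`, `h_temp` and `h2`, the random initialisation and the sigmoidal output unit are all assumptions made for illustration. Only the three input groups, the two-neuron time group and the two hidden layers come from the text.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """Fully connected sigmoidal layer: one weight row per output neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random weights and biases (hypothetical initialisation range)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


class GroupedNet:
    """Two-hidden-layer net whose first hidden layer is split into three groups."""

    def __init__(self, k, n_time, h_gas, h_temp, h2, seed=0):
        rng = random.Random(seed)
        self.gas = init_layer(k, h_gas, rng)      # sees only the k gas loads
        self.temp = init_layer(k, h_temp, rng)    # sees only the k temperatures
        self.time = init_layer(n_time, 2, rng)    # two neurons, as stated in the text
        self.hidden2 = init_layer(h_gas + h_temp + 2, h2, rng)
        self.out = init_layer(h2, 1, rng)         # sigmoidal output is an assumption

    def forward(self, gas_loads, temperatures, time_inputs):
        # Each first-layer group is connected to its own input group only.
        h1 = (dense(gas_loads, *self.gas)
              + dense(temperatures, *self.temp)
              + dense(time_inputs, *self.time))
        h2 = dense(h1, *self.hidden2)
        return dense(h2, *self.out)[0]


# D-type example: k = 3 past days and three time inputs; hidden sizes are illustrative.
net = GroupedNet(k=3, n_time=3, h_gas=4, h_temp=4, h2=5)
prediction = net.forward([0.40, 0.50, 0.45], [-0.2, -0.1, 0.0], [0.1, 0.99, -1.0])
```

Training (Scaled Conjugate Gradient in the paper) is deliberately left out; the sketch only shows which inputs reach which hidden neurons.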

    Encoding of the last group of inputs representing the time factor requires special care. The main problem that occurs when applying a straightforward encoding scheme (either by the use of one input representing the day of the year or two inputs denoting the month of the year and the day of the month) is a harmful discontinuity in the New Year period. Therefore a different encoding approach is proposed here: denote by $t$ the day-of-the-year number of the first day of some $n$-day period ($n \ge 1$, $0 \le t \le 365$). The subsequent days in that period are numbered by $t+1, t+2, \ldots, t+n-1$. For such a period, the two following time encoding inputs are applied:

    \[ \tau_1 = \sin\frac{2\pi D_c}{366} \quad \text{and} \quad \tau_2 = \cos\frac{2\pi D_c}{366}, \tag{1} \]

    where $D_c$ is a real value indicating the center day of the period under consideration, i.e.:

    [...]    (2)

    It can be observed that the use of (1) allows smooth coding of the season of the year, which is especially important in the case of mid-term and long-term prediction. In the case of D-type prediction there is an additional time encoding input $\tau_3$, which indicates the type of the next day, where $\tau_3 = -1$ for working days and $\tau_3 = +1$ for non-working days¹. This input can be interpreted as defining the working and non-working day context of the networks.
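The cyclic time encoding can be sketched in a few lines. The function names are hypothetical, and the center day is assumed here to be the midpoint of the days t, ..., t+n-1, since the paper's own definition of the center day is not recoverable from this text.

```python
import math


def center_day(t, n):
    """Assumed definition: midpoint of the n-day period starting on day t."""
    return t + (n - 1) / 2.0


def time_inputs(t, n, next_day_is_working=None):
    """Return [tau1, tau2] and, for D-type prediction, also tau3."""
    angle = 2.0 * math.pi * center_day(t, n) / 366.0
    inputs = [math.sin(angle), math.cos(angle)]
    if next_day_is_working is not None:
        # tau3 = -1 for working days, +1 for non-working days (as in the text).
        inputs.append(-1.0 if next_day_is_working else 1.0)
    return inputs


# Day 365 and day 0 are neighbours on the circle, so their codes are close,
# which is the point of using sine and cosine instead of a raw day number.
last_day = time_inputs(365, 1)
first_day = time_inputs(0, 1)
gap = max(abs(a - b) for a, b in zip(last_day, first_day))
```

With a raw day-of-year input the same two days would differ by the full input range, which is the discontinuity the encoding is meant to avoid.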

    ¹ Actually, for the sake of simplicity, only weekend days are treated as non-working days. All holidays that occurred on weekdays are treated as working days.

    All feed-forward networks described above are trained by the Scaled Conjugate Gradient method [5], which has proven to be a fast and effective training method.

    Fuzzy neural networks (FNNs)

    Several FNN models have been developed and successfully used in various applications [6, 7]. The advantages of fuzzy over conventional neural networks are their high generalization abilities and the capability of dealing with imprecise data. In the gas load prediction problem the temperature input plays a significant role. The use of a fuzzy neural network for this task is mainly motivated by the fact that the average daily temperature is only an approximation of the exact temperature, which varies throughout the day.

    The FNN model. The fuzzy neural network implemented in this work is represented by a single hidden layer feed-forward architecture with $N$ input units, $K$ hidden units and $M$ output ones (Figure 2). Every connection between input and hidden units has a weight equal to 1. Each unit in the hidden layer represents a fuzzy set over $\mathbb{R}^N$. The output value of the $i$-th hidden neuron for a given input vector $X = [x_1, x_2, \ldots, x_N]^T$ can be interpreted as the degree of membership of $X$ to the fuzzy set represented by this neuron. The neurons in the hidden layer are fully connected to the neurons in the output layer.

    Figure 2: The architecture of fuzzy-neural network used in simulations.

    Assuming the Gaussian membership function of the fuzzy sets represented by the hidden neurons, the output of the $i$-th hidden neuron given $X$ as the [...]

    [...]    (3)

    where $\|\cdot\|$ is the Euclidean norm and [...]
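Since the exact form of Eq. (3) is not recoverable here, the following is only a sketch of a standard Gaussian membership function built on the Euclidean norm. The parameter names `center` and `sigma` and the factor 2 in the denominator are assumptions, not the paper's notation.

```python
import math


def gaussian_membership(x, center, sigma):
    """Degree of membership of x in a Gaussian fuzzy set (assumed standard form)."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))  # squared Euclidean norm
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def hidden_layer(x, centers, sigmas):
    """Outputs of all hidden neurons: one membership degree per fuzzy set."""
    return [gaussian_membership(x, c, s) for c, s in zip(centers, sigmas)]


# Two hypothetical fuzzy sets over R^2; the input coincides with the first center.
degrees = hidden_layer([0.0, 0.0], centers=[[0.0, 0.0], [1.0, 1.0]], sigmas=[0.5, 0.5])
```

Each value lies in (0, 1] and equals 1 only when the input coincides with the center, which matches the "degree of membership" reading given in the text.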


    DATA COLLECTION AND PREPROCESSING

    One of the problems in this particular prediction task is the lack of historical data. The available data set provided by the gas company ranged from Jan. 1, 2000 to Dec. 31, 2002. The set was initially divided into two sets: the first one, covering the years 2000 - 2001, composed the training set, and the second one, with the data for 2002, formed the test set. Furthermore, 10% of the training data was randomly chosen for validation.

    Each record of the data represents the daily cumulative load of natural gas provided by the telemetric system as well as the average daily temperature. Due to the relatively small amount of past data, the sliding window mechanism was used in order to artificially enlarge the data sets. Namely, for W and 4W-type predictions the target periods are overlapping, and subsequent target periods are defined as $[t+1, t+n]$ for the prediction made on day $t$, $[t+2, t+n+1]$ for the prediction made on day $t+1$, $[t+3, t+n+2]$ for the prediction made on day $t+2$, etc., where $n = 7$ or $n = 28$ for W-type and 4W-type predictions, resp.
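The sliding window construction can be sketched as follows. The function name and the choice of taking days t-k+1, ..., t as the k input days are assumptions; the overlapping target periods [t+1, t+n] follow the description above.

```python
def sliding_window_samples(loads, temps, k, n):
    """Build (gas_inputs, temp_inputs, target) samples with overlapping targets.

    For a prediction made on day index t, the inputs are the k days ending at t
    (assumed) and the target is the average load over days t+1, ..., t+n.
    """
    samples = []
    for t in range(k - 1, len(loads) - n):
        gas_in = loads[t - k + 1:t + 1]
        temp_in = temps[t - k + 1:t + 1]
        target = sum(loads[t + 1:t + n + 1]) / n
        samples.append((gas_in, temp_in, target))
    return samples


# Toy series of 10 days: k = 3 input days and a 2-day target period.
toy_loads = [float(d) for d in range(10)]
toy_temps = [0.0] * 10
toy_samples = sliding_window_samples(toy_loads, toy_temps, k=3, n=2)
```

Moving the window by one day at a time yields len(loads) - n - k + 1 samples instead of roughly len(loads) / n non-overlapping ones, which is how the mechanism enlarges a short history.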

    The data, before being input to the network, was scaled to some predefined range. The maximum and minimum temperatures in the training set, called $t_{max}$ and $t_{min}$, resp., were determined. Afterwards the temperature range $[t_{min}, t_{max}]$ was uniformly extended by 20% from each side into the range $[T_{min}, T_{max}]$. Temperature values were finally scaled to the interval $(-1, 1)$ using $T_{min}$ and $T_{max}$. The daily load values were scaled to $(0, 1)$ by dividing each value $G(t)$ by $G_{max}$, where $G_{max}$ is the extended (as in the case of temperature) maximum daily load over the training data set. The output (target) values in the training set, which are the average daily loads for a given one-day, one-week or four-week period, depending on the type of prediction, were scaled similarly, i.e. were divided by $G_{max}$.

    EXPERIMENTAL RESULTS

    For each prediction horizon the following experiments were performed: naive prediction, prediction using a single neural network module, prediction using a combination of three neural modules, prediction using three neural modules, each of which is devoted to a predefined temperature range, single neural network prediction performed for working days only (concerns D-type prediction only) and prediction based on a single fuzzy neural network module.

    The test data set (covering the whole year 2002) for W and 4W-type predictions was also artificially enlarged by using the sliding window mechanism, analogously to the case of the training data. In all tests, the same error measure, the Mean Absolute Percentage Error (MAPE), commonly used in prediction tasks [2, 9], was applied. The following experiments with various neural architectures were performed:

    Naive prediction (Naive). In this experiment no neural network is involved. The predicted average daily load in the period $[t+1, t+n]$ is simply assumed to be equal to the average daily load in the preceding period, i.e. $[t-n+1, t]$, where $n = 1, 7, 28$, resp. for D, W and 4W-type prediction.

    Prediction using a single neural network (SingleN). In this experiment, for each prediction horizon, 50 neural networks were trained and tested as single predictors.

    Prediction using a combination of three neural modules (3AvgN). Here all combinations of three different networks among the above 50 were tested. The final output value was equal to the average value of the three modules' outputs.

    Prediction using three neural modules, each of which is devoted to a specific temperature range (3TempN). In this experiment the training set $T$ was divided into three equipotent, overlapping subsets. Namely, for each training sample $p$, the average daily temperature $\bar{T}_p$ of the days covered by this sample was calculated. Let $t_1$, $t_2$ and $t_3$ be some temperature values where:

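    \[ t_1 < t_2 < t_3. \]

    Let:

    \[ L = \{p \in T : \bar{T}_p < t_2\}, \quad M = \{p \in T : t_1 \le \bar{T}_p < t_3\}, \quad H = \{p \in T : \bar{T}_p \ge t_2\}. \]

    The values $t_1$, $t_2$ and $t_3$ were chosen so that:

    \[ \mathrm{card}(L) = \mathrm{card}(M) = \mathrm{card}(H). \]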

    In other words, the training set was divided according to the temperature context. The subsets $L$, $M$ and $H$ can be regarded as containing sample data corresponding to the low, medium and high temperature². For each of these subsets, a set of 20 neural networks was independently trained. The context-based partition of the training data could facilitate the prediction task within each temperature range. In the test phase, all possible combinations of the three modules (one of each type) were tested (8000 combinations in total). Please note that due to some overlapping areas in the training sets, the effect of gluing the modules was achieved in a straightforward way (cf. [10, 11]). For a given test sample, the corresponding average input temperature was calculated and then, depending on this value, one or (usually) two appropriate networks were activated. The ultimate prediction was the average value of the outputs of the modules involved in prediction.

    ² Certainly, such notions are only subjective.


    TABLE 1: THE AVERAGE, THE MINIMUM AND THE MAXIMUM MAPE (IN PERCENT) AND ITS STANDARD DEVIATION FOR D-TYPE PREDICTIONS.

    TABLE 2: THE AVERAGE, THE MINIMUM AND THE MAXIMUM MAPE (IN PERCENT) AND ITS STANDARD DEVIATION FOR W-TYPE PREDICTIONS.

    Prediction for the working days only (WorkD). This experiment was analogous to the single network prediction except that it was made for the working days only. The experiment was performed for D-type prediction only.

    Prediction based on a fuzzy neural network module (FuzzyN). An ensemble of 50 FNNs was trained and tested analogously to the case of single neural modules.
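The temperature-context partition and module activation of the 3TempN experiment can be sketched as below. Picking the thresholds at the quartiles and the median is only one choice that yields equal cardinalities (it assumes distinct temperature values and a sample count divisible by four); the paper does not state how the thresholds were found, and all function names are hypothetical.

```python
def choose_thresholds(avg_temps):
    """Assumed choice: lower quartile, median and upper quartile of the samples."""
    s = sorted(avg_temps)
    n = len(s)
    return s[n // 4], s[n // 2], s[(3 * n) // 4]


def partition(avg_temps, t1, t2, t3):
    """Indices of the overlapping low, medium and high temperature subsets."""
    low = [i for i, v in enumerate(avg_temps) if v < t2]
    mid = [i for i, v in enumerate(avg_temps) if t1 <= v < t3]
    high = [i for i, v in enumerate(avg_temps) if v >= t2]
    return low, mid, high


def active_modules(temp, t1, t2, t3):
    """Modules whose temperature range contains the test sample's temperature."""
    names = []
    if temp < t2:
        names.append("L")
    if t1 <= temp < t3:
        names.append("M")
    if temp >= t2:
        names.append("H")
    return names


def combined_prediction(temp, outputs, t1, t2, t3):
    """Average the outputs of the one or two activated modules."""
    active = active_modules(temp, t1, t2, t3)
    return sum(outputs[name] for name in active) / len(active)


# Toy example with eight samples whose average temperatures are 0..7.
toy = [float(v) for v in range(8)]
t1, t2, t3 = choose_thresholds(toy)
low, mid, high = partition(toy, t1, t2, t3)
```

Because the ranges overlap, a sample between t1 and t3 always activates two modules, while only the coldest and warmest samples fall to a single one. This is consistent with the "one or (usually) two" networks mentioned in the text.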

    For each experiment mentioned above, the average, the minimum and the maximum value (in percent) as well as the standard deviation of MAPE over all tested networks (or all their combinations, where applicable) were calculated. The results are shown in Tables 1-3. Examples of W-type and 4W-type predictions are presented in Fig. 3.
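The error measure, the naive baseline and the per-experiment summary can be sketched as follows. The function names are hypothetical, MAPE is written in its usual form, and the use of the population standard deviation is an assumption, since the text does not say which variant was reported.

```python
import statistics


def mape(targets, predictions):
    """Mean Absolute Percentage Error, in percent."""
    errors = [abs((y - p) / y) for y, p in zip(targets, predictions)]
    return 100.0 * sum(errors) / len(errors)


def naive_prediction(loads, t, n):
    """Naive baseline: average load over the preceding period [t-n+1, t]."""
    return sum(loads[t - n + 1:t + 1]) / n


def summarize(mape_values):
    """Average, minimum, maximum and standard deviation over tested networks."""
    return {
        "avg": statistics.mean(mape_values),
        "min": min(mape_values),
        "max": max(mape_values),
        "std": statistics.pstdev(mape_values),  # population variant is an assumption
    }


# Toy numbers only; they are not results from the paper.
toy_error = mape([100.0, 200.0], [110.0, 180.0])
toy_summary = summarize([1.0, 2.0, 3.0])
```

In the paper's setting, `summarize` would be applied to the MAPE values of the 50 single networks, or of all module combinations, for a given prediction horizon.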

    CONCLUSIONS

    The main conclusions drawn from Tables 1-3 are as follows:

    Comparing the average MAPE values, the best performance is observed for the combination of three temperature context based modules. The next one is the combination of three non-temperature-based networks. These two outperform the single neural and the single fuzzy neural approaches.

    The efficacies of single fuzzy neural network and single neural network modules are comparable. It is important to note, however, that for each type of prediction, the size of a fuzzy neural network module was much smaller compared to the corresponding crisp neural module.

    In the case of D-type prediction an interesting issue is to compare the working days prediction versus all days prediction. The average results for working days are only slightly better than the respective ones for all days. The main reason for such a small improvement is the rural character of the area with very few industrial consumers. Therefore the consumption profile over the week is more stable than in highly industrialized regions.

    TABLE 3: THE AVERAGE, THE MINIMUM AND THE MAXIMUM MAPE (IN PERCENT) AND ITS STANDARD DEVIATION FOR 4W-TYPE PREDICTIONS.

    In summary, the results achieved by neural networks and fuzzy neural networks are encouraging and acceptable from the natural gas company's viewpoint. Statistical methods used so far by the company yield an average MAPE for monthly prediction in the period 01.2000 - 12.2001 equal to 12.86%³. In the numerical evaluation of the results it should be taken into account that this year's winter was unusually cold (cf. Fig. 3) and therefore making predictions for the period Nov.-Dec. was much more difficult in the year 2002 than in the two previous years.

    Another comparison can be fairly made with the naive prediction approach, which was clearly outperformed by neural methods in the case of mid-term and long-term predictions.

    The comparison of our results with the literature is ambiguous, since the data sets used in other works are different from ours and the consumer profiles may also be different. One example is [2], where a MAPE of 3.78% for D-type prediction involving temperature data is reported.

    In the paper, the proposed techniques were applied in the context involving individual and small industrial consumers. It will be interesting to verify the efficacy of these methods in the case of a different consumer profile. When considering highly inhabited and industrialized areas, more accurate and reliable prediction would be necessary. The techniques presented here could be exploited in cooperation with some other approaches like partially recurrent or RBF neural networks, as well as statistical methods.

    ³ Unfortunately, the more recent data is not available.


    Figure 3: Test results for W-type and 4W-type predictions in the period Jan.-Dec. 2002. The solid lines and the dotted lines represent the target and predicted values, resp.

    REFERENCES

    [1] R.H. Brown, L. Martin, P. Kharouf, L.P. Piessens, "Development of artificial neural-network models to predict daily gas consumption", A.G.A. Forecasting Review, vol. 5, pp. 1-22, 1996.

    [2] A. Khotanzad, H. Elragal, T-L. Lu, "Combination of artificial neural-network forecasters for prediction of natural gas consumption", IEEE Transactions on Neural Networks, vol. 11, no. 2, pp. 464-473, 2000.

    [3] A.N. Kolmogorov, "On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition", Dokl. Akad. Nauk ZSRR, vol. 114, pp. 953-956, 1957.

    [4] V. Kurkova, "Rates of approximation by neural networks", in P. Sincak, J. Vascak (eds.), Quo Vadis Computational Intelligence, Springer, Berlin, 2000, pp. 23-36.

    [5] M.F. Møller, "A scaled conjugate gradient algorithm for fast supervised learning", Neural Networks, vol. 6, pp. 525-533, 1993.

    [6] J.J. Buckley, Y. Hayashi, "Fuzzy neural networks: A survey", Fuzzy Sets and Systems, vol. 66, pp. 1-13, 1994.

    [7] M.M. Gupta, D.H. Rao, "On the principles of fuzzy neural networks", Fuzzy Sets and Systems, vol. 61, pp. 1-18, 1994.

    [8] M. Martinetz, S. Berkovich, K. Schulten, "Neural-gas network for vector quantization and its application to time series prediction", IEEE Transactions on Neural Networks, vol. 4, pp. 558-569, 1993.

    [9] G. Zhang, B.E. Patuwo, M.Y. Hu, "Forecasting with artificial neural networks: The state of the art", International Journal of Forecasting, vol. 14, pp. 35-62, 1998.

    [10] A.J.C. Sharkey, "Modularity, combining and artificial neural nets", Connection Science, vol. 9, no. 1, pp. 3-10, 1997.

    [11] A. Waibel, "Consonant recognition by modular construction of large phonemic time-delay neural networks", in D. Touretzky (ed.), Advances in NIPS 1, Morgan Kaufmann, 1989, pp. 215-223.
