ARMA-Stochastic Time Series Modeling
8/2/2019 ARMA-Stochastic Time Series Modeling
Contents
Abstract
Chapter 1 Critical Reviews:
1.1 Stochastic Time Series Modeling, Simulation & Prediction
1.2 Regression Analysis Time Series Modeling & Simulation
1.3 Chaotic Time Series without Rule Based Fuzzy Logic (FL),
Mackey Glass Simulation with FL and Prediction
1.4 Rule Based Fuzzy Logic Time Series Prediction,
Modeling and Simulation
1.5 Artificial Neural Network Time Series (ANNTS)
Modeling, Simulation & Prediction
1.6 Thesis Plan
1.1 Stochastic Time Series Modeling, Simulation & Prediction
A method of forecasting wind power output a few hours in advance, from a wind
power generator that is supplying a power and energy system, is required to ensure
efficient utilization of the power. Time series modeling of wind speed has been the
subject of many discussions because of the interest in wind as an alternative form of
energy. When the records of wind speed are incomplete or of too short a duration, or when
the handling and storage of large volumes of data are not desirable, a time series
model is needed. Since wind power is a function of wind speed, simulations of power
are generally derived from simulations of speed. Wind speed simulations can be done
with Monte Carlo methods that rely solely on the estimated parameters of the marginal
distribution of wind speeds.
The multiplicative ARMA (autoregressive moving average) models used to generate
hourly series of global radiation by Mora-Lopez and Sidrach-de-Cardona (1998), the
stochastic simulation using ARIMA (autoregressive integrated moving average)
modeling of solar irradiation by Craggs et al (1999) and the time dependent autoregressive
Gaussian model (TAG) for generating synthetic hourly radiation by Aguiar and
Collares-Pereira (1992) are important contributions from the modeling and simulation point of view.
Lalarukh and Jafri (1999) used an ARMA process on hourly global radiation data,
performed stochastic modeling through an MTM (Markov Transition Matrix) and
generated synthetic sequences of hourly global solar irradiation for Quetta, Pakistan.
They found the MTM approach relatively better as a simulator than ARMA
modeling. However, their use of an ARMA process to simulate and forecast hourly averaged
wind speed for Quetta, Pakistan also yielded good results, Lalarukh and Jafri (1997).
Several non-Gaussian distributions have been suggested as appropriate models for
wind speed. These models include the inverse Gaussian distribution Bardsley (1980), the
log-normal distribution Luna and Church (1974), the gamma distribution Sherlock
(1951), the Weibull distribution Hennessey (1977); Justus et al (1976); Stewart and
Essenwanger (1978) and Takle and Brown (1978), and the squared normal distribution
Carlin and Haslett (1982). We have seen from our previous studies Nasir et al (1991);
Raza and Jafri (1987) and Brown (1981) that the Weibull distribution fits the actual
wind speed frequencies quite well. However, the use of the inverse Gaussian distribution on
wind data Bardsley (1980) ignores the positive correlations between consecutive
observations of wind speed. Failure to take this autocorrelation into account leads to
underestimation of the variances of the time averages of wind speeds. Moreover, the long
runs of high and low wind speeds that are characteristic of such data do not occur
frequently enough in simulated data when wind speeds are assumed to be uncorrelated
over time.
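The underestimation mentioned above can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that the lag-k autocorrelation decays as rho**k (an AR(1)-type structure that is not stated in the text); the function name and the numbers are hypothetical.

```python
def variance_of_mean(sigma2, n, rho):
    """Variance of the mean of n consecutive observations from a stationary
    series with variance sigma2 and lag-k autocorrelation rho**k (assumed)."""
    # Standard result for a stationary series:
    # Var(mean) = (sigma2 / n) * [1 + 2 * sum_{k=1}^{n-1} (1 - k/n) * rho_k]
    inflation = 1.0 + 2.0 * sum((1.0 - k / n) * rho ** k for k in range(1, n))
    return sigma2 / n * inflation


if __name__ == "__main__":
    n = 24  # e.g. a daily average built from hourly values
    print("uncorrelated :", variance_of_mean(1.0, n, 0.0))
    print("rho = 0.8    :", variance_of_mean(1.0, n, 0.8))
```

With positive correlation the bracketed factor exceeds one, so treating the observations as independent gives a variance of the time average that is too small.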
To overcome this problem, Chou and Corotis (1981) and Goh and Nathan (1979)
attempted to incorporate autocorrelation into wind speed models, but they did not
consider the Gaussian shape of transformed wind speed distributions and its
corresponding statistics. Some studies have neglected the non-Gaussian shape of
the wind speed distribution. Brown et al (1984) suggested methods that take into account
the autocorrelated nature of wind speed, the diurnal non-stationarity and the non-Gaussian
shape of the wind speed distribution, so that hourly averaged wind speed can
be forecast. Brown et al (1982), in their previous study, also indicated the need for
standardization to remove diurnal non-stationarity. Diurnal variations in wind speed
occur as a natural phenomenon Jafri et al (1989) and, as mentioned by Kamal
and Jafri (1996), standardization corresponds to smoothing of a profile, such as that of a
Gaussian distribution obtained after transforming a non-Gaussian shape to an
approximately Gaussian shape, i.e., bringing scattered data points close to the profile.
We carried out this standardization procedure in the present study on hourly averaged
wind data of Quetta, Pakistan for a period of twenty years, i.e., 1985-2004, before using
the ARMA process.
Jafri (1996a) established that the hierarchical random process is a Markovian
random process, which can be characterized by a scaling probability distribution. A
generating function for such a process was obtained. These observations can be
successfully applied to chaotic time series Jafri (1996b) to overcome the non-stationarity
in the ARMA process, but this would require handy stochastic simulation techniques. Jafri
(1996b) suggested that the chaotic time series is deterministic in both Bayesian and
non-Bayesian statistics. Jafri (1995) developed a first order Markov transition matrix
(MTM) for the non-Gaussian wind speed of Quetta for 1985 and suggested a
Gaussian form of MTM sequence to yield HAWS (hourly averaged wind speed)
sequences. The same work was extended to wind speed data for a period of
twenty years, i.e., 1985-2004. The simulation of wind data using
MTM Jafri (2001) is relatively difficult compared to the simulation of solar radiation data
Lalarukh and Jafri (1999). The number of iterations exceeds a certain limit, which
causes the HAWS and DAWS (daily averaged wind speed) sequences to become
cumbersome and entangled. Jafri (1995; 2001) also found the autocorrelation coefficient for
wind data, which shows levels of persistence in wind speed frequencies and in wind
speed magnitudes when compared with diurnal variations over daily averaged wind speed
(DAWS) sequences.
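The first order Markov transition matrix referred to above can be sketched as follows. This is a minimal illustration, assuming that wind speeds are first binned into integer states of equal width; the bin width, function names and sample values are hypothetical and are not taken from the cited studies.

```python
def to_states(speeds, bin_width):
    """Map each wind speed to an integer state index using equal-width bins."""
    return [int(v // bin_width) for v in speeds]


def transition_matrix(states, n_states):
    """Estimate a first-order Markov transition matrix from a state sequence.

    Entry [i][j] is the relative frequency with which state i is followed
    by state j. Rows for states that never occur are left as zeros.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


if __name__ == "__main__":
    speeds = [0.4, 0.9, 1.6, 2.3, 1.1, 0.7, 1.8]  # made-up hourly values
    states = to_states(speeds, bin_width=1.0)
    for row in transition_matrix(states, n_states=3):
        print(row)
```

Synthetic sequences would then be generated by repeatedly sampling the next state from the row of the current state.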
Blanchard and Desrochers (1984) and Brown et al (1984) employed a class of
parametric time series models called autoregressive moving average (ARMA) processes
of Box and Jenkins (1976). Such processes have been employed to model many
meteorological time series Katz and Skaggs (1981). The model of Blanchard and
Desrochers (1984) takes into account high autocorrelation and allows a time series to be
generated which preserves all the main characteristics of the data; it does not require
any assumption about the wind speed distribution. In fact, a larger class of seasonal
models includes ARIMA models Blanchard and Desrochers (1984). Sfetsos (2002) studied
the linear ARIMA models and feed forward artificial neural networks (FFANN). He
selected the model order by minimizing the evaluation set error in
the ARIMA process. He suggested multi-step forecasting with subsequent
averaging to generate mean hourly predictions of wind data. The ARIMA models have
been critically analyzed by Jain and Lungu (2002). They considered both non-seasonal
and seasonal ARIMA models by using stochastic components. They also sought to
determine the persistence patterns, if any, of the stochastic components.
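To make the ARMA idea concrete, the following minimal sketch generates an ARMA(1,1) series from a given sequence of random shocks. It assumes the Box and Jenkins sign convention x_t = phi * x_{t-1} + a_t - theta * a_{t-1} and zero starting values; the parameter values are illustrative only and are not estimates from the wind data.

```python
import random


def simulate_arma11(phi, theta, shocks):
    """Generate an ARMA(1,1) series x_t = phi*x_{t-1} + a_t - theta*a_{t-1}.

    `shocks` is the sequence of innovations a_t; the series and the shock
    before the first step are assumed to be zero.
    """
    x_prev, a_prev = 0.0, 0.0
    series = []
    for a in shocks:
        x = phi * x_prev + a - theta * a_prev
        series.append(x)
        x_prev, a_prev = x, a
    return series


if __name__ == "__main__":
    rng = random.Random(0)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(10)]
    print(simulate_arma11(phi=0.8, theta=0.3, shocks=shocks))
```

Passing the shocks in explicitly keeps the recursion easy to check by hand; in practice they would be drawn from the fitted residual distribution.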
The model of Chou and Corotis (1981) is based on the Weibull distribution
and does not require stationarity in the data. McWilliams and Sprevak (1982a) described
a new version of an existing time series modeling procedure Box and Jenkins (1976)
from which the distributions of wind speeds and wind directions are obtained McWilliams
et al (1979) and McWilliams and Sprevak (1982b). Their model incorporates the diurnal
variations observed in wind speed in such a manner that the time series of wind speed
components remain stationary; the sample autocorrelation functions for the series have
identical stochastic behavior as far as the second order statistics are concerned, thus
reducing the problem to modeling a single Gaussian series. This model is corrected for
autocorrelation functions to account for diurnal variations. One point is
clear: they did not use a transformation of hourly averaged wind speed. Instead, they
considered the annual deterministic variations μ(t) and σ²(t), which are modeled by a harmonic
series representation to account for the diurnal variation of wind speed. In our
view, diurnal variation Jafri et al (1989) should be employed in model development
in a manner similar to McWilliams and Sprevak (1996b).
We followed the approach of Daniel & Chen (1991), which consists of first fitting
ARMA processes of various orders to hourly averaged wind speed (HAWS) data that
have been transformed to make their distribution approximately Gaussian and standardized
to remove the so-called diurnal non-stationarity. Although we do not favor the procedures of
transformation and standardization, we preferred this approach because the
model can use wind data of more than one year. The primary
advantage of including more than one year of data in the model development is the
increased reliability of the estimates of the model parameters.
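The transformation and standardization steps described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the cited authors: the power transform exponent (0.5 here) is a hypothetical choice, the series is assumed to start at hour 0, and every hour of the day is assumed to have a non-zero spread.

```python
from statistics import mean, pstdev


def standardize_diurnal(speeds, power=0.5, period=24):
    """Power-transform a series, then standardize it hour by hour.

    Each transformed value has the mean of its hour-of-day subtracted and is
    divided by the standard deviation of that hour, which removes the diurnal
    cycle in mean and spread. Returns the standardized series and the
    per-hour (mean, std) pairs needed to invert the step later.
    """
    transformed = [v ** power for v in speeds]
    stats = []
    for hour in range(period):
        values = transformed[hour::period]  # all observations at this hour
        stats.append((mean(values), pstdev(values)))
    standardized = [
        (v - stats[i % period][0]) / stats[i % period][1]
        for i, v in enumerate(transformed)
    ]
    return standardized, stats


if __name__ == "__main__":
    # Three made-up "days" with a 4-step cycle, to keep the example small.
    data = [1, 4, 9, 16, 4, 9, 16, 25, 9, 16, 25, 36]
    z, hourly = standardize_diurnal(data, power=0.5, period=4)
    print(z)
```

An ARMA process would then be fitted to the standardized series, and simulated values would be mapped back using the stored hourly means and standard deviations.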
We used MINITAB (version 11) for ARMA, non-seasonal ARIMA and seasonal
ARIMA modeling and simulation. ARIMA models are used to model a special class of
non-stationary series. Seasonal ARIMA (SARIMA) models are used to incorporate
cyclic components in models. In other words, ARIMA models are, in theory, the most
general class of (parsimonious) models for forecasting a time series which can be
made stationary by transformations such as differencing and logging. SARIMA has the same
structure as ARIMA. We used both non-seasonal and seasonal models on hourly
averaged wind data of 1985-2004. For non-seasonal ARIMA modeling and simulation,
the six options, i.e., random walk (ARIMA(0,1,0)), differenced first order autoregressive
model (ARIMA(1,1,0)), ARIMA(0,1,1) with constant, linear exponential smoothing (LES)
without constant (ARIMA(0,2,1) or (0,2,2)) and mixed ARIMA(1,1,1), were tried for each
month and for the four seasons. Non-seasonal ARIMA(0,1,1), which deals with exponential
growth and a constant, incorporates the simple exponential smoothing (SES) model. MA(1)
coefficients correspond to 1 − α in the SES formula. The term α is called the training
parameter. For LES without constant, the MA(1) coefficient corresponds to 2(1 − α).
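The correspondence between SES and ARIMA(0,1,1) without constant can be checked numerically. The sketch below assumes the usual forms of the two recursions, F(t+1) = α y(t) + (1 − α) F(t) for SES and F(t+1) = y(t) − θ e(t) with e(t) = y(t) − F(t) for ARIMA(0,1,1); the data values and the starting forecast are made up for illustration.

```python
def ses_forecasts(y, alpha, f0):
    """One-step-ahead SES forecasts: F(t+1) = alpha*y(t) + (1 - alpha)*F(t)."""
    forecasts = [f0]
    for obs in y:
        forecasts.append(alpha * obs + (1.0 - alpha) * forecasts[-1])
    return forecasts


def arima011_forecasts(y, theta, f0):
    """One-step-ahead ARIMA(0,1,1) forecasts without constant:
    F(t+1) = y(t) - theta*e(t), where e(t) = y(t) - F(t)."""
    forecasts = [f0]
    for obs in y:
        error = obs - forecasts[-1]
        forecasts.append(obs - theta * error)
    return forecasts


if __name__ == "__main__":
    y = [5.0, 6.0, 4.0, 7.0, 6.5]  # made-up observations
    alpha = 0.4772
    print(ses_forecasts(y, alpha, f0=5.0))
    print(arima011_forecasts(y, theta=1.0 - alpha, f0=5.0))
```

Both calls print the same forecasts, because y(t) − θ[y(t) − F(t)] = (1 − θ) y(t) + θ F(t), which is the SES formula with θ = 1 − α.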
For seasonal ARIMA (SARIMA) modeling and simulation, the seven
options, i.e., SARIMA(0,1,1)x(0,1,1)_12, SARIMA(0,0,0)x(0,1,0)_12 with constant,
SARIMA(0,1,0)x(0,1,0)_12, SARIMA(1,0,1)x(0,1,1)_12 with constant, SARIMA following
SES with α = 0.4772 and Brown's SARIMA (LES) with α = 0.2106, were tried for each
month only. The most commonly used SARIMA model is SARIMA(0,1,1)x(0,1,1)_12, which
strictly follows seasonal exponential smoothing. SARIMA(0,1,0)x(0,1,0)_12 is also
known as the seasonal random trend (SRT) model. The alternative to the SRT model is the seasonal
random walk model, i.e., SARIMA(1,0,0)x(0,1,0)_12. There is, of course, a difference
between seasonal and simple exponential models. The quantity 1 − α is used in the
exponential smoothing formulas. The best option is selected as the one with the
minimum chi-squared value at the 5% significance level.
1.2 Regression Analysis Time Series Modeling & Simulation
Regression is strictly correlation analysis, carried out with and
sometimes without time series. The modern interpretations and fundamental concepts of
regression analysis are thoroughly presented by Gujarati (1988), Siegel (1997), Rawlings
(1988) and Newton (1988). All kinds of regression analysis can be accomplished by the
least squares regression technique, which minimizes the discrepancy between the data points
and the fit Chapra and Canale (1990). It comprises linear regression (LR), polynomial
regression (PR), multiple linear regression (MLR), general linear least squares (GLLS)
and non-linear regression (NLR). For NLR, the least squares technique is used. The Gauss-Seidel
technique cannot be employed because the normal equations are not diagonally
dominant. NLR analysis is sometimes very useful for fitting, but it also requires minimization
of the sum of the squares of the residuals (SSR). This analysis is carried out only on a
single independent variable; therefore, multiple parameters which are interrelated with
each other, as in MLR, cannot be studied. However, NLR analysis has an advantage
over PR because it exploits iteration. For NLR analysis, the Gauss-Newton method has
some shortcomings such as slow convergence, wide oscillations, i.e., changing directions,
and sometimes divergence Draper and Smith (1981). These discrepancies were overcome
by other methods such as the steepest descent and the Levenberg-Marquardt techniques
Trabea and Shaltout (2000). However, PR can be applied in some cases, especially when the data are
distributed like a parabola or a cubic polynomial, because it depends
on a single variable, such as PRATS in our case. Trabea and Shaltout (2000) studied
the correlation of global solar radiation with meteorological parameters such as mean daily
maximum temperature, mean daily relative humidity, mean daily sea level pressure, mean
daily vapour pressure and hours of bright sunshine, by using MLR analysis. The
correlation, the regression coefficient and the standard error were estimated, but they did
not consider the interdependence of the meteorological parameters. Rapti (2000)
developed a mathematical correlation of atmospheric turbidity with specific humidity and
of diffuse radiation with atmospheric turbidity for maritime and for continental air
masses. This study does not include any statistical correlations.
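The least squares principle that underlies all of the regression variants above can be shown with the simplest case, a straight line fitted to one independent variable. The sketch below is illustrative only; the function name and data are hypothetical, and it returns the sum of the squares of the residuals (SSR) that the fit minimizes.

```python
def fit_line(x, y):
    """Least-squares straight line y = intercept + slope * x.

    Returns (intercept, slope, ssr), where ssr is the sum of the squares
    of the residuals for the fitted line.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, ssr


if __name__ == "__main__":
    x = [0.0, 1.0, 2.0, 3.0, 4.0]
    y = [1.1, 2.9, 5.2, 6.8, 9.1]  # made-up, roughly y = 1 + 2x
    print(fit_line(x, y))
```

Polynomial and multiple linear regression follow the same principle with more columns in the design matrix, while non-linear regression needs an iterative scheme such as Gauss-Newton.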
Ilyas and Nasir (2000) developed a relationship between humidity and
temperature and found a Gaussian trend. The best fit to the experimental data, as suggested
by them, is as follows:
[Equation (1), giving H_th in terms of H_o, T_o and k, is garbled in the source and could not be recovered.]
where H_th is the theoretical humidity, H_o and T_o are the experimental values of humidity
and temperature, respectively, and k is a constant for the fit. Hussain, Jafri and Kamal [10]
used regression modeling of weather data and found PRATS relatively better than PR.
Ilyas (2000) found an inverse Gaussian relationship for the percentage cumulative frequency
of sunshine hours and solar energy, i.e.,
[Equations (2) and (3), giving f_cum(%) in terms of k, E and E_th and defining k and x, are garbled in the source and could not be recovered.]
In Eq. (2), the symbols E, E_th, x and n represent the solar energy, the threshold solar energy,
the square of the ratio of the solar energy to its threshold value and the total number of data points,
respectively.
The overall behavior of humidity with temperature and of solar energy with the
cumulative frequency of sunshine hours shows a reversal, i.e., the former is Gaussian and
the latter is inverse Gaussian. We tried to establish the best fit to our diverse data by using
regression analysis. Kamal and Jafri (1999) developed stochastic modeling and generated
synthetic sequences of hourly global solar irradiation. They also found the Markov
transition matrix (MTM) approach relatively better as a simulator than the
autoregressive moving average (ARMA) process. Time series models to simulate
and forecast hourly averaged wind speed (HAWS) were presented by Kamal and Jafri
(1997). They also simulated the Weibull distribution of HAWS Kamal and Jafri
(1996). Using the triangulation method and statistical correlation from regression
equations, solar radiation was estimated at locations where there was no observatory,
and the estimates were found to be very reliable Raza and Kamal (2002). Jafri recently performed fuzzy
logic time series (FLTS) prediction modeling on HAWS (2007). Regression modeling, despite
its many shortcomings, is a better predictor. Fuzzy
regression analysis is defined as a model which includes the fuzziness (uncertainty) in
itself Tanaka and Ishibuchi (1992). Ozawa et al (1997) used the fuzzy autoregressive
(AR) model to describe fuzzy time series, which cannot be dealt
with by stochastic models. Fuzzy time series analysis was proposed by Watada (1992).
1.3 Chaotic Time series without Rule Based Fuzzy logic (FL), Mackey
Glass Simulation with FL and Prediction
The original fuzzy logic (FL) pioneered by Lotfi Zadeh (1965) has been around
for forty years, and yet it is unable to handle uncertainties. Zadeh introduced the concept
of a fuzzy set, a set whose boundary is not sharp or precise. This concept contrasts with
the classical concept of a set, recently called a crisp set, whose boundary is required to be
precise. Probability and fuzzy sets describe different kinds of uncertainty. Probability
is based on the theory of sets. It deals with the likelihood of relevant events or with the expectation
of a future event based on something now known (the outcome of a random event), while the
fuzziness is not the uncertainty of expectation. Fuzzy set theory, on the other hand, is not
concerned with events; it is concerned with concepts. A rule based fuzzy logic system
(FLS) is a powerful design methodology to minimize the effect of uncertainty Mendel
(2001). Model free designs include artificial neural networks (ANN) and fuzzy logic (FL). The
fuzzy logic (FL) rules are extracted from numerical data and are then combined with
linguistic knowledge. The richness of fuzzy logic is that there are enormous numbers of
possibilities that lead to many non-linear mappings of an input data vector into a scalar
output. In model free approaches, the associated model is a representation of an architecture
to solve a specific problem. With a model based approach in fuzzy logic, one can seek the
truth or a close approximation to it. FLSs employ 500 rules for the one pass (OP) and
sixteen rules for the back propagation (BP) steepest descent methods of design, respectively.
We followed a model free approach, i.e., fuzzy logic on hourly wind speed data, to
predict future values, i.e., consequents from antecedents (past values). A single stage
forecast for the chaotic time series of wind data will be used.
1.4 Rule based Fuzzy Logic Time series Prediction, Modeling and
Simulation
Rule based fuzzy logic systems (FLSs), a powerful design methodology, minimize
the effect of uncertainty Mendel (2001). The two most popular FLSs used by engineers
today are the Mamdani and Takagi-Sugeno-Kang (TSK) systems. Both are characterized
by IF-THEN rules and have the same antecedent structures. They differ in the structure of
the consequents. The consequent of a Mamdani rule is a fuzzy set, whereas the
consequent of a TSK rule is a function. Type-1 TSK FLSs have been widely used in
control and other applications Terano et al (1994). The output of a type-1 TSK forecaster
is obtained without a defuzzification step. Liang and Mendel (1999; 2000) developed type-2
TSK FLSs. The FLS forecasters comprise singleton type-1 (with virtually no
uncertainties), non-singleton type-1 (with uncertainties), singleton type-2, type-1
non-singleton type-2, type-2 non-singleton type-2, type-1 TSK and type-2 TSK Mendel
(2001). The rule based fuzzy logic systems (FLSs), both type-1 and type-2, handle
uncertainties because modeling and minimization of uncertainties can be accomplished.
If all uncertainties disappear, type-2 FL reduces to type-1 FL, in much the same manner
that if randomness disappears, probability reduces to determinism. For basic singleton
type-1 FLSs, we assume that there are no uncertainties; all fuzzy sets are of type-1, and
measurements are perfect and treated as crisp values, i.e., as singletons. In a non-singleton
FLS, by contrast, the inputs are not treated as crisp values, i.e., uncertainties are inherently present. A FLS
that is described completely in terms of type-1 fuzzy sets is called a type-1 FLS. Type-1
FLSs are unable to handle rule uncertainties directly, because they use type-1 fuzzy sets
that are certain. Therefore, a better way to handle uncertainties is to use a type-2 FLS.
However, a non-singleton type-1 FLS is a type-1 FLS whose inputs are modeled as type-1
fuzzy numbers; hence, it can be used to handle uncertainties. Moreover, type-1 FL, in
its applications, establishes rule based systems as a powerful design methodology.
The rules of a non-singleton type-1 FLS are the same as those of a singleton
type-1 FLS Mendel (2001). The difference lies in the fuzzifier, which treats the inputs as
type-1 fuzzy sets, and in the effect of this on the inference block. The output of the
inference block will again be a type-1 fuzzy set. The type-1 FLS, both singleton and
non-singleton, is shown in Fig. 1. The defuzzifiers described for a singleton
type-1 FLS therefore apply as well to a non-singleton type-1 FLS Mendel (2001).
We know that non-stationarity (randomness) exists inherently in our wind data
Jafri (2005); Kamal and Jafri (1996); therefore, uncertainties or randomness cannot be
reduced. They can be handled properly with a non-singleton type-1 FLS, and so there
appears to be no reason to use a type-2 FLS.
We recently performed fuzzy logic (FL) time series prediction modeling on
hourly averaged wind speed (HAWS) data of 1985-2004 for Quetta, Pakistan, and used
Mackey-Glass simulation. We shall use the same wind data results with the
application of rule based type-1 FLSs. We used the MATLAB M-files available at
URL:http://sipi.usc.edu/~mendle/software. The M-files are provided in three folders:
type-1 FLSs, general type-2 FLSs and interval type-2 FLSs. In this study we used the
following type-1 FLSs:
- Singleton Mamdani type-1 FLS
sfls_type1.m: computes the output(s) of a singleton
type-1 FLS when the antecedent membership functions are Gaussian
train_sfls_type1.m: tunes the parameters of a singleton type-1 FLS when the
antecedent membership functions are Gaussian, using some input-output training data
- Non-singleton Mamdani type-1 FLS
nsfls_type1.m: computes the output(s) of a non-singleton type-1 FLS when the
antecedent membership functions are Gaussian and the input sets are Gaussian
train_nsfls_type1.m: tunes the parameters of a non-singleton type-1 FLS when the
antecedent membership functions are Gaussian, using some input-output training data.
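The kind of computation performed for a singleton type-1 FLS with Gaussian antecedent membership functions can be sketched as follows. This is not the M-file code: it is a minimal Python illustration that assumes product t-norm and height defuzzification, and all names and numbers in it are hypothetical.

```python
import math


def gaussian_mf(x, mean, sigma):
    """Gaussian membership grade of x for a set centered at mean."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def singleton_fls_output(x, means, sigmas, heights):
    """Output of a singleton type-1 FLS (assumed product t-norm,
    height defuzzifier).

    x       : list of crisp inputs, one per antecedent
    means   : means[l][i] is the center of antecedent i in rule l
    sigmas  : sigmas[l][i] is the spread of antecedent i in rule l
    heights : heights[l] is the consequent center of rule l
    """
    numerator = 0.0
    denominator = 0.0
    for rule_means, rule_sigmas, height in zip(means, sigmas, heights):
        firing = 1.0
        for xi, m, s in zip(x, rule_means, rule_sigmas):
            firing *= gaussian_mf(xi, m, s)  # product t-norm over antecedents
        numerator += height * firing
        denominator += firing
    return numerator / denominator


if __name__ == "__main__":
    # Two rules, one antecedent: input halfway between the two centers.
    print(singleton_fls_output([0.0], [[-1.0], [1.0]], [[1.0], [1.0]], [0.0, 1.0]))
```

Tuning then amounts to adjusting the entries of means, sigmas and heights so that the output matches the training targets as closely as possible.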
We omit extraneous material on the development and historical background of
rule-based FLSs because we are concerned only with the use of FLSs in time series. An
exhaustive literature survey and critical review of rule-based FLSs is available in the
book by J. M. Mendel (2001). However, we shall discuss the fundamental rules
extracted from the data under consideration. The rules in fuzzy logic time series are
usually extracted while designing the FLSs. Prior to 1992, all FLSs reported in the open
literature fixed the parameters, such as the type of fuzzification, composition,
implication, t-norm (operators for fuzzy intersection), defuzzification (which produces a crisp
output) and membership functions, arbitrarily, e.g., the locations and spreads of the
membership functions were chosen by the designer independently of the numerical training
data. Then, at the first IEEE conference on fuzzy systems, held in San Diego in 1992,
three different groups of researchers, i.e., Horikawa et al (1992), Jang (1992) and Wang
and Mendel (1992), presented the same idea: tune the parameters of a FLS using the
numerical training data. Since that time, quite a few adaptive training procedures have
been published. Because tuning of free parameters had been done in feed forward neural
networks (FFNN) long before it was done in a FLS, a tuned FLS has also come to be
known as a neural fuzzy system. Designing a FLS Mendel and Mouzouris (1997) can be
viewed as approximating a function or fitting a complex surface in a multidimensional
space. Given a set of input-output pairs, tuning is essentially equivalent to determining a
system that provides an optimal fit to the input-output pairs with respect to a cost function
(tuning algorithm). Utilizing concepts from real analysis, Mouzouris and Mendel have
proven that a non-singleton FLS can uniformly approximate any continuous function on a
compact set. Although the proof of approximation Mendel and Mouzouris (1997)
provides some insight, it does not tell us how to choose the parameters of the non-
singleton FLS, nor does it tell us how many basis functions will be needed to achieve
such performance. The latter are accomplished through design. The design of FLSs
draws on one-pass (OP), least squares, back-propagation (steepest descent, BP), SVD-QR
(a matrix tool in numerical linear algebra used in signal processing and in
extracting, reducing and modeling fuzzy rules) and iterative
design methods.
The forecasting of time series with rule-based FLS designs employs
only two methods, i.e., the one pass (OP) and back propagation (BP) methods.
The OP design constructs 500 rules for the antecedent and consequent membership
functions. We set the value of the standard deviation equal to 0.1 for all Gaussians in the
pre-defined OP design. However, the OP design is exhaustive compared to the BP design of FLSs.
In contrast, the BP design constructs only 16 rules for the antecedent and consequent
membership functions. The initial values of the standard deviations of the Gaussian
membership functions are all set equal to 0.5240 in the pre-defined BP design. The BP
design is, in many respects, better than the OP design, Mendel (2001). The predefined values of
all four antecedent membership functions and of the centers of the consequent
membership functions (ȳ^l, the height defuzzifier) for each of the corresponding 16 rules in the BP
design are supplied to the FLS in the form of an input matrix. We take the height
defuzzifier (ȳ^l, the centers of the consequent membership functions) to be a random
number from the interval (0,1). After training with the BP design, the FLS forecaster
was fixed. We use a learning parameter of 0.2 in the BP design.
With tractable learning laws, we set the learning parameters. Alpha-stable statistics model
impulsiveness as a parameterized family of probability density functions. Additive
fuzzy systems can filter impulsive noise from signals. With α < 2 one gets impulsive
noise, and the noise has infinite variance. The alpha in these statistics is an exponent parameter.
With α = 2, we get the classical Gaussian case, i.e., exponential tails and finite variance.
The predefined initial mean (center) values of the antecedent membership functions,
along with the height defuzzifiers (mean values of the consequent membership functions) and
the standard deviations of the Gaussian antecedent membership functions, in the form of a
matrix, as shown in tables 1 and 2, are used for determining the values of the singleton
consequent membership functions, i.e., f_s(s^(k)), for 600 hourly training wind data and 120
or 144 testing wind data, respectively.
The predefined final mean (center) values of the antecedent membership functions,
along with the height defuzzifiers (mean values of the consequent membership functions)
and the standard deviations of the Gaussian antecedent membership functions, in the
form of a matrix, after six epochs of training, as shown in tables 2 and 3, are used for
determining the values of the non-singleton consequent membership functions, i.e., f_ns(s^(k)), for
600 hourly training data and 120 or 144 testing data, respectively. In both cases, the 600
training wind data and 120 or 144 testing data for all four antecedent membership
functions are used as an input matrix, X, in sfls_type1.m and nsfls_type1.m, respectively.
For the training as well as the testing data, we calculated the predicted values Jafri (2005);
Jafri et al (2005). It is difficult to reproduce all predicted values and the values of the
consequent membership functions for singleton and non-singleton type-1 FLSs in this
manuscript. Therefore, we will compare the root mean square errors, i.e., RMSE_s (BP) with
RMSE_ns (BP), only for the testing data.
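\[
\mathrm{RMSE}_{s}(\mathrm{BP}) = \sqrt{\frac{1}{120}\sum_{k=600}^{719}\left[s(k+1) - f_{s}\left(\mathbf{x}^{(k)}\right)\right]^{2}} \tag{4}
\]
\[
\mathrm{RMSE}_{ns}(\mathrm{BP}) = \sqrt{\frac{1}{120}\sum_{k=600}^{719}\left[s(k+1) - f_{ns}\left(\mathbf{x}^{(k)}\right)\right]^{2}}
\]
where
\[
\mathbf{x}^{(k)} = [\,x(t-18),\; x(t-12),\; x(t-6),\; x(t)\,]^{T} \tag{5}
\]
\[
s(k+1) = x(t+6)
\]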
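The input vectors and targets defined above, and the RMSE used to compare forecasters, can be assembled as in the following sketch. It assumes a plain list of hourly values; the function names are hypothetical, and the lag of 6 and the four inputs follow the definitions of x(k) and s(k+1) given above.

```python
import math


def make_pairs(series, lag=6, n_inputs=4):
    """Build (input vector, target) pairs from a series.

    Each input is [x(t-18), x(t-12), x(t-6), x(t)] for the default settings,
    and the target is x(t+6).
    """
    pairs = []
    first_t = lag * (n_inputs - 1)  # earliest t with a full input vector
    for t in range(first_t, len(series) - lag):
        inputs = [series[t - lag * j] for j in range(n_inputs - 1, -1, -1)]
        pairs.append((inputs, series[t + lag]))
    return pairs


def rmse(targets, predictions):
    """Root mean square error between two equally long sequences."""
    squared = [(a - b) ** 2 for a, b in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    series = list(range(30))  # stand-in for hourly wind speeds
    pairs = make_pairs(series)
    print(pairs[0])
    print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

In the setting described here, the first 600 pairs would be used for training and the following 120 or 144 pairs for testing.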
It is worth mentioning that the training pairs are obtained together with the testing data; therefore,
the analysis of the testing data will be the same as for the training data. We input the predefined initial
mean values of all antecedent membership functions (table 1) in the case of a singleton type-1
FLS because we assume that there are no uncertainties in the data. However, we cannot
totally ignore the noisy measurement environment; therefore, we tested our final FLS
forecasters on noisy testing data, i.e.,
x(k) = s(k) + n(k) -------------------(6)
where n(k) is 0 dB (decibel) uniformly distributed noise.
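One way to generate such noisy data is sketched below. It assumes that the decibel level refers to the signal-to-noise ratio defined as 10 log10 of the ratio of signal variance to noise variance, so that 0 dB means equal variances; this definition, the zero-mean uniform noise and the function names are assumptions made for illustration.

```python
import math
import random


def noise_half_width(signal, snr_db):
    """Half-width a of zero-mean uniform noise on [-a, a] giving the
    requested SNR, taken here as signal variance over noise variance."""
    n = len(signal)
    mean_value = sum(signal) / n
    signal_var = sum((v - mean_value) ** 2 for v in signal) / n
    noise_var = signal_var / (10.0 ** (snr_db / 10.0))
    return math.sqrt(3.0 * noise_var)  # uniform on [-a, a] has variance a^2 / 3


def add_uniform_noise(signal, snr_db=0.0, seed=0):
    """Return x(k) = s(k) + n(k) with uniformly distributed n(k)."""
    rng = random.Random(seed)
    a = noise_half_width(signal, snr_db)
    return [s + rng.uniform(-a, a) for s in signal]


if __name__ == "__main__":
    clean = [0.0, 1.0] * 5  # made-up signal
    print(add_uniform_noise(clean, snr_db=0.0, seed=1))
```

Repeating the call with different seeds gives independent realizations for a Monte Carlo study.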
We accomplished this task for a Monte Carlo set of 60 realizations. After each
epoch we used the testing data to see how each FLS performed by computing RMSE_s (BP) and
RMSE_ns (BP), respectively, using equation (4). This entire process was repeated 60
times using 60 independent sets of mean and standard deviation of 720 or 744 hourly
averaged wind data. The predefined RMSE_s (BP) values, Mendel (2001), for each of the six
epochs of tuning are:
RMSE_s (BP) = {0.0548, 0.0431, 0.0322, 0.0261, 0.0237, 0.0232} -------(4)
The non-singleton FLS shares most of the same parameters as the singleton FLS,
so we shall use the partially dependent BP design approach. In the BP design we use only
two fuzzy sets for each of the four antecedents, so that there are only 16 rules. Each rule
is characterized by eight antecedent membership function parameters (the mean and
standard deviation for each of the four Gaussian membership functions) and one
consequent parameter, ȳ. More specifically, we initially chose the means of each
antecedent's two Gaussian membership functions as m_x − 2σ_x and m_x + 2σ_x,
respectively, and the standard deviations of these membership functions as 2σ_x.
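The initial rule base described above can be enumerated as in the following sketch, which assumes that m_x and σ_x are the mean and standard deviation of the training data; the function name, the dictionary layout and the sample data are hypothetical.

```python
from itertools import product
from statistics import mean, pstdev


def initial_rules(training, n_antecedents=4):
    """Enumerate the initial rules of the BP design.

    Each antecedent has two Gaussian sets centered at m_x - 2*sigma_x and
    m_x + 2*sigma_x, each with spread 2*sigma_x, so four antecedents give
    2**4 = 16 rules.
    """
    m_x = mean(training)
    sigma_x = pstdev(training)
    centers = (m_x - 2.0 * sigma_x, m_x + 2.0 * sigma_x)
    rules = []
    for combination in product(centers, repeat=n_antecedents):
        rules.append({
            "means": list(combination),
            "sigmas": [2.0 * sigma_x] * n_antecedents,
        })
    return rules


if __name__ == "__main__":
    rules = initial_rules([1.0, 2.0, 3.0, 4.0, 5.0])  # made-up training data
    print(len(rules))
    print(rules[0])
```

Each rule would additionally carry one consequent parameter, which is then tuned together with the means and spreads during training.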
For the non-singleton type-1 FLS, we modeled each of the four noisy input
measurements using a Gaussian membership function. Two choices are possible: (1) use
a different standard deviation for each of the four input measurement membership
functions, or (2) use the same standard deviation for each of the four input measurement
membership functions. We tried both approaches and got similar results because the
additive noise n(k) is stationary. The predefined average values and standard deviations
Mendel (2001) of RMSE_s (BP) and RMSE_ns (BP) are shown in Fig. 2 for each of the 6
epochs.
1.5 Artificial Neural Network Time Series (ANNTS) Modeling,
Simulation & Prediction
The McCulloch-Pitts neuron is the earliest artificial neuron, described with fixed
weights, a threshold activation function and a fixed discrete (non zero) time step for the
transmission of a signal from one neuron to the next McCulloch and Pitts (1943). A
processing unit is termed a neuron or node. An artificial neural network (ANN) is an
information processing paradigm that is inspired by the way a biological nervous system, such
as the brain, processes information. A biological neuron has three types of
components that are of particular interest in understanding an artificial neuron: its
dendrites, soma and axon. The dendrites receive signals from neighboring neurons. The
signals are electric impulses that are transmitted across a synaptic gap by means of a
chemical process. The synapse is a connection between neurons where their membranes
almost touch and signals are transmitted from one to the other by chemical
neurotransmitters. The soma, or cell body, sums the incoming signals, fires when
sufficient input is received and transmits a signal over its axon to other cells. The axon is
a long fiber over which a biological neuron transmits its output signals to other neurons.
Neural networks are computer algorithms that process information in a manner modeled
on the nervous system. They learn from the past to predict the
future and offer solutions when explicit algorithms and modules are unavailable or too
cumbersome. Representative data is gathered and training algorithms
are invoked to automatically learn the structure of the data. There are many types of network,
ranging from simple Boolean networks (perceptrons) to complex self-organizing
networks (Kohonen networks) and networks modeling thermodynamic properties
(Boltzmann machines) Haykins (1994). There are nearly as many training methods as
there are network types, but some of the more popular ones include back propagation, the
delta rule and Kohonen learning. A standard network architecture consists of several
layers of neurons.
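A McCulloch-Pitts style unit, with fixed weights and a threshold activation, can be sketched as follows. The function name, the weights and the threshold are illustrative choices and are not taken from the cited work.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Return 1 (fire) when the weighted input sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# With unit weights and threshold 2 the neuron behaves as a logical AND.
and_gate = [mcculloch_pitts((a, b), (1, 1), 2) for a in (0, 1) for b in (0, 1)]
print(and_gate)  # [0, 0, 0, 1]
```

Lowering the threshold to 1 turns the same unit into a logical OR. The behavior is set entirely by the fixed weights and threshold, with no learning involved.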
An ANN is configured for a specific application, such as pattern recognition or
data classification, through a learning process. Learning in biological systems involves
adjustments to the synaptic connections that exist between the neurons. This is true of
ANNs as well. We shall emphasize only ANN simulations, which appear to be a
recent development. The discipline itself was established before the advent of
computers. Many important advances in ANNs were reported during the five decades since their
introduction in 1943, but the field also went through periods of frustration among researchers
Fausset (1994). Recently,
neural networks (NNs) have enjoyed a resurgence of interest and have begun to emerge as an
entirely novel approach for the modeling of complex and non-linear phenomena Hertz et
al (1991); Bishop (1995); Candill and Butler (1993); Whitley (1995); Connor et al (1994);
Dorffner (1996); Ababarnal et al (1993); Gershenfeld and Weigend (1993); Fahlman and
Lebiere (1990); Kanter et al (1995); Eisentein et al (1995); Bengio et al (1995); Fessant
et al (1995); Ruiz-Suarez et al (1995) and Boznar et al (1993). Neural networks are
particularly useful when problems are driven by data rather than by concept or theory. To
date NNs have yielded many successful applications in areas as diverse as finance,
medicine, engineering, geology and physics; indeed, anywhere that there are problems of
prediction or classification, neural networks are being introduced. ANN models have been
applied to problems involving runoff forecasting and weather prediction Kang et al
(1993). ANNs have also been applied to groundwater reclamation problems Ranjethan and
Eheart (1993), predicting average air temperature Cook and Wolfe (1991), predicting
precipitation Kalogirou et al (1998) and forecasting of price increments Castiglione
(2002). There has been intensive research on NNs Engel and Broeck (2000); Kinzel
(1999); Gardner and Dorling (1998); Kulkarni et al (1997); Edwards et al (1997); Geva
(1998); Giles et al (2001); Khotanzad et al; Biehl and Caticha (2001); Schroder and
Kinzel (1998); EinDor and Kanter (1998); Priel and Kanter (2000); Al-Hujazi and
Al-Nashash (1996); Hertz and Krogh (1991); Andreas et al (1994) and Azoff (1994).
Prediction of time series is an important application of NNs. Since 1995, time series
prediction by NNs has been studied extensively, Kalogirou et al (1998); Castiglione
(2002); Engel and Broeck (2000); Kinzel (1999); Gardner and Dorling (1998); Kulkarni et al
(1997); Edwards et al (1997); Geva (1998); Giles et al (2001); Khotanzad and
Abaye (1997); Biehl and Caticha (2001); Schroder and Kinzel (1998); EinDor and
Kanter (1998); Priel and Kanter (2000); Al-Hujazi and Al-Nashash (1996); Hertz et al
(1991); Andreas et al (1994); Azoff (1994); Gately (1996); Refenes et al (1997);
Mohandes et al (1998); Zhang et al (1998) and Hill et al (1996). Detecting trends and
patterns in financial data is of great interest to the business world to support the decision
making process through time series forecasting, i.e., with neural networks Lin et al
(1995). Generally, wind speed is a highly non-linear phenomenon Kamal and Jafri
(1996a) and Kamal and Jafri (1997). ANNs have recently been used successfully in the
prediction of wind speed/energy Mohandes et al (1998); Kariniotakis et al (1996); Li et al
(1997); Shuhui et al (2001); Sfetsos (2002) and Kamal (2004). ANNs that are trained
on a time series are supposed to achieve two things: first, to predict the time series many
time steps ahead and, second, to learn the rule that produced it. Prediction and learning are
not necessarily related to each other, especially for chaotic time series Freking et al
(2005).
Burney (1999) studied artificial neural networks (ANNs) with emphasis on predictive
data mining. Burney and Jilani (2001) applied ANN methods to the forecasting of the
stock exchange. They used supervised ANNs for the prediction of stock exchange share
rates Burney and Jilani (2003). The most notable work on ANNs was the
comparison of first and second order algorithms, Burney et al (2004). More and Deo
(2003) employed neural networks to forecast daily, weekly and monthly
wind speed. Both feed forward (FF) and recurrent networks (RN) were used and
trained on past data in the autoregressive (AR) manner using back propagation (BP) and
cascade correlation (CC) algorithms. They concluded that the CC algorithm yields more
accurate forecasts than BP.
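Training a feed forward network "in the autoregressive manner" means that each input vector holds the p most recent values of the series and the target is the value that follows. The sketch below shows this with a single tanh hidden layer trained by online back propagation. It is not the configuration used by More and Deo (2003); the sine series stands in for a wind record, and the lag order, layer size, learning rate and epoch count are arbitrary illustrative choices. Recurrent networks and cascade correlation are not shown.

```python
import math
import random

def make_lagged(series, p):
    """Pair each window of p past values with the value that follows it."""
    return [(series[i:i + p], series[i + p]) for i in range(len(series) - p)]

def init_network(p, hidden, rng):
    """Small random weights; the last entry of each weight list is a bias."""
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(p + 1)] for _ in range(hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    return w_hidden, w_out

def forward(net, x):
    """Return the hidden activations and the one-step-ahead prediction."""
    w_hidden, w_out = net
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in w_hidden]
    y = sum(w * v for w, v in zip(w_out, h)) + w_out[-1]
    return h, y

def train(net, data, lr=0.05, epochs=200):
    """Online back propagation on the squared one-step prediction error."""
    w_hidden, w_out = net
    for _ in range(epochs):
        for x, target in data:
            h, y = forward(net, x)
            err = y - target
            for j, hj in enumerate(h):
                # error signal for hidden unit j, using the weight before its update
                delta = err * w_out[j] * (1.0 - hj * hj)
                w_out[j] -= lr * err * hj
                for i, xi in enumerate(x):
                    w_hidden[j][i] -= lr * delta * xi
                w_hidden[j][-1] -= lr * delta
            w_out[-1] -= lr * err
    return net

def rmse(net, data):
    return math.sqrt(sum((forward(net, x)[1] - t) ** 2 for x, t in data) / len(data))

rng = random.Random(0)
series = [math.sin(0.3 * t) for t in range(200)]  # synthetic stand-in, not wind data
data = make_lagged(series, p=3)
train_set, test_set = data[:150], data[150:]
net = init_network(p=3, hidden=4, rng=rng)
rmse_before = rmse(net, test_set)
train(net, train_set)
rmse_after = rmse(net, test_set)
print(rmse_before, rmse_after)
```

The error is reported on a held-out tail of the series, so the comparison before and after training reflects forecasting rather than fitting.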
On the basis of this critical analysis and review of ANNs, we are of the opinion that ANNs yield
better forecasts than the traditional stochastic time series model ARIMA. We have not been
able to find any relevant research article pertaining to ANNs in the Journal of the American
Statistical Association over the last two decades. Recent research activities indicate that
forecasting with ANNs can be a promising alternative to the traditional ARMA structure. Zhang
(2003) presented a hybrid ARMA and neural network model. Org et al (2005) worked on
model identification of ARIMA using genetic algorithms. Pai and Lin (2005) carried out
stock price forecasting using a hybrid ARIMA and support vector machines model. With
hybridization of intelligent techniques such as ANNs, fuzzy systems and evolutionary
algorithms, one could expect relatively better time series prediction.
Valenzuela et al (2008) exploited hybridization of
intelligent techniques and ARIMA models for time series prediction. A critical survey on
neural networks in business forecasting reflects the modeling issues that arise in
forecasting applications Zhang (2004).
1.6 Thesis Plan
Through critical analysis and review of various approaches to time series modeling, simulation
and prediction, we have been able to identify neglected areas of research as
well as areas that have been overemphasized. It has been realized that statistical
techniques like ARMA, ARIMA, non-seasonal ARIMA and seasonal ARIMA
have limited capabilities when modeling time series data. Likewise, regression
analysis time series modeling and simulation have serious limitations. In such a
situation, we shall generalize statistical techniques and accomplish
modeling of time series wind data.
We shall compare MTM (Markov Transition Matrices) with stochastic time
series models. By comparing statistical and generalized techniques for
stochastic time series, we expect to obtain pertinent and useful results. The finer
statistical details are useful for choosing a proper stochastic time series model, such as
the comparison of MTM with ARMA as a simulator, the suitability of short range
versus long range prediction, the stochastic simulator in ARIMA and the
heteroscedasticity/homoscedasticity tests in regression analysis time series, partly
on some weather data.
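For orientation, a Markov transition matrix simulator of the kind to be compared with ARMA can be sketched as follows: the observed speeds are binned into states, transitions between consecutive states are counted and normalized row by row, and a synthetic state sequence is drawn from the resulting matrix. The speeds, the bin width and the uniform fallback for unvisited states are hypothetical choices made only for this illustration.

```python
import random

def to_states(speeds, bin_width):
    """Map each non-negative speed to the index of its bin."""
    return [int(s // bin_width) for s in speeds]

def fit_transition_matrix(states, n_states):
    """Row-normalized counts of transitions between consecutive states."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # states never left in the record get a uniform row (sketch assumption)
        matrix.append([c / total for c in row] if total else [1.0 / n_states] * n_states)
    return matrix

def simulate(matrix, start, length, rng):
    """Draw a state sequence of the given length from the transition matrix."""
    path = [start]
    for _ in range(length - 1):
        nxt = rng.choices(range(len(matrix)), weights=matrix[path[-1]])[0]
        path.append(nxt)
    return path

# Hypothetical hourly wind speeds in m/s, binned into 2 m/s states.
speeds = [2.1, 3.4, 5.0, 4.2, 6.8, 7.1, 5.5, 3.9, 2.2, 1.0, 3.3, 4.8]
states = to_states(speeds, bin_width=2.0)
matrix = fit_transition_matrix(states, n_states=4)
path = simulate(matrix, start=states[0], length=20, rng=random.Random(1))
print(path)
```

The simulated sequence contains state indices only. Turning it back into speeds requires a further choice, such as drawing a value within each bin.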
We find the recent trends in modeling and simulation of time series only in the feed
forward back propagation neural network (FFBPNN); therefore, we shall attempt
FFBPNN on our data.
We shall apply singleton and non-singleton type-1 back propagation (BP)
designed sixteen rule fuzzy logic systems (FLS) to hourly averaged wind data,
which, to our knowledge, nobody has attempted to date.
We shall also use design free fuzzy logic and obtain predictions on wind data,
which again, to our knowledge, has never been done on wind data to date.
We shall perform Mackey Glass simulation on wind data.
There are diverse categories of time series models, like neuro fuzzy logic Burney et al
(2006), Burney and Jilani (2007), second order modeling of fuzzy time series Tsai
& Wu (1999), multivariate fuzzy logic Jilani and Burney (2007), autoregressive
fuzzy logic Kezuhiro et al (1997), fuzzy predictor by extrapolating a time series
and parallel structure fuzzy system Kim et al (2001), which would, of course, have
extensive applications in business and trade related activities, risk assessment
and small scale weather or climate prediction.