ARMA-Stochastic Time Series Modeling

  • 8/2/2019 ARMA-Stochastic Time Series Modeling

    Contents

    Abstract

    Chapter 1 Critical Reviews:

    1.1 Stochastic Time Series Modeling, Simulation & Prediction

    1.2 Regression Analysis Time Series Modeling & Simulation

    1.3 Chaotic Time Series without Rule Based Fuzzy logic(FL),

    Mackey Glass Simulation with FL and Prediction

    1.4 Rule Based Fuzzy Logic Time Series Prediction,

    Modeling and Simulation

    1.5 Artificial Neural Network Time Series (ANNTS)

    Modeling, Simulation & Prediction

    1.6 Thesis Plan

    1.1 Stochastic Time Series Modeling, Simulation & Prediction

    A method of forecasting wind power output a few hours in advance, from a wind

    power generator that supplies a power and energy system, is required to ensure

    efficient utilization of the power. Time series modeling of wind speed has been the

    subject of many discussions because of the interest in wind as an alternative form of

    energy. When the records of wind speed are incomplete or of too short a duration or the

    handling and storage of large values of the data are not desirable, then a time series

    model is needed. Since wind power is a function of wind speed, simulations of power

    are generally derived from simulations of speed. Wind speed simulations can be done

    with Monte Carlo methods that rely solely on the estimated parameters of the marginal

    distribution of wind speeds.

    The multiplicative ARMA (autoregressive moving average) models to generate

    hourly series of global radiation by Mora-Lopez and Sidrach-de-Cardona (1998),

    stochastic simulation using ARIMA (autoregressive integrated moving average)

    modeling of solar irradiation by Craggs et al (1999) and a time dependent autoregressive

    Gaussian model (TAG) for generating synthetic hourly radiation by Aguiar and Collares

    Pereira (1992) are important contributions from modeling and simulation point of view.

    Lalarukh and Jafri (1999) used an ARMA process on hourly global radiation data,

    performed stochastic modeling through MTM (Markov Transition Matrix) and

    generated synthetic sequences of hourly global solar irradiation for Quetta, Pakistan.

    They found the MTM approach relatively better as a simulator than ARMA

    modeling. However, their ARMA analysis to simulate and forecast hourly averaged

    wind speed for Quetta, Pakistan also yielded good results, Lalarukh and Jafri (1997).

    Several non-Gaussian distributions have been suggested as appropriate models for

    wind speed. These models include the inverse Gaussian distribution Bardsley (1980), the

    log normal distribution Luna and Church (1974), the gamma distribution Sherlock

    (1951), the Weibull distribution Hennessey (1977); Justus, et al (1976); Stewart and

    Essenwanger (1978) and Takle and Brown (1978) and the squared normal distribution

    Carlin and Haslett (1982). We have seen from our previous studies Nasir et al (1991);

    Raza and Jafri (1987) and Brown (1981) that the Weibull distribution fits the actual

    wind speed frequencies quite well. However, the use of inverse Gaussian distribution on

    wind data Bardsley (1980) ignores the positive correlations between consecutive

    observations of wind speed. Failure to take this autocorrelation into account leads to

    underestimation of the variances of the time averages of wind speeds. Moreover, the long

    runs of high and low wind speeds that are characteristic of such data do not occur

    frequently enough in simulated data when wind speeds are assumed to be uncorrelated

    over time.
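
    The effect of ignoring autocorrelation on the variance of a time average can be
    illustrated with a small calculation. The sketch below is only an illustration: it
    assumes unit-variance observations whose lag-k autocorrelation decays as rho**k
    (an AR(1)-type assumption), and the value rho = 0.9 is hypothetical, not an estimate
    from the wind data discussed here.

```python
def variance_of_mean(n, rho):
    """Variance of the mean of n unit-variance observations.

    Assumes AR(1)-type persistence: the lag-k autocorrelation is rho**k.
    Uses Var(mean) = (1/n) * [1 + 2 * sum_{k=1}^{n-1} (1 - k/n) * rho**k].
    """
    total = 1.0
    for lag in range(1, n):
        total += 2.0 * (1.0 - lag / n) * rho ** lag
    return total / n


# rho = 0.9 is an illustrative value, not an estimate from the Quetta data.
independent = variance_of_mean(24, 0.0)
persistent = variance_of_mean(24, 0.9)
print(f"variance of a 24-hour mean, uncorrelated: {independent:.4f}")
print(f"variance of a 24-hour mean, rho = 0.9:    {persistent:.4f}")
```

    With positive autocorrelation the second figure is larger than the first, which is
    the underestimation that arises when the observations are treated as uncorrelated.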

    To overcome this problem Chou and Corotis (1981) and Goh and Nathan (1979)

    have attempted to incorporate autocorrelation into wind speed models, but they do not

    consider the Gaussian shape of transformed wind speed distributions and its

    corresponding statistics. Some of the studies have neglected the non Gaussian shape of

    the wind speed distribution. Brown et al (1984) suggested methods to take into account

    the autocorrelated nature of wind speed, the diurnal non-stationarity and non Gaussian

    shape of wind speed distribution so that forecasting of hourly averaged wind speed could

    be done. Brown et al (1982) in their previous study, have also indicated the need for

    standardization to remove diurnal non-stationarity. Diurnal variations in wind speed

    occur as a natural phenomenon Jafri et al (1989) and as mentioned in a paper by Kamal

    and Jafri (1996) standardization corresponds to smoothing of a profile, such as of a

    Gaussian distribution that is obtained after transforming a non-Gaussian shape to an

    approximately Gaussian shape, i.e., by bringing scattered data points close to the profile.

    We carried out this standardization procedure in the present study on hourly averaged

    wind data for Quetta, Pakistan covering a period of twenty years, i.e., 1985-2004, before

    applying the ARMA process.
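
    A minimal sketch of the transform-then-standardize step follows. It is not the
    procedure used in this study: the power exponent (0.5), the function name
    standardize_by_hour and the synthetic three-day series are all assumptions chosen
    only to show how an hour-of-day mean and standard deviation can be removed.

```python
import math
from statistics import mean, pstdev


def standardize_by_hour(speeds, power=0.5):
    """Power-transform hourly speeds, then remove the hour-of-day mean and spread.

    speeds is assumed to start at hour 0 and to contain one value per hour.
    The exponent `power` is a hypothetical choice for the Gaussianizing transform.
    """
    transformed = [v ** power for v in speeds]
    by_hour = {h: transformed[h::24] for h in range(24)}
    mu = {h: mean(vals) for h, vals in by_hour.items() if vals}
    sd = {h: pstdev(vals) for h, vals in by_hour.items() if vals}
    result = []
    for i, v in enumerate(transformed):
        h = i % 24
        result.append((v - mu[h]) / sd[h] if sd[h] > 0 else 0.0)
    return result


# Synthetic series: three days with a diurnal cycle and a day-to-day scale change.
speeds = [(4.0 + 2.0 * math.sin(2 * math.pi * h / 24)) * (1.0 + 0.1 * d)
          for d in range(3) for h in range(24)]
z = standardize_by_hour(speeds)
```

    After this step every hour of the day has zero mean and unit standard deviation, so
    the diurnal cycle no longer appears in the first two moments of the series.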

    Jafri (1996)a established that the hierarchical random process is a Markovian

    random process, which can be characterized by a scaling probability distribution. A

    generating function for such a process was obtained. These observations can be

    successfully applied to chaotic time series Jafri (1996)b to overcome the non-stationarity

    in ARMA process but it would require handy stochastic simulation techniques. Jafri

    (1996)b suggested that the chaotic time series, in both Bayesian and non-Bayesian

    statistics is deterministic. Jafri (1995) developed a first order Markov transition matrix

    (MTM) for non Gaussian nature of wind speed of Quetta for 1985 and suggested a

    Gaussian form of MTM sequence to yield HAWS (hourly averaged wind speed)

    sequences. The same work was extended further on wind speed data for a period of

    twenty years, i.e., 1985-2004. Needless to mention, the simulation of wind data using

    MTM Jafri (2001) is relatively difficult compared to simulation on solar radiation data

    Lalarukh and Jafri (1999). The number of iterations exceeds a certain limit, thus

    causing the HAWS and DAWS (daily averaged wind speed) sequences to become

    cumbersome and entangled. Jafri (1995; 2001) also found autocorrelation coefficient for

    wind data, which shows levels of persistence in wind speed frequencies and of wind

    speed magnitudes when compared with diurnal variations over daily averaged wind speed

    (DAWS) sequences.
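
    The idea of a first order Markov transition matrix can be sketched as follows. This
    is an illustration only: the three wind speed states, the short observed sequence
    and the function names are hypothetical, and the code is not the MTM procedure of
    the cited studies.

```python
import random


def transition_matrix(states, n_states):
    """Estimate first-order transition probabilities from a sequence of state indices."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state never left in the data gets a uniform row (an arbitrary fallback).
        matrix.append([c / total if total else 1.0 / n_states for c in row])
    return matrix


def simulate(matrix, start, length, rng):
    """Generate a synthetic state sequence by sampling each row as a distribution."""
    sequence = [start]
    for _ in range(length - 1):
        weights = matrix[sequence[-1]]
        sequence.append(rng.choices(range(len(matrix)), weights=weights)[0])
    return sequence


# Hypothetical discretized wind speed states: 0 = low, 1 = medium, 2 = high.
observed = [0, 0, 1, 1, 2, 1, 0, 0, 1, 2, 2, 1]
P = transition_matrix(observed, 3)
synthetic = simulate(P, observed[0], 24, random.Random(0))
```

    Each row of P sums to one, and long runs in the synthetic sequence arise from large
    diagonal entries, which is how such a matrix carries persistence.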

    Blanchard and Desrochers (1984) and Brown et al (1984) employed a class of

    parametric time series models called autoregressive moving average processes (ARMA)

    of Box and Jenkins (1976). Such processes have been employed to model many

    meteorological time series Katz and Skaggs (1981). The model of Blanchard and

    Desrochers (1984) takes into account high autocorrelation and allows a time series to be

    generated which preserves all the main characteristics of the data; and it does not require

    any assumption about the wind speed distribution. In fact, a larger class of seasonal

    models include ARIMA models Blanchard and Desrochers (1984). Sfetos (2002) studied

    the linear ARIMA models and feed forward artificial neural networks (FFANN). He

    found that the model order is selected from the minimization of the evaluation set error in

    the ARIMA process. He suggested the multi step forecasting and the subsequent

    averaging to generate mean hourly prediction of wind data. The ARIMA models have

    been critically analyzed by Jain and Lungu (2002). They considered both non- seasonal

    and seasonal ARIMA models by using stochastic components. They also deliberated to

    determine the persistence patterns if any, of the stochastic components.

    We know the model of Chou and Corotis (1981) is based on Weibull distribution

    and does not require stationarity in the data. McWilliams and Sprevak (1982)a described

    a new version of an existing time series modeling procedure Box and Jenkins (1976)

    from which the distributions of wind speeds and wind directions are obtained McWilliams

    et al (1979) and McWilliams and Sprevak (1982)b. Their model incorporates diurnal

    variations observed in wind speed in such a manner that the time series of wind speed

    components remain stationary; the sample autocorrelation functions for the series have

    identical stochastic behavior as far as the second order statistics are concerned, thus

    reducing the problem to modeling a single Gaussian series. This model is corrected for

    autocorrelation functions, to account for diurnal variations. One point is

    obvious: they did not use a transformation of hourly averaged wind speed. Instead, they

    considered annual deterministic variations μ(t) and σ²(t), which are modeled by a harmonic

    series representation to account for the diurnal variation of wind speed. With regard to our

    conjecture, diurnal variation Jafri et al (1989) should be employed in model development

    in a manner similar to McWilliams and Sprevak (1982)b.

    We followed the approach of Daniel & Chen (1991), which consists of first fitting

    ARMA processes of various orders to hourly averaged wind speed (HAWS) data that

    have been transformed to make their distribution approximately Gaussian and standardized

    to remove the so-called diurnal non-stationarity. Although we did not favor the procedures of

    transformation and standardization, we preferred this approach because the

    model can use wind data of more than one year. The primary

    advantage of including more than one year of data in the model development is the

    increased reliability of the estimates of the model parameters.

    We used MINITAB (version 11) for ARMA, non seasonal ARIMA and seasonal

    ARIMA modeling and simulation. ARIMA models are used to model a special class of

    non-stationary series. Seasonal ARIMA (SARIMA) models are used to incorporate

    cyclic components in models. In other words, ARIMA models are, in theory, the most

    general class of (parsimonious) models for forecasting a time series which can be

    stationarized by transformations such as differencing and logging. SARIMA has the same

    structure as ARIMA. We used both non-seasonal and seasonal models on hourly

    averaged wind data of 1985-2004. For non-seasonal ARIMA modeling and simulation,

    the six options, i.e., random walk (ARIMA(0,1,0)), differenced first order autoregressive

    model (ARIMA(1,1,0)), constant (ARIMA(0,1,1)), linear exponential smoothing (LES)

    without constant (ARIMA(0,2,1) or (0,2,2)) and mixed ARIMA(1,1,1), are tried for each

    month and for the four seasons. Non-seasonal ARIMA(0,1,1), which deals with exponential

    growth and a constant, incorporates the simple exponential smoothing (SES) model. MA(1)

    coefficients correspond to 1-α in the SES formula. The term α is called the training

    parameter. For LES without constant, MA(1) coefficient corresponds to 2.
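
    The correspondence between the MA(1) coefficient and the SES smoothing constant can
    be checked numerically. The sketch below is an illustration under stated
    assumptions: the smoothing constant is written alpha, the MA(1) coefficient theta
    follows the Box and Jenkins sign convention, both recursions start from the first
    observation, and the short series y is made up.

```python
def ses_forecasts(y, alpha):
    """One-step SES forecasts: F[t+1] = alpha * y[t] + (1 - alpha) * F[t]."""
    forecasts = [y[0]]
    for value in y:
        forecasts.append(alpha * value + (1.0 - alpha) * forecasts[-1])
    return forecasts


def arima011_forecasts(y, theta):
    """One-step ARIMA(0,1,1) forecasts without constant: F[t+1] = y[t] - theta * e[t]."""
    forecasts = [y[0]]
    for value in y:
        error = value - forecasts[-1]
        forecasts.append(value - theta * error)
    return forecasts


# Hypothetical hourly wind speeds; alpha is an illustrative smoothing constant.
y = [3.2, 4.1, 3.8, 5.0, 4.4, 3.9, 4.7]
alpha = 0.4772
ses = ses_forecasts(y, alpha)
arima = arima011_forecasts(y, 1.0 - alpha)
largest_gap = max(abs(a - b) for a, b in zip(ses, arima))
```

    Substituting theta = 1 - alpha into the ARIMA recursion gives
    y[t] - (1 - alpha)(y[t] - F[t]) = alpha y[t] + (1 - alpha) F[t], so the two forecast
    sequences agree up to rounding.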

    For seasonal ARIMA (SARIMA) modeling and simulation, the seven

    options, i.e., SARIMA(0,1,1)x(0,1,1)12, SARIMA(0,0,0)x(0,1,0)12 with constant,

    SARIMA(0,1,0)x(0,1,0)12, SARIMA(1,0,1)x(0,1,1)12 with constant, SARIMA following

    SES with =0.4772 and Browns SARIMA(LES) with = 0.2106 are tried for each

    month only. The most often used ARIMA model is SARIMA(0,1,1)x(0,1,1)12, which

    strictly follows seasonal exponential smoothing. SARIMA(0,1,0)x(0,1,0)12 is also

    known as the seasonal random trend (SRT) model. The alternative to the SRT model is the seasonal

    random walk model, i.e., SARIMA (1,0,0)x(0,1,0)12. There is, of course, a difference

    between seasonal and simple exponential models. The values of = 1- is used in

    exponential smoothing formulas. The best option is selected by taking the

    smallest chi-squared value at the 5% significance level.

    1.2 Regression Analysis Time Series Modeling & Simulation

    Regression is strictly correlation analysis, accomplished with time and

    sometimes without time series. The modern interpretations and fundamental concepts of

    regression analysis are thoroughly presented by Gujarati (1988), Siegel (1997), Rawlings

    (1988) and Newton (1988). All kinds of regression analysis can be accomplished by the

    least squares regression technique, which minimizes the discrepancy between data points

    and the fit Chapra and Canale (1990). It comprises linear regression (LR), polynomial

    regression (PR), multiple linear regression (MLR), general linear least squares (GLLS)

    and non-linear regression (NLR). For NLR, the least squares technique is used. The Gauss-Seidel

    technique cannot be employed because the normal equations are not diagonally

    dominant. NLR analysis is sometimes very useful to fit but it also requires minimization

    of the sum of the square of the residuals (SSR). This analysis is only carried out on a

    single independent variable, therefore, multiple parameters which are interrelated with

    each other such as in MLR can not be studied. However, NLR analysis has the advantage

    over PR because it exploits iteration. For NLR analysis the Gauss-Newton method has

    some shortcomings such as slow convergence, wide oscillations, i.e., changing directions,

    and sometimes divergence Draper and Smith (1981). These shortcomings were overcome

    by other methods such as the steepest descent and the Levenberg-Marquardt techniques

    Trabea and Shaltout (2000). However, PR in some cases, especially when data is

    distributed like a parabola or in a cubic polynomial can be applied because it is dependent

    on a single variable, such as PRATS, in our case. Trabea and Shaltout (2000) studied

    correlation of global solar radiation with meteorological parameters like mean daily

    maximum temperature, mean daily relative humidity, mean daily sea level pressure, mean

    daily vapour pressure and hours of bright sunshine, by using MLR analysis. The

    correlation, the regression coefficient and the standard error were estimated. But they did

    not consider the interdependence of the meteorological parameters. Rapti (2000)

    developed mathematical correlation of atmospheric turbidity with specific humidity and

    of diffuse radiation with atmospheric turbidity for maritime and for continental air

    masses. This study does not include any statistical correlations.
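
    The least squares principle mentioned above, minimizing the sum of the squares of
    the residuals (SSR), can be sketched for the simplest case of a straight line. The
    data values and function names below are hypothetical and serve only to show the
    closed-form solution of the normal equations for one independent variable.

```python
def fit_line(x, y):
    """Least squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def ssr(x, y, intercept, slope):
    """Sum of the squares of the residuals for a given line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up data points scattered around a straight line.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = fit_line(x, y)
best = ssr(x, y, intercept, slope)
```

    Any other intercept or slope gives a larger SSR than the fitted pair, which is what
    distinguishes the least squares line from other candidate fits.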

    Ilyas and Nasir (2000) developed a relationship between humidity and

    temperature and found a Gaussian trend. The best fit to the experimental data, as suggested

    by them, is as follows:

    [Equation (1): H_th expressed in terms of H_o, T_o and k]

    where H_th is the theoretical humidity, H_o and T_o are the experimental values of humidity

    and temperature, respectively, and k is a constant for the fit. Hussain, Jafri and Kamal [10]

    used regression modeling of weather data and found PRATS relatively better than PR.

    Ilyas (2000) found an inverse Gaussian relationship for percentage cumulative frequency

    of sunshine hours and solar energy, i.e.,

    [Equation (2): f_cum(%) expressed in terms of k and E/E_th]

    where

    [Equation (3): k expressed in terms of x, n and f_cum, with x = (E/E_th)^2]

    In eq.2, symbols E, Eth, x and n represent solar energy, threshold solar energy,

    square of the ratio of solar energy to its threshold values and the total number of data

    respectively.

    The overall behavior of humidity on temperature and solar energy on its

    cumulative frequency of sunshine hours shows a reversal, i.e., the former is Gaussian and

    the latter is inverse Gaussian. We tried to establish the best fit to our diverse data by using

    regression analysis. Kamal and Jafri (1999) developed stochastic modeling and generated

    synthetic sequences of hourly global solar irradiation. They also found the Markov

    transition matrices (MTM) approach relatively better as a simulator compared to

    Autoregressive Moving Average (ARMA) process. The time series models to simulate

    and forecast hourly averaged wind speed (HAWS) were presented by Kamal and Jafri

    (1997). They also used simulation of Weibull distribution of HAWS Kamal and Jafri

    (1996). With the use of triangulation method and statistical correlation from regression

    equations, solar radiations were estimated at locations where there were no observatory

    and found to be very reliable Raza and Kamal (2002). Jafri recently performed fuzzy

    logic time series (FLTS) prediction modeling on HAWS (2007). Needless to mention,

    regression modeling, despite its many shortcomings, is a better predictor. The fuzzy

    regression analysis is defined as the model which includes the fuzziness (uncertainty) in

    itself Tanaka and Ishibuchi (1992). Ozawa et al (1997) used the fuzzy autoregressive

    (AR) model to describe the fuzzy time series, which cannot be dealt with

    by stochastic models. The fuzzy time series analysis was proposed by Watada (1992).

    1.3 Chaotic Time series without Rule Based Fuzzy logic (FL), Mackey

    Glass Simulation with FL and Prediction

    The original fuzzy logic (FL) pioneered by Lotfi Zadeh (1965) has been around

    for forty years, and yet it is unable to handle uncertainties. Zadeh introduced the concept

    of a fuzzy set, a set whose boundary is not sharp or precise. This concept contrasts with

    the classical concept of a set, recently called a crisp set, whose boundary is required to be

    precise. Probability and fuzzy sets describe different kinds of uncertainty. Probability

    is the theory of sets. It deals with the likelihood of relevant events or with the expectation

    of a future event based on something now known (outcome of a random event) while the

    fuzziness is not the uncertainty expectation. Fuzzy set theory, on the other hand is not

    concerned with events. It is concerned with concepts. Rule based fuzzy logic system

    (FLS) is a powerful design methodology to minimize the effect of uncertainty Mendel

    (2001). Model free designs are artificial neural networks (ANN) and fuzzy logic (FL). The

    fuzzy logic (FL) rules are extracted from numerical data and are then combined with

    linguistic knowledge. The richness of fuzzy logic is that there are enormous numbers of

    possibilities that lead to many non-linear mappings of an input data vector into a scalar

    output. In model free approaches, the associated model is a representation of architecture

    to solve a specific problem. With a model approach in fuzzy logic, one can pursue the

    truth or a close approximation to it. FLSs employ 500 rules for the one pass (OP) and

    sixteen rules for the back propagation (BP) steepest descent methods of design, respectively.

    We followed a model free approach, i.e., fuzzy logic on hourly wind speed data, to

    predict future values, i.e., consequents from antecedents (past values). Single stage

    forecasting for a chaotic wind data time series will be used.

    1.4 Rule based Fuzzy Logic Time series Prediction, Modeling and

    Simulation

    Rule based fuzzy logic systems (FLS), a powerful design methodology, minimize

    the effect of uncertainty Mendel (2001). The two most popular FLSs used by engineers

    today are the Mamdani and Takagi-Sugeno-Kang (TSK) systems. Both are characterized

    by IF-Then rules and have the same antecedent structures. They differ in the structure of

    the consequents. The consequent of a Mamdani rule is a Fuzzy set, whereas the

    consequent of a TSK rule is a function. The type-1 TSK FLSs have been widely used in

    control and other applications Terano et al (1994). The output of type-1 TSK forecaster

    occurs without a defuzzification step. Liang and Mendel (1999; 2000) developed type-2

    TSK FLSs. The FLS forecasters comprise singleton type-1 (with virtually no

    uncertainties), non-singleton type-1 (with uncertainties), singleton type-2, type-1 non-

    singleton type-2, type-2 non-singleton type-2, type-1 TSK and type-2 TSK Mendel

    (2001). The rule based fuzzy logic systems (FLSs), both type-1 and type-2, handle

    uncertainties because modeling and minimization of uncertainties can be accomplished.

    If all uncertainties disappear, type-2 FL reduces to type-1 FL, in much the same manner

    that if randomness disappears, probability reduces to determinism. For basic singleton

    type-1 FLSs, we assume that there are no uncertainties; all fuzzy sets are of type-1,

    measurements are perfect and treated as crisp values, i.e., as singletons. Thus, the

    non-singleton FLSs do not yield crisp values, i.e., uncertainties are inherently present. A FLS

    that is described completely in terms of type-1 fuzzy sets is called a type-1 FLS. Type-1

    FLSs are unable to directly handle rule uncertainties, because they use type-1 fuzzy sets

    that are certain. Therefore, a better way to handle uncertainties is to use a type-2 FLS.

    But, a non-singleton type-1 FLS is a type-1 FLS whose inputs are modeled as type-1

    fuzzy numbers; hence, it can be used to handle uncertainties. Moreover, the type-1 FL, in

    its applications, deciphers rule based systems as a powerful design methodology.

    The rules of a non singleton-type-1 FLS are the same as those for a singleton

    type-1 FLS Mendel (2001). The difference is of the fuzzifier, which treats the inputs, as

    type-1 fuzzy sets, and the effect of this on the inference block. The output of the

    inference block will again be a type-1 fuzzy set. The type-1 FLS, both for singleton and

    non-singleton, is shown in Fig.1. So the defuzzifiers that are described for a singleton

    type-1 FLS apply as well to a non-singleton type-1 FLS Mendel (2001).

    We know that non-stationarity (randomness) in our wind data inherently exists

    Jafri (2005); Kamal and Jafri (1996), therefore, uncertainties or randomness cannot be

    reduced. It can be handled properly with non-singleton type-1 FLS, therefore, there

    appears no reason to use a type-2 FLS.

    We recently performed fuzzy logic (FL) time series prediction modeling on

    hourly averaged wind speed (HAWS) data of 1985-2004 and used Mackey-Glass

    simulation, for Quetta, Pakistan. We shall use the same results of wind data with the

    applications of rule based type-1 FLS. We used the MATLAB M-files which are:

    URL:http://sipi.usc.edu/~mendle/software. The M-files are available in three folders:

    type-1 FLS, general type-2 FLSs and Interval type-2 FLSs. We used, in this study, the

    following type-1 FLSs:

    - Singleton Mamdani type-1 FLS

      sfls_type1.m: compute the output(s) of a singleton type-1 FLS when the

      antecedent membership functions are Gaussian

      train_sfls_type1.m: tune the parameters of a singleton type-1 FLS when the

      antecedent membership functions are Gaussian, using some input-output training data

    - Non-singleton Mamdani type-1 FLS

      nsfls_type1.m: compute the output(s) of a non-singleton type-1 FLS when the

      antecedent membership functions are Gaussian and the input sets are Gaussian

      train_nsfls_type1.m: tune the parameters of a non-singleton type-1 FLS when the

      antecedent membership functions are Gaussian, using some input-output training data.
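
    To make the role of such routines concrete, the sketch below computes the output of
    a type-1 FLS with Gaussian antecedent membership functions. It is written in Python
    for illustration and is not the code of the M-files listed above; it assumes the
    product t-norm and a height defuzzifier, and the rule values and the name fls_output
    are hypothetical. The optional input_sigma treats each input as a Gaussian fuzzy
    number, which under the product t-norm widens each antecedent variance by the input
    variance.

```python
import math


def fls_output(x, rules, input_sigma=0.0):
    """Output of a type-1 FLS with Gaussian antecedents and height defuzzification.

    rules: list of (means, sigmas, y_bar), one mean and sigma per antecedent.
    input_sigma = 0 gives the singleton case; input_sigma > 0 models Gaussian
    (non-singleton) inputs, assuming the product t-norm.
    """
    numerator = 0.0
    denominator = 0.0
    for means, sigmas, y_bar in rules:
        firing = 1.0
        for xi, m, s in zip(x, means, sigmas):
            spread_sq = s ** 2 + input_sigma ** 2
            firing *= math.exp(-0.5 * (xi - m) ** 2 / spread_sq)
        numerator += y_bar * firing
        denominator += firing
    # Note: the denominator can underflow to zero for inputs far from every rule.
    return numerator / denominator


# Two hypothetical rules over two antecedents.
rules = [([0.0, 0.0], [1.0, 1.0], 0.2),
         ([2.0, 2.0], [1.0, 1.0], 0.8)]
midpoint = fls_output([1.0, 1.0], rules)
singleton = fls_output([0.0, 0.0], rules)
non_singleton = fls_output([0.0, 0.0], rules, input_sigma=1.0)
```

    At the midpoint both rules fire equally, so the output is the average of the two
    consequent centers; with a non-zero input spread the second rule contributes more
    at [0, 0] and the output moves toward that average.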

    We avoid the extraneous matter on the development and historical background of

    rule- based FLSs because we are concerned only with use of FLSs in time series. The

    exhaustive literature and indeed critical review on rule-based FLSs are available in the

    form of a book by M. Mendel (2001). However, we shall deliberate on fundamental rules

    extracted from the data under consideration. The rules in fuzzy logic time-series are

    usually extracted from designing the FLSs. Prior to 1992, all FLSs reported in the open

    literature fixed the parameters, such as the type of fuzzification, composition,

    implication, t-norm (operators for fuzzy intersection), defuzzification (produces crisp

    output) and membership functions, arbitrarily, e.g., the locations and spreads of the

    membership functions were chosen by the designer independent of the numerical training

    data. Then, at the first IEEE conference on Fuzzy Systems, held in San Diego in 1992,

    three different groups of researchers, i.e., Horikawa et al (1992), Jang (1992) and Wang

    and Mendel (1992), presented the same idea: tune the parameters of a FLS using the

    numerical training data. Since that time, quite a few adaptive training procedures have

    been published. Because tuning of free parameters had been in feed forward neural

    network (FFNN) long before it was done in a FLS, a tuned FLS has also come to be

    known as a neural fuzzy system. Designing a FLS Mendel and Mouzouris (1997) can be

    viewed as approximating a function or fitting a complex surface in a multidimensional

    space. Given a set of input-output pairs, tuning is essentially equivalent to determining a

    system that provides an optimal fit to input-output pairs, with respect to a cost function

    (tuning algorithm). Utilizing concepts from real analysis, Mouzouris and Mendel have

    proven that a non-singleton FLS can uniformly approximate any continuous function on a

    compact set. Although the proof of approximation Mendel and Mouzouris (1997)

    provides some insight, it does not tell us how to choose the parameters of the non-

    singleton FLS, nor does it tell us how many basis functions will be needed to achieve

    such performance. The latter are accomplished through design. The design of FLSs

    requires one-pass (OP), least squares, back-propagation (steepest descent, BP), SVD-QR

    (SVD-QR is a matrix tool in numerical linear algebra used in signal processing,

    extracting fuzzy rules, reducing fuzzy rules and modeling the fuzzy rules) and iterative

    design methods.

    The forecasting of time series with rule-based FLS designs employs

    only two methods, i.e., the one pass (OP) and back propagation (BP) methods, respectively.

    The OP design constructs 500 rules for the antecedent and consequent membership

    functions. We set the value of the standard deviation equal to 0.1 for all Gaussians in a

    pre-defined OP design. But, the OP is exhaustive as compared to BP designing in FLSs.

    On the contrary the BP constructs only 16 rules for each antecedent and consequent

    membership functions. The initial values of the standard deviation of Gaussian

    membership function are all set equal to 0.5240 in a pre-defined BP design. The BP

    designing, in many respects, is better than OP, Mendel (2001). The predefined values of

    all four antecedent membership functions and for the centers of the consequent

    membership functions (ȳ^l, height defuzzifier) for each of the corresponding 16 rules in a BP

    design for FLSs are used in the form of a matrix as an input. We take the height

    defuzzifier (ȳ^l, the centers of the consequent membership functions) to be a random

    number from the interval (0,1). After training with the BP design, the FLS forecaster

    was fixed. We use the learning parameter = 0.2 in the BP design.

    With tractable learning laws, we set the learning parameters. Alpha stable statistics model

    the impulsiveness as a parameterized family of probability density functions. Additive

    fuzzy systems can filter impulsive noise from signals. With α < 2 one gets impulsive

    noise, and the noise has infinite variance. The alpha in statistics is an exponent parameter.

    With α = 2, we get the classical Gaussian case, i.e., exponential tail and finite variance.

    The predefined initial mean (center) values of antecedent membership functions

    along with height defuzzifiers (mean values of consequent membership functions) and

    the standard deviations of the Gaussian antecedent, in the form of matrix membership

    functions, as shown in tables 1 and 2, are used for determining the values of singleton

    consequent membership functions, i.e., f_s(s^(k)), for 600 hourly training wind data and 120

    or 144 testing wind data, respectively.

    The predefined final mean (center) values of antecedent membership functions

    along with height defuzzifiers (mean values of the consequent membership functions)

    and the standard deviations of the Gaussian antecedent membership functions, in the

    form of a matrix, after six epochs of training, as shown in tables 2 and 3, are used for

    determining the values of non-singleton consequent membership functions, i.e., f_ns(s^(k)), for

    600 hourly training data and 120 or 144 testing data, respectively. In both cases, 600

    training wind data and 120 or 144 testing data for all four antecedent membership

    functions are used as an input matrix, X, in sfls_type1.m and nsfls_type1.m, respectively.

    For training as well as for testing data, we calculated the predicted values Jafri (2005);

    Jafri et al (2005). It is difficult to reproduce all predicted values and the values of

    consequent membership functions for singleton and non-singleton type-1 FLSs in this

    manuscript. Therefore, we will compare the root mean square errors, i.e., RMSEs (BP) with

    RMSEns (BP), only for testing data.

    RMSE_{s} = \sqrt{ \frac{1}{120} \sum_{k=600}^{719} [ s(k+1) - f_{s}(x^{(k)}) ]^2 }    (4)

    RMSE_{ns} = \sqrt{ \frac{1}{120} \sum_{k=600}^{719} [ s(k+1) - f_{ns}(x^{(k)}) ]^2 }

    where

    x^{(k)} = [ x(t-18), x(t-12), x(t-6), x(t) ]^T    (5)

    s(k+1) = x(t+6)
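
    The construction of the input vectors and the RMSE can be sketched as follows. The
    lags (18, 12, 6, 0) and the six-step horizon follow the definitions above; the
    function names, the linear test series and the persistence forecast used for
    comparison are assumptions made purely for illustration.

```python
import math


def make_pairs(series, lags=(18, 12, 6, 0), horizon=6):
    """Build (input vector, target) pairs: [x(t-18), x(t-12), x(t-6), x(t)] -> x(t+6)."""
    pairs = []
    start = max(lags)
    for t in range(start, len(series) - horizon):
        inputs = [series[t - lag] for lag in lags]
        pairs.append((inputs, series[t + horizon]))
    return pairs


def rmse(targets, predictions):
    """Root mean square error between targets and predictions of equal length."""
    n = len(targets)
    return math.sqrt(sum((s - f) ** 2 for s, f in zip(targets, predictions)) / n)


# Hypothetical series that rises by one unit per hour.
series = [float(v) for v in range(40)]
pairs = make_pairs(series)
targets = [target for _, target in pairs]
# Persistence forecast: predict x(t+6) by the latest observed value x(t).
persistence = [inputs[-1] for inputs, _ in pairs]
error = rmse(targets, persistence)
```

    For this series the persistence forecast is always six units too low, so its RMSE is
    six; a trained forecaster would be scored in the same way with its own predictions.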

    It is worth mentioning that training pairs are obtained with testing data; therefore,

    the analysis of testing data will be the same as for training data. We input the predefined initial

    mean values of all antecedent membership functions (table 1) in the case of a singleton

    type-1 FLS because we assume that there are no uncertainties in the data. However, we cannot

    totally ignore the noisy measurement environment; therefore, we tested our final FLS

    forecasters on noisy testing data, i.e.,

    x(k) = s(k) + n(k)    (6)

    where n(k) is 0 dB (decibel) uniformly distributed noise.
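
    A minimal sketch of adding uniformly distributed noise at a given signal-to-noise
    ratio follows. It assumes that the ratio is defined through variances, i.e.,
    SNR = 10 log10(signal variance / noise variance), so that 0 dB means equal
    variances; the sinusoidal test signal and the function name are hypothetical.

```python
import math
import random
from statistics import pvariance


def add_uniform_noise(signal, snr_db, rng):
    """Return signal plus zero-mean uniform noise at the requested SNR (variance ratio)."""
    noise_variance = pvariance(signal) / (10 ** (snr_db / 10))
    # A uniform variable on [-a, a] has variance a**2 / 3.
    half_width = math.sqrt(3.0 * noise_variance)
    return [s + rng.uniform(-half_width, half_width) for s in signal]


# Hypothetical noise-free signal s(k); the noisy measurement is x(k) = s(k) + n(k).
signal = [math.sin(0.1 * k) for k in range(5000)]
noisy = add_uniform_noise(signal, 0.0, random.Random(1))
noise = [x - s for x, s in zip(noisy, signal)]
ratio = pvariance(noise) / pvariance(signal)
```

    At 0 dB the variance ratio is close to one, so the noise is as strong as the signal
    itself, which makes this a demanding test of a forecaster.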

    We accomplished this task for a Monte Carlo set of 60 realizations. After each
    epoch, we used the testing data to see how each FLS performed by computing RMSEs (BP) and
    RMSEns (BP), respectively, using equation (4). This entire process was repeated 60
    times using 60 independent sets of mean and standard deviation of 720 or 744 hourly
    averaged wind data. The predefined values of RMSEs (BP), Mendel (2001), for each of the six
    epochs of tuning are:

    RMSEs (BP) = {0.0548, 0.0431, 0.0322, 0.0261, 0.0237, 0.0232} -------(7)

    The non-singleton FLS shares most of the same parameters as the singleton FLS,
    so we use the partially dependent BP design approach. In the BP design we use only
    two fuzzy sets for each of the four antecedents, so that there are only 2^4 = 16 rules. Each rule
    is characterized by eight antecedent membership function parameters (the mean and
    standard deviation of each of the four Gaussian membership functions) and one
    consequent parameter, $\bar{y}$. More specifically, we initially chose the means of each
    antecedent's two Gaussian membership functions as $m_x - 2\sigma_x$ and $m_x + 2\sigma_x$,
    respectively, and the standard deviation of these membership functions as $2\sigma_x$.
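    As an illustration of the initialization just described, the sixteen rules can be
    enumerated as all combinations of the "low" and "high" fuzzy set on each of the four
    antecedents. The sketch assumes that the mean and standard deviation are estimated
    column by column from the training inputs; that choice, and the name
    `initial_rule_base`, are assumptions made for the example.

```python
import itertools

import numpy as np


def initial_rule_base(train_inputs):
    """Return initial Gaussian means and sigmas, each of shape (16, 4).

    Each antecedent has two sets centred at m - 2*s and m + 2*s with
    standard deviation 2*s, where m and s are estimated from the data.
    """
    X = np.asarray(train_inputs, dtype=float)
    m = X.mean(axis=0)
    s = X.std(axis=0)
    n_antecedents = X.shape[1]
    # Every rule picks the low (-1) or high (+1) set on each antecedent.
    signs = np.array(list(itertools.product((-1.0, 1.0), repeat=n_antecedents)))
    means = m + 2.0 * signs * s
    sigmas = np.tile(2.0 * s, (len(signs), 1))
    return means, sigmas


if __name__ == "__main__":
    demo = np.random.default_rng(0).normal(5.0, 1.5, size=(600, 4))  # synthetic
    means, sigmas = initial_rule_base(demo)
    print(means.shape, sigmas.shape)  # 16 rules, 4 antecedents each
```

    Each row then carries the eight antecedent parameters of one rule; the consequent
    parameter would be stored separately and tuned together with them.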

    For the non-singleton type-1 FLS, we modeled each of the four noisy input
    measurements using a Gaussian membership function. Two choices are possible: (1) use
    a different standard deviation for each of the four input measurement membership
    functions, or (2) use the same standard deviation for each of the four input measurement
    membership functions. We tried both approaches and got similar results because the
    additive noise n(k) is stationary. The predefined average values and standard deviations,
    Mendel (2001), of RMSEs (BP) and RMSEns (BP) are shown in fig. 2 for each of the six
    epochs.
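    The difference between singleton and non-singleton fuzzification can be made concrete
    with a small sketch. It assumes Gaussian antecedent and input membership functions
    combined with the product t-norm, for which the supremum of the product has a simple
    closed form: the input variance is added to the antecedent variance. The function names
    are hypothetical, and `input_sigma` may be a scalar (choice 2 above) or one value per
    input (choice 1).

```python
import numpy as np


def singleton_firing(x, means, sigmas):
    """Firing level of one rule for a crisp (singleton) input vector x."""
    x, means, sigmas = (np.asarray(a, dtype=float) for a in (x, means, sigmas))
    return float(np.prod(np.exp(-0.5 * ((x - means) / sigmas) ** 2)))


def nonsingleton_firing(x, means, sigmas, input_sigma):
    """Firing level when each input is a Gaussian fuzzy number centred at x.

    Under the product t-norm, the sup over each input of the product of two
    Gaussians equals a Gaussian in (x - mean) with the variances added.
    """
    x, means, sigmas = (np.asarray(a, dtype=float) for a in (x, means, sigmas))
    total_var = sigmas ** 2 + np.asarray(input_sigma, dtype=float) ** 2
    return float(np.prod(np.exp(-0.5 * (x - means) ** 2 / total_var)))


if __name__ == "__main__":
    x = [1.0, 0.5, -0.2, 0.3]
    means = [0.0, 0.0, 0.0, 0.0]
    sigmas = [1.0, 1.0, 1.0, 1.0]
    print(singleton_firing(x, means, sigmas))
    print(nonsingleton_firing(x, means, sigmas, input_sigma=0.5))
```

    With `input_sigma` set to zero the two functions coincide, which reflects how the
    non-singleton system reduces to the singleton one when measurement uncertainty is ignored.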


    1.5 Artificial Neural Network Time Series (ANNTS) Modeling,

    Simulation & Prediction

    The McCulloch-Pitts neuron is the earliest artificial neuron; it is described by fixed
    weights, a threshold activation function and a fixed discrete (non-zero) time step for the
    transmission of a signal from one neuron to the next, McCulloch and Pitts (1943). A
    processing unit is termed a neuron or node. An artificial neural network (ANN) is an
    information processing paradigm inspired by the way biological nervous systems, such
    as the brain, process information. A biological neuron has three types of
    components that are of particular interest in understanding an artificial neuron: its
    dendrites, soma and axon. The dendrites receive signals from neighboring neurons. The
    signals are electric impulses that are transmitted across a synaptic gap by means of a
    chemical process. The synapse is a connection between neurons where their membranes
    almost touch and signals are transmitted from one to the other by chemical
    neurotransmitters. The soma, or cell body, sums the incoming signals, fires when
    sufficient input is received and transmits a signal over its axon to other cells. The axon is
    a long fiber over which a biological neuron transmits its output signals to other neurons.
    Neural networks are computer algorithms that process information in a manner modeled
    on the nervous system. They learn from the past to predict the
    future and offer solutions when explicit algorithms and modules are unavailable or too
    cumbersome. Representative data are gathered and training algorithms
    are invoked to learn the structure of the data automatically. There are many types of network,
    ranging from simple Boolean networks (perceptrons), to complex self-organizing
    networks (Kohonen networks), to networks modeling thermodynamic properties
    (Boltzmann machines), Haykin (1994). There are nearly as many training methods as
    there are network types, but some of the more popular ones include back propagation, the
    delta rule and Kohonen learning. A standard network architecture consists of several
    layers of neurons.
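    The McCulloch-Pitts neuron described above (fixed weights, a weighted sum and a
    threshold activation) can be sketched in a few lines. The weights and thresholds chosen
    for the AND and OR gates are illustrative values, not taken from the cited work.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Return 1 if the weighted sum of binary inputs reaches the threshold, else 0."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= threshold else 0


def logic_and(a, b):
    # Both inputs must be active for the neuron to fire.
    return mcculloch_pitts((a, b), weights=(1, 1), threshold=2)


def logic_or(a, b):
    # A single active input is enough for the neuron to fire.
    return mcculloch_pitts((a, b), weights=(1, 1), threshold=1)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, logic_and(a, b), logic_or(a, b))
```

    The weights here are fixed by hand; the learning methods mentioned above differ
    precisely in that they adjust such weights from data.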

    An ANN is configured for a specific application, such as pattern recognition or
    data classification, through a learning process. Learning in biological systems involves
    adjustments to the synaptic connections that exist between the neurons. This is true of
    ANNs as well. We shall emphasize only ANN simulations, which appear to be a


    recent development, although this discipline was established before the advent of
    computers. The five decades since its inception in 1943 brought many important advances in
    ANNs as well as periods of frustration among researchers, Fausett (1994). Recently,
    neural networks (NNs) have enjoyed a resurgence of interest and have begun to emerge as an
    entirely novel approach for the modeling of complex and non-linear phenomena, Hertz et
    al (1991); Bishop (1995); Caudill and Butler (1993); Whitley (1995); Connor et al (1994);
    Dorffner (1996); Abarbanel et al (1993); Gershenfeld and Weigend (1993); Fahlman and
    Lebiere (1990); Kanter et al (1995); Eisenstein et al (1995); Bengio et al (1995); Fessant
    et al (1995); Ruiz-Suarez et al (1995) and Boznar et al (1993). Neural networks are
    particularly useful when problems are driven by data rather than by concept or theory. To
    date, NNs have yielded many successful applications in areas as diverse as finance,
    medicine, engineering, geology and physics; indeed, wherever there are problems of
    prediction or classification, neural networks are being introduced. ANN models have been
    applied to problems involving runoff forecasting and weather predictions, Kang et al
    (1993). ANNs have also been applied to groundwater reclamation problems, Ranjithan and
    Eheart (1993), predicting average air temperature, Cook and Wolfe (1991), predicting
    precipitation, Kalogirou et al (1998), and forecasting of price increments, Castiglione
    (2002). There has been intensive research on NNs, Engel and Broeck (2000); Kinzel
    (1999); Gardner and Dorling (1998); Kulkarni et al (1997); Edwards et al (1997); Geva
    (1998); Giles et al (2001); Khotanzad et al; Biehl and Caticha (2001); Schroder and
    Kinzel (1998); Ein-Dor and Kanter (1998); Priel and Kanter (2000); Al-Hujazi and
    Al-Nashash (1996); Hertz and Krogh (1991); Andreas et al (1994) and Azoff (1994).
    Prediction of time series is an important application of NNs. Since 1995, time series
    prediction by NNs has been studied extensively, Kalogirou et al (1998); Castiglione
    (2002); Engel and Broeck (2000); Kinzel (1999); Gardner and Dorling (1998); Kulkarni et al
    (1997); Edwards et al (1997); Geva (1998); Giles et al (2001); Khotanzad and
    Abaye (1997); Biehl and Caticha (2001); Schroder and Kinzel (1998); Ein-Dor and
    Kanter (1998); Priel and Kanter (2000); Al-Hujazi and Al-Nashash (1996); Hertz et al
    (1991); Andreas et al (1994); Azoff (1994); Gately (1996); Refenes et al (1997);
    Mohandes et al (1998); Zhang et al (1998) and Hill et al (1996). Detecting trends and
    patterns in financial data is of great interest to the business world to support the decision


    making process through time series forecasting, e.g., with neural networks, Lin et al
    (1995). Generally, wind speed is a highly non-linear phenomenon, Kamal and Jafri
    (1996a) and Kamal and Jafri (1997). ANNs have recently been used successfully in the
    prediction of wind speed/energy, Mohandes et al (1998); Kariniotakis et al (1996); Li et al
    (1997); Shuhui et al (2001); Sfetsos (2002) and Kamal (2004). ANNs which are trained
    on a time series are supposed to achieve two goals: first, to predict the time series many
    time steps ahead and, second, to learn the rule which has produced the series. The
    prediction and the learning are not necessarily related to each other, especially for chaotic
    time series, Freking et al
    (2005).

    Burney (1999) studied artificial neural networks (ANNs) with emphasis on predictive
    data mining. Burney and Jilani (2001) applied ANN methods to the forecasting of the
    stock exchange. They applied supervised ANNs to the prediction of stock exchange share
    rates, Burney and Jilani (2003). The most notable work on ANNs was the
    comparison of first and second order algorithms, Burney et al (2004). More and Deo
    (2003) employed neural networks to forecast daily, weekly and monthly
    wind speed. Both feed-forward (FF) and recurrent networks (RN) were used and
    trained on past data in an autoregressive (AR) manner using back propagation (BP) and
    cascade correlation (CC) algorithms. They concluded that the CC algorithm yields more
    accurate forecasts than BP.
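    To make the idea of training a feed-forward network on past data in an autoregressive
    manner concrete, the following is a minimal NumPy sketch of one hidden layer trained by
    back propagation on lagged values of a series. It is not the implementation used in any
    of the cited studies; the function names, network size, learning rate and the synthetic
    sine series are all assumptions made for illustration, and cascade correlation is not shown.

```python
import numpy as np


def lagged_pairs(series, n_lags):
    """Inputs are the previous n_lags values; the target is the next value."""
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - n_lags
    X = np.stack([series[i:i + n_rows] for i in range(n_lags)], axis=1)
    y = series[n_lags:]
    return X, y


def train_ffnn(X, y, n_hidden=8, epochs=500, lr=0.02, seed=0):
    """Train a one-hidden-layer tanh network by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, size=n_hidden)
    b2 = 0.0
    losses = []
    for _ in range(epochs):
        # Forward pass.
        H = np.tanh(X @ W1 + b1)
        pred = H @ W2 + b2
        err = pred - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean squared error.
        g_pred = 2.0 * err / len(y)
        g_W2 = H.T @ g_pred
        g_b2 = g_pred.sum()
        g_H = np.outer(g_pred, W2) * (1.0 - H ** 2)
        g_W1 = X.T @ g_H
        g_b1 = g_H.sum(axis=0)
        # Gradient descent update.
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2

    def predict(X_new):
        return np.tanh(np.asarray(X_new, dtype=float) @ W1 + b1) @ W2 + b2

    return predict, losses


if __name__ == "__main__":
    demo_series = np.sin(0.3 * np.arange(200))   # synthetic stand-in series
    X, y = lagged_pairs(demo_series, n_lags=4)
    predict, losses = train_ffnn(X, y)
    print(losses[0], losses[-1])                 # training error before and after
```

    A real wind speed study would additionally hold out testing data and scale the inputs;
    those steps are omitted here to keep the sketch short.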

    On the basis of this critical analysis and review of ANNs, we are of the opinion that ANNs
    yield better forecasts than the traditional stochastic ARIMA time series model. We have not
    been able to find any relevant research article pertaining to ANNs in the Journal of the
    American Statistical Association of the last two decades. Recent research suggests that
    forecasting with ANNs can be a promising alternative to the traditional ARMA structure. Zhang
    (2003) presented a hybrid ARIMA and neural network model. Ong et al (2005) worked on
    model identification of ARIMA using genetic algorithms. Pai and Lin (2005) obtained
    stock price forecasts using a hybrid ARIMA and support vector machines model. With
    hybridization of intelligent techniques such as ANNs, fuzzy systems, evolutionary
    algorithms and other intelligent systems, one could expect relatively better time series
    prediction. Valenzuela et al (2008) exploited hybridization of
    intelligent techniques and ARIMA models for time series prediction. A critical survey on


    neural networks in business forecasting reflects the modeling issues that arise in
    forecasting applications, Zhang (2004).

    1.6 Thesis Plan

    With critical analysis and review of various time series modeling, simulation and
    prediction techniques, we have been able to identify the unattended areas of research as
    well as the areas which were overemphasized. It has been realized that statistical
    techniques like ARMA, ARIMA, non-seasonal ARIMA and seasonal ARIMA
    have limited capabilities when modeling time series data. Likewise, regression
    analysis time series modeling and simulation have considerable limitations. In such a
    situation, we shall generalize statistical techniques and accomplish
    modeling of time series wind data.

    We shall compare MTM (Markov Transition Matrices) with stochastic time
    series models. By comparing statistical and generalized techniques for
    stochastic time series, we expect to find pertinent and useful results. The finer
    statistical details are useful for identifying a proper stochastic time series model, such as
    the comparison of MTM with ARMA as a simulator, the suitability of short-range
    versus long-range prediction, the stochastic simulator in ARIMA and, indeed, the
    heteroscedasticity/homoscedasticity tests in regression analysis of time series, partly
    on some weather data.

    We find the recent trends in modeling and simulation of time series only in the
    feed-forward back propagation neural network (FFBPNN); therefore, we shall attempt
    FFBPNN on our data.

    We shall apply singleton and non-singleton type-1 back propagation (BP)
    designed sixteen-rule fuzzy logic systems (FLSs) to hourly averaged wind data,
    which, to our knowledge, nobody has attempted to date.

    We shall also use design-free fuzzy logic to obtain predictions on wind data,
    which again, to our knowledge, has never been done on wind data to date.

    We shall perform Mackey-Glass simulation on wind data.

    There are diverse categories of time series models, like neuro-fuzzy logic, Burney et al
    (2006), Burney and Jilani (2007); second order modeling of fuzzy time series, Tsai
    and Wu (1999); multivariate fuzzy logic, Jilani and Burney (2007); autoregressive


    fuzzy logic, Kezuhiro et al (1997); fuzzy predictor by extrapolating a time series
    and parallel structure fuzzy system, Kim et al (2001), which would, of course, have
    extensive applications in business and trade related activities, risk assessments
    and small scale weather or climate predictions.