Automated Mathematical Modelling for Financial Time

8/8/2019 Automated Mathematical Modelling for Financial Time

1/7

AUTOM ATED M ATHEMATICAL MODELLING FOR FINANCIAL TIMESERIES PREDICTION USING FUZZY LOGIC, DYNAMICAL SYSTE MSAND FRACTAL THEORY

OSCA R CASTILLO PATRICIA MEL INDepartment of EconomicsUABC University CET Y S Universityocastillo@besttj. tij.cetys.mx em elin abe sttj .tij cetys.mxSchool of Engineering

P.O. Box 4207 Chula Vista CA 91909, USA

ABSTRACT:We describe in this paper a new method to perform automated mathematical modelling fo r jnancial t ime seriesprediction using Fuzzy Logic Techniques, Dynamical Systems and Fractal Theory. The main idea in this paper isthat using Fuzzy Logic techniques we can simulate and automate the reasoning process of the human experts inmathematical modelling for jn an ci al time series prediction. Ou r new method fo r automated modelling consists ofthree main parts: Time Series Analysis, Developing a set of Admissible Models and Selecting the "Best" modef.Our method for Time Series Analysis con sists in the use of the fracta l dimension of a set of points as a measure ofthe geometrical complexity ofthe time series. Our method fo r developing a set of admissible dynamical systemsmodels is based on the use of Fuzzy Logic techniques to simulate the decision process of the human experts inmodelling fk an ci al problems. The selection of the "best" model fo r Financial Time Series Prediction (FTSP) isdone using heuristics fr om the experts and statistical calculations. This new method can be implemented as acomputer program and can be considered an intelligent system fo r automated mathematical modelling fo r FITP.

1.- INTRODUCTIONWe describe in this paper a new method to perform automated mathematical modelling for FTSP using FuzzyLogic techniques, Dynamical Systems and Fractal theory. The idea of using Dynam ical Systems Theory an dFractal Theory as alternative approaches for predxtion can be justified if we consider that traditional statisticalmethods only have limited success in real world complex financial applications, and this is mainly bec ause manyfinancial problems show very complicated dynamics in time. Traditional statistical methods assum e that the erraticbehavior of a time series is mainly due to a external random error (that can not be explained). However, aDynamical Systems approach, using non-linear mathematical models, can explain this erratic behavior because"cha os" is an intrinsic p art of this type of models. It is a well known fact from Dynam ical Systems (Devaney, 1989)that even very simple non-linear mathematical models can exhbit the behavior known as "chaos" for certainparameter values, and therefore are good cankdates to use as equations for prediction in complex financialsituations. Fractal Theory (M andelbrot, 1983) also offers a way to explain the erratic behavior of a time series, butthe method is geometrical in the sense that the fractal dimension is used to describe the complexity of thedistribution of the data points.We describe a prototype implementation of our new method for Automated Mathematical Modelling(AMM) as a computer program written in the PROLOG programming language. This computer program can beconsidered an intelligent system for the dom ain of FTSP because it uses Artificial Intelligence (AI) techniques toobtain the "best' mathematical model for a given financial problem. The use of AI techniques is to achieve th e goal

120


2/7

of automated modelling of financial problems by simulating (in the computer) how human experts in this do mainobtain the "best" model for a given problem. Given a financial time series the intelligent system developsmalhematical models based on the geometry of the data. The m ethod for A M M consists of three main parts: TimeSeries Analysis, Developing a Set of Admissible Models and Selecting the Best Model. First, the computerprogram uses the fractal dimension to classify the com ponents of the time series over a set of qualitative values,then the program uses this information to decide (using a fuzzy rule base) which dynamical models are the mostappropriate for the data, and finally the program decides which model is the "best' one for prediction usingheuristics and statistical calculations. In i3 prior prototype intelligent system developed by the authors (Castillo &MeXin, 1995) the method used didn't conside r using the fractal dimen sion as a mea sure of classification an d alsodidin't consider a fuzzy rule base for the simulation of the decision process. The use of Fuzzy logic in financialapplications has been well recognized (Deny, 1994) and many applications have been developed. in thi s case wecame to the conclusion that the best way to convey the information of financial modelling problems was usingf u z q sets (Badiru, 1992). Also, we think that the best way to reason with uncertainty in this case is using FuzzyLogic. The intelligent system develops only the kind of mathematical models that are m ore likely to give a "good'preldiction based on the know ledge that hum an experts have about this m atter. This knowledge is contained in theknowledge base of the intelligent system, and is the m ain factor in limiting the num ber of m odels that th e systemexplores. The intelligent system also has some generalized knowledge about the mathematical models that weexpect to discover in the financial domain. This knowledge is expressed as families of parametrized mathematicalmodels.

2.- DESCRIPTION OF THE FINANCIAL MODELLING PROBLEMFinancial forecasting systems are necessarily limited by the data presented to them. Afterall, it is the data uponwhich they make decisions or estimates. Therefore, for any given market of interest, one must determine whichdata is relevant to the overall requirements. In the case of financial forecasting systems, such data is often derivedfrom various time series, usually represented as prices and volumes, but also as open interest, dividends. etc.Sufficient "historical" data is needed for developing software tools intended to extract charac teristics of previousmarket behavior in an effort to apply this knowledge to trading or investing in the "future" market (Caldwell,1994). Many financial time series are composed of relatively long-term trends plus some degree of periodic andnoise-like components. Periodic components may include those related to seasonalities or limit-cycles in the data.The linearities and periodicities of the series can be removed, leaving the random or noise-like components. Suchnoise-like compon ents may truly be randalm and be describable by a probability distribution function. On the oth erhand, they may only appear to be random and/or non-periodic, while actually representing an underlyingdeterministic non-linear dynamic time series. In fact, our hypothesis is that in many financial problems this is thecase. and that "chaotic" time series are present in different scales and for different kinds of d ata. Some evidence onthe correctness of this hypothesis has been found over the years by several researchers. Brock (1988) has foundevidence of complex dynamics in stock returns and strong evidence of "chaotic" behavior on T-bill returns.Eckman et. al. (1988) also found evidence of non-linear dynamics in time series of weekly stock returns. Castillo(1991) also found evidence of non-linear dynamics in time series of world oil prices. Jurik (1994) described amethod potentially useful for determining the forecast horizon for T-Bonds using "Chaos A nalysis". T his evidenceis the main reason why we are considering non-linear dynamical systems models in our space of mathematicalmodels for the computer system.With all of this in mind our goal in this paper is to show how AI techniques can be used to autom ate theprocess of discovering the best model for a financial problem. The human financial experts usually try several (insome cases many) mathematical m odels before they are satisfied with the "goodness" of one m odel. The experts usetheir knowledge about modelling financial problems to limit their search of models, in this way obtaining the"best" results as quickly as possible. The "goal" of our work is to capture this knowledge of modelling in acomputer program, in this way obtaining a software tool capable of emulating intelligent behavior in th is dom ain.In the remaining sections of this paper we: describe the basic algorithm for discovering mathematical models, then

121


3/7

the implementation of the algorithm as a computer program and finally we discuss some lines of future researchwork in this area.Our new method for solving the financial m odelling problem is based on several novel ideas. W e considerthat the financial modelling problem can be divided in three main parts: Time Series Analysis, Selection ofAppropriate Models and Selection of the Best Model. The first part of the problem consists in obtaining the timeseries components from the data. Our solution to this part of the problem is a new classification schem e based onthe notion of the fractal dimension . This classification scheme is a one to one m ap between the fractal dim ensionof the data set and the qualitative values for the components of the time series. Once this part of the problem issolved. the second part consists in simulating an expert decision process that gives us the set of MathematicalModels appropriate for the geometry of the time series. This expert decision process is simulated using AItechniques and is the main part of the intelligent system. The third part of the problem consists in designing amethod to compare all the models obtained in the second part, to obtain which one is the "best" model forpredicting the time series. Our method to compare all the models has to consider statistical measures of goodnessbetween non-linear dynam ical systems and linear statistical models, to decide which model fits best the data set(time series).

3.- MATHEMATICAL MO DELLING FO R TIME SERIES PRED ICTIONThe problem of achieving automated mathem atical modelling can be defined as follows:Given: A data set (time series) with n data points, D = {d l, d,, ..., dm ) where di E Rn, i = 1 ...,m, n = 1, 2 ,... .Goal:From the data set D, discover automatically the "best" mathematical model for the time series.This problem is not a simple one, because in theory there can be an infinite number of mathematicalmodels that can be build for a given data set (Rao & Lu: 1993). So the problem lies in knowing which m odels to tryfor a data set and then to select the "best" one. We ca n state the problem more formally in the following way:Let M be the space of mathematical models defined for a given data set D. Le t MA = { M l , ..., Mq} be the set ofadmissible models that are considered to be appropriate for the geometry of the data set D. The problem is to findautomatically the "best" model Mb for time series predlction.We consider mathematical statistical models of the following form:

Y = F(X) + E (0,o)where E (0,o) represents a 0-mean Gaussian noise-process with standard deviation CY. F(X) is a polynomialequation in X , where t he predictor variables are in the vector: X = (Xl , X,, ..., Xp).We consider m athematical m odels as "dynamical systems" of the following form:dY/dt = F(Y)where Y is a vector of variables of the form: Y = (Y1,Y,, ..., Yp) and F(Y) is a non-linear function of Y. Otherkind of mathematical models are the discrete "dynamical systems" of the following form:Yt = F(X)where X = (Y t-l, Yt-,, ..., Yt-p) and F(X) is a non-linear function of X. Note that in this case we have deterministicmodels expressed as differential or difference equations.We consider the use of the fractal dimension as a mathematical model of the time series in the followingform: d = [log(N)/log( Ur)] where d is the fractal dimension for an object of N parts, each scaled dow n by a ratior. For an estimation of this dimension we can use the following equation: N(r) = p[ l / r ] ,where N(r) = number ofboxes contained in a piece of object and r = size of the box. We can obtain the box dimension of a geometricalobject (Mandelbrot, 1983) counting the number of boxes for different sizes and performing a linear logarithmicregression o n this data. For our particular case the geometrical object consists of the curve constructed using the setof points from the time series.The models for the statistical methods can be linear as well as non-linear equations. We show below somesample statistical models (Mendenhall & Reinmuth, 1981) hat the intelligent system explores:

d

a) linear regression: Yt = a + btc) logarithmic regression: lnY, = a + blnt b) quadratic regression: Yt = a + bt + ct 2d) first order autoregression: Yt = a + by,-,

122


4/7

The mathematical models for continuous dynamical systems can be one-dimensional, two-dimensional orthree-dimensiona l. W e show below some sample models (Rasband, 1990) that the intelligent system explores:

a) logistic differential equation: dY /dt = a Y (1 Y ,)b) lotka volterra two dimensiona l: {dY,/dt = a y , - bY,Y,, dY2/dt = bYlY2 - CY,}c) lorenz three dimensional: {dY l/dt = ay , - ay,, dY,/dt = - YlY, +by, - Y,, dY,/dt = YlY, - CY,}The mathematical m odels for discrete dynamical systems can also be one, two or three d imensional. Weshow below some sample models (Rasband, 1990) that the intelligent system explores:a) logistic difference equation: Y,+, = aYt(l-Yt)b) lotka volterra two dimensiona l: {Yt+ l= a y , - by&, X t+ l = bY& - c q >c) henon map two dimensional: {Yt+, =T , T+, a -5, by,)In all of the above mathematical models a, b and c are parameters that need to be estimated using thecorresponding numerical methods. For example, for the regression models we can use the least squares method for

parameter estimation, b ut for the differential equations we need to use a Gauss-Ne wton type method.The algorithm for automated mathematical modelling for prediction is shown i n Figure 1.~ ~~~ ~~ ~~~ ~~

STEP 1 Read the data set D = {d l, d,, ...,dm}.STEP 2 Time Series Analysis of the set D to find the components.STEP 3 Find th e set of Admissible models MA = {M,, M,, ..., Mq}, using the qualitative values of thetime series components.STEP I Find the "Best" mathematical model Mb from the set MA using the measures of "goodness" oeach of the mode ls from thie set M A.Figure 1 - Algorithm for automated mathematical m odelling

We call this algorithm IDIMM (for Intelligent Discovery of Mathematical M odels) and is an integrationof Artificial Intelligence techniques with Dynamical Systems Theory, Fractal Theory and Statistical Methods, toobtain mathematical models for prediction of time series in the financial domain. In the following section we willshow how this algorithm can be implemented to achieve the goal of AMM fo r FTSP.

4.- IMPLEMENTATION OF THE METHOD FO R AUTOMATED M ODELLINGThe implementation of the method for NUM as a computer program was done using the PROLOG programminglanguage. The choice of PROLOG is because of its symbolic manipulation features and also because it is anexcellent language for developing Prototypes (Bratko, 1990). The computer program was developed using anarchitecture very simila r to that of an intelligent system (knowledge base, inference engin e an d user interface) withthe addition of a numerical module for parameter estimation. We will focus our description of implementationdetails only to the knowledge base of the intelligent system. In the computer program, the knowledge base is thepafit that simulates the proc ess of model discovery described by step 1 to 4 in the algorithm of the last section.Accordingly, the knowledge base consists of three Expert Modules: Time Series Analysis, Expert Selection andBest Mode l Selection. In the following lines we will describe briefly each of this m odules.4.1 Description of the Time Series Analysis ModuleThis module is the implementation of Stelp 2 of the algorithm and contains the knowledge necessary for time seriesanalysis, i.e., the knowledge to extract from the data the time series components. Our method for time seriesanalysis consist in the use of the fractal dimension of the set of points D as a measure of the geometrical

12 3


5/7

complexity of the time series. We use the value of the fractal dimension to classify the time series components overa set of qualitative values. Our classification scheme was obtained by a combination of expert knowledge andmathematical modelling for several samples of data.To give an idea of this scheme we show in Table 1 somesample rules of this module.

Table 1.- Sample rules for time series analysisIF THEN

Fractal_dimension(D)E(0.8,1.2)Fractal-dimension(D)E [1.2,1.5)Fractal-dimension(D) E [1 .5 , l . )Fractal-dimension(D)E [1.2,1.4)Fractal-dimension(D)E [1.4,1.6)Fractal-dimension(D) E [1.6,l .7)Fractal-dimension@) [1.7,l.S)Fractal dimension(D)>l.S

Trend = linear, Time-series = smoothTrend = non-linear, Time-series = cyclicTime-series = erraticPer iodicqar t = simplePer iodicqar t = regularPericldicgart = difficultPer iodicgar t = veq-difficultPeriodic part = chaotic

We performed several experiments with real data sets to decide on the classification needed for this "TimeSeries Analysis Module" and we found that for the moment classifying the periodic components in "simple","regular", "difficult" and "chaotic" was sufficient. Also, we only classify the "trend" component in two kinds:"linear" and "non-linear". Of course, it is possible that we may need a better classification in the future, for a moreaccurate implementation of this Module, but now we are only showing how the method can be implemented.In conclusion our method for time series analysis consists of a one to one mapping between the fractaldimension of the set D and the qualitative values of the time series components. This set of qualitative values forthe components is the information needed a s input for the "Expert Selection Module" (implementing Step 3 of thealgorithm).4.2 Description of the Expert Selection ModuleThis module is the implementation of the step 3 of the algorithm a nd contains the knowledge necessary to selectthe kind of mathematical models more appropriate for the type of data given, i.e., given the qualitative values ofthe time series components decide which models are more likely to give a good prediction. Our method forselecting the models consists of a set of fuzzy ules (heuristics) that simulates the hu man expert decision process ofmodel selection. In our approach the qualitative values of the time series components are viewed as fuzzy sets(using the fractal dimension as a classification variable). We have mem bership functions for each of th e qua litativevalues of the time series components. Also, the qualitative values of the "Type-Model" variable are considered asfuzzy sets and we have membership functions for each of this values. To give and idea of the way this Expertknowledge is structured, we show in Table 2 some sample rules of this module.

Table 2.- Sample fuzzy rules for model selectionIF THEN

Dim Trend Per iodicqar t Type-Modelone non-linear simple logistic-differential-equationtwo non-linear simple lotka-volterradifferential-equationthree non-linear regular lorenz-differential-equationone non-linear simple logistic-difference-equationtwo noq lin ear regular lotkavolterra-difference-equationThe rules in Table 2 show how this expert module selects the appropriate models for a given financialproblem, using as information the dimensionality of the problem and the qualitative values of the time seriescomponents. Each rule of this expert m odule contains a piece of knowledge about the problem of model selection.

12 4


6/7

4.3 Description of the Best Model Selection ModuleThis module is the implementation of step 4 of the algorithm and contains the knowledge to select the "best"mathematical model for prediction, i.e., g h en the set of selected models generated by step 3 , decide which model isthe "best" one to predict the time series. Our method for selecting the "best' model consists of comparing th e Sumof Squares of Errors (SSE) for all the models and selecting the one that minimizes SSE. This criteria has theadvantage of been valid for all the types of models that we consider for the intelligent system (statistical modelsand non-linear dynamical systems models). To give an idea of our method, we show in Table 3 a sample casewhere the set of selected models is MS={MI, MZ,M3, M4, M5, M 6} and the model with lowest SSE is M4.

Table 3.- M ethod for best model selection using the SSE- MODEL TYPE SQUARES SUM BEST MODEL-M 1: Y = a l +bit Statistical SSE 1 noM2: Y = a2 + b2 t + c2t2 Statistical SSE2 noM3: InY = h a 3 + b31nt Statistical SSE3 noM5: lotka-volterra-difference-eq Dynamical SSE5 noMg: lorenz-differential-eq Dyna mical SSE6 noM4: logistic-differential-eq Dynamical SSE4 Yes-

In Table 3 the best model is M4 because S E m i n = SSE4 where:S E m i n = min(SSE1, S E 2 , S E 3 , S E 4 , SSE5, SSE6)

The im plementation of this minimization procedure is easy once the numerical values of the Sum of the Squares(SSE) are calculated by the numerical module.

5.- COMPARISONWITH RELATED WORKThere has been some work recently in the area of numerical law discovery, but much of the research in M achineLearning is in other areas such as induction (Sleeman and Edwards, 1992). We think that this is mainly because"discovery" is a more difficult kind of "learning". However, we can state that autom ated m athem atical mode lling isvery important for many domains of application for obvious reasons. For example in the engineering an d financialdomains is critical to obtain mathematical models for the problems, to be able to understand them and also to beable to predict their future behavior. This is why we consider that m ore research in this area i s very important.Similar work with respect to Machine Learning can be found in a paper by M oulet (1992), however theapproach to model discovery is different to ours (this can be seen from the heuristic method by Moulet). Also in apaper by Rao and Lu (1993) we can see a method for model discovery for engineering domains, but also w ith adifferent approach to ours (his approac h i!; similar to "clustering"). Also, there is another very important differencewitlh other authors, in the kind of mathematical models that we are considering for our intelligent system. We areconsidering non-linear mathematical modlels from the theory of Dynamical Systems and not only linear regressionmodels like other authors. In this paper we have successfully generalized our previous work on this matter (Castilloand M elin, 19941, by considering this type of non-linear models.

6.- CONCLUSIONSWe described in this paper a new m ethod to perform automated mathematical modelling for financial time seriesprediction. Also, we described the implementation of this new method as a prototype intelligent system for thedomain of financial problems. We have tested this computer program with real financial data with encouragingresults. In this paper we have successfully generalized our previous work on this matter (Castillo & Melin, 1994)by using non-linear dynamical systems models. Also, we have improved our work presented last year in this forum(Castillo & Melin, 1995) by using the fractal dimension for time series analysis and by u sing Fuzzy Logic

125


7/7

techniques for the simulation of the expert reasoning process. The importance of our work ca n be measured if weconsider that accurate prediction is critical in the a reas of Finance, Economics, Management a nd A ccounting. W ethink that ou r work can help i n m aking more easy the job of mathema tical modelling and prediction in all of theseareas. The implementation of our new method for Automated Mathematical Modelling can be improved inseveral ways. First of all, the Time Series Analysis Module can be improved by designing a better classificationscheme using the fractal dimension (with t h s malung it more accurate). Second of all, in the Expert SelectionModule we can increase the number of different mathematical models that can be used for the fina ncial dom ain, inthis way making the intelligent system capable of solving a bigger variety of problems. Finally, we think that otherAI techniques can be tested for the impleme ntation of our method for AMM with the idea of possibly improvingefficiency. We are plann ing to follow this lines of research in the near future.

ACKNOWLEDGMENTSThe authors would like to thank the Research Grant Committee of UABC University for supporting this project,the Department of Economics of UABC University for the time and resources given to this project and the Schoolof Engineering of CET YS University for the collaboration in this research and for providing the Compu tingfacilities needed for this research work.

REFERENCESBadiru, A.B. (1 992). "Exp ert Systems Applications in En gineering and M anufacturing", Prentice-Hall.Bratko, I. (1990). "Prolog Programming for Artificial Intelligence", Addison Wesley.Brock, W.A., 1988. "NonLinearity and Complex Dynamics in Economics and Finance", The Economy a s anEvolving Com plex System, Edited by Anderson, P.W ., Arrow, K.J., Pines, D, Addisson W esley, 5, 77-97.Caldwell. R.B. 1994. "Design of Neural Network-based Financial Forecasting Systems: Data Selection and DataProcessing", Neurovest Journal, 2 (3 , 2-20.Castillo, 0. 1991. TJna Introduccibn a1 Estudio del Caos en Sistemas Dinamicos. Su Aplicacion en E conomia

Matem atica", Cuadernos de Econo miaN o.4, Editorial UABC.Castillo, 0. and P. M elin (1994). "An Intelligent System for Discovering Mathem atical Models for financial tim eSeries Prediction", Proceedings of the IEEE Region 10's Ninth Annual International Conference, pp. 2 17-22 1,IEEE C omputer Society of Singapore.Castillo, 0. and P. Melin (1995). "An Intelligent System for Financial time Series Prediction CombiningDynam ical System s Theory, Fractal Theory and Statistical methods", P roceedings of the IEEE/IAFE 1995Conference on Compu tational Intelligence for financial Engineering (CIFER), pp. 151 155.Derry, J. (1994). 'I A Fuzz y Expert System and M arket Psychology", NeuroVest Journ al, Vol. 2, No. 1, pp. 20-22.Devaney, R. (1989). "A n Introduction to C haotic Dynamical Systems", Addison W esley Publishing Co mpany.Eckmann, J.P.,Kamphorst, S.Q., Ruelle D. and Scheinkman J., 1988 "Lyapunov Exponents for Stock Returns",The Economy as an Evolving Complex System, Edited by Anderson, P.W., Arrow. K.J., Pines, D. AddissonWesley, 5, 301-304.Jurik,M., 1994. "Estimating Optimal Forecast Distance using C haos Analysis", Neurovest Journal, 2(1), 11-19.Mandelbrot, B. (1983). "T he Fractal G eometry of Nature", W . H. Freeman and Company.Menden hall, W. and J.E. Reinmuth (198 1). "Statistics for managem ent and Econom ics", Wadsw orth International.Rasband, (1990). "Chaotic Dynamics of Non-Linear Systems", John Wiley & Sons.Rao, R. B. and S. Lu (1993). "An Knowledge-Based Equation Discovery System for engineering Dom ains, IEEESleeman, D. and Edwards P. 1992. Proceedings of the Ninth International Workshop on Machine Learning,Expert, pp. 37-42.Morgan Kaufman Publishers, San Mateo, CA.

126

Automated Mathematical Modelling for Financial Time

Documents

Transcript of Automated Mathematical Modelling for Financial Time