International Transaaction Journal of Engineerring, Management, … · 2020. 4. 13. · time series...

*CorrespondinTransaction Jo1906-9642 CO

a Departme A R T I C LArticle historyReceived 11 JuReceived in reNovember 201Accepted 29 NAvailable onli2019 Keywords: Moving aveAutoregressArtificial neSupport vecFinancial mExtreme leamachine.

1. INTRFinanc

investmentGiven theiprediction.

©

ng author (Fahournal of EngineODEN: ITJEA8

InteMan

COMPCOMP

Beenish

nt of Manag

L E I N F O y: uly 2019

evised form 20 19 November 2019ine 09 Decembe

erages; sive models; eural networkctor machine

marketing; arning

RODUCTIcial marketst, institutionir importanc. These pr

©2020 Internatio

eem Aslam). Teering, ManagemPaper ID:11A04

ernationanagemen

PARATIVPUTING M

h Bashir a, F

gement Scienc A B S T

9 er

k; e;

Findue to prices rsoft cocomplenonlinea compcomputexchanunique dynamitraditioautoregartificialearningshows modelsfinanciamodelsOur reregardiinvestopredictiDiscipl©2020 IN

ON s are the banal investmce, much atredictions a

onal Transaction

Tel: +92-321-50ment, & Applied4J http://TUEN

al Transat, & App

http

VE ANALMODELS

Faheem As

ces, COMSA

R A C T nancial marthe nature reflect. Ecoomputing mex and dynaear models.parative anting modelsge coveringregarding t

ics of the onal modelgressive moal neural g machinesthat soft in tradingal time se, the artifici

esults are mng the car

ors to betteive ability olinary: ManNT TRANS J EN

arometer ofment acting ttention hasare then in

n Journal of Eng

063253. Email:d Sciences & TeNGR.COM/V11/

action Joplied Scie

p://TuEngr

LYSIS OFS FOR TR

lam a*

ATS Universi

rket forecasof the timenomist take

models assuamic in naThis resear

nalysis of s for the trag the dailythe target va

market nols, we inodels whernetworks,

s to have a computing

g signals preries data. ial neural nmore convireful selecter understaof the modenagement SNG MANAG SCI

f any econoas a part o

s been givenncorporated

gineering, Mana

: fahimparachaechnologies. Vol11A04J.pdf DO

urnal of ences & T

r.com

F TRADITRADING

ty Islamabad

sting alwaye series dataes this data ume that t

ature and carch paper setraditional

ading signay data fromariable it preot just the nclude moreas in sof

support vcomprehenmodels pe

rediction shHowever,

etwork becoincing as cion of marand marke

el. ciences (FinI TECH.

omy. It facilof mutual fn by researinto financ

agement, & App

[email protected] lume 11 No.4 ISOI: 10.14456/IT

EngineerTechnolo

PA

TIONAL G SIGNAL

d, PAKISTAN

ys remains a as well thas a linear time-series an be betteettles the deforecasting

als predictiom 1997 to edicts which

next periooving averft computinector mach

nsive analyserform betthowing non

amongst omes the becompare torket feature

et behavior

nancial Scie

litates internfunds and prchers and acial risk m

plied Sciences &

©2019 InternaSSN 2228-9860 TJEMAST.2020.7

ring, ogies

APER ID:

L AND SOLS PRED

N.

a challengihe informatiprocess whdata is n

er analyzedebate by cog models on of Pakist2018. Our h reflects thod future rages as ng models hines and sis. This comtter than trn-linear behall soft co

est predictivo existing es, which r and imp

ence).

rnational traportfolio macademia re

models and

& Technologies

ational eISSN

70 1

: 11A04J

OFT ICTION

ing issue ion stock

hereas the nonlinear, d through onducting and soft tan stock study is

he overall price. In well as we take extreme

mparison raditional havior of omputing ve model. literature can help rove the

ade, foreignanagement.egarding itsdeveloping

J

n . s g

2 Beenish Bashir, Faheem Aslam

trading strategies. (Moradi & Rafiei, 2019; Zhong & Enke, 2019). As the importance and predictability of financial markets are supported by previous studies

(Fama, 1965; Malkiel & Fama, 1970) but when it comes to the modeling techniques for forecasting there is no consensus. The dispute is much related to the nature of the data financial markets generate. The long-term investors are more concerned about the fundamentals of companies like price-earnings ratio, revenue, expenses, assets, liabilities, management policy and financial ratio (Lam, 2004). Whereas, short-term investors rely on price movements of stock, understanding market behavior through different market features (Murphy, 1986). Technicians avoid the analysis of all economic factors by focusing on pattern recognition of price considering this information enough for future price determination (Hu et al., 2015; Patel et al., 2015; Teixeira & De Oliveira, 2010; Żbikowski, 2015). Tsinaslanidis and Kugiumtzis (2014) use numerous technical indicators for market analysis which is time-series data in nature. Huang et al. (2019) also use different momentum and volatility indicators for bitcoin predictive analysis.

For short-term prediction, traditional forecasting models consider time series data as a linear process and apply the smoothening and autoregressive process to predict future price movement (Kumar & Murugan, 2013; Lin et al., 2012; Wang et al., 2012). Financial time series are complex, nonlinear, dynamic and chaotic. Soft computing models can capture this nonlinear behavior (Cheng & Wei, 2014; Huang & Tsai, 2009; Lee, 2009). Tay and Cao (2001) compare artificial neural networks (ANN) and support vector machines (SVM) and confirm their suitability for prediction of the stock market. Huang, et al. (2005) use SVM for predicting the directional movement of the NIKKIE 225 index. Kara et al. (2011) also employ SVM and ANN for predicting the Istanbul Stock Exchange.

Our work is an addition to existing literature to settle the debate amongst traditional forecasting models and soft computing models for trading signals prediction. All the existing literature is related to price prediction, completely ignoring the trading signals forecasting that can enlighten investors about entry and exist decisions. Secondly, we also focused on the relevant feature selection to improve the predictive ability of models and to better understand market behavior. Lastly, our analysis confirms the nonlinear and dynamic nature of financial time series data negating the assumption of traditional forecasting models.

2. LITERATURE REVIEW In the past ten years, various time series methodologies have been developed for financial market

forecasting that helps improve investment decisions (Teixeira & De Oliveira, 2010). Traditional forecast models and soft computing models are two major approaches (Wang et al., 2011). In both modeling approaches, financial data is considered as a time series data which is actually numerical observations accumulated in sequential order over a period of time. (Brockwell & Davis, 2009; B. Wang et al., 2012). This organization of financial data enables model developers to utilize time series tools to understand market behavior (Oliveira & Meira, 2006). Further, these tools pave a way to do data mining tasks, such as classification, trend analysis, seasonal effect, cycles and extreme event detection (Cryer & Chan, 2008).

The traditional models involve averages, regression and autoregressive models based on the linearity of normally distributed variables (Lin et al., 2012; Wang et al., 2012). In reality, financial time series data is dynamic, chaotic, nonlinear and highly volatile which makes its prediction

*Corresponding author (Faheem Aslam). Tel: +92-321-5063253. Email: [email protected] ©2019 International Transaction Journal of Engineering, Management, & Applied Sciences & Technologies. Volume 11 No.4 ISSN 2228-9860 eISSN 1906-9642 CODEN: ITJEA8 Paper ID:11A04J http://TUENGR.COM/V11/11A04J.pdf DOI: 10.14456/ITJEMAST.2020.70

3

complex as compare to other non-financial time-series data (Vanstone & Finnie, 2009). Financial time series inherit these characteristics from economic factors, investor’s sentiments, political events and movement of other stock markets (Kara et al., 2011).

The nature of financial time series data calls for flexible and adaptive models for future price prediction (Huang & Tsai, 2009) and soft computing models serve the purpose (Lin et al., 2012). Soft computing models, such as artificial neural networks, support vector machine and extreme learning machine give better predictions with high accuracy (Lee, 2009). Most of the soft computing models can handle nonlinear relations through relevant market features with less statistical assumptions (Atsalakis & Valavanis, 2009). Weng et al. (2017) find that soft computing models are appropriate for developing rules when using with a rich knowledge database. Zhong and Enke (2017) conclude a trading strategy based on classification models yield high returns than the benchmark T-bill strategy. Liang et al. (2009) find that the non-parametric model outperforms parametric models in financial market forecasting. As soft computing models are self-adaptive and are more tolerant of imprecision (Cheng & Wei, 2014). Hence, we investigate whether these findings hold for trading signals prediction of Pakistan stock exchange or not.

3. METHOD 3.1 DATA

We use the Pakistan Stock Exchange (KSE-100 index) daily quotes from 7/02/1997 to 18/07/2018 with the total observations 5168. The data set is bifurcated into a training dataset and testing dataset. The training dataset has 3764 observations cover 70% of total observations and the testing set has 1404 observations include the remaining 30% of total observation.

3.2 TARGET VARIABLE “T” The target variable T (Equation 1) gives a holistic picture of stock prices by incorporating overall

dynamics in the following days. This cannot be merely done through price movements, therefore, the target variable is defined as the sum of all variation above an absolute value of target margin 𝑥% (Torgo, 2011) 𝑇 = ( 𝑣 ∈ 𝑉 : 𝑣 > 𝑥% ∨ 𝑣 < −𝑥%) (1)

where 𝑉 (Equation 2) is k percentage variation of current close price and the following k days price

average.

𝑉 = 𝑃 − 𝐶𝐶 (2)

and the daily average price is computed as Equation 3.

𝑃 = 𝐶 + 𝐻 + 𝐿3 (3)

𝐶 , 𝐻 𝑎𝑛𝑑 𝐿 are close, high and low prices for the day i respectively.

The target variable is T value and develops a model that predicts this value by using the feature’s

4 Been

informationusing the v

The this the averaThe only cvalues let a

3.3 INPUWe us

selected frothrough a r

3.4 METThis se

nish Bashir, Fa

n that givesvalues in Eq

hreshold 0.1age of last fcare should an investor

UT FEATUse fifteen feom the list orandom fore

Fig

Sr. No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

THODOLOection expla

aheem Aslam

s three tradquation 4.

𝑠𝑖𝑔𝑛𝑎 and -0.1 ar

four days avbe taken is

trade on mi

URES eatures (Tabof features est techniqu

gure 1: Var

TableSelected TAverage TAverage TAverage TATR tr. ADX (WeADX DipADX DinADX DXBollinger Bollinger Bollinger Bollinger SAR (ParRun MeanRun Stand

OGY ains both the

ding actions

𝑎𝑙 = 𝑠ℎ𝑜𝑙𝑑𝑏re heuristic verage prices that very inor variatio

ble 1) as theavailable in

ue (Figure 1

iable impor

e 1: List of STechnical IndTrue Range True Range HTrue Range Lo

elles Wilder’sp n

Bands Down Bands MovinBand Band Up

rabolic Stop ann dard Deviation

e traditional

s (sell, hold

𝑠𝑒𝑙𝑙 𝑖𝑓 𝑇 <𝑖𝑓 − 0.1𝑏𝑢𝑦 𝑖𝑓 𝑇 >and any oth

e that is 2.5%high values

on, incurring

e input of thn the Techni).

rtance accor

Selected Tedicators

High ow

s Directional M

n ng Average

nd Reverse)

n

l and soft co

d and buy),

−0.1𝑇 0.1> 0.1

her value ca% higher thas will generg higher ris

he soft comical trading

rding to the

echnical Ind

Movement Ind

omputing m

this transfo

n be used. Han the currenrate fewer sk (Torgo, 2

mputing modrule packag

random for

icators

dex)

models used f

ormation is

However, thent close (4xsignals and

2014).

del. These fge (Ulrich e

rest.

for predictio

carried out

(4)

he value 0.1x0.025=0.1)

very small

features areet al., 2018)

on analysis.

t

) l

e )

.


5

3.4.1 TRADITIONAL MODELS

We first explain the simple moving average, centered moving average, right-aligned moving average, exponential weighted moving average then move to autoregressive moving average model.

Simple Moving Average (SMA) is the average of n prices (Equation 5) where each observation is given equal weight. If we take difference of the two of SMA values at times t and t-1, it gives Equation 6 and recursive form become as Equation 7 which improves the computational speed of SMA in real-life scenarios (Zakamulin, 2017) by reducing total n operations involve in Equation 5 to just three mathematical operations (Equation 8) irrespective of the length of averaging window length.

𝑆𝑀𝐴 (𝑛) = 1𝑛 𝑃 (5)

𝑆𝑀𝐴 (𝑛) − 𝑆𝑀𝐴 (𝑛) = 𝑃 − 𝑃𝑛 (6)

𝑆𝑀𝐴 (𝑛) = 𝑆𝑀𝐴 (𝑛) + 𝑃 − 𝑃𝑛 (7)

The Herfindahl index of SMA (Rhoades, 1993) equals ; hence defining the smoothness of

SMA(n) as ( ) = 𝑛. The increase in the average window length not only increases its smoothness

but also increases the average lag time of SMA which is a linear function of smoothness (Equation 9). 𝐿𝑎𝑔 𝑡𝑖𝑚𝑒(𝑆𝑀𝐴 ) = ∑ 𝑖∑ 𝑖 = 𝑛 − 12 (8)

𝐿𝑎𝑔 𝑡𝑖𝑚𝑒(𝑆𝑀𝐴 ) = 12 × 𝑆𝑚𝑜𝑜𝑡ℎ𝑛𝑒𝑠𝑠(𝑆𝑀𝐴 ) − 12 (9)

It is quite easy to find a trend and breakthroughs in a trend by looking on historical data

considering time series data of 𝑃 as a combination of trend 𝑇 and an irregular component called as noise 𝐼 . Then, an additive model can be written as Equation 10. Noise is a short-lived variation around the trend eliminated through centered moving average or other smoothening tools. Any average of time series data is computed using a fixed window length that rolls through time. This window length is called an averaging period. In case of centered moving average if n is the window length then it consists of a center and two halves of length k such that n = 2k + 1. Since centered moving average at time t (equation11) removes noise so the value of CMA is the value of trend in the given time series data. The window length n is selected keeping in mind the elimination of noise (Equation 12) 𝑃 = 𝑇 + 𝐼 (10)

𝐶𝑀𝐴 = 1𝑛 𝑃 (11)


Tt = 𝐶𝑀𝐴 (n) (12)

In real life, an analyst is interested in the forecasting of time series t + 1 given the historical data series t (Equation 13).

𝑅𝑀𝐴 = 1𝑛 𝑃 (13)

Mathematical comparison of centered moving average and right-aligned moving average (RMA)

at a given time t leads to the conclusion that 𝑅𝑀𝐴 is equal to 𝐶𝑀𝐴 (Equation 11). In fact, RMA is a lagged version of CMA given the same window length and shares the same properties of CMA. Particularly, RMA with longer window length is better at removing noise from the data series but this comes with longer lag time. The lag time is given as Equation 15 𝑅𝑀𝐴 (𝑛) = 𝐶𝑀𝐴 (𝑛) (14)

𝑙𝑎𝑔𝑡𝑖𝑚𝑒 = 𝑘 − 𝑛 − 12 (15)

However, the linearly weighted moving average has the drawback that it assigns the same weight to all observations ignoring the importance of recent observation in future prediction. This problem is addressed by exponential moving average (EMA), using the concept of exponential factor λ (Equation 16). The value of λ can be greater than zero and less than or equal to 1. 𝐸𝑀𝐴 (𝜆, 𝑛) = ∑ 𝜆 𝑃∑ 𝜆

(16)

Autoregressive Integrated Moving Average (ARIMA) is the most commonly used forecasting

model amongst traditional forecasting models. It is a generalized version of ARMA (autoregressive moving average) when used for differenced data rather than original data series. Orders of AR part (p), the difference (d) and MA (q) part specify the ARMA model and the model is said to be of order (p,d,q). However, AR and MA are different models for stationary time series and ARMA (and ARIMA) is a hybrid form of these two models for a better fit. The steps of building ARIMA models are explained as follows.

Auto Regression (AR) is a class of linear models where the dependent variable is regressed against its own lagged values. If 𝑦 is modeled via the AR process, it is written as Equation 17 similar to simple linear regression. It has an intercept like term (δ), regressors 𝑦 , and parameters ∅ an error term 𝜀 . The only special thing is that regressors are the dependent variable’s own lagged terms. If lag up to p is included in the model, the AR process is said to be of order p. 𝑦 = 𝛿 + ∅𝑦 + ∅ 𝑦 + ⋯ + ∅ 𝑦 + 𝜀 (17)

Moving Average (MA) is another class of linear models. In MA, the output or the variable of interest is modeled via its own imperfectly predicted values of current and previous times. It can be written in terms of error terms as Equation 18.

𝑦 = 𝜇 + 𝜃 𝜀 + 𝜃 𝜀 + ⋯ + 𝜃 𝜀 + 𝜀 (18)


7

Again, it has a form similar to classic linear regression. The regressors are the imperfections (errors) in predicting previous terms. Here the model is specified with a positive sign for the parameters. It is not uncommon where we have a negative sign for the parameters. The model above include errors for q lags and said to have an order of q. In the case of differenced data ARIMA (p,d,q) become ARMA(p,q) having the mathematical form as Equation 19.

𝑦 = 𝛿 + ∅ 𝑦 + ∅ 𝜀 + 𝜀 (19)

3.4.2 SOFT COMPUTING MODELS

Artificial Neural Network (ANN) is a nonlinear regression technique and popularly used for stock market prediction (Zhiqiang et al., 2013). This field is tracked back to (McCulloch & Pitts, 1943) mathematical function perceived as a model of biological neural network. This model of the neural network comprises three components: weighted inputs that are like synapses in the human brain. Adder- The summation of all input signals corresponds to the neuron membrane which assembles all electrical charges. Activation function- determines if a neuron has an action potential for a specific set of inputs (Algorithm 1).

ALGORITHM 1 Artificial Neural Network

1. Start with small random weights 2. Input the data set 3. For forwarding phase: hidden layer compute activation function of each neuron then

calculate the activation function of output 4. For the backward phase: calculate error at the output and at the hidden layers then update

the weights of both layers 5. For recall apply the forward phase described in step 3

When it comes to the interpretation of ANN it’s more like a black box. Therefore, Support vector machines (SVMs) apply the concept of margin to solve the problem of ANN. SVM easily separates the data by mapping it into high dimensions (Boser et al., 1992). By aiming to maximize the size of margin that classifies the objects without any point lying inside.

ALGORITHM 2 Support Vector Machine

1. Begin with an input data set 2. Classify the data set 3. Apply SVM with a different kernel function 4. Specify hyperplane 5. Repeat step 3 if obtained accuracy is not obtained

The learning speed of ANN and SVM is slower than what is required because of slow

gradient-based learning algorithms. Whereas the Extreme Learning Machine (ELM) randomly selects hidden nodes and analytically finds the output weights (Algorithm 3).

8 Been

ALGO1. 2. 3. 4.

4. RESUSMA i

R package returns thepurple colrepresentedline. This i95% predic

Center(Svetunkov

nish Bashir, Fa

ORITHM 3With the trAssign ranCompute hCompute th

ULT AND is computedforecast (H

e model witlored line wd after a reds because thction interv

red movingv, 2017) w

aheem Aslam

3 Extreme L

raining set andom input whidden layerhe output w

DISCUSSd using sma

Hyndman et th the loweswith AIC v

d cutting linehe actual va

val.

Fg average iith the argu

Learning Ma

and activatioweights andr output mat

weights

SION a function () al., 2019). Bst value. Hevalue of -3e (Figure 2)alues are use

Figure 2:

igure 3: Ces computedument orde

achine

on functiond bias trix

in R packaBy default,ere the orde3151.602. T. The foreca

ed in the con

: Simple Mo

entered Movd in R deper equals to

n and hidden

ge smooth (SMA orderer of SMA The forecasast trajectorynstruction o

oving Avera

ving Averagloying func “null” to f

n neuron num

(Tukey) andr selection (his 12 givin

st is done y of SMA (1f point forec

age.

ge. ction cma (find order b

umber

d forecast fu(h) is based ong a good ffor the nex12) is not jucasts up to h

() in packabased on A

unction () inon AIC and

fit shown inxt 10 days

ust a straighth=12 with a

age SmoothAIC. In this

n d n s t a

h s


analysis Cproperty ofreal data pa

The rialign “righblack line prediction a magnifieand the lighof the pred

We esmethod of is minimizhigher valu1990). Figuis EMA wh


CMA at ordf a centeredattern (Figu

ight-alignedht” in R pacand the righinterval (Fi

ed version oht shaded ar

diction interv

stimate expoleast square

zed (Čisar &ues of lambure 6 illustrahereas the g


der 13 with d moving avure 3).

d moving avckage zoo (ht aligned mgure 4). Ov

of the forecarea is 95% pval.

Fi

onential moe is used to f& Čisar, 20da. Whereaates EMA a

gray line is t

Figure 5


AIC valueverage that

verage is c(Shah, Zeilemoving avererall, the RMasted regionprediction in

igure 4: RM

oving averafind the opti11). Table

as most of thand actual dathe predicte

5: Magnified


e, -3896.784is irrespect

computed byeis, & Grothrage is showMA fits weln where thenterval and

MA and Tra

age by own imal value oe 2, there ishe packagesataset, the b

ed interval a

d Graph of R


4 is computive of wind

y using funhendieck, 2wn by the oll with the p

e dark shadethe predicte

ading Signal

code with of λ for whics an increass use lambdblack line is and EMA is

RMA Forec


uted. Our redow length,

nction rollm2005). The arange line a

pattern of reaed area is 8ed blue line

ls Data.

the lambda ch the sum osing trend ina value as 0the actual dfollowing t

casted Regio


esult reveal, CMA clos

mean () withactual data and the grayal data. Figu0% predicti

e lies within

a optimized of squared en SSE valu0.2 (Lucas &data and the the actual d

on.

ational eISSN

70 9

ls the basicsely follows

h argumentseries is in

y line is theure 5 showsion interval the bounds

at 0.1. Theerrors (SSE)ues with the& Saccucci,orange line

data set.

c s

t n e s l s

e ) e , e

10 Been

Three Parameter the data bynumber is and autocothe AR or(autocorrelfirst differePACF plotoff after la

Fi

Since stationary (Pfaff et alOnce we d

nish Bashir, Fa

major stepsestimation

y taking diffd. There ar

orrelation. Arder p. The lation) cut oence stationt of a simulag 1. So thre

igure 7: Gr

the ACF pat first diff., 2016). Thetermine th

aheem Aslam

Tab

Fs are follow(3) Diagno

ferences andre different After determ

lag at whioff is the ordnary as showated time seee potential

raphical Rep

plot of firstference. Wehe value of the orders, th

ble 2: The o

00000

Figure 6: EMwed to makeostics and pod keep takintools to det

mining d, wech PACF cder q of MAwn in Figur

eries. For the(p, q) speci

presentation

t differencee further invthe test statie next step i

optimized vaλ SS

0.1 0.040.2 0.050.5 0.050.7 0.050.9 0.05

MA and Acte an ARIMAotential impng the diffetect stationae can utilizecuts off is tA. The tradiures 7 and 8e series, theification wo

n of First Di

e is better vestigate it istic is 0.003is to find th

alue of LamSE 4432 035 146 157 157

tual Data SeA forecast mprovement. rence until

arity. Like vsample PA

the order oing signal da8. Next Figue ACF cuts oould be (1, 0

ifferenced s

than level with unit r39 so the see parameter

mbda

et. model: (1) MWe first chthe series b

visual obserACF (partial

f AR and tata was not ures 9 and off after lag

0), (0, 1) and

series of Tra

stationary goot test KP

eries is first rs. Common

Model speciheck the stabecomes starvation of thautocorrela

the lag at wlevel statio10 show th

g 1 and PACd (1,1).

ading Signa

graph so thPSS in R padifferenced

nly used tec

ification (2)ationarity oftionary thathe data plotation) to getwhich ACFonary but itshe ACF andCF also cuts

als.

he series isackage urcad stationary.hniques are

) f t t t

F s d s

s a . e


least-squar2007) use adequacy t

The or


Figu

Figure 9

Figure 10:

re, maximumthe maxim

through diagriginal plot


ure 8: Autoc

: Autocorre

Partial Aut

m likelihoodmum likelihgnostics. shows clear


correlation P

lation Plot o

tocorrelatio

d, method ohood metho

r nonstation


Plot of Orig

of First Dif

on Plot of Fi

of moments.od. After d

narity. The A


ginal Tradin

fferenced Tr

irst Differen

R package developing t

ACF does n


ng Signals S

rading Signa

nced Tradin

forecast (Hythe model,

not cuts off,


Series.

als Series.

ng Signals S

Hyndman & Kwe check

, rather it sh

ational eISSN

70 11

Series

Khandakar,the model

hows a slow

, l

w

12 Been

decrease. Osome lags. from the plis stationaris the ordermodels, A

4.1 MODWe de

70:30 ratiothe testing (Hyndman

In Figuinterval is sautocorrelaand then 10of the test Figure 13 r

The mshows thatalgebraical(1,1,1) wit

nish Bashir, Fa

On the otheWe can per

lot of the orry. Since it tr of AR and

ARIMA(0,1

DEL BUILevelop threeo. The total ndataset has

n & Khandakure 11, the shown in daation betwee0 is out of thstatistic 0.0

represents a

model selectit all AIC vally lower AIth AIC va

aheem Aslam

r hand, the rform anothriginal data atook one difd MA part i1,1), ARIMA

DING ANDe ARIMA mnumber of othe remainikar, 2007) aforecasts fo

ark shaded aen forecast he significa0287 confira more or le

Figure 1

ion among talues are inIC is selectelue -3923.6

first differeher formal teand the ACFfferencing tois determineA(1,1,0) and

D DIAGNOmodels withobservationsing 1545 oband packageor the next 1area and 95%errors then t

ance boundsrming the ess normal d

Figure 11

12: ACF Plo

the above ten negative med rather tha65 and pre

ence data loest to test theF plot provio get stationed through td ARIMA(

OSTIC h orders as ps is 5151, th

bservations.e Metrics (H10 days are% predictiothe model c

s. We carry evidence of distribution.

1: Ten days

ot of Residu

ested traditimeans that tan the absolecision valu

ooks much se stationaritide a good inarity, here dthe plots of 1,1,1).

proposed eahe training dFor estimat

Hamner et aplotted as a

on interval acannot be imout a unit ro

f non-zero c

s Forecast.

ual ARIMA

ional modelthe likeliholute value oue 0.1333.

stationery. Tty of the diffndication thd=1. The va

f ACF and P

arlier. We bidata set has tion purpose

al., 2018) ara blue line, s a light sha

mproved. In oot test whecorrelation.

(1, 1, 1).

ls is based oood at the mf AIC whichRight-align

The ACF cufferenced dahat the diffealues of p anPACF that m

ifurcate the3606 obseres, R packa

re deployedwhere 80%

aded area. IfFigure 12, s

ere KPSS haThe foreca

on AIC valumaximum wh in our casned averag

uts off afterata. But hererenced datand q, whichmakes three

e data into arvations andage Forecast.

% predictionf there is nospike 2, 4, 7as the valueast errors in

ues. Table 3was > 1 and

e is ARMAges and the

r e a h e

a d t

n o 7 e n

3 d A e


exponentia

The A

hidden andfunction. Tindicators parameters30% of theestimated ahighest preelm_train()sigmoid anprecision vthe highest


ally weighte

TeA

ANN is applid output. ANThe network

through ths are fixed fe data set. Tagainst threecision valu) and elm_pnd tanh are uvalue is 0.50t precisions


ed average h

Figure 1

Tabl

Table 4chnique ANN

SVM

ELM

ied in packaNN model uk has 15 inhe random for this ANNThe highestee activatioue 0.5265 ipredict(). Thused and th000 with acscore with


has no repor

3: Histogram

le 3: CompaModeSMACMARMAEMA

ARMA(0ARMA(1ARMA(1

4: ComparisActivation fu

ReLUTanh

SigmoiPolynomRadial B

TanhSigmoiReLUTanh

age nnet of Rutilized in thnput neurons

forest (TabN. The netwt precision sn functionsin polynomhe number

he input layectivation funactivation f


rted value o

am of Residu

arison of Trel A A A A 0,1,1)

,1,0) ,1,1)

son of Soft function U h id

mial Basis h id

U h

R. The feedhis study ems that correble 1). Thework is trainscore is 0.6s which are

mial activatioof hidden ner has 15 senction sigmfunction ReL


of AIC.

ual ARIMA

raditional MAIC

-3151.602 -3896.784

Null Null

-3680.29 -2611.42 -3923.65

ComputingPrecision

0.630.580.590.520.480.490.500.420.44

d-forward Amploys the siespondents te output laned on 70%

6311 with ac, polynomion functionnodes is 10,elected inputmoid. The soLU for KSE


A (1, 1, 1).

Model

g Model. n Score 11 36 45 65 30 50 00 85 10

ANN model igmoid, tanhto the 15 seayer has a

% data and tectivation fual, tanh and

n. ELM is athree activ

t technical ioft computinE-100 tradin


has three lanh and ReLUelected inpupredicted

ested on theunction ReLd radial basapplied usin

vation functiindicators. Tng models

ng signal dat

ational eISSN

70 13

ayers: input,U activationut technicalsignal. All

e remainingLU. SVM issis with theng functionions ReLU,The highestANN givesta as shown

, n l l g s e n , t s n


in Table 4. The analysis of traditional econometric techniques only ARMA (1,1,1) is the best model but the precision score is quite low (Table 5). These results are in line with previous studies (Hsu, Lessmann, Sung, Ma, & Johnson, 2016; Rojas et al., 2008) which also assert the better predictive performance of soft computing models over traditional models.

Table 5: Comparison of Best Traditional Model and Best Soft Computing Model Model Type Model Precision Score

Traditional Model ARMA(1,1,1) 0.1333 Soft Computing Model ANN 0.6311

5. CONCLUSION This study brings forward the unsettled debate in the literature about modeling techniques for

trading signals prediction. We cover a vast range of models from moving averages to autoregressive models and then soft computing models. Our detailed analysis finds the best of the traditional models and best of soft computing models but soft computing models perform better than traditional forecasting models for trading signals prediction. These findings are important for traders who can forecast trading signals on the basis of the soft computing model rather than the traditional model. Our analysis also confirms the non-linear behavior of time series financial data which can be better handled through the soft computing model.

6. AVAILABILITY OF DATA AND MATERIAL Data can be made available by contacting the corresponding authors

7. REFERENCES Atsalakis, G. S., & Valavanis, K. P. (2009). Surveying stock market forecasting techniques–Part II: Soft

computing methods. Expert systems with Applications, 36(3), 5932-5941.

Boser, B. E., Guyon, I. M., & Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers. Paper presented at the Proceedings of the fifth annual workshop on Computational learning theory.

Brockwell, P. J., & Davis, R. A. (2009). Time Series: Theory and Methods, (Springer Series in Statistics).

Cheng, C.-H., & Wei, L.-Y. (2014). A novel time-series model based on empirical mode decomposition for forecasting TAIEX. Economic Modelling, 36, 136-141.

Čisar, P., & Čisar, S. M. (2011). Optimization methods of EWMA statistics. Acta Polytechnica Hungarica, 8(5), 73-87.

Cryer, J. D., & Chan, K.-S. (2008). Time series regression models. Time series analysis: with applications in R, 249-276.

Fama, E. F. (1965). The behavior of stock-market prices. The journal of Business, 38(1), 34-105.

Hamner, B., Frasco, M., & LeDell, E. (2018). Package ‘Metrics’.

Hsu, M.W., Lessmann, S., Sung, M.C., Ma, T., & Johnson, J.E. (2016). Bridging the divide in financial market forecasting: machine learners vs. financial economists. Expert systems with Applications, 61, 215-234.

Hu, Y., Feng, B., Zhang, X., Ngai, E., & Liu, M. (2015). Stock trading rule discovery with an evolutionary trend following model. Expert systems with Applications, 42(1), 212-222.

Huang, C.-L., & Tsai, C. Y. (2009). A hybrid SOFM-SVR with a filter-based feature selection for stock market forecasting. Expert Systems with Applications, 36(2), 1529-1539.


15

Huang, J.Z., Huang, W., & Ni, J. (2019). Predicting bitcoin returns using high-dimensional technical indicators. The Journal of Finance and Data Science, 5(3), 140-155.

Huang, W., Nakamori, Y., & Wang, S.-Y. (2005). Forecasting stock market movement direction with support vector machine. Computers & operations research, 32(10), 2513-2522.

Hyndman, R. J., Athanasopoulos, G., Bergmeir, C., Caceres, G., Chhay, L., O'Hara-Wild, M., . . . Razbash, S. (2019). Package ‘forecast’. Online] https://cran. r-project. org/web/packages/forecast/forecast. pdf.

Hyndman, R. J., & Khandakar, Y. (2007). Automatic time series for forecasting: the forecast package for R: Monash University, Department of Econometrics and Business Statistics ….

Kara, Y., Boyacioglu, M. A., & Baykan, Ö. K. (2011). Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul Stock Exchange. Expert systems with Applications, 38(5), 5311-5319.

Kumar, D. A., & Murugan, S. (2013). Performance analysis of Indian stock market index using neural network time series model. Paper presented at the 2013 International Conference on Pattern Recognition, Informatics and Mobile Engineering.

Lam, M. (2004). Neural network techniques for financial performance prediction: integrating fundamental and technical analysis. Decision Support Systems, 37(4), 567-581.

Lee, M.-C. (2009). Using support vector machine with a hybrid feature selection method to the stock trend prediction. Expert systems with Applications, 36(8), 10896-10904.

Lin, C.-S., Chiu, S.-H., & Lin, T.-Y. (2012). Empirical mode decomposition–based least squares support vector regression for foreign exchange rate forecasting. Economic Modelling, 29(6), 2583-2590.

Lucas, J. M., & Saccucci, M. S. (1990). Exponentially weighted moving average control schemes: properties and enhancements. Technometrics, 32(1), 1-12.

Malkiel, B. G., & Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. The journal of finance, 25(2), 383-417.

McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The bulletin of mathematical biophysics, 5(4), 115-133.

Moradi, S., & Rafiei, F. M. (2019). A dynamic credit risk assessment model with data mining techniques: evidence from Iranian banks. Financial Innovation, 5(1), 15.

Murphy, J. J. (1986). Technical Analysis of the Futures Markets, New York Institute of Finance. Englewood Cliffs, NJ.

Oliveira, A. L., & Meira, S. R. (2006). Detecting novelties in time series through neural networks forecasting with robust confidence intervals. Neurocomputing, 70(1-3), 79-92.

Patel, J., Shah, S., Thakkar, P., & Kotecha, K. (2015). Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques. Expert systems with Applications, 42(1), 259-268.

Pfaff, B., Zivot, E., Stigler, M., & Pfaff, M. B. (2016). Package ‘urca’. Unit root and cointegration tests for time series data. R package version, 1.2-6.

Rhoades, S. A. (1993). The herfindahl-hirschman index. Fed. Res. Bull., 79, 188.

Rojas, I., Valenzuela, O., Rojas, F., Guillén, A., Herrera, L. J., Pomares, H., . . .Pasadas, M. (2008). Soft-computing techniques and ARMA model for time series prediction. Neurocomputing, 71(4-6), 519-537.

Shah, A., Zeileis, A., & Grothendieck, G. (2005). zoo Quick Reference. Package vignette.

16 Been

Svetunkov,

Tay, F. E., Om

Teixeira, Lana688

Torgo, L. (2

Torgo, L. pre

Tsinaslanidand

Tukey, J. E

Ulrich, J., U

Vanstone, syst

Wang, B., fore

Wang, J.-Zpro

Weng, B., Ausin

Zakamulin,Rul

Żbikowski,featApp

Zhiqiang, Gopt

Zhong, X.,Exp

Zhong, X.,mac

Trademarkthis only

nish Bashir, Fa

, I. (2017). S

& Cao, L. (2mega, 29(4),

. A., & De Oalysis and 85-6890.

2011). Data

(2014). Andictive mod

dis, P. E., & d dynamic tim

Exploratory d

Ulrich, M. J

B., & Finntems using a

Huang, H.ecasting. Ne

Z., Wang, Jopagation ne

Ahmed, M. ng disparate

, V. (2017). les: Springer

, K. (2015). ture selectioplications, 4

G., Huaiqingimized by P

& Enke, Dpert systems

& Enke, Dchine learnin

Beenish BasShe was a visMarkets, Fin

Dr. FaheemIslamabad. HKorea. His re

ks Disclaimearticle are th

y. Use of them

aheem Aslam

Statistical m

2001). Appl309-317.

Oliveira, A. Lnearest ne

a mining with

n infrastrucdels in r. arX

Kugiumtzisme warping

data analysi

., & RUnit,

nie, G. (200artificial neu

, & Wang, eurocomputi

J.-J., Zhang,ural network

A., & Megae data source

Market Timr.

Using voluon for the

42(4), 1797-

g, W., & QuPSO. Soft Co

D. (2017). F with Applic

D. (2019). Png algorithmshir is a PhD ssiting research

nancial Risk Ma

Aslam is an AHe earned his Mesearch intere

er: All produhe property om does not im

odels underl

lication of su

L. I. (2010).ighbor clas

h R: learnin

ture for peXiv preprint a

s, D. (2014)g. Expert sys

s 1977 Read

S. (2018). P

09). An empural network

X. (2012).ing, 83, 136-

, Z.-G., & k. Expert sys

ahed, F. M. es. Expert sy

ming with Mo

ume weightepurpose o

1805.

uan, L. (201omputing, 17

orecasting dcations, 67,

redicting thms. Financiascholar at Deph student at Uanagement, M

Assistant ProfMaster's and Pests are Financ

uct names incf their respecmply any en

lying functio

upport vecto

A method fssification.

ng with case

erformance arXiv:1412.

). A predictiostems with Ap

ding. MA Ad

Package ‘TT

pirical methks. Expert sy

A novel t-145.

Guo, S.-P. ystems with A

(2017). Stocystems with A

oving Averag

ed support vf creating

13). Financia7(5), 805-81

daily stock 126-139.

he daily retual Innovationpartment of MUniversity of CMachine Learn

fessor at DepaPh.D. Degrees cial Analytics

cluding tradective owners,ndorsement o

ons of'smoo

or machines

for automaticExpert sys

studies: Ch

estimation 0436.

on scheme upplications,

ddison-Wesle

R’.

hodology foystems with A

ext mining

(2011). FoApplications

ck market onApplications

ges: The An

vector machistock tradin

al time serie8.

market retu

urn directionn, 5(1), 4.

Management SCôte d'Azur, Fring and Data S

artment of Mafrom Hanyang, Data Mining,

emarks™ or r, using for id

or affiliation.

th package f

in financial

c stock tradistems with

apman and H

and experi

using percep 41(15), 684

ey.

or developinApplications

approach t

recasting sts, 38(11), 14

ne-day aheas, 79, 153-16

atomy and P

ines with wang strategy

es forecastin

urn using dim

n of the stoc

ciences, COMrance. Her reseScience.

nagement Scig University Bu, Artificial Inte

registered® tentification a

for R.

time series

ing combininApplicatio

Hall/CRC.

imental com

ptually impo48-6860.

ng stockmars, 36(3), 666

to financial

tock indices4346-14355.

ad movemen63.

Performance

alk forwardy. Expert sy

ng using LP

mensionality

ck market u

MSATS Universiearch interest

iences, COMSAusiness Schoolelligence and

trademarks mand educatio

forecasting.

ng technicalns, 37(10),

mparison of

ortant points

rket trading68-6680.

time series

s with back

nt prediction

e of Trading

d testing andystems with

P and SVM

y reduction.

using hybrid

ity Islamabadt are Financia

ATS Universityl, Seoul, SouthData Science

mentioned innal purposes

.

l ,

f

s

g

s

k

n

g

d h

M

.

d

. l

y h .

n s

International Transaaction Journal of Engineerring, Management, … · 2020. 4. 13. · time series...

Documents

Transcript of International Transaaction Journal of Engineerring, Management, … · 2020. 4. 13. · time series...