Genetic Programming for Financial Trading

Nicolas NAVET
INRIA, France
AIECON NCCU, Taiwan
http://www.loria.fr/~nnavet
http://www.aiecon.org/

Tutorial at CIEF 2006, Kaohsiung, Taiwan, 08/10/2006

Page 2: Outline of the talk (1/2)

PART 1 : Genetic programming (GP) ?
- GP among machine learning techniques
- GP on the symbolic regression problem
- Pitfalls of GP

PART 2 : GP for financial trading
- Various schemes
- How to implement it ?
- Experiments : GP at work

Page 3: Outline of the talk (2/2)

PART 3 : Analyzing GP results
- Why are GP results usually inconclusive?
- Benchmarking with “Zero-intelligence trading strategies” and “Lottery Trading”
- Answering the questions “is there anything to learn on the data at hand?” and “is GP effective at this task?”

PART 4 : Perspectives

Page 4: GP is a Machine Learning technique

- The ultimate goal of machine learning is automatic programming, that is, computers programming themselves ..
- A more achievable goal: “Build computer-based systems that can adapt and learn from their experience”
- ML algorithms originate from many fields: mathematics (logic, statistics), bio-inspired techniques (neural networks), evolutionary computing (Genetic Algorithm, Genetic Programming), swarm intelligence (ants, bees)

Page 5: Evolutionary Computing

Algorithms that make use of mechanisms inspired by natural evolution, such as
- “Survival of the fittest” among an evolving population of solutions
- Reproduction and mutation

Prominent representatives:
- Genetic Algorithm (GA)
- Genetic Programming (GP) : GP is a branch of GA where the genetic code of a solution is of variable length

Over the last 50 years, evolutionary algorithms have proved to be very efficient for finding approximate solutions to algorithmically complex problems

Page 6: Two main problems in Machine Learning

- Classification : the model output is a prediction of whether the input belongs to some particular class. Examples : human being recognition in image analysis, spam detection, credit scoring, market timing decisions
- Regression : prediction of the system’s output for a specific input. Example : predict tomorrow's opening price for a stock given closing price, market trend, other stock exchanges, …

Page 7: Functioning scheme of ML

- Learning on a “training interval”
- Use of the model outside the training interval

Page 8: GP basics

Page 9: Genetic programming

GP is the process of evolving a population of computer programs, which are candidate solutions, according to evolutionary principles (e.g. survival of the fittest)

- Generate a population of random programs
- Evaluate their quality (“fitness”)
- Create better programs by applying genetic operators, e.g. mutation and combination (“crossover”)
- Solution

Page 10: In GP, programs are represented by trees (1/3)

Trees are a very general representation

form : Formula : sin(sin( 0.993) ( 0.3549))X

(figure: expression tree whose internal nodes are functions and whose leaves are terminals)
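As a rough illustration of the tree representation, the sketch below encodes a program as nested Python tuples, with a function name at each internal node and terminals (the variable x or a numeric constant) at the leaves. The tuple encoding, the `FUNCTIONS` table and the sample expression are assumptions made for this example; they are not taken from the slides.

```python
import math

# Hypothetical function set: name -> (arity, implementation).
FUNCTIONS = {
    "add": (2, lambda a, b: a + b),
    "mul": (2, lambda a, b: a * b),
    "sin": (1, math.sin),
}

def evaluate(tree, x):
    """Recursively evaluate a tree for a given value of the variable x."""
    if isinstance(tree, tuple):            # internal node: (function, child, ...)
        name, *children = tree
        arity, impl = FUNCTIONS[name]
        assert len(children) == arity, f"{name} expects {arity} argument(s)"
        return impl(*(evaluate(child, x) for child in children))
    if tree == "x":                        # terminal: the input variable
        return x
    return float(tree)                     # terminal: numeric constant

def size(tree):
    """Number of nodes in the tree (functions plus terminals)."""
    if isinstance(tree, tuple):
        return 1 + sum(size(child) for child in tree[1:])
    return 1

# Sample program (hypothetical): sin(x * 0.5) + x
program = ("add", ("sin", ("mul", "x", 0.5)), "x")
```

Because trees are immutable tuples in this sketch, genetic operators can build offspring without copying or corrupting the parents.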

Page 11: In GP, programs are represented by trees (2/3)

Logical formula :

(( 1 2)XOR 5) 4IN IN IN IN

Page 12: In GP, programs are represented by trees (3/3)

Trading rule formula : BUY IF (VOL > 10) AND (Moving Average(25) > Moving Average(45))

(figure: tree of the trading rule, picture from [BhPiZu02])

Page 13: Preliminary steps of GP

The user has to define :
- the set of terminals
- the set of functions
- how to evaluate the quality of an individual: the “fitness” measure
- the parameters of the run, e.g. the number of individuals in the population
- the termination criterion

Page 14: Symbolic regression : a problem GP is good at …

“Symbolic” means that one looks for both

- the functional form

- the value of the parameters, e.g.

2( ) sin( )f x x x

0.37

Differs from other regressions where one solely looks for the best coefficient values for a pre-fixed model. Usually the choice of the model is the most difficult issue !

Symbolic regression : find a function that fits well a set of experimental data points

Page 15: Symbolic regression

Given a set of points : $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$

Find the function $\hat{f}(x)$ s.t. “as far as possible” : $y_i = \hat{f}(x_i), \quad i = 1..n$

Possible fitness function : $\sum_{i=1}^{n} (\hat{f}(x_i) - y_i)^2$

GP functions : $\{+, -, /, \times, \sin, \cos, \ldots\}$

GP terminals : $\{x\} \cup \mathbb{R}$
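A minimal sketch of the sum-of-squared-errors fitness, assuming a candidate solution is available as a plain Python callable. The data set (21 points sampled from sin on [-1, 1]) and the two candidates are hypothetical and only serve to show that a lower value means a better fit.

```python
import math

def sse_fitness(candidate, points):
    """Sum of squared errors of candidate(x) against the observed y values.

    Lower is better; 0 means the candidate passes through every point.
    """
    return sum((candidate(x) - y) ** 2 for x, y in points)

# Hypothetical data set: 21 points sampled from sin(x) on [-1, 1].
points = [(i / 10, math.sin(i / 10)) for i in range(-10, 11)]

exact = sse_fitness(math.sin, points)         # the generating function itself
linear = sse_fitness(lambda x: x, points)     # crude candidate: f(x) = x
```

Since this fitness is an error, selection has to favor the smallest value, or the error has to be negated before it is handed to a selection operator that maximizes.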

Page 16: GP Operators : biologically inspired …

- Recombination (aka “crossover”) : 2 individuals share genetic material and create one or several offspring
- Mutation : introduces genetic diversity by random changes in the genetic code
- Reproduction : an individual survives ‘as is’ in the next generation

Page 17: Selection Operators for Crossover/reproduction

General principle : in GP the fittest individuals should have more chance to survive and transmit their genetic code

- Fitness proportionate : each individual is selected with a probability that depends on the value of its fitness
- Tournament selection of size n : n individuals are randomly chosen and the best is kept
- Rank based : each individual is selected with a probability that is a function of its rank in the fitness order
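The three selection schemes can be sketched in a few lines. This is an illustration under assumptions, not the implementation used in the tutorial: `fitness` is any callable where a higher value is better, the rank weights are simply 1..N, and fitness-proportionate selection requires non-negative fitness values.

```python
import random

def fitness_proportionate_select(population, fitness, rng=random):
    """Select with probability proportional to fitness (values must be >= 0)."""
    weights = [fitness(individual) for individual in population]
    return rng.choices(population, weights=weights, k=1)[0]

def tournament_select(population, fitness, n, rng=random):
    """Draw n distinct individuals at random and keep the fittest."""
    contenders = rng.sample(population, n)
    return max(contenders, key=fitness)

def rank_select(population, fitness, rng=random):
    """Select with probability proportional to rank (worst = 1, best = N)."""
    ranked = sorted(population, key=fitness)          # ascending fitness
    ranks = list(range(1, len(ranked) + 1))
    return rng.choices(ranked, weights=ranks, k=1)[0]
```

A larger tournament size increases the selection pressure: with n equal to the population size, the tournament always returns the best individual.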

Page 18: Standard Recombination (aka crossover)

Standard recombination : exchange two randomly chosen sub-trees among the parents

(figure: two parent trees exchanging sub-trees)
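A possible sketch of standard recombination, assuming programs are stored as nested tuples of the form (function, child, ...) with plain values as terminals. The helper names (`paths`, `get`, `replace`) are hypothetical, and crossover points are drawn uniformly over all nodes, which is only one of several common choices.

```python
import random

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node of the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the sub-tree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, subtree):
    """Return a copy of tree with the node at path replaced by subtree."""
    if not path:
        return subtree
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], subtree),) + tree[i + 1:]

def size(tree):
    """Number of nodes in the tree."""
    return 1 + sum(size(c) for c in tree[1:]) if isinstance(tree, tuple) else 1

def crossover(parent_a, parent_b, rng=random):
    """Swap one randomly chosen sub-tree of each parent, giving two offspring."""
    pa = rng.choice(list(paths(parent_a)))
    pb = rng.choice(list(paths(parent_b)))
    child_a = replace(parent_a, pa, get(parent_b, pb))
    child_b = replace(parent_b, pb, get(parent_a, pa))
    return child_a, child_b
```

Swapping sub-trees keeps the total number of nodes of the two offspring equal to that of the two parents, but a single offspring can be much larger than either parent, which is one of the routes to code growth.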

Page 19: Mutation Operator 1 : standard mutation

Standard mutation : replacement of a sub-tree with a randomly generated one

Page 20: Mutation Operator 2 : swap sub-tree mutation

Swap sub-tree Mutation : swap two sub-trees of an individual

Page 21: Mutation Operator 3 : shrink mutation

Shrink Mutation : replacing a branch (a node with one or more arguments) with one of its child nodes
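Shrink mutation is easy to sketch on the same kind of nested-tuple trees, (function, child, ...) with plain values as terminals. The encoding and helper names below are assumptions for illustration; the branch and the surviving child are both drawn uniformly at random.

```python
import random

def function_paths(tree, prefix=()):
    """Yield the paths to internal (function) nodes only."""
    if isinstance(tree, tuple):
        yield prefix
        for i, child in enumerate(tree[1:], start=1):
            yield from function_paths(child, prefix + (i,))

def get(tree, path):
    """Return the sub-tree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, subtree):
    """Return a copy of tree with the node at path replaced by subtree."""
    if not path:
        return subtree
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], subtree),) + tree[i + 1:]

def size(tree):
    """Number of nodes in the tree."""
    return 1 + sum(size(c) for c in tree[1:]) if isinstance(tree, tuple) else 1

def shrink_mutation(tree, rng=random):
    """Replace a randomly chosen branch by one of its own children."""
    candidates = list(function_paths(tree))
    if not candidates:                  # a lone terminal cannot shrink
        return tree
    path = rng.choice(candidates)
    child = rng.choice(get(tree, path)[1:])
    return replace(tree, path, child)
```

Unlike standard mutation, this operator can only reduce the size of an individual, which makes it one of the operators that limit code growth.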

Page 22: Other Mutation Operators

- “Swap mutation” (≠ swap sub-tree mutation) : replacing the function associated with a node by one having the same number of arguments
- “Headless Chicken crossover” : mutation implemented as a crossover between a program and a newly generated random program
- …

Page 23: Reproduction / Elitism Operators

- Reproduction : an individual is reproduced in the next generation without any modification
- Elitism : the best n individuals are kept in the next generation

Page 24: GP is no silver bullet …

Page 25: GP Issue 1 : how to choose the function set ?

1. The problem cannot be solved if the set of functions is not “sufficient” …
2. But “non-relevant” functions uselessly increase the search space …

Problem : there is no automatic way to decide a priori which functions are “relevant” and to build a “sufficient” function set …

Page 26: Problem cannot be solved if the set of functions is not “sufficient” : illustration

Generating function: ( ) 0.3 sin( )f x x x

Setup :
- GP functions : $\{+, -, /, \times\}$, with and without sin(x)
- GP terminals : $\{x\} \cup \mathbb{R}$
- Number of generations : 20
- Number of individuals : 500
- Standard GP operators : crossover, mutation, reproduction, tournament selection of size 6, …

Page 27: Results with sin(x) in the function set

(figure: typical outcome)

Page 28: Results without sin(x) in the function set

(figure: typical outcome)

Page 29: Yes, sin(x) can be approximated by its Taylor series ..

$$\sin(x) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!} x^{2n+1} = x - \frac{1}{6}x^3 + \frac{1}{120}x^5 - \frac{1}{5040}x^7 + \ldots$$

(figure: sin(x) and its Taylor approximations of degree 1, 3, 5, 7, 9, 11, 13 [image Wikipedia])

Problem 1 : there is little hope to discover that ..

Problem 2 : what happens outside the training interval ?
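Problem 2 can be made concrete with a small numerical check. The sketch below evaluates the degree-7 Taylor polynomial of sin at a few points; the sample points and the choice of degree 7 are arbitrary assumptions for the illustration.

```python
import math

def taylor_sin(x, degree):
    """Taylor polynomial of sin around 0, keeping terms up to x**degree."""
    total = 0.0
    n = 0
    while 2 * n + 1 <= degree:
        total += (-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
        n += 1
    return total

# Absolute error of the degree-7 polynomial, close to 0 and farther away.
for x in (1.0, 3.0, 6.0, 10.0):
    error = abs(taylor_sin(x, 7) - math.sin(x))
    print(f"x = {x:4.1f}   |error| = {error:.3g}")
```

The polynomial is an excellent fit near 0 but diverges quickly as |x| grows, so a solution that looks perfect on a narrow training interval can be useless outside it.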

Page 30: Composition of the function set is crucial : illustration

GP functions : $\{+, -, /, \times, \sin\} \cup \{\cos, \mathrm{abs}, \log, \exp\}$

Same experimental setup as before

The subset $\{\cos, \mathrm{abs}, \log, \exp\}$ is extraneous in this context …

Page 31: Function set containing “redundant” functions (1/2)

(figure: typical outcome)

Page 32: Function set containing “redundant” functions (2/2)

- On average, with the “extraneous” functions the best solution is 10% farther from the curve in the training interval (much more outside!)
- With the “extraneous” functions, the average solution is better .. because the tree is more likely to contain a trigonometric function

Page 33: GP Issue 2 : “code bloat”

Solutions increase in size over generations …

(figure: same experimental setup as before)

Page 34: GP Issue 2 : “code bloat”

(figure: GP tree with non-effective code !! aka “introns”)

Much of the genetic code has no influence on the fitness .. but may constitute a useful reserve of genetic material

Page 35: Code bloat: why is it a problem ?

1. Solutions are hard to understand : learning something from huge solutions is almost impossible .. One has no confidence using programs one does not understand !
2. Much of the computing power is spent manipulating non-contributing code, which may slow down the search

Page 36: Countermeasures .. (1/2)

- Static limit of the tree depth
- Dynamic maximum tree depth [SiAl03] : the limit is increased each time an outstanding individual deeper than the current limit is found
- Limit the probability of longer-than-average individuals being chosen by reducing their fitness
- Apply operators that ensure limited code growth
- Discard newly created individuals whose “behavior” is too close to that of their parents (e.g. “behavior” for a regression problem could be the position of the points [Str03])

Page 37: Countermeasures .. (2/2)

Possible : symbolic simplification of the tree

-0.3473282443 sin( )x x

Needs to be further investigated ! Preliminary experiments [TeHe04] show that simplification does not necessarily help (“introns” may constitute a useful reserve of genetic material)

can be simplified into :

Page 38: GP Issue 3 : GP can be disappointing outside the training set

(figure) … and such behavior can hardly be predicted …

Page 39: GP Issue 3 : explanation (1/2)

Usually GP functions are implemented to have the closure property: “each function must be able to handle every possible value”

What to do with :
- division by 0 ?
- sqrt(x) with x < 0 ?
- …

Solution: “protected operators”, e.g. the division :

if (abs(denominator) < value-near-0) return 1;
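A minimal Python sketch of protected operators. The threshold `EPSILON` is a hypothetical value (the slide only says "value-near-0"), and taking the root of |x| for the protected square root is one common convention, assumed here rather than taken from the tutorial.

```python
import math

EPSILON = 1e-9  # hypothetical "value near 0"; the slide does not give one

def protected_div(numerator, denominator):
    """Division that returns 1 instead of failing when the denominator is (almost) 0."""
    if abs(denominator) < EPSILON:
        return 1.0
    return numerator / denominator

def protected_sqrt(x):
    """Square root made total by taking the root of |x| (assumed convention)."""
    return math.sqrt(abs(x))

# The protection introduces a jump: just before pi the quotient is huge,
# at pi itself (where sin is numerically ~0) it suddenly becomes 1.
near_pi = protected_div(1.0, math.sin(math.pi - 1e-3))
at_pi = protected_div(1.0, math.sin(math.pi))
```

The jump between `near_pi` and `at_pi` shows why an evolved program can behave well on the training points and very differently on unseen inputs that happen to hit a protected case.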

Page 40: GP Issue 3 : explanation (2/2)

Why did it not occur on the training interval ?
- no training points were chosen such that $x = k\pi$

In our case, fragment of the best GP tree :

$= 0$ for $x = k\pi$ with $k \in \mathbb{Z}$

0 for x

Page 41: GP Issue 4 : standard GP is not good at finding numerical constants (1/3)

Where do numerical values come from ?
- “Ephemeral random constants” : random values inserted at the leaves of the GP trees during the creation of the initial population
- Use of arithmetic operators on existing numerical constants
- Generation by combination of variables/functions:

1; 2; 1/ 2 0.5; ...X X X

X X X

Lately, many studies show that standard GP is not good at finding constants …

Page 42: GP Issue 4 : standard GP is not good at finding numerical constants (2/3)

Experiment : find a constant function equal to the numeric constant 3.141592

Typical outcome: 3.128 (0.5% error)

Page 43: GP Issue 4 : standard GP is not good at finding numerical constants (3/3)

There are several more efficient schemes for constant generation in GP [Dem95] :
- local optimization [ZuPiMa01],
- numeric mutation [EvFe98],
- …

One of them should be implemented, otherwise
1) computation time is lost searching for constants
2) solutions may tend to be “bigger”

Page 44: Some (personal) conclusions on GP (1/3)

GP is undoubtedly a powerful technique :
- Efficient for predicting / classifying .. but not more than other techniques
- The symbolic representation of the created solutions may help to give good insight into the system under study .. not only the best solutions are interesting but also how the population has evolved over time
- GP is a tool to learn knowledge …

Page 45: Some (personal) conclusions on GP (2/3)

Powerful tool but ...
- a good knowledge of the application field is required for choosing the right function set
- prior experience with GP is mandatory to avoid common mistakes – there is no theory to tell us what to do !
- it tends to create solutions too big to be analyzable -> countermeasures should be implemented
- fine-tuning the GP parameters is very time-consuming

Page 46: Some (personal) conclusions on GP (3/3)

How to analyze the results of GP ?
- efficiency can hardly be predicted, it varies from problem to problem … and from GP run to GP run
- if results are not very positive : is it because there is no good solution ? or because GP is not effective and further work is needed ?

There are solutions – part 3 of the talk

Page 47: Part 2 : GP for financial trading

Page 48: Why is GP an appealing technique for financial trading ?

- Easy to implement / robust evolutionary technique
- Trading rules (TR) should adapt to a changing environment – GP may simulate this evolution
- Solutions are produced under a symbolic form that can be understood and analyzed
- GP may serve as a knowledge discovery tool (e.g. evolution of the market)

Page 49: GP for financial trading

- GP for composing portfolios (not discussed here, see [Lag03])
- GP for evolving the structure of neural networks used for prediction (not discussed here, see [GoFe99])
- GP for predicting price evolution (briefly discussed here, see [Kab02])
- Most common : GP for inducing technical trading rules

Page 50: Predicting price evolution : general comments ..

- “Long term forecast of stock prices remain a fantasy” [Kab02]
- Swing trading or intraday trading
- Many other (more?) efficient ML tools : e.g. SVM and NN
- GP is anyway useful for ensemble methods
- CIEF Tutorial 1 by Prof. Fyfe – today 1h30 pm !
- 2 excellent starting points :
  - [Kab02] : single-day trading strategy based on the forecasted spread
  - [SaTe01] : winner of the CEC2000 Dow-Jones Prediction – prediction at t+1, t+2, t+3, …, t+h – a solution has one tree per forecast horizon

Page 51: Predicting price evolution : fitness function

The definition of the fitness function has been shown to be crucial, e.g. [SaTe01]. There are many possible choices :
- (Normalized) Mean square error
- Mean Absolute Percentage Error (MAPE)
- (1-) statistic = 1 - MAPE / MAPE-Random-Walk
- Directional symmetry index (DS)
- DS weighted by the direction and amplitude of the error
- …

Issue : a meaningful fitness function is not always “GP friendly” …
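Some of these measures can be sketched as follows, assuming one-step-ahead forecasts stored in plain Python lists. The function names are hypothetical, and the random-walk benchmark is taken to be "tomorrow's price equals today's price", which is the usual reading but an assumption here.

```python
def mse(actual, predicted):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error (actual values must be non-zero)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def gain_over_random_walk(prices, forecasts):
    """1 - MAPE(model) / MAPE(random walk) for one-step-ahead forecasts.

    forecasts[t] is the model's forecast of prices[t + 1]; the random-walk
    benchmark forecasts prices[t + 1] by prices[t].
    """
    actual = prices[1:]
    random_walk = prices[:-1]
    return 1.0 - mape(actual, forecasts) / mape(actual, random_walk)
```

With this definition a value of 1 corresponds to a perfect forecast, 0 to a forecast no better than the random walk, and a negative value to a forecast that is worse than the random walk.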

Page 52: Inducing technical trading rules

- Training interval : 1) Creation of the trading rules using GP
- Validation interval : 2) Selection of the best resulting strategies. Further selection on unseen data - one strategy is chosen for out-of-sample
- Out-of-sample interval : Performance evaluation

Page 53: Steps of the algorithm (1/3)

1. Extracting training time series from the database
2. Preprocessing : cleaning, sampling, averaging, normalizing, …

Page 54: Steps of the algorithm (2/3)

3. GP on the training set
3.1 Creation of the individuals
3.2 Evaluation : Trading Rules Interpreter -> Trading Sequence Simulator -> Fitness ($$), with a trading sequence such as 0,1,1,1,0,0,0,0,1,1,1 (figure annotation: “Non-preprocessed series !”)
3.3 Selection of the individuals
4. Analysis of the evolution : statistics, html files …
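The evaluation step can be sketched as a tiny trading sequence simulator that turns a 0/1 sequence into a return usable as fitness. Everything below is an assumption for illustration: a long-or-flat strategy (1 = hold the index, 0 = out of the market), a proportional cost charged at each change of position, and compounding of the period returns.

```python
def simulate(prices, positions, cost=0.0):
    """Compound return of a long-or-flat strategy.

    positions[t] is 1 if the index is held over the period t -> t+1, else 0.
    A proportional transaction cost is charged at every change of position,
    and once more if a position is still open at the end.
    """
    assert len(positions) == len(prices) - 1
    wealth = 1.0
    previous = 0                          # start out of the market
    for t, position in enumerate(positions):
        if position != previous:          # a buy or a sell takes place
            wealth *= 1.0 - cost
        if position == 1:
            wealth *= prices[t + 1] / prices[t]
        previous = position
    if previous == 1:                     # close the last open position
        wealth *= 1.0 - cost
    return wealth - 1.0

# Hypothetical price series and trading sequence.
prices = [100.0, 110.0, 99.0, 108.9]
positions = [1, 0, 1]
gross = simulate(prices, positions)
net = simulate(prices, positions, cost=0.01)
```

Here the strategy is in the market during both rises and out during the fall; comparing `gross` and `net` shows how four transactions at 1% each eat into the return.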

Page 55: Steps of the algorithm (3/3)

5. Evaluate selected individuals on the validation set
6. Evaluate best individual out-of-sample

(figure labels: $1, $2, $3)

Page 56: GP at work : Demo on the Taiwan Capitalization Weighted Stock Index

Page 57: Part 3 : Analyzing GP results

Page 58: One may cast doubts on GP efficiency ..

- Highly heuristic – no theory !
- Problems on which GP has been shown not to be significantly better than random search
- Few clear-cut successes reported in the financial literature
- GP embeds little domain specific knowledge yet ..
- Doubts on the efficiency of GP to use the available computing time :
  - code bloat
  - bad at finding numerical constants
  - best solutions are sometimes found very early in the run ..
- Variability of the results ! e.g. returns:

-0.160993, 0.0526153, 0.0526153, 0.0526153, 0.0526153, -0.0794787, 0.0526153, -0.0794787, 0.132354, 0.364311, -0.0990995, -0.0794787, -0.0855786, -0.094433, 0.0464288, -0.140719, 0.0526153, 0.0526153, -0.0746189, 0.418075, ….

Page 59: Possible pretest : measure of predictability of the financial time-series

- Serial correlation
- Kolmogorov complexity
- Lyapunov exponent
- Unit root analysis
- Comparison with results on surrogate data : “shuffled” series (e.g. Kaboudan statistics)
- ...

Actual question : how predictable for a given horizon with a given cost function?

Page 60: In practice, some predictability does not imply profitability ..

- Prediction horizon must be large enough!
- Volatility may not be sufficient to cover round-trip transaction costs!
- Not the right trading instrument at hand .. typically short selling not available

Page 61: Pretest methodology

Compare GP with several variants of
- Random search algorithms : “Zero-Intelligence Strategies” - ZIS
- Random trading behaviors : “Lottery trading” - LT

Statistical hypothesis testing
- Null : GP does not outperform ZIS
- Null : GP does not outperform LT

Issue : how to best constrain randomness ?

Page 62: Pretest 1 : GP versus Zero-Intelligence strategies (= “Equivalent search intensity” Random Search (ERS) with validation stage)

- Null hypothesis H1,0 : GP does not outperform equivalent random search
- Alternative hypothesis is H1,1

Page 63: Pretest 1 : GP vs zero-intelligence strategies

H1,0 cannot be rejected – interpretation : there is nothing to learn or GP is not very effective

(figure: trading rule induction scheme, with the label “ERS”)
- Training interval : 1) Creation of the trading rules using GP
- Validation interval : 2) Selection of the best resulting strategies. Further selection on unseen data - one strategy is chosen for out-of-sample
- Out-of-sample interval : Performance evaluation

Page 64: Pretest 4 : GP vs lottery trading

- Lottery trading (LT) = random trading behavior according to the outcome of a r.v. (e.g. Bernoulli law)
- Issue 1 : if LT tends to hold positions (short, long) for less time than GP, transaction costs may advantage GP ..
- Issue 2 : it might be an advantage or a disadvantage for LT to trade much less or much more than GP, e.g. downward oriented market with no short-sell

Page 65: Frequency and intensity of a trading strategy

- Frequency : average number of transactions per unit of time
- Intensity : proportion of time where a position is held

For pretest 4 :
- We impose that the average frequency and intensity of LT are equal to those of GP
- Implementation : generate random trading sequences having the right characteristics, e.g. 0,0,1,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,1,1,0,1,0,0,0,0,0,0,1,1,1,1,1,1,…
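One way to generate such sequences is sketched below. It is stricter than what the slide requires: instead of matching frequency and intensity on average, each random sequence reproduces exactly the number of position changes and the time in the market of a reference sequence, by redrawing the lengths of its runs. The function names and this run-based construction are assumptions, not the author's implementation.

```python
import random

def frequency(sequence):
    """Average number of transactions (position changes) per time step."""
    changes = sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    return changes / len(sequence)

def intensity(sequence):
    """Proportion of time steps during which a position is held."""
    return sum(sequence) / len(sequence)

def random_composition(total, parts, rng):
    """Split total into `parts` positive integers, chosen at random."""
    cuts = sorted(rng.sample(range(1, total), parts - 1))
    return [b - a for a, b in zip([0] + cuts, cuts + [total])]

def lottery_sequence(reference, rng=random):
    """Random 0/1 sequence with the same frequency and intensity as reference."""
    runs = [[reference[0], 0]]            # runs of identical values: [value, length]
    for value in reference:
        if value == runs[-1][0]:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    lengths = {}
    for value in (0, 1):                  # redraw the run lengths for each value
        count = sum(1 for v, _ in runs if v == value)
        total = sum(n for v, n in runs if v == value)
        if count:
            lengths[value] = random_composition(total, count, rng)
    sequence = []
    for value, _ in runs:                 # same alternation, new run lengths
        sequence += [value] * lengths[value].pop()
    return sequence
```

For instance, `lottery_sequence(gp_sequence, random.Random(1))`, where `gp_sequence` is the 0/1 sequence produced by a GP rule, returns a random sequence that trades as often and stays in the market as long as the GP rule does.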

Page 66: Pretest 4 : implementation

- Training interval : 1) Creation of the trading rules using GP
- Validation interval : 2) Selection of the best resulting strategies. Further selection on unseen data - one strategy is chosen for out-of-sample
- Out-of-sample interval : Performance evaluation
- Lottery trading : 0,0,1,1,1,0,0,0,0,0,1,…

Page 67: Answering question 1 : is there anything to learn on the training data at hand ?

Page 68: Question 1 : pretests involved

Starting point: if a set of search algorithms does not outperform LT, it gives evidence that there is nothing to learn ..

- Pretest 4 : GP vs Lottery Trading. Null hypothesis H4,0 : GP does not outperform LT
- Pretest 5 : Equivalent Random Search (ZIS) vs Lottery Trading. Null hypothesis H5,0 : ERS does not outperform LT

Page 69: Question 1 : some answers ...

¬R means that the null hypothesis Hi,0 cannot be rejected – R means we should favor Hi,1

Case     H4,0   H5,0   Interpretation
Case 1   ¬R     ¬R     there is nothing to learn
Case 2   R      R      there is something to learn
Case 3   R      ¬R     there may be something to learn - ERS might not be powerful enough
Case 4   ¬R     R      there may be something to learn – GP evolution process is detrimental
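The decision logic of the four cases can be written as a small function. The mapping of cases 3 and 4 is a reading inferred from their interpretations (case 3: GP beats LT but ERS does not; case 4: ERS beats LT but GP does not), and the function and parameter names are hypothetical.

```python
def interpret(gp_beats_lt, ers_beats_lt):
    """Map the outcomes of pretests 4 and 5 to an interpretation.

    gp_beats_lt  : True if H4,0 (GP does not outperform LT) is rejected.
    ers_beats_lt : True if H5,0 (ERS does not outperform LT) is rejected.
    """
    if not gp_beats_lt and not ers_beats_lt:
        return "case 1: there is nothing to learn"
    if gp_beats_lt and ers_beats_lt:
        return "case 2: there is something to learn"
    if gp_beats_lt:
        return "case 3: there may be something to learn, ERS might not be powerful enough"
    return "case 4: there may be something to learn, GP evolution process is detrimental"
```

For example, `interpret(True, False)` corresponds to case 3, where only the guided search finds something.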

Page 70: Answering question 2 : is GP effective ?

Page 71: Genetic Programming for Financial Trading Nicolas NAVET INRIA, France AIECON NCCU, Taiwan nnavet  Tutorial at.

71

Question 2 : some Question 2 : some answers ... answers ...

Question 2 cannot be answered if there is nothing to learn (case 1)

Case 4 provides us with a negative answer ..

In cases 2 and 3, run pretest 1 : GP vs Equivalent Random Search
  Null hypothesis H1,0 : GP does not outperform ERS
  If one cannot reject H1,0, GP shows no evidence of efficiency…
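The reasoning on this slide can be sketched as a small function of the case number and the outcome of pretest 1. This is only an illustration: the function name is hypothetical, `None` stands for "cannot be answered", and `False` in cases 2 and 3 should be read as "no evidence of efficiency" rather than proof of inefficiency.

```python
def is_gp_effective(case, reject_h1=None):
    """Answer question 2 from the case number (1 to 4) and pretest 1.

    reject_h1: True when 'GP does not outperform ERS' is rejected;
               only needed in cases 2 and 3.
    Returns True, False, or None when the question cannot be answered.
    """
    if case == 1:
        return None  # nothing to learn: question 2 is undecidable
    if case == 4:
        return False  # negative answer
    if case in (2, 3):
        if reject_h1 is None:
            raise ValueError("pretest 1 outcome is required in cases 2 and 3")
        return bool(reject_h1)
    raise ValueError("case must be 1, 2, 3 or 4")


answer = is_gp_effective(case=2, reject_h1=True)
```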


Pretests at work

Methodology : draw conclusions from pretests using our own programs and compare with results in the literature [ChKuHo06] on the same time series


Setup : GP control parameters - same as in [ChKuHo06]


Setup : statistics, data, trading scheme

Hypothesis testing with Student's t-test with a 95% confidence level

Pretests with samples made of 50 GP runs, 50 ERS runs and 100 LT runs

Data : indexes of 3 stock exchanges : Canada, Taiwan and Japan

Daily trading with short selling
Training of 3 years - Validation of 2 years
Out-of-sample periods : 1999-2000, 2001-2002, 2003-2004
Data normalized with a 250-day moving average
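As an illustration of the normalization step, the sketch below divides each price by the mean of the last 250 prices. The slide does not spell out the formula, so the division, the inclusion of the current day in the window, and the function name are assumptions.

```python
def normalize_with_moving_average(prices, window=250):
    """Divide each price by the moving average of the last `window` prices.

    Assumption: the window includes the current day. The first
    window - 1 points have no full window and are returned as None.
    """
    normalized = []
    running_sum = 0.0
    for i, price in enumerate(prices):
        running_sum += price
        if i >= window:
            # Drop the price that just left the window.
            running_sum -= prices[i - window]
        if i >= window - 1:
            normalized.append(price / (running_sum / window))
        else:
            normalized.append(None)
    return normalized


# Tiny hypothetical series with a 2-day window, for illustration only.
series = normalize_with_moving_average([10.0, 12.0, 14.0, 16.0], window=2)
```

With this convention a value above 1 means the index trades above its moving average.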


Results on actual data (1/2)

Evidence that there is something to learn : 4 markets out of 9 (C3, J2, T1, T3)
  Experiments in [ChKuHo06], with another GP implementation, show that GP performs very well on these 4 markets

Evidence that there is nothing to learn : 3 markets (C1, J3, T2)
  In [ChKuHo06], there is only one (C1) where GP has positive return (but less than B&H)


Results on actual data (2/2)

GP effective : 3 markets out of 6
  In these 3 markets, GP outperforms Buy and Hold - same outcome as in [ChKuHo06]

Preliminary conclusion : one can rely on pretests ..
  When there is nothing to learn, no GP implementation did well (except in one case)
  When there is something to learn, at least one implementation did well (always)
  When our GP is effective, GP in [ChKuHo06] is effective too (always)


Further conclusion

Our GP implementation
  1. is more efficient than random search : no case where ERS outperforms LT and GP does not
  2. but only slightly more efficient … one would expect many more cases where GP does better than LT and ERS does not

Our GP is actually able to take advantage of regularities in data … but only of “simple” ones


Part 4 : Perspectives in the field of GP for financial trading


Rethinking fitness functions

Fitness functions : accumulated return, risk-adjusted return, …

Issue : on some problems [LaPo02], GP is only marginally better than random search because the fitness function induces a “difficult” landscape …

Come up with GP-friendly fitness functions …

From [LaPo02]
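The two fitness measures named on this slide could be computed along the following lines. The slide gives only their names, so the definitions here are assumptions: accumulated return is taken as the compounded per-period return, and risk-adjusted return as a Sharpe-like ratio of mean to standard deviation with a zero risk-free rate. Function names and sample data are hypothetical.

```python
import statistics


def accumulated_return(period_returns):
    """Compound per-period returns into a total return over the interval."""
    total = 1.0
    for r in period_returns:
        total *= 1.0 + r
    return total - 1.0


def risk_adjusted_return(period_returns):
    """Mean return divided by its standard deviation (Sharpe-like ratio).

    Assumptions: risk-free rate of zero; a series with no variability
    is scored 0.0 by convention to avoid dividing by zero.
    """
    spread = statistics.stdev(period_returns)
    if spread == 0.0:
        return 0.0
    return statistics.fmean(period_returns) / spread


# Hypothetical daily returns of one trading rule, for illustration only.
daily = [0.01, -0.02, 0.03]
fitness_a = accumulated_return(daily)
fitness_b = risk_adjusted_return(daily)
```

Two rules with the same accumulated return can rank differently under the second measure, which is one way the choice of fitness function reshapes the search landscape.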


Preprocessing of the data : still an open issue

Studies in forecasting show the importance of preprocessing – for GP, often, normalization with MA(250) is used - with benefits [ChKuHo06]

Should the length of the MA change according to market volatility, regime changes, etc. ?

Why not consider : MACD, Exponential MA, differencing, rate of change, log value, FFT, wavelet, …


Data division scheme

There is evidence that GP performs poorly when the characteristics of the training interval are very different from the out-of-sample interval …

Characterization of the current market condition : mean reverting, trend following ...

Relearning on a smaller interval if needed ?


More extensive tests are needed .. automating the test

A comprehensive test for daily indexes was done in [ChKuHo06], but none exists for individual stocks and intraday data …

Automated testing on several hundred stocks is fully feasible … but requires a software infrastructure and much computing power


Ensemble methods : combining trading rules

In ML, ensemble methods have proven to be very effective

Majority rule tested in [ChKuHo06] with some success

Efficiency requirement : accuracy (better than random) and diversity (uncorrelated errors) – what does it mean for trading rules?

A more fine-grained selection / weighting scheme may lead to better results …
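A majority rule over several evolved trading rules might look like the sketch below. The signal encoding (+1 for long, -1 for short) and the tie-breaking choice (0, meaning stay out of the market) are assumptions made for illustration, not the scheme tested in [ChKuHo06].

```python
def majority_vote(signals):
    """Combine the signals of several trading rules into one decision.

    signals: iterable of +1 (long) or -1 (short), one per rule.
    Returns +1 or -1 according to the majority, and 0 on a tie.
    """
    total = sum(signals)
    if total > 0:
        return 1
    if total < 0:
        return -1
    return 0


# Three hypothetical rules voting on the same day.
decision = majority_vote([1, 1, -1])
```

A weighted variant would replace the plain sum with a sum of signals multiplied by per-rule weights, for instance weights based on validation performance.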


Embed more domain specific knowledge

Black-box algorithms are usually outperformed by domain-specific algorithms

Domain-specific language is limited as yet …

Enrich primitive set with volume, indexes, bid/ask spread, …

Enrich function set with cross-correlation, predictability measure, …


References (1/2)

[ChKuHo06] S.-H. Chen, T.-W. Kuo and K.-M. Hoi, “Genetic Programming and Financial Trading: How Much about "What we Know"”, 4th NTU International Conference on Economics, Finance and Accounting, April 2006.

[ChNa06] S.-H. Chen and N. Navet, “Pretests for genetic-programming evolved trading programs : “zero-intelligence” strategies and lottery trading”, Proc. ICONIP’2006.

[SiAl03] S. Silva and J. Almeida, “Dynamic Maximum Tree Depth - A Simple Technique for Avoiding Bloat in Tree-Based GP”, GECCO 2003, LNCS 2724, pp. 1776–1787, 2003.

[Str03] M.J. Streeter, “The Root Causes of Code Growth in Genetic Programming”, EuroGP 2003, pp. 443-454, 2003.

[TeHe04] M.D. Terrio, M. I. Heywood, “On Naïve Crossover Biases with Reproduction for Simple Solutions to Classification Problems”, GECCO 2004, 2004.

[ZuPiMa01] G. Zumbach, O.V. Pictet, and O. Masutti, “Genetic Programming with Syntactic Restrictions applied to Financial Volatility Forecasting”, Olsen & Associates, Research Report, 2001.

[EvFe98] M. Evett, T. Fernandez, “Numeric Mutation Improves the Discovery of Numeric Constants in Genetic Programming”, Genetic Programming 1998: Proceedings of the Third Annual Conference, 1998.


References (2/2)

[Kab02] M. Kaboudan, “GP Forecasts of Stock Prices for Profitable Trading”, Evolutionary computation in economics and finance, 2002.

[SaTe02] M. Santini, A. Tettamanzi, “Genetic Programming for Financial Series Prediction”, Proceedings of EuroGP'2001, 2001.

[BhPiZu02] S. Bhattacharyya, O. V. Pictet, G. Zumbach, “Knowledge-Intensive Genetic Discovery in Foreign Exchange Markets”, IEEE Transactions on Evolutionary Computation, vol 6, n° 2, April 2002.

[LaPo02] W.B. Langdon, R. Poli, “Foundations of Genetic Programming”, Springer Verlag, 2002.

[Kab00] M. Kaboudan, “Genetic Programming Prediction of Stock Prices”, Computational Economics, vol 16, 2000.

[Wag03] L. Wagman, “Stock Portfolio Evaluation: An Application of Genetic-Programming-Based Technical Analysis”, Genetic Algorithms and Genetic Programming at Stanford 2003, 2003.

[GoFe99] W. Golubski and T. Feuring, “Evolving Neural Network Structures by Means of Genetic Programming”, Proceedings of EuroGP'99, 1999.

[Dem05] I. Dempsey, “Constant Generation for the Financial Domain using Grammatical Evolution”, Proceedings of the 2005 workshops on Genetic and evolutionary computation 2005, pp. 350-353, Washington, June 25-26, 2005.


?