DeepThought-FinML

DeepThought 1.4.2 Machine Learning for Financial Trading Systems Deep Thought Software (NZ) Ltd www.deep-thought.co © September, 2014

  • DeepThought 1.4.2

    Machine Learning for Financial Trading Systems

    Deep Thought Software (NZ) Ltd

    www.deep-thought.co

    © September, 2014

  • Contents

    1 Introduction
      1.1 Software Requirements
      1.2 Data
      1.3 Configuration
      1.4 Output Files

    2 Data
      2.1 Importing Historical Data from MT4
        2.1.1 Exporting data as CSV from MT4
        2.1.2 Importing MT4 CSV data into a DeepThought Database
      2.2 Importing from Dukascopy
      2.3 Importing from HistData

    3 Terminology

    4 Machine Learning
      4.1 Support Vector Machines (SVM)
      4.2 Gradient Boosted Trees (GBT)
      4.3 Random Forests
      4.4 Extremely Randomised Trees
      4.5 Multi-Layer Perceptron
      4.6 Ensembles
      4.7 Continuous Features
        4.7.1 Feature Normalisation
      4.8 Categorical Features

    5 Backtesting
      5.1 Backtesting Setup
      5.2 Recording and Using Recorded Signals
      5.3 Order Fill Simulation
      5.4 Paper Trading
      5.5 Files Produced During Backtesting and Paper Trading

    6 Genetic Algorithm for Parameter Search
      6.1 Configuration
      6.2 Running the Genetic Algorithm
        6.2.1 Database
      6.3 Genetic Algorithm Results
      6.4 Using Recorded Results
      6.5 The Condor Submit File
      6.6 Trouble Shooting

    7 Live and Paper Trading
      7.1 Manual Trading
      7.2 Automated Trading
      7.3 Trouble Shooting

    8 Python Scripting
      8.1 Python Installation
      8.2 Python Feature
      8.3 Python Target
      8.4 Python Predictor
      8.5 Python Signal Generation
      8.6 The deep_thought_intf Interface Object

    9 Configuration Details
      9.1 bar-series
        9.1.1 Renko Bars
        9.1.2 Summary of bar-series Options
      9.2 bar-series-collection
      9.3 model
      9.4 Features
        9.4.1 hour-of-day
        9.4.2 day-of-week
        9.4.3 bar-diff
        9.4.4 bar-attribute
        9.4.5 moving-average
        9.4.6 csv-feature
        9.4.7 python-script
      9.5 Targets
        9.5.1 bars-in-future
        9.5.2 python-script
      9.6 Predictors
        9.6.1 svm-predictor
        9.6.2 linear-svm-predictor
        9.6.3 gbt-predictor
        9.6.4 random-forest-predictor
        9.6.5 extremely-randomised-trees-predictor
        9.6.6 multi-layer-perceptron-predictor
        9.6.7 python-predictor
      9.7 predictor-ensemble
      9.8 signal-generator
      9.9 trader
      9.10 backtest
      9.11 genetic-algo

    10 Commandline Tools
      10.1 Candle Statistics (--stats)
      10.2 Generate Bars (--generate-bars)
      10.3 Generating a Manual Signal
      10.4 Generating Feature Statistics (--generate-feature-stats)
      10.5 Extracting a Training Set (--extract-training-set)
      10.6 SVM Grid Search (--svm-param-search-c)
      10.7 GBT Grid Search (--gbt-param-search-c)
      10.8 Printing XML Configuration Documentation (--print-config)

    11 Fundamental Indicators (Experimental)
      11.1 Fundamental Feature

    12 Tutorial: Preparing the Commandline
      12.1 Step 1: Open the commandline
      12.2 Step 2: Open the defaults window
      12.3 Step 3: Change the font
      12.4 Step 4: Change the default window size

    13 Tutorial: Backtesting in DeepThought and MT4
      13.1 Step 1: Edit the configuration
      13.2 Step 2: Start the DeepThought backtest
      13.3 Step 3: Copy files to Metatrader
      13.4 Step 4: Modify the EA
      13.5 Step 5: Running Metatrader Strategy Tester
      13.6 Step 6: Optimisation with MT Strategy Tester
      13.7 Step 7: Analyse the Results

    A Sample Configuration

    B Condor Setup and Operation
      B.1 Installation
        B.1.1 Adding a Condor User
      B.2 Useful Commands
        B.2.1 condor_status
        B.2.2 condor_q
        B.2.3 condor_rm

  • List of Figures

    4.1 Overfitting
    4.2 Multi-Layer Perceptron with 3 inputs, 5 hidden and 2 output neurons.

    9.1 Type 1 Renko Bars
    9.2 Type 2 Renko Bars

    12.1 Opening the DeepThought Commandline
    12.2 Opening the Defaults Window
    12.3 Changing the Font
    12.4 Changing the Window Size/Layout

    13.1 Editing the Configuration
    13.2 Starting the Backtest
    13.3 The Completed Backtest
    13.4 Metatrader tester setup
    13.5 The Completed Backtest
    13.6 Enabling the Genetic Optimisation in Metatrader
    13.7 Selecting which Parameters to Optimise
    13.8 Enabling Optimisation in the Strategy Tester
    13.9 List of the Best Results of the Metatrader Optimiser
    13.10 Report of the Optimum Settings
    13.11 Graph of a Test With Optimum Settings

    B.1 Condor Setup 1
    B.2 Condor Setup 2
    B.3 Condor Setup 3
    B.4 Condor Setup 4
    B.5 Condor Setup 5
    B.6 Condor Setup 6
    B.7 Condor Setup 7
    B.8 Condor Setup 8
    B.9 Condor Setup 9

  • List of Tables

    4.1 Normalisation and Scaling Schemes
    4.2 Binarising categorical variables.

    5.1 Files Produced During Backtesting/Paper Trading

    6.1 parameter configuration options for the genetic-algo configuration section.

    7.1 DeepThought parameters for the Metatrader EA.

    8.1 Summary of the deep_thought_intf interface object

    9.1 Sections in the XML configuration file
    9.2 The effect of the delay-minutes-offset parameter on intraday candles.
    9.3 bar-series configuration options.
    9.4 Features used as independent inputs to machine learning models.
    9.5 hour-of-day feature.
    9.6 day-of-week feature.
    9.7 Price difference examples for the bar-diff feature.
    9.8 bar-diff feature parameter options.
    9.9 bar-attribute feature parameter options.
    9.10 moving-average feature parameter options.
    9.11 csv-feature parameter options.
    9.12 python-script feature parameter settings.
    9.13 bars-in-future target.
    9.14 python-script target.
    9.15 svm-predictor configuration options.
    9.16 params configuration options for the svm-predictor.
    9.17 linear-svm-predictor configuration options.
    9.18 params configuration options for the linear-svm-predictor.
    9.19 params configuration options for the gbt-predictor.
    9.20 gbt-predictor options.
    9.21 params configuration options for the random-forest-predictor.
    9.22 Random Forest random-forest-predictor options.
    9.23 Multi-layer Perceptron multi-layer-perceptron-predictor options.
    9.24 params configuration options for the multi-layer-perceptron-predictor.
    9.25 python-predictor parameter settings.
    9.26 retrain-period options for the predictor-ensemble.
    9.27 signal-generator configuration options
    9.28 trader configuration options.
    9.29 backtest options.
    9.30 genetic-algo options.
    9.31 parameter configuration options for the genetic-algo configuration section.

    10.1 Column meanings using the --stats commandline option.
    10.2 --generate-bars parameters.

    11.1 title values for the fundamental-indicator feature.

    B.1 The Job States in Condor.

  • Listings

    8.1 Python feature configuration.
    8.2 Python script example defining a feature.
    8.3 Python target configuration.
    8.4 Python script example defining a target.
    8.5 Python predictor configuration example.
    8.6 Python script example for a predictor.
    8.7 Python signal generator configuration example.
    8.8 Python script example for the signal generator.

  • Chapter 1

    Introduction

    DeepThought is a sophisticated software package for creating trading systems using state-of-the-art machine learning algorithms. Currently supported are Support Vector Machines (SVM), Linear Support Vector Machines (LSVM), Gradient Boosted Trees (GBT) and Random Forests. Other methods will be added over time if there is a potential benefit to trading.

    This software tool is designed for people who are serious about their trading. It does have a learning curve, so be prepared to spend some time understanding and researching before rushing to live trading. If you are looking for a $99 get-rich-quick EA¹ and do not want to spend any time developing your own system, then this probably isn't the tool for you. If you believe that $99 get-rich-quick EAs actually exist and do what they claim, then this definitely isn't the tool for you.

    Predicting financial markets is a difficult problem. The patterns we are attempting to forecast are extremely weak. There are many academic papers that discuss which machine learning algorithm is best: SVM versus Neural Networks versus Random Forests, etc. We have found that the features that make up an observation used for forecasting are much more important than the actual algorithm. If you have a feature set which does not contain any patterns, then whichever technique you use will not work. Thus it is better to spend the majority of your time working on feature selection and engineering rather than fussing about SVM versus Neural Networks.

    DeepThought is a command line tool that operates on XML configuration files. A DLL version integrates with Metatrader for live trading. Both the DLL and the command line EXE are built from the same source code. The configuration file contains all the settings necessary for both backtesting and live trading, so once a good configuration has been found, the DLL can use the same configuration without modification. A genetic algorithm can be used for parameter search.

    1.1 Software Requirements

    Scripts in Python are provided to perform analysis on backtested configurations, and to import data into DeepThought databases. It is suggested that Python(x,y) is used, as it provides a full Python development environment, including common scientific, mathematical and plotting libraries. It can be downloaded from code.google.com/p/pythonxy/

    ¹ EA stands for Expert Advisor, Metatrader's terminology for a script which trades automatically without human intervention.

    1.2 Data

    DeepThought needs access to historical data for backtesting (when it is not connected to a trading platform) as well as for live and paper trading, so data is stored in separate SQLite databases. These can be inspected using any SQLite tool, such as Sqliteman, available for free from sqliteman.com.

    If you have access to reliable historical data, then you should import this into the database. Python scripts have been supplied for this purpose; see sections 2.1.2 and 2.3 for more details. This database is also used for live trading. When Metatrader is running, the DeepThought EA collects market ticks and passes them to the DLL. The ticks are used to create 1 minute candles, which are stored in the database used for signal generation. Metatrader is only used as an order placement/management system; all signal logic is contained in the DLL.

    1.3 Configuration

    This is the heart of the system. Use a text editor such as Notepad++ to edit XML configuration files. A few samples are supplied in the examples file. Each configuration file must be in its own unique directory with the filename config.xml. This is because you will likely be working on several config files at the same time, or want to keep previous config files with their output for reference purposes. It is easier to keep all related files together in the same directory as it minimises clutter.

    1.4 Output Files

    During backtesting, paper trading and live trading, various files are produced recording the signals generated, the log file, PnL and the feature statistics. These are listed in table 5.1 on page 13.

  • Chapter 2

    Data

    Data is stored in SQLite databases. Each instrument has its own database. These are generally stored in the directory C:\FX Database (in Windows). These databases are used both for backtesting and for live trading. There are certain limits on EA access to data in MT4 which can only be overcome by not using MT4 for historical data access. Data can also be imported from other sources such as www.histdata.com, and a Python script is provided for this purpose.

    The DeepThought EA running in MT4 builds 1 minute candles from ticks passed from MT4. These are automatically stored in the database as they are created.
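    The candle-building step can be pictured with a minimal sketch. It assumes each tick is a (timestamp, price) pair sorted by time; the function name and the candle layout are illustrative only and are not taken from the DeepThought source.

```python
from datetime import datetime

def ticks_to_minute_candles(ticks):
    """Aggregate (timestamp, price) ticks into 1 minute OHLC candles.

    Ticks are assumed to be sorted by time. Returns a list of dicts.
    """
    candles = []
    current = None
    for ts, price in ticks:
        minute = ts.replace(second=0, microsecond=0)  # start of the tick's minute
        if current is None or current["time"] != minute:
            # The first tick of a new minute opens a new candle.
            current = {"time": minute, "open": price, "high": price,
                       "low": price, "close": price}
            candles.append(current)
        else:
            current["high"] = max(current["high"], price)
            current["low"] = min(current["low"], price)
            current["close"] = price
    return candles

# Hypothetical ticks: three in the 10:00 minute, one in the 10:01 minute.
ticks = [
    (datetime(2014, 9, 1, 10, 0, 5), 1.3130),
    (datetime(2014, 9, 1, 10, 0, 40), 1.3134),
    (datetime(2014, 9, 1, 10, 0, 59), 1.3128),
    (datetime(2014, 9, 1, 10, 1, 2), 1.3131),
]
candles = ticks_to_minute_candles(ticks)
```

    Minutes without any ticks produce no candle in this sketch; how DeepThought treats such gaps is not stated here.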

    2.1 Importing Historical Data from MT4

    Data can be exported as CSV files from MT4. The first task is to ensure MT4 has the maximum amount of data available from the broker.

    2.1.1 Exporting data as CSV from MT4

    You can export data from MT4 to a CSV file in the following way:

    1. Open a 1 minute chart of the instrument for which you want data.

    2. Select Tools → Options and, in the Charts tab, make sure that Max bars in history and Max bars in chart are set to something huge. If not, enter something like 9999999999999.

    3. Make sure auto-scrolling is off by checking Charts → Auto Scroll.

    4. Press and hold the Page Up key. This forces MT4 to download older data and is more reliable than Tools → History Center → Download. This can take a while, and the amount of data available depends on your broker.

    5. Once the chart stops downloading data, select Tools → History Center and navigate to the 1 Minute (M1) option of the desired instrument. Click on Export and save as a CSV file.

    2.1.2 Importing MT4 CSV data into a DeepThought Database

    The python directory contains scripts to import historical data. To import data from a CSV file exported from MT4, use the following command:

    python import_mt4_csv.py -d C:\FX_database\EURUSD.db -c EURUSDm1.csv -n

    The above command, run from the python directory, will create a database EURUSD.db in the C:\FX_database directory and import the data in EURUSDm1.csv. The script assumes the file EURUSDm1.csv is in the same directory as the script. The -n parameter will create a new database. If you have an existing database, omit this parameter and the new data will be merged with the existing data. Where an imported record conflicts with an existing one, the existing record is kept; only gaps of missing data are filled in.

    It is useful to run the above script once a week, maybe at the weekend, to ensure that any data gaps caused by network outages, or other interruptions to the DeepThought EA running in MT4, are filled.
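    The merge behaviour described above (existing records are kept, gaps are filled) can be illustrated with SQLite's INSERT OR IGNORE. This is a sketch only: the table layout is hypothetical, and the import script may implement the merge differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for the illustration
# Hypothetical table layout; the real DeepThought schema may differ.
conn.execute("CREATE TABLE candles (time TEXT PRIMARY KEY, close REAL)")

existing = [("2014-09-01 10:00", 1.3130), ("2014-09-01 10:02", 1.3135)]
conn.executemany("INSERT INTO candles VALUES (?, ?)", existing)

# Imported rows: the first conflicts with an existing candle, the second fills a gap.
imported = [("2014-09-01 10:00", 9.9999), ("2014-09-01 10:01", 1.3132)]
# INSERT OR IGNORE skips any row whose primary key already exists.
conn.executemany("INSERT OR IGNORE INTO candles VALUES (?, ?)", imported)

rows = conn.execute("SELECT time, close FROM candles ORDER BY time").fetchall()
```

    After the merge the 10:00 candle still holds its original close, and the missing 10:01 candle has been added.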

    2.2 Importing from Dukascopy

    Dukascopy (www.dukascopy.com) makes historical tick data available for free. This can be downloaded using a free tool at www.strategyquant.com/tickdatadownloader/. Note that DeepThought is not associated or affiliated with Dukascopy or Strategyquant in any way.

    Once the tick data has been downloaded using the above tool, it can be imported using the command:

    deepthought --import-dukascopy-csv D:\TickDataDownloader\tickdata\EURUSD.csv --dbname C:\FX Database\EURUSD.db

    where the tick downloader has downloaded and created a single CSV file in D:\TickDataDownloader\tickdata\EURUSD.csv. A new database will be created if it doesn't exist; otherwise the new data will be merged with the existing database. When merging, the old data will not be overwritten.

    2.3 Importing from HistData

    In the python directory there is a script for importing historical data files downloaded from www.histdata.com. Run this script in a similar way to the MT4 import script:

    python import_histdata.py --db --dir --createdb --unzip

    The --createdb and --unzip parameters are optional. If the --createdb option is present, a new database will be created. To prevent accidental overwriting, the process will fail if a database with the same name already exists. HistData files are downloaded in zip format. You can unzip these manually, or supply the --unzip option to have the script do this before importing.

  • Chapter 3

    Terminology

    We define a Feature as a type of information used in the training/forecasting sets. Examples of features are "Hour of Day" and "Close price difference between two candles". Each feature has at least one attribute. An attribute is the actual number or value used in the training/forecasting set. The "Hour of Day" feature has one attribute, the hour, and the "Close price difference between two candles" feature can have as many attributes as defined.

    A feature vector is a series of feature attributes that form an observation. This could comprise, for example, Hour of Day, Day of Week, 30 differences in close price and 10 differences in moving averages, so the feature vector would have 42 attributes.

    An attribute is classed as either a continuous or a categorical variable. Continuous variables are variables whose values are real numbers, such as a change in price. Categorical variables are variables that can only take specific values, such as day of week, which can only be one of {sun,mon,tue,wed,thu,fri,sat}.

    A label is the forecast variable, i.e. the thing we are trying to predict. When used during the (supervised) training phase, each of the feature vectors used for training must be assigned a label. The set of feature vectors together with their labels is the training set.

    The current version of DeepThought focuses mainly on classification problems. That is, the label is either 1 (for true) or -1 (for false). Regression problems attempt to forecast magnitude as well as direction. There is limited support for regression problems, and this will be enhanced in future versions. We have found that it is hard enough to predict whether the market will move up or down, let alone by how much.

    A label is typically something like "the close price is higher/lower at the end of the next candle in the future". A label of 1 would indicate higher, and a label of -1 would indicate lower.

    A model is the collection of parameters that define a feature vector. This would include parameters such as how many previous close price differences to include, and how to scale the values.

    A predictor is a self-contained forecaster, such as an individual SVM or GBT.

    After the training phase, a final model is built for each predictor. Note that this is a different usage of the term model to the one given above. Currently these are stored in memory, as retraining is frequent. The model is used to forecast a label for an unlabelled feature vector.
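    As a rough illustration of the 42-attribute example, the sketch below concatenates the attributes of four features into one observation. The helper names and the sample prices are hypothetical; in DeepThought itself the feature vector is defined in the XML configuration rather than in code like this.

```python
def diffs(values, count):
    """Return the last `count` one-step differences of `values`, oldest first."""
    recent = values[-(count + 1):]
    return [b - a for a, b in zip(recent, recent[1:])]

def build_feature_vector(hour, day_of_week, closes, moving_averages):
    """Concatenate the attributes of four features into one observation."""
    return ([hour] + [day_of_week]
            + diffs(closes, 30)
            + diffs(moving_averages, 10))

# Made-up price history: 31 closes give 30 differences, 11 averages give 10.
closes = [1.30 + 0.0001 * i for i in range(31)]
moving_averages = [1.30 + 0.00005 * i for i in range(11)]
vector = build_feature_vector(10, 1, closes, moving_averages)
```

    The resulting list has 1 + 1 + 30 + 10 = 42 entries, one per attribute.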
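    A minimal sketch of such a labelling rule follows. The function name is illustrative, and treating an unchanged close as "lower" is an assumption made only to keep the labels binary.

```python
def next_candle_labels(closes):
    """Label each candle 1 if the next close is higher, otherwise -1.

    The last candle has no successor, so it receives no label.
    """
    return [1 if nxt > cur else -1 for cur, nxt in zip(closes, closes[1:])]

closes = [1.3130, 1.3134, 1.3128, 1.3128, 1.3140]
labels = next_candle_labels(closes)  # [1, -1, -1, 1]
```

    Note that the most recent candle can never be labelled until the following candle has closed, which is why it is the one being forecast.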

  • Chapter 4

    Machine Learning

    It is beyond the scope of this manual to describe each of the machine learning algorithms in detail. The interested reader should consider the Stanford and/or Caltech machine learning courses offered via iTunesU (for free).

    Machine learning problems tend to be divided into two main approaches: classification, where the goal is to forecast discrete classes; and regression, where the goal is to forecast a real number.

    DeepThought supports both classification and regression. For trading systems, it is probably best to focus on classification, as it is difficult enough to forecast whether the market will go up or down, let alone by how much. Most classification problems are two-class. Multi-class classification is possible, generally by reducing the problem to several two-class problems or to a one-versus-all setting.

    Often the process of applying machine learning to trading systems involves an offline step of building a model, then deploying the model to the trading system. DeepThought takes a different approach by enabling the system to retrain continuously. While it is possible to build a single model and then forecast using only this model, the preferred mode of operation is to retrain after the forecast has been made and the signal has been sent to the market. The sequence of events is:

    1. At system spin-up, train all predictors.

    2. New candle (or Renko bar) received and saved to database.

    3. Forecasts made by ensemble of predictors and combined into a single signal.

    4. Signal sent to trading platform (e.g. Metatrader).

    5. All predictors retrained, ready for the next candle to complete to trigger the next forecast.

    Thus your system can continuously adapt to the market.
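    The five steps above can be sketched as a simple loop. The predictor here is a deliberately trivial stand-in (it forecasts the most common past label) and every name is hypothetical; the point is only the order of operations: forecast with the model trained before the new candle arrived, then retrain.

```python
class MajorityPredictor:
    """Toy stand-in for an SVM or GBT: forecasts the most common past label."""

    def __init__(self):
        self.forecast_label = 1

    def train(self, labels):
        ups = sum(1 for label in labels if label == 1)
        self.forecast_label = 1 if ups * 2 >= len(labels) else -1

    def forecast(self):
        return self.forecast_label

def up_down_labels(closes):
    """1 if the next close is higher, otherwise -1."""
    return [1 if b > a else -1 for a, b in zip(closes, closes[1:])]

def run(initial_closes, new_closes, predictor):
    closes = list(initial_closes)
    predictor.train(up_down_labels(closes))      # step 1: train at spin-up
    signals = []
    for close in new_closes:
        closes.append(close)                     # step 2: new candle saved
        signals.append(predictor.forecast())     # steps 3 and 4: forecast, send signal
        predictor.train(up_down_labels(closes))  # step 5: retrain for the next candle
    return signals

predictor = MajorityPredictor()
signals = run([1.0, 1.1, 1.2], [1.1, 1.0, 0.9], predictor)
```

    In this run the predictor keeps signalling 1 while past up-moves are at least as common as down-moves, and only switches to -1 after the final retrain.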

    4.1 Support Vector Machines (SVM)

    The parameters associated with SVMs are: kernel type, C (penalty), γ (for the radial basis function kernel) and ε (only for regression). Generally the radial basis function kernel is used with classification, so the only parameters to select are C and γ. The DeepThought command line tool has an option to perform a grid search using 5-fold cross-validation. This means the results provided are for out-of-sample data, which helps to avoid over-fitting. See section 10.6 on page 81 for details on how to do a grid search.
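    The idea behind a cross-validated grid search can be sketched in plain Python. This is not DeepThought's implementation: the classifier below is a toy RBF-weighted vote rather than an SVM, it only has the kernel width parameter (named gamma in the code), and all function names are hypothetical. A real SVM grid would also vary C.

```python
import itertools
import math

def k_fold_indices(n, k):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test

def cross_val_accuracy(fit, xs, ys, params, k=5):
    """Mean accuracy on the held-out fold, averaged over k folds."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        predict = fit([xs[i] for i in train], [ys[i] for i in train], **params)
        hits = sum(1 for i in test if predict(xs[i]) == ys[i])
        scores.append(hits / len(test))
    return sum(scores) / len(scores)

def grid_search(fit, xs, ys, grid, k=5):
    """Try every parameter combination; return (best_score, best_params)."""
    names = sorted(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cross_val_accuracy(fit, xs, ys, params, k)
        if best is None or score > best[0]:
            best = (score, params)
    return best

def fit_rbf_vote(train_x, train_y, gamma):
    """Toy classifier, NOT an SVM: sign of the RBF-weighted sum of training labels."""
    def predict(x):
        total = sum(y * math.exp(-gamma * (x - xi) ** 2)
                    for xi, y in zip(train_x, train_y))
        return 1 if total >= 0 else -1
    return predict

# Made-up one-dimensional data: label 1 when x >= 1.0, otherwise -1.
xs = [i / 10 for i in range(20)]
ys = [1 if x >= 1.0 else -1 for x in xs]
best_score, best_params = grid_search(fit_rbf_vote, xs, ys,
                                      {"gamma": [0.01, 1.0, 100.0]})
```

    Because each score is computed only on samples that were held out of training, the reported accuracy is an out-of-sample estimate for every parameter combination.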

    DeepThought supports linear SVMs and kernel SVMs. Linear SVMs are faster than kernel SVMs, but may not perform well if the problem is non-linear, i.e. the dependent variable (the thing we are forecasting) is not a linear combination of the inputs. Kernel SVMs use a kernel function such as a Gaussian to map inputs into a higher dimensional space, then a linear algorithm is applied to these higher dimensional features. This enables them to model non-linear relationships. The trade-off is that they can be prone to overfitting, and care must be taken to avoid this by properly evaluating on out-of-sample test data. Overfitting is where the model has very accurately fitted the training data but does not generalise the underlying relationship well. This is illustrated in figure 4.1.

    Figure 4.1: Overfitting, where the green line has overfitted the training data. A better fit is the black line, where the underlying function has been modelled by allowing a few mis-classifications in the training data.

    4.2 Gradient Boosted Trees (GBT)

    GBT is a decision tree method. It works by creating an initial decision tree and then creating successive trees, each trained on the errors of the trees before it. This is termed a greedy algorithm. In general, more trees give a better fit, and the method is fairly resistant to overfitting. An advantage of decision-tree based methods, including Random Forests detailed below, is that no normalisation or outlier removal is required. Two parameters are required by the GBT: number of trees and tree depth.
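    The "train each new tree on the errors of the previous trees" idea can be sketched with depth-1 trees (stumps) on one-dimensional data. This is an illustration under simplifying assumptions (squared error, a fixed learning rate, hypothetical function names), not the GBT implementation used by DeepThought.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimises squared error."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean

def fit_boosted(xs, ys, n_trees=20, learning_rate=0.5):
    """Greedily add stumps, each fitted to the current residual errors."""
    trees = []
    predictions = [0.0] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        predictions = [p + learning_rate * tree(x)
                       for x, p in zip(xs, predictions)]
    return lambda x: sum(learning_rate * tree(x) for tree in trees)

def squared_error(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))

# Made-up data with a pattern that no single stump can fit.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [-1, -1, 1, 1, -1, -1, 1, 1]
model = fit_boosted(xs, ys)
error_after = squared_error(model, xs, ys)
error_before = squared_error(lambda x: 0.0, xs, ys)
```

    Each added stump can only reduce or keep the training error, which is why the number of trees and the tree depth are the two parameters that control how closely the training data is fitted.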

    4.3 Random Forests

    Random forests work by combining the results of many weak predictors to form a stronger predictor. The weak predictor is a decision tree, and each decision tree is built on a random subset of features and samples. When forecasting, the final prediction is the most common class for classification, or an average of each tree's prediction for regression.

    4.4 Extremely Randomised Trees

    This is a variation of Random Forests where a different method is used to select the features/data for each tree.

    4.5 Multi-Layer Perceptron

    Also known as Neural Networks. Probably the most widely familiar form of machinelearning. DeepThought supports two methods of training a multi-layer Perceptron: back-propagation and Rprop. For details on back-propagation see http://en.wikipedia.org/wiki/Backpropagation, and for Rprop see http://en.wikipedia.org/wiki/Rprop.

    A multi-layer perceptron is made up of a number of neurons, connected in layers. The first layer takes the input so the number of neurons in this layer is always equal to the number of attributes in the input. The output layer is where we read the forecast so the number of neurons is equal to the number of attributes that make up the forecast variable. For a two-class classification problem there will be two output neurons, and for a regression problem there will only be one.

    The multi-layer perceptron can also contain hidden layers. Normally there is only one hidden layer, but we can have more than one. The topology is illustrated in figure 4.2.

    Figure 4.2: Multi-Layer Perceptron with 3 inputs, 5 hidden and 2 output neurons.
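    To make the 3-5-2 topology concrete, the sketch below runs one forward pass through such a network with random, untrained weights. It is illustrative only: the tanh hidden activation and the softmax output are assumptions chosen for the example, and training (back-propagation or Rprop) is not shown.

```python
import math
import random

# Illustrative forward pass; activations and names are assumptions, not DeepThought's.

def make_layer(n_in, n_out, rng):
    """One weight row per neuron; the last entry of each row is the bias."""
    return [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(layer, inputs, activation):
    return [activation(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
            for row in layer]


def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(hidden, output, inputs):
    h = layer_forward(hidden, inputs, math.tanh)      # 5 hidden activations
    raw = layer_forward(output, h, lambda v: v)       # 2 raw output values
    return softmax(raw)                               # 2 class probabilities


rng = random.Random(42)
hidden = make_layer(3, 5, rng)   # 3 inputs -> 5 hidden neurons
output = make_layer(5, 2, rng)   # 5 hidden -> 2 output neurons
probs = forward(hidden, output, [0.2, -0.1, 0.7])
```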

    4.6 Ensembles

    Ensembles have been described as the closest thing to a free lunch in machine learning. Much effort has been put into the implementation of ensembles in DeepThought. Each of the predictor types can have one or more sets of parameters. One of the drawbacks of SVMs is that hyper-parameters must be selected. For classification these are C and γ. As the patterns we are attempting to predict are extremely weak, we can never be sure that a single set of specific values will perform as well as indicated during cross-validation. A way around this is simply to use an ensemble of all values and use the majority vote as the final prediction. This is what each predictor does.

    We can also mix different predictor types, and have multiple models per predictor type. The number and variety of predictors is limited only by computational power. DeepThought will use all available cores on your PC during backtesting, but it can still be slow when large ensembles are used. A future version will be able to spread a single backtest across several machines. Note that the genetic algorithm can use an unlimited cluster of machines by utilising the Condor system.
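    The majority vote over a grid of hyper-parameter values can be sketched as below. The predictors here are hypothetical stand-ins, each just a simple rule derived from its (C, gamma) pair, because training real SVMs is outside the scope of a short example. Only the voting logic is the point.

```python
from collections import Counter
from itertools import product

# Hypothetical sketch: stub predictors stand in for trained SVMs.

def make_stub_predictor(c, gamma):
    """Stand-in for a model trained with hyper-parameters (c, gamma)."""
    threshold = 0.1 * (gamma / c)
    return lambda features: "BUY" if sum(features) > threshold else "SELL"


def ensemble_vote(predictors, features):
    """Return the majority signal, how many predictors agreed, and the ensemble size."""
    votes = Counter(p(features) for p in predictors)
    winner, consensus = votes.most_common(1)[0]
    return winner, consensus, sum(votes.values())


# One predictor per combination of C and gamma instead of a single "best" pair.
grid = list(product([2.0 ** e for e in range(0, 4)],     # C values
                    [2.0 ** e for e in range(-4, 0)]))   # gamma values
predictors = [make_stub_predictor(c, g) for c, g in grid]
signal, consensus, total = ensemble_vote(predictors, [0.02, -0.01, 0.01])
```

    Reporting the consensus count alongside the signal lets downstream logic ignore forecasts where the ensemble is nearly split.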


    4.7 Continuous Features

    Continuous features are features whose value is a floating point number, i.e. they can take any value. An example is the difference between two close prices. Different features can have different ranges. For example, comparing a feature of the difference between two close prices with a feature of the difference between moving average values 100 bars apart, we can see that the latter will have a greater range than the former. To adjust for this, feature values must be normalised to bring them into more-or-less the same range. This prevents a feature with a large range squashing or overwhelming features with smaller ranges. DeepThought has several methods of approaching this.

    4.7.1 Feature Normalisation

    Normalisation is the process that scales the values of each feature into the same range. The parameters found during normalisation are used to scale the feature vector used for forecasting. DeepThought supports several techniques for normalising training/forecasting features, listed in table 4.1.

    Table 4.1: Normalisation and Scaling Schemes

    Scaling Type    Description
    min-max         Scale all values between -1 and 1 using the minimum and maximum values for the feature.
    zscore          For each feature value, subtract the mean and divide by the standard deviation. The resulting scaled feature has a mean of zero and a standard deviation of one.
    div-sd          Divide each feature by the standard deviation.
    div-max         Divide each feature by the maximum of the absolute values of the maximum and minimum.
    log10           Take the base-10 logarithm of each feature value.

    Which scheme works best is a task for trial-and-error, but starting with min-max and zscore is recommended.
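    The normalisation schemes can be sketched in a few lines. The sketch follows the two-stage idea described in section 4.7.1: parameters are found on the training values, then reused to scale the value being forecast. The function names are hypothetical. The target range of -1 to 1 for min-max and the use of the population standard deviation are assumptions, and constant features (zero range or zero deviation) are not handled.

```python
import math
import statistics

# Hypothetical helpers for illustration; not DeepThought's API.

def fit_scaler(values, scheme):
    """Learn the scaling parameters for one feature from its training values."""
    if scheme == "min-max":
        return {"scheme": scheme, "lo": min(values), "hi": max(values)}
    if scheme in ("zscore", "div-sd"):
        return {"scheme": scheme,
                "mean": statistics.fmean(values),
                "sd": statistics.pstdev(values)}   # assumption: population std dev
    if scheme == "div-max":
        return {"scheme": scheme, "max_abs": max(abs(min(values)), abs(max(values)))}
    if scheme == "log10":
        return {"scheme": scheme}
    raise ValueError("unknown scheme: " + scheme)


def apply_scaler(params, x):
    """Scale a single value (training or forecasting) with previously fitted parameters."""
    scheme = params["scheme"]
    if scheme == "min-max":
        # Assumed target range is [-1, 1].
        return 2.0 * (x - params["lo"]) / (params["hi"] - params["lo"]) - 1.0
    if scheme == "zscore":
        return (x - params["mean"]) / params["sd"]
    if scheme == "div-sd":
        return x / params["sd"]
    if scheme == "div-max":
        return x / params["max_abs"]
    return math.log10(x)


training = [0.0, 5.0, 10.0]
params = fit_scaler(training, "min-max")
scaled_forecast_input = apply_scaler(params, 7.5)
```

    Note that a forecasting value outside the training minimum and maximum will fall outside the target range under min-max, which is one reason to compare schemes by trial-and-error.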

    4.8 Categorical Features

    Categorical features have specific values. An example is day of week. We could map days of week to an integer in the range 0...6 and use that as a continuous value and treat it as above, or we could binarise it into multiple attributes. DeepThought supports both approaches. When a feature is binarised, it is mapped into multiple attributes that can take values of zero or one. A set of attributes for the feature can have only one attribute with the 1 value and all others are 0. This is sometimes called a one-hot vector approach. Table 4.2 illustrates this process.


    Table 4.2: Binarising categorical variables.

    Day of Week Binarised encoding

    Sunday 1 0 0 0 0 0 0

    Monday 0 1 0 0 0 0 0

    Tuesday 0 0 1 0 0 0 0

    Wednesday 0 0 0 1 0 0 0

    Thursday 0 0 0 0 1 0 0

    Friday 0 0 0 0 0 1 0

    Saturday 0 0 0 0 0 0 1
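    The encoding in table 4.2 can be reproduced with a short helper. This is a minimal sketch; the function name binarise is hypothetical.

```python
# Hypothetical helper that reproduces the encoding shown in table 4.2.

DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]


def binarise(value, categories):
    """Map a categorical value to a list with a single 1 at the value's position."""
    if value not in categories:
        raise ValueError("unknown category: " + str(value))
    return [1 if c == value else 0 for c in categories]


tuesday = binarise("Tuesday", DAYS)
```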

  • Chapter 5

    Backtesting

    You will probably spend a lot of time in backtesting as there is a lot of trial-and-error involved in creating a system. The built-in backtester is capable of simulating market and limit orders, take profit, stop loss and move to break even. It operates on 1 minute candles. The backtester also operates when in live trading mode so you can compare actual results with simulated results in real time. It also functions as a paper-trader when running in live trading mode and the EA is set to not place any actual trades.

    Chapter 9 describes configuration settings in detail. This chapter focuses on the process of backtesting.

    5.1 Backtesting Setup

    The first step is to create a unique directory that contains a configuration file. The filename is either config.xml or _config.xml. The latter filename ensures the configuration file is always at the top of a directory listing. Each configuration file is kept in a separate directory as files are created during the backtest for later analysis. If you are working on several configurations at the same time, or want to keep previous configuration files with their results, then having each in a separate directory avoids clutter.

    The configuration file contains a section named backtest. A typical setup is shown below:

    2013-01-01
    2013-12-08
    True
    False
    python "C:\DeepThought\python\analyse_backtest_results.py" %CONFIG_LOCATION%

    The display-progress setting turns on/off the display of trades in the console as they are closed. Windows display of text in the console is slow (compared to Linux), so if you are using recorded signals as described below, turning the progress display off can speed up the backtest further. You don't lose anything by turning the progress off as all results, including progress, are logged in the various output files.

    The actual backtest is started by the command:

    deepthought --backtest C:\configs\EURUSD MA TEST


    where C:\configs\EURUSD MA TEST is the directory where the config.xml is located.

    5.2 Recording and Using Recorded Signals

    During the backtesting (and paper-trading) process, the signals are recorded to a file and stored in the same directory as the configuration file. If you are using a large ensemble of machine learning predictors, a backtest over a year can take hours or even days. Sometimes you may not be changing the machine learning settings, but experimenting with other settings such as take profit, or trigger. In this situation you can run the backtest once to generate and record the signals. Before running the next backtest, set the use-recorded-signals setting to True. The next time the backtest is run, the machine learning training and forecasting will be bypassed and the signal looked up from the recorded signals file. This dramatically shortens the time to run a backtest, provided none of the machine learning settings have been altered.

    Another use of the recorded signals file is for backtesting in Metatrader. An EA has been provided which uses these signals in Metatrader's strategy tester. The recorded signals file is named recorded.signals.csv and must be copied to:

    \tester\files

    For example, if your broker was InterbankFX this directory would be

    C:\Program Files (x86)\InterbankFX\tester\files

    This is a restriction by Metatrader as EAs run in the strategy tester cannot access files outside this location. The source has been provided for this EA so you could adapt an existing system to utilise the signals.

    5.3 Order Fill Simulation

    At the close of each 1 minute candle, the simulator looks at the high and low prices and decides if order prices have been hit. Orders can have optional take-profit and stop-loss prices. There is also an optional break-even setting. If this has been set, a stop-loss is automatically set with a price equal to entry price plus 1 pip for a buy order, and entry price minus 1 pip for a sell order. The typical sequence of events in order fill simulation is:

    1. Signal indicates an order is to be placed.

    2. Order is placed. If it is a market order, it is immediately filled at the last known bid price for a sell, and the last known bid + spread price for a buy. If it is a limit order, the price is checked at the end of the next 1 minute candle.

    3. At the end of the 1 minute candle, limit orders are checked for fills by looking at the candle's high and low prices.

    4. Check take-profit, stop-loss and break-even. If the take-profit or stop-loss has been hit, close the position. If break-even has been set and the price has been reached, set a stop-loss at break even +1 pip.
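    The sequence above can be sketched as a small candle-driven simulator. This is a simplified illustration rather than DeepThought's code. The class and function names, the pip size of 0.0001, and the choice to test the stop-loss before the take-profit when both fall inside one candle are assumptions. It also ignores the bid/ask distinction inside a candle.

```python
from dataclasses import dataclass
from typing import Optional

PIP = 0.0001  # assumed pip size for a 4-decimal FX quote


@dataclass
class Candle:
    high: float
    low: float


@dataclass
class Order:
    side: str                             # "buy" or "sell"
    limit_price: Optional[float] = None   # None means a market order
    take_profit: Optional[float] = None
    stop_loss: Optional[float] = None
    break_even: Optional[float] = None    # price that triggers the break-even stop
    entry_price: Optional[float] = None   # set once the order is filled
    exit_price: Optional[float] = None    # set once the position is closed


def place(order, bid, spread):
    """Market orders fill immediately; limit orders wait for the next candle close."""
    if order.limit_price is None:
        order.entry_price = bid + spread if order.side == "buy" else bid


def on_candle_close(order, candle):
    """Apply one closed 1 minute candle to an order or open position."""
    if order.exit_price is not None:
        return
    if order.entry_price is None:
        # Pending limit order: filled if the candle's range covers the limit price.
        if candle.low <= order.limit_price <= candle.high:
            order.entry_price = order.limit_price
        return
    buy = order.side == "buy"

    def reached(price, upward):
        if price is None:
            return False
        return candle.high >= price if upward else candle.low <= price

    if reached(order.stop_loss, upward=not buy):
        order.exit_price = order.stop_loss
    elif reached(order.take_profit, upward=buy):
        order.exit_price = order.take_profit
    elif reached(order.break_even, upward=buy):
        # Move the stop to entry plus 1 pip (buy) or entry minus 1 pip (sell).
        order.stop_loss = order.entry_price + PIP if buy else order.entry_price - PIP
        order.break_even = None


order = Order("buy", take_profit=1.1050, break_even=1.1020)
place(order, bid=1.1000, spread=0.0002)
on_candle_close(order, Candle(high=1.1025, low=1.0995))   # break-even level reached
on_candle_close(order, Candle(high=1.1010, low=1.1000))   # new stop-loss is hit
```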


    5.4 Paper Trading

    The provided MT4 EA has a setting do live trade. When this is set to false then no live trades are placed, but DeepThought will continue to simulate trades using live market data, and populate the database. It is worthwhile to always run a paper-trader for the purpose of keeping the database for each instrument you use up to date. The files produced during paper-trading are identical to the files produced during backtesting as the same process is used.

    5.5 Files Produced During Backtesting and Paper Trading

    Various files are produced by the trade simulator during backtesting and paper trading. These are detailed in table 5.1.

    Table 5.1: Files Produced During Backtesting/Paper Trading

    Filename                      Description
    backtest.log                  The log file detailing all events during backtesting. Useful for debugging.
    daily.returns.csv             The daily returns of the backtest. The trade open date-time is used to group trades to the same day.
    pnl.csv                       A record of each individual trade.
    recorded.signals.csv          The signals generated by the ensemble. Used in playback during subsequent backtests, and used by MT4 in the strategy tester.
    statistics.h4-features.csv    The statistics (min, max, mean, stddev) of each attribute in a model. Useful to spot data errors as values should be reasonably stable over time. Any sharp and/or large changes should be investigated. This filename example is for a model named h4-features in the configuration.
    svm-c-rbf.forecasts.csv       For each predictor, a file is generated detailing the forecasts it made. Useful for external analysis. This filename example is for a predictor named svm-c-rbf in the configuration.

  • Chapter 6

    Genetic Algorithm for Parameter Search

    A genetic algorithm is a way of searching a large search space using methods inspired by biology. In DeepThought the genetic algorithm is used for parameter selection. We could use a brute-force approach and test every combination of parameters available; however, the (usually) very large number of combinations makes this infeasible.

    A genome defines a list of parameters. This list of parameters is tested in a backtest to produce a score which is used to rank the parameter set. In genetic algorithm terminology the backtest generates the objective function, which is specified in the configuration as Sharpe Ratio, Accuracy or Profit (in Pips). You could potentially run separate genetic algorithms and optimise on all objective functions and combine the results in an ensemble.

    DeepThought uses the Condor high performance computing clustering system. It is available for download from http://research.cs.wisc.edu/htcondor/. Condor is a system that clusters individual computers together to form a high performance cluster. It can operate on a single computer, so you can still run the genetic algorithm if you only have access to a single computer. Condor is a system for running many jobs in parallel, so has uses well beyond our use of it for genetic algorithms. Although it is beyond the scope of this document to provide detailed information on installing and using Condor, we explain the parts relevant to DeepThought. Further detail is given in appendix B.

    The genetic algorithm in DeepThought operates in the following way:

    1. DeepThought is started as a GA Server.

    2. A random population of genomes is created. This is the first generation.

    3. A configuration file is produced from a template for each genome.

    4. A Condor submit file is produced and the population submitted to Condor. Each individual configuration file is run on one core of the cluster in parallel with other configuration files.

    5. DeepThought listens on a TCP port for backtests to finish and send a summary of results.

    6. As each backtest completes, a summary is transmitted via TCP to the DeepThought GA Server. The log files and other outputs are sent back to the server and stored in individual directories for later analysis if required.

    7. After a configurable timeout has been reached, all running jobs are terminated. This step is skipped if all jobs complete before the timeout.

    8. The backtest results produced by the genomes are assessed and, using parameters detailed in section 9.11, the next generation of genomes is produced.

    9. Steps 3 to 8 are repeated until the number of generations specified in the configuration has been reached. Alternatively, the GA will stop if all possible combinations have been tested.

    DeepThought keeps a list of all genomes and their results. This is to prevent the same parameter combination being tested more than once. This file is persisted to disk after each generation. It also operates as a save point and can be used to resume a genetic algorithm in the event that it was interrupted.
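    The control flow of the steps above, including the list of already-tested genomes, can be sketched in a single process. This is an illustration under stated assumptions, not DeepThought's algorithm: the evaluate callback stands in for a backtest run through Condor, the selection, crossover and mutation rules are simple placeholders for the options described in section 9.11, and persistence to disk and the timeout are omitted.

```python
import random

# Hypothetical single-process sketch of the generation loop.

def run_ga(param_space, evaluate, population_size=6, generations=5, seed=0):
    rng = random.Random(seed)
    names = sorted(param_space)
    total = 1
    for name in names:
        total *= len(param_space[name])
    cache = {}  # genome -> score; plays the role of the persisted genome list

    def random_genome():
        return tuple(rng.choice(param_space[n]) for n in names)

    def crossover(a, b):
        return tuple(rng.choice(pair) for pair in zip(a, b))

    def mutate(genome):
        i = rng.randrange(len(names))
        changed = list(genome)
        changed[i] = rng.choice(param_space[names[i]])
        return tuple(changed)

    population = [random_genome() for _ in range(population_size)]
    for _ in range(generations):
        for genome in population:
            if genome not in cache:          # never test the same combination twice
                cache[genome] = evaluate(dict(zip(names, genome)))
        if len(cache) >= total:              # every combination has been tested
            break
        ranked = sorted(cache, key=cache.get, reverse=True)
        parents = ranked[:max(2, population_size // 2)]
        population = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                      for _ in range(population_size)]
    best = max(cache, key=cache.get)
    return dict(zip(names, best)), cache[best], cache


# Toy objective with its peak at svm-gamma=-8, svm-penalty=3 (values are exponents).
calls = []


def toy_objective(params):
    calls.append(params)
    return -((params["svm-gamma"] + 8) ** 2) - (params["svm-penalty"] - 3) ** 2


space = {"svm-gamma": list(range(-8, 1)), "svm-penalty": list(range(0, 13))}
best, best_score, cache = run_ga(space, toy_objective, population_size=10, generations=15)
```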

    6.1 Configuration

    A configuration file is supplied to DeepThought in the same format as for backtesting and live/paper trading. It is used as a template, as configuration files are created for each parameter combination (genome) to be tested. The configuration file must contain a genetic-algo section similar to the configuration snippet below:

    tcp://wraith
    55566
    -1
    sortino
    360
    20
    10
    30
    30
    2
    10

    See section 9.11 for a detailed explanation of the options. To have the genetic algorithm modify values in a configuration file, the file must have the XML tag ga-subst defined for each value that can vary, where the value of ga-subst is equal to the parameter id defined in the genetic-algo section. The example below illustrates this:

    hour-of-day
    h4

    bar-attribute
    average-close
    30
    diff
    EURUSDh4
    min-max

    ...

    svm-c-rbf
    h4-features
    false

    8
    0.015625
    1.0
    SVC
    rbf

    2000
    1

    ...

    all
    all

    0.0
    SVC
    0.0
    0.0
    20.0
    -1
    EURUSDm1
    False

    The parameter types are defined in table 6.1.

    Table 6.1: parameter configuration options for the genetic-algo configuration section.

    Option         Description
    integer        Used when the parameter can be modelled as an integer. The options available for the integer type are:
                     low     The lowest value that the integer can take.
                     high    The highest value that the integer can take.
                     step    The value to increment/decrement for different values of this parameter.
    categorical    Used when the parameter can only take certain (string) values. The options available for the categorical type are:
                     values  Comma separated list of values that this parameter can take.
    exp-2          Used when the parameter is best suited to an exponential grid search. For example SVM penalty, SVM gamma and SVM epsilon are best searched using an exponential grid search. This means that rather than use values that are linearly spaced such as 5, 10, 15, 20, ..., we use values such as 2^1, 2^2, 2^3, 2^4, .... This results in final values of 2, 4, 8, 16, .... Note that negative numbers can be used and result in the final values being less than 1, e.g. 2^-5, 2^-4, 2^-3, 2^-2, ... become 0.03125, 0.0625, 0.125, 0.25, .... The options available for the exp-2 type are:
                     low     The lowest value that the exponent can take.
                     high    The highest value that the exponent can take.
                     step    The value to increment/decrement the exponent.
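    The three parameter types can each be expanded into the list of candidate values they describe. The helper names below are hypothetical and the sketch only illustrates the arithmetic, in particular how exp-2 turns a range of exponents into powers of two.

```python
# Hypothetical helpers that expand each parameter type into its candidate values.

def integer_values(low, high, step=1):
    """Linearly spaced integers from low to high inclusive."""
    return list(range(low, high + 1, step))


def categorical_values(values):
    """Split a comma separated list of (string) values."""
    return [v.strip() for v in values.split(",")]


def exp2_values(low, high, step=1):
    """Powers of two for every exponent from low to high inclusive."""
    return [2.0 ** e for e in range(low, high + 1, step)]


penalties = exp2_values(1, 4)
gammas = exp2_values(-5, -2)
```

    With this reading, a reported value such as svm-gamma=-8 refers to the exponent, so the final gamma would be 2^-8.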

    6.2 Running the Genetic Algorithm

    The genetic algorithm is started with the following command:

    DeepThought --genetic-algo C:\configs\EURUSD MA TEST

    This will use the file config.xml or _config.xml in the directory C:\configs\EURUSD MA TEST in the same way as for backtesting described in section 5.1.

    The progress of the genetic algorithm is printed to the console, similar to the example below. In this example, we are using a population of 20 on a single machine with 8 cores. As each backtest completes, a summary of the results is displayed. The beginning of each line contains three numbers. The first is the generation number, the second the genome number and the last, the number of genomes in a population. The Compute host is the name of the machine that the backtest ran on. This is useful for monitoring a cluster of machines to see which machines are quicker and running more backtests.

    C:\DeepThought>DeepThought --genetic-algo C:\DeepThought_Configs\EURUSD_GA

    DeepThoughtLib::GeneticAlgo::SubmitToCluster
    Submitting job(s)....................
    20 job(s) submitted to cluster 41.
    Submitted to Condor cluster 41
    2014-Jan-12 17:08:59.307337 Info: DeepThoughtLib::GeneticAlgo::WaitForResults - Waiting for (20) results for generation 3. Num of jobs is 20. Max wait time is 06:00:00

    3/1/20 3003: Obj=-0.102615 Sharpe=-0.102615 PnL=-259.2 dd=-2357.7 num=1616 %=50.5569 Compute host=Slartibartfast svm-gamma=-8 svm-penalty=7 . Time left is 05:38:49.
    3/2/20 3006: Obj=-0.474801 Sharpe=-0.474801 PnL=-1489 dd=-4545.9 num=1616 %=49.1337 Compute host=Slartibartfast svm-gamma=-2 svm-penalty=4 . Time left is 05:38:47.
    3/3/20 3002: Obj=-1.028 Sharpe=-1.028 PnL=-2684.7 dd=-5193.9 num=1378 %=48.4761 Compute host=Slartibartfast svm-gamma=0 svm-penalty=12 . Time left is 05:38:31.
    3/4/20 3004: Obj=-1.37488 Sharpe=-1.37488 PnL=-5296.8 dd=-6997.9 num=1268 %=46.6088 Compute host=Slartibartfast svm-gamma=0 svm-penalty=0 . Time left is 05:38:30.
    3/5/20 3014: Obj=-1.34507 Sharpe=-1.34507 PnL=-3374 dd=-4027.3 num=1619 %=48.7338 Compute host=Slartibartfast svm-gamma=-6 svm-penalty=5 . Time left is 05:38:09.
    3/6/20 3008: Obj=-1.21692 Sharpe=-1.21692 PnL=-2878.6 dd=-3719.6 num=1619 %=48.2397 Compute host=Slartibartfast svm-gamma=-6 svm-penalty=8 . Time left is 05:34:31.
    3/7/20 3001: Obj=-1.3953 Sharpe=-1.3953 PnL=-3371.4 dd=-3995.2 num=1619 %=47.8073 Compute host=Slartibartfast svm-gamma=-8 svm-penalty=11 . Time left is 05:26:46.
    3/8/20 3038: Obj=-0.365059 Sharpe=-0.365059 PnL=-1207.7 dd=-4145.4 num=1600 %=51.625 Compute host=Slartibartfast svm-gamma=-5 svm-penalty=0 . Time left is 05:17:44.
    3/9/20 3021: Obj=-0.102615 Sharpe=-0.102615 PnL=-259.2 dd=-2357.7 num=1616 %=50.5569 Compute host=Slartibartfast svm-gamma=-8 svm-penalty=7 . Time left is 05:17:22.
    3/10/20 3018: Obj=-1.57754 Sharpe=-1.57754 PnL=-4522.9 dd=-4974.1 num=1618 %=48.8257 Compute host=Slartibartfast svm-gamma=-4 svm-penalty=3 . Time left is 05:17:21.
    3/11/20 3043: Obj=-0.577919 Sharpe=-0.577919 PnL=-1692.1 dd=-2551.6 num=1596 %=48.7469 Compute host=Slartibartfast svm-gamma=-1 svm-penalty=4 . Time left is 05:17:01.
    3/12/20 3022: Obj=-1.028 Sharpe=-1.028 PnL=-2684.7 dd=-5193.9 num=1378 %=48.4761 Compute host=Slartibartfast svm-gamma=0 svm-penalty=4 . Time left is 05:16:59.
    3/13/20 3009: Obj=-1.21359 Sharpe=-1.21359 PnL=-3082.6 dd=-4291.5 num=1618 %=48.7021 Compute host=Slartibartfast svm-gamma=-8 svm-penalty=12 . Time left is 05:16:48.
    3/14/20 3044: Obj=-0.0356163 Sharpe=-0.0356163 PnL=-83 dd=-3019.5 num=1619 %=50.8956 Compute host=Slartibartfast svm-gamma=-6 svm-penalty=3 . Time left is 05:14:11.
    3/15/20 3052: Obj=-0.365059 Sharpe=-0.365059 PnL=-1207.7 dd=-4145.4 num=1600 %=51.625 Compute host=Slartibartfast svm-gamma=-5 svm-penalty=0 . Time left is 05:00:50.
    3/16/20 3046: Obj=-0.469054 Sharpe=-0.469054 PnL=-1470.5 dd=-4545.9 num=1616 %=49.1337 Compute host=Slartibartfast svm-gamma=-2 svm-penalty=3 . Time left is 05:00:43.
    3/17/20 3048: Obj=-1.57754 Sharpe=-1.57754 PnL=-4522.9 dd=-4974.1 num=1618 %=48.8257 Compute host=Slartibartfast svm-gamma=-4 svm-penalty=3 . Time left is 05:00:30.
    3/18/20 3062: Obj=-0.474801 Sharpe=-0.474801 PnL=-1489 dd=-4545.9 num=1616 %=49.1337 Compute host=Slartibartfast svm-gamma=-2 svm-penalty=5 . Time left is 05:00:16.
    3/19/20 3056: Obj=-0.469054 Sharpe=-0.469054 PnL=-1470.5 dd=-4545.9 num=1616 %=49.1337 Compute host=Slartibartfast svm-gamma=-2 svm-penalty=3 . Time left is 05:00:15.
    3/20/20 3045: Obj=-1.21359 Sharpe=-1.21359 PnL=-3082.6 dd=-4291.5 num=1618 %=48.7021 Compute host=Slartibartfast svm-gamma=-8 svm-penalty=12 . Time left is 04:53:48.

    Received enough results (20) for generation 3
    All jobs in cluster 41 have been marked for removal

    ***********************************
    Best 20 results for generation 3
    ***********************************
    2024: Obj=1.04909 Sharpe=1.04909 PnL=2929.9 dd=-1373 num=1619 %=53.7986 Compute host=Slartibartfast svm-gamma=-8 svm-penalty=3
    2025: Obj=1.04909 Sharpe=1.04909 PnL=2929.9 dd=-1373 num=1619 %=53.7986 Compute host=Slartibartfast svm-gamma=-8 svm-penalty=3
    1002: Obj=0.167274 Sharpe=0.167274 PnL=298.4 dd=-1526.8 num=1620 %=60.1852 Compute host=Slartibartfast svm-gamma=-8 svm-penalty=5
    1003: Obj=0.0850919 Sharpe=0.0850919 PnL=170.2 dd=-1966.8 num=1620 %=60.6173 Compute host=Slartibartfast svm-gamma=-8 svm-penalty=4
    3044: Obj=-0.0356163 Sharpe=-0.0356163 PnL=-83 dd=-3019.5 num=1619 %=50.8956 Compute host=Slartibartfast svm-gamma=-6 svm-penalty=3
    3003: Obj=-0.102615 Sharpe=-0.102615 PnL=-259.2 dd=-2357.7 num=1616 %=50.5569 Compute host=Slartibartfast svm-gamma=-8 svm-penalty=7
    3021: Obj=-0.102615 Sharpe=-0.102615 PnL=-259.2 dd=-2357.7 num=1616 %=50.5569 Compute host=Slartibartfast svm-gamma=-8 svm-penalty=7
    3038: Obj=-0.365059 Sharpe=-0.365059 PnL=-1207.7 dd=-4145.4 num=1600 %=51.625 Compute host=Slartibartfast svm-gamma=-5 svm-penalty=0
    3052: Obj=-0.365059 Sharpe=-0.365059 PnL=-1207.7 dd=-4145.4 num=1600 %=51.625 Compute host=Slartibartfast svm-gamma=-5 svm-penalty=0
    3046: Obj=-0.469054 Sharpe=-0.469054 PnL=-1470.5 dd=-4545.9 num=1616 %=49.1337 Compute host=Slartibartfast svm-gamma=-2 svm-penalty=3
    3056: Obj=-0.469054 Sharpe=-0.469054 PnL=-1470.5 dd=-4545.9 num=1616 %=49.1337 Compute host=Slartibartfast svm-gamma=-2 svm-penalty=3
    2002: Obj=-0.474801 Sharpe=-0.474801 PnL=-1489 dd=-4545.9 num=1616 %=49.1337 Compute host=Slartibartfast svm-gamma=-2 svm-penalty=12
    2034: Obj=-0.474801 Sharpe=-0.474801 PnL=-1489 dd=-4545.9 num=1616 %=49.1337 Compute host=Slartibartfast svm-gamma=-2 svm-penalty=8
    3006: Obj=-0.474801 Sharpe=-0.474801 PnL=-1489 dd=-4545.9 num=1616 %=49.1337 Compute host=Slartibartfast svm-gamma=-2 svm-penalty=4
    3062: Obj=-0.474801 Sharpe=-0.474801 PnL=-1489 dd=-4545.9 num=1616 %=49.1337 Compute host=Slartibartfast svm-gamma=-2 svm-penalty=5
    2036: Obj=-0.508462 Sharpe=-0.508462 PnL=-1231.8 dd=-3003.5 num=1620 %=49.6296 Compute host=Slartibartfast svm-gamma=-8 svm-penalty=8
    2018: Obj=-0.570008 Sharpe=-0.570008 PnL=-1397.2 dd=-2994.2 num=1618 %=50.1236 Compute host=Slartibartfast svm-gamma=-6 svm-penalty=4
    2031: Obj=-0.577919 Sharpe=-0.577919 PnL=-1692.1 dd=-2551.6 num=1596 %=48.7469 Compute host=Slartibartfast svm-gamma=-1 svm-penalty=3
    3043: Obj=-0.577919 Sharpe=-0.577919 PnL=-1692.1 dd=-2551.6 num=1596 %=48.7469 Compute host=Slartibartfast svm-gamma=-1 svm-penalty=4
    1004: Obj=-0.630181 Sharpe=-0.630181 PnL=-1419.7 dd=-2004.3 num=1616 %=57.5495 Compute host=Slartibartfast svm-gamma=-1 svm-penalty=5
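    If you capture the console output to a file, the per-backtest progress lines can be parsed for monitoring. The sketch below is one possible approach based only on the line layout shown in the example output. The function name and the returned field names are hypothetical, and the layout may differ in other versions.

```python
import re

# Hypothetical parser for progress lines such as "3/1/20 3003: Obj=... svm-penalty=7 ."

HEADER_RE = re.compile(r"^\s*(\d+)/(\d+)/(\d+)\s+(\d+):\s*(.*)$")
PAIR_RE = re.compile(r"([^\s=]+)=(-?\d+(?:\.\d+)?)")   # numeric key=value pairs
HOST_RE = re.compile(r"Compute host=(\S+)")


def parse_progress_line(line):
    """Return the fields of one progress line, or None if the line is something else."""
    match = HEADER_RE.match(line)
    if match is None:
        return None
    generation, index, population, genome_id, rest = match.groups()
    host = HOST_RE.search(rest)
    return {
        "generation": int(generation),
        "index": int(index),
        "population": int(population),
        "genome_id": int(genome_id),
        "host": host.group(1) if host else None,
        "values": {key: float(value) for key, value in PAIR_RE.findall(rest)},
    }


sample = ("3/1/20 3003: Obj=-0.102615 Sharpe=-0.102615 PnL=-259.2 dd=-2357.7 "
          "num=1616 %=50.5569 Compute host=Slartibartfast svm-gamma=-8 "
          "svm-penalty=7 . Time left is 05:38:49.")
parsed = parse_progress_line(sample)
```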

    6.2.1 Database

    A copy of the database must exist on all machines in the cluster in an identical location. Normally the database(s) are in C:\FX Database and this should be copied to all machines in the cluster. This is so that the genetic algorithm does not need to send a copy of the data to each node (there would be 8 copies of the same data on a single machine with 8 cores).

    6.3 Genetic Algorithm Results

    While the genetic algorithm is running, results are accumulated in the directory given in the command line. In the above example this is C:\configs\EURUSD MA TEST. A separate directory is created for each generation, named generation-1, generation-2, etc. For each individual genome a results.zip file is created containing the generated configuration and all output files. This filename is prepended with the genome-id.

    In the directory containing the configuration template, a file genetic-algo-cache.xml is created and updated each time a backtest completes on a Condor node. This contains the genome-id along with a summary of results and a list of values assigned to the parameters that the genetic algorithm is optimising. A sample of this file is given below. The best results are always at the top. This is the file that is also used as a save point in the event that the genetic algorithm was interrupted.

    6.4 Using Recorded Results

    If use-recorded-signals is set to True in the backtest configuration section you must ensure that a file named recorded.signals.csv is in the same directory as the configuration. This file is generated by a backtest as explained in section 5.2.

    6.5 The Condor Submit File

    Condor operates on submit files. These are plain text files that list the jobs to be run on the cluster. DeepThought generates submit files for each generation. These are created in the same directory as the genetic algorithm configuration file. You should not normally need to view these files, and altering them will have no effect as they are always generated by the genetic algorithm.

    6.6 Trouble Shooting

    Most problems occur because of a problem with the configuration file. First check the backtest.log file for errors. Other things to check are dates: are the backtest start/stop dates contained within the data? Also check database filenames and that the database exists in the same directory on all machines in the cluster and is populated. Normally the database(s) are in C:\FX Database and this should be copied to all machines in the cluster.

    The genetic algorithm is slightly harder to debug as there is less direct access to what is happening, and there is a reliance on a third party component (Condor). In the results.zip file of a genome located in the generation-n directory of generation n, check the backtest.log file for errors. If it is empty, or the problem is not evident, try directly backtesting the configuration file.

    If you are using a multi-machine cluster, try disabling firewalls and other things that may prevent network access. You can also check the Condor log file. Although this tends to be a little cryptic, it may provide clues where to start looking.

  • Chapter 7

    Live and Paper Trading

    Once you have been through the research and development process and have found a configuration that you are happy with, the next stage is to paper trade. We strongly suggest doing this before live trading to ensure that paper trading results match (in a statistical sense) your backtested results.

    The process for live and paper trading is identical, with the exception that in paper trading orders are not placed in a live market.

    7.1 Manual Trading

    DeepThought can be traded manually. One use of manual trading is for end-of-day systems where it is feasible for a human to make every trade manually.

    One challenge with manual trading is populating the database. To do this we suggest using the provided Metatrader EA as described below, but setting the parameter do-live-trade to false. This will populate the database while placing no trades. The EA can be left running as it is possible for more than one program to access the database at any one time.

    Manual trading is done with two commands. The first (--manual-trade-train-and-persist) will train all models in the configuration and save them in the configuration directory. The command below is an example of manually training from the configuration in C:\DeepThought Configs\EURUSD Strategy 1:

    deepthought --manual-trade-train-and-persist C:\DeepThought Configs\EURUSD Strategy 1

    The output should simply show Ok if the training could be done. If not, check the log file in the configuration directory for hints on what went wrong. Once the models have been trained, the forecasts can be generated using the --manual-trade-generate-signal option:

    deepthought --manual-trade-generate-signal C:\DeepThought Configs\EURUSD Strategy 1

    The output will be similar to the following:

    DeepThought built on Jan 7 2014 at 16:48:35
    BUY
    Consensus=25
    NumberOfPredictors=45

    The output is formatted in this way to make it easy for other scripts to parse the output if DeepThought manual signals form part of a larger trading strategy. When generating signals manually the normal sequence of events should be:

    1. Run --manual-trade-train-and-persist to generate the initial model.

    2. Wait for the candle to complete on the time-frame that you are forecasting on.

    3. When the candle completes run --manual-trade-generate-signal and act on the forecast.

    4. After the forecast has been processed, re-run --manual-trade-train-and-persist to re-generate the models on the latest data.

    The events are sequenced in this way as forecasting can take a while if you are using a large ensemble. Using the sequence above we have as much time as it takes a candle to complete to train the models.
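    A script that consumes the signal output could parse it along the following lines. This is a minimal sketch that assumes the output has the layout of the sample in this section: a build banner, a bare signal line, then key=value lines. The function name is hypothetical, and only the BUY signal shown in the sample is known from this manual.

```python
# Hypothetical parser for the --manual-trade-generate-signal output shown above.

def parse_signal_output(text):
    """Return (signal, fields) where fields holds the key=value lines as strings."""
    signal = None
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("DeepThought built"):
            continue                      # skip blank lines and the build banner
        if "=" in line:
            key, value = line.split("=", 1)
            fields[key] = value
        else:
            signal = line                 # the bare line is the signal itself
    return signal, fields


sample_output = """DeepThought built on Jan 7 2014 at 16:48:35
BUY
Consensus=25
NumberOfPredictors=45
"""
signal, fields = parse_signal_output(sample_output)
agreement = int(fields["Consensus"]) / int(fields["NumberOfPredictors"])
```

    A larger strategy could, for example, act only when the agreement ratio exceeds a threshold of its own choosing.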

    7.2 Automated Trading

    DeepThought is able to auto-trade with Metatrader 4 using the supplied Expert Advisor. Links to other trading platforms will be added over time. Please contact us with a request to create a link to the platform you are using (if not Metatrader); the more requests, the higher the priority will become for implementation.

    An Expert Advisor (EA), Metatrader's terminology for an automated trading script, is provided which accesses the DeepThought DLL. The source of the EA is provided so you can add your own trading logic to the signals generated by DeepThought. This EA provides basic trading of the signals generated by DeepThought and is intended to be a starting point for adding your own trading rules. You can backtest any trading logic using recorded signals following the process described in section 5.2.

    The EA, named DeepThought.mq4, is in the Metatrader directory of the DeepThought installation. It must be placed in the experts directory where Metatrader is installed. For example, if your broker was InterbankFX this directory would be

    C:\Program Files (x86)\InterbankFX\experts

    The DLL, named DeepThought.Dll and located in the DeepThought installation directory, needs to be copied to the experts\library directory where Metatrader is installed. For example, if your broker was InterbankFX this directory would be

    C:\Program Files (x86)\InterbankFX\experts\library

    When you next start Metatrader the DeepThought EA should be available in the Experts folder in Metatrader. Add this to a chart in the normal way. It can be added to any time-frame as ticks are used to generate candles; however, we suggest adding it to the 1 minute time-frame.

    You also need to place the licence file you received when purchasing DeepThought in the same directory as the Metatrader executable. For example if your broker was InterbankFX this directory would be

    C:\Program Files (x86)\InterbankFX\

    If you are trading several instruments, we suggest a separate Metatrader instance for each instrument as Metatrader will likely crash when loading the same DLL into more than one EA.

    To create a new instance of Metatrader, simply copy the Metatrader installation to another location so each instance has a completely separate set of files.

    Table 7.1: DeepThought parameters for the Metatrader EA.

    | Data Type | Parameter | Default | Description |
    |-----------|-----------|---------|-------------|
    | string | files_location | | The directory where the XML configuration is located. |
    | int | gmt_offset | 0 | The hour offset from GMT of your broker. If your historical data is in UTC (i.e. GMT) time then you will need an offset to ensure there are no gaps in the data caused by time-zone changes. |
    | int | max_trade_duration_seconds | 0 | Automatically close trades after this many seconds. Set to 0 to leave trades open (they will close with an opposite signal). |
    | string | deep_thought_db | EURUSDm1 | The identifier of the 1 minute bar-series in DeepThought that price ticks will be sent to, to build 1 minute candles. |
    | double | trade_lot_size | 0.1 | The trade size in lots. |
    | bool | do_limit_orders | true | Use limit orders. If set to false, market orders will be used. |
    | double | limit_order_offset | 0.0002 | The price offset to use for placing limit orders. |
    | int | magic_number | 1600 | The number that Metatrader inserts with trade info. Enables you to track which trades came from which system if you are running multiple systems. |
    | bool | do_live_trade | false | Set to true for live trading; set to false for paper trading. |
    | bool | add_to_position | true | If set to true, will add positions to existing positions, sometimes known as pyramiding. |
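
To illustrate the hour offset between broker time and UTC, the sketch below converts a broker timestamp to UTC. It is not DeepThought code: the function name is hypothetical, and the sign convention (broker time equals UTC plus the offset) is an assumption that should be checked against your broker.

```python
from datetime import datetime, timedelta


def broker_time_to_utc(broker_time, gmt_offset_hours):
    """Convert a broker timestamp to UTC.

    Assumption: broker time = UTC + gmt_offset_hours.
    """
    return broker_time - timedelta(hours=gmt_offset_hours)


# A candle stamped 02:00 by a broker running at GMT+2 closed at 00:00 UTC.
print(broker_time_to_utc(datetime(2014, 9, 1, 2, 0), 2))
```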

    7.3 Trouble Shooting

    Metatrader can be unstable when it is working with external DLLs. It can be particularly bad when changing parameters in the EA, and a Metatrader crash is unfortunately all too common. We hope that these stability issues will be fixed in Metatrader 5.

    If you are having problems changing parameters in the DeepThought EA, follow these steps:

    1. Save the EA parameters.

    2. Delete the EA from the chart.

    3. Exit Metatrader.

    4. Open the Windows Task Manager and check if Terminal is still running. If it is, highlight it and click End Process.

    5. Restart Metatrader.

    6. Add the EA back to the chart, load the parameters saved in step 1 and make the changes.

    This may seem a bit odd, but as the stability of Metatrader is beyond our control this is all we can offer. If Metatrader still crashes, then a reboot of the computer is probably required.

  • Chapter 8

    Python Scripting

    DeepThought uses the Python language for scripting. There are no restrictions on the Python scripts. DeepThought uses the Python system installed on your PC, so you are able to use whatever libraries and modules (e.g. scipy, numpy, pandas) you require. When a function is called from DeepThought, an interface object is passed to your script, enabling it to access elements in DeepThought such as candle data and to pass back values such as the forecast or training label value.

    The use of embedded Python enables unlimited customisation in the following areas:

    1. Features - custom features from any datasource accessible to your Python scripts.

    2. Target and trigger - define a trigger for when forecasts are made and define the target that the predictors are forecasting.

    3. Predictor - work with the built-in machine learning predictors, or supply your own. A predictor does not need to be machine learning based, essentially allowing DeepThought to be used as a standard algorithmic platform.

    4. Signal Generator - combine forecasts from the predictors to produce buy/sell signals.

    The use of Python is optional and it is entirely possible to produce a working system without the use of Python.

    There is a section in the configuration file. This contains one or more entries. Each script can contain one or more functions, or all functions can be in a single script file. If you are using multiple script files, they all operate in the same namespace. An example section is given below.

    target_num_pips.py

    Your Python scripts reside in the same directory as the config.xml configuration file.

    8.1 Python Installation

    DeepThought uses Python version 2.7. It uses the 32-bit version on Windows. The installation process installs an optional Python distribution, Miniconda. This is a cut-down version of the free Anaconda distribution available at http://store.continuum.io/cshop/anaconda/. You can bypass the installation provided with DeepThought and install the full Anaconda distribution, or install another distribution. The only requirement is that it is version 2.7 32-bit (Windows). The Linux and MacOS versions of DeepThought use 64-bit. We use the 2.7 version rather than the 3.3 version as most large distributions (e.g. Anaconda, Python(x,y)) are still based on 2.7.

    We strongly recommend the Numpy and Pandas libraries for numerical and time-series processing and Matplotlib for visualisation. These are installed by default with Anaconda and, if you are using Miniconda, can be installed with the following at a command-line prompt:

    conda install pandas

    conda install matplotlib

    The Numpy library will be installed automatically, as both Pandas and Matplotlib depend on it.
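
To confirm that an interpreter meets the version and bitness requirement described above, the following sketch can be run with the Python installation that DeepThought will use. It relies only on the standard library; the function name is illustrative.

```python
import struct
import sys


def interpreter_summary():
    """Return (major, minor, bits) for the running interpreter."""
    # The size of a C pointer in bytes tells us whether the build is 32 or 64 bit.
    bits = struct.calcsize("P") * 8
    return sys.version_info[0], sys.version_info[1], bits


major, minor, bits = interpreter_summary()
print("Python %d.%d, %d-bit" % (major, minor, bits))
```

On Windows the expected result for DeepThought is Python 2.7, 32-bit.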

    8.2 Python Feature

    A feature comprises one or more numerical values (attributes). You can have as many Python-generated features as you wish. Each feature must be generated using a unique function name.

    To add a Python-generated feature to your model, add a feature of type python-script. A complete example configuration is given below.

    EURUSDm1
    const-time
    eurusd.db
    10000.0
    0.0
    1
    0.0

    EURUSDh4
    const-time
    EURUSDm1
    bar-series
    10000.0
    0.0
    240
    0

    C:\FX_Database

    ema_diff_feature.py

    h4-features

    bars-in-future
    target-1-bar-in-future
    EURUSDh4
    1
    up-down

    hour-of-day
    h4

    python-script
    SetParameterValue
    GetNumberOfAttributes
    GetFeatures
    20
    50
    python-test-1
    min-max

    bar-attribute
    average-close
    30
    diff
    EURUSDh4
    min-max

    svm-c-rbf
    h4-features
    false

    512
    0.0625
    1.0
    SVC
    rbf

    500
    1

    Weekly

    all
    all

    h4-features
    0.0
    0.0
    0.0
    0.0
    EURUSDm1
    False

    0
    0
    100000
    False
    False
    100
    False

    2013-01-01
    2014-01-01
    False
    True
    python C:\DeepThought\python\analyse_backtest_results.py %CONFIG_LOCATION%

    Listing 8.1: Python feature configuration.

    Here we have defined three functions:

    SetParameterValue

    We can set parameters in the configuration file which will be passed to this function once on start-up. These parameters can be controlled by the Genetic Algorithm described in Chapter 6 on page 14. The parameters are defined in the configuration using entries as shown in the example above.

    GetNumberOfAttributes

    A function that returns the number of attributes that make up the feature.

    GetFeatures

    A function that is responsible for generating the actual numerical attributes. A DeepThought interface object is provided to this function to pass the attributes back to DeepThought.

    An example script that implements the above functions is given below.

    import pandas as pd
    import numpy as np

    ma_short_period = None  # int, set from the configuration
    ma_long_period = None  # int, set from the configuration
    number_of_diffs = 30
    num_required_candles = None

    def ExpMovingAverage(values, period):
        # Exponentially spaced weights, normalised so that they sum to 1.
        weights = np.exp(np.linspace(-1., 0., period))
        weights /= weights.sum()
        ema = np.convolve(values, weights)[:len(values)]
        # The first `period` values come from a partial window, so overwrite them.
        ema[:period] = ema[period]
        return ema

    def GetNumberOfAttributes(deep_thought_intf):
        # GetFeatures() sets attributes 0 .. number_of_diffs - 2.
        deep_thought_intf.SetNumAttributes(number_of_diffs - 1)

    def SetParameterValue(param_name, param_value):
        global ma_short_period
        global ma_long_period
        global num_required_candles
        if param_name == "ma_short_period":
            ma_short_period = param_value
        elif param_name == "ma_long_period":
            ma_long_period = param_value
            num_required_candles = ma_long_period + number_of_diffs + 2
        else:
            print("Unknown parameter: %s" % param_name)

    def GetFeatures(deep_thought_intf):
        if ma_short_period is None or num_required_candles is None:
            print("Error: the moving average periods have not been set!")
            return -1

        csv_file_name = deep_thought_intf.GetLastBars(num_required_candles, "EURUSDh4")
        candles = pd.read_csv(csv_file_name, index_col=False)

        if len(candles.index) < num_required_candles:
            return -1

        # Bars arrive most recent first; reverse them so the averages run forward in time.
        close_values = candles["close"].values
        reversed_close_values = close_values[::-1]

        ema_short = ExpMovingAverage(reversed_close_values, ma_short_period)
        ema_long = ExpMovingAverage(reversed_close_values, ma_long_period)

        # Attribute 0 is the EMA difference at the most recent bar,
        # attribute 1 the bar before that, and so on.
        for i in range(1, number_of_diffs):
            deep_thought_intf.SetAttribute(i - 1, ema_short[-i] - ema_long[-i])

    Listing 8.2: Python script example defining a feature.

    This script uses the Python libraries pandas, which provides data analysis functions including a data-frame, and numpy for numeric analysis. An interface object deep_thought_intf is passed to the GetNumberOfAttributes() and GetFeatures() functions. This interface is the mechanism for passing data back and forth between DeepThought and your scripts.

    The methods of the deep_thought_intf interface object are detailed in table 8.1 on page 34.

    8.3 Python Target

    The target script has two functions: to detect a forecast trigger and to label a training instance with a target. Detecting a forecast trigger can be as simple as forecasting each 4-hourly bar, or more complex, such as only forecasting when a pair of moving averages has crossed. If a trigger has been detected, your GetIsTargetTrigger() function returns True and a training sample is created. If the criteria have not been met, your script returns False.

    Your script must also supply a GetTarget() function. This function is passed the candle at observation time for a sample where the target trigger was met, along with the current candle. Your function can then compare the two and decide if the target criteria have been met.

    The following examples should make this a little clearer.

    This example is the setup for a system that forecasts whether a 20 pip target will be hit first by price moving up or price moving down. A new forecast is created every four hours, so every four hours this system will enter a trade with a target of +20 pips for an up forecast and -20 pips for a down forecast. To do this the scripts use the 1 minute candles. The script is set up in the config in the section. The target is configured in the section. More detail on configuration is given in chapter 9 on page 35. Detail on the configuration section is on page 41.

    target_num_pips.py

    h4-features

    python-script
    target-next-pip-movement
    EURUSDm1
    20.0
    GetIsTargetTrigger
    GetTarget
    TargetSetParameterValue

    hour-of-day
    h4

    bar-attribute
    average-close
    30
    diff
    EURUSDh4
    min-max

    moving-average
    average-close
    5
    30
    1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100
    EURUSDh4
    min-max

    moving-average
    average-close
    10
    30
    1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100
    EURUSDh4
    min-max

    moving-average
    average-close
    20
    30
    1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100
    EURUSDh4
    min-max

    moving-average
    average-close
    50
    30
    1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100
    EURUSDh4
    min-max

    moving-average
    average-close
    100
    30
    1,2,3,4,5,7,9,13,16,20,25,31,45,55,70,100
    EURUSDh4
    min-max

    Listing 8.3: Python target configuration.

    The following is the listing of the target_num_pips.py script defined in the configuration file. We have defined a single parameter pip-movement in the configuration. This is passed to the script using the TargetSetParameterValue() function when DeepThought starts up.

    The function GetIsTargetTrigger() tests to see if the trigger criteria have been reached. As we are calling this script every time a 1 minute candle completes (set up in the config using the element in the section), the script must check to see if a four hour candle has just completed.

    The function GetTarget() sets the target to either -1.0 or 1.0 if a target has been reached. If no target has been reached, a target is not set.
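
The labelling rule described above can be tried in isolation. The sketch below is a standalone illustration and not the DeepThought API: the function name and its arguments are hypothetical, and it assumes a pip is a price change of 0.0001, so 20 pips is a move of 0.0020.

```python
def label_first_touch(entry_close, later_closes, pips, pip_multiplier=10000.0):
    """Return 1.0 if the up target is reached first, -1.0 if the down
    target is reached first, or None if neither is reached."""
    distance = pips / pip_multiplier  # convert pips to a price difference
    for close in later_closes:
        move = close - entry_close
        if move >= distance:
            return 1.0
        if move <= -distance:
            return -1.0
    return None  # no target reached, so no label is set


closes = [1.3005, 1.3012, 1.3021, 1.2970]
print(label_first_touch(1.3000, closes, 20.0))  # 1.0, the up target is hit on the third close
```

Returning None mirrors the behaviour described above, where no target is set until one of the two levels has been reached.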

    import pandas as pd

    num_pips = None  # price distance of the target, set from "pip-movement"
    last_close_datetime = "none"

    def TargetSetParameterValue(param_name, param_value):
        global num_pips
        if param_name == "pip-movement":
            # Convert pips to a price difference.
            num_pips = param_value / 10000.0

    def GetIsTargetTrigger(deep_thought_intf, latest_candle):
        # Check if we are at the close of an H4 candle. We need to do it this way as we are
        # triggering from M1 candles.
        global last_close_datetime
        csv_file_name = deep_thought_intf.GetLastBars(2, "EURUSDh4")
        candles = pd.read_csv(csv_file_name, index_col=False)
        if len(candles.index) < 1:
            return False
        candle_close_datetime = candles.iloc[0]["close_date_time"]
        if last_close_datetime != candle_close_datetime:
            last_close_datetime = candle_close_datetime
            return True
        return False

    def GetTarget(deep_thought_intf, candle_at_observation, latest_candle):
        price_move = latest_candle.ClosePrice() - candle_at_observation.ClosePrice()
        if price_move >= num_pips:
            deep_thought_intf.SetTarget(1.0)
        elif price_move <= -num_pips:
            # The source listing is cut off after the start of this condition; this
            # branch is reconstructed from the description of GetTarget() above.
            deep_thought_intf.SetTarget(-1.0)

    h4-features

    bars-in-future
    target-1-bar-in-future
    EURUSDh4
    1
    up-down

    h4-features
    python-predictor-h4
    1.0
    SetParameterValue
    Predict
    Train
    20
    5
    25
    1

    True

    all
    all

    h4-features
    0.0
    0.0
    0.0
    0.0
    -1
    EURUSDm1
    False

    0
    0
    100000
    False
    False
    100
    False

    2013-01-01
    2014-01-01
    False
    True

    Listing 8.5: Python predictor configuration example.

    import numpy as np
    import pandas as pd

    # Global variables
    ma_long_period = None  # int
    ma_short_period = None  # int

    def ExpMovingAverage(values, period):
        weights = np.exp(np.linspace(-1., 0., period))
        weights /= weights.sum()
        ema = np.convolve(values, weights)[:len(values)]
        ema[:period] = ema[period]
        return ema

    def Sign(value):
        if value >= 0:
            return 1
        else:
            return -1

    def SetParameterValue(param_name, param_value):
        global ma_long_period
        global ma_short_period
        if param_name == "ma-long":
            ma_long_period = param_value
        if param_name == "ma-short":
            ma_short_period = param_value

    def Train(deep_thought_intf, training_csv):
        # This example does not need to train anything but we could use
        # the following line to read a training set into a Pandas data frame:
        # training_df = pd.read_csv(training_csv, index_col=False)
        return True

    def Predict(deep_thought_intf, attributes_csv):
        # Get an array of close prices
        csv_file_name = deep_thought_intf.GetLastBars(ma_long_period + 4, "EURUSDh4")
        candles = pd.read_csv(csv_file_name, index_col=False)
        close_values = candles["close"].values
        reversed_close_values = close_values[::-1]

        # As numpy convolve (moving average) calculates from lowest index to highest,
        # we must reverse the array of values as a bar series has the most recent
        # values with the lowest index and we want to compute the moving average
        # moving forward in time (i.e. from the back of the array of close prices
        # forwards).
        ema_short = ExpMovingAverage(reversed_close_values, ma_short_period)
        ema_long = ExpMovingAverage(reversed_close_values, ma_long_period)

        # Calculate the difference between the moving averages at the most
        # recent candle, and the one before that
        diff_current = ema_short[-1] - ema_long[-1]
        diff_previous = ema_short[-2] - ema_long[-2]

        # Look for a cross. If we find one, predict in the direction of
        # the cross.
        if Sign(diff_current) != Sign(diff_previous):
            deep_thought_intf.SetForecast(Sign(diff_current))
        else:
            # Not strictly necessary as the forecast defaults to 0 if
            # not set, but set here for completeness.
            deep_thought_intf.SetForecast(0)

    Listing 8.6: Python script example for a predictor.
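
The crossover test at the heart of the predictor can be checked without DeepThought or numpy. The sketch below is an illustration only: the function names are hypothetical, and both input lists are assumed to be ordered from oldest to newest.

```python
def sign(value):
    # Zero is treated as positive, matching Sign() in the listing.
    return 1 if value >= 0 else -1


def crossover_forecast(short_ma, long_ma):
    """Return 1 or -1 if the short average crossed the long average on the
    latest bar, otherwise 0."""
    diff_current = short_ma[-1] - long_ma[-1]
    diff_previous = short_ma[-2] - long_ma[-2]
    if sign(diff_current) != sign(diff_previous):
        return sign(diff_current)
    return 0


# The short average moves from below the long average to above it: forecast up.
print(crossover_forecast([1.0, 1.2], [1.1, 1.1]))  # 1
```

Because zero counts as positive, a difference of exactly zero followed by a positive difference does not register as a cross.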

    8.5 Python Signal Generation

    The signal generator is the component that transforms the forecasts produced by the predictors into buy and sell signals. It also controls trading parameters such as take profit and stop loss. More detail on the signal generator can be found in section 9.8 on page 70.

    As there is only one signal generator, you only need to provide optional Python function names to the signal-generator component. An example is given below.

    signal_generator.py

    all
    all

    h4-features
    SetParameterValue
    CombineForecasts
    20.0
    EURUSDm1

    Listing 8.7: Python signal generator configuration example.

    #
    # Simple demonstration of the signal generator calling a Python function.
    # This example simply buys/sells if the combined forecast of the
    # predictor ensemble exceeds the threshold given by the
    # "threshold" parameter in the configuration file.
    #

    import pandas as pd

    # Globals
    threshold = None  # double

    # We could make these parameters in the configuration, but hard code them here
    # for the moment.
    take_profit = 20.0
    stop_loss = 25.0

    def SetParameterValue(param_name, param_value):
        global threshold
        if param_name == "threshold":
            threshold = param_value
            print("set threshold to %s" % threshold)

    def CombineForecasts(deep_thought_intf, predictions_csv):
        # Read the predictions into a Pandas dataframe.
        predictions = pd.read_csv(predictions_csv)

        # Remove all limit orders and tag them in the log file as "Missed".
        deep_thought_intf.DeleteLimitOrders("Missed")
        average_forecast = predictions.forecast.mean()

        if average_forecast >= threshold:
            deep_thought_intf.SendBuyOrder("EURUSDm1", take_profit, stop_loss)
        elif average_forecast <= -threshold:
            # The source listing is cut off after "elif average_forecast"; this
            # branch is an assumed completion that mirrors the buy condition.
            deep_thought_intf.SendSellOrder("EURUSDm1", take_profit, stop_loss)

    Table 8.1: Summary of the deep_thought_intf interface object

    | Method | Description |
    |--------|-------------|
    | GetLastBars(num_bars, bar_series) | Returns the file name of a CSV file containing the last num_bars candles of bar series bar_series. |
    | GetNumAttributes() | Returns the number of attributes of this feature. Set using SetNumAttributes(). |
    | SetAttribute(index, value) | Set the value of the attribute with the given index. Indexes are zero-based. |
    | SetNumAttributes(num_attributes) | Set the number of attributes of this feature. |
    | SetTarget(value) | Set the target value if a target has been reached when a GetTarget() function is called. |
    | SetForecast(value) | Set the forecast value when a Predict() function has been called. |
    | SendBuyOrder(bar_series_id, take_profit, stop_loss) | Send a buy order to the bar series specified by bar_series_id at the current market price. Optionally set take_profit or stop_loss to be non-zero if required. |
    | SendSellOrder(bar_series_id, take_profit, stop_loss) | Send a sell order to the bar series specified by bar_series_id at the current market price. Optionally set take_profit or stop_loss to be non-zero if required. |
    | CloseAllTrades(comment) | Close all open trades with an optional comment. The comment appears in the log file. |
    | DeleteLimitOrders(comment) | Remove all unfilled limit orders with an optional comment. The comment appears in the log file. |
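
When developing scripts it can be convenient to exercise them outside DeepThought. The class below is a hypothetical test double, not something shipped with DeepThought: it mirrors the method names in the table above and simply records what a script passes back. The CSV it writes contains only a close column, which is an assumption; the real file contains further columns.

```python
import os
import tempfile


class MockDeepThoughtIntf(object):
    """Hypothetical stand-in for deep_thought_intf, for offline testing."""

    def __init__(self, closes):
        self.closes = list(closes)  # close prices, most recent first
        self.num_attributes = 0
        self.attributes = {}
        self.target = None
        self.forecast = None
        self.orders = []
        self.log = []

    def GetLastBars(self, num_bars, bar_series):
        # Write the requested bars to a temporary CSV and return its name.
        handle, path = tempfile.mkstemp(suffix=".csv")
        os.close(handle)
        with open(path, "w") as csv_file:
            csv_file.write("close\n")
            for close in self.closes[:num_bars]:
                csv_file.write("%r\n" % close)
        return path

    def SetNumAttributes(self, num_attributes):
        self.num_attributes = num_attributes

    def GetNumAttributes(self):
        return self.num_attributes

    def SetAttribute(self, index, value):
        self.attributes[index] = value

    def SetTarget(self, value):
        self.target = value

    def SetForecast(self, value):
        self.forecast = value

    def SendBuyOrder(self, bar_series_id, take_profit, stop_loss):
        self.orders.append(("buy", bar_series_id, take_profit, stop_loss))

    def SendSellOrder(self, bar_series_id, take_profit, stop_loss):
        self.orders.append(("sell", bar_series_id, take_profit, stop_loss))

    def CloseAllTrades(self, comment=""):
        self.log.append(("close-all", comment))

    def DeleteLimitOrders(self, comment=""):
        self.log.append(("delete-limit-orders", comment))


intf = MockDeepThoughtIntf([1.3020, 1.3010, 1.3000])
intf.SetNumAttributes(1)
intf.SetAttribute(0, 0.5)
intf.DeleteLimitOrders("Missed")
intf.SendBuyOrder("EURUSDm1", 20.0, 25.0)
print(intf.orders)
```

A script function such as GetFeatures() or CombineForecasts() can then be called with this object and the recorded values inspected afterwards.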

  • Chapter 9

    Configuration Details

    DeepThought is driven by XML configuration files. A GUI will be available in a future version. The XML is divided into sections as listed in table 9.1. Some sections are required, some are optional and some can contain their own sections. Also, some sections can have only one definition (e.g. bar-series-collection), while others can have as many as desired (e.g. model). Each section is detailed individually in this chapter.

    Where the option value is a string, e.g. True, False, RBF, the text is case insensitive. You can check the first part of the log file to see what default values were used for missing values. Table 9.1 details the configuration sections.

    9.1 bar-series

    A bar-series is the raw data. For Forex trading, DeepThought operates by using 1 minute candles to generate longer duration bars (candles). Renko bars can also be generated. An example configuration snippet that generates 240 minute (4 hour) candles is:

    EURUSDm1
    const-time
    eurusd.db
    10000.0
    1.5
    1

    EURUSDh4
    const-time
    EURUSDm1
    10000.0
    0.0
    240
    0

    In the above example, a pip as defined by the broker is a price change of 0.0001. Therefore we must multiply by 10000 to u