Post on 21-Jan-2018
A GUIDED TOUR OF MACHINE LEARNING FOR TRADERS
TUCKER BALCH, PH.D. PROFESSOR, GEORGIA TECH CO-FOUNDER AND CTO, LUCENA RESEARCH
A Guided Tour of Machine Learning for Traders Tucker Balch, Ph.D.
WHO THIS IS FOR
People who are…
• familiar with quantitative techniques
• interested in what’s under the “hood” of ML techniques
No Machine Learning knowledge assumed.
ABOUT THE SPEAKER
• Professor of Interactive Computing at Georgia Institute of Technology.
• Teaches courses in Artificial Intelligence and Finance.
• Teaches MOOCs on Machine Learning for Trading.
• Author of over 120 research publications related to Robotics and Machine Learning.
• Co-founder of Lucena Research.
ABOUT MY COURSE
ABOUT LUCENA RESEARCH
• Fin-tech company that employs experts in Computational Finance, Quantitative Analysis, and Software Development.
• We deliver investment decision support technology to hedge funds and wealth managers:
  • Price forecasting
  • Hedging
  • ML-based stock screening
  • Model portfolios
• Python-based infrastructure.
• http://lucenaresearch.com
TALK OVERVIEW
• Machine Learning: Big Picture
• Decision Trees: Classification
• Decision Trees: Regression
• Decision Trees Example: Sentiment-based strategy
• kNN: Classification
• kNN: Regression
• Reinforcement Learning
THE BIG PICTURE
“Machine Learning” goes by many names:
• Machine Learning
• Big Data
• Predictive Analytics
Focus: Supervised Learning
• Start with examples: Factor values & outcomes
• Build model from examples
• Use model to predict outcomes
HOW TO BUILD A PREDICTIVE MODEL
Factors (X1, X2, … XN)
Predict outcome: Y
Classification: One of several outcomes
Regression: Numerical outcome
Lots of methods solve same problem
• kNN
• Decision Trees
• Support Vector Machines (SVM)
• Artificial Neural Networks (ANN)
• Deep Learning
WHO SHOULD I VOTE FOR?
PREDICT VOTING BEHAVIOR
Factors:
• Do you believe the country is “broken”?
• If so, what caused the country to become broken?
• Where do you stand on a woman’s right to choose?
• What are your religious views?
Outcomes:
• Trump
• Clinton
• Cruz
• Sanders
• Kasich
PREDICT VOTING BEHAVIOR Model: Decision Tree
PREDICT STOCK BEHAVIOR Model: Decision Tree
TREES ALSO WORK FOR REGRESSION Model: Decision Tree
LOTS OF TREES = FOREST
HOW TO BUILD A TREE
• Gather data <X1, X2, X3, Y>
• Find most predictive factor Xi of Y
• Find threshold Ti that splits data most effectively
• Decision node: Xi < Ti?
  • Left tree: Xi < Ti
  • Right tree: Xi >= Ti
• Recurse until only one data item left: Leaf
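The tree-building steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the talk: the slide does not say how "most predictive" is measured, so the sketch assumes a squared-error criterion, and the names `sse`, `best_split`, `build_tree`, `query` and the toy data are all hypothetical.

```python
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, ys):
    """Return (factor index i, threshold t) that most reduces squared error, or None.

    Assumption: "most predictive" means lowest squared error after the split.
    """
    best, best_cost = None, sse(ys)
    for i in range(len(rows[0])):
        for t in sorted({row[i] for row in rows}):
            left = [y for row, y in zip(rows, ys) if row[i] < t]
            right = [y for row, y in zip(rows, ys) if row[i] >= t]
            if not left or not right:
                continue  # a split must put data on both sides
            cost = sse(left) + sse(right)
            if cost < best_cost:
                best, best_cost = (i, t), cost
    return best

def build_tree(rows, ys, leaf_size=1):
    """Recurse until leaf_size items remain or no split improves the fit."""
    split = best_split(rows, ys) if len(ys) > leaf_size else None
    if split is None:
        return {"leaf": mean(ys)}
    i, t = split
    left = [k for k, row in enumerate(rows) if row[i] < t]
    right = [k for k, row in enumerate(rows) if row[i] >= t]
    return {
        "factor": i,
        "threshold": t,
        "left": build_tree([rows[k] for k in left], [ys[k] for k in left], leaf_size),
        "right": build_tree([rows[k] for k in right], [ys[k] for k in right], leaf_size),
    }

def query(tree, x):
    """Follow decision nodes (Xi < Ti goes left) until a leaf is reached."""
    while "leaf" not in tree:
        tree = tree["left"] if x[tree["factor"]] < tree["threshold"] else tree["right"]
    return tree["leaf"]

# Toy data: two factors per row; the outcome depends only on the first factor.
rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
ys = [10.0, 10.0, 20.0, 20.0]
tree = build_tree(rows, ys)
print(tree["factor"], tree["threshold"], query(tree, [2.5, 0.0]))
```

For classification the same structure applies, with the leaf returning the most common label instead of the mean and the split criterion swapped for a class-purity measure.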
DECISION TREES RECAP
• A decision tree is a flow chart of yes/no questions
• When you reach a leaf, that is your prediction
• Can be used for classification or regression
• Training:
  • Find most predictive factor
  • Split data based on that factor
  • Recurse
• Query:
  • Follow path through decision nodes until leaf
• Forest: An ensemble learner with multiple trees
  • Training: Build trees with sampled data
  • Query: Query each tree: Vote, or average to find result
  • Less susceptible to overfitting
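The forest idea in the recap (build each tree on sampled data, then average the trees' answers) might look like the sketch below. To stay short it assumes a single factor and a simple median split for each tree, which is a simplification chosen for this illustration rather than anything stated in the talk. The sampling is assumed to be bootstrap sampling (with replacement), and all function names and data are hypothetical.

```python
import random
from statistics import mean, median

def build_tree(data, leaf_size=1):
    """data: list of (x, y) pairs with one factor x.

    Assumption: each node splits at the median of x (a simplification).
    A leaf is stored as a plain float, a decision node as a tuple.
    """
    t = median(x for x, _ in data)
    left = [(x, y) for x, y in data if x < t]
    right = [(x, y) for x, y in data if x >= t]
    if len(data) <= leaf_size or not left or not right:
        return mean(y for _, y in data)
    return (t, build_tree(left, leaf_size), build_tree(right, leaf_size))

def query_tree(tree, x):
    """Walk decision nodes until a leaf value is reached."""
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x < t else right
    return tree

def build_forest(data, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [build_tree([rng.choice(data) for _ in data]) for _ in range(n_trees)]

def query_forest(forest, x):
    """Regression: average the trees. For classification, vote instead."""
    return mean(query_tree(tree, x) for tree in forest)

# Toy data: the outcome steps from 0 to 1 at x = 10.
data = [(float(x), 0.0 if x < 10 else 1.0) for x in range(20)]
single_tree = build_tree(data)
forest = build_forest(data)
print(query_forest(forest, 2.0), query_forest(forest, 17.0))
```

A single tree grown to one item per leaf reproduces its training data exactly, which is the overfitting risk; averaging many trees built on different samples smooths that out.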
USING DECISION TREES FOR STOCK SCANS
• CHECKMATE: Trading strategy developed by Lucena Research, Inc. in partnership with PsychSignal.com
• Classification-based strategy
• Separate scans for long and short positions
• Factors:
  • PsychSignal: Sentiment data: StockTwits, Twitter analysis
  • Lucena: 400+ technical & fundamental factors per stock
• Outcomes: Up/Down/Neutral
BACKTEST OF LONG SCAN
Backtest simulation performance from QuantDesk® – Past performance is no guarantee of future results. In-sample training period: 2011.
BACKTEST OF SHORT SCAN
Backtest simulation performance from QuantDesk® – Past performance is no guarantee of future results. In-sample training period: 2011.
BACKTEST OF LONG & SHORT COMBINED
Backtest simulation performance from QuantDesk® – Past performance is no guarantee of future results. In-sample training period: 2011.
FORWARD TESTING SINCE NOV 2015
Forward testing performance – Past performance is no guarantee of future results. In-sample training period: 2011.
K NEAREST NEIGHBOR
• Solves the same problem as decision trees
• Train: Save data
• Query: Find k nearest neighbors, vote or take mean
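A minimal sketch of the kNN train and query steps, assuming Euclidean distance (the slide does not name a distance measure). The class name `KNN`, its methods, and the toy data are hypothetical.

```python
import math
from collections import Counter
from statistics import mean

class KNN:
    def __init__(self, k=3):
        self.k = k
        self.points = []

    def train(self, xs, ys):
        # Training just saves the data.
        self.points = list(zip(xs, ys))

    def _nearest_outcomes(self, x):
        # Assumption: Euclidean distance between factor vectors.
        by_distance = sorted(self.points, key=lambda p: math.dist(p[0], x))
        return [y for _, y in by_distance[: self.k]]

    def classify(self, x):
        """Classification: the k nearest neighbors vote."""
        return Counter(self._nearest_outcomes(x)).most_common(1)[0][0]

    def regress(self, x):
        """Regression: take the mean of the k nearest outcomes."""
        return mean(self._nearest_outcomes(x))

# Toy data: two clusters of factor values.
xs = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]

classifier = KNN(k=3)
classifier.train(xs, ["down", "down", "down", "up", "up", "up"])
print(classifier.classify((2, 2)))

regressor = KNN(k=3)
regressor.train(xs, [-0.02, -0.01, -0.03, 0.02, 0.03, 0.04])
print(regressor.regress((8.5, 8.5)))
```

All the work happens at query time, which is why training is fast and queries are slow.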
TRADE OFFS
KNN
• Classification or regression
• Training is fast
• Query is slow
• Requires data normalization
• Susceptible to overfitting; remedies:
  • Larger K
  • Ensemble
• Must discover features
• You must map to strategy
Decision Trees
• Classification or regression
• Training is slow
• Query is fast
• No data normalization
• Susceptible to overfitting; remedies:
  • Larger leaf size
  • Ensemble (forest)
• Auto feature discovery
• You must map to strategy
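Why kNN requires data normalization can be seen with a small sketch: when one factor has a much larger numeric range than another, it dominates the distance calculation. The example assumes z-score scaling (one common choice; the talk does not specify a method), and the two factors, a P/E-like ratio and a daily-volume-like figure, are made-up numbers for illustration only.

```python
import math
from statistics import mean, pstdev

def fit_zscore(rows):
    """Compute per-factor mean and standard deviation."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # guard against zero spread
    return mus, sigmas

def scale(row, mus, sigmas):
    """Rescale one row so every factor has comparable magnitude."""
    return [(v - m) / s for v, m, s in zip(row, mus, sigmas)]

def nearest_index(rows, x):
    """Index of the row closest to x (Euclidean distance)."""
    return min(range(len(rows)), key=lambda i: math.dist(rows[i], x))

# Hypothetical factors: [ratio in the tens, volume in the millions].
rows = [
    [10.0, 1_000_000.0],
    [30.0, 1_010_000.0],
    [12.0, 2_000_000.0],
]
query_point = [11.0, 1_010_000.0]

# Without scaling, the volume-like factor decides everything.
raw_nearest = nearest_index(rows, query_point)

# With scaling, both factors contribute to the distance.
mus, sigmas = fit_zscore(rows)
scaled_rows = [scale(r, mus, sigmas) for r in rows]
scaled_nearest = nearest_index(scaled_rows, scale(query_point, mus, sigmas))

print(raw_nearest, scaled_nearest)
```

Decision trees compare one factor against one threshold at a time, so the relative scale of different factors does not affect them.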
REINFORCEMENT LEARNING
Solves a different problem:
• Find a policy π that tells us which action a to take in every situation s.
• a = π(s)
• π*(s) is the optimal policy
Nomenclature
• s: state
• r: reward for last action
• a: action
• T: transition matrix (which state is next)
• π: the policy
REINFORCEMENT LEARNING
For trading problem:
• s: factors/features describing a stock’s “situation”
• r: return
• a: buy, sell, do nothing
Algorithms:
• Model-based:
  • Policy iteration
  • Value iteration
• Model-free:
  • Q-learning
  • Dyna-Q
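As a rough illustration of the model-free case, here is a tabular Q-learning sketch on a deliberately tiny toy market. Everything about the environment is an assumption made for this example: the two states ("low", "high"), the deterministic alternation between them, and the reward values are hypothetical and not taken from the talk. Only the update rule is the standard Q-learning one.

```python
import random

STATES = ["low", "high"]
ACTIONS = ["buy", "sell", "hold"]

def step(state, action):
    """Toy environment (hypothetical): price alternates low/high every step.

    Buying low or selling high earns +1, the opposite loses 1, holding earns 0.
    """
    reward = {"buy": 1.0, "sell": -1.0, "hold": 0.0}[action]
    if state == "high":
        reward = -reward
    next_state = "high" if state == "low" else "low"
    return next_state, reward

def q_learning(n_steps=5000, alpha=0.2, gamma=0.9, seed=0):
    """Learn Q(s, a) from experience, without a model of the transitions."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = "low"
    for _ in range(n_steps):
        action = rng.choice(ACTIONS)  # explore purely at random
        next_state, reward = step(state, action)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward reward + discounted future value.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q

def policy(q, state):
    """a = π(s): pick the action with the highest learned value."""
    return max(ACTIONS, key=lambda a: q[(state, a)])

q = q_learning()
print({s: policy(q, s) for s in STATES})
```

In a real trading setting the state would be a discretized set of factor values and the reward a realized return, so the table would be far larger and the environment far noisier than this toy.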
REINFORCEMENT LEARNING
Advantages:
• Maps well to finance problems
• Provides entire strategy including entry and exit conditions
• Policy accounts for whether to enter based on probability of success
REVIEW
• Decision Trees
  • Classification
  • Regression
• kNN
  • Classification
  • Regression
• Reinforcement learning
  • Finds a policy
THANK YOU
To learn about my company:
• www.lucenaresearch.com
To learn about my course:
• Google “Balch Udacity”
OVERFITTING
Description: An overfit model is one that models in-sample data very well. It predicts the data so well that it is likely modeling noise. As the degrees of freedom of the model increase, overfitting occurs where in-sample prediction error keeps decreasing while out-of-sample prediction error starts to increase.
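One way to look for overfitting is to compare in-sample and out-of-sample error as the model's degrees of freedom change. The sketch below does this with kNN regression, where a smaller k gives the model more freedom (k = 1 memorizes the training data). The data is synthetic: the linear ground truth, the noise level, and all names are assumptions for illustration.

```python
import random
from statistics import mean

def knn_predict(train, x, k):
    """Mean outcome of the k training points closest to x (one factor)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return mean(y for _, y in nearest)

def rmse(train, data, k):
    """Root-mean-square error of kNN predictions on `data`."""
    return mean((knn_predict(train, x, k) - y) ** 2 for x, y in data) ** 0.5

def make_data(xs, rng, noise=0.5):
    # Hypothetical ground truth: y = 2x plus Gaussian noise.
    return [(x, 2.0 * x + rng.gauss(0.0, noise)) for x in xs]

rng = random.Random(0)
train = make_data([i / 40 for i in range(40)], rng)
test = make_data([(i + 0.5) / 40 for i in range(40)], rng)

# (in-sample error, out-of-sample error) for several values of k.
errors = {k: (rmse(train, train, k), rmse(train, test, k)) for k in (1, 3, 10, 20)}
for k, (e_in, e_out) in errors.items():
    print(f"k={k:2d}  in-sample RMSE={e_in:.3f}  out-of-sample RMSE={e_out:.3f}")
```

With k = 1 the in-sample error is exactly zero because every training point is its own nearest neighbor, so a perfect in-sample fit says nothing about out-of-sample performance. The out-of-sample column is the one to watch when choosing k.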