December 1, 2015, Zürich

MACHINE LEARNING MADE BEAUTIFULLY SIMPLE

AMIR TABAKOVIC, VP OF BUSINESS DEVELOPMENT, BIGML, INC

BigML  Inc

What’s Machine Learning?

“to give computers the ability to learn to perform a task without being explicitly programmed”

“to automatically find patterns in data that can be reused in the future”

Age of ML

“The first half of the information age was programming computers to do what we want. In the second half of the information age computers will program themselves.”

(Pedro Domingos, Master Algorithm)

Solving Complex Problems

[Diagram labels: Programmer, Computer Chess Game, Past Games]

“When I was a programmer, I was very good at figuring out all the algorithms and writing them all down. Today, I think I would try to figure out how to program a computer to learn something.”

Eric Schmidt, Google

Data Science Endeavor

http://www.jmaxkanter.com/static/papers/DSAA_DSM_2015.pdf

Market Evolution

[Timeline figure. Time axis: 1980s, 2000s, 2010, 2015, 2025, 2030. Machine Learning audiences: Academics & Researchers, Analysts, Data Scientists, Developers, Everyone.]

Tools shown on the timeline:

Weka, R, Orange, Knime, Scikit

RapidMiner, H2O, SkyTree, Dato, Spark

BigML, Google, Azure ML, Amazon ML

The simple idea

To automate and democratize Machine Learning

1-click

DEVELOPERS 15

TASKS 7.8M+

DATASETS 720k+

PREDICTIVE MODELS 4.6M+

20k+ Team Members

24

• Founded in January 2011 to automate machine learning.

• Pioneered MLaaS (Machine Learning as a Service).

• API-first company with a beautiful UI.

• Cloud-based and on-premise private deployments for enterprises.

BigML features overview

REST API

Auto-scalable Infrastructure

Distributed Machine Learning Backend

Web Interface and Visualization Bindings

• Python
• Node.js
• Java
• C#
• R

BigMLer, BigML GAS, BigML X

Multi-tenant Private Deployments

On-premise Private Deployments

Predictive Applications

Types of algorithms

Supervised (Decision Trees & Random Forests)

• Regression: continuous values
• Classification: discrete values (classes/labels)

Unsupervised

• Clustering: to group data points by similarity
• Anomaly Detection: to find outliers that do not fit standard patterns

Benefits of implementing ML APIs

ML APIs automate Machine Learning, transforming it from a highly manual and detached mix of processes and heterogeneous tools into a single, cohesive, easy-to-use service.

ML APIs reduce the cost and complexity of building and deploying predictive models.

ML APIs improve business performance by rapidly incorporating Machine Learning into each department's operations and decisions, reducing the time-to-market of data-driven decisions.

ML APIs manage the heavy infrastructure needed to learn from data and make predictions at scale.

ML APIs add traceability and repeatability to Machine Learning tasks.

Early Detection of Delinquency

[Pipeline diagram. Stages: Data Transformations, Algorithmic Modeling Process, Application / Reports. Data sources: Users, Operations, Loans. Flow labels: Historical Data; Data is Transformed and Tagged; Major model; Current Data; Transformed Data; Threshold; Customers with higher probability of falling into default.]

• Fully automated process.

• The system can predict, in batch or in real time, whether a customer will stop paying a loan within a specified time window (e.g. 2 or 3 months).

• Generates reports giving the confidence that a customer will stay current and/or default.

• Can be directly integrated with other systems by exporting files, through REST calls, or through libraries in multiple programming languages.
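
The threshold step in this pipeline can be sketched in a few lines of Python. This is only an illustration: the function name `flag_high_risk`, the customer ids, the probabilities, and the 0.5 default threshold are all assumptions, not part of the system described on the slide.

```python
def flag_high_risk(default_probabilities, threshold=0.5):
    """Return the ids of customers whose predicted default probability
    is at or above the threshold (hypothetical helper, not a BigML API)."""
    return sorted(
        customer_id
        for customer_id, probability in default_probabilities.items()
        if probability >= threshold
    )


# Made-up model outputs for three customers.
predictions = {"cust-1": 0.91, "cust-2": 0.12, "cust-3": 0.50}
print(flag_high_risk(predictions, threshold=0.5))
```

Raising the threshold flags fewer customers with higher confidence; lowering it catches more potential defaults at the cost of more false alarms.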

Lending Club Demo

Managing Credit Risks From Your Couch

https://www.lendingclub.com/info/demand-and-credit-profile.action

http://www.lendingmemo.com/lending-club-strategy/

• 5 simple ways to increase returns at Lending Club

Diversify, Increase Risk, Reinvest…

• 2 more complicated ways to increase ROI

Predictive Models, Secondary Market

“creating a custom algorithm is beyond the ability of 99% of investors”

Disclaimer

• Playing with data from Lending Club

https://www.lendingclub.com/info/download-data.action

• The data is real but has been filtered

• This is not financial advice – you’re old enough to know that

Basic Idea

• Focus on Lending Club assigned grades B - G

• Build a predictive model to detect and filter out bad loans

• Automate everything, build predictive app

• Get shamelessly rich

https://www.lendingclub.com/info/demand-and-credit-profile.action

Loan Life Cycle

“Open”: Current, In Grace Period, Late 16-30 Days, Late 31-120 Days

“Closed”: Fully Paid, Default, Charged Off

(if (= (field "loan_status") "Fully Paid") "good" "bad")
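
For readers less used to the prefix notation, the same labeling rule can be written as a small Python function. This is a sketch of the rule only; the function name `label_quality` is an assumption made for illustration.

```python
def label_quality(loan_status):
    """Mirror the expression above: only fully paid loans count as good."""
    return "good" if loan_status == "Fully Paid" else "bad"


# The three "Closed" statuses from the loan life cycle.
for status in ["Fully Paid", "Default", "Charged Off"]:
    print(status, "->", label_quality(status))
```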

[Workflow diagram]

• Exclude Grade A
• Split dataset into “Open” and “Closed” loans
• “CLOSED”: transform the 3 categories of the “Closed” loan status feature into a new label, “Quality” (“good” / “bad”)
• Split dataset into training and test datasets (80% / 20%)
• Exclude anomalies with Anomaly Detector
• Train dataset with Decision Tree
• Train dataset with Ensemble
• Evaluate
• “OPEN”: score the “Open” dataset with the “Quality” label based on the best predictive model
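
The training/test split in this workflow can be sketched with the Python standard library. This is a minimal illustration, not how BigML performs the split; the function name `split_dataset`, the fixed seed, and the assumption that 80% goes to training are all choices made for the example.

```python
import random


def split_dataset(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and split it into training and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


train, test = split_dataset(range(100))
print(len(train), len(test))  # 80 20
```

Seeding the shuffle keeps the split reproducible, so the decision tree and the ensemble are evaluated on the same held-out loans.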

BigML Anomaly

Isolation Forest:

Grow a random decision tree until each instance is in its own leaf. Instances that are “easy” to isolate end up at a shallow depth; instances that are “hard” to isolate end up deeper in the tree.

Now repeat the process several times and use the average depth to compute an anomaly score: 0 (similar) -> 1 (dissimilar)
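
The isolation idea can be sketched in plain Python. This is a simplified illustration and not BigML's implementation: instead of building whole trees, it follows only the random splits that contain the point being scored, and it uses the normalization from the original Isolation Forest paper, score = 2^(-average depth / c(n)). All function names and the sample data are assumptions.

```python
import math
import random


def _c(n):
    """Expected path length for n points, used to normalize depths."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def _isolation_depth(point, data, rng, depth=0, max_depth=50):
    """Number of random splits needed before `point` is alone."""
    if len(data) <= 1 or depth >= max_depth:
        return depth + _c(len(data))
    dim = rng.randrange(len(point))
    lo = min(row[dim] for row in data)
    hi = max(row[dim] for row in data)
    if lo == hi:  # cannot split on this feature; estimate remaining depth
        return depth + _c(len(data))
    split = rng.uniform(lo, hi)
    # Keep only the rows that fall on the same side of the split as the point.
    same_side = [r for r in data if (r[dim] < split) == (point[dim] < split)]
    return _isolation_depth(point, same_side, rng, depth + 1, max_depth)


def anomaly_scores(data, n_trees=100, seed=0):
    """Score each row: close to 1 means easy to isolate (anomalous)."""
    rng = random.Random(seed)
    norm = _c(len(data))
    if norm == 0.0:
        return [0.0] * len(data)
    scores = []
    for point in data:
        total = sum(_isolation_depth(point, data, rng) for _ in range(n_trees))
        scores.append(2.0 ** (-(total / n_trees) / norm))
    return scores


# Five similar values and one obvious outlier (made-up data).
data = [(1.0,), (1.1,), (0.9,), (1.05,), (0.95,), (10.0,)]
for row, score in zip(data, anomaly_scores(data)):
    print(row, round(score, 2))
```

The outlier is usually separated by the very first split, so its average depth is small and its score is the highest of the six.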

Ensembles

Bagging! Random Decision Forest!

Dia  Color  Shape  Fruit
4    red    round  plum
5    red    round  apple
5    red    round  apple
6    red    round  plum
7    red    round  apple

What is a round, red 6cm fruit?

All Data: “plum”

Sample 1: “plum”
Sample 2: “apple”
Sample 3: “apple”
Samples 1-3 combined: “apple”
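
The two ingredients of bagging on this slide, resampling the data and combining the models' votes, can be sketched in Python. The function names are assumptions made for illustration, and the votes are the three sample predictions from the slide rather than the output of trained models.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the resampling step of bagging)."""
    return rng.choices(rows, k=len(rows))


def majority_vote(predictions):
    """Return the class predicted most often across the ensemble's models."""
    return Counter(predictions).most_common(1)[0][0]


# Predictions of the three sampled models for the round, red 6cm fruit.
votes = ["plum", "apple", "apple"]
print(majority_vote(votes))  # apple
```

A single model trained on all the data answers “plum”, while the vote across resampled models answers “apple”, which is the point of the slide: the ensemble is less sensitive to any one training instance.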

ML Opportunity in FinTech

Banks have traditionally
• Used statistical analysis of data for many years as the primary way to model the behaviour and needs of their customers
• Not leveraged all the information they have available
• Not incorporated additional information available to capture new dimensions of the risk taken

ML Opportunity in FinTech

It is time to
• Take advantage of machine learning
  • to deal with many variables and explore thousands of complex combinations (not just linear)
  • to discover new patterns that otherwise would have been hidden
• Benefit from
  • better predictive accuracy
  • adaptability
  • publicly available data
  • ML APIs

PRO Subscription Coupon

FINTECHML