TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep...

20
Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

Transcript of TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep...

Page 1: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak

2019 © GridGain Systems

Page 2: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Speaker

2

Yuriy Babak● Head of ML/DL framework development at GridGain● Apache Ignite committer

Page 3: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Agenda

• ML introduction• Apache Ignite ML overview• Integration with TensorFlow• Demo

2019 © GridGain Systems

Page 4: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems GridGain Company Confidential4

Apache Ignite ML overview

Page 5: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

ML Terms

5

● Raw Data - data of business (logs, transactions, …) ● Training/Test set - preprocessed data from raw data, numeric features

(set: age, user type id, gender, ...)● Training algorithm - produce model by training set (GDB, Gibbs

Sampling, Decision Tree building)● Model - formula for producing answers on new business data (decision

tree, SVM, linear regression)● Meta algorithm - combinator of several models (boosting, bagging,

stacking)● Evaluation/Validation - estimation of model given test/validation set

(Precision, MSE, ROC AUC)● Inference - using ML model for prediction

Page 6: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Typical ML tasks

6

1. Regression2. Time-series regression3. Classification (binary, multiclass)4. Clusterization5. Ranking6. Entity extraction, structure recognition7. Generative models

Page 7: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Distributed model training

7

Calculate local improvements

Aggregate local improvements

We have a model

Update model

Map

Reduce

Page 8: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Ignite partition-based dataset

8

Partition Data Dataset Context Dataset Data

Upstream Cache Context Cache On-Heap

Learning Env

On-Heap

Durable Stateless Durable Recoverable

Dataset dataset = … // Partition based dataset, internal API

dataset.compute((env, ctx, data) -> map(...), (r1, r2) -> reduce(...))

double[][] x = …double[] y = ...

double[][] x = …double[] y = ...

Partition Based Dataset StructuresSource Data

Page 9: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Apache Ignite ML API

9

Page 10: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

List pre-build models

10

Regression: Linear Regression (LSQR and SGD), Decision Tree, Random Forest, GBT, Nearest Neighbours (KNN), MLP.

Classification: SVM, Nearest Neighbours (ANN, KNN), Decision Tree, MLP, GBT, Logistic Regression.

Clustering: K-Means, Gaussian Mixture Model (GMM).

Preprocessing: Normalization, Binarization, Imputer, One-Hot Encoder, String Encoder, MinMax Scaler, MaxAbsScaler.

Ensambles: Boosting, Stacking, Bagging of any model.

Page 11: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems GridGain Company Confidential11

TensorFlow integration

Page 12: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Apache Ignite as data source

12

Page 13: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Distributed training

13

TensorFlow Cluster on top of Apache Ignite ClusterTensorFlow/Ignite Client

ignite-tf.sh

Tool that help to submit TensorFlow code into cluster and manage it.

Page 14: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Distributed training

14

Page 15: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems GridGain Company Confidential15

Demo

Page 16: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Demo

16

Training and inference for pre-build models

Page 17: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Demo

17

TensorFlow model inference from Apache Ignite

Page 18: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Links

18

● Apache Ignite docs - https://apacheignite.readme.io/docs● Apache Ignite sources - https://github.com/apache/ignite● TensorFlow IO module - https://github.com/tensorflow/io/tree/master/tensorflow_io/ignite

Page 19: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems19

Q&A

Page 20: TensorFlow with Apache Ignite and Machine and Deep ... · Achieving Continuous Machine and Deep Learning with Apache Ignite and TensorFlow Yuriy Babak 2019 © GridGain Systems

2019 © GridGain Systems2019 © GridGain Systems

Attend the In-Memory Computing Summit

20