Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active...

42
@sahildua2305 @sahildua2305 Putting Deep Learning Models in Production Sahil Dua

Transcript of Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active...

Page 1: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Putting Deep Learning Models in Production

Sahil Dua

Page 2: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Let’s imagine!

Page 3: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

But ...

Page 4: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

whoami

➔ Software Developer @ Booking.com

➔ Previously - Deep Learning Infrastructure

➔ Open Source Contributor (Git, Pandas, Kinto, go-github, etc.)

➔ Tech Speaker

Page 5: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

➔ Deep Learning at Booking.com

➔ Life-cycle of a model

➔ Training Models

➔ Serving Predictions

Agenda

Page 6: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Deep Learningat Booking.com

Page 7: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

1.4 million+

active properties

in 220+ countries

1,500,000+

room nights booked every 24 hours

Scalehighlights.

Page 8: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Deep Learning

➔ Image understanding

➔ Translations

➔ Ads bidding

➔ ...

Page 9: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Image Tagging

Page 10: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Image Tagging

Page 11: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Image Tagging

Sea view: 6.38Balcony/Terrace: 4.82Photo of the whole room: 4.21Bed: 3.47Decorative details: 3.15Seating area: 2.70

Page 12: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Page 13: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Image Tagging

Using the image tag information in the right contextSwimming pool, Breakfast Buffet, etc.

Page 14: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Lifecycle of a model

Page 15: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Deploy

Lifecycle of a model

TrainData

Analysis

Page 16: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Training a Model - on laptop

Page 17: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Training a Model - on laptop

Page 18: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Machine Learning workload

➔ Computationally intensive workload

➔ Often not highly parallelizable algorithms

➔ 10 to 100 GBs of data

Page 19: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Why Kubernetes (k8s)?

➔ Isolation

➔ Elasticity

➔ Flexibility

Page 20: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Why k8s – GPUs?

➔ In alpha since 1.3

➔ Speed up 20X-50X

resources: limits: alpha.kubernetes.io/nvidia-gpu: 1

Page 21: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Training with k8s

➔ Base images with ML frameworks

◆ TensorFlow, Torch, VowpalWabbit, etc.

➔ Training code is installed at start time

➔ Data access - Hadoop (or PVs)

Page 22: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Startup

Code

Training pod

..start.shtrain.pyevaluate.py

Page 23: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Startup

Data..start.shtrain.pyevaluate.py

PV

Training pod

Page 24: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Streaming logs back

Logs..start.shtrain.pyevaluate.py

PV

Training pod

Page 25: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Exports the model

..start.shtrain.pyevaluate.py

PV

Training pod

model

Page 26: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Serving predictions

Page 27: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Serving Predictions

ModelClient

Input Features

Prediction

Page 28: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Serving Predictions

Model 1Client

Input Features

Prediction

Model XClient

Input Features

Prediction

Page 29: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Serving Predictions

Model 1Client

Input Features

Prediction

Model XClient

Input Features

Prediction

Page 30: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Serving Predictions

➔ Stateless app with common code

➔ Containerized

➔ No model in image

➔ REST API for predictions

Page 31: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Serving Predictions

App

Client

Input Features

Prediction

model

Page 32: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Serving Predictions

➔ Get trained model from Hadoop

➔ Load model in memory

➔ Warm it up

➔ Expose HTTP API

➔ Respond to the probes

Page 33: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Serving Predictions

Client

Input Features

Prediction

Page 34: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Serving Predictions

Client

Input Features

Prediction

Client

Input Features

Prediction

Page 35: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Deploying a new model

➔ Create new Deployment

➔ Create new HTTP Route

➔ Wait for liveness/readiness probe

Page 36: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Performance

PredictionTime = RequestOverhead + N*ComputationTime

N is the number of instances to predict on

Page 37: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Optimizing for Latency

➔ Do not predict if you can precompute

➔ Reduce Request Overhead

➔ Predict for one instance

➔ Quantization (float 32 => fixed 8)

➔ TensorFlow specific: freeze network & optimize for inference

Page 38: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Optimizing for Throughput

➔ Do not predict if you can precompute

➔ Batch requests

➔ Parallelize requests

Page 39: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Summary

➔ Training models in pods

➔ Serving models

➔ Optimizing serving for latency/throughput

Page 40: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Next steps

➔ Tooling to control hundred deployments

➔ Autoscale prediction service

➔ Hyper parameter tuning for training

Page 41: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

Want to get in touch?

LinkedIn / Twitter / GitHub@sahildua2305

Websitewww.sahildua.com

Page 42: Putting Deep Learning Models in Production Sahil Dua · @sahildua2305 1.4 million+ active properties in 220+ countries 1,500,000+ room nights booked every 24 hours Scale highlights.

@sahildua2305@sahildua2305

THANKYOU

@sahildua2305