© 2017 The MathWorks, Inc.
Deep Learning in MATLAB: From Concept to CUDA Code
Roy Fahn
Applications Engineer
Systematics
royf@systematics.co.il
03-7660111
Ram Kokku
Principal Engineer
MathWorks
ram.kokku@mathworks.com
Talk Outline
Design Deep Learning & Vision Algorithms
• Manage large image sets
• Automate image labeling
• Easy access to models
• Pre-built training frameworks
Accelerate and Scale Training
• Acceleration with GPUs
• Scale to clusters
High Performance Deployment
• Automate compilation with GPU Coder
• On Titan Xp: 7x faster than TensorFlow, 5x faster than pyCaffe2
• On Jetson: on par with TensorRT, 2x faster than C++ Caffe
Example: Transfer Learning Workflow
Workflow: Load Reference Network → Modify Network Structure → Learn New Weights → New Classifier
Training Data: Images + Labels
Labels: Cars, Trucks, BigTrucks, SUVs, Vans
Example: Transfer Learning in MATLAB
Set up training dataset:
• Split, shuffle, and re-arrange images
• Read images; apply data augmentation (clip, rotate, resize, etc.)
Easily manage large sets of images:
• A single line of code to access images
• Operates on disk, database, or big-data file system
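The dataset setup above can be sketched with `imageDatastore` (a minimal sketch; the folder name `vehicleImages`, the 80/20 split, and the 227x227 resize are assumptions, not values from the talk):

```matlab
% One line of code to access a large labeled image set:
% one subfolder per class, folder names become labels.
imds = imageDatastore('vehicleImages', ...
    'IncludeSubfolders', true, ...
    'LabelSource', 'foldernames');

% Shuffle and split into training and validation sets
imds = shuffle(imds);
[imdsTrain, imdsVal] = splitEachLabel(imds, 0.8, 'randomized');

% Custom read function: resize every image to the network's input size
imdsTrain.ReadFcn = @(f) imresize(imread(f), [227 227]);
```

The datastore reads lazily, so the same code works whether the images live on disk, in a database, or on a big-data file system.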
Example: Transfer Learning in MATLAB
Load reference network (after setting up the training dataset).
Three ways to create DNNs in MATLAB:
1. Easy access to research models
2. Caffe model importer
3. Build from scratch
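The three routes can be sketched as follows (assumes the AlexNet support package is installed; the Caffe file names are placeholders):

```matlab
% 1. Easy access to research models (requires the AlexNet support package)
net = alexnet;

% 2. Caffe model importer (file names below are placeholders)
% net = importCaffeNetwork('deploy.prototxt', 'weights.caffemodel');

% 3. Build from scratch out of layer objects
% layers = [imageInputLayer([227 227 3]); convolution2dLayer(11, 96); ...];

net.Layers   % inspect the loaded architecture
```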
Example: Transfer Learning in MATLAB
Modify network structure (after loading the reference network and setting up the training dataset).
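Modifying the network for the five new vehicle classes might look like this (a sketch; the indices 23 and 25 assume AlexNet's standard 25-layer series architecture):

```matlab
net = alexnet;                          % reference network
layers = net.Layers;                    % copy the layer array
layers(23) = fullyConnectedLayer(5);    % replace fc8: 1000 classes -> 5 classes
layers(25) = classificationLayer;       % fresh output layer for the new labels
```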
Example: Transfer Learning in MATLAB
Learn new weights (after modifying the network structure, loading the reference network, and setting up the training dataset).
Many more training options are available.
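Learning the new weights then comes down to `trainingOptions` plus `trainNetwork` (a sketch; all hyperparameter values are assumptions, and `imdsTrain`/`layers` are the datastore and modified layer array from the earlier steps):

```matlab
opts = trainingOptions('sgdm', ...
    'InitialLearnRate', 1e-4, ...      % small rate: fine-tuning, not training from scratch
    'MaxEpochs', 20, ...
    'MiniBatchSize', 64, ...
    'Plots', 'training-progress');     % live training-accuracy visualization
trainedNet = trainNetwork(imdsTrain, layers, opts);
```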
Deep learning on CPU, GPU, multi-GPU and clusters
Scale along two axes: more GPUs and more CPUs.
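Scaling from CPU to GPU to multi-GPU to cluster is a single option in `trainingOptions` (sketch):

```matlab
% One flag selects where training runs:
opts = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'multi-gpu');  % or 'cpu', 'gpu', 'parallel'
% 'parallel' targets a local or cluster parallel pool (requires
% Parallel Computing Toolbox; clusters additionally need MDCS).
```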
Visualizing and Debugging Intermediate Results
• Filters
• Layer activations
• Feature visualization
• Deep Dream
• Training accuracy visualization
Many options for visualization and debugging, with examples to get started.
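A few of these visualizations, sketched in MATLAB (layer names assume AlexNet; `peppers.png` is an image that ships with MATLAB):

```matlab
net = alexnet;
img = imresize(imread('peppers.png'), [227 227]);

w   = net.Layers(2).Weights;               % conv1 filter weights
act = activations(net, img, 'conv1');      % layer activations for one image
dd  = deepDreamImage(net, 'conv5', 1:4);   % Deep Dream feature visualization
```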
GPU Coder for Deployment: New Product in R2017b
GPU Coder: accelerated implementation of parallel algorithms on GPUs.
• Neural networks: deep learning, machine learning (7x faster than state of the art)
• Image processing and computer vision: image filtering, feature detection/extraction (700x faster than CPUs for feature extraction)
• Signal processing and communications: FFT, filtering, cross-correlation (20x faster than CPUs for FFTs)
GPU Coder Compilation Flow
GPU Coder performs CUDA kernel creation, memory allocation, and data transfer minimization:
• Library function mapping
• Loop optimizations
• Dependence analysis
• Data locality analysis
• GPU memory allocation
• Data-dependence analysis
• Dynamic memcpy reduction
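Invoking this flow from MATLAB is a two-liner (sketch; the entry-point name `myAlgorithm` and the input size are placeholders):

```matlab
cfg = coder.gpuConfig('mex');   % build types: 'mex', 'lib', 'dll', 'exe'
codegen -config cfg myAlgorithm -args {ones(480, 640, 3, 'single')}
```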
GPU Coder Generates CUDA from MATLAB: saxpy
Both scalarized and vectorized MATLAB compile to a CUDA kernel for GPU parallelization.
Loops and matrix operations are directly compiled into kernels.
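The saxpy example in both styles (a sketch of the standard saxpy kernel, not the talk's exact code):

```matlab
function y = saxpy(a, x, y)   % scalarized MATLAB: the loop becomes a CUDA kernel
for i = 1:numel(x)
    y(i) = a * x(i) + y(i);
end
end

% Vectorized MATLAB equivalent; compiles to a kernel as well:
%   y = a .* x + y;
```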
Generated CUDA Optimized for Memory Performance
Example: Mandelbrot space, compiled to a CUDA kernel for GPU parallelization.
Kernel data allocation is automatically optimized.
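A Mandelbrot iteration-count routine of the kind shown can be sketched as (the escape radius 2 and the iteration scheme are the standard formulation, not taken from the slide):

```matlab
function count = mandelbrotCount(x0, y0, maxIter)
% x0, y0: matrices of real/imaginary coordinates covering the image grid
z = complex(x0, y0);
c = z;
count = zeros(size(z));
for n = 1:maxIter
    z = z.*z + c;                    % Mandelbrot iteration
    count = count + (abs(z) <= 2);   % accumulate iterations before escape
end
end
```

The element-wise operations over the whole grid map naturally onto one CUDA kernel per pixel.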
Algorithm Design to Embedded Deployment Workflow
1. Functional test: MATLAB algorithm (functional reference)
2. Deployment unit-test: desktop GPU, C++ (build type .mex: call CUDA from MATLAB directly)
3. Deployment integration-test: desktop GPU, C++ (build type .lib: call CUDA from (C++) hand-coded main())
4. Real-time test: embedded GPU (build type cross-compiled .lib: call CUDA from (C++) hand-coded main())
Demo: Alexnet Deployment with ‘mex’ Code Generation
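The demo pattern for ‘mex’ generation looks roughly like this (sketch; the entry-point name `alexnet_predict` is a placeholder):

```matlab
function out = alexnet_predict(in) %#codegen
persistent net;                    % load the network once, reuse across calls
if isempty(net)
    net = coder.loadDeepLearningNetwork('alexnet');
end
out = net.predict(in);
end

% At the MATLAB prompt:
%   cfg = coder.gpuConfig('mex');
%   codegen -config cfg alexnet_predict -args {ones(227, 227, 3, 'single')}
```

The generated mex file is then called from MATLAB exactly like the original function, which is what makes the unit-test stage a drop-in comparison.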
Algorithm Design to Embedded Deployment on Tegra GPU
1. Functional test: test in MATLAB on host
2. Deployment unit-test: test generated code in MATLAB on host + Tesla GPU, C++ (build type .mex: call CUDA from MATLAB directly)
3. Deployment integration-test: test generated code within a C/C++ app on host + Tesla GPU, C++ (build type .lib: call CUDA from (C++) hand-coded main())
4. Real-time test: test generated code within a C/C++ app on the Tegra target, Tegra GPU (build type cross-compiled .lib: call CUDA from (C++) hand-coded main(); cross-compiled on host with the Linaro toolchain)
Alexnet Deployment to Tegra: Cross-Compiled with ‘lib’
Two small changes:
1. Change build-type to ‘lib’
2. Select cross-compile toolchain
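Those two changes, sketched (the toolchain name string varies by installation and is an assumption here, as is the entry-point name):

```matlab
cfg = coder.gpuConfig('lib');              % 1. build type 'lib'
cfg.Toolchain = 'Linaro Toolchain v4.9';   % 2. cross-compile toolchain (install-specific name)
codegen -config cfg alexnet_predict -args {ones(227, 227, 3, 'single')}
```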
End-to-End Application: Lane Detection
Pipeline: Image → Lane detection CNN (transfer learning from Alexnet) → left/right lane coefficients → Post-processing (find left/right lane points) → Image with marked lanes
The CNN output is the lane parabola coefficients in y = ax^2 + bx + c.
GPU Coder generates code for the whole application.
https://tinyurl.com/ybaxnxjg
https://devblogs.nvidia.com/parallelforall/deep-learning-automated-driving-matlab/
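The post-processing stage might be sketched like this (variable names and the sampling are placeholders; only the parabola model y = ax^2 + bx + c comes from the slide):

```matlab
function pts = lanePoints(coeffs, x)
% coeffs = [a b c]; evaluate y = a*x.^2 + b*x + c at the sample points x
y = polyval(coeffs, x);
pts = [x(:), y(:)];
end

% Usage: overlay both detected lanes on the frame
%   x   = linspace(1, width, 50);
%   img = insertShape(img, 'Line', reshape(lanePoints(leftCoeffs,  x)', 1, []));
%   img = insertShape(img, 'Line', reshape(lanePoints(rightCoeffs, x)', 1, []));
```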
How Good Is Generated Code Performance?
• Performance of image processing and computer vision
• Performance of CNN inference (Alexnet) on Titan XP GPU
• Performance of CNN inference (Alexnet) on Jetson (Tegra) TX2
GPU Coder for Image Processing and Computer Vision
• Distance transform
• Fog removal
• SURF feature extraction
• Ray tracing
• Stereo disparity
Orders-of-magnitude speedup over CPU.
Alexnet Inference on NVIDIA Titan XP
[Chart: frames per second vs. batch size]
Frameworks compared: MATLAB GPU Coder (R2017b), MATLAB (R2017b), TensorFlow (1.2.0), Caffe2 (0.8.1), mxNet (0.10)
Annotated speedups: 2x, 5x, and 7x (7x vs. TensorFlow and 5x vs. Caffe2, per the outline slide)
Testing platform: Intel Xeon E5-1650 v3 @ 3.50 GHz CPU; Pascal Titan Xp GPU; cuDNN v5
Alexnet Inference on Jetson TX2: Frame-Rate Performance
[Chart: frames per second (0-400) vs. batch size (1, 16, 32, 64, 128, 256)]
Frameworks compared: MATLAB GPU Coder (R2017b), C++ Caffe (1.0.0-rc5), TensorRT (2.1)
Annotated speedups: 2x vs. C++ Caffe; 0.85x vs. TensorRT
Alexnet Inference on Jetson TX2: Memory Performance
[Chart: peak memory (MB) vs. batch size]
Frameworks compared: MATLAB GPU Coder (R2017b), C++ Caffe (1.0.0-rc5), TensorRT 2.1 (using the giexec wrapper)
Design Your DNNs in MATLAB, Deploy with GPU Coder
Design Deep Learning & Vision Algorithms
• Manage large image sets
• Automate image labeling
• Easy access to models
• Pre-built training frameworks
Accelerate and Scale Training
• Acceleration with GPUs
• Scale to clusters
High Performance Deployment
• Automate compilation with GPU Coder
• On Titan Xp: 7x faster than TensorFlow, 5x faster than pyCaffe2
• On Jetson TX2: on par with TensorRT, 2x faster than C++ Caffe
Check Out Deep Learning in MATLAB and GPU Coder
GPU Coder: https://www.mathworks.com/products/gpu-coder.html
Deep learning in MATLAB: https://www.mathworks.com/solutions/deep-learning.html
Systematics events: http://www.systematics.co.il/mwevents