© 2018 The MathWorks, Inc.
Deep Learning in MATLAB
From Concept to Embedded Code
Alexander Schreiber
Principal Application Engineer
MathWorks Germany
MathWorks Automotive Conference 2018
Stuttgart
April 17th, 2018
Example: Lane Detection
▪ Transfer learning from AlexNet
▪ Pipeline: image → lane detection CNN → post-processing (find left/right lane points) → image with marked lanes
▪ The CNN outputs the left and right lane parabola coefficients according to: y = ax^2 + bx + c
▪ GPU Coder generates code for the whole application
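The post-processing step can be illustrated with a minimal Python sketch that turns parabola coefficients into lane points. This is not the code from the talk: the function name `lane_points` and the coefficient values are hypothetical, chosen only to show how y = ax^2 + bx + c is evaluated for a left and a right lane.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def lane_points(coeffs: Sequence[float], xs: Sequence[float]) -> List[Point]:
    """Evaluate y = a*x^2 + b*x + c at each x and return (x, y) pairs."""
    a, b, c = coeffs
    return [(x, a * x * x + b * x + c) for x in xs]


# Hypothetical coefficients (a, b, c), chosen only for illustration.
left_lane = (0.5, -1.0, 2.0)
right_lane = (0.5, 1.0, 2.0)
xs = [0.0, 1.0, 2.0]

left_pts = lane_points(left_lane, xs)
right_pts = lane_points(right_lane, xs)
print(left_pts)
print(right_pts)
```

In a real pipeline the resulting points would then be drawn onto the input image to mark the lanes.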
Example: Lane Detection
1. Import of pre-trained network
2. Modification of network architecture
3. Transfer learning
4. Verification
5. Automatic CUDA code generation
6. mex verification
7. Deployment to embedded GPU
MATLAB Deep Learning Framework
Access Data
▪ Manage large image sets
▪ Automate image labeling
▪ Easy access to models
Design + Train
▪ Acceleration with GPUs
▪ Scale to clusters
Deploy
▪ Automate compilation to GPUs and CPUs using GPU Coder:
– 11x faster than TensorFlow
– 4.5x faster than MXNet
Deep Learning Workflow
▪ ACCESS AND EXPLORE DATA
– Files
– Databases
– Sensors
▪ LABEL AND PREPROCESS DATA
– Data Augmentation/Transformation
– Labeling Automation
– Import Reference Models
▪ DEVELOP PREDICTIVE MODELS
– Hardware-Accelerated Training
– Hyperparameter Tuning
– Network Visualization
▪ INTEGRATE MODELS WITH SYSTEMS
– Desktop Apps
– Enterprise Scale Systems
– Embedded Devices and Hardware
Deep Learning Workflow: LABEL AND PREPROCESS DATA
Ground Truth Labeling
▪ Adding Ground Truth Information
▪ Semi-automated Labeling
– Object Detection
– Scene Classification
– Semantic Image Segmentation
▪ Solutions
– Ground Truth Labeler App
– Image Labeler App
Importing Reference Models (e.g. AlexNet)
Deep Learning Workflow: DEVELOP PREDICTIVE MODELS
Two Approaches for Deep Learning
1. Train a Deep Neural Network from Scratch
▪ Tailored and optimized to specific needs
▪ Requires
– Larger training data set
– Longer training time
2. Fine-tune a pre-trained model (transfer learning)
▪ Reusing existing feature extraction
▪ Adapting to specific needs
▪ Requires
– Smaller training data set
– Lower training time
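The second approach can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the MATLAB workflow from the talk: `frozen_features` is a hypothetical stand-in for the reused feature-extraction layers, and only a new linear output layer (`head`) is trained, here as a regression because the lane detection example also regresses coefficients.

```python
from typing import List, Sequence, Tuple


def frozen_features(x: float) -> List[float]:
    """Stand-in for pre-trained layers: fixed, never updated during training."""
    return [1.0, x, x * x]


def predict(head: Sequence[float], x: float) -> float:
    """Apply the trainable output layer to the frozen features."""
    return sum(w * f for w, f in zip(head, frozen_features(x)))


def mse(head: Sequence[float], data: Sequence[Tuple[float, float]]) -> float:
    """Mean squared error of the head on (x, y) pairs."""
    return sum((predict(head, x) - y) ** 2 for x, y in data) / len(data)


def fine_tune(data, steps: int = 1000, lr: float = 0.2) -> List[float]:
    """Train only the new output layer with plain gradient descent."""
    head = [0.0, 0.0, 0.0]
    n = len(data)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for x, y in data:
            feats = frozen_features(x)
            err = sum(w * f for w, f in zip(head, feats)) - y
            for i, f in enumerate(feats):
                grad[i] += 2.0 * err * f / n
        head = [w - lr * g for w, g in zip(head, grad)]
    return head


# Hypothetical training set sampled from y = 2x^2 - x + 0.5 on [-1, 1].
data = [(k / 10.0, 2.0 * (k / 10.0) ** 2 - (k / 10.0) + 0.5) for k in range(-10, 11)]

initial_loss = mse([0.0, 0.0, 0.0], data)
head = fine_tune(data)
final_loss = mse(head, data)
print(initial_loss, final_loss)
```

Because the feature extractor is never touched, only three weights are learned, which is the reason transfer learning typically gets by with less data and less training time.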
Transfer Learning
Accelerating Training (CPU, GPU, multi-GPU, Clusters)
[Figures: scaling with more CPUs and more GPUs]
▪ Single GPU performance
▪ Multiple GPU support
Hyperparameter Tuning (e.g. Bayesian Optimization)
▪ Goal
– Set of optimal hyperparameters for a training algorithm
▪ Algorithms
– Grid search
– Random search
– Bayesian optimization
▪ Benefits
– Faster training
– Better network performance
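The first two algorithms on this slide can be compared with a short Python sketch. It rests on assumptions: `validation_loss` is a hypothetical stand-in for training a network and measuring its validation loss, and the search ranges are invented for illustration. Bayesian optimization is not shown, since it needs a surrogate model that does not fit in a few lines.

```python
import itertools
import random
from typing import Sequence, Tuple


def validation_loss(learning_rate: float, momentum: float) -> float:
    """Hypothetical stand-in for 'train a network and measure validation loss'."""
    return (learning_rate - 0.01) ** 2 * 1e4 + (momentum - 0.9) ** 2


def grid_search(learning_rates: Sequence[float],
                momenta: Sequence[float]) -> Tuple[float, float]:
    """Try every combination on a fixed grid and keep the best one."""
    candidates = itertools.product(learning_rates, momenta)
    return min(candidates, key=lambda p: validation_loss(*p))


def random_search(n_trials: int, seed: int = 0) -> Tuple[float, float]:
    """Sample candidates at random (learning rate on a log scale)."""
    rng = random.Random(seed)
    candidates = [
        (10 ** rng.uniform(-4, -1), rng.uniform(0.5, 0.99))
        for _ in range(n_trials)
    ]
    return min(candidates, key=lambda p: validation_loss(*p))


best_grid = grid_search([0.001, 0.01, 0.1], [0.8, 0.9, 0.95])
best_random = random_search(n_trials=20)
print("grid:", best_grid)
print("random:", best_random)
```

Grid search evaluates 9 combinations here and random search 20; in practice each evaluation is a full training run, which is why the number of trials matters.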
Visualizing and Debugging Intermediate Results
[Figures: training accuracy visualization, Deep Dream, layer activations, feature visualization (filters, activations)]
▪ Many options for visualizations and debugging
▪ Examples to get started
Deep Learning Workflow: INTEGRATE MODELS WITH SYSTEMS
Algorithm Design to Embedded Deployment Workflow
1. Functional test (test in MATLAB on host)
– MATLAB algorithm (functional reference)
2. Deployment unit-test (test generated code in MATLAB on host + GPU)
– Target: desktop GPU, C++
– Build type: .mex
– Call CUDA from MATLAB directly
3. Deployment integration-test (test generated code within C/C++ app on host + GPU)
– Target: desktop GPU, C++
– Build type: .lib/.dll
– Call CUDA from (C++) hand-coded main()
4. Real-time test (test generated code within C/C++ app on Tegra target)
– Target: embedded GPU
– Build type: cross-compiled .lib
– Call CUDA from (C++) hand-coded main()
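Each stage after the first checks generated code against the functional reference. A minimal Python sketch of such a comparison follows. All names are hypothetical, and the candidate emulates single-precision arithmetic via `struct` purely as an assumption about why a deployed build might differ slightly from the reference.

```python
import struct
from typing import Callable, List, Sequence


def to_float32(v: float) -> float:
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", v))[0]


def reference_algorithm(x: Sequence[float]) -> List[float]:
    """Stand-in for the functional reference (stage 1)."""
    return [2.0 * v + 1.0 for v in x]


def generated_code(x: Sequence[float]) -> List[float]:
    """Stand-in for generated code that computes in single precision."""
    return [to_float32(to_float32(2.0) * to_float32(v) + 1.0) for v in x]


def max_abs_error(expected: Sequence[float], actual: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two outputs."""
    if len(expected) != len(actual):
        raise ValueError("outputs differ in length")
    return max(abs(e - a) for e, a in zip(expected, actual))


def verify(candidate: Callable[[Sequence[float]], List[float]],
           inputs: Sequence[float], tol: float = 1e-5) -> bool:
    """Return True if the candidate matches the reference within tol."""
    return max_abs_error(reference_algorithm(inputs), candidate(inputs)) <= tol


inputs = [0.1 * i for i in range(10)]
print(verify(generated_code, inputs))
```

The same test inputs and tolerance can be reused at every stage, so a failure points to the step where the deployed build diverged from the reference.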
GPUs and CUDA
[Diagram: C/C++ code runs on the ARM Cortex CPU in the CPU memory space; CUDA kernels run on the GPU's CUDA cores in the GPU memory space]
Challenges of Programming in CUDA for GPUs
▪ Learning to program in CUDA
– Need to rewrite algorithms for parallel processing paradigm
▪ Creating CUDA kernels
– Need to analyze algorithms to create CUDA kernels that maximize parallel processing
▪ Allocating memory
– Need to deal with memory allocation on both CPU and GPU memory spaces
▪ Minimizing data transfers
– Need to minimize transfers while ensuring that required data transfers happen at the appropriate points in your algorithm
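The last point can be made concrete with a toy model in Python. It uses no real CUDA API: `FakeDevice` is a hypothetical class that only counts host-to-device and device-to-host copies, and `scale` and `offset` stand in for kernels.

```python
from typing import List, Sequence


class FakeDevice:
    """Toy model of a GPU that only counts host<->device copies."""

    def __init__(self) -> None:
        self.copies = 0

    def to_device(self, host_data: Sequence[float]) -> List[float]:
        self.copies += 1
        return list(host_data)

    def to_host(self, device_data: Sequence[float]) -> List[float]:
        self.copies += 1
        return list(device_data)


def scale(data: Sequence[float], k: float) -> List[float]:
    """Stand-in for a kernel that multiplies every element by k."""
    return [k * v for v in data]


def offset(data: Sequence[float], c: float) -> List[float]:
    """Stand-in for a kernel that adds c to every element."""
    return [v + c for v in data]


def naive_pipeline(dev: FakeDevice, x: Sequence[float]) -> List[float]:
    """Copies the data back to the host after every kernel."""
    d = dev.to_device(x)
    x = dev.to_host(scale(d, 2.0))
    d = dev.to_device(x)
    return dev.to_host(offset(d, 1.0))


def resident_pipeline(dev: FakeDevice, x: Sequence[float]) -> List[float]:
    """Keeps the intermediate result on the device between kernels."""
    d = dev.to_device(x)
    d = offset(scale(d, 2.0), 1.0)
    return dev.to_host(d)


naive_dev, resident_dev = FakeDevice(), FakeDevice()
naive_out = naive_pipeline(naive_dev, [0.0, 1.0, 2.0])
resident_out = resident_pipeline(resident_dev, [0.0, 1.0, 2.0])
print(naive_out, naive_dev.copies)
print(resident_out, resident_dev.copies)
```

Both pipelines produce the same result, but the second needs half as many copies because the intermediate data never leaves the device.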
GPU Coder Compilation Flow
Benefits:
▪ MATLAB as single golden reference
▪ Much faster conversion from MATLAB to CUDA
▪ Elimination of manual coding errors
▪ No expert-level knowledge of parallel computing needed
GPU Coder:
▪ CUDA kernel creation
– Library function mapping
– Loop optimizations
– Dependence analysis
▪ Memory allocation
– Data locality analysis
– GPU memory allocation
▪ Data transfer minimization
– Data-dependence analysis
– Dynamic memcpy reduction
GPU Coder Output
Deep Learning Network Support (with Neural Network Toolbox)
▪ SeriesNetwork (GPU Coder: R2017b)
– Networks: MNIST, AlexNet, YOLO, VGG, lane detection, pedestrian detection
▪ DAGNetwork (GPU Coder: R2018a)
– Networks: GoogLeNet, ResNet, SegNet, FCN, DeconvNet
– Semantic segmentation, object detection
Semantic Segmentation
[Figures: running in MATLAB; generated code from GPU Coder]
Algorithm Design to Embedded Deployment
1. Functional test
– MATLAB algorithm (functional reference)
2. Deployment unit-test
– Target: Tesla GPU, C++
– Build type: .mex
– Call CUDA from MATLAB directly
3. Deployment integration-test
– Target: Tesla GPU, C++
– Build type: .lib/.dll
– Call CUDA from (C++) hand-coded main()
4. Real-time test
– Target: Tegra GPU
– Build type: cross-compiled .lib
– Call CUDA from (C++) hand-coded main()
– Cross-compiled on host with Linaro toolchain
Alexnet Inference on NVIDIA Titan Xp
[Chart: frames per second vs. batch size for GPU Coder + TensorRT (3.0.1, int8), GPU Coder + TensorRT (3.0.1), GPU Coder + cuDNN, MXNet (1.1.0), and TensorFlow (1.6.0)]
Testing platform:
▪ CPU: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz
▪ GPU: Pascal Titan Xp
▪ cuDNN: v7
Alexnet Deployment to Tegra: Cross-Compiled with ‘lib’
Two small changes
1. Change build-type to ‘lib’
2. Select cross-compile toolchain
Alexnet Inference on Jetson TX2: Performance
[Chart: frames per second (0 to 400) vs. batch size (1, 16, 32, 64, 128, 256) for MATLAB GPU Coder (R2017b), C++ Caffe (1.0.0-rc5), and TensorRT (2.1); annotations: 2x, 0.85x]
Deploying to GPUs and CPUs
GPU Coder deploys deep learning networks through:
▪ NVIDIA cuDNN & TensorRT libraries
▪ ARM Compute Library (Raspberry Pi board)
▪ Intel MKL-DNN library (desktop CPU)
Deep Learning in MATLAB
▪ Integrated Deep Learning Framework
– Data Access and Preprocessing
– Deep Learning Network Design and Verification
– Integration within larger System
▪ Acceleration through GPU and Parallel Computing
– Training
– Inference
▪ Deployment through automatic CUDA Code Generation
– Desktop GPU
– Embedded GPU
GPU Coder for Deployment
GPU Coder: accelerated implementation of parallel algorithms on GPUs & CPUs
▪ Deep Neural Networks (1, 2, 3): deep learning, machine learning
– 5x faster than TensorFlow
– 2x faster than MXNet
▪ Image Processing and Computer Vision (2): image filtering, feature detection/extraction
– 60x faster than CPUs for stereo disparity
▪ Signal Processing and Communications (2): FFT, filtering, cross correlation
– 20x faster than CPUs for FFTs
Target libraries: (1) Intel MKL-DNN Library, (2) [not legible in source], (3) ARM Compute Library
GPU Coder for Image Processing and Computer Vision
▪ Distance transform: 8x speedup
▪ Fog removal: 5x speedup
▪ SURF feature extraction: 700x speedup
▪ Ray tracing: 18x speedup
▪ Frangi filter: 3x speedup
Design Your DNNs in MATLAB, Deploy with GPU Coder
Access Data
▪ Manage large image sets
▪ Automate image labeling
▪ Easy access to models
Design + Train
▪ Acceleration with GPUs
▪ Scale to clusters
Deploy
▪ Automate compilation to GPUs and CPUs using GPU Coder:
– 11x faster than TensorFlow
– 4.5x faster than MXNet
Questions?
Thank You!