APPLICATIONS OF DEEP LEARNING TO COMPUTER VISION AND COMPUTER … · 2015. 8. 10. · APPLICATIONS...
APPLICATIONS OF DEEP LEARNING TO COMPUTER VISION AND COMPUTER GRAPHICS Mike Houston
Practical DEEP LEARNING Examples
Image Classification, Object Detection, Localization, Action Recognition, Scene Understanding
Speech Recognition, Speech Translation, Natural Language Processing
Pedestrian Detection, Traffic Sign Recognition
Breast Cancer Cell Mitosis Detection, Volumetric Brain Image Segmentation
What is DEEP LEARNING?
Training
Input (Tree, Cat, Dog) -> Deep Learning Framework -> Result
Forward Propagation: result is “turtle”
Backward Propagation: compute weight update to nudge from “turtle” towards “dog”
Repeat
Inference
Input -> Trained Neural Net Model -> Result: “cat”
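The forward propagation, backward propagation, and repeat steps named on this slide can be sketched in a few lines. This is a minimal illustration, not the method of any framework in the talk: it assumes a single-layer softmax classifier, a hypothetical four-label set, made-up feature values, and a plain gradient step with a hypothetical learning rate of 0.5.

```python
import math

CLASSES = ["tree", "cat", "dog", "turtle"]  # hypothetical label set


def forward(weights, features):
    """Forward propagation: class scores, then softmax probabilities."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    peak = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def backward(weights, features, probs, target, lr=0.5):
    """Backward propagation: one cross-entropy gradient step, in place."""
    for k, row in enumerate(weights):
        error = probs[k] - (1.0 if k == target else 0.0)
        for j, x in enumerate(features):
            row[j] -= lr * error * x


def predict(weights, features):
    """Inference: return the label with the highest probability."""
    probs = forward(weights, features)
    return CLASSES[probs.index(max(probs))]


def train(weights, features, label, steps=50):
    """Repeat forward and backward propagation for one labeled example."""
    target = CLASSES.index(label)
    for _ in range(steps):
        probs = forward(weights, features)
        backward(weights, features, probs, target)
    return weights


# Made-up starting weights that favor "turtle" for this made-up input.
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.5, 0.5, 0.0]]
features = [1.0, 0.5, -0.2]

print(predict(weights, features))  # turtle
train(weights, features, "dog")
print(predict(weights, features))  # dog
```

A real network stacks many layers and trains on many labeled images, but each iteration has the same shape: run the input forward, compare the result with the label, and push the weights toward the label.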
Making a vehicle classifier
PICKUP
SUV
SUV
The “Big Bang” In Deep Learning
Algorithms, Data, Compute Capability
Medical Research
Detecting Mitosis in Breast Cancer Cells (IDSIA)
Predicting the Toxicity of New Drugs (Johannes Kepler University)
Understanding Gene Mutation to Prevent Disease (University of Toronto)
“Automated Image Captioning with ConvNets and Recurrent Nets”
—Andrej Karpathy, Fei-Fei Li
Captioning
Why Are GPUs Good for Deep Learning?
GPUs deliver --
same or better prediction accuracy
faster results
smaller footprint
lower power
Common to Neural Networks and GPUs:
Inherently Parallel
Matrix Operations
FLOPS
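To make the "inherently parallel" and "matrix operations" points concrete, here is a small sketch of a fully connected layer applied to a batch of inputs as one matrix product. The layer sizes and values are made up for illustration, and the pure-Python loop only shows the structure of the computation; a GPU library such as cuBLAS would run the independent dot products in parallel.

```python
def matmul(a, b):
    """Multiply an (m x k) matrix by a (k x n) matrix, both nested lists."""
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    # Each output element (i, j) depends only on row i of a and column j
    # of b, so all m * n dot products are independent of one another.
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def matmul_flops(m, k, n):
    """Usual convention: one multiply and one add per inner-product term."""
    return 2 * m * k * n


batch = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]   # 2 inputs with 3 features each
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [1.0, 1.0]]      # 3 features feeding 2 output units

print(matmul(batch, weights))    # [[4.0, 5.0], [10.0, 11.0]]
print(matmul_flops(2, 3, 2))     # 24
```

The operation count grows with the product of all three dimensions, which is why floating-point throughput dominates the cost of large layers.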
[Chart, 2010 to 2014. First series: 0, 0, 4, 60, 110. Second series: 28%, 26%, 16%, 12%, 7%.]
[Example image labels: bird, frog, person, dog, chair]
GPU-Accelerated Deep Learning
START-UPS
GPU-Accelerated Deep Learning Frameworks
 | CAFFE | TORCH | THEANO | CUDA-CONVNET2 | KALDI
Domain | Deep Learning Framework | Scientific Computing Framework | Math Expression Compiler | Deep Learning Application | Speech Recognition Toolkit
cuDNN | R2 | R2 | R2 | -- | --
Multi-GPU | In Progress | In Progress | In Progress | | (nnet2)
Multi-CPU | | | | | (nnet2)
License | BSD-2 | GPL | BSD | Apache 2.0 | Apache 2.0
Interface(s) | Text-based definition files, Python, MATLAB | Python, Lua, MATLAB | Python | C++ | C++, Shell scripts
Embedded (TK1) | | | | |
http://developer.nvidia.com/deeplearning
DIGITS
DIGITS: DEEP LEARNING GPU TRAINING SYSTEM FOR DATA SCIENTISTS
Design DNNs
Visualize activations
Manage multiple trainings
USER INTERFACE: Visualize Layers, Configure DNN, Process Data, Monitor Progress
Theano, Torch, Caffe
cuDNN, cuBLAS
CUDA
GPU HW: GPU, Multi-GPU, GPU Cluster, Cloud
DIGITS
Test Image
Monitor Progress, Configure DNN, Process Data, Visualize Layers
DIGITS DEVBOX
World’s fastest GPU
Max GPU out of a plug
Multi-GPU training & inference
Production Automotive Pipeline
TEGRA X1 CLASSIFICATION Performance
[Bar chart: AlexNet, IMAGES / SECOND on a 0 to 100 axis, Tegra K1 vs. Tegra X1]
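For reference, an images-per-second figure of this kind is a count of classified images divided by elapsed wall-clock time. The sketch below shows one way such a number could be measured; the function name, the `classify` callable, and the injectable `clock` parameter are all hypothetical and are not taken from the benchmark behind the slide.

```python
import time


def images_per_second(classify, images, clock=time.perf_counter):
    """Classify every image once and return images processed per second."""
    start = clock()
    for image in images:
        classify(image)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return len(images) / elapsed


def dummy_classify(image):
    """Stand-in for a real network's forward pass (an assumption)."""
    return "suv" if sum(image) > 1.0 else "pickup"


fake_images = [[0.1 * i, 0.2, 0.3] for i in range(1000)]
rate = images_per_second(dummy_classify, fake_images)
print(f"{rate:.0f} images/second")
```

Passing the clock as a parameter keeps the measurement testable; on real hardware the result also depends on batch size and on whether data transfer time is included.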
Project DAVE: DARPA Autonomous Vehicle
DNN-based self-driving robot
Training data by human driver
No hand-coded CV algorithms
IMAGENET CHALLENGE Accuracy %
[Chart, 2010 to 2014: CV and DNN series; labeled values 72%, 74%, 84%]
TRAINING DATA 225K Images
DAVE IN ACTION
Data Scientist: DIGITS - Train (Network, Solver, Dashboard)
Model
Vehicle: Drive PX - Deploy (Classification, Detection, Segmentation)
Active Learning
Deep Learning and Vision/Graphics
Street Number Detection
[Goodfellow 2014]
Object Classification
[Krizhevsky 2012]
Image Retrieval
[Krizhevsky 2012]
Pose Estimation
[Toshev, Szegedy 2014]
Object Detection
[Huval et al. 2015]
Face Recognition
[Taigman et al. 2014]
Action Recognition
[Simonyan et al. 2014]
Playing Games
[Mnih et al. 2013]
Semantic Segmentation
[Farabet et al. 2013]
Super Resolution
[Dong et al. 2014]
Ray Tracing – Monte Carlo Denoising
[Kalantari et al. 2015]
“Dreams”
[Mordvintsev et al. 2015]