Visual perception through Deep Learning
Dario Garcia-Gasulla
Barcelona Supercomputing Center (BSC)
June 1, 2016
Deep Learning basics | Convolutional Neural Networks | CNN Applications | Mining CNNs | Summary
The basics
Dario Garcia-Gasulla June 1, 2016 2 / 38
The Artificial Neuron and the Artificial Neural Network
Definition (Maureen Caudill): "...a computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs."
McCulloch & Pitts, 1943; Rosenblatt, 1958
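A minimal sketch of one such processing element, assuming a weighted sum followed by a step activation in the spirit of the early models cited above. The weights and bias are hand-picked for illustration (they make the neuron behave like a logical AND) and do not come from the slides.

```python
def neuron(inputs, weights, bias):
    """Weighted sum of the inputs followed by a step activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0

# Hand-picked parameters (an assumption for illustration): logical AND.
AND_WEIGHTS = [1.0, 1.0]
AND_BIAS = -1.5

# Evaluate the neuron on every combination of two binary inputs.
outputs = [neuron([a, b], AND_WEIGHTS, AND_BIAS) for a in (0, 1) for b in (0, 1)]
print(outputs)
```

A network is many of these units connected together, with the weights learnt from data instead of chosen by hand.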
Training Neural Networks
Backpropagation algorithm
- Measure error on output (loss function)
- Optimize weights to reduce loss (Gradient Descent)
- Backpropagate the loss, layer by layer, until all neuron weights have been improved
- Repeat!
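The loop above can be sketched on a toy network. This is only an illustration under stated assumptions: one tanh hidden neuron feeding one linear output neuron, a mean squared error loss, and three made-up data points. None of these choices come from the slides.

```python
import math

# Toy data (an assumption for illustration): three (input, target) pairs.
SAMPLES = [(-1.0, -0.8), (0.0, 0.0), (1.0, 0.8)]

def forward(params, x):
    """Two-layer network: one tanh hidden neuron, one linear output neuron."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)
    return h, w2 * h + b2

def loss(params, samples):
    """Mean squared error over the samples."""
    return sum((forward(params, x)[1] - y) ** 2 for x, y in samples) / len(samples)

def gradients(params, samples):
    """Backpropagate the loss from the output layer back to the hidden layer."""
    w1, b1, w2, b2 = params
    grad = [0.0, 0.0, 0.0, 0.0]
    for x, y in samples:
        h, out = forward(params, x)
        d_out = 2.0 * (out - y) / len(samples)  # error measured on the output
        d_h = d_out * w2                        # pushed back through the output layer
        d_pre = d_h * (1.0 - h * h)             # pushed back through tanh
        grad[0] += d_pre * x                    # dLoss/dw1
        grad[1] += d_pre                        # dLoss/db1
        grad[2] += d_out * h                    # dLoss/dw2
        grad[3] += d_out                        # dLoss/db2
    return grad

def train(params, samples, lr=0.1, epochs=500):
    """Gradient descent: measure, backpropagate, update, repeat."""
    for _ in range(epochs):
        grad = gradients(params, samples)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params

initial = [0.5, 0.0, 0.5, 0.0]  # arbitrary starting weights
trained = train(initial, SAMPLES)
print(loss(initial, SAMPLES), loss(trained, SAMPLES))
```

Real frameworks compute these gradients automatically; the hand-written chain rule here only makes the layer-by-layer flow visible.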
(old) Neural Networks
- Traditionally used as classifiers for simple problems
- Capable of finding non-linearities in the data
Limitations
- Large networks are increasingly expensive to train (millions of weights)
- Need large amounts of data to find complex non-linearities
- Training easily stalls in local optima
Deep Neural Network (aka Deep Learning)
More layers! Made possible by:
- Hardware advances (GPUs)
- More efficient types of neurons
- Training optimizations
Deep Learning families
Based on neuron, layer and training particularities:
- Convolutional Neural Networks (CNNs): Capture 2D features. Appropriate for visual data.
- Recurrent Neural Networks (RNNs): Capture streams of data. May include memory components (LSTM). Appropriate for text, sound, etc.
- Deep Belief Networks: Probabilistic model.
- ... and many others
Convolutional Neural Networks
The explosion of Deep Learning
The ImageNet Challenge: a visual recognition competition. Recognize 1,000 different objects.
In 2012, Alex Krizhevsky et al. trained a CNN with 5 convolutional layers and improved the best result by 11%.
In 2014, all candidates were based on CNNs.
In 2015, human-level performance was achieved.
CNNs: The Origin
Design
- Fukushima, 1980 (neocognitron). LeCun, 1998, 2003.
- Based on the visual cortex of animals: each neuron perceives a small portion of the input and exploits the spatial correlation.
- Reuse neuron weights to reduce complexity.
What was missing:
- Feasible implementation.
- GPUs
CNN Layers: Convolution
Convolutional Layers
Each neuron takes as input a small patch of data (called its receptive field, e.g., 3x3). A neuron's parameters are convolved over the whole input. This provides translation invariance.
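A minimal sketch of sliding one neuron's weights over the whole input, assuming stride 1, no padding, and a single channel. The 3x3 vertical-edge kernel and the tiny images are made up for illustration. As in most CNN libraries, the kernel is not flipped, so this is strictly a cross-correlation.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Hypothetical 3x3 receptive field that responds to vertical edges.
VERTICAL_EDGE = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]

# Two 4x5 images holding the same edge, one column apart.
image = [[0, 0, 1, 1, 1] for _ in range(4)]
shifted = [[0, 0, 0, 1, 1] for _ in range(4)]

print(conv2d(image, VERTICAL_EDGE))
print(conv2d(shifted, VERTICAL_EDGE))
```

Because the same weights are reused at every position, the edge is detected in both images; the response simply moves along with it.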
CNN Layers: Pooling
Pooling Layers
- Down-sampling technique to reduce complexity at the price of precision.
- Reduce values within the pooling filter (e.g., 2x2) to the maximum or average (e.g., max pooling, average pooling).
- The exact location is not as important as relative location.
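A small sketch of both pooling variants, assuming non-overlapping windows (stride equal to the filter size). The 4x4 feature map is made up for illustration.

```python
def average(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)

def pool2d(feature_map, size=2, reduce=max):
    """Reduce each non-overlapping size x size window to a single value."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            reduce(feature_map[r * size + i][c * size + j]
                   for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]

print(pool2d(feature_map))                  # max pooling
print(pool2d(feature_map, reduce=average))  # average pooling
```

The 4x4 map shrinks to 2x2: each output keeps the strongest (or mean) response of its window and discards where exactly inside the window it occurred.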
CNN Layers: Fully-connected
Fully-connected Layers
- Standard NN layer. Each neuron takes as input all neurons from the previous layer.
- Spatial information is no longer taken into account.
- The output is an estimate of the prediction (class probability).
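A minimal sketch of a fully-connected layer whose scores are turned into class probabilities, assuming a softmax on top. The feature values, weights and biases are made-up numbers for a two-class example.

```python
import math

def fully_connected(inputs, weights, biases):
    """Each output neuron takes a weighted sum over all inputs."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Flattened features from the previous layer (spatial layout is gone).
features = [1.0, 2.0, 0.5]
# One row of weights per output class (hypothetical values).
weights = [[0.2, 0.4, -0.1], [-0.3, 0.1, 0.5]]
biases = [0.0, 0.1]

scores = fully_connected(features, weights, biases)
probabilities = softmax(scores)
print(scores, probabilities)
```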
CNN Architecture
Standard Architecture
- Stack convolution and pooling layers.
- To estimate probabilities, use fully connected layers at the end. The output feeds a classifier (softmax, SVM).
CNNs in Action
During Training
A CNN trained to recognize objects learns different representations at each depth.
1. Lines, angles
2. Composed shapes
3. Parts of entities
4. Full entities
During Deployment
The CNN looks for increasingly complex patterns in the image. Finally, by considering the most complex patterns (top layer), a class prediction is made.
CNN Practical Notes
Requirements
- Large set of labeled data for training
- Computational power for training (GPUs)
- Deployment is cheap
Where to start?
- Almost out-of-the-box CNNs: Caffe, Torch, Theano, TensorFlow
- Pre-trained models are available for download
Applications of Convolutional Neural Networks
Object Recognition (A. Krizhevsky et al., 2012)
Image Segmentation (L.-C. Chen et al., 2014)
Style Transfer (Gatys et al., 2015)
Colorization (Zhang et al., 2016)
Colorization II (Iizuka et al., 2016)
Other applications
Mobile Apps
- Aipoly, AI Scry, BlindTool: Textual description of images
- Artify: Artistic style
- Nippler, AwesomeCNN, WhatPlant: Object detection
AI Challenges
- Playing videogames, Go, ...
- Self-driving cars
- Image retrieval (Google, Facebook, Instagram, etc.)
Mining CNN learnt representations: A case of research
Beyond CNNs
CNNs...
- Learn lots of relevant representations (millions) from a training set
- Characterize input data based on learnt representations
Our hypothesis
- A CNN is a kind of feature extractor
- What mining/learning can be performed with those features?
Step 1: Vector Embeddings
From Images to Vectors
- For a given image, record which neurons activate for it, and their activation strength
- Use a subset of those neurons to define a fixed vector length
- Produce a vector for each image, assuming each variable is independent
The vector represents everything the CNN perceives in the image
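A toy sketch of this step. The activation values, layer names and the selected neuron subset are all hypothetical; in practice the activations would come from a forward pass of a pre-trained CNN, and the vector would have far more positions.

```python
# Hypothetical activations for one image: layer name -> activation per neuron.
activations = {
    "conv1": [0.0, 1.2, 0.3],
    "conv2": [2.1, 0.0],
    "fc": [0.7, 0.0, 0.0, 3.4],
}

# The chosen subset of neurons fixes the vector length
# and the meaning of each position in the vector.
SELECTED = [("conv1", 1), ("conv1", 2), ("conv2", 0), ("fc", 0), ("fc", 3)]

def embed(activations, selected=SELECTED):
    """Build a fixed-length vector of activation strengths for one image."""
    return [activations[layer][index] for layer, index in selected]

vector = embed(activations)
print(vector)
```

Every image processed with the same `SELECTED` list yields a vector of the same length, so vectors from different images can be compared position by position.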
Step 2: Abstract Representations
Image Class Vectors
- Images are imperfect representations of entities (changes in perspective, illumination, specimen, etc.)
- To build stable class representations we need to aggregate the evidence provided by many images of the same entity
Result: One vector with millions of numerical values for each abstract class
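A sketch of the aggregation step. The slides do not say which aggregation is used, so the element-wise mean below is an assumption; the three short image vectors are made up.

```python
def class_vector(image_vectors):
    """Aggregate same-length image vectors into one class vector.

    Assumption: aggregation is the element-wise mean.
    """
    count = len(image_vectors)
    return [sum(values) / count for values in zip(*image_vectors)]

# Three made-up image vectors of the same class.
dog_images = [
    [1.0, 0.0, 2.0],
    [3.0, 0.0, 4.0],
    [2.0, 3.0, 0.0],
]

print(class_vector(dog_images))
```

Averaging dampens features that fire for only one image (an unusual pose or background) and keeps those that fire consistently across the class.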
Step 3: Exploit vectors
Vector operations
- Compute distances to perform clustering (unsupervised learning)
- Visualize class vectors (see what the CNN sees)
- Vector arithmetic (visual reasoning)
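A toy sketch of two of these operations: a distance between class vectors and a vector subtraction followed by a nearest-class lookup. The cosine distance is an assumption (the slides do not name a metric), and the 3-value class vectors are invented so that the result is easy to follow.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: 0 for the same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def subtract(a, b):
    """Element-wise difference of two class vectors."""
    return [x - y for x, y in zip(a, b)]

def nearest(query, class_vectors, exclude=()):
    """Name of the class whose vector is closest to the query."""
    candidates = {name: vec for name, vec in class_vectors.items()
                  if name not in exclude}
    return min(candidates, key=lambda name: cosine_distance(query, candidates[name]))

# Made-up class vectors; real ones would hold millions of values.
CLASS_VECTORS = {
    "horse": [1.0, 0.0, 0.0],
    "dog": [0.9, 0.0, 0.1],
    "horse cart": [1.0, 1.0, 1.0],
    "rickshaw": [0.0, 1.0, 1.0],
}

# Distances: similar classes should be closer than unrelated ones.
print(cosine_distance(CLASS_VECTORS["horse"], CLASS_VECTORS["dog"]))
print(cosine_distance(CLASS_VECTORS["horse"], CLASS_VECTORS["rickshaw"]))

# Arithmetic: remove the "horse" features from "horse cart" and look up what is left.
query = subtract(CLASS_VECTORS["horse cart"], CLASS_VECTORS["horse"])
print(nearest(query, CLASS_VECTORS, exclude=("horse cart", "horse")))
```

The same pairwise distances could feed any standard clustering algorithm.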
Mining process: Step by Step
1. Build a million-pattern description for a set of images (image to vector)
2. Aggregate images by class (image class to vector)
3. Compute distances, clusters, arithmetic (image class clustering and equations)
Actual Data
The Model
- GoogLeNet architecture pretrained to recognize 1,000 classes using 1.5M images. 80MB.
- Extract 1.2M features from the CNN. One vector < 3MB
The Data
- Process 50,000 images (ImageNet test set)
- Aggregate 50 images per class: 1,000 class vectors
Clustering (I)
114 dogs (black), 44 wheeled vehicles (grey)
- Similar things are close
- Implicit high-level knowledge
Clustering (II)
Which semantics does the vector space actually capture?
- Find n clusters
- For each cluster, find its most representative WordNet label
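One way the labelling step could work, sketched under assumptions: the slides do not define "most representative", so here it is taken to be the most specific hypernym shared by every class in the cluster. The hypernym paths are hand-written stand-ins for real WordNet lookups.

```python
# Hand-written hypernym paths (root first); a stand-in for real WordNet data.
HYPERNYM_PATHS = {
    "beagle": ["entity", "animal", "dog", "hound"],
    "collie": ["entity", "animal", "dog", "shepherd dog"],
    "pug": ["entity", "animal", "dog"],
    "jeep": ["entity", "artifact", "wheeled vehicle", "car"],
    "bicycle": ["entity", "artifact", "wheeled vehicle"],
}

def representative_label(cluster, paths=HYPERNYM_PATHS):
    """Return the most specific hypernym shared by every class in the cluster."""
    shared = None
    # Walk the paths level by level, starting from the root.
    for levels in zip(*(paths[name] for name in cluster)):
        if len(set(levels)) != 1:
            break  # the classes diverge below this level
        shared = levels[0]
    return shared

print(representative_label(["beagle", "collie", "pug"]))
print(representative_label(["jeep", "bicycle"]))
print(representative_label(["pug", "jeep"]))
```

A cluster that mixes unrelated classes falls back to a very general label, which is itself a signal of how semantically coherent the cluster is.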
Class Visualization (I)
Vector to image
- Generate images from class vectors
- See a concept as the CNN perceives it
Based on Gatys et al., 2015
Class Visualization (II)
Vector Arithmetics (I)
Church - Mosque = Bellcote
Horse cart - Horse = Rickshaw
Vector Arithmetics (III)
Panda bear - Brown bear = Skunk, Football, Indri, Angora rabbit
- What do these four image classes have in common?
Deep Learning and CNN
Technically
- Not so new
- Made possible by increased computational power and a few optimizations
- Currently, a trial-and-error research approach
Impact
- Anything related to visual data has changed
- The same will happen with text, sound and others
- Just the tip of the iceberg!
Deep Learning and CNNs online materials
http://cs231n.github.io/convolutional-networks/
http://ufldl.stanford.edu/tutorial/
Almost out-of-the-box CNNs
Caffe, Torch, Theano, TensorFlow