
Convolutional Neural Networks

Amin MirMohammad Javadi


Applications

• Image recognition

• Completely dominated the machine vision space

• One of the hottest topics in AI today

• Tricky to understand


Why not Regular Neural Nets

• They don't scale well to full images.

• In CIFAR-10

๏ Images of size 32⨉32⨉3 ⟹ 3072 weights per neuron

• Larger images

๏ 200⨉200⨉3 ⟹ 120,000 weights per neuron
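
A fully connected neuron needs one weight per input value, so the count is just width × height × channels. A quick sanity check in plain Python (a sketch; the function name is illustrative):

```python
# One weight per input value for a fully connected neuron.
def weights_per_neuron(width, height, channels):
    return width * height * channels

print(weights_per_neuron(32, 32, 3))    # 3072   (CIFAR-10)
print(weights_per_neuron(200, 200, 3))  # 120000 (larger image)
```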


Why not Regular Neural Nets

• Input consists of images

• Neurons in layers are arranged in three dimensions:

๏ Width, height, depth



Historical Overview

• CNNs are inspired by the organization of the animal visual cortex

• In 1998, Yann LeCun et al. presented LeNet-5, one of the first CNNs

• Out of 10,000 test images, it misclassified only 82


ImageNet Challenge

• ImageNet Large Scale Visual Recognition Challenge (ILSVRC)

• As of 2016, over ten million images had been hand-annotated

• Error rates fell each year (25%, 16%, …)


ConvNet Architecture

• Convolution Layer

• ReLU Layer

• Pooling Layer

• Fully-Connected Layer

• Softmax


Convolution Layer

• Neurons are not fully connected; each sees only a local region of the input

• Each neuron computes a dot product between its weights and that region



Convolution Layer

• Several filters, each producing its own activation map


Convolution Layer

• Weights ⟹ learnable filters


Convolution Layer

• Slide each filter over the width and height of the input

• Like a convolution operation

• Produces a 2-dimensional activation map
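
A minimal NumPy sketch of this sliding dot product (single channel, stride 1, no padding; the names are illustrative, not from the slides):

```python
import numpy as np

def conv2d_single(image, kernel):
    """Slide one filter over a 2-D input (no padding, stride 1)
    and return the 2-D activation map of dot products."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Dot product between the filter and the local input region.
            out[i, j] = np.sum(image[i:i+kH, j:j+kW] * kernel)
    return out

image = np.random.rand(7, 7)
kernel = np.random.rand(3, 3)
print(conv2d_single(image, kernel).shape)  # (5, 5)
```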


Convolution Layer

• Example (figure)



ReLU

• Rectified linear unit

• Applies an element-wise activation function

๏ y = max(0, x)
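
Element-wise means max(0, x) is applied to every activation independently; for example, in NumPy:

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(np.maximum(0, x))  # [0.  0.  0.  1.5 3. ] -- negatives clipped to zero
```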


Pooling Layer

• Performs a downsampling operation

• Progressively reduces the spatial size of activation maps

๏ Shrinks the number of parameters and computation

๏ Helps control overfitting

• Max pooling with 2⨉2 filters and a stride of 2

๏ Reduces each spatial dimension by half
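
A NumPy sketch of 2⨉2 max pooling with stride 2 (single channel; odd edges trimmed for brevity):

```python
import numpy as np

def max_pool_2x2(x):
    """Keep the largest value in each non-overlapping 2x2 window,
    halving both spatial dimensions (odd edges are trimmed)."""
    H, W = x.shape
    x = x[:H // 2 * 2, :W // 2 * 2]
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

a = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2x2(a))  # [[ 5.  7.] [13. 15.]] -- 4x4 input -> 2x2 output
```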



Fully-Connected Layer

• Every neuron is connected to all activations in the previous layer

• No parameter sharing

• Using the ReLU activation function instead of the sigmoid is common
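
A sketch of one fully connected layer in NumPy (the 3072 → 10 sizes are illustrative, borrowing the CIFAR-10 input size from earlier):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(3072)        # flattened input features
W = rng.standard_normal((10, 3072))  # every output sees every input: no sharing
b = rng.standard_normal(10)

h = np.maximum(0, W @ x + b)         # ReLU instead of sigmoid
print(h.shape)                       # (10,)
```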



Softmax

• Normalized exponential function

• Generalization of the logistic function
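
As a formula, softmax(z)_i = exp(z_i) / Σ_j exp(z_j); a numerically stable NumPy sketch:

```python
import numpy as np

def softmax(z):
    """Normalized exponential; subtracting max(z) avoids overflow
    without changing the result."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # probabilities summing to 1
```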


CNN Architecture

• Stack Conv/ReLU layers

• Periodically use Pool layers
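
A hedged sketch of this stacking pattern in Keras (TensorFlow appears in the frameworks list later; all layer sizes here are illustrative, not from the slides):

```python
import tensorflow as tf

# [(Conv -> ReLU) x N -> Pool] x M, then fully connected + softmax.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),  # CIFAR-10-sized input
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(2),    # halve the spatial extent
    tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),  # class probabilities
])
model.summary()
```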


CNN in Practice


VGGNet

• VGGNet (Simonyan and Zisserman, 2014)

• 3⨉3 filters

• Zero-padding of 1

• Stride of 1

• 2⨉2 MAX POOLING with stride of 2

• 7.3% top-5 error
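
Those design choices repeat as a block throughout the network; a hedged Keras sketch (padding="same" gives the zero-padding of 1 for 3⨉3 filters; the helper name is mine):

```python
import tensorflow as tf

def vgg_block(filters, n_convs):
    """n_convs 3x3 convolutions (stride 1, zero-padding 1 via 'same'),
    followed by 2x2 max pooling with stride 2."""
    layers = [tf.keras.layers.Conv2D(filters, 3, strides=1, padding="same",
                                     activation="relu") for _ in range(n_convs)]
    layers.append(tf.keras.layers.MaxPooling2D(pool_size=2, strides=2))
    return tf.keras.Sequential(layers)

block = vgg_block(64, 2)  # e.g. VGG-16's first stage: two 64-filter convs
```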



LeNet

• 5⨉5 filters with a stride of 1

• 2⨉2 MAX POOLING with stride of 2


Other Examples

• GoogLeNet

• MSRA (Microsoft Research Asia)

• SqueezeNet

• And …


Toolboxes and Frameworks

• Caffe

• TensorFlow

• CNTK (Microsoft)

• Theano

• and …


Showtime

• http://cs.stanford.edu/people/karpathy/convnetjs/demo/cifar10.html

• http://demo.caffe.berkeleyvision.org/


DenseCap

• Fei-Fei Li

• Andrej Karpathy

• Justin Johnson

• Dense Captioning

• a Convolutional Network

• a dense localization layer

• a Recurrent Neural Network language model


DenseCap

❝We introduce the dense captioning task, which requires a computer vision system to both localize and describe salient regions in images in natural language.❞



Other Research and Applications

• FaceApp

• JibJab

• Soccer Activity Recognition

• and …


Thanks everyone!


Any Questions?
