
Neural Network and Deep Learning

Md Shad Akhtar

Research Scholar

IIT Patna

Neural Network

• Mimics the functionality of a brain.

• A neural network is a graph with neurons (nodes, units, etc.) connected by links.

Neural Network: Neuron

Neural Network: Perceptron

• Network with only a single layer.

• No hidden layers.

Neural Network: Perceptron

Find weights W and thresholds t that implement the basic logic gates:

AND Gate: inputs X1, X2, output a; W1 = ?, W2 = ?, t = ?

OR Gate: inputs X1, X2, output a; W1 = ?, W2 = ?, t = ?

NOT Gate: input X1, output a; W1 = ?, t = ?

Neural Network: Perceptron

AND Gate: inputs X1, X2, output a; W1 = 1, W2 = 1, t = 1.5

OR Gate: inputs X1, X2, output a; W1 = 1, W2 = 1, t = 0.5

NOT Gate: input X1, output a; W1 = -1, t = -0.5
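A minimal sketch of how a perceptron with these weights and thresholds computes each gate: output a = 1 if the weighted sum of the inputs reaches the threshold t, else 0 (the >= convention and the helper name are assumptions for illustration).

def perceptron(inputs, weights, t):
    # Fire (output 1) when the weighted sum of inputs reaches the threshold t.
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= t else 0

# AND gate: W1 = W2 = 1, t = 1.5
for x1 in (0, 1):
    for x2 in (0, 1):
        print("AND", x1, x2, "->", perceptron((x1, x2), (1, 1), 1.5))

# OR gate: W1 = W2 = 1, t = 0.5
for x1 in (0, 1):
    for x2 in (0, 1):
        print("OR ", x1, x2, "->", perceptron((x1, x2), (1, 1), 0.5))

# NOT gate: W1 = -1, t = -0.5
for x1 in (0, 1):
    print("NOT", x1, "->", perceptron((x1,), (-1,), -0.5))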

Neural Network: Multi-Layer Perceptron (MLP) or Feed-Forward Network (FNN)

• Network with n+1 layers

• One output and n hidden layers.

• Typically n = 1
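A minimal forward-pass sketch of such a network with n = 1 hidden layer, assuming sigmoid activations; the layer sizes and the input are purely illustrative.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes: 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # input -> hidden weights and biases
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)   # hidden -> output weights and biases

x = np.array([0.5, -1.0, 2.0])
h = sigmoid(W1 @ x + b1)    # hidden-layer activations
y = sigmoid(W2 @ h + b2)    # output-layer activations
print(y)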

Training: Back propagation algorithm

• Gradient descent algorithm


1. Initialize network with random weights

2. For all training cases (called examples):

a. Present training inputs to network and calculate output

b. For all layers (starting with output layer, back to input layer):

i. Compare network output with correct output (error function)

ii. Adapt weights in current layer
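A compact numpy sketch of steps 1-2 above on a toy problem (XOR), assuming sigmoid units, squared error, and batch gradient descent; the sizes, seed, and learning rate are illustrative.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy training cases: XOR (not linearly separable, so a hidden layer is needed).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

# 1. Initialize network with random weights.
rng = np.random.default_rng(1)
W1, b1 = rng.normal(scale=0.5, size=(2, 4)), np.zeros(4)   # input -> hidden
W2, b2 = rng.normal(scale=0.5, size=(4, 1)), np.zeros(1)   # hidden -> output
lr = 0.5

# 2. For all training cases (here processed as one batch, many times):
for epoch in range(5000):
    # a. Present training inputs and calculate the output.
    H = sigmoid(X @ W1 + b1)
    O = sigmoid(H @ W2 + b2)
    # b.i Compare network output with correct output (squared-error function).
    err = O - Y
    # b.ii Adapt weights layer by layer, from the output layer back to the input.
    dO = err * O * (1 - O)            # error signal at the output layer
    dH = (dO @ W2.T) * H * (1 - H)    # error signal propagated to the hidden layer
    W2 -= lr * (H.T @ dO); b2 -= lr * dO.sum(axis=0)
    W1 -= lr * (X.T @ dH); b1 -= lr * dH.sum(axis=0)

print(O.round(2))   # should move towards [[0], [1], [1], [0]]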

Deep Learning

What is Deep Learning?

• A family of methods that uses deep architectures to learn high-level feature representations

Example 1

Example 2


Why are Deep Architectures hard to train?

• Vanishing gradient problem in backpropagation: the gradient shrinks as it is propagated back through many layers, so the lower layers learn very slowly.
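A small numerical sketch of the effect, assuming sigmoid activations: each layer the error signal passes through multiplies it by the sigmoid derivative, which is at most 0.25, so the gradient shrinks roughly geometrically with depth.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The sigmoid derivative sigma'(z) = sigma(z) * (1 - sigma(z)) is at most 0.25 (at z = 0).
deriv = sigmoid(0.0) * (1 - sigmoid(0.0))
for depth in (1, 5, 10, 20):
    # Best-case gradient factor after back-propagating through `depth` sigmoid layers.
    print(depth, "layers -> gradient factor <=", deriv ** depth)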

Layer-wise Pre-training

• First, train one layer at a time, optimizing the data-likelihood objective P(x)

Layer-wise Pre-training

• Then, train the second layer, optimizing the data-likelihood objective P(h)

Layer-wise Pre-training

• Finally, fine-tune on the labelled objective P(y|x) by backpropagation

Deep Belief Nets

• Uses Restricted Boltzmann Machines (RBMs)

• Hinton et al. (2006), A fast learning algorithm for deep belief nets.

Restricted Boltzmann Machine (RBM)

• RBM is a simple energy-based model: p(x, h) = exp(-E(x, h)) / Z, where the energy is E(x, h) = -h^T W x - b^T x - d^T h and Z is the partition function that normalizes the distribution.

Example:

• Let weights (h1, x1), (h1, x3) be positive, others be zero, b = d = 0.

• Calculate p(x, h)?

• Ans: p(x1 = 1, x2 = 0, x3 = 1, h1 = 1, h2 = 0, h3 = 0)
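A sketch that enumerates this 3-visible / 3-hidden example and evaluates p(x, h) = exp(h^T W x + b^T x + d^T h) / Z directly; the value of the positive weights (1.0 here) is an illustrative assumption.

import itertools
import numpy as np

# Weights: (h1, x1) and (h1, x3) positive (illustratively 1.0), all others zero; b = d = 0.
W = np.zeros((3, 3))          # W[j, i] connects hidden unit h_j and visible unit x_i
W[0, 0] = W[0, 2] = 1.0
b = np.zeros(3)               # visible biases
d = np.zeros(3)               # hidden biases

def unnorm(x, h):
    # Unnormalized probability exp(-E(x, h)) with E(x, h) = -h^T W x - b^T x - d^T h.
    x, h = np.array(x), np.array(h)
    return np.exp(h @ W @ x + b @ x + d @ h)

configs = list(itertools.product([0, 1], repeat=6))    # all (x1..x3, h1..h3) assignments
Z = sum(unnorm(c[:3], c[3:]) for c in configs)         # partition function
p = unnorm((1, 0, 1), (1, 0, 0)) / Z
print("p(x1=1, x2=0, x3=1, h1=1, h2=0, h3=0) =", float(p))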

Restricted Boltzmann Machine (RBM)

• P(x, h) = P(h|x) P(x)

• P(h|x): easy to compute (it factorizes over the hidden units)

• P(x): hard to compute exactly, since it requires the partition function Z (a sum over exponentially many configurations).

Contrastive Divergence: approximate the gradient with a few steps of Gibbs sampling instead of computing Z exactly.
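A rough sketch of one CD-1 update for a binary RBM (a standard formulation, not necessarily the exact variant in the slides); the sizes, the training example, and the learning rate are illustrative.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 4, 0.1
W = rng.normal(scale=0.01, size=(n_hidden, n_visible))
b = np.zeros(n_visible)                                  # visible biases
d = np.zeros(n_hidden)                                   # hidden biases
x0 = rng.integers(0, 2, size=n_visible).astype(float)    # one (illustrative) training example

# Positive phase: hidden activations driven by the data.
ph0 = sigmoid(W @ x0 + d)
h0 = (rng.random(n_hidden) < ph0).astype(float)

# Negative phase: one Gibbs step (reconstruct the visibles, then the hiddens).
px1 = sigmoid(W.T @ h0 + b)
x1 = (rng.random(n_visible) < px1).astype(float)
ph1 = sigmoid(W @ x1 + d)

# CD-1 update: difference between data statistics and reconstruction statistics.
W += lr * (np.outer(ph0, x0) - np.outer(ph1, x1))
b += lr * (x0 - x1)
d += lr * (ph0 - ph1)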

Deep Belief Nets (DBN) = Stacked RBM

Auto-Encoders: Simpler alternative to RBMs
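A minimal autoencoder sketch (encode the input to a smaller code, decode it back, minimize reconstruction error); the layer sizes, data, and training loop are illustrative assumptions.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_code, lr = 8, 3, 0.5
W_enc = rng.normal(scale=0.1, size=(n_code, n_in))       # encoder weights
W_dec = rng.normal(scale=0.1, size=(n_in, n_code))       # decoder weights
X = rng.integers(0, 2, size=(20, n_in)).astype(float)    # illustrative binary data

for _ in range(2000):
    H = sigmoid(X @ W_enc.T)          # encode: compress each input to a 3-dim code
    R = sigmoid(H @ W_dec.T)          # decode: reconstruct the input from the code
    err = R - X                       # reconstruction error
    dR = err * R * (1 - R)
    dH = (dR @ W_dec) * H * (1 - H)
    W_dec -= lr * (dR.T @ H) / len(X)
    W_enc -= lr * (dH.T @ X) / len(X)

print("mean squared reconstruction error:", float(np.mean((R - X) ** 2)))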

Deep Learning - Architecture

• Recurrent Neural Network (RNN)

• Convolutional Neural Network (CNN)

Recurrent Neural Network (RNN)

Recurrent Neural Network (RNN)

• Enable networks to do temporal processing and learn sequences

Character-level language model. Vocabulary: [h, e, l, o]

Training of RNN: BPTT

(Figure: the network unrolled over time, with input-to-hidden weights U, hidden-to-hidden weights W, and hidden-to-output weights V; at each time step the predicted output is compared with the actual output, and the errors are propagated back through the unrolled steps.)
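A forward-pass sketch of such a character-level RNN over the vocabulary [h, e, l, o], using the U / W / V naming from the figure; the hidden size and the tanh/softmax choices are assumptions.

import numpy as np

vocab = ['h', 'e', 'l', 'o']
one_hot = {c: np.eye(len(vocab))[i] for i, c in enumerate(vocab)}

rng = np.random.default_rng(0)
hidden = 8
U = rng.normal(scale=0.1, size=(hidden, len(vocab)))   # input -> hidden
W = rng.normal(scale=0.1, size=(hidden, hidden))       # hidden -> hidden (recurrence)
V = rng.normal(scale=0.1, size=(len(vocab), hidden))   # hidden -> output

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

h = np.zeros(hidden)
for c in "hell":                          # predict the next character at every step
    h = np.tanh(U @ one_hot[c] + W @ h)   # the same U, W, V are reused at each time step
    p = softmax(V @ h)
    print(c, "-> next-char distribution:", dict(zip(vocab, p.round(2))))

# Training would back-propagate the prediction error at every time step through
# these unrolled steps, i.e. Backpropagation Through Time (BPTT).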

One to many: sequence output (e.g. image captioning takes an image and outputs a sentence of words)

Many to one: sequence input (e.g. sentiment analysis, where a given sentence is classified as expressing positive or negative sentiment)

Many to many: sequence input and sequence output (e.g. machine translation: an RNN reads a sentence in English and then outputs a sentence in French)

Many to many: synced sequence input and output (e.g. language modelling, where we wish to predict the next word)

RNN Extension

• Bidirectional RNN

• Deep (Bidirectional) RNNs

RNN (Cont..)

• “the clouds are in the sky”

(Figure: the unrolled RNN predicting "sky"; the gap between the relevant context and the prediction is short.)

RNN (Cont..)

• “India is my home country. I can speak fluent Hindi.”

(Figure: the unrolled RNN predicting "Hindi"; the relevant context, "India", appears many steps earlier, so the gap is long.)

Ideally: it should be able to use such distant context.

Practically: it is very hard for an RNN to learn such a "Long Term Dependency".

LSTM

• Capable of learning long-term dependencies.

(Figure: the repeating module of a simple RNN vs. the repeating module of an LSTM.)

LSTM

• The LSTM removes or adds information to the cell state, carefully regulated by structures called gates.

• Cell state: the conveyor belt of the cell

LSTM

• Gates

– Forget Gate

– Input Gate

– Output Gate

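A sketch of a single LSTM step with the standard gate formulation (forget, input, and output gates plus the cell-state "conveyor belt"); the weight names, sizes, and input sequence are illustrative.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hid = 4, 5
# One weight matrix per gate plus the candidate cell update, each acting on [h_prev, x_t].
Wf, Wi, Wo, Wc = (rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for _ in range(4))
bf = bi = bo = bc = np.zeros(n_hid)

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(Wf @ z + bf)            # forget gate: what to drop from the cell state
    i = sigmoid(Wi @ z + bi)            # input gate: what new information to write
    c_tilde = np.tanh(Wc @ z + bc)      # candidate values to write
    c = f * c_prev + i * c_tilde        # updated cell state (the conveyor belt)
    o = sigmoid(Wo @ z + bo)            # output gate: what to expose as the hidden state
    h = o * np.tanh(c)
    return h, c

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.normal(size=(3, n_in)):  # a short illustrative input sequence
    h, c = lstm_step(x_t, h, c)
print(h.round(3))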

LSTM - Variants

Convolutional Neural Network (CNN)

Convolutional Neural Network (CNN)

• A special kind of multi-layer neural network.

• Implicitly extracts relevant features.

• A fully-connected architecture does not take the spatial structure of the input into account.

• In contrast, a CNN tries to take advantage of this spatial structure.

Convolutional Neural Network (CNN)

1. Convolutional layer

2. Pooling layer

3. Fully connected layer

Convolutional Neural Network (CNN)

1. Convolutional layer

Image (5 x 5):
1 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 1 1 0
0 1 1 0 0

Convolution Filter (3 x 3):
1 0 1
0 1 0
1 0 1
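A small sketch that slides this 3 x 3 filter over the 5 x 5 image (stride 1, no padding) and prints the resulting 3 x 3 feature map.

import numpy as np

image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
filt = np.array([[1, 0, 1],
                 [0, 1, 0],
                 [1, 0, 1]])

# Slide the filter over every 3 x 3 patch (local receptive field) and take the
# element-wise product-and-sum; the same filter weights are shared at every position.
out = np.zeros((3, 3), dtype=int)
for r in range(3):
    for c in range(3):
        out[r, c] = np.sum(image[r:r+3, c:c+3] * filt)
print(out)   # [[4 3 4]
             #  [2 4 3]
             #  [2 3 4]]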

Convolutional Neural Network (CNN)

1. Convolutional layer: the same 3 x 3 filter slides across the image.

• Local receptive field

• Shared weights

Convolutional Neural Network (CNN)

2. Pooling layer
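A sketch of max pooling: keep only the largest value in each small window of the feature map; 2 x 2 windows with stride 2 and the feature map values are assumed here for illustration.

import numpy as np

feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 1, 2],
                        [1, 2, 0, 1],
                        [3, 4, 6, 2]])   # illustrative 4 x 4 feature map

# 2 x 2 max pooling with stride 2: keep the maximum of each non-overlapping window.
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # [[6 4]
                #  [4 6]]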

Convolutional Neural Network (CNN)

3. Fully connected layer

Convolutional Neural Network (CNN)

Putting it all together

References

• http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/

• http://karpathy.github.io/2015/05/21/rnn-effectiveness/

• http://colah.github.io/posts/2015-08-Understanding-LSTMs/

• http://cl.naist.jp/~kevinduh/a/deep2014/