From neural networks to deep learning


From Artificial Neural Networks to Deep Learning

Viet-Trung Tran


Perceptron

•  Rosenblatt, 1957
•  Input signals x1, x2, …
•  Bias x0 = 1
•  Net input = weighted sum = Net(w, x)
•  Activation/transfer function = f(Net(w, x))
•  Output

(Figure: the inputs are combined by a weighted sum, which is passed through a step function.)

Weighted Sum and Bias

•  Weighted sum: Net(w, x) = w0*x0 + w1*x1 + … + wn*xn (sketched in code below)

•  Bias: the weight w0 on the fixed input x0 = 1, which shifts the threshold of the activation function
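The formulas on this slide were images; as a hedged reconstruction, here is a minimal Python sketch of the perceptron's weighted sum with the bias folded in as the weight on x0 = 1, followed by a step activation. The AND example at the end is purely illustrative.

```python
def perceptron(weights, inputs):
    """Perceptron forward pass: weighted sum of the inputs (bias folded in
    as weights[0] * 1.0), followed by a step activation."""
    x = [1.0] + list(inputs)                        # x0 = 1 carries the bias
    net = sum(w * xi for w, xi in zip(weights, x))  # Net(w, x) = sum_i w_i * x_i
    return 1 if net >= 0 else 0                     # hard-limiter / step function

# Example: a perceptron computing logical AND of two binary inputs
w = [-1.5, 1.0, 1.0]            # bias = -1.5, so the output is 1 only when x1 + x2 >= 1.5
print(perceptron(w, [1, 1]))    # -> 1
print(perceptron(w, [1, 0]))    # -> 0
```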


Hard-limiter function

•  Hard-limiter
  –  Threshold function
  –  Discontinuous function
  –  Discontinuous derivative


Threshold logic function

•  Saturating linear function

•  Continuous function

•  Discontinuous derivative


Sigmoid function

•  Most popular
•  Output in (0, 1)
•  Continuous derivatives
•  Easy to differentiate (see the sketch below)
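For concreteness, here is a small sketch of the activation functions named on these slides (hard-limiter, saturating linear, sigmoid) together with the sigmoid's derivative; this is generic illustration code, not code from the deck.

```python
import math

def hard_limiter(net, threshold=0.0):
    """Step function: 1 at or above the threshold, 0 otherwise (discontinuous)."""
    return 1.0 if net >= threshold else 0.0

def saturating_linear(net):
    """Threshold logic: linear between 0 and 1, clipped (saturated) outside."""
    return min(1.0, max(0.0, net))

def sigmoid(net):
    """Logistic sigmoid: smooth, output in (0, 1), easy to differentiate."""
    return 1.0 / (1.0 + math.exp(-net))

def sigmoid_derivative(net):
    """d/dnet sigmoid(net) = sigmoid(net) * (1 - sigmoid(net))."""
    s = sigmoid(net)
    return s * (1.0 - s)
```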


Artificial neural network – ANN structure

•  Number of input/output signals
•  Number of hidden layers
•  Number of neurons per layer
•  Neuron weights
•  Topology
•  Biases


Feed-forward neural network

•  Connections between the units do not form a directed cycle
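A minimal sketch of what "no directed cycle" means in practice: activations flow strictly from one layer to the next. The layer sizes and weights below are made up for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully-connected layer: weighted sums plus biases, then sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def feed_forward(x, layers):
    """Feed-forward pass: activations flow strictly from layer to layer,
    never back (no directed cycle)."""
    for weights, biases in layers:
        x = layer(x, weights, biases)
    return x

# 2 inputs -> 2 hidden neurons -> 1 output neuron (illustrative weights)
net = [([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.1]),   # hidden layer
       ([[1.0, -1.0]], [0.0])]                     # output layer
print(feed_forward([1.0, 0.5], net))
```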


Recurrent neural network

•  A class of artificial neural networks where connections between units form a directed cycle


Why hidden layers


Neural network learning

•  Two types of learning
  –  Parameter learning: learn the neuron connection weights
  –  Structure learning: learn the ANN structure from training data


Error function

•  Consider an ANN with n neurons
•  For each training example (x, d): the training error caused by the current weights w
•  The training error caused by w over the entire set of training examples (standard formulas reconstructed below)
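The error formulas on the slide were images; the standard squared-error definitions they presumably correspond to are (assuming d_i and o_i are the desired and actual outputs of neuron i):

```latex
% Per-example error for training example (x, d):
E_x(\mathbf{w}) = \tfrac{1}{2} \sum_{i=1}^{n} \left( d_i - o_i \right)^2
% Total training error over the whole training set D:
E(\mathbf{w}) = \sum_{(x, d) \in D} E_x(\mathbf{w})
```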


Learning principle


Neuron error gradients


Parameter learning: back propagation of error

•  Calculate the total error at the top (the output)
•  Going backwards, calculate the contributions to the error at each step (a numerical sketch follows)
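A minimal numerical sketch of this idea, assuming a network with one sigmoid hidden layer, a single sigmoid output, and squared error; all names and sizes here are illustrative, not taken from the slides.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def backprop_single_example(x, d, w_hidden, w_out, lr=0.5):
    """One backprop step for a 2-input, 2-hidden, 1-output sigmoid network
    with squared error E = 0.5 * (d - o)**2. Returns the updated weights."""
    # Forward pass
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    o = sigmoid(sum(w * hi for w, hi in zip(w_out, h)))

    # Error at the top, then contributions propagated backwards
    delta_out = (o - d) * o * (1 - o)                       # dE/dNet_out
    delta_hid = [delta_out * w_out[j] * h[j] * (1 - h[j])   # dE/dNet_hidden_j
                 for j in range(len(h))]

    # Gradient-descent weight updates
    w_out = [w_out[j] - lr * delta_out * h[j] for j in range(len(h))]
    w_hidden = [[w_hidden[j][i] - lr * delta_hid[j] * x[i] for i in range(len(x))]
                for j in range(len(h))]
    return w_hidden, w_out

# Illustrative call: inputs [1, 0], target 1, small hand-picked weights
print(backprop_single_example([1.0, 0.0], 1.0, [[0.2, -0.1], [0.4, 0.3]], [0.5, -0.5]))
```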


Back propagation discussion

•  Initial weights
•  Learning rate
•  Number of neurons per hidden layer
•  Number of hidden layers


Stochastic gradient descent (SGD)
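The slide itself is an image; as a hedged sketch, SGD updates the weights from the gradient of the error on one example at a time (after shuffling), instead of accumulating the gradient over the whole training set. The linear-fit example at the bottom is purely illustrative.

```python
import random

def sgd(examples, w, gradient, lr=0.1, epochs=10):
    """Stochastic gradient descent: shuffle the data, then update the weights
    from the gradient of the error on one example at a time.
    `gradient(w, x, d)` must return dE_x/dw as a list the same length as w."""
    for _ in range(epochs):
        random.shuffle(examples)
        for x, d in examples:
            g = gradient(w, x, d)
            w = [wi - lr * gi for wi, gi in zip(w, g)]   # w <- w - lr * grad
    return w

# Example: fit y = w0 + w1*x by least squares on two points
data = [([1.0, 1.0], 2.0), ([1.0, 2.0], 3.0)]            # x includes a constant 1 for the bias
grad = lambda w, x, d: [2 * (sum(wi * xi for wi, xi in zip(w, x)) - d) * xi for xi in x]
print(sgd(data, [0.0, 0.0], grad, lr=0.05, epochs=200))  # approaches [1, 1]
```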


Deep learning


Google Brain


GPU


Learning from tagged data

•  @Andrew Ng


2006 breakthrough

•  More data
•  Faster hardware: GPUs, multi-core CPUs
•  Working ideas on how to train deep architectures


Deep Learning trends

•  @Andrew Ng


AI will transform the internet

•  @Andrew Ng
•  Technology areas with potential for a paradigm shift:
  –  Computer vision
  –  Speech recognition & speech synthesis
  –  Language understanding: machine translation; web search; dialog systems; …
  –  Advertising
  –  Personalization/recommendation systems
  –  Robotics
•  All of this is hard: scalability, algorithms.


Deep learning


CONVOLUTIONAL NEURAL NETWORK

http://colah.github.io/


Convolution

•  Convolution is a mathematical operation on two functions f and g, producing a third function that is typically viewed as a modified version of one of the original functions.
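To ground the definition, here is the discrete 1D form, (f * g)[t] = Σ_k f[k]·g[t − k], with a tiny hand-rolled example (illustrative only):

```python
def convolve(f, g):
    """Discrete 1D convolution: (f * g)[t] = sum_k f[k] * g[t - k]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for k, fk in enumerate(f):
        for j, gj in enumerate(g):
            out[k + j] += fk * gj
    return out

signal = [1, 2, 3, 4]
kernel = [0.5, 0.5]              # a simple moving-average kernel
print(convolve(signal, kernel))  # [0.5, 1.5, 2.5, 3.5, 2.0]
```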


Convolutional neural networks

•  A conv net is a kind of neural network that uses many identical copies of the same neuron
  –  Large number of neurons
  –  Large computational models
  –  The number of actual weights (parameters) to be learned stays fairly small


A 2D Convolutional Neural Network

•  A convolutional neural network can learn a neuron once and use it in many places, making the model easier to learn and reducing error.


Structure of Conv Nets

•  Problem: predict whether a human is speaking or not

•  Input: audio samples at different points in time


Simple approach

•  Just connect all the samples to a fully-connected layer

•  Then classify


A more sophisticated approach

•  Exploit local properties of the data
  –  e.g. the frequency of sounds (increasing/decreasing)
•  Look at a small window of the audio samples
  –  Create a group of neurons A to compute certain features
  –  The output of this convolutional layer is fed into a fully-connected layer F (sketched below)
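A minimal sketch of that idea, assuming the group of neurons A is a small set of filters slid over fixed-size windows of the audio, with the same weights reused at every position; the filter values and window size are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv_layer(samples, filters, window=3):
    """Apply the same group of neurons A (`filters`) to every window of the
    input: weight sharing means only len(filters) * window weights are learned,
    no matter how long the audio is."""
    outputs = []
    for start in range(len(samples) - window + 1):
        segment = samples[start:start + window]
        outputs.append([sigmoid(sum(w * s for w, s in zip(f, segment)))
                        for f in filters])
    return outputs  # one feature vector per window, to be fed into a layer F

audio = [0.1, 0.4, 0.35, 0.8, 0.2, 0.05]   # toy audio samples
A = [[1.0, -1.0, 0.0], [0.5, 0.5, 0.5]]    # two shared filters
print(conv_layer(audio, A))
```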


Max pooling layer
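The slide is image-only; as a hedged sketch, a max pooling layer simply keeps the largest activation in each small window, summarizing the features computed by the convolutional layer and shrinking its output:

```python
def max_pool(values, window=2):
    """Non-overlapping max pooling over a 1D list of activations."""
    return [max(values[i:i + window]) for i in range(0, len(values), window)]

print(max_pool([0.1, 0.7, 0.3, 0.2, 0.9, 0.4]))  # [0.7, 0.3, 0.9]
```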


2D convolutional neural networks


Three-dimensional convolutional networks


Group of neurons: A

•  A bunch of neurons in parallel
•  All get the same inputs and compute different features


Network in Network (Lin et al., 2013)


Conv Nets breakthroughs in computer vision

•  Krizhevsky et al. (2012)


Different Levels of Abstraction


RECURRENT NEURAL NETWORKS

http://colah.github.io/


Recurrent Neural Networks (RNN) have loops

•  A loop allows information to be passed from one step of the network to the next.


Unroll RNN

•  Recurrent neural networks are intimately related to sequences and lists.
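As a minimal sketch of unrolling, the same cell (the same weights) is applied at every time step, with the hidden state carrying information forward; the tanh cell and the scalar weights below are illustrative.

```python
import math

def rnn_step(x, h, w_x, w_h, b):
    """One recurrent step: new hidden state from the current input and the previous state."""
    return math.tanh(w_x * x + w_h * h + b)

def unrolled_rnn(sequence, w_x=0.8, w_h=0.5, b=0.0):
    """Unrolling the loop: the same weights are reused at every time step."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

print(unrolled_rnn([1.0, 0.0, -1.0, 0.5]))
```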


Examples

•  Predict the last word in "the clouds are in the sky"
•  The gap between the relevant information and the place where it is needed is small
•  RNNs can learn to use the past information


•  “I grew up in France… I speak fluent French.”
•  As the gap grows, RNNs become unable to learn to connect the information.


LONG SHORT-TERM MEMORY NETWORKS

LSTM Networks


LSTM networks

•  A special kind of RNN
•  Capable of learning long-term dependencies
•  Structured as a chain of repeating neural network modules


RNN

•  The repeating module has a very simple structure, such as a single tanh layer


•  The tanh(z) function is a rescaled version of the sigmoid, and its output range is [−1, 1] instead of [0, 1].
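Concretely, writing σ for the sigmoid, the rescaling is:

```latex
\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} = 2\,\sigma(2z) - 1,
\qquad \sigma(z) = \frac{1}{1 + e^{-z}}
```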


LSTM networks

•  The repeating module consists of four neural network layers, interacting in a very special way


Core idea behind LSTMs

•  The key to LSTMs is the cell state, the horizontal line running through the top of the diagram.
•  The cell state runs straight down the entire chain, with only some minor linear interactions.
•  It is easy for information to just flow along it unchanged.


Gates

•  Gates give the ability to remove or add information to the cell state; they carefully regulate what passes through

•  A sigmoid layer outputs how much of each component should be let through
  –  Zero means let nothing through
  –  One means let everything through

•  An LSTM has three of these gates

LSTM step 1

•  Decide what information we're going to throw away from the cell state

•  The "forget gate" layer


LSTM step 2

•  Decide what new information we're going to store in the cell state

•  The "input gate" layer


LSTM step 3

•  Update the old cell state, Ct−1, into the new cell state Ct


LSTM step 4

•  Decide what we're going to output (a combined sketch of the four steps follows)
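Putting the four steps together, here is a hedged, scalar-sized sketch of one LSTM step (forget gate, input gate, cell-state update, output gate); real implementations use vectors and learned weight matrices, and all values below are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step with scalar state, following the four steps above.
    `w` holds (input weight, recurrent weight, bias) per gate."""
    def gate(name, activation):
        wx, wh, b = w[name]
        return activation(wx * x + wh * h_prev + b)

    f = gate("forget", sigmoid)             # step 1: what to throw away from the cell state
    i = gate("input", sigmoid)              # step 2: what new information to store
    c_tilde = gate("candidate", math.tanh)  #         candidate cell values
    c = f * c_prev + i * c_tilde            # step 3: update C_{t-1} into C_t
    o = gate("output", sigmoid)             # step 4: decide what to output
    h = o * math.tanh(c)                    #         output a filtered view of the cell state
    return h, c

weights = {"forget": (0.5, 0.5, 0.0), "input": (0.6, -0.4, 0.1),
           "candidate": (1.0, 0.8, 0.0), "output": (0.7, 0.2, -0.1)}
h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.3]:                  # run the chain over a short sequence
    h, c = lstm_step(x, h, c, weights)
print(h, c)
```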


RECURRENT NEURAL NETWORKS WITH WORD EMBEDDINGS


APPENDIX


Perceptron 1957


Perceptron 1957


Perceptron 1986


Perceptron


Activation function


Back propagation 1974/1986


•  Inspired by the architectural depth of the brain, researchers wanted for decades to train deep multi-layer neural networks.

•  No successful attempts were reported before 2006… Exception: convolutional neural networks (LeCun, 1998).

•  SVM: Vapnik and his co-workers developed the Support Vector Machine (1993), a shallow architecture.

•  Breakthrough in 2006!


2006 breakthrough

•  More data
•  Faster hardware: GPUs, multi-core CPUs
•  Working ideas on how to train deep architectures


•  Beat the state of the art in many areas:
  –  Language modeling (Mikolov et al., 2012)
  –  Image recognition (Krizhevsky won the 2012 ImageNet competition)
  –  Sentiment classification (Socher et al., 2011)
  –  Speech recognition (Dahl et al., 2010)
  –  MNIST hand-written digit recognition (Ciresan et al., 2010)


Credits

•  Roelof Pieters, www.graph-technologies.com
•  Andrew Ng
•  http://colah.github.io/
