Neural Networks - An Introduction · 2017-03-16 · Neural Networks - An Introduction Author:...

Neural Networks: An Introduction. Warith HARCHAOUI, MAP5, UMR 8145, Université Paris-Descartes, Sorbonne Paris Cité & Oscaro.com Research and Development. March 2017


Page 1: Neural Networks - An Introduction · 2017-03-16 · Neural Networks - An Introduction Author: Warith HARCHAOUI Subject: Neural Networks, Convolutional, Adversarial, Logistic Regression,



Outline

Supervised Classification and Regression
  Classification
  Regression

One Neuron
  for Regression
  for Classification

Gradient Descent
  Batch Gradient Descent
  Stochastic Gradient Descent

Several Neurons

Convolutional Neural Networks for Images

Adversarial Networks

Conclusion


Supervised Classification: The binary case

Given a training set that consists of:

- $x_i \in \mathbb{R}^D$
- $y_i \in \{0, 1\}$

for $i = 1, \dots, n$, find $F$ such that $F(x_i) \simeq y_i$.

Example: $x_i$ is an image; $y_i = 1$ corresponds to "cat" and $y_i = 0$ corresponds to "non-cat".


Supervised Classification: More than 2 classes

Given a training set that consists of:

- $x_i \in \mathbb{R}^D$
- $y_i \in \{0, 1\}^K$ (one-hot representation)

for $i = 1, \dots, n$, find $F$ such that $F(x_i) \simeq y_i$.

Example: $x_i$ is an image; $y_i = [1, 0, 0]$ corresponds to "cat", $y_i = [0, 1, 0]$ corresponds to "dog" and $y_i = [0, 0, 1]$ corresponds to "elephant".
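The one-hot representation above can be sketched in a few lines of Python. The helper name `one_hot` and the ordering of the class list are illustrative assumptions, not part of the slides.

```python
def one_hot(label, classes):
    """Return the one-hot vector of `label` within the ordered list `classes`."""
    if label not in classes:
        raise ValueError(f"unknown class: {label!r}")
    # Exactly one coordinate is 1: the one whose class matches the label.
    return [1 if c == label else 0 for c in classes]


# Class order is an assumption chosen to match the example on the slide.
classes = ["cat", "dog", "elephant"]
encoded = [one_hot(name, classes) for name in classes]
```

With this ordering, `one_hot("dog", classes)` gives the vector the slide associates with "dog".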


Regression

Given a training set that consists of:

- $x_i \in \mathbb{R}^D$
- $y_i \in \mathbb{R}^K$

for $i = 1, \dots, n$, find $F$ such that $F(x_i) \simeq y_i$.

Example: $x_i$ is a building; $y_i$ is the rent value of the building.


One Neuron: An Input-Output Machine

Figure: One Neuron (inputs $x_1, x_2, x_3$ weighted by $w_1, w_2, w_3$, activation $a$, output $y$)

$$y = a(w_1 x_1 + w_2 x_2 + w_3 x_3 + b)$$
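As a minimal sketch of the formula above, the neuron can be written as a weighted sum followed by an activation. The function names and the numeric values are illustrative assumptions; the slide does not fix a particular activation $a$.

```python
import math


def neuron(x, w, b, activation):
    """Compute y = a(w1*x1 + ... + wD*xD + b) for one input vector x."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    pre_activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return activation(pre_activation)


def identity(s):
    """Identity activation, which leaves the weighted sum unchanged."""
    return s


# Toy values chosen only for illustration.
x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.25]
b = 0.1
y_linear = neuron(x, w, b, identity)   # weighted sum only
y_tanh = neuron(x, w, b, math.tanh)    # same sum passed through tanh
```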


One Neuron for Regression: Least Mean Squares

Prediction:

$$F(x_i) = \hat{y}_i = W x_i + b$$

Loss:

$$L(W, b) = \frac{1}{n} \sum_{i=1}^{n} \| y_i - \hat{y}_i \|_2^2$$
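The prediction and loss above can be sketched in plain Python, assuming $W$ is stored as a list of $K$ rows of length $D$ and $b$ as a list of length $K$. The function names and the toy data are assumptions made for illustration.

```python
def predict(W, b, x):
    """Linear prediction y_hat = W x + b (W: K rows of length D, b: length K)."""
    return [sum(wkd * xd for wkd, xd in zip(row, x)) + bk
            for row, bk in zip(W, b)]


def mean_squared_loss(W, b, xs, ys):
    """L(W, b) = (1/n) * sum_i ||y_i - y_hat_i||_2^2."""
    total = 0.0
    for x, y in zip(xs, ys):
        y_hat = predict(W, b, x)
        total += sum((yk - yhk) ** 2 for yk, yhk in zip(y, y_hat))
    return total / len(xs)


# Toy data with D = K = 1: the model 2*x + 1 fits the first two points exactly
# and misses the third target by 1.
xs = [[0.0], [1.0], [2.0]]
ys = [[1.0], [3.0], [6.0]]
loss = mean_squared_loss([[2.0]], [1.0], xs, ys)
```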


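One Neuron for Binary Classification: Logistic Function

Prediction:

$$\mathrm{score}_i = w^\top x_i + b$$

$$P(y_i = 1) = p_i = \mathrm{Sigmoid}(\mathrm{score}_i) = \frac{1}{1 + \exp(-\mathrm{score}_i)}$$

Loss (negative log-likelihood):

$$\ell(w, b) = \prod_{i : y_i = 1} p_i \prod_{i : y_i = 0} (1 - p_i) = \prod_{i=1}^{n} p_i^{y_i} (1 - p_i)^{1 - y_i}$$

$$L(w, b) = -\frac{1}{n} \log(\ell(w, b)) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right]$$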
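A minimal sketch of the logistic prediction and loss above. The clipping constant `eps` and the split of the sigmoid into two branches are numerical-safety assumptions added here; they are not on the slide.

```python
import math


def sigmoid(score):
    """Sigmoid(score) = 1 / (1 + exp(-score)), written to avoid overflow."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


def logistic_loss(w, b, xs, ys, eps=1e-12):
    """L(w, b) = -(1/n) * sum_i [y_i log(p_i) + (1 - y_i) log(1 - p_i)]."""
    total = 0.0
    for x, y in zip(xs, ys):
        score = sum(wd * xd for wd, xd in zip(w, x)) + b
        # Clip p away from 0 and 1 so that log() stays finite (assumption).
        p = min(max(sigmoid(score), eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(xs)


# Toy data: one positive example at x = 1 and one negative example at x = -1.
xs = [[1.0], [-1.0]]
ys = [1, 0]
loss_uninformed = logistic_loss([0.0], 0.0, xs, ys)  # p_i = 0.5 everywhere
loss_separating = logistic_loss([5.0], 0.0, xs, ys)  # scores match the labels
```

With zero weights every $p_i$ equals 0.5, so the loss is $\log 2$; weights that separate the two examples give a smaller loss.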


One Neuron for Binary Classification: Logistic Function

Figure: The Sigmoid Function

$$\mathrm{Sigmoid}(a) = \frac{1}{1 + \exp(-a)}$$


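One Neuron for Classification of K > 2 classes: Softmax Function

Prediction:

$$\mathrm{score}_{i,k} = w_k^\top x_i + b_k$$

$$p_{i,k} = \mathrm{SoftMax}(\mathrm{score}_i)_k = \frac{\exp(\mathrm{score}_{i,k})}{\sum_{k'=1}^{K} \exp(\mathrm{score}_{i,k'})}$$

- $y_{i,k} = 1 \Leftrightarrow x_i$ belongs to the $k$th class
- $y_{i,k} = 0 \Leftrightarrow x_i$ does not belong to the $k$th class

Loss (negative log-likelihood):

$$\ell(W, b) = \prod_{i=1}^{n} \prod_{k=1}^{K} p_{i,k}^{y_{i,k}}$$

$$L(W, b) = -\frac{1}{n} \log(\ell(W, b)) = -\frac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{K} y_{i,k} \log(p_{i,k})$$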
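The softmax prediction and its loss can be sketched as follows. Subtracting the maximum score before exponentiating and clipping with `eps` are numerical-safety assumptions added here; the function names and toy data are illustrative.

```python
import math


def softmax(scores):
    """p_k = exp(score_k) / sum_k' exp(score_k'), shifted by max for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_loss(W, b, xs, ys, eps=1e-12):
    """L(W, b) = -(1/n) * sum_i sum_k y_ik log(p_ik), with one-hot rows y_i."""
    total = 0.0
    for x, y in zip(xs, ys):
        scores = [sum(wkd * xd for wkd, xd in zip(row, x)) + bk
                  for row, bk in zip(W, b)]
        p = softmax(scores)
        # max(pk, eps) keeps log() finite if a probability underflows (assumption).
        total += sum(yk * math.log(max(pk, eps)) for yk, pk in zip(y, p))
    return -total / len(xs)


# Toy problem with K = 3 classes and D = 2 features; all-zero parameters
# give the uniform prediction p_ik = 1/3.
W_zero = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
b_zero = [0.0, 0.0, 0.0]
xs = [[1.0, 0.0], [0.0, 1.0]]
ys = [[1, 0, 0], [0, 1, 0]]
loss_uniform = cross_entropy_loss(W_zero, b_zero, xs, ys)
```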


Batch Gradient Descent: The common problem

Loss function:

$$L(W, b) = \frac{1}{n} \sum_{i=1}^{n} L_i(W, b)$$

Problem:

$$\min_{W, b} L(W, b)$$


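Batch Gradient Descent: A Universal Learning Procedure

$$\min_{w} \frac{1}{n} \sum_{i=1}^{n} L_i(w)$$

1. Choose a random $w$ and a constant $\alpha > 0$

2. Iterate: $w_{\text{new}} = w_{\text{old}} - \alpha \nabla L(w_{\text{old}})$, where

$$\nabla L(w_{\text{old}}) = \frac{1}{n} \sum_{i=1}^{n} \nabla L_i(w_{\text{old}})$$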
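The two steps above can be sketched on a toy one-dimensional least-squares problem. The per-example loss $L_i(w) = (w x_i - y_i)^2$, the data, the step size and the iteration count are all assumptions chosen so that the example converges; they are not from the slides.

```python
import random


def batch_gradient_descent(grad_i, n, w0, alpha, num_iterations):
    """Repeat w_new = w_old - alpha * (1/n) * sum_i grad L_i(w_old)."""
    w = w0
    for _ in range(num_iterations):
        # Every iteration uses all n per-example gradients.
        full_gradient = sum(grad_i(i, w) for i in range(n)) / n
        w = w - alpha * full_gradient
    return w


# Toy data generated by y = 2 * x, so the minimiser is w = 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]


def grad_i(i, w):
    """Gradient of L_i(w) = (w * x_i - y_i)^2 with respect to w."""
    return 2.0 * xs[i] * (w * xs[i] - ys[i])


rng = random.Random(0)            # fixed seed so the run is repeatable
w_start = rng.uniform(-1.0, 1.0)  # step 1: random initial w
w_batch = batch_gradient_descent(grad_i, len(xs), w_start,
                                 alpha=0.05, num_iterations=200)
```

On this problem each iteration shrinks the distance to $w = 2$ by a constant factor, so `w_batch` ends up very close to 2. A larger $\alpha$ could make the iteration diverge.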


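Stochastic Gradient Descent: A Universal Learning Procedure

$$\min_{w} \frac{1}{n} \sum_{i=1}^{n} L_i(w)$$

1. Choose a random $w$ and a constant $\alpha > 0$

2. Iterate:

2.1 Choose a random subset $J \subset \{1, \dots, n\} \subset \mathbb{N}$ (sometimes reduced to a singleton)

2.2 $w_{\text{new}} = w_{\text{old}} - \dfrac{\alpha}{|J|} \sum_{j \in J} \nabla L_j(w_{\text{old}})$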
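A minimal sketch of the stochastic variant above, on a toy one-dimensional least-squares problem. The loss $L_i(w) = (w x_i - y_i)^2$, the data, the mini-batch size and the step size are assumptions for illustration; examples are indexed from 0 rather than 1, as is usual in Python.

```python
import random


def stochastic_gradient_descent(grad_i, n, w0, alpha, batch_size,
                                num_iterations, rng):
    """Repeat w_new = w_old - (alpha / |J|) * sum_{j in J} grad L_j(w_old)."""
    w = w0
    for _ in range(num_iterations):
        # Step 2.1: draw a random subset J of the example indices.
        J = rng.sample(range(n), batch_size)
        # Step 2.2: average the gradients over J only.
        minibatch_gradient = sum(grad_i(j, w) for j in J) / len(J)
        w = w - alpha * minibatch_gradient
    return w


# Toy data generated by y = 2 * x, so the minimiser is w = 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]


def grad_i(i, w):
    """Gradient of L_i(w) = (w * x_i - y_i)^2 with respect to w."""
    return 2.0 * xs[i] * (w * xs[i] - ys[i])


rng = random.Random(0)            # fixed seed so the run is repeatable
w_start = rng.uniform(-1.0, 1.0)
w_sgd = stochastic_gradient_descent(grad_i, len(xs), w_start, alpha=0.05,
                                    batch_size=1, num_iterations=500, rng=rng)
```

Because this toy data is noise-free, every per-example gradient vanishes at $w = 2$ and the iterates settle there. With noisy data and a constant $\alpha$, the iterates would typically keep fluctuating around the minimiser.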


Several Neurons: The Power of Back-Propagation

Figure: A Multi-Layer-Perceptron (input layer with four components of $x_i$, one hidden layer, output layer producing $p_i$)


Several Neurons: The Power of Back-Propagation

Back-propagation is just the chain rule applied repeatedly through a composition of many functions:

$$(F \circ G)' = (F' \circ G) \times G'$$

NB: $(F \circ G)(x) = F(G(x))$
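To make the chain rule concrete, here is a sketch for a single sigmoid neuron with a squared-error loss, checked against finite differences. The choice of loss, the variable names and the numeric values are assumptions made for illustration.

```python
import math


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def loss(w, b, x, y):
    """Composition: s = w*x + b, then p = sigmoid(s), then (p - y)^2."""
    return (sigmoid(w * x + b) - y) ** 2


def loss_gradient(w, b, x, y):
    """Gradient of the loss with respect to (w, b), by the chain rule."""
    # Forward pass: keep the intermediate values.
    s = w * x + b
    p = sigmoid(s)
    # Backward pass: multiply local derivatives, outermost function first.
    dloss_dp = 2.0 * (p - y)     # derivative of (p - y)^2 with respect to p
    dp_ds = p * (1.0 - p)        # derivative of the sigmoid with respect to s
    dloss_ds = dloss_dp * dp_ds
    return dloss_ds * x, dloss_ds  # ds/dw = x and ds/db = 1


w, b, x, y = 0.3, -0.1, 1.5, 1.0
grad_w, grad_b = loss_gradient(w, b, x, y)

# Central finite differences as an independent check.
h = 1e-6
numeric_w = (loss(w + h, b, x, y) - loss(w - h, b, x, y)) / (2 * h)
numeric_b = (loss(w, b + h, x, y) - loss(w, b - h, x, y)) / (2 * h)
```

The analytic and numeric gradients agree up to finite-difference error, which is a common sanity check when writing a backward pass by hand.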


Three Remarks

1. Non-linearity: Sigmoid, SoftMax, ReLU

$$\mathrm{ReLU}(x) = \max(x, 0)$$

2. Automatic differentiation thanks to: Theano, Torch, Caffe, TensorFlow, PyTorch

3. GPU Acceleration
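The ReLU non-linearity from the first remark is short enough to write out directly. The derivative helper, and the convention of using 0 as the derivative at $x = 0$, are assumptions added for illustration.

```python
def relu(x):
    """ReLU(x) = max(x, 0)."""
    return max(x, 0.0)


def relu_derivative(x):
    """Derivative of ReLU; the value at x = 0 is set to 0 by convention."""
    return 1.0 if x > 0 else 0.0


# Applied element-wise to a vector of pre-activations.
pre_activations = [-2.0, -0.5, 0.0, 0.5, 2.0]
activations = [relu(v) for v in pre_activations]
```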


Convolutional Neural Networks for Images: Convolutions

- $x$: a pixel
- $I$: an image in gray levels
- $K$: a kernel (a filter)
- $I * K$: convolution of image $I$ by filter $K$
- $\mathrm{nonLinearity}(I * K)$: element-wise non-linearity applied to the convolution result, producing a feature map

$$(I * K)(x) = \sum_{y \in \mathrm{Supp}(K)} I(x - y) K(y)$$

The same neuron, with weights $K$, is applied many times (once per pixel of $I$), producing a new image called a feature map.
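The convolution formula above can be sketched for a gray-level image stored as a list of rows. Three choices here are assumptions, not from the slides: the kernel has odd side lengths with its support centred on offset (0, 0), pixels outside the image count as 0, and ReLU is the non-linearity.

```python
def convolve2d(image, kernel):
    """(I * K)(x) = sum over offsets y in Supp(K) of I(x - y) * K(y)."""
    height, width = len(image), len(image[0])
    rh, rw = len(kernel) // 2, len(kernel[0]) // 2  # kernel half-sizes
    output = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            total = 0.0
            for dy in range(-rh, rh + 1):
                for dx in range(-rw, rw + 1):
                    ii, jj = i - dy, j - dx  # the pixel x - y
                    if 0 <= ii < height and 0 <= jj < width:
                        total += image[ii][jj] * kernel[dy + rh][dx + rw]
            output[i][j] = total
    return output


def feature_map(image, kernel):
    """Element-wise non-linearity (here ReLU) applied to I * K."""
    return [[max(v, 0.0) for v in row] for row in convolve2d(image, kernel)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
# Kernel whose only non-zero weight sits at offset (0, +1): because the
# formula reads I(x - y), each output pixel copies its left-hand neighbour.
shift_right = [[0, 0, 0],
               [0, 0, 1],
               [0, 0, 0]]
shifted = convolve2d(image, shift_right)
```

The output has one value per pixel of the input, matching the remark that the same weights are applied once per pixel.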


Convolutional Neural Networks for Images: Convolutions

Figure: LeNet architecture


Adversarial Networks: A Desired Network

Figure: Scheme for a Desired Network (input $x$ or $z$, Generator, output $y$)


Adversarial Networks: Binary Classification Networks

Figure: Scheme for Binary Classification Networks (examples $y$ fed to the Discriminator, output $p$)


Adversarial Networks: The full system

Figure: Scheme for Adversarial Networks (input $x$ or $z$, Generator, output $y$ fed to the Discriminator, output $p$)


Adversarial Networks: A New Kind of Loss

- $G$: Generator (e.g. of images) from random noise or a real image
- $D$: Discriminator that distinguishes fake examples from real examples

$$\min_{w_D} \max_{w_G} L$$

Figure: Adversarial Networks Example
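The slide does not spell out $L$. One plausible reading, stated here as an assumption, is that $L$ is the binary classification loss from the logistic-regression slide, with real examples labelled 1 and generated examples labelled 0. With $n$ real examples $y_i$ and $m$ noise inputs $z_j$ (both counts are notation introduced here), that reading gives:

```latex
% Assumed form of L: cross-entropy of the discriminator D,
% where D(.) plays the role of p_i, real examples have label 1
% and generated examples G(z_j) have label 0.
L(w_D, w_G) =
  - \frac{1}{n} \sum_{i=1}^{n} \log D(y_i)
  - \frac{1}{m} \sum_{j=1}^{m} \log\bigl(1 - D(G(z_j))\bigr)
```

Under this reading the discriminator adjusts $w_D$ to lower $L$ (classify correctly), while the generator adjusts $w_G$ to raise it (fool the discriminator), which matches the roles of the min and the max on the slide.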


Conclusion: A Great Book

Figure: The Deep Learning Book