Apprentissage, réseaux de neurones et modèles graphiques (RCP209) - Neural Networks and Deep Learning

Transcript of the slide deck "Convolutional Neural Nets (ConvNets)" (64 slides), cedric.cnam.fr/vertigo/Cours/ml2/docs/coursDeep2.pdf, 2018-05-03.

Page 1:

Apprentissage, réseaux de neurones et modèles graphiques (RCP209) - Neural Networks and Deep Learning

Convolutional Neural Nets (ConvNets)

Nicolas Thome ([email protected])

http://cedric.cnam.fr/vertigo/Cours/ml2/

Département Informatique, Conservatoire National des Arts et Métiers (Cnam)

Page 2:

Outline

1 Limitations of Fully Connected Networks

2 Convolution

3 Pooling

4 Deep Convolutional Neural Nets

Page 3:

Limitations of Fully Connected Networks

Credit: M.A. Ranzato

• Scalability issue with Fully Connected Networks: parameter explosion even for a single hidden layer!

Page 4:

Limitations of Fully Connected Networks

• Signal data: importance of local structure

1D signals: local temporal structure
2D signal data: local spatial structure

Page 5:

Limitations of Fully Connected Networks

• BUT: fully connected networks see a vectorial representation of the input, so the ordering of the dimensions is arbitrary!

Page 6:

Limitations of Fully Connected Networks

• MNIST example: same performance on the initial and on pixel-permuted images! However, local information is obviously useful.

Page 7:

Limitations of Fully Connected Networks

• Prior knowledge on the data structure ⇒ useful

• Example: MLP training for shape recognition (rectangle, triangle, diamond, star) from color images

Page 8:

Limitations of Fully Connected Networks

• Invariance & stability
• Expectations:

Small deformation ⇒ similar representations
Large deformation ⇒ dissimilar representations

• Translation invariance difficult with Fully Connected Networks ∼ local scale, rotation, deformations, etc.

Page 9:

Convolutional Neural Networks

Overcome most of the aforementioned limitations:
• Significantly limit the number of free parameters
• Explicitly focus on the local structure of the signal
• Able to gain invariance to local deformations
• All parameters remain trainable with error back-propagation

Page 10:

Outline

1 Limitations of Fully Connected Networks

2 Convolution

3 Pooling

4 Deep Convolutional Neural Nets

Page 11:

Convolution in 1D (Signal)

• Discrete 1D convolution with a Finite Impulse Response (FIR) filter h, of size d (odd)

• Input signal f(i), i ∈ {1;N}
• Output signal f′(i), i ∈ {1;N}
• Convolution: operator T : f → f′ = T[f] = f ⋆ h

$$f'(i) = (f \star h)(i) = \sum_{n=-\frac{d-1}{2}}^{\frac{d-1}{2}} f(i-n)\, h(n)$$
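A minimal numpy sketch of this formula (border handling via zero-padding is an assumption, the slide only defines the interior):

import numpy as np

def conv1d(f, h):
    """Discrete 1D convolution with a FIR filter h of odd size d (zero-padded borders)."""
    d = len(h)
    r = (d - 1) // 2
    fp = np.pad(f, r)                      # zero-padding so the output keeps size N
    out = np.zeros_like(f, dtype=float)
    for i in range(len(f)):
        # f'(i) = sum_n f(i - n) h(n), n in [-r, r]
        for n in range(-r, r + 1):
            out[i] += fp[i + r - n] * h[n + r]
    return out

f = np.array([0., 0., 1., 2., 3., 2., 1., 0., 0.])
h = np.array([1., 2., 1.]) / 4.            # simple smoothing filter
print(conv1d(f, h))                        # matches np.convolve(f, h, mode='same')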

Page 12:

Convolution in 2D (Images)

• Discrete 2D convolution with FIR filter h (size d odd), T : f → f′ = T[f] = f ⋆ h:

$$f'(i,j) = (f \star h)(i,j) = \sum_{n=-\frac{d-1}{2}}^{\frac{d-1}{2}} \; \sum_{m=-\frac{d-1}{2}}^{\frac{d-1}{2}} f(i-n, j-m)\, h(n,m)$$

• Example with a 3 × 3 filter:

$$f'(i,j) = w_1 f(i-1,j-1) + w_2 f(i-1,j) + w_3 f(i-1,j+1) + w_4 f(i,j-1) + w_5 f(i,j) + w_6 f(i,j+1) + w_7 f(i+1,j-1) + w_8 f(i+1,j) + w_9 f(i+1,j+1)$$

• Convolution processing:
1 Apply central symmetry to the filter: h(n,m) ⇒ h(−n,−m) = g(n,m)
2 ∀(i,j), compute the weighted sum between the image values around f(i,j) and the filter coefficients g(n,m)

$$h = \begin{pmatrix} w_9 & w_8 & w_7 \\ w_6 & w_5 & w_4 \\ w_3 & w_2 & w_1 \end{pmatrix} \qquad g = \begin{pmatrix} w_1 & w_2 & w_3 \\ w_4 & w_5 & w_6 \\ w_7 & w_8 & w_9 \end{pmatrix}$$
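The two-step procedure can be sketched in a few lines of numpy (zero-padding at the borders is an assumption):

import numpy as np

def conv2d(f, h):
    """2D convolution: flip h by central symmetry, then slide a weighted sum (zero-padded)."""
    d = h.shape[0]                      # odd filter size
    r = (d - 1) // 2
    g = h[::-1, ::-1]                   # step 1: central symmetry h(n,m) -> h(-n,-m) = g(n,m)
    fp = np.pad(f, r)
    out = np.zeros_like(f, dtype=float)
    for i in range(f.shape[0]):
        for j in range(f.shape[1]):
            # step 2: weighted sum between the image patch around (i,j) and g
            out[i, j] = np.sum(fp[i:i + d, j:j + d] * g)
    return out

f = np.random.rand(5, 5)
h = np.random.rand(3, 3)
print(conv2d(f, h).shape)               # (5, 5): same-size output map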

Page 13:

2D Convolution vs Cross-Correlation

• 2D Convolution: $f'(i,j) = (f \star h)(i,j) = \sum_n \sum_m f(i-n, j-m)\, h(n,m)$

• Cross-Correlation: $f'(i,j) = (f \otimes h)(i,j) = \sum_n \sum_m f(i+n, j+m)\, h(n,m)$

Cross-Correlation ∼ Convolution without symmetrizing the mask!

$$h = \begin{pmatrix} -4 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 4 \end{pmatrix} \;\Rightarrow\; g = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -4 \end{pmatrix}$$
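A quick numerical check of this relation, assuming scipy is available: convolving with h equals cross-correlating with the symmetrized mask g.

import numpy as np
from scipy.signal import convolve2d, correlate2d

f = np.random.rand(6, 6)
h = np.array([[-4., 0., 0.],
              [ 0., 0., 0.],
              [ 0., 0., 4.]])
g = h[::-1, ::-1]                           # central symmetry of the mask

conv = convolve2d(f, h, mode='same')        # f * h
xcorr = correlate2d(f, g, mode='same')      # cross-correlation with g
print(np.allclose(conv, xcorr))             # True: convolution = cross-correlation with flipped mask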

Page 14:

2D Convolution / Cross-Correlation: Interpretation

• Cross-Correlation: ∀(i,j), dot product between the image region and the filter h
Large f′(i,j) ⇒ filter and region aligned

• Input: 2d image ⇒ output: 2d map

Page 15:

2D Convolution / Cross-Correlation: Example

• Cross-Correlation: large values in the output map ⇔ locations in the input image similar to the mask

Page 16:

2D Convolution / Cross-Correlation: Real Image Example

Credit: K. Matsui

Page 17:

Strided Convolution

$$f'(i,j) = (f \star h)(i,j) = \sum_{n=-\frac{d-1}{2}}^{\frac{d-1}{2}} \; \sum_{m=-\frac{d-1}{2}}^{\frac{d-1}{2}} f(i-n, j-m)\, h(n,m)$$

• Standard convolution: stride 1 ⇒ compute f′(i,j) for (i,j) ∈ {1;N} × {1;M}
• Strided convolution: compute f′(i,j) only for i ∈ {1, 1+s, 1+2s, ..., N} (idem for j)
• Example: s = 2, N = M = 5, d = 3 ⇒ reduced map size (3 × 3)
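A sketch of the strided case, reusing a stride-1 'same' convolution and sub-sampling its output on a stride-s grid (the padding convention is an assumption, chosen so the example below gives a 3 × 3 map):

import numpy as np
from scipy.signal import convolve2d

def strided_conv2d(f, h, s):
    """Strided convolution: evaluate f*h only every s-th position (same-size padding assumed)."""
    full = convolve2d(f, h, mode='same')   # stride-1 convolution, output N x M
    return full[::s, ::s]                  # keep i in {0, s, 2s, ...}: reduced map

f = np.random.rand(5, 5)                   # N = M = 5
h = np.ones((3, 3)) / 9.                   # d = 3
print(strided_conv2d(f, h, s=2).shape)     # (3, 3): reduced map size, as in the example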

Page 18:

Convolution: Example for Gradient Computation

Ix ≈ I ⋆Mx

• Gradient: $\vec{G}(x,y) = \left( \frac{\partial I}{\partial x} \;\; \frac{\partial I}{\partial y} \right)^T = \left( I_x \;\; I_y \right)^T$

• Convolution approximation: $I_x \approx I \star M_x$, $I_y \approx I \star M_y$

$$M_x = \frac{1}{4} \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \qquad M_y = \frac{1}{4} \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$$

Page 19:

Convolution with Multiple Filters: Edge Detection

Figure: Ix ∼ filter 1, Iy ∼ filter 2, Ie = Ix² + Iy², Ie,t

Ie,t: thresholded Ie ⇒ edge detector!
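Putting the two gradient filters together, a hedged sketch of the thresholded edge detector (the toy image and the threshold value are illustrative choices):

import numpy as np
from scipy.signal import convolve2d

Mx = np.array([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]) / 4.
My = np.array([[-1., -2., -1.], [0., 0., 0.], [1., 2., 1.]]) / 4.

I = np.zeros((32, 32)); I[:, 16:] = 1.     # toy image: vertical step edge

Ix = convolve2d(I, Mx, mode='same')        # horizontal gradient, filter 1
Iy = convolve2d(I, My, mode='same')        # vertical gradient, filter 2
Ie = Ix**2 + Iy**2                         # squared gradient magnitude
Ie_t = (Ie > 0.1).astype(float)            # thresholded map => edge detector
print(int(Ie_t.sum()))                     # mostly along the vertical edge (plus zero-padding artifacts at the borders)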

Page 20:

Convolution: Linear Filtering

• Convolution can be viewed as multiplication by a matrix
• 1D case: univariate Toeplitz matrix

• 2D case: doubly block circulant matrix
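A small check of the matrix view in the 1D case above, assuming a 'full' convolution so that the Toeplitz matrix is rectangular:

import numpy as np

def conv_matrix(h, n):
    """Toeplitz matrix T such that T @ f == np.convolve(f, h) for signals f of length n."""
    d = len(h)
    T = np.zeros((n + d - 1, n))
    for j in range(n):
        T[j:j + d, j] = h            # each column is a shifted copy of the filter
    return T

f = np.array([1., 2., 3., 4.])
h = np.array([1., 0., -1.])
T = conv_matrix(h, len(f))
print(np.allclose(T @ f, np.convolve(f, h)))   # True: convolution = multiplication by a Toeplitz matrix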

Page 21:

Convolution vs Fully Connected Layers

Convolution: overcome fully connected network limitations
1 Local connection ⇒ drastic reduction in the number of parameters
a) Sparse connectivity: each hidden unit is only connected to a local patch

Credit: M.A. Ranzato

Page 22:

Convolution vs Fully Connected Layers

Convolution: overcome fully connected network limitations
1 Local connection

b) Weight sharing: the same feature is detected across all image positions

Credit: M.A. Ranzato

• Convolution: the number of parameters is independent of the input image size, unlike fully connected layers!
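A back-of-the-envelope comparison illustrating this point (the layer sizes are arbitrary illustrative choices):

# Fully connected: one weight per (input pixel, hidden unit) pair
H, W, C = 224, 224, 3          # input image size (illustrative)
n_hidden = 1000
fc_params = H * W * C * n_hidden            # ~150 million weights, grows with image size

# Convolutional: one 7x7x3 filter shared across all positions, 64 filters
k, n_filters = 7, 64
conv_params = (k * k * C + 1) * n_filters   # ~9.5k weights (incl. biases), independent of H and W

print(fc_params, conv_params)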

Page 23:

Translation-Invariant Feature Detection

• Convolution, weight sharing: the same feature is detected across all image positions

• Very relevant prior for object classification / scene recognition

Page 24:

Convolution vs Fully Connected Layers

Convolution: overcome fully connected network limitations
2 Convolution: local spatial structure

Analyses shape/appearance in a local neighborhood
Permuting the input images ⇒ very different local info
⇒ Different classification performances

Page 25:

Convolution vs Fully Connected Layers

Convolution: overcome fully connected network limitations
3 Convolution: equivariance property

Equivariance: f equivariant to g ⇔ f[g(x)] = g[f(x)]
Convolution equivariant to translation:

T [x(t − τ)] = x(t − τ) ⋆ h(t) = (x ⋆ h)(t − τ) = y(t − τ)
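A numerical illustration of this property, using a circular convolution so the identity holds exactly (with zero-padded borders it only holds away from the edges):

import numpy as np

def circ_conv(x, h):
    """Circular 1D convolution: y(t) = sum_n x(t - n) h(n), indices taken modulo len(x)."""
    N = len(x)
    return np.array([sum(x[(t - n) % N] * h[n] for n in range(len(h))) for t in range(N)])

x = np.array([0., 1., 3., 2., 0., 0., 0., 0.])
h = np.array([1., -1., 0.])
tau = 2
shift = lambda s, t: np.roll(s, t)          # g: translation by tau samples

print(np.allclose(circ_conv(shift(x, tau), h),      # T[g(x)]
                  shift(circ_conv(x, h), tau)))     # g(T[x])  -> True: equivariance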

Page 26:

Convolution vs Fully Connected Layers

Convolution: overcome fully connected network limitations

3 Convolution: translation equivariance
Ensures that the deformation, i.e. translation, is encoded in the maps
Local translation invariance: local pooling ⇒ next!

Credit: G. Hinton

Page 27:

Convolution and Non-Linearity

Figure: source image I, I ⋆ Mx, |I ⋆ Mx|, (I ⋆ Mx)²

• Convolution, linear operation for each feature map

Gradient: $I_x \approx I \star M_x$, with $M_x = \frac{1}{4} \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}$

• Convolution + point-wise non-linearity: feature detection
Ex: σ(z) = z², σ(z) = |z| ⇒ activate for both large positive and large negative Ix values

Page 28:

Convolution and Non-Linearity

Figure: source image I, I ⋆ Mx, Sigmoid, ReLU

• Other non-linearities: only activate for Ix > 0
Sigmoid (with bias): σ(z) = (1 + e^(−a(z−b)))^(−1), a = 8 ⋅ 10⁻², b = 50
ReLU (see later): σ(z) = max(0, z)

Page 29:

Outline

1 Limitations of Fully Connected Networks

2 Convolution

3 Pooling

4 Deep Convolutional Neural Nets

Page 30:

Pooling

• Pooling: statistical aggregation of a set of values, e.g. x = {x1, ..., xN}
• Output: a single scalar value. Possible pooling functions:

Max pooling: $\mathrm{pool}(x) = \max_{i \in \{1;N\}} x_i$

Average pooling: $\mathrm{pool}(x) = \frac{1}{N} \sum_{i=1}^{N} x_i$

Example: max = 8, avg = 4.8
• Goal: capture the statistics of the responses
Invariance wrt the position of the values
Permuting the values ⇒ same features
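These two pooling functions in a few lines, with a check that permuting the values leaves the pooled output unchanged (the values are illustrative, chosen to reproduce max = 8 and avg = 4.8; the actual figure values are not visible in this transcript):

import numpy as np

x = np.array([8., 2., 6., 1., 7.])          # example set of responses (illustrative)
print(x.max(), x.mean())                    # max pooling -> 8.0, average pooling -> 4.8

perm = np.random.permutation(x)             # permute the values ...
print(perm.max() == x.max(), perm.mean() == x.mean())   # ... pooled outputs are unchanged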

Page 31:

ℓp Pooling

• ℓp pooling: $\mathrm{pool}(x) = \left( \frac{1}{N} \sum_{i=1}^{N} x_i^p \right)^{\frac{1}{p}}$

• Smooth transition from average to max (wrt p)
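A sketch of ℓp pooling on non-negative responses, showing the smooth transition from the average (p = 1) towards the max as p grows:

import numpy as np

def lp_pool(x, p):
    """lp pooling: ((1/N) * sum_i x_i^p)^(1/p), for non-negative responses x."""
    return (np.mean(np.power(x, p))) ** (1.0 / p)

x = np.array([8., 2., 6., 1., 7.])
for p in [1, 2, 4, 16, 64]:
    print(p, round(lp_pool(x, p), 3))       # p=1: average (4.8); large p: tends to the max (8)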

Page 32:

Pooling in Convolution Feature Maps

• Spatial pooling: aggregation over image (map) regions
• Pooling input: a map (image); output: a map
• Local aggregation ⇒ local pooling receptive field
• Key pooling parameters:

Pooling function
Local pooling size
Stride between two pooling areas
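A minimal sketch of spatial pooling over a feature map, parameterized by the pooling function, the pooling size and the stride (no padding; windows that do not fit are dropped):

import numpy as np

def spatial_pool(fmap, size, stride, fn=np.max):
    """Pool a 2D map with `size` x `size` windows, moved by `stride`, using `fn` (max or mean)."""
    H, W = fmap.shape
    out_h = (H - size) // stride + 1
    out_w = (W - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = fn(fmap[i * stride:i * stride + size, j * stride:j * stride + size])
    return out

fmap = np.random.rand(8, 8)
print(spatial_pool(fmap, size=2, stride=2).shape)              # (4, 4): max pooling, downsampled map
print(spatial_pool(fmap, size=2, stride=2, fn=np.mean).shape)  # average pooling variant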

Page 33:

Spatial Max Pooling

Credit: K. Matsui

• Ex: max pooling with a 5 × 5 pooling area

• Binary input: pooling ⇒ presence / absence of the feature in the local pooling area

• (Partial) translation invariance ⇒ later

Page 34:

Spatial Average Pooling

Credit: K. Matsui

• Ex: average pooling with a 5 × 5 pooling area

• Binary input: pooling ∼ counts the number of present features in the local pooling area

Page 35:

Spatial Pooling: Stride

• Stride s: step at which the pooling areas are centered
• s > 1: decreases the spatial resolution ⇒ fewer parameters in deep models ∼ downsampling

Credit: M. Antony

Page 36:

Spatial Pooling: from Equivariance to Invariance

• Recap: convolution equivariant to translation:

f[g(x)] = g[f(x)]

f convolution, g translation
Credit: G. Hinton

Page 37:

Max Pooling & Translation Invariance

• Under some conditions, max pooling ⇒ translation invariance:

f ′ [g(x)] = f ′(x)

f′ = p ○ f, with f convolution, p pooling

Page 38:

Max Pooling & Translation Invariance

• Translation invariance wrt a vector T⃗ = (tx, ty)ᵀ if:

T⃗ ⇏ a new largest element at the pooling region edge
T⃗ ⇏ removal of the max from the pooling region

• Ex: 5 × 5 conv map, 3 × 3 max pooling centered at 15: max = 15

• Invariance OK: ∀ translation (tx, ty) ∈ ±1 px ⇒ max = 15

$$C = \begin{bmatrix} 11 & -5 & 1 & -2 & 0 \\ 1 & 3 & 0 & 0 & 5 \\ 8 & 4 & 15 & -10 & 4 \\ 8 & 6 & 5 & 3 & 7 \\ 3 & 0 & -2 & 9 & 3 \end{bmatrix}$$

Page 39:

Max Pooling & Translation Invariance

• Translation invariance wrt a vector T⃗ = (tx, ty)ᵀ if:

T⃗ ⇏ a new largest element at the pooling region edge
T⃗ ⇏ removal of the max from the pooling region

• Ex: 5 × 5 conv map, 3 × 3 max pooling centered at 15: max = 15

• Invariance KO: right translation tx = +1 px ⇒ max = 7

$$C = \begin{bmatrix} 11 & -5 & 1 & -2 & 0 \\ 1 & 3 & 0 & 0 & 5 \\ 8 & 15 & 4 & -10 & 4 \\ 8 & 6 & 5 & 3 & 7 \\ 3 & 0 & -2 & 9 & 3 \end{bmatrix}$$
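A sketch checking the two configurations above in code; the translation convention (content shifted by (tx, ty), with wrap-around via np.roll) is an assumption of this sketch:

import numpy as np

C_ok = np.array([[11, -5,  1, -2, 0],      # 15 at the centre of the 3x3 pooling region
                 [ 1,  3,  0,  0, 5],
                 [ 8,  4, 15,-10, 4],
                 [ 8,  6,  5,  3, 7],
                 [ 3,  0, -2,  9, 3]])
C_ko = np.array([[11, -5,  1, -2, 0],      # 15 at the edge of the 3x3 pooling region
                 [ 1,  3,  0,  0, 5],
                 [ 8, 15,  4,-10, 4],
                 [ 8,  6,  5,  3, 7],
                 [ 3,  0, -2,  9, 3]])

def pooled_max(C, tx, ty):
    """Max over the fixed 3x3 region (rows/cols 1..3) after translating the map by (tx, ty)."""
    shifted = np.roll(np.roll(C, ty, axis=0), tx, axis=1)
    return int(shifted[1:4, 1:4].max())

shifts = [(tx, ty) for tx in (-1, 0, 1) for ty in (-1, 0, 1)]
print({pooled_max(C_ok, tx, ty) for tx, ty in shifts})   # {15}: invariant to all 1-px translations
print({pooled_max(C_ko, tx, ty) for tx, ty in shifts})   # contains 7: the max left the pooling region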

Page 40:

Max Pooling & Translation Invariance

• Max pooling: partial translation invariance (under some conditions)
At least local stability: every value in the bottom row changed, but only half of the values in the top row changed ⇒ the distance after pooling decreases

From [Goodfellow et al., 2016]

Page 41:

Pooling: Conclusion

• Reduces the spatial feature map size (stride)
• Partial translation invariance and stability

• Convolution on tensors (color images / hierarchies)? ⇒ following!

Page 42:

Outline

1 Limitations of Fully Connected Networks

2 Convolution

3 Pooling

4 Deep Convolutional Neural Nets

Page 43:

Convolution Layer

• 2D convolution: each filter ⇒ a 2D map (image)
• Convolution Layer: stacking the maps from multiple filters ⇒ Tensor: a multi-dimensional array

Page 44:

Convolution Layer

• Tensor: stacking several filter outputs
Depth ⇔ # filters
Each spatial position: output for the different filters

• Ex: 2D convolution on gray-scale images
Input tensor depth = 1

• Convolution on color images / hierarchies: convolution on tensors!
Input tensor ⇒ output tensor

Page 45:

Convolution Layer for Tensors

$$f'(i,j) = (f \star h)(i,j) = \sum_{k=1}^{K} \; \sum_{n=-\frac{d-1}{2}}^{\frac{d-1}{2}} \; \sum_{m=-\frac{d-1}{2}}^{\frac{d-1}{2}} f(i-n, j-m, k)\, h(n,m,k) + b$$

• Convolution: linear, bias b ⇒ affine

• Filtering on depth: correlation between feature maps
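A sketch of one output map of such a layer: a single filter spanning the full input depth K, plus a bias (zero-padding is an assumption):

import numpy as np

def conv_tensor(f, h, b=0.0):
    """f: H x W x K input tensor, h: d x d x K filter -> one H x W output map (zero-padded)."""
    H, W, K = f.shape
    d = h.shape[0]
    r = (d - 1) // 2
    g = h[::-1, ::-1, :]                       # flip spatially (not along depth)
    fp = np.pad(f, ((r, r), (r, r), (0, 0)))
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(fp[i:i + d, j:j + d, :] * g) + b   # sum over n, m and depth k
    return out

rgb = np.random.rand(16, 16, 3)                # e.g. a color image, depth K = 3
filt = np.random.rand(3, 3, 3)
print(conv_tensor(rgb, filt, b=0.1).shape)     # (16, 16): one 2D map per filter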

Page 46:

Convolution Layer for Tensors

Ex: input color image

Page 47:

Convolution Layer for Tensors

Natural extension for multiple filters

Page 48:

Convolution Layer for Tensors

Ex: input color image

Page 49:

Specific Tensor Convolution Filters

• Input tensor size W × H × D
• Filter size = W × H × D = tensor size, no padding ⇒ no possible displacement for the filter

Output: a single scalar value
Use of K filters ⇒ output: a K-dim vector

• Convolution ∼ fully connected layer on the flattened tensor
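A quick check of this equivalence, using the cross-correlation (dot-product) form that convolution layers compute in practice:

import numpy as np

X = np.random.rand(4, 4, 3)                 # input tensor W x H x D
W_filt = np.random.rand(4, 4, 3)            # filter of the same size: no displacement possible
b = 0.5

conv_out = np.sum(X * W_filt) + b           # "convolution": a single scalar output
fc_out = W_filt.ravel() @ X.ravel() + b     # fully connected neuron on the flattened tensor
print(np.isclose(conv_out, fc_out))         # True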

Page 50:

Convolution Layer: Non-Linearity

• Convolutional Layer: Input Tensor → Output Tensor

1 Convolution: linear / affine filtering
2 Followed by a point-wise non-linearity

∼ non-linearity on the spatial maps

Page 51:

Convolution Layer: Non-Linearity

• Each activation in the tensor map ⇔ a formal neuron
• Ex: sigmoid activation:

σ(z) = (1 + e^(−az))^(−1)

Page 52:

Convolution Hierarchies

• Convolution Layer: affine filtering + non-linear activation
• Convolution Hierarchies: stacking Convolution Layers
• Motivation: depth increases modeling capacity

Non-linearity crucial: hierarchical model ≠ flat model! (∼ fully connected networks)

Page 53:

Convolution Hierarchies: Receptive Field

• Cascading two 3 × 3 convolutions: same receptive field as a 5 × 5 convolution in the input image

• Convolution Hierarchies:
Feature combination
Gradual increase of the spatial receptive field ⇒ indirect global connectivity
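The receptive-field growth can be tracked with the usual recursion RF ← RF + (k − 1) · jump, jump ← jump · stride; a sketch (the recursion is standard, the layer lists are illustrative):

def receptive_field(layers):
    """layers: list of (kernel_size, stride). Returns receptive field size in the input image."""
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump      # each new layer sees (k-1) extra steps of the current grid
        jump *= s                 # distance (in input pixels) between two adjacent units
    return rf

print(receptive_field([(3, 1), (3, 1)]))          # two stacked 3x3 convs -> 5, same as one 5x5 conv
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # three 3x3 convs -> 7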

Page 54:

Convolution Hierarchies: Example

Edge detection with a convolution hierarchy (pyramid) ⇒ two layers:
• Input: gray-scale image I ⇒ W × H × 1 tensor

Figure: Ix² ∼ filter 1, Iy² ∼ filter 2

1 1st layer: convolution with two filters

$$M_x = \frac{1}{4} \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \qquad M_y = \frac{1}{4} \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$$

followed by the non-linearity σ(z) = z²

⇒ Output: W × H × 2 tensor, H1 ∼ (Ix)², H2 ∼ (Iy)²

Page 55:

Convolution Hierarchies: Example

Edge detection with a convolution hierarchy (pyramid): two layers

Figure: Ix², Iy², output

2 2nd layer: convolution with one 1 × 1 filter [1 1]
For each pixel: (Ix)² + (Iy)² = ||G⃗(x,y)||² = ||∇I||²
σ(z) = Step(z − T), with T a threshold on ||G⃗(x,y)||²

⇒ Output: W × H × 1 tensor ⇒ edge detector
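The whole two-layer pyramid as a sketch: layer 1 = two 3 × 3 convolutions + σ(z) = z², layer 2 = a 1 × 1 convolution with weights [1 1] over the depth + a step non-linearity (the toy image and threshold T are illustrative):

import numpy as np
from scipy.signal import convolve2d

Mx = np.array([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]) / 4.
My = np.array([[-1., -2., -1.], [0., 0., 0.], [1., 2., 1.]]) / 4.

I = np.zeros((32, 32)); I[:, 16:] = 1.                 # toy W x H x 1 input

# Layer 1: two filters + sigma(z) = z^2  ->  W x H x 2 tensor
H1 = convolve2d(I, Mx, mode='same') ** 2               # ~ Ix^2
H2 = convolve2d(I, My, mode='same') ** 2               # ~ Iy^2
layer1 = np.stack([H1, H2], axis=-1)

# Layer 2: 1x1 convolution with weights [1, 1] over depth, then a step non-linearity
T = 0.1                                                # threshold on ||grad I||^2 (illustrative)
grad_sq = layer1 @ np.array([1., 1.])                  # Ix^2 + Iy^2 at each pixel
edges = (grad_sq > T).astype(float)                    # final edge map
print(edges.shape)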

Page 56:

Pooling in Convolution Layer

Where to pool in a convolution tensor?
• Most common choice: pool in each feature map independently ⇒ spatial pooling on top of the convolution Layer

Input / Output: a tensor of depth D
Output has a smaller spatial size (pooling stride)

Page 57:

Convolution / Pooling Layer

• Pooling on top of the convolution Layer
• An elementary block: Convolution Layer + Pooling [Conv-Pool]

Page 58:

Convolution / Pooling Layer

• An elementary block:

[Convolution + Non-linearity] (= Convolution Layer) + Pooling

Page 59:

Convolutional Neural Networks (ConvNets)

• Stack several Convolution / Pooling blocks ⇒ Convolutional Neural Network (ConvNet)

• Ex: 7 × 7 convolution, 2 × 2 pooling area, stride s = 2

• Input image 46 × 46, 1st [Conv-Pool] Layer:
Conv output = 40
Pooling output = 20
Receptive field size for each pooled unit?
⇒ Pooling ↑ receptive field
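The sizes quoted above can be reproduced with the usual output-size formula (no padding for the convolution, non-overlapping pooling), and the receptive field of a pooled unit with the same recursion as before:

def conv_out(n, k, s=1):   # no padding: floor((n - k) / s) + 1
    return (n - k) // s + 1

n = 46
n = conv_out(n, k=7)            # 7x7 convolution: 46 -> 40
print(n)
n = conv_out(n, k=2, s=2)       # 2x2 pooling, stride 2: 40 -> 20
print(n)

# Receptive field of one pooled unit: a 2x2 block of conv outputs, each seeing 7x7, offset by 1 px
rf, jump = 1, 1
for k, s in [(7, 1), (2, 2)]:
    rf += (k - 1) * jump
    jump *= s
print(rf)                       # 8: pooling increases the receptive field (7 -> 8)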

Page 60:

Convolutional Neural Networks (ConvNets)

• ConvNets: hierarchical tensor modification
• At some (depth) point, often flattening the input tensor ⇒ vector

Page 61:

Convolutional Neural Networks (ConvNets)

• ConvNet prediction: 2-stage process:
1 Representation learning with the [Conv-Pool] hierarchy:

Conv detects relevant features
Pool gains spatial invariance for classification

Page 62:

Convolutional Neural Networks (ConvNets)

• ConvNet prediction: 2-stage process:
2 Classification: the tensor is flattened ⇒ vector

Flattening: neuron position in the initial tensor ⇒ breaking translation invariance
Followed by a hierarchy of fully connected layers

Page 63:

Conclusion

• ConvNet: hierarchical [Conv-Pool] + fully connected
Architecture of famous historical nets, e.g. LeNet, and of more recent ones, e.g. AlexNet (2012)

• Deep Learning history? ⇒ next course!

Page 64:

References I

[Goodfellow et al., 2016] Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press. http://www.deeplearningbook.org