Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks...

90
Lecture 9 - Convolutional Neural Networks I2DL: Prof. Niessner, Prof. Leal-Taixé 1

Transcript of Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks...

Page 1: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Lecture 9 -Convolutional

Neural Networks

I2DL: Prof. Niessner, Prof. Leal-Taixé 1

Page 2: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Fully Connected Neural Network

Depth

Wid

th

I2DL: Prof. Niessner, Prof. Leal-Taixé 2

Wid

thW

idth

Page 3: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Problems using FC Layers on Images

• How to process a tiny image with FC layers

I2DL: Prof. Niessner, Prof. Leal-Taixé 3

3 neuron layer35

5

5 weights

Page 4: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Problems using FC Layers on Images

• How to process a tiny image with FC layers

I2DL: Prof. Niessner, Prof. Leal-Taixé 4

25 weights

For the whole 5×5image on 1channel

3 neuron layer35

5

Page 5: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Problems using FC Layers on Images

• How to process a tiny image with FC layers

I2DL: Prof. Niessner, Prof. Leal-Taixé 5

75 weights

For the whole 5×5image on the 3channel

3 neuron layer35

5

Page 6: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Problems using FC Layers on Images

• How to process a tiny image with FC layers

I2DL: Prof. Niessner, Prof. Leal-Taixé 6

75 weights For the whole 5×5 image on the three channels per neuron

75 weights

75 weights

3 neuron layer35

5

Page 7: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Problems using FC Layers on Images

• How to process a normal image with FC layers

I2DL: Prof. Niessner, Prof. Leal-Taixé 7

3

1000

1000

3 neuron layer

Page 8: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Problems using FC Layers on Images

• How to process a normal image with FC layers

I2DL: Prof. Niessner, Prof. Leal-Taixé 8

3 𝑏𝑖𝑙𝑙𝑖𝑜𝑛 weights

3

1000

1000

1000 neuron layer

IMPRACTIC

AL

Page 9: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Why not simply more FC Layers?

We cannot make networks arbitrarily complex

• Why not just go deeper and get better?– No structure!!– It is just brute force!

– Optimization becomes hard– Performance plateaus / drops!

I2DL: Prof. Niessner, Prof. Leal-Taixé 9

Page 10: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Better Way than FC ?

• We want to restrict the degrees of freedom– We want a layer with structure– Weight sharing à using the same weights for different

parts of the image

I2DL: Prof. Niessner, Prof. Leal-Taixé 10

Page 11: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Using CNNs in Computer Vision

I2DL: Prof. Niessner, Prof. Leal-Taixé 11[Li et al., CS231n Course Slides] Lecture 12: Detection and Segmentation

Page 12: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions

I2DL: Prof. Niessner, Prof. Leal-Taixé 12

Page 13: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

𝑓 ∗ 𝑔 = )!"

"

𝑓 𝜏 𝑔 𝑡 − 𝜏 𝑑𝜏

𝑓 = red𝑔 = blue

𝑓 ∗ 𝑔 = green

Convolution of two box functions Convolution of two Gaussians

Application of a filter to a function— The ‘smaller’ one is typically called the filter kernel

I2DL: Prof. Niessner, Prof. Leal-Taixé 13

Page 14: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

4 3 2 -5 3 5 2 5 5 6𝑓

1/3 1/3 1/3𝑔

‘Slide’ filter kernel from left to right; at each position,compute a single value in the output data

I2DL: Prof. Niessner, Prof. Leal-Taixé 14

Discrete case: box filter

Page 15: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

4 3 2 -5 3 5 2 5 5 6

1/3 1/3 1/3

3

𝑓

𝑔

𝑓 ∗ 𝑔

Discrete case: box filter

4 ⋅13 + 3 ⋅

13 + 2 ⋅

13 = 3

I2DL: Prof. Niessner, Prof. Leal-Taixé 15

Page 16: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

4 3 2 -5 3 5 2 5 5 6

1/3 1/3 1/3

3 0

𝑓

𝑔

𝑓 ∗ 𝑔

I2DL: Prof. Niessner, Prof. Leal-Taixé 16

Discrete case: box filter

3 ⋅13 + 2 ⋅

13 + (−5) ⋅

13 = 0

Page 17: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

4 3 2 -5 3 5 2 5 5 6

1/3 1/3 1/3

3 0 0

𝑓

𝑔

𝑓 ∗ 𝑔

I2DL: Prof. Niessner, Prof. Leal-Taixé 17

Discrete case: box filter

2 ⋅13 + (−5) ⋅

13 + 3 ⋅

13 = 0

Page 18: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

4 3 2 -5 3 5 2 5 5 6

1/3 1/3 1/3

3 0 0 1

𝑓

𝑔

𝑓 ∗ 𝑔

I2DL: Prof. Niessner, Prof. Leal-Taixé 18

Discrete case: box filter

−5 ⋅13 + 3 ⋅

13 + 5 ⋅

13 = 1

Page 19: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

4 3 2 -5 3 5 2 5 5 6

1/3 1/3 1/3

3 0 0 1 10/3

𝑓

𝑔

𝑓 ∗ 𝑔

I2DL: Prof. Niessner, Prof. Leal-Taixé 19

Discrete case: box filter

3 ⋅13 + 5 ⋅

13 + 2 ⋅

13 =

103

Page 20: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

4 3 2 -5 3 5 2 5 5 6

1/3 1/3 1/3

3 0 0 1 10/3 4

𝑓

𝑔

𝑓 ∗ 𝑔

I2DL: Prof. Niessner, Prof. Leal-Taixé 20

Discrete case: box filter

5 ⋅13 + 2 ⋅

13 + 5 ⋅

13 = 4

Page 21: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

4 3 2 -5 3 5 2 5 5 6

1/3 1/3 1/3

3 0 0 1 10/3 4 4

𝑓

𝑔

𝑓 ∗ 𝑔

I2DL: Prof. Niessner, Prof. Leal-Taixé 21

Discrete case: box filter

2 ⋅13 + 5 ⋅

13 + 5 ⋅

13 = 4

Page 22: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

4 3 2 -5 3 5 2 5 5 6

1/3 1/3 1/3

3 0 0 1 10/3 4 4 16/3

𝑓

𝑔

𝑓 ∗ 𝑔

I2DL: Prof. Niessner, Prof. Leal-Taixé 22

Discrete case: box filter

5 ⋅13 + 5 ⋅

13 + 6 ⋅

13 =

163

Page 23: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

4 3 2 -5 3 5 2 5 5 6

?? 3 0 0 1 10/3 4 4 16/3 ??

1/3 1/3 1/3

What to do at boundaries?

I2DL: Prof. Niessner, Prof. Leal-Taixé 23

Discrete case: box filter

Page 24: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

4 3 2 -5 3 5 2 5 5 6

?? 3 0 0 1 10/3 4 4 16/3 ??

1/3 1/3 1/3

3 0 0 1 10/3 4 4 16/3

Option 1: Shrink

I2DL: Prof. Niessner, Prof. Leal-Taixé 24

Discrete case: box filter

What to do at boundaries?

Page 25: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

What are Convolutions?

4 3 2 -5 3 5 2 5 5 6

?? 3 0 0 1 10/3 4 4 16/3 ??

1/3 1/3 1/3

Option 2: Pad (often 0’s)

7/3 3 0 0 1 10/3 4 4 16/3 11/3

I2DL: Prof. Niessner, Prof. Leal-Taixé 25

0 0

0 ⋅13 + 4 ⋅

13 + 3 ⋅

13 =

73

Discrete case: box filter

What to do at boundaries?

Page 26: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions on Images

-5 3 2 -5 3

4 3 2 1 -3

1 0 3 3 5

-2 0 1 4 4

5 6 7 9 -1

0 -1 0

-1 5 -1

0 -1 0

Imag

e 5×5

Ker

nel 3×3

6

Ou

tpu

t 3×3

5 ⋅ 3 + −1 ⋅ 3 + −1 ⋅ 2 + −1 ⋅ 0 + −1 ⋅ 4= 15 − 9 = 6

I2DL: Prof. Niessner, Prof. Leal-Taixé 26

Page 27: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions on Images

-5 3 2 -5 3

4 3 2 1 -3

1 0 3 3 5

-2 0 1 4 4

5 6 7 9 -1

0 -1 0

-1 5 -1

0 -1 0

6 1

5 ⋅ 2 + −1 ⋅ 2 + −1 ⋅ 1 + −1 ⋅ 3 + −1 ⋅ 3= 10 − 9 = 1

I2DL: Prof. Niessner, Prof. Leal-Taixé 27

Imag

e 5×5

Ker

nel 3×3

Ou

tpu

t 3×3

Page 28: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions on Images

-5 3 2 -5 3

4 3 2 1 -3

1 0 3 3 5

-2 0 1 4 4

5 6 7 9 -1

0 -1 0

-1 5 -1

0 -1 0

6 1 8

5 ⋅ 1 + −1 ⋅ −5 + −1 ⋅ −3 + −1 ⋅ 3+ −1 ⋅ 2= 5 + 3 = 8

I2DL: Prof. Niessner, Prof. Leal-Taixé 28

Imag

e 5×5

Ker

nel 3×3

Ou

tpu

t 3×3

Page 29: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions on Images

-5 3 2 -5 3

4 3 2 1 -3

1 0 3 3 5

-2 0 1 4 4

5 6 7 9 -1

0 -1 0

-1 5 -1

0 -1 0

6 1 8

-7

5 ⋅ 0 + −1 ⋅ 3 + −1 ⋅ 0 + −1 ⋅ 1 + −1 ⋅ 3= 0 − 7 = −7

I2DL: Prof. Niessner, Prof. Leal-Taixé 29

Imag

e 5×5

Ker

nel 3×3

Ou

tpu

t 3×3

Page 30: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions on Images

-5 3 2 -5 3

4 3 2 1 -3

1 0 3 3 5

-2 0 1 4 4

5 6 7 9 -1

0 -1 0

-1 5 -1

0 -1 0

6 1 8

-7 9

5 ⋅ 3 + −1 ⋅ 2 + −1 ⋅ 3 + −1 ⋅ 1 + −1 ⋅ 0= 15 − 6 = 9

I2DL: Prof. Niessner, Prof. Leal-Taixé 30

Imag

e 5×5

Ker

nel 3×3

Ou

tpu

t 3×3

Page 31: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions on Images

-5 3 2 -5 3

4 3 2 1 -3

1 0 3 3 5

-2 0 1 4 4

5 6 7 9 -1

0 -1 0

-1 5 -1

0 -1 0

6 1 8

-7 9 2

5 ⋅ 3 + −1 ⋅ 1 + −1 ⋅ 5 + −1 ⋅ 4 + −1 ⋅ 3= 15 − 13 = 2

I2DL: Prof. Niessner, Prof. Leal-Taixé 31

Imag

e 5×5

Ker

nel 3×3

Ou

tpu

t 3×3

Page 32: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions on Images

-5 3 2 -5 3

4 3 2 1 -3

1 0 3 3 5

-2 0 1 4 4

5 6 7 9 -1

0 -1 0

-1 5 -1

0 -1 0

6 1 8

-7 9 2

-5

5 ⋅ 0 + −1 ⋅ 0 + −1 ⋅ 1 + −1 ⋅ 6+ −1 ⋅ −2= −5

I2DL: Prof. Niessner, Prof. Leal-Taixé 32

Imag

e 5×5

Ker

nel 3×3

Ou

tpu

t 3×3

Page 33: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions on Images

-5 3 2 -5 3

4 3 2 1 -3

1 0 3 3 5

-2 0 1 4 4

5 6 7 9 -1

0 -1 0

-1 5 -1

0 -1 0

6 1 8

-7 9 2

-5 -9

5 ⋅ 1 + −1 ⋅ 3 + −1 ⋅ 4 + −1 ⋅ 7 + −1 ⋅ 0= 5 − 14 = −9

I2DL: Prof. Niessner, Prof. Leal-Taixé 33

Imag

e 5×5

Ker

nel 3×3

Ou

tpu

t 3×3

Page 34: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions on Images

-5 3 2 -5 3

4 3 2 1 -3

1 0 3 3 5

-2 0 1 4 4

5 6 7 9 -1

0 -1 0

-1 5 -1

0 -1 0

6 1 8

-7 9 2

-5 -9 3

5 ⋅ 4 + −1 ⋅ 3 + −1 ⋅ 4 + −1 ⋅ 9 + −1 ⋅ 1= 20 − 17 = 3

I2DL: Prof. Niessner, Prof. Leal-Taixé 34

Imag

e 5×5

Ker

nel 3×3

Ou

tpu

t 3×3

Page 35: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Image Filters

• Each kernel gives us a different image filter

I2DL: Prof. Niessner, Prof. Leal-Taixé 35

InputEdge detection−1 −1 −1−1 8 −1−1 −1 −1

Sharpen0 −1 0−1 5 −10 −1 0

Box mean19

1 1 11 1 11 1 1

Gaussian blur116

1 2 12 4 21 2 1

Page 36: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions on RGB Images

Images have depth: e.g. RGB -> 3 channels

Convolve filter with image i.e., ‘slide’ over it and:− apply filter at each location− dot products

I2DL: Prof. Niessner, Prof. Leal-Taixé 36

width height depth

filter 5×5×3

3 5

5

Depth dimension *must* match;i.e., filter extends the full depth of the input

32

323

image 32×32×3

Page 37: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions on RGB Images

𝑧# = 𝒘$𝒙# + 𝑏

5×5×3 ×1 (5×5×3)×1 1I2DL: Prof. Niessner, Prof. Leal-Taixé 37

32×32×3 image (pixels 𝑿)

1 number at a time:equal to dot product betweenfilter weights 𝒘 and 𝒙𝒊 − 𝑡ℎ chunk of the image. Here: 5 ⋅ 5 ⋅ 3 = 75-dim dot product + bias

32

323

3 5

5

𝑧#

5×5×3 filter (weights vector 𝒘)

Page 38: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions on RGB Images

Activation map(also feature map)

Slide over all spatial locations 𝑥#and compute all output 𝑧# ;w/o padding, there are 28×28 locations

I2DL: Prof. Niessner, Prof. Leal-Taixé 38

Convolve

1

28

2832

323

3 5

5

32×32×3 image

5×5×3 filter

Page 39: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layer

I2DL: Prof. Niessner, Prof. Leal-Taixé 39

Page 40: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layer

Let’s apply a different filterwith different weights!

I2DL: Prof. Niessner, Prof. Leal-Taixé 40

Convolve

5×5×3 filter

32

323

3 5

5

32×32×3 image

128

28

1

Activation maps

Page 41: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layer

I2DL: Prof. Niessner, Prof. Leal-Taixé 41

Let’s apply **five** filters,each with different weights!

32

323

32×32×3 image

Convolution “Layer”

Convolve

528

28

Activation maps

Page 42: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layer

• A basic layer is defined by– Filter width and height (depth is implicitly given)– Number of different filter banks (#weight sets)

• Each filter captures a different image characteristic

I2DL: Prof. Niessner, Prof. Leal-Taixé 42

Page 43: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Different Filters

I2DL: Prof. Niessner, Prof. Leal-Taixé 43

• Each filter captures different image characteristics:– Horizontal edges– Vertical edges

– Circles– Squares– …

[Zeiler & Fergus, ECCV’14] Visualizing and Understanding Convolutional Networks

Page 44: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Dimensions of a Convolution Layer

I2DL: Prof. Niessner, Prof. Leal-Taixé 44

Page 45: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: DimensionsInput: Filter: Output:

7×73×35×5

I2DL: Prof. Niessner, Prof. Leal-Taixé 45

1

Imag

e 7×7

Page 46: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: Dimensions

I2DL: Prof. Niessner, Prof. Leal-Taixé 46

2 Input: Filter:Output:

7×73×35×5

Imag

e 7×7

Page 47: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: Dimensions

I2DL: Prof. Niessner, Prof. Leal-Taixé 47

3 Input: Filter: Output:

7×73×35×5

Imag

e 7×7

Page 48: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: Dimensions

I2DL: Prof. Niessner, Prof. Leal-Taixé 48

4 Input: Filter: Output:

7×73×35×5

Imag

e 7×7

Page 49: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: Dimensions

I2DL: Prof. Niessner, Prof. Leal-Taixé 49

5 Input: Filter: Output:

7×73×35×5

Imag

e 7×7

Page 50: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: StrideIm

age 7×7

With a stride of 1

Stride of 𝑆: apply filterevery 𝑆-th spatial location;i.e. subsample the image

I2DL: Prof. Niessner, Prof. Leal-Taixé 50

Input: Filter: Stride:Output:

7×73×315×5

Page 51: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: StrideWith a stride of 2

I2DL: Prof. Niessner, Prof. Leal-Taixé 51

Input: Filter: Stride:Output:

7×73×323×3

Imag

e 7×7

Page 52: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: StrideWith a stride of 2

I2DL: Prof. Niessner, Prof. Leal-Taixé 52

Imag

e 7×7

Input: Filter: Stride:Output:

7×73×323×3

Page 53: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: StrideWith a stride of 2

I2DL: Prof. Niessner, Prof. Leal-Taixé 53

Imag

e 7×7

Input: Filter: Stride:Output:

7×73×323×3

Page 54: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: StrideWith a stride of 3

I2DL: Prof. Niessner, Prof. Leal-Taixé 54

Imag

e 7×7

Input: Filter: Stride:Output:

7×73×33

? × ?

Page 55: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: StrideWith a stride of 3

I2DL: Prof. Niessner, Prof. Leal-Taixé 55

Imag

e 7×7

Input: Filter: Stride:Output:

7×73×33

? × ?

Page 56: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: StrideWith a stride of 3

Does not really fit (remainder left)à Illegal stride for input & filter size!

I2DL: Prof. Niessner, Prof. Leal-Taixé 56

Imag

e 7×7

Input: Filter:Stride:Output:

7×73×33

? × ?

Page 57: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: DimensionsIn

pu

t he

igh

t of 𝑵

Input width of 𝑵

𝑁 = 7, 𝐹 = 3, 𝑆 = 1:"#$%+ 1 = 5

𝑁 = 7, 𝐹 = 3, 𝑆 = 2:"#$&+ 1 = 3

𝑁 = 7, 𝐹 = 3, 𝑆 = 3:"#$$+ 1 = 2. <3

Fractions are illegal

Filter width of 𝑭F

ilte

r h

eig

ht o

f 𝑭

I2DL: Prof. Niessner, Prof. Leal-Taixé 57

Input: Filter:Stride:Output:

𝑁×𝑁F×𝐹𝑆

'#()+ 1 × '#(

)+ 1

Page 58: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: Dimensions

Shrinking down so quickly (32à28à24à20) is typically not a good idea…

I2DL: Prof. Niessner, Prof. Leal-Taixé 58

32

323

28

285

24

248

Conv +ReLU

Conv +ReLU

Conv +ReLU

12

5 filters5×5×3

8 filters5×5×5

12 filters5×5×8

Input Image

20

Page 59: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: Padding

I2DL: Prof. Niessner, Prof. Leal-Taixé 59

Why padding?

• Sizes get small too quickly• Corner pixel is only used

once

Page 60: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: Padding

0 0 0 0 0 0 0 0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0 0 0 0 0 0 0 0

Imag

e 7×7

+ ze

ro p

add

ing

I2DL: Prof. Niessner, Prof. Leal-Taixé 60

Why padding?

• Sizes get small too quickly• Corner pixel is only used

once

Page 61: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Input (𝑁×𝑁): Filter (𝐹×𝐹): Padding (𝑃): Stride (𝑆): Output

7×73×3117×7

Convolution Layers: Padding

Most common is ‘zero’ padding

Output Size:

'*&⋅,#()

+ 1 × '*&⋅,#()

+ 1

I2DL: Prof. Niessner, Prof. Leal-Taixé 61

denotes the floor operator (as in practice an integer division is performed)

0 0 0 0 0 0 0 0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0 0 0 0 0 0 0 0

Imag

e 7×7

+ ze

ro p

add

ing

Page 62: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: Padding

0 0 0 0 0 0 0 0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

0 0 0 0 0 0 0 0 0 Set padding to 𝑃 = (#%&

I2DL: Prof. Niessner, Prof. Leal-Taixé 62

Types of convolutions:

• Valid convolution: using no padding

• Same convolution: output=input size

Imag

e 7×7

+ ze

ro p

add

ing

Page 63: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolution Layers: Dimensions

Example

I2DL: Prof. Niessner, Prof. Leal-Taixé 63

Input image: 32×32×310 filters 5×5Stride 1Pad 2 Depth of 3 is implicitly given

3Output size is:32 + 2 ⋅ 2 − 5

1+ 1 = 32

i.e. 32×32×10

10 Gilters5×5×3

32

32

3

Remember

Output: '*&⋅,#(

)+ 1 × '*&⋅,#(

)+ 1

Page 64: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Output size is:32 + 2 ⋅ 2 − 5

1+ 1 = 32

i.e. 32×32×10

Input image: 32×32×310 filters 5×5Stride 1Pad 2

Convolution Layers: DimensionsExample

I2DL: Prof. Niessner, Prof. Leal-Taixé 64

Remember

Output: '*&⋅,#(

)+ 1 × '*&⋅,#(

)+ 1

10 Gilters5×5×3

32

32

3

Page 65: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Number of parameters (weights):Each filter has 5×5×3 + 1 = 76 params (+1 for bias)-> 76 ⋅ 10 = 760 parameters in layer

Input image: 32×32×310 filters 5×5Stride 1Pad 2

Convolution Layers: DimensionsExample

I2DL: Prof. Niessner, Prof. Leal-Taixé 65

10 Gilters5×5×3

32

32

3

Page 66: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Example• You are given a convolutional layer with 4 filters,

kernel size 5, stride 1, and no padding that operates on an RGB image.

• Q1: What are the dimensions and the shape of its weight tensor?

pA1: (3, 4, 5, 5)pA2: (4, 5, 5)pA3: depends on the width and height of the image

I2DL: Prof. Niessner, Prof. Leal-Taixé 66

Page 67: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Example• You are given a convolutional layer with 4 filters,

kernel size 5, stride 1, and no padding that operates on an RGB image.

• Q1: What are the dimensions and the shape of its weight tensor?

pA1: (3, 4, 5, 5)pA2: (4, 5, 5)pA3: depends on the width and height of the image

I2DL: Prof. Niessner, Prof. Leal-Taixé 67

Page 68: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Example• You are given a convolutional layer with 4 filters,

kernel size 5, stride 1, and no padding that operates on an RGB image.

• Q1: What are the dimensions and the shape of its weight tensor?

I2DL: Prof. Niessner, Prof. Leal-Taixé 68

Input channels (RGB = 3)

Output size = 4 filters

Filter size = 5×5A1: (3, 4, 5, 5)

Page 69: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutional Neural Network

(CNN)

I2DL: Prof. Niessner, Prof. Leal-Taixé 69

Page 70: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

CNN Prototype

ConvNet is concatenation of Conv Layers and activations

32

323

28

285

24

248

Conv +ReLU

Conv +ReLU

Conv +ReLU

12

5 filters5×5×3

8 filters5×5×5

12 filters5×5×8

Input Image

20

I2DL: Prof. Niessner, Prof. Leal-Taixé 70

Page 71: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

CNN Learned Filters

I2DL: Prof. Niessner, Prof. Leal-Taixé 71

[Zeiler & Fergus, ECCV’14] Visualizing and Understanding Convolutional Networks

Page 72: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Pooling

I2DL: Prof. Niessner, Prof. Leal-Taixé 72

Page 73: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Pooling Layer

I2DL: Prof. Niessner, Prof. Leal-Taixé 73[Li et al., CS231n Course Slides] Lecture 5: Convolutional Neural Networks

Page 74: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Pooling Layer: Max Pooling

3 1 3 5

6 0 7 9

3 2 1 4

0 2 4 3

6 9

3 4

Single depth slice of input

Max pool with2×2 filters and stride 2

‘Pooled’ output

I2DL: Prof. Niessner, Prof. Leal-Taixé 74

Page 75: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Pooling Layer

• Conv Layer = ‘Feature Extraction’– Computes a feature in a given region

• Pooling Layer = ‘Feature Selection’– Picks the strongest activation in a region

I2DL: Prof. Niessner, Prof. Leal-Taixé 75

Page 76: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Pooling Layer• Input is a volume of size 𝑊!"×𝐻!"×𝐷!"• Two hyperparameters

– Spatial filter extent 𝐹– Stride 𝑆

• Output volume is of size 𝑊#$%×𝐻#$%×𝐷#$%– 𝑊#$% =

&%&'()

+ 1

– 𝐻#$% =*%&'()

+ 1

– 𝐷#$% = 𝐷!"• Does not contain parameters; e.g. it’s fixed function

Filter count 𝐾 and padding 𝑃make no sense here

I2DL: Prof. Niessner, Prof. Leal-Taixé 76

Page 77: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Pooling Layer• Input is a volume of size 𝑊!"×𝐻!"×𝐷!"• Two hyperparameters

– Spatial filter extent 𝐹– Stride 𝑆

• Output volume is of size 𝑊#$%×𝐻#$%×𝐷#$%– 𝑊#$% =

&%&'()

+ 1

– 𝐻#$% =*%&'()

+ 1

– 𝐷#$% = 𝐷!"• Does not contain parameters; e.g. it’s fixed function

Common settings:𝐹 = 2, S = 2𝐹 = 3, 𝑆 = 2

I2DL: Prof. Niessner, Prof. Leal-Taixé 77

Page 78: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

• Typically used deeper in the network

Pooling Layer: Average Pooling

3 1 3 5

6 0 7 9

3 2 1 4

0 2 4 3

2.5 6

1.75 3

Single depth slice of input

Average pool with2×2 filters and stride 2

‘Pooled’ output

I2DL: Prof. Niessner, Prof. Leal-Taixé 78

Page 79: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

CNN Prototype

I2DL: Prof. Niessner, Prof. Leal-Taixé 79

[Li et al., CS231n Course Slides]Lecture 5: Convolutional Neural Networks

Page 80: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Final Fully-Connected Layer

• Same as what we had in ‘ordinary’ neural networks – Make the final decision with the extracted features from

the convolutions– One or two FC layers typically

I2DL: Prof. Niessner, Prof. Leal-Taixé 80

Page 81: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Convolutions vs Fully-Connected• In contrast to fully-connected layers, we want to

restrict the degrees of freedom– FC is somewhat brute force– Convolutions are structured

• Sliding window to with the same filter parameters to extract image features – Concept of weight sharing– Extract same features independent of location

I2DL: Prof. Niessner, Prof. Leal-Taixé 81

Page 82: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Receptive field

I2DL: Prof. Niessner, Prof. Leal-Taixé 82

Page 83: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Receptive Field

• Spatial extent of the connectivity of a convolutional filter

83

5x5 input

3x3 filter 3x3 output

=

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 84: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Receptive Field

• Spatial extent of the connectivity of a convolutional filter

84

5x5 input

3x3 filter 3x3 output

=

3x3 receptive field = 1 output pixel is connected to 9 input pixels

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 85: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Receptive Field

• Spatial extent of the connectivity of a convolutional filter

85

7x7 input

3x3 output

3x3 receptive field = 1 output pixel is connected to 9 input pixels

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 86: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Receptive Field

• Spatial extent of the connectivity of a convolutional filter

86

7x7 input

3x3 output

3x3 receptive field = 1 output pixel is connected to 9 input pixels

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 87: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Receptive Field

• Spatial extent of the connectivity of a convolutional filter

87

7x7 input

3x3 output

3x3 receptive field = 1 output pixel is connected to 9 input pixels

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 88: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

Receptive Field

• Spatial extent of the connectivity of a convolutional filter

88

3x3 output

5x5 receptive field on the original input: one output value is connected to 25 input pixels

7x7 input

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 89: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

See you next time!

I2DL: Prof. Niessner, Prof. Leal-Taixé 89

Page 90: Convolutional Neural Networks Part 1 - GitHub Pages · 2020-06-16 · Convolutional Neural Networks — Part 1 I2DL: Prof. Niessner, Dr. Dai 1. Fully Connected Neural Network Depth

References

• Goodfellow et al. “Deep Learning” (2016),– Chapter 9: Convolutional Networks

• http://cs231n.github.io/convolutional-networks/

I2DL: Prof. Niessner, Prof. Leal-Taixé 90