Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to...

149
Deep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks to Cees Snoek, Laurens van der Maaten & Arnold Smeulders who all contributed to the slides

Transcript of Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to...

Page 1: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Deep Learning Intro

Max Welling

Special thanks to Efstratios Gavveswhose slides I am using and who has helped with the practicals

Also thanks to Cees Snoek, Laurens van der Maaten & Arnold Smeulders who all contributed to the slides

Page 2: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSEEFSTRATIOS GAVVES

INTRODUCTION TO DEEP LEARNING AND NEURAL NETWORKS - 2

Deep Learning in Computer Vision

Page 3: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 3

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING AND NEURAL NETWORKS - 3

Object and activity recognition

C“ick to go to the video in Youtube

Page 4: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 4

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING AND NEURAL NETWORKS - 4

Object detection, segmentation, pose estimation

C“ick to go to the video in Youtube

Page 5: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSEEFSTRATIOS GAVVES

INTRODUCTION TO DEEP LEARNING AND NEURAL NETWORKS - 5

Deep Learning in Robotics

Page 6: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 6

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING AND NEURAL NETWORKS - 6

Self-driving cars

C“ick to go to the video in Youtube

Page 7: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 7

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING AND NEURAL NETWORKS - 7

Drones and robots

C“ick to go to the video in Youtube

Page 8: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSEEFSTRATIOS GAVVES

INTRODUCTION TO DEEP LEARNING AND NEURAL NETWORKS - 8

Deep Learning in NLP and Speech

Page 9: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 9

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING AND NEURAL NETWORKS - 9

Speech recognition and Machine translation

Page 10: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSEEFSTRATIOS GAVVES

INTRODUCTION TO DEEP LEARNING AND NEURAL NETWORKS - 10

Deep Learning in the arts

Page 11: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 11

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING AND NEURAL NETWORKS - 11

Imitating famous painters Gatys et al

Page 12: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 12

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES INTRODUCTION TO DEEP LEARNING AND NEURAL NETWORKS - 12

Handwriting

C“ick to go to the website

Page 13: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSEEFSTRATIOS GAVVES

INTRODUCTION TO DEEP LEARNING AND NEURAL NETWORKS - 13

Deep Learning: The What and Why

Page 14: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

A Neural Network perspective

Data PredictionModel Loss

5 convolutional layers2 fully connected layers-----------------------------------100 convolutional layers50 batch normalization layers

Softmax---------------Sigmoid---------------Linear etc.

Cross entropy---------------Euclidean---------------Contrastive

Page 15: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

The Feedforward NN

Page 16: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Module/Layer types?

Page 17: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Nonlinearities

Page 18: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

SigmoidSigmoid 1

Sigmoid 2

sigmoid

derivative

Page 19: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Tanh

Page 20: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

ReLU

Page 21: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

ReLUMuch faster computations, gradients

❑ No vanishing/exploding gradients❑ People claim biological plausibility :/

Sparse activationsNo saturationNon-symmetricNon-differentiable at 0A large gradient during training can cause a neuron to “die”.

Higher learning rates mitigate the problem

Page 22: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Softmax

Page 23: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Euclidean Loss

Page 24: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Cross-entropy loss

Page 25: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Training Neural Networks

1. The Neural Network

2. Learning by minimizing empirical error

3. Optimizing with Gradient Descend based methods

Page 26: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

IntuitiveBackpropagation

UVA DEEP LEARNING COURSEEFSTRATIOS GAVVES

MODULAR LEARNING - PAGE 26

Page 27: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 27

SGD Sample small mini-batch of data-cases uniformly at random. Compute average gradient based on this mini-batch.Perform update based on gradient.

Page 28: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

o Things are dead simple, just compute per module

o Then follow iterative procedure

Backpropagation in practice

MODULAR LEARNING - PAGE 77UVA DEEP LEARNING COURSE –

EFSTRATIOS GAVVES

Page 29: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

o

Backpropagation in practice

Modu“e derivatives

Derivatives fro” “ayer above

MODULAR LEARNING - PAGE 78UVA DEEP LEARNING COURSE –

EFSTRATIOS GAVVES

Page 30: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Backpropagation visualization

MODULAR LEARNING - PAGE 82UVA DEEP LEARNING COURSE –

EFSTRATIOS GAVVES

Page 31: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Exa”p“e

Store!!!

MODULAR LEARNING - PAGE 83UVA DEEP LEARNING COURSE –

EFSTRATIOS GAVVES

Page 32: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Exa”p“e

Store!!!

MODULAR LEARNING - PAGE 84UVA DEEP LEARNING COURSE –

EFSTRATIOS GAVVES

Page 33: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Exa”p“e

Store!!!

MODULAR LEARNING - PAGE 85UVA DEEP LEARNING COURSE –

EFSTRATIOS GAVVES

Page 34: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Backpropagation

Exa”p“e

MODULAR LEARNING - PAGE 86UVA DEEP LEARNING COURSE –

EFSTRATIOS GAVVES

Page 35: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Stored during forward co”putations

BackpropagationExa”p“e

Store!!!

MODULAR LEARNING - PAGE 87UVA DEEP LEARNING COURSE –

EFSTRATIOS GAVVES

Page 36: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Backpropagation

Exa”p“e

Co”puted fro” the exact previous backpropagation step (Re”e”ber, recursive ru“e)

MODULAR LEARNING - PAGE 88UVA DEEP LEARNING COURSE –

EFSTRATIOS GAVVES

Page 37: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 37

o

Adam [Ba & Kingma 2014]

Page 38: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 38

Visual overview

Picture credit: A“ec Radford

Page 39: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Convolutional Neural Networks

Or just Convnets/CNNs

Page 40: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Convnets vs NNsQuestion: Spatial structure?

❑ NNs: not modelled❑ Convnets: Convolutional filters

Question: Huge input dimensionalities?❑ NNs: scale badly in nr parameters and compute efficiency❑ Convnets: Parameter sharing

Question: Local invariances?❑ NNs: not hardwired into model❑ Convnets: Pooling

Question: Translation equivariance?❏ NN: not hardwired❏ Convnets: translation equivariant

Page 41: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Spatial Information?One pixel alone does not carry much information

Many pixels in the right order though → tons of information

I.e., Neighboring variables are correlated

And the variable correlations is thevisual structure we want to learn

Page 42: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Parameter SharingNatural images are stationary

Visual features are common fordifferent parts of one or multiple image

If features are local and similar acrosslocations, why not reuse filters?

Local parameter sharing → Convolutions

Page 43: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Convolutional filtersOriginal image

Page 44: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Convolutional filters

1 1 10 1 10 0 1

0 01 01 1

0 0 10 1 1

1 00 0

Original image

1 0 1

0

1

1 0

0 1

Convolutionalfilter 1

Page 45: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Convolutional filters

41 1 1

0 1 1

0 0 1

0 0

1 0

1 1

0 0 1

0 1 1

1 0

0 0

x0

x1 x0 x1

x0 x1

x1 x0

1 1 1

0 1 1

0 0

0 0

1 0

1 1

0 0 1

0 1 1

1 0

0 0

x0

x1 x0 x1

x0 x1

x0 x1

1 1 1

0 1 1

0 0

0 0

1 0

1 1

0 0

0 1 1

1 0

0 0

x1 x0 x1

x0 x1

x1 x0

441 1 1

0 1 1

0 0

0 0

1 0

0 0

0 1

1 1 10 1 10 0 1

0 01 01 1

0 0 10 1 1

1 00 0

Convolving the image ResultOriginal image

1 0 1

0

1

1 0

0 1

Convolutionalfilter 1

Inner product

x0

x1 x0 x1

x0 x1

x1 x0 x1

Page 46: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Convolutional filters

4 31 1 1

0 1 1

0 0 1

0 0

1 0

1 1

0 0 1

0 1 1

1 0

0 0

x0

x1 x0 x1

x0 x1

x1 x0

1 1 1

0 1 1

0 0

0 0

1 0

1 1

0 0 1

0 1 1

1 0

0 0

x0

x1 x0 x1

x0 x1

x0 x1

1 1 1

0 1 1

0 0

0 0

1 0

1 1

0 0

0 1 1

1 0

0 0

x1 x0 x1

x0 x1

x1 x0

441 1 1

0 1 1

0 0

0 0

1 0

0 0

0 1

1 1 10 1 10 0 1

0 01 01 1

0 0 10 1 1

1 00 0

Convolving the image ResultOriginal image

1 0 1

0

1

1 0

0 1

Convolutionalfilter 1

Inner product

x0

x1 x0 x1

x0 x1

x1 x0 x1

Page 47: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Convolutional filters

4 31 1 1

0 1 1

0 0 1

0 0

1 0

1 1

0 0 1

0 1 1

1 0

0 0

x0

x1 x0 x1

x0 x1

x1 x0

1 1 1

0 1 1

0 0

0 0

1 0

1 1

0 0 1

0 1 1

1 0

0 0

x0

x1 x0 x1

x0 x1

x0 x1

1 1 1

0 1 1

0 0

0 0

1 0

1 1

0 0

0 1 1

1 0

0 0

x1 x0 x1

x0 x1

x1 x0

4

2

2

4

4 3

3 4

41 1 1

0 1 1

0 0

0 0

1 0

0 0

0 1

1 1 10 1 10 0 1

0 01 01 1

0 0 10 1 1

1 00 0

Convolving the image ResultOriginal image

1 0 1

0

1

1 0

0 1

Convolutionalfilter 1

Inner product

x0

x1 x0 x1

x0 x1

x1 x0 x1

Page 48: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Why call them convolutions?

Page 49: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Quiz: Notice anything weird?

Page 50: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Quiz: Notice anything weird?

Inverted. In practice we can do cross-correlations, not convolutions

Page 51: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Filters have width/height/depth

Grayscale

3 di”s

RGB

D di”s

Multiple channels

Fi“ters

Page 52: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 52

Parameter Sharing

3 di”s

7

7

Assu”e the i”age is 30x30x3.1 co“u”n of fi“ters co””on across the i”age.How ”any para”eters in tota“?

Page 53: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Local connectivityThe weight connections are surface-wise local!

❑ Local connectivity

The weights connections are depth-wise global

For standard neurons no local connectivity❑ Everything is connected to everything

Page 54: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 54

Depthwise Convolution

Page 55: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Pooling

1 4 32 1 02 2 7

697

5 3 3 6

4 95 7

Page 56: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Implementation detailsStride

❑ every how many pixels do you compute a convolution❑ equivalent to sampling coefficient, influences output size

Padding❑ Add 0s (or another value) around the layer input❑ Prevent output from getting smaller and smaller

Dilation❑ Atrous convolutions

Page 57: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 57

o

Batch normalization [Ioffe2015]

Batch normalization

Page 58: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 58

Batch normalization – The algorithm

o

Trainab“e para”eters

Page 59: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 59

o

Regularization

Page 60: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 60

o

Weight decay , because weights get s”a““er

Page 61: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 61

o

Sign function

Page 62: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 62

o To tackle overfitting another popular technique is early stopping

o Monitor performance on a separate validation set

o Training the network will decrease training error, as well validation error (although with a slower rate usually)

o Stop when validation error starts increasing◦This quite likely means the network starts to overfit

Early stopping

Page 63: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 63

o

Dropout [Srivastava2014]

Page 64: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 64

o Effectively, a different architecture at every training epoch

◦Similar to model ensembles

Dropout

Origina“ ”ode“

Page 65: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 65

o Effectively, a different architecture at every training epoch

◦Similar to model ensembles

Dropout

Epoch 1

Page 66: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 66

o Effectively, a different architecture at every training epoch

◦Similar to model ensembles

Dropout

Epoch 1

Page 67: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 67

o Effectively, a different architecture at every training epoch

◦Similar to model ensembles

Dropout

Epoch 2

Page 68: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 68

o Effectively, a different architecture at every training epoch

◦Similar to model ensembles

o At test time keep all neurons butmultiply output by p (e.g. 0.5) to compensate for the fact that moreof them are active than during training

Dropout

Epoch 2

Page 69: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 69

o

Weight initialization

Linear regi”e

Large gradients

Linear regi”e

Large gradients

Page 70: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 70

o

Xavier initialization [Glorot2010]

Page 71: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 71

o

[He2015] initialization for ReLUs

Page 72: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Network-in-network [Lin et al., arXiv 2013]

MODULAR LEARNING - PAGE 102UVA DEEP LEARNING COURSE –

EFSTRATIOS GAVVES

Page 73: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

ResNet [He et al., CVPR 2016]

MODULAR LEARNING - PAGE 103UVA DEEP LEARNING COURSE –

EFSTRATIOS GAVVES

Page 74: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

No degradation anymoreWithout residual connections deeper networks are untrainable

Page 75: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

ResNet breaks recordsRidiculously low error in ImageNet

Up to 1000 layers ResNets trained❑ Previous deepest network ~30-40 layers on simple datasets

Page 76: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

UVA DEEP LEARNING COURSE – EFSTRATIOS GAVVES & MAX WELLING - DEEPER INTO DEEP LEARNING AND OPTIMIZATIONS - 76

Data augmentation [Krizhevsky2012]

Origina“

F“ip

Rando” crop

Contrast

Tint

Page 77: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Some practical tricks of the trade

MODULAR LEARNING - PAGE 95UVA DEEP LEARNING COURSE –

EFSTRATIOS GAVVES

● For classification use entropy loss● Use variant of ReLU as nonlinearity● Use Adam SGD● Use random minibatch at each iteration● Normalize input to zero mean, unit variance● Use batch-normalization● Use dropout on fully connected layers● Use ResNet architecture● Think about weight inititalization● Do extensive hyperparameter search● Use data augmentation

Page 78: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Case studiesAlexnet

❑ Or the modern version of it, VGGnet

ResNet❑ From 14 to 1000 layers

Google Inception❑ Networks as Direct Acyclic Graphs (DAG)

Page 79: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Alexnet

Page 80: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Architectural details

Page 81: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Removing layer 7

Page 82: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Removing layer 6, 7

Page 83: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Removing layer 3, 4

Removing layer 3, 4

Page 84: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Removing layer 3, 4, 6, 7

Page 85: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Quiz: Translation invariance?

Credit: R. Fergus slides in Deep Learning Summer School 2016

Page 86: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Translation invariance

Credit: R. Fergus slides in Deep Learning Summer School 2016

Page 87: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Quiz: Scale invariance?

Credit: R. Fergus slides in Deep Learning Summer School 2016

Page 88: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Scale invariance

Credit: R. Fergus slides in Deep Learning Summer School 2016

Page 89: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Quiz: Rotation invariance?

Credit: R. Fergus slides in Deep Learning Summer School 2016

Page 90: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Rotation invariance

Credit: R. Fergus slides in Deep Learning Summer School 2016

Page 91: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Google Inception V1

Credit: https://culurciello.github.io/tech/2016/06/04/nets.html

Page 92: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

State-of-the-art

Credit: https://culurciello.github.io/tech/2016/06/04/nets.html

Page 93: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Recurrent NetworksSo far, all tasks assumed stationary data

Neither all data, nor all tasks are stationary though

Page 94: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Sequential data

Page 95: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Or …

What about text that is naturally sequential?

We need memory to handle long range correlations.

Page 96: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Recurrent Networks

Output para”eters

Input para”eters

Me”ory

Input

Output

Me”orypara”eters

Page 97: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Recurrent Networks

Output para”eters

Input para”eters

Me”ory

Input

Output

Me”orypara”eters

Page 98: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Recurrent Networks

Output para”eters

Input para”eters

Me”ory

Input

Output

Me”orypara”eters

Page 99: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Folding the memory

Unro““ed/Unfo“ded Network

Fo“ded Network

Page 100: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

RNN vs NNWhat is really different?

❑ Steps instead of layers❑ Step parameters shared whereas in a Multi-Layer Network they are different❑ Input at every layer instead of only at first layer.

3-gra” Unro““ed Recurrent Network

3-“ayer Neura“ Network

Layer/Step 1

Layer/Step 2

Layer/Step 3

Page 101: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Training an RNN

Page 102: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

RNN Gradients

Page 103: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Vanishing/Exploding gradients

Vanishing gradient

Exploding gradient

Page 104: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Advanced RNN: LSTM

Page 105: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Bringing Structure toVisual Deep Learning

Page 106: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Standard inferenceN-way classification

Dog? Cat? Bike? Car? P“ane?

Page 107: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Standard inferenceN-way classification

RegressionHow popu“ar wi““ this ”ovie be in IMDB?

Page 108: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Standard inferenceN-way classification

Regression

Ranking

Who is o“der?

Page 109: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Quiz: What is common?N-way classification

Regression

Ranking

Page 110: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Quiz: What is common?They all make “single value” predictionsDo all our machine learning tasks

boil down to “single value” predictions?

Page 111: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Beyond “single value” predictions?

Do all our machine learning tasksboil to “single value” predictions?

Are there tasks where outputsare somehow correlated?

Is there some structurein these output correlations?

How can we predict such structures?❑ Structured prediction

Page 112: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Object detectionPredict a box around an objectImages

❑ Spatial location❑ bounding box (bbox)

Videos❑ Spatio-temporal location❑ bbox@t, bbox@t+1, …

Page 113: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Object segmentation

Page 114: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Optical flow & motion estimation

Page 115: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Depth estimation

Godard et al., Unsupervised Monocular Depth Estimation with Left-Right Consistency, 2016

Page 116: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Structured predictionPrediction goes beyond asking for “single values”Outputs are complex and output dimensions correlatedOutput dimensions have latent structureCan we make deep networks to return structured

predictions?

Page 117: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Convnets for structured prediction

Page 118: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Sliding window on feature mapsSelective Search Object Proposals [Uijlings2013]SPPnet [He2014]Fast R-CNN [Girshick2015]

Page 119: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Fast R-CNN: StepsFast R-CNN [Girshick2015]

Conv 1

Conv 2

Conv 3

Conv 4

Conv 5

Conv 5 feature map

Page 120: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Fast R-CNN: StepsProcess the whole image up to conv5Compute possible locations for objects

Conv 1

Conv 2

Conv 3

Conv 4

Conv 5

Conv 5 feature map

Page 121: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Fast R-CNN: StepsProcess the whole image up to conv5Compute possible locations for objects

❑ some correct, most wrong

Conv 1

Conv 2

Conv 3

Conv 4

Conv 5

Conv 5 feature map

Page 122: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Fast R-CNN: StepsProcess the whole image up to conv5Compute possible locations for objects

❑ some correct, most wrongGiven single location → ROI pooling module extracts fixed

length feature

Conv 1

Conv 2

Conv 3

Conv 4

Conv 5

Conv 5 feature map

Always 4x4 no matter the size of candidate location

Page 123: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Fast R-CNN: StepsProcess the whole image up to conv5Compute possible locations for objects

❑ some correct, most wrongGiven single location → ROI pooling module extracts fixed

length feature

Conv 1

Conv 2

Conv 3

Conv 4

Conv 5

Conv 5 feature map

Always 4x4 no matter the size of candidate location

ROI Pooling Module

Page 124: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Fast R-CNN: StepsProcess the whole image up to conv5Compute possible locations for objects

❑ some correct, most wrongGiven single location → ROI pooling module

extracts fixed length feature

Conv 1

Conv 2

Conv 3

Conv 4

Conv 5

Conv 5 feature map

Always 4x4 no matter the size of candidate location

ROI Pooling Module

Page 125: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Fast R-CNN: StepsProcess the whole image up to conv5Compute possible locations for objects

❑ some correct, most wrongGiven single location → ROI pooling module

extracts fixed length feature

Conv 1

Conv 2

Conv 3

Conv 4

Conv 5

Conv 5 feature map

Always 4x4 no matter the size of candidate location

ROI Pooling Module

Car/dog/bicycleNew box

coordinates

Page 126: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Some results

Page 127: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Fast R-CNNReuse convolutions for different candidate boxes

❑ Compute feature maps only onceRegion-of-Interest pooling

❑ Define stride relatively → box width divided by predefined number of “poolings” T

❑ Fixed length vectorEnd-to-end training!(Very) Accurate object detection(Very) Faster

❑ Less than a second per imageExternal box proposals needed

T=5

Page 128: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Faster R-CNN [Girshick2016]

Region Proposal Network

Page 129: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Image Segmentation: Fully Convolutional

[LongCVPR2014]Image larger than network input

❑ slide the network

Conv

1 Conv

2 Conv

3 Conv

4 Conv

5

fc1

fc2

Is this pixel a camel?Yes! No!

Page 130: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Image Segmentation: Fully Convolutional

[LongCVPR2014]Image larger than network input

❑ slide the network

Conv

1 Conv

2 Conv

3 Conv

4 Conv

5

fc1

fc2

Is this pixel a camel?Yes! No!

Page 131: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Image Segmentation: Fully Convolutional

[LongCVPR2014]Image larger than network input

❑ slide the network

Conv

1 Conv

2 Conv

3 Conv

4 Conv

5

fc1

fc2

Is this pixel a camel?Yes! No!

Page 132: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Image Segmentation: Fully Convolutional

[LongCVPR2014]Image larger than network input

❑ slide the network

Conv

1 Conv

2 Conv

3 Conv

4 Conv

5

fc1

fc2

Is this pixel a camel?Yes! No!

Page 133: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Image Segmentation: Fully Convolutional

[LongCVPR2014]Image larger than network input

❑ slide the network

Conv

1 Conv

2 Conv

3 Conv

4 Conv

5

fc1

fc2

Is this pixel a camel?Yes! No!

Page 134: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Deep ConvNets with CRF loss[Chen, Papandreou 2016]

Page 135: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Deep ConvNets with CRF loss

Unary “oss Pairwise “oss

Tota“ “oss

Page 136: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Examples

Page 137: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Discovering structure

Page 138: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Standard Autoencoder

Page 139: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Standard Autoencoder

Page 140: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Denoising Autoencoder

Page 141: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Denoising AutoencoderThe network does not overlearn the data

❑ Can even use overcomplete latent spacesModel forced to learn more intelligent, robust

representations❑ Learn to ignore noise or trivial solutions(identity)❑ Focus on “underlying” data generation process

Page 142: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Variational Autoencoder

Reconstruction termRegularization term

Page 143: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Examples

Page 144: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Generative Adversarial Networks

Page 145: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Generator

Discriminator

Generative Adversarial Networks

Page 146: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

“Police vs Thief”

Page 147: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Examples

Bedrooms

Page 148: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Image “arithmetics”

Page 149: Deep Learning Intro - Data Science AfricaDeep Learning Intro Max Welling Special thanks to Efstratios Gavves whose slides I am using and who has helped with the practicals Also thanks

Thank you!