Deep Convolutional Neural Networks - Overview

Convolutional Neural Networks: A brief explanation. Keunwoo.Choi@qmul.ac.uk, Centre for Digital Music, Queen Mary University of London, UK (43 slides)

Transcript of Deep Convolutional Neural Networks - Overview

Page 1: Deep Convolutional Neural Networks - Overview

Convolutional Neural Networks: A brief explanation

Keunwoo.Choi@qmul.ac.uk

Centre for Digital Music, Queen Mary University of London, UK

1/43

Page 2: Deep Convolutional Neural Networks - Overview

1 Overview
  - CNNs vs DNNs
  - CNN structures
  - Inside CNNs

2 CNN use-cases
  - Image
  - Music

3 References

Page 3: Deep Convolutional Neural Networks - Overview

CNNs: Convolutional Neural Networks

(Deep) Convolutional Neural Networks

- deep = cascaded
- convolutional = filters

[Two figures. Image sources: 1. cns.org, 2. AlexNet]

Page 4: Deep Convolutional Neural Networks - Overview

CNNs vs. general DNNs

- DNNs: fully-connected
- CNNs: locally-connected, with shared weights

[Figure source: http://cs231n.github.io/convolutional-networks/]

Page 5: Deep Convolutional Neural Networks - Overview

Convolution == filtering

Fully Connected Layer

Example: a 200x200 image connected to 40K hidden units needs 200 x 200 x 40,000 = 1.6 billion weights (~2B parameters!)

- Spatial correlation is local
- Waste of resources, and we do not have enough training samples anyway

Source: Ranzato
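The parameter count on this slide can be checked with a few lines of arithmetic. The sketch below assumes a single-channel (grayscale) image and ignores bias terms unless asked for them; the function name is only illustrative.

```python
def fully_connected_params(height, width, hidden_units, bias=False):
    """Count weights when every input pixel connects to every hidden unit."""
    inputs = height * width            # one input per pixel (grayscale assumed)
    weights = inputs * hidden_units    # one weight per (pixel, hidden unit) pair
    return weights + (hidden_units if bias else 0)


fc_params = fully_connected_params(200, 200, 40_000)
print(f"{fc_params:,}")  # 1,600,000,000
```

The count grows with the number of pixels, which is why a fully connected first layer quickly becomes impractical for images.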

Page 6: Deep Convolutional Neural Networks - Overview

Convolution == filtering

Convolutional Layer

Share the same parameters across different locations (assuming the input is stationary): convolutions with learned kernels

Source: Ranzato
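To see what parameter sharing buys, here is a minimal sketch of the weight count for a convolutional layer. The layer shape (100 filters of size 10x10 on a single-channel input) is a hypothetical choice for illustration and is not taken from the slide.

```python
def conv_layer_params(kernel_h, kernel_w, num_filters, in_channels=1):
    """Weights in a convolutional layer; note the image size does not appear."""
    return kernel_h * kernel_w * in_channels * num_filters


def valid_output_size(in_size, kernel_size):
    """Output length along one axis for stride 1 and no padding."""
    return in_size - kernel_size + 1


# Hypothetical layer: 100 learned 10x10 kernels applied to a 200x200 image.
conv_params = conv_layer_params(10, 10, num_filters=100)
out_side = valid_output_size(200, 10)
print(conv_params)  # 10000
print(out_side)     # 191
```

Because each kernel is reused at every location, the number of weights depends only on the kernel size and the number of filters, not on the image size.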

Pages 7-21: Deep Convolutional Neural Networks - Overview

Convolution == filtering

Convolutional Layer

[Figure-only slides. Source: Ranzato]

Page 22: Deep Convolutional Neural Networks - Overview

Convolution == filtering

Convolutional Layer

Source: Ranzato

Mathieu et al. “Fast training of CNNs through FFTs”, ICLR 2014

Page 23: Deep Convolutional Neural Networks - Overview

Convolution == filtering
Example: vertical edge detector

Convolutional Layer

[input image] *

-1 0 1
-1 0 1
-1 0 1

= [filtered image]

Source: Ranzato
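The vertical edge detector above can be tried on a toy image. This is a minimal sketch in plain Python: `filter2d` is an illustrative helper, the 5x6 test image is made up, and the kernel is applied as a sliding window without flipping (cross-correlation), which is what most CNN libraries compute under the name "convolution".

```python
def filter2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


VERTICAL_EDGE = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

# Toy image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]

response = filter2d(image, VERTICAL_EDGE)
for row in response:
    print(row)  # [0, 3, 3, 0] for each of the 3 rows
```

The response is zero in the flat regions and large where the window straddles the dark-to-bright boundary. With a true (flipped-kernel) convolution the sign of the response would be reversed.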

Page 24: Deep Convolutional Neural Networks - Overview

CNN structures
Convolutional layers + something else 1

[Figure from [6]]

Many convolutional layers that learn filters, and subsampling layers that reduce sizes and add invariances

Page 25: Deep Convolutional Neural Networks - Overview

CNN structures
Convolutional layers + something else 2

[Figure from [1]]

Many convolutional layers that learn filters, and subsampling layers that reduce sizes and add invariances
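A subsampling layer can be sketched as non-overlapping max pooling, which is one common choice (the slides do not say which subsampling operation is used). The helper name and the two 4x4 feature maps below are made up for illustration.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


a = [[0, 9, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 5]]

# Same two peaks, each moved by one pixel but still inside its 2x2 window.
b = [[0, 0, 0, 0],
     [9, 0, 0, 0],
     [0, 0, 0, 0],
     [0, 0, 5, 0]]

pooled_a = max_pool2d(a)
pooled_b = max_pool2d(b)
print(pooled_a)  # [[9, 0], [0, 5]]
print(pooled_b)  # [[9, 0], [0, 5]]
```

The 4x4 maps shrink to 2x2 ("reduce sizes"), and the small shifts no longer change the output ("add invariances"). The invariance only holds for shifts that stay within a pooling window.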

Page 26: Deep Convolutional Neural Networks - Overview

Hierarchical features

Hierarchical feature learning

- Each layer learns features at a different level of the hierarchy
- High-level features are built on low-level features

E.g.

- Layer 1: Edges (low-level, concrete)
- Layer 2: Simple shapes
- Layer 3: Complex shapes
- Layer 4: More complex shapes
- Layer 5: Shapes of target objects (high-level, abstract)
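One way to see why deeper layers can represent larger structures is to track how much of the input each layer's units can "see" (the receptive field). The sketch below uses the standard recurrence for stacked layers; the kernel sizes and strides are assumptions chosen for illustration, not values from the slides.

```python
def receptive_fields(kernel_sizes, strides):
    """Receptive field (in input pixels, along one axis) after each layer."""
    rf, jump = 1, 1          # jump = distance in input pixels between adjacent units
    sizes = []
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump
        jump *= s
        sizes.append(rf)
    return sizes


# Five stacked 3x3 convolutions, stride 1 (assumed).
five_convs = receptive_fields([3] * 5, [1] * 5)
print(five_convs)  # [3, 5, 7, 9, 11]

# Convolution (3, stride 1) followed by pooling (2, stride 2), repeated twice (assumed).
conv_pool = receptive_fields([3, 2, 3, 2], [1, 2, 1, 2])
print(conv_pool)   # [3, 4, 8, 10]
```

Units in layer 1 only cover a 3-pixel span, enough for an edge, while later layers combine several earlier units and so cover progressively larger parts of the input.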

Pages 27-29: Deep Convolutional Neural Networks - Overview

What is learned in CNNs?
in image recognition task

[Figures from [11]]

Page 30: Deep Convolutional Neural Networks - Overview

What is learned in CNNs? (1/2)
in music genre classification task

Layer 1/5 (each row shows four example clips: Bach, Dream, Toy, Eminem)

- Original
- [Feature 1-9], Crude onset detector
- [Feature 1-27], Onset detector

[2], blog demo

Page 31: Deep Convolutional Neural Networks - Overview

What is learned in CNNs? (1/2)
in music genre classification task

Layer 2/5 (each row shows four example clips: Bach, Dream, Toy, Eminem)

- Original
- [Feature 2-0], Good onset detector
- [Feature 2-1], Bass note selector
- [Feature 2-10], Harmonic selector
- [Feature 2-48], Melody (large energy)

[2]

Page 32: Deep Convolutional Neural Networks - Overview

What is learned in CNNs? (1/2)
in music genre classification task

Layer 3/5 (each row shows four example clips: Bach, Dream, Toy, Eminem)

- Original
- [Feature 3-1], Better onset detector
- [Feature 3-7], Melody (top note)
- [Feature 3-38], Kick drum extractor
- [Feature 3-40], Percussive eraser

[2]

Page 33: Deep Convolutional Neural Networks - Overview

What is learned in CNNs? (1/2)
in music genre classification task

Layer 4/5 (each row shows four example clips: Bach, Dream, Toy, Eminem)

- Original
- [Feature 4-5], Lowest notes selector
- [Feature 4-11], Vertical line eraser
- [Feature 4-30], Long horizontal line selector

[2]

Page 34: Deep Convolutional Neural Networks - Overview

What is learned in CNNs? (1/2)
in music genre classification task

Layer 5/5 (each row shows four example clips: Bach, Dream, Toy, Eminem)

- Original
- [Feature 5-11], texture 1
- [Feature 5-15], texture 2
- [Feature 5-56], Harmo-Rhythmic structure
- [Feature 5-33], texture 3

[2]

Page 35: Deep Convolutional Neural Networks - Overview

What is learned in CNNs? (2/2)
in music tagging task: Learn the transform!

Audio → 2-D representation

[3]
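For reference, the conventional fixed "Audio → 2-D representation" is a magnitude spectrogram, which is the kind of hand-designed transform that end-to-end learning tries to replace with learned filters. The sketch below is a plain-Python short-time Fourier transform on a synthetic tone; the frame size, hop, Hann window, and test signal are all assumptions for illustration and are not taken from [3].

```python
import cmath
import math


def stft_magnitude(signal, frame_size, hop):
    """Return a time-by-frequency grid: one list of bin magnitudes per frame."""
    # Periodic Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        spectrum = []
        for k in range(frame_size // 2 + 1):   # non-negative frequencies only
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Synthetic tone that falls exactly on frequency bin 4 of a 16-sample frame.
frame_size, hop, tone_bin = 16, 8, 4
signal = [math.sin(2 * math.pi * tone_bin * n / frame_size) for n in range(64)]

spec = stft_magnitude(signal, frame_size, hop)
peak_bins = [frame.index(max(frame)) for frame in spec]
print(len(spec), len(spec[0]))  # 7 9
print(peak_bins)                # [4, 4, 4, 4, 4, 4, 4]
```

The result is a 2-D array (time frames by frequency bins) that a CNN can treat like an image. A direct DFT like this is only practical for tiny examples; real pipelines would use an FFT library.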

Page 36: Deep Convolutional Neural Networks - Overview

CNN use-cases
Visual image recognition

Page 37: Deep Convolutional Neural Networks - Overview

CNN use-cases
Image segmentation

[12]

Page 38: Deep Convolutional Neural Networks - Overview

CNN use-cases
Artistic style

[4]

Page 39: Deep Convolutional Neural Networks - Overview

CNN use-cases
Music information retrieval

Anything people can do by seeing spectrograms

E.g. auto-tagging [1], chord recognition [5], instrument recognition [7], music-noise segmentation [8], onset detection [9], boundary detection [10]

+ style change? source separation? effects/de-effects?

Page 40: Deep Convolutional Neural Networks - Overview

References I

[1] Choi, K., Fazekas, G., Sandler, M.: Automatic tagging using deep convolutional neural networks. In: Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR 2016), New York, USA (2016)

[2] Choi, K., Fazekas, G., Sandler, M.: Explaining convolutional neural networks on music classification (submitted). In: IEEE Conference on Machine Learning and Signal Processing (2016)

[3] Dieleman, S., Schrauwen, B.: End-to-end learning for music audio. In: Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. pp. 6964–6968. IEEE (2014)

Page 41: Deep Convolutional Neural Networks - Overview

References II

[4] Gatys, L.A., Ecker, A.S., Bethge, M.: A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576 (2015)

[5] Humphrey, E.J., Bello, J.P.: From music audio to chord tablature: Teaching deep convolutional networks to play guitar. In: Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. pp. 6974–6978. IEEE (2014)

[6] LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998)

[7] Li, P., Qian, J., Wang, T.: Automatic instrument recognition in polyphonic music using convolutional neural networks. arXiv preprint arXiv:1511.05520 (2015)

Page 42: Deep Convolutional Neural Networks - Overview

References III

[8] Park, T., Lee, T.: Music-noise segmentation in spectrotemporal domain using convolutional neural networks. ISMIR late-breaking session (2015)

[9] Schluter, J., Bock, S.: Improved musical onset detection with convolutional neural networks. In: International Conference on Acoustics, Speech and Signal Processing. IEEE (2014)

[10] Ullrich, K., Schluter, J., Grill, T.: Boundary detection in music structure analysis using convolutional neural networks. In: Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan (2014)

Page 43: Deep Convolutional Neural Networks - Overview

References IV

[11] Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Computer Vision–ECCV 2014, pp. 818–833. Springer (2014)

[12] Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C., Torr, P.H.: Conditional random fields as recurrent neural networks. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 1529–1537 (2015)