From neural networks to deep learning


Transcript of From neural networks to deep learning

Page 1: From neural networks to deep learning

From Artificial Neural Networks to Deep Learning

Viet-Trung Tran

1  

Page 2: From neural networks to deep learning

2  

Page 3: From neural networks to deep learning

3  

Page 4: From neural networks to deep learning

4  

Page 5: From neural networks to deep learning

5  

Page 6: From neural networks to deep learning

Perceptron

•  Rosenblatt, 1957
•  Input signals x1, x2, …
•  Bias x0 = 1
•  Net input = weighted sum = Net(w, x)
•  Activation/transfer function = f(Net(w, x))
•  Output

(Diagram: inputs feed a weighted sum, followed by a step function.)
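The perceptron above can be sketched in a few lines of Python (the AND-gate weights below are hand-picked for illustration, not taken from the slides):

```python
def perceptron(weights, bias, inputs):
    """Net input = weighted sum; output = hard-limiter (step) of the net input."""
    # Net(w, x): the bias plays the role of w0 * x0 with x0 = 1
    net = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= 0 else 0  # step activation

# Hand-picked weights that realize an AND gate (illustrative only)
w, b = [1.0, 1.0], -1.5
```

With these weights the net input only reaches 0 when both inputs are 1, so the step function fires only for (1, 1).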

6  

Page 7: From neural networks to deep learning

Weighted Sum and Bias

•  Weighted sum

•  Bias

7  

Page 8: From neural networks to deep learning

8  

Page 9: From neural networks to deep learning

Hard-limiter function

•  Hard-limiter
  –  Threshold function
  –  Discontinuous function
  –  Discontinuous derivative

9  

Page 10: From neural networks to deep learning

Threshold logic function

•  Saturating linear function

•  Continuous function

•  Discontinuous derivative

10  

Page 11: From neural networks to deep learning

Sigmoid function

•  Most popular
•  Output (0, 1)
•  Continuous derivatives
•  Easy to differentiate
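As a small illustration of why the sigmoid is easy to differentiate: its derivative can be written entirely in terms of its own output, σ′(z) = σ(z)(1 − σ(z)). A sketch:

```python
import math

def sigmoid(z):
    # Output lies strictly between 0 and 1
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    # The derivative reuses the forward value, which is cheap during backprop
    s = sigmoid(z)
    return s * (1.0 - s)
```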

11  

Page 12: From neural networks to deep learning

Artificial neural network – ANN structure

•  Number of input/output signals
•  Number of hidden layers
•  Number of neurons per layer
•  Neuron weights
•  Topology
•  Biases

12  

Page 13: From neural networks to deep learning

Feed-forward neural network

•  connections between the units do not form a directed cycle

13  

Page 14: From neural networks to deep learning

Recurrent neural network

•  A class of artificial neural network where connections between units form a directed cycle

14  

Page 15: From neural networks to deep learning

Why hidden layers

15  

Page 16: From neural networks to deep learning

Neural network learning

•  2 types of learning
  –  Parameter learning
    •  Learn neuron weight connections
  –  Structure learning
    •  Learn ANN structure from training data

16  

Page 17: From neural networks to deep learning

Error function

•  Consider an ANN with n neurons
•  For each learning example (x, d)
  –  Training error caused by the current weights w
•  Training error caused by w for the entire set of learning examples
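Assuming the usual half-squared-error definition (a common choice; the slide's exact formula is not visible in the transcript), the two error quantities can be sketched as:

```python
def example_error(d, o):
    # Squared error for one learning example: d = desired outputs, o = actual outputs
    return 0.5 * sum((di - oi) ** 2 for di, oi in zip(d, o))

def total_error(targets, outputs):
    # Training error caused by w over the entire set of learning examples
    return sum(example_error(d, o) for d, o in zip(targets, outputs))
```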

17  

Page 18: From neural networks to deep learning

Learning principle

18  

Page 19: From neural networks to deep learning

Neuron error gradients

19  

Page 20: From neural networks to deep learning

Parameter learning: back propagation of error

•  Calculate the total error at the top
•  Calculate contributions to the error at each step going backwards
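A minimal sketch of this idea for a hypothetical one-input, one-hidden-neuron, one-output network with sigmoid activations (biases omitted for brevity; the weight names are assumptions, not from the slides):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def backprop_step(x, d, w_h, w_o, lr=0.5):
    # Forward pass
    h = sigmoid(w_h * x)          # hidden activation
    o = sigmoid(w_o * h)          # network output
    # Backward pass: error at the top, contributions propagated backwards
    delta_o = (o - d) * o * (1 - o)        # output-layer error gradient
    delta_h = delta_o * w_o * h * (1 - h)  # propagated to the hidden layer
    # Gradient-descent weight updates
    return w_h - lr * delta_h * x, w_o - lr * delta_o * h
```

Repeating the step drives the output toward the desired value d.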

20  

Page 21: From neural networks to deep learning

Back propagation discussion

•  Initial weights
•  Learning rate
•  Number of neurons per hidden layer
•  Number of hidden layers

21  

Page 22: From neural networks to deep learning

Stochastic gradient descent (SGD)
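A sketch of SGD on a toy model y = a·x with half-squared-error loss, updating on one shuffled example at a time (all names and values here are illustrative):

```python
import random

def sgd_fit(data, lr=0.1, epochs=50, seed=0):
    """Fit y = a*x by stochastic gradient descent: one example per update."""
    rng = random.Random(seed)
    a = 0.0
    for _ in range(epochs):
        rng.shuffle(data)            # visit examples in random order
        for x, y in data:
            grad = (a * x - y) * x   # d/da of 0.5*(a*x - y)**2
            a -= lr * grad           # step against the gradient
    return a
```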

22  

Page 23: From neural networks to deep learning

23  

Page 24: From neural networks to deep learning

Deep learning

24  

Page 25: From neural networks to deep learning

Google brain

25  

Page 26: From neural networks to deep learning

GPU

26  

Page 27: From neural networks to deep learning

Learning from tagged data

•  @Andrew Ng

27  

Page 28: From neural networks to deep learning

2006 breakthrough

•  More data
•  Faster hardware: GPUs, multi-core CPUs
•  Working ideas on how to train deep architectures

28  

Page 29: From neural networks to deep learning

29  

Page 30: From neural networks to deep learning

30  

Page 31: From neural networks to deep learning

31  

Page 32: From neural networks to deep learning

Deep Learning trends

•  @Andrew Ng

32  

Page 33: From neural networks to deep learning

33  

Page 34: From neural networks to deep learning

34  

Page 35: From neural networks to deep learning

AI will transform the internet

•  @Andrew Ng
•  Technology areas with potential for a paradigm shift:
  –  Computer vision
  –  Speech recognition & speech synthesis
  –  Language understanding: machine translation; web search; dialog systems; …
  –  Advertising
  –  Personalization/recommendation systems
  –  Robotics
•  All of this is hard: scalability, algorithms.

35  

Page 36: From neural networks to deep learning

36  

Page 37: From neural networks to deep learning

37  

Page 38: From neural networks to deep learning

38  

Page 39: From neural networks to deep learning

Deep learning

39  

Page 40: From neural networks to deep learning

40  

Page 41: From neural networks to deep learning

CONVOLUTIONAL NEURAL NETWORK

http://colah.github.io/

41  

Page 42: From neural networks to deep learning

Convolution

•  Convolution is a mathematical operation on two functions f and g, producing a third function that is typically viewed as a modified version of one of the original functions.
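A direct sketch of discrete 1D convolution, (f ∗ g)[n] = Σk f[k]·g[n − k], for finite sequences:

```python
def conv1d(f, g):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj   # f[k] * g[n - k] contributes at n = i + j
    return out
```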

42  

Page 43: From neural networks to deep learning

Convolutional neural networks

•  A conv net is a kind of neural network that uses many identical copies of the same neuron
  –  Large number of neurons
  –  Large computational models
  –  Number of actual weights (parameters) to be learned stays fairly small

43  

Page 44: From neural networks to deep learning

A 2D Convolutional Neural Network

•  A convolutional neural network can learn a neuron once and use it in many places, making the model easier to learn and reducing error.

44  

Page 45: From neural networks to deep learning

Structure of Conv Nets

•  Problem – predict whether a human is speaking or not

•  Input: audio samples at different points in time

45  

Page 46: From neural networks to deep learning

Simple approach

•  just connect them all to a fully-connected layer

•  Then classify

46  

Page 47: From neural networks to deep learning

A more sophisticated approach

•  Local properties of the data
  –  Frequency of sounds (increasing/decreasing)
•  Look at a small window of the audio sample
  –  Create a group of neurons A to compute certain features
  –  The output of this convolutional layer is fed into a fully-connected layer F

47  

Page 48: From neural networks to deep learning

48  

Page 49: From neural networks to deep learning

49  

Page 50: From neural networks to deep learning

Max pooling layer
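A max pooling layer downsamples its input by keeping only the maximum value within each window. A 1D sketch with non-overlapping windows (the window size is an assumption for illustration):

```python
def max_pool1d(xs, window=2):
    """Downsample by taking the max over non-overlapping windows."""
    return [max(xs[i:i + window])
            for i in range(0, len(xs) - window + 1, window)]
```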

50  

Page 51: From neural networks to deep learning

2D convolutional neural networks

51  

Page 52: From neural networks to deep learning

52  

Page 53: From neural networks to deep learning

53  

Page 54: From neural networks to deep learning

Three-dimensional convolutional networks

54  

Page 55: From neural networks to deep learning

Group of neurons: A

•  A bunch of neurons in parallel
•  All get the same inputs and compute different features

55  

Page 56: From neural networks to deep learning

Network in Network (Lin et al., 2013)

56  

Page 57: From neural networks to deep learning

Conv Nets breakthroughs in computer vision

•  Krizhevsky et al. (2012)

57  

Page 58: From neural networks to deep learning

Different Levels of Abstraction

58  

Page 59: From neural networks to deep learning

59  

Page 60: From neural networks to deep learning

60  

Page 61: From neural networks to deep learning

RECURRENT NEURAL NETWORKS

http://colah.github.io/

61  

Page 62: From neural networks to deep learning

Recurrent Neural Networks (RNN) have loops

•  A loop allows information to be passed from one step of the network to the next.

62  

Page 63: From neural networks to deep learning

Unroll RNN

•  recurrent neural networks are intimately related to sequences and lists.
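Unrolling can be sketched as applying the same cell at every time step, with the hidden state carrying information from one step to the next (a hypothetical scalar cell with fixed, illustrative weights):

```python
import math

def rnn_unroll(inputs, w_in=1.0, w_rec=0.5, h0=0.0):
    """Unrolled simple RNN: the same cell is reused at every time step."""
    h = h0
    states = []
    for x in inputs:
        # Single-tanh repeating module; w_rec feeds the past state forward
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states
```

Even with a zero input at later steps, the hidden state is nonzero because past information flows through the recurrent connection.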

63  

Page 64: From neural networks to deep learning

Examples

•  Predict the last word in “the clouds are in the sky”
•  The gap between the relevant information and the place where it’s needed is small
•  RNNs can learn to use the past information

64  

Page 65: From neural networks to deep learning

•  “I grew up in France… I speak fluent French.”
•  As the gap grows, RNNs become unable to learn to connect the information.

65  

Page 66: From neural networks to deep learning

LONG SHORT TERM MEMORY NETWORKS

LSTM Networks

66  

Page 67: From neural networks to deep learning

LSTM networks

•  A special kind of RNN
•  Capable of learning long-term dependencies
•  Structured as a chain of repeating modules of neural network

67  

Page 68: From neural networks to deep learning

RNN

•  The repeating module has a very simple structure, such as a single tanh layer

68  

Page 69: From neural networks to deep learning

•  The tanh(z) function is a rescaled version of the sigmoid, and its output range is (−1, 1) instead of (0, 1).
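The rescaling can be checked directly: tanh(z) = 2·σ(2z) − 1, where σ is the sigmoid:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def tanh_via_sigmoid(z):
    # tanh is a rescaled, shifted sigmoid: range (-1, 1) instead of (0, 1)
    return 2.0 * sigmoid(2.0 * z) - 1.0
```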

69  

Page 70: From neural networks to deep learning

LSTM networks

•  The repeating module consists of four neurons, interacting in a very special way

70  

Page 71: From neural networks to deep learning

Core idea behind LSTMs

•  The key to LSTMs is the cell state, the horizontal line running through the top of the diagram.
•  The cell state runs straight down the entire chain, with only some minor linear interactions.
•  It is easy for information to just flow along it unchanged.

71  

Page 72: From neural networks to deep learning

Gates

•  The ability to remove or add information to the cell state is carefully regulated by structures called gates

•  Sigmoid output decides how much of each component should be let through
  –  Zero means let nothing through
  –  One means let everything through

•  An LSTM has three of these gates

72  

Page 73: From neural networks to deep learning

LSTM step 1

•  decide what information we’re going to throw away from the cell state

•  forget gate layer

73  

Page 74: From neural networks to deep learning

LSTM step 2

•  decide what new information we’re going to store in the cell state

•  input gate layer

74  

Page 75: From neural networks to deep learning

LSTMs step 3

•  update the old cell state, Ct−1, into the new cell state Ct

75  

Page 76: From neural networks to deep learning

LSTMs step 4

•  decide what we’re going to output

76  

Page 77: From neural networks to deep learning

77  

Page 78: From neural networks to deep learning

78  

Page 79: From neural networks to deep learning

79  

Page 80: From neural networks to deep learning

80  

Page 81: From neural networks to deep learning

RECURRENT NEURAL NETWORKS WITH WORD EMBEDDINGS

81  

Page 82: From neural networks to deep learning

APPENDIX

82  

Page 83: From neural networks to deep learning

83  

Page 84: From neural networks to deep learning

Perceptron 1957

84  

Page 85: From neural networks to deep learning

Perceptron 1957

85  

Page 86: From neural networks to deep learning

Perceptron 1986

86  

Page 87: From neural networks to deep learning

Perceptron

87  

Page 88: From neural networks to deep learning

Activation function

88  

Page 89: From neural networks to deep learning

Back propagation 1974/1986

89  

Page 90: From neural networks to deep learning

90  

Page 91: From neural networks to deep learning

91  

Page 92: From neural networks to deep learning

•  Inspired by the architectural depth of the brain, researchers wanted for decades to train deep multi-layer neural networks.

•  No successful attempts were reported before 2006… Exception: convolutional neural networks (LeCun, 1998)

•  SVM: Vapnik and his co-workers developed the Support Vector Machine (1993), a shallow architecture.

•  Breakthrough in 2006!

92  

Page 93: From neural networks to deep learning

2006 breakthrough

•  More data
•  Faster hardware: GPUs, multi-core CPUs
•  Working ideas on how to train deep architectures

93  

Page 94: From neural networks to deep learning

•  Beat state of the art in many areas:
  –  Language modeling (Mikolov et al., 2012)
  –  Image recognition (Krizhevsky won the 2012 ImageNet competition)
  –  Sentiment classification (Socher et al., 2011)
  –  Speech recognition (Dahl et al., 2010)
  –  MNIST hand-written digit recognition (Ciresan et al., 2010)

94  

Page 95: From neural networks to deep learning

Credits

•  Roelof Pieters, www.graph-technologies.com
•  Andrew Ng
•  http://colah.github.io/

95