Transcript of Intro to Neural Networks and Deep Learning
(jjl5sw/documents/DeepLearningIntro.pdf)
Page 1: Intro to Neural Networks and Deep Learning
Jack Lanchantin, Dr. Yanjun Qi
UVA CS 6316
Page 2: Outline
● Neurons
● 1-Layer Neural Network
● Multi-layer Neural Network
● Loss Functions
● Backpropagation
● Nonlinearity Functions
● NNs in Practice
Page 3: [Figure: a multi-layer network mapping input x = (x1, x2, x3) through weights W1, W2, w3 to output ŷ]
Page 4: Logistic Regression
Sigmoid function (aka logistic, logit, "S", soft-step):
P(Y=1|x) = e^(wx+b) / (1 + e^(wx+b))
Page 5: Expanded Logistic Regression
Input x = (x1, x2, x3) is multiplied by weights w1, w2, w3 and fed, together with the bias b1, into the summing function Σ and then the sigmoid function:
z = w^T x + b      (x: p×1, w: p×1, z: 1×1, with p = 3)
ŷ = sigmoid(z) = e^z / (1 + e^z) = P(Y=1|x,w)
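The expanded logistic regression above fits in a few lines of numpy; the input and weight values below are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    # Numerically equivalent form of e^z / (1 + e^z)
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 2.0, -1.0])   # input x, p = 3
w = np.array([0.5, -0.25, 0.1])  # weights w1, w2, w3
b = 0.0                          # bias b1

z = w @ x + b                    # summing function: w^T x + b (a scalar)
y_hat = sigmoid(z)               # P(Y=1 | x, w)
print(z, y_hat)
```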
Page 6: "Neuron"
[Figure: the same diagram as Page 5 — inputs x1, x2, x3 with weights w1, w2, w3, bias b1, a summing function Σ producing z, then the sigmoid]
Page 7: Neurons
http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf
Page 8: Neuron
Input x = (x1, x2, x3) with weights w1, w2, w3:
z = w^T x      (x: p×1, z: 1×1)
ŷ = sigmoid(z) = e^z / (1 + e^z)
From here on, we leave out the bias for simplicity.
Page 9: "Block View" of a Neuron
Input x → [Dot Product: *w] → z → [Sigmoid] → output ŷ
The dot product is a parameterized block.
z = w^T x      (x: p×1, z: 1×1)
ŷ = sigmoid(z) = e^z / (1 + e^z)
Page 10: Neuron Representation
[Figure: inputs x1, x2, x3 with weights w1, w2, w3 feeding Σ and the sigmoid, producing ŷ]
The linear transformation and nonlinearity together are typically considered a single neuron.
Page 11: Neuron Representation
[Block view: x → *w → ŷ]
The linear transformation and nonlinearity together are typically considered a single neuron.
Page 12: Outline
● Neurons
● 1-Layer Neural Network
● Multi-layer Neural Network
● Loss Functions
● Backpropagation
● Nonlinearity Functions
● NNs in Practice
Page 13: [Figure: a multi-layer network mapping input x = (x1, x2, x3) through weights W1, W2, w3 to output ŷ]
Page 14: 1-Layer Neural Network (with 4 neurons)
Input x = (x1, x2, x3) → [Linear: matrix W, vector z] → [Sigmoid] → Output ŷ = (ŷ1, ŷ2, ŷ3, ŷ4)
The four Σ units together form 1 layer.
Page 15: 1-Layer Neural Network (with 4 neurons)
z = W^T x      (x: p×1, W: p×d, z: d×1, with p = 3, d = 4)
ŷ = sigmoid(z) = e^z / (1 + e^z)      (element-wise on vector z; ŷ: d×1)
Page 16: 1-Layer Neural Network (with 4 neurons)
z = W^T x      (x: p×1, W: p×d, z: d×1, with p = 3, d = 4)
ŷ = sigmoid(z) = e^z / (1 + e^z)      (element-wise on vector z; ŷ: d×1)
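The 1-layer equations vectorize directly. The shapes below follow the slide (p = 3, d = 4); the values are randomly drawn and purely illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

p, d = 3, 4
rng = np.random.default_rng(0)
x = rng.normal(size=(p, 1))      # input x, p x 1
W = rng.normal(size=(p, d))      # weight matrix W, p x d

z = W.T @ x                      # z = W^T x, d x 1
y_hat = sigmoid(z)               # element-wise on vector z, d x 1
print(y_hat.shape)
```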
Page 17: "Block View" of a Neural Network
Input x → [Dot Product: *W] → z → [Sigmoid] → output ŷ
W is now a matrix; z is now a vector.
z = W^T x      (x: p×1, W: p×d, z: d×1)
ŷ = sigmoid(z) = e^z / (1 + e^z)
Page 18: Outline
● Neurons
● 1-Layer Neural Network
● Multi-layer Neural Network
● Loss Functions
● Backpropagation
● Nonlinearity Functions
● NNs in Practice
Page 19: [Figure: a multi-layer network mapping input x = (x1, x2, x3) through weights W1, W2, w3 to output ŷ]
Page 20: Multi-Layer Neural Network (Multi-Layer Perceptron (MLP) Network)
[Figure: input x = (x1, x2, x3) → hidden layer (weights W1) → output layer (weights w2) → ŷ]
The weight subscript represents the layer number. This is a 2-layer NN.
Page 21: Multi-Layer Neural Network (MLP)
[Figure: input x = (x1, x2, x3) → 1st hidden layer (W1) → 2nd hidden layer (W2) → output layer (w3) → ŷ]
This is a 3-layer NN.
Page 22: Multi-Layer Neural Network (MLP)
z1 = W1^T x      h1 = sigmoid(z1)      (hidden layer 1 output)
z2 = W2^T h1     h2 = sigmoid(z2)
z3 = w3^T h2     ŷ = sigmoid(z3)
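The three-layer forward pass above can be sketched as follows; the layer sizes and values are illustrative, not from the slide.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
x = rng.normal(size=(3, 1))
W1 = rng.normal(size=(3, 4))   # input -> hidden layer 1
W2 = rng.normal(size=(4, 4))   # hidden layer 1 -> hidden layer 2
w3 = rng.normal(size=(4, 1))   # hidden layer 2 -> scalar output

h1 = sigmoid(W1.T @ x)         # hidden layer 1 output
h2 = sigmoid(W2.T @ h1)        # hidden layer 2 output
y_hat = sigmoid(w3.T @ h2)     # final output, 1 x 1
print(float(y_hat))
```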
Page 23: Multi-Class Output MLP
z1 = W1^T x      h1 = sigmoid(z1)
z2 = W2^T h1     h2 = sigmoid(z2)
z3 = W3^T h2     ŷ = sigmoid(z3)
(The output weights are now a matrix W3, so ŷ is a vector of class outputs.)
Page 24: "Block View" Of MLP
x → [*W1] → z1 → h1 → [*W2] → z2 → h2 → [*W3] → z3 → ŷ
(1st hidden layer, 2nd hidden layer, output layer)
Page 25: "Deep" Neural Networks (i.e. > 1 hidden layer)
Researchers have successfully used 1000 layers to train an object classifier.
Page 26: Outline
● Neurons
● 1-Layer Neural Network
● Multi-layer Neural Network
● Loss Functions
● Backpropagation
● Nonlinearity Functions
● NNs in Practice
Page 27: [Figure: the multi-layer network x → W1, W2, w3 → ŷ, now followed by a loss E(ŷ)]
Page 28: Binary Classification Loss
ŷ = P(y=1|x,W)
E = loss = −log P(Y = y | X = x) = −y log(ŷ) − (1 − y) log(1 − ŷ)
(y is the true output; this example is for a single sample x.)
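The binary loss can be checked numerically; the ŷ values below are made up to show how the loss penalizes confident wrong answers.

```python
import math

def binary_loss(y, y_hat):
    # E = -y*log(y_hat) - (1-y)*log(1-y_hat), for a single sample
    return -y * math.log(y_hat) - (1 - y) * math.log(1 - y_hat)

print(binary_loss(1, 0.9))  # confident and correct: small loss
print(binary_loss(1, 0.1))  # confident and wrong: large loss
```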
Page 29: Regression Loss
E = loss = ½ (y − ŷ)²
(y is the true output.)
Page 30: Multi-Class Classification Loss
ŷi = P(ŷi = 1 | x), computed with the "softmax" function: a normalizing function which converts each class output zi to a probability.
E = loss = − Σ_{j=1..K} yj ln ŷj      (K = 3; y is "0" for all except the true class)
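A sketch of the softmax and the multi-class loss for K = 3; the class scores are illustrative.

```python
import numpy as np

def softmax(z):
    # Normalizing function: converts each class score to a probability
    e = np.exp(z - np.max(z))    # subtract max for numerical stability
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])    # class scores z1, z2, z3
y = np.array([1.0, 0.0, 0.0])    # one-hot: "0" for all except the true class
y_hat = softmax(z)               # probabilities, sum to 1
E = -np.sum(y * np.log(y_hat))   # cross-entropy loss
print(y_hat, E)
```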
Page 31: Outline
● Neurons
● 1-Layer Neural Network
● Multi-layer Neural Network
● Loss Functions
● Backpropagation
● Nonlinearity Functions
● NNs in Practice
Page 32: [Figure: the multi-layer network x → W1, W2, w3 → ŷ, followed by a loss E(ŷ)]
Page 33: Training Neural Networks
How do we learn the optimal weights WL for our task?
● Gradient descent: WL(t+1) = WL(t) − η ∂E/∂WL(t)
But how do we get gradients of lower layers?
● Backpropagation!
  ○ Repeated application of the chain rule of calculus
  ○ Locally minimizes the objective
  ○ Requires all "blocks" of the network to be differentiable
LeCun et al. Efficient BackProp. 1998
Page 34: Backpropagation Intro
http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf
Page 46: Backpropagation Intro
Tells us: by increasing x by a scale of 1, we decrease f by a scale of 4.
http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf
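The worked circuit lives only in the linked cs231n slides; this sketch assumes their running example f(x, y, z) = (x + y)·z with x = −2, y = 5, z = −4, which yields the ∂f/∂x = −4 quoted above.

```python
# Backprop applies the chain rule node by node, right to left.
x, y, z = -2.0, 5.0, -4.0

# Forward pass
q = x + y          # intermediate node, q = 3
f = q * z          # output, f = -12

# Backward pass (chain rule)
df_dq = z          # d(q*z)/dq = z = -4
df_dz = q          # d(q*z)/dz = q = 3
df_dx = df_dq * 1  # dq/dx = 1, so df/dx = -4
df_dy = df_dq * 1  # dq/dy = 1, so df/dy = -4

print(df_dx)       # increasing x by 1 decreases f by a scale of 4
```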
Page 47: Backpropagation (binary classification example)
Example on a 1-hidden-layer NN for binary classification: input x = (x1, x2, x3), weights W1 and w2, output ŷ = P(y=1|x,W).
Page 48: Backpropagation (binary classification example)
x → [*W1] → z1 → h1 → [*w2] → z2 → ŷ
Pages 49 to 62 step through the derivation on this block diagram:
● Gradient descent to minimize the loss E: we need to find ∂E/∂w2 and ∂E/∂W1.
● ∂E/∂w2 = ?? and ∂E/∂W1 = ??  Exploit the chain rule!
● Differentiate the later layer first; terms already computed for ∂E/∂w2 are reused when computing ∂E/∂W1.
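The step-by-step equations are only legible in the original deck; this is a sketch of the gradients such a derivation arrives at, assuming sigmoid activations and the binary loss from Page 28, checked against a numerical gradient. Values are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
x = rng.normal(size=(3, 1))
W1 = rng.normal(size=(3, 4))
w2 = rng.normal(size=(4, 1))
y = 1.0

# Forward: z1 = W1^T x, h1 = sigmoid(z1), z2 = w2^T h1, y_hat = sigmoid(z2)
h1 = sigmoid(W1.T @ x)
y_hat = float(sigmoid(w2.T @ h1))

# Backward (chain rule, later layer first)
dE_dz2 = y_hat - y                   # loss + sigmoid simplify to this scalar
dE_dw2 = dE_dz2 * h1                 # 4 x 1
dE_dh1 = dE_dz2 * w2                 # reuse dE_dz2 (already computed)
dE_dz1 = dE_dh1 * h1 * (1 - h1)      # sigmoid local gradient
dE_dW1 = x @ dE_dz1.T                # 3 x 4, matches W1's shape

# Numerical gradient check on one entry of W1
eps = 1e-6
def loss(W):
    p = float(sigmoid(w2.T @ sigmoid(W.T @ x)))
    return -y * np.log(p) - (1 - y) * np.log(1 - p)
Wp = W1.copy(); Wp[0, 0] += eps
Wm = W1.copy(); Wm[0, 0] -= eps
numeric = (loss(Wp) - loss(Wm)) / (2 * eps)
print(dE_dW1[0, 0], numeric)         # the two should agree closely
```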
Page 63: "Local-ness" of Backpropagation
[Figure: a block f with input x and output y; during the forward pass each block computes its activations and "local gradients", and during the backward pass it combines them with the gradients flowing back from above]
http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf
Page 65: Example: Sigmoid Block
sigmoid(x) = σ(x) = e^x / (1 + e^x); its local gradient is dσ/dx = σ(x)(1 − σ(x)).
http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf
Page 66: Deep Learning = Concatenation of Differentiable Parameterized Layers (linear & nonlinearity functions)
x → [*W1] → z1 → h1 → [*W2] → z2 → h2 → [*W3] → z3 → ŷ
(1st hidden layer, 2nd hidden layer, output layer)
We want to find the optimal weights W to minimize some loss function E!
Page 67: Backprop Whiteboard Demo
[Figure: a 2-input, 2-hidden-unit, 1-output network with weights w1..w6, biases b1, b2, b3, and blocks f1..f4]
z1 = x1·w1 + x2·w3 + b1
z2 = x1·w2 + x2·w4 + b2
h1 = exp(z1) / (1 + exp(z1))
h2 = exp(z2) / (1 + exp(z2))
ŷ = h1·w5 + h2·w6 + b3
E = (y − ŷ)²
Update rule: w(t+1) = w(t) − η ∂E/∂w(t), where ∂E/∂w = ??
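The whiteboard network is small enough to run end to end. All numeric values below (inputs, weights, learning rate η) are made up, and only one weight per layer is differentiated as an example.

```python
import math

def sigmoid(z):
    # The slide's form: exp(z) / (1 + exp(z))
    return math.exp(z) / (1.0 + math.exp(z))

x1, x2, y = 1.0, -1.0, 0.5
w1, w2, w3, w4, w5, w6 = 0.1, 0.2, -0.1, 0.3, 0.5, -0.4
b1, b2, b3 = 0.0, 0.0, 0.1

# Forward (blocks f1..f4)
z1 = x1 * w1 + x2 * w3 + b1
z2 = x1 * w2 + x2 * w4 + b2
h1, h2 = sigmoid(z1), sigmoid(z2)
y_hat = h1 * w5 + h2 * w6 + b3
E = (y - y_hat) ** 2

# Backward: one weight from each layer, by the chain rule
dE_dyhat = -2 * (y - y_hat)
dE_dw5 = dE_dyhat * h1                       # output-layer weight
dE_dw1 = dE_dyhat * w5 * h1 * (1 - h1) * x1  # hidden-layer weight

# Update: w(t+1) = w(t) - eta * dE/dw(t)
eta = 0.1
w5 = w5 - eta * dE_dw5
w1 = w1 - eta * dE_dw1
print(E, w5, w1)
```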
Page 68: Outline
● Neurons
● 1-Layer Neural Network
● Multi-layer Neural Network
● Loss Functions
● Backpropagation
● Nonlinearity Functions
● NNs in Practice
Page 69: Nonlinearity Functions (i.e. transfer or activation functions)
[Figure: inputs x1, x2, x3 multiplied by weights w1, w2, w3 and summed with bias b1 (the summing function Σ) to give z = x﹡w; the sigmoid function is one choice of nonlinearity applied to z]
Page 70: Nonlinearity Functions (i.e. transfer or activation functions)
[Table over Pages 70–72: name, plot, equation, and derivative (w.r.t. x) of common activation functions applied to x﹡w; one row is annotated "usually works best in practice"]
https://en.wikipedia.org/wiki/Activation_function#Comparison_of_activation_functions
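The table's rows did not survive extraction; as a sketch (not necessarily the slide's exact list), here are three common activation functions and their derivatives w.r.t. x.

```python
import numpy as np

# sigmoid(x) = 1/(1+e^-x),  derivative: sigmoid(x) * (1 - sigmoid(x))
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1 - s)

# tanh(x),  derivative: 1 - tanh(x)^2
def d_tanh(x):
    return 1.0 - np.tanh(x) ** 2

# relu(x) = max(0, x),  derivative: 1 if x > 0 else 0
def relu(x):
    return np.maximum(0.0, x)

def d_relu(x):
    return (x > 0).astype(float)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x), d_sigmoid(x))
print(np.tanh(x), d_tanh(x))
print(relu(x), d_relu(x))
```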
Page 73: Outline
● Neurons
● 1-Layer Neural Network
● Multi-layer Neural Network
● Loss Functions
● Backpropagation
● Nonlinearity Functions
● NNs in Practice
Page 74: Neural Net Pipeline
1. Initialize weights
2. For each batch of input x samples S:
   a. Run the network "forward" on S to compute outputs and loss
   b. Run the network "backward" using outputs and loss to compute gradients
   c. Update weights using SGD (or a similar method)
3. Repeat step 2 until loss convergence
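The pipeline above can be sketched as a full-batch variant (a simplification of mini-batch SGD) on a toy problem; the layer sizes, data, and learning rate are all made up.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                  # 100 samples, p = 3
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0   # toy targets

# 1. Initialize weights
W1 = 0.1 * rng.normal(size=(3, 4))
w2 = 0.1 * rng.normal(size=(4, 1))
eta = 0.5

losses = []
for epoch in range(200):                       # 3. repeat until convergence
    # 2a. Forward: outputs and loss (mean squared error here)
    H = sigmoid(X @ W1)                        # 100 x 4
    y_hat = sigmoid(H @ w2)                    # 100 x 1
    losses.append(float(np.mean((y - y_hat) ** 2)))
    # 2b. Backward: gradients via the chain rule
    d_yhat = 2 * (y_hat - y) / len(X)
    d_z2 = d_yhat * y_hat * (1 - y_hat)
    d_w2 = H.T @ d_z2
    d_H = d_z2 @ w2.T
    d_z1 = d_H * H * (1 - H)
    d_W1 = X.T @ d_z1
    # 2c. Update weights with a gradient step
    w2 -= eta * d_w2
    W1 -= eta * d_W1

print(losses[0], losses[-1])                   # the loss should decrease
```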
Page 75: Non-Convexity of Neural Nets
In very high dimensions, there exist many local minima which are about the same.
Pascanu et al. On the saddle point problem for non-convex optimization. 2014
Page 76: Building Deep Neural Nets
[Figure: composing blocks f with inputs x and outputs y into larger networks]
http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf
Page 77: Building Deep Neural Nets
"GoogLeNet" for Object Classification
Page 78: Block Example Implementation
http://cs231n.stanford.edu/slides/winter1516_lecture5.pdf
Page 79: Advantage of Neural Nets
As long as the network is fully differentiable, we can train the model to automatically learn features for us.