Machine Learning & Neural Networks
Transcript of lecture slides: cs.brown.edu/courses/cs016/static/files/lectures/slides/22_neural_networks.pdf
Machine Learning &
Neural Networks
CS16: Introduction to Data Structures & Algorithms
Spring 2020
Outline
‣ Overview
‣ Artificial Neurons
‣ Single-Layer Perceptrons
‣ Multi-Layer Perceptrons
‣ Overfitting and Generalization
‣ Applications
What do you think of when you hear
“Machine Learning”?
Bobby: “Alexa, play Despacito.”
Artificial Intelligence vs. Machine Learning
What does it mean for machines to learn?
‣ Can machines think?
‣ Difficult question to answer because the definition of “think” is vague:
‣ Ability to process information/perform calculations
‣ Ability to arrive at ‘intelligent’ results
‣ Replication of the ‘intelligent’ process
Let’s Think About This Differently
‣ A machine learns when its performance at a particular task improves with experience
‣ Alan Turing, in “Computing Machinery and Intelligence” (1950)
‣ proposed that we instead consider the question, “Can machines do what we (as thinking entities) do?”
‣ Turing’s test: the Imitation Game
Machine Learning Algorithm Structure
‣ Three key components:
‣ Representation: define a space of possible programs
‣ Loss function: decide how to score a program’s performance
‣ Optimizer: decide how to search the space for the program with the best score (lowest loss)
‣ Let’s revisit decision trees:
‣ Representation: space of possible trees that can be built using attributes of the dataset as internal nodes and outcomes as leaf nodes
‣ Loss function: percent of testing examples misclassified
‣ Optimizer: choose attribute that maximizes information gain
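The three components can be sketched as plain Python callables. The `Learner` class, `exhaustive_search`, and the toy threshold task below are our own illustration, not from the course:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Learner:
    representation: list    # the space of candidate programs
    loss: Callable          # scores a candidate program on a dataset
    optimize: Callable      # searches the space for the best candidate

def exhaustive_search(learner, data):
    # Only feasible when the space is small enough to enumerate;
    # real optimizers (information gain, gradient descent) search smarter.
    return min(learner.representation, key=lambda h: learner.loss(h, data))

# Toy task: pick the threshold t that best classifies 1-D points as "x >= t"
points = [(0.5, 0), (1.5, 0), (2.5, 1), (3.5, 1)]
loss = lambda t, d: sum((x >= t) != y for x, y in d)   # count misclassified
learner = Learner([0, 1, 2, 3, 4], loss, exhaustive_search)
best = learner.optimize(learner, points)
print(best)  # 2: every point is classified correctly at this threshold
```

The same three slots describe decision trees: the representation is the space of trees, the loss is the misclassification rate, and the optimizer greedily picks attributes by information gain.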
Neurons
‣ The brain has roughly 100 billion neurons
‣ Each neuron is connected to thousands of other neurons by synapses
‣ If a neuron’s electrical potential is high enough, the neuron activates and fires
‣ Each neuron is very simple
‣ it either fires or not depending on its potential
‣ but together they form a very complex “machine”
Neuron Anatomy (…very simplified)
(figure: a neuron’s dendrites, cell body, axon, and axon terminals)
Artificial Neuron
(figure: each input, plus a bias input fixed at -1, is multiplied by a weight; the inner product of weights and inputs feeds an activation 𝞅 that outputs 1 if the sum is larger than the threshold, else 0)
Artificial Neuron
‣ The bias b allows us to control the threshold of 𝞅
‣ we can change the threshold by changing the weight/bias b
‣ this will simplify how we describe the learning process
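In code, such a neuron is just an inner product followed by a threshold test. A minimal sketch (the name `neuron_output` is ours), with the bias folded in as weight w0 on a constant input x0 = -1:

```python
def neuron_output(weights, inputs):
    # Inner product of weights and inputs, then the step activation:
    # output 1 iff the weighted sum is strictly positive. The bias
    # weight w0 on the constant input x0 = -1 shifts the sum, so the
    # effective threshold is learned like any other weight.
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

# weights = [w0 (bias), w1, w2]; inputs = [x0 = -1, x1, x2]
print(neuron_output([0.5, 0.3, 0.9], [-1, 1, 1]))  # 1: 0.3 + 0.9 - 0.5 > 0
print(neuron_output([0.5, 0.1, 0.1], [-1, 1, 1]))  # 0: 0.1 + 0.1 - 0.5 <= 0
```

Note the strict inequality: a sum of exactly 0 outputs 0, matching the worked examples later where 𝞅(0)=0.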
The Perceptron (Rosenblatt, 1957)
Perceptron Network
(figure: inputs x1–x4 and a bias input -1 feed three neurons N, which produce outputs y1–y3)
Perceptron Network
(figure: the same network, with one neuron’s incoming edges labeled with weights w0–w4; the bias input is x0 = -1, carrying weight w0)
Training a Perceptron
‣ What does it mean for a perceptron to learn?
‣ as we feed it more examples (i.e., input + classification pairs)
‣ it should get better at classifying inputs
‣ Examples have the form (x1,…,xn,t)
‣ where t is the “target” classification (the right classification)
‣ How can we use examples to improve an (artificial) neuron?
‣ which aspects of a neuron can we change/improve?
‣ how can we get the neuron to output something closer to the target value?
Perceptron Network
(figure: the network’s output y1 is compared with the target t, and the result of the comparison is used to update that neuron’s weights)
Perceptron Training
‣ Set all weights to small random values (positive and negative)
‣ For each training example (x1,…,xn,t)
‣ feed (x1,…,xn) to a neuron and get a result y
‣ if y=t then we don’t need to do anything!
‣ if y<t then we need to increase the neuron’s weights
‣ if y>t then we need to decrease the neuron’s weights
‣ We do this with the following update rule
Artificial Neuron Update Rule
‣ Each weight is updated as wi ← wi + Δi, where Δi = η(t−y)xi
‣ If y=t then Δi=0 and wi is unchanged
‣ if y<t and xi>0 then Δi>0 and wi increases by Δi
‣ if y>t and xi>0 then Δi<0 and wi decreases by |Δi|
‣ What happens when xi<0?
‣ the last two cases are inverted! why?
‣ recall that wi gets multiplied by xi, so when xi<0, making y increase requires decreasing wi!
![Page 22: Machine Learning & Neural Networkscs.brown.edu/courses/cs016/static/files/lectures/slides/22_neural_networks.pdfAxon Cell Body Axon Terminals. Artificial Neuron 10. Artificial Neuron](https://reader033.fdocuments.net/reader033/viewer/2022052013/602a245f1540f6419e5c2f06/html5/thumbnails/22.jpg)
Artificial Neuron Update Rule
‣ What is η (the learning rate) for?
‣ to control by how much wi should increase or decrease
‣ if η is large then errors will cause weights to be changed a lot
‣ if η is small then errors will cause weights to be changed a little
‣ a large η increases the speed at which a neuron learns, but also increases its sensitivity to errors in the data
![Page 23: Machine Learning & Neural Networkscs.brown.edu/courses/cs016/static/files/lectures/slides/22_neural_networks.pdfAxon Cell Body Axon Terminals. Artificial Neuron 10. Artificial Neuron](https://reader033.fdocuments.net/reader033/viewer/2022052013/602a245f1540f6419e5c2f06/html5/thumbnails/23.jpg)
Perceptron Training Pseudocode
Perceptron(data, neurons, k):
    for round from 1 to k:
        for each training example in data:
            for each neuron in neurons:
                y = output of feeding example to neuron
                for each weight of neuron:
                    update weight
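Specialized to a single neuron, the pseudocode becomes runnable with the update rule wi ← wi + η(t−y)xi. A sketch (the function names and the η = 0.5 default are our choices):

```python
import random

def step(total):
    # threshold activation: fire iff the weighted sum is positive
    return 1 if total > 0 else 0

def train_perceptron(data, n_inputs, k, eta=0.5):
    """k rounds over examples (x1, ..., xn, t); x0 = -1 carries the bias."""
    random.seed(0)  # fixed seed so this sketch is reproducible
    # small random initial weights, positive and negative
    weights = [random.uniform(-0.1, 0.1) for _ in range(n_inputs + 1)]
    for _ in range(k):
        for example in data:
            xs = [-1] + list(example[:-1])          # prepend the bias input
            t = example[-1]
            y = step(sum(w * x for w, x in zip(weights, xs)))
            # update rule: w_i <- w_i + eta * (t - y) * x_i
            weights = [w + eta * (t - y) * x for w, x in zip(weights, xs)]
    return weights

# Example: the OR function (also used in the activity slides)
data = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1)]
w = train_perceptron(data, n_inputs=2, k=50)
```

Because this data is linearly separable, the perceptron convergence theorem guarantees the loop stops making mistakes after finitely many updates.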
![Page 24: Machine Learning & Neural Networkscs.brown.edu/courses/cs016/static/files/lectures/slides/22_neural_networks.pdfAxon Cell Body Axon Terminals. Artificial Neuron 10. Artificial Neuron](https://reader033.fdocuments.net/reader033/viewer/2022052013/602a245f1540f6419e5c2f06/html5/thumbnails/24.jpg)
Perceptron Training
Activity #1 (3 min)
Initial weights: w0 = -0.5, w1 = -0.5, w2 = -0.5 (bias input -1), with η = 0.5
x1 x2 t
0  0  0
0  1  1
1  0  1
1  1  1
![Page 25: Machine Learning & Neural Networkscs.brown.edu/courses/cs016/static/files/lectures/slides/22_neural_networks.pdfAxon Cell Body Axon Terminals. Artificial Neuron 10. Artificial Neuron](https://reader033.fdocuments.net/reader033/viewer/2022052013/602a245f1540f6419e5c2f06/html5/thumbnails/25.jpg)
Perceptron Training
‣ Example (-1,0,0,0)
‣ y=𝞅(-1×-0.5+0×-0.5+0×-0.5)=𝞅(0.5)=1
‣ w0=-0.5+0.5(0-1)×-1=0
‣ w1=-0.5+0.5(0-1)×0=-0.5
‣ w2=-0.5+0.5(0-1)×0=-0.5
‣ Example (-1,0,1,1)
‣ y=𝞅(-1×0+0×-0.5+1×-0.5)=𝞅(-0.5)=0
‣ w0=0+0.5(1-0)×-1=-0.5
‣ w1=-0.5+0.5(1-0)×0=-0.5
‣ w2=-0.5+0.5(1-0)×1=0
Perceptron Training
‣ Example (-1,1,0,1)
‣ y=𝞅(-1×-0.5+1×-0.5+0×0)=𝞅(0)=0
‣ w0=-0.5+0.5(1-0)×-1=-1
‣ w1=-0.5+0.5(1-0)×1=0
‣ w2=0+0.5(1-0)×0=0
‣ Example (-1,1,1,1)
‣ y=𝞅(-1×-1+1×0+1×0)=𝞅(1)=1
‣ w0=-1
‣ w1=0
‣ w2=0
Perceptron Training
‣ Are we done?
‣ No!
‣ the perceptron was wrong on examples (0,0,0), (0,1,1), and (1,0,1)
‣ so we keep going until the weights stop changing, or change only by very small amounts (convergence)
‣ For sanity, check whether our final weights correctly classify (0,0,0)
‣ w0=-1, w1=0, w2=0
‣ y=𝞅(-1×-1+0×0+0×0)=𝞅(1)=1, but t=0, so the weights are not yet correct and another round is needed
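The hand computation above can be replayed mechanically (a small sketch; the helper names are ours):

```python
def step(total):
    return 1 if total > 0 else 0

def update(weights, example, eta=0.5):
    # one application of w_i <- w_i + eta * (t - y) * x_i
    *xs, t = example                  # example = (x0, x1, x2, t), x0 = -1
    y = step(sum(w * x for w, x in zip(weights, xs)))
    return [w + eta * (t - y) * x for w, x in zip(weights, xs)]

w = [-0.5, -0.5, -0.5]                # initial weights from the activity
for example in [(-1, 0, 0, 0), (-1, 0, 1, 1), (-1, 1, 0, 1), (-1, 1, 1, 1)]:
    w = update(w, example)
print(w)  # [-1.0, 0.0, 0.0], matching the slides' final weights
```

Running the sanity check in code confirms the slides: these weights still output 1 on (0,0), whose target is 0, so training must continue.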
Single-Layer Perceptron
(figure: inputs x1–x4 and a bias input -1 feed a single layer of three neurons N, producing outputs y1–y3)
Limits of Single-Layer Perceptrons
‣ Perceptrons are limited
‣ there are many functions they cannot learn
‣ To better understand their power and limitations, it helps to take a geometric view
‣ Plot the classifications of all possible inputs in the plane (or in a higher-dimensional space)
‣ a perceptron can learn the function only if the classifications can be separated by a line (or hyperplane)
‣ i.e., the data is linearly separable
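The limit is easy to see empirically with XOR: its classifications cannot be split by a line, so no amount of perceptron training produces a correct single neuron (a sketch with our own helper names):

```python
def step(total):
    return 1 if total > 0 else 0

def classify(w, x1, x2):
    # w = [w0 (bias), w1, w2]; the bias input is the constant -1
    return step(w[0] * -1 + w[1] * x1 + w[2] * x2)

def train(data, w, k, eta=0.5):
    # standard perceptron rule: w_i <- w_i + eta * (t - y) * x_i
    for _ in range(k):
        for x1, x2, t in data:
            d = eta * (t - classify(w, x1, x2))
            w = [w[0] + d * -1, w[1] + d * x1, w[2] + d * x2]
    return w

# XOR: true iff exactly one input is 1 — not linearly separable
xor_data = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
w = train(xor_data, [0.1, -0.1, 0.1], k=100)
errors = sum(classify(w, x1, x2) != t for x1, x2, t in xor_data)
print(errors)  # always >= 1, no matter how long we train
```

Whatever weights training ends with, a single threshold unit draws one line through the plane, and no line puts (0,1) and (1,0) on one side with (0,0) and (1,1) on the other.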
Linearly-Separable Classifications
(figure: example classifications that can be separated by a line)
Single-Layer Perceptrons
‣ In 1969, Minsky and Papert published
‣ Perceptrons: An Introduction to Computational Geometry
‣ In it they proved that single-layer perceptrons
‣ could not learn some simple functions
‣ This really hurt research in neural networks…
‣ …many became pessimistic about their potential
Multi-Layer Perceptron
(figure: inputs x1–x4 and a bias input -1 feed a hidden layer of three neurons; the hidden outputs, along with another bias input -1, feed an output layer of three neurons producing y1–y3)
![Page 34: Machine Learning & Neural Networkscs.brown.edu/courses/cs016/static/files/lectures/slides/22_neural_networks.pdfAxon Cell Body Axon Terminals. Artificial Neuron 10. Artificial Neuron](https://reader033.fdocuments.net/reader033/viewer/2022052013/602a245f1540f6419e5c2f06/html5/thumbnails/34.jpg)
Training Multi-Layer Perceptrons
‣ Harder to train than a single-layer perceptron
‣ if output is wrong, do we update weights of hidden neuron
or of output neuron? or both?
‣ update rule for neuron requires knowledge of target but
there is no target for hidden neurons
‣ MLPs are trained with stochastic gradient descent (SGD) using
backpropagation
‣ invented in 1986 by Rumelhart, Hinton and Williams
‣ technique was known before but Rumelhart et al. showed
precisely how it could be used to train MLPs
Training by Backpropagation
(figure: the multi-layer network’s outputs are compared with the target t; the comparison updates the output layer’s weights, and the error is propagated backward to update the hidden layer’s weights)
![Page 37: Machine Learning & Neural Networkscs.brown.edu/courses/cs016/static/files/lectures/slides/22_neural_networks.pdfAxon Cell Body Axon Terminals. Artificial Neuron 10. Artificial Neuron](https://reader033.fdocuments.net/reader033/viewer/2022052013/602a245f1540f6419e5c2f06/html5/thumbnails/37.jpg)
Training Multi-Layer Perceptrons
‣ Specifics of the algorithm are beyond CS16
‣ covered in CS142 and CS147
‣ Architecture depends on your task and inputs
‣ oftentimes, more layers don’t seem to add much more power
‣ tradeoff between complexity and number of parameters needed to tune
‣ Other kinds of neural nets
‣ convolutional neural nets (image & video recognition)
‣ recurrent neural nets (speech recognition)
‣ many, many more
![Page 38: Machine Learning & Neural Networkscs.brown.edu/courses/cs016/static/files/lectures/slides/22_neural_networks.pdfAxon Cell Body Axon Terminals. Artificial Neuron 10. Artificial Neuron](https://reader033.fdocuments.net/reader033/viewer/2022052013/602a245f1540f6419e5c2f06/html5/thumbnails/38.jpg)
Overfitting
‣ A challenge in ML is deciding how much to train a model
‣ if a model is overtrained then it can overfit the training data
‣ which can lead it to make mistakes on new/unseen inputs
‣ Why does this happen?
‣ training data can contain errors and noise
‣ if model overfits training data then it “learns” those errors and noise
‣ and won’t do as well on new unseen inputs
‣ for more on overfitting see
‣ https://www.youtube.com/watch?v=DQWI1kvmwRg
Overfitting & Generalization
‣ So how do we know when to stop training?
‣ one approach is to use the early stopping technique
‣ Split the training examples into 3 sets
‣ a training set (50%), a validation set (25%), a testing set (25%)
‣ Train on the training set but
‣ every 5 rounds, run NN on validation set
‣ compute the NN’s error over the entire validation set
‣ compare current error to previous error
‣ if error is increasing, stop and use previous version of NN
Early Stopping
(figure: training and validation error curves over training rounds; training stops once validation error begins to rise)
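The early-stopping recipe can be sketched generically. Here `train_step` and `val_error` are caller-supplied callbacks, and the toy "model" is just a round counter whose validation error bottoms out at round 20 (all names and the toy are our illustration):

```python
def train_with_early_stopping(train_step, val_error, model, max_rounds, check_every=5):
    """Train, but every `check_every` rounds compare validation error to the
    previous check; once it rises, roll back to the last good snapshot."""
    best, prev_err = model, val_error(model)
    rounds = 0
    while rounds < max_rounds:
        for _ in range(check_every):
            model = train_step(model)
        rounds += check_every
        err = val_error(model)
        if err > prev_err:            # validation error rising: overfitting
            return best               # use the previous version of the NN
        best, prev_err = model, err
    return model

# Toy stand-ins: one training step advances the model by one "round",
# and validation error falls until round 20, then rises again.
train_step = lambda m: m + 1
val_error = lambda m: abs(m - 20)
print(train_with_early_stopping(train_step, val_error, 0, max_rounds=100))  # 20
```

The snapshot-and-compare structure is the whole idea; in a real NN the "model" would be the weight vectors and `val_error` would run the network over the held-out validation set.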
Applications
‣ Musical composition
‣ Daniel Johnson – composing music using a recurrent neural network (RNN)
Applications (continued)
‣ Style Transfer
Applications
‣ Advertising
‣ Credit card fraud detection
‣ Skin-cancer diagnosis
‣ Predicting earthquakes
‣ Lip-reading from video
‣ Even…neural networks to help you write neural
networks! (Neural Complete)
Questions?