Artificial Neural Networks - Borenstein Lab
Genome 559: Introduction to Statistical and
Computational Genomics
Elhanan Borenstein
Artificial Neural Networks
Some slides adapted from Geoffrey Hinton and Igor Aizenberg
A quick review

- Ab initio gene prediction
  - Parameters:
    - Splice donor sequence model
    - Splice acceptor sequence model
    - Intron and exon length distribution
    - Open reading frame
    - More …
- Markov chain
  - States
  - Transition probabilities
- Hidden Markov Model (HMM)
Machine learning
“A field of study that gives computers the ability to learn without being explicitly programmed.” (Arthur Samuel, 1959)
Tasks best solved by learning algorithms
- Recognizing patterns:
  - Facial identities or facial expressions
  - Handwritten or spoken words
- Recognizing anomalies:
  - Unusual sequences of credit card transactions
- Prediction:
  - Future stock prices
  - Predicting phenotype based on markers
    - Genetic association, diagnosis, etc.
Why machine learning?
- It is very hard to write programs that solve problems like recognizing a face.
  - We don’t know what program to write.
  - Even if we had a good idea of how to do it, the program might be horrendously complicated.
- Instead of writing a program by hand, we collect lots of examples for which we know the correct output.
  - A machine learning algorithm then takes these examples, trains, and “produces a program” that does the job.
  - If we do it right, the program works for new cases as well as the ones we trained it on.
Why neural networks?
- One of those things you always hear about but never quite know what they actually are…
- A good example of a machine learning framework
- In and out of fashion …
- An important part of machine learning history
- A powerful framework
The goals of neural computation
1. To understand how the brain actually works
   - Neuroscience is hard!
2. To develop a new style of computation
   - Inspired by neurons and their adaptive connections
   - A very different style from sequential computation
3. To solve practical problems by developing novel learning algorithms
   - Learning algorithms can be very useful even if they have nothing to do with how the brain works
How the brain works (sort of)
- Each neuron receives inputs from many other neurons
- Cortical neurons use spikes to communicate
- Neurons spike once they “aggregate enough stimuli” through input spikes
- The effect of each input spike on the neuron is controlled by a synaptic weight. Weights can be positive or negative
- Synaptic weights adapt so that the whole network learns to perform useful computations
- A huge number of weights can affect the computation in a very short time. Much better bandwidth than a computer.
A typical cortical neuron
- Physical structure:
  - There is one axon that branches
  - There is a dendritic tree that collects input from other neurons
- Axons typically contact dendritic trees at synapses
- A spike of activity in the axon causes a charge to be injected into the post-synaptic neuron

[Diagram: a cortical neuron with its axon, cell body, and dendritic tree]
Idealized Neuron
[Diagram: inputs X1, X2, X3 with weights w1, w2, w3 feed a summation unit Σ whose output is Y]

- Basically, a weighted sum!

    y = Σ_i w_i x_i
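The weighted sum is one line of code. As a quick sketch in Python (the function name is illustrative, not from the slides):

```python
def neuron_output(inputs, weights):
    """Idealized neuron: the output is just the weighted sum of the inputs."""
    return sum(x * w for x, w in zip(inputs, weights))

# Three inputs, three weights: 1.0*0.2 + 0.5*0.4 + 2.0*0.1 = 0.6
y = neuron_output([1.0, 0.5, 2.0], [0.2, 0.4, 0.1])
```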
Adding bias
[Diagram: the same neuron, now with a bias term b in the summation unit]

- The function does not have to pass through the origin

    y = Σ_i w_i x_i − b
Adding an “activation” function
[Diagram: the neuron now applies an activation function φ to its field z = Σ_i w_i x_i − b]

- The “field” of the neuron, z, goes through an activation function φ:

    y = φ(Σ_i w_i x_i − b)
Common activation functions

- Linear activation:

    φ(z) = z

- Threshold (sign) activation:

    φ(z) = sign(z) = +1 if z ≥ 0, −1 if z < 0

- Hyperbolic tangent activation:

    φ(u) = tanh(γu/2) = (1 − e^(−γu)) / (1 + e^(−γu))

- Logistic activation:

    φ(z) = 1 / (1 + e^(−αz))
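These four activations are easy to write down directly. A minimal Python sketch (γ and α are the slope parameters from the formulas above; default values are illustrative):

```python
import math

def linear(z):
    return z

def threshold(z):
    # sign activation: +1 if z >= 0, else -1
    return 1 if z >= 0 else -1

def tanh_activation(u, gamma=1.0):
    # tanh(gamma*u/2), written in the exponential form above
    return (1 - math.exp(-gamma * u)) / (1 + math.exp(-gamma * u))

def logistic(z, alpha=1.0):
    # squashes any real z into the interval (0, 1)
    return 1.0 / (1.0 + math.exp(-alpha * z))
```

The threshold unit gives hard ±1 decisions; tanh and the logistic function are smooth versions of it, which matters later for back-propagation.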
McCulloch-Pitts neurons

- Introduced in 1943 (and influenced von Neumann!)
- Threshold activation function
- Restricted to binary inputs and outputs

    z = Σ_i w_i x_i − b;   y = 1 if z > 0, 0 otherwise

w1=1, w2=1, b=1.5 computes X1 AND X2:

    X1  X2  y
     0   0  0
     0   1  0
     1   0  0
     1   1  1

w1=1, w2=1, b=0.5 computes X1 OR X2:

    X1  X2  y
     0   0  0
     0   1  1
     1   0  1
     1   1  1
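A McCulloch-Pitts unit is a few lines of Python, and with the weight/bias settings above it reproduces both truth tables (a sketch of the idea, not the original 1943 formulation):

```python
def mp_neuron(inputs, weights, bias):
    """McCulloch-Pitts neuron: binary inputs, threshold activation, binary output."""
    z = sum(x * w for x, w in zip(inputs, weights)) - bias
    return 1 if z > 0 else 0

def AND(x1, x2):
    return mp_neuron([x1, x2], [1, 1], 1.5)   # both inputs needed to exceed 1.5

def OR(x1, x2):
    return mp_neuron([x1, x2], [1, 1], 0.5)   # a single active input exceeds 0.5
```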
Beyond binary neurons
(The same AND and OR neurons as on the previous slide, shown geometrically.)

[Plots: the four input points (0,0), (0,1), (1,0), (1,1) in the X1-X2 plane, with the AND and OR decision boundaries drawn as straight lines separating the points with output 1 from those with output 0]
Beyond binary neurons
[Plot: a straight line in the X1-X2 plane separating two classes of points]

- A general classifier
  - The weights determine the slope
  - The bias determines the distance from the origin

But … how would we know how to set the weights and the bias?
(Note: the bias can be represented as an additional input.)
Perceptron learning
- Use a “training set” and let the perceptron learn from its mistakes
- Training set: a set of input data for which we know the correct answer/classification!
- Learning principle: whenever the perceptron is wrong, make a small correction to the weights in the right direction.
- Note: this is supervised learning
- Training set vs. testing set
Perceptron learning
1. Initialize the weights and threshold (e.g., with small random values).
2. Take an input X and its desired output d from the training set.
3. Calculate the actual output, y.
4. Adapt the weights: wi(t+1) = wi(t) + α(d − y)xi for all weights.
   α is the learning rate (don’t overshoot!)

Repeat steps 3 and 4 until the error d − y is smaller than a user-specified
threshold, or a predetermined number of iterations have been completed.

If a solution exists, the perceptron is guaranteed to converge!
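The four steps translate almost line-for-line into Python. A sketch (the bias is folded in as an extra, always-on input, as the previous slide noted):

```python
def train_perceptron(data, alpha=0.1, epochs=100):
    """Train a single perceptron.

    data: list of (inputs, desired_output) pairs, outputs 0 or 1.
    The bias is handled as one extra weight on a constant input of 1.
    """
    n_inputs = len(data[0][0])
    w = [0.0] * (n_inputs + 1)                    # step 1: initialize weights
    for _ in range(epochs):
        mistakes = 0
        for x, d in data:                         # step 2: take (X, d) pairs
            xb = list(x) + [1.0]                  # append the bias input
            z = sum(wi * xi for wi, xi in zip(w, xb))
            y = 1 if z > 0 else 0                 # step 3: actual output
            if y != d:
                mistakes += 1
                # step 4: w_i(t+1) = w_i(t) + alpha * (d - y) * x_i
                w = [wi + alpha * (d - y) * xi for wi, xi in zip(w, xb)]
        if mistakes == 0:                         # all training cases correct
            break
    return w

# Example: learn AND from its truth table
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w = train_perceptron(and_data)
```

On the (linearly separable) AND data this converges to weights that classify all four cases correctly, as the convergence guarantee promises.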
Linear separability
- What about the XOR function?
- Or other classification problems that are not linearly separable, such as:

[Plot: two classes of points that no single straight line can separate]
Multi-layer feed-forward networks
- We can connect several neurons, where the output of some is the input of others.
Solving the XOR problem
- Only 3 neurons are required!

[Diagram: X1 and X2 feed two hidden threshold neurons (biases 0.5 and 1.5), whose outputs feed an output neuron (bias 0.5) with weights +1 and −1, computing XOR]
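One consistent way to wire those 3 threshold neurons (the exact weight labels in the slide figure are ambiguous in this transcript, so treat this as one valid assignment): the two hidden neurons compute OR and AND, and the output neuron fires when OR is on but AND is off.

```python
def threshold_neuron(inputs, weights, bias):
    z = sum(x * w for x, w in zip(inputs, weights)) - bias
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    h_or  = threshold_neuron([x1, x2], [1, 1], 0.5)   # hidden neuron: X1 OR X2
    h_and = threshold_neuron([x1, x2], [1, 1], 1.5)   # hidden neuron: X1 AND X2
    # output neuron: OR minus AND, i.e. fires iff exactly one input is on
    return threshold_neuron([h_or, h_and], [1, -1], 0.5)
```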
In fact …
- With one hidden layer you can solve ANY classification task!
- But … how do you find the right set of weights?
  (Note: we only have an error delta for the output neuron.)
- This problem caused this framework to fall out of favor … until …
Back-propagation
Main idea:
- First, propagate a training input data point forward to get the calculated output
- Compare the calculated output with the desired output to get the error (delta)
- Now, propagate the error back through the network to get an error estimate for each neuron
- Update the weights accordingly
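For concreteness, here is a minimal sketch of those four steps for a tiny 2-input, 2-hidden-unit, 1-output network with logistic (sigmoid) activations. The network layout, weight storage, and step size are illustrative assumptions, not taken from the slides; the error is pushed back using the sigmoid derivative y(1 − y).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, net):
    """Forward pass: 2 inputs -> 2 sigmoid hidden units -> 1 sigmoid output."""
    W1, b1, W2, b2 = net["W1"], net["b1"], net["W2"], net["b2"]
    h = [sigmoid(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(2)]
    y = sigmoid(W2[0] * h[0] + W2[1] * h[1] + b2[0])
    return h, y

def backprop_step(x, d, net, alpha=0.5):
    """One back-propagation update, following the four steps above."""
    h, y = forward(x, net)                        # 1. propagate forward
    delta_out = (d - y) * y * (1 - y)             # 2. output error (delta)
    delta_h = [delta_out * net["W2"][j] * h[j] * (1 - h[j])
               for j in range(2)]                 # 3. propagate error back
    for j in range(2):                            # 4. update weights and biases
        net["W2"][j] += alpha * delta_out * h[j]
        net["W1"][j][0] += alpha * delta_h[j] * x[0]
        net["W1"][j][1] += alpha * delta_h[j] * x[1]
        net["b1"][j] += alpha * delta_h[j]
    net["b2"][0] += alpha * delta_out             # b2 kept in a list so it mutates
```

Repeated steps on a training example move the calculated output toward the desired one, which is exactly what the weight-update direction is chosen to do.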
Types of connectivity
- Feed-forward networks
  - Compute a series of transformations
  - Typically, the first layer is the input and the last layer is the output.
- Recurrent networks
  - Include directed cycles in their connection graph.
  - Complicated dynamics.
  - Memory.
  - More biologically realistic?

[Diagram: a feed-forward network with a layer of input units, a layer of hidden units, and a layer of output units]
Computational representation of networks

- Which is the most useful representation?

[Example graph: edges A→C, C→B, D→B, D→C]

Connectivity matrix (row → column):

       A  B  C  D
    A  0  0  1  0
    B  0  0  0  0
    C  0  1  0  0
    D  0  1  1  0

List of edges: (ordered) pairs of nodes

    [ (A,C), (C,B), (D,B), (D,C) ]

Object-oriented: each node object stores its name and pointers to its neighbors

    A: neighbors [C]
    B: neighbors []
    C: neighbors [B]
    D: neighbors [B, C]
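The three representations, sketched in Python for the same four-node graph (class and variable names are illustrative):

```python
# Connectivity (adjacency) matrix: matrix[i][j] == 1 means an edge row -> column
nodes = ["A", "B", "C", "D"]
matrix = [[0, 0, 1, 0],   # A -> C
          [0, 0, 0, 0],   # B has no outgoing edges
          [0, 1, 0, 0],   # C -> B
          [0, 1, 1, 0]]   # D -> B, D -> C

# List of (ordered) pairs of nodes
edges = [("A", "C"), ("C", "B"), ("D", "B"), ("D", "C")]

# Object-oriented: each node object holds pointers to its neighbors
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []

node = {n: Node(n) for n in nodes}
for src, dst in edges:
    node[src].neighbors.append(node[dst])

# All three encode the same graph; e.g., checking for the edge D -> C:
assert matrix[nodes.index("D")][nodes.index("C")] == 1
assert ("D", "C") in edges
assert node["C"] in node["D"].neighbors
```

Which one is "most useful" depends on the operation: the matrix gives O(1) edge lookup, the edge list is compact for sparse graphs, and the object form makes neighbor traversal natural.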