1
Machine Learning
The Perceptron
2
• Heuristic Search
• Knowledge Based Systems (KBS)
• Genetic Algorithms (GAs)
In a Knowledge Based Representation the solution is a set of rules:

RULE Prescribes_Erythromycin
IF Diseases = strep_throat AND Allergy <> erythromycin
THEN Prescribes = erythromycin
CNF 95
BECAUSE "Erythromycin is an effective treatment for a strep throat if the patient is not allergic to it";
In a Genetic Algorithm Based Representation the solution is a chromosome
011010001010010110000111
4
Genetic algorithm

[Figure: a genetic algorithm run. Generation i contains six bit-string chromosomes X1i … X6i, each with a fitness value f; crossover exchanges the tails of parent pairs and mutation flips single bits, producing generation (i + 1).]
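The crossover and mutation operators shown in the figure can be sketched in code. This is a minimal illustration rather than lecture code; the class name, chromosome strings, seed and mutation rate are my own choices.

```java
import java.util.Random;

public class GeneticOps {
    static final Random RNG = new Random(42); // fixed seed for repeatability

    // Single-point crossover: the two children swap tails after a random cut.
    static String[] crossover(String a, String b) {
        int cut = 1 + RNG.nextInt(a.length() - 1); // cut strictly inside the string
        return new String[] {
            a.substring(0, cut) + b.substring(cut),
            b.substring(0, cut) + a.substring(cut)
        };
    }

    // Mutation: each bit flips independently with the given probability.
    static String mutate(String chromosome, double rate) {
        StringBuilder sb = new StringBuilder(chromosome);
        for (int i = 0; i < sb.length(); i++)
            if (RNG.nextDouble() < rate)
                sb.setCharAt(i, sb.charAt(i) == '0' ? '1' : '0');
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] children = crossover("101000", "001011");
        System.out.println(children[0] + " " + children[1]);
        System.out.println(mutate("011110", 0.1));
    }
}
```

A full GA would repeatedly select the fitter chromosomes, apply these two operators, and replace the population — the generation i to generation (i + 1) step of the figure.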
5
Topics

• The neuron as a simple computing element
• The perceptron
• Multilayer artificial neural networks
• Accelerated learning in multilayer ANNs
6
Machine Learning

Machine learning involves adaptive mechanisms that enable computers to learn from experience, learn by example and learn by analogy. Learning capabilities can improve the performance of an intelligent system over time. The most popular approaches to machine learning are artificial neural networks and genetic algorithms. This lecture is dedicated to neural networks.
7
Our brain can be considered as a highly complex, non-linear and parallel information-processing system.

Information is stored and processed in a neural network simultaneously throughout the whole network, rather than at specific locations. In other words, in neural networks, both data and its processing are global rather than local.

Learning is a fundamental and essential characteristic of biological neural networks. The ease with which they can learn led to attempts to emulate a biological neural network in a computer.
8
Biological neural network

[Figure: two biological neurons, each with a soma, dendrites and an axon, connected by synapses.]
9
The neurons are connected by weighted links passing signals from one neuron to another.

The human brain incorporates nearly 10 billion neurons and 60 trillion connections, synapses, between them. By using multiple neurons simultaneously, the brain can perform its functions much faster than the fastest computers in existence today.
10
An artificial neural network can be defined as a model of reasoning based on the human brain. The brain consists of a densely interconnected set of nerve cells, or basic information-processing units, called neurons.

An artificial neural network consists of a number of very simple processors, also called neurons, which are analogous to the biological neurons in the brain.
11
Architecture of a typical artificial neural network

[Figure: input signals enter the input layer, pass through a middle layer, and leave the network as output signals from the output layer.]
12
Analogy between biological and artificial neural networks

[Figure: a biological neural network (soma, dendrites, axon, synapses) shown alongside an artificial network with input, middle and output layers.]
13
Each neuron has a simple structure, but an army of such elements constitutes a tremendous processing power.

A neuron consists of a cell body, soma, a number of fibers called dendrites, and a single long fiber called the axon.
14
The neuron as a simple computing element

Diagram of a neuron

[Figure: input signals x1, x2, …, xn arrive over links with weights w1, w2, …, wn; neuron Y emits its output signal Y along several outgoing branches.]
15
The Perceptron

The operation of Rosenblatt's perceptron is based on the McCulloch and Pitts neuron model. The model consists of a linear combiner followed by a hard limiter.

The weighted sum of the inputs is applied to the hard limiter (step or sign function).
16
Single-layer two-input perceptron

[Figure: inputs x1 and x2, weighted by w1 and w2, feed a linear combiner; its result, compared against the threshold θ, passes through a hard limiter to give the output Y.]
17
The neuron computes the weighted sum of the input signals and compares the result with a threshold value, θ. If the net input is less than the threshold, the neuron output is –1. But if the net input is greater than or equal to the threshold, the neuron becomes activated and its output attains a value +1.

The neuron uses the following transfer or activation function:

X = Σ(i=1..n) xi wi

Y = +1 if X ≥ θ, –1 if X < θ

This type of activation function is called a sign function.
18
Y = sign(x1 * w1 + x2 * w2 - threshold);

static int sign(double x) { return (x < 0) ? -1 : 1; }
19
Possible Activation functions
20
Exercise
[Figure: the single-layer two-input perceptron again — inputs x1 and x2 with weights w1 and w2, a linear combiner, threshold θ, hard limiter and output Y.]
A neuron uses a step function as its activation function, with threshold θ = 0.2 and weights w1 = 0.1, w2 = 0.1.
What is the output with the following values of x1 and x2?
Do you recognize this?
x1 x2 Y
0 0
0 1
1 0
1 1
Can you find weights that will give an OR function? An XOR function?
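The truth table in the exercise can be checked with a short sketch (not part of the original slides; the class and method names are mine), using a step activation that outputs 1 when the weighted sum reaches the threshold and 0 otherwise:

```java
public class StepPerceptron {
    // Step activation: 1 if the net input is non-negative, 0 otherwise.
    static int step(double x) { return (x >= 0) ? 1 : 0; }

    // Perceptron output for two inputs with weights w1, w2 and threshold theta.
    static int output(int x1, int x2, double w1, double w2, double theta) {
        return step(x1 * w1 + x2 * w2 - theta);
    }

    public static void main(String[] args) {
        double w1 = 0.1, w2 = 0.1, theta = 0.2; // values from the exercise
        for (int x1 = 0; x1 <= 1; x1++)
            for (int x2 = 0; x2 <= 1; x2++)
                System.out.println(x1 + " " + x2 + " -> " + output(x1, x2, w1, w2, theta));
    }
}
```

Only the input (1, 1) reaches the threshold, so with these weights the table is the logical AND.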
21
Can a single neuron learn a task?

In 1958, Frank Rosenblatt introduced a training algorithm that provided the first procedure for training a simple ANN: a perceptron.

The perceptron is the simplest form of a neural network. It consists of a single neuron with adjustable synaptic weights and a hard limiter.
22
How does the perceptron learn its classification tasks?

This is done by making small adjustments in the weights to reduce the difference between the actual and desired outputs of the perceptron. The initial weights are randomly assigned, usually in the range [–0.5, 0.5], and then updated to obtain the output consistent with the training examples.
23
If at iteration p, the actual output is Y(p) and the desired output is Yd(p), then the error is given by:

e(p) = Yd(p) – Y(p)

where p = 1, 2, 3, . . .

Iteration p here refers to the pth training example presented to the perceptron.

If the error, e(p), is positive, we need to increase perceptron output Y(p), but if it is negative, we need to decrease Y(p).
24
The perceptron learning rule

wi(p + 1) = wi(p) + α · xi(p) · e(p)

where p = 1, 2, 3, . . . and α is the learning rate, a positive constant less than unity.

The perceptron learning rule was first proposed by Rosenblatt in 1960. Using this rule we can derive the perceptron training algorithm for classification tasks.
25
Perceptron's training algorithm

Step 1: Initialisation
Set initial weights w1, w2, …, wn and threshold θ to random numbers in the range [–0.5, 0.5].
26
Perceptron's training algorithm (continued)

Step 2: Activation
Activate the perceptron by applying inputs x1(p), x2(p), …, xn(p) and desired output Yd(p). Calculate the actual output at iteration p = 1:

Y(p) = step[ Σ(i=1..n) xi(p) wi(p) – θ ]

where n is the number of the perceptron inputs, and step is a step activation function.
27
Perceptron's training algorithm (continued)

Step 3: Weight training
Update the weights of the perceptron:

wi(p + 1) = wi(p) + Δwi(p)

where Δwi(p) is the weight correction at iteration p.

The weight correction is computed by the delta rule:

Δwi(p) = α · xi(p) · e(p), where e(p) = Yd(p) – Y(p)

Step 4: Iteration
Increase iteration p by one, go back to Step 2 and repeat the process until convergence.
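The four steps above can be sketched as follows for the two-input case. This is my own illustration rather than lecture code; it uses a 0/1 step activation and the AND training data from the exercise that follows. Note that floating-point rounding at the boundary X = θ can make the final weights differ slightly from an exact hand calculation, although the learned classifier is the same.

```java
import java.util.Arrays;

public class PerceptronTraining {
    static int step(double x) { return (x >= 0) ? 1 : 0; }

    // Trains two weights on (inputs, desired-output) pairs until one full
    // epoch passes with no errors, then returns the final weights.
    static double[] train(int[][] x, int[] yd, double w1, double w2,
                          double theta, double alpha) {
        boolean converged = false;
        while (!converged) {
            converged = true;
            for (int p = 0; p < x.length; p++) {
                int y = step(x[p][0] * w1 + x[p][1] * w2 - theta); // Step 2: activation
                int e = yd[p] - y;                                 // e(p) = Yd(p) - Y(p)
                if (e != 0) converged = false;
                w1 += alpha * x[p][0] * e;                         // Step 3: delta rule
                w2 += alpha * x[p][1] * e;
            }
        }
        return new double[] { w1, w2 };
    }

    public static void main(String[] args) {
        int[][] x = { {0, 0}, {0, 1}, {1, 0}, {1, 1} }; // AND truth table inputs
        int[] yd = { 0, 0, 0, 1 };                       // desired outputs
        double[] w = train(x, yd, 0.3, -0.1, 0.2, 0.1);  // initial weights from the exercise
        System.out.println(Arrays.toString(w));
    }
}
```

At convergence an entire epoch has produced zero error, so the returned weights classify every training example correctly.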
28
Exercise: Complete the first two epochs of training the perceptron to learn the logical AND function. Assume θ = 0.2, α = 0.1.

Epoch | x1 | x2 | Yd | w1  | w2   | Y | err | w1 | w2
1     | 0  | 0  |    | 0.3 | –0.1 |   |     |    |
2     |    |    |    |     |      |   |     |    |
29
The perceptron in effect classifies the inputs, x1, x2, . . ., xn, into one of two classes, say A1 and A2.

In the case of an elementary perceptron, the n-dimensional space is divided by a hyperplane into two decision regions. The hyperplane is defined by the linearly separable function:

Σ(i=1..n) xi wi – θ = 0
30
• If you proceed with the algorithm described above, the termination occurs with w1 = 0.1 and w2 = 0.1.
• This means that the line is
  0.1x1 + 0.1x2 = 0.2, or
  x1 + x2 - 2 = 0
• The region below the line is where x1 + x2 - 2 < 0, and above the line is where x1 + x2 - 2 >= 0.
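As a quick check (my own snippet, not from the slides), evaluating x1 + x2 - 2 at the four Boolean input points shows that only (1, 1) lies on or above the decision line:

```java
public class DecisionLine {
    // Left-hand side of the learned decision line x1 + x2 - 2 = 0.
    static double g(double x1, double x2) { return x1 + x2 - 2; }

    public static void main(String[] args) {
        int[][] points = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
        for (int[] p : points)
            System.out.println("(" + p[0] + "," + p[1] + "): "
                + (g(p[0], p[1]) >= 0 ? "on/above the line" : "below the line"));
    }
}
```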
31
Linear separability in the perceptron

[Figure: (a) a two-input perceptron divides the (x1, x2) plane into Class A1 and Class A2 along the line x1w1 + x2w2 – θ = 0; (b) a three-input perceptron divides (x1, x2, x3) space along the plane x1w1 + x2w2 + x3w3 – θ = 0.]
32
Perceptron Capability
• Independence of the activation function (Shynk, 1990; Shynk and Bernard, 1992)
33
General Network Structure
• The output signal is transmitted through the neuron's outgoing connection. The outgoing connection splits into a number of branches that transmit the same signal. The outgoing branches terminate at the incoming connections of other neurons in the network.
• Training by back-propagation algorithm
• Overcame the proven limitations of perceptron
34
Architecture of a typical artificial neural network

[Figure: input signals enter the input layer, pass through a middle layer, and leave the network as output signals from the output layer.]