Introduction to Neural networks (under graduate course) Lecture 6 of 9
Neural Networks
Dr. Randa Elanwar
Lecture 6
Lecture Content
• Non Linearly separable functions: XOR gate implementation
– MLP data transformation
– mapping implementation
– graphical solution
Non linear problems
• XOR problem
• The only way to separate the positive from the negative examples is to draw two lines (i.e., we need two straight-line equations) or a nonlinear region that captures one class only
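A short argument, added here as an illustration, shows why a single line cannot work. Assume a single step-function perceptron with hypothetical weights w_1, w_2 and bias b (symbols not used in the slides) that outputs 1 when w_1 x_1 + w_2 x_2 + b >= 0 and 0 otherwise. The four XOR patterns would then require:

```latex
\begin{aligned}
(0,0) \mapsto 0 &: \quad b < 0 \\
(0,1) \mapsto 1 &: \quad w_2 + b \ge 0 \\
(1,0) \mapsto 1 &: \quad w_1 + b \ge 0 \\
(1,1) \mapsto 0 &: \quad w_1 + w_2 + b < 0
\end{aligned}
```

Adding the second and third conditions gives w_1 + w_2 + 2b >= 0, so w_1 + w_2 + b >= -b > 0, which contradicts the fourth condition. No single line therefore classifies all four patterns.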
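[Figure: XOR patterns labelled +ve and -ve, separated either by two straight lines or by a closed quadratic curve in x and y with constants a, b, c]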
Non linear problems
• To implement the nonlinearity, we need to insert one or more extra layers of nodes (hidden layers) between the input layer and the output layer
Non linear problems
2-layer Feed Forward Example XOR solution
MLP data transformation and mapping implementation
• Need for hidden units:
• If one layer contains enough hidden units, the input can be recoded (memorized). Such a network is a multilayer perceptron (MLP)
• This recoding allows any problem to be mapped/represented (e.g., x^2, x^3, etc.)
• Question: how can the weights of the hidden units be trained?
• Answer: Learning algorithms e.g., back propagation
• The term 'back propagation' refers to propagating the error backwards for weight adaptation, working from the last layer back to the first; i.e., the weight corrections of the last layer are computed before those of the previous layers
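The ordering described above can be illustrated with a single update step. The following is a minimal sketch, not the lecture's own procedure: it assumes a 2-2-1 network with sigmoid units, a squared-error loss, plain gradient descent with a hypothetical learning rate `lr`, and arbitrary starting weights. None of these choices come from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    # Each weight row is [w_1, w_2, bias] (layout chosen for this sketch).
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    y = sigmoid(w_out[0] * h[0] + w_out[1] * h[1] + w_out[2])
    return h, y

def backprop_step(x, target, w_hidden, w_out, lr=0.5):
    h, y = forward(x, w_hidden, w_out)
    # 1) Output layer first: error signal at the output unit.
    delta_out = (y - target) * y * (1.0 - y)
    # 2) Then the hidden layer: the output error is propagated back
    #    through the output weights (before they are updated).
    delta_hidden = [delta_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(2)]
    # Gradient-descent corrections for the last layer, then the previous one.
    new_w_out = [w_out[0] - lr * delta_out * h[0],
                 w_out[1] - lr * delta_out * h[1],
                 w_out[2] - lr * delta_out]
    new_w_hidden = [[w[0] - lr * d * x[0], w[1] - lr * d * x[1], w[2] - lr * d]
                    for w, d in zip(w_hidden, delta_hidden)]
    return new_w_hidden, new_w_out

# Arbitrary illustrative starting weights and one XOR training pattern.
w_hidden = [[0.5, -0.4, 0.1], [-0.3, 0.6, -0.2]]
w_out = [0.7, -0.5, 0.05]
x, target = (0.0, 1.0), 1.0

_, y_before = forward(x, w_hidden, w_out)
w_hidden2, w_out2 = backprop_step(x, target, w_hidden, w_out)
_, y_after = forward(x, w_hidden2, w_out2)
print(abs(target - y_before), abs(target - y_after))
```

For this pattern the output error after the step should be smaller than before it. Training a full network would repeat such steps over all patterns many times.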
Learning Non Linearly Separable Functions
• Back propagation tries to transform the training patterns so that they become almost linearly separable and can then be classified by a linear network
• In other words, if we need more than 1 straight line to separate +ve and –ve patterns, we solve the problem in two phases:
– In phase 1: we first represent each straight line with a single perceptron and classify/map the training patterns (output)
– In phase 2: these outputs are transformed to new patterns which are now linearly separable and can be classified by an additional perceptron giving the final result.
Multi-layer Networks and Perceptrons
- Have one or more layers of hidden units.
- With two possibly very large hidden layers, it is possible to implement any function.
- Networks without a hidden layer are called perceptrons.
- Perceptrons are very limited in what they can represent, but this makes their learning problem much simpler.
Solving Non Linearly Separable Functions
• Example: XOR problem
• Phase 1: we draw arbitrary lines
– We find the line equations g1(x) = 0 and g2(x) = 0 using arbitrary intersections on the axes (yellow points p1, p2, p3, p4).
– We assume the +ve and –ve directions for each line.
– We classify the given patterns as +ve/-ve with respect to both g1(x) and g2(x).
• Phase 2: we transform the patterns we have
– Let the patterns that are +ve/-ve with respect to both g1(x) and g2(x) belong to class B (similar signs); otherwise they belong to class A (different signs).
– We find the line equation g(y) = 0 using arbitrary intersections on the new axes.
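[Figure: the four XOR patterns in the (X1, X2) plane, labelled with classes A and B, with the lines g1(x) and g2(x), their +ve directions, and the axis intersection points p1, p2, p3, p4]
x1 x2 XOR Class
0 0 0 B
0 1 1 A
1 0 1 A
1 1 0 B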
Solving Non Linearly Separable Functions
• Let p1 = (0.5,0), p2 = (0,0.5), p3 = (1.5,0), p4 = (0,1.5)
• Constructing g1(x) = 0
• g1(x) = x1 + x2 – 0.5 = 0
• Constructing g2(x) = 0
• g2(x) = x1 + x2 – 1.5 = 0
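For g1(x), using the two-point form with p1 and p2:
\[
\frac{x_2 - p_1(2)}{x_1 - p_1(1)} = \frac{p_2(2) - p_1(2)}{p_2(1) - p_1(1)}
\quad\Rightarrow\quad
\frac{x_2 - 0}{x_1 - 0.5} = \frac{0.5 - 0}{0 - 0.5}
\]
For g2(x), using the two-point form with p3 and p4:
\[
\frac{x_2 - p_3(2)}{x_1 - p_3(1)} = \frac{p_4(2) - p_3(2)}{p_4(1) - p_3(1)}
\quad\Rightarrow\quad
\frac{x_2 - 0}{x_1 - 1.5} = \frac{1.5 - 0}{0 - 1.5}
\]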
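The two-point construction used for g1(x) and g2(x) can be sketched in code. The helper names `line_through` and `normalise` are hypothetical and introduced only for this illustration; the sketch assumes the line is not parallel to the x2 axis after cross-multiplication, so that the x1 coefficient is non-zero.

```python
def line_through(p, q):
    """Return (a, b, c) such that a*x1 + b*x2 + c = 0 passes through p and q."""
    (p1, p2), (q1, q2) = p, q
    # Cross-multiplying (x2 - p2)/(x1 - p1) = (q2 - p2)/(q1 - p1) gives
    # (q2 - p2)*(x1 - p1) - (q1 - p1)*(x2 - p2) = 0.
    a = q2 - p2
    b = -(q1 - p1)
    c = -(a * p1 + b * p2)
    return a, b, c

def normalise(coeffs):
    """Scale the coefficients so that the x1 coefficient is 1 (assumed non-zero)."""
    a, b, c = coeffs
    return (1.0, b / a, c / a)

g1 = normalise(line_through((0.5, 0.0), (0.0, 0.5)))   # points p1, p2
g2 = normalise(line_through((1.5, 0.0), (0.0, 1.5)))   # points p3, p4
print(g1)  # coefficients (a, b, c) of g1(x) = x1 + x2 - 0.5
print(g2)  # coefficients (a, b, c) of g2(x) = x1 + x2 - 1.5
```

Normalising by the x1 coefficient only fixes the scale; which side of the line counts as +ve is still a separate choice, as the slides go on to assume.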
Solving Non Linearly Separable Functions
• Assume x1 > p1(1) is the positive direction for g1(x).
• Assume x1 > p3(1) is the positive direction for g2(x).
• We represent the +ve and –ve values as the result of a step function, i.e., y1 = f(g1(x)) = 0 if g1(x) is –ve and 1 otherwise, and y2 = f(g2(x)) = 0 if g2(x) is –ve and 1 otherwise.
• Classifying the given patterns with respect to g1(x) and g2(x):
x1 x2 g1(x) g2(x) y1 y2 Class
0 0 -ve -ve 0 0 B
0 1 +ve -ve 1 0 A
1 0 +ve -ve 1 0 A
1 1 +ve +ve 1 1 B
• We now have only three distinct patterns, which are linearly separable. The extra pattern that caused the problem is gone because two patterns coincide: (0,1) and (1,0) both map to (y1, y2) = (1, 0).
Solving Non Linearly Separable Functions
• Let p1 = (0.5,0), p2 = (0,-0.25)
• Constructing g(y) = 0
• g(y) = y1 – 2 y2 – 0.5 = 0
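[Figure: the transformed patterns in the (y1, y2) plane, one point of class A and two of class B, with the line g(y), its +ve direction, and the axis intersection points p1 and p2]
\[
\frac{y_2 - p_1(2)}{y_1 - p_1(1)} = \frac{p_2(2) - p_1(2)}{p_2(1) - p_1(1)}
\quad\Rightarrow\quad
\frac{y_2 - 0}{y_1 - 0.5} = \frac{-0.25 - 0}{0 - 0.5}
\]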
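[Figure: the resulting 2-layer network. Input layer: x1, x2. Hidden layer: g1(x) with weights 1, 1 and bias -0.5; g2(x) with weights 1, 1 and bias -1.5. Output layer: g(y) with weights 1 (from y1) and -2 (from y2) and bias -0.5]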
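The XOR network built from g1(x), g2(x) and g(y) can be checked end to end. This is a minimal sketch using the weights and biases from the line equations above and the unipolar step function; the function names `step`, `perceptron` and `xor_network` are hypothetical, chosen for this illustration.

```python
def step(v):
    """Unipolar step function: 0 if v is negative, 1 otherwise."""
    return 0 if v < 0 else 1

def perceptron(weights, bias, inputs):
    """Weighted sum plus bias, passed through the step function."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_network(x1, x2):
    y1 = perceptron((1, 1), -0.5, (x1, x2))     # hidden unit: g1(x) = x1 + x2 - 0.5
    y2 = perceptron((1, 1), -1.5, (x1, x2))     # hidden unit: g2(x) = x1 + x2 - 1.5
    out = perceptron((1, -2), -0.5, (y1, y2))   # output unit: g(y) = y1 - 2*y2 - 0.5
    return y1, y2, out

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor_network(x1, x2))
```

With this choice of +ve direction for g(y), an output of 1 corresponds to class A and an output of 0 to class B.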
Solving Non Linearly Separable Functions
• Example: The non-linearly separable patterns x1 = [3 0], x2 = [5 2], x3 = [1 3], x4 = [2 4], x5 = [1 1], x6 = [3 3] have to be classified into two categories C1 = {x1, x2, x3, x4} and C2 = {x5, x6} using a feed forward 2-layer neural network.
– Select a suitable number of partitioning straight lines.
– Consequently design the first stage (hidden layer) of the network with bipolar discrete perceptrons.
– Using this layer, transform the six samples.
– Design the output layer of the network with a bipolar discrete perceptron using the transformed samples.
Solving Non Linearly Separable Functions
• Let p1 = (0,1), p2 = (1,2), p3 = (2,0), p4 = (3,1)
• Constructing g1(x) = 0
• g1(x) = x1 - x2 + 1 = 0
• Constructing g2(x) = 0
• g2(x) = x1 - x2 - 2 = 0
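[Figure: the six samples x1 to x6 in the (X1, X2) plane, with the lines g1(x) and g2(x), their +ve directions, and the points p1, p2, p3, p4]
For g1(x), using the two-point form with p1 and p2:
\[
\frac{x_2 - p_1(2)}{x_1 - p_1(1)} = \frac{p_2(2) - p_1(2)}{p_2(1) - p_1(1)}
\quad\Rightarrow\quad
\frac{x_2 - 1}{x_1 - 0} = \frac{2 - 1}{1 - 0}
\]
For g2(x), using the two-point form with p3 and p4:
\[
\frac{x_2 - p_3(2)}{x_1 - p_3(1)} = \frac{p_4(2) - p_3(2)}{p_4(1) - p_3(1)}
\quad\Rightarrow\quad
\frac{x_2 - 0}{x_1 - 2} = \frac{1 - 0}{3 - 2}
\]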
Solving Non Linearly Separable Functions
• Assume x2 < p1(2) is the positive direction for g1(x).
• Assume x1 > p3(1) is the positive direction for g2(x).
• We represent the +ve and –ve values as the result of a bipolar function: y1 = f(g1(x)) = -1 if g1(x) is –ve and 1 otherwise, and y2 = f(g2(x)) = -1 if g2(x) is –ve and 1 otherwise.
• Classifying the given patterns with respect to g1(x) and g2(x):
x1 x2 g1(x) g2(x) y1 y2 Class
3 0 +ve +ve 1 1 B
5 2 +ve +ve 1 1 B
1 3 -ve -ve -1 -1 B
2 4 -ve -ve -1 -1 B
1 1 +ve -ve 1 -1 A
3 3 +ve -ve 1 -1 A
• We now have only three distinct patterns, which are linearly separable, because the six samples coincide in pairs. Class B (similar signs) corresponds to C1 and class A (different signs) to C2.
Solving Non Linearly Separable Functions
• Let p1 = (1,0), p2 = (0,-1)
• Constructing g(y) = 0
• g(y) = y1 – y2 – 1 = 0
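As a quick check, added here as a worked evaluation, substituting the three distinct transformed patterns (y1, y2) from the classification table into g(y) = y1 - y2 - 1 gives the following, where x_1 to x_6 name the six samples:

```latex
\begin{aligned}
g(1, 1)   &= 1 - 1 - 1 = -1      && (x_1, x_2:\ \text{class B}) \\
g(-1, -1) &= -1 - (-1) - 1 = -1  && (x_3, x_4:\ \text{class B}) \\
g(1, -1)  &= 1 - (-1) - 1 = +1   && (x_5, x_6:\ \text{class A})
\end{aligned}
```

Class A therefore lies on the +ve side of g(y) = 0 and class B on the -ve side, so a single output perceptron separates the transformed samples.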
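[Figure: the transformed samples in the (y1, y2) plane, where x1 and x2 coincide, x3 and x4 coincide, and x5 and x6 coincide, with the line g(y), its +ve direction, and the points p1 and p2]
\[
\frac{y_2 - p_1(2)}{y_1 - p_1(1)} = \frac{p_2(2) - p_1(2)}{p_2(1) - p_1(1)}
\quad\Rightarrow\quad
\frac{y_2 - 0}{y_1 - 1} = \frac{-1 - 0}{0 - 1}
\]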
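[Figure: the resulting 2-layer network. Input layer: x1, x2. Hidden layer: g1(x) with weights 1, -1 and bias 1; g2(x) with weights 1, -1 and bias -2. Output layer: g(y) with weights 1 (from y1) and -1 (from y2) and bias -1]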
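The complete design for the six-sample example can be verified in a few lines. This is a minimal sketch that uses the line equations g1(x) = x1 - x2 + 1, g2(x) = x1 - x2 - 2 and g(y) = y1 - y2 - 1 with a bipolar discrete activation; the names `sign`, `hidden_layer` and `output_layer` are hypothetical, introduced for this illustration.

```python
def sign(v):
    """Bipolar discrete activation: -1 if v is negative, 1 otherwise."""
    return -1 if v < 0 else 1

def hidden_layer(x1, x2):
    y1 = sign(x1 - x2 + 1)    # g1(x) = x1 - x2 + 1
    y2 = sign(x1 - x2 - 2)    # g2(x) = x1 - x2 - 2
    return y1, y2

def output_layer(y1, y2):
    return sign(y1 - y2 - 1)  # g(y) = y1 - y2 - 1

samples = {"x1": (3, 0), "x2": (5, 2), "x3": (1, 3),
           "x4": (2, 4), "x5": (1, 1), "x6": (3, 3)}

# Phase 1: transform the six samples with the hidden layer.
transformed = {name: hidden_layer(*x) for name, x in samples.items()}
# Phase 2: classify the transformed samples with the output perceptron.
outputs = {name: output_layer(*y) for name, y in transformed.items()}

print(transformed)
print(sorted(set(transformed.values())))  # distinct hidden-layer patterns
print(outputs)
```

With the +ve direction chosen for g(y), the samples of C2 (x5, x6) produce an output of 1 and the samples of C1 (x1 to x4) produce -1.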