Introduction to Neural Networks (undergraduate course) Lecture 6 of 9


Transcript of Introduction to Neural Networks (undergraduate course) Lecture 6 of 9

Page 1

Neural Networks

Dr. Randa Elanwar

Lecture 6

Page 2

Lecture Content

• Non-linearly separable functions: XOR gate implementation

– MLP data transformation

– mapping implementation

– graphical solution


Page 3

Non-linear problems

• XOR problem

• The only way to separate the positive from the negative examples is to draw 2 lines (i.e., we need 2 straight-line equations) or a nonlinear region that captures one type only


[Figure: the XOR patterns on the (x1, x2) plane; the +ve and -ve examples alternate at the four corners, so a single straight line cannot capture one type only; two lines or a nonlinear (quadratic) region are needed.]
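As a quick numerical illustration of this claim (an addition for this transcript, not part of the original slides), the Python sketch below tries a coarse grid of candidate lines w1*x1 + w2*x2 + b = 0 and finds none that separates the four XOR patterns; in fact no single line can.

```python
import itertools

# The four XOR patterns and their labels (1 = +ve example, 0 = -ve example)
patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def separates(w1, w2, b):
    """True if the single line w1*x1 + w2*x2 + b = 0 puts all +ve patterns
    on one side and all -ve patterns on the other."""
    return all((w1 * x1 + w2 * x2 + b > 0) == (label == 1)
               for (x1, x2), label in patterns)

# Try a coarse grid of line parameters; none of them separates XOR.
grid = [i / 4 for i in range(-8, 9)]
found = any(separates(w1, w2, b) for w1, w2, b in itertools.product(grid, repeat=3))
print("single separating line found:", found)   # prints False
```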

Page 4

Non-linear problems

• To implement the nonlinearity we need to insert one or more extra layers of nodes between the input layer and the output layer (hidden layers)


Page 5

Non-linear problems

2-layer feed-forward example: XOR solution


Page 6

MLP data transformation and mapping implementation

• Need for hidden units:

• If there is one layer with enough hidden units, the input can be recoded (memorized); such a network is a multilayer perceptron (MLP)

• This recoding allows any problem to be mapped/represented (e.g., x2, x3, etc.)

• Question: how can the weights of the hidden units be trained?

• Answer: learning algorithms, e.g., backpropagation

• The term 'backpropagation' refers to propagating the error for weight adaptation of the layers, beginning from the last hidden layer back to the first, i.e., the weights of the last layer are computed before those of the previous layers
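As a rough illustration of this idea, here is a minimal backpropagation sketch for a small 2-layer sigmoid network trained on XOR with a squared-error loss; the network size (4 hidden units), learning rate, and iteration count are illustrative assumptions, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR training patterns and targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

# A 2-4-1 network: weights/biases for the hidden layer and the output layer
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5                                  # learning rate (arbitrary choice)
for _ in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)              # hidden-layer activations
    y = sigmoid(h @ W2 + b2)              # network output

    # backward pass: the error term of the output layer is computed first,
    # then propagated back to obtain the error term of the hidden layer
    d_out = (y - T) * y * (1 - y)
    d_hid = (d_out @ W2.T) * h * (1 - h)

    # weight updates, last layer first, earlier layer afterwards
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_hid
    b1 -= lr * d_hid.sum(axis=0)

# outputs should now be close to the XOR targets (convergence depends on the
# random initialisation and the number of iterations)
print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))
```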


Page 7

Learning Non-Linearly Separable Functions

• Backpropagation effectively transforms the training patterns to make them almost linearly separable, so that a linear network can then be used

• In other words, if we need more than 1 straight line to separate +ve and –ve patterns, we solve the problem in two phases:

– In phase 1, we represent each straight line by a single perceptron and use it to classify/map the training patterns (its output)

– In phase 2, these outputs are treated as new patterns, which are now linearly separable and can be classified by an additional perceptron, giving the final result.
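In code, these two phases amount to a layer of line perceptrons followed by one output perceptron. The sketch below is a generic scaffold with a hard-limiting (step) activation; the concrete line coefficients for XOR are derived on the following slides.

```python
def step(v):
    # hard-limiting activation: 0 if the net input is -ve, 1 otherwise
    return 0 if v < 0 else 1

def phase1(x, lines):
    """Phase 1: one perceptron per separating line; returns the transformed pattern y."""
    return tuple(step(w1 * x[0] + w2 * x[1] + b) for (w1, w2, b) in lines)

def phase2(y, w1, w2, b):
    """Phase 2: a single perceptron on the (now linearly separable) transformed pattern."""
    return step(w1 * y[0] + w2 * y[1] + b)

# Usage with the XOR coefficients derived on the following slides:
#   y = phase1((x1, x2), [(1, 1, -0.5), (1, 1, -1.5)])
#   out = phase2(y, 1, -2, -0.5)    # out = 1 corresponds to class A (XOR = 1)
```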


Page 8

Multi-layer Networks and Perceptrons


- Have one or more layers of hidden units.

- With two possibly very large hidden layers, it is possible to implement any function.

- Networks without a hidden layer are called perceptrons.

- Perceptrons are very limited in what they can represent, but this makes their learning problem much simpler.

Page 9

Solving Non-Linearly Separable Functions

• Example: XOR problem

• Phase 1: we draw arbitrary lines.

• We find the line equations g1(x) = 0 and g2(x) = 0 using arbitrary intersections on the axes (points p1, p2, p3, p4 in the figure).

• We assume the +ve and -ve directions for each line.

• We classify the given patterns as +ve/-ve with respect to both g1(x) and g2(x).

• Phase 2: we transform the patterns we have.

• Let the patterns that are +ve/+ve or -ve/-ve with respect to both g1(x) and g2(x) belong to class B (similar signs); otherwise they belong to class A (different signs).

• We find the line equation g(y) = 0 using arbitrary intersections on the new axes.


[Figure: the four XOR patterns on the (x1, x2) plane, labelled A/B, with the two lines g1(x) and g2(x), their +ve directions, and the intersection points p1, p2, p3, p4.]

x1  x2  XOR  Class
0   0   0    B
0   1   1    A
1   0   1    A
1   1   0    B

Page 10

Solving Non-Linearly Separable Functions

• Let p1 = (0.5, 0), p2 = (0, 0.5), p3 = (1.5, 0), p4 = (0, 1.5)

• Constructing g1(x) = 0

• g1(x) = x1 + x2 – 0.5 = 0

• Constructing g2(x) = 0

• g2(x) = x1 + x2 – 1.5 = 0


Using the two-point form of a line through points p and q, (x2 - p(2)) / (x1 - p(1)) = (q(2) - p(2)) / (q(1) - p(1)):

For g1(x), through p1 = (0.5, 0) and p2 = (0, 0.5):

(x2 - 0) / (x1 - 0.5) = (0.5 - 0) / (0 - 0.5)  =>  x2 = -(x1 - 0.5)  =>  g1(x) = x1 + x2 - 0.5 = 0

For g2(x), through p3 = (1.5, 0) and p4 = (0, 1.5):

(x2 - 0) / (x1 - 1.5) = (1.5 - 0) / (0 - 1.5)  =>  x2 = -(x1 - 1.5)  =>  g2(x) = x1 + x2 - 1.5 = 0
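The same construction can be checked numerically. The small helper below (an illustrative addition) builds the coefficients (a, b, c) of a line a*x1 + b*x2 + c = 0 from two points and reproduces g1(x) and g2(x) up to a common scale factor.

```python
def line_from_points(p, q):
    """Coefficients (a, b, c) of the line a*x1 + b*x2 + c = 0 through points p and q."""
    a = q[1] - p[1]              # difference of the second coordinates
    b = -(q[0] - p[0])           # minus the difference of the first coordinates
    c = -(a * p[0] + b * p[1])   # chosen so that the line passes through p
    return a, b, c

print(line_from_points((0.5, 0), (0, 0.5)))   # (0.5, 0.5, -0.25), i.e. x1 + x2 - 0.5 = 0
print(line_from_points((1.5, 0), (0, 1.5)))   # (1.5, 1.5, -2.25), i.e. x1 + x2 - 1.5 = 0
```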

Page 11

Solving Non-Linearly Separable Functions

• Assume x1 > p1(1) is the positive direction for g1(x).

• Assume x1 > p3(1) is the positive direction for g2(x).

• Classifying the given patterns with respect to g1(x) and g2(x) gives the table below.

• We represent the +ve and -ve values as the result of a step function, i.e., y1 = f(g1(x)) = 0 if g1(x) is -ve and 1 otherwise, and y2 = f(g2(x)) = 0 if g2(x) is -ve and 1 otherwise.

• We now have only three distinct patterns, which are linearly separable; the extra pattern that caused the problem is gone (two of the original patterns coincide after the transformation).


x1 x2 g1(x) g2(x) y1 y2 Class

0 0 -ve -ve 0 0 B

0 1 +ve -ve 1 0 A

1 0 +ve -ve 1 0 A

1 1 +ve +ve 1 1 B
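This table can be reproduced directly; the short sketch below applies the step function to g1(x) and g2(x) for the four patterns and assigns class B to similar signs and class A to different signs, as defined earlier.

```python
def step(v):
    return 0 if v < 0 else 1   # 0 for a -ve argument, 1 otherwise

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    g1 = x1 + x2 - 0.5
    g2 = x1 + x2 - 1.5
    y1, y2 = step(g1), step(g2)
    cls = 'B' if y1 == y2 else 'A'   # similar signs -> class B, different signs -> class A
    print(x1, x2, y1, y2, cls)
```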

Page 12

Solving Non-Linearly Separable Functions

• Let p1 = (0.5,0), p2 = (0,-0.25)

• Constructing g(y) = 0

• g(y) = y1 – 2 y2 – 0.5 = 0


[Figure: the transformed patterns on the (y1, y2) plane: class B at (0, 0) and (1, 1), class A at (1, 0); the separating line g(y) with its +ve direction and the points p1, p2.]

Using the two-point form with p1 = (0.5, 0) and p2 = (0, -0.25):

(y2 - 0) / (y1 - 0.5) = (-0.25 - 0) / (0 - 0.5)  =>  y2 = 0.5 (y1 - 0.5)  =>  g(y) = y1 - 2 y2 - 0.5 = 0

[Figure: the resulting network. Input layer: x1, x2. Hidden layer: g1(x) with weights (1, 1) and bias -0.5, g2(x) with weights (1, 1) and bias -1.5. Output layer: g(y) with weights (1, -2) and bias -0.5.]
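Putting the derived weights together, the complete 2-layer XOR network sketched in the figure can be written out as follows (the code itself is an illustrative addition; the weights and the step activation are the ones derived above).

```python
import numpy as np

# Hidden layer: g1(x) = x1 + x2 - 0.5 and g2(x) = x1 + x2 - 1.5
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])        # one row of weights per hidden unit
b_hidden = np.array([-0.5, -1.5])

# Output layer: g(y) = y1 - 2*y2 - 0.5
w_out = np.array([1.0, -2.0])
b_out = -0.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    # step activations: 0 for a -ve net input, 1 otherwise
    y = (W_hidden @ np.array(x, dtype=float) + b_hidden >= 0).astype(int)
    out = 1 if w_out @ y + b_out >= 0 else 0
    print(x, "->", tuple(int(v) for v in y), "-> class", "A" if out == 1 else "B")
```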

Page 13

Solving Non-Linearly Separable Functions

• Example: The linearly non-separable patterns x1 = [3 0], x2 = [5 2], x3 = [1 3], x4 = [2 4], x5 = [1 1], x6 = [3 3] have to be classified into two categories C1 = {x1, x2, x3, x4} and C2 = {x5, x6} using a feed-forward 2-layer neural network.

– Select a suitable number of partitioning straight lines.

– Consequently design the first stage (hidden layer) of the network with bipolar discrete perceptrons.

– Using this layer, transform the six samples.

– Design the output layer of the network with a bipolar discrete perceptron using the transformed samples.
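For reference, a bipolar discrete perceptron differs from the earlier unipolar one only in its activation, which outputs -1/+1 instead of 0/1. A minimal sketch (the function name and sample call are illustrative):

```python
def bipolar_perceptron(x, w, b):
    """Discrete bipolar perceptron: -1 if the net input w.x + b is -ve, +1 otherwise."""
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    return -1 if net < 0 else 1

# e.g. the first hidden unit designed on the next slide, g1(x) = x1 - x2 + 1:
print(bipolar_perceptron((3, 0), (1, -1), 1))   # +1: sample x1 = [3 0] is on the +ve side of g1
```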


Page 14

Solving Non-Linearly Separable Functions

• Let p1 = (0, 1), p2 = (1, 2), p3 = (2, 0), p4 = (3, 1)

• Constructing g1(x) = 0

• g1(x) = x1 - x2 + 1 = 0

• Constructing g2(x) = 0

• g2(x) = x1 - x2 - 2 = 0


[Figure: the six samples x1 to x6 on the (X1, X2) plane, the two partitioning lines g1(x) and g2(x) with their +ve directions, and the points p1, p2, p3, p4 used to construct them.]

Using the two-point form:

For g1(x), through p1 = (0, 1) and p2 = (1, 2):

(x2 - 1) / (x1 - 0) = (2 - 1) / (1 - 0)  =>  x2 = x1 + 1  =>  g1(x) = x1 - x2 + 1 = 0

For g2(x), through p3 = (2, 0) and p4 = (3, 1):

(x2 - 0) / (x1 - 2) = (1 - 0) / (3 - 2)  =>  x2 = x1 - 2  =>  g2(x) = x1 - x2 - 2 = 0

Page 15

Solving Non-Linearly Separable Functions


x1 x2 g1(x) g2(x) y1 y2 Class

3 0 +ve +ve 1 1 B

5 2 +ve +ve 1 1 B

1 3 -ve -ve -1 -1 B

2 4 -ve -ve -1 -1 B

1 1 +ve -ve 1 -1 A

3 3 +ve -ve 1 -1 A

Assume x2 < p1(2) is the positive direction for g1(x). Assume x1 > p3(1) is the positive direction for g2(x).

We represent the +ve and -ve values as the result of a bipolar function: y1 = f(g1(x)) = -1 if g1(x) is -ve and 1 otherwise; y2 = f(g2(x)) = -1 if g2(x) is -ve and 1 otherwise.

Classifying the given patterns with respect to g1(x) and g2(x) gives the table above. We now have only three distinct patterns, which are linearly separable (pairs of the original patterns coincide after the transformation).
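As a quick check (an illustrative sketch using the bipolar function just defined), the transformation of the six samples can be reproduced as follows:

```python
samples = {'x1': (3, 0), 'x2': (5, 2), 'x3': (1, 3),
           'x4': (2, 4), 'x5': (1, 1), 'x6': (3, 3)}

def bipolar(v):
    return -1 if v < 0 else 1   # -1 for a -ve argument, +1 otherwise

for name, (x1, x2) in samples.items():
    y1 = bipolar(x1 - x2 + 1)   # g1(x) = x1 - x2 + 1
    y2 = bipolar(x1 - x2 - 2)   # g2(x) = x1 - x2 - 2
    cls = 'B' if y1 == y2 else 'A'   # similar signs -> class B (C1), different -> class A (C2)
    print(name, (x1, x2), '->', (y1, y2), cls)
```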

Page 16

Solving Non-Linearly Separable Functions

• Let p1 = (1,0), p2 = (0,-1)

• Constructing g(y) = 0

• g(y) = y1 – y2 – 1 = 0


[Figure: the transformed patterns on the (y1, y2) plane: x1, x2 map to (1, 1) and x3, x4 map to (-1, -1) (class B), while x5, x6 map to (1, -1) (class A); the separating line g(y) with its +ve direction and the points p1, p2.]

Using the two-point form with p1 = (1, 0) and p2 = (0, -1):

(y2 - 0) / (y1 - 1) = (-1 - 0) / (0 - 1)  =>  y2 = y1 - 1  =>  g(y) = y1 - y2 - 1 = 0

[Figure: the resulting network. Input layer: x1, x2. Hidden layer: g1(x) with weights (1, -1) and bias 1, g2(x) with weights (1, -1) and bias -2. Output layer: g(y) with weights (1, -1) and bias -1.]
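Combining the derived weights, the complete network for this example (bipolar hidden units g1, g2 and a bipolar output unit g(y)) can be sketched as follows; the code is an illustrative addition, with the -1/+1 output mapped back to the categories C1 and C2:

```python
import numpy as np

def bipolar(v):
    # discrete bipolar activation, elementwise: -1 for a -ve net input, +1 otherwise
    return np.where(v < 0, -1, 1)

# Hidden layer: g1(x) = x1 - x2 + 1 and g2(x) = x1 - x2 - 2
W_hidden = np.array([[1.0, -1.0],
                     [1.0, -1.0]])
b_hidden = np.array([1.0, -2.0])

# Output layer: g(y) = y1 - y2 - 1
w_out = np.array([1.0, -1.0])
b_out = -1.0

for x in [(3, 0), (5, 2), (1, 3), (2, 4), (1, 1), (3, 3)]:
    y = bipolar(W_hidden @ np.array(x, dtype=float) + b_hidden)   # transformed pattern (y1, y2)
    out = int(bipolar(w_out @ y + b_out))                         # -1 -> category C1, +1 -> category C2
    print(x, '->', tuple(int(v) for v in y), '->', 'C1' if out == -1 else 'C2')
```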