Course notes (.ppt)


Neural Networks

• Resources
  – Chapter 19, textbook
    • Sections 19.1-19.6

Neuroanatomy Metaphor

• Neural networks (aka connectionist, PDP, artificial neural networks, ANN)
  – Rough approximation to the animal nervous system
  – See systems such as NEURON for modeling at more biological levels of detail; http://neuron.duke.edu/

• Neuron components in brains
  – Soma (cell body); dendritic tree
  – Axon: sends signal downstream
  – Synapses
    • Receive incoming signals from upstream neurons
    • Connections on dendrites, cell body, axon, synapses
    • Neurotransmitter mechanisms

‘a’ or ‘the’ brain?

• Are we using computer models of neurons to model ‘the’ brain or model ‘a’ brain?

Neuron Firing Process

1) Synapses receive incoming signals, changing the electrical (ionic) potential of the cell body

2) When the potential of the cell body reaches some limit, the neuron “fires”, and an electrical signal (action potential) is sent down the axon

3) The axon propagates the signal to other neurons downstream

What is represented by a biological neuron?

• Cell body sums electrical potentials from incoming signals
  – Serves as an accumulator function over time
  – But “as a rule many impulses must reach a neuron almost simultaneously to make it fire” (p. 33, Brodal, 1992; italics added)

• Synapses have varying effects on cell potential
  – Synaptic strength

ANN (Artificial Neural Nets)

• Approximation of biological neural nets by ANNs
  – No direct model of the accumulator function
  – Synaptic strength
    • Approximated with connection weights (real numbers)
  – Spiking of output
    • Approximated with non-linear activation functions

• Neural units
  – Represent activation values (numbers)
  – Represent inputs and outputs (numbers)

Graphical Notation & Terms

• Circles
  – Are neural units
  – Metaphor for the nerve cell body

• Arrows
  – Represent synaptic connections from one unit to another
  – These are often called weights and represented with a single value (e.g., a real number)

One layer of neural units

Another layer of neural units

Another Example: 8 units in each layer, fully connected network

Units & Weights

• Units
  – Sometimes notated with unit numbers

• Weights
  – Sometimes given by symbols
  – Sometimes given by numbers
  – Always represent numbers
  – May be boolean-valued or real-valued

[Figure: a layer of four units (numbered 1-4) connected to a single unit by numeric weights 0.3, -0.1, 2.1, -1.1; a second diagram shows the same architecture with the weights labeled symbolically as W1,1, W1,2, W1,3, W1,4]

Computing with Neural Units

• Inputs are presented to input units

• How do we generate outputs?

• One idea
  – Summed weighted inputs

[Figure: four input units with weights 0.3, -0.1, 2.1, -1.1 into a single output unit]

Input: (3, 1, 0, -2)

Processing:
3(0.3) + 1(-0.1) + 0(2.1) + (-2)(-1.1)
= 0.9 - 0.1 + 0 + 2.2
= 3

Output: 3
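As a minimal Python sketch of this idea (variable names are my own, not from the slides), the summed weighted input is just the dot product of the weight and input vectors:

```python
# Summed weighted input: pair each weight with its input and add up the products.
weights = [0.3, -0.1, 2.1, -1.1]
inputs = [3, 1, 0, -2]

output = sum(w * x for w, x in zip(weights, inputs))
print(output)  # 3.0, matching the worked example above
```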

Activation Functions

• Usually, we don’t just use the weighted sum directly
• Apply some function to the weighted sum before it is used (e.g., as output)
• Call this the activation function
• A step function could be a good simulation of a biological neuron spiking

$$f(x) = \begin{cases} 1 & \text{if } x \ge \theta \\ 0 & \text{if } x < \theta \end{cases}$$

θ is called the threshold

Step Function Example

• Let θ = 4

[Figure: the same four-unit network with weights 0.3, -0.1, 2.1, -1.1 and threshold θ = 4]

With input (3, 1, 0, -2) as before, the summed weighted input is 3, so f(3) = 0 (since 3 < 4)

Another Activation Function: The Sigmoidal

• The math of some neural nets requires that the activation function be continuously differentiable

• A sigmoidal function is often used to approximate the step function

$$f(x) = \frac{1}{1 + e^{-\sigma x}}$$

σ is the steepness parameter
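A short Python sketch of both activation functions (the parameter names theta and sigma mirror the slides' symbols; the default values are my own choices):

```python
import math

def step(x, theta=4.0):
    """Step activation: outputs 1 once x reaches the threshold theta."""
    return 1 if x >= theta else 0

def sigmoid(x, sigma=1.0):
    """Sigmoidal activation; a larger sigma gives a steeper, more step-like curve."""
    return 1.0 / (1.0 + math.exp(-sigma * x))

print(step(3))     # 0, as in the step function example above
print(sigmoid(3))  # ~0.95, as in the sigmoidal example below
```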

Sigmoidal Example

[Figure: the same four-unit network with weights 0.3, -0.1, 2.1, -1.1]

$$f(x) = \frac{1}{1 + e^{-x}}$$

Input: (3, 1, 0, -2)

The summed weighted input is 3, so

$$f(3) = \frac{1}{1 + e^{-3}} \approx 0.95$$

[Plot: the sigmoid 1/(1+exp(-x)) and the steeper 1/(1+exp(-10*x)) over roughly x in [-5, 5]; the larger steepness parameter makes the sigmoid approximate the step function more closely]

Another Example

• A two-weight-layer, feedforward network
• Two inputs, one output, one ‘hidden’ unit

[Figure: inputs (3, 1) connected to the hidden unit by weights 0.5 and -0.5; the hidden unit connected to the output unit by weight 0.75]

$$f(x) = \frac{1}{1 + e^{-x}}$$

Input: (3, 1)

What is the output?

Computing in Multilayer Networks

• Start at the leftmost layer
  – Compute activations based on inputs

• Then work from left to right, using computed activations as inputs to the next layer

• Example solution
  – Activation of hidden unit:
    f(0.5(3) + (-0.5)(1)) = f(1.5 - 0.5) = f(1) = 0.731
  – Output activation:
    f(0.731(0.75)) = f(0.548) = 0.634

$$f(x) = \frac{1}{1 + e^{-x}}$$
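The same two-stage computation as a Python sketch (reusing the sigmoid from above):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Input layer -> hidden unit
hidden = sigmoid(0.5 * 3 + -0.5 * 1)  # f(1) ~= 0.731

# Hidden unit -> output unit
output = sigmoid(hidden * 0.75)       # f(0.548) ~= 0.634
print(hidden, output)
```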

Notation for Weighted Sums

$W_{i,j}$: weight (scalar) from unit j in the left layer to unit i in the right layer

$a_{k,l}$: activation value of unit k in layer l; layers increase in number from left to right

$$a_{k,l+1} = f\left(\sum_{i=1}^{n} W_{k,i}\, a_{i,l}\right)$$

[Figure: left-layer units with activations $a_{1,1}, a_{2,1}, a_{3,1}, a_{4,1}$ connected to a right-layer unit $a_{1,2}$ by weights $W_{1,1}, W_{1,2}, W_{1,3}, W_{1,4}$]

Notation

For the four-unit left layer of the figure:

$$a_{1,2} = f\left(\sum_{j=1}^{4} W_{1,j}\, a_{j,1}\right) = f(W_{1,1} a_{1,1} + W_{1,2} a_{2,1} + W_{1,3} a_{3,1} + W_{1,4} a_{4,1})$$

Notation

$W_i$: row vector of incoming weights for unit i

$a_i$: column vector of activation values of the units connected to unit i

Example

$$W_1 = \begin{bmatrix} W_{1,1} & W_{1,2} & W_{1,3} & W_{1,4} \end{bmatrix}$$

$$W_1 a_1 = \begin{bmatrix} W_{1,1} & W_{1,2} & W_{1,3} & W_{1,4} \end{bmatrix} \begin{bmatrix} a_{1,1} \\ a_{2,1} \\ a_{3,1} \\ a_{4,1} \end{bmatrix}$$

Recall: multiplying an n×r matrix with an r×m matrix produces an n×m matrix C, where each element $C_{i,j}$ is the scalar product of row i of the left matrix and column j of the right matrix


Scalar Result: Summed Weighted Input

$$W_1 a_1 = \begin{bmatrix} W_{1,1} & W_{1,2} & W_{1,3} & W_{1,4} \end{bmatrix} \begin{bmatrix} a_{1,1} \\ a_{2,1} \\ a_{3,1} \\ a_{4,1} \end{bmatrix} = W_{1,1} a_{1,1} + W_{1,2} a_{2,1} + W_{1,3} a_{3,1} + W_{1,4} a_{4,1}$$

A 1×4 row vector times a 4×1 column vector yields a 1×1 matrix, i.e., a scalar

Computing New Activation Value

In the general case, the new activation value for unit i is

$$f(W_i\, a_i)$$

where f(x) is the activation function, e.g., the sigmoid function.

For the case we were considering:

$$a_{1,2} = f(W_1 a_1) = f(W_{1,1} a_{1,1} + W_{1,2} a_{2,1} + W_{1,3} a_{3,1} + W_{1,4} a_{4,1})$$
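A quick sketch of this vector form using numpy, with the weights and input from the earlier four-unit example:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

W1 = np.array([[0.3, -0.1, 2.1, -1.1]])  # 1x4 row vector of incoming weights
a1 = np.array([[3], [1], [0], [-2]])     # 4x1 column vector of activations

summed = W1 @ a1        # 1x1 matrix holding the scalar 3.0
print(sigmoid(summed))  # f(3) ~= 0.95
```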

Example

• Compute the output value

• Draw the corresponding ANN

$$f\left( \begin{bmatrix} 0.4 & 0.5 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \right)$$

ANN Solving the Equality Problem for 2 Bits

Network Architecture

[Figure: inputs x1 and x2, hidden units y1 and y2, output unit z1]

Goal outputs:

x1  x2  z1
0   0   1
0   1   0
1   0   0
1   1   1

What weights solve this problem?

Approximate Solution

http://www.d.umn.edu/~cprince/courses/cs5541fall02/lectures/neural-networks/

[Figure: the same x1, x2 → y1, y2 → z1 architecture]

Actual network results:

x1  x2  z1
0   0   .925
0   1   .192
1   0   .19
1   1   .433

Weights:

w_x1_y1   w_x1_y2   w_x2_y1   w_x2_y2
-1.8045   -7.7299   -1.8116   -7.6649

w_y1_z1   w_y2_z1
-10.3022  15.3298
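As a sanity check, a short Python sketch that runs these weights forward (assuming, as the results suggest, sigmoid units and no bias weights) reproduces the table above:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Weights from the table above
w_x1_y1, w_x1_y2 = -1.8045, -7.7299
w_x2_y1, w_x2_y2 = -1.8116, -7.6649
w_y1_z1, w_y2_z1 = -10.3022, 15.3298

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y1 = sigmoid(x1 * w_x1_y1 + x2 * w_x2_y1)
    y2 = sigmoid(x1 * w_x1_y2 + x2 * w_x2_y2)
    z1 = sigmoid(y1 * w_y1_z1 + y2 * w_y2_z1)
    print(x1, x2, round(z1, 3))  # 0.925, 0.192, 0.19, 0.433
```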

How well did this approximate the goal function?

• Categorically
  – For inputs x1=0, x2=0 and x1=1, x2=1, the output of the network was always greater than for inputs x1=1, x2=0 and x1=0, x2=1

• Summed squared error

$$\sum_{s=1}^{\mathit{numTrainSamples}} (\mathit{ActualOutput}_s - \mathit{DesiredOutput}_s)^2$$

• Compute the summed squared error for our example

x1  x2  z1
0   0   .925
0   1   .192
1   0   .19
1   1   .433

Solution

x1  x2  Expected z1  Actual z1  Squared error
0   0   1            0.925      0.005625
0   1   0            0.192      0.036864
1   0   0            0.19       0.0361
1   1   1            0.433      0.321489

Sum squared error = 0.400078
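The same computation as a small Python sketch:

```python
expected = [1, 0, 0, 1]
actual = [0.925, 0.192, 0.19, 0.433]

# Summed squared error over the four training samples
sse = sum((a - e) ** 2 for e, a in zip(expected, actual))
print(sse)  # 0.400078
```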

Weight Matrix

• A row vector provides the weights for a single unit in the “right” layer

• A weight matrix provides all weights connecting the “left” layer to the “right” layer

• Let W be an n×r weight matrix
  – Row vector i in the matrix connects unit i in the “right” layer to the units in the “left” layer
  – n units in the layer to the “right”
  – r units in the layer to the “left”

Notation

$a_i$: the vector of activation values of the layer to the “left”; an r×1 column vector (same as before)

$W a_i$: an n×1 column vector; the summed weighted inputs for the “right” layer

$f(W a_i)$: n×1; the new activation values for the “right” layer

The function f is now taken as applying elementwise to a matrix

Example

Updating hidden layer activation values: compute $f(W a)$, where W is the input-to-hidden weight matrix and a is the column vector of input activations

Updating output activation values: compute $f(W' a')$, where W' is the hidden-to-output weight matrix and a' is the column vector of hidden activations just computed

[The numeric weight matrices and activation vectors shown on the original slides are garbled in this transcript]

Draw the architecture (units and arcs representing weights) of the connectionist model

Answer

• 2 input units

• 5 hidden layer units

• 3 output units

• Fully connected, feedforward network
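A numpy sketch of this 2-5-3 architecture in matrix form (the weight values here are random placeholders, since the numeric matrices on the slides are garbled):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(5, 2))  # 5x2: hidden units x input units
W_output = rng.normal(size=(3, 5))  # 3x5: output units x hidden units

a_input = np.array([[3.0], [1.0]])  # 2x1 column vector of inputs

a_hidden = sigmoid(W_hidden @ a_input)   # 5x1 hidden activations
a_output = sigmoid(W_output @ a_hidden)  # 3x1 output activations
print(a_output)
```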

Bias Weights

• Used to provide a trainable threshold

[Figure: a unit with incoming weights W1,1, W1,2, W1,3, W1,4 plus a bias weight b from a unit with constant activation 1]

b is treated as another weight, but it connects to a unit with a constant activation value
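A minimal sketch of the bias trick: append a constant-1 activation to the input vector and treat b as one more weight (the bias value here is a hypothetical choice):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

weights = [0.3, -0.1, 2.1, -1.1]
bias = -2.0             # b: a hypothetical bias weight
inputs = [3, 1, 0, -2]

# The bias unit always has activation 1, so b acts as just another weighted input.
summed = sum(w * x for w, x in zip(weights + [bias], inputs + [1]))
print(sigmoid(summed))  # f(3 + b) = f(1) ~= 0.731
```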