WK1 - Introduction

Post on 12-Feb-2016


WK1 - Introduction. CS 476: Networks of Neural Computation WK1 – Introduction Dr. Stathis Kasderidis Dept. of Computer Science University of Crete Spring Semester, 2009. Contents. Course structure and details Basic ideas of Neural Networks Historical development of Neural Networks - PowerPoint PPT Presentation

Transcript of WK1 - Introduction

Contents

Course

Basic Ideas

History

Learning

Bias / Var

CS 476: Networks of Neural Computation, CSD, UOC, 2009

Conclusions

WK1 - Introduction

CS 476: Networks of Neural Computation

WK1 – Introduction

Dr. Stathis Kasderidis

Dept. of Computer Science

University of Crete

Spring Semester, 2009


Contents

• Course structure and details
• Basic ideas of Neural Networks
• Historical development of Neural Networks
• Types of learning
• Optimisation techniques and the LMS method
• Conclusions


Course

Course Details

• Duration: 13 weeks (2 Feb – 15 May 2009)
• Lecturer: Stathis Kasderidis
  • E-mail: stathis@ics.forth.gr
  • Meetings: by arrangement via e-mail
• Assistants: Farmaki, Fasoulakis
• Hours:
  • Every Tue and Wed, 11 am-1 pm
  • Laboratory on Fri, 11 am-1 pm


Course Timetable

• WK1 (3/5 Feb): Introduction
• WK2 (10/12 Feb): Perceptron
• WK3 (17/19 Feb): Multi-layer Perceptron
• WK4 (24/26 Feb): Radial Basis Networks
• WK5 (3/5 Mar): Recurrent Networks
• WK6 (10/12 Mar): Self-Organising Networks
• WK7 (17/19 Mar): Hebbian Learning
• WK8 (24/26 Mar): Hopfield Networks
• WK9 (31 Mar/2 Apr): Principal Component Analysis


Course Timetable (Cont)

• WK10 (7/9 Apr): Support Vector Machines
• WK11 (28/30 Apr): Stochastic Networks
• WK12 (5/7 May): Student Projects’ Presentation
• WK13 (12/14 May): Exam Preparation
• Every week:
  • 3hrs Theory
  • 1hr Demonstration
• 19 Mar 2009: Written mid-term exams (optional)


Course Timetable (Cont)

• Lab sessions take place every Friday, 11 am-1 pm. In lab sessions you will be examined on written assignments, and you can get help between assignments.
• There will be four assignments during the term, on the following dates:
  • Fri 6 Mar (Ass1 – Perceptron / MLP / RBF)
  • Fri 20 Mar (Ass2 – Recurrent / Self-organising)
  • Fri 3 Apr (Ass3 – Hebbian / Hopfield)
  • Fri 8 May (Ass4 – PCA / SVM / Stochastic)


Course Structure

• The final grade is divided as follows:
  • Laboratory attendance (20%)
    • Obligatory!
  • Course project (40%)
    • Starts at WK2. Presentation at WK12.
    • Teams of 2-4 people, depending on class size. Selection from a set of offered projects.
  • Theory. Best of:
    • Final theory exams (40%), or
    • Final theory exams (25%) + mid-term exams (15%)


Project Problems

• Problem categories:

1. Time Series Prediction (Financial Series?)
2. Color Segmentation with Self-Organising Networks
3. Robotic Arm Control with Self-Organising Networks
4. Pattern Classification (Geometric Shapes)
5. Cognitive Modeling (ALCOVE model)


Suggested Tools

• Tools:
  • MATLAB (+ Neural Networks Toolbox). Can be slow in large problems!
  • TLearn: http://crl.ucsd.edu/innate/tlearn.html
  • Any C/C++ compiler
  • Avoid Java and other interpreted languages! Too slow!


Basic Ideas

What are Neural Networks?

• Models inspired by real nervous systems
• They have a mathematical and computational formulation
• Very general modelling tools
• A different approach from Symbolic AI (Connectionism)
• Many paradigms exist, but they are based on common ideas
• A type of graphical model
• Used in many scientific and technological areas, e.g.:


What are Neural Networks? (Cont.)


What are Neural Networks? (Cont. 2)

• NNs & Physics: e.g. Spin Glasses
• NNs & Mathematics: e.g. Random Fields
• NNs & Philosophy: e.g. Theory of Mind, Consciousness
• NNs & Cognitive Science: e.g. Connectionist Models of High-Level Functions (Memory, Language, etc)
• NNs & Engineering: e.g. Control, Hybrid Systems, A-Life
• NNs & Neuroscience: e.g. Channel dynamics, Compartmental models


What are Neural Networks? (Cont. 3)

• NNs & Finance: e.g. Agent-based models of markets
• NNs & Social Science: e.g. Artificial Societies


General Characteristics I

• What do they look like?


General Characteristics II

• Node details:

1. $Y = f(\mathrm{Act})$
2. $f$ is called the transfer function
3. $\mathrm{Act} = \sum_i X_i W_i - B$
4. $B$ is called the bias
5. The $W_i$ are called the weights
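A minimal sketch of the node computation above, written in plain Python. The function names and the two transfer functions (a hard threshold and a logistic sigmoid) are illustrative assumptions; the slide itself does not fix a particular f.

```python
import math


def neuron_output(inputs, weights, bias, transfer):
    """Return Y = f(Act), where Act = sum_i X_i * W_i - B."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    act = sum(x * w for x, w in zip(inputs, weights)) - bias
    return transfer(act)


def step(act):
    # Hard threshold (assumed example): output 1 when Act is non-negative.
    return 1 if act >= 0 else 0


def logistic(act):
    # Logistic sigmoid (assumed example): smooth output in (0, 1).
    return 1.0 / (1.0 + math.exp(-act))


# Act = 0.6*1 + 0.6*0 - 0.5 > 0, so the threshold unit fires.
y_step = neuron_output([1.0, 0.0], [0.6, 0.6], 0.5, step)
# Act = 0.5 + 0.5 - 1.0 = 0, so the sigmoid unit sits at its midpoint.
y_soft = neuron_output([1.0, 1.0], [0.5, 0.5], 1.0, logistic)
```

Passing the transfer function as an argument keeps the weighted sum and the non-linearity separate, which mirrors the Y = f(Act) split in the list above.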


General Characteristics III

•Form of transfer function:


General Characteristics IV

• Network Specification:
  • Number of neurons
  • Topology of connections (Recurrent, Feedforward, etc)
  • Transfer function(s)
  • Input types (representation: symbols, etc)
  • Output types (representation: as above)
  • Weight parameters, W
  • Other (weight initialisation, cost function, training criteria, etc)


General Characteristics V

• Processing Modes:
  • Recall
  • “Learning”


General Characteristics VI

• Common properties of all Neural Networks:
  • Distributed representations
  • Graceful degradation under damage
  • Noise robustness
  • Non-linear mappings
  • Generalisation and prototype extraction
  • Allow memory access by content
  • Can work with incomplete input


History

Historical Development of Neural Networks

• History in brief:
  • McCulloch-Pitts, 1943: Digital Neurons
  • Hebb, 1949: Synaptic plasticity
  • Rosenblatt, 1958: Perceptron
  • Minsky & Papert, 1969: Perceptron Critique
  • Kohonen, 1978: Self-Organising Maps
  • Hopfield, 1982: Associative Memory
  • Rumelhart & McClelland, 1986: Back-Prop algorithm
  • Many people, 1985-today: EXPLOSION!


Learning

What is Learning in NN?

Def:

“Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place.”

[Mendel & McClaren (1970)]


Learning Sequence

1. The network is stimulated by the environment;
2. The network undergoes changes in its free parameters as a result of this stimulation;
3. The network responds in a new way to the environment because of the changes that have occurred in its internal structure.


Learning Criteria

1. Sum squared error
2. Mean square error
3. $\chi^2$ statistic
4. Mutual information
5. Entropy
6. Other (e.g. dot product as a ‘similarity’ measure)
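As a small illustration of three of the criteria listed above (sum squared error, mean square error and the dot product used as a similarity), here is a sketch in plain Python. The function names and the toy target/output vectors are made up for the example.

```python
def sum_squared_error(targets, outputs):
    """Sum over patterns of (d - y)^2."""
    return sum((d - y) ** 2 for d, y in zip(targets, outputs))


def mean_squared_error(targets, outputs):
    """Sum squared error divided by the number of patterns."""
    return sum_squared_error(targets, outputs) / len(targets)


def dot_similarity(a, b):
    """Dot product of two vectors, used as a similarity score."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical desired responses and network outputs for four patterns.
targets = [1.0, 0.0, 1.0, 1.0]
outputs = [0.5, 0.0, 1.0, 0.0]

sse = sum_squared_error(targets, outputs)   # 0.25 + 0 + 0 + 1
mse = mean_squared_error(targets, outputs)  # sse / 4
sim = dot_similarity(targets, outputs)      # 0.5 + 0 + 1 + 0
```

The two error criteria are minimised during learning, whereas a similarity score such as the dot product is maximised.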


Learning Paradigms

• Learning with a teacher (supervised learning)
• Learning without a teacher
  • Reinforcement learning
  • Unsupervised learning (self-organisation)


Families of Learning Algorithms

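• Error-based learning (Delta rule):
  $\Delta w_{kj}(n) = \eta \, e_k(n) \, x_j(n)$
• Memory-based learning
  • 1-Nearest Neighbour
  • K-Nearest Neighbours
• Hebbian learning:
  $\Delta w_{kj}(n) = \eta \, y_k(n) \, x_j(n)$
  $\Delta w_{kj}(n) = F(y_k(n), x_j(n))$ (more general case)
• Competitive learning (update applied to the winning neuron $i$):
  $w_{ij}(n+1) = w_{ij}(n) + \eta \, (x_j(n) - w_{ij}(n))$

Here $\eta$ is the learning rate, $e_k(n)$ is the error of neuron $k$ at step $n$, $x_j(n)$ is the $j$-th input and $y_k(n)$ is the output of neuron $k$.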
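The three weight-update rules above (delta, Hebbian, competitive) can be sketched as single update steps in plain Python. This is only an illustration under stated assumptions: the neuron is taken to be linear, the competitive winner is chosen by smallest Euclidean distance to the input, and all function names are invented for the example.

```python
def linear_output(w, x):
    """Output of a linear neuron: y = sum_j w_j * x_j."""
    return sum(wj * xj for wj, xj in zip(w, x))


def delta_rule_step(w, x, d, eta):
    """Delta rule: w_j <- w_j + eta * e * x_j, with error e = d - y."""
    e = d - linear_output(w, x)
    return [wj + eta * e * xj for wj, xj in zip(w, x)]


def hebbian_step(w, x, eta):
    """Hebbian rule: w_j <- w_j + eta * y * x_j."""
    y = linear_output(w, x)
    return [wj + eta * y * xj for wj, xj in zip(w, x)]


def competitive_step(weights, x, eta):
    """Move only the winning neuron's weight vector towards x.

    Assumption: the winner is the neuron whose weights are closest to x.
    """
    def sq_dist(w):
        return sum((xj - wj) ** 2 for wj, xj in zip(w, x))

    winner = min(range(len(weights)), key=lambda i: sq_dist(weights[i]))
    updated = [row[:] for row in weights]  # copy, leave the input untouched
    updated[winner] = [wj + eta * (xj - wj) for wj, xj in zip(weights[winner], x)]
    return winner, updated


w_delta = delta_rule_step([0.0, 0.0], [1.0, 2.0], d=1.0, eta=0.1)
w_hebb = hebbian_step([0.5, 0.5], [1.0, 1.0], eta=0.1)
winner, w_comp = competitive_step([[0.0, 0.0], [1.0, 1.0]], [0.8, 0.6], eta=0.5)
```

In the competitive step only one row of the weight matrix changes, which is what distinguishes it from the other two rules, where every weight of the neuron is adjusted.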


Families of Learning Algorithms II

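• Stochastic Networks
  • Boltzmann learning:
    $\Delta w_{kj}(n) = \eta \, (\rho_{kj}^{+}(n) - \rho_{kj}^{-}(n))$
  • ($\rho_{kj}^{\pm}$ = average correlation of the states of neurons $k$ and $j$; $+$ in the clamped phase, $-$ in the free-running phase)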
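A rough sketch of the Boltzmann weight change in plain Python. It assumes the network states (entries +1 or -1) have already been collected in the clamped and free-running phases; here they are simply written by hand rather than sampled, and the function names are hypothetical.

```python
def average_correlation(states, k, j):
    """Average of s_k * s_j over a list of recorded network states."""
    return sum(s[k] * s[j] for s in states) / len(states)


def boltzmann_delta(clamped_states, free_states, k, j, eta):
    """Weight change eta * (rho_plus - rho_minus) for the pair (k, j)."""
    rho_plus = average_correlation(clamped_states, k, j)
    rho_minus = average_correlation(free_states, k, j)
    return eta * (rho_plus - rho_minus)


# Hand-written example states for a two-neuron network (not sampled).
clamped = [[1, 1], [1, 1], [-1, -1], [1, 1]]   # neurons always agree
free = [[1, 1], [1, -1], [-1, 1], [-1, -1]]    # neurons agree half the time

dw = boltzmann_delta(clamped, free, 0, 1, eta=0.1)
```

When the two phases show the same correlation the weight stops changing, so learning ends once the free-running network reproduces the statistics seen with the inputs clamped.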


Learning Tasks

• Function approximation
• Association
  • Auto-association
  • Hetero-association
• Pattern recognition
• Control
• Filtering


Credit Assignment Problem

• Def: The problem of assigning credit or blame to the states that lead to useful / harmful outcomes
• Temporal Credit Assignment Problem: find which actions in a period q = [t-T, t] lead to a useful outcome at time t, and credit these actions, i.e.
  Outcome(t) = f(Actions(q))
• Structural Credit Assignment Problem: find which states at time t lead to useful actions at time t, i.e.
  Actions(t) = g(State(t))


Bias / Var

Statistical Nature of the Learning Process

• Assume that a set of examples is given:
  $\mathcal{T} = \{(x_i, d_i)\}_{i=1}^{N}$
• Assume that a statistical model of the generating process is given (regression equation):
  $D = f(X) + \varepsilon$
• Here $X$ is a vector random variable (the independent variable), $D$ is a scalar random variable (the dependent variable) and $\varepsilon$ is a random variable with the following properties:
  $E[\varepsilon \mid x] = 0$
  $E[\varepsilon \, f(X)] = 0$


Statistical Nature of the Learning Process II

• The first property says that $\varepsilon$ has zero mean given any realisation of $X$
• The second property says that $\varepsilon$ is uncorrelated with the regression function $f(X)$ (principle of orthogonality)
• $\varepsilon$ is called the intrinsic error
• Assume that the neural network describes an “approximation” to the regression function, which is:
  $Y = F(X, w)$
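To make the regression model concrete, the sketch below draws a synthetic data set from D = f(X) + noise and checks the two noise properties empirically. The choice of f (a sine), the Gaussian noise, its standard deviation and the sample size are all assumptions made for illustration only.

```python
import math
import random


def f(x):
    # Assumed "true" regression function for this illustration.
    return math.sin(2.0 * math.pi * x)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

xs = [rng.random() for _ in range(n)]          # realisations of X
eps = [rng.gauss(0.0, 0.3) for _ in range(n)]  # intrinsic error
ds = [f(x) + e for x, e in zip(xs, eps)]       # D = f(X) + eps

# Sample estimates of E[eps] and E[eps * f(X)]; both should be close to 0.
mean_eps = sum(eps) / n
mean_eps_f = sum(e * f(x) for x, e in zip(xs, eps)) / n
```

Because the noise is drawn independently of X, both sample averages shrink towards zero as n grows, which is what the two properties require.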


Statistical Nature of the Learning Process III

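• The weight vector $w$ is obtained by minimising the cost function:
  $\mathcal{E}(w) = \frac{1}{2} \sum_{i=1}^{N} (d_i - F(x_i, w))^2$
• We can re-write this, using expectation operators, as:
  $\mathcal{E}(w) = \frac{1}{2} E_{\mathcal{T}}[(d - F(x, \mathcal{T}))^2]$
• … (after some algebra we get) …
  $\mathcal{E}(w) = \frac{1}{2} E_{\mathcal{T}}[\varepsilon^2] + \frac{1}{2} E_{\mathcal{T}}[(f(x) - F(x, \mathcal{T}))^2]$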
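The "some algebra" step can be sketched as follows. This is a reconstruction, not taken from the slides; it assumes that the cross term vanishes, i.e. that the intrinsic error $\varepsilon$ is uncorrelated with both $f(x)$ and the network output $F(x, \mathcal{T})$.

```latex
\begin{align*}
d - F(x,\mathcal{T})
  &= \underbrace{\bigl(d - f(x)\bigr)}_{\varepsilon}
     + \bigl(f(x) - F(x,\mathcal{T})\bigr) \\[4pt]
\mathcal{E}(w)
  &= \tfrac{1}{2} E_{\mathcal{T}}\!\left[\bigl(\varepsilon + f(x) - F(x,\mathcal{T})\bigr)^2\right] \\
  &= \tfrac{1}{2} E_{\mathcal{T}}[\varepsilon^2]
     + E_{\mathcal{T}}\!\left[\varepsilon\,\bigl(f(x) - F(x,\mathcal{T})\bigr)\right]
     + \tfrac{1}{2} E_{\mathcal{T}}\!\left[\bigl(f(x) - F(x,\mathcal{T})\bigr)^2\right] \\
  &= \tfrac{1}{2} E_{\mathcal{T}}[\varepsilon^2]
     + \tfrac{1}{2} E_{\mathcal{T}}\!\left[\bigl(f(x) - F(x,\mathcal{T})\bigr)^2\right]
     \qquad \text{(cross term assumed zero)}
\end{align*}
```

The first term does not depend on $w$, so only the second term matters when choosing the weights.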


Statistical Nature of the Learning Process IV

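• Thus to obtain $w$ we need to optimise the function:
  $L_{av}(f(x), F(x, w)) = E_{\mathcal{T}}[(f(x) - F(x, \mathcal{T}))^2]$
• … (after some more algebra!) …
  $L_{av}(f(x), F(x, w)) = B^2(w) + V(w)$
  $B(w) = E_{\mathcal{T}}[F(x, \mathcal{T})] - f(x)$
  $V(w) = E_{\mathcal{T}}[(F(x, \mathcal{T}) - E_{\mathcal{T}}[F(x, \mathcal{T})])^2]$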
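A sketch of the "some more algebra" step, reconstructed here rather than copied from the slides. The shorthand $\bar{F}(x) = E_{\mathcal{T}}[F(x,\mathcal{T})]$ is introduced only for this derivation.

```latex
\begin{align*}
f(x) - F(x,\mathcal{T})
  &= \bigl(f(x) - \bar{F}(x)\bigr) + \bigl(\bar{F}(x) - F(x,\mathcal{T})\bigr),
  \qquad \bar{F}(x) = E_{\mathcal{T}}[F(x,\mathcal{T})] \\[4pt]
L_{av}
  &= \bigl(f(x) - \bar{F}(x)\bigr)^2
     + 2\,\bigl(f(x) - \bar{F}(x)\bigr)\,
       \underbrace{E_{\mathcal{T}}\!\left[\bar{F}(x) - F(x,\mathcal{T})\right]}_{=\,0}
     + E_{\mathcal{T}}\!\left[\bigl(F(x,\mathcal{T}) - \bar{F}(x)\bigr)^2\right] \\
  &= B^2(w) + V(w)
\end{align*}
```

The cross term drops out because $f(x) - \bar{F}(x)$ does not depend on the training set, and the average deviation of $F(x,\mathcal{T})$ from its own mean is zero by definition.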


Statistical Nature of the Learning Process V

• $B(w)$ is called the bias (or approximation error)
• $V(w)$ is called the variance (or estimation error)
• The last relation shows the bias-variance dilemma:
  “We cannot minimise both bias and variance at the same time for a finite set $\mathcal{T}$. Only as $N \to \infty$ do both become zero.”
• Bias measures the “goodness” of our functional form in approximating the true regression function $f(x)$
• Variance measures the amount of information present in the data set $\mathcal{T}$ that is used for estimating $F(x, w)$
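The dilemma can be illustrated with a small simulation in plain Python. This is a sketch under assumptions not in the slides: the true function is f(x) = x, the noise is Gaussian, and two very simple "networks" are compared at a single test point x0, a constant model (rigid form) and a least-squares line (more flexible form). Bias and variance are estimated by averaging over many independently drawn training sets.

```python
import random


def true_f(x):
    # Assumed true regression function for the illustration.
    return x


def fit_constant(xs, ds):
    """Rigid model: always predicts the mean of the training targets."""
    mean_d = sum(ds) / len(ds)
    return lambda x: mean_d


def fit_line(xs, ds):
    """More flexible model: ordinary least-squares straight line."""
    n = len(xs)
    mx, md = sum(xs) / n, sum(ds) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (d - md) for x, d in zip(xs, ds)) / sxx
    return lambda x: md + slope * (x - mx)


def bias_variance(fit, x0, n_points=10, n_sets=2000, noise=0.5, seed=0):
    """Estimate B, V and L_av at x0 by averaging over training sets."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_sets):
        xs = [rng.random() for _ in range(n_points)]
        ds = [true_f(x) + rng.gauss(0.0, noise) for x in xs]
        preds.append(fit(xs, ds)(x0))
    mean_pred = sum(preds) / n_sets
    bias = mean_pred - true_f(x0)
    variance = sum((p - mean_pred) ** 2 for p in preds) / n_sets
    l_av = sum((p - true_f(x0)) ** 2 for p in preds) / n_sets
    return bias, variance, l_av


b_const, v_const, l_const = bias_variance(fit_constant, x0=1.0)
b_line, v_line, l_line = bias_variance(fit_line, x0=1.0)
```

In this setup the constant model should show the larger bias and the line the larger variance, while for both models the estimated L_av equals bias squared plus variance up to rounding.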


Conclusions

Comments I

• We should distinguish artificial NNs from bio-physical neural models (e.g. the Blue Brain Project)
• Some NNs are universal approximators, e.g. feed-forward models, which are based on the Kolmogorov theorem
• Can be combined with other methods, e.g. Neuro-Fuzzy Systems
• Flexible modelling tools for:
  • Function approximation
  • Pattern classification
  • Association
  • Other


Comments II

• Advantages:
  • Distributed representation allows co-activation of categories
  • Graceful degradation
  • Robustness to noise
  • Automatic generalisation (of categories, etc)


Comments III

• Disadvantages:
  • They cannot explain their function, due to distributed representations
  • We cannot add existing knowledge to neural networks as rules
  • We cannot extract rules
  • Network parameters are found by trial and error (in the general case)