Post on 11-Jan-2016
Abstract
We start with a statistical formulation of Helmholtz’s ideas about neural energy to furnish a model of perceptual inference and learning that can explain a remarkable range of neurobiological facts. Using constructs from statistical physics, it can be shown that the problems of inferring what causes our sensory inputs and of learning causal regularities in the sensorium can be resolved using exactly the same principles. Furthermore, inference and learning can proceed in a biologically plausible fashion. The ensuing scheme rests on Empirical Bayes and hierarchical models of how sensory information is generated. The use of hierarchical models enables the brain to construct prior expectations in a dynamic and context-sensitive fashion. This scheme provides a principled way to understand many aspects of the brain’s organization and responses.
Perceptual inference and learning
Collège de France 2008
Overview
Inference and learning under the free energy principle
Hierarchical Bayesian inference
A simple experiment
Bird songs (inference)
Structural and dynamic priors
Prediction and omission
Perceptual categorisation
Bird songs (learning)
Repetition suppression
The mismatch negativity
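Exchange with the environment
Agent (model m) and environment, separated by a Markov blanket
External states: $\partial_t \vartheta = f(\vartheta, \alpha) + w$
Sensation: $y = g(\vartheta, \alpha) + w$
Internal states: $\mu = \arg\min_{\mu} F(y, \mu)$
Action: $\alpha = \arg\min_{\alpha} F(y, \mu)$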
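The free-energy principle
$F = -\langle \ln p(y(\alpha), \vartheta \mid m) \rangle_q + \langle \ln q(\vartheta) \rangle_q \geq -\ln p(y \mid m)$
Action to minimise a bound on surprise
$F = -\langle \ln p(y(\alpha) \mid \vartheta, m) \rangle_q + D(q \,\|\, p(\vartheta))$
$\alpha = \arg\min_{\alpha} F = \arg\max_{\alpha} \langle \ln p(y(\alpha) \mid \vartheta, m) \rangle_q$
Perception to optimise the bound
$F = -\ln p(y \mid m) + D(q(\vartheta; \mu) \,\|\, p(\vartheta \mid y))$
$\mu = \arg\min_{\mu} F \;\Rightarrow\; q(\vartheta; \mu) \approx p(\vartheta \mid y)$
The ensemble density and its parameters
$q(\vartheta; \mu) = q(u; \mu_u)\, q(\theta; \mu_\theta)\, q(\gamma; \mu_\gamma)$
Perceptual inference: $\mu_u = \arg\min F$
Perceptual learning: $\mu_\theta = \arg\min F$
Perceptual uncertainty: $\mu_\gamma = \arg\min F$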
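The bound at the heart of the free-energy principle can be checked numerically on a toy problem. The sketch below is only an illustration under stated assumptions: a single discrete cause with three values, and made-up prior and likelihood numbers (none of these come from the slides). It computes the free energy as the expected difference between ln q and the log joint, and shows that it never falls below surprise, -ln p(y | m), and touches it when q is the posterior.

```python
import math

def free_energy(q, prior, likelihood):
    """Free energy F = <ln q(cause) - ln p(y, cause | m)>_q for a discrete cause."""
    return sum(
        qi * (math.log(qi) - math.log(pi * li))
        for qi, pi, li in zip(q, prior, likelihood)
        if qi > 0.0
    )

def posterior(prior, likelihood):
    """Exact posterior over the cause and the model evidence p(y | m)."""
    joint = [pi * li for pi, li in zip(prior, likelihood)]
    evidence = sum(joint)
    return [j / evidence for j in joint], evidence

# Hypothetical numbers, chosen only for illustration.
prior = [0.5, 0.3, 0.2]        # p(cause | m)
likelihood = [0.1, 0.6, 0.3]   # p(y | cause, m) for one observed y

post, evidence = posterior(prior, likelihood)
surprise = -math.log(evidence)

f_prior = free_energy(prior, prior, likelihood)  # q set to the prior
f_post = free_energy(post, prior, likelihood)    # q set to the posterior

print(f"surprise              = {surprise:.4f}")
print(f"F with q = prior      = {f_prior:.4f}")
print(f"F with q = posterior  = {f_post:.4f}")
```

The gap between F and surprise is the divergence between q and the posterior, which is why optimising q (perception) tightens the bound while changing y (action) can lower surprise itself.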
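Hierarchical models and message passing
[Figure: a hierarchical model unfolding over time, with causes $v^{(2)}$ and $v^{(1)}$, hidden states $x_1^{(1)}$ and $x_2^{(1)}$, and outputs $y_1, \dots, y_4$; top-down messages and bottom-up messages pass between levels.]
Generation
$\partial_t x^{(i)} = f(x^{(i)}, v^{(i+1)}) + \varepsilon_x^{(i)}$
$v^{(i)} = g(x^{(i)}, v^{(i+1)}) + \varepsilon_v^{(i)}$
Prediction error
$\varepsilon_x^{(i)} = \partial_t \mu_x^{(i)} - f(\mu_x^{(i)}, \mu_v^{(i+1)})$
$\varepsilon_v^{(i)} = \mu_v^{(i)} - g(\mu_x^{(i)}, \mu_v^{(i+1)})$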
Recognition
$\partial_t \mu_v^{(i)} = -F_v^{(i-1)} - F_v^{(i)}$
$\partial_t \mu_x^{(i)} = -F_x^{(i)}$
Empirical Bayes and hierarchical models
Bottom-up, lateral
E-Step: perceptual learning
M-Step: perceptual uncertainty
D-Step: perceptual inference
Recurrent message passing among neuronal populations, with top-down predictions changing to suppress bottom-up prediction error
Friston K, Kilner J, Harrison L. A free energy principle for the brain. J Physiol Paris. 2006.
[Equations: dynamics of the conditional expectations at level $i$ in terms of prediction errors]
Associative plasticity, modulated by precision
Encoding of precision through classical neuromodulation or plasticity in lateral connections
$\varepsilon_v^{(i)} = \mu_v^{(i)} - g(\mu_x^{(i)}, \mu_v^{(i+1)})$
Top-down
Bottom-up prediction errors
Top-down predictions
Neural implementation in cortical hierarchies (cf. evidence accumulation models)
[Figure: cortical laminar circuit linking adjacent hierarchical levels, with labelled populations: spiny stellate and small basket neurons in layer 4, superficial pyramidal cells, double bouquet cells, and deep pyramidal cells.]
$\varepsilon_v^{(i)} = \mu_v^{(i)} - g(\mu_x^{(i)}, \mu_v^{(i+1)})$
A brain imaging experiment with sparse visual stimuli
[Figure: receptive fields and connections in V1 and V2, after Angelucci et al.]
Classical receptive field (CRF), V1: ~1°
Horizontal, V1: ~2°
Feedback, V2: ~5°
Feedback, V3: ~10°
Classical receptive field V1
Extra-classical receptive field
Classical receptive field V2
Stimuli: coherent and predictable, or random and unpredictable
Top-down suppression of prediction error when coherent?
Suppression of prediction error with coherent stimuli
[Figure: regional responses (90% confidence intervals) in V1, V2, V5 and pCG for random, stationary and coherent stimuli, showing decreases and increases.]
Harrison et al., NeuroImage 2006
Synthetic song-birds
[Figure: a syrinx and a neuronal hierarchy, with sonograms of the stimulus and the percept (frequency, 2000 to 5000 Hz, against time in sec).]
Song recognition with DEM
[Figure: prediction and error, hidden states, and causes at level 2, each plotted against time; sonograms of the stimulus and the percept (frequency in Hz against time in sec).]
… and broken birds
[Figure: sonograms (frequency in Hz against time in sec) of the percept, the percept with no structural priors, and the percept with no dynamical priors, each with the corresponding LFP (micro-volts) against peristimulus time (ms).]
… omitting the last chirps
[Figure: prediction and error, hidden states, and causes at level 2, each plotted against time.]
omission and violation of predictions
[Figure: sonograms (frequency in Hz against time in sec) of the stimulus and the percept, for the full song and for the song without its last syllable; ERP (error) as LFP (micro-volts) against peristimulus time (ms), without and with omission.]
Stimulus but no percept
Percept but no stimulus
A simple song
[Figure: prediction and error, hidden states, and causes at level 2, each plotted against time; sonogram of the percept (frequency in Hz against time in sec).]
Encoding sequences in terms of attractor manifolds
Causes: $v(t)$
$f(x, v) = \begin{bmatrix} 18x_2 - 18x_1 \\ v_1 x_1 - 2x_3 x_1 - x_2 \\ 2x_1 x_2 - v_2 x_3 \end{bmatrix}$
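To make the attractor idea concrete, here is a minimal sketch that integrates a Lorenz-type flow with components 18*x2 - 18*x1, v1*x1 - 2*x3*x1 - x2 and 2*x1*x2 - v2*x3, where the two causes v1 and v2 act as control parameters that shape the manifold. The integrator (fourth-order Runge-Kutta), the step size, the initial state and the values v = (32, 8/3) are assumptions made for illustration, not settings taken from the talk.

```python
def flow(x, v):
    """Lorenz-type flow; the causes v = (v1, v2) act as control parameters."""
    x1, x2, x3 = x
    v1, v2 = v
    return [
        18.0 * x2 - 18.0 * x1,
        v1 * x1 - 2.0 * x3 * x1 - x2,
        2.0 * x1 * x2 - v2 * x3,
    ]

def rk4_step(x, v, dt):
    """One fourth-order Runge-Kutta step of the flow."""
    k1 = flow(x, v)
    k2 = flow([xi + 0.5 * dt * k for xi, k in zip(x, k1)], v)
    k3 = flow([xi + 0.5 * dt * k for xi, k in zip(x, k2)], v)
    k4 = flow([xi + dt * k for xi, k in zip(x, k3)], v)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]

def simulate(v, x0=(1.0, 1.0, 1.0), dt=0.005, steps=4000):
    """Return the list of hidden states visited from x0 under fixed causes v."""
    x = list(x0)
    path = [x]
    for _ in range(steps):
        x = rk4_step(x, v, dt)
        path.append(x)
    return path

# Hypothetical values for the causes; different values give different orbits.
path = simulate(v=(32.0, 8.0 / 3.0))
print(path[-1])
```

Changing v moves the trajectory onto a different manifold, which is the sense in which a slowly varying cause can encode a whole sequence of faster states.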
Categorizing sequences
[Figure: sonograms of Song A, Song B and Song C (frequency in Hz against time in sec); estimates of the causes $v(t)$ over time for each song; 90% confidence regions for Song A, Song B and Song C in the space of $v_1$ and $v_2$.]
Suppression of inferotemporal responses to repeated faces
Main effect of faces
Henson et al 2000
Repetition suppression and the MMN
The MMN is an enhanced negativity seen in response to any change (deviant) compared to the standard response.
A simple chirp
[Figure: prediction and error, hidden states, and causes at level 2, each plotted against time; sonogram of the chirp (frequency in Hz against time in sec).]
Prediction error encoded by superficial pyramidal cells
Simulating ERPs to repeated chirps
[Figure: for each of six presentations of the chirp: hidden states, sonogram of the percept (frequency in Hz against time in sec), and prediction error as LFP (micro-volts) against peristimulus time (ms).]
Perceptual inference: suppressing error over peristimulus time, $\mu_u = \arg\min F$
Perceptual learning: suppression over repetitions, $\mu_\theta = \arg\min F$
The MMN
[Figure: SSQ prediction error against repetition (1 to 6); LFP against peristimulus time (ms) for the oddball, the standard (P1) and the difference waveform (MMN); traces for primary area - amplitude, primary area - frequency, and secondary area - chirp.]
First presentation (before learning)
Last presentation (after learning)
Enhanced N1 (primary area)
MMN (secondary area)
P300 (tertiary area)?
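Repetition suppression through learning can be mimicked with a very reduced stand-in for the simulations above. The sketch assumes a stimulus that is a scaled sinusoid, a single learnable amplitude parameter theta, and a plain gradient step after each presentation. The names chirp and repeat_stimulus, the learning rate and the six repetitions are hypothetical choices, and this is not the DEM scheme used to produce the figures.

```python
import math

def chirp(theta, n=64):
    """Toy stimulus: one cycle of a sinusoid scaled by the parameter theta."""
    return [theta * math.sin(2.0 * math.pi * t / n) for t in range(n)]

def repeat_stimulus(theta_true=4.0, theta=0.0, rate=0.01, repetitions=6):
    """Present the same stimulus repeatedly, updating theta after each one.

    Returns the sum of squared prediction error for every presentation.
    """
    stimulus = chirp(theta_true)
    basis = chirp(1.0)
    ssq = []
    for _ in range(repetitions):
        prediction = [theta * b for b in basis]
        error = [s - p for s, p in zip(stimulus, prediction)]
        ssq.append(sum(e * e for e in error))
        # Learning: gradient step on squared error with respect to theta.
        theta += rate * sum(e * b for e, b in zip(error, basis))
    return ssq

ssq = repeat_stimulus()
for repetition, value in enumerate(ssq, start=1):
    print(repetition, round(value, 2))
```

Each presentation leaves the parameter closer to the value that generated the stimulus, so the prediction error evoked by the next presentation is smaller; a deviant stimulus would break this and evoke a larger error again.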
Summary
A free energy principle can account for several aspects of action and perception
The architecture of cortical systems speaks to hierarchical generative models
Estimation of hierarchical dynamic models corresponds to a generalised deconvolution of inputs to disclose their causes
This deconvolution can be implemented in a neuronally plausible fashion by constructing a dynamic system that self-organises when exposed to inputs to suppress its free energy
Minimisation of free energy proceeds over many spaces, including the state of a model (perception), its parameters (learning), its hyperparameters (salience and attention) and the model itself (selection in somatic or evolutionary time).