Class 21, 1999, CBCL/AI MIT, Neuroscience II, T. Poggio.
Brain Overview?
The Ventral Visual Pathway
modified from Ungerleider and Haxby, 1994
Visual Areas
Face-tuned cells in IT
Model of view-invariant recognition: learning from views
[Figure: network of view-tuned units, responses plotted against view angle]
Poggio and Edelman, Nature, 1990.
A graphical rewriting of the mathematics of regularization (GRBF), a learning technique
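The GRBF idea can be made concrete with a short sketch (a minimal illustration with hypothetical toy feature vectors, not the original model code): each stored view is the center of a Gaussian radial basis unit, and an object-specific output unit sums the view-tuned activations.

```python
import numpy as np

def rbf_view_network(stored_views, sigma=0.5):
    """Sketch of the GRBF scheme: each stored view is the center of a
    Gaussian view-tuned unit; an output unit sums the activations."""
    def respond(x, weights):
        # Gaussian activation of each view-tuned unit.
        acts = np.array([np.exp(-np.linalg.norm(x - v) ** 2 / (2 * sigma ** 2))
                         for v in stored_views])
        return float(weights @ acts)
    return respond

# Toy example: two stored "views" of one object, as 2-D feature vectors.
views = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
net = rbf_view_network(views, sigma=0.5)
w = np.ones(2)            # equal weights: view-invariant, object-specific
print(net(views[0], w))   # strong response at a training view
print(net(np.array([3.0, 3.0]), w))  # weak response far from all views
```

An input near any stored view drives its Gaussian unit strongly, so the weighted sum generalizes across training views while remaining specific to the object.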
Learning to Recognize 3D Objects in IT Cortex
Logothetis, Pauls, Poggio, 1995
Examples of Visual Stimuli
After human psychophysics (Buelthoff, Edelman, Tarr, Sinha, …), which supports models based on view-tuned units... physiology!
Task Description
[Figure: trial schematics. Fixation task: yellow fixspot; stimulus on/off; color change after 2 sec; lever response. Recognition task: blue fixspot; a learning phase presenting target views, then a testing phase mixing targets (T) and distractors (D); left lever = target, right lever = distractor]
Logothetis, Pauls, Poggio, 1995
Recording Sites in Anterior IT
[Figure: lateral view of the brain marking the LUN, LAT, IOS, STS, and AMTS sulci; recording sites in AMTS]
Logothetis, Pauls, and Poggio, 1995; Logothetis and Pauls, 1995
Model’s predictions: View-tuned Neurons
[Figure: view-tuned units’ responses as a function of view angle]
The Cortex: Neurons Tuned to Object Views
Logothetis, Pauls, Poggio, 1995
A View Tuned Cell
[Figure: raster plots (800 msec) and tuning curve, in spikes/sec, for target views from -168° to +168° in 12° steps, with responses to distractors for comparison]
Logothetis, Pauls, Poggio, 1995
Model’s predictions: View-invariant, Object-specific Neurons
[Figure: a view-invariant, object-specific unit’s response as a function of view angle]
The Cortex: View-invariant, Object-specific Neurons
Logothetis, Pauls, Poggio, 1995
Recognition of Wire Objects
Generalization Field
[Figure: generalization of recognition over rotations around the X, Y, and Z axes (roughly ±45° to ±90° shown); responses to distractors (N=60) for comparison]
[Figure: Amoeba 01, Cell 265: (a) raster plots (600 msec) at numbered views; (b) tuning across views from -12° to 168° in 12° steps; peak response 184 spikes/sec]
[Figure: Wire 526, Cell 202: (a) raster plots (600 msec) at numbered views; (b) tuning across views from -72° to 108° in 12° steps; peak response 142 spikes/sec]
[Figure: spikes per second vs. rotation around the Y axis (-180° to 180°), with marked views at -120 deg and 60 deg; responses to distractors (N=60) for comparison]
Hit Rate > 95% for all views
View-dependent Response of an IT Neuron
[Figure: peristimulus histograms (0-500 msec) at several views; spikes per second and hit rate plotted against rotation around the Y axis (degrees)]
Sparse Representations in IT
In the recording area in AMTS -- a specialized region for paperclips (!) -- we estimate that there are, after training (within an order of magnitude or two):
• About 400 view-tuned cells per object
• Perhaps 20 view-invariant cells per object
Logothetis, Pauls, Poggio, 1997
Previous glimpses: cells tuned to face identity and view
Perrett, 1989
2. View-tuned IT neurons
View-tuned cells in IT Cortex: how do they work? How do they achieve selectivity and invariance?
Max Riesenhuber and T. Poggio, Nature Neuroscience, just published
max
Some of our funding is from Honda...
Model’s View-tuned Neurons
[Figure: view-tuned units’ responses as a function of view angle]
Scale-Invariant Responses of an IT Neuron
(training on one size only!)
[Figure: responses (spikes/sec, 0-76) over time (0-3000 msec) at eight stimulus sizes: 1.0 deg (x 0.4), 1.75 deg (x 0.7), 2.5 deg (x 1.0), 3.25 deg (x 1.3), 4.0 deg (x 1.6), 4.75 deg (x 1.9), 5.5 deg (x 2.2), 6.25 deg (x 2.5)]
Logothetis, Pauls and Poggio, 1995
Invariances: Overview
• Invariance around training view
• Invariance while maintaining specificity
[Figure: (a) spike rate vs. rotation around the Y axis; (b) spike rate for the 10 best distractors; (c) target response / mean of best distractors vs. degrees of visual angle; (d) the same ratio for azimuth and elevation offsets (x = 2.25 degrees)]
Logothetis, Pauls and Poggio, 1995
Our quantitative model builds upon previous hierarchical models
• Hubel & Wiesel (1962): simple to complex to “higher-order hypercomplex” cells
• Fukushima (1980): alternation of “S” and “C” layers to build up feature specificity and translation invariance, respectively
• Perrett & Oram (1993): pooling as a general mechanism to achieve invariance
Model of view tuned cells
Max Riesenhuber and Tommy Poggio, 1999
Model Diagram
[Figure: hierarchical feedforward model from “V1” through “V4” to “IT”, with learned weights w at the top level]
View-specific learning: synaptic plasticity
Max (or “softmax”)
• key mechanism in the model
• computationally equivalent to selection (and scanning in our object detection system)
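The “softmax” here can be written in several ways; one common polynomial form (an assumption, since the slide does not fix the formula) is y = Σ x_j^(q+1) / Σ x_j^q, which gives the mean of the inputs at q = 0 and approaches the hard max as q grows:

```python
import numpy as np

def softmax_pool(x, q):
    """Soft maximum of nonnegative inputs x: sum(x^(q+1)) / sum(x^q).
    q = 0 gives the mean; large q approaches max(x)."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** (q + 1)) / np.sum(x ** q))

x = [0.2, 0.5, 1.0, 0.3]
print(softmax_pool(x, 0))    # mean of x: 0.5
print(softmax_pool(x, 20))   # close to max(x) = 1.0
```

The single parameter q thus interpolates continuously between linear (sum/average) pooling and selection of the strongest afferent.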
V1: Simple Features, Small Receptive Fields
• Simple cells respond to bars
Hubel & Wiesel, 1959
• “Complex cells”: translation invariance; pool over simple cells of the same orientation (Hubel & Wiesel)
Two possible Pooling Mechanisms
thanks to Pawan Sinha
An Example: Simple to Complex Cells
[Figure: several “simple” cells converging on a “complex” cell; how should it pool?]
Simple to Complex: Invariance to Position and Feature Selectivity
[Figure: “simple” cells at different positions converging on a “complex” cell]
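The simple-to-complex step can be written in a few lines (hypothetical kernel and image; the mapping to real V1 circuitry is of course far looser): “simple” responses are computed for one orientation at several positions, and the “complex” output is their max, keeping orientation selectivity while discarding position.

```python
import numpy as np

def simple_responses(image, kernel, positions):
    """Responses of position-shifted 'simple' cells sharing one
    oriented kernel (rectified dot product at each position)."""
    h, w = kernel.shape
    return [max(0.0, float(np.sum(image[r:r+h, c:c+w] * kernel)))
            for r, c in positions]

def complex_response(image, kernel, positions):
    # Max pooling over position: translation-invariant, orientation-selective.
    return max(simple_responses(image, kernel, positions))

vert = np.array([[-1, 2, -1]] * 3)          # vertical-bar kernel
img = np.zeros((5, 5)); img[1:4, 3] = 1.0   # vertical bar at column 3
pos = [(1, c) for c in range(3)]            # simple cells at 3 positions
print(complex_response(img, vert, pos))     # same value wherever the bar is
```

Shifting the bar to a different column changes which simple cell fires, but not the complex cell’s output.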
3. Some predictions of the model
• Scale and translation invariance of view-tuned AIT neurons
• Response to pseudomirror views
• Effect of scrambling
• Multiple objects
• Robustness to clutter
• Consistent with K. Tanaka’s simplification procedure
• More and more complex features from V1 to AIT
Testing Selectivity and Invariance of Model Neurons
• Test specificity AND transformation tolerance of view-tuned model neurons
• Same objects as in Logothetis’ experiment
• 60 distractors
Invariances of IT (view-tuned) Model Neuron
Invariances: Experiment vs. Model (view-tuned cells)
[Figure: bar plots comparing model and experiment: 3D rotation invariance (degrees), translation invariance (degrees of visual angle), and scale-change invariance (octaves)]
MAX vs. Summation
[Figure: invariance of model neurons under max vs. sum pooling: 3D rotation (degrees), translation (degrees of visual angle), and scale change (octaves)]
Response to Pseudo-Mirror Views
As in the experiment, some model neurons show tuning to the pseudo-mirror image
Robustness to scrambling: model and IT neurons
Experiments: Vogels, 1999
Recognition in Context: Two Objects
Recognition in Context: some experimental support
• Sato: Response of IT cells to two stimuli in RF
Sato, 1989
Recognition in Clutter: data
How does the response of IT neurons change if a background is introduced?
[Figure: average response and percent correct for stimulus alone vs. stimulus + background]
Missal et al., 1997
Recognition in Clutter: model
• average model neuron response
• recognition rates
Further Support: Keiji just mentioned his simplification paradigm...
Wang et al., 1998
Consistent behaviour of the model
Higher complexity and invariances in Higher Areas
Kobatake & Tanaka, 1994
Fujita and Tanaka’s Dictionary of Shapes (about 3000) in posterior IT (columnar organization)
Similar properties in the model...
M. Tarr, Nature Neuroscience
Layers With Linear Pooling and With Max Pooling
• Linear pooling: yields more complex features (e.g. from LGN inputs to simple cells and -- perhaps -- from PIT to AIT cells)
• Max pooling: yields position- and scale-invariant features over a larger receptive field (e.g. from simple to complex V1 cells)
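The two pooling roles can be contrasted in a toy sketch (all afferent values are hypothetical): a weighted sum over afferents tuned to different features yields a more complex conjunction feature, while a max over afferents tuned to the same feature at different positions yields an invariant feature.

```python
# Afferent responses to some stimulus (hypothetical values).
horiz, vert = 0.9, 0.8           # two different features at one position
same_feature = [0.1, 0.95, 0.3]  # one feature at three positions

# Linear pooling ("S"-type step): conjunction of different features,
# i.e. a more complex feature such as a corner detector.
corner = 0.5 * horiz + 0.5 * vert

# Max pooling ("C"-type step): invariance over position for one feature.
invariant = max(same_feature)

print(corner, invariant)
```

Linear pooling mixes its afferents, so its output depends on all of them; max pooling passes through whichever afferent is strongest, so its output is unchanged when the feature moves.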
4. Hypothetical circuitry for Softmax
• The max operation is at the core of the model properties
• Which biophysical mechanisms and circuitry underlie the max operation?
Softmax circuitry
The SOFTMAX operation may arise from cortical microcircuits of lateral inhibition between neurons in a cortical layer. An example: a circuit based on feedforward (or recurrent) shunting presynaptic (or postsynaptic) inhibition. Key elements: 1) shunting inhibition; 2) nonlinear transformation of the signals (synaptic nonlinearities or active membrane properties). The circuit performs a gain-control operation (as in the canonical microcircuit of Martin and Douglas…) and -- for certain values of the parameters -- a softmax operation:
[Circuit diagram: inputs x_i, x_j with synaptic exponents p and q, output y]
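The formula on this slide did not survive the transcript; a reconstruction consistent with the surrounding description (an assumption: x_j are the excitatory inputs, p and q the exponents of the synaptic nonlinearities, k a small constant set by the inhibitory gain) is

```latex
y = \frac{\sum_j x_j^{\,p+1}}{k + \sum_j x_j^{\,q}}
```

For p = q and small k this approaches max_j x_j as the exponents grow, while other parameter choices give a divisive gain-control (normalization) operation.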
Summary: main points of model
• Max-like operation, computationally similar to scanning and selecting
• Hypothetical inhibitory microcircuit for Softmax in cortex
• Easy grafting of top-down attentional effects onto the circuitry
• Segmentation is a byproduct of recognition
• No binding problem; synchronization not needed
• Model is an extension of the classical hierarchical H-W scheme
• Model deals with nice object classes (e.g. faces) and can be extended to object classification (rather than subordinate-level recognition)
• Just a plausibility proof!
• Experiments wanted (to prove it wrong)!
Category boundary
[Figure: morph line from 100% Cat prototypes through 80% and 60% Cat morphs and 60% and 80% Dog morphs to 100% Dog prototypes]
Novel 3D morphing system to create new objects that are linear combinations of 3D prototypes
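“Linear combinations of 3D prototypes” means a weighted average of corresponded vertex positions; a toy sketch (hypothetical vertex arrays, not the actual morphing system):

```python
import numpy as np

def morph(prototypes, weights):
    """Linear combination of corresponded 3D prototype shapes.
    prototypes: list of (n_vertices, 3) arrays in vertex correspondence.
    weights: nonnegative, summing to 1 (e.g. 0.6 cat / 0.4 dog)."""
    weights = np.asarray(weights, dtype=float)
    assert np.isclose(weights.sum(), 1.0)
    return sum(w * p for w, p in zip(weights, prototypes))

# Toy prototypes: two "shapes" with 2 corresponded vertices each.
cat = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
dog = np.array([[0.0, 1.0, 0.0], [1.0, 1.0, 1.0]])
m60 = morph([cat, dog], [0.6, 0.4])  # a "60% cat" morph
print(m60)
```

Because each vertex interpolates smoothly between the prototypes, the weights directly parameterize the position of a stimulus along the cat-dog continuum.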
Object classification task for monkey physiology
[Figure: trial sequence: Fixation, Sample, Delay, Test (nonmatch), Delay, Test (match); phase durations 600 ms, 1000 ms, 500 ms]
Preliminary results from Prefrontal Cortex Recordings
[Figure: cell l04.spk 1301: spike rate (Hz) vs. time (0-3000 msec) across fixation, sample-on, and delay periods, for dog 100%/80%/60% and cat 60%/80%/100% morphs; dog activity separates from cat activity]
This suggests that prefrontal neurons carry information about the category of objects
Recognition in Context: some experimental support
• Sato: Response of IT cells to two stimuli in RF (Sato, 1989)
Summation index: 0 for Max, 1 for Sum; Sato finds -0.1 on average
Simulation