Brain-like software architecture Confessions of an ex-neuroscientist Bill Softky.
Brain-like software architecture
Confessions of an ex-neuroscientist
Bill Softky
Which comes first: the problem or the solution?
• Reverse engineering starts with hardware, works backward
• Usually only succeeds if problem is understood
• “Forward” software engineering starts with the problem, and saves hardware for last
“Forward” software engineering
Question | UPS example | e*Trade example
1 What is the “truth” out there? | Package en route | John Doe, investor
2 How does input data approximate the truth? | Tracking #, destination | Customer #, portfolio owned, trades made
3 What do we want to do with the data? | Deliver to next station; display web tracking page | Log in, display net worth, trade stocks
4 What architecture? | Client-server, relational DB | Distributed servers, separate web front-end
5 What implementation? | Oracle, C++, cgi-script web, Solaris machines | MySQL, Java server pages, Linux/Intel clusters
“Reverse” engineering
Question | Neocortex vision | Neocortex audio
? What is the “truth” out there? | Moving objects | Speech
? How does input data approximate the truth? | Retinal “pixels”: contours, color, correlation, disparity | Sound pressure waveforms: frequencies, stereo, echoes
? What do we want to do with the data? | Find what and where an object is | Figure who/where is talking, what they're saying, what they mean
1-2 What architecture? | Cortical columns, attractors, spikes, associative
1-2 What implementation? | Hebbian synapses, integrate-and-fire, shunting inhibition
From an engineering perspective, this is nuts!
Initial goals here
• Input: we need a generic description of sensory input (at least audio & visual)
• Processing: speculate on generic, modular processing “API” which can untangle those correlations
• No neurons, synapses, spikes…yet.
Simple “truth” → tangled inputs
Hypothesis: each entangling transformation is fairly simple
Stepwise decorrelation → untangled truth
Hypothesis: a sequence of similar compressions will yield useful representation
First toy problem: cocktail party with echoes
• Multiple independent speakers
• Multiple “ears” (mics)
• Multiple echoes/amplitudes for each speaker/mic combo
• Echo patterns constrained (3-D) and unchanging
Try to remove echoes and separate speakers (our brains can do this...)
Echo kernels = location info
[Figure: “pure signals” S 1 and S 2 (2 x 10 kHz), passed through echo kernels / transfer functions / “maps” ((x,y,z), static) and summed, give the “entangled signals” at mics M-a, M-b and M-c (3 x 10 kHz)]
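The cocktail-party entanglement can be sketched in a few lines. This is a minimal illustration, assuming each echo kernel is a short list of gains indexed by delay in samples; the names `convolve` and `mix`, and all the numbers, are hypothetical and not taken from the slides.

```python
def convolve(signal, kernel):
    """Discrete convolution: each kernel tap adds a delayed, scaled echo."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for delay, gain in enumerate(kernel):
        for t, x in enumerate(signal):
            out[t + delay] += gain * x
    return out


def mix(sources, kernels):
    """kernels[m][s] is the (assumed) echo kernel from speaker s to mic m."""
    mics = []
    for mic_kernels in kernels:
        echoed = [convolve(src, k) for src, k in zip(sources, mic_kernels)]
        length = max(len(e) for e in echoed)
        # Each mic hears the sum of every speaker's echoes.
        mics.append([sum(e[t] for e in echoed if t < len(e)) for t in range(length)])
    return mics


# Two made-up "pure signals" (speakers) and three mics.
s1 = [1.0, 0.0, 0.0, 0.0]
s2 = [0.0, 2.0, 0.0, 0.0]
kernels = [
    [[1.0, 0.0, 0.5], [0.0, 0.3]],       # mic a: direct path plus one echo
    [[0.0, 0.8], [1.0]],                 # mic b
    [[0.2], [0.0, 0.0, 0.6]],            # mic c
]
mics = mix([s1, s2], kernels)
```

The kernels stay fixed while the signals change, which is the sense in which the echo pattern carries static location information.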
Second toy problem: video
• Moving “objects” (simple shapes)
• Constant velocity
• Spatiotemporal pixel pattern is just echoes from t=0 at center
Echo kernels = location/shape/velocity
[Figure: the “time at center” signal (1 kHz), passed through echo kernels ({v, }, semi-static), gives the spatio-temporal pixel responses at (0,0), (0,1), ... (4,4): the “entangled signals” (100 x 1 kHz)]
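A rough sketch of the "pixels are echoes of the center" idea follows. It assumes the simplest possible object, a straight edge perpendicular to its direction of motion, moving at a constant nonzero velocity in pixels per sample, so each pixel sees a time-shifted copy of the center signal. The function names and values are illustrative only.

```python
def pixel_delay(pixel, velocity):
    """Samples between the edge crossing the center (0, 0) and reaching `pixel`.

    Assumes a straight edge moving at constant, nonzero `velocity`
    (pixels per sample); negative delays mean the pixel is hit first.
    """
    (px, py), (vx, vy) = pixel, velocity
    speed_sq = vx * vx + vy * vy
    # Project the pixel offset onto the direction of motion.
    return round((px * vx + py * vy) / speed_sq)


def shift(signal, delay):
    """Delay (or advance, if negative) a signal, padding with zeros."""
    n = len(signal)
    return [signal[t - delay] if 0 <= t - delay < n else 0.0 for t in range(n)]


center = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]   # "time at center" signal
velocity = (1, 0)                                    # one pixel per sample, along x

responses = {
    pixel: shift(center, pixel_delay(pixel, velocity))
    for pixel in [(-1, 0), (0, 0), (0, 1), (2, 0)]
}
```

Here the set of delays plays the role of the echo kernel: it depends only on the velocity and shape, not on when the object appears.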
Generic entanglement
[Figure: very few independent pure signals to track, combined with echo kernels whose low-dim subspace gives persistent structure, yield many entangled, correlated, high-bw signals as inputs]
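One hedged way to write the generic entanglement on this slide as a formula. The symbols are notation introduced here for illustration, not taken from the slides: $s_j$ are the few pure signals, $k_{ij}$ the echo kernels, $\theta_j$ the low-dimensional, near-static parameters of each kernel family (for example a position {x,y,z}), and $x_i$ the many input channels.

```latex
x_i(t) \;=\; \sum_{j=1}^{J} \sum_{\tau \ge 0} k_{ij}(\tau;\theta_j)\, s_j(t-\tau),
\qquad i = 1,\dots,N, \quad J \ll N
```

Both toy problems fit this form under these assumptions: in the cocktail party the $x_i$ are microphone signals, and in the video case they are pixel responses.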
Recap: echo-entanglement as a generic perceptual problem
• Very similar to early vision
• Just like audio echo-removal
• Structured “echoes” carry near-static info
• Associative memory and vector quantization are special cases
How to dis-entangle?
• Want to reveal original signals and structures
• Problem is hard (unsolved!) So…
– Skip the mere algorithms
– Skip the neurons and biology
– Focus on a module’s inputs & outputs
– Try to make modules work together
What would one disentangling module do?
• Note separate timescales:
– Many channels of high-BW input
– 1-3 indep channels med-BW output (time blurring)
– Many channels near-static output & input
• Learn correlations (echoes) in input
• Find low-dim subspace for echoes (e.g. {x,y,z}, or {v, })
• Reconstruct inputs all at once (batch)
• Minimize reconstruction error
(Assume typically 1 pure signal max during learning)
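The slides deliberately skip the algorithm, so the following is only a toy stand-in for "learn echoes, reconstruct, minimize error". It assumes a single pure signal, treats one channel as a reference, and fits one delay and one gain per channel by cross-correlation and least squares. `best_lag` and `fit_echo` are hypothetical names.

```python
def best_lag(reference, channel, max_lag):
    """Lag (in samples) at which `channel` correlates best with `reference`."""
    def score(lag):
        return sum(
            reference[t] * channel[t + lag]
            for t in range(len(reference))
            if 0 <= t + lag < len(channel)
        )
    return max(range(-max_lag, max_lag + 1), key=score)


def fit_echo(reference, channel, max_lag):
    """Fit channel ~ gain * reference delayed by lag; return (lag, gain, error)."""
    lag = best_lag(reference, channel, max_lag)
    shifted = [
        reference[t - lag] if 0 <= t - lag < len(reference) else 0.0
        for t in range(len(channel))
    ]
    energy = sum(x * x for x in shifted)
    # Least-squares gain for the chosen lag.
    gain = sum(c * x for c, x in zip(channel, shifted)) / energy if energy else 0.0
    # Batch reconstruction error over the whole window.
    error = sum((c - gain * x) ** 2 for c, x in zip(channel, shifted))
    return lag, gain, error


reference = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
channel = [0.0, 0.0, 0.0, 0.5, 1.5, 0.5, 0.0, 0.0]   # half amplitude, 2 samples late
lag, gain, error = fit_echo(reference, channel, max_lag=4)
```

The fitted (lag, gain) pairs are the near-static part of the output; a fuller module would also constrain them to a low-dimensional subspace, which this sketch does not attempt.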
Basic disentangling module
[Figure: basic module, e.g. for cocktail-party decorrelation. Labels: float inputs (“mics”); float outputs (pure signal; x,y,z); decorrelation & vector quantization; reconstruction & prediction; time windows T=-500 to +100 (coarse) and T=-500 to +100 (fine), each marked “now”]
Add multiple, independent outputs
• Multiple speakers/objects → multiple outputs
• Each output represents one object (max 3)
• Output streams and mappings are independent
• An even harder disentangling task
• (complications too!....)
Module with multiple outputs
[Figure: one module with three output streams: Speaker 1 (x1,y1,z1), Speaker 2 (x2,y2,z2), Speaker 3 (x3,y3,z3)]
Add confidence estimates (sigmas)
• Disentangling is already a statistical-estimation task
• Confidence estimates come for free during reconstruction
• Propagate inputs’ sigmas forward
• Create output sigmas based on input sigmas and reconstruction error
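The slides do not say how output sigmas are computed. As one possible sketch, assume the output variance is the mean input variance plus the mean squared reconstruction residual; this rule and the name `output_sigma` are assumptions made here for illustration.

```python
import math


def output_sigma(input_sigmas, inputs, reconstruction):
    """Assumed rule: output variance = mean input variance + mean squared residual."""
    input_var = sum(s * s for s in input_sigmas) / len(input_sigmas)
    residual_var = sum(
        (x - r) ** 2 for x, r in zip(inputs, reconstruction)
    ) / len(inputs)
    return math.sqrt(input_var + residual_var)


# Perfect reconstruction: the output is only as uncertain as the inputs.
clean = output_sigma([0.3, 0.3], [1.0, 2.0], [1.0, 2.0])
# Poor reconstruction: the residual inflates the output sigma.
noisy = output_sigma([0.3, 0.3], [1.0, 2.0], [1.4, 2.4])
```

Under this assumption a module can never report more confidence than its inputs had, and a bad reconstruction lowers confidence further.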
Module with sigmas
Add layers
• Pure signal outputs become inputs to next layer
• Many modules below feed each module above
• Maybe, modules below can feed more than one above
• Whole upper layer uses longer and coarser timescale
• Stackable indefinitely
• Top layers have huge input range, long memory, broad abstractions
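A small sketch of how lower-layer outputs might be time-blurred before feeding a module above. Block averaging is an assumption chosen for simplicity (the slides only say the upper layer is "longer and coarser"), and `coarsen` and `feed_upward` are hypothetical names.

```python
def coarsen(signal, factor):
    """Average consecutive blocks of `factor` samples (simple time blurring).

    A trailing partial block is dropped.
    """
    return [
        sum(signal[i:i + factor]) / factor
        for i in range(0, len(signal) - factor + 1, factor)
    ]


def feed_upward(lower_outputs, factor):
    """Coarsen the pure-signal output of each lower module; the resulting
    list of channels is the input to one module in the layer above."""
    return [coarsen(out, factor) for out in lower_outputs]


lower_outputs = [
    [1, 1, 3, 3, 5, 5, 7],    # pure signal from lower module A
    [0, 2, 0, 2, 0, 2, 0],    # pure signal from lower module B
]
upper_inputs = feed_upward(lower_outputs, factor=2)
```

Applying the same factor at every layer makes the time window covered by one sample grow geometrically with height, which is one way to get the long memory attributed to the top layers.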
Modules in layers
[Figure: stacked modules with time-window labels T=-1000 to 200 and -500 to 100]
Add feedback
• Upper layer reconstructions provide estimates to lower modules (might help, can’t hurt)
• Near-static channels provide cheap “prediction” of input interrelations
• Update all estimates frequently
• Predicted pure signals could help reconstruction below
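One standard way a lower module could combine its own estimate with feedback from above is inverse-variance weighting. The slides do not prescribe this; it is swapped in here as an illustration, assuming both estimates carry independent errors and positive sigmas.

```python
def fuse(bottom_up, sigma_up, top_down, sigma_down):
    """Inverse-variance weighted average of two estimates of the same value.

    Assumes independent errors and strictly positive sigmas.
    """
    w_up = 1.0 / sigma_up ** 2
    w_down = 1.0 / sigma_down ** 2
    estimate = (w_up * bottom_up + w_down * top_down) / (w_up + w_down)
    sigma = (1.0 / (w_up + w_down)) ** 0.5
    return estimate, sigma


# Equal confidence: the fused value sits halfway and is more certain than either.
equal_value, equal_sigma = fuse(1.0, 1.0, 3.0, 1.0)
# Vague feedback: the fused value stays close to the bottom-up estimate.
vague_value, vague_sigma = fuse(1.0, 0.1, 3.0, 10.0)
```

With this rule the fused sigma is never larger than either input sigma, which matches the "might help, can’t hurt" intuition as long as the reported sigmas are honest.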
Feedback between modules
Open problems
• How to do the decompression?
– Iterative? Monte Carlo? Low-dim subspace?
• Multiple objects/pure signals:
– Deciding how many objects from a module
– “binding” problem across modules
– Which goes with which?
– Layers 2-N need “clones,” one clone per extra object
Summary: generic sensory model
• Assume inputs result from cascading a simple entangling transformation
• Entangling transformation is cocktail-party with echoes
Summary: stackable disentangling modules
• Assume one layer of disentangling can be learned and done somehow
• Separate time-series from static echo-kernel structure
• Disentangle time-series in batches
• Use reconstructions for error-checking and feedback
• Propose “API” by which such modules can interact to solve multi-scale, multi-sensory problems
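The slides propose an “API” without spelling it out. The sketch below is one guess at its shape, assuming each module exposes a forward pass, a reconstruction of its own inputs, and a slot for feedback from above. Every name here (`Estimate`, `Module`, `PassThrough`, `run_stack`) is hypothetical, and the placeholder module does no actual disentangling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Estimate:
    values: List[float]   # signal values for one batch
    sigmas: List[float]   # one confidence estimate per value


class Module:
    """Hypothetical interface for one disentangling module."""

    def __init__(self) -> None:
        self.prediction: Optional[Estimate] = None   # latest feedback from above

    def forward(self, inputs: Estimate) -> Estimate:
        """Entangled inputs in, pure-signal estimates out."""
        raise NotImplementedError

    def reconstruct(self, outputs: Estimate) -> Estimate:
        """Rebuild this module's inputs from its own outputs."""
        raise NotImplementedError

    def feedback(self, prediction: Estimate) -> None:
        """Receive the upper layer's estimate of this module's outputs."""
        self.prediction = prediction


class PassThrough(Module):
    """Placeholder that only exercises the interface."""

    def forward(self, inputs: Estimate) -> Estimate:
        return inputs

    def reconstruct(self, outputs: Estimate) -> Estimate:
        return outputs


def run_stack(layers: List[Module], inputs: Estimate) -> Estimate:
    """Pass estimates up the stack, then hand reconstructions back down."""
    activations = [inputs]
    for layer in layers:
        activations.append(layer.forward(activations[-1]))
    for i in range(len(layers) - 1):
        # Layer i+1 reconstructs its inputs, which are layer i's outputs.
        layers[i].feedback(layers[i + 1].reconstruct(activations[i + 2]))
    return activations[-1]


stack = [PassThrough(), PassThrough()]
top = run_stack(stack, Estimate(values=[1.0, 2.0], sigmas=[0.1, 0.1]))
```

Keeping values and sigmas together in one object means every call across layers carries its confidence with it, which is the part of the proposal this sketch is meant to make concrete.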