Auditory Coding

8/8/2019 Auditory Coding

1/30

What Determines a Neuron's Tuning? The Efficient Coding of Sensory Information

4 January 2011



2/30

18/01/2011 Efficient Coding of Sensory Information

Summary

1. Introduction

2. The model

3. Results

4. Conclusion

5. Bibliography


3/30


1. Introduction

The context :

Problem of finding an efficient coding


4/30


1. Introduction

What makes a code efficient?

Preserves the underlying sound features

Lowest size possible for a given quality

Easy to encode and decode


5/30


1. Introduction

Our problem here:

Given an input waveform

Time-relative spikes in a population

= Reconstructed waveform, with small differences from the original


6/30


1. Introduction

What had been done until then:

Reverse Correlation (RevCor):

Given an input waveform

We insert different white noises

We use filters to find the most probable spikes

Then we use functions to reconstruct a waveform
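
As an illustration of reverse correlation in its most common form (the spike-triggered average), here is a minimal Python sketch. It is not the procedure used in the physiological studies: the simulated neuron, its `true_filter`, the firing threshold of 2.0 and the stimulus length are all hypothetical choices made only to show the principle of averaging the white-noise stimulus that precedes each spike.

```python
import numpy as np

rng = np.random.default_rng(0)
stimulus = rng.standard_normal(20000)               # white-noise input
true_filter = np.array([0.0, 0.5, 1.0, 0.5, 0.0])   # hypothetical neuron filter
L = len(true_filter)

# Toy neuron (assumption): causal linear filtering followed by a hard threshold.
drive = np.convolve(stimulus, true_filter)[:len(stimulus)]
spike_times = np.flatnonzero(drive > 2.0)
spike_times = spike_times[spike_times >= L - 1]     # need a full window before each spike

# Reverse correlation: average the stimulus window that precedes each spike.
windows = np.stack([stimulus[t - L + 1:t + 1] for t in spike_times])
sta = windows.mean(axis=0)[::-1]                    # reorder so that index k is lag k

# The spike-triggered average should resemble the filter up to a scale factor.
similarity = np.corrcoef(sta, true_filter)[0, 1]
```

With Gaussian white noise, the average recovers the shape of the linear filter only up to a scale factor, which is why the sketch compares shapes with a correlation coefficient rather than raw values.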


8/30


1. Introduction

The originality of this model:

Theoretical code (black box model) vs. physiological revcor filters

Spikes are chosen to maximize the efficiency of the code (non-redundancy)

The algorithm is trained with specific datasets


10/30


2. The model

What do we need to encode an acoustic signal:

Suppressing useless information or noise

Efficient for a wide range of signals (both transient and harmonic)

Time-relative

Event-based


11/30


2. The model

An efficient way of coding is a kernel-based approach:

The signal $x(t)$ is encoded with a set of kernel functions $\phi_1, \ldots, \phi_M$ that can be positioned arbitrarily and independently in time.

Assuming that the kernel functions exist at all time points during the signal:

$$x(t) = \sum_{m=1}^{M} \int s_m(\tau)\, \phi_m(t - \tau)\, d\tau + \epsilon(t) \qquad (1)$$

$s_m(\tau)$ = coefficient at time $\tau$ for $\phi_m$

$\epsilon(t)$ = additive noise


12/30


2. The model

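By using a sparse coefficient signal $s_m(\tau)$ composed only of Dirac delta functions (our event-based condition), this equation reduces to:

$$x(t) = \sum_{m=1}^{M} \sum_{i=1}^{n_m} s_i^m\, \phi_m(t - \tau_i^m) + \epsilon(t) \qquad (2)$$

$s_i^m$ = coefficient of the $i$th instance of $\phi_m$

$\tau_i^m$ = temporal position of the $i$th instance of $\phi_m$

$n_m$ = number of instances of $\phi_m$ (can be different for each $m$)

$\epsilon(t)$ = additive noise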
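
To make the spike representation concrete, the following Python sketch rebuilds a waveform from a list of spikes by summing scaled, shifted kernels (the noise term is omitted). The function name `synthesize`, the `(kernel index, onset sample, coefficient)` tuple layout and the two toy kernels are assumptions made for illustration; they are not gammatones and are not taken from the original work.

```python
import numpy as np

def synthesize(spikes, kernels, n_samples):
    """Sum s * phi_m shifted to onset tau over all spikes (no noise term)."""
    x = np.zeros(n_samples)
    for m, tau, s in spikes:              # kernel index, onset sample, coefficient
        phi = kernels[m]
        end = min(tau + len(phi), n_samples)
        x[tau:end] += s * phi[:end - tau]  # clip kernels that run past the end
    return x

# Two toy kernels (hypothetical shapes) and three spikes.
kernels = [np.array([1.0, 0.5]), np.array([1.0, -1.0, 1.0])]
spikes = [(0, 0, 2.0), (1, 3, 1.0), (0, 4, -1.0)]
x = synthesize(spikes, kernels, n_samples=8)
```

Each kernel may appear a different number of times, and overlapping instances simply add, which is what makes the code both event-based and time-relative.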


14/30


2. The model

This is just a way to code sounds; we still need to find values for these distinct parameters, in two (linked) steps:

1. Encoding: Determining the optimal temporal positions and coefficients of the kernel functions

2. Learning: Determining the optimal kernel functions

We repeat these steps until a threshold value is reached (here 0.1 for the coefficient s).

At the beginning, kernel functions are initialized as standard gammatone functions.
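
A gammatone initialization could look like the sketch below. It uses the textbook gammatone form, $t^{n-1} e^{-2\pi b t} \cos(2\pi f_c t)$, with the bandwidth set from the Glasberg-Moore ERB formula. The filter order of 4, the 25 ms duration, the 16 kHz sampling rate and the three centre frequencies are assumptions chosen for the example; the slides do not specify them.

```python
import numpy as np

def gammatone(fc, fs, duration=0.025, order=4):
    """Unit-norm gammatone kernel with centre frequency fc (Hz) at sampling rate fs (Hz)."""
    t = np.arange(int(round(duration * fs))) / fs
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)   # Glasberg-Moore equivalent rectangular bandwidth
    b = 1.019 * erb                            # conventional bandwidth scaling
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.linalg.norm(g)               # unit norm, as matching pursuit expects

# Hypothetical initial dictionary with three centre frequencies.
kernels = [gammatone(fc, fs=16000) for fc in (500.0, 1000.0, 2000.0)]
```

Normalizing each kernel to unit norm means the inner product with the signal can be read directly as the spike coefficient.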


15/30


2. The model

Encoding: Matching pursuit

The general idea is to iteratively approximate the input signal with successive orthogonal projections onto the unit-normed gammatone kernels.

As such, we decompose the signal as

$$x(t) = \langle x(t), \phi_m \rangle\, \phi_m + R_x(t) \qquad (1)$$

$\langle x(t), \phi_m \rangle$ = inner product between the signal and $\phi_m$, equivalent to $s_m$

$R_x(t)$ = residual signal after approximating $x(t)$ in the direction of $\phi_m$


16/30


2. The model

The projection with the largest inner product minimizes the power of $R_x(t)$, thus yielding the best approximation of $x(t)$ with a single kernel. We want to record the coefficient for this approximation.

Iteratively, (1) becomes

$$R_x^n(t) = \langle R_x^n(t), \phi_m \rangle\, \phi_m + R_x^{n+1}(t)$$

with $R_x^0(t) = x(t)$ on initialization.

We then subtract the best-fitting projection and record it, leaving $R_x^{n+1}(t)$ orthogonal to $\phi_m$.

On each iteration, the power of $R_x^n(t)$ is thus bound to diminish. We set a threshold to stop the algorithm.
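
The iteration described on this slide can be sketched as a brute-force matching pursuit over all kernels and all time shifts. This is a minimal illustration, not the authors' implementation: the function name `matching_pursuit`, the `max_spikes` cap and the two toy kernels are assumptions, kernels are assumed to be unit-norm and shorter than the signal, and a practical version would update the correlations incrementally instead of recomputing them.

```python
import numpy as np

def matching_pursuit(x, kernels, threshold=0.1, max_spikes=1000):
    """Greedy encoding: returns (spikes, residual), spikes as (m, tau, s) tuples."""
    residual = np.asarray(x, dtype=float).copy()    # residual starts as the signal
    spikes = []
    for _ in range(max_spikes):
        best = None
        for m, phi in enumerate(kernels):
            # Inner product of the residual with phi at every time shift.
            corr = np.correlate(residual, phi, mode="valid")
            tau = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[tau]) > abs(best[2]):
                best = (m, tau, corr[tau])
        m, tau, s = best
        if abs(s) < threshold:                       # stop once coefficients get small
            break
        residual[tau:tau + len(kernels[m])] -= s * kernels[m]   # subtract the projection
        spikes.append((m, tau, float(s)))
    return spikes, residual

# Toy check: a signal built from two unit-norm kernels is decoded exactly.
k0 = np.array([1.0, 1.0]) / np.sqrt(2.0)
k1 = np.array([1.0, -1.0]) / np.sqrt(2.0)
x = np.zeros(10)
x[1:3] += 3.0 * k0
x[6:8] += 2.0 * k1
spikes, residual = matching_pursuit(x, [k0, k1])
```

Because the largest projection is removed first, the spikes come out roughly in order of decreasing magnitude, and the threshold directly controls how many spikes are kept.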


17/30


2. The model

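Learning: Probabilistic form

We rewrite our main equation as:

$$p(x \mid \Phi) = \int p(x \mid \Phi, s)\, p(s)\, ds \approx p(x \mid \Phi, \hat{s})\, p(\hat{s}) \qquad (1)$$

where $\hat{s}$ is the approximation of the posterior maximum (the coefficients found during encoding) and $\Phi$ is the set of kernels $\phi_1, \ldots, \phi_M$. The kernels are learned by following the gradient of $\log p(x \mid \Phi)$ with respect to $\phi_1, \ldots, \phi_M$.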


18/30


2. The model

Learning: Probabilistic form

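We then have, for each $\phi_m$:

$$\frac{\partial}{\partial \phi_m} \log p(x \mid \Phi) = \frac{1}{\sigma_\epsilon} \sum_{i} \hat{s}_i^m\, [x - \hat{x}]_{\tau_i^m}$$

where $\Phi = \{\phi_1, \ldots, \phi_M\}$ and:

$[x - \hat{x}]_{\tau_i^m}$ = residual error at position $\tau_i^m$ of kernel $\phi_m$

$\sigma_\epsilon$ = variance of the additive noise

We know $\hat{s}$ and $\hat{x}$, so with additional computations we can deduce an optimal value for $\phi_m$.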
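
One way to picture the learning step is a gradient update in which each kernel moves along the sum, over its spikes, of the spike coefficient times the residual error under that spike. The sketch below assumes exactly that rule under Gaussian noise; the function name `update_kernels`, the `step` size (which absorbs the noise-variance factor) and the renormalization to unit norm are illustrative assumptions rather than details given in the slides.

```python
import numpy as np

def update_kernels(kernels, spikes, residual, step=0.01):
    """One gradient step per kernel; residual is x minus its reconstruction."""
    grads = [np.zeros_like(phi) for phi in kernels]
    for m, tau, s in spikes:                       # kernel index, onset, coefficient
        seg = residual[tau:tau + len(kernels[m])]  # residual under this spike
        grads[m][:len(seg)] += s * seg
    updated = []
    for phi, g in zip(kernels, grads):
        phi = phi + step * g
        updated.append(phi / np.linalg.norm(phi))  # keep kernels unit-norm (assumption)
    return updated

# Toy example: one kernel, one spike, residual concentrated on the second sample.
kernels = [np.array([1.0, 0.0])]
spikes = [(0, 2, 2.0)]
residual = np.zeros(6)
residual[2:4] = [0.0, 0.5]
new_kernels = update_kernels(kernels, spikes, residual, step=0.5)
```

In the toy example the kernel grows on its second sample, which is where the reconstruction was missing energy; alternating this step with encoding is what gradually reshapes the kernels.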


20/30


3. Results

Because one of the conditions was that the model had to be robust to a wide range of acoustic signals, the training dataset was composed of:

1. Mammalian vocalizations

2. Nature sounds

a) Ambient (rain, wind)

b) Transient (crunching leaves, impact of wood)


21/30


3. Results

Red: Model

Blue: Physiological cat data

The model predicts revcor (physiological) shapes!


22/30


3. Results

Comparison of the model initialized with different datasets:

Red: Classic model

Blue: Physiological cat data

Black: Environmental sounds initialization

Green: Animal vocalization initialization

With speech initialization, the model yields similar results as with the classic dataset.


23/30


3. Results

Comparison of the model initialized with different datasets :

Environmental sounds: Very brief

Vocalizations: Longer

Reversed speech: Reverse of classic model

(Grey bars = 5 ms)


24/30


3. Results

Efficiency of the code:

Red: Classic model

Light blue: Non-learning model

Black: Fourier transform

Blue: Daubechies wavelet transform

For fidelity under 35 dB (the threshold beyond which the difference between the original signal and the computed signal is imperceptible), the spike-coding model performs better than classic Fourier or wavelet transforms.
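
Fidelity in decibels can be computed as a signal-to-noise ratio between the original and the reconstructed waveform. The sketch below assumes that this is the measure behind the dB values quoted here; the function name `fidelity_db` and the toy signals are illustrative only.

```python
import numpy as np

def fidelity_db(x, x_hat):
    """Signal-to-noise ratio in dB between a signal and its reconstruction."""
    x = np.asarray(x, dtype=float)
    error = x - np.asarray(x_hat, dtype=float)
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum(error ** 2))

# Toy example: a reconstruction that is uniformly 10% too small.
x = np.array([1.0, -1.0, 1.0, -1.0])
snr = fidelity_db(x, 0.9 * x)
```

An amplitude error of 10% corresponds to a power ratio of 100, that is about 20 dB, so a fidelity near 35 dB means the reconstruction error carries only a very small fraction of the signal power.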


26/30


4. Conclusion

General conclusions :

This model yields results strikingly similar to those recorded physiologically in the auditory nerve of a cat.

The kernel functions we obtained with the right initialization (mixed natural sounds) should be good approximations of what happens in a neuron "black box".

The good results with speech initialization could indicate that evolution adapted in the direction of speech.


27/30


4. Conclusion

Limits and extensions :

Depends heavily on the datasets used (which one is the right one?)

Optimizing this system is NP-hard

Doesn't describe the underlying sound features

Doesn't take into account changes in response with signal intensity


29/30


5. Bibliography

Bibliography :

Evan Smith, Michael S. Lewicki, Efficient coding of time-relative structure using spikes, Neural Computation, Vol. 17, No. 1, 19-45 (January 2005).

Evan Smith, Michael S. Lewicki, Efficient auditory coding, Nature 439, 978-982 (23 February 2006).

Dario Ringach, Robert Shapley, Reverse correlation in neurophysiology, Cognitive Science (2003).

Mallat, S. G. & Zhang, Z., Matching pursuits with time-frequency dictionaries, IEEE Trans. Signal Process. 41, 3397-3415 (1993).


30/30


Thank you for your attention!
