Michigan State University

“Saliency-Based Visual Attention”

“Computational Modeling of Visual Attention”, Itti and Koch (Nature Reviews Neuroscience, 2001)

“A Model of Saliency-Based Visual Attention for Rapid Scene Analysis”, Itti, Koch, and Niebur (IEEE PAMI, 1998)

Zhengping Ji


Overview

Background
System architecture
The saliency map: preprocessing, feature maps, feature integration
Focus of attention
Results
Conclusion


Related Work

“Feature Integration Theory,” Treisman & Gelade, 1980.

Computational model of bottom-up attention, Koch and Ullman, 1985

The saliency map is believed to be located in the posterior parietal cortex (Gottlieb et al., 1998) and the pulvinar nuclei of the thalamus (Robinson & Petersen, 1992)


Architecture


Gaussian Pyramids

Repeated low-pass filtering

σ = 0, 1, 2, 3, …, 8; I(0) is the original input

I(σ + 1) = Subsampled[ I(σ) * G5x5 ]

Example: 640 x 480 → 320 x 240 → 160 x 120 → 80 x 60, where each step convolves with G5x5 and then subsamples by a factor of 2 in each dimension
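The repeated filter-and-subsample step can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the slides do not give the coefficients of G5x5, so the separable binomial kernel [1, 4, 6, 4, 1]/16 and the edge-replicating border handling are assumptions, and the names `blur5x5` and `gaussian_pyramid` are hypothetical.

```python
import numpy as np

# Separable 5-tap binomial kernel, a common stand-in for a 5x5 Gaussian.
# Assumption: the slides do not specify the coefficients of G5x5.
KERNEL_1D = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def blur5x5(img):
    """Low-pass filter a 2-D array with the separable 5x5 kernel."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="edge")  # replicate borders (assumption)
    # Filter along the columns axis, then along the rows axis.
    horiz = sum(wt * padded[:, i:i + w] for i, wt in enumerate(KERNEL_1D))
    return sum(wt * horiz[i:i + h, :] for i, wt in enumerate(KERNEL_1D))

def gaussian_pyramid(img, levels=9):
    """Return [I(0), ..., I(levels - 1)]; each level halves both dimensions."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(blur5x5(pyramid[-1])[::2, ::2])
    return pyramid

pyr = gaussian_pyramid(np.ones((480, 640)))
print([level.shape for level in pyr[:4]])
# [(480, 640), (240, 320), (120, 160), (60, 80)]
```

NumPy shapes are (rows, columns), so the 640 x 480 input appears as (480, 640).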


Preprocessing

Original image with red, green, and blue channels (r, g, b)

Intensity: I = (r + g + b)/3

Broadly tuned color channels:

R = r - (g + b)/2
G = g - (r + b)/2
B = b - (r + g)/2
Y = (r + g)/2 - |r - g|/2 - b
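As a worked example, the channel formulas above translate directly into array arithmetic. The sketch below is illustrative only; the function name `color_channels` is hypothetical, and clamping negative values to zero follows the 1998 paper rather than anything stated on this slide. The paper's normalization of r, g, b by intensity is omitted for brevity.

```python
import numpy as np

def color_channels(rgb):
    """Split an H x W x 3 array into intensity and broadly tuned R, G, B, Y."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    intensity = (r + g + b) / 3.0
    R = r - (g + b) / 2.0
    G = g - (r + b) / 2.0
    B = b - (r + g) / 2.0
    Y = (r + g) / 2.0 - np.abs(r - g) / 2.0 - b
    # Negative responses are set to zero (taken from the 1998 paper,
    # not stated on the slide).
    R, G, B, Y = (np.maximum(ch, 0.0) for ch in (R, G, B, Y))
    return intensity, R, G, B, Y

# One pure-red pixel and one pure-yellow pixel.
pixels = np.array([[[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]])
I, R, G, B, Y = color_channels(pixels)
print(R[0], Y[0])  # red pixel: R = 1, Y = 0; yellow pixel: R = 0.5, Y = 1
```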


Preprocessing

Figure: the R, G, B, and Y color channel images and the intensity image.


Center-surround Difference

Center-surround differences are computed as across-scale differences between a fine (center) scale c and a coarse (surround) scale s

Operator denoted by ⊖: interpolation to the finer scale, followed by point-to-point subtraction

One pyramid for each channel: I(σ), R(σ), G(σ), B(σ), Y(σ), where σ ∈ [0..8] is the scale
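The across-scale difference can be sketched as follows. The slides do not say which interpolation is used, so nearest-neighbour interpolation is an assumption here, and the helper names `upsample_to` and `center_surround` are hypothetical.

```python
import numpy as np

def upsample_to(coarse, shape):
    """Nearest-neighbour interpolation of `coarse` to a finer `shape`.

    Assumption: the slides do not name the interpolation scheme.
    """
    rows = (np.arange(shape[0]) * coarse.shape[0]) // shape[0]
    cols = (np.arange(shape[1]) * coarse.shape[1]) // shape[1]
    return coarse[np.ix_(rows, cols)]

def center_surround(center, surround):
    """|center (-) surround|: interpolate the coarse map, subtract pointwise."""
    return np.abs(center - upsample_to(surround, center.shape))

center = np.arange(16.0).reshape(4, 4)          # fine scale c
surround = np.array([[0.0, 10.0], [20.0, 30.0]])  # coarse scale s
print(center_surround(center, surround))
```

In the full model, `center` and `surround` would be two levels of the same pyramid, for example I(2) and I(5).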


Intensity Feature Maps

I(c, s) = | I(c) ⊖ I(s) |, with c ∈ {2, 3, 4} and s = c + δ, where δ ∈ {3, 4}

So:

I(2, 5) = | I(2) ⊖ I(5) |
I(2, 6) = | I(2) ⊖ I(6) |
I(3, 6) = | I(3) ⊖ I(6) |
…

6 feature maps in total
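The count of six maps follows from the three center scales and the two center-surround offsets. A short enumeration makes this concrete; the variable names are illustrative only.

```python
center_scales = (2, 3, 4)  # c
offsets = (3, 4)           # delta, with s = c + delta
pairs = [(c, c + d) for c in center_scales for d in offsets]
print(pairs)
# [(2, 5), (2, 6), (3, 6), (3, 7), (4, 7), (4, 8)]
```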


Colour Feature Maps

Similar to double-opponent cells in primary visual cortex: red-green and yellow-blue

RG(c, s) = | (R(c) - G(c)) ⊖ (G(s) - R(s)) |
BY(c, s) = | (B(c) - Y(c)) ⊖ (Y(s) - B(s)) |

Same c and s as with intensity

Figure: double-opponent receptive fields, with regions labeled +R-G and +G-R (red-green) and +B-Y and +Y-B (blue-yellow).


Orientation Feature Maps

Create Gabor pyramids O(σ, θ) for θ ∈ {0º, 45º, 90º, 135º}

c and s are the same as for intensity

O(c, s, θ) = | O(c, θ) ⊖ O(s, θ) |
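To illustrate what an oriented Gabor filter looks like, here is a minimal kernel generator. The slides give only the four orientations, so the kernel size, wavelength, and envelope width below are assumed values, and `gabor_kernel` is a hypothetical name. In this sketch θ is the direction along which the cosine carrier varies.

```python
import numpy as np

def gabor_kernel(theta_deg, size=9, wavelength=4.0, sigma=2.0):
    """Real-valued Gabor kernel; size, wavelength, sigma are assumed values."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    theta = np.deg2rad(theta_deg)
    # Coordinate along the direction in which the carrier oscillates.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * x_rot / wavelength)
    # Remove the mean so uniform regions give no response.
    return kernel - kernel.mean()

kernels = {theta: gabor_kernel(theta) for theta in (0, 45, 90, 135)}
print({theta: k.shape for theta, k in kernels.items()})
```

Filtering each level of the intensity pyramid with these four kernels would give one orientation pyramid per θ.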


Normalization Operator

Promotes maps with few strong peaks

Suppresses maps with many comparable peaks

1. Normalize the map to a fixed range [0…M]

2. Find the global maximum M

3. Compute the average m of all other local maxima

4. Multiply the map by (M - m)^2
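The normalization steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: a local maximum is taken to be a pixel strictly greater than its 8 neighbours, m averages the local maxima other than the global one (as in the 1998 paper), and the name `normalize_map` is hypothetical.

```python
import numpy as np

def normalize_map(fmap, M=1.0):
    """Sketch of N(.): rescale to [0, M], then weight by (M - m)^2."""
    fmap = np.asarray(fmap, dtype=float)
    lo, hi = fmap.min(), fmap.max()
    if hi == lo:                      # flat map: nothing is salient
        return np.zeros_like(fmap)
    scaled = (fmap - lo) / (hi - lo) * M

    # Local maxima: strictly greater than all 8 neighbours (assumption).
    h, w = scaled.shape
    padded = np.pad(scaled, 1, mode="constant", constant_values=-np.inf)
    is_peak = np.ones((h, w), dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbour = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            is_peak &= scaled > neighbour

    peaks = scaled[is_peak]
    others = peaks[peaks < M]         # exclude the global maximum
    m = others.mean() if others.size else 0.0
    return scaled * (M - m) ** 2

one_peak = np.zeros((5, 5)); one_peak[2, 2] = 1.0
two_peaks = np.zeros((5, 5)); two_peaks[1, 1] = 1.0; two_peaks[3, 3] = 0.8
print(normalize_map(one_peak).max(), normalize_map(two_peaks).max())
```

The map with a single peak keeps its full amplitude, while the map with two comparable peaks is scaled down by (1 - 0.8)^2.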


Normalization Operator


Conspicuity Maps

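$\bar{I} = \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} N(I(c, s))$

$\bar{C} = \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} \left[ N(RG(c, s)) + N(BY(c, s)) \right]$

$\bar{O} = \sum_{\theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}} N\left( \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} N(O(c, s, \theta)) \right)$

Here ⊕ denotes across-scale addition: each map is reduced to scale 4 and the maps are added point by point.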
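Across-scale addition brings every feature map to a common resolution (scale 4 in the model) and sums the maps point by point. The sketch below assumes nearest-neighbour resampling, which the slides do not specify; `resize_nearest` and `across_scale_add` are hypothetical names.

```python
import numpy as np

def resize_nearest(fmap, shape):
    """Nearest-neighbour resampling to `shape` (assumed scheme)."""
    rows = (np.arange(shape[0]) * fmap.shape[0]) // shape[0]
    cols = (np.arange(shape[1]) * fmap.shape[1]) // shape[1]
    return fmap[np.ix_(rows, cols)]

def across_scale_add(maps, target_shape):
    """Resample every map to `target_shape` and add them point by point."""
    total = np.zeros(target_shape)
    for fmap in maps:
        total += resize_nearest(np.asarray(fmap, dtype=float), target_shape)
    return total

# Three toy maps at different resolutions, combined at 4 x 4.
maps = [np.ones((8, 8)), 2.0 * np.ones((4, 4)), 3.0 * np.ones((2, 2))]
print(across_scale_add(maps, (4, 4)))
```

In the full model each input map would first pass through the normalization operator N.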


Saliency Map

Average the normalized conspicuity maps

$S = \frac{1}{3} \left( N(\bar{I}) + N(\bar{C}) + N(\bar{O}) \right)$


Neural Layers

Saliency Map (SM) modeled as layer of leaky integrate-and-fire neurons

SM feeds into winner-take-all (WTA) neural network

Inhibition of return: transient inhibition of the SM at the focus of attention (FOA)

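Figure: block diagram linking the stimulus, the SM, the WTA network, and inhibition of return, with excitatory (+) and inhibitory (-) connections.

FOA shifted to position of winner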
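The attend-then-inhibit cycle can be illustrated with a much simplified loop. This sketch replaces the leaky integrate-and-fire dynamics and the WTA network with a plain argmax, and it models inhibition of return as zeroing a disk around the winner for the rest of the loop instead of transiently. The function name `attend` and the `radius` parameter are hypothetical.

```python
import numpy as np

def attend(saliency, n_shifts=3, radius=2):
    """Return successive FOA positions as (row, col) tuples.

    Simplification: argmax stands in for the WTA network, and inhibition
    of return is a permanent zeroing of a disk around each winner.
    """
    sm = np.array(saliency, dtype=float)  # work on a copy
    yy, xx = np.indices(sm.shape)
    foci = []
    for _ in range(n_shifts):
        y, x = np.unravel_index(np.argmax(sm), sm.shape)
        foci.append((int(y), int(x)))
        # Inhibition of return: suppress the attended neighbourhood.
        sm[(yy - y) ** 2 + (xx - x) ** 2 <= radius ** 2] = 0.0
    return foci

sm = np.zeros((10, 10))
sm[2, 2], sm[7, 7], sm[2, 8] = 1.0, 0.8, 0.5
print(attend(sm))
# [(2, 2), (7, 7), (2, 8)]
```

The FOA visits locations in order of decreasing saliency, which is the qualitative behavior the two-layer model produces.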


Example of Operation

Inhibition of return


Results

Figure panels: input image; saliency map; high-saliency locations (yellow circles).


Shifting Attention

Using 2D “winner-take-all” neural network at scale 4

FOA shifts every 30-70 ms


Summary

Computing the saliency map can be broken down into three main steps:

Create pyramids for the 5 channels of the original image
Determine feature maps, then conspicuity maps
Combine into the saliency map (after normalizing)

The key idea of the saliency map is to extract local spatial discontinuities in the modalities of color, intensity, and orientation.

Two layers of neurons (the SM and the WTA network) model shifts of attention.

The model appears to work accurately and robustly, but its performance is difficult to evaluate.


Discussion

No top-down attention modeling, e.g., top-down spatial control or object-based attention.

Biologically plausible? Neuromorphic architecture?

How are the top-down and bottom-up processes related?

How are attention and recognition integrated, and how do they interact?


References

Itti, Koch, and Niebur: “A Model of Saliency-Based Visual Attention for Rapid Scene Analysis”, IEEE PAMI, Vol. 20, No. 11, November 1998

Itti and Koch: “Computational Modeling of Visual Attention”, Nature Reviews Neuroscience, Vol. 2, 2001