Michigan State University 1
“Saliency-Based Visual Attention”
“Computational Modeling of Visual Attention,” Itti and Koch (Nature Reviews Neuroscience, 2001)
“A Model of Saliency-Based Visual Attention for Rapid Scene Analysis,” Itti, Koch, and Niebur (IEEE PAMI, 1998)
Zhengping Ji
Overview
Background
System architecture
The saliency map
  Preprocessing
  Feature maps
  Feature integration
Focus of attention
Results
Conclusion
Related Work
“Feature Integration Theory,” Treisman & Gelade, 1980.
Computational model of bottom-up attention, Koch and Ullman, 1985
The saliency map is believed to be located in the posterior parietal cortex (Gottlieb et al., 1998) and the pulvinar nuclei of the thalamus (Robinson & Petersen, 1992)
Architecture
Gaussian Pyramids
Repeated low-pass filtering
σ = 0, 1, 2, 3, …, 8; I(0) is the original input
$I(\sigma + 1) = \mathrm{Subsampled}\left[ I(\sigma) * G_{5 \times 5} \right]$
[Figure: pyramid levels 640 x 480 → 320 x 240 → 160 x 120 → 80 x 60; each step convolves with G5x5 and scales by a factor of 2x2]
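The pyramid recursion above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the 5-tap binomial kernel, the edge-replicating border handling, and the function names `blur5x5` and `gaussian_pyramid` are assumptions made for the example.

```python
import numpy as np

# Assumption: a 5-tap binomial kernel stands in for the 5x5 Gaussian G5x5.
KERNEL_1D = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def blur5x5(img):
    """Separable 5x5 low-pass filter; borders are handled by edge replication."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="edge")
    # Filter along the horizontal direction, then along the vertical direction.
    horiz = sum(wt * padded[:, k:k + w] for k, wt in enumerate(KERNEL_1D))
    return sum(wt * horiz[k:k + h, :] for k, wt in enumerate(KERNEL_1D))

def gaussian_pyramid(img, levels=9):
    """Return [I(0), ..., I(levels - 1)], where I(0) is the input image."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        # I(sigma + 1) = Subsampled[I(sigma) * G5x5]: blur, then keep every 2nd pixel.
        pyramid.append(blur5x5(pyramid[-1])[::2, ::2])
    return pyramid
```

For an input array with 480 rows and 640 columns, the first four levels have shapes (480, 640), (240, 320), (120, 160), and (60, 80), matching the sizes shown in the figure.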
Preprocessing
Original image with red, green, blue channels (r, g, b)
Intensity: I = (r + g + b)/3
Broadly tuned color channels:
R = r - (g + b)/2
G = g - (r + b)/2
B = b - (r + g)/2
Y = (r + g)/2 - |r - g|/2 - b
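As a rough sketch, the channel formulas above translate directly into array arithmetic. The function name `color_channels` and the (height, width, 3) input layout are assumptions for this example. Clamping negative values to zero follows the 1998 paper rather than this slide, and the paper's normalization of r, g, b by intensity is omitted here.

```python
import numpy as np

def color_channels(rgb):
    """Split an (H, W, 3) RGB array into intensity and four broadly tuned color channels."""
    r, g, b = (rgb[..., k].astype(float) for k in range(3))
    intensity = (r + g + b) / 3.0
    R = r - (g + b) / 2.0
    G = g - (r + b) / 2.0
    B = b - (r + g) / 2.0
    Y = (r + g) / 2.0 - np.abs(r - g) / 2.0 - b
    # Assumption taken from the 1998 paper, not from the slide: negative responses are set to zero.
    R, G, B, Y = (np.maximum(ch, 0.0) for ch in (R, G, B, Y))
    return intensity, R, G, B, Y
```

With this definition a pure red pixel (1, 0, 0) responds only in R, a pure yellow pixel (1, 1, 0) responds most strongly in Y, and a gray pixel gives zero in all four color channels.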
[Figure: example R, G, B, Y color channels and the intensity channel after preprocessing]
Center-surround Difference
Achieved through an across-scale difference between a fine (center) scale and a coarser (surround) scale
Operator denoted by ⊖: interpolation to the finer scale, then point-to-point subtraction
One pyramid for each channel: I(σ), R(σ), G(σ), B(σ), Y(σ), where σ ∈ [0..8] is the scale
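The across-scale difference can be sketched as follows. The slide only says "interpolation to the finer scale"; nearest-neighbor interpolation is an assumption chosen here to keep the example dependency-free, and the names `upsample_to` and `center_surround` are hypothetical.

```python
import numpy as np

def upsample_to(coarse, shape):
    """Nearest-neighbor interpolation of `coarse` up to `shape` (assumed interpolation scheme)."""
    rows = (np.arange(shape[0]) * coarse.shape[0]) // shape[0]
    cols = (np.arange(shape[1]) * coarse.shape[1]) // shape[1]
    return coarse[np.ix_(rows, cols)]

def center_surround(center, surround):
    """|center (-) surround|: bring the surround map to the center scale, subtract point by point."""
    return np.abs(center - upsample_to(surround, center.shape))
```

Given a pyramid `pyr` indexed by scale, a call such as `center_surround(pyr[2], pyr[5])` would produce a map at the resolution of scale 2.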
Intensity Feature Maps
$I(c, s) = |I(c) \ominus I(s)|$
$c \in \{2, 3, 4\}$, $s = c + \delta$ where $\delta \in \{3, 4\}$
So $I(2, 5) = |I(2) \ominus I(5)|$, $I(2, 6) = |I(2) \ominus I(6)|$, $I(3, 6) = |I(3) \ominus I(6)|$, …
6 feature maps
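To make the count of six concrete, the center and surround scale pairs can be enumerated directly. The variable name `pairs` is only illustrative.

```python
# Enumerate (c, s) with c in {2, 3, 4} and s = c + delta, delta in {3, 4}.
pairs = [(c, c + delta) for c in (2, 3, 4) for delta in (3, 4)]
print(pairs)       # [(2, 5), (2, 6), (3, 6), (3, 7), (4, 7), (4, 8)]
print(len(pairs))  # 6
```

The largest surround scale is 8, which is the coarsest level of the nine-level pyramid.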
Colour Feature Maps
Similar to double-opponent cells in primary visual cortex: red-green and yellow-blue
$RG(c, s) = |(R(c) - G(c)) \ominus (G(s) - R(s))|$
$BY(c, s) = |(B(c) - Y(c)) \ominus (Y(s) - B(s))|$
Same c and s as with intensity
[Figure: double-opponent receptive fields with center and surround regions labeled +R-G, +G-R, +B-Y, and +Y-B]
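The two colour opponency formulas can be sketched on top of per-channel pyramids. This is an illustration under assumptions: `R`, `G`, `B`, `Y` are taken to be lists of arrays indexed by scale, nearest-neighbor interpolation stands in for the unspecified interpolation, and the names `upsample_to` and `opponency_maps` are hypothetical.

```python
import numpy as np

def upsample_to(coarse, shape):
    """Nearest-neighbor interpolation of `coarse` up to `shape` (assumed interpolation scheme)."""
    rows = (np.arange(shape[0]) * coarse.shape[0]) // shape[0]
    cols = (np.arange(shape[1]) * coarse.shape[1]) // shape[1]
    return coarse[np.ix_(rows, cols)]

def opponency_maps(R, G, B, Y, c, s):
    """Return RG(c, s) and BY(c, s) from channel pyramids indexed by scale."""
    fine = R[c].shape
    # RG(c, s) = |(R(c) - G(c)) (-) (G(s) - R(s))|
    rg = np.abs((R[c] - G[c]) - upsample_to(G[s] - R[s], fine))
    # BY(c, s) = |(B(c) - Y(c)) (-) (Y(s) - B(s))|
    by = np.abs((B[c] - Y[c]) - upsample_to(Y[s] - B[s], fine))
    return rg, by
```

Both outputs have the resolution of the center scale c, the same as the intensity feature maps.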
Orientation Feature Maps
Create Gabor pyramids O(σ, θ) for θ ∈ {0º, 45º, 90º, 135º}
c and s as with intensity
$O(c, s, \theta) = |O(c, \theta) \ominus O(s, \theta)|$
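A Gabor filter bank for the four orientations might be built as below. The slide gives no filter parameters, so the kernel size, wavelength, and Gaussian width are placeholder assumptions, as is the function name `gabor_kernel`. Here θ is the direction along which the cosine carrier varies, which is one of several conventions in use.

```python
import numpy as np

def gabor_kernel(theta_deg, size=9, wavelength=4.0, sigma=2.0):
    """Even-symmetric Gabor kernel; all numeric defaults are illustrative assumptions."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    theta = np.deg2rad(theta_deg)
    # Coordinate along the direction in which the carrier oscillates.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * x_rot / wavelength)
    # Remove the mean so that uniform image regions give zero response.
    return kernel - kernel.mean()

kernels = {theta: gabor_kernel(theta) for theta in (0, 45, 90, 135)}
```

Filtering each level of the intensity pyramid with one of these kernels would give one Gabor pyramid per orientation, from which O(c, s, θ) follows by the same center-surround difference used for intensity.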
Normalization Operator
Promotes maps with few strong peaks
Suppresses maps with many comparable peaks
1. Normalize the map to the range [0…M]
2. Find the global maximum M
3. Compute the average m of all other local maxima
4. Multiply the map by $(M - m)^2$
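The steps above can be sketched as a single function. Several details are assumptions made for this example: a local maximum is a pixel strictly greater than its four direct neighbors, every peak that reaches M is treated as the global maximum and excluded from the average, and m is taken as 0 when no other peak exists. The name `normalize_map` is hypothetical.

```python
import numpy as np

def normalize_map(fmap, M=1.0):
    """Sketch of the normalization operator N(.) under the assumptions stated above."""
    fmap = np.asarray(fmap, dtype=float)
    lo, hi = fmap.min(), fmap.max()
    if hi == lo:
        return np.zeros_like(fmap)  # a flat map carries no peaks at all
    scaled = (fmap - lo) / (hi - lo) * M  # rescale to [0, M]; the global maximum is now M

    # Local maxima: strictly greater than the up, down, left, and right neighbors.
    padded = np.pad(scaled, 1, mode="constant", constant_values=-np.inf)
    center = padded[1:-1, 1:-1]
    is_peak = ((center > padded[:-2, 1:-1]) & (center > padded[2:, 1:-1]) &
               (center > padded[1:-1, :-2]) & (center > padded[1:-1, 2:]))
    peaks = scaled[is_peak]
    others = peaks[peaks < M]  # drop the global maximum (and any peaks tied with it)
    m = others.mean() if others.size else 0.0
    return scaled * (M - m) ** 2
```

A map with one isolated peak keeps its full range, while a map with a second peak at 0.9 of the maximum is scaled down by a factor of about 0.01, which is the promote-or-suppress behavior described above.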
Conspicuity Maps
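$\bar{I} = \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} N(I(c, s))$
$\bar{C} = \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} \left[ N(RG(c, s)) + N(BY(c, s)) \right]$
$\bar{O} = \sum_{\theta \in \{0º, 45º, 90º, 135º\}} N\left( \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} N(O(c, s, \theta)) \right)$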
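In the 1998 paper, the across-scale addition ⊕ reduces every feature map to scale 4 and then adds the maps point by point. A minimal sketch is given below; nearest-neighbor resampling and the names `resize_nearest` and `across_scale_sum` are assumptions, and the input maps are taken to be already normalized with N.

```python
import numpy as np

def resize_nearest(fmap, shape):
    """Nearest-neighbor resampling of `fmap` to `shape` (works for reduction and enlargement)."""
    rows = (np.arange(shape[0]) * fmap.shape[0]) // shape[0]
    cols = (np.arange(shape[1]) * fmap.shape[1]) // shape[1]
    return fmap[np.ix_(rows, cols)]

def across_scale_sum(maps, target_shape):
    """Bring every map to `target_shape` (e.g. the scale-4 resolution) and add point by point."""
    return sum(resize_nearest(m, target_shape) for m in maps)
```

Applied to the six normalized intensity feature maps, this would yield the intensity conspicuity map; the colour and orientation conspicuity maps follow the same pattern with their own feature maps.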
Saliency Map
Average all conspicuity maps
$S = \frac{1}{3} \left( N(\bar{I}) + N(\bar{C}) + N(\bar{O}) \right)$
Neural Layers
Saliency Map (SM) modeled as layer of leaky integrate-and-fire neurons
SM feeds into winner-take-all (WTA) neural network
Inhibition of return: transient inhibition of the SM at the focus of attention (FOA)
[Figure: block diagram linking Stimulus, SM, WTA, and Inhibition of Return with excitatory (+) and inhibitory (-) connections]
FOA shifted to position of winner
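The attend-then-inhibit loop can be illustrated with a much simpler stand-in for the neural layers. This sketch replaces the leaky integrate-and-fire and winner-take-all dynamics with a plain argmax, and it makes the inhibition permanent rather than transient. The function name `attend`, the disk-shaped inhibition region, and its radius are assumptions, and the saliency values are assumed to be non-negative.

```python
import numpy as np

def attend(saliency, n_shifts=3, radius=2):
    """Return successive FOA positions (row, col) using argmax plus inhibition of return."""
    sm = np.array(saliency, dtype=float)  # work on a copy of the saliency map
    yy, xx = np.mgrid[0:sm.shape[0], 0:sm.shape[1]]
    foci = []
    for _ in range(n_shifts):
        # Stand-in for the WTA network: the most salient location wins.
        y, x = np.unravel_index(np.argmax(sm), sm.shape)
        foci.append((int(y), int(x)))
        # Inhibition of return: suppress a disk around the attended location.
        sm[(yy - y) ** 2 + (xx - x) ** 2 <= radius ** 2] = 0.0
    return foci
```

On a map with three separated peaks of decreasing height, the returned positions visit the peaks in order of decreasing saliency.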
Example of Operation
Inhibition of return
Results
[Figure: input image, its saliency map, and high-saliency locations (yellow circles)]
Shifting Attention
Using 2D “winner-take-all” neural network at scale 4
FOA shifts every 30-70 ms
Summary
Computing the saliency map can be broken down into three main steps:
Create pyramids for the 5 channels of the original image
Determine feature maps, then conspicuity maps
Combine into the saliency map (after normalizing)
The key idea of the saliency map is to extract local spatial discontinuities in the modalities of color, intensity, and orientation.
Use two layers of neurons to model shifting attention.
Model appears to work accurately and robustly (but difficult to evaluate)
Discussion
No top-down attention modeling, e.g., top-down spatial control or object-based attention.
Biologically plausible? Neuromorphic architecture?
How are the top-down and bottom-up processes related?
How are attention and recognition integrated, and how do they interact?
References
Itti, Koch, and Niebur: “A Model of Saliency-Based Visual Attention for Rapid Scene Analysis,” IEEE PAMI, Vol. 20, No. 11, November 1998
Itti and Koch: “Computational Modeling of Visual Attention,” Nature Reviews Neuroscience, Vol. 2, 2001