Simple baselines can fool 360° saliency metrics.

Yasser Dahou, Kevin McGuinness, Noel E. O’Connor

Dublin City University

The importance of visual attention

Why don’t we see the changes?

We don’t really see the whole image

We only focus on small specific regions: the salient parts

Human beings reliably attend to the same regions of images when shown

What we perceive

Where we look

What we actually see

Saliency prediction

Produce a computational model of visual attention: predict where humans will look.

Often want to map an image to a heatmap (saliency map).

How to measure the quality of the predicted map against the Ground Truth one?

Bylinskii, Zoya, et al. "What do different evaluation metrics tell us about saliency models?." IEEE transactions on pattern analysis and machine intelligence 41.3 (2018): 740-757.

Saliency metrics: functions that take two inputs representing eye fixations (ground truth and predicted) and then output a number assessing the similarity or dissimilarity between them.

Location-based metrics:
• Area under ROC Curve (AUC-J, AUC-Borji)
• Normalized Scanpath Saliency (NSS)
• Information Gain (IG)

Distribution-based metrics:
• Similarity (SIM)
• Pearson’s Correlation Coefficient (CC)
• Kullback-Leibler divergence (KL)
• Earth Mover’s Distance (EMD)
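To make a few of the metrics listed above concrete, here is a minimal NumPy sketch of NSS, CC, SIM and KL for flat 2D maps. The function names, the `EPS` constant and the normalization choices are assumptions made for illustration, not the implementation used in the talk, and no 360° (equirectangular) weighting is applied.

```python
import numpy as np

EPS = 1e-12  # assumed small constant to avoid division by zero and log(0)

def nss(pred, fixations):
    """Normalized Scanpath Saliency: mean standardized prediction at fixated pixels."""
    z = (pred - pred.mean()) / (pred.std() + EPS)
    return float(z[fixations.astype(bool)].mean())

def cc(pred, gt):
    """Pearson's correlation coefficient between two saliency maps."""
    p = pred - pred.mean()
    g = gt - gt.mean()
    return float((p * g).sum() / (np.sqrt((p ** 2).sum() * (g ** 2).sum()) + EPS))

def sim(pred, gt):
    """Similarity: histogram intersection of the maps, each normalized to sum to 1."""
    p = pred / (pred.sum() + EPS)
    g = gt / (gt.sum() + EPS)
    return float(np.minimum(p, g).sum())

def kl(pred, gt):
    """KL divergence between the ground truth and predicted distributions."""
    p = pred / (pred.sum() + EPS)
    g = gt / (gt.sum() + EPS)
    return float((g * np.log(EPS + g / (p + EPS))).sum())
```

NSS takes a binary fixation map as its ground truth, whereas CC, SIM and KL compare two continuous maps. NSS, CC and SIM are similarity scores (higher is better), while KL is a dissimilarity (lower is better).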

TABLE 4: Properties of the 8 evaluation metrics (with our specific implementations) considered in this paper.

ANALYSIS OF METRIC BEHAVIOR

Saliency formulation

Definition: The ground truth saliency maps are computed by convolving the fixation or trajectory points (for all observers of one image) with a Gaussian or Kent kernel. The resulting saliency map P ∈ [0,1] can be treated as a multivariate Bernoulli distribution in which each pixel is Bernoulli distributed, with probability p of being attended and (1−p) of being discarded.
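The construction described here can be sketched as follows, assuming a planar Gaussian kernel applied directly to the pixel grid. The Kent (spherical) kernel is not shown, and the function names, the `sigma` default and the peak normalization are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalized 1D Gaussian truncated at three standard deviations."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def fixation_map(points, height, width):
    """Binary map with 1 at each fixated (row, col) pixel, pooled over observers."""
    fmap = np.zeros((height, width))
    for r, c in points:
        fmap[r, c] = 1.0
    return fmap

def saliency_from_fixations(points, height, width, sigma=5.0):
    """Blur the fixation map and rescale so that values lie in [0, 1]."""
    fmap = fixation_map(points, height, width)
    k = gaussian_kernel_1d(sigma)  # assumes the kernel is shorter than both map sides
    # Separable blur: filter along each row, then along each column.
    blurred = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 1, fmap)
    blurred = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 0, blurred)
    peak = blurred.max()
    return blurred / peak if peak > 0 else blurred
```

For example, `saliency_from_fixations([(32, 64), (10, 20)], 64, 128)` returns a 64×128 map whose value at each pixel can be read as the probability p of that pixel being attended.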

Equirectangular saliency map visualizations of the four baselines. The upper row shows the density maps, whereas the bottom row applies a color map to each saliency map to make the figures easier to read.

We sort the saliency models listed in Table 1 individually by each metric, and compute the Spearman rank correlation between every pair of metrics. The first 5 metrics listed are highly correlated. JSD is the least correlated with the other metrics, due to its high sensitivity to zero-valued predictions at fixated locations.
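The ranking comparison described above can be sketched in a few lines of NumPy. The helper names are hypothetical, the ranking assumes there are no tied scores, and the numbers in the usage example are made-up placeholders, not results from the talk.

```python
import numpy as np

def ranks(values):
    """Rank of each entry (0 = smallest); assumes no tied scores."""
    order = np.argsort(values)
    r = np.empty(len(values))
    r[order] = np.arange(len(values))
    return r

def spearman_matrix(scores):
    """Spearman rank correlation between every pair of metrics.

    scores maps a metric name to one score per model, in the same model order.
    """
    names = list(scores)
    rank_rows = np.array([ranks(np.asarray(scores[n])) for n in names])
    # Spearman correlation is the Pearson correlation of the ranks.
    return names, np.corrcoef(rank_rows)

# Placeholder scores for four hypothetical models (illustrative values only).
toy_scores = {
    "NSS": [1.2, 0.8, 2.0, 1.5],
    "CC": [0.5, 0.4, 0.7, 0.6],
    "KL": [0.9, 1.4, 0.3, 0.6],
}
names, rho = spearman_matrix(toy_scores)
```

In this toy example NSS and CC order the models identically, so their correlation is 1, while KL, being a dissimilarity, orders them in reverse and gives -1.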

Thanks

Email: [email protected]
