CS 1699: Intro to Computer Vision
Detection III: Analyzing and Debugging Detection Methods
Prof. Adriana Kovashka
University of Pittsburgh
November 17, 2015
Today
• Review: Deformable part models
• How can we speed up detection?
• In what ways does detection fail?
• How can we visualize features and models?
Parts-based Models
Define object by collection of parts modeled by
1. Appearance
2. Spatial configuration
Rob Fergus
How to model spatial relations?
• Star-shaped model
[Figure: star-shaped model, with a central Root connected to each Part]
Derek Hoiem
Implicit shape models: Training
1. Build vocabulary of patches around
extracted interest points using clustering
2. Map the patch around each interest point to
closest word
3. For each word, store all positions it was
found, relative to object center
Lana Lazebnik
Implicit shape models: Testing
1. Given new test image, extract patches, match to
vocabulary words
2. Cast votes for possible positions of object center
3. Search for maxima in voting space
Lana Lazebnik
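The training and testing steps above can be sketched as a simple voting scheme. This is a minimal illustration, assuming patches have already been matched to integer vocabulary word ids; the names `train_ism` and `detect_center`, the coarse voting grid, and the toy data are hypothetical and not taken from the lecture.

```python
from collections import Counter, defaultdict

def train_ism(training_patches):
    """training_patches: list of (word_id, patch_xy, object_center_xy).
    Returns word_id -> list of offsets pointing from the patch to the center."""
    offsets = defaultdict(list)
    for word, (px, py), (cx, cy) in training_patches:
        offsets[word].append((cx - px, cy - py))
    return offsets

def detect_center(test_patches, offsets, bin_size=4):
    """test_patches: list of (word_id, patch_xy).
    Each matched word casts one vote per stored offset. Votes are binned
    on a coarse grid and the fullest bin is returned with its vote count."""
    votes = Counter()
    for word, (px, py) in test_patches:
        for dx, dy in offsets.get(word, []):
            cx, cy = px + dx, py + dy
            votes[(cx // bin_size, cy // bin_size)] += 1
    if not votes:
        return None, 0
    best_bin, count = votes.most_common(1)[0]
    # Report the middle of the winning bin in pixel coordinates.
    center = (best_bin[0] * bin_size + bin_size / 2,
              best_bin[1] * bin_size + bin_size / 2)
    return center, count

# Toy example: two words seen on one training object centered at (50, 50).
model = train_ism([(0, (40, 50), (50, 50)), (1, (60, 50), (50, 50))])
# The same two words appear in a test image, shifted by (+100, +20).
center, count = detect_center([(0, (140, 70)), (1, (160, 70))], model)
```

Both test patches vote for the same bin, so the maximum in voting space lands on the shifted object center. A real system would smooth the voting space rather than rely on hard bins.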
Histograms of oriented gradients (HOG)
Bin gradients from 8x8 pixel neighborhoods into 9 orientations
(Dalal & Triggs CVPR 05)
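As a rough illustration of the binning step, the sketch below builds the 9-bin orientation histogram for a single 8x8 cell. It is a simplification under stated assumptions: unsigned orientations over 0 to 180 degrees, hard bin assignment, and no block normalization or vote interpolation. The function name `cell_histogram` is hypothetical.

```python
import math

def cell_histogram(cell, n_bins=9):
    """cell: 2D list of grayscale values (e.g. 8x8).
    Returns an n_bins histogram of unsigned gradient orientations
    (0 to 180 degrees); each pixel votes with its gradient magnitude."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(h):
        for x in range(w):
            # Central differences, clamped at the cell border.
            gx = cell[y][min(x + 1, w - 1)] - cell[y][max(x - 1, 0)]
            gy = cell[min(y + 1, h - 1)][x] - cell[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The final modulo guards against angle rounding up to 180.0.
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist

# A vertical edge: dark left half, bright right half.
cell = [[0] * 4 + [1] * 4 for _ in range(8)]
hist = cell_histogram(cell)
```

For this vertical edge, all gradient energy falls into the first bin (horizontal gradient direction), which is the kind of signature a HOG-based detector sees instead of raw pixels.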
Discriminative part-based models
P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan, Object Detection
with Discriminatively Trained Part Based Models, PAMI 32(9), 2010
Root filter
Part filters
Deformation weights
Lana Lazebnik
Scoring an object hypothesis
• The score of a hypothesis is
the sum of appearance scores
minus the sum of deformation costs
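$$\mathrm{score}(p_0, \ldots, p_n) = \sum_{i=0}^{n} F_i \cdot H(p_i) \;-\; \sum_{i=1}^{n} D_i \cdot (dx_i, dy_i, dx_i^2, dy_i^2)$$
F_i: appearance weights
H(p_i): subwindow features
D_i: deformation weights
(dx_i, dy_i, dx_i^2, dy_i^2): displacements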
Adapted from Lana Lazebnik
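The score of a hypothesis (appearance scores minus deformation costs) can be written out directly. The sketch below assumes the filter weights and subwindow features are already flattened into equal-length lists, with index 0 holding the root; the function names and the toy numbers are illustrative only.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def hypothesis_score(filters, features, deform_weights, displacements):
    """filters[i], features[i]: appearance weights and subwindow features
    for i = 0..n (index 0 is the root).
    deform_weights[i-1], displacements[i-1]: deformation weights (4 values)
    and the (dx, dy) displacement for parts i = 1..n."""
    appearance = sum(dot(f, h) for f, h in zip(filters, features))
    deformation = 0.0
    for d, (dx, dy) in zip(deform_weights, displacements):
        deformation += dot(d, (dx, dy, dx * dx, dy * dy))
    return appearance - deformation

# Toy hypothesis: a root and one part displaced by (1, 2) from its anchor.
score = hypothesis_score(
    filters=[[1.0, 0.0], [0.5, 0.5]],
    features=[[2.0, 3.0], [4.0, 2.0]],
    deform_weights=[[0.0, 0.0, 0.1, 0.1]],
    displacements=[(1, 2)],
)
```

Here the appearance term is 5.0 and the quadratic deformation penalty is 0.5, so moving the part further from its anchor lowers the score unless the appearance match improves enough to compensate.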
What is an Object?
B. Alexe, T. Deselaers, and V. Ferrari
Computer Vision and Pattern Recognition (CVPR) 2010
Speeding up detection: Restrict set of windows we pass through SVM to those w/ high “objectness”
Alexe et al., CVPR 2010
Objectness cue #1: Where people look
Alexe et al., CVPR 2010
Objectness cue #2: color contrast at boundary
Alexe et al., CVPR 2010
Objectness cue #3: no segments “straddling” the object box
Alexe et al., CVPR 2010
Boxes found to have high “objectness”
Alexe et al., CVPR 2010
Cyan = ground truth bounding boxes, yellow = correct and red = incorrect predictions for “objectness”
Only run the sheep / horse / chair etc. classifier on the yellow/red boxes.
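A minimal sketch of this pruning idea follows, assuming some class-generic objectness function is already available (Alexe et al. combine several cues; here a box-area stand-in is used purely for illustration). The names `prune_by_objectness`, `detect`, and the toy classifier are hypothetical.

```python
def prune_by_objectness(windows, objectness, keep=100):
    """Keep the `keep` windows with the highest objectness score."""
    return sorted(windows, key=objectness, reverse=True)[:keep]

def detect(windows, objectness, classifier, keep=100, threshold=0.0):
    """Run the expensive class-specific classifier only on the
    windows that survive objectness pruning."""
    candidates = prune_by_objectness(windows, objectness, keep)
    return [w for w in candidates if classifier(w) > threshold]

# Toy windows as (x1, y1, x2, y2).
windows = [(0, 0, 10, 10), (0, 0, 50, 40), (5, 5, 60, 60), (0, 0, 2, 2)]

def toy_objectness(w):
    # Stand-in score (box area), NOT the cues from the paper.
    return (w[2] - w[0]) * (w[3] - w[1])

calls = []

def toy_classifier(w):
    # Stand-in for an SVM; records how often it is invoked.
    calls.append(w)
    return 1.0 if w[2] - w[0] > 50 else -1.0

detections = detect(windows, toy_objectness, toy_classifier, keep=2)
```

The classifier is evaluated on two windows instead of four. The saving grows with the number of candidate windows, at the risk of discarding a true object whose objectness score is low.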
Today
• Review: Deformable part models
• How can we speed up detection?
• In what ways does detection fail?
• How can we visualize features and detections?
Diagnosing Error in Object Detectors
D. Hoiem, Y. Chodpathumwan and Q. Dai
European Conference on Computer Vision (ECCV) 2012
Object detection is a collection of problems
Distance, Shape, Occlusion, Viewpoint
Intra-class Variation for “Airplane”
Hoiem et al., ECCV 2012
Object detection is a collection of problems
Localization Error, Background, Dissimilar Categories, Similar Categories
Confusing Distractors for “Airplane”
Hoiem et al., ECCV 2012
Top false positives: Airplane (DPM)
Localization: 29%
Similar Objects (Bird, Boat, Car): 33%
Background: 27%
Other Objects: 11%
AP = 0.36
Hoiem et al., ECCV 2012
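One way to assign a false positive to the four categories above is by its overlap with annotated objects. The sketch below is an assumption-laden illustration: the 0.1 minimum-overlap threshold, the priority order of the checks, and the function names are choices made here for the example, so consult the paper and its released code for the exact protocol.

```python
def iou(a, b):
    """Intersection over union of boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def false_positive_type(det_box, target, ground_truth, similar, min_overlap=0.1):
    """det_box is assumed to be a known false positive for class `target`.
    ground_truth: list of (label, box); similar: set of similar class labels."""
    def best(accept):
        return max((iou(det_box, box) for label, box in ground_truth
                    if accept(label)), default=0.0)
    if best(lambda l: l == target) >= min_overlap:
        return "localization"      # overlaps a target object, but too loosely
    if best(lambda l: l in similar) >= min_overlap:
        return "similar objects"
    if best(lambda l: l != target and l not in similar) >= min_overlap:
        return "other objects"
    return "background"

gt = [("airplane", (0, 0, 100, 50)),
      ("bird", (200, 200, 240, 230)),
      ("chair", (300, 0, 340, 40))]
similar_to_airplane = {"bird", "boat", "car"}
kinds = [false_positive_type(box, "airplane", gt, similar_to_airplane)
         for box in [(50, 0, 150, 50), (205, 200, 245, 230),
                     (300, 0, 340, 40), (500, 500, 520, 520)]]
```

Counting these labels over the top-ranked false positives gives the kind of percentage breakdown shown on the slide.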
Top false positives: Dog (DPM)
Similar Objects (Person, Cat, Horse): 50%
Background: 23%
Localization: 17%
Other Objects: 10%
AP = 0.03
Hoiem et al., ECCV 2012
Analysis of object characteristics
Additional annotations for seven categories: occlusion level, parts visible, sides visible
Hoiem et al., ECCV 2012
Object characteristics: Aeroplane
Occlusion: poor robustness to occlusion, but little impact on overall performance
Easier (None) Harder (Heavy)
Hoiem et al., ECCV 2012
Object characteristics: Aeroplane
Size: strong preference for average to above-average sized airplanes
Easier Harder
X-Small, Small, X-Large, Medium, Large
Hoiem et al., ECCV 2012
Object characteristics: Aeroplane
Aspect Ratio: 2-3x better at detecting wide (side) views than tall views
Tall, X-Tall, Medium, Wide, X-Wide
Easier (Wide) Harder (Tall)
Hoiem et al., ECCV 2012
Object characteristics: Aeroplane
Sides/Parts: best performance = direct side view with all parts visible
Easier (Side) Harder (Non-Side)
Hoiem et al., ECCV 2012
Conclusions
• Most errors that detectors make are reasonable
– Localization error and confusion with similar objects
– Misdetection of occluded or small objects
• Detectors have different sensitivity to different factors
– E.g. less sensitive to truncation than to size differences
• Code and annotations are available online
– http://web.engr.illinois.edu/~dhoiem/projects/detectionAnalysis/
Adapted from Hoiem et al., ECCV 2012
Today
• Review: Deformable part models
• How can we speed up detection?
• In what ways does detection fail?
• How can we visualize features and detections?
HOGgles: Visualizing Object Detection Features
C. Vondrick, A. Khosla, T. Malisiewicz, and A. Torralba
International Conference on Computer Vision (ICCV) 2013
Car
Why did the detector fail?
Vondrick et al., ICCV 2013
What information is lost?
Vondrick et al., ICCV 2013
Recovering image from neighbors
Image HOG
Top detections
Vondrick et al., ICCV 2013
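The idea of recovering an image from its neighbors can be sketched as: rank a database of patches by how well their HOG matches the query HOG, then average the pixels of the top matches. This is a toy illustration only. The plain dot-product ranking stands in for whatever detector produced the top detections on the slide, and `invert_by_neighbors` and the two-dimensional "HOG" vectors are hypothetical.

```python
def invert_by_neighbors(query_hog, database, k=2):
    """database: list of (hog_vector, patch) pairs, where patch is a flat
    list of pixel values. Returns the pixelwise mean of the k patches whose
    HOG vector has the highest dot product with query_hog."""
    def score(entry):
        return sum(q * h for q, h in zip(query_hog, entry[0]))
    top = sorted(database, key=score, reverse=True)[:k]
    n_pixels = len(top[0][1])
    return [sum(patch[i] for _, patch in top) / len(top)
            for i in range(n_pixels)]

# Toy database: (feature vector, 4-pixel patch).
database = [
    ([1.0, 0.0], [0.0, 0.0, 1.0, 1.0]),
    ([0.9, 0.1], [0.0, 0.2, 1.0, 0.8]),
    ([0.0, 1.0], [1.0, 1.0, 0.0, 0.0]),
]
recovered = invert_by_neighbors([1.0, 0.0], database, k=2)
```

Averaging many neighbors tends to blur fine detail, which is consistent with the slides moving on to a better recovery method.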
Better recovery using paired dictionary
Vondrick et al., ICCV 2013
2x more intuitive
A microscope to view HOG
Vondrick et al., ICCV 2013
Human Vision vs. HOG Vision
Vondrick et al., ICCV 2013
The HOGgles Challenge
Humans detect &
DPMs detect
Vondrick et al., ICCV 2013
The HOGgles Challenge
Humans miss &
DPM miss
Vondrick et al., ICCV 2013
Chair Detections
Vondrick et al., ICCV 2013
Car Detections
Vondrick et al., ICCV 2013
[Plot: precision vs. recall for Chair; curves: RGB+Human, HOG+Human, HOG+DPM]
Human performance with HOG is poor despite perfect learning
Loss due to RGB -> HOG
Vondrick et al., ICCV 2013
Car
Why did the detector fail?
Vondrick et al., ICCV 2013
Visualizing Learned Models
Car Person Bottle Bicycle
Motorbike Chair TV Horse
Vondrick et al., ICCV 2013
What is this?
http://web.mit.edu/vondrick/ihog/
Summary
• We can speed up object detection by using the notion of “objectness” to prune windows unlikely to contain any object
• Some failure modes are more important than others and fixing them could increase the overall detection performance
• Even humans cannot produce correct classifications with imperfect features