Lecture 31: Modern object recognition
-
Upload
kiona-oneal -
Category
Documents
-
view
36 -
download
0
description
Transcript of Lecture 31: Modern object recognition
![Page 1: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/1.jpg)
Lecture 31: Modern object recognition
CS4670 / 5670: Computer VisionNoah Snavely
![Page 2: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/2.jpg)
Announcements
• Project 4 due today• Demos Monday
![Page 3: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/3.jpg)
Object detection: where are we?
• Incredible progress in the last ten years• Better features, better models, better learning
methods, better datasets• Combination of science and hacks
Credit: Flickr user neilalderney123
![Page 4: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/4.jpg)
Vision Contests
• PASCAL VOC Challenge
• 20 categories• Annual classification, detection, segmentation, …
challenges
![Page 5: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/5.jpg)
Machine learning for object detection
• What features do we use?– intensity, color, gradient information, …
• Which machine learning methods?– generative vs. discriminative– k-nearest neighbors, boosting, SVMs, …
• What hacks do we need to get things working?
![Page 6: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/6.jpg)
Histogram of Oriented Gradients (HoG)
[Dalal and Triggs, CVPR 2005]
10x10 cells
20x20 cells
HoGifyHoGify
![Page 7: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/7.jpg)
Histogram of Oriented Gradients (HoG)
![Page 8: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/8.jpg)
Histogram of Oriented Gradients (HoG)
• Like SIFT (Scale Invariant Feature Transform), but…– Sampled on a dense, regular grid– Gradients are contrast normalized in overlapping
blocks
10x10 cells
20x20 cells
HoGifyHoGify
[Dalal and Triggs, CVPR 2005]
![Page 9: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/9.jpg)
Histogram of Oriented Gradients (HoG)
• First used for application of person detection [Dalal and Triggs, CVPR 2005]
• Cited since in thousands of computer vision papers
![Page 10: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/10.jpg)
Linear classifiers• Find linear function to separate positive and negative examples
0:negative
0:positive
b
b
ii
ii
wxx
wxx
Which lineis best?
[slide credit: Kristin Grauman]
![Page 11: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/11.jpg)
Support Vector Machines (SVMs)
• Discriminative classifier based on optimal separating line (for 2D case)
• Maximize the margin between the positive and negative training examples
[slide credit: Kristin Grauman]
![Page 12: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/12.jpg)
Support vector machines• Want line that maximizes the margin.
1:1)(negative
1:1)( positive
by
by
iii
iii
wxx
wxx
MarginSupport vectors
C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998
For support, vectors, 1 bi wx
wx+b=-1
wx+b=0
wx+b=1
[slide credit: Kristin Grauman]
![Page 13: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/13.jpg)
Person detection, ca. 20051. Represent each example with a single, fixed
HoG template
2. Learn a single [linear] SVM as a detector
Code available: http://pascal.inrialpes.fr/soft/olt/
![Page 14: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/14.jpg)
Positive and negative examples
+ thousands more…
+ millions more…
![Page 15: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/15.jpg)
HoG templates for person detection
![Page 16: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/16.jpg)
Person detection with HoG & linear SVM
[Dalal and Triggs, CVPR 2005]
![Page 17: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/17.jpg)
Are we done?
![Page 18: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/18.jpg)
Are we done?• Single, rigid template usually not enough to
represent a category– Many objects (e.g. humans) are articulated, or have parts
that can vary in configuration
– Many object categories look very different from different viewpoints, or from instance to instance
![Page 19: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/19.jpg)
Difficulty of representing positive instances
• Discriminative methods have proven very powerful• But linear SVM on HoG templates not sufficient?
• Alternatives:– Parts-based models [Felzenszwalb et al. CVPR 2008]– Latent SVMs [Felzenszwalb et al. CVPR 2008]– Today’s paper [Exemplar-SVMs, Malisiewicz, et al. ICCV
2011]
![Page 20: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/20.jpg)
Parts-based models
Felzenszwalb, et al., Discriminatively Trained Deformable Part Models, http://people.cs.uchicago.edu/~pff/latent/
![Page 21: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/21.jpg)
Latent SVMs
• Rather than training a single linear SVM separating positive examples…
• … cluster positive examples into “components” and train a classifier for each (using all negative examples)
![Page 22: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/22.jpg)
Two-component bicycle model
“side” component
“frontal” component
![Page 23: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/23.jpg)
Six-component car model
root filters (coarse) part filters (fine) deformation models
side view
frontal view
![Page 24: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/24.jpg)
Six-component person model
![Page 25: Lecture 31: Modern object recognition](https://reader035.fdocuments.net/reader035/viewer/2022081508/568135dc550346895d9d4f12/html5/thumbnails/25.jpg)
• Twenty object categories (aeroplane to TV/monitor)
• Three challenges:– Classification challenge (is there an X in this image?)– Detection challenge (draw a box around every X)– Segmentation challenge