Multiclass Recognition and Part Localization with Humans in...

36
Multiclass Recognition and Part Localization with Humans in the Loop Heath Vinicombe Department of Computer Science The University of Texas at Austin 30 th November 2012

Transcript of Multiclass Recognition and Part Localization with Humans in...

Page 1: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Multiclass Recognition and Part Localization with Humans in the Loop

Heath Vinicombe

Department of Computer ScienceThe University of Texas at Austin

30th November 2012

Page 2: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Outline

• Motivation• System Overview• Features• Probabilistic Model• Prediction• Results• Conclusions

Page 3: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Motivation

• Humans vs. Computers

Easy for humans but Harder for Computers

Page 4: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Motivation

• Leveraging abilities of Humans and ComputersDifficult for Humans and Computers

Easy for Humans Easy for Computers

Page 5: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Visipedia• http://www.vision.caltech.edu/visipedia/• Visual encyclopedia of images

Online Crowdsourcing Scalable Structure Learning and Annotation

Visual Recognition with Humans in the Loop

Page 6: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

System Overview

• Features: Attributes and Parts• Initial probabilities from Computer Vision• Answers to questions used to update p(c|x)

Increasing confidence

Page 7: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Outline

• Motivation• System Overview• Features • Probabilistic Model• Prediction• Results• Conclusions

Page 8: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Features - Attributes

Is Yellow

Voted for Romney

Class: Big Bird

Page 9: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Features – Example Attributes

• has_crown_color::yellow• has_bill_shape::hooked• has_head_pattern::striped• has_size::very large (32 - 72 in)

Page 10: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Features - Parts

x position y position

aspect binary visibility

Page 11: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Features – User Questions• Attribute queries• Part location queries

Page 12: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Outline

• Motivation• System Overview• Features • Probabilistic Model• Prediction• Results• Conclusions

Page 13: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Probability Model

Attributes detector

Parts detector

User’s answers to questions

Page 14: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Attribute Detection

output of linear classifier

Full Attribute Vector

Single Attribute

Page 15: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Discussion

Page 16: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Part Detection

Response of sliding window detector

Pairwise potentials

Page 17: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Discussion

• In this case are the pairwise potential terms useful or not?

Page 18: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

User Model

Attribute Values: Binomial distribution

Part Locations: Normal distribution

Page 19: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Outline

• Motivation• System Overview• Features • Probabilistic Model• Inference• Results• Conclusions

Page 20: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Inference

• Inference updates probabilities after each question

Page 21: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Inference

For all possible combinations of:

• Classes: 200 in total• Part Locations: ~1000’s of windows per part

• Exponential in number of parts

Page 22: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Inference

Evaluate hereDon’t bother here

Part Probability Density

Page 23: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Choice of Questions

• Minimize user input by asking “best” questions

Two candidate classes

Bad question: Is the head white?Good question: ???

Page 24: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Information Gain

High Entropy RV: H = 1.38 Low Entropy RV: H = 0.71

Page 25: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Discussion

• What other factors should be taken into consideration when choosing a question?

Page 26: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Selection by Time

Page 27: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Outline

• Motivation• System Overview• Features • Probabilistic Model• Prediction• Results• Conclusions

Page 28: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Dataset

• Caltech-UCSD Birds 200 (CUB-200) • 11,800 images of birds• 200 classes• 312 binary attributes• 15 part labels• Part labels obtained through MTurk

Page 29: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Dataset

Page 30: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Results

• Time to classify using IG criterion

Page 31: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Results

• Comparison of criterion

Information Gain criterion Time criterion

Page 32: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Results Analysis

• Computer Vision reduces time to classify• Time criterion reduces time to classify• Part localization improves performance

(attribute detectors 17.3% on ground truth locations vs. 10.3% on predicted)

• Part localization questions are quicker to answer (3s vs. 7.6s)

Page 33: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Future Work

• Visipedia iPad App

Page 35: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

Conclusion

• Better performance by combining strengths of humans and computers

• Using two types of questions and simple computer vision, bird species are classified in ~ 60s

• Human input can “guide” computer vision algorithms to produce better results

Page 36: Multiclass Recognition and Part Localization with Humans in ...cv-fall2012/slides/heath-paper.pdfMulticlass Recognition and Part Localization with Humans in the Loop Heath Vinicombe

References

• Multiclass Recognition and Part Localization with Humans in the Loop. C. Wah et al. ICCV 2011

• http://www.vision.caltech.edu/visipedia/index.html• A Discriminatively Trained, Multiscale, Deformable Part

Model, by P. Felzenszwalb, D. McAllester and D. Ramanan