Lecture 14: Introduction to Object Recognition & Bag-of ... · PDF file Lecture 14-Fei-Fei...
date post
27-Jun-2020Category
Documents
view
0download
0
Embed Size (px)
Transcript of Lecture 14: Introduction to Object Recognition & Bag-of ... · PDF file Lecture 14-Fei-Fei...
Lecture 14 -
Fei-Fei Li
Lecture 14:
Introduction to Object Recognition
& Bag-of-Words (BoW) Models
Professor Fei-Fei Li
Stanford Vision Lab
14-Nov-111
Lecture 14 -
Fei-Fei Li
What we will learn today?
• Introduction to object recognition – Representation – Learning – Recognition
• Bag of Words models (Problem Set 4 (Q2)) – Basic representation – Different learning and recognition algorithms
14-Nov-112
Lecture 14 -
Fei-Fei Li
What are the different visual recognition tasks?
14-Nov-113
Lecture 14 -
Fei-Fei Li
Classification: Does this image contain a building? [yes/no]
Yes!
14-Nov-114
Lecture 14 -
Fei-Fei Li
Classification: Is this an beach?
14-Nov-115
Lecture 14 -
Fei-Fei Li
Image Search
Organizing photo collections
14-Nov-116
Lecture 14 -
Fei-Fei Li
Detection: Does this image contain a car? [where?]
car
14-Nov-117
Lecture 14 -
Fei-Fei Li
Building
clock
person car
Detection: Which object does this image contain? [where?]
14-Nov-118
Lecture 14 -
Fei-Fei Li
clock
Detection: Accurate localization (segmentation)
14-Nov-119
Lecture 14 -
Fei-Fei Li
Object: Person, back;
1-2 meters away
Object: Police car, side view, 4-5 m away
Object: Building, 45º pose,
8-10 meters away
It has bricks
Detection: Estimating object semantic &
geometric attributes
14-Nov-1110
Lecture 14 -
Fei-Fei Li
Applications of computer vision
SurveillanceAssistive technologies
Security Assistive driving
Computational photography
14-Nov-1111
Lecture 14 -
Fei-Fei Li
Categorization vs Single instance
recognition Does this image contain the Chicago Macy building’s?
14-Nov-1112
Lecture 14 -
Fei-Fei Li
Where is the crunchy nut?
Categorization vs Single instance
recognition
14-Nov-1113
Lecture 14 -
Fei-Fei Li
+ GPS
•Recognizing landmarks in mobile platforms
Applications of computer vision
14-Nov-1114
Lecture 14 -
Fei-Fei Li
Activity or Event recognition What are these people doing?
14-Nov-1115
Lecture 14 -
Fei-Fei Li
Visual Recognition
• Design algorithms that are capable to – Classify images or videos – Detect and localize objects – Estimate semantic and geometrical
attributes
– Classify human activities and events
Why is this challenging? 14-Nov-1116
Lecture 14 -
Fei-Fei Li
How many object categories are there?
14-Nov-1117
Lecture 14 -
Fei-Fei Li
Challenges: viewpoint variation
Michelangelo 1475-1564
14-Nov-1118
Lecture 14 -
Fei-Fei Li
Challenges: illumination
image credit: J. Koenderink
14-Nov-1119
Lecture 14 -
Fei-Fei Li
Challenges: scale
14-Nov-1120
Lecture 14 -
Fei-Fei Li
Challenges: deformation
14-Nov-1121
Lecture 14 -
Fei-Fei Li
Challenges:
occlusion
Magritte, 1957
14-Nov-1122
Lecture 14 -
Fei-Fei Li
Challenges: background clutter
Kilmeny Niland. 1995
14-Nov-1123
Lecture 14 -
Fei-Fei Li
Challenges: intra-class variation
14-Nov-1124
Lecture 14 -
• Turk and Pentland, 1991 • Belhumeur, Hespanha, & Kriegman, 1997 • Schneiderman & Kanade 2004 • Viola and Jones, 2000
• Amit and Geman, 1999
• LeCun et al. 1998
• Belongie and Malik, 2002
• Schneiderman & Kanade, 2004
• Argawal and Roth, 2002
• Poggio et al. 1993
Some early works on object
categorization
14-Nov-11
Lecture 14 -
Fei-Fei Li
Basic issues
• Representation – How to represent an object category; which
classification scheme?
• Learning – How to learn the classifier, given training data
• Recognition – How the classifier is to be used on novel data
14-Nov-1126
Lecture 14 -
Fei-Fei Li
Representation - Building blocks: Sampling strategies
RandomlyMultiple interest operators
Interest operators Dense, uniformly
Im a
g e
c re
d it
s: L
. F e
i- F e
i, E
. N
o w
a k ,
J. S
iv ic
14-Nov-1127
Lecture 14 -
Fei-Fei Li
Representation – Appearance only or location and appearance
14-Nov-1128
Lecture 14 -
Fei-Fei Li
Representation
–Invariances • View point • Illumination • Occlusion • Scale • Deformation • Clutter • etc.
14-Nov-1129
Lecture 14 -
Fei-Fei Li
Representation
– To handle intra-class variability, it is convenient to describe an object categories using probabilistic
models
– Object models: Generative vs Discriminative vs hybrid
14-Nov-1130
Lecture 14 -
Object categorization:
the statistical viewpoint
)|( imagezebrap
)( ezebra|imagnop vs.
)|(
)|(
imagezebranop
imagezebrap
• Bayes rule:
14-Nov-11 31
Lecture 14 -
Object categorization:
the statistical viewpoint
)|( imagezebrap
)( ezebra|imagnop vs.
• Bayes rule:
)(
)(
)|(
)|(
)|(
)|(
zebranop
zebrap
zebranoimagep
zebraimagep
imagezebranop
imagezebrap ⋅=
posterior ratio likelihood ratio prior ratio
14-Nov-11
Lecture 14 -
Object categorization:
the statistical viewpoint
• Bayes rule:
)(
)(
)|(
)|(
)|(
)|(
zebranop
zebrap
zebranoimagep
zebraimagep
imagezebranop
imagezebrap ⋅=
posterior ratio likelihood ratio prior ratio
• Discriminative methods model posterior
• Generative methods model likelihood and prior
14-Nov-11
Lecture 14 -
Fei-Fei Li
Discriminative models
Zebra
Non-zebra
Decision
boundary
)|(
)|(
imagezebranop
imagezebrap • Modeling the posterior ratio:
14-Nov-1134
Lecture 14 -
Fei-Fei Li
Discriminative models
Support Vector Machines
Guyon, Vapnik, Heisele,
Serre, Poggio…
Boosting
Viola, Jones 2001,
Torralba et al. 2004,
Opelt et al. 2006,…
106 examples
Nearest neighbor
Shakhnarovich, Viola, Darrell 2003
Berg, Berg, Malik 2005...
Neural networks
Source: Vittorio Ferrari, Kristen Grauman, Antonio Torralba
Latent SVM
Structural SVM
Felzenszwalb 00 Ramanan 03…
LeCun, Bottou, Bengio, Haffner 1998
Rowley, Baluja, Kanade 1998
…
14-Nov-1135
Lecture 14 -
Fei-Fei Li
Generative models
• Modeling the likelihood ratio:
)|(
)|(
zebranoimagep
zebraimagep
0 0.2 0.4 0.6 0.8 1 0
1
2
3
4
5
cl as
s de
ns iti
es
p(x|C 1 )
p(x|C 2 )
x
14-Nov-1136
Lecture 14 -
)|( zebranoimagep)|( zebraimagep
Generative models
0
1
2
3
4
5
cl as
s de
ns iti
es
p(x|C 1 )
p(x|C 2 )
High Low
Low High
14-Nov-11 37
Lecture 14 -
Fei-Fei Li
Generative models
• Naïve Bayes classifier – Csurka Bray, Dance & Fan, 2004
• Hierarchical Bayesian topic models (e.g. pLSA and LDA)
– Object categorization: Sivic et al. 2005, Sudderth et al. 2005 – Natural scene categorization: Fei-Fei et al. 2005
• 2D Part based models - Constellation models: Weber et al 2000; Fergus et al 200
- Star models: ISM (Leibe et al 05)
• 3D part based models: - multi-aspects: Sun, et al, 2009
14-Nov-1138
Lecture 14 -
Fei-Fei Li
Basic issues
• Representation – How to represent an object category; which
classification scheme?
• Learning – How to learn the classifier, given training data
• Recognition – How the classifier is to be used on novel data
14-Nov-1139