Lecture 14: Introduction to Object Recognition & Bag-of...

Lecture 14-

Fei-Fei Li

Lecture 14:

Introduction to Object Recognition

& Bag-of-Words (BoW) Models

Professor Fei-Fei Li

Stanford Vision Lab

14-Nov-111

Lecture 14-

Fei-Fei Li

What we will learn today?

• Introduction to object recognition

– Representation

– Learning

– Recognition

• Bag of Words models (Problem Set 4 (Q2))

– Basic representation

– Different learning and recognition algorithms

14-Nov-112

Lecture 14-

Fei-Fei Li

What are the different visual recognition tasks?

14-Nov-113

Lecture 14-

Fei-Fei Li

Classification: Does this image contain a building? [yes/no]

Yes!

14-Nov-114

Lecture 14-

Fei-Fei Li

Classification:Is this an beach?

14-Nov-115

Lecture 14-

Fei-Fei Li

Image Search

Organizing photo collections

14-Nov-116

Lecture 14-

Fei-Fei Li

Detection:Does this image contain a car? [where?]

car

14-Nov-117

Lecture 14-

Fei-Fei Li

Building

clock

personcar

Detection:Which object does this image contain? [where?]

14-Nov-118

Lecture 14-

Fei-Fei Li

clock

Detection:Accurate localization (segmentation)

14-Nov-119

Lecture 14-

Fei-Fei Li

Object: Person, back;

1-2 meters away

Object: Police car, side view, 4-5 m away

Object: Building, 45º pose,

8-10 meters away

It has bricks

Detection: Estimating object semantic &

geometric attributes

14-Nov-1110

Lecture 14-

Fei-Fei Li

Applications of computer vision

SurveillanceAssistive technologies

Security Assistive driving

Computational photography

14-Nov-1111

Lecture 14-

Fei-Fei Li

Categorization vs Single instance

recognitionDoes this image contain the Chicago Macy building’s?

14-Nov-1112

Lecture 14-

Fei-Fei Li

Where is the crunchy nut?

Categorization vs Single instance

recognition

14-Nov-1113

Lecture 14-

Fei-Fei Li

+ GPS

•Recognizing landmarks in

mobile platforms

Applications of computer vision

14-Nov-1114

Lecture 14-

Fei-Fei Li

Activity or Event recognitionWhat are these people doing?

14-Nov-1115

Lecture 14-

Fei-Fei Li

Visual Recognition

• Design algorithms that are capable to

– Classify images or videos

– Detect and localize objects

– Estimate semantic and geometrical

attributes

– Classify human activities and events

Why is this challenging?14-Nov-1116

Lecture 14-

Fei-Fei Li

How many object categories are there?

14-Nov-1117

Lecture 14-

Fei-Fei Li

Challenges: viewpoint variation

Michelangelo 1475-1564

14-Nov-1118

Lecture 14-

Fei-Fei Li

Challenges: illumination

image credit: J. Koenderink

14-Nov-1119

Lecture 14-

Fei-Fei Li

Challenges: scale

14-Nov-1120

Lecture 14-

Fei-Fei Li

Challenges: deformation

14-Nov-1121

Lecture 14-

Fei-Fei Li

Challenges:

occlusion

Magritte, 1957

14-Nov-1122

Lecture 14-

Fei-Fei Li

Challenges: background clutter

Kilmeny Niland. 1995

14-Nov-1123

Lecture 14-

Fei-Fei Li

Challenges: intra-class variation

14-Nov-1124

Lecture 14-

• Turk and Pentland, 1991

• Belhumeur, Hespanha, & Kriegman, 1997

• Schneiderman & Kanade 2004

• Viola and Jones, 2000

• Amit and Geman, 1999

• LeCun et al. 1998

• Belongie and Malik, 2002

• Schneiderman & Kanade, 2004

• Argawal and Roth, 2002

• Poggio et al. 1993

Some early works on object

categorization

14-Nov-11

Lecture 14-

Fei-Fei Li

Basic issues

• Representation

– How to represent an object category; which

classification scheme?

• Learning

– How to learn the classifier, given training data

• Recognition

– How the classifier is to be used on novel data

14-Nov-1126

Lecture 14-

Fei-Fei Li

Representation- Building blocks: Sampling strategies

RandomlyMultiple interest operators

Interest operators Dense, uniformly

Ima

ge

cre

dit

s: L

. Fe

i-Fe

i, E

. N

ow

ak,

J. S

ivic

14-Nov-1127

Lecture 14-

Fei-Fei Li

Representation– Appearance only or location and appearance

14-Nov-1128

Lecture 14-

Fei-Fei Li

Representation

–Invariances

• View point

• Illumination

• Occlusion

• Scale

• Deformation

• Clutter

• etc.

14-Nov-1129

Lecture 14-

Fei-Fei Li

Representation

– To handle intra-class variability, it is convenient to

describe an object categories using probabilistic

models

– Object models: Generative vs Discriminative vs hybrid

14-Nov-1130

Lecture 14-

Object categorization:

the statistical viewpoint

)|( imagezebrap

)( ezebra|imagnopvs.

)|(

)|(

imagezebranop

imagezebrap

• Bayes rule:

14-Nov-1131

Lecture 14-



)|( imagezebrap

)( ezebra|imagnopvs.

• Bayes rule:

)(

)(

)|(

)|(

)|(

)|(

zebranop

zebrap

zebranoimagep

zebraimagep

imagezebranop

imagezebrap ⋅=

posterior ratio likelihood ratio prior ratio

14-Nov-11

Lecture 14-



• Bayes rule:

)(

)(

)|(

)|(

)|(

)|(

zebranop

zebrap

zebranoimagep

zebraimagep

imagezebranop

imagezebrap ⋅=

posterior ratio likelihood ratio prior ratio

• Discriminative methods model posterior

• Generative methods model likelihood and prior

14-Nov-11

Lecture 14-

Fei-Fei Li

Discriminative models

Zebra

Non-zebra

Decision

boundary

)|(

)|(

imagezebranop

imagezebrap• Modeling the posterior ratio:

14-Nov-1134

Lecture 14-

Fei-Fei Li


Support Vector Machines

Guyon, Vapnik, Heisele,

Serre, Poggio…

Boosting

Viola, Jones 2001,

Torralba et al. 2004,

Opelt et al. 2006,…

106 examples

Nearest neighbor

Shakhnarovich, Viola, Darrell 2003

Berg, Berg, Malik 2005...

Neural networks

Source: Vittorio Ferrari, Kristen Grauman, Antonio Torralba

Latent SVM

Structural SVM

Felzenszwalb 00Ramanan 03…

LeCun, Bottou, Bengio, Haffner 1998

Rowley, Baluja, Kanade 1998

…

14-Nov-1135

Lecture 14-

Fei-Fei Li

Generative models

• Modeling the likelihood ratio:

)|(

)|(

zebranoimagep

zebraimagep

0 0.2 0.4 0.6 0.8 10

1

2

3

4

5

clas

s de

nsiti

es

p(x|C1)

p(x|C2)

x

14-Nov-1136

Lecture 14-

)|( zebranoimagep)|( zebraimagep

Generative models

0

1

2

3

4

5

clas

s de

nsiti

es

p(x|C1)

p(x|C2)

High Low

Low High

14-Nov-1137

Lecture 14-

Fei-Fei Li

Generative models

• Naïve Bayes classifier– Csurka Bray, Dance & Fan, 2004

• Hierarchical Bayesian topic models (e.g. pLSAand LDA)

– Object categorization: Sivic et al. 2005, Sudderth et al. 2005

– Natural scene categorization: Fei-Fei et al. 2005

• 2D Part based models- Constellation models: Weber et al 2000; Fergus et al 200

- Star models: ISM (Leibe et al 05)

• 3D part based models: - multi-aspects: Sun, et al, 2009

14-Nov-1138

Lecture 14-

Fei-Fei Li

Basic issues

• Representation



• Learning


• Recognition


14-Nov-1139

Lecture 14-

Fei-Fei Li

• Learning parameters: What are you maximizing? Likelihood (Gen.) or performances on train/validation set (Disc.)

Learning

14-Nov-1140

Lecture 14-

Fei-Fei Li


• Level of supervision

• Manual segmentation; bounding box; image labels; noisy labels

Learning

• Batch/incremental

• Priors

14-Nov-1141

Lecture 14-

Fei-Fei Li


• Level of supervision

• Manual segmentation; bounding box; image labels; noisy labels

Learning

• Batch/incremental

• Training images:•Issue of overfitting

•Negative images for

discriminative methods

• Priors

14-Nov-1142

Lecture 14-

Fei-Fei Li

Basic issues

• Representation



• Learning


• Recognition


14-Nov-1143

Lecture 14-

Fei-Fei Li

– Recognition task: classification, detection, etc..

Recognition

14-Nov-1144

Lecture 14-

Fei-Fei Li

Recognition

– Recognition task

– Search strategy: Sliding Windows

• Simple

• Computational complexity (x,y, S, θ, N of classes)

- BSW by Lampert et al 08

- Also, Alexe, et al 10

Viola, Jones 2001,

14-Nov-1145

Lecture 14-

Fei-Fei Li

Recognition



• Simple


• Localization

• Objects are not boxes



Viola, Jones 2001,

14-Nov-1146

Lecture 14-

Fei-Fei Li

Recognition



• Simple


• Localization

• Objects are not boxes

• Prone to false positive



Non max suppression:

Canny ’86

….

Desai et al , 2009

Viola, Jones 2001,

14-Nov-1147

Lecture 14-

Fei-Fei Li

Recognition

Category: car

Azimuth = 225º

Zenith = 30º

•Savarese, 2007

•Sun et al 2009

• Liebelt et al., ’08, 10

•Farhadi et al 09

- It has metal

- it is glossy

- has wheels

•Farhadi et al 09

• Lampert et al 09

• Wang & Forsyth 09


– Search strategy

– Attributes

14-Nov-1148

Lecture 14-

Fei-Fei Li

Semantic:•Torralba et al 03

• Rabinovich et al 07

• Gupta & Davis 08

• Heitz & Koller 08

• L-J Li et al 08

• Yao & Fei-Fei 10

Recognition


– Search strategy

– Attributes

– Context

Geometric• Hoiem, et al 06

• Gould et al 09

• Bao, Sun, Savarese 10

14-Nov-1149

Lecture 14-

Fei-Fei Li

Basic issues

• Representation



• Learning


• Recognition


14-Nov-1150

Lecture 14-

Fei-Fei Li

Part 1: Bag-of-words models

This segment is based on the tutorial “Recognizing and Learning

Object Categories: Year 2007”, by Prof L. Fei-Fei, A. Torralba, and R. Fergus

14-Nov-1151

Lecture 14-

Fei-Fei Li

Related works

• Early “bag of words” models: mostly texture recognition– Cula & Dana, 2001; Leung & Malik 2001; Mori, Belongie & Malik, 2001;

Schmid 2001; Varma & Zisserman, 2002, 2003; Lazebnik, Schmid & Ponce, 2003;

• Hierarchical Bayesian models for documents (pLSA, LDA, etc.)– Hoffman 1999; Blei, Ng & Jordan, 2004; Teh, Jordan, Beal & Blei, 2004

• Object categorization– Csurka, Bray, Dance & Fan, 2004; Sivic, Russell, Efros, Freeman &

Zisserman, 2005; Sudderth, Torralba, Freeman & Willsky, 2005;

• Natural scene categorization– Vogel & Schiele, 2004; Fei-Fei & Perona, 2005; Bosch, Zisserman &

Munoz, 2006

14-Nov-1152

Lecture 14-

Fei-Fei Li

Object Bag of ‘words’

14-Nov-1153

Lecture 14-

Fei-Fei Li

Analogy to documents

Of all the sensory impressions proceeding to the brain, the visual experiences are the dominant ones. Our perception of the world around us is based essentially on the messages that reach the brain from our eyes. For a long time it was thought that the retinal image was transmitted point by point to visual centers in the brain; the cerebral cortex was a movie screen, so to speak, upon which the image in the eye was projected. Through the discoveries of Hubel and Wiesel we now know that behind the origin of the visual perception in the brain there is a considerably more complicated course of events. By following the visual impulses along their path to the various cell layers of the optical cortex, Hubel and Wiesel have been able to demonstrate that the message about the image falling on the retina undergoes a step-wise analysis in a system of nerve cells stored in columns. In this system each cell has its specific function and is responsible for a specific detail in the pattern of the retinal image.

sensory, brain, visual, perception,

retinal, cerebral cortex,eye, cell, optical

nerve, imageHubel, Wiesel

China is forecasting a trade surplus of $90bn (£51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to $660bn. The figures are likely to further annoy the US, which has long argued that China's exports are unfairly helped by a deliberately undervalued yuan. Beijing agrees the surplus is too high, but says the yuan is only one factor. Bank of China governor Zhou Xiaochuan said the country also needed to do more to boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value.

China, trade, surplus, commerce,

exports, imports, US, yuan, bank, domestic,

foreign, increase, trade, value

14-Nov-1154

Lecture 14-

Fei-Fei Li

– Independent features

definition of “BoW”

face bike violin

14-Nov-1155

Lecture 14-

Fei-Fei Li

definition of “BoW”

– Independent features

– histogram representation

codewords dictionary

14-Nov-1156

Lecture 14-

Fei-Fei Li

categorydecision

Representation

feature detection& representation


image representation

category models(and/or) classifiers

recognitionle

arni

ng

14-Nov-1157

Lecture 14-

Fei-Fei Li

1.Feature detection and representation

14-Nov-1158

Lecture 14-

Fei-Fei Li


• Regular grid

– Vogel & Schiele, 2003

– Fei-Fei & Perona, 2005

14-Nov-1159

Lecture 14-

Fei-Fei Li


• Regular grid



• Interest point detector

– Csurka, et al. 2004


– Sivic, et al. 2005

14-Nov-1160

Lecture 14-

Fei-Fei Li


• Regular grid



• Interest point detector

– Csurka, Bray, Dance & Fan, 2004


– Sivic, Russell, Efros, Freeman & Zisserman, 2005

• Other methods

– Random sampling (Vidal-Naquet & Ullman, 2002)

– Segmentation based patches (Barnard, Duygulu, Forsyth, de Freitas, Blei, Jordan, 2003)

14-Nov-1161

Lecture 14-

Fei-Fei Li


Normalize patch

Detect patches[Mikojaczyk and Schmid ’02]

[Mata, Chum, Urban & Pajdla, ’02]

[Sivic & Zisserman, ’03]

Compute SIFT

descriptor[Lowe’99]

Slide credit: Josef Sivic

14-Nov-1162

Lecture 14-

Fei-Fei Li

…


14-Nov-1163

Lecture 14-

Fei-Fei Li

2. Codewords dictionary formation

…

14-Nov-1164

Lecture 14-

Fei-Fei Li


Clustering/vector quantization

…

Cluster center= code word

14-Nov-1165

Lecture 14-

Fei-Fei Li


Fei-Fei et al. 2005

14-Nov-1166

Lecture 14-

Fei-Fei Li

Image patch examples of codewords

Sivic et al. 2005

14-Nov-1167

Lecture 14-

Fei-Fei Li

Visual vocabularies: Issues

• How to choose vocabulary size?

– Too small: visual words not representative of all patches

– Too large: quantization artifacts, overfitting

• Computational efficiency

– Vocabulary trees

(Nister & Stewenius, 2006)

14-Nov-1168

Lecture 14-

Fei-Fei Li

3. Bag of word representation

Codewords dictionary

• Nearest neighbors assignment

• K-D tree search strategy

14-Nov-1169

Lecture 14-

Fei-Fei Li

3. Bag of word representation

Codewords dictionary codewords

freq

uenc

y

….

14-Nov-1170

Lecture 14-

Fei-Fei Li

feature detection& representation


image representation

Representation

1.

2.

3.

14-Nov-1171

Lecture 14-

Fei-Fei Li

categorydecision



Learning and Recognition

14-Nov-1172

Lecture 14-

Fei-Fei Li



1. Discriminative method: - NN- SVM

2.Generative method: - graphical models

14-Nov-1173

Lecture 14-

Fei-Fei Li

category models

Class 1 Class N

… ……

Discriminative classifiers

Model space

14-Nov-1174

Lecture 14-

Fei-Fei Li

Discriminative classifiers

Query image

Winning class: pink

Model space

14-Nov-1175

Lecture 14-

Fei-Fei Li

Nearest Neighborsclassifier

Query image

Winning class: pink

• Assign label of nearest training data point to each test data point

Model space

14-Nov-1176

Lecture 14-

Fei-Fei Li

Query image

• For a new point, find the k closest points from training data• Labels of the k points “vote” to classify• Works well provided there is lots of data and the distance function is good

K- Nearest Neighborsclassifier

Model space

Winning class: pink

14-Nov-1177

Lecture 14-

Fei-Fei Li

• For k dimensions: k-D tree = space-partitioning data structure for organizing points in a

k-dimensional space

• Enable efficient search

from Duda et al.

K- Nearest Neighborsclassifier

• Voronoi partitioning of feature space for 2-category 2-D and 3-D data

• Nice tutorial: http://www.cs.umd.edu/class/spring2002/cmsc420-0401/pbasic.pdf

14-Nov-1178

Lecture 14-

Fei-Fei Li

Functions for comparing histograms

• L1 distance

• χ2 distance

• Quadratic distance (cross-bin)

∑=

−=N

i

ihihhhD1

2121 |)()(|),(

Jan Puzicha, Yossi Rubner, Carlo Tomasi, Joachim M. Buhmann: Empirical Evaluation of Dissimilarity Measures for Color and Texture. ICCV 1999

( )∑

= +−=

N

i ihih

ihihhhD

1 21

221

21 )()(

)()(),(

∑ −=ji

ij jhihAhhD,

22121 ))()((),(

14-Nov-1179

Lecture 14-

Fei-Fei Li


1. Discriminative method: - NN- SVM

2.Generative method: - graphical models

14-Nov-1180

Lecture 14-

Fei-Fei Li

Discriminative classifiers(linear classifier)

Model spacecategory models

Class 1 Class N

… ……

14-Nov-1181

Lecture 14-

Fei-Fei Li

Support vector machines• Find hyperplane that maximizes the margin between the positive and

negative examples

MarginSupport vectors

Distance between point and hyperplane: ||||

||

wwx bi +⋅

Support vectors: 1±=+⋅ bi wx

Margin = 2 / ||w||

Credit slide: S. Lazebnik

∑=i iii y xw α

bybi iii +⋅=+⋅ ∑ xxxw α

Classification function (decision boundary):

Solution:

14-Nov-1182

Lecture 14-

Fei-Fei Li

Support vector machines• Classification

Margin

bybi iii +⋅=+⋅ ∑ xxxw α

20

10

classbif

classbif

→<+⋅→≥+⋅

wx

wx

Test point

C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

14-Nov-1183

Lecture 14-

Fei-Fei Li

• Datasets that are linearly separable work out great:•

•

• But what if the dataset is just too hard?

• We can map it to a higher-dimensional space:

0 x

0 x

0 x

x2

Nonlinear SVMs

Slide credit: Andrew Moore

14-Nov-1184

Lecture 14-

Fei-Fei Li

Φ: x→ φ(x)

Nonlinear SVMs• General idea: the original input space can always be mapped

to some higher-dimensional feature space where the

training set is separable:

Slide credit: Andrew Moore

lifting transformation

14-Nov-1185

Lecture 14-

Fei-Fei Li

Nonlinear SVMs

• Nonlinear decision boundary in the original feature

space:

bKyi

iii +∑ ),( xxα

C. Burges, A Tutorial on Support Vector Machines for Pattern Recognition, Data Mining and Knowledge Discovery, 1998

•The kernel K = product of the lifting transformation φ(x):

K(xi ,xj j) = φ(xi ) · φ(xj)

NOTE:

• It is not required to compute φ(x) explicitly:

• The kernel must satisfy the “Mercer inequality”

14-Nov-1186

Lecture 14-

Fei-Fei Li

Kernels for bags of features

• Histogram intersection kernel:

• Generalized Gaussian kernel:

• D can be Euclidean distance, χ2 distance etc…

∑=

=N

i

ihihhhI1

2121 ))(),(min(),(

−= 22121 ),(

1exp),( hhD

AhhK

J. Zhang, M. Marszalek, S. Lazebnik, and C. Schmid, Local Features and Kernels for Classifcation of Texture and Object Categories: A Comprehensive Study, IJCV 2007

14-Nov-1187

Lecture 14-

Fei-Fei Li

Pyramid match kernel

• Fast approximation of Earth Mover’s Distance

• Weighted sum of histogram intersections at mutliple resolutions (linear in

the number of features instead of cubic)

K. Grauman and T. Darrell. The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features, ICCV 2005.

14-Nov-1188

Lecture 14-

Fei-Fei Li

Spatial Pyramid Matching

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. S. Lazebnik, C. Schmid, and J. Ponce. CVPR 2006

14-Nov-1189

Lecture 14-

Fei-Fei Li

What about multi-class SVMs?

• No “definitive” multi-class SVM formulation

• In practice, we have to obtain a multi-class SVM by combining

multiple two-class SVMs

• One vs. others

– Traning: learn an SVM for each class vs. the others

– Testing: apply each SVM to test example and assign to it the class of

the SVM that returns the highest decision value

• One vs. one

– Training: learn an SVM for each pair of classes

– Testing: each learned SVM “votes” for a class to assign to the test

example

Credit slide: S. Lazebnik

14-Nov-1190

Lecture 14-

Fei-Fei Li

SVMs: Pros and cons

• Pros

– Many publicly available SVM packages:

http://www.kernel-machines.org/software

– Kernel-based framework is very powerful, flexible

– SVMs work very well in practice, even with very small training sample sizes

• Cons

– No “direct” multi-class SVM, must combine two-class SVMs

– Computation, memory

• During training time, must compute matrix of kernel values for every pair of

examples

• Learning can take a very long time for large-scale problems

14-Nov-1191

Lecture 14-

Object recognition results

• ETH-80 database of 8 object

classes (Eichhorn and Chapelle 2004)

• Features:

– Harris detector

– PCA-SIFT descriptor, d=10

Kernel Complexity Recognition rate

Match [Wallraven et al.] 84%

Bhattacharyya affinity [Kondor & Jebara]

85%

Pyramid match 84%

Slide credit: Kristen Grauman

14-Nov-11

Lecture 14-

Fei-Fei Li


Support Vector Machines

Guyon, Vapnik, Heisele,

Serre, Poggio…

Boosting

Viola, Jones 2001,

Torralba et al. 2004,

Opelt et al. 2006,…

106 examples

Nearest neighbor

Shakhnarovich, Viola, Darrell 2003

Berg, Berg, Malik 2005...

Neural networks

Source: Vittorio Ferrari, Kristen Grauman, Antonio Torralba

Latent SVM

Structural SVM

Felzenszwalb 00Ramanan 03…

LeCun, Bottou, Bengio, Haffner 1998

Rowley, Baluja, Kanade 1998

…

14-Nov-1193

Lecture 14-

Fei-Fei Li


1. Discriminative method:

- NN

- SVM

2.Generative method:

- graphical models

� Model the probability distribution that produces a given bag of

features

14-Nov-1194

Lecture 14-

Fei-Fei Li

Generative models

1. Naïve Bayes classifier– Csurka Bray, Dance & Fan, 2004

2. Hierarchical Bayesian text models (pLSA and LDA)

– Background: Hoffman 2001, Blei, Ng & Jordan, 2004



14-Nov-1195

Lecture 14-

Fei-Fei Li

• w: a collection of all N codewords in the image

w = [w1,w2,…,wN]

• c: category of the image

Some notations

14-Nov-1196

Lecture 14-

wN

c

the Naïve Bayes model

)|()( cwpcp∝)|( wcp

14-Nov-11

Prior prob. of the object classes

Image likelihoodgiven the class

Graphical model

Posterior =probability that image I is of category c

Lecture 14-

wN

c

the Naïve Bayes model

)|()( cwpcp ∏=

=N

nn cwpcp

1

)|()(

Object classdecision

∝)|( wcpc

c maxarg=∗

Likelihood of ith visual wordgiven the class

Estimated by empirical frequencies of code words in images from a given class

14-Nov-11

Graphical model

Lecture 14-

Fei-Fei Li

Csurka et al. 2004

14-Nov-1199

Lecture 14-

Fei-Fei Li

Csurka et al. 2004

14-Nov-11100

Lecture 14-

Fei-Fei Li

Other generative BoW models

• Hierarchical Bayesian topic models (e.g. pLSAand LDA)



14-Nov-11101

Lecture 14-

Fei-Fei Li

Generative vs discriminative

• Discriminative methods

– Computationally efficient & fast

• Generative models

– Convenient for weakly- or un-supervised, incremental

training

– Prior information

– Flexibility in modeling parameters

14-Nov-11102

Lecture 14-

Fei-Fei Li

• No rigorous geometric information of the object components

• It’s intuitive to most of us that objects are made of parts – no such information

• Not extensively tested yet for

– View point invariance

– Scale invariance

• Segmentation and localization unclear

Weakness of BoW the models

14-Nov-11103

Lecture 14-

Fei-Fei Li

What have learned today?

• Introduction to object recognition

– Representation

– Learning

– Recognition

• Bag of Words models (Problem Set 4 (Q2))

– Basic representation

– Different learning and recognition algorithms

14-Nov-11104

Lecture 14: Introduction to Object Recognition & Bag-of...

Documents

Transcript of Lecture 14: Introduction to Object Recognition & Bag-of...