Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able...

15
1 1 1 Object Recognition and Localization Andrew Zisserman Visual Geometry Group University of Oxford http://www.robots.ox.ac.uk/~vgg Includes slides from: Anna Bosch, Ondra Chum, Mark Everingham, Rob Fergus, Pawan Kumar, Bastian Leibe, Pietro Perona, Bernt Schiele, Josef Sivic and Manik Varma

Transcript of Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able...

Page 1: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

111

Object Recognition and Localization

Andrew ZissermanVisual Geometry Group

University of Oxfordhttp://www.robots.ox.ac.uk/~vgg

Includes slides from: Anna Bosch, Ondra Chum, Mark Everingham, Rob Fergus, PawanKumar, Bastian Leibe, Pietro Perona, Bernt Schiele, Josef Sivic and Manik Varma

Page 2: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

222

What we would like to be able to do …• Visual recognition and scene understanding

• What is in the image and where

• scene type: outdoor, city …• object classes• material properties• actions

Page 3: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

333

Various tasks

Image classification:“the image contains an airplane”

Object pixel-level segmentation:“the object ‘owns’ these pixels”

Object detection/localization:“the object is here, in a bounding box”

Page 4: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

5

Why is the recognition problem hard ?

• Scale and shape of the imaged object varies with viewpoint

• Occlusion (self- or by a foreground object)

• Lighting changes

• Background “clutter”

Page 5: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

6

Some object classes(Caltech datasets)

Difficulties:

• Size/shape variation• Partial occlusion• Lighting• Background clutter• Intra-class variation

Page 6: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

7

Class of model: Pictorial Structure

• Intuitive model of an object

• Model has two components

1. parts (2D image fragments)

2. structure (configuration of parts)

• Dates back to Fischler & Elschlager 1973

Is this complexity of representation necessary ?

Page 7: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

888

Deformations

Page 8: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

101010

Unclear how to model categories, so learn rather than manually specify

Learning

Page 9: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

111111

Levels of supervision in learning

pixel level (regions)“parts”

ROInone (weak supervision)

Page 10: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

121212

David Lowe [1985]

Specific object recognition history: Geometric methods

Rothwell et al. [1992]

Page 11: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

131313

Specific object recognition history: Appearance-based methods

• Murase & Nayer 1995 • Schmid & Mohr 1996• Lowe 1999• …

Original use of SIFT descriptor

Page 12: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

141414

• Rowley & Kanade, 1998• Schneiderman & Kanade 2000• Viola and Jones, 2000• Heisele et al., 2001

• LeCun et al. 1998 • Amit and Geman, 1999• Belongie and Malik, 2002• DeCoste and Scholkopf, 2002• Simard et al. 2003

• Poggio et al. 1993• Schneiderman & Kanade, 2000• Argawal and Roth, 2002…..

History: early object categorization

Page 13: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

151515

Application I: improving photography

Page 14: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

161616

Application II: Improving online search

Query:STREET

Application III: Organizing photo collections

Page 15: Object Recognition and Localizationaz/icvss08_az_intro.pdf · 2 2 2 What we would like to be able to do … • Visual recognition and scene understanding • What is in the image

171717

Outline

1. Bag of visual words model for categorization

• SVM classifier

2. Adding spatial information for localization

3. Databases and challenges

4. Spatial layout

5. Class based segmentation

• Pixel level localization

6. Conclusions and the future