Computer Vision

Basic Info •  Instructor: Paolo Favaro ([email protected]) •  Teaching Assistant: Thoma Papadhimitri

([email protected])

•  Course webpage: http://cvg.unibe.ch/teaching.html

•  Course material acknowledgements: –  Svetlana Lazebnik – University of Illinois –  Steve Seitz – University of Washington –  Jan Koenderink – University of Leuven –  Kristen Grauman – University of Texas at Austin –  Andrea Vedaldi – University of Oxford –  Srinivas Narasimhan – Carnegie Mellon University

Textbooks •  Forsyth & Ponce, Computer

Vision: A Modern Approach

•  Richard Szeliski, Computer Vision: Algorithms and Applications (available online)

•  Kristen Grauman and Bastian Leibe, Visual Object Recognition (pdf available online)

Course requirements

•  3 Assignments –  Deadlines specified in the next slide –  Each assignment has a maximum score of 100 and a pass is 60 –  To register for the exam one needs at least a pass on each assignment –  Each assignment to be made available in Ilias –  Assignments will require use of MATLAB (tutorial will be provided)

•  Exercises –  Weekly –  Aim is exam preparation

•  Exam –  There will be a final exam on 11 February 2014 (duration 120 mins)

•  Final Mark –  70% Exam and 30% Assignments

Assignments

•  First assignment: Photometric Stereo –  Available on 8/10 –  Due on 29/10

•  Second assignment: Uncalibrated Stereo –  Available on 29/10 –  Due on 19/11

•  Third assignment: Object detection via Bag-of-Words –  Available on 19/11 –  Due on 10/12

Academic integrity policy

•  Feel free to discuss assignments with each other, but coding must be done individually

•  Feel free to incorporate code or tips you find on the Web, provided this doesn’t make the assignment trivial and you explicitly acknowledge your sources

•  Remember: we can Google too!

The goal of computer vision

•  To extract “meaning” from pixels

What we see What a computer sees Source: S. Narasimhan

The goal of computer vision

•  To extract “meaning” from pixels

Source: “80 million tiny images” by Torralba et al.

Humans are remarkably good at this…

What kind of informa.on can be extracted from an image?

Geometric informa.on Seman,c informa.on

building

person trashcan car car

ground

tree tree

sky

door window

building

roof

chimney

Outdoor scene City European

…

Why study computer vision? •  Vision is useful

•  Vision is interesting

•  Vision is difficult •  Half of primate cerebral cortex is devoted to visual

processing •  Achieving human-level visual perception is probably

“AI-complete”

Successes of computer vision to date

Optical character recognition (OCR)

Source: S. Seitz, N. Snavely

Digit recognition yann.lecun.com

License plate readers h<p://en.wikipedia.org/wiki/Automa.c_number_plate_recogni.on

Sudoku grabber h<p://sudokugrab.blogspot.com/

Automa.c check processing

Biometrics

Fingerprint scanners on many new laptops,

other devices

Face recognition systems now beginning to appear more widely

http://www.sensiblevision.com/

Source: S. Seitz

Biometrics

How the Afghan Girl was Identified by Her Iris Patterns

Source: S. Seitz

Face detection

Many consumer digital cameras now detect faces

Source: S. Seitz

Smile detection

Sony Cyber-shot® T70 Digital Still Camera Source: S. Seitz

Face recognition: Apple iPhoto software

http://www.apple.com/ilife/iphoto/

Mobile visual search: Google Goggles

Automotive safety

Mobileye: Vision systems in high-end BMW, GM, Volvo models •  Pedestrian collision warning •  Forward collision warning •  Lane departure warning •  Headway monitoring and warning Source: A. Shashua, S. Seitz

Google self-driving cars

Vision-based interaction: Xbox Kinect

3D Reconstruction: Kinect Fusion

YouTube Video

3D Reconstruction: Multi-View Stereo

YouTube Video

Google Maps Photo Tours

http://maps.google.com/phototours

Special effects: shape and motion capture

Source: S. Seitz

Vision for robotics, space exploration

NASA'S Curiosity Rover has a system consisting of 17 cameras

Why is computer vision difficult?

Challenges: viewpoint variation

Michelangelo 1475-1564 slide credit: Fei-Fei, Fergus & Torralba

Challenges: illumination

image credit: J. Koenderink

Challenges: scale

slide credit: Fei-Fei, Fergus & Torralba

Challenges: deformation

Xu, Beihong 1943


Challenges: occlusion, clutter

Image source: National Geographic

Challenges: Motion

Challenges: object intra-class variation


Challenges: local ambiguity


Challenges: local ambiguity

Source: Rob Fergus and Antonio Torralba

Challenges: Inherent ambiguity

•  Many different 3D scenes could have given rise to a particular 2D picture

Challenges or opportunities?

•  Images are confusing, but they also reveal the structure of the world through numerous cues

•  Our job is to interpret the cues!

Image source: J. Koenderink

Depth cues: Linear perspective

Depth cues: Aerial perspective

Depth ordering cues: Occlusion

Source: J. Koenderink

Shape cues: Texture gradient

Shape and lighting cues: Shading

Position and lighting cues: Cast shadows

Source: J. Koenderink

Grouping cues: Similarity (color, texture, proximity)

Grouping cues: “Common fate”

Image credit: Arthus-Bertrand (via F. Durand)

Origins of computer vision

L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

Connections to other disciplines

Computer Vision

Image Processing

Machine Learning

Artificial Intelligence

Robotics

Cognitive science Neuroscience

Computer Graphics

The computer vision industry

•  A list of companies here: http://www.cs.ubc.ca/spider/lowe/vision.html

Course overview

I.  Image formation: Camera and projection models II.  Radiometry and shading

I.  Calibrated/Uncalibrated photometric stereo

III.  Early vision: Image filtering, edges, interest points, denoising and deblurring

IV.  Epipolar geometry, stereo and structure from motion V.  Tracking, optical flow and registration VI.  Mid-level vision: Clustering and segmentation VII.  Recognition

I.  Bag of Features and Support Vector Machines

VIII. Deformable parts models IX.  Special topics

Early vision

Cameras and sensors Light and color

Linear filtering Edge detection

* =

Feature extraction: corner and blob detection

•  Basic image formation and processing

Mid-level vision

Fitting: Least squares RANSAC

Alignment

•  Fitting and grouping

Multi-view geometry

Structure from motion

Stereo Epipolar geometry

3D Photography

Recognition

Instance recognition, large-scale alignment Image classification

Sliding window detection Part-based models

Computer Vision - Lec01 Intro

Documents

Transcript of Computer Vision - Lec01 Intro