Computer Vision - Lec01 Intro
Computer Vision
Basic Info
• Instructor: Paolo Favaro ([email protected])
• Teaching Assistant: Thoma Papadhimitri
• Course webpage: http://cvg.unibe.ch/teaching.html
• Course material acknowledgements:
  – Svetlana Lazebnik, University of Illinois
  – Steve Seitz, University of Washington
  – Jan Koenderink, University of Leuven
  – Kristen Grauman, University of Texas at Austin
  – Andrea Vedaldi, University of Oxford
  – Srinivas Narasimhan, Carnegie Mellon University
Textbooks
• Forsyth & Ponce, Computer Vision: A Modern Approach
• Richard Szeliski, Computer Vision: Algorithms and Applications (available online)
• Kristen Grauman and Bastian Leibe, Visual Object Recognition (pdf available online)
Course requirements
• 3 Assignments
  – Deadlines are specified on the next slide
  – Each assignment has a maximum score of 100; the pass mark is 60
  – To register for the exam, you need at least a pass on each assignment
  – Each assignment will be made available in Ilias
  – Assignments require the use of MATLAB (a tutorial will be provided)
• Exercises
  – Weekly
  – Aim is exam preparation
• Exam
  – There will be a final exam on 11 February 2014 (duration 120 mins)
• Final Mark
  – 70% Exam and 30% Assignments
Assignments
• First assignment: Photometric Stereo
  – Available on 8/10
  – Due on 29/10
• Second assignment: Uncalibrated Stereo
  – Available on 29/10
  – Due on 19/11
• Third assignment: Object detection via Bag-of-Words
  – Available on 19/11
  – Due on 10/12
Academic integrity policy
• Feel free to discuss assignments with each other, but coding must be done individually
• Feel free to incorporate code or tips you find on the Web, provided this doesn’t make the assignment trivial and you explicitly acknowledge your sources
• Remember: we can Google too!
The goal of computer vision
• To extract “meaning” from pixels
What we see / What a computer sees
Source: S. Narasimhan
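To make "what a computer sees" concrete, here is a minimal sketch in plain Python. The 4x5 image, the threshold of 128, and the function name `bright_bounding_box` are illustrative assumptions, not taken from the slides. The image is nothing more than a grid of intensity values, and even a statement as simple as "there is a bright object here" has to be computed from those numbers.

```python
# Hypothetical 4x5 grayscale image: 0 is black, 255 is white.
IMAGE = [
    [0, 0,   0,   0, 0],
    [0, 0, 255, 255, 0],
    [0, 0, 255, 255, 0],
    [0, 0,   0,   0, 0],
]


def bright_bounding_box(image, threshold=128):
    """Return (top, left, bottom, right) of all pixels brighter than
    `threshold`, or None if no pixel exceeds it."""
    coords = [
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value > threshold
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


if __name__ == "__main__":
    # What the computer "sees": rows of numbers.
    for row in IMAGE:
        print(" ".join(f"{value:3d}" for value in row))
    # A first, very crude piece of "meaning": where the bright region is.
    print(bright_bounding_box(IMAGE))
```

A bounding box around bright pixels is far from real scene understanding, but it shows the kind of gap that has to be bridged between raw numbers and a description of the scene.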
The goal of computer vision
• To extract “meaning” from pixels
Source: “80 million tiny images” by Torralba et al.
Humans are remarkably good at this…
What kind of information can be extracted from an image?
Geometric information / Semantic information
[Figure: street scene annotated with object labels (building, person, trashcan, car, ground, tree, sky, door, window, roof, chimney) and scene labels (outdoor scene, city, European, …)]
Why study computer vision?
• Vision is useful
• Vision is interesting
• Vision is difficult
  – Half of primate cerebral cortex is devoted to visual processing
  – Achieving human-level visual perception is probably “AI-complete”
Successes of computer vision to date
Optical character recognition (OCR)
Source: S. Seitz, N. Snavely
Digit recognition yann.lecun.com
License plate readers http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
Sudoku grabber http://sudokugrab.blogspot.com/
Automatic check processing
Biometrics
Fingerprint scanners on many new laptops, other devices
Face recognition systems now beginning to appear more widely
http://www.sensiblevision.com/
Source: S. Seitz
Biometrics
How the Afghan Girl was Identified by Her Iris Patterns
Source: S. Seitz
Face detection
Many consumer digital cameras now detect faces
Source: S. Seitz
Smile detection
Sony Cyber-shot® T70 Digital Still Camera Source: S. Seitz
Face recognition: Apple iPhoto software
http://www.apple.com/ilife/iphoto/
Mobile visual search: Google Goggles
Automotive safety
Mobileye: Vision systems in high-end BMW, GM, Volvo models
• Pedestrian collision warning
• Forward collision warning
• Lane departure warning
• Headway monitoring and warning
Source: A. Shashua, S. Seitz
Google self-driving cars
Vision-based interaction: Xbox Kinect
3D Reconstruction: Kinect Fusion
YouTube Video
3D Reconstruction: Multi-View Stereo
YouTube Video
Google Maps Photo Tours
http://maps.google.com/phototours
Special effects: shape and motion capture
Source: S. Seitz
Vision for robotics, space exploration
NASA's Curiosity rover carries a system of 17 cameras
Why is computer vision difficult?
Challenges: viewpoint variation
Michelangelo 1475-1564 slide credit: Fei-Fei, Fergus & Torralba
Challenges: illumination
image credit: J. Koenderink
Challenges: scale
slide credit: Fei-Fei, Fergus & Torralba
Challenges: deformation
Xu, Beihong 1943
slide credit: Fei-Fei, Fergus & Torralba
Challenges: occlusion, clutter
Image source: National Geographic
Challenges: Motion
Challenges: object intra-class variation
slide credit: Fei-Fei, Fergus & Torralba
Challenges: local ambiguity
slide credit: Fei-Fei, Fergus & Torralba
Challenges: local ambiguity
Source: Rob Fergus and Antonio Torralba
Challenges: local ambiguity
Source: Rob Fergus and Antonio Torralba
Challenges: Inherent ambiguity
• Many different 3D scenes could have given rise to a particular 2D picture
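The ambiguity can be illustrated with an idealized pinhole camera, which maps a 3D point (x, y, z) to the image point (f·x/z, f·y/z). The sketch below assumes this standard model; the function name `project`, the focal length of 1.0 and the sample points are hypothetical. It shows that two different 3D points on the same viewing ray land on exactly the same image location.

```python
def project(point, focal_length=1.0):
    """Perspective projection of a 3D point (x, y, z) onto the image plane
    of an idealized pinhole camera looking along the z axis."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


if __name__ == "__main__":
    near = (1.0, 2.0, 4.0)   # a point 4 units from the camera
    far = (2.0, 4.0, 8.0)    # a different point, twice as far along the same ray
    print(project(near))
    print(project(far))
    # Both points map to the same image location, so depth is lost.
    print(project(near) == project(far))
```

Any point of the form (s·x, s·y, s·z) with s > 0 projects to the same place, which is why a single image cannot determine depth without additional cues or additional views.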
Challenges or opportunities?
• Images are confusing, but they also reveal the structure of the world through numerous cues
• Our job is to interpret the cues!
Image source: J. Koenderink
Depth cues: Linear perspective
Depth cues: Aerial perspective
Depth ordering cues: Occlusion
Source: J. Koenderink
Shape cues: Texture gradient
Shape and lighting cues: Shading
Position and lighting cues: Cast shadows
Source: J. Koenderink
Grouping cues: Similarity (color, texture, proximity)
Grouping cues: “Common fate”
Image credit: Arthus-Bertrand (via F. Durand)
Origins of computer vision
L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.
Connections to other disciplines
Computer Vision
Image Processing
Machine Learning
Artificial Intelligence
Robotics
Cognitive science Neuroscience
Computer Graphics
The computer vision industry
• A list of companies here: http://www.cs.ubc.ca/spider/lowe/vision.html
Course overview
I. Image formation: Camera and projection models
II. Radiometry and shading
   I. Calibrated/Uncalibrated photometric stereo
III. Early vision: Image filtering, edges, interest points, denoising and deblurring
IV. Epipolar geometry, stereo and structure from motion
V. Tracking, optical flow and registration
VI. Mid-level vision: Clustering and segmentation
VII. Recognition
   I. Bag of Features and Support Vector Machines
VIII. Deformable parts models
IX. Special topics
Early vision
Cameras and sensors
Light and color
Linear filtering
[Figure: image * filter = filtered image]
Edge detection
Feature extraction: corner and blob detection
• Basic image formation and processing
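Linear filtering, listed among the early-vision topics above, replaces each pixel with a weighted sum of its neighbours. The following is a minimal pure-Python sketch of 2D convolution over the "valid" region only (no padding). The function name `convolve2d`, the test image and the derivative kernel are illustrative assumptions, not material from the course.

```python
def convolve2d(image, kernel):
    """2D convolution of `image` with `kernel`, computed only where the
    kernel fits entirely inside the image ("valid" output)."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Convolution flips the kernel in both directions before sliding it.
    flipped = [row[::-1] for row in kernel[::-1]]
    output = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * flipped[i][j]
            out_row.append(acc)
        output.append(out_row)
    return output


if __name__ == "__main__":
    # Hypothetical image with a vertical step edge between columns 2 and 3.
    image = [[0, 0, 0, 10, 10] for _ in range(3)]
    # Horizontal derivative kernel: responds to left-to-right intensity changes.
    kernel = [[1, 0, -1]]
    for row in convolve2d(image, kernel):
        print(row)
```

The output is zero where the image is flat and non-zero around the step, which is the basic idea behind filter-based edge detection.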
Mid-level vision
Fitting: least squares, RANSAC
Alignment
• Fitting and grouping
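As a small illustration of the fitting topic, the sketch below fits a line y = a·x + b to 2D points by ordinary least squares, using the closed-form solution. It covers only the least-squares half of the slide; RANSAC, which adds random sampling to cope with outliers, is not shown. The function name `fit_line` and the sample points are hypothetical.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = a*x + b to a list of (x, y) points.
    Returns (a, b). Requires at least two distinct x values."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of squared deviations and cross deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical points lying exactly on y = 2x + 1.
    points = [(0, 1), (1, 3), (2, 5), (3, 7)]
    print(fit_line(points))
```

Least squares minimizes the sum of squared vertical residuals, so a single gross outlier can pull the line far from the rest of the data; that sensitivity is the usual motivation for robust alternatives such as RANSAC.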
Multi-view geometry
Structure from motion
Stereo
Epipolar geometry
3D Photography
Recognition
Instance recognition, large-scale alignment
Image classification
Sliding window detection
Part-based models