COMP 776: Computer Vision (lazebnik/spring10/lec01_intro.pdf)


Page 1:

COMP 776: Computer Vision

Page 2:

Today

• Introduction to computer vision
• Course overview
• Course requirements

Page 3:

The goal of computer vision

• To bridge the gap between pixels and “meaning”

What we see / What a computer sees
Source: S. Narasimhan

Page 4:

What kind of information can we extract from an image?

• Metric 3D information
• Semantic information

Page 5:

Vision as measurement device

• Real-time stereo: NASA Mars Rover
• Structure from motion: Pollefeys et al.
• Reconstruction from Internet photo collections: Goesele et al.

Page 6:

Vision as a source of semantic information

slide credit: Fei-Fei, Fergus & Torralba

Page 7:

Object categorization

Image labels: sky, building, flag, wall, banner, face, bus, bus, street lamp, cars

slide credit: Fei-Fei, Fergus & Torralba

Page 8:

Scene and context categorization

• outdoor
• city
• traffic
• …

slide credit: Fei-Fei, Fergus & Torralba

Page 9:

Qualitative spatial information

Image labels: slanted, vertical, horizontal, non-rigid moving object, rigid moving object

slide credit: Fei-Fei, Fergus & Torralba

Page 10:

Why study computer vision?

• Vision is useful: Images and video are everywhere!

Personal photo albums
Movies, news, sports
Surveillance and security
Medical and scientific images

Page 11:

Why study computer vision?

• Vision is useful
• Vision is interesting
• Vision is difficult
– Half of primate cerebral cortex is devoted to visual processing
– Achieving human-level visual perception is probably “AI-complete”

Page 12:

Why is computer vision difficult?

Page 13:

Challenges: viewpoint variation

Michelangelo 1475-1564
slide credit: Fei-Fei, Fergus & Torralba

Page 14:

Challenges: illumination

image credit: J. Koenderink

Page 15:

Challenges: scale

slide credit: Fei-Fei, Fergus & Torralba

Page 16:

Challenges: deformation

Xu, Beihong 1943

slide credit: Fei-Fei, Fergus & Torralba

Page 17:

Challenges: occlusion

Magritte, 1957
slide credit: Fei-Fei, Fergus & Torralba

Page 18:

Challenges: background clutter

Page 19:

Challenges: motion

Page 20:

Challenges: object intra-class variation

slide credit: Fei-Fei, Fergus & Torralba

Page 21:

Challenges: local ambiguity

slide credit: Fei-Fei, Fergus & Torralba

Page 22:

Challenges or opportunities?

• Images are confusing, but they also reveal the structure of the world through numerous cues

• Our job is to interpret the cues!

Image source: J. Koenderink

Page 23:

Depth cues: Linear perspective

Page 24:

Depth cues: Aerial perspective

Page 25:

Depth ordering cues: Occlusion

Source: J. Koenderink

Page 26:

Shape cues: Texture gradient

Page 27:

Shape and lighting cues: Shading

Source: J. Koenderink

Page 28:

Position and lighting cues: Cast shadows

Source: J. Koenderink

Page 29:

Grouping cues: Similarity (color, texture, proximity)

Page 30:

Grouping cues: “Common fate”

Image credit: Arthus-Bertrand (via F. Durand)

Page 31:

Bottom line

• Perception is an inherently ambiguous problem
– Many different 3D scenes could have given rise to a particular 2D picture

Page 32:

Bottom line

• Perception is an inherently ambiguous problem
– Many different 3D scenes could have given rise to a particular 2D picture

• Possible solutions
– Bring in more constraints (more images)
– Use prior knowledge about the structure of the world

• Need a combination of different methods
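
The ambiguity described on this slide can be made concrete with a small sketch. It assumes an idealized pinhole camera at the origin looking down the Z axis; the function name `project`, the focal length `f`, and the sample points are illustrative choices, not part of the lecture. Two 3D points that lie on the same viewing ray land on the same image point, so a single picture cannot tell them apart.

```python
def project(point, f=1.0):
    """Project a 3D point (X, Y, Z), Z > 0, through an ideal pinhole camera.

    Assumption: camera center at the origin, image plane at distance f.
    """
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)


# A point and a second point three times farther along the same viewing ray.
near = (1.0, 2.0, 4.0)
far = tuple(3.0 * c for c in near)  # (3.0, 6.0, 12.0)

print(project(near))  # (0.25, 0.5)
print(project(far))   # (0.25, 0.5)
```

A second image taken from a different position would separate the two points, which is the "more constraints (more images)" option listed above.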

Page 33:

Connections to other disciplines

Computer Vision is linked to: Artificial Intelligence, Machine Learning, Robotics, Cognitive science, Neuroscience, Computer Graphics, Image Processing

Page 34:

Origins of computer vision

L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

Page 35:

Computer Vision in the Real World

Page 36:

Special effects: shape and motion capture

Source: S. Seitz

Page 37:

3D urban modeling

Bing maps, Google Streetview

Source: S. Seitz

Page 38:

3D urban modeling: Microsoft Photosynth

http://labs.live.com/photosynth/
Source: S. Seitz

Page 39:

Face detection

Many new digital cameras now detect faces
• Canon, Sony, Fuji, …

Source: S. Seitz

Page 40:

Smile detection

Sony Cyber-shot® T70 Digital Still Camera
Source: S. Seitz

Page 41:

Face recognition: Apple iPhoto software

http://www.apple.com/ilife/iphoto/

Page 42:

Biometrics

How the Afghan Girl was Identified by Her Iris Patterns

Source: S. Seitz

Page 43:

Biometrics

Fingerprint scanners on many new laptops, other devices

Face recognition systems now beginning to appear more widely
http://www.sensiblevision.com/

Source: S. Seitz

Page 44:

Optical character recognition (OCR)

Technology to convert scanned docs to text
• If you have a scanner, it probably came with OCR software

Digit recognition, AT&T labs
License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition

Source: S. Seitz

Page 45:

Mobile visual search: Google Goggles

Page 46:

Mobile visual search: iPhone Apps

Page 47:

Automotive safety

Mobileye: Vision systems in high-end BMW, GM, Volvo models
• “In mid 2010 Mobileye will launch a world's first application of full emergency braking for collision mitigation for pedestrians where vision is the key technology for detecting pedestrians.”

Source: A. Shashua, S. Seitz

Page 48:

Vision in supermarkets

LaneHawk by EvolutionRobotics
“A smart camera is flush-mounted in the checkout lane, continuously watching for items. When an item is detected and recognized, the cashier verifies the quantity of items that were found under the basket, and continues to close the transaction. The item can remain under the basket, and with LaneHawk, you are assured to get paid for it…”

Source: S. Seitz

Page 49:

Vision-based interaction (and games)

Sony EyeToy

Nintendo Wii has camera-based IR tracking built in. See Lee’s work at CMU on clever tricks on using it to create a multi-touch display!

Assistive technologies

Source: S. Seitz

Page 50:

Vision for robotics, space exploration

NASA's Mars Exploration Rover Spirit captured this westward view from atop a low plateau where Spirit spent the closing months of 2007.

Vision systems (JPL) used for several tasks
• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al.

Source: S. Seitz

Page 51:

The computer vision industry

• A list of companies here: http://www.cs.ubc.ca/spider/lowe/vision.html

Page 52:

Course overview

I. Early vision: Image formation and processing
II. Mid-level vision: Grouping and fitting
III. Multi-view geometry
IV. Recognition
V. Advanced topics

Page 53:

I. Early vision

• Basic image formation and processing

Cameras and sensors
Light and color
Linear filtering
Edge detection
Feature extraction: corner and blob detection
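
To give a flavor of the linear filtering and edge detection topics listed above, here is a minimal sketch in plain Python. The helper name `filter2d`, the toy image, and the derivative kernel are assumptions made for illustration. The course itself uses MATLAB, and a real implementation would rely on an optimized library routine rather than nested loops.

```python
def filter2d(image, kernel):
    """Cross-correlate a 2D list-of-lists image with a kernel.

    Only the "valid" region is returned (no padding), and the kernel is
    not flipped, so this is correlation rather than true convolution.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy image: dark left half (0), bright right half (10).
image = [[0, 0, 0, 10, 10, 10] for _ in range(4)]

# Horizontal derivative filter: responds where intensity changes left to right.
dx = [[-1, 0, 1]]

edges = filter2d(image, dx)
print(edges[0])  # [0, 10, 10, 0]
```

The nonzero responses sit exactly at the dark-to-bright boundary, which is the basic idea behind gradient-based edge detection.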

Page 54:

II. “Mid-level vision”

• Fitting and grouping

Alignment
Fitting: Least squares
Hough transform
RANSAC
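
Two of the fitting methods named on this slide can be combined in a short sketch: RANSAC to find a set of points consistent with one line, followed by a least-squares refit on those inliers. This is a simplified illustration under stated assumptions, not the version taught in the course: the function names, the tolerance, the iteration count, the fixed random seed, and the toy data are all made up here, and residuals are measured vertically rather than perpendicular to the line.

```python
import random


def fit_line_least_squares(points):
    """Closed-form least-squares fit of y = a*x + b (vertical residuals)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def ransac_line(points, iterations=100, tol=0.5, seed=0):
    """Fit a line robustly: sample point pairs, keep the largest inlier set."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best_inliers = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical line cannot be written as y = a*x + b
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on all inliers of the best hypothesis.
    return fit_line_least_squares(best_inliers)


# Ten points on y = 2x + 1 plus three gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points += [(2.0, 15.0), (5.0, -4.0), (8.0, 30.0)]

slope, intercept = ransac_line(points)
print(slope, intercept)
```

A plain least-squares fit on all thirteen points would be pulled toward the outliers; the sampling step is what lets the refit ignore them.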

Page 55:

III. Multi-view geometry

Stereo
Epipolar geometry
Tomasi & Kanade (1993)
Projective structure from motion
Affine structure from motion

Page 56:

IV. Recognition

Patch description and matching
Clustering and visual vocabularies
Bag-of-features models
Classification

Sources: D. Lowe, L. Fei-Fei
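
The bag-of-features idea listed on this slide can be sketched as follows: each local descriptor is assigned to its nearest "visual word", and the image is summarized by a normalized histogram of word counts. The vocabulary and descriptors below are toy 2D values invented for illustration; in practice the vocabulary would come from clustering (for example k-means) over many real patch descriptors, and the function names are hypothetical.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the visual word (cluster center) closest to the descriptor."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(range(len(vocabulary)),
               key=lambda k: sq_dist(descriptor, vocabulary[k]))


def bag_of_features(descriptors, vocabulary):
    """Normalized histogram of visual-word occurrences for one image."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy vocabulary of three visual words and four descriptors from one image.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.2), (0.9, 1.2), (1.1, 0.8), (4.7, 5.3)]

print(bag_of_features(descriptors, vocabulary))  # [0.25, 0.5, 0.25]
```

The resulting histogram is a fixed-length vector regardless of how many patches the image has, which is what makes it convenient as input to a classifier.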

Page 57:

V. Advanced Topics

• Time permitting…

Segmentation
Face detection
Articulated models
Motion and tracking

Page 58:

Basic Info

• Instructor: Svetlana Lazebnik ([email protected])

• Office hours: By appointment, FB 244

• Textbooks (suggested):
– Forsyth & Ponce, Computer Vision: A Modern Approach
– Richard Szeliski, Computer Vision: Algorithms and Applications (draft available online)

• Class webpage: http://www.cs.unc.edu/~lazebnik/spring10

Page 59:

Course requirements

• Philosophy: computer vision is best experienced hands-on

• Programming assignments: 50%
– Four assignments
– Expect the first one in the next couple of classes
– Brush up on your MATLAB skills (see web page for tutorial)

• Final assignment: 30%
– Recognition competition
– Winner gets a prize!

• Participation: 20%
– Come to class regularly
– Ask questions
– Answer questions

Page 60:

Collaboration policy

• Feel free to discuss assignments with each other, but coding must be done individually

• Feel free to incorporate code or tips you find on the Web, provided this doesn’t make the assignment trivial and you explicitly acknowledge your sources

• Remember: I can Google too (and I have the copies of everybody’s assignments from the last two years this class was offered)

Page 61:

For next time

• Self-study: MATLAB tutorial
• Reading: cameras and image formation (F&P chapter 1)