CSci 4968 and 6270 Computational Vision Lecture 15 ...stewart/cv_2009/lec15-overview.pdf · CSci...
Transcript of CSci 4968 and 6270 Computational Vision Lecture 15 ...stewart/cv_2009/lec15-overview.pdf · CSci...
CSci 4968 and 6270Computational Vision
Lecture 15Overview of Remainder of the Semester
Charles Stewart
Department of Computer ScienceRensselaer Polytechnic Institute
2009/10/22
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 1 / 36
Topics to be Covered in the Rest of the Semester
A brief overview to help you think about project topicsStructure-from-motion and photo tourismStereoMotion and trackingSegmentationActive contoursMarkov random fields, global optimization and graph cutsFace recognition and detectionInstance recognitionCategory recognitionHigh-dynamic range imagingImage matting and compositingTexture analysis, synthesis, inpainting
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 2 / 36
What to Think About
As we discuss the topics in this overview, think about... Applications: fun, interesting, useful things you would like to dowith images and video... How any one of the problems we discuss and associatedtechniques may be mapped to your application idea(s)... What problems and challenges might you be able toinvestigate?
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 3 / 36
Structure From Motion
Problem overviewGiven either:
I An image sequence taken from a moving cameraI A set of images of the same scene taken from different cameras
and viewpointsEstimate:
I Determine which images show overlapping viewsI Estimate camera positions and internal parametersI Estimate 3d locations of a set of keypoints
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 4 / 36
Structure From Motion
TechniquesKeypoint matching: image-to-image or one image against manyFundamental matrix estimation for each pair of matching imagesCamera estimation based on assumptions about intrinsicparametersMulti-image “bundle adjustment” to place points and cameras inthe world.
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 5 / 36
Structure From Motion
Applications
Photo tourismMatch-move sceneinsertion
grail.cs.washington.edu/rome/rome
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 6 / 36
Two-Camera Stereo Correspondence
ProblemGiven: images, I1 and I2, taken (more or less) simultaneously fromtwo different positions with (usually) calibrated camerasProblem: for each pixel in I1 find the corresponding pixel in I2, andfor each pixel in I2 find the corresponding pixel in I1.
I This is known as “disparity computation”
http://vision.middlebury.edu/stereo/data/scenes2003/
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 7 / 36
Two-Camera Stereo Correspondence
TechniquePixel similarity measuresSurface smoothness, with allowances for discontinuitiesGraph-based global optimization algorithms
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 8 / 36
Two-Camera Stereo Correspondence
ApplicationsView synthesis and image-based rendering3d reconstructionRobot navigation
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 9 / 36
Motion
ProblemGiven: two or more images of a scene, where both the cameraand objects in the the scene may be movingCompute the apparent motion of the image:
I Pixel-level offset vectors, also known as optical flow.I Parametric model, similar to image registrationI Layered motion: segment image into layers of different motion,
each having their own model
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 10 / 36
Motion
Figure 8.1 of the Szeliski text.
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 11 / 36
Motion
TechniquesImage-to-image correlationSpatial and temporal image derivatives to generate imageconstraintsParametric model fitting or motion-field smoothness measures;followed by optimizationIterate:
I Assign pixels to layersI Estimate motion within each layer
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 12 / 36
Motion — Applications
View stabilizationRemoval of motion blurSuper-resolutionSeparation of transparent motionForeground-background segmentation
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 13 / 36
Tracking
Problem
Given a videosequenceIdentify and continuallylocate moving objectsin the scene
I Could includeestimating theparameters of adynamic model ofthe scene
http://www.porikli.com/objecttracking_files/image007.jpg
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 14 / 36
Tracking
TechniquesIdentify and follow “interesting” features; cluster those withcoherent motionsExtract and follow contoursCorrelationMatch distributionsKalman filtering
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 15 / 36
Tracking
ApplicationsSecurityHuman-computer interactionTraffice safetyActivity recognitionMultiframe segmentation
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 16 / 36
SegmentationSeparate an image into foreground region (region of interest) anda background region.Wide variety of techniques proposed:
I Classical methods: thresholding, merging and spliting, watershedI Contour drivenI Statistical methods and mean-shiftI Graph-based algorithms
Many of these can be applied as part of interactive tools /user-assist tools.
Szeliski, Fig. 5-15, from Felzenszwalb and Huttenlocher 2004
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 17 / 36
Segmentation
“Classical” methodsThresholding: single and multiple thresholdRegional split and mergeWatershedApplications of these are mostly in medical imaging
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 18 / 36
Contour Driven: Active Contours
Compute the location of a contour C(t) in an imageConstraints are combined using what can be thought of as a “soft”form of Lagrange multipliers
I Prefer smoother curves — low curvature preferredI Prefer to stay in high gradient regions
Requires initializationLevel-set methods to avoid topological issuesApplications in medical imaging and in interactive segmentationmethods
http://controls.ae.gatech.edu/avcs/kickoff/
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 19 / 36
Segmentation —Statistical methods and mean-shiftForm measurement vector on each pixel in a high dimensionalspace.
I Location, color, texture, etc.
Think of the distrbution of these pointsEach pixel attempts to seek the mode of its distributionSegment based on who belongs to which modeOften used as a pre-processing step to “oversegment” image andcreate “super pixels”.
Szeliski, Figure 5.20, from Comaniciu and Meer 2002
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 20 / 36
Segmentation — Graph-Based MethodsImage forms weighted graphPixels connected to neighbors, with weights based on consistencybetween them — color, texture, etc.Each pixel also connected to a special “source” (foreground) anda “sink” (background) node in the graphCompute the minimal “cut” in the graph that separates the sourceand the sink.Pixels connected to the source node are foreground and pixelsconnected to the sink are background.
Szeliski, Fig. 5-26. From Rother, et al. 2004
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 21 / 36
Recognition
Types of recognition problemsFace recognition: given an image (or subimage) that is known toshow a face, whose face is it?Face detection: find all of the faces in the imageInstance recognition: given a set of particular objects, which (ifany) of them appear in the image?
I Examples range from recognizing music CD covers to particularlocations on a street
Category recognition: find all of the cars, chairs, horses, cows,bicycles etc. in a scene.Context recognition: divide up an image into the streets, cars,buildings, grass, sky, etc.
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 22 / 36
Algorithmic Tools for Recognition
Face recognitionWrite image as a vector and normalizeGather many different vectors for different facesApply principal component analysis (PCA) to the vectorsMatch by projecting new images onto the principal components
Szeliski, Fig. 14.3, from Moghaddam and Pentland 1997
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 23 / 36
Algorithmic Tools for Recognition
Face detection
Learned appearancemodels for parts offaces
I PCAI Trained classifiers
using large numbersof features
I Neural networks
Spatial relationshipsamong parts
Szeliski, Fig. 14.15, from Rowley et al. 1998
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 24 / 36
Algorithmic Tools for RecognitionInstance recognition — among many different objects
Keypoint detection and descriptionHierarchical (tree) clustering of descriptors — following from“vocabulary trees” — with leaf nodes have the (potential)instancesMatch an image by matching its keypoints against the treeTest those “instances” that have a lot of keypoint matches basedon, for example, fundamental matrix estimation
Szeliski, Fig. 14.30, from Philbin et al. 2007
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 25 / 36
Algorithmic Tools for Recognition
Category recognition — a much harder problemBag of words: match enough SIFT (or more recent) keypoints anddescriptors and you have recognized the object, independent ofthe relative positions of the keypointsParts models: tree-structured, graphical representation of therelationship between body parts and how they move.Segmentation and labeling: integrate the labeling of objects andthe segmentation of the object from the background throughcombinations of classifiers and graphical models.
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 26 / 36
Category Recognition — Examples of Challenges
Szeliski, Fig. 14.34, from Csurka et al. 2007
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 27 / 36
Algorithmic Tools for Recognition
Context recognitionDivide up an image into the streets, cars, buildings, grass, sky, etc.Simultaneous recognition and segmentationConditional random fields — a generalization of Markov randomfieldsProbability that an image region is a car depends on probabilitythat nearby region is a street.
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 28 / 36
Context Recognition — Some Successes
Szeliski, Fig. 14.43, from Shotton et al. 2009
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 29 / 36
Computational Photography
Ch. 10 of Szeliski, but we will only cover partPhotometric calibrationHigh dynamic range imagingSuper-resolution and blur removalImage matting and compositingTexture analysis and synthesis
We will only cover the latter during the last part of the semester, butthe other topics may be of interest for projects.
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 30 / 36
Computational Photography
Photometric calibrationResponse curve: radiance to intensity; color balancing.Vignetting correctionSpatial blur estimation
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 31 / 36
Computational Photography
High dynamic range imagingCan’t represent a wide enough set of radiance values in[0, ..., 255].Take images under different exposuresInvert response curve to get radianceRemap (“tone mapping”) to range 0..255
Szeliski, Fig. 10.12
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 32 / 36
Computational Photography
Super-resolution and blur removalDeconvolution for single imagesMultiple images:
I Highly-accurate, sub-pixel alignmentI Interpolate pixel values
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 33 / 36
Computational Photography
Image matting and compositingSimple view: segment region from one image, paste it on anotherBlue screen techniques have been used for years.Multiview and motion-based techniques are of continuing interest“Mixed pixels”, combining foreground and background, are ofongoing difficulty for both segmentation and compositing.
I Effectively computing an alpha channel
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 34 / 36
Natural Image Matting Example
Szeliski, Fig. 10.39, from Chuang et al. 2001
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 35 / 36
Computational Photography
Texture analysis and synthesisGiven a “texture”, generate new textures that look similarMatching of frequency characteristics at multiple scalesHole-filling and inpaintingNon-photo-realistic rendering.
Szeliski, Fig. 10.52
Charles Stewart (Rensselaer) Lec 15 — Remainder 2009/10/22 36 / 36