Scene Completion Using Millions of Photographs
James Hays, Alexei A. Efros
Carnegie Mellon University
ACM SIGGRAPH 2007
Outline
Introduction
Overview
Semantic Scene Matching
Local Context Matching
Results and Comparison
Conclusion
Introduction
Image completion (inpainting, hole-filling): filling in or replacing an image region with new image data such that the modification cannot be detected
Introduction
The data could have been there
The data should have been there
Introduction
Existing methods operate by extending adjacent textures and contours into the unknown region, filling it with content from the known parts of the input image
Introduction
The assumption is that all the image data needed to fill the unknown region is located somewhere else in the same image
This assumption is flawed
Overview
We perform image completion by leveraging a massive database of images
Two compelling reasons:
Some regions are impossible to fill plausibly using only image data from the source image
Reusing content from within the image would often leave obvious duplications
Overview
There are several challenges with drawing content from other images: computational cost, semantic validity, and seamless compositing
Overview
To alleviate the computational and semantic challenges:
Find images depicting semantically similar scenes
Use only the best matching scenes to find patches which match the content surrounding the missing region
To combine image regions seamlessly:
Graph cut segmentation
Poisson blending
Semantic Scene Matching
Our image database:
Downloaded images from thirty Flickr.com groups
Downloaded images based on keyword searches
Discarded duplicate images and images that are too small
Distributed among a cluster of 15 machines
Acquired about 2.3 million unique images
Semantic Scene Matching
Look for scenes which are most likely to be semantically equivalent to the image requiring completion: the GIST descriptor
Augment the scene descriptor with color information of the query image, down-sampled to the spatial resolution of the gist
Semantic Scene Matching
Given an input image to be hole-filled, we first compute its gist descriptor with the missing regions excluded
We calculate the SSD between the gist of the query image and every gist in the database
The color difference is computed in L*a*b* color space
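The scene-matching distance described above can be sketched as follows. This is an illustrative implementation, not the authors' code: the function name, the mask convention, and the `color_weight` parameter are assumptions; the slides specify only that the gist SSD excludes the missing region and that a down-sampled L*a*b* color SSD is added.

```python
import numpy as np

def scene_distance(query_gist, query_color, db_gist, db_color,
                   gist_mask, color_weight=1.0):
    """Scene distance sketch: SSD over the gist descriptor plus an SSD
    over a tiny L*a*b* color layout image.

    query_gist : (G,) gist descriptor of the incomplete image
    gist_mask  : (G,) 1 where the descriptor is valid; entries covering
                 the missing region are excluded from the comparison
    query_color, db_color : images down-sampled to the gist's spatial
                 resolution, in L*a*b* space
    color_weight : relative weight of the color term (an assumption;
                 the slides do not give the paper's weighting)
    """
    gist_ssd = np.sum(gist_mask * (query_gist - db_gist) ** 2)
    color_ssd = np.sum((query_color - db_color) ** 2)
    return gist_ssd + color_weight * color_ssd
```

In practice this distance would be evaluated against every gist in the database and the lowest-scoring scenes kept for the local matching stage.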
Local Context Matching
Having constrained our search to semantically similar scenes, we can use template matching to align them more precisely
Local Context Matching
Pixel-wise alignment score
We define the local context to be all pixels within an 80-pixel radius of the hole's boundary
This context is compared against the 200 best matching scenes using SSD error in L*a*b* color space
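A minimal sketch of building the local context and scoring one candidate placement, assuming a boolean hole mask and already-aligned images in L*a*b* space (the function names and the use of a Euclidean distance transform are illustrative choices, not taken from the paper):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def local_context_mask(hole_mask, radius=80):
    """All known pixels within `radius` of the hole's boundary.
    hole_mask: boolean array, True inside the missing region."""
    # Distance from each known pixel to the nearest hole pixel.
    dist = distance_transform_edt(~hole_mask)
    return (~hole_mask) & (dist <= radius)

def context_ssd(img_lab, match_lab, context):
    """SSD error over the local context, in L*a*b* color space."""
    diff = img_lab[context] - match_lab[context]
    return float(np.sum(diff ** 2))
```

Each of the 200 best matching scenes would be slid over the context and the placement with the lowest `context_ssd` kept.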
Local Context Matching
Texture similarity score
Measures coarse compatibility of the proposed fill-in region with the source image within the local context
Computed as a 5x5 median filter of image gradient magnitude at each pixel
The descriptors of the two images are compared via SSD
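The texture descriptor above can be sketched directly; the Sobel operator for gradient magnitude is an assumption (the slides only say "image gradient magnitude"), and the helper names are illustrative:

```python
import numpy as np
from scipy.ndimage import median_filter, sobel

def texture_descriptor(gray):
    """Per-pixel texture descriptor: a 5x5 median filter of the
    image gradient magnitude."""
    gx = sobel(gray, axis=1)
    gy = sobel(gray, axis=0)
    mag = np.hypot(gx, gy)
    return median_filter(mag, size=5)

def texture_distance(a, b, context):
    """SSD between the two images' texture descriptors, restricted
    to the local context mask."""
    da = texture_descriptor(a)[context]
    db = texture_descriptor(b)[context]
    return float(np.sum((da - db) ** 2))
```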
Local Context Matching
Composite each matching scene into the incomplete image at its best placement using a form of graph cut seam finding and standard Poisson blending
Local Context Matching
Past image completion algorithms: the remaining valid pixels in an image cannot be changed
Our completion algorithm: allows removing valid pixels from the query image, but discourages cutting too many pixels
Local Context Matching
Past seam-finding: minimize the intensity difference between the two images along the seam, which causes the seam to pass through many high-frequency edges
Our seam-finding: minimize the gradient of the image difference along the seam
Local Context Matching
We find the seam by minimizing the following cost function:
C(L) = Σ_p C_d(p, L(p)) + Σ_{p,q} C_i(p, q, L(p), L(q))
C_d(p, L(p)): unary cost of assigning pixel p the label L(p)
L(p): "patch" or "exist"
Local Context Matching
C_d(p, exist) is a very large number for pixels in the missing regions of the existing image
C_d(p, patch) is a very large number for pixels not covered by the scene match
For all other pixels, C_d(p, patch) = k × (the pixel's distance from the hole), with k = 0.02
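The unary term can be sketched per pixel as below. The slide gives the three cases and k = 0.02; treating C_d(p, exist) as zero outside the hole, and the cubing-free linear distance penalty, are assumptions filled in for illustration:

```python
BIG = 1e9  # effectively infinite cost, forbidding that label
K = 0.02   # constant from the slide's distance penalty

def unary_cost(label, in_hole, covered, dist_to_hole):
    """C_d(p, L(p)) for one pixel, following the slide's cases.

    label        : 'exist' (keep the query image's pixel) or
                   'patch' (take the scene match's pixel)
    in_hole      : True if p lies in the missing region
    covered      : True if the scene match covers p
    dist_to_hole : distance from p to the hole
    """
    if label == 'exist':
        # A missing pixel has no existing value to keep.
        # Zero cost elsewhere is an assumption; the slide is silent.
        return BIG if in_hole else 0.0
    # label == 'patch'
    if not covered:
        return BIG  # the scene match cannot supply this pixel
    # Discourage cutting away valid pixels far from the hole.
    return K * dist_to_hole
```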
Local Context Matching
C_i(p, q, L(p), L(q)) is non-zero only for immediately adjacent, 4-way connected pixels
If L(p) = L(q), the cost is zero
If L(p) ≠ L(q), then C_i(p, q, L(p), L(q)) = diff(p, q), the magnitude of the gradient of the SSD between the existing image and the scene match at pixels p and q
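The pairwise term can be sketched as follows; a real implementation would minimize the total cost with a graph cut (e.g. max-flow), whereas this sketch only evaluates the energy of a given labeling. Using a Sobel gradient of the squared difference, and summing the diff map at both endpoints of a seam edge, are assumptions about how diff(p, q) combines the two pixels:

```python
import numpy as np
from scipy.ndimage import sobel

def pairwise_cost_map(existing, match):
    """Per-pixel diff map: gradient magnitude of the squared
    difference between the two aligned images."""
    ssd = (existing - match) ** 2
    if ssd.ndim == 3:          # sum squared differences over channels
        ssd = ssd.sum(axis=2)
    gx = sobel(ssd, axis=1)
    gy = sobel(ssd, axis=0)
    return np.hypot(gx, gy)

def pairwise_cost(labels, diff):
    """Sum C_i over 4-connected neighbor pairs: zero where the labels
    agree, diff-based where the seam crosses between labels."""
    cost = 0.0
    h, w = labels.shape
    for p in range(h):
        for q in range(w):
            if q + 1 < w and labels[p, q] != labels[p, q + 1]:
                cost += diff[p, q] + diff[p, q + 1]
            if p + 1 < h and labels[p, q] != labels[p + 1, q]:
                cost += diff[p, q] + diff[p + 1, q]
    return cost
```

Because the diff map is the gradient of the image difference, seams are steered away from high-frequency disagreements rather than merely through low-intensity differences.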
Local Context Matching
Finally, we assign each composite a score combining:
The scene matching distance
The local context matching distance
The local texture similarity distance
The cost of the graph cut
We present the user with the 20 composites with the lowest scores
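Combining the four distances and selecting the best composites can be sketched as below; the equal default weights are an assumption (the slides do not give the paper's weighting), and the function names are illustrative:

```python
def composite_score(scene_dist, context_dist, texture_dist,
                    graphcut_cost, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four distances; lower is better.
    The equal weights are an assumption, not the paper's values."""
    terms = (scene_dist, context_dist, texture_dist, graphcut_cost)
    return sum(w * t for w, t in zip(weights, terms))

def top_composites(candidates, n=20):
    """Return the n candidates with the lowest score.
    Each candidate is a (score, composite) pair."""
    return sorted(candidates, key=lambda c: c[0])[:n]
```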
Results and Comparison
Lucky: found another image from the same physical location
It is not our goal, however, to complete scenes and objects with their true selves from the database
Results and Comparison
Failure cases: artifacts
Results and Comparison
Failure cases: semantic violations
Results and Comparison
Failure cases: no object recognition
Results and Comparison
Failure cases: where past methods perform well
For uniformly textured backgrounds, our method is unlikely to find the exact same texture in another photograph
Conclusion
This paper presents a new image completion algorithm powered by a huge database, unlike past methods that reuse visual data within the source image
Further work:
Two million images are still a tiny fraction of the high-quality photographs available
Our approach would make an attractive web-based application
Thank you!!!