Scene Completion Using Millions of Photographs
James Hays, Alexei A. Efros
Carnegie Mellon University
ACM SIGGRAPH 2007
Outline
Introduction
Overview
Semantic Scene Matching
Local Context Matching
Results and Comparison
Conclusion
Introduction
Image completion (inpainting, hole-filling): filling in or replacing an image region with new image data such that the modification cannot be detected
Introduction
The data could have been there
The data should have been there
Introduction
Existing methods operate by extending adjacent textures and contours into the unknown region, filling it with content from the known parts of the input image
Introduction
The assumption is that all the image data needed to fill the unknown region is located somewhere else in the same image
This assumption is flawed
Overview
We perform image completion by leveraging a massive database of images
Two compelling reasons:
Some regions are impossible to fill plausibly using only image data from the source image
Reusing content from within the image would often leave obvious duplications
Overview
There are several challenges with drawing content from other images: computational cost, semantic validity, and seamless compositing
Overview
To alleviate the computational and semantic challenges:
Find images depicting semantically similar scenes
Use only the best matching scenes to find patches which match the content surrounding the missing region
To combine image regions seamlessly:
Graph cut segmentation
Poisson blending
Semantic Scene Matching
Our image database:
Downloaded images from thirty Flickr.com groups
Downloaded images based on keyword searches
Discarded duplicate images and images that are too small
Distributed among a cluster of 15 machines
Acquired about 2.3 million unique images
Semantic Scene Matching
Look for scenes which are most likely to be semantically equivalent to the image requiring completion: the GIST descriptor
Augment the scene descriptor with color information of the query image, down-sampled to the spatial resolution of the gist
Semantic Scene Matching
Given an input image to be hole-filled, we first compute its gist descriptor with the missing regions excluded
We calculate the SSD between the gist of the query image and every gist in the database
The color difference is computed in L*a*b* color space
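The scene-matching distance described above can be sketched as follows. This is an illustrative implementation, not the authors' code: the function name, the mask convention, and the `color_weight` parameter are assumptions; the slides specify only that the gist SSD excludes the missing region and that a down-sampled L*a*b* color SSD is added.

```python
import numpy as np

def scene_distance(query_gist, query_color, db_gist, db_color,
                   gist_mask, color_weight=1.0):
    """Scene distance sketch: SSD over the gist descriptor plus an SSD
    over a tiny L*a*b* color layout image.

    query_gist : (G,) gist descriptor of the incomplete image
    gist_mask  : (G,) 1 where the descriptor is valid; entries covering
                 the missing region are excluded from the comparison
    query_color, db_color : images down-sampled to the gist's spatial
                 resolution, in L*a*b* space
    color_weight : relative weight of the color term (an assumption;
                 the slides do not give the paper's weighting)
    """
    gist_ssd = np.sum(gist_mask * (query_gist - db_gist) ** 2)
    color_ssd = np.sum((query_color - db_color) ** 2)
    return gist_ssd + color_weight * color_ssd
```

In practice this distance would be evaluated against every gist in the database and the lowest-scoring scenes kept for the local matching stage.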
Local Context Matching
Having constrained our search to semantically similar scenes, we can use template matching to align them more precisely
Local Context Matching
Pixel-wise alignment score
We define the local context to be all pixels within an 80-pixel radius of the hole's boundary
This context is compared against the 200 best matching scenes using SSD error in L*a*b* color space
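A minimal sketch of building the local context and scoring one candidate placement, assuming a boolean hole mask and already-aligned images in L*a*b* space (the function names and the use of a Euclidean distance transform are illustrative choices, not taken from the paper):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def local_context_mask(hole_mask, radius=80):
    """All known pixels within `radius` of the hole's boundary.
    hole_mask: boolean array, True inside the missing region."""
    # Distance from each known pixel to the nearest hole pixel.
    dist = distance_transform_edt(~hole_mask)
    return (~hole_mask) & (dist <= radius)

def context_ssd(img_lab, match_lab, context):
    """SSD error over the local context, in L*a*b* color space."""
    diff = img_lab[context] - match_lab[context]
    return float(np.sum(diff ** 2))
```

Each of the 200 best matching scenes would be slid over the context and the placement with the lowest `context_ssd` kept.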
Local Context Matching
Texture similarity score
Measures coarse compatibility of the proposed fill-in region with the source image within the local context
Computed as a 5x5 median filter of image gradient magnitude at each pixel
The descriptors of the two images are compared via SSD
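The texture descriptor above can be sketched directly; the Sobel operator for gradient magnitude is an assumption (the slides only say "image gradient magnitude"), and the helper names are illustrative:

```python
import numpy as np
from scipy.ndimage import median_filter, sobel

def texture_descriptor(gray):
    """Per-pixel texture descriptor: a 5x5 median filter of the
    image gradient magnitude."""
    gx = sobel(gray, axis=1)
    gy = sobel(gray, axis=0)
    mag = np.hypot(gx, gy)
    return median_filter(mag, size=5)

def texture_distance(a, b, context):
    """SSD between the two images' texture descriptors, restricted
    to the local context mask."""
    da = texture_descriptor(a)[context]
    db = texture_descriptor(b)[context]
    return float(np.sum((da - db) ** 2))
```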
Local Context Matching
Composite each matching scene into the incomplete image at its best placement using a form of graph cut seam finding and standard Poisson blending
Local Context Matching
Past image completion algorithms: the remaining valid pixels in an image cannot be changed
Our completion algorithm: allows removing valid pixels from the query image, but discourages cutting too many pixels
Local Context Matching
Past seam-finding: minimize the intensity difference between the two images along the seam, which causes the seam to pass through many high-frequency edges
Our seam-finding: minimize the gradient of the image difference along the seam
Local Context Matching
We find the seam by minimizing the following cost function:
C(L) = Σ_p C_d(p, L(p)) + Σ_{p,q} C_i(p, q, L(p), L(q))
C_d(p, L(p)): unary cost of assigning pixel p the label L(p)
L(p): "patch" or "exist"
Local Context Matching
C_d(p, exist) is a very large number for pixels in the missing regions of the existing image
C_d(p, patch) is a very large number for pixels not covered by the scene match
For all other pixels, C_d(p, patch) = k × (the pixel's distance from the hole), with k = 0.02
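The unary term can be sketched per pixel as below. The slide gives the three cases and k = 0.02; treating C_d(p, exist) as zero outside the hole, and the cubing-free linear distance penalty, are assumptions filled in for illustration:

```python
BIG = 1e9  # effectively infinite cost, forbidding that label
K = 0.02   # constant from the slide's distance penalty

def unary_cost(label, in_hole, covered, dist_to_hole):
    """C_d(p, L(p)) for one pixel, following the slide's cases.

    label        : 'exist' (keep the query image's pixel) or
                   'patch' (take the scene match's pixel)
    in_hole      : True if p lies in the missing region
    covered      : True if the scene match covers p
    dist_to_hole : distance from p to the hole
    """
    if label == 'exist':
        # A missing pixel has no existing value to keep.
        # Zero cost elsewhere is an assumption; the slide is silent.
        return BIG if in_hole else 0.0
    # label == 'patch'
    if not covered:
        return BIG  # the scene match cannot supply this pixel
    # Discourage cutting away valid pixels far from the hole.
    return K * dist_to_hole
```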
Local Context Matching
C_i(p, q, L(p), L(q)) is non-zero only for immediately adjacent, 4-way connected pixels
If L(p) = L(q), the cost is zero
If L(p) ≠ L(q), then C_i(p, q, L(p), L(q)) = diff(p, q), the magnitude of the gradient of the SSD between the existing image and the scene match at pixels p and q
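The pairwise term can be sketched as follows; a real implementation would minimize the total cost with a graph cut (e.g. max-flow), whereas this sketch only evaluates the energy of a given labeling. Using a Sobel gradient of the squared difference, and summing the diff map at both endpoints of a seam edge, are assumptions about how diff(p, q) combines the two pixels:

```python
import numpy as np
from scipy.ndimage import sobel

def pairwise_cost_map(existing, match):
    """Per-pixel diff map: gradient magnitude of the squared
    difference between the two aligned images."""
    ssd = (existing - match) ** 2
    if ssd.ndim == 3:          # sum squared differences over channels
        ssd = ssd.sum(axis=2)
    gx = sobel(ssd, axis=1)
    gy = sobel(ssd, axis=0)
    return np.hypot(gx, gy)

def pairwise_cost(labels, diff):
    """Sum C_i over 4-connected neighbor pairs: zero where the labels
    agree, diff-based where the seam crosses between labels."""
    cost = 0.0
    h, w = labels.shape
    for p in range(h):
        for q in range(w):
            if q + 1 < w and labels[p, q] != labels[p, q + 1]:
                cost += diff[p, q] + diff[p, q + 1]
            if p + 1 < h and labels[p, q] != labels[p + 1, q]:
                cost += diff[p, q] + diff[p + 1, q]
    return cost
```

Because the diff map is the gradient of the image difference, seams are steered away from high-frequency disagreements rather than merely through low-intensity differences.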
Local Context Matching
Finally, we assign each composite a score combining:
The scene matching distance
The local context matching distance
The local texture similarity distance
The cost of the graph cut
We present the user with the 20 composites with the lowest scores
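Combining the four distances and selecting the best composites can be sketched as below; the equal default weights are an assumption (the slides do not give the paper's weighting), and the function names are illustrative:

```python
def composite_score(scene_dist, context_dist, texture_dist,
                    graphcut_cost, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four distances; lower is better.
    The equal weights are an assumption, not the paper's values."""
    terms = (scene_dist, context_dist, texture_dist, graphcut_cost)
    return sum(w * t for w, t in zip(weights, terms))

def top_composites(candidates, n=20):
    """Return the n candidates with the lowest score.
    Each candidate is a (score, composite) pair."""
    return sorted(candidates, key=lambda c: c[0])[:n]
```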
Results and Comparison
Lucky: found another image from the same physical location
It is not our goal, however, to complete scenes and objects with their true selves from the database
Results and Comparison
Failure cases: artifacts
Results and Comparison
Failure cases: semantic violations
Results and Comparison
Failure cases: no object recognition
Results and Comparison
Failure cases: where past methods perform well
For uniformly textured backgrounds, our method is unlikely to find the exact same texture in another photograph
Conclusion
This paper presents a new image completion algorithm powered by a huge database, unlike past methods that reuse visual data within the source image
Further work:
Two million images are still a tiny fraction of the high-quality photographs available
Our approach would make an attractive web-based application
Thank you!!!