TIM @ SBU 2008
Outline
AutoCollage Framework
Implementation Steps
Results and User Study
About AutoCollage
An automatic procedure for constructing a visually appealing collage from a collection of input images.
AutoCollage : A Labelling System
Collage: Ic
Input images: I = {I1, . . . , IN}
Domain: P
Each pixel location: p ∈ P
Label: L(p) = (n, s), in which In ∈ I and s ∈ S
Compact form: I(p) = S(p, L(p)) = In(p − s)
The labelling L = {L(p), p ∈ P} completely specifies the collage.
Goal: To find the labelling L which minimises the energy/cost E(L)
E(L) = Erep(L) + Wimp Eimp(L) + Wtrans Etrans(L) + Wobj Eobj(L)
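The compact form above can be illustrated with a minimal sketch. It assumes grey-level images stored as nested lists and a labelling stored as a dictionary. The function name `render_collage` and the `background` fallback for source pixels that fall outside an image are illustrative assumptions, not part of the slides.

```python
def render_collage(labelling, images, background=0):
    """Build the collage I(p) = In(p - s) from a labelling L(p) = (n, s).

    labelling: dict mapping a collage pixel p = (x, y) to a label (n, (sx, sy)).
    images: list of 2-D lists, images[n][y][x] is a grey level.
    background: hypothetical fallback when p - s lies outside image n.
    """
    collage = {}
    for (x, y), (n, (sx, sy)) in labelling.items():
        src_x, src_y = x - sx, y - sy  # p - s
        image = images[n]
        inside = 0 <= src_y < len(image) and 0 <= src_x < len(image[0])
        collage[(x, y)] = image[src_y][src_x] if inside else background
    return collage


images = [[[10, 11], [12, 13]], [[20, 21], [22, 23]]]
labelling = {
    (0, 0): (0, (0, 0)),  # collage pixel (0, 0) shows image 0, unshifted
    (1, 0): (1, (1, 0)),  # collage pixel (1, 0) shows image 1 shifted right by 1
}
collage = render_collage(labelling, images)
```

Here the labelling alone determines every collage pixel, which is why the search can be posed entirely over L.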
Representative Images
The cost associated with the set Is of chosen images is of the form Erep = ∑n Erep(n) where
Erep(n) = −an Dr(n) − min_{m : Im ∈ Is} an am Vr(n, m)
an = 1 if ∃ p ∈ P with L(p) = (n, s), and 0 otherwise
Dr(n) = Entropy(In) + Wface δ({Image n contains a face})
δ(π) = 1 if predicate π is true, and 0 otherwise
Wface weights the influence of an image containing a face
Vr(n, m) is the pairwise distance between images n and m
E(L) = Erep(L) + Wimp Eimp(L) + Wtrans Etrans(L) + Wobj Eobj(L)
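A minimal sketch of the representative-image cost follows. The slides do not say how Entropy(In) is computed, so the grey-level histogram entropy used here is an assumption, as are the function names and the choice to let m range over the other chosen images only.

```python
import math
from collections import Counter


def entropy(pixels):
    """Shannon entropy (bits) of the grey-level histogram of a flat pixel list."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def d_rep(pixels, has_face, w_face):
    """Dr(n) = Entropy(In) + Wface * delta(image n contains a face)."""
    return entropy(pixels) + w_face * (1.0 if has_face else 0.0)


def e_rep(n, chosen, d, v):
    """Erep(n) = -a_n Dr(n) - min over chosen m of a_n a_m Vr(n, m)."""
    if n not in chosen:  # a_n = 0, so both terms vanish
        return 0.0
    # Assumption: m ranges over the other chosen images, not n itself.
    others = [m for m in chosen if m != n]
    nearest = min(v[n][m] for m in others) if others else 0.0
    return -d[n] - nearest
```

With these definitions, an image lowers the energy more when it is textured, contains a face, and is far from every other chosen image, which matches the stated aim of picking representative images.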
Region of Interest
E(L) = Erep(L) + Wimp Eimp(L) + Wtrans Etrans(L) + Wobj Eobj(L)
The Eimp term ensures that a substantial and interesting region of interest (ROI) is selected from each image in Is.
Eimp(L) = − ∑p G(p, L(p)) T(p, L(p))
G(p, L(p)) is the Gaussian weighting function that favors the center of the input image from which p is drawn.
T(p, L(p)) measures the local entropy of a 32×32-pixel region around the pixel p, and is normalized so that local entropy sums to 1 over a given input image.
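The importance term can be sketched for a single input image as follows. This is a toy version under stated assumptions: the window is parameterised by a small `radius` instead of the 32×32 region on the slide, the entropy is a grey-level histogram entropy, and `sigma` is a hypothetical width for the Gaussian, none of which are specified in the slides.

```python
import math
from collections import Counter


def local_entropy_map(image, radius):
    """T: histogram entropy of the window around each pixel, normalised to sum to 1."""
    h, w = len(image), len(image[0])
    raw = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            counts = Counter(window)
            raw[y][x] = -sum(c / len(window) * math.log2(c / len(window))
                             for c in counts.values())
    total = sum(map(sum, raw)) or 1.0  # avoid dividing by zero on flat images
    return [[v / total for v in row] for row in raw]


def gaussian_weight(x, y, w, h, sigma):
    """G: Gaussian weight centred on the image centre (sigma is a hypothetical choice)."""
    cx, cy = (w - 1) / 2, (h - 1) / 2
    return math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))


def e_imp(image, roi_pixels, radius=1, sigma=2.0):
    """One image's contribution to Eimp: -sum of G * T over the pixels kept in the collage."""
    t = local_entropy_map(image, radius)
    h, w = len(image), len(image[0])
    return -sum(gaussian_weight(x, y, w, h, sigma) * t[y][x] for x, y in roi_pixels)


image = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
full_roi = [(x, y) for y in range(3) for x in range(3)]
centre_only = [(1, 1)]
```

Keeping more of a textured, central region makes the term more negative, so minimising it pushes towards larger and more interesting ROIs.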
Packing ROIs
Packing ROIs : Constraints
1. Faces should be regarded as a preferred material
2. Sky should be constrained to appear at the top
3. No two ROIs should intersect
4. Every pixel is covered by an image
Packing ROIs : The Energy
E(L) = Erep(L) + Wimp Eimp(L) + Wtrans Etrans(L) + Wobj Eobj(L)
Eobj incorporates information on object recognition, and favors placement of objects in reasonable configurations.
For faces,
Eobj = ∑p,q∈N f(p, q, L(p), L(q))
f(p, q, L(p), L(q)) = ∞ whenever L(p) ≠ L(q) and p, q are pixels from the same face in either of the images of L(p) or L(q), and 0 otherwise.
For sky, rather than defining an explicit energy, we simply label images containing sky and pass this information to the constraint satisfaction engine which attempts to position such images only at the top of the collage.
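The face term can be sketched as a hard constraint on neighbouring pixel pairs. The `face_id` lookup is a hypothetical helper standing in for the output of a face detector; the slides do not describe how face membership is stored.

```python
import math


def face_cost(p, q, label_p, label_q, face_id):
    """f(p, q, L(p), L(q)): infinite if a seam between p and q would cut through a face.

    face_id(label, pixel) is a hypothetical lookup returning the id of the face
    covering that collage pixel under that label, or None if there is none.
    """
    if label_p == label_q:
        return 0.0
    for label in (label_p, label_q):
        fp, fq = face_id(label, p), face_id(label, q)
        if fp is not None and fp == fq:
            return math.inf
    return 0.0


# Toy detector output: under label (0, (0, 0)), pixels (0, 0) and (1, 0) lie on face "A".
faces = {((0, (0, 0)), (0, 0)): "A", ((0, (0, 0)), (1, 0)): "A"}


def lookup(label, pixel):
    return faces.get((label, pixel))
```

Because the cost is infinite, any labelling that splits a face between two images can never be the minimiser.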
Transition Between Images
E(L) = Erep(L) + Wimp Eimp(L) + Wtrans Etrans(L) + Wobj Eobj(L)
Etrans is a pairwise term which penalises any transition between images that is not visually appealing.
Etrans = ∑p,q∈N VT(p, q, L(p), L(q))
N is the set of all pairs of neighboring (8-neighborhood) pixels.
VT(p, q, L(p), L(q)) = min( ||S(p, L(p)) − S(p, L(q))|| / (ε + ||S(p, L(p)) − S(q, L(p))||),
                            ||S(q, L(p)) − S(q, L(q))|| / (ε + ||S(p, L(q)) − S(q, L(q))||) )
Compact form: I(p) = S(p, L(p)) = In(p − s)
ε = 0.001 prevents underflow
||.|| denotes the Euclidean norm
E(L) = Erep(L) + Wimp Eimp(L) + Wtrans Etrans(L) + Wobj Eobj(L)
Etrans measures mismatch across the boundary between two input images.
1. VT(p, q, L(p), L(q)) = 0 unless L(p) ≠ L(q).
2. VT(p, q, L(p), L(q)) is small if there is a strong gradient in one of the input images, since the relevant denominator will then be large.
VT(p, q, L(p), L(q)) = min( ||S(p, L(p)) − S(p, L(q))|| / (ε + ||S(p, L(p)) − S(q, L(p))||),
                            ||S(q, L(p)) − S(q, L(q))|| / (ε + ||S(p, L(q)) − S(q, L(q))||) )
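The two properties of VT can be checked with a small sketch. It assumes the reading of the formula in which each mismatch is divided by ε plus the gradient of one input image between p and q. The `sample` callback, which plays the role of S(p, L(p)), and the toy colour values are illustrative assumptions.

```python
import math

EPSILON = 0.001  # value quoted on the slide


def norm(a, b):
    """Euclidean distance between two colour tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def transition_cost(p, q, label_p, label_q, sample):
    """VT(p, q, L(p), L(q)); sample(pixel, label) plays the role of S(p, L(p))."""
    if label_p == label_q:  # both numerators would be zero anyway
        return 0.0
    cost_p = norm(sample(p, label_p), sample(p, label_q)) / (
        EPSILON + norm(sample(p, label_p), sample(q, label_p)))
    cost_q = norm(sample(q, label_p), sample(q, label_q)) / (
        EPSILON + norm(sample(p, label_q), sample(q, label_q)))
    return min(cost_p, cost_q)


p, q = (0, 0), (1, 0)
grey, dark, black = (200, 200, 200), (50, 50, 50), (0, 0, 0)
# Label "edge" has a strong gradient between p and q; "flat" and "other" do not.
pixels = {
    "edge": {p: grey, q: dark},
    "flat": {p: grey, q: grey},
    "other": {p: black, q: black},
}


def sample(pixel, label):
    return pixels[label][pixel]


edge_seam = transition_cost(p, q, "edge", "other", sample)
flat_seam = transition_cost(p, q, "flat", "other", sample)
```

In this toy case the seam placed along the strong gradient costs far less than the seam through the flat region, which is the behaviour described in point 2.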
AutoCollage : Constraints
1. Information bound: Any image In that is present in the labelling must satisfy Eimp(L, n) > T, where Eimp(L, n) ∈ [0, 1] is the proportion of local image information that is captured in the ROI and T is a threshold.
2. Uniform shift: A given input image In may appear in the collage with only one unique shift s. Given two distinct pixels p, q ∈ P, p ≠ q, with labels L(p) = (n, s) and L(q) = (n, s′), it is required that s = s′.
3. Connectivity: Each set Sn = {p ∈ P : L(p) = (n, s), for some s} of collage pixels drawn from image n should form a connected region.
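The uniform-shift and connectivity constraints can be checked on a candidate labelling with a short sketch. It assumes the labelling is a dictionary from collage pixels to (n, s) labels and that connectivity is tested on the 8-neighbourhood; the slides do not state which neighbourhood the connectivity constraint uses, and the function names are illustrative.

```python
from collections import deque


def uniform_shift_ok(labelling):
    """Each image n may appear with only one shift s."""
    shifts = {}
    for n, s in labelling.values():
        if shifts.setdefault(n, s) != s:
            return False
    return True


def connected_ok(labelling):
    """Each set Sn of collage pixels drawn from image n must be one 8-connected region."""
    regions = {}
    for p, (n, _) in labelling.items():
        regions.setdefault(n, set()).add(p)
    for pixels in regions.values():
        start = next(iter(pixels))
        seen, queue = {start}, deque([start])
        while queue:  # breadth-first flood fill from an arbitrary pixel of Sn
            x, y = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (x + dx, y + dy)
                    if nb in pixels and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        if seen != pixels:
            return False
    return True


good = {(0, 0): (0, (0, 0)), (1, 0): (0, (0, 0)), (2, 0): (1, (2, 0))}
two_shifts = {(0, 0): (0, (0, 0)), (1, 0): (0, (1, 0))}
split = {(0, 0): (0, (0, 0)), (1, 0): (1, (0, 0)), (2, 0): (0, (0, 0))}
```

The information bound is not covered here because it depends on the per-image importance values rather than on the label layout alone.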
Implementation Summary
Erep tends to select the images from the input image set that are most representative.
Eimp ensures that a substantial and interesting region of interest (ROI) is selected from each image in Is.
Etrans is a pairwise term which penalises any transition between images that is not visually appealing.
Eobj incorporates information on object recognition, and favors placement of objects in reasonable configurations.
Wimp, Wtrans, Wobj and Wface are weighting parameters and have been adjusted by informal testing over 50 sets of images.
E(L) = Erep(L) + Wimp Eimp(L) + Wtrans Etrans(L) + Wobj Eobj(L)
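How the four terms combine can be shown with a minimal sketch. The weight values and the two candidate labellings below are made-up numbers for illustration only; the slides do not give the tuned weights, and picking the best of a short candidate list is a stand-in for the actual optimisation, which the slides do not describe.

```python
def total_energy(e_rep, e_imp, e_trans, e_obj, w_imp, w_trans, w_obj):
    """E(L) = Erep(L) + Wimp Eimp(L) + Wtrans Etrans(L) + Wobj Eobj(L)."""
    return e_rep + w_imp * e_imp + w_trans * e_trans + w_obj * e_obj


def best_labelling(candidates, weights):
    """Return the name of the candidate with the lowest total energy."""
    return min(candidates, key=lambda name: total_energy(*candidates[name], **weights))


# Hypothetical weights and per-term energies (Erep, Eimp, Etrans, Eobj).
weights = {"w_imp": 10.0, "w_trans": 1.0, "w_obj": 1.0}
candidates = {
    "A": (-3.0, -0.5, 2.0, 0.0),
    "B": (-2.0, -0.8, 1.0, float("inf")),  # cuts through a face, so Eobj is infinite
}
winner = best_labelling(candidates, weights)
```

Candidate B has better importance and transition terms, but its infinite object term rules it out regardless of the weights.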
Demo
Results : Comparison
AutoCollage
Tapestry
Results : Limitations
1. Occasional inclusion of sky fragments in the interior (83% accuracy)
2. Occasionally the face detection fails, allowing inappropriate cuts
3. Sometimes texture edges trigger inappropriately sharp transitions
4. Lack of user interaction
User Study