A Comparative Study of Image...
Spatial Retargeting
Diego Gutierrez
Universidad de Zaragoza
(slides material also from Miki Rubinstein, Olga Sorkine,
Arik Shamir and Susana Castillo)
The Retargeting Problem
Common solutions
• Homogeneous squeezing/stretching
• Cropping [DeCarlo and Santella 2002; Viola and Jones 2004…]
• Hybrid solution [modern TV sets]
[Figure: original, squeeze, crop, hybrid]
Visual Media Retargeting: Siggraph Asia Course 2009
Ariel Shamir
The Interdisciplinary Center, Herzliya
Olga Sorkine
New York University
Visual Media Retargeting: An Example
[Avidan & Shamir 2007]
Visual Media Retargeting: Scaling
[Avidan & Shamir 2007]
Visual Media Retargeting: Seams
[Avidan & Shamir 2007]
1. Define an energy function E(I)
(interest, importance, saliency…)
2. Use some operator(s)
to change the image I
Visual Media Retargeting: Energy Concept
[Avidan & Shamir 2007]
Visual Media Retargeting: Energy & Saliency
• Magnitude of gradients (simple)
• Saliency (e.g. Itti’s method) - multiresolution
[Shamir and Sorkine 2009]
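A minimal sketch of the "magnitude of gradients" energy, assuming a grayscale image stored as a list of rows and an L1 gradient norm (|dI/dx| + |dI/dy|). The function name `gradient_energy` and the border handling are illustrative choices, not taken from the slides.

```python
def gradient_energy(img):
    """Return E(I) = |dI/dx| + |dI/dy| for a grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    energy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences; neighbours are clamped at the image border.
            dx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            dy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            energy[y][x] = abs(dx) + abs(dy)
    return energy


# Toy image with a single vertical edge between columns 1 and 2.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
energy = gradient_energy(image)
```

On this toy image the energy is high only in the two columns next to the edge, which is the behaviour a retargeting operator would rely on to protect that region.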
Different energy functions
[Shamir and Sorkine 2009]
•Histogram of Gradients
•Entropy
•E1
•Mean Shift & E1
Simple operators: cropping
[Shamir and Sorkine 2009]
• Crop s.t. important (salient) parts remain
• Use domain-specific tools, such as face
detector, gaze estimation… [DeCarlo and
Santella 2002; Viola and Jones 2004]
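One simple way to "crop such that salient parts remain" is to slide a window of the target width over the image and keep the window with the largest total energy. The sketch below assumes a precomputed energy (or saliency) map as a list of rows; `best_crop_columns` is a hypothetical helper, and it does not model the face detectors or gaze estimation mentioned above.

```python
def best_crop_columns(energy, target_w):
    """Return (left, right) of the width-target_w window with the most energy."""
    w = len(energy[0])
    if not 0 < target_w <= w:
        raise ValueError("target_w must be between 1 and the image width")
    # Total energy of every column.
    col_sum = [sum(row[x] for row in energy) for x in range(w)]
    # Sliding-window sum over the columns.
    window = sum(col_sum[:target_w])
    best, best_left = window, 0
    for left in range(1, w - target_w + 1):
        window += col_sum[left + target_w - 1] - col_sum[left - 1]
        if window > best:
            best, best_left = window, left
    return best_left, best_left + target_w


# Hypothetical energy map, 2 rows by 6 columns.
energy = [
    [0, 1, 5, 6, 0, 0],
    [0, 2, 7, 4, 1, 0],
]
crop = best_crop_columns(energy, 3)
```

The returned pair is a half-open column range, so the cropped image would be `[row[left:right] for row in image]`.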
Simple operators: scaling
[Shamir and Sorkine 2009]
• Can combine with cropping techniques (done
on modern TV sets – center remains, peripheral
data is scaled)
• Distorts content but is perfectly temporally
coherent (video)
Discrete vs continuous
[Shamir and Sorkine 2009]
Problem statement
• Given an image I of size (n x m), we want to produce an image I* of size (n* x m*) which is a good representative of image I
• But what is a “good representative”? No definitions exist
• Goals of a retargeting algorithm:
– 1. The important content of I should be preserved in I*.
– 2. The important structure of I should be preserved in I*.
– 3. I* should be artifact-free.
Discrete approaches
[Shamir and Sorkine 2009]
• Seam carving for content aware image resizing
SIGGRAPH 2007
S. Avidan and A. Shamir
• Improved seam carving for video retargeting
SIGGRAPH 2008
M. Rubinstein, A. Shamir and S. Avidan
• Seam carving for Media Retargeting
Communications of the ACM 2009
S. Avidan and A. Shamir
• Multi-Operator Media Retargeting
SIGGRAPH 2009
M. Rubinstein, A. Shamir and S. Avidan
Continuous approaches
[Shamir and Sorkine 2009]
• Feature-aware texturing
EGSR 2006
R. Gal, O. Sorkine and D. Cohen-Or
• Non-homogeneous content-driven video retargeting
ICCV 2007
L. Wolf, M. Guttmann and D. Cohen-Or
• Optimized scale-and-stretch for image resizing
SIGGRAPH ASIA 2008
Y. Wang, C. Tai, O. Sorkine and T. Lee
• Shrinkability maps for content-aware video resizing
Pacific Graphics 2008
Y. Zhang, S. Hu and R. Martin
Discrete example: Seam carving
[Rubinstein, Avidan and Shamir 2007]
Seam carving
[Rubinstein, Avidan and Shamir 2007]
Seam carving: problems
[Rubinstein, Avidan and Shamir 2007]
• Discrete and greedy – may break structures
• Can run out of good seams in one direction
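A minimal sketch of one seam-carving step, assuming an energy map as a list of rows and 8-connected vertical seams. Each seam is found by dynamic programming, but seams are removed one at a time, which is where the greedy behaviour noted above comes from. The function names are illustrative.

```python
def find_vertical_seam(energy):
    """Return one column index per row for the minimum-energy vertical seam."""
    h, w = len(energy), len(energy[0])
    # cost[y][x] = cheapest seam ending at (x, y).
    cost = [list(energy[0])]
    for y in range(1, h):
        prev = cost[-1]
        row = []
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            row.append(energy[y][x] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack from the cheapest cell in the last row.
    x = min(range(w), key=lambda i: cost[-1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img, seam):
    """Drop one pixel per row, shrinking the width by one."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


# Hypothetical energy map with a cheap diagonal path.
energy = [
    [9, 1, 9, 9],
    [9, 9, 1, 9],
    [9, 9, 9, 1],
]
seam = find_vertical_seam(energy)
narrower = remove_vertical_seam(energy, seam)
```

Repeating these two steps (and recomputing the energy in between) reduces the width pixel by pixel. Once every remaining seam crosses important content, further removal starts to break structures.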
Continuous example: Warping
[Wang, Tai, Sorkine and Lee 2008]
• Allow important regions to uniformly scale
• Find optimal local scaling factors by global
optimization
• Result: preserve the shape of important regions,
distort non-important ones
Continuous example: Warping
[Wang, Tai, Sorkine and Lee 2008]
• Grid mesh, preserve the shape of the important
quads
• Optimize the location of mesh vertices,
interpolate image
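The mesh optimization itself is beyond a short example, but the underlying idea (important regions keep their size, unimportant ones absorb the distortion) can be illustrated in one dimension. The sketch below is a deliberately simplified stand-in, not the method of Wang et al.: it assumes one importance weight per column and minimizes sum_i w_i (s_i - 1)^2 subject to sum_i s_i = target width, which has a closed-form solution. All names are hypothetical.

```python
def column_widths(importance, target_w, eps=1e-6):
    """New width s_i for each unit-width column.

    Minimizes sum_i w_i * (s_i - 1)^2 subject to sum_i s_i = target_w.
    The Lagrangian solution is s_i = 1 - shrink * (1/w_i) / sum_j (1/w_j).
    """
    inv = [1.0 / (w + eps) for w in importance]
    shrink = len(importance) - target_w  # total width to remove
    total_inv = sum(inv)
    return [1.0 - shrink * v / total_inv for v in inv]


# Hypothetical importance: two salient columns in the middle.
importance = [1.0, 1.0, 9.0, 9.0, 1.0, 1.0]
widths = column_widths(importance, target_w=4)
```

Here the salient columns stay close to width 1 while the others shrink to roughly half. This sketch has no non-negativity constraint, so extreme size changes can produce negative widths; a real solver would need to prevent such fold-overs.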
Video?
[Shamir and Sorkine 2010]
• Naïve… every frame by itself
Video?
[Shamir and Sorkine 2010]
• Camera movement
• Object movement
• Seams should adapt and change through time!
• Global Solution (video cube)
Video?
[Shamir and Sorkine 2010]
Current State of Retargeting Research
No clear evaluation methodology!
– Mostly visual comparison
– Small subset of previous techniques
Relation between the operator and the type of content?
Tradeoff between loss of content and deformation?
Is there an agreement between viewers on retargeting
evaluation?
Computational retargeting measure?
Source
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
• Benchmark and evaluation methodology for image retargeting
• Comprehensive perceptual study and analysis of image retargeting
http://people.csail.mit.edu/mrub/retargetme/
A Comparative Study of Image Retargeting
Miki Rubinstein, Diego Gutierrez, Olga Sorkine and Arik Shamir
ACM Transactions on Graphics, Vol. 29(5) (SIGGRAPH Asia 2010)
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
Goals
• What is the “correct” way to retarget this image?
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
Goals
• The dataset and user study
• User response (subjective) analysis
– Is there consensus between viewers?
– When is one method better than another?
• Computational (objective) analysis
– Can an image distance measure predict retargeting quality?
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
Constructing the Dataset
• Image Retargeting objectives:
1. Preserve the important content and structures
2. Limit artifacts
• Concentrate on challenging images from previous
publications
• Classified by attributes
– people/faces, line/edges, foreground objects, texture, geometric
structures, symmetry
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
Retargeting Operators
Discrete:
• Seam Carving [SC] [Rubinstein et al. 2008]
• Shift Map [SM] [Pritch et al. 2009]
• Multi-Operator [MULTIOP] [Rubinstein et al. 2009]
Continuous:
• Warping [WARP] [Wolf et al. 2007]
• Streaming Video [SV] [Krähenbühl et al. 2009]
• Scale-and-Stretch [SNS] [Wang et al. 2008]
Reference:
• Cropping [CR] [Manual]
• Scaling [SCL] [Cubic interpolation]
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
Comparative Analysis
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
The Survey Interface
Retargeting Results
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
Additional Questions
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
User Statistics
• Each participant performs 12 comparisons over 5
images
• 210 participants; 252 votes per image
– Half volunteers
– Half paid (25 cents per completed survey)
• Average time to complete: 20 minutes
“It was a very interesting survey. Very nice experience”
“i need your more survey so that i can help u a lot”
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
User Statistics
[Charts: gender, age, country, graphics background, comparison time]
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
User Agreement
• Similarity of votes = consensus on “good” retargeting
• Coefficient of Agreement [Kendall 1940]
• aij = # times method i chosen over method j
• m = # participants
• t = 8 (# retargeting operators)
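• $u = \frac{2\Sigma}{\binom{m}{2}\binom{t}{2}} - 1$, where $\Sigma = \sum_{i \neq j} \binom{a_{ij}}{2}$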
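A short sketch of how the coefficient of agreement could be computed from a vote matrix, using the standard Kendall and Babington Smith definition u = 2Σ / (C(m,2)·C(t,2)) − 1 with Σ summing C(a_ij, 2) over all ordered pairs i ≠ j. It assumes every pair of methods was judged by exactly m participants; the example matrices are invented for illustration.

```python
from math import comb


def coefficient_of_agreement(a, m):
    """Kendall's u from a t x t matrix a, where a[i][j] = votes for i over j.

    Assumes each pair (i, j) was judged by exactly m participants,
    so a[i][j] + a[j][i] == m for i != j.
    """
    t = len(a)
    sigma = sum(comb(a[i][j], 2) for i in range(t) for j in range(t) if i != j)
    return 2.0 * sigma / (comb(m, 2) * comb(t, 2)) - 1.0


# Three hypothetical operators, four voters, unanimous order 0 > 1 > 2.
unanimous = [
    [0, 4, 4],
    [0, 0, 4],
    [0, 0, 0],
]
u_max = coefficient_of_agreement(unanimous, m=4)

# Votes split evenly on every pair.
split = [
    [0, 2, 2],
    [2, 0, 2],
    [2, 2, 0],
]
u_min = coefficient_of_agreement(split, m=4)
```

Complete agreement gives u = 1, and an even split gives the minimum for even m, −1/(m − 1). Values close to 0, like those reported in this study, indicate weak agreement.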
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
User Agreement
• Low agreement in general
• Greater agreement on images containing faces/people, evident foreground objects and symmetry.
Attribute               u
lines/edges             0.073
faces/people            0.166
texture                 0.070
foreground objects      0.146
geometric structures    0.084
symmetry                0.132
total                   0.095
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
Operator Ranking
[Figure: operator ranking (rank 1 to 8), in total and by attribute (panels shown: lines/edges, faces/people), for Nonhomogeneous Warping, Streaming Video, Cropping, Multi-operator, Shift-maps, Scale & Stretch, Scaling and Seam Carving]
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
Additional Questions
Cropping Shift-maps Scale & Stretch
(At least for our retargeted setup)
SUBJECTIVE:
Clear and consistent division in groups
CR, SV, MULTIOP: good!
SCL, SC, WARP: not so good
Greater agreement for faces/people and foreground objects:
Saliency at object level?
Partial Conclusion
Source is Usually Unknown!
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
• Similar setup, source image not shown
• New set of 210 participants
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
“No Reference” Experiment Results
[Figure: operator rankings with reference vs. no reference, for Nonhomogeneous Warping, Streaming Video, Cropping, Multi-operator, Shift-maps, Scale & Stretch, Scaling and Seam Carving]
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
Analysis of the users’ responses:
significance test
Computational Retargeting Measures
• Goal: can computational image distance measures predict human retargeting preferences?
– Can be used to evaluate new operators
– Can be used to develop new operators – [Simakov et al. 2008], [Rubinstein et al. 2009]
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
(Non-blind) Retargeting Measures
[Figure: a measure takes the (Source, Retarget) image pair and returns a score]
Objective Measures
• High level semantics:
– Bidirectional Similarity [BDS] - Simakov et al. 2008
– Bidirectional Warping [BDW] - Rubinstein et al. 2009
– SIFT Flow [SIFTflow] - Liu et al. 2008
– Earth Mover’s Distance [EMD] - Pele and Werman 2009
• Low level features:
– Edge Histogram [EH] - Manjunath et al. 2001
– Color Layout [CL] - Kasutani and Yamada 2001
• See dataset website and supplemental material for more details
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
How to Evaluate an Objective Measure?
• Define rate of agreement as the correlation between
the rankings induced by the user responses (subjective) and by the
objective measure
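The slide does not say which correlation was used. Kendall's τ is one common choice for comparing two rankings, and the sketch below assumes it, with no ties. The operator ranks in the example are made up for illustration.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall rank correlation (tau-a) between two rankings without ties."""
    items = list(rank_a)
    concordant = discordant = 0
    for x, y in combinations(items, 2):
        # Positive when both rankings order the pair the same way.
        sign = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(items) * (len(items) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical ranks (1 = best) for four operators on one image.
subjective = {"CR": 1, "SV": 2, "SC": 3, "SCL": 4}
objective = {"CR": 2, "SV": 1, "SC": 3, "SCL": 4}
tau = kendall_tau(subjective, objective)
```

τ is 1 when the two rankings are identical, −1 when one is the reverse of the other, and near 0 when the objective measure says little about user preference.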
Objective Analysis Results
• The results were spectacular(ly poor!)
• We tried something else:
– SIFT-flow [Liu et al. 2008]: SIFTflow
– Earth mover’s distance [Pele & Werman 2009]: EMD
• Somewhat better
Metric    lines/  faces/  texture  foreground  geometric   symmetry  total
          edges   people           objects     structures
BDS        0.04    0.19    0.06     0.17        0.00       -0.01      0.08
BDW        0.03    0.05   -0.05     0.06        0.00        0.12      0.05
EH         0.04   -0.08   -0.06    -0.08        0.10        0.30      0.00
CL        -0.02   -0.18   -0.07    -0.18       -0.01        0.21     -0.07
SIFTflow   0.10    0.25    0.12     0.22        0.08        0.07      0.14
EMD        0.22    0.26    0.11     0.23        0.24        0.50      0.25
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
Can computational image distance metrics predict human
retargeting perception?
[Rubinstein, Gutierrez, Sorkine and Shamir 2010]
SUBJECTIVE:
More recent algorithms do outperform their predecessors in
a (surprisingly) consistent way
Cropping is the simplest and one of the best:
loss of info OK
distortion NOT OK
bring it back!
Interestingly, scaling and seam carving do not do very well
on their own… but are two of the three in MULTIOP:
combination of simple methods?
Conclusions
OBJECTIVE:
We are a long way from predicting human perception
Four similarity image metrics did not perform well at all
Two metrics not originally designed for that purpose did
somewhat better
Optimize retargeting wrt those?
Further research is (badly!) needed
Conclusions
We need video analysis and experiments!
http://people.csail.mit.edu/mrub/retargetme/
Image Retargeting Quality Assessment
Yong-Jin Liu, Xi Luo, Yu-Ming Xuan, Wen-Feng Chen, Xiao-Lan Fu
Computer Graphics Forum, 2011, Vol. 30, No. 2 (Eurographics 2011)
Using Eye-Tracking to Assess Different Image Retargeting Methods
Susana Castillo, Tilke Judd and Diego Gutierrez
Applied Perception in Graphics and Visualization 2011
[Castillo, Judd and Gutierrez 2011]
Overview
[Castillo, Judd and Gutierrez 2011]
• Seam Carving [SC] [Rubinstein et al. 2008]
• Shift Maps [SM] [Pritch et al. 2009]
• Multi-Operator [MULTIOP] [Rubinstein et al. 2009]
• Streaming Video [SV] [Krähenbühl et al. 2009]
Retargeting Operators
[Figure: operator rank (1 to 8) from Rubinstein et al., SIGGRAPH Asia 2010]
[Castillo, Judd and Gutierrez 2011]
Collect eye tracking data
Screen resolution
1280x1024
Each image shown for 5 seconds
[Photo Credit: Jason Dorfman CSAIL website]
[Castillo, Judd and Gutierrez 2011]
Eye tracking data
Fixations for 7 users
Contextual guidance of eye movements and attention in real-world scenes: The role of global features on object search [Torralba et al. 2006]
[Castillo, Judd and Gutierrez 2011]
Eye tracking data
Average fixation locations / continuous saliency map
[Castillo, Judd and Gutierrez 2011]
Learning to predict where humans look [Judd et al. 2009]
Eye tracking data
Top 20% salient locations
Learning to predict where humans look [Judd et al. 2009]
[Castillo, Judd and Gutierrez 2011]
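A rough sketch of how recorded fixations could be turned into a continuous saliency map and then into a "top 20%" mask. It assumes an isotropic Gaussian summed around each fixation and a simple rank threshold; the kernel width, grid size and fixation coordinates are illustrative and not taken from the study.

```python
from math import exp


def saliency_from_fixations(fixations, width, height, sigma=1.5):
    """Sum an isotropic Gaussian around every (x, y) fixation."""
    sal = [[0.0] * width for _ in range(height)]
    for fx, fy in fixations:
        for y in range(height):
            for x in range(width):
                d2 = (x - fx) ** 2 + (y - fy) ** 2
                sal[y][x] += exp(-d2 / (2.0 * sigma ** 2))
    return sal


def top_fraction_mask(sal, fraction=0.2):
    """Mark the `fraction` most salient pixels with True."""
    values = sorted((v for row in sal for v in row), reverse=True)
    keep = max(1, int(round(fraction * len(values))))
    threshold = values[keep - 1]
    return [[v >= threshold for v in row] for row in sal]


# Hypothetical fixations on a 10 x 8 grid (pixel coordinates).
fixations = [(2, 2), (3, 2), (7, 5)]
saliency = saliency_from_fixations(fixations, width=10, height=8)
mask = top_fraction_mask(saliency, fraction=0.2)
```

Pooling the fixations of several viewers before blurring would give an average map of the kind shown on the previous slide.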
• People tend to fixate on:
1. Text & Faces
2. Animals
3. Center
• Features
– Low level: illuminance, orientation, color
– Mid level: vanishing point, horizon line
– High level: face detection, object detection
MIT Predictive Model of Saliency
[Judd et al. 2009]
MIT Predictive Model of Saliency
Saliency Maps from eye-tracking data
Saliency Maps predicted by the model from Judd et al. [2009]
[Castillo, Judd and Gutierrez 2011]
Examples and Discussion
[Castillo, Judd and Gutierrez 2011]
Conclusions
• Lots of methods in the past few years, in top-notch places
• Relatively small impact in industry
• We need more (and better!) metrics
• Does video retargeting really work?
http://people.csail.mit.edu/mrub/retargetme/ or Google: “retargetme”
Conclusions
• Eye-tracking data framework
• The model of saliency from Judd et al. [2009] can be a useful tool in a retargeting context when using an eye tracker is not feasible
• Analysis of 4 retargeting operators with 6 image distance measures
– Using eye-tracking data can improve the predictive capabilities of these measures
• Alteration of the image semantics
– Content removal alters RoIs although the results can be aesthetically pleasing
• Attentional tension between RoIs and artifacts
– Large artifacts can remain unnoticed when not in a RoI (at least for our 5-second task)