Graph Cut based Inference with Co-occurrence Statistics
Ľubor Ladický, Chris Russell, Pushmeet Kohli, Philip Torr
Image labelling Problems
Image Denoising Geometry Estimation Object Segmentation
Assign a label to each image pixel
Building
Sky
Tree Grass
Standard CRF Energy
Pairwise CRF models
Data term + Smoothness term
Restricted expressive power
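The standard pairwise energy referenced above (a data term per pixel plus a smoothness term per neighbouring pair) can be sketched as follows; a minimal illustration with a Potts smoothness term, where the function and variable names are hypothetical, not from the slides:

```python
def pairwise_crf_energy(labels, unary, edges, potts_weight=1.0):
    """Evaluate a standard pairwise CRF energy:
    E(x) = sum_i psi_i(x_i) + sum_{(i,j)} psi_ij(x_i, x_j),
    with a Potts smoothness term psi_ij = w * [x_i != x_j]."""
    data_term = sum(unary[i][labels[i]] for i in range(len(labels)))
    smoothness = sum(potts_weight * (labels[i] != labels[j]) for i, j in edges)
    return data_term + smoothness

# Tiny 3-pixel chain with 2 labels.
unary = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
edges = [(0, 1), (1, 2)]
print(pairwise_crf_energy([0, 0, 0], unary, edges))  # 1.0: one unary cost, no label changes
print(pairwise_crf_energy([0, 1, 0], unary, edges))  # 2.0: no unary cost, two label changes
```

The energy decomposes over single pixels and pairs only, which is exactly the restricted expressive power the slide points out: nothing in this sum can see which labels appear globally.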
Structures in CRF
• Taskar et al. 02 – associative potentials
• Kohli et al. 08 – segment consistency
• Woodford et al. 08 – planarity constraint
• Vicente et al. 08 – connectivity constraint
• Nowozin & Lampert 09 – connectivity constraint
• Roth & Black 09 – field of experts
• Ladický et al. 09 – consistency over several scales
• Woodford et al. 09 – marginal probability
• Delong et al. 10 – label occurrence costs
Pairwise CRF models
Standard CRF Energy for Object Segmentation
Cannot encode global consistency of labels!!
Local context
Image from Torralba et al. 10
Detection Suppression
[Figure: detections labelled road, table, chair, keyboard, car]
If we have 1000 categories (detectors), and each detector produces 1 false positive every 10 images, we will have 100 false alarms per image… pretty much garbage…
[Torralba et al. 10, Leibe & Schiele 09, Barinova et al. 10]
• Thing – Thing
• Stuff – Stuff
• Stuff – Thing
[ Images from Rabinovich et al. 07 ]
Encoding Co-occurrence
Co-occurrence is a powerful cue [Heitz et al. '08] [Rabinovich et al. ‘07]
Proposed solutions:
1. Csurka et al. 08 – hard decision for label estimation
2. Torralba et al. 03 – GIST-based unary potential
3. Rabinovich et al. 07 – fully-connected CRF
So...
What properties should these global co-occurrence potentials have?
Desired properties
1. No hard decisions
Incorporation in probabilistic framework
Unlikely possibilities are not completely ruled out
Desired properties
1. No hard decisions 2. Invariance to region size
Cost for occurrence of {people, house, road etc .. } invariant to image area
The only possible solution:
Local context Global context
Cost defined over the assigned labels L(x)
L(x)={ , , }
Desired properties
1. No hard decisions 2. Invariance to region size 3. Parsimony – simple solutions preferred
L(x)={ building, tree, grass, sky }
L(x)={ aeroplane, tree, flower, building, boat, grass, sky }
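The global term is a cost on the set of labels L(x) actually used in the labelling, and parsimony means the smaller set above should be cheaper. A minimal sketch, with hypothetical function names, of L(x) and the simplest size-invariant, parsimony-preferring cost:

```python
def label_set(labels):
    """L(x): the set of labels actually used in a labelling x."""
    return frozenset(labels)

def parsimony_cost(labels, per_label_cost=1.0):
    """A simple label-set cost preferring parsimonious solutions:
    C(x) = K * |L(x)|.
    Invariant to region size: it depends only on which labels appear,
    not on how many pixels carry each label."""
    return per_label_cost * len(label_set(labels))

simple = ["building", "tree", "grass", "sky"]
cluttered = ["aeroplane", "tree", "flower", "building", "boat", "grass", "sky"]
print(parsimony_cost(simple))     # 4.0
print(parsimony_cost(cluttered))  # 7.0
```

Duplicating pixels of an existing label leaves the cost unchanged, which is exactly the region-size invariance the slide asks for.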
Desired properties
1. No hard decisions 2. Invariance to region size 3. Parsimony – simple solutions preferred 4. Efficiency
a) Memory requirements grow as O(n) with the image size and the number of labels
b) Inference remains tractable
Previous work
• Torralba et al. (2003) – GIST-based unary potentials
• Rabinovich et al. (2007) – fully-connected pairwise graphs
• Csurka et al. (2008) – hard estimation of labels present
Related work
• Zhu & Yuille 1996 – MDL prior
• Bleyer et al. 2010 – Surface Stereo MDL prior
• Hoiem et al. 2007 – 3D Layout CRF MDL prior
• Delong et al. 2010 – label occurrence cost

MDL prior: C(x) = K |L(x)|
Label occurrence cost: C(x) = Σ_{l∈L} K_l δ_l(x)

All special cases of our model
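Both prior costs are functions of the label set L(x) alone, which is why they fall out as special cases of a general set cost. A minimal sketch (function names hypothetical, not from the paper):

```python
def mdl_cost(labels, K=1.0):
    # Zhu & Yuille 1996-style MDL prior: C(x) = K * |L(x)|
    return K * len(set(labels))

def occurrence_cost(labels, K_l):
    # Delong et al. 2010-style label occurrence cost:
    # C(x) = sum of K_l over labels l present in L(x)
    return sum(K_l[l] for l in set(labels))

labelling = ["sky", "grass", "grass", "tree"]
print(mdl_cost(labelling, K=2.0))  # 6.0: three distinct labels present
print(occurrence_cost(labelling, {"sky": 1.0, "grass": 0.5, "tree": 2.0}))  # 3.5
# With uniform per-label costs the occurrence cost reduces to the MDL prior:
uniform = {"sky": 2.0, "grass": 2.0, "tree": 2.0}
print(occurrence_cost(labelling, uniform) == mdl_cost(labelling, K=2.0))  # True
```

The general model instead allows an arbitrary cost C(L(x)) over subsets of labels, of which both of these are restricted forms.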
Inference
Pairwise CRF Energy
Inference
IP formulation (Schlesinger 73)
Inference
Pairwise CRF Energy with co-occurrence

Inference
IP formulation with co-occurrence
• Pairwise CRF cost, pairwise CRF constraints
• Co-occurrence cost
• Inclusion constraints
• Exclusion constraints
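A reconstruction of the integer program in standard indicator notation (not copied from the slides; the formulas did not survive extraction). With binary variables x_{i;l} for pixel-label assignments and y_l indicating that label l is present:

```latex
\min_{x,y}\ \sum_{i}\sum_{l} \psi_i(l)\, x_{i;l}
 \;+\; \sum_{(i,j)}\sum_{l,l'} \psi_{ij}(l,l')\, x_{ij;ll'}
 \;+\; C\big(\{\, l : y_l = 1 \,\}\big)
```

subject to the pairwise CRF constraints \( \sum_l x_{i;l} = 1 \) and \( \sum_{l'} x_{ij;ll'} = x_{i;l} \), the inclusion constraints \( y_l \ge x_{i;l} \ \forall i \) (a label used by any pixel must be switched on), and the exclusion constraints \( y_l \le \sum_i x_{i;l} \) (a label used by no pixel must be off), with \( x_{i;l}, y_l \in \{0,1\} \). The LP relaxation below drops integrality to \( x, y \ge 0 \).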
Inference
LP relaxation
Relaxed constraints
Very slow! An 80 × 50 subsampled image takes 20 minutes
Inference: Our Contribution
Pairwise representation
• One auxiliary variable Z ∈ 2^L
• Infinite pairwise costs if x_i ∉ Z [see technical report]
• Solvable using standard methods: BP, TRW etc.
Relatively faster, but still computationally expensive!
Inference using Moves
Graph Cut based move making algorithms [Boykov et al. 01]
α-expansion transformation function
• Series of locally optimal moves
• Each move reduces energy
• Optimal move by minimizing submodular function
Space of Solutions (x): L^N
Move Space (t): 2^N
Search Neighbourhood
Current Solution
N – number of variables
L – number of labels
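The move-making loop above can be sketched as follows. In the real algorithm each optimal move is a single graph cut on a submodular binary energy; here, purely for illustration, the 2^N move space is searched by brute force on a tiny problem (all names are hypothetical):

```python
from itertools import product

def alpha_expansion(labels, energy, label_space, max_iters=10):
    """Move-making skeleton [Boykov et al. 01]: repeatedly apply the best
    alpha-expansion move. Each move lets every variable either keep its
    current label or switch to alpha, so each move strictly reduces the
    energy until a local optimum over the expansion neighbourhood."""
    labels = list(labels)
    for _ in range(max_iters):
        improved = False
        for alpha in label_space:
            best, best_e = labels, energy(labels)
            for t in product([0, 1], repeat=len(labels)):  # brute-force move space
                moved = [alpha if ti else xi for ti, xi in zip(t, labels)]
                e = energy(moved)
                if e < best_e:
                    best, best_e = moved, e
            if best != labels:
                labels, improved = best, True
        if not improved:
            break
    return labels

def chain_energy(x):
    unary = [[0, 2], [2, 0], [2, 0]]  # data term prefers [0, 1, 1]
    data = sum(unary[i][x[i]] for i in range(3))
    smooth = sum(x[i] != x[i + 1] for i in range(2))  # Potts on a chain
    return data + smooth

print(alpha_expansion([0, 0, 0], chain_energy, [0, 1]))  # [0, 1, 1]
```

The expansion move on a binary indicator vector t of length N is exactly the 2^N move space the slide names, against the full L^N solution space.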
Inference using Moves
Label indicator functions
Co-occurrence representation
Inference using Moves
Move Energy
Cost of current label set
Decomposition into α-dependent and α-independent parts
Either α or all labels in the image after the move
submodular / non-submodular
Inference
Move Energy
non-submodular
Non-submodular energy overestimated by E'(t):
– E'(t) = E(t) for the current solution
– E'(t) ≥ E(t) for any other labelling
Occurrence – tight
Co-occurrence – overestimation
General case [see the paper]
Quadratic representation
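The guarantee stated above (E'(t) = E(t) at the current solution and E'(t) ≥ E(t) elsewhere) yields the usual majorize-minimize argument. Writing t = 0 for the current solution and t* for the graph cut minimizer of the submodular overestimate E':

```latex
E(t^\ast) \;\le\; E'(t^\ast) \;\le\; E'(\mathbf{0}) \;=\; E(\mathbf{0})
```

so minimizing E' with a graph cut can never increase the true energy: each accepted move reduces (or at worst preserves) E, even though the co-occurrence part of E' is only an upper bound.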
Application: Object Segmentation
Standard MRF model for Object Segmentation
Label based Costs
Cost defined over the assigned labels L(x)
Training of label based potentials
Indicator variables for occurrence of each label
Label set costs
Approximated by a 2nd-order representation
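The slides later list the optimal training method for co-occurrence as an open question, so the following is only one plausible scheme, not the paper's procedure: fit a 2nd-order label-set cost from occurrence and pairwise co-occurrence frequencies in the training annotations (all names hypothetical):

```python
import math
from itertools import combinations

def train_cooccurrence_costs(training_label_sets, eps=1e-6):
    """Fit a 2nd-order label-set cost from training statistics.
    Unary term: c_l = -log p(l). Pairwise term: negative pointwise
    mutual information, so pairs seen together more often than
    independence predicts get a discount."""
    n = len(training_label_sets)
    count, pair_count = {}, {}
    for L in training_label_sets:
        for l in L:
            count[l] = count.get(l, 0) + 1
        for a, b in combinations(sorted(L), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    c1 = {l: -math.log(count[l] / n + eps) for l in count}
    c2 = {}
    for (a, b), k in pair_count.items():
        pmi = math.log((k / n + eps) / ((count[a] / n) * (count[b] / n) + eps))
        c2[(a, b)] = -pmi
    return c1, c2

def set_cost(L, c1, c2, default=5.0):
    """2nd-order approximation of C(L): unary plus pairwise terms over
    the labels present; unseen labels pay a high default cost."""
    ys = sorted(L)
    cost = sum(c1.get(l, default) for l in ys)
    cost += sum(c2.get((a, b), 0.0) for a, b in combinations(ys, 2))
    return cost

c1, c2 = train_cooccurrence_costs([
    {"sky", "grass", "tree"},
    {"sky", "grass"},
    {"sky", "building"},
])
print(set_cost({"sky", "grass"}, c1, c2) < set_cost({"sky", "cow"}, c1, c2))  # True
```

Truncating the set cost at pairwise terms is what makes the 2nd-order representation mentioned above tractable inside the move energy.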
Experiments
• Methods – Segment CRF
– Segment CRF + Co-occurrence Potential
– Associative HCRF [Ladický et al. ‘09]
– Associative HCRF + Co-occurrence Potential
• Datasets
MSRC-21
• Number of Images: 591
• Number of Classes: 21
• Training Set: 50%
• Test Set: 50%
PASCAL VOC 2009
• Number of Images: 1499
• Number of Classes: 21
• Training Set: 50%
• Test Set: 50%
MSRC - Qualitative
PASCAL VOC 2009 - Qualitative
Quantitative Results
MSRC-21
PASCAL VOC 2009
Summary and further work
• Incorporated label based potentials in CRFs
• Proposed feasible inference
• Open questions
– Optimal training method for co-occurrence
– Bounds of graph cut based inference
• Questions?