Torr Vision Group, Engineering Department
Semantic Image Segmentation with Deep Learning
Sadeep Jayasumana
07/10/2015
Collaborators: Bernardino Romera-Paredes, Shuai Zheng, Philip Torr
Live Demo - http://crfasrnn.torr.vision/
Outline
Semantic segmentation
Why?
CNNs for pixel-wise prediction
CRFs
CRF as RNN
Conclusion
Semantic Segmentation
• Recognizing and delineating objects in an image → classifying each pixel in the image
Why Semantic Segmentation?
• To help partially sighted people by highlighting important objects in their glasses
Why Semantic Segmentation?
• To let robots segment objects so that they can grasp them
Why Semantic Segmentation?
• Road scene understanding
• Useful for autonomous navigation of cars and drones
Image taken from the Cityscapes dataset.
Why Semantic Segmentation?
• Useful tool for editing images
Why Semantic Segmentation?
• Medical purposes: e.g. segmenting tumours, dental cavities, ...
Image taken from Mauricio Reyes.
ISBI Challenge 2015, dental X-ray images.
But How?
• Deep convolutional neural networks are successful at learning good representations of visual inputs.
• However, semantic segmentation requires a structured output: a label for every pixel.
CNN for Pixel-wise Labelling
• Usual convolutional networks
• Fully convolutional networks
Long et al., Fully Convolutional Networks for Semantic Segmentation, CVPR 2015.
Fully Convolutional Networks [Long et al., CVPR 2015]
Fully Convolutional Networks [Long et al., CVPR 2015]
+ Significantly improved the state of the art in semantic segmentation.
- Poor object delineation: e.g. spatial consistency is neglected.
[Figure: Image | FCN result | Ground truth]
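The poor delineation is easy to picture: the network predicts scores on a heavily downsampled grid and upsamples them back, so whole blocks of pixels share one score. A toy numpy sketch of the effect (nearest-neighbour upsampling for brevity and made-up scores; FCN actually uses learned, bilinearly initialised deconvolution):

```python
import numpy as np

# Coarse 2x2 score map for one class, as produced after 4x downsampling.
coarse = np.array([[0.9, 0.1],
                   [0.8, 0.2]])

# Upsample back to the 8x8 input resolution: every pixel in a 4x4 block
# gets the same score, so fine object boundaries cannot be recovered.
full = np.kron(coarse, np.ones((4, 4)))
assert full.shape == (8, 8)
```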
Conditional Random Fields (CRFs)
• A CRF can account for contextual information in the image.
[Figure: coarse output from the pixel-wise classifier → MRF/CRF modelling → output after CRF inference]
Conditional Random Fields (CRFs)
• Define a discrete random variable Xi for each pixel i.
• Each Xi can take a value from the label set: Xi ∈ {bg, cat, tree, person, …}, e.g. Xi = cat, Xj = bg.
• Connect the random variables to form a random field (MRF).
• The most probable assignment given the image → the segmentation.
Finding the Best Assignment
Pr(X1 = x1, X2 = x2, …, Xn = xn | I) = Pr(X = x | I)
Pr(X = x | I) ∝ exp(−E(x | I))
• Maximize Pr(X = x | I) → minimize E(x | I).
• So we have formulated the problem as an energy minimization.
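The equivalence between maximizing the probability and minimizing the energy can be checked numerically; a minimal sketch with made-up energies for four candidate labelings:

```python
import numpy as np

# Made-up energies E(x|I) for four candidate labelings x.
energies = np.array([3.2, 1.1, 4.0, 2.5])

# Pr(X = x | I) = exp(-E(x|I)) / Z, where Z normalises over the candidates.
unnormalized = np.exp(-energies)
probs = unnormalized / unnormalized.sum()

# The most probable labeling is exactly the minimum-energy one.
assert np.argmax(probs) == np.argmin(energies)
```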
E(x | I) = unary cost + pairwise cost

Unary energy
ψi(Xi = xi): your label doesn't agree with the initial classifier → you pay a penalty.

Pairwise energy
ψij(Xi = xi, Xj = xj): you assign different labels to two very similar pixels → you pay a penalty.
How do you measure similarity?
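To make the two terms concrete, here is a toy sketch on a 2×2 image with labels {bg, cat} (all penalty values are made up, and only 4-neighbour pairs are used for brevity):

```python
import numpy as np

# Initial classifier costs (e.g. negative log-probabilities) for labels
# {0: bg, 1: cat} at each pixel of a 2x2 image -- made-up numbers.
unary = np.array([[[0.2, 1.6], [0.3, 1.2]],
                  [[1.5, 0.4], [1.4, 0.3]]])   # shape (2, 2, n_labels)

# Pixel intensities; similar intensities -> high similarity weight.
intensity = np.array([[0.9, 0.8],
                      [0.1, 0.2]])

NEIGHBOURS = [((0, 0), (0, 1)), ((1, 0), (1, 1)),
              ((0, 0), (1, 0)), ((0, 1), (1, 1))]

def energy(labels, theta=2.0):
    """E(x | I) = sum_i psi_i(x_i) + sum_{i~j} psi_ij(x_i, x_j)."""
    e = sum(unary[i, j, labels[i, j]] for i in range(2) for j in range(2))
    for a, b in NEIGHBOURS:                  # Potts-style pairwise term:
        if labels[a] != labels[b]:           # differing labels cost more
            similarity = np.exp(-abs(intensity[a] - intensity[b]) / 0.5)
            e += theta * similarity          # when the pixels look similar
    return e

smooth = np.array([[0, 0], [1, 1]])  # labelling follows the intensity edge
noisy  = np.array([[0, 1], [1, 0]])  # labels disagree with similar neighbours
assert energy(smooth) < energy(noisy)
```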
Dense CRF Formulation
• Pairwise energies are defined for every pixel pair in the image.
E(x) = Σi ψi(xi) + Σi<j ψij(xi, xj)
• Exact inference is not feasible.
• Use approximate mean-field inference: approximate the distribution exp(−E(x)) ≈ Q(x) = Πi Qi(xi), a product of independent per-pixel marginals.
[Krähenbühl & Koltun, NIPS 2011.]
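A naive numpy sketch of the mean-field update (Gaussian pairwise affinities, Potts label compatibility; all numbers in the usage example are made up). This is O(N²) per iteration; the NIPS 2011 paper makes the message passing fast with high-dimensional filtering:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mean_field(unary, features, w=1.0, sigma=1.0, n_iters=5):
    """Naive mean-field inference for a fully connected CRF.
    unary: (N, L) label costs; features: (N, D) per-pixel features."""
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))   # Gaussian pixel affinity
    np.fill_diagonal(kernel, 0.0)               # no message to oneself

    q = softmax(-unary)                  # initialise Q_i from the unaries
    for _ in range(n_iters):
        msg = kernel @ q                 # messages from all other pixels
        # Potts compatibility: label l at pixel i is penalised by the mass
        # its neighbours place on every other label.
        pairwise = w * (msg.sum(axis=1, keepdims=True) - msg)
        q = softmax(-unary - pairwise)   # local update and normalisation
    return q

# Toy usage: two confident "bg" pixels pull a similar-looking,
# weakly "cat" pixel over to bg.
unary = np.array([[0.0, 2.0],
                  [0.0, 2.0],
                  [0.6, 0.4]])
features = np.array([[0.0], [0.1], [0.05]])
q = mean_field(unary, features)
print(q.argmax(axis=1))  # all three pixels end up labelled 0 (bg)
```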
Fully Connected CRFs as a CNN
[Block diagram: one mean-field iteration expressed with CNN operations (bilateral filtering, convolution, a second convolution for the compatibility transform, adding the unaries, SoftMax), taking the current marginals Q, the image I, and the unaries U as inputs.]
CRF as a Recurrent Neural Network
[Block diagram: the mean-field iteration (Bilateral → Conv → Conv → + → SoftMax) is applied repeatedly, taking the image, the unaries, and the previous marginals Q as inputs and producing the segmentation output; unrolling these CRF iterations gives CRF as RNN.]
• Each of these blocks is differentiable → we can backprop through the whole pipeline.
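The unrolling can be sketched in a few lines of numpy (forward pass only; the kernel, compatibility matrix, and unaries here are illustrative stand-ins, not the paper's learned parameters):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def crf_rnn_forward(unary, kernel, compat, n_iters=5):
    """Unrolled mean-field inference as a recurrent net: every iteration is
    message passing (a filtering step), a compatibility transform (like a
    1x1 convolution), adding the unaries, and a SoftMax. Each step is
    differentiable, so an autograd framework could backprop through all
    iterations into the CNN that produced the unaries."""
    q = softmax(-unary)                 # initial marginals from the unaries
    for _ in range(n_iters):
        msg = kernel @ q                # filtering / message passing
        pairwise = msg @ compat         # label compatibility transform
        q = softmax(-unary - pairwise)  # add unaries, renormalise
    return q

# Toy usage: three pixels, two labels; the third pixel's weak prediction
# is corrected by its (fully connected) neighbours.
unary = np.array([[0.0, 2.0], [0.0, 2.0], [0.6, 0.4]])
kernel = 1.0 - np.eye(3)    # uniform affinity between distinct pixels
compat = 1.0 - np.eye(2)    # Potts: penalise differing labels
q = crf_rnn_forward(unary, kernel, compat)
```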
Putting Things Together
[Diagram: FCN followed by the CRF-RNN module, trained end-to-end.]
Experiments
FCN [Long et al., 2015]: 68.3
FCN + CRF [Chen et al., 2015]: 69.5
CRF-RNN (ours): 72.9
Try our demo: http://crfasrnn.torr.vision
Code & model: https://github.com/torrvision/crfasrnn
Shuai Zheng
Bernardino Romera-Paredes
Philip Torr
Examples
http://pp.vk.me/c622119/v622119584/20dc3/7lS5BU2Bp_k.jpg
Examples
http://media1.fdncms.com/boiseweekly/imager/mountain-bikers-are-advised-to-dism/u/original/3446917/walk_thru_sheep_1_.jpg
Examples
http://img.rtvslo.si/_up/upload/2014/07/22/65129194_tour-3.jpg
Examples
http://www.toxel.com/wp-content/uploads/2010/11/bike05.jpg
Not-so-good examples
http://www.independent.co.uk/incoming/article10335615.ece/alternates/w620/planecat.jpg
http://i1.wp.com/theverybesttop10.files.wordpress.com/2013/02/the-world_s-top-10-best-images-of-camouflage-cats-5.jpg?resize=375,500
Not-so-good examples
Tricky examples
http://se-preparer-aux-crises.fr/wp-content/uploads/2013/10/Golum.png
https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcRf4J7Hszkc8Wf6riVUX-cV_K-un8LJy5dYIBW1KDIn6i7UCzGHpg
Tricky examples
http://i.huffpost.com/gen/1478236/thumbs/s-DIRD6-large640.jpg
Tricky examples
Conclusion
• CNNs yield a coarse prediction on pixel-labelling tasks.
• CRFs improve the result by accounting for contextual information in the image.
• Learning the whole pipeline (CNN + CRF) end-to-end significantly improves the results.
Thank You!