
Deep Convolutional Neural Fields for Depth Estimation from a Single Image

Fayao Liu, Chunhua Shen, Guosheng Lin
University of Adelaide, Australia; Australian Centre for Robotic Vision

Estimating depth from a single monocular image is challenging, as no reliable depth cues, e.g., stereo correspondences or motion, are available. Previous methods either exploit geometric assumptions [3] or employ non-parametric methods [1]. The former are constrained to modelling particular scene structures, while the latter are prone to propagating errors through different decoupled stages. Recent efforts have focused on exploiting additional sources of information, e.g., semantic labels [2], which are generally not available. In this paper we present a deep convolutional neural field model for estimating depth from a single image, without relying on any geometric assumptions or extra information. Specifically, we propose a deep structured learning scheme which learns the unary and pairwise potentials of a continuous conditional random field (CRF) [4] in a unified deep convolutional neural network (CNN) framework.

In our method, the integral in the partition function can be analytically calculated, thus we can exactly solve the log-likelihood optimization. Moreover, solving the MAP problem for predicting the depths of a new image is highly efficient, as closed-form solutions exist. We experimentally demonstrate that the proposed method outperforms state-of-the-art depth estimation methods on both indoor and outdoor scene datasets.

Let $\mathbf{x}$ be an image and $\mathbf{y} = [y_1, \ldots, y_n]^\top \in \mathbb{R}^n$ be a vector of continuous depth values of all $n$ superpixels in $\mathbf{x}$. We model the conditional probability distribution of the data with the following density function:

$$\Pr(\mathbf{y}|\mathbf{x}) = \frac{1}{Z(\mathbf{x})} \exp\left(-E(\mathbf{y},\mathbf{x})\right), \tag{1}$$

where $Z$ is the partition function, $Z(\mathbf{x}) = \int_{\mathbf{y}} \exp\{-E(\mathbf{y},\mathbf{x})\}\,\mathrm{d}\mathbf{y}$, and $E$ is the energy function. Here, because $\mathbf{y}$ is continuous, the integral in $Z(\mathbf{x})$ can be analytically calculated under certain circumstances (refer to the paper for details). This is different from the discrete case, in which approximation methods need to be applied. To predict the depths of a new image, we solve the maximum a posteriori (MAP) inference problem:

$$\mathbf{y}^\star = \operatorname*{argmax}_{\mathbf{y}} \Pr(\mathbf{y}|\mathbf{x}). \tag{2}$$
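To see why this is tractable, note (a sketch of the argument; the paper gives the full derivation) that the potentials of Eqs. (3)–(5) below complete into a positive-definite quadratic form in $\mathbf{y}$,

$$E(\mathbf{y},\mathbf{x}) = \mathbf{y}^\top A\,\mathbf{y} - 2\mathbf{z}^\top\mathbf{y} + \mathbf{z}^\top\mathbf{z},$$

where $\mathbf{z} = [z_1,\ldots,z_n]^\top$ collects the unary regressions of Eq. (4) and $A$ is assembled from the pairwise weights of Eq. (6). The partition function is then a standard Gaussian integral,

$$Z(\mathbf{x}) = \frac{\pi^{n/2}}{|A|^{1/2}}\,\exp\left\{\mathbf{z}^\top A^{-1}\mathbf{z} - \mathbf{z}^\top\mathbf{z}\right\},$$

and completing the square in Eq. (2) yields the closed-form MAP solution $\mathbf{y}^\star = A^{-1}\mathbf{z}$.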

We formulate the energy function as a typical combination of unary potentials $U$ and pairwise potentials $V$ over the nodes (superpixels) $\mathcal{N}$ and edges $\mathcal{S}$ of the image $\mathbf{x}$:

$$E(\mathbf{y},\mathbf{x}) = \sum_{p \in \mathcal{N}} U(y_p,\mathbf{x}) + \sum_{(p,q) \in \mathcal{S}} V(y_p,y_q,\mathbf{x}). \tag{3}$$

The unary term $U$ aims to regress the depth value from a single superpixel. The pairwise term $V$ encourages neighbouring superpixels with similar appearances to take similar depths. We aim to jointly learn $U$ and $V$ in a unified CNN framework.

Unary potential. The unary potential is constructed from the output of a CNN by considering the least squares loss:

$$U(y_p,\mathbf{x};\theta) = (y_p - z_p(\theta))^2, \quad \forall p = 1,\ldots,n. \tag{4}$$

Here $z_p$ is the regressed depth of superpixel $p$, parametrized by the CNN parameters $\theta$.

Pairwise potential. We construct the pairwise potential from $K$ types of similarity observations, each of which enforces smoothness by exploiting consistency information of neighbouring superpixels:

$$V(y_p,y_q,\mathbf{x};\beta) = \frac{1}{2}\, R_{pq}\, (y_p - y_q)^2, \quad \forall p,q = 1,\ldots,n. \tag{5}$$
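As a concrete reading of Eqs. (3)–(5), the following minimal NumPy sketch evaluates the energy for a candidate depth vector; the names y, z, R, and edges are ours, not the authors' code, and R holds the pairwise weights $R_{pq}$ defined in Eq. (6) below:

```python
import numpy as np

def energy(y, z, R, edges):
    """Evaluate E(y, x) of Eq. (3) for a candidate depth vector y.

    y     : (n,) candidate superpixel depths
    z     : (n,) CNN-regressed depths z_p(theta) from Eq. (4)
    R     : (n, n) symmetric pairwise weights R_pq (see Eq. (6))
    edges : iterable of neighbouring-superpixel index pairs (p, q)
    """
    unary = np.sum((y - z) ** 2)                        # Eq. (4)
    pairwise = sum(0.5 * R[p, q] * (y[p] - y[q]) ** 2   # Eq. (5)
                   for p, q in edges)
    return unary + pairwise
```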

This is an extended abstract. The full paper is available at the Computer Vision Foundation webpage.

Figure 1: An illustration of our deep convolutional neural field model for depth estimation. The input image is first over-segmented into superpixels. In the unary part, for a superpixel p, we crop the image patch centred around its centroid, then resize it to 224x224 and feed it to a CNN composed of 5 convolutional and 4 fully-connected layers with shared network parameters. In the pairwise part, for a pair of neighbouring superpixels (p,q), we consider K types of similarities and feed them into a fully-connected layer. The outputs of the unary part and the pairwise part are then fed to the CRF structured loss layer, which minimizes the negative log-likelihood. Predicting the depths of a new image x is to maximize the conditional probability Pr(y|x), which has closed-form solutions.
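To make the unary input pipeline of the caption concrete, here is a small sketch of the patch-cropping step; crop_patch is a hypothetical helper, not the authors' code (the paper additionally resizes patches and implements the CNN in MatConvNet):

```python
import numpy as np

def crop_patch(image, centroid, size=224):
    """Crop a size x size patch centred at a superpixel centroid,
    replicating border pixels so patches near the image edge keep
    the full size (a hypothetical helper for illustration).

    image    : (H, W, 3) array
    centroid : (row, col) superpixel centroid
    """
    h = size // 2
    padded = np.pad(image, ((h, h), (h, h), (0, 0)), mode="edge")
    r, c = int(centroid[0]) + h, int(centroid[1]) + h
    return padded[r - h:r + h, c - h:c + h]
```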

Here $R_{pq}$ is the output of the network in the pairwise part (see Fig. 1) for a neighbouring superpixel pair $(p,q)$. We use a fully-connected layer here:

$$R_{pq} = \beta^\top \left[S^{(1)}_{pq}, \ldots, S^{(K)}_{pq}\right]^\top = \sum_{k=1}^{K} \beta_k S^{(k)}_{pq}, \tag{6}$$

where $S^{(k)}$ is the $k$-th similarity matrix whose elements are $S^{(k)}_{pq}$ ($S^{(k)}$ is symmetric); $\beta = [\beta_1, \ldots, \beta_K]^\top$ are the network parameters.
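Putting Eqs. (3)–(6) together gives the closed-form inference mentioned above. Below is a minimal NumPy sketch (names are ours, not the authors' code; it assumes the pairwise sum of Eq. (3) runs over ordered neighbour pairs, so that $A = \mathbf{I} + D - R$ with $D_{pp} = \sum_q R_{pq}$):

```python
import numpy as np

def map_depths(z, S, beta):
    """Closed-form MAP inference y* = argmax_y Pr(y|x) (Eq. (2)).

    z    : (n,) unary CNN regressions z_p(theta)
    S    : (K, n, n) stack of symmetric similarity matrices S^(k),
           zero for non-neighbouring superpixel pairs
    beta : (K,) non-negative pairwise parameters
    """
    R = np.tensordot(beta, S, axes=1)  # Eq. (6): R_pq = sum_k beta_k S^(k)_pq
    D = np.diag(R.sum(axis=1))         # D_pp = sum_q R_pq
    A = np.eye(len(z)) + D - R         # positive definite for beta >= 0
    return np.linalg.solve(A, z)       # y* = A^{-1} z
```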

Learning. We minimize the negative conditional log-likelihood of the training data:

$$\min_{\theta,\,\beta \geq 0} \; -\sum_{i=1}^{N} \log \Pr(\mathbf{y}^{(i)}|\mathbf{x}^{(i)};\theta,\beta) + \frac{\lambda_1}{2}\|\theta\|_2^2 + \frac{\lambda_2}{2}\|\beta\|_2^2, \tag{7}$$

where $\mathbf{x}^{(i)}$ and $\mathbf{y}^{(i)}$ denote the $i$-th training image and the corresponding depth map; $N$ is the number of training images; $\lambda_1$ and $\lambda_2$ are weight decay parameters.
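Because $Z(\mathbf{x})$ is analytic, the data term of Eq. (7) also has a closed form. A minimal sketch under the quadratic-energy form above (hypothetical names; the gradients with respect to $\theta$ and $\beta$ that backpropagation needs are derived in the paper and omitted here):

```python
import numpy as np

def neg_log_likelihood(y, z, A):
    """Exact -log Pr(y|x) for one training pair, using the analytic
    partition function of the quadratic (Gaussian) energy form:
    -log Pr = y'Ay - 2z'y + z'A^{-1}z - (1/2)log|A| + (n/2)log(pi).
    The lambda-weighted decay terms of Eq. (7) would be added on top.
    """
    _, logdet = np.linalg.slogdet(A)   # A is positive definite
    return (y @ A @ y - 2.0 * z @ y + z @ np.linalg.solve(A, z)
            - 0.5 * logdet + 0.5 * len(y) * np.log(np.pi))
```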

The implementation of the method using MatConvNet [5] and the details of the network architectures are described in the paper. We conclude that the proposed method provides a general framework for jointly learning a deep CNN and a continuous CRF, which can be used for depth estimation of general scenes.

[1] Kevin Karsch, Ce Liu, and Sing Bing Kang. DepthTransfer: Depth extraction from video using non-parametric sampling. PAMI, 2014.

[2] Lubor Ladicky, Jianbo Shi, and Marc Pollefeys. Pulling things out of perspective. In CVPR, 2014.

[3] David C. Lee, Abhinav Gupta, Martial Hebert, and Takeo Kanade. Estimating spatial layout of rooms using volumetric reasoning about objects and surfaces. In NIPS, 2010.

[4] Tao Qin, Tie-Yan Liu, Xu-Dong Zhang, De-Sheng Wang, and Hang Li. Global ranking using continuous conditional random fields. In NIPS, 2008.

[5] Andrea Vedaldi. MatConvNet. http://www.vlfeat.org/matconvnet/, 2013.