
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 22, NO. 7, JULY 2013

A Generalized Random Walk With Restart and Its Application in Depth Up-Sampling and Interactive Segmentation

Bumsub Ham, Student Member, IEEE, Dongbo Min, Member, IEEE, and Kwanghoon Sohn, Senior Member, IEEE

Abstract: In this paper, the origin of random walk with restart (RWR) and its generalization are described. It is well known that the random walk (RW) and the anisotropic diffusion models share the same energy functional, i.e., the former provides a steady-state solution and the latter gives a flow solution. In contrast, the theoretical background of the RWR scheme is different from that of the diffusion-reaction equation, although the restarting term of the RWR plays a role similar to the reaction term of the diffusion-reaction equation. The behaviors of the two approaches with respect to outliers reveal that they possess different attributes in terms of data propagation. This observation leads to the derivation of a new energy functional, where both volumetric heat capacity and thermal conductivity are considered together, and provides a common framework that unifies both the RW and the RWR approaches, in addition to other regularization methods. The proposed framework allows the RWR to be generalized (GRWR) in semilocal and nonlocal forms. The experimental results demonstrate the superiority of GRWR over existing regularization approaches in terms of depth map up-sampling and interactive image segmentation.

Index Terms: Anisotropic diffusion, depth up-sampling, diffusion-reaction equation, interactive segmentation, random walk with restart (RWR), thermal diffusivity.

I. INTRODUCTION

Many researchers have attempted to answer the following questions: "How can a computer extract useful information from digital photographs or videos, as a human being does?", or "What is the optimal way of completing this process?" Some physiological observations have been translated into mathematical models and subsequently implemented with an approximation for simplicity. This allows machinery to (semi-) automatically handle complicated problems such as object recognition, surveillance, object tracking, and 3D reconstruction. One of the most important technologies employed to address the aforementioned questions is to regularize an image while preserving universal features. Image regularization can be classified into two categories, local and nonlocal approaches, according to how the neighborhood used in the regularization process is defined.

Manuscript received May 18, 2012; revised November 3, 2012; accepted February 25, 2013. Date of publication March 20, 2013; date of current version May 13, 2013. This work was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the Information Technology Research Center support program supervised by the National IT Industry Promotion Agency (NIPA) (NIPA-2012-H0301-12-1008). The work of D. Min was supported by the research grant for the Human Sixth Sense Programme at the Advanced Digital Sciences Center from Singapore's Agency for Science, Technology, and Research. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Xilin Chen.

B. Ham and K. Sohn are with the School of Electrical and Electronic Engineering, Yonsei University, Seoul 120-749, South Korea (e-mail: [email protected]; [email protected]).

D. Min is with the Advanced Digital Sciences Center, 138632, Singapore (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIP.2013.2253479

A number of approaches to using local regularization have been proposed, including the anisotropic diffusion [1], [2], the total variation [3], [4], the Mumford-Shah regularization [5], [6], the bilateral filter [7], [8], the random walk (RW) [9], [10], and the random walk with restart (RWR) [11] (see also [12]). Perona and Malik proposed an anisotropic diffusion model, the thermal diffusivity of which changes from a constant to a space-variant function called the "edge-stopping" function [1]. You et al. addressed the anisotropic diffusion in an optimization problem and proposed its energy functional [13]. The work was extended within a robust statistics framework, resulting in the robust anisotropic diffusion [14]. The bilateral filter, first proposed by Tomasi and Manduchi, is a nonlinear filter that combines tonal and spatial kernels. It regularizes homogeneous regions while preserving important features [7]. However, the bilateral filter is an intuitive method with no theoretical links to other existing methods [15], [16]. Elad proposed an energy functional corresponding to the bilateral filter and showed how the bilateral filter can be improved and expanded in order to handle more complicated reconstruction problems [16]. The RW approach is a classical method that estimates the probability of a random walker on a graph [9]. Theoretically, the method shares the same energy functional with the anisotropic diffusion [10]. Specifically, the RW model provides a steady-state solution, while the anisotropic diffusion gives a flow solution. Shen et al. [17] generalized the RW model by introducing an augmented node similar to graph cuts [18], which can be thought of as imposing a prior knowledge on the RW [19]. Recently, the RWR method has become increasingly popular, as its restarting term gives meaningful information in the steady-state, allowing the global relation to be considered at all scales (or at all times). Thus, the RWR approach is more suitable to some applications, such as interactive segmentation [20], cost aggregation [21], and information retrieval [22].

1057-7149/$31.00 © 2013 IEEE


Nonlocal regularization has been intensively studied such that (semi-) local methods have been extended to the corresponding nonlocal form, thus allowing texture information to be successfully leveraged without altering a signal to be preserved [23]. Gilboa and Osher proposed the nonlocal diffusion [24], a nonlocal counterpart of the anisotropic diffusion, to better capture texture information through nonlocal processing. The symmetric energy flow preserves the overall energy, as in the anisotropic diffusion scheme, and prevents singular regions from being blurred [24], [25]. Buades et al. proposed the nonlocal mean filter [26] as a nonlocal extension of the bilateral filter by utilizing a patch-wise affinity function. Protter et al. presented an energy functional of the nonlocal mean filter and further generalized it within a weighted least square framework [27]. Similarly, Pizarro et al. generalized the nonlocal mean filter by utilizing a patch similarity for both the fidelity and smoothness terms [28]. Recently, the total variation [3], [4] and the Mumford-Shah regularization [5], [6] were also extended to the corresponding nonlocal formulation for better handling of fine structures and textures [29], [30].

To the best of our knowledge, there have been no studies on the origin of the RWR model. In general, the RWR can be described as an ad-hoc method of the RW approach in that a restarting term is simply added to obtain a non-trivial steady-state solution. To explore the origin of the RWR, we first investigate the relationship between the RW and the diffusion models, as the two approaches have been known to share the same energy functional [10]. Namely, the RW model provides a steady-state solution, while the diffusion approach yields a flow solution. On the other hand, the behavior of the RWR is different from that of the diffusion-reaction equation, although the restarting term of the RWR constrains the steady-state solution to an initial condition in the same manner as the reaction term in the diffusion-reaction equation.

In this paper, we show that the RWR and the diffusion-reaction equation have different theoretical backgrounds. The behavior of the two models with respect to outliers reveals that each approach possesses different attributes in terms of data propagation in the presence of the outliers. Specifically, it is shown that the RWR is more robust against outliers than the diffusion-reaction equation. Based on this observation, we propose a new energy functional, where both volumetric heat capacity and thermal conductivity are considered together, and provide a common framework that unifies both the RW and the RWR approaches, as well as other regularization methods. The proposed energy functional allows us to generalize the RWR in semi-local and nonlocal frameworks. To verify its performance, the generalized RWR (GRWR) scheme is applied to depth map up-sampling and interactive image segmentation. The experimental results show that: 1) the GRWR approach is more robust to outliers; 2) the GRWR can aggregate texture information better; and 3) a global relation can be captured by the GRWR model with no stopping criterion, which is not feasible in classical diffusion.

The remainder of this paper is organized as follows. The anisotropic diffusion and the RW model are briefly reviewed in Section II. A common energy functional that unifies the RW and the RWR schemes and generalizes them in semi-local and nonlocal frameworks is described in Section III. An extensive analysis of experimental results is presented in Section IV. Finally, conclusions and suggestions for future work are given in Section V.

II. DIFFUSION AND RANDOM WALK

Let $u(x) : \Omega \to \mathbb{R}^+$ be a function with a continuous image domain, where $\Omega \subset \mathbb{R}^2$ is an open and bounded space with $x \in \Omega$ and $y \in \Omega$ being 2D vectors, which represent spatial coordinates.

A. Anisotropic Diffusion

The heat equation, also known as the diffusion equation, is a fundamental partial differential equation that models the distribution of heat or temperature over a given domain with respect to time. Perona and Malik proposed the anisotropic diffusion model, and applied this physics model to image processing, particularly for an edge preserving regularization [1]. With an initial condition $f(x)$, the anisotropic diffusion is defined as follows [24]:

$$\partial_t u(x) = \int_{\Omega} (u(y) - u(x))\, w(x, y)\, dy \tag{1}$$

$$u_{t=0}(x) = f(x) \tag{2}$$

where $\partial_t$ denotes a partial derivative with respect to time $t$. The affinity function $w(x, y)$ is positive, $w(x, y) > 0$, and symmetric, $w(x, y) = w(y, x)$, playing a role as a discontinuity marker that stops diffusion across different features:

$$w(x, y) = \begin{cases} \exp\left[-|f(x) - f(y)|^2 / h^2\right], & y \in \Omega_L(x) \\ 0, & y \notin \Omega_L(x) \end{cases} \tag{3}$$

with $\Omega_L(x) = \{y \in \Omega \setminus x : |y - x| \le 1\}$. The range bandwidth is represented as $h$. Note that $w(x, y)$ corresponds to the thermal diffusivity in physics [24].
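A discrete reading of (1)-(3) can be sketched as follows. This is only an illustration under simplifying assumptions that are not in the paper: the image is replaced by a short 1D signal, the affinity is stored as a dense matrix, and the names `local_affinity`, `diffusion_step`, and the step size `tau` are hypothetical.

```python
import numpy as np

def local_affinity(f, h):
    """1D analogue of (3): W[k, l] = exp(-|f_k - f_l|^2 / h^2) if |k - l| = 1, else 0."""
    f = np.asarray(f, dtype=float)
    idx = np.arange(f.size)
    dist = np.abs(idx[:, None] - idx[None, :])
    W = np.exp(-(f[:, None] - f[None, :]) ** 2 / h ** 2)
    return np.where(dist == 1, W, 0.0)

def diffusion_step(u, W, tau):
    """One forward-Euler step of (1): u_k += tau * sum_l W[k, l] * (u_l - u_k)."""
    return u + tau * (W @ u - W.sum(axis=1) * u)

# A step edge: the affinity across the edge is nearly zero, so diffusion stops there.
f = np.array([10.0, 10.0, 10.0, 50.0, 50.0, 50.0])
W = local_affinity(f, h=10.0)
u = f.copy()                      # initial condition (2)
for _ in range(100):
    u = diffusion_step(u, W, tau=0.25)
```

Because `W` is symmetric, each step only moves intensity between neighbors, so the sum of `u` is preserved in this sketch.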

B. Random Walk

The RW is a classical method in the field of random processes and is closely related to circuit theory [9]. It formulates the trajectory of a random walker that takes successive random steps. The RW model is usually defined on a discrete graph, but without a loss of generality, its continuous formalization can be represented, for a starting position $f(x)$, as

$$u^{t+1}(x) = \frac{\int_{\Omega} u^{t}(y)\, w(x, y)\, dy}{\int_{\Omega} w(x, y)\, dy} \tag{4}$$

$$u^{0}(x) = f(x) \tag{5}$$

where $u^{t}(x)$ represents the trajectory or the position of a random walker at time $t$. Similar to the anisotropic diffusion model in (1), any monotonically decreasing function can be used as the affinity function $w(x, y)$.


C. Relationship Between Diffusion and Random Walk

Theoretically, the RW and the anisotropic diffusion schemes share the same origin and thus, they have the same energy functional [10]. Let us consider the following energy functional:

$$E(u) = \frac{1}{4} \int_{\Omega} \int_{\Omega} (u(x) - u(y))^2\, w(x, y)\, dy\, dx \tag{6}$$

where

$$w(x, y) = \begin{cases} \exp\left[-|f(x) - f(y)|^2 / h^2\right], & y \in \Omega_L(x) \\ 0, & y \notin \Omega_L(x) \end{cases} \tag{7}$$

with $\Omega_L(x) = \{y \in \Omega \setminus x : |y - x| \le 1\}$.

Since this energy functional is linear and strictly convex, a global minimum is guaranteed. This minimum can be calculated via the steepest descent method or the Gauss-Jacobi iteration as follows:

1) Flow Solution Via Steepest Descent Method: Since the derivative of the energy functional is

$$E'(u) = \int_{\Omega} (u(x) - u(y))\, w(x, y)\, dy \tag{8}$$

the flow solution with an initial condition $f(x)$ is obtained as

$$\partial_t u(x) = -E'(u) = \int_{\Omega} (u(y) - u(x))\, w(x, y)\, dy \tag{9}$$

which is identical to the anisotropic diffusion model in (1).

2) Steady-State Solution Via Gauss-Jacobi Iteration: When a solution reaches the steady-state, an energy transition with respect to time approaches 0, i.e., $\partial_t u(x) = 0$. Thus, the steady-state solution is given by

$$0 = \int_{\Omega} (u(x) - u(y))\, w(x, y)\, dy. \tag{10}$$

This can be solved by Gauss-Jacobi iteration as follows:

$$u^{t+1}(x) = \frac{\int_{\Omega} u^{t}(y)\, w(x, y)\, dy}{\int_{\Omega} w(x, y)\, dy} \tag{11}$$

which is equivalent to the RW model in (4).

Accordingly, it can be seen that the RW and the anisotropic diffusion models seek the same global minimum on a given energy functional: the former provides the steady-state solution, while the latter gives the flow solution.
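The link between (9) and (11) can be checked numerically. The sketch below assumes a 1D signal and a dense affinity matrix (neither is the paper's setting), and uses a per-node step size `tau = 1 / sum_l W[k, l]`, the choice the paper discusses later in Remark 1. With that step size, one forward-Euler step of the flow reproduces one RW update. The helper name `local_affinity` is hypothetical.

```python
import numpy as np

def local_affinity(f, h):
    """1D analogue of (7): nonzero only between immediate neighbors."""
    f = np.asarray(f, dtype=float)
    idx = np.arange(f.size)
    dist = np.abs(idx[:, None] - idx[None, :])
    W = np.exp(-(f[:, None] - f[None, :]) ** 2 / h ** 2)
    return np.where(dist == 1, W, 0.0)

f = np.array([10.0, 12.0, 11.0, 50.0, 52.0, 51.0])
W = local_affinity(f, h=10.0)
d = W.sum(axis=1)                 # discrete stand-in for the integral of w(x, y) over y
u = f.copy()

rw_next = (W @ u) / d             # Gauss-Jacobi / RW update, as in (11)
tau = 1.0 / d                     # per-node step size (assumption of this sketch)
flow_next = u + tau * (W @ u - d * u)   # forward-Euler step of the flow (9)
```

Both updates give the same vector, which is the algebraic identity `u + (W u - d u) / d = (W u) / d`.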

III. GENERALIZED RANDOM WALK WITH RESTART

In this section, we first observe the relationship between the RWR and the diffusion-reaction equation, and then describe a unified energy functional for the RW and the RWR models. The RWR approach is ultimately generalized in both semi-local and nonlocal forms.

A. Problem Statement

The steady-state solutions of the anisotropic diffusion and the RW models give no meaningful information, i.e., a constant image. To avoid this problem, the diffusion-reaction equation constrains the steady-state solution to an initial condition [31] as follows:

$$\partial_t u(x) = \int_{\Omega} (u(y) - u(x))\, w(x, y)\, dy + \lambda\, (f(x) - u(x)) \tag{12}$$

where $\lambda$ ($> 0$) represents the regularization parameter that controls the leverage between a fidelity term and a regularization term. Similarly, the RWR model is defined as

$$u^{t+1}(x) = (1 - c)\, \frac{\int_{\Omega} u^{t}(y)\, w(x, y)\, dy}{\int_{\Omega} w(x, y)\, dy} + c\, f(x) \tag{13}$$

where the restarting probability $c$ means that a random walker goes back to the starting position $f(x)$ with the probability $c$.
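The RWR update (13) can be sketched on a toy problem as follows. The 1D signal, the dense matrix representation, and the names `local_affinity` and `rwr_iterate` are assumptions made for illustration only. The sketch also compares the iterate with the fixed point of (13), obtained by solving `(I - (1 - c) P) u = c f`, where `P` is the row-normalized affinity.

```python
import numpy as np

def local_affinity(f, h):
    """1D analogue of the local affinity (3)."""
    f = np.asarray(f, dtype=float)
    idx = np.arange(f.size)
    dist = np.abs(idx[:, None] - idx[None, :])
    W = np.exp(-(f[:, None] - f[None, :]) ** 2 / h ** 2)
    return np.where(dist == 1, W, 0.0)

def rwr_iterate(f, W, c, n_iter):
    """Iterate (13): u <- (1 - c) * P u + c * f, with P the row-normalized affinity."""
    P = W / W.sum(axis=1, keepdims=True)
    u = f.copy()
    for _ in range(n_iter):
        u = (1.0 - c) * (P @ u) + c * f
    return u

f = np.array([10.0, 12.0, 11.0, 50.0, 52.0, 51.0])
W = local_affinity(f, h=10.0)
c = 1.0 / 11.0
u_iter = rwr_iterate(f, W, c, n_iter=500)

# Fixed point of (13), solved directly.
P = W / W.sum(axis=1, keepdims=True)
u_closed = c * np.linalg.solve(np.eye(f.size) - (1.0 - c) * P, f)
```

Unlike the RW iteration, the fixed point here is not a constant signal: the two intensity plateaus of `f` remain separated.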

As described in Section II-C, the anisotropic diffusion and the RW model are based on the same energy functional. The following questions should then be considered: "What is the energy functional of the RWR model?" and "Is the energy functional of the RWR model equivalent to that of the diffusion-reaction equation?" Although the RWR approach has been successfully applied to many applications (e.g., segmentation [20], image matting [32], information retrieval [22], annotation refinement [33], graph matching [34]), the origin of the RWR has not yet been extensively investigated. Knowledge of the origin of the RWR model will allow us to better understand its behavior from the perspective of energy flow and further generalize its energy functional.

Proposition 1: The energy functional of the diffusion-reaction equation is different from that of the RWR model.

Proof: Let us consider the following energy functional that is similar to (6), with the exception of an additional fidelity term:

$$E_{DR}(u) = \frac{1}{4} \int_{\Omega} \int_{\Omega} (u(x) - u(y))^2\, w(x, y)\, dy\, dx + \frac{\lambda}{2} \int_{\Omega} (u(x) - f(x))^2\, dx. \tag{14}$$

The solution can be computed in two ways:

1) Flow Solution Via Steepest Descent Method:

$$\partial_t u(x) = \int_{\Omega} (u(y) - u(x))\, w(x, y)\, dy + \lambda\, (f(x) - u(x)). \tag{15}$$

2) Steady-State Solution Via Gauss-Jacobi Iteration:

$$u^{t+1}(x) = \frac{\int_{\Omega} u^{t}(y)\, w(x, y)\, dy + \lambda f(x)}{\int_{\Omega} w(x, y)\, dy + \lambda}. \tag{16}$$

The flow solution is the same as that for the diffusion-reaction equation in (12). However, the steady-state solution does not correspond to that of the RWR model in (13), leading to the conclusion that the RWR and the diffusion-reaction equation have a different energy functional.

Proposition 1 also means that the energy functional in (14) unifies the anisotropic diffusion ($\lambda = 0$) and the diffusion-reaction equation ($\lambda \neq 0$). It, however, does not unify both the RW and the RWR models although it becomes the energy functional of the RW when $\lambda = 0$. Note that the nonlocal extension of (16) is similar to the NDS model proposed in [28].
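One way to see why (16) does not match (13) is to rewrite (16) in the same two-term form. The shorthand $d(x)$ below is introduced here for illustration and does not appear in the paper.

```latex
% Shorthand (assumption of this note): d(x) = \int_\Omega w(x, y) dy.
u^{t+1}(x)
  = \frac{d(x)}{d(x) + \lambda}
    \cdot \frac{\int_{\Omega} u^{t}(y)\, w(x, y)\, dy}{d(x)}
  + \frac{\lambda}{d(x) + \lambda}\, f(x)
% Compare with (13): the weight on f(x) is \lambda / (d(x) + \lambda),
% which depends on x, whereas (13) uses one constant c for every x.
```

Read this way, (16) behaves like a restart scheme whose restart weight varies from pixel to pixel, and it reduces to the form of (13) only when $d(x)$ is the same at every $x$.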

B. Derivation of Random Walk With Restart

1) Behavioral Analysis Against Outliers: Before deriving the energy functional of the RWR model, we first compare the behaviors of the diffusion-reaction equation and the RWR. Shown in Fig. 1(a) are input images that, from left to right, are noise-free, corrupted by the Gaussian noise with a standard deviation of 20, and corrupted by the impulsive noise with a density of 0.05, respectively. These images are filtered by the diffusion-reaction equation and the RWR model. Both methods show similar filtering behaviors for the noise-free and the Gaussian noisy images. However, against impulsive noise, the RWR method shows more robust filtering results than the diffusion-reaction equation, i.e., most of the impulsive noise, which is rarely handled by the diffusion-reaction equation, is effectively eliminated by the RWR within a small number of iterations. The anisotropic diffusion and the RW models also showed similar results and thus, the findings are not shown here. The results imply that there is an energy functional that unifies the RW and RWR models. This interesting observation gives us new insights into filtering algorithm design. Consequently, we must investigate why the RW and the RWR methods are robust to outliers. The answer is closely related to the thermal diffusivity in physics, which will be described in the next section.

Fig. 1. Different behaviors of the diffusion-reaction equation and the RWR model. (a) Initial image (from left to right) that is noise-free, corrupted by the Gaussian noise with a standard deviation of 20, and corrupted by the impulsive noise with a density of 0.05. (b) Results obtained with the diffusion-reaction equation. (c) Results obtained with the RWR model. The number of iterations is set to 50 in the diffusion-reaction equation and 20 in the RWR model, respectively, so that the filtering results show a similar extent of blurring. The regularization parameter $\lambda$ in (12) and the restarting probability $c$ in (13) are set to 1/10 and 1/11, respectively. Both methods show similar filtering behaviors for both the original image and the Gaussian noise image. However, we found that the RWR model is more robust against impulsive noise than the diffusion-reaction equation. (a) Initial image. (b) Diffusion-reaction equation. (c) RWR.

2) Thermal Diffusivity in the Diffusion and the RWR Models: First, let us explain the physical meaning of the thermal diffusivity. Thermal diffusivity in the diffusion is defined as $\alpha_T = k/\rho$ [35], where $k$ and $\rho$ represent the thermal conductivity and the volumetric heat capacity, respectively. Materials with a low (high) thermal diffusivity slowly (rapidly) adapt their temperature to the surrounding environment. This implies that the affinity function $w(x, y)$ and thermal diffusivity $\alpha_T$ play a similar role, e.g., a low affinity $w(x, y)$ corresponds to a low thermal diffusivity, thus preventing diffusion and vice versa.

Next, let us describe the volumetric heat capacity in the diffusion and the RW (or the RWR) models. Note that the volumetric heat capacity is closely related to the diffusion velocity and the purity of the material. For instance, pure materials have a higher volumetric heat capacity (a lower diffusion velocity) than mixtures. In other words, materials with a high volumetric heat capacity slowly adapt their temperature to the surrounding environment and vice versa.

However, thermal diffusivity, as defined in the classical diffusion, does not include the volumetric heat capacity, i.e., rapidity is not taken into account. Specifically, the diffusion as described by (12) rarely occurs when a center node has a different distribution from its neighborhood, making the diffusion process sensitive to impulsive outliers, as shown in Fig. 1(b). Classical anisotropic diffusion approaches model the thermal diffusivity with the thermal conductivity only, i.e., the volumetric heat capacity is set to 1 as follows:

$$\alpha_{T_D} = k \simeq w(x, y). \tag{17}$$

In contrast, the thermal diffusivity of the RW and the RWR models includes the thermal conductivity $w(x, y)$ as well as the volumetric heat capacity $\int_{\Omega} w(x, y)\, dy$ as follows:

$$\alpha_{T_R} = \frac{k}{\rho} = \frac{w(x, y)}{\int_{\Omega} w(x, y)\, dy}. \tag{18}$$

The denominator $\int_{\Omega} w(x, y)\, dy$ in (18) indicates the purity of the image (the volumetric heat capacity); it becomes small when a signal within a neighborhood belongs to a mixture (the outlier) and vice versa. Therefore, the diffusion velocity increases when outliers exist, making the RW and the RWR methods more robust to outliers, as shown in Fig. 1(c). To summarize, the filtering properties of the two approaches completely differ according to the definition of the volumetric heat capacity.
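The effect of the denominator in (18) can be illustrated on a toy signal with a single impulse. The sketch below is not the experiment of Fig. 1: it uses a 1D signal, a dense affinity matrix, and compares the fixed points of (16) and (13) rather than a fixed number of iterations. All of these are assumptions of the illustration, as is the helper name `local_affinity`.

```python
import numpy as np

def local_affinity(f, h):
    """1D analogue of the local affinity (3)."""
    f = np.asarray(f, dtype=float)
    idx = np.arange(f.size)
    dist = np.abs(idx[:, None] - idx[None, :])
    W = np.exp(-(f[:, None] - f[None, :]) ** 2 / h ** 2)
    return np.where(dist == 1, W, 0.0)

f = np.array([10.0, 10.0, 10.0, 200.0, 10.0, 10.0, 10.0])   # impulse at index 3
W = local_affinity(f, h=10.0)
d = W.sum(axis=1)          # discrete "volumetric heat capacity"; tiny at the impulse
lam = 0.1
c = lam / (1.0 + lam)
I = np.eye(f.size)

# Fixed point of (16): (D + lam * I - W) u = lam * f  (unnormalized affinity).
u_dr = np.linalg.solve(np.diag(d) + lam * I - W, lam * f)

# Fixed point of (13): (I - (1 - c) P) u = c * f  (affinity divided by d).
P = W / d[:, None]
u_rwr = np.linalg.solve(I - (1.0 - c) * P, c * f)
```

In this toy case `d[3]` is close to zero, so the unnormalized scheme leaves the impulse essentially untouched (`u_dr[3]` stays near 200), while dividing by `d` lets the neighbors pull it down (`u_rwr[3]` is about 300/11).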

3) Energy Functional Unifying the RW and RWR Models: We have observed that RW-based approaches are more robust to impulsive outliers when compared to conventional diffusion methods due to the volumetric heat capacity. Based on this observation, a new energy functional unifying the RW and the RWR models is proposed as follows:

$$E_{RWR}(u) = \frac{1}{4} \int_{\Omega} \int_{\Omega} (u(x) - u(y))^2\, \bar{w}(x, y)\, dy\, dx + \frac{\lambda}{2} \int_{\Omega} (u(x) - f(x))^2\, dx \tag{19}$$

where

$$\bar{w}(x, y) = \frac{w(x, y)}{\int_{\Omega} w(x, y)\, dy}. \tag{20}$$

TABLE I
COMPARISON OF THE SOLUTIONS OF THE ENERGY FUNCTIONAL $E_{DR}(u)$ IN (14) AND THE ENERGY FUNCTIONAL $E_{RWR}(u)$ IN (19)

| Case | Energy functional | Flow Solution | Steady-State Solution |
|---|---|---|---|
| $\lambda = 0$ | $E_{DR}(u)$ | Anisotropic diffusion [1] | Random walk (RW) [9] |
| $\lambda = 0$ | $E_{RWR}(u)$ | Robust scale-space filter [36] | Random walk (RW) [9] |
| $\lambda \neq 0$ | $E_{DR}(u)$ | Diffusion-reaction equation [31] | Local version of NDS [24] |
| $\lambda \neq 0$ | $E_{RWR}(u)$ | Robust diffusion-reaction equation | Random walk with restart (RWR) [33] |

Note that the volumetric heat capacity $\int_{\Omega} w(x, y)\, dy$ is directly incorporated into the energy functional. From a probabilistic point of view, the thermal diffusivity $w(x, y) / \int_{\Omega} w(x, y)\, dy$ corresponds to the probability that a random walker at $x$ transits to $y$ in a single step. Two solutions can also be obtained as follows.

a) Flow solution via steepest descent method:

$$\partial_t u(x) = \frac{\int_{\Omega} (u(y) - u(x))\, w(x, y)\, dy}{\int_{\Omega} w(x, y)\, dy} + \lambda\, (f(x) - u(x)). \tag{21}$$

b) Steady-state solution via Gauss-Jacobi iteration:

$$u^{t+1}(x) = \frac{1}{1 + \lambda}\, \frac{\int_{\Omega} u^{t}(y)\, w(x, y)\, dy}{\int_{\Omega} w(x, y)\, dy} + \frac{\lambda}{1 + \lambda}\, f(x). \tag{22}$$

The steady-state solution is exactly the same as the RWR model in (13) when $c$ is substituted for $\lambda/(1 + \lambda)$ in (22). In contrast to the energy functional of (14), the steady-state solution becomes the RWR model when $\lambda \neq 0$; otherwise, the solution is the RW model. Interestingly, when $\lambda = 0$, the flow solution corresponds to the robust scale-space filter recently proposed in [36], which is more robust to outliers than the classical anisotropic diffusion scheme in (1). Therefore, (21) can be thought of as the diffusion-reaction counterpart of the robust scale-space filter, i.e., the robust diffusion-reaction equation. The energy functionals $E_{DR}(u)$ and $E_{RWR}(u)$ are compared in Table I. Note that when $\lambda = 0$, the steady-state solutions of $E_{DR}(u)$ and $E_{RWR}(u)$ become equivalent to those of the RW model.
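The substitution between (13) and (22) is a one-line conversion. As a small worked check, the values quoted in the caption of Fig. 1 (regularization parameter 1/10, restarting probability 1/11) satisfy it exactly. The function names below are hypothetical.

```python
from fractions import Fraction

def restart_probability(lam):
    """c = lambda / (1 + lambda), the substitution that turns (22) into (13)."""
    return lam / (1 + lam)

def regularization_parameter(c):
    """Inverse mapping: lambda = c / (1 - c), valid for 0 <= c < 1."""
    return c / (1 - c)

c = restart_probability(Fraction(1, 10))
lam = regularization_parameter(Fraction(1, 11))
```

Exact rational arithmetic is used here only so that the identity can be checked without rounding.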

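C. Generalized RWR (GRWR)

The unified energy functional in (19) allows us to generalize the RWR model in nonlocal forms. Let us consider the following energy functional:

$$E_{GRWR}(u) = \frac{1}{4} \int_{\Omega} \int_{\Omega} (u(x) - u(y))^2\, \bar{w}_S(x, y)\, dy\, dx + \frac{\lambda}{2} \int_{\Omega} \int_{\Omega} (u(x) - f(y))^2\, \bar{w}_D(x, y)\, dy\, dx \tag{23}$$

where

$$\bar{w}_i(x, y) = \frac{w_i(x, y)}{\int_{\Omega} w_i(x, y)\, dy}, \quad i = S, D. \tag{24}$$

The affinity functions $w_S(x, y)$ or $w_D(x, y)$ can be represented by the following local, semi-local, and nonlocal affinity functions, respectively.

a) Local affinity function:

$$w_L(x, y) = w(x, y) = \begin{cases} \exp\left[-|f(x) - f(y)|^2 / h^2\right], & y \in \Omega_L(x) \\ 0, & y \notin \Omega_L(x) \end{cases} \tag{25}$$

with $\Omega_L(x) = \{y \in \Omega \setminus x : |y - x| \le 1\}$.

b) Semi-local affinity function:

$$w_{SL}(x, y) = \begin{cases} \exp\left[-|f(x) - f(y)|^2 / h^2 - |x - y|^2 / 2s^2\right], & y \in \Omega_{SL}(x) \\ 0, & y \notin \Omega_{SL}(x) \end{cases} \tag{26}$$

with $\Omega_{SL}(x) = \{y \in \Omega \setminus x : |y - x| \le r\}$. The window radius and spatial bandwidth are denoted as $r$ and $s$, respectively.

c) Nonlocal affinity function:

$$w_{NL}(x, y) = \begin{cases} \exp\left[-\|f_B(x) - f_B(y)\|^2 / h^2\right], & y \in \Omega_{NL}(x) \\ 0, & y \notin \Omega_{NL}(x) \end{cases} \tag{27}$$

where $\Omega_{NL}(x) = \{y \in \Omega \setminus x\}$ and $f_B(x)$ denotes a vector consisting of a patch centered at pixel $x$.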

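1) Flow Solution Via Steepest Descent Method: The flow solution of (23) can be obtained as:

$$\partial_t u(x) = \frac{\int_{\Omega} (u(y) - u(x))\, w_S(x, y)\, dy}{\int_{\Omega} w_S(x, y)\, dy} + \lambda\, \frac{\int_{\Omega} (f(y) - u(x))\, w_D(x, y)\, dy}{\int_{\Omega} w_D(x, y)\, dy} \tag{28}$$

which can be seen as a generalized robust diffusion-reaction equation.

2) Steady-State Solution Via Gauss-Jacobi Iteration: The steady-state solution of (23) can be derived as

$$0 = \int_{\Omega} (u(x) - u(y))\, \bar{w}_S(x, y)\, dy + \lambda \int_{\Omega} (u(x) - f(y))\, \bar{w}_D(x, y)\, dy. \tag{29}$$

The final solution is given by

$$u^{t+1}(x) = \frac{1}{1 + \lambda} \left( \frac{\int_{\Omega} u^{t}(y)\, w_S(x, y)\, dy}{\int_{\Omega} w_S(x, y)\, dy} \right) + \frac{\lambda}{1 + \lambda} \left( \frac{\int_{\Omega} f(y)\, w_D(x, y)\, dy}{\int_{\Omega} w_D(x, y)\, dy} \right). \tag{30}$$

The RWR becomes a special case of (30) when $w_S(x, y)$ and $w_D(x, y)$ are set to the local affinity function in (25) and a constant, respectively. The RW model can be derived by setting $w_S(x, y)$ to the local affinity with $\lambda$ being 0.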

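3) Relationship Between the GRWR and Other Regularization Methods: The solution of the energy functional in (23) is related to other regularization methods according to the solution type and/or the affinity function used, as summarized in Table II.

TABLE II
RELATIONSHIP BETWEEN THE PROPOSED ENERGY FUNCTIONAL IN (23) AND OTHER REGULARIZATION METHODS

| Method | Solution Type | $\lambda$ | $w_D(x, y)$ | $\Omega_{(\cdot)}(x)$ of $w_D(x, y)$ | $w_S(x, y)$ | $\Omega_{(\cdot)}(x)$ of $w_S(x, y)$ |
|---|---|---|---|---|---|---|
| AD [1] | Flow | $= 0$ | - | - | $w_S(x, y)$ | $\{y \in \Omega \setminus x : \lvert y - x \rvert \le 1\}$ |
| RSS [36] | Flow | $= 0$ | - | - | $\bar{w}_S(x, y)$ | $\{y \in \Omega \setminus x : \lvert y - x \rvert \le 1\}$ |
| NLD [24] | Flow | $\neq 0$ | Constant | $\{y : y = x\}$ | $w_S(x, y)$ | $\{y \in \Omega \setminus x\}$ |
| RW [9] | Steady-state | $= 0$ | - | - | $w_S(x, y)$ or $\bar{w}_S(x, y)$ | $\{y \in \Omega \setminus x : \lvert y - x \rvert \le 1\}$ |
| BL [7] | Steady-state (noniterative) | $= 0$ | - | - | $w_S(x, y)$ or $\bar{w}_S(x, y)$ | $\{y \in \Omega : \lvert y - x \rvert \le r\}$ |
| NLM [26] | Steady-state (noniterative) | $= 0$ | - | - | $w_S(x, y)$ or $\bar{w}_S(x, y)$ | $\{y \in \Omega\}$ |
| RWR [20] | Steady-state | $\neq 0$ | Constant | $\{y : y = x\}$ | $\bar{w}_S(x, y)$ | $\{y \in \Omega \setminus x : \lvert y - x \rvert \le 1\}$ |
| GRDR | Flow | $\neq 0$ | $\bar{w}_D(x, y)$ | $\{y \in \Omega : \lvert y - x \rvert \le r\}$ or $\{y \in \Omega \setminus x\}$ | $\bar{w}_S(x, y)$ | $\{y \in \Omega : \lvert y - x \rvert \le r\}$ or $\{y \in \Omega \setminus x\}$ |
| GRWR | Steady-state | $\neq 0$ | $\bar{w}_D(x, y)$ | $\{y \in \Omega : \lvert y - x \rvert \le r\}$ or $\{y \in \Omega \setminus x\}$ | $\bar{w}_S(x, y)$ | $\{y \in \Omega : \lvert y - x \rvert \le r\}$ or $\{y \in \Omega \setminus x\}$ |

AD: anisotropic diffusion [1], RSS: robust scale-space filter [36], NLD: nonlocal diffusion [24], RW: random walk [9], BL: bilateral filter [7], NLM: nonlocal mean filter [26], RWR: random walk with restart [20], GRDR: generalized robust diffusion-reaction equation, GRWR: generalized random walk with restart.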

First, the regularization methods are classified according to the solution types: the flow solution and the steady-state solution. The flow solution is more flexible than the steady-state solution, since the diffusion time, i.e., the number of iterations, can be adjusted, resulting in a solution that varies with respect to time. In contrast, the steady-state solution is unique (piecewise smooth) for the given energy functional, since there is no energy transition in the steady-state. The choice between them therefore depends on the application. For example, the nonlocal diffusion [24] is more suitable for denoising images than the GRWR, although the number of iterations should be specified to yield visually improved results. In image segmentation, the GRWR achieves better results than the nonlocal diffusion, since the GRWR gives the piecewise constant solution in the steady-state, which is preferred in image segmentation.

Second, the regularization parameter $\lambda$ heavily influences the smoothness of the solution. In particular, it determines whether the solution is meaningful in the steady-state. When $\lambda = 0$, the solution diffuses only, and thus gives a trivial solution in the steady-state. In contrast, the steady-state solution becomes meaningful when $\lambda \neq 0$. As this parameter is decreased, the solution becomes smoother, and vice versa. It also has a similar role in the flow solution.

Finally, the affinity function also determines the type of the regularization. Generally, the nonlocal affinity function is based on a patch, which is closely related to a self-example concept. The nonlocal affinity function is capable of discriminating texture information from noisy images, in contrast to the local and semi-local affinity functions. Therefore, the nonlocal mean filter [26] has been widely used in image denoising despite its huge computational overhead.

D. Implementation

Let $u_k : \Omega \to \mathbb{R}^+$ be a function on a discrete image domain, where $\Omega \subset \mathbb{N}^2$ is an open and bounded space with $k \in \Omega$ and $l \in \Omega$ being 2D vectors, representing spatial coordinates. The $u_k^n$ term denotes the intensity of the pixel $k$ at time $n$. The discrete affinity function $w[k, l]$ between two nodes $k = (k_1, k_2)$ and $l = (l_1, l_2)$ is defined within an interest domain $N_{(\cdot)}(x)$, which is a discrete counterpart of $\Omega_{(\cdot)}(x)$. The thermal diffusivity function in (24) is discretized as

$$\bar{w}_i[k, l] = \frac{w_i[k, l]}{\sum_l w_i[k, l]}, \quad i = S, D. \tag{31}$$

We can set $w_i[k, l]$ to the following functions.

a) Local affinity function:

$$w_L[k, l] = \begin{cases} \exp\left[-|f_k - f_l|^2 / h^2\right], & l \in N_L \\ 0, & l \notin N_L \end{cases} \tag{32}$$

with $N_L = \{l \in \Omega \setminus k : |l - k| \le 1\}$.

b) Semi-local affinity function:

$$w_{SL}[k, l] = \begin{cases} \exp\left[-|f_k - f_l|^2 / h^2 - |k - l|^2 / 2s^2\right], & l \in N_{SL} \\ 0, & l \notin N_{SL} \end{cases} \tag{33}$$

with $N_{SL} = \{l \in \Omega \setminus k : |l - k| \le r\}$.

c) Nonlocal affinity function:

$$w_{NL}[k, l] = \begin{cases} \exp\left[-\|f_{B,k} - f_{B,l}\|^2 / h^2\right], & l \in N_{NL} \\ 0, & l \notin N_{NL} \end{cases} \tag{34}$$

where $N_{NL} = \{l \in \Omega \setminus k\}$ and $f_{B,k}$ denotes the patch around pixel $k$. Theoretically, this function measures a patch-wise similarity with an entire image except a reference pixel $k$, but in practice, for computational efficiency, the interest domain $N_{NL}$ is usually constrained using a set of neighboring pixels within a certain spatial distance, i.e., $N_{NL} = \{l \in \Omega \setminus k : |l - k| \le r_N\}$. From here on, we use the constrained interest domain when defining the nonlocal affinity function $w_{NL}$.
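The affinity functions (33) and (34) and the normalization (31) can be sketched as below. To keep the example short it works on a 1D signal with dense matrices, and it pads the signal by edge replication when extracting patches. These choices, and the function names, are assumptions of this sketch and are not taken from the paper.

```python
import numpy as np

def _offsets(n):
    """Matrix of index distances |k - l| for a signal of length n."""
    idx = np.arange(n)
    return np.abs(idx[:, None] - idx[None, :])

def semilocal_affinity(f, h, s, r):
    """1D analogue of (33): range and spatial kernels inside a window of radius r."""
    f = np.asarray(f, dtype=float)
    dist = _offsets(f.size)
    W = np.exp(-(f[:, None] - f[None, :]) ** 2 / h ** 2 - dist ** 2 / (2.0 * s ** 2))
    return np.where((dist >= 1) & (dist <= r), W, 0.0)

def nonlocal_affinity(f, h, r_patch, r_n):
    """1D analogue of (34): squared patch distance, restricted to |l - k| <= r_n."""
    f = np.asarray(f, dtype=float)
    padded = np.pad(f, r_patch, mode="edge")        # border handling is an assumption
    # Row k of `patches` holds the patch of radius r_patch centered at sample k.
    patches = np.stack([padded[i:i + f.size] for i in range(2 * r_patch + 1)], axis=1)
    sq = ((patches[:, None, :] - patches[None, :, :]) ** 2).sum(axis=2)
    dist = _offsets(f.size)
    return np.where((dist >= 1) & (dist <= r_n), np.exp(-sq / h ** 2), 0.0)

def normalize(W):
    """Row normalization as in (31)."""
    return W / W.sum(axis=1, keepdims=True)

f = np.array([10.0, 10.0, 10.0, 50.0, 50.0, 50.0, 10.0, 10.0])
W_sl = semilocal_affinity(f, h=10.0, s=2.0, r=2)
W_nl = nonlocal_affinity(f, h=10.0, r_patch=1, r_n=3)
P_sl, P_nl = normalize(W_sl), normalize(W_nl)
```

After normalization each row sums to one, which matches the reading of the normalized affinity as a one-step transition probability.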


1) Flow Solution Via Steepest Descent Method: By approximating the partial derivative with respect to time via the forward Euler equation with an evolution step size $\tau_n$, we can discretize (28) as follows:

$$u_k^{n+1} = u_k^n + \tau_n \left( \frac{\sum_l w_S[k, l]\, u_l^n}{\sum_l w_S[k, l]} - u_k^n \right) + \lambda \tau_n \left( \frac{\sum_l w_D[k, l]\, f_l}{\sum_l w_D[k, l]} - u_k^n \right). \tag{35}$$

While (35) has a similar form to the nonlocal diffusion equation [24], it can have a larger evolution step size than that of nonlocal diffusion. The evolution step size in the nonlocal diffusion model should decrease according to the neighborhood size so as to ensure stability (see Appendix I).

2) Steady-State Solution Via Gauss-Jacobi Iteration: The steady-state solution of (30) can be approximated as:

$$u_k^{n+1} = \frac{1}{1 + \lambda}\, \frac{\sum_l w_S[k, l]\, u_l^n}{\sum_l w_S[k, l]} + \frac{\lambda}{1 + \lambda}\, \frac{\sum_l w_D[k, l]\, f_l}{\sum_l w_D[k, l]}. \tag{36}$$

Note that the solution remains unchanged when it reaches the steady-state, i.e., $u_k^{n+1} = u_k^n$.

The steady-state solution can also be written in matrix form [37]. We will re-formulate (36) using combinatorial notation. Let $\mathbf{f}$ and $\mathbf{u}_S$ denote $M \times 1$ column vectors representing the initial condition and the steady-state solution, respectively. Also, let $W_S = [w_S[k, l]]_{M \times M}$ and $W_D = [w_D[k, l]]_{M \times M}$ denote the smoothness and the data affinity matrices of size $M \times M$. The corresponding degree matrices are represented by $D_S = \mathrm{diag}(D_S^1, \ldots, D_S^M)$ and $D_D = \mathrm{diag}(D_D^1, \ldots, D_D^M)$, where $D_S^k = \sum_l w_S[k, l]$ and $D_D^k = \sum_l w_D[k, l]$.

When the solution reaches the steady-state, (36) can be re-written in combinatorial form as

$$\mathbf{u}_S = \frac{1}{1 + \lambda} P_S \mathbf{u}_S + \frac{\lambda}{1 + \lambda} P_D \mathbf{f} \tag{37}$$

where $P_S = D_S^{-1} W_S$ ($P_D = D_D^{-1} W_D$) represents a transition matrix whose elements $p_{k,l}^S$ ($p_{k,l}^D$) can be interpreted as the total probability with which a random walker reaches $u_k$ ($f_k$) from $u_l$ ($f_l$) after a single iteration [38]. In contrast to the RWR model in (13), where a random walker moves back to an initial position with a fixed probability $c = \lambda/(1 + \lambda)$, a random walker in the proposed method goes back to an initial position with a probability $c P_D$, which takes the local structure $P_D$ into account. Thus, the GRWR scheme can capture the fine structure and texture information better. The steady-state solution can be expressed as:

$$\mathbf{u}_S = \left( \frac{\lambda}{1 + \lambda} \right) \left( I - \frac{1}{1 + \lambda} P_S \right)^{-1} P_D \mathbf{f} = \left( \frac{\lambda}{1 + \lambda} \right) \sum_{n=0}^{\infty} \left( \frac{1}{1 + \lambda} \right)^{n} P_S^{n} P_D \mathbf{f} \tag{38}$$

where $I$ denotes the identity matrix of size $M$.
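The three forms of the GRWR steady state (the iteration (36), the linear solve in (38), and the series in (38)) can be compared on a toy problem. The sketch below assumes a 1D signal, dense matrices, and a range-only affinity inside a window of radius `r`. The data affinity includes the center sample and the smoothness affinity excludes it, which is one possible reading of the interest domains in Table II. The helper names are hypothetical.

```python
import numpy as np

def window_affinity(f, h, r, include_center):
    """Range affinity exp(-|f_k - f_l|^2 / h^2) inside a window of radius r (1D)."""
    f = np.asarray(f, dtype=float)
    idx = np.arange(f.size)
    dist = np.abs(idx[:, None] - idx[None, :])
    W = np.exp(-(f[:, None] - f[None, :]) ** 2 / h ** 2)
    lo = 0 if include_center else 1
    return np.where((dist >= lo) & (dist <= r), W, 0.0)

def transition(W):
    """P = D^{-1} W, the row-normalized transition matrix."""
    return W / W.sum(axis=1, keepdims=True)

f = np.array([10.0, 12.0, 11.0, 50.0, 52.0, 51.0, 9.0, 11.0])
lam = 0.1
a = 1.0 / (1.0 + lam)                      # so that lam / (1 + lam) = 1 - a
P_S = transition(window_affinity(f, h=10.0, r=2, include_center=False))
P_D = transition(window_affinity(f, h=10.0, r=2, include_center=True))

# Gauss-Jacobi iteration (36).
u_iter = f.copy()
for _ in range(500):
    u_iter = a * (P_S @ u_iter) + (1.0 - a) * (P_D @ f)

# Closed form (38): solve (I - a P_S) u = (1 - a) P_D f.
u_closed = (1.0 - a) * np.linalg.solve(np.eye(f.size) - a * P_S, P_D @ f)

# Series form (38), truncated after 500 terms.
term = P_D @ f
u_series = np.zeros_like(f)
for n in range(500):
    u_series += (1.0 - a) * a ** n * term
    term = P_S @ term
```

For an image of realistic size a dense solve is impractical, so in practice one would expect the iteration or a sparse solver to be used. The dense version is kept here only because it makes the equivalence easy to verify.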

E. Relationship Between the Flow and Steady-State SolutionsWe have shown that EDR(u) in (14) unifies the anisotropic

diffusion and the diffusion-reaction equation, and ERWR(u)

in (19) and EGRWR(u) in (23) unify the RW and the RWRmodels. In this section, we explore the relationship betweentheir flow and steady-state solutions.

Proposition 2: The steady-state solutions of EDR(u),ERWR(u), and EGRWR(u) are equivalent to the correspondingflow solutions with a maximum evolution step size.

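Proof: Let us approximate $\partial_t u(x)$ as

$$\frac{u_k^{n+1} - u_k^n}{\tau^n(k)} \tag{39}$$

with an evolution step size $\tau^n(k)$. Then, (15), (21), and (28) can be represented, respectively, as

$$u_k^{n+1} = \Big(1 - \tau^n(k) \sum_l w[k, l] - \tau^n(k)\lambda\Big) u_k^n + \tau^n(k) \sum_l w[k, l]\, u_l^n + \tau^n(k)\lambda f_k \tag{40}$$

$$u_k^{n+1} = \big(1 - \tau^n(k) - \tau^n(k)\lambda\big) u_k^n + \tau^n(k) \frac{\sum_l w[k, l]\, u_l^n}{\sum_l w[k, l]} + \tau^n(k)\lambda f_k \tag{41}$$

and

$$u_k^{n+1} = \big(1 - \tau^n(k) - \tau^n(k)\lambda\big) u_k^n + \tau^n(k) \frac{\sum_l w_S[k, l]\, u_l^n}{\sum_l w_S[k, l]} + \tau^n(k)\lambda \frac{\sum_l w_D[k, l]\, f_l}{\sum_l w_D[k, l]}. \tag{42}$$

Note that (40), (41), and (42) represent the flow solutions for $E_{DR}(u)$, $E_{RWR}(u)$, and $E_{GRWR}(u)$, respectively. Therefore, the corresponding stability conditions for the flow solutions are computed as

$$0 \le \tau^n(k) \le \frac{1}{\sum_l w[k, l] + \lambda} \tag{43}$$

$$0 \le \tau^n(k) \le \frac{1}{1+\lambda} \tag{44}$$

and

$$0 \le \tau^n(k) \le \frac{1}{1+\lambda}. \tag{45}$$

When the evolution step size is set to its maximum value, (40)–(42) can be written as

$$u_k^{n+1} = \frac{\sum_l w[k, l]\, u_l^n + \lambda f_k}{\sum_l w[k, l] + \lambda} \tag{46}$$

$$u_k^{n+1} = \frac{1}{1+\lambda} \frac{\sum_l w[k, l]\, u_l^n}{\sum_l w[k, l]} + \frac{\lambda}{1+\lambda} f_k \tag{47}$$

and

$$u_k^{n+1} = \frac{1}{1+\lambda} \frac{\sum_l w_S[k, l]\, u_l^n}{\sum_l w_S[k, l]} + \frac{\lambda}{1+\lambda} \frac{\sum_l w_D[k, l]\, f_l}{\sum_l w_D[k, l]} \tag{48}$$

which are identical to the steady-state solutions of $E_{DR}(u)$, $E_{RWR}(u)$, and $E_{GRWR}(u)$ in (16), (22), and (36).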
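The substitution behind the last step can be written out for the GRWR case. The short derivation below is a sketch that writes $\tau^n(k)$ for the evolution step size and $\lambda$ for the restart weight (symbol names assumed here) and inserts the maximum step size $\tau^n(k) = 1/(1+\lambda)$ from (45) into (42): the coefficient of $u_k^n$ vanishes, leaving exactly the update in (48).

```latex
% Substitute \tau^n(k) = 1/(1+\lambda) into the GRWR flow update (42).
\begin{align*}
1 - \tau^n(k) - \tau^n(k)\lambda
  &= 1 - \frac{1+\lambda}{1+\lambda} = 0, \\
u_k^{n+1}
  &= \frac{1}{1+\lambda}\,
     \frac{\sum_l w_S[k,l]\,u_l^n}{\sum_l w_S[k,l]}
   + \frac{\lambda}{1+\lambda}\,
     \frac{\sum_l w_D[k,l]\,f_l}{\sum_l w_D[k,l]} .
\end{align*}
```

The same cancellation occurs in (40) with $\tau^n(k) = 1/(\sum_l w[k,l] + \lambda)$ and in (41) with $\tau^n(k) = 1/(1+\lambda)$.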

Fig. 2. Normalized energy of the RWR and the GRWR methods, according to the number of iterations (horizontal axis: iteration; vertical axis: normalized energy). A test image of Fig. 1(a) (the first column) is regularized by the two methods, and then a normalized energy of both methods is experimentally measured using (19) and (23), respectively. The range parameter and the restarting probability are set to 10 and 0.01, respectively, for both methods. In the GRWR method, the nonlocal affinity function is used in which the patch radius $r_P$ and neighborhood radius $r_N$ are set to 2 and 5, respectively. Although both methods guarantee nontrivial steady-state solutions, the GRWR method achieves a lower normalized energy in the steady-state as well as a faster convergence rate.

Remark 1: When $\lambda = 0$ in (40), the RW model can be referred to as anisotropic diffusion with an adaptive evolution step size $\tau^n(k) = 1/\sum_l w[k, l]$, which plays the same role as the volumetric heat capacity. Thus, the RW approach is more robust to outliers than the anisotropic diffusion scheme, although both are derived from the same energy functional.
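Remark 1 can be made concrete with a one-line substitution. As a sketch, writing $\tau^n(k)$ for the evolution step size and $\lambda$ for the reaction (restart) weight (symbol names assumed here), setting $\lambda = 0$ and $\tau^n(k) = 1/\sum_l w[k,l]$ in the diffusion-reaction flow update (40) removes the dependence on the current value $u_k^n$ and leaves the normalized weighted average used by the RW model:

```latex
% Flow update (40) with \lambda = 0 and \tau^n(k) = 1 / \sum_l w[k,l].
\begin{align*}
u_k^{n+1}
  &= \Big(1 - \tau^n(k)\sum_l w[k,l]\Big) u_k^n
     + \tau^n(k)\sum_l w[k,l]\,u_l^n \\
  &= \frac{\sum_l w[k,l]\,u_l^n}{\sum_l w[k,l]} .
\end{align*}
```

The division by $\sum_l w[k,l]$ is the per-node normalization that the remark compares to a volumetric heat capacity.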

IV. EXPERIMENTAL RESULTS

In this section, the GRWR method is applied to depth image up-sampling and interactive image segmentation. The GRWR could be an alternative to other regularization methods since a global relation can be effectively captured due to the steady-state property of the GRWR. Furthermore, the GRWR approach can effectively handle the complicated texture in highly cluttered regions.

A. Implementation

The RWR and the GRWR methods can be implemented in two ways [37]. One is to obtain the steady-state solution through the power iteration, i.e., by repeatedly applying the Gauss-Jacobi iteration as in (36). The other is to pre-compute and store the pseudo-inverse of the matrix $I - (1 - c)P_S$ as in (38). In general, the computational load of calculating the pseudo-inverse depends heavily on the sparseness of the matrix.

As opposed to the local affinity function, the semi-local and nonlocal affinity functions utilize many neighbors inside $N_{SL}$ and $N_{NL}$, resulting in a semi-dense transition matrix. Thus, the GRWR method using these affinity functions was implemented via the power iteration, which does not require heavy pre-computation or memory usage. In this case, it is important to pre-set the number of iterations for computational efficiency. Namely, the steady-state solution can be efficiently obtained by finding the minimum number of Gauss-Jacobi iterations needed for the solution to converge.

To find the minimum iteration number, a test image of Fig. 1(a) (the first column) was regularized by the RWR and the GRWR methods, and then each normalized energy was experimentally measured using (19) and (23), respectively, according to the number of iterations, as shown in Fig. 2. Note that the normalized energy of the RWR method, which uses the local affinity function, was also measured using the power iteration to compare the convergence rate, even though its transition matrix $I - (1 - c)P_S$ is sparse. The range parameter and the restarting probability were set to 10 and 0.01, respectively, for both methods. In the GRWR method, the nonlocal affinity function in (34) was used in which the patch radius $r_P$ and neighborhood radius $r_N$ were set to 2 and 5, respectively. It is shown that both methods guarantee nontrivial steady-state solutions, but the GRWR method achieves a faster convergence rate. Following this observation, we fixed the iteration number to 300 for the GRWR and 3000 for the RWR in all experiments. Although the required number of iterations might vary with the signal type, resolution, and/or application, we found through various experiments that these settings gave excellent results.
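The power-iteration route can be sketched in a few lines. The example below is a hedged illustration rather than the authors' implementation: it iterates $u \leftarrow (1 - c) P_S u + c\, r$, where `restart_term` stands for $r = P_D f$, and stops when the largest per-node update falls below a tolerance. That stopping rule is a stand-in assumed here; the paper instead measures a normalized energy and fixes the iteration count. The transition matrix, the restart vector, and the function names are hypothetical.

```python
def matvec(P, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]


def power_iteration(P_S, restart_term, c, tol=1e-8, max_iter=10000):
    """Iterate u <- (1 - c) P_S u + c * restart_term until the update stalls.

    Returns the final iterate and the number of iterations performed.
    """
    u = list(restart_term)
    for n in range(1, max_iter + 1):
        smooth = matvec(P_S, u)
        u_next = [(1.0 - c) * s + c * r for s, r in zip(smooth, restart_term)]
        change = max(abs(a - b) for a, b in zip(u_next, u))
        u = u_next
        if change < tol:
            return u, n
    return u, max_iter


# Hypothetical row-stochastic smoothness matrix and restart vector P_D f.
P_S = [[0.5, 0.4, 0.1], [0.4, 0.5, 0.1], [0.1, 0.1, 0.8]]
restart_term = [0.5, 0.5, 10.0]

u_slow, n_slow = power_iteration(P_S, restart_term, c=0.01)
u_fast, n_fast = power_iteration(P_S, restart_term, c=0.5)
```

On this toy problem the smaller restart probability needs more iterations to reach the same tolerance, which is in line with the motivation for pre-setting an iteration budget per method.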

B. Depth Map Up-sampling

1) Background: In the field of 3D computer vision, it is important to find visual correspondence between images. While many stereo algorithms have been extensively studied, most methods are still far from practical use due to their heavy computational complexity and unstable accuracy. Alternatively, active depth sensors can be used to obtain depth information, but their quality is not satisfactory, e.g., depth maps captured by the Mesa Imaging SR4000 have low resolution and are noisy [39]. Many studies have been performed to overcome this limitation [40]–[43]. The GRWR approach can be an appropriate solution to depth map up-sampling due to its steady-state property. Note that the steady-state solution cannot be obtained by conventional methods, which are based on the bilateral filter [41], [42] or mode filter [43].

2) Experimental Environments: The semi-local affinity function ($w_S$ and $w_D$) in (33) was utilized to up-sample the depth map. The nonlocal affinity is not appropriate for use in depth up-sampling: since it is based on the patch similarity, neighboring pixels with high affinity values can be found at different depth layers, causing serious depth fattening problems at depth discontinuities.

We compared the performance of the GRWR model with that of the 2D joint bilateral up-sampling (2D JBU) [41], the 3D joint bilateral up-sampling (3D JBU) [42], and the weighted mode filter (WMF) [43]. All parameters were fixed during the experiments: In Section IV-B-3), the spatial and the range bandwidths were set to 3.0 and 5.0, respectively, for all of the algorithms, while the histogram bandwidth of the WMF was set to 21. The neighborhood radius $r$ was set equal to the spatial bandwidth. The restarting probability $c = \lambda/(1 + \lambda)$ in the GRWR model was set to 0.01. For a proper comparison, no pre-processing or post-processing (e.g., the multiscale color measure (MCM) used in the WMF) was employed [43]. In Section IV-B-4), all parameters were set to the same values as in Section IV-B-3), except that the spatial bandwidth and the neighborhood radius $r$ were set to 7.0.

Fig. 3. Depth up-sampling results for Middlebury test bed images. (a) Reference images (from top to bottom) Teddy and Cones. (b) Initial low-resolution depth maps corrupted by the Gaussian noise with a standard deviation of 30. The down-sampling ratio is eight in each dimension. (c) 2D JBU [41]. (d) 3D JBU [42]. (e) WMF [43]. (f) GRWR results. In the GRWR, the semilocal affinity function in (33) is utilized.

Fig. 4. Results of the synthesized virtual views. The virtual views are synthesized using the reference images (from top to bottom) Teddy and Cones, and the corresponding up-sampled depth map as in Fig. 3. (a) Ground truth depth map. (b) 2D JBU [41]. (c) 3D JBU [42]. (d) WMF [43]. (e) GRWR. The outliers in the up-sampled depth degrade the quality of the synthesized view. The depth map corrupted by the outlier leads to geometric distortion in the synthesized view.

3) Performance Evaluation With Noisy Depth Maps: The reference images (from top to bottom) Teddy and Cones are shown in Fig. 3(a). Initial low-resolution depth maps obtained by down-sampling with a factor of 8 in each dimension were corrupted by the Gaussian noise with a standard deviation of 10, 20, and 30. Fig. 3 shows the up-sampling results when the standard deviation is 30. As evident in Fig. 3(c), the 2D JBU method suppresses the noise to some extent, but also smooths important features such as depth boundaries due to its inherent averaging property [43]. Note that the up-sampled depth maps with noise have inaccurate depth information, leading to geometric distortion. In Fig. 3(d), the 3D JBU scheme shows similar results to those obtained by the 2D JBU, since it also leverages the summation (averaging) property of the bilateral filter when aggregating the cost. The results obtained by the WMF are shown in Fig. 3(e). The WMF regularizes the depth map by utilizing the mode, i.e., it finds a global maximum in a localized histogram for each pixel. Thus, the WMF is more robust to noise than the 2D JBU and the 3D JBU methods and it preserves object boundaries well. However, some artifacts are still observed. It should be noted that the results of the WMF may be different from those of the original paper [43], since the MCM proposed in [43] is not used for fair performance comparison. Depth maps up-sampled by the GRWR method are shown in Fig. 3(f). In contrast to the aforementioned filtering-based approaches, the GRWR approach yields a steady-state solution that is nontrivial due to the restarting term. The noise was successfully suppressed while important features were effectively preserved. Note that the robustness against noise is consistent with the characteristics of the robust scale-space filter [36], which is a specific case of the flow solution of the GRWR model. In [36], it was shown that the robust scale-space filter is robust against various outliers such as the Gaussian noise, the impulsive noise, and a combination of the two.

For a quantitative evaluation, the percentage of bad matching pixels over all regions was measured with ground truth depth maps [44], as shown in Table III. It shows that the 2D JBU slightly outperformed the 3D JBU in the noisy environment. The performance of both methods varied drastically according to the standard deviation of the noise. In contrast, the WMF and the GRWR give more consistent results even though the initial depth map is degraded by severe noise. We also found that the GRWR approach gives the lowest error on the Venus, Teddy, and Cones images at every noise level, whereas on Tsukuba the 2D JBU (standard deviation of 10) and the WMF (standard deviations of 20 and 30) give lower errors.
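The bad-pixel measure used for this kind of comparison can be sketched as follows. This is a minimal illustration under stated assumptions: a pixel is counted as bad when its absolute error against the ground truth exceeds a threshold, and the default threshold of 1.0 is an assumption here, since the value used in the experiments is not given in the text. The function name and the tiny depth maps are hypothetical.

```python
def bad_pixel_percent(estimate, ground_truth, threshold=1.0):
    """Percentage of pixels whose absolute error exceeds `threshold`.

    Both inputs are 2-D lists (rows of depth or disparity values)
    with identical shapes.
    """
    total = 0
    bad = 0
    for est_row, gt_row in zip(estimate, ground_truth):
        for est, gt in zip(est_row, gt_row):
            total += 1
            if abs(est - gt) > threshold:
                bad += 1
    return 100.0 * bad / total


# Tiny illustrative maps: two of the four pixels are off by more than 1.
estimate = [[10.0, 12.0], [20.0, 25.0]]
ground_truth = [[10.0, 10.0], [20.0, 20.0]]
score = bad_pixel_percent(estimate, ground_truth)
```

A lower percentage indicates a more accurate up-sampled map, which is how the entries of Table III are read.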

Fig. 5. Up-sampling results for depth maps captured by the active range sensors: (the first and second rows) ToF sensor (SR4000) [39] and (the third row) structured light sensor (Kinect) [47]. In the ToF (structured light) sensor setup, the sizes of the input depth and reference images are 176 × 144 (80 × 60) and 1024 × 768 (640 × 480), respectively. The depth maps are normalized between 0 and 255. (a) Reference images. (b) Initial low-resolution depth maps. (c) 2D JBU [41]. (d) 3D JBU [42]. (e) WMF [43]. (f) GRWR results. In the GRWR, the semilocal affinity function in (33) is utilized.

TABLE III
OBJECTIVE COMPARISON (THE PERCENT OF BAD MATCHING PIXELS) OF THE UP-SAMPLED DEPTH MAPS

Tsukuba        σ = 10    σ = 20    σ = 30
2D JBU [41]    16.30     28.50     42.30
3D JBU [42]    17.20     29.40     42.50
WMF [43]       20.50     18.80     21.30
GRWR           20.90     21.60     27.20

Venus          σ = 10    σ = 20    σ = 30
2D JBU [41]    17.60     48.00     61.10
3D JBU [42]    18.60     48.80     63.80
WMF [43]        5.55     12.10     26.60
GRWR            3.34      9.99     19.30

Teddy          σ = 10    σ = 20    σ = 30
2D JBU [41]    51.30     73.00     80.60
3D JBU [42]    52.40     73.90     81.20
WMF [43]       34.10     44.70     64.20
GRWR           26.40     41.70     57.30

Cones          σ = 10    σ = 20    σ = 30
2D JBU [41]    53.90     73.80     81.00
3D JBU [42]    55.10     74.20     81.30
WMF [43]       43.60     53.00     64.50
GRWR           38.10     49.50     62.60

To visualize the influence of the degraded depth maps, virtual views were synthesized using the reference images and the corresponding up-sampled depth maps as in Fig. 3. As shown in Fig. 4, the degraded depth maps significantly influence the quality of the synthesized view. In addition, there are many holes in the synthesized view, since the distorted depth values warp several pixels into the wrong location of the virtual view.

4) Performance Evaluation With Depth Maps from Active Range Sensor: We also up-sampled depth maps obtained from the active range sensors (the Time-of-Flight (ToF) sensor and the structured light sensor) in Fig. 5. For the ToF sensor configuration, the SR4000 depth sensor [39] and the Point Grey Flea camera [45] were used to capture a single depth map and its corresponding color image. For spatial alignment, the calibration parameters estimated for the two cameras were utilized to warp the depth data into the corresponding spatial coordinates of the color camera. The sizes of the input depth and color images are 176 × 144 and 1024 × 768, respectively. The depth maps were normalized between 0 and 255. For simulating the up-sampling with the structured light sensor, we used the data set provided by [46], which offers the depth map and corresponding color image captured by the Microsoft Kinect depth sensor [47]. The sizes of the input depth and color images are 80 × 60 and 640 × 480, respectively. An original depth map (640 × 480) from the Kinect was down-sampled with a factor of 8 in order to verify the up-sampling performance. Note that the depth value describes a distance from the sensor, different from the disparity value in Fig. 3.

The 2D JBU and the 3D JBU enhance the depth boundaries to some extent compared to the initial depth map, but the depth boundaries remain blurry. The WMF gives better results than those of the 2D JBU and the 3D JBU. It was already shown in [43] that by applying the MCM to the input sparse depth maps, the WMF approach can achieve very accurate results on the depth boundaries. In our experiments, however, the bilinearly interpolated depth maps, which may be even noisier across the depth boundaries than the original sparse input depth maps, were used as initial inputs for fair comparison with other methods. Thus, Fig. 5(e) shows slightly worse results around the depth boundaries than those of the original WMF paper [43]. In Fig. 5(f), we found that the GRWR shows the best performance among all the methods, even though the MCM is not used, as in the case of the WMF in Fig. 5(e), since it always gives a piecewise constant solution in the steady-state.

C. Interactive Image Segmentation

1) Background: Interactive image segmentation has become increasingly important for digital image editing since it yields satisfactory results that are unattainable by state-of-the-art automatic methods. One of the most difficult tasks in segmentation is to separate the texture in highly cluttered regions.

Fig. 6. Segmentation results on texture images. (a) Original images (the first and third columns) are corrupted by both the Gaussian noise with a standard deviation of 20 and the impulsive noise with a density of 0.05 (the second and fourth columns). Green and blue strokes: the foreground and the background seeds, respectively. (b) RW [10], (c) ARW [48], (d) RWR [20], (e) nonlocal diffusion [24], and (f) GRWR results. In the GRWR, the nonlocal affinity function in (34) is used. The normalized overlap scores are given under the results (first to fourth columns): (b) 0.49, 0.57, 0.47, 0.54; (c) 0.75, 0.72, 0.61, 0.60; (d) 0.77, 0.78, 0.69, 0.71; (e) 0.69, 0.81, 0.67, 0.70; (f) 0.83, 0.85, 0.90, 0.81. As expected, the GRWR approach is more robust to impulsive outliers than the RW, the RWR, and the nonlocal diffusion (the second and fourth columns) schemes. The GRWR method also segments the texture in the highly cluttered regions very well, in the absence of noise (the first and third columns).

There have been many attempts to address the above problem [10], [19], [20], [24], [32]. The graph cuts-based approach mostly produces allowable results in a general image. However, the method suffers from the small cut problem due to the inherent nature of the max-flow/min-cut algorithm [18]. While the RW and the RWR schemes (which are based on a probabilistic framework) do not cause the small cut problem, they still do not work in highly textured images [10], [20], [48]. This texture problem could be handled to some extent by leveraging prior models based on a statistical distribution of the foreground and the background [19], but it significantly increases the computational complexity. Furthermore, the performance of the approach depends heavily on the statistical model used. The nonlocal diffusion method recently proposed by Gilboa and Osher can segment cluttered regions using the nonlocal neighborhood [24], but, unlike the RWR, it does not consider the global relation and thus, it requires that the number of iterations be set manually for an accurate segmentation.

The GRWR method with the nonlocal affinity function in (34) can be an alternative solution to handle the texture problem in image segmentation since the nonlocal affinity function inherently considers both texture and structure information, similar to the nonlocal diffusion model. Furthermore, the global relation can be easily captured in the steady-state solution.

2) Experimental Environments: The performance of the RW [10], the anisotropic RW (ARW) [48], the RWR [20], the nonlocal diffusion [24], and the GRWR approaches was compared. All parameters were fixed during the experiments: the range parameter was set to 10 for all of the algorithms, while the restarting probability of the RWR and the GRWR models was set to 0.01. In the nonlocal diffusion approach, the patch radius $r_P$ and neighborhood radius $r_N$ were set to 2 and 5, respectively. Note that the performance of the nonlocal diffusion model depends heavily on the number of iterations, and its steady-state solution gives no meaningful information. Thus, the number of iterations was carefully set through intensive experiments, meaning that it should be set differently for each image. In the GRWR, the sizes of the patch and neighborhood in the nonlocal affinity function in (34) were set to be equal to those of the nonlocal diffusion model.

Fig. 7. Segmentation results on natural images. From left to right: initial input images, ARW [48], RWR [20], nonlocal diffusion [24], and GRWR results. Green and blue strokes: the foreground and the background seeds, respectively. The nonlocal affinity function in (34) is used in the GRWR. The nonlocal diffusion, the ARW, and the RWR approaches exhibit similar performance. The GRWR method combines the properties of nonlocal diffusion and the RWR model: texture information is successfully extracted and a nontrivial steady-state solution is guaranteed. The normalized overlap scores are given below each result (ARW, RWR, nonlocal diffusion, GRWR): first row 0.81, 0.83, 0.84, 0.86; second row 0.78, 0.77, 0.81, 0.82; third row 0.69, 0.72, 0.76, 0.78.

3) Performance Evaluation With Synthetic Images: The segmentation results obtained on the texture images are shown in Fig. 6. The RWR [20], the ARW [48], and the nonlocal diffusion models [24] discriminate the texture better than the RW approach. Note that, although the RWR and the nonlocal diffusion models show similar behavior, the number of iterations in nonlocal diffusion should be carefully set through exhaustive experiments. The ARW and the RWR models show similar behavior, while the GRWR approach shows the best performance in that it discriminates the texture along sharp boundaries very well. In contrast to the nonlocal diffusion, the GRWR model considers the global relations at all scales due to the steady-state property and thus, the stopping criterion is not required. Note that all RW-based approaches, including the RW, the ARW, the RWR, and the GRWR methods, were not affected by impulsive outliers due to the volumetric heat capacity used in the affinity function. The nonlocal diffusion scheme is also robust to outliers since the nonlocal operator can, to some extent, discriminate outliers from the true signal by using a patch similarity [24]. However, the robustness of the nonlocal diffusion approach is still worse than that of the GRWR scheme. For a quantitative comparison, the similarity between the segmentation results and the ground truth was measured by a normalized overlap score $O$ [49]:

$$O = \frac{|A \cap B|}{|A \cup B|} \tag{49}$$

where $A$ and $B$ are the sets of pixels assigned as the foreground from the segmentation results and from the ground truth, respectively. A higher score means better segmentation performance. As expected, the GRWR model is found to yield the best segmentation results.
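The overlap score in (49) is the ratio of the intersection to the union of the two foreground sets. A minimal sketch, assuming the masks are given as flat sequences of 0/1 labels (the function name, the mask layout, and the convention of returning 1.0 for two empty foregrounds are choices made here for illustration):

```python
def normalized_overlap(mask_a, mask_b):
    """Normalized overlap |A ∩ B| / |A ∪ B| of two binary foreground masks."""
    fg_a = {i for i, v in enumerate(mask_a) if v}
    fg_b = {i for i, v in enumerate(mask_b) if v}
    union = fg_a | fg_b
    if not union:
        # Both masks are empty; treat them as a perfect match (a convention).
        return 1.0
    return len(fg_a & fg_b) / len(union)


segmentation = [1, 1, 0, 0]
ground_truth = [1, 0, 1, 0]
score = normalized_overlap(segmentation, ground_truth)  # 1 shared / 3 in union
```

Because the score only counts pixels, two segmentations with very different boundary quality can receive nearly the same value.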

4) Performance Evaluation With Natural Images: We conducted additional experiments with natural images [50] which are highly textured and have a similar color distribution between the foreground and the background. The results obtained with the GRWR, the ARW [48], the RWR [20], and the nonlocal diffusion approaches [24] were compared in Fig. 7. The graph cuts and the RW schemes generally exhibit worse performance than the RWR and thus, the results obtained with these methods are not shown here (see the results in [20]). Although the nonlocal diffusion and the RWR approaches exhibit similar performance, the latter guarantees a nontrivial steady-state solution that cannot be achieved by the former. The GRWR scheme combines the properties of the two methods: the structure and texture information are successfully extracted (see the tiger's head and the bird's breast) and a nontrivial steady-state solution can be achieved within a few iterations.

Fig. 8. Limitation of the objective evaluation using normalized overlap score: segmentation results on natural images for (from left to right) the ARW [48], the RWR [20], the nonlocal diffusion [24], and the GRWR are shown. The normalized overlap scores given below each result are 0.89, 0.89, 0.89, and 0.88, respectively. The score cannot capture the coherence or the connectivity of the boundary completely as it simply counts the number of segmented pixels on the entire image.

The normalized overlap score was also measured with the ground truth data [50] in Fig. 7. The GRWR approach yielded only slightly higher scores, even though its segmentation results are subjectively better than those of the other methods. Note that this metric does not fully reflect the human visual system; it cannot capture the coherence or the connectivity of the boundary completely since it simply counts the number of segmented pixels on the entire image. As a result, the connectivity or the coherence along the segment boundaries is ignored. Furthermore, when a segmented region is larger than the ground truth, the score decreases regardless of the quality of the segmentation along the boundaries (see the upper part of the stone image of Fig. 8). Consequently, the GRWR approach has a slightly lower score than the RWR in the stone image of Fig. 8 in spite of its better results along object boundaries.

Fig. 9. Normalized energy of the GRWR method obtained using the patch ($f_B$) and neighborhood ($N_{NL}$) with varying sizes, according to the number of iterations (horizontal axis: iteration; vertical axis: normalized energy; curves: (1, 2), (2, 2), and (2, 3)). The normalized energy is measured in a manner similar to Fig. 2. (A, B) means the patch radius $r_P$ and the neighborhood radius $r_N$, respectively. For instance, (1, 3) means that the patch and neighbors consist of 3 × 3 and 7 × 7 windows, respectively. It shows that as the patch and/or the neighborhood size increases, the convergence rate becomes faster.

TABLE IV
AVERAGE PERCENT OF MISLABELED PIXELS (PMP) IN FIG. 7 WITH THE GROUND TRUTH DATA [50]

Method                     Error Rate (%)
RW [10]                    5.95
ARW [48]                   4.10
RWR [20]                   3.46
Nonlocal diffusion [24]    3.40
GRWR                       2.47

To further verify the effectiveness of the proposed method, we measured the percentage of mislabeled pixels (PMP) with the ground truth data [50]. Let us denote $M_A = [m_A^1, \ldots, m_A^M]^T$ and $M_B = [m_B^1, \ldots, m_B^M]^T$ as the sets of pixels (masks) from the segmentation results and from the ground truth, respectively, whose components are 1 or 0 if the pixel belongs to the foreground or the background, respectively. Then, the PMP is defined as follows:

$$\mathrm{PMP} = \frac{\sum_{i=1}^{M} m_A^i \oplus m_B^i}{M} \times 100 \tag{50}$$

where $\oplus$ represents the XOR operator. As shown in Table IV, the GRWR approach has a lower error rate than the other methods on average.
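The PMP in (50) counts the pixels on which the two binary masks disagree. A minimal sketch, assuming the masks are flat sequences of integer 0/1 labels of equal length (the function name and the example masks are hypothetical):

```python
def percent_mislabeled(mask_a, mask_b):
    """Percentage of mislabeled pixels: 100 * sum(a XOR b) / M, as in (50)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of pixels")
    mismatches = sum(a ^ b for a, b in zip(mask_a, mask_b))
    return 100.0 * mismatches / len(mask_a)


segmentation = [1, 1, 0, 0]
ground_truth = [1, 0, 1, 0]
pmp = percent_mislabeled(segmentation, ground_truth)  # 2 of 4 pixels differ
```

Unlike the overlap score, this measure also credits correctly labeled background pixels, so the two metrics can rank methods differently.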

5) Convergence: The nonlocal affinity function measures the patch ($f_B$) similarity using many neighboring pixels inside $N_{NL}$ so as to capture the textural information well. To explore the relationship between the patch ($f_B$) and neighborhood ($N_{NL}$) sizes and the convergence rate, we measured the normalized energy of the GRWR in a manner similar to Fig. 2 by varying the two parameters ($f_B$ and $N_{NL}$), as shown in Fig. 9. (A, B) means the patch radius $r_P$ and neighborhood radius $r_N$, respectively. It shows that as the patch and/or the neighborhood size increases, the convergence rate becomes faster.

Fig. 10. Segmentation results on a texture image obtained using the patch ($f_B$) and the neighborhood ($N_{NL}$) with varying sizes. (A, B) means the patch radius $r_P$ and the neighborhood radius $r_N$, respectively. (0, a) can be thought of as the semilocal function in that a pixel-wise similarity measure is used. The normalized overlap scores are given below each result: (0, 3) 0.91, (0, 5) 0.95, (0, 7) 0.97; (1, 3) 0.92, (1, 5) 0.97, (1, 7) 0.98; (2, 3) 0.78, (2, 5) 0.90, (2, 7) 0.97. It shows that enlarging the patch ($f_B$) and the neighborhood ($N_{NL}$) size accelerates the convergence rate, but it does not always guarantee a better performance.

Next, we investigate how the convergence rate influences the accuracy of the GRWR method. Fig. 10 shows the segmentation results obtained using the patch ($f_B$) and neighborhood ($N_{NL}$) with varying sizes: the patch radius $r_P$ was set to (from top to bottom) 0, 1, and 2, respectively. The neighborhood radius $r_N$ was set to (from left to right) 3, 5, and 7, respectively. The normalized overlap scores are given under the results. We make the following observations. First, enlarging the patch and the neighborhood size is slightly helpful for achieving better segmentation results. Note that it also enables fast convergence. Second, in order to prevent an isolated segmentation, it is essential to use the patch-wise similarity. In highly cluttered regions, the local and/or the semi-local affinity functions cannot discriminate the structure information since they only consider the point-wise similarity. Finally, enlarging the patch and the neighborhood size accelerates the convergence rate, but it does not always guarantee a better performance. Therefore, it is important to find the proper patch ($f_B$) and neighborhood ($N_{NL}$) sizes for different inputs.

V. CONCLUSION

A. Summary

In this paper, the origin of the RWR model was investigated. We showed that the RWR approach has a different theoretical foundation from the diffusion-reaction equation and has better filtering behaviors with respect to impulsive outliers. This allowed us to propose a new energy functional unifying the RW and the RWR schemes and further generalize the RWR within semi-local and nonlocal forms. The GRWR approach with the nonlocal affinity function can aggregate texture information better than the RWR method, while maintaining the steady-state property of the RWR, i.e., the global relation can be captured. To verify the performance of the GRWR approach, it was applied to depth image up-sampling and interactive image segmentation. The experimental results showed the superiority of the GRWR over existing regularization methods. However, the parameters used in the experiments, such as the range parameter, the patch and the neighborhood size, the restarting probability, and the number of iterations, were not fully optimized. Thus, it is expected that the performance of the GRWR method could be further improved with optimized parameters.

B. Discussion

The GRWR approach can be combined with more powerful techniques. First, it can be applied to soft image segmentation in a similar manner to that described in [32], which results in the ability to composite or edit an image seamlessly. Second, the GRWR method can be accelerated in two ways. It can be thought of as the RWR approach on which semi-local or nonlocal operators are imposed, so that the fast algorithms used in the RWR [37], [51] can be applied to reduce the complexity. In addition, the GRWR scheme has a similar form to the bilateral filter [7] or the nonlocal means filter [24], which allows it to be accelerated via signal processing techniques [52].

The proposed energy functional can give new insights into related fields, such as the generalized PageRank [22]. The PageRank has been widely used as a ranking criterion to retrieve web pages in a search engine. Theoretically, the RWR model is a special case of the PageRank. This means that the PageRank considers the local topology or structure that is directly linked to only a current node. Thus, the accuracy and/or the convergence rate of retrieval algorithms can be improved by considering a nonlocal hyperlinking topology in the generalized PageRank. Next, the RWR and the PageRank can be translated to even more powerful regularization or retrieval methods by modifying their energy functional. In a manner similar to the robust anisotropic diffusion [14], the robust RWR scheme based on the $\ell_p$ norm [49] can be derived using a robust energy functional, e.g., the total variation-RWR model.

APPENDIX

VI. CONVERGENCE COMPARISON BETWEEN (28) AND NONLOCAL DIFFUSION [24]

In the flow solution, the evolution step size is crucial and closely related to the stability and convergence. We compare the evolution step size of the proposed method with that of nonlocal diffusion [24]. The energy functional of nonlocal diffusion is as follows:

$$E(u) = \frac{1}{4} \int\!\!\int (u(x) - u(y))^2 w_S(x, y)\, dy\, dx + \frac{\lambda}{2} \int (u(x) - f(x))^2\, dx. \tag{51}$$

Similar to (35), the corresponding discretized flow solution can be found as

$$u_k^{n+1} = \Big(1 - \tau^n \sum_l w_S[k, l] - \tau^n \lambda\Big) u_k^n + \tau^n \sum_l w_S[k, l]\, u_l^n + \tau^n \lambda f_k. \tag{52}$$

Since the weight of the center node $u_k^n$ should be between 0 and 1 so as to ensure numerical stability, the stability condition is derived as follows:

$$0 \le \tau^n \le 1/(|N| + \lambda) \tag{53}$$

where $|N|$ denotes the number of neighbors in the window. In contrast, the stability condition of the GRWR model is given by

$$0 \le \tau^n \le 1/(1 + \lambda). \tag{54}$$

We can see that the maximum evolution step size is fixed to $1/(1 + \lambda)$ regardless of the number of neighbors, while that of nonlocal diffusion decreases according to the number of neighbors. In conclusion, the proposed method can have a larger evolution step size than nonlocal diffusion, thus guaranteeing faster convergence without a loss of stability.
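The gap between the two bounds in (53) and (54) is easy to tabulate. The sketch below is illustrative only: it assumes that $|N|$ is the full $(2r + 1) \times (2r + 1)$ window for a neighborhood radius $r$, and it uses an arbitrary value `lam = 0.01` for the weight written $\lambda$ here. The function names are hypothetical.

```python
def window_size(radius):
    """Number of pixels in a square (2r + 1) x (2r + 1) window."""
    return (2 * radius + 1) ** 2


def max_step_nonlocal_diffusion(num_neighbors, lam):
    """Upper bound on the step size from (53): 1 / (|N| + lam)."""
    return 1.0 / (num_neighbors + lam)


def max_step_grwr(lam):
    """Upper bound on the step size from (54): 1 / (1 + lam)."""
    return 1.0 / (1.0 + lam)


lam = 0.01  # illustrative weight, not a value prescribed by the text
for radius in (2, 5, 7):
    n = window_size(radius)
    ratio = max_step_grwr(lam) / max_step_nonlocal_diffusion(n, lam)
    print(f"r = {radius}: |N| = {n}, GRWR step is {ratio:.1f}x larger")
```

Under these assumptions the admissible GRWR step stays constant while the nonlocal diffusion bound shrinks roughly in proportion to the window area.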

ACKNOWLEDGMENT

The first and the second authors contributed equally to this work. The authors would like to thank the Advanced Digital Science Center (ADSC) for providing data sets of the ToF sensor, and the anonymous reviewers for their valuable comments and suggestions.

REFERENCES[1] P. Perona and J. Malik, Scale-space and edge detection using

anisotropic diffusion, IEEE Trans. Pattern Anal. Mach. Intell., vol. 12,no. 7, pp. 629639, Jul. 1990.

[2] D. Tschumperle and R. Ceriche, Vector-valued image regularizationwith PDEs: A common framework for different applications, IEEETrans. Pattern Anal. Mach. Intell., vol. 27, no. 4, pp. 506517,Apr. 2005.

[3] T. F. Chan, S. Osher, and J. Shen, The digital TV filter and nonlineardenoising, IEEE Trans. Image Process., vol. 10, no. 2, pp. 231241,Feb. 2001.

[4] A. Chambolle, An algorithm for total variation minimization and appli-cations, J. Mach. Imag. Vis., vol. 20, nos. 12, pp. 8997, Jan. 2004.

[5] D. Mumford and J. Shah, Optimal approximations by piecewise smoothfunctions and associated variational probelems, Commun. Pure Appl.Math., vol. 42, no. 5, pp. 577685, Jul. 1989.

[6] T. Pock, D. Cremers, H. Cischof, and A. Chambolle, An algorithm forminimizing the Mumford-Shah functional, in Proc. IEEE 12th Conf.Comput. Vis., Oct. 2009, pp. 11331140.

[7] C. Tomasi and R. Manduchi, Bilateral filtering for gray and colorimages, in Proc. 6th Int. Conf. Comput. Vis., Jan. 1998, pp. 839846.

[8] S. Paris, P. Kornprobst, J. Tumblin, and F. Durand, A gentle introductionto bilateral filtering and its applications, in Proc. ACM SIGGRAPHClasses, 2008, pp. 150.

[9] P. G. Doyle and J. L. Snell, Random Walks and Electric Networks.Washington, DC, USA: Mathematical Association of America, 1984.

[10] L. Grady, Random walks for image segmentation, IEEE Trans. PatternAnal. Mach. Intell., vol. 28, no. 11, pp. 17681783, Sep. 2006.

[11] J. Y. Pan, H. J. Yanh, C. Faloutsos, and P. Duygulu, Automaticmultimedia cross-modal correlation discovery, in Proc. 10th ACMSIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2004,pp. 653658.

[12] G. Steidl, J. Weickert, T. Brox, P. Mrazek, and M. Welk, On theequivalence of soft Wavelet Shrinkage, total variation diffusion, totalvariation regularization, and SIDEs, SIAM J. Numer. Anal., vol. 42,no. 2, pp. 686713, Sep. 2005.

[13] Y. L. You, W. Xu, and M. Kaveh, Behavior analysis of anisotropicdiffusion in image processing, IEEE Trans. Image Process., vol. 5,no. 11, pp. 15391553, Nov. 1996.

[14] M. J. Black, G. Sapiro, D. H. Marimont, and D. Heeger, Robustanisotropic diffusion, IEEE Trans. Image Process., vol. 7, no. 3,pp. 421432, Mar. 1998.

[15] D. Barash, A fundamental relationship between bilateral filtering,adaptive smoothing, and the nonlinear diffusion equation, IEEE Trans.Pattern Anal. Mach. Intell., vol. 24, no. 6, pp. 844847, Jun. 2002.

[16] M. Elad, On the origin of the bilateral filter and ways to improve it,IEEE Trans. Image Process., vol. 11, no. 10, pp. 11411151, Oct. 2002.

[17] R. Shen, I. Cheng, J. Shi, and A. Basu, Generalized random walks forfusion of multi-exposure images, IEEE Trans. Image Process., vol. 20,no. 12, pp. 36343646, Apr. 2011.


[18] Y. Boykov and G. Funka-Lea, Graph cuts and efficient N-D imagesegmentation, Int. J. Comput. Vis., vol. 70, no. 2, pp. 109131,Nov. 2006.

[19] L. Grady, Multilabel random walker image segmentation using priormodels, in Proc. IEEE Conf. Comput. Vis. Pattern Recogn., Jun. 2005,pp. 763770.

[20] T. Kim, K. Lee, and S. Lee, Generative image segmentation usingrandom walks with restart, in Proc. Eur. Conf. Comput. Vis., 2008,pp. 264275.

[21] C. Oh, B. Ham, and K. Sohn, Probabilistic correspondence matchingusing random walk with restart, in Proc. Brit. Mach. Vis. Conf., 2012,pp. 3757.

[22] T. H. Haveliwala, Topic-sensitive pageRank: A context-sensitive rank-ing algorithm for web search, IEEE Trans. Knowl. Data Eng., vol. 15,no. 4, pp. 784796, Sep. 2003.

[23] A. Elmoataz, O. Lezoray, and S. Bougleux, Nonlocal discrete regu-larization on weighted graphs: A framework for image and manifoldprocessing, IEEE Trans. Image Process., vol. 17, no. 7, pp. 10471060,Jul. 2008.

[24] G. Gilboa and S. Osher, Nonlocal linear image regularization andsegmentation, Multiscale Model. Simul., vol. 6, no. 2, pp. 595630,Jul. 2007.

[25] M. Subry, S. Paris, S. W. Hasinoff, J. Kautz, and F. Durand,Fast and robust pyramid-based image processing, in Proc.MIT-CSAIL-TR-2001049, Nov. 2011, pp. 13.

[26] A. Buades, B. Coll, and J.-M. Morel, Nonlocal image and moviedenoising, Int. J. Comput. Vis., vol. 76, no. 2, pp. 123139, Feb. 2008.

[27] M. Protter, M. Elad, H. Takeda, and P. Milanfar, Generalizing thenonlocal-means to super-resolution reconstruction, IEEE Trans. ImageProcess., vol. 18, no. 1, pp. 3651, Jan. 2009.

[28] L. Pizarro, P. Mrazek, S. Didas, S. Grewenig, and J. Weickert, Gener-alised nonlocal image smoothing, Int. J. Comput. Vis., vol. 90, no. 1,pp. 6287, Apr. 2010.

[29] M. Werlberger, T. Pock, and H. Bischof, Motion estimation with non-local total variation regularization, in Proc. IEEE Conf. Comput. Vis.Pattern Recognit., Feb. 2010, pp. 24642471.

[30] M. Jung, X. Bresson, T. F. Chan, and L. A. Vese, Nonlocal Mumford-Shah regularizers for color image restoration, IEEE Trans. ImageProcess., vol. 20, no. 6, pp. 15831598, Jun. 2011.

[31] G. Plonka and J. Ma, Nonlinear regularized reaction-diffusion filtersfor denoising of images with textures, IEEE Trans. Image Process.,vol. 17, no. 8, pp. 12831294, Aug. 2008.

[32] L. Grady, T. Schiwietz, S. Aharon, and R. Westermann, Random walksfor interactive alpha-matting, in Proc. Int. Conf. Visualizat. Imag. ImageProcess., 2005, pp. 423429.

[33] C. Wang, F. Jing, L. Zhang, and H.-J. Zhang, Image annotationrefinement using random walk with restarts, in Proc. 14th Annu. ACMInt. Conf. Multimedia, 2006, pp. 647650.

[34] J. Lee, M. Cho, and K. Lee, Hyper-graph matching via reweightedrandom walks, in Proc. IEEE Conf. Comput. Vis. Pattern Recognit.,Jun. 2011, pp. 16331640.

[35] E. Kreyszig, Advanced Engineering Mathematics, 9th ed. New York,NY, USA: Wiley, 2006, pp. 552560.

[36] B. Ham, D. Min, and K. Sohn, Robust scale-space filter using second-order partial differential equations, IEEE Trans. Image Process., vol. 21,no. 9, pp. 39373951, Sep. 2012.

[37] H. Tong, C. Faloutsos, and J.-Y. Pan, Fast random walk with restart andits application, in Proc. Int. Conf. Data Mining, 2006, pp. 613622.

[38] A. Singer, Y. Shkolnisky, and B. Nadler, Diffusion interpretation ofnonlocal neighborhood filters for signal denoising, SIAM J. Imag. Sci.,vol. 2, no. 1, pp. 118139, Jan. 2009.

[39] [Online]. Available: http://www.mesa-imaging.ch[40] B. Huhle, T. Schairer, P. Jenke, and W. Staber, Fusion of range and

color images for denoising and resolution enhancement with a non-localfilter, Comput. Vis. Image Understand., vol. 114, no. 12, pp. 11361345,Dec. 2010.

[41] J. Kopf, M. F. Cohen, D. Lischinski, and M. Uyttendaele, Joint bilateralupsampling, ACM Trans. Graph., vol. 26, no. 3, pp. 96100, Aug. 2007.

[42] Q. Yang, R. Yang, J. Davis, and D. Nister, Spatial-depth super res-olution for range images, in Proc. IEEE Conf. Comput. Vis. PatternRecogn., Jun. 2007, pp. 18.

[43] D. Min, J. Lu, and M. N. Do, Depth video enhancement based onweighted mode filtering, IEEE Trans. Image Process., vol. 21, no. 3,pp. 11761190, Mar. 2012.

[44] D. Scharstein and R. Szeliski. (2002, Apr.Jun.). A Taxonomy andEvaluation of Dense Two-Frame Stereo Correspondence Algorithms[Online]. Available: http://vision.middlebury.edu/stereo/

[45] [Online]. Available: http://www.ptgrey.com[46] [Online]. Available: http://kinectdata.com[47] [Online]. Available: http://kinectforwindows.org[48] J. Zhang, J. Zheng, and J. Cai, A Diffusion Approach to Seeded Image

Segmentation, in Proc. IEEE Conf. Comput. Vis. Pattern Recogn., Jan.2010, pp. 21252132.

[49] A. K. Sinop and L. Grady, A seeded image segmentation frameworkunifying graph cuts and random walker which yields a new algorithm,in Proc. Int. Conf. Comput. Vis., Feb. 2007, pp. 18.

[50] D. Martin, C. Fowlkes, D. Tal, and J. Malik, A database of humansegmented natural images and its application to evaluating segmentationalgorithms and measuring ecological statistics, in Proc. Int. Conf.Comput. Vis., Jul. 2001, pp. 416423.

[51] X. An and F. Pellacini, AppProp: All-pairs appearance-space editpropagation, ACM Trans. Graph., vol. 27, no. 3, pp. 401409, 2008.

[52] S. Paris and F. Durand, A fast approximation of the bilateral filterusing a signal processing approach, Int. J. Comput. Vis., vol. 81, no. 1,pp. 2452, Jan. 2009.

Bumsub Ham (S'09) received the B.S. degree in electrical and electronic engineering from Yonsei University, Seoul, Korea, in 2008, where he is currently pursuing the joint M.S. and Ph.D. degrees in electrical and electronic engineering.

His current research interests include variational methods and geometric partial differential equations, both in theory and applications in computer vision and image processing, particularly regularization, stereo vision, super-resolution, and HDR imaging.

Mr. Ham was a recipient of the Honor Prize in the 17th Samsung Human-Tech Prize in 2011 and the Grand Prize in the Qualcomm Innovation Fellowship in 2012.

Dongbo Min (M'09) received the B.S., M.S., and Ph.D. degrees in electrical and electronic engineering from Yonsei University, Seoul, Korea, in 2003, 2005, and 2009, respectively.

He was with the Mitsubishi Electric Research Laboratories, Cambridge, MA, USA, as a Post-Doctoral Researcher from June 2009 to June 2010. He is currently with the Advanced Digital Sciences Center, which was jointly founded by the University of Illinois at Urbana-Champaign, Urbana, IL, USA, and the Agency for Science, Technology, and Research, a Singapore Government Agency. His current research interests include 3-D computer vision, video processing, 3-D modeling, and hybrid sensor systems.

Kwanghoon Sohn (M'92–SM'12) received the B.E. degree in electronic engineering from Yonsei University, Seoul, Korea, in 1983, the M.S.E.E. degree in electrical engineering from the University of Minnesota, Minneapolis, MN, USA, in 1985, and the Ph.D. degree in electrical and computer engineering from North Carolina State University, Raleigh, NC, USA, in 1992.

He was a Senior Member of the Research Staff with the Satellite Communication Division, Electronics and Telecommunications Research Institute, Daejeon, Korea, from 1992 to 1993, and a Post-Doctoral Fellow with the MRI Center, Medical School, Georgetown University, in 1994. He was a Visiting Professor with Nanyang Technological University, Singapore, from 2002 to 2003. He is currently a Professor with the School of Electrical and Electronic Engineering, Yonsei University. His current research interests include 3-D image processing, computer vision, and image communication.

Dr. Sohn is a member of SPIE.
