Robust recovery of multiple light source based on local light source constant constraint


Jie Wei *

Department of Computer Science, City College and Graduate Center, City University of New York,

Convent Avenue at 138th Street, New York, NY 10031, USA

Received 12 October 2001; received in revised form 7 April 2002

Abstract

In this paper we are concerned with the robust calibration of light sources in the scene from a known shape. The

image of a 3-D object depends on the light source(s), its 3-D geometry, and its surface reflectance properties (Robot

Vision, MIT Press, Cambridge, MA, 1986). In the last two decades, intensive research has been conducted in the computer vision research community along the line of shape from shading (Shape from Shading, MIT Press,

Cambridge, MA, 1989), where great efforts are made to recover the 3-D geometry with a priori knowledge regarding the

illumination and surface reflectance properties. However, as pointed out by Sato et al. (Proceedings of CVPR’99, 1999,

pp. 306), little progress has been made for the recovery of light source(s) with known shape and surface reflectance

properties. In a recent paper (IEEE Trans PAMI 23 (2001) 915), Zhang and Yang achieved multiple illuminant direction recovery based on the critical points with impressive performance. In this paper we first formulate the local light

source constant constraint, i.e., in a local area of smooth lightness it is likely that the corresponding 3-D world points

on the object are illuminated by the same light sources. Based on this constraint we develop an algorithm to recover the

multiple illuminants: first based on the Lambertian irradiance formula a linear system is formulated for a local area, the

local illuminant direction is then reconstructed by a least-squares solution. To effect insensitivity to noises, the least

trimmed squares method is carried out. Next a dense set of candidate critical points is obtained as a result of a two-step

robust processing, which is used to arrive at the directions of multiple illuminants with an adaptive Hough transform.

The magnitude for each light source is then computed by solving an over-determined linear system which is formed by

pooling pixels illuminated by the same combined light vector. Initial experimental results based on synthetic and real

world images suggested encouraging performance.

© 2002 Elsevier Science B.V. All rights reserved.

Keywords: Light source calibration; Least squares solution; Least trimmed squares method; Adaptive Hough transform; Shape from

shading

1. Introduction

It is well established that there are three factors

which determine the image of an object in the

scene: the light source(s), the 3-D geometry and


the surface reflectance properties of the object

being imaged. Great progress has been made to

understand the 3-D shape of an object along the

line of shape from shading pioneered by Horn

(1975) (Horn and Brooks, 1989) using local

methods (e.g., Pentland, 1984; Ferrie and Levine, 1989), partially global methods (e.g., Oliensis,

1991; Kimmel and Bruckstein, 1995), and global

methods (Brooks and Horn, 1986; Szeliski, 1991),

where assumptions regarding the light source(s)

and surface reflectance are made to arrive at a

tractable reconstruction. However, as pointed out by

Sato et al. (1999), little work has been dedicated to

the recovery of light sources in cases where the 3-D geometry and the surface reflectance properties are known, which is of essential importance in

computer vision and graphics applications. For

instance, in computer vision applications, if we can

recover light sources based on the known shape

and reflectance of one object in the scene, then

instead of making an over-simplistic assumption

about the scene illuminants, the recovered light sources provide valuable information for a better

understanding of the whole scene. In graphics

applications, the recovered light sources make it

easier to generate photo-realistic synergy of real

world and computer generated imagery for better

virtual reality.

In view of the significance of light source re-

covery or light source calibration, several groups of researchers have developed their schemes with

varying degrees of success. Sato et al. (1999) pro-

posed to use the shadow information for the pur-

pose of illuminant recovery. However, their

reliable illuminant recovery is only possible under the condition that full knowledge of the occlusion information of an object is available in the image, which is not applicable in many reasonable cases where some light sources are far away from the

principal direction of the camera and thus the

shadow information is lacking. In (Powell et al.,

2001), by use of three known spheres of controlled

surface reflectance, the specular regions are then

employed to triangulate the positions of the light

source(s). The recovered light sources are then

used to achieve effective color correction. This triangulation-based method follows the line of

attack established by Mancini and Wolff (1992).

Like other triangulation based methods in com-

puter vision, their calibration results are extremely sensitive to noises. However, no systematic strategy is taken to address this crucial issue in their work.

In a recent paper, Zhang and Yang (2001) devel-

oped an interesting algorithm to obtain the multiple illuminant directions by detecting the critical

points, loci on a sphere where the direction of one

illuminant is perpendicular to the surface normal.

Based on the Lambertian reflectance assumption,

the search for the critical points is conducted by a

recursive refinement on some arcs. Nice experi-

mental results have been reported on both syn-

thetic and real world images. However, by close inspection of their method, it is found that this

recovery algorithm is not without problems. In

their algorithm the search for the critical points on

the known sphere plays the paramount role. First

let’s briefly summarize their algorithm to locate

critical points, hereafter denoted by Zhang–Yang-

detection. Their search is carried out as below: An

arc on the sphere is partitioned into subarcs of equal length; for each subarc a least-squares minimization is performed to search for a vector $\vec{v}_i = (b_i, c_i)^T$, where 'T' is the transpose operator, such that in this subarc the equality $E(\theta) = b_i \sin\theta + c_i \cos\theta$ holds true, where $\theta$ uniquely determines the location of a pixel on the subarc. Then the presence of critical point(s) in a subarc is detected if the Euclidean distance between the $\vec{v}_i$'s for two adjacent subarcs exceeds a prescribed threshold. The locations of the critical points are then

found by an iterative refinement in subarcs with

shorter length.

As is well known, the least-squares minimization scheme is susceptible to the negative impacts of outliers (Rousseeuw and Leroy, 1987) unless special care is taken. With the seminal work by many computer vision researchers (e.g., Black

and Anandan, 1993; Irani et al., 1997), better un-

derstanding has been advanced in achieving robust

results in vision tasks, where due to a wide array of

sources in the world, for instance, the imaging

process of the camera, and human errors, outliers

are rampant. It has been discovered that to arrive

at numerically stable and robust estimation results, far more over-determined data should be available. In the Zhang–Yang-detection algorithm, to


estimate $\vec{v}$, only those points in a subarc are pooled

together in the over-determined linear system. In

the ensuing refinement process subarcs of even

shorter lengths are employed which further re-

duces the data volume of sample points in the

least-squares estimation. Unless the image resolution is extremely high, the lack of data in the subarcs would undermine the efficacy of the Zhang–Yang-detection algorithm. Besides, the iterative searching in an arc can find several critical points; thereby, to reliably recover the directions of multiple illuminants, many arcs, of an order close to the image size, should be tested. Hence the Zhang–Yang-detection algorithm is rather computationally intensive.

In this paper, a new algorithm is developed to

recover multiple illuminants. This algorithm

shares a common philosophy with the Zhang–

Yang-detection algorithm, namely, the directions

of multiple illuminants are also recovered by de-

tecting those critical points. However, instead of

working on subarc points, an extremely sparse data set, our work is based on local areas with a

far larger data volume such that the robust tech-

niques such as least trimmed squares (LTS) can be

readily borrowed to effect estimates insensitive to

noises. Toward that end we first formulate the

local light source constant constraint (LLSCC),

i.e., in a local area of smooth lightness it is likely

that the corresponding 3-D world points on the object are illuminated by the same light sources.

Based on this constraint we develop an algorithm

to recover multiple illuminant directions: first

based on the Lambertian irradiance formula a

linear system is formulated for a local area, the

local illuminant direction is then reconstructed by

a least-squares solution. To effect insensitivity to

noises, the LTS method is carried out followed by a Gaussian smoothing. Next a dense set of can-

didate critical points is obtained as a result of a

two-step robust processing, which is used to finally

arrive at the directions of multiple illuminants with

an adaptive Hough transform. The magnitude for

each light source is then computed by solving an

over-determined linear system which is formed by

pooling pixels illuminated by the same combined light vector. Initial experimental results based on

synthetic and real world images suggested en-

couraging performance. In the same vein as Powell

et al. (2001) and Zhang and Yang (2001), in this

paper the globe is elected to be the calibrating

object in our efforts to recover multiple illumi-

nants.

This paper is laid out as follows. The LLSCC is formulated in Section 2. Section 3 expands on the

proposed algorithm to recover the multiple illu-

minants. Empirical results on both synthetic and

real world imagery are presented in Section 4.

Section 5 concludes this paper with more remarks.

2. Local light source constant constraint

Throughout this paper the surface reflectance is

assumed to be Lambertian, i.e., the following

equation is employed to determine the intensity

value of an object point:

$E(\vec{X}) = \rho\,\vec{n} \cdot \vec{l}, \quad (1)$

where $\vec{X}$ is the 3-D coordinate of a world point X, the scalar $\rho$ is the albedo of the surface, $\vec{n} = (n_x, n_y, n_z)^T$ is the normal vector of the world point X with norm $\|\vec{n}\| = 1$, $\vec{l} = (l_x, l_y, l_z)^T$ is the vector representing the light source, and $\cdot$ indicates the inner product. The light sources can be either directional or point ones; for the latter case, if the coordinate of a point source is $\vec{P}$, the direction of $\vec{l}$ is then defined by $\vec{X} - \vec{P}$. It is evident that the former case is merely the extreme case when $\vec{P}$ is extremely far away from the object or the size of the object under consideration is negligible compared with the distance of $\vec{P}$ to the object. As pointed out by Pentland (1984), though a seemingly simple reflectance model, it has an extremely wide applicability in modeling object surfaces in computer vision and graphics, so much so that even in cases where specular reflection is involved this diffuse reflectance model is still applicable except for a relatively small field of view, namely those regions whose viewing angles are complementary with the incoming light direction. When there is more than one light source, then for each illuminant determined by $\vec{l}_i$, according to Eq. (1) its contribution $E_i(\vec{X})$ is given as below:

$E_i(\vec{X}) = \rho\,\vec{n} \cdot \vec{l}_i. \quad (2)$


Therefore with multiple illuminants $\vec{l}_i$, $E(\vec{X})$ is the sum of the $E_i(\vec{X})$'s:

$E(\vec{X}) = \rho\,\vec{n} \cdot \left( \sum_i \vec{l}_i \right). \quad (3)$

If one denotes $\vec{L} = \sum_i \vec{l}_i$, then we have the following formula to compute the intensity value for position $\vec{X}$:

$E(\vec{X}) = \rho\,\vec{n} \cdot \vec{L}. \quad (4)$
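To make the joint-vector view of Eqs. (2)-(4) concrete, here is a minimal NumPy check (ours, not from the paper) that summing per-illuminant contributions equals shading by the single combined vector; the albedo, normal, and light vectors are arbitrary illustrative values.

```python
import numpy as np

# Lambertian shading, Eq. (1): E = rho * (n . l)
def lambertian(rho, n, l):
    return rho * float(np.dot(n, l))

rho = 0.8                                   # albedo (illustrative value)
n = np.array([0.0, 0.0, 1.0])               # unit surface normal
lights = [np.array([0.3, 0.1, 0.9]),        # two arbitrary light vectors
          np.array([-0.2, 0.4, 0.7])]

# sum of per-illuminant contributions, Eqs. (2)-(3)
E_sum = sum(lambertian(rho, n, l) for l in lights)

# shading by the single joint vector L = sum_i l_i, Eq. (4)
L = np.sum(lights, axis=0)
E_joint = lambertian(rho, n, L)

assert np.isclose(E_sum, E_joint)           # the two views agree
```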

Hence each location $\vec{X}$ in effect can be viewed as illuminated by only one illuminant, namely the joint vector $\vec{L}$ of the multiple illuminants $\vec{l}_i$. In our work it is assumed that weak perspective projection is used in our image formation and thus image intensities are linearly related to surface radiance, which is a widely used camera model (Horn, 1986). Suppose the imaging point of $\vec{X}$ is $\vec{x}$; then according to our image formation model the intensity value $I(\vec{x})$ at $\vec{x}$ is in direct proportion to the surface radiance $E(\vec{X})$, i.e.,

$I(\vec{x}) = a\,E(\vec{X}), \quad (5)$

where $a$ is a certain constant. Plugging Eq. (4) into Eq. (5) one receives the following formula connecting $I(\vec{x})$ with the world geometry $\vec{n}$ and the (joint) illuminant $\vec{L}$:

$I(\vec{x}) = a\rho\,\vec{n} \cdot \vec{L}. \quad (6)$

For a small region in an image of smooth intensity

variations, it is reasonable to assume that the

corresponding world points have the same albedo.

Because in our work the region in question is of small size, all light sources illuminating this small region can thus be viewed

as directional illuminants. Therefore according to

Eq. (6) each point in this region is illuminated by

the same illuminant, be it a real single light source

or a joint vector of multiple light sources. The

aforementioned facts are summarized in the fol-

lowing constraint:

LLSCC: For a small region of smooth inten-

sity changes in an image, the corresponding

world points are of the same albedo and illumi-

nated by the same single light source.

The single light source as mentioned in the LLSCC is either a real single light source or the joint vector of multiple sources. Equipped with the LLSCC, we are ready to disclose its power in recovering the direction $\hat{\vec{L}}$ of the illuminant $\vec{L}$ for this region, where $\hat{\vec{L}}$ is the normalized version of $\vec{L}$, i.e., $\hat{\vec{L}} = \vec{L}/\|\vec{L}\|$, with $\|\vec{L}\|$ the norm of $\vec{L}$. It can be seen by inspecting Eq. (6) that the lack of knowledge of $a$ and $\rho$, beyond the fact that they are constant across the region, poses no problem for our recovery of $\hat{\vec{L}}$, since their sole effect according to Eq. (6) is merely a scaling of the constant vector illuminant $\vec{L}$. If somehow the normal vector for each location $\vec{x}$ in a small region is known, the LLSCC readily gives rise to a way to compute $\vec{L}$: for each pixel inside the region Eq. (6) holds true; to recover the constant illuminant direction $\hat{\vec{L}}$, one can pool the Eq. (6)'s satisfied by all pixels in the region together and form the following over-determined linear system:

$\vec{I} = cN\vec{L}, \quad (7)$

where, assuming the number of pixels in the region is $m$, the $m$-dimensional column vector $\vec{I}$ is defined as below:

$\vec{I} = (I(x_1), I(x_2), \ldots, I(x_m))^T, \quad (8)$

while $c$ is the product of the unknown constants $a\rho$, and the $m \times 3$ normal matrix $N$ is formed by patching the normals for the $X$'s together:

$N = \begin{pmatrix} n_{x_1} & n_{y_1} & n_{z_1} \\ n_{x_2} & n_{y_2} & n_{z_2} \\ & \cdots & \\ n_{x_m} & n_{y_m} & n_{z_m} \end{pmatrix}. \quad (9)$

From Eq. (7), the illuminant $\vec{L}$ can be computed in the least-squares sense (Golub and van Loan, 1983):

$\vec{L} = \frac{1}{c} N^{+} \vec{I}, \quad (10)$

where $N^{+} = (N^T N)^{-1} N^T$ is the pseudo-inverse of the matrix $N$. Given the fact that $\hat{\vec{L}}$ is obtained by normalizing $\vec{L}$, we thus have the following formula to calculate $\hat{\vec{L}}$:

$\hat{\vec{L}} = \frac{N^{+}\vec{I}}{\|N^{+}\vec{I}\|}. \quad (11)$


Therefore Eq. (11) provides an effective area-based scheme to estimate the illuminant direction; it is the foundation upon which our algorithm to recover multiple illuminants is built.
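To make Eq. (11) concrete, the following NumPy sketch (ours, not the paper's code) stacks a region's intensities and normals and recovers the normalized direction by a least-squares solve; since the unknown positive scale $c$ only stretches the solution, it drops out in the normalization.

```python
import numpy as np

def illuminant_direction(normals, intensities):
    """Estimate the unit illuminant direction for one region, Eq. (11).

    normals:     (m, 3) array of per-pixel unit normals (matrix N, Eq. (9))
    intensities: (m,) array of pixel intensities (vector I, Eq. (8))
    """
    # least-squares solution of I = c N L, Eq. (10); lstsq applies N^+
    L, *_ = np.linalg.lstsq(normals, intensities, rcond=None)
    return L / np.linalg.norm(L)            # normalization, Eq. (11)
```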

3. Robust multiple illuminant direction recovery

algorithm

In this section we first expand on our method to

recover the joint illuminant direction for a local

region. We then present our scheme to locate critical points. Finally the recovery algorithm is

described at length.

3.1. Robust local illuminant direction recovery

algorithm

According to the LLSCC constraint as devel-

oped in the preceding section, the combined effect of multiple illuminants on a small region can be viewed as that of a single light source,

which is then estimated by use of Eq. (11). In this

section, we shall develop an algorithm to recover

the directions of multiple illuminants for a sphere

with uniform $a$ and $\rho$, and thus the same $c$ over the surface, and with known pose and size. In cases where the $c$'s for different local smooth regions have the same value, all $N$'s in the different local regions undergo the same scaling; thus there is no need to factor out $c$ and $\vec{L}$ in Eq. (7), and we can simply rewrite Eq. (7) into the following equality:

$\vec{I} = N\vec{L}, \quad (12)$

where it is evident that $\vec{L}$ here is indeed $c\vec{L}$ in the original Eq. (7). So now the single light source $\vec{L}$ for each small region is estimated by the following least-squares formula for this special case:

$\vec{L} = N^{+}\vec{I}. \quad (13)$

Due to the fact that the pose and size of the sphere are known, without loss of generality one can assume that the origin of the coordinate system is located at the center of the sphere; the normal $\vec{n}$ at each point $(i, j)$ on the image of the sphere is then given below:

$\vec{n}_{i,j} = \left( \frac{j}{r} - 1,\; y,\; 1 - \frac{i}{r} \right)^T, \quad (14)$

where $y = \sqrt{1 - ((j/r) - 1)^2 - (1 - (i/r))^2}$. Thereby, according to Eq. (13), $N$ and $\vec{I}$ in a local region can be readily collected from the image in order to estimate $\vec{L}$, which is deemed constant based on the LLSCC.
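Eq. (14) can be evaluated per pixel as in the short sketch below (illustrative only); pixels outside the sphere's image disk, where the radicand is negative, are flagged.

```python
import numpy as np

def sphere_normal(i, j, r):
    """Normal at image point (i, j) of a sphere of radius r, Eq. (14)."""
    nx = j / r - 1.0
    nz = 1.0 - i / r
    y2 = 1.0 - nx**2 - nz**2                # radicand of y in Eq. (14)
    if y2 < 0:
        return None                         # (i, j) lies outside the sphere image
    return np.array([nx, np.sqrt(y2), nz])
```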

Because Eq. (13), which is used to compute $\vec{L}$, is arrived at by minimizing the least-squares error as defined by Eq. (12) (Horn, 1986), the estimated $\vec{L}$ for the concerned local region is thus susceptible to noises, which are rampant in vision applications, especially outliers, which, even in small numbers, can entirely distort the estimates. According to the theory of robust statistics (Huber, 1981), the breakdown point, the percentage of allowable outliers which incur no adverse impact to the estimate, for the least-squares estimation formula Eq. (13) is 0, which manifests the fact that it is not appropriate to use Eq. (13) per se for the purpose of recovering the local light source. Fortunately a rich stock of ammunition is readily available from robust statistics to mitigate this problem, such as least-median-squares (LMedS) and LTS (Huber, 1981; Rousseeuw and Leroy, 1987), whereby the breakdown point can be increased to as large as 50%. Between these two most widely used robust operators, LMedS tends to generate better estimates in discounting outliers. However, it is stochastic in nature: its robustness can only be achieved by a large number of random-sampling iterations, thus requiring extremely great computational intensity. By contrast, the LTS operates on order statistics of the raw data. More specifically, for a data array whose volume is $O(n)$, this operator is able to arrive at the robust estimate within a time complexity of $O(n \log n)$, and the targeted breakdown point can be achieved by adjusting the size of the trimmed data.

In view of its algorithmic efficiency and effective-

ness to obtain robust results, in this work we chose

the LTS as our robust operator to estimate $\vec{L}$ for

each local neighborhood.

Based on the LTS technique, in order to remove

outliers, for a given region R, instead of pooling

intensity values $I_i$ and corresponding normals for


all points within R according to Eqs. (8) and (9),

first the $I_i$'s are sorted, and the resulting set is denoted by $S$. Next, to achieve a targeted breakdown point of $b\%$, the top and bottom $b\%$ of the data in $S$ are thrown out; the remaining data set is referred to as $T$. The resulting long vector $\bar{\vec{I}}$ is formed by pooling all the $I_i$'s in $T$, whereas the new normal matrix $\bar{N}$ is formed by collecting the normals corresponding to each $I_i$ in $\bar{\vec{I}}$, in a manner similar to Eq. (9). Consequently, according to Eq. (13), the applicable robust version of estimating $\vec{L}$ is as given below:

$\vec{L} = \bar{N}^{+}\bar{\vec{I}}. \quad (15)$

That the resulting breakdown point based on Eq. (15) is $b\%$ can be simply proved by the fact that the $b\%$ outliers, whose collection is a subset of the set $S - T$ (the top or bottom $b\%$ data entries in $S$), are effectively removed from the set $T$.

The robust method to recover the illuminant for

a small local region R whose number of pixels is m, developed in this subsection, is summarized in the

following procedure local-illuminant-recovery.

PROC local-illuminant-recovery

(1) formulate the m-dimensional vector $\vec{I}$ and the $m \times 3$ matrix $N$ for all pixels in R according to Eqs. (8) and (9), respectively;
(2) sort $\vec{I}$ and permute $N$ accordingly;
(3) trim the top and bottom b% of rows in $\vec{I}$ and $N$, respectively;
(4) apply Gaussian smoothing on the trimmed $\bar{\vec{I}}$;
(5) estimate the light source $\vec{L}$ for R by use of Eq. (15).
END
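A minimal NumPy rendering of the procedure, under our reading of Steps 1-5, is given below; SciPy's `gaussian_filter1d` stands in for the nine-tap Gaussian filter discussed shortly, and the trimming constant b is passed as a fraction per tail.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def local_illuminant_recovery(intensities, normals, b=0.15, sigma=1.0):
    """LTS-based illuminant estimate for one region R, Eq. (15).

    intensities: (m,) pixel intensities of R   (Step 1, Eq. (8))
    normals:     (m, 3) corresponding normals  (Step 1, Eq. (9))
    b:           trimming fraction per tail    (Step 3)
    sigma:       Gaussian smoothing width      (Step 4)
    """
    order = np.argsort(intensities)            # Step 2: sort I, permute N alike
    I_sorted = intensities[order]
    N_sorted = normals[order]

    m = len(I_sorted)                          # Step 3: trim top and bottom b%
    k = int(m * b)
    I_trim = I_sorted[k:m - k]
    N_trim = N_sorted[k:m - k]

    I_smooth = gaussian_filter1d(I_trim, sigma)  # Step 4: smooth trimmed array

    # Step 5: least-squares solve of Eq. (15) on the trimmed, smoothed data
    L, *_ = np.linalg.lstsq(N_trim, I_smooth, rcond=None)
    return L
```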

In the foregoing procedure, the outliers are

handled by Step 3, which is indeed the LTS

scheme. As discussed previously, the choice of b is

crucial to the robust performance of this algorithm

due to the use of the LTS scheme. It is evident that

if $b$ is less than the actual percentage $p$ of the outliers, then this procedure is unable to entirely contain the adverse impacts of the outliers. However, due to the use of the Hough transform in the ensuing stage of the estimation algorithm as described below, a $p$ slightly higher than $b$ can be tolerated. However, if $p$ is too large then the results will be unreliable. This is illustrated in Fig. 1 for a synthesized image with salt/pepper noises added at different percentages: when $p = 0.16$ and $b = 0.15$, the recovered $\vec{L}$'s as shown in the fourth column are still acceptable and can be effectively handled by the Hough transform in order to reconstruct the multiple illuminants. However, in the case of $p = 0.2$ with $b$ remaining 0.15, the recovered $\vec{L}$'s as depicted in the last column are too noisy to be mitigated by the Hough procedure.

Fig. 1. The impact of the choice of b for salt/pepper noise only. The synthesized images (row 1) and the $\vec{L}$'s recovered by local-illuminant-recovery, where b is fixed at 0.15 while the percentages p of the added salt/pepper noise for the five columns are 0.05, 0.1, 0.15, 0.17, and 0.2, respectively.


The Gaussian noises are alleviated by Step 4: we apply a Gaussian diffusion on the local intensity array $\bar{\vec{I}}$, which was sorted and trimmed in Steps 2 and 3. Due to the fact that in this array the outliers are mostly removed and adjacent values are of similar magnitude, the Gaussian smoothing can effectively suppress the white noises. In our implementation, a nine-tap Gaussian filter is used. To showcase the performance of this procedure in the presence of Gaussian noises, the results generated by this procedure for a synthesized image disturbed by white noises with differing standard deviations are illustrated in Fig. 2. As can be observed in this figure, this procedure is fairly resilient to minor Gaussian noises, e.g., the first three columns where $\sigma = 1$, 3 or 5. In cases where $\sigma$ is relatively large, e.g., 7, the recovered $\vec{L}$'s are noisy but can be handled by the ensuing process to recover the multiple illuminants. For considerably large $\sigma$, say 11, as depicted in the last column, the recovered $\vec{L}$'s are too noisy to be used by our algorithm to calibrate the multiple illuminants.

3.2. Critical points recovery

Based on the LLSCC constraint, when there is more than one light source, the $\vec{L}$ recovered by the local-illuminant-recovery procedure as developed in the preceding subsection is essentially the vector sum of the multiple illuminants visible to pixels in R. In the image of the sphere, for two adjacent regions $R_1$ and $R_2$, if they are visible to the same set of light sources, then the respective recovered illuminant vectors $\vec{L}_1$ and $\vec{L}_2$, being the vector sums of exactly the same set of multiple illuminants, should satisfy the equality $\vec{L}_1 = \vec{L}_2$. This observation leads us to the following proposition, which has an important role in our effort to recover multiple illuminants:

Proposition 1. The inequality $\vec{L}_1 \neq \vec{L}_2$ indicates that the two neighboring regions are not visible to the same set of illuminants.

The concept of critical points as defined in the Zhang–Yang-detection algorithm plays the essential role in their work of recovering multiple illuminant directions. The critical points on the sphere are the loci where one illuminant grazes the sphere, namely,

$\hat{\vec{l}}_g \cdot \vec{n}_X = 0, \quad (16)$

where $\hat{\vec{l}}_g$ is the unit vector representing the grazing illuminant and $\vec{n}_X$ is the normal vector at a critical point X. It is evident that Eq. (16) represents a plane P whose normal is $\hat{\vec{l}}_g$. Since the center of the globe is located at the origin, the set of critical points due to the grazing illuminant $\hat{\vec{l}}_g$ is the great circle which is the intersection between the known sphere and the plane P. Of course only

Fig. 2. The impacts of different Gaussian noises on the local-illuminant-recovery procedure.


part of the great circle is visible on the image after

the image formation process. Given that our work

is region based, any illuminant grazing the sphere

intersects with a region at an arc which is referred

to as a critical arc. It is evident that a critical arc is a set of critical points which are part of a great circle formed by a grazing illuminant. The critical arcs are the foundation for us to detect the multiple illuminants in this work. If we denote the set of all critical arcs on the image of the sphere by G, and overall there are k illuminants, then G is defined as below:

$G = C_1 \cup C_2 \cup \cdots \cup C_k, \quad (17)$

where the $i$th critical arc $C_i$, $i = 1, 2, \ldots, k$, is the partial great circle generated by the $i$th illuminant. If somehow a dense data set, namely critical points, for each critical arc $C_i$ can be detected, then its corresponding $\vec{l}_i$ can be estimated by the over-determined linear system which is formed by pooling together Eq. (16) for each point on $C_i$, i.e.,

$U\vec{l}_i = \vec{0}, \quad (18)$

where $U$ is the $n_{c_i} \times 3$ matrix each row of which holds the three components of the normal at a critical point on $C_i$, $n_{c_i}$ is the number of critical points in $C_i$, and $\vec{0}$ is an $n_{c_i}$-dimensional zero vector. A numerically stable and robust estimate of $\vec{l}_i$ is possible only if a relatively large number $n_{c_i}$ of critical points can be recovered for each $C_i$.

As discussed in the preceding paragraph, it is of utmost importance to locate as many critical points as possible to obtain a desirable estimate of the corresponding $\vec{l}$. If two adjacent regions have different $\vec{L}$'s, then based on the definition of critical arcs and Proposition 1 we see right away that if $\vec{L}_1 \neq \vec{L}_2$ then $G \cap (R_1 \cup R_2) \neq \emptyset$, i.e., the union of these two adjacent regions contains one or more critical points which are part of one or more critical arcs $C_i$. Therefore testing the preceding inequality serves as a valuable criterion in the search for critical points. One can detect critical points by partitioning the image of the sphere into non-overlapping regions, small squares for simplicity; the local-illuminant-recovery procedure is then carried out to robustly compute $\vec{L}$ for each square. The candidate regions where critical points are present, referred to as critical regions for short, are then detected in cases where the magnitude of the difference of the two $\vec{L}$'s for two adjacent regions exceeds a certain threshold; namely, assuming $\vec{L}_1$ and $\vec{L}_2$ are the illuminants recovered by the local-illuminant-recovery procedure for two adjacent regions $R_1$ and $R_2$, respectively, then $R_1$ and $R_2$ are labeled as critical regions if the following condition holds true:

$\|\vec{L}_1 - \vec{L}_2\| > \delta_1, \quad (19)$

where $\delta_1$ is a prescribed scalar threshold to indicate a significant difference. Here a paradox presents

itself: in order for the local-illuminant-recovery

procedure to work, the size of each region should be relatively large to render the LTS and ensuing

least-squares estimate effective, in our empirical

study numerically unstable estimates are observed

when the number of pixels in each region is fewer

than thirty. However, to recover the grazing illu-

minant with high confidence the locations of those

corresponding critical points need to be as precise

as possible; thus the size of the critical regions should be as small as possible. To tackle these

conflicting requirements, a two-step process can be

employed: (1) search for critical regions E through

non-overlapping partitioning; (2) refine the loca-

tions of critical points from the critical regions.

The non-overlapping search conducted in the first

step is meant to render an efficient search: those

regions which fail the inequality (19) are pruned before the second step is carried out, which is far

more computationally intensive. The multi-scale

search is not employed here due to the following

two reasons: (1) The outliers such as salt/pepper

noises in the images of original resolution will

propagate to the lower resolution representations

thus having adverse impacts on the estimation of

critical points. (2) As described earlier, a larger number of pixels is generally desired in order to render our estimation more robust; unless the image is of extremely high resolution, the lower-resolution images in the multi-scale representation may not be able to facilitate robust

ment of more precise location of critical points

cannot be achieved by reducing the number of pixels contained in the critical regions, which

simply undermines the efficacy of the local-illu-


minant-recovery procedure. Instead, we choose to

perform our refinement process based on over-

lapped partitioning of E: for each pixel p inside E a

region Rp centered on p is formed, then the local-

illuminant-recovery procedure is called to compute $\vec{L}_p$ for this region. Since it is likely that the pixels in $R_p$ are not illuminated by the same set of illuminants (its intersection with the critical point set G is non-empty), in order to render the recovered light source a better approximation of the light source illuminating the majority part of $R_p$, it is found that increasing the trimming constant b in the local-illuminant-recovery procedure is helpful, as this has a better chance of throwing out more pixels belonging to the ''minority parts'' of $R_p$. Now two

adjacent points p1 and p2 are earmarked as can-

didate critical points if the following condition is

satisfied:

$\|\vec{L}_{p_1} - \vec{L}_{p_2}\| > \delta_2, \quad (20)$

where the scalar $\delta_2$ ($< \delta_1$) is the threshold to signify candidate critical points. Its value should be smaller than $\delta_1$ since the two regions $R_{p_1}$ and $R_{p_2}$ now overlap greatly in their domains.

Summarizing, we have the following procedure

to recover candidate critical points:

PROC candidate-generation

(1) partition the image of the sphere into non-overlapping squares, then for each square call local-illuminant-recovery to estimate the light source $\vec{L}$;
(2) generate the set E of critical regions according to (19);
(3) for each pixel p within E form a square centered on p, then call local-illuminant-recovery with a slightly larger magnitude of b to compute the corresponding $\vec{L}_p$;
(4) generate the set C of candidate critical points according to (20).

END
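The coarse pass (Steps 1 and 2) might look as follows (our illustration, reusing the `local_illuminant_recovery` sketch from Section 3.1); the refinement of Steps 3 and 4 repeats the same test with overlapped windows centered on each pixel, a larger b, and the smaller threshold $\delta_2$.

```python
import numpy as np

def critical_region_mask(image, normals, block=16, d1=0.1, b=0.15):
    """Flag adjacent blocks whose local illuminants differ, inequality (19).

    image:   (H, W) intensity image of the sphere
    normals: (H, W, 3) per-pixel normals from Eq. (14)
    """
    H, W = image.shape
    gh, gw = H // block, W // block
    L = np.zeros((gh, gw, 3))
    for bi in range(gh):                      # Step 1: per-square estimates
        for bj in range(gw):
            sl = (slice(bi * block, (bi + 1) * block),
                  slice(bj * block, (bj + 1) * block))
            L[bi, bj] = local_illuminant_recovery(
                image[sl].ravel(), normals[sl].reshape(-1, 3), b)

    critical = np.zeros((gh, gw), dtype=bool)
    for bi in range(gh):                      # Step 2: inequality (19) test
        for bj in range(gw):
            for di, dj in ((0, 1), (1, 0)):   # right and down neighbors
                ni, nj = bi + di, bj + dj
                if (ni < gh and nj < gw and
                        np.linalg.norm(L[bi, bj] - L[ni, nj]) > d1):
                    critical[bi, bj] = critical[ni, nj] = True
    return critical
```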

By inspecting the preceding procedure, it can be observed that the output point set C is not the actual critical point set G; it is still a superset of it, namely $C \supseteq G$. However, C is already a significant reduction from the set E, which is a far greater superset of G. The recovered C's for the noisy images depicted in Figs. 1 and 2 of the previous section are illustrated in Fig. 3. As can be observed, for images corrupted by considerable noises, e.g., Gaussian noise with $\sigma > 7$ or salt/pepper noise with $p > b + 3\%$, the recovered $\vec{L}$'s are too noisy to be used for light source calibration.

3.3. Multiple light sources recovery algorithm

C as estimated by the candidate-generation procedure described in the preceding subsection is still a superset of G, and to make

Fig. 3. The recovered critical points for the noisy images shown in Figs. 1 and 2.


matters worse, there is no way for us to know

which critical arc the estimated points belong to,

hence the over-determined linear system as defined

by Eq. (18) cannot be called upon to obtain the

corresponding illuminant. A classification proce-

dure is apparently needed before estimating thedirection of the illuminants. Fortunately, special

structural property of G lends us an effective

scheme to finally reconstruct the directions of

multiple illuminants. This is because G is a union

of a number of great circles as formulated in (17),

and pixels in C are either critical points or those in

the close neighborhood of points in G. Thus if two

points p1 and p2 in C is from or close to the samecritical arc Ci, then their normals should each

satisfy Eq. (16) with ~̂lglg~lglg’s being similar to each

other. In light of the linear nature of Eq. (16), the

Hough transform can be used toward the recoveryof the multiple illuminant directions. For each

pixel p in C, first its normal on the world globe is

computed according to Eq. (14), next a vote can be

cast for ~̂lglg~lglg’s where Eq. (16) is satisfied. Finally each~̂lglg~lglg is obtained by finding the clustering point in the

vote array. It is well established that for a lowdimensionality problem the Hough transform is an

extremely effective and robust scheme for the

purpose of parameter estimation (Ballard and

Brown, 1982). In our problem, the dimensionality

of the parameter ~̂lglg~lglg to be estimated is merely two,

three components subject to the constraint that

their magnitude is one. To ease the computational

intensity incurred by the voting process, the

adaptive Hough transform (Illingworth and Kit-tler, 1987; Tian and Shah, 1997) is adopted: first a

coarse grid is used to cast the vote, then regions

around the clustering points are discretized with

finer-grained grid. This grid refinement process is

iterated until an expected precision is arrived at.
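A minimal coarse-to-fine voting sketch in this spirit is given below (our illustration, not the paper's implementation): directions are parameterized by spherical angles, each candidate normal votes for directions nearly perpendicular to it per Eq. (16), and the grid is refined around the strongest peak; in the full method one such clustering point is extracted per illuminant.

```python
import numpy as np

def sph_dir(theta, phi):
    """Unit direction from spherical angles."""
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

def adaptive_hough_direction(cand_normals, levels=3, bins=32, tol=0.05):
    """Coarse-to-fine Hough voting for one grazing direction, Eq. (16).

    cand_normals: (n, 3) normals of the candidate critical points in C
    """
    t_lo, t_hi = 0.0, np.pi                  # theta search range
    p_lo, p_hi = 0.0, 2 * np.pi              # phi search range
    for _ in range(levels):
        thetas = np.linspace(t_lo, t_hi, bins)
        phis = np.linspace(p_lo, p_hi, bins)
        votes = np.zeros((bins, bins))
        for ti, th in enumerate(thetas):
            for pj, ph in enumerate(phis):
                l = sph_dir(th, ph)
                # one vote per normal that is nearly perpendicular to l
                votes[ti, pj] = np.sum(np.abs(cand_normals @ l) < tol)
        ti, pj = np.unravel_index(np.argmax(votes), votes.shape)
        # shrink the grid around the current peak for the next level
        dt, dp = (t_hi - t_lo) / bins, (p_hi - p_lo) / bins
        t_lo, t_hi = thetas[ti] - dt, thetas[ti] + dt
        p_lo, p_hi = phis[pj] - dp, phis[pj] + dp
    return sph_dir((t_lo + t_hi) / 2, (p_lo + p_hi) / 2)
```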

The foregoing adaptive Hough transform procedure only provides the directions $\hat{\vec{l}}_g$ of the multiple illuminants; to fully recover the multiple illuminants $\vec{l}_g$, their corresponding magnitudes should be estimated, which turns out to be an easy task with knowledge of the $\hat{\vec{l}}_g$'s. Toward that end, first the concept of a macro region is defined: the set of pixels which are visible to the same set $t$ of illuminant directions is referred to as a macro region $\mathcal{R}$. It is evident that the $\mathcal{R}$'s form a partition of the sphere image. To generate this partitioning, first the set $t$ of the $\vec{l}_{g_i}$'s visible to each pixel p is computed by inspecting the positivity of the inner product $\hat{\vec{l}}_{g_i} \cdot \vec{n}_p$ for every $\hat{\vec{l}}_{g_i}$; then pixels having the same $t$ are grouped into one macro region $\mathcal{R}$. Each $\mathcal{R}$ is thus a superset of the regions R employed in the local-illuminant-recovery procedure. To effect a robust estimate of the magnitude of each illuminant, the median vector $\vec{L}_{\mathcal{R}}$ is first computed over all R's inside $\mathcal{R}$, where each element of $\vec{L}_{\mathcal{R}}$ is the median of the corresponding elements of all the $\vec{L}_R$'s, $R \in \mathcal{R}$. QuickSort is called to compute these element-wise medians. Next, according to the LLSCC constraint and denoting by $m_{g_i}$ the magnitude of the illuminant $\vec{l}_{g_i}$, the following equality holds:

$\vec{L}_{\mathcal{R}_j} = \sum_{\vec{l}_{g_i} \in t(\mathcal{R}_j)} m_{g_i} \hat{\vec{l}}_{g_i}. \quad (21)$

It can be proved by a simple induction that the number of $\mathcal{R}$'s is greater than the number of illuminants $\vec{l}_g$; thus the number of Eq. (21)'s, one for each $\mathcal{R}$, exceeds the number of unknowns $m_{g_i}$. Furthermore, Eq. (21) indeed confers three equations for the $m_{g_i}$'s. Therefore by patching the Eq. (21)'s together, we again have an over-determined linear system which can be used to estimate the $m_{g_i}$'s. To effect easier algebraic manipulation, Eq. (21) is rewritten as the following equation:

$\vec{L}_{\mathcal{R}_j} = G F_j \vec{m}, \quad (22)$

where, if J is the number of illuminants, then G is a $3 \times J$ matrix obtained by patching the $\hat{\vec{l}}_{g_i}$'s together column-wise:

$G = (\hat{\vec{l}}_{g_1}, \hat{\vec{l}}_{g_2}, \ldots, \hat{\vec{l}}_{g_J}), \quad (23)$

the filtering matrix $F_j$ is a $J \times J$ diagonal matrix with $F_j[k, l] = 0$ if $k \neq l$, $F_j[k, k] = 1$ if $\vec{l}_{g_k} \in t(\mathcal{R}_j)$ and $F_j[k, k] = 0$ otherwise, and $\vec{m}$ is the J-dimensional column vector $(m_{g_1}, m_{g_2}, \ldots, m_{g_J})^T$. It is evident that the filtering matrix $F_j$ is used to annihilate the $\vec{l}_{g_i}$'s which make no contribution to $\mathcal{R}_j$.

By pooling Eq. (22) for all $\mathcal{R}_j$'s together we thus arrive at the following over-determined linear system:

$\vec{L}_{\mathcal{R}_{\mathrm{all}}} = \bar{G}_{\mathrm{all}} \vec{m}, \quad (24)$


where the long vector $\vec{L}_{\mathcal{R}_{\mathrm{all}}}$ is formed by patching the $\vec{L}_{\mathcal{R}_j}$'s together, while the matrix $\bar{G}_{\mathrm{all}}$ is generated by stacking the $GF_j$'s in the vertical direction. Thereby $\vec{m}$ can be estimated by the following least-squares formula:

$\vec{m} = \bar{G}_{\mathrm{all}}^{+} \vec{L}_{\mathcal{R}_{\mathrm{all}}}. \quad (25)$

Finally, we are ready to stipulate our algorithm to recover the multiple illuminants:

Algorithm multiple-illuminant-recovery

1. Call the candidate-generation procedure to detect the candidate critical point set C;
2. perform the adaptive Hough transform to estimate the illuminant directions $\hat{\vec{l}}_{g_i}$;
3. generate the macro region partitioning of the sphere image and then use Eq. (25) to estimate the magnitude $m_{g_i}$ of each illuminant whose direction has been estimated in the previous step.

End
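To illustrate Step 3, the stacked system of Eqs. (22)-(25) can be formed and solved as below (a sketch under our reading); `directions`, `L_medians`, and `visible` are hypothetical inputs holding the Hough-recovered unit directions, the per-macro-region median vectors, and the index sets $t(\mathcal{R}_j)$.

```python
import numpy as np

def illuminant_magnitudes(directions, L_medians, visible):
    """Solve the stacked system of Eq. (24) for the magnitudes, Eq. (25).

    directions: (J, 3) unit directions; their transpose is G of Eq. (23)
    L_medians:  list of (3,) median vectors, one per macro region R_j
    visible:    list of index sets t(R_j) of illuminants visible in R_j
    """
    J = len(directions)
    blocks, rhs = [], []
    for L_R, t in zip(L_medians, visible):
        F = np.zeros((J, J))                 # filtering matrix F_j
        for k in t:
            F[k, k] = 1.0                    # keep only contributing directions
        blocks.append(directions.T @ F)      # G F_j, a 3 x J block of Eq. (22)
        rhs.append(L_R)
    G_all = np.vstack(blocks)                # vertical stacking, Eq. (24)
    L_all = np.concatenate(rhs)
    m, *_ = np.linalg.lstsq(G_all, L_all, rcond=None)   # Eq. (25)
    return m
```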

The corresponding flow chart of the foregoing

algorithm is illustrated in Fig. 4.

4. Experimental results

In this section, experimental results generated by the multiple-illuminant-recovery algo-

rithm as developed in the preceding section are

presented. First results on synthetic imagery with

known ground truth are reported which provide

insight into the numeric quality of the proposed

algorithm. We then present the performance of

this algorithm on real world images to further

depict its efficacy.

The advantage that the synthetic experiments provide is that the ground truths of

the original images are available to make quanti-

tative evaluation of the proposed algorithm. To

provide a quantitative measure of the proposed

algorithm, we hereby define the errors $\epsilon_d$ and $\epsilon_m$ to gauge the agreement between the recovered illuminants and the original M illuminants:

$\epsilon_d = \frac{1}{M} \sum_{i=1}^{M} \left| \hat{\vec{l}}_i - \hat{\tilde{\vec{l}}}_i \right|, \quad (26)$

$\epsilon_m = \frac{1}{M} \sum_{i=1}^{M} \left| \|\vec{l}_i\| - \|\tilde{\vec{l}}_i\| \right|, \quad (27)$

where $\vec{l}_i$ and $\tilde{\vec{l}}_i$ are the light source vectors estimated by the proposed algorithm and the known ones used in rendering the synthetic images, respectively; $\hat{\vec{l}}_i$ indicates the normalized $\vec{l}_i$, and $\|\vec{l}_i\|$ corresponds to the magnitude of $\vec{l}_i$. It is evident that the $\epsilon$'s indicate the mean errors between the estimated light source vectors and the generative light source vectors in direction and magnitude, respectively.
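A minimal sketch of these measures follows (ours); since the paper reports $\epsilon_d$ in degrees, the deviation between the normalized directions is computed here as an angle, which is our interpretation of Eq. (26).

```python
import numpy as np

def recovery_errors(l_est, l_true):
    """Mean direction and magnitude errors, Eqs. (26)-(27).

    l_est, l_true: (M, 3) estimated and ground-truth light vectors
    """
    unit = lambda v: v / np.linalg.norm(v, axis=1, keepdims=True)
    cos = np.clip(np.sum(unit(l_est) * unit(l_true), axis=1), -1.0, 1.0)
    eps_d = np.degrees(np.mean(np.arccos(cos)))        # direction error, deg
    eps_m = np.mean(np.abs(np.linalg.norm(l_est, axis=1)
                           - np.linalg.norm(l_true, axis=1)))
    return eps_d, eps_m
```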

In the first row of Fig. 5, a globe of radius 200

pixels illuminated by two light sources is rendered

by our program; Gaussian ($\mu = 0$, $\sigma = 5$) and salt/pepper noises with equal probability are then added to 10% of the pixels to generate the final image.

The local illuminants for each local region pro-

duced by the procedure local-illuminant-recovery

are depicted in the second column. 1 The critical

points are given in the third column, whereof it can

be observed that far more critical points are

available for numerically recovering the light di-

rection. The final column illustrates the newly rendered sphere using the parameters estimated by

the proposed algorithm. The two error measures

for the recovered illuminants are $\epsilon_d = 2.1°$ and $\epsilon_m = 2.3$.

In the second experiment, the globe of radius 200 pixels is illuminated by two light sources which are complementary to each other, i.e., $\vec{l}_1 = -\vec{l}_2$. Here, 8% of the pixels are replaced by salt/pepper noises, and the Gaussian noises with the same parameters as the

Fig. 4. The flow chart of the proposed algorithm multiple-illuminant-recovery.

1 The magnitudes are scaled to result in better visualization.


previous example are added to each pixel. The

running results for this synthetic image are listed in the second row of Fig. 5. Our algorithm thus has no problem handling this case, which was not covered by other algorithms (e.g., Zhang and Yang, 2001). The corresponding errors are $\epsilon_d = 2.8°$ and $\epsilon_m = 2.5$.

To further delineate the performance of the new

algorithm, a globe of radius 200 pixels illuminated

by four light sources, with salt/pepper noises in 8% of the pixels and Gaussian ($\mu = 0$, $\sigma = 5$) noises at each pixel, is tested against our algorithm. As can be observed in the third row of Fig. 5, the recovered image is both qualitatively and quantitatively satisfactory. In this test, $\epsilon_d = 3.4°$ and $\epsilon_m = 4.6$.

Extensive experiments have been conducted using the proposed multiple-illuminant-recovery algorithm; the performances are consistent with

the three shown in Fig. 5. The adverse impacts of

various noises, e.g., Gaussian or salt/pepper, can

be discounted effectively.

We also tested the performance of the proposed algorithm on real world imagery. As depicted in the last row of Fig. 5, a picture of a ping-pong ball illuminated by two light sources was taken. Due to the small size of the ball, the distant light bulbs can be well approximated by directional light sources. As can be seen from the recovered images shown in the same row, the two recovered light sources closely approximate the original ones qualitatively.

Fig. 5. Synthetic and real world imagery experimental results of the proposed algorithm.


5. Conclusion and discussion

In this paper we first developed the LLSCC,

next based on this constraint, a novel robust

scheme is presented to recover the multiple light sources from a sphere in the scene. The LTS

technique is employed in the least squares esti-

mation of local light vectors in order to facilitate

robustness of the recovery algorithm. Then the

candidate critical points as defined in (Zhang and

Yang, 2001) are spotted after a check on the dif-

ference between adjacent blocks/pixels. Next an

adaptive Hough transform procedure is performedto arrive at the directions of the multiple illumi-

nants. Finally the magnitudes for all light sources

are obtained by solving an over-determined linear

system after pooling all pixels inside regions illu-

minated by the same combined light vectors. Ini-

tial experimental results on synthetic and real

world imagery alike show encouraging perfor-

mances.

Indeed the LLSCC can be applied to situations

where the shape of the object is known a priori to

calibrate the light sources, not just limited to a

sphere. A wide array of man-made objects can

serve the role as the calibrating object. In this

paper only directional light sources are considered

which is applicable when the calibrating object is

very small or the light sources are far away. However, as pointed out in (Powell et al., 2001), to

effectively calibrate the light sources in an indoor

scene, it is more reasonable to assume the light

sources as point sources. In the near future we will

further examine methods to recover point light

sources within the framework developed in this

paper. Last but not least, efforts will also be

made to exploit the recovered light sources for the purpose of illumination-invariant recognition in

computer vision, and virtual reality in computer

graphics.

The proposed algorithm can be very robust

against outliers: a large choice of b can effectively

discount a large percentage of them. However,

Gaussian noises with significantly large variance

cannot be easily handled by this algorithm: the local Gaussian diffusion in Step 4 of the procedure local-illuminant-recovery together with the ensuing Hough transform are simply inadequate. This is because the method proposed in this paper is of the nature of a local algorithm: each $\vec{L}$ is estimated based on a local neighborhood, and a local minimum is taken as the estimate. To effect a better resilience against noises, a global method to estimate the $\vec{L}$ field can be used, where a rich arsenal of energy formulation and optimization

schemes can be employed, such as Markov Ran-

dom Field (Barnard, 1993), anisotropic diffusion

(Perona and Malik, 1990), variational calculus

(Brooks and Horn, 1986), and partial differential

equation based methods (Koenderink, 1984). In

the near future we will try on these avenues to

enhance the light source calibration method.

Furthermore, issues regarding different calibrating

objects other than globes based on our LLSCC as

expanded in this paper will also be explored.

References

Ballard, D.H., Brown, C.M., 1982. Computer Vision. Prentice

Hall, Englewood Cliffs, NJ.

Barnard, S., 1993. Stereo matching. In: Chellappa, R., Jain,

A.K. (Eds.), Markov Random Fields. Academic Press, New

York, pp. 245–271.

Black, M.J., Anandan, P., 1993. A framework for the robust

estimation of optical flow. In: Proceedings of ICCV’93, pp.

231–236.

Brooks, M.J., Horn, B.K.P., 1986. The variational approach to

shape from shading. Comput. Vis. Graph. Image Process.

33, 174–208.

Ferrie, F., Levine, M., 1989. Where and why local shading

analysis works. IEEE Trans. PAMI 11 (2), 198–206.

Golub, G.H., van Loan, C.F., 1983. Matrix Computations. The

Johns Hopkins University Press, Baltimore, MD.

Horn, B.K.P., 1986. Robot Vision. MIT Press, Cambridge,

MA.

Horn, B.K.P., 1975. Obtaining shape from shading informa-

tion. In: Winston, P.H. (Ed.), The Psychology of Computer

Vision. McGraw-Hill, New York, pp. 115–155.

Horn, B.K.P., Brooks, M.J. (Eds.), 1989. Shape from Shading.

MIT Press, Cambridge, MA.

Huber, P.J., 1981. Robust Statistics. Wiley, New York.

Illingworth, J., Kittler, J., 1987. Adaptive Hough transform.

IEEE Trans. PAMI 9, 690–698.

Irani, M., Rousso, B., Peleg, S., 1997. Recovery of ego-motion

using region alignment. IEEE Trans. PAMI 19 (3), 268–272.

Kimmel, R., Bruckstein, A.M., 1995. Tracking level sets by

level sets: a method for solving the shape from shading

problem. Comput. Vis. Image Understanding 62 (2), 47–58.

Koenderink, J., 1984. The structure of images. Biol. Cyber. 50,

363–370.

J. Wei / Pattern Recognition Letters 24 (2003) 159–172 171

Mancini, T.A., Wolff, L.B., 1992. 3d shape and light source

location from depth and reflectance. In: Proceedings of

CVPR’92, pp. 707–709.

Oliensis, J., 1991. Shape from shading as a partially well-

constrained problem. CVGIP: Image Understanding 54 (2),

163–183.

Pentland, A., 1984. Local shading analysis. IEEE Trans. PAMI

6 (2), 170–187.

Perona, P., Malik, J., 1990. Scale space and edge detection using

anisotropic diffusion. IEEE Trans. PAMI 12 (7), 629–639.

Powell, M.W., Sarkar, S., Goldgof, D., 2001. A simple strategy

for calibrating the geometry of light sources. IEEE Trans.

PAMI 23 (9), 1022–1027.

Rousseeuw, P.J., Leroy, A.M., 1987. Robust Regression and

Outlier Detection. Wiley, New York.

Sato, I., Sato, Y., Ikeuchi, K., 1999. Illumination distribu-

tion from shadows. In: Proceedings of CVPR’99, pp. 306–

312.

Szeliski, R., 1991. Fast shape from shading. CVGIP: Image

Understanding 53 (2), 125–153.

Tian, T.Y., Shah, M., 1997. Recovering 3d motion of multiple

objects using adaptive Hough Transform. IEEE Trans.

PAMI 19 (10), 1178–1183.

Zhang, Y., Yang, Y., 2001. Multiple illuminant direction

detection with application to image synthesis. IEEE Trans.

PAMI 23 (8), 915–920.
