Robust recovery of multiple light source based on local light source constant constraint
Jie Wei *
Department of Computer Science, City College and Graduate Center, City University of New York,
Convent avenue at 138th Street, New York, NY 10031, USA
Received 12 October 2001; received in revised form 7 April 2002
Abstract
In this paper we are concerned with the robust calibration of light sources in the scene from a known shape. The
image of a 3-D object depends on the light source(s), its 3-D geometry, and its surface reflectance properties (Robot
Vision, MIT Press, Cambridge, MA, 1986). In the last two decades, intensive research has been conducted in the
computer vision community along the line of shape from shading (Shape from Shading, MIT Press, Cambridge, MA,
1989), where great efforts are made to recover the 3-D geometry with a priori knowledge of the illumination and
surface reflectance properties. However, as pointed out by Sato et al. (Proceedings of CVPR'99, 1999, pp. 306), little
progress has been made on the recovery of light source(s) with known shape and surface reflectance properties. In a
recent paper (IEEE Trans. PAMI 23 (2001) 915), Zhang and Yang achieved multiple illuminant direction recovery
based on critical points, with impressive performance. In this paper we first formulate the local light
source constant constraint, i.e., in a local area of smooth lightness it is likely that the corresponding 3-D world points
on the object are illuminated by the same light sources. Based on this constraint we develop an algorithm to recover the
multiple illuminants: first, based on the Lambertian irradiance formula, a linear system is formulated for a local area,
and the local illuminant direction is then reconstructed by a least-squares solution. To make the estimate insensitive to
noise, the least trimmed squares method is employed. Next, a dense set of candidate critical points is obtained by a
two-step robust process, from which the directions of the multiple illuminants are recovered with an adaptive Hough
transform. The magnitude of each light source is then computed by solving an over-determined linear system formed
by pooling pixels illuminated by the same combined light vector. Initial experimental results on synthetic and real
world images suggest encouraging performance.
© 2002 Elsevier Science B.V. All rights reserved.
Keywords: Light source calibration; Least squares solution; Least trimmed squares method; Adaptive Hough transform; Shape from
shading
1. Introduction
It is well established that there are three factors
which determine the image of an object in the
scene: the light source(s), the 3-D geometry and
Pattern Recognition Letters 24 (2003) 159–172
www.elsevier.com/locate/patrec
* Tel.: +1-212-650-5604; fax: +1-212-650-6284.
E-mail address: [email protected] (J. Wei).
0167-8655/03/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved.
PII: S0167-8655(02)00208-8
the surface reflectance properties of the object
being imaged. Great progress has been made to
understand the 3-D shape of an object along the
line of shape from shading pioneered by Horn
(1975) (Horn and Brooks, 1989) using local
methods (e.g., Pentland, 1984; Ferrie and Levine, 1989), partially global methods (e.g., Oliensis,
1991; Kimmel and Bruckstein, 1995), and global
methods (Brooks and Horn, 1986; Szeliski, 1991),
where assumptions regarding the light source(s)
and surface reflectance are made to arrive at a
tractable reconstruction. However, as pointed out by
Sato et al. (1999), little work has been dedicated to
the recovery of light sources in cases where the 3-D geometry and the surface reflectance properties are
known, which are of essential importance in
computer vision and graphics applications. For
instance, in computer vision applications, if we can
recover light sources based on the known shape
and reflectance of one object in the scene, then
instead of making an over-simplistic assumption
about the scene illuminants, the recovered light sources provide valuable information for a better
understanding of the whole scene. In graphics
applications, the recovered light sources make it
easier to generate a photo-realistic synthesis of real
world and computer generated imagery for better
virtual reality.
In view of the significance of light source re-
covery or light source calibration, several groups of researchers have developed schemes with
varying degrees of success. Sato et al. (1999) pro-
posed to use the shadow information for the pur-
pose of illuminant recovery. However, reliable
illuminant recovery is only possible under the
condition that full knowledge of the occlusion
information of an object is available in the image,
which is not applicable in many reasonable cases where some light sources are far away from the
principal direction of the camera and thus the
shadow information is lacking. In (Powell et al.,
2001), by use of three known spheres of controlled
surface reflectance, the specular regions are then
employed to triangulate the positions of the light
source(s). The recovered light sources are then
used to achieve effective color correction. This triangulation-based method follows the line of
attack established by Mancini and Wolff (1992).
Like other triangulation-based methods in computer vision, their calibration results are extremely
sensitive to noise. However, no systematic strategy
is taken to address this crucial issue in their work.
In a recent paper, Zhang and Yang (2001) developed an interesting algorithm to obtain the multiple illuminant
directions by detecting the critical points, loci on a sphere where the direction of one
illuminant is perpendicular to the surface normal.
Based on the Lambertian reflectance assumption,
the search for the critical points is conducted by a
recursive refinement on some arcs. Nice experimental results have been reported on both synthetic
and real world images. However, close inspection of their method reveals that this
recovery algorithm is not without problems. In
their algorithm the search for the critical points on
the known sphere plays the paramount role. First
let’s briefly summarize their algorithm to locate
critical points, hereafter denoted by Zhang–Yang-
detection. Their search is carried out as follows. An
arc on the sphere is partitioned into subarcs of equal length. For each subarc a least-squares
minimization is performed to find a vector $\vec{v}_i = (b_i, c_i)^T$, where T is the transpose operator,
such that in this subarc the equality $E(\theta) = b_i \sin\theta + c_i \cos\theta$ holds, where $\theta$ uniquely determines
the location of a pixel on the subarc.
The presence of critical point(s) in a subarc is then
detected when the Euclidean distance between the $\vec{v}_i$'s of two adjacent subarcs exceeds a
prescribed threshold. The locations of the critical points are then
found by an iterative refinement over subarcs of
shorter length.
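To make the subarc step concrete, the fit and the adjacent-subarc test can be sketched in a few lines of Python. This is our own illustrative reconstruction, not the authors' implementation: the function names, the sampling, and the threshold are assumptions.

```python
import numpy as np

def fit_subarc(thetas, intensities):
    # Least-squares fit of E(theta) = b*sin(theta) + c*cos(theta)
    # over the samples of one subarc; returns v = (b, c).
    A = np.column_stack([np.sin(thetas), np.cos(thetas)])
    v, *_ = np.linalg.lstsq(A, intensities, rcond=None)
    return v

def detect_critical_subarcs(thetas, intensities, n_subarcs, threshold):
    # Partition the arc into subarcs of equal length, fit v on each, and
    # flag boundary k when v_k and v_{k+1} differ by more than the
    # prescribed threshold (a critical point lies near that boundary).
    th = np.array_split(thetas, n_subarcs)
    it = np.array_split(intensities, n_subarcs)
    vs = [fit_subarc(t, e) for t, e in zip(th, it)]
    return [k for k in range(len(vs) - 1)
            if np.linalg.norm(vs[k] - vs[k + 1]) > threshold]
```

On a synthetic arc where a second light stops contributing at the subarc boundary, the boundary index is flagged.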
As is well known, the least-squares minimization scheme is susceptible to the negative
impact of outliers (Rousseeuw and Leroy, 1987)
unless special care is taken. Through the seminal work of many computer vision researchers (e.g., Black
and Anandan, 1993; Irani et al., 1997), better understanding has been achieved of how to obtain
robust results in vision tasks, where outliers are rampant due to a wide array of
sources, for instance the imaging process of the camera and human errors. It has been
discovered that, to arrive at numerically stable and robust estimates, far more over-determined
data should be available. In the Zhang–Yang-detection algorithm, to
estimate $\vec{v}$, only those points in a subarc are pooled
together in the over-determined linear system. In
the ensuing refinement process subarcs of even
shorter lengths are employed which further re-
duces the data volume of sample points in the
least-squares estimation. Unless the image resolution is extremely high, the lack of data in the
subarcs will undermine the efficacy of the Zhang–Yang-detection algorithm. Besides, although the
iterative search in an arc can find several critical points,
to reliably recover the directions of multiple illuminants many arcs, of a number close to
the image size, should be tested. Hence the
Zhang–Yang-detection algorithm is rather computationally intensive.
In this paper, a new algorithm is developed to
recover multiple illuminants. This algorithm
shares a common philosophy with the Zhang–
Yang-detection algorithm, namely, the directions
of multiple illuminants are also recovered by de-
tecting those critical points. However, instead of
working on subarc points, an extremely sparse data set, our work is based on local areas with a
far larger data volume, such that robust techniques
such as least trimmed squares (LTS) can be
readily borrowed to make the estimates insensitive to
noise. Toward that end we first formulate the
local light source constant constraint (LLSCC),
i.e., in a local area of smooth lightness it is likely
that the corresponding 3-D world points on the object are illuminated by the same light sources.
Based on this constraint we develop an algorithm
to recover multiple illuminant directions: first
based on the Lambertian irradiance formula a
linear system is formulated for a local area, the
local illuminant direction is then reconstructed by
a least-squares solution. To make the estimate insensitive to
noise, the LTS method is carried out, followed by a Gaussian smoothing. Next a dense set of
candidate critical points is obtained as a result of a
two-step robust process, which is used to finally
arrive at the directions of multiple illuminants with
an adaptive Hough transform. The magnitude of
each light source is then computed by solving an
over-determined linear system formed by
pooling pixels illuminated by the same combined light vector. Initial experimental results on
synthetic and real world images suggest encouraging performance. In the same vein as Powell
et al. (2001) and Zhang and Yang (2001), in this
paper a sphere is chosen as the calibrating
object in our efforts to recover multiple illuminants.
This paper is laid out as follows. The LLSCC is formulated in Section 2. Section 3 expands on the
proposed algorithm to recover the multiple illu-
minants. Empirical results on both synthetic and
real world imagery are presented in Section 4.
Section 5 concludes this paper with more remarks.
2. Local light source constant constraint
Throughout this paper the surface reflectance is
assumed to be Lambertian, i.e., the following
equation is employed to determine the intensity
value of an object point:
$$E(\vec{X}) = \rho\,\vec{n} \cdot \vec{l}, \qquad (1)$$
where $\vec{X}$ is the 3-D coordinate of a world point X,
the scalar $\rho$ is the albedo of the surface, $\vec{n} = (n_x, n_y, n_z)^T$ is the normal vector of the world point
X with norm $\|\vec{n}\| = 1$, $\vec{l} = (l_x, l_y, l_z)^T$ is the vector representing the light source, and $\cdot$ denotes
the inner product. The light sources can be either
directional or point ones; in the latter case, if the
coordinate of a point source is $\vec{P}$, the direction of $\vec{l}$ is then defined by $\vec{X} - \vec{P}$. It is evident that the
former case is merely the extreme case when $\vec{P}$ is
extremely far away from the object, or the size of
the object under consideration is negligible compared with the distance of $\vec{P}$ to the object. As
pointed out by Pentland (1984), though a seemingly simple reflectance model, it has an extremely
wide applicability in modeling object surfaces in
computer vision and graphics, so much so that
even in cases where specular reflection is involved
this diffuse reflectance model is still applicable
except for a relatively small field of view, namely
those regions whose viewing angles are complementary with the incoming light direction. When
there is more than one light source, then for
each illuminant $\vec{l}_i$, according to Eq.
(1) its contribution $E_i(\vec{X})$ is given by
$$E_i(\vec{X}) = \rho\,\vec{n} \cdot \vec{l}_i. \qquad (2)$$
Therefore with multiple illuminants $\vec{l}_i$, $E(\vec{X})$ is
the sum of the $E_i(\vec{X})$'s:
$$E(\vec{X}) = \rho\,\vec{n} \cdot \left( \sum_i \vec{l}_i \right). \qquad (3)$$
If one denotes $\vec{L} = \sum_i \vec{l}_i$, then we have the following formula to compute the intensity value at
$\vec{X}$:
$$E(\vec{X}) = \rho\,\vec{n} \cdot \vec{L}. \qquad (4)$$
Hence each location $\vec{X}$ can in effect be viewed as
illuminated by only one illuminant, namely the
joint vector $\vec{L}$ of the multiple illuminants $\vec{l}_i$. In our
work it is assumed that weak perspective
projection is used in image formation and thus
image intensities are linearly related to surface
radiance, which is a widely used camera model (Horn, 1986). Suppose the imaging point of $\vec{X}$ is $\vec{x}$; then according to our image formation model the
intensity value $I(\vec{x})$ at $\vec{x}$ is in direct proportion to
the surface radiance $E(\vec{X})$, i.e.,
$$I(\vec{x}) = a E(\vec{X}), \qquad (5)$$
where $a$ is a certain constant. Plugging Eq. (4) into
Eq. (5), one obtains the following formula connecting $I(\vec{x})$ with the world geometry $\vec{n}$ and the (joint)
illuminant $\vec{L}$:
$$I(\vec{x}) = a \rho\,\vec{n} \cdot \vec{L}. \qquad (6)$$
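The equivalence behind Eqs. (3) and (4) is easy to verify numerically; the sketch below uses arbitrary illustrative values (albedo, normal, and two light vectors are our own) to confirm that summing the per-light contributions equals illumination by the single joint vector:

```python
import numpy as np

rho = 0.9                          # illustrative albedo
n = np.array([0.0, 0.0, 1.0])      # unit surface normal
l1 = np.array([0.3, 0.1, 0.8])     # two directional light vectors
l2 = np.array([-0.2, 0.4, 0.5])

E_sum = rho * (n @ l1) + rho * (n @ l2)   # sum of Eq. (2) contributions
E_joint = rho * (n @ (l1 + l2))           # single joint light L, Eq. (4)
assert np.isclose(E_sum, E_joint)         # both give 0.9 * 1.3 = 1.17
```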
For a small region in an image of smooth intensity
variations, it is reasonable to assume that the
corresponding world points have the same albedo.
Since in our work the region in question is small,
all light sources illuminating this small region can be viewed
as directional illuminants. Therefore, according to
Eq. (6), each point in this region is illuminated by
the same illuminant, be it a real single light source
or a joint vector of multiple light sources. The
aforementioned facts are summarized in the following constraint:
LLSCC: For a small region of smooth intensity changes in an image, the corresponding
world points are of the same albedo and illuminated by the same single light source.
The single light source mentioned in the
LLSCC is either a real single light source or the
joint vector of multiple sources. Equipped with
the LLSCC, we are ready to disclose its power in
recovering the direction $\hat{L}$ of the illuminant $\vec{L}$ for this
region, where $\hat{L}$ is the normalized version of $\vec{L}$, i.e., $\hat{L} = \vec{L}/\|\vec{L}\|$, with $\|\vec{L}\|$ the norm of $\vec{L}$. Inspecting
Eq. (6), the lack of knowledge of $a$ and $\rho$, beyond the fact that they are constant
across the region, poses no problem for the
recovery of $\hat{L}$, because their sole effect
according to Eq. (6) is merely a scaling of the
constant illuminant vector $\vec{L}$. If the
normal vector at each location $\vec{x}$ in a small region
is known, the LLSCC readily gives rise to a way to compute $\vec{L}$: Eq. (6)
holds for each pixel inside the region, so to recover the constant illuminant
direction $\hat{L}$, one can pool together the instances of Eq. (6) satisfied by all
pixels in the region and form the following over-determined linear system:
$$\vec{I} = c N \vec{L}, \qquad (7)$$
where, assuming the number of pixels in the region is $m$, the $m$-dimensional column vector $\vec{I}$ is defined as
$$\vec{I} = \left( I(x_1), I(x_2), \ldots, I(x_m) \right)^T, \qquad (8)$$
$c$ is the product of the unknown constants $a\rho$, and the $m \times 3$ normal matrix $N$ is formed by patching
the normals of the $X$'s together:
$$N = \begin{pmatrix} n_{x_1} & n_{y_1} & n_{z_1} \\ n_{x_2} & n_{y_2} & n_{z_2} \\ \vdots & \vdots & \vdots \\ n_{x_m} & n_{y_m} & n_{z_m} \end{pmatrix}. \qquad (9)$$
From Eq. (7), the illuminant $\vec{L}$ can be computed in
the least-squares sense (Golub and van Loan, 1983):
$$\vec{L} = \frac{1}{c} N^{+} \vec{I}, \qquad (10)$$
where $N^{+} = (N^T N)^{-1} N^T$ is the pseudo-inverse of the
matrix $N$. Given that $\hat{L}$ is obtained by
normalizing $\vec{L}$, we thus have the following formula
to calculate $\hat{L}$:
$$\hat{L} = \frac{N^{+}\vec{I}}{\|N^{+}\vec{I}\|}. \qquad (11)$$
Eq. (11) therefore lends us an effective area-based
scheme to estimate the illuminant direction; it is
indeed the foundation upon which our algorithm
to recover multiple illuminants is built.
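Eq. (11) translates directly into a few lines of linear algebra. In the sketch below (the function name is ours; `numpy.linalg.lstsq` stands in for the pseudo-inverse), the unknown scale $c = a\rho$ drops out in the normalization:

```python
import numpy as np

def illuminant_direction(N, I):
    # Least-squares solution of I = c N L (Eq. (7)); normalizing the
    # result removes the unknown scale c, giving Eq. (11).
    L, *_ = np.linalg.lstsq(N, I, rcond=None)
    return L / np.linalg.norm(L)
```

For exact synthetic data the true direction is recovered up to floating-point error, regardless of the value of $c$.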
3. Robust multiple illuminant direction recovery
algorithm
In this section we first expand on our method to
recover the joint illuminant direction for a local
region. We then present our scheme to locate critical points. Finally the recovery algorithm is
described at length.
3.1. Robust local illuminant direction recovery
algorithm
According to the LLSCC constraint as developed in the preceding section, a small region under
multiple illuminants can be viewed as illuminated by a single light source,
which is then estimated by use of Eq. (11). In this
section, we shall develop an algorithm to recover
the directions of multiple illuminants for a sphere
with uniform $a$ and $\rho$, and thus the same $c$ over the
surface, and with known pose and size. In cases where
the $c$'s for different local smooth regions have the same value, all the $N$'s in the different local regions
undergo the same scaling, so there is no need to
factor $c$ out of $\vec{L}$ in Eq. (7); therefore we
can simply rewrite Eq. (7) as the following
equality:
$$\vec{I} = N \vec{L}, \qquad (12)$$
where it is evident that $\vec{L}$ here is indeed $c\vec{L}$ in the
original Eq. (7). The single light source $\vec{L}$ for each small region is then estimated by the following
least-squares formula for this special case:
$$\vec{L} = N^{+} \vec{I}. \qquad (13)$$
Because the pose and size of the
sphere are known, without loss of generality one
can assume that the origin of the coordinate system is located at the center of the sphere; the
normal $\vec{n}$ at each point $(i, j)$ on the image of the
sphere is then given by
$$\vec{n}_{i,j} = \left( \frac{j}{r} - 1,\; y,\; 1 - \frac{i}{r} \right)^T, \qquad (14)$$
where $y = \sqrt{1 - \left( j/r - 1 \right)^2 - \left( 1 - i/r \right)^2}$. Thereby, according to Eq. (13), $N$ and $\vec{I}$ in a local region
can be readily collected from the image in order
to estimate $\vec{L}$, which is deemed constant based on
the LLSCC.
Because Eq. (13), which is used to compute $\vec{L}$, is
arrived at by minimizing the least-squares error as
defined by Eq. (12) (Horn, 1986), the estimated $\vec{L}$
for the local region concerned is susceptible to
the noise that is rampant in vision applications,
especially outliers, which even in small numbers
can entirely distort the estimates. According to
the theory of robust statistics (Huber, 1981), the
breakdown point, the percentage of allowable outliers which incur no adverse impact on the estimate,
of the least-squares estimation formula Eq.
(13) is 0, which shows that it is not
appropriate to use Eq. (13) per se for the purpose
of recovering the local light source. Fortunately, a
rich stock of ammunition is readily available from
robust statistics to mitigate this problem, such as
least-median-squares (LMedS) and LTS (Huber, 1981;
Rousseeuw and Leroy, 1987), whereby the breakdown point can be increased to as large as 50%.
Between these two most widely used robust operators, LMedS tends to produce better estimates
in discounting outliers. However, due to its
stochastic nature, its robustness can only be achieved by a large number of random-sampling
iterations, making it extremely
computationally intensive. By contrast,
LTS operates on order statistics of the raw
data: for a data array of size $O(n)$,
this operator arrives at the robust estimate within time
$O(n \log n)$, and the targeted breakdown point can be
achieved by adjusting the size of the trimmed data.
In view of its algorithmic efficiency and its effectiveness in obtaining robust results, in this work we chose
LTS as our robust operator to estimate $\vec{L}$ for
each local neighborhood.
Based on the LTS technique, in order to remove
outliers for a given region $R$, instead of pooling the
intensity values $I_i$ and corresponding normals of
all points within $R$ according to Eqs. (8) and (9),
the $I_i$'s are first sorted; the resulting set is
denoted by $S$. Next, to achieve a targeted breakdown point of $b\%$, the top and bottom $b\%$ of the data
in $S$ are thrown out, and the remaining data set is referred to as $T$. The resulting long vector $\bar{I}$ is
formed by pooling all the $I_i$'s in $T$, whereas the new
normal matrix $\bar{N}$ is formed by collecting the normals corresponding to each $I_i$ in $\bar{I}$ in the manner
of Eq. (9). Consequently, following Eq.
(13), the applicable robust version of the estimate of $\vec{L}$ is given by
$$\vec{L} = \bar{N}^{+} \bar{I}. \qquad (15)$$
That the resulting breakdown point based on Eq.
(15) is $b\%$ follows simply from the fact that the
$b\%$ outliers, whose collection is a subset of $S - T$, the top and bottom $b\%$ data entries in $S$, are
effectively removed from the set $T$.
The robust method developed in this subsection to recover the illuminant for
a small local region $R$ with $m$ pixels is summarized in the
following procedure, local-illuminant-recovery.
PROC local-illuminant-recovery
(1) formulate the $m$-dimensional vector $\vec{I}$ and the
$m \times 3$ matrix $N$ for all pixels in $R$ according
to Eqs. (8) and (9), respectively;
(2) sort $\vec{I}$ and permute $N$ accordingly;
(3) trim the top and bottom $b\%$ of the rows of $\vec{I}$ and
$N$, respectively;
(4) apply Gaussian smoothing to the trimmed $\bar{I}$;
(5) estimate the light source $\vec{L}$ for $R$ by use of Eq.
(15).
END
In the foregoing procedure, the outliers are
handled by Step 3, which is indeed the LTS
scheme. As discussed previously, the choice of $b$ is
crucial to the robust performance of this algorithm
due to the use of the LTS scheme. Evidently,
if $b$ is less than the actual percentage $p$ of
outliers, this procedure is unable to entirely
contain their adverse impact. However, owing to the Hough transform used in the
ensuing stage of the estimation algorithm as
described below, a $p$ slightly higher than $b$ can be
tolerated. If $p$ is too large, though, the results will be unreliable. This is illustrated in Fig. 1
for a synthesized image with salt/pepper noise added at different percentages: when $p = 0.16$ and
$b = 0.15$, the recovered $\vec{L}$'s shown in the fourth
column are still acceptable and can be effectively
handled by the Hough transform in order to reconstruct the multiple illuminants. However, when
$p = 0.2$ while $b$ remains 0.15, the recovered
$\vec{L}$'s depicted in the last column are too noisy to
be mitigated by the Hough procedure.
Fig. 1. The impact of the choice of $b$ for salt/pepper noise only. The synthesized images (row 1) and the $\vec{L}$'s recovered by local-illuminant-recovery, where $b$ is fixed at 0.15 while the percentage $p$ of added salt/pepper noise for the five columns is 0.05, 0.1, 0.15,
0.17, and 0.2, respectively.
Gaussian noise is alleviated by Step 4: we
apply a Gaussian diffusion to the local intensity array $\bar{I}$, which was sorted and trimmed in Steps 2 and 3. Because in this
array the outliers have mostly been removed and adjacent
values are of similar magnitude, the Gaussian
smoothing can effectively suppress the white
noise. In our implementation, a nine-tap Gaussian filter is used. To showcase the performance of
this procedure in the presence of Gaussian noise,
the results generated by this procedure for a synthesized image disturbed by white noise with
differing standard deviations are illustrated in Fig.
2. As can be observed in this figure, the procedure is fairly resilient to minor Gaussian noise, e.g., the
first three columns where $\sigma = 1$, 3, or 5. In cases
where $\sigma$ is relatively large, e.g., 7, the recovered $\vec{L}$'s are noisy but can still be handled by the ensuing process to recover the multiple illuminants. For considerably larger $\sigma$, say 11, as depicted in the last
column, the recovered $\vec{L}$'s are too noisy to be used
by our algorithm to calibrate the multiple illuminants.
3.2. Critical points recovery
Based on the LLSCC constraint, when there is
more than one light source, the $\vec{L}$ recovered by the
local-illuminant-recovery procedure developed
in the preceding subsection is essentially the vector
sum of the multiple illuminants visible to the pixels in
$R$. In the image of the sphere, for two adjacent
regions $R_1$ and $R_2$ that are visible to the same
set of light sources, the respective recovered
illuminant vectors $\vec{L}_1$ and $\vec{L}_2$, being the vector sums of
exactly the same set of multiple illuminants,
should satisfy $\vec{L}_1 = \vec{L}_2$. This observation
leads us to the following proposition, which has an
important role in our effort to recover multiple
illuminants:
Proposition 1. The inequality $\vec{L}_1 \neq \vec{L}_2$ indicates that the two neighboring regions are not visible to the same set of illuminants.
The concept of critical points as defined in the Zhang–Yang-detection algorithm plays the essential role
in their work of recovering multiple illuminant
directions. The critical points on the sphere are the
loci where one illuminant grazes the sphere,
namely,
$$\hat{l}_g \cdot \vec{n}_X = 0, \qquad (16)$$
where $\hat{l}_g$ is the unit vector representing the
grazing illuminant and $\vec{n}_X$ is the normal vector at a
critical point X. It is evident that Eq. (16) represents a plane P whose normal is $\hat{l}_g$. Since the center
of the sphere is located at the origin, the set of
critical points due to the grazing illuminant $\hat{l}_g$ is
the great circle which is the intersection of
the known sphere and the plane P. Of course only
Fig. 2. The impact of Gaussian noise of different levels on the local-illuminant-recovery procedure.
part of the great circle is visible in the image after
the image formation process. Given that our work
is region based, any illuminant grazing the sphere
intersects a region in an arc, which is referred
to as a critical arc. A critical arc is thus a set of critical points forming part of a great
circle generated by a grazing illuminant. The critical
arcs are the foundation on which we detect the multiple illuminants in this work. If we denote the set
of all critical arcs in the image of the sphere by G,
and overall there are k illuminants, then G is defined as
$$G = C_1 \cup C_2 \cup \cdots \cup C_k, \qquad (17)$$
where the $i$th critical arc $C_i$, $i = 1, 2, \ldots, k$, is the
partial great circle generated by the $i$th illuminant.
If a dense data set of critical points can somehow be detected for each critical arc $C_i$, then
its corresponding $\vec{l}_i$ can be estimated from the over-determined linear system formed by
pooling together Eq. (16) for each point on $C_i$, i.e.,
$$U \vec{l}_i = \vec{0}, \qquad (18)$$
where $U$ is the $n_{c_i} \times 3$ matrix each row of which
contains the three components of the normal at a critical
point on $C_i$, $n_{c_i}$ is the number of critical points on
$C_i$, and $\vec{0}$ is the $n_{c_i}$-dimensional zero vector. A numerically stable and robust estimate of $\vec{l}_i$ is possible
only if a relatively large number $n_{c_i}$ of critical
points can be recovered for each $C_i$.
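Eq. (18) is a homogeneous system, so in practice $\vec{l}_i$ can be estimated as the unit-norm least-squares null vector of U. One standard way of doing this, sketched below under the assumption that the rows of U are the critical-point normals (the function name is ours), is to take the right singular vector of U associated with the smallest singular value:

```python
import numpy as np

def grazing_illuminant(U):
    # U: n x 3 matrix whose rows are normals at candidate critical
    # points. The unit vector l minimizing ||U l|| is the right singular
    # vector of U for the smallest singular value (defined up to sign).
    _, _, Vt = np.linalg.svd(U)
    return Vt[-1]
```

For normals sampled on the great circle z = 0, for instance, this returns ±(0, 0, 1).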
As discussed in the preceding paragraph, it is of
utmost importance to locate as many critical
points as possible to obtain a desirable estimate of
the corresponding $\vec{l}$. Based on the definition of critical arcs
and Proposition 1, we see right away that if two adjacent regions have different $\vec{L}$'s, i.e.,
$\vec{L}_1 \neq \vec{L}_2$, then $G \cap (R_1 \cup R_2) \neq \emptyset$; that is, the union of
these two adjacent regions contains one or more
critical points belonging to one or more critical arcs $C_i$. Therefore testing the preceding inequality serves as a valuable criterion in the
search for critical points. One can detect critical
points by partitioning the image of the sphere into
non-overlapping regions, small squares for simplicity; the local-illuminant-recovery procedure is then carried out to robustly compute $\vec{L}$ for each
square. Candidate regions where critical points are present, referred to as critical regions for short,
are then detected when the magnitude of the difference between the $\vec{L}$'s of two adjacent
regions exceeds a certain threshold; namely, assuming $\vec{L}_1$ and $\vec{L}_2$ are the illuminants recovered by
the local-illuminant-recovery procedure for two
adjacent regions $R_1$ and $R_2$, respectively, then $R_1$
and $R_2$ are labeled as critical regions if the following condition holds true:
$$\|\vec{L}_1 - \vec{L}_2\| > \delta_1, \qquad (19)$$
where $\delta_1$ is a prescribed scalar threshold indicating a significant difference. Here a paradox presents
itself: for the local-illuminant-recovery
procedure to work, the size of each region should
be relatively large to render the LTS and the ensuing
least-squares estimate effective; in our empirical
study, numerically unstable estimates are observed
when the number of pixels in a region is fewer
than thirty. However, to recover the grazing illuminant with high confidence, the locations of the
corresponding critical points need to be as precise
as possible, so the size of the critical regions
should be as small as possible. To tackle these
conflicting requirements, a two-step process is
employed: (1) search for the critical regions E through
a non-overlapping partitioning; (2) refine the locations of the critical points within the critical regions.
The non-overlapping search conducted in the first
step makes the search efficient: those
regions which fail inequality (19) are pruned
before the second step, which is far more
computationally intensive, is carried out. A multi-scale
search is not employed here for the following
two reasons. (1) Outliers such as salt/pepper
noise in the images at the original resolution would
propagate to the lower-resolution representations,
adversely affecting the estimation of
critical points. (2) As described earlier, a
larger number of pixels is generally desirable to
render our estimation more robust; unless the
image is of extremely large resolution,
the lower-resolution images in a multi-scale
representation may not support robust
estimation. In the second step, the refinement toward more precise locations of critical points
cannot be achieved by reducing the number of
pixels contained in the critical regions, which
would simply undermine the efficacy of the local-illuminant-recovery procedure. Instead, we choose to
perform the refinement based on an overlapped partitioning of E: for each pixel p inside E, a
region $R_p$ centered on p is formed, and the local-illuminant-recovery procedure is called to compute
$\vec{L}_p$ for this region. Since it is likely
that the pixels in $R_p$ are not all illuminated by the same
set of illuminants (its intersection with the critical
point set G is non-empty), in order to make the
recovered light source a better approximation of the
light source illuminating the majority of $R_p$, it
is found helpful to increase the trimming constant $b$ in
the local-illuminant-recovery procedure,
which has a better chance of throwing out pixels
belonging to the ''minority parts'' of $R_p$. Two
adjacent points $p_1$ and $p_2$ are then earmarked as candidate critical points if the following condition is
satisfied:
$$\|\vec{L}_{p_1} - \vec{L}_{p_2}\| > \delta_2, \qquad (20)$$
where the scalar $\delta_2$ ($< \delta_1$) is the threshold signifying candidate critical points. Its value should be
smaller than $\delta_1$ since the two regions $R_{p_1}$ and $R_{p_2}$ now overlap greatly in their domain.
Summarizing, we have the following procedure
to recover candidate critical points:
PROC candidate-generation
(1) partition the image of the sphere into non-overlapping squares, then for each square call local-illuminant-recovery to estimate the light source \vec{L};
(2) generate the set E of critical regions according to (19);
(3) for each pixel p within E form a square centered on p, then call local-illuminant-recovery with a slightly larger magnitude for b to compute the corresponding \vec{L}_p;
(4) generate the set C of candidate critical points according to (20).
END
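The four steps above can be sketched in code. The following is a minimal illustration, not the author's implementation: local-illuminant-recovery is stubbed by a caller-supplied `recover` function (in the paper it is a trimmed least-squares fit of the Lambertian model with trimming constant b), and the names `d1`, `d2`, and `block` are illustrative stand-ins for the thresholds of (19) and (20) and the square size.

```python
import numpy as np

def candidate_generation(data, recover, d1=0.5, d2=0.2, block=8):
    """Sketch of PROC candidate-generation.

    data    : H x W x 3 array fed block-wise to `recover`.
    recover : callable(region) -> 3-vector light estimate; stands in for
              the paper's local-illuminant-recovery procedure.
    """
    H, W, _ = data.shape
    nby, nbx = H // block, W // block
    # (1) non-overlapping squares -> per-square light estimates
    Lsq = np.zeros((nby, nbx, 3))
    for i in range(nby):
        for j in range(nbx):
            Lsq[i, j] = recover(data[i*block:(i+1)*block, j*block:(j+1)*block])
    # (2) critical regions E: adjacent squares whose estimates differ by more
    #     than d1, cf. Eq. (19)
    E = set()
    for i in range(nby):
        for j in range(nbx):
            for ii, jj in ((i, j+1), (i+1, j)):
                if ii < nby and jj < nbx and np.linalg.norm(Lsq[i, j] - Lsq[ii, jj]) > d1:
                    E.update({(i, j), (ii, jj)})
    # (3) overlapping window centered on every pixel of E; the paper uses a
    #     slightly larger trimming constant b inside `recover` at this step
    h = block // 2
    Lp = {}
    for (i, j) in E:
        for y in range(i*block, (i+1)*block):
            for x in range(j*block, (j+1)*block):
                win = data[max(0, y-h):y+h+1, max(0, x-h):x+h+1]
                Lp[(y, x)] = recover(win)
    # (4) candidate critical points C: adjacent pixels whose estimates differ
    #     by more than d2, cf. Eq. (20)
    C = set()
    for (y, x), v in Lp.items():
        for q in ((y, x+1), (y+1, x)):
            if q in Lp and np.linalg.norm(v - Lp[q]) > d2:
                C.update({(y, x), q})
    return C
```

With `recover` stubbed as a simple block mean on a two-region image, the returned set C concentrates in a band around the boundary between the regions, mirroring how candidates cluster near critical arcs; a practical implementation would vectorize the per-pixel loop in step (3).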
By inspecting the preceding procedure, it can be observed that the output point set C is not the actual critical point set G; it is still a superset, namely C \supseteq G. However, C is already a significant reduction from the set E, which is a far larger superset of G. The recovered C's for the noisy images depicted in Figs. 1 and 2 of the previous section are illustrated in Fig. 3. As can be observed, for images corrupted by considerable noise, e.g., Gaussian noise with \sigma > 7 or salt/pepper noise with p > 3%, the recovered \vec{L}'s are too noisy to be used for light source calibration.
3.3. Multiple light sources recovery algorithm
Fig. 3. The recovered critical points for the noisy images shown in Figs. 1 and 2.
Since C as estimated by the candidate-generation procedure in the preceding subsection is still a superset of G and, to make matters worse, there is no way for us to know which critical arc each estimated point belongs to, the over-determined linear system defined by Eq. (18) cannot be called upon directly to obtain the corresponding illuminant. A classification procedure is apparently needed before estimating the directions of the illuminants. Fortunately, a special
structural property of G lends us an effective
scheme to finally reconstruct the directions of
multiple illuminants. This is because G is a union
of a number of great circles as formulated in (17),
and pixels in C are either critical points or those in
the close neighborhood of points in G. Thus if two points p_1 and p_2 in C are from, or close to, the same critical arc C_i, then their normals should each satisfy Eq. (16) with similar \hat{l}_g's. In light of the linear nature of Eq. (16), the Hough transform can be used to recover the multiple illuminant directions. For each pixel p in C, first its normal on the world globe is computed according to Eq. (14); next a vote is cast for every \hat{l}_g for which Eq. (16) is satisfied. Finally each \hat{l}_g is obtained by finding the clustering point in the vote array. It is well established that for a low-dimensionality problem the Hough transform is an extremely effective and robust scheme for parameter estimation (Ballard and Brown, 1982). In our problem, the dimensionality of the parameter \hat{l}_g to be estimated is merely two: three components subject to the constraint that their magnitude is one. To ease the computational burden incurred by the voting process, the adaptive Hough transform (Illingworth and Kittler, 1987; Tian and Shah, 1997) is adopted: first a coarse grid is used to cast the votes, then the regions around the clustering points are discretized with a finer-grained grid. This grid refinement process is iterated until the expected precision is reached.
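A minimal sketch of this coarse-to-fine voting is given below, under simplifying assumptions: a single dominant illuminant, a (\theta, \phi) grid over the unit sphere, and illustrative grid sizes and tolerances. Each candidate critical point's normal n votes for every grid direction l with |l \cdot n| < tol, a discrete counterpart of Eq. (16); only the strongest peak is refined here, whereas the paper refines every clustering point.

```python
import numpy as np

def hough_direction(normals, n_theta=18, n_phi=36, tol=0.05, levels=2):
    """Adaptive Hough voting for one illuminant direction.

    normals : N x 3 unit normals of candidate critical points; by Eq. (16)
    each is (nearly) perpendicular to its illuminant direction.
    """
    t0, t1, p0, p1 = 0.0, np.pi, -np.pi, np.pi
    for _ in range(levels + 1):
        th = np.linspace(t0, t1, n_theta)
        ph = np.linspace(p0, p1, n_phi)
        T, P = np.meshgrid(th, ph, indexing="ij")
        dirs = np.stack([np.sin(T) * np.cos(P),
                         np.sin(T) * np.sin(P),
                         np.cos(T)], axis=-1)            # grid of unit vectors
        # a normal votes for every direction it is nearly perpendicular to
        votes = (np.abs(dirs @ normals.T) < tol).sum(axis=-1)
        i, j = np.unravel_index(np.argmax(votes), votes.shape)
        # shrink the search window around the current peak (grid refinement)
        dt, dp = (t1 - t0) / n_theta, (p1 - p0) / n_phi
        t0, t1 = th[i] - dt, th[i] + dt
        p0, p1 = ph[j] - dp, ph[j] + dp
        tol *= 0.5
    return dirs[i, j]
```

Because Eq. (16) is sign-blind, the procedure may return either of the two antipodal peaks; which one is the physical illuminant is resolved by the visibility test of the next subsection.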
The foregoing adaptive Hough transform procedure only provides the directions \hat{l}_g's of the multiple illuminants. To fully recover the multiple illuminants \vec{l}_g's, their corresponding magnitudes must be estimated, which turns out to be an easy task once the \hat{l}_g's are known. Toward that end the concept of a macro region is first defined: the set of pixels which are visible to the same set t of illuminant directions is referred to as a macro region \mathcal{R}. It is evident that the macro regions form a partition of the sphere image. To generate this partitioning, first the set t of \vec{l}_{g_i}'s visible to each pixel p is computed by inspecting the positivity of the inner product \hat{l}_{g_i} \cdot \vec{n}_p for every \hat{l}_{g_i}; then the pixels having the same t are grouped into one macro region \mathcal{R}. Each \mathcal{R} is thus a superset of the regions R employed in the local-illuminant-recovery procedure. To effect a robust estimate of the magnitude of each illuminant, the median vector \vec{L}_{\mathcal{R}} is first computed over all R's inside \mathcal{R}, where each element of \vec{L}_{\mathcal{R}} is the median of the corresponding elements of all the \vec{L}_R's, R \in \mathcal{R}. QuickSort is called to compute these element-wise medians. Next, according to the LLSCC constraint and denoting m_{g_i} as the magnitude of the illuminant \vec{l}_{g_i}, the following equality holds:
\vec{L}_{\mathcal{R}_j} = \sum_{\vec{l}_{g_i} \in t(\mathcal{R}_j)} m_{g_i} \hat{l}_{g_i}.    (21)
It can be proved by a simple induction that the number of \mathcal{R}'s is greater than the number of illuminants \vec{l}_g's; thus the number of instances of Eq. (21), one per \mathcal{R}, exceeds the number of unknowns m_{g_i}'s. Furthermore, each instance of Eq. (21) in fact provides three scalar equations for the m_{g_i}'s. Therefore, by stacking the instances of Eq. (21) together, we again obtain an over-determined linear system which can be used to estimate the m_{g_i}'s. To ease the algebraic manipulation, Eq. (21) is rewritten as the following equation:
\vec{L}_{\mathcal{R}_j} = G F_j \vec{m},    (22)
where, if J is the number of illuminants, G is a 3 \times J matrix obtained by patching the \hat{l}_{g_i}'s column-wise:

G = (\hat{l}_{g_1}, \hat{l}_{g_2}, \ldots, \hat{l}_{g_J});    (23)

the filtering matrix F_j is a J \times J diagonal matrix with F_j[k, l] = 0 if k \neq l, F_j[k, k] = 1 if \vec{l}_{g_k} \in t(\mathcal{R}_j), and F_j[k, k] = 0 otherwise; and \vec{m} is the J-dimensional column vector (m_{g_1}, m_{g_2}, \ldots, m_{g_J})^T. It is evident that the filtering matrix F_j serves to annihilate the \vec{l}_{g_i}'s which make no contribution to \mathcal{R}_j.
By pooling Eq. (22) for all the \mathcal{R}_j's together we thus arrive at the following over-determined linear system:

\vec{L}_{all} = \bar{G}_{all} \vec{m},    (24)
where the long vector \vec{L}_{all} is formed by stacking the \vec{L}_{\mathcal{R}_j}'s together, while the matrix \bar{G}_{all} is generated by stacking the G F_j's vertically. Thereby \vec{m} can be estimated by the following least-squares formula:

\vec{m} = \bar{G}_{all}^{+} \vec{L}_{all}.    (25)

Finally, we are ready to stipulate our algorithm to recover the multiple illuminants:
Algorithm multiple-illuminant-recovery
1. Call the candidate-generation procedure to detect the candidate critical point set C;
2. perform the adaptive Hough transform to estimate the illuminant directions \hat{l}_{g_i}'s;
3. generate the macro region partitioning of the sphere image and then use Eq. (25) to estimate the magnitude m_{g_i} of each illuminant whose direction has been estimated in the previous step.
End
The corresponding flow chart of the foregoing
algorithm is illustrated in Fig. 4.
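Step 3 of the algorithm, Eqs. (21)-(25), amounts to a small stacked least-squares problem. The sketch below assumes the unit directions (the columns of G) and the per-macro-region median vectors with their visibility sets are already available; all names are illustrative.

```python
import numpy as np

def recover_magnitudes(G, L_macro, visible):
    """Estimate the illuminant magnitudes m via Eqs. (22)-(25).

    G       : 3 x J matrix of unit illuminant directions, Eq. (23).
    L_macro : list of 3-vectors, the median light vector of each macro region.
    visible : list of index sets t(R_j) -- which columns of G illuminate R_j.
    """
    J = G.shape[1]
    blocks, rhs = [], []
    for L_R, t in zip(L_macro, visible):
        # filtering matrix F_j zeroes out the directions not visible to R_j
        F = np.diag([1.0 if k in t else 0.0 for k in range(J)])
        blocks.append(G @ F)            # one block row of Eq. (24)
        rhs.append(L_R)
    G_all = np.vstack(blocks)           # stacked matrix \bar{G}_all, Eq. (24)
    L_all = np.concatenate(rhs)
    return np.linalg.pinv(G_all) @ L_all    # least-squares solve, Eq. (25)
```

With noise-free median vectors the pseudo-inverse recovers the magnitudes exactly; with noisy medians it returns the least-squares fit over all macro regions, which is precisely the robustness the pooling in Eq. (24) is designed to provide.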
4. Experimental results
In this section, experimental results generated by the multiple-illuminant-recovery algorithm developed in the preceding section are presented. First, results on synthetic imagery with known ground truth are reported, which provide insight into the numeric quality of the proposed algorithm. We then present the performance of the algorithm on real-world images to further demonstrate its efficacy.

The advantage of the synthetic experiments is that the ground truths of the original images are available, enabling a quantitative evaluation of the proposed algorithm. To provide such a quantitative measure, we define the errors \epsilon_d and \epsilon_m to gauge the agreement between the recovered illuminants and the original M illuminants:
\epsilon_d = \frac{1}{M} \sum_{i=1}^{M} |\hat{l}_i - \hat{\tilde{l}}_i|,    (26)

\epsilon_m = \frac{1}{M} \sum_{i=1}^{M} \left| \|\vec{l}_i\| - \|\vec{\tilde{l}}_i\| \right|,    (27)
where \vec{l}_i and \vec{\tilde{l}}_i are the vectors representing the light sources estimated by the proposed algorithm and the known ones used in rendering the synthetic images, respectively; \hat{l}_i denotes the normalized \vec{l}_i, and \|\vec{l}_i\| its magnitude. It is evident that the \epsilon's indicate the mean discrepancies between the estimated light source vectors and the generative light source vectors in direction and magnitude, respectively.
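Under one plausible reading of Eqs. (26) and (27), with \epsilon_d taken as a mean angular error in degrees (the units in which the results below are reported) and \epsilon_m as the mean absolute magnitude difference, the two measures can be computed as follows; the estimated sources are assumed to be already matched one-to-one to the ground-truth ones.

```python
import numpy as np

def illuminant_errors(est, true):
    """Mean direction error (degrees) and magnitude error, cf. Eqs. (26)-(27).

    est, true : M x 3 arrays of matched light source vectors.
    """
    e_hat = est / np.linalg.norm(est, axis=1, keepdims=True)
    t_hat = true / np.linalg.norm(true, axis=1, keepdims=True)
    # angle between each normalized pair, in degrees
    cosang = np.clip((e_hat * t_hat).sum(axis=1), -1.0, 1.0)
    eps_d = np.degrees(np.arccos(cosang)).mean()
    # mean absolute difference of the magnitudes, Eq. (27)
    eps_m = np.abs(np.linalg.norm(est, axis=1)
                   - np.linalg.norm(true, axis=1)).mean()
    return eps_d, eps_m
```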
In the first row of Fig. 5, a globe of radius 200 pixels illuminated by two light sources is rendered by our program; Gaussian (\mu = 0, \sigma = 5) and salt/pepper noises with equal probability are then added to 10% of the pixels to generate the final image. The local illuminants for each local region produced by the procedure local-illuminant-recovery are depicted in the second column. 1 The critical points are given in the third column, from which it can be observed that far more critical points are available for numerically recovering the light directions. The final column illustrates the newly rendered sphere using the parameters estimated by the proposed algorithm. The two error measures for the recovered illuminants are \epsilon_d = 2.1^\circ and \epsilon_m = 2.3.

In the second experiment, a globe of radius
200 pixels illuminated by two light sources which are complementary to each other, i.e., \vec{l}_1 = -\vec{l}_2, is considered. Here, 8% of the pixels are replaced by salt/pepper noise, and Gaussian noise with the same parameters as in the previous example is added to each pixel. The results for this synthetic image are listed in the second row of Fig. 5. Our algorithm has no problem handling this case, which is not covered by other algorithms (e.g., Zhang and Yang, 2001). The corresponding errors are \epsilon_d = 2.8^\circ and \epsilon_m = 2.5.
Fig. 4. The flow chart of the proposed algorithm multiple-illuminant-recovery.
1 The magnitudes are scaled to result in better visualization.
To further delineate the performance of the new algorithm, a globe of radius 200 pixels illuminated by four light sources, with salt/pepper noise in 8% of the pixels and Gaussian (\mu = 0, \sigma = 5) noise at each pixel, is tested with our algorithm. As can be observed in the third row of Fig. 5, the recovered image is both qualitatively and quantitatively satisfactory. In this test, \epsilon_d = 3.4^\circ and \epsilon_m = 4.6.
Extensive experiments have been conducted with the proposed multiple-illuminant-recovery algorithm; the performances are consistent with the three shown in Fig. 5. The adverse impact of various noises, e.g., Gaussian or salt/pepper, is discounted effectively.
We also tested the performance of the proposed algorithm on real-world imagery. As depicted in the last row of Fig. 5, a picture of a ping-pong ball illuminated by two light sources was taken. Due to the small size of the ball, the distant light bulbs can be approximated by directional light sources. As can be seen from the recovered images shown in the same row, the two recovered light sources closely approximate the original ones qualitatively.
Fig. 5. Synthetic and real world imagery experimental results of the proposed algorithm.
5. Conclusion and discussion
In this paper we first developed the LLSCC; then, based on this constraint, a novel robust scheme was presented to recover the multiple light sources from a sphere in the scene. The LTS technique is employed in the least-squares estimation of the local light vectors to improve the robustness of the recovery algorithm. Then the candidate critical points, as defined in Zhang and Yang (2001), are spotted by checking the differences between adjacent blocks/pixels. Next, an adaptive Hough transform procedure is performed to arrive at the directions of the multiple illuminants. Finally, the magnitudes of all light sources are obtained by solving an over-determined linear system after pooling all pixels inside regions illuminated by the same combination of light vectors. Initial experimental results on synthetic and real-world imagery alike show encouraging performance.

Indeed, the LLSCC can be applied to situations where the shape of the object is known a priori to calibrate the light sources; the approach is not limited to a sphere, and a wide array of man-made objects can serve as the calibrating object. In this paper only directional light sources are considered, which is applicable when the calibrating object is very small or the light sources are far away. However, as pointed out in Powell et al. (2001), to effectively calibrate the light sources in an indoor scene, it is more reasonable to model the light sources as point sources. In the near future we will examine methods to recover point light sources within the framework developed in this paper. Last but not least, efforts will also be made to exploit the recovered light sources for illumination-invariant recognition in computer vision and for virtual reality in computer graphics.
The proposed algorithm is very robust against outliers: a large choice of b can effectively discount a large percentage of them. However, Gaussian noise with significantly large variance cannot be easily handled by this algorithm: the local Gaussian diffusion in Step 4 of the procedure local-illuminant-recovery, together with the ensuing Hough transform, is simply inadequate. This is because the method proposed in this paper is by nature a local algorithm: each \vec{I} is estimated from a local neighborhood, and a local minimum is taken as the estimate. To achieve better resilience against noise, a global method to estimate the \vec{I} field can be used, where a rich arsenal of energy formulations and optimization schemes can be employed, such as Markov random fields (Barnard, 1993), anisotropic diffusion (Perona and Malik, 1990), variational calculus (Brooks and Horn, 1986), and partial differential equation based methods (Koenderink, 1984). In the near future we will explore these avenues to enhance the light source calibration method. Furthermore, issues regarding calibrating objects other than globes, based on the LLSCC as expounded in this paper, will also be explored.
References
Ballard, D.H., Brown, C.M., 1982. Computer Vision. Prentice
Hall, Englewood Cliffs, NJ.
Barnard, S., 1993. Stereo matching. In: Chellappa, R., Jain,
A.K. (Eds.), Markov Random Fields. Academic Press, New
York, pp. 245–271.
Black, M.J., Anandan, P., 1993. A framework for the robust
estimation of optical flow. In: Proceedings of ICCV’93, pp.
231–236.
Brooks, M.J., Horn, B.K.P., 1986. The variational approach to
shape from shading. Comput. Vis. Graph. Image Process.
33, 174–208.
Ferrie, F., Levine, M., 1989. Where and why local shading
analysis works. IEEE Trans. PAMI 11 (2), 198–206.
Golub, G.H., van Loan, C.F., 1983. Matrix Computations. The
Johns Hopkins University Press, Baltimore, MD.
Horn, B.K.P., 1986. Robot Vision. MIT Press, Cambridge,
MA.
Horn, B.K.P., 1975. Obtaining shape from shading informa-
tion. In: Winston, P.H. (Ed.), The Psychology of Computer
Vision. McGraw-Hill, New York, pp. 115–155.
Horn, B.K.P., Brooks, M.J. (Eds.), 1989. Shape from Shading.
MIT Press, Cambridge, MA.
Huber, P.J., 1981. Robust Statistics. Wiley, New York.
Illingworth, J., Kittler, J., 1987. Adaptive Hough transform.
IEEE Trans. PAMI 9, 690–698.
Irani, M., Rousso, B., Peleg, S., 1997. Recovery of ego-motion
using region alignment. IEEE Trans. PAMI 19 (3), 268–272.
Kimmel, K., Bruckstein, A.M., 1995. Tracking level sets by
level sets: a method for solving the shape from shading
problem. Comput. Vis. Image Understanding 62 (2), 47–58.
Koenderink, J., 1984. The structure of images. Biol. Cyber. 50,
363–370.
Mancini, T.A., Wolff, L.B., 1992. 3d shape and light source
location from depth and reflectance. In: Proceedings of
CVPR’92, pp. 707–709.
Oliensis, J., 1991. Shape from shading as a partially well-
constrained problem. CVGIP: Image Understanding 54 (2),
163–183.
Pentland, A., 1984. Local shading analysis. IEEE Trans. PAMI
6 (2), 170–187.
Perona, P., Malik, J., 1990. Scale space and edge detection using
anisotropic diffusion. IEEE Trans. PAMI 12 (7), 629–639.
Powell, M.W., Sarkar, S., Goldgof, D., 2001. A simple strategy
for calibrating the geometry of light sources. IEEE Trans.
PAMI 23 (9), 1022–1027.
Rousseeuw, P.J., Leroy, A.M., 1987. Robust Regression and
Outlier Detection. Wiley, New York.
Sato, I., Sato, Y., Ikeuchi, K., 1999. Illumination distribu-
tion from shadows. In: Proceedings of CVPR’99, pp. 306–
312.
Szeliski, R., 1991. Fast shape from shading. CVGIP: Image
Understanding 53 (2), 125–153.
Tian, T.Y., Shah, M., 1997. Recovering 3d motion of multiple
objects using adaptive Hough Transform. IEEE Trans.
PAMI 19 (10), 1178–1183.
Zhang, Y., Yang, Y., 2001. Multiple illuminant direction
detection with application to image synthesis. IEEE Trans.
PAMI 23 (8), 915–920.