Cosine lobes for interactive direct lighting in dynamic scenes

Computers & Graphics 34 (2010) 767–778


Technical Section


Sylvain Meunier a, Romuald Perrot a, Lilian Aveneau a, Daniel Meneveaux a,*, Djamchid Ghazanfarpour b

a XLIM-SIC laboratory, UMR CNRS 6172, Computer Graphics, Bat. SP2MI, Teleport 2, Bvd Marie et Pierre Curie, BP 30179, 86962 Futuroscope Chasseneuil Cedex, France

b XLIM-DMI laboratory, UMR CNRS 6172, Computer Graphics, 123, avenue A. Thomas 87060 Limoges cedex, France

Article info

Article history: Received 1 February 2010; received in revised form 13 August 2010; accepted 20 August 2010

Keywords: Interactive rendering; Basis functions; Cosine lobes

doi:10.1016/j.cag.2010.08.002

This article was recommended for publication by Joaquim Jorge.

* Corresponding author. E-mail address: [email protected] (D. Meneveaux).

Abstract

Cosine functions have been widely used for representing Bidirectional Reflectance Distribution Functions (BRDF) such as the Lambert, Phong and Lafortune models. They are well suited to represent both high and low frequency signals. However, they are difficult to use with visibility and incident radiance. In most systems, the rendering equation terms are thus estimated using various methods. Several interactive rendering systems instead rely on the projection of each term onto orthonormal basis functions such as spherical harmonics or wavelets. These methods are easier to handle since the integration becomes a dot product. However, these functions are also subject to several drawbacks. For instance, the number of coefficients is high for the representation of high frequency phenomena, and the pre-computation time required for projecting each term of the rendering equation cannot be neglected. This paper demonstrates that cosine lobes can be generalized to visibility and incoming radiance with several advantages. First, cosine lobes do not form an orthonormal basis of functions and the number of parameters remains naturally adapted to the signal. This is particularly useful for complex and high frequency functions, for instance glossy BRDF or small light sources. We also use this property for reducing the number of parameters as the computation goes along. Second, the Lambert, Phong and Lafortune BRDF models are already used in many rendering systems. Since they already rely on this representation, no transformation into other types of model is necessary. This paper shows how the product of cosine lobes can be integrated rapidly. As a demonstration of our methodology, we propose an interactive rendering system for direct lighting, including soft shadows and spatially varying materials.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Several methods have been proposed in the literature for producing physically based images with lighting simulations, complex bidirectional reflectance distribution functions (BRDF), natural lighting, high numbers of objects and so on [1]. Unfortunately, producing realistic images at interactive frame rates in dynamic scenes remains a challenging task: estimating the radiance reflected towards the camera requires integrating incoming light from all directions, weighted by the BRDF. Many authors have provided interesting solutions, with the use of orthogonal basis functions such as spherical harmonics (SH) [2], hemispherical harmonics (HSH) [3,4] or Haar wavelets (HW) [5]. This type of representation simplifies the rendering equation computation and replaces the continuous integration by products of coefficients. Nevertheless, the projection of visibility, BRDF and/or incoming radiance on basis functions remains a costly process. In addition, SH-based methods are difficult to manage with all-frequency lighting environments while HW ones produce blocking artifacts.

This paper explores the use of cosine lobe functions, which appear in the rendering equation and in BRDF models such as Phong and Lafortune, with potentially spatially varying, anisotropic, retro-reflection and/or off-specular reflectance data [6]. These functions benefit from the flexibility of spherical radial basis functions (SRBF) [7]: they are defined on the sphere, not necessarily uniformly distributed, spatially localized and allow inexpensive rotations; they only require choosing an axis and some shape parameters, which is more intuitive; the representation is compact (a small number of coefficients accurately represents the signal). However, they have not been generalized because the produced terms are more difficult to combine and integrate. This paper proposes a solution to overcome these major difficulties. Our main contribution concerns the use of cosine lobes for homogeneously solving the rendering equation. We also provide a framework for both their product and their integration. One interesting property is that the product of two lobes can be expressed as another one, with low error. This framework also comes with secondary contributions: (i) an acceleration system based on table fetching and linear interpolations; (ii) an application using cosine lobes for estimating shadows and direct lighting with various types of BRDF; and (iii) an interactive rendering system providing direct lighting, soft shadows and realistic materials in dynamic scenes. This paper does not address inter-reflections; however, existing systems [8,9] should be able to handle our framework.

The remainder of this paper is organized as follows: Section 2 reviews existing methods and highlights our contributions; Section 3 presents an overview of our approach; Section 4 details the use of cosine lobes for the rendering equation; Section 5 proposes representations of visibility, BRDF and incoming radiance using cosine lobe sums; Section 6 presents and discusses performances; finally, Section 7 concludes and mentions future work.

2. Related work

Let us consider the rendering equation:

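$$L_o(x, \vec{v_o}) = \int_{\Omega^+} f_r(x, \vec{v}, \vec{v_o})\, L_i(x, \vec{v})\, (\vec{n} \cdot \vec{v})\, dv \tag{1}$$

$$= \int_{\Omega^+} \tilde{f}_r(x, \vec{v}, \vec{v_o})\, L_i(x, \vec{v})\, dv \tag{2}$$

where $L_o$ denotes the outgoing radiance from a point $x$ in direction $\vec{v_o}$; $\Omega^+$ is the upper hemisphere with solid angles $dv$ around directions $\vec{v}$; $f_r$ is the BRDF and $\tilde{f}_r$ the BRDF multiplied by $\vec{n} \cdot \vec{v}$; $L_i$ is the incoming radiance at $x$ from direction $\vec{v}$; $\vec{n}$ is the surface normal at $x$.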

Let us consider a surrounding illumination $L_s$ independent from 3D objects. $L_s$ is defined for any direction $\vec{v}$ at any illuminated point $x$; when 3D objects (blockers from now on) in the scene are located between $x$ and the lighting environment, visibility has to be considered. Eq. (2) can thus be rewritten:

$$L_o(x, \vec{v_o}) = \int_{\Omega^+} \tilde{f}_r(x, \vec{v}, \vec{v_o})\, L_s(x, \vec{v})\, V(x, \vec{v})\, dv \tag{3}$$

$V$ being the visibility term.

We aim at dealing with dynamic scenes (with potentially varying BRDF, moving light sources and objects), so that the triple product $\tilde{f}_r \cdot L_s \cdot V$ must be solved at run-time. Several methods choose SH or HW basis functions for representing $\tilde{f}_r$, $L_s$ and $V$, such as [8–13].

Generally speaking, orthogonal basis functions such as SH and HW provide a useful representation of functions on the sphere. Once the projection is performed, the integration becomes a dot product or a triple product, allowing real-time or interactive rendering for dynamic scenes. However, all the coefficients have to be estimated even though some (potentially many) of them only contribute negligibly to lighting computations [10].

Shadow fields [11] tabulate SH or HW visibility representations in the space surrounding each blocker, requiring large tables and many triple product computations. SH exponentiation [12,8,9] replaces SH triple products by fast coefficient accumulation in log space. With bounding sphere sets, these methods are capable of handling more complex scenes and skinned objects. Nevertheless, SH usage practically limits the application to low frequency lighting environments. Kautz et al. [10] propose a hemispherical rasterization of blocker geometry for estimating a SH representation of the transfer function ($\tilde{f}_r \cdot V$ in this case). This method also uses SH with low memory requirements and no pre-computation based on the scene geometry. The above methods [8–12] are commonly denoted as blocker accumulation and imply visibility computations for each shaded point.

Kozlowski and Kautz [14] perceptually analyze possible simplifications for visibility. They propose a directional ambient occlusion method based on a piecewise constant approximation of $V$. SH are used to smooth results and, even with approximated shadows and highlights, images remain visually plausible. Green et al. [15,16] use a similar simplification for shadows, associated with isotropic Gaussian functions (SRBF) for representing BRDF and/or visibility; incoming radiance is represented by environment maps, pre-filtered using Gaussian functions, allowing real-time rendering. These methods have been designed for existing environment maps and do not allow interactive dynamic lighting.

SRBF are well known in the computer graphics community. For instance, generalized cosine lobes [17] and isotropic Gaussian kernels [18] are used to model BRDF, leading to a compact, expressive and physically plausible representation. Tsai and Shih [7] adapt the Abel-Poisson kernel to PRT. More recently, Wang et al. [19] propose to use spherical Gaussians for efficiently rendering scenes with static geometry and realistic BRDF. With this method, visibility is pre-processed independently of BRDF, using spherical signed distance functions (SSDF). Their approach uses spherical Gaussians exclusively for BRDF models, while our method proposes a unified representation for each term of the rendering equation, allowing dynamic environments. The compactness of SRBF leads to a more efficient compression scheme than the clustered principal component analysis proposed in [20] and provides faster rendering with more realistic results.

The attractiveness of fixed and orthonormal basis functions is practically limited by both the projection cost and the number of resulting coefficients for high-frequency environments or BRDF. For instance, with SH/HW, a small detail may require a lot of (sometimes very small) coefficients, while only one SRBF function might be necessary. We overcome this limitation with the use of an unfixed set of cosine lobes for every factor: visibility, lighting and reflectance. These cosine lobes do not form an orthonormal basis of functions; they rather correspond to an infinite basis set of functions. This paper proposes a methodology for solving the rendering equation with such a representation. Our application focuses on direct lighting only.

3. Overview

We propose to represent each term of the triple product $\tilde{f}_r \cdot L_s \cdot V$ with cosine lobes in the rendering equation (Eq. (3)). Integration can then be performed thanks to a key element: the product of two lobes can be approximated by another lobe, corresponding to the closure property (Section 4). Thus, the integration of a lobe product can be approximated by the integration of a single one. Furthermore, the main advantages of cosine lobes are: (i) the reduced number of coefficients needed for any SRBF and (ii) the possibility of pre-computing many costly operations. This paper describes all the possible pre-computations and discusses their associated precision losses.

As an example, we propose an interactive application for direct lighting (Section 5), where our approximation is applied to dynamically distributed lobes, avoiding useless computations of negligible coefficients (Section 6). Our implementation corresponds to a blocker accumulation system that benefits from the advantages of SRBF (with cosine lobes), while providing interactive rendering in dynamic scenes with all-frequency BRDF, area light sources and visibility management. Our proposed visibility computation relies on a simplified geometry using sphere sets (blockers and light sources). From a given point, a sphere defines a cone, used for constructing a lobe. This visibility computation is a practical illustration of the use of cosine lobes. It can be replaced by any other shadow or soft shadow system [21].

The three sums of cosine lobes given in the triple product $\tilde{f}_r \cdot L_s \cdot V$ are generated using:

• The BRDF (Lambert, Phong, Lafortune, etc.) representation for $\tilde{f}_r$ (see Section 5.1).
• A set of spherical light sources for $L_s$, with one lobe for each sphere (see Section 5.2).
• For $V$, a visibility mask produced using the simplified object geometry, again constructing spheres and thus cosine lobes (see Section 5.3).

Fig. 1. An example of cosine lobe product (3D plots with axes X, Y, Z). Cosine lobes in the left box are multiplied. The top-right box provides the actual resulting surface while the bottom-right box illustrates the (single) lobe corresponding to our approximation: the global shape and direction are preserved.

4. The rendering equation and cosine lobes

A cosine lobe function can be defined as

$$c(\vec{v}) = s \max(\vec{a} \cdot \vec{v}, 0)^{e} \tag{4}$$

$$= s\, C(\vec{v}) \tag{5}$$

where $s$ is a scaling factor, $\vec{a}$ is the axis and $e$ influences the lobe thickness. Let us recall that cosine lobes do not form an orthonormal basis of functions. Still, each term of the rendering equation can be approximated by one lobe, or more generally by a sum of lobes, that are manipulated later independently of their origin.
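To make the definition concrete, the following is a minimal sketch of Eq. (4) and of a lobe sum. The function names and the storage of a lobe as an `(s, axis, e)` tuple are illustrative assumptions, not part of the original system.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def cosine_lobe(s, axis, e, v):
    """Evaluate c(v) = s * max(axis . v, 0)**e (Eq. (4)) for unit vectors."""
    d = sum(a * b for a, b in zip(axis, v))
    return s * max(d, 0.0) ** e

def lobe_sum(lobes, v):
    """Evaluate a sum of lobes, each stored as a hypothetical (s, axis, e) tuple."""
    return sum(cosine_lobe(s, axis, e, v) for s, axis, e in lobes)
```

With this convention, a term of the rendering equation is simply a list of tuples, whatever its origin (BRDF, lighting or visibility).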

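Fixing $x$ and $\vec{v_o}$, and assuming that $\tilde{f}_r$, $L_s$ and $V$ are each approximated by a cosine lobe sum, leads to the following expressions:

$$\tilde{f}_r(\vec{v}) = \sum_i \left( s^f_i\, C^f_i(\vec{v}) \right) \tag{6}$$

$$L_s(\vec{v}) = \sum_j \left( s^l_j\, C^l_j(\vec{v}) \right) \tag{7}$$

$$V(\vec{v}) = \sum_k \left( s^v_k\, C^v_k(\vec{v}) \right) \tag{8}$$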

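Eq. (3) can thus be rewritten as follows:

$$L_o = \int_{\Omega^+} \sum_i \left( s^f_i\, C^f_i(\vec{v}) \right) \sum_j \left( s^l_j\, C^l_j(\vec{v}) \right) \sum_k \left( s^v_k\, C^v_k(\vec{v}) \right) dv \tag{9}$$

$$= \sum_i \sum_j \sum_k s^f_i\, s^l_j\, s^v_k \int_{\Omega^+} C^f_i(\vec{v})\, C^l_j(\vec{v})\, C^v_k(\vec{v})\, dv \tag{10}$$

Contrary to SH- or HW-based methods, the term $\int_{\Omega^+} C^f_i(\vec{v})\, C^l_j(\vec{v})\, C^v_k(\vec{v})\, dv$ cannot be pre-computed since cosine lobes are not fixed. Let us rewrite Eq. (9):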

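$$L_o = \int_{\Omega^+} \sum_{i,j} \left[ s^f_i\, C^f_i(\vec{v})\, s^l_j\, C^l_j(\vec{v}) \right] \sum_k s^v_k\, C^v_k(\vec{v})\, dv \tag{11}$$

We propose to approximate each term $\left[ s^f_i\, C^f_i(\vec{v})\, s^l_j\, C^l_j(\vec{v}) \right]$ by a single cosine lobe $s^a_m\, C^a_m(\vec{v})$:

$$L_o \approx \int_{\Omega^+} \sum_m s^a_m\, C^a_m(\vec{v}) \sum_k s^v_k\, C^v_k(\vec{v})\, dv \tag{12}$$

$$= \int_{\Omega^+} \sum_{m,k} \left[ s^a_m\, C^a_m(\vec{v})\, s^v_k\, C^v_k(\vec{v}) \right] dv \tag{13}$$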

Using again the same approximation, the rendering equation finally becomes a sum of cosine lobe integrations:

$$L_o \approx \int_{\Omega^+} \sum_p s_p\, C_p(\vec{v})\, dv = \sum_p s_p \int_{\Omega} C_p(\vec{v})\, dv \tag{14}$$

where $p = m \times k = i \times j \times k$.

Sections 4.1 and 4.2 describe, respectively, how the product of two lobes is managed, and how cosine lobe integration can be pre-computed. Section 4.3 explains how the potentially high number of resulting lobes can be efficiently processed.

4.1. Product

We propose to approximate the product of two cosine lobes $c_1$ and $c_2$ by another single cosine lobe $c_r$:

$$c_1(\vec{v})\, c_2(\vec{v}) = s_1 \max(\vec{a_1} \cdot \vec{v}, 0)^{e_1}\, s_2 \max(\vec{a_2} \cdot \vec{v}, 0)^{e_2} \tag{15}$$

$$\approx c_r(\vec{v}) \tag{16}$$

$$\approx s_1 s_2\, p_r \max(\vec{a_r} \cdot \vec{v}, 0)^{e_r} \tag{17}$$

We call $p_r$ the partial scaling factor. Parameters $p_r$, $\vec{a_r}$ and $e_r$ depend on $e_1$, $e_2$ and $\alpha$ (the angle between $\vec{a_1}$ and $\vec{a_2}$). $p_r$, $\vec{a_r}$ and $e_r$ are pre-computed for a wide range of $(e_1, e_2, \alpha)$ triplets, and stored in a lookup table. For maintaining directionality and shape for the approximated cosine lobe (see Fig. 1), the L2 distance between $c_1(\vec{v}) \cdot c_2(\vec{v})$ and $c_r(\vec{v})$ is minimized as follows:

$$\{p_r, \vec{a_r}, e_r\} = \arg\min_{\{p_r, \vec{a_r}, e_r\}} \int_{\Omega} \left( c_1(\vec{v})\, c_2(\vec{v}) - c_r(\vec{v}) \right)^2 dv \tag{18}$$

Since solving this optimization problem for all the parameters together is time-consuming and prone to numerical errors, we instead proceed in two steps: (i) finding the axis $\vec{a_r}$ and (ii) fitting the partial scaling factor $p_r$ and the exponent $e_r$.

First step:

$$\{\vec{a_r}\} = \arg\max_{\{\vec{a_r}\}} c_1(\vec{a_r})\, c_2(\vec{a_r}) \tag{19}$$

is done with the Nelder–Mead method [22], which is very fast in this case. The algorithm is initialized with $(\vec{a_1} + \vec{a_2}) / \|\vec{a_1} + \vec{a_2}\|$, since $\vec{a_r}$ is necessarily between $\vec{a_1}$ and $\vec{a_2}$.

Second step:

$$\{p_r, e_r\} = \arg\min_{\{p_r, e_r\}} \int_{\Omega} \left( c_1(\vec{v})\, c_2(\vec{v}) - c_r(\vec{v}) \right)^2 dv \tag{20}$$

is done with the Levenberg–Marquardt algorithm [22]. $p_r$ is initialized with 1, while $e_r$ is initialized with $e_1 + e_2$, which is the solution for $\vec{a_1} = \vec{a_2}$.
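The two-step fit can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: a golden-section search in the plane of the two axes stands in for Nelder–Mead, and a grid search over $e_r$ with a closed-form least-squares $p_r$ stands in for Levenberg–Marquardt. The sketch assumes $s_1 = s_2 = 1$, an angle $\alpha$ below $\pi/2$, moderate exponents (the midpoint quadrature grid is coarse), and hypothetical function names.

```python
import math

def golden_max(g, lo, hi, iters=60):
    """Golden-section search for the maximum of a unimodal g on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if g(c) > g(d):
            b = d          # the maximum lies in [a, d]
        else:
            a = c          # the maximum lies in [c, b]
    return 0.5 * (a + b)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_product(e1, e2, alpha, n_u=96, n_phi=64):
    """Fit p_r * max(a_r . v, 0)**e_r to the product of two unit-scale lobes.

    The axes are placed symmetrically about z in the xz-plane, alpha apart.
    Returns (a_r, p_r, e_r).
    """
    h = 0.5 * alpha
    a1 = (-math.sin(h), 0.0, math.cos(h))
    a2 = (math.sin(h), 0.0, math.cos(h))

    # Step 1 (Eq. (19)): the axis maximizes c1 * c2; search between a1 and a2.
    def log_peak(t):
        return e1 * math.log(math.cos(t + h)) + e2 * math.log(math.cos(h - t))
    t = golden_max(log_peak, -h, h)
    ar = (math.sin(t), 0.0, math.cos(t))

    # Midpoint quadrature directions, uniform in (cos(theta), phi).
    prod, dots = [], []
    for i in range(n_u):
        u = -1.0 + (i + 0.5) * 2.0 / n_u
        r = math.sqrt(1.0 - u * u)
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            v = (r * math.cos(phi), r * math.sin(phi), u)
            prod.append(max(dot(a1, v), 0.0) ** e1 * max(dot(a2, v), 0.0) ** e2)
            dots.append(max(dot(ar, v), 0.0))

    # Step 2 (Eq. (20)): scan e_r around e1 + e2; p_r is linear least squares.
    best = None
    for k in range(-16, 33):
        er = (e1 + e2) * 2.0 ** (k / 16.0)
        lobe = [d ** er for d in dots]
        pr = sum(p * l for p, l in zip(prod, lobe)) / sum(l * l for l in lobe)
        err = sum((p - pr * l) ** 2 for p, l in zip(prod, lobe))
        if best is None or err < best[0]:
            best = (err, pr, er)
    return ar, best[1], best[2]
```

For coaxial lobes (`alpha = 0`) the sketch recovers $e_r = e_1 + e_2$ and $p_r = 1$, which is the initialization mentioned above.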


In practice, $\alpha$ is linearly sampled in the range $[0, 2\pi]$ with 128 values. We observe that using more samples does not provide more precise results. For $e_1$ and $e_2$, our sampling strategy is based on the cosine lobe integration variation. The result of these products is always integrated. As shown in Fig. 2, the integration variation tends to zero logarithmically when $e_1$ and $e_2$ increase. Therefore, a more precise scheme is required for low exponent values. Consequently, we use an adaptive sampling in the range $[0, 10^5]$: 16 linearly spaced samples are chosen in each interval $[0,1]$, $[1,10]$, ..., $[10^4, 10^5]$.
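The adaptive exponent grid can be sketched as follows. Whether interval endpoints are shared between neighbouring intervals is not stated in the text; this sketch assumes they are, and removes the duplicates. The function name is hypothetical.

```python
def exponent_samples(per_interval=16, max_power=5):
    """Exponents sampled linearly inside [0,1], [1,10], ..., [10^4, 10^5]."""
    bounds = [0.0] + [10.0 ** k for k in range(max_power + 1)]
    samples = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        step = (hi - lo) / (per_interval - 1)
        for i in range(per_interval):
            # Use the exact upper bound for the last sample of each interval.
            x = hi if i == per_interval - 1 else lo + i * step
            if not samples or x > samples[-1]:   # skip shared endpoints
                samples.append(x)
    return samples
```

Under this assumption the grid holds 6 x 16 - 5 = 91 distinct exponents, dense near 0 and sparse near $10^5$.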

4.2. Integration

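Note that the cosine lobe integral only depends on the exponent (up to the scaling factor $s$). More precisely, given a cosine lobe $c(\vec{v}) = s \cdot \max(\vec{a} \cdot \vec{v}, 0)^{e}$, the integral:

$$\int_{\Omega} c(\vec{v})\, dv = \int_{\Omega} s \max(\vec{a} \cdot \vec{v}, 0)^{e}\, dv$$

is also equal to

$$s \int_{\Omega} \max(\vec{X} \cdot \vec{v}, 0)^{e}\, dv$$

where $c$ is an arbitrary lobe and $\vec{X}$ is any fixed axis.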
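As a sanity check of this axis independence, note that when the domain is the full sphere the integral has the standard closed form $2\pi/(e+1)$, obtained with the substitution $u = \cos\theta$. The sketch below compares it with a midpoint quadrature; it is only a reference under that full-sphere assumption, not the quadrature used for the tables, and the function names are illustrative.

```python
import math

def lobe_integral_closed_form(e):
    """Integral of max(X . v, 0)**e over the full sphere: 2*pi / (e + 1)."""
    return 2.0 * math.pi / (e + 1.0)

def lobe_integral_numeric(e, n=20000):
    """Midpoint rule in u = cos(theta); the lobe is u**e on [0, 1], 0 elsewhere."""
    h = 1.0 / n
    total = sum(((i + 0.5) * h) ** e for i in range(n))
    return 2.0 * math.pi * total * h
```

The value decreases monotonically with the exponent, which is the behaviour exploited for thresholding in Section 4.3.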

Fig. 2. Integral value (ordinate) according to the cosine lobe exponent (abscissa, logarithmic scale), with a scaling factor $s$ set to 1. The low variation for high exponents allows a sparser sampling than with lower exponents.

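Fig. 3. Lobe product approximation error (Section 4.1): $e_1$ and $e_2$ correspond to the lobe exponents and $\alpha$ is the angle between the lobe axes. The error is defined as the absolute difference between the integration of $c_1 \cdot c_2$ and $c_r$. The error value is represented by a color; the white regions correspond to a low error; the black regions to a high error. An important error ($\approx 50\%$) is noticeable for $e_1$ and $e_2$ between 0 and 1 approximately and an angle $\alpha$ around $\pi/2$. Left: full representation for $e_1$ and $e_2$ in the range $[0, 10^5]$; right: zoom on the region corresponding to the highest error.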

Since this integration is time-consuming, we choose to pre-compute a set of values for $e$ in the range $[0, 10^5]$ using the common adaptive Gauss quadrature. For limiting memory requirements, our table sampling strategy also relies on the cosine lobe integration variation (see Fig. 2): the first samples are densely distributed while the last ones can be sparser; 128 linearly spaced samples are estimated in each interval $[0,1]$, $[1,10]$, ..., $[10^4, 10^5]$. At run-time, the rendering process uses linear interpolation; it is fast and the error remains low with this sampling strategy (cf. Section 6).
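A minimal sketch of the run-time lookup follows. The table layout (two parallel sorted lists) and the function names are assumptions made for illustration; any integrator can supply the tabulated values.

```python
import bisect

def make_table(exponents, integrate):
    """Tabulate integrate(e) on a sorted list of exponents."""
    xs = list(exponents)
    return xs, [integrate(e) for e in xs]

def lookup(table, e):
    """Linearly interpolate the tabulated integral at exponent e (clamped)."""
    xs, ys = table
    if e <= xs[0]:
        return ys[0]
    if e >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, e)          # xs[i-1] <= e < xs[i]
    t = (e - xs[i - 1]) / (xs[i] - xs[i - 1])
    return (1.0 - t) * ys[i - 1] + t * ys[i]
```

For instance, a table covering $[1, 10]$ with 128 linearly spaced samples can be built with `make_table([1.0 + 9.0 * i / 127 for i in range(128)], f)` for any integrand `f`.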

As an estimation of our approximation precision, compared to the actual lobe product integration, we have estimated the integration difference value between (i) the approximated lobe product in the table and (ii) the corresponding lobe product. Integration is processed using an adaptive Simpson quadrature with a tolerance set to $10^{-7}$; the mean error is equal to $7.1379 \times 10^{-4}$, with a variance equal to $1.2091 \times 10^{-8}$; the median error value is 0. Note that the maximum error (Fig. 3) occurs when both exponents $e_1$ and $e_2$ are very low, which is very rare in practice, even with the Lafortune model. In addition, as explained in Section 4.3, cosine lobe exponents produced during rendering always increase. For these two reasons, cosine lobes with high errors have a very low probability of being produced.

4.3. Rendering

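The shading of a 3D point seen from a given viewpoint is performed using only cosine lobe sums for BRDF, visibility and lighting. Eq. (3) can be rewritten:

$$L_o = \int_{\Omega^+} \sum_{i}^{I} \left( c_i(\vec{v}) \right) \sum_{j}^{J} \left( c_j(\vec{v}) \right) \sum_{k}^{K} \left( c_k(\vec{v}) \right) dv \tag{21}$$

Note that the cosine lobe $\vec{n} \cdot \vec{v}$ is included in $c_i(\vec{v})$. Using approximation (15)–(17), Eq. (21) becomes:

$$L_o = \sum_{p}^{I \times J \times K} \int_{\Omega^+} c_p(\vec{v})\, dv \tag{22}$$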

Many cosine lobe products only negligibly contribute to the outgoing radiance $L_o$, for two main reasons: (i) when a product $c_1 \cdot c_2$ is approximated as a new cosine lobe $c_r$ (cf. Eq. (17)), $e_r$ is greater than both $e_1$ and $e_2$ (cf. Fig. 6) and (ii) integrals of cosine lobes decrease when exponents increase. Thus, a negligible cosine lobe multiplied by another one is in turn approximated by a negligible lobe (a similar property exists with HW [23]).

When the product of two cosine lobe sums is expanded (cf. Eqs. (11) and (13)), we propose to check the integral of each approximated cosine lobe product. If the integral is smaller than a given threshold $t_i$, the new cosine lobe can be neglected and thus removed from the expanded sum (cf. Sections 5 and 6). Notice that a similar property also exists with spherical Gaussians [19].

Fig. 4. Percentage of lobe product integrals (ordinate) larger than or equal to a given threshold value (abscissa, logarithmic scale). The blue dashed curve represents integrated products in a low frequency environment (exponents $\in [0, 10]$); the green dotted curve corresponds to a higher frequency environment (exponents $\in [0, 10^2]$) and the last red curve to exponents $\in [0, 10^3]$. In practice, exponents often grow up to $15 \times 10^3$. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 5. This scene is rendered using several values of $t_i$. Visibility is represented with a high number of thin cosine lobes (small contributions) in penumbra, while umbra require a small number of thick lobes. Consequently, when $t_i$ is too high, shadows become aliased.

Fig. 6. Representation of $e_r$ according to $e_1$ and $e_2$ for the approximation of the product $c_1 \cdot c_2$ for a fixed $\alpha$. Note that $e_r$ is greater than or equal to $e_1$ and $e_2$ (the corresponding surface is not a plane).

Fig. 4 presents a statistical distribution of lobe product integrals (10 million randomly generated products). This shows that even a small threshold avoids many lobe computations.

In our application, a single threshold $t_i$ can be set manually by the user for all shaded points (cf. Fig. 5); other strategies with varying thresholds depending on time constraints or the scene configuration could also be investigated.

In addition, given a product of several lobes, when the first one can be neglected, all the remaining computations can obviously be avoided too. Consequently, computation order affects the number of products actually performed. Section 5.4 describes the ordering strategy we propose.
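The thresholding and early exit can be sketched on a deliberately simplified case. Here all lobes are assumed to share one axis, so that the product is exact (scales multiply, exponents add) and replaces the table lookup of Section 4.1; the integral uses the full-sphere closed form $2\pi s/(e+1)$; and the absolute value in the test is an assumption made to cope with negative visibility weights. All names are hypothetical.

```python
import math

def integral(lobe):
    """Full-sphere integral of a lobe stored as (s, e)."""
    s, e = lobe
    return 2.0 * math.pi * s / (e + 1.0)

def coaxial_product(l1, l2):
    """Exact product of two lobes sharing the same axis (stand-in for the table)."""
    return (l1[0] * l2[0], l1[1] + l2[1])

def shade(brdf, light, visibility, threshold):
    """Sum the integrals of all triple products, pruning negligible pairs.

    Returns (radiance, number_of_skipped_products).
    """
    total, skipped = 0.0, 0
    for f in brdf:
        for l in light:
            fl = coaxial_product(f, l)
            if abs(integral(fl)) < threshold:
                skipped += len(visibility)   # every product with fl is avoided
                continue
            for v in visibility:
                total += integral(coaxial_product(fl, v))
    return total, skipped
```

Multiplying reflectance and lighting first, as done here, mirrors the ordering discussed in Section 5.4.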

5. Cosine lobes for direct lighting

This section describes how our representation can be efficiently used with direct lighting. We discuss cosine lobe construction for each term of the integral. Once lobes are defined, computations are performed using the pre-computed products and integration described above.

5.1. Review of cosine lobe-based BRDF models

The Phong, Lafortune and Neumann et al. models, for instance [24,17,25], rely on cosine lobes for compactly representing BRDF. This section expresses those models according to our formalism.

Phong model:

$$\tilde{f}_r(\vec{v}) = k_d\, (\vec{n} \cdot \vec{v}) + k_s \max(\vec{r} \cdot \vec{v}, 0)^{e_s} \tag{23}$$

$$= c_d(\vec{v}) + c_s(\vec{v}) \tag{24}$$

where

$$c_d(\vec{v}) = k_d \max(\vec{n} \cdot \vec{v}, 0)^{1} \tag{25}$$

and

$$c_s(\vec{v}) = k_s \max(\vec{r} \cdot \vec{v}, 0)^{e_s} \tag{26}$$

$k_d$ corresponds to the usual Lambert model coefficient and $\vec{n}$ is the surface normal. $e_s$ is the specular cosine lobe exponent, and $k_s$ its scale factor. $\vec{r}$ is the reflection of $\vec{v_o}$ by $\vec{n}$.

Note that the original Phong model is not physically plausible, but the modified Phong model [26] can be expressed with our representation as well.
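A short sketch of Eqs. (25) and (26): the Phong model yields two lobes, one around the normal and one around the mirror direction. The `(s, axis, e)` tuple layout and the function names are illustrative assumptions.

```python
def reflect(vo, n):
    """Mirror the unit vector vo about the unit normal n: r = 2 (n . vo) n - vo."""
    d = sum(a * b for a, b in zip(vo, n))
    return tuple(2.0 * d * ni - vi for ni, vi in zip(n, vo))

def phong_lobes(kd, ks, es, n, vo):
    """Return the diffuse and specular lobes of Eqs. (25)-(26) as (s, axis, e)."""
    return [(kd, n, 1.0), (ks, reflect(vo, n), es)]
```

Only the specular axis depends on the view direction, so the diffuse lobe can be reused for every viewpoint.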

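Lafortune model:

$$f_r(\vec{v}) = \sum_i \rho_i \max(\vec{v}^{T} M_i\, \vec{v_o}, 0)^{e_i} \tag{27}$$

$$= \sum_i c_i(\vec{v}) \tag{28}$$

where, according to [17], and using Helmholtz reciprocity:

$$c_i(\vec{v}) = k_i\, \|M_i \vec{v_o}\|^{e_i} \max\left( \frac{M_i \vec{v_o}}{\|M_i \vec{v_o}\|} \cdot \vec{v},\, 0 \right)^{e_i} \tag{29}$$

$k_i$ is the scale factor of $c_i$, the matrix $M_i$ represents the lobe axis in a local frame depending on material anisotropy and $e_i$ is the exponent.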
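Eq. (29) amounts to normalizing $M_i \vec{v_o}$ and folding its length into the scale. A minimal sketch, with an assumed row-major 3x3 matrix and a hypothetical function name:

```python
import math

def lafortune_lobe(k, M, e, vo):
    """Convert one Lafortune term into a cosine lobe (s, axis, e), after Eq. (29)."""
    w = tuple(sum(M[r][c] * vo[c] for c in range(3)) for r in range(3))
    norm = math.sqrt(sum(x * x for x in w))
    axis = tuple(x / norm for x in w)
    return (k * norm ** e, axis, e)
```

Evaluating the returned lobe at a direction $\vec{v}$ gives the same value as $k \max(\vec{v}^T M \vec{v_o}, 0)^e$, since the norm factored out of the axis is restored by the scale.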


5.2. Incoming radiance

Given a spherical light source $(r, p, v)$ defined by a radius $r$, a position $p$ and a radiance value $v$, the incident radiance on the hemisphere centered around a given point $x$ can be approximated by a single cosine lobe:

$$c_{light}(\vec{v}, x) = v\, w_{light} \max\left( \frac{\vec{xp}}{\|\vec{xp}\|} \cdot \vec{v},\, 0 \right)^{e_{light}} \tag{30}$$

Parameters $w_{light}$ and $e_{light}$ are estimated by solving the following optimization problem:

$$\{w_{light}, e_{light}\} = \arg\min_{\{w_{light}, e_{light}\}} \int_{\Omega} \left( L_s(\vec{v}, x) - c_{light}(\vec{v}, x) \right)^2 dv \tag{31}$$

where $L_s(\vec{v}, x)$ depends on the distance $\|\vec{xp}\|$ and $r$.

In practice, $w_{light}$ and $e_{light}$ are pre-computed and stored in tables containing $10^3$ values, linearly interpolated at run-time.

When using several light sources, the corresponding cosine lobes may overlap. However, our visibility processing (described in Section 5.3) handles this case.

5.3. Visibility

Visibility is a key term in the rendering equation, for producing realistic shadows and computing light inter-reflections. It is often estimated using shadow maps, or soft shadow maps [27], according to the type of light sources. We think that usual visibility methods can be adapted to a cosine lobe representation. For instance, shadow maps can be easily integrated in our application since the visibility term $V$ and the incoming radiance $L_i$ in the rendering equation can be approximated by lobes: $L_i(x, \vec{a}) = l_i \cos(\cdot)^{e_i}$ and $V(x, \vec{a}) = \delta \cos(\cdot)^{e_v}$, where $\delta$ is the Kronecker delta function and $(\cdot)$ corresponds to the solid angle subtended by the shadow map pixels.

This section describes another solution for estimating visibility, rather dedicated to low-frequency environments. During the off-line process, objects are simplified by sphere sets using the methods proposed by Wang et al. [28] or by Bradshaw and O'Sullivan [29]; these have already been successfully used for visibility computations by many authors [12,8,9]. The computed sphere set is used to simplify object geometry and remains deformable with articulated skeleton systems.

We aim at defining occlusion cosine lobes using spherical caps (cones). Visibility becomes:

$$V(\vec{v}) = 1 - f_{occ}(\vec{v}) \tag{32}$$

where $f_{occ}$ is a function representing a spherical cap (cf. Fig. 7):

$$f_{occ}(\vec{v}) = \begin{cases} 1 & \text{if } \vec{v} \cdot \vec{a_{cap}} \geq \cos(\alpha_{cap}) \\ 0 & \text{otherwise} \end{cases} \tag{33}$$

Fig. 7. $\alpha_{cap}$ represents the spherical cap angle corresponding to sphere $(p, r)$.

As for light sources, $f_{occ}$ is approximated by a cosine lobe $c_{cap}$:

$$c_{cap}(\vec{v}) = \max(\vec{a} \cdot \vec{v}, 0)^{e_{cap}} \tag{34}$$

using

$$\{e_{cap}\} = \arg\min_{\{e_{cap}\}} \int_{\Omega} \left( f_{occ}(\vec{v}) - c_{cap}(\vec{v}) \right)^2 dv \tag{35}$$
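Because both functions in Eq. (35) are symmetric around the cap axis, the error reduces to a one-dimensional integral in $u = \cos\theta$ that can be written in closed form. The sketch below minimizes it with a plain geometric scan over the exponent; the scan, the function names and the restriction to cap angles below $\pi/2$ are assumptions of this illustration, not the fitting procedure of the original system.

```python
import math

def cap_fit_error(e, alpha_cap):
    """Squared L2 distance between the cap indicator and the lobe u**e.

    With c = cos(alpha_cap):
    error / (2 pi) = (1 - c) - 2 (1 - c**(e+1)) / (e + 1) + 1 / (2 e + 1).
    """
    c = math.cos(alpha_cap)
    return 2.0 * math.pi * ((1.0 - c)
                            - 2.0 * (1.0 - c ** (e + 1.0)) / (e + 1.0)
                            + 1.0 / (2.0 * e + 1.0))

def fit_cap_exponent(alpha_cap, steps_per_decade=200):
    """Scan exponents from 1e-3 to 1e5 and keep the one with the lowest error."""
    best_e, best_err = None, float("inf")
    for k in range(-3 * steps_per_decade, 5 * steps_per_decade + 1):
        e = 10.0 ** (k / steps_per_decade)
        err = cap_fit_error(e, alpha_cap)
        if err < best_err:
            best_e, best_err = e, err
    return best_e
```

Narrower caps lead to larger exponents, and the residual error stays clearly above zero, which illustrates why a single lobe is only a rough stand-in for a cap.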

With $N$ spheres, visibility becomes

$$V(\vec{v}) = \prod_{j=1}^{N} \left( 1 - f^{j}_{occ}(\vec{v}) \right) \tag{36}$$

$$= 1 - \sum_{k=1}^{2^N - 1} g_k(\vec{v}) \tag{37}$$

where $g_k$ is a term corresponding to a product of spherical caps.

Unfortunately, this expression leads to the computation of $2^N$ terms. Replacing each spherical cap by only one lobe still requires $2^N$ lobes and introduces imprecisions that accumulate during the computation of the above product. The main reason is that one cosine lobe does not accurately approximate a spherical cap, so that overlapping caps (which are correctly handled for visibility in Eq. (37)) produce errors with this representation.
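As a worked instance of Eqs. (36) and (37), consider $N = 2$ blockers. The expansion below is a sketch in which the signs of the inclusion-exclusion terms are absorbed into the $g_k$; this sign convention is an assumption, since the text does not state it.

```latex
\begin{align*}
V(\vec{v}) &= \left(1 - f^{1}_{occ}(\vec{v})\right)\left(1 - f^{2}_{occ}(\vec{v})\right) \\
           &= 1 - f^{1}_{occ}(\vec{v}) - f^{2}_{occ}(\vec{v})
                + f^{1}_{occ}(\vec{v})\, f^{2}_{occ}(\vec{v}) \\
           &= 1 - \sum_{k=1}^{3} g_k(\vec{v}),
\qquad g_1 = f^{1}_{occ},\quad g_2 = f^{2}_{occ},\quad g_3 = -f^{1}_{occ}\, f^{2}_{occ}.
\end{align*}
```

Counting the constant, this gives $2^2 = 4$ terms, and the count doubles with every additional sphere.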

To tackle these problems, we propose to hierarchically build a visibility mask that controls lobe generation (including overlapping problems) and limits the number of lobes. A visibility mask is built for each (spherical) light source, for precisely identifying the incoming light directions and the effective blockers. This process is repeated for each point at each frame.

Given a point $x$, for each light source represented by a sphere $(r, p)$, a unit square $S_{unit}$ surrounding the corresponding cone (cf. Fig. 8(a)) is placed at a distance $dist$, deduced from the cone angle ($\alpha_{cap}$): $dist = 1 / (2 \times \tan \alpha_{cap})$.
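The placement of the unit square can be checked numerically: at distance $dist$, a square of half-width 0.5 subtends exactly the cone half-angle. In this sketch the cone is assumed to be tangent to the light sphere, so that $\alpha_{cap} = \arcsin(r / d)$ for a sphere of radius $r$ at distance $d$; that relation and the function names are assumptions, not taken from the text.

```python
import math

def cap_angle(radius, distance):
    """Half-angle of the cone tangent to a sphere seen from outside (assumed)."""
    return math.asin(radius / distance)

def unit_square_distance(alpha_cap):
    """Distance at which a unit square spans the cone: 1 / (2 tan(alpha_cap))."""
    return 1.0 / (2.0 * math.tan(alpha_cap))
```

A large or nearby light gives a wide cone and a small $dist$, which is why the lobe overlap depends on $dist$ even for an identical visibility mask.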

Blockers are clipped and remaining spheres are splatted onto $S_{unit}$. Note that $S_{unit}$ may be placed behind the light source.

Splatted spheres are used to build a quad-tree (cf. Fig. 9). In practice, quad-tree depth is limited to 4 or 5. We have chosen an aggressive approach where partially covered leaves are ignored so as to reduce the number of cosine lobes. A cosine lobe is built for each shadow leaf using a bounding sphere (cf. Fig. 10). Since nearby cosine lobes mutually overlap, the error must be corrected; a weight is associated with each lobe:

$$V(\vec{v}) = c_{unit}(\vec{v}) - \sum_{i}^{N-1} w_i\, c_i(\vec{v}) \tag{38}$$

where

$$c_{unit}(\vec{v}) = 1 \cdot \max(\vec{n} \cdot \vec{v}, 0)^{0} = 1 \tag{39}$$

$$c_{unit}(\vec{v})\, c_a(\vec{v}) = c_a(\vec{v}) \tag{40}$$

and $w_i$ depends on the distance $dist$ and on the leaf depth. Since $dist$ is related to the light source radius, the overlap between cosine lobes varies (see Fig. 11). $w_i$ values are fixed experimentally for various values of $dist \in \{0.25, 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$ and for each depth. This process is done only once, with a simple scene composed of one plane, one object and one light source, so as to obtain smooth shadows. First, weight values are initialized to 0; the first level weight is then chosen by the user to obtain a perfect dark in the umbra area. Weights associated with each successive level are then fixed to avoid shadow sharpness with previous ones. The resulting weights can be used with any scene. At run-time, weights are linearly interpolated according to $dist$.

Fig. 8. Occluding spheres splatting: (a) a unit square is placed according to the spherical light source; (b) useless blockers are clipped; and (c) remaining spheres are splatted onto the unit square.

Fig. 9. Quad-tree construction example. Three spheres are splatted and a maximal depth of two is used. Red leaves are subdivided, yellow leaves are not currently considered, and green leaves corresponding to shadows are frozen. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 10. Once the visibility mask is built (left), one bounding sphere is constructed per shadow leaf and the associated cosine lobes are produced (right).

Fig. 11. The overlap between cosine lobes varies according to the distance $dist$, even with the same visibility mask.

Fig. 12. Direct lighting: (a) reference ray tracing image and (b) image computed using our application. Maximal depth of the visibility mask is set to 5.

5.4. Ordering computations

This section describes our strategy for ordering lobe products in Eq. (21), in order to reduce computing time. The cosine lobe $c_r$ resulting from the product of two lobes $c_1$ and $c_2$ depends on $e_1$, $e_2$ and $\alpha$ (cf. Section 4.1). Let us make some observations:

• Firstly, when $\alpha$ increases, the integral of $c_r$ decreases.
• Secondly, with visibility masks, all the products between visibility lobes and incoming radiance lobes are actually useful (they cannot be neglected).
• Thirdly, with direct lighting the number of cosine lobes required is lower for incoming radiance than for visibility.

Thus, multiplying reflectance and lighting first (before visibility) produces fast results. This strategy produces results remaining close to Monte Carlo integration (for direct lighting only), with interactive rendering (Fig. 12), though it does not ensure the best precision.

6. Results and discussion

This section presents some results obtained with our applica-tion, implemented in C+ +, using CUDA 2.0. Results have beenproduced with an Intel Core i7 920 CPU and an NVidia GTX 280.Our GPU implementation requires to take care of memorycoalescing, branching in CUDA kernels and cache occupancy.Similarly to Zhou et al. [30], we use structure of arrays instead ofarray of structures for optimal memory access and build visibilitymasks in cache (32 simultaneously) using a small stack (fivenodes in our case). The very small cache available on the latestGPU limits the overall performances of our application. We trust abigger cache would lead to better performances.

Constructing our three pre-computed tables (product, integration and spherical caps) requires approximately 4 hours. Recall that these tables do not depend on the scene characteristics: they are computed once and for all. The texture memory required for their storage is less than 10 MB. For the sake of clarity, the following results only correspond to the cosine lobe representation, without any optimization: for each frame, the whole image is completely computed, and neither caching nor spatial or temporal coherency is exploited. Our application consists in shading a selected set of 3D points, either triangle vertices or points estimated per pixel, employing a G-buffer. Shading is performed entirely on the GPU using CUDA and OpenGL.

Due to the flexibility and adaptivity of the cosine lobe representation (number of lobes, bandwidth, free distribution, etc.), computing efforts can be concentrated on regions where lighting needs to be detailed. For instance, penumbra areas require a higher number of lobes for representing visibility than umbra areas or fully lit regions, where only a few computations can produce precise results. Consequently, the performance of our application varies according to the chosen materials (Fig. 13), the number of shaded points, the maximum depth of visibility masks, the chosen threshold of cosine lobe products (Figs. 15 and 5) and the relative position between occluders and lights. Frame rates are given in the figure captions.

Fig. 13. Spatially varying BRDFs are possible using the Lafortune model (two lobes). This animated scene with soft shadows (corresponding to one of the provided videos) is composed of 89 284 shaded points, 67 spherical occluders and one spherical light source (3.01 frames per second).

Fig. 14. Comparison between Monte Carlo integration (top left) and our approach (top right) for the cosine lobe sum integration. Lobe sums are generated using the methods described in Section 5. The bottom images show the difference: (bottom left) positive difference values and (bottom right) negative difference values.

Once cosine lobe sums are defined, the approach given in Section 4 approximates lobe products and uses tables for reducing computing time. Fig. 14 shows that these approximations lead to a small difference compared to ray tracing (with direct lighting and a surface light source). For the Monte Carlo reference image, the maximum radiance value is 0.701961 and the mean radiance value is 0.127906; with our approach, the maximum difference is 0.054902 while the mean difference is 0.003884. Note that our method introduces a bias: regions located in shadows, where more cosine lobe products are computed, exhibit more negative difference values.
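Difference statistics of the kind reported above can be computed as in the following sketch. It assumes NumPy, radiance images stored as floating-point arrays of identical shape, and differences taken as approximation minus reference; whether the reported maximum and mean use absolute values is an assumption of this sketch, and the function name is hypothetical.

```python
import numpy as np


def difference_stats(reference, approximation):
    """Compare an approximated radiance image with a reference image.

    Returns the maximum and mean absolute differences, plus the positive
    and negative parts of the signed difference as separate images.
    """
    diff = np.asarray(approximation, dtype=float) - np.asarray(reference, dtype=float)
    abs_diff = np.abs(diff)
    return {
        "max_difference": float(abs_diff.max()),
        "mean_difference": float(abs_diff.mean()),
        "positive_part": np.clip(diff, 0.0, None),   # approximation too bright
        "negative_part": np.clip(-diff, 0.0, None),  # approximation too dark
    }
```

Splitting the signed difference into two images makes a systematic bias visible, such as an approximation that is mostly darker than the reference in shadowed regions.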

We also compare the images produced by our application with ray tracing for direct lighting and soft shadows. The images produced in the same conditions are shown in Fig. 12. Highlights are properly estimated, as well as incident radiance, at interactive frame rates. The average HDR pixel-to-pixel difference between path tracing and our method is equal to 12.57%, which is also close to our cosine lobe integration approximation error: 10.02%.

Table 1 presents the computing time ratio corresponding to each rendering part, for different types of scenes, with hard and soft shadows, complex or simple materials and various numbers of occluders. The bottleneck of our application is visibility mask construction, as for any other blocker accumulation method.
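To give a sense of what this bottleneck implies, the share of visibility mask construction bounds the speedup obtainable by accelerating that step alone. The sketch below applies this standard bound to column (c) of Table 1; it is an illustration added here, not a measurement, and the function name is hypothetical.

```python
def max_speedup_if_removed(fraction_percent):
    """Upper bound on the overall speedup if one step became free.

    fraction_percent is the share of total time spent in that step.
    """
    remaining = 1.0 - fraction_percent / 100.0
    return 1.0 / remaining


# Column (c) of Table 1: visibility mask construction, in percent.
visibility_share = {"13": 84.8, "22(a)": 72.2, "22(b)": 60.1, "22(c)": 52.1}

bounds = {fig: max_speedup_if_removed(p) for fig, p in visibility_share.items()}
```

The bound is largest for the scene of Fig. 13, where visibility mask construction takes the largest share of the shading loop.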

Fig. 16 shows a more complex scene (also illustrated by one of the videos associated with this paper), made up of 664 011 triangles, and approximated by 3578 spheres for producing cosine lobes for visibility and shadows. Rendering time is higher due to the high number of shaded points.

Fig. 15. Performance gains obtained according to the threshold $t_i$ (Section 4.3). The scene is composed of 65 536 shaded points, one spherical light source and 64 spherical occluders (the bunny). The threshold is chosen so that the difference between the resulting images (left and right columns) remains indistinguishable. The false-color images illustrate the number of cosine lobe products computed (the scale is given at the bottom). The specular highlight properly interacts with the shadows; our method naturally avoids many computations thanks to the threshold. We can see a reduction of about 50% of the number of product computations (corresponding to 2 frames per second in practice, because of visibility mask construction cost).

Table 1. CPU+GPU time (percent) for the shading loop: (a) generation of cosine lobe sums for reflectance; (b) lighting; (c) visibility mask construction; (d) visibility mask lobes; and (e) rendering equation integration computation.

Figure   (a)    (b)    (c)    (d)    (e)
13       2.1    1.5    84.8   4.5    7.1
22(a)    3.6    4.7    72.2   9.1    10.4
22(b)    2.5    3.5    60.1   14.4   19.5
22(c)    2.7    2.5    52.1   18.2   24.5

Fig. 16. An illustration of a more complex scene, with 3578 spheres, 642 000 shaded points, 664 011 triangles, 2 spherical light sources, and an octree depth equal to 5. Rendering time: 17.68 s.

Shadows are considered by many authors as a key issue for realistic rendering; their processing is often handled specifically. Our method naturally handles various types of shadows, once the corresponding cosine lobes are defined. Figs. 12 and 22 show that our method produces both hard and soft shadows. Fig. 17 illustrates computing time variations according to the (spherical) light source radius. Computing time depends on the number of occluding spheres as well as the maximum quadtree depth in our visibility computation proposal. Since the number of spheres is finite, the worst case appears when all spheres intersect the solid angle corresponding to the light source, which requires filling the whole octree.

[Plot: rendering time (in ms, axis from 450 to 1200) versus surface of light source (in m², axis ticks 0.5, 45.8, 91.1, 136.4, 181.7 and 227).]

Fig. 17. Computing time according to light source size (area), with the example of a moving cube and randomly placed light sources. This function is close to linear due to our visibility process that requires testing all the spheres.

Fig. 18 presents an example of self-shadowing with two different light source sizes. Note that objects with straight edges require more spheres for precisely representing geometric details, which consequently increases computing time. Self-shadowing is also clearly visible in Fig. 14.

Fig. 18. Self-shadowing when the light source diameter varies. Four hundred and eighty-four spheres approximate the Ajax model, composed of 610 102 triangles. There are 85 558 shaded points and the maximum depth for visibility is 5: (left) 4.5 fps and (right) 2.3 fps.

Computing time is linear in the number of (spherical) light sources. Fig. 20 illustrates performance for a cube lit by several light sources (Fig. 21). Note that in many applications, soft shadows do not need to be precisely computed for visually convincing effects [31,32]. Area light sources can be replaced by one single spherical light source. Fig. 19 presents images produced with a rectangular area light source approximated by spheres, and three different object positions (fixed viewpoint). Noise appears because of the distribution of lobes on the light source. With one single lobe for the whole light source, shadows are smoother.

The application developed in this paper provides some solutions for illustrating cosine lobe products and integration. The main computing time for each frame is related to visibility, since many operations are performed for each shaded point: shaft culling for selecting occluding spheres, and lobe generation using a quadtree construction. Further investigations could greatly improve performance. For instance, we do not use visibility coherence between close shaded points, which is nowadays a classical technique. We also think that using other approaches should help to reach real-time rendering. For instance, in [33], Annen et al. use shadow maps with convolution for soft shadows; Schwarz and Stamminger [34] also use shadow maps for approximating shadow visibility information. Such techniques could be used as another way to generate visibility lobes.
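The selection of occluding spheres for one shaded point and one spherical light source can be sketched as a conservative cone test. The details of the shaft culling used in our application are not given in this section, so the code below is a generic illustration under stated assumptions: points are 3-tuples, every function name is hypothetical, and a sphere is kept whenever its angular extent overlaps that of the light as seen from the shaded point.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _norm(v):
    return math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])


def angular_radius(distance, radius):
    """Half-angle under which a sphere is seen from a point outside it."""
    return math.asin(radius / distance)


def may_occlude(point, light_center, light_radius, occ_center, occ_radius):
    """Conservative test: can the occluder hide part of the light from point?"""
    to_light = _sub(light_center, point)
    to_occ = _sub(occ_center, point)
    d_light, d_occ = _norm(to_light), _norm(to_occ)

    # An occluder lying entirely beyond the light cannot block it.
    if d_occ - occ_radius >= d_light + light_radius:
        return False
    # Point inside either sphere: keep the occluder to stay conservative.
    if d_occ <= occ_radius or d_light <= light_radius:
        return True

    cos_angle = sum(l * o for l, o in zip(to_light, to_occ)) / (d_light * d_occ)
    angle = math.acos(max(-1.0, min(1.0, cos_angle)))
    return angle <= angular_radius(d_light, light_radius) + angular_radius(d_occ, occ_radius)
```

Since every sphere is tested against every light source, the cost of such a selection grows linearly with both counts, which is consistent with the near-linear timings reported for Figs. 17 and 20.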

Fig. 19. Square area light source represented by a set of nine spheres; the object is represented by 485 spheres and 301 713 triangles; visibility quadtree maximum depth is 5. The viewpoint is fixed and the object is raised: (a) height #1; (b) height #2; (c) height #3; (d) height #1 with one spherical light source instead of the square, with the same visible area, 1.6 fps, 277 K shaded points; (e) height #2 with one spherical light source, 0.2 fps, 277 K shaded points; and (f) height #3 with one spherical light source, 0.2 fps, 277 K shaded points.

[Plot: average rendering time (in ms, axis from 0 to 800) versus number of light sources (axis from 0 to 50).]

Fig. 20. Computation time according to the number of light sources. With our method, one visibility mask has to be constructed for each area light source.

Fig. 21. Images of a cube lit by several light sources. The cube is approximated by 64 spheres. Left: four spherical light sources; middle: six spherical light sources; right: 11 spherical light sources.


7. Conclusion

This paper explores the use of cosine lobe functions for solving the rendering equation. Contrary to fixed basis functions such as wavelets or spherical harmonics, SRBFs allow the most appropriate function to be chosen from an infinite set, reducing the number of parameters needed to describe a given signal. Using cosine lobes for rendering requires correct product and integration schemes. We propose solutions for these two problems. The product of two lobes can be approximated by a single one with only a small error in practical situations, and many computations can be pre-computed and stored in lookup tables.

Fig. 22. Shadow smoothing according to the distance between the plane and the Bunny (first row) and according to the distance between the light source and the plane (second row). Frame rates decrease as the penumbra area increases. Penumbra implies complex visibility masks and more cosine lobes. 65 536 points are shaded considering a strictly diffuse material (one lobe), one spherical light source and 64 spherical occluders, fixing the maximal depth of the visibility mask to 5 and the threshold to 0.001 (Section 4.3).

As an illustration, we also propose an interactive application for direct lighting including various lighting conditions, materials, soft shadows and dynamic scenes. As shown in the results, cosine lobe basis functions are useful for properly targeting relevant hemispherical directions in terms of BRDF, incoming radiance or visibility. In addition, their representation generally requires fewer parameters than with fixed basis functions, since the chosen lobe function parameters (axis, exponent and scale factor) can be deduced from the signal among an infinite set of solutions.

Our application demonstrates the feasibility of computing direct lighting without using any spatial or temporal coherency. Visibility masks are often identical for close neighborhoods of shaded points; this coherency should be used for reducing computing time.

Global illumination in dynamic scenes has been addressed using spherical proxies and spherical harmonics [8,9]. Similarly to these authors, our direct lighting application makes use of blocker accumulation. In the future, we aim to adapt their interesting simplifications to cosine lobes in order to overcome the usual SH drawbacks.

Acknowledgements

This project has been funded by the Poitou-Charentes region, thanks to PPF GIC and the NavII FEDER Project. We would also like to thank the anonymous reviewers for their valuable feedback.

Appendix A. Supplementary material

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.cag.2010.08.002.

References

[1] Pharr M, Humphreys G. Physically based rendering: from theory to implementation. Morgan Kaufmann Publishers Inc; 2004.

[2] Sloan P-P, Kautz J, Snyder J. Precomputed radiance transfer for real-time rendering in dynamic, low-frequency lighting environments. In: ACM transactions on graphics (Proceedings of SIGGRAPH’02), 2002. p. 527–36.

[3] Gautron P, Krivanek J, Pattanaik S, Bouatouch K. A novel hemispherical basis for accurate and efficient rendering. In: Eurographics symposium on rendering (EGSR’04), 2004. p. 321–30.

[4] Gautron P. Cache de luminance et cartes graphiques: une approche pour la simulation d’éclairage temps réel dans des scènes animées. PhD thesis, Université de Rennes 1; 2006.

[5] Ng R, Ramamoorthi R, Hanrahan P. All-frequency shadows using non-linear wavelet lighting approximation. In: ACM transactions on graphics (Proceedings of SIGGRAPH’03), 2003. p. 376–81.

[6] McAllister DK, Lastra A, Heidrich W. Efficient rendering of spatial bi-directional reflectance distribution functions. In: Conference on graphics hardware (HWWS’02), 2002. p. 79–88.

[7] Tsai Y-T, Shih Z-C. All-frequency precomputed radiance transfer using spherical radial basis functions and clustered tensor approximation. In: ACM transactions on graphics (Proceedings of SIGGRAPH’06), 2006. p. 967–76.

[8] Sloan P-P, Govindaraju NK, Nowrouzezahrai D, Snyder J. Image-based proxy accumulation for real-time soft global illumination. In: Pacific conference on computer graphics and applications, 2007. p. 97–105.

[9] Guerrero P, Jeschke S, Wimmer M. Real-time indirect illumination and soft shadows in dynamic scenes using spherical lights. Computer Graphics Forum 2008;27(8):2154–68.

[10] Kautz J, Lehtinen J, Aila T. Hemispherical rasterization for self-shadowing of dynamic objects. In: Eurographics symposium on rendering (EGSR’04), 2004. p. 179–84.

[11] Zhou K, Hu Y, Lin S, Guo B, Shum H-Y. Precomputed shadow fields for dynamic scenes. In: ACM transactions on graphics (Proceedings of SIGGRAPH’05), 2005. p. 1196–201.

[12] Ren Z, Wang R, Snyder J, Zhou K, Liu X, Sun B, et al. Real-time soft shadows in dynamic scenes using spherical harmonic exponentiation. In: ACM transactions on graphics (Proceedings of SIGGRAPH’06), 2006. p. 977–86.

[13] Feng W-W, Peng L, Jia Y, Yu Y. Large-scale data management for prt-based real-time rendering of dynamically skinned models. In: Eurographics symposium on rendering (EGSR’07), 2007. p. 23–34.

[14] Kozlowski O, Kautz J. Is accurate occlusion of glossy reflections necessary? In: Symposium on applied perception in graphics and visualization (APGV 07), 2007. p. 91–8.

[15] Green P, Kautz J, Matusik W, Durand F. View-dependent precomputed light transport using nonlinear Gaussian function approximations. In: Symposium on interactive 3D graphics (I3D’06), 2006. p. 7–14.

[16] Green P, Kautz J, Durand F. Efficient reflectance and visibility approximations for environment map rendering. Computer Graphics Forum 2007;26(3):495–502.

[17] Lafortune EPF, Foo S-C, Torrance KE, Greenberg DP. Non-linear approximation of reflectance functions. In: ACM transactions on graphics (Proceedings of SIGGRAPH’97), 1997. p. 117–26.

[18] Ward GJ. Measuring and modeling anisotropic reflection. In: ACM transactions on graphics (Proceedings of SIGGRAPH’92), 1992. p. 265–72.

[19] Wang J, Ren P, Gong M, Snyder J, Guo B. All-frequency rendering of dynamic, spatially-varying reflectance. In: ACM transactions on graphics (Proceedings of SIGGRAPH ASIA’09), 2009. p. 1–10.

[20] Sloan P-P, Hall J, Hart J, Snyder J. Clustered principal components for precomputed radiance transfer. In: ACM transactions on graphics (Proceedings of SIGGRAPH’03), 2003. p. 382–91.

[21] Eisemann E, Assarsson U, Schwarz M, Wimmer M. Casting shadows in real time. In: ACM transactions on graphics (Proceedings of SIGGRAPH ASIA’09 Courses), 2009.

[22] Press W, Flannery B, Teukolsky S, Vetterling W. Numerical recipes in C: the art of scientific computing. Cambridge University Press; 1992.

[23] Ng R, Ramamoorthi R, Hanrahan P. Triple product wavelet integrals for all-frequency relighting. In: ACM transactions on graphics (Proceedings of SIGGRAPH’04), 2004. p. 477–87.

[24] Phong BT. Illumination for computer generated pictures. Communications of the ACM 1975;18(6):311–7.

[25] Neumann L, Neumann A, Szirmay-Kalos L. Compact metallic reflectance models. In: Computer graphics forum (Proceedings of EUROGRAPHICS’99), 1999. p. 161–72.

[26] Lewis RR. Making shaders more physically plausible. In: Computer graphics forum (Proceedings of EUROGRAPHICS’94), 1994. p. 1–13.

[27] Fernando R. GPU Gems: programming techniques, tips and tricks for real-time graphics. Pearson Higher Education; 2004.

[28] Wang R, Zhou K, Snyder J, Liu X, Bao H, Peng Q, et al. Variational sphere set approximation for solid objects. The Visual Computer 2006;22(9):612–21.

[29] Bradshaw G, O’Sullivan C. Sphere-tree construction using dynamic medial axis approximation. In: Symposium on computer animation (SCA’02), 2002. p. 33–40.

[30] Zhou K, Hou Q, Wang R, Guo B. Real-time kd-tree construction on graphics hardware. In: ACM transactions on graphics (Proceedings of SIGGRAPH ASIA’08), 2008. p. 1–11.

[31] Wanger L. The effect of shadow quality on the perception of spatial relationships in computer generated imagery. In: Symposium on interactive 3D graphics (I3D’92), 1992. p. 39–42.

[32] Sattler M, Sarlette R, Mucken T, Klein R. Exploitation of human shadow perception for fast shadow rendering. In: Symposium on applied perception in graphics and visualization (APGV 05), 2005.

[33] Annen T, Dong Z, Mertens T, Bekaert P, Seidel H-P, Kautz J. Real-time, all-frequency shadows in dynamic scenes. In: ACM transactions on graphics (Proceedings of SIGGRAPH’08), 2008. p. 1–8.

[34] Schwarz M, Stamminger M. Bitmask soft shadows. Computer Graphics Forum 2007;26(3):515–24.
