Embree Ray Tracing Kernels

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Embree Ray Tracing Kernels Sven Woop

Transcript of Embree Ray Tracing Kernels

Page 1: Embree Ray Tracing Kernels

Embree Ray Tracing Kernels

Sven Woop

Page 2: Embree Ray Tracing Kernels

Legal

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.

Intel may make changes to specifications and product descriptions at any time, without notice.

All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.

Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Optimized Intel® HD Graphics P3000 only available on select models of the Intel® Xeon® processor E3 family. To learn more about Intel Xeon processors for workstation visit www.intel.com/go/workstation.

HD Graphics P4000 introduces four additional execution units, going from 8 in the HD P3000 to 12 in the HD P4000. Optimized Intel® HD Graphics P4000 only available on select models of the Intel® Xeon® processor E3-1200 v2 product family. For more information, visit http://www.intel.com/content/www/us/en/architecture-and-technology/hdgraphics/hdgraphics-developer.html

Iris™ graphics is available on select systems. Consult your system manufacturer.

Any code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user.

Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps.

Performance claims: Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to: http://www.Intel.com/performance

8/18/2015

Page 3: Embree Ray Tracing Kernels

Legal Disclaimer and Optimization Notice

INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Copyright © , Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon, Core, VTune, and Cilk are trademarks of Intel Corporation in the U.S. and other countries.

Optimization Notice

Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804

Page 4: Embree Ray Tracing Kernels

Outline

• Embree Overview
• Embree Performance
• Embree API
• Catmull Clark Subdivision Surfaces

Page 5: Embree Ray Tracing Kernels

Embree Overview

Page 6: Embree Ray Tracing Kernels

Usage of Ray Tracing Today

• Movie industry transitioning to ray tracing (better image quality, faster feedback)
• High quality rendering for commercials, prints, etc.
• Provides higher fidelity for virtual design (automotive industry, architectural design, etc.)
• Various kinds of simulations (lighting, sound, particles, collision detection, etc.)
• Prebaked lighting in games
• etc.

Page 7: Embree Ray Tracing Kernels

Writing a Fast Ray Tracer is Difficult

• Need to multi-thread: easy for rendering but difficult for hierarchy construction
• Need to vectorize: efficient use of SIMD units, different ISAs (SSE, AVX, AVX2, AVX-512, KNCNI)
• Need deep domain knowledge: many different data structures (kd-trees, octrees, grids, BVH2, BVH4, ..., hybrid structures) and algorithms (single rays, packets, large packets, stream tracing, ...) to choose from
• Need to support different CPUs: different ISAs/CPU types favor different data structures, data layouts, and algorithms

Page 8: Embree Ray Tracing Kernels

Observations

• Ray tracers are often not sufficiently optimized
• Ray traversal consumes a large share of a renderer's cycles (often over 70%)
• Ray tracing can be expressed by a small number of commonly used operations (build and traversal)
• A ray tracing kernel library has the potential to speed up many rendering applications

Page 9: Embree Ray Tracing Kernels

Embree Ray Tracing Kernels

• Provides highly optimized and scalable ray tracing kernels (data structure build and ray traversal)
• Highest ray tracing performance (1.5x – 6x speedup reported by users)
• Support for latest CPUs (e.g. AVX512 support)
• Targets application developers in professional rendering environments
• API for easy integration into applications
• Free and Open Source under Apache 2.0 license (http://embree.github.com)

Page 10: Embree Ray Tracing Kernels

Embree Features

• Find-closest-hit and find-any-hit kernels (rtcIntersect, rtcOccluded)
• Single rays and ray packets (4, 8, 16)
• High quality and high performance hierarchy builders
• Intel® SPMD Program Compiler (ISPC) supported
• Triangles, Instances, Hair, Catmull Clark Subdivision Surfaces, Displacement Mapping
• Extensible (User Defined Geometry, Intersection filter functions, Open Source)
• SSE, AVX, AVX2, and AVX512 support

Page 11: Embree Ray Tracing Kernels

New Embree Features

• Catmull Clark Subdivision Surfaces
  – Smooth surface primitive
• Vector Displacement Mapping
  – Add geometric detail
• Initial AVX512 support
  – 16 wide AVX512 traversal kernels

Page 12: Embree Ray Tracing Kernels

Embree System Overview

Embree API (C++ and ISPC)

Ray Tracing Kernel Selection

• Accel. structure: bvh4.triangle4, bvh8.triangle8, bvh4aos.triangle1, bvh4.grid
• Builders: SAH builder, Spatial split builder, Morton code builder, BVH Refitter
• Traversal: Single ray (SSE2, AVX, AVX2), packet (SSE2), hybrid (SSE4.2), ...
• Intersection: Möller-Trumbore, Plücker Variant, Bézier Curve, Triangle Grids
• Subdiv Engine: B-Spline Patch, Gregory Patch, Tessellation Cache, Displ. Mapping

Common Vector and SIMD Library (Vec3f, Vec3fa, float4, float8, float16, SSE2, SSE4.1, AVX, AVX2, AVX512)
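
The intersection layer above lists the Möller-Trumbore ray/triangle test. As a rough illustration of what such a kernel computes, here is a scalar sketch of the textbook algorithm. It is not Embree's vectorized implementation, and the names `Vec3`, `TriangleHit`, and `intersectTriangle` are made up for this example.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Hit distance t and barycentric coordinates (u, v) of the hit point.
struct TriangleHit { float t, u, v; };

// Textbook Moeller-Trumbore test (illustrative, scalar). Returns true and fills
// `hit` if the ray org + t*dir crosses triangle (v0, v1, v2) with t in [tnear, tfar].
bool intersectTriangle(Vec3 org, Vec3 dir, float tnear, float tfar,
                       Vec3 v0, Vec3 v1, Vec3 v2, TriangleHit& hit)
{
    assert(tnear <= tfar);
    const Vec3 e1 = sub(v1, v0);
    const Vec3 e2 = sub(v2, v0);
    const Vec3 p = cross(dir, e2);
    const float det = dot(e1, p);
    if (std::fabs(det) < 1e-8f) return false;      // ray is parallel to the triangle plane
    const float inv = 1.0f / det;
    const Vec3 s = sub(org, v0);
    const float u = dot(s, p) * inv;
    if (u < 0.0f || u > 1.0f) return false;        // outside along the first edge
    const Vec3 q = cross(s, e1);
    const float v = dot(dir, q) * inv;
    if (v < 0.0f || u + v > 1.0f) return false;    // outside along the second edge
    const float t = dot(e2, q) * inv;
    if (t < tnear || t > tfar) return false;       // hit lies outside the ray interval
    hit = TriangleHit{t, u, v};
    return true;
}
```

A production kernel would test several triangles (or several rays) per call with SIMD instructions, which is where layouts such as triangle4 and triangle8 come in.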

Page 13: Embree Ray Tracing Kernels

Why Ray Tracing on CPUs?

• High ray tracing performance for photorealistic rendering
• Large memory capacity to render really complex models
• Robust tools to develop and debug rendering applications
• Complex shading and rendering applications are executed efficiently (e.g. light cuts with large per pixel state)

Page 14: Embree Ray Tracing Kernels

Why should I use Embree?

• Hides the complexity of writing high performance ray tracing kernels, giving you more time to develop your renderer
• High performance on latest Intel® Xeon® Processor family and Intel® Xeon Phi™ coprocessor products
• Embree is always up to date with the latest instruction sets
• High potential performance gain (1.5x – 6x rendering speedup reported by Embree users)

Page 15: Embree Ray Tracing Kernels

How can I use Embree?

• As a benchmark to identify performance issues in existing applications
• Adopt algorithms from Embree to your code
  – However Embree internals change frequently!
• As a library through the Embree API (recommended)
  – Benefit from future Embree improvements!

Page 16: Embree Ray Tracing Kernels

Embree v2.6.1 Performance

Page 17: Embree Ray Tracing Kernels

Performance Methodology

• Models and illumination effects representative of professional rendering environments
• Path tracer with different material types, different light types, about 2000 lines of code
• Evaluation on typical Intel® Xeon® rendering workstation* and Intel® Xeon Phi™ Coprocessor**
• Comparison against state of the art GPU*** methods (using OptiX™ 3.8.0 and CUDA® 7.0.28)
• Identical implementations in ISPC (Xeon®), ISPC (Xeon Phi™), OptiX™ (GTX™ Titan X)

Test models:

• Imperial Crown of Austria, 4.3M triangles
• Bentley 4.5l Blower (1927), 2.3M triangles
• Asian Dragon, 12.3M triangles

* Dual Socket Intel® Xeon® E5-2699 v3 2x18 cores @ 2.30GHz
** Intel® Xeon Phi™ 7120, 61 cores @ 1.238 GHz
*** NVIDIA® GeForce® GTX™ Titan X

Page 18: Embree Ray Tracing Kernels

Build Performance for Static Scenes

SAH Build (high quality), in million triangles/second (bar values in chart order):

Intel® Xeon® E5-2699 v3 Processor, 2 x 18 cores, 2.3 GHz:    40     41     45
Intel® Xeon Phi™ 7120 Coprocessor, 61 cores, 1.28 GHz:       32.3   31.7   35.1

Page 19: Embree Ray Tracing Kernels

Build Performance for Dynamic Scenes

Morton Build, in million triangles/second (bar values in chart order):

Intel® Xeon® E5-2699 v3 Processor, 2 x 18 cores, 2.3 GHz:    112     108     105
Intel® Xeon Phi™ 7120 Coprocessor, 61 cores, 1.28 GHz:       160.1   140.4   162.1

Page 20: Embree Ray Tracing Kernels

Ray Tracing Performance (incl. Shading)

In million rays/second (bar values in chart order):

Intel® Xeon® E5-2699 v3 Processor, 2 x 18 cores, 2.3 GHz:      107.2    129.6   134.98
Intel® Xeon Phi™ 7120 Coprocessor, 61 cores, 1.28 GHz:         64.96    75.36   82.62
NVIDIA® GeForce® GTX™ Titan X Coprocessor, 12 GB RAM:          29.472   35.04   38.76

Page 21: Embree Ray Tracing Kernels

Embree API

Page 22: Embree Ray Tracing Kernels

Embree API Overview

• Version 2 of the Embree API
• Compact and easy to use
• C++ and ISPC version
• Hides implementation details (e.g. different spatial index structures)

Page 23: Embree Ray Tracing Kernels

Scene Object

• Scene is a container for a set of geometries
• Scene flags are passed at creation time
• Scene geometry changes have to be committed (rtcCommit), which triggers the BVH build

/* include embree headers */
#include <embree2/rtcore.h>

int main ()
{
  /* initialize at application startup */
  rtcInit ();

  /* create scene */
  RTCScene scene = rtcNewScene (RTC_SCENE_STATIC, RTC_INTERSECT1);

  /* add geometries */
  ... later slide ...

  /* commit changes */
  rtcCommit (scene);

  /* trace rays */
  ... later slide ...

  /* cleanup at application exit */
  rtcExit ();
}

Page 24: Embree Ray Tracing Kernels

Scene Types

• Static Scenes
  – Geometry cannot be changed
  – High quality BVH build (SAH), faster ray traversal
  – For final frame rendering
• Dynamic Scenes
  – Geometries can be added, modified, and removed
  – Faster build (Morton), slower ray traversal
  – Preview mode during geometric modeling
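
The fast builder for dynamic scenes is based on Morton codes. As a hedged illustration of the underlying idea (not Embree's builder), the sketch below computes a 30-bit Morton code for a point whose coordinates have already been normalized to [0,1] relative to the scene bounds. A Morton builder would sort primitive centroids by this code so that spatially close primitives end up close together in memory. The function names are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Spreads the lower 10 bits of v so that two zero bits separate each input bit.
static uint32_t expandBits(uint32_t v)
{
    assert(v < 1024u);
    v = (v * 0x00010001u) & 0xFF0000FFu;
    v = (v * 0x00000101u) & 0x0F00F00Fu;
    v = (v * 0x00000011u) & 0xC30C30C3u;
    v = (v * 0x00000005u) & 0x49249249u;
    return v;
}

// 30-bit Morton code of a point with coordinates in [0,1]; each axis is
// quantized to 10 bits and the bits of x, y, z are interleaved (x highest).
uint32_t mortonCode(float x, float y, float z)
{
    auto quantize = [](float f) {
        const float scaled = std::min(std::max(f * 1024.0f, 0.0f), 1023.0f);
        return static_cast<uint32_t>(scaled);
    };
    return (expandBits(quantize(x)) << 2) |
           (expandBits(quantize(y)) << 1) |
            expandBits(quantize(z));
}
```

Sorting by such a code is much cheaper than evaluating the surface area heuristic, which matches the trade-off on the slide: faster build, slower traversal.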

Page 25: Embree Ray Tracing Kernels

Triangle Mesh

• Contains vertex and index buffers
• Number of triangles and vertices set at creation time
• Linear motion blur supported (2 vertex buffers)

/* add mesh to scene (geometry flags, triangle count, vertex count, time steps) */
unsigned int geomID = rtcNewTriangleMesh (scene, RTC_GEOMETRY_STATIC, numTriangles, numVertices, 1);

/* fill data buffers */
... later slide ...

/* add more geometries */
...

/* commit changes */
rtcCommit (scene);

Page 26: Embree Ray Tracing Kernels

Buffer Sharing

• Buffer sharing is recommended
• Reduces memory consumption
• Application manages buffers (a buffer has to stay alive as long as the geometry is alive)
• Support for stride and offset gives the application flexibility in its data layout

Page 27: Embree Ray Tracing Kernels

Buffer Sharing Example

/* application vertex and index layout */
struct Vertex { float x,y,z,s,t; };
struct Triangle { int materialID, v0, v1, v2; };

/* share buffers with application (arguments after the pointer: byte offset, byte stride) */
rtcSetBuffer(scene,geomID,RTC_VERTEX_BUFFER,vertexPtr,0,sizeof(Vertex));

/* offset 4 skips the leading materialID so that indexing starts at v0 */
rtcSetBuffer(scene,geomID,RTC_INDEX_BUFFER ,indexPtr ,4,sizeof(Triangle));
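
To make the role of offset and stride concrete, the following sketch shows how a library could address element i of a shared buffer as base + offset + i * stride. `BufferView` and `read` are hypothetical helpers written for this example, not part of the Embree API. The `Vertex` and `Triangle` layouts are the ones from the buffer sharing example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Application layouts from the buffer sharing example.
struct Vertex   { float x, y, z, s, t; };
struct Triangle { int materialID, v0, v1, v2; };

// Hypothetical view of a shared buffer: element i starts at base + offset + i * stride.
struct BufferView {
    const unsigned char* base;
    std::size_t offset;   // bytes to skip at the start of each element
    std::size_t stride;   // bytes between consecutive elements

    // Reads the component-th value of type T from element i.
    template <typename T>
    T read(std::size_t i, std::size_t component) const {
        T value;
        std::memcpy(&value, base + offset + i * stride + component * sizeof(T), sizeof(T));
        return value;
    }
};
```

With this view, the index buffer can start at `offsetof(Triangle, v0)` and still step over whole `Triangle` records, so the application never has to copy its data into a tightly packed array.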

Page 28: Embree Ray Tracing Kernels

Tracing Rays

• rtcIntersect (scene, ray) reports the first (closest) intersection
• rtcOccluded (scene, ray) reports whether any intersection exists
• Packet versions for ray packets of size 4, 8, and 16
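
The difference between the two query types can be shown with a brute-force sketch over spheres. This is only an illustration under simplifying assumptions (no acceleration structure, normalized ray directions, tnear >= 0); the types and the functions `intersect` and `occluded` are stand-ins invented for this example and not the Embree API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Sphere { Vec3 center; float radius; };
struct Ray { Vec3 org, dir; float tnear, tfar; int primID; };  // primID < 0 means no hit

// Nearest hit distance inside (tnear, tfar) for a ray with normalized direction,
// or a negative value on a miss.
static float hitSphere(const Ray& ray, const Sphere& s)
{
    assert(ray.tnear >= 0.0f);
    const Vec3 oc = sub(ray.org, s.center);
    const float b = dot(oc, ray.dir);
    const float c = dot(oc, oc) - s.radius * s.radius;
    const float disc = b * b - c;
    if (disc < 0.0f) return -1.0f;
    const float root = std::sqrt(disc);
    const float t0 = -b - root, t1 = -b + root;
    if (t0 > ray.tnear && t0 < ray.tfar) return t0;
    if (t1 > ray.tnear && t1 < ray.tfar) return t1;
    return -1.0f;
}

// Closest-hit query: visits every candidate and shrinks tfar as closer hits are found.
void intersect(const std::vector<Sphere>& scene, Ray& ray)
{
    for (std::size_t i = 0; i < scene.size(); ++i) {
        const float t = hitSphere(ray, scene[i]);
        if (t >= 0.0f) { ray.tfar = t; ray.primID = static_cast<int>(i); }
    }
}

// Any-hit query: may return as soon as one intersection is found.
bool occluded(const std::vector<Sphere>& scene, const Ray& ray)
{
    for (const Sphere& s : scene)
        if (hitSphere(ray, s) >= 0.0f) return true;
    return false;
}
```

Because the any-hit query can stop early, it is the natural choice for shadow rays, while primary and secondary rays that need shading data use the closest-hit query.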

Page 29: Embree Ray Tracing Kernels

rtcIntersect: Ray Structure Inputs

• Ray origin and direction (org, dir)
• Ray interval (tnear, tfar)
• Time used for motion blur [0,1]

struct RTCRay
{
  /* ray data (inputs) */
  Vec3f org;
  Vec3f dir;
  float tnear;
  float tfar;
  float time;

  /* hit data (outputs) */
  Vec3f Ng;
  float u;
  float v;
  int geomID;
  int primID;
  int instID;
};

Page 30: Embree Ray Tracing Kernels

rtcIntersect: Ray Structure Outputs

• Hit distance (tfar)
• Unnormalized geometry normal (Ng)
• Local hit coordinates (u,v)
• Geometry identifier of hit geometry (geomID)
• Index of hit primitive of geometry (primID)
• Geometry identifier of hit instance (instID)
• No shading normals, texture coordinates, etc.

struct RTCRay
{
  /* ray data (inputs) */
  Vec3f org;
  Vec3f dir;
  float tnear;
  float tfar;   /* updated to the hit distance on a hit */
  float time;

  /* hit data (outputs) */
  Vec3f Ng;
  float u;
  float v;
  int geomID;
  int primID;
  int instID;
};
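
Since the kernel returns only the hit distance and an unnormalized geometry normal, the application typically derives the hit position and a unit normal itself. The sketch below shows one way to do that; `HitRay` is a hypothetical stand-in holding just the RTCRay fields used here, and flipping the normal towards the ray origin is a common renderer convention rather than something the slides prescribe.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical stand-in for the RTCRay fields used in this example.
struct HitRay { Vec3 org, dir; float tfar; Vec3 Ng; };

// Hit position: origin plus hit distance times direction.
Vec3 hitPosition(const HitRay& r)
{
    return {r.org.x + r.tfar * r.dir.x,
            r.org.y + r.tfar * r.dir.y,
            r.org.z + r.tfar * r.dir.z};
}

// Unit-length geometry normal, flipped if needed so it faces the ray origin.
Vec3 faceForwardNormal(const HitRay& r)
{
    const float len = std::sqrt(r.Ng.x * r.Ng.x + r.Ng.y * r.Ng.y + r.Ng.z * r.Ng.z);
    assert(len > 0.0f);
    Vec3 n = {r.Ng.x / len, r.Ng.y / len, r.Ng.z / len};
    const float facing = n.x * r.dir.x + n.y * r.dir.y + n.z * r.dir.z;
    if (facing > 0.0f) n = {-n.x, -n.y, -n.z};   // normal points away from the viewer
    return n;
}
```

Shading normals and texture coordinates would come from the application's own vertex data, looked up with geomID, primID, and (u,v).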

Page 31: Embree Ray Tracing Kernels

Intel® SPMD Program Compiler (ISPC)

• Simplifies writing a vectorized renderer
• C-based language plus vector extensions
• Scalar-looking code that gets vectorized automatically
• Guaranteed vectorization
• Compilation to different vector ISAs (SSE, AVX, AVX2, AVX512, Xeon Phi™)
• Available as Open Source from http://ispc.github.com

Page 32: Embree Ray Tracing Kernels

Embree Rendering: ISPC Example

/* loop over all screen pixels (the upper bound of an ISPC foreach range is exclusive) */
foreach (y=0 ... screenHeight, x=0 ... screenWidth)
{
  /* create and trace primary ray */
  RTCRay ray = make_Ray(p,normalize(x*vx + y*vy + vz),eps,inf);
  rtcIntersect(scene,ray);

  /* environment shading */
  if (ray.geomID == RTC_INVALID_GEOMETRY_ID) {
    pixels[y*screenWidth+x] = make_Vec3f(0.0f); continue;
  }

  /* calculate hard shadows */
  RTCRay shadow = make_Ray(ray.org+ray.tfar*ray.dir,neg(lightDir),eps,inf);
  rtcOccluded(scene,shadow);

  /* shadow.geomID stays invalid if nothing blocks the light */
  if (shadow.geomID == RTC_INVALID_GEOMETRY_ID)
    pixels[y*screenWidth+x] = colors[ray.primID]*(0.5f + clamp(-dot(lightDir,normalize(ray.Ng)),0.0f,1.0f));
  else
    pixels[y*screenWidth+x] = colors[ray.primID]*0.5f;
}

Page 33: Embree Ray Tracing Kernels

Dynamic Scenes

• Create scene with RTC_SCENE_DYNAMIC flag
• Report modified meshes with rtcUpdate call
• Possibly enable (rtcEnable), disable (rtcDisable), add (rtcNewXX), and delete (rtcDeleteGeometry) geometries

for each frame
{
  for each dynamic mesh
  {
    /* modify shared buffers */
    modify mesh->indices
    modify mesh->vertices

    /* signal mesh update */
    rtcUpdate(scene,mesh);
  }

  /* commit changes */
  rtcCommit (scene);

  /* trace rays */
  ...
}

Page 34: Embree Ray Tracing Kernels

Intersection Filter Functions

• Per geometry callback that is called during traversal for each primitive intersection
• Callback can accept or reject hit
• Can be used for:
  – Trimming curves (e.g. modeling tree leaves)
  – Transparent shadows (reject and accumulate)
  – Find all hits (reject and collect)
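
The "reject and accumulate" pattern for transparent shadows can be sketched without any library: the filter multiplies a per-ray transmittance by each surface it meets and only accepts the hit once the ray is fully blocked. Everything below (`Candidate`, `ShadowRay`, `Filter`, `traceShadow`) is a hypothetical stand-in for the traversal and callback machinery, meant to show the control flow only.

```cpp
#include <cassert>
#include <vector>

// Hypothetical candidate hit reported by traversal; opacity is application data.
struct Candidate { float t; float opacity; };

struct ShadowRay {
    float tfar;            // end of the ray interval
    bool hit;              // set when the filter accepts a hit
    float transmittance;   // accumulated by the filter, starts at 1
};

// Filter callback: returns true to accept the hit, false to reject it.
using Filter = bool (*)(ShadowRay& ray, const Candidate& c);

// Reject and accumulate: attenuate the ray and keep going unless it is fully blocked.
bool transparentShadowFilter(ShadowRay& ray, const Candidate& c)
{
    ray.transmittance *= (1.0f - c.opacity);
    return ray.transmittance <= 0.0f;
}

// Stand-in for an any-hit traversal: calls the filter for every candidate inside
// the ray interval and stops at the first accepted hit.
void traceShadow(const std::vector<Candidate>& candidates, ShadowRay& ray, Filter filter)
{
    for (const Candidate& c : candidates) {
        if (c.t >= ray.tfar) continue;          // outside the ray interval
        if (filter(ray, c)) { ray.hit = true; return; }
    }
}
```

After the call, `ray.transmittance` holds the fraction of light that reaches the shading point; a "find all hits" filter would follow the same shape but append each candidate to a list instead of multiplying.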

Page 35: Embree Ray Tracing Kernels

Filter Function Example

/* procedural intersection filter function */
void intersectionFilter(void* userPtr, RTCRay& ray)
{
  /* hit position */
  Vec3fa h = ray.org + ray.dir*ray.tfar;

  /* procedural transparency in [0,1] */
  float v = abs(sin(4.0f*h.x)*cos(4.0f*h.y)*sin(4.0f*h.z));
  float T = clamp((v-0.1f)*3.0f,0.0f,1.0f);

  if (T < 1.0f) return;                 // accept hit (not fully transparent)
  ray.geomID = RTC_INVALID_GEOMETRY_ID; // reject hit
}

/* set intersection filter for the cube */
rtcSetIntersectionFilterFunction(scene, geomID, (RTCFilterFunc)&intersectionFilter);
rtcSetOcclusionFilterFunction (scene, geomID, (RTCFilterFunc)&intersectionFilter);
rtcSetUserData (scene, geomID, NULL);

Page 36: Embree Ray Tracing Kernels

Hair Geometry

• Hair curves represented as cubic Bézier curves with varying radius
• High performance through use of oriented bounding boxes
• Low memory consumption through direct ray/curve intersection

Figure: curve with control points and radii p0/r0, p1/r1, p2/r2, p3/r3
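
A hair segment as described here is defined by four control points, each carrying a position and a radius. The sketch below evaluates such a curve with the standard cubic Bernstein basis, interpolating the radius the same way as the position. The names `CurvePoint` and `evalHair` are made up for this example and do not come from the Embree API.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Control point of a hair curve: position p and radius r.
struct CurvePoint { Vec3 p; float r; };

// Evaluates a cubic Bezier curve with per-control-point radius at t in [0,1].
CurvePoint evalHair(const CurvePoint cp[4], float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    const float s = 1.0f - t;
    // Cubic Bernstein basis functions; they sum to 1 for every t.
    const float b[4] = {s * s * s, 3.0f * s * s * t, 3.0f * s * t * t, t * t * t};
    CurvePoint out = {{0.0f, 0.0f, 0.0f}, 0.0f};
    for (int i = 0; i < 4; ++i) {
        out.p.x += b[i] * cp[i].p.x;
        out.p.y += b[i] * cp[i].p.y;
        out.p.z += b[i] * cp[i].p.z;
        out.r   += b[i] * cp[i].r;
    }
    return out;
}
```

Intersecting a ray directly against this representation avoids tessellating each hair into many small primitives, which is the memory argument made on the slide.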

Page 37: Embree Ray Tracing Kernels

Catmull Clark Subdivision Surfaces

Page 38: Embree Ray Tracing Kernels

Catmull Clark Subdivision Surfaces

• Converts coarse mesh into smooth surface by subdivision
• Generalization of bi-cubic B-Spline surfaces to arbitrary topology
• Embree is compatible with OpenSubdiv 3.0

Page 39: Embree Ray Tracing Kernels

Catmull Clark Subdivision

Page 40: Embree Ray Tracing Kernels

CC Subdivision Surface Advantages

• Low resolution base mesh controls high resolution surface
• Smoothness always guaranteed (C2 continuous almost everywhere)
• Support for arbitrary topology (no trimming required as with NURBS)
• Creases allow introducing sharp features
• Support in most modeling tools
• Established as standard in movie production

Image: Inside-Out (2015), Pixar, Walt Disney Pictures
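
For readers unfamiliar with the scheme, the smooth (crease-free, interior) Catmull-Clark rules for one subdivision step can be written down compactly. The sketch below implements only those three textbook rules; boundaries, creases, and mesh connectivity are left out, and the function names are chosen for this example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

// Arithmetic mean of a non-empty set of points.
static Vec3 average(const std::vector<Vec3>& pts)
{
    assert(!pts.empty());
    Vec3 sum = {0.0f, 0.0f, 0.0f};
    for (const Vec3& p : pts) { sum.x += p.x; sum.y += p.y; sum.z += p.z; }
    const float inv = 1.0f / static_cast<float>(pts.size());
    return {sum.x * inv, sum.y * inv, sum.z * inv};
}

// Face point: centroid of the vertices of a face.
Vec3 facePoint(const std::vector<Vec3>& faceVertices) { return average(faceVertices); }

// Edge point (smooth interior edge): mean of the two edge endpoints and the
// face points of the two faces sharing the edge.
Vec3 edgePoint(Vec3 e0, Vec3 e1, Vec3 f0, Vec3 f1) { return average({e0, e1, f0, f1}); }

// Vertex point (smooth interior vertex of valence n): (F + 2R + (n-3)P) / n, where
// F is the mean of the adjacent face points and R the mean of the incident edge midpoints.
Vec3 vertexPoint(Vec3 P, const std::vector<Vec3>& adjacentFacePoints,
                 const std::vector<Vec3>& incidentEdgeMidpoints)
{
    assert(adjacentFacePoints.size() == incidentEdgeMidpoints.size());
    assert(adjacentFacePoints.size() >= 3);
    const float n = static_cast<float>(adjacentFacePoints.size());
    const Vec3 F = average(adjacentFacePoints);
    const Vec3 R = average(incidentEdgeMidpoints);
    return {(F.x + 2.0f * R.x + (n - 3.0f) * P.x) / n,
            (F.y + 2.0f * R.y + (n - 3.0f) * P.y) / n,
            (F.z + 2.0f * R.z + (n - 3.0f) * P.z) / n};
}
```

Applying these rules repeatedly pulls a raised control vertex towards its neighbours, which is how the coarse base mesh turns into a smooth limit surface.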

Page 41: Embree Ray Tracing Kernels

Embree Subdivision Features

• Semi-sharp edge creases
• Semi-sharp vertex creases
• Vertex attribute interpolation
• Tessellation level per edge
• Non-manifolds and Holes
• Boundary modes
• Triangles, Quads, Pentagons, ...
• Displacement mapping

Page 42: Embree Ray Tracing Kernels

Embree Subdivision Example

unsigned geomID = rtcNewSubdivisionMesh (scene, RTC_GEOMETRY_STATIC,
                                         numFaces, numIndices, numVertices,
                                         numEdgeCreases, numVertexCreases, numHoles);

rtcSetBuffer (scene,geomID,RTC_VERTEX_BUFFER, vertices, 0, sizeof(float3));
rtcSetBuffer (scene,geomID,RTC_INDEX_BUFFER , indices, 0, sizeof(int));
rtcSetBuffer (scene,geomID,RTC_FACE_BUFFER , faces, 0, sizeof(int));
rtcSetBuffer (scene,geomID,RTC_LEVEL_BUFFER , levels, 0, sizeof(float));

rtcSetBuffer (scene,geomID,RTC_EDGE_CREASE_INDEX_BUFFER,...);
rtcSetBuffer (scene,geomID,RTC_EDGE_CREASE_WEIGHT_BUFFER,...);

rtcSetBuffer (scene,geomID,RTC_VERTEX_CREASE_INDEX_BUFFER,...);
rtcSetBuffer (scene,geomID,RTC_VERTEX_CREASE_WEIGHT_BUFFER,...);

rtcSetBuffer (scene,geomID,RTC_HOLE_BUFFER,holes,0,sizeof(char));

Page 43: Embree Ray Tracing Kernels

Embree Subdivision Implementation

• Feature adaptive subdivision to evaluate patch (subdivides only at irregular vertices and crease features)
• Fast path for B-Spline patches and Gregory patches
• Tessellation Cache limits memory consumption (trade memory for performance)

Figure: Feature adaptive subdivision into B-Spline patches (green) and Gregory patches (blue)

Page 44: Embree Ray Tracing Kernels

Embree Subdivision Performance

Patches         16        52k       53k
Edge Creases    0         0         30k
Micro Quads     1048k     831k      837k
Walkthrough     32 fps    36 fps    23 fps
Same View       66 fps    51 fps    56 fps

Intel® Xeon® E5-2690, 2.9 GHz, 2x 8 cores, 1024 x 1024 pixels

Page 45: Embree Ray Tracing Kernels

Vertex Data Interpolation

• Interpolates arbitrary user data over geometries (non-trivial for subdivision geometries)
• Interpolated data P as well as dPdu and dPdv can be calculated at an arbitrary location
• Enables smooth normals and anisotropic texture lookups
• Different rules for interpolation of texture coordinates supported (by evaluation of a second subdiv mesh)

Page 46: Embree Ray Tracing Kernels

Vertex Data Interpolation Example

RTCScene scene = rtcNewScene (RTC_STATIC, RTC_INTERSECT1 | RTC_INTERPOLATE);

...

unsigned geomID = rtcNewSubdivisionMesh (...);

rtcSetBuffer (scene,geomID,RTC_INDEX_BUFFER, indices, 0, sizeof(int));

rtcSetBuffer (scene,geomID,RTC_VERTEX_BUFFER, vertices, 0, sizeof(float3));

rtcSetBuffer (scene,geomID,RTC_USER_VERTEX_BUFFER, vertex_colors, 0, sizeof(float3));

...

rtcCommit (scene);

...

rtcIntersect (scene, ray);

...

float3 P, dPdu, dPdv;

rtcInterpolate (scene, geomID, primID, ray.u,ray.v, RTC_VERTEX_BUFFER, &P, &dPdu, &dPdv, 3);

float3 color;

rtcInterpolate (scene, geomID, primID, ray.u,ray.v, RTC_USER_VERTEX_BUFFER, &color, 0,0, 3);
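For a plain triangle, the quantities returned by the interpolation call can be written out by hand, which may help show what P, dPdu and dPdv mean. The sketch below is not Embree code: `Vec3`, `Interpolated`, `interpolateTriangle` and `normalFromDerivatives` are hypothetical helpers, and the barycentric convention P = (1-u-v)*d0 + u*d1 + v*d2 is an assumption of this example. Subdivision surfaces need a real patch evaluation, which is why the slides call that case non-trivial.

```cpp
#include <cmath>

// Hypothetical 3-component vector; not an Embree type.
struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(float s, Vec3 a) { return {s * a.x, s * a.y, s * a.z}; }

inline Vec3 cross(Vec3 a, Vec3 b) {
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

inline Vec3 normalize(Vec3 a) {
  float len = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
  return (1.0f / len) * a;
}

// Interpolated value plus its two parametric derivatives.
struct Interpolated { Vec3 P, dPdu, dPdv; };

// Linear interpolation of per-vertex data d0, d1, d2 over a triangle,
// assuming the convention P = (1-u-v)*d0 + u*d1 + v*d2.
inline Interpolated interpolateTriangle(Vec3 d0, Vec3 d1, Vec3 d2, float u, float v) {
  Interpolated r;
  r.P = (1.0f - u - v) * d0 + u * d1 + v * d2;
  r.dPdu = d1 - d0;  // derivative of P with respect to u
  r.dPdv = d2 - d0;  // derivative of P with respect to v
  return r;
}

// When the interpolated data are positions, the two derivatives are surface
// tangents and their cross product gives a normal direction.
inline Vec3 normalFromDerivatives(const Interpolated& r) {
  return normalize(cross(r.dPdu, r.dPdv));
}
```

Applied to the position buffer, the derivatives give a normal; applied to a user buffer such as colors, only the interpolated value is typically needed, which matches the second call on the slide passing 0 for both derivative pointers.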

Page 47: Embree Ray Tracing Kernels

Displaced Subdivision Surface

Support for vector displacement

Callback function displaces vertex positions

Bounded displacement allows for lazy evaluation

Smooth normals possible through approximation

Q = P + D*Ng

dQdu ≈ dPdu + dDdu*Ng

dQdv ≈ dPdv + dDdv*Ng
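The approximation above can be turned into a normal for the displaced surface by taking the cross product of dQdu and dQdv. The following is a minimal sketch under stated assumptions: `Vec3`, `displacedNormal` and `rampDisplacement` are hypothetical names, the derivatives of D are estimated with forward differences (the slides do not say how they are obtained), and the term D*dNg/du that the product rule would add is dropped, as in the formulas above.

```cpp
#include <cmath>

// Hypothetical 3-component vector; not an Embree type.
struct Vec3 { float x, y, z; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(float s, Vec3 a) { return {s * a.x, s * a.y, s * a.z}; }

inline Vec3 cross(Vec3 a, Vec3 b) {
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

inline Vec3 normalize(Vec3 a) {
  float len = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
  return (1.0f / len) * a;
}

// Approximate normal of the displaced surface Q = P + D*Ng.
// dPdu, dPdv: tangents of the undisplaced surface; Ng: its unit normal;
// D: scalar displacement as a function of (u, v); h: finite-difference step.
inline Vec3 displacedNormal(Vec3 dPdu, Vec3 dPdv, Vec3 Ng,
                            float (*D)(float, float), float u, float v,
                            float h = 1e-3f) {
  // Forward differences for the derivatives of the displacement.
  float dDdu = (D(u + h, v) - D(u, v)) / h;
  float dDdv = (D(u, v + h) - D(u, v)) / h;
  // dQdu ≈ dPdu + dDdu*Ng and dQdv ≈ dPdv + dDdv*Ng.
  Vec3 dQdu = dPdu + dDdu * Ng;
  Vec3 dQdv = dPdv + dDdv * Ng;
  return normalize(cross(dQdu, dQdv));
}

// Hypothetical displacement for illustration: a ramp rising along u.
inline float rampDisplacement(float u, float /*v*/) { return 0.1f * u; }
```

For a flat patch with tangents (1,0,0) and (0,1,0), normal (0,0,1) and the ramp displacement, the resulting normal tilts toward negative x, which is what one would expect for a surface that rises along u.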

Page 48: Embree Ray Tracing Kernels

Displaced Subdivision Surface Example

void displacementFunction(
  void* ptr, int geomID, int primID,
  const float* u, const float* v,
  const float* nx, const float* ny, const float* nz,
  float* px, float* py, float* pz,
  size_t N)
{
  for (size_t i = 0; i<N; i++) {
    float D = displacement(...);
    px[i] += D*nx[i];
    py[i] += D*ny[i];
    pz[i] += D*nz[i];
  }
}

BBox3fa bounds(...);

rtcSetDisplacementFunction (scene,geomID,displacementFunction,&bounds);
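To make the idea of a bounded displacement concrete, here is a self-contained sketch that runs without Embree. The callback keeps the parameter list shown on the slide; everything else is an assumption for illustration: the `displacement` function, the constant `kMaxDisplacement`, the `Box` type and `conservativeBounds` are hypothetical. The slides do not spell out how Embree interprets the bounds argument, so this only shows one general way to derive a conservative box when the displacement magnitude has a known limit.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical upper limit on the displacement distance.
const float kMaxDisplacement = 0.05f;

// Hypothetical displacement with |D| <= kMaxDisplacement for all (u, v).
inline float displacement(float u, float v) {
  return kMaxDisplacement * std::sin(20.0f * u) * std::sin(20.0f * v);
}

// Callback with the parameter list from the slide: moves each of the N
// points along its normal by the displacement value.
void displacementFunction(void* /*ptr*/, int /*geomID*/, int /*primID*/,
                          const float* u, const float* v,
                          const float* nx, const float* ny, const float* nz,
                          float* px, float* py, float* pz,
                          std::size_t N) {
  for (std::size_t i = 0; i < N; i++) {
    float D = displacement(u[i], v[i]);
    px[i] += D * nx[i];
    py[i] += D * ny[i];
    pz[i] += D * nz[i];
  }
}

// Hypothetical axis-aligned box; not Embree's BBox3fa.
struct Box { float lower[3], upper[3]; };

// Grows the bounds of the undisplaced points by the maximal displacement in
// every direction. With unit-length normals, every displaced point stays
// inside the grown box.
inline Box conservativeBounds(Box undisplaced) {
  for (int k = 0; k < 3; k++) {
    undisplaced.lower[k] -= kMaxDisplacement;
    undisplaced.upper[k] += kMaxDisplacement;
  }
  return undisplaced;
}
```

Because the box is known without evaluating the callback, a builder could in principle postpone the evaluation until a ray actually reaches the box, which is the lazy evaluation the previous slide refers to.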

Page 49: Embree Ray Tracing Kernels

Summary

Embree delivers highest ray tracing performance on CPUs

Embree can speed up many ray tracing applications

Embree is easy to use through its API

Subdivision surface support compatible with OpenSubdiv 3.0

Free and Open Source (https://embree.github.com)

Page 50: Embree Ray Tracing Kernels

Questions?

https://[email protected]

8/18/2015

Page 51: Embree Ray Tracing Kernels

Page 52: Embree Ray Tracing Kernels

Technical Overview

Page 53: Embree Ray Tracing Kernels

Monte Carlo Ray Tracing

Stochastic integration of all potential light paths between source and pixel

Follows light backwards from pixel to light source

Produces incoherent ray distributions

[Figure: light path between Light and Pixel]
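As a small illustration of stochastic integration, the sketch below estimates the integral of cos(theta) over the hemisphere, whose exact value is pi, by averaging randomly sampled directions. It is a toy under stated assumptions, not a renderer: the function name `estimateCosineIntegral` is hypothetical, directions are sampled uniformly over the hemisphere, and the incoming light is taken to be constant. In a path tracer each sample would instead trace a ray in the sampled direction.

```cpp
#include <cmath>
#include <random>

// Monte Carlo estimate of the integral of cos(theta) over the hemisphere.
// Each sample is a uniformly distributed direction; the integrand is divided
// by the sampling density and the results are averaged.
inline double estimateCosineIntegral(int numSamples, unsigned seed) {
  const double kPi = 3.14159265358979323846;
  std::mt19937 rng(seed);
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  const double pdf = 1.0 / (2.0 * kPi);  // uniform density over the hemisphere
  double sum = 0.0;
  for (int i = 0; i < numSamples; i++) {
    // Uniform hemisphere sampling: z = cos(theta) is uniform in [0, 1].
    double z = uniform(rng);
    double phi = 2.0 * kPi * uniform(rng);
    double r = std::sqrt(1.0 - z * z);
    double dir[3] = {r * std::cos(phi), r * std::sin(phi), z};
    // A renderer would trace a ray along dir here; this toy uses constant light.
    sum += dir[2] / pdf;
  }
  return sum / numSamples;
}
```

Successive samples point in unrelated directions, which is one way to see why the rays of a Monte Carlo renderer tend to be incoherent.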

Page 54: Embree Ray Tracing Kernels

Two Kinds of Ray Distributions

Incoherent Rays (typical for Monte Carlo)

Coherent Rays

Page 55: Embree Ray Tracing Kernels

BVH Acceleration Structure

Page 60: Embree Ray Tracing Kernels

Solution Space for Vectorized Ray Tracing

              Single Box          Multi Box
Single Ray    Scalar Traversal    Single Ray SIMD Traversal
Multi Ray     Packet Traversal    Independent Ray Traversal

Page 61: Embree Ray Tracing Kernels

BVH4 Spatial Index Structure

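struct Node4 {
  ssef minx, miny, minz;   // lower bounds of the four child boxes, one SSE lane per child
  ssef maxx, maxy, maxz;   // upper bounds of the four child boxes
  Node4* child[4];
};

// Four triangles in structure-of-arrays layout, one SSE lane per triangle
struct Triangle4 {
  ssef v0x,v0y,v0z;
  ssef e1x,e1y,e1z;
  ssef e2x,e2y,e2z;
  ssef Nx,Ny,Nz;
  ssei id0,id1;
};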

Page 62: Embree Ray Tracing Kernels

BVH4 Traversal

For each dimension:

Intersect ray with near plane of each box in SIMD

Intersect ray with far plane of each box in SIMD

Clip the near and the far parameters (box hit if near <= far)

[Figure: ray crossing four boxes, with the near and far intersection parameters labeled per box (near1 to near4, far1 to far4)]
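The three steps above can be sketched as a slab test of one ray against four boxes. This is a portable illustration, not Embree's kernel: it uses plain 4-element arrays in place of SSE registers, so the inner loop over lanes stands in for a single SIMD instruction. `Box4`, `Ray` and `intersectBox4` are hypothetical names, and the ray is assumed to have no zero direction component so that the reciprocal direction is finite.

```cpp
#include <algorithm>

// Hypothetical 4-wide box set in structure-of-arrays layout (lane i = box i).
struct Box4 {
  float minx[4], miny[4], minz[4];
  float maxx[4], maxy[4], maxz[4];
};

// Hypothetical ray with precomputed reciprocal direction.
struct Ray {
  float org[3];
  float rdir[3];  // 1/direction per dimension, assumed finite
  float tnear, tfar;
};

// Returns a bitmask with bit i set if the ray hits box i.
inline int intersectBox4(const Box4& b, const Ray& ray) {
  const float* lower[3] = {b.minx, b.miny, b.minz};
  const float* upper[3] = {b.maxx, b.maxy, b.maxz};
  float tNear[4], tFar[4];
  for (int i = 0; i < 4; i++) { tNear[i] = ray.tnear; tFar[i] = ray.tfar; }

  for (int d = 0; d < 3; d++) {
    // The direction sign decides which plane is entered first.
    const float* nearPlane = ray.rdir[d] >= 0.0f ? lower[d] : upper[d];
    const float* farPlane  = ray.rdir[d] >= 0.0f ? upper[d] : lower[d];
    for (int i = 0; i < 4; i++) {  // one SIMD operation in a vectorized kernel
      tNear[i] = std::max(tNear[i], (nearPlane[i] - ray.org[d]) * ray.rdir[d]);
      tFar[i]  = std::min(tFar[i],  (farPlane[i]  - ray.org[d]) * ray.rdir[d]);
    }
  }

  int hitmask = 0;
  for (int i = 0; i < 4; i++)
    if (tNear[i] <= tFar[i]) hitmask |= 1 << i;  // box hit if near <= far
  return hitmask;
}
```

The returned mask plays the role of a movemask result: one bit per box, ready to be scanned for the children that need to be visited.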

Page 63: Embree Ray Tracing Kernels

Optimizing BVH4 Traversal for CPUs

• Reduce number of executed instructions

• Reduce data dependencies of critical paths

• Take advantage of special instructions (e.g. SSE, bitscan, etc.)

• Optimize most frequently executed code paths

Page 64: Embree Ray Tracing Kernels

Optimizing BVH4 Traversal for CPUs (SSE)

• Load front/back plane based on direction sign of the ray.

• Balanced min/max trees

• Bitscans to iterate through hit children

• Early exit for 0 children hit (20%)

• 1 child hit (50%): keep next node in register (instead of push/pop sequence)

• 2 children hit (20%): keep next node in register, sort using a branch

while (true) {
  if (isLeaf(node)) goto leaf;
  ssef nearX = (norg.x + node[nearX]) * rdir.x;
  ssef nearY = (norg.y + node[nearY]) * rdir.y;
  ssef nearZ = (norg.z + node[nearZ]) * rdir.z;
  ssef farX = (norg.x + node[farX ]) * rdir.x;
  ssef farY = (norg.y + node[farY ]) * rdir.y;
  ssef farZ = (norg.z + node[farZ ]) * rdir.z;
  ssef near = max(max(nearX,nearY), max(nearZ,ray.near));
  ssef far = min(min(farX,farY), min(farZ,ray.far));
  int hitmask = movemask(near <= far);
  if (hitmask == 0) goto pop;
  int c = bitscan(hitmask);
  hitmask = clearbit(hitmask,c);
  if (hitmask == 0) {
    node = node.child[c]; continue;
  }
  …
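The bitscan-based iteration over hit children can be sketched in isolation. The helpers below are portable stand-ins written for this example, not the intrinsics a tuned kernel would use: `bitscan` and `clearbit` mirror the names in the slide's pseudocode, while `selectChildren` and its stack of child indices are hypothetical. The distance sort the slide mentions for the two-children case is left out to keep the sketch short.

```cpp
// Index of the lowest set bit; the mask must be non-zero.
// A tuned kernel would map this to a single bit-scan instruction.
inline int bitscan(unsigned mask) {
  int index = 0;
  while ((mask & 1u) == 0u) { mask >>= 1; ++index; }
  return index;
}

// Clears bit i of the mask.
inline unsigned clearbit(unsigned mask, int i) { return mask & ~(1u << i); }

// Picks the child to visit next from a hit mask and defers the others.
// Returns -1 if no child was hit (the caller would pop from its stack),
// otherwise the index of the first hit child, which stays "in register".
// Any further hit children are pushed onto the stack of child indices.
inline int selectChildren(unsigned hitmask, int* stack, int& stackSize) {
  if (hitmask == 0) return -1;          // early exit: 0 children hit
  int next = bitscan(hitmask);
  hitmask = clearbit(hitmask, next);
  while (hitmask != 0) {                // skipped entirely when 1 child is hit
    int c = bitscan(hitmask);
    hitmask = clearbit(hitmask, c);
    stack[stackSize++] = c;
  }
  return next;
}
```

In the most common case on the slide (exactly one child hit) the loop body never runs and nothing touches the stack, which is the point of keeping the next node in a register.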