Embree Ray Tracing Kernels
-
Upload
intel-software -
Category
Software
-
view
1.218 -
download
1
Transcript of Embree Ray Tracing Kernels
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Embree Ray Tracing KernelsSven Woop
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS. Intel may make changes to specifications and product descriptions at any time, without notice.All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.Optimized Intel® HD Graphics P3000 only available on select models of the Intel® Xeon® processor E3 family. To learn more about Intel Xeon processors for workstation visit www.intel.com/go/workstation.HD Graphics P4000 introduces four additional execution units, going from 8 in the HD P3000 to 12 in the HD P4000. Optimized Intel® HD Graphics P4000 only available on select models of the Intel® Xeon® processor E3-1200 v2 product family. For more information, visithttp://www.intel.com/content/www/us/en/architecture-and-technology/hdgraphics/hdgraphics-developer.htmlIris™ graphics is available on select systems. Consult your system manufacturer.Any code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user.Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps. Performance claims: Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to : http://www.Intel.com/performance
Legal
8/18/20152
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Copyright © , Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon, Core, VTune, and Cilk are trademarks of Intel Corporation in the U.S. and other countries.
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
Legal Disclaimer and Optimization Notice
3
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Embree Overview
Embree Performance
Embree API
Catmull Clark Subdivision Surfaces
Outline
8/18/20154
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Embree Overview
8/18/20156
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.8/18/2015
• Movie industry transitioning to ray tracing (better image quality, faster feedback)
• High quality rendering for commercials, prints, etc.
• Provides higher fidelity for virtual design (automotive industry, architectural design, etc.)
• Various kind of simulations (lighting, sound, particles, collision detection, etc.)
• Prebaked lighting in games
• etc.
Usage of Ray Tracing Today
7
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Need to multi-thread: easy for rendering but difficult for hierarchy construction
Need to vectorize: efficient use of SIMD units, different ISAs (SSE, AVX, AVX2, AVX-512, KNCNI)
Need deep domain knowledge: many different data structures (kd-trees, octrees, grids, BVH2, BVH4, ..., hybrid structures) and algorithms (single rays, packets, large packets, stream tracing, ...) to choose
Need to support different CPUs: Different ISAs/CPU types favor different data structures, data layouts, and algorithms
Writing a Fast Ray Tracer is Difficult
8/18/20158
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Observations
8/18/2015
Ray tracers are often not sufficiently optimized
Ray traversal consumes a lot of cycles of renderer (often over 70%)
Ray tracing can be expressed by small number of commonly used operations (build and traversal)
Ray tracing kernel library has potential to speed up many rendering applications
9
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Provides highly optimized and scalable Ray Tracing Kernels (data structure build and ray traversal)
Highest ray tracing performance (1.5x – 6x speedup reported by users)
Support for latest CPUs (e.g. AVX512 support)
Targets application developers in professional rendering environment
API for easy integration into applications
Free and Open Source under Apache 2.0 license (http://embree.github.com)
Embree Ray Tracing Kernels
8/18/201510
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Find closest and any hit kernel (rtcIntersect, rtcOccluded)
Single Rays and Ray Packets (4, 8, 16)
High quality and high performance hierarchy builders
Intel® SPMD Program Compiler (ISPC) supported
Triangles, Instances, Hair, Catmull Clark Subdivision Surfaces, Displacement Mapping
Extensible (User Defined Geometry, Intersection filter functions, Open Source)
SSE, AVX, AVX2, and AVX512 support
Embree Features
8/18/201511
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Catmull Clark Subdivision Surfaces
– Smooth surface primitive
Vector Displacement Mapping
– Add geometric detail
Initial AVX512 support
– 16 wide AVX512 traversal kernels
New Embree Features
8/18/201512
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Embree System Overview
8/18/201513
Embree API (C++ and ISPC)
Ray Tracing Kernel Selection
Accel. structure
bvh4.triangle4,bvh8.triangle8,
bvh4aos.triangle1,bvh4.grid
…
Builders
SAH builderSpatial split builder
Morton codebuilder
BVH Refitter
Traversal
Single ray (SSE2, AVX, AVX2),
packet (SSE2), hybrid
(SSE4.2), ...
Common Vector and SIMD Library
(Vec3f, Vec3fa, float4, float8, float16, SSE2, SSE4.1, AVX, AVX2, AVX512)
Intersection
MöllerTrumbore,Plücker Variant,
Bezier Curve,Triangle Grids
Subdiv
Engine
B-Spline PatchGregory Patch
TessellationCacheDispl. Mapping
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
High ray tracing performance for photorealistic rendering
Large memory capacity to render really complex models
Robust tools to develop and debug rendering application
Complex shading and rendering applications are executed efficiently (e.g. light cuts with large per pixel state)
Why Ray Tracing on CPUs?
8/18/201515
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.8/18/201516
Hides complexity of writing high performance ray tracing kernels gives you more time developing your renderer
High performance on latest Intel® Xeon® Processor family and Intel® Xeon Phi™ coprocessor products
Embree always up to date with latest ISA instruction sets
High potential performance gain (1.5x – 6x rendering speedup reported by Embree users)
Why should I use Embree?
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
As a benchmark to identify performance issues in existing applications
Adopt algorithms from Embree to your code
– However Embree internals change frequently!
As a library through the Embree API (recommended)
– Benefit from future Embree improvements!
How can I use Embree?
8/18/201517
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Embree v2.6.1 Performance
19
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Models and illumination effects representative for professional rendering environment
Path tracer with different material types, different light types, about 2000 lines of code
Evaluation on typical Intel® Xeon® rendering workstation* and Intel® Xeon Phi™ Coprocessor**
Compare against state of the art GPU*** methods (using OptiX™ 3.8.0 and CUDA® 7.0.28)
Identical implementations in ISPC (Xeon®), ISPC (Xeon Phi™), OptiX™ (GTX™ Titan X)
Performance Methology
20
Imperial Crown of Austria4.3M triangles
Bentley 4.5l Blower (1927)2.3M triangles
Asian Dragon12.3M triangles
* Dual Socket Intel® Xeon® E5-2699 v3 2x18 cores @ 2.30GHz ** Intel® Xeon Phi™ 7120, 61 cores @ 1.238 GHz *** NVIDIA® GeForce® GTX™ Titan X
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Build Performance for Static Scenes
40 41 4532.3 31.7 35.1
0
50
100
150
Intel® Xeon® E5-2699 v3Processor2 x 18 cores, 2.3 GHz
Intel® Xeon Phi™ 7120Coprocessor61 cores, 1.28 GHz
21
Mill
ion
Tri
angl
es/S
eco
nd
SAH Build (high quality)
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Build Performance for Dynamic Scenes
112 108 105
160.1140.4
162.1
0
50
100
150
Intel® Xeon® E5-2699 v3Processor2 x 18 cores, 2.3 GHz
Intel® Xeon Phi™ 7120Coprocessor61 cores, 1.28 GHz
22
Mill
ion
Tri
angl
es/S
eco
nd
Morton Build
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Ray Tracing Performance (incl. Shading)
107.2
129.6 134.98
64.9675.36
82.62
29.472 35.04 38.76
0
20
40
60
80
100
120
140Intel® Xeon® E5-2699 v3Processor2 x 18 cores, 2.3 GHz
Intel® Xeon Phi™ 7120Coprocessor61 cores, 1.28 GHz
NVIDIA® GeForce® GTX™ Titan X Coprocessor12 GB RAM
23
Mill
ion
Ray
s/Se
con
d
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Embree API
8/18/201524
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Version 2 of the Embree API
Compact and easy to use
C++ and ISPC version
Hides implementation details(e.g. different spatial index structures)
Embree API Overview
8/18/201525
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Scene is container for set of geometries
Scene flags passed at creation time
Scene geometry changes have to get commited (rtcCommit) which triggers BVH build
Scene Object
26
/* include embree headers */#include <embree2/rtcore.h>
int main () {/* initialize at application startup */rtcInit ();
/* create scene */RTCScene scene = rtcNewScene(RTC_SCENE_STATIC,RTC_INTERSECT1);
/* add geometries */... later slide ...
/* commit changes */rtcCommit (scene);
/* trace rays */... later slide ...
/* cleanup at application exit */rtcExit ();
}
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Static Scenes
– Geometry cannot get changed
– High quality BVH build (SAH) faster ray traversal
– For final frame rendering
Dynamic Scenes
– Geometries can get added, modified, and removed
– Faster build (Morton) slower ray traversal
– Preview mode during geometric modeling
Scene Types
27
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Triangle Mesh
Contains vertex and index buffers
Number of triangles and vertices set at creation time
Linear motion blur supported (2 vertex buffers)
/* add mesh to scene */
unsigned int geomID = rtcNewTriangleMesh
(scene, numTriangles, numVertices, 1);
/* fill data buffers */
... later slide ...
/* add more geometries */
...
/* commit changes */
rtcCommit (scene);
29
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Recommended to use buffer sharing
Reduces memory consumption
Application manages buffers (buffer has to stay alive as long as geometry is alive)
Support for stride and offset allows application flexibility in its data layout
Buffer Sharing
30
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Buffer Sharing Example
/* application vertex and index layout */
struct Vertex { float x,y,z,s,t; };
struct Triangle { int materialID, v0, v1, v2; };
/* share buffers with application */
rtcSetBuffer(scene,geomID,RTC_VERTEX_BUFFER,vertexPtr,0,sizeof(Vertex));
rtcSetBuffer(scene,geomID,RTC_INDEX_BUFFER ,indexPtr ,4,sizeof(Triangle));
31
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Tracing Rays
rtcIntersect (scene, ray) reports first intersection
rtcOccluded (scene, ray) reports any intersection
Packet versions for ray packets of size 4,8, and 16
32
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
rtcIntersect: Ray Structure Inputs
Ray origin and direction (org, dir)
Ray interval (tnear, tfar)
Time used for motion blur [0,1]
struct RTCRay
{
Vec3f org;
Vec3f dir;
float tnear;
float tfar;
float time;
Vec3f Ng;
float u;
float v;
int geomID;
int primID;
int instID;
}
33
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
rtcIntersect: Ray structure Outputs
Hit distance (tfar) Unnormalized geometry normal (Ng) Local hit coordinates (u,v) Geometry identifier of hit geometry
(geomID) Index of hit primitive of geometry
(primID) Geometry identifier of hit instance
(instID) No shading normals, texture
coordinates, etc.
struct RTCRay
{
Vec3f org;
Vec3f dir;
float tnear;
float tfar;
float time;
Vec3f Ng;
float u;
float v;
int geomID;
int primID;
int instID;
}
34
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
35
Simplifies writing vectorized renderer
C-based language plus vector extensions
Scalar looking code that gets vectorized automatically
Guaranteed vectorization
Compilation to different vector ISAs (SSE, AVX, AVX2, AVX512, Xeon Phi™)
Available as Open Source from http://ispc.github.com
Intel® SPMD Program Compiler (ISPC)
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
/* loop over all screen pixels */
foreach (y=0 ... screenHeight-1, x=0 ... screenWidth-1)
{
/* create and trace primary ray */
RTCRay ray = make_Ray(p,normalize(x*vx + y*vy + vz),eps,inf);
rtcIntersect(scene,ray);
/* environment shading */
if (ray.geomID == RTC_INVALID_GEOMETRY_ID) {
pixels[y*screenWidth+x] = make_Vec3f(0.0f); continue;
}
/* calculate hard shadows */
RTCRay shadow = make_Ray(ray.org+ray.tfar*ray.dir,neg(lightDir),eps,inf);
rtcOccluded(scene,shadow);
if (shadow.geomID == RTC_INVALID_GEOMETRY_ID)
pixels[y*width+x] = colors[ray.primID]*(0.5f + clamp(-dot(lightDir,normalize(ray.Ng)),0.0f,1.0f));
else
pixels[y*width+x] = colors[ray.primID]*0.5f;
}
Embree Rendering: ISPC Example
36
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Dynamic Scenes
Create scene with RTC_SCENE_DYNAMIC flag
Report modified meshes with rtcUpdate call
Possibly enable (rtcEnable), disable (rtcDisable), add (rtcNewXX), and delete (rtcDeleteGeometry) geometries
for each frame
{
for each dynamic mesh
{
/* modify shared buffers */
modify mesh->indices
modify mesh->vertices
/* signal mesh update */
rtcUpdate(scene,mesh);
}
/* commit changes */
rtcCommit (scene);
/* trace rays */
...
}
37
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Per geometry callback that is called during traversal for each primitive intersection
Callback can accept or reject hit
Can be used for:– Trimming curves (e.g. modeling tree leaves)
– Transparent shadows (reject and accumulate)
– Find all hits (reject and collect)
Intersection Filter Functions
42
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
/* procedural intersection filter function */
void intersectionFilter(void* userPtr, RTCRay& ray)
{
Vec3fa h = ray.org + ray.dir*ray.tfar;
float v = abs(sin(4.0f*h.x)*cos(4.0f*h.y)*sin(4.0f*h.z));
float T = clamp((v-0.1f)*3.0f,0.0f,1.0f);
if (T > 1.0f) return; // accept hit
ray.geomID = RTC_INVALID_GEOMETRY_ID; // reject hit
}
/* set intersection filter for the cube */
rtcSetIntersectionFilterFunction(scene, geomID, (RTCFilterFunc)&intersectionFilter);
rtcSetOcclusionFilterFunction (scene, geomID, (RTCFilterFunc)&intersectionFilter);
rtcSetUserData (scene, geomID, NULL);
Filter Function Example
43
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Hair curves represented as cubic bezier curves with varying radius
High performance through use of oriented bounding boxes
Low memory consumption through direct ray/curve intersection
Hair Geometry
44
p0/r0 p1/r1
p2/r2
p3/r3
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Catmull Clark Subdivision Surfaces
8/18/201545
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Converts coarse mesh into smooth surface by subdivision
Generalization of bi-cubic B-Spline surfaces to arbitrary topology
Embree is compatible with OpenSubdiv 3.0
Catmull Clark Subdivision Surfaces
46 46
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Catmull Clark Subdivision
47
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Low resolution base mesh controls high resolution surface
Smoothness always guaranteed (C2 continous almost everywhere)
Support for arbitrary topology(no trimming required as with NURBS)
Creases allow introducing sharp features
Support in most modeling tools
Established as standard in movie production
CC Subdivision Surface Advantages
Inside-Out (2015)Pixar, Walt Disney Pictures
48
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Semi-sharp edge creases
Semi-sharp vertex creases
Vertex attribute interpolation
Tessellation level per edge
Non-manifolds and Holes
Boundary modes
Triangles, Quads, Pentagons, ...
Displacement mapping
Embree Subdivision Features
4949
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Embree Subdivision Example
50
unsigned geomID = rtcNewSubdivisionMesh (scene, RTC_GEOMETRY_STATIC,
numFaces, numIndices, numVertices,
numEdgeCreases, numVertexCreases, numHoles);
rtcSetBuffer (scene,geomID,RTC_VERTEX_BUFFER, vertices, 0, sizeof(float3));
rtcSetBuffer (scene,geomID,RTC_INDEX_BUFFER , indices, 0, sizeof(int));
rtcSetBuffer (scene,geomID,RTC_FACE_BUFFER , faces, 0, sizeof(int));
rtcSetBuffer (scene,geomID,RTC_LEVEL_BUFFER , levels, 0, sizeof(float));
rtcSetBuffer (scene,geomID,RTC_EDGE_CREASE_INDEX_BUFFER,...);
rtcSetBuffer (scene,geomID,RTC_EDGE_CREASE_WEIGHT_BUFFER,...);
rtcSetBuffer (scene,geomID,RTC_VERTEX_CREASE_INDEX_BUFFER,...);
rtcSetBuffer (scene,geomID,RTC_VERTEX_CREASE_WEIGHT_BUFFER,...);
rtcSetBuffer (scene,geomID,RTC_HOLE_BUFFER,holes,0,sizeof(char));
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Feature adaptive subdivision to evaluate patch (subdivides only at irregular vertices and crease features)
Fast path for B-Spline patches and Gregory patches
Tessellation Cache limits memory consumption (trade memory for performance)
Embree Subdivision Implemention
51
Feature adaptive subdivisioninto B-Spline patches (green) and
Gregory Patches (blue)
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Embree Subdivision Performance
Patches 16 52k 53k
Edge Creases 0 0 30k
Micro Quads 1048k 831k 837k
Walkthrough 32 fps 36 fps 23 fps
Same View 66 fps 51 fps 56 fps
Intel® Xeon® E5-26902.9 GHz2x 8 cores1024 x 1024 pixels
52
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Interpolates arbitrary user data over geometries (non-trivial for subdivision geometries)
Interpolated data P as well as dPdu and dPdv can be calculated at arbitrary location
Enables smooth normals and anisotropic texture lookups
Different rules for interpolation of texture coordinates supported (by evaluation of second subdiv mesh)
Vertex Data Interpolation
53
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Vertex Data Interpolation Example
rtcNewScene (RTC_STATIC, RTC_INTERSECT1 | RTC_INTERPOLATE);
...
unsigned geomID = rtcNewSubdivisionMesh (...);
rtcSetBuffer (scene,geomID,RTC_INDEX_BUFFER, indices, 0, sizeof(int));
rtcSetBuffer (scene,geomID,RTC_VERTEX_BUFFER, vertices, 0, sizeof(float3));
rtcSetBuffer (scene,geomID,RTC_USER_VERTEX_BUFFER, vertex_colors, 0, sizeof(float3));
...
rtcCommit (scene);
...
rtcIntersect (scene, ray);
...
float3 P, dPdu, dPdv;
rtcInterpolate (scene, geomID, primID, ray.u,ray.v, RTC_VERTEX_BUFFER, &P, &dPdu, &dPdv, 3);
float3 color;
rtcInterpolate (scene, geomID, primID, ray.u,ray.v, RTC_USER_VERTEX_BUFFER, &color, 0,0, 3);
54
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Displaced Subdivision Surface
55
Support for vector displacement Callback function displaces vertex
positions Bounded displacement allows for lazy
evaluation Smooth normals possible through
approximation
Q = P + D*NgdQdu ≈ dPdu + dDdu*NgdQdv ≈ dPdv + dDdv*Ng
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
void displacementFunction(
void* ptr, int geomID, int primID,
const float* u, const float* v,
const float* nx, const float* ny, const float* nz,
float* px, float* py, float* pz,
size_t N)
{
for (size_t i = 0; i<N; i++) {
float D = displacement(...);
px[i] += D*nx[i];
py[i] += D*ny[i];
pz[i] += D*nz[i];
}
}
BBox3fa bounds(...);
rtcSetDisplacementFunction (scene,geomID,displacementFunction,&bounds);
Displaced Subdivision Surface Example
56
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
57
Embree delivers highest ray tracing performance on CPUs
Embree can speed up many ray tracing applications
Embree is easy to use through its API
Subdivision surface support compatible to OpenSubdiv 3.0
Free and Open Source (https://embree.github.com)
Summary
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Questions?
https://[email protected]
8/18/201558
C o p y r i g h t © 2 0 1 5 , I n t e l C o r p o r a t i o n . A l l r i g h t s r e s e r v e d . * O t h e r n a me s a n d b r a n d s ma y b e c l a i me d a s t h e p r o p e r t y o f o t h e r s .
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
60
Technical Overview
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Stochastic integration of all potential light paths between source and pixel
Follows light backwards from pixel to light source
Produces incoherent ray distributions
Monte Carlo Ray Tracing
Light
Pixel
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Two Kinds of Ray Distributions
Incoherent Rays(typical for Monte Carlo)
Coherent Rays
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
BVH Acceleration Structure
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
BVH Acceleration Structure
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
BVH Acceleration Structure
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
BVH Acceleration Structure
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
BVH Acceleration Structure
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
Solution Space for Vectorized Ray Tracing
Single RaySIMD
Traversal
ScalarTraversal
PacketTraversal
IndependentRay TraversalMulti Ray
Single Ray
Single Box Multi Box
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
BVH4 Spatial Index Structure
struct Node4 {
ssef minx, miny, minz;
ssef maxx, maxy, maxz;
Node4* child[4];
}
struct Triangle4 {
ssef v0x,v0y,v0z;
ssef e1x,e1y,e1z;
ssef e2x,e2y,e2z;
ssef Nx,Ny,Nz;
ssei id0,id1;
}
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
For each dimension:
Intersect ray with near plane of each box in SIMD
Intersect ray with far plane of each box in SIMD
Clip the near and the far parameters (box hit if near <= far)
BVH4 Traversal
near4
near1near2
near3
far1far1
far3 far4
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
• Reduce number of executed instructions
• Reduce data dependencies of critical paths
• Take advantage of special instructions (e.g. SSE, bitscan, etc.)
• Optimize most frequently executed code paths
Optimizing BVH4 Traversal for CPUs
Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.Copyright © 2015, Intel Corporat ion. Al l r ights reserv ed. *Other names and brands may be claimed as the property of others.
• Load front/back plane based on direction sign of the ray.
• Balanced min/max trees
• Bitscans to iterate through hit children
• Early exit for 0 children hit (20%)
• 1 child hit (50%): keep next node in register (instead of push/pop sequence)
• 2 children hit (20%): keep next node in register, sort using a branch
Optimizing BVH4 Traversal for CPUs (SSE)while (true) {
if (isLeaf(node)) goto leaf;ssef nearX = (norg.x + node[nearX]) * rdir.x;ssef nearY = (norg.y + node[nearY]) * rdir.y;ssef nearZ = (norg.z + node[nearZ]) * rdir.z;ssef farX = (norg.x + node[farX ]) * rdir.x;ssef farY = (norg.y + node[farY ]) * rdir.y;ssef farZ = (norg.z + node[farZ ]) * rdir.z;ssef near = max(max(nearX,nearY),
max(nearZ,ray.near));ssef far = min(min(farX,farY),
min(farZ,ray.far));int hitmask = movemask(near <= far);if (hitmask == 0) goto pop;int c = bitscan(hitmask);hitmask = clearbit(hitmask,c); if (hitmask == 0) {
node = node.child[c]; continue;} …