Efficient Clustered BVH Update Algorithm for Highly

40
RT 08 Symposium on Interactive Ray Tracing 2008 Los Angeles, California Efficient Clustered BVH Update Algorithm for Highly-Dynamic Models Kirill Garanzha Department of Software for Computers Bauman Moscow State Technical University, Russia

Transcript of Efficient Clustered BVH Update Algorithm for Highly

Page 1: Efficient Clustered BVH Update Algorithm for Highly

RT 08Symposium on Interactive Ray Tracing 2008

Los Angeles, California

Efficient Clustered BVH Update Algorithm

for Highly-Dynamic Models

Kirill GaranzhaDepartment of Software for Computers

Bauman Moscow State Technical University, Russia

Page 2: Efficient Clustered BVH Update Algorithm for Highly

RT 08BVH summary

Advantages:BVH is the tree that encloses scene objects

Advantages:Fast refittingPredictable memoryPredictable memory consumptionFastest empty spaceFastest empty space passing for SAH trees

Disadvantages:Disadvantages:Not ordered traversal of the ray in the spaceT lit d d ti ft fittiTree-quality degradation after refittingNot very fast SAH-based build

Page 3: Efficient Clustered BVH Update Algorithm for Highly

RT 08Dynamic BVH related

Fast BVH build assume some treeFast BVH build assume some tree-quality degradation for ray tracing

BIH [Wächter EGSR06]BIH [Wächter EGSR06]Build from hierarchy [Hunt RT07]SAH binning [Wald RT07]Selective restructuring [Yoon EGSR07]Selective restructuring [Yoon EGSR07]

Page 4: Efficient Clustered BVH Update Algorithm for Highly

RT 08Dynamic BVH related

Fast BVH build assume some treeFast BVH build assume some tree-quality degradation for ray tracing

In general these approaches apartIn general these approaches apart from “Selective restructuring” produce splits for every BVH-node in everysplits for every BVH-node in every frame – brute force

Page 5: Efficient Clustered BVH Update Algorithm for Highly

RT 08Dynamic BVH related

How to get rid of brute force?How to get rid of brute force?Lazy buildLazy buildSelective restructuringAvoid full reconstruction of nodes without strong dynamism under themwithout strong dynamism under them

Page 6: Efficient Clustered BVH Update Algorithm for Highly

RT 08IdeaTo measure cheaply the dynamism under node while refitting the BVHunder node while refitting the BVHTo apply the cheapest update t h i f th d th ttechnique for the node that saves good SAH cost:

1. Leave it refitted2. Relocate BVH-cluster in proper position

f th tof the tree3. Rebuild the structure under node by

exploiting SAH binning existingexploiting SAH-binning, existing hierarchy, degree of dynamism

Page 7: Efficient Clustered BVH Update Algorithm for Highly

RT 08Dynamism detection

The type of dynamism is detected in theThe type of dynamism is detected in the bottom-up refitting process for every affected node:affected node:

Migrating node represents a BVH-cluster of coherently moving trianglesExploded node represents a cluster ofExploded node represents a cluster of strong triangle explosions and growing overlapsgrowing overlaps

Page 8: Efficient Clustered BVH Update Algorithm for Highly

RT 08Detect migrating nodeExample:

B

C …

D

A…

C

……

E

Nodes B and C are moving They representNodes B and C are moving. They represent clusters of coherently moving triangles

Page 9: Efficient Clustered BVH Update Algorithm for Highly

RT 08Detect migrating nodeExample:

B…

D

A C

… …

E

The SAH cost of the node (B+C) is improvedThe SAH cost of the node (B+C) is improved. But nodes B and C should be united with others

Page 10: Efficient Clustered BVH Update Algorithm for Highly

RT 08Detect migrating nodeExample:

B…

D

A C

… …

E

Like nowLike now…

Page 11: Efficient Clustered BVH Update Algorithm for Highly

RT 08dM

NstSavedSAHCoNSAHCost

−< 1)(

)(

Detect migrating nodeFormula:

if

N d N t i tithen NL and NR represent migrating clusters and should be relocated

NL and NR are children of NdM is the predefined thresholddM is the predefined threshold

Page 12: Efficient Clustered BVH Update Algorithm for Highly

RT 08Detect exploded nodedM

NstSavedSAHCoNSAHCost

−< 1)(

)(

Example:

NL

NR

The overlap between N and N is likely to growThe overlap between NL and NR is likely to grow if SAH cost of N is increasing

Page 13: Efficient Clustered BVH Update Algorithm for Highly

RT 08dM

NstSavedSAHCoNSAHCost

−< 1)(

)(

Detect exploded nodeExample:

NL

NR

N is worth to rebuild if there is a big % of brokenN is worth to rebuild if there is a big % of broken nodes under it

Page 14: Efficient Clustered BVH Update Algorithm for Highly

RT 08dM

NstSavedSAHCoNSAHCost

−< 1)(

)(

Detect exploded nodeExample:

NL

NR

Broken node – migrating or exploded BrokenBroken node migrating or exploded. Broken count measures the dynamism under node

Page 15: Efficient Clustered BVH Update Algorithm for Highly

RT 08dM

NstSavedSAHCoNSAHCost

−< 1)(

)(

Detect exploded nodeFormula:

Th i bi l b t N d N

if

There is big overlap between NL and NR

There is big % of broken nodes under NdM

NstSavedSAHCoNSAHCost

−< 1)(

)( dMNstSavedSAHCo

NSAHCost−< 1

)()(if

g

th The cluster with root Nthen The cluster with root Nshould be restructured

NL and NR are children of NdE is the predefined threshold

Page 16: Efficient Clustered BVH Update Algorithm for Highly

RT 08Refitting phase

The bottom-up refitting phase:The bottom-up refitting phase:1. Updates BVs2. Detects the dynamism3 Accumulates broken counts3. Accumulates broken counts

At the output it produces 2 arrays with migrating and exploded nodeswith migrating and exploded nodes

Page 17: Efficient Clustered BVH Update Algorithm for Highly

RT 08Independent clusters

How to obtain a lot of smaller exploded clusters to perform independentto perform independent rebuilds on them?

Page 18: Efficient Clustered BVH Update Algorithm for Highly

RT 08Independent clusters

Wh l d d d N i d t t dWhen exploded node N is detected:Lock it. N will be considered as atomic for rebuild process on a cluster above it.Unlock all exploded descendants of NUnlock all exploded descendants of Nthat intersect the BV of the overlap b t dbetween NL and NR

Page 19: Efficient Clustered BVH Update Algorithm for Highly

RT 08Independent clusters

N

Locked

NL

LockedL k d

Locked

Locked NR

Locked exploded descendants will be rebuilt independentl and ill be considered as atomicindependently and will be considered as atomic for some rebuild process above them

Page 20: Efficient Clustered BVH Update Algorithm for Highly

RT 08Independent clusters

N

Locked

NL

LockedL k d

Locked

Locked NR

But if some of them intersect the BV of the overlap within some ancestor N then they createoverlap within some ancestor N then they create obstacles in the process eliminating the overlap

Page 21: Efficient Clustered BVH Update Algorithm for Highly

RT 08Independent clusters

N

Locked

NL

LockedL k d

Locked

Locked NR

So they should be unlocked

Page 22: Efficient Clustered BVH Update Algorithm for Highly

RT 08Independent clustersL

L L…

…… …

The set of independent clusters for rebuildingThe set of independent clusters for rebuilding may form the hierarchy. Their roots are locked

Page 23: Efficient Clustered BVH Update Algorithm for Highly

RT 08Hierarchical rebuild1. New split

should be created ith SAH bi i

4. Locked nodes are atomic

with SAH binning

2. Smaller-sized rows Lare considered while

partitioning

… … … …3. In the next partition step the sub-row is refined if it has sufficient

% of Broken count 5. Other cluster, independent rebuild

This way saves some rebuild operations

Page 24: Efficient Clustered BVH Update Algorithm for Highly

RT 08Cluster migrationMigrating nodes are processed in 3 loops:

U li k h d f th t1. Unlink each node from the tree2. Adjust the remaining tree and search djus e e a g ee a d sea c

reinsertion nodes3 Insert each node back into the tree3. Insert each node back into the tree

At every loop all migrating nodes areAt every loop all migrating nodes are passed in the “end-begin” order – i.e. a loop starts from nodes of higher levelsloop starts from nodes of higher levels

Page 25: Efficient Clustered BVH Update Algorithm for Highly

RT 08Cluster migrationMigrating nodes are processed in 3 loops:

U li k h d f th t1. Unlink each node from the tree2. Adjust the remaining tree and search djus e e a g ee a d sea c

reinsertion nodes3 Insert each node back into the tree3. Insert each node back into the tree

The formula that detects migrating nodesThe formula that detects migrating nodes and these 3 loops automatically resolve the problem of the tree thinning effectthe problem of the tree-thinning effect

Page 26: Efficient Clustered BVH Update Algorithm for Highly

RT 08Cluster migration

Where insert N 1. Unite X and N

UThe variant of

decision is taken when start from X?

? 2 Descend into

X Naccording to its SAH evaluation

X

XL XR

N

NL NR

<= 2. Descend into one of the children

The insertion is a i

X

XL XR N? <=

recursive process of taking decisions

X NL NR

3. Decompose N

,? <=

N is decomposed when its insertion produces severeX

XL XR

L R,< produces severe overlap in X.

Page 27: Efficient Clustered BVH Update Algorithm for Highly

RT 08Memory manager

Problem: frequent updates ofProblem: frequent updates of pointers in the proposed highly-dynamic structure result in cachedynamic structure result in cache-efficiency degradation.Property of restructurings: in either rebuild or insert process a new pBVH-node is allocated under some parent nodeparent node.

Page 28: Efficient Clustered BVH Update Algorithm for Highly

RT 08Memory managerAcceleration structure for pre-allocated array of BVH nodes improves the situation:array of BVH-nodes improves the situation:

16Selector

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

4 4 4 4acceleration structure

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15BVH nodes array

It accelerates the allocation of a free BVH-node that is the nearest to the given one in gthe memory space (e.g. nearest to parent)

Page 29: Efficient Clustered BVH Update Algorithm for Highly

RT 08Memory managerAcceleration structure for pre-allocated array of BVH nodes improves the situation:array of BVH-nodes improves the situation:

16Selector

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

4 4 4 4acceleration structure

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15BVH nodes array

This strategy tries to keep the BVH-layout of reasonable cache-efficiencyy

Page 30: Efficient Clustered BVH Update Algorithm for Highly

RT 08BenchmarksBVH Updater – Core 2 Quad 2.4 GHz (1 core utilization)(1 core utilization)Ray Tracer – GeForce 8800 GTX and CUDACUDA, mono-ray tracerUNC Dynamic models:UNC Dynamic models:

Exploding dragon (252K triangles)

Cloth simulation (92K triangles)

Colliding balls (146K triangles)

Page 31: Efficient Clustered BVH Update Algorithm for Highly

RT 08Dragon timings

120130140

8090

100110120

ms

50607080

Tim

e, m

10203040

010

0 10 20 30 40 50 60 70 80 90 100FrameFrameRefit only Total update

Migrate only Render onlyRebuild only

Page 32: Efficient Clustered BVH Update Algorithm for Highly

RT 08Cloth timings

505560

35404550

ms

20253035

Tim

e, m

5101520

05

0 10 20 30 40 50 60 70 80 90 100FrameFrameRefit only Total update

Migrate only Render onlyRebuild only

Page 33: Efficient Clustered BVH Update Algorithm for Highly

RT 08Balls timings

606570

4045505560

ms

25303540

Tim

e, m

5101520

05

0 10 20 30 40 50 60 70 80 90 100FrameFrameRefit only Total update

Migrate only Render onlyRebuild only

Page 34: Efficient Clustered BVH Update Algorithm for Highly

RT 08Rebuild partitions

80%90%

100%tit

ions

50%60%70%

rebu

ild p

art

10%20%30%40%

Num

ber o

f r

0%10%

0 10 20 30 40 50 60 70 80 90 100Frame

N

FrameDragon Cloth Balls

Relative number of rebuild partitions. Model ifi 100% b f titi f thspecific 100% = number of partitions for the

full-rebuild (i.e. number of inner nodes)

Page 35: Efficient Clustered BVH Update Algorithm for Highly

RT 08Independent clusters

1000110012001300

clus

ters

600700800900000

epen

dent

c

200300400500

mbe

r of i

nde

0100

0 10 20 30 40 50 60 70 80 90 100Frame

Num

FrameDragon Cloth Balls

The number of detected independent exploded clusters

Page 36: Efficient Clustered BVH Update Algorithm for Highly

RT 08Rendering performance

100%

105%ce

95%

perfo

rman

c

85%

90%

Rel

ativ

e p

80%0 10 20 30 40 50 60 70 80 90 100

FrameFrameDragon Cloth Balls

Relative rendering performance of the BVH i i t th d d b f llin comparison to the one produced by full SAH binned-rebuild

Page 37: Efficient Clustered BVH Update Algorithm for Highly

RT 08Dragon video

Page 38: Efficient Clustered BVH Update Algorithm for Highly

RT 08Cloth video

Page 39: Efficient Clustered BVH Update Algorithm for Highly

RT 08Balls video

The algorithm adapts to the rigid motion associating every ball with a separate subtree and then exploits only migrating updates

Page 40: Efficient Clustered BVH Update Algorithm for Highly

RT 08Conclusions

The algorithm unites advantages of severalThe algorithm unites advantages of several updating techniques for the BVH.

Cheapest update techniques are utilizedCheapest update techniques are utilized when they can keep reasonable BVH quality.

The algorithm is applicable to various typesThe algorithm is applicable to various types of dynamic models.

L f d d i d d l fLots of produced independent clusters for rebuild will be useful in a future parallel version.