Post on 31-Mar-2015
GPU-based Hierarchical Computations for View Independent Visibility
Rhushabh Goradia, Prekshu Ajmera, Sharat Chandran
Sixth Indian Conference on Computer Vision, Graphics & Image Processing, 2008
http://www.cse.iitb.ac.in/~{rhushabh, prekshu, sharat}
Problem Statement
ICVGIP, 2008 Rhushabh Goradia, Prekshu Ajmera, Sharat Chandran
Parallel, fast computation of view-independent visibility in a complex scene represented as a point model, using a GPU
Result Video
[Video: results]
Challenges
O(N^3) time complexity: N^2 point-pairs, with N occluders considered for every pair
No surface information
[Figure: Polygonal Model vs. Point Model, each showing points A and B]
[GKSD 07], Visibility Map for Global Illumination in Point Clouds, GRAPHITE 2007
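To make the O(N^3) cost concrete, here is a minimal brute-force sketch. It is not the method from the talk: the occlusion test (a point blocks a pair when it lies within `radius` of the connecting segment) and the names `segment_point_distance`, `brute_force_visibility` and `radius` are assumptions chosen for illustration, since a point model has no surfaces to intersect.

```python
import math

def segment_point_distance(p, q, c):
    """Shortest distance from point c to the segment p-q (3D tuples)."""
    d = [q[i] - p[i] for i in range(3)]
    pc = [c[i] - p[i] for i in range(3)]
    dd = sum(x * x for x in d)
    # Parameter of the closest point on the segment, clamped to [0, 1].
    t = 0.0 if dd == 0 else max(0.0, min(1.0, sum(pc[i] * d[i] for i in range(3)) / dd))
    closest = [p[i] + t * d[i] for i in range(3)]
    return math.dist(closest, c)

def brute_force_visibility(points, radius):
    """Symmetric N x N boolean matrix of mutual visibility; O(N^3) overall."""
    n = len(points)
    visible = [[False] * n for _ in range(n)]
    for i in range(n):                      # N^2 point-pairs ...
        for j in range(i + 1, n):
            blocked = any(                  # ... times N candidate occluders
                k != i and k != j
                and segment_point_distance(points[i], points[j], points[k]) < radius
                for k in range(n)
            )
            visible[i][j] = visible[j][i] = not blocked
    return visible

# Hypothetical data: the second point sits directly between the first and third.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 5.0, 0.0)]
vis = brute_force_visibility(points, radius=0.1)
```

Every pair scans all other points as candidate occluders, which is the cost the hierarchy is meant to avoid.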
Building the Octree Hierarchy
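A minimal sketch of how an octree over a point model could be built. The class name `OctreeNode`, the stopping parameters `max_depth` and `max_points`, and the choice to create only non-empty octants are assumptions for illustration; the slides do not specify the construction.

```python
class OctreeNode:
    """Cubic cell holding points; splits into up to eight non-empty children."""

    def __init__(self, center, half_size, points, depth, max_depth, max_points):
        self.center, self.half_size = center, half_size
        self.points = points
        self.children = []
        if depth < max_depth and len(points) > max_points:
            self._subdivide(depth, max_depth, max_points)

    def _subdivide(self, depth, max_depth, max_points):
        buckets = {}
        for p in self.points:
            # Octant index: one bit per axis, set when p lies on the upper side.
            octant = sum((p[a] >= self.center[a]) << a for a in range(3))
            buckets.setdefault(octant, []).append(p)
        h = self.half_size / 2
        for octant, pts in sorted(buckets.items()):
            child_center = tuple(
                self.center[a] + (h if (octant >> a) & 1 else -h) for a in range(3)
            )
            self.children.append(
                OctreeNode(child_center, h, pts, depth + 1, max_depth, max_points)
            )

    def leaves(self):
        """All leaf cells below (or equal to) this node."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]

# Hypothetical points inside the unit cube.
points = [(0.1, 0.1, 0.1), (0.9, 0.1, 0.1), (0.1, 0.9, 0.9),
          (0.9, 0.9, 0.9), (0.8, 0.8, 0.8)]
root = OctreeNode((0.5, 0.5, 0.5), 0.5, points, 0, max_depth=3, max_points=1)
```

The leaves returned by `root.leaves()` play the role of the point clusters between which visibility is computed.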
Visibility between Point Clusters
Discussion: Visibility between Point Clusters
O(M^2 log M) time complexity (for an octree with M leaves)
M^2 log M << N^3
M << N
... but still not fast enough!
Contributions
Problem statement: parallel, fast computation of view-independent visibility in a complex scene represented as a point model, using a GPU
Achieved up to 19x speed-up
The visibility problem is highly parallel
No compromise on quality
No dynamic memory allocation or recursion on the GPU
Application to Global Illumination
Point-modeled Cornell Room and the Stanford Bunny
Can be extended to the Digital Heritage Project (http://research.microsoft.com)
Plan
Hierarchical Visibility Map (V-map) Construction on CPU
Parallel V-map Construction on GPU
  Strategy 1: Multiple threads per node
  Strategy 2: One thread per node
  Strategy 3: Multiple threads per node-pair
Results
Conclusion
What is a V-map?
The V-map for a tree is a collection of visibility links for every node in the tree
The visibility link for any node N is a set L of nodes
Every point in any node in L is guaranteed to be visible from every point in N
[Figure: node N with visibility links to nodes V1, V2 and V3 in its list L]
With respect to a given node, any other node at any level is one of:
-- Completely Visible
-- Completely Invisible
-- Partially Visible
[Figure: octree levels, from Level 0 (Root) to Level 3 (Leaf)]
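The three states can be sketched as a classification over leaf-pair visibility. The dictionary-based tree, the node names and the `visible_leaf_pairs` set below are hypothetical; the sketch assumes that leaf-to-leaf visibility is already known.

```python
def leaves(tree, node):
    """Leaf descendants of `node`; `tree` maps a node name to its child names."""
    kids = tree.get(node) or []
    if not kids:
        return [node]
    return [leaf for kid in kids for leaf in leaves(tree, kid)]

def classify(tree, a, b, visible_leaf_pairs):
    """State of node b with respect to node a, derived from their leaf-pairs."""
    flags = [frozenset((x, y)) in visible_leaf_pairs
             for x in leaves(tree, a) for y in leaves(tree, b)]
    if all(flags):
        return "completely visible"
    if not any(flags):
        return "completely invisible"
    return "partially visible"

# Hypothetical two-level tree and leaf-pair visibility.
tree = {"root": ["A", "B", "C", "D"],
        "A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1"], "D": ["d1"]}
visible_leaf_pairs = {frozenset(p) for p in [
    ("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2"), ("a1", "c1")]}

status = {other: classify(tree, "A", other, visible_leaf_pairs)
          for other in ("B", "C", "D")}
```

Only a completely visible node would be stored in A's visibility link; a partially visible node has to be resolved at a finer level.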
CPU based V-map Construction
Compute visibility between the leaf-pairs of nodes A and B (e.g., a1 and b1, then a2 and b1)
Look up the leaf-pair results for A and B
  All leaf-pairs are visible: complete visibility, so a new visibility link is created
  Only some leaf-pairs are visible: partial visibility, so new visibility links are created at lower levels
Uses dynamic memory allocation
Uses recursion on the tree
[Figure: octree from Root (Level 0) through Levels L-3, L-2, L-1 to Level L, with nodes A and B and leaves a1, a2, b1]
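The recursive construction can be sketched as follows. This is one reading of the slides, not the authors' code: descending into child pairs on partial visibility is an assumption, and `build_vmap`, `leaf_visible` and the example tree are hypothetical names and data.

```python
def leaves(tree, node):
    """Leaf descendants of `node`; `tree` maps a node name to its child names."""
    kids = tree.get(node) or []
    if not kids:
        return [node]
    return [leaf for kid in kids for leaf in leaves(tree, kid)]

def build_vmap(tree, a, b, leaf_visible, vmap):
    """Link a and b on complete visibility; recurse on partial visibility."""
    flags = [leaf_visible(x, y) for x in leaves(tree, a) for y in leaves(tree, b)]
    if all(flags):
        # Visibility lists grow as links are found (dynamic allocation).
        vmap.setdefault(a, []).append(b)
        vmap.setdefault(b, []).append(a)
    elif any(flags):
        # Partial visibility: descend one level (recursion on the tree).
        for child_a in tree.get(a) or [a]:
            for child_b in tree.get(b) or [b]:
                build_vmap(tree, child_a, child_b, leaf_visible, vmap)
    # Complete invisibility: nothing is stored.

tree = {"root": ["A", "B", "C"], "A": ["a1", "a2"], "B": ["b1", "b2"], "C": ["c1"]}
visible = {frozenset(p) for p in [
    ("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2"), ("a1", "c1")]}
leaf_visible = lambda x, y: frozenset((x, y)) in visible

vmap = {}
for a, b in [("A", "B"), ("A", "C"), ("B", "C")]:
    build_vmap(tree, a, b, leaf_visible, vmap)
```

Here A and B are linked directly, while the partially visible pair A and C is resolved one level down as a link between a1 and c1. Both the growing lists and the recursion are features that the GPU strategies have to work around.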
Strategy 1: Multiple Threads per Node
[Figure: nodes A and N, the interaction list I0, I1, I2, I3, and threads T0, T1, T2, T3]
Computation by a Single Thread
[Figure: Compute Visibility between A and I0, from the Root (Level 0) down to leaves a1 and b1 at Level L]
Discussion:
Degree of parallelism is limited by the size of the node's interaction list
Serious limitation: the GPU has no support for recursion or dynamic memory allocation
Strategy 2: One Thread per Node
[Figure: at Level L, thread T0 handles node A with interaction list I0, I1, I2, I3; thread T1 handles node N with interaction list I0, I1]
Discussion:
Degree of parallelism is limited by the number of nodes per level
Performance increases as we go down the octree
Serious limitation: the GPU has no support for recursion or dynamic memory allocation
Strategy 3: Multiple Threads per Node-Pair
Compute visibility between leaves in parallel
Store the results in a boolean array
Look up the leaf-pairs of nodes A and B (e.g., a1 and b1, then a2 and b1) from the boolean array
  All leaf-pairs are visible: complete visibility, so a new visibility link is created
  Otherwise: partial visibility, so new visibility links are created at lower levels
[Figure: octree from Root (Level 0) through Levels L-3, L-2, L-1 to Level L, with nodes A and B and leaves a1, a2, b1]
CPU v/s GPU
CPU Algorithm: Compute Visibility
GPU Algorithm: Compute Visibility in parallel, then look up from the boolean array
[Figure: nodes A and B under the Root, shown for both algorithms]
Strategy 3: Multiple Threads per Node-Pair
Discussion:
Degree of parallelism is high, as hundreds or thousands of threads run concurrently most of the time
Recursion and dynamic memory allocation happen on the CPU
Performance is best at octree levels near the root
Achieves up to 19x speed-up
No change in quality
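The split between parallel leaf-pair computation and look-ups can be sketched on the CPU. This is a simulation with a Python thread pool, not CUDA code: the flat index `tid = i * n + j`, the function names and the example data are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def leaf_pair_kernel(tid, leaves, leaf_visible):
    """Work of one simulated GPU thread: exactly one leaf-pair per thread id."""
    i, j = divmod(tid, len(leaves))
    return leaf_visible(leaves[i], leaves[j])

def compute_leaf_visibility(leaves, leaf_visible):
    """Phase 1: fill a flat boolean array, one independent task per leaf-pair."""
    n = len(leaves)
    with ThreadPoolExecutor() as pool:
        # Tasks share no mutable state, so no synchronization is needed.
        return list(pool.map(
            lambda tid: leaf_pair_kernel(tid, leaves, leaf_visible), range(n * n)))

def node_pair_status(leaf_ids_a, leaf_ids_b, flags, n):
    """Phase 2 (CPU side): classify a node-pair purely by array look-ups."""
    looked_up = [flags[i * n + j] for i in leaf_ids_a for j in leaf_ids_b]
    if all(looked_up):
        return "complete"
    return "partial" if any(looked_up) else "invisible"

# Hypothetical leaves: A owns indices 0-1, B owns 2-3, C owns 4.
leaves = ["a1", "a2", "b1", "b2", "c1"]
visible = {frozenset(p) for p in [
    ("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2"), ("a1", "c1")]}
leaf_visible = lambda x, y: frozenset((x, y)) in visible

flags = compute_leaf_visibility(leaves, leaf_visible)
n = len(leaves)
```

With `flags` in hand, the recursive traversal and the growing visibility lists can stay on the CPU, which matches the division of labour described in the discussion.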
Results: Visibility Validation
Results: Timings
Bunny in the Cornell Room (135,000 points): Max Speedup 19X
Buddha in the Cornell Room (230,000 points): Max Speedup 14X
Ganesha & Satyawati in the Cornell Room (350,000 points): Max Speedup 17X
Results: Application to Global Illumination
Results: Quality Comparison
[Figure: CPU result vs. GPU result]
Accuracy up to 5 decimal places
Discussion: Implementation Details
Optimal Thread and Block Size
  Each block (16 x 16) contained 256 threads
  Every thread-block grid contained no fewer than 64 blocks
Optimal Octree Heights
  Every thread works on a leaf-pair; multiple threads are independent
  No shared memory synchronization required
  With 16 multiprocessors, 64 thread-blocks and 256 threads per block, we need 16384 leaf-pairs
Asynchronous Computations
Loop Unrolling
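The arithmetic behind the 16384 figure can be checked in a few lines. The block and grid sizes come from the slide; the helper names and the idea of counting ordered pairs among M leaves (M * M) are assumptions used only to relate the figure to an octree size.

```python
import math

THREADS_PER_BLOCK = 16 * 16      # 256 threads per block, from the slide
MIN_BLOCKS_PER_GRID = 64         # no fewer than 64 blocks per grid, from the slide

# One thread per leaf-pair, so this many leaf-pairs keep the whole grid busy.
leaf_pairs_needed = THREADS_PER_BLOCK * MIN_BLOCKS_PER_GRID

def blocks_needed(num_leaf_pairs, threads_per_block=THREADS_PER_BLOCK):
    """Blocks required to give each leaf-pair its own thread (ceiling division)."""
    return -(-num_leaf_pairs // threads_per_block)

def saturates_grid(num_leaf_pairs):
    """True when there is at least one leaf-pair for every thread in the grid."""
    return num_leaf_pairs >= leaf_pairs_needed

# Assumption: with M leaves and ordered pairs, M * M leaf-pairs are available.
min_leaves = math.isqrt(leaf_pairs_needed)
```

Under that assumption, an octree with fewer leaves than `min_leaves` would leave part of the grid idle, which is one way to see why the octree height matters for speed-up.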
Final Note
A parallel implementation of the view-independent mutual point-pair visibility problem on a modern GPU (G80/G92 architecture) was presented
Several strategies to achieve the desired parallelism were introduced, and the most suitable one was chosen
Speed-ups of up to 19X were reported while maintaining the quality of results
Blocks of 256 threads achieved optimal GPU performance
Good octree heights are essential for high speed-ups and accurate visibility results
By treating V-map construction as a pre-processing step, photo-realistic global illumination rendering of complex point models has been shown
The parallel V-map construction can be thought of as a guide for solving other problems that involve computing relationships among data at various levels of a hierarchy
Thank You