Acceleration on many-cores CPUs and GPUs Dinesh Manocha Lauri Savioja.
Posted: 21-Dec-2015
Acceleration on many-cores CPUs and GPUs
Dinesh Manocha, Lauri Savioja
Frustum Tracing Pipeline
Frustum-Triangle Intersection
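The slide names the frustum-triangle intersection stage without detailing it. As a rough illustration only, the sketch below shows one common conservative test: a triangle is rejected when all three of its vertices lie outside the same frustum plane. This is not necessarily the test used in the pipeline presented here; the function names, the plane representation (inward-facing normal plus offset), and the example frustum are assumptions made for illustration.

```python
def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def triangle_may_intersect_frustum(triangle, planes):
    """Conservative frustum-triangle test (illustrative, hypothetical helper).

    triangle: three (x, y, z) vertices.
    planes:   list of (normal, offset) pairs with inward-facing normals,
              so a point p is on the inner side when dot(normal, p) + offset >= 0.

    Returns False only when the triangle is certainly outside the frustum.
    A True result may still be a false positive and would need an exact
    test in a full implementation.
    """
    for normal, offset in planes:
        if all(dot(normal, v) + offset < 0.0 for v in triangle):
            return False  # all vertices outside one plane: no intersection
    return True


# Assumed example frustum: apex at the origin, opening along +z with a
# 45 degree half-angle, so the inside satisfies |x| <= z and |y| <= z.
side_planes = [
    ((-1.0, 0.0, 1.0), 0.0),  # z - x >= 0
    ((1.0, 0.0, 1.0), 0.0),   # z + x >= 0
    ((0.0, -1.0, 1.0), 0.0),  # z - y >= 0
    ((0.0, 1.0, 1.0), 0.0),   # z + y >= 0
]

in_front = [(0.0, 0.0, 2.0), (0.5, 0.0, 2.0), (0.0, 0.5, 2.0)]
behind = [(0.0, 0.0, -2.0), (1.0, 0.0, -2.0), (0.0, 1.0, -2.0)]
```

A test of this kind is cheap and is typically used to discard most triangles before any exact clipping is attempted.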
Frustum Tracing Results (7 cores)
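| Scene | Triangles (∆s) | Diffraction | #Frusta | Time (msec) |
|---|---|---|---|---|
| Theater | 54 | NO | 56K | 33 |
| Factory | 174 | NO | 40K | 27 |
| Game | 14K | NO | 206K | 273 |
| Sibenik | 71K | NO | 198K | 598 |
| City | 72K | YES | 80K | 206 |
| SodaHall | 1.5M | YES | 108K | 373 |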
Frustum Tracing Results (7 cores)
Interactive geometric propagation on complex scenes
[Chandak et al. 2008]
Scaling of FastV (Scaling with #cores)
Fastest, accurate geometric propagation algorithm [Chandak et al. 2009]
Numerical Acoustics with Adaptive Rectangular Decomposition on the GPU
Nikunj Raghuvanshi+, Brandon Lloyd*, Naga K. Govindaraju*, Ming C. Lin+
+ Department of Computer Science, UNC Chapel Hill
* Microsoft Corporation
Rectangular Decomposition
Numerical Acoustics can be solved very efficiently on a rectangular domain
Decompose complex domains into rectangles
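To make the decomposition step concrete, here is a minimal sketch that splits a 2D occupancy grid into axis-aligned rectangles with a simple greedy scan. It is an illustration only: the method presented here works on 3D scenes, and the slides do not say which decomposition heuristic is used. The function name, the grid encoding (1 = air cell, 0 = solid), and the `(row, col, height, width)` rectangle format are assumptions.

```python
def decompose_into_rectangles(grid):
    """Greedily cover all cells equal to 1 with non-overlapping rectangles.

    grid: list of equal-length rows containing 0 (solid) or 1 (air).
    Returns a list of (row, col, height, width) tuples.
    This greedy scan is a simple illustrative heuristic and does not
    guarantee the minimum number of rectangles.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    covered = [[False] * cols for _ in range(rows)]
    rects = []

    def free(r, c):
        return grid[r][c] == 1 and not covered[r][c]

    for r in range(rows):
        for c in range(cols):
            if not free(r, c):
                continue
            # Grow the rectangle to the right along the current row.
            width = 1
            while c + width < cols and free(r, c + width):
                width += 1
            # Grow downward while every cell of the next row is free.
            height = 1
            while r + height < rows and all(
                free(r + height, c + i) for i in range(width)
            ):
                height += 1
            # Mark the cells as covered and record the rectangle.
            for i in range(r, r + height):
                for j in range(c, c + width):
                    covered[i][j] = True
            rects.append((r, c, height, width))
    return rects


# An L-shaped room: the lower-right quadrant is solid.
l_shaped_room = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
rectangles = decompose_into_rectangles(l_shaped_room)
```

For the L-shaped room above, the scan produces two rectangles: a 2x4 block covering the top two rows and a 2x2 block covering the lower-left corner.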
Leveraging GPU for acoustics
The wave equation within each rectangle can be solved using a Discrete Cosine Transform (DCT)
DCTs can be computed using an FFT
Use an efficient FFT implementation on the GPU:
Govindaraju, N. K., Lloyd, B., Dotsenko, Y., Smith, B., and Manferdelli, J. 2008. High performance discrete Fourier transforms on graphics processors. In Proceedings of the 2008 ACM/IEEE Conference on Supercomputing.
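The statement that a DCT can be computed with an FFT can be checked on a small example. The following is a minimal CPU-side sketch in pure Python, not the GPU implementation cited above. It assumes the unnormalized DCT-II, X_k = sum over n of x_n cos(pi (n + 1/2) k / N), and a power-of-two input length. The names `fft`, `dct2_via_fft`, and `dct2_direct` are illustrative.

```python
import cmath
import math


def fft(a):
    """Recursive radix-2 FFT (textbook version, power-of-two lengths only)."""
    n = len(a)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(a[0])]
    even = fft(a[0::2])
    odd = fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dct2_via_fft(x):
    """Unnormalized DCT-II of x computed from one FFT of length 2N.

    The input is mirrored to [x_0 .. x_{N-1}, x_{N-1} .. x_0]; the FFT of
    this even extension equals 2 * exp(i*pi*k/(2N)) * DCT_k, so a phase
    rotation and a factor of 1/2 recover the DCT coefficients.
    """
    n = len(x)
    mirrored = list(x) + list(reversed(x))
    spectrum = fft(mirrored)
    return [
        0.5 * (cmath.exp(-1j * cmath.pi * k / (2 * n)) * spectrum[k]).real
        for k in range(n)
    ]


def dct2_direct(x):
    """Reference O(N^2) evaluation of the same DCT-II definition."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


signal = [1.0, 2.0, 3.0, 4.0, 0.5, -1.0, 2.0, 7.0]
fast = dct2_via_fft(signal)
slow = dct2_direct(signal)
max_error = max(abs(p - q) for p, q in zip(fast, slow))
```

The two results agree to within floating-point rounding, which is why a fast FFT library can serve as the workhorse for the per-rectangle DCTs.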
FFT on the GPU
Performance
| Scene Name | Volume (m³) | Time: FDTD (CPU) | Time: Our Technique (GPU) | Speedup |
|---|---|---|---|---|
| Corridor | 375 | 365 min | 4 min | ~90x |
| House | 1,275 | 2718 min | 13 min | ~200x |
| Cathedral | 13,650 | ~1 week (projected) | 30 min | ~300x |
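The speedup column can be roughly reproduced from the two timing columns. The short sketch below does the arithmetic; it assumes that the projected "~1 week" for the Cathedral scene means exactly 7 days, which is an approximation introduced here, not a figure from the slides.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # assumption: "~1 week" taken as exactly 7 days

# (FDTD on CPU, our technique on GPU), both in minutes, from the table.
timings = {
    "Corridor": (365, 4),
    "House": (2718, 13),
    "Cathedral": (MINUTES_PER_WEEK, 30),
}


def speedup(cpu_minutes, gpu_minutes):
    """Ratio of CPU time to GPU time."""
    return cpu_minutes / gpu_minutes


speedups = {scene: speedup(cpu, gpu) for scene, (cpu, gpu) in timings.items()}

for scene, factor in speedups.items():
    print(f"{scene}: {factor:.0f}x")
```

The computed ratios (about 91x, 209x, and 336x) are consistent with the rounded values of ~90x, ~200x, and ~300x reported in the table.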
Conclusion
Rectangular decomposition leverages the GPU FFT and, combined with algorithmic improvements, yields a ~100x improvement in performance for numerical acoustics
Case: Real-time acoustic radiance transfer
Case continued
More information in: S. Siltanen, T. Lokki, and L. Savioja, "Frequency domain acoustic radiance transfer for real-time auralization," Acta Acustica united with Acustica, vol. 95, no. 1, pp. 106-117, 2009.