LabVIEW GPU Computing
LabVIEW GPU Computing. Chia-Kai Chang (張家凱), PhD candidate, Graduate Institute of Photonics and Optoelectronics, National Taiwan University. 2012/6/28
Outline
What is LabVIEW and how does it work?
Parallel computing in LabVIEW
Which applications are a good fit for LabVIEW and GPUs?
Case study
Gallery
What is LabVIEW?
The graphical, dataflow programming language provides a better way for you to solve problems than traditional, lower-level alternatives, and the proof is in its longevity.
Integrate with LabVIEW
[Figure: diagram labels: FFT, Interpolation, LabVIEW]
Which applications are a good fit for LabVIEW and GPUs? People who need: real-time FFT calculation, massively parallel tasks, heavy DSP algorithms.
For example: multichannel audio analysis, OCT (Optical Coherence Tomography).
D. D. Sampson, T. R. Hillman, Optical coherence tomography, Lasers and Current Optical Techniques in Biology, G. Palumbo and R. Pratesi, eds. (ESP Comprehensive Series in Photosciences, Cambridge, UK, 2004), pp. 481-571.
Case study #1: Using NI FlexRIO to Develop a High-Speed, Compact OCT Imaging System
In an OCT system, obtaining the final image requires significant processing including fast Fourier transforms (FFTs), interpolation, and DC offset calculations.
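The three processing steps named above can be sketched for a single depth profile (A-scan) in plain Python. This is a minimal illustration, not the FlexRIO implementation: the function names, the order of the steps (DC removal, then resampling, then FFT), and the use of linear interpolation are all assumptions made for the example.

```python
import cmath
import math

def remove_dc(samples):
    """Subtract the mean so the zero-frequency (DC) term does not dominate."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]

def resample_linear(samples, positions):
    """Linearly interpolate samples at fractional indices.

    Assumes 0 <= p <= len(samples) - 1 for every p in positions.
    """
    last = len(samples) - 1
    out = []
    for p in positions:
        i = min(int(p), last - 1)   # left neighbour, clamped at the end
        frac = p - i
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
    return out

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def process_a_scan(raw, positions):
    """Hypothetical A-scan pipeline: DC removal, resampling, FFT magnitude."""
    centered = remove_dc(raw)
    resampled = resample_linear(centered, positions)
    spectrum = fft(resampled)
    return [abs(c) for c in spectrum[: len(spectrum) // 2]]

# Synthetic fringe: a constant offset of 100 plus 8 cycles over 64 samples.
raw = [100.0 + math.cos(2 * math.pi * 8 * n / 64) for n in range(64)]
profile = process_a_scan(raw, [float(n) for n in range(64)])
print(profile.index(max(profile)))  # 8
```

Each A-scan is processed independently of the others, which is why this workload maps well onto FPGA or GPU hardware. In a real system the resampling positions would come from a spectrometer calibration rather than the identity grid used here.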
Case study #2
GPU1: GTX 580 with 512 stream processors, 1.59 GHz processor clock, and 1.5 GB graphics memory.
GPU2: GTS 450 with 192 stream processors, 1.76 GHz processor clock, and 1.0 GB graphics memory, dedicated to the volume rendering and display of the complete C-scan data.
The GPU is programmed through NVIDIA's Compute Unified Device Architecture (CUDA) technology. The software is developed in the Microsoft Visual C++ environment with National Instruments' IMAQ Win32 APIs.
Zhang, K. (2011). dx.doi.org/10.1364/BOE.2.000764
Signal processing flow chart of the dual-GPU architecture. Dashed arrows: thread triggering; solid arrows: main data stream; hollow arrows: internal data flow of the GPU. Here the graphics memory refers to global memory. The flow chart is illustrated in Fig. 2, where three major threads are used: FD-OCT raw data acquisition (Thread 1), GPU-accelerated FD-OCT data processing (Thread 2), and GPU-based volume rendering (Thread 3).
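The three-thread structure described above can be sketched on the CPU with Python's standard `threading` and `queue` modules. This is only an illustration of the hand-off pattern: the bounded queues stand in for the thread triggering in the figure, the frame contents and arithmetic are placeholders, and all names are hypothetical rather than taken from the cited work.

```python
import queue
import threading

STOP = object()  # sentinel that marks the end of the stream

def acquire(n_frames, raw_q):
    """Thread 1: stand-in for raw data acquisition."""
    for i in range(n_frames):
        raw_q.put([i] * 4)          # placeholder raw frame
    raw_q.put(STOP)

def process(raw_q, img_q):
    """Thread 2: stand-in for the GPU-accelerated processing stage."""
    while True:
        frame = raw_q.get()
        if frame is STOP:
            img_q.put(STOP)         # pass the end marker downstream
            return
        img_q.put([2 * v for v in frame])   # placeholder computation

def render(img_q, shown):
    """Thread 3: stand-in for the volume rendering and display stage."""
    while True:
        img = img_q.get()
        if img is STOP:
            return
        shown.append(sum(img))      # placeholder for drawing the frame

def run_pipeline(n_frames):
    # Bounded queues make a fast stage wait for a slow one.
    raw_q = queue.Queue(maxsize=2)
    img_q = queue.Queue(maxsize=2)
    shown = []
    threads = [
        threading.Thread(target=acquire, args=(n_frames, raw_q)),
        threading.Thread(target=process, args=(raw_q, img_q)),
        threading.Thread(target=render, args=(img_q, shown)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shown

print(run_pipeline(3))  # [0, 8, 16]
```

Because each stage runs in its own thread, acquisition of the next frame can overlap with processing of the current one and rendering of the previous one.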
In vivo human fingernail fold imaging: (a) to (d) are rendered from the same 3D data set with different view angles. The green arrows/dots on each 2D frame correspond to the same edges/vertices of the rendering volume frame. Volume size: 256 (Y) × 100 (X) × 1024 (Z) voxels, corresponding to 3.5 mm (Y) × 3.5 mm (X) × 3 mm (Z).
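The volume dimensions above imply a voxel spacing along each axis and a total voxel count, worked out below. The spacing is taken as extent divided by voxel count, and the memory figure assumes 1 byte per voxel; both are assumptions for illustration, since the slide does not state them.

```python
# Volume dimensions quoted in the caption.
voxels = {"Y": 256, "X": 100, "Z": 1024}
extent_mm = {"Y": 3.5, "X": 3.5, "Z": 3.0}

# Spacing per axis in micrometres (assumed: extent / voxel count).
spacing_um = {axis: 1000.0 * extent_mm[axis] / voxels[axis] for axis in voxels}

# Total number of voxels in one rendered volume.
total_voxels = voxels["Y"] * voxels["X"] * voxels["Z"]

# Memory per volume in MiB, assuming 1 byte per voxel.
volume_mib = total_voxels / (1024 * 1024)

print(spacing_um)     # {'Y': 13.671875, 'X': 35.0, 'Z': 2.9296875}
print(total_voxels)   # 26214400
print(volume_mib)     # 25.0
```

Under these assumptions the sampling is much finer in depth (Z) than laterally, and a single volume fits comfortably in the 1.0 GB of graphics memory quoted for the rendering GPU.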