Accelerating High-Throughput Computing through OpenCL
Andrei Dafinoiu, Joshua Higgins, Violeta Holmes
High-Performance Computing Research Group
University of Huddersfield
Huddersfield, United Kingdom
Overview
Introduction
Resources
Motivation
Experiment Design
Results and Performance
Conclusions
OpenCL
• OpenCL: a programming framework for heterogeneous compute platforms
• Supports CPUs, GPUs, DSPs, and other accelerators
• Programming API based on C/C++
Introduction
• High-Throughput Computing
• HTCondor
• QGGCondor
Aim of project
• To expand the capabilities of QGGCondor.
• To evaluate the efficiency and flexibility of OpenCL for use within a heterogeneous HTC environment.
• To increase the visibility of GPGPU computing.
HTCondor Setup
ClassAds, what are they?
[Figure: old ClassAd, HTCondor, proposed new ClassAd]
Environment Setup
Reasoning:
- Unreliable environment for benchmarking purposes.
- Desire to execute benchmarks over the live system.
Method:
- Extracted hostnames of 1000 Condor machines.
- Script-based generation of jobs.
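The script-based job generation could look roughly like the sketch below, which builds one HTCondor submit description per extracted hostname and pins each job to its machine through a `requirements` expression. The executable name `fft_bench`, the example hostnames, and the output file naming are hypothetical; the slides do not show the actual script.

```python
# Minimal sketch of per-host job generation. All names here (fft_bench,
# the example hostnames, the file naming) are assumptions for illustration.

def make_submit_description(hostname, executable="fft_bench"):
    """Return the text of an HTCondor submit description pinned to one machine."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        # Restrict matching to the single machine we want to benchmark.
        f'requirements = (Machine == "{hostname}")',
        f"output = {hostname}.out",
        f"error = {hostname}.err",
        f"log = {hostname}.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


def generate_jobs(hostnames):
    """Map each hostname to its submit description text."""
    return {host: make_submit_description(host) for host in hostnames}


if __name__ == "__main__":
    hosts = ["lab-pc-001.example.org", "lab-pc-002.example.org"]
    for host, text in generate_jobs(hosts).items():
        print(f"# {host}.sub")
        print(text)
```

Pinning each job to a named machine is one way to benchmark every host exactly once on a live pool; the resulting files would then be handed to `condor_submit`.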
Resource Discovery
GPU distribution:
- AMD 5600: 1%
- AMD 6400: 19%
- AMD 6500: 14%
- K600: 4%
- GTX 610: 4%
- GTX 670: 7%
- GTX 750 Ti: 8%
- GTX 970: 13%
- Not detected: 30%
Experiment Design
Fast Fourier Transform
1000 iterations -> to ensure precision
17 FFT sizes -> from 2^8 to 2^24
701 machines -> that reported GPUs
Resulting in:
- Approx. 12 million FFT calculations
- Approx. 28 thousand CPU hours
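The totals above can be checked with a few lines of arithmetic. The sketch below multiplies iterations, FFT sizes, and machines, and also compares the 701 reporting machines against the 1000 extracted hostnames, assuming both figures refer to the same pool (the slides do not state this explicitly).

```python
# Worked arithmetic for the experiment size. Constants are taken from the
# slides; treating 701 and 1000 as the same pool is an assumption.

ITERATIONS = 1000
FFT_EXPONENTS = range(8, 25)  # 2^8 through 2^24 inclusive
MACHINES_WITH_GPU = 701
MACHINES_SURVEYED = 1000


def fft_sizes():
    """Return the list of FFT sizes, one per power of two."""
    return [2 ** e for e in FFT_EXPONENTS]


def total_fft_runs():
    """Iterations x sizes x machines."""
    return ITERATIONS * len(fft_sizes()) * MACHINES_WITH_GPU


def undetected_fraction():
    """Share of surveyed machines that did not report a GPU."""
    return (MACHINES_SURVEYED - MACHINES_WITH_GPU) / MACHINES_SURVEYED


if __name__ == "__main__":
    print(len(fft_sizes()))        # number of FFT sizes
    print(total_fft_runs())        # total FFT calculations
    print(undetected_fraction())   # fraction without a detected GPU
```

The product is 11,917,000, which matches the "approx. 12 million" figure, and the undetected share of 29.9% is consistent with the 30% "not detected" slice in the GPU distribution.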
Results
[Figure: 1D FFT on CPU (Intel i5). Y-axis: GFLOPS (0 to 2.5); X-axis: FFT size (ticks 2^8 to 2^23).]
[Figure: 1D FFT on GPU (AMD 6500). Y-axis: GFLOPS (0 to 50); X-axis: FFT size (ticks 2^8 to 2^23).]
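The slides do not state how the GFLOPS values were derived from the timings. A minimal sketch, assuming the widely used convention of counting 5·N·log2(N) floating-point operations for a complex 1D FFT of size N, is shown below; the function name and the sample timing are hypothetical.

```python
import math

# Sketch of a GFLOPS estimate for a 1D complex FFT. The 5*N*log2(N)
# operation count is a common benchmarking convention and an assumption
# here, not something confirmed by the slides.


def fft_gflops(n, seconds):
    """Estimate GFLOPS for one FFT of size n that took `seconds` to run."""
    if n < 2 or seconds <= 0:
        raise ValueError("n must be >= 2 and seconds must be positive")
    flops = 5.0 * n * math.log2(n)
    return flops / seconds / 1e9


if __name__ == "__main__":
    # Hypothetical timing: a 2^20-point FFT completing in one millisecond.
    print(fft_gflops(2 ** 20, 1e-3))
```

Under this convention the figure is a normalized rate rather than a count of operations actually executed, which makes results comparable across devices and FFT implementations.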
[Figure: QGGCondor GPU FFT Performance. Y-axis: GFLOPS (ticks doubling from 5 to 160); X-axis: FFT size (ticks 2^8 to 2^23). Series: AMD 6500, AMD 6400, GTX 970, GTX 750 Ti, GTX 670, AMD 5600. Annotations: GTX 970, AMD 5600, and the years 2009, 2010, 2012, 2014.]
Comparison with GPU cluster
[Figure: Average Performance. Y-axis: GFLOPS (0 to 160); X-axis: FFT size (ticks 2^8 to 2^23). Series: Condor, C2050, AMD, NVIDIA.]
• On average, a single compute GPU is negligibly faster.
• Newer-gen GPGPUs outperform older compute counterparts.
P.S.: NVIDIA C2050 was released in 2011
Conclusion
• HTCondor GPU integration was successful with a single-entity ClassAd definition.
• OpenCL-based implementations over HTCondor are straightforward (without specific optimizations); however, machines need dedicated graphics drivers installed.
• QGGCondor GPU performance varies greatly; however, the average is only marginally slower than that of a dedicated compute GPU.
Thank you for your attention
Questions?