![Page 1](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/1.jpg)
Refactoring Conventional Task Schedulers to Exploit
Asymmetric ARM big.LITTLE Architectures in Dense Linear Algebra
Luis Costero, Francisco D. Igual, Katzalin Olcoz
Sandra Catalán, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí
![Page 2](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/2.jpg)
https://www.youtube.com/watch?v=KClygZtp8mA
![Page 3](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/3.jpg)
Task parallelism
![Page 4](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/4.jpg)
Contribution
Asymmetry-oblivious scheduler + Asymmetry-aware DLA library
![Page 5](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/5.jpg)
Contribution
Asymmetry-oblivious scheduler (task parallelism) + Asymmetry-aware DLA library (data parallelism)
![Page 6](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/6.jpg)
Contribution
Asymmetry-oblivious scheduler (task parallelism) + Asymmetry-aware DLA library (data parallelism)
Virtual Cores
![Page 7](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/7.jpg)
Software execution models for ARM big.LITTLE
![Page 8](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/8.jpg)
Target architecture
![Page 9](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/9.jpg)
Execution Models
Cluster switching mode
CPU Migration
Global task scheduling
![Page 10](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/10.jpg)
Parallel execution of DLA operations on multi-threaded architectures
![Page 11](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/11.jpg)
$A = U^{T} U$
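As a small worked instance of the factorization on this slide, the sketch below computes an upper-triangular U with U^T U = A for a 3x3 symmetric positive definite matrix. It is an unblocked, pure-Python illustration only; the function name `cholesky_upper` and the sample matrix are assumptions made for this example, not code from the talk.

```python
import math

def cholesky_upper(a):
    """Return an upper-triangular U (list of rows) such that U^T U == a.

    `a` is a symmetric positive definite matrix given as a list of rows.
    """
    n = len(a)
    u = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Diagonal entry: subtract the squares of the entries above it.
        s = a[i][i] - sum(u[k][i] ** 2 for k in range(i))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        u[i][i] = math.sqrt(s)
        # Entries to the right of the diagonal in row i.
        for j in range(i + 1, n):
            t = a[i][j] - sum(u[k][i] * u[k][j] for k in range(i))
            u[i][j] = t / u[i][i]
    return u

# Hypothetical sample matrix chosen so that U has small integer entries.
A = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
U = cholesky_upper(A)
```

Production codes use blocked LAPACK-style kernels rather than this scalar loop; the sketch only shows what the equation means.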
![Page 12](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/12.jpg)
Runtime task scheduling of DLA operations
● Task scheduling for the Cholesky factorization
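To make the task view of the Cholesky factorization concrete, the sketch below enumerates the tasks of a tiled factorization on an s x s grid of tiles and derives their read-after-write dependencies by tracking the last writer of each tile. The kernel names (POTRF, TRSM, SYRK, GEMM) are the usual LAPACK/BLAS ones; the function `cholesky_tasks` and its output format are assumptions for illustration and do not reproduce the data structures of any particular runtime.

```python
def cholesky_tasks(s):
    """List the tasks of a tiled Cholesky factorization A = U^T U on s x s tiles.

    Returns a list of (kernel, tile_written, deps), where deps holds the
    indices of earlier tasks that must finish first.
    """
    tasks = []
    last_writer = {}  # tile (i, j) -> index of the task that last wrote it

    def add(kernel, reads, write):
        # A task depends on the last writer of every tile it touches.
        deps = sorted({last_writer[t] for t in reads + [write] if t in last_writer})
        tasks.append((kernel, write, deps))
        last_writer[write] = len(tasks) - 1

    for k in range(s):
        add("POTRF", [], (k, k))                          # factor diagonal tile
        for j in range(k + 1, s):
            add("TRSM", [(k, k)], (k, j))                 # solve for row panel
        for i in range(k + 1, s):
            add("SYRK", [(k, i)], (i, i))                 # update diagonal tile
            for j in range(i + 1, s):
                add("GEMM", [(k, i), (k, j)], (i, j))     # update off-diagonal tile
    return tasks

tasks = cholesky_tasks(3)
```

For s = 3 this yields 10 tasks; a runtime can execute any task whose dependencies have all completed.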
![Page 13](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/13.jpg)
Runtime task scheduling of DLA operations
● Task scheduling in heterogeneous architectures
– The runtime distinguishes between CPU and GPU targets: OmpSs, StarPU, MAGMA, libflame
– Tasks are assigned depending on target properties, and specific techniques are applied
![Page 14](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/14.jpg)
Runtime task scheduling of DLA operations
● Task scheduling in asymmetric architectures
– Asymmetry-conscious runtime: Botlev-OmpSs
– Criticality-aware Task Scheduler policy
– Each task is mapped to a single core
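The idea of a criticality-aware policy can be sketched as follows: compute each task's bottom level (longest path to an exit of the task graph) and send tasks on a longest path to the fast cores, with every task mapped to exactly one core type. This is a simplified illustration under unit task costs, not the Botlev-OmpSs implementation; the function names and the five-task graph are hypothetical.

```python
def bottom_levels(succ):
    """succ: task -> list of successors. Returns task -> longest path to an exit (in tasks)."""
    memo = {}

    def level(t):
        if t not in memo:
            memo[t] = 1 + max((level(n) for n in succ[t]), default=0)
        return memo[t]

    for t in succ:
        level(t)
    return memo

def classify(succ):
    """Map each task to 'big' if it lies on a longest path of the DAG, else 'little'."""
    bottom = bottom_levels(succ)
    # Invert the graph to compute top levels (longest path from an entry).
    pred = {t: [] for t in succ}
    for t, nexts in succ.items():
        for n in nexts:
            pred[n].append(t)
    top = {}

    def top_level(t):
        if t not in top:
            top[t] = 1 + max((top_level(p) for p in pred[t]), default=0)
        return top[t]

    critical = max(bottom.values())
    return {t: "big" if top_level(t) + bottom[t] - 1 == critical else "little"
            for t in succ}

# Hypothetical task graph: A -> B -> D and A -> C -> E -> D.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": ["D"]}
mapping = classify(succ)
```

Here only B is off the critical path, so it is the one task sent to a slow core.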
![Page 15](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/15.jpg)
Data parallel libraries of BLAS3 kernels
● Multi-threaded implementation of the BLAS-3
![Page 16](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/16.jpg)
Data parallel libraries of BLAS3 kernels
● Data-parallel libraries for asymmetric architectures:
– Global Task Scheduling
– Dynamic workload distribution between the clusters
– Static workload distribution in a cluster
– Specific loop strides for each type of core
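The combination listed above (dynamic distribution between clusters, static distribution inside a cluster, a different stride per core type) can be illustrated with a small deterministic simulation. Each cluster grabs its own stride of loop iterations from a shared counter whenever it becomes free and splits that chunk evenly among its cores. The strides, core counts and per-core rates below are made-up values, and the code is a sketch of the scheme, not the asymmetric BLIS implementation.

```python
import heapq

def static_split(start, stop, cores):
    """Split [start, stop) into contiguous, near-equal ranges, one per core."""
    base, extra = divmod(stop - start, cores)
    ranges, lo = [], start
    for core in range(cores):
        hi = lo + base + (1 if core < extra else 0)
        if hi > lo:
            ranges.append((core, lo, hi))
        lo = hi
    return ranges

def distribute(n, clusters):
    """Simulate handing out iterations 0..n-1 to clusters on demand.

    clusters: name -> (stride, cores, rate), where rate is iterations per
    time unit per core (hypothetical). Returns name -> [(core, start, stop)].
    """
    work = {name: [] for name in clusters}
    free_at = [(0.0, name) for name in sorted(clusters)]  # (time free, cluster)
    heapq.heapify(free_at)
    next_iter = 0
    while next_iter < n:
        now, name = heapq.heappop(free_at)       # earliest idle cluster
        stride, cores, rate = clusters[name]
        start, stop = next_iter, min(next_iter + stride, n)
        next_iter = stop                          # dynamic: shared counter
        ranges = static_split(start, stop, cores) # static: inside the cluster
        work[name].extend(ranges)
        busy = max(hi - lo for _, lo, hi in ranges) / rate
        heapq.heappush(free_at, (now + busy, name))
    return work

# Hypothetical configuration: 4 fast cores with stride 16, 4 slow cores with stride 8.
work = distribute(48, {"big": (16, 4, 4.0), "little": (8, 4, 1.0)})
```

With these assumed rates the fast cluster ends up with most of the iterations, because it returns to the shared counter more often and takes larger chunks each time.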
![Page 17](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/17.jpg)
Retargeting existing task schedulers to asymmetric architectures
![Page 18](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/18.jpg)
Evaluation of conventional runtimes on AMPs
![Page 19](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/19.jpg)
Combining conventional runtimes with asymmetric libraries
● GTS model (inspired by CPUM)
– Virtual cores composed of one A15 + one A7
– Both cores are active simultaneously
● Parallelism:
– Task-level: symmetric runtime
– Data-level: asymmetric library
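A minimal sketch of the virtual-core idea: pair each A15 with one A7, expose the pairs to the runtime as identical workers, and let an asymmetry-oblivious policy place tasks on them. The core numbering (A7 cores as 0-3, A15 cores as 4-7), the round-robin policy and all function names are assumptions for illustration; inside each virtual core, the asymmetry-aware library would then split the task's kernel between the two physical cores.

```python
def build_virtual_cores(big_ids, little_ids):
    """Pair each big core with one LITTLE core to form a virtual core."""
    if len(big_ids) != len(little_ids):
        raise ValueError("need the same number of big and LITTLE cores")
    return [{"vcore": v, "big": b, "little": l}
            for v, (b, l) in enumerate(zip(big_ids, little_ids))]

def assign_round_robin(tasks, vcores):
    """Asymmetry-oblivious placement: every virtual core is treated alike."""
    return {task: vcores[i % len(vcores)] for i, task in enumerate(tasks)}

# Hypothetical core numbering for a 4+4 big.LITTLE system.
vcores = build_virtual_cores(big_ids=[4, 5, 6, 7], little_ids=[0, 1, 2, 3])
placement = assign_round_robin(["potrf", "trsm", "syrk", "gemm", "trsm2"], vcores)
```

The scheduler never sees the difference between core types: it works with four symmetric workers, and the asymmetry is handled one level below.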
![Page 20](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/20.jpg)
Combining conventional runtimes with asymmetric libraries
● Comparison with other approaches:
✔ Any conventional task scheduler will work transparently with no special modifications
✔ Any improvement in the runtime will impact the performance on an AMP
✔ Any improvement in the asymmetry-aware library will impact the performance on an AMP
✗ Requires a tuned asymmetry-aware DLA library
![Page 21](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/21.jpg)
Experimental results
![Page 22](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/22.jpg)
Performance evaluation of the asymmetric BLIS
![Page 23](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/23.jpg)
Performance evaluation of the asymmetric BLIS
![Page 24](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/24.jpg)
Integration of the asymmetric BLIS in a conventional task scheduler
![Page 25](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/25.jpg)
Performance comparison versus asymmetry-aware task scheduler
![Page 26](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/26.jpg)
Conclusions
![Page 27](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/27.jpg)
In this work...
● Task-parallelism + Data-parallelism on AMPs
● Reuse of existing task schedulers
● Competitive with asymmetry-aware schedulers
![Page 28](https://reader034.fdocuments.net/reader034/viewer/2022050605/5faca97a01a73f3bfa2cce75/html5/thumbnails/28.jpg)
Thank you