A Framework for Large-Scale, High-Performance Lattice
Boltzmann Simulations in Complex Geometries
C. Godenschwager, M. Bauer, F. Schornbaum, and U. Rüde
Chair for System Simulation, FAU Erlangen-Nürnberg
SIAM PP18, Tokyo, Japan
March 8, 2018
A Framework for Large-Scale, High-Performance LB Simulations in Complex Geometries - C. Godenschwager et al.
Outline
• The waLBerla framework
• Simulation setup & results
• Scaling experiments
• Summary & Outlook
The waLBerla framework
• written in C++14
• main focus on CFD simulations based on the lattice Boltzmann method (LBM)…
• …but generally suitable for all kinds of numerical codes working with uniform domain decompositions (multigrid solvers, phase field method, physics engine)
• at its very core designed as an HPC software framework:
  • scales from laptops to current petascale supercomputers
  • largest simulation: 1,835,008 processes (IBM Blue Gene/Q @ Jülich)
  • hybrid parallelization: MPI + OpenMP
  • vectorization of compute kernels
• many modules are available as open source → http://www.walberla.net
Vocal Fold Study (Florian Schornbaum)
Fluid Structure Interaction (Simon Bogner)
Free Surface Flow (Martin Bauer)
Rigid Body Dynamics (Sebastian Eibl, Christian Godenschwager)
Electron Beam Melting (Matthias Markl, Regina Ammer)
Phase Field Simulations (Martin Bauer, Johannes Hötzer)
waLBerla building blocks
• LBM hydrodynamics
• pe rigid body dynamics
• Static grid refinement
• Dynamic grid refinement
• Multigrid methods
• Free surface flows
• Fluid-solid interaction
• Phase field models
• Thermal LBM
• Complex geometry handling
• Electrokinetics
• Compute kernel generation
The lattice Boltzmann method (TRT)
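$$
f_q(\vec{x} + \vec{c}_q,\, t + 1) = f_q(\vec{x}, t)
  - \lambda_+ \left[ f_q^{\mathrm{eq},+}(\rho, \vec{u}) - f_q^{+}(\vec{x}, t) \right]
  - \lambda_- \left[ f_q^{\mathrm{eq},-}(\rho, \vec{u}) - f_q^{-}(\vec{x}, t) \right]
$$
• Meaning: relax the particle distribution functions (PDFs) $f_q$ linearly towards their equilibrium values
• TRT: the $f_q$ are split into their symmetric (+) and anti-symmetric (−) parts
• $\lambda_+$: determines fluid viscosity
• $\lambda_-$: improves boundary accuracy & stability
density: $\rho = \sum_q f_q$
momentum: $\rho \vec{u} = \sum_q f_q \vec{c}_q$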
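The collision part of the TRT update can be sketched for a single lattice cell. This is a minimal illustration, not waLBerla's optimized kernel. It assumes a D2Q9 stencil and the standard second-order equilibrium, neither of which is specified on the slide. The names `trt_collide`, `equilibrium`, and `moments` are hypothetical. With the sign convention of the formula taken literally, relaxation towards equilibrium requires negative rates, i.e. values in (−2, 0).

```python
# D2Q9 stencil (assumed for illustration): velocities, weights, opposite directions.
C = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1), (1, -1), (-1, 1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4
OPP = [0, 2, 1, 4, 3, 6, 5, 8, 7]  # index of the direction with velocity -c_q


def moments(f):
    """Density rho = sum_q f_q and velocity u from momentum rho*u = sum_q f_q c_q."""
    rho = sum(f)
    ux = sum(fq * c[0] for fq, c in zip(f, C)) / rho
    uy = sum(fq * c[1] for fq, c in zip(f, C)) / rho
    return rho, (ux, uy)


def equilibrium(rho, u):
    """Standard second-order equilibrium (an assumption, not taken from the slide)."""
    usq = u[0] ** 2 + u[1] ** 2
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * u[0] + cy * u[1]
        feq.append(w * rho * (1 + 3 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def trt_collide(f, lam_plus, lam_minus):
    """Post-collision PDFs of one cell; streaming to x + c_q is not included."""
    rho, u = moments(f)
    feq = equilibrium(rho, u)
    post = []
    for q in range(len(C)):
        qb = OPP[q]
        f_sym, f_asym = 0.5 * (f[q] + f[qb]), 0.5 * (f[q] - f[qb])
        feq_sym, feq_asym = 0.5 * (feq[q] + feq[qb]), 0.5 * (feq[q] - feq[qb])
        post.append(f[q] - lam_plus * (feq_sym - f_sym) - lam_minus * (feq_asym - f_asym))
    return post


f = [0.40, 0.12, 0.10, 0.11, 0.11, 0.03, 0.02, 0.03, 0.025]
post = trt_collide(f, lam_plus=-1.2, lam_minus=-0.8)
```

Collision leaves density and momentum unchanged, and choosing both rates equal to −1 sets every PDF directly to its equilibrium value.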
LBM in waLBerla
• Oldest module of the waLBerla framework
• Highly optimized compute kernels
• Scalability: over 1 trillion (10^12) lattice cells on 1,835,008 processes
Godenschwager et al. - A framework for hybrid parallel flow simulations with a trillion cells in complex geometries, 2013
• geometry given by surface mesh
• domain partitioning into blocks
• empty blocks are discarded
• load balancing
• allocation of block data (→ grids)
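The setup steps named on this slide can be sketched as follows. This is a simplified illustration under stated assumptions, not waLBerla's implementation. The geometry is a hypothetical analytic sphere rather than a surface mesh, and the load balancer is a plain greedy heuristic swapped in for brevity. The functions `partition`, `balance`, and `in_sphere` are made-up names.

```python
import heapq
from itertools import product


def partition(domain_cells, block_edge, is_fluid):
    """Cut a box of domain_cells = (nx, ny, nz) cells into cubic blocks.

    Returns {block_index: fluid_cell_count}, keeping only non-empty blocks.
    """
    blocks_per_axis = [-(-n // block_edge) for n in domain_cells]  # ceiling division
    blocks = {}
    for index in product(*(range(n) for n in blocks_per_axis)):
        # Cell ranges covered by this block, clipped to the domain.
        ranges = [
            range(b * block_edge, min((b + 1) * block_edge, n))
            for b, n in zip(index, domain_cells)
        ]
        fluid = sum(1 for cell in product(*ranges) if is_fluid(*cell))
        if fluid > 0:  # empty blocks are discarded
            blocks[index] = fluid
    return blocks


def balance(blocks, num_processes):
    """Greedy load balancing: heaviest block first, onto the least-loaded process."""
    loads = [(0, rank) for rank in range(num_processes)]
    heapq.heapify(loads)
    owner = {}
    for index, work in sorted(blocks.items(), key=lambda item: -item[1]):
        load, rank = heapq.heappop(loads)
        owner[index] = rank
        heapq.heappush(loads, (load + work, rank))
    return owner


def in_sphere(x, y, z, center=16.0, radius=14.0):
    """Hypothetical geometry: cells whose centers lie inside a sphere are fluid."""
    return sum((v + 0.5 - center) ** 2 for v in (x, y, z)) <= radius ** 2


blocks = partition((32, 32, 32), 8, in_sphere)
owner = balance(blocks, num_processes=4)
# Grid data is allocated last, and only for the blocks a process owns (here: rank 0).
grids = {index: [0.0] * 8 ** 3 for index, rank in owner.items() if rank == 0}
```

Doing the partitioning and balancing on lightweight block metadata first means that no grid memory is spent on blocks that turn out to be empty.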
• separation of domain partitioning from simulation (via a file on DISK)
• file size: kilobytes to a few megabytes
More on octree-based setup, static and dynamic refinement: tomorrow, 2:30 PM, in MS91,
"Extreme-Scale Block-Structured Adaptive Mesh Refinement" by Florian Schornbaum
block size → #blocks (dx = 0.2 mm, target: ≤ 200 blocks)
14³ → 649
18³ → 413
23³ → 277
29³ → 201
37³ → 149
33³ → 154
31³ → 184
30³ → 190
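Read as cubic block edge lengths, the sequence above (14, 18, 23, 29, 37, 33, 31, 30) is consistent with a two-phase search. First the block volume is roughly doubled until the block count meets the target. Then the edge length is bisected between the last failing and the first passing size. The sketch below is one reading of the numbers, not a confirmed description of waLBerla's algorithm. `find_block_size` is a hypothetical name, and the block counts are simply looked up from the slide instead of being computed from a geometry.

```python
def find_block_size(count_blocks, target, start):
    """Search for the smallest block edge whose block count is <= target.

    Returns (edge, trace), where trace lists every (edge, block_count) probed.
    Assumes the block count does not increase when the edge length grows.
    """
    trace = []

    def probe(edge):
        blocks = count_blocks(edge)
        trace.append((edge, blocks))
        return blocks

    lo, hi = None, start
    # Phase 1: grow the edge by 2^(1/3), i.e. roughly double the block volume.
    while probe(hi) > target:
        lo = hi
        hi = max(hi + 1, round(hi * 2 ** (1 / 3)))
    if lo is None:
        return hi, trace  # the starting size already meets the target
    # Phase 2: bisect between the failing edge lo and the passing edge hi.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if probe(mid) <= target:
            hi = mid
        else:
            lo = mid
    return hi, trace


# Block counts taken from the slide (coronary tree dataset, dx = 0.2 mm).
SLIDE_COUNTS = {14: 649, 18: 413, 23: 277, 29: 201, 37: 149, 33: 154, 31: 184, 30: 190}
edge, trace = find_block_size(SLIDE_COUNTS.__getitem__, target=200, start=14)
```

With the slide's numbers, the search probes exactly the listed sizes in the listed order and settles on an edge length of 30 cells (190 blocks).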
Domain partitioning of coronary tree dataset (one block per process)
• 512 processes: 485 blocks
• 458,752 processes: 458,184 blocks
Surface Meshes
Scaling experiments
| JUQUEEN | SuperMUC (Phase 1) |
| --- | --- |
| Forschungszentrum Jülich, Germany | LRZ, Garching (Munich), Germany |
| IBM system | IBM system |
| Blue Gene/Q | Intel Sandy Bridge-EP |
| 28,672 nodes | 9,216 nodes |
| 458,752 cores | 147,456 cores |
| 5.9 Petaflops peak | 3.2 Petaflops peak |
| 448 TB main memory | 288 TB main memory |
| 5D torus network | Non-blocking tree / 4:1 pruned tree |
Weak scaling experiments
Strong scaling experiments
Strong scaling JUQUEEN
Strong scaling SuperMUC
Automatic generation of LBM compute kernels
(Martin Bauer)
Models / Features
• stencils
• moment-based methods (MRT)
• efficient SRT and TRT implementations
• moment basis construction
• various equilibria
• forcing approaches
• different collision space: cumulant method
• entropic stabilization
• locally varying relaxation rates, e.g. to include turbulence models
• coupling of multiple kernels
Hardware / Optimization
• GPU/CUDA support
• (manual) or guided vectorization (AVX2, AVX512, QPX)
• inner loop splitting to improve prefetching due to lower number of load/store streams
• sparse (list-based) kernels for domains with many boundary cells
• data layout: simple two grid stream-collide, AABB pattern, EsoTwist
Too many combinations to provide handcrafted compute kernels for → code generation
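The idea behind code generation can be illustrated with a toy example: instead of handwriting one kernel per stencil, source code is emitted from a stencil description. The sketch below is not the waLBerla generator, which the slides do not show. It is a deliberately tiny stand-in that emits an unrolled Python function for density and momentum, and `generate_moments_kernel` is a hypothetical name.

```python
def generate_moments_kernel(stencil, name="moments"):
    """Emit Python source for a loop-free density/momentum kernel of a stencil."""
    dim = len(stencil[0])
    lines = [f"def {name}(f):"]
    lines.append("    rho = " + " + ".join(f"f[{q}]" for q in range(len(stencil))))
    for d in range(dim):
        terms = []
        for q, c in enumerate(stencil):
            if c[d] == 1:
                terms.append(f"+ f[{q}]")
            elif c[d] == -1:
                terms.append(f"- f[{q}]")
            elif c[d] != 0:
                terms.append(f"+ {c[d]} * f[{q}]")
            # Directions with c[d] == 0 contribute nothing and are skipped.
        expr = " ".join(terms).lstrip("+ ") or "0.0"
        lines.append(f"    j{d} = {expr}")
    lines.append("    return rho, (" + ", ".join(f"j{d}" for d in range(dim)) + ",)")
    return "\n".join(lines)


D2Q9 = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1), (1, -1), (-1, 1)]
source = generate_moments_kernel(D2Q9)

# Compile the generated source and fetch the resulting function.
namespace = {}
exec(source, namespace)
moments = namespace["moments"]
rho, momentum = moments([0.40, 0.12, 0.10, 0.11, 0.11, 0.03, 0.02, 0.03, 0.025])
```

Printing `source` shows the specialized kernel. Multiplications by zero are dropped at generation time, which is the kind of simplification a handwritten generic loop cannot make.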
Thank you!
A special thank you to the LRZ in Garching and the JSC in Jülich for the
compute time and their friendly support!