The Cactus Framework & Numerical Relativity (gallen/Presentations/ORNL_Nov06.pdf)
The Cactus Framework & Numerical Relativity:
Petascale Requirements and Science Drivers

Gabrielle Allen, Ed Seidel, Peter Diener, Erik Schnetter, Christian Ott and [email protected]
Center for Computation & Technology
Departments of Computer Science & Physics
Louisiana State University
4/8/07
Gravitational Wave Physics
[Diagram: Observations and Models feeding Analysis & Insight; LSU/AEI Collaboration (Seidel/Rezzolla)]
Solving Einstein's Equations

• Einstein equations: Gµν(γij) = 8πTµν
  – Constraint equations:
    • 4 coupled elliptic equations for initial data and beyond
    • Familiar from Newton: ∇²φ = 4πρ
  – 12 fully 2nd-order evolution equations for γij, Kij (∂γij/∂t)
    • Like a "wave equation": ∂²φ/∂t² − ∇²φ = Source(φ, φ², φ′)
    • Thousands of terms in the RHS (automatic code generation)
  – 4 gauge conditions
    • Elliptic, hyperbolic, whatever …
  – GR hydrodynamics for Tµν
• Analytically one can only study trivial solutions or approximations … full numerical 3D models are needed
• Black hole physics, neutron star physics, vacuum spacetimes, BH-BH/NS-NS/NS-BH binaries, supernovae, gamma-ray bursts, …
Current BH Runs

• Previous unigrid runs scaled to thousands of processors
• Current runs use FMR (around 6 levels, ~80×80×80 nested grids)
  • Scale to 64 procs
  • Take 2 weeks to run
  • (Boundary conditions, gauge conditions, initial data, elliptic solves)
• Easy AMR visualization needed!
(Movie from P. Diener, CCT)
Current Supernova Work

Christian Ott (U. Arizona) et al.: fully consistent 3+1 GR calculations of rotating core collapse (using ORNL machines)
Cactus Code

• Freely available, modular, portable and manageable environment for collaboratively developing parallel, high-performance, multi-dimensional simulations (component-based)
• Developed for numerical relativity, but now a general framework for parallel computing (CFD, astro, climate, chem eng, quantum gravity, …)
• Finite difference, AMR (Carpet, SAMRAI, Grace), new FE/FV, multipatch
• Active user and developer communities; main development now at LSU and AEI
• Open source, documentation, etc.
Cactus Structure

Core "Flesh" with plug-in "thorns"

Core "Flesh" (ANSI C): parameters, grid variables, error handling, scheduling, extensible APIs, make system

Plug-in "Thorns" (modules, Fortran/C/C++): driver, input/output, interpolation, SOR solver, coordinates, boundary conditions, black holes, equations of state, remote steering, wave evolvers, multigrid, …

Your Physics!! Computational Tools!!
Cactus Flesh

• Written in ANSI C
• Independent of all thorns
• Contains flexible build system, parameter parsing, rule-based scheduler, …
• After initialization acts as a utility/service library which thorns call for information or to request some action (e.g. parameter steering)
• Contains abstracted APIs for:
  – Parallel operations, IO and checkpointing, reduction operations, interpolation operations, timers (APIs designed for science needs)
• All actual functionality provided by (swappable) thorns
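The flesh-as-scheduler pattern described above can be sketched in plain C. This is a self-contained toy, not real Cactus code: the names `register_routine`, `run_bin`, and `wave_evolve` are invented for illustration. "Thorns" register routines into schedule bins; the "flesh" walks the registry and calls them, handing over the grid data it manages.

```c
#include <string.h>

/* Miniature of the flesh/thorn pattern (invented names, not Cactus APIs):
 * the "flesh" keeps a registry of thorn routines and calls every routine
 * registered in a schedule bin, passing the grid data it manages. */

#define MAX_ROUTINES 16

typedef void (*thorn_routine)(double *grid, int n);

struct schedule_slot {
    const char *bin;            /* e.g. "EVOL", "ANALYSIS" */
    thorn_routine routine;
};

static struct schedule_slot registry[MAX_ROUTINES];
static int n_registered = 0;

/* A thorn calls this at startup to plug into the scheduler. */
void register_routine(const char *bin, thorn_routine r) {
    registry[n_registered].bin = bin;
    registry[n_registered].routine = r;
    n_registered++;
}

/* The flesh calls every routine registered in the given bin. */
void run_bin(const char *bin, double *grid, int n) {
    for (int i = 0; i < n_registered; i++)
        if (strcmp(registry[i].bin, bin) == 0)
            registry[i].routine(grid, n);
}

/* Example "thorn" routine: a trivial evolution step. */
void wave_evolve(double *grid, int n) {
    for (int i = 0; i < n; i++)
        grid[i] += 1.0;
}
```

The key property, as on the slide, is that the physics routine never manages memory or scheduling itself; it only receives the block of data the flesh hands it.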
Cactus Thorns (Components)

• Can be written in C, C++, Fortran 77, Fortran 90, (Java, Perl, Python)
• Separate libraries encapsulating some functionality
• To keep the distinction between functionality and its implementation, each thorn declares that it provides a certain "implementation"
• Different thorns can provide the same "implementation"
• Thorn dependencies are expressed in terms of "implementations", so that thorns providing the same "implementation" are interchangeable
Thorn Specification

• Each thorn contains configuration files which specify its interface with the Flesh and other thorns
• Configuration files are converted at compile time into a set of routines the Flesh can call for thorn information:
  – Scheduling directives
  – Variable definitions
  – Function definitions
  – Parameter definitions
  – Configuration details
• Configuration files have a well-defined language, which can be used as a basis to build interoperability with other component-based frameworks
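As an illustration of that configuration language, the fragments below are modeled loosely on the standard WaveToy example thorn; the exact directives and grammar may differ between Cactus versions, and the thorn/variable names here are schematic.

```
# interface.ccl -- what the thorn provides and which grid variables it owns
implements: wavetoy
inherits: grid

CCTK_REAL scalarevolve type=GF timelevels=3
{
  phi
} "The evolved scalar field"

# param.ccl -- parameters with ranges and defaults
REAL amplitude "Amplitude of the initial wave"
{
  0:* :: "must be non-negative"
} 1.0

# schedule.ccl -- when the Flesh should call the thorn's routines
schedule WaveToy_Evolve in CCTK_EVOL
{
  LANG: C
} "Evolve the scalar wave equation"
```

At build time, files like these are what the Flesh parses to generate the routines mentioned above (variable, parameter, and scheduling information).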
Users and Toolkits

• Many numerical relativity groups around the world
  – Over 100 publications
  – Maya, Whisky, Lazarus, …
• Others:
  – CFD
  – Quantum gravity
  – Chemical engineering
  – Crack propagation
  – Environmental modeling
  – Plasma physics
  – Computer science
  – Astrophysics
  – Cosmology
  – [Biology/Materials]
• Toolkits
  – Cactus Computational Toolkit
  – Einstein Toolkit
  – CFD Toolkit
  – (Biology Toolkit)
• Teaching
  – Over 30 student theses/diplomas
Cactus Einstein

• Cactus modules (thorns) for numerical relativity
• Many (>100) additional thorns available from other groups (AEI, CCT, …)
• Agree on a few basics (e.g. names of variables) and then can share evolution, analysis, etc.
• Over 100 relativity papers & 30 student theses

[Thorn diagram:]
  – Evolve: ADM, EvolSimple
  – Analysis: ADMAnalysis, ADMConstraints, AHFinder, Extract, PsiKadelia, TimeGeodesic
  – InitialData: IDAnalyticBH, IDAxiBrillBH, IDBrillData, IDLinearWaves, IDSimple
  – Gauge Conditions: CoordGauge, Maximal
  – Infrastructure: ADMBase, ADMCoupling, ADMMacros, StaticConformal, SpaceMask
Numerical Methods

• Most application codes using Cactus use finite differences on structured meshes
• Parallel driver thorns: unigrid (PUGH), FMR (Carpet), AMR (PARAMESH, Grace, SAMRAI), finite volume/element on structured meshes
• Method of lines thorn
• Elliptic solver interface (PETSc, SOR, multigrid, Trilinos)
• Multipatch with the Carpet driver
• Unstructured mesh support being added
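The method-of-lines approach mentioned above can be sketched on a toy problem: discretize in space with centered finite differences, then integrate the resulting ODE system in time. This is a minimal illustration (a 1-D advection equation with invented function names), not the Cactus MoL thorn itself; the Einstein right-hand sides have the same structure but thousands of terms.

```c
/* Method-of-lines sketch: spatial derivatives by centered finite
 * differences turn the PDE  du/dt = -c du/dx  (periodic boundary)
 * into an ODE system, here stepped with 2nd-order Runge-Kutta. */

#define N 64

static void rhs(const double *u, double *dudt, double c, double dx) {
    for (int i = 0; i < N; i++) {
        int ip = (i + 1) % N, im = (i + N - 1) % N;   /* periodic wrap */
        dudt[i] = -c * (u[ip] - u[im]) / (2.0 * dx);  /* centered difference */
    }
}

/* One RK2 (midpoint) step of size dt. */
void mol_step(double *u, double c, double dx, double dt) {
    double k[N], umid[N];
    rhs(u, k, c, dx);
    for (int i = 0; i < N; i++) umid[i] = u[i] + 0.5 * dt * k[i];
    rhs(umid, k, c, dx);
    for (int i = 0; i < N; i++) u[i] += dt * k[i];
}
```

A useful property for checking such schemes: on a periodic grid the centered differences telescope, so the sum of u is conserved to roundoff regardless of the time step.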
Parallelism

• Scheduler calls routines and provides an n-D block of data (typical set-up for FD codes)
  • Also information about size, boundaries, etc.
• Fortran memory layout used (appears to C as a 1D array)
• Driver thorns are responsible for memory management and communication
  – Abstracted from science modules
• Supported parallel operations:
  – Ghostzone synchronization, generalized reduction, generalized interpolation
PUGH Unigrid Driver

• Was the standard driver for science runs until last year
• MPI domain decomposition
• Flexible methods for load balancing, processor topology
• Well optimized, scales very well for numerical relativity kernels (e.g. to 33K processors on BG/L)
Carpet AMR Driver

• Carpet is a mesh refinement library for Cactus, written in C++, mainly developed by Erik Schnetter (LSU)
• Implements the (minimal) Berger-Oliger algorithm: constant refinement ratio, vertex-centered refinement
• Uses MPI to decompose grids across processors and handle communications
• Currently scaled and used with up to 64 processors; work is ongoing to optimize further
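Vertex-centered refinement with a constant refinement ratio, as used above, can be sketched in a few lines. This is a schematic of prolongation (coarse-to-fine transfer) only, not Carpet's actual routines: with ratio 2, fine points that coincide with coarse vertices are copied, and the points in between are linearly interpolated.

```c
/* Sketch of vertex-centered prolongation with refinement ratio 2
 * (Berger-Oliger style; schematic, not Carpet's implementation).
 * A coarse grid of nc vertices maps onto a fine grid of 2*nc - 1
 * vertices covering the same interval. */
void prolongate(const double *coarse, int nc, double *fine) {
    for (int i = 0; i < nc - 1; i++) {
        fine[2 * i]     = coarse[i];                         /* coincident vertex */
        fine[2 * i + 1] = 0.5 * (coarse[i] + coarse[i + 1]); /* interpolated midpoint */
    }
    fine[2 * (nc - 1)] = coarse[nc - 1];                     /* last vertex */
}
```

The matching restriction (fine-to-coarse) step is even simpler for vertex-centered grids: coincident fine values are injected straight back into the coarse grid.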
IO

• Support for IO in different formats (generic interface)
  – 2-d slices as JPEGs
  – N-d ASCII data
  – N-d data in IEEEIO format
  – N-d data in HDF5 format (to disk or streamed)
  – Panda parallel IO
  – Isosurfaces, MPEGs
• Checkpoint/restart (move to any new machine)
External Libraries

• IO API:
  – HDF5, FlexIO, jpeg, mpeg, NetCDF
• Elliptic/solver subsystem:
  – LAPACK, BLAS, PETSc, Trilinos, FFTW
• Timer API:
  – PAPI
• Driver API:
  – MPI, Grace, SAMRAI, PARAMESH
• Grid:
  – GAT, MPICH-G2, (Globus)
Performance and Optimization

• Designed for portability: iPAQ, PS2, Xbox, Itanium '99, EarthSim, BG/L, …
• 33K procs on BG/L, but now new, more complex data structures (AMR)

[Plot: BG/L scaling (2006)]
Relativistic Astrophysics in 2010

• Frontier astrophysics problems
  – Full 3D GR simulations of binary systems for dozens of orbits and merger to final black hole
    • All combinations of black holes, neutron stars, and exotic objects like boson stars, quark stars, strange stars
  – Full 3D GR simulations of core collapse, supernova explosions, accretion onto NS, BH
  – Gamma-ray bursts
• All likely to be observed by LIGO in the timeframe of this facility
Physics Needed

• Full 3D GR
  – Coupled hyperbolic, elliptic equations, 10^4 double-precision floats/grid point. 30 years of work; only in the last year have algorithms been developed for many orbits of binaries
• GR hydro
  – Complete coupled GR-hydro codes just now available
• MHD
  – Just starting
• Nuclear physics, EOS
  – Complex, time consuming
• Radiation transport
  – Not really started in 3D GR

Need all of these for a complete solution to the above problems
Computational Needs for GRB

• Resolve from 10,000 km down to 100 m on a domain of 1,000,000 km cubed for 100 s of physical time
• Assume 16,000 flops per grid point
• 512 grid functions
• Computationally:
  – High-order (>4th) adaptive finite difference schemes
  – 16 levels of refinement
  – Several weeks with 1 PFLOP/s sustained performance
  – (at least 4 PFLOP/s peak, >100K procs)
  – 100 TB memory (size of checkpoint file needed)
  – PBytes of storage for full analysis of output
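The 100 TB memory figure above can be sanity-checked with back-of-envelope arithmetic on the numbers the slide does give (512 grid functions, 16 refinement levels, double precision). The 1024³ points per refinement level assumed below is an illustrative guess, not a figure from the slide; with it, the estimate lands around 70 TB, and storing extra timelevels per grid function pushes toward the quoted 100 TB.

```c
/* Back-of-envelope memory estimate for the GRB run: double-precision
 * grid functions on a stack of refinement levels. The points-per-level
 * value used in the test (1024^3) is an assumption for illustration. */
double estimate_terabytes(double points_per_level, int levels,
                          int grid_functions) {
    double bytes = points_per_level * (double)levels
                   * (double)grid_functions * sizeof(double);
    return bytes / 1e12;   /* convert bytes to TB (10^12) */
}
```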
Code Details

• Cactus is a computational framework that is used widely in numerical relativity and astrophysics groups around the world. Using the Einstein Toolkit in Cactus, similar codes have been developed that solve Einstein's equations for the gravitational field, and they have been applied successfully to the binary black hole problem. An EU group has further developed a GR-hydro code in Cactus, called Whisky, that extends the capabilities to include matter, and has successfully applied the coupled Einstein-hydro solver to problems of relativistic neutron star binaries, mixed binary systems, collapse of a neutron star to a black hole, and full 3D relativistic supernova calculations.
• Cactus couples the physics solvers to parallel driver layers, which now provide AMR, parallel I/O, steering interfaces, and so on. We estimate that in several years the computations described above will be the state of the art, and will be urgently needed to interpret gravitational wave events being seen by LIGO.
• Cactus has been used in benchmarking studies on virtually every machine, from Sony PlayStations to the Earth Simulator. It was recently shown to scale well to 32K processors with a relativity code using a uniform mesh. Work, now underway, will be needed to achieve such scaling results for AMR drivers.
Code Details (2)

• Main methods
  – 64 bit
  – Explicit finite differencing
  – Domain decomposition
  – Structured grids with adaptive mesh refinement
  – Parallel I/O
  – Minimal global reductions (sums, min/max)
Code Details (3)

• What is the programming paradigm?
  – Domain decomposition with message passing (applications work with abstract operations)
• What languages will be used?
  – Cactus Flesh and core thorns in C, some physics thorns in F90, Carpet driver in C++
• What libraries will be required?
  – PETSc, LAPACK, HDF5, PAPI, MPI
Code Details (4)

• What does the source code look like?
  – Around 500,000 lines of code, but depends on the number of analysis thorns used (this is for 85 thorns in total)
• On what is the source code based?
  – Open source, used by different apps
  – Cactus toolkit, many community thorns for numerical relativity
• What special features of the system will it use?
Code Details (5)

• I/O
  – Typically using HDF5 (need platform independence)
  – I/O layer in Cactus allows new methods to be easily added/used
  – Many parameter choices (one file per processor, one file per n procs, one file per timestep, chunked vs. unchunked data)
  – Checkpoint and restart handled by Cactus
Code Details (6)

• Code termination
  – Steering interface allows termination from a given condition, from an external machine, etc.
  – Usually codes crash due to numerical problems (black hole singularities, grid stretching, etc.)
• Fault tolerance
  – None implemented at the MPI level, but we have plans for this if necessary
Code Details (7)

• Debugging
  – printf, gdb
  – Some Cactus thorns (and plans for more), e.g. NaNChecker
  – Fair number of checks in the Cactus infrastructure (e.g. parameter checking, different levels of warnings)
  – Plans for more real-time debugging capabilities (e.g. via web interface, logging)
  – TotalView, Purify, …
Code Details (8)

• Profiling
  – Cactus has its own timing interface (thorns, timebins, communication, user-defined, …)
  – Use PAPI through the Cactus timing interface