Transcript of “PRACE – A Mid-Term Update”, Dietmar Erwin, Forschungszentrum Jülich
4
Supercomputing Drives Science through Simulation
[Diagram: scientific domains driven by simulation]
• Environment: Weather/Climatology, Pollution/Ozone Hole
• Ageing Society: Medicine, Biology
• Energy: Plasma Physics, Fuel Cells
• Materials/Inf. Tech: Spintronics, Nanoscience
5
History and First Steps (HPCEUR → HET → PRACE Initiative)
Timeline 2004–2008:
• Bringing scientists together; creation of the Scientific Case
• Production of the HPC part of the ESFRI Roadmap
• Creation of a vision involving 15 European countries; signature of the MoU
• Submission of an FP7 project proposal; approval of the project
• Project start
6
HET: The Scientific Case
• Weather, Climatology, Earth Science
  – degree of warming, scenarios for our future climate
  – understanding and predicting ocean properties and variations
  – weather and flood events
• Astrophysics, Elementary Particle Physics, Plasma Physics
  – systems and structures that span a large range of length and time scales
  – quantum field theories like QCD; ITER
• Material Science, Chemistry, Nanoscience
  – understanding complex materials, complex chemistry, nanoscience
  – determination of electronic and transport properties
• Life Science
  – systems biology, chromatin dynamics, large-scale protein dynamics, protein association and aggregation, supramolecular systems, medicine
• Engineering
  – complex helicopter simulation, biomedical flows, gas turbines and internal combustion engines, forest fires, green aircraft
  – virtual power plant
7
First success: HPC in the ESFRI Roadmap
The European Roadmap for Research Infrastructures is the first comprehensive definition of research infrastructure needs at the European level. Research Infrastructures are one of the crucial pillars of the European Research Area.
A European HPC service – foreseen impact:
• strategic competitiveness
• attractiveness for researchers
• supporting industrial development
8
Second success: The PRACE Initiative
• Memorandum of Understanding signed by 15 states in Berlin on April 16, 2007
• France, Germany, Spain, The Netherlands, and the UK committed funding for a European HPC Research Infrastructure (LoS)
9
Third success: The PRACE Project
EU approved the PRACE Preparatory Phase project (grant INFSO-RI-211528)
• 16 partners from 14 countries
• Project duration: January 2008 – December 2009
• Project budget: €20 M, EC funding: €10 M
• Kickoff: Jülich, January 29–30, 2008
10
PRACE Objectives in a Nutshell
• Provide world-class systems for world-class science
• Create a single European entity
• Deploy 3–5 systems of the highest performance level (tier-0)
• Ensure diversity of architectures
• Provide support and training
PRACE will be created to stay
12
After the First Successful Year
• Project review March 5–6, 2009, in Brussels
• A full day of presentations of the year-1 project results
• Coordinator and all work packages presented
• PRACE is the collaborative achievement of over 250 persons at the 16 partner sites
• Expertise includes a Chairman of the Board (coordinator), executives, scientists, programmers, legal experts, …
“The project made very good progress in many areas”
13
PRACE Work Packages
• WP1 Management
• WP2 Organizational concept
• WP3 Dissemination, outreach and training
• WP4 Distributed computing
• WP5 Deployment of prototype systems
• WP6 Software enabling for prototype systems
• WP7 Petaflop/s systems for 2009/2010
• WP8 Future petaflop/s technologies
14
Selected Results and Highlights of 2008
• Applications
• Systems/Architectures
• Training and Outreach
15
Software Enabling for Petaflop/s Systems
[Workflow diagram: Categorisation of Applications → Representative Applications → Analysis and Benchmarks → Application Requirements → Petascaling, Optimisation, Libraries & Models → Optimised and Petascaled Applications, deployed on the Prototype Systems and Architectures]
16
Categorisation of Applications

| Area | Share of usage (%) |
|---|---|
| Particle Physics | 23.5 |
| Computational Chemistry | 22.1 |
| Condensed Matter Physics | 14.2 |
| CFD | 8.6 |
| Earth & Climate | 7.8 |
| Astronomy & Cosmology | 5.8 |
| Life Sciences | 5.3 |
| Computational Engineering | 3.7 |
| Plasma Physics | 3.3 |
| Other | 5.8 |

• Benchmark applications should be representative of European HPC usage
• Surveys of PRACE partners’ HPC systems and major applications collected data for 24 systems and 69 applications
• This gives a quantitative basis for selecting representative applications
• Disseminated as a technical report
17
Application Requirements
• Analysis of representative applications, ported to relevant architectures
• Result: quantitative data from initial porting
• Supplemented by a user survey, sent to the top 10 users in each PRACE country
• Questions covered the user, usage patterns, HPC infrastructure, and upcoming algorithms
• Almost 70 responses from these major users were analysed
[Chart: Job Requirements – for each of the ~70 survey responses, the fraction of the machine used per job (“Size”, in %, scale 0–120) and the minimal execution time (“Length”, in hours, log scale 0.1–1000)]
18
Representative Benchmark Suite
• Defined a set of application benchmarks, to be used in the procurement process for Petaflop/s systems
• 12 core applications plus 8 additional applications
  – Core: NAMD, VASP, QCD, CPMD, GADGET, Code_Saturne, TORB, ECHAM5, NEMO, CP2K, GROMACS, N3D
  – Additional: AVBP, HELIUM, TRIPOLI_4, PEPC, GPAW, ALYA, SIESTA, BSIT
• Each application will be ported to an appropriate subset of the prototypes
• Synthetic benchmarks for architecture evaluation: computation, mixed-mode, I/O, bandwidth, OS, communication (a small bandwidth sketch follows below)
• Application and synthetic benchmarks are integrated into JuBE, the Juelich Benchmark Environment
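To illustrate what a synthetic bandwidth benchmark of the kind listed above measures, here is a minimal STREAM-style sketch in C. It is not part of the PRACE suite or of JuBE; the array size and the single SCALE kernel are illustrative assumptions.

```c
/* Minimal STREAM-style memory-bandwidth microbenchmark (illustrative sketch,
 * not part of the PRACE benchmark suite). Compile e.g.: cc -O2 scale_bw.c */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 67108864L   /* 64 Mi doubles per array (~512 MiB each); an assumed
                         size, chosen to exceed typical cache sizes */

int main(void) {
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    if (!a || !b) { fprintf(stderr, "allocation failed\n"); return 1; }

    /* Touch all pages of both arrays before timing */
    for (long i = 0; i < N; i++) { a[i] = 0.0; b[i] = (double)i; }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < N; i++)
        a[i] = 2.0 * b[i];               /* SCALE kernel: one read + one write */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double sec   = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
    double bytes = 2.0 * N * sizeof(double);  /* traffic, ignoring write-allocate */
    printf("SCALE bandwidth: %.2f GiB/s\n",
           bytes / sec / (1024.0 * 1024.0 * 1024.0));

    free(a); free(b);
    return 0;
}
```

A production suite would additionally time multiple kernels (copy, scale, add, triad), repeat each measurement, and control cache and write-allocate effects; this sketch only shows the basic measurement principle.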
19
Mapping Applications to Architectures
• Identified affinities and priorities, based on the application analysis, expressed in a condensed, qualitative way
  – Need for different “general purpose” systems
  – There are promising emerging architectures
• The mapping will become more quantitative after benchmark runs on the prototypes
[Affinity matrix not reproduced; E = estimated]
20
Selected Set of Prototypes
• Process: call – evaluation – selection – approval by EC (July 16: EC review; milestone M7.1)
• Funding: €2.2 M (50% of cost on average)

| Site | Architecture | Vendor/Technology |
|---|---|---|
| CEA-FZJ (France + Germany) | SMP-TN | Bull et al., Intel Xeon Nehalem |
| NCF (Netherlands) | SMP-FN | IBM Power6 |
| FZJ (Germany) | MPP | IBM BlueGene/P |
| CSC-CSCS (Finland + Switzerland) | MPP | Cray XT5/XTn, AMD Opteron |
| HLRS (Germany) | Hybrid, coarse grain | NEC vector SX/9 + x86 |
| BSC (Spain) | Hybrid, fine grain | IBM Cell + Power6 |
21
Installed Prototypes
• IBM BlueGene/P (FZJ): 01/2008
• IBM Power6 (SARA): 07/2008
• Cray XT5 (CSC): 11/2008
• IBM Cell/Power (BSC): 12/2008
• NEC SX9, vector part (HLRS): 02/2009
• Intel Nehalem/Xeon (CEA/FZJ): expected installation 04–05/2009
22
Summary of Current Prototype Status

| Milestone | IBM BlueGene/P at FZJ | IBM Power6 at SARA | Cray XT at CSC | IBM Cell/Power at BSC | NEC SX9/x86 at HLRS | Intel Nehalem/Xeon at CEA/FZJ |
|---|---|---|---|---|---|---|
| system installed | yes | yes | yes | yes | partly (vector) | no |
| system in production | yes | yes | yes | yes | partly (vector) | no |
| technical assessment | yes | nearly | yes | started | no | no |
| evaluation of communication and I/O infrastructure | yes | nearly | yes | started | no | no |
| evaluation and benchmarking of user applications | started | started | started | started | started | no |
23
Website and the Dissemination Channels
• The PRACE web presence with news, events, RSS feeds etc.: http://www.prace-project.eu
• AlphaGalileo service, reaching 6500 journalists around the globe: http://www.alphagalileo.org
• Belief Digital Library
• HPC magazines
• PRACE partner sites, top 10 HPC users
[Screenshot: the PRACE website, www.prace-project.eu]
24
PRACE Dissemination Package
• PRACE WP3 has created a dissemination package including templates, brochures, flyers, posters, badges, t-shirts, and USB keys
[Photos: the “Heavy Computing 10^15” PRACE t-shirt, the PRACE USB key, the PRACE logo]
25
PRACE Booth at ISC, ICT, SC
• PRACE had an exhibition booth at ISC’08 (Dresden, Germany), SC08 (Austin, Texas, US), and ICT 2008 (Lyon, France)
[Photos: PRACE booth at SC08 and at ICT 2008]
26
Selected Events
• Industry Seminar
• PRACE Award at ISC’08: best student paper on petascaling, awarded to Dominik Göddeke, Dortmund
28
Training: Survey of HPC Education and Training Needs
• The top 10 users at each participating PRACE member site were invited to complete the survey
• The resulting data set is among the most comprehensive evaluations of user training requirements
• Over 90% of respondents believed they would benefit from formal training in performance optimisation, debugging tools and techniques, code testing, and compiler optimisations
• Over 90% of users considered that there is an important need for improved HPC training programmes
• Requests to reuse the survey were received from Hong Kong and the US
29
Training: Summer School
• PRACE Petascale Summer School, August 26–29, 2008, Stockholm, Sweden
• It attracted 31 students, representing all PRACE member countries plus South Africa; in anonymous feedback from 26 students, the school received excellent grades
• Training material is published on the PRACE website
• Special access to large European systems (BG/P, Cray XT5, IBM Power6, and Cell) was offered to the students, in conjunction with lectures about programming models, hardware, development tools, and optimisation/debugging
[Photo: the PRACE Summer School at PDC, Stockholm]
30
2009 Training: Winter School
• PRACE Winter School, February 9–13, 2009, Athens, Greece
• 48 registered attendees (78 applications)
• Training material is published on the PRACE website
[Photos: the PRACE Winter School at the OTE Academy, Athens; HPC training and HPC training events on the PRACE website]
33
The Next Important Challenges (months 15–24)
• Draft contract → PRACE contract
• Governance document
• Document on funding agreement
• Peer review process
• Report on operating model
• Selection of the next production systems
• Get the initial commitments by governments converted into signed contracts
34
How to Get Involved?
• If you are a national coordinator of HPC activities and your country is not yet a member: join the PRACE Initiative!
• Port your code to the PRACE prototypes
  – Prototypes will mainly be used project-internally, but
  – they will also be made available to selected users for testing/porting purposes through a lightweight peer-review process
  – See: http://www.prace-project.eu/prototype-access
• Participate in the PRACE training events
35
Accessing the Future PRACE RI
Disclaimer: this is work in progress, not yet confirmed.
Access model:
• Based on peer review: “the best systems for the best science”
• Three types of resource allocations
  – Test/evaluation access
  – Project access: for a specific project, grant period ~1 year
  – Programme access: resources managed by a community
• Free of charge
Funding:
• Mainly national funding through partner countries
• European contribution
• The access model has to respect national interests (ROI)
36
Prototypes: Change Status to Green
(Current status: identical to the status table on slide 22 above.)
37
Prototypes: Change Status to Green – target

| Milestone | IBM BlueGene/P at FZJ | IBM Power6 at SARA | Cray XT at CSC | IBM Cell/Power at BSC | NEC SX9/x86 at HLRS | Intel Nehalem/Xeon at CEA/FZJ |
|---|---|---|---|---|---|---|
| system installed | yes | yes | yes | yes | yes | yes |
| system in production | yes | yes | yes | yes | yes | yes |
| technical assessment | yes | yes | yes | yes | yes | yes |
| evaluation of communication and I/O infrastructure | yes | yes | yes | yes | yes | yes |
| evaluation and benchmarking of user applications | yes | yes | yes | yes | yes | yes |
38
Refine Cost Analysis / Estimates
• Initial: simplified scenarios, using a market survey, vendor input, and partners’ experience
• Updated and refined iteratively
• Consistent with the a priori PRACE Initiative estimates from 2007
39
Market Watch – Top 500 (petascale worldwide…)

|  | Top 10 | Top 5 |
|---|---|---|
| Late 2009 | 0.5 PFlop/s | 1 PFlop/s |
| Late 2010 | 1 PFlop/s | 2 PFlop/s |
| Late 2011 | 2 PFlop/s | 5 PFlop/s |
40
Applications: Petascaling and Optimisation
Petascaling:
• Mixed-mode parallelisation
• Load balancing
• Minimisation of communication overheads
• Parallel I/O
• Checkpointing
Optimisation:
• Optimising serial performance (CPU, memory)
• Optimising both for general-purpose architectures and specialised architectures
• Algorithmic optimisations
PRACE will disseminate best practice in these areas (a minimal mixed-mode sketch follows below).
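As a concrete illustration of mixed-mode parallelisation and of keeping communication overhead low, here is a minimal MPI + OpenMP sketch in C. It is a generic example, not a PRACE code; the work loop and problem size are stand-ins for a real kernel.

```c
/* Minimal mixed-mode (MPI across nodes + OpenMP within a node) sketch.
 * Generic illustration, not a PRACE application.
 * Compile e.g.: mpicc -fopenmp -O2 hybrid.c */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided, rank, nranks;
    /* FUNNELED: only the master thread makes MPI calls */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    const long n = 100000000L;            /* global problem size (illustrative) */
    double local = 0.0;

    /* Coarse-grained decomposition across MPI ranks,
       fine-grained loop parallelism across OpenMP threads */
    #pragma omp parallel for reduction(+:local)
    for (long i = rank; i < n; i += nranks)
        local += 1.0 / (1.0 + (double)i); /* stand-in for real work */

    /* One collective reduction instead of many point-to-point messages
       keeps communication overhead small */
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("ranks=%d threads=%d sum=%.6f\n",
               nranks, omp_get_max_threads(), global);

    MPI_Finalize();
    return 0;
}
```

Run, for example, with mpirun -np 4 and OMP_NUM_THREADS=8 to map ranks to nodes and threads to cores; using fewer ranks with more threads typically reduces the number of messages and the MPI buffer memory per node, which is one motivation for mixed-mode codes on petascale systems.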
41
Libraries and Programming Models
• Classification of benchmark applications
• Current programming models: MPI, OpenMP, mixed-mode, …
• PGAS and other future programming models
• Accelerator languages: CUDA, RapidMind, OpenCL, …
• Petascale libraries
42
Future PRACE Events
• Second scientific seminar, May 11–13, 2009, Amsterdam, in collaboration with DEISA2 (DEISA PRACE Symposium)
• Second industry seminar, Toulouse, September 7–8, organised by GENCI & GAUSS; also covers small and medium-size enterprises
• 2nd PRACE Award at ISC’09, Hamburg
• Exhibition booths at major events: ISC’09 (June), SC09 (November)
• Five code porting and optimisation workshops: Finland, Sweden (2), Poland, Switzerland
• Additional training on GPU programming: April, France
43
Integration into the HPC Ecosystem
• Software builds on the developments of the DEISA project
• Synergies through the overlapping partnership between PRACE and DEISA
• Enhancements and adaptations where needed
44
Procurement Strategy
• Analysis of European procurement procedures completed:
  – Open
  – Restricted
  – Competitive dialogue
  – Negotiated
  – Pre-commercial
• Work in progress:
  – Definition of the general procurement process
  – Definition of selection and evaluation criteria
  – Evaluation process for vendor offers
45
Potential Sites and Installation Constraints
• Survey of PRACE partners
• Result: the building infrastructure of the PRACE partners can cover the requirements expected for future petascale systems, and most sites can evolve to fulfil the needs of future generations of systems
• To be updated at intervals
46
Future Petaflop/s Computer Technologies beyond 2010
• Assessment and evaluation of emerging multi-petascale technology, following the requirements of HPC users
• Implementation of a strategy that guarantees continuous HPC technology evaluation and system evolution within the PRACE Research Infrastructure
• Fostering the development of components for future multi-petascale production systems in cooperation with the European and international HPC industry
47
Creation of STRATOS
STRATOS: the PRACE advisory group for Strategic Technologies
• Design and implementation of a permanent HPC research and technology platform
• Agreements of STRATOS with cooperation partners, i.e. consortia for HPC development and research institutions (STRATOS contract)
• Definition of a long-term work plan (2009–2014)
48
Multi-Petaflop/s Technology
• Assessment of technologies needed to develop future multi-petascale systems:
  – Multi- and many-core processors
  – Computational accelerators
  – High-speed interconnects, storage architectures and I/O technologies
  – Novel compilers and performance tools
  – System management software
• Analysis of user requirements and selection of promising architecture options
49
Multi-Petaflop/s Components and Architectures
Assessment and evaluation/development of:
• Very highly multi-threaded processors (e.g. tens of cores per processor)
• Novel languages such as PGAS languages
• Novel high-speed interconnects
• Novel memory and storage technologies
• Novel parallel file system solutions
50
Promising Architectures Selected
System architectures:
• Thin-node systems (e.g., SGI ICE & Sun Constellation)
• Fat-node systems (e.g., SGI UV)
• All types of hybrid system architectures, with one exception: Cell (the MariCel system was already selected as a WP7 prototype)
Components for multi-Petaflop/s architectures:
• Novel multi- and many-core processors
• Compute accelerators
• Novel high-speed interconnects
• Novel storage solutions and file systems with end-to-end data integrity features
• PGAS and stream-processing languages
51
Prototypes Selected

| Site | Hardware/Software | Porting effort |
|---|---|---|
| CEA “GPU/CAPS” | 1U Tesla Server T1070 (CUDA, CAPS, DDT), Intel Harpertown nodes | “Evaluate GPU accelerators and GPGPU programming models and middleware” (e.g., porting a pollutant-migration code with a ray-tracing algorithm to CUDA and HMPP) |
| CINES-LRZ “LRB/CS” | Hybrid SGI ICE2/UV/Nehalem-EP & Nehalem-EX/ClearSpeed/Larrabee | Gadget, SPECFEM3D_GLOBE, RAxML, Rinf, RandomAccess, ApexMap, Intel MPI BM |
| CSCS “UPC/CAF” | Prototype PGAS language compilers (CAF + UPC for Cray XT systems) | “The applications chosen for this analysis will include some of those already selected as benchmark codes” |
| EPCC “FPGA” | Maxwell – FPGA prototype (VHDL support & consultancy + software licenses, e.g. Mitrion-C) | “We wish to port several of the PRACE benchmark codes to the system. The codes will be chosen based on their suitability for execution on such a system.” |
52
Prototypes Selected (cont’d)

| Site | Hardware/Software | Porting effort |
|---|---|---|
| FZJ (BSC) “Cell & FPGA interconnect” | eQPACE (PowerXCell cluster with special network processor) | Extend the FPGA-based interconnect beyond QCD applications |
| LRZ “RapidMind” | RapidMind (stream-processing programming paradigm) on x86, GPGPU, Cell | ApexMap, multigrid, FZJ (QCD), CINECA (linear-algebra kernels involved in solvers for ordinary differential equations), SNIC |
| NCF “ClearSpeed” | ClearSpeed CATS 700 units | Astronomical many-body simulation, iterative sparse solvers with preconditioning, finite-element code, cryomicrotome image analysis |
| CINECA | I/O subsystem (SSD, Lustre, pNFS) | – |

March 10: selection of prototypes targeting energy efficiency.
54
Backup Slides
• Advantage of supercomputers
• Europe’s position
• The Scientific Case / extrapolations from Germany
• Energy consumption
55
Six years of technology advantage with a #1 system
[Chart: Top 500 list 11/07 – a #1 system stays on the list for about 6 years; after 12 years the same power is on the desk]
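A rough way to see where the six-year figure comes from (a back-of-the-envelope sketch; the 2x-per-year growth rate and the 11/2007 list values below are assumptions taken from the public Top 500 lists, not stated on the slide): performance at a fixed list rank has historically grown close to 2x per year, and in 11/2007 the #1 system delivered about 478 TFlop/s Linpack while entry at rank 500 required roughly 6 TFlop/s. Then

$$ t \approx \log_2\frac{P_{\#1}}{P_{\#500}} = \log_2\frac{478}{6} \approx 6.3\ \text{years}, $$

so a #1 system takes about six years to drop off the list; continuing the same trend down another comparable factor to desktop-class performance yields the twelve-year figure.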
56
Europe’s current position in HPC
[Chart: aggregated LINPACK performance in PFlop/s across the November Top 500 lists]
91% of European HPC power is within PRACE countries.
57
HET: The Scientific Case (repeat of slide 6)
58
Status and Requirements, e.g. the German Case (numbers in TFlop/s)

| Scientific Field | 2005–2007 | 2007–2009 | 2010 |
|---|---|---|---|
| Plasma Physics | 10 | 50 | >500 |
| Biophysics and Bioinformatics | 3 | 15–80 | >1000 |
| Soft Matter | 3 | 30 | >200 |
| Theoretical Chemistry | 3 | 25–125 | >300 |
| Materials Science | 10 | 50–100 | >500 |
| Elementary Particle Physics and Physics of Hadrons and Nuclei | 30 | 100 | >1000 |
| Astrophysics | 10 | 50–100 | >500 |
| Computational Fluid Dynamics | 2.5 | 25–100 | >1000 |
| Solid-State Physics | 1 | 50–100 | >1000 |
| Nanostructure Physics | 1 | 10–50 | >200 |
| Geophysics | 1 | 10–100 | >1000 |
| Climate and Earth System Research | 20 | 50–100 | >500 |

Source: A. Bode, W. Hillebrandt, and Th. Lippert: German Scientific Case for the BMBF, 8/2005
59
Exascale Data Centre Challenges
• LBNL IJHPCA study (2008) for ~1/5 exaflop for climate science: extrapolation of Blue Gene and AMD design trends; estimate 20 MW for BG and 179 MW for AMD
• DOE E3 report: extrapolation of existing design trends to exascale in 2016; estimate 130 MW
• DARPA study: more detailed assessment of component technologies; estimate 20 MW just for memory alone, 60 MW aggregate extrapolated from current design trends
The current approach is not sustainable! A more holistic approach is needed!