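Symposium Groene ICT en duurzaamheid: Nieuwe energie in het hoger onderwijs (Symposium on Green ICT and Sustainability: New Energy in Higher Education)
Jezelf Groen Rekenen met Supercomputers (Computing Yourself Green with Supercomputers)
Walter Lioen <[email protected]>, Group Leader Supercomputing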
About SURFsara
• SURFsara offers an integrated ICT research infrastructure and provides services in the areas of computing, data storage, visualization, networking, cloud and e-Science.
• SARA was founded in 1971 as an Amsterdam computing center by the two Amsterdam universities (UvA and VU) and the current CWI.
• Independent as of 1995.
• Founded Vancis in 2008, offering ICT services and ICT products to enterprises, universities, and educational and healthcare institutions.
• As from 1 January 2013, SARA – from then on SURFsara – forms part of the SURF Foundation.
• First supercomputer in The Netherlands in 1984 (Control Data Cyber 205). Hosting the national supercomputer(s) ever since.
Jezelf Groen Rekenen met Supercomputers – Walter Lioen January 30, 2014 2
What is a Supercomputer?
• A supercomputer is a computer at the frontline of current processing capacity, particularly speed of calculation.
• Consequently, the specification of a supercomputer is constantly changing.
• Rule of thumb: a supercomputer is at least 1,000 to 10,000, and up to 100,000, times faster than an average PC.
Why supercomputing?
• Large-scale scientific computing: simulation of processes that are otherwise
  - Impossible in practice
  - Too expensive
  - Too dangerous
  - Too extended
• Examples
  - Astronomy
    - How did the universe begin?
    - How do stars form and evolve?
  - Weather prediction, climatology
  - Nuclear physics
  - Aerodynamics (cars, planes, rockets)
  - Biology (proteins, DNA, drugs)
  - Medical sciences (bone formation, blood flow)
Top500: PFlop/s
• HPL, the High-Performance Linpack benchmark, solves a (random) dense linear system in double-precision (64-bit) arithmetic on distributed-memory computers
• For Tianhe-2, the nr. 1 as of June 2013 (3,120,000 cores, 54.9 PFlop/s peak, 17.8 MW):
  - n = 9,960,000
• Computational kernel: DGEMM (matrix multiply)
• Extremely efficient on all processors (in cache)
• Limiting factors:
  - Speed of interconnect
  - Speed to (local accelerator) memory (e.g. for GPGPU)
• However, far more important: application speed
• “In Amsterdam a Ferrari is useless (speed-wise)”
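To give a feeling for the problem size n = 9,960,000, the sketch below estimates the work and memory of such an HPL run. It uses only the leading-order operation count of LU factorisation, (2/3)·n³, and ignores lower-order terms. The sustained speed of 33.86 PFlop/s is an assumption taken from the public June 2013 Top500 entry for Tianhe-2; it does not appear on the slide, which quotes peak performance.

```python
# Rough sizing of an HPL run; a sketch, not the benchmark itself.

def hpl_flops(n: int) -> float:
    """Leading-order flop count for solving a dense n x n system by LU: (2/3) n^3."""
    return (2.0 / 3.0) * n ** 3


def matrix_bytes(n: int) -> int:
    """Storage for an n x n matrix of 64-bit (8-byte) doubles."""
    return 8 * n * n


N_TIANHE2 = 9_960_000          # problem size from the slide
ASSUMED_RMAX_FLOPS = 33.86e15  # assumption: sustained HPL speed, not from the slide

work = hpl_flops(N_TIANHE2)
runtime_h = work / ASSUMED_RMAX_FLOPS / 3600.0

print(f"work:    {work:.3e} flop")
print(f"matrix:  {matrix_bytes(N_TIANHE2) / 1e12:.0f} TB")
print(f"runtime: {runtime_h:.1f} h at the assumed sustained speed")
```

Under these assumptions the matrix alone occupies several hundred terabytes and a single run takes hours, which is why interconnect and memory speed show up as limiting factors.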
Top500 – iPad 2 performance
• An A5 processor core of an iPad 2 is as fast as a four-processor Cray 2 supercomputer (1.951 GFlop/s)
• In 1985 an eight-processor Cray 2 was the fastest supercomputer in the world
• The iPad 2 would still have been listed in the Top500 of 1994
Green500: MFlop/s / Watt
November 2013 Green500 List observations:
• Rank 1 – 10: (Intel Xeon + NVIDIA K20)
  - commodity processors with GPGPUs (graphics processing units)
• Rank 1: TSUBAME-KFC (Japan, Ivy Bridge + NVIDIA K20x)
  - 4,503.17 MFlop/s / W (first time > 4 GFlop/s / W)
  - At this efficiency an exaflop system would require 222 MW (DARPA’s target is > 1 EFlop/s using < 20 MW)
• Rank 4: Piz Daint (Switzerland, Cray XC30, Sandy Bridge + NVIDIA K20x)
  - 3,185.91 MFlop/s / W
  - the greenest petaflop supercomputer
  - the current Top500 #6
• Rank 12: (USA, Blue Gene/Q)
  - 2,299.15 MFlop/s / W
  - highest ranked non-heterogeneous (CPU only) system
• Rank 40: Tianhe-2 (China, Ivy Bridge + Xeon Phi)
  - 1,901.54 MFlop/s / W
  - the current Top500 #1
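The 222 MW figure follows directly from the rank 1 efficiency. As a minimal sketch, the snippet below repeats that division for the systems listed above and derives the efficiency implied by the 20 MW target. The function and variable names are illustrative only.

```python
# Power needed for 1 EFlop/s at a given Green500 efficiency.

EXAFLOP = 1e18  # flop/s


def exaflop_power_mw(mflops_per_watt: float) -> float:
    """Megawatts needed to sustain 1 EFlop/s at the given MFlop/s per Watt."""
    watts = EXAFLOP / (mflops_per_watt * 1e6)
    return watts / 1e6


def required_mflops_per_watt(power_mw: float) -> float:
    """Efficiency (MFlop/s per Watt) needed to reach 1 EFlop/s within power_mw."""
    return EXAFLOP / (power_mw * 1e6) / 1e6


# Efficiencies in MFlop/s per Watt, as listed on the slide.
systems = {
    "TSUBAME-KFC": 4503.17,
    "Piz Daint": 3185.91,
    "Tianhe-2": 1901.54,
}

for name, eff in systems.items():
    print(f"{name:12s} {exaflop_power_mw(eff):6.0f} MW for 1 EFlop/s")

target = required_mflops_per_watt(20.0)
print(f"20 MW target needs {target:,.0f} MFlop/s / W, "
      f"{target / systems['TSUBAME-KFC']:.1f}x the rank 1 efficiency")
```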
SURFsara National Supercomputing History
Year  Machine                       Rpeak (GFlop/s)  Power (kW)  GFlop/s / kW
1984  CDC Cyber 205 1-pipe                      0.1         250        0.0004
1988  CDC Cyber 205 2-pipe                      0.2         250        0.0008
1991  Cray Y-MP/4128                           1.33         200        0.0067
1994  Cray C98/4256                               4         300        0.0133
1997  Cray C916/121024                           12         500         0.024
2000  SGI Origin 3800                         1,024         300           3.4
2004  SGI Origin 3800 + Altix 3700            3,200         500           6.4
2007  IBM p575 Power5+                       14,592         375            40
2008  IBM p575 Power6                        62,566         540           116
2009  IBM p575 Power6                        64,973         560           116
2013  Bull bullx B710 (DLC) + R428          270,950         245          1106
2014  Bull bullx B515 (NVIDIA K40)         >200,000         <60         >3333
2014  Bull bullx complete system         >1,000,000        >520         >1923
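The last column of the history table is simply Rpeak divided by power. The sketch below recomputes it for a few rows with firm numbers (the 2014 rows are bounds and are left out) and derives the overall efficiency gain between 1984 and 2013.

```python
# Recompute the GFlop/s per kW column for selected rows of the history table.

HISTORY = [
    # (year, machine, Rpeak in GFlop/s, power in kW)
    (1984, "CDC Cyber 205 1-pipe", 0.1, 250),
    (1991, "Cray Y-MP/4128", 1.33, 200),
    (2000, "SGI Origin 3800", 1024, 300),
    (2008, "IBM p575 Power6", 62566, 540),
    (2013, "Bull bullx B710 (DLC) + R428", 270950, 245),
]

efficiency = {year: rpeak / kw for year, _machine, rpeak, kw in HISTORY}

for year, machine, _rpeak, _kw in HISTORY:
    print(f"{year}  {machine:30s} {efficiency[year]:12.4f} GFlop/s / kW")

gain = efficiency[2013] / efficiency[1984]
print(f"efficiency gain 1984 -> 2013: about {gain:.2e}x")
```

Note that the 2013 system draws roughly the same power as the 1984 one, so almost all of the gain comes from performance per Watt.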
Moore’s Law (1965)
• The number of transistors on an integrated circuit doubles every 2 years
• Because of faster transistors, the speed doubles every 18 months
• The clock speed stopped doubling a couple of years ago
• Nowadays the number of cores doubles
• Moore noted that if car manufacturers had something like this, cars would get 100,000 miles to the gallon and it would be cheaper to buy a Rolls Royce than park it. (Cars would also be only a half an inch long.)
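A doubling period translates into a growth factor of 2^(t / T). As a small sketch, the snippet below evaluates that formula and also inverts it to estimate the doubling time implied by two data points. The example uses the Rpeak values of the national supercomputers in 1984 (0.1 GFlop/s) and 2013 (270,950 GFlop/s) from the history table; treating those two points as one smooth exponential is a simplification.

```python
import math


def growth_factor(years: float, doubling_months: float) -> float:
    """Growth after `years` if the quantity doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


def doubling_time_months(start: float, end: float, years: float) -> float:
    """Doubling period implied by growing from `start` to `end` in `years`."""
    doublings = math.log2(end / start)
    return years * 12.0 / doublings


print(f"10 years at 24 months per doubling: {growth_factor(10, 24):.0f}x")
print(f"10 years at 18 months per doubling: {growth_factor(10, 18):.0f}x")

implied = doubling_time_months(0.1, 270950, 2013 - 1984)
print(f"implied doubling time of Rpeak, 1984 to 2013: {implied:.1f} months")
```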
Cartesius – specs
Phase 1 (production June 2013, total peak performance 271 TFlop/s)
• Direct Liquid Cooled thin node islands
  - 360 thin nodes, 2 × 12-core 2.4 GHz Intel Ivy Bridge CPUs/node, 64 GB/node
  - 180 thin nodes, 2 × 12-core 2.4 GHz Intel Ivy Bridge CPUs/node, 64 GB/node
• Fat node island
  - 32 fat nodes, 4 × 8-core Intel Sandy Bridge CPUs/node, 256 GB/node
• Total
  - 13,968 cores, 41.75 TB memory, 2.4 PB disk
  - Interconnect: InfiniBand 56 Gbit/s bandwidth, 3 µs latency
  - Top500 November 2013: #184
Phase 1.5 (scheduled production 2014 Q2, total peak performance ~ 470 TFlop/s)
• Addition of accelerator island
  - 66 nodes, 2 × Intel Ivy Bridge CPUs/node, 2 × NVIDIA Tesla K40 GPGPUs/node
Phase 2 (scheduled production 2014 H2, total peak performance > 1 PFlop/s)
• On-demand addition of thin node islands with latest Intel Haswell CPUs
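Peak performance is cores × clock × floating-point operations per cycle. The sketch below applies that to the Phase 1 node counts. Two inputs are assumptions not stated on the slide: 8 double-precision flops per cycle per core (AVX on Sandy Bridge and Ivy Bridge) and a 2.7 GHz clock for the fat nodes. The class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Island:
    nodes: int
    sockets_per_node: int
    cores_per_socket: int
    clock_ghz: float
    mem_gb_per_node: int

    @property
    def cores(self) -> int:
        return self.nodes * self.sockets_per_node * self.cores_per_socket

    def peak_gflops(self, flops_per_cycle: int = 8) -> float:
        # Assumption: 8 double-precision flops per cycle per core (AVX).
        return self.cores * self.clock_ghz * flops_per_cycle


PHASE1 = [
    Island(360, 2, 12, 2.4, 64),  # thin node island
    Island(180, 2, 12, 2.4, 64),  # thin node island
    Island(32, 4, 8, 2.7, 256),   # fat nodes; the 2.7 GHz clock is an assumption
]

total_cores = sum(i.cores for i in PHASE1)
total_peak_gflops = sum(i.peak_gflops() for i in PHASE1)
total_mem_gb = sum(i.nodes * i.mem_gb_per_node for i in PHASE1)

print(f"cores:  {total_cores:,}")
print(f"peak:   {total_peak_gflops / 1000:.1f} TFlop/s")
print(f"memory: {total_mem_gb / 1024:.2f} TB (binary)")
```

With these assumptions the peak and memory totals reproduce the 271 TFlop/s and 41.75 TB on the slide. The node counts as listed give 13,984 cores, slightly more than the 13,968 quoted; the sketch does not try to explain that difference.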
Cartesius – Greenness
• All thin compute nodes use Direct Liquid Cooling
  - inlet temperature 30ºC: warm water cooling
  - free cooling if outdoor temperature < 30ºC (in Amsterdam: 99.1% of days)
  - (Cartesius system) Power Usage Effectiveness 1.2 (typical PUE for cold water cooling: 1.4; air cooling: 1.6)
• System requirements based on detailed usage analysis
  - which user applications
  - actual memory usage
  - I/O profiles
• Optimized price/performance
  - TCO: total budget = investment + energy + cooling + housing + UPS (storage only)
  - performance: application throughput using the 7 most relevant applications (# jobs / lifetime)
  - maximization of application throughput / TCO (optimization of power-related costs vs. investment costs) left as an “exercise” for the vendor during the procurement
  - result: using “slower” processors (lower clock frequency)
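PUE is total facility power divided by the power drawn by the IT equipment itself, so the overhead for cooling and distribution is (PUE − 1) × IT power. The sketch below compares the three PUE values from the slide. Treating the 245 kW of the 2013 system from the history table as pure IT load, running flat out all year, is an assumption made only to get concrete numbers.

```python
HOURS_PER_YEAR = 8760


def facility_kw(it_kw: float, pue: float) -> float:
    """Total facility power for a given IT load and PUE."""
    return it_kw * pue


def annual_overhead_kwh(it_kw: float, pue: float) -> float:
    """Energy per year spent on everything except the IT load itself."""
    return it_kw * (pue - 1.0) * HOURS_PER_YEAR


IT_LOAD_KW = 245.0  # assumption: 2013 system power taken as constant IT load

for label, pue in [("warm water (Cartesius)", 1.2),
                   ("cold water", 1.4),
                   ("air", 1.6)]:
    print(f"{label:24s} PUE {pue}: {facility_kw(IT_LOAD_KW, pue):5.0f} kW total, "
          f"{annual_overhead_kwh(IT_LOAD_KW, pue):9,.0f} kWh/year overhead")

saved = annual_overhead_kwh(IT_LOAD_KW, 1.6) - annual_overhead_kwh(IT_LOAD_KW, 1.2)
print(f"saved versus air cooling: {saved:,.0f} kWh/year")
```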
Cartesius – Greenness
• On-demand growth
  - minimizes idle time
  - using the latest technology maximizes value for money
    - higher performance
    - lower energy
  - (less good for Top500 ranking)
• On-demand growth: accelerator island (NVIDIA K40)
  - Phase 1 and Phase 2 (both CPU only) are general purpose
  - accelerators are more special purpose
    - can deliver more MFlop/s / Watt
    - efficient use of accelerators requires
      - suitable applications
      - investment in programming effort
  - proven interest of more than 10 research groups
Scalable Hybrid Architecture
PRACE-2IP prototype: Bull System @ CSC, Finland
EU collaboration: CSC, SURFsara, CSCS
• 44 nodes with two Intel Xeon Phi 7120X co-processors
• 37 nodes with two NVIDIA K40 GPGPUs
SURFsara research topics:
• Programming paradigms
  - Application porting to accelerator + MPI
• Energy policies
  - Dynamic Voltage and Frequency Scaling (DVFS): adjust frequency and voltage of the CPU. The actual workload determines which frequency/voltage is chosen.
  - Dynamic Power Management (DPM): power off when a device becomes idle. Activation temporarily uses more energy.
  - Maybe a hybrid policy, e.g. a mix of DPM and DVFS, is preferable.
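The trade-off between the two policies can be illustrated with a toy energy model. This is a sketch under textbook assumptions, not the model used in the prototype: power is a constant static part plus a dynamic part proportional to f³ (voltage assumed to scale linearly with frequency), and waking a device costs a fixed amount of energy. All parameter values are hypothetical.

```python
def dvfs_energy_j(gcycles: float, freq_ghz: float,
                  static_w: float, dyn_coeff: float) -> float:
    """Energy to execute `gcycles` (1e9 cycles) at a fixed frequency.

    Assumed power model: P(f) = static_w + dyn_coeff * f**3.
    """
    runtime_s = gcycles / freq_ghz
    power_w = static_w + dyn_coeff * freq_ghz ** 3
    return power_w * runtime_s


def optimal_freq_ghz(static_w: float, dyn_coeff: float) -> float:
    """Frequency minimizing energy in this model: f**3 = static_w / (2 * dyn_coeff)."""
    return (static_w / (2.0 * dyn_coeff)) ** (1.0 / 3.0)


def dpm_break_even_s(idle_w: float, wake_j: float) -> float:
    """Idle time beyond which powering off saves energy despite the wake-up cost."""
    return wake_j / idle_w


def should_power_off(idle_s: float, idle_w: float, wake_j: float) -> bool:
    return idle_s > dpm_break_even_s(idle_w, wake_j)


# Hypothetical parameters for illustration only.
STATIC_W, DYN_COEFF, WORK_GCYCLES = 40.0, 5.0, 100.0

for f in (0.6, 1.2, 2.4):
    e = dvfs_energy_j(WORK_GCYCLES, f, STATIC_W, DYN_COEFF)
    print(f"{f:.1f} GHz: {e:7.0f} J")
print(f"energy-optimal frequency: {optimal_freq_ghz(STATIC_W, DYN_COEFF):.2f} GHz")
print(f"DPM break-even idle time: {dpm_break_even_s(STATIC_W, 2000.0):.0f} s")
```

In this model running too slowly wastes static power and running too fast wastes dynamic power, while powering off only pays for sufficiently long idle periods, which is one way to see why a mix of both policies may be preferable.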
Measuring Energy Consumption of Applications
• MRA Cluster Green Software
  - SEFLab – Software Energy Footprint Lab (founded by SIG and HvA) R&D project
  - SURFsara one of the seven partners
• Provide insight into energy consumption
  - Total consumption after run
  - Consumption during run (time curve)
• Using sensors in modern CPUs (RAPL)
  - CPU cores
  - Memory controller
  - PAPI to read hardware counters
  - Correlate with performance measurements (Flop/s/Watt)
• Using sensors on node (IPMI)
  - Memory
  - Disk drives
  - Network card
• Use SLURM (Cartesius batch system)
  - Link with resource manager
  - Energy consumption in job report
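As a minimal sketch of the RAPL idea, the snippet below samples the package energy counter before and after a piece of work. It reads the Linux powercap sysfs files instead of going through PAPI as the slide describes, which is a substitution made to keep the example dependency-free. The path assumes an Intel CPU with the intel-rapl driver loaded, and reading it may require elevated privileges; the helper names are illustrative.

```python
import os
import time

# Assumption: Linux powercap interface for RAPL package 0.
RAPL_DIR = "/sys/class/powercap/intel-rapl:0"


def read_uj(name: str) -> int:
    """Read an integer value (microjoules) from a file in the RAPL directory."""
    with open(os.path.join(RAPL_DIR, name)) as fh:
        return int(fh.read().strip())


def energy_joules(start_uj: int, end_uj: int, max_range_uj: int) -> float:
    """Energy between two counter samples, allowing for one counter wrap-around."""
    delta = end_uj - start_uj
    if delta < 0:
        delta += max_range_uj
    return delta / 1e6


def flops_per_watt(flop_count: float, joules: float) -> float:
    """Flop/s per Watt equals flop per joule, so run time cancels out."""
    return flop_count / joules


if __name__ == "__main__" and os.path.exists(os.path.join(RAPL_DIR, "energy_uj")):
    max_range = read_uj("max_energy_range_uj")
    t0, e0 = time.time(), read_uj("energy_uj")
    sum(i * i for i in range(5_000_000))  # stand-in workload
    t1, e1 = time.time(), read_uj("energy_uj")
    joules = energy_joules(e0, e1, max_range)
    print(f"{joules:.2f} J in {t1 - t0:.2f} s, "
          f"average {joules / (t1 - t0):.1f} W (package 0)")
```

Sampling the counter periodically during the run instead of only at the start and end would give the time curve mentioned above.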
Energy Technology
• Prof. dr. ir. Bendiks Jan Boersma (TU Delft)
• Studies
  - conversion of heat into work or movement
  - conversion of movement into electricity
  - interaction between liquids and their environment
  - fluid mechanics
• Lower resistance in pipe networks using agents, polymers or chemicals
  - Gasunie during cold winters
  - Oil companies: Trans-Alaska pipeline
  - Drilling of oil wells
  - Fire fighting in situations where the water must be sprayed twice as high or far
• Two images of axial velocity in a cross-section of a pipe flow. The pictures show the friction Reynolds number
Sustainable Energy
• Dr. Evgeny Pidko (TU/e assistant professor)
• Field of study: computational catalysis for sustainable energy technologies
• Combining theory and experiment to understand mechanisms of catalytic reactions on a molecular level
• Computational studies using state-of-the-art quantum chemical methods
• Used to formulate design rules for new and improved catalytic systems
• Studying different processes related to the conversion of biomass and carbon dioxide to value-added chemicals and fuel components (fuels)
• Research also focuses on more classical chemical systems in order to make technologies greener
Thank you for listening!