IBM Presentations: Smart Planet Template · 1. Chip 16 cores 2. Module Single Chip 4. Node Card 32...


New #1 – K Computer at RIKEN Advanced Institute for Computational Science - Japan

Fujitsu K Computer
• SPARC64 VIIIfx CPUs, 8-core, 2.0 GHz
• 8 floating point ops per cycle
• Custom Tofu Interconnect
• Approx. 800 racks total, water cooled

Jun’11 list:
• 17,136 (nodes) x 4 (sockets) x 8 (cores) x 8 (FP/cycle) x 2.0 (GHz) = 8.773632 PFlops (Rpeak)
• 8.162 PFlops Rmax
• 93% Linpack Efficiency
• 3.2 times previous #1
• 13.8% of Jun’11 aggregate throughput
• Computer Power Consumption: 9.898 Megawatts
• 824.6 MF/W

Nov’11 list (updated values):
• 22,032 (nodes), 11.280 PF Rpeak
• 10.51 PFlops Rmax
• 14.2% of Nov’11 aggregate throughput
• Computer Power Consumption: 12.66 Megawatts
• 830.2 MF/W
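The Rpeak, Linpack efficiency and MF/W figures on this slide can be re-derived from the node count, Rmax and power values it quotes. The following is a minimal sketch of that arithmetic; the function names are illustrative, and pairing 10.51 PFlops, 22,032 nodes and 12.66 Megawatts as the Nov’11 values is an assumption based on the numbers agreeing with each other.

```python
def rpeak_pflops(nodes, sockets=4, cores=8, flops_per_cycle=8, clock_ghz=2.0):
    """Theoretical peak in PFlops from the slide's node x socket x core factors."""
    gflops = nodes * sockets * cores * flops_per_cycle * clock_ghz
    return gflops / 1e6  # 1 PFlops = 1e6 GFlops


def linpack_efficiency(rmax_pflops, rpeak_pf):
    """Fraction of theoretical peak achieved by the Linpack run."""
    return rmax_pflops / rpeak_pf


def mflops_per_watt(rmax_pflops, power_megawatts):
    """Energy efficiency: PFlops -> MFlops is 1e9, Megawatts -> Watts is 1e6."""
    return (rmax_pflops * 1e9) / (power_megawatts * 1e6)


# Values quoted on the slide; the grouping per list is an assumption.
systems = {
    "Jun'11": {"nodes": 17136, "rmax": 8.162, "megawatts": 9.898},
    "Nov'11": {"nodes": 22032, "rmax": 10.51, "megawatts": 12.66},
}

for label, s in systems.items():
    peak = rpeak_pflops(s["nodes"])
    eff = linpack_efficiency(s["rmax"], peak)
    mfw = mflops_per_watt(s["rmax"], s["megawatts"])
    print(f"{label}: Rpeak={peak:.3f} PF, efficiency={eff:.1%}, {mfw:.1f} MF/W")
```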


Fortran, C/C++, MPI, OpenMP, CUDA, OpenCL, OpenACC, HMPP, MIC, Scout, X10, CAF, Cilk, StarSs, G.Array, UPC, Chapel


The Mont-Blanc Project

• To develop a European exascale approach

• Based on embedded power-efficient technology

Taken from Alex Ramirez’s presentation, BSC


The Mont-Blanc Project

Integrated system design built from mobile / embedded components

• ARM multicore processors
  – Nvidia Tegra / Denver, Calxeda, Marvell Armada, ST-Ericsson Nova A9600, …

• Mobile accelerators
  – Mobile GPU (Nvidia GT 500M etc.)
  – Embedded GPU (Nvidia Tegra, ARM Mali T604)

• Low power 10 GbE switches

Taken from Alex Ramirez’s presentation, BSC


The Mont-Blanc Project

• Exploit a massive number of low-power processors

• Sustain performance with lower-bandwidth components (i.e., interconnect and memory)

• Programmability

Taken from Alex Ramirez’s presentation, BSC


IBM Blue Gene/Q

1. Chip: 16 cores

2. Module: Single Chip

3. Compute Card: One single chip module, 16 GB DDR3 Memory

4. Node Card: 32 Compute Cards, Optical Modules, Link Chips, Torus

5a. Midplane: 16 Node Cards

5b. I/O Drawer: 8 I/O Cards w/16 GB, 8 PCIe Gen2 slots

6. Rack: 2 Midplanes, 1, 2 or 4 I/O Drawers

7. System: 96 racks @ 20 PF/s

Per Rack
• Peak Performance: 209 TF
• Sustained (Linpack): ~170+ TF
• Power: ~100 kW
• Power Efficiency: ~2 GF/W

Scalability
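The packaging levels above multiply out to the per-rack and system totals. The following is a minimal sketch of that roll-up using only numbers quoted on the slide; the constant and variable names are illustrative, not part of any IBM tool.

```python
# Packaging hierarchy figures as quoted on the slide.
CORES_PER_CHIP = 16
COMPUTE_CARDS_PER_NODE_CARD = 32   # one single-chip module per compute card
NODE_CARDS_PER_MIDPLANE = 16
MIDPLANES_PER_RACK = 2
RACKS_PER_SYSTEM = 96
MEMORY_PER_COMPUTE_CARD_GB = 16

# Per-rack figures as quoted on the slide.
PEAK_TF_PER_RACK = 209
POWER_KW_PER_RACK = 100

# Each compute card is one node, so nodes per rack is the product of the levels.
nodes_per_rack = (COMPUTE_CARDS_PER_NODE_CARD
                  * NODE_CARDS_PER_MIDPLANE
                  * MIDPLANES_PER_RACK)
cores_per_rack = nodes_per_rack * CORES_PER_CHIP
memory_per_rack_gb = nodes_per_rack * MEMORY_PER_COMPUTE_CARD_GB

system_nodes = nodes_per_rack * RACKS_PER_SYSTEM
system_cores = cores_per_rack * RACKS_PER_SYSTEM
system_peak_pf = RACKS_PER_SYSTEM * PEAK_TF_PER_RACK / 1000  # TF -> PF

# GF/W: convert TF to GF and kW to W before dividing.
gf_per_watt = (PEAK_TF_PER_RACK * 1000) / (POWER_KW_PER_RACK * 1000)

print(f"nodes/rack={nodes_per_rack}, cores/rack={cores_per_rack}, "
      f"memory/rack={memory_per_rack_gb} GB")
print(f"system: {system_nodes} nodes, {system_cores} cores, "
      f"{system_peak_pf:.2f} PF peak, {gf_per_watt:.2f} GF/W")
```

The 96-rack total lands close to the quoted 20 PF/s, and the per-rack ratio is consistent with the quoted ~2 GF/W.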


Blue Gene/Q: Ultra Low Power, Dense Parallel System

                          BG/Q
Processor                 64-bit PowerPC
Processor Frequency       1.6 GHz
Nodes/Rack x Cores        1024 x 16
Memory/Core               1 GB
Memory Bandwidth          43 GB/s
Cores/Rack                16384
Peak Performance/Rack     209.7 TF
Average Power/Rack        65 kW
Availability              1H12
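The 209.7 TF peak per rack follows from the core count and clock frequency once a per-core issue width is fixed. The sketch below assumes 8 floating point operations per cycle per core; that figure is not stated on the slide and is chosen here because it reproduces the quoted peak. The derived ratios assume the 43 GB/s memory bandwidth is per node, and are rough indicators only, since average power is not necessarily measured at peak load.

```python
CORES_PER_NODE = 16
NODES_PER_RACK = 1024
CLOCK_GHZ = 1.6
FLOPS_PER_CYCLE_PER_CORE = 8      # ASSUMPTION: not stated on the slide
MEMORY_BANDWIDTH_GB_S = 43        # from the slide; assumed to be per node
AVERAGE_POWER_KW_PER_RACK = 65


def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak in GFlops: cores x GHz x flops issued per cycle."""
    return cores * clock_ghz * flops_per_cycle


node_peak_gf = peak_gflops(CORES_PER_NODE, CLOCK_GHZ, FLOPS_PER_CYCLE_PER_CORE)
rack_peak_tf = node_peak_gf * NODES_PER_RACK / 1000  # GF -> TF

# Bytes of memory bandwidth available per peak floating point operation.
bytes_per_flop = MEMORY_BANDWIDTH_GB_S / node_peak_gf

# Peak GF per Watt of average rack power (an upper bound, not a measurement).
peak_gf_per_watt = (rack_peak_tf * 1000) / (AVERAGE_POWER_KW_PER_RACK * 1000)

print(f"node peak = {node_peak_gf:.1f} GF, rack peak = {rack_peak_tf:.1f} TF")
print(f"{bytes_per_flop:.2f} bytes/flop, {peak_gf_per_watt:.2f} peak GF/W")
```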


Application   Performance difference in %   Power consumption difference in %
CP2K          8.5                           16.3
SEISSOL       10.9                          18.2
GADGET        7.2                           18.7
LBDC          4.5                           16.5
NAMD          6.0                           19.7
WRF           4.5                           13.6
Lesli3d       1.7                           13.1
GemsFDTD      1.3                           13.1
BQCD          0.0                           13.6
WALBERLA      0.0                           13.7
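One way to read these two columns together is as a trade-off in energy to solution. The sketch below assumes that the performance difference is a runtime increase and the power difference is a power reduction; the slide does not state the direction of either, so this interpretation and the function name are assumptions. Under it, energy (power x time) changes by the product of the two factors.

```python
def energy_change_percent(runtime_increase_pct, power_decrease_pct):
    """Change in energy to solution, in %, where energy = power x runtime.

    ASSUMPTION: the first argument is a slowdown and the second a power
    saving; the slide does not say which way each difference points.
    """
    ratio = (1 + runtime_increase_pct / 100) * (1 - power_decrease_pct / 100)
    return (ratio - 1) * 100


# (performance difference %, power consumption difference %) from the slide.
measurements = {
    "CP2K": (8.5, 16.3),
    "SEISSOL": (10.9, 18.2),
    "GADGET": (7.2, 18.7),
    "LBDC": (4.5, 16.5),
    "NAMD": (6.0, 19.7),
    "WRF": (4.5, 13.6),
    "Lesli3d": (1.7, 13.1),
    "GemsFDTD": (1.3, 13.1),
    "BQCD": (0.0, 13.6),
    "WALBERLA": (0.0, 13.7),
}

for app, (perf, power) in measurements.items():
    print(f"{app:10s} energy change: {energy_change_percent(perf, power):+.1f}%")
```

A negative result means less energy to solution under the stated assumption.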

Page 31: IBM Presentations: Smart Planet Template · 1. Chip 16 cores 2. Module Single Chip 4. Node Card 32 Compute Cards, Optical Modules, Link Chips, Torus 5a. Midplane 16 Node Cards 6.

31

Page 32: IBM Presentations: Smart Planet Template · 1. Chip 16 cores 2. Module Single Chip 4. Node Card 32 Compute Cards, Optical Modules, Link Chips, Torus 5a. Midplane 16 Node Cards 6.

– MTTDL ~ 7 years (simulated, MTTF_disk = 600K hrs, Weibull, 100 PB usable)


Software / De-clustered RAID

[Figure: Traditional RAID vs. Declustered RAID on 22 HDDs; labels: Failure, Read, Write]


Thank you
