CSCfi Computing Services 12/2014

CSC Computing Services Olli-Pekka Lehto Development Manager Computing Platforms [email protected] @ople December 11th 2014


Computing Services portfolio of CSC - IT Center for Science Ltd. 12/2014 edition

Transcript of CSCfi Computing Services 12/2014

Page 1: CSCfi Computing Services 12/2014

CSC Computing Services

Olli-Pekka Lehto

Development Manager

Computing Platforms

[email protected]

@ople

December 11th 2014

Page 2: CSCfi Computing Services 12/2014

CSC Computing Capacity 1989–2014


Page 3: CSCfi Computing Services 12/2014

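CSC Computing Capacity 2012-2014

[Chart of computing capacity by system for 2012, 2013 and 2014; vertical axis from 0 to 3000. Systems: Louhi, Vuori, Sisu, Taito, Bull. Bar labels: 244, 1700, 180, 600, 240. Annotations: Phase 1 installed, Louhi retired; Phase 2 installed, Vuori retired. Growth factors shown: 3.4x, 5.6x, 19.2x.]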

Page 4: CSCfi Computing Services 12/2014

Total performance: 2.54 PFlop/s

CSC is the most powerful academic computing facility in the Nordics
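
The 2.54 PFlop/s total appears to be the sum of the per-system figures quoted on the services slide of this deck (Sisu 1700, Taito 600, Taito extension 240 TFlops). A quick check, assuming those three systems are the only contributors:

```python
# Performance per system in TFlop/s, as quoted on the services slide.
systems_tflops = {
    "Sisu": 1700,
    "Taito": 600,
    "Taito extension (Bull)": 240,
}

total_tflops = sum(systems_tflops.values())
total_pflops = total_tflops / 1000  # 1 PFlop/s = 1000 TFlop/s

print(f"Total: {total_tflops} TFlop/s = {total_pflops:.2f} PFlop/s")
```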

Page 5: CSCfi Computing Services 12/2014

CSC Computing Services

Performance: Sisu

– Massive parallelism

– Fast interconnect

Capacity: Taito

– General use

– Large memory

– >100 applications

Accelerated: Taito extension

– Visualization

– Special codes

– Nvidia GPU

– Intel Xeon Phi

Cloud: cPouta

– Build your own

– OpenStack IaaS

Hosting: Kajaani, Espoo

– Efficient and secure datacenters

– Virtual and physical servers

Storage Services

– Backup

– Archiving

– Fast parallel storage

Page 6: CSCfi Computing Services 12/2014

CSC Computing Services

Performance: Sisu

– 40512 cores

– 1700 TFlops

Capacity: Taito

– 18880 cores

– 600 TFlops

Accelerated: Taito extension

– 76 Nvidia K40 GPU

– 90 Intel Xeon Phi 7120X

– 240 TFlops

Cloud: cPouta

– Dynamically provisioned from Taito

Hosting: Kajaani, Espoo

Storage Services

– >4 PB, ~100 GB/s

Page 7: CSCfi Computing Services 12/2014

New in 2014: Xeon Haswell E5 CPUs

Intel Xeon E5-2690v3, 2.6 GHz

– 12 cores/CPU (+50%)

– AVX2 instructions (2x max flops/GHz)

– DDR4 memory

– “Energy-to-solution” at best 1/3 of Sandy Bridge

We are one of the earliest adopters

– Sisu upgraded 7/2014

– Taito upgraded 12/2014
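
The "2x max flops/GHz" point can be turned into a rough per-CPU peak estimate. The sketch below assumes 16 double-precision floating-point operations per cycle per core for Haswell (AVX2 with fused multiply-add) and 8 for Sandy Bridge (AVX). Those per-cycle values are not stated on the slide, and the comparison part is assumed to be the 8-core, 2.6 GHz E5-2670 listed on the Taito slide. The function name is illustrative.

```python
def peak_gflops(cores: int, ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in GFlop/s: cores x clock (GHz) x flops per cycle."""
    return cores * ghz * flops_per_cycle

# Assumed double-precision flops per cycle per core (not from the slides).
SANDY_BRIDGE_FLOPS_PER_CYCLE = 8   # AVX: 4-wide add plus 4-wide multiply
HASWELL_FLOPS_PER_CYCLE = 16       # AVX2 + FMA: 2 units x 4 lanes x 2 ops

sandy = peak_gflops(cores=8, ghz=2.6, flops_per_cycle=SANDY_BRIDGE_FLOPS_PER_CYCLE)
haswell = peak_gflops(cores=12, ghz=2.6, flops_per_cycle=HASWELL_FLOPS_PER_CYCLE)

print(f"E5-2670   (Sandy Bridge): {sandy:.1f} GFlop/s per CPU")
print(f"E5-2690v3 (Haswell):      {haswell:.1f} GFlop/s per CPU")
print(f"Ratio: {haswell / sandy:.1f}x")
```

Under these assumptions the per-CPU gain is about 3x: 1.5x from the extra cores multiplied by 2x from the wider floating-point throughput per cycle.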

Page 8: CSCfi Computing Services 12/2014

Sisu Cray XC upgrade

8 new cabinets

Haswell CPUs

More memory per node

>7x performance

~2x energy consumption

40512 cores

108TB memory

680kW

15384kg

Page 9: CSCfi Computing Services 12/2014

Sisu – Cray XC40

Designed for computationally intensive tasks

40512 cores, 64GB RAM / node

#37 on the November 2014 Top500 list

Aries interconnect with 7 TB/s bisection bandwidth

Cray development tools

Comprehensive set of scalable applications
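
The headline figures for Sisu can be cross-checked against each other. This is a rough sketch that assumes 24 cores per node (two 12-core CPUs, as on the XC blade slide), decimal terabytes, and 16 double-precision flops per cycle per core at 2.6 GHz for the peak estimate. The per-cycle figure is an assumption and is not stated in the slides.

```python
CORES = 40512            # from the slide
CORES_PER_NODE = 2 * 12  # assumption: two 12-core E5-2690v3 CPUs per node
RAM_PER_NODE_GB = 64     # from the slide
GHZ = 2.6                # E5-2690v3 clock from the Haswell slide
FLOPS_PER_CYCLE = 16     # assumption: AVX2 + FMA, double precision

nodes = CORES // CORES_PER_NODE
total_ram_tb = nodes * RAM_PER_NODE_GB / 1000      # decimal TB
peak_tflops = CORES * GHZ * FLOPS_PER_CYCLE / 1000

print(f"Nodes: {nodes}")
print(f"Total memory: {total_ram_tb:.0f} TB")
print(f"Estimated peak: {peak_tflops:.0f} TFlop/s")
```

The results (1688 nodes, about 108 TB, roughly 1685 TFlop/s) land close to the 108 TB and 1700 TFlops quoted elsewhere in the deck.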

Page 10: CSCfi Computing Services 12/2014

Cray XC blade

4 dual CPU nodes (96 cores)

64GB RAM per node

Aries Router

(500GB/s switching capability)

Power

Net

Page 11: CSCfi Computing Services 12/2014

Blade in XC Rack

48 blades

384 CPUs

4608 cores
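
The rack figures follow directly from the blade layout on the previous slide (4 dual-CPU nodes per blade) and the 12-core CPUs. A short check of the arithmetic:

```python
BLADES_PER_RACK = 48   # from the slide
NODES_PER_BLADE = 4    # from the XC blade slide
CPUS_PER_NODE = 2      # "dual CPU nodes"
CORES_PER_CPU = 12     # E5-2690v3

cores_per_blade = NODES_PER_BLADE * CPUS_PER_NODE * CORES_PER_CPU
cpus_per_rack = BLADES_PER_RACK * NODES_PER_BLADE * CPUS_PER_NODE
cores_per_rack = cpus_per_rack * CORES_PER_CPU

print(f"Cores per blade: {cores_per_blade}")
print(f"CPUs per rack:   {cpus_per_rack}")
print(f"Cores per rack:  {cores_per_rack}")
```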

Page 12: CSCfi Computing Services 12/2014

Aries Interconnect Cabling

Page 13: CSCfi Computing Services 12/2014

Aries Interconnect Topology

2-dimensional all-to-all network in a group

All-to-all network between groups

Optical uplinks to inter-group net

Source: Robert Alverson, Cray, Hot Interconnects 2012 keynote


Page 14: CSCfi Computing Services 12/2014

Aries Bisection Bandwidth

~7 TB/s =

9 000 000 x 1080p Netflix streams

OR

1.75 x average European consumer IP traffic in 2013
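
To get a feel for the comparison, the per-stream bitrate implied by the slide can be worked out. This sketch assumes that the 9 000 000 figure pairs with the 1080p Netflix streams and that TB means decimal terabytes. Neither is spelled out on the slide.

```python
BISECTION_BYTES_PER_S = 7e12  # ~7 TB/s, decimal terabytes assumed
STREAMS = 9_000_000           # number of 1080p streams, as read from the slide

bits_per_stream = BISECTION_BYTES_PER_S * 8 / STREAMS
mbit_per_stream = bits_per_stream / 1e6

print(f"Implied bitrate per stream: {mbit_per_stream:.1f} Mbit/s")
```

That works out to roughly 6 Mbit/s per stream, which is a plausible order of magnitude for HD video.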

Page 15: CSCfi Computing Services 12/2014

Taito

HP cluster for general use

– 576 nodes with dual 8-core E5-2670 CPUs (Sandy Bridge), 64 GB RAM per node

– 400 nodes with dual 12-core E5-2690v3 CPUs (Haswell), 128 GB RAM per node

New HP Apollo 6000 chassis and blades

– Big memory nodes: 10 x 256 GB; 2 x 1.5 TB

56 Gbit/s FDR InfiniBand interconnect

Large selection of applications

taito-shell for instant interactive use

Page 16: CSCfi Computing Services 12/2014

Taito Extension

Bull DLC 715

– Direct warm-water cooling

Special processors for computing

– 72 Nvidia Tesla K40 GPGPU

– 90 Intel Xeon Phi 7120X

High-performance and energy efficient

– Porting and optimization of applications needed

GPUs can be used for visualization

– For example by using VirtualGL

Connected to the Taito cluster

Page 17: CSCfi Computing Services 12/2014

Taito Extension

Page 18: CSCfi Computing Services 12/2014

Energy-efficiency of Systems

[Bar chart: energy efficiency in GFlop/W (axis from 0 to 3.5) for Vuori, Sisu P1, Taito P1, Sisu P2, Taito P2 and Bull.]
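
The unit on this chart, GFlop/W, is performance divided by power draw. As a rough illustration, the figures quoted elsewhere in the deck for the upgraded Sisu (1700 TFlops, 680 kW) can be plugged in. Whether the chart itself uses peak or measured performance, and which power measurement, is not stated, so this is only an assumed reading.

```python
def gflops_per_watt(tflops: float, kilowatts: float) -> float:
    """Energy efficiency in GFlop/s per watt."""
    gflops = tflops * 1e3
    watts = kilowatts * 1e3
    return gflops / watts

# Figures quoted for Sisu after the 2014 upgrade.
sisu_p2 = gflops_per_watt(tflops=1700, kilowatts=680)

print(f"Sisu (phase 2): {sisu_p2:.1f} GFlop/W")
```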

Page 19: CSCfi Computing Services 12/2014

Taito and Sisu user interfaces

NoMachine NX virtual desktop

(NX Web-client in beta testing)

Unix shell & X-forwarding

Scientists’ User Interface

https://sui.csc.fi

Page 20: CSCfi Computing Services 12/2014

cPouta Cloud

IaaS cloud service for HPC

Use cases

– Running HTTP and other servers

– Non-CentOS Linux or Windows OS needed

– Superuser access needed

– Agile service development (“DevOps”)

Tuned for HPC

– Provisioned from Taito nodes

– Powerful CPUs, interconnect, storage

Simple web interface, CLI, REST API

https://research.csc.fi/pouta-iaas-cloud

Page 21: CSCfi Computing Services 12/2014

Creating an Instance in cPouta
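
The slide shows the web interface. Since cPouta is described as an OpenStack IaaS with a REST API, the same operation could presumably also be expressed as a `POST /servers` request to the OpenStack Compute API. The sketch below only builds the JSON body and sends nothing. The helper name and all names and IDs are hypothetical placeholders, not values from cPouta.

```python
import json

def build_create_server_body(name, image_id, flavor_id, network_id, key_name=None):
    """Build the JSON body for an OpenStack Compute `POST /servers` request."""
    server = {
        "name": name,
        "imageRef": image_id,
        "flavorRef": flavor_id,
        "networks": [{"uuid": network_id}],
    }
    if key_name is not None:
        server["key_name"] = key_name  # SSH key pair to inject
    return {"server": server}

# All values below are placeholders for illustration only.
body = build_create_server_body(
    name="demo-instance",
    image_id="IMAGE-UUID-PLACEHOLDER",
    flavor_id="FLAVOR-ID-PLACEHOLDER",
    network_id="NETWORK-UUID-PLACEHOLDER",
    key_name="my-keypair",
)
print(json.dumps(body, indent=2))
```

In practice the body would be sent with an authentication token to the compute endpoint of the cloud; those details depend on the deployment and are left out here.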

Page 22: CSCfi Computing Services 12/2014

Creating a Virtual Volume in cPouta

Page 23: CSCfi Computing Services 12/2014

Storage Services

HPC Storage (~4PB, ~100GB/s)

– Lustre parallel filesystem

– DDN SFA10k and SFA12k storage arrays

– Capacity and performance scalable

Cloud storage (in 2015)

– Ceph filesystem

– Software-defined storage (SDS)

– Block and object storage

Archive (iRODS), tape backup
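
A parallel filesystem such as Lustre reaches high aggregate bandwidth by striping each file across several storage targets, so that different parts of a file can be read and written concurrently. The following is a simplified model of round-robin striping. The stripe size and stripe count are illustrative values, not CSC's configuration, and the function name is hypothetical.

```python
def stripe_target(offset: int, stripe_size: int, stripe_count: int) -> int:
    """Return the 0-based index of the stripe target holding the byte at `offset`.

    Consecutive chunks of `stripe_size` bytes are assigned to targets
    in round-robin order.
    """
    return (offset // stripe_size) % stripe_count

MIB = 1024 * 1024
STRIPE_SIZE = 1 * MIB  # illustrative
STRIPE_COUNT = 4       # illustrative

targets = [stripe_target(i * MIB, STRIPE_SIZE, STRIPE_COUNT) for i in range(6)]
for i, target in enumerate(targets):
    print(f"offset {i} MiB -> target {target}")
```

With four targets, a large sequential read touches all of them in turn, which is why adding storage targets can scale both capacity and throughput.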

Page 24: CSCfi Computing Services 12/2014

Kajaani Datacenter

30MW power

PUE 1.05-1.2

3000 m² floor space

99% free cooling

Page 25: CSCfi Computing Services 12/2014

Modular Datacenter (MDC):

Easily Expandable, Highly Efficient

Page 26: CSCfi Computing Services 12/2014

Interior view of the MDC

Page 27: CSCfi Computing Services 12/2014

Cooling Technologies

Water-air hybrid (Sisu)

Air (Taito)

Direct warm-water (Bull Taito extension)

PUE values shown: 1.2, 1.05, 1.02
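
PUE (power usage effectiveness) is total facility power divided by IT equipment power, so a PUE of 1.2 means 20% overhead for cooling and power distribution. As an illustration, the sketch below applies the PUE values from this slide to Sisu's quoted 680 kW, assuming that figure is pure IT load. The slide does not say so, and the function name is illustrative.

```python
def facility_power_kw(it_power_kw: float, pue: float) -> float:
    """Total facility power implied by PUE = facility power / IT power."""
    return it_power_kw * pue

IT_LOAD_KW = 680  # Sisu's quoted power, assumed here to be pure IT load

results = {}
for pue in (1.02, 1.05, 1.2):
    total = facility_power_kw(IT_LOAD_KW, pue)
    results[pue] = (total, total - IT_LOAD_KW)
    print(f"PUE {pue:.2f}: total {total:.0f} kW, overhead {total - IT_LOAD_KW:.0f} kW")
```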

Page 28: CSCfi Computing Services 12/2014

Why CSC?

High-performance, latest technologies

Secure environment (ISO27001:2013)

Finnish, non-profit organization

Ecologically sustainable infrastructure

Competitive and simple pricing model

Excellent network connectivity

Everything under one roof

– Cloud, traditional HPC, visualization, accelerators

– Various storage solutions, EUDAT, RDA

– Consulting, training, porting, optimization

Page 29: CSCfi Computing Services 12/2014

ISO27001:2013 certification

Reflects our commitment to security

– Risk management

– Leadership

– Technical solutions and documentation

– Continual improvement

– Recovery planning

– Security as part of company culture

Covers nearly all ICT services and datacenters

– Cloud services to be certified in early 2015

Page 30: CSCfi Computing Services 12/2014

Leveraging Best-of-breed Open Source Software

Deployment & Clustering

Cloud

Operating Systems

Monitoring

Storage

Queuing

Logstash

Graphite

Page 31: CSCfi Computing Services 12/2014

Near future development

cPouta improvements, including:

– Oversubscribed instances (web servers etc.)

– Docker support

– ISO27001:2013 certification

New ePouta service

– Productization of Biomedinfra

– Secure computing for organizations

– No direct visibility to Internet (VPN / OPN)

– Possible to extend your local resources seamlessly

Services for data-intensive computing

– Hadoop/MapReduce optimized systems

– SSD storage

Page 32: CSCfi Computing Services 12/2014
Page 33: CSCfi Computing Services 12/2014

Backup slides

Intro to HPC architecture

Page 34: CSCfi Computing Services 12/2014

Supercomputers in Olden Days

Page 35: CSCfi Computing Services 12/2014

Supercomputers Today

Commodity technologies

– Server clusters

– Linux

– Ethernet, InfiniBand

– x86

Proprietary solutions in very high-end

– BlueGene, Cray, NEC

Cloud services on the rise

– Especially for modest compute needs

Use is constantly spreading to new fields

– Skilled people needed!

Page 36: CSCfi Computing Services 12/2014

Basic Supercomputer Architecture

Frontend nodes

Interconnect network

Compute nodes

Storage servers

Storage system

Internet

Management nodes

Management network
