ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer...

23
ACCELERATED COMPUTING THE PATH FORWARD

Transcript of ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer...

Page 1: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

ACCELERATED COMPUTINGTHE PATH FORWARD

Page 2: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

2

Every Computer MakerWorld’s Most Powerful

AI SupercomputersVolta in Production Every Cloud

| 5,120 CUDA cores

| 7.8 FP64 TFLOPS

21B xtors

15.7 FP32

125 Tensor TFLOPS

VOLTA TAKING OFF

NVIDIA ACCELERATED COMPUTING

Page 3: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

3

500+ GPU-ACCELERATED APPLICATIONS

All Top 15 HPC Apps Accelerated

GROMACS

ANSYS Fluent

Gaussian

VASP

NAMD

Simula Abaqus

WRF

OpenFOAM

ANSYS

LS-DYNA

BLAST

LAMMPS

AMBER

Quantum Espresso

GAMESS

HOW WE GOT HERE

APPLICATIONS

SYSTEMS

ALGORITHMS

CUDA

ARCHITECTURE

EVERY DEEP LEARNING FRAMEWORK ACCELERATED

INVESTING FROM TOP TO BOTTOM

Page 4: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

4

ARCHITECTING MODERN DATACENTERS

APPS

% W

ORKLO

AD

Strong Scaling

Weak Scaling

Deep Learning

Apps Accelerated

Parallel Speed-Up

% Workload

% Sequential “Amdahl’s Law”

APPLICATION WORKLOAD IN MODERN DATACENTERS

Page 5: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

5

ARCHITECTING MODERN DATACENTERS

Strong Core CPU for Sequential code

Volta 5,120 CUDA Cores

125 TFLOPS Tensor Core

NVLink for Strong Scaling

ARCHITECTING MODERN DATACENTERS

Page 6: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

6

0

10

20

30

40

50

60

70

80

20 40 60 80 1000

# of CPUs

1 Node with 4x V100 GPUs

48 CPU Nodes Comet Supercomputer

ns/

day

AMBER Simulation of CRISPR

ARCHITECTING MODERN DATACENTERSTHE POWER OF ACCELERATED COMPUTING

Page 7: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

7Intersect360, Nov 2017 “HPC Application Support for GPU Computing”

GROMACS

ANSYS Fluent

Gaussian

VASP

NAMD

Simula Abaqus

WRF

OpenFOAM

ANSYS

LS-DYNA

NCBI-BLAST

LAMMPS

AMBER

Quantum Espresso

GAMESS

500+ Accelerated ApplicationsTop 15 HPC Applications

70% OF THE WORLD’S SUPERCOMPUTINGWORKLOAD ACCELERATED

Page 8: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

8

Mixed Workload: Materials Science (VASP)Life Sciences (AMBER)Physics (MILC)Deep Learning (ResNet-50)

160 Self-hosted Servers

96 KWatts

4X BETTER HPC SYSTEM TCO4X BETTER HPC SYSTEM TCO

Page 9: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

9

Mixed Workload: Materials Science (VASP)Life Sciences (AMBER)Physics (MILC)Deep Learning (ResNet-50)

12 Accelerated Servers w/4 V100 GPUs

20 KWatts

1/3 the Cost

1/4 the Space

1/5 the Power

4X BETTER HPC SYSTEM TCO4X BETTER HPC SYSTEM TCO

Page 10: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

10

POWERING NEXT-GENERATION SUPERCOMPUTERS

Page 11: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

11

3+EFLOPSTensor Ops

AI Exascale Today

ACME

DIRAC FLASH GTC

HACC LSDALTON NAMD

NUCCOR NWCHEM QMCPACK

RAPTOR SPECFEM XGC

AcceleratedScience

10XPerf Over Titan

20 PF

200 PF

Performance Leadership

VOLTA TO FUEL SUMMITNext Milestone In AI Supercomputing

5-10XApplication Perf Over Titan

Page 12: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

12

Most Powerful AI Supercomputer in Japan

4,352 Tesla V100 GPUs

37 PetaFLOPS FP64 HPC Performance

0.55 ExaFLOPS AI Performance

ANNOUNCING JAPAN’S AIST ADOPTS NVIDIA VOLTA FOR ABCI SUPERCOMPUTER

Page 13: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

13

40 PetaFLOPS Peak FP64 Performance | 660 PetaFLOPS DL FP16 Performance | 660 NVIDIA DGX-1 Server Nodes

ANNOUNCING NVIDIA SATURNV WITH VOLTA

ANNOUNCINGNVIDIA SATURNV WITH VOLTA

Page 14: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

14

ERRORS

REGRESSION TESTING (FP16/INT8)

INFERENCE (FP16/INT8)

TRAINING (FP32/FP16)

SIMULATION (FP64/FP32)

NEW DATA

TRAINING SET REGRESSION SET NEW DATA

DEEP LEARNING COMES TO HPCDEEP LEARNING COMES TO HPC

INSIGHTS

Page 15: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

15

UIUC & NCSA: ASTROPHYSICS

5,000X LIGO Signal Processing

U. FLORIDA & UNC: DRUG DISCOVERY

300,000X Molecular Energetics Prediction

SLAC: ASTROPHYSICS

Gravitational Lensing: From Weeks to 10ms

AI ACCELERATES SCIENCE

U.S. DoE: CLEAN ENERGY

33% More Accurate Neutrino Detection

PRINCETON & ITER: PARTICLE PHYSICS

50% Higher Accuracy for Fusion Sustainment

U. PITT: DRUG DISCOVERY

35% Higher Accuracy for Protein Scoring

DEEP LEARNING COMES TO HPC

Page 16: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

16

AI FOR STEADY FLOW APPROXIMATION

CHALLENGE

CFD is computationally expensive.

CFD is time consuming to complete

Autodesk Research & University of Michigan

SOLUTION

Adoption of a Convolutional Neural Network to replace Velocity field in 2D & 3D

IMPACT

Rapid time to solution

Re-use of existing data.

AI

TRADITIONAL CFD

ERROR Xiaoxiao Guo, Wei Li, Francesco Iorio (2016)

Convolutional Neural Networks for Steady Flow Approximation

Page 17: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

17

SIMULATING LIQUID SURFACES WITH GENERATIVE ADVERSARIAL NETWORKS

CHALLENGE

Liquid modelling is expensive.

Real-time simulations require a GPU or multiple CPUs. Or both.

Technical University of Munich

SOLUTION

Generative Adversarial Network simulates the properties of liquids with 10X reduction in compute

IMPACT

Simulation is portable and available to any user – You can use a phone for inference.

GAN can be re-trained for any liquid in any situation – if given enough data

Lukas Prantl, Boris Bonev, and Nils Thuerey. 2010.

Pre-computed LiquidSpaces with Generative Neural Networks and Optical

Flow.

Page 18: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

18

PREDICTING DISRUPTIONS IN FUSION REACTOR USING DL

CHALLENGE

Controlling plasma at high power is a multi-dimensional challenge with high consequences.

Time to act is ~30ms

Princeton University

SOLUTION

Fusion recurrent Neural Network (FRNN) is 80-90% true positives. 5% false positives

Trained on relatively small dataset and old GPU

IMPACT

Reduced down-time on largest experiments

Within sight of an Actively controlled model

William Tang, Alexey Svyatkovskiy, Julian Kates-Harbeck,Kyle Felker, Eliot Feibush, Michael Churchill

Page 19: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

19

Containerized in NVDocker

Optimized for GPU-accelerated Systems

Up-to-Date Containers

Available NOW

Sign up at nvidia.com/gpu-cloud

ANNOUNCING NVIDIA GPU CLOUD FOR HPC

CLOUD CONTAINER REGISTRY FORACCELERATED HPC APPS

Page 20: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

20

HPC APPS COMING TO NVIDIA GPU CLOUD

Page 21: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

21

VISUALIZATION IS VITAL TO SCIENCE

Page 22: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

22

Large-scale Volumetric Rendering

Physically Accurate Ray Tracing

Production-quality Images

Seamless integration with ParaView

Early Access NOW

Signup now at nvidia.com/gpu-cloud

U CLOUD FOR HPC VISUALIZATION

UNIFIED VISUALIZATIONFOR LARGE DATA SETS

ParaView with NVIDIA OptiX

ParaView with NVIDIA Holodeck

ParaView with NVIDIA IndeX

NVIDIA GPU CLOUD FOR HPC VISUALIZATION

Page 23: ACCELERATED COMPUTING THE PATH FORWARD · ACCELERATED COMPUTING THE PATH FORWARD. 2 Every Computer Maker World’s Most Powerful Volta in Production Every Cloud AI Supercomputers

23

NVIDIA GPU CLOUDSIMPLIFYING AI & HPC

DEEP LEARNING HPC APPS HPC VIZ