Nvidia Cuda Apps Jun27 11

Accelerating High Performance Applications

description

PEER 1 Offers NVIDIA GPU to Accelerate High Performance Applications

PEER 1 has teamed up with NVIDIA, the creator of the GPU and a world leader in visual computing, to provide high performance GPU cloud applications. NVIDIA’s GPUs are well known for making customer software run faster, and PEER 1 is offering a number of services that run on NVIDIA’s GPUs. PEER 1’s cloud service is built on NVIDIA Tesla GPUs, delivering supercomputing performance in the cloud to solve much tougher problems.

Transcript of Nvidia Cuda Apps Jun27 11

Page 1: Nvidia Cuda Apps Jun27 11


Accelerating High Performance Applications

Page 2: Nvidia Cuda Apps Jun27 11

Strategic Focus on Applications

Senior-level relationship and market managers

Dedicated technical resources

More than 150 people devoted to libraries, tools, application porting and market development

Worldwide focus

Page 3: Nvidia Cuda Apps Jun27 11

Reaching a Broad Range of Markets

Scientific computing

Creative pro

Education / research

Page 4: Nvidia Cuda Apps Jun27 11

Strategic Partners

CAD/CAM/CAID: Autodesk; Dassault Systemes (CATIA, Solidworks); PTC; Siemens

CAE/EDA: Ansys; Dassault Systemes (Simulia); Nastran; LSTC; Synopsys

Computational chemistry: Amber; NAMD; Gromacs; Lammps; GAMESS

Computational Finance: MATLAB; Mathematica; NAG; Murex

Defence & Intelligence: Ikena; Intergraph; ESRI; Manifold

Digital Content creation: Adobe; Autodesk M&E; Avid; MainConcept; Sony

Physical Sciences: Quda (L-QCD); WRF; ACUSA; HOMME; HYCOM

Seismic processing and visualization: Schlumberger; Landmark; Paradigm

Page 5: Nvidia Cuda Apps Jun27 11

Leading MD Applications

| Application | Features Supported | GPU Perf | Release Status | Notes |
|---|---|---|---|---|
| AMBER | PMEMD: Explicit & Implicit Solvent | 8X | V11 Released | Single and multi-GPUs. Expect 2x more performance in V11 patch release (shortly) |
| GROMACS | Implicit (5x), Explicit (2x) Solvent | 2x-5x | Single GPU released, Version 4.5.4 | Next release: 2H2011. Better Explicit, MPI |
| LAMMPS | Lennard-Jones, Gay-Berne | 6x | Released | Single and multi-GPU. |
| NAMD | Non-bond force calculation | 2x-7x | Released, v2.8 | Single and multi-GPU. |

GPU Perf compared against Multi-core x86 CPU socket.

GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison

Page 6: Nvidia Cuda Apps Jun27 11

Additional MD/MM Applications Ramping

| Application | Features Supported | GPU Perf | Release Status | Notes |
|---|---|---|---|---|
| Abalone | TBD, “Simulations” | 4-29X (on 1060 GPU) | Released | Single GPU. Agile Molecule, Inc. |
| ACEMD | Written for use on GPUs | “µ-sec long trajectories on workstation” | Released | Production bio-molecular dynamics (MD) software specially optimized to run on single and multi-GPUs |
| DL_POLY | Two-body Forces, Link-cell Pairs, Ewald SPME forces, Shake VV | 4x | V 4.0 Source only. Results Published | Next release: 2H2011. Multi-GPU, multi-node supported |
| HOOMD-Blue | Written for use on GPUs | 2X (32 CPU cores vs. 2 10XX GPUs) | Released, Version 0.9.2 | Single and multi-GPU. |

GPU Perf compared against Multi-core x86 CPU socket.

GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison

Page 7: Nvidia Cuda Apps Jun27 11

Viz and “Docking” Applications

| Related Applications | Features Supported | GPU Perf | Release Status | Notes |
|---|---|---|---|---|
| Amira 5® | 3D visualization of volumetric data and surfaces | N/A | Released, Version 5.3.3 | Visualization from Visage Imaging. Next release, 5.4, will use GPU for general purpose processing in some functions |
| Core Hopping | GPU accelerated application | Up to 5000X | Released, Suite 2011 | Single and multi-GPUs. Schrodinger, Inc. |
| FastROCS | Real-time shape similarity searching/comparison | 800-3000X | Released | Single and multi-GPUs. OpenEye Scientific Software |
| VMD | High quality rendering, large structures (100 million atoms), GPU acceleration for computationally demanding analysis and visualization tasks, multiple GPU support for very fast display of molecular orbitals arising in quantum chemistry calculations | 100-125X or greater | Released, Version 1.9 | Visualization from University of Illinois at Urbana-Champaign |

GPU Perf compared against Multi-core x86 CPU socket.

GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison

Page 8: Nvidia Cuda Apps Jun27 11

Quantum Chemistry

| Application | Features Supported | GPU Perf | Release Status | Notes |
|---|---|---|---|---|
| GAMESS-US | Libqc with Rys Quadrature Algorithm, integral evaluation, closed shell Fock matrix construction | 2.5X | Released | Single GPU supported in 10/1/10 release. Multi-GPU supported in July 2011 release. |
| NWChem | Triples part of Reg-CCSD(T), CCSD & EOMCCSD task schedulers | 3-8X projected | Date TBA, in development | Development GPGPU benchmarks: www.nwchem-sw.org |
| Q-CHEM | Various features including RI-MP2 | 8-14x projected | Date TBA, in development | Significant porting already |
| TeraChem | “Full GPU-based solution” | 44-650X vs. GAMESS CPU ver. | Version 1.45 released | Single and Multi-GPU. Completely redesigned to exploit massive GPU parallelism |

GPU Perf compared against Multi-core x86 CPU socket.

GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison

Page 9: Nvidia Cuda Apps Jun27 11

Material Science

| Application | Features Supported | GPU Perf | Release Status | Notes |
|---|---|---|---|---|
| Abinit | BigDFT - 50% of the program (short convolutions) | 6-30X | Released June 2009 | http://inac.cea.fr/L_Sim/BigDFT/news.html |
| Quantum-Espresso/PWscf | PWscf package: linear algebra (matrix multiply), explicit computational kernels, 3D FFTs | TBD | Released May 5, 2011 | Created by Irish Centre for High-End Computing |

GPU Perf compared against Multi-core x86 CPU socket.

GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison

Page 10: Nvidia Cuda Apps Jun27 11

Bioinformatics

CUDA-BLASTP

CUDA-EC

CUDA-MEME

CUDASW++ (Smith-Waterman)

DNADist

GPU Blast

GPU-HMMER

HEX Protein Docking

Jacket (MATLAB Plugin)

MUMmerGPU

MUMmerGPU++

SARUMAN

SeqNFind

UGENE

Additional details can be found at Tesla Bio Workbench:

http://www.nvidia.com/object/tesla_bio_workbench.html

Page 11: Nvidia Cuda Apps Jun27 11

Structural Mechanics

GPU Perf compared against Multi-core x86 CPU socket.

GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
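The caveat above matters when reading the GPU Perf column: a figure labelled "Solver", or one measured kernel to kernel, covers only the accelerated portion of a run, while "Total" covers the whole job. As a rough illustration that is not taken from the slides, Amdahl's law relates the two. The function name `total_speedup` and the runtime fractions used here are hypothetical and chosen only to show the arithmetic.

```python
def total_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: overall speedup when only part of the runtime is accelerated.

    accelerated_fraction: share of the original runtime spent in the GPU-ported
        part (for example a linear equation solver), between 0 and 1.
    kernel_speedup: speedup of that part alone (the "Solver" or kernel figure).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - accelerated_fraction  # part that still runs at CPU speed
    return 1.0 / (remaining + accelerated_fraction / kernel_speedup)


if __name__ == "__main__":
    # Hypothetical fractions; real applications must be profiled to obtain them.
    for fraction in (0.5, 0.8, 0.95):
        overall = total_speedup(fraction, 3.0)
        print(f"{fraction:.0%} of runtime in solver, 3x solver: {overall:.2f}x total")
```

With half of the runtime in the solver, a 3x solver speedup gives only 1.5x overall, which is why "Solver" and "Total" figures in the table are not directly comparable.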

| Application | GPU Features | GPU Perf | Release Status | Notes |
|---|---|---|---|---|
| ANSYS Mechanical | Linear eqn solvers | 2x Total | Today, release 13 SP2 | FE implicit, single-GPU |
| Abaqus/Standard | Linear eqn solver | 2x Total | Today, release 6.11 | FE implicit, single-GPU |
| IMPETUS Afea | Explicit solver, SPH | 10x SPH, 2x Total | Today, release 1.0 | FE explicit, multi-GPU |
| LS-DYNA implicit | Linear eqn solver | 3x Total | Planned for 2011 | FE implicit, multi-GPU |
| MD Nastran | Linear eqn solvers | 2x Solver | Planned for 2011 | FE implicit, multi-GPU |
| Marc | Linear eqn solver | 1.5x Total | Planned for 2011 | FE implicit, single-GPU |
| RADIOSS Implicit | Linear eqn solver | 1.5x Total | Demonstration | FE implicit, single-GPU |
| PAM-CRASH implicit | Linear eqn solver | 1.5x Total | Demonstration | FE implicit, single-GPU |
| NX Nastran | Linear eqn solver | 1.4x Total | Demonstration | FE implicit, single-GPU |

Page 12: Nvidia Cuda Apps Jun27 11

Fluid Dynamics

GPU Perf compared against Multi-core x86 CPU socket.

GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison

| Application | GPU Features | GPU Perf | Release Status | Notes |
|---|---|---|---|---|
| Altair AcuSolve | Linear eqn solver | 2x Total | Today, release 1.8 | FE unstructured NS, multi-GPU |
| Autodesk Moldflow | Linear eqn solver | 2x Total | Today, release 2011 | FE unstructured NS, single-GPU |
| FluiDyna LBultra | LBM, particle CFD | 20x Total | Today, release 1.0 | Structured LBM, multi-GPU |
| FluiDyna Culises-OpenFOAM Solver | Linear eqn solvers | 3x Solver | Today, release 1.0 | Unstructured NS, single-GPU |
| Vratis SpeedIT-OpenFOAM Solver | Linear eqn solvers | 3x Solver | Today, release 1.2 | Unstructured NS, multi-GPU |
| Prometech Particleworks | MPS, particle CFD | 4x-9x Total | Q3CY11 release 2.5 | Particle based, multi-GPU |
| Sandia NL S3D | Chemistry kernel | 8x SP, 5x DP kernel | Demonstration | Structured grid DNS, multi-GPU |
| Turbostream | Explicit solver | 19x Total | Today, release 2.0 | Structured grid NS, multi-GPU |
| SD++ (Jameson) | Explicit solver | 16x Total | Planned for 2011 | FE unstructured NS, multi-GPU |
| FEFLO (Lohner) | Explicit solver | 2x Total | Planned for 2011 | FE unstructured NS, multi-GPU |

Page 13: Nvidia Cuda Apps Jun27 11

Electromagnetics

| Application | Features Supported | GPU Perf | Release Status | Notes |
|---|---|---|---|---|
| Agilent EMPro | FDTD | 6X | 2011.07 Released | Single & multi-GPU; EMPro 2011 PR |
| CST Microwave Studio | Transient (FIT) solver; Combined MPI & GPU computing | 9X on 1 GPU to 20X+ on 4 GPUs | 2011 Released | Single & multi-GPU; www.cst.com/perf |
| Remcom XFdtd | FDTD | 30-300X | XF7 Released | Single and multi-GPU; XStream GPU acceleration |
| SPEAG SEMCAD X | FDTD; Acceleware | 100X | 14.4.3 Released | Single and multi-GPU; www.speag.com/perf |

GPU Performance compared against quad-core x86 CPU socket; Remcom XFdtd GPU performance compared against single core CPU

Page 14: Nvidia Cuda Apps Jun27 11

Climate/ Weather/ Ocean

GPU Perf compared against Multi-core x86 CPU socket.

GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison

| Application | GPU Features | GPU Perf | Production Status | Notes |
|---|---|---|---|---|
| WRF | WSM5, WSM3, Ice Microphysics models | 4x-6x Models | Today, release 3.2 | single-GPU |
| ASUCA | Most routines | 12x Total | In production at JMA | multi-GPU |
| NIM | Most routines | 7x Dynamics | Limited production | multi-GPU |
| HIRLAM | Dynamical core | 3x Solver | Planned for 2011 | multi-GPU |
| HOMME | Models | 3x Models | Planned for 2011 | single-GPU |
| CAM | Linear eqn solver | 2x Solver | Planned for 2011 | single-GPU |
| GEOS-5 | Most routines | 10x Models, 3x Dynamics | Demonstration | multi-GPU |
| MITgcm | Linear eqn solver | 3x solver | Demonstration | single-GPU |
| HYCOM | Linear eqn solver | 2x solver | Demonstration | single-GPU |