
Transcript of Storrs HPC Overview - Feb. 2017

Page 1: Storrs HPC Overview - Feb. 2017

High Performance Computing (HPC)

Storrs Campus

Ed Swindelles

[email protected]

2/24/2017

Page 2: Storrs HPC Overview - Feb. 2017

Storrs HPC

• Alignment between the Storrs and Farmington/CBC HPC clusters:

– Storrs – general-purpose HPC for traditional workloads (Engineering, Chemistry, Biology, etc.), with a tightly coupled systems architecture (Infiniband + parallel file system).

– Farmington/CBC – focused on bioscience workloads (genomics, bioinformatics, etc.), with a data/memory-intensive architecture.

– Shared storage across campuses – ~3PB of object storage (Amazon S3, REST) with file system gateways (NFS, SMB); see the sketch after this list.

– 100GbE frictionless network between the resources (Science DMZ).

• Storrs cluster access is available to all UConn researchers at all campuses at no charge.

• Researchers who require high-priority access invest in semi-dedicated nodes – the “condo model” – as a capital investment only.
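
The shared object storage above exposes an S3-compatible (REST) interface, so it can typically be reached with standard S3 tooling. The sketch below uses boto3 purely as an illustration; the endpoint URL, bucket name, and credentials are hypothetical placeholders, not actual Storrs HPC values.

```python
# Sketch: accessing an S3-compatible object store with boto3 (pip install boto3).
# Endpoint, bucket, and credentials below are placeholders, not real site values.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.example.uconn.edu",  # hypothetical gateway URL
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)

# Upload a local result file, then list the bucket's contents.
s3.upload_file("results.h5", "my-lab-bucket", "project1/results.h5")
for obj in s3.list_objects_v2(Bucket="my-lab-bucket").get("Contents", []):
    print(obj["Key"], obj["Size"])
```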

Page 3: Storrs HPC Overview - Feb. 2017

Usage

• 54M total core hours used from 2011 through 2016 (yearly breakdown below)

• Equivalent to 6,172 years of continuous execution on a single-core PC (see the quick check after the yearly figures)

Core hours used per year: 2011 – 0.16M, 2012 – 2.06M, 2013 – 3.60M, 2014 – 6.83M, 2015 – 9.67M, 2016 – 31.75M
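
As a quick check of the single-core equivalence, the yearly figures sum to roughly 54M core hours, and dividing by the hours in a year gives about 6,172 years; a few lines of Python reproduce the arithmetic.

```python
# Quick arithmetic check of the figures quoted above.
yearly_core_hours_millions = {2011: 0.16, 2012: 2.06, 2013: 3.60,
                              2014: 6.83, 2015: 9.67, 2016: 31.75}
total_hours = sum(yearly_core_hours_millions.values()) * 1e6   # ~54.07M core hours
years_on_one_core = total_hours / (24 * 365)                   # hours in a (non-leap) year
print(f"{total_hours / 1e6:.2f}M core hours ≈ {years_on_one_core:,.0f} years on one core")
# -> 54.07M core hours ≈ 6,172 years on one core
```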

Popular Applications
• MATLAB
• ANSYS
• MPI libraries (example below)
• Intel compiler suite
• R, Python
• HDF5, NetCDF
• GROMACS
• IDL
• CUDA
• NAMD
• Gaussian
• CHARMM
• COMSOL
• ABAQUS
• Schrodinger
• Avizo
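
Many of these packages run as tightly coupled MPI jobs over the Infiniband fabric. As a minimal illustration only, assuming an MPI library and the mpi4py Python bindings are available on the cluster (e.g. through the module system), a parallel hello-world looks like this:

```python
# hello_mpi.py -- minimal MPI example (assumes mpi4py is installed).
from mpi4py import MPI

comm = MPI.COMM_WORLD              # communicator spanning all ranks in the job
rank = comm.Get_rank()             # this process's id within the job
size = comm.Get_size()             # total number of MPI processes

# Gather each rank's hostname on rank 0 and print a one-line summary there.
hostnames = comm.gather(MPI.Get_processor_name(), root=0)
if rank == 0:
    print(f"{size} MPI ranks running on nodes: {sorted(set(hostnames))}")
```

Inside a batch job this would typically be launched with something like `mpirun -np 24 python hello_mpi.py`; the exact launcher and module names depend on the MPI library chosen.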

Page 4: Storrs HPC Overview - Feb. 2017

Technical Specifications

• Most common CPU nodes (184 total):

– 24 cores @ 2.6GHz, 128GB RAM, 200GB local SSD, FDR Infiniband

• Most dense CPU nodes (4 total):

– 44 cores @ 2.2GHz, 256GB RAM, 800GB local SSD, FDR Infiniband

• New academic cluster: ~450 cores available

CPU Cores: 6,328
CPU Architecture: Intel Xeon x64
Total RAM: 33TB
Interconnect: FDR Infiniband
Shared Storage: 1PB parallel storage
Accelerators: Intel Xeon Phi and NVIDIA GPUs
Operating System: Linux
Peak Performance: 250 Teraflops
Campus Connection: 20Gbps Ethernet

Hardware Partners

Page 5: Storrs HPC Overview - Feb. 2017

Default Resource Allocations

Compute Allocations

Partition (cores / maximum runtime):
• general (default) – 192 cores, 12 hours
• parallel – 384 cores, 6 hours
• serial – 24 cores, 7 days
• debug (high priority) – 24 cores, 30 minutes
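
For orientation, jobs are submitted to one of these partitions through the batch scheduler. The sketch below assumes a SLURM-style scheduler and submits a small job to the default general partition from Python; the script contents, executable name, and scratch path are illustrative, not official site settings.

```python
# Sketch: submitting a batch job to an assumed SLURM-style scheduler.
# Partition and limits follow the table above; paths/executable are hypothetical.
import subprocess
import textwrap

batch_script = textwrap.dedent("""\
    #!/bin/bash
    #SBATCH --partition=general     # default partition: up to 192 cores, 12 hours
    #SBATCH --ntasks=24             # roughly one node's worth of cores
    #SBATCH --time=02:00:00         # well under the 12-hour limit
    #SBATCH --job-name=example
    cd /scratch/$USER/example       # run from the parallel scratch space
    srun ./my_solver input.dat      # hypothetical MPI executable
    """)

# sbatch reads the job script from stdin when no file name is given.
result = subprocess.run(["sbatch"], input=batch_script, text=True,
                        capture_output=True, check=True)
print(result.stdout.strip())        # e.g. "Submitted batch job 123456"
```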

Data Allocations

• /scratch – high performance, parallel; 1 petabyte shared, no quotas; 30-day data retention, no backups
• /work – high-performance local SSD on each node, ~150GB per node; 5-day data retention, no backups
• /archive – highly resilient, permanent; low relative performance
• /home – 50GB non-shared personal storage; backed up regularly
• /shared – shared storage for teams/labs; backed up regularly

Data transfers are facilitated by our Globus endpoint: uconnhpc#dtn-transfer

https://wiki.hpc.uconn.edu/index.php/Globus_Connect

Researchers who purchase priority allocations have no runtime limits on their cores once their job begins.
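
Transfers to and from the Globus endpoint above can also be scripted. The sketch below uses the Globus Python SDK (globus-sdk) and is illustrative only: the client ID, personal endpoint UUID, and paths are placeholders, and it assumes both endpoints are already set up and activated.

```python
# Sketch: a scripted transfer from the cluster endpoint via the Globus Python SDK
# (pip install globus-sdk). CLIENT_ID, the personal endpoint UUID, and the paths
# below are placeholders, not real values.
import globus_sdk

CLIENT_ID = "YOUR-NATIVE-APP-CLIENT-ID"

# Interactive login to obtain a transfer token.
auth = globus_sdk.NativeAppAuthClient(CLIENT_ID)
auth.oauth2_start_flow()
print("Log in at:", auth.oauth2_get_authorize_url())
tokens = auth.oauth2_exchange_code_for_tokens(input("Auth code: "))
transfer_token = tokens.by_resource_server["transfer.api.globus.org"]["access_token"]

tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(transfer_token))

# Look up the cluster endpoint by display name, then queue a recursive copy
# from /scratch to the home directory of a personal endpoint.
hpc_ep = tc.endpoint_search("uconnhpc#dtn-transfer")["DATA"][0]["id"]
my_ep = "YOUR-PERSONAL-ENDPOINT-UUID"
tdata = globus_sdk.TransferData(tc, hpc_ep, my_ep, label="scratch results")
tdata.add_item("/scratch/results/", "/~/results/", recursive=True)
print("Task ID:", tc.submit_transfer(tdata)["task_id"])
```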


Page 6: Storrs HPC Overview - Feb. 2017

More Information

• Attend our Beginner HPC Training on the Storrs campus in Laurel Hall, room 306, on 3/6, 10AM-12PM; Advanced HPC Training follows on 3/20.

• For more information, and to apply for an account: http://hpc.uconn.edu

• Documentation: http://wiki.hpc.uconn.edu

• Technical support: [email protected]

• My contact info:

Ed Swindelles

[email protected]

Phone: x4522

Office: Storrs, HBL A-09

UConn High Performance Computing with Dell EMC and Intel

http://s.uconn.edu/hpcvideo

Page 7: Storrs HPC Overview - Feb. 2017

Connecticut Education Network

• Provides network access for data-intensive needs.

• Feb 2015: completed the 100G Science DMZ connection between Storrs (main campus) and UConn Health Center (Farmington).

• Result of a successful 2013 NSF grant (CC-NIE #1341007) to UConn’s School of Engineering Computer Networking Research Group to provide 100G layer-2 Science DMZ connectivity, beginning in Jan 2014.

Page 8: Storrs HPC Overview - Feb. 2017

Diverse Fiber Path

Page 9: Storrs HPC Overview - Feb. 2017

CEN Core Backbone