Streamlining Research Computing Infrastructure
A small school’s experience
Gowtham
HPC Research Scientist, ITS
Adj. Asst. Professor, Physics/ECE
g@mtu.edu
(906) 487-3593 http://www.mtu.edu
[Map: distances from Houghton, MI to Isle Royale National Park, MI (56 miles), Green Bay, WI (215 miles), Duluth, MN (215 miles), Detroit, MI (550 miles), Twin Cities, MN (375 miles), and Sault Ste. Marie, MI/Canada (265 miles)]
Michigan Tech Fall 2013
[Campus timeline: 1885, 1897, 1927, 1964]
- Population (Houghton/Hancock): 15,000 (22,000)
- Students: 7,000 (5,600 +1,400)
- Faculty: 500
- Staff: 1,000
- General budget: $170 million
- Sponsored programs awards: $48 million
- Endowment value: $83 million
- 8 mini- to medium-sized clusters, spread around campus
- Varying versions of Rocks
- Different software configurations
- Single power supply for most components
- Manual systems administration and maintenance
- Minimal end user training and documentation
An as-is snapshot January 2011
These 8 clusters — purchased mostly with start-up funds — had 1,000 CPU cores spanning several hardware generations and a few low-end GPUs. Only one of them had InfiniBand (40 Gb/s).
- Move all clusters to one of two data centers
- Merge clusters when possible
- Consistent racking, cabling and labeling scheme
- Upgrade to Rocks 5.4.2
- Identical software configuration
- End user training
- Complete documentation
Initial consolidation January 2011 — March 2011
Compute nodes deemed not up to the mark were set aside to build a test cluster: wigner.research.mtu.edu
Labeling scheme examples:
- R107B36 OB1 = Rack 107, Back side, 36th slot, On Board NIC 1 (of a node)
- R107B41 P01 = Rack 107, Back side, 41st slot, Port 01 (of the switch)
- hpcmonitor.it.mtu.edu
- Ganglia monitoring system
Capture usage pattern April 2011 — December 2011
Monitoring multiple clusters with Ganglia: http://central6.rocksclusters.org/roll-documentation/ganglia/6.1/x111.html
- Low usage: 20% on most days; 45-50% on the luckiest of days
- Inability and/or unwillingness to share resources
- Lack of resources for researchers in need
- More systems administrative work
- Space, power and cooling costs
- Less time for research, teaching and collaborations
Analysis of usage pattern January 2012
- VPR, Provost, CIO, CTO, Chair of HPC Committee and yours truly
- Strongly encourage sharing of under-utilized clusters
- End of life for existing individual clusters
- Stop funding new individual clusters
- Acquire one big centrally managed cluster
- Central administration will fully support the new policies
- One-person committees
- No exceptions for anyone
The meeting January 2012
The philosophy January 2012
Greatest good for the greatest number - Warren Perger and Gifford Pinchot
Much is said of the questions of this kind, about greatest good for the greatest number. But the greatest number too often is found to be one. It is never the greatest number in the common meaning of the term that makes the greatest noise and stir on questions mixed with money … - John Muir
It’s not just a keel and a hull and a deck and sails. That’s what a ship needs but not what a ship is. But what a ship is … what the Black Pearl Superior really is … is freedom. - Captain Jack Sparrow, Pirates of the Caribbean
Adapted shamelessly from Henry Neeman’s SC11 presentation: Supercomputing in Plain English
The philosophy January 2012
- $750k for everything
- $675k for hardware + 10% for unexpected expenses
- 5 rounds with 4 vendors (2 local; 2 brand names)
- Local vendor won the bid February 2013
- Staggered delivery of components April — May 2013
- Fly-wheel installation April — May 2013
- Load test with building and campus generators
Bidding/Acquiring process February 2012 — May 2013
- Built with retired nodes from other clusters
- 1 front end
- 2 login nodes
- 1 NAS node (2 TB RAID1 storage)
- 32 compute nodes
- 50+ software suites
- 150+ users
First version of wigner had just two nodes: 1 front end and 1 compute node, built with retired lab PCs and no switch
wigner.research January 2011 — December 2013
As of Spring 2014, wigner has been retired. The nodes are being used as a testing platform for the upcoming Data Science program at Michigan Tech and to teach building and managing a research computing cluster as part of PH4395: Computer Simulations.
- HPC Proving Grounds
- OS installation and customization
- Software compilation and integration with queueing system
- Extensive testing of policies, procedures and user experience
- PH4390, PH4395 and MA5903 students
- Small to medium sized research groups
- Automating systems administration
- Integrating configuration files, logs, etc. with a revision control system
wigner.research March 2011 — December 2013
- Central Rocks server (x86_64): serves 6.1, 6.0, 5.5, 5.4.3 and 5.4.2
- Saves time during installation
- Facilitates inclusion of cluster-specific rolls
rocks.it.mtu.edu April 2012 — present
Scripts and procedures were provided by Philip Papadopoulos
- 1 front end
- 2 login nodes
- 1 NAS node: 33 TB usable RAID60 storage space
- 72 CPU compute nodes
- 5 GPU compute nodes: 4 NVIDIA Tesla M2090 GPUs (448 CUDA cores)
Superior June 2013
Compute nodes (CPU and GPU): Intel Sandy Bridge E5-2670 (2.60 GHz), 16 CPU cores and 64 GB RAM per node
Housed in the newly built Great Lakes Research Center: http://www.mtu.edu/greatlakes/
- 56 Gbps InfiniBand (copper cables): primary research network
- Gigabit Ethernet: administrative and secondary research network
- Redundant power supply for every component
Superior June 2013
With 81 total nodes, there was 33% room for growth before needing to re-design the InfiniBand switch system. The final cost was $680k; the remaining $70k was used to build a test cluster: portage.research.mtu.edu
- Physical assembly (7 days): racking, cabling and labeling
- Rocks Cluster Distribution (5 days): OS installation, customization, compliance; software compilation, user accounts
- 3 pilot research groups (14 days): reward for being good and productive users; help fix bugs, etc.
Superior June 2013
Superior June 2013
[Cluster diagram: front end, login nodes, storage node, CPU compute nodes, GPU compute nodes, Ethernet switch system, InfiniBand switch system]
- short.q (compute-0-0 through compute-0-7): 24 hour limit on run time
- long.q (compute-0-8 through compute-0-81): no limit on run time
- gpu.q (compute-0-82 through compute-0-86): no limit on run time
Superior June 2013
http://superior.research.mtu.edu/available-resources
Benchmarks: HPL June 2013
#  Performance     (TFLOPS)   Notes
1  Theoretical     23.96      --
2  Practical       21.57      ~90% of #1
3  Measured        21.38      89.23% of #1
http://netlib.org/benchmark/hpl
Theoretical performance = # of nodes x # of cores per node x Clock frequency (cycles/second) x # of floating point operations per cycle
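The 23.96 TFLOPS figure can be reproduced from this formula. Below is a minimal sketch, assuming only the 72 CPU compute nodes (16 cores each at 2.60 GHz) count toward the HPL peak and that Sandy Bridge performs 8 double-precision floating point operations per cycle (AVX); both the node count and the FLOPs-per-cycle value are inferred, not stated on this slide.

```python
# Sanity check of the theoretical HPL peak quoted above.
# Assumptions (not stated on the slide): 72 CPU compute nodes, and 8
# double-precision floating point operations per cycle (Sandy Bridge AVX).
nodes = 72
cores_per_node = 16
clock_ghz = 2.60
flops_per_cycle = 8

peak_tflops = nodes * cores_per_node * clock_ghz * flops_per_cycle / 1000.0
print(f"Theoretical peak: {peak_tflops:.2f} TFLOPS")  # prints 23.96
```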
Benchmarks: LAMMPS June 2013
Benjamin Jensen (advisor: Dr. Gregory Odegard), Computational Mechanics and Materials Research Laboratory, Mechanical Engineering-Engineering Mechanics. Results from a simulation involving 1,440 atoms and 500,000 time steps.
[Chart: total run time (hours) vs. number of nodes (CPU cores): 2 (32), 4 (64), 6 (96), 10 (160); Michigan Tech's Superior compared with NASA's Pleiades]
Submit completed proposal to: Dr. Warren Perger, Chair, HPC Committee, wfp@mtu.edu
Account request
LaTeX/MS Word template available at http://superior.research.mtu.edu/account-request
- List of software/compilers
- Scalability
- Source of funding
- Résumé
- Proposal
- Title and abstract
- User population
- Preliminary results
- Nature of data sets
- Required resources
- A metric for merit
- An easily accessible list of projects
- Know what the facility is being used for
- Intellectual scholarship and computational requirements
- For VPR, CIO, deans, dept. chairs and institutional directors
- A fail-safe opportunity to practice writing proposals seeking allocations in NSF’s XSEDE, etc.
Why a proposal?
http://nsf.gov http://xsede.org http://superior.research.mtu.edu/list-of-projects
- Tier A: new faculty; established faculty with funding
- Tier B: established faculty with no (immediate) funding
User population
Group members and external collaborators inherit their PI’s tier. New faculty status is valid for 2 years from the first day of work.
Job submission: qgenscript
One stop shop for:
- Array jobs
- Exclusive node access
- Wait on pending jobs
- Email/SMS notifications
- Wait time statistics
- Command to submit the script
- Job information file
http://superior.research.mtu.edu/job-submission/#batch-submission-scripts
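The actual qgenscript lives on Superior and is not reproduced here; the following is a hypothetical sketch of the kind of Grid Engine batch script such a generator emits. The function name and defaults are made up, but the #$ directives (-N, -q, -cwd, -t, -hold_jid, -M/-m) are standard Grid Engine options, and the generated file would be submitted with qsub.

```python
# Hypothetical sketch of a Grid Engine batch-script generator in the spirit
# of qgenscript; not the actual tool used on Superior.
def generate_script(name, command, queue="long.q", array=None,
                    email=None, hold_jid=None):
    lines = [
        "#!/bin/bash",
        f"#$ -N {name}",       # job name
        f"#$ -q {queue}",      # short.q, long.q or gpu.q
        "#$ -cwd",             # run from the submission directory
    ]
    if array:                  # e.g. "1-100" for an array job
        lines.append(f"#$ -t {array}")
    if hold_jid:               # wait on pending jobs before starting
        lines.append(f"#$ -hold_jid {hold_jid}")
    if email:                  # notify on begin, end and abort
        lines += [f"#$ -M {email}", "#$ -m abe"]
    lines.append(command)
    return "\n".join(lines) + "\n"


print(generate_script("lammps_run", "mpirun -np 32 lmp < in.lammps",
                      queue="short.q", email="user@mtu.edu"))
```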
- Users’ priorities are computed periodically
- A weighted function of CPU time and production (an illustrative sketch follows below)
- In effect only when Superior is running at near 100% capacity
- Pre-emption and advanced reservation are disabled
- Any job that will start will run to completion
Job scheduling policy
http://superior.research.mtu.edu/job-submission/#scheduling-policy
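The slides describe the priority only as a weighted function of CPU time and production. The weights, scaling and functional form below are purely illustrative, a sketch of one way such inputs could be combined, not the in-house algorithm itself.

```python
# Illustrative priority function: recent CPU time lowers priority, reported
# production (publications) raises it. Weights and scaling are hypothetical.
def user_priority(cpu_hours_used, publications,
                  w_usage=0.7, w_production=0.3):
    usage_penalty = cpu_hours_used / (cpu_hours_used + 1000.0)   # maps to 0..1
    production_bonus = publications / (publications + 5.0)       # maps to 0..1
    return w_production * production_bonus - w_usage * usage_penalty

# Higher values would be scheduled earlier, and the ordering matters only
# when Superior is running at near 100% capacity.
print(user_priority(cpu_hours_used=2500, publications=3))
```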
Email/SMS notifications
http://superior.research.mtu.edu/job-submission/#sms-notifications
- Reduces performance for all users
- First offense: terminates the program; an email notification [cc: user’s advisor]
- Subsequent offenses: same as first offense; logs the user out and locks down the account
Running programs on login nodes
http://superior.research.mtu.edu/job-submission/#running-programs-on-login-nodes
A continued trend will be grounds for removal of the user’s account.
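How such enforcement might be wired up, as a hedged sketch only: a periodic sweep of the login nodes using the third-party psutil library, terminating long-running compute processes. The threshold, exempt-process list and notification step are placeholders, not the mechanism actually used on Superior.

```python
# Sketch of a login-node watchdog (illustrative only): terminate processes
# that have accumulated too much CPU time, then report the offenders so the
# caller can e-mail each user (cc: advisor).
import psutil

CPU_SECONDS_LIMIT = 600          # hypothetical threshold for "compute" work
EXEMPT = {"bash", "ssh", "scp", "rsync", "vim", "emacs"}

def sweep_login_node():
    offenders = []
    for proc in psutil.process_iter(["name", "username", "cpu_times"]):
        try:
            if proc.info["name"] in EXEMPT:
                continue
            times = proc.info["cpu_times"]
            if times and (times.user + times.system) > CPU_SECONDS_LIMIT:
                proc.terminate()          # first offense: stop the program
                offenders.append((proc.info["username"], proc.info["name"]))
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue
    return offenders

if __name__ == "__main__":
    print(sweep_login_node())
```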
- Data is not backed up
- Limits per user
- /home/john: 25 MB
- /research/john: decided on a per-proposal basis
- When a user exceeds the limit
- 12 reminders at 6-hour intervals [cc: user’s advisor]
- 13th reminder: logs out the user and locks down the account (a sketch of this escalation follows below)
Disk usage
http://superior.research.mtu.edu/job-submission/#disk-usage
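A minimal sketch of the escalation described above, assuming usage is measured with du and reminder counts are tracked per sweep; the paths, the mail step and the lockout command (usermod -L) are illustrative, not the actual implementation behind the policy.

```python
# Illustrative 6-hourly quota check: up to 12 reminders, then lock the account.
# In practice the strike counts would need to persist between runs.
import subprocess

HOME_LIMIT_BYTES = 25 * 1024 * 1024      # 25 MB /home quota
strikes = {}                             # user -> consecutive reminders sent

def send_reminder(user, used_bytes):
    # Placeholder for the real e-mail notification [cc: user's advisor].
    print(f"[reminder] {user}: {used_bytes} bytes in /home, limit is 25 MB")

def check_user(user):
    out = subprocess.check_output(["du", "-sb", f"/home/{user}"], text=True)
    used = int(out.split()[0])
    if used <= HOME_LIMIT_BYTES:
        strikes[user] = 0
        return
    strikes[user] = strikes.get(user, 0) + 1
    if strikes[user] <= 12:
        send_reminder(user, used)
    else:                                # 13th time: log out and lock account
        subprocess.run(["usermod", "-L", user], check=False)
```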
Useful commands
Developed at Michigan Tech http://superior.research.mtu.edu/job-submission/#useful-commands
- qgenscript
- qresources
- qlist
- qnodes-map
- qnodes-active | qnodes-idle
- qwaittime
- qstatus | quser | qgroup
- qnodes-in-job
- qjobs-in-node
- qjobs-in-active-nodes
- qjobinfo | qjobcount
- qusage
Usage reports
All PIs and the Chair of the HPC Committee receive a weekly report. VPR, CIO, deans, department chairs and institutional directors receive quarterly and annual reports (or when necessary).
- 21 projects: 10 Tier A + 11 Tier B
- 100 users
- 9 publications
- 75+% busy on most days
- $325k worth of usage
- ~50% of the initial investment
- Cost recovery model: $0.10 per CPU-core per hour (so $325k corresponds to roughly 3.25 million CPU-core hours)
Usage reports July 2013 — December 2013
Metrics
Cannot manage what cannot be measured
Not everything that’s (easily) measurable is (really) meaningful.
Not everything that’s (really) meaningful is (easily) measurable.
- Move towards a merit-based system
- Easily measurable quantities
- Who users are
- # of CPUs and total CPU time
- Really meaningful entities
- Publications: type (poster, conference proceeding, journal) and impact factor
- Citations
Metrics
Publications reported to: Dr. Warren Perger, Chair, HPC Committee, wfp@mtu.edu
Metrics: job priority
An in-house algorithm computes users’ priorities; the system already knows who the users are.
Metrics
http://superior.research.mtu.edu/usage-reports
Interactive visualizations are built using the Highcharts framework.
Metrics: global impact
http://superior.research.mtu.edu/list-of-publications
[Publication categories: Michigan Tech original, Journal Article, Book Chapter, Conference Proceeding, MS Thesis, PhD Dissertation]
- Move all clusters to Great Lakes Research Center
- Upgrade to Rocks 6.1 and add a login node
- Retire individual clusters when possible
- 16 compute nodes and 1 NAS node added to Superior
- portage.research.mtu.edu: segue to Superior
- 1 front end, 1 login node, 1 NAS node and 6 compute nodes
- Testing, course work projects and beginner research groups
Further consolidation August 2013 — December 2013
- 1 big, 1 mini (central) and 3 individual clusters
- 1 data center with .research.mtu.edu network
- Rocks 6.1
- Identical software configurations
- Automated systems administration and maintenance
- Extensive end user training
- Complete documentation
An as-is snapshot January 2014
Immersive Visualization Studio (IVS) is powered by a Rocks 5.4.2 cluster and has 24 HD screens (46” 240 Hz LED) working in unison to create a 160 sq. feet display wall. @MTUHPCStatus
- More tools to enhance user experience
- Videos for self-paced learning of command line Linux
- Encourage GPU computing
- Expand storage
- Provide backup
- Re-design InfiniBand switch system (216 nodes)
- Plan for expanded (or new) Superior
Immediate future February 2014 and beyond
Thanks be to
- Philip Papadopoulos and Luca Clementi (UCSD and SDSC)
- Timothy Carlson (PNL)
- Thomas Reuti Reuter (Philipps-Universität Marburg)
- Alexander Chekholko (Stanford University)
- Rocks, Grid Engine and Ganglia mailing lists
- Henry Neeman (University of Oklahoma)
- Steven Gordon (The Ohio State University)
- Gergana Slavova, Walter Shands and Michael Tucker (Intel)
- Gaurav Sharma and Scott Benway (MathWorks)
- Adam DeConinck (NVIDIA)