Provisioning with Stacki at NIST
Justin Senseney
HPC/ISD/OISM/[email protected]
High Performance Computing Group, including: Carl Spangler, Justin Senseney, Mark Williams
November 30, 2016
Overview
• Disclaimer
• About NIST
• Our ideas
• Existing system
  • Hardware
  • Network
  • Software
• New system
  • Stacki
  • SSH keys
  • Directory structure
  • Software
  • Modules
• Examples
  • Monitoring
  • Interactive job
  • Parallel processing
Disclaimer
• Any mention of commercial products, including Stacki, within this presentation is for information only; it does not imply recommendation or endorsement by NIST.
About NIST
• Part of U.S. Department of Commerce
• Non-regulatory
• A metrology institute - maintains measurement standards
  • Time server - time.nist.gov
  • Two-factor authentication - HSPD-12
  • Standard reference data
• Gaithersburg, MD
• Boulder, CO
Our ideas
• Ordering carts for multiple carts
• stack compile cart for changes
• Version number that node is installed with
• Clean upgrade process to preserve git repository
Existing system
• In place for the last 10+ years:
  • Hardware - heterogeneous hardware
    • Support different hardware configurations
  • Network - simple, flat network
    • Increase topology complexity
  • Software - CentOS 5, maui/torque, consistent image
    • Software installed on local machine
Hardware
• Located in Gaithersburg, MD.
• Owned by NIST scientific organizational units, managed by Office of Information Systems Management (OISM).
• Nodes - 8 GB/core, procured 2010-2016:
  • Different vendors
• Network - InfiniBand, Ethernet
  • Daisy-chained switches
  • Different vendors
Network
• Protocols
  • InfiniBand on separate card
  • Ethernet on-board
• Head node
  • Login
  • Provisioning
  • Queue manager
• Flat network
Raritan uses this typical HPC setup. All incoming connections go to the head node.
Picture from: http://www.udel.edu/it/research/training/config_laptop/
Software
• Maui/Torque
• systemimager - provided consistent image
• CentOS 5 and 7
• SSH keys shared
• Directories:
  • /usr/local/bin for shared software
  • /tmp for local computing
  • /wrk for shared storage
  • /home for user data
New system
• Stacki
• SSH keys
• Directory structure
• Software
• Modules
Rocks          Stacki
Appliance      Box
Roll           Pallet
Distribution   Cart
Stacki
• Stacki box
  • Our box contains multiple Pallets
  • Our box contains 1 Cart
• Pallets
  • 3rd party, software with different versions
• Carts
  • Custom/configured software
  • System files
• Ordering carts for multiple carts
• stack compile cart for changes
Pictures from:
https://commons.wikimedia.org/wiki/File:Box.agr.jpg
https://upload.wikimedia.org/wikipedia/commons/8/88/A1210.jpg
https://commons.wikimedia.org/wiki/Category:Utility_carts#/media/File:Moebelhunt_fcm.jpg
SSH keys
• systemimager duplicates SSH keys; how do users log in now?
  • Munge authentication:
    • Slurm lets users launch commands through Munge, using a shared key
  • SSH keys:
    • Have users add their SSH key to .ssh/authorized_keys in their home directory
    • Placed a script for doing this in /share/sw <-- need an understandable directory structure
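The script placed in /share/sw is not reproduced in this transcript. As a rough sketch of what such a helper could look like, assuming home directories are shared across nodes so that a key authorized once is accepted on every node, the hypothetical `authorize_key` function below appends a public key to .ssh/authorized_keys and sets conventional permissions. The function name and its arguments are illustrative only and are not the NIST script.

```shell
#!/bin/sh
# Hypothetical helper (not the NIST script): authorize_key PUBKEY_FILE [HOME_DIR]
# Appends a public key to HOME_DIR/.ssh/authorized_keys unless it is already listed.
authorize_key() {
    pubkey_file=$1
    home_dir=${2:-$HOME}
    ssh_dir="$home_dir/.ssh"
    auth_file="$ssh_dir/authorized_keys"

    [ -r "$pubkey_file" ] || { echo "cannot read $pubkey_file" >&2; return 1; }

    # Conventional permissions: with StrictModes (the sshd default), key files
    # that other users can write to are rejected.
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" && chmod 600 "$auth_file" || return 1

    # -x whole line, -F fixed string, -f read pattern from file:
    # append only when the exact key line is not present yet.
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

A user would typically create a key pair first, for example with `ssh-keygen -t ed25519`, and then run `authorize_key ~/.ssh/id_ed25519.pub`. Calling the function twice with the same key leaves a single entry.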
Directory structure
• Types
  • Configuration files
  • Stacki-used files
  • Shared software
  • Stacki-created files
• Maintenance
  • Git repositories
Directory structure
• /export/apps/configfiles - cart pulls from here
  • A git repository
• /export/stack/carts/extend - the config cart
  • A git repository
• /export/sw - 3rd party/licensed software installed here
  • Can get messy
• /export/stack/spreadsheets
  • Created by stacki
Cluster software - we maintain
• Local software - if out of date, file an ITAC ticket
  • /usr/local/bin
  • /usr/bin
  • Includes LAMMPS, Python34, Paraview, FFTW
  • Maintained by yum repository, more likely to be up to date
  • Use yum search [name] to find more available software
• Shared software - if out of date, contact POC
  • /share/sw
  • Anaconda, Gaussian, OpenFOAM, OpenMPI
  • Fixed versions requested by Raritan users, only updated by request
Cluster software - all by yum install
• Programming libraries:
  • Lapack, MKL, ACML, BLACS, ScaLAPACK, CMLIB, Python, R, Java
• Editing files:
  • No X11: Vi, Vim, Emacs, Nano
  • X11: Gedit, Sublime
• Comparing files:
  • No X11: Vimdiff
  • X11: Kdiff3, Meld
Modules
• Easier management of environment variables
• module avail
  • Gaussian, mpi, intel, hpc
• module purge
Modules - Example
• module load openmpi
• module display openmpi
• echo $LD_LIBRARY_PATH
• module remove openmpi
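To make the effect of loading and removing a module on a variable such as LD_LIBRARY_PATH concrete, the sketch below mimics it with two hypothetical shell functions, `prepend_path` and `remove_path`. It only illustrates the idea and is not how the module command itself is implemented; the directory /share/sw/openmpi/lib is an assumed location, not one given in the talk.

```shell
#!/bin/sh
# prepend_path VAR DIR: put DIR at the front of the colon-separated list in VAR.
prepend_path() {
    eval "current=\${$1:-}"
    if [ -n "$current" ]; then
        eval "$1=\"\$2:\$current\""
    else
        eval "$1=\"\$2\""
    fi
    export "$1"
}

# remove_path VAR DIR: drop every entry equal to DIR from the list in VAR.
remove_path() {
    eval "current=\${$1:-}"
    result=
    old_ifs=$IFS
    IFS=:
    for entry in $current; do
        if [ "$entry" != "$2" ]; then
            result=${result:+$result:}$entry
        fi
    done
    IFS=$old_ifs
    eval "$1=\"\$result\""
}

# Demo in a subshell so the caller's environment is left untouched.
(
    LD_LIBRARY_PATH=/usr/lib64
    prepend_path LD_LIBRARY_PATH /share/sw/openmpi/lib   # roughly: module load openmpi
    echo "$LD_LIBRARY_PATH"                              # /share/sw/openmpi/lib:/usr/lib64
    remove_path LD_LIBRARY_PATH /share/sw/openmpi/lib    # roughly: module remove openmpi
    echo "$LD_LIBRARY_PATH"                              # /usr/lib64
)
```

On the cluster itself, `module display openmpi` shows the real paths a module changes, so nothing like this needs to be written by hand.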
Examples
• Monitoring
  • Sview
  • Check packing
• Job submission
  • sbatchBuilder
  • Srun
  • OpenMPI
Sview
Check packing
sbatchBuilder
• Helps specify commands like last example
• To get: sbatch -N 3 -p quick -c arch=8 myScript.sh
  • sbatchBuilder -N 3 -c 8 -q quick myScript.sh
• See list of options
Srun
• salloc --nodes=4 --ntasks-per-node=1
• srun hostname
• srun /wrk/jss7/makedir.sh /scratch/jss7
• scancel [jobNum]
OpenMPI
• Integrating function to approximate pi
• Can run on 1 machine:
  • module load pgi
  • module load openmpi
  • mpicc cpi.c -o cpi
• Don't run on head node
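The transcript does not show cpi.c itself. Assuming it is the widely distributed MPICH example program of that name, the function being integrated is 4/(1+x^2) on [0, 1], approximated with the midpoint rule over n intervals:

```latex
\pi = \int_0^1 \frac{4}{1+x^2}\,dx
\;\approx\; h \sum_{i=1}^{n} \frac{4}{1+x_i^2},
\qquad h = \frac{1}{n}, \quad x_i = h\left(i - \tfrac{1}{2}\right)
```

Under that assumption, each MPI rank sums every p-th term of the series (for p ranks) and the partial sums are combined with a reduction, which is why the same binary runs on one machine or on many nodes.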
OpenMPI - submission script
• Try mpirun
• sbatch multijob.sbatch
• sbatch --constraint="haswell" --nodes=4
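The contents of multijob.sbatch are not included in the transcript. The sketch below writes a hypothetical stand-in, multijob-sketch.sbatch, that combines the pieces named on these slides: the pgi and openmpi modules, four haswell nodes, and mpirun launching the cpi binary. The job name, the output file pattern, and one task per node are assumptions.

```shell
#!/bin/sh
# Write a hypothetical batch script; the directive values are assumptions,
# not the contents of the multijob.sbatch used in the talk.
cat > multijob-sketch.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=cpi
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1
#SBATCH --constraint=haswell
#SBATCH --output=cpi-%j.out

# Start from a clean environment, then load the compiler and MPI modules.
module purge
module load pgi
module load openmpi

# Assumes an Open MPI build with Slurm support, so mpirun uses the allocation.
mpirun ./cpi
EOF
```

The file would be submitted with `sbatch multijob-sketch.sbatch`. Options given on the sbatch command line, such as --constraint or --nodes, take precedence over the matching #SBATCH lines.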
OpenMPI - results
• 12 worker nodes
• 19 worker nodes
Our ideas
• Ordering carts for multiple carts
• stack compile cart for changes
• Version number that node is installed with
• Clean upgrade process to preserve git repository
Questions?
• My email: [email protected]