Download - Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Transcript
Page 1: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

UpcomingBiowulf Seminars

• November30,1- 3pmPythoninHPCOverviewofpythontoolsusedinhighperformancecomputing,andhowtoimprovetheperformanceofyourpythonjobsonBiowulf• Jan16,1- 3pmRelion tipsandtricks,andParalleljobsandbenchmarkingMechanicsandbestpracticesforsubmiting RELIONjobstothebatchsystemfromboththecommandlineandviatheRELIONGUI,aswellasmethodsformonitoringandevaluatingtheresults.Scalingofparalleljobs,howtobenchmarktomakeeffectiveuseofyourallocatedresources

Bldg 50,Rm1227

Page 2: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

MakingEffectiveUseoftheBiowulf BatchSystem

NIHHPCSystems

[email protected],CIT

Oct30,2017

Page 3: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

EffectiveUse==EffectiveResourceAllocation

• Specifyingresources• Estimatingrequiredresources• Allocatingresourceswithsbatch andswarm

• Monitoringresourceallocation• Schedulingandresourceallocation• Post-mortemanalysis

Page 4: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

CPUs

CPU

HardwareTerminologyReview

Hyper-threading

Processor

Node

Page 5: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

EstimatingResources• CPU• Checkdocumentation(https://hpc.nih.gov/apps/)• Objective-- matchCPU:Threads 1:1(thereareexceptions,e.g.,MDjobs)

• Memory• Runajoborswarmwithalargememoryallocation• Checkactualmemoryusage• Add10%toactualmemoryusage

• Time• Runajoborswarmwithalargetimeallocation• Checkactualwalltime• Add10%toactualwalltime

Page 6: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

AllocatingResourceswithsbatch andswarm• Alljobs• --mem(sbatch)or-g(swarm)• --time(sbatch andswarm)• -btobundlecommandlines(swarm)

• Single-threadedjobs• “-p2”toloadcoreswith2threads(swarm)

• Multi-nodejobs• “ParallelJobsandBenchmarking”Jan16

• Multi-threadedjobs• --cpus-per-task(sbatch)or-t(swarm)• Use$SLURM_CPUS_PER_TASKinbatchscript• OMP_NUM_THREADS

Page 7: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

MonitoringResourceAllocation

• CPU• jobload whilethejobisrunning• Dashboardduringorafterthejob• (NoeasywaytomonitorGPUutilizationatthemoment)

• Walltime• jobhist,Dashboardorsacct duringorafterthejobhascompleted

• Memory• jobload whilethejobisrunning• jobhist,Dashboardorsacct duringorafterthejobhascompleted

Page 8: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

jobload% jobload -u someuserJOBID TIME NODES CPUS THREADS LOAD MEMORY

Elapsed / Wall Alloc Active Used / Alloc51863534 6-22:08:01 / 10-00:00:00 cn3095 4 4 100% 1.0 / 8.0 GB51863535 6-22:08:01 / 10-00:00:00 cn3256 4 5 125% 0.9 / 8.0 GB51863536 6-22:08:01 / 10-00:00:00 cn3348 4 1 25% 1.0 / 8.0 GB51863537 6-22:08:01 / 10-00:00:00 cn3401 4 3 75% 0.9 / 8.0 GB 51881591 6-19:42:16 / 10-00:00:00 cn3097 4 1 25% 1.0 / 8.0 GB

% jobload -j 51874438_233 JOBID TIME NODES CPUS THREADS LOAD MEMORY

Elapsed/Wall Alloc Active Used/Alloc51874438_233 6-20:10:13/10-00:00:00 cn3105 2 1 50% 0.5/ 1.5 GB

Page 9: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

jobhist

# jobhist 52102264_67Jobid Partition State Nodes CPUs Walltime Runtime MemReq MemUsed Nodelist52102264_67 norm COMPLETED 1 2 02:00:00 00:05:29 4.0GB/node 0.8GB cn3185

allocated

used

Page 10: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

sacct

% sacct --format=Jobname,AllocCPUS,AllocNodes,ReqMem,MaxRSS,Elapsed -j 52102332 JobName AllocCPUS AllocNodes ReqMem MaxRSS Elapsed ---------- ---------- ---------- ---------- ---------- ----------tbss_2_reg 2 1 4Gn 00:05:29 batch 2 1 4Gn 815152K 00:05:29

Page 11: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

UsingYourDashboardtoMonitorJobshttps://hpc.nih.gov

https://hpc.nih.gov/dashboard/

Page 12: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

SchedulingandResourceAllocation

• Schedulingisdeterminedbyjobpriority• PriorityisdeterminedbyFairshare valueofuser• Fairshare isdeterminedbyrecentcpu andmemoryallocations ofrunningjobs

• Unnecessarilylongtimeallocationwillpreventjobsfrombeingbackfilled

Page 13: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

• ‘freen’showsfreeCPUsbutnotfreememoryordisk• Otherjobshavehigherpriority(sprio)• Nodesarereservedforhigher-priorityjobs

Whyaremyjobspending?

Page 14: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Consequencesof…

Specifyingmoreresourcesthanneeded

Specifyingfewerresourcesthanneeded

CPU WastedCPUresources,possiblyunnecessaryschedulingdelays

Jobrunsalittle/a lotslower

Memory Wastedmemoryresources,possiblyunnecessaryschedulingdelays

Jobis“Killed” bythekernel

Time Possiblyunnecessaryschedulingdelays

Jobiskilledbythebatch system

Page 15: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Post-mortemofjobsusinguserDashboardOr

TheGood,theBad,

andtheUgly…

Page 16: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Comment:jobisrunningwithdefaultallocationsforCPUandmemoryRecommendation:ifasubjob ofalargeswarm,try“-p2”

Page 17: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

A+

Page 18: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

AnotherA+

Page 19: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Recommendation:reduceCPUallocation

Page 20: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

A+

Page 21: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Comment:perfectCPUutilization;underutilizedmemorybutentirenodeisallocatedduetocpu allocation

Page 22: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Comment:goodoverallutilization;possiblysplitintotwojobswithadependency,andwithdifferingresourceallocations

Page 23: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Comment:8CPUstoolittle/toomuch,2woulddoRecommendation:couldruninhalfthememory

Page 24: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high
Page 25: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Comment:goodmemoryutilizationRecommendation:increasingCPUallocationprobablywon’thelp

Page 26: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Comment:CPUsbadlyoverloadedRecommendation:couldruninlessthanhalfthememory

Page 27: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Comment:CPUsoverloaded200%Recommendation:256GBmemoryallocated,MBsused

Page 28: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Comment:goodmemoryutilizationRecommendation:mightrunfasterwith32CPUs?

Page 29: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Recommendation:increasingCPUcountmightimprove?

Page 30: Upcoming BiowulfSeminars - NIH HPC › training › handouts › Effective_batch...Upcoming BiowulfSeminars •November 30, 1 -3 pm Python in HPC Overview of python tools used in high

Recommendation:allocating56CPUswouldlikelyhelp