Converting Your (Simple) Job Scripts from PBS to SLURM on discover
NASA Center for Climate Simulation
High Performance Science
July 31, 2014
Introduction
• PBS: Portable Batch System
–Developed at Ames for NASA
–Commercial version: PBS Pro (Altair Engineering)
• SLURM: Simple Linux Utility for Resource Management
–Developed at LLNL
–Open-source (supported by SchedMD)
• PBS -> SLURM on discover in October 2013.
NCCS Brown Bag 7/31/2014 NASA Center for Climate Simulation
What’s the difference?
• Concepts and commands have new names.
• Overall script design remains essentially the
same.
• A PBS “queue” is equivalent to a SLURM
“partition”.
Why did we switch?
• Quality of Service (QoS)
– Eliminates need for dedicated queues
• Great reduction in cost
• But PBS is still used at NAS….
– … so we use a PBS emulation layer.
PBS emulation with SLURM
• SchedMD provided wrapper scripts (in Perl).
• We modified the wrappers for discover.
• Most changes were folded back into baseline.
• Wrapped tools: qsub, qalter, qdel,
qhold, qrerun, qrls, qstat, xsub
• Wrappers handle command-line options only.
• #PBS script directives are translated to
#SBATCH and processed by sbatch.
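The directive translation can be pictured as a line-by-line rewrite. The sketch below only illustrates that idea and is not the SchedMD Perl wrappers: pbs_to_sbatch is a hypothetical shell function, and it handles just three directives (-N, -A, -q) whose SLURM equivalents are covered later in this talk. Every other line passes through unchanged.

```shell
#!/bin/sh
# Hypothetical helper (not part of SLURM or the PBS wrappers):
# rewrite a few simple #PBS directives read on stdin as #SBATCH directives.
pbs_to_sbatch() {
    sed -E \
        -e 's/^#PBS[[:space:]]+-N[[:space:]]+(.+)$/#SBATCH --job-name=\1/' \
        -e 's/^#PBS[[:space:]]+-A[[:space:]]+(.+)$/#SBATCH --account=\1/' \
        -e 's/^#PBS[[:space:]]+-q[[:space:]]+(.+)$/#SBATCH --partition=\1/'
}

# Example: feed four script lines through the filter.
printf '%s\n' '#PBS -N pirate' '#PBS -A roberts' '#PBS -q dread' 'mpirun sword.x' \
    | pbs_to_sbatch
```

For a real script this could be used as pbs_to_sbatch < old.pbs > new.slurm, with the result checked by hand before it is submitted.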
Emulation “gotchas”
• Not all PBS features can be emulated.
• SLURM exports user environment by default.
• SLURM runs in the current directory.
• SLURM combines stdout and stderr.
Batch job submission
• For simple cases, just replace qsub with
sbatch.
$ qsub myjob.sh
becomes
$ sbatch myjob.sh
Naming your job
• Naming the job makes it easier to find.
#PBS -N job_name
becomes
#SBATCH -J job_name
or
#SBATCH --job-name=job_name
Specifying the account
• Make sure the proper account is charged.
#PBS -A account_name
becomes
#SBATCH -A account_name
or
#SBATCH --account=account_name
Specifying the partition
• Only if you have to ….
#PBS -q destination
becomes
#SBATCH -p destination
or
#SBATCH --partition=destination
Specifying the number of nodes
• Specify how many nodes you need.
#PBS -l select=num
becomes
#SBATCH -N num
or
#SBATCH --nodes=num
• A range can also be specified as nmin-nmax.
Specifying processes per node
• By default, one process runs per core on each node, but you may want fewer.
#PBS -l mpiprocs=num
becomes
#SBATCH --ntasks-per-node=num
Specifying processor type
• Choices are:
– Sandy Bridge (sand – 16 cores/node)
– Westmere (west – 12 cores/node)
#PBS -l proc=proc_type
becomes
#SBATCH -C proc_type
or
#SBATCH --constraint=proc_type
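In a PBS script the node count, processor type, and processes per node often share one -l select=... line, while SLURM takes them as three separate directives. The sketch below shows that split for a single select chunk. select_to_sbatch is a hypothetical helper written for this illustration; it knows only the three resources discussed here and reports anything else on stderr.

```shell
#!/bin/sh
# Hypothetical helper: split one PBS select chunk (fields separated by ':')
# into the matching #SBATCH directives. The function body runs in a
# subshell, so the IFS and globbing changes do not leak into the caller.
select_to_sbatch() (
    IFS=:
    set -f
    for field in $1; do
        case $field in
            select=*)   echo "#SBATCH --nodes=${field#select=}" ;;
            proc=*)     echo "#SBATCH --constraint=${field#proc=}" ;;
            mpiprocs=*) echo "#SBATCH --ntasks-per-node=${field#mpiprocs=}" ;;
            *)          echo "# unhandled PBS resource: $field" >&2 ;;
        esac
    done
)

# Example: 12 Sandy Bridge nodes with 8 MPI processes on each.
select_to_sbatch 'select=12:proc=sand:mpiprocs=8'
```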
stdout and stderr streams
• Specify where the output streams are written.
#PBS -o opath -e epath
becomes
#SBATCH -o opath -e epath
or
#SBATCH --output=opath --error=epath
• SLURM joins the two streams by default (in ./slurm-NNNNNNN.out); PBS required -j oe or -j eo to do this.
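To keep the two streams in separate files, name both explicitly. This is a minimal fragment, assuming a job called revenge. %j is the sbatch filename pattern that expands to the job ID; the file names themselves are only illustrative.

```shell
#!/bin/bash
#SBATCH --job-name=revenge
# %j expands to the numeric job ID, so repeated runs do not overwrite
# each other's files.
#SBATCH --output=revenge.%j.out
#SBATCH --error=revenge.%j.err

# The application launch line goes here.
```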
Mail notification
• Use this to get a message when your job is done, or when something bad happens…
#PBS -M user_list
becomes
#SBATCH --mail-type=type
#SBATCH --mail-user=user
• Type can be BEGIN, END, FAIL, ALL (any state change).
• Default user is the submitter.
Your working directory
• PBS jobs ran in a spool directory.
• SLURM jobs run in the current directory.
• This can be changed with the cd command, or with:
#SBATCH -D path
or
#SBATCH --workdir=path
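A script that may run under either scheduler can pick its working directory explicitly instead of relying on the default. This is a sketch under stated assumptions. SLURM_SUBMIT_DIR and PBS_O_WORKDIR are the variables each scheduler sets to the submission directory. The fallback to $PWD is an assumption added here so the fragment also runs outside a batch system. Newer SLURM releases spell the long option --chdir, so check man sbatch on your system.

```shell
#!/bin/bash
#SBATCH --job-name=revenge

# Prefer the SLURM submission directory, then the PBS one, and finally the
# current directory (the last fallback is an assumption for this sketch).
workdir="${SLURM_SUBMIT_DIR:-${PBS_O_WORKDIR:-$PWD}}"
cd "$workdir" || exit 1
echo "running in $workdir"
```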
Exporting environment variables
• PBS exported nothing by default.
• SLURM exports everything by default.
• Change with one or more of:
#SBATCH --export=names
#SBATCH --export=ALL
#SBATCH --export=NONE
#SBATCH --export-file=path
Threads and MPI
• Set up and run threads as you always have.
• mpirun/mpiexec/mpiexec.hydra are
not part of PBS, so no changes needed.
• The SLURM tool srun provides additional
features that are SLURM-specific.
– Provides features similar to those of MPI tools.
– Differences in job step control and signal
propagation.
A simple example
• User inigo has an old PBS script:
–The job name is revenge.
–Runs in the default PBS queue.
–Runs the application sword.x.
–Uses 8 Westmere nodes
The simple script (PBS)
#PBS -N revenge
#PBS -l select=8:proc=west
mpirun sword.x
The simple script (SLURM)
#SBATCH --job-name=revenge
#SBATCH --nodes=8
#SBATCH --constraint=west
mpirun sword.x
# Could also use:
# srun sword.x
The simple results…
• Program runs in current directory, not the
spool directory.
• User environment is exported.
• Standard output and standard error are written together to ./slurm-NNNNNNN.out.
A not-so-simple example
• User westley has an old PBS script:
–The job name is pirate.
–Charge the account roberts.
–Runs in the PBS queue dread.
–Uses 12 Sandy Bridge nodes.
–Uses 8 cores per node.
–Export only the variable BUTTERCUP.
–Runs the application sword.x.
The Not-So-Simple (NSS) Script (PBS)
#PBS -N pirate
#PBS -A roberts
#PBS -q dread
#PBS -l select=12:proc=sand:mpiprocs=8
#PBS -v BUTTERCUP
mpirun sword.x
The NSS Script (SLURM)
#SBATCH --job-name=pirate
#SBATCH --account=roberts
#SBATCH --partition=dread
#SBATCH --nodes=12
#SBATCH --constraint=sand
#SBATCH --ntasks-per-node=8
#SBATCH --export=NONE,BUTTERCUP
mpirun sword.x
# or
# srun sword.x
The NSS results…
• Program runs in current directory, not the spool directory.
• User environment is exported.
• Standard output and standard error are written together to ./slurm-NNNNNNN.out.
• NOTE: If you have an environment variable named NONE, and use --export=NONE, nothing is exported. But if you have NONE, and use --export=NONE,OTHER, NONE and OTHER are exported with everything else! So don’t do that….
Much more to come…
• Using mpirun/mpiexec/mpiexec.hydra
vs. using srun.
– Differing behavior for signal propagation and job
control commands.
• Job dependencies with strigger.
• Copying files with sbcast.
• Attaching to running jobs with sattach.
Questions?