
SLURM vs. PBS on ISAAC-NG



The new ISAAC-NG computing cluster is housed at the University of Tennessee’s Kingston Pike Building (KPB) in Knoxville, Tennessee and utilizes SLURM (Slurm Workload Manager, formerly known as Simple Linux Utility for Resource Management) to manage and schedule jobs submitted to the cluster.

ISAAC-NG users should be aware that SLURM functions slightly differently than Torque/Moab, and the commands used to accomplish the same tasks will differ depending on which scheduler is in use on a particular cluster.


Command Description        | Torque                | SLURM
Batch job submission       | qsub <Job File Name>  | sbatch <Job File Name>
Interactive job submission | qsub -I               | salloc or srun --pty /bin/bash
Job list                   | qstat                 | squeue -l
Job list by user           | qstat -u <User Name>  | squeue -l -u <User Name>
Job deletion               | qdel <Job ID>         | scancel <Job ID>
Job hold                   | qhold <Job ID>        | scontrol hold <Job ID>
Job release                | qrls <Job ID>         | scontrol release <Job ID>
Job update                 | qalter <Job ID>       | scontrol update job <Job ID>
Job details                | qstat -f <Job ID>     | scontrol show job <Job ID>
Node list                  | pbsnodes -l           | sinfo -N
Node details               | pbsnodes              | scontrol show nodes
Table 1.1: System commands for Torque and SLURM
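
For example, a user who previously submitted and monitored a job on a Torque/Moab cluster could do the same on ISAAC-NG with the SLURM equivalents sketched below; the script name myjob.sh and job ID 12345 are placeholders.

    # Torque/Moab workflow
    qsub myjob.sh            # submit the batch script
    qstat -u $USER           # list your queued/running jobs
    qdel 12345               # cancel job 12345

    # Equivalent SLURM workflow on ISAAC-NG
    sbatch myjob.sh          # submit the batch script
    squeue -l -u $USER       # list your queued/running jobs
    scancel 12345            # cancel job 12345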


Command Description | Moab                     | SLURM
Job start time      | showstart <Job ID>       | squeue --start -j <Job ID>
Status of nodes     | mdiag -n                 | sinfo -N -l
User's account      | mdiag -u <User Name>     | sacctmgr show association user=<User Name>
Account members     | mdiag -a <Account Name>  | sacctmgr show assoc account=<Account Name>
Nodes of accounts   | mdiag -s                 | sinfo -a
Table 1.2: System commands for Moab and SLURM
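
As an illustration, the Moab-style queries for a job's estimated start time, node status, and a user's account association map to SLURM roughly as follows; the job ID 12345 is a placeholder.

    squeue --start -j 12345                  # estimated start time (Moab: showstart 12345)
    sinfo -N -l                              # status of nodes      (Moab: mdiag -n)
    sacctmgr show association user=$USER     # account membership   (Moab: mdiag -u $USER)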


Command Description | OpenMPI | SLURM
Parallel wrapper    | mpirun  | srun
Table 1.3: System Commands for Parallel Processing with OpenMPI and SLURM
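
For instance, an MPI program that was previously launched with mpirun can be launched with srun inside a SLURM allocation; the program name ./my_mpi_app and the task count below are placeholders.

    # OpenMPI launch under Torque/Moab
    mpirun -np 8 ./my_mpi_app

    # Equivalent launch inside a SLURM job
    srun -n 8 ./my_mpi_app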


Notice: There are important differences between SLURM and PBS. Be careful when using the specifications --ntasks= (-n) and --cpus-per-task= (-c) in SLURM, because they are not PBS specifications, and there is no CPUs-per-node or ppn option in SLURM. The number of tasks (-n) specifies the number of parallel processes in the distributed-memory model (such as MPI). A minimal sketch of the distinction is shown below.
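
The following hypothetical directives request 4 parallel tasks with 2 CPUs each, rather than a Torque-style nodes/ppn layout:

    #SBATCH --ntasks=4          # 4 parallel (e.g., MPI) processes
    #SBATCH --cpus-per-task=2   # 2 CPUs for each task (e.g., threads per process)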

Each entry below gives the Torque option, its SLURM equivalent, a description of the Torque command, and example usage for each scheduler.

Torque: #PBS | SLURM: #SBATCH
Description: Prefix at the head of each directive line in a job script.
Torque example: N/A
SLURM example: N/A

Torque: -A | SLURM: -A, --account=<account>
Description: Tells the scheduler to use the account (not the user name) specified for the credential.
Torque example: #PBS -A <Account Name>
SLURM examples: #SBATCH -A <Account Name> or #SBATCH --account=<Account Name>

Torque: -a | SLURM: --begin=<time>
Description: Tells the scheduler to run the job at the given time.
Torque example: #PBS -a 0001
SLURM example: #SBATCH --begin=00:01

Torque: -e, -o, -j | SLURM: -e, --error=<filename pattern> and -o, --output=<filename pattern>
Description: Set the locations for the error_path and output_path attributes. Only one of -e or -o is needed if the -j option is also in use. Combining the options (as "eo" or "oe") merges STDOUT and STDERR into a single file: "eo" places the combined output in the error_path file, and "oe" places it in the output_path file.
Note: The SLURM options take a filename pattern, which cannot be a directory; see the filename pattern section of the sbatch manual (man sbatch) for details on valid patterns.
Torque examples: #PBS -e ~/ErrorFile; #PBS -j oe; #PBS -j eo
SLURM example: #SBATCH -e ~/ErrorFile_%j_%u (both standard output and standard error for the job are directed into the same file)

Torque: qsub -I | SLURM: salloc or srun --pty /bin/bash
Description: Declares that the job is to be run interactively.
Torque examples: qsub -I; qsub -I -X
SLURM examples: srun --pty /bin/bash; salloc --x11

Torque: -l | SLURM: -N, --nodes=<minnodes[-maxnodes]>; -n, --ntasks=<number>; --ntasks-per-node=<ntasks>; -c, --cpus-per-task=<ncpus>; --gres=<list>; -t, --time=<time>; --mem=<size[units]>; -C, --constraint=<list>; --tmp=<size[units]>
Description: Requests resources. In Torque, separate the options below with a comma ( , ):
  nodes=#   : the number and/or type of nodes desired
  ppn=#     : the number of processors per node desired
  gpus=#    : the number of GPUs desired
  walltime= : total runtime desired, in the format DD:HH:MM:SS or HH:MM:SS
  mem=      : the maximum amount of memory required by the job
  feature=  : the name of the type of compute node required
  file=     : the maximum amount of local disk space required by the job
Torque examples:
  #PBS -l nodes=5:ppn=2:gpus=3
  #PBS -l walltime=01:30:00
  #PBS -l mem=5gb
  #PBS -l feature=intel14|intel16
  #PBS -l file=50GB
SLURM examples:
  #SBATCH -n 5 -c 2 --gres=gpu:3
  #SBATCH --time=01:30:00
  #SBATCH --mem=5G
  #SBATCH -C NOAUTO:intel14|intel16
  #SBATCH --tmp=50G

Torque: -M | SLURM: --mail-user=<User Name>
Description: Emails the account(s) listed to notify the user when the job changes state.
Torque example: #PBS -M <username>@utk.edu
SLURM example: #SBATCH --mail-user=<username>@utk.edu

Torque: -m | SLURM: --mail-type=<type>
Description: Selects which job state changes trigger mail. In Torque:
  a : send mail when the job is aborted
  b : send mail when job execution begins
  e : send mail when the job ends
  n : do not send mail
Torque example: #PBS -m abe
SLURM example: #SBATCH --mail-type=BEGIN,END,FAIL (if NONE is used, no mail will be sent)

Torque: -N | SLURM: -J, --job-name=<jobname>
Description: Names the job.
Torque example: #PBS -N <Desired Name of Job>
SLURM example: #SBATCH -J <Desired Name of Job>

Torque: -t | SLURM: -a, --array=<indexes>
Description: Submits an array job with "n" identical tasks. Remember: each job that is part of an array job will have the same JOBID but a different ARRAYID.
Torque examples: #PBS -t 7; #PBS -t 2-13
SLURM examples: #SBATCH -a 7; #SBATCH --array=2-13

Torque: -V | SLURM: --export=<environment variables | ALL | NONE>
Description: Passes all current environment variables to the job.
Torque example: #PBS -V
SLURM example: #SBATCH --export=ALL

Torque: -v | SLURM: --export=<environment variables | ALL | NONE>
Description: Defines any additional environment variables for the job.
Torque example: #PBS -v ev1=ph5,ev2=43
SLURM example: #SBATCH --export=ev1=ph5,ev2=43

Torque: -W | SLURM: -L, --licenses=<license>
Description: Requests special generic resources (e.g., software licenses).
Torque example: #PBS -W gres:<Name of Software>
SLURM example: #SBATCH -L <Name of Software>@<License Server>
Table 1.4: Job Submission Specification Options
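
Putting several of these options together, a Torque job script might translate to a SLURM script roughly as sketched below. The account name, resource amounts, file names, and program name are placeholders, and the exact values needed on ISAAC-NG may differ.

    #!/bin/bash
    #SBATCH -J example_job                  # job name            (Torque: #PBS -N example_job)
    #SBATCH -A <Account Name>               # account to charge   (Torque: #PBS -A <Account Name>)
    #SBATCH --ntasks=4                      # 4 parallel tasks    (Torque: roughly #PBS -l nodes=2:ppn=2)
    #SBATCH --time=01:30:00                 # walltime limit      (Torque: #PBS -l walltime=01:30:00)
    #SBATCH --mem=5G                        # memory request      (Torque: #PBS -l mem=5gb)
    #SBATCH -o myjob_%j.out                 # standard output     (Torque: #PBS -o ...)
    #SBATCH -e myjob_%j.err                 # standard error      (Torque: #PBS -e ...)
    #SBATCH --mail-user=<username>@utk.edu  # notification email  (Torque: #PBS -M ...)
    #SBATCH --mail-type=BEGIN,END,FAIL      # notification events (Torque: #PBS -m abe)

    srun ./my_mpi_app                       # launch the program  (Torque/OpenMPI: mpirun ./my_mpi_app)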


Description                                         | Torque        | SLURM
The ID of the job                                   | PBS_JOBID     | SLURM_JOB_ID
Job array ID (index) number                         | PBS_ARRAYID   | SLURM_ARRAY_TASK_ID
Directory where the submission command was executed | PBS_O_WORKDIR | SLURM_SUBMIT_DIR
Name of the job                                     | PBS_JOBNAME   | SLURM_JOB_NAME
List of nodes allocated to the job                  | PBS_NODEFILE  | SLURM_JOB_NODELIST
Number of processors per node (ppn) requested       | PBS_NUM_PPN   | SLURM_JOB_CPUS_PER_NODE
Total number of cores requested                     | PBS_NP        | SLURM_NTASKS * SLURM_CPUS_PER_TASK
Total number of nodes requested                     | PBS_NUM_NODES | SLURM_JOB_NUM_NODES
Current host of the PBS job                         | PBS_O_HOST    | SLURM_SUBMIT_HOST
Table 1.5: Environment Variables for Torque and SLURM
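
For example, a script that used $PBS_O_WORKDIR and $PBS_JOBID under Torque could use the SLURM equivalents as in this minimal sketch; the echo text is illustrative only.

    cd "$SLURM_SUBMIT_DIR"      # Torque: cd $PBS_O_WORKDIR
    echo "Job $SLURM_JOB_NAME ($SLURM_JOB_ID) is running on: $SLURM_JOB_NODELIST"
                                # Torque: echo "Job $PBS_JOBNAME ($PBS_JOBID) is running on: $(cat $PBS_NODEFILE)"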