High Performance & Scientific Computing

Running Jobs on ISAAC-NG

**The compute resources of ISAAC-NG are managed using SLURM scheduler and workload manager**

Please note: The login nodes should ONLY be used for basic tasks such as: file editing, file management, code compilation, and job submission.

Introduction

ISAAC-NG cluster offers access to its compute resources via two means: (i) submission of a SLURM job via the command line or (ii) Web based job submission via Open OnDemand (See the ISAAC-NG web page about Open OnDemand for further details). The compute resources on ISAAC-NG cluster are managed by SLURM workload manager. This webpage describes how to efficiently use the compute resources of ISAAC-NG via command line and submit, monitor, and manage the computational jobs for production work using the SLURM scheduler.

A collection of example job scripts are available at /lustre/isaac/examples/jobs.

The below table shows the SLURM Quality of Service (QOS) categories, the run time limits and the partitions that can be selected for each QOS.

Quality of Service (QOS)	Run Time Limit (Hours)	Valid Partitions
campus	24	campus
campus-gpu	24	campus-gpu, campus-gpu-large, campus-gpu-bigmem
campus-bigmem	24	campus-bigmem
short	3	short
long	144 [6 days]	long
long-gpu	144 [6 days]	long-gpu
long-bigmem	144 [6 days]	long-bigmem
condo	720 [30 days]	condo-*
overflow	72 [3 days]	-overflow
genomics	168 [7 days]	condo-ut-genomics
ai-tenn	72 [3 days]	ai-tenn
condo-gpu	720 [30 days] (max 1 GPU)	condo-gpu-ise
condo-gpu-full	3 (max 4 GPUs)	condo-gpu-ise

Partition, Run Time, and QOS Matrix

The following table describes the QoS limits.

QoS	Max Wall Time Per Job	Max Resource Usage Per Job	Max Resource Usage Per User (across all running jobs)	Max Running/Submitted Jobs Per User
campus	1 day	6 Nodes	6 Nodes	48/96
campus-gpu	1 day	2 Nodes	2 GPUs	3/6
campus-bigmem	1 day	1 Node	1 Node	4/8
condo	30 days	None (defaults to size of condo)	None	150/250
genomics	7 days	1 Node	1 Node	2/3
short	3 hours		48 cores split across any number of nodes	12/18
long	6 days	–	96 cores	12/18
long-gpu	6 days	1 Node	1 GPU	1/3
long-bigmem				2/4
overflow	3 days	12 Nodes	–	-/-
condo-hicap				-/500
ai-tenn	3 days	minimum 1 GPU	–	28/56
condo-gpu	30 days	16 cores, 1 GPU	16 cores, 1 GPU (per account, not user)	150/250
condo-gpu-full	3 hours	–	–	150/250

QOS Limits

General Information

As explained in the Access and Login page, when you log in to ISAAC-NG HPSC cluster, you will be directed to one of the login nodes from where you can access the cluster to do your research work. Please note that the login nodes should only be used for basic tasks such as file editing, code compilation, and job submission. Running production jobs on login nodes is highly discouraged and any production job running on the login nodes is subject to termination.

If a user is not sure how to determine whether their job(s) are running on a login node, the user is encouraged to reach out to the High Performance & Scientific Computing (HPSC) group for assistance by submitting a help request ticket using the “Submit HPSC Service Request” button at the top right corner of any of the HPSC website pages at oit.utk.edu/hpsc.

As discussed above, login nodes should not be used to run production jobs. Any kind of production work should be performed on the system’s compute resources. The compute resources of ISAAC-NG cluster are managed and allocated by SLURM scheduler (Simple Linux Utility for Resource Management) which uses several partitions to efficiently allocate the resources for different jobs.

This page provides information for getting started with the batch facilities of SLURM scheduler as well as the basic job execution. However, before getting started with the job submission, it is imperative to understand how to access different job directories on ISAAC-NG clusters from where the jobs can be submitted.

Job Directories

By default, SLURM scheduler assumes the working directory to be the one from where the jobs are submitted. It is recommended that the jobs should be submitted from within a directory in the Lustre file system. . The other storage space is located in /lustre/isaac/projThe following storage spaces are available for users to choose from when deciding which directory from which they would like to run their job(s):

/lustre/isaac/proj/<project-name> – This is the project directory for the project under which the job is being run. Users can create their own directory under this project directory with their username where data can be stored. This storage space is allocated for a specific project and is made available upon request by the Principle Investigator of the project.
/lustre/isaac/scratch/<netid> – All ISAAC-NG users have access to a scratch directory, which can be accessed by specifying the path or using the $SCRATCHDIR environmental variable.

Please refer to the File Systems page for more details on home, project, and scratch directories.

SLURM Batch Scripts

In this section, we will explain how to request the SLURM scheduler to allocate the resources for a job and how to submit those jobs.

Partitions

SLURM scheduler organizes similar sets of nodes and job features into individual groups, which are individually referred to as a “partition”. Each partition has hard limits for the maximum wall clock time, maximum job size, the upper limit to the number of nodes a user can request, etc. The partitions available for all ISAAC-NG users include “campus”, “campus-gpu”, “campus-bigmem”, “campus-gpu-bigmem”, “long”, “long-gpu”, “long-bigmem”, and “short” under which a number of CPU and GPU cores are available. The default partition is “campus”; in order to use any other partition, your SLURM job must explicitly request it. For more information on available resources under each partition, please refer to our System Overview page.

Scheduling Policy

ISAAC-NG cluster uses a variety of mechanisms to determine how to schedule jobs. Most of the resource request mechanisms can be manipulated by users to ensure that their jobs are scheduled and executed in a reasonable time period.

Condos

The ISAAC-NG computing cluster has been divided into logical units referred to as “condos”. In the context of the cluster, a condo is a logical group of compute nodes that act as an independent cluster to effectively schedule jobs. There are “institutional” and individual “private” condos within the ISAAC-NG cluster. All users belong to their respective institutional condo, whether that be UTK or UTHSC. Private condos are owned and used by individual investors and their associated projects (discussed below). With a private condo, the investor and his project have exclusive use of the compute nodes in the condo. Investors can choose to share their private condo with the entire cluster under the Responsible Node Sharing implementation, but this is not a requirement. In Slurm, each condo is implemented as a “partition”. Most private condos also have a corresponding “overflow” partition, which combine nodes from institutional and private condos, allowing investors the option of running larger jobs than can be run either on a type of condo alone. Private condos will typically be named condo-<project-pi-netid>, but you can always get a current list of private condos on the system via the command:
sinfo -Nel | grep condo | grep -v overflow

Project Accounts

A project is the combination of an account used in the batch system and is related to a Linux group used on the system for file access control. In the cluster’s condo-based scheduling model, projects and their associated accounts in the batch system are a key component. Projects control access to institutional and private condos. For UTK institutional users, the default project account is ACF-UTK0011. For UTHSC institutional users, the default project is ACF-UTHSC0001. Other projects follow the same project identifier format. To determine which projects you belong to, login to the User Portal and look under the “Projects” header. Always ensure you use the correct project account when submitting jobs with a SLURM directive. The following is an example of specifying to use of the UTK institutional project account through SLURM directive SBATCH in a job script:

#SBATCH -A ACF-UTK0011

Note that the SLURM utility’s scheduling policy requires that the nodes under each condo have to be a part of a partition. Therefore, each condo has a unique partition associated with its own project account; this makes it imperative for users to define their desired partition and a specific project account while requesting the resources on the ISAAC-NG computing cluster.For example: the institutional condo has a campus partition associated with its project account (ACF-UTK0011). Once a user submits a job to the ISAAC-NG cluster, the SLURM scheduler sets the processing constraints on the submitted job such as maximum wall time, number of nodes required, etc. and schedules the job.

Quality of Service (QOS)

A Quality of Service (QoS) is a set of restrictions on resources accessible to a job. You can always get the current list of QoS specifications via

sacctmgr show qos \
format=name,maxwall,maxtres,maxtrespu,maxjobspu,maxjobspa,maxsubmitpa

As of August 4 2022, these QoS limits available to users are as follows:

QoS	Max Wall Time Per Job	Max Resource Usage Per Job	Max Resource Usage Per User (across all running jobs)	Max Running/Submitted Jobs Per User	Max Running/Submitted Jobs in Queue Per Account
campus	1 day	6 Nodes	6 Nodes	48/96	-/-
campus-gpu	1 day	2 Nodes	2 GPUs	3/6	-/-
campus-bigmem	1 day	1 Node	1 Node	4/8	-/-
condo	30 days	None (defaults to size of condo)	None	150/250	-/-
genomics	7 days	1 Node	1 Node	2/3	-/-
short	3 hours		48 cores split across any number of nodes	12/18	-/-
long	6 days	–	96 cores	12/18	-/-
long-gpu	6 days	1 Node	1 GPU	1/3	-/-
long-bigmem				2/4	-/-
overflow	3 days	12 Nodes	–	-/-	1/3
condo-hicap				-/500	-/-

Job Specifications

Each partition has a list of valid accounts which may submit jobs to that partition, and each user has a list of valid account/QoS combinations they may specify. For example, to submit a job under the campus partition, use the below batch header:

 #SBATCH --account ACF-UTK0011 (or #SBATCH -A ACF-UTK0011)
 #SBATCH --partition=campus (or #SBATCH -p campus)
 #SBATCH --qos=campus (or #SBATCH -q condo)

To submit a GPU job with <n> GPUs to the campus-gpu partition, use the below batch header:

#SBATCH --account ACF-UTK<project-number>
#SBATCH --partition=campus-gpu
#SBATCH --qos=utk
#SBATCH --gpus=<n>

NOTE: The “ai-tenn” partition for the AI Tennessee Initiative resources requires at least one GPU to be requested (e.g. --gpus=1) or your job will fail to submit.

To submit a job under a private condo, use the below batch header:

#SBATCH --account ACF-UTK<project-number>
#SBATCH --partition=condo-<condo-name>
#SBATCH --qos=condo

As users can see, the first command tells SLURM which project account the job is to be associated with, the second command tells SLURM to which partition the job should be submitted, and the third what Quality of Service the user wants to run the job under. Remember, Slurm needs all three pieces of information (the project account name, the partition name, and the Quality of Service) in order to appropriately assign resources to the job a user has submitted, and the schedule that job to run on the ISAAC-NG cluster’s computing resources.

Important Note: It is the combination of a partition, QoS, and project account which the Slurm scheduler program uses to allocate computing resources from the cluster for each job. If a user does not appropriately specify all three, the Slurm utility may not be able to process the user’s job as submitted due to a lack of information from the user.

isaac-sinfo is used to view partitions and each partition’s node information for a system running Slurm. The command isaac-sinfo is a customized wrapper that allows easier access to information about certain classes of partitions. For example, if you type isaac-sinfo help, it will show:

Usage: isaac-sinfo <campus,gpu,bigmem,long,condo>

Then, if you type isaac-sinfo campus will show information only about the campus partitions.

campus	up infinite	3	mix clr[0738,0821,0824]
campus	up infinite	26	alloc clr[0707-0709,0724-0737,0822-0823,0825,0829-0832,1241,1244]
campus	up infinite	5	idle clr[0826-0828,1242-1243]
campus	up infinite	1	down clr0710
campus- bigmem	up 30-00:00:0	3	idle ilm[0833,0835,0837]
campus-gpu	up 30-00:00:0	1	mix clrv0701
campus-gpu	up 30-00:00:0	2	idle clrv[0703,0801]
campus-gpu-bigmem	up 30-00:00:0	1	idle ilpa1209
campus-gpu-large	up 30-00:00:0	8	idle clrv[1101,1103,1105,1107,1201,1203,1205,1207]

Likewise, if a user attempts to specify an account or QoS that’s not valid for a partition, or attempts to request an invalid account/qos combination for their netid, their job will be rejected. Detailed information about all the partitions and the respective nodes on ISAAC-Legacy can be viewed using the following command:

 $ sinfo -Nel

For more information on the flags used by commands, please refer to the Slurm documentation. There are a handful of partitions that will be returned by the above command which are not documented below. These are reserved for administrative purposes, and end users should ignore them.

You can always see the valid accounts for a partition with

$ scontrol show PartitionName=<partition-name> | grep AllowAccounts

You can see the valid QoS settings for a partition with

$ scontrol show PartitionName=<partition-name> | grep AllowQos

You can always see your netid’s valid account/qos combinations with

$ sacctmgr -p show assoc | grep <netid>

However, this is not usually necessary. As a general rule, the valid QoS and accounts combinations for partitions will be as follows:

Condo(s)	Allowed QoS	Allowed Account(s)
campus	campus	All
campus-bigmem	campus-bigmem	All
campus-gpu, campus-gpu-bigmem	campus-gpu	All
condo-<project-pi-netid>	condo	Project Accounts Only
condo-<project-pi-netid>-overflow	overflow	Project Accounts Only
condo-ut-genomics	genomics	Project Accounts Only
short	short	All
long	long	All
long-bigmem	long-bigmem	All
long-gpu	long-gpu	All

The general association between allowed accounts and QoS will be as follows:

Account Type	Example Account(s)	Valid QoS
Institutional Account	ACF-UTK0011, ACF-UTHSC0001	campus, campus-bigmem, campus-gpu, long, long-bigmem, long-gpu, short
Private/Project/Class Account	ACF-UTK<nnnn>, ACF-UTHSC<nnnn>, ISAAC-UTK<nnnn>	All institutional QoS plus condo and overflow
Genomics Account	As for private account	All instutional QoS plus genomics

You can always determine your current active accounts or request access to new accounts in the user portal.

Please note that in addition to above two directives (project account and partition), users also need to specify the type of nodes, number of nodes, number of cores, and estimated maximum amount of wall time as parameters or other optional SBATCH directives when submitting a job script. A collection of complete sample jobs scripts are also available at /lustre/isaac/examples/jobs. Detailed information on how to submit a job using SBATCH directives is provided under the section Submitting Jobs with Slurm.

Once a job is submitted, the Slurm scheduler checks for the available resources and allocates them to the jobs to launch the tasks.

Important Note: users can choose to run a job exclusively on an entire node by using the exclusive flag while also using srun to distribute tasks among different CPUs. Check table 3.3 to see how to use this flag.

The order in which jobs are run depends on the following factors:

Number of nodes requested – jobs that request more nodes get a higher priority.
Queue wait time – a job’s priority increases along with its queue wait time (not counting blocked jobs as they are not considered “queued.”)
Number of jobs – a maximum of ten jobs per user, at a time, will be eligible to run.

In certain special cases, the priority of a job may be manually increased upon request. To request priority change you may contact the OIT HelpDesk. They will need the job ID and reason to submit the request.

Slurm Commands/Variables

Slurm Commands

In the table below, are listed few important Slurm commands (with a description of the command) which are most often used while working with Slurm scheduler, and can be used on the ISAAC-NG login nodes.

Command	Description
sbatch jobscript.sh	Used to submit the job script to request the resources
squeue	Used to displays the status of all the jobs
squeue -u username	Used to displays the status and other information of user’s all jobs
squeue [jobid]	Display the job status and information of a particular job
scancel	Cancel the job with a jobid
scontrol show jobid/parition value	Yields the information about a job or any resource
scontrol update	Alter the resources of a pending job
salloc	Used to allocate the resources for the interactive job run

Slurm Variables

Below are tabulated few important Slurm variables which ISAAC-NG users may find useful.

Variable	Description
SLURM_SUBMIT_DIR	The directory from where the job is submitted
SLURM_JOBID	The job identifier of the submitted job
SLURM_NODELIST	List of nodes allocated to a job
SLURM_NTASKS	Prints the total number of CPUs used

SBATCH Flags

The jobs a user submits to the ISAAC-NG cluster are submitted using a sbatch command which passes the request for the resources a user has requested in the job script to the Slurm scheduler. The resources in the job script are requested using the “SBATCH” directive. Note that SLURM scheduler can accept SBATCH directives in two formats. Users can choose to use either of the two formats at their own discretion. The description of each of the SBATCH flags is given below:

Flags	Description
`#SBATCH -J Jobname`	Name of the job
`#SBATCH --account (or -A) Project Account`	Project account to which the time will be charge
`#SBATCH --time (or -t)=days-hh:mm:ss`	Request wall time for the job
`#SBATCH --nodes (or -N)=1`	Number of nodes needed
`#SBATCH --ntasks (or -n) = 48`	Total number of tasks requested
`#SBATCH --ntasks-per-node = 48`	Request number of tasks per node
`#SBATCH --cpus-per-task = 4`	Cores allocated per task. Useful for e.g. OpenMP
`#SBATCH --partition (or -p) = campus`	Selects the partition or queue
`#SBATCH --output (or -o) = Jobname.o%j`	The file where output of terminal is dumped
`#SBATCH --error (-e) = Jobname.e%j`	The files where run time errors are dumped
`#SBATCH --exclusive`	Allocates the exclusive excess of node(s)
`#SBATCH --gpus=1`	Asks for a GPU allocation. (Must be on a gpu partition)
`#SBATCH --array (-a) = index`	Used to run multiple jobs with identical parameters
`#SBATCH --chdir=directory`	Used to change the working directory. The default working directory is the one from where a job is submitted
`#SBATCH --qos=campus`	The Quality of Service level for the job.
`#SBATCH --constraint (-C) = hardware,feature,list`	Comma-separated list of hardware features for the job. The possible features for each can be displayed via the `isaac-sinfo` command.

Targeting Specific CPU Types

You can request specific CPU types via the --constraint=<cputype> flag, which can be useful if you are running an application compiled for a specific architecture. Specific available options for this flag are documented in the table below.

Flag	CPU Type
`--constraint=intel`	Any Intel CPU
`--constraint=amd`	Any AMD CPU
`--constraint=cascadelake`	Intel Cascade Lake CPU
`--constraint=icelake`	Intel Ice Lake CPU
`--constraint=sapphirerapids`	Intel Sapphire Rapids CPU
`--constraint=rome`	AMD Rome CPU
`--constraint=milan`	AMD Milan CPU
`--constraint=genoa`	AMD Genoa CPU
`--constraint=bergamo`	AMD Bergamo CPU
`--constraint=avx512`	Any CPU supporting the AVX-512 Instruction set (all Intel and a subset of AMD CPUs).

The available feature list will occasionally change depending on the node composition of the cluster. The most current version of this list can always be found using the isaac-sinfo command. More details about the specifications of the various CPU types can be found at the System Overview page.

Asking for Memory (And Addressing OOM Errors)

Asking for Memory (And Addressing OOM errors)

Memory is tied to number of cores. If you require more memory for your job, then you will need to request more tasks or cores. Bigmem nodes pin a minimum of 32 GB of RAM per core, whereas non-bigmem nodes pin a minumum of 3.8 GB of RAM per core. The settings per partition can be found in the {Def,Max}MemPerCPU field of scontrol show part <partition name>

Submitting Jobs with Slurm

On ISAAC-NG, batch jobs can be submitted in two ways: (i) interactive batch mode, and (ii) non-interactive batch mode.

Interactive Batch mode

Interactive batch jobs give users the interactive access to compute nodes. In interactive batch mode, users can request that the Slurm scheduler allocate the resources of compute nodes, directly on the terminal (e.g., using a terminal’s command line interface (CLI)). A common use for interactive batch jobs is to debug a calculation or a program before submitting non-interactive batch jobs for production runs. This section demonstrates how to run interactive jobs through the batch system and provides common usage tips.

Interactive batch mode can be invoked on an ISAAC-NG login node by using the salloc command followed by specific SBATCH flags to request specific resources to which the user would like access. The different SBATCH flags are given in table 3.3.

$ salloc -A projectaccount --nodes=1 --ntasks=1 --partition=campus --time=01:00:00
or 
$ salloc -A projectaccount -N 1 -n 1 -p campus -t 01:00:00

The salloc command interprets the user’s request to the SLURM scheduler, and as a result, allows the user to request the resources. In the example command above, we requested that the Slurm scheduler allocate one node and one CPU core for a total time of 1 hour using the “campus” partition. Note that if the salloc command is executed without specifying the resources (i.e., number of nodes, number of CPUs, tasks, and wall clock time etc.), then the Slurm scheduler will allocate the default resources.

Important Note: The default resources on ISAAC-NG are: one processor (1 CPU on 1 node) located under the campus partition, with a maximum wall clock time of 1 hour.

When the Slurm scheduler allocates the resources, the user who submitted the job gets a message on the terminal (as shown below) containing the information about the jobid and the hostname of the compute node where the resources are allocated. Notice that in the below example the jobid is “1234” and the hostname of the compute node is “nodename.”

 $ salloc --nodes=1 --ntasks=1 --time=01:00:00 --partition=campus
  salloc: Granted job allocation 1234
  salloc: Waiting for resource configuration
  salloc: Nodes nodename are ready for job
 $

Once the the interactive job starts, the user need to ssh to the allocated node or one of the allocated nodes (for the jobs requesting more than one node) and should change their working directories to a Lustre file system project space or scratch space to run the user’s computationally intense applications. It is best to run large user jobs in Lustre scratch space rather than Lustre project space, because there is no size limit to the amount of data that a user may place on Lustre scratch space; however, there is a limit to how much data that a user may place on Lustre project space. Please visit File System for more information. If users have questions about how to best run their interactive jobs, users are welcome to send their questions to HPSC staf by submitting HPSC Service Request.

To run the parallel executable, we recommend using srun followed by the executable as shown below:

 $ srun executable
or
 $ srun -n <nprocs> executable

Important Note: Users do not necessarily need to include the number of processors they will require, before running the parallel executable while calling srun. The Slurm wrapper srun will execute your calculations in parallel on the number of processors requested in the user’s job script. Serial applications (that is, non-parallel applications) can be run both with and without srun.

Non-interactive batch mode

In non-interactive batch mode, the set of resources as well as the commands for the application to be run are written in a text file, which is referred to as a “batch file” or “batch script”. A user’s “batch script” for their particular job is submitted to the Slurm scheduler by using the sbatch command. Batch scripts are very useful for users who want to run productions jobs. This is because batch scripts allow users to work on a cluster non-interactively. In batch jobs, users submit a group of commands to Slurm and then simply check the job’s status and the output of the commands from time to time. However, sometimes it is very useful to run a job interactively (primarily for debugging). Click here to check how to run the batch jobs interactively. A typical example of a non-interactive batch script/job script is given below:

 #!/bin/bash
 #This file is a submission script to request the ISAAC resources from Slurm 
 #SBATCH -J job			       #The name of the job
 #SBATCH -A ACF-UTK0011              # The project account to be charged
 #SBATCH --nodes=1                     # Number of nodes
 #SBATCH --ntasks-per-node=48          # cpus per node 
 #SBATCH --partition=campus            # If not specified then default is "campus"
 #SBATCH --time=0-01:00:00             # Wall time (days-hh:mm:ss)
 #SBATCH --error=job.e%J	       # The file where run time errors will be dumped
 #SBATCH --output=job.o%J	       # The file where the output of the terminal will be dumped
 #SBATCH --qos=campus

 # Now list your executable command/commands.
 # Example for code compiled with a software module:
 module load example/test

 hostname
 sleep 100
 srun executable

Stuff

The above job script can be divided into three sections:

Shell interpreter (one line)
- The first line of the script specifies the script’s interpreter. The syntax of this line is #!/bin/shellname (sh, bash, csh, ksh, zsh)
- This line is important and essential. If not mentioned, then scheduler will print an error.
SLURM submission options
- The second section contains a bunch of lines starting with ‘#SBATCH’.
- These lines are not the comments, but by format start with ‘#’.
- #SBATCH is a Slurm directive which communicates information regarding the resources requested by the user in the batch script file.
- #SBATCH options after the first non-comment line are ignored by Slurm scheduler
- The description about each of the flags is mentioned in the table 3.3
- The command sbatch on the terminal is used to submit the non-interactive batch script.
Shell commands
- The shell command follows the last #SBATCH line.
- These commands are a set of commands or tasks which a user wants to run. This also includes any software modules which may be needed to access a particular application.
- To run the parallel application, it is recommended to use srun followed by the name of the full path and name of the executable if the executable path is not loaded into Slum environment while submitting the script.

For the quick start, we have also provided a collection of complete sample job scripts that are available on the ISAAC-NG cluster at /lustre/isaac/examples/jobs

Checking for Available Node Resources

In order to best utilize resources and avoid jobs waiting in queue, users should check what Slurm resources are available before deciding job submission parameters and submitting jobs. The Slurm commands sinfo and isaac-sinfo are offered for users to query the system for the status of available resources.

The command sinfo outputs the status of available nodes organized by partition and node-type while isaac-sinfo is a modified version to include partitions directly usable by ISAAC users with cleaner output. If a user was searching for a GPU node to run a GPU-accelerated job, they may for example execute the following:

$ isaac-sinfo | grep gpu | egrep 'idle|mixed'

The above outputs nodes in GPU partitions that are either idle and totally free or are mixed where only some resources are taken by other users’ jobs. This filters out nodes in the ‘allocated’ state since those nodes’ resources are entirely utilized by current running jobs. A sample output line may look like the following:

campus-gpu               intel,cascadelake,avx512    up 30-00:00:0 1-infinite   no       NO        all      1       mixed             clrv0701

Utilizing the information given, users can tailor their Slurm job requests to the available resources. For example, knowing that there resources left available on node “clrv0701” based on the previous output of isaac-sinfo showing ‘mixed’, a user may request the following:

$ salloc --account=ACF-UTK0011 --nodes=1 --ntasks=8 --partition=campus-gpu --qos=campus-gpu --time=1:00:00 --gpus=1 --nodelist=clrv0701

It is important to notice that the amount of nodes, cores, and GPUs requested match what is currently available, based on the output of the aforementioned commands, otherwise the job will wait in queue until the requested resources become available. To ensure a submitted job has requested an appropriate amount of resources based on the desired or available target node(s), users should compare the resources requested with the resources available. Checking available resources applies to all jobs whether they are submitted interactively, non-interactively, or through Open OnDemand.

Job Arrays

Slurm offers a useful option of submitting jobs that use array flags, for users whose batch jobs require identical resources. Using the array flag in a job script, users can submit multiple jobs with with a single sbatch command. Although the job script is submitted only once using sbatch command, the individual jobs in the array (of jobs) are scheduled independently with unique job array identifiers ($SLURM_ARRAY_JOB_ID). Each of the individual jobs can be differentiated using Slurm’s environmental variable $SLURM_ARRAY_TASK_ID. To understand this variable, let us consider an example of a Slurm script given below:

#!/bin/bash
#SBATCH -J myjob
#SBATCH -A ACF-UTK0011
#SBATCH -N 1
#SBATCH --ntasks-per-node=30  ###-ntasks is used when we want to define total number of processors
#SBATCH --time=01:00:00
#SBATCH --partition=campus     #####
##SBATCH -e myjob.e%j   ## Errors will be written in this file
#SBATCH -o myjob%A_%a.out    ## Separate output file will be created for each array. %A will be replaced by jobid and %a will be replaced by array index
#SBATCH --qos=campus
#SBATCH --array=1-30
       # Submit array of jobs numbered 1 to 30
###########   Perform some simple commands   ########################
set -x
###########   Below code is used to create 30 script files needed to submit the array of jobs   ###############
for i in {1..30}; do cp sleep_test.sh 'sleep_test'$i'.sh';done

###########   Run your executable   ###############
sh sleep_test$SLURM_ARRAY_TASK_ID.sh

In the above example, 30 sleep_test$index.sh executable files were created, and these jobs have names are differentiated by an index. One understands that the 30 executable files can be run by either submitting 30 individual jobs or by using Slurm arrays. A Slurm array would take these files in the form of an array named sleep_test[1-30].sh and execute these files. The variable SLURM_ARRAY_TASK_ID is set to array index value [between 1 and 30], with 30 being the total number of jobs in the array in this example, and the index value is defined in the Slurm script (above) using the #SBATCH directive

#SBATCH --array=1-30

The simultaneous number of jobs using a job array can also be limited by using a %n flag along with –array flag. For example: To run only 5 jobs at a time in a Slurm array, users can include the SLURM directive

#SBATCH --array=1-30%5

In order to create a separate output file for each of the submit jobs using Slurm arrays, use %A and %a, which represents the jobid and job array index as shown in the above example.

Exclusive Access to Nodes

As explained in the Scheduling Policy, jobs that are submitted by the same user can share computational nodes. However, if desired, users can request in their job scripts that whole node(s) be reserved to run their jobs without sharing them with other jobs. This provides the full amount of resources (cores and memory) to the users’ job. To do that use the below command:

Interactive batch mode:

 $ salloc -A projectaccount --nodes=1 --ntasks=1 --partition=campus --time=01:00:00 --exclusive --qos=campus

Non-Interactive batch mode:

Add the below line in your job script

 #SBATCH --exclusive

Monitoring Job Status

Users can regularly check status of their jobs by using the squeue command.

$ squeue -u <netID>
              JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
              1202    campus     Job3 username PD       0:00      2 (Resources)
              1201    campus     Job1 username  R       0:05      2 node[001-002]
              1200    campus     Job2 username  R       0:10      2 node[004-005]

The description of each of the columns of the output from squeue command is given below:

Name of Column	Description
JOBID	The unique identifier of each job
PARTITION	The partition/queue from which the resources are to be allocated to the job
NAME	The name of the job specified in the Slurm script using #SBATCH -J option. If the -J option is not used, Slurm will use the name of the batch script.
USER	The login name of the user submitting the job
ST	Status of the job. Slurm scheduler uses short notation to give the status of the job. The meaning of these short notations is given in the table below.
TIME	The maximum wall time requested by the user for a job
NODES	The requested number of nodes on which the job is running along with the node names if resources are already allocated

When a user submits a job, it passes through various states. The values of these states for a job is given by squeue command under the column ST. The possible values of the job under ST column are given below:

Status Value	Meaning	Description
CG	Completing	Job is about to complete.
PD	Pending	Job is waiting for the resources to be allocated
R	Running	Job is running on the allocated resources
S	Suspended	Job was allocated resources but the execution got suspended due to some problem and CPUs are released for other jobs
NF	Node Failure	Job terminated due to failure of one or more allocated nodes

Altering Batch Jobs

The users are allowed to change the attributes of their jobs until the job starts running. In this section, we will describe how to alter your batch jobs with examples.

Remove a Job from Queue

User can remove the jobs in any state which are submitted by them using the command scancel.

To remove a job with a JOB ID 1234, use the command:

scancel 1234

Modifying the Job Details

Users can make use of the Slurm command scontrol which is used to alter a variety of Slurm parameters. Most of the commands using scontrol can only be executed by an ISAAC-NG System Adminstrator; however, users are granted some permissions to use scontrol for use on the jobs they have submitted, provided the jobs are not in the running mode.

Release/Hold a job

scontrol release/hold jobid

Modify the name of the job

scontrol update JobID=jobid JobName=any_new_name

Modify the total number of tasks

scontrol update JobID=jobid NumTasks=Total_tasks

Modify the number of CPUs per node

scontrol update JobID=jobid MinCPUsNode=CPUs

Modify the wall time of the job

scontrol update JobID=jobid TimeLimit=day-hh:mm:ss