High Performance & Scientific Computing

Running Jobs on ISAAC-NG



**The compute resources of ISAAC-NG are managed using the SLURM scheduler and workload manager**

Introduction

The ISAAC-NG cluster offers access to its compute resources in two ways: (i) the command line and (ii) the web-based Open OnDemand interface (see the ISAAC-NG web page about Open OnDemand for further detail). The compute resources on the ISAAC-NG cluster are managed using the SLURM scheduler and workload manager. This document explains how to use the compute resources of ISAAC-NG efficiently from the command line and how to submit, monitor, and manage computational jobs for production work with the SLURM scheduler. Please note that:

The login nodes should ONLY be used for basic tasks such as: file editing, file management, code compilation, and job submission.

General Information

As explained on the Access and Login page, when you log in to the ISAAC-NG HPSC cluster, you will be directed to one of the login nodes, from which you can access the cluster to do your research work. Please note that the login nodes should only be used for basic tasks such as file editing, code compilation, and job submission. Running production jobs on login nodes is highly discouraged, and any production job running on the login nodes is subject to termination.

If a user is not sure how to determine whether their job(s) are running on a login node, the user is encouraged to reach out to the High Performance & Scientific Computing (HPSC) group for assistance by submitting a help request ticket using the “Submit HPSC Service Request” button at the top right corner of any of the HPSC website pages at oit.utk.edu/hpsc.

As discussed above, login nodes should not be used to run production jobs. Any kind of production work should be performed on the system's compute resources. The compute resources of the ISAAC-NG cluster are managed and allocated by the SLURM (Simple Linux Utility for Resource Management) scheduler, which uses several partitions to efficiently allocate resources to different jobs.

This page provides information for getting started with the batch facilities of the SLURM scheduler as well as basic job execution. However, before getting started with job submission, it is important to understand the job directories on the ISAAC-NG cluster from which jobs can be submitted.

Job Directories

By default, the SLURM scheduler assumes the working directory to be the one from which the job is submitted. It is recommended that jobs be submitted from within a directory in the Lustre file system. The following storage spaces are available for users to choose from when deciding the directory from which they would like to run their job(s):

  • /lustre/isaac/proj/<project-name> – This is the project storage space for the project under which the job is run. Users have their own folder, named after their username, under this space, which they can use to store their data. However, this storage space is allocated under a specific project and is made available upon request by the Principal Investigator of the project.
  • /lustre/isaac/scratch/ – All ISAAC-NG users have access to the scratch storage space, which can be accessed using the $SCRATCHDIR environment variable.

Please refer to the File Systems page for more details.
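As a sketch, a per-job working directory can be staged under scratch space before submitting a job. The `$SCRATCHDIR` variable is set on ISAAC-NG; the `/tmp` fallback and the `run_$$` directory name below are only illustrative assumptions so the snippet also runs off-cluster:

```shell
# Create and enter a per-job working directory under scratch space.
# $SCRATCHDIR is defined on ISAAC-NG; the /tmp fallback is only so this
# sketch also runs outside the cluster.
JOBDIR="${SCRATCHDIR:-/tmp}/run_$$"
mkdir -p "$JOBDIR"
cd "$JOBDIR"
echo "Working directory: $PWD"
```

Submitting from this directory means SLURM's default working directory for the job is already in Lustre scratch space.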

SLURM Batch Scripts

In this section, we explain how to ask the SLURM scheduler to allocate resources for a job, and how to submit those jobs.

Partitions

The SLURM scheduler organizes similar sets of nodes and job features into individual groups, each of which is referred to as a "partition". Each partition has hard limits on the maximum wall clock time, the maximum job size, the number of nodes a user may request, and so on. The default partitions for all ISAAC-NG users are named "campus" and "campus-gpu", under which a number of CPU and GPU cores are available. For more information on the resources available under each partition, please refer to our System Overview page.

Scheduling Policy

The ISAAC-NG cluster uses a variety of mechanisms to determine how to schedule jobs. Users can adjust most of the resource request parameters to help ensure that their jobs are scheduled and executed in a reasonable time period.

Condos

The ISAAC-NG computing cluster has been divided into logical units referred to as "condos". In the context of the cluster, a condo is a logical group of compute nodes that acts as an independent cluster to effectively schedule jobs. There are "institutional" and individual "private" condos within the ISAAC-NG cluster. All users belong to their respective institutional condo, whether that be UTK or UTHSC. Private condos are owned and used by individual investors and their associated projects (discussed below). With a private condo, the investor and their projects have exclusive use of the compute nodes in the condo. Investors can choose to share their private condo with the entire cluster under the Responsible Node Sharing implementation, but this is not a requirement.

Project Accounts

A project is the combination of an account used in the batch system and a related Linux group used on the system for file access control. Projects and their associated batch-system accounts are a key component of the cluster's condo-based scheduling model. Projects control access to institutional and private condos. For UTK institutional users, the default project account is ISAAC-UTK0011. For UTHSC institutional users, the default project is ISAAC-UTHSC0001. Other projects follow the same project identifier format. To determine which projects you belong to, log in to the User Portal and look under the "Projects" header. Always ensure you use the correct project account when submitting jobs with a SLURM directive. The following example specifies the UTK institutional project account through the SBATCH directive in a job script:

#SBATCH -A ISAAC-UTK0011

Note that the SLURM utility's scheduling policy requires that the nodes under each condo be part of a partition. Therefore, each condo has a unique partition associated with its own project account; this makes it imperative for users to specify both their desired partition and a specific project account when requesting resources on the ISAAC-NG computing cluster. For example, the institutional condo has a campus partition associated with its project account (ISAAC-UTK0011). Once a user submits a job to the ISAAC-NG cluster, the SLURM scheduler applies processing constraints to the submitted job, such as the maximum wall time and the number of nodes required, and schedules the job.

To submit a job under the campus partition, include the following directives in your job script:

 #SBATCH --account=ISAAC-UTK0011 (or #SBATCH -A ISAAC-UTK0011)
 #SBATCH --partition=campus (or #SBATCH -p campus)
 #SBATCH --qos=normal (or --qos=condo for private condo users)

Important Note: It is the combination of a partition, QoS, and project account which the Slurm scheduler program uses to allocate computing resources from the cluster for each job. If a user does not appropriately specify all three, the Slurm utility may not be able to process the user’s job as submitted due to a lack of information from the user.

The first directive tells SLURM which project account the job is associated with, the second tells SLURM to which partition the job should be submitted, and the third specifies the Quality of Service under which the job should run. Remember, Slurm needs all three pieces of information (the project account name, the partition name, and the Quality of Service) in order to appropriately assign resources to a submitted job and schedule it to run on the ISAAC-NG cluster's computing resources.

Detailed information about all the partitions and the respective nodes on ISAAC-NG can be viewed using the following command:

 $ sinfo -Nel

For more information on the flags used by these commands, please refer to the Slurm documentation.

Please note that in addition to the above directives (project account, partition, and QoS), users also need to specify the type of nodes, the number of nodes, the number of cores, and the estimated maximum wall time as parameters or other optional SBATCH directives when submitting a job script. A collection of complete sample job scripts is also available at /lustre/isaac/examples/jobs. Detailed information on how to submit a job using SBATCH directives is provided under the section Submitting Jobs with Slurm.

Once a job is submitted, the Slurm scheduler checks for available resources and allocates them to the job to launch its tasks.

Important Note: Users can choose to run a job exclusively on an entire node by using the --exclusive flag, while also using srun to distribute tasks among different CPUs. See Table 3.3 for how to use this flag.

The order in which jobs are run depends on the following factors:

  • Number of nodes requested – jobs that request more nodes get a higher priority.
  • Queue wait time – a job’s priority increases along with its queue wait time (not counting blocked jobs as they are not considered “queued.”)
  • Number of jobs – at most ten jobs per user at a time are eligible to run.

In certain special cases, the priority of a job may be manually increased upon request. To request a priority change, contact the OIT HelpDesk; they will need the job ID and the reason for the request.

Slurm Commands/Variables

The table below lists a few important Slurm commands, with descriptions, that are most often used when working with the Slurm scheduler; all of them can be run on the ISAAC-NG login nodes.

| Command | Description |
| --- | --- |
| `sbatch jobscript.sh` | Submits the job script to request resources |
| `squeue` | Displays the status of all jobs |
| `squeue -u username` | Displays the status and other information of all of a user's jobs |
| `squeue [jobid]` | Displays the status and information of a particular job |
| `scancel jobid` | Cancels the job with the given job ID |
| `scontrol show jobid/partition value` | Shows information about a job or any resource |
| `scontrol update` | Alters the resources of a pending job |
| `salloc` | Allocates resources for an interactive job |

Table 3.1: Common Slurm commands and associated descriptions

Slurm Variables

Tabulated below are a few important Slurm variables which ISAAC-NG users may find useful.

| Variable | Description |
| --- | --- |
| `SLURM_SUBMIT_DIR` | The directory from which the job was submitted |
| `SLURM_JOBID` | The job identifier of the submitted job |
| `SLURM_NODELIST` | The list of nodes allocated to the job |
| `SLURM_NTASKS` | The total number of tasks (CPUs) used by the job |

Table 3.2: Slurm variables used to get information about a running ISAAC-NG job
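As a minimal sketch, these variables might be logged at the top of a job script for easier debugging. The `${VAR:-default}` fallbacks are only an assumption for illustration, so the snippet also runs outside a Slurm allocation:

```shell
# Log the Slurm environment at job start.
# The defaults after :- let this snippet also run outside a Slurm job.
JOBID="${SLURM_JOBID:-none}"
SUBMIT_DIR="${SLURM_SUBMIT_DIR:-$PWD}"
NODELIST="${SLURM_NODELIST:-$(hostname)}"
NTASKS="${SLURM_NTASKS:-1}"

echo "Job $JOBID submitted from $SUBMIT_DIR"
echo "Running $NTASKS task(s) on: $NODELIST"
```

Inside a real job, Slurm sets all four variables, so the fallbacks are never used.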

SBATCH Flags

Jobs are submitted to the ISAAC-NG cluster with the sbatch command, which passes the resource requests in a user's job script to the Slurm scheduler. Resources are requested in the job script using the "SBATCH" directive. Note that the SLURM scheduler accepts most SBATCH directives in both a long and a short form; users may use either format at their discretion. The description of each SBATCH flag is given below:

| Flag | Description |
| --- | --- |
| `#SBATCH -J Jobname` | Name of the job |
| `#SBATCH --account=<account>` (or `-A <account>`) | Project account to which the time will be charged |
| `#SBATCH --time=days-hh:mm:ss` (or `-t`) | Requested wall time for the job |
| `#SBATCH --nodes=1` (or `-N 1`) | Number of nodes needed |
| `#SBATCH --ntasks=48` (or `-n 48`) | Total number of cores requested |
| `#SBATCH --ntasks-per-node=48` | Requested number of cores per node |
| `#SBATCH --output=Jobname.o%j` (or `-o`) | The file to which the job's terminal output is written |
| `#SBATCH --error=Jobname.e%j` (or `-e`) | The file to which run-time errors are written |
| `#SBATCH --partition=campus` (or `-p campus`) | Selects the partition or queue |
| `#SBATCH --exclusive` | Allocates exclusive access to the node(s) |
| `#SBATCH --array=index` (or `-a index`) | Used to run multiple jobs with identical parameters |
| `#SBATCH --chdir=directory` | Changes the working directory; the default is the directory from which the job is submitted |
| `#SBATCH --qos=normal` | The Quality of Service level for the job |

Table 3.3: Slurm flags used in creating job scripts, with descriptions


Submitting Jobs with Slurm

On ISAAC-NG, batch jobs can be submitted in two ways: (i) interactive batch mode, and (ii) non-interactive batch mode.

Interactive Batch mode

Interactive batch jobs give users interactive access to compute nodes. In interactive batch mode, users request that the Slurm scheduler allocate compute node resources directly from the terminal's command line interface (CLI). A common use for interactive batch jobs is to debug a calculation or program before submitting non-interactive batch jobs for production runs. This section demonstrates how to run interactive jobs through the batch system and provides common usage tips.

Interactive batch mode can be invoked on an ISAAC-NG login node by using the salloc command followed by flags requesting the specific resources to which the user would like access. The available flags are given in Table 3.3.

$ salloc -A projectaccount --nodes=1 --ntasks=1 --partition=campus --time=01:00:00
or
$ salloc -A projectaccount -N 1 -n 1 -p campus -t 01:00:00

The salloc command passes the user's request to the SLURM scheduler and thereby allows the user to request resources. In the example command above, we requested that the Slurm scheduler allocate one node and one CPU core for a total time of 1 hour in the "campus" partition. Note that if the salloc command is executed without specifying resources (i.e., number of nodes, number of CPUs, tasks, wall clock time, etc.), the Slurm scheduler will allocate the default resources.

Important Note: The default resources on ISAAC-NG are: one processor (1 CPU on 1 node) located under the campus partition, with a maximum wall clock time of 1 hour.

When the Slurm scheduler allocates the resources, the user who submitted the job receives a message on the terminal (as shown below) containing the job ID and the hostname of the compute node where the resources were allocated. Notice that in the example below, the job ID is "1234" and the hostname of the compute node is "nodename."

 $ salloc --nodes=1 --ntasks=1 --time=01:00:00 --partition=campus
  salloc: Granted job allocation 1234
  salloc: Waiting for resource configuration
  salloc: Nodes nodename are ready for job
 $

Once the interactive job starts, users need to ssh to the allocated node, or to one of the allocated nodes for jobs requesting more than one node, and should change their working directory to a Lustre project space or scratch space to run computationally intensive applications. It is best to run large jobs in Lustre scratch space rather than Lustre project space, because there is no limit on the amount of data a user may place in scratch space, whereas project space is limited. Please visit the File Systems page for more information. If users have questions about how best to run their interactive jobs, they are welcome to send their questions to HPSC staff by submitting an HPSC Service Request.

To run the parallel executable, we recommend using srun followed by the executable as shown below:

 $ srun executable
or
 $ srun -n <nprocs> executable

Important Note: Users do not necessarily need to specify the number of processors when calling srun; the Slurm wrapper srun will execute calculations in parallel on the number of processors requested in the user's job script. Serial (that is, non-parallel) applications can be run both with and without srun.

Non-interactive batch mode

In non-interactive batch mode, the set of resources as well as the commands for the application to be run are written in a text file referred to as a "batch file" or "batch script". A user's batch script for a particular job is submitted to the Slurm scheduler using the sbatch command. Batch scripts are very useful for users who want to run production jobs, because they allow users to work on the cluster non-interactively: users submit a group of commands to Slurm and then simply check the job's status and output from time to time. However, it is sometimes very useful to run a job interactively, primarily for debugging; see the Interactive Batch mode section above. A typical example of a non-interactive batch script/job script is given below:

 #!/bin/bash
 #This file is a submission script to request the ISAAC resources from Slurm 
 #SBATCH -J job			       #The name of the job
 #SBATCH -A ISAAC-UTK0011              # The project account to be charged
 #SBATCH --nodes=1                     # Number of nodes
 #SBATCH --ntasks-per-node=48          # cpus per node 
 #SBATCH --partition=campus            # If not specified then default is "campus"
 #SBATCH --time=0-01:00:00             # Wall time (days-hh:mm:ss)
 #SBATCH --error=job.e%J	       # The file where run time errors will be dumped
 #SBATCH --output=job.o%J	       # The file where the output of the terminal will be dumped
 #SBATCH --qos=normal

 # Now list your executable command/commands.
 # Example for code compiled with a software module:
 module load example/test

 hostname
 sleep 100
 srun executable


The above job script can be divided into three sections:

  1. Shell interpreter (one line)
    • The first line of the script specifies the script’s interpreter. The syntax of this line is #!/bin/shellname (sh, bash, csh, ksh, zsh)
    • This line is important and essential; if it is omitted, the scheduler will print an error.
  2. SLURM submission options
    • The second section contains a set of lines starting with '#SBATCH'.
    • To the shell, these lines look like comments, but Slurm reads them as directives.
    • #SBATCH is a Slurm directive which communicates the resources requested by the user in the batch script file.
    • #SBATCH options that appear after the first non-comment command line are ignored by the Slurm scheduler.
    • The description of each of the flags is given in Table 3.3.
    • The sbatch command is used on the terminal to submit the non-interactive batch script.
  3. Shell commands
    • The shell commands follow the last #SBATCH line.
    • These are the commands or tasks which a user wants to run, including loading any software modules needed to access a particular application.
    • To run a parallel application, it is recommended to use srun followed by the full path and name of the executable, if the executable's path is not already loaded into the Slurm environment when the script is submitted.

For a quick start, we have also provided a collection of complete sample job scripts, available on the ISAAC-NG cluster at /lustre/isaac/examples/jobs.
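As a sketch, a minimal job script like the one above can also be generated and submitted from the command line. The account, partition, and time values below are placeholders to adapt to your allocation, and the sbatch line is commented out so the snippet can be tried anywhere:

```shell
# Write a minimal job script with a here-doc, then submit it with sbatch.
# Adjust the account (-A) and partition for your own project.
cat > myjob.sh <<'EOF'
#!/bin/bash
#SBATCH -J myjob
#SBATCH -A ISAAC-UTK0011
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --partition=campus
#SBATCH --time=0-00:10:00
#SBATCH --qos=normal
hostname
EOF

# sbatch myjob.sh   # run this on an ISAAC-NG login node to submit
head -n 1 myjob.sh  # sanity check: the shebang must be the first line
```

Keeping the here-doc delimiter quoted ('EOF') prevents the shell from expanding anything inside the script before it is written.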

Job Arrays

Slurm offers a useful option for users whose batch jobs require identical resources: job arrays. Using the array flag in a job script, users can submit multiple jobs with a single sbatch command. Although the job script is submitted only once using the sbatch command, the individual jobs in the array are scheduled independently under a common job array identifier ($SLURM_ARRAY_JOB_ID), and each individual job can be differentiated using Slurm's environment variable $SLURM_ARRAY_TASK_ID. To understand this variable, consider the example Slurm script given below:

#!/bin/bash
#SBATCH -J myjob
#SBATCH -A ISAAC-UTK0011
#SBATCH -N 1
#SBATCH --ntasks-per-node=30   # --ntasks is used to define the total number of processors
#SBATCH --time=01:00:00
#SBATCH --partition=campus
##SBATCH -e myjob.e%j          # Uncomment to write errors to a separate file
#SBATCH -o myjob%A_%a.out      # Separate output file per array task: %A is the job ID, %a the array index
#SBATCH --qos=normal
#SBATCH --array=1-30           # Submit an array of jobs numbered 1 to 30

# Perform some simple commands
set -x

# Create the 30 script files needed to submit the array of jobs
for i in {1..30}; do cp sleep_test.sh "sleep_test$i.sh"; done

# Run your executable
sh sleep_test$SLURM_ARRAY_TASK_ID.sh

In the above example, 30 sleep_test<index>.sh script files are created, with names differentiated by an index. These 30 scripts could be run either by submitting 30 individual jobs or by using a Slurm job array. The Slurm array treats the files as sleep_test[1-30].sh and executes them. The variable SLURM_ARRAY_TASK_ID is set to the array index value (between 1 and 30, where 30 is the total number of jobs in the array in this example), and the index range is defined in the Slurm script above using the #SBATCH directive:

#SBATCH --array=1-30

The number of jobs running simultaneously in a job array can be limited by adding a %n suffix to the --array flag. For example, to run only 5 jobs at a time in a Slurm array, include the SLURM directive:

#SBATCH --array=1-30%5

To create a separate output file for each job submitted with a Slurm array, use %A and %a, which represent the job ID and the job array index respectively, as shown in the example above.
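A common pattern inside an array job is to use $SLURM_ARRAY_TASK_ID to select a per-task input file. The input_<N>.dat naming scheme below is a hypothetical example, and the default of 1 is only so the snippet also runs outside an array job:

```shell
# Map the array index to a per-task input file (hypothetical naming scheme).
TASK_ID="${SLURM_ARRAY_TASK_ID:-1}"   # Slurm sets this inside an array job
INPUT="input_${TASK_ID}.dat"
echo "Array task $TASK_ID will process $INPUT"
```

With `#SBATCH --array=1-30`, the thirty tasks would each pick up a different file, input_1.dat through input_30.dat, without any per-task script copies.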

Exclusive Access to Nodes

As explained in the Scheduling Policy section, jobs submitted by the same user can share compute nodes. However, if desired, users can request in their job scripts that whole node(s) be reserved to run their jobs without sharing them with other jobs. To do so, use the commands below:

Interactive batch mode:

 $ salloc -A projectaccount --nodes=1 --ntasks=1 --partition=campus --time=01:00:00 --exclusive

Non-Interactive batch mode:

Add the below line in your job script

 #SBATCH --exclusive

Monitoring Job Status

Users can check the status of their jobs at any time by using the squeue command.

$ squeue -u <netID>
              JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
              1202    campus     Job3 username PD       0:00      2 (Resources)
              1201    campus     Job1 username  R       0:05      2 node[001-002]
              1200    campus     Job2 username  R       0:10      2 node[004-005]

The description of each column of the squeue output is given below.

| Column | Description |
| --- | --- |
| JOBID | The unique identifier of each job |
| PARTITION | The partition/queue from which resources are allocated to the job |
| NAME | The name of the job specified in the Slurm script with the #SBATCH -J option; if -J is not used, Slurm uses the name of the batch script |
| USER | The login name of the user submitting the job |
| ST | The status of the job, given in Slurm's short notation; the meanings of these notations are listed in the table below |
| TIME | The wall clock time used by the job so far |
| NODES | The number of nodes requested by the job; NODELIST shows the node names once resources are allocated |

Table 3.4: squeue command output columns

Description of the output

When a user submits a job, it passes through various states. A job's current state is reported by the squeue command under the ST column. The possible values of the ST column are given below:

| Status Value | Meaning | Description |
| --- | --- | --- |
| CG | Completing | Job is about to complete |
| PD | Pending | Job is waiting for resources to be allocated |
| R | Running | Job is running on the allocated resources |
| S | Suspended | Job was allocated resources, but execution was suspended and the CPUs were released for other jobs |
| NF | Node Failure | Job terminated due to the failure of one or more allocated nodes |

Table 3.5: Possible states of a Slurm job

Altering Batch Jobs

Users are allowed to change the attributes of their jobs until a job starts running. In this section, we describe how to alter your batch jobs, with examples.

Remove a Job from Queue

Users can remove any of their own jobs, in any state, using the scancel command.

To remove a job with a JOB ID 1234, use the command:

scancel 1234

Modifying the Job Details

Users can make use of the Slurm command scontrol, which is used to alter a variety of Slurm parameters. Most scontrol commands can only be executed by an ISAAC-NG system administrator; however, users are granted some permissions to use scontrol on the jobs they have submitted, provided the jobs are not yet running.

Release/Hold a job

scontrol hold jobid
scontrol release jobid

Modify the name of the job

scontrol update JobID=jobid JobName=any_new_name

Modify the total number of tasks

scontrol update JobID=jobid NumTasks=Total_tasks

Modify the number of CPUs per node

scontrol update JobID=jobid MinCPUsNode=CPUs

Modify the wall time of the job

scontrol update JobID=jobid TimeLimit=day-hh:mm:ss