Slurm

This page serves as a point of introduction to understanding Slurm and how best to use it.

Slurm is a free and open-source job scheduler for Linux and Unix-like operating systems, used by many supercomputers and computer clusters.

Slurm schedules jobs and manages resources on both the MERCED and Pinnacles clusters. It does this in three ways:

  • Provides users with exclusive and/or non-exclusive access to resources on the compute nodes for a certain amount of time.
  • Provides a framework to start, execute, and monitor work on the set of allocated compute nodes.
  • Manages the queue of pending jobs based on the availability of resources.

Slurm Commands

The basic Slurm commands for interacting with the queueing and resource scheduler via the command-line interface are shown below.

| Command | Syntax | Description |
| --- | --- | --- |
| sbatch | sbatch <jobname> | Submit a batch script to Slurm for processing. |
| squeue | squeue --me | Show information about your job(s) in the queue. When run without the --me flag, the command shows all jobs in the queue. |
| srun | srun <option> | Please see our example here for using srun in an interactive session. |
| skill/scancel | scancel <job-id> | End or cancel a queued job. |
| sinfo | sinfo | Get information about the resources on available nodes that make up the HPC cluster. |
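
For illustration, a typical command-line workflow might look like the following; my_job.sub and the job ID 123456 are placeholders used only as an example.

sbatch my_job.sub   # submit a batch script to the queue
squeue --me         # check the status of your own queued and running jobs
scancel 123456      # cancel the job with ID 123456
sinfo               # view the state of the cluster's partitions and nodes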

sbatch Command

The sbatch command is used to submit job scripts to the Slurm scheduler; each submitted job is placed in the appropriate queue and begins once the requested resources become available.

Here is an example of using sbatch to submit a job file to the job scheduler.

sbatch <batchscript>
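
When the job is accepted, sbatch prints the assigned job ID. In the hypothetical session below, example_job.sub is a placeholder script name and 123456 is the ID Slurm happened to assign.

$ sbatch example_job.sub
Submitted batch job 123456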

tip

The sbatch command is different from SBATCH directives. A directive is a way of specifying how a certain aspect of the job script should be configured (e.g., maximum wall time, maximum core count, etc.). Visit this section for further information on Slurm directives.

Slurm Script Main Parts

A Slurm script has four main components that must all be present for your job to be successfully processed.

1) Shebang

The shebang line tells the shell (which interprets the UNIX commands) to interpret and run the Slurm script using the bash shell. This line should always be the very first line of your SBATCH/Slurm script.

#!/bin/bash

2) SBATCH Directives in the Script

The SBATCH directives must be used in the following manner for Slurm to properly recognize them:

Template:

#SBATCH --[directive]

Example:

#SBATCH --job-name=EXAMPLEJOB
#SBATCH --nodes=2

Common Directives

| Option | Description |
| --- | --- |
| --job-name=NAME | Specifies the name of the job. |
| --partition=PARTITION | Submits the job to a specific partition. |
| --nodes=N | Requests N nodes for the job. |
| --ntasks-per-node=N | Specifies the number of tasks to launch per node. |
| --mem=MB | Requests a specific amount of memory (in megabytes). |
| --time=DAYS-HOURS:MINUTES:SECONDS | Sets a limit on the total run time of the job. |
| --output=FILE_NAME | Designates a file to capture the standard output. |
| --error=FILE_NAME | Specifies a file to capture the standard error output. |

tip

Please see the directives section for a more complete list.
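
As an illustration of how these directives combine, the hypothetical header below requests 2 nodes with 8 tasks each (16 tasks in total), 4000 MB of memory, a one-hour run time limit, and separate files for standard output and standard error; all of the values are placeholders to adapt to your own job.

#SBATCH --job-name=EXAMPLEJOB
#SBATCH --partition=normal
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --mem=4000
#SBATCH --time=0-01:00:00
#SBATCH --output=EXAMPLEJOB.out
#SBATCH --error=EXAMPLEJOB.err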

3) Exporting All of the User's Environment to the Job:

#SBATCH --export=ALL

This line tells Slurm to export your current working environment to the compute nodes that will process the requested instructions. It is important because it helps ensure consistency between the environment you submit the job from and the environment the job runs in.

4) Commands/Instructions to Execute Your Script

This is where you write the commands or instructions that carry out the desired computations or calculations. Below is a small example of what these instructions can look like:

pwd                  # prints the working directory
source activate env1 # activates a conda environment called env1
python hello.py      # runs a Python program called hello.py
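
Putting the four parts together, a complete job script might look like the sketch below. The job name, resource amounts, conda environment, and Python program are placeholders drawn from the examples above; adjust them to your own work before submitting with sbatch.

#!/bin/bash
#SBATCH --job-name=EXAMPLEJOB        # name shown in the queue
#SBATCH --partition=normal           # partition to run in
#SBATCH --nodes=1                    # number of nodes
#SBATCH --ntasks-per-node=1          # tasks to launch per node
#SBATCH --mem=2000                   # memory in megabytes
#SBATCH --time=0-00:30:00            # run time limit (days-hours:minutes:seconds)
#SBATCH --output=EXAMPLEJOB_%j.out   # standard output file (%j expands to the job ID)
#SBATCH --export=ALL                 # export the submission environment to the job

pwd                  # print the working directory
source activate env1 # activate a conda environment called env1
python hello.py      # run a Python program called hello.py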

Job Examples

More detailed job examples can be found here.

Slurm - Job Management

Job management is critical before running or scaling jobs and computations within an HPC environment. We have created a manual page that can be found [ADDBACK MANAGEMENT LINK]. The documentation goes over common Slurm commands, such as sacct and scontrol, that help with debugging job errors and reviewing overall performance.
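
As a quick illustration, two commonly used checks are shown below; the job ID 123456 is a placeholder, and the sacct fields listed are just one possible selection.

sacct -j 123456 --format=JobID,JobName,State,Elapsed,MaxRSS   # accounting summary for a job
scontrol show job 123456                                      # detailed scheduler view of a pending or running job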

SBATCH Directives for Job Scripts

| Directive | Description |
| --- | --- |
| --job-name <jobname> | Specifies a name for the job allocation. The specified name will appear along with the job ID number when querying running jobs on the system. The default is the name of the batch script, or just sbatch if the script is read on sbatch's standard input. |
| --output <filename.out> | Instructs Slurm to connect the batch script's standard output directly to the file name. If not specified, the default file name is slurm-jobID.out. |
| --partition <partition> | Requests a specific partition for the resource allocation (gpu, interactive, normal). If not specified, the default partition is normal. |
| --cpus-per-task=<number> | Advises the Slurm controller that ensuing job steps will require this number of processors per task. Without this option, the controller will try to assign one processor per task. |
| --mem-per-cpu=<number> | The minimum memory required per allocated CPU. Note: it is highly recommended to specify --mem-per-cpu; if it is not set, the default of 500 MB per CPU will be assigned. |
| --time=<D-H:M:S> | Sets a limit on the total run time of the job allocation. If the requested time limit exceeds the partition's time limit, the job will be left in a PENDING state (possibly indefinitely). The default time limit is the partition's default time limit. A time limit of zero requests that no time limit be imposed. The acceptable time format is days-hours:minutes:seconds. |
| --mail-user=<email> | Defines the user who will receive email notifications of state changes as defined by --mail-type. |
| --mail-type=<type> | Notifies the user by email when certain event types occur. Valid type values are BEGIN, END, and FAIL. The user to be notified is indicated with --mail-user. Multiple values can be declared on one line, e.g., --mail-type=BEGIN,END,FAIL. |
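
For example, to have Slurm email a hypothetical address when a job begins, ends, or fails, the two mail directives can be combined as follows (replace the address with your own):

#SBATCH --mail-user=student@example.edu
#SBATCH --mail-type=BEGIN,END,FAIL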

Slurm Output Environment Variables

When a job scheduled by Slurm starts, it needs to know certain information about its execution environment, for example the working directory and the nodes allocated to it. Slurm passes this information to the running job via so-called environment variables. The following table lists the most commonly used of these variables.

| Slurm Environment Variable | Description |
| --- | --- |
| SLURM_CPUS_ON_NODE | Number of CPUs on the allocated node. |
| SLURM_CPUS_PER_TASK | Number of CPUs requested per task. |
| SLURM_GPUS | Number of GPUs requested. |
| SLURM_GPUS_PER_NODE | Requested GPU count per allocated node. |
| SLURM_GPUS_PER_TASK | Requested GPU count per allocated task. |
| SLURM_JOB_ID | The ID of the job allocation. |
| SLURM_JOB_CPUS_PER_NODE | Count of processors available to the job on this node. |
| SLURM_JOB_NAME | Name of the job. |
| SLURM_JOB_NODELIST | List of nodes allocated to the job. |
| SLURM_JOB_NUM_NODES | Total number of nodes in the job's resource allocation. |
| SLURM_JOB_PARTITION | Name of the partition in which the job is running. |
| SLURM_MEM_PER_CPU | Minimum memory required per allocated CPU. |
| SLURM_MEM_PER_GPU | Requested memory per allocated GPU. |
| SLURM_MEM_PER_NODE | Total amount of memory per node that the job needs. |
| SLURM_NODELIST | List of nodes allocated to the job. |
| SLURM_NPROCS | Total number of CPUs allocated. |
| SLURM_NTASKS | Maximum number of MPI tasks (that is, processes). |
| SLURM_NTASKS_PER_CORE | Number of tasks requested per core. |
| SLURM_NTASKS_PER_GPU | Number of tasks requested per GPU. |
| SLURM_NTASKS_PER_NODE | Number of tasks requested per node. |
| SLURM_PRIO_PROCESS | The scheduling priority (nice value) at the time of job submission. This value is propagated to the spawned processes. |
| SLURM_PROCID | The MPI rank (or relative process ID) of the current process. |
| SLURM_SUBMIT_DIR | The directory from which sbatch was invoked. |
| SLURM_SUBMIT_HOST | The hostname of the computer from which sbatch was invoked. |
| SLURM_TASK_PID | The process ID of the corresponding task. |
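
As a sketch of how these variables can be used inside a job script, the lines below echo a few of them into the job's output and size a hypothetical multithreaded program to the allocation; only the Slurm variable names themselves come from the table above.

echo "Job $SLURM_JOB_ID ($SLURM_JOB_NAME) is running on: $SLURM_JOB_NODELIST"
echo "Submitted from $SLURM_SUBMIT_DIR on host $SLURM_SUBMIT_HOST"

# Match the thread count of an OpenMP-style program to the CPUs allocated per task.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK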