HPC environments and SLURM
Every HPC system is configured differently, and it takes some time to get to know its file structure and an optimized workflow. Here I give basic information about exploring HPC systems as well as some details of the SLURM job scheduler.
HPC Environments and modules
HPC systems use modules to manage software environments. That means, unlike on a local machine, not all paths are accessible by default: we have to load specific modules before we can access specific executables.
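For example, this is roughly what that looks like in practice; the module name matches the example used later in this section, but the install path printed by which is a hypothetical placeholder that differs between clusters:

$ which pw.x                            # nothing found: pw.x is not on the PATH yet
$ module load quantumespresso/7.2
$ which pw.x
/sw/apps/quantumespresso/7.2/bin/pw.x   # hypothetical install path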
module avail
- Description: Lists all available modules; there can be thousands of them. It shows each module's name and the section it resides in. module spider does the same thing.
module spider <keyword>
- Description: Lists all modules whose names contain the string keyword. If there is a module named exactly keyword, it shows detailed information about that module.
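For instance, to look for a Wannier90 installation (the module names below are hypothetical; on Lmod-based systems, querying an exact name also lists which other modules must be loaded first):

$ module spider wannier90           # list every module whose name contains "wannier90"
$ module spider wannier90/3.1.0     # exact name: detailed info, including prerequisite modules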
module list
- Description: Lists all modules that are currently loaded.
module purge
- Description: Unloads all currently loaded modules. By default, the HPC may load some modules when we log in; purging will unload those too. Use with caution.
module load <module_name>
- Description: Loads a specific module (e.g., module load quantumespresso/7.2).
module unload <module_name>
- Description: Unloads a specific module.
In rare cases, we have to source a script to set some environment variables after loading the modules. For example, on the ARF cluster we have to source intel/setvars.sh or something similar.
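Putting the module commands together, a typical session looks something like this minimal sketch; the module names and the path of the vendor script are illustrative, so check your cluster's documentation for the real ones:

$ module purge                                   # start from a clean environment
$ module load intel/2024 quantumespresso/7.2     # illustrative module names
$ source /path/to/intel/setvars.sh               # only if your cluster requires it; hypothetical path
$ module list                                    # verify what is actually loaded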
Job Submission on HPC
On a local machine, we can simply execute a command such as pw.x or, for a parallel run, mpirun -np 10 pw.x. However, HPC systems usually employ a job scheduling system such as SLURM or PBS. The job scheduler decides which job runs first, which nodes are assigned to which job, how memory is allocated, etc. I am only familiar with SLURM. If you want, you can even install SLURM on your local machine. Here are some common commands for SLURM-compatible HPCs:
sbatch
- Usage: sbatch job.sh
- Description: Submits the job file job.sh to the queue. More on the job.sh file is below. When we submit a job, it prints a message saying Submitted batch job <job_ID>.
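A submission therefore looks like this; the job ID is whatever number the scheduler assigns:

$ sbatch job.sh
Submitted batch job 1926489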
squeue
- Usage: squeue
- Description: Shows the current job queue. The job status PD means pending, R means running, and CG means completing (after an error or a normal finish). Once a job has finished, it no longer appears in the queue.
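The output looks roughly like the sketch below; the exact columns can be configured per cluster, and the node names and values shown here are purely illustrative:

$ squeue -u $USER
   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
 1926489     orfoz my_dft_j amuhaymi  R      18:49      5 orfoz[101-105]
 1926490     orfoz my_dft_j amuhaymi PD       0:00      5 (Priority)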
scancel
- Usage: scancel <job_ID>
- Description: As the name suggests, we can cancel a job at any stage.
sinfo
- Usage: sinfo
- Description: Shows partition information and node availability. It's very helpful for checking which partitions and nodes are available right before submitting a job. However, some HPC systems, such as MN5, do not give access to the sinfo command. It can be combined with grep to get the names of available partitions and nodes, e.g., sinfo | grep idle.
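An illustrative snippet of what this might return; the partition name follows the job-file example below, while the node names and counts are made up:

$ sinfo | grep idle
orfoz        up 3-00:00:00     42   idle orfoz[201-242]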
sacct
- Usage: sacct -j <job_ID>
- Description: Shows accounting information about jobs. Sometimes it helps to inspect this to learn about the timing and other details of a job. We can use the -o option to request more fields.
- Example:
$ sacct -j 1926489 -o JobID,State,Elapsed,Start,End,AllocCPUs,ExitCode
JobID             State    Elapsed               Start                 End  AllocCPUS ExitCode
------------ ---------- ---------- ------------------- ------------------- ---------- --------
1926489       COMPLETED   00:18:49 2025-04-03T21:51:58 2025-04-03T22:10:47        220      0:0
1926489.bat+  COMPLETED   00:18:49 2025-04-03T21:51:58 2025-04-03T22:10:47        110      0:0
1926489.ext+  COMPLETED   00:18:49 2025-04-03T21:51:58 2025-04-03T22:10:47        220      0:0
1926489.0     COMPLETED   00:18:47 2025-04-03T21:52:00 2025-04-03T22:10:47        220      0:0
seff
- Usage: seff <job_ID>
- Description: Not every HPC has this, but if yours does, it shows job efficiency information. Run this command after the job finishes; otherwise it gives inaccurate information for running/pending jobs.
- Example:
$ seff 1926489
ID: 1926489
Cluster: arf
User/Group: amuhaymin/amuhaymin
State: COMPLETED (exit code 0)
Nodes: 2
Cores per node: 110
CPU Utilized: 2-19:36:54
CPU Efficiency: 98.00% of 2-20:59:40 core-walltime
Job Wall-clock time: 00:18:49
Memory Utilized: 203.68 GB (estimated maximum)
Memory Efficiency: 47.40% of 429.69 GB (1.95 GB/core)
SLURM job files
A SLURM job file can be submitted with the sbatch command. It's possible to submit any bash shell script with sbatch, e.g., sbatch -A username -J jobname job.sh, but to be systematic, we put all the SLURM directives inside the job file. Below is an example of such a job file:
#!/bin/bash
#SBATCH --job-name=my_dft_job                # job name shown in squeue
#SBATCH --account=username                   # project/account to charge
#SBATCH --partition=orfoz                    # partition (queue) to submit to
#SBATCH --ntasks=550                         # total number of MPI tasks
#SBATCH --nodes=5                            # number of nodes
#SBATCH --time=2-10:20:30                    # walltime limit (days-hours:minutes:seconds)
#SBATCH --output=output_%j.txt               # standard output file; %j expands to the job ID
#SBATCH --error=error_%j.txt                 # standard error file
#SBATCH --mail-type=ALL                      # email notifications (begin, end, fail, ...)
#SBATCH --mail-user=your_email@example.com   # address for the notifications

module load quantum_espresso
srun pw.x -in input_file.in > output_file.out
The same job file can also be written using the short forms of the SBATCH options:

#!/bin/bash
#SBATCH -J my_dft_job
#SBATCH -A username
#SBATCH -p orfoz
#SBATCH -n 550
#SBATCH -N 5
#SBATCH -t 2-10:20:30
#SBATCH -o output_%j.txt
#SBATCH -e error_%j.txt
#SBATCH --mail-type=ALL
#SBATCH --mail-user=your_email@example.com
module load quantum_espresso
srun pw.x -in input_file.in > output_file.out
You can learn more about each of these keywords from this page; especially check the sbatch page.
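Two sbatch features that are easy to miss: the --test-only flag validates the directives and estimates a start time without actually submitting anything, and options given on the command line override the corresponding #SBATCH lines in the script. A minimal sketch, using the job.sh from above:

$ sbatch --test-only job.sh       # parse directives and estimate a start time; no job is submitted
$ sbatch --time=01:00:00 job.sh   # submit, but override the walltime set inside the script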
Useful Tips and Shortcuts
- Tab Completion: Use the Tab key to auto-complete commands and filenames.
- Command History: Type history to view a list of previously executed commands.
- Searching in Files: Use grep to search for text patterns. For example, the QE output file reports the total energy on lines marked with an exclamation mark, so grep ! output_file.out will show the lines with the energy. Similarly, grep accuracy output_file.out will show the trend in the SCF accuracy, which helps to determine during a run whether the calculation is converging or not.
- Clear Screen: Use clear to clean your terminal.
- Monitoring Resources: Use top or htop to view running processes and free -h to check memory usage. These are good for checking your local machine, not so good on an HPC.
- Feedback loop: Use the output of one command as the input of another using the pipe operator |. For example, module avail will list all the available modules, but if we are trying to find wannier, we can do module avail | grep wannier.
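These tips can be combined. For example, a minimal sketch for keeping an eye on a running calculation (the file name follows the job script above; watch is a standard utility but may not be installed on every system):

$ watch -n 30 'squeue -u $USER'             # refresh your queue view every 30 seconds
$ tail -f output_file.out | grep accuracy   # follow the SCF accuracy while the job is running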