How to check the job type on Slurm: batch or interactive?

I want to determine whether a Slurm job is a batch job or an interactive job.
I can check the batch host with the following command, but is there a better way?
squeue -O "Name,BatchHost"
NAME EXEC_HOST
int login001
batch compute009

The output of scontrol show job includes a BatchFlag field. The scontrol man page says:
BatchFlag
Jobs submitted using the sbatch command have BatchFlag set to 1. Jobs submitted using other commands have BatchFlag set to 0.
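If you want this as a single answer per job, a small wrapper around scontrol show job can do it. This is only a sketch: the function name job_type is made up for illustration, and labelling BatchFlag=0 as "interactive" is an assumption, since the man page only says the job was not submitted with sbatch (it could come from salloc or srun).

```shell
# job_type JOBID
# Prints "batch" if BatchFlag is 1, "interactive" if it is 0, "unknown" otherwise.
# Hypothetical helper; it relies on scontrol show job printing a BatchFlag=<n> token.
job_type() {
    local flag
    # Pull the BatchFlag=<n> token out of the scontrol output and keep only <n>.
    flag=$(scontrol show job "$1" | grep -o 'BatchFlag=[0-9]*' | cut -d= -f2)
    case "$flag" in
        1) echo "batch" ;;
        0) echo "interactive" ;;   # assumption: "not sbatch" is treated as interactive
        *) echo "unknown" ;;       # job not found, or an unexpected value
    esac
}
```

Usage would be something like job_type 12345, with 12345 replaced by a job ID that scontrol still knows about.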

Related

Check sbatch script of running job

When running a Slurm job from an sbatch script, is there a command that lets me see what was in the sbatch script that I used to start this job?
For example, sacct tells me I'm on SLURM_JOB_ID.3 and I would like to see how many job steps there will be in total.
I'm looking for a command that takes the job id and prints the sbatch script it is running.
You can use
scontrol write batch_script 12345 -
The above prints the submission script of the job with job ID 12345 to standard output. Without the trailing -, the script is written to the file slurm-12345.sh instead.
More info: https://slurm.schedmd.com/scontrol.html#OPT_write-batch_script
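For the "how many job steps" part of the question, one rough approach is to count the lines of the script that start with srun. This is a sketch under assumptions: count_srun_lines is a hypothetical helper name, it relies on scontrol write batch_script printing to standard output when - is given as the file name, and it only gives an estimate, because srun calls inside loops or wrapper scripts are not counted correctly.

```shell
# count_srun_lines JOBID
# Prints the number of lines in the job's batch script that begin with "srun".
# Hypothetical helper; commented-out lines and "srun" inside other words are ignored.
count_srun_lines() {
    scontrol write batch_script "$1" - |
        grep -cE '^[[:space:]]*srun([[:space:]]|$)'
}
```

Usage would be something like count_srun_lines 12345.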

Add squeue in a Slurm job script

Normally I can check the job status by executing
squeue -u [userid]
at the command line after submitting the job. Is there a way to add this line to the job submission script so that the job status automatically shows up once the job is submitted?
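One thing to keep in mind is that the commands in the job script only run once the job starts, so a squeue line placed there would not print anything at submission time. A possible alternative is to wrap the submission itself. The sketch below assumes a hypothetical helper named submit_and_show; it uses sbatch --parsable, which prints just the job ID (optionally followed by ; and the cluster name).

```shell
# submit_and_show [sbatch options] SCRIPT [script args]
# Submits the job, then shows its queue entry. Hypothetical helper name.
submit_and_show() {
    local jobid
    # --parsable makes sbatch print "jobid" or "jobid;cluster" and nothing else.
    jobid=$(sbatch --parsable "$@") || return 1
    jobid=${jobid%%;*}          # drop the ";cluster" suffix if present
    squeue -j "$jobid"
}
```

Usage would be something like submit_and_show submit.sh in place of sbatch submit.sh.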

How can I find out the "command" (batch script filename) of a finished SLURM job?

I often have lots of SLURM jobs running from different directories. Therefore, it is useful to query the workdir of the jobs. I can do this for jobs in the queue (e.g. pending, running, etc.) something like this:
squeue -u $USER -o "%i %Z"
and I can do this for finished jobs (e.g. completed, timeout, cancelled, etc.) something like this:
sacct -u $USER -o JobID,WorkDir
The problem is, sometimes I have a directory with two (or more) SLURM batch scripts in it, e.g. submit.sh and restart.sh. Therefore, it is also useful to query the "command" of the jobs, i.e. the filename of the batch script. I can do this for jobs in the queue something like this:
squeue -u $USER -o "%i %o"
However, from checking the documentation of sacct and playing around with sacct, there appears to be no equivalent option for sacct so I cannot currently get the command for finished jobs. I also cannot use the squeue method for finished jobs - it just says slurm_load_jobs error: Invalid job id specified because finished jobs are not included in the squeue list. So, how can I find out the command of a finished SLURM job (using sacct or otherwise)?
Indeed, Slurm does not store the command in the accounting database. Two workarounds:
For a single user: use the JobName or Comment field to store the script name at submission time. These fields are stored in the database, but this approach is error-prone.
Cluster-wide: enable the Elasticsearch job completion plugin, which stores not only the script name but the whole script contents as well.
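The single-user workaround could look roughly like the sketch below. The helper name submit_named is hypothetical. It sets the job name to the script's file name on the command line, which takes precedence over any #SBATCH --job-name directive inside the script.

```shell
# submit_named SCRIPT [sbatch options]
# Submits SCRIPT with its file name as the job name, so that sacct can
# report it later. Hypothetical helper; extra arguments are passed to sbatch.
submit_named() {
    local script=$1
    shift
    sbatch --job-name="$(basename "$script")" "$@" "$script"
}

# Later, for finished jobs (the %40 widens the JobName column):
#   sacct -u "$USER" -o JobID,JobName%40,WorkDir
```

Usage would be something like submit_named restart.sh instead of sbatch restart.sh. The error-prone part remains: any job submitted without the wrapper will not carry the script name.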

SLURM: how to know if a running job is an srun or an sbatch job?

I need to distinguish between batch and interactive jobs when they are in the RUNNING state.
I can't find a way with sacct or sstat to know if a job is an interactive session.
Has anyone already solved a similar problem?
You can use the batchflag formatting keyword in the squeue command to infer if a job has been submitted using the sbatch command.
$ squeue --Format=batchflag -u ${USER} --states=RUNNING
From the BatchFlag description in the scontrol help page:
Jobs submitted using the sbatch command have BatchFlag set to 1. Jobs submitted using other commands have BatchFlag set to 0.

SLURM job dependency by job name not job id

The format for job dependencies in the documentation is as follows:
sbatch --dependency=<type:job_id[:job_id][,type:job_id[:job_id]]> ...
Is it possible to make a job dependency using job name instead of job ID?
Slurm does not seem to support that, but a workaround that works on the command line (not in a #SBATCH directive in a script) would be:
sbatch --dependency=$(squeue --noheader --format %i --name <JOB_NAME>) ...
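If several queued jobs share the same name, squeue prints one job ID per line, so the IDs need to be joined before they are passed to --dependency. The sketch below assumes a hypothetical helper named dependency_for_name and picks the afterok dependency type as an example; swap in another type if that is what you need. Job arrays, whose IDs squeue prints in a different form, are not handled.

```shell
# dependency_for_name JOB_NAME
# Prints "afterok:ID1:ID2:..." for all queued jobs with that name,
# or returns non-zero if there are none. Hypothetical helper.
dependency_for_name() {
    local ids
    # One job ID per line -> strip stray spaces -> join with ":".
    ids=$(squeue --noheader --format %i --name "$1" | tr -d ' ' | paste -sd: -)
    if [ -z "$ids" ]; then
        return 1
    fi
    echo "afterok:$ids"
}
```

Usage would be something like sbatch --dependency="$(dependency_for_name <JOB_NAME>)" ... on the command line.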
