Default job time limit in Slurm

I want to allow the user scheduling a job to specify any job time limit using -t, --time=<time>. However, when the user does not set a time limit I'd like to impose a default time limit, for example 1 hour. I can't find any setting in slurm.conf to do this.

The default time limit is set per partition. If not specified, the maximum time limit is used:
DefaultTime
Run time limit used for jobs that don't specify a value. If not set then MaxTime will be used. Format is the same as for MaxTime.
Example:
PartitionName=debug Nodes=dev[0-8,18-25] MaxTime=12:00:00 DefaultTime=00:30:00 Default=YES
This will set the maximum wall time for the partition to 12 hours and the default, if not specified by the user, to 30 minutes.
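If you would rather not edit slurm.conf and restart just to try this out, the partition's DefaultTime can usually also be changed and inspected at runtime with scontrol. A sketch, reusing the debug partition from the example above (note that runtime changes are lost on restart unless they are also written to slurm.conf):
# set a 1-hour default for jobs that do not pass -t/--time
scontrol update PartitionName=debug DefaultTime=01:00:00
# verify the current limits
scontrol show partition debug | grep -i time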

You can't set the default time limit twice, right? If the user does not specify a time limit, the job simply runs until it completes or hits the partition's limit. You can read about -t, --time in the sbatch documentation. Anyway, the default time limit is the partition's default time limit, so you can change it as you like.
Here's an example slurm.conf snippet that sets partition time limits:
# slurm.conf file
# for CPU
PartitionName=cpu Nodes=ALL Default=YES MaxTime=INFINITE State=UP
# for GPU
PartitionName=gpu Nodes=ALL MaxTime=INFINITE State=UP
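For the 1-hour default asked about in the question, you would add DefaultTime to each partition line; a sketch based on the snippet above (partition names and limits are illustrative):
# slurm.conf file
# jobs that do not pass -t/--time get a 1-hour limit
PartitionName=cpu Nodes=ALL Default=YES DefaultTime=01:00:00 MaxTime=INFINITE State=UP
PartitionName=gpu Nodes=ALL DefaultTime=01:00:00 MaxTime=INFINITE State=UP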

Related

Is there a way to specify a niceness value per partition of a sbatch command?

I launch a bunch of jobs with the following format:
sbatch -p partitionA,partitionB --nice=${NICE} script_to_run.sh
Is there a way to specify the nice value per partition, or is the way to do this to set a default niceness for each partition and use that?

How to find max number of tasks that I can run using slurm?

I have access to a supercomputer that uses Slurm, but I need one piece of information that I cannot find. How many parallel tasks can I run? I know I can use --ntasks to set the number; e.g. if I have a parallel problem and I want to check it running 1000 processes, I can run it with --ntasks 1000. But what sets the maximum number? The number of nodes, the number of CPUs, or something else?
There is a physical limitation which is the total number of cores available in the cluster. You can check that with sinfo -o%C; the last number in the output will be the total number of CPUs.
There can also be limits defined in the "Quality of Services". You can see them with sacctmgr show qos. Look for the MaxTRES column.
But there can also be administrative limits specific to your user or your account. You can see them with sacctmgr show user $USER withassoc. Look for the MaxCPUMins column.
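Putting those three checks together (the commands are standard Slurm tools, but the output shown here is illustrative, not from a real cluster):
$ sinfo -o%C
CPUS(A/I/O/T)
120/904/0/1024
$ sacctmgr show qos
$ sacctmgr show user $USER withassoc
In the sinfo output the four figures are allocated/idle/other/total CPUs, so the last one (1024 in this made-up example) is the cluster-wide core count that bounds --ntasks.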

Monitor memory usage of each node in a slurm job

My slurm job uses several nodes, and I want to know the maximum memory usage of each node for a running job. What can I do?
Right now, I can ssh into each node and do free -h -s 30 > memory_usage, but I think there must be a better way to do this.
The Slurm accounting will give you the maximum memory usage over time over all tasks directly. If that information is not sufficient, you can set up profiling following this documentation, and you will receive from Slurm the full memory usage of each process as a time series for the duration of the job. You can then aggregate per node, find the maximum, etc.
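For a quick look without setting up profiling, sstat can report the peak resident memory seen so far for a running job, and sacct gives the same after the job finishes. A sketch (the job ID 12345 and the .batch step are placeholders; the field names are standard sstat/sacct format fields):
# while the job is running
sstat -j 12345.batch --format=JobID,MaxRSS,MaxRSSNode
# after the job has finished
sacct -j 12345 --format=JobID,MaxRSS,MaxRSSNode,MaxRSSTask
MaxRSS is per task and MaxRSSNode tells you which node that peak occurred on, so this does not replace the per-node time series you get from profiling.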

Slurm does not allocate the resources and keeps waiting

I'm trying to use our cluster but I have issues. I tried allocating some resources with:
salloc -N 1 --ntasks-per-node=5 bash
but it keeps waiting on:
salloc: Pending job allocation ...
salloc: job ... queued and waiting for resources
or when I try:
srun -N1 -l echo test
it lingers in the waiting queue!
Am I making a mistake, or is there something wrong with our cluster?
It might help to set a time limit for the Slurm job using the --time option; for instance, set a limit of 10 minutes like this:
srun --job-name="myJob" --ntasks=4 --nodes=2 --time=00:10:00 --label echo test
Without a time limit, Slurm will use the partition's default time limit. The issue is that this is sometimes set to infinity or to several days, which can delay the start of the job. To check the partition time limits, use:
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
prod* up infinite 198 ....
gpu up 4-00:00:00 70 ....
From the Slurm docs:
-t, --time=<time>
Set a limit on the total run time of the job allocation. If the requested time limit exceeds the partition's time limit, the job will be left in a PENDING state (possibly indefinitely). The default time limit is the partition's default time limit. When the time limit is reached, each task in each job step is sent SIGTERM followed by SIGKILL.
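It can also help to ask Slurm why the allocation is still pending; the REASON column usually names whatever is blocking it (the job ID and output below are illustrative):
$ squeue -u $USER -o "%.10i %.9P %.8T %.20r"
     JOBID PARTITION    STATE               REASON
     12345      prod  PENDING            Resources
A reason of Resources simply means the cluster is busy, while a reason such as PartitionTimeLimit points at a limit you can work around with --time or by choosing another partition.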

slurm: How can I prevent job's information to be removed?

Using sacct I want to obtain information about my completed jobs.
This answer mentions how we can obtain a job's information.
I have submitted a job named jobName.sh which has jobID 176. After 12 hours and 200 new jobs coming in, I want to check my job's (jobID=176) information, but I obtain slurm_load_jobs error: Invalid job id specified.
scontrol show job 176
slurm_load_jobs error: Invalid job id specified
And the following line returns nothing: sacct --name jobName.sh
I assume there is a time limit for keeping a previously submitted job's information, and that this is why the old jobs' information has been removed. Is there such a limit? How could I set that limit to a very large value in order to prevent the records from being deleted?
Please note that JobRequeue=0 is set in slurm.conf.
Assuming that you are using MySQL to store that data, you can tune, among other things, the purge time in your database configuration file slurmdbd.conf. Here are some examples:
PurgeJobAfter=12hours
PurgeJobAfter=1month
PurgeJobAfter=24months
If not set (default), then job records are never purged.
More info.
The Slurm documentation mentions that:
MinJobAge
The minimum age of a completed job before its record is purged from Slurm's active database. Set the values of MaxJobCount and MinJobAge to ensure the slurmctld daemon does not exhaust its memory or other resources. The default value is 300 seconds. A value of zero prevents any job record purging. In order to eliminate some possible race conditions, the minimum non-zero value for MinJobAge recommended is 2.
In my slurm.conf file, MinJobAge was 300, which is 5 minutes. That's why each completed job's information was removed after 5 minutes. I increased MinJobAge's value in order to keep job records around longer.
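As a sketch, the change amounts to raising MinJobAge in slurm.conf and reloading the controller (86400, i.e. one day, is just an example value; pick whatever retention you need):
# slurm.conf
MinJobAge=86400
scontrol reconfigure   # or restart slurmctld, depending on how your site applies config changes
Keep in mind that MinJobAge only controls how long completed jobs stay visible to scontrol show job; long-term history is what sacct and slurmdbd (with the PurgeJobAfter settings above) are for.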
