Scheduling more jobs than MaxArraySize - slurm

Let's say I have 6233 simulations to run. The commands are generated and stored in a file, one per line. I would like to use Slurm to schedule and run these commands. However, the MaxArraySize limit is 2000, so I can't schedule all of them with a single job array.
One solution is given here, where we create four separate jobs and use arithmetic indexing into the file, with the last job having a smaller number of tasks to run (233).
Is it possible to do this using one sbatch script with one job ID?
I set ntasks=1 when using job arrays. Do larger ntasks help in such situations?
Update:
Following Damien's solution and examples given here, I ended up with the following line in my bash script:
curID=$(( ${SLURM_ARRAY_TASK_ID} * ${SLURM_NTASKS} + ${SLURM_PROCID} ))
The same can be done in Python (as shown on the referenced page); the only difference is that the environment variables have to be read from the environment inside the script.
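As a concrete sketch of that line-index computation (the variable values below are illustrative, set by hand to mimic what Slurm would export inside a job):

```shell
#!/bin/bash
# Illustrative only: compute the 0-based line index that a given
# (array task, process) pair would handle. Inside a real job these
# variables are exported by Slurm; here we set them manually.
SLURM_ARRAY_TASK_ID=3   # would come from --array
SLURM_NTASKS=4          # would come from --ntasks
SLURM_PROCID=2          # would be set by srun for each task

curID=$(( SLURM_ARRAY_TASK_ID * SLURM_NTASKS + SLURM_PROCID ))
echo "$curID"   # 3 * 4 + 2 = 14, a 0-based index into the command file
# sed -n "$((curID + 1))p" commands.txt would then print that command
```

Each (array task, process) pair thus maps to a distinct line, with no gaps or overlaps.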

Is it possible to do this using one sbatch script with one job ID?
No; that solution will give you multiple job IDs.
I set ntasks=1 when using job arrays. Do larger ntasks help in such situations?
Yes, that is a factor that you can leverage.
Each job in the array can spawn multiple tasks (--ntasks=...). In that case, the line number in the command file must be computed from $SLURM_ARRAY_TASK_ID and $SLURM_PROCID, and the program must be started with srun. The tasks within each job of the array run in parallel. How large each job can be depends on the MaxJobSize limit defined for the cluster/partition/QOS you have access to.
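A minimal sbatch sketch of this parallel approach (the file name commands.txt and the sizes 312 x 20 = 6240 >= 6233 are assumptions chosen for illustration, not from the original post):

```shell
#!/bin/bash
#SBATCH --array=0-311     # 312 array jobs, well under MaxArraySize=2000
#SBATCH --ntasks=20       # 312 * 20 = 6240 >= 6233 commands
#SBATCH --job-name=simulations

# srun starts --ntasks processes; each one computes its own line
# number from the array index and its process rank.
srun bash -c '
  line=$(( SLURM_ARRAY_TASK_ID * SLURM_NTASKS + SLURM_PROCID + 1 ))
  cmd=$(sed -n "${line}p" commands.txt)
  [ -n "$cmd" ] && eval "$cmd"   # the last few indices fall past EOF and are skipped
'
```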
Another option is to chain the tasks inside each job of the array with a Bash loop (for i in $(seq ...) ; do ... ; done). In that case, the line number in the command file must be computed from $SLURM_ARRAY_TASK_ID and $i. The tasks within each job of the array run serially. How large each job can be depends on the MaxWall limit defined for the cluster/partition/QOS you have access to.
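A sketch of the chained (serial) variant; again, commands.txt and the chunk size of 4 are assumed for illustration (1559 * 4 = 6236 >= 6233):

```shell
#!/bin/bash
#SBATCH --array=0-1558    # 1559 array jobs, within MaxArraySize=2000
#SBATCH --ntasks=1

CHUNK=4                   # commands run serially per array job
for i in $(seq 0 $(( CHUNK - 1 ))); do
  line=$(( SLURM_ARRAY_TASK_ID * CHUNK + i + 1 ))
  cmd=$(sed -n "${line}p" commands.txt)
  [ -n "$cmd" ] && eval "$cmd"   # skip indices past the end of the file
done
```

The wall-time request must cover CHUNK runs back to back, which is why MaxWall is the binding limit here.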

Related

one input file to yield many output files

This is a bit of a backwards approach to Snakemake, whose main paradigm is "one job -> one output", but I need many parallel reruns of my script on the same input matrix on a Slurm batch-submission cluster. How do I achieve that?
I tried specifying multiple threads, multiple nodes, each time indicating one cpu per task, but it never submits an array of many jobs, just an array of one job.
I don't think there is a nice way to submit an array job like that. In snakemake, you need to specify a unique output for each job. But you can have the same input. If you want 1000 runs of a job:
ids = range(1000)

rule all:
    input: expand('output_{sample}_{id}', sample=samples, id=ids)

rule simulation:
    input: 'input_{sample}'
    output: 'output_{sample}_{id}'
    shell: 'echo {input} > {output}'
If that doesn't help, provide more information about the rule/job you are trying to run.

Is it possible to assign job names to separate workers in a SLURM array via sbatch?

By default, when submitting a SLURM job as an array, all jobs within the array share the same job name. In the docs (here: https://slurm.schedmd.com/job_array.html), it shows that each job in the array can have its name set separately via scontrol (described under the section "Scontrol Command Use").
Can this be done directly from an sbatch script?
I just created an account because I was trying to do this and I did find a solution.
You can use scontrol to change the name of a job, the syntax is the following:
scontrol update job=<job_id> JobName=<new_name>
You can do this manually, but you can also automatically set the name of the job from within the array job, thus automatically assigning a different name to each job in the array.
I find this useful because I'm mostly running calculations in different directories, and if one job runs much longer than the others I want to be able to quickly see where it's running and what's going on.
Of course you could set other things as your job name, as you prefer.
In my case, I add the scontrol command to the script that each array task runs, so that each job gets a name of the form "job_name - directory". The job ID and job name can be retrieved from environment variables.
scontrol update job=$SLURM_JOB_ID JobName="$SLURM_JOB_NAME - $folder"
(Note: use $SLURM_JOB_ID, which is unique to each array task; $SLURM_ARRAY_JOB_ID is shared by the whole array and would rename every task at once.)
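Put together as a minimal sbatch sketch (the $folder variable, the dir_N directory layout, and run_simulation.sh are illustrative assumptions, not from the answer):

```shell
#!/bin/bash
#SBATCH --array=1-10
#SBATCH --job-name=myjob

# Pick a per-task working directory; dir_1 ... dir_10 is assumed here.
folder="dir_${SLURM_ARRAY_TASK_ID}"

# $SLURM_JOB_ID is unique to each array task, so only this task is renamed.
scontrol update job=$SLURM_JOB_ID JobName="$SLURM_JOB_NAME - $folder"

cd "$folder" && ./run_simulation.sh   # hypothetical payload
```

squeue then shows a distinct name per running task, so a long-running outlier can be traced back to its directory at a glance.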

Using option --array as an argument in slurm

Is it possible to use the --array option as an argument? I have an R script that uses arrays, and the number of arrays depends on the file the script is run on. I would like to pass the required number of arrays as an argument on the sbatch my_code.R command line, so that the Slurm script itself never has to be modified: for example, a file with 550,000 columns needs 10 arrays, a file with 1,000,000 columns needs 19 arrays, etc. I need something like sbatch --array=1-nb_of_arrays_needed my_code.R. The goal is to make my code usable by everyone, without the user having to edit the line #SBATCH --array=x-y in the Slurm script.
My R code (I don't show it in full) :
data <- read.table(opt$file, h=T, row.names=1, sep="\t") + 1
ncol <- ncol(data)
nb_arrays <- ceiling(ncol / 55000)  # ceiling, not round: 18.18 must become 19
opt$number <- nb_arrays
...
Best regards
Your R script will start only when the job is scheduled. To be scheduled, it must be submitted, and to be submitted, it must know the argument to --array.
So you have two options:
Either split your R script in two: one part that runs before the job is submitted, and another that runs when the job starts. The first part computes the necessary number of jobs in the array (and can submit the job array automatically), and the second part does the actual calculations.
If you prefer having only one R script, you can differentiate the behaviour based on the presence or absence of the SLURM_JOB_ID variable in the environment: if it is absent, compute the number of jobs and submit; if it is present, do the actual calculations.
The other option is to set --array in the submission job to a large value, and when the first job in the array starts, it computes the number of jobs that are necessary, and cancels the superfluous jobs.
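A sketch of the first option as a small wrapper script (the column-counting one-liner, the 55,000-column chunk size from the question, and the file layout are all assumptions):

```shell
#!/bin/bash
# Hypothetical wrapper: compute the array size from the data file,
# then submit the job array with that size. my_code.R is the script
# from the question.
datafile="$1"

# Number of data columns; assumed here to be readable from the first
# line of a tab-separated file (minus the row-name column).
ncol=$(head -n 1 "$datafile" | awk -F'\t' '{print NF - 1}')

# Ceiling division in pure Bash: (ncol + 55000 - 1) / 55000.
nb_arrays=$(( (ncol + 55000 - 1) / 55000 ))

echo "Submitting array 1-${nb_arrays}"
sbatch --array=1-"$nb_arrays" my_code.R "$datafile"
```

This keeps the #SBATCH --array line out of the Slurm script entirely, so users never have to edit it.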

PBS Pro: setting job array slot limit by the user

With Torque, a user can specify a slot limit when submitting a job array by using %, e.g. qsub job.sh -t 1-20%5 creates a job array with 20 jobs, of which only 5 run simultaneously.
Currently I work with PBS Professional, but unfortunately, as far as I can see, the % option is not supported. How can I achieve behaviour similar to Torque's % as simply as possible?

SLURM: Changing the maximum number of simultaneously running tasks for a running array job

I have set up an array job as follows:
sbatch --array=1-100%5 ...
which limits the number of simultaneously running tasks to 5. The job is now running, and I would like to change this limit to 10 (i.e. I wish I'd run sbatch --array=1-100%10 ...).
The documentation on array jobs mentions that you can use scontrol to change options after the job has started. Unfortunately, it's not clear what this option's variable name is, and I don't think it is listed in the documentation of the sbatch command here.
Any pointers well received.
You can change the array throttling limit with the following command:
scontrol update ArrayTaskThrottle=<count> JobId=<jobID>
