I want to train a neural network on a cluster that uses SLURM to manage jobs. Each submitted job has a time limit of 10 hours, so I need a script that automatically submits sequential jobs: the first job trains from scratch, and each subsequent job reloads the checkpoint from the most recent job and continues training.
I have written the following script. I would like to know if this is okay or if there is any standard way to handle this in SLURM.
#!/bin/bash
Njobs=1000
# Read the configuration variables
# Each training run should have a different config
CONFIG=experiments/model.cfg
source $CONFIG
# Submit first job - no dependencies
j0=$(sbatch run-debug.slurm $CONFIG)
echo "ID of the first job: $j0"
# add first job to the list of jobs
jIDs+=($j0)
# for loop: submit Njobs jobs, where job (i+1) depends on job i
# and job (i+1) (i.e. new_job) resumes from the checkpoint of job i
for i in $(seq 0 $Njobs); do
    # Submit job (i+1) with dependency ('afterok:') on job i
    RESUME_CHECKPOINT=$OUTPUTPATH/$EXPNAME/${jIDs[$i - 1 ]}/checkpoint.pkl
    new_job=$(sbatch --dependency=afterok:${jIDs[$i - 1 ]} run-debug.slurm $CONFIG $RESUME_CHECKPOINT)
    echo "Submitted job $new_job that will be executed once job ${jIDs[$i - 1 ]} has completed with success."
    echo "This task will resume training from $RESUME_CHECKPOINT."
    jIDs+=($new_job)
    echo "List of jobs that have been submitted: $jIDs"
done
Thank you so much in advance for your help!
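For comparison, a minimal sketch of the same chaining idea using sbatch --parsable (which prints only the numeric job ID, see the answer further down) might look like this; it reuses run-debug.slurm, OUTPUTPATH and EXPNAME from the script above and is only an illustration, not a tested replacement:
#!/bin/bash
# Sketch only: chain Njobs training runs, each resuming from the previous job's checkpoint.
Njobs=1000
CONFIG=experiments/model.cfg
source $CONFIG
# First job trains from scratch; --parsable makes sbatch print only the job ID.
prev=$(sbatch --parsable run-debug.slurm $CONFIG)
for i in $(seq 1 $Njobs); do
    RESUME_CHECKPOINT=$OUTPUTPATH/$EXPNAME/$prev/checkpoint.pkl
    # Each job starts only if the previous one finished successfully (afterok).
    prev=$(sbatch --parsable --dependency=afterok:$prev run-debug.slurm $CONFIG $RESUME_CHECKPOINT)
    echo "Job $prev will resume training from $RESUME_CHECKPOINT"
done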
Related
In a previous question I asked how to queue a job B to start after job A, which is done with
sbatch --dependency=after:123456:+5 jobB.slurm
where 123456 is the id for job A, and :+5 denotes that it will start five minutes after job A.
I now need to do this for several jobs. Job B should depend on job A, job C on B, job D on C.
sbatch jobA.slurm will return Submitted batch job 123456, and I will need to pass that job ID to the --dependency call for all but the first job. As I am using a busy cluster, I can't rely on the job IDs simply incrementing by one, as someone might queue a job in between.
As such I want to write a script that takes the job scripts (*.slurm) I want to run as arguments, e.g.
./run_jobs.sh jobA.slurm jobB.slurm jobC.slurm jobD.slurm
The script should then run, for all jobs scripts passed to it,
sbatch jobA.slurm # Submitted batch job 123456
sbatch --dependency=after:123456:+5 jobB.slurm # Submitted batch job 123457
sbatch --dependency=after:123457:+5 jobC.slurm # Submitted batch job 123458
sbatch --dependency=after:123458:+5 jobD.slurm # Submitted batch job 123459
What is an optimal way to do this with bash?
You can use the --parsable option to get the jobid of the previously submitted job:
#!/bin/bash
# Submit the first script and capture only its job ID (--parsable).
ID=$(sbatch --parsable "$1")
shift
# Submit each remaining script to start 5 minutes after the previous one.
for script in "$@"; do
    ID=$(sbatch --parsable --dependency=after:${ID}:+5 "$script")
done
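If each script should only start once the previous one has finished successfully, rather than a fixed five minutes after it starts, the afterok dependency type used elsewhere on this page (e.g. --dependency=afterok:${ID}) can be substituted for after:${ID}:+5.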
I am running slurm job arrays with --array, and I would like to run about 2000 tasks/array items. However this is beyond the cluster's job submission limit of ~500 at a time.
Are there any tips/best practices for splitting this up? I'd like to submit it all at once and still be able to pass the array id arguments 1-2000 to my programs if possible. I think something like waiting to submit pieces of the array might be helpful but I'm not sure how to do this at the moment.
If the limit is on the size of an array:
You will have to split the array into several job arrays. The --array parameter accepts values of the form <START>-<END>, so you can submit four job arrays:
sbatch --array=1-500 ...
sbatch --array=501-1000 ...
sbatch --array=1001-1500 ...
sbatch --array=1501-2000 ...
This way you will bypass the 500-limit and still keep the SLURM_ARRAY_TASK_ID ranging from 1 to 2000.
To ease things a bit, you can write this all in one line like this:
paste -d- <(seq 1 500 2000) <(seq 500 500 2000) | xargs -I {} sbatch --array={} ...
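If the paste/xargs one-liner feels opaque, a plain loop with the same effect (just a sketch; replace ... with your actual submission arguments, as above) would be:
for start in 1 501 1001 1501; do
    sbatch --array=${start}-$((start + 499)) ...
done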
If the limit is on the number of submitted jobs:
Then one option is to have the last job of the array submit the following chunk.
#!/bin/bash
#SBATCH ...
...
...
# When the last task of a 500-task chunk finishes, submit the next chunk;
# the upper bound stops the chain once the full 1-2000 range has been submitted.
if [[ $((SLURM_ARRAY_TASK_ID % 500)) == 0 && $SLURM_ARRAY_TASK_ID -lt 2000 ]] ; then
    sbatch --array=$((SLURM_ARRAY_TASK_ID+1))-$((SLURM_ARRAY_TASK_ID+500)) $0
fi
Note that, ideally, the next chunk should be submitted by the last task of the array to finish, which may or may not be the one with the highest task ID; still, this has worked for all practical purposes in many situations.
Another option is to set up a cron job that monitors the queue and submits each chunk when possible, or to use a workflow manager that will do that for you.
You can also run a script that submits your jobs and sleeps for a few seconds after every 500 submissions. See https://www.osc.edu/resources/getting_started/howto/howto_submit_multiple_jobs_using_parameters
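A minimal sketch of that throttling idea, assuming hypothetical per-task scripts job_1.sh through job_2000.sh rather than a single job array, might be:
#!/bin/bash
# Sketch: pause after every 500 submissions so the queue never sees too many jobs at once.
for i in $(seq 1 2000); do
    sbatch job_${i}.sh            # job_${i}.sh is a placeholder for your task script
    if (( i % 500 == 0 )); then
        sleep 60                  # arbitrary pause; tune it to how quickly your jobs drain
    fi
done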
I submit jobs to a cluster (high-performance computer) using file1.sh and file2.sh.
The content of file1.sh is
qsub job1.sh
qsub job2.sh
qsub job3.sh
...
qsub job999.sh
qsub job1000.sh
The content of file2.sh is
qsub job1001.sh
qsub job1002.sh
qsub job1003.sh
...
qsub job1999.sh
qsub job2000.sh
After typing ./file1.sh in PuTTY, job1 to job1000 are submitted.
Is there an automatic way to run ./file2.sh only after job1000 has completed? Please note that I want ./file2.sh to run only after job1000 has actually finished, not just after it has been submitted.
The reason for doing this is that we can only submit 1000 jobs at a time, and this limit includes both running and queued jobs. Jobs held with -hold_jid still count against the limit of 1000. So I have to wait until all of the first 1000 jobs have finished (not simply been submitted) before I can submit the next 1000 jobs.
Without the 1000-job limit, you could name your first jobs and tell the later jobs to wait until the named jobs have finished. But since all jobs would still be submitted up front, I think you would run into your 1000-job limit anyway.
qsub -N job1 ./a.sh
qsub -N job2 ./b.sh
qsub -hold_jid job1,job2 -N job3 ./c.sh
You could write a shell script that submits the first 1000 jobs, then waits until some of them have finished and submits the next ones. The script can check how many jobs you currently have in the system with something like
qstat -u username | wc -l
If you have fewer than 1000 jobs in the system, the script can submit the next x jobs, where x = 1000 - #SubmittedJobs.
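A rough sketch of such a polling script, assuming the jobs are job1.sh through job2000.sh and a one-minute polling interval (both placeholders):
#!/bin/bash
# Sketch: keep at most ~1000 jobs in the system, submitting more as earlier ones finish.
# Note: qstat -u also prints header lines, so the count is only approximate.
next=1
total=2000
while (( next <= total )); do
    queued=$(qstat -u username | wc -l)
    free=$(( 1000 - queued ))
    for (( i = 0; i < free && next <= total; i++, next++ )); do
        qsub job${next}.sh
    done
    sleep 60   # wait before checking the queue again
done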
Cluster operators vary in what user behaviour they tolerate, so it may be better to ask them whether this is okay. Also, some schedulers give new jobs from heavy users (here, heavy in terms of number of jobs) a lower priority, so your later jobs might spend more time in the queue.
I'm using APScheduler (3.5.3) to run three different jobs. I need to trigger the second job immediately after the completion of the first job, but I don't know the first job's completion time. I have set the trigger type to cron and scheduled it to run every 2 hours.
One way I worked around this is by scheduling the next job at the end of each job. Is there another way to achieve this with APScheduler?
This can be achieved using scheduler events. Check out this simplified example adapted from the documentation (not tested, but should work):
import datetime

from apscheduler.events import EVENT_JOB_ERROR, EVENT_JOB_EXECUTED

# 'scheduler' is your existing APScheduler scheduler instance.
def execution_listener(event):
    if event.exception:
        print('The job crashed')
    else:
        print('The job executed successfully')
        # check that the executed job is the first job
        job = scheduler.get_job(event.job_id)
        if job.name == 'first_job':
            print('Running the second job')
            # look up the second job (assuming it's a scheduled job)
            jobs = scheduler.get_jobs()
            second_job = next((j for j in jobs if j.name == 'second_job'), None)
            if second_job:
                # run the second job immediately
                second_job.modify(next_run_time=datetime.datetime.utcnow())
            else:
                # job not scheduled, add it and run now
                scheduler.add_job(second_job_func, args=(...), kwargs={...},
                                  name='second_job')

scheduler.add_listener(execution_listener, EVENT_JOB_EXECUTED | EVENT_JOB_ERROR)
This assumes you don't know the jobs' IDs but can identify them by name. If you know the IDs, the logic is simpler.
I can only submit 10 jobs to the PBS system at the same time.
I have 99 independent scripts and would like one script that runs all 99 of them with a single command.
jobs1.sh
jobs2.sh
jobs3.sh
.
.
jobs99.sh
The goal is to submit the next 10 jobs only after the previous 10 jobs have finished.
Currently, I'm using sleep to separate every 10 jobs, estimating how much time they need. I know it's not a nice way to do it.
You need to check out the PBS documentation on job dependencies. There are good ways of queuing jobs to run after previous ones have finished. For example:
qsub -W depend=afterok:Job_ID job11.sh
Here, Job_ID is the job ID returned when job1.sh was submitted.
So job11 runs only after job1 has finished. You can elaborate on this idea and set up a loop.
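For instance, a rough sketch of such a loop for the 99 scripts above, where every job in a batch of 10 depends (afterok) on all jobs of the previous batch; it assumes qsub prints just the job ID, which is typical for PBS:
#!/bin/bash
# Sketch: submit jobs1.sh ... jobs99.sh in batches of 10, each batch gated on the previous one.
deps=""
ids=""
for i in $(seq 1 99); do
    if [[ -z $deps ]]; then
        id=$(qsub jobs${i}.sh)                        # first batch: no dependency
    else
        id=$(qsub -W depend=afterok${deps} jobs${i}.sh)
    fi
    ids="${ids}:${id}"
    if (( i % 10 == 0 )); then
        deps=$ids                                     # the next batch waits on this one
        ids=""
    fi
done
Note that this submits all 99 jobs up front; if the 10-job limit also counts queued or held jobs, a polling approach like the qstat loop shown earlier would be needed instead.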