SLURM does not ensure all jobs finish in a bash script despite remaining wall time

I'm executing several python scripts using SLURM in the following format:
General SLURM header (partition, wall-time, etc.)
python script.py scenario_1 &
python script.py scenario_2 &
python script.py scenario_3 &
python script.py scenario_4
I'm discovering that the order in which I list these jobs matters. If scenario_4 (or, more generally, the last job) finishes before the others, the remaining jobs do not complete. I could simply order the jobs by duration, but in many cases I don't know the relative compute times, which makes that ordering imperfect. Is there a way to ensure that SLURM doesn't prematurely kill jobs?

Written like this, the submission script only waits for scenario_4 to terminate.
Slurm considers a job finished when the submission script exits, and the submission script exits as soon as all of its foreground commands are done; commands started in the background with & are not waited for.
Add the wait command at the end, like this:
python script.py scenario_1 &
python script.py scenario_2 &
python script.py scenario_3 &
python script.py scenario_4 &
wait
The wait command forces the submission script to wait for all background jobs to finish before it exits.
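For completeness, a minimal sketch of a full submission script; the #SBATCH values below are placeholders, not the asker's actual settings:
#!/bin/bash
#SBATCH --partition=compute   # placeholder partition
#SBATCH --time=04:00:00       # placeholder wall-time limit
#SBATCH --ntasks=4            # one task per background process

python script.py scenario_1 &
python script.py scenario_2 &
python script.py scenario_3 &
python script.py scenario_4 &

# Block until every background job has finished,
# so the job does not end early.
wait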

Related

send a blocking process to background after a delay

I am trying to write a bash script which launches an external python script, with some conditions.
The goal is to avoid waiting for the python script to complete: instead, wait for a delay (e.g. 30 seconds) and send the process to the background if the python script is still running.
Very simplified python script example:
#!/usr/bin/python
import time
time.sleep(120)
In this case, I would like the bash script to launch the python script, wait for 30 seconds, and send the python process to background (like nohup example.py & would do).
If the python script crashes during the 30-second delay, the error message should be displayed on the terminal.
I cannot modify the python script.
Is it possible to do it in a clean way?
I managed to do the job by using nohup / & and redirecting the output of the python script to a temp file, then reading this file after 30 seconds to check whether there is an error message.
But I am curious to know if there is a better way.
Thanks!
The approach suggested by Jordanm (start the process in the background) points in the right direction. Consider the following bash script, which uses two background jobs inside a bash wrapper.
#!/bin/bash
# Background job 1: a 30-second timer.
sleep 30 &
sleep_pid=$!

# Background job 2: the python script; when it exits, it kills the timer.
(python -c 'import time ; time.sleep(10)' ; kill $sleep_pid 2>/dev/null) &

# Wait on the timer: if it survives the full 30 seconds, python is still running.
if wait $sleep_pid ; then
    echo "Python continues in background ..."
else
    echo "Python completed"
fi
As an alternative, this can be implemented in Python/Perl using a single background process.

Parallel processing or threading in Shell script

I am new to parallel processing. I have some python scripts which I should not change for various reasons. Each of these python scripts uses only one CPU core and does some processing on an input image. I run these python scripts from a shell script, one after another. Can I do parallel threading in the shell script, without touching the python scripts, so that the available CPU cores are used and the overall processing speed increases?
Yes, start them with GNU Parallel.
So, if you want to run your script 10 times, with parameters 0..9:
parallel python yourScript.py {} ::: {0..9}
If you want to see what would run without actually running anything:
parallel --dry-run ...
If you want a progress meter:
parallel --progress ...
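Applied to the asker's image case, a hedged sketch (the script name process_image.py and the .jpg pattern are placeholders for whatever the actual scripts and inputs are):
# Run process_image.py once per .jpg file, one job per CPU core by default.
parallel python process_image.py {} ::: *.jpg

# Limit to 4 simultaneous jobs if memory is a concern.
parallel -j 4 python process_image.py {} ::: *.jpg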

Trigger a script when nohup process terminates

I have some fairly time-consuming python scripts that take ~3 hours or so to run on my machine. I don't want to run them concurrently since that might crash my machine. For a single script I have more than enough memory, but running 5 or so at once might cause an issue. I am running them remotely, so I ssh into my server and run them like this:
nohup python my_script.py > my_output.txt &
That way, if my connection gets interrupted I can re-establish it and my result is right there. I want to run the same python script a couple of times with different command-line arguments, sequentially, so I can run everything I need without having to set up the next run every few hours. I could hard-code all of the arguments into a python script and do it that way, but it seems inelegant, and I don't want to have to fiddle with my python script every time I do this. Is there some sort of listener I could use to trigger the next run when one of them finishes?
I'd suggest writing a bash script that runs the python jobs sequentially:
#!/bin/bash
python3 my_script1.py > my_output1.txt
python3 my_script2.py > my_output2.txt
Then nohup that:
nohup ./driver.sh &
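Since the asker wants the same script run with different arguments, a hedged variant of driver.sh using a loop (the argument values here are placeholders):
#!/bin/bash
# driver.sh: run the same script sequentially, once per argument.
for arg in scenario_a scenario_b scenario_c; do
    python3 my_script.py "$arg" > "my_output_${arg}.txt"
done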
You really want to read up on utilities like tmux or screen and just script the whole thing.
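For example, a minimal tmux sketch (the session name "jobs" is arbitrary):
tmux new -s jobs      # create a detachable session named "jobs"
./driver.sh           # run the driver inside the session
# Detach with Ctrl-b d, log out, then later:
tmux attach -t jobs   # re-attach and check progress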

Bash running in sequence from Cronjob

#!/bin/bash
# My first script
sleep 15 & wait
python pythonFileName.py & wait
python pythonFileName.py & wait
python pythonFileName.py & wait
How do I get it to wait for the previous line to finish executing before moving to the next?
It works fine when I call the bash file directly, but when it is called from a cron job, it executes everything without waiting for the previous command to finish.
sleep works fine with this, but the .py files execute without waiting.
I also tried the following
A; B Run A and then B, regardless of success of A
A && B Run B if A succeeded
A || B Run B if A failed
A & Run A in background.
Are you sure python is in the path when running from cron? It's often the case that your cron job's environment is stripped down compared to your normal shell environment since it doesn't execute the normal environment setup scripts. Does it work if you specify the full path to python?
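For example, a hedged sketch of the same script with the interpreter spelled out (the /usr/bin/python and /home/user/ paths are assumptions; check yours with "which python", and use absolute paths for the .py files too, since cron's working directory may differ):
#!/bin/bash
# My first script, with absolute paths for cron's minimal environment.
# Plain sequential commands already wait for each other; no & needed.
sleep 15
/usr/bin/python /home/user/pythonFileName.py
/usr/bin/python /home/user/pythonFileName.py
/usr/bin/python /home/user/pythonFileName.py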

Running the shell scripts in Background

I would like to give the user the option to run the shell script in the background.
My shell program instantiates a number of other shell scripts.
Here is a small code snippet of my script
./main.sh # Main script
In main.sh I call:
preprocessing.sh
create_dir.sh
handle_file.sh
post_processing.sh
report_generation.sh
I would like to know if I have to start all the child scripts in the background as well. What is the syntax if I have to start all the scripts in the background and, at the end, inform the user with a message that the test run is complete?
Thanks
Kiran
Start your processes in the background with & and then use bash's builtin wait command:
wait [n ...]
    Wait for each specified process and return its termination status.
    Each n may be a process ID or a job specification; if a job spec is
    given, all processes in that job's pipeline are waited for.
A couple of examples are available here. For instance:
# wait on 2 processes
sleep 10 &
sleep 10 &
wait %1 %2 && echo "Completed!"
add "&" to the end of the commands
I call preprocessing.sh &
create_dir.sh &
...
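Putting the two answers together, a minimal sketch of main.sh, assuming the child scripts are independent enough to run concurrently (if they must run in order, keep them sequential and only background main.sh itself):
#!/bin/bash
# Start every child script in the background ...
./preprocessing.sh &
./create_dir.sh &
./handle_file.sh &
./post_processing.sh &
./report_generation.sh &

# ... then wait for all of them before telling the user.
wait && echo "Test run is complete."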