Avoid duplicate execution of same python script in different location when invoked by cron - python-3.x

I have two copies of the same Python file in different locations, and these scripts are invoked and executed at random times. If one script is executing, I want the other script to wait until that job is finished. How can I get this done?

You can have a global flag somewhere, such as in a DB, and change your code so that script1 only runs when the flag is True and script2 only runs when the flag is False. You should update the value of the flag in both scripts.
Or you can use a trigger: trigger script2 from script1 after its work is finished.
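One common way to get the same mutual exclusion without a DB is a shared lock file that both copies open and lock. This is a minimal sketch, not part of the answer above, and it assumes both copies can see the same hypothetical /tmp/myjob.lock path:
import fcntl

# Hypothetical shared lock file; both copies of the script must use the same path.
LOCK_PATH = "/tmp/myjob.lock"

with open(LOCK_PATH, "w") as lock_file:
    # Blocks here until the other copy releases the lock, then proceeds.
    fcntl.flock(lock_file, fcntl.LOCK_EX)
    # ... do the actual work while holding the lock ...
# The lock is released when the file is closed or the process exits.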

Related

Set global variable value within a job in gitlab yml file

I have two jobs: the first one is "test" and the second one is "push". The test job is allowed to fail (allow_failure: true). I only want to run the push job if the test job succeeds.
One option is to save the variable to a file and pass it through an artifact. But what I'm interested in is whether there's a way to achieve this without the file, like having a global variable and updating its value in the test job if it succeeds, but apparently modifying global variables from the job scope is not possible. Any suggestions?

How to get SLURM_ARRAY_TASK_ID when submitting job on command line

I want to be able to select a file or dir based on the TASK_ID.
Ideally like this
sbatch -o %a.log --array=1-10 script.sh data/%a
But %a is only meant to be used for log files, and can't be passed to the script at runtime.
The docs mention $SLURM_ARRAY_TASK_ID, but that requires modifying the script, because the variable is only set once the job is running (not when I submit it with sbatch)!
Is there really no nice way to do this?
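For illustration: since $SLURM_ARRAY_TASK_ID only exists in the job's environment once the array task is running, the job script can read it at runtime and derive its input path. A minimal sketch, assuming the job script were a Python program rather than the script.sh in the question:
#!/usr/bin/env python3
import os
import sys

# SLURM_ARRAY_TASK_ID is injected by Slurm into the job's environment at runtime.
task_id = os.environ.get("SLURM_ARRAY_TASK_ID")
if task_id is None:
    sys.exit("SLURM_ARRAY_TASK_ID is not set; not running as an array task")

data_dir = os.path.join("data", task_id)  # e.g. data/1 ... data/10 for --array=1-10
print(f"processing {data_dir}")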

Maintain a session across multiple instances of app when called from same shell

I'm trying to have data (generated by an application only after its launch) persisted across multiple invocations of an application, but only when they're started from the same shell session.
One possible way to do that would be to pass the data back from the application to the calling shell, but since environment variable changes are only passed from parent to child, I don't know how to implement that.
Practical example:
There is a job command that creates a subdirectory named with the current datetime and does its work inside it. Sometimes the job needs to be killed and restarted, so it needs to know which directory to resume in, like job --resume 21Jan_1849/data. I would like to save 21Jan_1849/data automatically so I don't have to look it up and type it each time I need to resume a job. If I created something like a .last_job file and then wanted to restart a job in another session, it could resume the wrong (last) job, so plain files are not a solution (AFAIK).
How can this be done?
Since you're only trying to target Linux, there are a fair number of tricks available here. Consider this one:
#!/usr/bin/env bash
current_boot_id=$(</proc/sys/kernel/random/boot_id)
# honor myprog_shell_pid if set and valid, fall back to PPID otherwise
if [[ $myprog_shell_pid ]] && [[ -e /proc/$myprog_shell_pid/stat ]]; then
parent_pid=$myprog_shell_pid
else
parent_pid=$PPID
fi
# field 22 of /proc/<pid>/stat is the parent's start time, in clock ticks since boot
parent_start_time=$(awk '{print $22}' "/proc/$parent_pid/stat")
mkdir -p "$HOME/.cache/myscript-sessions"
# one session file per (boot id, shell PID, shell start time) combination
data=$HOME/.cache/myscript-sessions/${current_boot_id}:${parent_pid}:${parent_start_time}
Now, we have a data file name that changes:
When we're rebooted (because current_boot_id is updated).
If we're run from a different shell (because our PPID changes).
If we're run from a different shell with the same PID (because the start time for the parent PID will be different).
...and you can easily delete files with the wrong boot id (because the system rebooted), or with names that refer to PID/start-time combinations that don't exist.
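For that cleanup, a hypothetical Python helper (not part of the answer, mirroring the awk '{print $22}' field lookup above) could prune stale session files like this:
#!/usr/bin/env python3
# Hypothetical cleanup helper: delete session files from old boots or dead shells.
from pathlib import Path

sessions = Path.home() / ".cache" / "myscript-sessions"
current_boot_id = Path("/proc/sys/kernel/random/boot_id").read_text().strip()

for entry in sessions.iterdir():
    boot_id, pid, start_time = entry.name.split(":")
    stat_path = Path("/proc") / pid / "stat"
    if boot_id != current_boot_id or not stat_path.exists():
        entry.unlink()   # stale boot, or the shell is gone
    elif stat_path.read_text().split()[21] != start_time:
        entry.unlink()   # PID was reused by a different process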
One caveat is that by default, this is sensitive to being called by subshells (output=$(./yourprog) will have a different PPID than ./yourprog will), but if the parent shell runs export myprog_shell_pid=$$, that issue goes away.
You're crossing over into territory where you need a simple job-management engine instead of just the shell. Using make and writing Makefiles is probably the simplest way to set this up. You can write a rule that tells how to turn a stage-1 file into a stage-2 file based on file extension, and then make will know how far things got and how to resume next time you run it.

Nextflow error in 1 process stops all other processes

When I submit a job with Nextflow, one of the processes fails because there's a corrupted file. Obviously I can remove that file from the job list, but I don't want this to happen in the future when I scale it up. By default this stops all the other processes (9 of them) from running and the Nextflow job finishes.
How do I stop this one failed job from affecting the others?
I found the answer after some more digging in the docs here (https://www.nextflow.io/docs/latest/process.html#errorstrategy). I needed to add errorStrategy 'finish' to my process
process ignoreAnyError {
    errorStrategy 'finish'

    script:
    <your command string here>
}

How to completely exit a running asyncio script in python3

I'm working on a server bot in python3 (using asyncio), and I would like to incorporate an update function for collaborators to instantly test their contributions. It is hosted on a VPS that I access via ssh. I run the process in tmux and it is often difficult for other contributors to relaunch the script once they have made a commit, etc. I'm very new to python, and I just use what I can find. So far I have used subprocess.Popen to run git pull, but I have no way for it to automatically restart the script.
Is there any way to terminate a running asyncio loop (ideally without errors) and restart it again?
You cannot start an event loop that has been stopped by event_loop.stop().
And in order to incorporate the changes, you have to restart the script anyway (some methods might not exist on the objects you already have, etc.).
I would recommend something like:
import asyncio
import sys

async def git_tracker():
    # check for changes in version control, maybe wait for a sync point and then:
    sys.exit(0)

asyncio.ensure_future(git_tracker())
This raises SystemExit, but despite that it exits the program cleanly.
And around the python $file.py invocation, put a loop: while true; do git pull && python $file.py; done
This is (as far as I know) the simplest approach to solve your problem.
For your use case, to stay on the safe side, you would probably need to kill the process and relaunch it.
See also: Restart process on file change in Linux
As a necromancer, I thought I'd give an up-to-date solution which we use in our UNIX system.
Using the os.execl function you can tell python to replace the current process with a new one:
These functions all execute a new program, replacing the current process; they do not return. On Unix, the new executable is loaded into the current process, and will have the same process id as the caller. Errors will be reported as OSError exceptions.
In our case, we have a bash script which executes killall python3.7, sending the SIGTERM signal to our Python apps, which in turn listen for it via the signal module and gracefully shut down:
loop = asyncio.get_event_loop()
loop.call_soon_threadsafe(loop.stop)
sys.exit(0)
The script then starts the apps in the background and finishes.
Note that killall python3.7 will send a SIGTERM signal to every python3.7 process!
When we need to restart, we just run the following command:
os.execl("./restart.sh", 'restart.sh')
The first parameter is the path to the file and the second is the name of the process.
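Putting the pieces together, here is a minimal sketch of how this could be wired up inside the app. The restart.sh name and the loop.stop/sys.exit lines come from the answer above; the handler registration and the handle_sigterm/restart names are assumptions for illustration, not the answer's exact code:
import asyncio
import os
import signal
import sys

loop = asyncio.get_event_loop()

def handle_sigterm(signum, frame):
    # killall python3.7 (run by restart.sh) lands here; stop the loop and exit cleanly.
    loop.call_soon_threadsafe(loop.stop)
    sys.exit(0)

signal.signal(signal.SIGTERM, handle_sigterm)

def restart():
    # Replace the current process with restart.sh; os.execl never returns.
    os.execl("./restart.sh", "restart.sh")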
