how do i submit multiple jobs with nohup? - linux

I have several jobs which I would like to submit in a certain order.
First, multiple jobs like cal1st_1.sh cal1st_2.sh ... cal1st_10.sh are submitted at once and processed simultaneously. After all of them have finished, I run a job called post_process.sh. Then come multiple jobs like cal2nd_1.sh, cal2nd_2.sh ... cal2nd_10.sh, then post_process.sh again. I need to do this four times.
I tried to write a script get_final.sh like this:
#!/bin/bash
./cal1st_1.sh & ./cal1st_2.sh & ./cal1st_3.sh...... & ./cal1st_10.sh
./post_process.sh
./cal2nd_1.sh & ./cal2nd_2.sh & ./cal2nd_3.sh...... & ./cal2nd_10.sh
./post_process.sh
and ran it with the command nohup ./get_final.sh &, but it doesn't seem to work: sometimes post_process.sh started even though the cal1st*sh jobs hadn't all finished, and sometimes the cal1st*sh jobs weren't processed simultaneously. Can someone tell me which part of my code is wrong? If you have any questions about my code, please leave a comment.
EDIT
I wrote a script get_final.sh like this. Do you think it will work? Should I just execute it with nohup ./get_final.sh &?
#!/bin/bash
pre_cal.sh
for i in `seq 1 13`; do
cal_dis_vel_strain_$i.sh &
done
wait
echo 1 > record
./post_cal.sh
...

Can someone tell me which part of my code is wrong?
What is wrong is your assumption that the "post-process" tasks won't be started until the previous "cal" processes have finished. That's not how & works. What & does is leave the child process running in the background without waiting for it to finish. In your script only the last job on each line has no &, so post_process.sh starts as soon as that one job exits, regardless of the other nine.
The way to do what you are trying to do is to use the builtin wait command, as described here:
How to wait in bash for several subprocesses to finish and return exit code !=0 when any subprocess ends with code !=0?
http://www.unixcl.com/2008/06/bash-wait-command.html
In this case, you will need to wait for each of the backgrounded processes (in any order).
(The problem is nothing to do with nohup.)
In response to your followup:
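That is essentially right. Called with no arguments, the bash builtin wait blocks until all currently running background children have finished, so the single wait after your loop covers all 13 jobs. Waiting on each PID individually (wait PID) is only needed if you want each job's exit status. Note that pre_cal.sh and cal_dis_vel_strain_$i.sh are called without ./, so they will only be found if their directory is on your PATH.
You can then call it like that. But you only need nohup if there is a possibility that your session will be disconnected before the script finishes. Another alternative is the screen program, which allows you to detach and reattach the session.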
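A minimal sketch of the staged pattern, assuming bash: each stage starts its jobs with &, records every PID from $!, and waits on each one so that failures can be counted before the post-processing step. run_job and run_stage are hypothetical stand-ins for the real cal*.sh scripts, and job 3 of the second stage fails on purpose to show the exit status being picked up.

```shell
#!/bin/bash
# Hypothetical stand-in for ./cal1st_N.sh and ./cal2nd_N.sh.
# Job 3 of the cal2nd stage fails on purpose.
run_job() {
    sleep 0.1
    [ "$1:$2" != "cal2nd:3" ]
}

# Start five jobs of one stage in parallel, then wait for every PID.
# The number of failed jobs is left in the global stage_failures.
run_stage() {
    local pids=() pid i
    stage_failures=0
    for i in 1 2 3 4 5; do
        run_job "$1" "$i" &
        pids+=("$!")              # $! is the PID of the job just started
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || stage_failures=$((stage_failures + 1))
    done
}

run_stage cal1st
first_failed=$stage_failures
echo "cal1st: $first_failed failed"   # ./post_process.sh would run here

run_stage cal2nd
second_failed=$stage_failures
echo "cal2nd: $second_failed failed"  # ./post_process.sh would run here
```

Whether to run post_process.sh after a stage with failures is a design choice; the sketch only reports the count.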

Related

cron job to execute programs one after the other?

Have six scripts which I want to execute once a day using following logic:
s11, s12 can start and run parallel
s21, s22 should only start after s11 and s12 finished. Both can run parallel
s31, s32 should only start after s21 and s22 finished. Both can run parallel
So far I did it by starting a daily masterscript m by cron.
m started all six scripts s11-s32, s11 and s12 did their job directly but the others looked every minute in a counter file and only if the counter had the right value they started the real job. Each script changed the counter before closing, this was the handover to the next script generation.
But for other reasons my server was so busy that the new cron run started m before yesterday's scripts had finished, and I screwed up my data.
I assume others have had similar problems and know a small library or some other way to get this done properly and reliably. Above all, a new series shouldn't start before the old one has finished.
Thanks in advance for any hints!
Here is an example of a master script in bash. I use wait to wait for the completion of the
scripts that were started in the background with &.
This example assumes:
that all your scripts are under the folder /home/me/myproject/
that you have a logs folder where you want to capture some output
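#!/bin/bash
cd /home/me/myproject/ || exit 1
# Stage 1: s11 and s12 run in parallel, with their output captured in logs/
bin/s11 > logs/s11_stdout.log 2> logs/s11_stderr.log &
bin/s12 > logs/s12_stdout.log 2> logs/s12_stderr.log &
wait   # block until both have finished
# Stage 2
bin/s21 &
bin/s22 &
wait
# Stage 3
bin/s31 &
bin/s32 &
wait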
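The master script above handles the ordering, but not the case from the question where cron starts a new run while the previous one is still active. One common way to guard against that is a lock; the sketch below uses mkdir, which is atomic, so only one caller can create the directory. LOCKDIR and run_master are hypothetical names, and the command passed to run_master stands in for the staged logic.

```shell
#!/bin/bash
# Hypothetical lock location; any path writable by the cron user would do.
LOCKDIR="${TMPDIR:-/tmp}/master_m.lock.d"

# run_master <command...>: run the command unless another run holds the lock.
run_master() {
    # mkdir is atomic: only one caller can create the directory.
    if ! mkdir "$LOCKDIR" 2>/dev/null; then
        echo "skipped: previous run still active"
        return 1
    fi
    "$@"                      # the staged s11..s32 logic would go here
    local status=$?
    rmdir "$LOCKDIR"          # release the lock
    return "$status"
}
```

A crash between mkdir and rmdir leaves a stale lock that has to be removed by hand. Where flock(1) from util-linux is available, starting the job as flock -n /path/to/lockfile /path/to/m avoids that, because the kernel drops the lock when the process exits.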

Using nohup to help run a loop of python code while disconnecting from ssh

I'm looking for help running a python script that takes some time to run.
It is a long-running process that takes about 2 hours per test observation. For example, these observations could be the 50 states of the USA.
I don't want to babysit this process all day. I'd like to kick it off and then drive home from work, or have it run while I'm sleeping.
Since this is a loop, I would need to call one Python script that loops through my code going over each of the 50 states, and a second that runs my actual code that does things.
I've heard of nohup, but I have very limited knowledge. I saw nohup ipython mypython.py, but when I google I get a lot of other people chiming in with other methods, so I don't know what the ideal approach is for someone like me. Additionally, I am essentially looking to run this as a loop, so I don't know how that complicates things.
Please give me something simple and easier to understand. I don't know linux all that well or I wouldn't be asking as this seems like a common sort of command/activity...
Basic example of my code:
Two files: code_file.py and loop_file.py
Code_file.py does all the work. Loop file just passes in the list of things to run the stuff for.
code_file.py
output = each_state + ' needs some help!'
print output
loop_file.py
states = ['AL','CO','CA','NY','VA','TX']
for each_state in states:
code_file.py
Regarding the loop - I have also heard that I can't pass in parameters or something via nohup? I can fix this part within my python code....for example reading from a CSV in my code and deleting the current record from that CSV file and then re-writing it out...that way I can always select the top record in the CSV file for my loop (the list of states)
Maybe you could modify your loop_file.py like this:
import os
states = ['AL','CO','CA','NY','VA','TX']
for each_state in states:
    # pass the state as a command-line argument; code_file.py can read it from sys.argv[1]
    os.system("python /dir_of_your_code/code_file.py " + each_state)
Then in a shell, you could run loop_file.py with:
nohup python loop_file.py &  # nohup keeps the job running after you log out and, by default, sends its output to nohup.out instead of the screen; the trailing & runs it in the background so you get your prompt back
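On the worry about parameters: nohup passes any arguments after the command name through unchanged, and redirecting the output yourself replaces the default nohup.out. A small sketch, in which the sh -c one-liner and the temporary log file are stand-ins for the real Python script and a log path of your choice:

```shell
#!/bin/bash
logfile=$(mktemp)     # stand-in for a log file of your choice

# Arguments after the command name are passed through nohup unchanged.
# The sh -c one-liner is a stand-in for: python loop_file.py AL CO CA
nohup sh -c 'for s in "$@"; do echo "$s needs some help!"; done' loop AL CO CA \
    > "$logfile" 2>&1 &
job_pid=$!

wait "$job_pid"       # only for this demo; normally you would just log out
cat "$logfile"
```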

BASH: Run script.sh in background and output that instance's pid to file?

I'm sure this is probably a shamefully daft question, but it feels like I've done two laps of the web reading things about process management, output redirection etc. but I'm struggling to make sense of it all enough to achieve what I'd like.
In essence:
I want to run a shell script (with some arguments passed when it's called) in the background; and as it runs, write the pid of that instance to a specified file.
./offtimer.sh device1 10 &
of course runs the script in the background, and outputs that instance's pid to the screen; I'd just like to get that number into a file (eg. device1.pid).
I know (by reading as well as trial & error, I promise!) that
./offtimer.sh device1 10 & > device1.pid
isn't valid syntax; but despite thread after thread I've read, I can't figure out a way to do it.
Sincerely grateful for any help!
R
You can access the last child process ID using $!.
./offtimer.sh device1 10 &
echo $! > device1.pid
$! is the last background process's ID.
./offtimer.sh device1 10 &
echo $! > device1.pid
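If the script itself prints its own PID (for example with echo $$ as its first line), you can redirect its stdout instead:
./offtimer.sh device1 10 > device1.pid &
Alternatively, the script could write its PID to the file itself rather than printing it to stdout.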
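A short sketch of how the PID file might be used afterwards, for example to check whether that instance is still running and to stop it. Here sleep 30 is a stand-in for ./offtimer.sh device1 10, and the temporary directory is a stand-in for wherever the .pid files are kept.

```shell
#!/bin/bash
piddir=$(mktemp -d)                 # stand-in for the directory holding .pid files

sleep 30 &                          # stand-in for: ./offtimer.sh device1 10 &
echo $! > "$piddir/device1.pid"     # $! is the PID of the job just started

pid=$(cat "$piddir/device1.pid")

# kill -0 sends no signal; it only tests whether the process exists.
if kill -0 "$pid" 2>/dev/null; then
    alive=yes
else
    alive=no
fi
echo "device1 timer running: $alive"

kill "$pid"                         # stop that instance
wait "$pid" 2>/dev/null || true     # reap it; the status is non-zero because it was killed
rm -f "$piddir/device1.pid"
```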

Run bash shell in parallel and wait

I have 100 files in a directory, and want to process each one with several steps, while step1 is time-consuming. So the pseudocode is like:
for filename in ~/dir/*; do
run_step1 filename >${filename}.out &
done
for outfile in ~/dir/*.out; do
run_step2 outfile >${outfile}.result
done
My question is how can I check if step1 is complete for a given input file. I used to use threads.join in C#, but not sure if bash shell has equivalent.
It looks like you want:
for filename in ~/dir/*
do
    (
        run_step1 "$filename" > "$filename.out"
        run_step2 "$filename.out" > "$filename.result"
    ) &
done
wait
This processes each file in a separate sub-shell, running first step 1 then step 2 on each file, but processing multiple files in parallel.
About the only issue you'll need to worry about is ensuring you don't try running too many processes in parallel. You might want to consider GNU parallel.
You might want to write a trivial script (doit.sh, perhaps):
run_step1 "$1" > "$1.out"
run_step2 "$1.out" > "$1.result"
and then invoke that script from parallel, one file per invocation.
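With GNU parallel installed, the invocation would look like parallel ./doit.sh ::: ~/dir/*. Where it is not installed, xargs -P (supported by GNU and BSD xargs, though not required by POSIX) gives a similar cap on concurrency. The sketch below uses xargs so that it runs without extra software; the temporary directory, the file names, and the tr/wc bodies of the two steps are all stand-ins.

```shell
#!/bin/bash
indir=$(mktemp -d)                     # stand-in for ~/dir
script=$(mktemp)                       # stand-in for doit.sh

for n in 1 2 3 4 5 6; do echo "data $n" > "$indir/file$n"; done

# doit.sh: step 1 then step 2 for a single file (tr and wc are placeholders).
cat > "$script" <<'EOF'
tr a-z A-Z < "$1" > "$1.out"
wc -c < "$1.out" > "$1.result"
EOF

# Run at most 3 copies at a time, one file per invocation.
# The name pattern matches only the inputs, not the .out/.result files being created.
find "$indir" -type f -name 'file[0-9]' -print0 |
    xargs -0 -n 1 -P 3 sh "$script"

ls "$indir"
```

xargs returns only after every invocation has finished, so no separate wait is needed.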
Try this:
declare -a PROCNUMS
ITERATOR=0
for filename in ~/dir/*; do
    run_step1 "$filename" > "$filename.out" &
    PROCNUMS[$ITERATOR]=$!   # remember the PID of this background job
    let "ITERATOR=ITERATOR+1"
done
ITERATOR=0
for outfile in ~/dir/*.out; do
    wait "${PROCNUMS[$ITERATOR]}"   # wait for the job that produces this .out file
    run_step2 "$outfile" > "$outfile.result"
    let "ITERATOR=ITERATOR+1"
done
This builds an array of the PIDs of the created processes, then waits for each one in order just before its output is needed. Note that it relies on a one-to-one relationship between input and output files, on both globs sorting in the same order, and on the directory not being changed while the script is running.
Note: for a small performance boost you can now run the second loop asynchronously too if you like, assuming each file is independent.
I hope this helps, but if you have any questions please comment.
The Bash builtin wait can wait for a specific background job or all background jobs to complete. The simple approach would be to just insert a wait in between your two loops. If you'd like to be more specific, you could save the PID for each background job and wait PID directly before run_step2 inside the second loop.
After the loop that executes step1, you could write another loop that executes the fg command, which moves the most recently backgrounded process into the foreground.
You should be aware that fg can return an error if a process has already finished. It also needs job control, which bash disables in non-interactive scripts unless you enable it with set -m.
After the loop of fg calls you can be sure that every step1 has finished.

Handle "race-condition" between 2 cron tasks. What is the best approach?

I have a cron task that runs periodically. This task depends on a condition to be valid in order to complete its processing. In case it matters this condition is just a SELECT for specific records in the database. If the condition is not satisfied (i.e the SELECT does not return the result set expected) then the script exits immediately.
This is bad as the condition would be valid soon enough (don't know how soon but it will be valid due to the run of another script).
So I would like somehow to make the script more robust. I thought of 2 solutions:
1. Put in a while loop that sleeps repeatedly until the condition is valid. This should work, but it has the downside that once the script is in the loop, it is out of control. So I thought that, after waking up, it could additionally check whether a specific file exists. If it does, the script "understands" that the user wants to force it to stop.
2. Once the script figures out that the condition is not valid yet, it appends a second script to the crontab and stops. That second script continually polls for the condition and, once the condition is valid, restarts the first script so it can resume its processing. This solution seems workable to me, but I am not sure it is a good one. E.g. perhaps programmatically modifying the crontab is a bad idea?
Anyway, I thought that perhaps this problem is common and could have a standard solution, much better than the 2 I came up with. Does anyone have a better proposal? Which of my ideas would be best? I am not very experienced with cron tasks, so there could be problems I am overlooking.
Instead of programmatically appending to the crontab, you might want to consider using at to schedule the job to run again at some time in the future. If the script determines that it cannot do its job now, it can simply schedule itself to run again a few minutes (or a few hours, as the case may be) later by way of an at command.
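As an alternative to rescheduling, the polling idea from option 1 of the question can be bounded so that it never runs out of control: give up after a fixed number of tries, and stop early if a stop file appears. A minimal sketch, in which wait_for_condition, its argument order, and its return codes are all hypothetical:

```shell
#!/bin/bash
# wait_for_condition <max_tries> <interval_seconds> <stopfile> <command...>
# Returns 0 when the command succeeds, 1 after max_tries failures,
# and 2 if the stop file exists (the user asked the script to stop).
wait_for_condition() {
    local tries=$1 interval=$2 stopfile=$3
    shift 3
    local n
    for ((n = 0; n < tries; n++)); do
        if [ -e "$stopfile" ]; then
            return 2
        fi
        if "$@"; then
            return 0
        fi
        sleep "$interval"
    done
    return 1
}

# Example call (check_db_ready is a hypothetical command wrapping the SELECT):
# wait_for_condition 30 60 /tmp/stop_polling check_db_ready && do_the_real_work
```

Capping the number of tries also keeps one cron run from overlapping with the next, provided tries times interval is shorter than the cron period.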
Following up from our conversation in comments, you can take advantage of conditional execution in a cron entry. Supposing you want to branch based on time of day, you might use the output from date.
For example, this would always invoke the first command, then invoke the second command only if the clock hour is currently 11 (in an actual crontab line, write the % as \%, because cron treats an unescaped % as a newline):
echo 'ScriptA running' ; [ "$(date +%H)" = 11 ] && echo 'ScriptB running'
More examples!
To check the return value from the first command:
echo 'ScriptA' ; [ $? = 0 ] && echo 'ScriptB'
To instead check the STDOUT, you can use a colon as a no-op and branch by capturing output with the same $() construct we used with date:
: ; [ "$(echo 'ScriptA')" = 'ScriptA' ] && echo 'ScriptB'
One downside on the last example: STDOUT from the first command won't be printed to the console. You could capture it to a variable which you echo out, or write it to a file with tee, if that's important.
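The variable-capture variant mentioned above might look like the following once it outgrows a one-liner. script_a, script_b, and run_chain are hypothetical stand-ins for the real commands and for a small wrapper script called from cron.

```shell
#!/bin/bash
script_a() { echo 'ScriptA'; }     # hypothetical stand-ins for the real commands
script_b() { echo 'ScriptB'; }

run_chain() {
    local out
    out=$(script_a)                # capture STDOUT of the first command
    echo "$out"                    # ...and still print it
    if [ "$out" = 'ScriptA' ]; then
        script_b                   # runs only when the captured output matches
    fi
}

run_chain
```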

Resources