Run bash shell scripts in parallel and wait - Linux

I have 100 files in a directory, and I want to process each one in several steps, where step 1 is time-consuming. The pseudocode looks like this:
for filename in ~/dir/*; do
    run_step1 "$filename" > "${filename}.out" &
done
for outfile in ~/dir/*.out; do
    run_step2 "$outfile" > "${outfile}.result"
done
My question is: how can I check whether step 1 is complete for a given input file? I used to use Thread.Join in C#, but I'm not sure whether the bash shell has an equivalent.

It looks like you want:
for filename in ~/dir/*
do
    (
        run_step1 "$filename" > "${filename}.out"
        run_step2 "${filename}.out" > "${filename}.result"
    ) &
done
wait
This processes each file in a separate sub-shell, running step 1 first and then step 2 on each file, while processing multiple files in parallel.
About the only issue you'll need to worry about is ensuring you don't try running too many processes in parallel. You might want to consider GNU parallel.
You might want to write a trivial script (doit.sh, perhaps):
run_step1 "$1" > "$1.out"
run_step2 "$1.out" > "$1.result"
and then invoke that script from parallel, one file per invocation.
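For instance, a possible invocation with GNU parallel might look like this (assuming doit.sh has been made executable; the -j value is just an illustrative cap on how many files are processed at once):
parallel -j 8 ./doit.sh ::: ~/dir/*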

Try this:
declare -a PROCNUMS
ITERATOR=0
for filename in ~/dir/*; do
    run_step1 "$filename" > "${filename}.out" &
    PROCNUMS[$ITERATOR]=$!
    let "ITERATOR=ITERATOR+1"
done
ITERATOR=0
for outfile in ~/dir/*.out; do
    wait "${PROCNUMS[$ITERATOR]}"
    run_step2 "$outfile" > "${outfile}.result"
    let "ITERATOR=ITERATOR+1"
done
This builds an array of the PIDs of the created processes and then waits for each one, in order, as it needs to be completed. Note that it relies on there being a one-to-one relationship between input and output files, and on the directory not changing while the script is running.
Note that for a small performance boost you could now run the second loop asynchronously too, if you like, assuming each file is independent.
I hope this helps, but if you have any questions please comment.

The Bash builtin wait can wait either for a specific background job or for all background jobs to complete. The simple approach would be to insert a single wait between your two loops. If you'd like to be more specific, you could save the PID of each background job and run wait PID directly before run_step2 inside the second loop.
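A minimal sketch of that per-PID variant (this assumes bash 4+ for the associative array; run_step1 and run_step2 are the commands from the question):
declare -A pids
for filename in ~/dir/*; do
    run_step1 "$filename" > "${filename}.out" &
    pids["$filename"]=$!                   # remember the PID of step 1 for this file
done
for filename in "${!pids[@]}"; do
    wait "${pids[$filename]}"              # block until step 1 has finished for this file
    run_step2 "${filename}.out" > "${filename}.out.result"
done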

After the loop that starts step 1, you could write another loop that runs the fg command, which moves the most recently backgrounded process into the foreground (and waits for it).
Be aware that fg can return an error if the process has already finished.
After the loop of fg calls you can be sure that all of the step 1 jobs have finished.
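A rough sketch of that idea, assuming the script is run from a terminal with bash and uses set -m to enable job control (errors from jobs that have already exited are simply ignored, per the caveat above):
#!/bin/bash
set -m                                   # job control is off in scripts by default; fg needs it
njobs=0
for filename in ~/dir/*; do
    run_step1 "$filename" > "${filename}.out" &
    njobs=$((njobs + 1))
done
for ((i = 1; i <= njobs; i++)); do
    fg "%$i" 2>/dev/null                 # wait for job i; ignore the error if it has already exited
done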

Related

Detect data flowing through a port in a bash script

I have data flowing through a Linux box and a custom command that prints the data as it flows to STDOUT (the screen). I want to detect if data is flowing and restart some processes if it's not.
Let's say my test file is "flowchk.sh". How do I use that in a conditional statement in a shell script? My plan so far has been to push the data to a file then check to see if the file has any data in it:
timeout 5s flowchk.sh > anythinghere
FILENAME=./anythinghere
MAXSIZE=5000
FILESIZE=$(stat -c%s "$FILENAME")
if (( FILESIZE > MAXSIZE )); then
    echo "all ok"
else
    restarteverything!
fi
This has run into problems because the timeout command doesn't terminate properly when using my flowchk script (it never returns to the command prompt). So I either need help figuring out how to stop flowchk's execution after a period of time (otherwise it will run forever) so that I can test the temp file to see whether there's anything in it, OR I need to know whether there's a better way to approach this problem and I'm wasting my time.
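One way the plan above might be fleshed out (a sketch, assuming GNU coreutils timeout, whose -k option sends SIGKILL if the command is still alive after the first signal; [ -s file ] tests for a non-empty file):
# give flowchk.sh 5 seconds, and force-kill it 2 seconds later if it ignores the first signal
timeout -k 2 5s ./flowchk.sh > ./anythinghere
if [ -s ./anythinghere ]; then
    echo "data is flowing"
else
    # no data captured within the window; restart the processes here
    restarteverything!
fi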

Find out ID of 'at' job from within it

When I schedule a job with 'at' it is assigned an id, viz:
job 44 at 2014-01-28 17:30
When that job runs I would like to get at that id from within it. This is on CentOS, FWIW. I have established that no environment variable contains the ID. When the Perl code in that job runs, I would like it to be able to print the job ID (44 in this example).
Yes, I know that atq shows an = next to jobs that are executing, but there might be more than one of those at a time.
I could do something like pass a unique argument to the job when scheduling it, capture the ID, save that and the argument to a file somewhere, read that from the job. That's a lot of work I'd rather not go to if I don't have to, and it seems like this should be simple but I'm drawing a blank.
What follows was figured out by reading the sources of at-3.14. The way at puts the job id and the run time into the file name should be similar for any version, but I haven't checked this.
To begin with, at encodes the job id and the time a particular job should be run into the name of the file describing that job. The file name has the format aJJJJJTTTTTTTT, where JJJJJ is a 5-character hexadecimal string (the job id) and TTTTTTTT is an 8-character hexadecimal string (the time the job should be run, stored as seconds since the epoch).
At jobs are run by feeding a job description file as the standard input to sh -c. Fortunately the Linux kernel provides a symbolic link, /proc/self/fd/0, which points to the standard input of the process currently being executed (play with ls -l /proc/self/fd/0 in case you need to assure yourself that this is indeed so).
The file describing a job has been deleted by the time the job is run. However, the file is still available to the kernel because it was duplicated with dup(2) before being used as the job's standard input. So we are actually resolving a symbolic link to a file name that is no longer visible. In the Perl script at the end we need to take this into account, as readlink will return something like /foo/bar/baz (deleted) instead of /foo/bar/baz, and we're interested in just the file name, which has all the information we need.
The reason the symbolic link points to a deleted file is that the at daemon unlinks the original before executing the job. Unlinking is done only after creating a copy, a hard link, whose name begins with = instead of a. This is how the at daemon tries to ensure there is only one copy of a job running: the daemon will not execle(2), i.e. it will bail out, should the link(2) fail. Because the original file had been opened with open(2) and duplicated with dup(2), its inode is still there for the kernel to use, since it still has hard links pointing to it.
After a fairly long and possibly confusing introduction, here is how to put it all together:
#!/usr/bin/perl
use strict;
use warnings;

# /proc/self/fd/0 points at the job file that at feeds to the shell on stdin.
my $job_file = readlink("/proc/self/fd/0");

# Strip the " (deleted)" suffix that readlink appends for unlinked files.
if (index($job_file, " ") > 0) {
    $job_file = substr($job_file, 0, index($job_file, " "));
}

# Keep only the file name, which has the form aJJJJJTTTTTTTT.
my $tmp = substr($job_file, rindex($job_file, "/") + 1);

# Extract the 5-character hexadecimal job id that follows the leading "a".
$tmp =~ s/^a([0-9a-f]{5})[0-9a-f]+/$1/;
my $job_id = hex($tmp);

if ($job_id > 0) {
    printf("My AT job id is %d.\n", $job_id);
}
# end of file.
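One way to try the script out (a sketch: /tmp/at_job_id.pl is just an illustrative path, and the output is redirected to a file because at normally mails a job's stdout to its owner):
echo 'perl /tmp/at_job_id.pl > /tmp/at_job_id.txt 2>&1' | at now + 1 minute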

Environment variables do not update

I have two shell scripts. In one of them I have:
export FOO="yes"
sh another.sh &
# ops ...
export FOO=0
In the another.sh I have:
while [[ $FOO == "yes" ]]
do
    # something ...
done
The thing is, when the first script sets FOO=0 and finishes, the value of FOO in another.sh is still "yes". I want to know how to get the updated FOO so I can figure out when the first script (the caller) has finished.
A child process receives a copy of its parent's environment, not references to the original. If the parent changes while the child is still running, the child still has the original values in its memory space.
As a workaround, you could have the parent create an empty file in a known location before it exits, and have the child look for that file to be created. It's not very robust, but it's a simple form of interprocess communication that may be sufficient.
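A minimal sketch of that flag-file workaround (the path /tmp/parent.done is just an illustrative choice):
# parent script
rm -f /tmp/parent.done            # clear any stale flag from a previous run
export FOO="yes"
sh another.sh &
# ... do the parent's work ...
touch /tmp/parent.done            # signal the child that the parent has finished

# another.sh
while [ ! -e /tmp/parent.done ]; do
    # ... do something ...
    sleep 1
done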
This is not possible: changes to the parent's environment are not passed on to child processes.
Have you considered inverting the parent and child relationship? If the child is waiting for the parent to finish, why not call what was the parent script from what was the child script? Assuming your first script was called "first.sh", something like
another.sh:
sh first.sh &
CHILDPID=$!
while ps -p "$CHILDPID" > /dev/null    # loop for as long as first.sh is still running
do
    [...]
done
Using job control (using set -m in bash) may be another solution.
A piped `while-read' loop starts a subshell, so try a different loop mechanism instead. Something like:
while read -r fileName
do
    .........
done < <(ls -1tr "$SOURCE_DIR")

How do I submit multiple jobs with nohup?

I have several jobs which I would like to submit in a certain order.
First, multiple jobs like cal1st_1.sh, cal1st_2.sh, ..., cal1st_10.sh are submitted at the same time and processed simultaneously. After all of them are finished, I have a job called post_process.sh. Then come multiple jobs like cal2nd_1.sh, cal2nd_2.sh, ..., cal2nd_10.sh, then post_process.sh again. I need to do this four times.
I tried to write a script get_final.sh like
#!/bin/bash
./cal1st_1.sh & ./cal1st_2.sh & ./cal1st_3.sh...... & ./cal1st_10.sh
./post_process.sh
./cal2nd_1.sh & ./cal2nd_2.sh & ./cal2nd_3.sh...... & ./cal2nd_10.sh
./post_process.sh
and run it with the command nohup ./get_final.sh &, but it doesn't seem to work: sometimes post_process.sh starts even though the cal1st_*.sh jobs aren't all finished, and sometimes the cal1st_*.sh jobs aren't processed simultaneously. Can someone tell me which part of my code is wrong? If you have any questions about my code, please leave a comment.
EDIT
I wrote a script get_final.sh like this; do you think it will work? Should I just execute it with nohup ./get_final.sh &?
#!/bin/bash
./pre_cal.sh
for i in $(seq 1 13); do
    ./cal_dis_vel_strain_$i.sh &
done
wait
echo 1 > record
./post_cal.sh
...
Can someone tell me which part of my codes is wrong?
What is wrong is your assumption that the "post-process" tasks won't be started until the previous "cal" processes have finished. That's not the way & works: & leaves the child process to run in the background, without waiting for it to finish.
The way to do what you are trying to do is to use the builtin wait command, as described here:
How to wait in bash for several subprocesses to finish and return exit code !=0 when any subprocess ends with code !=0?
http://www.unixcl.com/2008/06/bash-wait-command.html
In this case, you will need to wait for each of the backgrounded processes (in any order).
(The problem is nothing to do with nohup.)
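A minimal sketch of the overall structure with wait (this assumes the third and fourth rounds follow the same cal3rd_*/cal4th_* naming as the first two; otherwise it is just the pattern from the question):
#!/bin/bash
for round in 1st 2nd 3rd 4th; do
    for i in $(seq 1 10); do
        ./cal${round}_${i}.sh &          # start this round's ten jobs in the background
    done
    wait                                 # block until every backgrounded cal job has exited
    ./post_process.sh                    # only then run the post-processing step
done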
In response to your followup: yes, that structure should work. A bare wait with no arguments waits for all currently active child processes, so post_cal.sh will not start until every backgrounded cal_dis_vel_strain_$i.sh job has exited. (If you only wanted to wait for particular jobs, you would pass their PIDs to wait.)
You can run it with nohup ./get_final.sh &, but you only need nohup if there is a possibility that your session will be disconnected before the script finishes. Another alternative is the screen program, which allows you to detach from and reattach to the session.

Handle "race-condition" between 2 cron tasks. What is the best approach?

I have a cron task that runs periodically. This task depends on a condition to be valid in order to complete its processing. In case it matters this condition is just a SELECT for specific records in the database. If the condition is not satisfied (i.e the SELECT does not return the result set expected) then the script exits immediately.
This is bad, as the condition will become valid soon enough (I don't know how soon, but it will, thanks to the run of another script).
So I would like to somehow make the script more robust. I thought of 2 solutions:
1. Put in a while loop and sleep until the condition is valid. This should work, but it has the downside that once the script is in the loop it is out of control. So I thought to additionally check, after waking up, whether a specific file exists; if it does, the script "understands" that the user wants to force-stop it.
2. Once the script figures out that the condition is not valid yet, it appends a second script to the crontab and stops. That second script continually polls for the condition and, once the condition is valid, restarts the first script so it can resume its processing. This seems workable to me, but I am not sure it is a good solution. E.g. perhaps programmatically modifying the crontab is a bad idea?
Anyway, I thought that this problem might be common and have a standard solution, much better than the 2 I came up with. Does anyone have a better proposal? Which of my ideas would be best? I am not very experienced with cron tasks, so there could be things/problems I am overlooking.
Instead of programmatically appending to the crontab, you might want to consider using at to schedule the job to run again at some time in the future. If the script determines that it cannot do its job now, it can simply schedule itself to run again a few minutes (or a few hours, as the case may be) later by way of an at command.
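A rough sketch of that idea (condition_is_met is a hypothetical placeholder for whatever wraps the SELECT check, and the 10-minute delay is arbitrary):
#!/bin/bash
if ! condition_is_met; then
    # Not ready yet: ask at to run this same script again in 10 minutes, then stop for now.
    echo "$0" | at now + 10 minutes
    exit 0
fi
# ... normal processing once the condition holds ...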
Following up from our conversation in comments, you can take advantage of conditional execution in a cron entry. Supposing you want to branch based on time of day, you might use the output from date.
For example: this would always invoke the first command, then invoke the second command only if the clock hour is currently 11:
echo 'ScriptA running' ; [ $(date +%H) == 11 ] && echo 'ScriptB running'
More examples!
To check the return value from the first command:
echo 'ScriptA' ; [ $? == 0 ] && echo 'ScriptB'
To instead check the STDOUT, you can use a colon as a no-op and branch by capturing output with the same $() construct we used with date:
: ; [ $(echo 'ScriptA') == 'ScriptA' ] && echo 'ScriptB'
One downside of the last example: STDOUT from the first command won't be printed to the console. You could capture it to a variable that you echo out, or write it to a file with tee, if that's important.
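As a concrete illustration, the hour-based branch might appear in an actual crontab entry like this (the schedule and paths are made up; note that % has to be escaped as \% in a crontab line, because cron otherwise treats it as a newline):
*/15 * * * * /path/to/scriptA.sh ; [ "$(date +\%H)" = 11 ] && /path/to/scriptB.sh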
