Start programs synchronized in bash

Start programs synchronized in bash - multithreading

What I want is a way to wait until a lock is released.
For example, I have 4 identical scripts (one per core, since I have 4 cores), each working on one part of a project. Each script looks like this:
#!/bin/bash
./prerenderscript $1
scriptsync step1 4
./renderscript $1
scriptsync step2 4
./postprod $1
When I run the main script that calls the four scripts, I want each script to work independently, but at certain points I want them to wait for each other, because the next part needs all the data from the previous part.
For now I have used ad hoc logic, such as counting files, or having each process create a file whose existence the others test for.
I also had the idea of using a Makefile like this:
prerender%: source
	./prerender $@
renderscript%: prerender1 prerender2 prerender3 prerender4
	./renderscript $@
postprod: renderscript1 renderscript2 renderscript3 renderscript4
	./postprod $@
But the process is simplified here. The real script is more complex, and each thread needs to keep its variables from one step to the next.
Is there any way to keep the scripts in sync, replacing the placeholder command scriptsync?

To achieve this in Bash, one way to do it is using inter-process communication to cause a task to wait for the previous one to finish. Here is an example.
#!/bin/bash
# $1 is received to allow for an example command, not required for the mechanism suggested
task_a()
{
    # Do some work
    sleep "$1" # This is just a dummy command as an example
    echo "Task A/$1 completed" >&2
    # Send status to stdout (the pipe feeds it to the next task's stdin), telling the next task to proceed
    echo "OK"
}
task_b()
{
    # Block until the previous task sends its status line
    IFS= read -r status ; [[ $status = OK ]] || return 1
    # Do some work
    sleep "$1" # This is just a dummy command as an example
    echo "Task B/$1 completed" >&2
}
task_a 2 | task_b 2 &
task_a 1 | task_b 1 &
wait
You will notice that the read could be anywhere in task B, so you could do some work, then wait (read) for the other task, then continue. You could have many signals sent by task A to task B, and several corresponding read statements.
As shown in the example, you can launch several pipelines in parallel.
One limit of this approach is that a pipeline establishes a communication channel between one writer and one reader. If a task needs to wait for signals from several tasks, you would need FIFOs to allow the task with dependencies to read from multiple sources.
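The FIFO idea mentioned above can be sketched as a small barrier. This is only an illustration under assumptions, not a definitive implementation: the names barrier_server and sync_dir, the FIFO layout, and the worker function are invented here, scriptsync takes the worker id rather than a step name, and the echo lines stand in for the real prerender/render/postprod commands. Each worker announces its arrival on one shared FIFO, and a coordinator releases all workers once everyone has arrived.

```shell
#!/bin/bash
# Sketch of a barrier for n worker processes, built from named pipes.
# All names here (barrier_server, scriptsync, sync_dir) are illustrative.

n=4
sync_dir=$(mktemp -d)
mkfifo "$sync_dir/arrive"
for ((i = 1; i <= n; i++)); do mkfifo "$sync_dir/release.$i"; done

# Coordinator: for each step, collect n arrivals, then release every worker.
barrier_server() {
    local steps=$1 s i id
    exec 3<>"$sync_dir/arrive"   # read-write open, so it never sees EOF
    for ((s = 0; s < steps; s++)); do
        for ((i = 0; i < n; i++)); do read -r -u 3 id; done
        for ((i = 1; i <= n; i++)); do echo go > "$sync_dir/release.$i"; done
    done
    exec 3<&-
}

# Called by worker $1: announce arrival, then block until released.
scriptsync() {
    local go
    echo "$1" > "$sync_dir/arrive"
    read -r go < "$sync_dir/release.$1"
}

# Each worker is one process, so its variables survive across the steps.
worker() {
    local id=$1
    echo "prerender $id" >> "$sync_dir/log"   # stand-in for ./prerenderscript
    scriptsync "$id"
    echo "render $id" >> "$sync_dir/log"      # stand-in for ./renderscript
    scriptsync "$id"
    echo "postprod $id" >> "$sync_dir/log"    # stand-in for ./postprod
}

barrier_server 2 &                 # two synchronization points
for ((i = 1; i <= n; i++)); do worker "$i" & done
wait
cat "$sync_dir/log"
```

In the log, the order of lines within a step varies from run to run, but no render line can appear before the last prerender line, because the coordinator releases nobody until it has read n arrivals.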


Linux | Background assignment of command output to variable

I need 3 commands to be run and their (single-line) outputs assigned to 3 different variables, which then I use to write to a file. I want to wait till the variable assignment is complete for all 3 before I echo the variables to the file. I am running these in a loop within a bash script.
This is what I have tried -
var1=$(longRunningCommand1) &
var2=$(longRunningCommand2) &
var3=$(longRunningCommand3) &
wait %1 %2 %3
echo "$var1,$var2,$var3">>$logFile
This gives no values at all, for the variables. I get -
,,
,,
,,
However, if I try this -
var1=$(longRunningCommand1 &)
var2=$(longRunningCommand2 &)
var3=$(longRunningCommand3 &)
wait %1 %2 %3
echo "$var1,$var2,$var3">>$logFile
I get the desired output,
o/p of longRunningCommand1, o/p of longRunningCommand2, o/p of longRunningCommand3
o/p of longRunningCommand1, o/p of longRunningCommand2, o/p of longRunningCommand3
o/p of longRunningCommand1, o/p of longRunningCommand2, o/p of longRunningCommand3
but the nohup.out for this shell script indicates that there was no background job to wait for -
netmon.sh: line 35: wait: %1: no such job
netmon.sh: line 35: wait: %2: no such job
netmon.sh: line 35: wait: %3: no such job
I would not have bothered much about this, but I definitely need to make sure that my script is waiting for all the 3 variables to be assigned before attempting the write. Whereas, the nohup.out tells me otherwise! I think I want to know if the 2nd approach is the right way when I run into a situation where any of those 3 commands are running for more than a few seconds. I have not yet been able to get a really long running command or a resource contention on the box to actually resolve this doubt of mine.
Thank you very much for any helpful thoughts.
-MT

Your goal of writing the output of echo "$var1,$var2,$var3">>$logFile while backgrounding the actual processes longRunningCommand1, ..2, ..3 can be accomplished using a list and redirection. As @that_other_guy notes, you cannot assign the result of a command substitution to a variable in the background to begin with. However, in a shell such as bash you can write the output of a process to a file in the background, and separating your processes and redirections with a ';' will ensure that command1, ..2, ..3 write to the log file sequentially, e.g.:
Commands that are separated by a <semicolon> ( ';' )
shall be executed sequentially.
POSIX Specification - lists
Putting those pieces together, you would sequentially write the results of your commands to $logfile with something similar to the following:
( (longRunningCommand1) >> $logfile; (longRunningCommand2) >> $logfile; \
(longRunningCommand3) >> $logfile) &
(note: the ';' between commands writing to $logfile)
If you wanted your script to wait until all commands had written to $logfile (and your shell supports $! as the PID of the last backgrounded process), you could simply wait $!, though that is not required to ensure the write to the file completes.
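If the three commands should run at the same time and still end up in variables, a different technique from the list above is to send each command's output to its own temporary file, wait for all the jobs, and only then read the files. The following is a minimal sketch under assumptions: the three functions are hypothetical stand-ins for longRunningCommand1..3, and tmpdir and line are names introduced for the example.

```shell
#!/bin/bash
# Hypothetical stand-ins for longRunningCommand1..3 (single-line output).
longRunningCommand1() { sleep 0.3; echo "one"; }
longRunningCommand2() { sleep 0.2; echo "two"; }
longRunningCommand3() { sleep 0.1; echo "three"; }

tmpdir=$(mktemp -d)

# Start all three in the background, each writing to its own file.
longRunningCommand1 > "$tmpdir/1" &
longRunningCommand2 > "$tmpdir/2" &
longRunningCommand3 > "$tmpdir/3" &

wait    # returns only after all three background jobs have exited

# The assignments happen in the main shell, so the values persist.
var1=$(<"$tmpdir/1")
var2=$(<"$tmpdir/2")
var3=$(<"$tmpdir/3")
rm -rf "$tmpdir"

line="$var1,$var2,$var3"
echo "$line"
```

Because the jobs are started directly by the script's own shell, a plain wait sees them, which is not the case for a command backgrounded inside $( ... ).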

Bash output happening after prompt, not before, meaning I have to manually press enter

I am having a problem getting bash to do exactly what I want, it's not a major issue, but annoying.
1.) I run a piece of third-party software that writes some output to stderr. Some of it is useful; some of it is routine noise that I don't care about and don't want dumped to the screen. I do want the useful parts of stderr shown, though. I figured the best way to achieve this was to pass stderr to a function and use conditions in that function to decide whether to show each line.
2.) This works fine. However, the solution I implemented prints my errors at the right time but then returns a bash prompt. I want to summarise the status of the errors at the end of the function, but echoing there prints the text after the prompt, meaning that I have to press enter to get back to a clean prompt. The example below should make this clear.
My error stream generator:
./TestErrorStream.sh
#!/bin/bash
echo "test1" >&2
My function to process this:
./Function.sh
#!/bin/bash
function ProcessErrors()
{
while read data;
do
echo Line was:"$data"
done
sleep 5 # This is used simply to simulate the processing work I'm doing on the errors.
echo "Completed"
}
I source the Function.sh file to make ProcessErrors() available, then I run:
2> >(ProcessErrors) ./TestErrorStream.sh
I expect (and want) to get:
user@user-desktop:~/path$ 2> >(ProcessErrors) ./TestErrorStream.sh
Line was:test1
Completed
user@user-desktop:~/path$
However what I really get is:
user@user-desktop:~/path$ 2> >(ProcessErrors) ./TestErrorStream.sh
Line was:test1
user@user-desktop:~/path$ Completed
And no clean prompt. Of course the prompt is there, but "Completed" is printed after the prompt. I want it printed before, followed by a clean prompt.
NOTE: This is a minimum working example, and it's contrived. While other solutions to my error stream problem are welcome I also want to understand how to make bash run this script the way I want it to.
Thanks for your help
Joey

Your problem is that the while loop stays attached to stdin until the program exits.
stdin is released when "TestErrorStream.sh" ends, so your prompt comes back almost immediately, while the function still has work left to do.
I suggest wrapping the command in a script so that you can control how long to wait before the prompt returns (I suggest 1 second more than the time you expect the function to need for its remaining lines of code).
I managed to do it like this:
./Functions.sh
#!/bin/bash
function ProcessErrors()
{
while read data;
do
echo Line was:"$data"
done
sleep 5 # simulate required time to process end of function (after TestErrorStream.sh is over and stdin is released)
echo "Completed"
}
./TestErrorStream.sh
#!/bin/bash
echo "first"
echo "firsterr" >&2
sleep 20 # any number here
./WrapTestErrorStream.sh
#!/bin/bash
source ./Functions.sh
2> >(ProcessErrors) ./TestErrorStream.sh
sleep 6 # <= this one is important
With the above you'll get a nice "Completed" before your prompt after 26 seconds of processing. (Works fine with or without the additional "time" command)
user@host:~/path$ time ./WrapTestErrorStream.sh
first
Line was:firsterr
Completed
real 0m26.014s
user 0m0.000s
sys 0m0.000s
user@host:~/path$
Note: the process substitution ">(ProcessErrors)" is a subprocess of the script "./TestErrorStream.sh". So when the script ends, the subprocess is no longer tied to it or to the wrapper, and nothing waits for it. That's why we need that final "sleep 6".
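One way to avoid guessing a sleep duration is to swap the process substitution for an ordinary pipeline, since the shell waits for every member of a pipeline before it continues. The sketch below is an assumption-laden illustration rather than the approach used above: run_filtered is a name invented here, TestErrorStream is a function standing in for ./TestErrorStream.sh, and the sleep 5 is left out to keep the example quick.

```shell
#!/bin/bash
ProcessErrors() {
    local data
    while IFS= read -r data; do
        echo "Line was:$data"
    done
    echo "Completed"    # runs after the error stream has hit EOF
}

# Stand-in for ./TestErrorStream.sh: one line on stdout, one on stderr.
TestErrorStream() {
    echo "first"
    echo "firsterr" >&2
}

# Hypothetical helper: send only stderr of "$@" through ProcessErrors.
# fd 3 holds the original stdout so normal output bypasses the pipe.
run_filtered() {
    { "$@" 2>&1 1>&3 3>&- | ProcessErrors; } 3>&1
}

out=$(run_filtered TestErrorStream)
printf '%s\n' "$out"
```

Run interactively as run_filtered ./TestErrorStream.sh, the prompt should only come back after "Completed" has been printed, because ProcessErrors is part of the foreground pipeline.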

#!/bin/bash
function ProcessErrors {
while read data; do
echo Line was:"$data"
done
sleep 5
echo "Completed"
}
# Open subprocess
exec 60> >(ProcessErrors)
P=$!
# Do the work
2>&60 ./TestErrorStream.sh
# Close connection or else subprocess would keep on reading
exec 60>&-
# Wait for process to exit (wait "$P" doesn't work). There are many ways
# to do this too like checking `/proc`. I prefer the `kill` method as
# it's more explicit. We'd never know if /proc updates itself quickly
# among all systems. And using an external tool is also a big NO.
while kill -s 0 "$P" &>/dev/null; do
sleep 1s
done
Off topic side-note: I'd love to see how posturing bash veterans/authors try to own this. Or perhaps they already did way way back from seeing this.

Need an in-depth explanation of how to use flock in Linux shell scripting

I am working on a tiny Raspberry Pi cluster (4 pis). I have 3 Raspberry Pi nodes that will be leaving a message in a message.txt file on the head Pi. The head Pi will be in a loop checking the message.txt file to see if it has any lines. When it does I want to lock the file and then extract the info I need. The problem I am having is that I need to do multiple commands. The only ways I have found that allows multiple commands look like this...
(
flock -s 200
# ... commands executed under lock ...
) 200>/var/lock/mylockfile
The problem with this way is that it uses a sub shell. The problem with that is that I have "job" files labeled job_1 job_2 etc..... that I want to be able to use a counter with. If I place the increment of the counter in the subshell it will be considered only in the scope of the subshell. If I pull the incrementation out there is a chance that another pi will add an entry before I increment the counter and lock the file.
I have heard talk that there is a way to lock the file and run multiple commands and flow control and then unlock it all using flock. I have not seen any good examples though.
Here is my current code.
# Now go into loop to send out jobs as pis ask for more work
while [ $jobsLeftCount -gt 0 ]
do
echo "launchJobs.sh: About to check msg file"
msgLines=$(wc -l < $msgLocation)
if [ $msgLines ]; then
#FIND WAY TO LOCK FILE AND DO THAT HERE
echo "launchJobs.sh: Messages found. Locking message file to read contents"
(
flock -e 350
echo "Message Received"
while read line; do
#rename file to be sent to node "job"
mv $jobLocation$jobName$jobsLeftCount /home/pi/algo2/Jobs/job
#transfer new job to each script that left a message
scp /home/pi/algo2/Jobs/job pi@192.168.0.$line:/home/pi/algo2/Jobs/
jobsLeftCount=$jobsLeftCount-1;
echo $line
done < $msgLocation
#clear msg file
>$msgLocation
#UNLOCK MESG FILE HERE
) 350>>$msgLocation
echo "Head node has $jobsLeftCount remaining"
fi
#jobsLeftCount=$jobsLeftCount-1;
#echo "here is $jobsLeftCount file"
done

If the sub-shell environment is not acceptable, use braces in place of parentheses to group the commands:
{
flock -s 200
# ... commands executed under lock ...
} 200>/var/lock/mylockfile
This runs the commands executed under lock in a new I/O context, but does not start a sub-shell. Within the braces, all the commands executed will have file descriptor 200 open to the locked lock file.
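To illustrate the difference for a counter, here is a small sketch. It assumes the flock utility from util-linux is installed; the lock file comes from mktemp and the variable names are made up for the example. It uses an exclusive lock (-x), where the snippet above uses a shared one (-s).

```shell
#!/bin/bash
# Assumes the flock(1) utility is available; lockfile is a throwaway file.
lockfile=$(mktemp)

# Brace group: runs in the current shell, so the assignment survives.
brace_count=3
{
    flock -x 200
    brace_count=$((brace_count - 1))
} 200>"$lockfile"

# Subshell: the decrement happens in a child process and is lost.
sub_count=3
(
    flock -x 200
    sub_count=$((sub_count - 1))
) 200>"$lockfile"

echo "brace group: $brace_count, subshell: $sub_count"
rm -f "$lockfile"
```

In both forms the lock is released when the group ends, because that is when file descriptor 200 is closed.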

How to run script continuously in background without using crontab

I have a small script that checks a certain condition continuously, and as soon as that condition is met the program should execute. Can this be done? I thought of using crontab, with the script running every 5 minutes, but now I want to do it without crontab.

You probably want to create an infinite loop first, then within that loop you probably want to verify your condition or wait a bit. As you did not mention which scripting language you wanted to use, I'm going to write pseudo code for the example. Give us more info about the scripting language, and perhaps also the conditions.
Example in pseudo code:
# Defining a timeout of 1 sec, so checking the condition every second
timeout = 1
# Running in background infinitely
while true do
# Let's check the condition
if condition then
# I got work to do
...
else
# Let's wait a specified timeout, not to block the system
sleep timeout
endif
endwhile
Example in sh with your input code
#!/bin/sh
PATH=/bin:/usr/bin:/sbin:/usr/sbin
# Defining a timeout of 1 hour
timeout=3600
while true
do
    # Clean up when the df output shows a usage between 50% and 99%
    case $(df /tmp) in
    *" "[5-9]?%" "*) rm -f /tmp/af.*
        ;;
    *)
        sleep $timeout
        ;;
    esac
done
You can then run this script from the shell using 'nohup':
nohup yourscript &

Multi-threaded BASH programming - generalized method?

Ok, I was running POV-Ray on all the demos, but POV's still single-threaded and wouldn't utilize more than one core. So, I started thinking about a solution in BASH.
I wrote a general function that takes a list of commands and runs them in the designated number of sub-shells. This actually works but I don't like the way it handles accessing the next command in a thread-safe multi-process way:
It takes, as an argument, a file with commands (1 per line),
To get the "next" command, each process ("thread") will:
Waits until it can create a lock file, with: ln $CMDFILE $LOCKFILE
Read the command from the file,
Modifies $CMDFILE by removing the first line,
Removes the $LOCKFILE.
Is there a cleaner way to do this? I couldn't get the sub-shells to read a single line from a FIFO correctly.
Incidentally, the point of this is to enhance what I can do on a BASH command line, and not to find non-bash solutions. I tend to perform a lot of complicated tasks from the command line and want another tool in the toolbox.
Meanwhile, here's the function that handles getting the next line from the file. As you can see, it modifies an on-disk file each time it reads/removes a line. That's what seems hackish, but I'm not coming up with anything better, since FIFO's didn't work w/o setvbuf() in bash.
#
# Get/remove the first line from FILE, using LOCK as a semaphore (with
# short sleep for collisions). Returns the text on standard output,
# returns zero on success, non-zero when file is empty.
#
parallel__nextLine()
{
local line rest file=$1 lock=$2
# Wait for lock...
until ln "${file}" "${lock}" 2>/dev/null
do sleep 1
[ -s "${file}" ] || return $?
done
# Open, read one "line" save "rest" back to the file:
exec 3<"$file"
read line <&3 ; rest=$(cat<&3)
exec 3<&-
# After last line, make sure file is empty:
( [ -z "$rest" ] || echo "$rest" ) > "${file}"
# Remove lock and 'return' the line read:
rm -f "${lock}"
[ -n "$line" ] && echo "$line"
}

#adjust these as required
args_per_proc=1 #1 is fine for long running tasks
procs_in_parallel=4
xargs -n$args_per_proc -P$procs_in_parallel povray < list
Note the nproc command coming soon to coreutils will auto determine
the number of available processing units which can then be passed to -P
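On systems where nproc is already installed, the two pieces can be combined. This is a rough sketch, not the exact command from above: the sh -c one-liner is a hypothetical stand-in for povray so that the example runs anywhere, and the fallback value of 4 is an arbitrary assumption.

```shell
#!/bin/bash
# Use nproc when it exists; the fallback of 4 is an arbitrary assumption.
procs_in_parallel=$(nproc 2>/dev/null || echo 4)

# Hypothetical stand-in for povray: report which input was handled.
# One argument per invocation, up to $procs_in_parallel at a time.
results=$(printf '%s\n' scene1 scene2 scene3 scene4 scene5 scene6 |
    xargs -n 1 -P "$procs_in_parallel" sh -c 'echo "rendered $1"' _ |
    sort)

echo "$results"
```

The sort is only there because parallel jobs finish in an unpredictable order.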

If you need real thread safety, I would recommend migrating to a better scripting system.
With Python, for example, you can create real threads with safe synchronization using semaphores and queues.

Sorry to bump this after so long, but I pieced together a fairly good solution for this, IMO.
It doesn't work perfectly, but it limits the script to a certain number of running child tasks, and then waits for all the rest at the end.
#!/bin/bash
pids=()
thread() {
    local this
    while [ ${#} -gt 6 ]; do
        this=${1}
        wait "$this"
        shift
    done
    pids=($1 $2 $3 $4 $5 $6)
}
for i in 1 2 3 4 5 6 7 8 9 10
do
    sleep 5 &
    pids=( ${pids[@]-} $(echo $!) )
    thread ${pids[@]}
done
for pid in ${pids[@]}
do
    wait "$pid"
done
it seems to work great for what I'm doing (handling parallel uploading of a bunch of files at once) and keeps it from breaking my server, while still making sure all the files get uploaded before it finishes the script
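A shorter variant of the same idea is possible where wait -n is available (bash 4.3 or later, as far as I know). The sketch below makes several assumptions: upload is a hypothetical stand-in for the real upload command, the limit of 3 is arbitrary, and unlike the script above it waits for whichever job finishes first rather than for the oldest one.

```shell
#!/bin/bash
# Requires a bash with "wait -n"; max_jobs=3 is an arbitrary choice.
max_jobs=3
running=0
outfile=$(mktemp)

# Hypothetical stand-in for one upload.
upload() {
    sleep 0.1
    echo "uploaded $1" >> "$outfile"
}

for i in 1 2 3 4 5 6 7 8 9 10; do
    # At the limit: block until any one background job exits.
    if [ "$running" -ge "$max_jobs" ]; then
        wait -n
        running=$((running - 1))
    fi
    upload "$i" &
    running=$((running + 1))
done

wait    # collect the jobs that are still running
count=$(wc -l < "$outfile")
echo "$count tasks finished"
```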

I believe you're actually forking processes here, and not threading. I would recommend looking for threading support in a different scripting language like perl, python, or ruby.
