Linux | Background assignment of command output to variable

I need 3 commands to be run and their (single-line) outputs assigned to 3 different variables, which I then use to write to a file. I want to wait until the variable assignment is complete for all 3 before I echo the variables to the file. I am running these in a loop within a bash script.
This is what I have tried -
var1=$(longRunningCommand1) &
var2=$(longRunningCommand2) &
var3=$(longRunningCommand3) &
wait %1 %2 %3
echo "$var1,$var2,$var3">>$logFile
This gives no values at all, for the variables. I get -
,,
,,
,,
However, if I try this -
var1=$(longRunningCommand1 &)
var2=$(longRunningCommand2 &)
var3=$(longRunningCommand3 &)
wait %1 %2 %3
echo "$var1,$var2,$var3">>$logFile
I get the desired output,
o/p of longRunningCommand1, o/p of longRunningCommand2, o/p of longRunningCommand3
o/p of longRunningCommand1, o/p of longRunningCommand2, o/p of longRunningCommand3
o/p of longRunningCommand1, o/p of longRunningCommand2, o/p of longRunningCommand3
but the nohup.out for this shell script indicates that there was no background job to wait for -
netmon.sh: line 35: wait: %1: no such job
netmon.sh: line 35: wait: %2: no such job
netmon.sh: line 35: wait: %3: no such job
I would not have bothered much about this, but I definitely need to make sure that my script waits for all 3 variables to be assigned before attempting the write, whereas the nohup.out tells me otherwise! I want to know whether the 2nd approach is the right way to go when any of those 3 commands runs for more than a few seconds. I have not yet been able to get a really long-running command, or resource contention on the box, to actually resolve this doubt of mine.
Thank you very much for any helpful thoughts.
-MT

Your goal of writing the output of echo "$var1,$var2,$var3">>$logFile while backgrounding the actual longRunningCommand1, ..2, ..3 processes can be accomplished using a list and redirection. As @that_other_guy notes, you cannot assign the result of a command substitution to a variable in the background to begin with. However, in a shell such as bash you can write the output of each process to a file in the background, and separating your processes and redirections with ';' ensures the sequential write of command1, ..2, ..3 to the log file, e.g.:
Commands that are separated by a <semicolon> ( ';' ) shall be executed sequentially.
(POSIX Specification, Lists)
Putting those pieces together, you would sequentially write the results of your commands to $logfile with something similar to the following:
( (longRunningCommand1) >> $logfile; (longRunningCommand2) >> $logfile; \
(longRunningCommand3) >> $logfile) &
(note: the ';' between commands writing to $logfile)
If you wanted your script to wait until all commands had been written to $logfile (and your shell provides $! as the PID of the last backgrounded process), you could simply wait $!, though that is not required to ensure the write to the file completes.
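If you do need the three outputs in separate variables, as in the original question, one common workaround is to capture each command's output in a temporary file, wait for all three background jobs, and only then read the files into the variables. Here is a minimal sketch of that idea, reusing the command names and $logFile from the question; the temporary-directory handling is an assumption, not part of the answer above:
#!/bin/bash
# Sketch: run the three commands in parallel, capturing each output in a temp file.
tmpdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tmpdir"' EXIT

longRunningCommand1 > "$tmpdir/out1" &
longRunningCommand2 > "$tmpdir/out2" &
longRunningCommand3 > "$tmpdir/out3" &
wait                        # blocks until all three background jobs have finished

var1=$(<"$tmpdir/out1")     # read each single-line result into its variable
var2=$(<"$tmpdir/out2")
var3=$(<"$tmpdir/out3")
echo "$var1,$var2,$var3" >> "$logFile"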

Related

Redirect parallel process bash script output to individual log file

I have a requirement where I need to pass multiple arguments to the script to trigger a parallel process for each argument. Now I need to capture each process's output in a separate log file.
for arg in test_{01..05} ; do bash test.sh "$arg" & done
The above piece of code only gives me parallel processing for the input arguments. I tried exec > >(tee "/path/of/log/$arg_file`date +%Y%m%d%H`.log") 2>&1, but it created a single log file named with just the date, with empty output. Can someone suggest what's going wrong here, or whether there is a better way, other than using the parallel package?
Try:
data_part=$(date +%Y%m%d%H)
for arg in test_{01..05} ; do bash test.sh "$arg" > "/path/to/log/${arg}_${data_part}.log" & done
If I use "$arg_`date +%Y%m%d%H`.log" it creates a file with just the date, without the arg.
Yes, because $arg_ is parsed as a variable name
arg_=blabla
echo "$arg_" # will print blabla
echo "${arg_}" # equal to the above
To separate _ from arg, use braces: "${arg}_" expands the variable arg and then appends the literal string _.
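A minimal sketch putting the fix together, in case you also want stderr captured and want the loop to block until every run has finished; the 2>&1 and the wait are additions for illustration, not part of the answer above:
data_part=$(date +%Y%m%d%H)
for arg in test_{01..05}; do
    bash test.sh "$arg" > "/path/to/log/${arg}_${data_part}.log" 2>&1 &
done
wait    # return only after all five background runs have completed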

Prevent script running with same arguments twice

We are looking into building a logcheck script that will tail a given log file and email when the given arguments are found. I am having trouble accurately determining if another version of this script is running with at least one of the same arguments against the same file. Script can take the following:
logcheck -i <filename(s)> <searchCriterion> <optionalEmailAddresses>
I have tried to use ps aux with a series of grep, sed, and cut, but it always ends up being more code than the script itself and seldom works very efficiently. Is there an efficient way to tell if another version of this script is running with the same filename and search criteria? A few examples of input:
EX1 ./logcheck -i file1,file2,file3 "foo string 0123" email@address.com
EX2 ./logcheck -s file1 Hello,World,Foo
EX3 ./logcheck -i file3 foo email@address1.com,email@address2.com
In this case 3 should not run because 1 is already running with parameters file3 and foo.
There are many solutions for your problem; I would recommend creating a lock file with the following format:
arg1Ex1 PID#(Ex1)
arg2Ex1 PID#(Ex1)
arg3Ex1 PID#(Ex1)
arg4Ex1 PID#(Ex1)
arg1Ex2 PID#(Ex2)
arg2Ex2 PID#(Ex2)
arg3Ex2 PID#(Ex2)
arg4Ex2 PID#(Ex2)
when your script starts:
It will search in the file for all the arguments it has received (awk command or grep)
If one of the arguments is present in the list, fetch that process's PID (awk '{print $2}', for example) and check whether it is still running (ps). Double-check for concurrency: if a process previously ended abnormally, stale entries might remain in the file.
If that PID is still running, the script will not run.
Else, append the arguments to the lock file with the current process PID and run the script.
At the end of the execution, remove the lines that contain the arguments used by the script, or remove all lines with its PID.
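A minimal bash sketch of that lock-file flow; the lock-file path, the "argument PID" line format, and the trap-based cleanup are assumptions for illustration, not part of the answer above:
#!/bin/bash
lockfile=/tmp/logcheck.lock    # hypothetical location, one "argument PID" pair per line
touch "$lockfile"

# Refuse to start if any of our arguments is already registered by a live process.
for arg in "$@"; do
    while read -r larg lpid; do
        if [[ $larg == "$arg" ]] && kill -0 "$lpid" 2>/dev/null; then
            echo "logcheck already running with argument '$arg' (PID $lpid)" >&2
            exit 1
        fi
    done < <(grep -Fw -- "$arg" "$lockfile")
done

# Register our arguments under our PID, and clean them up when we exit.
for arg in "$@"; do
    printf '%s %s\n' "$arg" "$$" >> "$lockfile"
done
trap 'sed -i "/ $$\$/d" "$lockfile"' EXIT    # GNU sed; drops our lines on exit

# ... the actual tail/email work would go here ...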

Start programs synchronized in bash

What I simply want to do is wait for a lock to be released.
I have, for example, 4 identical scripts (because I have 4 cores) that each work on a part of a project. Each script looks like this:
#!/bin/bash
./prerenderscript $1
scriptsync step1 4
./renderscript $1
scriptsync step2 4
./postprod $1
When I run the main script that calls the four scripts, I want each script to work individually, but at certain points I want the scripts to wait for each other, because the next part needs all the data from the previous part.
For now I have used logic such as counting files, or having each process create a file whose existence gets tested by the others.
I also got the idea to use a makefile and to have:
prerender%: source
	./prerender $@
renderscript%: prerender1 prerender2 prerender3 prerender4
	./renderscript $@
postprod: renderscript1 renderscript2 renderscript3 renderscript4
	./postprod $@
But actually the process is simplified here; the real script is more complex, and for each step the thread needs to keep its variables.
Is there any way to get the scripts in sync, instead of using the placeholder command scriptsync?
One way to achieve this in Bash is to use inter-process communication to make a task wait for the previous one to finish. Here is an example.
#!/bin/bash
# $1 is received to allow for an example command, not required for the mechanism suggested
task_a()
{
# Do some work
sleep $1 # This is just a dummy command as an example
echo "Task A/$1 completed" >&2
# Send status to stdout, telling the next task in the pipeline to proceed
echo "OK"
}
task_b()
{
IFS= read status ; [[ $status = OK ]] || return 1
# Do some work
sleep $1 # This is just a dummy command as an example
echo "Task B/$1 completed" >&2
}
task_a 2 | task_b 2 &
task_a 1 | task_b 1 &
wait
You will notice that the read could be anywhere in task B, so you could do some work, then wait (read) for the other task, then continue. You could have many signals sent by task A to task B, and several corresponding read statements.
As shown in the example, you can launch several pipelines in parallel.
One limit of this approach is that a pipeline establishes a communication channel between one writer and one reader. If a task needs to wait for signals from several tasks, you would need FIFOs to allow the task with dependencies to read from multiple sources.
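A minimal sketch of that FIFO approach, assuming 4 workers and a barrier after step 1; the /tmp path, the "done" token protocol, and the read-write open (to avoid an early EOF if a worker finishes before the next one opens the pipe) are assumptions for illustration:
#!/bin/bash
# Coordinator side: collect one token per worker before allowing step 2.
mkfifo /tmp/step1.fifo
exec 3<> /tmp/step1.fifo       # keep the FIFO open read-write in this shell

# Each worker signals the end of its step 1 with:
#   echo done > /tmp/step1.fifo

for i in 1 2 3 4; do
    IFS= read -r _ <&3         # one line per worker that has finished step 1
done
exec 3>&-                      # release the FIFO once the barrier is passed
rm /tmp/step1.fifo
echo "all 4 workers finished step 1; step 2 may start" >&2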

Difference between `exec n<&0 < file` and `exec n<file` commands and some general questions regarding exec command

As I am a newbie in shell scripting, the exec command always confuses me, and exploring this topic together with the while loop triggered the following 4 questions:
What is the difference between syntax 1 and syntax 2 below?
syntax 1:
while read LINE
do
: # manipulate file here
done < file
syntax 2:
exec n<&0 < file
while read LINE
do
: # manipulate file here
done
exec 0<&n n<&-
Kindly elaborate the operation of exec n<&0 < file lucidly
Is the exec n<&0 < file command equivalent to exec n<file? (If not, what's the difference between the two?)
I had read somewhere that in the Bourne shell and older versions of ksh, a problem with the while loop is that it is executed in a subshell. This means that any changes to the script environment, such as exporting variables and changing the current working directory, might not be present after the while loop completes.
As an example, consider the following script:
#!/bin/sh
if [ -f “$1” ] ; then
i=0
while read LINE
do
i=`expr $i + 1`
done < “$1”
echo $i
fi
This script tries to count the number of lines in the file specified to it as an argument.
On executing this script on the file
$ cat dirs.txt
/tmp
/usr/local
/opt/bin
/var
can produce the following incorrect result:
0
Although you are incrementing the value of $i using the command
i=`expr $i + 1`
when the while loop completes, the value of $i is not preserved.
In this case, you need to change a variable’s value inside the while loop and then use that value outside the loop.
One way to solve this problem is to redirect STDIN prior to entering the loop and then restore STDIN after the loop completes.
The basic syntax is
exec n<&0 < file
while read LINE
do
: # manipulate file here
done
exec 0<&n n<&-
My question here is:
In the Bourne shell and older versions of ksh, although the while loop is executed in a subshell, how does the exec command here help in retaining a variable's value even after the while loop completes? That is, how does the exec command accomplish the task of changing a variable's value inside the while loop and then using that value outside the loop?
The difference should be nothing in modern shells (both should be POSIX compatible), with some caveats:
There are likely thousands of unique shell binaries in use, some of which are missing common features or simply buggy.
Only the first version will behave as expected in an interactive shell, since the shell will close as soon as standard input gets EOF, which will happen once it finishes reading file.
The while loop reads from FD 0 in both cases, making the exec pointless if the shell supports < redirection to while loops. To read from FD 9 you have to use done <&9 (POSIX) or read -u 9 (in Bash).
exec (in this case, see help exec/man exec) applies the redirections following it to the current shell, and they are applied left-to-right. For example, exec 9<&0 < file first points FD 9 to where FD 0 (standard input) is currently pointing, effectively making FD 9 a "copy" of FD 0. After that, file is attached to standard input (FD 0), while FD 9 still refers to the original standard input.
Run a shell within a shell to see the difference between the two (commented to explain):
$ echo foo > file
$ exec "$SHELL"
$ exec 9<&0 < file
$ foo # The contents of the file are executed in the shell
bash: foo: command not found
$ exit # Because of the end of file; equivalent to pressing Ctrl-d
$ exec "$SHELL"
$ exec 9< file # Nothing happens, simply sends file to FD 9
This is a common misconception about *nix shells: Variables declared in subshells (such as those created by while) are not available to the parent shell. This is by design, not a bug. Many other answers here and on USE refer to this.
So many questions... but all of them seem variants of the same one, so I'll go on...
exec without a command is used to do redirection in the current process. That is, it changes the files attached to different file descriptors (FD).
Question #1
I think it should be this way; on my system the {} are mandatory:
exec {n}<&0 < file
This line dups FD 0 (standard input) and stores the new FD into the n variable. Then it attaches file to the standard input.
while read LINE ; do ... done
This line reads lines into the variable LINE from the standard input, which will be file.
exec 0<&n {n}<&-
And this line dups back the FD from n into 0 (the original standard input), which automatically closes file, and then closes n (the dupped original stdin).
The other syntax:
while read LINE; do ... done < file
does the same, but in a less convoluted way.
Question #2
exec {n}<&0 < file
These are redirections, and they are executed left to right. The first one, n<&0, does a dup(0) (see man dup) and stores the resulting new FD in the variable n. Then <file does an open("file", ...) and assigns it to FD 0.
Question #3
No. exec {n}<file opens the file and assigns the new FD to variable n, leaving the standard input (FD 0) untouched.
Question #4
I don't know about older versions of ksh, but the usual problem is when doing a pipe.
grep whatever | while read LINE; do ... done
Then the while command is run in a subshell. The same is true if it is to the left of the pipe.
while read LINE ; do ... done | grep whatever
But for simple redirects there is no subshell:
while read LINE ; do ... done < aaa > bbb
Extra unnumbered question
About your example script, it works for me once I've changed the typographic quotes to normal double quotes ;-):
#!/bin/sh
if [ -f "$1" ] ; then
i=0
while read LINE
do
i=`expr $i + 1`
done < "$1"
echo $i
fi
For example, if the file is test:
$ ./test test
9
And about your latest question, the subshell is not created by while but by the pipe |, or maybe, in older versions of ksh, by the redirection <. What the exec trick does is keep that redirection off the loop itself, so no subshell is created.
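A minimal runnable sketch of that trick, combining the asker's line counter with the exec save/restore pattern; FD 3 is an arbitrary choice and the script name is hypothetical:
#!/bin/sh
# count-lines.sh: count the lines of the file given as $1
i=0
exec 3<&0 < "$1"       # save the original stdin on FD 3, then read from the file
while read LINE
do
    i=`expr $i + 1`
done
exec 0<&3 3<&-         # restore stdin and close the backup descriptor
echo "$i"              # $i survives because the loop ran in the current shell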
Let me answer your questions out-of-order.
Q #2
The command exec n<&0 < file is not valid syntax. Probably the n stands for "some arbitrary number". That said, for example
exec 3<&0 < file
executes two redirections, in sequence: it duplicates/copies the standard input file descriptor, which is 0, as file descriptor 3. Next, it "redirects" file descriptor 0 to read from file file.
Later, the command
exec 0<&3 3<&-
first copies back the standard input file descriptor from the saved file descriptor 3, redirecting standard input back to its previous source. Then it closes file descriptor 3, which has served its purpose of backing up the initial stdin.
Q #1
Effectively, the two examples do the same: they temporarily redirect stdin within the scope of the while loop.
Q #3
Nope: exec 3<filename opens file filename using file descriptor 3. exec 3<&0 <filename I described in #2.
Q #4
I guess those older shells mentioned effectively executed
while ...; do ... ; done < filename
as
cat filename | while ...
thereby executing the while loop in a subshell.
Doing the redirections beforehand with those laborious exec commands avoids the redirection of the while block, and thereby the implicit sub-shell.
However, I never heard of that weird behavior, and I guess you won't have to deal with it unless you're working with truly ancient shells.
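For completeness, a small bash demonstration of the subshell point made in the answers above; the use of /etc/hosts and the process-substitution variant are assumptions for illustration:
#!/bin/bash
i=0
grep -c '' /etc/hosts | while read -r n; do i=$n; done
echo "after pipe: i=$i"                   # prints 0: the loop ran in a subshell

i=0
while read -r line; do i=$((i + 1)); done < /etc/hosts
echo "after redirection: i=$i"            # prints the line count: no subshell

i=0
while read -r line; do i=$((i + 1)); done < <(cat /etc/hosts)
echo "after process substitution: i=$i"   # also keeps the value: no subshell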

Bash output happening after prompt, not before, meaning I have to manually press enter

I am having a problem getting bash to do exactly what I want; it's not a major issue, but it is annoying.
1.) I run a third-party program that produces some output on stderr. Some of it is useful, some of it is stuff I regularly don't care about, and I don't want that dumped to the screen; however, I do want the useful parts of the stderr dumped to the screen. I figured the best way to achieve this was to pass stderr to a function, then use conditions in that function to either show the stderr or not.
2.) This works fine. However, the solution I have implemented dumps out my errors at the right time but then returns a bash prompt, and I want to summarise the status of the errors at the end of the function; echoing there prints the text after the prompt, meaning that I have to press enter to get back to a clean prompt. It will become clear with the example below.
My error stream generator:
./TestErrorStream.sh
#!/bin/bash
echo "test1" >&2
My function to process this:
./Function.sh
#!/bin/bash
function ProcessErrors()
{
while read data;
do
echo Line was:"$data"
done
sleep 5 # This is used simply to simulate the processing work I'm doing on the errors.
echo "Completed"
}
I source the Function.sh file to make ProcessErrors() available, then I run:
2> >(ProcessErrors) ./TestErrorStream.sh
I expect (and want) to get:
user@user-desktop:~/path$ 2> >(ProcessErrors) ./TestErrorStream.sh
Line was:test1
Completed
user@user-desktop:~/path$
However what I really get is:
user@user-desktop:~/path$ 2> >(ProcessErrors) ./TestErrorStream.sh
Line was:test1
user@user-desktop:~/path$ Completed
And no clean prompt. Of course the prompt is there, but "Completed" is being printed after the prompt; I want it printed before, and then a clean prompt to appear.
NOTE: This is a minimum working example, and it's contrived. While other solutions to my error stream problem are welcome I also want to understand how to make bash run this script the way I want it to.
Thanks for your help
Joey
Your problem is that the while loop stays attached to stdin until the program exits.
Stdin is released at the end of TestErrorStream.sh, so your prompt is back almost immediately, while the function still has lines left to process.
I suggest you wrap the command inside a script so you can control how long to wait before your prompt comes back (I suggest 1 second more than the time you expect the function to need to process the remaining lines).
I successfully managed to do this like so:
./Functions.sh
#!/bin/bash
function ProcessErrors()
{
while read data;
do
echo Line was:"$data"
done
sleep 5 # simulate required time to process end of function (after TestErrorStream.sh is over and stdin is released)
echo "Completed"
}
./TestErrorStream.sh
#!/bin/bash
echo "first"
echo "firsterr" >&2
sleep 20 # any number here
./WrapTestErrorStream.sh
#!/bin/bash
source ./Functions.sh
2> >(ProcessErrors) ./TestErrorStream.sh
sleep 6 # <= this one is important
With the above you'll get a nice "Completed" before your prompt after 26 seconds of processing. (Works fine with or without the additional "time" command)
user@host:~/path$ time ./WrapTestErrorStream.sh
first
Line was:firsterr
Completed
real 0m26.014s
user 0m0.000s
sys 0m0.000s
user@host:~/path$
Note: the process substitution ">(ProcessErrors)" is a subprocess of the script "./TestErrorStream.sh", so when that script ends the subprocess is no longer tied to it, nor to the wrapper. That's why we need that final "sleep 6".
#!/bin/bash
function ProcessErrors {
while read data; do
echo Line was:"$data"
done
sleep 5
echo "Completed"
}
# Open subprocess
exec 60> >(ProcessErrors)
P=$!
# Do the work
2>&60 ./TestErrorStream.sh
# Close connection or else subprocess would keep on reading
exec 60>&-
# Wait for process to exit (wait "$P" doesn't work). There are many ways
# to do this too like checking `/proc`. I prefer the `kill` method as
# it's more explicit. We'd never know if /proc updates itself quickly
# among all systems. And using an external tool is also a big NO.
while kill -s 0 "$P" &>/dev/null; do
sleep 1s
done
Off topic side-note: I'd love to see how posturing bash veterans/authors try to own this. Or perhaps they already did way way back from seeing this.
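For reference, a different route that sidesteps the timing problem: run ProcessErrors as the foreground end of an ordinary pipeline, so the shell itself waits for it before printing the prompt. Sending stdout straight to /dev/tty to keep it out of the pipe is an assumption that presumes a terminal is attached:
#!/bin/bash
source ./Functions.sh
# stderr goes into the pipe to ProcessErrors; stdout goes straight to the terminal.
./TestErrorStream.sh 2>&1 >/dev/tty | ProcessErrors
# Being a foreground pipeline, the shell waits here for ProcessErrors,
# so "Completed" prints before the prompt returns.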
