How to launch a process for reading and writing in bash? - linux

Background: I have to revive my old program, which unfortunately fails when it comes to communicating with a subprocess. The program is written in C++ and creates a subprocess for writing, with an open pipe for reading back. Nothing crashes, but there is no data to read.
My idea is to recreate the entire scenario in bash, so I can interactively check what is going on.
Things I used in C++:
mkfifo for creating the pipe -- there is a bash equivalent
popen for creating the subprocess (in my case for writing):
espeak -x -q -z 1> /dev/null 2> /tmp/my-pipe
open and read -- for opening the pipe and then reading; I hope a simple cat will suffice
fwrite -- for writing to the subprocess; will plain redirection work?
So I hope open, read and fwrite will be straightforward, but how do I launch a program as a process (what is popen in bash)?

bash naturally makes piping between processes very easy, so commands to create and open pipes are not normally needed:
program1 | program2
This is the equivalent of program1 running popen("program2","w");
It could also be achieved by program2 running popen("program1","r");
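As a concrete sketch of that equivalence (the commands here are just placeholders, not from the question):

```shell
# program1 | program2: program1's stdout becomes program2's stdin,
# exactly as if program1 had called popen("program2", "w").
seq 1 3 | while read -r n; do echo "got $n"; done
# prints:
#   got 1
#   got 2
#   got 3
```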
If you explicitly want to use a named pipe:
mkfifo /tmp/mypipe
program1 >/tmp/mypipe &
program2 </tmp/mypipe
rm /tmp/mypipe

A thought that might solve your original problem (and is a consideration for using pipes in the shell):
Using stdio calls such as popen, fwrite, etc. involves buffering. If a program on the write end of the pipe only writes a small amount of data, the program on the reading end won't see any of it until a full block of data has been written, after which the block is pushed along the pipe. If you want the data to get there sooner, you need to either call fflush() on the writing end, or fclose() if you are not planning to send any more data. Note that with bash, I don't believe there is any equivalent of fflush.

You simply run the process in the background.
espeak -x -q -z >/dev/null 2>/tmp/mypipe &
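Putting the whole scenario together (with a stand-in writer instead of espeak, so the sketch is self-contained and runnable):

```shell
mkfifo /tmp/my-pipe
# Stand-in for espeak: a backgrounded subprocess whose stderr goes into the pipe.
sh -c 'echo "phoneme data" >&2' 1>/dev/null 2>/tmp/my-pipe &
cat /tmp/my-pipe    # plays the role of open + read on the C++ side
rm /tmp/my-pipe
```

Opening a fifo for writing blocks until something opens the read end, which is why the writer must be backgrounded before cat runs.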

Related

Trouble understanding bash piping behaviour

I'm a bit confused about how bash performs pipe redirections.
First, on piping behaviour:
cat /dev/random | ls doesn't wait for cat and ends as soon as the ls result is printed; however,
cat /dev/random | grep foo waits for cat before finishing.
It makes sense, because ls doesn't need cat's output to work as grep does, but I don't understand how it can work. Does bash wait with waitpid calls on some of the processes? Does it wait for EOF on the write end for the pipe's right side, and on the read end for the pipe's left side?
Moreover, I'm not sure which commands are forked or not, and where that's done:
I guess built-in commands are executed in the main process, as most of them are used to modify shell settings. On the other hand (I guess), binaries are always executed in a subprocess, with fork; am I right?
If I am, it means the pipe redirection doesn't call fork itself, as it doesn't know whether the commands to execute are built-ins or not.
Bash's source code is way beyond my skills and I don't understand it; can someone explain to me how it behaves?
I tried to reimplement its behaviour (without success); I asked here, with some code if you want to check.
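Some of this can be probed directly from bash (a hedged sketch, not a tour of bash internals): every part of a pipeline is forked, even a built-in when it appears in a pipeline; the shell waits for all of the children; and a writer whose reader has exited is killed by SIGPIPE on its next write. The exit statuses make this visible:

```shell
# head exits after two lines; yes keeps writing and is killed by SIGPIPE.
yes | head -n 2 > /dev/null
echo "${PIPESTATUS[@]}"   # prints "141 0": 141 = 128 + 13 (SIGPIPE)
```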

When piping a command to a shell script, why does exiting the piped command make the shell script exit?

First of all, sorry if the title is not clear or is misleading; my question is not exactly easy to understand out of context.
So here it is: I am running a shell script (hello.sh) that needs to relocate itself from /root to /.
Thus I made a simple recursion: test where the script is running from, make a temporary copy, launch it and exit (this last temporary copy will move the original file, and delete itself while still running).
#!/bin/sh
IsTMP=$(echo $0 | grep "tmp")
if [ -z "$IsTMP" ]; then
    cp /root/hello.sh /tmp/hello.sh
    /bin/sh /tmp/hello.sh &
    exit
else
    unlink /hello.sh
    rsync /root/hello.sh /hello.sh
    rm /root/hello.sh
    rm /tmp/hello.sh
fi
while true; do
    sleep 5
    echo "Still Alive"
done
This script works perfectly well and suits my needs (even though it is a horrendous hack): the script is moved and re-executed from a temporary place. However, when I pipe the shell script into tee, just like:
/hello.sh | tee -a /log &
the behaviour is not the same:
hello.sh is exiting, but tee is not.
When I try to kill tee, the temporary copy is automatically killed after a few seconds, without entering the infinite loop.
The behaviour is exactly the same if I replace tee with another binary (e.g. watch, ...), so I am wondering whether it comes from the piping.
Sorry if I am not too clear about my problem. Thanks in advance.
When i try to kill tee, the temporary copy is automatically killed after a few seconds, without entering the infinite loop
That's not the case. The script is entering the infinite loop; the few seconds are the five for which the sleep 5 in the loop pauses, and then it is killed by the signal SIGPIPE (Broken pipe), because it tries to echo "Still Alive" into the pipe, which is closed on the read end since tee has been killed.
There is no link between tee and the second instance
That's not the case. There is a link, namely the pipe: the write end is the standard output of the parent as well as (by inheritance) of the child shell script, and the read end is the standard input of tee. You can see this if you look at ls -l /proc/pid/fd, where pid is the process id of the script's shell on the one hand, and of tee on the other.
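The effect is easy to reproduce without the relocation hack; a minimal sketch of a looping writer whose reader goes away (the bounded loop stands in for the script's infinite one):

```shell
# The writer survives only until its next write after the reader exits;
# then SIGPIPE kills it.
sh -c 'for i in 1 2 3; do echo "Still Alive"; sleep 1; done' | head -n 1
echo "writer exit status: ${PIPESTATUS[0]}"   # 141 = 128 + SIGPIPE(13)
```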

Run two shell scripts in parallel and capture their output

I want to have a shell script which configures several things and then calls two other shell scripts. I want these two scripts to run in parallel, and I want to be able to get and print their live output.
Here is my first script, which calls the other two:
#!/bin/bash
#CONFIGURE SOME STUFF
$path/instance2_commands.sh
$path/instance1_commands.sh
These two processes each try to deploy a different application, and each of them takes around 5 minutes, so I want to run them in parallel and also see their live output, so I know where they are with the deployment tasks. Is this possible?
Running both scripts in parallel can look like this:
#!/bin/bash
#CONFIGURE SOME STUFF
$path/instance2_commands.sh >instance2.out 2>&1 &
$path/instance1_commands.sh >instance1.out 2>&1 &
wait
Notes:
wait pauses until the children, instance1 and instance2, finish
2>&1 on each line redirects error messages to the relevant output file
& at the end of a line causes the main script to continue running after the fork, producing a child that executes that line of the script concurrently with the rest of the main script
each script should send its output to a separate file. Sending both to the same file will be visually messy and impossible to sort out when the instances generate similar output messages.
you may attempt to read the output files while the scripts are running with any reader, e.g. less instance1.out; however, the output may be stuck in a buffer and not up to date. To fix that, the programs would have to open stdout in line-buffered or unbuffered mode. It is also up to you to use -f or > to refresh the display.
Example D from an article on Apache Spark and parallel processing on my blog provides a similar shell script for calculating sums of a series for Pi on all cores, given a C program for calculating the sum on one core. This is a bit beyond the scope of the question, but I mention it in case you'd like to see a deeper example.
It is very possible; change your script to look like this:
#!/bin/bash
#CONFIGURE SOME STUFF
$path/instance2_commands.sh >> script.log 2>&1 &
$path/instance1_commands.sh >> script.log 2>&1 &
They will both output to the same file and you can watch that file by running:
tail -f script.log
If you like, you can output to 2 different files instead; just change each line to append (>>) to a different file name.
This is how I ended up writing it, using Paul's instructions:
source $path/instance2_commands.sh >instance2.out 2>&1 &
source $path/instance1_commands.sh >instance1.out 2>&1 &
tail -q -f instance1.out -f instance2.out --pid $!
wait
sudo rm instance1.out
sudo rm instance2.out
The logs from my two processes were different, so I didn't care that they weren't kept apart; that is why I put them all into one stream.

Stream specific numbered Bash file descriptor into variable

I am trying to stream a specific numbered file descriptor into a variable in Bash. I can do this from normal standard input using the following function, but how do I do it from a specific file descriptor? I would need to direct the FD into the sub-shell if I use the same approach. I could always do it by reading line by line, but if I can do it in a continuous stream then that would be massively preferable.
The function I have is:
streamStdInTo ()
{
    local STORE_INvar="${1}" ; shift
    printf -v "${STORE_INvar}" '%s' "$( cat - )"
}
Yes, I know that this wouldn't normally work, as the end of a pipeline would be lost (due to its execution in a sub-shell); however, either in the context of the Bash 4 set +m ; shopt -s lastpipe method of executing the end of a pipeline in the same shell as the start, or by directing into this via a different file descriptor, I am hoping to be able to use it.
So, my question is: how do I use the above, but with file descriptors other than the normal ones?
It's not entirely clear what you mean, but perhaps you are looking for something like:
cat - <&4 # read from fd 4
Or, just call your current function with the redirect:
streamStdInTo foo <&4
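For example (fd 4, the here-string, and the variable name foo are all arbitrary choices here), a self-contained run of that function against a numbered descriptor:

```shell
streamStdInTo ()
{
    local STORE_INvar="${1}" ; shift
    printf -v "${STORE_INvar}" '%s' "$( cat - )"
}

exec 4<<< $'first line\nsecond line'   # open fd 4 on some test data
streamStdInTo foo <&4                  # capture everything on fd 4 into $foo
echo "$foo"
# prints:
#   first line
#   second line
```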
edit:
Addressing some questions from the comments, you can use a fifo:
#!/bin/bash
trap 'rm -f "$f"' 0
f=$(mktemp -u /tmp/fifo.XXXXXX)   # -u: just generate a name; mkfifo creates it
mkfifo "$f"
echo foo > "$f" &
exec 4< "$f"
cat - <&4
wait
I think there's a lot of confusion about what exactly you're trying to do. If I understand correctly the end goal here is to run a pipeline and capture the output in a variable, right? Kind of like this:
var=$(cmd1 | cmd2)
Except I guess the idea here is that the name of "$var" is stored in another variable:
varname=var
You can do an end-run around Bash's usual job control situation by using process substitution. So instead of this normal pipeline (which would work in ksh or zsh, but not in bash unless you set lastpipe):
cmd1 | cmd2 | read "$varname"
You would use this command, which is equivalent apart from how the shell handles the job:
read "$varname" < <(cmd1 | cmd2)
With process substitution, "read $varname" isn't run in a pipeline, so Bash doesn't fork to run it. (You could use your streamStdInTo() function there as well, of course)
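A minimal runnable version of that pattern (varname as in the question; the echo | tr pipeline is just a placeholder):

```shell
varname=var
# read runs in the current shell; the pipeline runs inside the substitution,
# so the variable assignment survives.
IFS= read -r "$varname" < <(echo hello | tr a-z A-Z)
echo "$var"   # prints HELLO
```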
As I understand it, you wanted to solve this problem by using numeric file descriptors:
cmd1 | cmd2 >&$fd1 &
read "$varname" <&$fd2
To create those file descriptors that connect the pipeline background job to the "read" command, what you need is called a pipe, or a fifo. These can be created without touching the file system (the shell does it all the time!) but the shell doesn't directly expose this functionality, which is why we need to resort to mkfifo to create a named pipe. A named pipe is a special file that exists on the filesystem, but the data you write to it doesn't go to the disk. It's a data queue stored in memory (a pipe). It doesn't need to stay on the filesystem after you've opened it, either, it can be deleted almost immediately:
pipedir=$(mktemp -d /tmp/pipe_maker_XXXX)
mkfifo ${pipedir}/pipe
exec {temp_fd}<>${pipedir}/pipe # Open both ends of the pipe
exec {fd1}>${pipedir}/pipe
exec {fd2}<${pipedir}/pipe
exec {temp_fd}<&- # Close the read/write FD
rm -rf ${pipedir} # Don't need the named FIFO any more
One of the difficulties in working with named pipes in the shell is that attempting to open them just for reading, or just for writing causes the call to block until something opens the other end of the pipe. You can get around that by opening one end in a background job before trying to open the other end, or by opening both ends at once as I did above.
The "{fd}<..." syntax dynamically assigns an unused file descriptor number to the variable $fd and opens the file on that file descriptor. It's been around in ksh for ages (since 1993?), but in Bash I think it only goes back to 4.1 (from 2010).
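Tying the pieces together into one runnable sketch (the fd variable names and the echo | tr pipeline are illustrative, not from the question):

```shell
# Build an anonymous pipe out of a short-lived named fifo, as above.
pipedir=$(mktemp -d /tmp/pipe_maker_XXXX)
mkfifo "${pipedir}/pipe"
exec {temp_fd}<>"${pipedir}/pipe"   # open both ends at once to avoid blocking
exec {fd1}>"${pipedir}/pipe"        # write end
exec {fd2}<"${pipedir}/pipe"        # read end
exec {temp_fd}<&-                   # close the read/write FD
rm -rf "${pipedir}"                 # the fifo file is no longer needed

varname=var
echo hello | tr a-z A-Z >&"$fd1" &  # background pipeline writes into fd1
exec {fd1}>&-                       # close our copy of the write end
IFS= read -r "$varname" <&"$fd2"    # read from fd2 in the current shell
echo "$var"                         # prints HELLO
```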

Getting the tee command to show output promptly, even for one command

I am new to using the tee command.
I am trying to run one of my programs, which takes a long time to finish but prints information as it progresses. I am using tee to save the output to a file as well as to see the output in the shell (bash).
But the problem is that tee doesn't forward the output to the shell until the end of my command.
Is there any way to do that?
I am using Debian and bash.
This actually depends on the amount of output and on the implementation of whatever command you are running. No program is obliged to print straight to stdout or stderr and flush all the time. So even though most C runtime implementations flush after a certain amount of data has been written via one of the stdio routines, such as printf, this may not be true depending on the implementation.
If tee doesn't output right away, it is likely only receiving the input at the very end of your command's run. It might be helpful to mention which exact command it is.
The problem you are experiencing is most probably related to buffering.
You may have a look at stdbuf command, which does the following:
stdbuf - Run COMMAND, with modified buffering operations for its standard streams.
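For example (assuming GNU coreutils; grep here is just a stand-in for a long-running command that block-buffers its stdout when writing into a pipe):

```shell
# Without stdbuf, grep would hold matches in a stdio buffer until it fills
# or the program exits; -oL makes stdout line-buffered, so tee sees each
# match as soon as it is printed.
printf 'foo\nbar\nfoo\n' | stdbuf -oL grep foo | tee /tmp/matches.log
```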
If you were to post your usage I could give a better answer, but as it is,
(for i in `seq 10`; do echo $i; sleep 1s; done) | tee ./tmp
is proper usage of the tee command and seems to work. Replace the part before the pipe with your command and you should be good to go.
