Redirect stdout to fifo immediately - linux

I have, for example, a C program that prints three lines, two seconds apart:
printf("Wait 2 seconds...\n");
sleep(2);
printf("Two more\n");
sleep(2);
printf("Quitting in 2 seconds...\n");
sleep(2);
I execute the program and redirect it to a pipe:
./printer > myPipe
On another terminal
cat < myPipe
The second terminal prints everything at once, 6 seconds later! I would like it to print each line as soon as it is available. How can I do that?
Note: I can't change the source code. It's actually the output of a board game algorithm; I need each line immediately so that I can feed it into another algorithm, get the answer back and feed that into the first one...

Change the program to this approach:
printf("Wait 2 seconds...\n");
fflush (stdout);
sleep(2);
printf("Two more\n");
fflush (stdout);
sleep(2);
printf("Quitting in 2 seconds...\n");
fflush (stdout);
sleep(2);
Additional:
If you can't change the program, there is no way to alter its built-in buffering from the inside without hacking it.
If you can relink the program, you could substitute a printf() function that flushes after each call, or change the startup initialization of stdout so that it is unbuffered, or at least line-buffered.

If you can't change the source, you might want to try some of the solutions to this related question:
bash: force exec'd process to have unbuffered stdout
Basically, you have to make the program believe it is running interactively, that is, with its stdout attached to a terminal, because stdio line-buffers terminal output.
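One way to get "interactive" behaviour without touching the source is to give the program a pseudo-terminal as its stdout. The following is a minimal Python sketch, assuming a Linux-like system with pty support; run_on_pty is a hypothetical helper name, not part of any library.

```python
import os
import pty
import subprocess


def run_on_pty(argv):
    """Run argv with stdout on a pseudo-terminal; yield output chunks as they arrive.

    run_on_pty is an illustrative helper, not a library function.
    """
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(argv, stdout=slave_fd)
    os.close(slave_fd)  # the parent only needs the master side
    try:
        while True:
            try:
                chunk = os.read(master_fd, 1024)
            except OSError:
                break  # Linux raises EIO once the child has closed the terminal
            if not chunk:
                break
            yield chunk
    finally:
        os.close(master_fd)
        proc.wait()
```

Iterating over run_on_pty(["./printer"]) should then hand over each line as the program prints it, since the program sees a terminal on stdout. Note that the terminal layer usually turns "\n" into "\r\n" in what you read back.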

I'm assuming that the actual source file is complete. If so, then you have to compile the source and run it to get it to do anything. Using cat will just print the contents of the file, not run it.
If it were written in bash, it would need the +x mode bit set to make it executable, allowing you to run it from a terminal as ./script.
No need to worry about the syntax since you've stated it's not an option to change it and... It's correctly written in C.

Related

Understanding file descriptor duplication in bash

I'm having a hard time understanding something about redirections in bash.
I'll start with what I know:
Each process has file descriptors opened which it can write to/read from. These file descriptors may represent files on disk, terminals, devices, etc.
When we start a terminal with bash, the file descriptors stdin (0), stdout (1) and stderr (2) are open and point to the terminal. Whenever we run a command (a new process), that process inherits the file descriptors of its parent (bash), so by default it prints stdout and stderr messages to the terminal and also reads from the terminal.
When we redirect, for example:
$ ls 1>filelist
We're actually changing file descriptor 1 of the ls process to point to the file filelist instead of the terminal. So when ls calls write(1, ...), the data goes to the file.
So to sum it up, a redirection basically changes which file a given file descriptor of the program refers to.
Now, let's say I have the following C program:
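#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>  /* write() */
int main(void)
{
    int fd = 0;
    /* O_CREAT requires a third argument giving the file mode */
    fd = open("info.log", O_CREAT | O_RDWR, 0644);
    printf("%d", fd);
    write(fd, "INFO::", 6);
    return 0;
}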
This program opens a file info.log, which is referred to by a file descriptor (usually 3).
Indeed, if I now compile this program and run it:
$ ./app
3
It creates the file info.log which contains the "INFO::" text in it.
But here's what I don't get: according to the logic described above, if I now redirect FD 3 to another file:
$ ./app 3> another_file
The text should be written to this other file, but for some reason, it doesn't.
Can someone explain?
Hint: when you run ./app 3> another_file, it'll print "4" instead of "3".
More detailed explanation: when you run ./app 3> another_file in the shell, a series of things happens:
The shell fork()s a subprocess that'll run ./app. The subprocess is basically a clone of its parent process, so it'll still be running the shell program.
In that subprocess, the shell opens "another_file" on file descriptor #3 for writing.
Then it uses one of the exec() family of calls to execute the ./app binary (with "another_file" still open on FD#3).
The program runs open("info.log", O_CREAT | O_RDWR), which creates "info.log" and opens it on the next available file descriptor. Since FD#3 is already in use, that's FD#4.
The program writes "INFO::" to FD#4, which is "info.log".
Since open() uses a new FD, it's not really affected by any active redirects. And actually, if the program did open something on FD#3, that'd replace the connection to "another_file" with whatever it had opened instead, essentially overriding the redirect.
If the program wanted to use the redirect, it'd have to write to FD#3 without first opening anything on it. This is what's normally done with FD#1 and 2 (standard output and error), and that's why redirecting those works.
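The "next available file descriptor" rule can be observed directly. This is a minimal Python sketch under the assumption of a single-threaded process; open_two_files is a hypothetical helper, and the first open merely stands in for what the shell does for 3> another_file.

```python
import os
import tempfile


def open_two_files(directory):
    """Open two files in a row; return (first_fd, second_fd, contents of info.log)."""
    # Stands in for the shell opening "another_file" for the redirect.
    first_fd = os.open(os.path.join(directory, "another_file"),
                       os.O_CREAT | os.O_WRONLY, 0o644)
    # Stands in for the program's own open(): the kernel hands out the
    # lowest descriptor that is still free, so it cannot reuse first_fd.
    log_path = os.path.join(directory, "info.log")
    second_fd = os.open(log_path, os.O_CREAT | os.O_RDWR, 0o644)
    os.write(second_fd, b"INFO::")
    os.close(second_fd)
    os.close(first_fd)
    with open(log_path, "rb") as log:
        return first_fd, second_fd, log.read()


with tempfile.TemporaryDirectory() as tmp:
    first_fd, second_fd, contents = open_two_files(tmp)
    print(first_fd, second_fd, contents)
```

The exact numbers depend on what the process already has open, but the second descriptor is always a different, higher one, and the text ends up in info.log regardless of what the first descriptor points to.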

Bash output happening after prompt, not before, meaning I have to manually press enter

I am having a problem getting bash to do exactly what I want. It's not a major issue, but it's annoying.
1.) I run third-party software that writes some output to stderr. Some of it is useful; some of it is routine noise that I don't want dumped to the screen. I figured the best way to achieve this was to pass stderr to a function and use conditions in that function to decide whether to show each line.
2.) This works fine. However, my implementation prints the errors at the right time but then returns to the bash prompt. I want to summarise the status of the errors at the end of the function, but echoing there prints the text after the prompt, so I have to press enter to get back to a clean prompt. The example below should make this clear.
My error stream generator:
./TestErrorStream.sh
#!/bin/bash
echo "test1" >&2
My function to process this:
./Function.sh
#!/bin/bash
function ProcessErrors()
{
    while read data;
    do
        echo Line was:"$data"
    done
    sleep 5 # This is used simply to simulate the processing work I'm doing on the errors.
    echo "Completed"
}
I source the Function.sh file to make ProcessErrors() available, then I run:
2> >(ProcessErrors) ./TestErrorStream.sh
I expect (and want) to get:
user#user-desktop:~/path$ 2> >(ProcessErrors) ./TestErrorStream.sh
Line was:test1
Completed
user#user-desktop:~/path$
However what I really get is:
user#user-desktop:~/path$ 2> >(ProcessErrors) ./TestErrorStream.sh
Line was:test1
user#user-desktop:~/path$ Completed
And no clean prompt. Of course the prompt is there, but "Completed" is printed after the prompt; I want it printed before, followed by a clean prompt.
NOTE: This is a minimum working example, and it's contrived. While other solutions to my error stream problem are welcome I also want to understand how to make bash run this script the way I want it to.
Thanks for your help
Joey
Your problem is that the while loop stays attached to stdin until the program exits.
stdin is only released when "TestErrorStream.sh" ends, so your prompt comes back almost immediately, while the function still has the rest of its work to do.
I suggest you wrap the command inside a script so that you can control how long to wait before the prompt is back (I suggest 1 second more than the time the function is expected to need to process the remaining lines of code).
I managed to do it like this:
./Functions.sh
#!/bin/bash
function ProcessErrors()
{
    while read data;
    do
        echo Line was:"$data"
    done
    sleep 5 # simulate required time to process end of function (after TestErrorStream.sh is over and stdin is released)
    echo "Completed"
}
./TestErrorStream.sh
#!/bin/bash
echo "first"
echo "firsterr" >&2
sleep 20 # any number here
./WrapTestErrorStream.sh
#!/bin/bash
source ./Functions.sh
2> >(ProcessErrors) ./TestErrorStream.sh
sleep 6 # <= this one is important
With the above you'll get a nice "Completed" before your prompt after 26 seconds of processing. (Works fine with or without the additional "time" command)
user#host:~/path$ time ./WrapTestErrorStream.sh
first
Line was:firsterr
Completed
real 0m26.014s
user 0m0.000s
sys 0m0.000s
user#host:~/path$
Note: bash starts the process substitution ">(ProcessErrors)" asynchronously and does not wait for it. So when "./TestErrorStream.sh" ends, nothing keeps the wrapper from exiting while the function is still running. That's why we need that final "sleep 6".
#!/bin/bash
function ProcessErrors {
    while read data; do
        echo Line was:"$data"
    done
    sleep 5
    echo "Completed"
}
# Open subprocess
exec 60> >(ProcessErrors)
P=$!
# Do the work
2>&60 ./TestErrorStream.sh
# Close connection or else subprocess would keep on reading
exec 60>&-
# Wait for process to exit (wait "$P" doesn't work). There are many ways
# to do this too like checking `/proc`. I prefer the `kill` method as
# it's more explicit. We'd never know if /proc updates itself quickly
# among all systems. And using an external tool is also a big NO.
while kill -s 0 "$P" &>/dev/null; do
    sleep 1s
done
Off topic side-note: I'd love to see how posturing bash veterans/authors try to own this. Or perhaps they already did way way back from seeing this.
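A different route to the same goal, swapped in here for illustration rather than taken from the answers above: let a small driver own the child's stderr, so that control only returns after the filtering and the summary are done. This is a minimal Python sketch; filter_stderr, report and the keep predicate are hypothetical names.

```python
import subprocess


def filter_stderr(argv, keep):
    """Run argv; return (kept stderr lines, exit status) once the child is done."""
    proc = subprocess.Popen(argv, stderr=subprocess.PIPE, text=True)
    kept = []
    # Iteration ends at EOF, i.e. when the child has closed its stderr.
    for line in proc.stderr:
        line = line.rstrip("\n")
        if keep(line):
            kept.append(line)
    proc.stderr.close()
    status = proc.wait()  # block until the child has exited
    return kept, status


def report(argv, keep):
    """Print the interesting lines and a summary before handing control back."""
    kept, status = filter_stderr(argv, keep)
    for line in kept:
        print("Line was:" + line)
    print("Completed")
    return status
```

Calling report(["./TestErrorStream.sh"], lambda line: True) from a script would print "Completed" before that script exits, so the prompt comes back afterwards. The trade-off is that lines are shown only after the child finishes; printing inside the loop instead would show them as they arrive.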

What is the order of redirection in terminal?

I want to take input from the file input.txt and write the output of the execution to output.txt. What is the right order? The command below does not work.
./a.out < input.txt > output.txt
EDIT
Do I have to wait for the execution to complete before the output is written? I usually interrupt the run in the middle to see whether output is being written, as the run time is very high.
CLARIFICATION:
This C program (P1) iterates through a loop and feeds the loop value x to a system() call, which calls another C program (P2) using ./P2 < x. Program P2 executes for each value of x and prints to the screen. I want the complete output of both programs to go to output.txt.
If you're killing the command before it finishes, this is probably a buffering issue. Line-buffered terminal output and block-buffered file output are default behaviors in the C stdio library, so redirection can cause output to be buffered until a few kilobytes have been written.
Some programs have a command line option to force line-buffered or unbuffered output. They do this by calling setvbuf. If a.out is a program you wrote, you could add setvbuf(stdout, NULL, _IOLBF, 0); at the start of main(), before anything is written to stdout.
If the program is not yours and you can't recompile it, there is a utility called stdbuf that might help, as in stdbuf -oL ./a.out < in > out
stdbuf is kind of a kludge though. I wouldn't use it unless there is no other option.
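The difference between line-buffered and block-buffered output can be seen in isolation. The sketch below uses Python's io layer as a stand-in for C stdio (an assumption made for illustration; the line_buffering flag plays the role of _IOLBF here), and visible_right_after_newline is a hypothetical helper.

```python
import io
import os
import select


def visible_right_after_newline(line_buffering):
    """Write one line through a text stream into a pipe; report whether it is readable at once."""
    read_fd, write_fd = os.pipe()
    writer = io.TextIOWrapper(os.fdopen(write_fd, "wb"), encoding="utf-8",
                              line_buffering=line_buffering)
    writer.write("Wait 2 seconds...\n")
    # Poll without blocking: has anything reached the pipe yet?
    readable, _, _ = select.select([read_fd], [], [], 0)
    writer.close()  # closing flushes whatever is still buffered
    os.close(read_fd)
    return bool(readable)


print("line-buffered: ", visible_right_after_newline(True))
print("block-buffered:", visible_right_after_newline(False))
```

With line buffering the reader can see the line as soon as the newline is written; without it, the short line stays in the writer's buffer until the buffer fills, is flushed, or the stream is closed.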

Shell script to call external program which has user-interface

I have an external program, say a.out, which while running asks for an input parameter, i.e.,
./a.out
Please select either 1 or 2:
1) this will do something
2) this will do something else
Then when I enter '1', it will do its job. I don't have the code itself but just binary so can't change it.
I want to write a shell script which runs a.out and also inserts '1' in.
I tried many things including silly things like:
./a.out 1
./a.out << 1
./a.out < 1
etc.
but none of them work.
Could you please let me know if there is any way to write such a shell script?
Thanks,
dbm368
I think you just need a pipe. For example:
echo 1 | ./a.out
In general terms, a pipe takes whatever the program on the left writes to stdout and feeds it to the stdin of the program on the right.
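The same idea, driven from a script rather than the shell, might look like the following minimal Python sketch. run_with_input is a hypothetical helper, and it assumes the program reads its answer from stdin rather than directly from the terminal.

```python
import subprocess


def run_with_input(argv, text):
    """Run argv, feed text to its stdin, and return what it wrote to stdout."""
    result = subprocess.run(argv, input=text, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

For the program in the question this would be called as run_with_input(["./a.out"], "1\n"); the trailing newline stands for the Enter key.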

on-the-fly output redirection, seeing the file redirection output while the program is still running

If I use a command like this one:
./program >> a.txt &
and the program is a long-running one, then I can only see the output once the program has ended. That means I have no way of knowing whether the computation is going well until it actually stops computing. I want to be able to read the redirected output in the file while the program is running.
This is similar to opening a file, appending to it, then closing it back after every writing. If the file is only closed at the end of the program then no data can be read on it until the program ends. The only redirection I know is similar to closing the file at the end of the program.
You can test it with this little python script. The language doesn't matter. Any program that writes to standard output has the same problem.
l = range(0, 100000)
for i in l:
    if i % 1000 == 0:
        print(i)
    for j in l:
        s = i + j
One can run this with:
python program.py >> a.txt &
Then run cat a.txt: you will only get results once the script is done computing.
From the stdout manual page:
The stream stderr is unbuffered. The stream stdout is line-buffered when it points to a terminal. Partial lines will not appear until fflush(3) or exit(3) is called, or a newline is printed.
Bottom line: Unless the output is a terminal, your program will have its standard output in fully buffered mode by default. This essentially means that it will output data in large-ish blocks, rather than line-by-line, let alone character-by-character.
Ways to work around this:
Fix your program: If you need real-time output, you need to fix your program. In C you can use fflush(stdout) after each output statement, or setvbuf() to change the buffering mode of the standard output. For Python there is sys.stdout.flush(), or even some of the suggestions here.
Use a utility that can record from a PTY, rather than outright stdout redirections. GNU Screen can do this for you:
screen -d -m -L python test.py
would be a start. This will log the output of your program to a file called screenlog.0 (or similar) in your current directory with a default delay of 10 seconds, and you can use screen to connect to the session where your command is running to provide input or terminate it. The delay and the name of the logfile can be changed in a configuration file or manually once you connect to the background session.
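The "fix your program" route from the list above can be checked with a small experiment: a child that flushes after printing, and a parent that reads its output through a pipe. This is a minimal Python sketch; CHILD and first_line_arrives_early are hypothetical names, and the 2-second sleep merely simulates a long computation.

```python
import subprocess
import sys

# Child program: prints a line, flushes it, then keeps "computing" for a while.
CHILD = (
    "import sys, time\n"
    "print('step 0')\n"
    "sys.stdout.flush()\n"
    "time.sleep(2)\n"
    "print('done')\n"
)


def first_line_arrives_early():
    """Return (first line, whether the child was still running when it arrived)."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE, text=True)
    first = proc.stdout.readline()
    still_running = proc.poll() is None
    proc.stdout.read()  # drain the remaining output
    proc.stdout.close()
    proc.wait()
    return first, still_running
```

Because of the explicit flush, the first line reaches the reader while the child is still busy. Removing the sys.stdout.flush() line from CHILD would typically delay it until the child exits, which is the behaviour described in the question.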
EDIT:
On most Linux systems there is a third workaround: You can use the LD_PRELOAD variable and a preloaded library to override select functions of the C library and use them to set the stdout buffering mode when those functions are called by your program. This method may work, but it has a number of disadvantages:
It won't work at all on static executables
It's fragile and rather ugly.
It won't work at all with SUID executables - the dynamic loader will refuse to read the LD_PRELOAD variable when loading such executables for security reasons.
It's fragile and rather ugly.
It requires that you find and override a library function that is called by your program after it initially sets the stdout buffering mode and preferably before any output. getenv() is a good choice for many programs, but not all. You may have to override common I/O functions such as printf() or fwrite() - if push comes to shove you may just have to override all functions that control the buffering mode and introduce a special condition for stdout.
It's fragile and rather ugly.
It's hard to ensure that there are no unwelcome side-effects. To do this right you'd have to ensure that only stdout is affected and that your overrides will not crash the rest of the program if e.g. stdout is closed.
Did I mention that it's fragile and rather ugly?
That said, the process is relatively simple. You put the replacement functions in a C file, e.g. linebufferedstdout.c:
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>
char *getenv(const char *s) {
    static char *(*getenv_real)(const char *s) = NULL;
    if (getenv_real == NULL) {
        getenv_real = dlsym(RTLD_NEXT, "getenv");
        setlinebuf(stdout);
    }
    return getenv_real(s);
}
Then you compile that file as a shared object:
gcc -O2 -o linebufferedstdout.so -fpic -shared linebufferedstdout.c -ldl -lc
Then you set the LD_PRELOAD variable to load it along with your program:
$ LD_PRELOAD=./linebufferedstdout.so python test.py | tee -a test.out
0
1000
2000
3000
4000
If you are lucky, your problem will be solved with no unfortunate side-effects.
You can set the LD_PRELOAD library in the shell, if necessary, or even specify that library system-wide (definitely NOT recommended) in /etc/ld.so.preload.
If you're trying to modify the behavior of an existing program try stdbuf (part of coreutils starting with version 7.5 apparently).
This buffers stdout up to a line:
stdbuf -oL command > output
This disables stdout buffering altogether:
stdbuf -o0 command > output
Have you considered piping to tee?
./program | tee a.txt
However, even tee won't work if "program" doesn't write anything to stdout until it is done. So, the effectiveness depends a lot on how your program behaves.
If the program writes to a file, you can read it while it is being written using tail -f a.txt.
Your problem is that most programs check whether their output is a terminal. If it is, output is buffered one line at a time (so each line is written as it is generated), but if it is not, output is buffered in larger chunks (4096 bytes at a time is typical). This is normal behaviour in the C library (when using printf, for example) and in the C++ library (when using cout, for example), so any program written in C or C++ will do this.
The interpreters of most scripting languages (Perl, Python, etc.) are written in C or C++, so they show the same kind of buffering behaviour.
The answer above (using LD_PRELOAD) can be made to work on perl or python scripts, since the interpreters are themselves written in C.
The unbuffer command from the expect package does exactly what you are looking for.
$ sudo apt-get install expect
$ unbuffer python program.py | cat -
<watch output immediately show up here>

Resources