Process substitution >(cmd) doesn't output correctly - linux

I am trying to learn about process substitution and when I execute this:
$ tee >(wc -l) <<< $'aaa\nbbb'
aaa
bbb
$ 2
bash prints the number after the next prompt and then waits for me to press Enter.
I am using bash 4.4.12 and see the same issue with bash 4.3.48. There is no issue in bash 4.3.30, where the command correctly outputs:
$ tee >(wc -l) <<< $'aaa\nbbb'
aaa
bbb
2
What could be the issue?

It's a quirk / design flaw in process substitution: bash only waits for the main command to terminate; it doesn't wait for the substituted processes to end. This creates a race condition when tee ends: does wc finish before bash prints its next prompt? Sometimes it does, sometimes it doesn't.
See this Unix.SE answer by Stéphane Chazelas for a detailed explanation and possible workaround.

When you use process substitution, the process you're substituting runs in the background, and the shell doesn't wait for it to finish before displaying the prompt.
Sometimes it will display its output before the shell displays the next prompt, sometimes it will be a little slower. It has nothing to do with the version of bash; it's purely chance which process is faster.
The shell is waiting for you to press enter because it's already displayed the prompt and it's waiting for you to type another command. The shell doesn't know anything about the 2 that was displayed while you were at the command prompt -- that's output, not part of your input.
This is generally not an issue because you don't usually use process substitution with programs that display output to the user interactively.
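If the stray output bothers you, a possible workaround (a sketch, assuming bash 4.4 or newer, where wait accepts the PID of a process substitution and $! is set to the PID of the most recent one) is to wait for the substituted process explicitly before moving on:
#!/usr/bin/env bash
# tee writes to both the terminal and the process substitution.
tee >(wc -l) <<< $'aaa\nbbb'
# $! holds the PID of the last process substitution; waiting for wc here means
# its count is printed before the next prompt (or the next command).
wait $!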

Related

Bash completion sometimes messes up my terminal when the completion function reads a file

So I've been having a problem with some CLI programs. Sometimes when I kill the running process with Ctrl+C, it leaves the terminal in a weird state (e.g. echo is turned off). That is to be expected in many cases, as killing a process does not give it a chance to restore the terminal's state. But I've discovered that in many other cases, bash completion is the culprit. As an example, try the following:
Start a new bash session as follows: bash --norc to ensure that no completions are loaded.
Define a completion function: _completion_test() { grep -q foo /dev/null; return 1; }.
Define a completion that uses the above function: complete -F _completion_test rlwrap.
Type exactly the following: r l w r a p Space c a t Tab Enter (i.e. rlwrap cat followed by a Tab and then by an Enter).
Wait for a second and then kill the process with Ctrl+C.
The terminal's echo should now have been turned off, so if you type any character, it will not be echoed by the terminal.
What is really weird is that if I remove the seemingly harmless grep -q foo /dev/null from the completion function, everything works correctly. In fact, adding a grep -q foo /dev/null (or something even simpler, such as cat /dev/null) to any completion function installed on my system causes the same issue. I have also reproduced the problem with programs that don't use readline and without Ctrl+C (e.g. find /var followed by Tab, then | head, with the above completion defined for find).
Why does this happen?
Edit: Just to clarify, the above is a contrived example. In reality, what I am trying to do, is more like this:
_completion_test() {
    if grep -q "$1" /some/file; then
        # do something
    else
        # do something else
    fi
}
For a more concrete example, try the following:
_completion_test() {
    if grep -q foo /dev/null; then
        COMPREPLY=(cats)
    else
        return 1
    fi
}
But the mere fact that I am calling grep, causes the problem. I don't see why I can't call grep in this case.
Well, the answer to this is very simple; it's a bug:
This happens when a programmable completion function calls an external command during the execution of a completion function. Bash saves the tty state after every successful job completes, so it can restore it if a job is killed by a signal and leaves the terminal in an undesired state. In this case, we need to suppress that if the job that completes is run during programmable completion, since the terminal settings at that time are as readline sets them for line editing. This fix will be in the release version of bash-4.4.
You're just implementing the completion function incorrectly. See the manual:
-F function
The shell function function is executed in the current shell environment. When it is executed, $1 is the name of the command whose arguments are being completed, $2 is the word being completed, and $3 is the word preceding the word being completed, as described above (see Programmable Completion). When it finishes, the possible completions are retrieved from the value of the COMPREPLY array variable.
For example, the following implementation:
_completion_test() { COMPREPLY=($(cat /dev/null)); return 1; }
doesn't break the terminal.
Regarding your original question of why your completion function breaks the terminal: I played a little with strace and saw that there are ioctl calls with an -echo argument. I assume that when you terminate the command with Ctrl+C, the corresponding ioctl with the echo argument just isn't called to restore the original state. Typing stty echo will bring the echo back.
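For reference, here is that recovery step as a runnable snippet (reset is a heavier, general-purpose alternative mentioned here as an aside, not from the original answer):
stty echo   # re-enable character echoing, as suggested above
reset       # or, if the terminal is more thoroughly confused, reinitialize it entirely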

How to show full output on the Linux shell?

I have a program that runs and shows a GUI window. It also prints a lot of things on the shell. I need to view the first thing printed and the last thing printed. The problem is that when the program terminates, if I scroll to the top of the window, the stuff printed when it began is gone; stuff printed during the program is now at the top, so I can't view the first thing printed.
I also tried doing > out.txt, but the problem is that the file only gets closed and becomes readable when I manually close the GUI window. And if the output is redirected to a file, nothing gets printed on the screen and I have no way to know whether the program has finished. I can't modify any of the code, either.
Is there a way I can see the whole list of text printed on the shell?
Thanks
You can just use the tee command to get the output/errors in a file as well as on the terminal:
your-command |& tee out.log
Just keep in mind that when output goes to a pipe it is block-buffered (typically a few KiB) rather than line-buffered, so it may not appear immediately.
When the output of a program goes to your terminal window, the program generally flushes its output after each newline. This is why you see the output interactively.
When you redirect the output of the program to out.txt, it only flushes its output when its internal buffer is full, which is probably after every 8KiB of output. This is why you don't see anything in the file right away, and you don't see the last things printed by the program until it exits (and flushes its last, partially-full buffer).
You can trick a program into thinking it's sending its output to a terminal using the script command:
script -q -f -c myprogram out.txt
This script command runs myprogram connected to a newly-allocated “pseudo-terminal” (or pty for short). This tricks myprogram into thinking it's talking to a terminal, so it flushes its output on every newline. The script command copies myprogram's output to your terminal window and to the file out.txt.
Note that script will write a header line to out.txt. I can't find a way to disable that on my test Linux system.
In the example above, I assumed your program takes no arguments. If it does, you either need to put the program and arguments in quotes:
script -q -f -c 'myprogram arg1 arg2 arg3' out.txt
Or put the program command line in a shell script and pass that shell script to the script command.
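For instance, the wrapper-script variant could look like this (a sketch; myprogram and its arguments are placeholders carried over from the example above):
cat > run-myprogram.sh <<'SH'
#!/bin/sh
exec myprogram arg1 arg2 arg3
SH
chmod +x run-myprogram.sh
script -q -f -c ./run-myprogram.sh out.txt   # same flags as above: quiet, flush, run the given command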

shell prompt seemingly does not reappear after running a script that uses exec with tee to send stdout output to both the terminal and a file

I have a shell script which writes all output to a logfile
and the terminal; this part works fine. But when I execute the script,
a new shell prompt only appears if I press Enter. Why is that, and how do I fix it?
#!/bin/bash
exec > >(tee logfile)
echo "output"
First, when I'm testing this, there always is a new shell prompt; it's just that sometimes the string output comes after it, so the prompt isn't last. Did you happen to overlook it? If so, there seems to be a race where the shell prints the prompt before the tee in the background completes.
Unfortunately, that cannot be fixed by waiting in the shell for tee; see this question on unix.stackexchange. Fragile workarounds aside, the easiest way to solve this that I see is to put your whole script inside a list:
{
    your-code-here
} | tee logfile
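Applied to the script from the question, that looks like this (a sketch of the same workaround):
#!/bin/bash
{
    echo "output"
    # ... the rest of your script goes here ...
} | tee logfile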
If I run the following script (suppressing the newline from the echo), I see the prompt, but not "output". The string is still written to the file.
#!/bin/bash
exec > >(tee logfile)
echo -n "output"
What I suspect is this: you have three different file descriptors trying to write to the same file (that is, the terminal): standard output of the shell, standard error of the shell, and the standard output of tee. The shell writes synchronously: first the echo to standard output, then the prompt to standard error, so the terminal is able to sequence them correctly. However, the third file descriptor is written to asynchronously by tee, so there is a race condition. I don't quite understand how my modification affects the race, but it appears to upset some balance, allowing the prompt to be written at a different time and appear on the screen. (I expect output buffering to play a part in this).
You might also try running your script after running the script command, which will log everything written to the terminal; if you wade through all the control characters in the file, you may notice the prompt in the file just prior to the output written by tee. In support of my race condition theory, I'll note that after running the script a few times, it was no longer displaying "abnormal" behavior; my shell prompt was displayed as expected after the string "output", so there is definitely some non-deterministic element to this situation.
@chepner's answer provides great background information.
Here's a workaround - works on Ubuntu 12.04 (Linux 3.2.0) and on OS X 10.9.1:
#!/bin/bash
exec > >(tee logfile)
echo "output"
# WORKAROUND - place LAST in your script.
# Execute an executable (as opposed to a builtin) that outputs *something*
# to make the prompt reappear normally.
# In this case we use the printf *executable* to output an *empty string*.
# Use of `$ec` is to ensure that the script's actual exit code is passed through.
ec=$?; $(which printf) ''; exit $ec
Alternatives:
@user2719058's answer shows a simple alternative: wrapping the entire script body in a group command ({ ... }) and piping it to tee logfile.
An external solution, as @chepner has already hinted at, is to use the script utility to create a "transcript" of your script's output in addition to displaying it:
script -qc yourScript /dev/null > logfile # Linux syntax
This, however, will also capture stderr output; if you wanted to avoid that, use:
script -qc 'yourScript 2>/dev/null' /dev/null > logfile
Note, however, that this will suppress stderr output altogether.
As others have noted, it's not that there's no prompt printed -- it's that the last of the output written by tee can come after the prompt, making the prompt no longer visible.
If you have bash 4.4 or newer, you can wait for your tee process to exit, like so:
#!/usr/bin/env bash
case $BASH_VERSION in ''|[0-3].*|4.[0-3]) echo "ERROR: Bash 4.4+ needed" >&2; exit 1;; esac
exec {orig_stdout}>&1 {orig_stderr}>&2      # make a backup of the original stdout and stderr
exec > >(tee -a "_install_log"); tee_pid=$! # track PID of tee after starting it
cleanup() {                                 # define a function we'll call during shutdown
  retval=$?
  exec >&$orig_stdout                       # copy the original stdout back to FD 1, overwriting the pipe to tee
  exec 2>&$orig_stderr                      # if something redirected stderr through tee as well, undo that too
  wait "$tee_pid"                           # now, wait until tee exits
  exit "$retval"                            # and complete the exit with our original exit status
}
trap cleanup EXIT # configure the function above to be called during cleanup
echo "Writing something to stdout here"

Bash Command Substitution Giving Weird Inconsistent Output

For reasons not relevant to this question, I am running a Java server from a bash script, not directly but via command substitution in a separate sub-shell, and in the background. The intent is for the subcommand to return the process id of the Java server as its standard output. The fragment in question is as follows:
launch_daemon()
{
/bin/bash <<EOF
$JAVA_HOME/bin/java $JAVA_OPTS -jar $JAR_FILE daemon $PWD/config/cl.yml <&- &
pid=\$!
echo \${pid} > $PID_FILE
echo \${pid}
EOF
}
daemon_pid=$(launch_daemon)
echo ${daemon_pid} > check.out
The Java daemon in question prints to standard error and quits if there is a problem in initialization, otherwise it closes standard out and standard err and continues on its way. Later in the script (not shown) I do a check to make sure the server process is running. Now on to the problem.
Whenever I check the $PID_FILE above, it contains the correct process id on one line.
But when I check the file check.out, it sometimes contains the correct id, and other times it contains the process id repeated twice on the same line, separated by a space character, as in:
34056 34056
I am using the variable $daemon_pid later in the script to check whether the server is running, so if it contains the pid repeated twice, this totally throws off the test and it incorrectly concludes the server is not running. Fiddling with the script on my server box running CentOS Linux, by putting in more echo statements etc., seems to flip the behavior back to the correct one, with $daemon_pid containing the process id just once; but if I think that has fixed it, check this script into my source code repo, and do a build and deploy again, I start seeing the same bad behavior.
For now I have fixed this by assuming that $daemon_pid could be bad and passing it through awk as follows:
mypid=$(echo ${daemon_pid} | awk '{ gsub(" +.*",""); print $0 }')
Then $mypid always contains the correct process id and things are fine, but needless to say I'd like to understand why it behaves the way it does. And before you ask, I have looked and looked but the Java server in question does NOT print its process id to its standard out before closing standard out.
Would really appreciate expert input.
Following the hint by @WilliamPursell, I tracked this down in the bash source code. I honestly don't know whether it is a bug or not; all I can say is that it seems like an unfortunate interaction with a questionable use case.
TL;DR: You can fix the problem by removing <&- from the script.
Closing stdin is at best questionable, not just for the reason mentioned by @JonathanLeffler ("Programs are entitled to have a standard input that's open.") but more importantly because stdin is being used by the bash process itself, and closing it in the background creates a race condition.
In order to see what's going on, consider the following rather odd script, which might be called Duff's Bash Device, except that I'm not sure that even Duff would approve: (also, as presented, it's not that useful. But someone somewhere has used it in some hack. Or, if not, they will now that they see it.)
/bin/bash <<EOF
if (($1<8)); then head -n-$1 > /dev/null; fi
echo eight
echo seven
echo six
echo five
echo four
echo three
echo two
echo one
EOF
For this to work, bash and head both have to be prepared to share stdin, including sharing the file position. That means that bash needs to make sure that it flushes its read buffer (or not buffer), and head needs to make sure that it seeks back to the end of the part of the input which it uses.
(The hack only works because bash handles here-documents by copying them into a temporary file. If it used a pipe, it wouldn't be possible for head to seek backwards.)
Now, what would have happened if head had run in the background? The answer is, "just about anything is possible", because bash and head are racing to read from the same file descriptor. Running head in the background would be a really bad idea, even worse than the original hack which is at least predictable.
Now, let's go back to the actual program at hand, simplified to its essentials:
/bin/bash <<EOF
cmd <&- &
echo \$!
EOF
Line 2 of this program (cmd <&- &) forks off a separate process (to run in the background). In that process, it closes stdin and then invokes cmd.
Meanwhile, the foreground process continues reading commands from stdin (its stdin fd hasn't been closed, so that's fine), which causes it to execute the echo command.
Now here's the rub: bash knows that it needs to share stdin, so it can't just close stdin. It needs to make sure that stdin's file position is pointing to the right place, even though it may have actually read ahead a buffer's worth of input. So just before it closes stdin, it seeks backwards to the end of the current command line. [1]
If that seek happens before the foreground bash executes echo, then there is no problem. And if it happens after the foreground bash is done with the here-document, also no problem. But what if it happens while the echo is working? In that case, after the echo is done, bash will reread the echo command because stdin has been rewound, and the echo will be executed again.
And that's precisely what is happening in the OP. Sometimes, the background seek completes at just the wrong time, and causes echo \${pid} to be executed twice. In fact, it also causes echo \${pid} > $PID_FILE to execute twice, but that line is idempotent; had it been echo \${pid} >> $PID_FILE, the double execution would have been visible.
So the solution is simple: remove <&- from the server start-up line, and optionally replace it with </dev/null if you want to make sure the server can't read from stdin.
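Concretely, the fixed fragment would look like this (the same code as in the question, with <&- replaced by </dev/null as suggested):
launch_daemon()
{
/bin/bash <<EOF
$JAVA_HOME/bin/java $JAVA_OPTS -jar $JAR_FILE daemon $PWD/config/cl.yml </dev/null &
pid=\$!
echo \${pid} > $PID_FILE
echo \${pid}
EOF
}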
Notes:
Note 1: For those more familiar with the bash source code and its expected behaviour than I am, I believe that the seek and close take place at the end of case r_close_this: in function do_redirection_internal in redir.c, at approximately line 1093:
check_bash_input (redirector);
close_buffered_fd (redirector);
The first call does the lseek and the second one does the close. I saw the behaviour using strace -f and then searched the code for a plausible looking lseek, but I didn't go to the trouble of verifying in a debugger.

process substitution in bash, sometimes I have to press "Enter"

I'm just learning to use process substitution in bash. Here's the command:
echo TEXT > >(tee log)
This is a pointless command but the thing is I have to press Enter after I run it. Why is that?
Sometimes this happens with more useful commands like:
ls SOME_NON_EXISTING_FILE 2> >(tee log)
Actually, pressing Enter is not really needed; you can just type the next command, e.g. date, and check. What is happening is that, because of the process substitution, your command exits first and then the output gets written to your terminal, which gives you the false impression that you need to press Enter.
