Need explanations for Linux bash builtin exec command behavior


From Bash Reference Manual I get the following about exec bash builtin command:
If command is supplied, it replaces the shell without creating a new process.
Now I have the following bash script:
#!/bin/bash
exec ls;
echo 123;
exit 0
When I execute it, I get this:
cleanup.sh ex1.bash file.bash file.bash~ output.log
(files from the current directory)
Now, if I have this script:
#!/bin/bash
exec ls | cat
echo 123
exit 0
I get the following output:
cleanup.sh
ex1.bash
file.bash
file.bash~
output.log
123
My question is:
If exec replaces the shell without creating a new process, why is echo 123 printed when I append | cat, but not without it? I would be happy if someone could explain the logic behind this behavior.
Thanks.
EDIT:
After @torek's response, I get behavior that is even harder to explain:
1. exec ls>out creates the file out and puts the output of ls in it;
2. exec ls>out1 ls>out2 creates both files but puts nothing in either of them. If the command works as suggested, I think command 2 should have the same result as command 1 (moreover, I think it should not have created the out2 file at all).

In this particular case, you have the exec in a pipeline. In order to execute a series of pipeline commands, the shell must initially fork, making a sub-shell. (Specifically it has to create the pipe, then fork, so that everything run "on the left" of the pipe can have its output sent to whatever is "on the right" of the pipe.)
To see that this is in fact what is happening, compare:
{ ls; echo this too; } | cat
with:
{ exec ls; echo this too; } | cat
The former runs ls without leaving the sub-shell, so that this sub-shell is therefore still around to run the echo. The latter runs ls by leaving the sub-shell, which is therefore no longer there to do the echo, and this too is not printed.
(The use of curly-braces { cmd1; cmd2; } normally suppresses the sub-shell fork action that you get with parentheses (cmd1; cmd2), but in the case of a pipe, the fork is "forced", as it were.)
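The comparison above can be reproduced with predictable output by using echo in place of ls. This is only a sketch: the variable names are illustrative, and it assumes an external echo program is available on PATH for exec to run.

```shell
#!/bin/bash
# Without exec: the sub-shell forked for the left side of the pipe
# survives the first command and goes on to run the second one.
without_exec=$({ echo first; echo second; } | cat)

# With exec: that sub-shell is replaced by the external echo program,
# so nothing is left to run the second command.
with_exec=$({ exec echo first; echo second; } | cat)

printf 'without exec: %s\n' "${without_exec//$'\n'/ }"
printf 'with exec:    %s\n' "$with_exec"
echo "the parent shell is still running"
```

In both cases the script itself keeps going, because only the forked sub-shell was affected.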
Redirection of the current shell happens only if there is "nothing to run", as it were, after the word exec. Thus, e.g., exec >stdout 4<input 5>>append modifies the current shell, but exec foo >stdout 4<input 5>>append tries to exec command foo. [Note: this is not strictly accurate; see addendum.]
Interestingly, in an interactive shell, after exec foo >output fails because there is no command foo, the shell sticks around, but stdout remains redirected to file output. (You can recover with exec >/dev/tty. In a script, the failure to exec foo terminates the script.)
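As a small sketch of the "nothing to run" case, the following redirects the current shell and then undoes it. Instead of recovering through /dev/tty, it saves the original stdout on a spare descriptor first; the choice of fd 3 and the temporary file are assumptions made for the example.

```shell
#!/bin/bash
log=$(mktemp)          # scratch file for this demonstration
exec 3>&1              # no command: save the current stdout on fd 3
exec >"$log"           # no command: stdout of this shell now goes to the file
echo "written while redirected"
exec >&3 3>&-          # put stdout back and close the spare descriptor
captured=$(cat "$log")
rm -f "$log"
echo "file contained: $captured"
```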
With a tip of the hat to @Pumbaa80, here's something even more illustrative:
#! /bin/bash
shopt -s execfail
exec ls | cat -E
echo this goes to stdout
echo this goes to stderr 1>&2
(note: cat -E is simplified down from my usual cat -vET, which is my handy go-to for "let me see non-printing characters in a recognizable way"). When this script is run, the output from ls has cat -E applied (on Linux this makes end-of-line visible as a $ sign), but the output sent to stdout and stderr (on the remaining two lines) is not redirected. Change the | cat -E to > out and, after the script runs, observe the contents of file out: the final two echos are not in there.
Now change the ls to foo (or some other command that will not be found) and run the script again. This time the output is:
$ ./demo.sh
./demo.sh: line 3: exec: foo: not found
this goes to stderr
and the file out now has the contents produced by the first echo line.
This makes what exec "really does" as obvious as possible (but no more obvious, as Albert Einstein did not put it :-) ).
Normally, when the shell goes to execute a "simple command" (see the manual page for the precise definition, but this specifically excludes the commands in a "pipeline"), it prepares any I/O redirection operations specified with <, >, and so on by opening the files needed. Then the shell invokes fork (or some equivalent but more-efficient variant like vfork or clone depending on underlying OS, configuration, etc), and, in the child process, rearranges the open file descriptors (using dup2 calls or equivalent) to achieve the desired final arrangements: > out moves the open descriptor to fd 1—stdout—while 6> out moves the open descriptor to fd 6.
If you specify the exec keyword, though, the shell suppresses the fork step. It does all the file opening and file-descriptor-rearranging as usual, but this time, it affects any and all subsequent commands. Finally, having done all the redirections, the shell attempts to execve() (in the system-call sense) the command, if there is one. If there is no command, or if the execve() call fails and the shell is supposed to continue running (is interactive or you have set execfail), the shell soldiers on. If the execve() succeeds, the shell no longer exists, having been replaced by the new command. If execfail is unset and the shell is not interactive, the shell exits.
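The effect of execfail on a non-interactive shell can be checked with a short sketch like the one below. The command name no-such-command-xyz is made up and assumed not to exist, and each case runs in a child bash so that a failed exec cannot end the demonstration itself.

```shell
#!/bin/bash
# Default: a non-interactive shell exits when exec cannot run the command,
# so the echo after it never runs. "|| true" only absorbs the child's
# non-zero exit status.
default_out=$(bash -c 'exec no-such-command-xyz 2>/dev/null; echo "still running"' || true)

# With execfail set, the shell survives the failed exec and continues.
execfail_out=$(bash -c 'shopt -s execfail; exec no-such-command-xyz 2>/dev/null; echo "still running"' || true)

echo "default:  '${default_out}'"
echo "execfail: '${execfail_out}'"
```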
(There's also the added complication of the command_not_found_handle shell function: bash's exec seems to suppress running it, based on test results. The exec keyword in general makes the shell not look at its own functions, i.e., if you have a shell function f, running f as a simple command runs the shell function, as does (f) which runs it in a sub-shell, but running (exec f) skips over it.)
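A quick way to see exec skipping shell functions is to shadow an external program with a function of the same name. In this sketch the function deliberately reuses the name uname, on the assumption that an external uname exists on PATH.

```shell
#!/bin/bash
# A shell function that deliberately shadows the external uname program.
uname() { echo "shell function"; }

as_simple_command=$(uname)     # runs the function
in_subshell=$( (uname) )       # still the function, inside a sub-shell
via_exec=$( (exec uname) )     # exec ignores the function and runs the real uname

echo "simple command: $as_simple_command"
echo "sub-shell:      $in_subshell"
echo "exec:           $via_exec"
```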
As for why ls>out1 ls>out2 creates two files (with or without an exec), this is simple enough: the shell opens each redirection, and then uses dup2 to move the file descriptors. If you have two ordinary > redirects, the shell opens both, moves the first one to fd 1 (stdout), then moves the second one to fd 1 (stdout again), closing the first in the process. Finally, it runs ls ls, because that's what's left after removing the >out1 >out2. As long as there is no file named ls, the ls command complains to stderr, and writes nothing to stdout.
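The same thing can be observed without exec or ls. This sketch uses echo and a temporary directory (both choices are just for the demonstration) to show that each redirection creates its file but only the last one receives the output.

```shell
#!/bin/bash
dir=$(mktemp -d)

# Both files are opened (and therefore created); only the last redirect
# ends up as fd 1 when echo runs.
echo hello >"$dir/out1" >"$dir/out2"

if [ -e "$dir/out1" ] && [ ! -s "$dir/out1" ]; then
    out1_state="created but empty"
else
    out1_state="unexpected"
fi
out2_content=$(cat "$dir/out2")
rm -rf "$dir"

echo "out1: $out1_state"
echo "out2: $out2_content"
```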

Related

How to capture error messages from a program that fails only outside the terminal?

On a Linux server, I have a script here that will work fine when I start it from the terminal, but fail when started and then detached by another process. So there is probably a difference in the script's environment to fix.
The trouble is, the other process integrating that script does not provide access to its error messages when the script fails. What is an easy (and ideally generic) way to see the output of such a script when it's failing?
Let's assume I have no easy way to change the code of the process calling this script. The failure happens right at the start of the script's run, so there is not enough time to manually attach to it with strace to see its output.
(The specifics should not matter, but for what it's worth: the failing script is the backup script of Discourse, a widespread open source forum software. Discourse and this script are written in Ruby.)

The idea is to substitute the original script with a wrapper that calls the original script and saves its stdout and stderr to files. The wrapper may look like this:
#!/bin/bash
exec /path/to/original/script "$@" 1> >(tee /tmp/out.log) 2> >(tee /tmp/err.log >&2)
1> >(tee /tmp/out.log) redirects stdout to the input of tee /tmp/out.log, which runs in a subshell. tee /tmp/out.log passes the data on to stdout and saves a copy to the file.
2> >(tee /tmp/err.log >&2) redirects stderr to the input of tee /tmp/err.log >&2, which also runs in a subshell. tee /tmp/err.log >&2 passes the data on to stderr and saves a copy to the file.
If the script is invoked multiple times, you may want to append stdout and stderr to the files. Use tee -a in this case.
The remaining problem is how to force the caller to execute the wrapper script instead of the original one.
If the caller invokes the script in a way that searches PATH, you can put the wrapper script in a separate directory and provide a modified PATH to the caller. For example, if the script name is script, put the wrapper at /some/dir/script and run the caller as
$ PATH="/some/dir:$PATH" caller
In this case /path/to/original/script in the wrapper must be an absolute path.
If the caller invokes the script from a specific path, you can rename the original script, e.g. to original-script, and name the wrapper script. In this case the wrapper should call /path/to/original/original-script.
Another problem may arise if the script behaves differently depending on the name it is called by. In this case exec -a ... may be needed.
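For reference, here is a minimal sketch of what exec -a does: it sets argv[0] of the program being started. The name custom-name is arbitrary, and a child bash is used only because it can easily report the name it was started under.

```shell
#!/bin/bash
# exec -a NAME starts the program with argv[0] set to NAME.
# The command substitution runs in a sub-shell, so exec does not
# replace this script itself.
seen_name=$(exec -a custom-name bash -c 'echo "$0"')
echo "the child saw its name as: $seen_name"
```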

You can use a bash script that (1) does "busy waiting" until it sees the targeted process, and then (2) immediately attaches to it with strace and prints its output to the terminal.
#!/bin/sh
# Adapt to a regex that matches only your target process' full command.
name_pattern="bin/ruby.*spawn_backup_restore.rb"
# Wait for a process to start, based on its name, and capture its PID.
# Inspiration and details: https://unix.stackexchange.com/a/410075
pid=
while [ -z "$pid" ] ; do
pid="$(pgrep --full "$name_pattern" | head -n 1)"
# Set delay for next check to 1ms to try capturing all output.
# Remove completely if this is not enough to capture from the start.
sleep 0.001
done
echo "target process has started, pid is $pid"
# Print all stdout and stderr output of the process we found.
# Source and explanations: https://unix.stackexchange.com/a/58601
strace -p "$pid" -s 9999 -e write

How redirecting output from a function works in bash?

From what I've learned, also stated in the answer to this thread, redirection of stdout works as follows:
When we do something like: ls > dirlist
bash does the following:
forks a process, which still runs bash
in the subprocess, open the file dirlist for writing on file descriptor 1
calling exec passing to it the ls executable.
this way, when ls writes to FD 1, it actually writes to the file.
With this in mind, I wonder about the following:
$ foo() { echo "hello" ; }
$ foo > file
$ cat file
hello
As far as I know, functions run in the same shell process, so how does redirection work in that case?

Redirection itself is just a shell construct, so the shell can make it work however it wants. Every command, whether external processes or shell builtins, has its own idea of standard output, and standard output is inherited just as it is by child processes from parent processes. In this case, the command foo either inherits its standard output from the shell or takes whatever file a shell redirection specifies. Once inside the function, echo writes to whatever file it inherits from foo.
Put another way, for its own built-in commands (which includes functions, compound statements like while, if, etc) the shell effectively simulates exec without actually calling exec.
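A short sketch of this behavior: a redirection applied to a function call covers both builtins and external programs run inside it, and is undone when the function returns. The temporary file and the use of env to force an external echo are choices made for the example.

```shell
#!/bin/bash
log=$(mktemp)

foo() {
    echo "from a builtin"
    env echo "from an external program"   # env runs the external echo
}

foo >"$log"                         # applies to everything inside foo
echo "back on the original stdout"  # not written to the file

captured=$(cat "$log")
rm -f "$log"
printf 'captured in file:\n%s\n' "$captured"
```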

Get full command from shell script

I'm looking for a way to access the full command from shell script, e.g.
Assume I have a script called test.sh. When I run it, the command line is passed to ruby as is (except the script itself is removed).
$ test.sh print ENV['HOME']
Is equivalent to
$ ruby -e "print ENV['HOME']"

When you run:
test.sh print ENV['HOME']
...then, before test.sh is started, the shell runs string-splitting, expansion, and similar processes. Thus, what's eventually run is (assuming no glob expansion):
execvp("test.sh", {"test.sh", "print", "ENV[HOME]"});
If you have a file named ENVH in the current directory, the shell may treat ENV['HOME'] as a glob, expanding it by replacing the glob expression with the filename, and thus running:
execvp("test.sh", {"test.sh", "print", "ENVH"});
...in any event, what exists on the other side of the execv*-series call done to run the new program has no information which was local to the original shell -- and thus no way of knowing what the original command was before parsing and expansion. Thus, it is impossible to retrieve the original string unless the outer shell is modified to expose it out-of-band (as via an environment variable).
This is why your calling convention should instead require:
test.sh "print ENV['HOME']"
or, allowing even more freedom from shell quoting/escaping syntax, passing program text via stdin, as with:
test.sh <<'EOF'
print ENV['HOME']
EOF
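The expansion steps described above can be observed directly. In this sketch, show_args is a hypothetical stand-in for test.sh that prints each argument it receives inside angle brackets; it runs in an empty temporary directory so that the glob behavior is controlled.

```shell
#!/bin/bash
# show_args is a hypothetical stand-in for test.sh.
show_args() { printf '<%s>' "$@"; echo; }

dir=$(mktemp -d)
cd "$dir" || exit 1

unquoted=$(show_args print ENV['HOME'])   # quotes removed, glob matches nothing
quoted=$(show_args "print ENV['HOME']")   # a single argument, text preserved
touch ENVH
globbed=$(show_args print ENV['HOME'])    # the glob now matches the file ENVH

cd "$OLDPWD" || exit 1
rm -rf "$dir"

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
echo "globbed:  $globbed"
```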
Now, if you want to modify your shell to do that, I'd suggest a function that exposes BASH_COMMAND. For instance:
shopt -s extdebug
expose_command() {
export SHELL_COMMAND="$BASH_COMMAND"
return 0
}
trap expose_command DEBUG
...then, inside test.sh, you can refer to SHELL_COMMAND. Again, however: This will only work if the calling shell had that trap configured, as within a user's ~/.bashrc; you can't simply put the above content in a script and expect it to work, because it's only the interactive shell -- the script's parent process -- that has access to this information and is thus able to expose it.

Linux All Output to a File

Is there any way to tell a Linux system to put all output (stdout, stderr) into a file?
Without using redirection or a pipe, and without modifying how the scripts get called.
Just tell Linux to use a file for output.
for example:
script test1.sh:
#!/bin/bash
echo "Testing 123 "
If I run it like "./test1.sh" (without redirection or a pipe),
I'd like to see "Testing 123" in a file (/tmp/linux_output).
Problem: in the system a binary calls a script, and this script calls many other scripts. It is not possible to modify each call, so if I can make Linux put the "output" into a file, I can review the logs.

#!/bin/bash
exec >file 2>&1
echo "Testing 123 "
You can read more about exec here

If you are running the program from a terminal, you can use the command script.
It will open up a sub-shell. Do what you need to do.
It will copy all output to the terminal into a file. When you are done, exit the shell. ^D, or exit.
This does not use redirection or pipes.

You could set your terminal's scrollback buffer to a large number of lines and then see all the output from your commands in the buffer - depending on your terminal window and the options in its menus, there may be an option in there to capture terminal I/O to a file.

Your requirement if taken literally is an impractical one, because it is based in a slight misunderstanding. Fundamentally, to get the output to go in a file, you will have to change something to direct it there - which would violate your literal constraint.
But the practical problem is solvable, because unless explicitly counteracted in the child, the output directions configured in a parent process will be inherited. So you only have to setup the redirection once, using either a shell, or a custom launcher program or intermediary. After that it will be inherited.
So, for example:
cat > test.sh
#!/bin/sh
echo "hello on stdout"
rm nosuchfile
./test2.sh
And a child script for it to call
cat > test2.sh
#!/bin/sh
echo "hello on stdout from script 2"
rm thisfileisnteither
./nonexistantscript.sh
Run the first script redirecting both stdout and stderr (bash version - and you can do this in many ways such as by writing a C program that redirects its outputs then exec()'s your real program)
./test.sh &> logfile
Now examine the file and see results from stdout and stderr of both parent and child.
cat logfile
hello on stdout
rm: nosuchfile: No such file or directory
hello on stdout from script 2
rm: thisfileisnteither: No such file or directory
./test2.sh: line 4: ./nonexistantscript.sh: No such file or directory
Of course, if you really dislike this, you can always modify the kernel - but again, that is changing something (and a very ungainly solution too).

shell prompt seemingly does not reappear after running a script that uses exec with tee to send stdout output to both the terminal and a file

I have a shell script which writes all output to logfile
and terminal, this part works fine, but if I execute the script
a new shell prompt only appear if I press enter. Why is that and how do I fix it?
#!/bin/bash
exec > >(tee logfile)
echo "output"

First, when I'm testing this, there always is a new shell prompt, it's just that sometimes the string output comes after it, so the prompt isn't last. Did you happen to overlook it? If so, there seems to be a race where the shell prints the prompt before the tee in the background completes.
Unfortunately, that cannot be fixed by waiting in the shell for tee, see this question on unix.stackexchange. Fragile workarounds aside, the easiest way to solve this that I see is to put your whole script inside a list:
{
your-code-here
} | tee logfile
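A runnable sketch of this pattern is shown below, with a temporary file standing in for logfile. Because tee is part of the pipeline, the shell waits for it before moving on, so the log is complete by the time the next command runs.

```shell
#!/bin/bash
log=$(mktemp)   # stands in for "logfile" in this demonstration

{
    echo "output"
    echo "more output"
} | tee "$log"

# The pipeline has finished here, so tee has already written everything.
logged=$(cat "$log")
rm -f "$log"
```

One thing to keep in mind with this design is that the group runs in a sub-shell because it is part of a pipeline, so variables assigned inside the braces are not visible after the closing brace.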

If I run the following script (suppressing the newline from the echo), I see the prompt, but not "output". The string is still written to the file.
#!/bin/bash
exec > >(tee logfile)
echo -n "output"
What I suspect is this: you have three different file descriptors trying to write to the same file (that is, the terminal): standard output of the shell, standard error of the shell, and the standard output of tee. The shell writes synchronously: first the echo to standard output, then the prompt to standard error, so the terminal is able to sequence them correctly. However, the third file descriptor is written to asynchronously by tee, so there is a race condition. I don't quite understand how my modification affects the race, but it appears to upset some balance, allowing the prompt to be written at a different time and appear on the screen. (I expect output buffering to play a part in this).
You might also try running your script after running the script command, which will log everything written to the terminal; if you wade through all the control characters in the file, you may notice the prompt in the file just prior to the output written by tee. In support of my race condition theory, I'll note that after running the script a few times, it was no longer displaying "abnormal" behavior; my shell prompt was displayed as expected after the string "output", so there is definitely some non-deterministic element to this situation.

@chepner's answer provides great background information.
Here's a workaround - works on Ubuntu 12.04 (Linux 3.2.0) and on OS X 10.9.1:
#!/bin/bash
exec > >(tee logfile)
echo "output"
# WORKAROUND - place LAST in your script.
# Execute an executable (as opposed to a builtin) that outputs *something*
# to make the prompt reappear normally.
# In this case we use the printf *executable* to output an *empty string*.
# Use of `$ec` is to ensure that the script's actual exit code is passed through.
ec=$?; $(which printf) ''; exit $ec
Alternatives:
@user2719058's answer shows a simple alternative: wrapping the entire script body in a group command ({ ... }) and piping it to tee logfile.
An external solution, as @chepner has already hinted at, is to use the script utility to create a "transcript" of your script's output in addition to displaying it:
script -qc yourScript /dev/null > logfile # Linux syntax
This, however, will also capture stderr output; if you wanted to avoid that, use:
script -qc 'yourScript 2>/dev/null' /dev/null > logfile
Note, however, that this will suppress stderr output altogether.

As others have noted, it's not that there's no prompt printed -- it's that the last of the output written by tee can come after the prompt, making the prompt no longer visible.
If you have bash 4.4 or newer, you can wait for your tee process to exit, like so:
#!/usr/bin/env bash
case $BASH_VERSION in ''|[0-3].*|4.[0-3].*) echo "ERROR: Bash 4.4+ needed" >&2; exit 1;; esac
exec {orig_stdout}>&1 {orig_stderr}>&2 # make a backup of original stdout and stderr
exec > >(tee -a "_install_log"); tee_pid=$! # track PID of tee after starting it
cleanup() { # define a function we'll call during shutdown
retval=$?
exec >&$orig_stdout # Copy your original stdout back to FD 1, overwriting the pipe to tee
exec 2>&$orig_stderr # If something overwrites stderr to also go through tee, fix that too
wait "$tee_pid" # Now, wait until tee exits
exit "$retval" # and complete exit with our original exit status
}
trap cleanup EXIT # configure the function above to be called during cleanup
echo "Writing something to stdout here"
