How does Linux shell standard output and error output redirection work when combined?

I am trying to understand the finer points of standard output and standard error redirection in Linux shell scripting (Bourne shell, Bash).
Example 1:
cat file1 > output.txt
or
cat file1 1> output.txt
This redirects contents of file1 to output.txt. Works as expected.
Example 2:
kat file1 2> output.txt
kat command does not exist so error is redirected to output.txt. Works as expected.
Example 3:
cat file1 2>&1 output.txt
Because cat is a valid command and file1 exists, I would expect the same behavior as in example 1. Instead, the contents of both files are printed to the screen.
Example 4:
kat file1 2>&1 output.txt
Since kat does not exist, I would expect the same behavior as in example 2. Instead the error is printed to the screen ("-bash: kat: command not found").
I based my expectations on the explanations in online manuals, for example:
https://www.gnu.org/software/bash/manual/html_node/Redirections.html

The problem is that 2>&1 only tells the shell to redirect file descriptor 2 (standard error) to file descriptor 1 (standard output). It doesn't actually do any redirection of standard output.
For that you have to do it explicitly like
cat file1 > output.txt 2>&1
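Note that the descriptor duplication (2>&1) has to come last, after the standard output redirection, or it will not work: redirections are processed from left to right, so 2>&1 copies wherever standard output points at that moment.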
This is all explained in the Bash manual page (see the section about redirection).
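A minimal sketch of why the order matters, assuming a POSIX-style shell. The helper emit and the file names both.txt and only_out.txt are made up for illustration; emit just writes one line to each stream.

```shell
# Hypothetical helper: one line to stdout, one line to stderr.
emit() { echo "to stdout"; echo "to stderr" >&2; }

dir=$(mktemp -d)

# stdout is pointed at the file first, then stderr is made a copy of it:
# both lines land in both.txt.
emit > "$dir/both.txt" 2>&1

# stderr is made a copy of the current stdout (captured here by $(...)),
# and only afterwards is stdout moved to the file: the error line escapes.
leaked=$(emit 2>&1 > "$dir/only_out.txt")

both=$(cat "$dir/both.txt")
only_out=$(cat "$dir/only_out.txt")
printf 'both.txt:     %s\n' "$both"
printf 'only_out.txt: %s\n' "$only_out"
printf 'escaped:      %s\n' "$leaked"
rm -rf "$dir"
```

The second form is valid syntax, it just does something different from what most people intend.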

cat file1 2>&1 output.txt
The shell sets up the redirection (stderr to stdout) and removes it from the command line. What is left for the shell to execute is:
cat file1 output.txt
That's why you see both contents.
For
kat file1 2>&1 output.txt
it is the same because only
kat file1 output.txt
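is left after the shell sets up the descriptors for the command to be executed. The command kat cannot be found, so the shell prints an error message. Since stdout still points to the terminal, duplicating stderr onto it changes nothing and the message appears on screen.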
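To see that the redirection is consumed by the shell before the command receives its arguments, here is a small sketch that uses printf in place of cat (an assumption, chosen so that no files are needed). Every word the command actually receives is printed in angle brackets.

```shell
# The shell handles 2>&1 itself; printf only sees the two remaining words.
args=$(printf '<%s>\n' file1 2>&1 output.txt)
echo "$args"
# prints:
# <file1>
# <output.txt>
```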

Related

Pipe printf to ls in Bash?

So I'm learning about pipes in bash and I found this pithy description:
A Unix pipe connects the STDOUT (standard output) file descriptor of
the first process to the STDIN (standard input) of the second. What
happens then is that when the first process writes to its STDOUT, that
output can be immediately read (from STDIN) by the second process.
Source
Given this understanding, let's connect the STDOUT of printf to the STDIN of ls. For simplicity, print the parent directory (printf ..).
~/Desktop/pipes$ mkdir sub
~/Desktop/pipes$ ls
sub
~/Desktop/pipes$ cd sub
(no files)
~/Desktop/pipes/sub$ printf .. | ls
(no files)
~/Desktop/pipes/sub$
I want to be doing: ls .. but it seems that all I'm getting is ls. Why is this so? How can I ls the parent directory using pipes? Am I misunderstanding pipes?

Many programs don't read from stdin, not just ls. It is also possible that a program might not write to stdout either.
Here is a little experiment that might clarify things. Carry out these steps:
cat > file1
This is file1
^D
The ^D is you pressing <CTRL>+D, which is the default end-of-file. So, first we are calling the cat program and redirecting its stdout to file1. If you don't supply an input filename then it reads from stdin, so we type "This is file1".
Now do similar:
cat > file2
This is file2
^D
Now if you:
cat < file1
You get:
This is file1
What if you:
cat file1 | cat file2
or:
cat file2 < file1
Why does only file2 appear in both cases? Because if you supply an input filename, cat does not read stdin, just as ls does not.
Now, how about:
cat - file1 < file2
By convention the - means a standard stream, stdin when reading or stdout when writing.
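The experiment above can also be run non-interactively. This is a sketch that creates the two files with printf instead of typing them in, and uses a temporary directory that is not part of the original steps.

```shell
dir=$(mktemp -d)
printf 'This is file1\n' > "$dir/file1"
printf 'This is file2\n' > "$dir/file2"

# A filename argument wins: stdin (the pipe or the < redirection) is ignored.
piped=$(cat "$dir/file1" | cat "$dir/file2")
redirected=$(cat "$dir/file2" < "$dir/file1")

# With -, cat reads stdin (file2 here) first, then the named file1.
with_dash=$(cat - "$dir/file1" < "$dir/file2")

printf '%s\n---\n%s\n---\n%s\n' "$piped" "$redirected" "$with_dash"
rm -rf "$dir"
```

The first two results contain only the file2 line; the third contains the file2 line followed by the file1 line.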

The problem is that ls does not read from stdin as you intended. You need a tool such as xargs that reads from stdin and passes what it reads to ls as arguments:
printf "someSampleFolderOrFile" | xargs ls
From the xargs man page:
xargs - build and execute command lines from standard input
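Applied to the directory layout from the question, a sketch could look like this. The temporary directory is an assumption, used so that the example does not touch your Desktop.

```shell
dir=$(mktemp -d)
mkdir -p "$dir/pipes/sub"

# ls ignores stdin, so the piped ".." is discarded and the empty sub is listed.
ignored=$(cd "$dir/pipes/sub" && printf .. | ls)

# xargs reads ".." from stdin and runs "ls .." on our behalf.
listed=$(cd "$dir/pipes/sub" && printf .. | xargs ls)

echo "plain ls: [$ignored]"
echo "xargs ls: [$listed]"
rm -rf "$dir"
```

The first line shows empty brackets, the second shows sub, which is the content of the parent directory.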

Simple tee example

Can someone please explain why tee works here:
echo "testtext" | tee file1 > file2
My understanding was that tee duplicates the input and prints one copy to the screen.
The above example allows the output from echo to be sent to 2 files, the first redirecting to the second.
I would expect 'testtext' to be printed to the screen, passed through file1, and to land in file2, similar to how the text in the following example would only end up in file2.
echo "testtext" > file1 > file2
Can anyone explain what I am missing in my understanding?
Edit
Is it because it's writing to the file and then to stdout, which gets redirected?

Your description is right: tee receives data from stdin and writes it both to the file and to stdout. But when you redirect tee's stdout into another file, nothing is written to the terminal because that data ends up in the second file.
Is it because it's writing to the file and then to stdout, which gets redirected?
Exactly.
What you are trying to do could be done like this (demonstrating how tee works):
$ echo "testtext" | tee file1 | tee file2
testtext
But since tee from GNU coreutils accepts several output files, you can simply do:
$ echo "testtext" | tee file1 file2
testtext
But your idea of the text being passed through file1 and landing in file2 is not correct. Your shell example:
echo "testtext" > file1 > file2
makes the shell open both file1 and file2 for writing, which truncates both. Since stdout can point to only one file at a time, only the last redirection takes effect (it overrides the previous ones), so file1 is left empty.
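A short sketch that compares the two commands side by side, assuming sh or Bash (zsh with its default MULTIOS option treats the double redirection differently and writes to both files). The temporary directory is only there to keep the example self-contained.

```shell
dir=$(mktemp -d)

# tee writes to file1 itself; its stdout is redirected by the shell into file2.
echo "testtext" | tee "$dir/file1" > "$dir/file2"
tee_file1=$(cat "$dir/file1")
tee_file2=$(cat "$dir/file2")

# Two stdout redirections on one command: both files are opened and truncated,
# but only the last one receives the output.
echo "testtext" > "$dir/file1" > "$dir/file2"
redir_file1=$(cat "$dir/file1")
redir_file2=$(cat "$dir/file2")

echo "tee:   file1=[$tee_file1] file2=[$tee_file2]"
echo "redir: file1=[$redir_file1] file2=[$redir_file2]"
rm -rf "$dir"
```

With tee both files end up containing testtext; with the double redirection file1 is empty.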

tee writes its input to each of the files named in its arguments (it can take more than one) as well as to standard output. The example could also be written
echo "testtext" | tee file1 file2 > /dev/null
where you explicitly write to the two files, then ignore what goes to standard output, rather than redirecting standard output to one of the files.
The > file2 in the command you showed does not somehow "extract" what was written to file1, leaving standard output to be written to the screen. Rather, > file2 instructs the shell to pass a file handle opened on file2 (rather than the terminal) to tee for it to use as standard output.

"is it because its writing to file and then to stdout which gets redirected?"
That is correct.
tee sends its output to the specified file and to stdout.
The last ">" redirects stdout to the second file specified.

Difference between '2>&1' and '&>filename'

I'm a beginner in Linux, and I have a question about redirecting STDOUT and STDERR.
First I create file1 with some text:
echo hello > file1
After this, when I do something like this
cat file1 file2
It will give an error like this
hello
cat: file2: No such file or directory
I want to redirect the STDOUT and STDERR, so
cat file1 file2 > file3 2>&1 | cat
hello
cat: file2: No such file or directory
I know that | feeds the previous command's output to the next command as its input, right?
So the first cat's output is:
hello
cat: file2: No such file or directory
Now, I find another method to redirect output, like:
cat file1 file2 &> file3
cat file3
hello
cat: file2: No such file or directory
It does the same thing, but when I add | cat, the result is
cat file1 file2 &> file3 | cat
hello
Where is the STDERR output? Does it mean that only hello is the output of the first cat?
What is the difference between 2>&1 and &>file?

cat file1 file2
It will give an error like this
hello
cat: file2: No such file or directory
The error is simply telling you file2 does not exist. You create file1 with your redirection:
echo hello > file1
Now file1 exists. When you do cat file1 file2, cat attempts to output the contents of file1 & file2 to stdout, but file2 doesn't exist (it tells you). To make file2, you can do cat file1 > file2 to redirect the output of cat file1 to file2, or you can simply cp file1 file2. Then file2 will exist.
I want to redirect the STDOUT and STDERR, so
cat file1 file2 > file3 2>&1 | cat
hello
cat: file2: No such file or directory
Again, file2 still doesn't exist. cat (short for concatenate) simply writes the contents of the files given as arguments to stdout unless redirected. file1 contains hello, so it is output along with the error. hello is redirected to file3 and, since you have also redirected stderr to stdout (2>&1), the error message ends up in file3 as well.
The | (pipe) operator in a Linux shell takes the stdout of the command on its left and connects it to the stdin of the command on its right. Since you already redirected both stdout and stderr of cat file1 file2 into file3, nothing is sent to the cat following the pipe. The output you posted appears to have come from:
cat file3
In a Linux shell, stdin, stdout and stderr are file descriptors 0, 1 and 2, respectively. They are also exposed in the filesystem as /dev/stdin, /dev/stdout, and /dev/stderr. If you check with the ls -l command, you will see the relation between these files and the file descriptors:
$ ls -l /dev/std*
lrwxrwxrwx 1 root root 4 Apr 2 17:47 /dev/stderr -> fd/2
lrwxrwxrwx 1 root root 4 Apr 2 17:47 /dev/stdin -> fd/0
lrwxrwxrwx 1 root root 4 Apr 2 17:47 /dev/stdout -> fd/1

The output hello appears on standard output. The error message appears on standard error. Henceforth, those will be stdout and stderr.
You claim that cat file1 file2 > file3 2>&1 | cat produces some output on the terminal. It doesn't with any standard shell. When I run it, it produces no visible output, but file3 contains the line from file1 and the error message. Since there's no input for it to read, the second cat command exits without producing any output.
The > file3 redirection sends stdout to file3. The 2>&1 sends stderr to the same place that stdout is going. File descriptor 0 is standard input (stdin), 1 is stdout, and 2 is stderr.
There is no output sent to the pipe; it is all sent to the file (but the pipe is created first, then stdout is redirected to the file).
These commands demonstrate that all the output (stdout and stderr) was written to file3.
You claim that cat file1 file2 &> file3 | cat produces some output on the terminal. It doesn't with Bash; there is no output because both stdout and stderr go to file3.
The difference between &> file3 and > file3 2>&1 is portability (the &> notation is a Bash extension that POSIX sh does not specify) and the number of characters; functionally, they're equivalent in Bash.
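One way to check the equivalence yourself is sketched below. It assumes bash is installed and runs the comparison through bash explicitly, because &> is not understood the same way by every sh. The function emit and the file names long.txt, short.txt, pipe_long.txt and pipe_short.txt are hypothetical stand-ins.

```shell
dir=$(mktemp -d)

bash -s "$dir" <<'EOF'
# Hypothetical stand-in program: one line to stdout, one line to stderr.
emit() { echo "to stdout"; echo "to stderr" >&2; }

# Whatever still reaches the pipe is saved so it can be inspected afterwards.
emit > "$1/long.txt" 2>&1 | cat > "$1/pipe_long.txt"
emit &> "$1/short.txt"    | cat > "$1/pipe_short.txt"
EOF

if cmp -s "$dir/long.txt" "$dir/short.txt"; then same=yes; else same=no; fi
if [ -s "$dir/pipe_long.txt" ] || [ -s "$dir/pipe_short.txt" ]; then pipe=nonempty; else pipe=empty; fi
combined=$(cat "$dir/long.txt")

echo "files identical: $same"
echo "data through the pipe: $pipe"
rm -rf "$dir"
```

Both spellings put the two lines in the file and leave nothing for the cat after the pipe.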

Redirecting program output

How do I redirect the output of my program into three files such that
1. stdout goes to file1
2. stderr goes to file2
3. the combined result of stdout and stderr goes to file3 in their original order
4. while redirecting, the output is also printed to the screen as the program is running
I tried
myprogram > file1 2> file2
but this does not satisfy 3 & 4.
Edit: It would be better if the screen displays messages immediately after they are printed. (to increase responsiveness)

(./foo.sh > >(tee out.log) 2> >(tee err.log >&2)) |& tee all.log
What have we done here? First, we create two subshells (via process substitution) to run tee out.log and tee err.log, and redirect the appropriate descriptors to them. We are careful to redirect the stdout of tee err.log back to stderr where it belongs, otherwise it will mess up out.log (credit to https://stackoverflow.com/a/692407/4323 for this idea). Second, we put that entire thing in a subshell so that we can redirect its stdout and stderr in one shot to all.log, again using tee to print to the screen at the same time.
One caveat is that the program we're running is likely to buffer stdout when it is not a TTY/PTY (terminal device). If you need immediate output from stdout on your screen and in the files, you can try running your program with unbuffer, a utility which avoids this buffering.
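Here is a self-contained sketch of the same idea using the file names from the question. It assumes bash is installed (process substitution is not available in plain sh), uses a hypothetical function myprogram as the program under test, and spells |& as 2>&1 |, which is the longhand form and also works in older Bash versions.

```shell
dir=$(mktemp -d)

bash -s "$dir" <<'EOF'
cd "$1" || exit 1
# Hypothetical stand-in program: one line to stdout, one line to stderr.
myprogram() { echo "to stdout"; echo "to stderr" >&2; }

# stdout -> tee file1, stderr -> tee file2 (sent back to stderr),
# then both streams together -> tee file3, which also prints to the screen.
(myprogram > >(tee file1) 2> >(tee file2 >&2)) 2>&1 | tee file3
EOF

out=$(cat "$dir/file1")
err=$(cat "$dir/file2")
all_sorted=$(sort "$dir/file3")
rm -rf "$dir"
```

Because the two streams travel through separate tee processes before they are merged, the relative order of lines in file3 is not guaranteed to match the order in which the program wrote them. That is why the sketch sorts file3 before inspecting it.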

myprogram > file1 2> file2 &> file3; cat file3
Or do you think the cat file3 is cheating?

How to append one file to another in Linux from the shell?

I have two files: file1 and file2. How do I append the contents of file2 to file1 so that the existing contents of file1 are preserved?

Use bash builtin redirection (tldp):
cat file2 >> file1

cat file2 >> file1
The >> operator appends the output to the named file or creates the named file if it does not exist.
cat file1 file2 > file3
This concatenates two or more files to one. You can have as many source files as you need. For example,
cat *.txt >> newfile.txt
Update 20130902
In the comments eumiro suggests "don't try cat file1 file2 > file1." The reason this might not result in the expected outcome is that the file receiving the redirect is prepared before the command to the left of the > is executed. In this case, first file1 is truncated to zero length and opened for output, then the cat command attempts to concatenate the now zero-length file plus the contents of file2 into file1. The result is that the original contents of file1 are lost and in its place is a copy of file2 which probably isn't what was expected.
Update 20160919
In the comments tpartee suggests linking to backing information/sources. For an authoritative reference, I direct the kind reader to the sh man page at linuxcommand.org which states:
Before a command is executed, its input and output may be redirected
using a special notation interpreted by the shell.
While that does tell the reader what they need to know it is easy to miss if you aren't looking for it and parsing the statement word by word. The most important word here being 'before'. The redirection is completed (or fails) before the command is executed.
In the example case of cat file1 file2 > file1 the shell performs the redirection first so that the I/O handles are in place in the environment in which the command will be executed before it is executed.
A friendlier version in which the redirection precedence is covered at length can be found at Ian Allen's web site in the form of Linux courseware. His I/O Redirection Notes page has much to say on the topic, including the observation that redirection works even without a command. Passing this to the shell:
$ >out
...creates an empty file named out. The shell first sets up the I/O redirection, then looks for a command, finds none, and completes the operation.
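The difference between appending and the truncating form can be sketched as follows. The temporary directory and the one-word file contents are assumptions made for the example.

```shell
dir=$(mktemp -d)
printf 'one\n' > "$dir/file1"
printf 'two\n' > "$dir/file2"

# Appending keeps what file1 already had.
cat "$dir/file2" >> "$dir/file1"
appended=$(cat "$dir/file1")

# The shell truncates file1 before cat starts, so its old contents are gone.
# Some cat implementations also complain that the input file is the output
# file and exit non-zero, hence the 2>/dev/null and || true.
cat "$dir/file1" "$dir/file2" > "$dir/file1" 2>/dev/null || true
clobbered=$(cat "$dir/file1")

printf 'after >>:\n%s\n\nafter > onto an input file:\n%s\n' "$appended" "$clobbered"
rm -rf "$dir"
```

After the append, file1 holds both lines; after the second command only the contents of file2 remain.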

Note: if you need to use sudo, do this:
sudo bash -c 'cat file2 >> file1'
The usual method of simply prepending sudo to the command will fail, since the privilege escalation doesn't carry over into the output redirection.

Try this command:
cat file2 >> file1

Just for reference, using ddrescue provides an interruptible way of achieving the task if, for example, you have large files and the need to pause and then carry on at some later point:
ddrescue -o $(wc --bytes file1 | awk '{ print $1 }') file2 file1 logfile
The logfile is the important bit. You can interrupt the process with Ctrl-C and resume it by specifying the exact same command again and ddrescue will read logfile and resume from where it left off. The -o A flag tells ddrescue to start from byte A in the output file (file1). So wc --bytes file1 | awk '{ print $1 }' just extracts the size of file1 in bytes (you can just paste in the output from ls if you like).
As pointed out by ngks in the comments, the downside is that ddrescue will probably not be installed by default, so you will have to install it manually. The other complication is that there are two versions of ddrescue which might be in your repositories: see this askubuntu question for more info. The version you want is the GNU ddrescue, and on Debian-based systems is the package named gddrescue:
sudo apt install gddrescue
For other distros check your package management system for the GNU version of ddrescue.

Another solution:
tee < file1 -a file2
tee has the benefit that you can append to as many files as you like, for example:
tee < file1 -a file2 file3 file4
will append the contents of file1 to file2, file3 and file4.
From the man page:
-a, --append
append to the given FILEs, do not overwrite
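A small sketch of the append behaviour, with made-up file contents and a temporary directory. Since tee also copies its input to stdout, the example discards that copy to keep the screen quiet.

```shell
dir=$(mktemp -d)
printf 'from file1\n' > "$dir/file1"
printf 'already in file2\n' > "$dir/file2"

# -a appends to file2; file3 does not exist yet and is created.
tee -a "$dir/file2" "$dir/file3" < "$dir/file1" > /dev/null

file2_now=$(cat "$dir/file2")
file3_now=$(cat "$dir/file3")
printf 'file2:\n%s\n\nfile3:\n%s\n' "$file2_now" "$file3_now"
rm -rf "$dir"
```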

Zsh specific: You can also do this without cat, though honestly cat is more readable:
>> file1 < file2
The >> appends STDIN to file1 and the < dumps file2 to STDIN.

cat is the easy solution, but it can become slow when concatenating large files. Piping find -print0 into xargs is an alternative, though cat still runs once. In the timings below the difference is small:
amey#xps ~/work/python/tmp $ ls -lhtr
total 969M
-rw-r--r-- 1 amey amey 485M May 24 23:54 bigFile2.txt
-rw-r--r-- 1 amey amey 485M May 24 23:55 bigFile1.txt
amey#xps ~/work/python/tmp $ time cat bigFile1.txt bigFile2.txt >> out.txt
real 0m3.084s
user 0m0.012s
sys 0m2.308s
amey#xps ~/work/python/tmp $ time find . -maxdepth 1 -type f -name 'bigFile*' -print0 | xargs -0 cat -- > outFile1
real 0m2.516s
user 0m0.028s
sys 0m2.204s
