How to redirect one of several inputs? - linux

In Linux/Unix command line, when using a command with multiple inputs, how can I redirect one of them?
For example, say I'm using cat to concatenate multiple files, but I only want the last few lines of one file, so my inputs are testinput1, testinput2, and tail -n 4 testinput3.
How can I do this in one line without any temporary files?
I tried tail -n 4 testinput3 | cat testinput1 testinput2, but this seems to just take in input 1 and 2.
Sorry for the bad title, I wasn't sure how to phrase it exactly.

Rather than piping the output of tail to cat, you can use bash's process substitution. The shell runs the process with its input or output connected to a FIFO or to a file in /dev/fd, and substitutes that file's name into the command line. This allows you to treat the output of a process as if it were a file.
In the common case you redirect the output of the process substitution into a loop, e.g. while read -r line; do ##stuff; done < <(process). However, in your case, cat takes file names as arguments rather than reading only from stdin, so you omit the initial redirection, e.g.
cat file1 file2 <(tail -n4 file3)
So be familiar with both forms, < <(process) if you need to redirect a process as input or simply <(process) if you need the result of process to be treated as a file.
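To make the two forms concrete, here is a minimal sketch. The scratch directory and the file contents are made up for the demo; only the testinput names come from the question. The last command is an alternative that the answer above does not mention: cat reads standard input wherever a lone - appears among its arguments, which is why the original pipe attempt ignored the output of tail.

```bash
#!/usr/bin/env bash
# Hypothetical sample files, created in a scratch directory for this demo.
dir=$(mktemp -d)
printf '%s\n' a1 a2 > "$dir/testinput1"
printf '%s\n' b1 b2 > "$dir/testinput2"
printf '%s\n' c1 c2 c3 c4 c5 c6 > "$dir/testinput3"

# <(process): cat receives a file name (such as /dev/fd/63) as its third argument.
combined=$(cat "$dir/testinput1" "$dir/testinput2" <(tail -n 4 "$dir/testinput3"))
printf '%s\n' "$combined"

# < <(process): the loop reads the output of the process on its stdin.
count=0
while read -r line; do
    count=$((count + 1))
done < <(tail -n 4 "$dir/testinput3")
echo "lines read by the loop: $count"

# Pipe alternative: "-" tells cat where to read standard input.
piped=$(tail -n 4 "$dir/testinput3" | cat "$dir/testinput1" "$dir/testinput2" -)

rm -r "$dir"
```

Both combined and piped should hold the two full files followed by the last four lines of testinput3.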


Bash command "read" behaviour using redirection operator

If I execute the following command:
> read someVariable _ < <(echo "54 41")
> echo $someVariable
The result is: 54.
What does < < (with spaces) do?
Why is _ giving the first word from the result in the "echo" command?
The commands above are just example.
Thanks a lot
Process Substitution
As one reference explains,
Process substitution feeds the output of a process (or processes) into
the stdin of another process.
So in effect this is similar to piping the stdout of one command to another, e.g. echo foobar barfoo | wc. But notice: in the bash manpage it is denoted as <(list), so you can redirect the output of multiple (!) commands.
Note: technically, < < is not one operator but two: a single < redirection, followed by the process substitution <( . . .).
Now what happens if we do just process substitution?
$ echo <(echo bar)
/dev/fd/63
As you can see, the shell creates a temporary file descriptor, /dev/fd/63, where the output goes. That means < redirects that file descriptor as input into a command.
So a very simple example would be to feed the output of two echo commands into wc via process substitution:
$ wc < <(echo bar;echo foo)
2 2 8
So here we make the shell create a file descriptor for all the output produced inside the parentheses and redirect it as input to wc. As expected, wc receives the stream from the two echo commands, which by themselves would output two lines, each containing one word, and so we get 2 lines, 2 words, and 8 characters (6 letters plus two newlines).
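Applying the same reading to the command from the question, a small sketch: read splits the line on whitespace, assigns the first word to someVariable and the remainder to the throwaway name _. The pipe variant and the name pipedVariable are additions for comparison, not part of the question. It assumes bash's default settings, where the last command of a pipeline runs in a subshell.

```bash
#!/usr/bin/env bash
# First word goes to someVariable, the rest of the line to _.
read -r someVariable _ < <(echo "54 41")
echo "$someVariable"

# With a pipe, bash (by default) runs read in a subshell, so the
# assignment is lost as soon as the pipeline finishes.
unset pipedVariable
echo "54 41" | read -r pipedVariable _
echo "after the pipe: '${pipedVariable:-}'"
```

This is one practical reason to prefer < <(process) over a pipe when the variable is needed later in the same shell.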
Side Note: Process substitution may be referred to as a bashism (a command or structure usable in advanced shells like bash, but not specified by POSIX), but it was implemented in ksh before bash existed, as the ksh man page indicates. Shells like tcsh and mksh, however, do not have process substitution. So how can we redirect the output of multiple commands into another command without process substitution? Grouping plus piping!
$ (echo foo;echo bar) | wc
2 2 8
Effectively this is the same as the example above. However, it is different under the hood from process substitution, since here the stdout of the whole subshell and the stdin of wc are linked with a pipe. With process substitution, on the other hand, the command reads from a temporary file descriptor.
So if we can do grouping with piping, why do we need process substitution? Because sometimes we cannot use piping. Consider the example below, which compares the outputs of two commands with diff (which needs two files, and in this case we are giving it two file descriptors):
diff <(ls /bin) <(ls /usr/bin)
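Since the contents of /bin and /usr/bin vary between systems, here is a self-contained sketch of the same idea. The two scratch directories and their file names are hypothetical.

```bash
#!/usr/bin/env bash
# Two hypothetical directories with slightly different contents.
left=$(mktemp -d)
right=$(mktemp -d)
touch "$left/a" "$left/b" "$left/c"
touch "$right/a" "$right/c" "$right/d"

# Each <(ls ...) expands to a file name that diff opens and reads.
# diff exits with status 1 when its inputs differ, hence "|| true".
listing_diff=$(diff <(ls "$left") <(ls "$right") || true)
printf '%s\n' "$listing_diff"

rm -r "$left" "$right"
```

The output should report b as present only on the left side and d as present only on the right side.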

Are linux shell pipes pipelined?

Given a file input.txt if I do something like
grep pattern1 input.txt | grep pattern2 | wc -l
is the output from the first command continuously passed (as soon as it is generated) as input to the second command?
Or does the pipe wait until the first command finishes to start running the second command?
Yes, they're pipelined -- each component's stdout is connected to the stdin of the next via a FIFO, and all components are started in parallel.
This is why
cat some-file | >some-file
...typically results in a truncated file: Because the pipeline is started all at once, the last piece (truncating some-file for write) happens before cat has finished (or often, even started) reading the file from input.
The general answer to your question is "yes".
HOWEVER, some programs, such as grep itself, buffer their results up to some arbitrary point. They may include options to disable this buffering, but you should not rely on them being available.
Piped commands run concurrently.
This is very commonly used to allow the second program to process data as it comes out from the first program, before the first program has completed its operation. For example
grep pattern huge-file | tr a-z A-Z
begins to display the matching lines in uppercase even before grep has finished traversing the large file.
grep pattern huge-file | head -n 1
displays the first matching line, and may stop processing well before grep has finished reading its input file.
These two examples show that the commands in a pipeline run concurrently.
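One way to observe this directly is to time when the consumer sees the first line. This is a rough sketch: the two-second sleep is arbitrary, date +%s is assumed to be available (it is in GNU and BSD date), and the timestamps have one-second resolution.

```bash
#!/usr/bin/env bash
start=$(date +%s)

# The producer emits one line, pauses, then emits another.
# The consumer records how long it waited for the first line.
first_seen=$(
    { echo first; sleep 2; echo second; } |
    {
        read -r line                        # returns as soon as "first" arrives
        echo $(( $(date +%s) - start ))
        cat > /dev/null                     # drain the rest of the input
    }
)
total=$(( $(date +%s) - start ))

echo "first line seen after ${first_seen}s, pipeline finished after ${total}s"
```

If the pipe waited for the producer to finish, the first line would only be seen after about two seconds. Instead it should arrive almost immediately.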

How to take advantage of filters

I've read here that
To make a pipe, put a vertical bar (|) on the command line between two commands.
When a program takes its input from another program, performs some operation on that input, and writes the result to the standard output, it is referred to as a filter.
So I've first tried the ls command whose output is:
Desktop Templates glassfish-4.0
Documents Music Videos hs_err_pid26742.log
Downloads NetBeansProjects apache-tomcat-8.0.3 mozilla.pdf
HelloWord Pictures examples.desktop netbeans-8.0
Then ls | echo which outputs absolutely nothing.
I'm looking for a way to take advantages of pipelines and filters in my bash script. Please help.
echo doesn't read from standard input. It only writes its command-line arguments to standard output. The cat command is what you want: it copies what it reads from standard input to standard output.
ls | cat
(Note that the pipeline above is a little pointless, but does demonstrate the idea of a pipe. The command on the right-hand side must read from standard input.)
Don't confuse command-line arguments with standard input.
echo doesn't read standard input. To try something more useful, try
ls | sort -r
to get the output sorted in reverse,
ls | grep '[0-9]'
to only keep the lines containing digits.
In addition to what others have said - if your command (echo in this example) does not read from standard input you can use xargs to "feed" this command from standard input, so
ls | echo
doesn't work, but
ls | xargs echo
works fine.
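The answers above can be compared side by side. In this sketch, listing is a hypothetical stand-in for ls that prints a fixed set of names, so that the results do not depend on the current directory.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for ls, printing one name per line.
listing() { printf '%s\n' Desktop Documents glassfish-4.0 netbeans-8.0; }

from_echo=$(listing | echo)             # echo ignores stdin: only an empty line
from_cat=$(listing | cat)               # cat copies stdin to stdout unchanged
reversed=$(listing | sort -r)           # a filter: same lines, reverse order
with_digits=$(listing | grep '[0-9]')   # a filter: only lines containing digits
via_xargs=$(listing | xargs echo)       # xargs turns stdin into arguments for echo

printf 'echo:  [%s]\n' "$from_echo"
printf 'xargs: [%s]\n' "$via_xargs"
```

Note that xargs echo joins all the names onto one line, because they are passed to a single echo invocation as separate arguments.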

Send multiple outputs to sed

When there is a program which, upon execution, prints several lines on stdout, how can I redirect all those lines to sed and perform some operations on them while they are being generated?
For example:
7zip a -t7z output_folder input_folder -mx9 > sed 's/.*[ \t][ \t]*\([0-9][0-9]*\)%.*/\1/'
7zip generates a series of lines as output, each including a percentage value, and I would like sed to display these values only, while they are being generated. The above script unfortunately does not work...
What is the best way to do this?
You should use the pipe | instead of the redirection > so that sed uses the first command's output as its input.
The command line above must have created a file named sed in the current directory.
Furthermore, 7zip may write these lines to stderr instead of stdout. If that is the case, redirect standard error to standard output before piping: 2>&1 |
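A sketch of the corrected pipeline follows. Because the exact output format of 7zip is not shown in the question, fake_archiver is a hypothetical stand-in that writes progress lines to stderr, and the sed expression is a slightly simplified variant of the one in the question.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the archiver: progress lines go to stderr.
fake_archiver() {
    for pct in 25 50 100; do
        echo "Compressing input_folder $pct%" >&2
    done
}

# 2>&1 merges stderr into stdout, and the pipe hands both to sed,
# which keeps only the number in front of the percent sign.
percentages=$(fake_archiver 2>&1 | sed 's/.*[[:space:]]\([0-9][0-9]*\)%.*/\1/')
printf '%s\n' "$percentages"
```

Without 2>&1, the progress lines would bypass sed and go straight to the terminal.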

Why doesn't "sort file1 > file1" work?

When I am trying to sort a file and save the sorted output in itself, like this
sort file1 > file1;
the contents of file1 are erased altogether, whereas when I try to do the same with the 'tee' command like this
sort file1 | tee file1;
it works fine [ed: "works fine" only for small files with lucky timing, will cause lost data on large ones or with unhelpful process scheduling], i.e. it overwrites file1 with the sorted output and also shows it on standard output.
Can someone explain why the first case is not working?
As other people explained, the problem is that the I/O redirection is done before the sort command is executed, so the file is truncated before sort gets a chance to read it. If you think for a bit, the reason why is obvious - the shell handles the I/O redirection, and must do that before running the command.
The sort command has 'always' (since at least Version 7 UNIX) supported a -o option to make it safe to output to one of the input files:
sort -o file1 file1 file2 file3
The trick with tee depends on timing and luck (and probably a small data file). If you had a megabyte or larger file, I expect it would be clobbered, at least in part, by the tee command. That is, if the file is large enough, the tee command would open the file for output and truncate it before sort finished reading it.
It doesn't work because '>' redirection implies truncation, and to avoid keeping the whole output of sort in memory before redirecting it to the file, bash truncates the file and redirects the output before running sort. Thus, the contents of file1 are truncated before sort has a chance to read them.
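The truncation is easy to reproduce. This sketch uses a hypothetical three-line file in a scratch directory, and then shows the -o option mentioned above handling the same case.

```bash
#!/usr/bin/env bash
dir=$(mktemp -d)
printf '%s\n' pear apple fig > "$dir/file1"

# The shell opens file1 for writing (truncating it) before sort starts,
# so sort reads an empty file and writes nothing back.
sort "$dir/file1" > "$dir/file1"
if [ -s "$dir/file1" ]; then emptied=no; else emptied=yes; fi
echo "file emptied by the redirection: $emptied"

# Recreate the file and let sort manage the output file itself.
printf '%s\n' pear apple fig > "$dir/file1"
sort -o "$dir/file1" "$dir/file1"
sorted=$(cat "$dir/file1")
printf '%s\n' "$sorted"

rm -r "$dir"
```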
It's unwise to depend on either of these commands to work the way you expect.
The way to modify a file in place is to write the modified version to a new file, then rename the new file to the original name:
sort file1 > file1.tmp && mv file1.tmp file1
This avoids the problem of reading the file after it's been partially modified, which is likely to mess up the results. It also makes it possible to deal gracefully with errors; if the file is N bytes long, and you only have N/2 bytes of space available on the file system, you can detect the failure creating the temporary file and not do the rename.
Or you can rename the original file, then read it and write to a new file with the same name:
mv file1 file1.bak && sort file1.bak > file1
Some commands have options to modify files in place (for example, perl and sed both have -i options, though the syntax of sed's -i option can vary). But these options work by creating temporary files; it's just done internally.
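The temporary-file approach can be written a little more defensively. This is only a sketch: using mktemp to pick the temporary name and the explicit error branch are additions, and the file contents are hypothetical.

```bash
#!/usr/bin/env bash
dir=$(mktemp -d)
file="$dir/file1"
printf '%s\n' 3 1 2 > "$file"

# Create the temporary file next to the target so that mv is a
# rename on the same file system.
tmp=$(mktemp "$dir/file1.XXXXXX")

if sort "$file" > "$tmp"; then
    mv "$tmp" "$file"          # replace the original only on success
else
    rm -f "$tmp"               # on failure the original is left untouched
    echo "sort failed; $file left unchanged" >&2
fi

result=$(cat "$file")
printf '%s\n' "$result"
rm -r "$dir"
```

Keeping the temporary file in the same directory also means a full disk is detected while writing the temporary file, before the original is touched.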
Redirections are processed before the command runs. So in the first case, > file1 takes effect first and empties the file.
The first command (sort file1 > file1) doesn't work because, when you use a redirection operator (> or >>), the shell creates or truncates the file before the sort command is even invoked, since redirections are set up first.
The second command (sort file1 | tee file1) only appears to work: sort and tee start concurrently, and tee truncates file1 as soon as it opens it. The output is intact only if sort happens to have read the whole file before that, which is likely for small files but not guaranteed.
So with any similar command, avoid using a redirection operator when reading from and writing to the same file. Use a suitable in-place editor instead (e.g. ex, ed, sed), for example:
ex '+%!sort' -cwq file1
or use other utils such as sponge.
Luckily, sort has the -o parameter, which writes the results to the given file (as suggested by @Jonathan), so the solution is straightforward: sort -o file1 file1.
Bash opens a new, empty file when it sets up the redirection, and only then calls sort.
In the second case, tee may happen to open the file after sort has already read the contents, but that is not guaranteed.
You can use this method
sort file1 -o file1
This will sort the file and store the result back in the original file. You can also use this command to remove duplicate lines:
sort -u file1 -o file1
