Bash command "read" behaviour using redirection operator - linux

If I execute the following command:
> read someVariable _ < <(echo "54 41")
and:
> echo $someVariable
The result is: 54.
What does < < (with spaces) do?
Why does the echo command print only the first word of the output, and what is the _ for?
The commands above are just an example.
Thanks a lot

Process Substitution
As tldp.org explains,
Process substitution feeds the output of a process (or processes) into
the stdin of another process.
So in effect this is similar to piping the stdout of one command to another, e.g. echo foobar barfoo | wc. But notice: in the bash man page it is denoted as <(list). So basically you can redirect the output of multiple (!) commands.
Note: technically when you say < < you aren't referring to one thing, but two: a redirection with a single <, and process substitution of the output from <(...).
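As for the _ in the original command: read splits its input on whitespace (the characters in $IFS) and assigns one word to each variable name it is given, and _ is just an ordinary variable name, conventionally used as a throw-away for words you don't care about. A minimal sketch (the variable names here are illustrative):
read first second < <(echo "54 41")
echo "$first"     # 54
echo "$second"    # 41
# In the question, _ plays the role of "second": it soaks up the word 41,
# which is why $someVariable ends up holding only 54.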
Now what happens if we do just process substitution?
$ echo <(echo bar)
/dev/fd/63
As you can see, the shell creates a temporary file descriptor, /dev/fd/63, where the output goes. The single < then redirects that file descriptor as input to a command.
So very simple example would be to make process substitution of output from two echo commands into wc:
$ wc < <(echo bar;echo foo)
2 2 8
So here we make the shell create a file descriptor for all the output produced in the parentheses and redirect that as input to wc. As expected, wc receives that stream from the two echo commands, which by themselves would output two lines, each containing one word; accordingly wc counts 2 lines, 2 words, and 8 characters (six letters plus two newlines).
Side note: process substitution may be referred to as a bashism (a construct usable in advanced shells like bash, but not specified by POSIX), although it was implemented in ksh before bash existed, as the ksh man page documents. Shells like tcsh and mksh, however, do not have process substitution. So how could we redirect the output of multiple commands into another command without process substitution? Grouping plus piping!
$ (echo foo;echo bar) | wc
2 2 8
Effectively this is the same as the example above. However, it is different under the hood from process substitution, since here the stdout of the whole subshell and the stdin of wc are linked with a pipe. Process substitution, on the other hand, makes a command read a temporary file descriptor.
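You can see this difference directly: with a pipe, the reading command gets the data on its stdin, while with process substitution it is handed a pathname it can open. A sketch (the /dev/fd number will vary):
$ (echo foo; echo bar) | wc     # wc reads from fd 0, an anonymous pipe
2 2 8
$ wc <(echo foo; echo bar)      # wc opens /dev/fd/63 by name, and prints it
2 2 8 /dev/fd/63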
So if we can do grouping with piping, why do we need process substitution? Because sometimes we cannot use piping. Consider the example below: comparing the outputs of two commands with diff, which needs two files (and here we are giving it two file descriptors instead):
diff <(ls /bin) <(ls /usr/bin)

Related

How to redirect one of several inputs?

In Linux/Unix command line, when using a command with multiple inputs, how can I redirect one of them?
For example, say I'm using cat to concatenate multiple files, but I only want the last few lines of one file, so my inputs are testinput1, testinput2, and tail -n 4 testinput3.
How can I do this in one line without any temporary files?
I tried tail -n 4 testinput3 | cat testinput1 testinput2, but this seems to just take in input 1 and 2.
Sorry for the bad title, I wasn't sure how to phrase it exactly.
Rather than trying to pipe the output of tail to cat, you can use bash's process substitution, in which a process is run with its input or output connected to a FIFO or to a file in /dev/fd. This allows you to treat the output of a process as if it were a file.
In the normal case you will generally redirect the output of the process substitution into a loop, e.g. while read -r line; do ...; done < <(process). However, in your case cat takes filenames as arguments rather than reading from stdin, so you omit the initial redirection, e.g.
cat file1 file2 <(tail -n4 file3)
So be familiar with both forms: < <(process) if you need to redirect the output of process as input, or simply <(process) if you need the output of process to be treated as a file.
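For example, the first form might be used like this (a sketch; the loop body is a placeholder):
while read -r line; do
    printf 'got: %s\n' "$line"
done < <(tail -n 4 testinput3)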

Do 'cat foo.txt | my_cmd' and 'my_cmd < foo.txt' accomplish the same thing?

This question helped me understand the difference between redirection and piping, but the examples focus on redirecting STDOUT (echo foo > bar.txt) and piping STDIN (ls | grep foo).
It would seem to me that any command that could be written my_command < file.txt could also be written cat file.txt | my_command. In what situations are STDIN redirection necessary?
Apart from the fact that using cat spawns an extra process and is less efficient than redirecting STDIN, are there situations in which you have to use the STDIN redirection? Put another way, is there ever a reason to pipe the output of cat to another command?
What's the difference between my_command < file.txt and cat file.txt | my_command?
my_command < file.txt
The redirection symbol can also be written as 0< as this redirects file descriptor 0 (stdin) to connect to file.txt instead of the current setting, which is probably the terminal. If my_command is a shell built-in then there are NO child processes created, otherwise there is one.
cat file.txt | my_command
This redirects file descriptor 1 (stdout) of the command on the left to the input stream of an anonymous pipe, and file descriptor 0 (stdin) of the command on the right to the output stream of the anonymous pipe.
We see at once that there is a child process, since cat is not a shell built-in. However in bash even if my_command is a shell builtin it is still run in a child process. Therefore we have TWO child processes.
So the pipe, in theory, is less efficient. Whether that difference is significant depends on many factors, including the definition of "significant". But a pipe is clearly preferable to this alternative:
command1 > file.txt
command2 < file.txt
Here it is likely that
command1 | command2
is more efficient, remembering that, in practice, we would probably also need a third child process for rm file.txt.
However, there are limitations to pipes. They are not seekable (random access, see man 2 lseek) and they cannot be memory mapped (see man 2 mmap). Some applications map files to virtual memory, but it would be unusual to do that to stdin or stdout. Memory mapping in particular is not possible on a pipe (whether anonymous or named) because a range of virtual addresses has to be reserved and for that a size is required.
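One way to see the seekability difference in practice (a sketch, assuming GNU tail and a large regular file big.log):
tail -n 1 < big.log        # fd 0 is the regular file: tail can seek near the end
cat big.log | tail -n 1    # fd 0 is a pipe: tail must stream the whole file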
Edit:
As mentioned by @JohnKugelman, a common error and the source of many SO questions is the issue of a child process combined with redirection:
Take a file file.txt with 99 lines:
i=0
cat file.txt | while read
do
    (( i = i + 1 ))
done
echo "$i"
What gets displayed? The answer is 0. Why? Because the count i = i + 1 is done in a subshell which, in bash, is a child process and does not change i in the parent (note: this does not apply to korn shell, ksh).
while read
do
    (( i = i + 1 ))
done < file.txt
echo "$i"
This displays the correct count because no child processes are involved.
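(Not part of the original answer, but worth noting: bash 4.2 and later offers shopt -s lastpipe, which runs the last element of a pipeline in the current shell, making the piped version count correctly too. It only takes effect when job control is off, as it is in scripts.)
#!/bin/bash
shopt -s lastpipe
i=0
cat file.txt | while read; do (( i = i + 1 )); done
echo "$i"    # with lastpipe, this now prints 99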
You can of course replace any use of input redirection with a pipe that reads from cat, but it is inefficient to do so, since you are spawning a new process to do something the shell can already do by itself. However, not every instance of cat ... | my_command can be replaced with my_command < ...: when cat is doing its intended job of concatenating two (or more) files, it is perfectly reasonable to pipe its output to another command.
cat file1.txt file2.txt | my_command

File redirection fails in Bash script, but not Bash terminal

I am having a problem where cmd1 works, but not cmd2 in my Bash script ending in .sh. I have made the Bash script executable.
Additionally, I can execute cmd2 just fine from my Bash terminal. I have tried to make a minimally reproducible example, but my larger goal is to run a complicated executable with command line arguments and pass output to a file that may or may not exist (rather than displaying the output in the terminal).
Replacing > with >> also gives the same error in the script, but not the terminal.
My Bash script:
#!/bin/bash
cmd1="cat test.txt"
cmd2="cat test.txt > a"
echo $cmd1
$cmd1
echo $cmd2
$cmd2
test.txt has the words "dog" and "cat" on two separate lines without quotes.
Short answer: see BashFAQ #50: I'm trying to put a command in a variable, but the complex cases always fail!.
Long answer: the shell expands variable references (like $cmd1) toward the end of the process of parsing a command line, after it's done parsing redirects (like > a is supposed to be) and quotes and escapes and... In fact, the only thing it does with the expanded value is word splitting (e.g. treating cat test.txt > a as "cat" followed by "test.txt", ">", and finally "a", rather than a single string) and wildcard expansion (e.g. if $cmd expanded to cat *.txt, it'd replace the *.txt part with a list of matching files). (And it skips word splitting and wildcard expansion if the variable is in double-quotes.)
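You can watch this happen directly (a sketch; the exact error text may vary by cat implementation):
cmd2="cat test.txt > a"
$cmd2
# cat is run with three arguments: "test.txt", ">", and "a". It prints
# test.txt, then complains about the other two "files":
#   cat: '>': No such file or directory
#   cat: a: No such file or directory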
Partly as a result of this, the best way to store commands in variables is: don't. That's not what they're for; variables are for data, not commands. What you should do instead, though, depends on why you were storing the command in a variable.
If there's no real reason to store the command in a variable, then just use the command directly. For conditional redirects, just use a standard if statement:
if [ -f a ]; then
    cat test.txt > a
else
    cat test.txt
fi
If you need to define the command at one point, and use it later; or want to use the same command over and over without having to write it out in full each time, use a function:
cmd2() {
    cat test.txt > a
}
cmd2
It sounds like you may need to define the command differently depending on some condition; you can actually do that with a function as well:
if [ -f a ]; then
    cmd() {
        cat test.txt > a
    }
else
    cmd() {
        cat test.txt
    }
fi
cmd
Alternately, you can wrap the command (without redirect) in a function, then use a conditional to control whether it redirects:
cmd() {
    cat test.txt
}
if [ -f a ]; then
    cmd > a
else
    cmd
fi
It's also possible to wrap a conditional redirect into a function itself, then pipe output to it:
maybe_redirect_to() {
    if [ -f "$1" ]; then
        cat > "$1"
    else
        cat
    fi
}
cat test.txt | maybe_redirect_to a
(This creates an extra cat process that isn't really doing anything useful, but if it makes the script cleaner, I'd consider that worth it. In this particular case, you could minimize the stray cats by using maybe_redirect_to a < test.txt.)
As a last resort, you can store the command string in a variable, and use eval to parse it. eval basically re-runs the shell parsing process from the beginning, meaning that it'll recognize things like redirects in the string. But eval has a well-deserved reputation as a bug magnet, because it's easy for it to treat parts of the string you thought were just data as command syntax, which can cause some really weird (& dangerous) bugs.
If you must use eval, at least double-quote the variable reference, so it runs through the parsing process just once, rather than sort-of-once-and-a-half as it would unquoted. Here's an example of what I mean:
cmd3="echo '5 * 3 = 15'"
eval "$cmd3"
# prints: 5 * 3 = 15
eval $cmd3
# prints: 5 [list of files in the current directory] 3 = 15
# ...unless there are any files with shell metacharacters in their names, in
# which case something more complicated might happen.
BashFAQ #50 discusses some other possible reasons and solutions. Note that the array approach will not work here, since arrays also get expanded after redirects are parsed.
If you pop an 'eval' in front of $cmd2 it should work as expected:
#!/bin/bash
cmd2="cat test.txt > a"
eval "$cmd2"
If you're not sure about the operation of a script you could always use the debug mode to see if you can determine the error.
bash -x scriptname
This will run the script and display each command after variable expansion. Hopefully this will reveal any issues with the syntax.
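For the script in the question, the relevant part of the trace would look something like this (an excerpt; formatting varies slightly between bash versions):
$ bash -x scriptname
+ cmd2='cat test.txt > a'
+ cat test.txt '>' a
dog
cat
cat: '>': No such file or directory
cat: a: No such file or directory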

How to take advantage of filters

I've read here that
To make a pipe, put a vertical bar (|) on the command line between two commands.
then
When a program takes its input from another program, performs some operation on that input, and writes the result to the standard output, it is referred to as a filter.
So I've first tried the ls command whose output is:
Desktop HelloWord.java Templates glassfish-4.0
Documents Music Videos hs_err_pid26742.log
Downloads NetBeansProjects apache-tomcat-8.0.3 mozilla.pdf
HelloWord Pictures examples.desktop netbeans-8.0
Then ls | echo which outputs absolutely nothing.
I'm looking for a way to take advantages of pipelines and filters in my bash script. Please help.
echo doesn't read from standard input; it only writes its command-line arguments to standard output. The cat command is what you want: it copies what it reads from standard input to standard output.
ls | cat
(Note that the pipeline above is a little pointless, but does demonstrate the idea of a pipe. The command on the right-hand side must read from standard input.)
Don't confuse command-line arguments with standard input.
echo doesn't read standard input. To try something more useful, try
ls | sort -r
to get the output sorted in reverse,
or
ls | grep '[0-9]'
to only keep the lines containing digits.
In addition to what others have said: if your command (echo in this example) does not read from standard input, you can use xargs to "feed" it arguments from standard input. So
ls | echo
doesn't work, but
ls | xargs echo
works fine.

Accessing each line using a $ sign in linux

Whenever I execute a linux command that outputs multiple lines, I want to perform some operation on each line of the output. Generally I do
command something | while read a
do
    some operation on $a;
done
This works fine. But my question is: is there some way I can access each line via a predefined symbol (I don't know what to call it), something like $? or $! or $_?
Is it possible to do
cat to_be_removed.txt | rm -f $LINE
Is there a predefined $LINE in bash, or is the previous approach the shortest way, i.e.
cat to_be_removed.txt | while read line; do rm -f $line; done;
xargs is what you're looking for:
cat to_be_removed.txt | xargs rm -f
Watch out for spaces in your filenames if you use that one, though. Check out the xargs man page for more information.
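One way to handle those spaces, if you have GNU xargs, is to split the input on newlines (or NUL bytes) instead of arbitrary whitespace; a sketch:
xargs -d '\n' rm -f < to_be_removed.txt    # GNU xargs: one filename per line
# or, when the producer can emit NUL-delimited names:
find . -name '*.tmp' -print0 | xargs -0 rm -f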
You might be looking for the xargs command.
It takes control arguments, plus a command and optionally some arguments for the command. It then reads its standard input, normally splitting at white space, and then arranges to repeatedly execute the command with the given arguments and as many 'file names' read from the standard input as will fit on the command line.
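A rough sketch of that equivalence (the filenames here are made up):
printf '%s\n' a.txt b.txt c.txt | xargs rm -f
# behaves much like running:
#   rm -f a.txt b.txt c.txt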
rm -f $(<to_be_removed.txt)
This works because rm can take multiple files as arguments. It is also much more efficient, because you call rm only once and you don't need to create a pipe to cat or xargs.
On a separate note, rather than using pipes in a while loop, you can avoid a subshell by using process substitution:
while read line; do
    some operation on $line;
done < <(command something)
The additional benefit you get by avoiding a subshell is that variables you change inside the loop maintain their altered values outside the loop as well. This is not the case when using the pipe form and it is a common gotcha.
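For example, a minimal sketch:
count=0
while read -r line; do
    (( count++ ))
done < <(printf 'a\nb\nc\n')
echo "$count"    # 3; the change survives because the loop ran in the current shell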
