Why can't I pass a file path argument to the shell command 'more' in pipeline mode? - linux

I have a text file a.txt containing:
hello world
I use the following commands:
cmd1:
$ more a.txt
output:
hello world
cmd2:
$ echo 'a.txt'|more
output:
a.txt
I thought cmd2 would be equivalent to echo 'a.txt'|xargs -i more {}, but it is not.
I want to know why cmd2 works like that, and how to write code that behaves differently in pipeline mode.

Redirection with | or < controls what the stdin stream contains; it has no impact on a program's command line argument list.
Thus, more <a.txt (efficiently) or cat a.txt | more (inefficiently) both attach a file handle from which one can read the contents of a.txt to the stdin file handle of a new process before replacing that process with more. Similarly, echo a.txt | more makes a.txt itself the literal text that more reads from its stdin stream, which is the default place it's documented to get the input to display from, if not given any more specific filename(s) on its command line.
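The difference between an argument and stdin can be seen in a small sketch. It uses cat in place of more so the results can be captured in variables; the temporary directory and the variable names are assumptions made for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'hello world\n' > a.txt

# cat stands in for more here so the result is easy to capture.
from_arg=$(cat a.txt)            # file name as an argument: cat opens a.txt
from_redirect=$(cat < a.txt)     # the shell opens a.txt and attaches it to stdin
from_pipe=$(echo a.txt | cat)    # stdin holds the literal text "a.txt"

printf 'argument: %s\n' "$from_arg"
printf 'redirect: %s\n' "$from_redirect"
printf 'pipe:     %s\n' "$from_pipe"
```

Only the last form shows the file name instead of the file contents, which matches what cmd2 printed.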
Generally, if you have a list of filenames and want to convert them to command-line arguments, this is what xargs is for (though using it without a great deal of care can introduce bugs, potentially-security-impacting ones).
Consider the following, which (using NUL rather than newline delimiters to separate filenames) is a safe use of xargs to take a list of filenames being piped into it, and transform that into an argument list to cat, used to concatenate all those files together and generate a single stream of input to more:
printf '%s\0' a.txt b.txt |
xargs -0 cat -- |
more
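As for writing code that behaves differently when input is piped in, one common convention is to read the named files when arguments are given and to fall back to stdin otherwise. The following is a minimal sketch of that convention; show is a hypothetical helper name, and cat is used so the output is easy to capture.

```shell
# show: hypothetical helper that follows the usual convention:
# read the named files if any arguments were given, else read stdin.
show() {
    if [ "$#" -gt 0 ]; then
        cat -- "$@"      # arguments are treated as file names
    else
        cat              # no arguments: copy stdin to stdout
    fi
}

tmp=$(mktemp -d)
printf 'hello world\n' > "$tmp/a.txt"

via_arg=$(show "$tmp/a.txt")        # contents of the file
via_stdin=$(echo 'a.txt' | show)    # the literal text that was piped in
```

A script can also test [ -t 0 ], which succeeds when stdin is a terminal, if it needs to detect whether anything was piped or redirected into it.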

Related

How to pipe multiple binary files to an application which reads from stdin

For a single file,
$ my_app < file01.binary
For multiple files,
$ cat file*.binary | my_app
Each binary file is of size 500MB and the total size of all file*.binary is around 8GB. Based on my understanding, cat will first concatenate all files then redirect the single big file to my_app.
Is there a better way to send multiple binary files to my_app without first concatenating them?
No. cat will just read lines/blocks from the input files in a loop and print them to the pipe. No worries.
The "concatenate" in cat means that it concatenates its input to its output. It does not imply that it concatenates its input(s) in memory first.
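A small sketch of the same pipeline, with tiny files instead of 500MB ones. Here my_app is a hypothetical stand-in that only counts the bytes arriving on its stdin; the file contents are made up for the example.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'abc'  > file01.binary
printf 'defg' > file02.binary

# my_app is a hypothetical stand-in that just counts the bytes on stdin.
my_app() { wc -c | tr -d '[:space:]'; }

# cat copies each file to the pipe in turn; my_app sees one continuous stream.
total=$(cat file*.binary | my_app)
echo "$total"
```

The consumer receives the bytes of both files back to back, and no combined file is ever written to disk.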
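ls file*.binary | xargs cat | my_app
xargs is a command to build and execute commands from standard input. It converts input from standard input into arguments to a command. Here it turns the file names printed by ls into arguments for cat; my_app reads the concatenated stream on its standard input, so it must not be wrapped in xargs itself.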

How to redirect one of several inputs?

In Linux/Unix command line, when using a command with multiple inputs, how can I redirect one of them?
For example, say I'm using cat to concatenate multiple files, but I only want the last few lines of one file, so my inputs are testinput1, testinput2, and tail -n 4 testinput3.
How can I do this in one line without any temporary files?
I tried tail -n 4 testinput3 | cat testinput1 testinput2, but this seems to just take in input 1 and 2.
Sorry for the bad title, I wasn't sure how to phrase it exactly.
Rather than trying to pipe the output of tail to cat, use bash's process substitution. The substituted process runs with its input or output connected to a FIFO or to a file in /dev/fd, and the shell passes the name of that file to the command. This allows you to treat the output of a process as if it were a file.
In the normal case you will generally redirect the output of the process substitution into a loop, e.g., while read -r line; do ##stuff; done < <(process). However, in your case, cat takes the file name as an argument rather than reading from stdin, so you omit the initial redirection, e.g.
cat file1 file2 <(tail -n4 file3)
So be familiar with both forms, < <(process) if you need to redirect a process as input or simply <(process) if you need the result of process to be treated as a file.
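Where process substitution is not available (it is a feature of bash, ksh and zsh rather than of plain sh), a different technique gives the same result: cat accepts - as an operand meaning "read standard input at this position". The sketch below uses made-up file contents. It also shows why the original attempt ignored the pipe: once cat is given file operands, it reads stdin only if - is among them.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'one\n' > testinput1
printf 'two\n' > testinput2
printf 'a\nb\nc\nd\ne\nf\n' > testinput3

# "-" tells cat to read standard input at that position in the list,
# so the output of tail lands after testinput1 and testinput2.
combined=$(tail -n 4 testinput3 | cat testinput1 testinput2 -)
printf '%s\n' "$combined"
```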

Pipe Operator in Linux

As per my understanding, the pipe operator in Linux takes the standard output of one command and channels it to the standard input of the next command. But I have run into an anomaly.
I am trying to get the content of a file on standard output as below.
cat file1
It displays the content. Let's say the content is the name of another file, file2.
Now I want to display the content of file2.
So to take the advantage of pipe operator, I am trying to execute as below
cat file1 | cat
The first cat command should pipe the output (here "file2"). The cat in the subsequent command must accept it from the standard input (here the value is "file2") and print the content of file2.
But it displays "file2" only instead of its contents.
What you should do:
cat `cat file1`
From man cat:
Concatenate FILE(s), or standard input, to standard output.
In other words, if a filename is provided as a parameter, it will write that file's content to the standard output; otherwise it will just copy the standard input to the standard output.
In your case, the filename is read from the standard input and it is interpreted as a string to concatenate to the standard output.
The backquotes are used to inject the standard output of a command, i.e:
cat `cat file1`
is equivalent to
cat file2
which will dump file2 content to standard output.
You could use xargs:
cat file1 | xargs cat
See the xargs manual page (XARGS, General Commands Manual).
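The three variants can be compared side by side in a short sketch. The file contents are made up, and $(...) is used as the modern equivalent of backquotes; quoting it keeps a name containing spaces as a single argument.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'file2\n' > file1
printf 'contents of file2\n' > file2

piped=$(cat file1 | cat)            # stdin is copied through: prints the name
substituted=$(cat "$(cat file1)")   # name becomes an argument: prints contents
via_xargs=$(xargs cat < file1)      # xargs turns stdin into arguments for cat
```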

How to take advantage of filters

I've read here that
To make a pipe, put a vertical bar (|) on the command line between two commands.
then
When a program takes its input from another program, performs some operation on that input, and writes the result to the standard output, it is referred to as a filter.
So I've first tried the ls command whose output is:
Desktop HelloWord.java Templates glassfish-4.0
Documents Music Videos hs_err_pid26742.log
Downloads NetBeansProjects apache-tomcat-8.0.3 mozilla.pdf
HelloWord Pictures examples.desktop netbeans-8.0
Then ls | echo which outputs absolutely nothing.
I'm looking for a way to take advantages of pipelines and filters in my bash script. Please help.
echo doesn't read from standard input. It only writes its command-line arguments to standard output. The cat command is what you want; it copies whatever it reads from standard input to standard output.
ls | cat
(Note that the pipeline above is a little pointless, but does demonstrate the idea of a pipe. The command on the right-hand side must read from standard input.)
Don't confuse command-line arguments with standard input.
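To use this in your own script, any function or script that reads standard input and writes standard output can act as a filter. A minimal sketch, where tag is a hypothetical function name and the input lines are made up:

```shell
# tag: hypothetical filter; it reads stdin line by line and writes
# each line back to stdout with a prefix.
tag() {
    while IFS= read -r line; do
        printf 'item: %s\n' "$line"
    done
}

# Filters chain naturally: sort reads the pipe, then tag reads sort's output.
filtered=$(printf 'b\na\n' | sort | tag)
printf '%s\n' "$filtered"
```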
echo doesn't read standard input. To try something more useful, try
ls | sort -r
to get the output sorted in reverse,
or
ls | grep '[0-9]'
to only keep the lines containing digits.
In addition to what others have said - if your command (echo in this example) does not read from standard input you can use xargs to "feed" this command from standard input, so
ls | echo
doesn't work, but
ls | xargs echo
works fine.
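A quick sketch of the difference, using three made-up names in place of real ls output so the result is predictable:

```shell
# Three lines on stdin become three arguments to a single echo call.
joined=$(printf 'Desktop\nDocuments\nDownloads\n' | xargs echo)
echo "$joined"

# Without xargs, echo never looks at stdin and prints only an empty line.
plain=$(printf 'Desktop\nDocuments\nDownloads\n' | echo)
```

Note that xargs splits its input on whitespace by default, so a file name containing spaces would arrive as several arguments.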

How do I use the filenames output by "grep" as argument to another program

I have this grep command which outputs the names of files (those that contain matches to some pattern), and I want to parse those files with some file-parsing program. The pipeline looks like this:
grep -rl "{some-pattern}" . | {some-file-parsing-program} > a.out
How do I get those file names as command line arguments to the file-parsing program?
For example, let's say grep returns the filenames a, b, c. How do I pass the filenames so that it's as if I'm executing
{some-file-parsing-program} a b c > a.out
?
It looks to me as though you're wanting xargs:
grep -rl "{some-pattern}" . | xargs your-command > a.out
I'm not convinced a.out is a good output file name, but we can let that slide. The xargs command reads white-space separated file names from standard input and then invokes your-command with those names as arguments. It may need to invoke your-command several times; unless you're using GNU xargs and you specify -r, your-command will be invoked at least once, even if there are no matching file names.
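Because xargs splits on white space, file names containing spaces get broken apart. The following sketch assumes GNU grep (for -Z) and an xargs that supports -0 and -r; printf stands in for your-command so that each argument it receives is visible, and the file names are made up.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'needle\n'  > 'a b.txt'    # name contains a space
printf 'needle\n'  > c.txt
printf 'nothing\n' > d.txt

# -Z makes grep end each file name with NUL; -0 makes xargs split on NUL;
# -r skips the command entirely when no names arrive.
# printf stands in for your-command and prints one argument per line.
matches=$(grep -rlZ 'needle' . | xargs -0 -r printf '[%s]\n' | sort)
printf '%s\n' "$matches"
```

The name with a space arrives as one argument rather than two.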
Without using xargs, you could not use sed for this job. Without using xargs, using awk would be clumsy. Perl (and Python) could manage it 'trivially'; it would be easy to write the code to read file names from standard input and then process each file in turn.
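The same idea (read file names from standard input, then process each file in turn) can also be sketched in plain shell with a while read loop. Here wc -l is only a stand-in for the real per-file processing, and the file names are made up. The loop handles names with spaces but not names containing newlines.

```shell
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'needle\nmore\n' > a
printf 'needle\n'       > b
printf 'nothing\n'      > c

# Read one file name per line and handle each file in turn.
# wc -l stands in for the real per-file processing.
report=$(grep -rl 'needle' . | sort | while IFS= read -r file; do
    printf '%s %s\n' "$file" "$(wc -l < "$file" | tr -d '[:space:]')"
done)
printf '%s\n' "$report"
```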
I don't know of any linux programs that cannot read from stdin. Depending on the program, the default input may be stdin or you may need to specify to use stdin by using a command line option (often - by itself). Do you have anything particular in mind?
