How can I use xargs to run a function in a command substitution for each match? - linux

While writing Bash functions for string replacements I have encountered a strange behaviour when using xargs. It is currently driving me mad, as I cannot get it to work.
Fortunately I have been able to nail it down to the following simple example:
Define a simple function which doubles every character of the given parameter:
function subs { echo $1 | sed -E "s/(.)/\1\1/g"; }
Call the function:
echo $(subs "ABC")
As expected the output is:
AABBCC
Now call the function using xargs:
echo "ABC" | xargs -I % echo $(subs "%")
Surprisingly the result now is:
ABCABC
It seems as if the sed command inside the function now treats the whole string as a single character.
Why does this happen and how can it be prevented?
You might ask why I use xargs at all. Of course, this is a simplified example; the actual use case is much more complex.
In the original use case, I have a program which produces lots of output. I pipe the output through several greps to get the lines of interest. Afterwards, I pipe the lines to sed to extract the data I need from them. Because some transformations I need to perform on the data are too complex for regular expressions alone, I'd like to use a function for these. So my original idea was to simply pipe into the function, but I couldn't get that to work and ended up with the xargs solution. The idea was something like this:
command | grep ... | grep ... | grep ... | sed ... | subs
BTW: I do not do this from the command line but from within a script. The function is defined in the very same script in which it is used.
I'm using Bash 3.2 (Mac OS X default), so fancy Bash 4.x stuff won't help me, sorry.
I'd be happy about anything that might shed some light on this topic.
Best regards
Frank

If you really need to do this (and you probably don't, but we can't help without a more representative sample), a better-practice approach might look like:
subs() { sed -E "s/(.)/\1\1/g" <<<"$1"; }
export -f subs
echo "ABC" | xargs bash -c 'for arg; do subs "$arg"; done' _
The use of echo "$(subs "$arg")" instead of just subs "$arg" adds nothing but bugs (consider what happens if one of your arguments is -n -- and that's assuming a relatively tame echo; implementations are allowed to consume backslashes even without a -e argument and to do all manner of other surprising things). You could do it above, but it would slow your program down and make it more prone to surprising behaviors; there's no point.
Running export -f subs exports your function to the environment, so it can be run by other instances of bash invoked as child processes (all programs invoked by xargs run outside your shell, so they can't see shell-local variables or functions).
Without -I -- which is to say, in its default mode of operation -- xargs appends arguments to the end of the command it's given. This permits a much more efficient usage mode, where instead of invoking one command per line of input, it passes as many arguments as possible to the smallest possible number of subprocesses.
This also avoids major security bugs that can happen when using xargs -I in conjunction with bash -c '...' or sh -c '...'. (If you ever use -I% sh -c '...%...', then your filenames become part of your code and can be used in injection attacks on your system.)
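Here is a harmless demonstration of that class of bug; the $(date) filename is hypothetical, standing in for genuinely hostile input:
# UNSAFE: with -I, the "filename" is spliced into the code string and parsed as shell
printf '%s\n' '$(date).txt' | xargs -I% sh -c 'echo %'
# prints the current date followed by .txt -- the command substitution ran!

# SAFE: the filename stays an argument and is never parsed as code
printf '%s\n' '$(date).txt' | xargs sh -c 'for f; do echo "$f"; done' _
# prints the literal string $(date).txt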

That's because the construct $(subs "%") is expanded by the shell while it parses the pipeline, before xargs ever runs. At that point subs receives the literal string %, doubles it to %%, and xargs is therefore invoked as echo %%; -I then replaces each % with ABC, giving ABCABC.
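You can see the two stages yourself (each command followed by its output):
subs "%"
%%
echo "ABC" | xargs -I % echo %%
ABCABC
The first call shows that subs doubles the literal %; the second shows that -I then replaces both % characters with ABC.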

Related

Function in pipe chain

I have this function whose input parameters are a search string and an input file. The function works with files:
f_highlite() {
    sed -e 's/\($1\)/\o033[91m\1\o033[39m/g' $2
}
Now I would like to use this function in a pipe. How should it be modified?
ps aux | grep java | f_highlite "Xms" -
PS: I'm not sure how exactly to name this question. If you have a better suggestion, say it. ;]
First, you need to use double quotes, otherwise $1 wouldn't get expanded:
f_highlite() {
    sed -e "s/\($1\)/\o033[91m\1\o033[39m/g" "$2"
}
Btw, you need to make sure that $1 won't contain characters that are understood by sed as syntax elements. For Xms that's fine.
To the topic, you can pass - as the second argument to the function because sed understands - as stdin:
ps aux | grep java | f_highlite "Xms" -
(thanks @chepner!)
There are two other approaches that you might want to know about, as not all commands support the - trick.
The first one is having a function that works on streams and does not take a file as input. You can do that by removing the $2 at the end and changing how you call the function:
f_highlite() {
    sed -e "s/\($1\)/\o033[91m\1\o033[39m/g"
}
f_highlite "Xms" <yourfile
This will redirect the content of your file to the standard input of the function (and hence to that of sed). In your original pipeline you would simply write ps aux | grep java | f_highlite "Xms".
The other approach is to keep your function as is (I am reusing a correction to the quoting suggested in another answer), but feed it a file by using process substitution.
f_highlite() {
    sed -e "s/\($1\)/\o033[91m\1\o033[39m/g" "$2"
}
f_highlite "Xms" <(cat yourfile)
This (conceptually at least) creates a FIFO that is fed the content of your file, with its read end handed to the function as a filename. The key here is that <(cat yourfile) becomes a filename (you can try printing it to validate that).
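For instance (the file here is illustrative, and the exact /dev/fd number varies):
echo <(cat /etc/hosts)
/dev/fd/63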

Bash command line arguments passed to sed via ssh

I am looking to write a simple script to perform an SSH command on many hosts simultaneously, where the exact list of hosts is generated by another script. The problem is that when I run the script with something like sed, it doesn't work properly.
It should run like sshall.sh {anything here} and it will run the {anything here} part on all the nodes in the list.
sshall.sh
#!/bin/bash
NODES=`listNodes | grep "node-[0-9*]" -o`
echo "Connecting to all nodes and running: ${@:1}"
for i in $NODES
do
    :
    echo "$i : Begin"
    echo "----------------------------------------"
    ssh -q -o "StrictHostKeyChecking no" $i "${@:1}"
    echo "----------------------------------------"
    echo "$i : Complete";
    echo ""
done
When it is run with something like whoami it works but when I run:
[root@myhost bin]# sshall.sh sed -i '/^somebeginning/ s/$/,appendme/' /etc/myconfig.conf
Connecting to all nodes and running: sed -i /^somebeginning/ s/$/,appendme/ /etc/myconfig.conf
node-1 : Begin
----------------------------------------
sed: -e expression #1, char 18: missing command
----------------------------------------
node-1 : Complete
node-2 : Begin
----------------------------------------
sed: -e expression #1, char 18: missing command
----------------------------------------
node-2 : Complete
…
Notice that the quotes disappear on the sed command when sent to the remote client.
How do I go about fixing my bash command?
Is there a better way of achieving this?
Substitute an eval-safe quoted version of your command into a heredoc:
#!/bin/bash
#      ^^^^- not /bin/sh; printf %q is an extension

# Put your command into a single string, with each argument quoted to be eval-safe
printf -v cmd_q '%q ' "$@"

while IFS= read -r hostname; do
    # run bash -s remotely, with that string passed on stdin
    ssh -q -o 'StrictHostKeyChecking no' "$hostname" "bash -s" <<EOF
$cmd_q
EOF
done < <(listNodes | grep -o -e "node-[0-9*]")
Why this works reliably (and other approaches don't):
printf %q knows how to quote contents to be eval'd by that same shell (so spaces, wildcards, various local quoting methods, etc. will always be supported).
Arguments given to ssh are not passed to the remote command individually!
Instead, they're concatenated into a string passed to sh -c.
However: The output of printf %q is not portable to all POSIX-derived shells! It's guaranteed to be compatible with the same shell locally in use -- ksh will always parse output from printf '%q' in ksh, bash will parse output from printf '%q' in bash, etc; thus, you can't safely pass this string on the remote argument vector, because it's /bin/sh -- not bash -- running there. (If you know your remote /bin/sh is provided by bash, then you can run ssh "$hostname" "$cmd_q" safely, but only under this condition).
bash -s reads the script to run from stdin, meaning that passing your command there -- not on the argument vector -- ensures that it'll be parsed into arguments by the same shell that escaped it to be shell-safe.
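As a quick illustration, here is what %q produces for the arguments from the question (run locally in bash; the exact escaping can differ slightly between bash versions):
printf '%q ' sed -i '/^somebeginning/ s/$/,appendme/' /etc/myconfig.conf; echo
sed -i /\^somebeginning/\ s/\$/\,appendme/ /etc/myconfig.conf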
You want to pass the entire command -- with all of its arguments, spaces, and quotation marks -- to ssh so it can pass it unchanged to the remote shell for parsing.
One way to do that is to put it all inside single quotation marks. But then you'll also need to make sure the single quotation marks within your command are preserved in the arguments, so the remote shell builds the correct arguments for sed.
sshall.sh 'sed -i '"'"'/^somebeginning/ s/$/,appendme/'"'"' /etc/myconfig.conf'
It looks redundant, but '"'"' is a common Bourne trick to get a single quotation mark into a single-quoted string. The first quote ends single-quoting temporarily, the double-quote-single-quote-double-quote construct appends a single quotation mark, and then the final single quotation mark resumes your single-quoted section. So to speak.
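You can check what the shell builds by handing the same string to echo first:
echo 'sed -i '"'"'/^somebeginning/ s/$/,appendme/'"'"' /etc/myconfig.conf'
sed -i '/^somebeginning/ s/$/,appendme/' /etc/myconfig.conf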
Another trick that can be helpful for troubleshooting is to add the -v flag to your ssh flags, which will spit out lots of text, but most importantly it will show you exactly what string it's passing to the remote shell for parsing and execution.
--
All of this is fairly fragile around spaces in your arguments, which you'll need to avoid, since you're relying on shell parsing on the opposite end.
Thinking outside the box: instead of dealing with all the quoting issues and the word-splitting in the wrong places, you could attempt to a) construct the script locally (maybe use a here-document?), b) scp the script to the remote end, then c) invoke it there. This easily allows more complex command sequences, with all the power of shell control constructs etc. Debugging (checking proper quoting) would be a breeze by simply looking at the generated script.
I recommend reading the command(s) from the standard input rather than from the command line arguments:
cmd.sh
#!/bin/bash -
# Load server_list with user@host "words" here.
cmd=$(</dev/stdin)
for h in ${server_list[*]}; do
    ssh "$h" "$cmd"
done
Usage:
./cmd.sh <<'CMD'
sed -i '/^somebeginning/ s/$/,appendme/' /path/to/file1
# other commands
# here...
CMD
Alternatively, run ./cmd.sh, type the command(s), then press Ctrl-D.
I find the latter variant the most convenient: you don't even need here documents, and there's no need for extra escaping. Just invoke your script, type the commands, and press the shortcut. What could be easier?
Explanations
The problem with your approach is that the quotes are stripped from the arguments by the shell. For example, the argument '/^somebeginning/ s/$/,appendme/' will be interpreted as the string /^somebeginning/ s/$/,appendme/ (without the single quotes), which is an invalid argument for sed.
Of course, you can escape the command with the built-in printf, as suggested in another answer here. But the command becomes hard to read after escaping. For example,
printf %q 'sed -i /^somebeginning/ s/$/,appendme/ /home/ruslan/tmp/file1.txt'
produces
sed\ -i\ /\^somebeginning/\ s/\$/\,appendme/\ /home/ruslan/tmp/file1.txt
which is not very readable and will look ugly if you print it to the screen in order to show progress.
That's why I prefer to read from the standard input and leave the command intact. My script prints the command strings to the screen, and I see them just in the form I have written them.
Note: the for .. in loop iterates over $IFS-separated "words" and is generally not the preferred way to traverse an array; it is usually better to invoke read -r in a while loop with adjusted $IFS (a sketch follows below). I have used the for loop for simplicity, as the question is really about invoking the ssh command.
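A minimal sketch of that preferred form, assuming the same server_list array and cmd variable as above:
while IFS= read -r h; do
    ssh "$h" "$cmd"
done < <(printf '%s\n' "${server_list[@]}")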
Logging into multiple systems over SSH and using the same (or variations on the same) command is the basic use case behind ansible. The system is not without significant flaws, but for simple use cases it's pretty great. If you want a more solid solution without too much faffing about with escaping and looping over hosts, take a look.
Ansible has a 'raw' module which doesn't even require any dependencies on the target hosts, and you might find that a very simple way to achieve this sort of functionality in a way that frees you from the considerations of looping over hosts, handling errors, marshalling the commands, etc and lets you focus on what you're actually trying to achieve.

How to make bash know that | is a pipe and not a string

Hi, my question is simple. I want to do this at a command prompt.
var="ls | cat"
$var
Now I know that when I try to do this manually
ls | cat
Bash takes | as a special thing. I don't know what it's called; I know | is called a pipe, but I mean that bash takes | as a ... and actually makes a pipe. I also figured that when I try to do $var, bash takes | as a string and not as a pipe. Well, my question is: how can I make bash realize that | is actually a pipe and not a string? Thanks, I hope I am clear about my point.
Simple solution: use eval:
var="ls | cat"
eval "$var"
bash interprets the arguments to eval as if you had typed that on the command line.
Of course, keep in mind the security risks to using eval with user input, in case that's an issue for your program.
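For example (the hostile value here is hypothetical; the dangerous line is left commented out on purpose):
var="ls | cat"
eval "$var"                  # runs: ls | cat

var='ls; rm -rf "$HOME"'     # imagine this arriving as user input
# eval "$var"                # would list files, then delete your home directory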
This may or may not apply, but it sounds like you may be looking for the alias command. You can do alias var="ls | cat" and then at your command prompt you can run var, and it is treated as if you wrote ls | cat.
Rather than trying to embed executable code into a variable (which should be used to hold data, not code), use a shell function, which is intended to hold code:
my_func () {
    ls | cat
}
| is called a pipe; I haven't heard any other name for it. Basically, the stream output by the command on its left goes as the input to the command on its right. In your case, the output of ls goes into a stream (a buffer managed by the kernel, not a temporary file), and that stream is fed to cat. cat prints the content of a file, and the ls stream behaves very much like a file.
Now, you are trying to make bash interpret your variable var. To do this, try:
var=`ls | cat`
$var
On my computer I get this:
-bash: Applications: command not found
Because in my case, $var is expanded to Applications Documents Downloads, the output of my ls.
Given crudely as is, bash believes this is a command I want it to execute.
If your intention is not to execute the content of $var but to print it, try:
var=`ls | cat`
echo $var
The cat is not needed here; just use ls -1, and as other answers say, you can alias it or put it in a function.
For example, if you want to override ls to print each file on a new line do something like
> alias ls='command ls -1'
> ls
file1
file2
etc...
And put it in a bash init file like ~/.bashrc if you want to make the change permanent
1) Functions are suitable for such tasks:
func (){
    ls | cat
}
Invoke it by saying func
2) Also another suitable solution could be eval:
eval takes a string as its argument, and evaluates it as if you'd typed that string on a command line. (If you pass several arguments, they are first joined with spaces between them.)
var="ls | cat"
eval $var

Accessing each line using a $ sign in linux

Whenever I execute a Linux command that outputs multiple lines, I want to perform some operation on each line of the output. Generally I do
command something | while read a
do
    some operation on $a;
done
This works fine. But my question is: is there some way I can access each line via a predefined symbol (I don't know what to call it), something like $?, $!, or $_?
Is it possible to do
cat to_be_removed.txt | rm -f $LINE
Is there a predefined $LINE in bash, or is the previous one the shortest way? I.e.
cat to_be_removed.txt | while read line; do rm -f $line; done;
xargs is what you're looking for:
cat to_be_removed.txt | xargs rm -f
Watch out for spaces in your filenames if you use that one, though. Check out the xargs man page for more information.
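If the names may contain spaces, a delimiter-aware invocation is safer; a sketch (note that -d is a GNU xargs extension):
xargs -d '\n' rm -f < to_be_removed.txt
or, NUL-delimited, which also tolerates other odd characters in names:
tr '\n' '\0' < to_be_removed.txt | xargs -0 rm -f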
You might be looking for the xargs command.
It takes control arguments, plus a command and optionally some arguments for the command. It then reads its standard input, normally splitting at white space, and then arranges to repeatedly execute the command with the given arguments and as many 'file names' read from the standard input as will fit on the command line.
rm -f $(<to_be_removed.txt)
This works because rm can take multiple files as input. It also makes it much more efficient because you only call rm once and you don't need to create a pipe to cat or xargs
On a separate note, rather than using pipes in a while loop, you can avoid a subshell by using process substitution:
while read line; do
    some operation on $line;
done < <(command something)
The additional benefit you get by avoiding a subshell is that variables you change inside the loop maintain their altered values outside the loop as well. This is not the case when using the pipe form and it is a common gotcha.
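A minimal demonstration of that gotcha:
count=0
printf '%s\n' a b c | while read -r line; do count=$((count + 1)); done
echo "$count"    # prints 0 -- the loop body ran in a subshell

count=0
while read -r line; do count=$((count + 1)); done < <(printf '%s\n' a b c)
echo "$count"    # prints 3 -- the change survives the loop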

How to pass the value of a variable to the standard input of a command?

I'm writing a shell script that should be somewhat secure, i.e., does not pass secure data through parameters of commands and preferably does not use temporary files. How can I pass a variable to the standard input of a command?
Or, if it's not possible, how can I correctly use temporary files for such a task?
Passing a value to standard input in Bash is as simple as:
your-command <<< "$your_variable"
Always make sure you put quotes around variable expressions!
Be cautious: this will probably work only in bash and will not work in sh.
Simple, but error-prone: using echo
Something as simple as this will do the trick:
echo "$blah" | my_cmd
Do note that this may not work correctly if $blah contains -n, -e, -E, etc., or if it contains backslashes (bash's copy of echo preserves literal backslashes in the absence of -e by default, but will treat them as escape sequences and replace them with the corresponding characters even without -e if optional XSI extensions are enabled).
More sophisticated approach: using printf
printf '%s\n' "$blah" | my_cmd
This does not have the disadvantages listed above: all possible C strings (strings not containing NULs) are printed unchanged.
(cat <<END
$passwd
END
) | command
The cat is not really needed, but it helps to structure the code better and allows you to use more commands in parentheses as input to your command.
Note that the echo "$var" | command operations mean that standard input is limited to the line(s) echoed. If you also want the terminal to be connected, then you'll need to be fancier:
{ echo "$var"; cat - ; } | command
( echo "$var"; cat - ) | command
This means that the first line(s) will be the contents of $var but the rest will come from cat reading its standard input. If the command does not do anything too fancy (try to turn on command line editing, or run like vim does) then it will be fine. Otherwise, you need to get really fancy - I think expect or one of its derivatives is likely to be appropriate.
The command line notations are practically identical - but the second semi-colon is necessary with the braces whereas it is not with parentheses.
This robust and portable way has already appeared in comments. It should be a standalone answer.
printf '%s' "$var" | my_cmd
or
printf '%s\n' "$var" | my_cmd
Notes:
It's better than echo; the reasons are explained here: Why is printf better than echo?
printf "$var" is wrong. The first argument is format where various sequences like %s or \n are interpreted. To pass the variable right, it must not be interpreted as format.
Usually variables don't contain trailing newlines. The former command (with %s) passes the variable as it is. However, tools that work with text may ignore or complain about an incomplete line (see Why should text files end with a newline?). So you may want the latter command (with %s\n), which appends a newline character to the content of the variable.
Non-obvious facts:
Here string in Bash (<<<"$var" my_cmd) does append a newline.
Any method that appends a newline results in non-empty stdin of my_cmd, even if the variable is empty or undefined.
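Both facts are easy to check (wc -c counts bytes):
var='hi'
printf '%s' "$var" | wc -c      # 2 -- no newline added
printf '%s\n' "$var" | wc -c    # 3 -- newline appended
wc -c <<< "$var"                # 3 -- the here string appends a newline too

unset var
printf '%s' "$var" | wc -c      # 0 -- stdin is truly empty
wc -c <<< "$var"                # 1 -- a lone newline, so stdin is non-empty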
I liked Martin's answer, but a caveat about a variation you may encounter:
your-command <<< """$your_variable"""
The tripled quotes add nothing: adjacent empty strings concatenate to the empty string, so this is exactly equivalent to "$your_variable". A single pair of double quotes already passes characters such as " or ! in the value through literally.
As per Martin's answer, there is a Bash feature called Here Strings (which itself is a variant of the more widely supported Here Documents feature):
3.6.7 Here Strings
A variant of here documents, the format is:
<<< word
The word is expanded and supplied to the command on its standard
input.
Note that Here Strings would appear to be Bash-only, so, for improved portability, you'd probably be better off with the original Here Documents feature, as per PoltoS's answer:
( cat <<EOF
$variable
EOF
) | cmd
Or, a simpler variant of the above:
(cmd <<EOF
$variable
EOF
)
You can omit ( and ), unless you want to have this redirected further into other commands.
Try this:
echo "$variable" | command
If you came here from a duplicate, you are probably a beginner who tried to do something like
"$variable" >file
or
"$variable" | wc -l
where you obviously meant something like
echo "$variable" >file
echo "$variable" | wc -l
(Real beginners also forget the quotes; usually use quotes unless you have a specific reason to omit them, at least until you understand quoting.)
