Can xargs be used to run several arbitrary commands in parallel? - linux

I'd like to be able to provide a long list of arbitrary/different commands (varying binary/executable and arguments) and have xargs run those commands in parallel (xargs -P).
I can use xargs -P fine when only varying arguments. It's when I want to vary the executable and arguments that I'm having difficulty.
Example: command-list.txt
% cat command-list.txt
binary_1 arg_A arg_B arg_C
binary_2 arg_D arg_E
.... <lines deleted for brevity>
binary_100 arg_AAA arg_BBB
% xargs -a command-list.txt -P 4 -L 1
** I know the above command will only echo my command-list.txt **
I am aware of GNU parallel but can only use xargs for now. I also can't just background all the commands since there could be too many for the host to handle at once.
Solution is probably staring me in the face. Thanks in advance!

If you don't have access to parallel, one solution is just to use sh with your command as the parameter.
For example:
xargs -a command-list.txt -P 4 -I COMMAND sh -c "COMMAND"
The -c option for sh executes the given string as a command (instead of reading commands from a script file). The man page explanation is:
-c string If the -c option is present, then commands are read from
string. If there are arguments after the string, they are
assigned to the positional parameters, starting with $0.
And the -I for xargs tells it to run one command at a time (like -L 1) and to search and replace the parameter (COMMAND in this case) with the current line being processed by xargs. Man page info is below:
-I replace-str
Replace occurrences of replace-str in the initial-arguments with
names read from standard input. Also, unquoted blanks do not
terminate input items; instead the separator is the newline
character. Implies -x and -L 1.
sh seems to be very forgiving of commands containing quotation marks ("), so you don't appear to need to escape them with a regexp first.
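Putting it together, a minimal runnable sketch (the file path and the echo commands are made up for illustration; your real command-list.txt would hold the binary_N lines):

```shell
# Build a throwaway command list; each line is one complete command.
printf '%s\n' 'echo one' 'echo two' 'echo three' > /tmp/command-list.txt

# Run up to 4 lines concurrently, each line handed to its own sh -c.
# -I implies one line per invocation, so each line runs as a unit.
xargs -P 4 -I CMD sh -c 'CMD' < /tmp/command-list.txt
```

Reading from stdin instead of -a also works on non-GNU xargs.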

Related

Does bash -c or zsh -c have a limit on a string it executes?

It appears that there is a 240 character limit on the expanded string. This quick test works for short file names, but does not work for longer names.
ls | xargs -I {} zsh -c "echo '---------------------------------------------------------------------------------------------------------------------------------------------------------{}'; echo '==============================================================================={}'"
Is there a way to expand this limit on Mac and/or Linux?
No, bash and zsh have no such limit.
Instead, here's man xargs (emphasis mine):
-I replstr
Execute utility for each input line, replacing one or more occurrences of replstr in up to replacements (or 5 if no -R flag is specified) arguments to utility with the entire line of input. The resulting arguments, after replacement is done, will not be allowed to grow beyond 255 bytes; this is implemented by concatenating as much of the argument containing replstr as possible, to the constructed arguments to utility, up to 255 bytes. The 255 byte limit does not apply to arguments to utility which do not contain replstr, and furthermore, no replacement will be done on utility itself. Implies -x.
The source code is more direct:
Replaces str with a string consisting of str with match replaced with replstr as many times as can be done before the constructed string is maxsize bytes large.
So if the string is already 255+ characters long, the number of times it can replace the string is zero.
This is not a problem in practice since you would never use the replstr in the argument to *sh -c due to the security and robustness issues it causes.
Instead, pass the arguments separately and reference them from the shell command:
find . -print0 | xargs -0 sh -c 'for arg; do echo "Received: $arg"; done' _
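The trailing _ is there because the first word after the command string becomes $0 rather than $1; for arg iterates over "$@", so it sees only the real arguments. A self-contained sketch of the same pattern without find:

```shell
# "_" becomes $0 inside the inline script; the NUL-delimited names
# land in $1, $2, ... and survive spaces and leading dashes intact.
printf '%s\0' 'a b.txt' '-dash.txt' |
    xargs -0 sh -c 'for arg; do echo "Received: $arg"; done' _
```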
This depends on the operating system, not on the shell. You can find this limit on Linux-like systems by
getconf ARG_MAX
On my platform, this is 32000.
Actually, this is not just the limit for a single command argument, but for the whole command line.

Bash script to mkdir on each line of a file that has been split by a delimiter?

Trying to figure out how to iterate through a .txt file (filemappings.txt) line by line, then split each line using tab(\t) as a delimiter so that we can create the directory specified on the right of the tab (mkdir -p).
Reading filemappings.txt and then splitting each line by tab
server/ /client/app/
server/a/ /client/app/a/
server/b/ /client/app/b/
Would turn into
mkdir -p /client/app/
mkdir -p /client/app/a/
mkdir -p /client/app/b/
Would xargs be a good option? Why or why not?
cut -f 2 filemappings.txt | tr '\n' '\0' | xargs -0 mkdir -p
xargs -0 is great for vector operations.
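A quick way to try the pipeline end to end before pointing it at real data (the sample file and target paths below are hypothetical):

```shell
# Two tab-separated mappings; only the second column feeds mkdir -p.
printf 'server/\t/tmp/demo/app/\nserver/a/\t/tmp/demo/app/a/\n' > /tmp/filemappings.txt

# cut keeps column 2, tr makes the lines NUL-delimited for xargs -0.
cut -f 2 /tmp/filemappings.txt | tr '\n' '\0' | xargs -0 mkdir -p

ls -d /tmp/demo/app/a/
```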
You already have an answer telling you how to use xargs. In my experience, xargs is useful when you want to run a simple command on a list of arguments that are easy to retrieve. In your example, xargs will do nicely. However, if you want to do something more complicated than run a simple command, you may want to use a while loop:
while IFS=$'\t' read -r a b
do
mkdir -p "$b"
done <filemappings.txt
In this special case, read -r a b will read two fields separated by the defined IFS and put each in a different variable. If you are a one-liner lover, you may also do:
while IFS=$'\t' read -r a b; do mkdir -p "$b"; done <filemappings.txt
In this way you may read multiple arguments to apply to any series of commands; something that xargs is not well suited to do.
Using read -r will read a line literally regardless of any backslashes in it, in case you need to read a line with backslashes.
Also note that some operating systems may allow tabs as part of a file or directory name. That would break the use of the tab as the separator of arguments.
As others have pointed out, a \t character could also be part of a file or directory name, in which case the commands above may fail. Assuming the question represents the true form of the input file (no tabs inside names), one can use:
$ grep -o -P '(?<=\t).*' filemappings.txt | xargs -d'\n' mkdir -p
It uses -P (Perl-style regex) to grab everything after the \t (TAB) character, then -d'\n' so that each input line becomes a single argument, all of them handed to one mkdir -p invocation.
sed -n '/\t/{s:^.*\t\t*:mkdir -p ":;s:$:":;p}' filemappings.txt | bash
sed -n: print nothing by default; operate only on lines that contain a tab (the delimiter)
s:^.*\t\t*:mkdir -p ":: replace everything from the beginning of the line through the last tab with mkdir -p " (the second s command appends the closing quote)
| bash: feed the generated mkdir commands to bash, which creates the folders
With GNU Parallel it looks like this:
parallel --colsep '\t' mkdir -p {2} < filemapping.txt

Unix Bash - Refer to internal function and pass parameters

I am installing an AMP server on OS X (much easier in Ubuntu) using the MacPorts method. I would like to add a bash script in my path called apachectl that will refer to /opt/local/apache2/bin/apachectl. I have been able to do this, but I was wondering how I can then pass parameters to apachectl that would then be passed on to /opt/local/apache2/bin/apachectl?
e.g. apachectl -t >>> /opt/local/apache2/bin/apachectl -t
For those wondering why I don't just reorder my path: I was asking so that I could do the same thing with other commands, such as ls -l, which I currently have as ll (Ubuntu style) and which looks like
ls -l $1
in the file.
Is the only way to do this via positional parameters such as what I have done above?
For what you want, you want to use "$@".
The explanation is from this answer, which in turn quotes this page:
$@ -- Expands to the positional parameters, starting from one.
When the expansion occurs within double quotes, each parameter
expands to a separate word. That is, "$@" is equivalent to "$1"
"$2" ... If the double-quoted expansion occurs within a word, the
expansion of the first parameter is joined with the beginning part
of the original word, and the expansion of the last parameter is
joined with the last part of the original word. When there are no
positional parameters, "$@" and $@ expand to nothing (i.e., they are removed).
That would mean that you could call your ll script as follows:
ll -a /
"$#" will expand -a / into separate positional parameters, meaning that your script actually ran
ls -l -a /
You could also use a function:
apachectl() {
/opt/local/apache2/bin/apachectl "$@"
}
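The same fix applies to the ll example: replacing $1 with "$@" lets the wrapper forward any number of options and paths. A sketch, written to /tmp here purely for illustration:

```shell
# Create the wrapper; "$@" forwards every argument with word
# boundaries intact (spaces in arguments survive).
cat > /tmp/ll <<'EOF'
#!/bin/sh
exec ls -l "$@"
EOF
chmod +x /tmp/ll

# Multiple options and paths now pass straight through to ls -l.
/tmp/ll -d /tmp
```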

How to execute sh script for files beginning with minus and including spaces?

I am trying this:
ls | sed 's/.*/"&"/' | xargs sh -- script.sh
for files:
-test 23.txt
test24.txt
te st.txt
but after this, script.sh is executed only for:
-test 23.txt
Better to use a glob:
./script.sh *
No need to add double quotes as you tried.
If your script doesn't loop over its arguments, try this:
for i in *; do ./script.sh "$i"; done
xargs, by default, assumes that the command it is expanding can take multiple arguments. In your example, xargs would have executed
sh -- script.sh "-test 23.txt" "test24.txt" "te st.txt"
If your script only echoes its first argument, then you'll only see -test 23.txt
You can tell xargs to execute the command for every input by using the -n1 flag.
In many cases, xargs is not what you want, even coupled with the find command (which has a useful -exec action). When it is what you want, you usually want to use the -0 flag coupled with some flag on the other side of the pipe which delimits arguments with NUL characters instead of spaces or newlines.
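For completeness, here is a sketch using find with its -exec action, which copes with both the leading dash and the spaces. The script.sh below is a stand-in that just echoes its first argument:

```shell
# Set up files with awkward names plus a stand-in script.sh.
mkdir -p /tmp/awkward && cd /tmp/awkward
touch -- '-test 23.txt' 'test24.txt' 'te st.txt'
printf '#!/bin/sh\necho "$1"\n' > script.sh
chmod +x script.sh

# -exec runs the script once per file; the ./ prefix that find adds
# keeps names starting with "-" from being parsed as options.
find . -maxdepth 1 -name '*.txt' -exec ./script.sh {} \;
```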

bash: get list of commands starting with a given string

Is it possible to get, using Bash, a list of commands starting with a certain string?
I would like to get what is printed hitting <tab> twice after typing the start of the command and, for example, store it inside a variable.
You should be able to use the compgen command, like so:
compgen -A builtin [YOUR STRING HERE]
For example, "compgen -A builtin l" returns
let
local
logout
You can use other keywords in place of "builtin" to get other types of completion. Builtin gives you shell builtin commands. "File" gives you local filenames, etc.
Here's a list of actions (from the BASH man page for complete which uses compgen):
alias Alias names. May also be specified as -a.
arrayvar Array variable names.
binding Readline key binding names.
builtin Names of shell builtin commands. May also be specified as -b.
command Command names. May also be specified as -c.
directory Directory names. May also be specified as -d.
disabled Names of disabled shell builtins.
enabled Names of enabled shell builtins.
export Names of exported shell variables. May also be specified as -e.
file File names. May also be specified as -f.
function Names of shell functions.
group Group names. May also be specified as -g.
helptopic Help topics as accepted by the help builtin.
hostname Hostnames, as taken from the file specified by the HOSTFILE shell
variable.
job Job names, if job control is active. May also be specified as
-j.
keyword Shell reserved words. May also be specified as -k.
running Names of running jobs, if job control is active.
service Service names. May also be specified as -s.
setopt Valid arguments for the -o option to the set builtin.
shopt Shell option names as accepted by the shopt builtin.
signal Signal names.
stopped Names of stopped jobs, if job control is active.
user User names. May also be specified as -u.
variable Names of all shell variables. May also be specified as -v.
A fun way to do this is to hit M-* (Meta is usually left Alt).
As an example, type this:
$ lo
Then hit M-*:
$ loadkeys loadunimap local locale localedef locale-gen locate
lockfile-create lockfile-remove lockfile-touch logd logger login
logname logout logprof logrotate logsave look lorder losetup
You can read more about this in man 3 readline; it's a feature of the readline library.
If you want exactly how bash would complete
COMPLETIONS=$(compgen -c "$WORD")
compgen completes using the same rules bash uses when tabbing.
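For example, to capture the completions for the prefix "ls" (compgen is a bash builtin, so this must run under bash):

```shell
# Collect every command-name completion for the prefix "ls";
# -c matches commands, builtins, functions, and aliases.
COMPLETIONS=$(bash -c 'compgen -c "$1"' _ ls)

# "ls" itself should be among the matches on any normal system.
printf '%s\n' "$COMPLETIONS" | grep -x ls
```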
JacobM's answer is great. For doing it manually, I would use something like this:
echo "$PATH" | tr ':' '\n' |
while read -r p; do
    for i in "$p"/mod*; do
        [[ -x "$i" && -f "$i" ]] && echo "$i"
    done
done
The test before the output makes sure only executable, regular files are shown. The above shows all commands starting with mod.
Interesting, I didn't know about compgen. Here's a script I've used to do it, which doesn't check for non-executables:
#!/bin/bash
echo "$PATH" | tr ':' '\0' | xargs -0 ls | grep "$@" | sort
Save that script somewhere in your $PATH (I named it findcmd), chmod u+x it, and then use it just like grep, passing your favorite options and pattern:
findcmd ^foo # finds all commands beginning with foo
findcmd -i -E 'ba+r' # finds all commands matching the pattern 'ba+r', case insensitively
Just for fun, another manual variant:
find -L $(echo $PATH | tr ":" " ") -name 'pattern' -type f -perm -001 -print
where pattern specifies the file name pattern you want to use. This will miss commands that are not globally executable, but which you have permission for.
[tested on Mac OS X]
Use the -or and -and flags to build a more comprehensive version of this command:
find -L $(echo $PATH | tr ":" " ") -name 'pattern' -type f
\( \
-perm -001 -or \
\( -perm -100 -and -user $(whoami)\) \
\) -print
will pick up files you have permission for by virtue of owning them. I don't see a general way to get all those you can execute by virtue of group affiliation without a lot more coding.
Iterate over the $PATH variable and do ls beginningofword* for each directory in the path?
To get it exactly equivalent, you would need to filter out only executable files and sort by name (should be pretty easy with ls flags and the sort command).
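A POSIX sketch of that suggestion, filtering to regular executable files and deduplicating by name (the prefix mod is just an example):

```shell
prefix=mod
echo "$PATH" | tr ':' '\n' | while read -r d; do
  for f in "$d/$prefix"*; do
    # When a directory has no match, $f is the literal pattern,
    # which fails the -f test and is skipped.
    [ -f "$f" ] && [ -x "$f" ] && basename "$f"
  done
done | sort -u
```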
What is listed when you hit <tab> twice are the binaries in your PATH that start with that string. So, if your PATH variable contains:
PATH=/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/lib/java/bin:/usr/lib/java/jre/bin:/usr/lib/qt/bin:/usr/share/texmf/bin:.
Bash will look in each of those directories to show you the suggestions once you hit <tab>. Thus, to get the list of commands starting with "ls" into a variable you could do:
MYVAR=$(ls /usr/local/bin/ls* /usr/bin/ls* /bin/ls*)
Naturally you could add all the other directories I haven't.
