How to execute sh script for files beginning with minus and including spaces? - linux

I am trying this:
ls | sed 's/.*/"&"/' | xargs sh -- script.sh
for files:
-test 23.txt
test24.txt
te st.txt
but after this, script.sh is executed only for:
-test 23.txt

Better to use a glob:
./script.sh *
There's no need to add double quotes the way you tried.
If your script doesn't loop over its arguments, try this:
for i in *; do ./script.sh "$i"; done
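If the commands inside your script treat a leading dash as an option, you can prefix the glob so that every name starts with ./ instead (a small variation on the loop above):
for i in ./*; do ./script.sh "$i"; done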

xargs, by default, assumes that the command it is expanding can take multiple arguments. In your example, xargs would have executed
sh -- script.sh "-test 23.txt" "test24.txt" "te st.txt"
If your script only echoes its first argument, then you'll only see -test 23.txt
You can tell xargs to execute the command for every input by using the -n1 flag.
In many cases, xargs is not what you want, even coupled with the find command (which has a useful -exec action). When it is what you want, you usually want to use the -0 flag coupled with some flag on the other side of the pipe which delimits arguments with NUL characters instead of spaces or newlines.
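For example, to run the script once per file while staying safe with spaces and leading dashes, you could combine -print0 with -0 and -n1 (a minimal sketch, assuming GNU find and xargs, and a script.sh in the current directory):
find . -maxdepth 1 -type f -print0 | xargs -0 -n1 sh script.sh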

Related

How can I delete the oldest n group of files with the same prefix?

In Linux I use InfluxDB which can make a backup of the database for archival purposes. Each backup comprises a series of files with the same prefix "/tank/Backups/var/Influxdb/20191225T235655Z." and different extensions.
I wanted to write a bash script which first deletes the oldest existing backups, then creates a new one (here I paste only the removal):
ls -tp /tank/Backups/var/Influxdb/* | grep -v '/$' | sed -E 's/\..+//' | \
sort -ru | sed 's/$/.*/' | tail -n +4 | xargs -d '\n' -r rm --
However, when I run the script as "sudo", I get
rm: cannot remove '/tank/Backups/var/Influxdb/20191225T235655Z.*': No such file or directory
When I run the quoted script, except for the last part, I get:
/tank/Backups/var/Influxdb/20190930T215357Z.*
/tank/Backups/var/Influxdb/20190930T215352Z.*
which is correct. Also, if I manually write
sudo rm /tank/Backups/var/Influxdb/20190930T215357Z.*
the command succeeds.
Why is the script reporting an error?
I'm using Ubuntu 18.04 and the folder "/tank" is a ZFS volume.
Better to do:
find /tank/Backups/var/Influxdb/* -mtime +5 -delete
to remove files older than 5 days.
Then you can run the command that creates the new backup.
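Put together, the whole rotation might look like this (a sketch, assuming InfluxDB 1.x and its portable backup format, which names backup files with a timestamp prefix like the ones shown above):
# delete backup files older than 5 days, then take a fresh backup
find /tank/Backups/var/Influxdb/ -type f -mtime +5 -delete
influxd backup -portable /tank/Backups/var/Influxdb/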
Explaining the Error
This answer is only here to explain the error and give a deeper understanding of what is happening. If you are simply looking for an elegant solution search for other answers.
When I run the quoted script, except for the last part, I get:
/tank/Backups/var/Influxdb/20190930T215357Z.*
/tank/Backups/var/Influxdb/20190930T215352Z.*
which is correct
The listed strings are not what you want. When you pass these paths to rm it sees them just as literal strings, that is, two files whose names end with a literal *. Since you don't have such files you get an error.
When you type rm * manually into your console, bash (not rm!) does the globbing: bash searches for matching files and replaces the * with the list of found files. Only after that does bash execute rm foundFile1 foundFile2 .... rm never sees the *.
Strings inside a pipeline are not processed by bash, but by the commands in the pipeline, in your case rm. rm does not glob.
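A quick way to see the difference (a sketch; try it in a scratch directory that contains some .txt files but no file literally named *.txt):
ls *.txt                 # bash expands the glob before ls runs
echo '*.txt' | xargs ls  # ls receives the literal string *.txt and fails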
You could run bash inside your pipeline and let it expand the * you inserted earlier. To this end, replace the last command in your pipeline with xargs -r bash -c 'rm -- $*' --. However, note that your paths are not quoted here: if there are spaces or literal * characters in your filenames, the command will break. The lack of quoting is necessary for globbing, as a quoted "*" is not expanded by bash.
To quote your files you have to insert the * glob inside the bash command:
ls -tp /tank/Backups/var/Influxdb/* | grep -v '/$' | sed -E 's/\..+//' |
sort -ru | tail -n +4 | xargs -d\\n -L1 -r bash -c 'rm -- "$0."*'
The above command is only a simple fix for your pipeline. It is neither elegant nor very robust; using tools like find is strongly recommended.

Generating a script to delete a list of files

I have a file containing a list of paths I want to delete.
Adding rm in front of each path (to generate a script that will run these deletions) seems like the obvious approach. How can I do this?
Changing a list of filenames into a shell script by prepending rm to the beginning of each line is dangerous practice: Filenames may not map to themselves when interpreted by a shell, and may even have side effects that include running arbitrary commands. Don't do that.
If you want to delete all files named in a file, just use xargs to directly invoke rm with the filenames passed:
xargs rm -f -- <input-file
Note that this will have xargs attempt to interpret escape characters, quotes, etc. inside the names; if you don't want this, and have GNU xargs:
xargs -d $'\n' rm -f -- <input-file
Similarly, if you have control over your input file's format, you should use a NUL-delimited stream of filenames rather than a newline-delimited list of names. (This is because POSIX filesystems allow newline literals inside filenames.) If your input file is NUL-delimited, then you can use:
xargs -0 rm -f -- <null-delimited-input-file
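If you control how the list is produced, generating it NUL-delimited is straightforward (a sketch, assuming GNU find and a hypothetical *.bak cleanup):
find . -name '*.bak' -print0 > files-to-delete
xargs -0 rm -f -- <files-to-delete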
If you really want to generate a shell script that will delete a listed set of names, by the way, you can do this in bash, like so:
while IFS= read -r filename; do
printf 'rm -f -- %q\n' "$filename"
done <input-list >output-script
Using printf %q escapes content in such a way that when reread by bash, it will be parsed as its literal contents (thus, putting backslashes before characters like * or $ which might otherwise be interpreted).
That said, because this invokes rm once per file, it will be less efficient than xargs (which passes multiple filenames to each rm invocation).
There is actually a middle ground, though: you can have xargs invoke bash and generate a safely quoted list there, with only a minimal number of invocations:
{
echo "#!/bin/bash"
xargs bash -c 'printf "rm -f -- "; printf "%q " "$@"; printf "\n"' _
} <input-file >output-script
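(The trailing _ in the bash -c command above fills $0, so the first filename is not swallowed.) For instance, if input-file contained the two hypothetical names a b.txt and -rf, one per line, the generated output-script would read roughly:
#!/bin/bash
rm -f -- a\ b.txt -rf
The %q escaping keeps the space literal, and the -- keeps rm from treating -rf as options.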
You can use sed:
sed 's/^/rm /' foo.sh > foo2.sh
^ matches the beginning of a line, so the start of each line is replaced with rm followed by a space.

Can xargs be used to run several arbitrary commands in parallel?

I'd like to be able to provide a long list of arbitrary/different commands (varying binary/executable and arguments) and have xargs run those commands in parallel (xargs -P).
I can use xargs -P fine when only varying arguments. It's when I want to vary the executable and arguments that I'm having difficulty.
Example: command-list.txt
% cat command-list.txt
binary_1 arg_A arg_B arg_C
binary_2 arg_D arg_E
.... <lines deleted for brevity>
binary_100 arg_AAA arg_BBB
% xargs -a command-list.txt -P 4 -L 1
** I know the above command will only echo my command-list.txt **
I am aware of GNU parallel but can only use xargs for now. I also can't just background all the commands since there could be too many for the host to handle at once.
Solution is probably staring me in the face. Thanks in advance!
If you don't have access to parallel, one solution is just to use sh with your command as the parameter.
For example:
xargs -a command-list.txt -P 4 -I COMMAND sh -c "COMMAND"
The -c for sh basically just executes the string given (instead of looking for a file). The man page explanation is:
-c string If the -c option is present, then commands are read from
string. If there are arguments after the string, they are
assigned to the positional parameters, starting with $0.
And the -I for xargs tells it to run one command at a time (like -L 1) and to search and replace the parameter (COMMAND in this case) with the current line being processed by xargs. Man page info is below:
-I replace-str
Replace occurrences of replace-str in the initial-arguments with
names read from standard input. Also, unquoted blanks do not
terminate input items; instead the separator is the newline
character. Implies -x and -L 1.
sh seems to be very forgiving with commands containing quotation marks ("), so you don't appear to need to escape them with a regexp first.
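For example (a sketch with hypothetical commands), each line gets its own shell, so semicolons and redirections inside a line work as expected:
% cat command-list.txt
echo one; sleep 2; echo one done
echo two; sleep 1; echo two done
% xargs -a command-list.txt -P 4 -I COMMAND sh -c "COMMAND"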

understanding linux arguments and piping

So I'm trying to use the sh (Bourne Shell) to write some scripts. I keep running into this confusion. For the following:
1. rm `echo test`
2. echo test | rm
I know backticks are used to run the command first, okay.
But for piping in #2, why doesn't rm take in test as an argument? Is there something about piping I don't understand? I thought it was simply sending output of one command as the input to another.
And... related to my piping confusion maybe.
dir=/blah/blar/blar
files=`ls ${dir} -rt`
count=`wc -l $files` # doesn't work, in fact it's running it along with each file that exists
count2=`$files | wc -l` # doesn't work
How come I can't store the ls into "files" and use that?
You would need to use xargs there: rm takes the files to delete as arguments; it doesn't read from stdin (which is what pipes typically feed).
echo test | xargs rm
The first one works because backticks perform command substitution, much like $(...) but not as easy to nest. :)
Alternatively, you could use find.
find . -name test -exec rm -f '{}' \;
In the first case the results of echo test (the string test) are being provided as a command-line argument to rm. In the second, the string test is being piped to the stdin file descriptor of the rm process. These are two very different things. Since rm doesn't read from stdin, it never sees test.
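You can see the distinction with a harmless pair of commands (a minimal sketch):
echo test | cat       # cat reads stdin, so this prints "test"
echo test | rm        # rm ignores stdin and complains: missing operand
rm `echo test`        # rm receives "test" as an argument (and tries to delete that file)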

How can I use xargs to copy files that have spaces and quotes in their names?

I'm trying to copy a bunch of files below a directory and a number of the files have spaces and single-quotes in their names. When I try to string together find and grep with xargs, I get the following error:
find .|grep "FooBar"|xargs -I{} cp "{}" ~/foo/bar
xargs: unterminated quote
Any suggestions for a more robust usage of xargs?
This is on Mac OS X 10.5.3 (Leopard) with BSD xargs.
You can combine all of that into a single find command:
find . -iname "*foobar*" -exec cp -- "{}" ~/foo/bar \;
This will handle filenames and directories with spaces in them. You can use -name to get case-sensitive results.
Note: The -- flag passed to cp prevents it from processing files starting with - as options.
find . -print0 | grep -z 'FooBar' | xargs -0 ...
I don't know whether grep supports -z (--null-data, so that it reads and writes NUL-delimited records), nor whether xargs supports -0, on Leopard, but on GNU it's all good.
The easiest way to do what the original poster wants is to change the delimiter from any whitespace to just the end-of-line character like this:
find whatever ... | xargs -d "\n" cp -t /var/tmp
This is more efficient as it does not run "cp" multiple times:
find -name '*FooBar*' -print0 | xargs -0 cp -t ~/foo/bar
I ran into the same problem. Here's how I solved it:
find . -name '*FoooBar*' | sed 's/.*/"&"/' | xargs cp ~/foo/bar
I used sed to substitute each line of input with the same line, but surrounded by double quotes. From the sed man page, "...An ampersand (``&'') appearing in the replacement is replaced by the string matching the RE..." -- in this case, .*, the entire line.
This solves the xargs: unterminated quote error.
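A quick illustration of the & substitution (a sketch):
echo 'te st.txt' | sed 's/.*/"&"/'
# prints: "te st.txt"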
This method works on Mac OS X v10.7.5 (Lion):
find . | grep FooBar | xargs -I{} cp {} ~/foo/bar
I also tested the exact syntax you posted. That also worked fine on 10.7.5.
Just don't use xargs. It is a neat program, but it doesn't pair well with find when faced with non-trivial cases.
Here is a portable (POSIX) solution, i.e. one that doesn't require GNU-specific extensions to find, xargs, or cp:
find . -name "*FooBar*" -exec sh -c 'cp -- "$@" ~/foo/bar' sh {} +
Note the ending + instead of the more usual ;.
This solution:
correctly handles files and directories with embedded spaces, newlines or whatever exotic characters.
works on any Unix and Linux system, even those not providing the GNU toolkit.
doesn't use xargs which is a nice and useful program, but requires too much tweaking and non standard features to properly handle find output.
is also more efficient (read: faster) than the accepted answer and most, if not all, of the other answers.
Note also that, despite what is stated in some other replies or comments, quoting {} is useless (unless you are using the exotic fish shell).
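To see why the trailing sh is there (a standalone sketch you can paste into a terminal): with sh -c, the first argument after the inline script becomes $0 and the remaining arguments become "$@":
sh -c 'echo "\$0 is: $0"; printf "got: %s\n" "$@"' sh 'file one' 'file two'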
Look into using the --null command-line option for xargs together with the -print0 option in find.
For those who rely on commands other than find, e.g. ls:
find . | grep "FooBar" | tr \\n \\0 | xargs -0 -I{} cp "{}" ~/foo/bar
find | perl -lne 'print quotemeta' | xargs ls -d
I believe that this will work reliably for any character except line-feed (and I suspect that if you've got line-feeds in your filenames, then you've got worse problems than this). It doesn't require GNU findutils, just Perl, so it should work pretty-much anywhere.
I have found that the following syntax works well for me.
find /usr/pcapps/ -mount -type f -size +1000000c | perl -lpe ' s{ }{\\ }g ' | xargs ls -l | sort +4nr | head -200
In this example, I am looking for the largest 200 files over 1,000,000 bytes in the filesystem mounted at "/usr/pcapps".
The Perl one-liner between find and xargs escapes/quotes each blank so that xargs passes any filename with embedded blanks to ls as a single argument.
Frame challenge — you're asking how to use xargs. The answer is: you don't use xargs, because you don't need it.
The comment by user80168 describes a way to do this directly with cp, without calling cp for every file:
find . -name '*FooBar*' -exec cp -t /tmp -- {} +
This works because:
the cp -t flag allows you to give the target directory near the beginning of cp, rather than near the end. From man cp:
-t, --target-directory=DIRECTORY
copy all SOURCE arguments into DIRECTORY
The -- flag tells cp to interpret everything after as a filename, not a flag, so files starting with - or -- do not confuse cp; you still need this because the -/-- characters are interpreted by cp, whereas any other special characters are interpreted by the shell.
The find -exec command {} + variant essentially does the same as xargs. From man find:
-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of
invocations of the command will be much less than the number of
matched files. The command line is built in much the same way
that xargs builds its command lines. Only one instance of `{}'
is allowed within the command, and (when find is being invoked
from a shell) it should be quoted (for example, '{}') to protect
it from interpretation by shells. The command is executed in
the starting directory. If any invocation returns a non-zero
value as exit status, then find returns a non-zero exit status.
If find encounters an error, this can sometimes cause an
immediate exit, so some pending commands may not be run at all.
This variant of -exec always returns true.
Using this in find directly avoids the need for a pipe or an extra shell invocation, so you don't need to worry about any nasty characters in filenames.
With Bash (not POSIX) you can use process substitution to get the current line inside a variable. This enables you to use quotes to escape special characters:
while IFS= read -r line ; do cp "$line" ~/bar ; done < <(find . | grep foo)
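A NUL-safe variant of the same idea (a sketch, assuming GNU find for -print0) also survives backslashes and leading whitespace in names:
while IFS= read -r -d '' f; do cp -- "$f" ~/bar; done < <(find . -name '*foo*' -print0)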
Be aware that most of the options discussed in other answers are not standard on platforms that do not use the GNU utilities (Solaris, AIX, HP-UX, for instance). See the POSIX specification for 'standard' xargs behaviour.
I also find the behaviour of xargs whereby it runs the command at least once, even with no input, to be a nuisance.
I wrote my own private version of xargs (xargl) to deal with the problems of spaces in names (only newlines separate names), though the 'find ... -print0' and 'xargs -0' combination is pretty neat, given that file names cannot contain ASCII NUL '\0' characters. My xargl isn't as complete as it would need to be to be worth publishing, especially since GNU has facilities that are at least as good.
For me, I was trying to do something a little different. I wanted to copy my .txt files into my tmp folder. The .txt filenames contain spaces and apostrophe characters. This worked on my Mac.
$ find . -type f -name '*.txt' | sed 's/'"'"'/\'"'"'/g' | sed 's/.*/"&"/' | xargs -I{} cp -v {} ./tmp/
If the find and xargs versions on your system don't support the -print0 and -0 switches (for example, AIX find and xargs), you can use this terrible-looking code:
find . -name "*foo*" | sed -e "s/'/\\\'/g" -e 's/"/\\"/g' -e 's/ /\\ /g' | xargs -I {} cp {} /your/dest
Here sed will take care of escaping the spaces and quotes for xargs.
Tested on AIX 5.3
I created a small portable wrapper script called "xargsL" around "xargs" which addresses most of the problems.
Unlike xargs, xargsL accepts one pathname per line. The pathnames may contain any character except (obviously) newline or NUL bytes.
No quoting is allowed or supported in the file list - your file names may contain all sorts of whitespace, backslashes, backticks, shell wildcard characters and the like - xargsL will process them as literal characters, no harm done.
As an added bonus feature, xargsL will not run the command at all if there is no input!
Note the difference:
$ true | xargs echo no data
no data
$ true | xargsL echo no data # No output
Any arguments given to xargsL will be passed through to xargs.
Here is the "xargsL" POSIX shell script:
#! /bin/sh
# Line-based version of "xargs" (one pathname per line which may contain any
# amount of whitespace except for newlines) with the added bonus feature that
# it will not execute the command if the input file is empty.
#
# Version 2018.76.3
#
# Copyright (c) 2018 Guenther Brunthaler. All rights reserved.
#
# This script is free software.
# Distribution is permitted under the terms of the GPLv3.
set -e
trap 'test $? = 0 || echo "$0 failed!" >& 2' 0
if IFS= read -r first
then
    {
        printf '%s\n' "$first"
        cat
    } | sed 's/./\\&/g' | xargs ${1+"$@"}
fi
Put the script into some directory in your $PATH and don't forget to make it executable:
$ chmod +x xargsL
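A quick usage example (hypothetical file names; assumes the files exist):
$ printf '%s\n' 'a b.txt' 'weird $name.txt' | xargsL ls -l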
bill_starr's Perl version won't work well for embedded newlines (only copes with spaces). For those on e.g. Solaris where you don't have the GNU tools, a more complete version might be (using sed)...
find -type f | sed 's/./\\&/g' | xargs grep string_to_find
Adjust the find and grep arguments or other commands as you require, but the sed will fix your embedded newlines/spaces/tabs.
I used bill_starr's answer, slightly modified, on Solaris:
find . -mtime +2 | perl -pe 's{^}{\"};s{$}{\"}' > ~/output.file
This will put quotes around each line. I didn't use the '-l' option although it probably would help.
The file list I was going through might have '-', but not newlines. I haven't used the output file with any other commands, as I want to review what was found before I just start massively deleting files via xargs.
I played with this a little, started contemplating modifying xargs, and realised that for the kind of use case we're talking about here, a simple reimplementation in Python is a better idea.
For one thing, having ~80 lines of code for the whole thing means it is easy to figure out what is going on, and if different behaviour is required, you can just hack it into a new script in less time than it takes to get a reply on somewhere like Stack Overflow.
See https://github.com/johnallsup/jda-misc-scripts/blob/master/yargs and https://github.com/johnallsup/jda-misc-scripts/blob/master/zargs.py.
With yargs as written (and Python 3 installed) you can type:
find .|grep "FooBar"|yargs -l 203 cp --after ~/foo/bar
to do the copying 203 files at a time. (Here 203 is just a placeholder, of course, and using a strange number like 203 makes it clear that this number has no other significance.)
If you really want something faster and without the need for Python, take zargs and yargs as prototypes and rewrite in C++ or C.
You might need to grep for the FooBar directory, like this:
find . -name "file.ext"| grep "FooBar" | xargs -i cp -p "{}" .
If you are using Bash, you can convert stdout to an array of lines by mapfile:
find . | grep "FooBar" | (mapfile -t; cp "${MAPFILE[@]}" ~/foobar)
The benefits are:
It's built-in, so it's faster.
It executes the command with all file names at once, so it's faster.
You can append other arguments to the file names. For cp, you can also:
find . -name '*FooBar*' -exec cp -t ~/foobar -- {} +
however, some commands don't have such a feature.
The disadvantages:
It may not scale well if there are too many file names. (The limit? I don't know, but I tested with a 10 MB list file containing 10,000+ file names with no problem, under Debian.)
Well... who knows if Bash is available on OS X?
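If your Bash is 4.4 or newer, mapfile can also read NUL-delimited input, which removes the newline caveat (a sketch, assuming GNU find):
find . -name '*FooBar*' -print0 | (mapfile -t -d ''; cp -- "${MAPFILE[@]}" ~/foobar)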
