Generating a script to delete a list of files - linux

I have a file containing a list of paths I want to delete.
Adding rm in front of each path (to generate a script that will run these deletions) seems like the obvious approach. How can I do this?

Changing a list of filenames into a shell script by prepending rm to each line is a dangerous practice: filenames may not map to themselves when interpreted by a shell, and may even have side effects that include running arbitrary commands. Don't do that.
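For example (hypothetical names, just to illustrate the failure modes), a list containing lines like the following would not survive being turned into a script:
my song.mp3
notes*.txt
$(reboot).log
The first becomes rm my song.mp3 (two separate arguments), the second is expanded as a glob, and the third would run the command substitution when the generated script is executed.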
If you want to delete all files named in a file, just use xargs to directly invoke rm with the filenames passed:
xargs rm -f -- <input-file
Note that this will have xargs attempt to interpret escape characters, quotes, etc. inside the names; if you don't want this, and have GNU xargs:
xargs -d $'\n' rm -f -- <input-file
Similarly, if you have control over your input file's format, you should use a NUL-delimited stream of filenames rather than a newline-delimited list (POSIX filesystems allow newline literals inside filenames). If your input file is NUL-delimited, you can use:
xargs -0 rm -f -- <null-delimited-input-file
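If the list comes from find, for instance, you can produce a NUL-delimited file directly (the directory and pattern here are hypothetical):
find /some/dir -type f -name '*.bak' -print0 >null-delimited-input-file
xargs -0 rm -f -- <null-delimited-input-file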
If you really want to generate a shell script that will delete a listed set of names, by the way, you can do this in bash, like so:
while IFS= read -r filename; do
printf 'rm -f -- %q\n' "$filename"
done <input-list >output-script
Using printf %q escapes content in such a way that when reread by bash, it will be parsed as its literal contents (thus, putting backslashes before characters like * or $ which might otherwise be interpreted).
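As a rough illustration (exact quoting can differ between bash versions), the hypothetical names above would come out like this in the generated script:
rm -f -- my\ song.mp3
rm -f -- \$\(reboot\).log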
That said, because this invokes rm once per file, it will be less efficient than xargs (which passes multiple filenames to each rm invocation).
There is actually a middle ground, though: you can have xargs invoke bash and generate a safely quoted list in the latter, with only a minimal number of invocations:
{
echo "#!/bin/bash"
xargs bash -c 'printf "rm -f -- "; printf "%q " "$@"; printf "\n"' _
} <input-file >output-script
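With a hypothetical input list containing the same awkward names as above, the generated output-script would look roughly like this; note that xargs batches many names into each rm line:
#!/bin/bash
rm -f -- my\ song.mp3 notes\*.txt \$\(reboot\).log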

You can use sed:
sed 's/^/rm /' foo.sh > foo2.sh
^ matches the beginning of a line, so rm is inserted at the start of each line.
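A slightly safer variant of the same idea (still not robust against quotes or $ inside names) also wraps each path in double quotes, so paths containing spaces survive:
sed 's/^/rm -f -- "/; s/$/"/' foo.sh > foo2.sh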

Related

Bash script to mkdir on each line of a file that has been split by a delimiter?

Trying to figure out how to iterate through a .txt file (filemappings.txt) line by line, then split each line using tab(\t) as a delimiter so that we can create the directory specified on the right of the tab (mkdir -p).
Reading filemappings.txt and then splitting each line by tab
server/ /client/app/
server/a/ /client/app/a/
server/b/ /client/app/b/
Would turn into
mkdir -p /client/app/
mkdir -p /client/app/a/
mkdir -p /client/app/b/
Would xargs be a good option? Why or why not?
cut -f 2 filemappings.txt | tr '\n' '\0' | xargs -0 mkdir -p
xargs -0 is great for vector operations.
You already have an answer telling you how to use xargs. In my experience xargs is useful when you want to run a simple command on a list of arguments that are easy to retrieve. In your example, xargs will do nicely. However, if you want to do something more complicated than run a simple command, you may want to use a while loop:
while IFS=$'\t' read -r a b
do
mkdir -p "$b"
done <filemappings.txt
In this special case, read a b will read two arguments separated by the defined IFS and put each in a different variable. If you are a one-liner lover, you may also do:
while IFS=$'\t' read -r a b; do mkdir -p "$b"; done <filemappings.txt
In this way you may read multiple arguments to apply to any series of commands; something that xargs is not well suited to do.
Using read -r will read a line literally regardless of any backslashes in it, in case you need to read a line with backslashes.
Also note that some operating systems may allow tabs as part of a file or directory name. That would break the use of the tab as the separator of arguments.
As others have pointed out, the \t character could also be part of a file or directory name, in which case the following command may fail. Assuming the question represents the true form of the input file, one can use:
$ grep -o -P '(?<=\t).*' filemappings.txt | xargs -d'\n' mkdir -p
It uses -P (Perl-style regex) to extract everything after the \t (TAB) character, then -d'\n' so that xargs splits its input on newlines and passes the resulting paths to mkdir -p.
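For the sample filemappings.txt above, the grep stage on its own prints just the second column:
$ grep -o -P '(?<=\t).*' filemappings.txt
/client/app/
/client/app/a/
/client/app/b/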
sed -n '/\t/{s:^.*\t\t*:mkdir -p ":;s:$:":;p}' filemappings.txt | bash
sed -n: only operate on lines that contain a tab (the delimiter)
s:^.*\t\t*:mkdir -p ":: replace everything from the beginning of the line through the last tab with mkdir -p " (the second substitution, s:$:":, appends the closing quote)
| bash: pipe the generated commands to bash, which creates the folders
With GNU Parallel it looks like this:
parallel --colsep '\t' mkdir -p {2} < filemappings.txt

How to delete numbers, dashes and underscores at the beginning of a file name

I have thousands of mp3 files but all with unusual file names such as 1-2songone.mp3, 2songtwo.mp3, 2_2_3_songthree.mp3. I want to remove all the numbers, dashes and underscores in the beginning of these files and get the result:
songone.mp3
songtwo.mp3
songthree.mp3
This can be done using extended globbing:
$ ls
1-2songone.mp3 2_2_3_songthree.mp3 2songtwo.mp3
$ shopt -s extglob
$ for fname in *.mp3; do mv -- "$fname" "${fname##*([-_[:digit:]])}"; done
$ ls
songone.mp3 songthree.mp3 songtwo.mp3
This uses parameter expansion: ${fname##pattern} removes the longest possible match from the beginning of fname. As the pattern, we use *([-_[:digit:]]), where *(pattern) stands for "zero or more matches of pattern", and the actual pattern is a bracket expression for hyphens, underscores and digits.
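To see just the expansion in isolation (extglob must be on):
$ shopt -s extglob
$ fname="2_2_3_songthree.mp3"
$ echo "${fname##*([-_[:digit:]])}"
songthree.mp3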
Remarks:
The -- after mv indicates the end of options for mv and makes sure that filenames starting with - aren't interpreted as options.
The *() expression requires the extglob shell option. As pointed out, if you don't want extended globs later, you have to unset it again with shopt -u extglob.
As per Gordon Davisson's comment: this will clobber files if you have, for example, something like 1file.mp3 and 2file.mp3. To avoid that, you can either use mv -i (or --interactive), which will prompt you before overwriting a file, or mv -n (or --noclobber), which will just not overwrite any files.
triplee points out that this needlessly moves files onto themselves if they don't start with a hyphen, underscore or digit. To avoid that, we can iterate only over matching files with
for fname in [-_[:digit:]]*.mp3; do mv -- "$fname" "${fname##*([-_[:digit:]])}"; done
which makes sure that there is something to rename.
Benjamin W.'s answer is helpful and efficient, but has two drawbacks:
It requires setting global shell option extglob, which should be restored to its previous value afterward (the alternative, at the cost of creating an extra process, is to use a subshell: (shopt -s extglob; for fname ...)).
The extglob syntax, an extension to regular glob syntax, is familiar to few people and still less powerful than true regular expressions.
Using Bash's regex-matching operator, =~:
for f in *.mp3; do [[ $f =~ ^[0-9_-]+(.+)$ ]] && echo mv "$f" "${BASH_REMATCH[1]}"; done
Remove the echo to perform actual renaming.
$f =~ ^[0-9_-]+(.+)$ matches the longest nonempty sequence of digits, hyphens, and underscores at the start of the filename, followed by any nonempty sequence of characters captured in a parenthesized subexpression (capture group).
If the match succeeds (&&), the mv command is invoked, with the captured subexpression - accessible via element 1 of the special Bash array variable ${BASH_REMATCH[@]} - forming the target filename.
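A quick way to inspect what the regex captures for one of the sample names:
$ f="1-2songone.mp3"
$ [[ $f =~ ^[0-9_-]+(.+)$ ]] && printf '%s\n' "${BASH_REMATCH[0]}" "${BASH_REMATCH[1]}"
1-2songone.mp3
songone.mp3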
You may do it this way too:
find . -type f -name "*.mp3" -print0 | while read -r -d '' line
do
mv "$line" "$( sed -E 's!(.*)/[^[:alpha:]]*([[:alpha:]].*mp3)$!\1/\2!' <<<"$line")" 2>/dev/null
done
Using sed gives you more control over the regex, I guess. Also, the 2>/dev/null is for ignoring the mv error for already converted/correct filenames.
Note:
This will recursively change the filenames across subfolders too.

Bash loop through directory including hidden files

I am looking for a way to make a simple loop in bash over everything my directory contains, i.e. files, directories and links including hidden ones.
I would prefer it to be specifically in bash, but it has to be as general as possible. Of course, file names (and directory names) can contain whitespace, newlines, and other symbols. Everything but "/" and ASCII NUL (0x00), even as the first character. Also, the result should exclude the '.' and '..' directories.
Here is a generator for the files the loop has to deal with:
#!/bin/bash
mkdir -p test
cd test
touch A 1 ! "hello world" \$\"sym.dat .hidden " start with space" $'\n start with a newline'
mkdir -p ". hidden with space" $'My Personal\nDirectory'
So my loop should look something like this (but has to deal with the tricky stuff above):
for i in *; do
echo ">$i<"
done
My closest try, using ls and a bash array, still does not work:
IFS=$(echo -en "\n\b")
l=( $(ls -A .) )
for i in ${l[@]} ; do
echo ">$i<"
done
unset IFS
Or using bash arrays, but the ".." directory is not excluded:
IFS=$(echo -en "\n\b")
l=( [[:print:]]* .[[:print:]]* )
for i in ${l[@]} ; do
echo ">$i<"
done
unset IFS
* doesn't match files beginning with ., so you just need to be explicit:
for i in * .[^.]*; do
echo ">$i<"
done
.[^.]* will match all files and directories starting with ., followed by a non-. character, followed by zero or more characters. In other words, it's like the simpler .*, but excludes . and ... If you need to match something like ..foo, then you might add ..?* to the list of patterns.
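Suppose a test directory contains just .hidden, ..oddname, and visible (hypothetical names); then:
$ echo *
visible
$ echo .[^.]*
.hidden
$ echo ..?*
..oddname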
As chepner noted in the comments, this solution assumes you're running GNU bash along with GNU find and GNU sort...
GNU find can be prevented from recursing into subdirectories with the -maxdepth option. Then use -print0 to end every filename with a 0x00 byte instead of the newline you'd usually get from -print.
The sort -z sorts the filenames between the 0x00 bytes.
Then, you can use sed to get rid of the dot and dot-dot directory entries (although GNU find seems to exclude the .. already).
I also used sed to get rid of the ./ in front of every filename. basename could do that too, but older systems didn't have basename, and you might not trust it to handle the funky characters right.
(These sed commands each required two cases: one for a pattern at the start of the string, and one for the pattern between 0x00 bytes. These were so ugly I split them out into separate functions.)
The read command doesn't have a -z or -0 option like some commands, but you can fake it with -d "" and blanking the IFS environment variable.
The additional -r option prevents a backslash-newline combo from being interpreted as a line continuation. (A file called backslash\\nnewline would otherwise be mangled to backslashnewline.) It might be worth seeing if other backslash-combos get interpreted as escape sequences.
remove_dot_and_dotdot_dirs()
{
sed \
-e 's/^[.]\{1,2\}\x00//' \
-e 's/\x00[.]\{1,2\}\x00/\x00/g'
}
remove_leading_dotslash()
{
sed \
-e 's/^[.]\///' \
-e 's/\x00[.]\//\x00/g'
}
IFS=""
find . -maxdepth 1 -print0 |
sort -z |
remove_dot_and_dotdot_dirs |
remove_leading_dotslash |
while read -r -d "" filename
do
echo "Doing something with file '${filename}'..."
done
It may not be the most favorable way, but I tried the following:
while read line ; do echo $line; done <<< $(ls -a | grep -v -w ".")
Try the find command, something like:
find .
That will list all the files in all recursive directories.
To output only files excluding the leading . or .. try:
find . -type f -printf %P\\n
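For example, with a hypothetical layout containing sub/file.txt, the difference is just the leading ./ prefix:
$ find . -type f
./sub/file.txt
$ find . -type f -printf %P\\n
sub/file.txt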

Unix: How to delete files listed in a file

I have a long text file with a list of file masks I want to delete.
Example:
/tmp/aaa.jpg
/var/www1/*
/var/www/qwerty.php
I need to delete them. I tried rm `cat 1.txt` and it says the argument list is too long.
I found this command, but when I check folders from the list, some of them still have files:
xargs rm <1.txt
A manual rm call removes files from such folders, so it is not a permissions issue.
This is not very efficient, but will work if you need glob patterns (as in /var/www/*)
for f in $(cat 1.txt) ; do
rm "$f"
done
If you don't have any patterns and are sure your paths in the file do not contain whitespaces or other weird things, you can use xargs like so:
xargs rm < 1.txt
Assuming that the list of files is in the file 1.txt, then do:
xargs rm -r <1.txt
The -r option causes recursion into any directories named in 1.txt.
If any files are read-only, use the -f option to force the deletion:
xargs rm -rf <1.txt
Be cautious with input to any tool that does programmatic deletions. Make certain that the files named in the input file are really to be deleted. Be especially careful about seemingly simple typos. For example, if you enter a space between a file and its suffix, it will appear to be two separate file names:
file .txt
is actually two separate files: file and .txt.
This may not seem so dangerous, but if the typo is something like this:
myoldfiles *
Then instead of deleting all files that begin with myoldfiles, you'll end up deleting myoldfiles and all non-dot-files and directories in the current directory. Probably not what you wanted.
Use this:
while IFS= read -r file ; do rm -- "$file" ; done < delete.list
If you need glob expansion you can omit quoting $file:
IFS=""
while read -r file ; do rm -- $file ; done < delete.list
But be warned that file names can contain "problematic" content, so I would use the unquoted version only if I were sure about the list's contents. Imagine this pattern in the file:
*
*/*
*/*/*
This would delete quite a lot from the current directory! I would encourage you to prepare the delete list in a way that glob patterns aren't required anymore, and then use quoting like in my first example.
You can use '\n' to make the newline character the delimiter:
xargs -d '\n' rm < 1.txt
Be careful with -rf because it can delete things you don't want if 1.txt contains paths with spaces. That's why the newline delimiter is a bit safer.
On BSD systems, where xargs may lack the -d option, you can convert the newlines to NUL bytes first and use the -0 option:
tr '\n' '\0' < 1.txt | xargs -0 rm
xargs -I{} sh -c 'rm "{}"' < 1.txt should do what you want. Be careful with this command as one incorrect entry in that file could cause a lot of trouble.
This answer was edited after @tdavies pointed out that the original did not do shell expansion.
You can use this one-liner:
cat 1.txt | xargs echo rm | sh
Which does shell expansion but executes rm the minimum number of times.
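With the example list from the question, the middle stage simply echoes one batched command, which sh then runs (so the /var/www1/* glob is expanded only at that point):
$ cat 1.txt | xargs echo rm
rm /tmp/aaa.jpg /var/www1/* /var/www/qwerty.php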
Just to provide another way, you can also simply use the following command:
$ cat to_remove
/tmp/file1
/tmp/file2
/tmp/file3
$ rm $( cat to_remove )
In this particular case, due to the dangers cited in other answers, I would
Edit in e.g. Vim and :%s/\s/\\\0/g, escaping all space characters with a backslash.
Then :%s/^/rm -rf /, prepending the command. With -r you don't have to worry to have directories listed after the files contained therein, and with -f it won't complain due to missing files or duplicate entries.
Run all the commands: $ source 1.txt
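After those two substitutions, a line such as /home/me/old stuff/notes.txt (a hypothetical path) would read:
rm -rf /home/me/old\ stuff/notes.txt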
cat 1.txt | xargs rm -f
Running this will remove the listed files only.
cat 1.txt | xargs rm -rf
Running this will remove the listed paths recursively (directories included).
Here's another looping example. This one also contains an 'if-statement' as an example of checking to see if the entry is a 'file' (or a 'directory' for example):
for f in $(cat 1.txt); do if [ -f "$f" ]; then rm "$f"; fi; done
Here you can use a set of folders from deletelist.txt while skipping some patterns as well (csh syntax; note the backticks):
foreach f (`cat deletelist.txt`)
rm -rf `ls | egrep -v "needthisfile|*.cpp|*.h"`
end
This will allow file names to have spaces (reproducible example).
# Select files of interest, here, only text files for ex.
find -type f -exec file {} \; > findresult.txt
grep ": ASCII text$" findresult.txt > textfiles.txt
# leave only the path to the file removing suffix and prefix
sed -i -e 's/:.*$//' textfiles.txt
sed -i -e 's/\.\///' textfiles.txt
#write a script that deletes the files in textfiles.txt
IFS_backup=$IFS
IFS=$(echo -en "\n\b")
for f in $(cat textfiles.txt);
do
rm "$f";
done
IFS=$IFS_backup
# save script as "some.sh" and run: sh some.sh
In case somebody prefers sed and removing without wildcard expansion:
sed -e "s/^\(.*\)$/rm -f -- \'\1\'/" deletelist.txt | /bin/sh
Reminder: use absolute pathnames in the file or make sure you are in the right directory.
And for completeness the same with awk:
awk '{printf "rm -f -- '\''%s'\''\n",$1}' deletelist.txt | /bin/sh
Wildcard expansion will work if the single quotes are removed, but this is dangerous in case the filename contains spaces. You would then need to add quotes around the wildcard parts instead.
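Either way, a line like /tmp/aaa.jpg from the earlier example becomes a single-quoted command:
rm -f -- '/tmp/aaa.jpg'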

How do I use the lines of a file as arguments of a command?

Say, I have a file foo.txt specifying N arguments
arg1
arg2
...
argN
which I need to pass to the command my_command
How do I use the lines of a file as arguments of a command?
If your shell is bash (amongst others), a shortcut for $(cat afile) is $(< afile), so you'd write:
mycommand "$(< file.txt)"
Documented in the bash man page in the 'Command Substitution' section.
Alternately, have your command read from stdin: mycommand < file.txt
As already mentioned, you can use the backticks or $(cat filename).
What was not mentioned, and I think is important to note, is that you must remember that the shell will break apart the contents of that file according to whitespace, giving each "word" it finds to your command as an argument. And while you may be able to enclose a command-line argument in quotes so that it can contain whitespace, escape sequences, etc., reading from the file will not do the same thing. For example, if your file contains:
a "b c" d
the arguments you will get are:
a
"b
c"
d
If you want to pull each line as an argument, use the while/read/do construct:
while read i ; do command_name $i ; done < filename
command `< file`
will substitute the file's contents onto the command line as arguments (word-split, so the newlines are lost), meaning you couldn't iterate over each line individually. For that you could write a script with a 'for' loop:
for line in `cat input_file`; do some_command "$line"; done
Or (the multi-line variant):
for line in `cat input_file`
do
some_command "$line"
done
Or (multi-line variant with $() instead of ``):
for line in $(cat input_file)
do
some_command "$line"
done
References:
For loop syntax: https://www.cyberciti.biz/faq/bash-for-loop/
You do that using backticks:
echo World > file.txt
echo Hello `cat file.txt`
If you want to do this in a robust way that works for every possible command line argument (values with spaces, values with newlines, values with literal quote characters, non-printable values, values with glob characters, etc), it gets a bit more interesting.
To write to a file, given an array of arguments:
printf '%s\0' "${arguments[@]}" >file
...replace the array contents with "argument one", "argument two", etc. as appropriate.
To read from that file and use its contents (in bash, ksh93, or another recent shell with arrays):
declare -a args=()
while IFS='' read -r -d '' item; do
args+=( "$item" )
done <file
run_your_command "${args[@]}"
To read from that file and use its contents (in a shell without arrays; note that this will overwrite your local command-line argument list, and is thus best done inside of a function, such that you're overwriting the function's arguments and not the global list):
set --
while IFS='' read -r -d '' item; do
set -- "$@" "$item"
done <file
run_your_command "$@"
Note that -d (allowing a different end-of-line delimiter to be used) is a non-POSIX extension, and a shell without arrays may also not support it. Should that be the case, you may need to use a non-shell language to transform the NUL-delimited content into an eval-safe form:
quoted_list() {
## Works with either Python 2.x or 3.x
python -c '
import sys, pipes, shlex
quote = pipes.quote if hasattr(pipes, "quote") else shlex.quote
print(" ".join([quote(s) for s in sys.stdin.read().split("\0")][:-1]))
'
}
eval "set -- $(quoted_list <file)"
run_your_command "$@"
If all you need to do is to turn file arguments.txt with contents
arg1
arg2
argN
into my_command arg1 arg2 argN then you can simply use xargs:
xargs -a arguments.txt my_command
You can put additional static arguments in the xargs call, like xargs -a arguments.txt my_command staticArg which will call my_command staticArg arg1 arg2 argN
Here's how I pass contents of a file as an argument to a command:
./foo --bar "$(cat ./bar.txt)"
None of the answers seemed to work for me, or they were too complicated. Luckily, it's not complicated with xargs (tested on Ubuntu 20.04).
This works with each arg on a separate line in the file as the OP mentions and was what I needed as well.
cat foo.txt | xargs my_command
One thing to note is that it doesn't seem to work with aliased commands.
The accepted answer works if the command accepts multiple args wrapped in a string. In my case, using (Neo)Vim, it does not, and the args are all stuck together.
xargs does it properly and actually gives you separate arguments supplied to the command.
I suggest using:
command $(echo $(tr '\n' ' ' < parameters.cfg))
Simply replace the end-of-line characters with spaces, then pass the resulting string as separate arguments with echo.
In my bash shell the following worked like a charm:
cat input_file | xargs -I % sh -c 'command1 %; command2 %; command3 %;'
where input_file is
arg1
arg2
arg3
As evident, this allows you to execute multiple commands with each line from input_file, a nice little trick I learned here.
Both solutions work even when lines have spaces:
readarray -t my_args < foo.txt
my_command "${my_args[@]}"
if readarray doesn't work, replace it with mapfile, they're synonyms.
I formerly tried this one below, but had problems when my_command was a script:
xargs -d '\n' -a foo.txt my_command
After editing @Wesley Rice's answer a couple times, I decided my changes were just getting too big to continue changing his answer instead of writing my own. So, I decided I need to write my own!
Read each line of a file in and operate on it line-by-line like this:
#!/bin/bash
input="/path/to/txt/file"
while IFS= read -r line
do
echo "$line"
done < "$input"
This comes directly from author Vivek Gite here: https://www.cyberciti.biz/faq/unix-howto-read-line-by-line-from-file/. He gets the credit!
Syntax: Read file line by line on a Bash Unix & Linux shell:
1. The syntax is as follows for bash, ksh, zsh, and all other shells to read a file line by line
2. while read -r line; do COMMAND; done < input.file
3. The -r option passed to read command prevents backslash escapes from being interpreted.
4. Add IFS= option before read command to prevent leading/trailing whitespace from being trimmed -
5. while IFS= read -r line; do COMMAND_on $line; done < input.file
And now to answer this now-closed question which I also had: Is it possible to `git add` a list of files from a file? - here's my answer:
Note that FILES_STAGED is a variable containing the absolute path to a file which contains a bunch of lines where each line is a relative path to a file I'd like to do git add on. This code snippet is about to become part of the "eRCaGuy_dotfiles/useful_scripts/sync_git_repo_to_build_machine.sh" file in this project, to enable easy syncing of files in development from one PC (ex: a computer I code on) to another (ex: a more powerful computer I build on): https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles.
while IFS= read -r line
do
echo " git add \"$line\""
git add "$line"
done < "$FILES_STAGED"
References:
Where I copied my answer from: https://www.cyberciti.biz/faq/unix-howto-read-line-by-line-from-file/
For loop syntax: https://www.cyberciti.biz/faq/bash-for-loop/
Related:
How to read contents of file line-by-line and do git add on it: Is it possible to `git add` a list of files from a file?
