Delete all .SVN folders in paths with embedded blanks - linux

In this question, and a hundred other places, there are mostly identical Linux solutions for deleting all .svn directories. They work beautifully until the paths happen to include blanks. So, is there a technique to recursively remove .svn directories under paths that contain blanks? Perhaps a way to tell find to wrap its answers in quotes?

You can tell find to use a null byte as its output delimiter instead of a newline with the -print0 action.
Then you can tell xargs to use a null byte as its input delimiter with the -0 argument.
Example:
find . -type d -name .svn -prune -print0 | xargs -0 rm -rf
Because every name arrives null-terminated, xargs passes each path to rm as a single argument, blanks included. Do not add backslash-escaped quotes around the names: xargs runs rm directly, without a shell, so the quotes would become part of the filename and rm would not find it. Note also that rm needs -r to remove a directory, and that the pattern '*.svn' would match names such as foo.svn as well.
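A minimal sketch of the null-delimited pipeline on a throwaway tree, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do). The scratch directory and the names inside it are made up for illustration.

```shell
# Build a scratch tree whose paths contain blanks (names are illustrative).
tmp=$(mktemp -d)
mkdir -p "$tmp/x y/.svn/props" "$tmp/x y/sub dir/.svn" "$tmp/plain/.svn"
touch "$tmp/x y/.svn/entries" "$tmp/x y/keep me.txt"

# -prune stops find from descending into a .svn directory it is about to delete;
# -print0 and -0 pass each path as one null-terminated item, blanks included.
find "$tmp" -type d -name .svn -prune -print0 | xargs -0 rm -rf

# Count the .svn directories that are left.
remaining=$(find "$tmp" -name .svn | wc -l)
echo "remaining .svn directories: $remaining"
```

Swapping rm -rf for echo is a cheap way to preview what would be removed before running it on a real working copy.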

find . -type d -name .svn -prune | while IFS= read -r x; do rm -rf "$x"; done

Yes, wrapping in quotes seems to do the trick.
mkdir "x y"
mkdir x\ y/.svn
find . -name '.svn' | awk '{print "rm -rf \""$0"\""}' | bash
And finally:
ls -la x\ y
total 8
drwxrwxr-x 6 dylan dylan 4096 Dec 13 06:09 ..
drwxrwxr-x 2 dylan dylan 4096 Dec 13 06:11 .

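find . \( -name .svn -o -path '*/.svn/*' \) -delete
Current versions of GNU find have gained the -delete action. It removes only files and empty directories, so the expression also matches everything inside each .svn directory; -delete implies -depth, which makes find remove the contents before the directory itself.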

If what you need is a clean copy of your repository, have you considered the svn export command?
You will then get a copy of every directory in your repository, including those with spaces in their names, but without any .svn folders.

Related

Linux: How to delete all of the files (not directories) inside a directory itself (not childs)

There are some files in a directory whose names are unusual (e.g. they contain Unicode characters).
How to delete them?
First, find the files and then delete them:
find [dir_path] -maxdepth 1 -type f | xargs rm -rf
The command above is simple, but it does not work when a file name contains a space. So I wrote a more complex command that handles spaces as well (it still fails on names containing a single quote or a newline):
find ./ -maxdepth 1 -type f | awk -F '/' '{printf "'\''%s'\''\n",$2}' | xargs rm -rf
"-maxdepth 1" restricts find to the directory itself, not its children; in other words, the search is not recursive. As you know, "xargs" runs the given command on the list of names sent to it.
You can just use rm:
rm .* *
Without -r, rm refuses to delete directories (it reports an error for each one) and does not recurse into them.
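As a sketch of a variant that copes with any file name, find can be limited to regular files at the top level and run rm itself, so no quoting is needed. The scratch directory and file names here are hypothetical.

```shell
# Scratch directory with awkward names and one subdirectory (all illustrative).
dir=$(mktemp -d)
mkdir "$dir/sub dir"
touch "$dir/plain" "$dir/with space" "$dir/it's quoted" "$dir/.hidden" "$dir/sub dir/nested"

# -maxdepth 1 keeps find in the directory itself; -type f skips directories.
# -exec ... {} + hands the names straight to rm, so no quoting is involved.
find "$dir" -maxdepth 1 -type f -exec rm -f -- {} +

# Count the regular files left at the top level.
left=$(find "$dir" -maxdepth 1 -type f | wc -l)
echo "regular files left at top level: $left"
```

The subdirectory and everything below it are left untouched, which matches the "files, not directories, not children" requirement in the question.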

How to redirect output of xargs when using sed

Since switching over to a better management system, I want to remove all the redundant logs at the top of each of our source files. In Notepad++ I was able to achieve the result by using "replace in files" and replacing matches of \A(//.*\n)+ with nothing. On Linux, however, I am having no such luck and need to resort to 'xargs' and 'sed'.
The sed expression I'm using is:
sed '1,/^[^\/]/{/^[^\/]/b; d}'
Ugly to be sure but it does seem to work.
The problem I'm having is when I try to run that through 'xargs' in order to feed it all the source files in our system I am unable to redirect the output to 'stripped' files, which I then intend to copy over the originals.
I want something in the line of:
find . -name "*.com" -type f -print0 | xargs -0 -I file sed '1,/^[^\/]/{/^[^\/]/b; d}' "file" > "file.stripped"
However I'm having grief passing the ">" through to the receiving environment (shell) as I'm already using too many quote marks. I have tried all manner of escaping and shell "wrappers" but I just can't get it to play ball.
Anyone care to point me in the right direction?
Thanks,
Slarti.
I made a similar scenario with a simpler sed expression just as an example, see if it works for you:
I created 3 files with the string "abcd" inside each:
# ls -l
total 12
-rw-r--r-- 1 root root 5 Oct 6 09:05 test.aaaaa.com
-rw-r--r-- 1 root root 5 Oct 6 09:05 test2.aaaaa.com
-rw-r--r-- 1 root root 5 Oct 6 09:05 test3.aaaaa.com
# cat test*
abcd
abcd
abcd
I ran the find command as you showed, but with the -exec option instead of xargs, a trivial sed expression that simply replaces every "a" with "b", and the -i option, which writes directly to the input file:
# find . -name "*.com" -type f -print0 -exec sed -i 's/a/b/g' {} \;
./test2.aaaaa.com./test3.aaaaa.com./test.aaaaa.com
# cat test*
bbcd
bbcd
bbcd
In your case it should look like this:
# find . -name "*.com" -type f -print0 -exec sed -i '1,/^[^\/]/{/^[^\/]/b; d}' {} \;
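If separate ".stripped" files are still wanted instead of in-place editing, one possible sketch is to let find start a small shell per file, so the redirection happens inside that shell. The sed script is the one from the question (it appears to rely on GNU sed syntax); the scratch directory and sample file are made up.

```shell
# Sample source file with a comment header (contents are illustrative).
dir=$(mktemp -d)
printf '%s\n' '// old log 1' '// old log 2' 'code line' '// keep me' > "$dir/a b.com"

# The sed script from the question, kept in a variable to avoid nested quoting.
script='1,/^[^\/]/{/^[^\/]/b; d}'

# sh -c receives the script as $1 and the file name as $2; the redirection
# is performed by that inner shell, once per file.
find "$dir" -name '*.com' -type f -exec sh -c 'sed "$1" "$2" > "$2.stripped"' sh "$script" {} \;

cat "$dir/a b.com.stripped"
```

Passing the file name as a positional parameter, not pasting {} into the sh -c string, keeps names with blanks or quotes from being reinterpreted by the inner shell.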

How to delete X number of files in a directory

To get X number of files in a directory, I can do:
$ ls -U | head -40000
How would I then delete these 40,000 files? For example, something like:
$ "rm -rf" (ls -U | head -40000)
The tool you need for this is xargs. It converts standard input into arguments to a command that you specify. By default it splits the input on blanks and newlines, so a line becomes a single argument only if it contains no whitespace.
Thus, something like this would work (see the comment below, though, ls shouldn't be parsed this way normally):
ls -U | head -40000 | xargs rm -rf
I would recommend before trying this to start with a small head size and use xargs echo to print out the filenames being passed so you understand what you'll be deleting.
Be aware if you have files with weird characters that this can sometimes be a problem. If you are on a modern GNU system you may also wish to use the arguments to these commands that use null characters to separate each element. Since a filename cannot contain a null character that will safely parse all possible names. I am not aware of a simple way to take the top X items when they are zero separated.
So, for example, you can use this to delete everything in a directory:
find . -mindepth 1 -maxdepth 1 -print0 | xargs -0 rm -rf
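For taking only the first N null-delimited names, one option is head -z. This is a sketch that assumes GNU coreutils, whose head accepts -z (--zero-terminated); other implementations may not. The scratch directory and file names are hypothetical, and N is 2 here to keep the example small.

```shell
# Five files with blanks in their names (illustrative).
dir=$(mktemp -d)
touch "$dir/a 1" "$dir/b 2" "$dir/c 3" "$dir/d 4" "$dir/e 5"

# Keep the whole pipeline null-delimited; head -z counts items, not lines.
find "$dir" -mindepth 1 -maxdepth 1 -type f -print0 | head -z -n 2 | xargs -0 rm -f --

# Count the files that survived.
left=$(find "$dir" -mindepth 1 -maxdepth 1 -type f | wc -l)
echo "files left: $left"
```

Which two files go first depends on the order in which find happens to return them, much like ls -U in the question.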
Use a bash array and slice it. If the number and size of arguments is likely to get close to the system's limits, you can still use xargs to split up the remainder.
files=( * )
printf '%s\0' "${files[@]:0:40000}" | xargs -0 rm --
What about using awk as the filter?
find "$FOLDER" -maxdepth 1 -mindepth 1 -print0 \
| awk -v limit=40000 'NR<=limit;NR>limit{exit}' RS="\0" ORS="\0" \
| xargs -0 rm -rf
It will reliably remove at most 40,000 files (or folders). "Reliably" means regardless of which characters the filenames contain.
Btw, to get the number of files in a directory reliably you can do:
find FOLDER -mindepth 1 -maxdepth 1 -printf '.' | wc -c
I ended up doing this since my folders were named with sequential numbers. This should also work for alphabetical folders:
ls -r releases/ | sed '1,3d' | xargs -I {} rm -rf releases/{}
Details:
list all the items in the releases/ folder in reverse order
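drop the first 3 items from the list (which would be the newest if numeric/alpha naming), so they are kept
for each remaining item, rm it
In your case, you can replace ls -r with ls -U and sed '1,3d' with head -40000. Note that sed '1,40000d' would do the opposite: it drops the first 40,000 names from the list, so everything else would be deleted.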

removing files with numerals in the beginning of the file name

I'm working with Ubuntu recently and I have been asked to remove files with numerals at the beginning.
How do I remove ordinary files from current directory that have numerals at the first three characters?
Since nobody else bothered to post this,
rm [0-9][0-9][0-9]*
First of all: Be careful when trying out such delete commands! Try running in a directory with test files or files that are backed up well.
You could try something like this from shell:
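find . -maxdepth 1 -type f -regextype posix-extended -regex '\./[0-9]{3}.*' -exec rm {} \;
Note that rm and {} are separate arguments to -exec; quoting them together as 'rm {}' makes find look for a program with that literal name.
For debugging, try running it without the -exec part first, listing the files that would be deleted:
find . -maxdepth 1 -type f -regextype posix-extended -regex '\./[0-9]{3}.*'
-regextype is a GNU find option; it is needed here because GNU find's default regex syntax does not treat {3} as a repetition count. Other versions of find differ: on FreeBSD, using zsh, I had to escape the curly braces instead:
find . -regex './[0-9]\{3\}.*'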
How about something like
ls | egrep '^[0-9]{3}' | xargs rm
The ls lists all the files, the egrep filters the list so that it only contains filenames that start with three digits, and xargs passes the filenames that egrep lets through to rm as arguments. (Names containing whitespace will be split by xargs.)
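A sketch that restricts the match to regular files in one directory using find's -name glob, which sidesteps both the regex dialect differences and the parsing of ls output. The scratch directory and file names are made up for illustration.

```shell
# Mix of matching and non-matching names, plus a directory that matches (illustrative).
dir=$(mktemp -d)
mkdir "$dir/456dir"
touch "$dir/123abc" "$dir/999 with space" "$dir/12abc" "$dir/abc123"

# -name takes a shell-style glob: three leading digits, then anything.
# -type f leaves the directory 456dir alone even though its name matches.
find "$dir" -maxdepth 1 -type f -name '[0-9][0-9][0-9]*' -exec rm -f -- {} +

ls "$dir"
```

Replacing the -exec part with -print first shows the candidates without deleting anything.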

Why | doesn't work with find?

Why doesn't the following command, which aims to recursively remove all .svn folders, work?
find . -name ".svn" | rm -rfv
I know the find command provides the -exec option to solve this issue but I just want to understand what is happening there.
In your example, the results from find are passed to rm's STDIN. rm doesn't expect its arguments in STDIN, though.
Here is an example of how input redirection works.
rm does not read file names from standard input, so any data piped to it is ignored.
The only thing it uses standard input for is checking whether it's a terminal, so it can determine whether to prompt.
It doesn't work because rm does not accept a list of file names on its standard input stream.
Just for reference, the safest way to handle this in the case of directories that might contain spaces is:
find . -name .svn -exec rm -frv {} \;
Or, if you are shooting for speed:
find . -name .svn -print0 | xargs -0 rm -frv
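The difference can be seen on a scratch tree. This is only a sketch: the directory names are hypothetical, and it assumes an rm that exits quietly when given -f and no operands, as GNU rm does.

```shell
# One .svn directory under a path with a blank in it (illustrative).
dir=$(mktemp -d)
mkdir -p "$dir/my project/.svn"

# rm gets no operands here; the names written to the pipe are never read.
find "$dir" -name .svn | rm -rf
if [ -d "$dir/my project/.svn" ]; then after_pipe=present; else after_pipe=gone; fi

# xargs reads the names from the pipe and puts them on rm's command line.
find "$dir" -name .svn -prune -print0 | xargs -0 rm -rf
if [ -d "$dir/my project/.svn" ]; then after_xargs=present; else after_xargs=gone; fi

echo "after '| rm': $after_pipe, after '| xargs rm': $after_xargs"
```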
find does work with | (for example, find ~ -name .svn | grep "a"); the problem is with rm.
This question is similar to this other answered question. Hope this helps.
How do I include a pipe | in my linux find -exec command?

Resources