Deleting all files in a directory except the ones mentioned in a list [duplicate] - linux

I have a directory called a00 containing 3000 files with the extension .SAC. I have a text file called gd.list containing the names of 88 of those 3000 files. I am trying to write a script that deletes all .SAC files except those listed in gd.list.
How can I do that using shell/bash?

The rm command is commented out so that you can check and verify that it's working as needed. Then just un-comment that line.
The check directory section will ensure you don't accidentally run the script from the wrong directory and clobber the wrong files.
You can remove the echo deleting line to run silently.
#!/bin/bash
# Exit if the directory isn't found.
if ! cd /home/me/myfolder2tocleanup/; then
    echo "Can't find work dir... exiting" >&2
    exit 1
fi
for i in *; do
    if ! grep -qxFe "$i" filelist.txt; then
        echo "Deleting: $i"
        # The next line is commented out. Test it, then uncomment it to remove the files.
        # rm -- "$i"
    fi
done
You can find the answer here https://askubuntu.com/questions/830776/remove-file-but-exclude-all-files-in-a-list by L. D. James
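The loop above runs grep once per file and looks at every file in the directory. A minimal sketch adapted to the question's layout (directory a00, list gd.list, extension .SAC) could read the list once into an associative array instead. It assumes bash 4 or later; the names prune_sac and DRY_RUN are made up for this illustration.

```bash
#!/bin/bash
# Sketch: delete every *.SAC file in a directory unless its name appears
# in a keep-list (one file name per line). prune_sac and DRY_RUN are
# illustrative names, not part of the original answer.
prune_sac() {
    local dir=$1 list=$2 name f
    local -A keep=()

    # Load the keep-list once instead of running grep for every file.
    while IFS= read -r name || [[ -n $name ]]; do
        if [[ -n $name ]]; then
            keep["$name"]=1
        fi
    done < "$list"

    for f in "$dir"/*.SAC; do
        [[ -f $f ]] || continue          # the glob matched nothing
        name=${f##*/}
        if [[ -z ${keep["$name"]:-} ]]; then
            if [[ ${DRY_RUN:-1} == 1 ]]; then
                echo "Would delete: $f"
            else
                rm -- "$f"
            fi
        fi
    done
}

# Dry run by default; set DRY_RUN=0 to really delete:
#   prune_sac a00 gd.list
#   DRY_RUN=0 prune_sac a00 gd.list
```

Defaulting to a dry run plays the same role as the commented-out rm above: nothing is deleted until the printed list has been checked.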

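There are a few alternatives.
I'd prefer find with -print0-style output and grep with -z, as NUL separators demarcate the file names more clearly. Note the -v, which inverts the match so that only files not listed in gd.list reach rm, and -printf '%f\0' (GNU find), which prints bare names without the leading ./ so that -x can match whole lines of gd.list:
find . -maxdepth 1 -type f -name '*.SAC' -printf '%f\0' | grep -vxzF -f gd.list | xargs -0 -r echo rm --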
Again, test this first. Perhaps sort the output and make sure it is unique versus the original file.
For a smaller list of filenames I would recommend just using find with -and -not -name and -delete, but with a larger list that can be tricky.
You could tag the files you want to keep as read-only, then delete the wildcard with the appropriate setting in rm or find to skip read-only files. That assumes you own the read-only flag. You could tag the files as executable, and use find, if the read-only flag is not for you.
Another option would be to move the matching files to a temp folder, delete the wildcard, then move the files you want to keep back. That is assuming you can afford for the files to disappear temporarily.
To make them disappear for a shorter time, move the kept files out to a temp directory, move the original directory out, move the temp directory in, then delete the moved-out directory.
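The move-aside idea can be sketched as follows. This is only an illustration under stated assumptions: keep_only is a hypothetical helper name, the list holds bare file names (no slashes), and the temp directory is created next to the target directory so that the moves stay on one filesystem.

```bash
#!/bin/bash
# Sketch of "move the keepers aside, delete the rest, move them back".
# keep_only is an illustrative name, not from the original answer.
keep_only() {
    local dir=$1 list=$2 tmp name path
    tmp=$(mktemp -d "${dir%/}.keep.XXXXXX") || return 1

    # 1. Move every listed file that exists into the temp directory.
    while IFS= read -r name || [[ -n $name ]]; do
        if [[ -n $name && -f $dir/$name ]]; then
            mv -- "$dir/$name" "$tmp/"
        fi
    done < "$list"

    # 2. Delete the remaining .SAC files.
    rm -f -- "$dir"/*.SAC

    # 3. Move the kept files back and remove the (now empty) temp directory.
    for path in "$tmp"/*; do
        if [[ -e $path ]]; then
            mv -- "$path" "$dir/"
        fi
    done
    rmdir -- "$tmp"
}

# Usage: keep_only a00 gd.list
```

Names in the list that do not exist in the directory are simply skipped, and the kept files are unavailable between steps 1 and 3.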

If you are feeling brave, try something like
ls *.SAC | grep -vxF -f gd.list | xargs echo rm
Note that I've put an echo in that xargs, just to make sure no one has a cut and paste accident.
Note also the limitations of this approach mentioned in the comments. As I said, if you are feeling brave...

Related

Bash: "No such file or directory" despite directory existing

I am making a custom command that moves or duplicates a file to a wastebasket directory instead of deleting it. I am trying to create the directory if it isn't already there, store a renamed duplicate if a file of the same name is already in the wastebasket, and simply move the file otherwise. The issue is that I keep getting a "No such file or directory" error regardless of where I place the wastebasket directory. Note that simply moving or copying the file with basic Linux commands works fine, and that running as root doesn't fix the issue. What steps should I take?
#!/bin/bash
set -x
mkdir -p /home/WASTEBASKIT #This makes a wastebasket directory if it doesn't already exist.
if test -e "$1"; then
if test -e /home/WASTEBASKIT/"$1"; then #Checking for duplicate files.
cp "$1" "/home/WASTEBASKIT/$1.$$"
else
mv "$1" "/home/WASTEBASKIT"
fi
else
printf '%s\n' "File not found." #Error if a file is not there.
fi
Here are the results:
++ mkdir -p /home/WASTEBASKIT
++ test -e config.sh
++ test -e /home/WASTEBASKIT/config.sh
++ cp config.sh.945 ' /home/WASTEBASKIT'
cp: cannot stat 'config.sh.945': No such file or directory
cp config.sh.945 ' /home/WASTEBASKIT'
cp: cannot stat 'config.sh.945': No such file or directory
The problem is on this line:
cp "$1" "$1.$$" "/home/WASTEBASKIT"
You try to copy two files into /home/WASTEBASKIT, namely $1 and $1.$$. The latter does not exist.
Change it to:
cp "$1" "/home/WASTEBASKIT/$1.$$"
I suggest that you instead create a uniquely named file, since process IDs are reused and so aren't unique. Instead of the copy above, do something like:
newfile=$(mktemp "/home/WASTEBASKIT/$1.XXXXXXX")
cp -p "$1" "$newfile"
You can then list all the copies with ls -t /home/WASTEBASKIT to get them in historical order, newest first, or with ls -tr /home/WASTEBASKIT to get the oldest first.
Also note: printf'%s\n' "File not found." will likely generate an error like printf%s\n: command not found.... You need to insert a space between the command printf and the argument '%s\n'.
The moving part is also wrong since you have a space before /home. It should be:
mv "$1" /home/WASTEBASKIT
mv "$1" " /home/WASTEBASKIT"
First issue: spaces matter. Because of the leading space, " /home/WASTEBASKIT" is not the absolute path /home/WASTEBASKIT but a relative one: a directory whose name is a single space in the current directory, containing home/WASTEBASKIT. Even if you have previously created the /home/WASTEBASKIT directory, the move command above will not put the file into it. Unless that oddly named relative path happens to exist (in which case the file goes there), mv fails with "No such file or directory".
Either way, it won't go where you want it to go.
Secondly, the command below is not doing what you seem to think. It will try to copy two files to the directory, the second of which probably does not even exist (config.sh.945 in your case):
cp "$1" "$1.$$" "/home/WASTEBASKIT"
If you want to create a "uniquely" versioned file so as to not overwrite an existing one, that would be:
mv "$1" "/home/WASTEBASKIT/$1.$$"
Note the quotes around the word "uniquely": there's no guarantee that $1.$$ does not already exist in the wastebasket, since PIDs eventually wrap around and also start over after a reboot.
I suspect a better approach (though still not bullet-proof) would be just to prefix every file with the date and time so that:
you can sort duplicates to establish the order of creation; and
sans date changes, the date/time won't give you possible duplicates (unless you're doing it more than once per second).
That approach would be something like:
mv "$1" "/home/WASTEBASKIT/$(date -u +%Y%m%d_%H%M%S)_$1"
or, making duplicates even less likely:
mv "$1" "/home/WASTEBASKIT/$(date -u +%Y%m%d_%H%M%S)_${RANDOM}_$1"
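Putting the pieces from both answers together, a wastebasket command might look like the sketch below. It is not the asker's script: to_wastebasket and WASTEBASKET_DIR are illustrative names, the default directory follows the question's /home/WASTEBASKIT, and it assumes the argument is a regular file rather than a directory.

```bash
#!/bin/bash
# Sketch: move a file into a wastebasket directory without overwriting
# earlier copies. to_wastebasket and WASTEBASKET_DIR are illustrative.
to_wastebasket() {
    local src=$1
    local dir=${WASTEBASKET_DIR:-/home/WASTEBASKIT}
    local name stamp dest

    if [[ ! -e $src ]]; then
        printf '%s\n' "File not found: $src" >&2
        return 1
    fi

    mkdir -p -- "$dir" || return 1
    name=${src##*/}                      # strip any leading directories
    dest=$dir/$name

    # If the name is already taken, prefix a UTC timestamp and let mktemp
    # add a random suffix, so the new name cannot collide.
    if [[ -e $dest ]]; then
        stamp=$(date -u +%Y%m%d_%H%M%S)
        dest=$(mktemp "$dir/${stamp}_${name}.XXXXXX") || return 1
    fi
    mv -- "$src" "$dest"
}

# Usage: to_wastebasket config.sh
```

mktemp creates an empty placeholder file, which mv then replaces; that is what reserves the name.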

Create symlink of every file in a folder tree [duplicate]

I need a Bash script that will create a symlink for every *.mp3 file in folder X (and its subfolders), and store those symlinks in folder Y, without subfolders and probably skipping duplicates.
For the curious, this is to automate a radio station using Libretime.
And sorry if this is a dumb question, I never used a Bash script.
As in the comment: use find to create a list of the mp3-files:
find /top/dir/for/mp3s -name '*mp3'
You will want to loop over that output, so:
find /top/dir/for/mp3s -name '*mp3' | while read mp3file; do
# do the linking
done
You want to create a link in a specific directory, probably with the same filename. You can get the filename with basename. So, that would make it something like this:
find /top/dir/for/mp3s -name '*mp3' | while read mp3file; do
filename=$(basename $mp3file)
ln -s $mp3file /dir/where/the/links/are/$filename
echo Linked $mp3file to /dir/where/the/links/are/$filename
done
However, this will probably give you two types of error:
If the mp3 filename contains spaces, basename will not produce the correct filename and ln will complain. Solution: use correct quoting.
If you have duplicate filenames, ln will complain that the link already exists. Solution: test if the link exists.
Because you're not destroying anything, you can try it and actually see the problems. So, our next iteration would be:
find /top/dir/for/mp3s -name '*mp3' | while IFS= read -r mp3file; do
    filename=$(basename "$mp3file")
    # -L is true if the name exists and is a symbolic link
    if [ ! -L "/dir/where/the/links/are/$filename" ] ; then
        ln -s "$mp3file" "/dir/where/the/links/are/$filename"
        echo "Linked $mp3file to /dir/where/the/links/are/$filename"
    else
        echo "Not linked $mp3file; link exists"
    fi
done
That should give you a fairly good result. It also gives you a good starting point.
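As a possible further iteration, the sketch below wraps the same loop in a function and passes names NUL-separated, so that file names containing newlines also survive. link_mp3s is a hypothetical name; the source directory should be given as an absolute path so that the links resolve from the target directory.

```bash
#!/bin/bash
# Sketch: link every *.mp3 under a source tree into one flat directory,
# skipping names that are already taken. link_mp3s is an illustrative name.
link_mp3s() {
    local src=$1 dest=$2 mp3file filename
    mkdir -p -- "$dest" || return 1

    # find -print0 with read -d '' keeps names with spaces or newlines intact.
    while IFS= read -r -d '' mp3file; do
        filename=${mp3file##*/}
        if [[ -e $dest/$filename || -L $dest/$filename ]]; then
            echo "Not linked $mp3file; $filename already exists"
        else
            ln -s -- "$mp3file" "$dest/$filename"
            echo "Linked $mp3file to $dest/$filename"
        fi
    done < <(find "$src" -type f -name '*.mp3' -print0)
}

# Usage: link_mp3s /top/dir/for/mp3s /dir/where/the/links/are
```

When two files share a name, whichever one find happens to report first gets the link.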

find returning inverted results

In a few words, I wrote this little script to clean up some directories where I had consolidated directories and files from multiple sources. I had used the cp command with the --backup=numbered option, so that files with identical names would get a suffix like .~1~ appended instead of being overwritten. I then ran fdupes to remove duplicate files. In some cases fdupes removed the file without the suffix (the original file), so I wanted to scan the directories for files carrying the cp suffix. If no file exists with the suffix removed, I would rename the file with mv; otherwise I would leave it alone to avoid deleting anything, since fdupes did not consider it a duplicate.
The issue is that the test condition, the if [ -f ... ] part of the code below, returns the inverse of what it should, and I cannot understand why. For example, when the file exists it returns false, and when the file does not exist it returns true. I worked around it by swapping the actions in the two branches, verified that it then worked as intended, and ran it that way, but I would like to know if anyone knows why it behaves this way. I am not a bash script expert by any means, so it's possible that I missed something simple.
#!/bin/bash
logfile=$$.log
exec > $logfile 2>&1
IFS='
'
#set -f
for FILE in $(find . -type f -regextype posix-extended -regex '^.*(\.~[0-9]+~)+$')
do
FILE2=${FILE%%.~[0-9]*} # remove the suffix
if [ -f "${FILE2}" ]
then
echo ERROR: "${FILE2}" already exists!
else
echo "${FILE}" renamed "${FILE2}"
mv "${FILE}" "${FILE2}"
fi
done
You might be able to see the problem by modifying your script to show both FILE and FILE2 in the error message. There are a few minor problems with the script which could cause some confusion (but not the "inverted" logic):
find output is not sorted. If you had more than one backup file, a randomly chosen one would replace the original file;
you could sort the output using an expression like |sort -t~ -n -k2 on the end of the find-command.
the regular expression allows multiple matches of the ~[0-9]~ pattern. Conceivably you could have some odd file which ends with ~1~~2~.
the part where the suffix is removed assumes a single .~[0-9]+~ is on the end of the filename. Because %% removes the longest match of the glob .~[0-9]*, an embedded .~0, e.g., foo.~0bar.~1~, would reduce FILE2 to foo. The workaround for that would be more cumbersome (since the suffix-stripping uses globbing), but could be done with a case statement which matched an explicit number of digits (likely three digits would be enough).
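Instead of the case statement mentioned above, the suffix can also be stripped with bash's regex match, which anchors on the complete .~N~ form. This is a swapped-in technique rather than the answer's own suggestion, and strip_backup_suffix is a hypothetical name.

```bash
#!/bin/bash
# Sketch: strip trailing ".~N~" backup suffixes with a regex instead of a
# glob, so an embedded ".~0" in the middle of a name is left alone.
# strip_backup_suffix is an illustrative name.
strip_backup_suffix() {
    local name=$1
    local re='^(.+)\.~[0-9]+~$'

    # Remove one complete suffix per pass; names such as
    # "a.txt.~1~.~2~" need two passes.
    while [[ $name =~ $re ]]; do
        name=${BASH_REMATCH[1]}
    done
    printf '%s\n' "$name"
}

# Usage inside the loop: FILE2=$(strip_backup_suffix "$FILE")
```

Keeping the pattern in a variable avoids quoting surprises with =~, and a name without any suffix is printed unchanged.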

Removing 10 Characters of Filename in Linux

I just downloaded about 600 files from my server and need to remove the last 11 characters from the filename (not including the extension). I use Ubuntu and I am searching for a command to achieve this.
Some examples are as follows:
aarondyne_kh2_13thstruggle_or_1250556383.mus should be renamed to aarondyne_kh2_13thstruggle_or.mus
aarondyne_kh2_darknessofunknow_1250556659.mp3 should be renamed to aarondyne_kh2_darknessofunknow.mp3
It seems that some duplicates might exist after I do this, but if the command fails to complete and tells me what the duplicates would be, I can always remove those manually.
Try using the rename command. It allows you to rename files based on a regular expression:
The following line should work out for you:
rename 's/_\d+(\.[a-z0-9A-Z]+)$/$1/' *
The following changes will occur:
aarondyne_kh2_13thstruggle_or_1250556383.mus renamed as aarondyne_kh2_13thstruggle_or.mus
aarondyne_kh2_darknessofunknow_1250556659.mp3 renamed as aarondyne_kh2_darknessofunknow.mp3
You can check the actions rename will do via specifying the -n flag, like this:
rename -n 's/_\d+(\.[a-z0-9A-Z]+)$/$1/' *
For more information on how to use rename simply open the manpage via: man rename
Not the prettiest, but very simple:
echo "$filename" | sed -e 's!\(.*\)...........\(\.[^.]*\)!\1\2!'
You'll still need to write the rest of the script, but it's pretty simple.
find . -type f -exec sh -c 'mv {} `echo -n {} | sed -E -e "s/[^/]{11}(\\.[^\\.]+)?$/\\1/"`' ";"
one way to go:
you get a list of your files, one per line (by ls maybe) then:
ls....|awk '{o=$0;sub(/_[^_.]*\./,".",$0);print "mv "o" "$0}'
this will print the mv a b command
e.g.
kent$ echo "aarondyne_kh2_13thstruggle_or_1250556383.mus"|awk '{o=$0;sub(/_[^_.]*\./,".",$0);print "mv "o" "$0}'
mv aarondyne_kh2_13thstruggle_or_1250556383.mus aarondyne_kh2_13thstruggle_or.mus
to execute, just pipe it to |sh
I assume there is no space in your filename.
This script assumes each file has just one extension. It would, for instance, rename "foo.something.mus" to "foo.mus". To keep all extensions, remove one hash mark (#) from the first line of the loop body. It also assumes that the base of each filename has at least 12 characters, so that removing 11 doesn't leave you with an empty name.
for f in *; do
    ext=${f##*.}
    base=${f%%.*}   # the name up to the first dot
    new_f=${base%???????????}.$ext
    if [ -e "$new_f" ]; then
        echo "Will not rename $f, $new_f already exists" >&2
    else
        mv "$f" "$new_f"
    fi
done
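A variant that matches the trailing _digits explicitly, rather than counting 11 characters, could look like the sketch below. It assumes bash; strip_stamp and DRY_RUN are illustrative names, and by default it only prints what it would do.

```bash
#!/bin/bash
# Sketch: drop a trailing "_<digits>" that sits just before the extension,
# refusing to overwrite existing files. strip_stamp and DRY_RUN are
# illustrative names.
strip_stamp() {
    local f new_f
    local re='^(.*)_[0-9]+(\.[^./]+)$'
    for f in "$@"; do
        [[ $f =~ $re ]] || continue       # name does not fit the pattern
        new_f=${BASH_REMATCH[1]}${BASH_REMATCH[2]}
        if [[ -e $new_f ]]; then
            echo "Will not rename $f, $new_f already exists" >&2
        elif [[ ${DRY_RUN:-1} == 1 ]]; then
            echo "mv $f $new_f"
        else
            mv -- "$f" "$new_f"
        fi
    done
}

# Dry run:      strip_stamp *
# For real:     DRY_RUN=0 strip_stamp *
```

Files whose names do not end in _digits before the extension are skipped, so short names cannot be truncated to nothing.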

Remove all files of a certain type except for one type in linux terminal

On my computer running Ubuntu, I have a folder full of hundreds of files all named "index.html.n" where n starts at one and continues upwards. Some of those files are actual html files, some are image files (png and jpg), and some of them are zip files.
My goal is to permanently remove every single file except the zip archives. I assume it's some combination of rm and file, but I'm not sure of the exact syntax.
If the file list fits into your argument list and no filename contains a colon, a simple pipe with xargs should do:
file * | grep -vi zip | cut -d: -f1 | tr '\n' '\0' | xargs -0 rm
First find to find the matching files, then file to get their types. sed drops the lines that file reports as Zip archives and strips everything but the filename from the rest. Lastly, rm deletes what is left (this assumes no filename contains a colon or whitespace):
find . -name 'index.html.[0-9]*' | \
xargs file | \
sed -n '/: *Zip archive/!s/:.*//p' |
xargs rm
I would run:
for f in index.html.*
do
file "$f" | grep -qi zip
[ $? -ne 0 ] && rm -i "$f"
done
and remove -i option if you feel confident enough
Here's the approach I'd use; it's not entirely automated, but it's less error-prone than some other approaches.
file * > cleanup.sh
or
file index.html.* > cleanup.sh
This generates a list of all files (excluding dot files), or of all index.html.* files, in your current directory and writes the list to cleanup.sh.
Using your favorite text editor (mine happens to be vim), edit cleanup.sh:
Add #!/bin/sh as the first line
Delete all lines containing the string "Zip archive"
On each line, delete everything from the : to the end of the line (in vim, :%s/:.*$//)
Replace the beginning of each line with "rm" followed by a space
Exit your editor, updating the file.
chmod +x cleanup.sh
You should now have a shell script that will delete everything except zip files.
Carefully inspect the script before running it. Look out for typos, and for files whose names contain shell metacharacters. You might need to add quotation marks to the file names.
(Note that if you do this as a one-line shell command, you don't have the opportunity to inspect the list of files you're going to delete before you actually delete them.)
Once you're satisfied that your script is correct, run
./cleanup.sh
from your shell prompt.
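If the editing steps feel tedious, the same transformation can be sketched as a filter, while still leaving the generated script to inspect before running it. make_cleanup is a hypothetical name; it reads the output of file on standard input, assumes no file name contains a colon or newline, and writes a bash header rather than #!/bin/sh because printf %q uses bash quoting.

```bash
#!/bin/bash
# Sketch: turn the output of `file` into reviewable rm commands.
# make_cleanup is an illustrative name. It reads "name: description"
# lines on stdin and skips everything reported as a Zip archive.
make_cleanup() {
    local line name
    echo '#!/bin/bash'
    while IFS= read -r line; do
        case $line in
            *'Zip archive'*) continue ;;  # keep zip files
        esac
        name=${line%%:*}                  # everything before the first colon
        printf 'rm -- %q\n' "$name"       # %q quotes shell metacharacters
    done
}

# Usage: file index.html.* | make_cleanup > cleanup.sh
```

The quoting step covers the case of shell metacharacters in file names that the manual procedure warns about.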
for i in index.html.*
do
    # -b prints only the type description, without the file name
    type=$(file -b -- "$i")
    if [[ ! $type =~ Zip ]]
    then
        rm -- "$i"
    fi
done
Change rm to ls for testing purposes.
