Avoid collision, if copying files - linux

I was trying to copy all files of a certain filetype from all subfolders to one place. Unfortunately, this might cause collisions, if two files have the same name from two different subfolders.
I was using
find ./ -name '*.jpg' -exec mv -u '{}' . \;
How can I adjust this to automatically rename files (e.g. append "_1") to avoid collisions?
Or better: check beforehand whether the files are the same (e.g. same size). If yes, skip the file (overwriting would be fine, too). If not, rename it to avoid the collision.
Suggestions would be appreciated. Thanks!

You could check before moving each individual file. Here I've used cksum to compare, which returns both the filesize and a simple checksum.
find ./ -name '*.jpg' -print0 |
while IFS= read -d '' -r path; do
file=$(basename "$path")
if [[ -e $file ]]; then
if [[ $(cksum "$file" | awk '{print $1 $2}') = $(cksum "$path" | awk '{print $1 $2}') ]]; then
continue
fi
read -n 1 -p "File '$file' would be overwritten by '$path', continue? (y/N) " -r prompt </dev/tty
if [[ $prompt != [Yy] ]]; then
continue
fi
fi
mv -f -v "$path" "$file"
done
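The question also asked for automatic renaming instead of a prompt. A minimal sketch of that variant follows; the function name move_unique and the _N suffix format are illustrative assumptions, and cmp is used here in place of cksum for the equality check.

```shell
#!/usr/bin/env bash
# Move "$1" into directory "$2" without overwriting a different file.
# If a file with identical content is already there, the source is left alone.
# Otherwise _1, _2, ... is appended before the extension until a free name is found.
move_unique() {
  local src=$1 destdir=$2
  local name=${src##*/}
  local stem=${name%.*} ext=""
  case $name in
    *.*) ext=.${name##*.} ;;
  esac
  local target=$destdir/$name n=1
  while [ -e "$target" ]; do
    # Same content (or the very same file): nothing to do.
    cmp -s -- "$src" "$target" && return 0
    target=$destdir/${stem}_$n$ext
    n=$((n + 1))
  done
  mv -- "$src" "$target"
}

# Possible use with the find pipeline from the answer:
#   find . -name '*.jpg' -print0 |
#     while IFS= read -r -d '' path; do move_unique "$path" .; done
```

Files that are skipped as duplicates stay in their subfolder, so they can be reviewed or deleted separately.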

Related

Use find to copy files to a new folder

I'm searching for a find command to copy all wallpaper files that look like this:
3245x2324.png (All Numbers are just a placeholder)
3242x3242.jpg
I'm in my /usr/share/wallpapers folder and there are many sub folders with the files I want to copy.
There are many like "screenshot.png" and these files I don't want to copy.
My find command is like this:
find . -type f -name "*????x????.???"
If I search with this I get the files I wanted to see, but if I combine this with -exec cp:
find . -type f -name "*????x????.???" -exec cp "{}" /home/mine/Pictures/WP \;
the find command only copies 10 files and there are 77 (I counted with wc).
Does anyone know what I'm doing wrong?
You can look it up if you follow the link:
renaming with find
You can use -exec to do this, but I'm not sure you can rename and copy in one step. Maybe with a script that is executed for every find result.
But that's only a suggestion.
One approach is to use the absolute path of each file as its name in the destination, replacing every / with an underscore _, since / is not allowed in file names, at least in a Unix-like environment.
With find and bash, something like:
find /usr/share/wallpapers -type f -name "????x????.???" -exec bash -c '
destination=/home/mine/Pictures/WP/
# No shift here: $0 is the placeholder _, so the positional parameters hold only file paths.
for f; do
path_name=${f%/*}
file_name=${f##*/}
echo cp -vi -- "$f" "$destination${path_name//\//_}_$file_name"
done' _ {} +
See understanding-the-exec-option-of-find
With the globstar and nullglob shell options and a bash associative array to skip duplicate file names:
#!/usr/bin/env bash
shopt -s globstar nullglob
pics=(/usr/share/wallpapers/**/????x????.???)
shopt -u globstar nullglob
declare -A dups
destination=/home/mine/Pictures/WP/
for i in "${pics[@]}"; do
((!dups["${i##*/}"]++)) &&
echo cp -vi -- "$i" "$destination"
done
GNU cp(1) has the -u flag/option which might come in handy along the way.
Remove the echo if you're satisfied with the result.
Another option is to append a number in parentheses and increment it, e.g. ????x????.???(N) where N is an integer, much like how some GUI file managers deal with duplicate file/directory names.
Something like:
#!/usr/bin/env bash
source=/usr/share/wallpapers/
destination=/home/mine/Pictures/WP/
while IFS= read -rd '' file; do
counter=1
file_name=${file##*/}
if [[ ! -e "$destination$file_name" && ! -e "$destination$file_name($counter)" ]]; then
cp -v -- "$file" "$destination$file_name"
elif [[ -e "$destination$file_name" && ! -e "$destination$file_name($counter)" ]]; then
cp -v -- "$file" "$destination$file_name($counter)"
elif [[ -e "$destination$file_name" && -e "$destination$file_name($counter)" ]]; then
while [[ -e "$destination$file_name($counter)" ]]; do
((counter++))
done
cp -v -- "$file" "$destination$file_name($counter)"
fi
done < <(find "$source" -type f -name '????x????.???' -print0)
Note that the -print0 primary is a GNU/BSD find(1) feature.

How to find and delete resized Wordpress images if the original image was already deleted?

This question pertains to the situation where
An image was uploaded, say mypicture.jpg
Wordpress created multiple copies of it with different resolutions like mypicture-300x500.jpg and mypicture-600x1000.jpg
You delete the original image only
In this scenario, the remaining photos on the filesystem are mypicture-300x500.jpg and mypicture-600x1000.jpg.
How can you script this to find these "dangling" images with the missing original and delete the "dangling" images.
You could use find to find all lower resolution pictures with the -regex test:
find . -type f -regex '.*-[0-9]+x[0-9]+\.jpg'
And this would be much better than trying to parse the ls output which is for humans only, not for automation. A safer (and simpler) bash script could thus be:
#!/usr/bin/env bash
while IFS= read -r -d '' f; do
[[ "$f" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
! [ -f "${BASH_REMATCH[1]}".jpg ] &&
echo rm -f "$f"
done < <(find . -type f -regex '.*-[0-9]+x[0-9]+\.jpg' -print0)
(Delete the echo once you are convinced that it works as expected.)
Note: we use the -print0 action and the empty read delimiter (-d '') to separate the file names with the NUL character instead of the newline character. This is preferable because it works as expected even if you have unusual file names (e.g., with spaces).
Note: as we test the file name inside the loop we could simply search for files (find . -type f -print0). But I suspect that if you have a large number of files the performance would be negatively impacted. So keeping the -regex test is probably better.
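To illustrate the note about NUL-separated names, here is a small sketch that counts the same files twice, once reading newline-delimited output and once reading -print0 output. The temporary directory and the file names are made up for the demonstration.

```shell
#!/usr/bin/env bash
demo=$(mktemp -d)
# Three files; the last name contains a newline character.
touch "$demo/plain.jpg" "$demo/with space.jpg" "$demo/"$'line\nbreak.jpg'

# Newline-delimited: the name with the embedded newline is read as two entries.
newline_count=0
while IFS= read -r f; do
  newline_count=$((newline_count + 1))
done < <(find "$demo" -type f)

# NUL-delimited: every file is read as exactly one entry.
nul_count=0
while IFS= read -r -d '' f; do
  nul_count=$((nul_count + 1))
done < <(find "$demo" -type f -print0)

echo "newline-delimited: $newline_count, NUL-delimited: $nul_count"
rm -rf "$demo"
```

The newline-delimited loop reports one entry more than there are files, which is exactly the kind of mismatch that would make a deletion script act on the wrong paths.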
Bash loops are OK but they tend to become really slow as the number of iterations increases. So, let's incorporate our simple bash script in a single find command with the -exec action:
find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -print
Note: bash -c takes a script to execute as first argument, then the positional parameters to pass to the script, starting with $0. This is why we pass _ (my favourite for don't care), followed by {} (the current file path).
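The positional-parameter behaviour described in this note can be checked in isolation. The file names below are placeholders, not real files:

```shell
#!/usr/bin/env bash
# $0 receives the placeholder _, the file names start at $1.
out=$(bash -c 'printf "%s\n" "\$0=$0" "\$1=$1" "\$2=$2"' _ first.jpg second.jpg)
printf '%s\n' "$out"

# Without the placeholder, the first file name is swallowed by $0,
# so $1 is already the second name.
shifted=$(bash -c 'printf "%s\n" "$1"' first.jpg second.jpg)
printf '%s\n' "$shifted"
```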
Note: -print is normally the default find action but here it is needed because -exec is one of the find actions that inhibit the default behaviour.
This will print a list of files. Check that it is correct and, once you are satisfied, add the -delete action:
find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -delete -print
See man find and man bash for more explanations.
Demo:
$ touch mypicture.jpg mypicture-300x500.jpg mypicture-600x1000.jpg
$ find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -print
$ rm -f mypicture.jpg
$ find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -print
./mypicture-300x500.jpg
./mypicture-600x1000.jpg
$ find . -type f -exec bash -c '[[ "$1" =~ (.*)-[0-9]+x[0-9]+\.jpg ]] &&
! [ -f "${BASH_REMATCH[1]}".jpg ]' _ {} \; -delete -print
./mypicture-300x500.jpg
./mypicture-600x1000.jpg
$ ls *.jpg
ls: cannot access '*.jpg': No such file or directory
One last note: if, by accident, one of your full resolution pictures matches the regular expression for lower resolution pictures (e.g., if you have a balloon-1x1.jpg full resolution picture), it will be deleted. This is unfortunate, but according to your specifications there is no easy way to distinguish it from an orphan lower resolution picture. Be careful...
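The name test used throughout this answer can be tried on its own before anything is deleted. A minimal sketch, with the hypothetical helper original_of wrapping the same regular expression (anchored here at both ends):

```shell
#!/usr/bin/env bash
# Print the presumed original for a resized-looking name, or return non-zero
# if the name does not end in -<width>x<height>.jpg.
original_of() {
  local re='^(.*)-[0-9]+x[0-9]+\.jpg$'
  [[ $1 =~ $re ]] && printf '%s.jpg\n' "${BASH_REMATCH[1]}"
}

original_of ./mypicture-300x500.jpg   # prints ./mypicture.jpg
original_of ./balloon-1x1.jpg         # prints ./balloon.jpg, even if this file is itself an original
original_of ./screenshot.jpg || echo "not a resized name"
```

The second call shows the ambiguity from the last note: the name alone cannot tell a resized copy from an original that happens to end in -1x1.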
I've written a Bash script that will attempt to find the original filename (i.e. mypicture.jpg) based on scraping away the WordPress resolution (i.e. mypicture-300x500.jpg), and if it's not found, delete the "dangling image" (i.e. rm -f mypicture-300x500.jpg)
#!/bin/bash
for directory in $(find . -type d)
do
for image in $(ls $directory)
do
echo "The current filename is $image"
resolution=$(echo $image | rev | cut -f 1 -d "-" | rev | xargs)
echo "The resolution is $resolution"
extension=$(echo $resolution | rev| cut -f 1 -d "." | rev | xargs)
echo "The extension is $extension"
resolutiononly=$(echo $resolution | sed "s#.$extension##g")
echo "The resolution only is $resolutiononly"
pattern="[0-9]+x[0-9]+"
if [[ $resolutiononly =~ $pattern ]]; then
echo "The pattern matches"
originalfilename=$(echo $image | sed "s#-$resolution#.$extension#g")
echo "The current filename is $image"
echo "The original filename is $originalfilename"
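# look for the original next to the resized copy, not in the current directory
if [[ -f "$directory/$originalfilename" ]]; then
echo "The file exists $originalfilename"
else
rm -f "$directory/$image"
fi
else
# not a resized image: skip it instead of abandoning the rest of the directory
continue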
fi
done
done

batch rename dropbox conflict files

I have a large number of conflict files generated (incorrectly) by the dropbox service. These files are on my local linux file system.
Example file name = compile (master's conflicted copy 2013-12-21).sh
I would like to rename the file with its correct original name, in this case compile.sh and remove any existing file with that name. Ideally this could be scripted or in such a way to be recursive.
EDIT
After looking over the solution provided and playing around and further research I cobbled together something that works well for me:
#!/bin/bash
folder=/path/to/dropbox
clear
echo "This script will climb through the $folder tree and repair conflict files"
echo "Press a key to continue..."
read -n 1
echo "------------------------------"
find $folder -type f -print0 | while read -d $'\0' file; do
newname=$(echo "$file" | sed 's/ (.*conflicted copy.*)//')
if [ "$file" != "$newname" ]; then
echo "Found conflict file - $file"
if test -f $newname
then
backupname=$newname.backup
echo " "
echo "File with original name already exists, backup as $backupname"
mv "$newname" "$backupname"
fi
echo "moving $file to $newname"
mv "$file" "$newname"
echo
fi
done
For all files in the current directory:
for file in *
do
newname=$(echo "$file" | sed 's/ (.*)//')
if [ "$file" != "$newname" ]; then
echo moving "$file" to "$newname"
# mv "$file" "$newname" #<--- remove the comment once you are sure your script does the right thing
fi
done
Or, to recurse, put the following into a script that I'll call /tmp/myrename:
file="$1"
newname=$(echo "$file" | sed 's/ (.*)//')
if [ "$file" != "$newname" ]; then
echo moving "$file" to "$newname"
# mv "$file" "$newname" #<--- remove the comment once you are sure your script does the right thing
fi
Then run find . -type f -print0 | xargs -0 -n 1 /tmp/myrename. (This is a bit hard to do on the command line without an extra script because the file names contain blanks.)
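If a separate helper script is not wanted, the same rename can be sketched with find -exec bash -c, which passes each path as its own argument so blanks are not a problem. This is a dry run that only prints the mv commands. The function name strip_conflict_suffix is an illustrative assumption, and unlike the sed call above it rewrites only the file name, not the directory part of the path.

```shell
#!/usr/bin/env bash
# Dry run: print one mv command per file whose name contains " (...)".
strip_conflict_suffix() {
  find "${1:-.}" -type f -name '* (*)*' -exec bash -c '
    for file; do
      dir=${file%/*} name=${file##*/}
      newname=$(printf "%s\n" "$name" | sed "s/ (.*)//")
      [ "$name" != "$newname" ] && echo mv -- "$file" "$dir/$newname"
    done' _ {} +
}

# Example: strip_conflict_suffix /path/to/dropbox
```

Remove the echo inside the loop once the printed commands look right.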
a small contribution:
I've had a problem with this script: for files with spaces in their names, no backup copy was made. So I modified line 17:
-------cut-------------cut---------
if test -f "$newname"
-------cut-------------cut---------
This script displayed above is now outdated; the following works fine with the latest version of Dropbox running on Linux Mint at the time of writing:
#!/bin/bash
#modify this as needed
folder="./"
clear
echo "This script will climb through the $folder tree and repair conflict files"
echo "Press a key to continue..."
read -n 1
echo "------------------------------"
find "$folder" -type f -print0 | while read -d $'\0' file; do
newname=$(echo "$file" | sed 's/ (.*Case Conflict.*)//')
if [ "$file" != "$newname" ]; then
echo "Found conflict file - $file"
if test -f "$newname"
then
backupname=$newname.backup
echo " "
echo "File with original name already exists, backup as $backupname"
mv "$newname" "$backupname"
fi
echo "moving $file to $newname"
mv "$file" "$newname"
echo
fi
done
You can use the tool Dropbox Conflict Fix. It resolved all my conflicted copy files.

copy a directory structure with file names without content

I have a huge directory structure of movie files. To analyze that structure I want to copy the entire directory tree, i.e. folders and files, but without copying the movie data, while keeping the file names. Ideally I get zero-byte files with the original movie file names.
I tried to create links and then rsync them to my remote machine, but that didn't fetch the link files.
Any ideas how to do that w/o writing scripts?
You can use find:
find src/ -type d -exec mkdir -p dest/{} \; \
-o -type f -exec touch dest/{} \;
This finds directories (-type d) under src/ and creates them (mkdir -p) under dest/, or (-o) finds files (-type f) and touches them under dest/.
This will result in:
dest/src/<file-structure>
You can use mv creatively to resolve this issue.
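One way to avoid the extra dest/src/ level without a later mv is to run find from inside the source directory. A minimal sketch, assuming the destination lies outside the source tree; mirror_names is an illustrative name, not an existing tool:

```shell
#!/usr/bin/env bash
# Recreate the tree below "$1" inside "$2", with empty files of the same names.
mirror_names() {
  local src=$1 dest
  mkdir -p "$2" || return
  dest=$(cd "$2" && pwd) || return   # absolute path, so it survives the cd below
  (
    cd "$src" || exit
    # Directories first, then empty files; each path is relative to $src.
    find . -type d -exec sh -c 'dest=$1; shift; for d; do mkdir -p "$dest/$d"; done' _ "$dest" {} +
    find . -type f -exec sh -c 'dest=$1; shift; for f; do touch "$dest/$f"; done' _ "$dest" {} +
  )
}

# Example: mirror_names src dest
```

Passing the destination as an argument to sh -c, rather than embedding {} inside dest/{}, keeps the command independent of how a particular find expands {} within a longer word.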
Other (partial) solution can be achieved with rsync:
rsync -a --filter="-! */" source_dir/ target_dir/
The trick here is the --filter=RULE option that excludes (-) everything that is not (!) a directory (*/)
On ubuntu you can try:
cp -r --attributes-only <source_dir> <target_dir>
It doesn't copy file data.
From manpage of cp
--attributes-only
don't copy the file data, just the attributes
Note: I'm not sure this option available for other distributions, if anybody can confirm please update the answer.
I needed an alternative to this to sync only the file structure:
rsync --recursive --times --delete --omit-dir-times --itemize-changes "$src_path/" "$dst_path"
This is how I realized it:
# sync source to destination
while IFS= read -r -d '' src_file; do
dst_file="$dst_path${src_file/$src_path/}"
# new files
if [[ ! -e "$dst_file" ]]; then
if [[ -d "$src_file" ]]; then
mkdir -p "$dst_file"
elif [[ -f $src_file ]]; then
touch -r "$src_file" "$dst_file"
else
echo "Error: $src_file is not a dir or file"
fi
echo -n "+ "
ls -ld "$src_file"
# modification time changed (files only)
elif [[ -f $dst_file ]] && [[ $(date -r "$src_file") != $(date -r "$dst_file") ]]; then
touch -r "$src_file" "$dst_file"
echo -n "+ "
ls -ld "$src_file"
fi
done < <(find "$src_path" -print0)
# delete files in destination if they disappeared in source
while IFS= read -r -d '' dst_file; do
src_file="$src_path${dst_file/$dst_path/}"
# file disappeared on source
if [[ ! -e "$src_file" ]]; then
delinfo=$(ls -ld "$dst_file")
if [[ -d "$dst_file" ]] && rmdir "$dst_file" 2>/dev/null; then
echo -n "- $delinfo"
elif [[ -f $dst_file ]] && rm "$dst_file"; then
echo -n "- $delinfo"
fi
fi
done < <(find "$dst_path" -print0)
As you can see I use echo and ls to display changes.
ls > listOfMovie.txt
You will have the list of your films in a .txt file. For multiple directories, see the man page.

Using bash, how do I find all files containing a specific string and replace them with an existing file?

I am using Linux and would like to replace all files containing the string 000000 with an existing file /home/user/offblack.png but keep the existing filename. I've been working at this for a while with various combinations of -exec and xargs but no luck. So far I have:
find | grep 000000
Which does list all the files I want to change fine. How do I copy and replace these files with my existing offblack.png file?
Here's what I would use:
find (your find args here) \
| xargs fgrep '000000' /dev/null \
| awk -F: '{print $1}' \
| xargs -n 1 -I ORIGINAL_FILENAME /bin/echo /bin/cp /path/to/offblack.png ORIGINAL_FILENAME
Expanding, find all the files you're interested in, grep inside of them for the string '000000' (adding /dev/null to the list of files in case one of the generated fgreps ended up with only one filename - it ensures the output is always formatted as "filename: <line containing '000000'>"), strip out only the filenames, then one-by-one, copy in offblack.png over those files. Note that I inserted a /bin/echo in there. That's your dry-run. Remove the echo to get it to run for real.
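The pipeline above relies on splitting grep output at colons and whitespace, which can misfire on unusual file names. A sketch of an alternative that lets find call grep as a test on each file and only then run the copy; the function name replace_matching and its argument order are assumptions made for the example, and the echo keeps it a dry run like the command above:

```shell
#!/usr/bin/env bash
# Dry run: print a cp command for every regular file under root whose
# content contains the fixed string needle.
replace_matching() {
  local needle=$1 replacement=$2 root=${3:-.}
  # The second -exec only runs when grep -q exits successfully for that file.
  find "$root" -type f -exec grep -qF -- "$needle" {} \; \
       -exec echo cp -- "$replacement" {} \;
}

# Example: replace_matching 000000 /home/user/offblack.png .
```

Each file is listed at most once, however many matching lines it contains.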
If what you mean is that the filenames contain "000000":
find . -type f -a -name '*000000*' -exec /bin/echo /bin/cp /path/to/offblack.png {} \;
Much simpler. :-) Find every file under the current directory with a name containing your string and exec the copy of offblack.png over it. Again, what I've given you there is a dry-run. Remove the echo for your live fire drill. :-)
find . -type f | grep 000000 | tr \\n \\0 | xargs -0i+ cp ~/offblack.png "+"
Let's try and use Bash a bit more:
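# outer loop: one file name per line from find
while IFS= read -r filename
do
hit=""
# inner loop: scan the file line by line (read without a name fills $REPLY)
while IFS= read -r
do
if [[ $REPLY == *000000* ]]
then
hit=$filename
break
fi
done < "$filename"
[[ -n $hit ]] && cp /path/offblack.png "$filename"
done < <(find . -type f)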
Fewer man pages to search!
