Bash command to move only some files? - linux

Let's say I have the following files in my current directory:
1.jpg
1original.jpg
2.jpg
2original.jpg
3.jpg
4.jpg
Is there a terminal/bash/linux command that can do something like
if the file [an integer]original.jpg exists,
then move [an integer].jpg and [an integer]original.jpg to another directory.
Executing such a command will cause 1.jpg, 1original.jpg, 2.jpg and 2original.jpg to be in their own directory.
NOTE
This doesn't have to be one command. It can be a combination of simple commands. Maybe something like: copy the original files to a new directory, then apply a regular-expression filter to the files in the new directory to get a list of file names from the old directory that still need to be copied over, etc.

Turning on extended glob support will allow you to write a regular-expression-like pattern. This can handle files with multi-digit integers, such as '87.jpg' and '87original.jpg'. Bash parameter expansion can then be used to strip "original" from the name of a found file to allow you to move the two related files together.
shopt -s extglob
for f in +([[:digit:]])original.jpg; do
mv "$f" "${f/original/}" otherDirectory
done
In an extended pattern, +( x ) matches one or more of the things inside the parentheses, analogous to the regular expression x+. Here, x is any digit. Therefore, we match all files in the current directory whose name consists of 1 or more digits followed by "original.jpg".
${f/original/} is an example of bash's pattern substitution. It removes the first occurrence of the string "original" from the value of f. So if f is the string "1original.jpg", then ${f/original/} is the string "1.jpg".
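As a small illustration of the two pieces described above, the sketch below tests names against the extended pattern and derives the partner name with pattern substitution. The function names are made up for this example, and it assumes bash 4.1 or later, where patterns inside [[ ... == ... ]] are matched as if extglob were enabled.

```shell
#!/usr/bin/env bash
# Illustrative helpers; the names are hypothetical, not from the answer.

# Succeeds only for names like "7original.jpg" or "87original.jpg".
is_numbered_original() {
    [[ $1 == +([[:digit:]])original.jpg ]]
}

# Prints the partner name, e.g. "87original.jpg" becomes "87.jpg".
partner_of() {
    printf '%s\n' "${1/original/}"
}

for name in 1original.jpg 87original.jpg original.jpg aoriginal.jpg 3.jpg; do
    if is_numbered_original "$name"; then
        echo "$name pairs with $(partner_of "$name")"
    else
        echo "$name is skipped"
    fi
done
```

Only the first two names in the loop pass the check; the others lack a purely numeric prefix.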

Well, not directly, but it's a one-liner (edit: not anymore):
for i in [0-9].jpg; do
orig=${i%.*}original.jpg
[ -f "$orig" ] && mv "$i" "$orig" another_dir/
done
edit: I should probably explain my solution:
for i in [0-9].jpg: execute the loop body for each jpg file whose name is a single digit, storing the whole filename in $i.
orig=${i%.*}original.jpg: save in $orig the possible filename of the "original" file.
[ -f "$orig" ]: check via test(1) (the [ ... ] stuff) whether the original file for $i exists. If it does, move both files to another_dir. This is done via &&: the part after it is executed only if the test succeeded.

This should work for any strictly numeric prefix, i.e. 234.jpg
for f in *original.jpg; do
pre=${f%original.jpg}
if [[ -e "$pre.jpg" && "$pre" -eq "$pre" ]] 2>/dev/null; then
mv "$f" "$pre.jpg" targetDir
fi
done
"$pre" -eq "$pre" gives an error if not integer
EDIT:
this fails if there exist original.jpg and .jpg both.
$pre is then nullstring and "$pre" -eq "$pre" is true.
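One way to close the empty-prefix hole noted above is to validate the prefix with a case pattern instead of -eq. This is only a sketch: move_numeric_pairs and its destination argument are hypothetical names, and it uses plain POSIX sh features.

```shell
# move_numeric_pairs DEST: move N.jpg and Noriginal.jpg into DEST when both
# exist and N consists only of digits. (Hypothetical helper for illustration.)
move_numeric_pairs() {
    dest=$1
    mkdir -p "$dest" || return
    for f in *original.jpg; do
        pre=${f%original.jpg}
        case $pre in
            ''|*[!0-9]*) continue ;;   # empty prefix, or it contains a non-digit
        esac
        [ -e "$pre.jpg" ] || continue  # the plain partner must exist too
        mv -- "$f" "$pre.jpg" "$dest"/
    done
}
```

Called as move_numeric_pairs targetDir from the directory holding the images, it leaves original.jpg, .jpg, and any non-numeric names alone. If nothing matches the glob, the literal pattern is rejected by the same case test.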

The following would work and is easy to understand (replace out with the output directory, and {1..9} with the actual range of your numbers).
for x in {1..9}
do
if [ -e ${x}original.jpg ]
then
mv $x.jpg out
mv ${x}original.jpg out
fi
done
You can obviously also enter it as a single line.

You can use Regex statements to find "matches" in the files names that you are looking through. Then perform your actions on the "matches" you find.

integer=0; while [ $integer -le 9 ] ; do if [ -e ${integer}original.jpg ] ; then mv -vi ${integer}.jpg ${integer}original.jpg lol/ ; fi ; integer=$(( integer + 1 )) ; done
Note that here, "lol" is the destination directory. You can change it to anything you like. Also, you can change the 9 in while [ $integer -le 9 ] to check integers larger than 9. Right now it starts at 0* and stops after checking 9*.
Edit: If you want to, you can replace the semicolons in my code with newlines and it may be easier to read. You can also paste the whole multi-line block into the terminal, even if that might not be immediately obvious.

Related

Binary operator expected while using diff [duplicate]

I have a folder with a ton of old photos with many duplicates. Sorting it by hand would take ages, so I wanted to use the opportunity to use bash.
Right now I have the code:
#!/bin/bash
directory="~/Desktop/Test/*"
for file in ${directory};
do
for filex in ${directory}:
do
if [ $( diff {$file} {$filex} ) == 0 ]
then
mv ${filex} ~/Desktop
break
fi
done
done
And getting the error output:
diff: {~/Desktop/Test/*}: No such file or directory
diff: {~/Desktop/Test/*:}: No such file or directory
File_compare: line 8: [: ==: unary operator expected
I've tried modifying working code I've found online, but it always seems to spit out some error like this. I'm guessing it's a problem with the nested for loop?
Also, why does it seem there are different ways to call variables? I've seen examples that use ${file}, "$file", and "${file}".
You have the {} in the wrong places:
if [ $( diff {$file} {$filex} ) == 0 ]
They should be at:
if [ $( diff ${file} ${filex} ) == 0 ]
(though the braces are optional now), but you should allow for spaces in the file names:
if [ $( diff "${file}" "${filex}" ) == 0 ]
Now it simply doesn't work properly because when diff finds no differences, it generates no output (and you get errors because the == operator doesn't expect nothing on its left-side). You could sort of fix it by double quoting the value from $(…) (if [ "$( diff … )" == "" ]), but you should simply and directly test the exit status of diff:
if diff "${file}" "${filex}"
then : no difference
else : there is a difference
fi
and maybe for comparing images you should be using cmp (in silent mode) rather than diff:
if cmp -s "$file" "$filex"
then : no difference
else : there is a difference
fi
In addition to the problems Jonathan Leffler pointed out:
directory="~/Desktop/Test/*"
for file in ${directory};
~ and * won't get expanded inside double-quotes. The * will get expanded when you use the variable without quotes, but since the ~ won't, the shell looks for files under a directory actually named "~" (not your home directory) and finds no matches. Also, as Jonathan pointed out, using variables (like ${directory}) without double-quotes will run you into trouble with filenames that contain spaces or some other metacharacters. The better way is to keep the wildcard out of the variable and add it where you reference the variable, with the variable in double-quotes and the * outside them:
directory=~/"Desktop/Test"
for file in "${directory}"/*;
Oh, and another note: when using mv in a script it's a good idea to use mv -i to avoid accidentally overwriting another file with the same name.
And: use shellcheck.net to sanity-check your code and point out common mistakes.
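Putting the advice from these two answers together, a duplicate mover might look roughly like the sketch below. move_duplicates and its two arguments are hypothetical names; the code tests the exit status of cmp -s directly, keeps the glob outside the quotes, and uses mv -i as suggested.

```shell
# move_duplicates SRC DEST: for each group of byte-identical regular files in
# SRC, keep the first one and move the later copies to DEST.
# (Hypothetical helper; a sketch, not a tuned duplicate finder.)
move_duplicates() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return
    for a in "$src"/*; do
        [ -f "$a" ] || continue          # skip directories and files already moved
        for b in "$src"/*; do
            [ -f "$b" ] || continue
            [ "$a" = "$b" ] && continue  # never compare a file with itself
            if cmp -s "$a" "$b"; then    # exit status 0 means identical contents
                mv -i -- "$b" "$dest"/
            fi
        done
    done
}
```

This compares every pair, so it is quadratic in the number of files. For large folders a dedicated tool is likely a better fit.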
If you are simply interested in knowing if two files differ, cmp is the best option. Its advantages are:
It works for text as well as binary files, unlike diff which is for text files only
It stops after finding the first difference, and hence it is very efficient
So, your code could be written as:
if ! cmp -s "$file" "$filex"; then
# files differ...
mv "$filex" ~/Desktop
# any other logic here
fi
Hope this helps. I didn't understand what you are trying to do with your loops and hence didn't write the full code.
You can use diff "$file" "$filex" &>/dev/null and get the last command result with $? :
#!/bin/bash
SEARCH_DIR="."
DEST_DIR="./result"
mkdir -p "$DEST_DIR"
directory="."
ls $directory | while read file;
do
ls $directory | while read filex;
do
if [ ! -d "$filex" ] && [ ! -d "$file" ] && [ "$filex" != "$file" ];
then
diff "$file" "$filex" &>/dev/null
if [ "$?" == 0 ];
then
echo "$filex is a duplicate. Copying to $DEST_DIR"
mv "$filex" "$DEST_DIR"
fi
fi
done
done
Note that you can also use fslint or fdupes utilities to find duplicates

How do I move files to specific directories based on a pattern in the filename?

If any of this isn't particularly clear, please let me know and I'll do my best to clarify.
I basically need to sort a set of files with various extensions and similar patterns to the filename, into directories and subdirectories that match the pattern and type of extension.
To elaborate a bit:
All files, regardless of extension, begin with the pattern "zz####" where #### is a number from 1 to 900; "zz1.zip through zz950.zip, zz1.mov through zz950.mov, zz1.mp4 through zz950.mp4"
Some files contain additional characters; "zz360_hello_world.zip"
Some files contain spaces; "zz370_hello world.zip"
I need these files to be sorted and moved into directories and subdirectories following a particular format: "/home/hello/zz1/zip, /home/hello/zz1/vid"
If the directories and/or subdirectories don't exist, I need them created.
Example:
zz400_testing.zip ----> /home/hello/zz400/zip
zz400 testing video.mov ----> /home/hello/zz400/vid
zz500.zip ----> /home/hello/zz500/zip
zz500_testing another video.mp4 ----> /home/hello/zz500/vid
I found a few answers around here for simpler use-cases, but wasn't able to get anything working for my particular needs.
Any help at all would be much appreciated.
Thank you!
EDIT: Adding the code I've been messing with
for f in *.zip; do
set=`echo "$f"|sed 's/[0-9].*//'`
dir="/home/demo/$set/photos"
mkdir -p "$dir"
mv "$f" "$dir"
done
I think I'm just having trouble wrapping my head around how to match with regex. I've got this far with it:
[demo@alpha grep]$ echo zz433.zip|sed 's/[0-9].*//'
zz
The script will run the mkdir, and even move the zip files into their proper place. I just can't get it to create the proper top-level directory (zz433).
The sed command here doesn't do what you're trying to achieve:
set=`echo "$f"|sed 's/[0-9].*//'`
The meaning of the regular expression [0-9].* is "a digit followed by anything".
The s/// command of sed performs a replacement.
The result is effectively removing everything from the input starting from the first digit.
So for "zz360_hello_world.zip" it removes everything starting from "3",
leaving only "zz".
Note also that to match the files, the pattern *.zip doesn't match your description. You're looking for files starting with "zz" and a number from 1 up to 900. If you don't mind including numbers > 900 then you can write the loop expression like this:
for f in zz[0-9][^0-9]* zz[0-9][0-9][^0-9]* zz[0-9][0-9][0-9][^0-9]*; do
Or the same thing more compactly:
for f in zz{[0-9],[0-9][0-9],[0-9][0-9][0-9]}[^0-9]*; do
These are glob patterns.
zz[0-9][^0-9]* means "start with 'zz', followed by a digit, followed by a non-digit, followed by anything".
In the above example I use three patterns to cover the cases of "zz" followed by 1, 2 or 3 digits, followed by a non-digit.
The second example is a more compact form of the first,
the idea is that a{b,c}d expands to abd and acd.
Next, to get the appropriate prefix, you could use pattern matching with a case statement and extract substrings.
The syntax of these patterns is the same glob syntax as in the previous example in the for statement.
case "$f" in
zz[0-9][0-9][0-9]*) prefix=${f:0:5} ;;
zz[0-9][0-9]*) prefix=${f:0:4} ;;
zz[0-9]*) prefix=${f:0:3} ;;
esac
It seems you also want to create grouping by file type. You could get the file extension by chopping off the beginning of the name until the dot with ext=${f##*.}, and then use a case statement as in the earlier example to map extensions to the desired directory names.
Putting the above together:
for f in zz{[0-9],[0-9][0-9],[0-9][0-9][0-9]}[^0-9]*; do
case "$f" in
zz[0-9][0-9][0-9]*) prefix=${f:0:5} ;;
zz[0-9][0-9]*) prefix=${f:0:4} ;;
zz[0-9]*) prefix=${f:0:3} ;;
esac
ext=${f##*.}
case "$ext" in
mov|mp4) group=vid ;;
*) group=$ext ;;
esac
dir="/home/demo/$prefix/$group"
mkdir -p "$dir"
mv "$f" "$dir"
done
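The case ladder above can also be replaced by two parameter expansions that work for any number of digits. The sketch below only computes the destination path; target_dir is a hypothetical name, and the default base /home/hello is taken from the question's example.

```shell
# target_dir NAME [BASE]: print the directory a file such as
# "zz400 testing video.mov" should be moved to, or fail if NAME does not
# start with "zz" followed by at least one digit. (Hypothetical helper.)
target_dir() {
    name=$1
    base=${2:-/home/hello}
    rest=${name#zz}              # drop the leading "zz"
    digits=${rest%%[!0-9]*}      # keep only the leading run of digits
    [ "$rest" != "$name" ] && [ -n "$digits" ] || return 1
    ext=${name##*.}              # text after the last dot
    case $ext in
        mov|mp4) group=vid ;;
        *)       group=$ext ;;
    esac
    printf '%s\n' "$base/zz$digits/$group"
}
```

A loop could then do dir=$(target_dir "$f") && mkdir -p "$dir" && mv "$f" "$dir" for each file, skipping names that do not fit the scheme.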
I've answered part of my own question!
for f in *.zip; do
set=`echo "$f"|grep -o -P 'zz[0-9]+.{0,0}'`
dir="/home/demo/$set/photos"
mkdir -p "$dir"
mv "$f" "$dir"
done
Basically, the following script will grab files like:
zz232.zip
zz233test.zip
zz234 test.zip
Then it will create the top-level directory (zz####), the photos sub-directory, and move the file into place:
/home/demo/zz232/photos/zz232.zip
/home/demo/zz233/photos/zz233test.zip
/home/demo/zz234/photos/zz234 test.zip
Moving on to expanding the script for additional functionality.
Thanks all!
How about:
#!/bin/bash
IFS=$'\n'
for file in *; do
if [[ $file =~ ^(zz[0-9]+).*\.(zip|mov|mp4)$ ]]; then
ext=${BASH_REMATCH[2]}
if [ $ext = "mov" -o $ext = "mp4" ]; then
ext="vid"
fi
dir="/home/hello/${BASH_REMATCH[1]}/$ext"
mkdir -p "$dir"
mv "$file" "$dir"
fi
done
Hope this helps.

Linux bash script: Recursively delete files if empty based on file format from user input [duplicate]

So as the title describes, I want to recursively delete all files which match a naming pattern given by the user, but only if the file is empty. Here is my attempt:
#!/bin/bash
_files="$1"
[ $# -eq 0 ] && { echo "Usage: $0 filename"; exit 1; }
[ ! -f "$_files" ] && { echo "Error: $0 no files were found which match given naming structure."; exit 2; }
for f in $(find -name $_files)
do
if [ -s "$f" ]
then
echo "$f has some data."
# do something as file has data
else
echo "$f is empty. Deleting file."
rm $f
fi
done
Example output:
./remove_blank.sh *.o*
./Disp_variations_higher_0.o1906168 has some data.
./remove_blank.sh *.e*
./Disp_variations_higher_15.e1906183 is empty. Deleting file.
As you can see, the code works, but only for one file at a time. It should be a relatively simple fix to get it to work, but I am extremely new to bash scripting and can't seem to figure it out. Sorry for the noobish question. I did some research to find an answer but didn't find exactly what I needed. Thanks in advance for any help.
Edit
I have found two different solutions to the problem. Following @David Z's suggestion, one can fix this by first deleting the error-checking part of the script and then putting quotes around the $_files variable in the find command. The code then looks like this:
#!/bin/bash
_files=$1
[ $# -eq 0 ] && { echo "Usage: $0 filename"; exit 1; }
for f in $(find -name "$_files")
do
if [ -s $f ]
then
echo "$f has some data."
# do something as file has data
else
echo "$f is empty. Deleting file."
rm $f
fi
done
Or, one can also simply change the for loop to for f in "$@", which allows the error check to be kept in the script. I am not sure which method is better but will update again if I find out.
It looks like the way you're invoking the script, the shell expands the pattern before running your script. For example, in
./remove_blank.sh *.o*
the shell converts *.o* to the list of filenames that match that pattern, and then what it actually runs is something like
./remove_blank.sh Disp_variations_higher_0.o1906168 other_file.o12345 ...
But you only check the first argument in your script, so that's the one that winds up getting deleted.
Solution: quote the pattern when you run the script.
./remove_blank.sh '*.o*'
You will also need to remove the test [ ! -f "$_files" ] ... because $_files is being set to the pattern (such as *.o*), not a filename. In fact, you might want to rename the variable to make that clear. And finally, you need to quote the variable in your find command,
... $(find -name "$_files") ...
so that the pattern makes it all the way through to find without being converted into filenames by the shell.
There are some other issues with the script but you might want to ask about those on Code Review. What I've identified here is just the minimum needed to get it working.
Incidentally, you can accomplish this whole task using find itself, which will at least give cleaner code:
find -name "$_files" -type f -empty -delete
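Wrapped up as a function, the find one-liner could be sketched as follows. delete_empty and its optional directory argument are names invented for this example; -empty and -delete are extensions found in GNU and BSD find rather than in POSIX, so this assumes one of those implementations.

```shell
# delete_empty PATTERN [DIR]: recursively delete empty regular files whose
# names match PATTERN, printing each path as it is removed.
# (Hypothetical helper; relies on the non-POSIX -empty and -delete tests.)
delete_empty() {
    pattern=$1
    dir=${2:-.}
    find "$dir" -name "$pattern" -type f -empty -print -delete
}
```

The pattern still has to be quoted at the call site, for example delete_empty '*.o*', so that the shell passes it through unexpanded. Dropping -delete turns the command into a dry run that only lists what would be removed.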

Iterating with ls and checking with -f doesn't work

I have a piece of code that should work, but it doesn't.
I want to iterate through the files and subdirectories of the directories given on the command line and see which of them are files. The program never enters the if statement.
for i in $@;do
for j in `ls $i`;do
if [ -f $j ];then
echo $j is a file!
fi
done
done
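Things can go wrong with your approach. In particular, ls $i prints bare names without the directory prefix, so [ -f $j ] looks for each name in the current directory rather than inside $i. Do it this way.
for i in "$@" ; do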
for j in "$i"/* ; do
if [ -f "$j" ]; then
echo "$j is a regular file!"
fi
done
done
Changes :
Quoted the "$@" to avoid problems with file paths containing spaces or newlines.
Used shell globbing in the inner loop, as parsing ls output is not a good idea (see http://mywiki.wooledge.org/ParsingLs)
Double-quoted variable expansion inside the test, once again to allow for files with spaces, newlines.
Added "regular" in the output, because this is what this specific test operator tests for (e.g. will exclude files that correspond to devices, FIFOs, not just directories).
You could simplify a bit if you are so inclined :
for i in "$@" ; do
for j in "$i"/* ; do
! [ -f "$j" ] || echo "$j is a regular file!"
done
done
If you want to use find, you need to make sure you only list files at a depth of one level (or else the results could be different from your code). You can do it this way :
find "$@" -mindepth 1 -maxdepth 1 -type f -exec echo "{} is a file" \;
Please note that this will still be a bit different, as globbing (by default) excludes files that start with a period. Adding shopt -s dotglob to the loop-based solution would allow globbing to consider all files, which should then make both solutions operate on the same files.
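For shells without dotglob, a commonly used portable alternative is to add two extra glob patterns that pick up hidden names. The sketch below is one way to write that; list_regular_files is a hypothetical name and the code sticks to POSIX sh.

```shell
# list_regular_files DIR...: print every regular file directly inside each
# DIR, including dotfiles. (Hypothetical helper for illustration.)
list_regular_files() {
    for dir in "$@"; do
        # "*" covers visible names; ".[!.]*" and "..?*" cover hidden names
        # without ever matching "." or "..". A pattern that matches nothing
        # stays literal and is rejected by the -f test.
        for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
            [ -f "$entry" ] && printf '%s\n' "$entry"
        done
    done
    return 0
}
```

The paths it prints keep the directory prefix, which is what makes the -f test look in the right place.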
I think you'll be better off using find:
find "$@" -type f

How to remove the extension of a file?

I have a folder that is full of .bak files and some other files also. I need to remove the extension of all .bak files in that folder. How do I make a command which will accept a folder name and then remove the extension of all .bak files in that folder ?
Thanks.
To remove a string from the end of a BASH variable, use the ${var%ending} syntax. It's one of a number of string manipulations available to you in BASH.
Use it like this:
# Run in the same directory as the files
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
That works nicely as a one-liner, but you could also wrap it as a script to work in an arbitrary directory:
# If we're passed a parameter, cd into that directory. Otherwise, do nothing.
if [ -n "$1" ]; then
cd "$1"
fi
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
Note that while quoting your variables is almost always a good practice, the for FILENAME in *.bak is still dangerous if any of your filenames might contain spaces. Read David W.'s answer for a more-robust solution, and this document for alternative solutions.
There are several ways to remove file suffixes:
In BASH and Kornshell, you can use the environment variable filtering. Search for ${parameter%word} in the BASH manpage for complete information. Basically, # is a left filter and % is a right filter. You can remember this because # is to the left of %.
If you use a double filter (i.e. ## or %%), you are trying to filter on the biggest match. If you have a single filter (i.e. # or %), you are trying to filter on the smallest match.
What matches is filtered out and you get the rest of the string:
file="this/is/my/file/name.txt"
echo ${file#*/}  # Matches "this/" and will print out "is/my/file/name.txt"
echo ${file##*/} # Matches "this/is/my/file/" and will print out "name.txt"
echo ${file%/*}  # Matches "/name.txt" and will print out "this/is/my/file"
echo ${file%%/*} # Matches "/is/my/file/name.txt" and will print out "this"
Notice this is a glob match and not a regular expression match! If you want to remove a file suffix:
file_sans_ext=${file%.*}
The .* will match on the period and all characters after it. Since it is a single %, it will match on the smallest glob on the right side of the string. If the filter can't match anything, the result is the same as your original string.
You can verify a file suffix with something like this:
if [ "${file}" != "${file%.bak}" ]
then
echo "$file is a type '.bak' file"
else
echo "$file is not a type '.bak' file"
fi
Or you could do this:
file_suffix=${file##*.}
echo "My file is a file '.$file_suffix'"
Note that this will remove the period of the file extension.
Next, we will loop:
find . -name "*.bak" -print0 | while read -d $'\0' file
do
echo "mv '$file' '${file%.bak}'"
done | tee find.out
The find command finds the files you specify. The -print0 separates the names of the files with a NUL character, which is one of the few characters not allowed in a file name. The -d $'\0' means that your input separator is the NUL character. See how nicely find -print0 and read -d $'\0' work together?
You should almost never use the for file in $(*.bak) method. This will fail if the files have any white space in the name.
Notice that this command doesn't actually move any files. Instead, it produces a find.out file with a list of all the file renames. You should always do something like this when you do commands that operate on massive amounts of files just to be sure everything is fine.
Once you've determined that all the commands in find.out are correct, you can run it like a shell script:
$ bash find.out
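Once the dry run looks right, the same NUL-delimited loop can do the renames directly instead of generating a script. The following is a sketch under a few assumptions: strip_bak is a made-up name, the code needs bash for read -d, and mv -n (do not overwrite) is available in GNU and BSD mv but is not required by POSIX.

```shell
#!/usr/bin/env bash
# strip_bak [DIR]: remove the ".bak" suffix from every regular file under DIR
# (default: current directory), without overwriting existing files.
# (Hypothetical helper for illustration.)
strip_bak() {
    local dir=${1:-.} file
    find "$dir" -type f -name '*.bak' -print0 |
    while IFS= read -r -d '' file; do
        # IFS= and -r keep leading spaces and backslashes in names intact
        mv -n -- "$file" "${file%.bak}"
    done
}
```

Because the names travel as NUL-terminated strings and every expansion is quoted, file names with spaces or newlines are handled the same as ordinary ones.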
rename .bak '' *.bak
(rename is in the util-linux package)
Caveat: there is no error checking:
#!/bin/bash
cd "$1"
for i in *.bak ; do mv -f "$i" "${i%%.bak}" ; done
You can always use the find command to get all the subdirectories
for FILENAME in `find . -name "*.bak"`; do mv --force "$FILENAME" "${FILENAME%.bak}"; done
