I have a folder with multiple types of files (mp3, mp4, jpg, wma, etc.), and these files either have no extension or have messed-up extensions such as mp3.mp3 or mp3.jpg. I was reading that exiftool or even python-magic can be used to assign the correct file extension based on the detected file type. I am looking for an exiftool-based solution that renames these files with the correct extension.
For example:
filename (this is mp3 file)
filename1.jpg ( this is again mp3 file, with jpg as file extension)
filename.mp3.mp3.mp3 (repetition of extension)
At the simplest, try this (change double quotes to single quotes if on Mac/Linux):
exiftool -ext "*" "-filename<$filename.$filetype" TargetDir
or, as a dry run that only prints the proposed names (writing to TestName does not rename anything):
exiftool -ext "*" "-testname<%f.$filetype" TargetDir
That will simply add the extension to all the files in TargetDir. To recurse, add -r. If there was already an extension, this will add the proper extension after the false one, e.g. filename.mp3 would become filename.mp3.jpeg.
For a more complex version which strips away some of the previous, false extensions, you could try something like this:
exiftool -ext "*" "-filename<${filename;s/(\.(mp3|mp4|jpe?g|png|wma|mov))*($)//i}%-c.$filetype" TargetDir
which would strip away extensions that are in the center parens in the regex. The %-c will add a number if the resulting rename would be a duplicate e.g. filename.jpeg, filename-1.jpeg, … filename-n.jpeg.
Edit: added -ext option to deal with files without an extension.
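For anyone who wants to preview the stripping logic outside exiftool, here is a minimal Python sketch of the same idea as the longer command above. It only computes the new name; the detected type is passed in as a plain string and is assumed to come from a tool such as exiftool or python-magic. The names corrected_name and KNOWN_EXTS are hypothetical, and the duplicate handling that %-c provides is not reproduced.

```python
import re

# Extensions treated as "false" leftovers; mirrors the list in the regex above.
KNOWN_EXTS = ("mp3", "mp4", "jpg", "jpeg", "png", "wma", "mov")
_TRAILING = re.compile(r"(\.(?:%s))*$" % "|".join(KNOWN_EXTS), re.IGNORECASE)


def corrected_name(filename: str, detected_ext: str) -> str:
    """Strip any run of known extensions from the end, then append detected_ext."""
    stem = _TRAILING.sub("", filename, count=1)
    return f"{stem}.{detected_ext}"


if __name__ == "__main__":
    for name in ("filename", "filename1.jpg", "filename.mp3.mp3.mp3"):
        print(name, "->", corrected_name(name, "mp3"))
```

Extensions that are not in the list are left in place, just as with the regex in the exiftool command.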
I have been using this command to zip up some icon files:
zip -u icons.zip *.*
But for a project I needed to rename some .png files to filenames using only spaces and no extension, and now the command does not zip those files.
Any fix?
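The asterisk (*) matches any number of characters, and you can use it anywhere in a pattern. In a Unix shell, the pattern *.* only matches names that contain a dot, so files without an extension are skipped. Use a bare * instead: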
zip -u icons.zip *
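The difference between the two patterns can be checked with Python's fnmatch module, which implements shell-style wildcards. This is only an illustration with made-up file names, not what zip does internally.

```python
from fnmatch import fnmatchcase

# Hypothetical directory contents: one name with a dot, two without.
names = ["icon.png", "icon small", "README"]

# "*.*" needs a literal dot somewhere in the name.
with_dot = [n for n in names if fnmatchcase(n, "*.*")]

# "*" matches every name (a real shell would still skip hidden dotfiles).
everything = [n for n in names if fnmatchcase(n, "*")]

if __name__ == "__main__":
    print(with_dot)
    print(everything)
```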
I have multiple sequentially named files in one directory, each with an incrementing numeric extension. My objective is to use the rename command to change just the file extension.
IBM0020.DAT_001
IBM0020.DAT_002
IBM0020.DAT_003
IBM0021.DAT_001
IBM0021.DAT_002
IBM0022.DAT_001
IBM0022.DAT_002
IBM0022.DAT_003
IBM0022.DAT_004
...
to
IBM0020.DAT_001
IBM0020.DAT_002
IBM0020.DAT_003
IBM0021.DAT_004
IBM0021.DAT_005
IBM0022.DAT_006
IBM0022.DAT_007
IBM0022.DAT_008
IBM0022.DAT_009
...
I did a dry run of the command below, but did not get the expected result. I want to retain the filename and change only the extension to a running number sequence.
rename -n 's/.+/our $i;sprintf(".DAT_%03d",1+$i++)/e' *
IBM0020.DAT_001 renamed as .DAT_001
IBM0020.DAT_002 renamed as .DAT_002
IBM0020.DAT_003 renamed as .DAT_003
IBM0021.DAT_001 renamed as .DAT_004
IBM0021.DAT_002 renamed as .DAT_005
IBM0022.DAT_001 renamed as .DAT_006
IBM0022.DAT_002 renamed as .DAT_007
IBM0022.DAT_003 renamed as .DAT_008
IBM0022.DAT_004 renamed as .DAT_009
Thanks for any help.
Continuing from the comment, if all of your files have .DAT_XXX as the extension you wish to rename sequentially, then there is no need to include ".DAT_" as part of the pattern you are matching. Simply match the 3-digits at the end of the filename and change those, e.g.
rename 's/\d{3}$/our $i; sprintf("%03d", 1+$i++)/e' *
If ".DAT_" isn't unique, and you have other extensions ending in 3-digits you want to avoid renaming, then you can include "DAT_" as part of the pattern matched and replaced, e.g.
rename -n 's/DAT_\d{3}/our $i; sprintf("DAT_%03d", 1+$i++)/e' *
(Note: there are two different "rename" utilities in common use on Linux. The one provided as part of the util-linux package does not support regex renaming; perl-rename, which you have, does support Perl regex renaming.)
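If perl-rename is not available, the same renumbering can be sketched in Python. The function below only computes (old, new) pairs so the result can be inspected first, much like rename -n; the name renumber is hypothetical, and it assumes the files should be numbered in sorted order.

```python
import re


def renumber(names):
    """Return (old, new) pairs, replacing the trailing 3 digits with a running count."""
    pairs = []
    count = 0
    for old in sorted(names):
        # Only names that end in three digits consume a number.
        if re.search(r"\d{3}$", old):
            count += 1
            new = re.sub(r"\d{3}$", f"{count:03d}", old)
            pairs.append((old, new))
    return pairs


if __name__ == "__main__":
    sample = ["IBM0020.DAT_001", "IBM0020.DAT_002", "IBM0021.DAT_001"]
    for old, new in renumber(sample):
        print(old, "renamed as", new)
```

To apply the plan you could loop over the pairs with os.rename(old, new), after checking that no new name collides with a file that has not been renamed yet.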
I need to recognize files with different extensions, even when several extensions are combined.
So if my cwd has these files:
file-1.zip
file-2.tar
file-3.tar.gz
file-4.gz
file-5.zip.tar
file-6.tar.gz
file-7.gz
I need to tell bash what to do when the extension (in this case) is:
zip
tar
zip.tar
tar.gz
gz
For every extension I need to do different things: if the extension is only .tar or only .gz I need to do certain things, but if the extension is .tar.gz I need to run another snippet.
For example, if the filename has a .tar extension I need to do:
# stuff
tar xf filename.tar
# other stuff
If the filename has a .zip.tar extension I need to run more complex code. The code is not totally dependent on the extensions, though; my only objective is to get the full extension of the filename (filename.tar.gz should return .tar.gz instead of .gz or .tar).
Also, is there any way using gawk?
Use a case statement:
case "$filename" in
    *.tar.gz) code for .tar.gz ;;
    *.gz) code for .gz ;;
    *.zip.tar) code for .zip.tar ;;
    *.tar) code for .tar ;;
    ...
esac
Just make sure you put the combined extensions before the single extensions that they contain, because case executes the statements for the first pattern that matches.
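The same "longest match first" idea can be sketched in Python, in case it helps to see it outside of bash. The list KNOWN_EXTENSIONS and the function full_extension are hypothetical names; the list simply contains the extensions from the question.

```python
# Extensions from the question; order does not matter because we sort by length.
KNOWN_EXTENSIONS = [".zip", ".tar", ".zip.tar", ".tar.gz", ".gz"]


def full_extension(filename):
    """Return the longest known extension that filename ends with, or ''."""
    for ext in sorted(KNOWN_EXTENSIONS, key=len, reverse=True):
        if filename.endswith(ext):
            return ext
    return ""


if __name__ == "__main__":
    for name in ("file-3.tar.gz", "file-5.zip.tar", "file-4.gz"):
        print(name, "->", full_extension(name))
```

Sorting by length plays the same role as putting the combined patterns first in the case statement: .tar.gz is tried before .gz.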
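The file command is a good option for detecting file types; you can then build your logic on its output. Note that in the sample output test.tar.gz is reported as gzip, i.e. only the outer format is shown.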
file -i test.*
test.gz: application/x-gzip; charset=binary
test.tar: application/x-tar; charset=binary
test.tar.gz: application/x-gzip; charset=binary
test.zip: application/zip; charset=binary
Ok so I kinda dropped the ball. I was trying to understand how things work. I had a few html files on my computer that I was trying to rename as txt files. This was strictly a learning exercise. Following the instructions I found here using this code:
for file in *.html
do
mv "$file" "${file%.html}.txt"
done
produced this error:
mv: rename *.html to *.txt: No such file or directory
Long story short I ended up going rogue and renaming the html files, as well as a lot of other non html files as txt files. So now I have files labeled like
my_movie.mp4.txt
my_song.mp3.txt
my_file.txt.txt
This may be a really dumb question but.. Is there a way to check if a file has two extensions and if yes remove the last one? Or any other way to undo this mess?
EDIT
Running find . -name "*.*.txt" -exec echo {} \; | cat -b seems to tell me what was changed and where it is located. The cat -b part is not necessary, but I like it. This still doesn't fix what I broke, though.
I'm not sure whether the terminal can check for extensions "twice", but you can check for . in every name, and if there is more than one occurrence of ., then your file has more than one extension. You can then cut off the last extension by finding the first occurrence of . when scanning the string backwards, or the last one when scanning it forwards.
I have a faster option for you if you can use python. You can strip the extension with:
import os
for file in list_of_files:
    os.rename(file, os.path.splitext(file)[0])
which turns your file.txt.txt into file.txt.
Example:
You wrote that your command tells you what has changed, so just take those changed files and dump them into a file (one file path per line). Then you can easily run this:
import os
with open('<path to list>') as f:
    list_of_files = f.read().splitlines()
for file in list_of_files:
    os.rename(file, os.path.splitext(file)[0])
If not, then you'd need to get the list from python:
import os
results = []
# Collect every file below the folder whose name ends with .txt.txt
for root, folder, filenames in os.walk('<your path to folder>'):
    for filename in filenames:
        if filename.endswith('.txt.txt'):
            results.append(os.path.join(root, filename))
With this you get a list of files ending with .txt.txt, each of the form <your folder>/<path_to_file>.
Each entry already starts with the path you passed to os.walk(). If that was an absolute path such as '/home/me/directory', leave path empty below. If it was a relative path such as 'directory' and you run the rename from somewhere else, set path to its parent, e.g. '/home/me/'.
path = ''  # set the parent path here, or leave it empty
for res in results:
    file = os.path.join(path, res)
    os.rename(file, os.path.splitext(file)[0])
Depending on which files you want to find, change .txt.txt in if filename.endswith('...') to whatever you like. os.rename() is given the file's name without its last extension, which in your case strips the additional extension you don't want.
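Putting the pieces together, here is a minimal sketch that covers all the double extensions from the question (my_movie.mp4.txt, my_song.mp3.txt, my_file.txt.txt) in one pass, using pathlib instead of os.walk. The function name strip_extra_txt and the dry_run flag are hypothetical; by default nothing is renamed and the function only returns the planned renames.

```python
from pathlib import Path


def strip_extra_txt(root, dry_run=True):
    """Remove a trailing '.txt' from files that still have another extension."""
    planned = []
    # Materialise the listing first so renaming does not disturb the iteration.
    for path in sorted(Path(root).rglob("*.*.txt")):
        if not path.is_file() or path.name.startswith("."):
            continue  # skip directories and hidden files
        target = path.with_suffix("")  # my_song.mp3.txt -> my_song.mp3
        if target.exists():
            continue  # never overwrite an existing file
        planned.append((path, target))
        if not dry_run:
            path.rename(target)
    return planned


if __name__ == "__main__":
    for old, new in strip_extra_txt("."):
        print(old, "->", new)
```

Run it once with the default dry_run=True to review the list, then call strip_extra_txt(folder, dry_run=False) to perform the renames. Files with a single extension, such as notes.txt, are not touched.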
I'm using ImageMagick to do some image processing from the command line, and would like to operate on a list of files as specified in foo.txt. From the instructions here: http://www.imagemagick.org/script/command-line-processing.php I see that I can use Filename References from a file prefixed with @. When I run something like:
montage @foo.txt output.jpg
everything works as expected, as long as foo.txt is in the current directory. However, when I try to access bar.txt in a different directory by running:
montage /some_directory/@bar.txt output2.jpg
I get:
montage: unable to open image /some_directory/@bar.txt: No such file or directory @ blob.c/OpenBlob/2480.
I believe the issue is my syntax, but I'm not sure what to change it to. Any help would be appreciated.
Quite an old entry, but it seems relatively obvious that you need to put the @ before the full path:
montage @/some_directory/bar.txt output2.jpg
As of ImageMagick 6.5.4-7 2014-02-10, paths are not supported with the @ syntax. The @ file must be in the current directory and identified by name only.
I haven't tried directing IM to pull the list of files from a file, but I do specify multiple files on the command line like this:
gm -sOutputFile=dest.ext -f file1.ppm file2.ppm file3.ppm
Can you pull the contents of that file into a variable, and then let the shell expand that variable?
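One way to try that idea without relying on shell expansion is to read the list file yourself and pass the names as separate arguments. Below is a minimal Python sketch; the function name montage_command is hypothetical, and it assumes the list file holds one image path per line, with relative paths resolved against the directory you run the command from.

```python
from pathlib import Path


def montage_command(list_file, output):
    """Build an argument list for montage from a file with one image path per line."""
    lines = Path(list_file).read_text().splitlines()
    images = [line.strip() for line in lines if line.strip()]  # drop blank lines
    return ["montage", *images, output]
```

The result can be handed to subprocess.run(montage_command("/some_directory/bar.txt", "output2.jpg"), check=True). Because the arguments are passed as a list, file names containing spaces need no extra quoting.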