One liner terminal command for renaming all files in directory to their hash values - linux

I am new to bash loops and am trying to rename all files in a directory to their MD5 hash values.
There are 5 sample files in the directory.
As a first test, I print the MD5 hashes of all files in the directory with the command below, and it works fine.
for i in `ls`; do md5sum $i; done
Output:
edc47be8af3a7d4d55402ebae9f04f0a file1
72cf1321d5f3d2e9e1be8abd971f42f5 file2
4b7b590d6d522f6da7e3a9d12d622a07 file3
357af1e7f8141581361ac5d39efa4d89 file4
1445c4c1fb27abd9061ada3b30a18b44 file5
Now I am trying to rename each file to its MD5 hash with the following command:
for i in `ls`; do mv $i `md5sum $i`; done
Failed Output:
mv: target 'file1' is not a directory
mv: target 'file2' is not a directory
mv: target 'file3' is not a directory
mv: target 'file4' is not a directory
mv: target 'file5' is not a directory
What am I missing here?

Your command is expanded to
mv file1 edc47be8af3a7d4d55402ebae9f04f0a file1
When mv has more than two non-option arguments, it understands the last argument to be the target directory to which all the preceding files should be moved. But there's no directory file1.
You can use parameter expansion to remove the filename from the string.
Parameter expansion is usually faster than running an external command like cut or sed, but unless you are renaming thousands of files, it probably doesn't matter.
for f in *; do
m=$(md5sum "$f")
mv -- "$f" "${m%% *}" # Remove everything from the first space onward
done
Also note that I don't parse the output of ls, but let the shell expand the glob. It's safer and works (with proper quoting) for filenames containing whitespace.
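The same idea can be wrapped in a small function. This is only a sketch under a few assumptions: rename_to_md5 is a hypothetical name, md5sum is the GNU coreutils tool, and the function reads each file on stdin (so the output is always "hash  -") and skips a file rather than overwrite an existing name.

```bash
# Hypothetical helper: rename every regular file in a directory to its MD5 hash.
# Usage: rename_to_md5 [directory]   (defaults to the current directory)
rename_to_md5() {
    dir=${1:-.}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue           # skip directories and an unmatched glob
        sum=$(md5sum < "$f") || continue  # stdin input: output is "<hash>  -"
        sum=${sum%% *}                    # keep only the hash
        [ -e "$dir/$sum" ] && continue    # never overwrite an existing file
        mv -- "$f" "$dir/$sum"
    done
}
```

Two files with identical content have the same hash, so the second one is left under its old name instead of replacing the first.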

The problem was my syntax.
After some trial and error with the command, I finally came up with the correct syntax.
I noticed that md5sum $i was giving me two-column output:
edc47be8af3a7d4d55402ebae9f04f0a file1
72cf1321d5f3d2e9e1be8abd971f42f5 file2
4b7b590d6d522f6da7e3a9d12d622a07 file3
357af1e7f8141581361ac5d39efa4d89 file4
1445c4c1fb27abd9061ada3b30a18b44 file5
By running the second command, for i in `ls`; do mv $i `md5sum $i`; done, I was basically telling the terminal to do something like:
mv $i `md5sum $i`
which, to my knowledge, expands to
mv file1 <md5 value> file1 <-- this was the issue.
How did I resolve the issue?
I used the cut command to extract the required value and wrote a new one-liner:
for i in `ls`; do mv $i "$(md5sum $i | cut -d " " -f 1)"; done
[Edit]
According to another answer and comments by #stark, #choroba and #tripleee, it is better to use * instead of ls.
for i in *; do mv -- "$i" "$(md5sum "$i" | cut -d " " -f 1)"; done
#choroba's answer is also a good addition here. Turned into a one-liner, his solution is:
for i in *; do m=$(md5sum "$i"); mv -- "$i" "${m%% *}"; done

Related

Rename file by swapping text

I need to rename files by swapping some text.
For example, I have:
CATEGORIE_2017.pdf
CLASSEMENT_2016.pdf
CATEGORIE_2018.pdf
PROPRETE_2015.pdf
...
and I want them
2017_CATEGORIE.pdf
2016_CLASSEMENT.pdf
2018_CATEGORIE.pdf
2015_PROPRETE.pdf
I came up with this bash version :
ls *.pdf | while read i
do
new_name=$(echo $i |sed -e 's/\(.*\)_\(.*\)\.pdf/\2_\1\.pdf/')
mv $i $new_name
echo "---"
done
It is efficient but seems quite clumsy to me. Does anyone have a smarter solution, for example with rename ?
Using rename you can do the renaming like this:
rename -n 's/([^_]+)_([^.]+)\.pdf/$2_$1.pdf/g' *.pdf
With the -n option, rename changes nothing; it only prints what would happen. If you are satisfied, remove the -n option.
I use [^_]+ and [^.]+ to capture the parts of the filename before and after the _. The syntax [^_] means any character except _.
One way:
ls *.pdf | awk -F"[_.]" '{print "mv "$0" "$2"_"$1"."$3}' | sh
Using awk, swap the positions and form the mv command and pass it to shell.
Using only bash:
for file in *_*.pdf; do
no_ext=${file%.*}
new_name=${no_ext##*_}_${no_ext%_*}.${file##*.}
mv -- "$file" "$new_name"
done
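The parameter expansions in the loop above can be checked without touching any files. A minimal sketch, where swapped_name is a hypothetical helper that only prints the new name:

```bash
# Hypothetical helper: print NAME_YEAR.ext as YEAR_NAME.ext.
swapped_name() {
    file=$1
    ext=${file##*.}     # text after the last dot, e.g. pdf
    stem=${file%.*}     # name without the extension, e.g. CATEGORIE_2017
    # ${stem##*_} is the part after the last _, ${stem%_*} the part before it
    printf '%s_%s.%s\n' "${stem##*_}" "${stem%_*}" "$ext"
}

swapped_name CATEGORIE_2017.pdf
```

Once the printed names look right, the helper can drive the rename:
for f in *_*.pdf; do mv -- "$f" "$(swapped_name "$f")"; done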

Rename part of file name based on exact match in contents of another file

I would like to rename a bunch of files by changing only one part of the file name and doing that based on an exact match in a list in another file. For example, if I have these file names:
sample_ACGTA.txt
sample_ACGTA.fq.abc
sample_ACGT.txt
sample_TTTTTC.tsv
sample_ACCCGGG.fq
sample_ACCCGGG.txt
otherfile.txt
and I want to find and replace based on these exact matches, which are found in another file called replacements.txt:
ACGT name1
TTTTTC longername12
ACCCGGG nam7
ACGTA another4
So that the desired resulting file names would be
sample_another4.txt
sample_another4.fq.abc
sample_name1.txt
sample_longername12.tsv
sample_nam7.fq
sample_nam7.txt
otherfile.txt
I do not want to change the contents. So far I have tried sed and mv based on my search results on this website. With sed I found out how to replace the contents of the file using my list:
while read from to; do
sed -i "s/$from/$to/" infile ;
done < replacements.txt
and with mv I have found a way to rename files if there is one simple replacement:
for files in sample_*; do
mv "$files" "${files/ACGTA/another4}"
done
But how can I put them together to do what I would like?
Thank you for your help!
You can combine your for and while loops and use only mv:
while read -r from to ; do
for i in sample_* ; do
if [ "$i" != "${i/$from/$to}" ] ; then
mv -- "$i" "${i/$from/$to}"
fi
done
done < replacements.txt
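One caveat: ${i/$from/$to} replaces the pattern wherever it occurs, so ACGT also matches inside sample_ACGTA.txt. Since the question asks for exact matches, here is a sketch that compares the whole token between sample_ and the first dot. The function name exact_rename and its two arguments are assumptions made for illustration.

```bash
# Hypothetical helper: rename sample_<key><ext> to sample_<new><ext>,
# but only when <key> equals a first-column entry of the list exactly.
# Usage: exact_rename directory replacements-file
exact_rename() {
    dir=$1
    list=$2
    for src in "$dir"/sample_*; do
        [ -e "$src" ] || continue
        name=${src##*/}        # strip the directory part
        rest=${name#sample_}   # e.g. ACGTA.fq.abc
        key=${rest%%.*}        # e.g. ACGTA
        ext=${rest#"$key"}     # e.g. .fq.abc (may be empty)
        while read -r from to; do
            if [ "$key" = "$from" ]; then
                mv -- "$src" "$dir/sample_$to$ext"
                break
            fi
        done < "$list"
    done
}
```

Because the comparison is on the whole token, the order of the lines in replacements.txt does not matter.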
An alternative solution with sed could consist in using the e command that executes the result of a substitution (Use with caution! Try without the ending e first to print what commands would be executed).
Hence:
sed 's/\(\w\+\)\s\+\(\w\+\)/mv sample_\1\.txt sample_\2\.txt/e' replacements.txt
would parse your replacements.txt file and rename all your .txt files as desired.
We just have to add a loop to deal with the other extensions:
for j in .txt .bak .tsv .fq .fq.abc ; do
sed "s/\(\w\+\)\s\+\(\w\+\)/mv 'sample_\1$j' 'sample_\2$j'/e" replacements.txt
done
(Note that you should get error messages when it tries to rename non-existing files, for example when it tries to execute mv sample_ACGT.fq sample_name1.fq but file sample_ACGT.fq does not exist)
You could use awk to generate commands:
% awk '{print "for files in sample_*; do mv $files ${files/" $1 "/" $2 "}; done" }' replacements.txt
for files in sample_*; do mv $files ${files/ACGT/name1}; done
for files in sample_*; do mv $files ${files/TTTTTC/longername12}; done
for files in sample_*; do mv $files ${files/ACCCGGG/nam7}; done
for files in sample_*; do mv $files ${files/ACGTA/another4}; done
Then either copy/paste or pipe the output directly to your shell:
% awk '{print "for files in sample_*; do mv $files ${files/" $1 "/" $2 "}; done" }' replacements.txt | bash
If you want the longer match string to be used first, sort the replacements first:
% sort -r replacements.txt | awk '{print "for files in sample_*; do mv $files ${files/" $1 "/" $2 "}; done" }' | bash

How to remove the first 2 letters of multiple file names in linux shell?

I have files with the names:
Ff6_01.png
Ff6_02.png
Ff6_03.png
...
...
FF1_01.png
FF1_02.png
FF1_03.png
I want to remove the first two letters of every file name, because then I would have a correct order of the files. Does anyone know the command in the linux shell?
You can use the syntax ${file:2} to refer to the name starting from the 3rd char.
Hence, you may do:
for file in F*png
do
mv "$file" "${file:2}"
done
If ${file:2} does not work for you (and rename is not an option either), you can also use sed or cut:
for file in F*png
do
new_file=$(sed 's/^..//' <<< "$file") # cuts the first two chars
# new_file=$(cut -c3- <<< "$file") # the same, using cut
mv "$file" "$new_file"
done
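The ${file:2} substring form is a bash extension. In a plain POSIX shell, ${file#??} removes the first two characters instead. A small sketch, assuming a hypothetical function name strip_two and adding a check so that an existing file is never overwritten:

```bash
# Hypothetical helper: drop the first two characters of each F*png file name.
# Usage: strip_two [directory]   (defaults to the current directory)
strip_two() {
    dir=${1:-.}
    for src in "$dir"/F*png; do
        [ -e "$src" ] || continue   # glob matched nothing
        name=${src##*/}
        new=${name#??}              # POSIX: remove the two leading characters
        if [ -e "$dir/$new" ]; then
            echo "skipping $name: $new already exists" >&2
            continue
        fi
        mv -- "$src" "$dir/$new"
    done
}
```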
Test
$ file="Ff6_01.png"
$ touch $file
$ ls
Ff6_01.png
$ mv "$file" "${file:2}"
$ ls
6_01.png

Script for renaming files with logic

Someone has very kindly helped me get started on a mass rename script for renaming PDF files.
As you can see, I need to add a bit of logic to stop the collision below from happening, so something like adding a unique number to a duplicate file name?
rename 's/^(.{5}).*(\..*)$/$1$2/' *
rename -n 's/^(.{5}).*(\..*)$/$1$2/' *
Annexes 123114345234525.pdf renamed as Annex.pdf
Annexes 123114432452352.pdf renamed as Annex.pdf
Hope this makes sense?
Thanks
for i in *
do
x='' # counter
j="${i:0:2}" # new name
e="${i##*.}" # ext
while [ -e "$j$x.$e" ] # try to find an unused name
do
((x++)) # inc counter
done
mv -- "$i" "$j$x.$e" # rename
done
before
$ ls
he.pdf hejjj.pdf hello.pdf wo.pdf workd.pdf world.pdf
after
$ ls
he.pdf he1.pdf he2.pdf wo.pdf wo1.pdf wo2.pdf
This should check whether there will be any duplicates:
rename -n [...] | grep -o ' renamed as .*' | sort | uniq -d
If you get any output of the form renamed as [...], then you have a collision.
Of course, this won't work in a couple of corner cases, for example if your file names contain newlines or the literal string renamed as.
As noted in my answer on your previous question:
for f in *.pdf; do
tmp=$(echo "$f" | sed -r 's/^(.{5}).*(\..*)$/\1\2/')
mv -b ./"$f" ./"$tmp"
done
With -b, mv makes a backup of any file it would otherwise overwrite. A better alternative would be this script:
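#!/bin/bash
for f in "$@"; do
dir=$(dirname -- "$f"); name=$(basename -- "$f")
tar -rvf /tmp/backup.tar "$f" # back up the original before renaming
base=$(printf '%s\n' "$name" | sed -r 's/^(.{5}).*(\..*)$/\1\2/')
tmp=$base
i=1
while [ -e "$dir/$tmp" ]; do # name taken: try Annex-1.pdf, Annex-2.pdf, ...
tmp=$(printf '%s\n' "$base" | sed "s/\./-$i./")
i=$((i+1))
done
mv -b -- "$f" "$dir/$tmp"
done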
Run the script like this:
find . -type f -exec thescript '{}' \;
The find command gives you lots of options for specifying which files to run on, works recursively, and passes the filenames to the script one at a time. The script backs every file up with tar (uncompressed) and then renames it.
This isn't the best script, since it isn't smart enough to avoid the manual loop and check for identical file names.
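The counter idea from the answers above can be sketched in a few lines. The function name shorten_unique, the -N suffix format, and the restriction to *.pdf are all assumptions chosen for illustration, not what any of the answers prescribes.

```bash
# Hypothetical helper: shorten each PDF name to its first 5 characters and
# append -1, -2, ... when the short name is already taken.
# Usage: shorten_unique [directory]   (defaults to the current directory)
shorten_unique() {
    dir=${1:-.}
    for src in "$dir"/*.pdf; do
        [ -f "$src" ] || continue
        name=${src##*/}
        stem=$(printf '%s\n' "${name%.pdf}" | cut -c1-5)
        new=$stem.pdf
        [ "$new" = "$name" ] && continue   # already short enough
        n=1
        while [ -e "$dir/$new" ]; do       # find the first unused name
            new=$stem-$n.pdf
            n=$((n + 1))
        done
        mv -- "$src" "$dir/$new"
    done
}
```

With the two files from the question, one ends up as Annex.pdf and the other as Annex-1.pdf.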

Unix: How to delete files listed in a file

I have a long text file with a list of file masks I want to delete
Example:
/tmp/aaa.jpg
/var/www1/*
/var/www/qwerty.php
I need to delete them. I tried rm `cat 1.txt` and it says the list is too long.
I found this command, but when I check folders from the list, some of them still have files:
xargs rm <1.txt
A manual rm call removes files from such folders, so it is not a permissions issue.
This is not very efficient, but will work if you need glob patterns (as in /var/www/*)
for f in $(cat 1.txt) ; do
rm "$f"
done
If you don't have any patterns and are sure your paths in the file do not contain whitespaces or other weird things, you can use xargs like so:
xargs rm < 1.txt
Assuming that the list of files is in the file 1.txt, then do:
xargs rm -r <1.txt
The -r option causes recursion into any directories named in 1.txt.
If any files are read-only, use the -f option to force the deletion:
xargs rm -rf <1.txt
Be cautious with input to any tool that does programmatic deletions. Make certain that the files named in the input file are really to be deleted. Be especially careful about seemingly simple typos. For example, if you enter a space between a file and its suffix, it will appear to be two separate file names:
file .txt
is actually two separate files: file and .txt.
This may not seem so dangerous, but if the typo is something like this:
myoldfiles *
Then instead of deleting all files that begin with myoldfiles, you'll end up deleting myoldfiles and all non-dot-files and directories in the current directory. Probably not what you wanted.
Use this:
while IFS= read -r file ; do rm -- "$file" ; done < delete.list
If you need glob expansion you can omit quoting $file:
IFS=""
while read -r file ; do rm -- $file ; done < delete.list
But be warned that file names can contain "problematic" content, and I would not use the unquoted version. Imagine these patterns in the file:
*
*/*
*/*/*
This would delete quite a lot from the current directory! I would encourage you to prepare the delete list in a way that glob patterns aren't required anymore, and then use quoting like in my first example.
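A cautious way to apply the quoted variant is to print what would be removed first. The sketch below assumes a hypothetical function delete_listed with a --force switch. Every line is treated as a literal path, so a line containing * only matches a file that is actually named *.

```bash
# Hypothetical helper: remove the paths listed one per line in a file.
# Without --force it only prints what it would remove.
# Usage: delete_listed list-file [--force]
delete_listed() {
    list=$1
    mode=${2:-}
    # "|| [ -n ... ]" also handles a last line without a trailing newline
    while IFS= read -r target || [ -n "$target" ]; do
        [ -n "$target" ] || continue       # ignore empty lines
        if [ "$mode" = "--force" ]; then
            rm -f -- "$target"             # quoted: no globbing, no word splitting
        else
            printf 'would remove: %s\n' "$target"
        fi
    done < "$list"
}
```

Running delete_listed delete.list shows the plan; delete_listed delete.list --force carries it out.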
You could use '\n' to define the newline character as the delimiter.
xargs -d '\n' rm < 1.txt
Be careful with -rf, because it can delete things you don't want deleted if 1.txt contains paths with spaces. That's why the newline delimiter is a bit safer.
On BSD systems, where xargs may lack the -d option, you could convert the newlines to NUL characters and use the -0 option, which splits the input on NUL:
tr '\n' '\0' < 1.txt | xargs -0 rm
xargs -I{} sh -c 'rm "{}"' < 1.txt should do what you want. Be careful with this command as one incorrect entry in that file could cause a lot of trouble.
This answer was edited after #tdavies pointed out that the original did not do shell expansion.
You can use this one-liner:
cat 1.txt | xargs echo rm | sh
Which does shell expansion but executes rm the minimum number of times.
Just to provide another way, you can also simply use the following command
$ cat to_remove
/tmp/file1
/tmp/file2
/tmp/file3
$ rm $( cat to_remove )
In this particular case, due to the dangers cited in other answers, I would:
1. Edit the file in e.g. Vim and run :%s/\s/\\\0/g, escaping all whitespace characters with a backslash.
2. Then run :%s/^/rm -rf /, prepending the command. With -r you don't have to make sure directories are listed after the files they contain, and with -f it won't complain about missing files or duplicate entries.
3. Run all the commands: $ source 1.txt
To remove the listed files only:
cat 1.txt | xargs rm -f
To also remove listed directories recursively:
cat 1.txt | xargs rm -rf
Here's another looping example. This one also contains an 'if-statement' as an example of checking to see if the entry is a 'file' (or a 'directory' for example):
for f in $(cat 1.txt); do if [ -f "$f" ]; then rm -- "$f"; fi; done
Here you can use a set of folders from deletelist.txt while avoiding some patterns as well (csh/tcsh syntax):
foreach f (`cat deletelist.txt`)
rm -rf `ls | egrep -v "needthisfile|*.cpp|*.h"`
end
This will allow file names to have spaces (reproducible example).
# Select files of interest, here, only text files for ex.
find -type f -exec file {} \; > findresult.txt
grep ": ASCII text$" findresult.txt > textfiles.txt
# leave only the path to the file, removing suffix and prefix
sed -i -e 's/:.*$//' textfiles.txt
sed -i -e 's/^\.\///' textfiles.txt
# write a script that deletes the files in textfiles.txt,
# reading one path per line so that spaces are preserved
while IFS= read -r f
do
rm -- "$f"
done < textfiles.txt
# save script as "some.sh" and run: sh some.sh
In case somebody prefers sed and removing without wildcard expansion:
sed -e "s/^\(.*\)$/rm -f -- \'\1\'/" deletelist.txt | /bin/sh
Reminder: use absolute pathnames in the file or make sure you are in the right directory.
And for completeness the same with awk:
awk '{printf "rm -f -- '\''%s'\''\n",$0}' deletelist.txt | /bin/sh
Wildcard expansion will work if the single quotes are removed, but this is dangerous if a filename contains spaces. In that case, the quotes would have to go around everything except the wildcards.
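Both generators above wrap each name in single quotes, which breaks as soon as a name itself contains a single quote. One way around this, shown here only as a sketch with a hypothetical helper shell_quote, is to replace every embedded ' with '\'' before adding the outer quotes:

```bash
# Hypothetical helper: print its argument as one single-quoted shell word.
shell_quote() {
    # 1) turn each ' into '\''   2) add the opening quote   3) add the closing quote
    printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"
}

# Print (not run) an rm command for a name with a quote, a space and a wildcard.
printf 'rm -f -- %s\n' "$(shell_quote "it's a *.txt")"
```

The printed command can be inspected first and only piped to /bin/sh once it looks right.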

Resources