Shell script for renaming and rearranging files - linux

I would like to rearrange and rename files.
I have this tree of files:
ada/rda/0.05/alpha1_freeSurface.md
ada/rda/0.05/p_freeSurface.md
ada/rda/0.05/U_freeSurface.md
ada/rda/0.1/alpha1_freeSurface.md
ada/rda/0.1/p_freeSurface.md
ada/rda/0.1/U_freeSurface.md
I want the files to be renamed and rearranged into the structure below:
ada/rda/ada-0.05-alpha1.md
ada/rda/ada-0.05-p.md
ada/rda/ada-0.05-U.md
ada/rda/ada-0.1-alpha1.md
ada/rda/ada-0.1-p.md
ada/rda/ada-0.1-U.md

Using the perl rename (sometimes called prename) utility:
rename 's|ada/rda/([^/]*)/([^_]*).*|ada/rda/ada-$1-$2.md|' ada/rda/*/*
(Note: by default, some distributions install a rename command from the util-linux package. This command is incompatible. If you have such a distribution, see if the perl version is available under the name prename.)
How it works
rename takes a Perl expression as an argument. Here the argument consists of a single substitute command. The new name for the file is found by applying the substitute command to the old name. This allows us to give the file not only a new name but also a new directory, as above.
In more detail, the substitute command looks like s|old|new|. In our case, old is ada/rda/([^/]*)/([^_]*).*. This captures the number in group 1 and the beginning of the filename (the part before the first _) in group 2. The new part is ada/rda/ada-$1-$2.md. This creates the new file name using the two captured groups.
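To check the pattern before renaming anything, the same substitution can be previewed on a single path. This is a minimal sketch that uses sed rather than rename, so it only prints the result; preview_new_name is a hypothetical helper name, and sed writes the captured groups as \1 and \2 where Perl writes $1 and $2.

```shell
# Print the name a path would be mapped to, without touching any file.
# Same pattern as the rename command, written for sed -E.
preview_new_name() {
    printf '%s\n' "$1" |
        sed -E 's|ada/rda/([^/]*)/([^_]*).*|ada/rda/ada-\1-\2.md|'
}

preview_new_name ada/rda/0.05/alpha1_freeSurface.md
preview_new_name ada/rda/0.1/U_freeSurface.md
```

The two calls print ada/rda/ada-0.05-alpha1.md and ada/rda/ada-0.1-U.md. The Perl rename also accepts -n to show what it would do without renaming.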

You can use basename and dirname functions to reconstruct the new filename:
get_new_name()
{
    oldname=$1
    # quote every expansion so names with spaces survive
    prefix=$(basename "$oldname" _freeSurface.md)
    dname=$(dirname "$oldname")
    basedir=$(dirname "$dname")
    dname=$(basename "$dname")
    echo "$basedir/ada-$dname-$prefix.md"
}
e.g. get_new_name ada/rda/0.05/alpha1_freeSurface.md prints ada/rda/ada-0.05-alpha1.md to the console.
Then, you can loop through all your files and use mv command to rename the files.
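The loop mentioned above could look like the sketch below. The function is repeated so the snippet runs on its own; rename_free_surface_files is a hypothetical name, and the directory layout is assumed to be the one from the question.

```shell
# Build the new path from the old one (same logic as get_new_name above).
get_new_name() {
    oldname=$1
    prefix=$(basename "$oldname" _freeSurface.md)
    dname=$(dirname "$oldname")
    basedir=$(dirname "$dname")
    dname=$(basename "$dname")
    echo "$basedir/ada-$dname-$prefix.md"
}

# Move every <root>/<number>/<name>_freeSurface.md up one level.
rename_free_surface_files() {
    root=${1:-ada/rda}
    for f in "$root"/*/*_freeSurface.md; do
        [ -e "$f" ] || continue          # the glob matched nothing
        mv -- "$f" "$(get_new_name "$f")"
    done
}

rename_free_surface_files ada/rda
```

Putting echo in front of mv turns this into a dry run. The now empty number directories are left in place and can be removed with rmdir.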

Related

moving files from a folder into subfolders based on the prefix number with Linux

I'm relatively new to bash and I have tried multiple solutions that I could find here, but none of them seem to work in my case. It's pretty simple: I have a folder that looks like this:
- images/
    - 0_image_1.jpg
    - 0_image_2.jpg
    - 0_image_3.jpg
    - 1_image_1.jpg
    - 1_image_2.jpg
    - 1_image_3.jpg
and I would like to move these jpg files into subfolders based on the prefix number like so:
- images_0/
    - 0_image_1.jpg
    - 0_image_2.jpg
    - 0_image_3.jpg
- images_1/
    - 1_image_1.jpg
    - 1_image_2.jpg
    - 1_image_3.jpg
Is there a bash command that could do that in a simple way?
Thank you
for src in *_*.jpg; do
dest=images_${src%%_*}/
echo mkdir -p "$dest"
echo mv -- "$src" "$dest"
done
Remove both echos if the output looks good.
I would do this with rename a.k.a. Perl rename. It is extremely powerful and performant. Here's a command for your use case:
rename --dry-run -p '$_="images_" . substr($_,0,1) . "/" . $_' ?_*jpg
Let's dissect that. At the right end, we specify we only want to work on files that start with a single character/digit before an underscore, so we don't do damage by applying the command to files it wasn't meant for. Then --dry-run means it doesn't actually do anything, it just shows you what it would do, which is a very useful feature. Then -p, which handily means "create any necessary directories for me as you go". Then the meat of the command. rename passes you the current filename in a variable called $_, and you assign the name you want the file to have back to $_. In this case we just want the word images_ followed by the first character of the existing filename, then a slash and the original name. Simples!
Sample Output
'0_image_1.jpg' would be renamed to 'images_0/0_image_1.jpg'
'0_image_2.jpg' would be renamed to 'images_0/0_image_2.jpg'
'1_image_3.jpg' would be renamed to 'images_1/1_image_3.jpg'
Remove the --dry-run and run again for real, if the output looks good.
Using rename has several benefits:
that it will warn and avoid any conflicts if two files rename to the same thing,
that it can rename across directories, creating any necessary intermediate directories on the way,
that you can do a dry run first to test it,
that you can use arbitrarily complex Perl code to specify the new name.
Note: On macOS, you can install rename using homebrew:
brew install rename
Note: On some systems, rename is referred to as prename, for Perl rename.

Recursively appending names of all files in a directory with exif specific png meta data field (aesthetic_score) with linux / EXIFtool

I am trying to rename all files located in a directory (recursively) by appending a specific metadata field to the end of each png file name.
The metadata field name is "aesthetic_score", with values ranging from 1.0 to 9.0.
When I type:
exiftool -Aesthetic_score -G1 -s testn.png
the result is:
[PNG] Aesthetic_score : 7.0
This is how I would like to rename the png files, recursively within a directory.
Note: I would like to replace the word aesthetic with the word chad in the appended text, and not all files will have this data field:
input file:
filename001.png (metadata aesthetic_score:7.0)
output:
filename001-chad-score-70.png
I tried to use Digikam and JExifToolGui-2.01, without success.
I am trying to perform this task in the cmd line, although other solutions are welcome. Thank you for your help.
So, this might work for you, I can't really test it; note that you would need to get rid of the echo before the mv for it to actually do something (rename rather than just show what it would do).
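while IFS= read -r name
do
    # awk remembers FileName, then prints <base>-chad-score-<digits>.<ext> when it sees the score line
    newname=$(exiftool -G1 -s "$name" | awk '$2=="FileName"{f=$4} $2=="Aesthetic_score"{b=f; sub(/\.[^.]*$/,"",b); e=f; sub(/^.*\./,"",e); gsub(/\./,"",$4); print b"-chad-score-"$4"."e}')
    # skip files without the field, and keep each file in its own directory
    [ -n "$newname" ] && echo mv "$name" "$(dirname "$name")/$newname"
done < <(find . -iname '*.png')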
Basically the find at the very end finds all the pngs.
The while loop takes each name that find emits, passes the file through exiftool (using your specs), and parses the output using awk, which prints the new name; that gets captured in the shell variable newname.
And finally the mv (without the echo) renames the files.
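The name construction can also be done in the shell itself, which avoids parsing exiftool's table output. This is a sketch under assumptions: chad_name and rename_by_score are hypothetical helper names, the tag is assumed to be readable as Aesthetic_score as in the question, and exiftool's -s3 option is used to print only the tag value. Only chad_name has been exercised here; the exiftool loop is shown uncalled and prints the mv commands instead of running them.

```shell
# Build "<dir>/<stem>-chad-score-<digits>.<ext>" from a path and a score like 7.0.
chad_name() {
    path=$1
    score=$2
    dir=$(dirname "$path")
    file=$(basename "$path")
    stem=${file%.*}                                  # name without extension
    ext=${file##*.}                                  # extension only
    digits=$(printf '%s' "$score" | tr -d '.')       # 7.0 -> 70
    printf '%s/%s-chad-score-%s.%s\n' "$dir" "$stem" "$digits" "$ext"
}

# Dry run over a tree (bash): prints one mv command per png that has the tag.
rename_by_score() {
    find "${1:-.}" -type f -iname '*.png' -print0 |
    while IFS= read -r -d '' f; do
        score=$(exiftool -s3 -Aesthetic_score "$f")
        [ -n "$score" ] || continue                  # file has no such field
        echo mv -- "$f" "$(chad_name "$f" "$score")"
    done
}

chad_name ./pics/filename001.png 7.0
```

The last line prints ./pics/filename001-chad-score-70.png. Calling rename_by_score with a directory shows the planned renames; removing the echo makes them real.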

renaming multiple sequential files extension

I have multiple sequentially named files in one directory, each with an incrementing numeric extension. My objective is to use the rename command to change just the file extension.
IBM0020.DAT_001
IBM0020.DAT_002
IBM0020.DAT_003
IBM0021.DAT_001
IBM0021.DAT_002
IBM0022.DAT_001
IBM0022.DAT_002
IBM0022.DAT_003
IBM0022.DAT_004
...
to
IBM0020.DAT_001
IBM0020.DAT_002
IBM0020.DAT_003
IBM0021.DAT_004
IBM0021.DAT_005
IBM0022.DAT_006
IBM0022.DAT_007
IBM0022.DAT_008
IBM0022.DAT_009
...
I did a dry run of the command below, but the result is not what I expected. I want to keep the filename and change only the extension to a running sequence number.
rename -n 's/.+/our $i;sprintf(".DAT_%03d",1+$i++)/e' *
IBM0020.DAT_001 renamed as .DAT_001
IBM0020.DAT_002 renamed as .DAT_002
IBM0020.DAT_003 renamed as .DAT_003
IBM0021.DAT_001 renamed as .DAT_004
IBM0021.DAT_002 renamed as .DAT_005
IBM0022.DAT_001 renamed as .DAT_006
IBM0022.DAT_002 renamed as .DAT_007
IBM0022.DAT_003 renamed as .DAT_008
IBM0022.DAT_004 renamed as .DAT_009
Thanks for any help.
Continuing from the comment, if all of your files have .DAT_XXX as the extension you wish to rename sequentially, then there is no need to include ".DAT_" as part of the pattern you are matching. Simply match the 3-digits at the end of the filename and change those, e.g.
rename 's/\d{3}$/our $i; sprintf("%03d", 1+$i++)/e' *
If ".DAT_" isn't unique, and you have other extensions ending in 3-digits you want to avoid renaming, then you can include "DAT_" as part of the pattern matched and replaced, e.g.
rename -n 's/DAT_\d{3}/our $i; sprintf("DAT_%03d", 1+$i++)/e' *
(note: there are two different "rename" utilities in common use on Linux, the first provided as part of the util-linux package does not support regex renaming, and then perl-rename, which you have, that does support perl-regex renaming.)
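Where only the util-linux rename is installed, the same renumbering can be sketched with a plain shell loop. This is an illustration under assumptions, not a drop-in replacement: renumber_dat is a hypothetical name, files are processed in the shell's glob (sorted) order, and a file is skipped rather than overwritten if its target name already exists.

```shell
# Give every *.DAT_NNN file in a directory a running 3-digit suffix.
renumber_dat() {
    dir=${1:-.}
    i=0
    for f in "$dir"/*.DAT_[0-9][0-9][0-9]; do
        [ -e "$f" ] || continue                      # the glob matched nothing
        i=$((i + 1))
        # ${f%_*} strips the old "_NNN"; printf appends the new counter
        new=$(printf '%s_%03d' "${f%_*}" "$i")
        [ "$f" = "$new" ] && continue                # already has the right number
        if [ -e "$new" ]; then
            echo "skipping $f: $new already exists" >&2
            continue
        fi
        mv -- "$f" "$new"
    done
}
```

Running renumber_dat in the directory from the question turns IBM0021.DAT_001 into IBM0021.DAT_004, and so on.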

Iterate through files in a directory, create output files, linux

I am trying to iterate through every file in a specific directory (called sequences), and perform two functions on each file. I know that the functions (the 'blastp' and 'cat' lines) work, since I can run them on individual files. Ordinarily I would have a specific file name as the query, output, etc., but I'm trying to use a variable so the loop can work through many files.
(Disclaimer: I am new to coding.) I believe that I am running into serious problems with trying to use my file names within my functions. As it is, my code will execute, but it creates a bunch of extra unintended files. This is what I intend for my script to do:
Line 1: Iterate through every file in my "sequences" directory. (All of which end with ".fa", if that is helpful.)
Line 3: Recognize the filename as a variable. (I know, I know, I think I've done this horribly wrong.)
Line 4: Run the blastp function using the file name as the argument for the "query" flag, always use "database.faa" as the argument for the "db" flag, and output the result in a new file that has the same name as the initial file, but with ".txt" at the end.
Line 5: Output parts of the output file from line 4 into a new file that has the same name as the initial file, but with "_top_hits.txt" at the end.
for sequence in ./sequences/{.,}*;
do
echo "$sequence";
blastp -query $sequence -db database.faa -out ${sequence}.txt -evalue 1e-10 -outfmt 7
cat ${sequence}.txt | awk '/hits found/{getline;print}' | grep -v "#">${sequence}_top_hits.txt
done
When I ran this code, it gave me six new files derived from each file in the directory (and they were all in the same directory - I'd prefer to have them all in their own folders. How can I do that?). They were all empty. Their suffixes were, ".txt", ".txt.txt", ".txt_top_hits.txt", "_top_hits.txt", "_top_hits.txt.txt", and "_top_hits.txt_top_hits.txt".
If I can provide any further information to clarify anything, please let me know.
If you're only interested in *.fa files I would limit your input to only those matching files like this:
for sequence in sequences/*.fa;
do
I can suggest the following improvements:
for fasta_file in ./sequences/*.fa # ";" is not necessary if you already have a new line for your "do"
do
    # ${variable%something} is $variable with the trailing
    # string "something" removed
    # basename path/to/file is the name of the file
    # without the full path
    # $(some command) allows you to use the result of the command as a string
    # Combining the above, we can form a string based on our fasta file
    # This string can be useful to name stuff in a clean manner later
    sequence_name=$(basename "${fasta_file%.fa}")
    echo "${sequence_name}"
    # Create a directory for the results for this sequence
    # -p option avoids a failure in case the directory already exists
    mkdir -p "${sequence_name}"
    # Define the name of the file for the results
    # (including our previously created directory in its path)
    blast_results=${sequence_name}/${sequence_name}_blast.txt
    blastp -query "${fasta_file}" -db database.faa \
        -out "${blast_results}" \
        -evalue 1e-10 -outfmt 7
    # Define a file name for the top hits
    top_hits=${sequence_name}/${sequence_name}_top_hits.txt
    # alternatively, using "%"
    #top_hits=${blast_results%_blast.txt}_top_hits.txt
    # No need to cat: awk can take a file as argument
    awk '/hits found/{getline;print}' "${blast_results}" \
        | grep -v "#" > "${top_hits}"
done
I made more intermediate variables, with (hopefully) meaningful names.
I used \ to escape line ends and allow putting commands in several lines.
I hope this improves code readability.
I haven't tested. There may be typos.
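The two building blocks used in the loop, ${variable%suffix} and basename, can be tried on their own without running blastp. A minimal sketch; result_paths is a hypothetical helper and sample1.fa is a made-up file name that does not need to exist.

```shell
# Derive the output names for one input file and print them, one per line.
result_paths() {
    fasta_file=$1
    # strip the .fa suffix, then strip the directory part
    sequence_name=$(basename "${fasta_file%.fa}")
    blast_results=$sequence_name/${sequence_name}_blast.txt
    # swap one suffix for another with %
    top_hits=${blast_results%_blast.txt}_top_hits.txt
    printf '%s\n' "$sequence_name" "$blast_results" "$top_hits"
}

result_paths ./sequences/sample1.fa
```

This prints sample1, sample1/sample1_blast.txt and sample1/sample1_top_hits.txt, which shows that each input file gets its own directory and no ".txt.txt" style names appear.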
You should be using *.fa if you only want files with a .fa ending. Additionally, if you want to redirect your output to new folders you need to create those directories somewhere using
mkdir 'folder_name'
then you need to redirect your -o outputs to those files, something like this
'command' -o /path/to/output/folder
To help you test this script out, you can run each line one by one to test them. You need to make sure each line works by itself before combining.
One last thing, be careful with your use of semicolons; it should look something like this:
for filename in *.fa; do 'command'; done

Linux rename files based on input file

I need to rename hundreds of files in Linux to change the unique identifier of each from the command line. For sake of examples, I have a file containing:
old_name1 new_name1
old_name2 new_name2
and need to change the names from the old to the new IDs. The file names contain the IDs, but have extra characters as well. My plan is therefore to end up with:
abcd_old_name1_1234.txt ==> abcd_new_name1_1234.txt
abcd_old_name2_1234.txt ==> abcd_new_name2_1234.txt
Use of rename is obviously fairly helpful here, but I am struggling to work out how to iterate through the file of the desired name changes and pass this as input into rename?
Edit: To clarify, I am looking to make hundreds of different rename commands, the different changes that need to be made are listed in a text file.
Apologies if this is already answered; I've had a good hunt, but can't find a similar case.
rename 's/^(abcd_)old_name(\d+_1234\.txt)$/$1new_name$2/' *.txt
Should work, depending on whether you have that package installed. Also have a look at qmv (renameutils).
If you want more options, use e.g.
shopt -s globstar
rename 's/(^|\/)abcd_old_name(\d+_1234\.txt)$/$1abcd_new_name$2/' folder/**/*.txt
(finds all txt files in subdirectories of folder; the pattern now matches the last path component instead of using ^, because rename sees the whole path), or
find folder -type f -iname '*.txt' -exec rename 's/(^|\/)abcd_old_name(\d+_1234\.txt)$/$1abcd_new_name$2/' {} +
to do the same using GNU find.
while read -r old_name new_name; do
rename "s/$old_name/$new_name/" *$old_name*.txt
done < file_with_names
In this way, you read the IDs from file_with_names and rename the files replacing $old_name with $new_name leaving the rest of the filename untouched.
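If the Perl rename is not available, or the IDs may contain characters that are special in a regular expression, the same loop can be sketched with mv and literal string matching. rename_from_map is a hypothetical name; the map file is assumed to hold one "old new" pair per line, as in the question. Note that an ID that is a prefix of another one (old_name1 and old_name10) would match both files, here as in the rename version.

```shell
# For each "old new" pair in the map file, replace the first literal
# occurrence of old with new in every matching *.txt name in a directory.
rename_from_map() {
    map=$1
    dir=${2:-.}
    while read -r old_name new_name; do
        [ -n "$old_name" ] && [ -n "$new_name" ] || continue   # skip blank lines
        for f in "$dir"/*"$old_name"*.txt; do
            [ -e "$f" ] || continue                            # the glob matched nothing
            base=${f##*/}
            # quoted expansions inside the pattern are matched literally
            target=$dir/${base%%"$old_name"*}$new_name${base#*"$old_name"}
            mv -- "$f" "$target"
        done
    done < "$map"
}
```

A call such as rename_from_map file_with_names . applies every pair to the current directory; putting echo in front of mv gives a dry run first.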
I was about to write a php function to do this for myself, but I came upon a faster method:
ls and copy & paste the directory contents into Excel from the terminal window. You may need an online tool to add or remove line breaks. Assuming your file names are in column A, use the following formula in another column:
="mv "&A1&" prefix"&A1&"suffix"
or
="mv "&A1&" "&substitute(A1,"jpeg","jpg")&"suffix"
or
="mv olddirectory/"&A1&" newdirectory/"&A1
Back in Linux, create a new file with
nano rename.txt and paste in the values from excel. They should look something like this:
mv oldname1.jpg newname1.jpg
mv oldname2.jpg newname2.jpg
then close out of nano and run the following command:
bash rename.txt. Bash just runs every line in the file as if you had typed it.
and you are done! This method gives verbose output on errors, which is handy.
