How to rename multiple files in Linux and store the old file names with the new file names in a text file?

I am a novice Linux user. I have 892 .pdb files that I want to rename sequentially as L1, L2, L3, ..., L892. I also want a text file that records which old name was assigned to each new name (i.e. L1, L2, L3). Please help me with this. Thank you for your time.

You could just do:
#!/bin/sh
i=0
for f in *.pdb; do
    : $((i += 1))    # ':' is a no-op; the arithmetic expansion increments i
    mv "$f" L"$i" && echo "$f --> L$i"
done > filelist
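With two hypothetical input files, bar.pdb and foo.pdb (globs expand in sorted order), filelist would then contain:
bar.pdb --> L1
foo.pdb --> L2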
Note that you probably want to move the files into a different directory, as that will make it easier to recover if an error occurs midway through. Also be wary that this will overwrite any existing files named L1, L2, etc., and can potentially cause a big mess. It's not idempotent (you can't run it twice). You would probably be better off not doing the move at all and instead doing something like:
#!/bin/sh
i=0
mkdir -p newfiles
for f in *.pdb; do
    i=$((i + 1))    # POSIX-safe increment; ++ is not guaranteed in sh arithmetic
    ln "$f" newfiles/L"$i" && printf "%s\0%s\0" "$f" "L$i"
done > filelist
This latter solution creates links to the original files in a subdirectory, so you can run it multiple times without munging the original data. Also, it uses null separators in the file list so you can unambiguously distinguish names that have newlines or tabs or spaces in them. It makes for a list that is not particularly human readable, but you can easily filter it through tr to make it pretty.
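For instance, a quick way to make the null-separated list readable is to translate the separators to newlines, so old and new names alternate one per line:
tr '\0' '\n' < filelist
Or, assuming an xargs that supports -0 (GNU or BSD xargs do), pair the names back up:
xargs -0 -n 2 printf '%s --> %s\n' < filelist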

Related

Renaming multiple different file extensions with BASH script

I'm trying to create a bash script that takes a directory full of files (about 500 files) that have all different types of extensions (no seriously, like 30 different types of extensions), and I want to get rid of all of the extensions and replace them with .txt.
I've been searching around for a while now, and can only find examples of taking a specified extension, and changing it to another specified extension.
Like .png --> .jpg, or .doc --> .txt
Here's an example I've found:
# Rename all *.txt to *.text
for f in *.txt; do
    mv -- "$f" "${f%.txt}.text"
done
This works, but only if you go from .txt to .text, I have multiple different extensions I'm working with.
My current code is:
directory=$1
for item in "$directory"/*
do
    echo mv -- "$item" "$item.txt"
done
This will append the .txt, but unfortunately I am left with the previous extensions still attached, e.g. filename.etc.txt, filename.bla.txt.
Am I going about this wrong? Any help is greatly appreciated.
It's a trivial change to the first example:
cd "$directory"
# Rename all files to *.txt
for f in *
do
mv -- "$f" "${f%.*}.txt"
done
If a file contains multiple extensions, this will replace only the last one. To remove all extensions, use %% in place of %.
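A minimal sketch of that %% variant, which strips everything from the first dot onward:
cd "$directory"
# Rename all files to *.txt, dropping every extension
for f in *
do
    mv -- "$f" "${f%%.*}.txt"
done
Be aware that names differing only in extension (a.foo and a.bar) would then collide on the same target name.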

Sort files according to their filetype

After an HD problem and some work, I have a bunch of files with names like "f1234", "f1235", etc.
My goal is to sort these files according to their filetype. For example, I want to move all the PDF files into the "pdfs" directory.
For one file, I can do "file f1234", and if it's a PDF, I can "mv f1234 pdfs/". But I have thousands of files... Can you help me with a bash or zsh command to sort all the PDFs in one pass? Thanks
The hard part here is reliably turning the output of file into a directory name. I think probably the best candidate for that is the mime-type of the file rather than the human readable output of file. I'd use something like:
mkdir sorted
for f in f*
do
    d=$(file -b --mime-type "$f" | tr / -)
    mkdir -p "sorted/$d"
    mv "$f" "sorted/$d/"
done
Obviously I'd test that out a bit before running it on your files, but something pretty close to that should work.
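If you only care about the PDFs, a minimal variant of the same idea (still assuming a file implementation that supports -b --mime-type):
mkdir -p pdfs
for f in f*
do
    [ "$(file -b --mime-type "$f")" = application/pdf ] && mv "$f" pdfs/
done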

find returning inverted results

In a few words, I wrote this little script to clean up some directories where I had consolidated directories/files from multiple sources. I had used the cp command with the --backup=numbered feature so that files with identical names would get a suffix like .~1~ appended rather than be overwritten. I then ran fdupes to remove duplicate files; in some cases fdupes removed the file which did not have the suffix appended by cp (the original file). So I wanted to scan the directories for files with the cp-appended suffix and, if the file does not also exist with the suffix removed, mv the file to that name; otherwise I would leave it alone, to avoid deleting anything that fdupes did not consider a duplicate.
The issue is that the test condition if [ -f ... ] in the code below returns results inverted from what it should, and I cannot understand why. For example, when the file exists it returns false, and when the file does not exist it returns true. I fixed it by reversing the actions based on the inverted return code, verified it was working as intended, and ran it that way, but I would like to know if anyone can explain why it behaved the way it did. I am not a bash script expert by any means, so it's possible that I missed something simple.
#!/bin/bash
logfile=$$.log
exec > $logfile 2>&1
IFS='
'
#set -f
for FILE in $(find . -type f -regextype posix-extended -regex '^.*(\.~[0-9]+~)+$')
do
    FILE2=${FILE%%.~[0-9]*} # remove the suffix
    if [ -f "${FILE2}" ]
    then
        echo ERROR: "${FILE2}" already exists!
    else
        echo "${FILE}" renamed "${FILE2}"
        mv "${FILE}" "${FILE2}"
    fi
done
You might be able to see the problem by modifying your script to show both FILE and FILE2 in the error message. There are a few minor problems with the script which could cause some confusion (but not the "inverted" logic):
- find output is not sorted. If you had more than one backup file, a randomly chosen one would replace the original file. You could sort the output by adding something like | sort -t~ -n -k2 to the end of the find command.
- The regular expression allows multiple matches of the ~[0-9]~ pattern. Conceivably you could have some odd file which ends with ~1~~2~.
- The suffix-stripping step assumes a single ~[0-9]~ is on the end of the filename. An embedded ~0, e.g., foo~0bar~1~, would reduce FILE to foo. The workaround for that is more cumbersome (since the suffix-stripping uses globbing), but it could be done with a case statement that matches an explicit number of digits (likely three digits would be enough; see the sketch below).
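A rough sketch of that case statement (hypothetical; it assumes backup numbers of at most three digits and would replace the FILE2 assignment inside the loop):
case $FILE in
    *.~[0-9]~ | *.~[0-9][0-9]~ | *.~[0-9][0-9][0-9]~ )
        FILE2=${FILE%.~*~}    # strip only the final .~N~ suffix
        ;;
    * )
        continue    # no plain .~N~ suffix; skip this file
        ;;
esac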

Bash Script to replicate files

I have 25 files in a directory. I need to amass 25000 files for testing purposes. I thought I could just replicate these files over and over until I get 25000 files. I could manually copy-paste 1000 times, but that seemed tedious, so I thought I could write a script to do it for me. I tried
cp * .
as a trial, but I got an error saying the source and destination file are the same. If I were to automate it, how would I do it so that each of the 1000 rounds of copies gets unique names?
As discussed in the comments, you can do something like this:
for file in *
do
    filename="${file%.*}"      # everything up to the last dot
    extension="${file##*.}"    # extension (text after the last dot)
    for i in {00001..10000}
    do
        cp -- "$file" "${filename}${i}.${extension}"
    done
done
The trick for i in {00001..10000} loops from 1 to 10000 with the numbers padded by leading zeros (demonstrated below).
${filename}${i}.${extension} is the same as $filename$i.$extension but makes it clearer what is a variable name and what is literal text. This way, you can also do ${filename}_${i}.${extension} to get files like a_00023.txt, etc.
In case your current files match a specific pattern, you can always do for file in a* (if they all are of the a + something form).
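For instance, the zero-padded brace expansion (bash 4 or later) behaves like this:
$ echo {00001..00003}
00001 00002 00003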
If you want to keep the extension of the files, you can use this. Assuming you want to copy all .txt files:
#!/bin/bash
for f in *.txt
do
    for i in {1..10000}
    do
        cp "$f" "${f%.*}_${i}.${f##*.}"
    done
done
You could try this:
for file in *; do for i in {1..1000}; do cp "$file" "$file-$i"; done; done
It appends a number to every existing file name, after the extension (e.g. file.txt-1).
The following script
for file in *.*
do
    eval $(sed 's/\(.*\)\.\([^\.]*\)$/base="\1";ext="\2";/' <<< "$file")
    for n in {1..1000}
    do
        echo cp "$file" "$base-$n.$ext"
    done
done
will:
- take all files with an extension (*.*)
- split each name into its basename and extension (via sed)
- in a cycle of 1000, copy the original file to file-number.extension
- only print the commands (a dry run); remove the echo once you are satisfied

How would I flatten and overlay multiple directories into one directory?

I want to take a list of directory hierarchies and flatten them into a single directory. Any duplicate file later in the list will replace an earlier file. For example...
foo/This/That.pm
bar/This/That.pm
bar/Some/Module.pm
wiff/This/That.pm
wiff/A/Thing/Here.pm
This would wind up with
This/That.pm # from wiff/
Some/Module.pm # from bar/
A/Thing/Here.pm # from wiff/
I have a probably overcomplicated Perl program to do this. I'm interested in the clever ways SO users might solve it. The big hurdle is "create the intermediate directories if necessary", perhaps with some combination of basename and dirname.
The real problem I'm solving is checking the difference between two installed Perl libraries. I'm first flattening the multiple library directories for each Perl into a single directory, simulating how Perl would search for a module. I can then diff -r them.
If you do not mind the final order of the entries, I guess this can do the job:
#!/bin/bash
declare -A directory
while IFS= read -r line; do
    directory["${line#*/}"]=${line%%/*}
done < "$1"
for entry in "${!directory[@]}"; do
    printf "%s\t# from %s/\n" "$entry" "${directory[$entry]}"
done
Output:
$ ./script.sh files.txt
A/Thing/Here.pm # from wiff/
This/That.pm # from wiff/
Some/Module.pm # from bar/
And if you need to move the files rather than just print the mapping, you can simply replace the printing step with a mv -- or cp --, like this:
for entry in "${!directory[@]}"; do
    mkdir -p "your_dir_path/$(dirname "$entry")"    # create intermediate directories as needed
    mv -- "${directory[$entry]}/$entry" "your_dir_path/$entry"
done
