How to rename string in multiple filename in a folder using shell script without mv command since it will move the files to different folder? [duplicate] - linux

This question already has answers here:
Rename multiple files based on pattern in Unix
(24 answers)
Closed 5 years ago.
Write a simple script that will automatically rename a number of files. As an example we want the file *001.jpg renamed to user defined string + 001.jpg (ex: MyVacation20110725_001.jpg) The usage for this script is to get the digital camera photos to have file names that make some sense.
I need to write a shell script for this. Can someone suggest how to begin?

An example to help you get off the ground.
for f in *.jpg; do mv "$f" "$(echo "$f" | sed s/IMG/VACATION/)"; done
In this example, I am assuming that all your image files contain the string IMG and you want to replace IMG with VACATION.
The shell automatically evaluates *.jpg to all the matching files.
The second argument of mv (the new name of the file) is the output of the sed command that replaces IMG with VACATION.
If your filenames include whitespace pay careful attention to the "$f" notation. You need the double-quotes to preserve the whitespace.

You can use rename utility to rename multiple files by a pattern. For example following command will prepend string MyVacation2011_ to all the files with jpg extension.
rename 's/^/MyVacation2011_/g' *.jpg
or
rename <pattern> <replacement> <file-list>

this example, I am assuming that all your image files begin with "IMG" and you want to replace "IMG" with "VACATION"
solution : first identified all jpg files and then replace keyword
find . -name '*jpg' -exec bash -c 'echo mv $0 ${0/IMG/VACATION}' {} \;

for file in *.jpg ; do mv $file ${file//IMG/myVacation} ; done
Again assuming that all your image files have the string "IMG" and you want to replace "IMG" with "myVacation".
With bash you can directly convert the string with parameter expansion.
Example: if the file is IMG_327.jpg, the mv command will be executed as if you do mv IMG_327.jpg myVacation_327.jpg. And this will be done for each file found in the directory matching *.jpg.
IMG_001.jpg -> myVacation_001.jpg
IMG_002.jpg -> myVacation_002.jpg
IMG_1023.jpg -> myVacation_1023.jpg
etcetera...

find . -type f |
sed -n "s/\(.*\)factory\.py$/& \1service\.py/p" |
xargs -p -n 2 mv
eg will rename all files in the cwd with names ending in "factory.py" to be replaced with names ending in "service.py"
explanation:
In the sed cmd, the -n flag will suppress normal behavior of echoing input to output after the s/// command is applied, and the p option on s/// will force writing to output if a substitution is made. Since a sub will only be made on match, sed will only have output for files ending in "factory.py"
In the s/// replacement string, we use "& " to interpolate the entire matching string, followed by a space character, into the replacement. Because of this, it's vital that our RE matches the entire filename. after the space char, we use "\1service.py" to interpolate the string we gulped before "factory.py", followed by "service.py", replacing it. So for more complex transformations youll have to change the args to s/// (with an re still matching the entire filename)
Example output:
foo_factory.py foo_service.py
bar_factory.py bar_service.py
We use xargs with -n 2 to consume the output of sed 2 delimited strings at a time, passing these to mv (i also put the -p option in there so you can feel safe when running this). voila.
NOTE: If you are facing more complicated file and folder scenarios, this post explains find (and some alternatives) in greater detail.

Another option is:
for i in *001.jpg
do
echo "mv $i yourstring${i#*001.jpg}"
done
remove echo after you have it right.
Parameter substitution with # will keep only the last part, so you can change its name.

Can't comment on Susam Pal's answer but if you're dealing with spaces, I'd surround with quotes:
for f in *.jpg; do mv "$f" "`echo $f | sed s/\ /\-/g`"; done;

You can try this:
for file in *.jpg;
do
mv $file $somestring_${file:((-7))}
done
You can see "parameter expansion" in man bash to understand the above better.

Related

How to rename fasta header based on filename in multiple files?

I have a directory with multiple fasta file named as followed:
BC-1_bin_1_genes.faa
BC-1_bin_2_genes.faa
BC-1_bin_3_genes.faa
BC-1_bin_4_genes.faa
etc. (about 200 individual files)
The fasta header look like this:
>BC-1_k127_3926653_6 # 4457 # 5341 # -1 # ID=2_6;partial=01;start_type=Edge;rbs_motif=None;rbs_spacer=None;gc_cont=0.697
I now want to add the filename to the header since I want to annotate the sequences for each file.I tried the following:
for file in *.faa;
do
sed -i "s/>.*/${file%%.*}/" "$file" ;
done
It worked partially but it removed the ">" from the header which is essential for the fasta file. I tried to modify the "${file%%.*}" part to keep the carrot but it always called me out on bad substitutions.
I also tried this:
awk '/>/{sub(">","&"FILENAME"_");sub(/\.faa/,x)}1' *.faa
This worked in theory but only printed everything on my terminal rather than changing it in the respective files.
Could someone assist with this?
It's not clear whether you want to replace the earlier header, or add to it. Both scenarios are easy to do. Don't replace text you don't want to replace.
for file in ./*.faa;
do
sed -i "s/^>.*/>${file%%.*}/" "$file"
done
will replace the header, but include a leading > in the replacement, effectively preserving it; and
for file in ./*.faa;
do
sed -i "s/^>.*/&${file%%.*}/" "$file"
done
will append the file name at the end of the header (& in the replacement string evaluates to the string we are replacing, again effectively preserving it).
For another variation, try
for file in *.faa;
do
sed -i "/^>/s/\$/ ${file%%.*}/" "$file"
done
which says on lines which match the regex ^>, replace the empty string at the end of the line $ with the file name.
Of course, your Awk script could easily be fixed, too. Standard Awk does not have an option to parallel the -i "in-place" option of sed, but you can easily use a temporary file:
for file in ./*.faa;
do
awk '/>/{ $0 = $0 " " FILENAME);sub(/\.faa/,"")}1' "$file" >"$file.tmp" &&
mv "$file.tmp" "$file"
done
GNU Awk also has an -i inplace extension which you could simply add to the options of your existing script if you have GNU Awk.
Since FASTA files typically contain multiple headers, adding to the header rather than replacing all headers in a file with the same string seems more useful, so I changed your Awk script to do that instead.
For what it's worth, the name of the character ^ is caret (carrot is 🥕). The character > is called greater than or right angle bracket, or right broket or sometimes just wedge.
You just need to detect the pattern to replace and use regex to implement it:
fasta_helper.sh
location=$1
for file in $location/*.faa
do
full_filename=${file##*/}
filename="${full_filename%.*}"
#scape special chars
filename=$(echo $filename | sed 's_/_\\/_g')
echo "adding file name: $filename to: $full_filename"
sed -i -E "s/^[^#]+/>$filename /" $location/$full_filename
done
usage:
Just pass the folder with fasta files:
bash fasta_helper.sh /foo/bar
test:
lectures
Regex: matching up to the first occurrence of a character
Extract filename and extension in Bash
https://unix.stackexchange.com/questions/78625/using-sed-to-find-and-replace-complex-string-preferrably-with-regex
Locating your files
Suggesting to first identify your files with find command or ls command.
find . -type f -name "*.faa" -printf "%f\n"
A find command to print only file with filenames extension .faa. Including sub directories to current directory.
ls -1 "*.faa"
An ls command to print files and directories with extension .faa. In current directory.
Processing your files
Once you have the correct files list, iterate over the list and apply sed command.
for fileName in $(find . -type f -name "*.faa" -printf "%f\n"); do
stripedFileName=${fileName/.*/} # strip extension .faa
sed -i "1s|\$| $stripedFileName|" "fileName" # append value of stripedFileName at end of line 1
done

mv renaming filename to _*_

Given an example that my file name is
A_BC_DEF_GH_IJ_LMNO_PQ_11111111_1111111111_111111_AB.dat.meta
i am trying to rename this with unix command but when i tried using this cmd
for f in *.meta; do mv "$f" "$(echo $f|sed s/[0-9]/?/g|sed 's/-/*/g')" ; done
my file is renamed to
A_BC_DEF_GH_IJ_LMNO_PQ_????????_????????????????????_???????_AB.dat.meta
it is expected to rename the file to
A_BC_DEF_GH_IJ_LMNO_PQ__????????_????????????????????_*_AB.dat.meta
Im quite new with unix cmd , any approach that i should try ?
Since [0-9] and ? are undergoing filename expansion, you should quote them to avoid nasty error messages. With this in mind, I did a
echo A_BC_DEF_GH_IJ_LMNO_PQ_11111111_1111111111_111111_AB.dat.meta | sed 's/[0-9]/?/g'|sed 's/-/*/g'
and got as output A_BC_DEF_GH_IJ_LMNO_PQ????????????????????????AB.dat.meta, which makes sense to me. Why would you expect an asterisk in the resulting filename? In your second sed command, you are turning the hyphens into asterisks, but there is no hyphen in the input.
Of course it is pretty unsane to use question marks and asterisks in a file name, as this is just begging for trouble, but there is no law that you must not do this.
A_BC_DEF_GH_IJ_LMNO_PQ_11111111_1111111111_111111_AB.dat.meta
Match it with a regex. Remember which characters need to be escaped in sed. Remember about proper quoting - if you write $ it should be inside ". Note that if there are no files named *.meta it will just iterate over a string *.meta unless nullglob is set.
$ touch A_BC_DEF_GH_IJ_LMNO_PQ_11111111_1111111111_111111_AB.dat.meta
$ for f in *.meta; do mv "$f" "$(echo "$f" | sed 's/[0-9]/?/g; s/_\(?*\)_\(?*\)_\(?*\)_\([^_]*\)$/__\1_\2_*_\4/')" ; done

rename all files in folder through regular expression

I have a folder with lots of files which name has the following structure:
01.artist_name - song_name.mp3
I want to go through all of them and rename them using the regexp:
/^d+\./
so i get only :
artist_name - song_name.mp3
How can i do this in bash?
You can do this in BASH:
for f in [0-9]*.mp3; do
mv "$f" "${f#*.}"
done
Use the Perl rename utility utility. It might be installed on your version of Linux or easy to find.
rename 's/^\d+\.//' -n *.mp3
With the -n flag, it will be a dry run, printing what would be renamed, without actually renaming. If the output looks good, drop the -n flag.
Use 'sed' bash command to do so:
for f in *.mp3;
do
new_name="$(echo $f | sed 's/[^.]*.//')"
mv $f $new_name
done
...in this case, regular expression [^.].* matches everything before first period of a string.

shell script to list file names alone in a directory & rename it [duplicate]

This question already has answers here:
How can I remove the extension of a filename in a shell script?
(15 answers)
Closed 6 years ago.
I'm new to scripting concept.. I have a requirement to rename multiple files in a directory like filename.sh.x into filename.sh
First I tried to get the file names in a particular directory.. so i followed the below scripting code
for entry in PathToThedirectory/*sh.x
do
echo $entry
done
& the above code listed down all the file names with full path..
But my expected o/p is : to get file names alone like abc.sh.x,
so that I can proceed with the split string mechanism to perform rename
operation easily...
help me to solve this ... Thanks in advance
First approach trying to follow OP suggestions:
for i in my/path/*.py.x
do
basename=$(basename "$i")
mv my/path/"$basename" my/path/"${basename%.*}"
done
And maybe, you can simplify it:
for i in my/path/*.py.x
do
mv "$i" "${i%.*}";
done
Documentation regarding this kind of operation (parameter expansion): https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html
In particular:
${parameter%word} : The word is expanded to produce a pattern just as in filename expansion. If the pattern matches a trailing portion of the expanded value of parameter, then the result of the expansion is the value of parameter with the shortest matching pattern (the ‘%’ case) or the longest matching pattern (the ‘%%’ case) deleted
So, ${i%.*} means:
Take $i
Match .* at the end of its value (. being a literal character)
Remove the shortest matching pattern
Look into prename (installed together with the perl package on ubuntu).
Then you can just do something like:
prename 's/\.x$//' *.sh.x
In ksh you can do this:
for $file in $(ls $path)
do
new_file=$(basename $path/$file .x)
mv ${path}/${file} ${path}/${new_file}
done
This should do the trick:
for file in *.sh.x;
do
mv "$file " "${file /.sh.x}";
done
Running this rename command from the root directory should work:
rename 's/\.sh\.x$/.sh/' *.sh.x
for i in ls -la /path|grep -v ^d|awk '{print $NF}'
do
echo "basename $i"
done
it will give u the base name of all files or you can try below
find /path -type f -exec basename {} \;

How to remove the extension of a file?

I have a folder that is full of .bak files and some other files also. I need to remove the extension of all .bak files in that folder. How do I make a command which will accept a folder name and then remove the extension of all .bak files in that folder ?
Thanks.
To remove a string from the end of a BASH variable, use the ${var%ending} syntax. It's one of a number of string manipulations available to you in BASH.
Use it like this:
# Run in the same directory as the files
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
That works nicely as a one-liner, but you could also wrap it as a script to work in an arbitrary directory:
# If we're passed a parameter, cd into that directory. Otherwise, do nothing.
if [ -n "$1" ]; then
cd "$1"
fi
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
Note that while quoting your variables is almost always a good practice, the for FILENAME in *.bak is still dangerous if any of your filenames might contain spaces. Read David W.'s answer for a more-robust solution, and this document for alternative solutions.
There are several ways to remove file suffixes:
In BASH and Kornshell, you can use the environment variable filtering. Search for ${parameter%word} in the BASH manpage for complete information. Basically, # is a left filter and % is a right filter. You can remember this because # is to the left of %.
If you use a double filter (i.e. ## or %%, you are trying to filter on the biggest match. If you have a single filter (i.e. # or %, you are trying to filter on the smallest match.
What matches is filtered out and you get the rest of the string:
file="this/is/my/file/name.txt"
echo ${file#*/} #Matches is "this/` and will print out "is/my/file/name.txt"
echo ${file##*/} #Matches "this/is/my/file/" and will print out "name.txt"
echo ${file%/*} #Matches "/name.txt" and will print out "/this/is/my/file"
echo ${file%%/*} #Matches "/is/my/file/name.txt" and will print out "this"
Notice this is a glob match and not a regular expression match!. If you want to remove a file suffix:
file_sans_ext=${file%.*}
The .* will match on the period and all characters after it. Since it is a single %, it will match on the smallest glob on the right side of the string. If the filter can't match anything, it the same as your original string.
You can verify a file suffix with something like this:
if [ "${file}" != "${file%.bak}" ]
then
echo "$file is a type '.bak' file"
else
echo "$file is not a type '.bak' file"
fi
Or you could do this:
file_suffix=$(file##*.}
echo "My file is a file '.$file_suffix'"
Note that this will remove the period of the file extension.
Next, we will loop:
find . -name "*.bak" -print0 | while read -d $'\0' file
do
echo "mv '$file' '${file%.bak}'"
done | tee find.out
The find command finds the files you specify. The -print0 separates out the names of the files with a NUL symbol -- which is one of the few characters not allowed in a file name. The -d $\0means that your input separators are NUL symbols. See how nicely thefind -print0andread -d $'\0'` together?
You should almost never use the for file in $(*.bak) method. This will fail if the files have any white space in the name.
Notice that this command doesn't actually move any files. Instead, it produces a find.out file with a list of all the file renames. You should always do something like this when you do commands that operate on massive amounts of files just to be sure everything is fine.
Once you've determined that all the commands in find.out are correct, you can run it like a shell script:
$ bash find.out
rename .bak '' *.bak
(rename is in the util-linux package)
Caveat: there is no error checking:
#!/bin/bash
cd "$1"
for i in *.bak ; do mv -f "$i" "${i%%.bak}" ; done
You can always use the find command to get all the subdirectories
for FILENAME in `find . -name "*.bak"`; do mv --force "$FILENAME" "${FILENAME%.bak}"; done

Resources