Remove part of filename with common delimiter - linux

I have a number of files with the following naming:
name1.name2.s01.ep01.RANDOMWORD.mp4
name1.name2.s01.ep02.RANDOMWORD.mp4
name1.name2.s01.ep03.RANDOMWORD.mp4
I need to remove everything between the last . and ep# from the file names and only have name1.name2.s01.ep01.mp4 (sometimes the extension can be different)
name1.name2.s01.ep01.mp4
name1.name2.s01.ep02.mp4
name1.name2.s01.ep03.mp4

This is a simpler version of @Jesse's [answer]:
for file in /path/to/base_folder/* # Glob to get the files
do
epno=${file#*.ep}
mv -- "$file" "${file%.ep*}.ep${epno%%.*}.${file##*.}"
# For the renaming part, see the note below
done
Note: Not yet comfortable with shell parameter expansion? Check [this].
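To see what each expansion in the loop above contributes, here is a small sketch that only computes the new name and moves nothing. The function name strip_tag is made up for illustration, and it assumes the name contains ".ep" followed by the episode number.

```shell
# strip_tag is a hypothetical helper: it prints the new name and renames nothing.
strip_tag() {
    local file="$1"
    local epno="${file#*.ep}"     # drop shortest prefix ending in ".ep"   -> 01.RANDOMWORD.mp4
    local prefix="${file%.ep*}"   # drop shortest suffix starting at ".ep" -> name1.name2.s01
    local ext="${file##*.}"       # drop longest prefix ending in "."      -> mp4
    # ${epno%%.*} drops the longest suffix starting at "." -> 01
    printf '%s.ep%s.%s\n' "$prefix" "${epno%%.*}" "$ext"
}

strip_tag name1.name2.s01.ep01.RANDOMWORD.mp4
```

Because the extension is taken from the name itself, the same function should handle other extensions without changes.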

Using Bash string manipulation (see http://www.tldp.org/LDP/abs/html/string-manipulation.html), you could do it like this.
You need to run it once per file-extension type.
for file in <directory>/*
do
name=${file}
firstchar="${name:0:1}"
# Longest prefix ending in "." removed, leaving the extension
extension=${name##${firstchar}*.}
# Strip the extension, then the random word before it
lastchar=$(printf '%s\n' "$name" | tail -c 2)
strip1=${name%.*$lastchar}
lastchar=$(printf '%s\n' "$strip1" | tail -c 2)
strip2=${strip1%.*$lastchar}
mv -- "$name" "${strip2}.${extension}"
done

You can use rename (the Perl version; you may need to install it). It works like sed on filenames.
As an example:
$ for i in `seq 3`; do touch "name1.name2.s01.ep0$i.RANDOMWORD.txt"; done
$ ls -1
name1.name2.s01.ep01.RANDOMWORD.txt
name1.name2.s01.ep02.RANDOMWORD.txt
name1.name2.s01.ep03.RANDOMWORD.txt
$ rename 's/(name1\.name2\.s01\.ep\d{2})\..*(\.txt)$/$1$2/' name1.name2.s01.ep0*
$ ls -1
name1.name2.s01.ep01.txt
name1.name2.s01.ep02.txt
name1.name2.s01.ep03.txt
This expression matches your filenames and uses two capture groups, so that $1$2 in the replacement are the parts on either side of "RANDOMWORD":
(name1\.name2\.s01\.ep\d{2})\..*(\.txt)$
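If rename is not installed, the same two-capture-group idea can be tried with sed as a dry run. This is only a sketch: new_name is a hypothetical helper, the pattern is generalised to any prefix and any final extension, and sed has no \d, so [0-9] is used instead.

```shell
# new_name is a hypothetical helper; it only prints the name that would result.
new_name() {
    # Group 1: everything up to and including "epNN"; group 2: the final extension.
    printf '%s\n' "$1" | sed -E 's/^(.*\.ep[0-9]{2})\..*(\.[^.]+)$/\1\2/'
}

for f in name1.name2.s01.ep01.RANDOMWORD.txt name1.name2.s01.ep02.OTHER.mp4; do
    printf '%s -> %s\n' "$f" "$(new_name "$f")"
done
```

A name that is already clean does not match the pattern and is printed unchanged, so the helper can be run repeatedly.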

Related

How do I delete the first delimiter of file names in linux?

I want to delete the first delimiter of file names in linux.
For example,
$ ls my_directory
a.b.c.txt a.b.d.txt a.b.e.txt
I want it to be like:
$ ls my_directory
ab.c.txt ab.d.txt ab.e.txt
I tried:
$ mv a.b* ab*
but unfortunately this doesn't work.
What should I do?
Thank you in advance.
If you're using Bash, use the parameter expansion that removes only the first match of a pattern:
for f in a.b*; do
mv -i -- "$f" "${f/.}"
done
See Shell Parameter Expansion.
If you're using a POSIX shell, you can use ${f%%.*}${f#*.} or, in the case of a known prefix like a.b, simply ab${f#a.b}.
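As a sketch of how the POSIX form splits the name: ${f%%.*} keeps the text before the first dot and ${f#*.} keeps the text after it, so joining them drops exactly one dot. The helper name drop_first_dot is made up, and the guard for names without a dot is an addition, since without it such a name would be printed twice over.

```shell
# drop_first_dot is a hypothetical helper; it prints the new name only.
drop_first_dot() {
    case $1 in
        *.*) printf '%s\n' "${1%%.*}${1#*.}" ;;  # "a" + "b.c.txt" -> ab.c.txt
        *)   printf '%s\n' "$1" ;;               # no dot: leave the name alone
    esac
}

drop_first_dot a.b.c.txt
```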

Alternative for AWK use

I'd love to have a more elegant solution for a mass rename of files, as shown below. Files were of format DEV_XYZ_TIMESTAMP.dat and we needed them as T-XYZ-TIMESTAMP.dat.
In the end, I copied them all (to be on the safe side) into a renamed folder:
ls -l *dat|awk '{system("cp " $10 " renamed/T-" substr($10, index($10, "_")+1))}'
So, first I listed all dat files, then picked the 10th column (the file name) and executed a command using awk's system function.
The command copied each original file into the renamed folder under a new file name.
The new file name was created by removing the prefix up to and including the first _ (awk's substr function) and adding a "T-" prefix.
Effectively:
cp DEV_file.dat renamed/T-file.dat
Is there a way to use cp or mv together with some regex rules to achieve the same in a bit more elegant way?
Thx
You may use this script:
for file in *.dat; do
f="${file//_/-}"
mv "$file" renamed/T-"${f#*-}"
done
You should avoid parsing the output of the ls command.
If you have the rename utility:
rename -E "s/[^_]*/T/" -e "s/_/-/g" *dat
Demo:
$ ls -1
ABC_DEF_TIMESTAMP.dat
DEV_XYZ_TIMESTAMP.dat
$ rename -E "s/[^_]*/T/" -e "s/_/-/g" *
$ ls -1
T-DEF-TIMESTAMP.dat
T-XYZ-TIMESTAMP.dat
This is how I would do it:
cpdir=renamed
for file in *dat; do
newfile=$(echo "$file" | sed -e "s/[^_]*/T/" -e "y/_/-/")
cp "$file" "$cpdir/$newfile"
done
The sed script replaces the leading run of non-underscore characters with a single T and then replaces every _ with -. If cpdir might not exist before execution, you can simply add mkdir -p "$cpdir" after the first line.
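To check the two sed expressions before copying anything, they can be wrapped in a small function and run on sample names. This is a sketch only; to_t_name is a made-up name.

```shell
# to_t_name is a hypothetical helper; it prints the transformed name only.
to_t_name() {
    # s/[^_]*/T/ : replace the leading run of non-underscore characters with T
    # y/_/-/     : turn every underscore into a hyphen
    printf '%s\n' "$1" | sed -e 's/[^_]*/T/' -e 'y/_/-/'
}

to_t_name DEV_XYZ_TIMESTAMP.dat
```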

rename all files in folder through regular expression

I have a folder with lots of files which name has the following structure:
01.artist_name - song_name.mp3
I want to go through all of them and rename them using the regexp:
/^\d+\./
so I get only:
artist_name - song_name.mp3
How can I do this in bash?
You can do this in BASH:
for f in [0-9]*.mp3; do
mv "$f" "${f#*.}"
done
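The loop above strips everything up to the first dot from any file whose name starts with a digit, so a name such as 1st.song.mp3 would lose its "1st." prefix too. A stricter sketch, assuming you only want to drop purely numeric prefixes (strip_track_number is a made-up helper name), could look like this:

```shell
# strip_track_number is a hypothetical helper; it prints the new name only.
strip_track_number() {
    local name="$1"
    local prefix="${name%%.*}"     # text before the first "."
    case $prefix in
        ''|*[!0-9]*) printf '%s\n' "$name" ;;       # empty or not all digits: keep as is
        *)           printf '%s\n' "${name#*.}" ;;  # digits only: drop the "NN." prefix
    esac
}

strip_track_number "01.artist_name - song_name.mp3"
```

In a rename loop, the printed value would be used as the mv target, quoted so that the spaces in these names survive.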
Use the Perl rename utility. It might already be installed on your version of Linux, or be easy to find.
rename 's/^\d+\.//' -n *.mp3
With the -n flag, it will be a dry run, printing what would be renamed, without actually renaming. If the output looks good, drop the -n flag.
Use sed to do so:
for f in *.mp3;
do
new_name="$(printf '%s\n' "$f" | sed 's/^[^.]*\.//')"
mv -- "$f" "$new_name"
done
In this case, the regular expression ^[^.]*\. matches everything up to and including the first period of the string. The variables are quoted because these file names contain spaces.

Removing 10 Characters of Filename in Linux

I just downloaded about 600 files from my server and need to remove the last 11 characters from the filename (not including the extension). I use Ubuntu and I am searching for a command to achieve this.
Some examples are as follows:
aarondyne_kh2_13thstruggle_or_1250556383.mus should be renamed to aarondyne_kh2_13thstruggle_or.mus
aarondyne_kh2_darknessofunknow_1250556659.mp3 should be renamed to aarondyne_kh2_darknessofunknow.mp3
It seems that some duplicates might exist after I do this, but if the command fails to complete and tells me what the duplicates would be, I can always remove those manually.
Try using the rename command. It allows you to rename files based on a regular expression:
The following line should work out for you:
rename 's/_\d+(\.[a-z0-9A-Z]+)$/$1/' *
The following changes will occur:
aarondyne_kh2_13thstruggle_or_1250556383.mus renamed as aarondyne_kh2_13thstruggle_or.mus
aarondyne_kh2_darknessofunknow_1250556659.mp3 renamed as aarondyne_kh2_darknessofunknow.mp3
You can check the actions rename will do via specifying the -n flag, like this:
rename -n 's/_\d+(\.[a-z0-9A-Z]+)$/$1/' *
For more information on how to use rename simply open the manpage via: man rename
Not the prettiest, but very simple:
echo "$filename" | sed -e 's!\(.*\)...........\(\.[^.]*\)!\1\2!'
You'll still need to write the rest of the script, but it's pretty simple.
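One possible shape for the rest of that script is sketched below. It assumes the 11 characters sit directly before the final extension and refuses to overwrite an existing file; rename_without_suffix is a made-up name, and .\{11\} is used as an equivalent of the eleven literal dots.

```shell
# rename_without_suffix is a hypothetical helper built around the sed expression above.
rename_without_suffix() {
    local filename="$1" new
    # Drop the 11 characters directly before the final extension.
    new=$(printf '%s\n' "$filename" | sed -e 's!\(.*\).\{11\}\(\.[^.]*\)$!\1\2!')
    # Name too short or without an extension: nothing to do.
    if [ "$new" = "$filename" ]; then return 0; fi
    # Report duplicates instead of overwriting them.
    if [ -e "$new" ]; then
        echo "Will not rename $filename, $new already exists" >&2
        return 1
    fi
    mv -- "$filename" "$new"
}

# Example use, run from the download directory:
#   for f in *; do rename_without_suffix "$f"; done
```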
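# Strips the 11 characters before the extension; each name is passed as $1 so unusual characters stay safe
find . -type f -exec sh -c 'mv -- "$1" "$(printf %s "$1" | sed -E "s/[^/]{11}(\.[^./]+)?$/\1/")"' sh {} \;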
One way to go:
Get a list of your files, one per line (with ls, for example), then:
ls....|awk '{o=$0;sub(/_[^_.]*\./,".",$0);print "mv "o" "$0}'
this will print the mv a b command
e.g.
kent$ echo "aarondyne_kh2_13thstruggle_or_1250556383.mus"|awk '{o=$0;sub(/_[^_.]*\./,".",$0);print "mv "o" "$0}'
mv aarondyne_kh2_13thstruggle_or_1250556383.mus aarondyne_kh2_13thstruggle_or.mus
To execute the commands, pipe the output to sh.
I assume there are no spaces in your filenames.
This script assumes each file has just one extension. It would, for instance, rename "foo.something.mus" to "foo.mus". To keep all extensions, remove one hash mark (#) from the first line of the loop body. It also assumes that the base of each filename has at least 12 characters, so that removing 11 doesn't leave you with an empty name.
for f in *; do
ext=${f##*.}
base=${f%%.*}
# Remove the last 11 characters of the base, then put the extension back
new_f=${base%???????????}.$ext
if [ -f "$new_f" ]; then
echo "Will not rename $f, $new_f already exists" >&2
else
mv -- "$f" "$new_f"
fi
done

Shell Script: Truncating String

I have two folders full of training files and corresponding test files, and I'd like to run the matching pairs against each other using a shell script.
This is what I have so far:
for x in SpanishLS.train/*.train
do
timbl -f $x -t SpanishLS.test/$x.test
done
This is supposed to take file1(-n).train in one directory, look for file1(-n).test in the other, and run them through a tool called timbl.
What it does instead is look for a file called SpanishLS.train/file1(-n).train.test which of course doesn't exist.
What I tried to do, to no avail, is truncate $x in a way that lets the script find the correct file, but whenever I do this, $x is truncated way too early, resulting in the script not even finding the .train file.
How should I code this?
If I got you right, this will do the job:
for x in SpanishLS.train/*.train
do
y=${x##*/} # strip base path
y=${y%.*} # strip extension
timbl -f "$x" -t "SpanishLS.test/$y.test"
done
Use basename:
for x in SpanishLS.train/*.train
do
timbl -f "$x" -t "SpanishLS.test/$(basename "$x" .train).test"
done
That removes the directory prefix and the .train suffix from $x, and builds up the name you want.
In bash (and other POSIX-compliant shells), you can do the basename operation with two shell parameter expansions without invoking an external program. (I don't think there's a way to combine the two expansions into one.)
for x in SpanishLS.train/*.train
do
y=${x##*/} # Remove path prefix
timbl -f "$x" -t "SpanishLS.test/${y%.train}.test" # Remove .train suffix
done
Beware: bash supports quite a number of (useful) expansions that are not defined by POSIX. For example, ${y//.train/.test} is a bash-only notation (or bash and compatible shells notation).
Replace all occurrences of .train in the path with .test (this changes both the directory name and the extension):
timbl -f "$x" -t "$(printf '%s\n' "$x" | sed 's/\.train/.test/g')"

Resources