How to remove special characters in file names? - linux

When creating playlists I often came across files that would break the playing process. Such would be files with spaces or apostrophes. I would fix it with the following command
for file in *.; do mv "$file" `echo $file | tr " " '_'` ; done **(for spaces)**
Now I more often come across files with commas, apostrophes, brackets and other characters. How would I modify the command to remove such characters?
Also tried rename 's/[^a-zA-Z0-9_-]//' *.mp4 but it doesnt seem to remove spaces or commas

for file in *; do mv "$file" $(echo "$file" | sed -e 's/[^A-Za-z0-9._-]/_/g'); done &

Your rename would work if you add the g modifier to it, this performs all substitutions instead of only the first one:
$ echo "$file"
foo bar,spam.egg
$ rename -n 's/[^a-zA-Z0-9_-]//' "$file"
foo bar,spam.egg renamed as foobar,spam.egg
$ rename -n 's/[^a-zA-Z0-9_-]//g' "$file"
foo bar,spam.egg renamed as foobarspamegg
You can do this will bash alone, with parameter expansion:
For removing everything except a-zA-Z0-9_- from file names, assuming variable file contains the filename, using character class [:alnum:] to match all alphabetic characters and digits from current locale:
"${file//[^[:alnum:]_-]/}"
or explicitly, change the LC_COLLATE to C:
"${file//[^a-zA-Z0-9_-]/}"
Example:
$ file='foo bar,spam.egg'
$ echo "${file//[^[:alnum:]_-]/}"
foobarspamegg

Related

How to remove the first 2 letters of multiple file names in linux shell?

I have files with the names:
Ff6_01.png
Ff6_02.png
Ff6_03.png
...
...
FF1_01.png
FF1_02.png
FF1_03.png
I want to remove the first two letters of every file name, because then I would have a correct order of the files. Does anyone know the command in the linux shell?
You can use the syntax ${file:2} to refer to the name starting from the 3rd char.
Hence, you may do:
for file in F*png
do
mv "$file" "${file:2}"
done
In case ${file:2} did not work to you (neither rename), you can also use sed or cut:
for file in F*png
do
new_file=$(sed 's/^..//' <<< "$file") <---- cuts first two chars
new_file=$(cut -c3- <<< "$file") <---- the same
mv "$file" "$new_file"
done
Test
$ file="Ff6_01.png"
$ touch $file
$ ls
Ff6_01.png
$ mv "$file" "${file:2}"
$ ls
6_01.png

How to remove the extension of a file?

I have a folder that is full of .bak files and some other files also. I need to remove the extension of all .bak files in that folder. How do I make a command which will accept a folder name and then remove the extension of all .bak files in that folder ?
Thanks.
To remove a string from the end of a BASH variable, use the ${var%ending} syntax. It's one of a number of string manipulations available to you in BASH.
Use it like this:
# Run in the same directory as the files
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
That works nicely as a one-liner, but you could also wrap it as a script to work in an arbitrary directory:
# If we're passed a parameter, cd into that directory. Otherwise, do nothing.
if [ -n "$1" ]; then
cd "$1"
fi
for FILENAME in *.bak; do mv "$FILENAME" "${FILENAME%.bak}"; done
Note that while quoting your variables is almost always a good practice, the for FILENAME in *.bak is still dangerous if any of your filenames might contain spaces. Read David W.'s answer for a more-robust solution, and this document for alternative solutions.
There are several ways to remove file suffixes:
In BASH and Kornshell, you can use the environment variable filtering. Search for ${parameter%word} in the BASH manpage for complete information. Basically, # is a left filter and % is a right filter. You can remember this because # is to the left of %.
If you use a double filter (i.e. ## or %%, you are trying to filter on the biggest match. If you have a single filter (i.e. # or %, you are trying to filter on the smallest match.
What matches is filtered out and you get the rest of the string:
file="this/is/my/file/name.txt"
echo ${file#*/} #Matches is "this/` and will print out "is/my/file/name.txt"
echo ${file##*/} #Matches "this/is/my/file/" and will print out "name.txt"
echo ${file%/*} #Matches "/name.txt" and will print out "/this/is/my/file"
echo ${file%%/*} #Matches "/is/my/file/name.txt" and will print out "this"
Notice this is a glob match and not a regular expression match!. If you want to remove a file suffix:
file_sans_ext=${file%.*}
The .* will match on the period and all characters after it. Since it is a single %, it will match on the smallest glob on the right side of the string. If the filter can't match anything, it the same as your original string.
You can verify a file suffix with something like this:
if [ "${file}" != "${file%.bak}" ]
then
echo "$file is a type '.bak' file"
else
echo "$file is not a type '.bak' file"
fi
Or you could do this:
file_suffix=$(file##*.}
echo "My file is a file '.$file_suffix'"
Note that this will remove the period of the file extension.
Next, we will loop:
find . -name "*.bak" -print0 | while read -d $'\0' file
do
echo "mv '$file' '${file%.bak}'"
done | tee find.out
The find command finds the files you specify. The -print0 separates out the names of the files with a NUL symbol -- which is one of the few characters not allowed in a file name. The -d $\0means that your input separators are NUL symbols. See how nicely thefind -print0andread -d $'\0'` together?
You should almost never use the for file in $(*.bak) method. This will fail if the files have any white space in the name.
Notice that this command doesn't actually move any files. Instead, it produces a find.out file with a list of all the file renames. You should always do something like this when you do commands that operate on massive amounts of files just to be sure everything is fine.
Once you've determined that all the commands in find.out are correct, you can run it like a shell script:
$ bash find.out
rename .bak '' *.bak
(rename is in the util-linux package)
Caveat: there is no error checking:
#!/bin/bash
cd "$1"
for i in *.bak ; do mv -f "$i" "${i%%.bak}" ; done
You can always use the find command to get all the subdirectories
for FILENAME in `find . -name "*.bak"`; do mv --force "$FILENAME" "${FILENAME%.bak}"; done

Script is not replacing the prefix of filenames

I am trying to replace the prefix of all files in the directory with another prefix (renaming).
This is my script
# Script to rename the files
#!/bin/bash
for file in $1*;
do
mv $file `echo $file | sed -e 's/^$1/$2/'`;
done
Upon executing the script with
rename.sh BIT SIT
I get the following errors
mv: `BITfile.h' and `BITFile.h' are the same file
mv: `BITDefs.cpp' and `BITDefs.cpp' are the same file
mv: `BITDefs.h' and `BITDefs.h' are the same file
Seems like sed is treating $1 and $2 as the same value, but when I print those variables on another line it shows that they are different.
As Roman Newaza says, you can use " instead of ' to tell Bash that you want variables to be expanded. However, in your case, it would be safest to write:
for file in "$1"* ; do
mv -- "$file" "$2${file#$1}"
done
so that weird characters in filenames, or in your script parameters, cannot cause any problems.
You can also use parameter expansion to replace the prefix of all files in a directory.
for file in "$1"*;
do
mv ${file} ${file/#$1/$2}
done
Use double quotes instead:
# ...
mv "$file" `echo $file | sed -e "s/^$1/$2/"`
# ...
And learn Quotes and escaping in Bash.
When you don't use double quotes, the variables won't expand.
I will rather use this
#!/bin/bash
for file in $1*;
do
mv "$file" "$1${file:${#2}}"
done
where
${file:${#2}
means substring, from length of the argument 2 to the end

How to insert a text at the beginning of a file?

So far I've been able to find out how to add a line at the beginning of a file but that's not exactly what I want. I'll show it with an example:
File content
some text at the beginning
Result
<added text> some text at the beginning
It's similar but I don't want to create any new line with it...
I would like to do this with sed if possible.
sed can operate on an address:
$ sed -i '1s/^/<added text> /' file
What is this magical 1s you see on every answer here? Line addressing!.
Want to add <added text> on the first 10 lines?
$ sed -i '1,10s/^/<added text> /' file
Or you can use Command Grouping:
$ { echo -n '<added text> '; cat file; } >file.new
$ mv file{.new,}
If you want to add a line at the beginning of a file, you need to add \n at the end of the string in the best solution above.
The best solution will add the string, but with the string, it will not add a line at the end of a file.
sed -i '1s/^/your text\n/' file
If the file is only one line, you can use:
sed 's/^/insert this /' oldfile > newfile
If it's more than one line. one of:
sed '1s/^/insert this /' oldfile > newfile
sed '1,1s/^/insert this /' oldfile > newfile
I've included the latter so that you know how to do ranges of lines. Both of these "replace" the start line marker on their affected lines with the text you want to insert. You can also (assuming your sed is modern enough) use:
sed -i 'whatever command you choose' filename
to do in-place editing.
Use subshell:
echo "$(echo -n 'hello'; cat filename)" > filename
Unfortunately, command substitution will remove newlines at the end of file. So as to keep them one can use:
echo -n "hello" | cat - filename > /tmp/filename.tmp
mv /tmp/filename.tmp filename
Neither grouping nor command substitution is needed.
To insert just a newline:
sed '1i\\'
You can use cat -
printf '%s' "some text at the beginning" | cat - filename
To add a line to the top of the file:
sed -i '1iText to add\'
my two cents:
sed -i '1i /path/of/file.sh' filename
This will work even is the string containing forward slash "/"
Hi with carriage return:
sed -i '1s/^/your text\n/' file
Note that on OS X, sed -i <pattern> file, fails. However, if you provide a backup extension, sed -i old <pattern> file, then file is modified in place while file.old is created. You can then delete file.old in your script.
There is a very easy way:
echo "your header" > headerFile.txt
cat yourFile >> headerFile.txt
PROBLEM: tag a file, at the top of the file, with the base name of the parent directory.
I.e., for
/mnt/Vancouver/Programming/file1
tag the top of file1 with Programming.
SOLUTION 1 -- non-empty files:
bn=${PWD##*/} ## bn: basename
sed -i '1s/^/'"$bn"'\n/' <file>
1s places the text at line 1 of the file.
SOLUTION 2 -- empty or non-empty files:
The sed command, above, fails on empty files. Here is a solution, based on https://superuser.com/questions/246837/how-do-i-add-text-to-the-beginning-of-a-file-in-bash/246841#246841
printf "${PWD##*/}\n" | cat - <file> > temp && mv -f temp <file>
Note that the - in the cat command is required (reads standard input: see man cat for more information). Here, I believe, it's needed to take the output of the printf statement (to STDIN), and cat that and the file to temp ... See also the explanation at the bottom of http://www.linfo.org/cat.html.
I also added -f to the mv command, to avoid being asked for confirmations when overwriting files.
To recurse over a directory:
for file in *; do printf "${PWD##*/}\n" | cat - $file > temp && mv -f temp $file; done
Note also that this will break over paths with spaces; there are solutions, elsewhere (e.g. file globbing, or find . -type f ... -type solutions) for those.
ADDENDUM: Re: my last comment, this script will allow you to recurse over directories with spaces in the paths:
#!/bin/bash
## https://stackoverflow.com/questions/4638874/how-to-loop-through-a-directory-recursively-to-delete-files-with-certain-extensi
## To allow spaces in filenames,
## at the top of the script include: IFS=$'\n'; set -f
## at the end of the script include: unset IFS; set +f
IFS=$'\n'; set -f
# ----------------------------------------------------------------------------
# SET PATHS:
IN="/mnt/Vancouver/Programming/data/claws-test/corpus test/"
# https://superuser.com/questions/716001/how-can-i-get-files-with-numeric-names-using-ls-command
# FILES=$(find $IN -type f -regex ".*/[0-9]*") ## recursive; numeric filenames only
FILES=$(find $IN -type f -regex ".*/[0-9 ]*") ## recursive; numeric filenames only (may include spaces)
# echo '$FILES:' ## single-quoted, (literally) prints: $FILES:
# echo "$FILES" ## double-quoted, prints path/, filename (one per line)
# ----------------------------------------------------------------------------
# MAIN LOOP:
for f in $FILES
do
# Tag top of file with basename of current dir:
printf "[top] Tag: ${PWD##*/}\n\n" | cat - $f > temp && mv -f temp $f
# Tag bottom of file with basename of current dir:
printf "\n[bottom] Tag: ${PWD##*/}\n" >> $f
done
unset IFS; set +f
Just for fun, here is a solution using ed which does not have the problem of not working on an empty file. You can put it into a shell script just like any other answer to this question.
ed Test <<EOF
a
.
0i
<added text>
.
1,+1 j
$ g/^$/d
wq
EOF
The above script adds the text to insert to the first line, and then joins the first and second line. To avoid ed exiting on error with an invalid join, it first creates a blank line at the end of the file and remove it later if it still exists.
Limitations: This script does not work if <added text> is exactly equal to a single period.
echo -n "text to insert " ;tac filename.txt| tac > newfilename.txt
The first tac pipes the file backwards (last line first) so the "text to insert" appears last. The 2nd tac wraps it once again so the inserted line is at the beginning and the original file is in its original order.
The simplest solution I found is:
echo -n "<text to add>" | cat - myFile.txt | tee myFile.txt
Notes:
Remove | tee myFile.txt if you don't want to change the file contents.
Remove the -n parameter if you want to append a full line.
Add &> /dev/null to the end if you don't want to see the output (the generated file).
This can be used to append a shebang to the file. Example:
# make it executable (use u+x to allow only current user)
chmod +x cropImage.ts
# append the shebang
echo '#''!'/usr/bin/env ts-node | cat - cropImage.ts | tee cropImage.ts &> /dev/null
# execute it
./cropImage.ts myImage.png
Another solution with aliases. Add to your init rc/ env file:
addtail () { find . -type f ! -path "./.git/*" -exec sh -c "echo $# >> {}" \; }
addhead () { find . -type f ! -path "./.git/*" -exec sh -c "sed -i '1s/^/$#\n/' {}" \; }
Usage:
addtail "string to add at the beginning of file"
addtail "string to add at the end of file"
With the echo approach, if you are on macOS/BSD like me, lose the -n switch that other people suggest. And I like to define a variable for the text.
So it would be like this:
Header="my complex header that may have difficult chars \"like these quotes\" and line breaks \n\n "
{ echo "$Header"; cat "old.txt"; } > "new.txt"
mv new.txt old.txt
TL;dr -
Consider using ex. Since you want the front of a given line, then the syntax is basically the same as what you might find for sed but the option of "in place editing" is built-in.
I cannot imagine an environment where you have sed but not ex/vi, unless it is a MS Windows box with some special "sed.exe", maybe.
sed & grep sort of evolved from ex / vi, so it might be better to say sed syntax is the same as ex.
You can change the line number to something besides #1 or search for a line and change that one.
source=myFile.txt
Front="This goes IN FRONT "
man true > $source
ex -s ${source} <<EOF
1s/^/$Front/
wq
EOF
$ head -n 3 $source
This goes IN FRONT TRUE(1) User Commands TRUE(1)
NAME
Long version, I recommend ex (or ed if you are one of the cool kids).
I like ex because it is portable, extremely powerful, allows me to write in-place, and/or make backups all without needing GNU (or even BSD) extensions.
Additionally, if you know the ex way, then you know how to do it in vi - and probably vim if that is your jam.
Notice that EOF is not quoted when we use "i"nsert and using echo:
str="+++ TOP +++" && ex -s <<EOF
r!man true
1i
`echo "$str"`
.
"0r!echo "${str}"
wq! true.txt
EOF
0r!echo "${str}" might also be used as shorthand for :0read! or :0r! that you have likely used in vi mode (it is literally the same thing) but the : is optional here and some implementations do not support "r"ead address of zero.
"r"eading directly to the special line #0 (or from line 1) would automatically push everything "down", and then you just :wq to save your changes.
$ head -n 3 true.txt | nl -ba
1 +++ TOP +++
2 TRUE(1) User Commands TRUE(1)
3
Also, most classic sed implementations do not have extensions (like \U&) that ex should have by default.
cat concatenates multiple files. <() sends output of a command as a file. Combining these two, we can insert lines at the beginning and end of a file by,
cat <(echo "line before the file") file.txt <(echo "line after the file")

Linux shell script to add leading zeros to file names

I have a folder with about 1,700 files. They are all named like 1.txt or 1497.txt, etc. I would like to rename all the files so that all the filenames are four digits long.
I.e., 23.txt becomes 0023.txt.
What is a shell script that will do this? Or a related question: How do I use grep to only match lines that contain \d.txt (i.e., one digit, then a period, then the letters txt)?
Here's what I have so far:
for a in [command i need help with]
do
mv $a 000$a
done
Basically, run that three times, with commands there to find one digit, two digits, and three digit filenames (with the number of initial zeros changed).
Try:
for a in [0-9]*.txt; do
mv $a `printf %04d.%s ${a%.*} ${a##*.}`
done
Change the filename pattern ([0-9]*.txt) as necessary.
A general-purpose enumerated rename that makes no assumptions about the initial set of filenames:
X=1;
for i in *.txt; do
mv $i $(printf %04d.%s ${X%.*} ${i##*.})
let X="$X+1"
done
On the same topic:
Bash script to pad file names
Extract filename and extension in bash
Using the rename (prename in some cases) script that is sometimes installed with Perl, you can use Perl expressions to do the renaming. The script skips renaming if there's a name collision.
The command below renames only files that have four or fewer digits followed by a ".txt" extension. It does not rename files that do not strictly conform to that pattern. It does not truncate names that consist of more than four digits.
rename 'unless (/0+[0-9]{4}.txt/) {s/^([0-9]{1,3}\.txt)$/000$1/g;s/0*([0-9]{4}\..*)/$1/}' *
A few examples:
Original Becomes
1.txt 0001.txt
02.txt 0002.txt
123.txt 0123.txt
00000.txt 00000.txt
1.23.txt 1.23.txt
Other answers given so far will attempt to rename files that don't conform to the pattern, produce errors for filenames that contain non-digit characters, perform renames that produce name collisions, try and fail to rename files that have spaces in their names and possibly other problems.
for a in *.txt; do
b=$(printf %04d.txt ${a%.txt})
if [ $a != $b ]; then
mv $a $b
fi
done
One-liner:
ls | awk '/^([0-9]+)\.txt$/ { printf("%s %04d.txt\n", $0, $1) }' | xargs -n2 mv
How do I use grep to only match lines that contain \d.txt (IE 1 digit, then a period, then the letters txt)?
grep -E '^[0-9]\.txt$'
Let's assume you have files with datatype .dat in your folder. Just copy this code to a file named run.sh, make it executable by running chmode +x run.sh and then execute using ./run.sh:
#!/bin/bash
num=0
for i in *.dat
do
a=`printf "%05d" $num`
mv "$i" "filename_$a.dat"
let "num = $(($num + 1))"
done
This will convert all files in your folder to filename_00000.dat, filename_00001.dat, etc.
This version also supports handling strings before(after) the number. But basically you can do any regex matching+printf as long as your awk supports it. And it supports whitespace characters (except newlines) in filenames too.
for f in *.txt ;do
mv "$f" "$(
awk -v f="$f" '{
if ( match(f, /^([a-zA-Z_-]*)([0-9]+)(\..+)/, a)) {
printf("%s%04d%s", a[1], a[2], a[3])
} else {
print(f)
}
}' <<<''
)"
done
To only match single digit text files, you can do...
$ ls | grep '[0-9]\.txt'
One-liner hint:
while [ -f ./result/result`printf "%03d" $a`.txt ]; do a=$((a+1));done
RESULT=result/result`printf "%03d" $a`.txt
To provide a solution that's cautiously written to be correct even in the presence of filenames with spaces:
#!/usr/bin/env bash
pattern='%04d' # pad with four digits: change this to taste
# enable extglob syntax: +([[:digit:]]) means "one or more digits"
# enable the nullglob flag: If no matches exist, a glob returns nothing (not itself).
shopt -s extglob nullglob
for f in [[:digit:]]*; do # iterate over filenames that start with digits
suffix=${f##+([[:digit:]])} # find the suffix (everything after the last digit)
number=${f%"$suffix"} # find the number (everything before the suffix)
printf -v new "$pattern" "$number" "$suffix" # pad the number, then append the suffix
if [[ $f != "$new" ]]; then # if the result differs from the old name
mv -- "$f" "$new" # ...then rename the file.
fi
done
There is a rename.ul command installed from util-linux package (at least in Ubuntu) by default installed.
It's use is (do a man rename.ul):
rename [options] expression replacement file...
The command will replace the first occurrence of expression with the given replacement for the provided files.
While forming the command you can use:
rename.ul -nv replace-me with-this in-all?-these-files*
for not doing any changes but reading what changes that command would make. When sure just reexecute the command without the -v (verbose) and -n (no-act) options
for your case the commands are:
rename.ul "" 000 ?.txt
rename.ul "" 00 ??.txt
rename.ul "" 0 ???.txt

Resources