How to match file extensions regardless of case in bash script

Let's say I want to match the .abc extension, but it could arrive as .ABC or .AbC. How can I identify all of these variations of the .abc extension in order to process .abc files?
Right now I'm using:
ls | grep -i .abc
but I've heard that piping ls to grep is usually not the best idea. Is there a better way to do this?

If you enter the extension literally, you can use character classes:
ls *.[Aa][Bb][Cc]
You can also use the -iname option of find:
find . -maxdepth 1 -iname '*.abc'

You can use the nocaseglob option to the shopt builtin to make globs not pay attention to case.
$ touch foo.abc foo.ABC
$ echo *.abc
foo.abc
$ shopt -s nocaseglob
$ echo *.abc
foo.ABC foo.abc
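
In a script, the same option can be scoped to a subshell so it does not change globbing for the rest of the program. The following is only a sketch: list_abc is a hypothetical helper name, the scratch directory and file names are made up for the demo, and nullglob is added as an assumption so that the loop body is skipped when nothing matches.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every *.abc file in a directory, ignoring case.
list_abc() {
    (
        # Subshell: the cd and the shopt changes do not leak into the caller.
        cd -- "$1" || exit 1
        shopt -s nocaseglob nullglob
        for f in *.abc; do
            printf '%s\n' "$f"
        done
    )
}

# Demo in a scratch directory (made-up file names).
demo_dir=$(mktemp -d)
touch "$demo_dir/foo.abc" "$demo_dir/bar.ABC" "$demo_dir/baz.AbC" "$demo_dir/skip.txt"
matches=$(list_abc "$demo_dir")
printf '%s\n' "$matches"
rm -rf -- "$demo_dir"
```

Replacing the printf with the real processing command keeps the case-insensitive match without calling ls or grep.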

Related

rename all files in folder through regular expression

I have a folder with lots of files whose names have the following structure:
01.artist_name - song_name.mp3
I want to go through all of them and rename them using the regexp:
/^\d+\./
so I get only:
artist_name - song_name.mp3
How can I do this in bash?

You can do this in BASH:
for f in [0-9]*.mp3; do
mv "$f" "${f#*.}"
done
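
The loop above strips everything up to the first dot, whatever it is. If the prefix should only be removed when it really is a run of digits, bash's own regex matching can mirror the regexp from the question. This is a sketch under assumptions: strip_numeric_prefix is a hypothetical name, the demo files are made up, and mv -n is used so an existing file is never overwritten.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip a leading "<digits>." from each *.mp3 in a directory.
strip_numeric_prefix() {
    local dir=$1 f base
    for f in "$dir"/*.mp3; do
        [ -e "$f" ] || continue          # the glob matched nothing
        base=${f##*/}
        # Same idea as ^\d+\. from the question; group 1 is the rest of the name.
        if [[ $base =~ ^[0-9]+\.(.+)$ ]]; then
            mv -n -- "$f" "$dir/${BASH_REMATCH[1]}"
        fi
    done
}

# Demo in a scratch directory (made-up file names).
demo_dir=$(mktemp -d)
touch "$demo_dir/01.artist_name - song_name.mp3" "$demo_dir/no_prefix.mp3"
strip_numeric_prefix "$demo_dir"
result=$(ls "$demo_dir")
printf '%s\n' "$result"
rm -rf -- "$demo_dir"
```

Files without a numeric prefix, such as no_prefix.mp3 in the demo, are left alone.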

Use the Perl rename utility. It might already be installed on your version of Linux, or be easy to find.
rename 's/^\d+\.//' -n *.mp3
With the -n flag, it will be a dry run, printing what would be renamed, without actually renaming. If the output looks good, drop the -n flag.

Use sed to compute the new name:
for f in *.mp3; do
new_name=$(printf '%s\n' "$f" | sed 's/^[^.]*\.//')
mv -- "$f" "$new_name"
done
In this case, the regular expression ^[^.]*\. matches everything up to and including the first period of the name.

Identifying multiple file types with bash

I'm pretty sure I've seen this done before but I can't remember the exact syntax.
Suppose you have a couple of files with different file extensions:
foo.txt
bar.rtf
index.html
and instead of doing something with all of them (cat *), you only want to run a command on 2 of the 3 file extensions.
Can't you do something like this?
cat ${*.txt|*.rtf}
I'm sure there's some find trickery to identify the files first and pipe them to a command, but I think bash supports what I'm talking about without having to do that.

The syntax you want is cat *.{txt,rtf}. A comma is used instead of a pipe.
$ echo foo > foo.txt
$ echo bar > bar.rtf
$ echo "bar txt" > bar.txt
$ echo "test" > index.html
$ cat *.{txt,rtf}
bar txt
foo
bar
$ ls *.{txt,rtf}
bar.rtf bar.txt foo.txt
But as Anthony Geoghegan said in their answer, there's a simpler approach you can use.

Shell globbing is much more basic than regular expressions. If you want to cat all the files which have a .txt or .rtf suffix, you'd simply use:
cat *.txt *.rtf
The glob patterns will be expanded to list all the filenames that match the pattern. In your case, the above command would call the cat command with foo.txt and bar.rtf as its arguments.
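
One caveat when scripting this: if a pattern matches nothing, bash passes it to cat literally. A possible way around that, sketched here with made-up demo files, is to expand the globs into an array with nullglob enabled and only call cat when the array is not empty.

```shell
#!/usr/bin/env bash
# Demo files in a scratch directory (names taken from the question).
demo_dir=$(mktemp -d)
echo foo  > "$demo_dir/foo.txt"
echo bar  > "$demo_dir/bar.rtf"
echo test > "$demo_dir/index.html"

shopt -s nullglob                  # unmatched patterns expand to nothing
files=( "$demo_dir"/*.txt "$demo_dir"/*.rtf )
shopt -u nullglob

output=""
if (( ${#files[@]} > 0 )); then
    output=$(cat -- "${files[@]}")
fi
printf '%s\n' "$output"
rm -rf -- "$demo_dir"
```

The array keeps each file name as one word, so names containing spaces are passed to cat intact.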

Here's a simple way I do it, using command substitution:
cat $(find . -type f \( -name "*.txt" -o -name "*.rtf" \))
But Anthony Geoghegan's answer is much simpler. I learned from it too.
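
The command substitution above splits on whitespace, so it misreads file names that contain spaces. A variant that lets find run cat itself avoids the splitting. This is a sketch with made-up demo files; the sort at the end is only there to make the demo output order predictable.

```shell
#!/usr/bin/env bash
# Demo files in a scratch directory, one with a space in its name.
demo_dir=$(mktemp -d)
echo foo  > "$demo_dir/foo.txt"
echo bar  > "$demo_dir/my notes.rtf"
echo test > "$demo_dir/index.html"

# find hands the matching names to cat directly; nothing is word-split.
output=$(find "$demo_dir" -type f \( -name '*.txt' -o -name '*.rtf' \) -exec cat -- {} + | sort)
printf '%s\n' "$output"
rm -rf -- "$demo_dir"
```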

file name matching, differentiation with numbers and characters

I have a folder with many files.
Some of the files are named like file_1 file_10 file_21 file_345;
others are named like file_fr file_de file_cn.
I want to move the first type of files into another folder,
with something like
mv file_* another_folder
but file_* matches all of the files.
Is there a good way to script this?
Thanks

Try this:
mv file_[0-9]* another_folder
In response to glenn jackman’s comment, a stricter version that only accepts digits after the underscore:
ls | grep -E '^file_[0-9]+$' | xargs mv -t another_folder

bash:
shopt -s extglob
mv file_+([0-9]) ..
http://www.gnu.org/software/bash/manual/bashref.html#Pattern-Matching
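
To make the difference between the two patterns concrete, here is a small sketch in a scratch directory. The file file_2x is a made-up name added to show what the looser glob also picks up, and nullglob is an added assumption so that empty matches yield empty arrays.

```shell
#!/usr/bin/env bash
shopt -s extglob nullglob     # extglob must be on before the +( ) pattern is parsed

demo_dir=$(mktemp -d)
mkdir "$demo_dir/another_folder"
touch "$demo_dir"/file_{1,10,21,345,fr,de,cn,2x}

loose=( "$demo_dir"/file_[0-9]* )       # a digit, then anything: also file_2x
strict=( "$demo_dir"/file_+([0-9]) )    # one or more digits and nothing else

echo "loose: ${#loose[@]}, strict: ${#strict[@]}"
mv -- "${strict[@]}" "$demo_dir/another_folder"/
moved=$(ls "$demo_dir/another_folder")
remaining=$(ls "$demo_dir")
rm -rf -- "$demo_dir"
```

Only the purely numeric names end up in another_folder; file_2x and the language-suffixed files stay where they were.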

ambiguous redirection

I'm trying to go through the current directory and all subdirectories, and add some annotations to each file that ends in .sql.
Here's a snippet of the code:
HEADER="--SQL HEADER"
for f in 'find . -name *.sql';
do
echo $f
echo -e $HEADER > $f.tmp;
FNAME=${f//\//_/};
echo -e "\n\n--MORE ANNOTATIONS ${FNAME%.*}:1" >> $f.tmp;
cat $f >> $f.tmp;
mv $f.tmp $f;
rm $f.tmp
done;
I'm a beginner at bash, so I think some of the errors I'm getting might be due to the find statement in the loop,
but this is the error I get:
find . -name X.sql A.sql W.sql E.sql S.sql
./annotate.sh: line 6: $f.tmp: ambiguous redirect
./annotate.sh: line 8: $f.tmp: ambiguous redirect
./annotate.sh: line 9: $f.tmp: ambiguous redirect
mv: invalid option -- n
Try `mv --help' for more information.
rm: invalid option -- n
Try `rm --help' for more information.
Any help would be greatly appreciated =)

Here's the problem. Your "echo" gives it away:
echo $f
outputs
find . -name X.sql A.sql W.sql E.sql S.sql
I think the problem is you have straight single quotes ('') in the find command, instead of backquotes (``). So it's not really running find, but simply expanding the wildcards.
You may have to quote the wildcard so it gets passed to find instead of evaluated by the shell:
for f in `find . -name \*.sql`;
However, there are several problems in your script, which you should address if you want to use it more than once. See ormaaj's answer.
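
The "ambiguous redirect" message itself comes from the unquoted $f.tmp: once $f holds a string with spaces, the redirection target splits into several words. A minimal sketch of that effect, using the value the loop variable held in the question and a scratch directory:

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
f='find . -name *.sql'        # what the loop variable held in the question

# Unquoted: $f.tmp splits into several words, so bash refuses the redirect.
if ( cd "$demo_dir" && echo test > $f.tmp ) 2>/dev/null; then
    unquoted=ok
else
    unquoted=failed
fi

# Quoted: a single word, so a single (oddly named) file is created.
if ( cd "$demo_dir" && echo test > "$f.tmp" ) 2>/dev/null; then
    quoted=ok
else
    quoted=failed
fi

echo "unquoted: $unquoted, quoted: $quoted"
rm -rf -- "$demo_dir"
```

Quoting every expansion, as in "$f.tmp", is what keeps real file names with spaces from triggering the same error.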

The problem, as already pointed out, is that find isn't actually being executed. However, the pattern itself is also flawed. Iterating with a for loop over the output of a command substitution is unreliable: splitting the output into words requires leaving the substitution unquoted, and that breaks on filenames containing whitespace or newlines, even if pathname expansion is disabled.
Preferably, use -exec. First write this script to a file and chmod u+x scriptname:
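#!/usr/bin/env bash
# Prepend a header and an annotation line to every file given as an argument.
header="--SQL HEADER"
for f in "$@"; do
echo "$f" >&2
fname=${f//\//_}   # replace every / in the path with _
cat - "$f" <<EOF >"$f.tmp"
${header}


--MORE ANNOTATIONS ${fname%.*}:1
EOF
mv "$f.tmp" "$f"
done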
Then run find like this:
find . -name '*.sql' -exec ./scriptname {} +
Alternatively, (and assuming this is a recent version of Bash), use globstar and no find (ksh has a similar feature if you prefer). This may be slower depending upon the job - the shell must pre-generate the list of files.
#!/usr/bin/env bash
shopt -s globstar
for f in ./**/*.sql; do
...
Alternatively, if you have Bash 4 and a system with the necessary GNU utilities, use -print0.
find . -name '*.sql' -print0 | while IFS= read -rd '' f; do
# <body of the above for loop here>
done
See: http://mywiki.wooledge.org/UsingFind
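
Putting the -print0 loop and the annotation body together might look like the sketch below. It is not the answer's exact script: annotate_sql is a hypothetical function name, the annotation uses the path relative to the starting directory (an assumption), and the demo file is made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prepend the header and annotation to every .sql file under a directory.
annotate_sql() {
    local root=$1 header="--SQL HEADER" f fname
    while IFS= read -r -d '' f; do
        fname=${f#"$root"/}          # path relative to the starting directory
        fname=${fname//\//_}         # a/b.sql -> a_b.sql
        {
            printf '%s\n\n\n' "$header"
            printf -- '--MORE ANNOTATIONS %s:1\n' "${fname%.*}"
            cat -- "$f"
        } > "$f.tmp" && mv -- "$f.tmp" "$f"
    done < <(find "$root" -type f -name '*.sql' -print0)
}

# Demo: a file inside a directory whose name contains a space.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub dir"
echo 'SELECT 1;' > "$demo_dir/sub dir/X.sql"
annotate_sql "$demo_dir"
annotated=$(cat "$demo_dir/sub dir/X.sql")
printf '%s\n' "$annotated"
rm -rf -- "$demo_dir"
```

Feeding the loop through process substitution rather than a pipe keeps it in the current shell, which matters if the loop needs to set variables that are read afterwards.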

Recursively look for files with a specific extension

I'm trying to find all files with a specific extension in a directory and its subdirectories with my bash (Latest Ubuntu LTS Release).
This is what's written in a script file:
#!/bin/bash
directory="/home/flip/Desktop"
suffix="in"
browsefolders ()
for i in "$1"/*;
do
echo "dir :$directory"
echo "filename: $i"
# echo ${i#*.}
extension=`echo "$i" | cut -d'.' -f2`
echo "Erweiterung $extension"
if [ -f "$i" ]; then
if [ $extension == $suffix ]; then
echo "$i ends with $in"
else
echo "$i does NOT end with $in"
fi
elif [ -d "$i" ]; then
browsefolders "$i"
fi
done
}
browsefolders "$directory"
Unfortunately, when I start this script in terminal, it says:
[: 29: in: unexpected operator
(with $extension instead of 'in')
What's going on here, where's the error?
But this curly brace

find "$directory" -type f -name "*.in"
is a bit shorter than that whole thing (and safer - deals with whitespace in filenames and directory names).
Your script is probably failing for entries that don't have a . in their name, making $extension empty.

find {directory} -type f -name '*.extension'
Example: To find all csv files in the current directory and its sub-directories, use:
find . -type f -name '*.csv'

The syntax I use is a bit different from what @Matt suggested:
find $directory -type f -name \*.in
(it's one less keystroke).

Without using find:
du -a $directory | awk '{print $2}' | grep '\.in$'

Though the find command can be useful here, the shell itself provides options to achieve this without any external tools. Bash supports extended globbing, which lets you collect the file names under recursive paths that match the extensions you want.
The extended option is extglob, which is set using the shopt builtin as below. Options are enabled with the -s flag and disabled with the -u flag. Additionally, you can use a couple more options: nullglob, with which an unmatched glob is swept away entirely and replaced with a set of zero words, and globstar, which allows recursing through all the directories.
shopt -s extglob nullglob globstar
Now all you need to do is form the glob expression to include the files of a certain extension which you can do as below. We use an array to populate the glob results because when quoted properly and expanded, the filenames with special characters would remain intact and not get broken due to word-splitting by the shell.
For example to list all the *.csv files in the recursive paths
fileList=(**/*.csv)
The ** pattern recurses through the sub-folders, and *.csv is the glob that matches any file with that extension. Now, to print the actual files, just do
printf '%s\n' "${fileList[@]}"
Using an array and doing a properly quoted expansion is the right way in shell scripts, but for interactive use you could simply use ls with the glob expression:
ls -1 -- **/*.csv
This can be extended to match several extensions at once (similar to combining multiple -name tests in a find command). For example, to get all image files recursively, i.e. those with the extensions *.gif, *.png and *.jpg, all you need to do is
ls -1 -- **/+(*.jpg|*.gif|*.png)
This can also be extended to negate the results. With the same syntax, you can use the glob to exclude files of certain types. Assuming you want to exclude file names with the extensions above, you could do
excludeResults=()
excludeResults=(**/!(*.jpg|*.gif|*.png))
printf '%s\n' "${excludeResults[@]}"
The construct !() is a negation that excludes any of the file extensions listed inside, and | is an alternation operator, just as in extended regular expressions, that does an OR match of the globs.
Note that this extended glob support is not available in the POSIX Bourne shell; it is specific to recent versions of bash. So if you are considering portability of scripts across POSIX shells and bash, this option wouldn't be right.
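
As a small end-to-end illustration of the globstar approach, the sketch below builds a made-up directory tree in a scratch location and collects the matching files into an array. It assumes Bash 4 or later, since globstar is not available in older versions.

```shell
#!/usr/bin/env bash
shopt -s nullglob globstar

# Made-up tree: matching files at three depths, one path containing spaces.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/a/b c"
touch "$demo_dir/top.in" "$demo_dir/a/mid.in" "$demo_dir/a/b c/deep file.in" "$demo_dir/a/other.txt"

found=( "$demo_dir"/**/*.in )     # ** also matches zero directories, so top.in is included

for f in "${found[@]}"; do
    printf 'found: %s\n' "$f"
done
rm -rf -- "$demo_dir"
```

Because the results live in an array and are expanded with quotes, the entry with spaces in its path stays a single element.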

find "$PWD" -type f -name "*.in"

There's a { missing after browsefolders ()
All $in should be $suffix
The line with cut gets you only the middle part of front.middle.extension. You should read up your shell manual on ${varname%%pattern} and friends.
I assume you do this as an exercise in shell scripting, otherwise the find solution already proposed is the way to go.
To check for proper shell syntax, without running a script, use sh -n scriptname.
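
The difference between the cut approach and parameter expansion, and the effect of quoting inside [ ], can be sketched as follows. has_suffix is a hypothetical helper name and the file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the file name ends in ".<suffix>".
has_suffix() {
    local name=${1##*/} suffix=$2
    [[ $name == *.* ]] || return 1       # no dot at all: no extension
    [ "${name##*.}" = "$suffix" ]        # text after the last dot; quoted, so never an empty operand
}

path="front.middle.in"
echo "cut gives:       $(echo "$path" | cut -d'.' -f2)"   # the middle part
echo "expansion gives: ${path##*.}"                       # the real extension

for candidate in "some dir/front.middle.in" "noext" "data.out"; do
    if has_suffix "$candidate" in; then
        echo "$candidate ends with in"
    else
        echo "$candidate does NOT end with in"
    fi
done
```

With the operands quoted, a name without any dot simply fails the comparison instead of producing an "unexpected operator" error.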

To find all the pom.xml files in your current directory and print them, you can use:
find . -name 'pom.xml' -print

find $directory -type f -name "*.in"|grep $substring

for file in "${LOCATION_VAR}"/*.zip
do
echo "$file"
done
