Why isn't nullglob behaviour default in Bash? - linux

I recently needed to use a Bash for loop to recurse through a few files in a directory:
for video in **/*.{mkv,mp4,webm}; do
    echo "$video"
done
After way too much time spent debugging, I realised that the loop was run even when the pattern didn't match, resulting in:
file1.mkv
file2.mp4
**/*.webm # literal pattern printed when no .webm files can be found
Some detailed searching eventually revealed that this is known behaviour in Bash, for which enabling the shell's nullglob option with shopt -s nullglob is intended to be the solution.
Is there a reason that this counter-intuitive behaviour is the default and needs to be explicitly disabled with nullglob, instead of the other way around? Or to put the question another way, are there any disadvantages to always having nullglob enabled?

From man 7 glob:
Empty lists
The nice and simple rule given above: "expand a wildcard pattern
into the list of matching pathnames" was the original UNIX
definition. It allowed one to have patterns that expand into an
empty list, as in
xv -wait 0 *.gif *.jpg
where perhaps no *.gif files are present (and this is not an
error). However, POSIX requires that a wildcard pattern is left
unchanged when it is syntactically incorrect, or the list of
matching pathnames is empty. With bash one can force the
classical behavior using this command:
shopt -s nullglob
(Similar problems occur elsewhere. For example, where old
scripts have
rm `find . -name "*~"`
new scripts require
rm -f nosuchfile `find . -name "*~"`
to avoid error messages from rm called with an empty argument
list.)
In short, it is the behavior required to be POSIX compatible.
Granted though, you can now ask what the rationale for POSIX was to specify that behavior. See this unix.stackexchange.com question for some reasons/history.
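For reference, here is the original loop with the fix applied; a minimal sketch, assuming you also want ** to recurse into subdirectories, which requires the separate globstar option (without it, ** behaves like a plain *):
shopt -s nullglob globstar
for video in **/*.{mkv,mp4,webm}; do
    printf '%s\n' "$video"   # no iterations at all when nothing matches
done
As for disadvantages: nullglob has its own well-known pitfall. With it enabled, a command like ls *.nomatch expands to just ls when nothing matches, silently listing the whole directory instead of reporting an error.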

Related

how to iterate over files using find in bash/ksh shell

I am using find in a loop to search recursively for files of a specific extension, and then do something with each file.
cd $DSJobs
jobs=$(find $DSJobs -name "*.dsx")
for j in jobs; do
    echo "$j"
done
Assuming $DSJobs is a relevant folder, the output is "jobs", printed a single time; the loop doesn't even repeat.
I want to list all *.dsx files in a folder recursively through subfolders as well.
How do I make this work?
Thanks
(The immediate bug is that for j in jobs iterates over the literal word jobs; you would need for j in $jobs, but even that breaks on file names containing spaces.) The idiomatic way to do this is:
cd "$DSJobs"
find . -name "*.dsx" -print0 | while IFS= read -r -d "" job; do
echo "$job"
done
The complication derives from the fact that space and newline are perfectly valid filename characters, so you get find to output the filenames separated by the null character (which is not allowed to appear in a filename). Then you tell read to use the null character (with -d "") as the delimiter while reading the names.
IFS= read -r var is the way to get bash to read the characters verbatim, without dropping any leading/trailing whitespace or any backslashes.
There are further complications regarding the use of the pipe: the while loop runs in a subshell, so any variables set inside it are lost once the loop finishes. Whether that matters depends on what you do inside the loop.
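A quick illustration of that pipe issue, using a hypothetical counter: the variable modified inside the piped loop lives in a subshell and vanishes afterwards. Feeding find through process substitution instead keeps the loop in the current shell:
count=0
find . -name "*.dsx" -print0 | while IFS= read -r -d "" job; do
    count=$((count + 1))    # increments a copy in the subshell
done
echo "$count"               # still prints 0

count=0
while IFS= read -r -d "" job; do
    count=$((count + 1))    # runs in the current shell
done < <(find . -name "*.dsx" -print0)
echo "$count"               # prints the real number of files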
Note: take care to quote your variables, unless you know exactly when to leave the quotes off. Very detailed discussion here.
Having said that, bash can do this without find:
shopt -s globstar
cd "$DSJobs"
for job in **/*.dsx; do
    echo "$job"
done
This approach removes all the complications of find | while read.
Incorporating @Gordon's comment:
shopt -s globstar nullglob
for job in "$DSJobs"/**/*.dsx; do
    do_stuff_with "$job"
done
The "nullglob" setting is useful when no files match the pattern. Without it, the for loop will have a single iteration where job will have the value job='/path/to/DSJobs/**/*.dsx' (or whatever the contents of the variable) -- including the literal asterisks.
Since all you want is to find files with a specific extension...
find "${DSJobs}" -name "*.dsx"
Want to do this for several directories?
for d in <some list of directories>; do
    find "${d}" -name "*.dsx"
done
Want to do something interesting with the files?
find "${DSJobs}" -name "*.dsx" -exec dostuffwith.sh "{}" \;
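If dostuffwith.sh can accept several file names at once, ending the -exec with + instead of \; batches many files into each invocation, which is usually faster:
find "${DSJobs}" -name "*.dsx" -exec dostuffwith.sh {} +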

Copying files with even number in its name - bash

I want to copy all files from /usr/lib which end with .X.0.0, where X is an even number. Is there a better way than the following one to select all the files?
ls /usr/lib | grep "[02468].0.0$"
My problem with this solution is that for file names like "xy.800.0.0" I would need to use the bracket expression three times, and so on.
Just use a glob expansion to match the files:
cp /usr/lib/*.*[02468].0.0 /path/to/destination
The shell expands this pattern to the list of files before passing them as arguments to cp.
Since you tagged Bash, you can make the match more strict by using an extended glob:
shopt -s extglob failglob
cp /usr/lib/*.*([0-9])[02468].0.0 /path/to/destination
This matches 0 or more other digits followed by an even digit, and doesn't run the command at all if no files match.
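To see what that extended glob accepts and rejects, you can test the pattern against a few hypothetical names with [[ == ]]:
shopt -s extglob
pat='*.*([0-9])[02468].0.0'
for f in libfoo.so.8.0.0 libbar.so.7.0.0 xy.800.0.0; do
    [[ $f == $pat ]] && echo "match:    $f" || echo "no match: $f"
done
# match:    libfoo.so.8.0.0
# no match: libbar.so.7.0.0
# match:    xy.800.0.0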
You could use extended grep regular expressions to only match even numbers:
ls -1q /usr/lib | grep -E "\.[0-9]*[02468]\.0\.0$"
However, as Tom suggested, there are better options than parsing the output of ls. It's generally safer and faster to use glob expansion, and more maintainable to put everything in a Python script.

Deleting files ending with 2 digits in linux

In my folder, I want to delete the files that end with 2 digits (10, 11, 12, etc.).
What I've tried is
rm [0-9]*
but it seems like it doesn't work.
What is the right syntax of doing this?
Converting comments into an answer.
Your requirement is a bit ambiguous. However, you can use:
rm -i *[0-9][0-9]
as long as you don't mind files ending with three digits being removed. If you do mind the three-digit files being removed, use:
rm -i *[!0-9][0-9][0-9]
(assuming Bash history expansion doesn't get in the way). Note that if you have file names consisting of just 2 digits, those will not be removed; that would require:
rm -i [0-9][0-9]
Caution!
The -i option is for interactive. It is generally a bad idea to experiment with globbing and rm commands because you can do a lot of damage if you get it wrong. However, you can use other techniques to neutralize the danger, such as:
echo *[!0-9][0-9]
which echoes all the file names, or:
printf '%s\n' *[!0-9][0-9]
which lists the file names one per line. Basically, be cautious when experimenting with file deletion; don't risk making a mistake unless you know you have good backups readily available. Even then, it is better not to need to use them.
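For example, in a scratch directory with a few hypothetical file names, a printf dry run shows exactly what the glob would hand to rm:
mkdir /tmp/globtest && cd /tmp/globtest
touch report10 report11 log5 notes100
printf '%s\n' *[!0-9][0-9][0-9]
# report10
# report11   (log5 and notes100 are left alone)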
See also the GNU Bash manual on:
Pattern matching — which notes you might be able to use ^ in place of !.
The shopt built-in
The command
rm [0-9]*
means remove all the files whose names start with a digit; a bracket expression [] matches exactly one character. Since you intend to remove files ending with two digits, the command should be
rm *[0-9][0-9]
If the files have an extension, the command should be modified to
rm *[0-9][0-9]* or
rm *[0-9][0-9].ext
where ext is the extension, e.g. txt.

Command Substitution working on command line but not in script

Using Ubuntu 10.10, I have the following that I run on the command line:
result="$(ls -d !(*.*))"
chmod +x $result
This gets a list of files that have no extensions and makes them executable.
But when I move it to a (shell) script file it does not work. From what I have read around the forum, this is something to do with command substitution being run in a different subshell.
But I could not find a solution yet that works in my script :(
So how do you get the result of a command and store it in a variable within a script?
(Since @user000001 does not seem to want to write their comment into an answer, I'll do the toiling of writing the answer. Credit should go to them, though.)
The feature you are using is the extglob (extended globbing) feature of Bash. This is enabled by default for interactive shells, and disabled by default for non-interactive shells (i.e. shell scripts). To enable it, use the command shopt -s extglob.
Note that this command only has effect for lines below it:
shopt -s extglob
ls -d !(*.*)
It does not affect parsing of the same line:
shopt -s extglob; ls -d !(*.*) # won't work!!
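Putting that together, a minimal script version of the original one-liner might look like this (a sketch; nullglob is added so the pattern simply disappears when nothing matches, as discussed above):
#!/bin/bash
shopt -s extglob nullglob   # must come before the line that uses !(...)
files=( !(*.*) )            # all names without a dot; like ls -d !(*.*), this also picks up directories
if [ "${#files[@]}" -gt 0 ]; then
    chmod +x "${files[@]}"
fi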
In general I want to warn against using such special features of Bash. It makes the code rather unportable. I'd propose using POSIX features and tools instead, which make porting the code to another platform rather easy, and which represent a subset of possibilities that more developers understand without having to consult the documentation first.
What you want to achieve could also be done using find. This also has the advantage of being unproblematic in combination with strange file names (e.g. containing spaces, quotes, etc.):
find . -maxdepth 1 -type f ! -name '*.*' -exec chmod +x {} \;

find only files with extension using ls

I need to find only the files in a directory which have an extension, using ls (I can't use find).
I tried ls *.*, but if the directory doesn't contain any file with an extension it returns "No such file or directory".
I don't want that error; I want ls to return quietly to the command prompt if there are no files with an extension.
I have been trying to use grep with ls to achieve the same.
ls|grep "*.*" - doesn't work
but ls | grep "\." works.
I have no idea why grep *.* doesn't work. Any help is appreciated!
Thanks!
I think the correct solution is this:
( shopt -s nullglob ; echo *.* )
It's a bit verbose, but it will always work no matter what kind of funky filenames you have. (The problem with piping ls to grep is that typical systems allow really bizarre characters in filenames, including, for example, newlines.)
The shopt -s nullglob part enables ("sets") the nullglob shell option, which tells Bash that if no files have names matching *.*, then the *.* should be removed (i.e., should expand into nothing) rather than being left alone.
The parentheses (...) are to set up a subshell, so the nullglob option is only enabled for this small part of the script.
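You can watch that scoping at work; outside the parentheses the option is still off (hypothetical pattern):
( shopt -s nullglob; echo *.xyz )   # prints an empty line if nothing matches
echo *.xyz                          # still prints the literal *.xyz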
It's important to understand the difference between a shell pattern and a regular expression. Shell patterns are a bit simpler, but less flexible. grep matches using a regular expression. A shell pattern like
*.*
would be done with a regular expression as
.*\..*
but the regular expressions in grep are not anchored, which means it searches for a match anywhere on the line, making the two .* parts unnecessary.
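A quick demonstration with a few hypothetical names; because grep is unanchored, the short pattern selects exactly the same lines as the fully spelled-out one:
printf '%s\n' README notes.txt a.b.c | grep -E '.*\..*'
printf '%s\n' README notes.txt a.b.c | grep '\.'
# both print:
# notes.txt
# a.b.c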
Try
ls -1 | grep "\."
This lists only files with an extension, and nothing (an empty list) if there is no such file: just what you need.
With Linux grep, you can add -v to get a list of files with no extension.
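For instance (a sketch; -v simply inverts the match):
ls -1q | grep -v '\.'   # names containing no dot, i.e. no extension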
