How does the ** work while searching for a Path - linux

I didn't find a lot of info about this, as far as I know it matches filenames and directories recursively, but how does it work?

The glob-expression ** is used to match all files and zero or more directories and subdirectories. If the pattern is followed by a /, only directories and subdirectories match.
In other words, it performs a recursive file search during pathname expansion on the command line.
Depending on the shell you use, it needs to be enabled. In bash this is done with:
$ shopt -s globstar
Here are examples:
# list all files recursively
$ echo **
# list all files recursively that end with .txt
$ echo **/*.txt
# list all files recursively that are in a subdirectory foo
$ echo **/foo/**
Beware that the pattern **.txt does not work recursively. It is treated as two adjacent single-asterisk globs and is identical to *.txt.
Note: there are subtle differences between bash and zsh, but in general it works the same.
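To make the difference between **/*.txt and **.txt concrete, here is a minimal bash sketch. The throwaway directory tree and the variable names (demo, recursive, flat) are made up for illustration, and it assumes bash 4 or later so that globstar is available.

```shell
# Bash sketch: compare **/*.txt with **.txt in a temporary tree.
shopt -s globstar

demo=$(mktemp -d)                      # hypothetical scratch directory
mkdir -p "$demo/a/b"
touch "$demo/top.txt" "$demo/a/mid.txt" "$demo/a/b/deep.txt"
cd "$demo" || exit 1

recursive=( **/*.txt )                 # descends into a/ and a/b/
flat=( **.txt )                        # behaves like *.txt: current directory only

printf 'recursive matches: %d\n' "${#recursive[@]}"
printf 'flat matches:      %d\n' "${#flat[@]}"

cd - >/dev/null || exit 1
rm -rf "$demo"
```

Storing the matches in arrays keeps each filename as one element, even if it contains spaces.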

Why does echo command interpret variable for base directory?

I would like to find some file types in pictures folder and I have created the following bash-script in /home/user/pictures folder:
for i in *.pdf *.sh *.txt;
do
echo 'all file types with extension' $i;
find /home/user/pictures -type f -iname $i;
done
But when I execute the bash-script, it does not work as expected for files that are located on the base directory /home/user/pictures. Instead of echo 'All File types with Extension *.sh' the command interprets the variable for base directory:
all file types with extension file1.sh
/home/user/pictures/file1.sh
all file types with extension file2.sh
/home/user/pictures/file2.sh
all file types with extension file3.sh
/home/user/pictures/file3.sh
I would like to know why the echo command does not print "All File types with Extension *.sh".
Revised code:
for i in '*.pdf' '*.sh' '*.txt'
do
echo "all file types with extension $i"
find /home/user/pictures -type f -iname "$i"
done
Explanation:
In bash, a string containing *, or a variable which expands to such a string, may be expanded as a glob pattern unless that string is protected from glob expansion by putting it inside quotes (although if the glob pattern does not match any files, then the original glob pattern will remain after attempted expansion).
In this case, it is not wanted for the glob expansion to happen - the string containing the * needs to be passed as a literal to each of the echo and the find commands. So the $i should be enclosed in double quotes - these will allow the variable expansion from $i, but the subsequent wildcard expansion will not occur. (If single quotes, i.e. '$i' were used instead, then a literal $i would be passed to echo and to find, which is not wanted either.)
In addition to this, the initial for line needs to use quotes to protect against wildcard expansion in the event that any files matching any of the glob patterns exist in the current directory. Here, it does not matter whether single or double quotes are used.
Separately, the revised code here also removes some unnecessary semicolons. Semicolons in bash are a command separator and are not needed merely to terminate a statement (as in C etc).
Observed behaviour with original code
What seems to be happening here is that one of the patterns used in the initial for statement is matching files in the current directory (specifically the *.sh is matching file1.sh file2.sh, and file3.sh). It is therefore being replaced by a list of these filenames (file1.sh file2.sh file3.sh) in the expression, and the for statement will iterate over these values. (Note that the current directory might not be the same as either where the script is located or the top level directory used for the find.)
It would also still be expected that the *.pdf and *.txt would be used in the expression -- either substituted or not, depending on whether any matches are found. Therefore the output shown in the question is probably not the whole output of the script.
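The behaviour described above can be reproduced in a scratch directory. This is only a sketch: the directory, the file names and the array names (unquoted, quoted) are invented for the demonstration, and it assumes bash with the default nullglob and failglob settings.

```shell
# Bash sketch: what the for loop iterates over with and without quotes.
shopt -u nullglob failglob             # make the default behaviour explicit

demo=$(mktemp -d)                      # hypothetical scratch directory
touch "$demo/file1.sh" "$demo/file2.sh"
cd "$demo" || exit 1

unquoted=()
for i in *.pdf *.sh; do                # *.sh is expanded, *.pdf has no match and stays literal
    unquoted+=("$i")
done

quoted=()
for i in '*.pdf' '*.sh'; do            # quotes keep both patterns literal
    quoted+=("$i")
done

printf 'unquoted: %s\n' "${unquoted[@]}"
printf 'quoted:   %s\n' "${quoted[@]}"

cd - >/dev/null || exit 1
rm -rf "$demo"
```

Only the quoted loop hands the patterns themselves to later commands such as find -iname "$i".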
Unquoted patterns such as *.sh are expanded by the shell before the loop runs, which changes the values $i takes. Here is the trick I would use:
for i in pdf sh txt
do
echo "all file types with extension *.$i"
find /home/user/pictures -type f -iname "*.$i"
done

Copying files with even number in its name - bash

I want to copy all files from /usr/lib which ends with .X.0.0 where X is an even number. Is there a better way than the following one to select all the files?
ls /usr/lib | grep "[02468].0.0$"
My problem with this solution is that in case there are files with names like "xy.800.0.0" I need to use the bracket three times etc.
Just use a glob expansion to match the files:
cp /usr/lib/*.*[02468].0.0 /path/to/destination
The shell expands this pattern to the list of files before passing them as arguments to cp.
Since you tagged Bash, you can make the match more strict by using an extended glob:
shopt -s extglob failglob
cp /usr/lib/*.*([0-9])[02468].0.0 /path/to/destination
This matches 0 or more other digits followed by an even digit, and doesn't run the command at all if no files match.
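To see how the plain glob and the extended glob differ, the following bash sketch tests both patterns against a few made-up library names instead of touching /usr/lib. The names and the variables loose, strict, loose_hits and strict_hits are assumptions chosen for the example.

```shell
# Bash sketch: compare the plain and the extglob pattern on sample names.
shopt -s extglob

loose='*.*[02468].0.0'                 # plain glob from the first suggestion
strict='*.*([0-9])[02468].0.0'         # extglob: only digits may precede the even digit

loose_hits=()
strict_hits=()
for name in libfoo.so.2.0.0 libbar.so.800.0.0 libbaz.so.3.0.0 libqux.so.a2.0.0; do
    # An unquoted right-hand side of == is treated as a pattern.
    if [[ $name == $loose ]];  then loose_hits+=("$name");  fi
    if [[ $name == $strict ]]; then strict_hits+=("$name"); fi
done

printf 'loose:  %s\n' "${loose_hits[*]}"
printf 'strict: %s\n' "${strict_hits[*]}"
```

The name libqux.so.a2.0.0 is accepted by the plain glob, because * can swallow the letter, but rejected by the extended one.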
You could use grep with extended regular expressions to match only even numbers (note that the dots must be escaped, otherwise they match any character):
ls -1q /usr/lib | grep -E "\.[0-9]*[02468]\.0\.0$"
However, as Tom suggested, there are better options than parsing the output of ls. It's generally safer and faster to use glob expansion, and more maintainable to just put everything in a python script.

Exclude directory when grepping with zsh globbing

I have some file structure which contains projects with build folders at various depths. I'm trying to use zsh (extended) globbing to exclude files in build folders.
I tried using the following command and many other variants:
grep "string" (^build/)#
I'm reading this as "Any folder that doesn't match build 0 or more times."
However I'm still getting results from folders such as:
./ProjectA/build/.../file.mi
./ProjectB/package/build/.../file2.mi
Any suggestions?
This should work:
grep string (^build/)#*(.)
Explanation:
^build: anything not named build
^build/: any directory not named build. It will not match any other file type
(^build/)#: any directory path consisting of elements that are not named build. Again, this will not match a path where the last element is not a directory
(^build/)#*: Any path where all but the last element must not be named build. This will also list files. It also assumes that it would be ok, if the file itself were named build. If that is not the case you have to use (^build/)#^build
(^build/)#*(.): Same as before, but restricted to only match normal files by the glob qualifier (.)
I think you don't need to involve the shell for that task; grep comes with its own file-globber. From manpage:
--exclude-dir=GLOB
Skip any command-line directory with a name suffix that matches the pattern GLOB. When searching recursively, skip any subdirectory whose
base name matches GLOB. Ignore any redundant trailing slashes in GLOB.
So something like this should get the job done for you:
grep -R "string" --exclude-dir='build'
That filter will leave out subdirectories called exactly "build"; if you want to filter out directories that contain the string "build" (such as "build2" or "test-build") then use globbing inside the exclusion pattern:
grep -R "string" --exclude-dir='*build*'
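As a quick check of the --exclude-dir behaviour, here is a small sketch that builds a tree resembling the one in the question and searches it. The scratch directory, the src folder and the variable hits are invented for illustration, and it assumes a grep that supports --exclude-dir (GNU grep does).

```shell
# Shell sketch: grep -R with --exclude-dir on a throwaway project tree.
demo=$(mktemp -d)                      # hypothetical scratch directory
mkdir -p "$demo/ProjectA/src" "$demo/ProjectA/build" "$demo/ProjectB/package/build"
echo "string" > "$demo/ProjectA/src/file.mi"
echo "string" > "$demo/ProjectA/build/file.mi"
echo "string" > "$demo/ProjectB/package/build/file2.mi"

# -l prints only the names of matching files; build directories are skipped at any depth.
hits=$(cd "$demo" && grep -R -l "string" --exclude-dir='build' .)
printf '%s\n' "$hits"

rm -rf "$demo"
```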
For completeness' sake, I also include here how to do the same thing with two popular grep alternatives that I'm more or less familiar with:
The Silver Searcher: ag -G '^((?!build).)*$' string
Ripgrep: rg -g '!build' string
Using the glob qualifier e did the trick for me, both for 'grep' and 'ls':
grep -s "string" **/*(e[' [[ ! `echo "$REPLY" | grep "build/" ` ]]'])

How to open all files in a directory in Bourne shell script?

How can I use the relative path or absolute path as a single command line argument in a shell script?
For example, suppose my shell script is on my Desktop and I want to loop through all the text files in a folder that is somewhere in the file system.
I tried sh myshscript.sh /home/user/Desktop, but this doesn't seem feasible. And how would I avoid directory names and file names with whitespace?
myshscript.sh contains:
for i in `ls`
do
cat $i
done
Superficially, you might write:
cd "${1:-.}" || exit 1
for file in *
do
cat "$file"
done
except you don't really need the for loop in this case:
cd "${1:-.}" || exit 1
cat *
would do the job. And you could avoid the cd operation with:
cat "${1:-.}"/*
which lists (cats) all the files in the given directory, even if the directory or the file names contain spaces, newlines or other difficult-to-manage characters. You can use any appropriate glob pattern in place of *; if you want files ending in .txt, then use *.txt as the pattern, for example.
This breaks down if you might have so many files that the argument list is too long. In that case, you probably need to use find:
find "${1:-.}" -maxdepth 1 -type f -exec cat {} +
(Note that -maxdepth is a GNU find extension.)
Avoid using ls to generate lists of file names, especially if the script has to be robust in the face of spaces, newlines etc in the names.
Use a glob instead of ls, and quote the loop variable:
for i in "$1"/*.txt
do
cat "$i"
done
PS: ShellCheck automatically points this out.
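The advice above can be tried out on names that contain spaces. The function name cat_txt_files, the scratch directory and the sample files are hypothetical; the sketch sticks to constructs that should also work in a plain POSIX sh.

```shell
# Shell sketch: loop over *.txt in a directory given as the first argument.
cat_txt_files() {
    dir=${1:-.}                        # default to the current directory
    for f in "$dir"/*.txt; do
        [ -f "$f" ] || continue        # skips the literal pattern when nothing matches
        cat -- "$f"
    done
}

demo=$(mktemp -d)                      # hypothetical scratch directory
mkdir "$demo/my docs"
printf 'one\n' > "$demo/my docs/first file.txt"
printf 'two\n' > "$demo/my docs/second.txt"

output=$(cat_txt_files "$demo/my docs")
printf '%s\n' "$output"

rm -rf "$demo"
```

Because every expansion is quoted, the directory "my docs" and the file "first file.txt" are each passed along as a single word.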

find only files with extension using ls

I need to list only the files in a directory that have an extension, using ls (I can't use find).
I tried ls *.*, but if the directory doesn't contain any file with an extension it returns "No such file or directory".
I don't want that error; I want ls to simply return to the command prompt if there are no files with an extension.
I have tried using grep with ls to achieve this.
ls|grep "*.*" - doesn't work
but ls | grep "\." works.
I have no idea why grep *.* doesn't work. Any help is appreciated!
Thanks!
I think the correct solution is this:
( shopt -s nullglob ; echo *.* )
It's a bit verbose, but it will always work no matter what kind of funky filenames you have. (The problem with piping ls to grep is that typical systems allow really bizarre characters in filenames, including, for example, newlines.)
The shopt -s nullglob part enables ("sets") the nullglob shell option, which tells Bash that if no files have names matching *.*, then the *.* should be removed (i.e., should expand into nothing) rather than being left alone.
The parentheses (...) are to set up a subshell, so the nullglob option is only enabled for this small part of the script.
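The effect of nullglob, and of confining it to a subshell, can be seen in a directory that has no files with an extension. This is a sketch only: the scratch directory and the variables without_nullglob and with_nullglob are made up, and it assumes bash.

```shell
# Bash sketch: *.* with and without nullglob when nothing matches.
shopt -u nullglob failglob             # start from the default settings

demo=$(mktemp -d)                      # hypothetical scratch directory
touch "$demo/README" "$demo/Makefile"  # no dots in these names
cd "$demo" || exit 1

without_nullglob=$(echo *.*)                      # pattern is left as-is
with_nullglob=$( shopt -s nullglob; echo *.* )    # pattern expands to nothing

printf 'without nullglob: [%s]\n' "$without_nullglob"
printf 'with nullglob:    [%s]\n' "$with_nullglob"

# The option was only set inside the command substitution's subshell.
shopt -q nullglob && echo "nullglob leaked" || echo "nullglob still off here"

cd - >/dev/null || exit 1
rm -rf "$demo"
```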
It's important to understand the difference between a shell pattern and a regular expression. Shell patterns are a bit simpler, but less flexible. grep matches using a regular expression. A shell pattern like
*.*
would be done with a regular expression as
.*\..*
but the regular expressions in grep are not anchored, which means it searches for a match anywhere on the line, making the two .* parts unnecessary.
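A short sketch of that point, run on a made-up list of names rather than real ls output (the names and the variables full, short and star are assumptions for the example). It also hints at why grep "*.*" finds nothing: in a basic regular expression a leading * is taken literally.

```shell
# Shell sketch: unanchored regex vs. the shell-pattern-like spelling.
names=$(printf '%s\n' notes.txt Makefile archive.tar.gz README)

full=$(printf '%s\n' "$names" | grep '.*\..*')    # direct translation of *.*
short=$(printf '%s\n' "$names" | grep '\.')       # same result, since grep is unanchored
star=$(printf '%s\n' "$names" | grep '*.*' || true)  # leading * is literal, so no name matches

printf 'full:\n%s\n' "$full"
printf 'short:\n%s\n' "$short"
printf 'star: [%s]\n' "$star"
```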
Try
ls -1 | grep "\."
This lists only the files with an extension, and prints nothing (an empty list) if there are none, which is what you need.
You can add -v to grep to get a list of files with no extension.
