Wildcards as shell parameters - linux

I know how regex and wildcards work in general, but I don't really understand why you can use them as parameters.
ls /[!\(][!\(][!\(]/
command results in the following output
...
com.apple.launchd.AIPZ6SAfpO
com.apple.launchd.HarlOx3LWS
com.apple.launchd.VmTi5KDz1h
powerlog
/usr/:
X11 include libexec sbin standalone
bin lib local share
/var/:
agentx empty log netboot rwho
at folders ma networkd spool
audit install mail root tmp
backups jabberd msgs rpc vm
db lib mysql run yp
From my understanding, this should match every three-character folder name not containing a slash: /[!\(][!\(][!\(]/
But why can I use it as parameter?

You can't use regular expressions as parameters (or rather, the shell will not treat a string as a regular expression when placed in a parameter). The unquoted glob /[!\(][!\(][!\(]/ matches, in order:
A slash.
Three characters, none of which is an opening parenthesis.
A slash.
In other words, three-letter root directories not containing ( anywhere.
The shell expands globs to zero (in case of Bash's nullglob, for example) or more arguments which may be passed to execve, as in this command:
$ strace -fe execve echo *
execve("/usr/bin/echo", ["echo", "directory1", "directory2"], 0x7ffcff705ce8 /* 44 vars */) = 0
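As a rough illustration of that expansion step, the sketch below counts the arguments a command receives with and without quoting. The scratch directory and the names abc, xyz, a(b and long_name are made up for the demonstration, and set -- merely stands in for any command, since it stores exactly the argument list the shell built.

```shell
# Work in a scratch directory so the glob has known matches (names are hypothetical).
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1
mkdir abc xyz 'a(b' long_name

# Unquoted: the shell replaces the pattern with the matching names
# before the command runs, so "set --" receives one argument per match.
set -- [!\(][!\(][!\(]/
unquoted_count=$#
unquoted_args="$*"

# Quoted: the pattern is passed through as one literal string.
set -- '[!\(][!\(][!\(]/'
quoted_count=$#

printf 'unquoted: %s args (%s)\n' "$unquoted_count" "$unquoted_args"
printf 'quoted: %s arg (%s)\n' "$quoted_count" "$1"
```

Only abc/ and xyz/ are exactly three characters long and free of (, so the unquoted form yields two arguments while the quoted form yields one.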

No, you don't know... Shell patterns are described in glob(3), while regular expressions (a more elaborate concept) are described in regex(3). These are two different libraries used for similar purposes. sh(1) doesn't use regular expressions when substituting parameters at all. It only uses the glob(3) library.

Because that's how the shell works. Any arguments containing (unquoted) glob characters/expressions, are expanded to filenames. That's what happens in, say rm *.txt (since * is a glob character), and that's what happens in ls /[!\(][!\(][!\(]/ (since [abc] is a glob expression).
They're not regular expressions, though. See e.g. https://mywiki.wooledge.org/glob for the syntax.

Related

How do I exclude a character in Linux

Write a wildcard to match all files (does not matter the files are in which directory, just ask for the wildcard) named in the following rule: starts with a string “image”, immediately followed by a one-digit number (in the range of 0-9), then a non-digit char plus anything else, and ends with either “.jpg” or “.png”. For example, image7.jpg and image0abc.png should be matched by your wildcard while image2.txt or image11.png should not.
My folder contained these files imag2gh.jpeg image11.png image1agb.jpg image1.png image2gh.jpg image2.txt image5.png image70.jpg image7bn.jpg Screenshot .png
If my command works, it should only display image1agb.jpg image1.png image2gh.jpg image5.png image70.jpg image7bn.jpg
This is the command I used (ls -ad image[0-9][^0-9]*{.jpg,.png}), but I'm only getting image1agb.jpg image2gh.jpg image7bn.jpg, so I'm missing image1.png and image5.png. This is what I ran in the Kali terminal:
ls -ad image[0-9][!0-9]*{.jpg,.png}
Info
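Character ranges like [0-9] work in shell globs (wildcards) as well as in RegEx statements, so the range itself is not the problem. The pattern misses image1.png and image5.png because [!0-9] must match exactly one character: it consumes the dot, and then nothing is left for the .png suffix to match.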
Possible solution
Pipe output of command ls -a1
to standard input of the grep command (which does support RegEx).
Use a RegEx statement to make grep filter filenames.
ls -a1|grep "image"'[[:digit:]]\+[[:alpha:]]*\.\(png\|jpg\)'
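A glob-only alternative is also possible. The sketch below is an assumption about what the exercise wants, not the answerer's method: it spells out one pattern per case (digit directly followed by the extension, or digit, one non-digit, anything, then the extension). It follows the rule as stated in the exercise, so image70.jpg is excluded even though the asker's expected list includes it. The file names are the ones from the question, created in a scratch directory.

```shell
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1
touch imag2gh.jpeg image11.png image1agb.jpg image1.png image2gh.jpg \
      image2.txt image5.png image70.jpg image7bn.jpg 'Screenshot .png'

matches=""
for f in image[0-9].jpg image[0-9].png \
         image[0-9][!0-9]*.jpg image[0-9][!0-9]*.png; do
    # A pattern with no match is left as literal text, so skip names that do not exist.
    [ -e "$f" ] || continue
    matches="$matches $f"
done
printf 'matched:%s\n' "$matches"
```

In Bash the four patterns can be shortened with brace expansion, which is performed before globbing, but the explicit list also works in a plain POSIX sh.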

Cannot get git ls-remote list and no error at stdout

I'm executing git ls-remote ssh://git#git_repo:port * on two different computers on the same network, one running Linux and the other Windows. On Windows I get the list, but on Linux I don't: no error at all, just an empty list.
Both have the SSH key added to the remote repository and both are able to clone the repository.
Update 1:
Windows Git version: 2.19.2.windows.1
Linux Git version: 2.7.4
Update 2:
The repository is in Gerrit.
Update 3:
I'm facing this problem using the Jenkins Extended Choice Parameter plugin. It has not changed since 2016. Any workaround for this would also be an answer.
Any idea?
You probably should use:
git ls-remote ssh://git#git_repo:port
without any suffix, as it defaults to listing everything.
You can use:
git ls-remote ssh://git#git_repo:port '*'
(or the same with double quotes—one or both of these may work on Windows as well). In a Unix/Linux-style command shell, the shell will replace * with a list of all the files in the current directory before running the command, unless you protect the asterisk from the shell.
You can also use a single backslash:
git ls-remote ssh://git#git_repo:port \*
as there are a lot of ways to protect individual characters from shells. The rules get a little complicated, but in general, single quotes are the "most powerful" quotes, while double quotes quote glob characters [1] but not other expansions [2]. Backslashes quote the immediate next character if you're not already inside quotes (the behavior of backslash within double quotes varies in some shells).
[1] The glob characters are *, [, and ?. After [, characters inside the glob run to the closing ]. So echo foo[abc] looks for files named fooa, foob, and fooc. Note that . is generally not special: foo.* matches only files whose names start with foo., i.e., including the period: a file named foo does not start with foo., only with foo, and is not matched.
Globs are very different from regular expressions: in regular expressions, . matches any character (like ? does in glob) and asterisk means "repeat previous match zero or more times", so that glob * and regular-expression .* are similar. (In regular expression matches, we also need to consider whether the expression is anchored. Globs are always anchored so that the question does not arise.)
[2] Most expansions occur with dollar sign $, as in $var or ${var} or $(subcommand), but backquotes also invoke command substitution, as in echo `echo bar`.
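The quoting rules above can be checked with a small sketch. The helper count_args and the file names one and two are hypothetical, chosen only for this demonstration; the helper just reports how many arguments reached it.

```shell
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1
touch one two

# Hypothetical helper: prints the number of arguments it received.
count_args() { echo "$#"; }

unprotected=$(count_args *)   # the shell expands * to: one two
single=$(count_args '*')      # single quotes: a literal asterisk
double=$(count_args "*")      # double quotes also protect glob characters
escaped=$(count_args \*)      # a backslash protects the next character

var=bar
in_double="foo $var"          # double quotes still allow $ expansion
in_single='foo $var'          # single quotes suppress it

echo "$unprotected $single $double $escaped"
echo "$in_double / $in_single"
```

All three protected forms deliver one literal asterisk to the command, which is why any of them should suit git ls-remote in a Unix-style shell.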

Pick the specific file in the folder

I want to pick files of a specific format from the list of files in a directory. Please find the example below.
I have the following list of files (6 files).
Set-1
1) MAG_L_NT_AA_SUM_2017_01_20.dat
2) MAG_L_NT_AA_2017_01_20.dat
Set-2
1) MAG_L_NT_BB_SUM_2017_01_20.dat
2) MAG_L_NT_BB_2017_01_20.dat
Set-3
1) MAG_L_NT_CC_SUM_2017_01_20.dat
2) MAG_L_NT_CC_2017_01_20.dat
From the above three sets I need only 3 files.
1) MAG_L_NT_AA_2017_01_20.dat
2) MAG_L_NT_BB_2017_01_20.dat
3) MAG_L_NT_CC_2017_01_20.dat
Note: The solution can span multiple lines of commands, because I have to create a script for the above requirement. Thanks
Probably the easiest and least complex solution to your problem is to combine find (a tool for searching for files in a directory hierarchy) and grep (a tool for printing lines that match a pattern). You can also read the manuals for these tools by typing man find and man grep.
Before going straight to the solution, we need to understand how we will approach your problem. To match a pattern against a file name, we will use the find command with the option -name:
-name pattern
Base of file name (the path with the leading directories removed) matches shell pattern pattern. The metacharacters ('*', '?', and '[]')
match a '.' at the start of the base name (this is a change in
findutils-4.2.2; see section STANDARDS CONFORMANCE below). To ignore a
directory and the files under it, use -prune; see an example in the
description of -path. Braces are not recognised as being special,
despite the fact that some shells including Bash imbue braces with a
special meaning in shell patterns. The filename matching is performed
with the use of the fnmatch(3) library function. Don't forget to
enclose the pattern in quotes in order to protect it from expansion by
the shell.
For instance, if we want to search for a file whose name contains the string 'abc' in a directory called 'words_directory', we enter the following:
$ find words_directory -name "*abc*"
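find descends into subdirectories by default, so the command above already searches every directory below words_directory. Letting the shell expand words_directory/* first also works, but it skips hidden top-level entries: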
$ find words_directory/* -name "*abc*"
So first, we need to find all files whose names begin with "MAG_L_NT_" and end with ".dat". To find all matching names under /your/specified/path/, including in its subdirectories:
$ find /your/specified/path/* -name "MAG_L_NT_*.dat"
This prints all matching filenames, but the results still include names containing the string "SUM". This is where grep comes in. To exclude names containing the unwanted string, we will use the option -v:
-v, --invert-match
Invert the sense of matching, to select non-matching lines. (-v is
specified by POSIX.)
To use grep to filter the first command's output, we will use a pipe (|):
The standard shell syntax for pipelines is to list multiple commands,
separated by vertical bars ("pipes" in common Unix verbiage). For
example, to list files in the current directory (ls), retain only the
lines of ls output containing the string "key" (grep), and view the
result in a scrolling page (less), a user types the following into the
command line of a terminal:
ls -l | grep key | less
"ls -l" produces a process, the output (stdout) of which is piped to
the input (stdin) of the process for "grep key"; and likewise for the
process for "less". Each process takes input from the previous process
and produces output for the next process via standard streams. Each
"|" tells the shell to connect the standard output of the command on
the left to the standard input of the command on the right by an
inter-process communication mechanism called an (anonymous) pipe,
implemented in the operating system. Pipes are unidirectional; data
flows through the pipeline from left to right.
process1 | process2 | process3
Now that you are acquainted with the commands and options that will be used to achieve your goal, you are ready for the solution:
$ find /your/specified/path/* -name "MAG_L_NT_*.dat" | grep -v "SUM"
The find command prints every name that begins with "MAG_L_NT_" and ends with ".dat". grep -v then reads that output and removes every line containing the string "SUM".
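One caveat is that grep -v filters whole output lines, which are full paths, so a directory whose name contains "SUM" would hide every file beneath it. As a possible variation (not part of the answer above), the exclusion can be done inside find with a negated -name test, which only looks at the base name. The scratch tree and the directory name SUM_reports below are hypothetical.

```shell
root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT
mkdir -p "$root/SUM_reports"   # hypothetical directory with SUM in its name
for set in AA BB CC; do
    touch "$root/SUM_reports/MAG_L_NT_${set}_2017_01_20.dat" \
          "$root/SUM_reports/MAG_L_NT_${set}_SUM_2017_01_20.dat"
done

# Negated -name: only the file's base name is tested against *SUM*.
wanted=$(find "$root" -type f -name 'MAG_L_NT_*.dat' ! -name '*SUM*' | sort)

# Pipe through grep -v: the whole path is tested, directory names included.
piped=$(find "$root" -type f -name 'MAG_L_NT_*.dat' | grep -v 'SUM' | sort)

printf 'find only:\n%s\n' "$wanted"
printf 'find | grep -v:\n%s\n' "$piped"
```

In this layout the find-only form lists the three wanted files, while the piped form prints nothing because every path contains SUM_reports.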

What is file globbing?

I was just wondering what is file globbing? I have never heard of it before and I couldn't find a definition when I tried looking for it online.
Globbing is the * and ? and some other pattern matchers you may be familiar with.
Globbing interprets the standard wild card characters * and ?, character lists in square brackets, and certain other special characters (such as ^ for negating the sense of a match).
When the shell sees a glob, it will perform pathname expansion and replace the glob with matching filenames when it invokes the program.
For an example of the * operator, say you want to copy all files with a .jpg extension in the current directory to somewhere else:
cp *.jpg /some/other/location
Here *.jpg is a glob pattern that matches all files ending in .jpg in the current directory. It's equivalent to (and much easier than) listing the current directory and typing in each file you want manually:
$ ls
cat.jpg dog.jpg drawing.png recipes.txt zebra.jpg
$ cp cat.jpg dog.jpg zebra.jpg /some/other/location
Note that it may look similar, but it is not the same as Regular Expressions.
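To make that difference concrete, here is a small sketch comparing the two notations on the same strings. The helpers glob_match and regex_match are hypothetical names for this illustration; the first uses the shell's own case pattern matching, the second uses grep -E.

```shell
# Hypothetical helpers: print yes/no depending on whether $1 matches pattern $2.
glob_match() {
    case $1 in
        $2) echo yes ;;   # unquoted $2 is treated as a glob pattern
        *)  echo no ;;
    esac
}
regex_match() {
    if printf '%s\n' "$1" | grep -Eq -- "$2"; then echo yes; else echo no; fi
}

g1=$(glob_match  cat.jpg     '*.jpg')      # yes: * matches "cat"
r1=$(regex_match cat.jpg     '^.*\.jpg$')  # yes: the regex equivalent of *.jpg
g2=$(glob_match  catxjpg     '*.jpg')      # no: . is a literal dot in a glob
r2=$(regex_match catxjpg     '.jpg')       # yes: . matches any character in a regex
g3=$(glob_match  cat.jpg.bak '*.jpg')      # no: a glob must match the whole name
r3=$(regex_match cat.jpg.bak '\.jpg')      # yes: an unanchored regex matches a substring

echo "$g1 $r1 $g2 $r2 $g3 $r3"
```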
You can find more detailed information here and here

Using wildcards to exclude files with a certain suffix

I am experimenting with wildcards in bash and tried to list all the files that start with "xyz" but do not end with ".TXT", but I am getting incorrect results.
Here is the command that I tried:
$ ls -l xyz*[!\.TXT]
It is not listing the files with names "xyz" and "xyzTXT" that I have in my directory. However, it lists "xyz1", "xyz123".
It seems like adding [!\.TXT] after "xyz*" made the shell look for something that starts with "xyz" and has at least one character after it.
Any ideas why this is happening and how to correct this command? I know it can be achieved using other commands, but I am especially interested in knowing why it is failing and whether it can be done using only wildcards.
These commands will do what you want
shopt -s extglob
ls -l xyz!(*.TXT)
shopt -u extglob
The reason why your command doesn't work is that [!\.TXT], which is equivalent to [!\.TX], is a negated character set, not a negated string. xyz*[!\.TXT] therefore means xyz, followed by any sequence of characters (*), and finally exactly one character that is not in the set. It matches 'xyz1' and 'xyz123', but not 'xyz' (there is no final character to match) or 'xyzTXT' (its final character, T, is in the set).
I don't think this is doable with only wildcards.
Your command isn't working because it means:
Match everything that starts with xyz, followed by whatever you want, and that does not end with any of these characters: \, ., T and X. The second T doesn't count, because what you have inside [] is read as a set of characters and not as a string, as you thought.
You don't need to escape . either, as it has no special meaning inside a wildcard.
At least, this is my knowledge of wildcards.
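Where extglob is not available, one possible workaround is to glob on the prefix and reject the suffix in a case statement. This is a sketch under assumptions, not something either answer proposes: it needs a loop in addition to wildcards, and the file names are made up to cover the cases discussed in the question.

```shell
LC_ALL=C; export LC_ALL   # fix the glob sort order for a reproducible listing
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1
touch xyz xyz1 xyz123 xyzTXT xyz.TXT xyzabc.TXT

kept=""
for f in xyz*; do
    case $f in
        *.TXT) ;;                  # ends with the string ".TXT": skip it
        *)     kept="$kept $f" ;;  # everything else is kept
    esac
done
printf 'kept:%s\n' "$kept"
```

Unlike [!\.TXT], the case pattern *.TXT tests the whole four-character suffix, so xyz and xyzTXT are kept while xyz.TXT and xyzabc.TXT are dropped.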

Resources