File Glob Patterns in Linux terminal - linux

I want to search for filenames which may contain kavi or kabhi.
I wrote this command in the terminal:
ls -l *ka[vbh]i*
Between ka and i there may be v or bh.
The command I wrote isn't correct. What would be the correct command?

A nice way to do this is to use extended globs, which give Bash patterns some of the power of regular expressions.
To start, you have to enable the extglob feature, since it is disabled by default:
shopt -s extglob
Then, write a pattern with the required condition: stuff + ka + either v or bh + i + stuff. All together:
ls -l *ka@(v|bh)i*
The syntax is a bit different from normal regular expressions; the relevant entry under Extended Globs reads:
@(list): Matches one of the given patterns.
Test
$ ls
a.php AABB AAkabhiBB AAkabiBB AAkaviBB s.sh
$ ls *ka@(v|bh)i*
AAkabhiBB AAkaviBB
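A minimal sketch for checking the pattern without touching the filesystem, using the file names from the test above. It assumes Bash 4.1 or later, where the right-hand side of [[ ... == ... ]] accepts extended-glob operators such as @(v|bh), which matches exactly one of the listed alternatives. The array name matches is only illustrative.

```shell
#!/usr/bin/env bash
shopt -s extglob

matches=()   # illustrative name, not from the original answer
for name in a.php AABB AAkabhiBB AAkabiBB AAkaviBB s.sh; do
    # @(v|bh) matches exactly one of the alternatives "v" or "bh"
    if [[ $name == *ka@(v|bh)i* ]]; then
        matches+=("$name")
    fi
done

printf '%s\n' "${matches[@]}"
```

AAkabiBB is rejected because a lone b is neither v nor bh.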

A slightly longer command line could use find, grep, and xargs. It is easy to extend to other search terms (by adding patterns to the grep command or by using further find options), arguably more readable, and flexible in that you can run any command on the files that are found:
find . | grep -e "kabhi" -e "kavi" | xargs ls -l
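If file names may contain spaces or newlines, a pipeline through grep and plain xargs can split them incorrectly. A hedged alternative sketch lets find do the matching itself with -name tests and hands the names to the command through -exec. The scratch directory and the sample names below are made up for the demonstration.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)   # scratch directory holding hypothetical sample files
touch "$demo_dir/AAkaviBB" "$demo_dir/AA kabhi BB" "$demo_dir/AAkabiBB"

# -name takes a shell pattern (quoted so the shell does not expand it);
# -o means "or"; the escaped parentheses group the two tests.
found=$(find "$demo_dir" -type f \( -name '*kavi*' -o -name '*kabhi*' \) \
            -exec basename {} \; | LC_ALL=C sort)
printf '%s\n' "$found"

# Same search, running ls -l on the matches; "{} +" passes names intact.
find "$demo_dir" -type f \( -name '*kavi*' -o -name '*kabhi*' \) -exec ls -l {} +

rm -rf "$demo_dir"
```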

You can get what you want by using curly braces in bash:
ls -l *ka{v,bh}i*
Note: this is not a regular expression question so much as a "shell globbing" question. Shell "glob patterns" are different from regular expressions, though they are similar in many ways.
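One practical difference is that brace expansion runs before pathname expansion: it simply generates the two words *kavi* and *kabhi*, and each is then globbed on its own. A small sketch of that behaviour, assuming default shell options (nullglob and failglob off) and a made-up scratch directory in which only one variant exists:

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)          # hypothetical scratch directory
touch "$demo_dir/AAkaviBB"     # only the "v" variant exists

# Brace expansion yields two separate patterns; the one without a match
# is left as a literal word because nullglob and failglob are off.
brace_words=( "$demo_dir"/*ka{v,bh}i* )
printf '%s\n' "${brace_words[@]##*/}"

rm -rf "$demo_dir"
```

In this situation ls -l would still list the match, but it would also be handed the literal *kabhi* and report that no such file exists.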

Related

How to use "Regular Expression" in ps?

I am trying to use
ps -C chromi*
to see all chromium processes, but no success. How can I use regular expression in here?
You cannot, ps does not support regular expressions. The argument is parsed literally.
You could most probably patch procps ps to support it, with (yet another!) additional flag. The patch looks simple: basically another tree-traversing parse_* function that uses regex.h instead of strncmp.
I doubt such a patch would make it upstream. It's typical to use other tools, most notably pgrep or a shell pipe into grep, to filter processes by command name. ps has to stay POSIX compatible, and it has so many options already.
Note that regular expression is not "globbing". Consult man 7 glob vs man 7 regex. Regular expression chromi* would match chrom or chromiiiii - chrom followed by zero or more i.
Note that unquoted arguments with "trigger" characters undergo filename expansion (ls 'chromi*' vs ls chromi*). This is different than passing the literal argument when there exist files that match the pattern. If the intention is to pass the pattern to the tool, quote the argument to prevent filename expansion.
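To make the distinction concrete, here is a small sketch that applies chromi* once as a regular expression and once as a shell pattern to the same made-up list of names. grep -x is used so the regular expression has to match the whole line, which makes the two results directly comparable.

```shell
#!/usr/bin/env bash
# As a regular expression, chromi* means "chrom" followed by zero or more "i".
regex_hits=$(printf '%s\n' chrom chromiiiii chromium chromium-browser |
             grep -c -x 'chromi*')

# As a shell pattern, chromi* means "chromi" followed by anything.
glob_hits=0
for name in chrom chromiiiii chromium chromium-browser; do
    case $name in
        chromi*) glob_hits=$((glob_hits + 1)) ;;
    esac
done

echo "regex (whole-line) matches: $regex_hits"
echo "glob matches: $glob_hits"
```

The regular expression accepts chrom and chromiiiii, while the glob accepts chromiiiii, chromium, and chromium-browser.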
I think you are looking for pgrep:
pgrep -f chromium
This will print pids only, no further information.
With the help of xargs, this can be piped to ps again for detailed output:
pgrep -f chromium | xargs ps -o pid,cmd,user,etime -p

Copying files with even number in its name - bash

I want to copy all files from /usr/lib which ends with .X.0.0 where X is an even number. Is there a better way than the following one to select all the files?
ls /usr/lib | grep "[02468].0.0$"
My problem with this solution is that if there are files with names like "xy.800.0.0", I need to use the bracket three times, and so on.
Just use a glob expansion to match the files:
cp /usr/lib/*.*[02468].0.0 /path/to/destination
The shell expands this pattern to the list of files before passing them as arguments to cp.
Since you tagged Bash, you can make the match more strict by using an extended glob:
shopt -s extglob failglob
cp /usr/lib/*.*([0-9])[02468].0.0 /path/to/destination
This matches 0 or more other digits followed by an even digit, and doesn't run the command at all if no files match.
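A self-contained sketch of the stricter pattern, run against a scratch directory that stands in for /usr/lib. The file names are invented for the demonstration; shopt -s extglob has to be executed before the line containing the pattern is parsed.

```shell
#!/usr/bin/env bash
shopt -s extglob failglob

src=$(mktemp -d)    # stands in for /usr/lib; sample names are hypothetical
dest=$(mktemp -d)
touch "$src/liba.so.2.0.0" "$src/libb.so.13.0.0" \
      "$src/xy.800.0.0"    "$src/libc.so.7.0.0"

# *([0-9]) matches zero or more digits; [02468] requires the last digit
# before .0.0 to be even.
cp "$src"/*.*([0-9])[02468].0.0 "$dest"
shopt -u failglob   # restore the default for the rest of the script

copied=$(ls "$dest")
printf '%s\n' "$copied"

rm -rf "$src" "$dest"
```

Only liba.so.2.0.0 and xy.800.0.0 are copied, since 13 and 7 are odd.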
You could use extended grep regular expressions to only match even numbers:
ls -1q /usr/lib | grep -E "\.[0-9]*[02468]\.0\.0$"
However, as Tom suggested, there are better options than parsing the output of ls. It's generally safer to use glob expansion.

What's the difference between "grep -e" and "grep -E" [closed]

I have a file test.txt, in which there are some formatted phone numbers. I'm trying to use grep to find the lines containing a phone number.
It seems that grep -e "[0-9]{3}-[0-9]{3}-[0-9]{4}" test.txt doesn't work and gives no results. But grep -E "[0-9]{3}-[0-9]{3}-[0-9]{4}" test.txt works. So I wonder what the difference between these two options is.
According to man grep:
-E, --extended-regexp
Interpret pattern as an extended regular expression (i.e. force
grep to behave as egrep).
-e pattern, --regexp=pattern
Specify a pattern used during the search of the input: an input
line is selected if it matches any of the specified patterns.
This option is most useful when multiple -e options are used to
specify multiple patterns, or when a pattern begins with a dash
(`-').
But I don't quite understand it. What is an extended regex?
As you mentioned, grep -E selects extended regular expressions. The -e option does not select a syntax at all: as Jonathan pointed out below, grep -e "specifies that the following argument is (one of) the regular expression(s) to be matched", and without -E that pattern is read as a basic regular expression. From the man page:
Basic vs Extended Regular Expressions
In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose
their special meaning; instead use the backslashed versions \?, \+, \{,
\|, \(, and \).
Traditional egrep did not support the { meta-character, and some egrep
implementations support \{ instead, so portable scripts should avoid { in
grep -E patterns and should use [{] to match a literal {.
GNU grep -E attempts to support traditional usage by assuming that { is
not special if it would be the start of an invalid interval specification.
For example, the command grep -E '{1' searches for the two-character
string {1 instead of reporting a syntax error in the regular expression.
POSIX.2 allows this behavior as an extension, but portable scripts should
avoid it.
But man pages are pretty terse, so for further info, check out this link:
http://www.regular-expressions.info/posix.html
The part of the man page about the { meta-character explains the difference you are seeing.
grep -e "[0-9]{3}-[0-9]{3}-[0-9]{4}"
won't work because, in a basic regular expression, { is an ordinary character rather than the start of an interval. Whereas
grep -E "[0-9]{3}-[0-9]{3}-[0-9]{4}"
does, because -E selects extended regular expressions, the same syntax egrep uses.
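The three spellings can be compared side by side. This is a small sketch on a made-up input line, using grep -c to count matching lines:

```shell
#!/usr/bin/env bash
sample='call 555-123-4567 today'   # made-up test line

# BRE (the default): "{3}" is three literal characters, so nothing matches.
bre_plain=$(printf '%s\n' "$sample" | grep -c -e '[0-9]{3}-[0-9]{3}-[0-9]{4}')
# BRE with backslashed braces: "\{3\}" is an interval.
bre_escaped=$(printf '%s\n' "$sample" | grep -c -e '[0-9]\{3\}-[0-9]\{3\}-[0-9]\{4\}')
# ERE (-E): unescaped braces are intervals.
ere=$(printf '%s\n' "$sample" | grep -c -E -e '[0-9]{3}-[0-9]{3}-[0-9]{4}')

echo "BRE plain: $bre_plain, BRE escaped: $bre_escaped, ERE: $ere"
```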
Here is a simple test:
$ cat file
apple is a fruit
so is orange
but onion is not
$ grep -e 'but' -e 'fruit' file #Allows you to pass multiple patterns explicitly
apple is a fruit
but onion is not
$ grep -E 'is (a|not)' file #Allows you to use extended regular expressions like ?, +, | etc
apple is a fruit
but onion is not
The -e option to grep simply says that the following argument is the regular expression. Thus:
grep -e 'some.*thing' -r -l .
looks for some followed by thing on a line in all the files in the current directory and all its sub-directories. The same could be achieved by:
grep -r -l 'some.*thing' .
(On Linux, the situation is confused by the behaviour of GNU getopt() which, unless you set POSIXLY_CORRECT in the environment, permutes options, so you could also run:
grep 'some.*thing' -r -l .
and get the same result. Under POSIX and other systems not using GNU getopt(), options need to precede arguments, and the grep would look for a file called -r and another called -l.)
The -E option changes the regular expressions from 'basic' to 'extended'. It can be used with -e. Of the two commands below, only the second treats {3} as a repetition count:
grep -e "[0-9]{3}-[0-9]{3}-[0-9]{4}" test.txt
grep -E -e "[0-9]{3}-[0-9]{3}-[0-9]{4}" test.txt
The ERE option means the same regular expressions, more or less, as used to be recognized by the egrep command, which is no longer a part of POSIX (having been replaced by grep -E, and fgrep by grep -F).
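The man page's remark that -e is useful when a pattern begins with a dash can be illustrated with a short sketch on made-up input. Without -e, grep would read -v as its own invert-match option rather than as the pattern.

```shell
#!/usr/bin/env bash
# -e marks the next argument as a pattern, even though it starts with "-".
dash_hits=$(printf '%s\n' 'use -v to invert' 'nothing here' | grep -c -e '-v')
echo "$dash_hits"
```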

find only files with extension using ls

I need to list only the files in a directory that have an extension, using ls (I can't use find).
I tried ls *.*, but if the directory doesn't contain any file with an extension, it returns "No such file or directory".
I don't want that error; I want ls to simply return to the command prompt if there are no files with an extension.
I have tried using grep with ls to achieve the same:
ls|grep "*.*" - doesn't work
but ls | grep "\." works.
I have no idea why grep "*.*" doesn't work. Any help is appreciated!
Thanks!
I think the correct solution is this:
( shopt -s nullglob ; echo *.* )
It's a bit verbose, but it will always work no matter what kind of funky filenames you have. (The problem with piping ls to grep is that typical systems allow really bizarre characters in filenames, including, for example, newlines.)
The shopt -s nullglob part enables ("sets") the nullglob shell option, which tells Bash that if no files have names matching *.*, then the *.* should be removed (i.e., should expand into nothing) rather than being left alone.
The parentheses (...) are to set up a subshell, so the nullglob option is only enabled for this small part of the script.
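A runnable sketch of the same idea, using command substitution as the subshell so the result can be inspected. The scratch directory and file names are invented for the demonstration.

```shell
#!/usr/bin/env bash
demo_dir=$(mktemp -d)          # hypothetical scratch directory
touch "$demo_dir/README"       # no file name contains a dot yet

# nullglob is set only inside the subshell created by $( ... ).
before=$( shopt -s nullglob; cd "$demo_dir" && echo *.* )

touch "$demo_dir/notes.txt"
after=$( shopt -s nullglob; cd "$demo_dir" && echo *.* )

echo "before: '$before'"
echo "after:  '$after'"
rm -rf "$demo_dir"
```

With no matching file the pattern expands to nothing and echo prints an empty line. Once notes.txt exists, it is the only name printed.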
It's important to understand the difference between a shell pattern and a regular expression. Shell patterns are a bit simpler, but less flexible. grep matches using a regular expression. A shell pattern like
*.*
would be done with a regular expression as
.*\..*
but the regular expressions in grep are not anchored, which means it searches for a match anywhere on the line, making the two .* parts unnecessary.
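This also suggests why grep "*.*" finds nothing useful: in a basic regular expression a leading * is an ordinary character, so the pattern asks for a literal asterisk followed by anything. A small sketch on made-up names:

```shell
#!/usr/bin/env bash
names=$(printf '%s\n' a.txt noext 'star*file')   # made-up file names

# Leading * is literal in a BRE: only names containing "*" match.
literal_star=$(printf '%s\n' "$names" | grep -e '*.*')
# An escaped dot matches any name containing a dot.
with_dot=$(printf '%s\n' "$names" | grep -e '\.')
# The fully spelled-out form selects the same lines as '\.'.
long_form=$(printf '%s\n' "$names" | grep -e '.*\..*')

printf 'literal_star=%s\nwith_dot=%s\nlong_form=%s\n' \
       "$literal_star" "$with_dot" "$long_form"
```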
Try
ls -1 | grep "\."
This lists only the files with an extension, and prints nothing (an empty list) if there are none, which is what you need.
Add -v to get a list of the files with no extension.

My regular expression isn't working in grep

Here's the text of the file I'm working with:
(4 spaces)Hi, everyone
(1 tab)yes
When I run this command - grep '^[[:space:]]+' myfile - it doesn't print anything to stdout.
Why doesn't it match the whitespace in the file?
I'm using GNU grep version 2.9.
There are several different regular expression syntaxes. The default for grep is called basic syntax in the grep documentation.
From man grep(1):
In basic regular expressions the meta-characters
?, +, {, |, (, and ) lose their special meaning; instead
use the backslashed versions \?, \+, \{, \|, \(, and \).
Therefore instead of + you should have typed \+:
grep '^[[:space:]]\+' FILE
If you need more power from your regular expressions, I also encourage you to take a look at Perl regular expression syntax, which is generally considered among the most expressive. There is a C library called PCRE which implements it, and GNU grep can be built against it. When it is, you can select that syntax (instead of basic syntax) with grep -P.
You could use -E:
grep -E '^[[:space:]]+' FILE
This enables extended regular expressions (EREs). Without it you get BREs (basic regular expressions), in which an unescaped + is an ordinary character. Alternatively you could run egrep with the same result.
I found you need to escape the +:
grep '^[[:space:]]\+' FILE
Try grep -P '^\s+' instead, provided you’re using GNU grep. It’s a lot easier to type, and has better regexes.
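The variants suggested above can be compared on input shaped like the file in the question. This is a sketch that assumes GNU grep, since \+ in a basic regular expression is a GNU extension; a portable BRE spelling would be '^[[:space:]]\{1,\}'.

```shell
#!/usr/bin/env bash
# Three lines: four leading spaces, one leading tab, no indentation.
sample=$(printf '    Hi, everyone\n\tyes\nno indent\n')

# BRE: "+" is literal, so this looks for whitespace followed by a plus sign.
bre_plus=$(printf '%s\n' "$sample" | grep -c '^[[:space:]]+')
# BRE with the GNU "\+" extension: one or more whitespace characters.
gnu_escaped=$(printf '%s\n' "$sample" | grep -c '^[[:space:]]\+')
# ERE: "+" means one or more.
ere_plus=$(printf '%s\n' "$sample" | grep -c -E '^[[:space:]]+')

echo "BRE +: $bre_plus, BRE \\+: $gnu_escaped, ERE +: $ere_plus"
```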
