Linux command to do wild card matching - linux

Is there any bash command to do something similar to:
if [[ $string =~ $pattern ]]
but that works with simple wildcards (?, *) rather than complex regular expressions?
More info:
I have a config file (a sort of .ini-like file) where each line is composed of a wild card pattern and some other data.
For any given input string that my script receives, I have to find the first line in the config file where the wild card pattern matches the input string and then return the rest of the data in that line.
It's simple. I just need a way to match a string against wild card patterns and not RegExps since the patterns may contain dots, brackets, dashes, etc. and I don't want those to be interpreted as special characters.

The [ -z ${string/$pattern} ] trick has some pretty serious problems: if string is blank, it'll match all possible patterns; if it contains spaces, the test command will parse it as part of an expression (try string="x -o 1 -eq 1" for amusement). bash's [[ expressions do glob-style wildcard matching natively with the == operator, so there's no need for all these elaborate (and trouble-prone) tricks. Just use:
if [[ $string == $pattern ]]
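As a minimal sketch of how this could be applied to the config-file lookup described in the question: the function name lookup_first_match and the line format (a pattern, whitespace, then the data) are assumptions made for illustration, not something the question specifies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the data field of the first config line whose
# glob pattern matches the input string.
# Assumed line format: <pattern><whitespace><data...>
lookup_first_match() {
    local input=$1 config=$2 pattern data
    while read -r pattern data; do
        # Skip blank lines and comment lines (an assumption about the format).
        [[ -z $pattern || $pattern == '#'* ]] && continue
        # The right-hand side is unquoted, so $pattern is treated as a glob.
        if [[ $input == $pattern ]]; then
            printf '%s\n' "$data"
            return 0
        fi
    done < "$config"
    return 1
}

config=$(mktemp)
cat > "$config" <<'EOF'
*.tar.gz   archive handler
report-??  two-character report
*          fallback
EOF

lookup_first_match "backup.tar.gz" "$config"   # prints: archive handler
lookup_first_match "report-07" "$config"       # prints: two-character report
lookup_first_match "notes.txt" "$config"       # prints: fallback
rm -f "$config"
```

The dot in *.tar.gz is matched literally here, which is the behaviour the question asks for.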

There are several ways of doing this.
In bash >= 3, you have regex matching like you describe, e.g.
$ foo=foobar
$ if [[ $foo =~ f.ob.r ]]; then echo "ok"; fi
ok
Note that this syntax uses regex patterns, so it uses . instead of ? to match a single character.
If what you want to do is just test that the string contains a substring, there are more classic ways of doing that, e.g.
# ${foo/b?r/} replaces "b?r" with the empty string in $foo
# So we're testing if $foo does not contain "b?r" one time
$ if [[ ${foo/b?r/} = $foo ]]; then echo "ok"; fi
You can also test if a string begins or ends with an expression this way:
# ${foo%b?r} removes "b?r" at the end of $foo
# So we're testing if $foo does not end with "b?r"
$ if [[ ${foo%b?r} = $foo ]]; then echo "ok"; fi
# ${foo#b?r} removes "b?r" in the beginning of $foo
# So we're testing if $foo does not begin with "b?r"
$ if [[ ${foo#b?r} = $foo ]]; then echo "ok"; fi
ok
See the Parameter Expansion paragraph of man bash for more info on these syntaxes. Using ## or %% instead of # and % respectively removes the longest match instead of the shortest one.
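The difference between the shortest and the longest match is easiest to see on a value with more than one dot. A small sketch (the variable name file is just an example):

```shell
#!/usr/bin/env bash
file="archive.tar.gz"

echo "${file#*.}"    # shortest prefix matching '*.' removed -> tar.gz
echo "${file##*.}"   # longest prefix matching '*.' removed  -> gz
echo "${file%.*}"    # shortest suffix matching '.*' removed -> archive.tar
echo "${file%%.*}"   # longest suffix matching '.*' removed  -> archive
```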
Another very classic way of dealing with wildcards is to use case:
case $foo in
*bar)
echo "Foo matches *bar"
;;
bar?)
echo "Foo matches bar?"
;;
*)
echo "Foo didn't match any known rule"
;;
esac
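If the pattern comes from a variable rather than being written in the script, the same case construct can be wrapped in a small function. This is only a sketch; glob_match is a made-up name, and it relies on an unquoted expansion in a case pattern being treated as a glob.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if string $1 matches glob pattern $2.
glob_match() {
    case $1 in
        $2) return 0 ;;   # unquoted: $2 is interpreted as a pattern
        *)  return 1 ;;
    esac
}

for s in foobar barn quux; do
    if glob_match "$s" '*bar'; then
        echo "$s matches *bar"
    elif glob_match "$s" 'bar?'; then
        echo "$s matches bar?"
    else
        echo "$s didn't match any known rule"
    fi
done
```

Since it uses only case, the function body should also work in plain POSIX shells.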

John T's answer was deleted, but I actually think he was on the right track. Here it is:
Another portable method which will work in most versions of bash is
to echo your string then pipe to grep. If no match is found, it will
evaluate to false as the result will be blank. If something is returned,
it will evaluate to true.
[john@awesome]$ string="Hello World"
[john@awesome]$ if [[ `echo $string | grep Hello` ]]; then echo "match"; fi
match
What John didn't consider is the wildcard requested by the question. For that, use egrep, a.k.a. grep -E, and use the regex wildcard .*. Here, . is the wildcard, and * is a multiplier meaning "any number of these". So, John's example becomes:
$ string="Hello World"
$ if [[ `echo $string | egrep "Hel.*"` ]]; then echo "match"; fi
The . wildcard notation is fairly standard regex, so it should work with any command that speaks regexes.
It does get nasty if you need to escape the special characters, so this may be sub-optimal:
$ if [[ `echo $string | egrep "\.\-\$.*"` ]]; then echo "match"; fi
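For comparison, here is a rough sketch of how the same kind of check behaves with a glob and with a regex when the pattern contains a dot. The variable names are illustrative. Note that square brackets and backslashes are still special in glob patterns, so a pattern containing those would need quoting or escaping.

```shell
#!/usr/bin/env bash
glob='file.v1*'      # glob: '.' is literal, '*' matches any run of characters
regex='^file.v1.*$'  # regex: '.' matches any single character

for name in "file.v1-final" "fileXv1-final"; do
    [[ $name == $glob ]]  && g=yes || g=no
    [[ $name =~ $regex ]] && r=yes || r=no
    echo "$name: glob=$g regex=$r"
done
# prints:
# file.v1-final: glob=yes regex=yes
# fileXv1-final: glob=no regex=yes
```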

Related

How to extract the text after a hyphen in bash

I have a string: dev/2.0 or dev/2.0-tymlez. How can I extract the string after the last - hyphen in bash? If there is no -, then the variable should be empty else tymlez and I want to store the result in $STRING. After that I would like to check the variable with:
if [ -z "$STRING" ]
then
echo "\$STRING is empty"
else
echo "\$STRING is NOT empty"
fi
Is that possible?
I recommend against calling your variable STRING. All-uppercase variables are used by the system (e.g. HOME) or the shell itself (e.g. PWD, RANDOM).
That said, you could do something like
string='dev/2.0-tymlez'
case "$string" in
*-*) string="${string##*-}";;
*) string='';;
esac
It's a bit clunky: It first checks whether there are any - at all, and if so, it removes the longest prefix matching *-; otherwise it just sets string to empty (because *- wouldn't have matched anything then).
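The same logic can be put in a function so it is easy to try on several inputs. A sketch, with after_last_hyphen as a made-up name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the text after the last hyphen,
# or an empty line if the argument contains no hyphen.
after_last_hyphen() {
    case $1 in
        *-*) printf '%s\n' "${1##*-}" ;;  # strip the longest prefix ending in '-'
        *)   printf '\n' ;;
    esac
}

after_last_hyphen "dev/2.0-tymlez"   # prints: tymlez
after_last_hyphen "dev/2.0"          # prints an empty line
after_last_hyphen "dev/2.0-1-2-3"    # prints: 3
```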
You could use the =~ operator:
string="dev/2.0-tymlez"
[[ $string =~ -([^-]+)$ ]]; string=${BASH_REMATCH[1]}
BASH_REMATCH is a special array where the matches from [[ ... =~ ... ]] are assigned to.
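A slightly more defensive sketch checks whether the match succeeded before reading BASH_REMATCH, and keeps the regex in a variable. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the text after the last hyphen via =~,
# or an empty line when there is no match.
after_last_hyphen_re() {
    local re='-([^-]+)$'
    if [[ $1 =~ $re ]]; then
        # BASH_REMATCH[0] is the whole match, [1] the first parenthesised group.
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        printf '\n'
    fi
}

after_last_hyphen_re "dev/2.0-tymlez"   # prints: tymlez
after_last_hyphen_re "dev/2.0"          # prints an empty line
```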
You can use sed:
for string in "dev/2.0" "dev/2.0-1-2-3" "dev/2.0-tymlez"; do
string=$(sed 's/[^-]*[-]*//' <<< "${string}")
echo "string=[${string}]"
done
Result
string=[]
string=[1-2-3]
string=[tymlez]

How to check if a string contains a special character (!@#$%^&*()_+)

I was wondering what would be the best way to check if a string as
$str
contains any of the following characters
!@#$%^&*()_+
I thought of using ASCII values but was a little confused on exactly how that would be implemented.
Or if there is a simpler way to just check the string against the values.
Match it against a glob. You just have to escape the characters that the shell otherwise considers special:
#!/bin/bash
str='some text with # in it'
if [[ $str == *['!'@#\$%^\&*()_+]* ]]
then
echo "It contains one of those"
fi
This is portable to Dash et al. and IMHO more elegant.
case $str in
*['!&()'@#$%^*_+]* ) echo yup ;;
esac
You can also use a regexp:
if [[ $str =~ ['!@#$%^&*()_+'] ]]; then
echo yes
else
echo no
fi
Some notes:
The regexp includes the square brackets.
The regexp must not be quoted (so $str =~ '[!...+]' would not work).
There is no need to escape the characters, as is necessary with the glob approach in another answer, because they sit between the brackets, where the regexp takes them literally.
As with a glob pattern, [] means "any one character from the contained set".
Because the pattern does not start with ^ or end with $, it matches anywhere in $str.
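A common way to sidestep the quoting question is to keep the regexp in a variable and expand it unquoted. The following is only a sketch: the names special_re and has_special_re are made up, and the character set assumes the intended list is !@#$%^&*()_+.

```shell
#!/usr/bin/env bash
# Assumed character set: ! @ # $ % ^ & * ( ) _ +
special_re='[!@#$%^&*()_+]'

# Hypothetical helper: succeed if $1 contains any character from the set.
has_special_re() {
    [[ $1 =~ $special_re ]]
}

for s in 'plain text' 'price: $5' 'a_b'; do
    if has_special_re "$s"; then
        echo "yes: $s"
    else
        echo "no:  $s"
    fi
done
# prints:
# no:  plain text
# yes: price: $5
# yes: a_b
```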
Using expr
str='some text with # in it'
if [ `expr "$str" : ".*[!@#\$%^\&*()_+].*"` -gt 0 ];
then
echo "This str contains a special symbol";
fi
I think one simple way would be to filter out strings that consist only of alphanumeric characters and spaces; whatever is left contains a special character.
echo "$str" | grep -v "^[a-zA-Z0-9 ]*$"
If you have a bunch of strings, put them in a file such as strFile and the following command will do the job.
grep -v "^[a-zA-Z0-9 ]*$" strFile

How to check if a string is a substring of another?

I have the following strings in bash
str1="any string"
str2="any"
I want to check if str2 is a substring of str1
I can do it in this way:
c=`echo $str1 | grep $str2`
if [ $c != "" ]; then
...
fi
Is there a more efficient way of doing this?
You can use wild-card expansion *.
str1="any string"
str2="any"
if [[ "$str1" == *"$str2"* ]]
then
echo "str2 found in str1"
fi
Note that * expansion will not work with single [ ].
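The quotes around "$str2" matter when the substring itself contains wildcard characters. A short sketch (contains and needle are illustrative names):

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if $2 occurs literally inside $1.
contains() {
    [[ $1 == *"$2"* ]]
}

needle='s*g'
contains "any string" "any"      && echo "found 'any'"
contains "any string" "$needle"  || echo "no literal 's*g' in the string"

# Without the quotes, the needle is treated as a glob and "string" matches s*g.
[[ "any string" == *$needle* ]]  && echo "unquoted needle matched as a pattern"
```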
str1="any string"
str2="any"
Old school (Bourne shell style):
case "$str1" in *"$str2"*)
echo found it
esac
New school (as speakr shows), however be warned that the string to the right will be viewed as a regular expression:
if [[ $str1 =~ $str2 ]] ; then
echo found it
fi
But this will work too, even if you're not exactly expecting it:
str2='.*[trs].*'
if [[ $str1 =~ $str2 ]] ; then
echo found it
fi
Using grep is slow, since it spawns a separate process.
You can use bash regexp matching without using grep:
if [[ $str1 =~ $str2 ]]; then
...
fi
Note that you don't need any surrounding slashes or quotes for the regexp pattern. If you want to use glob pattern matching just use == instead of =~ as operator.
if echo "$str1" | grep -q "$str2"  # any command can serve as the condition
then
.....
fi

Comparing variables in a Bash script

Looking at other Bash scripts, I see people comparing variables like: $S == $T while at other times I see the variable being wrapped inside strings: "$S" == "$T".
Some experiments seem to suggest that both do the same. The demo below will print equal in both cases (tested with GNU bash, version 4.2.37):
#!/usr/bin/env bash
S="text"
T="text"
if [[ $S == $T ]]; then
echo "equal"
fi
if [[ "$S" == "$T" ]]; then
echo "equal"
fi
My question: if there's a difference between $S == $T and "$S" == "$T", what is it?
If you use [[ they are almost the same, but not quite...
When the == and != operators are used, the string to the right of the operator is
considered a pattern and matched according to the rules described below under Pattern
Matching. [...]
Any part of the pattern may be quoted to force it to be matched as a string.
If you use [ then you have to use quotes unless you know that the variables cannot be empty or contain whitespace.
Just to be on the safe side, you probably want to quote all your variables all the time.
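A small sketch of both points, using values chosen so that the difference shows up (the variable W is added here for illustration):

```shell
#!/usr/bin/env bash
S="text"
T="t*"

# Inside [[ ]], an unquoted right-hand side is a pattern; a quoted one is literal.
[[ $S == $T ]]   && echo "unquoted: matches as a pattern"
[[ $S == "$T" ]] || echo "quoted: compared literally, no match"

# Inside [ ], an unquoted variable is word-split before the test runs.
W="two words"
if [ $W = "two words" ] 2>/dev/null; then
    echo "unquoted [ ]: equal"
else
    echo "unquoted [ ]: test failed (word splitting broke the expression)"
fi
[ "$W" = "two words" ] && echo "quoted [ ]: equal"
```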

How to detect spaces in shell script variable [duplicate]

e.g. string="test test test"
If the string contains any space, the script should echo an error and exit; otherwise it should continue processing.
The case statement is useful in these kinds of cases:
case "$string" in
*[[:space:]]*)
echo "argument contains a space" >&2
exit 1
;;
esac
Handles leading/trailing spaces.
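To experiment with this without exiting the shell, the same test can be wrapped in a function that only returns a status. This is a sketch with a made-up name; note that [[:space:]] also matches tabs and newlines, not just the space character.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if $1 contains any whitespace character.
has_whitespace() {
    case $1 in
        *[[:space:]]*) return 0 ;;
        *)             return 1 ;;
    esac
}

for s in "test test test" " leading" "nospace"; do
    if has_whitespace "$s"; then
        echo "[$s] contains whitespace"
    else
        echo "[$s] is clean"
    fi
done
```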
There is more than one way to do that; using parameter expansion
you could write something like:
if [ "$string" != "${string% *}" ]; then
echo "$string contains one or more spaces";
fi
For a purely Bash solution:
function assertNoSpaces {
if [[ "$1" != "${1/ /}" ]]
then
echo "YOUR ERROR MESSAGE" >&2
exit 1
fi
}
string1="askdjhaaskldjasd"
string2="asjkld askldja skd"
assertNoSpaces "$string1"
assertNoSpaces "$string2" # will trigger error
"${1/ /}" removes the first space in the input string, so the result equals the original string only if it contains no spaces.
Note the quotes around "${1/ /}" - This ensures that leading/trailing spaces are taken into consideration.
To match any of several characters, use a bracket expression in the pattern (parameter expansion patterns are globs, not regular expressions), e.g. "${1/[ \\.]/}".
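For reference, a single slash replaces only the first match while a double slash replaces all of them. Either form changes the string as soon as one space is present, which is all the comparison needs. A quick sketch:

```shell
#!/usr/bin/env bash
s="a b c"

echo "[${s/ /}]"    # first space removed -> [ab c]
echo "[${s// /}]"   # all spaces removed  -> [abc]

[[ $s != "${s/ /}" ]] && echo "contains at least one space"
```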
update
A better approach would be to use in-process expression matching. It will probably be a wee bit faster as no string manipulation is done.
function assertNoSpaces {
local re='[\. ]'  # kept in a variable: a quoted regex after =~ is matched literally in bash 3.2+
if [[ "$1" =~ $re ]]
then
echo "YOUR ERROR MESSAGE" >&2
exit 1
fi
}
For more details on the =~ operator, see the Advanced Bash Scripting guide.
The operator was introduced in Bash version 3 so watch out if you're using an older version of Bash.
update 2
Regarding question in comments:
how to handle the code if user enter
like "asd\" means in double quotes
...can we handle it??
The function given above should work with any string so it would be down to how you get input from your user.
Assuming you're using the read command to get user input, one thing you need to watch out for is that by default backslash is treated as an escape character so it will not behave as you might expect. e.g.
read str # user enters "abc\"
echo $str # prints out "abc", not "abc\"
assertNoSpaces "$str" # no error since backslash not in variable
To counter this, use the -r option to treat backslash as a standard character. See read MAN Page for details.
read -r str # user enters "abc\"
echo $str # prints out "abc\"
assertNoSpaces "$str" # triggers error
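The effect of -r can be reproduced without typing anything by feeding read from a here-string. A minimal sketch, with variable names chosen for illustration:

```shell
#!/usr/bin/env bash
input='abc\def'

read    plain <<< "$input"   # backslash acts as an escape character and is dropped
read -r raw   <<< "$input"   # -r keeps the backslash as an ordinary character

printf 'plain=%s\nraw=%s\n' "$plain" "$raw"
# prints:
# plain=abcdef
# raw=abc\def
```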
The == operator inside double brackets can match wildcards.
if [[ $string == *' '* ]]
You can use grep as:
string="test test test"
if ( echo "$string" | grep -q ' ' ); then
echo 'var has space'
exit 1
fi
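I just ran into a very similar problem while handling paths. I chose to rely on my shell's word splitting rather than looking for a space specifically: the function is called with the variable unquoted, so a value containing a space arrives as more than one argument. It does not detect spaces at the front or the end, though.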
function space_exit {
if [ $# -gt 1 ]
then
echo "I cannot handle spaces." >&2
exit 1
fi
}
