Substring search in bash scripting

Given a string, I want to find a substring that contains a 9 followed by exactly two characters and then a 0, where the 0 is the last character of the substring.
string="asd20 92x0x 72x0 YX92s0 0xx0 92x0x"
#I want to select substring YX92s0 from that above string
for var in $string
do
if [[ "$var" == *9**0 ]]; then
echo "$var"  # should print YX92s0 only
fi
done
Obviously the code above doesn't work.

You can match each element against the pattern *9??0, in which each ? matches exactly one character. There are several ways you can do this; here's one that uses the string to set the positional parameters in a subshell, then iterates over them in a for loop:
( set -- $string
for elt; do [[ $elt == *9??0 ]] && { echo "found"; exit; }; done )

string="asd20 92x0x 72x0 X92s0 0xx0"
if [[ $string =~ [[:space:]].?9.{2}0[[:space:]] ]]; then
echo "found"
fi
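Or better, taking advantage of word splitting:
string="asd20 92x0x 72x0 X92s0 0xx0"
for s in $string; do
if [[ $s =~ ^(.*9.{2}0)$ ]]; then
echo "${BASH_REMATCH[1]} found"
fi
done
This uses bash's built-in regex matching (the =~ operator); the ^ and $ anchors ensure that the 0 is the last character of the word.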
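As a rough sketch of the glob-based approach applied to the string from the question, the helper below prints every word that matches *9??0. The function name match_words is made up for this illustration, and the sketch assumes the words are separated by spaces on a single line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each whitespace-separated word of "$1" that
# contains a 9, then exactly two characters, then a final 0.
match_words() {
  local word
  local -a words
  # Split on whitespace into an array; unlike an unquoted $string,
  # this does not expand glob characters in the input.
  read -r -a words <<< "$1"
  for word in "${words[@]}"; do
    [[ $word == *9??0 ]] && printf '%s\n' "$word"
  done
  return 0
}

match_words "asd20 92x0x 72x0 YX92s0 0xx0 92x0x"   # prints: YX92s0
```

Reading the words into an array is a design choice here, not a requirement; the plain for loop over $string from the question works as well when the input contains no glob characters.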

Related

How to check that string contains only blank (\t\n ) characters?

How can I check in bash that a string contains only blank characters such as space, tab, and newline? I'm trying this, but it doesn't work:
if [[ "$1" == @([\t\n ]) ]]; then
echo "Empty"
fi
Using a regular expression:
if [[ $1 =~ ^[[:space:]]+$ ]]; then
echo "Only whitespace"
else
echo "There are non-whitespace characters."
fi
Use * instead of + if you also want to match empty strings.
Convert it to an array; the array will be empty if the string contains only blank characters:
string=" "
array=($string)
[[ ${!array[@]} ]] && echo fail || echo ok
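A glob-only variant is also possible. The sketch below is an assumption-labelled alternative, not taken from the answers above: the hypothetical function is_blank succeeds when its argument contains no non-whitespace character, which means it also treats the empty string as blank.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if "$1" is empty or consists only of whitespace.
# The pattern *[![:space:]]* matches any string that contains at least one
# non-whitespace character, so the test is negated with !=.
is_blank() {
  [[ $1 != *[![:space:]]* ]]
}

is_blank $' \t\n' && echo "only whitespace"
is_blank " x "    || echo "has non-whitespace characters"
```

To reject the empty string as well, the test could be combined with [[ -n $1 ]].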

How do I see if a parameter starts with an uppercase letter in Bash?

I need to make a script that iterates through a list of parameters and checks/counts if the parameter starts with an uppercase letter. I have some starter code but I am stuck and would appreciate any help!
Several notes:
You're missing the =~ operator for a regular expression
Your if is not ended by a fi.
Using [A-Z] doesn't work in all locales, and is needlessly fragile. Some collation orders are of the form AaBbCcDd, and thus A-Z contains a, b, etc; [[:upper:]] is guaranteed to do the right thing everywhere.
Unquoted $@ behaves exactly the same as unquoted $*. If you want to correctly honor the quoting and escaping used when your function was first called, use "$@", quoted.
Consider instead:
#!/bin/bash
(( "$#" )) || { echo "Error: No arguments given" >&2; exit 1; }
re='^[[:upper:]]' # store regex in a variable for compatibility with old bash releases
count=0 # initialize so that "0" is printed when nothing matches
for word in "$@"; do
[[ $word =~ $re ]] && ((++count))
done
echo "$count arguments started with upper-case characters"
Alternatively, by using a case statement and POSIX arithmetic you can avoid requiring bash, and also check for other types:
upper_count=0 lower_count=0 digit_count=0
for word in "$@"; do
case $word in
[[:upper:]]*) upper_count=$((upper_count + 1)) ;;
[[:lower:]]*) lower_count=$((lower_count + 1)) ;;
[[:digit:]]*) digit_count=$((digit_count + 1)) ;;
esac
done
echo "Found $upper_count arguments starting with upper-case letters"
echo "Found $lower_count arguments starting with lower-case letters"
echo "Found $digit_count arguments starting with digits"
#! /bin/bash
if [ $# -eq 0 ]; then
echo Error
exit 1
fi
COUNT=$(echo "$@" | tr ' ' '\n' | grep -c "^[A-Z]")
echo "$COUNT"
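To illustrate the earlier note about quoting "$@", here is a small sketch. The helper names count_words and demo are invented for this example; it assumes the default IFS.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report how many arguments it received.
count_words() { echo "$#"; }

demo() {
  # Quoted: each original argument is passed through unchanged.
  echo "quoted:   $(count_words "$@")"
  # Unquoted: arguments are split again on whitespace (and glob-expanded).
  echo "unquoted: $(count_words $@)"
}

demo "Hello World" "foo"
# quoted:   2
# unquoted: 3
```

With the unquoted form, an argument such as "Hello World" would be counted as two words, so a count of arguments starting with an uppercase letter could come out wrong.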

Linux input pattern matching [duplicate]

String:
name@gmail.com
Checking for:
@
.com
My code
if [[ $word =~ "@" ]]
then
if [[ $word =~ ".com" || $word =~ ".ca" ]]
My problem
name@.com
The above example gets passed, which is not what I want. How do I check for characters (1 or more) between "@" and ".com"?
You can use a very very basic regex:
[[ $var =~ ^[a-z]+@[a-z]+\.[a-z]+$ ]]
It looks for a string being exactly like this:
at least one a-z char
@
at least one a-z char
.
at least one a-z char
It can get as complicated as you want, see for example Email check regular expression with bash script.
See in action
$ var="a@b.com"
$ [[ $var =~ ^[a-z]+@[a-z]+\.[a-z]+$ ]] && echo "kind of valid email"
kind of valid email
$ var="a@.com"
$ [[ $var =~ ^[a-z]+@[a-z]+\.[a-z]+$ ]] && echo "kind of valid email"
$
Why not use another tool, such as perl:
> echo "x@gmail.com" | perl -lne 'print $1 if(/@(.*?)\.com/)'
gmail
The glob pattern would be: [[ $word == ?*@?*.@(com|ca) ]]
? matches any single character and * matches zero or more characters
@(p1|p2|p3|...) is an extended globbing pattern that matches one of the given patterns. This requires:
shopt -s extglob
testing:
$ for word in @.com @a.ca a@.com a@b.ca a@b.org; do
echo -ne "$word\t"
[[ $word == ?*@?*.@(com|ca) ]] && echo matches || echo does not match
done
@.com does not match
@a.ca does not match
a@.com does not match
a@b.ca matches
a@b.org does not match
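The regex and glob ideas above can be combined into one reusable check. The following is only a sketch: the function name looks_like_email and the exact pattern are assumptions made for illustration, and the pattern is far too loose to validate real email addresses. It requires at least one character before the @, at least one between the @ and the final dot, and a .com or .ca ending.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if "$1" has the rough shape local@domain.com
# or local@domain.ca. The regex is stored in a variable so that it is
# parsed the same way across bash versions.
looks_like_email() {
  local re='^[^@[:space:]]+@[^@[:space:]]+\.(com|ca)$'
  [[ $1 =~ $re ]]
}

for word in name@gmail.com name@.com @a.ca a@b.ca a@b.org; do
  if looks_like_email "$word"; then
    printf '%s\tmatches\n' "$word"
  else
    printf '%s\tdoes not match\n' "$word"
  fi
done
```

Under these assumptions only name@gmail.com and a@b.ca are reported as matching.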

How to check if a string is a substring of another?

I have the following strings in bash
str1="any string"
str2="any"
I want to check if str2 is a substring of str1
I can do it in this way:
c=`echo $str1 | grep $str2`
if [ $c != "" ]; then
...
fi
Is there a more efficient way of doing this?
You can use wildcard pattern matching with *.
str1="any string"
str2="any"
if [[ "$str1" == *"$str2"* ]]
then
echo "str2 found in str1"
fi
Note that this pattern matching does not work with single brackets [ ]; it requires [[ ]].
str1="any string"
str2="any"
Old school (Bourne shell style); quoting $str2 makes it match literally rather than as a pattern:
case "$str1" in *"$str2"*)
echo found it ;;
esac
New school (as speakr shows), however be warned that the string to the right will be viewed as a regular expression:
if [[ $str1 =~ $str2 ]] ; then
echo found it
fi
But this will work too, even if you're not exactly expecting it:
str2='.*[trs].*'
if [[ $str1 =~ $str2 ]] ; then
echo found it
fi
Using grep is slow, since it spawns a separate process.
You can use bash regexp matching without using grep:
if [[ $str1 =~ $str2 ]]; then
...
fi
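Note that you don't need any surrounding slashes or quotes for the regexp pattern; in bash 3.2 and later, quoting the right-hand side of =~ makes it match as a literal string instead. If you want to use glob pattern matching, just use == instead of =~ as the operator.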
Some examples can be found here.
if echo "$str1" | grep -q "$str2" # any command
then
.....
fi
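The difference between literal and pattern matching discussed in these answers can be sketched as follows. The function name contains is hypothetical, and the last line assumes bash 3.2 or later, where a quoted right-hand side of =~ is matched literally.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if "$2" occurs literally inside "$1".
# Quoting "$2" inside the pattern disables glob and regex interpretation.
contains() {
  [[ $1 == *"$2"* ]]
}

str1="any string"

contains "$str1" "any"       && echo "found"
contains "$str1" ".*[trs].*" || echo "not found: the needle is taken literally"

# Assumption: bash 3.2+; the quoted regex is matched as a literal string.
[[ $str1 =~ ".*[trs].*" ]]   || echo "a quoted regex is literal as well"
```

Leaving the right-hand side of =~ unquoted, as in the answer above, would make the same needle match as a regular expression.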

KSH: search string for multiple substrings

I have a simple way to search for multiple substrings in a single string:
if [[ $string = *"string 1"* && $string = *"string 2"* && $string = *"string 3"* ]]
(here searching for string 1, string 2 and string 3 in string).
How can I simplify this, so that there is only one check?
I've tried:
if [[ $string = *"string 1"*"string 2"*"string 3"* ]]
and
if [[ $string = *"string 1*string 2*string 3"* ]]
Note: the three strings specified here will always be in this order, hence why I can simplify it.
In ksh93, you can use the & sub-pattern delimiter.
$ [[ abcdefg == @(*bcd*&*cde*&*efg*) ]]; echo $?
0
$ [[ abcdefg == @(*bcdz*&*cde*&*efg*) ]]; echo $?
1
Unfortunately, only ksh93 has this. In mksh, zsh, and bash, with extended matching enabled, the negation sub-pattern allows for this De Morgan-like equivalence.
$ [[ abcdefg == !(!(*bcd*)|!(*cde*)|!(*efg*)) ]]; echo $?
0
$ [[ abcdefg == !(!(*bcdz*)|!(*cde*)|!(*efg*)) ]]; echo $?
1
To test for just one pattern, see this FAQ
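If a single call site matters more than a single pattern, the chain of && tests can be wrapped in a loop. This is a minimal sketch written for bash, not part of the answer above; the function name contains_all is an assumption, and it uses bash's local, so it would need adapting for ksh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if every remaining argument occurs
# literally somewhere in the first argument, in any order.
contains_all() {
  local haystack=$1 needle
  shift
  for needle in "$@"; do
    [[ $haystack == *"$needle"* ]] || return 1
  done
  return 0
}

string="has string 1, then string 2, then string 3"

contains_all "$string" "string 1" "string 2" "string 3" && echo "all present"
contains_all "$string" "string 1" "string 4"            || echo "something is missing"
```

Unlike the ordered pattern *"string 1"*"string 2"*"string 3"*, this sketch does not require the substrings to appear in any particular order.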

Resources