Check if line in file contains a pattern in Bash - string

I'm trying to figure out why this won't check the lines in the file and echo them.
How do you compare or check if strings contain something?
#!/bin/bash
while read line
do
#if the line ends
if [[ "$line" == '*Bye$' ]]
then
:
#if the line ends
elif [[ "$line" == '*Fly$' ]]
then
echo "***"$line"***"
fi
done < file.txt

The problem is that *Bye$ is not a shell pattern (shell patterns don't use the $ notation, they just use the lack of a trailing *) — and even if it were, putting it in single-quotes would disable it. Instead, just write:
if [[ "$line" == *Bye ]]
(and similarly for Fly).

If you want to use proper regular expressions, that's done with the =~ operator, such as:
if [[ "$line" =~ Bye$ ]]
The limited regular expressions you get from shell patterns with == don't include things like the end-line marker $.
Note that you can do something this simple with shell patterns (*Bye) but, if you want the full power of regular expressions (or even just a consistent notation), =~ is the way to go.
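Putting the two forms together, the loop from the question could be sketched as follows. The function name mark_lines is only illustrative (it is not from the original post), and reading from standard input rather than file.txt is an assumption made to keep the sketch self-contained.

```shell
#!/bin/bash
# Hypothetical helper: reads lines on stdin, prints lines ending in "Fly"
# wrapped in asterisks, and silently skips lines ending in "Bye".
mark_lines() {
    local line
    while IFS= read -r line; do
        if [[ $line == *Bye ]]; then      # glob: must match the whole string
            :                             # nothing to do
        elif [[ $line =~ Fly$ ]]; then    # regex: $ anchors at the end
            printf '***%s***\n' "$line"
        fi
    done
}

# Example run on three sample lines.
printf '%s\n' 'Good Bye' 'I can Fly' 'Something else' | mark_lines
```

To process the file from the question, the same function can be called as mark_lines < file.txt.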

Related

check if a file is jpeg format using shell script

I know I can use file pic.jpg to get the file type, but how do I write an if statement to check it in a shell script?
E.g. (pseudo code):
if pic.jpg == jpeg file then
Try (assumes Bash v3.0+, using =~, the regex-matching operator):
if [[ $(file -b 'pic.jpg') =~ JPEG ]]; then ...
If you want to match file's output more closely:
if [[ $(file -b 'pic.jpg') =~ ^'JPEG ' ]]; then ...
This will only match if the output starts with 'JPEG', followed by a space.
Alternatively, if you'd rather use a globbing-style pattern:
if [[ $(file -b 'pic.jpg') == 'JPEG '* ]]; then ...
POSIX-compliant conditionals ([ ... ]) do not offer regex or pattern matching, so a different approach is needed:
if expr "$(file -b 'pic.jpg')" : 'JPEG ' >/dev/null; then ...
Note: expr only supports basic regular expressions and is implicitly anchored at the start of the string (no need for ^).
As for why [[ ... ]] rather than [ ... ] is needed in the Bash snippets:
Advanced features such as the regex operator (=~) or pattern matching (e.g., use of unquoted * to represent any sequence of chars.) are nonstandard (not part of the POSIX shell specification).
Since these features are incompatible with how standard conditionals ([ ... ]) work, a new syntax was required; Bash, Ksh, and Zsh use [[ ... ]].
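The difference can be seen side by side in a small sketch. The sample string stands in for the output of file -b, and the variable names are illustrative only.

```shell
#!/bin/bash
# Sample text standing in for the output of: file -b pic.jpg
desc='JPEG image data, JFIF standard 1.01'

# [[ ... ]]: the unquoted * on the right-hand side acts as a wildcard.
if [[ $desc == 'JPEG '* ]]; then double=match; else double=no-match; fi

# [ ... ]: = is a plain string comparison, so the * is compared literally.
# (It is quoted here because an unquoted * inside [ ... ] would be subject
# to filename expansion before the comparison even runs.)
if [ "$desc" = 'JPEG *' ]; then single=match; else single=no-match; fi

printf '[[ ]] gives %s, [ ] gives %s\n' "$double" "$single"
```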
Good old case is worth a mention, too.
case $(file -b pic.jpg) in
'JPEG '*)
echo is
;;
*)
echo is not
;;
esac
The lone right parenthesis might seem uncanny at first, but other than that, this is reasonably simple, readable, versatile, and portable all the way back to the original Bourne shell. (POSIX allows for a matching left parenthesis before the pattern, too.)
For JPEG files, the file -b output has JPEG as the first word on the line:
pax> file -b somefile.jpg
JPEG image data, JFIF standard 1.01, blah blah blah
So, you can use that to detect it with something like:
inputFile=somefile.jpg
if [[ $(file -b "$inputFile" | awk '{print $1}') == "JPEG" ]] ; then
echo "$inputFile is a JPEG file."
fi

Extract property value in filename?

I have many file paths of the form:
dir1/someotherdir/name_q=3_a=2.34_p=1.2.ext
I am running a bash script to do some processing on these files, and I need to extract the value of p (in this case 1.2; in general it is a floating number) from each of these paths. Basically I am running a for loop over all the file paths, and for each path, I need to extract the value of p. How can I do this?
Parameter expansion is a useful tool for this kind of operation:
#!/bin/bash
# ^^^^ IMPORTANT: Not /bin/sh
f=dir1/someotherdir/name_q=3_a=2.34_p=1.2.ext
if [[ $f = *_p=* ]]; then # Check for substring in filename
val=${f##*_p=} # Trim everything before the last "_p="
val=${val%%_*} # Trim everything after first subsequent _
val=${val%.ext} # Trim extension, should it exist.
echo "Extracted $val from filename $f"
fi
Alternately, you could also use shell-native regex support:
#!/bin/bash
# ^^^^ again, NOT /bin/sh
f=dir1/someotherdir/name_q=3_a=2.34_p=1.2.ext
# assigning regex to a variable avoids surprising behavior with some older bash releases
p_re='_p=([[:digit:].]+)(_|[.]ext$)'
if [[ $f =~ $p_re ]]; then # evaluate regex
echo "Extracted ${BASH_REMATCH[1]}" # extract groups from BASH_REMATCH array
fi
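Since the question asks for a loop over many paths, the parameter-expansion approach could be wrapped in a function along these lines. The name extract_p and the second sample path are made up for illustration; treat this as a sketch rather than a drop-in solution.

```shell
#!/bin/bash
# Hypothetical helper: prints the value following "_p=" in a path,
# or returns non-zero if the path has no such property.
extract_p() {
    local f=$1 val
    [[ $f == *_p=* ]] || return 1
    val=${f##*_p=}     # drop everything up to and including the last "_p="
    val=${val%%_*}     # drop any following "_key=value" parts
    val=${val%.ext}    # drop the extension, if present
    printf '%s\n' "$val"
}

# Sample paths; the second is invented to show p in a non-final position.
paths=(
    'dir1/someotherdir/name_q=3_a=2.34_p=1.2.ext'
    'dir1/someotherdir/name_p=0.5_q=7.ext'
)
for path in "${paths[@]}"; do
    if p=$(extract_p "$path"); then
        echo "$path -> p=$p"
    fi
done
```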
For completeness, another approach is to use eval. There can be security dangers here (eval runs whatever text it is given, and the text comes from filenames), so you have to make up your own mind whether they are justified.
I am using IFS for the split - not everyone's favourite, but it is another way to do it. The eval will execute each assignment as it finds it, in this case dynamically creating variables q, a, and p.
fname='dir1/someotherdir/name_q=3_a=2.34_p=1.2.ext'
OldIFS="$IFS"
IFS='_'
for val in $fname
do
if [[ $val == *=* ]]
then
val=${val%.ext}
eval "$val"
fi
done
IFS="$OldIFS"
echo "$q"
echo "$a"
echo "$p"
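If the eval is a concern, the same split can be done without executing any of the filename's text, by storing the pairs in an associative array. This is only a sketch: it assumes Bash 4.0 or later, and the names props and parse_props are illustrative.

```shell
#!/bin/bash
# Requires Bash 4.0+ for associative arrays.
declare -A props

# Hypothetical helper: fills the global props array from key=value parts.
parse_props() {
    local name=${1##*/}    # strip the directory part
    local part key value
    local -a parts
    name=${name%.ext}      # strip the extension
    IFS='_' read -r -a parts <<< "$name"   # IFS changes only for this read
    for part in "${parts[@]}"; do
        if [[ $part == *=* ]]; then
            key=${part%%=*}      # text before the first =
            value=${part#*=}     # text after the first =
            props[$key]=$value
        fi
    done
}

parse_props 'dir1/someotherdir/name_q=3_a=2.34_p=1.2.ext'
echo "q=${props[q]} a=${props[a]} p=${props[p]}"
```

Because IFS is set only as a prefix to read, there is no need to save and restore it afterwards.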

Bash if else string variable comparison

I am writing a shell script in which I am trying to compare two variables that are strings. Everything else works fine, in that all the variables get a value from the commands, but my if/then/else statement is not working.
#!/bin/bash
name=$1
for i in {1...10}
do
username=sudo cat /class/rolls/CSCE215-*|awk 'BEGIN {FIELDWIDTHS = "32 20"} {print $2}'|cut -d " " -f6 |sed -n '1,1p'
if ["$name" == "$username"]
then
echo Correct Username
else
echo Incorrect Username
fi
done
All of the tutorials and help online appear to have what I have here, but I cannot find out what is wrong with mine.
You are using the "classic test" but it's a good idea to ALWAYS use the newer conditional expression: http://wiki.bash-hackers.org/syntax/ccmd/conditional_expression
if [[ "$name" == "$username" ]]
As far as I know, the test command (with a single bracket) doesn't even officially support the "==" operator.
Oh, and of course don't forget the spaces inside the brackets; Bash needs them to break the line up into words.
When using test or [, the correct comparison is:
test "$string1" = "$string2"
or
[ "$string1" = "$string2" ]
Note the single = instead of ==, and always quote string variables. Further, there is nothing wrong with using test or [; in fact, they are preferred when portability is needed. They simply lack some of the extended functionality of [[, such as pattern matching (including character classes) and the ability to use =~.
Now when using the [[ operator, the correct form is:
[[ "$string1" == "$string2" ]]
Note: as pointed out, quotes are not required when using the [[ operator, but if you get in the habit of always quoting strings, you will be safe in both cases.
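A minimal sketch of the comparison with the spacing and quoting described above. The function name check_username and the sample values are illustrative only. In the script from the question the second value would come from the roster pipeline, which would also need to be captured with command substitution, username=$(...), for the variable to receive its output.

```shell
#!/bin/bash
# Hypothetical helper comparing two strings with the portable [ ... ] form.
check_username() {
    local name=$1 username=$2
    if [ "$name" = "$username" ]; then   # spaces after [ and before ] are required
        echo "Correct Username"
    else
        echo "Incorrect Username"
    fi
}

# printf stands in for the real pipeline; $(...) captures its output.
username=$(printf '%s\n' 'jsmith')
check_username 'jsmith' "$username"
check_username 'jdoe' "$username"
```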

Remove substring matching pattern both in the beginning and the end of the variable

As the title says, I'm looking for a way to remove a defined pattern both at the beginning of a variable and at the end. I know I have to use # and % but I don't know the correct syntax.
In this case, I want to remove http:// at the beginning, and /score/ at the end of the variable $line which is read from file.txt.
Well, you can't nest ${var%}/${var#} operations, so you'll have to use a temporary variable.
Like here:
var="http://whatever/score/"
temp_var="${var#http://}"
echo "${temp_var%/score/}"
Alternatively, you can use regular expressions with (for example) sed:
some_variable="$( echo "$var" | sed -e 's#^http://##; s#/score/$##' )"
$ var='https://www.google.com/keep/score'
$ var=${var#*//} # removes everything up to and including the first //
$ var=${var%/*} # removes everything from the last / to the end
$ echo $var
www.google.com/keep
You have to do it in two steps:
$ string="fooSTUFFfoo"
$ string="${string%foo}"
$ string="${string#foo}"
$ echo "$string"
STUFF
There IS a way to do it in one step using only built-in bash functionality (no running external programs such as sed) -- with BASH_REMATCH:
url=http://whatever/score/
re='https?://(.*)/score/'
[[ $url =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
This matches against the regular expression on the right-hand side of the =~ test, and puts the groups into the BASH_REMATCH array.
That said, it's more conventional to use two PE expressions and a temporary variable:
shopt -s extglob
url=http://whatever/score/
val=${url#http?(s)://}; val=${val%/score/}
printf '%s\n' "$val"
...in the above example, the extglob option is used to allow the shell to recognize "extglobs" -- bash's extensions to glob syntax (making glob-style patterns similar in power to regular expressions), among which ?(foo) means that foo is optional.
By the way, I'm using printf rather than echo in these examples because many of echo's behaviors are implementation-defined -- for instance, consider the case where the variable's contents are -e or -n.
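The two-step parameter expansion can also be wrapped in a small function so that the temporary variable stays local. This is a sketch only, and the name strip_affixes is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: removes a literal prefix and a literal suffix, if present.
strip_affixes() {
    local value=$1 prefix=$2 suffix=$3
    value=${value#"$prefix"}   # quoting makes the prefix literal, not a glob
    value=${value%"$suffix"}   # same for the suffix
    printf '%s\n' "$value"
}

strip_affixes 'http://whatever/score/' 'http://' '/score/'
```

For the file in the question, it could be called once per line inside a while IFS= read -r line loop reading from file.txt.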
How about:
export x='https://www.google.com/keep/score';
var=$(perl -e 'if ( $ENV{x} =~ /(https:\/\/)(.+)(\/score)/ ) { print qq($2);}')

Bash == operator in [[ ]] is too smart!

Case in point. I want to know if some set of files have as a first line '------'.
So,
for file in *.txt
do
if [[ `head -1 "$file"` == "------" ]]
then
echo "$file starts with dashes"
fi
done
Thing is, head returns the content with a newline, but "------" does not have a newline.
Why does it work?
The backticks strip the trailing newline. For example:
foo=`echo bar`
echo "<$foo>"
prints
<bar>
even though that first echo printed out "bar" followed by a newline.
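A short sketch of the same behaviour with $(...), which strips trailing newlines just as backticks do. The sample strings are made up, and head -n 1 is used here as the modern spelling of head -1.

```shell
#!/bin/bash
# Command substitution drops every trailing newline, not just one.
out=$(printf 'bar\n\n\n')
echo "length: ${#out}"

# Newlines inside the output are preserved; only trailing ones go.
two=$(printf 'a\nb\n')
echo "length: ${#two}"

# So a comparison like the one in the question sees only the dashes.
first=$(printf '%s\n' '------' 'second line' | head -n 1)
if [[ $first == "------" ]]; then result=match; else result=no-match; fi
echo "$result"
```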
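Command substitution itself removes the trailing newline(s) from its result, i.e. from the output of head -1 "$file", so the comparison only ever sees the dashes.
Word splitting is not involved here: Bash does not perform word splitting on expansions inside [[ ... ]].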
