check if a file is jpeg format using shell script - linux

I know I can use file pic.jpg to get the file type, but how do I write an if statement to check it in a shell script?
E.g. (pseudo code):
if pic.jpg == jpeg file then

Try (assumes Bash v3.0+, using =~, the regex-matching operator):
if [[ $(file -b 'pic.jpg') =~ JPEG ]]; then ...
If you want to match file's output more closely:
if [[ $(file -b 'pic.jpg') =~ ^'JPEG ' ]]; then ...
This will only match if the output starts with 'JPEG', followed by a space.
Alternatively, if you'd rather use a globbing-style pattern:
if [[ $(file -b 'pic.jpg') == 'JPEG '* ]]; then ...
POSIX-compliant conditionals ([ ... ]) do not offer regex or pattern matching, so a different approach is needed:
if expr "$(file -b 'pic.jpg')" : 'JPEG ' >/dev/null; then ...
Note: expr only supports basic regular expressions and is implicitly anchored at the start of the string (no need for ^).
As for why [[ ... ]] rather than [ ... ] is needed in the Bash snippets:
Advanced features such as the regex operator (=~) or pattern matching (e.g., use of unquoted * to represent any sequence of characters) are nonstandard (not part of the POSIX shell specification).
Since these features are incompatible with how standard conditionals ([ ... ]) work, a new syntax was required; Bash, Ksh, and Zsh use [[ ... ]].

Good old case is worth a mention, too.
case $(file -b pic.jpg) in
'JPEG '*)
echo is
;;
*)
echo is not
;;
esac
The lone right parenthesis might seem odd at first, but other than that, this is a reasonably simple, readable, and versatile approach, and it is portable all the way back to the original Bourne shell. (POSIX allows for a matching left parenthesis before the pattern, too.)
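To reuse the case test in an if statement, as the question asks, it can be wrapped in a function. This is only a minimal sketch: the names is_jpeg_description and is_jpeg are made up for this illustration, and it assumes that file -b describes JPEG data with a line starting with "JPEG ", as shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a `file -b` description starts with "JPEG ".
is_jpeg_description() {
  case $1 in
    'JPEG '*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Hypothetical wrapper: run file(1) on the path in $1 and test its description.
is_jpeg() {
  is_jpeg_description "$(file -b "$1")"
}

# Usage: classify every path given on the command line.
for candidate in "$@"; do
  if is_jpeg "$candidate"; then
    printf '%s: JPEG\n' "$candidate"
  else
    printf '%s: not a JPEG\n' "$candidate"
  fi
done
```

Because the function's exit status is what if tests, no [ ... ] or [[ ... ]] is needed at the call site.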

For JPEG files, the file -b output has JPEG as the first word on the line:
pax> file -b somefile.jpg
JPEG image data, JFIF standard 1.01, blah blah blah
So, you can use that to detect it with something like:
inputFile=somefile.jpg
if [[ $(file -b "$inputFile" | awk '{print $1}') == "JPEG" ]] ; then
echo "$inputFile is a JPEG file."
fi

Related

Using grep command inside case statement

I have a script that determines the type of a file and acts accordingly. I determine the type with the file command and then grep for a specific string: for example, if the file is zipped, unzip it; if it is gzipped, gunzip it. I want to add a lot of different file types.
I am trying to replace the if statements with case and can't figure it out.
My script looks like this:
##$arg is the file itself
TYPE="$(file $arg)"
if [[ $(echo $TYPE|grep "bzip2") ]] ; then
bunzip2 $arg
elif [[ $(echo $TYPE|grep "Zip") ]] ; then
unzip $arg
fi
Thanks to everyone that help :)
The general syntax is
case expr in
pattern) action;;
other) otheraction;;
*) default action --optional;;
esac
So for your snippet,
case $(file "$arg") in
*bzip2*) bunzip2 "$arg";;
*Zip*) unzip "$arg";;
esac
If you want to capture the file output into a variable first, do that, of course; but avoid upper case for your private variables.
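By default, though, bunzip2 replaces its input file with the decompressed data, and unzip extracts the archive's members as files on disk. Perhaps you want to avoid that and stream the contents instead?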
case $(file "$arg") in
*bzip2*) bzip2 -dc <"$arg";;
*Zip*) unzip -p "$arg";;
esac |
grep "stuff"
Notice also how the shell conveniently lets you pipe out of (and into) conditionals.
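Since the question mentions adding many more file types, one way to keep that manageable is to separate "which kind of archive is this?" from "how do I stream it?". The following is a sketch under assumptions: archive_kind and stream_archive are hypothetical names, the gzip branch is added for illustration, and the patterns assume the wording that file typically prints for these formats.

```shell
#!/bin/sh
# Hypothetical helper: map a `file -b` description ($1) to a short kind name.
archive_kind() {
  case $1 in
    *bzip2*) echo bzip2 ;;
    *gzip*)  echo gzip ;;
    *Zip*)   echo zip ;;
    *)       echo unknown ;;
  esac
}

# Hypothetical helper: write the decompressed contents of the file in $1
# to stdout without touching the file itself.
stream_archive() {
  case $(archive_kind "$(file -b "$1")") in
    bzip2) bzip2 -dc <"$1" ;;
    gzip)  gzip -dc <"$1" ;;
    zip)   unzip -p "$1" ;;
    *)     printf '%s: unsupported file type\n' "$1" >&2; return 1 ;;
  esac
}

# Usage: search the contents of the archive named on the command line.
if [ "$#" -ge 1 ]; then
  stream_archive "$1" | grep 'stuff'
fi
```

Supporting another format then means adding one line to each case statement.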

How to replace date part in filename with current date

How can I replace only the date part with the current date for all files present in a directory in Unix?
Folder path: C:/shan
Sample files:
CN_Apria_837p_20180924.txt
DN_Apria_837p_20150502.txt
GN_Apria_837p_20160502.txt
CH_Apria_837p_20170502.txt
CU_Apria_837p_20180502.txt
PN_Apria_837p_20140502.txt
CN_Apria_837p_20101502.txt
Desired result should be:
CN_Apria_837p_20190502.txt
DN_Apria_837p_20190502.txt
GN_Apria_837p_20190502.txt
CH_Apria_837p_20190502.txt
CU_Apria_837p_20190502.txt
PN_Apria_837p_20190502.txt
CN_Apria_837p_20190502.txt
Edit:
I'm completely new to Unix shell scripting. I tried the code below, but it's not working.
#!/bin/bash
for i in ls $1 | grep -E '[0-9]{4}-[0-9]{2}-[0-9]{2}'
do
x=echo $i | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}'
y=echo $i | sed "s/$x/$(date +%F)/g"
mv $1/$i $1/$y 2>/dev/null #incase if old date is same as current date
done
I would use regular expressions here. From the bash man-page:
An additional binary operator, =~, is available, with the same
precedence as == and !=. When it is used, the string to the right
of the operator is considered an extended regular expression and
matched accordingly (as in regex(3)). The return value is 0 if the
string matches the pattern, and 1 otherwise. .... Substrings
matched by parenthesized subexpressions within the regular
expression are saved in the array variable BASH_REMATCH. ...
The element of BASH_REMATCH with index n is the portion of the
string matching the nth parenthesized sub-expression.
Hence, assuming that the variable x holds the name of one of the files
in question, the code
if [[ $x =~ ^(.*_)[0-9]+([.]txt$) ]]
then
mv "$x" "${BASH_REMATCH[1]}$(date +%Y%m%d)${BASH_REMATCH[2]}"
fi
first tests roughly whether the file indeed follows the required naming scheme, and then modifies the name accordingly.
Of course in practice, you will tailor the regexp to match your application better. Only you can know what variations in the file name are permitted.
The below should do this
for f in $(find /path/to/files -name "*_*_*_*.txt")
do
newname=$(echo "$f" | sed -r "s/[12][0-9]{3}[01][0-9][0-3][0-9]/$(date '+%Y%m%d')/g")
mv "$f" "$newname"
done
Try this Shellcheck-clean code:
#! /bin/bash -p
readonly dir=$1
shopt -s nullglob # Make glob patterns that match nothing expand to nothing
readonly dateglob='20[0-9][0-9][0-9][0-9][0-9][0-9]'
currdate=$(date '+%Y%m%d')
# shellcheck disable=SC2231
for path in "$dir"/*_${dateglob}.* ; do
name=${path##*/}
newname=${name/_${dateglob}./_${currdate}.}
if [[ $newname != "$name" ]] ; then
newpath="$dir/$newname"
printf "%q -> %q\\n" "$path" "$newpath"
mv -i -- "$path" "$newpath"
fi
done
shopt -s nullglob stops the code trying to process a garbage path if nothing matches the glob pattern in for path in ....
The pattern assigned to dateglob assumes that you will not have to process dates before 2000 (or after 2099!). Change it if that assumption is not valid.
The # shellcheck ... line is to prevent Shellcheck warning about the use of ${dateglob} without quotes. The quotes would be wrong in this case because they would prevent the glob pattern being expanded.
The pattern used to match filenames (*_${dateglob}.*) will match many more forms of filename than the examples given (e.g. A_20180313.tar.gz). You might want to change it.
See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for information about the Bash string manipulation mechanisms used (${path##...}, ${name/...}).
I've added a printf to output details of what is being moved.
The -i option to mv prompts for confirmation if a file would be overwritten. This turns out to be an issue for the example files because both CN_Apria_837p_20180924.txt and CN_Apria_837p_20101502.txt are identical except for the date, so the code tries to rename them to the same thing.
If any of the files with dates in their names have names beginning with '.', the code will not process them. Add line shopt -s dotglob somewhere before the loop if that is an issue.
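Because two of the example files end up with the same target name, it can help to do a dry run that lists such collisions before anything is renamed. The sketch below uses only POSIX parameter expansion rather than the Bash features above; new_name and colliding_targets are hypothetical helper names, and it assumes the date is the 8 digits directly after the last underscore.

```shell
#!/bin/sh
# Hypothetical helper: print NAME ($1) with the 8-digit date after its last
# "_" replaced by DATE ($2). Names that do not fit are printed unchanged.
new_name() {
  case $1 in
    *_*) ;;                               # has an underscore, carry on
    *) printf '%s\n' "$1"; return ;;
  esac
  stem=${1%_*}                            # everything before the last "_"
  rest=${1##*_}                           # e.g. 20180924.txt
  case $rest in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].*)
      # ${rest#????????} drops the 8 date characters, keeping ".txt"
      printf '%s_%s%s\n' "$stem" "$2" "${rest#????????}" ;;
    *)
      printf '%s\n' "$1" ;;
  esac
}

# Hypothetical helper: read file names on stdin and print every target name
# that more than one of them would be renamed to, using DATE ($1).
colliding_targets() {
  while IFS= read -r name; do
    new_name "$name" "$1"
  done | sort | uniq -d
}

# Usage: check two of the example names against today's date.
printf '%s\n' CN_Apria_837p_20180924.txt CN_Apria_837p_20101502.txt |
  colliding_targets "$(date '+%Y%m%d')"
```

Any name this prints would be claimed by more than one source file, so those files need a different naming rule before running the real rename.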

Extract property value in filename?

I have many file paths of the form:
dir1/someotherdir/name_q=3_a=2.34_p=1.2.ext
I am running a bash script to do some processing on these files, and I need to extract the value of p (in this case 1.2; in general it is a floating number) from each of these paths. Basically I am running a for loop over all the file paths, and for each path, I need to extract the value of p. How can I do this?
Parameter expansion is a useful tool for this kind of operation:
#!/bin/bash
# ^^^^ IMPORTANT: Not /bin/sh
f=dir1/someotherdir/name_q=3_a=2.34_p=1.2.ext
if [[ $f = *_p=* ]]; then # Check for substring in filename
val=${f##*_p=} # Trim everything before the last "_p="
val=${val%%_*} # Trim everything after first subsequent _
val=${val%.ext} # Trim extension, should it exist.
echo "Extracted $val from filename $f"
fi
Alternately, you could also use shell-native regex support:
#!/bin/bash
# ^^^^ again, NOT /bin/sh
f=dir1/someotherdir/name_q=3_a=2.34_p=1.2.ext
# assigning regex to a variable avoids surprising behavior with some older bash releases
p_re='_p=([[:digit:].]+)(_|[.]ext$)'
if [[ $f =~ $p_re ]]; then # evaluate regex
echo "Extracted ${BASH_REMATCH[1]}" # extract groups from BASH_REMATCH array
fi
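The parameter-expansion approach generalizes to any property name and fits naturally into the for loop the question describes. This is a sketch only: get_prop is a hypothetical function name, the second example path is made up, and the .ext extension is assumed as in the question. It sticks to POSIX expansions, so it also runs under /bin/sh.

```shell
#!/bin/sh
# Hypothetical helper: print the value of property KEY ($2) found in the
# path given as $1. Returns non-zero if the path has no "_KEY=" substring.
get_prop() {
  case $1 in
    *_"$2"=*) ;;              # property present, carry on
    *) return 1 ;;
  esac
  value=${1##*_"$2"=}         # trim everything up to the last "_KEY="
  value=${value%%_*}          # trim from the next "_" onwards
  value=${value%.ext}         # trim the extension, should it exist
  printf '%s\n' "$value"
}

# Usage: loop over example paths (the second one is made up).
for f in 'dir1/someotherdir/name_q=3_a=2.34_p=1.2.ext' 'dir2/name_q=1_p=0.75_a=9.ext'; do
  if p=$(get_prop "$f" p); then
    printf '%s -> p=%s\n' "$f" "$p"
  fi
done
```

Quoting "$2" inside the patterns keeps any glob characters in the key literal.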
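For completeness, another approach is to use eval. There can be security dangers here: eval executes whatever text it is given, so a crafted file name could run arbitrary commands. You have to make up your own mind whether that risk is justified.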
I am using IFS for the split - not everyone's favourite, but it is another way to do it. The eval will execute each assignment as it finds it, in this case dynamically creating variables q, a, and p.
fname='dir1/someotherdir/name_q=3_a=2.34_p=1.2.ext'
OldIFS="$IFS"
IFS='_'
for val in $fname
do
if [[ $val == *=* ]]
then
val=${val%.ext}
eval "$val"
fi
done
IFS="$OldIFS"
echo "$q"
echo "$a"
echo "$p"
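If the eval risk is not acceptable, the same IFS split can print key/value pairs instead of executing them. This is a different technique from the eval approach, offered as a sketch: list_props is a hypothetical name, and the .ext extension is assumed as in the question.

```shell
#!/bin/sh
# Hypothetical helper: print one "key value" line per key=value field in the
# file name part of the path given as $1. Nothing from the name is executed.
list_props() {
  name=${1##*/}        # drop the directory part
  name=${name%.ext}    # drop the extension, should it exist
  old_ifs=$IFS
  IFS=_                # split fields on "_"
  set -f               # no glob expansion while $name is unquoted
  for field in $name; do
    case $field in
      *=*) printf '%s %s\n' "${field%%=*}" "${field#*=}" ;;
    esac
  done
  set +f
  IFS=$old_ifs
}

list_props 'dir1/someotherdir/name_q=3_a=2.34_p=1.2.ext'
```

A caller can consume the output with a while read -r key value loop and pick out the keys it cares about.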

Why am I getting command not found error on numeric comparison?

I am trying to parse each line of a file and look for a particular string. The script seems to be doing its intended job, however, in parallel it tries to execute the if command on line 6:
#!/bin/bash
for line in $(cat $1)
do
echo $line | grep -e "Oct/2015"
if($?==0); then
echo "current line is: $line"
fi
done
and I get the following (my script is readlines.sh)
./readlines.sh: line 6: 0==0: command not found
First: As Mr. Llama says, you need more spaces. Right now your script tries to look for a file named something like /usr/bin/0==0 to run. Instead:
[ "$?" -eq 0 ] # POSIX-compliant numeric comparison
[ "$?" = 0 ] # POSIX-compliant string comparison
(( $? == 0 )) # bash-extended numeric comparison
Second: Don't test $? at all in this case. In fact, you don't even have good cause to use grep; the following is both more efficient (because it uses only functionality built into bash and requires no invocation of external commands) and more readable:
if [[ $line = *"Oct/2015"* ]]; then
echo "Current line is: $line"
fi
If you really do need to use grep, write it like so:
if echo "$line" | grep -q "Oct/2015"; then
echo "Current line is: $line"
fi
That way if operates directly on the pipeline's exit status, rather than running a second command testing $? and operating on that command's exit status.
@Charles Duffy has a good answer, which I have up-voted as correct (and it is), but here's a detailed, line-by-line breakdown of your script and the correct thing to do for each part of it.
for line in $(cat $1)
As I noted in my comment elsewhere this should be done as a while read construct instead of a for cat construct.
This construct will wordsplit each line making spaces in the file separate "lines" in the output.
All empty lines will be skipped.
In addition when you cat $1 the variable should be quoted. If it is not quoted spaces and other less-usual characters appearing in the file name will cause the cat to fail and the loop will not process the file.
The complete line would read:
while IFS= read -r line
An illustrative example of the tradeoffs can be found here. The linked test script follows. I tried to include an indication of why IFS= and -r are important.
#!/bin/bash
mkdir -p /tmp/testcase
pushd /tmp/testcase >/dev/null
printf '%s\n' '' two 'three three' '' ' five with leading spaces' 'c:\some\dos\path' '' > testfile
printf '\nwc -l testfile:\n'
wc -l testfile
printf '\n\nfor line in $(cat) ... \n\n'
let n=1
for line in $(cat testfile) ; do
echo line $n: "$line"
let n++
done
printf '\n\nfor line in "$(cat)" ... \n\n'
let n=1
for line in "$(cat testfile)" ; do
echo line $n: "$line"
let n++
done
let n=1
printf '\n\nwhile read ... \n\n'
while read line ; do
echo line $n: "$line"
let n++
done < testfile
printf '\n\nwhile IFS= read ... \n\n'
let n=1
while IFS= read line ; do
echo line $n: "$line"
let n++
done < testfile
printf '\n\nwhile IFS= read -r ... \n\n'
let n=1
while IFS= read -r line ; do
echo line $n: "$line"
let n++
done < testfile
rm -- testfile
popd >/dev/null
rmdir /tmp/testcase
Note that this is a bash-heavy example: let, pushd, and popd, for example, are not portable to other shells. (read -r itself is specified by POSIX.) On to the next line of your script.
do
As a matter of style I prefer do on the same line as the for or while declaration, but there's no convention on this.
echo $line | grep -e "Oct/2015"
The variable $line should be quoted here. In general, meaning always unless you specifically know better, you should double-quote all expansion--and that means subshells as well as variables. This insulates you from most unexpected shell weirdness.
You declared your shell as bash, which means you will have the "here string" operator <<< available to you. When available, it can be used to avoid the pipe; each element of a pipeline executes in a subshell, which incurs extra overhead and can lead to unexpected behavior if you try to modify variables. This would be written as
grep -e "Oct/2015" <<<"$line"
Note that I have quoted the line expansion.
You have called grep with -e, which is not incorrect but is needless since your pattern does not begin with -. In addition, you have double-quoted a string in shell, but you don't attempt to expand a variable or use other shell interpolation inside of it. When you don't expect and don't want the contents of a quoted string to be treated as special by the shell, you should single-quote them. Furthermore, your use of grep is inefficient: because your pattern is a fixed string and not a regular expression, you could have used fgrep or grep -F, which does fixed-string (substring) matching rather than regular expression matching (and can be faster because of this). So this could be
grep -F 'Oct/2015' <<<"$line"
Without altering the behavior.
if($?==0); then
This is the source of your original problem. In shell scripts, commands and their arguments are separated by whitespace. When you write if($?==0), the $? expands, probably to 0, and the parentheses start a subshell in which bash tries to run a command named 0==0; that is the "command not found" in your error message. What you wanted to do was have if run a test command with some arguments, which requires a different syntax and more whitespace. I believe others have covered this sufficiently.
You should never need to test the value of $? in a shell script. The if command exists for branching behavior based on the return code of whatever command you pass to it, so you can inline your grep call and have if check its return code directly, thus:
if grep -F 'Oct/2015' <<<"$line" ; then
Note the generous whitespace around the ; delimiter. I do this because in shell whitespace is usually required and can only sometimes be omitted. Rather than try to remember when you can do which, I recommend an extra space of padding around everything. It's never wrong and can make other mistakes easier to notice.
As others have noted this grep will print matched lines to stdout, which is probably not something you want. If you are using GNU grep, which is standard on Linux, you will have the -q switch available to you. This will suppress the output from grep
if grep -q -F 'Oct/2015' <<<"$line" ; then
If you are in an environment with a grep that doesn't know -q (the option is specified by POSIX, but some very old implementations lack it), the traditional way to achieve this effect is to redirect stdout to /dev/null:
if printf '%s\n' "$line" | grep -F 'Oct/2015' >/dev/null ; then
In this example I also removed the here string bashism just to show a portable version of this line.
echo "current line is: $line"
There is nothing wrong with this line of your script, except that, although echo is standard, implementations vary to such an extent that it's not possible to rely absolutely on its behavior. You can use printf anywhere you would use echo, and you can be fairly confident of what it will print. Even printf has some caveats: some uncommon escape sequences are not evenly supported. See mascheck for details.
printf 'current line is: %s\n' "$line"
Note the explicit newline at the end; printf doesn't add one automatically.
fi
No comment on this line.
done
In the case where you did as I recommended and replaced the for line with a while read construct this line would change to:
done < "$1"
This directs the contents of the file in the $1 variable to the stdin of the while loop, which in turn passes the data to read.
In the interests of clarity I recommend copying the value from $1 into another variable first. That way when you read this line the purpose is more clear.
I hope no one takes great offense at the stylistic choices made above, which I have attempted to note; there are many ways to do this (but not a great many correct ways).
Be sure to always run interesting snippets through the excellent shellcheck and explain shell when you run into difficulties like this in the future.
And finally, here's everything put together:
#!/bin/bash
input_file="$1"
while IFS= read -r line ; do
if grep -q -F 'Oct/2015' <<<"$line" ; then
printf 'current line is: %s\n' "$line"
fi
done < "$input_file"
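For comparison, here is a variant that avoids starting grep once per line by doing the substring test in the shell itself, as suggested in the earlier answer. It is a sketch, not the script above: print_matching_lines is a hypothetical name, and it uses case instead of [[ ... ]] so that it also runs under a plain POSIX sh.

```shell
#!/bin/sh
# Hypothetical helper: print every line of the file in $1 that contains the
# fixed string in $2.
print_matching_lines() {
  # "|| [ -n "$line" ]" also handles a last line with no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      *"$2"*) printf 'current line is: %s\n' "$line" ;;
    esac
  done < "$1"
}

# Usage: run only when a file name was given on the command line.
if [ "$#" -ge 1 ]; then
  print_matching_lines "$1" 'Oct/2015'
fi
```

Quoting "$2" in the case pattern makes the search string literal, which matches what grep -F does.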
If you like one-liners, you may use AND operator (&&), for example:
echo "$line" | grep -e "Oct/2015" && echo "current line is: $line"
or:
grep -qe "Oct/2015" <<<"$line" && echo "current line is: $line"
Spacing is important in shell scripting.
Also, double-parens is for numerical comparison, not single-parens.
if (( $? == 0 )); then

Remove substring matching pattern both in the beginning and the end of the variable

As the title says, I'm looking for a way to remove a defined pattern both at the beginning of a variable and at the end. I know I have to use # and % but I don't know the correct syntax.
In this case, I want to remove http:// at the beginning, and /score/ at the end of the variable $line which is read from file.txt.
Well, you can't nest ${var%}/${var#} operations, so you'll have to use a temporary variable.
Like here:
var="http://whatever/score/"
temp_var="${var#http://}"
echo "${temp_var%/score/}"
Alternatively, you can use regular expressions with (for example) sed:
some_variable="$( echo "$var" | sed -e 's#^http://##; s#/score/$##' )"
$ var='https://www.google.com/keep/score'
$ var=${var#*//} # removes everything up to and including // from the beginning
$ var=${var%/*} # removes the last / and everything after it
$ echo "$var"
www.google.com/keep
You have to do it in 2 steps :
$ string="fooSTUFFfoo"
$ string="${string%foo}"
$ string="${string#foo}"
$ echo "$string"
STUFF
There IS a way to do it one step using only built-in bash functionality (no running external programs such as sed) -- with BASH_REMATCH:
url=http://whatever/score/
re='https?://(.*)/score/'
[[ $url =~ $re ]] && printf '%s\n' "${BASH_REMATCH[1]}"
This matches against the regular expression on the right-hand side of the =~ test, and puts the groups into the BASH_REMATCH array.
That said, it's more conventional to use two PE expressions and a temporary variable:
shopt -s extglob
url=http://whatever/score/
val=${url#http?(s)://}; val=${val%/score/}
printf '%s\n' "$val"
...in the above example, the extglob option is used to allow the shell to recognize "extglobs" -- bash's extensions to glob syntax (making glob-style patterns similar in power to regular expressions), among which ?(foo) means that foo is optional.
By the way, I'm using printf rather than echo in these examples because many of echo's behaviors are implementation-defined -- for instance, consider the case where the variable's contents are -e or -n.
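Applied to the situation in the question, where $line is read from file.txt, the two-step expansion might look like the sketch below. strip_url is a hypothetical function name, and the loop is skipped when file.txt is not readable.

```shell
#!/bin/sh
# Hypothetical helper: print $1 without a leading "http://" and without a
# trailing "/score/". Parts that are not present are simply left alone.
strip_url() {
  stripped=${1#http://}
  stripped=${stripped%/score/}
  printf '%s\n' "$stripped"
}

# Usage: process every line of file.txt, if it exists.
if [ -r file.txt ]; then
  while IFS= read -r line; do
    strip_url "$line"
  done < file.txt
fi
```

Both expansions are plain POSIX, so this does not depend on Bash.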
How about:
export x='https://www.google.com/keep/score';
var=$(perl -e 'if ( $ENV{x} =~ /(https:\/\/)(.+)(\/score)/ ) { print qq($2);}')
