How to monitor CPU usage automatically and return results when it reaches a threshold - linux

I am new to shell scripting. I want to write a script that monitors CPU usage and, if the usage reaches a threshold, prints the output of the top command. Here is my script, which gives me a "bad number" error and also does not store any value in the log files:
while sleep 1;do if [ "$(top -n1 | grep -i ^cpu | awk '{print $2}')">>sy.log - ge "$Threshold" ]; then echo "$(top -n1)">>sys.log;fi;done

Your script HAS to be indented and stored in a file, especially if you are new to shell!
#!/bin/sh
while sleep 1
do
    if [ "$(top -n1 | grep -i ^cpu | awk '{print $2}')">>sy.log - ge "$Threshold" ]
    then
        echo "$(top -n1)" >> sys.log
    fi
done
Your condition looks a bit odd. It may work, but it looks really complex. Store intermediate results in variables, and evaluate them.
Then, you will immediately see the syntax error on the “-ge”.
You HAVE to store logfiles within an absolute path for security reasons. Use variables to simplify the reading.
#!/bin/sh
LOGFILE=/absolute_path/sy.log
WHOLEFILE=/absolute_path/sys.log
Threshold=80
while sleep 1
do
    TOP="$(top -n1)"
    # Quote "$TOP" so the line breaks survive and grep can anchor on ^cpu
    CPU="$(echo "$TOP" | grep -i ^cpu | awk '{print $2}')"
    echo "$CPU" >> "$LOGFILE"
    if [ "$CPU" -ge "$Threshold" ] ; then
        echo "$TOP" >> "$WHOLEFILE"
    fi
done

You have a couple of errors.
If you write output to sy.log with a redirection then that output is no longer available to the shell. You can work around this with tee.
The dash before -ge must not be followed by a space.
Also, a few stylistic remarks:
grep x | awk '{y}' is a useless use of grep; this can usefully and more economically (as well as more elegantly) be rewritten as awk '/x/{y}'
echo "$(command)" is a useless use of echo -- not a deal-breaker, but you simply want command; there is no need to capture what it prints to standard output just so you can print that text to standard output.
If you are going to capture the output of top -n 1 anyway, there is no need really to run it twice.
Further notes:
If you know the capitalization of the field you want to extract, maybe you don't need to search case-insensitively. (I could not find a version of top which prints a CPU prefix with the load in the second field. Is the expression really correct?)
The shell only supports integer arithmetic. Is this a bug? Maybe you want to use Awk (which has floating-point support) to perform the comparison? This also allows for a moderately tricky refactoring. We make Awk output an exit code of 1 if the comparison fails, and use that as the condition for the if.
#!/bin/sh
while sleep 1
do
    if top=$(top -n 1 |
        awk -v thres="$Threshold" '1; # print every line
            tolower($1) ~ /^cpu/ { print $2 >>"sy.log";
                exitcode = ($2 >= thres ? 0 : 1) }
            END { exit exitcode }')
    then
        echo "$top" >>sys.log
    fi
done
Do you really mean to have two log files with nearly the same name, or is that a typo? Including a time stamp in the log might be useful both for troubleshooting and for actually using the log files.
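The floating-point comparison and the time stamp suggested above could be sketched roughly like this. The helper names cpu_exceeds and log_with_timestamp are made up for illustration, and how the CPU figure is pulled out of top is deliberately left out because it depends on the top version.

```shell
# Hypothetical helper: succeed (exit status 0) when $1 >= $2.
# awk does the comparison, so fractional values such as 85.5 work.
cpu_exceeds() {
    awk -v cpu="$1" -v thres="$2" 'BEGIN { exit !(cpu + 0 >= thres + 0) }'
}

# Hypothetical helper: prefix every line read from standard input with a
# time stamp and append the result to the file named in $1.
log_with_timestamp() {
    stamp=$(date '+%Y-%m-%d %H:%M:%S')
    while IFS= read -r line; do
        printf '%s %s\n' "$stamp" "$line"
    done >> "$1"
}
```

Inside the loop this might be used as: if cpu_exceeds "$CPU" "$Threshold"; then top -b -n 1 | log_with_timestamp /absolute_path/sys.log; fi (assuming a top that supports batch mode with -b).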

Related

eliminate subshells for faster process?

I've read that scripts that call subshells are slow, which would explain why my script is slow.
For example, here I'm running a loop that gets a number from an array. Is this running a subshell every time, and can this be solved without using subshells?
mmode=1
modes[1,2]="9,12,18,19,20,30,43,44,45,46,47,48,49"
until [[ -z $kik ]];do
((++mloop))
kik=$(echo ${modes[$mmode,2]} | cut -d "," -f $mloop)
filename=$(basename "$f")
# is all these lines
xcolorall=$((xcolorall+storednr))
# also triggering
pros2=$(echo "100/$totpix*$xcolorall" | bc -l)
IFS='.' read -r pros5 pros6 <<< "$pros2"
procenthittotal2=$pros5.${pros6:0:2}
#subshells and if,
# is it possible to circumvent it?
#and more of the same code..
done
Update:
The pros2 variable calculates a percentage (what percent xcolorall is of totpix), and the kik variable gets a number from the array modes, telling the loop which color it should count in this iteration.
I suspect these are the main resource hogs. Is there any way to do this without subshells?
You can replace all the subshells and external commands shown in your question with bash built-ins.
kik=$(echo ${modes[$mmode,2]} | cut -d "," -f $mloop) can be replaced by
mapfile -d, -t -s$((mloop-1)) -n1 kik <<< "${modes[$mmode,2]}".
If $mmode is constant here, better replace the whole loop with
while IFS=, read -r kik; do ...; done <<< "${modes[$mmode,2]}".
filename=$(basename "$f") can be replaced by
filename=${f##*/} which runs 100 times faster, see benchmark.
pros2=$(echo "100/$totpix*$xcolorall" | bc -l) can be replaced by
(( pros2 = 100 * xcolorall / totpix )) if you don't care for the decimals, or by
precision=2; (( pros = 10**precision * 100 * xcolorall / totpix )); printf -v pros "%0${precision}d" "$pros"; pros="${pros:0: -precision}.${pros: -precision}" if you want 2 decimal places.
Of course you can leave out the last commands (for turning 12345 into 123.45) until you really need the decimal number.
But if speed really matters, write the script in another language. I think awk, perl, or python would be a good match here.
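As a rough sketch of the same ideas in a form that runs in bash as well as plain sh, the helpers below avoid command substitution entirely by leaving their result in a variable. The names percent, strip_dir and for_each_field, and the result variables they set, are invented for this example.

```shell
# Hypothetical helper: put "100 * $1 / $2" with two decimals into the
# variable percent_result using only shell arithmetic (no bc, no subshell).
percent() {
    scaled=$(( 10000 * $1 / $2 ))          # percentage times 100, truncated
    frac=$(( scaled % 100 ))
    case $frac in ?) frac=0$frac ;; esac   # pad a single digit: 5 becomes 05
    percent_result=$(( scaled / 100 )).$frac
}

# Hypothetical helper: strip the directory part of $1 without calling
# basename; the result is left in base_result.
strip_dir() {
    base_result=${1##*/}
}

# Hypothetical helper: call the function named in $2 once per field of the
# comma-separated list in $1, instead of running cut for every field.
for_each_field() {
    old_ifs=$IFS
    IFS=,
    for field in $1; do
        IFS=$old_ifs
        "$2" "$field"
    done
    IFS=$old_ifs
}
```

For example, percent "$xcolorall" "$totpix" leaves the value in $percent_result. Note that, unlike basename, ${f##*/} yields an empty string for a path that ends in a slash.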

Length comparison of one specific field in linux

I was trying to check the length of the second field of a TSV file (hundreds of thousands of lines). However, it runs very slowly. I guess something is wrong with "echo", but I'm not sure how to fix it.
Input file:
prob name
1.0 Claire
1.0 Mark
... ...
0.9 GFGKHJGJGHKGDFUFULFD
So I need to print out what went wrong in the name. I tested with a little example using "head -100" and it worked. But just can't cope with original file.
This is what I ran:
for line in `cat filename | cut -f2`;do
    length=`echo -n $line | wc -m`
    if [ "$length" -gt 10 ];then
        echo $line
    fi
done
awk to the rescue:
awk 'length($2)>10' file
This will print all lines having the second field length longer than 10 characters.
Note that it doesn't require any block statement {...} because if the condition is met, awk will by default print the line.
Try this probably:
cat file.tsv | awk '{if (length($2) > 10) print $0;}'
This should be a bit faster since the whole processing is done by the single awk process, while your solution starts 2 processes per loop iteration to make that comparison.
We can use awk if that helps.
awk '{if(length($2) > 10){print}}' filename
Here $2 is the second field of each line of filename; awk evaluates the condition once per line, which is much faster than the shell loop.
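If a name can contain spaces, awk's default whitespace splitting would cut it short. A possible variant that splits on tabs only and skips the header line is sketched below; the function name long_names and the default limit of 10 are assumptions for the example, and it assumes the file really does start with a header row.

```shell
# Hypothetical helper: print the rows of the TSV file $1 whose second field is
# longer than $2 characters (default 10), skipping the header line.
long_names() {
    awk -F '\t' -v max="${2:-10}" 'NR > 1 && length($2) > max' "$1"
}
```

Called as long_names filename or long_names filename 15.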

Linux Scripting with Spaces in Filenames

I am currently working with a vendor-provided software that is trying to handle sending attachment files to another script that will text-extract from the listed file. The script fails when we receive files from an outside source that contain spaces, as the vendor-supplied software does not surround the filename in quotes - meaning when the text-extraction script is run, it receives a filename that will split apart on the space and cause an error on the extractor script. The vendor-provided software is not editable by us.
This whole process is designed to be an automated transfer, so having this wrench that could be randomly thrown into the gears is an issue.
What we're trying to do, is handle the spaced name in our text extractor script, since that is the piece we have some control over. After a quick Google, it seems like changing the IFS value for the script would be the quick solution, but unfortunately, that script would take effect after the extensions have already mutilated the incoming data.
The script I'm using takes in a -e value, a -i value, and a -o value. These values are sent from the vendor supplied script, which I have no editing control over.
#!/bin/bash
usage() { echo "Usage: $0 -i input -o output -e encoding" 1>&2; exit 1; }
while getopts ":o:i:e:" o; do
    case "${o}" in
        i)
            inputfile=${OPTARG}
            ;;
        o)
            outputfile=${OPTARG}
            ;;
        e)
            encoding=${OPTARG}
            ;;
        *)
            usage
            ;;
    esac
done
shift $((OPTIND-1))
...
...
<Uses the inputfile, outputfile, and encoding variables>
I admit, there may be pieces to this I don't fully understand, and it could be a simple fix, but my end goal is to be able to extract -o, -i, and -e so that each contains one value, regardless of the spaces within each section. I can handle the quoting inside the script once I can extract the filename value.
The script fragment that you have posted does not have any issues with spaces in the arguments.
The following, for example, does not need quoting (since it's an assignment):
inputfile=${OPTARG}
All other uses of $inputfile in the script should be double quoted.
What matters is how this script is called.
This would fail and would assign only hello to the variable inputfile:
$ ./script.sh -i hello world.txt
The string world.txt would prompt the getopts function to stop processing the command line and the script would continue with the shift (world.txt would be left in $1 afterwards).
The following would correctly assign the string hello world.txt to inputfile:
$ ./script.sh -i "hello world.txt"
as would
$ ./script.sh -i hello\ world.txt
The following script uses awk to split the arguments while including spaces in the file names. The arguments can be in any order. It does not handle multiple consecutive spaces in an argument, it collapses them to one.
#!/bin/bash
IFS=' '
str=$(printf "%s" "$*")
istr=$(echo "${str}" | awk 'BEGIN {FS="-i"} {print $2}' | awk 'BEGIN {FS="-o"} {print $1}' | awk 'BEGIN {FS="-e"} {print $1}')
estr=$(echo "${str}" | awk 'BEGIN {FS="-e"} {print $2}' | awk 'BEGIN {FS="-o"} {print $1}' | awk 'BEGIN {FS="-i"} {print $1}')
ostr=$(echo "${str}" | awk 'BEGIN {FS="-o"} {print $2}' | awk 'BEGIN {FS="-e"} {print $1}' | awk 'BEGIN {FS="-i"} {print $1}')
inputfile="${istr}"
outputfile="${ostr}"
encoding="${estr}"
# call the jar
There was an issue when calling the jar where Java threw a MalformedUrlException on a filename with a space.
So after reading through the commentary, we decided that although it may not be the right answer for every scenario, the right answer for this specific scenario was to extract the pieces manually.
Because we are building this for a pre-built script passing to it, and we aren't updating that script any time soon, we can accept with certainty that this script will always receive a -i, -o, and -e flag, and there will be spaces between them, which causes all the pieces passed in to be stored in different variables in $*.
And we can assume that the text after a flag is the response to the flag, until another flag is referenced. This leaves us 3 scenarios:
The variable contains one of the flags
The variable contains the first piece of a parameter immediately after the flag
The variable contains part 2+ of a parameter, and the space in the name was interpreted as a split, and needs to be reinserted.
One of the other issues I kept running into was trying to get string literals to equate to variables in my IF statements. To resolve that issue, I pre-stored all relevant data in array variables, so I could test $variable == $otherVariable.
Although I don't expect it to change, we also handled what to do if the three flags appear in a different order than we anticipate (our assumption was that they are listed as i, o, e, but we can't see exactly what is passed). The parameters are dumped into an array in the order they were read in, and a parallel array tracks whether the items in slots 0, 1, 2 relate to i, o, e.
The final result still has one flaw: if there is more than one consecutive space in the filename, the whitespace is trimmed before processing, and I can only account for one space. But seeing as we processed over 4000 files before encountering one with a space, I find it unlikely with the naming conventions that we would encounter something with more than one space.
At that point, we would have to step in for a rare manual intervention anyway.
Final code change is as follows:
#!/bin/bash
IFS='|'
position=-1
ioeArray=("" "" "")
previous=""
flagArr=("-i" "-o" "-e" " ")
ioePattern=(0 1 2)
#echo "for loop:"
for i in $*; do
    #printf "%s\n" "$i"
    if [ "$i" == "${flagArr[0]}" ] || [ "$i" == "${flagArr[1]}" ] || [ "$i" == "${flagArr[2]}" ]; then
        ((position += 1));
        previous=$i;
        case "$i" in
            "${flagArr[0]}")
                ioePattern[$position]=0
                ;;
            "${flagArr[1]}")
                ioePattern[$position]=1
                ;;
            "${flagArr[2]}")
                ioePattern[$position]=2
                ;;
        esac
        continue;
    fi
    if [[ $previous == "-"* ]]; then
        ioeArray[$position]=${ioeArray[$position]}$i;
    else
        ioeArray[$position]=${ioeArray[$position]}" "$i;
    fi
    previous=$i;
done
echo "extracting (${ioeArray[${ioePattern[0]}]}) to (${ioeArray[${ioePattern[1]}]}) with (${ioeArray[${ioePattern[2]}]}) encoding."
inputfile="${ioeArray[${ioePattern[0]}]}"
outputfile="${ioeArray[${ioePattern[1]}]}"
encoding="${ioeArray[${ioePattern[2]}]}"
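For comparison, the same regrouping idea (a flag switches the current target, anything else is appended to it with a single space) can be sketched without arrays. This is only an illustration under the same assumptions as above: the function name parse_split_args is made up, only -i, -o and -e are recognised, and runs of several spaces in a name are still lost.

```shell
# Hypothetical helper: regroup arguments that the caller split on spaces.
# Sets inputfile, outputfile and encoding as a side effect.
parse_split_args() {
    inputfile= outputfile= encoding= current=
    for arg in "$@"; do
        case $arg in
            -i|-o|-e) current=$arg; continue ;;   # a flag selects the target
        esac
        case $current in                          # append the fragment to it
            -i) inputfile=${inputfile:+$inputfile }$arg ;;
            -o) outputfile=${outputfile:+$outputfile }$arg ;;
            -e) encoding=${encoding:+$encoding }$arg ;;
        esac
    done
}
```

In the extractor script it would be called as parse_split_args "$@". A filename fragment that is literally -i, -o or -e would be mistaken for a flag.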

Mail output with Bash Script

I want to SSH from Host A to a few hosts (only one is listed below right now) using the SSH key I generated, go to a specific file, grep for a specific word with yesterday's date, and then email the output to myself.
It is sending an email but it is giving me the command as opposed to the output from the command.
#!/bin/bash
HOST="XXXXXXXXXXXXXXXXXX, XXXXXXXXXXXXX"
DATE=$(date -d "yesterday")
INVALID=' cat /xxx/xxx/xxxxx | grep 'WORD' | sed 's/$/.\n/g' | grep "$DATE"'
COUNT=$(echo "$INVALID" | wc -c)
for x in $HOSTS
do
ssh BLA#"$x" $COUNT
if [ "$COUNT" -gt 1 ];
then
EMAILTEXT=""
if [ "$COUNT" -gt 1 ];
then
EMAILTEXT="$INVALID"
fi
fi
done | echo -e "$EMAILTEXT" | mail XXXXXXXXXXX.com
This isn't properly an attempt to answer your question, but I think you should be aware of some fundamental problems with your code.
INVALID=' cat /xxx/xxx/xxxxx | grep 'WORD' | sed 's/$/.\n/g' | grep "$DATE"'
This assigns a simple string to the variable INVALID. Because of quoting issues, s/$/.\n/g is not quoted at all, and will probably be mangled by the shell. (You cannot nest single quotes -- the first single-quoted string extends from the first quote to the next one, and then WORD is outside of any quotes, followed by the next single-quoted string, etc.)
If your intent is to execute this as a command at this point, you are looking for a command substitution; with the multiple layers of uselessness peeled off, perhaps something like
INVALID=$(sed -n -e '/WORD/!d' -e "/$DATE/s/$/./p" /xxx/xxx/xxxx)
which looks for a line matching WORD and $DATE and prints the match with a dot appended at the end -- I believe that's what your code boils down to, but without further insights into what this code is supposed to do, it's impossible to tell if this is what you actually need.
COUNT=$(echo "$INVALID" | wc -c)
This assigns a number to $COUNT. With your static definition of INVALID, the number will always be 62; but I guess that's not actually what you want here.
for x in $HOSTS
do
ssh BLA#"$x" $COUNT
This attempts to execute that number as a command on a number of remote hosts (except the loop is over HOSTS and the variable containing the hosts is named just HOST). This cannot possibly be useful, unless you have a battery of commands named as natural numbers which do something useful on these remote hosts; but I think it's safe to assume that that is not what is supposed to be going on here (and if it was, it would absolutely be necessary to explain this in your question).
if [ "$COUNT" -gt 1 ];
then
EMAILTEXT=""
if [ "$COUNT" -gt 1 ];
then
EMAILTEXT="$INVALID"
fi
fi
So EMAILTEXT is either an empty string or the value of INVALID. You assigned it to be a static string above, which is probably the source of your immediate question. But even if it was somehow assigned to a command on the local host, why do you need to visit remote hosts and execute something there? Or is your intent actually to execute the command on each remote host and obtain the output?
done | echo -e "$EMAILTEXT" | mail XXXXXXXXXXX.com
Piping into echo makes no sense at all, because it does not read its standard input. You should probably just have a newline after done; though a possibly more useful arrangement would be to have your loop produce output which we then pipe to mail.
Purely speculatively, perhaps something like the following is what you actually want.
for host in $HOSTS; do
ssh BLA#"$host" sed -n -e '/WORD/!d' -e "/$DATE/s/$/./p" /xxx/xxx/xxxx |
grep . || echo INVALID
done | mail XXXXXXXXXXX.com
If you want to check that there is strictly more than one line of output (which is what the -gt 1 suggests) then this may need to be a little bit more complicated.
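The difference between storing a command as text and capturing what it prints can be tried out locally, without ssh or mail. The sketch below uses awk's index() for a literal match, which is a swapped-in technique and not the sed command from above; the function name matching_lines is made up, and it assumes the date string is formatted the same way as in the log file.

```shell
# Hypothetical helper: print the lines of file $3 that contain both the word
# $1 and the date string $2, with a dot appended to each.
matching_lines() {
    awk -v word="$1" -v day="$2" \
        'index($0, word) && index($0, day) { print $0 "." }' "$3"
}
```

With that, report=$(matching_lines WORD "$DATE" /xxx/xxx/xxxx) holds the output itself, and a test such as [ -n "$report" ] can decide whether anything should be piped to mail.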
Your command substitution is not working. You should read up on how it works but here are the problem lines:
COUNT=$(echo "$INVALID" | wc -c)
[...]
ssh BLA#"$x" $COUNT
should be:
COUNT_CMD="${INVALID} | wc -c"
[...]
COUNT=$(ssh BLA#"$x" "$COUNT_CMD")
This appends the pipe to the command string stored in $INVALID and passes the whole string to ssh as one double-quoted argument, so the pipes are evaluated on the remote host and not in the local script. (Wrapping the string in literal single quotes would make the remote shell treat the entire pipeline as a single command name.) COUNT is changed to COUNT_CMD for readability/clarity.
EDIT:
I misread the question and have corrected my answer.

Error with a script in bash

I have a little error with a script I wrote in bash and I can't figure out what I'm doing wrong.
Note that I'm using this script for thousands of calculations and this error happened only a few times (20 or so), but it still happened.
What the script does is this: it takes as input a web page that I fetched from a site with the w3m utility and counts all the occurrences of the words in it. Then it orders them from the most common to the ones that occur only once.
this is the code:
#!/bin/bash
# counts the numbers of words from specific sites #
# writes in a file the occurrences ordered from the most common #
touch check # file used to analyze the occurrences
touch distribution # final file ordered
page=$1 # the web page that needs to be analyzed
occurrences=$2 # temporary file for the occurrences
dictionary=$3 # dictionary used for another purpose (ignore this)
# write the words one by column
cat $page | tr -c [:alnum:] "\n" | sed '/^$/d' > check
# loop to analyze the words
cat check | while read words
do
word=${words}
strlen=${#word}
# ignores blacklisted words or small ones
if ! grep -Fxq $word .blacklist && [ $strlen -gt 2 ]
then
# if the word isn't in the file
if [ `egrep -c -i "^$word: " $occurrences` -eq 0 ]
then
echo "$word: 1" | cat >> $occurrences
# else if it is already in the file, it calculates the occurrences
else
old=`awk -v words=$word -F": " '$1==words { print $2 }' $occurrences`
### HERE IS THE ERROR, EITHER THE LET OR THE SED ###
let "new=old+1"
sed -i "s/^$word: $old$/$word: $new/g" $occurrences
fi
fi
done
# orders the words
awk -F": " '{print $2" "$1}' $occurrences | sort -rn | awk -F" " '{print $2": "$1}' > distribution
# ignore this, not important
grep -w "1" distribution | awk -F ":" '{print $1}' > temp_dictionary
for line in `cat temp_dictionary`
do
if ! grep -Fxq $line $dictionary
then
echo $line >> $dictionary
fi
done
rm check
rm temp_dictionary
this is the error: (I'm translating it, so it could be different in english)
./wordOccurrences line:30 let:x // where x is a number, usually 9 or 10 (but also 11, 13, etc)
1: syntax error in the expression (the error token is 1)
sed: expression -e #1, character y: command 's' not terminated // where y is another number (this one is also usually 9 or 10) with y being different from x
EDIT:
Talking with kev it looks like it's a newline problem
I added an echo between let and sed to print the sed and it worked perfectly for like 5 to 10 minutes until that error. Usually the sed without error looked like this:
s/^CONSULENTI: 6$/CONSULENTI: 7/g
but when I got the error it was like this:
s/^00145: 1
1$/00145: 4/g
how to fix this?
If you get a newline in $old, it means awk printed two lines, so there is a duplicate entry in $occurrences.
The script seems complicated for counting words, and it is inefficient because it launches many processes and rescans the file on every loop iteration;
maybe you can do something similar with
sort | uniq -c
You should also consider that your case-insensitivity is not consistent throughout the program. I created a page with just "foooo" in it and ran the program, then created one with "Foooo" in it and ran the program again. The 'old=`awk...' line sets 'old' to the empty string because awk is matching case sensitively. This results in the occurrences file not being updated. The subsequent sed and possibly some of the greps are also case sensitive.
This may not be the only error since it doesn't explain the error message you saw, but it is an indication that the same word with different capitalization will be handled erroneously by your script.
The following would separate the words, lowercase them, and then remove the ones smaller than three characters:
tr -cs '[:alnum:]' '\n' <foo | tr '[:upper:]' '[:lower:]' | egrep -v '^.{0,2}$'
Using this at the front of your script would mean that the rest of the script would not have to be case insensitive to be correct.
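Putting the two suggestions together, the whole count could be done in one pipeline along these lines. This is a sketch only: the function name word_counts is made up, and the blacklist and dictionary steps of the original script are left out.

```shell
# Hypothetical helper: read text on standard input and print "word: count"
# lines, most frequent first. Words are lowercased and words shorter than
# three characters are dropped.
word_counts() {
    tr -cs '[:alnum:]' '\n' |
        tr '[:upper:]' '[:lower:]' |
        grep -Ev '^.{0,2}$' |
        sort |
        uniq -c |
        sort -k1,1nr -k2 |
        awk '{ print $2 ": " $1 }'
}
```

For example, word_counts < "$page" > distribution. Because every word is counted in a single pass, there is no per-word awk or sed call that could see a duplicate entry.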
