Create an aggregate Nagios check based on values from other checks - linux

I have multiple checks for ten different web servers. One of these checks monitors the number of established connections (using netstat and findstr to filter for ESTABLISHED). It works as expected on servers WEB1 through WEB10. I can graph the TCP established connection count (using pnp4nagios) because the output is an integer. If it's above a certain threshold it goes into warning status; above another it becomes critical.
The individual checks are working the way I want.
However, what I'm looking to do is add up all of these connections into one graph. This would be some sort of aggregate graph or SUM of all of the others.
Is there a way to take the values/outputs from other checks and add them into one?
Server TCP Connections
WEB1 223
WEB2 124
WEB3 412
WEB4 555
WEB5 412
WEB6 60
WEB7 0
WEB8 144
WEB9 234
WEB10 111
TOTAL 2275
I want to graph only the total.

Nagios itself does not use performance data in any way; it just takes it and passes it to whatever you specify in your config. So there's no good way to do this in Nagios. (You could pipe the performance output of Nagios to some tee command which passes it to pnp4nagios and to a different script that sums everything up, but that's just horrible to maintain.)
If I had your problem, I'd do the following:
At the end of your current plugin, do something like
echo $nconnections > /some/dir/connections.$NAGIOS_HOSTNAME
where nconnections is the number of connections the plugin found. This example is shell, replace if you use some different language for the plugin. The important thing is: it should be easy to write the number to a special file in the plugin.
Then, create a new plugin which has code similar to:
#!/bin/bash
WARN=1000
CRIT=2000
# "sum + 0" prints 0 instead of an empty string when no cache files exist
sumconn=$(cat /some/dir/connections.* | awk '{sum += $1} END {print sum + 0}')
if [ "$sumconn" -ge "$CRIT" ]; then
    echo "Connection sum CRITICAL: $sumconn connections|conn=$sumconn;$WARN;$CRIT"
    exit 2
elif [ "$sumconn" -ge "$WARN" ]; then
    echo "Connection sum WARNING: $sumconn connections|conn=$sumconn;$WARN;$CRIT"
    exit 1
else
    echo "Connection sum OK: $sumconn connections|conn=$sumconn;$WARN;$CRIT"
    exit 0
fi
That way, whenever you probe an individual server, you'll save the data for the new plugin; the plugin will just pick up the data that's there, which makes it extremely short. Of course, the output of the summary will lag behind a bit, but you can minimize that effect by setting the normal_check_interval of the individual services low enough.
If you want to get fancy, add code to remove files older than a certain threshold from the cache directory. Or, you could even remove the individual services from your nagios configuration, and call the individual-server-plugin from the summation plugin for each server, if you're really uninterested in the connection count per server.
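The clean-up idea can also be approximated without deleting anything, by summing only the cache files that were updated recently. A minimal sketch, assuming the /some/dir/connections.* layout from above and a find that supports -maxdepth and -mmin (GNU and BSD find do; they are not POSIX). The function name sum_fresh_connections is made up for illustration:

```bash
#!/bin/bash
# Hypothetical helper: sum the counts stored in connections.* files that
# were modified less than max_age_min minutes ago. Stale files are ignored,
# not deleted.
sum_fresh_connections() {
    local cache_dir=$1 max_age_min=$2
    find "$cache_dir" -maxdepth 1 -type f -name 'connections.*' \
        -mmin "-$max_age_min" -exec cat {} + |
        awk '{ sum += $1 } END { print sum + 0 }'   # prints 0 when nothing matched
}
```

In the summation plugin, the sumconn line could then become sumconn=$(sum_fresh_connections /some/dir 15), where 15 minutes is an arbitrary example value that should be larger than the check interval of the individual services.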
EDIT:
To solve the nrpe problem, create a check_nrpe_and_save plugin like this:
#!/bin/bash
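# Pass all arguments through to check_nrpe unchanged
output=$("$NAGIOS_USER1/check_nrpe" "$@")
rc=$?
# Extract the connection count from the first line of the plugin output
nconnections=$(echo "$output" | head -n 1 | sed 's/.*you have \([0-9]*\) connections.*/\1/')
echo "$nconnections" > "/some/dir/connections.$NAGIOS_HOSTNAME"
echo "$output"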
exit $rc
Create a new command definition (define command) for this script, and use the new command in your service definitions. You'll have to adjust the sed pattern to match what your plugin outputs. If you don't have the number of connections in your regular output, an expression like .*connections=\([0-9]*\);.* should work. This check_nrpe_and_save should behave just like check_nrpe: it should output the same string and return the same exit code, and additionally write to the special file.
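To try a sed pattern before wiring it into check_nrpe_and_save, the extraction can be isolated in a small function. This sketch assumes the plugin emits performance data with a label named connections, as in the expression above; extract_connections is a hypothetical name:

```bash
#!/bin/bash
# Hypothetical helper: print the number that follows "connections=" on the
# first line of the given plugin output. Prints nothing if there is no match.
extract_connections() {
    printf '%s\n' "$1" | head -n 1 |
        sed -n 's/.*connections=\([0-9][0-9]*\).*/\1/p'
}
```

Using sed -n with the p flag means a non-matching line yields an empty result instead of being echoed back unchanged, so a bad pattern is easier to spot than a garbage value written to the cache file.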


Trying to use SCP to copy multiple files from remote to local using script

So I'll start with the fact that I'm relatively new to linux scripting, so if I am going about it the wrong way, let me know.
I am creating a script that is meant to copy logs from many different hosts onto the local machine depending on user input.
One of the functions I am writing requires the use of scp. Each time you use the scp command at a particular remote host, you have to enter your password. So to save time for the user, I want to copy any file that the particular host may have on it that the user wants.
I know I can do this using scp user@Remoteipaddress:'directory/file1 directory/file2' local/machine/directory
I have it running a bunch of loops (too many, I feel, so if there is a better way let me know).
The portion with the scp command is my main issue. Code looks fine if I quote it and echo it. I can even copy and paste the echoed result and it will work, but if I let the script do it I receive bash: -c: line 0: unexpected EOF while looking for matching `''
Edit: $app is a static number created in another portion of the program.
I also added a couple of things that seemed to be missing. I'm trying to piece this together from multiple areas of the program without making it messier than it already is.
#assigns different remote host paths to array variable
until [ $scriptCounter == $app ]
do
scpScript[$scriptCounter]="user@${ipAddress[$ipCounter]}:'"
((++ipCounter))
((++scriptCounter))
done
#$app value gets set by another function - typically 3 if that matters
scpCount=0
DayCounter=0
ipScriptCounter=0
until [ $Count == $app ]
do
((++scpCount))
mkdir ~/MyDocuments/Logs/$3/app$scpCount
echo "Creating ~/MyDocuments/Logs/${3}/app${scpCount}"
#there is one log for each day, $totalDiffDays is the total amount of days
#$DayCounter is set and gets marked up every time it goes through the loop until
#it matches total days
until [ $DayCounter == $totalDiffDays ]
do
scpPath[$DayCounter]="/var/log/docker/theLog*${datePath[$DayCounter]}*"
noSpaceSCP[$DayCounter]=${scpPath[$DayCounter]//[[:blank:]]/}
((++DayCounter))
done
fullSCPscript[$scpCount]="${scpScript[$ipScriptCounter]}${noSpaceSCP[*]}'"
#this portion I have an issue with.
scp ${fullSCPscript[$scpCount]} ~/MyDocuments/Logs/$3/app$scpCount
#this ups the array counter for my ipaddress array
((++ipScriptCounter))
#How I'm zeroing out the $DayCounter so it will run through again for other
#nodes but with a different IP address
until [ $DayCounter == "0" ]
do
((--DayCounter))
done
done
Example output I get when I echo the line with the scp command:
scp user@10.10.200.100:'/var/log/docker/theLog*2018-07-26* /var/log/docker/theLog*2018-07-27*' /home/mobaxterm/MyDocuments/Logs/care3/app1
I'm sorry that this looks messy, but overall I'm trying to build the directory that its grabbing the log from, and if there are multiple days, just add onto the scp command. I'm trying to do this as opposed to running a whole separate command to save the user from entering their password 5 times if they need 5 files. Instead they would only have to enter it once.
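One likely explanation, offered as a sketch rather than a confirmed diagnosis: the single quotes stored inside the variable are literal characters, because Bash does not re-parse quotes after expanding a variable. The unquoted expansion is then split on the space, so scp receives two arguments, and the first one carries an unmatched ' that the remote shell rejects. The hypothetical helper build_scp_source below joins the remote paths into one argument without embedded quotes. It assumes the legacy scp protocol, where the remote shell expands the space-separated globs; newer OpenSSH releases default to SFTP and may need scp -O for this to work.

```bash
#!/bin/bash
# Hypothetical helper: join a host and any number of remote paths into a
# single scp source argument of the form host:path1 path2 ...
build_scp_source() {
    local host=$1
    shift
    local IFS=' '              # "$*" joins the remaining arguments with a space
    printf '%s:%s\n' "$host" "$*"
}

# Usage sketch (host and paths taken from the echoed example above):
#   src=$(build_scp_source user@10.10.200.100 \
#       '/var/log/docker/theLog*2018-07-26*' '/var/log/docker/theLog*2018-07-27*')
#   scp "$src" ~/MyDocuments/Logs/care3/app1   # quote "$src" so it stays one argument
```

The important part is the double quotes around "$src" at the call site: they keep the whole source specification as a single argument, which is what typing the quoted form by hand achieves.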

Linux Read - Timeout after x seconds *idle*

I have a (bash) script on a server that I have inherited the administration aspect of, and have recently discovered a flaw in the script that nobody has brought to my attention.
After discovering the issue, others have told me that it has been irritating them, but never told me (great...)
So, the script follows this concept
#!/bin/bash
function refreshscreen(){
# This function refreshes a "statistics screen"
...
echo "Enter command to override update"
read -t 10 variable
}
This script refreshes a statistics screen, and allows the user to hold off the update in order to enter commands handled by a case statement. However, the read times out (read -t 10) after 10 seconds, regardless of whether the user is typing.
Long story short, is there a way to prevent read from timing out if the user is actively typing a command? The best case would be a "time out after SEC idle/inactive seconds" as opposed to just a timeout after x seconds.
I have thought about running a background script at the end of the cycle before the read command pauses the screen to check for inactivity, but have not found a way to make that command work.
You can use read in a loop, reading one character at a time, and adding it to a final read string. This would then give the user some timeout amount of time per character rather than per command. Here's a sample function you might be able to incorporate into your script that shows what I'm talking about:
read_with_idle_timeout() {
local input=""
read -t 10 -N 1 variable
while [ ! -z $variable ]
do
input+=$variable
read -t 10 -N 1 variable
done
echo "Read: $input"
}
This will give the user 10 seconds to type each character. If they stop typing, you'll get as much of the command as they had started typing before the timeout occurred, and then your case statement can handle it. Perhaps you can store the final string in a global variable, or just put this code directly into your other function.
If you need more than one word, since read breaks on $IFS, you could call this function multiple times until you get all the input you're expecting.
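If stopping at the first space is a problem, a variation on the same idea is to clear IFS for the read so that spaces are kept, and to stop only on Enter or on an idle timeout. This is a sketch that has not been tried against the original statistics script; read_line_idle is a made-up name:

```bash
#!/bin/bash
# Hypothetical helper: read a whole line one character at a time, allowing
# up to idle_secs seconds of inactivity before each character.
read_line_idle() {
    local idle_secs=$1 ch line=""
    # IFS= keeps spaces; -r keeps backslashes; -n 1 returns after one character.
    while IFS= read -r -t "$idle_secs" -n 1 ch; do
        [ -z "$ch" ] && break    # empty result with success status means Enter was pressed
        line+=$ch
    done
    # On timeout or end of input, read fails and the loop ends with whatever was typed so far.
    printf '%s\n' "$line"
}
```

It could be called as variable=$(read_line_idle 10) in place of read -t 10 variable. Because input is consumed one character at a time, Backspace will most likely arrive as a literal character instead of editing the line, so this trades line editing for the idle timeout.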
I have searched for a simple solution that will do the following:
timeout after 10 seconds, if there is no user input at all
the user has infinite time to finish his answer if the first character was typed within the first 10 sec.
This can be implemented in two lines as follows:
read -N 1 -t 10 -p "What is your name? > " a
[ "$a" != "" ] && read b && echo "Your name is $a$b" || echo "(timeout)"
If the user waits 10 seconds before entering the first character, the output will be:
What is your name? > (timeout)
If the user types the first character within 10 seconds, there is unlimited time to finish typing. The output will look as follows:
What is your name? > Oliver
Your name is Oliver
Caveat: once typed, the first character cannot be edited, while all other characters can be (backspace and re-type). Any ideas for a simple solution?

using awk and bash for monitoring exec output to log

I am looking for some help with awk and bash commands.
My project has embedded (so very limited) hardware.
I need to run a specific command called "digitalio show".
The command output is:
Input=0x50ff <-- last char only change
Output=0x7f
OR
Input=0x50fd <-- last char only change
Output=0x7f
I need to extract the Input parameter, convert it into either Active or Passive, and log the result to a file with a timestamp.
The log file should look like this:
YYMMDDhhmmss;Active
YYMMDDhhmmss;Passive
YYMMDDhhmmss;Active
YYMMDDhhmmss;Passive
Only changes should be logged.
The command "digitalio show" is an embedded-specific command that gives the I/O state at the time of execution, so I basically need to log every change in the I/O to a file using the minimal tools I have on the embedded hardware.
I can run the command every 500 ms, but if I log all the outputs I will fill the flash very quickly, so I need to log only changes.
In the end this will run as a background daemon.
Thanks !
Rotem.
As far as I understand, a single run of digitalio show command outputs two lines in the following format:
Input=HEX_NUMBER
Output=0x7f
where HEX_NUMBER is either 0x50ff, or 0x50fd. Suppose, the former stands for "Active", the latter for "Passive".
Running the command once per 500 milliseconds requires keeping the state. The most obvious implementation is a loop with a sleep.
However, sleep implementations vary. Some of them support a floating point argument (fractional seconds), and some don't. For example, the GNU implementation accepts arbitrary floating point numbers, but the standard UNIX implementation guarantees to suspend execution for at least the integral number of seconds. There are many alternatives, though. For instance, usleep from killproc accepts microseconds. Alternatively, you can write your own utility.
Let's pick the usleep command. Then the Bash script may look like the following:
#!/bin/bash -
last_state=
while true ; do
i=$(digitalio show | awk -F= '/Input=0x[a-zA-Z0-9]+/ {print $2}')
if test "$i" = "0x50ff" ; then
state="Active"
else
state="Passive"
fi
if test "$state" != "$last_state" ; then
printf '%s;%s\n' "$(date '+%Y%m%d%H%M%S')" "$state"
fi
last_state="$state"
usleep 500000
done
Sample output
20161019103534;Active
20161019103555;Passive
The script launches digitalio show command in an infinite loop, then extracts the hex part from Input lines with awk.
The $state variable is set to either "Active" or "Passive", depending on the value of the hex string.
The $last_state variable keeps the value of $state in the last iteration. If $state is not equal to $last_state, then the state is printed to the standard output in the specific format.
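The script above prints to standard output and uses a four-digit year. To append to a file in the YYMMDDhhmmss format requested in the question, the change detection can be wrapped in two small functions. This is a sketch under the same assumption as above (0x50ff means Active, anything else Passive); state_from_output and log_if_changed are illustrative names:

```bash
#!/bin/bash
# Hypothetical helper: map the text of one "digitalio show" run to a state name.
# Assumption from the answer above: 0x50ff is Active, any other value is Passive.
state_from_output() {
    local hex
    hex=$(printf '%s\n' "$1" | awk -F= '$1 == "Input" { print $2 }')
    if [ "$hex" = "0x50ff" ]; then echo Active; else echo Passive; fi
}

last_state=""
# Hypothetical helper: append "YYMMDDhhmmss;state" to the log file, but only
# when the state differs from the one seen on the previous call.
log_if_changed() {
    local state=$1 logfile=$2
    if [ "$state" != "$last_state" ]; then
        printf '%s;%s\n' "$(date '+%y%m%d%H%M%S')" "$state" >> "$logfile"
        last_state=$state
    fi
}
```

Inside the polling loop, the body would reduce to one call such as log_if_changed "$(state_from_output "$(digitalio show)")" "$logfile" followed by the sleep, with logfile pointing at wherever the log should live. Since nothing is written unless the state changes, flash wear stays proportional to the number of transitions.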

Close uzbl-browser on certain url

I'm using uzbl-browser for a kiosk computer. I'd like to send "close" (or kill) to my uzbl-browser's instance when a user opens a certain URL. What is the best way?
That is not my aim.
I have a survey that I want to show before logout. If the user closes it, log out. Otherwise, wait until the last page of the survey (identified by a "certain URL"), then close uzbl and log out.
This is my solution.
Add this to the config file:
@on_event LOAD_FINISH spawn @scripts_dir/survey_end_check.sh
and in my survey_end_check.sh
#!/bin/sh
if [ "$UZBL_URI" = "http://yoururl" ];
then
sleep 5
echo "exit" | socat - unix-connect:$UZBL_SOCKET
fi
A variant that looks for a certain string in the final page.
After grep, $? is 0 if grep succeeded.
#!/bin/sh
end=`echo "@<document.getElementsByClassName('success')[0].innerText>@" | socat - unix-connect:$UZBL_SOCKET | grep -q 'Success!'; echo $?`
if [ "$end" -eq 0 ];
then
sleep 5
echo "exit" | socat - unix-connect:$UZBL_SOCKET
fi
If I were a user on that computer and had any window, browser or not, closing itself without warning I'd consider this an application crash and try again.
Forcing that behavior on your users may not be the most informative choice.
What you want to look into is a transparent proxy that can filter content. This is how most companies restrict their employees from visiting certain pages.
Squid is one example of a proxy solution commonly used for this, usually setup together with SquidGuard. This guide will get you started.
Alternatively you could also use a DNS solution that redirects all filtered hostnames to a given page. DansGuardian is a possibility here.
A search on stackoverflow will also give you answers as several users already asked similar questions.

Calculating the difference between the first words(timestamp) using perl dynamically

I have a program that keeps on writing the icmp echo requests being received by a machine into a file.
I am using system ("tcpdump icmpecho[0] == 8 | tee abc.txt") to do that.
So this process keeps on going till I end the program manually.
Each line has the timestamp as its first word.
Now I want to calculate the frequency of the echo requests I am receiving, using a separate script, so that I can print an alert if it reaches a certain threshold.
I tried to use grep -Eo '^[^ ]+' file
to get the timestamps into an array, but I don't know what to do after getting them into an array. grep keeps running in a while loop, since the file it reads from keeps growing indefinitely. (I won't be able to monitor the differences and print an alert if grep keeps running like that, right?)
All I am trying to do is keep track of the frequency of ICMP echo requests coming in on my machine and print an alert message whenever that frequency crosses a threshold. Is there any alternative way?
All timestamps are saved in @arr:
perl -ne '$f{$_}++ or push @arr, $_ for /(\d+:\d+)/ }{ print "$_ [$f{$_} times]\n" for @arr' file
Constantly reading from the log file:
perl -e 'open$T,pop;while(1){while(<$T>){ ++$f{$_}>10 and print "[$f{$_}]$_" for /(\d+:\d+)/ }sleep 1;seek $T,0,1}' file
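For comparison, the same per-minute counting can be sketched with awk reading the capture output directly. Assumptions: each line starts with an HH:MM:SS.ffffff timestamp (tcpdump's default format), and alert_on_rate is a hypothetical name; the threshold is up to you:

```bash
#!/bin/bash
# Hypothetical helper: read timestamped lines on stdin and print
# "ALERT <HH:MM> <count>" once for each minute in which the number of
# lines goes above the given limit.
alert_on_rate() {
    awk -v limit="$1" '
        {
            minute = substr($1, 1, 5)            # "HH:MM" from "HH:MM:SS.ffffff"
            if (++count[minute] == limit + 1)    # fire once, when the limit is first exceeded
                print "ALERT", minute, count[minute]
        }'
}
```

It could be fed with something like tcpdump -l 'icmp[0] == 8' | alert_on_rate 10, where -l makes tcpdump line-buffer its output so lines reach awk as they are captured. The count array grows by one entry per minute, which is fine for a short run but would need pruning in a long-lived process.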
I am using
tcpstat -i eth1 -f 'icmp[0] == 8'
to get the request count. It gives me three more parameters, but I still need to research them a bit.
