I have implemented nsca in Nagios for distributed monitoring purposes, and everything seems to be working, except for one oddity that I can't seem to find an answer to anywhere.
The passive checks are sent and received, but the output shows the 4th variable to always be uninitialized, and thus it shows up as $OUTPUT$. It appears as though the checks are showing the proper information on the non-central server, but when it's sent, it doesn't seem to be interpolating properly.
commands.cfg
define command{
command_name submit_check_result
command_line /usr/share/nagios3/plugins/eventhandlers/submit_check_result $HOSTNAME$ '$SERVICEDESC$' $SERVICESTATE$ '$OUTPUT$'
}
submit_check_result
#!/bin/sh
return_code=-1
case "$3" in
OK)
return_code=0
;;
WARNING)
return_code=1
;;
CRITICAL)
return_code=2
;;
UNKNOWN)
return_code=-1
;;
esac
/usr/bin/printf "%s\t%s\t%s\t%s\n" "$1" "$2" "$return_code" "$4" | /usr/sbin/send_nsca 192.168.40.168 -c /etc/send_nsca.cfg
Example service
define service {
host_name example_host
service_description PING
check_command check_icmp
active_checks_enabled 1
passive_checks_enabled 0
obsess_over_service 1
max_check_attempts 5
normal_check_interval 5
retry_check_interval 3
check_period 24x7
notification_interval 30
notification_period 24x7
notification_options w,c,r
contact_groups admins
}
The output from the log on the non-central server shows:
Nov 29 22:52:52 nagios-server nagios3: SERVICE ALERT: example_host;PING;OK;HARD;5;OK - 192.168.1.1: rta nan, lost 0%
The output from the log on the central server shows:
EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;example_host;PING;0;$OUTPUT$
Status information on the central server (web interface) shows:
PING OK 2016-11-29 22:54:50 0d 0h 54m 6s 1/5 $OUTPUT$
It's not just this service either. It affects all services, including those that are essentially preconfigured for the Nagios server itself (check_load, check_proc, etc.).
Any assistance would be appreciated.
I found the issue. It turns out the submit_check_result script above is not meant for submitting check results to a remote server. It will send them, but it doesn't handle the status properly. Below is the proper script:
#!/bin/sh
# SUBMIT_CHECK_RESULT_VIA_NSCA
# Written by Ethan Galstad (egalstad@nagios.org)
# Last Modified: 10-15-2008
#
# This script will send passive check results to the
# nsca daemon that runs on the central Nagios server.
# If you simply want to submit passive checks from the
# same machine that Nagios is running on, look at the
# submit_check_result script.
#
# Arguments:
# $1 = host_name (Short name of host that the service is
# associated with)
# $2 = svc_description (Description of the service)
# $3 = return_code (An integer that determines the state
# of the service check, 0=OK, 1=WARNING, 2=CRITICAL,
# 3=UNKNOWN).
# $4 = plugin_output (A text string that should be used
# as the plugin output for the service check)
#
#
# Note:
# Modify the NagiosHost parameter to match the name or
# IP address of the central server that has the nsca
# daemon running.
printfcmd="/usr/bin/printf"
NscaBin="/usr/sbin/send_nsca"
NscaCfg="/etc/send_nsca.cfg"
NagiosHost="central_host_IP_address"
# Fire the data off to the NSCA daemon using the send_nsca script
$printfcmd "%s\t%s\t%s\t%s\n" "$1" "$2" "$3" "$4" | $NscaBin -H "$NagiosHost" -c "$NscaCfg"
# EOF
Much better results.
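One thing worth checking, as an assumption about the setup rather than something the post confirms: the corrected script passes $3 straight through as the numeric return code, so the command definition has to hand it a number (Nagios provides the $SERVICESTATEID$ macro for that) or the state name has to be mapped first. Current Nagios versions also name the plugin-output macro $SERVICEOUTPUT$. The sketch below shows the mapping and the tab-separated line that send_nsca expects; the helper names state_to_code and nsca_line are made up for illustration.

```shell
#!/bin/sh
# state_to_code: map a Nagios state name to the plugin return code
# (0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN).
state_to_code() {
    case "$1" in
        OK)       echo 0 ;;
        WARNING)  echo 1 ;;
        CRITICAL) echo 2 ;;
        *)        echo 3 ;;   # UNKNOWN and anything unrecognised
    esac
}

# nsca_line: build one passive service check result in the format read by
# send_nsca: host<TAB>service<TAB>return_code<TAB>plugin_output
nsca_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Example pipeline (not executed here; address and config path are the ones
# used in the question):
#   nsca_line "$host" "$svc" "$(state_to_code "$state")" "$output" \
#       | /usr/sbin/send_nsca -H 192.168.40.168 -c /etc/send_nsca.cfg
```

With this in place the command_line could pass either $SERVICESTATE$ (mapped through state_to_code) or $SERVICESTATEID$ directly, together with '$SERVICEOUTPUT$' as the fourth argument.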
I'm trying to send a notification upon login via PAM, but I can't figure out how to send it to the user that is logging in.
I'm configuring PAM to execute a script every time a user logs in. I need to send a notification if there have been any failed login attempts (it's part of a bigger security setup I'm adding, where my laptop takes a picture with the webcam upon failed logins and notifies me when I log in again, since my classmates like to try to guess my password for some reason).
The problem is that the line in my .sh file that sends the notification sends it to root, since that's the user that executes the script. I want the script to send the notification to my current user (called "andreas"), but I'm having trouble figuring this out.
Here is the line i added to the end of the PAM file system-login:
auth [default=ignore] pam_exec.so /etc/lockCam/call.sh
And here is the call.sh file:
#!/bin/sh
/etc/lockCam/notifier.sh &
The reason I'm calling another file is that I want it to run in the background WHILE the login process continues; that way the script doesn't slow down logging in.
Here is the script that is then executed:
#!/bin/sh
#sleep 10s
echo -e "foo" > "/etc/lockCam/test"
#This line is simply to make sure that i know that my script was executed
newLogins=`sed -n '3 p' /etc/lockCam/lockdata`
if [ $newLogins -gt 0 ]
then
su andreas -c ' notify-send --urgency=critical --expire-time=6000 "Someone tried to log in!" "$newLogins new lockCam images!" && exit'
callsInRow=`sed -n '2 p' /etc/lockCam/lockdata`
crntS=$(date "+%S")
crntS=${crntS#0}
crntM=$(date "+%M")
crntM=${crntM#0}
crntH=$(date "+%H")
crntH=${crntH#0}
((crntTime = $crntH * 60 * 60 + $crntM * 60 + $crntS ))
#This whole process is absolutely stupid but I can't figure out a better way to make sure none of the integers look like "01" or similar, which would trigger an error
echo -e "$crntTime\n$callsInRow\n0" > "/etc/lockCam/lockdata"
fi
exit 0
And this is where I THINK my error is: the line "su andreas -c...." is most likely formatted wrong, or I'm doing something else wrong. Everything is executed upon login EXCEPT that the notification doesn't show up. If I execute the script from a terminal when I'm already logged in there is no notification either, unless I remove the "su andreas -c" part and simply run "notify-send...". But that doesn't produce a notification when I log in, and I think that's because the notification is sent to the root user and not to "andreas".
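As a side note on the leading-zero workaround in the script above: the error comes from shell arithmetic treating a number such as "08" as invalid octal. A minimal sketch of the same seconds-since-midnight calculation in plain POSIX sh is shown here; the function name seconds_since_midnight is made up for illustration.

```shell
#!/bin/sh
# seconds_since_midnight HH MM SS
# Strips one leading zero from each two-digit field so that values such as
# "08" or "09" are not parsed as (invalid) octal numbers.
seconds_since_midnight() {
    h=${1#0}
    m=${2#0}
    s=${3#0}
    echo $(( h * 3600 + m * 60 + s ))
}

# Typical call, using a single date invocation so the three fields are
# taken from the same instant:
#   seconds_since_midnight $(date '+%H %M %S')
```

If an absolute timestamp is good enough, date "+%s" (seconds since the epoch) avoids the splitting entirely.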
I think your su needs to be passed the desktop user's D-Bus session bus address. The bus address can be easily obtained and used for X11 user sessions, but Wayland has tighter security; for Wayland, the user session actually has to run a proxy to receive the messages. (Have you considered that it might be easier to send an email?)
I have a notify-desktop gist on GitHub that works for X11 and should also work on Wayland (provided the proxy is running). For completeness I've appended the source code of the script to this post. It's extensively commented, and I think it contains the pieces necessary to get your own code working.
#!/bin/bash
# Provides a way for a root process to perform a notify send for each
# of the local desktop users on this machine.
#
# Intended for use by cron and timer jobs. Arguments are passed straight
# to notify send. Falls back to using wall. Care must be taken to
# avoid using this script in any potential fast loops.
#
# X11 users should already have a dbus address socket at /run/user/<userid>/bus
# and this script should work without requiring any initialisation. Should
# this not be the case, X11 users could initialise a proxy as per the Wayland
# instructions below.
#
# Due to stricter security requirements Wayland lacks a dbus socket
# accessible to root. Wayland users will need to run a proxy to
# provide root with the necessary socket. Each user must add
# the following to a Wayland session startup script:
#
# notify-desktop --create-dbus-proxy
#
# That will start xdg-dbus-proxy process and make a socket available under:
# /run/user/<userid>/proxy_dbus_<desktop_sessionid>
#
# Once there is a listening socket, any root script or job can pass
# messages using the syntax of notify-send (man notify-send).
#
# Example messages
# notify-desktop -a Daily-backup -t 0 -i dialog-information.png "Backup completed without error"
# notify-desktop -a Remote-rsync -t 6000 -i dialog-warning.png "Remote host not currently on the network"
# notify-desktop -a Daily-backup -t 0 -i dialog-error.png "Error running backup, please consult journalctl"
# notify-desktop -a OS-Upgrade -t 0 -i dialog-warning.png "Update in progress, do not shutdown until further completion notice."
#
# Warnings:
# 1) There has only been limited testing on Wayland
# 2) There has been no testing for multiple GUI sessions on one desktop
#
if [ "$1" == "--create-dbus-proxy" ]
then
if [ -n "$DBUS_SESSION_BUS_ADDRESS" ]
then
sessionid=$(cat /proc/self/sessionid)
xdg-dbus-proxy $DBUS_SESSION_BUS_ADDRESS /run/user/$(id -u)/proxy_dbus_$sessionid &
exit 0
else
echo "ERROR: no value for DBUS_SESSION_BUS_ADDRESS environment variable - not a wayland/X11 session?"
exit 1
fi
fi
function find_desktop_session {
for sessionid in $(loginctl list-sessions --no-legend | awk '{ print $1 }')
do
loginctl show-session -p Id -p Name -p User -p State -p Type -p Remote -p Display $sessionid |
awk -F= '
/[A-Za-z]+/ { val[$1] = $2; }
END {
if (val["Remote"] == "no" &&
val["State"] == "active" &&
(val["Type"] == "x11" || val["Type"] == "wayland")) {
print val["Name"], val["User"], val["Id"];
}
}'
done
}
count=0
while read -r -a desktop_info
do
if [ ${#desktop_info[@]} -eq 3 ]
then
desktop_user=${desktop_info[0]}
desktop_id=${desktop_info[1]}
desktop_sessionid=${desktop_info[2]}
proxy_bus_socket="/run/user/$desktop_id/proxy_dbus_$desktop_sessionid"
if [ -S $proxy_bus_socket ]
then
bus_address="$proxy_bus_socket"
else
bus_address="/run/user/$desktop_id/bus"
fi
sudo -u "$desktop_user" DBUS_SESSION_BUS_ADDRESS="unix:path=$bus_address" notify-send "$@"
count=$((count + 1))
fi
done <<<$(find_desktop_session)
# If no one has been notified fall back to wall
if [ $count -eq 0 ]
then
echo "$@" | wall
fi
# Don't want this to cause a job to stop
exit 0
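Applied to the script in the question, the key piece is small. The sketch below assumes the user's session bus socket is at /run/user/<uid>/bus (the X11 case described above) and that sudo is available to root; the helper names bus_address_for_uid and notify_user are made up for illustration.

```shell
#!/bin/sh
# bus_address_for_uid UID
# Builds the session bus address for a user, assuming the socket lives at
# /run/user/<uid>/bus.
bus_address_for_uid() {
    printf 'unix:path=/run/user/%s/bus\n' "$1"
}

# notify_user USER [notify-send arguments...]
# Runs notify-send as USER with that user's session bus address set.
notify_user() {
    user=$1
    shift
    uid=$(id -u "$user") || return 1
    sudo -u "$user" \
        DBUS_SESSION_BUS_ADDRESS="$(bus_address_for_uid "$uid")" \
        notify-send "$@"
}

# Example call for the question's script. Note the double quotes around the
# message: inside single quotes $newLogins would not be expanded.
#   notify_user andreas --urgency=critical --expire-time=6000 \
#       "Someone tried to log in!" "$newLogins new lockCam images!"
```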
I need to configure the Oracle database on a remote server to start at system startup.
I followed this tutorial, which is almost the same as the others.
I am not allowed to restart the server; I can only suggest that the owner do something on it.
The server configuration is similar to what the tutorial describes, but the Oracle database does not start at system startup. These are the contents of the /etc/init.d/dbora and /etc/oratab files:
dbora:
#!/bin/sh
# chkconfig: 345 99 10
# description: Oracle auto start-stop script.
#
# Set ORA_HOME to be equivalent to the $ORACLE_HOME
# from which you wish to execute dbstart and dbshut;
#
# Set ORA_OWNER to the user id of the owner of the
# Oracle database in ORA_HOME.
ORA_HOME=/u01/app/oracle/product/11.2.0/db_1
ORA_OWNER=oracle
if [ ! -f $ORA_HOME/bin/dbstart ]
then
echo "Oracle startup: cannot start"
exit
fi
case "$1" in
'start')
su $ORA_OWNER -c "$ORA_HOME/bin/lsnrctl start" &
su $ORA_OWNER -c $ORA_HOME/bin/dbstart &
touch /var/lock/subsys/dbora
;;
'stop')
su $ORA_OWNER -c $ORA_HOME/bin/dbshut
su $ORA_OWNER -c "$ORA_HOME/bin/lsnrctl stop"
rm -f /var/lock/subsys/dbora
;;
esac
oratab:
# This file is used by ORACLE utilities. It is created by root.sh
# and updated by the Database Configuration Assistant when creating
# a database.
# A colon, ':', is used as the field terminator. A new line terminates
# the entry. Lines beginning with a pound sign, '#', are comments.
#
# Entries are of the form:
# $ORACLE_SID:$ORACLE_HOME:<N|Y>:
#
# The first and second fields are the system identifier and home
# directory of the database respectively. The third filed indicates
# to the dbstart utility that the database should , "Y", or should not,
# "N", be brought up at system boot time.
#
# Multiple entries with the same $ORACLE_SID are not allowed.
#
#
orcl:/u01/app/oracle/product/11.2.0/db_1:Y
orcl:/u01/app/oracle/product/11.2.0/db_1:N
What is wrong with these files?
Leave only the first orcl row. As the file's own comment says, duplicate SIDs are not allowed:
# Multiple entries with the same $ORACLE_SID are not allowed.
#
#
orcl:/u01/app/oracle/product/11.2.0/db_1:Y
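A quick way to spot this kind of duplicate before editing the file is sketched below. It is a minimal check that only looks at the first colon-separated field; the function name dup_sids is made up for illustration.

```shell
#!/bin/sh
# dup_sids FILE
# Prints each ORACLE_SID that appears on more than one non-comment line of
# an oratab-style file (fields separated by ':').
dup_sids() {
    awk -F: '
        /^[ \t]*(#|$)/ { next }                 # skip comments and blank lines
        { if (++seen[$1] == 2) print $1 }       # report a SID on its 2nd use
    ' "$1"
}

# Typical call:
#   dup_sids /etc/oratab
```

For the oratab shown in the question this would print orcl, since that SID appears once with Y and once with N.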
I need to create a way to communicate commands and transfer files from a host server to a development board.
The only access to the board is via the host server (wired Ethernet connection).
Both the host server and the board are running Linux.
I have the ability to change the Linux environment on the board.
I DO NOT have the ability to change the host server environment.
I DO NOT know which users need to connect to the board.
The connection between the host server and the board does not need to be secure.
Right now, I'm using netcat for my needs but have run into reliability issues - Could someone point me to some tools that are better suited to my needs and perhaps higher performance, please?
My current solution:
A script is constantly running on the board:
while [ 1 ] ; do
# Start a netcat server, awaiting a file.
nc -l -p 1357 > file.bin
# Acknowledge receipt of file.
status=fail
while [ ${status} -ne 0 ] ; do
status=$(echo 0 | nc <host_ip> 2468 2>&1)
done
# Wait for a command.
nc -l -p 1357 -e /bin/sh
# Acknowledge receipt of command.
status=fail
while [ ${status} -ne 0 ] ; do
status=$(echo 0 | nc <host ip> 2468 2>&1)
done
done
The only way users have access to the board is through a script on the server:
# Send over a file.
cat some_file.bin | /bin/nc -w 10
# Wait for acknowledge.
if [ $( nc -l 2468 ) -ne 0 ] ; then
exit # Fail
fi
# Send a command.
echo "<some_command>" | nc -w 10 <board ip> 1357
# Wait for acknowledge.
if [ $( nc -l 2468 ) -ne 0 ] ; then
exit # Fail
fi
The reason I'm using netcat right now is that I don't know of any way to use SSH or SCP without a password given that:
I can't generate an SSH key for everyone that needs access to the board, especially since I don't know who will be using it.
I can't install sshpass or expect since I don't have control of the server.
Please help,
Thanks.
You could use SSH with host-based authentication. Then you would not need those scripts and could simply use SSH and SCP normally, without passwords or per-user keys.
With host-based authentication, any user connecting from a predefined set of machines (here, the host server) is automatically authenticated on the target machine (here, the board).
You need to modify the SSH server configuration on the board, and the SSH client configuration on the host (which you can do without root access).
Here is a tutorial that should get you started.
Note that if encryption is not mandatory and performance is an issue, you can use the same host-based authentication scheme for RSH access.
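As a rough sketch of what that configuration involves with OpenSSH: the helpers below only print the lines each side would need and write nothing to disk. The function names and the host alias are made up for illustration. The sketch also assumes that the board already has the host server's public host key in /etc/ssh/ssh_known_hosts and that the ssh-keysign helper is already enabled on the host server, which the client cannot turn on from a per-user config.

```shell
#!/bin/sh
# board_sshd_config: line for /etc/ssh/sshd_config on the board.
board_sshd_config() {
    printf '%s\n' 'HostbasedAuthentication yes'
}

# board_shosts_equiv HOSTNAME: line for /etc/ssh/shosts.equiv on the board,
# naming the host server whose users are to be trusted.
board_shosts_equiv() {
    printf '%s\n' "$1"
}

# host_ssh_config BOARD: per-user block for ~/.ssh/config on the host server.
host_ssh_config() {
    printf 'Host %s\n    HostbasedAuthentication yes\n' "$1"
}

# Example: review the output, then append it to the respective files by hand.
#   board_sshd_config
#   board_shosts_equiv hostserver.example
#   host_ssh_config board
```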
I want to use ping to check to see if a server is up. How would I do the following:
ping $URL
if [$? -eq 0]; then
echo "server live"
else
echo "server down"
fi
How would I accomplish the above? Also, how would I make it such that it returns 0 upon the first ping response, or returns an error if the first ten pings fail? Or, would there be a better way to accomplish what I am trying to do above?
I'd recommend not relying on ping alone. It can tell you whether a server is online in general, but it cannot check a specific service on that server.
Better use these alternatives:
curl
man curl
You can use curl and check the http_response for a webservice like this
check=$(curl -s -w "%{http_code}\n" -L "${HOST}${PORT}/" -o /dev/null)
if [[ $check == 200 || $check == 403 ]]
then
# Service is online
echo "Service is online"
exit 0
else
# Service is offline or not working correctly
echo "Service is offline or not working correctly"
exit 1
fi
where
HOST = [ip or dns-name of your host]
(optional) PORT = [an optional port; don't forget to start with :]
200 is the normal success HTTP response code
403 means Forbidden, e.g. the page sits behind a login, so it is also acceptable and most probably means the service is running correctly
-s Silent or quiet mode.
-L Follow redirects (Location: headers), so the reported code is that of the final page
-w In which format you want to display the response
-> %{http_code}\n we only want the http_code
-o the output file
-> /dev/null redirect any output to /dev/null so it isn't written to stdout or the check variable. Usually you would get the complete html source code before the http_response so you have to silence this, too.
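The same check can be split into two small functions so that the decision about which status codes count as "up" is separate from the network call. This is a sketch only; the function names and the 5-second --max-time limit are assumptions, not part of the answer above.

```shell
#!/bin/sh
# http_code_ok CODE
# Succeeds for the status codes treated as "service is up" above.
http_code_ok() {
    case "$1" in
        200|403) return 0 ;;
        *)       return 1 ;;
    esac
}

# check_url URL
# Fetches URL quietly, follows redirects, discards the body and classifies
# the final HTTP status code. Fails if curl itself fails (e.g. no connection).
check_url() {
    code=$(curl -s -L -o /dev/null -w '%{http_code}' --max-time 5 "$1") || return 1
    http_code_ok "$code"
}

# Example:
#   if check_url "http://example.com/"; then echo "Service is online"; fi
```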
nc
man nc
While curl to me seems the best option for Webservices since it is really checking if the service's webpage works correctly,
nc can be used to rapidly check only if a specific port on the target is reachable (and assume this also applies to the service).
Advantage here is the settable timeout of e.g. 1 second while curl might take a bit longer to fail, and of course you can check also services which are not a webpage like port 22 for SSH.
nc -4 -d -z -w 1 ${HOST} ${PORT} &> /dev/null
if [[ $? == 0 ]]
then
# Port is reached
echo "Service is online!"
exit 0
else
# Port is unreachable
echo "Service is offline!"
exit 1
fi
where
HOST = [ip or dns-name of your host]
PORT = [NOT optional the port]
-4 force IPv4 (or -6 for IPv6)
-d Do not attempt to read from stdin
-z Zero-I/O mode: only check whether the port accepts a connection, without sending any data
-w timeout
If a connection and stdin are idle for more than timeout seconds, then the connection is silently closed. (In this case nc will exit 1 -> failure.)
(optional) -n If you only use an IP: Do not do any DNS or service lookups on any specified addresses, hostnames or ports.
&> /dev/null Don't print out any output of the command
You can use something like this -
serverResponse=`wget --server-response --max-redirect=0 ${URL} 2>&1`
if [[ $serverResponse == *"Connection refused"* ]]
then
echo "Unable to reach given URL"
exit 1
fi
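Use the -c option with ping. It sends only the given number of packets (or stops at the timeout), and the exit status tells you whether any reply came back. Note that ping expects a hostname or IP address, not a full URL.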
if ping -c 10 $URL; then
echo "server live"
else
echo "server down"
fi
Short form:
ping -c5 $SERVER || echo 'Server down'
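To return as soon as the first reply arrives, or fail after ten unanswered attempts as asked in the question, one ping per loop iteration is enough. The sketch below assumes Linux iputils ping, where -W is the per-reply timeout in seconds. The PING_CMD variable and the function name are made up for illustration; PING_CMD exists only so the command can be swapped out.

```shell
#!/bin/sh
# Command used for a single probe; override it to use different options.
: "${PING_CMD:=ping -c 1 -W 1}"

# first_reply_within HOST [TRIES]
# Sends one ping per attempt. Returns 0 on the first reply, or 1 once TRIES
# attempts (default 10) have gone unanswered.
first_reply_within() {
    host=$1
    tries=${2:-10}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # PING_CMD is left unquoted on purpose so it splits into words.
        if $PING_CMD "$host" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# Example:
#   if first_reply_within "$SERVER" 10; then echo "server live"; else echo "server down"; fi
```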
Do you need it for some other script? Or are trying to hack some simple monitoring tool? In this case, you may want to take a look at Pingdom: https://www.pingdom.com/.
I use the following script function to check whether servers are online. It's useful when you want to check multiple servers. The function hides the ping output, and you can handle the live and down cases separately.
#!/bin/bash
#retry count of ping request
RETRYCOUNT=1;
#pingServer: implement ping server functionality.
#Param1: server hostname to ping
function pingServer {
#echo Checking server: $1
ping -c $RETRYCOUNT $1 > /dev/null 2>&1
if [ $? -ne 0 ]
then
echo $1 down
else
echo $1 live
fi
}
#usage example, pinging some host
pingServer google.com
pingServer server1
One good solution is to use MRTG (a simple graphing tool for *NIX) with a ping-probe script. Look it up on Google.
Read this for a start.
Good day, programmers. I have a problem; please help.
I am creating a service that must start automatically when Linux boots. So I copied the script into the directory /etc/rc.d/init.d (or /etc/init.d/). But when I run the command
chkconfig --add listOfProcesses
an error occurs:
service listOfProcesses doesn't support chkconfig
Here is the content of the script. I found the first version on Google and used it as a template.
#!/bin/bash
# listOfProcesses Start the process which will show the list of processes
# chkconfig: 345 110 02
# description: This process shows current time and the list of processes
# processname: listOfProcesses
### BEGIN INIT INFO
# Provides:
# Required-Start:
# Required-Stop:
# Default-Start: 3 4 5
# Default-Stop: 0 1 2 6
# Short-Description: shows current time and the list of processes
# Description: This process shows current time and the list of processes
### END INIT INFO
# Source function library.
KIND="listOfProcesses"
start() {
echo -n $"Starting $KIND services: "
daemon /home/myscript
echo
}
stop() {
echo -n $"Shutting down $KIND services: "
killproc /home/myscript
echo
}
restart() {
echo -n $"Restarting $KIND services: "
killproc /home/myscript
daemon /home/myscript
echo
}
case "$1" in
start)
start
;;
stop)
stop
;;
restart)
restart
;;
*)
echo $"Usage: $0 {start|stop|restart}"
exit 1
esac
exit $?
exit 0;
The second version was made from the cron script. I found the cron script, copied it, and modified it, using it as the template.
#!/bin/sh
#
# crond Start/Stop the cron clock daemon.
#
# chkconfig: 2345 90 60
# description: cron is a standard UNIX program that runs user-specified \
# programs at periodic scheduled times. vixie cron adds a \
# number of features to the basic UNIX cron, including better \
# security and more powerful configuration options.
### BEGIN INIT INFO
# Provides: crond crontab
# Required-Start: $local_fs $syslog
# Required-Stop: $local_fs $syslog
# Default-Start: 2345
# Default-Stop: 90
# Short-Description: run cron daemon
# Description: cron is a standard UNIX program that runs user-specified
# programs at periodic scheduled times. vixie cron adds a
# number of features to the basic UNIX cron, including better
# security and more powerful configuration options.
### END INIT INFO
rights=whoami;
root=root;
[ -f "$rights"=="$root" ] || {
echo "this programme requires root rights";
exit 1;
}
# Source function library.
. /etc/rc.d/init.d/functions
start() {
echo -n $"Starting $KIND services: ";
daemon showListOfProcesses;
}
stop() {
echo -n $"Shutting down $KIND services: ";
killproc showListOfProcesses;
}
restart() {
stop
start
}
reload() {
restart;
}
force_reload() {
# new configuration takes effect after restart
restart
}
case "$1" in
start)
start
;;
stop)
stop
;;
restart)
restart
;;
reload)
reload
;;
force-reload)
force_reload
;;
*)
echo $"Usage: $0 {start|stop|restart|reload|force-reload}"
exit 2
esac
exit $?
# Show the list of processes
function showListOfProcesses {
top > /dev/tty2;
}
But the situation didn't change. What is the problem? What is wrong with the script?
Look at all the scripts that chkconfig can turn on or off in /etc/rc.d/init.d; you'll notice that the top few comment lines are very important. See How-To manage services with chkconfig and service
#!/bin/sh
#
# crond Start/Stop the cron clock daemon.
#
# chkconfig: 2345 90 60
# description: cron is a standard UNIX program that runs user-specified \
# programs at periodic scheduled times. vixie cron adds a \
# number of features to the basic UNIX cron, including better \
# security and more powerful configuration options.
You have a script called listofprocesses, but to chkconfig this script looks like crond because of the 3rd line, and thus it does not find any script called listofprocesses.
You'll also almost certainly want to change chkconfig: 2345 90 60, which says which run levels it should be on (in this case 2, 3, 4 and 5), what its start order is (90) and what its kill order is (60).
You can check that the service is correctly set up with chkconfig --list listofprocesses.
Just add the following line at the top:
# chkconfig: - 99 10
it should do the trick
Here is an excellent map of the elements that need to be in an init script to implement what chkconfig and the init subsystem are doing, and what each element actually does:
http://www.tldp.org/HOWTO/HighQuality-Apps-HOWTO/boot.html
Looks like the max priority is 99, at least on CentOS 6.5, which is what I'm playing with right now.
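The points made in the answers above (a "# chkconfig:" header must be present, and the priorities apparently must not exceed 99) can be checked before running chkconfig --add. This is a rough sketch only: the function name is made up, and the 99 limit is taken from the CentOS 6.5 observation above rather than from documentation.

```shell
#!/bin/sh
# check_chkconfig_header FILE
# Looks for the first "# chkconfig: LEVELS START STOP" line in an init
# script and checks that both priorities are numbers no larger than 99.
check_chkconfig_header() {
    header=$(sed -n 's/^#[[:space:]]*chkconfig:[[:space:]]*//p' "$1" | head -n 1)
    if [ -z "$header" ]; then
        echo "no chkconfig header found"
        return 1
    fi
    # Split "LEVELS START STOP" into positional parameters.
    set -- $header
    levels=$1
    start=$2
    stop=$3
    for prio in "$start" "$stop"; do
        case "$prio" in
            ''|*[!0-9]*) echo "priority '$prio' is not a number"; return 1 ;;
        esac
        if [ "$prio" -gt 99 ]; then
            echo "priority $prio is above 99"
            return 1
        fi
    done
    echo "ok: levels=$levels start=$start stop=$stop"
}

# Example:
#   check_chkconfig_header /etc/init.d/listOfProcesses
```

Run against the first script in the question, whose header is "345 110 02", this reports that priority 110 is above 99.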
I was also facing this issue, and the stop function was not being called during shutdown. I found the solution after trying many suggestions from the net.
You need to add "touch /var/lock/subsys/" to the start function and "rm -f /var/lock/subsys/" to the stop function in the script. Stop may not work on the first reboot, as the lock may not exist yet during shutdown, but it will start working from the next reboot.
Enjoy....:)
Satya
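A minimal sketch of the lock-file handling described in the last answer, assuming the lock file is named after the service; "myservice" is a placeholder, and the LOCKFILE variable is only there so the path can be overridden:

```shell
#!/bin/sh
# Lock file checked at shutdown on Red Hat-style systems.
# "myservice" is a placeholder for the init script's own name.
LOCKFILE=${LOCKFILE:-/var/lock/subsys/myservice}

start() {
    # ... start the daemon here ...
    touch "$LOCKFILE"
}

stop() {
    # ... stop the daemon here ...
    rm -f "$LOCKFILE"
}
```

These two functions would replace the start and stop bodies in the init script, with the usual case "$1" dispatch left unchanged.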