monitoring gearman in nagios


I am trying to monitor Gearman with Nagios, using the check_gearman.sh script.
The Gearman server is running on localhost.
When I run
./check_gearman.sh -H localhost -p 4730 -t 1000
the result is:
CRITICAL: gearman: gearman_client_run_tasks : gearman_wait(GEARMAN_TIMEOUT) timeout reached, 1 servers were poll(), no servers were available, pipe:false -> libgearman/universal.cc:331: pid(613)
Can someone please help me with this?
The script is below:
#!/bin/sh
#
# gearman check for nagios
# written by Georg Thoma (georg@thoma.cn)
# Last modified: 07-04-2014
#
# Description:
#
#
#
PROGNAME=`/usr/bin/basename $0`
PROGPATH=`echo $0 | sed -e 's,[\\/][^\\/][^\\/]*$,,'`
REVISION="0.04"
export TIMEFORMAT="%R"
. $PROGPATH/utils.sh
# Defaults
hostname=localhost
port=4730
timeout=50
# search for gearmanstuff
GEARMAN_BIN=`which gearman 2>&1 | grep -v "no gearman in"`
if [ "x$GEARMAN_BIN" == "x" ] ; then # result of check is empty
echo "gearman executable not found in path"
exit $STATE_UNKNOWN
fi
GEARADMIN_BIN=`which gearadmin 2>&1 | grep -v "no gearadmin in"`
if [ "x$GEARADMIN_BIN" == "x" ] ; then # result of check is empty
echo "gearadmin executable not found in path"
exit $STATE_UNKNOWN
fi
print_usage() {
echo "Usage: $PROGNAME [-H hostname -p port -t timeout]"
echo "Usage: $PROGNAME --help"
echo "Usage: $PROGNAME --version"
}
print_help() {
print_revision $PROGNAME $REVISION
echo ""
print_usage
echo ""
echo "gearman check plugin for nagios"
echo ""
support
}
# Make sure the correct number of command line
# arguments have been supplied
if [ $# -lt 1 ]; then
print_usage
exit $STATE_UNKNOWN
fi
# Grab the command line arguments
exitstatus=$STATE_WARNING #default
while test -n "$1"; do
case "$1" in
--help)
print_help
exit $STATE_OK
;;
-h)
print_help
exit $STATE_OK
;;
--version)
print_revision $PROGNAME $REVISION
exit $STATE_OK
;;
-V)
print_revision $PROGNAME $REVISION
exit $STATE_OK
;;
-H)
hostname=$2
shift
;;
--hostname)
hostname=$2
shift
;;
-t)
timeout=$2
shift
;;
--timeout)
timeout=$2
shift
;;
-p)
port=$2
shift
;;
--port)
port=$2
shift
;;
*)
echo "Unknown argument: $1"
print_usage
exit $STATE_UNKNOWN
;;
esac
shift
done
# check if server is running and replies to version query
VERSION_RESULT=`$GEARADMIN_BIN -h $hostname -p $port --server-version 2>&1 `
if [ "x$VERSION_RESULT" == "x" ] ; then # result of check is empty
echo "CRITICAL: Server is not running / responding"
exitstatus=$STATE_CRITICAL
exit $exitstatus
fi
# drop function echo to remove functions without workers
DROP_RESULT=`$GEARADMIN_BIN -h $hostname -p $port --drop-function echo_for_nagios 2>&1 `
# check for worker echo_for_nagios and start a new one if needed
CHECKWORKER_RESULT=`$GEARADMIN_BIN -h $hostname -p $port --status | grep echo_for_nagios`
if [ "x$CHECKWORKER_RESULT" == "x" ] ; then # result of check is empty
nohup $GEARMAN_BIN -h $hostname -p $port -w -f echo_for_nagios -- echo echo >/dev/null 2>&1 &
fi
# check the time to get the status from gearmanserver
CHECKWORKER_TIME=$( { time $GEARADMIN_BIN -h $hostname -p $port --status ; } 2>&1 |tail -1 )
# check if worker returns "echo"
CHECK_RESULT=`cat /dev/null | $GEARMAN_BIN -h $hostname -p $port -t $timeout -f echo_for_nagios 2>&1`
# validate result and set message and exitstatus
if [ "$CHECK_RESULT" = "echo" ] ; then # we got echo back
echo "OK: got an echo back from gearman server version: $VERSION_RESULT, responded in $CHECKWORKER_TIME sec|time=$CHECKWORKER_TIME;;;"
exitstatus=$STATE_OK
else # timeout reached, no echo
echo "CRITICAL: $CHECK_RESULT"
exitstatus=$STATE_CRITICAL
fi
exit $exitstatus
Thanks in advance.

The mod_gearman package contains a much better, more fully featured check_gearman plugin for Nagios.
With your current plugin, the error message shows that the check script cannot connect to the gearman daemon.
You should verify that something is listening on port 4730 on localhost, and that no local firewall is blocking connections. It is likely that your gearmand is installed on a different port, or is listening only on the network interface and not on localhost. It may also not be running at all, or be on a different server from the one running the check.
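Before changing the plugin, it can help to confirm that something accepts TCP connections on the expected port. The following is only a minimal sketch: the function name check_tcp is hypothetical, it assumes bash is installed (it relies on bash's /dev/tcp redirection), and it uses the usual Nagios plugin exit codes 0 (OK) and 2 (CRITICAL).

```shell
#!/bin/sh
# Hypothetical helper, not part of check_gearman.sh.
# Tries to open a TCP connection and reports the result in Nagios style.
check_tcp() {
    ct_host=$1
    ct_port=$2
    # /dev/tcp/HOST/PORT is a bash feature, so the probe runs in bash explicitly.
    if bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$ct_host" "$ct_port" 2>/dev/null; then
        echo "OK: connected to $ct_host:$ct_port"
        return 0
    fi
    echo "CRITICAL: cannot connect to $ct_host:$ct_port"
    return 2
}

# Example: probe the host and port used in the question, keeping the exit status.
rc=0
check_tcp localhost 4730 || rc=$?
echo "exit status: $rc"
```

If the probe fails, running the gearadmin call the script already uses (gearadmin -h localhost -p 4730 --server-version) by hand should show the same connection problem without the rest of the plugin in the way.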

Related

Shell script segmentation fault - AWS

I've been following a tutorial on connecting a Raspberry Pi to AWS Greengrass, and I keep getting a segmentation fault on the final step. AWS provided me with this greengrassd shell script, but when I run it I get a segmentation fault. I have no idea why it's throwing this error, so any help would be appreciated.
AWS Greengrass Tutorial / RaspberryPi
Error
pi@raspberrypi:/greengrass/ggc/packages/1.1.0 $ sudo ./greengrassd start
Setting up greengrass daemon
Validating execution environment
Found cgroup subsystem: cpu
Found cgroup subsystem: cpuacct
Found cgroup subsystem: blkio
Found cgroup subsystem: memory
Found cgroup subsystem: devices
Found cgroup subsystem: freezer
Found cgroup subsystem: net_cls
Starting greengrass daemon./greengrassd: line 158: 2254 Segmentation fault nohup $COMMAND > /dev/null 2> $CRASH_LOG < /dev/null
Greengrass daemon 2254 failed to start
greengrassd script
#!/usr/bin/env bash
##########Environment Requirement for Greengrass Daemon##########
# by default, the daemon assumes it's going to be launched from a directory
# that has the following structure:
# GREENGRASS_ROOT/
# greengrassd
# bin/daemon
# configuration/
# group/group.json
# certs/server.crt
# lambda/
# system_lambda1/...
# system_lambda2/...
# root cgroup has to be mounted separately, this script doesn't do that for you.
#################################################################
set -e
PWD=$(cd $(dirname "$0"); pwd)
GGC_PKG_HOME=$(readlink -f $PWD)
GG_HOME=$(cd $GGC_PKG_HOME/../../; pwd)
CRASH_LOG=$GG_HOME/var/log/crash.log
GGC_ROOT_FS=$GGC_PKG_HOME/ggc_root
PID_FILE=/var/run/greengrassd.pid
FS_SETTINGS=/proc/sys/fs
GGC_GROUP=ggc_group
GGC_USER=ggc_user
MAX_DAEMON_KILL_WAIT_SECONDS=60
RETRY_SIGTERM_INTERVAL_SECONDS=20
if [ -z "$COMMAND" ]; then
COMMAND="$GGC_PKG_HOME/bin/daemon -core-dir=$GGC_PKG_HOME -greengrassdPid=$$"
fi
# Function ran as part of initial setup
setup() {
echo "Setting up greengrass daemon"
mkdir -p $GGC_ROOT_FS
# Mask greengrass directory for containers
mknod $GGC_ROOT_FS/greengrass c 1 3 &>/dev/null || true
mkdir -p $(dirname "$CRASH_LOG")
}
validatePlatformSecurity() {
if [[ -f $FS_SETTINGS/protected_hardlinks &&
-f $FS_SETTINGS/protected_symlinks ]]; then
PROT_HARDLINK_VAL=$(cat $FS_SETTINGS/protected_hardlinks)
PROT_SOFTLINK_VAL=$(cat $FS_SETTINGS/protected_symlinks)
if [[ "$PROT_HARDLINK_VAL" -ne 1 || "$PROT_SOFTLINK_VAL" -ne 1 ]]; then
echo "AWS Greengrass detected insecure OS configuration: No hardlink/softlink protection enabled." | tee -a $CRASH_LOG
exit 1
fi
fi
}
validateEnvironment() {
echo "Validating execution environment"
# ensure all commands that the installation script is going to use are available
if ! type grep >/dev/null ; then
echo "grep command is NOT on the path or is NOT installed on the system"
exit 1
fi
if ! type cat >/dev/null ; then
echo "cat command is NOT on the path or is NOT installed on the system"
exit 1
fi
if ! type awk >/dev/null ; then
echo "awk command is NOT on the path or is NOT installed on the system"
exit 1
fi
if ! type id >/dev/null ; then
echo "id command is NOT on the path or is NOT installed on the system"
exit 1
fi
if ! type ps >/dev/null ; then
echo "ps command is NOT on the path or is NOT installed on the system"
exit 1
fi
if ! type sqlite3 >/dev/null ; then
echo "sqlite3 command is NOT on the path or is NOT installed on the system"
exit 1
fi
# the script needs to be run as root
if [ ! $(id -u) = 0 ]; then
echo "The script needs to be run using sudo"
exit 1
fi
if ! id $GGC_USER >/dev/null ; then
echo "${GGC_USER} doesn't exist. Please add a user ${GGC_USER} on the system"
exit 1
fi
if ! grep -q $GGC_GROUP /etc/group ; then
echo "${GGC_GROUP} doesn't exist. Please add a group ${GGC_GROUP} on the system"
exit 1
fi
# ensure that kernel supports cgroup
if [ ! -e /proc/cgroups ]; then
echo "The kernel in use does NOT support cgroup."
exit 1
fi
# assume that all kernel supported subsystems, which are listed in /proc/cgroups, are going to be used
# so check whether all of them are mounted.
for d in `awk '$4 == 1 {print $1}' /proc/cgroups`; do
if cat /proc/self/cgroup | grep -q $d; then
echo "Found cgroup subsystem: $d"
else
# exit with error if can't find cgroup
echo "The cgroup subsystem is not mounted: $d"
exit 1
fi
done
}
finish() {
pid=$1
echo "$pid" > $PID_FILE
echo ""
echo -e "\e[0;32mGreengrass successfully started with PID: $pid\e[0m"
exit 0
}
start() {
setup
if [[ $INSECURE -ne 1 ]]; then
validatePlatformSecurity
fi
validateEnvironment
trap 'finish $pid' SIGUSR1
echo ""
echo -n "Starting greengrass daemon"
if nohup $COMMAND >/dev/null 2>$CRASH_LOG < /dev/null &
then
pid=$!
# sleep 10 seconds to wait for daemon to start or exit
sleep 10 &
wait $!
echo ""
echo "Greengrass daemon $pid failed to start"
echo -e "\e[0;31m$(cat $CRASH_LOG)\e[0m"
exit 1
else
echo "Failed to start Greengrass daemon"
exit 1
fi
}
version() {
$GGC_PKG_HOME/bin/daemon --version
}
stop() {
if [ -f $PID_FILE ]; then
PID=$(cat $PID_FILE)
echo "Stopping greengrass daemon of PID: $PID"
if [ ! -e "/proc/$PID" ]; then
rm $PID_FILE
echo "Process with pid $PID does not exist already"
return 0
fi
echo -n "Waiting"
kill "$PID" > /dev/null 2>&1
total_sleep_seconds=0
until [ "$total_sleep_seconds" -ge "$MAX_DAEMON_KILL_WAIT_SECONDS" ]; do
sleep 1
# If the pid no longer exists, we're done, remove the pid file and exit. Otherwise, just increment the loop counter
if [ ! -e "/proc/$PID" ]; then
rm $PID_FILE
echo -e "\nStopped greengrass daemon, exiting with success"
break
else
total_sleep_seconds=$(($total_sleep_seconds+1))
echo -n "."
fi
# If it has been $RETRY_SIGTERM_INTERVAL_SECONDS since the last SIGTERM, send SIGTERM
if [ $(($total_sleep_seconds % $RETRY_SIGTERM_INTERVAL_SECONDS)) -eq "0" ]; then
kill "$PID" > /dev/null 2>&1
fi
done
if [ $total_sleep_seconds -ge $MAX_DAEMON_KILL_WAIT_SECONDS ] && [ -e "/proc/$PID" ]; then
# If we are here, we never exited in the previous loop and the pid still exists. Exit with failure.
kill -9 "$PID" > /dev/null 2>&1
echo -e "\nProcess with pid $PID still alive after timeout of $MAX_DAEMON_KILL_WAIT_SECONDS seconds. Forced kill process, exiting with failure."
exit 1
fi
fi
}
usage() {
echo ""
echo "Usage: $0 [FLAGS] {start|stop|restart}"
echo ""
echo -e "[FLAGS]: \n -i, --insecure \t Run GGC in insecure mode without hardlink/softlink protection, (highly discouraged for production use) \n -v, --version \t\t Outputs the version of GGC."
echo ""
exit 1
}
if [[ $# -eq 0 ]]; then
usage
fi
for var in "$@"
do
case "$var" in
-v|--version)
version
exit 0
;;
esac
done
while [[ $# -gt 0 ]]
do
key="$1"
case $key in
-i|--insecure)
mkdir -p $(dirname "$CRASH_LOG")
echo "Warning! You are running in insecure mode, this is highly discouraged!" | tee -a $CRASH_LOG
INSECURE=1
;;
-h|--help)
usage
;;
start)
stop
start
;;
stop)
stop
;;
restart)
stop
start
;;
*)
usage
esac
shift
done

@Jim Maybe check the model of Pi you are using?
It seems that the Pi version of Greengrass is for ARMv7-A. I got this problem too and I'm using an older Model 1 B+ which is ARMv6Z (https://en.wikipedia.org/wiki/Raspberry_Pi#Specifications).
The error we're seeing for line 158 is the ./greengrassd script waiting for the actual process to run:
sudo /greengrass/ggc/packages/1.1.0/bin/daemon -core-dir=/greengrass/ggc/packages/1.1.0 -greengrassdPid=641
/greengrass/ggc/packages/1.1.0/bin/daemon is the binary. If you run the above command directly in the console it exits with the same segmentation fault error.
AWS do recommend using the Pi 3 so I'm guessing it will work on that.
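To check this before launching the daemon, the machine type reported by uname -m can be inspected. This is a rough sketch under assumptions: the helper name is_armv7_or_newer is made up for illustration, and the strings armv7l, armv8l and aarch64 are the values Linux typically reports on ARM boards (a Model 1 usually reports armv6l), not something taken from the Greengrass documentation.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given machine string looks like
# ARMv7 or newer, fails for anything else (including armv6l).
is_armv7_or_newer() {
    case "$1" in
        armv7*|armv8*|aarch64) return 0 ;;
        *) return 1 ;;
    esac
}

machine=$(uname -m)
if is_armv7_or_newer "$machine"; then
    echo "$machine: ARMv7 or newer"
else
    echo "$machine: not ARMv7 or newer, an ARMv7 binary may crash on start"
fi
```

A check like this only narrows things down; running the daemon binary directly, as shown above, is still the quickest way to see the crash itself.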

Service status not working

I have the following code for a service that I'm trying to have automatically start on boot.
#!/bin/sh
# Source function library.
. /etc/rc.d/init.d/functions
RETVAL=0
prog='foo'
exec="/usr/sbin/$prog"
pidfile="/var/run/$prog.pid"
lock_file="/var/lock/subsys/$prog"
logfile="/var/log/$prog"
if [ -f /etc/default/foo ]; then
. /etc/default/foo
fi
if [ -z $QUEUE_TYPE ]; then
echo 'ENV variable QUEUE_TYPE has not been set, please set it in /etc/default/foo'
exit 1
fi
get_pid() {
cat "$pidfile"
}
is_running() {
[ -f "$pidfile" ] && ps `get_pid` > /dev/null 2>&1
}
case "$1" in
start)
echo -n "Starting Consul daemon: "
#
daemon --pidfile $pidfile --check foo --user my-user "my app stuff here"
echo
;;
stop)
echo -n 'Stopping Consul daemon: '
killproc foo
echo
;;
status)
status $pidfile
RETVAL=$?
#status -p $pidfile -l $prog
#[ $RETVAL -eq 0 ] && RETVAL=$?
#RETVAL=$?
#if is_running; then
# echo 'Running'
#else
# echo 'Not Running'
#fi
#status foo
#RETVAL=$?
;;
restart)
$0 stop
$0 start
RETVAL=$?
;;
*)
echo 'Usage: foo {start|stop|status|restart}'
exit 1
esac
exit $RETVAL
When I run sudo service foo status it says that it hasn't been started which is correct. After running sudo service foo start and then running the status command, it tells me that the service hasn't been started. I'm not sure what is causing this to happen. I looked at the configurations for other init.d scripts to see how they were handling this and tried to follow their lead. Is there something obvious here that I'm doing wrong or something else that I may be unaware of that's causing this problem?

Automatically establish ssh-tunnel, wait until ssh-tunnel is established, then establish normal VPN connection

I got this script:
#!/usr/bin/env bash
if [ ! "$UID" = 0 ]; then
if [ `type -P gksu` ]; then
SUDOAPP="gksu"
elif [ `type -P kdesu` ]; then
SUDOAPP="kdesu"
else
SUDOAPP="sudo"
fi
fi
if [ -n "$1" ]; then
if [ "$1" = "start" ]; then
$SUDOAPP systemctl start openvpn@******
elif [ "$1" = "stop" ]; then
$SUDOAPP systemctl stop openvpn@******
elif [ "$1" = "restart" ]; then
$SUDOAPP systemctl restart openvpn@******
else
echo "Invalid command"
exit 1
fi
else
echo "Run 'start', 'stop' or 'restart' as an argument to start, stop or restart the ******"
exit 1
fi
It works fine. However, I also need to establish an SSH tunnel before openvpn connects to my VPN. I've got a script which does precisely that:
#!/bin/bash
# --------------------------------------------------------
# ******* | https://******.org | ****************************************
# SSH Client Configuration, Linux/OSX
# ******_*************
# --------------------------------------------------------
chmod 600 /etc/openvpn/sshtunnel.key
while :
do
echo ""; echo "****** SSH Tunnel"
ssh -i /etc/openvpn/sshtunnel.key -L ****:127.0.0.1:**** sshtunnel@**.**.**.* -p ** -N -T -v
read -t 5 -p "Retry? (or wait 5 sec for Y)" yn
if [[ $yn == "n" || $yn == "N" ]]; then break; fi
done
How do I add this to the first script so that the openvpn part waits until the SSH tunnel is up?

The first script can loop checking the tunnel until it succeeds. You can use nc (netcat) to do that and capture the output in a shell variable:
while [[ -z "$nc_output" ]]; do
read -r nc_output < <(nc -v -d -u localhost openvpn 2>&1)
sleep 2
done
This checks every 2 seconds whether UDP port "openvpn" (substitute what you're actually tunnelling) can be connected to, relying on the -v option to output text if it succeeds.
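A variant of the same idea with an upper bound on the waiting time is sketched below. It is an assumption-laden illustration, not a drop-in fix: wait_for_port is a hypothetical name, it probes TCP (ssh -L forwards TCP ports) through bash's /dev/tcp redirection instead of nc, and the port number has to be replaced with the local port your tunnel actually forwards.

```shell
#!/bin/sh
# Hypothetical helper: poll a TCP port until it accepts connections.
# Arguments: host, port, number of attempts, seconds between attempts.
wait_for_port() {
    wp_host=$1
    wp_port=$2
    wp_tries=${3:-15}
    wp_delay=${4:-2}
    while [ "$wp_tries" -gt 0 ]; do
        # /dev/tcp is a bash feature, so the probe is run by bash explicitly.
        if bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$wp_host" "$wp_port" 2>/dev/null; then
            return 0
        fi
        wp_tries=$((wp_tries - 1))
        if [ "$wp_tries" -gt 0 ]; then
            sleep "$wp_delay"
        fi
    done
    return 1
}

# Possible use in the first script (TUNNEL_PORT is a placeholder):
#   if wait_for_port 127.0.0.1 "$TUNNEL_PORT" 15 2; then
#       ... start openvpn here ...
#   else
#       echo "SSH tunnel did not come up" >&2
#       exit 1
#   fi
```

The tunnel script itself would have to be started in the background first, since its ssh command does not return while the tunnel is open.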

Php script as daemon

I am new to PHP daemons. I am using the script below to start Daemon.php, but I get an error when I run this bash script from the shell.
The error is:
exit: 0RETVAL=0: numeric argument required
Please help me resolve this error.
#!/bin/bash
#
# /etc/init.d/Daemon
#
# Starts the at daemon
#
# chkconfig: 345 95 5
# description: Runs the demonstration daemon.
# processname: Daemon
# Source function library.
#. /etc/init.d/functions
#startup values
log=/var/log/Daemon.log
#verify that the executable exists
test -x /home/godlikemouse/Daemon.php || exit 0RETVAL=0
#
# Set prog, proc and bin variables.
#
prog="Daemon"
proc=/var/lock/subsys/Daemon
bin=/home/godlikemouse/Daemon.php
start() {
# Check if Daemon is already running
if [ ! -f $proc ]; then
echo -n $"Starting $prog: "
daemon $bin --log=$log
RETVAL=$?
[ $RETVAL -eq 0 ] && touch $proc
echo
fi
return $RETVAL
}
stop() {
echo -n $"Stopping $prog: "
killproc $bin
RETVAL=$?
[ $RETVAL -eq 0 ] && rm -f $proc
echo
return $RETVAL
}
restart() {
stop
start
}
reload() {
restart
}
status_at() {
status $bin
}
case "$1" in
start)
start
;;
stop)
stop
;;
reload|restart)
restart
;;
condrestart)
if [ -f $proc ]; then
restart
fi
;;
status)
status_at
;;
*)
echo $"Usage: $0 {start|stop|restart|condrestart|status}"
exit 1
esac
exit $?
exit $RETVAL

This line produces the error:
test -x /home/godlikemouse/Daemon.php || exit 0RETVAL=0
Here exit is handed the single word 0RETVAL=0, which is not a number, and a variable name cannot start with a digit either. Most likely exit 0 and RETVAL=0 were meant to be two separate statements.
If RETVAL=0 is not needed, drop it so that the script simply exits when Daemon.php does not exist or is not executable:
test -x /home/godlikemouse/Daemon.php || exit
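The error can be reproduced in isolation. The sketch below assumes that exit 0 and RETVAL=0 were meant to be on separate lines, as the surrounding script suggests; /nonexistent/Daemon.php is a placeholder path used only so that test -x fails.

```shell
#!/bin/sh
# Joined form: exit receives the single word "0RETVAL=0" and rejects it.
joined=0
bash -c 'test -x /nonexistent/Daemon.php || exit 0RETVAL=0' 2>/dev/null || joined=$?
echo "joined form exit status: $joined"

# Split form: "exit 0" ends the script quietly when the file is missing,
# and RETVAL=0 is an ordinary assignment on its own line.
split=0
bash -c 'test -x /nonexistent/Daemon.php || exit 0
RETVAL=0' || split=$?
echo "split form exit status: $split"
```

Note that a bare exit, as in the line above, passes on the failing status of test -x, whereas exit 0 reports success even though the daemon script is missing; which one is wanted depends on how the init system should treat a missing file.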
You can also remove the two empty echo statements inside the start and stop functions, as they only print a newline.
In the case statement you can quote the patterns, although this is optional, and you can remove the final exit $RETVAL, because exit $? always runs first:
case "$1" in
"start")
start
;;
"stop")
stop
;;
"reload"|"restart")
restart
;;
"condrestart")
if [ -f $proc ]; then
restart
fi
;;
"status")
status_at
;;
*)
echo "Usage: $0 {start|stop|restart|condrestart|status}"
exit 1
;;
esac
exit $?

There are several syntax and logic errors in this script. To highlight a few:
echo $"Usage ..." should be just echo "Usage ...", since the string in "..." is not a variable ($"..." is bash's translatable-string syntax and is not needed here).
There are two exit statements at the end; the second one, exit $RETVAL, is never run.
exit 0RETVAL is not the same as exit $RETVAL. Use exit 1 to signal an error; exit 0 means the script ran correctly.
$prog is defined but only used in the start and stop messages.
test -x only checks that the executable bit is set on the given path. test -f checks for a regular file, test -d for a directory, and test -L for a symlink. Combine test -f and test -x to make sure the path is a regular file that is also executable, for example: (test -f /home/godlikemouse/Daemon.php && test -x /home/godlikemouse/Daemon.php) || exit 1
Further details on creating sysv init scripts can be read at http://refspecs.linuxbase.org/LSB_3.0.0/LSB-generic/LSB-generic/iniscrptact.html and bash scripting can be read at http://www.tldp.org/LDP/abs/html/index.html. It is strongly encouraged to learn both before writing system control programs such as init scripts.
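The combined test -f and test -x check suggested above can be wrapped in a small function. This is a minimal sketch; the name require_executable is hypothetical, and /bin/sh is used in the example only because it exists on practically every system.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the path is a regular file
# (-f, symlinks are followed) that the current user may execute (-x).
require_executable() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        return 0
    fi
    echo "$1 is missing or not executable" >&2
    return 1
}

# Example with a path that should exist everywhere.
if require_executable /bin/sh; then
    echo "/bin/sh can be executed"
fi
```

In the init script this would replace the test -x line, for example as require_executable /home/godlikemouse/Daemon.php || exit 1.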

Shell script get the right PID

I wrote a little program that I want to start as a service on Opensuse 11.3
This is from the init.d script, it starts my processes as I want but I don't get the right PID.
What am I missing?
echo "Starting DHCPALERT"
for (( i = 1; i <= $DHCP_AL_DAEMONS; i++ ))
do
var="DHCP_AL_$i"
START_CMD="exec /sbin/startproc -p "$DHCP_AL_PID_DIR"p_"$i".pid -l "$DHCP_AL_LOG_DIR"log_"$i".log "$DHCPALERT_BIN" "${!var}" &"
eval $START_CMD
echo "PID: "$!
echo "Command: "$START_CMD
done
results in
PID: 47347
Command: (/sbin/startproc -p /var/log/sthserver/dhcpalert/p_2.pid -l /var/log/sthserver/dhcpalert/log_2.log /usr/sbin/dhcpalert -i eth1 -c ./test.sh -a 00:15:5D:0A:16:07 -v )&
but pidof returns some other pid.
If I try to execute it directly:
exec /sbin/startproc -p "$DHCP_AL_PID_DIR"p_"$i".pid -l "$DHCP_AL_LOG_DIR"log_"$i".log "$DHCPALERT_BIN" "${!var}" &
Then I get errors:
startproc: exit status of parent of /usr/sbin/dhcpalert: 1
I suppose because I don't escape the variables the right way?
This is the whole script:
#!/bin/sh
# Check for missing binaries (stale symlinks should not happen)
# Note: Special treatment of stop for LSB conformance
DHCPALERT_BIN=/usr/sbin/dhcpalert
#-x FILE exists and is executable
test -x $DHCPALERT_BIN || { echo "$DHCPALERT_BIN not installed";
if [ "$1" = "stop" ]; then exit 0;
else exit 5; fi; }
# Check for existence of needed config file and read it
DHCPALERT_CONFIG=/etc/sysconfig/dhcpalert
#-r FILE exists and is readable
test -r $DHCPALERT_CONFIG || { echo "$DHCPALERT_CONFIG not existing";
if [ "$1" = "stop" ]; then exit 0;
else exit 6; fi; }
# Read config to system VARs for this shell session only same as "source FILE"
. $DHCPALERT_CONFIG
#check for exitstence of the log dir
if [ -d "$DHCP_AL_LOG_DIR" ]; then
echo "exists 1"
echo "exists 2"
echo "exists 3"
if [ "$1" = "start" ]; then
echo "Deleting all old log files from: "
echo "Dir:... "$DHCP_AL_LOG_DIR
rm -R $DHCP_AL_LOG_DIR
mkdir $DHCP_AL_LOG_DIR
fi
else
echo "does not exist 1"
echo "does not exist 2"
echo "does not exist 3"
echo "Directory for Logfiles does not exist."
echo "Dir:... "$DHCP_AL_LOG_DIR
echo "Createing dir..."
mkdir $DHCP_AL_LOG_DIR
fi
. /etc/rc.status
# Reset status of this service
rc_reset
case "$1" in
start)
echo "Starting DHCPALERT"
for (( i = 1; i <= $DHCP_AL_DAEMONS; i++ ))
do
var="DHCP_AL_$i"
exec /sbin/startproc -p "$DHCP_AL_PID_DIR"p_"$i".pid -l "$DHCP_AL_LOG_DIR"log_"$i".log "$DHCPALERT_BIN" "${!var}" &
# START_CMD="exec /sbin/startproc -p "$DHCP_AL_PID_DIR"p_"$i".pid -l "$DHCP_AL_LOG_DIR"log_"$i".log "$DHCPALERT_BIN" "${!var}" &"
# eval $START_CMD
echo "PID: "$!
# echo "Command: "$START_CMD
done
rc_status -v
;;
stop)
echo -n "Shutting down DHCPALERT "
/sbin/killproc -TERM $DHCPALERT_BIN
rc_status -v
;;
try-restart|condrestart)
if test "$1" = "condrestart"; then
echo "${attn} Use try-restart ${done}(LSB)${attn} rather than condrestart ${warn}(RH)${norm}"
fi
$0 status
if test $? = 0; then
$0 restart
else
rc_reset # Not running is not a failure.
fi
rc_status
;;
restart)
$0 stop
$0 start
rc_status
;;
force-reload)
echo -n "Reload service DHCPALERT "
/sbin/killproc -HUP $DHCPALERT_BIN
rc_status -v
;;
reload)
echo -n "Reload service DHCPALERT "
/sbin/killproc -HUP $DHCPALERT_BIN
rc_status -v
;;
status)
echo -n "Checking for service DHCPALERT "
/sbin/checkproc $DHCPALERT_BIN
rc_status -v
;;
probe)
test /etc/DHCPALERT/DHCPALERT.conf -nt /var/run/DHCPALERT.pid && echo reload
;;
*)
echo "Usage: $0 {start|stop|status|try-restart|restart|force-reload|reload|probe}"
exit 1
;;
esac
rc_exit
The configfile:
## Specifiy where to store the Pid files
DHCP_AL_PID_DIR="/var/log/sthserver/dhcpalert/"
##
## Specifiy where to store the Log file
DHCP_AL_LOG_DIR="/var/log/sthserver/dhcpalert/"
##
## is needed to determine how many vars should be read and started!
DHCP_AL_DAEMONS="2"
##
## Then DHCP_AL_<number> to specify the command that one instance of
## dhcpalert should be started
DHCP_AL_1="-i eth0 -c ./test.sh -a 00:15:5D:0A:16:06 -v"
DHCP_AL_2="-i eth1 -c ./test.sh -a 00:15:5D:0A:16:07 -v"

Add exec to it to prevent forking:
START_CMD="(exec /sbin/startproc -p "$DHCP_AL_PID_DIR"p_"$i".pid -l "$DHCP_AL_LOG_DIR"log_"$i".log "$DHCPALERT_BIN" "${!var}") &"
Update. Please try this script:
#!/bin/bash
# Check for missing binaries (stale symlinks should not happen)
# Note: Special treatment of stop for LSB conformance
DHCPALERT_BIN=/usr/sbin/dhcpalert
#-x FILE exists and is executable
[[ -x $DHCPALERT_BIN ]] || {
echo "$DHCPALERT_BIN not installed"
if [ "$1" = "stop" ]; then
exit 0
else
exit 5
fi
}
# Check for existence of needed config file and read it
DHCPALERT_CONFIG=/etc/sysconfig/dhcpalert
#-r FILE exists and is readable
[[ -r $DHCPALERT_CONFIG ]] || {
echo "$DHCPALERT_CONFIG not existing"
if [[ $1 == stop ]]; then
exit 0
else
exit 6
fi
}
# Read config to system VARs for this shell session only same as "source FILE"
. "$DHCPALERT_CONFIG"
#check for exitstence of the log dir
CREATE_DIR=false
if [[ -d $DHCP_AL_LOG_DIR ]]; then
echo "exists 1"
echo "exists 2"
echo "exists 3"
if [[ $1 == start ]]; then
echo "Deleting all old log files from: "
echo "Dir:... $DHCP_AL_LOG_DIR"
rm -R "$DHCP_AL_LOG_DIR"
CREATE_DIR=true
fi
else
echo "does not exist 1"
echo "does not exist 2"
echo "does not exist 3"
echo "Directory for Logfiles does not exist."
CREATE_DIR=true
fi
if [[ $CREATE_DIR == true ]]; then
echo "Dir:... $DHCP_AL_LOG_DIR"
echo "Createing dir..."
mkdir "$DHCP_AL_LOG_DIR" || {
echo "Failed to create directory $DHCP_AL_LOG_DIR"
exit 1
}
fi
. /etc/rc.status
# Reset status of this service
rc_reset
case "$1" in
start)
echo "Starting DHCPALERT"
for (( I = 1; I <= DHCP_AL_DAEMONS; ++I )); do
REF="DHCP_AL_${I}[@]"
COMMAND=(/sbin/startproc -p "${DHCP_AL_PID_DIR}p_${I}.pid" -l "${DHCP_AL_LOG_DIR}log_${I}.log" "$DHCPALERT_BIN" "${!REF}")
echo "COMMAND: ${COMMAND[*]}"
"${COMMAND[@]}" &
PID=$!
echo "PID: $PID"
done
rc_status -v
;;
stop)
echo -n "Shutting down DHCPALERT "
/sbin/killproc -TERM "$DHCPALERT_BIN"
rc_status -v
;;
try-restart|condrestart)
[[ $1 == condrestart ]] && echo "${attn} Use try-restart ${done}(LSB)${attn} rather than condrestart ${warn}(RH)${norm}"
"$0" status ## ??
if [[ $? -eq 0 ]]; then
"$0" restart
else
rc_reset # Not running is not a failure.
fi
rc_status
;;
restart)
"$0" stop
"$0" start
rc_status
;;
force-reload)
echo -n "Reload service DHCPALERT "
/sbin/killproc -HUP "$DHCPALERT_BIN"
rc_status -v
;;
reload)
echo -n "Reload service DHCPALERT "
/sbin/killproc -HUP "$DHCPALERT_BIN"
rc_status -v
;;
status)
echo -n "Checking for service DHCPALERT "
/sbin/checkproc "$DHCPALERT_BIN"
rc_status -v
;;
probe)
[[ /etc/DHCPALERT/DHCPALERT.conf -nt /var/run/DHCPALERT.pid ]] && echo reload
;;
*)
echo "Usage: $0 {start|stop|status|try-restart|restart|force-reload|reload|probe}"
exit 1
;;
esac
rc_exit
Config file:
## Specifiy where to store the Pid files
DHCP_AL_PID_DIR="/var/log/sthserver/dhcpalert/"
##
## Specifiy where to store the Log file
DHCP_AL_LOG_DIR="/var/log/sthserver/dhcpalert/"
##
## is needed to determine how many vars should be read and started!
DHCP_AL_DAEMONS="2"
##
## Then DHCP_AL_<number> to specify the command that one instance of
## dhcpalert should be started
DHCP_AL_1=(-i eth0 -c ./test.sh -a 00:15:5D:0A:16:06 -v)
DHCP_AL_2=(-i eth1 -c ./test.sh -a 00:15:5D:0A:16:07 -v)

When you put a shell command inside parentheses, you start a new subshell. You run this subshell in the background, and it is that subshell's process ID that you get with $!.
There are two solutions. The first is to run the /sbin/startproc command directly in the background rather than in a subshell. The second is to monitor the pid file passed to /sbin/startproc with -p.
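The second approach can be sketched as a small polling helper. This is an illustration only: wait_for_pidfile is a hypothetical name, and it assumes that a pid file really does get written at the path given with -p, which is worth verifying on your system.

```shell
#!/bin/sh
# Hypothetical helper: wait until a pid file exists and is non-empty,
# then print its contents. Arguments: pid file path, number of attempts.
wait_for_pidfile() {
    pf_path=$1
    pf_tries=${2:-10}
    while [ "$pf_tries" -gt 0 ]; do
        if [ -s "$pf_path" ]; then
            cat "$pf_path"
            return 0
        fi
        pf_tries=$((pf_tries - 1))
        if [ "$pf_tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Self-contained demonstration with a temporary file standing in for the
# real pid file; the shell's own PID is written so there is something to read.
demo_pidfile=$(mktemp)
echo "$$" > "$demo_pidfile"
echo "PID from file: $(wait_for_pidfile "$demo_pidfile" 1)"
rm -f "$demo_pidfile"
```

In the init script the call would take the same path that is passed to startproc, for example wait_for_pidfile "${DHCP_AL_PID_DIR}p_${i}.pid", in place of echoing $!.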
