I'd like to have an init.d daemon restart my node.js app if it crashes. This script starts/stops my node app. I've had no luck getting it to restart the app if it crashes.
I'm running under CentOS. What am I missing?
#!/bin/sh
. /etc/rc.d/init.d/functions
USER="rmlxadmin"
DAEMON="/usr/bin/nodejs"
ROOT_DIR="/home/rmlxadmin"
SERVER="$ROOT_DIR/my_node_app.js"
LOG_FILE="$ROOT_DIR/app.js.log"
LOCK_FILE="/var/lock/subsys/node-server"
do_start()
{
if [ ! -f "$LOCK_FILE" ] ; then
echo -n $"Starting $SERVER: "
runuser -l "$USER" -c "$DAEMON $SERVER >> $LOG_FILE &" && echo_success || echo_failure
RETVAL=$?
echo
[ $RETVAL -eq 0 ] && touch $LOCK_FILE
else
echo "$SERVER is locked."
RETVAL=1
fi
}
do_stop()
{
echo -n $"Stopping $SERVER: "
pid=`ps -aefw | grep "$DAEMON $SERVER" | grep -v " grep " | awk '{print $2}'`
kill -9 $pid > /dev/null 2>&1 && echo_success || echo_failure
RETVAL=$?
echo
[ $RETVAL -eq 0 ] && rm -f $LOCK_FILE
}
case "$1" in
start)
do_start
;;
stop)
do_stop
;;
restart)
do_stop
do_start
;;
*)
echo "Usage: $0 {start|stop|restart}"
RETVAL=1
esac
exit $RETVAL
You need to use additional tools like node-supervisor for this case.
Install node-supervisor with npm:
sudo npm install -g supervisor
Change the DAEMON variable in your init.d script to the node-supervisor executable, /usr/bin/supervisor. You can confirm this path by running 'whereis supervisor' on your system (after installation, of course).
Now supervisor will restart your application if it crashes.
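For illustration only, the restart-on-crash behaviour that a supervisor provides can be approximated in plain shell. This sketch is not how node-supervisor is implemented; run_with_restart, its restart budget argument and the RESTART_DELAY variable are hypothetical names chosen for this example. The loop reruns a command until it exits cleanly or the budget is spent.

```shell
#!/bin/sh
# Hypothetical helper (not part of node-supervisor): rerun a command
# whenever it exits with a non-zero status, up to MAX_RESTARTS times.
# Usage: run_with_restart MAX_RESTARTS COMMAND [ARGS...]
run_with_restart() {
    max_restarts=$1
    shift
    restarts=0
    while :; do
        if "$@"; then
            # A zero exit status is treated as a deliberate, clean shutdown.
            return 0
        else
            status=$?
        fi
        if [ "$restarts" -ge "$max_restarts" ]; then
            echo "giving up after $restarts restarts (last status $status)" >&2
            return "$status"
        fi
        restarts=$((restarts + 1))
        echo "command exited with status $status; restart #$restarts" >&2
        # Short pause so a crash loop does not spin the CPU (assumed default: 1s).
        sleep "${RESTART_DELAY:-1}"
    done
}
```

With the variables from the question, a call might look like run_with_restart 10 "$DAEMON" "$SERVER"; a dedicated supervisor remains the more robust choice.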
Related
I found this init script for launching nginx here: http://www.rackspace.com/knowledge_center/article/centos-adding-an-nginx-init-script. I'm trying to modify it to work with a chroot environment, and it's proving more troublesome than I expected.
Here's what I have:
#!/bin/sh
#
# nginx - this script starts and stops the nginx daemon
#
# chkconfig: - 85 15
. /etc/rc.d/init.d/functions
. /etc/sysconfig/network
[ "$NETWORKING" = "no" ] && exit 0
nginx="/usr/local/nginx/sbin/nginx"
prog="nginx"
chroot="/usr/sbin/chroot /nginx"
NGINX_CONF_FILE="/usr/local/nginx/conf/nginx.conf"
lockfile=/var/lock/subsys/nginx
start() {
[ -x $nginx ] || exit 5
[ -f $NGINX_CONF_FILE ] || exit 6
echo -n $"Starting $prog: "
daemon $chroot $nginx -c $NGINX_CONF_FILE
retval=$?
echo
[ $retval -eq 0 ] && touch $lockfile
return $retval
}
stop() {
echo -n $"Stopping $prog: "
killall $prog
retval=$?
echo
[ $retval -eq 0 ] && rm -f $lockfile
return $retval
}
restart() {
configtest || return $?
stop
start
}
reload() {
configtest || return $?
echo -n $"Reloading $prog: "
killall $prog -HUP
retval=$?
echo
}
force_reload() {
restart
}
configtest() {
$nginx -t -c $NGINX_CONF_FILE
}
rh_status() {
status $prog
}
rh_status_q() {
rh_status >/dev/null 2>&1
}
case "$1" in
start)
if rh_status_q; then
printf "$prog already running\n"
exit 0
fi
$1
;;
stop)
if ! rh_status_q; then
printf "$prog not running\n"
exit 0
fi
$1
;;
restart|configtest)
$1
;;
reload)
if ! rh_status_q; then
printf "$prog not running\n"
exit 7
fi
$1
;;
force-reload)
force_reload
;;
status)
rh_status
;;
condrestart|try-restart)
if ! rh_status_q; then
printf "$prog not running\n"
exit 0
fi
;;
*)
echo $"Usage: $0 {start|stop|status|restart|condrestart|try-restart|reload|force-reload|configtest}"
exit 2
esac
The main thing not working right now is the status call.
[root@localhost ~]# /etc/init.d/nginx status
nginx dead but subsys locked
This error points to a lockfile being present but no pid file. This is true, however when I use the original nginx init script and launch it in a non-chroot environment, there is no pidfile, and the status call works just fine.
I put in some lines to retrieve the pid by launching the process in the background instead of using daemon:
PID=`$chroot $nginx -c $NGINX_CONF_FILE > /dev/null 2>&1 & echo $!`
echo $PID > $pidfile
But this grabs the pid of chroot, which ends after nginx is launched. Nginx has both a master and a child pid. Might grabbing those and putting them to a pidfile be the way forward? Or is there something else I should try?
I think it's because daemon launches a chroot process, so it gets the wrong PID.
Alright, it was as easy as grepping for the PID:
pidfile='/var/run/nginx.pid'
start(){
...
daemon $chroot $nginx -c $NGINX_CONF_FILE
PID=`ps -ef | grep $nginx | grep -v grep | awk '{print $2}'`
echo $PID > $pidfile
...
}
stop() {
...
[ $retval -eq 0 ] && rm -f $lockfile && rm -f $pidfile
return $retval
}
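A pidfile built by grepping ps output can hold several PIDs (nginx has a master and worker processes) and can go stale if the process dies. As a hedged sketch, the helper below checks whether the first PID recorded in a pidfile still refers to a live process. The name pidfile_alive is made up for this example; it relies on kill -0, which delivers no signal and only tests whether the process exists and may be signalled, so it is most reliable when run as root, as init scripts normally are.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the first PID stored in the given
# pidfile belongs to a process that currently exists.
# Usage: pidfile_alive /path/to/file.pid
pidfile_alive() {
    pa_file=$1
    [ -r "$pa_file" ] || return 1
    # Take only the first whitespace-separated field of the first line.
    pa_pid=
    read -r pa_pid pa_rest < "$pa_file" || :
    # Reject empty or non-numeric content.
    case $pa_pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 performs the existence/permission check only.
    kill -0 "$pa_pid" 2>/dev/null
}
```

A status branch could then report "dead but pidfile exists" when the file is present but pidfile_alive fails.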
I have the following code for a service that I'm trying to have automatically start on boot.
#!/bin/sh
# Source function library.
. /etc/rc.d/init.d/functions
RETVAL=0
prog='foo'
exec="/usr/sbin/$prog"
pidfile="/var/run/$prog.pid"
lock_file="/var/lock/subsys/$prog"
logfile="/var/log/$prog"
if [ -f /etc/default/foo ]; then
. /etc/default/foo
fi
if [ -z $QUEUE_TYPE ]; then
echo 'ENV variable QUEUE_TYPE has not been set, please set it in /etc/default/foo'
exit 1
fi
get_pid() {
cat "$pidfile"
}
is_running() {
[ -f "$pidfile" ] && ps `get_pid` > /dev/null 2>&1
}
case "$1" in
start)
echo -n "Starting Consul daemon: "
#
daemon --pidfile $pidfile --check foo --user my-user "my app stuff here"
echo
;;
stop)
echo -n 'Stopping Consul daemon: '
killproc foo
echo
;;
status)
status $pidfile
RETVAL=$?
#status -p $pidfile -l $prog
#[ $RETVAL -eq 0 ] && RETVAL=$?
#RETVAL=$?
#if is_running; then
# echo 'Running'
#else
# echo 'Not Running'
#fi
#status foo
#RETVAL=$?
;;
restart)
$0 stop
$0 start
RETVAL=$?
;;
*)
echo 'Usage: foo {start|stop|status|restart}'
exit 1
esac
exit $RETVAL
When I run sudo service foo status it says that it hasn't been started which is correct. After running sudo service foo start and then running the status command, it tells me that the service hasn't been started. I'm not sure what is causing this to happen. I looked at the configurations for other init.d scripts to see how they were handling this and tried to follow their lead. Is there something obvious here that I'm doing wrong or something else that I may be unaware of that's causing this problem?
I am writing an init.d script for Kibana.
As of now the script partially works, but if I run service kibana start while the service is already running, a second instance starts. I want to add a check before starting the service so that a second instance is not started when one is already running. I tried adding an if check on "/var/lock/subsys/kibana", but it didn't work. Here is my script:
#!/bin/bash
KIBANA_PATH="/opt/kibana4"
DESC="Kibana Daemon"
NAME=kibana
DAEMON=bin/kibana
CONFIG_DIR=$KIBANA_PATH/config/kibana.yml
LOGFILE=/var/log/kibana/kibana.log
#ARGS="agent --config ${CONFIG_DIR} --log ${LOGFILE}"
SCRIPTNAME=/etc/init.d/kibana
PIDFILE=/var/run/kibana.pid
base=kibana
# Exit if the package is not installed
if [ ! -x "$KIBANA_PATH/$DAEMON" ]; then
{
echo "Couldn't find $DAEMON"
exit 99
}
fi
. /etc/init.d/functions
#
# Function that starts the daemon/service
#
do_start()
{
cd $KIBANA_PATH && \
($DAEMON >> $LOGFILE &) && \
success || failure;
}
set_pidfile()
{
pgrep -f "kibana.jar" > $PIDFILE
}
#
# Function that stops the daemon/service
#
do_stop()
{
pid=`cat $PIDFILE`
if checkpid $pid 2>&1; then
# TERM first, then KILL if not dead
kill -TERM $pid >/dev/null 2>&1
usleep 100000
if checkpid $pid && sleep 1 &&
checkpid $pid && sleep $delay &&
checkpid $pid ; then
kill -KILL $pid >/dev/null 2>&1
usleep 100000
fi
fi
checkpid $pid
RC=$?
[ "$RC" -eq 0 ] && failure $"$base shutdown" || success $"$base shutdown"
}
case "$1" in
start)
echo -n "Starting $DESC: "
do_start
touch /var/lock/subsys/$NAME
set_pidfile
;;
stop)
echo -n "Stopping $DESC: "
do_stop
rm /var/lock/subsys/$NAME
rm $PIDFILE
;;
restart|reload)
echo -n "Restarting $DESC: "
do_stop
do_start
touch /var/lock/subsys/$NAME
set_pidfile
;;
status)
echo $DESC
status -p $PIDFILE
echo $!
;;
*)
echo "Usage: $SCRIPTNAME {start|stop|status|restart}" >&2
exit 3
;;
esac
echo
exit 0
Any help here?
Thanks
Use lockfile -r0 /path/to/lock/file.lck when you start the service. With -r0, lockfile retries zero times, so it fails immediately if the lock file already exists. If the command fails, do nothing; otherwise, start the service.
lockfile -r0 /path/to/lock/file.lck
if [ "$?" == "0" ]; then
echo "lock does not exist. enter devils land :)"
fi
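The lockfile command ships with procmail and may not be installed everywhere. As an alternative sketch (not what the answer above uses), a lock can be taken atomically with mkdir, which fails if the directory already exists. The names acquire_lock, release_lock and start_once, and the lock path passed in, are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helpers: mkdir either creates the directory or fails,
# so only one caller can win the lock.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

release_lock() {
    rmdir "$1" 2>/dev/null
}

# Run a start command only if the lock could be taken.
# Usage: start_once LOCKDIR COMMAND [ARGS...]
start_once() {
    so_lock=$1
    shift
    if acquire_lock "$so_lock"; then
        # If the start command itself fails, drop the lock again.
        "$@" || { so_rc=$?; release_lock "$so_lock"; return "$so_rc"; }
    else
        echo "already running (lock $so_lock exists)" >&2
        return 1
    fi
}
```

The stop branch would call release_lock on the same path. A lock left behind by a crashed service still has to be cleared by hand or by a separate liveness check.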
The following is a pretty standard implementation of this feature used by most init.d scripts.
start () {
[ -d /var/run/nscd ] || mkdir /var/run/nscd
[ -d /var/db/nscd ] || mkdir /var/db/nscd
echo -n $"Starting $prog: "
daemon /usr/sbin/nscd $NSCD_OPTIONS
RETVAL=$?
echo
[ $RETVAL -eq 0 ] && touch /var/lock/subsys/nscd
return $RETVAL
}
...
# See how we were called.
case "$1" in
start)
[ -e /var/lock/subsys/nscd ] || start
RETVAL=$?
;;
...
I wrote a little program that I want to start as a service on openSUSE 11.3.
This excerpt is from the init.d script. It starts my processes as I want, but I don't get the right PID.
What am I missing?
echo "Starting DHCPALERT"
for (( i = 1; i <= $DHCP_AL_DAEMONS; i++ ))
do
var="DHCP_AL_$i"
START_CMD="exec /sbin/startproc -p "$DHCP_AL_PID_DIR"p_"$i".pid -l "$DHCP_AL_LOG_DIR"log_"$i".log "$DHCPALERT_BIN" "${!var}" &"
eval $START_CMD
echo "PID: "$!
echo "Command: "$START_CMD
done
results in
PID: 47347
Command: (/sbin/startproc -p /var/log/sthserver/dhcpalert/p_2.pid -l /var/log/sthserver/dhcpalert/log_2.log /usr/sbin/dhcpalert -i eth1 -c ./test.sh -a 00:15:5D:0A:16:07 -v )&
but pidof returns some other PID.
If I try to execute it directly:
exec /sbin/startproc -p "$DHCP_AL_PID_DIR"p_"$i".pid -l "$DHCP_AL_LOG_DIR"log_"$i".log "$DHCPALERT_BIN" "${!var}" &
Then I get errors:
startproc: exit status of parent of /usr/sbin/dhcpalert: 1
I suppose because I don't escape the variables the right way?
This is the whole script:
#!/bin/sh
# Check for missing binaries (stale symlinks should not happen)
# Note: Special treatment of stop for LSB conformance
DHCPALERT_BIN=/usr/sbin/dhcpalert
#-x FILE exists and is executable
test -x $DHCPALERT_BIN || { echo "$DHCPALERT_BIN not installed";
if [ "$1" = "stop" ]; then exit 0;
else exit 5; fi; }
# Check for existence of needed config file and read it
DHCPALERT_CONFIG=/etc/sysconfig/dhcpalert
#-r FILE exists and is readable
test -r $DHCPALERT_CONFIG || { echo "$DHCPALERT_CONFIG not existing";
if [ "$1" = "stop" ]; then exit 0;
else exit 6; fi; }
# Read config to system VARs for this shell session only same as "source FILE"
. $DHCPALERT_CONFIG
#check for existence of the log dir
if [ -d "$DHCP_AL_LOG_DIR" ]; then
echo "exists 1"
echo "exists 2"
echo "exists 3"
if [ "$1" = "start" ]; then
echo "Deleting all old log files from: "
echo "Dir:... "$DHCP_AL_LOG_DIR
rm -R $DHCP_AL_LOG_DIR
mkdir $DHCP_AL_LOG_DIR
fi
else
echo "does not exist 1"
echo "does not exist 2"
echo "does not exist 3"
echo "Directory for Logfiles does not exist."
echo "Dir:... "$DHCP_AL_LOG_DIR
echo "Creating dir..."
mkdir $DHCP_AL_LOG_DIR
fi
. /etc/rc.status
# Reset status of this service
rc_reset
case "$1" in
start)
echo "Starting DHCPALERT"
for (( i = 1; i <= $DHCP_AL_DAEMONS; i++ ))
do
var="DHCP_AL_$i"
exec /sbin/startproc -p "$DHCP_AL_PID_DIR"p_"$i".pid -l "$DHCP_AL_LOG_DIR"log_"$i".log "$DHCPALERT_BIN" "${!var}" &
# START_CMD="exec /sbin/startproc -p "$DHCP_AL_PID_DIR"p_"$i".pid -l "$DHCP_AL_LOG_DIR"log_"$i".log "$DHCPALERT_BIN" "${!var}" &"
# eval $START_CMD
echo "PID: "$!
# echo "Command: "$START_CMD
done
rc_status -v
;;
stop)
echo -n "Shutting down DHCPALERT "
/sbin/killproc -TERM $DHCPALERT_BIN
rc_status -v
;;
try-restart|condrestart)
if test "$1" = "condrestart"; then
echo "${attn} Use try-restart ${done}(LSB)${attn} rather than condrestart ${warn}(RH)${norm}"
fi
$0 status
if test $? = 0; then
$0 restart
else
rc_reset # Not running is not a failure.
fi
rc_status
;;
restart)
$0 stop
$0 start
rc_status
;;
force-reload)
echo -n "Reload service DHCPALERT "
/sbin/killproc -HUP $DHCPALERT_BIN
rc_status -v
;;
reload)
echo -n "Reload service DHCPALERT "
/sbin/killproc -HUP $DHCPALERT_BIN
rc_status -v
;;
status)
echo -n "Checking for service DHCPALERT "
/sbin/checkproc $DHCPALERT_BIN
rc_status -v
;;
probe)
test /etc/DHCPALERT/DHCPALERT.conf -nt /var/run/DHCPALERT.pid && echo reload
;;
*)
echo "Usage: $0 {start|stop|status|try-restart|restart|force-reload|reload|probe}"
exit 1
;;
esac
rc_exit
The configfile:
## Specify where to store the PID files
DHCP_AL_PID_DIR="/var/log/sthserver/dhcpalert/"
##
## Specify where to store the log file
DHCP_AL_LOG_DIR="/var/log/sthserver/dhcpalert/"
##
## is needed to determine how many vars should be read and started!
DHCP_AL_DAEMONS="2"
##
## Then DHCP_AL_<number> to specify the command that one instance of
## dhcpalert should be started
DHCP_AL_1="-i eth0 -c ./test.sh -a 00:15:5D:0A:16:06 -v"
DHCP_AL_2="-i eth1 -c ./test.sh -a 00:15:5D:0A:16:07 -v"
Add exec to it to prevent forking:
START_CMD="(exec /sbin/startproc -p "$DHCP_AL_PID_DIR"p_"$i".pid -l "$DHCP_AL_LOG_DIR"log_"$i".log "$DHCPALERT_BIN" "${!var}") &"
Update. Please try this script:
#!/bin/bash
# Check for missing binaries (stale symlinks should not happen)
# Note: Special treatment of stop for LSB conformance
DHCPALERT_BIN=/usr/sbin/dhcpalert
#-x FILE exists and is executable
[[ -x $DHCPALERT_BIN ]] || {
echo "$DHCPALERT_BIN not installed"
if [ "$1" = "stop" ]; then
exit 0
else
exit 5
fi
}
# Check for existence of needed config file and read it
DHCPALERT_CONFIG=/etc/sysconfig/dhcpalert
#-r FILE exists and is readable
[[ -r $DHCPALERT_CONFIG ]] || {
echo "$DHCPALERT_CONFIG not existing"
if [[ $1 == stop ]]; then
exit 0
else
exit 6
fi
}
# Read config to system VARs for this shell session only same as "source FILE"
. "$DHCPALERT_CONFIG"
#check for existence of the log dir
CREATE_DIR=false
if [[ -d $DHCP_AL_LOG_DIR ]]; then
echo "exists 1"
echo "exists 2"
echo "exists 3"
if [[ $1 == start ]]; then
echo "Deleting all old log files from: "
echo "Dir:... $DHCP_AL_LOG_DIR"
rm -R "$DHCP_AL_LOG_DIR"
CREATE_DIR=true
fi
else
echo "does not exist 1"
echo "does not exist 2"
echo "does not exist 3"
echo "Directory for Logfiles does not exist."
CREATE_DIR=true
fi
if [[ $CREATE_DIR == true ]]; then
echo "Dir:... $DHCP_AL_LOG_DIR"
echo "Creating dir..."
mkdir "$DHCP_AL_LOG_DIR" || {
echo "Failed to create directory $DHCP_AL_LOG_DIR"
exit 1
}
fi
. /etc/rc.status
# Reset status of this service
rc_reset
case "$1" in
start)
echo "Starting DHCPALERT"
for (( I = 1; I <= DHCP_AL_DAEMONS; ++I )); do
REF="DHCP_AL_${I}[@]"
COMMAND=(/sbin/startproc -p "${DHCP_AL_PID_DIR}p_${I}.pid" -l "${DHCP_AL_LOG_DIR}log_${I}.log" "$DHCPALERT_BIN" "${!REF}")
echo "COMMAND: ${COMMAND[*]}"
"${COMMAND[@]}" &
PID=$!
echo "PID: $PID"
done
rc_status -v
;;
stop)
echo -n "Shutting down DHCPALERT "
/sbin/killproc -TERM "$DHCPALERT_BIN"
rc_status -v
;;
try-restart|condrestart)
[[ $1 == condrestart ]] && echo "${attn} Use try-restart ${done}(LSB)${attn} rather than condrestart ${warn}(RH)${norm}"
"$0" status ## ??
if [[ $? -eq 0 ]]; then
"$0" restart
else
rc_reset # Not running is not a failure.
fi
rc_status
;;
restart)
"$0" stop
"$0" start
rc_status
;;
force-reload)
echo -n "Reload service DHCPALERT "
/sbin/killproc -HUP "$DHCPALERT_BIN"
rc_status -v
;;
reload)
echo -n "Reload service DHCPALERT "
/sbin/killproc -HUP "$DHCPALERT_BIN"
rc_status -v
;;
status)
echo -n "Checking for service DHCPALERT "
/sbin/checkproc "$DHCPALERT_BIN"
rc_status -v
;;
probe)
[[ /etc/DHCPALERT/DHCPALERT.conf -nt /var/run/DHCPALERT.pid ]] && echo reload
;;
*)
echo "Usage: $0 {start|stop|status|try-restart|restart|force-reload|reload|probe}"
exit 1
;;
esac
rc_exit
Config file:
## Specify where to store the PID files
DHCP_AL_PID_DIR="/var/log/sthserver/dhcpalert/"
##
## Specify where to store the log file
DHCP_AL_LOG_DIR="/var/log/sthserver/dhcpalert/"
##
## is needed to determine how many vars should be read and started!
DHCP_AL_DAEMONS="2"
##
## Then DHCP_AL_<number> to specify the command that one instance of
## dhcpalert should be started
DHCP_AL_1=(-i eth0 -c ./test.sh -a 00:15:5D:0A:16:06 -v)
DHCP_AL_2=(-i eth1 -c ./test.sh -a 00:15:5D:0A:16:07 -v)
When you put a shell command inside parentheses, you start a new subshell. You run this subshell in the background, and it is that subshell's process ID you get from $!.
There are two solutions. The first is to run the /sbin/startproc command directly, not in a subshell, and put that in the background. The second is to monitor the PID file created by /sbin/startproc.
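The second option can be sketched as a small polling loop. This is an illustration under assumptions: wait_for_pidfile is a hypothetical helper, the 0.1 second pause relies on a sleep that accepts fractional seconds (GNU coreutils does), and the pidfile path is whatever was passed to startproc -p.

```shell
#!/bin/sh
# Hypothetical helper: poll until PIDFILE exists and is non-empty,
# then print its first line. Gives up after roughly TIMEOUT seconds.
# Usage: wait_for_pidfile PIDFILE [TIMEOUT_SECONDS]
wait_for_pidfile() {
    wf_pidfile=$1
    # Ten polls per second; the timeout defaults to 5 seconds.
    wf_tries=$(( ${2:-5} * 10 ))
    while [ "$wf_tries" -gt 0 ]; do
        if [ -s "$wf_pidfile" ]; then
            wf_pid=
            read -r wf_pid < "$wf_pidfile" || :
            if [ -n "$wf_pid" ]; then
                printf '%s\n' "$wf_pid"
                return 0
            fi
        fi
        sleep 0.1
        wf_tries=$(( wf_tries - 1 ))
    done
    return 1
}
```

In the start loop from the question, this could be called after each startproc invocation, for example PID=$(wait_for_pidfile "${DHCP_AL_PID_DIR}p_${i}.pid"), so the value printed comes from the daemon's own pidfile and not from $!.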
I have this script which I would like to switch to the user "terraria" before starting the daemon. I can't figure out how to do it. My research brings me to bash scripts using su my_user -c, but I don't think that works in this case.
#!/bin/bash
# Terraria daemon
# chkconfig: 345 20 80
# description: Terraria Server
# processname: TerrariaServer.exe
DAEMON_PATH="/usr/Terraria"
DAEMON=TerrariaServer.exe
DAEMONOPTS="-world This_Land.wld -port 7777 "
NAME=TerrariaServer
DESC="Terraria Server"
PIDFILE=/var/run/TerrariaServer.pid
SCRIPTNAME=/etc/init.d/Terraria-Server
case "$1" in
start)
printf "%-50s" "Starting $NAME..."
cd $DAEMON_PATH
PID=`mono $DAEMON $DAEMONOPTS > /dev/null 2>&1 & echo $!`
#echo "Saving PID" $PID " to " $PIDFILE
if [ -z $PID ]; then
printf "%s\n" "Fail"
else
echo $PID > $PIDFILE
printf "%s\n" "Ok"
fi
;;
status)
printf "%-50s" "Checking $NAME..."
if [ -f $PIDFILE ]; then
PID=`cat $PIDFILE`
if [ -z "`ps axf | grep ${PID} | grep -v grep`" ]; then
printf "%s\n" "Process dead but pidfile exists"
else
echo "Running"
fi
else
printf "%s\n" "Service not running"
fi
;;
stop)
printf "%-50s" "Stopping $NAME"
PID=`cat $PIDFILE`
cd $DAEMON_PATH
if [ -f $PIDFILE ]; then
kill -HUP $PID
printf "%s\n" "Ok"
rm -f $PIDFILE
else
printf "%s\n" "pidfile not found"
fi
;;
restart)
$0 stop
$0 start
;;
*)
echo "Usage: $0 {status|start|stop|restart}"
exit 1
esac
Check out the following link for the 'DJB' way of starting up processes as other users:
http://thedjbway.b0llix.net/daemontools/uidgid.html
Also, see:
How to run a command as a specific user in an init script?
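As a rough sketch of the usual approach on CentOS-style systems, an init script running as root can hand a command line to runuser (or to su where runuser is missing). The helper name run_as is hypothetical, and the sketch assumes the script itself runs as root, since otherwise su will prompt for a password.

```shell
#!/bin/sh
# Hypothetical helper: run a command string as another user.
# Usage: run_as USER 'command line'
run_as() {
    ra_user=$1
    ra_cmd=$2
    if [ "$(id -un)" = "$ra_user" ]; then
        # Already the target user: no switch needed.
        sh -c "$ra_cmd"
    elif command -v runuser >/dev/null 2>&1; then
        # runuser (util-linux) does not ask for a password when run by root.
        runuser -l "$ra_user" -c "$ra_cmd"
    else
        # Fallback: -s forces a shell in case the account's login shell
        # is something like /sbin/nologin.
        su -s /bin/sh "$ra_user" -c "$ra_cmd"
    fi
}
```

The start branch of the script above could wrap its mono command line in run_as terraria '...'. How the PID is then captured depends on how the command is backgrounded, so that part is left open here.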