Systemd - detect in ExecStopPost whether service exited without error - linux

I have an application that, after it finishes and exits normally, should not be restarted. Once the app has done its business, I'd like to shut down the EC2 instance. I was thinking of doing this with a systemd unit file using the options
Restart=on-failure
ExecStopPost=/path/to/script.sh
The script that should run on ExecStopPost:
#!/usr/bin/env bash
# sleep 1; adding sleep didn't help
# this always comes out "deactivating"
service_status=$(systemctl is-failed app-importer)
# could also do it the other way round and check for "failed"
if [ "$service_status" = "inactive" ]
then
    echo "Service exited normally: $service_status . Shutting down..."
    #shutdown -t 5
else
    echo "Service did not exit normally - $service_status"
fi
exit 0
The problem is that when the ExecStopPost script runs I can't detect whether the service ended normally or not: at that point the status is still deactivating, and only afterwards do I know whether it enters a failed state.

Your problem is that systemd considers the service to be deactivating until the ExecStopPost process finishes. Putting sleeps in doesn't help, since that just makes it wait longer. The idea behind ExecStopPost is to clean up anything the service might leave behind, like temp files, UNIX sockets, etc. The service is not done, and not ready to start again, until that cleanup is finished. So what systemd is doing makes sense if you look at it that way.
What you should do instead is check $SERVICE_RESULT, $EXIT_CODE and/or $EXIT_STATUS in your script; these tell you how the service stopped. Example:
#!/bin/sh
echo running exec post script | logger
systemctl is-failed foobar.service | logger
echo $SERVICE_RESULT, $EXIT_CODE and $EXIT_STATUS | logger
When the service is allowed to run to completion:
Sep 17 05:58:14 systemd[1]: Started foobar.
Sep 17 05:58:17 root[1663]: foobar service will now exit
Sep 17 05:58:17 root[1669]: running exec post script
Sep 17 05:58:17 root[1671]: deactivating
Sep 17 05:58:17 root[1673]: success, exited and 0
And when the service is stopped before it finishes:
Sep 17 05:57:22 systemd[1]: Started foobar.
Sep 17 05:57:24 systemd[1]: Stopping foobar...
Sep 17 05:57:24 root[1643]: running exec post script
Sep 17 05:57:24 root[1645]: deactivating
Sep 17 05:57:24 root[1647]: success, killed and TERM
Sep 17 05:57:24 systemd[1]: Stopped foobar.
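Applied to the original question, a minimal sketch of the ExecStopPost script (assuming the unit is app-importer, as above, and that the goal is to power the instance off only after a clean exit) could look like this:
#!/bin/sh
# $SERVICE_RESULT is "success" when the unit stopped cleanly;
# $EXIT_CODE and $EXIT_STATUS give the details (e.g. "exited" and "0").
echo "app-importer stopped: $SERVICE_RESULT ($EXIT_CODE/$EXIT_STATUS)" | logger
if [ "$SERVICE_RESULT" = "success" ]; then
    echo "Service exited normally. Shutting down..." | logger
    # shutdown -h now   # uncomment to actually power off the instance
else
    echo "Service did not exit normally ($SERVICE_RESULT), leaving the instance up" | logger
fi
exit 0
With Restart=on-failure in the unit, a failed run gets restarted rather than shutting the machine down, so the shutdown branch only fires after a successful exit.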

Related

"Ambiguous redirect" error when executing a script

With the code snippet below, I keep getting this error message:
#!/bin/sh
java -jar /opt/wiremock/wiremock-standalone-2.7.1.jar -port 9393 --container-threads 200 --verbose >> /opt/wiremock/wiremock.log 2>&1
Error:
bash: /opt/wiremock/script.sh: line 2: 1#015: ambiguous redirect
Feb 8 17:11:27 ssa2 systemd: wiremock.service: main process exited, code=exited, status=1/FAILURE
Feb 8 17:11:27 ssa2 systemd: Unit wiremock.service entered failed state.
Feb 8 17:11:27 ssa2 systemd: wiremock.service failed.
I have tried commenting out the redirection as in:
java -jar /opt/wiremock/wiremock-standalone-2.7.1.jar -port 9393 --container-threads 200 --verbose #>> /opt/wiremock/wiremock.log 2>&1
and it works properly. However, I would like to make the redirection work so that I have the logs in the specified file.
The first line of your error message is telling you that you have a carriage return at the end of the line (#015 is the octal code for \r), which usually means the script was saved with Windows-style CRLF line endings.
You have several ways to solve the problem:
Make sure your script has Unix-style line endings (e.g. run it through dos2unix)
Add a statement terminator to the end of each command, such as ; # (so the stray carriage return ends up inside a comment)
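For example, a quick way to confirm and strip the carriage returns (using the script path from the error message; dos2unix does the same job if it's installed):
# CRLF lines show up ending in ^M$ instead of just $
cat -A /opt/wiremock/script.sh
# strip the carriage returns in place
sed -i 's/\r$//' /opt/wiremock/script.sh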

Process (re)starting a Docker container from systemd exits unexpectedly

I have a Docker container. When the Linux server restarts, the container is stopped, so I added a script to systemd to restart the container. This script also stops the chef-client. But the script executes only half of its commands; I don't know why it stops after the chef-client stop and does not proceed any further.
Restart script:
[root@server01 user1]# more /apps/service-scripts/docker-container-restart.sh
#!/bin/bash
echo "Starting to stop the chef-client automatic running.."
service chef-client stop
echo "Completed the stopping the chef-client automatic running"
echo "Restart check... the Applicaiton Container $(date)"
IsAppRunning=$(docker inspect -f '{{.State.Running}}' app-prod)
echo "IsAppRunning state $IsAppRunning"
if [ "$IsAppRunning" != "true" ]; then
IsAppRunning=$(docker inspect -f '{{.State.Running}}' app-prod)
echo "Restarting.... the Applicaiton Container $(date)"
docker restart app-prod
IsAppRunning=$(docker inspect -f '{{.State.Running}}' app-prod)
echo "Restart completed($IsAppRunning) the Applicaiton Container $(date)"
else
echo "Restart is not required, app is already up and running"
fi
Systemctl log:
[root@server01 user1]# systemctl status app-docker-container.service
● app-docker-container.service - Application start
Loaded: loaded (/etc/systemd/system/app-docker-container.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2017-09-24 16:00:40 CDT; 18h ago
Main PID: 1187 (docker-container-restart)
Memory: 16.4M
CGroup: /system.slice/app-docker-container.service
├─1187 /bin/bash /apps/service-scripts/docker-container-restart.sh
└─1220 docker inspect -f {{.State.Running}} app-prod
Sep 24 16:00:40 server01 systemd[1]: Started Application start.
Sep 24 16:00:40 server01 systemd[1]: Starting Application start...
Sep 24 16:00:40 server01 docker-container-restart.sh[1187]: Starting to stop the chef-client automatic running..
Sep 24 16:00:41 server01 docker-container-restart.sh[1187]: Redirecting to /bin/systemctl stop chef-client.service
Sep 24 16:00:41 server01 docker-container-restart.sh[1187]: Completed the stopping the chef-client automatic running
Sep 24 16:00:41 server01 docker-container-restart.sh[1187]: Restart check... the Application Container Sun Sep 24 16:00:41 CDT 2017
SystemD:
[root@server01 user1]# more /etc/systemd/system/app-docker-container.service
[Unit]
Description=Application start
After=docker.service,chef-client.service
[Service]
Type=simple
ExecStart=/apps/service-scripts/docker-container-restart.sh
[Install]
WantedBy=multi-user.target

Converting watch into a systemd unit file

I've got a shell script as follows
ss.sh
#!/bin/bash
opFile="custom.data"
sourceFile="TestOutput"
./fc app test > $sourceFile
grep -oP '[0-9.]+(?=%)|[0-9.]+(?=[A-Z]+ of)' "$sourceFile" | tr '\n' ',' > $opFile
sed -i 's/,$//' $opFile
The requirement is that I need to run this script with the watch command, and I'd like to turn this into a systemd service. I did it like so.
sc.sh
#!/bin/bash
watch -n 60 /root/ss.sh
And in my /etc/systemd/system,
log_info.service
[Unit]
Description="Test Desc"
After=network.target
[Service]
ExecStart=/root/sc.sh
Type=simple
[Install]
WantedBy=default.target
When I run systemctl start log_info.service, it runs, but not continuously the way I'd like it to.
On running systemctl status log_info.service,
info_log.service - "Test Desc"
Loaded: loaded (/etc/systemd/system/info_log.service; disabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2016-09-12 08:17:02 UTC; 2min 18s ago
Process: 35555 ExecStart=/root/sc.sh (code=exited, status=1/FAILURE)
Main PID: 35555 (code=exited, status=1/FAILURE)
Sep 12 08:17:02 mo-b428aa6b4 systemd[1]: Started "Test Desc".
Sep 12 08:17:02 mo-b428aa6b4 sc.sh[35654]: Error opening terminal: unknown.
Sep 12 08:17:02 mo-b428aa6b4 systemd[1]: info_log.service: Main process exited, code=exited, status=1/FAILURE
Sep 12 08:17:02 mo-b428aa6b4 systemd[1]: info_log.service: Unit entered failed state.
Sep 12 08:17:02 mo-b428aa6b4 systemd[1]: info_log.service: Failed with result 'exit-code'.
Any ideas as to why it's not running right? Any help would be appreciated!
The reason for this failure, which I learnt on Super User, was exactly what was in my error console, i.e.
Error opening terminal: unknown
The watch command can only be executed from a terminal, because it needs a terminal to draw to, and services don't have one.
A possible alternative to watch is a tool that doesn't require a terminal, such as screen or tmux. Another alternative, which worked for me and was suggested by grawity on Super User, was a timer unit:
# foo.timer
[Unit]
Description=Do whatever
[Timer]
OnActiveSec=60
OnUnitActiveSec=60
[Install]
WantedBy=timers.target
The logic behind this is that the actual requirement was to run the script every 60 seconds, not to use watch. Hence, grawity suggested a timer unit that triggers the service every 60 seconds instead. If the service unit has a different name from the timer unit, it can be pointed to with Unit= in the [Timer] section.
Hope this helped, and +1 to grawity and Eric Renouf from Super User for their answers!
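For completeness, a sketch of the matching service unit the timer would trigger (the names here are illustrative; a foo.timer activates foo.service by default):
# foo.service
[Unit]
Description=Run ss.sh once

[Service]
Type=oneshot
ExecStart=/root/ss.sh
Note that it's the timer, not the service, that gets enabled: systemctl enable --now foo.timer.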

Two instances of node started on linux

I have a node.js server app which is being started twice for some reason. I have a cron job that runs every minute, checks for a node main.js process and, if it's not found, starts it. The crontab entry looks like this:
* * * * * ~/startmain.sh >> startmain.log 2>&1
And the startmain.sh file looks like this:
if ps -ef | grep -v grep | grep "node main.js" > /dev/null
then
    echo "`date` Server is running."
else
    echo "`date` Server is not running! Starting..."
    sudo node main.js > main.log
fi
The log file storing the output of startmain.sh shows this:
Fri Aug 8 19:22:00 UTC 2014 Server is running.
Fri Aug 8 19:23:00 UTC 2014 Server is running.
Fri Aug 8 19:24:00 UTC 2014 Server is not running! Starting...
Fri Aug 8 19:25:00 UTC 2014 Server is running.
Fri Aug 8 19:26:00 UTC 2014 Server is running.
Fri Aug 8 19:27:00 UTC 2014 Server is running.
That is what I expect, but when I look at the processes, it seems that two are running: one under sudo and one without. Check out the top two processes:
$ ps -ef | grep node
root 99240 99232 0 19:24:01 ? 0:01 node main.js
root 99232 5664 0 19:24:01 ? 0:00 sudo node main.js
admin 2777 87580 0 19:37:41 pts/1 0:00 grep node
Indeed, when I look at the application logs, I see startup entries happening in duplicate. To kill these processes, I have to use sudo, even for the process that does not start with sudo. When I kill one of these, the other one dies too.
Any idea why I am kicking off two processes?
First, you are starting your node main.js application with sudo in the script startmain.sh. According to the sudo man page:
When sudo runs a command, it calls fork(2), sets up the execution environment as described above, and calls the execve system call in the child process. The main sudo process waits until the command has completed, then passes the command's exit status to the security policy's close method and exits.
So, in your case the process with name sudo node main.js is the sudo command itself and the process node main.js is the node.js app. You can easily verify this - run ps auxfw and you will see that the sudo node main.js process is the parent process for node main.js.
Another way to verify this is to run lsof -p [process id] and see that the txt part for the process sudo node main.js states /usr/bin/sudo while the txt part of the process node main.js will display the path to your node binary.
The bottom line is that your node.js app is not actually running twice: the second entry is just the sudo parent process waiting for it to finish.
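To see the relationship for yourself, something along these lines works (the ps output in the question already hints at it: node main.js has parent PID 99232, which is the sudo process):
# forest view: sudo appears as the parent of node main.js
ps auxfw | grep "[n]ode main.js"
# or print pid/ppid pairs explicitly for both commands
ps -o pid,ppid,user,cmd -C sudo
ps -o pid,ppid,user,cmd -C node
The extra entry disappears if the cron job already runs with the required privileges, since the sudo wrapper is then unnecessary.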

Upstart respawning healthy process

I'm having an issue where upstart is respawning a Node.js (v0.8.8) process that is completely healthy. I'm on Ubuntu 11.10. When I run the program from the command line it is completely stable and does not crash. But when I run it with upstart, it gets respawned pretty consistently every few seconds. I'm not sure what is going on and none of the logs seem to help. In fact, there are no error messages produced in any of the upstart logs for the job. Below is my upstart script:
#!upstart
description "server.js"
start on (local-filesystems and net-device-up IFACE=eth0)
stop on shutdown
# Automatically respawn
respawn # restart when job dies
respawn limit 99 5 # give up restart after 99 respawns in 5 seconds

script
    export HOME="/home/www-data"
    exec sudo -u www-data NODE_ENV="production" /usr/local/bin/node /var/www/server/current/server.js >> /var/log/node.log 2>> /var/log/node.error.log
end script

post-start script
    echo "server-2 has started!"
end script
The strange thing is that server-1 works perfectly fine and is setup the same way.
syslog messages look like this:
Sep 24 15:40:28 domU-xx-xx-xx-xx-xx-xx kernel: [5272182.027977] init: server-2 main process (3638) terminated with status 1
Sep 24 15:40:35 domU-xx-xx-xx-xx-xx-xx kernel: [5272189.039308] init: server-2 main process (3647) terminated with status 1
Sep 24 15:40:42 domU-xx-xx-xx-xx-xx-xx kernel: [5272196.050805] init: server-2 main process (3656) terminated with status 1
Sep 24 15:40:49 domU-xx-xx-xx-xx-xx-xx kernel: [5272203.064022] init: server-2 main process (3665) terminated with status 1
Any help would be appreciated. Thanks.
Ok, seems that it was actually monit that was restarting it. Problem has been solved. Thanks.
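If you run into something similar, a quick way to check whether monit (or another supervisor) is the one doing the restarting is to look at what it monitors; the paths below are the usual Debian/Ubuntu defaults and may differ on your system:
# list what monit is currently watching
sudo monit summary
# look for a check that references the node app
grep -ri "server.js" /etc/monit/ 2>/dev/null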
