Systemd restarts my process which is not dead - linux

I have the following systemd drop-in, /etc/systemd/system/getty@tty1.service.d/override.conf:
[Service]
ExecStart=
ExecStart=-/home/auto/script.sh
Type=simple
StandardInput=tty
StandardOutput=tty
The point is that the user turns on the computer and can manage a few things on it without needing to log in.
Systemd starts the script and it works fine. But after a few minutes systemd restarts "script.sh" for no apparent reason. I think the problem is that "script.sh" starts some child processes and systemd does not like that.
After a restart I can find these lines in syslog:
Sep 25 12:33:32 hostname systemd[1]: getty@tty1.service: Service has no hold-off time, scheduling restart.
Sep 25 12:33:32 hostname systemd[1]: getty@tty1.service: Scheduled restart job, restart counter is at 1.
Sep 25 12:33:32 hostname systemd[1]: Stopped Getty on tty1.
Sep 25 12:33:32 hostname systemd[1]: getty@tty1.service: Found left-over process 1711 (docker) in control group while starting unit. Ignoring.
Sep 25 12:33:32 hostname systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
I tried a lot of things, such as Type=forking or RestartSec=86400s, but systemd still restarts script.sh.
Any ideas?

Related

Systemd "OnFailure=" not starting when binary or bash exits with an error code

I have a systemd unit that needs to be monitored and restarted if it crashes, and I also need something to run when the unit fails. I'm working on an embedded system, so this needs to be robust.
In my case we have a systemd service:
[Unit]
Description=Demo unit
Wants=multi-user.target
OnFailure=FailHandler@%N.service
[Service]
ExecStart=/bin/bash /home/root/demo.sh
Restart=on-failure
RestartSec=1
Type=simple
The bash script I start:
echo "Started demo.sh"
current_date=`date`
sleep 10s
echo "${current_date} Demo was here" >> /home/root/demo.txt
exit 1
So far so good. The script always exits with status 1 after 10 seconds and logs the time. The problem is that FailHandler is never called in that case. This is just a demo; all of the real applications are written in C++, but the behavior is the same. If I deliberately set a wrong path to the script, the unit fails and the "OnFailure" part does start. Here's the syslog output with the correct path:
2021-09-03T13:06:31.575094+00:00 hostname bash[1125]: Started demo.sh
2021-09-03T13:06:41.629450+00:00 hostname systemd[1]: demo.service: Main process exited, code=exited, status=1/FAILURE
2021-09-03T13:06:41.644681+00:00 hostname systemd[1]: demo.service: Failed with result 'exit-code'.
2021-09-03T13:06:41.818089+00:00 hostname systemd[1]: demo.service: Service RestartSec=100ms expired, scheduling restart.
2021-09-03T13:06:41.824005+00:00 hostname systemd[1]: demo.service: Scheduled restart job, restart counter is at 1.
2021-09-03T13:06:41.850933+00:00 hostname bash[1179]: Started demo.sh
2021-09-03T13:06:51.870376+00:00 hostname systemd[1]: demo.service: Main process exited, code=exited, status=1/FAILURE
2021-09-03T13:06:51.872611+00:00 hostname systemd[1]: demo.service: Failed with result 'exit-code'.
2021-09-03T13:06:52.117479+00:00 hostname systemd[1]: demo.service: Service RestartSec=100ms expired, scheduling restart.
2021-09-03T13:06:52.136102+00:00 hostname systemd[1]: demo.service: Scheduled restart job, restart counter is at 2.
2021-09-03T13:06:52.163865+00:00 hostname bash[1221]: Started demo.sh
Here's the output when the path is incorrect:
2021-09-03T13:07:46.582269+00:00 hostnaem bash[1446]: /bin/bash: /ahome/root/daemo.sh: No such file or directory
2021-09-03T13:07:46.588715+00:00 hostnaem systemd[1]: daemo.service: Main process exited, code=exited, status=127/n/a
2021-09-03T13:07:46.590356+00:00 hostnaem systemd[1]: daemo.service: Failed with result 'exit-code'.
2021-09-03T13:07:46.694616+00:00 hostnaem systemd[1]: daemo.service: Service RestartSec=100ms expired, scheduling restart.
2021-09-03T13:07:46.701519+00:00 hostnaem systemd[1]: daemo.service: Scheduled restart job, restart counter is at 1.
2021-09-03T13:07:46.720879+00:00 hostnaem systemd[1]: daemo.service: Start request repeated too quickly.
2021-09-03T13:07:46.721405+00:00 hostnaem systemd[1]: daemo.service: Failed with result 'exit-code'.
2021-09-03T13:07:46.722723+00:00 hostnaem systemd[1]: daemo.service: Triggering OnFailure= dependencies.
2021-09-03T13:07:46.804815+00:00 hostnaem FailHandler.sh[1457]: Failed application: daemo
2021-09-03T13:07:46.822342+00:00 hostnaem bash[1457]: error: cannot stat /etc/logrotate.d/daemo: No such file or directory
2021-09-03T13:07:46.841577+00:00 hostnaem FailHandler.sh[1457]: ERROR: Failed logrotate for daemo crash
2021-09-03T13:07:46.977003+00:00 hostnaem systemd[1]: FailHandler@daemo.service: Succeeded.
I understand from the syslog that FailHandler is started once the number of restarts reaches StartLimitBurst=1 within 100ms, but is there a way to start it any time the application exits with an error code?
Thank you, man. I took one look at the link you sent and it clicked. The solution in my case was:
ExecStopPost=/bin/bash -c 'if [ "$$EXIT_STATUS" != 0 ]; then systemctl start FailHandler@%N.service; fi'
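For reference, here is a minimal sketch of the same check written as a standalone script instead of an inline one-liner. systemd exports SERVICE_RESULT, EXIT_CODE and EXIT_STATUS to ExecStopPost= commands. Testing SERVICE_RESULT rather than EXIT_STATUS is my own variation, not what the solution above does: EXIT_STATUS holds a signal name such as TERM when the process is stopped by a signal, so a comparison against 0 also fires on a clean stop. The function name and the handler invocation in the comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper for an ExecStopPost= script.
# Returns 0 (true) when the service result reported by systemd is a failure.
should_handle_failure() {
    # $1: value of SERVICE_RESULT ("success", "exit-code", "signal", ...)
    [ "$1" != "success" ]
}

# systemd sets SERVICE_RESULT for ExecStopPost= commands; default to
# "success" so the script does nothing when run outside systemd.
if should_handle_failure "${SERVICE_RESULT:-success}"; then
    echo "service failed: result=${SERVICE_RESULT} status=${EXIT_STATUS:-unknown}"
    # A real script would start the handler here, for example (assuming the
    # unit name is passed as the first argument via %N):
    # systemctl start "FailHandler@${1}.service"
fi
```

It could be wired in with a line such as ExecStopPost=/path/to/script %N, where the path is a placeholder.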

Systemd service won't execute at boot

I've created my own service with systemd. It is supposed to run a Python script once at boot time. The script sends an email with the IP address and the TeamViewer ID, which is why I have a delay in it; otherwise I get an error that the domain of the mail server can't be resolved. The script should run in the background because of the 30-second delay.
The script is located at /usr/bin/glatv.py and is executable; run by hand, it works without a problem. The setup is running on a Raspberry Pi 4 with Raspbian Buster 2020-02-13.
The service file is located in /etc/systemd/system/, is executable and enabled:
[Unit]
Description=My Own Service
[Service]
Type=oneshot
ExecStart=/usr/bin/glatv.py &
[Install]
WantedBy=reboot.target
Starting it manually with
systemctl start myservice
works without a problem:
● glatvd.service - My Own Service
Loaded: loaded (/etc/systemd/system/glatvd.service; enabled; vendor preset: enabled)
Active: inactive (dead)
Apr 02 12:52:31 raspberrypi systemd[1]: Starting My Own Service...
Apr 02 12:53:02 raspberrypi systemd[1]: glatvd.service: Succeeded.
Apr 02 12:53:02 raspberrypi systemd[1]: Started My Own service.
After a reboot, however, there is no call and no log entry.
Instead of having an arbitrary 30-second delay, add this to your service file:
After=network-online.target
Wants=network-online.target
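To show where those two lines go, here is a minimal sketch that writes a complete unit into a scratch directory so it can be inspected before being copied to /etc/systemd/system/. The unit name glatvd.service is taken from the status output in the question. WantedBy=multi-user.target and dropping the trailing & are my assumptions, not part of the answer above; systemd does not run ExecStart= through a shell, so the & would be passed to the script as an argument.

```shell
#!/bin/sh
# Write the sketched unit into a scratch directory (not /etc) for review.
unit_dir="$(mktemp -d)"
unit_file="$unit_dir/glatvd.service"

cat > "$unit_file" <<'EOF'
[Unit]
Description=My Own Service
# Order after, and pull in, the "network is up" synchronization point.
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
# No trailing "&": systemd does not interpret shell syntax here.
ExecStart=/usr/bin/glatv.py

[Install]
# Assumption: a normal boot target instead of reboot.target.
WantedBy=multi-user.target
EOF

echo "unit written to $unit_file"
```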
You should run this command to enable your service so that it starts after a reboot:
systemctl enable myservice
As for logging, I believe you must put these parameters into your service's config file:
StandardOutput=file:/path/to/info/log/info_log.log
StandardError=file:/path/to/error/log/error_log.log
I took this from: How to redirect output of systemd service to a file

View journalctl logs of running service

I'm using sudo journalctl -u some-service -f to see the logs of a service. However, when the service is running, the last log is always of the form:
***Plenty of logs from previous instance of Some Service***
Apr 14 09:03:05 user-computer systemd[1]: Stopped Some Service.
Apr 14 09:03:35 user-computer systemd[1]: Started Some Service.
What I expect to see:
***Plenty of logs from previous instance of Some Service***
Apr 14 09:03:05 user-computer systemd[1]: Stopped Some Service.
Apr 14 09:03:35 user-computer systemd[1]: Started Some Service.
***New logs from current instance of Some Service***
Why can't I see the logs from the currently running instance of the service? Only once I stop or restart the service can I see the logs from that instance.
Please help.
The problem was application specific and had nothing to do with systemd or journalctl.
For my Python services, I had to disable the output buffer:
[Service]
StandardOutput=journal
StandardError=journal
Environment="PYTHONUNBUFFERED=x"
For my PHP application, I simply had to add a newline character before the content got flushed.

Systemd script fail

I want to run a script at system startup in a Debian 9 box. My script works when run standalone, but fails under systemd.
My script just copies a backup file from a remote server to the local machine:
#!/bin/sh
set -e
/usr/bin/sshpass -p "PASSWORD" /usr/bin/scp -p USER@10.0.0.2:ORIGINPATH/backupserver.zip DESTINATIONPATH/backupserver/
Just for privacy I replaced password, user, and paths above.
I wrote the following systemd service unit:
[Unit]
Description=backup script
[Service]
Type=oneshot
ExecStart=PATH/backup.sh
[Install]
WantedBy=default.target
Then I set permissions for the script:
chmod 744 PATH/backup.sh
And installed the service:
chmod 664 /etc/systemd/system/backup.service
systemctl daemon-reload
systemctl enable backup.service
When I reboot the script fails:
● backup.service - backup script
Loaded: loaded (/etc/systemd/system/backup.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sat 2017-05-13 13:39:54 -03; 47min ago
Main PID: 591 (code=exited, status=1/FAILURE)
Result of journalctl -xe:
mai 16 23:34:27 rodrigo-acer systemd[1]: backup.service: Main process exited, code=exited, status=6/NOTCONFIGURED
mai 16 23:34:27 rodrigo-acer systemd[1]: Failed to start backup script.
mai 16 23:34:27 rodrigo-acer systemd[1]: backup.service: Unit entered failed state.
mai 16 23:34:27 rodrigo-acer systemd[1]: backup.service: Failed with result 'exit-code'.
What could be wrong?
Solved, guys. There were two problems:
1 - I had to change the service unit file to make the service run only after the network was up. The [Unit] section was changed to:
[Unit]
Description = World server backup
Wants = network-online.target
After = network.target network-online.target
2 - The root user did not have the remote host added to the known host list, unlike the ordinary user I used to test the script.
For Failed with result 'exit-code', you could try adding this as the last line of your script:
# REQUIRED FOR SYSTEMD: 0 means clean no error
exit 0
You may also need to add:
Type=forking
to the systemd entry similar to: https://serverfault.com/questions/751030/systemd-ignores-return-code-while-starting-service
If your service or script does not fork, add a & at the end to run it in the background and exit with 0 quickly. Otherwise the startup will appear to hang until it times out, as if the service were frozen.

Systemd Service for jar file gets "operation timed out" error after few minutes or stays in "activating mode"

The service unit is:
[Unit]
Description=test
After=syslog.target
After=network.target
[Service]
Type=forking
ExecStart=/bin/java -jar /home/ec2-user/test.jar
TimeoutSec=300
[Install]
WantedBy=multi-user.target
It starts fine and runs for 1-4 minutes, but then it fails:
tail /var/log/messages:
Feb 27 18:43:44 ip-172-31-40-48 systemd: Reloading.
Feb 27 18:44:06 ip-172-31-40-48 systemd: Starting test...
Feb 27 18:44:06 ip-172-31-40-48 java: 5.1.73
Feb 27 18:44:06 ip-172-31-40-48 java: Starting the internal [HTTP/1.1] server on port 8182
Feb 27 18:49:06 ip-172-31-40-48 systemd: test.service operation timed out.Terminating.
Feb 27 18:49:06 ip-172-31-40-48 systemd: test.service: control process exited, code=exited status=143
Feb 27 18:49:06 ip-172-31-40-48 systemd: Failed to start test.
Feb 27 18:49:06 ip-172-31-40-48 systemd: Unit test.service entered failed state.
systemctl status test.service (while restarting- stays in activating mode):
test.service - Setsnew
Loaded: loaded (/etc/systemd/system/test.service; enabled)
Active: activating (start) since Sun 2015-03-01 14:29:36 EST; 2min 30s ago
Control: 32462 (java)
CGroup: /system.slice/test.service
systemctl status test.service (after fail):
test.service - test
Loaded: loaded (/etc/systemd/system/test.service; enabled)
Active: failed (Result: exit-code) since Fri 2015-02-27 18:49:06 EST; 18min ago
Process: 27954 ExecStart=/bin/java -jar /home/ec2-user/test.jar (code=exited, status=143)
When running the jar from the command line, it works just fine.
I tried changing the jar location because I thought it was a permissions problem.
SELinux is off.
How can I fix this issue so I can start the jar on boot? Are there any alternatives? (RHEL 7 does not include the service command.)
You made the service type forking, but this service does not fork. It just runs directly. Thus systemd waited five minutes for the program to daemonize itself, and it never did. The correct type for such a service is simple.
You also disabled SELinux, which is another problem you should resolve.
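Regarding the service type, here is a sketch of the unit with that one change applied, written to a scratch directory for review rather than straight into /etc/systemd/system/. Every line other than Type= is copied from the unit in the question; the scratch-directory approach is just my way of making the example safe to run.

```shell
#!/bin/sh
# Write the sketched unit into a scratch directory (not /etc) for review.
unit_dir="$(mktemp -d)"
unit_file="$unit_dir/test.service"

cat > "$unit_file" <<'EOF'
[Unit]
Description=test
After=syslog.target
After=network.target

[Service]
# The JVM stays in the foreground, so systemd should track it directly.
Type=simple
ExecStart=/bin/java -jar /home/ec2-user/test.jar
TimeoutSec=300

[Install]
WantedBy=multi-user.target
EOF

echo "unit written to $unit_file"
```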
