Why does a systemd service on RHEL 8 (AWS) throw a "No such file or directory" error even when the file exists?

I have a systemd service defined on RHEL 8:
[Unit]
Description=Apache Kafka - ZooKeeper
Documentation=http://docs.confluent.io/
After=network.target
[Service]
Type=simple
EnvironmentFile=/app/bin/confluent/etc/kafka/zookenv.properties
User=kafka
Group=kafka
ExecStart=/app/bin/confluent/bin/zookeeper-server-start /app/bin/confluent/etc/kafka/zookeeper.properties
TimeoutStopSec=180
Restart=no
[Install]
WantedBy=multi-user.target
When I start this service, I get the following error in journalctl:
Jul 09 12:00:51 10.204.142.111 systemd[1]: confluent-zookeeper.service: Failed to load environment files: No such file or directory
Jul 09 12:00:51 10.204.142.111 systemd[1]: confluent-zookeeper.service: Failed to run 'start' task: No such file or directory
Jul 09 12:00:51 10.204.142.111 systemd[1]: confluent-zookeeper.service: Failed with result 'resources'.
The environment file exists at that path, and so do the start script and the properties file.
This is on RHEL 8 on AWS, and I am trying this for the first time.
The component starts fine when I run the start script manually from the command line.

Check that the path and file for
EnvironmentFile=/app/bin/confluent/etc/kafka/zookenv.properties
are correct
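One way to check the path exactly as systemd reads it is to print the EnvironmentFile= line with non-printing characters made visible. This is only a sketch: it assumes GNU coreutils (`cat -A`), the function name `show_env_file_lines` is made up for illustration, and the unit file location in the usage comment is a guess. A `^M` (carriage return) or a stray space before the `$` end-of-line marker would mean the configured path is not the same string as the file name on disk.

```shell
#!/bin/sh
# Hypothetical helper: print every EnvironmentFile= line of a unit file
# with non-printing characters made visible (GNU `cat -A`).
# A carriage return shows up as ^M, and $ marks the real end of the line.
show_env_file_lines() {
    grep -n '^EnvironmentFile=' "$1" | cat -A
}

# Usage (the unit file location is an assumption, adjust it to yours):
#   show_env_file_lines /etc/systemd/system/confluent-zookeeper.service
```

If the output ends in anything other than the expected file name followed directly by `$`, the unit file probably needs to be cleaned up and reloaded with systemctl daemon-reload.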

In my case, I had my service file like this
ExecStart=/app/bin/confluent/bin/zookeeper-server-start
EnvironmentFile=/app/bin/confluent/etc/kafka/zookenv.properties
I changed it to
EnvironmentFile=/app/bin/confluent/etc/kafka/zookenv.properties
ExecStart=/app/bin/confluent/bin/zookeeper-server-start
Then I ran:
systemctl daemon-reload &&
systemctl restart service-name.service
Initially I was using `systemctl start service-name.service`, and I guess that didn't make systemd re-read the environment file properly.

Related

Linux service "Failed with result 'start-limit-hit'." error while trying to run jar file from automated script?

I have some scripts to restart a Java jar service on a Linux machine, taken from this post.
Here is my restarter service unit:
[Unit]
Description=demo restarter
After=network.target
[Service]
Type=oneshot
ExecStart=systemctl stop demo.service
StartLimitIntervalSec=0
StartLimitBurst=0
ExecStartPost=systemctl start demo.service
[Install]
WantedBy=multi-user.target
If I manually add or remove a file in the directory, it works fine. But my main purpose is deploying a jar file from Bitbucket to a droplet Linux VM, and when the jar comes from Bitbucket I get this error:
Sep 08 00:09:06 ubuntu-1cpu systemd[1]: Failed to start DEMOOOOOO Spring Boot application service.
Sep 08 00:09:06 ubuntu-1cpu systemd[1]: demo.service: Start request repeated too quickly.
Sep 08 00:09:06 ubuntu-1cpu systemd[1]: demo.service: Failed with result 'start-limit-hit'.
Sep 08 00:09:06 ubuntu-1cpu systemd[1]: Failed to start DEMOOOOOO Spring Boot application service.
Sep 08 00:09:06 ubuntu-1cpu systemd[1]: demo.service: Start request repeated too quickly.
Sep 08 00:09:06 ubuntu-1cpu systemd[1]: demo.service: Failed with result 'start-limit-hit'.
Sep 08 00:09:06 ubuntu-1cpu systemd[1]: Failed to start DEMOOOOOO Spring Boot application service.
I set
StartLimitIntervalSec=0
StartLimitBurst=0
as suggested in some posts, but I still get the "demo.service: Failed with result 'start-limit-hit'." error.
How can I solve this problem? (I suspect that deploying the jar from Bitbucket overwrites the original jar, and maybe that causes the problem?)
And here is the service file:
[Unit]
Description=DEMOOOOOO Spring Boot application service
After=network.target
[Service]
#User=ubuntu
Type=simple
ExecStart=java -jar /root/artifacts2/demo-0.0.1-SNAPSHOT.jar
TimeoutStopSec=10
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
Here is the path unit file:
[Path]
Unit=demo-watcher.service
PathModified=/root/artifacts2
[Install]
WantedBy=multi-user.target
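I solved it.
I changed PathModified= to PathChanged= in the path unit, and the problem was solved. (PathModified= also fires on every plain write to the watched path, so a jar that is still being uploaded can trigger the watcher many times in quick succession; PathChanged= fires only when a file that was open for writing is closed.)
This message can sometimes be misleading. If the fix above doesn't work, try changing "User", and also try upgrading or downgrading the Java version; a Java version mismatch can cause this too.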
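One more thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: as far as I know, since systemd 230 the StartLimitIntervalSec= and StartLimitBurst= settings belong in the [Unit] section, and they only apply to the unit whose file they are written in. Setting them in the [Service] section of the restarter would therefore not lift the limit on demo.service. The hypothetical helper below (the name `directive_section` is mine, not a systemd tool) prints the section in which a directive appears.

```shell
#!/bin/sh
# Hypothetical helper: report which [Section] of a unit file a directive is in.
directive_section() {
    # $1 = unit file, $2 = directive name (without the "=")
    awk -v key="$2" '
        # Remember the most recent section header, e.g. [Unit] or [Service].
        /^\[[A-Za-z]+\]/ { section = $0; next }
        # Print that section when the directive starts the line.
        index($0, key "=") == 1 { print section }
    ' "$1"
}

# Usage (the file name is a placeholder):
#   directive_section demo-watcher.service StartLimitIntervalSec
```

Once a unit has hit the limit, `systemctl reset-failed demo.service` clears its failed state and its start rate counter, so the next start attempt is not rejected immediately.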

Systemd service won't execute at boot

I've created my own service with systemd. It is supposed to run a Python script once at boot time. The script sends an email with the IP address and the TeamViewer ID, which is why it has a delay in it; otherwise I get an error that the domain of the mail server can't be resolved. The script should run in the background because of the 30-second delay.
The script is located at /usr/bin/glatv.py and is executable; run by hand, it works without a problem. The setup is running on a Raspberry Pi 4 with Raspbian Buster 2020-02-13.
The service file is located in /etc/systemd/system/, is executable, and is enabled:
[Unit]
Description=My Own Service
[Service]
Type=oneshot
ExecStart=/usr/bin/glatv.py &
[Install]
WantedBy=reboot.target
But
systemctl start myservice
works without a problem:
● glatvd.service - My Own Service
Loaded: loaded (/etc/systemd/system/glatvd.service; enabled; vendor preset: enabled)
Active: inactive (dead)
Apr 02 12:52:31 raspberrypi systemd[1]: Starting My Own Service...
Apr 02 12:53:02 raspberrypi systemd[1]: glatvd.service: Succeeded.
Apr 02 12:53:02 raspberrypi systemd[1]: Started My Own service.
After a reboot there is no call and no log entry.
Instead of having an arbitrary 30-second delay, add this to your service file:
After=network-online.target
Wants=network-online.target
Try running this command to enable your service so that it runs after a restart:
systemctl enable myservice
For logging, I believe you must put these parameters into your service's unit file (the file: prefix needs systemd 236 or later):
StandardOutput=file:/path/to/info/log/info_log.log
StandardError=file:/path/to/error/log/error_log.log
I got this from: How to redirect output of systemd service to a file
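To see whether the unit is actually hooked into a target that is reached during boot, it can help to look for the symlink that systemctl enable creates. The following is a minimal sketch that assumes the default /etc/systemd/system location; the function name `is_wanted_by` and its optional root argument are invented for illustration. Note that reboot.target is the target systemd activates when the machine shuts down for a reboot, so a unit meant to run at startup would normally be wanted by something like multi-user.target instead.

```shell
#!/bin/sh
# Hypothetical helper: check for the symlink that `systemctl enable` creates.
is_wanted_by() {
    # $1 = unit name, $2 = target name, $3 = optional filesystem root (for testing)
    link="${3:-}/etc/systemd/system/$2.wants/$1"
    if [ -L "$link" ]; then
        echo "enabled for $2"
    else
        echo "not enabled for $2"
        return 1
    fi
}

# Usage:
#   is_wanted_by glatvd.service multi-user.target
```

After changing WantedBy= in the [Install] section, the unit has to be disabled and enabled again for the symlink to move to the new target.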

systemd service silently fails to start. How to debug?

I wrote a program (called whisky) which I now want to start when the machine boots (a Raspberry Pi with which I'm creating an autonomous boat). So I created the file /lib/systemd/system/whisky.service:
[Unit]
Description=Whisky Boat Program
After=network.target
StartLimitIntervalSec=0
[Service]
ExecStart=/home/pi/whisky/run
KillMode=process
IgnoreSIGPIPE=true
Restart=always
RestartSec=3
User=root
Type=simple
[Install]
WantedBy=multi-user.target
I verified the file is correctly formatted for systemd using systemd-analyze verify whisky.service.
When I now run sudo systemctl start whisky I get no output (suggesting no errors).
sudo systemctl status whisky gives me the following output though:
* whisky.service - Whisky Boat Program
Loaded: loaded (/lib/systemd/system/whisky.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since Fri 2020-03-20 15:03:35 CET; 792ms ago
Process: 8621 ExecStart=/home/pi/whisky/run (code=exited, status=203/EXEC)
Main PID: 8621 (code=exited, status=203/EXEC)
Mar 20 15:03:35 raspberrypi systemd[1]: whisky.service: Unit entered failed state.
Mar 20 15:03:35 raspberrypi systemd[1]: whisky.service: Failed with result 'exit-code'.
The file /home/pi/whisky/run is actually a bash script which in turn starts the program. To check whether systemd even starts that bash script I added a first line to it: mkdir /home/pi/RUNNING_FROM_SYSTEMD. The dir RUNNING_FROM_SYSTEMD is not created though, so it seems systemd doesn't even try to run the file /home/pi/whisky/run.
Does anybody know what I'm doing wrong here?
Check your run script. It should start with a shebang. If it doesn't, add #!/bin/bash at the top of the script, or give the interpreter's absolute path explicitly, as in ExecStart=/bin/bash /home/pi/whisky/run
The error (code=exited, status=203/EXEC) is often seen when the script itself or its interpreter cannot be executed.
Other possible reasons may be:
wrong path to script
script not executable
no shebang (first line) or wrong path in shebang
internal files in your script might be missing access permissions.
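The checks in the list above can be sketched as a small shell function. This is an illustration, not a systemd tool: the name `check_exec` is hypothetical, and it only makes sense for scripts, since a compiled binary has no shebang line.

```shell
#!/bin/sh
# Hypothetical helper: run the usual 203/EXEC checks against a script.
check_exec() {
    script="$1"
    # Wrong path to the script?
    [ -e "$script" ] || { echo "missing: $script"; return 1; }
    # Script not executable?
    [ -x "$script" ] || { echo "not executable: $script"; return 1; }
    # No shebang on the first line?
    first_line=$(head -n 1 "$script")
    case "$first_line" in
        '#!'*) ;;
        *) echo "no shebang: $script"; return 1 ;;
    esac
    # Wrong path in the shebang? The interpreter is the first word after "#!".
    interpreter=$(printf '%s\n' "$first_line" | sed 's/^#![[:space:]]*//' | awk '{ print $1 }')
    [ -x "$interpreter" ] || { echo "bad interpreter: $interpreter"; return 1; }
    echo "ok: $script"
}

# Usage:
#   check_exec /home/pi/whisky/run
```

A script saved with Windows line endings would be reported as having a bad interpreter here, because the carriage return becomes part of the interpreter path.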

Prometheus 2.0 centos service won't start, because "Opening storage failed", "permission denied"

context: I've added some scripts to an empty centos VM to install some monitoring tools including prometheus 2.0.
problem: Once installed in the non-root sudo user's home directory, I copy the prometheus.service that I wrote to "/etc/systemd/system", run sudo systemctl daemon-reload, sudo systemctl enable prometheus.service, sudo systemctl start prometheus.service but the service fails.
note: I can run the prometheus binary in the terminal directly using the same command without any problems, but I can't run it as a service.
Here's my .service file:
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
User=centos
ExecStart=/home/centos/prometheus/prometheus --config.file="/home/centos/prometheus/prometheus.yml" --storage.tsdb.path="/home/centos/prometheus/data"
[Install]
WantedBy=multi-user.target
Here's some of the log:
...
Nov 21 12:41:55 localhost.localdomain prometheus[1554]: level=info ts=2017-11-21T17:41:55.114757834Z caller=main.go:314 msg="Starting TSDB"
Nov 21 12:41:55 localhost.localdomain prometheus[1554]: level=error ts=2017-11-21T17:41:55.114819195Z caller=main.go:323 msg="Opening storage failed" err="mkdir \": permission denied"
Nov 21 12:41:55 localhost.localdomain systemd[1]: prometheus.service: control process exited, code=exited status=1
Nov 21 12:41:55 localhost.localdomain systemd[1]: Failed to start Prometheus Server.
...
I'm new to Linux service management. I've spent a lot of time reading online, but I'm not sure how permissions work for services, or why it can't create the directory it needs.
I've tried:
Changing "SELINUX=enforcing" to "SELINUX=permissive"
Changing the permission to the prometheus directory to 777
...
You also have to set up --web.console.templates and --web.console.libraries. You can copy these directories from the extracted archive. For example:
sudo cp -R ~/prometheus-2.0.0.linux-amd64/consoles /etc/prometheus
sudo cp -R ~/prometheus-2.0.0.linux-amd64/console_libraries /etc/prometheus
Example of a working service (change the paths to yours):
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus --config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries
[Install]
WantedBy=multi-user.target
P.S. Inspired by suggestions here.
The data directory for Prometheus must be writable by the user the Prometheus service runs as. If you're running it from a container and mounting the data directory externally, you can set 777 permissions on the original folder.
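Write permission on the data directory alone is not enough: the service user also needs search (execute) permission on every directory above it. A rough way to inspect that is sketched below. It assumes GNU `stat`, and the function name `path_permissions` is made up for this example; util-linux's `namei -l` prints similar information.

```shell
#!/bin/sh
# Hypothetical helper: print mode and owner for a directory and each parent.
path_permissions() {
    # $1 = absolute path to a directory
    dir="$1"
    while :; do
        # %A = permissions, %U:%G = owner and group, %n = file name
        stat -c '%A %U:%G %n' "$dir"
        # Stop at the filesystem root (or at "." if a relative path was given).
        if [ "$dir" = "/" ] || [ "$dir" = "." ]; then
            break
        fi
        dir=$(dirname "$dir")
    done
}

# Usage:
#   path_permissions /home/centos/prometheus/data
```

Any line where the service user has no `x` bit through owner, group, or other is a place where the service would be stopped before it ever reaches the data directory.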
If SELinux is stopping startup, always consult journalctl -xe to view the SELinux alerts; they include recommended actions.
I have set up Prometheus with SELinux on CentOS 8 without problems, and I don't agree with people who recommend disabling SELinux.
For reference, Red Hat has a good video to watch:
https://www.youtube.com/watch?v=_WOKRaM-HI4&t=1464s
Here is my prometheus.service file.
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
User=prometheus
#Restart=on-failure
#Change this line if you downloaded
#Prometheus to a different path or user
ExecStart=/home/prometheus/prometheus-2.22.0.linux-amd64/prometheus \
--config.file=/home/prometheus/prometheus-2.22.0.linux-amd64/prometheus.yml \
--storage.tsdb.path=/home/prometheus/prometheus-2.22.0.linux-amd64/data \
--web.listen-address="0.0.0.0:9091"
[Install]
WantedBy=multi-user.target

Systemd script fail

I want to run a script at system startup in a Debian 9 box. My script works when run standalone, but fails under systemd.
My script just copies a backup file from a remote server to the local machine:
#!/bin/sh
set -e
/usr/bin/sshpass -p "PASSWORD" /usr/bin/scp -p USER@10.0.0.2:ORIGINPATH/backupserver.zip DESTINATIONPATH/backupserver/
Just for privacy I replaced password, user, and paths above.
I wrote the following systemd service unit:
[Unit]
Description=backup script
[Service]
Type=oneshot
ExecStart=PATH/backup.sh
[Install]
WantedBy=default.target
Then I set permissions for the script:
chmod 744 PATH/backup.sh
And installed the service:
chmod 664 /etc/systemd/system/backup.service
systemctl daemon-reload
systemctl enable backup.service
When I reboot the script fails:
● backup.service - backup script
Loaded: loaded (/etc/systemd/system/backup.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sat 2017-05-13 13:39:54 -03; 47min ago
Main PID: 591 (code=exited, status=1/FAILURE)
Result of journalctl -xe:
mai 16 23:34:27 rodrigo-acer systemd[1]: backup.service: Main process exited, code=exited, status=6/NOTCONFIGURED
mai 16 23:34:27 rodrigo-acer systemd[1]: Failed to start backup script.
mai 16 23:34:27 rodrigo-acer systemd[1]: backup.service: Unit entered failed state.
mai 16 23:34:27 rodrigo-acer systemd[1]: backup.service: Failed with result 'exit-code'.
What could be wrong?
Solved, guys. There were two problems:
1 - I had to change the service unit file so that the service runs only after the network is up. The [Unit] section was changed to:
[Unit]
Description = World server backup
Wants = network-online.target
After = network.target network-online.target
2 - The root user did not have the remote host added to the known host list, unlike the ordinary user I used to test the script.
For "Failed with result 'exit-code'", you could try this as the last line of your script:
# REQUIRED FOR SYSTEMD: 0 means clean no error
exit 0
You may also need to add:
Type=forking
to the systemd unit, similar to: https://serverfault.com/questions/751030/systemd-ignores-return-code-while-starting-service
If your service or script does not fork, add a & at the end to run it in the background and exit with 0 quickly. Otherwise it will look like a startup that times out and takes forever, or like a frozen service.
