systemctl failing with unknown section 'Timer' - RHEL

I have a systemd service that performs a Vertica backup to S3, and I wanted to add a timer that runs every day at 3 AM. I tried to create an override file with the timer section, but when I run daemon-reload I get the error `Unknown section 'Timer'`, and I am unable to find the issue.
/etc/systemd/system/vertica-backup.service.d/override.conf:
[Timer]
OnCalendar=*-*-* 03:00:00
Unit=vertica-backup.service
/etc/systemd/system/vertica-backup.service:
[Unit]
Description=Vertica Backup Service
After=network.target
[Service]
User=dbadmin
ExecStart=/usr/local/bin/vertica-backup.sh
Error:
May 15 15:19:47 ip-10-150-4-42.ec2.internal systemd[1]: [/etc/systemd/system/vertica-backup.service.d/override.conf:1] Unknown section 'Timer'. Ignoring.
May 15 15:19:50 ip-10-150-4-42.ec2.internal systemd[1]: [/etc/systemd/system/vertica-backup.service.d/override.conf:1] Unknown section 'Timer'. Ignoring.

[Timer] sections don't go in service files (or their drop-in overrides); they go in their own .timer units. Create /etc/systemd/system/vertica-backup.timer and put the [Timer] section in there.
See man systemd.timer for reference.

Create the timer file /etc/systemd/system/vertica-backup.timer:
[Timer]
OnCalendar=*-*-* 03:00:00
Unit=vertica-backup.service
Verify it:
sudo systemd-analyze verify /etc/systemd/system/vertica-backup.timer
Start the timer:
sudo systemctl start vertica-backup.timer
Check it:
systemctl list-timers --all
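If you want the timer to survive a reboot, a fuller unit (a sketch along the same lines; Persistent=true is optional and triggers a missed run after downtime) would be:
[Unit]
Description=Daily 3 AM run of vertica-backup.service
[Timer]
OnCalendar=*-*-* 03:00:00
Unit=vertica-backup.service
Persistent=true
[Install]
WantedBy=timers.target
Then reload and enable it:
sudo systemctl daemon-reload
sudo systemctl enable vertica-backup.timer
sudo systemctl start vertica-backup.timer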

Related

Systemd - Unknown lvalue 'ConditionEnvironment' in section 'Unit'

Simple systemd service not working as expected
Service name: test.service
[Unit]
Description=Test
ConditionEnvironment=STACK=prod
[Service]
Restart=always
ExecStart=/bin/bash -l -c 'echo "do prod stuff!!!"'
[Install]
WantedBy=default.target
sudo systemctl daemon-reload
sudo service test restart
journalctl -u test -f
Systemd is giving an error when I try to use the ConditionEnvironment setting.
Apr 27 13:16:33 ip-172-31-105-2 systemd[1]: Failed to start Test.
Apr 27 13:19:53 ip-172-31-105-2 systemd[1]: /etc/systemd/system/test.service:3: Unknown lvalue 'ConditionEnvironment' in section 'Unit'
Systemd ConditionEnvironment docs
While writing this question I found the answer.
The ConditionEnvironment setting was added in systemd version 246.
See release notes here
It seems Ubuntu ships an earlier version:
ubuntu ~$ systemctl --version
systemd 237 (245.4-4ubuntu3.6)
Notes on updating systemd here: https://askubuntu.com/questions/627174/how-would-i-upgrade-systemd
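If upgrading isn't an option, a workaround on systemd older than 246 (a sketch, not the real ConditionEnvironment feature) is to test the variable inside the command itself, so the unit starts everywhere but only does the work when STACK matches:
ExecStart=/bin/bash -l -c 'if [ "$STACK" = "prod" ]; then echo "do prod stuff!!!"; fi'
This assumes STACK is exported by a login profile, which the -l flag loads.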

Why does a RHEL 8 AWS systemd service throw a "No such file or directory" error even when the file exists?

I have a systemd service defined on RHEL 8:
[Unit]
Description=Apache Kafka - ZooKeeper
Documentation=http://docs.confluent.io/
After=network.target
[Service]
Type=simple
EnvironmentFile=/app/bin/confluent/etc/kafka/zookenv.properties
User=kafka
Group=kafka
ExecStart=/app/bin/confluent/bin/zookeeper-server-start /app/bin/confluent/etc/kafka/zookeeper.properties
TimeoutStopSec=180
Restart=no
[Install]
WantedBy=multi-user.target
When I start this service, I get the following errors in journalctl:
Jul 09 12:00:51 10.204.142.111 systemd[1]: confluent-zookeeper.service: Failed to load environment files: No such file or directory
Jul 09 12:00:51 10.204.142.111 systemd[1]: confluent-zookeeper.service: Failed to run 'start' task: No such file or directory
Jul 09 12:00:51 10.204.142.111 systemd[1]: confluent-zookeeper.service: Failed with result 'resources'.
The environment file exists at that path, and so do the start script and the properties files.
This is on RHEL 8 on AWS, and I am trying this for the first time.
The component starts up fine when I run the start script manually from the command line.
Check that the path and file for
EnvironmentFile=/app/bin/confluent/etc/kafka/zookenv.properties
are correct.
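A quick way to check from the shell (the unit name here is assumed from the log above; running ls as the kafka service user also catches permission problems along the path):
sudo -u kafka ls -l /app/bin/confluent/etc/kafka/zookenv.properties
sudo systemd-analyze verify confluent-zookeeper.service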
In my case, I had my service file like this
ExecStart=/app/bin/confluent/bin/zookeeper-server-start
EnvironmentFile=/app/bin/confluent/etc/kafka/zookenv.properties
I changed it to
EnvironmentFile=/app/bin/confluent/etc/kafka/zookenv.properties
ExecStart=/app/bin/confluent/bin/zookeeper-server-start
Then I ran:
systemctl daemon-reload &&
systemctl restart service-name.service
Initially I was using `systemctl start service-name.service`, and I guess that didn't make systemd read the environment files properly.
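To double-check what systemd actually loaded after the reload, you can dump the unit as systemd sees it (the unit name is assumed from the log above):
systemctl cat confluent-zookeeper.service
systemctl show confluent-zookeeper.service -p EnvironmentFiles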

systemd unit for pgagent

I want to make a systemd unit for pgagent.
I found only an init.d script on this page http://technobytz.com/automatic-sql-database-backup-postgres.html, but I don't know how to run start-stop-daemon from systemd.
I have written this unit:
[Unit]
Description=pgagent
After=network.target postgresql.service
[Service]
ExecStart=start-stop-daemon -b --start --quiet --exec pgagent --name pgagent --startas pgagent -- hostaddr=localhost port=5432 dbname=postgres user=postgres
ExecStop=start-stop-daemon --stop --quiet -n pgagent
[Install]
WantedBy=multi-user.target
But I get errors like:
[/etc/systemd/system/pgagent.service:14] Executable path is not absolute, ignoring: start-stop-daemon --stop --quiet -n pgagent
What is wrong with that unit?
systemd expects the ExecStart and ExecStop commands to include the full path to the executable.
Also, start-stop-daemon is not necessary for services under systemd's management; you will want systemd to execute the underlying pgagent command directly.
Look at https://unix.stackexchange.com/questions/220362/systemd-postgresql-start-script for an example.
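A minimal unit along those lines might look like this (a sketch; the /usr/bin/pgagent path, the -f foreground flag, and the connection string are assumptions that depend on how pgagent was installed):
[Unit]
Description=pgagent
After=network.target postgresql.service
[Service]
Type=simple
ExecStart=/usr/bin/pgagent -f hostaddr=localhost port=5432 dbname=postgres user=postgres
[Install]
WantedBy=multi-user.target
With Type=simple and a foreground flag, systemd tracks the process itself, so no ExecStop is needed; systemctl stop signals it for you.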
If you installed pgagent with yum or apt-get, it should have created the systemd file for you. For example, on RHEL 7 (essentially CentOS 7), you can install PostgreSQL 12 followed by pgagent
sudo yum install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
sudo yum install postgresql12
sudo yum install postgresql12-server
sudo yum install pgagent_12.x86_64
This installs PostgreSQL to /var/lib/pgsql/12 and pgagent_12 to /usr/bin/pgagent_12.
In addition, it creates a systemd file at /usr/lib/systemd/system/pgagent_12.service.
View the status of the service with systemctl status pgagent_12
Configure it to auto-start, then start it, with:
sudo systemctl enable pgagent_12
sudo systemctl start pgagent_12
Most likely the authentication will fail, since the default .service file has
ExecStart=/usr/bin/pgagent_12 -s ${LOGFILE} hostaddr=${DBHOST} dbname=${DBNAME} user=${DBUSER} port=${DBPORT}
Confirm with sudo tail /var/log/pgagent_12.log, which will show:
Sat Oct 12 19:35:47 2019 WARNING: Couldn't create the primary connection [Attempt #1]
Sat Oct 12 19:35:52 2019 WARNING: Couldn't create the primary connection [Attempt #2]
Sat Oct 12 19:35:57 2019 WARNING: Couldn't create the primary connection [Attempt #3]
Sat Oct 12 19:36:02 2019 WARNING: Couldn't create the primary connection [Attempt #4]
To fix things, we need to create a .pgpass file that is accessible when the service starts. First, stop the service
sudo systemctl stop pgagent_12
Examining the service file with less /usr/lib/systemd/system/pgagent_12.service shows it has
User=pgagent
Group=pgagent
Furthermore, /etc/pgagent/pgagent_12.conf has
DBNAME=postgres
DBUSER=postgres
DBHOST=127.0.0.1
DBPORT=5432
LOGFILE=/var/log/pgagent_12.log
Examine the /etc/passwd file to look for the pgagent user and its home directory: grep "pgagent" /etc/passwd
pgagent:x:980:977:pgAgent Job Schedule:/home/pgagent:/bin/false
Thus, we need to create a .pgpass file at /home/pgagent/.pgpass to define the postgres user's password
sudo su -
mkdir /home/pgagent
chown pgagent:pgagent /home/pgagent
chmod 0700 /home/pgagent
echo "127.0.0.1:5432:postgres:postgres:PasswordGoesHere" > /home/pgagent/.pgpass
chown pgagent:pgagent /home/pgagent/.pgpass
chmod 0600 /home/pgagent/.pgpass
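You can test the credentials as the pgagent user before starting the service again (a sketch; PGPASSFILE is the standard libpq variable, and the psql path may differ on your install):
sudo -u pgagent PGPASSFILE=/home/pgagent/.pgpass psql -h 127.0.0.1 -p 5432 -U postgres -d postgres -c 'SELECT 1'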
The directory and file permissions are important. If you're having problems, you can enable debug logging by editing the service file at /usr/lib/systemd/system/pgagent_12.service and updating the ExecStart command to include -l 2:
ExecStart=/usr/bin/pgagent_12 -l 2 -s ${LOGFILE} hostaddr=${DBHOST} dbname=${DBNAME} user=${DBUSER} port=${DBPORT}
After changing a .service file, things must be reloaded with sudo systemctl daemon-reload (systemd will inform you of this requirement if you forget it).
Keep starting/stopping the service and checking /var/log/pgagent_12.log. Eventually, it will start properly and sudo systemctl status pgagent_12 will show:
● pgagent_12.service - PgAgent for PostgreSQL 12
   Loaded: loaded (/usr/lib/systemd/system/pgagent_12.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2019-10-12 20:18:18 PDT; 13s ago
  Process: 6159 ExecStart=/usr/bin/pgagent_12 -s ${LOGFILE} hostaddr=${DBHOST} dbname=${DBNAME} user=${DBUSER} port=${DBPORT} (code=exited, status=0/SUCCESS)
 Main PID: 6160 (pgagent_12)
    Tasks: 1
   Memory: 1.1M
   CGroup: /system.slice/pgagent_12.service
           └─6160 /usr/bin/pgagent_12 -s /var/log/pgagent_12.log hostaddr=127.0.0.1 dbname=postgres user=postgres port=5432
Oct 12 20:18:18 prismweb3 systemd[1]: Starting PgAgent for PostgreSQL 12...
Oct 12 20:18:18 prismweb3 systemd[1]: Started PgAgent for PostgreSQL 12.

Prometheus 2.0 CentOS service won't start because "Opening storage failed", "permission denied"

Context: I've added some scripts to an empty CentOS VM to install some monitoring tools, including Prometheus 2.0.
Problem: Once it is installed in the non-root sudo user's home directory, I copy the prometheus.service file that I wrote to /etc/systemd/system, then run sudo systemctl daemon-reload, sudo systemctl enable prometheus.service, and sudo systemctl start prometheus.service, but the service fails.
Note: I can run the prometheus binary directly in the terminal using the same command without any problems; I just can't run it as a service.
Here's my .service file:
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
User=centos
ExecStart=/home/centos/prometheus/prometheus --config.file="/home/centos/prometheus/prometheus.yml" --storage.tsdb.path="/home/centos/prometheus/data"
[Install]
WantedBy=multi-user.target
Here's some of the log:
...
Nov 21 12:41:55 localhost.localdomain prometheus[1554]: level=info ts=2017-11-21T17:41:55.114757834Z caller=main.go:314 msg="Starting TSDB"
Nov 21 12:41:55 localhost.localdomain prometheus[1554]: level=error ts=2017-11-21T17:41:55.114819195Z caller=main.go:323 msg="Opening storage failed" err="mkdir \": permission denied"
Nov 21 12:41:55 localhost.localdomain systemd[1]: prometheus.service: control process exited, code=exited status=1
Nov 21 12:41:55 localhost.localdomain systemd[1]: Failed to start Prometheus Server.
...
I'm new to Linux service management. I've spent a lot of time reading online, but I'm not sure how permissions work for services, or why the service can't create the directory it needs.
I've tried:
Changing "SELINUX=enforcing" to "SELINUX=permissive"
Changing the permissions on the prometheus directory to 777
...
You also have to set up --web.console.templates and --web.console.libraries. You can copy these directories from the extracted archive. For example:
sudo cp -R ~/prometheus-2.0.0.linux-amd64/consoles /etc/prometheus
sudo cp -R ~/prometheus-2.0.0.linux-amd64/console_libraries /etc/prometheus
Example of a working service (change the paths to match yours):
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus --config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries
[Install]
WantedBy=multi-user.target
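This example assumes the prometheus user and its directories already exist; if not, something like this creates them (a sketch; the names and paths match the unit above, adjust to your layout):
sudo useradd --no-create-home --shell /bin/false prometheus
sudo mkdir -p /etc/prometheus /var/lib/prometheus
sudo chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus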
P.S. Inspired by suggestions here.
The data directory for Prometheus should be writable by the prometheus application user. If you're running it from a container and mounting the data directory externally, you can set 777 permissions on the original folder.
If SELinux is blocking startup, always consult journalctl -xe to view the SELinux alerts; they include recommended actions to take.
I have set up Prometheus with SELinux on CentOS 8 without problems, and I don't agree with people who recommend disabling SELinux.
For reference, Red Hat has a good video for you to watch:
https://www.youtube.com/watch?v=_WOKRaM-HI4&t=1464s
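For example, to pull up recent denials and the suggested fixes (assuming the audit tools and setroubleshoot are installed):
sudo ausearch -m avc -ts recent
sudo sealert -a /var/log/audit/audit.log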
Here is my prometheus.service file.
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target
[Service]
User=prometheus
#Restart=on-failure
#Change these paths if you downloaded
#Prometheus to a different path or as a different user
ExecStart=/home/prometheus/prometheus-2.22.0.linux-amd64/prometheus \
--config.file=/home/prometheus/prometheus-2.22.0.linux-amd64/prometheus.yml \
--storage.tsdb.path=/home/prometheus/prometheus-2.22.0.linux-amd64/data \
--web.listen-address="0.0.0.0:9091"
[Install]
WantedBy=multi-user.target

Systemd script fails

I want to run a script at system startup on a Debian 9 box. My script works when run standalone, but fails under systemd.
My script just copies a backup file from a remote server to the local machine:
#!/bin/sh
set -e
/usr/bin/sshpass -p "PASSWORD" /usr/bin/scp -p USER@10.0.0.2:ORIGINPATH/backupserver.zip DESTINATIONPATH/backupserver/
Just for privacy, I replaced the password, user, and paths above.
I wrote the following systemd service unit:
[Unit]
Description=backup script
[Service]
Type=oneshot
ExecStart=PATH/backup.sh
[Install]
WantedBy=default.target
Then I set permissions for the script:
chmod 744 PATH/backup.sh
And installed the service:
chmod 664 /etc/systemd/system/backup.service
systemctl daemon-reload
systemctl enable backup.service
When I reboot, the script fails:
● backup.service - backup script
Loaded: loaded (/etc/systemd/system/backup.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sat 2017-05-13 13:39:54 -03; 47min ago
Main PID: 591 (code=exited, status=1/FAILURE)
Result of journalctl -xe:
mai 16 23:34:27 rodrigo-acer systemd[1]: backup.service: Main process exited, code=exited, status=6/NOTCONFIGURED
mai 16 23:34:27 rodrigo-acer systemd[1]: Failed to start backup script.
mai 16 23:34:27 rodrigo-acer systemd[1]: backup.service: Unit entered failed state.
mai 16 23:34:27 rodrigo-acer systemd[1]: backup.service: Failed with result 'exit-code'.
What could be wrong?
Solved. There were two problems:
1. I had to change the service unit file so that the service runs only after the network is up. The [Unit] section was changed to:
[Unit]
Description = World server backup
Wants = network-online.target
After = network.target network-online.target
2. The root user did not have the remote host in its known_hosts list, unlike the ordinary user I used to test the script.
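One way to fix the second problem is to pre-seed root's known_hosts with the remote host's key (a sketch; the IP is the one from the script above, and the redirect runs under sh -c so it happens with root's permissions):
sudo mkdir -p /root/.ssh
sudo sh -c 'ssh-keyscan -H 10.0.0.2 >> /root/.ssh/known_hosts'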
For Failed with result 'exit-code', you could try this on your script's last line:
# REQUIRED FOR SYSTEMD: 0 means clean no error
exit 0
You may also need to add:
Type=forking
to the systemd entry, similar to: https://serverfault.com/questions/751030/systemd-ignores-return-code-while-starting-service
If your service or script does not fork, add an & at the end to run it in the background, and exit with 0 quickly. Otherwise it will behave like a startup that times out and takes forever / seems like a frozen service.
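As a sketch of that pattern (the long-running command is a hypothetical placeholder):
#!/bin/sh
# Launch the real work in the background so the unit's start job returns promptly
/usr/local/bin/long-running-task &
# REQUIRED FOR SYSTEMD: report a clean start
exit 0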
