I already have a service that was written for RHEL6, and there I had some custom service commands that I could execute. Please see below for the extract from the script:
case "$1" in
'start')
start
;;
'stop')
stopit
;;
'restart')
stopit
start
;;
'status')
status
;;
'AppHealthCheck')
AppHealthCheck
;;
*)
echo "Usage: $0 { start | stop | restart | status | AppHealthCheck }"
exit 1
;;
esac
All the called methods have their definitions. Previously, on RHEL6, if I had to run the service and check that it was healthy, I would execute service $servicename AppHealthCheck and it worked. Now, on RHEL7, I am not able to specify in the service unit file that I want to check, say, the AppHealth. From the research I have done, I learnt that you can define what will be called for service start/stop/restart, but I was not able to find whether you can call any custom methods in the script. Please see my service unit file below:
[Unit]
Description=SPIRIT Agent Application
[Service]
Type=forking
ExecStart=scripts/Agent start
ExecStop=scripts/Agent stop
ExecReload=scripts/Agent restart
[Install]
Can anyone please help me resolve this issue? Please let me know if more info is required.
The systemd way is to send output to the journal so that systemctl status shows the latest log messages, and tells you if the service is running. If you want more detailed status, you would create a separate command-line command that does AppHealthCheck. It wouldn't be executed via systemctl, it'd be a separate thing.
This is how Pacemaker works, for example. systemctl status pacemaker shows if the service is running.
# systemctl status pacemaker
● pacemaker.service - Pacemaker High Availability Cluster Manager
Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2016-11-10 15:28:11 GMT; 1 weeks 3 days ago
Nov 11 15:54:59 node1 crmd[4422]: notice: Operation svc1_stop_0: ok (node=node1, call=93, rc=0, cib-update=134, confirmed=true)
Nov 11 15:54:59 node1 crmd[4422]: notice: Operation svc2_stop_0: ok (node=node1, call=95, rc=0, cib-update=135, confirmed=true)
Nov 11 15:54:59 node1 crmd[4422]: notice: Operation svc3_stop_0: ok (node=node1, call=97, rc=0, cib-update=136, confirmed=true)
pcs status gives more detailed information about how it's doing.
# pcs status
Cluster name: node
Stack: corosync
Current DC: node2 (version 1.2.3) - partition with quorum
2 nodes and 3 resources configured
Online: [ node1 node2 ]
Full list of resources:
<snip>
PCSD Status:
node1: Online
node2: Online
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
In RHEL7 you cannot define custom service commands the way you could on a RHEL6 server. So even if you provide a custom service command, it has to live outside the unit file, and internally it should call 'service $servicename start' or 'systemctl start $servicename' so that the RHEL7 server can recognize that the service is running.
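For example, here is a minimal sketch of such a wrapper script, keeping the lifecycle under systemd's control while the custom action calls the old init script directly. The unit name and script path are assumptions, not taken from the original setup. Note also that systemd requires an absolute path in ExecStart=, so scripts/Agent in the unit file above would need to be fully qualified.
#!/bin/sh
# Lifecycle commands go through systemd so it tracks the service state;
# the custom action invokes the old RHEL6 script directly.
case "$1" in
    start|stop|restart)
        systemctl "$1" agent.service
        ;;
    AppHealthCheck)
        /opt/spirit/scripts/Agent AppHealthCheck   # assumed path
        ;;
    *)
        echo "Usage: $0 { start | stop | restart | AppHealthCheck }"
        exit 1
        ;;
esac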
Rails: 6.0.3
Sidekiq: 6.1.2
Ruby 2.7.2
Running on AWS Amazon Linux 2
I'm running a fairly simple Sidekiq configuration in production, using the boilerplate systemd/sidekiq.service file from the examples directory in the Sidekiq repo.
I noticed that my workers cannot run long jobs because they are killed every minute or so. I was able to track down what's happening: systemd is restarting Sidekiq even though it started successfully. It appears systemd never receives the message that the service started successfully, so it kills the process.
Here are the logs:
sidekiq: 2021-06-01T23:30:56.510Z pid=24939 tid=gir INFO: Shutting down
sidekiq: 2021-06-01T23:30:56.511Z pid=24939 tid=4jxb INFO: Scheduler exiting...
systemd: Failed to start sidekiq.
systemd: Unit sidekiq.service entered failed state.
systemd: sidekiq.service failed.
sidekiq: 2021-06-01T23:30:56.513Z pid=24939 tid=gir INFO: Terminating quiet workers
sidekiq: 2021-06-01T23:30:56.513Z pid=24939 tid=4jvn INFO: Scheduler exiting...
sidekiq: 2021-06-01T23:30:57.015Z pid=24939 tid=gir INFO: Pausing to allow workers to finish...
sidekiq: 2021-06-01T23:30:57.516Z pid=24939 tid=gir INFO: Bye!
systemd: sidekiq.service holdoff time over, scheduling restart.
systemd: Starting sidekiq...
sidekiq: 2021-06-01T23:30:58.991Z pid=32046 tid=fs6 INFO: Enabling systemd notification integration
sidekiq: 2021-06-01T23:31:04.475Z pid=32046 tid=fs6 INFO: Booting Sidekiq 6.1.2 with redis options {:url=>"redis://******"}
sidekiq: 2021-06-01T23:31:08.869Z pid=32046 tid=fs6 INFO: Running in ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-linux]
sidekiq: 2021-06-01T23:31:08.870Z pid=32046 tid=fs6 INFO: See LICENSE and the LGPL-3.0 for licensing details.
systemd: sidekiq.service: Got notification message from PID 32046, but reception only permitted for main PID 31981
Following these messages, the sidekiq worker will successfully perform the jobs from the queue for about 1 minute before it's restarted again. This cycle continues forever.
I've tried modifying the sidekiq.service file a number of different ways, but nothing seems to do the trick. In particular, this line from the logs seems to indicate that the notification that Sidekiq started correctly is being sent from the wrong process ID: systemd: sidekiq.service: Got notification message from PID 32046, but reception only permitted for main PID 31981
Any ideas on how I can ensure that systemd accurately knows when a job succeeds/fails to start?
Here is my current systemd/sidekiq.service file:
#
# This file tells systemd how to run Sidekiq as a 24/7 long-running daemon.
#
# Customize this file based on your bundler location, app directory, etc.
# Customize and copy this into /usr/lib/systemd/system (CentOS) or /lib/systemd/system (Ubuntu).
# Then run:
# - systemctl enable sidekiq
# - systemctl {start,stop,restart} sidekiq
#
# This file corresponds to a single Sidekiq process. Add multiple copies
# to run multiple processes (sidekiq-1, sidekiq-2, etc).
#
# Use `journalctl -u sidekiq -rn 100` to view the last 100 lines of log output.
#
[Unit]
Description=sidekiq
# start us only once the network and logging subsystems are available,
# consider adding redis-server.service if Redis is local and systemd-managed.
After=syslog.target network.target
# See these pages for lots of options:
#
# https://www.freedesktop.org/software/systemd/man/systemd.service.html
# https://www.freedesktop.org/software/systemd/man/systemd.exec.html
#
# THOSE PAGES ARE CRITICAL FOR ANY LINUX DEVOPS WORK; read them multiple
# times! systemd is a critical tool for all developers to know and understand.
#
[Service]
#
# !!!! !!!! !!!!
#
# As of v6.0.6, Sidekiq automatically supports systemd's `Type=notify` and watchdog service
# monitoring. If you are using an earlier version of Sidekiq, change this to `Type=simple`
# and remove the `WatchdogSec` line.
#
# !!!! !!!! !!!!
#
Type=simple
# If your Sidekiq process locks up, systemd's watchdog will restart it within seconds.
#WatchdogSec=10
EnvironmentFile=/opt/elasticbeanstalk/deployment/custom_env_var
WorkingDirectory=/var/app/current
# If you use rbenv:
# ExecStart=/bin/bash -lc 'exec /home/deploy/.rbenv/shims/bundle exec sidekiq -e production'
# If you use the system's ruby:
# ExecStart=/usr/local/bin/bundle exec sidekiq -e production
# If you use rvm in production without gemset and your ruby version is 2.6.5
# ExecStart=/home/deploy/.rvm/gems/ruby-2.6.5/wrappers/bundle exec sidekiq -e production
# If you use rvm in production with a gemset and your ruby version is 2.6.5
ExecStart=/bin/bash -lc 'cd /var/app/current; bundle exec sidekiq -e production -r /var/app/current -C /var/app/current/config/sidekiq.yml'
# Use `systemctl kill -s TSTP sidekiq` to quiet the Sidekiq process
# !!! Change this to your deploy user account !!!
User=root
Group=root
UMask=0002
# Greatly reduce Ruby memory fragmentation and heap usage
# https://www.mikeperham.com/2018/04/25/taming-rails-memory-bloat/
Environment=MALLOC_ARENA_MAX=2
# if we crash, restart
RestartSec=1
Restart=on-failure
# output goes to /var/log/syslog (Ubuntu) or /var/log/messages (CentOS)
StandardOutput=syslog
StandardError=syslog
# This will default to "bundler" if we don't specify it
SyslogIdentifier=sidekiq
[Install]
WantedBy=multi-user.target
Change ExecStart to:
ExecStart=/direct/path/to/bundle exec sidekiq -e production
Everything else in that line appears superfluous. With the /bin/bash -lc wrapper, bash (not Sidekiq) becomes the unit's main process, so Sidekiq's startup notification arrives from a different PID than the main PID, which is exactly what the "reception only permitted for main PID 31981" log line is complaining about.
Maybe this works in your case:
Type=notify
NotifyAccess=all # or "exec"
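Putting it together with the previous answer, here is a minimal sketch of the relevant [Service] lines; the bundle path is a placeholder, and this assumes Sidekiq >= 6.0.6, which ships the systemd notification integration:
[Service]
Type=notify
# accept the readiness notification even if it comes from a subprocess
NotifyAccess=all
# exec sidekiq directly, without a bash -lc wrapper, so the notifying
# process is the unit's main PID
ExecStart=/direct/path/to/bundle exec sidekiq -e production
WorkingDirectory=/var/app/current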
I've created my own service with systemd. It is supposed to run a Python script once at boot time. The script sends an email with the IP address and the TeamViewer ID, which is why I have a delay in it; otherwise I get an error that the domain of the mail server can't be resolved. The script should run in the background because of the 30-second delay.
The script is located in /usr/bin/glatv.py and is executable, and it runs without a problem. The setup is running on a Raspberry Pi 4 with Raspbian Buster 2020-02-13.
The service is located in /etc/systemd/system/, is executable, and is enabled:
[Unit]
Description=My Own Service
[Service]
Type=oneshot
ExecStart=/usr/bin/glatv.py &
[Install]
WantedBy=reboot.target
But
systemctl start myservice
works without a problem:
● glatvd.service - My Own Service
Loaded: loaded (/etc/systemd/system/glatvd.service; enabled; vendor preset: enabled)
Active: inactive (dead)
Apr 02 12:52:31 raspberrypi systemd[1]: Starting My Own Service...
Apr 02 12:53:02 raspberrypi systemd[1]: glatvd.service: Succeeded.
Apr 02 12:53:02 raspberrypi systemd[1]: Started My Own service.
After a reboot there is no invocation and no log output.
Instead of having an arbitrary 30-second delay, add this to your service file:
After=network-online.target
Wants=network-online.target
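Here is a minimal sketch of the adjusted unit under that change. It also drops the trailing &, which systemd does not interpret (Type=oneshot needs no backgrounding), and installs into multi-user.target instead of reboot.target so the unit is actually pulled in at normal boot:
[Unit]
Description=My Own Service
Wants=network-online.target
After=network-online.target
[Service]
Type=oneshot
ExecStart=/usr/bin/glatv.py
[Install]
WantedBy=multi-user.target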
You should also run this command so that your service is enabled to run after a restart:
systemctl enable myservice
and for logs, I believe you can put these parameters into your service's config file (the file: prefix requires systemd 236 or newer):
StandardOutput=file:/path/to/info/log/info_log.log
StandardError=file:/path/to/error/log/error_log.log
Anyway, I got this from this reference: How to redirect output of systemd service to a file
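Alternatively, since a service's stdout/stderr go to the journal by default, the output should already be readable without any redirection (the unit name is taken from the status output above):
journalctl -u glatvd.service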
I want to run Hazelcast in Docker on AWS instances as a POC for future use.
I use the following configuration to run it on my laptop for some investigation:
docker run -e JAVA_OPTS="-Dhazelcast.local.publicAddress=192.168.1.227:5701" -itd -p 5701:5701 hazelcast/hazelcast
docker run -e JAVA_OPTS="-Dhazelcast.local.publicAddress=192.168.1.227:5702" -itd -p 5702:5701 hazelcast/hazelcast
It starts OK, but once I try to open it in the browser I get the following warnings:
docker logs -ft a91ed298117a
2020-02-02T16:30:41.846203500Z ########################################
2020-02-02T16:30:41.846284000Z # JAVA_OPTS=-Dhazelcast.mancenter.enabled=false -Djava.net.preferIPv4Stack=true -Djava.util.logging.config.file=/opt/hazelcast/logging.properties -XX:MaxRAMPercentage=80.0 -Dhazelcast.local.publicAddress=192.168.1.227:5702
2020-02-02T16:30:41.846346700Z # CLASSPATH=/opt/hazelcast/*:/opt/hazelcast/lib/*
2020-02-02T16:30:41.846374200Z # starting now....
2020-02-02T16:30:41.846424700Z ########################################
2020-02-02T16:30:41.846467100Z + exec java -server -Dhazelcast.mancenter.enabled=false -Djava.net.preferIPv4Stack=true -Djava.util.logging.config.file=/opt/hazelcast/logging.properties -XX:MaxRAMPercentage=80.0 -Dhazelcast.local.publicAddress=192.168.1.227:5702 com.hazelcast.core.server.StartServer
Members {size:2, ver:2} [
2020-02-02T16:30:52.360102700Z Member [192.168.1.227]:5701 - e152d11b-df3e-4c29-a363-188842fc624c
2020-02-02T16:30:52.360128200Z Member [192.168.1.227]:5702 - e7811c67-34ef-4ec5-9687-1945d7c36b69 this
2020-02-02T16:30:52.360159400Z ]
2020-02-02T16:30:52.360183200Z
2020-02-02T16:30:53.384531200Z Feb 02, 2020 4:30:53 PM com.hazelcast.core.LifecycleService
2020-02-02T16:30:53.384586000Z INFO: [192.168.1.227]:5702 [dev] [3.12.6] [192.168.1.227]:5702 is STARTED
2020-02-02T16:31:00.582731400Z Feb 02, 2020 4:31:00 PM com.hazelcast.nio.tcp.TcpIpConnection
2020-02-02T16:31:00.582871900Z WARNING: [192.168.1.227]:5702 [dev] [3.12.6] Connection[id=2, /172.17.0.3:5701->/172.17.0.1:60574, qualifier=null, endpoint=null, alive=false, type=NONE] closed. Reason: Exception in Connection[id=2, /172.17.0.3:5701->/172.17.0.1:60574, qualifier=null, endpoint=null, alive=true, type=NONE], thread=hz._hzInstance_1_dev.IO.thread-in-1
2020-02-02T16:31:00.582909200Z java.lang.IllegalStateException: REST API is not enabled.
2020-02-02T16:31:00.583013000Z at com.hazelcast.nio.tcp.UnifiedProtocolDecoder.onRead(UnifiedProtocolDecoder.java:96)
2020-02-02T16:31:00.583049600Z at com.hazelcast.internal.networking.nio.NioInboundPipeline.process(NioInboundPipeline.java:135)
2020-02-02T16:31:00.583077900Z at com.hazelcast.internal.networking.nio.NioThread.processSelectionKey(NioThread.java:369)
2020-02-02T16:31:00.583122400Z at com.hazelcast.internal.networking.nio.NioThread.processSelectionKeys(NioThread.java:354)
2020-02-02T16:31:00.583189100Z at com.hazelcast.internal.networking.nio.NioThread.selectLoop(NioThread.java:280)
2020-02-02T16:31:00.583220000Z at com.hazelcast.internal.networking.nio.NioThread.run(NioThread.java:235)
2020-02-02T16:31:00.583249400Z
2020-02-02T16:31:00.604505300Z Feb 02, 2020 4:31:00 PM com.hazelcast.nio.tcp.TcpIpConnection
Could you please help me understand where I went wrong?
The Hazelcast REST API is not enabled by default, and that is why you get the exception in the logs. Also keep in mind that it does not make much sense to open Hazelcast in the browser, since it does not serve any HTTP webpage.
That said, you have successfully run a Hazelcast cluster in Docker. Now if you want to play with it, the simplest way is either to enable the REST API or to use your language of choice and connect with a Hazelcast client.
1. REST API
To start Hazelcast with REST API enabled, you need to add -Dhazelcast.rest.enabled=true to your JAVA_OPTS. So in your case, you can run the following commands:
docker run -e JAVA_OPTS="-Dhazelcast.local.publicAddress=192.168.1.227:5701 -Dhazelcast.rest.enabled=true" -itd -p 5701:5701 hazelcast/hazelcast:3.12.6
docker run -e JAVA_OPTS="-Dhazelcast.local.publicAddress=192.168.1.227:5702 -Dhazelcast.rest.enabled=true" -itd -p 5702:5701 hazelcast/hazelcast:3.12.6
Then, you can use Hazelcast REST API, for example to add and read the value form the map:
$ curl -X POST 192.168.1.227:5701/hazelcast/rest/maps/mapName/foo -d "bar"
$ curl 192.168.1.227:5701/hazelcast/rest/maps/mapName/foo
bar
2. Hazelcast Client
There are Hazelcast Clients in most programming languages. You only need to specify 192.168.1.227:5701 and 192.168.1.227:5702 as the address of your Hazelcast cluster. For example, in Python it would look like this.
import hazelcast
config = hazelcast.ClientConfig()
config.network_config.addresses.append("192.168.1.227:5701")
config.network_config.addresses.append("192.168.1.227:5702")
client = hazelcast.HazelcastClient(config)
my_map = client.get_map("map")
my_map.put("key", "value")
client.shutdown()
Then, you can run it with:
pip install "hazelcast-python-client<4" && python client.py
(pinning the client to the 3.x series so it matches the 3.12.6 server and the ClientConfig API used above)
I want to run a script at system startup on a Debian 9 box. My script works when run standalone, but fails under systemd.
My script just copies a backup file from a remote server to the local machine:
#!/bin/sh
set -e
/usr/bin/sshpass -p "PASSWORD" /usr/bin/scp -p USER@10.0.0.2:ORIGINPATH/backupserver.zip DESTINATIONPATH/backupserver/
Just for privacy I replaced password, user, and paths above.
I wrote the following systemd service unit:
[Unit]
Description=backup script
[Service]
Type=oneshot
ExecStart=PATH/backup.sh
[Install]
WantedBy=default.target
Then I set permissions for the script:
chmod 744 PATH/backup.sh
And installed the service:
chmod 664 /etc/systemd/system/backup.service
systemctl daemon-reload
systemctl enable backup.service
When I reboot the script fails:
● backup.service - backup script
Loaded: loaded (/etc/systemd/system/backup.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sat 2017-05-13 13:39:54 -03; 47min ago
Main PID: 591 (code=exited, status=1/FAILURE)
Result of journalctl -xe:
mai 16 23:34:27 rodrigo-acer systemd[1]: backup.service: Main process exited, code=exited, status=6/NOTCONFIGURED
mai 16 23:34:27 rodrigo-acer systemd[1]: Failed to start backup script.
mai 16 23:34:27 rodrigo-acer systemd[1]: backup.service: Unit entered failed state.
mai 16 23:34:27 rodrigo-acer systemd[1]: backup.service: Failed with result 'exit-code'.
What could be wrong?
Solved, guys. There were two problems:
1 - I had to change the service unit file so the service runs only after the network is up. The [Unit] section was changed to:
[Unit]
Description = World server backup
Wants = network-online.target
After = network.target network-online.target
2 - The root user did not have the remote host in its known_hosts list, unlike the ordinary user I had used to test the script.
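A quick way to seed root's known_hosts is ssh-keyscan (the host IP is taken from the script above; adjust to your remote host):
ssh-keyscan -H 10.0.0.2 >> /root/.ssh/known_hosts
Alternatively, run the scp once interactively as root and accept the host key.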
For "Failed with result 'exit-code'", you could try this on the last line of your script:
# REQUIRED FOR SYSTEMD: 0 means a clean exit, no error
exit 0
You may also need to add:
Type=forking
to the systemd entry similar to: https://serverfault.com/questions/751030/systemd-ignores-return-code-while-starting-service
If your service or script does not fork, add a & at the end to run it in the background, and exit with 0 quickly. Otherwise it behaves like a startup that times out and takes forever / looks like a frozen service.
I have two problems running freeswitch from systemd:
EDIT 2 - I have moved the slow start up question to here (Freeswitch pauses on check_ip at boot on centos 7.1) as although they may be related it's probably good as a standalone.
EDIT - I have noticed something else. Look at these lines, captured from the terminal output when running it from there. The gap is 4 minutes, but it has been around 10 minutes before. I noticed it because I was trying to find out why port 8021 was taking several minutes to accept the fs_cli connection. Why does this happen? It has never happened to me before and I've installed loads of FS boxes. It does the same thing on both 1.7 and today's 1.6.
2015-10-23 12:57:35.280984 [DEBUG] switch_scheduler.c:249 Added task 1 heartbeat (core) to run at 1445601455
2015-10-23 12:57:35.281046 [DEBUG] switch_scheduler.c:249 Added task 2 check_ip (core) to run at 1445601455
2015-10-23 13:01:31.100892 [NOTICE] switch_core.c:1386 Created ip list rfc6598.auto default (deny)
I sometimes get double processes started. Here is my status line after such an occurrence:
# systemctl status freeswitch -l
freeswitch.service - freeswitch
Loaded: loaded (/etc/systemd/system/multi-user.target.wants/freeswitch.service)
Active: activating (start) since Fri 2015-10-23 01:31:53 BST; 18s ago
Main PID: 2571 (code=exited, status=0/SUCCESS); : 2742 (freeswitch)
CGroup: /system.slice/freeswitch.service
├─usr/bin/freeswitch -ncwait -core -db /dev/shm -log /usr/local/freeswitch/log -conf /usr/local/freeswitch/conf -run /usr/local/freeswitch/run
└─usr/bin/freeswitch -ncwait -core -db /dev/shm -log /usr/local/freeswitch/log -conf /usr/local/freeswitch/conf -run /usr/local/freeswitch/run
Oct 23 01:31:53 fswitch-1 systemd[1]: Starting freeswitch...
Oct 23 01:31:53 fswitch-1 freeswitch[2742]: 2743 Backgrounding.
and there are two processes running.
The PID file is sometimes not written fast enough for systemd to pick it up, but no matter how fast I check after seeing this message, the file is always there by the time I look:
Oct 23 02:00:26 arribacom-sbc-1 systemd[1]: PID file /usr/local/freeswitch/run/freeswitch.pid not readable (yet?) after start.
Now, in (2) everything seems to work ok, and I can shut down the freeswitch process using
systemctl stop freeswitch
without any issues, but in (1) it just doesn't seem to do anything.
I'm wondering if the two are related, and that freeswitch is reporting back to systemd that the program is running before it actually is. Then systemd is either starting up another process or (sometimes) not.
Can anyone offer any pointers? I have tried to mail the freeswitch users list but despite being registered I simply cannot get any emails to appear on the list (but that's another problem).
* Update *
If I remove the -ncwait it seems to reduce the double process starting, but I still get the can't-read-PID warning, so I'm sure there's still an issue present, possibly around timing(?).
I'm on CentOS 7.1, and my FreeSWITCH version is
FreeSWITCH Version 1.7.0+git~20151021T165609Z~9fee9bc613~64bit (git
9fee9bc 2015-10-21 16:56:09Z 64bit)
and here's my freeswitch.service file (some things have been commented out until I understand what they are doing and any side effects they may have):
[Unit]
Description=freeswitch
After=syslog.target network.target
#
[Service]
Type=forking
PIDFile=/usr/local/freeswitch/run/freeswitch.pid
PermissionsStartOnly=true
ExecStart=/usr/bin/freeswitch -nc -core -db /dev/shm -log /usr/local/freeswitch/log -conf /u
ExecReload=/usr/bin/kill -HUP $MAINPID
#ExecStop=/usr/bin/freeswitch -stop
TimeoutSec=120s
#
WorkingDirectory=/usr/bin
User=freeswitch
Group=freeswitch
LimitCORE=infinity
LimitNOFILE=999999
LimitNPROC=60000
LimitSTACK=245760
LimitRTPRIO=infinity
LimitRTTIME=7000000
#IOSchedulingClass=realtime
#IOSchedulingPriority=2
#CPUSchedulingPolicy=rr
#CPUSchedulingPriority=89
#UMask=0007
#
[Install]
WantedBy=multi-user.target
In the current master branch, take the two files from the debian/ directory:
freeswitch-systemd.freeswitch.service -- should go as /lib/systemd/system/freeswitch.service
freeswitch-systemd.freeswitch.tmpfile -- should go as /usr/lib/tmpfiles.d/freeswitch.conf
You probably need to adapt the paths, or build FreeSWITCH to use standard Debian paths.
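For reference, a hedged sketch of installing the two files from a source checkout (the checkout path is an assumption):
cd /usr/src/freeswitch
cp debian/freeswitch-systemd.freeswitch.service /lib/systemd/system/freeswitch.service
cp debian/freeswitch-systemd.freeswitch.tmpfile /usr/lib/tmpfiles.d/freeswitch.conf
systemctl daemon-reload
systemd-tmpfiles --create /usr/lib/tmpfiles.d/freeswitch.conf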