Monit config param - group? - linux

I cannot locate information on the purpose of the group parameter below.
It appears in the documentation as a way to control access to a block, but I can't work out what it actually does in the following block.
# nginx check:
check process nginx with pidfile /var/run/nginx.pid
start program = "/etc/init.d/nginx start" with timeout 60 seconds
stop program = "/etc/init.d/nginx stop"
if cpu > 50% for 2 cycles then alert
group www-data
Also, in some examples for monit, you'll see an explicit fail condition with a then restart command. My understanding is that the above block handles this for us automatically in the event of a failure. Do I understand this correctly?

Groups are useful for the HTML GUI interface for Monit and M/Monit. You can also use them on the command line to act on every service in a group at once, for example:
monit -g www-data stop
will stop all processes that carry that group name.
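The same -g flag works with the other process actions too (the group name www-data here is just the one from your block above):
monit -g www-data start
monit -g www-data restart
monit -g www-data unmonitor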
The "depends on" command might do what you are thinking of, for example:
check process postfix with pidfile /var/spool/postfix/pid/master.pid
start program = "/etc/init.d/postfix start"
stop program = "/etc/init.d/postfix stop"
depends on postfix_bin
check file postfix_bin with path /usr/sbin/postfix
if failed permission 0755 then unmonitor
If postfix has the wrong permissions (or isn't installed), Monit will not try to start it.
Your example above will only raise an alert when the CPU condition is met. You need to replace the "alert" with "restart" in order to have Monit run the stop and then the start program automatically. If you remove the if statement entirely, Monit will default to doing a restart on failure (i.e. when the process is not running).
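For example, here is the block from the question with only the cpu action changed; everything else stays as you already have it:
check process nginx with pidfile /var/run/nginx.pid
start program = "/etc/init.d/nginx start" with timeout 60 seconds
stop program = "/etc/init.d/nginx stop"
if cpu > 50% for 2 cycles then restart
group www-data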

How to track down a process that's running too long?

I have a VPS with firewall and security notices enabled. I keep getting emails like this:
Time: Wed Jun 19 19:01:54 2019 -0500
Account: user
Resource: Process Time
Exceeded: 7248 > 3600 (seconds)
Executable: /opt/cpanel/ea-php72/root/usr/sbin/php-fpm
Command Line: php-fpm: pool domain_com
PID: 16374 (Parent PID:9915)
Killed: No
So for some reason with this example I have a script that has apparently been running for 2+ hours non-stop. I don't have anything that should be doing that.
I'm getting notices like this quite often. How can I use this info to track down what specifically is causing this?
Any information would be greatly appreciated. Thanks!
You can inspect the exact process using the process ID mentioned in the alert:
lsof -p 16374
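If you also want to see how long that process has been running and the full command it was started with, standard ps options can show it (the PID is just the one from the alert):
ps -o pid,ppid,user,etime,cmd -p 16374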
The alert you are getting is from LFD, which is installed as part of CSF. I think it's normal for cPanel with PHP-FPM to have php-fpm processes running this long.
You can add php-fpm to the csf.pignore file to stop this warning.
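For example, an entry along these lines in /etc/csf/csf.pignore should make LFD ignore that binary (the path is taken from your alert; check the syntax notes inside csf.pignore itself before relying on it):
exe:/opt/cpanel/ea-php72/root/usr/sbin/php-fpm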
You can also refer to the cPanel forum thread below.
https://forums.cpanel.net/threads/lfd-excessive-resource-usage-normal-for-php-fpm.592583/
To get more information on processes, I would use the htop tool; there are good articles on managing processes with htop and ps.
lsof (list open files) will tell you more about which files the process is using.
You can get htop and lsof with
sudo apt install htop lsof -y
This article indicates that:
That message comes from the third-party CSF/LFD application and indicates a PHP-FPM process was running longer than the maximum time configured for the CSF/LFD detection period. It shows the process was not killed, thus you should not have traffic loss.
So you might want to check the PHP-FPM error log for the account in-question to see if you notice any particular error messages. It's located at:
/home/$username/logs/domain_tld.php.error.log
It looks like your specific issue has not been resolved on that forum, so you might want to try strace. It watches the system calls made by a given process, including all read/write operations and OS function calls. You can either launch it on the command line in front of the program you want to trace, or attach it to a running process by pressing s on a process selected in htop.
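For example, to attach strace to the PID from the alert and watch only file-related syscalls (standard strace flags: -f follows child processes, -tt adds timestamps):
sudo strace -f -tt -e trace=file -p 16374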

Make ExecStartPost command to run in background

I have a systemd service for my Spring Boot application, registered with a Consul server and sitting behind HAProxy. Consul provides consul-template to automatically update the service location in the HAProxy configuration file via the consul-template command.
consul-template takes a template file, writes the final HAProxy configuration file, and then reloads HAProxy.
Now, the consul-template process needs to run in the background alongside my application at all times, so that as the application comes up, it can detect the new application startup and update its location in the configuration file.
Here is my systemd service file for this.
[Unit]
Description=myservice
Requires=network-online.target
After=network-online.target
[Service]
Type=forking
PIDFile=/home/dragon/myservice/run/myservice.pid
ExecStart=/home/dragon/myservice/bin/myservice-script start
ExecReload=/home/dragon/myservice/bin/myservice-script reload
ExecStop=/home/dragon/myservice/bin/myservice-script stop
ExecStartPost=consul-template -template '/etc/haproxy/haproxy.cfg.template:/etc/haproxy/haproxy.cfg:sudo systemctl reload haproxy'
User=dragon
[Install]
WantedBy=multi-user.target
Now, when I run systemctl start myservice, my application starts and the call to consul-template also works, but the consul-template process doesn't go into the background. I have to press Ctrl+C, and only then does systemctl return, with both my application and the consul-template process running.
Is there a way to run the consul-template process specified in ExecStartPost in the background?
I tried adding & at the end of the ExecStartPost command, but then consul-template complains that it is an invalid extra argument and fails.
I also tried wrapping the command as /bin/sh -c "consul-template command here...", but that doesn't work either. Even nohup in that command wasn't working.
Any help is really appreciated.
A workaround would be to have a bash script as your entrypoint, add everything you need in there, and then it will all magically work.
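A minimal sketch of that idea, assuming the paths from the question and that the wrapper replaces the existing ExecStart (the Type=forking and PIDFile handling would still need to be adapted to your myservice-script):
#!/bin/sh
# hypothetical wrapper: both processes stay inside the service's cgroup,
# so systemd does not reap consul-template when the start step finishes
consul-template -template '/etc/haproxy/haproxy.cfg.template:/etc/haproxy/haproxy.cfg:sudo systemctl reload haproxy' &
exec /home/dragon/myservice/bin/myservice-script start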
I was trying to accomplish the same task. I wanted to fire off some HTTP requests to Tomcat once the service had started, so that I could warm up our servers ahead of the first user request.
I went through a lot of trial and error trying to use ExecStartPost to fire off an async process, but nothing actually worked. By calling a shell script I could trigger background processes, but from my testing systemd appears to kill the process thread when ExecStartPost finishes, so any child processes end up getting killed too. I tried various combinations of &, setsid, nohup, etc., even some Perl, to try to launch an executable in its own thread, but as soon as the shell script exited from ExecStartPost any running processes were killed. It's possible there's some solution that would work using ExecStartPost, but I couldn't find it.
However, what did work is creating a new service (like #divinedragon mentions) which piggybacks on the service I wanted to monitor (in this case Tomcat).
Since it took me a little research to get something working the way I wanted, I wanted to share my solution in case it helps someone.
The first step is to create a new service (e.g. /usr/lib/systemd/system/tomcat-service-listener.service):
[Unit]
Description=Tomcat start/stop event listener
# make sure to stop the service when Tomcat stops
BindsTo=tomcat.service
# waits for both Nginx & Tomcat to be started before this service is started
After=nginx.service tomcat.service
[Service]
Type=oneshot
ExecStart=/path/to/your/script.sh start
ExecStop=/path/to/your/script.sh stop
RemainAfterExit=yes
TimeoutStartSec=300
[Install]
# When the service is enabled, forces this service to start when Tomcat is started
WantedBy=tomcat.service
Some notes on what is happening here:
The BindsTo makes sure the service gets stopped when Tomcat is stopped. This triggers the ExecStop command.
The After makes sure that on server reboot, this service does not start until both Nginx & Tomcat have started.
The WantedBy will create the wants symlink for Tomcat (when the service is enabled), which will force Tomcat to start this service any time it's restarted.
RemainAfterExit=yes is necessary for the ExecStop to work. If you only care about triggering something when your service is started and don't care about when it is stopped, you can set this to no and remove the ExecStop line.
Make the TimeoutStartSec long enough for whatever task you plan on running.
To get this service working, you then need to do the following:
# set the standard permissions on the unit file
chmod 664 /usr/lib/systemd/system/tomcat-service-listener.service
# make Systemd aware of the new service
systemctl daemon-reload
# register the service so it's started/stopped with Tomcat
systemctl enable tomcat-service-listener.service
Now all you need is a script to trigger the logic you want. In my case, I wanted to warm up some servers once Tomcat started, so my /path/to/your/script.sh looks something like:
#!/bin/sh
SCRIPT_MODE="$1"
LOGFILE=/var/logs/myscript.log

log_message() {
    local MESSAGE="$1"
    echo "$(date '+%Y-%m-%d %H:%M:%S') $MESSAGE" >> "$LOGFILE"
    return 0
}

warmup_server() {
    local SERVER_ADDRESS="$1"
    local SERVER_DESCRIPTION="$2"
    log_message "Warming up $SERVER_DESCRIPTION..."
    # we want to track the time it took to warm up the server
    local START_TIME=$(date +%s)
    # server restarts can take a while for all services to start, so we must retry long enough for all relevant services to start
    HTTP_STATUS=$(curl --insecure --location --silent --show-error --fail --retry 60 --retry-delay 2 --retry-max-time 240 --output /dev/null --write-out "%{http_code}" "$SERVER_ADDRESS")
    local TOTAL_STARTUP_TIME=$(($(date +%s) - $START_TIME))
    log_message "$SERVER_DESCRIPTION started in $TOTAL_STARTUP_TIME seconds... (Status: $HTTP_STATUS)"
    return 0
}

# monitor when Tomcat has stopped
if [ "$SCRIPT_MODE" = "stop" ]; then
    log_message "Tomcat listener shutting down..."
    exit 0
elif [ "$SCRIPT_MODE" = "start" ]; then
    log_message "Tomcat listener started..."
fi

# servers to warm up
warmup_server 'https://127.0.0.1' 'Localhost #1'
warmup_server 'https://127.0.0.2' 'Localhost #2'
This seems to be working exactly as I want. The service starts up when the server is rebooted, and starting/stopping/restarting Tomcat fires off the expected events. Since it's independent of the Tomcat service, I can restart this warmup script if needed. It also doesn't delay Tomcat's startup time, since it is its own service and therefore runs asynchronously, as I wanted.

How to run a Go app continuously on an Ubuntu server

Couldn't seem to find a direct answer around here.
I'm not sure if I should run ./myBinary as a Cron process or if I should run "go run myapp.go"
What's an effective way to make sure that it is always running?
Sorry I'm used to Apache and Nginx.
Also what are best practices for deploying a Go app? I want everything (preferably) all served on the same server. Just like how my development environment is like.
I read something else that used S3, but, I really don't want to use S3.
Use the capabilities your init process provides. You're likely running a system with either systemd or Upstart. They've both got really easy descriptions of services and can ensure your app runs with the right privileges, is restarted when anything goes down, and that its output is handled correctly.
For a quick Upstart setup, your service description is likely to be just:
start on runlevel [2345]
stop on runlevel [!2345]
setuid the_username_your_app_runs_as
exec /path/to/your/app --options
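On Upstart systems that job description would typically be saved under /etc/init/ (the file name below is just an example):
# assuming the job file is /etc/init/myapp.conf
sudo start myapp
sudo status myapp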
For a quick systemd setup, your service is likely to be just:
[Unit]
Description=Your service
[Service]
User=the_username_your_app_runs_as
ExecStart=/path/to/your/app --options
[Install]
WantedBy=multi-user.target
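If you also want systemd to bring the app back up after a crash (the "restarted when anything goes down" part), you can add a Restart= setting to the [Service] section and enable the unit; a sketch, assuming the unit file is saved as /etc/systemd/system/myapp.service:
[Service]
Restart=always
RestartSec=2
# then reload systemd and enable the unit
sudo systemctl daemon-reload
sudo systemctl enable --now myapp.service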
You can put it in an infinite loop, such as:
#! /bin/sh
while true; do
    go run myapp.go
    sleep 2 # Just in case
done
Hence, once the app dies for some reason, it will be run again.
You can put it in a script and run it in the background using:
$ nohup ./my-script.sh >/dev/null 2>&1 &
You may want to go for a virtual terminal utility like screen here. Example:
screen -S myapp # create a screen session named myapp
cd ... # to your app directory
go run myapp.go # or go install and then run ./myapp from the Go bin dir
Ctrl-a d # press Ctrl-a, then d, to detach from the screen
If you want to return to the screen:
screen -r myapp
EDIT: this solution will keep the process running when you leave the terminal, but won't restart it when it crashes.

Monit check processes for multiple pidfiles

I have a Node.js web app with multiple node processes, which I start using pm2 start app.js multiple times.
To monitor these processes using monit, I created an init script using pm2's pm2 startup ubuntu command. Then, in the monit config file for my app, I use this init script as the start and stop program commands, and I use something like check process pm2_1 with pidfile /path/to/node-pidfile for both of the processes.
I would like monit to check the pidfiles of both these processes and, when either or both are down, restart both processes. So, here's what my my-webapp.monitrc looks like:
check process pm2_1
with pidfile /root/.pm2/pids/proc1-0.pid
start program = "/etc/init.d/pm2-init.sh start"
stop program = "/etc/init.d/pm2-init.sh stop"
check process pm2_2
with pidfile /root/.pm2/pids/proc2-1.pid
start program = "/etc/init.d/pm2-init.sh start"
stop program = "/etc/init.d/pm2-init.sh stop"
The problem is, if either of the processes is down, it works. But if both processes are down, monit executes the start command twice.
Is there a way to have an "OR" condition for check process, to monitor multiple different pidfiles and execute the same start and stop commands only once?

Node.js Ubuntu and Monit

I'm working on getting a Node server up with upstart and monit instead of using a cron job to run a script to check on things. I've built an admin dashboard for the server that uses the Node os module for things like os.loadavg() and os.totalmem(), etc...
The problem is, when monit is running, os.loadavg() always returns [0, 0, 0]. Has anyone else encountered this problem? Does monit create a lock or something that does not allow Node to read that property?
Thanks in advance for any help!
Monit Script
check process flinch
with pidfile "/var/run/flinch.pid"
start program = "/sbin/start flinch"
stop program = "/sbin/stop flinch"
if loadavg (1min) > 4 then alert
if loadavg (5min) > 2 then alert
if memory usage > 0% then alert
To give this question some closure: I removed monit from the system check and wrote a custom bash script that checks the process and runs from a cron job every minute. Monit seems to put a lock on the system stats when in use.
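The answer doesn't include that script, but a minimal sketch of such a check, reusing the pidfile and start command from the monit config above (treat the paths and the cron line as assumptions), could look like:
#!/bin/sh
# hypothetical watchdog, run from cron every minute, e.g.:
# * * * * * /usr/local/bin/check-flinch.sh
PIDFILE=/var/run/flinch.pid
if [ ! -f "$PIDFILE" ] || ! kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    /sbin/start flinch
fi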
