How does Google Container Optimized OS handle a scheduled shutdown? - google-container-os

I'm playing around with Container Optimized OS on Google Cloud and found that the 'Auto Update' feature doesn't apply the updates until the system is restarted, and doesn't offer any functionality for scheduling a reboot after an update is applied.
I'm writing a simple startup script that schedules a shutdown when a reboot is needed, essentially:
#!/usr/bin/env sh
update_engine_client --block_until_reboot_is_needed
shutdown -r 02:00
My question is: how do I determine whether a shutdown has been scheduled? I have tried three methods so far that don't work in this OS:
$ ps -ef | grep shutdown - no shutdown process
$ systemctl status systemd-shutdownd.service - Unit systemd-shutdownd.service could not be found.
$ cat /run/systemd/shutdown/scheduled - no file found
Documentation on this OS, and what it's based on, are slim. What determines how shutdown is scheduled, and how does COS handle it?

In regard to your question: how do I determine whether a shutdown has been scheduled?
There is no shutdown task configured by default; you have to configure one yourself (daily, weekly, monthly, etc.). The easiest way to do this is with cron, the standard Linux task scheduler; please follow this guide to learn how to configure cron jobs. Note that COS is based on the open source Chromium OS project rather than Ubuntu, so confirm that crond is available on your image first.
According to this GCP guide: " Container-Optimized OS instances are configured to automatically download weekly updates in the background; only a reboot is necessary to use the latest updates."
So I suggest scheduling your cron jobs weekly, on off-peak production days (Saturday or Sunday).
Please let me know if you have further questions.

In Container-Optimized OS, the following command will display pending shutdown information in epoch time:
$ busctl get-property org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager ScheduledShutdown
I am curious why Google chose to use busctl instead of systemd - I was unfamiliar with busctl and had to do some reading on it to understand what the command is doing - busctl man page
Example:
$ sudo shutdown -r 02:00
Shutdown scheduled for Fri 2020-07-17 02:00:00 UTC, use 'shutdown -c' to cancel.
$ busctl get-property org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager ScheduledShutdown
(st) "reboot" 1594951200000000
$ sudo shutdown -c
$ busctl get-property org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager ScheduledShutdown
(st) "" 0
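For the startup script in the question, the output shown above could be parsed along these lines. This is only a minimal sketch: parse_scheduled_shutdown is a hypothetical helper name, and it assumes the output always has the three-field shape shown in the example (signature, quoted type, microsecond timestamp).

```shell
#!/bin/sh
# Sketch: turn one line of ScheduledShutdown output into "<type> <epoch seconds>".
# parse_scheduled_shutdown is a hypothetical helper, not part of COS or systemd.
parse_scheduled_shutdown() {
    # $1 is the raw busctl output, e.g.: (st) "reboot" 1594951200000000
    printf '%s\n' "$1" | {
        read -r _sig kind usec
        # Strip the double quotes around the shutdown type
        kind=$(printf '%s' "$kind" | tr -d '"')
        if [ -z "$kind" ] || [ "$usec" = "0" ]; then
            echo "none"
        else
            # logind reports microseconds; integer-divide down to seconds
            echo "$kind $((usec / 1000000))"
        fi
    }
}

# Intended use on the instance (not executed here):
#   out=$(busctl get-property org.freedesktop.login1 /org/freedesktop/login1 \
#         org.freedesktop.login1.Manager ScheduledShutdown)
#   [ "$(parse_scheduled_shutdown "$out")" = "none" ] && shutdown -r 02:00
```

With a check like this, the startup script only schedules a reboot when none is pending, so running it twice does not reschedule an existing one.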

Related

How to track down process that's running too long?

I have a VPS with firewall and security notices enabled. I keep getting emails like this:
Time: Wed Jun 19 19:01:54 2019 -0500
Account: user
Resource: Process Time
Exceeded: 7248 > 3600 (seconds)
Executable: /opt/cpanel/ea-php72/root/usr/sbin/php-fpm
Command Line: php-fpm: pool domain_com
PID: 16374 (Parent PID:9915)
Killed: No
So for some reason with this example I have a script that has apparently been running for 2+ hours non-stop. I don't have anything that should be doing that.
I'm getting notices like this quite often. How can I use this info to track down what specifically is causing this?
Any information would be greatly appreciated. Thanks!
You can track down the exact process using the process ID mentioned in the alert.
lsof -p 16374
The alert you are getting comes from LFD, which is installed as part of CSF. I think it's normal for cPanel with PHP-FPM to have php-fpm processes run this long.
You can add php-fpm to the csf.pignore file to stop this warning.
You can also refer to the cPanel forum thread below.
https://forums.cpanel.net/threads/lfd-excessive-resource-usage-normal-for-php-fpm.592583/
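If lsof is not installed, much of the same information can be read straight from /proc. The following is a rough sketch, assuming a Linux system with /proc mounted and enough privileges to read the target process; inspect_pid is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: show what a PID is running, where, and which files it holds open.
# inspect_pid is a hypothetical helper; it only reads from /proc.
inspect_pid() {
    pid=$1
    if [ ! -d "/proc/$pid" ]; then
        echo "process $pid is not running" >&2
        return 1
    fi
    # Arguments in cmdline are NUL-separated; make them readable
    printf 'cmdline: %s\n' "$(tr '\0' ' ' < "/proc/$pid/cmdline")"
    # The working directory often hints at which site a worker is serving
    cwd=$(readlink "/proc/$pid/cwd" 2>/dev/null) || cwd="(unreadable)"
    printf 'cwd: %s\n' "$cwd"
    # Open file descriptors, roughly what lsof -p would list
    ls -l "/proc/$pid/fd" 2>/dev/null || true
}

# Intended use with the PID from the alert (not executed here):
#   sudo sh -c '. ./inspect_pid.sh; inspect_pid 16374'
```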
To get more information on processes, I would use the Htop tool. This is a great article for learning about how to manage processes using htop and ps
Lsof (List open files) will tell you more information about what files the process is using.
You can get htop and lsof with
sudo apt install htop lsof -y
This article indicates that:
That message comes from the third-party CSF/LFD application and indicates a PHP-FPM process was running longer than the maximum time configured for the CSF/LFD detection period. It shows the process was not killed, thus you should not have traffic loss.
So you might want to check the PHP-FPM error log for the account in-question to see if you notice any particular error messages. It's located at:
/home/$username/logs/domain_tld.php.error.log
It looks like your specific issue has not been resolved on that forum, so you might want to try strace. It watches the system calls made by a given process, including all read and write operations and OS function calls. You can start it on the command line in front of the program you want to trace, or attach it to a running process by pressing s on a process selected in htop.

Redis Startup issues on Debian Stretch (9)

I'm in the process of switching to Debian 9 for the company's new production servers and want to provision them with Ansible. So far everything works fine, but now I'm stuck with redis-server.
By default, Debian 9 comes with redis version 3.2. I'm installing the package via apt-get install redis-server. After that, redis starts up as a daemon in the background. Now I want to apply some custom configuration, like binding to 2 different IPs (127.0.0.1 and the server IP).
After changing this as well as the daemonize option (to yes), Redis no longer starts cleanly in the background. Whenever I run either service redis-server start or /etc/init.d/redis-server start, the command just hangs.
journalctl -xe tells me that the PID file is not readable (redis-server.service: PID file /var/run/redis/redis-server.pid not readable (yet?) after start-post: No such file or directory) even though it should be created according to the init.d script:
start)
echo -n "Starting $DESC: "
mkdir -p $RUNDIR
touch $PIDFILE
chown redis:redis $RUNDIR $PIDFILE
chmod 755 $RUNDIR
Still, I can see that both service redis-server start and /etc/init.d/redis-server start the server, and I'm also able to connect to it via redis-cli. But the start command itself hangs.
Can anyone help? If you need further information, just let me know. I'll provide what ever possible if this solves the problem!
best
Chris
I had a similar situation on a Centos 7 server.
The resolution was to change supervised from no to auto
# By default Redis does not run as a daemon. Use 'yes' if you need it.
# Note that Redis will write a pid file in /var/run/redis.pid when daemonized.
daemonize yes
# If you run Redis from upstart or systemd, Redis can interact with your
# supervision tree. Options:
# supervised no - no supervision interaction
# supervised upstart - signal upstart by putting Redis into SIGSTOP mode
# supervised systemd - signal systemd by writing READY=1 to $NOTIFY_SOCKET
# supervised auto - detect upstart or systemd method based on
# UPSTART_JOB or NOTIFY_SOCKET environment variables
# Note: these supervision methods only signal "process is ready."
# They do not enable continuous liveness pings back to your supervisor.
supervised auto
When Redis runs as a daemon under systemd, it needs to interact with systemd for process management (if I have read the documentation correctly).
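To confirm which of these settings are actually active (and not just present in comments), a quick check like the one below can help. It is a sketch only: show_redis_supervision is a hypothetical helper name, and /etc/redis/redis.conf is assumed as the default path because that is where the Debian package normally puts it.

```shell
#!/bin/sh
# Sketch: print the uncommented daemonize / supervised / pidfile directives.
# show_redis_supervision is a hypothetical helper; the default path is an assumption.
show_redis_supervision() {
    conf=${1:-/etc/redis/redis.conf}
    # Anchoring at line start (after optional whitespace) skips commented-out lines
    grep -E '^[[:space:]]*(daemonize|supervised|pidfile)[[:space:]]' "$conf"
}

# Intended use (not executed here):
#   show_redis_supervision /etc/redis/redis.conf
```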
Thanks

Linux SIGINT passive capture

Is there a place where the Linux kernel passively logs SIGKILL (kill -9) shutdown requests?
I have a JVM running that is arbitrarily being shut down, and based on the evidence available I suspect it is being killed by a stray process that is somehow issuing a shutdown of the JVM process. I have robust logging in place, but to confirm my suspicion I'd have to turn the logging level up to overwhelming levels.
I've researched exhaustively through /var/log and can't seem to find any place that might capture and log these SIGKILL events. Any ideas where I might find these events, if they exist?
Option 1:
If your kernel has ftrace support (very likely) try the killsnoop tool from Brendan Gregg's perf-tools:
wget https://raw.githubusercontent.com/brendangregg/perf-tools/master/killsnoop
chmod +x killsnoop
sudo ./killsnoop -s
More usage examples in the killsnoop_example.txt file.
Option 2: (passive capture)
If your kernel has no ftrace support you can use the kernel-siglog kernel module from https://github.com/nfedera/kernel-siglog :
git clone https://github.com/nfedera/kernel-siglog.git
cd kernel-siglog/
make
sudo insmod siglog.ko
Once inserted the siglog kernel module will record the last 10,000 signals in /proc/siglog
I had a similar issue and found the culprit using this kernel module. I had it inserted on a customer's server for some weeks and when the service was killed I logged in, did a cat /proc/siglog and found that my service was killed by a customer's own buggy watchdog script.

No pid file for CouchDB on Ubuntu 14.04

We would like to monitor our CouchDB installation using the default pid file method with MONIT, however although couchdb is working fine there is no pid file generated under /var/run/couchdb, there is only a couch.uri file.
Permissions on /var/run/couchdb are good (couch:couch) and service couchdb stop and start work fine, although for MONIT to stop/start we would need the /etc/init.d/couchdb start/stop option (which again isn't present).
For info we just installed using apt-get install couchdb on Ubuntu 14.04.
Any advice appreciated.
Best regards
RichBos
I have done this with an older version (1.3) of CouchDB installed from source. Please check if this is working for you:
check process couchdb with pidfile
/usr/local/var/run/couchdb/couchdb.pid
group database
start program = "/etc/init.d/couchdb start -u couchdb"
stop program = "/etc/init.d/couchdb stop -u couchdb"
if failed host 127.0.0.1 port 5984 then restart
if cpu is greater than 40% for 2 cycles then alert
if cpu > 60% for 5 cycles then restart
if 10 restarts within 10 cycles then timeout
If you have installed it via a package manager, you will most likely find the pid in /var/run/couchdb/couchdb.pid
The place of the pid file did not change since 1.3. So chances are good, that it's working for you.

Detect pending linux shutdown

Since I install pending updates for my Ubuntu server as soon as possible, I have to restart my Linux server quite often. I'm running a web app on that server and would like to warn my users about the pending restart. Right now I do this manually: I add an announcement before the restart, give users some time to finish their work, restart, and remove the announcement.
I'm hoping that shutdown -r +60 writes a file with all the information about the restart, which I could check on every access. Is there such a file? I would prefer a file in a virtual file system like /proc for performance reasons...
I'm running Ubuntu 10.04.2 LTS
If you are using systemd, the following command shows the scheduled shutdown info.
cat /run/systemd/shutdown/scheduled
Example of output:
USEC=1636410600000000
WARN_WALL=1
MODE=reboot
As remarked in a comment by @Björn, USEC is the timestamp in microseconds.
You can convert it to a human-friendly format by dropping the last 6 digits and using date like this:
$ date -d @1636410600
Mon Nov 8 23:30:00 CET 2021
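A web app could shell out to something like the following to read that file. This is a sketch under the assumption that the file has the KEY=VALUE layout shown above; read_scheduled_shutdown is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print "<mode> <epoch seconds>" from the systemd schedule file, or "none".
# read_scheduled_shutdown is a hypothetical helper name.
read_scheduled_shutdown() {
    file=${1:-/run/systemd/shutdown/scheduled}
    # No file means nothing is scheduled
    [ -r "$file" ] || { echo "none"; return 0; }
    usec=$(sed -n 's/^USEC=//p' "$file")
    mode=$(sed -n 's/^MODE=//p' "$file")
    [ -n "$usec" ] || { echo "none"; return 0; }
    # USEC is in microseconds; integer-divide down to seconds
    echo "$mode $((usec / 1000000))"
}

# Intended use (not executed here):
#   read_scheduled_shutdown
```

The seconds value can then be passed to date -d @... as shown above.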
The easiest solution I can envisage means writing a script to wrap the shutdown command, and in that script create a file that your web application can check for.
As far as I know, shutdown doesn't write a file to the underlying file system, although it does trigger broadcast messages warning of the shutdown, which I suppose you could write a program to intercept. The above solution still seems the easiest.
Script example:
shutdown.bsh
touch /somefolder/somefile
shutdown -r "$1"
then check for 'somefile' in your web app.
You'd need to add a startup link that erased the 'somefile' otherwise it would still be there when the system comes up and the web app would always be telling your users it was about to shut down.
You can simply check for running shutdown process:
if ps -C shutdown > /dev/null; then
echo "Shutdown is pending"
else
echo "Shutdown is not scheduled"
fi
On newer Linux distribution versions you might need to run this instead:
busctl get-property org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager ScheduledShutdown
The way shutdown works has changed on systemd-based systems.
Tried on:
- Debian Stretch 9.6
- Ubuntu 18.04.1 LTS
References
Check if shutdown schedule is active and when it is
The shutdown program on a modern systemd-based Linux system
You could write a daemon that does the announcement when it catches the SIGINT / SIGQUIT signal.
