Linux software watchdog configuration

I need to configure the Linux software watchdog (enabled in the kernel configuration with CONFIG_SOFT_WATCHDOG=y, which gives me a new device, /dev/watchdog1) so that, when it is enabled and a watchdog timeout occurs, it launches a script/binary instead of rebooting the system. My platform uses systemd rather than init, and I do not see a watchdog.conf file in /etc.
I could not find a solution in "how to use linux software watchdog". However, one comment there says that "it is very possible to restart single or multiple processes after the watchdog signals that the systems is hanging - you can even ABORT the reboot or do a SOFT-reboot, one is able to configure "test" and "repair"-scripts / binaries which do whatever you want them to do."
How/where can I configure /dev/watchdog1 so that it launches a script/binary instead of rebooting the system?

Eventually, looking at the kernel source for the watchdog drivers cleared things up for me. To be precise, there is no way to configure /dev/watchdog1, or any kernel watchdog driver (hardware or software (softdog)), to launch a script/binary instead of causing a system reboot. For that you would have to write your own watchdog driver, if that is feasible. The "launching a script/binary" configuration that I was led to chase belongs to the user-space "watchdog daemon" and has nothing to do with the configuration or behavior of the kernel's watchdog drivers. That daemon can launch a custom script to test your system health and try to fix things before a system reboot becomes necessary.
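To illustrate the split between the kernel driver and a user-space daemon, here is a minimal Python sketch of such a daemon. It is not the real watchdog daemon, only an assumption-laden illustration: the function names and the test/repair commands are hypothetical. It relies on two documented behaviors of the kernel watchdog API: any write to the device resets the timer, and writing the character "V" just before closing (the "magic close") asks the driver to disarm.

```python
import subprocess
import time

MAGIC_CLOSE = b"V"  # "magic close" character of the Linux watchdog API


def healthy(test_cmd):
    """Return True if the (hypothetical) health-check command exits with 0."""
    return subprocess.run(test_cmd).returncode == 0


def run_once(dev, test_cmd, repair_cmd):
    """One cycle: kick the watchdog only if the system looks healthy.

    Returns True if the watchdog was kicked, False otherwise.
    """
    if not healthy(test_cmd):
        # Give the repair command one chance to fix things, then test again.
        subprocess.run(repair_cmd)
        if not healthy(test_cmd):
            return False  # no kick: the kernel driver reboots on timeout
    dev.write(b"\0")  # any write to the device resets the timer
    dev.flush()
    return True


def supervise(dev_path, test_cmd, repair_cmd, interval=5.0):
    """Arm the watchdog at dev_path and keep kicking it while healthy."""
    # Opening the device arms the watchdog.
    with open(dev_path, "wb", buffering=0) as dev:
        try:
            while run_once(dev, test_cmd, repair_cmd):
                time.sleep(interval)
        except KeyboardInterrupt:
            # Ask the driver to disarm rather than reboot. This is ignored
            # if the kernel was built with CONFIG_WATCHDOG_NOWAYOUT.
            dev.write(MAGIC_CLOSE)
```

A call such as supervise("/dev/watchdog1", ["/usr/local/bin/health-test"], ["/usr/local/bin/health-repair"]) would then run the loop; both script paths are placeholders, and interval must be shorter than the driver's timeout. Note that the reboot itself is still performed by the kernel driver: all the daemon decides is whether to keep kicking.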

Related

hidepid=2 stopped working after an update. Kernel doesn't support "per-mount point"?

I am running Arch Linux hardened (5.11.13-hardened1-1-hardened) and have been setting hidepid=2 through the fstab:
proc /proc proc nosuid,nodev,noexec,hidepid=2,gid=proc 0 0
and in an override file for systemd-logind.service named hidepid.conf:
[Service]
SupplementaryGroups=proc
All according to the Arch wiki security page, and everything had been working fine until an update a while ago. I think it is because of the systemd-248 update, but I am obviously not sure.
When reading up on the systemd changes I came across a section about "ProtectProc=invisible", which is now set in systemd-logind.service by default and should make the fstab setting of hidepid=2 obsolete. But the description of "ProtectProc=" in the freedesktop.org systemd documentation says:
If the kernel doesn't support per-mount point hidepid= mount options this setting remains without effect, and the unit's processes will be able to access and see other process as if the option was not used. This option is only available for system services and is not supported for services running in per-user instances of the service manager.
So what does this mean? Is this something I can fix through kernel parameters in the hardened kernel?
Best regards

How exactly does 'shutdown -h' ("HALT") differ from "shutdown" in Linux?

Suppose I have 20 processes/daemons running on my Linux system.
How does a HALT affect my processes/daemons differently from a SHUTDOWN?

Generally, one uses the shutdown command. It allows a time delay and warning message before shutdown or reboot, which is important for system administration of multiuser shell servers; it can provide the users with advance notice of the downtime.
As such, the shutdown command has to be used like this to halt/switch off the computer immediately (on Linux and FreeBSD at least):
shutdown -h now
Or to reboot it with a custom, 30 minute advance warning:
shutdown -r +30 "Planned software upgrades"
After the delay, shutdown tells init to change to runlevel 0 (halt) or 6 (reboot). (Note that omitting -h or -r will cause the system to go into single-user mode (runlevel 1), which kills most system processes but does not actually halt the system; it still allows the administrator to remain logged in as root.)
Once system processes have been killed and filesystems have been unmounted, the system halts/powers off or reboots automatically. This is done using the halt or reboot command, which syncs changes to disks and then performs the actual halt/power off or reboot.
On Linux, if halt or reboot is run when the system has not already started the shutdown process, it will invoke the shutdown command automatically rather than directly performing its intended action. However, on systems such as FreeBSD, these commands first log the action in wtmp and then will immediately perform the halt/reboot themselves, without first killing processes or unmounting filesystems.

On POSIX systems the shutdown command switches runlevels, and executes the appropriate scripts.
On FreeBSD the "halt" command is an ACPI thing...
If you have particular concerns or would like to know things the general documentation wouldn't readily address, please feel free to refine your query.

In Linux, if the watchdog manager is not kicked by a process, is it a reset or a reboot?

In embedded Linux, if we register a process with the watchdog manager and the process does not kick the watchdog manager, what happens? Is it a reset or a reboot?
Is the watchdog generally mentioned for Linux a software watchdog?
I am a novice in Linux. Please pardon me if I am asking something foolish...

How to simulate an interrupt storm or a live lock on Linux?

Background:
I am developing a tool which boots up a custom build of Linux and boots into a Qt-based desktop on x86-based machines. My custom Linux runs from USB, and when it boots on a machine with a certain brand of sound card connected, my tool runs into a live lock situation with a lot of interrupts. I suspect it is some problem with the APIC driver, but the system is rendered useless and I have to power it off.
My Question:
I would like to simulate the same situation by using a kernel driver or module. I am not sure if I can cause an interrupt to fire from a module. I have experience with I2C and SPI, which cause interrupts on ARM-based Linux boards, but I don't know how to do it from a module.
Could anybody please suggest how to cause an interrupt from a driver?

Just create a module with an interrupt forkbomb in it. Google it. It'll only take a second for your vm to halt.
http://www.tldp.org/LDP/tlk/dd/interrupts.html

Watchdog for Linux

Are there any watchdog tools or libraries on Linux for the following purpose? I would like to build a watchdog executable which starts 2 processes and restarts them if:
processes crash
processes become unresponsive (e.g. hang for some reason)
An Internet search turned up watchdog.c, but I am not sure whether it can be used for my purposes; it looks pretty low-level.
I could run my processes as init programs (daemons) as suggested here, but I am not sure whether Linux would then recognize that a process is hanging (e.g. due to a deadlock).

We use monit here: http://mmonit.com/monit/. It will let you do the restart thing, and it is also highly customizable regarding how to check and how to react via scripts.
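For comparison, the two restart conditions from the question (crash and hang) can be sketched in a few lines of Python. This is only a rough illustration, not how monit works internally. The class name and the heartbeat-file convention are assumptions made up for this example: each supervised program is expected to update the modification time of its heartbeat file regularly, and a stale file is treated as a hang.

```python
import os
import subprocess
import time


class Supervised:
    """A child process plus the heartbeat file it is expected to touch."""

    def __init__(self, cmd, heartbeat_path, timeout):
        self.cmd = cmd
        self.heartbeat_path = heartbeat_path
        self.timeout = timeout  # seconds without a heartbeat before "hung"
        self.restarts = 0
        self.proc = None
        self.start()

    def start(self):
        # Create/refresh the heartbeat so a fresh child gets a full timeout.
        with open(self.heartbeat_path, "a"):
            os.utime(self.heartbeat_path, None)
        self.proc = subprocess.Popen(self.cmd)

    def exited(self):
        # poll() returns None while the child is still running.
        return self.proc.poll() is not None

    def hung(self):
        age = time.time() - os.path.getmtime(self.heartbeat_path)
        return age > self.timeout

    def check(self):
        """Restart the child if it exited or its heartbeat went stale.

        Returns True if a restart happened.
        """
        if not (self.exited() or self.hung()):
            return False
        if self.proc.poll() is None:
            self.proc.kill()  # still running but unresponsive
        self.proc.wait()
        self.restarts += 1
        self.start()
        return True


def run(children, poll_interval=1.0):
    """Check every supervised child forever."""
    while True:
        for child in children:
            child.check()
        time.sleep(poll_interval)
```

A deadlocked process cannot be detected from the outside without some cooperation like this heartbeat, which is why plain "run it as a daemon" setups only catch crashes.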
