Watchdog for Linux - linux

Are there any watchdog tools or libraries on Linux for the following purpose? I would like to build a watchdog executable which starts 2 processes and restarts them if:
processes crash
processes become unresponsive (e.g. hang for some reason)
Internet search found watchdog.c but I am not sure if that can be used for my purposes, it looks pretty low level.
I could run my processes as init programs (daemons) as suggested here, but I am not sure if Linux would then recognize that the process is hanging (e.g. due to a deadlock)

We use monit here: http://mmonit.com/monit/ it will let you do the restart thing it is also highly customizable regarding how to check and how to react via scripts

Related

CPU stress testing by percentage with native linux command

I am currently running the following commands to stress test my CPU:
md5sum /dev/zero
Then I kill the process after 30 seconds.
pkill md5sum
I am looking to run a similar command but where it lets me stress the CPU by a certain percentage. I cannot use libraries like stress-ng, it has to be be able to run with basic linux distro like ubuntu. I also can't run a program as such, I have to be able to run it from the command line itself. Any suggestions or help is greatly appreciated.
I don't believe this is possible without installing a tool (like cpulimit, available in Ubuntu apt repo) or by using cgroups, which is complex to configure and potentially requires kernel rebuilding.
The nearest standard feature is nice, which essentially sets the process priority. So without anything else running, it would still use all the CPU time, but any other processes would get CPU priority.

Linux software watchdog configuration

I need to configure linux software watchdog (enabled in kernel configuration - CONFIG_SOFT_WATCHDOG=y, which gives me a new device /dev/watchdog1) such that if enabled and if a watchdog timeout occurs, it can launch a script/binary, instead of rebooting the system. My platform uses systemd and not init and I do not see a watchdog.conf file in /etc
Could not find a solution in how to use linux software watchdog. However, one comment says that " it is very possible to restart single or multiple processes after the watchdog signals that the systems is hanging - you can even ABORT the reboot or do a SOFT-reboot, one is able to configure "test" and "repair"-scripts / binaries which do whatever you want them to do."
How/Where can I configure /dev/watchdog1 so that it launches a script/binary instead of rebooting the system?
Eventually resorting to looking at the kernel source for watchdog drivers helped clear things for me. There is no way to configure /dev/watchdog1 or a kernel watchdog driver (hardware or software(softdog)), to be precise, to launch a script/binary instead of causing a system reboot. For this purpose, if feasible, you will have to write your own watchdog driver. The "launching script/binary" configuration that I was led to chase is associated with application space "watchdog daemon" (and has nothing to do with kernel's watchdog driver's configuration/behavior) which can launch a custom script to test your system health and try to fix things before a system reboot is necessary.

respawn option for bsd rc.d

I run a small daemon and want it to be respawned when it is killed. I use "respawn" option in inittab on linux systems.(It is a small embedded platform.).
Now I am trying the same daemon on BSD. I have put my entry in "rc.d". But I could not find respawn option for BSD.
I can write a small program which respawns my daemon. But I was wondering if there must be something already built for BSD to restart killed services.
Do you know anything which I can use.
Thanks
P.S. I know I can do this thing in my daemon itself. But currently I dont have source for it.
The rc.d/init.d startup script convention does not provide for respawning daemons. That's one of the main reasons why alternatives like upstart and systemd have been created. On your embedded system, your best option is probably a small wrapper that monitors your daemon and restarts it when necessary.

What can cause SIGHUP to be generated?

We have about 40 computers running identical hardware and software. They all run Ubuntu 11.10. They all have just one user account to log in. The .profile file is set up to launch a daemon process. The code for the daemon is written in C.
Once in a few weeks, we get a report that the daemon is no longer running. This does not happen on all computers but just one or two. We cannot reproduce the problem consistently.
Looking at the code, the application quits when it receives either SIGHUP or SIGTERM.
As I understand, SIGHUP is generated when a user logs off. In our case, the user never logs off. I am wondering if it is possible that SIGHUP could have been generated for some other reason. Any other thought would be appreciated.
Well, there are a couple of things to note about SIGHUP. Firstly, its origin is from the concept of a hang-up, i.e. loss of connection to a console over something like a modem. In modern parlance this generally means it has lost its controlling tty. Unless you've taken care to detach from your tty, any program started in a given terminal will receive a SIGHUP when the terminal is closed. See here for details on how to do this in your program. Other options include:
running your program inside screen or tmux
run your program with nohup or some other daemonising framework
The other possibility is something is deliberately sending your process a SIGHUP which by "tradition" is often used to signal a process that it should re-read its configuration.
Signals can be sent using kill utility or kill syscall.
Of course, you can try and find out who is sending that signal or disconnecting your terminals or network connections, but there is simpler practical way to fix your problem.
When code is supposed to run as a daemon, but really isn't (just like yours), there is a wrapper that can turn any program into daemon. Surprise - this wrapper is called daemon! It has lots of options, probably most importantly for you, option to automatically restart your utility should it ever die for any reason.
If this command is not installed on your Ubuntu, just sudo apt-get install daemon, and man daemon to get started.

Run .exe with daemon or better solutions to get something similar to windows service (mac and linux)?

I was wondering if i can run a exe with daemons in mac and linux or do you have any other solutions to do something similar to a windows service that is a scheduler ? I know i can use crontab but i was wondering if there was other way to do it.
Thx
On OS X, the preferred way of doing things like this is with launchd daemons. You create a .plist file with information about what program to run, parameters to pass it, and what conditions to start it under (i.e. at certain times, when a network connection is received on a certain port, or just run always), and various other options. Lingon provides a handy GUI for creating the .plist, or just read the Apple LAUNCHD docs and create it yourself. Put the .plist in /Library/LaunchDaemons, and either reboot or activate it with sudo launchctl load /Library/LaunchDaemons/whatever.plist.
A warning about using launchd: most daemon-type programs for unix will "daemonize" themselves -- they drop into the background, and generally detach themselves from the program that started them. Launchd doesn't like this. It wants to keep watch over its children, so that it can monitor their status, relaunch them if necessary, etc. So you may either need to tell the program not to daemonize, or add an option to the .plist to tell launchd not to freak out if the program appears to quit.
Linux alternative to windows NT services are daemons. You can read a little more about it
here.
You also start executables by scripts located in "/etc/init.d" Just look at one of those scripts for reference. If you want to make a task or executable start at a given time use a crontab. It is made for this purpose and I don't see why use something else.
If you have a mono executable probably the easiest way is just to make a script in "init.d" if you want to start when system starts or make a crontab entry. It is realy easy. Here you can find a simple reference.

Resources