What are the advantages of "daemonizing" a server application over running the program in console mode?
Having it run as a daemon means you can:
log out without losing the service (which saves some resources)
not risk losing the service to an accidental Ctrl-C
avoid the minor security risk of someone accessing the terminal, hitting Ctrl-C and taking over your session
Essentially all 'real' services that are running 'in production' (as opposed to debug mode) run that way.
I think it prevents you from accidentally closing the app, and you have one more terminal free.
But personally I don't see a big difference between the "screen" program and "daemonizing".
The main point is to detach the process from the terminal so that it does not terminate when the user logs out. If you run a program in console mode, it will terminate when you log out, because it receives a SIGHUP signal, whose default action is to terminate the process.
Note that there is more to writing a daemon than just calling daemon(3). See How to write a unix daemon for more information.
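For illustration only (not taken from the linked article), here is a minimal sketch of a program detaching itself with glibc's daemon(3). The syslog identifier "mydaemon" and the work loop are placeholders.

/* A minimal sketch of detaching a program using glibc's daemon(3).
 * The syslog identifier "mydaemon" and the work loop are placeholders. */
#include <stdio.h>
#include <stdlib.h>
#include <syslog.h>
#include <unistd.h>

int main(void)
{
    /* Detach from the controlling terminal: chdir to "/" (first arg 0)
     * and redirect stdin/stdout/stderr to /dev/null (second arg 0). */
    if (daemon(0, 0) == -1) {
        perror("daemon");
        return EXIT_FAILURE;
    }

    /* stdout/stderr now point to /dev/null, so log via syslog instead. */
    openlog("mydaemon", LOG_PID, LOG_DAEMON);
    syslog(LOG_INFO, "service started");

    for (;;) {
        /* ... the actual work of the service ... */
        sleep(60);
    }
}

Once detached, there is no controlling terminal left to send the process a SIGHUP on logout, which is exactly the point.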
Recently I was working on an update and I had to kill a few Java processes before that.
killall -9 java
So I used the above command, which killed all the Java processes. But now I'm stuck without knowing how to restart those Java services.
Is there a command to start all the Java services killed using killall?
Using kill
First of all: kill -9 should be the last resort for stopping a process.
A process stopped with SIGKILL has no chance to shut down properly. Some services or daemons have complex and important shutdown procedures; databases, for example, take care to close open database files in a consistent state and flush cached data to disk.
Before stopping processes with kill or something like that, you should try the stop procedure that comes with the init system of your Unix/Linux operating system.
When you do have to use kill, send a TERM signal to the process first (just use kill without -9) and wait a moment to see if the process shuts down. Use -9 only if there is no other option!
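To make the "TERM first, KILL only as a last resort" idea concrete, here is a rough C sketch; the ten-second grace period and the command-line handling are illustrative choices, not part of any real stop script.

/* Sketch: ask a process to terminate politely with SIGTERM, then force it
 * with SIGKILL only if it has not exited after a grace period. */
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

static int stop_process(pid_t pid, int grace_seconds)
{
    if (kill(pid, SIGTERM) == -1)             /* polite shutdown request */
        return (errno == ESRCH) ? 0 : -1;     /* already gone, or real error */

    for (int i = 0; i < grace_seconds; i++) {
        sleep(1);
        if (kill(pid, 0) == -1 && errno == ESRCH)
            return 0;                         /* process has exited */
    }

    fprintf(stderr, "still alive after %d s, sending SIGKILL\n", grace_seconds);
    return kill(pid, SIGKILL);                /* the last resort */
}

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s PID\n", argv[0]);
        return EXIT_FAILURE;
    }
    return stop_process((pid_t)atoi(argv[1]), 10) == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}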
Starting and stopping services
Starting and stopping services should be handled by the init system that comes with your Unix/Linux operating system.
SysV init and systemd are common. Check the manual of your operating system to see which one is used. If set up properly, you can check which services are missing (stopped, but supposed to be running) and start them again.
Here are some manual examples:
FreeBSD: https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/configtuning-rcd.html
Debian: https://www.debian.org/doc/manuals/debian-handbook/unix-services.de.html#sect.system-boot
Fedora: https://docs.fedoraproject.org/f28/system-administrators-guide/infrastructure-services/Services_and_Daemons.html
As far as I know, no. There is no record (by default) of what you have killed, as you can see by running strace killall java.
More about process management, including why SIGKILL is a bad idea almost all of the time.
I develop a Linux daemon that works with some complex hardware, and I need to know the ways an application may exit (normally or abnormally) so I can write proper cleanup functions. As I read in the docs, an application may die via:
1. Receiving a signal - sigwait, sigaction, etc.
2. exit
3. kill
4. tkill
Are there any other ways an application may exit or die?
In your comments you wrote that you're concerned about "abnormal ways" the application may die.
There's only one solution for that: code outside the application. In particular, all handles held by the application at termination (normal or abnormal) are cleanly closed by the kernel.
If you have a driver for your special hardware, do the cleanup when the driver is notified that the device fd has been closed. If you don't already have a custom driver, you can use a second user-mode process as a watchdog. Just connect the watchdog to the main process via a pipe: the watchdog is notified (its read on the pipe returns end-of-file) when the main application closes, for whatever reason.
In addition to things the programmer has some degree of control over, such as wild pointer bugs causing a segmentation fault, there's always the OOM killer, which can take out even a bug-free process. For this reason the application should also detect unexpected loss of its watchdog and spawn a new one.
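Here is one way such a pipe-based watchdog might look; this is only a sketch, and cleanup_hardware() is a placeholder for whatever your hardware actually needs.

/* Sketch of a pipe-based watchdog: the parent is the real application,
 * the child blocks on the read end of a pipe. When the parent dies for
 * ANY reason, the kernel closes the write end and the child's read()
 * returns 0 (end of file), at which point it can run the cleanup.
 * cleanup_hardware() is a placeholder. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void cleanup_hardware(void)
{
    /* placeholder: reset the device, release resources, etc. */
    fprintf(stderr, "watchdog: main application is gone, cleaning up\n");
}

int main(void)
{
    int fds[2];
    if (pipe(fds) == -1) {
        perror("pipe");
        return EXIT_FAILURE;
    }

    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return EXIT_FAILURE;
    }

    if (pid == 0) {                     /* child: the watchdog */
        close(fds[1]);                  /* keep only the read end */
        char buf;
        while (read(fds[0], &buf, 1) > 0)
            ;                           /* parent never writes; wait for EOF */
        cleanup_hardware();
        _exit(0);
    }

    /* parent: the actual application */
    close(fds[0]);                      /* keep only the write end open */
    /* ... main application work here; fds[1] stays open until we die ... */
    sleep(10);                          /* stand-in for the real work */
    return 0;                           /* write end closes, watchdog wakes up */
}

The useful property is that the notification works even if the main process is killed with SIGKILL or taken out by the OOM killer, because the kernel closes the pipe either way.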
Your app should finish by itself when the system or the user doesn't need it anymore.
Using an external command like kill -9 PROCESS can cause bugs in your application, because you don't know what your application is doing at that moment.
Try to implement a subsystem in your app to control its status, like a real daemon, so you can do something like this:
service yourapp status or /etc/init.d/yourapp status
service yourapp start or /etc/init.d/yourapp start
service yourapp stop or /etc/init.d/yourapp stop
That way your app can finish normally every time, and users can control it easily.
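As a rough sketch of the building block behind such a status command: the daemon records its PID in a pidfile at startup, and the status check reads it back and probes the process with signal 0. The pidfile path and the "yourapp" name are placeholders, and a real implementation would also handle stale pidfiles.

/* Sketch of an init-style "status" helper based on a pidfile. */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PIDFILE "/var/run/yourapp.pid"        /* illustrative path */

static int write_pidfile(void)                /* call once at startup */
{
    FILE *f = fopen(PIDFILE, "w");
    if (!f)
        return -1;
    fprintf(f, "%d\n", (int)getpid());
    return fclose(f);
}

static int daemon_is_running(void)
{
    FILE *f = fopen(PIDFILE, "r");
    int pid;
    if (!f)
        return 0;
    if (fscanf(f, "%d", &pid) != 1) {
        fclose(f);
        return 0;
    }
    fclose(f);
    return kill((pid_t)pid, 0) == 0;          /* signal 0 = existence check */
}

int main(int argc, char **argv)
{
    if (argc > 1 && strcmp(argv[1], "status") == 0) {
        puts(daemon_is_running() ? "yourapp is running" : "yourapp is stopped");
        return 0;
    }
    write_pidfile();                          /* normal startup path */
    /* ... the daemon's actual work ... */
    return 0;
}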
I am very new to Linux and to this forum. I am working on an issue for a customer who has 10+ Red Hat Linux 5.5 64-bit servers. They want to stop the Tomcat process using the stop script (the script uses 'kill -15').
On some servers, the script works fine and stops the Tomcat process within seconds.
On some servers, sometimes it stops quickly, sometimes Tomcat keeps running for minutes and finally the customer has to use 'kill -9' to stop it. The logs don't indicate anything.
Do you have any idea why this script behaves intermittently? How can we catch it in the logs, etc.?
Processes may ignore the SIGTERM(15) signal.
From Wikipedia: http://en.wikipedia.org/wiki/Unix_signal
The SIGTERM signal is sent to a process to request its termination. Unlike the SIGKILL signal, it can be caught and interpreted or ignored by the process. This allows the process to perform nice termination releasing resources and saving state if appropriate.
If it's ignored, the only option left is to kill it with SIGKILL (9), which can't be caught or ignored.
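Tomcat's shutdown is handled in Java, but the mechanism is generic. Here is a hedged C sketch of a process that catches SIGTERM and shuts down in its own time; if that shutdown work is slow, or the signal is blocked or ignored, kill -15 appears to do nothing for a while, which matches the intermittent behaviour described above.

/* Sketch: a process that catches SIGTERM and shuts down gracefully. */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t stop_requested = 0;

static void on_sigterm(int sig)
{
    (void)sig;
    stop_requested = 1;          /* just set a flag; do the real work outside */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_sigterm;
    sigaction(SIGTERM, &sa, NULL);

    while (!stop_requested) {
        /* ... normal work ... */
        sleep(1);
    }

    /* graceful shutdown: flush buffers, close files, release resources */
    fprintf(stderr, "SIGTERM received, shutting down cleanly\n");
    return 0;
}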
My system includes a task which opens a network socket, receives pushed data from the network, processes it, and writes it out to disk or pings other machines depending on the messages. This task is intended to run forever, and the service is designed to have this task always running. But sometimes it crashes.
What's the best practice for keeping a task like this alive? Assume it's okay for the task to be dead for up to 30 seconds before we restart it.
Some obvious ideas include having a watchdog process that checks to make sure the process is still running. The watchdog could be triggered by cron. But how does it know whether the process is alive or not? Write a pidfile? Touch a heartbeat file? An ideal solution wouldn't continuously spin up more processes if the machine gets bogged down to the point where the watchdog is running faster than the heartbeat.
Are there standard Linux tools for this? I can imagine a solution that uses a message queue, but I'm not sure whether that's a good idea.
Depending on the nature of the task that you wish to monitor, one method is to write a simple wrapper to start up your task in a fork().
The wrapper task can then do a waitpid() on the child and restart it if it is terminated.
This does depend on modifying the source for the task that you wish to run.
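A rough sketch of such a wrapper, in the variant that forks and execs the task as a separate binary; the program path and the restart delay are illustrative.

/* Sketch of a restart wrapper: fork(), exec the real task in the child,
 * waitpid() in the parent, and restart the task whenever it terminates.
 * A small delay avoids a tight restart loop if the task dies immediately. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    for (;;) {
        pid_t pid = fork();
        if (pid == -1) {
            perror("fork");
            return EXIT_FAILURE;
        }
        if (pid == 0) {                          /* child: run the real task */
            execl("/usr/local/bin/my-task", "my-task", (char *)NULL);
            perror("execl");                     /* only reached on failure */
            _exit(127);
        }

        int status;
        if (waitpid(pid, &status, 0) == -1) {    /* parent: wait for the child */
            perror("waitpid");
            return EXIT_FAILURE;
        }
        fprintf(stderr, "task exited (status %d), restarting in 5 s\n", status);
        sleep(5);                                /* back off before restarting */
    }
}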
sysvinit will restart processes that die, if added to inittab.
If you're worried about the process freezing without crashing and ending the process, you can use a heartbeat and hard kill the active instance, letting init restart it.
You could use monit along with daemonize. There are lots of tools for this in the *nix world.
Supervisor was designed precisely for this task. From the project website:
Supervisor is a client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems.
It runs as a daemon (supervisord) controlled by a command line tool, supervisorctl. The configuration file contains a list of programs it is supposed to monitor, among other settings.
The number of options is quite extensive; have a look at the docs for a complete list. In your case, the relevant configuration section might be something like this:
[program:my-network-task]
; where your binary lives
command=/bin/my-network-task
; start when supervisor starts?
autostart=true
; restart automatically when stopped?
autorestart=true
; consider start successful after how many secs?
startsecs=10
; try starting how many times?
startretries=3
I have used Supervisor myself and it worked really well once everything was set up. It requires Python, which should not be a big deal in most environments but might be.
I have a daemon process which does the configuration management; all the other processes have to interact with this daemon to function. But when I execute a large action, after a few hours the daemon process becomes unresponsive for 2 to 3 hours, and after that it works normally again.
What debugging utilities are there for Linux process hang issues?
How can I find out at what point the Linux process hangs?
strace can show the last system calls and their result
lsof can show open files
the system log can be very effective when log messages are written to track progress. It lets you box the problem into smaller areas. Also correlate log messages with messages from other systems; this often turns up interesting results.
wireshark, if the apps use sockets, to make the wire chatter visible.
ps ax + top can show whether your app is in a busy loop (i.e. running all the time), sleeping or blocked in I/O, and how much CPU and memory it is consuming.
Each of these may give a little bit of information which together build up a picture of the issue.
When using gdb, it might be useful to trigger a core dump when the app is blocked. Then you have a static snapshot which you can analyze using post-mortem debugging at your leisure. You can have these triggered by a script. Then you quickly build up a set of snapshots which can be used to test your theories.
One option is to use gdb and its attach command to attach to a running process. You will need to load a file containing the symbols of the executable in question (using the file command).
There are a number of different ways to do this:
Listening on a UNIX domain socket, to handle status requests. An external application can then inquire as to whether the application is still ok. If it gets no response within some timeout period, then it can be assumed that the application being queried has deadlocked or is dead.
Periodically touching a file with a preselected path. An external application can look at the timestamp of the file, and if it is stale, it can assume that the application is dead or deadlocked.
You can use the alarm syscall repeatedly, having the signal terminate the process (set up sigaction accordingly). As long as you keep calling alarm (i.e. as long as your program is making progress), it will keep running. Once you don't, the signal will fire.
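A minimal sketch of that alarm() pattern; the 30-second timeout is arbitrary and the sleep stands in for real work.

/* Sketch of the alarm() self-watchdog: the main loop re-arms a SIGALRM
 * timer on every iteration. If the loop ever hangs and stops re-arming
 * it, the pending alarm fires and the default disposition of SIGALRM
 * terminates the process, so an external supervisor can restart it. */
#include <signal.h>
#include <unistd.h>

#define WATCHDOG_TIMEOUT 30     /* seconds without progress before we die */

int main(void)
{
    /* Make sure SIGALRM keeps its default, terminating disposition. */
    struct sigaction sa;
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = SIG_DFL;
    sa.sa_flags = 0;
    sigaction(SIGALRM, &sa, NULL);

    for (;;) {
        alarm(WATCHDOG_TIMEOUT);   /* re-arm the watchdog */
        /* ... one unit of real work; must complete within the timeout ... */
        sleep(1);                  /* stand-in for the real work */
    }
}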
You can seamlessly restart your process when it dies with fork and waitpid, as described in this answer. It does not cost any significant resources, since the OS will share the memory pages.