Is changing the parent process necessary when daemonizing a process? - linux

I am reading about daemonizing a process at https://en.wikipedia.org/wiki/Daemon_%28computing%29#Creation
In a strictly technical sense, a Unix-like system process is a daemon
when its parent process terminates and the daemon is assigned the init
process (process number 1) as its parent process and has no
controlling terminal. However, more commonly, a daemon may be any
background process, whether a child of the init process or not.
On a Unix-like system, the common method for a process to become a
daemon, when the process is started from the command line or from a
startup script such as an init script or a SystemStarter script,
involves:
Dissociating from the controlling tty
Becoming a session leader
Becoming a process group leader
Executing as a background task by forking and exiting (once or twice). This is required sometimes for the process to become a session
leader. It also allows the parent process to continue its normal
execution.
Setting the root directory (/) as the current working directory so that the process does not keep any directory in use that may be on
a mounted file system (allowing it to be unmounted).
Changing the umask to 0 to allow open(), creat(), and other operating system calls to provide their own permission masks and not
to depend on the umask of the caller
Closing all inherited files at the time of execution that are left open by the parent process, including file descriptors 0, 1 and 2
for the standard streams (stdin, stdout and stderr). Required files
will be opened later.
Using a logfile, the console, or /dev/null as stdin, stdout, and stderr
If the process is started by a super-server daemon, such as inetd,
launchd, or systemd, the super-server daemon will perform those
functions for the process[5][6][7] (except for old-style daemons not
converted to run under systemd and specified as Type=forking[7] and
"multi-threaded" datagram servers under inetd[5]).
Is there a step there that changes the parent process of the process
to be daemonized? It seems to me that none of the steps does that.
Is changing the parent process necessary when daemonizing a process?
After changing the parent process of a process (not necessarily one
being daemonized), can the process be associated with the controlling
tty of the new parent process? (The purpose of the question is to
see whether "keeping a process disassociated from the
controlling tty of the new parent process" is a necessary condition
of "changing the parent process of the process".)
See my related question https://unix.stackexchange.com/questions/266565/daemonize-a-process-in-shell
Thanks.

The parent of a Unix process can't be changed by the process itself. The typical method of creating a daemon involves a fork call (which creates the process that will become the daemon). The initial process then exits, and the newly orphaned child process is inherited by the init process, which becomes its new parent. That's handled in step 4. The only thing init will do is wait for all its children to exit. init doesn't have a controlling TTY, but being inherited by init does not by itself detach the daemon from one: the controlling TTY belongs to the session, so the daemon has to start a new session (setsid) to lose it. The main reason to become disassociated is to prevent signals generated from the TTY (hangups, control-C's, etc.) from getting to the daemon.
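The reparenting described above can be watched from Python. This is only a minimal sketch: the name observe_reparenting is made up for illustration, and the new parent is not guaranteed to be PID 1 (on Linux it can be a "subreaper" such as a per-user service manager).

```python
import os
import time


def observe_reparenting(timeout=5.0):
    """Orphan a grandchild on purpose; return (old_ppid, new_ppid) as it saw them."""
    read_fd, write_fd = os.pipe()
    middle = os.fork()
    if middle == 0:
        os.close(read_fd)
        old_ppid = os.getpid()      # the grandchild's parent at fork time
        if os.fork() > 0:
            os._exit(0)             # the original parent exits immediately
        # Grandchild: wait until the kernel has assigned a new parent.
        deadline = time.monotonic() + timeout
        while os.getppid() == old_ppid and time.monotonic() < deadline:
            time.sleep(0.01)
        os.write(write_fd, f"{old_ppid} {os.getppid()}".encode())
        os._exit(0)
    os.close(write_fd)
    os.waitpid(middle, 0)           # reap the process that exited
    with os.fdopen(read_fd) as pipe:
        old_ppid, new_ppid = (int(x) for x in pipe.read().split())
    return old_ppid, new_ppid
```

Calling observe_reparenting() returns the parent PID the orphan started with and the one it ended up with. The process never asks for a new parent; it only notices that getppid() has changed.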
There are two ways daemons are usually run:
From a shell script. The script runs the daemon's executable with the & operator at the end of the command to put the daemon into the background, possibly with I/O redirection to set the daemon's stdin, stdout and/or stderr, and then exits leaving the daemon without a parent. Running an executable from the shell involves the shell doing a fork followed by an exec in the child process of the executable to be run.
The daemon program has an option to daemonize itself. When run with that option it does a fork followed in the child process by an exec of itself with an appropriate set of arguments. The parent will normally exit after the fork since the work it's been asked to do is done. If it doesn't, the child process needs an extra fork to give it a parent that can exit. NB: this is why so many programs that normally run as daemons can also be run directly without becoming a daemon: the "become a daemon" option causes the child process to close stdin/stdout/stderr and then just exec its own executable without the "become a daemon" option.

I would suggest using daemon(3). See also credentials(7).
Your list does not explicitly mention setsid(2).
musl libc has a legacy/daemon.c which forks twice and calls setsid.

Related

Difference(s) between a background process and a daemon in linux

Background processes don't belong to a user and a terminal, nor do daemon processes. What is the main difference between the two? If I were to write a server program, should I run it as a background process or a daemon?
When one says 'background process', it's usually in the context of a shell (like bash), which implements job control.
When a process (or a process group) is put into the background, it's still part of the session created by the shell and will still have an association with the shell's controlling terminal. The standard input/output of a background process will still be linked to the terminal (unless explicitly changed). Also, depending on how the shell exits, it may send a SIGHUP signal to all the background processes (See this answer to know exactly when). Until the shell terminates, it remains the parent of the background process.
A daemon, on the other hand, does not have a controlling terminal and is usually explicitly made to be a child of the init process. The standard input/output of a daemon are usually redirected to /dev/null.
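One rough way to tell the two situations apart from inside a process is to try opening /dev/tty, which only succeeds when the process has a controlling terminal. The helper name below is hypothetical and this is a sketch, not a complete check.

```python
import os


def has_controlling_terminal():
    """Return True if this process can open its controlling terminal."""
    try:
        # O_NOCTTY ensures the open itself never acquires a terminal.
        fd = os.open("/dev/tty", os.O_RDWR | os.O_NOCTTY)
    except OSError:
        return False                # no controlling terminal (or no /dev/tty)
    os.close(fd)
    return True
```

A job started with & from an interactive shell would normally report True here, while a process that has called setsid() and not opened a terminal since would report False.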
A Background process usually refers to a process which:
Another process is its parent; eg, a shell;
It has standard streams (input, output, error) connected to that parent
The most common type is when you run a shell program with a trailing &. It generally shares the shell’s output streams, but will get a signal and stop if it tries to read from its input stream.
More significantly (usually), a background process like this is still parented, so signals sent to that process group will still reach it. If the parent process terminates, the children will receive signals that will most likely terminate them, as well. (This is probably the biggest difference between the two for most users.)
A Daemon process is one that:
Has no parent, i.e., its parent process is the system (or container) initial process, commonly systemd (Linux), init (other Unix), or launchd (macOS);
Typically has its output disconnected, or connected to a log file;
Typically has its input disconnected.
Daemons are usually also written to accept the “user hung up” signal (SIGHUP), which would terminate a program if not handled, as a special instruction to re-read their configuration files and continue working.
Most often, these are processes created by some system-level facility that continue to operate completely independently of user activity (logins, logouts, and the like). Things that, themselves, handle logins (getty or gdm and the like), as well as other network-facing services (web servers, mail servers, etc) may be daemons, as well as self-monitoring services like cron, or smartd.

Run a background process and free up terminal won't work with & [duplicate]

I am writing a Linux daemon. I found two ways to do it.
Daemonize your process by calling fork() and setting sid.
Running your program with &.
Which is the right way to do it?
From http://www.steve.org.uk/Reference/Unix/faq_2.html#SEC16
Here are the steps to become a daemon:
fork() so the parent can exit, this returns control to the command line or shell invoking your program. This step is required so that the new process is guaranteed not to be a process group leader. The next step, setsid(), fails if you're a process group leader.
setsid() to become a process group and session group leader. Since a controlling terminal is associated with a session, and this new session has not yet acquired a controlling terminal our process now has no controlling terminal, which is a Good Thing for daemons.
fork() again so the parent, (the session group leader), can exit. This means that we, as a non-session group leader, can never regain a controlling terminal.
chdir("/") to ensure that our process doesn't keep any directory in use. Failure to do this could make it so that an administrator couldn't unmount a filesystem, because it was our current directory. [Equivalently, we could change to any directory containing files important to the daemon's operation.]
umask(0) so that we have complete control over the permissions of anything we write. We don't know what umask we may have inherited. [This step is optional]
close() fds 0, 1, and 2. This releases the standard in, out, and error we inherited from our parent process. We have no way of knowing where these fds might have been redirected to. Note that many daemons use sysconf() to determine the limit _SC_OPEN_MAX. _SC_OPEN_MAX tells you the maximum number of open files per process. Then, in a loop, the daemon can close all possible file descriptors. You have to decide if you need to do this or not. If you think that there might be file descriptors open, you should close them, since there's a limit on the number of concurrent file descriptors.
Establish new open descriptors for stdin, stdout and stderr. Even if you don't plan to use them, it is still a good idea to have them open. The precise handling of these is a matter of taste; if you have a logfile, for example, you might wish to open it as stdout or stderr, and open '/dev/null' as stdin; alternatively, you could open '/dev/console' as stderr and/or stdout, and '/dev/null' as stdin, or any other combination that makes sense for your particular daemon.
Better yet, just call the daemon() function if it's available.
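For reference, the steps above map onto Python's os module roughly as follows. This is a sketch under assumptions, not the FAQ's own code: the function name daemonize is illustrative, /dev/null is used for all three standard descriptors, and the optional loop over every descriptor up to _SC_OPEN_MAX is left out.

```python
import os
import sys


def daemonize():
    """Detach the calling process using the fork / setsid / fork sequence."""
    sys.stdout.flush()              # avoid losing or duplicating buffered output
    sys.stderr.flush()
    if os.fork() > 0:
        os._exit(0)                 # parent exits; child is not a group leader
    os.setsid()                     # new session without a controlling terminal
    if os.fork() > 0:
        os._exit(0)                 # session leader exits
    os.chdir("/")                   # do not keep any directory in use
    os.umask(0)                     # optional: full control over file modes
    # Replace fds 0, 1 and 2 with /dev/null instead of leaving them closed.
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)
    if devnull > 2:
        os.close(devnull)
```

Only the final process returns from daemonize(); both intermediate processes leave through os._exit(), so code placed after the call runs in the daemon alone. A real daemon would probably point stdout/stderr at a log file rather than /dev/null.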
I suggest not writing your program as a daemon at all. Make it run in the foreground with the file descriptors, current directory, process group, etc as given to it.
If you want to then run this program as a daemon, use start-stop-daemon(8), init(8), runsv (from runit), upstart, systemd, or whatever to launch your process as a daemon. That is, let your user decide how to run your program and don't enforce that it must run as a daemon.
Just use daemon(3) (from unistd.h).
The daemon() function is for programs
wishing to detach themselves from the
controlling terminal and run in the
background as system daemons. ...
The first. The second is not daemonizing, but running in the background. Daemonized programs should be in their own session and process group, and should not have a controlling terminal.
Actually to make a daemon you have to double fork.
Running the program with a & makes the shell run the program in the background, which does not make it a daemon. Daemons have init (pid 1) as a parent, that's why the double fork is needed.
So the nice way to do things, if your program is a daemon, would be to take care of this issue yourself (there are more methods, see here too). You could also use the start-stop-daemon program.
What language are you using? Some languages have helper methods that make daemonizing easier. For example, Ruby has the daemons package.

Fork child process to die when parent exits? (bash)

I'm working with parallel processing and rather than dealing with cvars and locks I've found it's much easier to run a few commands in a shell script in sequence to avoid race conditions in one place. The new problem is that one of these commands calls another program, which the OS has decided to put into a new process. I need to kill this process from the parent program, but the parent program only knows the pid of the parent (shell script), so this process keeps executing on its own.
Is there a way in bash to set a subprocess to die when the parent dies? I've tried to figure out how to execute it as a daemon because I read daemons exit when the parent dies, but it's tricky and I can't quite get it right. Thanks!
Found the problem, and this fixed it (except for some pesky messages that somehow cannot be redirected to /dev/null).
trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT
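The trap above works by signalling a whole process group (the negative PID). If the controlling program happened to be written in Python, the same idea might look like the sketch below. The function names are made up, and start_new_session=True puts the child into a new session as well as a new process group, which is slightly more than the shell version does.

```python
import os
import signal
import subprocess


def run_in_own_group(argv):
    """Start argv as the leader of a new session and process group."""
    return subprocess.Popen(argv, start_new_session=True)


def kill_group(proc, sig=signal.SIGTERM):
    """Signal every process in proc's group, then reap proc itself."""
    os.killpg(os.getpgid(proc.pid), sig)
    return proc.wait()
```

Because the script and anything it spawns share one process group, a single killpg() reaches the helper program too, even though the caller only knows the PID of the script.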

Why in GNU do we tcsetpgrp from a shell while it is a background process?

I've been studying this link to finish my shell assignment: http://www.gnu.org/software/libc/manual/html_node/Launching-Jobs.html#Launching-Jobs and it's been particularly helpful. My confusion is that, to give the shell control of the stdin file descriptor again, I need to call tcsetpgrp from the shell after the child is terminated.
How do I get tcsetpgrp() to work in C?
I've searched different Stack Overflow questions, but none properly tell me why GNU promotes this approach. Because the shell currently is in the "background", tcsetpgrp() will send SIGTTOU to my process group. The current solution is to ignore it before calling the method and maybe reset it to default afterwards. What should I do?
EDIT: I would like to note that the child is first set in another process group before the shell passes control of stdin to it with tcsetpgrp(). Once the child dies, the shell calls tcsetpgrp() to reclaim stdin. GNU suggests this as a possible implementation, but says it uses a slightly different implementation for simplicity here.
If tcsetpgrp() is called by a member of a background process group in its session, and the calling process is not blocking or ignoring SIGTTOU, a SIGTTOU signal is sent to all members of this background process group.
I'm also towards the end of implementing a shell program, and can take a stab at this question:
TL;DR: even though the shell will be a background process at that point, it needs to reclaim the terminal foreground for itself once the most recent foreground process group exits; otherwise the terminal will hang and no process will consume the user's input.
Here's what the session looks like when a shell launches a job from a command line:
before the child processes call execve(), they are assigned into a process group, so that it's simpler to manage them as a job;
now that the shell and the child processes are in different process groups, and the terminal can only serve one process group at a time, the shell has to donate the terminal foreground to this child process group, so that the child job can read input and receive signals directly from the user;
after the foreground donation, the shell becomes a background process, but since there's no point in running the shell in the background (its main function is to read/write to the terminal and launch jobs), it should not proceed until it reclaims the foreground;
another reason why the shell must immediately reclaim the foreground after the most recent child foreground process group exits is that no running process will be consuming the terminal's input and signals between a) the time the child foreground process group exits, and b) the time the shell reclaims the foreground. The terminal will essentially hang and become unresponsive;
as for why the shell must reclaim the foreground itself, it's because the child processes will be calling execve() and loading entirely new process images. It is extremely difficult to enforce and advocate for a contract where every child process in the foreground returns the foreground back to the shell process;
lastly, since tcsetpgrp() sends a SIGTTOU to a caller that is in the background, the shell must ignore SIGTTOU (set its disposition to SIG_IGN), at least while it's reclaiming the terminal foreground;
The GNU C Library manual also has another section (28.5.4 Foreground and Background) that briefly explains why the shell must reclaim the terminal forg
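The last point, ignoring SIGTTOU around the tcsetpgrp() call, can be sketched in Python, whose os.tcsetpgrp() wraps the same C function. The helper name reclaim_foreground is an assumption for illustration; a real shell would do the equivalent in C.

```python
import os
import signal


def reclaim_foreground(tty_fd, pgid=None):
    """Make pgid (default: the caller's group) the terminal's foreground group."""
    if pgid is None:
        pgid = os.getpgrp()
    # A background caller would be stopped by SIGTTOU, so ignore it briefly.
    previous = signal.signal(signal.SIGTTOU, signal.SIG_IGN)
    try:
        os.tcsetpgrp(tty_fd, pgid)
    finally:
        if previous is not None:
            signal.signal(signal.SIGTTOU, previous)  # restore old disposition
```

A shell would call this with the descriptor of its terminal once the foreground job has exited or stopped. Restoring the previous disposition afterwards corresponds to the "maybe reset it to default afterwards" option mentioned in the question.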

What is the first process a typical Linux kernel starts?

I searched on the internet for which is the first process which gets executed upon system startup.
I found two answers which are init and sched. What is it really?
Which gets executed first? sched process or init process?
Typically it is the init process, the path of which is hard coded into the kernel itself. init performs very low level functions like starting upstart in the case of Ubuntu (prior to 15.04) or systemd in the case of Ubuntu 15.04 and later, Arch, Fedora, and others, which load the remaining processes and setup. Note that the system is not done booting when init runs - that is a common misconception. In fact, init sets up your login screen and other related tasks. Here's a Wikipedia page on init: https://en.wikipedia.org/wiki/Linux_startup_process#SysV_init
Init is the father of all processes. Its primary role is to create processes from a script stored in the file /etc/inittab. This file usually has entries which cause init to spawn gettys on each line that users can log in. It also controls autonomous processes required by any particular system. A run level is a software configuration of the system which allows only a selected group of processes to exist. The processes spawned by init for each of these run levels are defined in the /etc/inittab file.
However, the Linux kernel does start the scheduler but it is not in userspace, which is what most people associate as the home for a process. Also, the Bourne Shell (/bin/sh) can be substituted if the init is missing or cannot be called. You can also in theory substitute it for any executable by using the init=*some path here* Linux kernel boot option.
It's sched. As of Linux 3.13, start_kernel() first calls sched_init() and then rest_init(). rest_init() creates a kernel thread with another function, kernel_init(), as its entry point, and that thread goes on to run the first user-space process, init. The kernel's boot thread then goes idle.
start_kernel() {
...
sched_init();
rest_init(); /* creates a kernel thread that runs kernel_init() */
}
The kernel always has at least one runnable process, which is known as the idle task, swapper, init_task or sched. These are different names for the same process, whose pid is 0. This init_task is a global variable of the kernel, so it has a fixed address; you can see it in System.map with the command grep 'D init_task' /boot/System.map-*. The address doesn't change unless you recompile the kernel.
The program init, whose pid is 1, is spawned by init_task (pid 0). In Ubuntu, the program init is the Upstart process management daemon, while in other systems it could be systemd. The address of init changes every time the system reboots.
So process 0 runs first, then spawns process 1.
You can try
pstree 0
It will show the whole process hierarchy in tree form, starting from the children of the sched process (PID 0). No doubt init is the parent of all processes, but sched gets executed before init and spawns both init and kthreadd.
You can also see the PPID (i.e. process id of parent process ) using:
ps -eaf
You will notice it to be 0 for both init and kthreadd.
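The same PPID column can be read programmatically from /proc on Linux. This is a small sketch with a hypothetical helper name; it assumes /proc is mounted and relies on the documented field order of /proc/<pid>/stat.

```python
import os


def parent_pid(pid):
    """Return the PPID of pid by parsing /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat:
        data = stat.read()
    # The command name is parenthesised and may itself contain spaces or
    # parentheses, so split after the last ')': next come state, then ppid.
    fields = data[data.rindex(")") + 2:].split()
    return int(fields[1])
```

On a Linux host outside a container, parent_pid(1) and parent_pid(2) would be expected to report 0, matching what ps -eaf shows for init and kthreadd.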
Swapper is the first process running. It has pid 0.

Resources