I am learning to program on the Linux platform, and I want to know how to make my app run as a background process.
For example: when I execute my app in the shell, it should automatically put itself in the background; note that I don't want to have to append "&" to the command. Which standard Linux function implements this?
Also, how can I kill or terminate the backgrounded app from within the code, so that I don't need to run the kill shell command? For example, when the app meets some condition, it should terminate itself.
Many thanks.
You want to daemonize your program. This is generally done with fork() and a few other system calls.
Background applications can be killed by using kill. It is good practice for a daemon to write its process ID (PID) in a well-known file so it can be located easily.
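A minimal sketch of this approach in C; the PID-file path /tmp/myapp.pid and the work_is_done() condition are placeholders I made up for the self-termination part, not a standard API:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    static int work_is_done(void)
    {
        return 1;                             /* replace with your real condition */
    }

    int main(void)
    {
        pid_t pid = fork();
        if (pid < 0)
            exit(EXIT_FAILURE);
        if (pid > 0)
            exit(EXIT_SUCCESS);               /* parent exits; the shell prompt returns */

        setsid();                             /* child: new session, no controlling terminal */

        FILE *f = fopen("/tmp/myapp.pid", "w");  /* hypothetical well-known file */
        if (f) {
            fprintf(f, "%ld\n", (long)getpid());
            fclose(f);
        }

        while (!work_is_done())               /* the real work would happen here */
            sleep(1);

        return EXIT_SUCCESS;                  /* the process terminates itself */
    }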
While you should learn about fork(), exec(), wait(), and kill(), it's sometimes more convenient to just use daemon(3) if it exists.
Caveats:
Not in POSIX.1-2001
Not present in all BSDs (may be named something else, however)
If portability is not a major concern, it is rather convenient. If portability is a major concern, you can always write your own implementation and use that.
From the manpage:
SYNOPSIS
    #include <unistd.h>

    int daemon(int nochdir, int noclose);

DESCRIPTION
    The daemon() function is for programs wishing to detach themselves from the controlling terminal and run in the background as system daemons.

    If nochdir is zero, daemon() changes the calling process's current working directory to the root directory ("/"); otherwise, the current working directory is left unchanged.

    If noclose is zero, daemon() redirects standard input, standard output and standard error to /dev/null; otherwise, no changes are made to these file descriptors.
fork(2) gives you a new process. In the child you run one of the exec(3) functions to replace it with a new executable. The parent can use one of the wait(2) functions to wait for the child to terminate. kill(2) can be used to send a signal to another process.
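A small sketch tying the four calls together; the choice of /bin/sleep as the child program is just for illustration:

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();                  /* fork(2): new process */
        if (pid < 0) {
            perror("fork");
            return EXIT_FAILURE;
        }
        if (pid == 0) {                      /* child */
            execl("/bin/sleep", "sleep", "60", (char *)NULL);  /* exec(3) */
            perror("execl");                 /* only reached if exec failed */
            _exit(127);
        }

        sleep(1);                            /* parent: let the child start */
        kill(pid, SIGTERM);                  /* kill(2): ask it to terminate */

        int status;
        waitpid(pid, &status, 0);            /* wait(2) family: reap the child */
        if (WIFSIGNALED(status))
            printf("child killed by signal %d\n", WTERMSIG(status));
        return EXIT_SUCCESS;
    }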
Related
Background processes aren't tied to a user and a terminal, and neither are daemon processes. What is the main difference between the two? If I were to write a server program, should I run it as a background process or as a daemon?
When one says 'background process', it's usually in the context of a shell (like bash), which implements job control.
When a process (or a process group) is put into the background, it's still part of the session created by the shell and will still have an association with the shell's controlling terminal. The standard input/output of a background process are still linked to the terminal (unless explicitly changed). Also, depending on how it exits, the shell may send a SIGHUP signal to all of its background processes. Until the shell terminates, it remains the parent of the background process.
A daemon, on the other hand, does not have a controlling terminal and is usually explicitly made a child of the init process. The standard input/output of a daemon are usually redirected to /dev/null.
A Background process usually refers to a process that:
has another process (e.g., a shell) as its parent;
has its standard streams (input, output, error) connected to that parent.
The most common type is when you run a shell program with a trailing &. It generally shares the shell's output streams, but will be stopped by a signal (SIGTTIN) if it tries to read from its input stream.
More significantly (usually), a background process like this is still parented, so signals sent to that process group will still reach it. If the parent process terminates, the children will receive signals that will most likely terminate them as well. (This is probably the biggest difference between the two for most users.)
A Daemon process is one that:
is not parented by a user process: its parent is the system's (or container's) initial process, commonly systemd (Linux), init (other Unixes), or launchd (macOS);
typically has its output disconnected, or connected to a log file;
typically has its input disconnected.
Daemons are usually also written to treat the "user hung up" signal (SIGHUP), which would terminate a program that doesn't handle it, as a special instruction to re-read their configuration files and continue working.
Most often, these are processes created by some system-level facility that continue to operate completely independently of user activity (logins, logouts, and the like). Things that themselves handle logins (getty or gdm and the like), as well as other network-facing services (web servers, mail servers, etc.), may be daemons, as may self-monitoring services like cron or smartd.
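As a sketch of the SIGHUP-reload convention described above, where reload_config() is a placeholder for your own logic:

    #include <signal.h>
    #include <string.h>
    #include <unistd.h>

    static volatile sig_atomic_t reload_requested = 0;

    static void on_sighup(int sig)
    {
        (void)sig;
        reload_requested = 1;     /* only set a flag; do real work outside */
    }

    static void reload_config(void)
    {
        /* placeholder: re-read configuration files here */
    }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_sighup;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGHUP, &sa, NULL);  /* SIGHUP now means "reload", not "die" */

        for (;;) {
            if (reload_requested) {
                reload_requested = 0;
                reload_config();
            }
            pause();              /* sleep until the next signal arrives */
        }
    }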
I am writing a Linux daemon. I found two ways to do it:
Daemonize your process by calling fork() and setsid().
Running your program with &.
Which is the right way to do it?
From http://www.steve.org.uk/Reference/Unix/faq_2.html#SEC16
Here are the steps to become a daemon:
fork() so the parent can exit; this returns control to the command line or shell invoking your program. This step is required so that the new process is guaranteed not to be a process group leader. The next step, setsid(), fails if you're a process group leader.
setsid() to become a process group and session group leader. Since a controlling terminal is associated with a session, and this new session has not yet acquired a controlling terminal, our process now has no controlling terminal, which is a Good Thing for daemons.
fork() again so the parent (the session group leader) can exit. This means that we, as a non-session-group leader, can never regain a controlling terminal.
chdir("/") to ensure that our process doesn't keep any directory in use. Failure to do this could make it so that an administrator couldn't unmount a filesystem, because it was our current directory. [Equivalently, we could change to any directory containing files important to the daemon's operation.]
umask(0) so that we have complete control over the permissions of anything we write. We don't know what umask we may have inherited. [This step is optional]
close() fds 0, 1, and 2. This releases the standard input, output, and error we inherited from our parent process. We have no way of knowing where these fds might have been redirected. Note that many daemons use sysconf() to determine the _SC_OPEN_MAX limit, which tells you the maximum number of open files per process, and then close all possible file descriptors in a loop. You have to decide whether you need to do this. If you think there might be file descriptors left open, you should close them, since there's a limit on the number of concurrently open file descriptors.
Establish new open descriptors for stdin, stdout and stderr. Even if you don't plan to use them, it is still a good idea to have them open. The precise handling of these is a matter of taste; if you have a logfile, for example, you might wish to open it as stdout or stderr, and open '/dev/null' as stdin; alternatively, you could open '/dev/console' as stderr and/or stdout, and '/dev/null' as stdin, or any other combination that makes sense for your particular daemon.
Better yet, just call the daemon() function if it's available.
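For reference, here is one possible hand-rolled implementation of the steps above; treat it as a sketch rather than a drop-in daemonize():

    #include <fcntl.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static int daemonize(void)
    {
        pid_t pid = fork();                   /* step 1: first fork */
        if (pid < 0) return -1;
        if (pid > 0) _exit(EXIT_SUCCESS);     /* parent exits; shell regains control */

        if (setsid() < 0) return -1;          /* step 2: new session, no ctty */

        pid = fork();                         /* step 3: second fork */
        if (pid < 0) return -1;
        if (pid > 0) _exit(EXIT_SUCCESS);     /* session leader exits */

        if (chdir("/") < 0) return -1;        /* step 4: don't pin a directory */
        umask(0);                             /* step 5: known permission mask */

        close(STDIN_FILENO);                  /* step 6: drop inherited stdio */
        close(STDOUT_FILENO);
        close(STDERR_FILENO);

        int fd = open("/dev/null", O_RDWR);   /* step 7: becomes fd 0 */
        if (fd != 0) return -1;
        if (dup2(fd, 1) < 0 || dup2(fd, 2) < 0)
            return -1;
        return 0;
    }

    int main(void)
    {
        if (daemonize() < 0)
            return EXIT_FAILURE;
        sleep(60);                            /* the daemon's real work goes here */
        return EXIT_SUCCESS;
    }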
I suggest not writing your program as a daemon at all. Make it run in the foreground with the file descriptors, current directory, process group, etc as given to it.
If you want to then run this program as a daemon, use start-stop-daemon(8), init(8), runsv (from runit), upstart, systemd, or whatever to launch your process as a daemon. That is, let your user decide how to run your program and don't enforce that it must run as a daemon.
Just use daemon(3) (from unistd.h).
    The daemon() function is for programs wishing to detach themselves from the controlling terminal and run in the background as system daemons. ...
The first. The second is not daemonizing, but merely running in the background. A daemonized program should be in its own session and process group, and should not have a controlling terminal.
Actually, to make a daemon you have to fork twice.
Running the program with & makes the shell run the program in the background, which does not make it a daemon. Daemons have init (pid 1) as their parent; that's why the double fork is needed.
So the nice way to do things, if your program is a daemon, is to take care of this issue yourself (there are other methods as well). You could also use the start-stop-daemon program.
What language are you using? Some languages have helper methods that make daemonizing easier. For example, Ruby has the daemons package.
I am developing a sandbox on Linux, and I am now confused about how to terminate all of the processes in the sandbox.
My sandbox works as follows:
At first, only one process runs in the sandbox.
Then it can create several child processes.
Those child processes will create subprocesses of their own.
A parent process may exit before its children have exited.
Finally, the sandbox must terminate all of the processes.
I used to do this with killall or pkill -u and a unique user attached to the sandbox, but that doesn't seem to work against programs that fork() rapidly.
Then I searched through the source code of pkill and realized that pkill is not atomic.
So how can I achieve my goal?
You could use process groups, setpgid(2), and sessions, setsid(2), but I wouldn't qualify what you describe as a sandbox (in particular because if one of the processes is setuid, or changes its process group or session itself, you'll lose it; read execve(2) carefully and several times!). Notice that kill(2) with a negative pid kills an entire process group.
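A minimal sketch of that approach, assuming the sandboxed program is well-behaved enough not to change its own process group (/bin/sleep stands in for it here):

    #include <signal.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();
        if (pid < 0)
            return EXIT_FAILURE;
        if (pid == 0) {
            setpgid(0, 0);        /* child: become leader of a fresh group */
            execl("/bin/sleep", "sleep", "60", (char *)NULL);
            _exit(127);           /* only reached if exec failed */
        }
        setpgid(pid, pid);        /* parent sets it too, avoiding a race */

        sleep(1);                 /* stand-in for "sandbox decides to stop" */
        kill(-pid, SIGKILL);      /* negative pid: signal the whole group */
        while (waitpid(-pid, NULL, 0) > 0)
            ;                     /* reap our direct children in the group */
        return EXIT_SUCCESS;
    }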
Read a good book like Advanced Linux Programming. Consider also using chroot(2).
Also explain what you really want to do, and why; sandboxing is harder than you think. See also capabilities(7), credentials(7), and SELinux.
Consider a system that manages user-defined programs:
A program can be anything. Its command line is defined by non-privileged users in some configuration file. It could be /bin/ls, it could be /usr/sbin/apache; the user may specify whatever he is permitted to start.
Each program is run as a non-root user.
Any given user can configure any number of programs.
Each program runs for as long as it wants.
Each program may call fork(), exec() etc.
Each program may set itself as a session leader (ie., setsid()).
The system that starts the programs might not run continuously. It starts a program, then quits.
The action "stop all of program P's processes, including children/forks" must be possible.
The action "find all processes belonging to program P" must be possible.
Here's the question: How can one provide such a system within the Linux process model?
The naive method:
Start the program with fork(), exec(), setuid(), etc.
Write the child PID (plus its start timestamp from /proc/[pid]/stat, to uniquely and permanently identify it) to a file.
To stop a single process, send SIGTERM to its PID.
To find all processes, inspect /proc and build the process hierarchy from parent PIDs.
This method has a big hole: Any process may fork and break out of its process group. It's not sufficient to look at the process hierarchy. After a program has created new processes, it's not possible to trace their origin back to the original program.
A workaround would be to ensure that each program is started with a unique UID. This is not desirable or particularly workable, since a (human) user may define any number of programs; the system would then have to programmatically create new, unique users for each program.
My only idea so far is to inject a special, reserved environment variable into the program's initial process, i.e., run the program with env PROGRAM=myprogram <command line>. The system could then mandate that all processes must inherit their parent's environment. At regular intervals, the system could trawl /proc and forcibly kill any process missing the PROGRAM environment variable.
Are there any secrets in the Linux syscall API that I could use?
(1) The action "stop all of program P's processes, including children/forks" must be possible. (2) The action "find all processes belonging to program P" must be possible.
cgroups implement this, and systemd is perhaps their heaviest user to date, making use of (2) to achieve (1). You can break out of process groups, but not out of cgroups.
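A rough sketch of the cgroup v2 approach, assuming the unified hierarchy is mounted at /sys/fs/cgroup, you have the privileges to create groups there, and your kernel has cgroup.kill (Linux 5.14 or later); the group name prog-P and the pid are made up for illustration:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/stat.h>

    static void write_file(const char *path, const char *text)
    {
        FILE *f = fopen(path, "w");
        if (!f) { perror(path); exit(EXIT_FAILURE); }
        fputs(text, f);
        fclose(f);
    }

    int main(void)
    {
        /* create a cgroup for program P; the name is made up */
        mkdir("/sys/fs/cgroup/prog-P", 0755);

        /* move a process into it (12345 is an example pid) */
        write_file("/sys/fs/cgroup/prog-P/cgroup.procs", "12345");

        /* later: kill every process in the group, forks included */
        write_file("/sys/fs/cgroup/prog-P/cgroup.kill", "1");
        return EXIT_SUCCESS;
    }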
I've read about fork(), and from what I understand the process is cloned. But which process? The script itself, or the process that launched the script?
For example:
I'm running rTorrent on my machine, and when a torrent completes, I have a script run against it. This script fetches data from the web, so it takes a few seconds to complete. During this time, my rTorrent process is frozen. So I made the script fork, using the following:
my $pid = fork();
if ($pid == 0) { blah blah blah; exit 0; }
If I run this script from the CLI, it comes back to the shell within a second while it runs in the background, exactly as I intended. However, when I run it from rTorrent, it seems to be even slower than before. So what exactly was forked? Did the rtorrent process clone itself and my script ran in that, or did my script clone itself? I hope this makes sense.
The fork() function returns TWICE! Once in the parent process, and once in the child process. In general, both processes are IDENTICAL in every way, as if EACH one had just returned from fork(). The only difference is that in one, the return value from fork() is 0, and in the other it is non-zero (the PID of the child process).
So whatever process was running your Perl script (if it is an embedded Perl interpreter inside rTorrent then rTorrent would be the process) would be duplicated at exactly the point that the fork() happened.
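A tiny C demonstration of fork() "returning twice":

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();
        if (pid == 0) {
            printf("child:  fork() returned 0, my pid is %ld\n", (long)getpid());
        } else {
            printf("parent: fork() returned %ld (the child's pid)\n", (long)pid);
            wait(NULL);           /* reap the child */
        }
        return 0;
    }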
I believe I found the problem by looking through rTorrent's source. For some processes, it will read all of the output sent to stdout before continuing. If this is happening to your process, rTorrent will block until that stdout is closed. Because you're forking, your child process shares the same stdout as the parent. Your parent process exits, but the pipe remains open (because your child process is still running). If you ran strace on rTorrent, I'd bet it would be blocked on a read() call while executing your command.
Try closing/redirecting stdout in your Perl script before the fork().
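The same idea expressed as a C sketch (your script is Perl, but the system-call sequence is the same): point stdout at /dev/null before forking, so rTorrent's end of the pipe sees EOF once the parent exits:

    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int devnull = open("/dev/null", O_WRONLY);
        if (devnull >= 0) {
            dup2(devnull, STDOUT_FILENO);   /* stdout no longer points at the pipe */
            close(devnull);
        }

        if (fork() == 0) {
            sleep(30);                      /* long-running child work */
            _exit(0);
        }
        return 0;                           /* parent exits; rTorrent's read() sees EOF */
    }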
The entire process containing the interpreter forks. Fortunately memory is copy-on-write so it doesn't need to copy all the process memory in order to fork. However, things such as file descriptors remain open. This allows child processes to handle them, but may cause issues if they aren't closed appropriately. In general, fork() should not be used in an embedded interpreter except under extreme duress.
To answer the nominal question, since you commented that the accepted answer fails to do so, fork affects the process in which it is called. In your example of rTorrent spawning a Perl process which then calls fork, it is the Perl process which is duplicated, since it was the Perl process which called fork.
In the general case, there is no way for a process to fork any process other than itself. If it were possible to tell another arbitrary process to go fork itself, that would open up no end of security and performance issues.
My advice would be "don't do that".
If the Perl interpreter is embedded within the rtorrent process, you've almost certainly forked an entire rtorrent process, the effects of which are probably ill-defined at best. It's generally a bad idea to play with process-level stuff in an embedded interpreter regardless of language.
There's an excellent chance that some sort of lock is not being properly released, or that threads within the processes are proceeding in unintended and possibly competing ways.
When we create a process using fork(), the child process gets a copy of the parent's address space, so the child can use that address space as well. It can also access the files that were opened by the parent. We have control over the child: to get the child's complete status, we can use wait().
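A short sketch of those points: the child inherits a copy of the address space and the open files, and the parent collects the child's status with wait(); the log path is made up:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        FILE *log = fopen("/tmp/shared.log", "a");  /* opened before fork */
        int x = 42;                                 /* copied into the child */

        pid_t pid = fork();
        if (pid == 0) {
            x++;                               /* changes only the child's copy */
            if (log) {
                fprintf(log, "child sees x = %d\n", x);  /* same open file */
                fflush(log);                   /* flush before _exit() */
            }
            _exit(7);
        }

        int status;
        wait(&status);                         /* collect the child's status */
        if (WIFEXITED(status))
            printf("child exited with %d; parent's x is still %d\n",
                   WEXITSTATUS(status), x);
        return 0;
    }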