The difference between fork(), vfork(), exec() and clone() - linux

I was looking to find the difference between these four on Google and I expected there to be a huge amount of information on this, but there really wasn't any solid comparison between the four calls.
I set about trying to compile a kind of basic at-a-glance look at the differences between these system calls and here's what I got. Is all this information correct/am I missing anything important ?
Fork : The fork call basically makes a duplicate of the current process, identical in almost every way (not everything is copied over, for example, resource limits in some implementations but the idea is to create as close a copy as possible).
The new process (child) gets a different process ID (PID) and has the PID of the old process (parent) as its parent PID (PPID). Because the two processes are now running exactly the same code, they can tell which is which by the return code of fork - the child gets 0, the parent gets the PID of the child. This is all, of course, assuming the fork call works - if not, no child is created and the parent gets an error code.
Vfork: The basic difference between vfork() and fork() is that when a new process is created with vfork(), the parent process is temporarily suspended, and the child process might borrow the parent's address space. This strange state of affairs continues until the child process either exits, or calls execve(), at which point the parent
process continues.
This means that the child process of a vfork() must be careful to avoid unexpectedly modifying variables of the parent process. In particular, the child process must not return from the function containing the vfork() call, and it must not call exit() (if it needs to exit, it should use _exit(); actually, this is also true for the child of a normal fork()).
Exec: The exec call is a way to basically replace the entire current process with a new program. It loads the program into the current process space and runs it from the entry point. exec() replaces the current process with a the executable pointed by the function. Control never returns to the original program unless there is an exec() error.
Clone: clone(), as fork(), creates a new process. Unlike fork(), these calls allow the child process to share parts of its execution context with the calling process, such as the memory space, the table of file descriptors, and the table of signal handlers.
When the child process is created with clone(), it executes the function application fn(arg) (This differs from fork(), where execution continues in the child from the point of the original fork() call.) The fn argument is a pointer to a function that is called by the child process at the beginning of its execution. The arg argument is passed to the fn function.
When the fn(arg) function application returns, the child process terminates. The integer returned by fn is the exit code for the child process. The child process may also terminate explicitly by calling exit(2) or after receiving a fatal signal.
Information gotten from:
Differences between fork and exec
http://www.allinterview.com/showanswers/59616.html
http://www.unixguide.net/unix/programming/1.1.2.shtml
http://linux.about.com/library/cmd/blcmdl2_clone.htm
Thanks for taking the time to read this ! :)

vfork() is an obsolete optimization. Before good memory management, fork() made a full copy of the parent's memory, so it was pretty expensive. since in many cases a fork() was followed by exec(), which discards the current memory map and creates a new one, it was a needless expense. Nowadays, fork() doesn't copy the memory; it's simply set as "copy on write", so fork()+exec() is just as efficient as vfork()+exec().
clone() is the syscall used by fork(). with some parameters, it creates a new process, with others, it creates a thread. the difference between them is just which data structures (memory space, processor state, stack, PID, open files, etc) are shared or not.

execve() replaces the current executable image with another one loaded from an executable file.
fork() creates a child process.
vfork() is a historical optimized version of fork(), meant to be used when execve() is called directly after fork(). It turned out to work well in non-MMU systems (where fork() cannot work in an efficient manner) and when fork()ing processes with a huge memory footprint to run some small program (think Java's Runtime.exec()). POSIX has standardized the posix_spawn() to replace these latter two more modern uses of vfork().
posix_spawn() does the equivalent of a fork()/execve(), and also allows some fd juggling in between. It's supposed to replace fork()/execve(), mainly for non-MMU platforms.
pthread_create() creates a new thread.
clone() is a Linux-specific call, which can be used to implement anything from fork() to pthread_create(). It gives a lot of control. Inspired on rfork().
rfork() is a Plan-9 specific call. It's supposed to be a generic call, allowing several degrees of sharing, between full processes and threads.

fork() - creates a new child process, which is a complete copy of the parent process. Child and parent processes use different virtual address spaces, which is initially populated by the same memory pages. Then, as both processes are executed, the virtual address spaces begin to differ more and more, because the operating system performs a lazy copying of memory pages that are being written by either of these two processes and assigns an independent copies of the modified pages of memory for each process. This technique is called Copy-On-Write (COW).
vfork() - creates a new child process, which is a "quick" copy of the parent process. In contrast to the system call fork(), child and parent processes share the same virtual address space. NOTE! Using the same virtual address space, both the parent and child use the same stack, the stack pointer and the instruction pointer, as in the case of the classic fork()! To prevent unwanted interference between parent and child, which use the same stack, execution of the parent process is frozen until the child will call either exec() (create a new virtual address space and a transition to a different stack) or _exit() (termination of the process execution). vfork() is the optimization of fork() for "fork-and-exec" model. It can be performed 4-5 times faster than the fork(), because unlike the fork() (even with COW kept in the mind), implementation of vfork() system call does not include the creation of a new address space (the allocation and setting up of new page directories).
clone() - creates a new child process. Various parameters of this system call, specify which parts of the parent process must be copied into the child process and which parts will be shared between them. As a result, this system call can be used to create all kinds of execution entities, starting from threads and finishing by completely independent processes. In fact, clone() system call is the base which is used for the implementation of pthread_create() and all the family of the fork() system calls.
exec() - resets all the memory of the process, loads and parses specified executable binary, sets up new stack and passes control to the entry point of the loaded executable. This system call never return control to the caller and serves for loading of a new program to the already existing process. This system call with fork() system call together form a classical UNIX process management model called "fork-and-exec".

The fork(),vfork() and clone() all call the do_fork() to do the real work, but with different parameters.
asmlinkage int sys_fork(struct pt_regs regs)
{
return do_fork(SIGCHLD, regs.esp, &regs, 0);
}
asmlinkage int sys_clone(struct pt_regs regs)
{
unsigned long clone_flags;
unsigned long newsp;
clone_flags = regs.ebx;
newsp = regs.ecx;
if (!newsp)
newsp = regs.esp;
return do_fork(clone_flags, newsp, &regs, 0);
}
asmlinkage int sys_vfork(struct pt_regs regs)
{
return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs.esp, &regs, 0);
}
#define CLONE_VFORK 0x00004000 /* set if the parent wants the child to wake it up on mm_release */
#define CLONE_VM 0x00000100 /* set if VM shared between processes */
SIGCHLD means the child should send this signal to its father when exit.
For fork, the child and father has the independent VM page table, but since the efficiency, fork will not really copy any pages, it just set all the writeable pages to readonly for child process. So when child process want to write something on that page, an page exception happen and kernel will alloc a new page cloned from the old page with write permission. That's called "copy on write".
For vfork, the virtual memory is exactly by child and father---just because of that, father and child can't be awake concurrently since they will influence each other. So the father will sleep at the end of "do_fork()" and awake when child call exit() or execve() since then it will own new page table. Here is the code(in do_fork()) that the father sleep.
if ((clone_flags & CLONE_VFORK) && (retval > 0))
down(&sem);
return retval;
Here is the code(in mm_release() called by exit() and execve()) which awake the father.
up(tsk->p_opptr->vfork_sem);
For sys_clone(), it is more flexible since you can input any clone_flags to it. So pthread_create() call this system call with many clone_flags:
int clone_flags = (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGNAL | CLONE_SETTLS | CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID | CLONE_SYSVSEM);
Summary: the fork(),vfork() and clone() will create child processes with different mount of sharing resource with the father process. We also can say the vfork() and clone() can create threads(actually they are processes since they have independent task_struct) since they share the VM page table with father process.

in fork(), either child or parent process will execute based on cpu selection..
But in vfork(), surely child will execute first. after child terminated, parent will execute.

Related

How the parent is restored after vfork()

As vfork creates the child process in the same address space as that of the parent, and when execv() is called on the child then how is the parent process restored, as exec loads the file and runs it in the same address space of the parent and hence the child?
When execv follows a true vfork, it does some of the work of fork: it allocates a new memory space into which to load the new program image and copies inheritable things like environment variables into it. Meanwhile, even vfork saves a bit of the parent’s state on the side, so that execv can restore the parent’s stack and instruction pointers once the child is separated.
For example, on Linux vfork calls common process-copying code via _do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, ...). copy_mm reacts to the CLONE_VM and just reuses the memory space with no call to dup_mm. _do_fork meanwhile reacts to the CLONE_VFORK, marks the child vfork_done, and suspends the caller until the memory space is no longer in use; if this is via execve, it goes through exec_mmap and mm_release, which sees the vfork_done and wakes the parent.
So, really, execve (which also calls copy_strings) is always "allocating a new memory space and copying environment variables into it"; after a normal fork, however, this is not observable because it happens at the same time as releasing the non-shared space created by the fork.

forking in linux about COW

In linux, I know it's implemented by COW because of wasting. But, in the book says, when child calls exec() right after fork(), address spaces are never copied.
But I think if child use exec(), it means making new data or codes in the address space which is not yet copied. So when exec() is called, then address spaced is copied(Copy on Write), and new data or codes are written in here.
Am I wrong? Why exec() calls never copy parent's things?
Or If child calls exec(), then child just make his own mm_struct and write new data in his own address space which is newly made?(not copied from parent)
exec is library wrapper around the execve kernel call. there's going to be some stack activity before the execve starts (even if execve is called directly), so there will be at-least one stack block copied on write before the exec kicks in disconnects from the process context.
meanwhile the parent process will have been doing lots of copy on write before the child disconnects.

Fork creates a new process that is exactly the same as its parent

From the assumption made in the title of my question "Fork create a new process that is exactly the same as its parent". I am wondering how a fork is really made by the operating system.
Considering a heavy process (huge RAM footprint) that fork itself to accomplish a small task (list the files into a directory). From the assumption, I expect that the child process will be as big as the first one. However my common sense tells me that it cannot be the case.
How does it work in the real world?
As others mentioned in the comments, a technique called Copy-On-Write mitigates the heavy cost of copying the entire memory space of the parent process. Copy-on-write means that memory pages are shared read-only between parent and child until either of them decides to write -- at which point the page is copied and each process gets its own private copy. This technique easily prevents a huge amount of copying that in a lot of cases would be a waste of time because the child will exec() or do something simple and exit.
Here's what happens in detail:
When you call fork(2), the only immediate cost you incur is the cost of allocating a new unique process descriptor and the cost of copying the parent's page tables. In Linux, fork(2) is implemented by the clone(2) syscall, which is a more general syscall that allows the caller to control which parts of the new process are shared with the parent. When called from fork(2), a set of flags are passed to indicate that nothing is to be shared (you can choose to share memory, file descriptors, etc - this is how threads are implemented: by calling clone(2) with CLONE_VM, which means "share the memory space").
Under the hood, each process's memory page has a bit flag that is the copy-on-write flag that indicates whether that page should be copied before being written to. fork(2) marks every writeable page in a process with that bit. Each page also maintains a reference count.
So, when a process forks, the kernel sets the copy-on-write bit on every non-private, writeable page of that process and increments the reference count by one. The child process has pointers to these same pages.
Then, every page is marked read-only so that an attempt to write to the page generates a page fault - this is needed to wake up the kernel so that it has a chance of seeing what happened and what needs to be done.
When either of the processes writes to a page that is still being shared, and thus is marked read-only, the kernel wakes up and attempts to figure out why there is a page fault. Assuming the parent / child process is writing to a legit location, the kernel eventually sees that the page fault was generated because the page is marked copy-on-write and there is more than one reference to that page.
The kernel then allocates memory, copies the page into the new location, and the write can proceed.
What is different across forks
You said that fork(2) creates a new process that is exactly the same as its parent. This is not quite true. There are several differences between the parent and the child:
The process ID is different
The parent process ID is different
The child's resource usage (CPU time, etc) are set to 0
File locks owned by the parent are not inherited
The set of pending signals on the child is cleared
Pending alarms are cleared on the child
About vfork
The vfork(2) syscall is very similar to fork(), but it does absolutely no copying - it doesn't even copy the parent's page tables. With the introduction of copy-on-write, it's not as widely used anymore, but historically it was used by processes that would call exec() after forking.
Naturally, attempting to write to memory in the child process after a vfork() results in chaos.
When a process forked, the child process use the same page table as its parent util parent or child write to its space!So

If I have a process, and I clone it, is the PID the same?

Just a quick question, if I clone a process, the PID of the cloned process is the same, yes ? fork() creates a child process where the PID differs, but everything else is the same. Vfork() creates a child process with the same PID. Exec works to change a process currently in execution to something else.
Am I correct in all of these statements ?
Not quite. If you clone a process via fork/exec, or vfork/exec, you will get a new process id. fork() will give you the new process with a new process id, and exec() replaces that process with a new process, but maintaining the process id.
From here:
The vfork() function differs from
fork() only in that the child process
can share code and data with the
calling process (parent process). This
speeds cloning activity significantly
at a risk to the integrity of the
parent process if vfork() is misused.
Neither fork() nor vfork() keep the same PID although clone() can in one scenario (*a). They are all different ways to achieve roughly the same end, the creation of a distinct child.
clone() is like fork() but there are many things shared by the two processes and this is often used to enable threading.
vfork() is a variant of clone in which the parent is halted until the child process exits or executes another program. It's more efficient in those cases since it doesn't involve copying page tables and such. Basically, everything is shared between the two processes for as long as it takes the child to load another program.
Contrast that last option with the normal copy-on-write where memory itself is shared (until one of the processes writes to it) but the page tables that reference that memory are copied. In other words, vfork() is even more efficient than copy-on-write, at least for the fork-followed-by-immediate-exec use case.
But, in most cases, the child has a different process ID to the parent.
*a Things become tricky when you clone() with CLONE_THREAD. At that stage, the processes still have different identifiers but what constitutes the PID begins to blur. At the deepest level, the Linux scheduler doesn't care about processes, it schedules threads.
A thread has a thread ID (TID) and a thread group ID (TGID). The TGID is what you get from getpid().
When a thread is cloned without CLONE_THREAD, it's given a new TID and it also has its TGID set to that value (i.e., a brand new PID).
With CLONE_THREAD, it's given a new TID but the TGID (hence the reported process ID) remains the same as the parent so they really have the same PID. However, they can distinguish themselves by getting the TID from gettid().
There's quite a bit of trickery going on there with regard to parent process IDs and delivery of signals (both to the threads within a group and the SIGCHLD to the parent), all which can be examined from the clone() man page.
It deserves some explanation. And it's simple as rain.
Consider this. A program has to do some things at the same time. Say, your program is printing "hello world!", each second, until somebody enters "hello, Mike", then, each second, it prints that string, waiting for John to change that in the future.
How do you write this the standard way? In your program, that basically prints "hello," you must create another branch that is waiting for user input.
You create two processes, one outputting those strings, and another one, waiting the user input. And, the only way to create a new process in UNIX was calling the system call fork(), like this:
ret = fork();
if(ret > 0) /* parent, continue waiting */
else /* child */
This scheme posed numerous problems. The user enters "Mike" but you have no simple way to pass that string to the parent process so that it'd be able to print that, because +each+ process has its own view of memory that isn't shared with the child.
When the processes are created by fork(), each one receives a copy of the memory existing at that moment, and if that memory really changes later, the mapping that was identical for those memory segments will be chaged at once (it's called a copy-on-write mechanism).
Another thingies to share between the child and the parent are, for example, opened file descriptors, descriptors of the shared memory, input/outpue stuff, etc., that also wouldn't survive after fork().
So. The very fork() call had to be alleviated, to include shared memory/signals etc. But how? This was the idea behind clone(). That call takes a flag indicating what exatly would you share with the child. For example, the memory, the signal handlers, etc. And if you call this with flag=0, this will be identical to fork(), up to the args they take. And when POSIX pthreads are created, that flag will reflect the attributes you have indicated in pthread_attr.
From the kernel point of view, there's no difference between the processes created such way, and no special semantics to differentiate the "processess". The kernel does not even know, what that "thread" is, it creates a new process, but it simply combines it as belogning to that process group that had the parent who called it, taking care what that process may do. So, you have different procesess (that share the same pid) combined in a process group each assigned with a different "TID" (that starts from PID of the parent).
Care to explain that clone() does exactly that. You may pass this whaterver you need (as the matter of fact, the old vfork() call will do). Are you going to share memory? Hanlers? You may tune everything, just be sure you don't clash with the pthreads library written right away around this very call.
An important thing, the kernel vesion is quite outrageous, it expects just 2 out of 4 parameters to be passed, the user stack, and options.
Since PID is an unique identifier for a process, there's no way to have two distinct process with the same PID.
Threads (which have the same visible 'pid') are implemented with the clone() call. When the flag CLONE_THREAD is supplied then the new process (a 'thread') share the Thread Group Identifier (TGID) with its creator process. getpid actually returns the TGID.
See the clone manpage for more details.
In summary the real PID, as seen by the kernel is always different. The visible PID is the same for threads.

How to use fork() in unix? Why not something of the form fork(pointerToFunctionToRun)?

I am having some trouble understanding how to use Unix's fork(). I am used to, when in need of parallelization, spawining threads in my application. It's always something of the form
CreateNewThread(MyFunctionToRun());
void myFunctionToRun() { ... }
Now, when learning about Unix's fork(), I was given examples of the form:
fork();
printf("%d\n", 123);
in which the code after the fork is "split up". I can't understand how fork() can be useful. Why doesn't fork() have a similar syntax to the above CreateNewThread(), where you pass it the address of a function you want to run?
To accomplish something similar to CreateNewThread(), I'd have to be creative and do something like
//pseudo code
id = fork();
if (id == 0) { //im the child
FunctionToRun();
} else { //im the parent
wait();
}
Maybe the problem is that I am so used to spawning threads the .NET way that I can't think clearly about this. What am I missing here? What are the advantages of fork() over CreateNewThread()?
PS: I know fork() will spawn a new process, while CreateNewThread() will spawn a new thread.
Thanks
fork() says "copy the current process state into a new process and start it running from right here." Because the code is then running in two processes, it in fact returns twice: once in the parent process (where it returns the child process's process identifier) and once in the child (where it returns zero).
There are a lot of restrictions on what it is safe to call in the child process after fork() (see below). The expectation is that the fork() call was part one of spawning a new process running a new executable with its own state. Part two of this process is a call to execve() or one of its variants, which specifies the path to an executable to be loaded into the currently running process, the arguments to be provided to that process, and the environment variables to surround that process. (There is nothing to stop you from re-executing the currently running executable and providing a flag that will make it pick up where the parent left off, if that's what you really want.)
The UNIX fork()-exec() dance is roughly the equivalent of the Windows CreateProcess(). A newer function is even more like it: posix_spawn().
As a practical example of using fork(), consider a shell, such as bash. fork() is used all the time by a command shell. When you tell the shell to run a program (such as echo "hello world"), it forks itself and then execs that program. A pipeline is a collection of forked processes with stdout and stdin rigged up appropriately by the parent in between fork() and exec().
If you want to create a new thread, you should use the Posix threads library. You create a new Posix thread (pthread) using pthread_create(). Your CreateNewThread() example would look like this:
#include <pthread.h>
/* Pthread functions are expected to accept and return void *. */
void *MyFunctionToRun(void *dummy __unused);
pthread_t thread;
int error = pthread_create(&thread,
NULL/*use default thread attributes*/,
MyFunctionToRun,
(void *)NULL/*argument*/);
Before threads were available, fork() was the closest thing UNIX provided to multithreading. Now that threads are available, usage of fork() is almost entirely limited to spawning a new process to execute a different executable.
below: The restrictions are because fork() predates multithreading, so only the thread that calls fork() continues to execute in the child process. Per POSIX:
A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called. [THR] [Option Start] Fork handlers may be established by means of the pthread_atfork() function in order to maintain application invariants across fork() calls. [Option End]
When the application calls fork() from a signal handler and any of the fork handlers registered by pthread_atfork() calls a function that is not asynch-signal-safe, the behavior is undefined.
Because any library function you call could have spawned a thread on your behalf, the paranoid assumption is that you are always limited to executing async-signal-safe operations in the child process between calling fork() and exec().
History aside, there are some fundamental differences with respect to ownership of resource and life time between processes and threads.
When you fork, the new process occupies a completely separate memory space. That's a very important distinction from creating a new thread. In multi-threaded applications you have to consider how you access and manipulate shared resources. Processed that have been forked have to explicitly share resources using inter-process means such as shared memory, pipes, remote procedure calls, semaphores, etc.
Another difference is that fork()'ed children can outlive their parent, where as all threads die when the process terminates.
In a client-server architecture where very, very long uptime is expected, using fork() rather than creating threads could be a valid strategy to combat memory leaks. Rather than worrying about cleaning up memory leaks in your threads, you just fork off a new child process to process each client request, then kill the child when it's done. The only source of memory leaks would then be the parent process that dispatches events.
An analogy: You can think of spawning threads as opening tabs inside a single browser window, while forking is like opening separate browser windows.
It would be more valid to ask why CreateNewThread doesn't just return a thread id like fork() does... after all fork() set a precedent. Your opinion's just coloured by you having seen one before the other. Take a step back and consider that fork() duplicates the process and continues execution... what better place than at the next instruction? Why complicate things by adding a function call into the bargain (and then one what only takes void*)?
Your comment to Mike says "I can't understand is in which contexts you'd want to use it.". Basically, you use it when you want to:
run another process using the exec family of functions
do some parallel processing independently (in terms of memory usage, signal handling, resources, security, robustness), for example:
each process may have intrusive limits of the number of file descriptors they can manage, or on a 32-bit system - the amount of memory: a second process can share the work while getting its own resources
web browsers tend to fork distinct processes because they can do some initialisation then call operating system functions to permanently reduce their privileges (e.g. change to a less-trusted user id, change the "root" directory under which they can access files, or make some memory pages read-only); most OSes don't allow the same extent of fine-grained permission-setting on a per-thread basis; another benefit is if a child process seg-faults or similar the parent process can handle that and continue, whereas similar faults in multi-threaded code raise questions about whether memory has been corrupted - or locks have been held - by the crashing thread such that remaining threads are compromised
BTW / using UNIX/Linux doesn't mean you have to give up threads for fork()ing processes... you can use pthread_create() and related functions if you're more comfortable with the threading paradigm.
Letting the difference between spawning a process and a thread set aside for a second: Basically, fork() is a more fundamental primitive. While SpawnNewThread has to do some background work to get the program counter in the right spot, fork does no such work, it just copies (or virtually copies) your program memory and continues the counter.
Fork has been with us for a very, very, long time. Fork was thought of before the idea of 'start a thread running a particular function' was a glimmer in anyone's eye.
People don't use fork because it's 'better,' we use it because it is the one and only unprivileged user-mode process creation function that works across all variations of Linux. If you want to create a process, you have to call fork. And, for some purposes, a process is what you need, not a thread.
You might consider researching the early papers on the subject.
It is worth noting that multi-processing not exactly the same as multi-threading. The new process created by fork share very little context with the old one, which is quite different from the case for threads.
So, lets look at the unixy thread system: pthread_create has semantics similar to CreateNewThread.
Or, to turn it around, lets look at the windows (or java or other system that makes its living with threads) way of spawning a process identical to the one you're currently running (which is what fork does on unix)...well, we could except that there isn't one: that just not part of the all-threads-all-the-time model. (Which is not a bad thing, mind you, just different).
You fork whenever you want to more than one thing at the same time. It’s called multitasking, and is really useful.
Here for example is a telnetish like program:
#!/usr/bin/perl
use strict;
use IO::Socket;
my ($host, $port, $kidpid, $handle, $line);
unless (#ARGV == 2) { die "usage: $0 host port" }
($host, $port) = #ARGV;
# create a tcp connection to the specified host and port
$handle = IO::Socket::INET->new(Proto => "tcp",
PeerAddr => $host,
PeerPort => $port)
or die "can't connect to port $port on $host: $!";
$handle->autoflush(1); # so output gets there right away
print STDERR "[Connected to $host:$port]\n";
# split the program into two processes, identical twins
die "can't fork: $!" unless defined($kidpid = fork());
if ($kidpid) {
# parent copies the socket to standard output
while (defined ($line = <$handle>)) {
print STDOUT $line;
}
kill("TERM" => $kidpid); # send SIGTERM to child
}
else {
# child copies standard input to the socket
while (defined ($line = <STDIN>)) {
print $handle $line;
}
}
exit;
See how easy that is?
Fork()'s most popular use is as a way to clone a server for each new client that connect()s (because the new process inherits all file descriptors in whatever state they exist).
But I've also used it to initiate a new (locally running) service on-demand from a client.
That scheme is best done with two calls to fork() - one stays in the parent session until the server is up and running and able to connect, the other (I fork it off from the child) becomes the server and departs the parent's session so it can no longer be reached by (say) SIGQUIT.

Resources