Here is my question:
if a process (father) create a new process (child) with fork(),which of these data structure do not share between father and the son??
-process ID
-heap
-code
-stack
Relation for Process ID
Upon successful completion, fork() returns a value of 0 to the child
process and returns the process ID of the child process to the parent
process. Otherwise, a value of -1 is returned to the parent process, no
child process is created, and the global variable errno is set to indi-
cate the error
Relation of heap or memory space
The child gets an exact copy of the parents address space, which in many cases is likely to be laid out in the same format as the parent address space. I have to point out that each one will have it's own virtual address space for it's memory, such that each could have the same data at the same address, yet in different address spaces. Also, linux uses copy on write when creating child processes. This means that the parent and child will share the parent address space until one of them does a write, at which point the memory will be physically copied to the child. This eliminates unneeded copies when execing a new process. Since you're just going to overwrite the memory with a new executable, why bother copying it?
Relation for code
There is no object-oriented inheritence in C.
Fork'ing in C is basically the process being stopped while it is running, and an entire copy of it being made in (effectively) a different memory space, then both processes being told to continue. They will both continue from where the parent was paused. The only way you can tell which process you are in is to check the return value of the fork() call.
In such a situation the child doesn't really inherit everything from the parent process, it's more like it gets a complete copy of everything the parent had.
Stack
child process gets separate instance of global variable declared in parent process".
The point of separate processes is to separate memory. So you can't share variables between the parent and the child process once the fork occured.
Related
I am reading "Linux Kernel Development, Second Edition" by Robert Love. (Yes, it's a bit outdated). I understand from Chapter 3: Process Management that in COW (copy-on-write), the parent and child processes share the parent's address space until one of the processes writes to the address space. This is to prevent the unnecessary duplication of the parent's address space when it is not even being written to.
But then, it mentions that if the child process calls exec() right after fork(), the parent's address space and pages DON'T need to be copied and given to the child as a separate copy. That's where I'm lost.
According to the manual, "the exec() family of functions replaces the current process image with a new process image." The manual doesn't say anything about exec() creating a new address space for the new process image. So if the child process is sharing address space with its parent, wouldn't this mean that exec() would load an executable image into the parent's address space (which is shared with the child)?
Since that means the parent's address space would be overwritten, I don't understand how a child process that executes exec() after fork() WOULDN'T need a separate copy of its parent's address space to write to. Is there something I'm missing here?
Copy-on-Write mechanism implies, that none modification in child process will affect on parent.
Calling exec by the child is not an exception: it changes address space only for child, not for the parent.
You can even read on vfork() which doesn't have copy on write mechanism. It shares the address space of the parent process and parent process is suspected untill child process exists. It is interesting and makes things much more clearer.
From the assumption made in the title of my question "Fork create a new process that is exactly the same as its parent". I am wondering how a fork is really made by the operating system.
Considering a heavy process (huge RAM footprint) that fork itself to accomplish a small task (list the files into a directory). From the assumption, I expect that the child process will be as big as the first one. However my common sense tells me that it cannot be the case.
How does it work in the real world?
As others mentioned in the comments, a technique called Copy-On-Write mitigates the heavy cost of copying the entire memory space of the parent process. Copy-on-write means that memory pages are shared read-only between parent and child until either of them decides to write -- at which point the page is copied and each process gets its own private copy. This technique easily prevents a huge amount of copying that in a lot of cases would be a waste of time because the child will exec() or do something simple and exit.
Here's what happens in detail:
When you call fork(2), the only immediate cost you incur is the cost of allocating a new unique process descriptor and the cost of copying the parent's page tables. In Linux, fork(2) is implemented by the clone(2) syscall, which is a more general syscall that allows the caller to control which parts of the new process are shared with the parent. When called from fork(2), a set of flags are passed to indicate that nothing is to be shared (you can choose to share memory, file descriptors, etc - this is how threads are implemented: by calling clone(2) with CLONE_VM, which means "share the memory space").
Under the hood, each process's memory page has a bit flag that is the copy-on-write flag that indicates whether that page should be copied before being written to. fork(2) marks every writeable page in a process with that bit. Each page also maintains a reference count.
So, when a process forks, the kernel sets the copy-on-write bit on every non-private, writeable page of that process and increments the reference count by one. The child process has pointers to these same pages.
Then, every page is marked read-only so that an attempt to write to the page generates a page fault - this is needed to wake up the kernel so that it has a chance of seeing what happened and what needs to be done.
When either of the processes writes to a page that is still being shared, and thus is marked read-only, the kernel wakes up and attempts to figure out why there is a page fault. Assuming the parent / child process is writing to a legit location, the kernel eventually sees that the page fault was generated because the page is marked copy-on-write and there is more than one reference to that page.
The kernel then allocates memory, copies the page into the new location, and the write can proceed.
What is different across forks
You said that fork(2) creates a new process that is exactly the same as its parent. This is not quite true. There are several differences between the parent and the child:
The process ID is different
The parent process ID is different
The child's resource usage (CPU time, etc) are set to 0
File locks owned by the parent are not inherited
The set of pending signals on the child is cleared
Pending alarms are cleared on the child
About vfork
The vfork(2) syscall is very similar to fork(), but it does absolutely no copying - it doesn't even copy the parent's page tables. With the introduction of copy-on-write, it's not as widely used anymore, but historically it was used by processes that would call exec() after forking.
Naturally, attempting to write to memory in the child process after a vfork() results in chaos.
When a process forked, the child process use the same page table as its parent util parent or child write to its space!So
According to wikipedia (which could be wrong)
When a fork() system call is issued, a copy of all the pages corresponding to the parent process is created, loaded into a separate memory location by the OS for the child process. But this is not needed in certain cases. Consider the case when a child executes an "exec" system call (which is used to execute any executable file from within a C program) or exits very soon after the fork(). When the child is needed just to execute a command for the parent process, there is no need for copying the parent process' pages, since exec replaces the address space of the process which invoked it with the command to be executed.
In such cases, a technique called copy-on-write (COW) is used. With this technique, when a fork occurs, the parent process's pages are not copied for the child process. Instead, the pages are shared between the child and the parent process. Whenever a process (parent or child) modifies a page, a separate copy of that particular page alone is made for that process (parent or child) which performed the modification. This process will then use the newly copied page rather than the shared one in all future references. The other process (the one which did not modify the shared page) continues to use the original copy of the page (which is now no longer shared). This technique is called copy-on-write since the page is copied when some process writes to it.
It seems that when either of the process tries to write to the page. A new copy of the page is allocated and assigned to the process that generated the page fault. The original page is marked writable afterwards.
My question is: what happens if the fork is called multiple times before any of the process made an attempt to write to a shared page?
If fork is called multiple times from the original parent process, then each of the children and parent will have their pages marked as read-only. When a child process attempts to write data then the page from the parent process is copied to its address space and the copied page is marked as writeable in the child but not in the parent.
If fork is called from the child process and the grand-child attempts to write, the page from the original parent is copied to the first child, and then to the grand child, and all is marked as writeable.
The original page is only marked writeable if it belongs to a single process, which might not be the case if there were multiple forks. The new page is always marked as writeable because it only belongs to the process which attempted to write it.
Just a quick question, if I clone a process, the PID of the cloned process is the same, yes ? fork() creates a child process where the PID differs, but everything else is the same. Vfork() creates a child process with the same PID. Exec works to change a process currently in execution to something else.
Am I correct in all of these statements ?
Not quite. If you clone a process via fork/exec, or vfork/exec, you will get a new process id. fork() will give you the new process with a new process id, and exec() replaces that process with a new process, but maintaining the process id.
From here:
The vfork() function differs from
fork() only in that the child process
can share code and data with the
calling process (parent process). This
speeds cloning activity significantly
at a risk to the integrity of the
parent process if vfork() is misused.
Neither fork() nor vfork() keep the same PID although clone() can in one scenario (*a). They are all different ways to achieve roughly the same end, the creation of a distinct child.
clone() is like fork() but there are many things shared by the two processes and this is often used to enable threading.
vfork() is a variant of clone in which the parent is halted until the child process exits or executes another program. It's more efficient in those cases since it doesn't involve copying page tables and such. Basically, everything is shared between the two processes for as long as it takes the child to load another program.
Contrast that last option with the normal copy-on-write where memory itself is shared (until one of the processes writes to it) but the page tables that reference that memory are copied. In other words, vfork() is even more efficient than copy-on-write, at least for the fork-followed-by-immediate-exec use case.
But, in most cases, the child has a different process ID to the parent.
*a Things become tricky when you clone() with CLONE_THREAD. At that stage, the processes still have different identifiers but what constitutes the PID begins to blur. At the deepest level, the Linux scheduler doesn't care about processes, it schedules threads.
A thread has a thread ID (TID) and a thread group ID (TGID). The TGID is what you get from getpid().
When a thread is cloned without CLONE_THREAD, it's given a new TID and it also has its TGID set to that value (i.e., a brand new PID).
With CLONE_THREAD, it's given a new TID but the TGID (hence the reported process ID) remains the same as the parent so they really have the same PID. However, they can distinguish themselves by getting the TID from gettid().
There's quite a bit of trickery going on there with regard to parent process IDs and delivery of signals (both to the threads within a group and the SIGCHLD to the parent), all which can be examined from the clone() man page.
It deserves some explanation. And it's simple as rain.
Consider this. A program has to do some things at the same time. Say, your program is printing "hello world!", each second, until somebody enters "hello, Mike", then, each second, it prints that string, waiting for John to change that in the future.
How do you write this the standard way? In your program, that basically prints "hello," you must create another branch that is waiting for user input.
You create two processes, one outputting those strings, and another one, waiting the user input. And, the only way to create a new process in UNIX was calling the system call fork(), like this:
ret = fork();
if(ret > 0) /* parent, continue waiting */
else /* child */
This scheme posed numerous problems. The user enters "Mike" but you have no simple way to pass that string to the parent process so that it'd be able to print that, because +each+ process has its own view of memory that isn't shared with the child.
When the processes are created by fork(), each one receives a copy of the memory existing at that moment, and if that memory really changes later, the mapping that was identical for those memory segments will be chaged at once (it's called a copy-on-write mechanism).
Another thingies to share between the child and the parent are, for example, opened file descriptors, descriptors of the shared memory, input/outpue stuff, etc., that also wouldn't survive after fork().
So. The very fork() call had to be alleviated, to include shared memory/signals etc. But how? This was the idea behind clone(). That call takes a flag indicating what exatly would you share with the child. For example, the memory, the signal handlers, etc. And if you call this with flag=0, this will be identical to fork(), up to the args they take. And when POSIX pthreads are created, that flag will reflect the attributes you have indicated in pthread_attr.
From the kernel point of view, there's no difference between the processes created such way, and no special semantics to differentiate the "processess". The kernel does not even know, what that "thread" is, it creates a new process, but it simply combines it as belogning to that process group that had the parent who called it, taking care what that process may do. So, you have different procesess (that share the same pid) combined in a process group each assigned with a different "TID" (that starts from PID of the parent).
Care to explain that clone() does exactly that. You may pass this whaterver you need (as the matter of fact, the old vfork() call will do). Are you going to share memory? Hanlers? You may tune everything, just be sure you don't clash with the pthreads library written right away around this very call.
An important thing, the kernel vesion is quite outrageous, it expects just 2 out of 4 parameters to be passed, the user stack, and options.
Since PID is an unique identifier for a process, there's no way to have two distinct process with the same PID.
Threads (which have the same visible 'pid') are implemented with the clone() call. When the flag CLONE_THREAD is supplied then the new process (a 'thread') share the Thread Group Identifier (TGID) with its creator process. getpid actually returns the TGID.
See the clone manpage for more details.
In summary the real PID, as seen by the kernel is always different. The visible PID is the same for threads.
I was looking to find the difference between these four on Google and I expected there to be a huge amount of information on this, but there really wasn't any solid comparison between the four calls.
I set about trying to compile a kind of basic at-a-glance look at the differences between these system calls and here's what I got. Is all this information correct/am I missing anything important ?
Fork : The fork call basically makes a duplicate of the current process, identical in almost every way (not everything is copied over, for example, resource limits in some implementations but the idea is to create as close a copy as possible).
The new process (child) gets a different process ID (PID) and has the PID of the old process (parent) as its parent PID (PPID). Because the two processes are now running exactly the same code, they can tell which is which by the return code of fork - the child gets 0, the parent gets the PID of the child. This is all, of course, assuming the fork call works - if not, no child is created and the parent gets an error code.
Vfork: The basic difference between vfork() and fork() is that when a new process is created with vfork(), the parent process is temporarily suspended, and the child process might borrow the parent's address space. This strange state of affairs continues until the child process either exits, or calls execve(), at which point the parent
process continues.
This means that the child process of a vfork() must be careful to avoid unexpectedly modifying variables of the parent process. In particular, the child process must not return from the function containing the vfork() call, and it must not call exit() (if it needs to exit, it should use _exit(); actually, this is also true for the child of a normal fork()).
Exec: The exec call is a way to basically replace the entire current process with a new program. It loads the program into the current process space and runs it from the entry point. exec() replaces the current process with a the executable pointed by the function. Control never returns to the original program unless there is an exec() error.
Clone: clone(), as fork(), creates a new process. Unlike fork(), these calls allow the child process to share parts of its execution context with the calling process, such as the memory space, the table of file descriptors, and the table of signal handlers.
When the child process is created with clone(), it executes the function application fn(arg) (This differs from fork(), where execution continues in the child from the point of the original fork() call.) The fn argument is a pointer to a function that is called by the child process at the beginning of its execution. The arg argument is passed to the fn function.
When the fn(arg) function application returns, the child process terminates. The integer returned by fn is the exit code for the child process. The child process may also terminate explicitly by calling exit(2) or after receiving a fatal signal.
Information gotten from:
Differences between fork and exec
http://www.allinterview.com/showanswers/59616.html
http://www.unixguide.net/unix/programming/1.1.2.shtml
http://linux.about.com/library/cmd/blcmdl2_clone.htm
Thanks for taking the time to read this ! :)
vfork() is an obsolete optimization. Before good memory management, fork() made a full copy of the parent's memory, so it was pretty expensive. since in many cases a fork() was followed by exec(), which discards the current memory map and creates a new one, it was a needless expense. Nowadays, fork() doesn't copy the memory; it's simply set as "copy on write", so fork()+exec() is just as efficient as vfork()+exec().
clone() is the syscall used by fork(). with some parameters, it creates a new process, with others, it creates a thread. the difference between them is just which data structures (memory space, processor state, stack, PID, open files, etc) are shared or not.
execve() replaces the current executable image with another one loaded from an executable file.
fork() creates a child process.
vfork() is a historical optimized version of fork(), meant to be used when execve() is called directly after fork(). It turned out to work well in non-MMU systems (where fork() cannot work in an efficient manner) and when fork()ing processes with a huge memory footprint to run some small program (think Java's Runtime.exec()). POSIX has standardized the posix_spawn() to replace these latter two more modern uses of vfork().
posix_spawn() does the equivalent of a fork()/execve(), and also allows some fd juggling in between. It's supposed to replace fork()/execve(), mainly for non-MMU platforms.
pthread_create() creates a new thread.
clone() is a Linux-specific call, which can be used to implement anything from fork() to pthread_create(). It gives a lot of control. Inspired on rfork().
rfork() is a Plan-9 specific call. It's supposed to be a generic call, allowing several degrees of sharing, between full processes and threads.
fork() - creates a new child process, which is a complete copy of the parent process. Child and parent processes use different virtual address spaces, which is initially populated by the same memory pages. Then, as both processes are executed, the virtual address spaces begin to differ more and more, because the operating system performs a lazy copying of memory pages that are being written by either of these two processes and assigns an independent copies of the modified pages of memory for each process. This technique is called Copy-On-Write (COW).
vfork() - creates a new child process, which is a "quick" copy of the parent process. In contrast to the system call fork(), child and parent processes share the same virtual address space. NOTE! Using the same virtual address space, both the parent and child use the same stack, the stack pointer and the instruction pointer, as in the case of the classic fork()! To prevent unwanted interference between parent and child, which use the same stack, execution of the parent process is frozen until the child will call either exec() (create a new virtual address space and a transition to a different stack) or _exit() (termination of the process execution). vfork() is the optimization of fork() for "fork-and-exec" model. It can be performed 4-5 times faster than the fork(), because unlike the fork() (even with COW kept in the mind), implementation of vfork() system call does not include the creation of a new address space (the allocation and setting up of new page directories).
clone() - creates a new child process. Various parameters of this system call, specify which parts of the parent process must be copied into the child process and which parts will be shared between them. As a result, this system call can be used to create all kinds of execution entities, starting from threads and finishing by completely independent processes. In fact, clone() system call is the base which is used for the implementation of pthread_create() and all the family of the fork() system calls.
exec() - resets all the memory of the process, loads and parses specified executable binary, sets up new stack and passes control to the entry point of the loaded executable. This system call never return control to the caller and serves for loading of a new program to the already existing process. This system call with fork() system call together form a classical UNIX process management model called "fork-and-exec".
The fork(),vfork() and clone() all call the do_fork() to do the real work, but with different parameters.
asmlinkage int sys_fork(struct pt_regs regs)
{
return do_fork(SIGCHLD, regs.esp, ®s, 0);
}
asmlinkage int sys_clone(struct pt_regs regs)
{
unsigned long clone_flags;
unsigned long newsp;
clone_flags = regs.ebx;
newsp = regs.ecx;
if (!newsp)
newsp = regs.esp;
return do_fork(clone_flags, newsp, ®s, 0);
}
asmlinkage int sys_vfork(struct pt_regs regs)
{
return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs.esp, ®s, 0);
}
#define CLONE_VFORK 0x00004000 /* set if the parent wants the child to wake it up on mm_release */
#define CLONE_VM 0x00000100 /* set if VM shared between processes */
SIGCHLD means the child should send this signal to its father when exit.
For fork, the child and father has the independent VM page table, but since the efficiency, fork will not really copy any pages, it just set all the writeable pages to readonly for child process. So when child process want to write something on that page, an page exception happen and kernel will alloc a new page cloned from the old page with write permission. That's called "copy on write".
For vfork, the virtual memory is exactly by child and father---just because of that, father and child can't be awake concurrently since they will influence each other. So the father will sleep at the end of "do_fork()" and awake when child call exit() or execve() since then it will own new page table. Here is the code(in do_fork()) that the father sleep.
if ((clone_flags & CLONE_VFORK) && (retval > 0))
down(&sem);
return retval;
Here is the code(in mm_release() called by exit() and execve()) which awake the father.
up(tsk->p_opptr->vfork_sem);
For sys_clone(), it is more flexible since you can input any clone_flags to it. So pthread_create() call this system call with many clone_flags:
int clone_flags = (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGNAL | CLONE_SETTLS | CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID | CLONE_SYSVSEM);
Summary: the fork(),vfork() and clone() will create child processes with different mount of sharing resource with the father process. We also can say the vfork() and clone() can create threads(actually they are processes since they have independent task_struct) since they share the VM page table with father process.
in fork(), either child or parent process will execute based on cpu selection..
But in vfork(), surely child will execute first. after child terminated, parent will execute.