Fork()-ing a new process - linux

Calling fork() ends up in do_fork() inside the kernel, which makes an exact copy of the calling process. The books I have read say that the child created by fork() then calls exec to start the new program.
Example:
Running the ls command from a shell creates processes like this:
sh(Parent)
|
sh(Child)
|
ls(New Process)
But I cannot understand how and where exec*() is called,
because all I can see is that fork() creates the child shell.
When and where is the new process created and executed?

You have to exec() if you actually want a new program running in one of the processes (usually the child but not absolutely necessary). In your specific case where the shell executes ls, the shell first forks, then the child process execs. But it's important to realise that this is two distinct operations.
All fork() does is give you two (nearly) identical processes. You then use the return value from fork() to decide whether you are the parent (you get the positive PID of the child, or -1 if the fork() failed) or the child (you get 0).
See this answer for a description on how fork() and exec() work together (under your control) and how they can be used without each other.
Similar to do_fork(), the exec stuff all boils down to calls to do_execve, located in exec.c.

Related

What functions are used when the system call fork() is called

I've been searching through the net to find what functions are used inside the fork.c and in what order, however I can't seem to find the answer. All I see is what fork.c does. I know that fork.c uses _do_fork() but how it gets there I don't know.
When the fork() system call is made, it creates a new process by duplicating the calling process. The new process is called the child process.
Here is a basic overview of the call chain:
fork()->sys_fork()->do_fork()
sys_fork()
{
1. First it will validate the arguments.
2. Invoke do_fork.
3. return pid. (child pid)
}
do_fork()
{
1. First it will Allocate new address space.
2. Copy Segments of Caller address space to new address space.
3. allocate new task_struct instance. (PCB)
4. copy caller task_struct entries to new task_struct.
5. return.
}
On success, the PID of the child process is returned in the parent, and 0 is returned in the child.
Note: there are a few more calls involved, but these two are the most important. If you want to know more, look into the kernel source. If you still need help, let me know.

Linux pipe example . ipc pipe creation

I was looking through the pipe(2) syscall example for Linux, which I got from tldp: http://tldp.org/LDP/lpg/node11.html#SECTION00722000000000000000
When we need to close the input of the child process, we close fd(1) of the child. Fine, but we should also close the output of the parent, i.e. close fd(0) of the parent. Why is there an else statement here? In that case the parent's fd(0) will be closed only when the fork fails, am I correct?
I feel there should not be an else statement, and both the input of the child and the output of the parent should be closed for communication from child to parent. Is that correct?
You shouldn't talk about child input and parent output, that looks like you are referring to stdin and stdout, which is not necessarily the same as the pipe's read and write channels.
For communication from child to parent, the child needs to close the pipe's read channel (fd[0] in your example), and the parent needs to close the pipe's write channel (fd[1]).
Your confusion seems to be more about forking than about pipes.
The else is needed, because we need to execute different code in the parent and in the child. It is very common to use if / else after forking to differentiate the code that executes in each process. Remember that fork(2) returns twice: in the parent, and in the newborn child. It returns the child's pid in the parent, and 0 in the child, so we use that to tell them apart.
In the example you posted, if fork(2) fails, the first if is entered and the process exits. Otherwise, a pair of if / else is used to execute different code in each process.

Different tasks assigned to different instances of fork() of a process in C

Can I assign a different task to each instance of fork() of a process in C?
For example:
program.c has been forked 3 times
int main()
{
pid_t pid;
pid = fork();
pid = fork();
pid = fork();
}
Now I want each instance of fork() to do a different thing. Can I do this with forks, or with any other method if that is preferable? :)
PS: I am testing Real Time Linux and want to measure the performance of context switching between forked processes under time constraints.
You can use posix_spawn:
posix_spawn(&Pid, ProgramPath.c_str(), &FileActions, &SpawnAttr, argv, envp);
Check its documentation here.
You always have to test the result of fork(2) (in particular, to handle error cases), and do different things for a 0 result (success, in the child process), a positive result (success, in the parent process), and a negative result (failure, so use perror). So according to that result you can do different things. Often you end up invoking execve(2) in the child process (when fork gives 0), and you usually set things up (e.g. for IPC through pipe(7)-s) before calling fork.
So to assign a different task after a fork, just execute different code according to the result of fork.
You should read Advanced Linux Programming. It has several chapters explaining all that (so I won't take the time to explain it here).
You could be interested in pthreads (implemented using clone(2) and futex(7), which you should not use directly unless you are implementing your thread library, which is not reasonable).
Also try running strace(1) on several programs (including a shell and some basic commands). It will tell you which syscalls(2) they are calling. See also intro(2).

Linux schedule task when another is done

I have a task/process currently running. I would like to schedule another task to start when the first one finishes.
How can I do that in Linux?
(I can't stop the first one and create a script to start one task after the other.)
Somewhat meager spec, but something along the line of
watch -n 1 'pgrep task1 || task2'
might do the job.
You want wait.
Either the system call in section 2 of the manual, one of its variants like waitpid, or the shell builtin, which is designed explicitly for this purpose.
The shell builtin is a little more natural because both processes are children of the shell, so you write a script like:
#!/bin/sh
command1 arguments &
wait
command2 args
To use the system calls you will have to write a program that forks, launches the first command in the child then waits before execing the second program.
The manpage for wait(2) says:
wait() and waitpid()
The wait() system call suspends execution of the current process until one of its children terminates. The call wait(&status) is equivalent to:
waitpid(-1, &status, 0);
The waitpid() system call suspends execution of the current process until a child specified by pid argument has changed state.

unix fork() understanding

int main(){
fork();
}
I know this is a newbie question, but my understanding is that the parent process forks a new child process that is exactly like the parent, which means that the child should also fork a child process, and so on. In reality, this only generates one child process. I can't understand what code the child will be executing.
The child process begins executing at the exact point where the parent left off: right after the fork() call. If you wanted to fork forever, you'd have to put it in a while loop.
As everybody mentioned, the child also starts executing after fork() has finished. Thus, it doesn't call fork again.
You could see it clearly in the very common usage like this:
int main()
{
if (fork())
{
// you are in parent. The return value of fork was the pid of the child
// here you can do stuff and perhaps eventually `wait` on the child
}
else
{
// you are in the child. The return value of fork was 0
// you may often see here an `exec*` command
}
}
You missed a semi-colon.
But the child (and also the parent) continues just after the point where the fork happened. From the point of view of application programming, fork (like all system calls) is "atomic".
The only difference between the two processes (which after the fork have conceptually separate memory spaces) is the result of the fork.
If the child went on to call fork, the child would have two forks (the one that created it and the one that it then made) while the parent would only have one (the one that gave it a child). The nature of fork is that one process calls it and two processes return from it.
