What happens if I use wait() in a child process? - linux

Consider the following code fragment:
for(i = 0; i < 5; i++)
    if(fork() == 0) {
        printf("%d\n", i);
        wait(0);
    }
What will be the result and how many new processes will be created?

(1) Is this a homework assignment? A question about a homework assignment is OK; getting someone to do your homework assignment for you is not.
(2) Why don't you try it and see?
(3) If a process has no child processes of its own, wait() returns -1 immediately (with errno set to ECHILD).
(4) Be warned that each child process, after wait(0);, will continue the loop. That is, the parent will fork and the child will print 0; the second time round, both processes will fork, and their children will print 1; the third time round, all four processes will fork, and their four children will print 2, and so forth.
(5) Also be warned that the processes each run independently: the first child may go round the loop several times before the parent does even one, or vice versa.
If you have a computer to access the internet, you have a computer to try things on - if you are going to be doing C homework in the future, it would be well worth the effort to download a free C compiler for your computer. Try Tiny C at http://bellard.org/tcc/

Related

How to prevent switching threads in FOR loop, while using omp directive

Assume a simple for loop like this:
#pragma omp parallel for
for (int i = 0; i < 10; i++)
{
    //do_something_1
    //do_something_2
    //do_something_3
}
As I understand it, loop iterations can run in any order. This is fine for me.
The problem is that threads can switch between themselves in the middle of an iteration.
Assume 10 threads were created (one per iteration).
Let's say thread_1 is currently running, and after completing the do_something_1 and do_something_2 lines, execution switches to thread_4 (for example).
Is there a way to force a thread to complete a whole iteration without being switched out? I mean thread_1 completing the lines do_something_1, do_something_2 and do_something_3 without being interrupted.
I understand that this is part of the OS scheduling algorithm in a multithreaded environment, but I hope there is a way to bypass it.
Edit:
Using the ordered clause in the pragma is needed only in two cases:
1) You need the result of the previous iteration in the current iteration. In that case it effectively becomes a single-threaded program.
2) You need the index to be correct in each iteration (though you can still run all iterations in parallel).
Let's look at an example of my problem:
int new_index = 0;
#pragma omp parallel for
for (int i = 0; i < 10; i++)
{
    <mutex lock>
    new_index++;
    <mutex unlock>
    //do_something_1
    //do_something_2
    //do_something_3
    my_array[new_index] = 5; //correct
    my_array[i] = 5; //not correct
}
So, there will still be 10 iterations, but now there should be a correct index each time for my_array.
The problem is: thread_1 increments new_index (new_index = 1), completes do_something_1, and is then switched out for thread_2.
Thread_2 completes its iteration entirely (new_index = 2), but when the OS switches back to thread_1, new_index no longer has the value thread_1 expects (new_index = 1), and my_array stays unchanged.
So I wondered whether it is possible to tell the OS not to switch threads in the middle of an iteration.

Divide and conquer method using parent process and child processes

The assignment I have to do requires working with processes in Linux. The task is: calculate the sum of the elements of an array using the divide and conquer method in the following way: a parent process splits the array into two subarrays, which are passed to two child processes. Each child process must calculate the sum of the elements of its own subarray, and the results (s1 and s2) are then added; the child processes should repeat the same "technique" until the final sum is returned.
I must admit that I really don't know much about Unix processes, as I've just started studying this chapter, but I know how to use fork() to create two child processes from a parent process, and also how to write a C program that uses the divide and conquer method. My problem is that I am struggling to bring these two aspects together, that is, to integrate the divide and conquer algorithm for calculating the sum of the elements of an array into a program that creates two child processes, such as the following one:
pid_t child_a, child_b;
child_a = fork();
if (child_a == 0) {
    /* Child A code */
} else {
    child_b = fork();
    if (child_b == 0) {
        /* Child B code */
    } else {
        /* Parent Code */
    }
}
You can try using a for loop to do the fork() calls.
First, divide the array into multiple subarrays.
In the first for() loop, create a child with fork() and assign a subarray to it (do the calculation in the if (pid == 0) section).
After that, all child processes are ready to do their calculations.
You need another for() loop to sum up the results from all child processes. But be careful: you have to wait for every child process until all of the results are in.
Finally, the work is done.

How many child processes can a parent spawn before becoming infeasible?

I'm a C programmer learning about fork(), exec(), and wait() for the first time. I'm also whiteboarding a Standard C program which will run on Linux and potentially need a lot of child processes. What I can't gauge is... how many child processes are too many for one parent to spawn and then wait upon?
Suppose my code looked like this:
pid_t status[ LARGENUMBER ];
status[0] = fork();
if( status[0] == 0 )
{
    // I am the child
    exec("./newCode01.c");
}
status[1] = fork();
if( status[1] == 0 )
{
    // child
    exec("./newCode02.c");
}
...etc...
wait(status[0]);
wait(status[1]);
...and so on....
Obviously, the larger LARGENUMBER is, the greater the chance that the parent is still fork() ing while children are segfaulting or becoming zombies or whatever.
So this implementation seems problematic to me. As I understand it, the parent can only wait() for one child at a time? What if LARGENUMBER is huge, and the time gap between running status[0] = fork(); and wait(status[0]); is substantial? What if the child has run, become a zombie, and been terminated by the OS somehow in that time? Will the parent then wait(status[0]) forever?
In the above example, there must be some standard or guideline for how big LARGENUMBER can be. Or is my approach all wrong?
#define LARGENUMBER 1
#define LARGENUMBER 10
#define LARGENUMBER 100
#define LARGENUMBER 1000
#define LARGENUMBER ???
I want to play with this, but my instinct is to ask for advice before I invest the development time into a program which may or may not turn out to be infeasible. Any advice/experience is appreciated.
If you read the documentation of wait, you would know that
If status information is available prior to the call to wait(), return will be immediate.
That means, if the child has already terminated, wait() will return immediately.
The OS will not remove the information from the process table until you have called wait¹ for the child process or your program exits:
If a parent process terminates without waiting for all of its child processes to terminate, the remaining child processes will be assigned a new parent process ID corresponding to an implementation-dependent system process.
Of course, you still can't spawn an unlimited number of children; for more detail on that, see Maximum number of children processes on Linux (as far as Linux is concerned; other OSes will impose other limits).
¹: https://en.wikipedia.org/wiki/Zombie_process
I will try my best to explain.
First, a bad example: you fork() one child process, then wait for it to finish before forking another child process. This kills the degree of multiprocessing and gives poor CPU utilization.
pid = fork();
if (pid == -1) { ... } // handle error
else if (pid == 0) { execv(...); } // child
else { // parent (pid > 0)
    wait(NULL);
    pid = fork();
    if (pid == -1) { ... } // handle error
    else if (pid == 0) { execv(...); } // child
    else { wait(NULL); } // parent (pid > 0)
}
How should it be done?
In this approach, you first create the two child processes, then wait. This increases CPU utilization and the degree of multiprocessing.
pid1 = fork();
if (pid1 == -1) { ... } // handle error
if (pid1 == 0) {execv(...);}
pid2 = fork();
if (pid2 == -1) { ... } // handle error
if (pid2 == 0) {execv(...);}
if (pid1 > 0) {wait(NULL); }
if (pid2 > 0) {wait(NULL); }
NOTE:
Even though the parent blocks in its first wait() before the second wait() is executed, both children are already running by then; neither is waiting to be spawned or to call execv.
In your case, you are using the second approach: first fork all the processes and save the return values of fork(), then wait.
the parent can only wait() for one child at a time?
The parent can wait for all of its children, one at a time, whether they have already finished and become zombie processes or are still running. For more details, look here.
How many child processes can a parent spawn before becoming infeasible?
It might be OS dependent, but one acceptable approach is to split the time given to a process in two: half for the child process and half for the parent process.
That way, processes can't exhaust the system and cheat by creating child processes that run for longer than the OS wanted to give the parent process in the first place.

Can someone please explain how this works? fork(), sleep()

#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void)
{
    pid_t pids[10];
    int i;
    for (i = 9; i >= 0; --i) {
        pids[i] = fork();
        if (pids[i] == 0) {
            printf("Child%d\n", i);
            sleep(i + 1);
            _exit(0);
        }
    }
    for (i = 9; i >= 0; --i) {
        printf("parent%d\n", i);
        waitpid(pids[i], NULL, 0);
    }
    return 0;
}
What is happening here? How is sleep() getting executed in the for loop? When is it getting called? Here is the output:
parent9
Child3
Child4
Child2
Child5
Child1
Child6
Child0
Child7
Child8
Child9 //there is a pause here
parent8
parent7
parent6
parent5
parent4
parent3
parent2
parent1
parent0
Please explain this output. I am not able to understand how it's working.
Step by step analysis would be great.
In the first loop, the original (parent) process forks 10 copies of itself. Each of these child processes (detected by the fact that fork() returned zero) prints a message, sleeps, and exits. All of the children are created at essentially the same time (since the parent is doing very little in the loop), so it's somewhat random when each of them gets scheduled for the first time - thus the scrambled order of their messages.
During the loop, an array of child process IDs is built. There is a copy of the pids[] array in all 11 processes, but only in the parent is it complete - the copy in each child will be missing the lower-numbered child PIDs, and have zero for its own PID. (Not that this really matters, as only the parent process actually uses this array.)
The second loop executes only in the parent process (because all of the children have exited before this point), and waits for each child to exit. It waits for the child that slept 10 seconds first; all the others have long since exited, so all of the messages (except the first) appear in quick succession. There is no possibility of random ordering here, since it's driven by a loop in a single process. Note that the first parent message actually appeared before any of the children messages - the parent was able to continue into the second loop before any of the child processes were able to start. This again is just the random behavior of the process scheduler - the "parent9" message could have appeared anywhere in the sequence prior to "parent8".
I see that you've tagged this question with "zombie-processes". Child0 thru Child8 spend one or more seconds in this state, between the time they exited and the time the parent did a waitpid() on them. The parent was already waiting on Child9 before it exited, so that one process spent essentially no time as a zombie.
This example code might be a bit more illuminating if there were two messages in each loop - one before and one after the sleep/waitpid. It would also be instructive to reverse the order of the second loop (or equivalently, to change the first loop to sleep(10-i)).

Providing Concurrency Between Pthreads

I am working on multithread programming and I am stuck on something.
In my program there are two tasks and two types of robots for carrying out the tasks:
Task 1 requires any two types of robot and
task 2 requires 2 robot1 type and 2 robot2 type.
Total number of robot1 and robot2 and pointers to these two types are given for initialization. Threads share these robots and robots are reserved until a thread is done with them.
The actual task is done in the doTask1(robot **) function, which takes a pointer to a robot pointer as its parameter, so I need to pass the robots that I reserved. I want to provide concurrency. Obviously, if I lock everything it will not be concurrent. robot1 is of type Robot **. Since it is used by all threads, another thread can overwrite robot1 before one thread calls doTask or finishes it, which changes things. I know this is because robot1 is shared by all threads. Could you explain how I can solve this problem? I don't want to pass any arguments to the thread start routine.
rsc is my struct to hold number of robots and pointers that are given in an initialization function.
void *task1(void *arg)
{
    int tid;
    tid = *((int *) arg);
    cout << "TASK 1 with thread id " << tid << endl;
    pthread_mutex_lock (&mutexUpdateRob);
    while (rsc->totalResources < 2)
    {
        pthread_cond_wait(&noResource, &mutexUpdateRob);
    }
    if (rsc->numOfRobotA > 0 && rsc->numOfRobotB > 0)
    {
        rsc->numOfRobotA--;
        rsc->numOfRobotB--;
        robot1[0] = &rsc->robotA[counterA];
        robot1[1] = &rsc->robotB[counterB];
        counterA++;
        counterB++;
        flag1 = true;
        rsc->totalResources -= 2;
    }
    pthread_mutex_unlock (&mutexUpdateRob);
    doTask1(robot1);
    pthread_mutex_lock (&mutexUpdateRob);
    if(flag1)
    {
        rsc->numOfRobotA++;
        rsc->numOfRobotB++;
        rsc->totalResources += 2;
    }
    if (totalResource >= 2)
    {
        pthread_cond_signal(&noResource);
    }
    pthread_mutex_unlock (&mutexUpdateRob);
    pthread_exit(NULL);
}
If robots are global resources, threads should not dispose of them. That should be the duty of the main thread's exit (or cleanup) function.
Also, there should be a way for threads to locate the robots unambiguously and to lock their use.
The robot1 array seems to store the robots, and it seems to be a global array. However:
its access was not protected by a mutex (pthread_mutex_t), though it seems that you've now taken care of that.
Also, the code in task1 is always modifying entries 0 and 1 of this array. If two threads or more execute that code, the entries will be overwritten. I don't think that it is what you want. How will that array be used afterwards?
In fact, why does this array need to be global?
The bottom line is this: as long as this array is shared by threads, they will have problems working concurrently. Think about it this way:
You have two companies using robots to work, but they're using the same truck (robot1) to move the robots around. How are these two companies supposed to function properly, and efficiently with only one truck?
