Is there a race condition when manually put process into sleep - linux

When I read ldd3 chapter 6, I was confused by the codes which is shown below:
while (spacefree(dev) == 0) { /* full */ DEFINE_WAIT(wait);
up(&dev->sem);
if (filp->f_flags & O_NONBLOCK)
return -EAGAIN;
PDEBUG("\"%s\" writing: going to sleep\n",current->comm);
prepare_to_wait(&dev->outq, &wait, TASK_INTERRUPTIBLE);
if (spacefree(dev) == 0)
schedule( );
finish_wait(&dev->outq, &wait);
if (signal_pending(current))
return -ERESTARTSYS; /* signal: tell the fs layer to handle it */
if (down_interruptible(&dev->sem))
return -ERESTARTSYS;
}
Isn't there a race condition between the if (spacefree(dev) == 0) and schedule(). If the spacefree(dev) function returns non-zero when the if statement just finished, maybe the process will lost the only chance to be wake up. Can anyone tell me what is the mechanism behind these codes?
By the way, I found another code snippet in Linux Kernel Development which is similar to the one above. Some details, however, is different.
DEFINE_WAIT(wait);
add_wait_queue(q, &wait);
while (!condition) { /* condition is the event that we are waiting for */
prepare_to_wait(&q, &wait, TASK_INTERRUPTIBLE);
if (signal_pending(current))
/* handle signal */
schedule();
}
finish_wait(&q, &wait);
The very difference is the addition of add_wait_queue() function which doesn't appear in the first code snippet.
The second difference is that in the second snippet's while statement, there is no condition test before schedule(). Why?
The third difference is the position of signal_pending() function, one is before the schedule(), another is after.
Why there exists such differences?

From LDD3 chapter 6:
"By checking our condition after setting the process state, we are covered against all
possible sequences of events. If the condition we are waiting for had come about
before setting the process state, we notice in this check and not actually sleep. If the
wakeup happens thereafter, the process is made runnable whether or not we have
actually gone to sleep yet."

Related

How to resolve this mistake in Petersons algorithm for process synchronization

Information
I was reading the book of E. Tanenbaum about Modern operating systems and there was a code snippet that was introducing Petersons algorithm for process synchronization which is implemented with software.
Here's the snippet.
```
#define FALSE 0
#define TRUE 1
#define N 2 /* number of processes */
int turn; /* whose turn is it? */
int interested[N]; /* all values initially 0 (FALSE) */
void enter_region(int process) /* process is 0 or 1 */
{
int other; /* number of the other process */
other = 1 − process; /* the opposite of process */
interested[process] = TRUE; /* show that you are interested */
turn = process; /*set flag*/
while (turn == process && interested[other] == TRUE); /* null statement */
}
void leave_region(int process) { /* process: who is leaving */
interested[process] = FALSE; /* indicate departure from critical region */
}
```
The question is
Isn't there a mistake? [Edit] Must'nt it be turn = other or maybe there is another mistake.
This version of algorithm violates rules of mutual exclusion.
[Edit]
I think this version is violating the rules of mutual exclusion. As if first process sets the interested variable than stops and other process runs, second process can idle wait after setting his interested and turn variables without any need as there is no any process in critical section.
Any answer and help is appreciated. Thanks!
If process 0 sets interested[0], and then process 1 runs enter_region up until the loop, then process 0 will be able to exit the loop because turn == process is no longer true for it. turn, in this case, really means "turn to wait", and protects against exactly the situation you described. In contrast, if the code did turn = other then process 0 would not be able to exit the loop until process 1 started waiting.

Using fetch-and-add as lock

I am trying to understand how fetch-and-add can be used as a lock. Here is what the book (OS's: 3 Easy pieces) says:
The basic operation is pretty simple: when
a thread wishes to acquire a lock, it first does an atomic fetch-and-add
on the ticket value; that value is now considered this thread’s “turn”
(myturn). The globally shared lock->turn is then used to determine
which thread’s turn it is; when (myturn == turn) for a given thread,
it is that thread’s turn to enter the critical section.
What I do not understand is how the thread checks if the lock held by another process before entering the cretical seection. All I can read that the value will be incremented, no mention of checks!
Another part says:
Unlock is accomplished
simply by incrementing the turn such that the next waiting thread (if
there is one) can now enter the critical section.
Which I can not interpret in a way where checks will not be performed, which can not be true becuase it compremises the whole porpose of locking cretical sections. What am I fmissing here? Thanks.
What I do not understand is how the thread checks if the lock held by another process before entering the cretical seection.
You need an "atomic fetch" for this, maybe something like "while( atomic_fetch(currently_serving) != my_ticket) { /* wait */ }".
If you have "atomic fetch and add", then you can implement "atomic fetch" by doing "atomic fetch and add the value zero", maybe something like "while( atomic_fetch_and_add(currently_serving, 0) != my_ticket) { /* wait */ }".
For reference; the full sequence could be something like:
my_ticket = atomic_fetch_and_add(ticket_counter, 1);
while( atomic_fetch_and_add(currently_serving, 0) != my_ticket) {
/* wait */
}
/* Critical section (lock successfully acquired). */
atomic_fetch_and_add(currently_serving, 1); /* Release the lock */
Of course you might have a better atomic fetch you can use instead (e.g. for some CPUs any normal aligned load is atomic).

Multithreaded Environment - Signal Handling in c++ in unix-like environment (freeBSD and linux)

I wrote a network packet listener program and I have 2 threads. Both runs forever but one of them sleeps 30 sec other sleeps 90 sec. In main function, I use sigaction function and after installed signal handler, I created these 2 threads. After creation of threads, main function calls pcaploop function, which is infinite loop. Basic structure of my program:
(I use pseudo syntax)
signalHandler()
only sets a flag (exitState = true)
thread1()
{
while 1
{
sleep 30 sec
check exit state, if so exit(0);
do smth;
}
}
thread2()
{
while 1
{
sleep 90 sec
check exit state, if so exit(0);
do smth;
}
}
main()
{
necassary snytax for sigaction ;
sigaction( SIGINT, &act, NULL );
sigaction( SIGUSR1, &act, NULL );
create thread1;
create thread2;
pcaploop(..., processPacket,...); // infinite loop, calls callback function (processPacket) everytime a packet comes.
join threads;
return 0;
}
processPacket()
{
check exitState, if true exit(0);
do smth;
}
And here is my question. When I press CTRL-C program does not terminate. If the program run less than 6-7 hours, when I press CTRL-C, program terminates. If the program run 1 night, at least 10 hours or more, I cannot terminate the program. Actually, signal handler is not called.
What could be the problem? Which thread does catch the signal?
Basically it would be better to remove all pseudo code you put in your example, and leave the minimum working code, what exactly you have.
From what I can see so far from your example, is that the error handling of sigaction's is missing.
Try to perform checks against errors in your code.
I am writing this for those who had faced with this problem. My problem was about synchronization of threads. After i got handle synchronization problem, the program now, can handle the signals. My advice is check the synchronization again and make sure that it works correctly.
I am sorry for late answer.
Edited :
I have also used sigaction for signal handling
and I have change my global bool variable whit this definition :
static volatile sig_atomic_t exitFlag = 0;
This flag has been used for checking whether the signal received or not.

How do I draw a state diagram for a suspension-queue semaphore?

Here is the question:
Each process may be in different states and different events cause a process to transfer from one state to another; this can be represented using a state diagram. Use a state diagram to explain how a suspension-queue semaphore may be implemented. [10 marks]
Is my diagram correct, or have I misunderstood the question?
http://i.imgur.com/dC5RG6o.jpg
It is my understanding that suspended-queue semaphores maintain a list of blocked processes from which to (perhaps randomly) select a process to unblock when the current process has finished its critical section. Hence the waiting state in the state diagram.
pseudocode of suspended_queue_semaphore.
struct suspended_queue_semaphore
{
int count;
queueType queue;
};
void up(suspended_queue_semaphore s)
{
if (s.count == 0)
{
/* place this process in s.queue /*
/* block this process */
}
else
{
s.count = s.count - 1;
}
}
void down(suspended_queue_semaphore s)
{
if (s.queue is not empty)
{
/* remove a process from s.queue using FIFO */
/* unblock the process */
}
else
{
s.count = s.count + 1;
}
}
Is the state diagram for the process or the semaphore, and which semaphore are you talking about.
In the simplest semaphore: a binary semaphore (i.e. only one process can run) with operations wait() i.e. request to access shared resource and signal() i.e. finished accessing resource.
A state diagram for the process has only two states: Queued (Q) and Running (R) in addition to the Start and Terminate state.
The state diagram would be:
START = wait.CAN_RUN
CAN_RUN = suspend.QUEUED + run.RUNNING
QUEUED = run.RUNNING
RUNNING = signal.END
The semaphore has two states Empty and Full
A state diagram for the semaphore would be:
START = EMPTY
EMPTY = wait.RUN_PROCCESS + RUN_PROCESS
RUN_PROCESS = run.FULL
FULL = signal.EMPTY + wait.SUSPEND_PROCESS
SUSPEND_PROCESS = suspend.FULL
Edit: Fixed notation of state diagrams (was backwards sorry my process calculus is rusty) and added internal processes CAN_RUN, SUSPEND_PROCESS and RUN_PROCESS; and internal messages run and suspend.
Explanation:
The process calls the 'wait' method (up in your pseudo code) and goes to the CAN_RUN state, from there it can either start RUNNING or become QUEUED based on whether it gets a 'run' or 'suspend' message. If QUEUED it can start RUNNING when it receives a 'run' message. If RUNNING it uses 'signal' (down in your pseudo code) before finishing.
The semaphore starts EMPTY, if it gets a 'wait' it goes into RUN_PROCESS issues a 'run' message and becomes FULL. Once FULL any further 'wait' will send it to the SUSPEND_PROCESS state where it issues a 'suspend' to the process. When a 'signal' is received it goes back to EMPTY and it can remain there or go to RUN_PROCESS again based on whether the queue is empty or not (I did not model these internal states, nor did I model the queue as a system.)

Does the new thread exist, when pthread_create() returns?

My application creates several threads with pthread_create() and then tries to verify their presence with pthread_kill(threadId, 0). Every once in a while the pthread_kill fails with "No such process"...
Could it be, I'm calling pthread_kill too early after pthread_create? I thought, the threadId returned by pthread_create() is valid right away, but it seems to not always be the case...
I do check the return value of pthread_create() itself -- it is not failing... Here is the code-snippet:
if (pthread_create(&title->thread, NULL,
process_title, title)) {
ERR("Could not spawn thread for `%s': %m",
title->name);
continue;
}
if (pthread_kill(title->thread, 0)) {
ERR("Thread of %s could not be signaled.",
title->name);
continue;
}
And once in a while I get the message about a thread, that could not be signaled...
That's really an implementation issue. The thread may exist or it may still be in a state of initialisation where pthread_kill won't be valid yet.
If you really want to verify that the thread is up and running, put some form of inter-thread communication in the thread function itself, rather than relying on the underlying details.
This could be as simple as an array which the main thread initialises to something and the thread function sets it to something else as its first action. Something like (pseudo-code obviously):
array running[10]
def threadFn(index):
running[index] = stateBorn
while running[index] != stateDying:
weaveYourMagic()
running[index] = stateDead
exitThread()
def main():
for i = 1 to 10:
running[i] = statePrenatal
startThread (threadFn, i)
for i = 1 to 10:
while running[i] != stateBorn:
sleepABit()
// All threads are now running, do stuff until shutdown required.
for i = 1 to 10:
running[i] = stateDying
for i = 1 to 10:
while running[i] != stateDead:
sleepABit()
// All threads have now exited, though you could have
// also used pthread_join for that final loop if you
// had the thread IDs.
From that code above, you actually use the running state to control both when the main thread knows all other threads are doing something, and to shutdown threads as necessary.

Resources