Does the new thread exist, when pthread_create() returns? - linux

My application creates several threads with pthread_create() and then tries to verify their presence with pthread_kill(threadId, 0). Every once in a while the pthread_kill() call fails with "No such process"...
Could it be that I'm calling pthread_kill() too early after pthread_create()? I thought the threadId returned by pthread_create() was valid right away, but that does not always seem to be the case...
I do check the return value of pthread_create() itself, and it is not failing. Here is the code snippet:
if (pthread_create(&title->thread, NULL,
    process_title, title)) {
    ERR("Could not spawn thread for `%s': %m",
        title->name);
    continue;
}
if (pthread_kill(title->thread, 0)) {
    ERR("Thread of %s could not be signaled.",
        title->name);
    continue;
}
And once in a while I get the message about a thread, that could not be signaled...

That's really an implementation issue. The thread may exist or it may still be in a state of initialisation where pthread_kill won't be valid yet.
If you really want to verify that the thread is up and running, put some form of inter-thread communication in the thread function itself, rather than relying on the underlying details.
This could be as simple as an array which the main thread initialises to something and the thread function sets it to something else as its first action. Something like (pseudo-code obviously):
array running[10]
def threadFn(index):
    running[index] = stateBorn
    while running[index] != stateDying:
        weaveYourMagic()
    running[index] = stateDead
    exitThread()
def main():
    for i = 1 to 10:
        running[i] = statePrenatal
        startThread (threadFn, i)
    for i = 1 to 10:
        while running[i] != stateBorn:
            sleepABit()
    // All threads are now running, do stuff until shutdown required.
    for i = 1 to 10:
        running[i] = stateDying
    for i = 1 to 10:
        while running[i] != stateDead:
            sleepABit()
    // All threads have now exited, though you could have
    // also used pthread_join for that final loop if you
    // had the thread IDs.
From that code above, you actually use the running state to control both when the main thread knows all other threads are doing something, and to shutdown threads as necessary.

Related

How does pthread mutex unlock work? And do threads come up at the same time?

I wanna ask you some basic thing but it really bothers me a lot.
I'm currently studying 'pthread mutex' for system programming and as far as I know, when 'pthread_mutex_lock' is called only current thread is executed not any others. Can I think like this?
And when it comes to 'pthread_mutex_unlock', when this function is called, does the current thread pass the lock permission to others and wait until some other thread calls unlock function again? Or does every thread including current thread execute simultaneously until one of them calls lock function?
Here's the code I was studying:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
enum { STATE_A, STATE_B } state = STATE_A;
pthread_cond_t condA = PTHREAD_COND_INITIALIZER;
pthread_cond_t condB = PTHREAD_COND_INITIALIZER;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
void *threadA(void *arg)
{
    (void)arg; // unused; pthread_create() expects a void *(*)(void *) function
    printf("A start\n");
    int i = 0, loopNum;
    while (i < 3)
    {
        pthread_mutex_lock(&mutex);
        while (state != STATE_A)
        {
            printf("a\n");
            pthread_cond_wait(&condA, &mutex);
        }
        pthread_mutex_unlock(&mutex);
        pthread_cond_signal(&condB);
        for (loopNum = 1; loopNum <= 5; loopNum++)
        {
            printf("Hello %d\n", loopNum);
        }
        pthread_mutex_lock(&mutex);
        state = STATE_B;
        printf("STATE_B\n");
        pthread_cond_signal(&condB);
        pthread_mutex_unlock(&mutex);
        i++;
    }
    return NULL;
}
void *threadB(void *arg)
{
    (void)arg; // unused
    printf("B start\n");
    int n = 0;
    while (n < 3)
    {
        pthread_mutex_lock(&mutex);
        while (state != STATE_B)
        {
            printf("b\n");
            pthread_cond_wait(&condB, &mutex);
        }
        pthread_mutex_unlock(&mutex);
        printf("Goodbye\n");
        pthread_mutex_lock(&mutex);
        state = STATE_A;
        printf("STATE_A\n");
        pthread_cond_signal(&condA);
        pthread_mutex_unlock(&mutex);
        n++;
    }
    return NULL;
}
int main(int argc, char *argv[])
{
    pthread_t a, b;
    pthread_create(&a, NULL, threadA, NULL);
    pthread_create(&b, NULL, threadB, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}
I modified the original slightly to see what is going on in this code, for example by adding printf("A start\n"), printf("a\n") and so on.
And here are some outputs:
Output 1
B start
b
A start
Hello 1
Hello 2
Hello 3
Hello 4
Hello 5
b
STATE_B
a
Goodbye
STATE_A
b
Hello 1
Hello 2
Hello 3
Hello 4
Hello 5
b
STATE_B
a
Goodbye
STATE_A
b
Hello 1
Hello 2
Hello 3
Hello 4
Hello 5
b
STATE_B
Goodbye
STATE_A
Output 2
B start
b
A start
Hello 1
Hello 2
Hello 3
Hello 4
Hello 5
STATE_B
a
Goodbye
STATE_A
b
Hello 1
Hello 2
Hello 3
Hello 4
Hello 5
STATE_B
a
Goodbye
STATE_A
b
Hello 1
Hello 2
Hello 3
Hello 4
Hello 5
STATE_B
Goodbye
STATE_A
So I learned that when threads are called, they are called simultaneously. Based on this logic, I added the 'printf("A start\n")' and 'printf("B start\n")' in the beginning of the each thread function 'threadA() and threadB()'. But always 'printf("B start\n")' comes up first. If they are called at the same time, don't they have to come up alternatively, at least randomly?
Also after the first 'Hello' loop, I'm assuming 'Goodbye' message always should be earlier than 'a' since I guess the 'pthread_mutex_unlock' in ThreadA calls ThreadB and waits until ThreadB calls unlock function. I want to know how this code works.
I'm guessing I would be totally wrong and misunderstood a lot of parts since I'm a newbie in this field. But wanna get the answer. Thank you for reading this :)
"when 'pthread_mutex_lock' is called only current thread is executed not any others. Can I think like this?"
I guess you can think like that, but you'll be thinking incorrectly. pthread_mutex_lock() doesn't cause only the calling thread to execute. Rather, it does one of two things:
1) If the mutex wasn't already locked, it locks the mutex and returns immediately.
2) If the mutex was already locked, it puts the calling thread to sleep, to wait until the mutex has become unlocked. Only after it has successfully acquired the lock will pthread_mutex_lock() return.
Note that in both cases, the promise that pthread_mutex_lock() makes to the calling thread is this: when pthread_mutex_lock() returns zero/success, the mutex will be locked and the calling thread will be the owner of the lock. (The other possibility is that pthread_mutex_lock() will return a nonzero error number indicating an error condition, but that's uncommon in practice so I won't dwell on it.)
"when it comes to 'pthread_mutex_unlock', does the current thread pass the lock permission to others and wait until some other thread calls unlock function again?"
The first thing to clarify is that pthread_mutex_unlock() never waits for anything; unlike pthread_mutex_lock(), pthread_mutex_unlock() always returns immediately.
So what does pthread_mutex_unlock() do?
1) Unlocks the mutex (note that the mutex must have already been locked by a previous call to pthread_mutex_lock() in the same thread. If you call pthread_mutex_unlock() on a mutex without having previously called pthread_mutex_lock() to acquire that same mutex, then your program is buggy and won't work correctly).
2) Notifies the OS's thread-scheduler (through some mechanism that is deliberately left undocumented, since as a user of the pthreads library you don't need to know or care how it is implemented) that the mutex is now unlocked. Upon receiving that notification, the OS will check to see what other threads (if any) are blocked inside their own call to pthread_mutex_lock(), waiting to acquire this mutex, and if there are any, it will wake up one of those threads so that that thread may acquire the lock and its pthread_mutex_lock() call can then return. All that may happen before or after your thread's call to pthread_mutex_unlock() returns; the exact order of execution is indeterminate and doesn't really matter.
"I guess the 'pthread_mutex_unlock' in ThreadA calls ThreadB and waits until ThreadB calls unlock function."
pthread_mutex_unlock() does no such thing. In general, threads don't/can't call functions in other threads. For what pthread_mutex_unlock() does do, see my description above.
pthread_mutex_lock() doesn't mean only one thread will execute - it just means that any other thread that also tries to call pthread_mutex_lock() on the same mutex object will be suspended until the first thread releases the lock with pthread_mutex_unlock().
If the other threads aren't trying to lock the same mutex, they can continue running simultaneously.
If multiple threads have tried to lock the same mutex while it is locked by the first thread, then when the mutex is released by the first thread with pthread_mutex_unlock() only one of them will be able to proceed (and then when that thread itself calls pthread_mutex_unlock(), another waiting thread will be able to proceed and so on).
Note that a thread waiting for a mutex to be unlocked will not necessarily start executing immediately upon the mutex being unlocked.

wakeup/waiting race in a lock?

I am reading through the OSTEP book by Prof. Remzi:
http://pages.cs.wisc.edu/~remzi/OSTEP/
I only partially understand how the following code results in a wakeup/waiting race condition. The code is taken from this chapter of the book:
http://pages.cs.wisc.edu/~remzi/OSTEP/threads-locks.pdf
void lock(lock_t *m) {
    while (TestAndSet(&m->guard, 1) == 1)
        ; // acquire guard lock by spinning
    if (m->flag == 0) {
        m->flag = 1; // lock is acquired
        m->guard = 0;
    } else {
        queue_add(m->q, gettid());
        m->guard = 0;
        park();
    }
}
void unlock(lock_t *m) {
    while (TestAndSet(&m->guard, 1) == 1)
        ; // acquire guard lock by spinning
    if (queue_empty(m->q))
        m->flag = 0; // let go of lock; no one wants it
    else
        unpark(queue_remove(m->q)); // hold lock (for next thread!)
    m->guard = 0;
}
The park() system call puts the calling thread to sleep, and unpark(threadID) wakes the particular thread designated by threadID.
Suppose thread1 holds the lock, having set m->flag to 1. When thread2 tries to acquire the lock, it fails, so the else branch runs and thread2 is added to the queue. Now assume that thread2 is scheduled out just before it calls park(), and thread1 gets the timeslice. If thread1 releases the lock, unlock() calls unpark() because the queue is non-empty (thread2 is in it). But thread2 has not called park() yet; it has only been added to the queue.
So the questions are:
1) What does thread1's unpark() return? Just an error saying the threadID was not found? (OS-specific)
2) What happens to the lock flag? It was supposed to be passed between the subsequent threads that called the lock routine, and freed only when there is no more lock contention.
The book says thread2 will sleep forever. But my understanding is that any subsequent thread contending for the lock will also sleep forever (say thread3 tries to acquire the lock at a later time), because the lock is never freed by thread1 during the unlock call.
My understanding is most probably wrong, because the book was very specific in pointing out that thread2 sleeps forever. Or am I just reading too much into the example, my understanding is correct, and there is a deadlock?
I mailed this question to Prof. Remzi and got a reply from him! I am just posting the reply here.
Prof. Remzi's reply:
good questions!
I think you basically have it right.
unpark() will return (and perhaps say that the threadID was not sleeping);
in this implementation, the lock is left locked, and thread2 will sleep forever,
and as you say all subsequent threads trying to acquire the lock won't be
able to.
I think your understanding is correct. The unpark() call will still return, but it has no effect, because thread2 is not parked yet. When thread2 then calls park(), the wakeup has already been lost, so it sleeps forever, and the lock that was kept held for it is never freed. Subsequent threads such as thread3, ..., threadN will still be added to the queue and go to sleep. Also, thread2 has already been removed from the queue, so nothing will ever unpark it.

Safely close an indefinitely running thread

So first off, I realize that if my code was in a loop I could use a do while loop to check a variable set when I want the thread to close, but in this case that is not possible (so it seems):
DWORD WINAPI recv_thread(LPVOID random) {
    recv(ClientSocket, recvbuffer, recvbuflen, 0);
    return 1;
}
In the above, recv() is a blocking function.
(Please pardon me if the formatting isn't correct. It's the best I can do on my phone.)
How would I go about terminating this thread since it never closes but never loops?
Thanks,
~P
Amongst other solutions you can
a) set a timeout for the socket and handle timeouts correctly by checking the return values and/or errors in an appropriate loop:
setsockopt(ClientSocket,SOL_SOCKET,SO_RCVTIMEO,(char *)&timeout,sizeof(timeout))
b) close the socket from another thread, which makes the blocked recv(..) return with an error.
You can use poll() before recv() to check whether there is something to receive.
struct pollfd pfd; // not named 'poll', which would shadow the poll() function
int res;
pfd.fd = ClientSocket;
pfd.events = POLLIN;
res = poll(&pfd, 1, 1000); // 1000 ms timeout
if (res == 0)
{
    // timeout
}
else if (res == -1)
{
    // error
}
else
{
    // usually (pfd.revents & POLLIN) != 0, though POLLERR or POLLHUP may be set instead
    recv(ClientSocket, recvbuffer, recvbuflen, 0); // we can read ...
}
The way I handle this problem is to never block inside recv() -- preferably by setting the socket to non-blocking mode, but you may also be able to get away with simply only calling recv() when you know the socket currently has some bytes available to read.
That leads to the next question: if you don't block inside recv(), how do you prevent CPU-spinning? The answer to that question is to call select() (or poll()) with the correct arguments so that you'll block there until the socket has bytes ready to recv().
Then comes the third question: if your thread is now blocked (possibly forever) inside select(), aren't we back to the original problem again? Not quite, because now we can implement a variation of the self-pipe trick. In particular, because select() (or poll()) can 'watch' multiple sockets at the same time, we can tell the call to block until either of two sockets has data ready-to-read. Then, when we want to shut down the thread, all the main thread has to do is send a single byte of data to the second socket, and that will cause select() to return immediately. When the thread sees that it is this second socket that is ready-for-read, it should respond by exiting, so that the main thread's blocking call to WaitForSingleObject(theThreadHandle) will return, and then the main thread can clean up without any risk of race conditions.
The final question is: how to set up a socket-pair so that your main thread can call send() on one of the pair's sockets, and your recv-thread will see the sent data appear on the other socket? Under POSIX it's easy, there is a socketpair() function that does exactly that. Under Windows, socketpair() does not exist, but you can roll your own implementation of it as shown here.

Is there a timed signal similar to pthread_cond_timedwait?

I have created many threads, all waiting for their own condition. Each thread, when it runs, signals the next condition and goes back into the wait state.
However, I want the currently running thread to signal the next condition only after some specified (very short) period of time. How can I achieve that?
void *threadA(void *t)
{
    while (i < 100)
    {
        pthread_mutex_lock(&mutex1);
        while (state != my_id)
        {
            pthread_cond_wait(&cond[my_id], &mutex1);
        }
        pthread_mutex_unlock(&mutex1); // release here; the mutex is locked again below
        // processing + busy wait
        /* Set state to i+1 and wake up thread i+1 */
        pthread_mutex_lock(&mutex1);
        state = (my_id + 1) % NTHREADS; //usleep(1);
        // (Here I don't want this sleep. I want this thread to complete its processing and signal the next thread a bit later.)
        /*nanosleep(&zero, NULL);*/
        pthread_cond_signal(&cond[(my_id + 1) % NTHREADS]); // Send signal to Thread (i+1) to awake
        pthread_mutex_unlock(&mutex1);
        i++;
    }
    return NULL;
}
Signalling a condition does nothing if there is nothing waiting on the condition. So, if pthread 'x' signals condition 'cx' and then waits on it, it will wait for a very long time... unless some other thread also signals 'cx'!
I'm not really sure I understand what you mean by the pthread signalling its "next condition", but it occurs to me that there is not much difference between waiting to signal a waiting thread and the thread sleeping after it is signalled?

How do I draw a state diagram for a suspension-queue semaphore?

Here is the question:
Each process may be in different states and different events cause a process to transfer from one state to another; this can be represented using a state diagram. Use a state diagram to explain how a suspension-queue semaphore may be implemented. [10 marks]
Is my diagram correct, or have I misunderstood the question?
http://i.imgur.com/dC5RG6o.jpg
It is my understanding that suspended-queue semaphores maintain a list of blocked processes from which to (perhaps randomly) select a process to unblock when the current process has finished its critical section. Hence the waiting state in the state diagram.
pseudocode of suspended_queue_semaphore.
struct suspended_queue_semaphore
{
    int count;
    queueType queue;
};
void up(suspended_queue_semaphore s)
{
    if (s.count == 0)
    {
        /* place this process in s.queue */
        /* block this process */
    }
    else
    {
        s.count = s.count - 1;
    }
}
void down(suspended_queue_semaphore s)
{
    if (s.queue is not empty)
    {
        /* remove a process from s.queue using FIFO */
        /* unblock the process */
    }
    else
    {
        s.count = s.count + 1;
    }
}
Is the state diagram for the process or for the semaphore, and which kind of semaphore are you talking about?
Take the simplest semaphore: a binary semaphore (i.e. only one process can run), with the operations wait() (request access to the shared resource) and signal() (finished accessing the resource).
A state diagram for the process has only two states: Queued (Q) and Running (R) in addition to the Start and Terminate state.
The state diagram would be:
START = wait.CAN_RUN
CAN_RUN = suspend.QUEUED + run.RUNNING
QUEUED = run.RUNNING
RUNNING = signal.END
The semaphore has two states Empty and Full
A state diagram for the semaphore would be:
START = EMPTY
EMPTY = wait.RUN_PROCESS + RUN_PROCESS
RUN_PROCESS = run.FULL
FULL = signal.EMPTY + wait.SUSPEND_PROCESS
SUSPEND_PROCESS = suspend.FULL
Edit: Fixed notation of state diagrams (was backwards sorry my process calculus is rusty) and added internal processes CAN_RUN, SUSPEND_PROCESS and RUN_PROCESS; and internal messages run and suspend.
Explanation:
The process calls the 'wait' method (up in your pseudo code) and goes to the CAN_RUN state, from there it can either start RUNNING or become QUEUED based on whether it gets a 'run' or 'suspend' message. If QUEUED it can start RUNNING when it receives a 'run' message. If RUNNING it uses 'signal' (down in your pseudo code) before finishing.
The semaphore starts EMPTY, if it gets a 'wait' it goes into RUN_PROCESS issues a 'run' message and becomes FULL. Once FULL any further 'wait' will send it to the SUSPEND_PROCESS state where it issues a 'suspend' to the process. When a 'signal' is received it goes back to EMPTY and it can remain there or go to RUN_PROCESS again based on whether the queue is empty or not (I did not model these internal states, nor did I model the queue as a system.)

Resources