A Seemingly Simple Synchronization Problem
TL;DR
Several threads depend on each other. Whenever one of them finds some new information, all of them need to process that information. How can I determine that all threads are ready?
Background
I have (almost) parallelized a function Foo(input) that solves a problem which is known to be P-complete and may be thought of as a kind of search. Unsurprisingly, so far nobody has managed to successfully exploit parallelism beyond two threads for solving that problem. However, I had a promising idea and managed to fully implement it, except for this seemingly simple problem.
Details
Information is exchanged between the threads implicitly through a shared graph-like database g of type G, so that every thread sees all information immediately and the threads do not need to notify each other explicitly. More precisely, each time a thread finds a piece of information i, it calls a thread-safe function g.addInformation(i), which among other things places i at the end of an array. One aspect of my new implementation is that threads can use i during their search even before i has been enqueued at the end of that array. Nevertheless, each thread must additionally process i separately after it has been enqueued. Enqueueing i may happen after the thread that added i has returned from g.addInformation(i), because some other thread may take over the responsibility to enqueue i.
Each thread s calls a function s.processAllInformation() to process all information in that array in g, in order. A call to s.processAllInformation() is a no-op if the calling thread has already processed all information, or if there is no new information.
As soon as a thread has finished processing all information, it should wait for all other threads to finish, and it should resume work if any other thread finds a new piece of information i. That is, each time some thread calls g.addInformation(i), all threads that had finished processing all previously known information need to resume work and process that (and any other) newly added information.
My Problem
Every solution I could think of fails because of a variation of the same problem: One thread finishes processing all information and then sees that all other threads are ready, too. Hence, this thread leaves. But then another thread notices that some new information has been added, resumes work, and finds a new piece of information. That new information is then never processed by the thread that has already left.
A solution to this problem may be straightforward, but I cannot think of one. Ideally, a solution should not depend on time-consuming operations inside g.addInformation(i), because that function is expected to be called very often (1 or 2 million times per second, see below).
Even more background
In my initially sequential application, the function Foo(input) is called roughly 100k times a second on modern hardware, and my application spends 80% to 90% of its time executing Foo(input). All calls to Foo(input) depend on each other: we search for something in a very large space in an iterative manner. Solving a reasonably sized problem typically takes one or two hours with the sequential version of the application.
Each call to Foo(input) finds between zero and many hundreds of new pieces of information. On average, 1 or 2 million are found per second during the execution of my application, i.e. 10 to 20 per call to Foo(input). All of these statistics probably have a very high standard deviation (which I haven't measured yet, though).
Currently I am writing a prototype for the parallel version of Foo(input) in Go, so I prefer answers in Go. The sequential application is written in C (actually it's C++, but it's written like a C program), so answers in C or C++ (or pseudo-code) are no problem. I haven't benchmarked my prototype yet, since wrong code is infinitely slower than slow code.
Code
These code examples are meant to clarify the situation. Since I haven't solved the problem, feel free to consider any changes to the code. (I appreciate unrelated helpful remarks, too.)
Global situation
We have some type G, and Foo() is a method of G. When g.Foo(input) is called on an object g of type G, g creates workers s[1], ..., s[g.numThreads], each of which obtains a pointer to g so that it has access to the member variables of g and can call g.addInformation(i) whenever it finds new information. Then the method FooInParallel() is called for each worker s[j] in parallel.
type G struct {
    s          []worker
    numThreads int
    // some data that the workers need access to
}
func (g *G) initializeWith(input InputType) {
    // Some code...
}
func (g *G) Foo(input InputType) int {
    // Initialize data structures:
    g.initializeWith(input)
    // Initialize workers (plain assignment: g.s is a field, so := is not allowed):
    g.s = make([]worker, g.numThreads)
    for j := range g.s {
        g.s[j] = newWorker(g) // workers get a pointer to g
    }
    // Note: This wait group doesn't solve the problem. See remark below.
    wg := new(sync.WaitGroup)
    wg.Add(g.numThreads)
    // Actual computation in parallel:
    for j := 0; j < g.numThreads-1; j++ {
        // Start g.numThreads - 1 goroutines in parallel
        go g.s[j].FooInParallel(wg)
    }
    // The last worker runs on this goroutine, such that we have
    // g.numThreads goroutines in total.
    g.s[g.numThreads-1].FooInParallel(wg)
    wg.Wait()
}
// This function is thread-safe in so far as several
// workers can concurrently add information.
//
// The function is optimized for heavy contention; most
// threads can leave almost immediately. One thread
// cleans up any mess they leave behind (and even in
// bad cases that is not too much).
func (g *G) addInformation(i infoType) {
// Step 1: Make information available to all threads.
// Step 2: Enqueue information at the end of some array.
// Step 3: Possibly, call g.notifyAll()
}
// If new information has been added, we must ensure
// that every thread that had finished resumes work
// and processes any newly added information.
func (g *G) notifyAll() {
    // TODO:
    // This is what I fail to accomplish. I include
    // my most successful attempt in the corresponding
    // section. It doesn't work, though.
}
// If a thread has finished processing all information,
// it must ensure that all threads have finished and
// that no new information has been added since.
func (g *G) allThreadsReady() bool {
    // TODO:
    // This is what I fail to accomplish. I include
    // my most successful attempt in the corresponding
    // section. It doesn't work, though.
}
Remark: The only purpose of the wait group is to ensure Foo(input) is not called again before the last worker has returned. However, you can completely ignore this.
Local Situation
Each worker contains a pointer to the global data structure and searches for either a treasure or new information until it has processed all information that has been enqueued by this or other threads. If it finds a new piece of information i, it calls g.addInformation(i) and continues its search. If it finds a treasure, it sends the treasure via a channel that it has obtained as an argument and returns. If all threads have finished processing all information, each of them can send a dummy treasure to the channel and return. However, determining whether all threads are ready is exactly my problem.
type worker struct {
// Each worker contains a pointer to g
// such that it has access to its member
// variables and is able to call the
// function g.addInformation(i) as soon
// as it finds some information i.
g *G
// Also contains some other stuff.
}
func (s *worker) FooInParallel(wg *sync.WaitGroup) {
defer wg.Done()
for {
a := s.processAllInformation()
// The following is the problem. Feel free to make any
// changes to the following block.
s.notifyAll()
for !s.needsToResumeWork() {
if s.allThreadsReady() {
return
}
}
}
}
func (s *worker) notifyAll() {
    // TODO:
    // This is what I fail to accomplish. I include
    // my most successful attempt in the corresponding
    // section. It doesn't work, though.
    // An example:
    // Step 1: Possibly, do something else first.
    // Step 2: Call g.notifyAll()
}
func (s *worker) needsToResumeWork() bool {
    // TODO:
    // This is what I fail to accomplish. I include
    // my most successful attempt in the corresponding
    // section. It doesn't work, though.
}
func (s *worker) allThreadsReady() bool {
    // TODO:
    // This is what I fail to accomplish. I include
    // my most successful attempt in the corresponding
    // section. It doesn't work, though.
    // If all threads are ready, return true.
    // Otherwise, return false.
    // Alternatively, spin as long as no new information
    // has been added, and return false as soon as some
    // new information has been added, or true if no new
    // information has been added and all other threads
    // are ready.
    //
    // However, this doesn't really matter, because a
    // call to processAllInformation is cheap
    // if no new information is available.
}
// A call to this function is cheap if no new work has
// been added since the last function call.
func (s *worker) processAllInformation() treasureType {
    // Access member variables of g and search
    // for information or treasures.
    // If new information i is found, call
    // g.addInformation(i).
    // Once all information that has been enqueued in
    // g has been processed by this thread, return.
}
My best attempt to solve the problem
Well, by now I am rather tired, so I might need to double-check my solution later. However, even my most successful attempt doesn't work. To give you an idea of what I have been trying so far (among many other things), I share it right away.
I tried the following. Each worker contains a member variable needsToResumeWorkFlag that is atomically set to one whenever new information has been added. Setting this member variable to one several times does no harm; it is only important that the thread resumes work after the last piece of information has been added.
To reduce the workload of a thread calling g.addInformation(i), the thread that enqueues the information (which is not necessarily the thread that called g.addInformation(i)) does not notify all threads individually. Instead, it afterwards sets a member variable notifyAllFlag of g to one, which indicates that all threads need to be notified about the latest information.
Whenever a thread that has finished processing all enqueued information calls g.notifyAll(), it checks whether notifyAllFlag is set to one. If so, it tries to atomically compare g.allInformedFlag with 1 and swap it with 0. If this fails, it assumes that some other thread has taken over the responsibility to inform all threads. If it succeeds, this thread has taken over that responsibility and proceeds by setting the member variable needsToResumeWorkFlag to one for every thread. Afterwards it atomically sets g.numThreadsReady and g.notifyAllFlag to zero, and g.allInformedFlag to 1.
type G struct {
    s               []worker // the workers; notifyAll() iterates over them
    numThreads      int
    numThreadsReady *uint32 // initialize to 0 somewhere appropriate
    notifyAllFlag   *uint32 // initialize to 0 somewhere appropriate
    allInformedFlag *uint32 // initialize to 1 somewhere appropriate (1 is not a typo)
    // some data that the workers need access to
}
// This function is thread-safe in so far as several
// workers can concurrently add information.
//
// The function is optimized for heavy contention; most
// threads can leave almost immediately. One thread
// cleans up any mess they leave behind (and even in
// bad cases that is not too much).
func (g *G) addInformation(i infoType) {
// Step 1: Make information available to all threads.
// Step 2: Enqueue information at the end of some array.
// Since the responsibility to enqueue an information may
// be passed to another thread, it is important that the
// last step is executed by the thread which enqueues the
// information(s) in order to ensure, that the information
// successfully has been enqueued.
// Step 3:
atomic.StoreUint32(g.notifyAllFlag,1) // all threads need to be notified
}
// If new information has been added, we must ensure
// that every thread that had finished resumes work
// and processes any newly added information.
func (g *G) notifyAll() {
    if atomic.LoadUint32(g.notifyAllFlag) == 1 {
        // Somebody needs to notify all threads.
        if atomic.CompareAndSwapUint32(g.allInformedFlag, 1, 0) {
            // This thread has taken over the responsibility to inform
            // all other threads. All threads are kept from accessing
            // their member variable s.needsToResumeWorkFlag
            for j := range g.s {
                atomic.StoreUint32(g.s[j].needsToResumeWorkFlag, 1)
            }
            atomic.StoreUint32(g.notifyAllFlag, 0)
            atomic.StoreUint32(g.numThreadsReady, 0)
            atomic.StoreUint32(g.allInformedFlag, 1)
        } else {
            // Some other thread has taken responsibility to inform
            // all threads.
        }
    }
}
Whenever a thread finishes processing all enqueued information, it checks whether it needs to resume work by atomically comparing its member variable needsToResumeWorkFlag with 1 and swapping it with 0. However, since one of the threads is responsible for notifying all others, it cannot do so immediately.
First, it must call g.notifyAll(), and then it must check whether the latest thread to call g.notifyAll() has finished notifying all threads. Hence, after calling g.notifyAll() it must spin until g.allInformedFlag is one before it checks whether its member variable s.needsToResumeWorkFlag is one, in which case it atomically sets it to zero and resumes work. (I guess there is a mistake here, but I also tried several other things without success.) If s.needsToResumeWorkFlag is already zero, it atomically increments g.numThreadsReady by one, if it hasn't done so before. (Recall that g.numThreadsReady is reset during a call to g.notifyAll().) Then it atomically checks whether g.numThreadsReady is equal to g.numThreads, in which case it can leave (after sending a dummy treasure to the channel). Otherwise it starts all over again until either this thread has been notified (possibly by itself) or all threads are ready.
type worker struct {
// Each worker contains a pointer to g
// such that it has access to its member
// variables and is able to call the
// function g.addInformation(i) as soon
// as it finds some information i.
g *G
// If new work has been added, the thread
// is notified by setting the uint32
// at which needsToResumeWorkFlag points to 1.
needsToResumeWorkFlag *uint32 // initialize to 0 somewhere appropriate
// Also contains some other stuff.
}
func (s *worker) FooInParallel(wg *sync.WaitGroup) {
    defer wg.Done()
    for {
        a := s.processAllInformation()
        numReadyIncremented := false
        for !s.needsToResumeWork() {
            if !numReadyIncremented {
                atomic.AddUint32(s.g.numThreadsReady, 1)
                numReadyIncremented = true
            }
            if s.g.allThreadsReady() {
                return
            }
        }
    }
}
func (s *worker) needsToResumeWork() bool {
    s.notifyAll()
    for {
        // Spin until the thread that is currently notifying has finished.
        if atomic.LoadUint32(s.g.allInformedFlag) == 1 {
            return atomic.CompareAndSwapUint32(s.needsToResumeWorkFlag, 1, 0)
        }
    }
}
func (s *worker) notifyAll() {
    s.g.notifyAll()
}
func (g *G) allThreadsReady() bool {
    // numThreads is an int, so convert it before comparing with the uint32 counter.
    return atomic.LoadUint32(g.numThreadsReady) == uint32(g.numThreads)
}
As mentioned my solution doesn't work.
I found a solution myself. We exploit the fact that a call to s.processAllInformation() does nothing (and is cheap) if no new information has been added. The trick is to use an atomic variable as a lock that guards both steps for each thread: notifying all threads if necessary, and checking whether the thread itself has been notified. If the lock cannot be acquired, the thread simply calls s.processAllInformation() again. A thread then uses the notifications to decide whether it has to increment the counter of ready threads, rather than to decide whether it needs to resume work.
Global situation
type G struct {
    s               []worker // the workers; notifyAll() iterates over them
    numThreads      int
    numThreadsReady *uint32 // initialize to 0 somewhere appropriate
    notifyAllFlag   *uint32 // initialize to 0 somewhere appropriate
    allCanGoFlag    *uint32 // initialize to 0 somewhere appropriate
    lock            *uint32 // initialize to 0 somewhere appropriate
    // some data that the workers need access to
}
// This function is thread-safe in so far as several
// workers can concurrently add information.
//
// The function is optimized for heavy contention; most
// threads can leave almost immediately. One thread
// cleans up any mess they leave behind (and even in
// bad cases that is not too much).
func (g *G) addInformation(i infoType) {
// Step 1: Make information available to all threads.
// Step 2: Enqueue information at the end of some array.
// Since the responsibility to enqueue an information may
// be passed to another thread, it is important that the
// last step is executed by the thread which enqueues the
// information(s) in order to ensure, that the information
// successfully has been enqueued.
// Step 3:
atomic.StoreUint32(g.notifyAllFlag,1) // all threads need to be notified
}
// If new information has been added, we must ensure
// that every thread that had finished resumes work
// and processes any newly added information.
//
// This function is not thread-safe. Make sure that
// several threads never call it concurrently unless
// the calls are guarded by some lock.
func (g *G) notifyAll() {
    if atomic.LoadUint32(g.notifyAllFlag) == 1 {
        for j := range g.s {
            atomic.StoreUint32(g.s[j].needsToResumeWorkFlag, 1)
        }
        atomic.StoreUint32(g.notifyAllFlag, 0)
        atomic.StoreUint32(g.numThreadsReady, 0)
    }
}
Local situation
type worker struct {
// Each worker contains a pointer to g
// such that it has access to its member
// variables and is able to call the
// function g.addInformation(i) as soon
// as it finds some information i.
g *G
// If new work has been added, the thread
// is notified by setting the uint32
// at which needsToResumeWorkFlag points to 1.
needsToResumeWorkFlag *uint32 // initialize to 0 somewhere appropriate
incrementedNumReadyFlag *uint32 // initialize to 0 somewhere appropriate
// Also contains some other stuff.
}
func (s *worker) FooInParallel(wg *sync.WaitGroup) {
    defer wg.Done()
    for {
        a := s.processAllInformation()
        if atomic.LoadUint32(s.g.allCanGoFlag) == 1 {
            return
        }
        if atomic.CompareAndSwapUint32(s.g.lock, 0, 1) { // If possible, lock.
            s.g.notifyAll() // It is essential that this is also guarded by the lock.
            if atomic.LoadUint32(s.needsToResumeWorkFlag) == 1 {
                atomic.StoreUint32(s.needsToResumeWorkFlag, 0)
                // Some new information was found, and this thread can't be sure
                // whether it has already processed it. Since the counter for
                // how many threads are ready has been reset, we must increment
                // that counter after the next call to processAllInformation() in
                // the following iteration.
                atomic.StoreUint32(s.incrementedNumReadyFlag, 0)
            } else {
                // Increment the number of ready threads by one, if this thread has
                // not done so before (since the last newly found information).
                if atomic.CompareAndSwapUint32(s.incrementedNumReadyFlag, 0, 1) {
                    atomic.AddUint32(s.g.numThreadsReady, 1)
                }
                // If all threads are ready, give them all a signal.
                if atomic.LoadUint32(s.g.numThreadsReady) == uint32(s.g.numThreads) {
                    atomic.StoreUint32(s.g.allCanGoFlag, 1)
                }
            }
            atomic.StoreUint32(s.g.lock, 0) // Unlock.
        }
    }
}
Later I may add some ordering for how the threads access the lock under heavy contention, but for now that'll do.
I need to pause the current thread in Rust and notify it from another thread. In Java I would write:
synchronized(myThread) {
myThread.wait();
}
and from the second thread (to resume main thread):
synchronized(myThread){
myThread.notify();
}
Is it possible to do the same in Rust?
Using a channel that sends type () is probably easiest:
use std::sync::mpsc::channel;
use std::thread;
fn main() {
    let (tx, rx) = channel();
    // Spawn your worker thread, giving it `tx` and whatever else it needs
    thread::spawn(move || {
        // Do whatever
        tx.send(()).expect("Could not send signal on channel.");
        // Continue
    });
    // Do whatever
    rx.recv().expect("Could not receive from channel.");
    // Continue working
}
The () type is used because it carries effectively zero information, which makes it pretty clear you're only using it as a signal. The fact that its size is zero means it's also potentially faster in some scenarios (but realistically probably not any faster than a normal machine-word write).
If you just need to know that a thread is done, you can keep the JoinHandle returned by thread::spawn and wait for the thread by calling join on it.
let handle = thread::spawn( ... ); // Returns a JoinHandle; dropping it detaches the thread instead of joining it
handle.join().expect("Could not join thread");
You can use std::thread::park() and std::thread::Thread::unpark() to achieve this.
In the thread you want to wait,
fn worker_thread() {
std::thread::park();
}
in the controlling thread, which has a thread handle already,
fn main_thread(worker_thread: std::thread::Thread) {
worker_thread.unpark();
}
Note that a parked thread can wake up spuriously, which means it can sometimes wake up without any other thread calling unpark on it. You should prepare for this situation in your code, or use something like std::sync::mpsc::channel, as suggested in the accepted answer.
There are multiple ways to achieve this in Rust.
The underlying model in Java is that each object contains both a mutex and a condition variable, if I remember correctly. So using a mutex and condition variable would work...
... however, I would personally switch to using a channel instead:
the "waiting" thread has the receiving end of the channel, and waits for it
the "notifying" thread has the sending end of the channel, and sends a message
It is easier to manipulate than a condition variable, notably because there is no risk of accidentally using a different mutex when locking the variable.
The std::sync::mpsc module has two kinds of channels (asynchronous and synchronous), depending on your needs. Here, the asynchronous one matches more closely: std::sync::mpsc::channel.
There is a monitor crate that provides this functionality by combining Mutex with Condvar in a convenience structure.
(Full disclosure: I am the author.)
Briefly, it can be used like this:
let mon = Arc::new(Monitor::new(false));
{
let mon = mon.clone();
let _ = thread::spawn(move || {
thread::sleep(Duration::from_millis(1000));
mon.with_lock(|mut done| { // done is a monitor::MonitorGuard<bool>
*done = true;
done.notify_one();
});
});
}
mon.with_lock(|mut done| {
while !*done {
done.wait();
}
println!("finished waiting");
});
Here, mon.with_lock(...) is semantically equivalent to Java's synchronized(mon) {...}.
Given the following C++11 code fragment:
#include <condition_variable>
#include <mutex>
std::mutex block;
long count;
std::condition_variable cv;
void await()
{
std::unique_lock<std::mutex> lk(block);
if (count > 0)
cv.wait(lk);
}
void countDown()
{
std::lock_guard<std::mutex> lk(block);
if (count > 0)
{
count--;
if (count==0) cv.notify_all();
}
}
In case it is not clear what I am trying to accomplish: I want calls to await() to pause the calling thread while count is greater than 0; if count has already been reduced to zero, the call should not pause at all. Other threads may call countDown(), which wakes all threads that had previously called await() once count reaches zero.
The above code seems to work in all cases that I've tried, but I have a nagging doubt about it. It seems to me that unexpected behavior is possible if the thread calling await() gets preempted immediately after its condition test has been evaluated and just before it is actually suspended by the cv.wait() call. If countDown() is called at that moment and count reaches 0, it notifies the condition variable, but the thread calling await() is not yet waiting on it. When that thread resumes, it reaches the cv.wait() call and waits indefinitely.
I actually haven't seen this happen yet in practice, but I would like to harden the code against the eventuality.
It is good that you are thinking about these possibilities. But in this case your code is correct and safe.
If await gets preempted immediately after its condition test has been evaluated and just before the thread is actually suspended by the cv.wait() call, and if the countDown function is getting called at this time, the latter thread will block while trying to obtain the block mutex until await actually calls cv.wait(lk).
The call to cv.wait(lk) implicitly releases the lock on block, and thus now another thread can obtain the lock on block in countDown(). And as long as a thread holds the lock on block in countDown() (even after cv.notify_all() is called), the await thread can not return from cv.wait(). The await thread implicitly blocks on trying to re-lock block during the return from cv.wait().
Update
I did make a rookie mistake while reviewing your code though <blush>.
cv.wait(lk) may return spuriously. That is, it may return even though it hasn't been notified. To guard against this you should place your wait under a while loop, instead of under an if:
void await()
{
std::unique_lock<std::mutex> lk(block);
while (count > 0)
cv.wait(lk);
}
Now if the wait returns spuriously, it re-checks the condition, and if still not satisfied, waits again.
I'm writing a program in which I need to make sure a particular function is not executed in more than one thread at a time.
Here is some simplified pseudocode that does exactly what my real program does.
mutex _enqueue_mutex;
mutex _action_mutex;
queue _queue;
bool _executing_queue;
// called in multiple threads, possibly simultaneously
do_action() {
_enqueue_mutex.lock()
object o;
_queue.enqueue(o);
_enqueue_mutex.unlock();
execute_queue();
}
execute_queue() {
if (!executing_queue) {
_executing_queue = true;
enqueue_mutex.lock();
bool is_empty = _queue.isEmpty();
_enqueue_mutex.lock();
while (!is_empty) {
_action_mutex.lock();
_enqueue_mutex.lock();
object o = _queue.dequeue();
is_empty = _queue.isEmpty();
_enqueue_mutex.unlock();
// callback is called when "o" is done being used by "do_stuff_to_object_with_callback" also, this function doesn't block, it is executed on its own thread (hence the need for the callback to know when it's done)
do_stuff_to_object_with_callback(o, &some_callback);
}
_executing_queue = false;
}
}
some_callback() {
_action_mutex.unlock();
}
Essentially, the idea is that _action_mutex is locked in the while loop (lock is assumed to block until the mutex can be acquired again) and is expected to be unlocked when the completion callback is called (some_callback in the above code).
This does not seem to be working, though. If do_action is called more than once at the same time, the program locks up. I think it might be related to the while loop executing more than once simultaneously, but I just can't see how that could be the case. Is there something wrong with my approach? Is there a better approach?
Thanks
A queue that is not specifically designed to be multithreaded (multi-producer multi-consumer) will need to serialize both enqueue and dequeue operations using the same mutex.
(If your queue implementation has a different assumption, please state it in your question.)
The check for _queue.isEmpty() will also need to be protected, if the dequeue operation is prone to the Time of check to time of use problem.
That is, the line
object o = _queue.dequeue();
needs to be surrounded by _enqueue_mutex.lock(); and _enqueue_mutex.unlock(); as well.
You probably only need a single mutex for the queue. Also once you've dequeued the object, you can probably process it outside of the lock. This will prevent calls to do_action() from hanging too long.
mutex moo;
queue qoo;
bool keepRunning = true;
do_action():
{
moo.lock();
qoo.enqueue(something);
moo.unlock(); // really need try-finally to make sure,
// but don't know which language we are using
}
process_queue():
{
    while(keepRunning)
    {
        object o = null;
        moo.lock();
        if(!qoo.isEmpty())
            o = qoo.dequeue();
        moo.unlock(); // again, try finally needed
        if(o != null)
            haveFunWith(o); // process outside of the lock
        sleep(50);
    }
}
Then call process_queue() on its own thread.
I'm trying to implement a sort of thread pool whereby I keep threads in a FIFO and process a bunch of images. Unfortunately, for some reason my cond_wait doesn't always wake even though it's been signaled.
// Initialize the thread pool
for(i=0;i<numThreads;i++)
{
pthread_t *tmpthread = (pthread_t *) malloc(sizeof(pthread_t));
struct Node* newNode;
newNode=(struct Node *) malloc(sizeof(struct Node));
newNode->Thread = tmpthread;
newNode->Id = i;
newNode->threadParams = 0;
pthread_cond_init(&(newNode->cond),NULL);
pthread_mutex_init(&(newNode->mutx),NULL);
pthread_create( tmpthread, NULL, someprocess, (void*) newNode);
push_back(newNode, &threadPool);
}
for() //stuff here
{
//...stuff
pthread_mutex_lock(&queueMutex);
struct Node *tmpNode = pop_front(&threadPool);
pthread_mutex_unlock(&queueMutex);
if(tmpNode != 0)
{
pthread_mutex_lock(&(tmpNode->mutx));
pthread_cond_signal(&(tmpNode->cond)); // Not starting mutex sometimes?
pthread_mutex_unlock(&(tmpNode->mutx));
}
//...stuff
}
destroy_threads=1;
//loop through and signal all the threads again so they can exit.
//pthread_join here
}
void *someprocess(void* threadarg)
{
do
{
//...stuff
pthread_mutex_lock(&(threadNode->mutx));
pthread_cond_wait(&(threadNode->cond), &(threadNode->mutx));
// Doesn't always seem to resume here after signalled.
pthread_mutex_unlock(&(threadNode->mutx));
} while(!destroy_threads);
pthread_exit(NULL);
}
Am I missing something? It works about half of the time, so I would assume that I have a race somewhere, but the only thing I can think of is that I'm screwing up the mutexes? I read something about not signalling before locking or something, but I don't really understand what's going on.
Any suggestions?
Thanks!
Firstly, your example shows you locking the queueMutex around the call to pop_front, but not round push_back. Typically you would need to lock round both, unless you can guarantee that all the pushes happen-before all the pops.
Secondly, your call to pthread_cond_wait doesn't seem to have an associated predicate. Typical usage of condition variables is:
pthread_mutex_lock(&mtx);
while(!ready)
{
pthread_cond_wait(&cond,&mtx);
}
do_stuff();
pthread_mutex_unlock(&mtx);
In this example, ready is some variable that is set by another thread whilst that thread holds a lock on mtx.
If the waiting thread is not blocked in the pthread_cond_wait when pthread_cond_signal is called then the signal will be ignored. The associated ready variable allows you to handle this scenario, and also allows you to handle so-called spurious wake-ups where the call to pthread_cond_wait returns without a corresponding call to pthread_cond_signal from another thread.
I'm not sure, but I think you don't have to (you must not) lock the mutex in the thread pool before calling pthread_cond_signal(&(tmpNode->cond)); , otherwise, the thread which is woken up won't be able to lock the mutex as part of pthread_cond_wait(&(threadNode->cond), &(threadNode->mutx)); operation.