pthread Segmentation Faults - linux

I am doing an assignment using pthreads and mutual exclusion. I have to create n print servers and m print clients, each of which has 5 print jobs. We are to create the threads and pass the jobs through a queue of size 4 to the print servers, which then print the job (i.e. do busy work in this case). Here is the code for passing the jobs and servicing the jobs.
These are the client and server threads:
void *PrintClient(void *arg){
int i;
char str[NUMJOBSPERCLIENT][100];
for(i=1;i<NUMJOBSPERCLIENT;i++){
pthread_mutex_lock(&mutex);
req.clientID = pthread_self();
req.fileSize = rand_int(FILEMIN,FILEMAX);
sprintf(str[i], "File_%d_%d",pthread_self(),i);
req.fileName = str[i];
append(req);
pthread_mutex_unlock(&mutex);
sleep(rand_int(1,3));
}//for
pthread_exit(NULL);
} // end PrintClient
void *PrintServer(void *arg){
pthread_mutex_lock(&mutex);
pthread_cond_wait(&cond,&mutex);
while(count > 0){
take();
count = count -1;
}
pthread_mutex_unlock(&mutex);
pthread_exit(NULL);
} // end PrintServer
And this is the code which adds or removes a job from the queue. I know the error is here and that it has to do with the threads themselves, but I cannot find it for the life of me. So far the debugger has been almost no help (I am running on a university Linux server, which shows no compile errors).
void append(PrintRequest item){
BoundBuffer[count] = req;
printf("I am client %s\n",req.fileName);
count++;
if(count == BUFSIZE){
printf("Buffer Size Reached\n");
pthread_cond_signal(&cond);
}
} // end append
PrintRequest take(){
printf("Printing %s\n", BoundBuffer[count].fileName);
usleep(BoundBuffer[count].fileSize/PRINTSPEED);
printf("Finished Printing %s\n", BoundBuffer[count].fileName);
} // end take

I guess the segmentation fault is signaled around printf("Printing %s\n", BoundBuffer[count].fileName);, right?
In your PrintClient, you store the file name in the local variable str[][] and copy a pointer to that local variable into the request with req.fileName = str[i];. The address that req.fileName points to therefore lives on the stack of the client thread.
By the time the requests are processed in the server thread PrintServer, the client thread that generated a request may no longer exist. In that case req.fileName points to an address that is no longer valid (the stack memory was deallocated when the client thread exited), and dereferencing it in printf("Printing %s\n", BoundBuffer[count].fileName); raises a segmentation fault.

Related

MultiThreading and Uart write

Please see the EDIT part below first.
I use a BeagleBone Black with kernel 3.8 and the GCC compiler for a signal processing project.
I receive raw data from three GPS modules asynchronously over UART. I use three threads, here called the "reading threads", which continuously check three UARTs (BB-UART1, BB-UART2, BB-UART4) for raw data. After each data packet is received from a module, I decode it and extract the important data.
When decoding has finished in the reading threads, I perform the signal processing in a separate thread, called the "signal processing thread", using the three decoded data packets.
Obviously, I have to synchronize the reading threads with the signal processing thread. I use pthread_cond_wait and pthread_cond_signal for that.
The code operates fine, and synchronization and signal processing are performed efficiently. A data packet is received every 0.1 second (10 times per second).
In the signal processing thread, after the signal processing, I send the result to the user through a separate UART, BB-UART5.
When I add this part of the code (the "THIS" line), everything is OK for a while and the signal processing results are sent to the user, but after some time the signal processing thread freezes, blocked on a mutex. In fact, the mutex was not unlocked in the previous step.
I spent a lot of time, several weeks, trying to find the reason. When I remove the mutexes and the other thread synchronization tools (to make the code simpler to debug), an array somewhere overflows and its data is overwritten with the overflowed values. However, when I don't add the "THIS" line, the overflow never occurs.
When I remove the write call (to BB-UART5) from the signal processing thread, everything works.
The signal processing thread:
void *signal_processing_thread (void *arg){
int i, j;
char str[512];
printf("000000000000000000000000000000000000000000\r\n");
printf("0-signal_processing_thread is running!\r\n");
printf("000000000000000000000000000000000000000000\r\n");
while(1){
pthread_mutex_lock(&th1); // the code lock here after add "THIS" line
pthread_mutex_lock(&th2);
pthread_mutex_lock(&th3);
if ((!decode_completed[0])|(!decode_completed[1])|(!decode_completed[2])){
pthread_mutex_unlock(&th1);
pthread_mutex_unlock(&th2);
pthread_mutex_unlock(&th3);
continue;
}
// data packets are ready in reading threads
// signal processing start
// signal processing done
// send the results
sprintf (str,"some string\r\n\0",some variables);
printf (str);
for (i=0;i<256;i++)
if (str[i]==0)
break;
write (uart5_id, str, i); // "THIS" line
decode_completed [0] = 0;
decode_completed [1] = 0;
decode_completed [2] = 0;
pthread_cond_signal(&cv1);
pthread_mutex_unlock(&th1);
pthread_cond_signal(&cv2);
pthread_mutex_unlock(&th2);
pthread_cond_signal(&cv3);
pthread_mutex_unlock(&th3);
}
printf("signal processing thread is closed!\r\n");
}
The reading threads:
void *getdecodedata1_thread (void *arg){
int ret, count, count_decode=0;
char buffer[2024], buffer_decode[2024];
int i;
printf("1-getdecodedata_thread is running!\r\n");
count = 0;
pthread_mutex_lock(&th1);
while(1){
for (i=0;i<500;i++){
ret = read(uart1_id, buffer+count ,255);
if (ret<1)
continue;
// data received
count += ret;
if (count>1000) break;
}
if (count>0){ // packet received
for (i=0;i<count;i++)
buffer_decode1[count_decode3+i]=buffer[i];
count_decode1 += count;
if (count_decode1>15){
// decode
// ....
// decode done
}
count = 0;
}
if (decode is completed) {
decode_completed [0] = 1; // newdata
printf("WAIT_1\n");
pthread_cond_wait(&cv1,&th1);
pthread_mutex_unlock(&th1);
pthread_mutex_lock(&th1);
printf("RELEASE_1\n");
}
}
}
void *getdecodedata2_thread (void *arg){
int ret, count, count_decode=0;
char buffer[2024], buffer_decode[2024];
int i;
printf("2-getdecodedata_thread is running!\r\n");
count = 0;
pthread_mutex_lock(&th2);
while(1){
for (i=0;i<500;i++){
ret = read(uart2_id, buffer+count ,255);
if (ret<1)
continue;
// data received
count += ret;
if (count>1000) break;
}
if (count>0){ // packet received
for (i=0;i<count;i++)
buffer_decode3[count_decode2+i]=buffer[i];
count_decode2 += count;
if (count_decode2>15){
// decode
// ....
// decode done
}
count = 0;
}
if (decode is completed) {
decode_completed [1] = 1; // newdata
printf("WAIT_2\n");
pthread_cond_wait(&cv2,&th2);
pthread_mutex_unlock(&th2);
pthread_mutex_lock(&th2);
printf("RELEASE_2\n");
}
}
}
void *getdecodedata3_thread (void *arg){
int ret, count, count_decode=0;
char buffer[2024], buffer_decode[2024];
int i;
printf("3-getdecodedata_thread is running!\r\n");
count = 0;
pthread_mutex_lock(&th3);
while(1){
for (i=0;i<500;i++){
ret = read(uart4_id, buffer+count ,255);
if (ret<1)
continue;
// data received
count += ret;
if (count>1000) break;
}
if (count>0){ // packet received
for (i=0;i<count;i++)
buffer_decode3[count_decode3+i]=buffer[i];
count_decode3 += count;
if (count_decode3>15){
// decode
// ....
// decode done
}
count = 0;
}
if (decode is completed) {
decode_completed [2] = 1; // newdata
printf("WAIT_3\n");
pthread_cond_wait(&cv3,&th3);
pthread_mutex_unlock(&th3);
pthread_mutex_lock(&th3);
printf("RELEASE_3\n");
}
}
}
EDIT: I have found that the problem is elsewhere in the code. In one place I use the write function to send some bytes to UART5, and in a separate thread I simultaneously read UART5 (non-blocking) to receive commands. I think something like a segmentation fault occurs and causes the problem described above. When I comment out the read of UART5, everything works and the mutexes behave correctly. How can I use UART5 to read and write simultaneously?
After about 2 months, I found that the problem originates from an invalid event in an IC. I use a TTL to RS485 converter, and its supply bus has some noise and distortion, which causes invalid characters to be sent to the serial input. In fact, when I send characters on the serial output, I think the serial input probably receives some invalid data, which freezes the code in one of the critical-section mechanisms. When I disconnect the IC from the BBB, the problem is fixed.
I don't know why or how. How can invalid data on the serial input make a critical section lock up? I use the ttyO[] files to read and write the serial port.

Is poll/epoll event handling done in interrupt context?

This is probably a trivial question for some people, but somehow I'm not sure about it.
When waiting with poll for an event from the kernel, is the handling of the new event done in interrupt context?
If not, does that mean we can sleep/wait (using other calls) in the handler?
int main (void)
{
struct pollfd fds[2];
int ret;
fds[0].fd = FILENO;
fds[0].events = POLLIN;
fds[1].fd = FILENO;
fds[1].events = POLLOUT;
ret = poll(fds, 2, TIMEOUT * 1000);
if (ret == -1) {
perror ("poll");
return 1;
}
if (!ret) {
return 0;
}
if (fds[0].revents & POLLIN)
{
/********** HANDLING EVENTS HERE ***************/
printf ("FILENO is POLLIN\n");
}
if (fds[1].revents & POLLOUT)
{
/********** HANDLING EVENTS HERE ***************/
printf ("FILENO is POLLOUT\n");
}
return 0;
}
Thank you,
Ran
No (in general).
When you call poll(), your process enters the kernel and, if none of the file descriptors is ready, it is put to sleep while other processes (and kernel threads) run. Your process is switched back in at some point after at least one of your FDs becomes ready, and poll() then returns in ordinary process context, so the code that handles the event may sleep or block. In general (consider a pipe, for instance), interrupt context is not required for this, though note that some I/O does require interrupt context to happen (not directly connected to poll()).

Linux multi-thread, pausing one thread while continue running the other threads within the same process

I cannot find a proper solution to my problem.
If I have more than one thread in one process and I want only one thread to sleep while the other threads in the same process keep running, is there a predefined call for it, or do I have to write my own implementation (sleep)?
Ideally I want to send an indication from one thread to another thread when it is time to sleep.
Edited (2015-08-24)
I have two main threads: one sends data over a network, the other receives the data from the network. Besides jitter, the receiving thread does validation and verification and some file management, which over time can cause it to fall behind. What I would like to do is add something like a micro sleep to the sender so that the receiver can catch up. sched_yield() will not help in this case because the hardware has a multi-core CPU with more than 40 cores.
From your description in the comments, it looks like you're trying to synchronize 2 threads so that one of them doesn't fall behind too far from the other.
If that's the case, you're going about this the wrong way. It is seldom a good idea to do synchronization by sleeping, because the scheduler may incur unpredictable and long delays that cause the other (slow) thread to remain stopped in the run queue without being scheduled. Even if it works most of the time, it's still a race condition, and it's an ugly hack.
Given your use case and constraints, I think you'd be better off using barriers (see pthread_barrier_init(3)). Pthread barriers allow you to create a rendezvous point in the code where threads can catch up.
You call pthread_barrier_init(3) as part of the initialization code, specifying the number of threads that will be synchronized using that barrier. In this case, it's 2.
Then, threads synchronize with others by calling pthread_barrier_wait(3). The call blocks until the number of threads specified in pthread_barrier_init(3) call pthread_barrier_wait(3), at which point every thread that was blocked in pthread_barrier_wait(3) becomes runnable and the cycle begins again. Essentially, barriers create a synchronization point where no one can move forward until everyone arrives. I think this is exactly what you're looking for.
Here's an example that simulates a fast sender thread and a slow receiver thread. They both synchronize with barriers to ensure that the sender does not do any work while the receiver is still processing other requests. The threads synchronize at the end of their work unit, but of course, you can choose where each thread calls pthread_barrier_wait(3), thereby controlling exactly when (and where) threads synchronize.
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
pthread_barrier_t barrier;
void *sender_thr(void *arg) {
printf("Entered sender thread\n");
int i;
for (i = 0; i < 10; i++) {
/* Simulate some work (500 ms) */
if (usleep(500000) < 0) {
perror("usleep(3) error");
}
printf("Sender thread synchronizing.\n");
/* Wait for receiver to catch up */
int barrier_res = pthread_barrier_wait(&barrier);
if (barrier_res == PTHREAD_BARRIER_SERIAL_THREAD)
printf("Sender thread was last.\n");
else if (barrier_res == 0)
printf("Sender thread was first.\n");
else
fprintf(stderr, "pthread_barrier_wait(3) error on sender: %s\n", strerror(barrier_res));
}
return NULL;
}
void *receiver_thr(void *arg) {
printf("Entered receiver thread\n");
int i;
for (i = 0; i < 10; i++) {
/* Simulate a lot of work */
if (usleep(2000000) < 0) {
perror("usleep(3) error");
}
printf("Receiver thread synchronizing.\n");
/* Catch up with sender */
int barrier_res = pthread_barrier_wait(&barrier);
if (barrier_res == PTHREAD_BARRIER_SERIAL_THREAD)
printf("Receiver thread was last.\n");
else if (barrier_res == 0)
printf("Receiver thread was first.\n");
else
fprintf(stderr, "pthread_barrier_wait(3) error on receiver: %s\n", strerror(barrier_res));
}
return NULL;
}
int main(void) {
int barrier_res;
if ((barrier_res = pthread_barrier_init(&barrier, NULL, 2)) != 0) {
fprintf(stderr, "pthread_barrier_init(3) error: %s\n", strerror(barrier_res));
exit(EXIT_FAILURE);
}
pthread_t threads[2];
int thread_res;
if ((thread_res = pthread_create(&threads[0], NULL, sender_thr, NULL)) != 0) {
fprintf(stderr, "pthread_create(3) error on sender thread: %s\n", strerror(thread_res));
exit(EXIT_FAILURE);
}
if ((thread_res = pthread_create(&threads[1], NULL, receiver_thr, NULL)) != 0) {
fprintf(stderr, "pthread_create(3) error on receiver thread: %s\n", strerror(thread_res));
exit(EXIT_FAILURE);
}
/* Do some work... */
if ((thread_res = pthread_join(threads[0], NULL)) != 0) {
fprintf(stderr, "pthread_join(3) error on sender thread: %s\n", strerror(thread_res));
exit(EXIT_FAILURE);
}
if ((thread_res = pthread_join(threads[1], NULL)) != 0) {
fprintf(stderr, "pthread_join(3) error on receiver thread: %s\n", strerror(thread_res));
exit(EXIT_FAILURE);
}
if ((barrier_res = pthread_barrier_destroy(&barrier)) != 0) {
fprintf(stderr, "pthread_barrier_destroy(3) error: %s\n", strerror(barrier_res));
exit(EXIT_FAILURE);
}
return 0;
}
Note that, as specified in the manpage for pthread_barrier_wait(3), once the required number of threads have called pthread_barrier_wait(3), the barrier is reset to the state it had after the most recent call to pthread_barrier_init(3). In other words, the barrier atomically releases the waiting threads and resets itself, so it is always ready for the next synchronization point, which is wonderful.
Once you're done with the barrier, don't forget to free the associated resources with pthread_barrier_destroy(3).

How to send signal/data from a worker thread to main thread?

I'll preface this by saying that I'm delving into multithreading for the first time. Despite a lot of reading on concurrency and synchronization, I'm not readily seeing a solution for the requirements I've been given.
Using C++11 and Boost, I'm trying to figure out how to send data from a worker thread to a main thread. The worker thread is spawned at the start of the application and continuously monitors a lock free queue. Objects populate this queue at various intervals. This part is working.
Once the data is available, it needs to be processed by the main thread since another signal will be sent to the rest of the application which cannot be on a worker thread. This is what I'm having trouble with.
If I have to block the main thread through a mutex or a condition variable until the worker thread is done, how will that improve responsiveness? I might as well just stay with a single thread so I have access to the data. I must be missing something here.
I have posted a couple questions, thinking that Boost::Asio was the way to go. There is an example of how signals and data can be sent between threads, but as the responses indicate, things get quickly overly-complicated and it's not working perfectly:
How to connect signal to boost::asio::io_service when posting work on different thread?
Boost::Asio with Main/Workers threads - Can I start event loop before posting work?
After speaking with some colleagues, it was suggested that two queues be used -- one input, one output. This would be in shared space and the output queue would be populated by the worker thread. The worker thread is always going but there would need to be a Timer, probably at the application level, that would force the main thread to examine the output queue to see if there were any pending tasks.
Any ideas on where I should direct my attention? Are there any techniques or strategies that might work for what I'm trying to do? I'll be looking at Timers next.
Thanks.
Edit: This is production code for a plugin system that post-processes simulation results. We are using C++11 first wherever possible, followed by Boost. We are using Boost's lockfree::queue. The application is doing what we want on a single thread but now we are trying to optimize where we see that there are performance issues (in this case, a calculation happening through another library). The main thread has a lot of responsibilities, including database access, which is why I want to limit what the worker thread actually does.
Update: I have already been successful in using std::thread to launch a worker thread that examines a Boost lockfree queue and processes tasks placed in it. It's step 5 in @Pressacco's response that I'm having trouble with. Are there any examples of returning a value to the main thread when a worker thread is finished and informing the main thread, rather than simply waiting for the worker to finish?
If your objective is to develop the solution from scratch (using native threads, queues, etc.):
1. create a thread-safe queue (Mutex/CriticalSection around add/remove)
2. create a counting semaphore that is associated with the queue
3. have one or more worker threads wait on the counting semaphore (i.e. the thread will block)
   - the semaphore is more efficient than having the thread constantly poll the queue
4. as messages/jobs are added to the queue, increment the semaphore
   - a thread will wake up
   - the thread should remove one message
5. if a result needs to be returned...
   - set up another: Queue+Semaphore+WorkerThreads
ADDITIONAL NOTES
If you decide to implement a thread safe queue from scratch, take a look at:
Synchronization between threads using Critical Section
With that said, I would take another look at BOOST. I haven't used the library, but from what I hear it will most likely contain some relevant data structures (e.g. a thread safe queue).
My favorite quote from the MSDN:
"When you use multithreading of any sort, you potentially expose
yourself to very serious and complex bugs"
SIDEBAR
Since you are looking at concurrent programming for the first time, you may wish to consider:
Is your objective to build production-worthy code, or is this simply a learning exercise?
production? consider using existing proven libraries
learning? consider writing the code from scratch
Consider using a thread pool with an asynchronous callback instead of native threads.
more threads != better
Are threads really needed?
Follow the KISS principle.
The feedback above led me in the right direction for what I needed. The solution was definitely simpler than having to use signals/slots or Boost::Asio as I had previously attempted. I have two lock-free queues, one for input (on a worker thread) and one for output (on the main thread, populated by the worker thread). I use a timer to schedule when the output queue is processed. The code is below; perhaps it is of use to somebody:
//Task.h
#include <iostream>
#include <thread>
class Task
{
public:
Task(bool shutdown = false) : _shutdown(shutdown) {};
virtual ~Task() {};
bool IsShutdownRequest() { return _shutdown; }
virtual int Execute() = 0;
private:
bool _shutdown;
};
class ShutdownTask : public Task
{
public:
ShutdownTask() : Task(true) {}
virtual int Execute() { return -1; }
};
class TimeSeriesTask : public Task
{
public:
TimeSeriesTask(int value) : _value(value) {};
virtual int Execute()
{
std::cout << "Calculating on thread " << std::this_thread::get_id() << std::endl;
return _value * 2;
}
private:
int _value;
};
// Main.cpp : Defines the entry point for the console application.
#include "stdafx.h"
#include "afxwin.h"
#include <memory> // std::shared_ptr, std::make_shared
#include <boost/lockfree/spsc_queue.hpp>
#include "Task.h"
static UINT_PTR ProcessDataCheckTimerID = 0;
static const int ProcessDataCheckPeriodInMilliseconds = 100;
class Manager
{
public:
Manager()
{
//Worker Thread with application lifetime that processes a lock free queue
_workerThread = std::thread(&Manager::ProcessInputData, this);
};
virtual ~Manager()
{
_workerThread.join();
};
void QueueData(int x)
{
if (x > 0)
{
_inputQueue.push(std::make_shared<TimeSeriesTask>(x));
}
else
{
_inputQueue.push(std::make_shared<ShutdownTask>());
}
}
void ProcessOutputData()
{
//process output data on the Main Thread
_outputQueue.consume_one([&](int value)
{
if (value < 0)
{
PostQuitMessage(WM_QUIT);
}
else
{
int result = value - 1;
std::cout << "Final result is " << result << " on thread " << std::this_thread::get_id() << std::endl;
}
});
}
private:
void ProcessInputData()
{
bool shutdown = false;
//Worker Thread processes input data indefinitely
do
{
_inputQueue.consume_one([&](std::shared_ptr<Task> task)
{
std::cout << "Getting element from input queue on thread " << std::this_thread::get_id() << std::endl;
if (task->IsShutdownRequest()) { shutdown = true; }
int result = task->Execute();
_outputQueue.push(result);
});
} while (shutdown == false);
}
std::thread _workerThread;
boost::lockfree::spsc_queue<std::shared_ptr<Task>, boost::lockfree::capacity<1024>> _inputQueue;
boost::lockfree::spsc_queue<int, boost::lockfree::capacity<1024>> _outputQueue;
};
std::shared_ptr<Manager> g_pMgr;
//timer to force Main Thread to process Manager's output queue
void CALLBACK TimerCallback(HWND hWnd, UINT nMsg, UINT_PTR nIDEvent, DWORD dwTime)
{
if (nIDEvent == ProcessDataCheckTimerID)
{
//kill the timer by its ID (not by its period)
KillTimer(NULL, ProcessDataCheckTimerID);
ProcessDataCheckTimerID = 0;
//call function to process data
g_pMgr->ProcessOutputData();
//reset timer
ProcessDataCheckTimerID = SetTimer(NULL, ProcessDataCheckTimerID, ProcessDataCheckPeriodInMilliseconds, (TIMERPROC)&TimerCallback);
}
}
int main()
{
std::cout << "Main thread is " << std::this_thread::get_id() << std::endl;
g_pMgr = std::make_shared<Manager>();
ProcessDataCheckTimerID = SetTimer(NULL, ProcessDataCheckTimerID, ProcessDataCheckPeriodInMilliseconds, (TIMERPROC)&TimerCallback);
//queue up some dummy data
for (int i = 1; i <= 10; i++)
{
g_pMgr->QueueData(i);
}
//queue a shutdown request
g_pMgr->QueueData(-1);
//fake the application's message loop
MSG msg;
bool shutdown = false;
while (shutdown == false)
{
if (GetMessage(&msg, NULL, 0, 0))
{
TranslateMessage(&msg);
DispatchMessage(&msg);
}
else
{
shutdown = true;
}
}
return 0;
}

pthread_cond_broadcast problem

Using pthreads in Linux 2.6.30, I am trying to send a single signal which will cause multiple threads to begin execution. The broadcast seems to be received by only one thread. I have tried both pthread_cond_signal and pthread_cond_broadcast, and both seem to have the same behavior. For the mutex in pthread_cond_wait, I have tried both common mutexes and separate (local) mutexes with no apparent difference.
worker_thread(void *p)
{
// setup stuff here
printf("Thread %d ready for action \n", p->thread_no);
pthread_cond_wait(p->cond_var, p->mutex);
printf("Thread %d off to work \n", p->thread_no);
// work stuff
}
dispatch_thread(void *p)
{
// setup stuff
printf("Wakeup, everyone ");
pthread_cond_broadcast(p->cond_var);
printf("everyone should be working \n");
// more stuff
}
main()
{
pthread_cond_init(cond_var);
for (i=0; i!=num_cores; i++) {
pthread_create(worker_thread...);
}
pthread_create(dispatch_thread...);
}
Output:
Thread 0 ready for action
Thread 1 ready for action
Thread 2 ready for action
Thread 3 ready for action
Wakeup, everyone
everyone should be working
Thread 0 off to work
What's a good way to send signals to all the threads?
First off, you should have the mutex locked at the point where you call pthread_cond_wait(). It's generally a good idea to hold the mutex when you call pthread_cond_broadcast(), as well.
Second, you should call pthread_cond_wait() in a loop that re-checks the condition you are waiting for and keeps waiting until it actually holds. Spurious wakeups can happen, and you must be able to handle them.
Finally, your actual problem: you are signaling all threads, but some of them aren't waiting yet when the signal is sent. Your main thread and dispatch thread are racing your worker threads: if the main thread can launch the dispatch thread, and the dispatch thread can grab the mutex and broadcast on the condition variable before a worker thread has started waiting, then that worker thread will never wake up.
You need a synchronization point prior to signaling, where you hold off signaling until all threads are known to be waiting for the signal. That, or you can keep signaling until you know all threads have been woken up.
In this case, you could use the mutex to protect a count of sleeping threads. Each thread grabs the mutex and increments the count. If the count matches the count of worker threads, then it's the last thread to increment the count, and so it signals on another condition variable sharing the same mutex to tell the sleeping dispatch thread that all threads are ready. The thread then waits on the original condition, which causes it to release the mutex.
If the dispatch thread wasn't sleeping yet when the last worker thread signals on that condition, it will find that the count already matches the desired count and not bother waiting, but immediately broadcast on the shared condition to wake workers, who are now guaranteed to all be sleeping.
Anyway, here's some working source code that fleshes out your sample code and includes my solution:
#include <stdio.h>
#include <string.h>
#include <pthread.h>

static const int num_cores = 8;

struct sync {
    pthread_mutex_t *mutex;
    pthread_cond_t *cond_var;
    int thread_no;
};

static int sleeping_count = 0;
static pthread_cond_t all_sleeping_cond = PTHREAD_COND_INITIALIZER;

void *
worker_thread(void *p_)
{
    struct sync *p = p_;
    // setup stuff here
    pthread_mutex_lock(p->mutex);
    printf("Thread %d ready for action \n", p->thread_no);
    sleeping_count += 1;
    if (sleeping_count >= num_cores) {
        /* Last worker to go to sleep. */
        pthread_cond_signal(&all_sleeping_cond);
    }
    int err = pthread_cond_wait(p->cond_var, p->mutex);
    /* pthread calls return the error number instead of setting errno */
    if (err) fprintf(stderr, "pthread_cond_wait: %s\n", strerror(err));
    printf("Thread %d off to work \n", p->thread_no);
    pthread_mutex_unlock(p->mutex);
    // work stuff
    return NULL;
}

void *
dispatch_thread(void *p_)
{
    struct sync *p = p_;
    // setup stuff
    pthread_mutex_lock(p->mutex);
    while (sleeping_count < num_cores) {
        pthread_cond_wait(&all_sleeping_cond, p->mutex);
    }
    printf("Wakeup, everyone ");
    int err = pthread_cond_broadcast(p->cond_var);
    if (err) fprintf(stderr, "pthread_cond_broadcast: %s\n", strerror(err));
    printf("everyone should be working \n");
    pthread_mutex_unlock(p->mutex);
    // more stuff
    return NULL;
}

int
main(void)
{
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t cond_var = PTHREAD_COND_INITIALIZER;
    pthread_t worker[num_cores];
    struct sync info[num_cores];
    for (int i = 0; i < num_cores; i++) {
        struct sync *p = &info[i];
        p->mutex = &mutex;
        p->cond_var = &cond_var;
        p->thread_no = i;
        pthread_create(&worker[i], NULL, worker_thread, p);
    }
    pthread_t dispatcher;
    struct sync p = {&mutex, &cond_var, num_cores};
    pthread_create(&dispatcher, NULL, dispatch_thread, &p);
    pthread_exit(NULL);
    /* not reached */
    return 0;
}
