parallelize data processing on multiple cores using pthreads

parallelize data processing on multiple cores using pthreads - linux

My goal is to process data on multiple cores by using multiple worker threads, and then further process the results in a master thread. I'm working on linux and I'd like to use pthreads. I'm created a simple example to learn how to do this properly. I have one main thread named "callback" and 4 worker threads. The idea is that the main thread signals the worker threads to start processing, and the 4 threads then signal the main thread when they are done, and the main thread quits after it has been notified by all 4 threads that they are done. I want the 4 worker threads to be able to run in parallel, so I don't want any of those threads waiting for the others. In my example I've tried to just let each of the threads sleep for a different duration (1, 2, 3 and 4 seconds), with the idea that the code would exit after 4 seconds (i.e. when worker thread 4 has done waiting 4 seconds).
For some reason, my code is incorrect, and it always exits immediately, printing this:
thread 3 start (sleeping 3000 ms)
thread 2 start (sleeping 2000 ms)
thread 1 start (sleeping 1000 ms)
thread 4 start (sleeping 4000 ms)
thread 1 stop
thread 2 stop
thread 3 stop
thread 4 stop
Main(): Waited on 5 threads. Done.
So the threads do seem to exit in the correct order but the program doesn't take 4 seconds to run.
What is going on here? I've pasted the code below
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
// based on https://computing.llnl.gov/tutorials/pthreads/#ConditionVariables
#define DUR 1000
#define NUM_THREADS 5
int state = 0;
pthread_mutex_t mutex;
pthread_cond_t conddone;
pthread_cond_t condwork;
void* callback(void* t) {
// signal worker threads to start work
pthread_mutex_lock(&mutex);
pthread_cond_broadcast(&condwork);
pthread_mutex_unlock(&mutex);
// wait for worker threads to finish
pthread_mutex_lock(&mutex);
while (state < 4)
pthread_cond_wait(&conddone, &mutex);
pthread_mutex_unlock(&mutex);
pthread_exit(NULL);
}
void* worker(void* t) {
long id = (long)t;
// wait for signal from callback to start doing work
pthread_mutex_lock(&mutex);
pthread_cond_wait(&condwork, &mutex);
pthread_mutex_unlock(&mutex);
// do work
printf("thread %d start (sleeping %d ms)\n", id, id * DUR);
usleep(id * DUR);
printf(" thread %d stop\n", id);
// tell callback we're done
pthread_mutex_lock(&mutex);
state++;
pthread_cond_signal(&conddone);
pthread_mutex_unlock(&mutex);
pthread_exit(NULL);
}
int main (int argc, char *argv[])
{
int i, rc;
pthread_t threads[5];
pthread_attr_t attr;
pthread_mutex_init(&mutex, NULL);
pthread_cond_init (&condwork, NULL);
pthread_cond_init (&conddone, NULL);
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
pthread_create(&threads[0], &attr, callback, (void *)0);
pthread_create(&threads[1], &attr, worker, (void *)1);
pthread_create(&threads[2], &attr, worker, (void *)2);
pthread_create(&threads[3], &attr, worker, (void *)3);
pthread_create(&threads[4], &attr, worker, (void *)4);
for (i=0; i<NUM_THREADS; i++) {
pthread_join(threads[i], NULL);
}
printf ("Main(): Waited on %d threads. Done.\n", NUM_THREADS);
pthread_attr_destroy(&attr);
pthread_mutex_destroy(&mutex);
pthread_cond_destroy(&condwork);
pthread_cond_destroy(&conddone);
pthread_exit(NULL);
}

Your immediate problem is just that usleep() sleeps for microseconds not milliseconds so your threads are sleeping for a thousandth of the time that you intend them to.
You do have another problem, though: your condwork condition variable is not paired with a predicate over shared state (eg the predicate state < 4 for the conddone variable). If one of your worker threads executes the pthread_cond_wait() after the "callback" thread has executed the pthread_cond_broadcast(), the worker will wait indefinitely.
You could fix this by initialising the state variable to -1:
int state = -1;
and having your workers wait on the predicate state < 0:
// wait for signal from callback to start doing work
pthread_mutex_lock(&mutex);
while (state < 0)
pthread_cond_wait(&condwork, &mutex);
pthread_mutex_unlock(&mutex);
and having the "callback" signal the workers by setting state to 0:
// signal worker threads to start work
pthread_mutex_lock(&mutex);
state = 0;
pthread_cond_broadcast(&condwork);
pthread_mutex_unlock(&mutex);

Related

synchronising lock step execution of threads

I have a top level controller, which schedules n sub threads,
and waits for all of them to complete before scheduling them all over again. These threads go on forever, so the threads do not need to be joined.
So the pseudo-code is something like this (assuming n=2):
Top:
loop:
1. initiate T1 and T2
2. wait for completion of both T1 and T2
T1: (similarly for T2)
loop:
1. wait for lock-1
2. do something
3. send completion signal
I am thinking of the following code for this, where Top,T1,T2 are
separate threads:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define NUM_PROCS 2
pthread_mutex_t m_1, m_2; // for scheduling T1,T2
int count;
pthread_mutex_t m_count; // for completion-signal
pthread_cond_t c_count;
pthread_attr_t attr; // for threads
pthread_t thread[NUM_PROCS+1];
void *Top(void *t) {
count=0;
while(1) {
pthread_mutex_unlock(&m_1);
pthread_mutex_unlock(&m_2);
// not sure if this the correct way to wait for T1&T2
pthread_mutex_lock(&m_count);
while(count < 2) {
pthread_cond_wait(&c_count, &m_count);
}
count=0;
pthread_mutex_unlock(&m_count);
}
}
void *T1(void *t) { // similarly for T2
while(1) {
pthread_mutex_lock(&m_1); // use m_2 for T2
sleep(1);
pthread_mutex_lock(&m_count);
count++;
pthread_mutex_unlock(&m_count);
pthread_cond_signal(&c_count);
}
}
void *T2(void *t) {
while(1) {
pthread_mutex_lock(&m_2);
sleep(1);
pthread_mutex_lock(&m_count);
count++;
pthread_mutex_unlock(&m_count);
pthread_cond_signal(&c_count);
}
}
int main() {
int rc;
int t[NUM_PROCS+1] = {0,1,2}; // thread numbers
pthread_mutex_init(&m_1, NULL); // initializations
pthread_mutex_init(&m_2, NULL);
pthread_mutex_init(&m_count, NULL);
pthread_cond_init(&c_count, NULL);
pthread_mutex_lock(&m_1); // to allow Top to start first
pthread_mutex_lock(&m_2);
pthread_attr_init(&attr); // initiate the threads
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
rc = pthread_create(&thread[0], &attr, Top, (void *)&t[0]);
rc = pthread_create(&thread[1], &attr, T1, (void *)&t[1]);
rc = pthread_create(&thread[2], &attr, T2, (void *)&t[2]);
}
My questions on the above code:
Is the above code correct?
Usually, lock and unlock are both done by the same thread.
So my solution, of T1 locking m_1 and Top unlocking it,
seems a bit weird. Is there a better way of doing this?
Is semaphore a more efficient way to do this synchronization?
Will the code change (except main() of course) if I implement
this as separate processes with shared memory, instead of as
threads? And will that be less efficient than the threads version?

A thread that has not locked a pthread mutex may not unlock it. If you need to create a lock that one thread can acquire and another thread can release, you have to do so with your own code. A standard mutex is not such a lock.

Whats exact definition of 'atomicity' in programming?

the definition of 'atomicity' says that a transaction should be able to be terminated without being touched or manipulated by possibly concurrent running actions during its process. But does that also mean that a program should not run concurrent when it is supposed to be atomic?
let's say we have 2 program as an example:
example_program1:
counts int i = 1 to 100 every second
every number is printed in new line
example_program2:
just prints "hi"
and a parent program that includes both of these program and run them once receiving a signal to start a specific program (e.g via sigaction in linux) with 2 version:
version 1:
runs the program (even concurrent) anytime once receiving the signal
which means program2 can print "hi" while program1 is still printing out the numbers
version 2:
only run one program at a time
signal for other program is blocked until program in progress has terminated
in this example, can only version 2 considered atomic or both? Would this program be non-atomic only if e.g program2 would increment i by 1 during its process?

Hi Atomic operations provide instructions that execute atomically without interruption. Just as the atom was originally thought to be an indivisible particle, atomic operations are indivisible instructions.
I will like to explain that with a program multithread production and consumer with rase conditions.
#include <stdio.h>
#include <pthread.h>
pthread_cond_t condicion = PTHREAD_COND_INITIALIZER;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int cont = 0;
void* productor(){
while(1){
pthread_mutex_lock(&mutex);
while(cont != 0)
{
pthread_cond_wait(&condicion, &mutex);
}
cont++;
printf("%d\n",cont);
pthread_cond_signal(&condicion);
pthread_mutex_unlock(&mutex);
}
}
void* consumidor(){
while(1)
{
pthread_mutex_lock(&mutex);
while(cont == 0)
{
pthread_cond_wait(&condicion, &mutex);
}
cont--;
printf("%d\n",cont);
pthread_cond_signal(&condicion);
pthread_mutex_unlock(&mutex);
}
}
int main(){
pthread_t id_hilo1,id_hilo2;
pthread_create(&id_hilo1, NULL, &productor, NULL);
pthread_create(&id_hilo2, NULL, &consumidor, NULL);
pthread_join(id_hilo1, NULL);
pthread_join(id_hilo2, NULL);
return 0;
}

sem_init() causing SEGV

I have the following code and it is being killed by a SEGV signal. Using the debugger shows that it is being killed by the first sem_init() in main(). If I comment out the first sem_init() the second causes the same problem. I have tried figuring out what would cause this sys call to cause a SEGV. The else is not being run, so the error is happening before it can return a value.
Any help would be greatly appreciated,
Thank you.
I removed the rest of the code that isnt being run before this problem occurs.
#define PORTNUM 7000
#define NUM_OF_THREADS 5
#define oops(msg) { perror(msg); exit(1);}
#define FCFS 0
#define SJF 1;
void bindAndListen();
void acceptConnection(int socket_file_descriptor);
void* dispatchJobs(void*);
void* replyToClient(void* pos);
//holds ids of worker threads
pthread_t threads[NUM_OF_THREADS];
//mutex variable for sleep_signal_cond
pthread_mutex_t sleep_signal_mutex[NUM_OF_THREADS];
//holds the condition variables to signal when the thread should be unblocked
pthread_cond_t sleep_signal_cond[NUM_OF_THREADS];
//mutex for accessing sleeping_thread_list
pthread_mutex_t sleeping_threads_mutex = PTHREAD_MUTEX_INITIALIZER;
//list of which threads are sleeping so they can be signaled and given a job
std::vector<bool> *sleeping_threads_list = new std::vector<bool>();
//number of threads ready for jobs
sem_t* available_threads;
sem_t* waiting_jobs;
//holds requests waiting to be given to one of the threads for execution
std::vector<std::vector<int> >* jobs = new std::vector<std::vector<int> >();
pthread_mutex_t jobs_mutex = PTHREAD_MUTEX_INITIALIZER;
int main (int argc, char * const argv[]) {
//holds id for thread responsible for removing jobs from ready queue and assigning them to worker thread
pthread_t dispatcher_thread;
//initializes semaphores
if(sem_init(available_threads, 0, NUM_OF_THREADS) != 0){ //this is the line causing the SEGV
oops("Error Initializing Semaphore");
}
if(sem_init(waiting_jobs, 0, 0) !=0){
oops("Error Initializing Semaphore");
}
//initializes condition variables and guarding mutexes
for(int i=0; i<NUM_OF_THREADS; i++){
pthread_cond_init(&sleep_signal_cond[i], NULL);
pthread_mutex_init(&sleep_signal_mutex[i], NULL);
}
if(pthread_create(&dispatcher_thread, NULL, dispatchJobs, (void*)NULL) !=0){
oops("Error Creating Distributer Thread");

You declare pointers to your semaphores:
sem_t* available_threads;
sem_t* waiting_jobs;
but never initialize the memory. The sem_init function is not expecting to allocate memory, just to initialize an existing blob of memory. Either allocate some memory and assign these pointers to it, or declare the semaphores as sem_t and pass the address to sem_init.

Keeping number of threads constant with pthread in C

I tried to find a solution in order to keep the number of working threads constant under linux in C using pthreads, but I seem to be unable to fully understand what's wrong with the following code:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#define MAX_JOBS 50
#define MAX_THREADS 5
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
int jobs = MAX_JOBS;
int worker = 0;
int counter = 0;
void *functionC() {
pthread_mutex_lock(&mutex1);
worker++;
counter++;
printf("Counter value: %d\n",counter);
pthread_mutex_unlock(&mutex1);
// Do something...
sleep(4);
pthread_mutex_lock(&mutex1);
jobs--;
worker--;
printf(" >>> Job done: %d\n",jobs);
pthread_mutex_unlock(&mutex1);
}
int main(int argc, char *argv[]) {
int i=0, j=0;
pthread_t thread[MAX_JOBS];
// Create threads if the number of working threads doesn't exceed MAX_THREADS
while (1) {
if (worker > MAX_THREADS) {
printf(" +++ In queue: %d\n", worker);
sleep(1);
} else {
//printf(" +++ Creating new thread: %d\n", worker);
pthread_create(&thread[i], NULL, &functionC, NULL);
//printf("%d",worker);
i++;
}
if (i == MAX_JOBS) break;
}
// Wait all threads to finish
for (j=0;j<MAX_JOBS;j++) {
pthread_join(thread[j], NULL);
}
return(0);
}
A while (1) loop keeps creating threads if the number of working threads is under a certain threshold. A mutex is supposed to lock the critical sections every time the global counter of the working threads is incremented (thread creation) and decremented (job is done). I thought it could work fine and for the most part it does, but weird things happen...
For instance, if I comment (as it is in this snippet) the printf //printf(" +++ Creating new thread: %d\n", worker); the while (1) seems to generate a random number (18-25 in my experience) threads (functionC prints out "Counter value: from 1 to 18-25"...) at a time instead of respecting the IF condition inside the loop. If I include the printf the loop seems to behave "almost" in the right way... This seems to hint that there's a missing "mutex" condition that I should add to the loop in main() to effectively lock the thread when MAX_THREADS is reached but after changing a LOT of times this code for the past few days I'm a bit lost, now. What am I missing?
Please, let me know what I should change in order to keep the number of threads constant it doesn't seem that I'm too far from the solution... Hopefully... :-)
Thanks in advance!

Your problem is that worker is not incremented until the new thread actually starts and gets to run - in the meantime, the main thread loops around, checks workers, finds that it hasn't changed, and starts another thread. It can repeat this many times, creating far too many threads.
So, you need to increment worker in the main thread, when you've decided to create a new thread.
You have another problem - you should be using condition variables to let the main thread sleep until it should start another thread, not using a busy-wait loop with a sleep(1); in it. The complete fixed code would look like:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#define MAX_JOBS 50
#define MAX_THREADS 5
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond1 = PTHREAD_COND_INITIALIZER;
int jobs = MAX_JOBS;
int workers = 0;
int counter = 0;
void *functionC() {
pthread_mutex_lock(&mutex1);
counter++;
printf("Counter value: %d\n",counter);
pthread_mutex_unlock(&mutex1);
// Do something...
sleep(4);
pthread_mutex_lock(&mutex1);
jobs--;
printf(" >>> Job done: %d\n",jobs);
/* Worker is about to exit, so decrement count and wakeup main thread */
workers--;
pthread_cond_signal(&cond1);
pthread_mutex_unlock(&mutex1);
return NULL;
}
int main(int argc, char *argv[]) {
int i=0, j=0;
pthread_t thread[MAX_JOBS];
// Create threads if the number of working threads doesn't exceed MAX_THREADS
while (i < MAX_JOBS) {
/* Block on condition variable until there are insufficient workers running */
pthread_mutex_lock(&mutex1);
while (workers >= MAX_THREADS)
pthread_cond_wait(&cond1, &mutex1);
/* Another worker will be running shortly */
workers++;
pthread_mutex_unlock(&mutex1);
pthread_create(&thread[i], NULL, &functionC, NULL);
i++;
}
// Wait all threads to finish
for (j=0;j<MAX_JOBS;j++) {
pthread_join(thread[j], NULL);
}
return(0);
}
Note that even though this works, it isn't ideal - it's best to create the number of threads you want up-front, and have them loop around, waiting for work. This is because creating and destroying threads has significant overhead, and because it often simplifies resource management. A version of your code rewritten to work like this would look like:
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#define MAX_JOBS 50
#define MAX_THREADS 5
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
int jobs = MAX_JOBS;
int counter = 0;
void *functionC()
{
int running_job;
pthread_mutex_lock(&mutex1);
counter++;
printf("Counter value: %d\n",counter);
while (jobs > 0) {
running_job = jobs--;
pthread_mutex_unlock(&mutex1);
printf(" >>> Job starting: %d\n", running_job);
// Do something...
sleep(4);
printf(" >>> Job done: %d\n", running_job);
pthread_mutex_lock(&mutex1);
}
pthread_mutex_unlock(&mutex1);
return NULL;
}
int main(int argc, char *argv[]) {
int i;
pthread_t thread[MAX_THREADS];
for (i = 0; i < MAX_THREADS; i++)
pthread_create(&thread[i], NULL, &functionC, NULL);
// Wait all threads to finish
for (i = 0; i < MAX_THREADS; i++)
pthread_join(thread[i], NULL);
return 0;
}

pthread_cond_broadcast problem

Using pthreads in linux 2.6.30 I am trying to send a single signal which will cause multiple threads to begin execution. The broadcast seems to only be received by one thread. I have tried both pthread_cond_signal and pthread cond_broadcast and both seem to have the same behavior. For the mutex in pthread_cond_wait, I have tried both common mutexes and separate (local) mutexes with no apparent difference.
worker_thread(void *p)
{
// setup stuff here
printf("Thread %d ready for action \n", p->thread_no);
pthread_cond_wait(p->cond_var, p->mutex);
printf("Thread %d off to work \n", p->thread_no);
// work stuff
}
dispatch_thread(void *p)
{
// setup stuff
printf("Wakeup, everyone ");
pthread_cond_broadcast(p->cond_var);
printf("everyone should be working \n");
// more stuff
}
main()
{
pthread_cond_init(cond_var);
for (i=0; i!=num_cores; i++) {
pthread_create(worker_thread...);
}
pthread_create(dispatch_thread...);
}
Output:
Thread 0 ready for action
Thread 1 ready for action
Thread 2 ready for action
Thread 3 ready for action
Wakeup, everyone
everyone should be working
Thread 0 off to work
What's a good way to send signals to all the threads?

First off, you should have the mutex locked at the point where you call pthread_cond_wait(). It's generally a good idea to hold the mutex when you call pthread_cond_broadcast(), as well.
Second off, you should loop calling pthread_cond_wait() while the wait condition is true. Spurious wakeups can happen, and you must be able to handle them.
Finally, your actual problem: you are signaling all threads, but some of them aren't waiting yet when the signal is sent. Your main thread and dispatch thread are racing your worker threads: if the main thread can launch the dispatch thread, and the dispatch thread can grab the mutex and broadcast on it before the worker threads can, then those worker threads will never wake up.
You need a synchronization point prior to signaling where you wait to signal till all threads are known to be waiting for the signal. That, or you can keep signaling till you know all threads have been woken up.
In this case, you could use the mutex to protect a count of sleeping threads. Each thread grabs the mutex and increments the count. If the count matches the count of worker threads, then it's the last thread to increment the count and so signals on another condition variable sharing the same mutex to the sleeping dispatch thread that all threads are ready. The thread then waits on the original condition, which causes it release the mutex.
If the dispatch thread wasn't sleeping yet when the last worker thread signals on that condition, it will find that the count already matches the desired count and not bother waiting, but immediately broadcast on the shared condition to wake workers, who are now guaranteed to all be sleeping.
Anyway, here's some working source code that fleshes out your sample code and includes my solution:
#include <stdio.h>
#include <pthread.h>
#include <err.h>
static const int num_cores = 8;
struct sync {
pthread_mutex_t *mutex;
pthread_cond_t *cond_var;
int thread_no;
};
static int sleeping_count = 0;
static pthread_cond_t all_sleeping_cond = PTHREAD_COND_INITIALIZER;
void *
worker_thread(void *p_)
{
struct sync *p = p_;
// setup stuff here
pthread_mutex_lock(p->mutex);
printf("Thread %d ready for action \n", p->thread_no);
sleeping_count += 1;
if (sleeping_count >= num_cores) {
/* Last worker to go to sleep. */
pthread_cond_signal(&all_sleeping_cond);
}
int err = pthread_cond_wait(p->cond_var, p->mutex);
if (err) warnc(err, "pthread_cond_wait");
printf("Thread %d off to work \n", p->thread_no);
pthread_mutex_unlock(p->mutex);
// work stuff
return NULL;
}
void *
dispatch_thread(void *p_)
{
struct sync *p = p_;
// setup stuff
pthread_mutex_lock(p->mutex);
while (sleeping_count < num_cores) {
pthread_cond_wait(&all_sleeping_cond, p->mutex);
}
printf("Wakeup, everyone ");
int err = pthread_cond_broadcast(p->cond_var);
if (err) warnc(err, "pthread_cond_broadcast");
printf("everyone should be working \n");
pthread_mutex_unlock(p->mutex);
// more stuff
return NULL;
}
int
main(void)
{
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond_var = PTHREAD_COND_INITIALIZER;
pthread_t worker[num_cores];
struct sync info[num_cores];
for (int i = 0; i < num_cores; i++) {
struct sync *p = &info[i];
p->mutex = &mutex;
p->cond_var = &cond_var;
p->thread_no = i;
pthread_create(&worker[i], NULL, worker_thread, p);
}
pthread_t dispatcher;
struct sync p = {&mutex, &cond_var, num_cores};
pthread_create(&dispatcher, NULL, dispatch_thread, &p);
pthread_exit(NULL);
/* not reached */
return 0;
}

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

parallelize data processing on multiple cores using pthreads - linux

Related

synchronising lock step execution of threads

Whats exact definition of 'atomicity' in programming?

sem_init() causing SEGV

Keeping number of threads constant with pthread in C

pthread_cond_broadcast problem

Categories

Resources