How does the main thread_exit work with a loop sub-thread? - multithreading

I'm newbie in thread, I have the code below:
#include<pthread.h>
#include<stdio.h>
#include <stdlib.h>
#define NUM_THREADS 5
void *PrintHello (void *threadid ){
long tid ;tid = (long) threadid ;
printf ("Hello World! It’s me, thread#%ld !\n" , tid );
pthread_exit(NULL);
}
int main (int argc ,char *argv[] ){
pthread_t threads [NUM_THREADS] ;
int rc ;
long t ;
for( t=0; t<NUM_THREADS; t++){
printf ("In main: creating thread %ld\n" , t );
rc = pthread_create(&threads[t],NULL,PrintHello,(void *)t );
}
pthread_exit(NULL);
}
I compile and the output here:1
But when i delete the last line "pthread_exit(NULL)", the output is sometimes as same as the above which always prints enough 5 sub-threads, sometimes just prints 4 sub-thread from thread 0-3 for instace:2
Help me with this, please!

Omitting that pthread_exit call in main will cause main to implicitly return and terminate the process, including all running threads.
Now you have a race condition: will your five worker threads print their messages before main terminates the whole process? Sometimes yes, sometimes no, as you have observed.
If you keep the pthread_exit call in main, your program will instead terminate when the last running thread calls pthread_exit.

Related

Pthread join or pthread exit of terminating mulithreaded c program?

I wanted to print 1 to 10 for 3 threads. My code is able to do that but after that the program gets stuck. I tried using pthread_exit in the end of function. Also, I tried to remove while (1) in main and using pthread join there. But still, I got the same result. How should I terminate the threads?
enter code here
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
int done = 1;
//Thread function
void *foo()
{
for (int i = 0; i < 10; i++)
{
printf(" \n #############");
pthread_mutex_lock(&lock);
if(done == 1)
{
done = 2;
printf (" \n %d", i);
pthread_cond_signal(&cond2);
pthread_cond_wait(&cond1, &lock);
printf (" \n Thread 1 woke up");
}
else if(done == 2)
{
printf (" \n %d", i);
done = 3;
pthread_cond_signal(&cond3);
pthread_cond_wait(&cond2, &lock);
printf (" \n Thread 2 woke up");
}
else
{
printf (" \n %d", i);
done = 1;
pthread_cond_signal(&cond1);
pthread_cond_wait(&cond3, &lock);
printf (" \n Thread 3 woke up");
}
pthread_mutex_unlock(&lock);
}
pthread_exit(NULL);
return NULL;
}
int main(void)
{
pthread_t tid1, tid2, tid3;
pthread_create(&tid1, NULL, foo, NULL);
pthread_create(&tid2, NULL, foo, NULL);
pthread_create(&tid3, NULL, foo, NULL);
while(1);
printf ("\n $$$$$$$$$$$$$$$$$$$$$$$$$$$");
return 0;
}
How should I terminate the threads?
Either returning from the outermost call to the thread function or calling pthread_exit() terminates a thread. Returning a value p from the outermost call is equivalent to calling pthread_exit(p).
the program gets stuck
Well of course it does when the program performs
while(1);
.
Also, I tried to remove while (1) in main and using pthread join there. But still, I got the same result.
You do need to join the threads to ensure that they terminate before the overall program does. This is the only appropriate way to achieve that. But if your threads in fact do not terminate in the first place, then that's moot.
In your case, observe that each thread unconditionally performs a pthread_cond_wait() on every iteration of the loop, requiring it to be signaled before it resumes. Normally, the preceding thread will signal it, but that does not happen after the last iteration of the loop. You could address that by having each thread perform an appropriate additional call to pthread_cond_signal() after it exits the loop, or by ensuring that the threads do not wait in the last loop iteration.

Do I need a QMutex for variable that is accessed by single statement?

In this document, a QMutex is used to protect "number" from being modified by multiple threads at same time.
I have a code in which a thread is instructed to do different work according to a flag set by another thread.
//In thread1
if(flag)
dowork1;
else
dowork2;
//In thread2
void setflag(bool f)
{
flag=f;
}
I want to know if a QMutex is needed to protect flag, i.e.,
//In thread1
mutex.lock();
if(flag)
{
mutex.unlock();
dowork1;
}
else
{
mutex.unlock();
dowork2;
}
//In thread2
void setflag(bool f)
{
mutex.lock();
flag=f;
mutex.unlock();
}
The code is different from the document in that flag is accessed(read/written) by single statement in both threads, and only one thread modifies the value of flag.
PS:
I always see the example in multi-thread programming tutorials that one thread does "count++", the other thread does "count--", and the tutorials say you should use a Mutex to protect the variable "count". I cannot get the point of using a mutex. Does it mean the execution of single statement "count++" or "count--" can be interrupted in the middle and produce unexpected result? What unexpected results can be gotten?
Does it mean the execution of single statement "count++" or "count--"
can be interrupted in the middle and produce unexpected result? What
unexpected results can be gotten?
Just answering to this part: Yes, the execution can be interrupted in the middle of a statement.
Let's imagine a simple case:
class A {
void foo(){
++a;
}
int a = 0;
};
The single statement ++a is translated in assembly to
mov eax, DWORD PTR [rdi]
add eax, 1
mov DWORD PTR [rdi], eax
which can be seen as
eax = a;
eax += 1;
a = eax;
If foo() is called on the same instance of A in 2 different threads (be it on a single core, or multiple cores) you cannot predict what will be the result of the program.
It can behave nicely:
thread 1 > eax = a // eax in thread 1 is equal to 0
thread 1 > eax += 1 // eax in thread 1 is equal to 1
thread 1 > a = eax // a is set to 1
thread 2 > eax = a // eax in thread 2 is equal to 1
thread 2 > eax += 1 // eax in thread 2 is equal to 2
thread 2 > a = eax // a is set to 2
or not:
thread 1 > eax = a // eax in thread 1 is equal to 0
thread 2 > eax = a // eax in thread 2 is equal to 0
thread 2 > eax += 1 // eax in thread 2 is equal to 1
thread 2 > a = eax // a is set to 1
thread 1 > eax += 1 // eax in thread 1 is equal to 1
thread 1 > a = eax // a is set to 1
In a well defined program, N calls to foo() should result in a == N.
But calling foo() on the same instance of A from multiple threads creates undefined behavior. There is no way to know the value of a after N calls to foo().
It will depend on how you compiled your program, what optimization flags were used, which compiler was used, what was the load of your CPU, the number of core of your CPU,...
NB
class A {
public:
bool check() const { return a == b; }
int get_a() const { return a; }
int get_b() const { return b; }
void foo(){
++a;
++b;
}
private:
int a = 0;
int b = 0;
};
Now we have a class that, for an external observer, keeps a and b equal at all time.
The optimizer could optimize this class into:
class A {
public:
bool check() const { return true; }
int get_a() const { return a; }
int get_b() const { return b; }
void foo(){
++a;
++b;
}
private:
int a = 0;
int b = 0;
};
because it does not change the observable behavior of the program.
However if you invoke undefined behavior by calling foo() on the same instance of A from multiple threads, you could end up if a = 3, b = 2 and check() still returning true. Your code has lost its meaning, the program is not doing what it is supposed to and can be doing about anything.
From here you can imagine more complex cases, like if A manages network connections, you can end up sending the data for client #10 to client #6. If your program is running in a factory, you can end up activating the wrong tool.
If you want the definition of undefined behavior you can look here : https://en.cppreference.com/w/cpp/language/ub
and in the C++ standard
For a better understanding of UB you can look for CppCon talks on the topic.
For any standard object (including bool) that is accessed from multiple threads, where at least one of the threads may modify the object's state, you need to protect access to that object using a mutex, otherwise you will invoke undefined behavior.
As a practical matter, for a bool that undefined behavior probably won't come in the form of a crash, but more likely in the form of thread B sometimes not "seeing" changes made to the bool by thread A, due to caching and/or optimization issues (e.g. the optimizer "knows" that the bool can't change during a function call, so it doesn't bother checking it more than once)
If you don't want to guard your accesses with a mutex, the other option is to change flag from a bool to a std::atomic<bool>; the std::atomic<bool> type has exactly the semantics you are looking for, i.e. it can be read and/or written from any thread without invoking undefined behavior.
Look here for an explanation: Do I have to use atomic<bool> for "exit" bool variable?
To synchronize access to flag you can make it a std::atomic<bool>.
Or you can use a QReadWriteLock together with a QReadLocker and a QWriteLocker. Compared to using a QMutex this gives you the advantage that you do not need to care about the call to QMutex::unlock() if you use exceptions or early return statements.
Alternatively you can use a QMutexLocker if the QReadWriteLock does not match your use case.
QReadWriteLock lock;
...
//In thread1
{
QReadLocker readLocker(&lock);
if(flag)
dowork1;
else
dowork2;
}
...
//In thread2
void setflag(bool f)
{
QWriteLocker writeLocker(&lock);
flag=f;
}
Keeping your program expressing its intent (ie. accessing shared vars under locks) is a big win for program maintenance and clarity. You need to have some pretty good reasons to abandon that clarity for obscure approaches like the atomics and devising consistent race conditions.
Good reasons include you have measured your program spending too much time toggling the mutex. In any decent implementation, the difference between a non-contested mutex and an atomic is minute -- the mutex lock and unlock typical employ an optimistic compare-and-swap, returning quickly. If your vendor doesn't provide a decent implementation, you might bring that up with them.
In your example, dowork1 and dowork2 are invoked with the mutex locked; so the mutex isn't just protecting flag, but also serializing these functions. If that is just an artifact of how you posed the question, then race conditions (variants of atomics travesty) are less scary.
In your PS (dup of comment above):
Yes, count++ is best thought of as:
mov $_count, %r1
ld (%r1), %r0
add $1, %r0, %r2
st %r2,(%r1)
Even machines with natural atomic inc (x86,68k,370,dinosaurs) instructions might not be used consistently by the compiler.
So, if two threads do count--; and count++; at close to the same time, the result could be -1, 0, 1. (ignoring the language weenies that say your house might burn down).
barriers:
if CPU0 executes:
store $1 to b
store $2 to c
and CPU1 executes:
load barrier -- discard speculatively read values.
load b to r0
load c to r1
Then CPU1 could read r0,r1 as: (0,0), (1,0), (1,2), (0,2).
This is because the observable order of the memory writes is weak; the processor may make them visible in an arbitrary fashion.
So, we change CPU0 to execute:
store $1 to b
store barrier -- stop storing until all previous stores are visible
store $2 to c
Then, if CPU1 saw that r1 (c) was 2, then r0 (b) has to be 1. The store barrier enforces that.
For me, its seems to be more handy to use a mutex here.
In general not using mutex when sharing references could lead to
problems.
The only downside of using mutex here seems to be, that you will slightly decrease the performance, because your threads have to wait for each other.
What kind of errors could happen ?
Like somebody in the comments said its a different situation if
your share fundamental datatype e.g. int, bool, float
or a object references. I added some qt code
example, which emphases 2 possible problems during NOT using mutex. The problem #3 is a fundamental one and pretty well described in details by Benjamin T and his nice answer.
Blockquote
main.cpp
#include <QCoreApplication>
#include <QThread>
#include <QtDebug>
#include <QTimer>
#include "countingthread.h"
int main(int argc, char *argv[])
{
QCoreApplication a(argc, argv);
int amountThread = 3;
int counter = 0;
QString *s = new QString("foo");
QMutex *mutex = new QMutex();
//we construct a lot of thread
QList<CountingThread*> threadList;
//we create all threads
for(int i=0;i<amountThread;i++)
{
CountingThread *t = new CountingThread();
#ifdef TEST_ATOMIC_VAR_SHARE
t->addCounterdRef(&counter);
#endif
#ifdef TEST_OBJECT_VAR_SHARE
t->addStringRef(s);
//we add a mutex, which is shared to read read write
//just used with TEST_OBJECT_SHARE_FIX define uncommented
t->addMutexRef(mutex);
#endif
//t->moveToThread(t);
threadList.append(t);
}
//we start all with low prio, otherwise we produce something like a fork bomb
for(int i=0;i<amountThread;i++)
threadList.at(i)->start(QThread::Priority::LowPriority);
return a.exec();
}
countingthread.h
#ifndef COUNTINGTHREAD_H
#define COUNTINGTHREAD_H
#include <QThread>
#include <QtDebug>
#include <QTimer>
#include <QMutex>
//atomic var is shared
//#define TEST_ATOMIC_VAR_SHARE
//more complex object var is shared
#define TEST_OBJECT_VAR_SHARE
// we add the fix
#define TEST_OBJECT_SHARE_FIX
class CountingThread : public QThread
{
Q_OBJECT
int *m_counter;
QString *m_string;
QMutex *m_locker;
public :
void addCounterdRef(int *r);
void addStringRef(QString *s);
void addMutexRef(QMutex *m);
void run() override;
};
#endif // COUNTINGTHREAD_H
countingthread.cpp
#include "countingthread.h"
void CountingThread::run()
{
//forever
while(1)
{
#ifdef TEST_ATOMIC_VAR_SHARE
//first use of counter
int counterUse1Copy= (*m_counter);
//some other operations, here sleep 10 ms
this->msleep(10);
//we will retry to use a second time
int counterUse2Copy= (*m_counter);
if(counterUse1Copy != counterUse2Copy)
qDebug()<<this->thread()->currentThreadId()<<" problem #1 found, counter not like we expect";
//we increment afterwards our counter
(*m_counter) +=1; //this works for fundamental types, like float, int, ...
#endif
#ifdef TEST_OBJECT_VAR_SHARE
#ifdef TEST_OBJECT_SHARE_FIX
m_locker->lock();
#endif
m_string->replace("#","-");
//this will crash here !!, with problem #2,
//segmentation fault, is not handle by try catch
m_string->append("foomaster");
m_string->append("#");
if(m_string->length()>10000)
qDebug()<<this->thread()->currentThreadId()<<" string is: " << m_string;
#ifdef TEST_OBJECT_SHARE_FIX
m_locker->unlock();
#endif
#endif
}//end forever
}
void CountingThread::addCounterdRef(int *r)
{
m_counter = r;
qDebug()<<this->thread()->currentThreadId()<<" add counter with value: " << *m_counter << " and address : "<< m_counter ;
}
void CountingThread::addStringRef(QString *s)
{
m_string = s;
qDebug()<<this->thread()->currentThreadId()<<" add string with value: " << *m_string << " and address : "<< m_string ;
}
void CountingThread::addMutexRef(QMutex *m)
{
m_locker = m;
}
If you follow up the code you are able to perform 2 tests.
If you uncomment TEST_ATOMIC_VAR_SHARE and comment TEST_OBJECT_VAR_SHARE in countingthread.h
your will see
problem #1 if you use your variable multiple times in your thread, it could be changes in the background from another thread, besides my expectation there was no app crash or weird exception in my build environment during execution using an int counter.
If you uncomment TEST_OBJECT_VAR_SHARE and comment TEST_OBJECT_SHARE_FIX and comment TEST_ATOMIC_VAR_SHARE in countingthread.h
your will see
problem #2 you get a segmentation fault, which is not possible to handle via try catch. This appears because multiple threads are using string functions for editing on the same object.
If you uncomment TEST_OBJECT_SHARE_FIX too you see the right handling via mutex.
problem #3 see answer from Benjamin T
What is Mutex:
I really like the chicken explanation which vallabh suggested.
I also found an good explanation here

C++ Block thread exit signal/function

I have problem with blocking exit function in thread.
DWORD WINAPI thread1Func( LPVOID lpParam )
{
exit(0); // Problem is there
while(true){
printf("runnging");
Sleep(1000);
}
}
int _tmain(int argc, _TCHAR* argv[])
{
int thread1 = 1;
HANDLE thread1Handle = 0;
thread1Handle = CreateThread( 0, 0,
thread1Func, &thread1, 0, NULL);
WaitForSingleObject(thread1Handle,0);
system("pause");
return 0;
}
Unfortunately the thread which I have created in main function calls exit(0) function.
thread1Func doesn't call exit(0) statement directly. it is called by functions which has been called by thread1Func. So I cannot comment out or remove this statement.
I want to block exit signal from thread , What should I to do ?
How can I block exit signals from background threads ?
Well, exiting the process ( at least on windows ) is not the good way, so naturally you should return an error code in your main function if you can like ERROR_SUCCESS ( 0 ).
So with that being said, exit is more like a forced exit compared to what you should normally do, I can only assume that it calls ExitProcess(0) under the hood, which as you may read it on the docs for ExitProcess it does what it says, immediately exits the process without snding signals or waiting for anything ( like pending operations ), so your best bet is to exploit the DLL loading and create a a fake kernel32.dll where you block the exitprocess or whatever exit() actually calls, or do the whole thing in memory like filling the call to exit with nop's so it does nothing.
I can only assume that you are using someone else's code so that why you can't comment it out, In this case you could try some debugging helper libraries that allow you to use debugger features such as break points, and simply skip the exit if it's not coming from the main thread.

Why sleep() after acquiring a pthread_mutex_lock will block the whole program?

In my test program, I start two threads, each of them just do the following logic:
1) pthread_mutex_lock()
2) sleep(1)
3) pthread_mutex_unlock()
However, I find that after some time, one of the two threads will block on pthread_mutex_lock() forever, while the other thread works normal. This is a very strange behavior and I think maybe a potential serious issue. By Linux manual, sleep() is not prohibited when a pthread_mutex_t is acquired. So my question is: is this a real problem or is there any bug in my code ?
The following is the test program. In the code, the 1st thread's output is directed to stdout, while the 2nd's is directed to stderr. So we can check these two different output to see whether the thread is blocked.
I have tested it on linux kernel (2.6.31) and (2.6.9). Both results are the same.
//======================= Test Program ===========================
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <pthread.h>
#define THREAD_NUM 2
static int data[THREAD_NUM];
static int sleepFlag = 1;
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static void * threadFunc(void *arg)
{
int* idx = (int*) arg;
FILE* fd = NULL;
if (*idx == 0)
fd = stdout;
else
fd = stderr;
while(1) {
fprintf(fd, "\n[%d]Before pthread_mutex_lock is called\n", *idx);
if (pthread_mutex_lock(&mutex) != 0) {
exit(1);
}
fprintf(fd, "[%d]pthread_mutex_lock is finisheded. Sleep some time\n", *idx);
if (sleepFlag == 1)
sleep(1);
fprintf(fd, "[%d]sleep done\n\n", *idx);
fprintf(fd, "[%d]Before pthread_mutex_unlock is called\n", *idx);
if (pthread_mutex_unlock(&mutex) != 0) {
exit(1);
}
fprintf(fd, "[%d]pthread_mutex_unlock is finisheded.\n", *idx);
}
}
// 1. compile
// gcc -o pthread pthread.c -lpthread
// 2. run
// 1) ./pthread sleep 2> /tmp/error.log # Each thread will sleep 1 second after it acquires pthread_mutex_lock
// ==> We can find that /tmp/error.log will not increase.
// or
// 2) ./pthread nosleep 2> /tmp/error.log # No sleep is done when each thread acquires pthread_mutex_lock
// ==> We can find that both stdout and /tmp/error.log increase.
int main(int argc, char *argv[]) {
if ((argc == 2) && (strcmp(argv[1], "nosleep") == 0))
{
sleepFlag = 0;
}
pthread_t t[THREAD_NUM];
int i;
for (i = 0; i < THREAD_NUM; i++) {
data[i] = i;
int ret = pthread_create(&t[i], NULL, threadFunc, &data[i]);
if (ret != 0) {
perror("pthread_create error\n");
exit(-1);
}
}
for (i = 0; i < THREAD_NUM; i++) {
int ret = pthread_join(t[i], (void*)0);
if (ret != 0) {
perror("pthread_join error\n");
exit(-1);
}
}
exit(0);
}
This is the output:
On the terminal where the program is started:
root#skyscribe:~# ./pthread sleep 2> /tmp/error.log
[0]Before pthread_mutex_lock is called
[0]pthread_mutex_lock is finisheded. Sleep some time
[0]sleep done
[0]Before pthread_mutex_unlock is called
[0]pthread_mutex_unlock is finisheded.
...
On another terminal to see the file /tmp/error.log
root#skyscribe:~# tail -f /tmp/error.log
[1]Before pthread_mutex_lock is called
And no new lines are outputed from /tmp/error.log
This is a wrong way to use mutexes. A thread should not hold a mutex for more time than it does not own it, particularly not if it sleeps while holding the mutex. There is no FIFO guarantee for locking a mutex (for efficiency reasons).
More specifically, if thread 1 unlocks the mutex while thread 2 is waiting for it, it makes thread 2 runnable but this does not force the scheduler to preempt thread 1 or make thread 2 run immediately. Most likely, it will not because thread 1 has recently slept. When thread 1 subsequently reaches the pthread_mutex_lock() call, it will generally be allowed to lock the mutex immediately, even though there is a thread waiting (and the implementation can know it). When thread 2 wakes up after that, it will find the mutex already locked and go back to sleep.
The best solution is not to hold a mutex for that long. If that is not possible, consider moving the lock-needing operations to a single thread (removing the need for the lock) or waking up the correct thread using condition variables.
There's neither a problem, nor a bug in your code, but a combination of buffering and scheduling effects. Add an fflush here:
fprintf (fd, "[%d]pthread_mutex_unlock is finisheded.\n", *idx);
fflush (fd);
and run
./a.out >1 1.log 2> 2.log &
and you'll see rather equal progress made by the two threads.
EDIT: and like #jilles above said, a mutex is supposed to be a short wait lock, as opposed to long waits like condition variable wait, waiting for I/O or sleeping. That's why a mutex is not a cancellation point too.

Does a call to MPI_Barrier affect every thread in an MPI process?

Does a call to MPI_Barrier affect every thread in an MPI process or only the thread
that makes the call?
For your information , my MPI application will run with MPI_THREAD_MULTIPLE.
Thanks.
The way to think of this is that MPI_Barrier (and other collectives) are blocking function calls, which block until all processes in the communicator have completed the function. That, I think, makes it a little easier to figure out what should happen; the function blocks, but other threads continue on their way unimpeded.
So consider the following chunk of code (The shared done flag being flushed to communicate between threads is not how you should be doing thread communication, so please don't use this as a template for anything. Furthermore, using a reference to done will solve this bug/optimization, see the end of comment 2):
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <unistd.h>
int main(int argc, char**argv) {
int ierr, size, rank;
int provided;
volatile int done=0;
MPI_Comm comm;
ierr = MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
if (provided == MPI_THREAD_SINGLE) {
fprintf(stderr,"Could not initialize with thread support\n");
MPI_Abort(MPI_COMM_WORLD,1);
}
comm = MPI_COMM_WORLD;
ierr = MPI_Comm_size(comm, &size);
ierr = MPI_Comm_rank(comm, &rank);
if (rank == 1) sleep(10);
#pragma omp parallel num_threads(2) default(none) shared(rank,comm,done)
{
#pragma omp single
{
/* spawn off one thread to do the barrier,... */
#pragma omp task
{
MPI_Barrier(comm);
printf("%d -- thread done Barrier\n", rank);
done = 1;
#pragma omp flush
}
/* and another to do some printing while we're waiting */
#pragma omp task
{
int *p = &done;
while(!(*p) {
printf("%d -- thread waiting\n", rank);
sleep(1);
}
}
}
}
MPI_Finalize();
return 0;
}
Rank 1 sleeps for 10 seconds, and all the ranks start a barrier in one thread. If you run this with mpirun -np 2, you'd expect the first of rank 0s threads to hit the barrier, and the other to cycle around printing and waiting -- and sure enough, that's what happens:
$ mpirun -np 2 ./threadbarrier
0 -- thread waiting
0 -- thread waiting
0 -- thread waiting
0 -- thread waiting
0 -- thread waiting
0 -- thread waiting
0 -- thread waiting
0 -- thread waiting
0 -- thread waiting
0 -- thread waiting
1 -- thread waiting
0 -- thread done Barrier
1 -- thread done Barrier

Resources