I would like to call pthread_join for a given thread id, but only if that thread has been started. The safe solution might be to add a variable to track which thread where started or not. However, I wonder if checking pthread_t variables is possible, something like the following code.
pthread_t thr1 = some_invalid_value; //0 ?
pthread_t thr2 = some_invalid_value;
/* thread 1 and 2 are strated or not depending on various condition */
....
/* cleanup */
if(thr1 != some_invalid_value)
pthread_join(&thr1);
if(thr2 != some_invalid_value)
pthread_join(&thr2);
Where some_invalid_value could be 0, or an implementation dependant 'PTHREAD_INVALID_ID' macro
PS :
My assumption is that pthread_t types are comparable and assignable, assumption based on
PPS :
I wanted to do this, because I thought calling pthread_join on invalid thread id was undefinde behaviour. It is not. However, joining a previously joined thread IS undefined behaviour. Now let's assume the above "function" is called repeatedly.
Unconditionnally calling pthread_join and checking the result might result in calling pthread_join on a previously joined thread.
Your assumption is incorrect to start with. pthread_t objects are opaque. You cannot compare pthread_t types directly in C. You should use pthread_equal instead.
Another consideration is that if pthread_create fails, the contents of your pthread_t will be undefined. It may not be set to your invalid value any more.
My preference is to keep the return values of the pthread_create calls (along with the thread IDs) and use that to determine whether each thread was started correctly.
As suggested by Tony, you can use pthread_self() in this situation.
But do not compare thread_ts using == or !=. Use pthread_equal.
From the pthread_self man page:
Therefore, variables of type pthread_t can't portably be compared using the C equality operator (==); use pthread_equal(3) instead.
I recently ran into this same issue. If pthread_create() failed, I ended up with a undefined, invalid value stored in my phtread_t structure. As a result, I keep a boolean associated with each thread that gets set to true if pthread_create() succeeded.
Then all I need to do is:
void* status;
if (my_thread_running) {
pthread_join(thread, &status);
my_thread_running = false;
}
Unfortunately, on systems where pthread_t is a pointer, pthread_equal() can return equality even though the two args refer to different threads, e.g. a thread can exit and a new thread can be created with the same pthread_t pointer value.
I was porting some code that used pthreads into a C++ application, and I had the same question. I decided it was easier to switch to the C++ std::thread object, which has the .joinable() method to decide whether or not to join, i.e.
if (t.joinable()) t.join();
I found that just calling pthead_join on a bad pthread_t value (as a result of pthread_create failing) caused a seg fault, not just an error return value.
Our issue was that we couldn't know if a pthread had been started or not, so we make it a pointer and allocate/de-allocate it and set it to NULL when not in use:
To start:
pthread_t *pth = NULL;
pth = malloc(sizeof(pthread_t));
int ret = pthread_create(pth, NULL, mythread, NULL);
if( ret != 0 )
{
free(pth);
pth = NULL;
}
And later when we need to join, whether or not the thread was started:
if (pth != NULL)
{
pthread_join(*pth, NULL);
free(pth);
pth = NULL;
}
If you need to respawn threads quickly then the malloc/free cycle is undesirable, but it works for simple cases like ours.
This is an excellent question that I really wish would get more discussion in C++ classes and code tests.
One option for some systems-- which may seem like overkill to you, but has come in handy for me-- is to start a thread which does nothing other than efficiently wait for a tear-down signal, then quit. This thread stays running for the life of the application, going down very late in the shutdown sequence. Until that point, the ID of this thread can effectively be used as an "invalid thread" value-- or, more likely, as an "uninitialized" sentinel-- for most purposes. For example, my debug libraries typically track the threads from which mutexes were locked. This requires initialization of that tracking value to something sensible. Because POSIX rather stupidly declined to require that platforms define an INVALID_THREAD_ID, and because my libraries allow main() to lock things (making the pthread_self checks that are a good solution pthread_create unusable for lock tracking), this is the solution I have come to use. It works on any platform.
Note, however, that you have a little more design work to do if you want this to be able to initialize static thread references to invalid values.
For C++ (not really sure what language the OP was asking about) another simple option (but I think this one is still the simplest as long as you don't need to treat pthread_self() as valid anywhere) would be to use std::optional<pthread_t> (or boost's version if you can't swing C++17 or later, or implement something similar yourself).
Then use .has_value() to check if values are valid, and be happy:
// so for example:
std::optional<pthread_t> create_thread (...) {
pthread_t thread;
if (pthread_create(&thread, ...))
return std::optional<pthread_t>();
else
return thread;
}
// then:
std::optional<pthread_t> id = create_thread(...);
if (id.has_value()) {
pthread_join(id.value(), ...);
} else {
...;
}
You still need to use pthread_equal when comparing valid values, so all the same caveats apply. However, you can reliably compare any value to an invalid value, so stuff like this will be fine:
// the default constructed optional has no value
const std::optional<pthread_t> InvalidID;
pthread_t id1 = /* from somewhere, no concept of 'invalid'. */;
std::optional<pthread_t> id2 = /* from somewhere. */;
// all of these will still work:
if (id1 == InvalidID) { } // ok: always false
if (id1 != InvalidID) { } // ok: always true
if (id2 == InvalidID) { } // ok: true if invalid, false if not.
if (id2 != InvalidID) { } // ok: true if valud, false if not.
Btw, if you also want to give yourself some proper comparison operators, though, or if you want to make a drop-in replacement for a pthread_t (where you don't have to call .value()), you'll have to write your own little wrapper class to handle all the implicit conversions. It's pretty straightforward, but also getting off-topic, so I'll just drop some code here and give info in the comments if anybody asks. Here are 3 options depending on how old of a C++ you want to support. They provide implicit conversion to/from pthread_t (can't do pthread_t* though), so code changes should be minimal:
//-----------------------------------------------------------
// This first one is C++20 only:
#include <optional>
struct thread_id {
thread_id () =default;
thread_id (const pthread_t &t) : t_(t) { }
operator pthread_t () const { return value(); }
friend bool operator == (const thread_id &L, const thread_id &R) {
return (!L.valid() && !R.valid()) || (L.valid() && R.valid() && pthread_equal(L.value(), R.value()));
}
friend bool operator == (const pthread_t &L, const thread_id &R) { return thread_id(L) == R; }
bool valid () const { return t_.has_value(); }
void reset () { t_.reset(); }
pthread_t value () const { return t_.value(); } // throws std::bad_optional_access if !valid()
private:
std::optional<pthread_t> t_;
};
//-----------------------------------------------------------
// This works for C++17 and C++20. Adds a few more operator
// overloads that aren't needed any more in C++20:
#include <optional>
struct thread_id {
// construction / conversion
thread_id () =default;
thread_id (const pthread_t &t) : t_(t) { }
operator pthread_t () const { return value(); }
// comparisons
friend bool operator == (const thread_id &L, const thread_id &R) {
return (!L.valid() && !R.valid()) || (L.valid() && R.valid() && pthread_equal(L.value(), R.value()));
}
friend bool operator == (const thread_id &L, const pthread_t &R) {return L==thread_id(R);}
friend bool operator == (const pthread_t &L, const thread_id &R) {return thread_id(L)==R;}
friend bool operator != (const thread_id &L, const thread_id &R) {return !(L==R);}
friend bool operator != (const thread_id &L, const pthread_t &R) {return L!=thread_id(R);}
friend bool operator != (const pthread_t &L, const thread_id &R) {return thread_id(L)!=R;}
// value access
bool valid () const { return t_.has_value(); }
void reset () { t_.reset(); }
pthread_t value () const { return t_.value(); } // throws std::bad_optional_access if !valid()
private:
std::optional<pthread_t> t_;
};
//-----------------------------------------------------------
// This works for C++11, 14, 17, and 20. It replaces
// std::optional with a flag and a custom exception.
struct bad_pthread_access : public std::runtime_error {
bad_pthread_access () : std::runtime_error("value() called, but !valid()") { }
};
struct thread_id {
thread_id () : valid_(false) { }
thread_id (const pthread_t &t) : thr_(t), valid_(true) { }
operator pthread_t () const { return value(); }
friend bool operator == (const thread_id &L, const thread_id &R) {
return (!L.valid() && !R.valid()) || (L.valid() && R.valid() && pthread_equal(L.value(), R.value()));
}
friend bool operator == (const thread_id &L, const pthread_t &R) { return L==thread_id(R); }
friend bool operator == (const pthread_t &L, const thread_id &R) { return thread_id(L)==R; }
friend bool operator != (const thread_id &L, const thread_id &R) { return !(L==R); }
friend bool operator != (const thread_id &L, const pthread_t &R) { return L!=thread_id(R); }
friend bool operator != (const pthread_t &L, const thread_id &R) { return thread_id(L)!=R; }
bool valid () const { return valid_; }
void reset () { valid_ = false; }
pthread_t value () const { // throws bad_pthread_access if !valid()
if (!valid_) throw bad_pthread_access();
return thr_;
}
private:
pthread_t thr_;
bool valid_;
};
//------------------------------------------------------
/* some random notes:
- `std::optional` doesn't let you specify custom comparison
functions, which would be convenient here.
- You can't write `bool operator == (pthread_t, pthread_t)`
overloads, because they'll conflict with default operators
on systems where `pthread_t` is a primitive type.
- You have to write overloads for all the combos of
pthread_t/thread_id in <= C++17, otherwise resolution is
ambiguous with the implicit conversions.
- *Really* sloppy but thorough test: https://godbolt.org/z/GY639ovzd
Related
I recently tried to implement an atomic reference counter in C, so I referred to the implementation of std::shared_ptr in STL, and I am very confused about the implementation of weak_ptr::lock.
When executing compared_and_exchange, clang specified memory_order_seq_cst, g++ specified memory_order_acq_rel, and MSVC specified memory_order_relaxed.
I think memory_order_relaxed has been enough, since there is no data needed to synchronize if user_count is non-zero.
I am not an expert in this area, can anyone provide some advice?
Following are code snippets:
MSVC
bool _Incref_nz() noexcept { // increment use count if not zero, return true if successful
auto& _Volatile_uses = reinterpret_cast<volatile long&>(_Uses);
#ifdef _M_CEE_PURE
long _Count = *_Atomic_address_as<const long>(&_Volatile_uses);
#else
long _Count = __iso_volatile_load32(reinterpret_cast<volatile int*>(&_Volatile_uses));
#endif
while (_Count != 0) {
const long _Old_value = _INTRIN_RELAXED(_InterlockedCompareExchange)(&_Volatile_uses, _Count + 1, _Count);
if (_Old_value == _Count) {
return true;
}
_Count = _Old_value;
}
return false;
}
clang/libcxx
__shared_weak_count*
__shared_weak_count::lock() noexcept
{
long object_owners = __libcpp_atomic_load(&__shared_owners_);
while (object_owners != -1)
{
if (__libcpp_atomic_compare_exchange(&__shared_owners_,
&object_owners,
object_owners+1))
return this;
}
return nullptr;
}
gcc/libstdc++
template<>
inline bool
_Sp_counted_base<_S_atomic>::
_M_add_ref_lock_nothrow() noexcept
{
// Perform lock-free add-if-not-zero operation.
_Atomic_word __count = _M_get_use_count();
do
{
if (__count == 0)
return false;
// Replace the current counter value with the old value + 1, as
// long as it's not changed meanwhile.
}
while (!__atomic_compare_exchange_n(&_M_use_count, &__count, __count + 1,
true, __ATOMIC_ACQ_REL,
__ATOMIC_RELAXED));
return true;
}
I am trying to answer this question myself.
The standard spec only says that weak_ptr::lock should be executed as an atomic operation, but nothing more about the memory order. So that different threads can invoke directly weak_ptr::lock in parallel without any race condition, and when that happens, different implementations offer different memory_order.
But no matter what, all the above implementations are correct.
There is an object shared by multiple threads to read from and write to, and I need to implement the class with a reader-writer lock which has the following functions:
It might be declared occupied by one and no more than one thread. Any other threads that try to occupy it will be rejected, and continue to do their works rather than be blocked.
Any of the threads are allowed to ask whether the object is occupied by self or by others at any time, except for the time when it is being declared occupied or released.
Only the owner of the object is allowed to release its ownership, though others might try to do it as well. If it is not the owner, the releasing operation will be canceled.
The performance needs to be carefully considered.
I'm doing the work with OpenMP, so I hope to implement the lock using only the APIs within OpenMP, rather than POSIX, or so on. I have read this answer, but there are only solutions for implementations of C++ standard library. As mixing OpenMP with C++ standard library or POSIX thread model may slow down the program, I wonder is there a good solution for OpenMP?
I have tried like this, sometimes it worked fine but sometimes it crashed, and sometimes it was dead locked. I find it hard to debug as well.
class Element
{
public:
typedef int8_t label_t;
Element() : occupied_(-1) {}
// Set it occupied by thread #myThread.
// Return whether it is set successfully.
bool setOccupiedBy(const int myThread)
{
if (lock_.try_lock())
{
if (occupied_ == -1)
{
occupied_ = myThread;
ready_.set(true);
}
}
// assert(lock_.get() && ready_.get());
return occupied_ == myThread;
}
// Return whether it is occupied by other threads
// except for thread #myThread.
bool isOccupiedByOthers(const int myThread) const
{
bool value = true;
while (lock_.get() != ready_.get());
value = occupied_ != -1 && occupied_ != myThread;
return value;
}
// Return whether it is occupied by thread #myThread.
bool isOccupiedBySelf(const int myThread) const
{
bool value = true;
while (lock_.get() != ready_.get());
value = occupied_ == myThread;
return value;
}
// Clear its occupying mark by thread #myThread.
void clearOccupied(const int myThread)
{
while (true)
{
bool ready = ready_.get();
bool lock = lock_.get();
if (!ready && !lock)
return;
if (ready && lock)
break;
}
label_t occupied = occupied_;
if (occupied == myThread)
{
ready_.set(false);
occupied_ = -1;
lock_.unlock();
}
// assert(ready_.get() == lock_.get());
}
protected:
Atomic<label_t> occupied_;
// Locked means it is occupied by one of the threads,
// and one of the threads might be modifying the ownership
MutexLock lock_;
// Ready means it is occupied by one the the threads,
// and none of the threads is modifying the ownership.
Mutex ready_;
};
The atomic variable, mutex, and the mutex lock is implemented with OpenMP instructions as following:
template <typename T>
class Atomic
{
public:
Atomic() {}
Atomic(T&& value) : mutex_(value) {}
T set(const T& value)
{
T oldValue;
#pragma omp atomic capture
{
oldValue = mutex_;
mutex_ = value;
}
return oldValue;
}
T get() const
{
T value;
#pragma omp read
value = mutex_;
return value;
}
operator T() const { return get(); }
Atomic& operator=(const T& value)
{
set(value);
return *this;
}
bool operator==(const T& value) { return get() == value; }
bool operator!=(const T& value) { return get() != value; }
protected:
volatile T mutex_;
};
class Mutex : public Atomic<bool>
{
public:
Mutex() : Atomic<bool>(false) {}
};
class MutexLock : private Mutex
{
public:
void lock()
{
bool oldMutex = false;
while (oldMutex = set(true), oldMutex == true) {}
}
void unlock() { set(false); }
bool try_lock()
{
bool oldMutex = set(true);
return oldMutex == false;
}
using Mutex::operator bool;
using Mutex::get;
};
I also use the lock provided by OpenMP in alternative:
class OmpLock
{
public:
OmpLock() { omp_init_lock(&lock_); }
~OmpLock() { omp_destroy_lock(&lock_); }
void lock() { omp_set_lock(&lock_); }
void unlock() { omp_unset_lock(&lock_); }
int try_lock() { return omp_test_lock(&lock_); }
private:
omp_lock_t lock_;
};
By the way, I use gcc 4.9.4 and OpenMP 4.0, on x86_64 GNU/Linux.
In this code is std::swap thread safe so it can be called from two execution threads at the same time or do I need use atomic_compare_exchange_weak() instead of swap()?
How do I know if this will work on all CPUs? I am happy if it just works on Intel CPUs.
#include <utility>
class resource {
int x = 0;
};
class foo
{
public:
foo() : p{new resource{}}
{ }
foo(const foo& other) : p{new resource{*(other.p)}}
{ }
foo(foo&& other) : p{other.p}
{
other.p = nullptr;
}
foo& operator=(foo other)
{
swap(*this, other);
return *this;
}
virtual ~foo()
{
delete p;
}
friend void swap(foo& first, foo& second)
{
using std::swap;
swap(first.p, second.p);
}
private:
resource* p;
};
I understand it is overkill to swap a pointer, but this migth be good pracise.
is std::swap thread safe so it can be called from two execution threads at the same time
std::swap is thread-safe as long as different threads pass different objects into it. Otherwise a race condition arises.
bounded blocking queue is famous, of course. There are mostly 2 methods to implement it. I try to understand which way is better:
Method 1: use counting semaphore
void *producer(void *arg) {
int i;
for (i = 0; i < loops; i++) {
sem_wait(&empty);
sem_wait(&mutex);
put(i);
sem_post(&mutex);
sem_post(&full);
}
}
void *consumer(void *arg) {
int i;
for (i = 0; i < loops; i++) {
sem_wait(&full);
sem_wait(&mutex);
int tmp = get();
sem_post(&mutex);
sem_post(&empty);
printf("%d\n", tmp);
}
}
Method 2: classic monitor pattern
class BoundedBuffer {
private:
int buffer[MAX];
int fill, use;
int fullEntries;
pthread_mutex_t monitor; // monitor lock
pthread_cond_t empty;
pthread_cond_t full;
public:
BoundedBuffer() {
use = fill = fullEntries = 0;
}
void produce(int element) {
pthread_mutex_lock(&monitor);
while (fullEntries == MAX)
pthread_cond_wait(&empty, &monitor);
//do something
pthread_cond_signal(&full);
pthread_mutex_unlock(&monitor);
}
int consume() {
pthread_mutex_lock(&monitor);
while (fullEntries == 0)
pthread_cond_wait(&full, &monitor);
//do something
pthread_cond_signal(&empty);
pthread_mutex_unlock(&monitor);
return tmp;
}
}
I understand the 2nd method can solve a lot of other problems. But how to compare these 2 methods? Looks like they can both fulfill the task.
Is there any link on detailed comparision?
Appreciate your help.
Thanks.
The big difference between those two methods is that the first one does not use pthread_ specific functions (semaphores are not part of pthread) and as such is not guaranteed to work in multithreaded enviornment.
In particular, semaphores do not protect memory ordering, so things written in one thread might not be readable on another. Mutexes are suitable for multi-thread message queue.
Below is an attempt at a multiple reader / writer shared data which uses std::atomics and busy waits instead of mutex and condition variables to synchronize between readers and writers. I am puzzled as to why the asserts in there are being hit. I'm sure there's a bug somewhere in the logic, but I'm not certain where it is.
The idea behind the implementation is that threads that read are spinning until the writer is done writing. As they enter the read function they increase m_numReaders count and as they are waiting for the writer they increase the m_numWaiting count.
The idea is that the m_numWaiting should then always be smaller or equal to m_numReaders, provided m_numWaiting is always incremented after m_numReaders and decremented before m_numReaders.
There shouldn't be a case where m_numWaiting is bigger than m_numReaders (or I'm not seeing it) since a reader always increments the reader counter first and only sometimes increments the waiting counter and the waiting counter is always decremented first.
Yet, this seems to be whats happening because the asserts are being hit.
Can someone point out the logic error, if you see it?
Thanks!
#include <iostream>
#include <thread>
#include <assert.h>
template<typename T>
class ReadWrite
{
public:
ReadWrite() : m_numReaders(0), m_numWaiting(0), m_writing(false)
{
m_writeFlag.clear();
}
template<typename functor>
void read(functor& readFunc)
{
m_numReaders++;
std::atomic<bool>waiting(false);
while (m_writing)
{
if(!waiting)
{
m_numWaiting++; // m_numWaiting should always be increased after m_numReaders
waiting = true;
}
}
assert(m_numWaiting <= m_numReaders);
readFunc(&m_data);
assert(m_numWaiting <= m_numReaders); // <-- These asserts get hit ?
if(waiting)
{
m_numWaiting--; // m_numWaiting should always be decreased before m_numReaders
}
m_numReaders--;
assert(m_numWaiting <= m_numReaders); // <-- These asserts get hit ?
}
//
// Only a single writer can operate on this at any given time.
//
template<typename functor>
void write(functor& writeFunc)
{
while (m_writeFlag.test_and_set());
// Ensure no readers present
while (m_numReaders);
// At this point m_numReaders may have been increased !
m_writing = true;
// If a reader entered before the writing flag was set, wait for
// it to finish
while (m_numReaders > m_numWaiting);
writeFunc(&m_data);
m_writeFlag.clear();
m_writing = false;
}
private:
T m_data;
std::atomic<int64_t> m_numReaders;
std::atomic<int64_t> m_numWaiting;
std::atomic<bool> m_writing;
std::atomic_flag m_writeFlag;
};
int main(int argc, const char * argv[])
{
const size_t numReaders = 2;
const size_t numWriters = 1;
const size_t numReadWrites = 10000000;
std::thread readThreads[numReaders];
std::thread writeThreads[numWriters];
ReadWrite<int> dummyData;
auto writeFunc = [&](int* pData) { return; }; // intentionally empty
auto readFunc = [&](int* pData) { return; }; // intentionally empty
auto readThreadProc = [&]()
{
size_t numReads = numReadWrites;
while (numReads--)
{
dummyData.read(readFunc);
}
};
auto writeThreadProc = [&]()
{
size_t numWrites = numReadWrites;
while (numWrites--)
{
dummyData.write(writeFunc);
}
};
for (std::thread& thread : writeThreads) { thread = std::thread(writeThreadProc);}
for (std::thread& thread : readThreads) { thread = std::thread(readThreadProc);}
for (std::thread& thread : writeThreads) { thread.join();}
for (std::thread& thread : readThreads) { thread.join();}
}