how to implement std::weak_ptr::lock with atomic operations? - visual-c++

I recently tried to implement an atomic reference counter in C, so I referred to the implementation of std::shared_ptr in STL, and I am very confused about the implementation of weak_ptr::lock.
When executing compared_and_exchange, clang specified memory_order_seq_cst, g++ specified memory_order_acq_rel, and MSVC specified memory_order_relaxed.
I think memory_order_relaxed has been enough, since there is no data needed to synchronize if user_count is non-zero.
I am not an expert in this area, can anyone provide some advice?
Following are code snippets:
MSVC
bool _Incref_nz() noexcept { // increment use count if not zero, return true if successful
auto& _Volatile_uses = reinterpret_cast<volatile long&>(_Uses);
#ifdef _M_CEE_PURE
long _Count = *_Atomic_address_as<const long>(&_Volatile_uses);
#else
long _Count = __iso_volatile_load32(reinterpret_cast<volatile int*>(&_Volatile_uses));
#endif
while (_Count != 0) {
const long _Old_value = _INTRIN_RELAXED(_InterlockedCompareExchange)(&_Volatile_uses, _Count + 1, _Count);
if (_Old_value == _Count) {
return true;
}
_Count = _Old_value;
}
return false;
}
clang/libcxx
__shared_weak_count*
__shared_weak_count::lock() noexcept
{
long object_owners = __libcpp_atomic_load(&__shared_owners_);
while (object_owners != -1)
{
if (__libcpp_atomic_compare_exchange(&__shared_owners_,
&object_owners,
object_owners+1))
return this;
}
return nullptr;
}
gcc/libstdc++
template<>
inline bool
_Sp_counted_base<_S_atomic>::
_M_add_ref_lock_nothrow() noexcept
{
// Perform lock-free add-if-not-zero operation.
_Atomic_word __count = _M_get_use_count();
do
{
if (__count == 0)
return false;
// Replace the current counter value with the old value + 1, as
// long as it's not changed meanwhile.
}
while (!__atomic_compare_exchange_n(&_M_use_count, &__count, __count + 1,
true, __ATOMIC_ACQ_REL,
__ATOMIC_RELAXED));
return true;
}

I am trying to answer this question myself.
The standard spec only says that weak_ptr::lock should be executed as an atomic operation, but nothing more about the memory order. So that different threads can invoke directly weak_ptr::lock in parallel without any race condition, and when that happens, different implementations offer different memory_order.
But no matter what, all the above implementations are correct.

Related

multiple-readers, single-writer locks in OpenMP

There is an object shared by multiple threads to read from and write to, and I need to implement the class with a reader-writer lock which has the following functions:
It might be declared occupied by one and no more than one thread. Any other threads that try to occupy it will be rejected, and continue to do their works rather than be blocked.
Any of the threads are allowed to ask whether the object is occupied by self or by others at any time, except for the time when it is being declared occupied or released.
Only the owner of the object is allowed to release its ownership, though others might try to do it as well. If it is not the owner, the releasing operation will be canceled.
The performance needs to be carefully considered.
I'm doing the work with OpenMP, so I hope to implement the lock using only the APIs within OpenMP, rather than POSIX, or so on. I have read this answer, but there are only solutions for implementations of C++ standard library. As mixing OpenMP with C++ standard library or POSIX thread model may slow down the program, I wonder is there a good solution for OpenMP?
I have tried like this, sometimes it worked fine but sometimes it crashed, and sometimes it was dead locked. I find it hard to debug as well.
class Element
{
public:
typedef int8_t label_t;
Element() : occupied_(-1) {}
// Set it occupied by thread #myThread.
// Return whether it is set successfully.
bool setOccupiedBy(const int myThread)
{
if (lock_.try_lock())
{
if (occupied_ == -1)
{
occupied_ = myThread;
ready_.set(true);
}
}
// assert(lock_.get() && ready_.get());
return occupied_ == myThread;
}
// Return whether it is occupied by other threads
// except for thread #myThread.
bool isOccupiedByOthers(const int myThread) const
{
bool value = true;
while (lock_.get() != ready_.get());
value = occupied_ != -1 && occupied_ != myThread;
return value;
}
// Return whether it is occupied by thread #myThread.
bool isOccupiedBySelf(const int myThread) const
{
bool value = true;
while (lock_.get() != ready_.get());
value = occupied_ == myThread;
return value;
}
// Clear its occupying mark by thread #myThread.
void clearOccupied(const int myThread)
{
while (true)
{
bool ready = ready_.get();
bool lock = lock_.get();
if (!ready && !lock)
return;
if (ready && lock)
break;
}
label_t occupied = occupied_;
if (occupied == myThread)
{
ready_.set(false);
occupied_ = -1;
lock_.unlock();
}
// assert(ready_.get() == lock_.get());
}
protected:
Atomic<label_t> occupied_;
// Locked means it is occupied by one of the threads,
// and one of the threads might be modifying the ownership
MutexLock lock_;
// Ready means it is occupied by one the the threads,
// and none of the threads is modifying the ownership.
Mutex ready_;
};
The atomic variable, mutex, and the mutex lock is implemented with OpenMP instructions as following:
template <typename T>
class Atomic
{
public:
Atomic() {}
Atomic(T&& value) : mutex_(value) {}
T set(const T& value)
{
T oldValue;
#pragma omp atomic capture
{
oldValue = mutex_;
mutex_ = value;
}
return oldValue;
}
T get() const
{
T value;
#pragma omp read
value = mutex_;
return value;
}
operator T() const { return get(); }
Atomic& operator=(const T& value)
{
set(value);
return *this;
}
bool operator==(const T& value) { return get() == value; }
bool operator!=(const T& value) { return get() != value; }
protected:
volatile T mutex_;
};
class Mutex : public Atomic<bool>
{
public:
Mutex() : Atomic<bool>(false) {}
};
class MutexLock : private Mutex
{
public:
void lock()
{
bool oldMutex = false;
while (oldMutex = set(true), oldMutex == true) {}
}
void unlock() { set(false); }
bool try_lock()
{
bool oldMutex = set(true);
return oldMutex == false;
}
using Mutex::operator bool;
using Mutex::get;
};
I also use the lock provided by OpenMP in alternative:
class OmpLock
{
public:
OmpLock() { omp_init_lock(&lock_); }
~OmpLock() { omp_destroy_lock(&lock_); }
void lock() { omp_set_lock(&lock_); }
void unlock() { omp_unset_lock(&lock_); }
int try_lock() { return omp_test_lock(&lock_); }
private:
omp_lock_t lock_;
};
By the way, I use gcc 4.9.4 and OpenMP 4.0, on x86_64 GNU/Linux.

context switch in the source

The folowing source is an queue class :
template<typename T>
class mpmc_bounded_queue
{
public:
bool enqueue(T const& data)
{
cell_t* cell;
size_t pos ;
for (;;)
{
pos = enqueue_pos_ ;
cell = &buffer_[pos & buffer_mask_];
size_t seq = cell->sequence_;
intptr_t dif = (intptr_t)seq - (intptr_t)pos;
if (dif == 0)
{
//spot A
if (__sync_bool_compare_and_swap(&enqueue_pos_,pos,pos+1) )
break;
} else if (dif < 0)
{
return false;
}else{
pos = enqueue_pos_;
}
}
// spot B
cell->data_ = data;
cell->sequence_ = pos + 1 ;
return true;
} //enqueue
private:
struct cell_t
{
size_t sequence_;
T data_;
};
} ;
In RedHat Enterprise Linux 7.0 x86_64 , Is it possible that context switch
happened in finishing __sync_bool_compare_and_swap ( spot A) but not yet to
execute cell->data_ = data (spot B) ?
I've told that context switch would happened in like recv , send , usleep,
something to do with I/O function , in this case , while many threads execute
enqueue , there is not possible that there exist an possibility a thread finish
__sync_bool_compare_and_swap return true but at this important moment it is context switch out before it execute cell->data_ = data , Is it true ?
or context switch was not possible happened between spot A and spot B ?
Is it true ?
Yes: a context switch can happen at any point between A and B.
That shouldn't affect the algorithm though, which appears to be correct: if __sync_bool_compare_and_swap returned true, then you have atomically reserved the cell at pos, and nobody else will interfere with that cell, so whether the context switch happens between A and B or not is irrelevant.

C++11 Thread Safety of Atomic Containers

I am trying to implement a thread safe STL vector without mutexes. So I followed through this post and implemented a wrapper for the atomic primitives.
However when I ran the code below, it displayed out Failed!twice from the below code (only two instances of race conditions) so it doesn't seem to be thread safe. I'm wondering how can I fix that?
Wrapper Class
template<typename T>
struct AtomicVariable
{
std::atomic<T> atomic;
AtomicVariable() : atomic(T()) {}
explicit AtomicVariable(T const& v) : atomic(v) {}
explicit AtomicVariable(std::atomic<T> const& a) : atomic(a.load()) {}
AtomicVariable(AtomicVariable const&other) :
atomic(other.atomic.load()) {}
inline AtomicVariable& operator=(AtomicVariable const &rhs) {
atomic.store(rhs.atomic.load());
return *this;
}
inline AtomicVariable& operator+=(AtomicVariable const &rhs) {
atomic.store(rhs.atomic.load() + atomic.load());
return *this;
}
inline bool operator!=(AtomicVariable const &rhs) {
return !(atomic.load() == rhs.atomic.load());
}
};
typedef AtomicVariable<int> AtomicInt;
Functions and Testing
// Vector of 100 elements.
vector<AtomicInt> common(100, AtomicInt(0));
void add10(vector<AtomicInt> &param){
for (vector<AtomicInt>::iterator it = param.begin();
it != param.end(); ++it){
*it += AtomicInt(10);
}
}
void add100(vector<AtomicInt> &param){
for (vector<AtomicInt>::iterator it = param.begin();
it != param.end(); ++it){
*it += AtomicInt(100);
}
}
void doParallelProcessing(){
// Create threads
std::thread t1(add10, std::ref(common));
std::thread t2(add100, std::ref(common));
// Join 'em
t1.join();
t2.join();
// Print vector again
for (vector<AtomicInt>::iterator it = common.begin();
it != common.end(); ++it){
if (*it != AtomicInt(110)){
cout << "Failed!" << endl;
}
}
}
int main(int argc, char *argv[]) {
// Just for testing purposes
for (int i = 0; i < 100000; i++){
// Reset vector
common.clear();
common.resize(100, AtomicInt(0));
doParallelProcessing();
}
}
Is there such a thing as an atomic container? I've also tested this with a regular vector<int> it didn't have any Failed output but that might just be a coincidence.
Just write operator += as:
inline AtomicVariable& operator+=(AtomicVariable const &rhs) {
atomic += rhs.atomic;
return *this;
}
In documentation: http://en.cppreference.com/w/cpp/atomic/atomic operator += is atomic.
Your example fails because below scenario of execution is possible:
Thread1 - rhs.atomic.load() - returns 10 ; Thread2 - rhs.atomic.load() - returns 100
Thread1 - atomic.load() - returns 0 ; Thread2 - atomic.load - returns 0
Thread1 - add values (0 + 10 = 10) ; Thread2 - add values (0 + 100)
Thread1 - atomic.store(10) ; Thread2 - atomic.store(100)
Finally in this case in atomic value might be 10 or 100, depends of which thread first execute atomic.store.
please note that
atomic.store(rhs.atomic.load() + atomic.load());
is not atomic
You have two options to solve it.
memoery
1) Use a mutex.
EDIT as T.C mentioned in the comments this is irrelevant since the operation here will be load() then load() then store() (not relaxed mode) - so memory order is not related here.
2) Use memory order http://bartoszmilewski.com/2008/12/01/c-atomics-and-memory-ordering/
memory_order_acquire: guarantees that subsequent loads are not moved before the current load or any preceding loads.
memory_order_release: preceding stores are not moved past the current store or any subsequent stores.
I'm still not sure about 2, but I think if the stores will not be on parallel, it will work.

Unzipping archive on Windows 8

I have previously used MiniZip (zlib wrapper) to unzip archives. MiniZip cannot be used for Metro applications as it uses deprecated APIs in "iowin32.c" -- CreateFile() and SetFilePointer().
I thought that would be an easy fix and created "iowinrt.c" with CreateFile() and SetFilePointer() replaced with CreateFile2() and SetFilePointerEx(). While this way I obtained a version of MiniZip that uses only approved Win8 APIs, it still turned out to be useless -- I forgot about sandboxing. If I pick a file using FileOpenPicker() and pass its path to my modified MiniZip I still cannot open it -- CreateFile2() will fail with "Access is denied." message.
So it appears that old C API for file access if now mostly useless; it is my understanding that in order to fix this I would need to reimplement my "iowinrt" in C++/CX using the new async file access. Are there any other options? I think I saw somewhere that WinRT does have compress/uncompress functionality but that it only works on individual files, not archives.
Additional requirements it that I need this to work in memory.
For a moment I thought I had a solution via .NET Framework 4.5:
I found this piece of info about how to create .NET classes that can be used from C++/CX:
http://social.msdn.microsoft.com/Forums/en-US/winappswithnativecode/thread/3ff383d0-0c9f-4a30-8987-ff2b23957f01
.NET Framework 4.5 contains ZipArchive and ZipArchiveEntry classes in System.IO.Compression:
http://msdn.microsoft.com/en-us/library/system.io.compression.ziparchive%28v=vs.110%29.aspx#Y0
http://msdn.microsoft.com/en-us/library/system.io.compression.ziparchiveentry%28v=vs.110%29.aspx#Y0
I thought I could create C# Metro Class Library with WinMD Output type exposing ZipArchive and ZipArchiveEntry then use that in my C++/CX project. However, even if it worked it would not work in-memory; it appears that ZipArchive and ZipArchiveEntry work only with files.
Got reading from archive working. Explanation and code below but really just a hack at this point, to see if it's possible at all. I just kept modifying things until I got something working; this is just an example of what works and by no means a production quality code (it's not re-entrant for start). There are undoubtedly many things that are bad/unnecessary/wtf so feel free to use comments to help with clean up.
As mentioned previously, it is no longer enough to pass path to the library -- unless file is in one of KnownFolders (documents, home, media, music, pictures, removable or videos) you end up with "access is denied" message. Instead, library must be able to accept StorageFile^, as returned from FileOpenPicker. At least I haven't found any other way to do it, maybe someone knows better?
MiniZip provides Windows filesystem access layer for zlib via iowin32.h/.c. This still works in desktop mode for old-style apps, but does not work for Metro apps as it uses deprecated APIs and relies on paths. To get MiniZip going on Windows 8, a complete rewrite of iowin32 is required.
To get things working again, first thing was to find a way to pass StorageFile^ all the way down to iowinrt (Windows 8 replacement for iowin32). Fortunately, that was not a problem as MiniZip provides two styles of open file functions -- ones that accept pointer to char, and the others accepting pointer to void. Since ^ is still just a pointer, casting StorageFile^ to void* and than back to StorageFile^ works fine.
Now that I was able to pass StorageFile^ to my new iowinrt, the next problem was how to make new async C++ file access API work with Zlib. In order to support very old C compilers, Zlib is written with old K&R style C. VisualStudio compiler will refuse to compile this as C++, it has to be compiled as C, and new iowinrt must be compiled as C++ of course -- keep that in mind when creating your project. Other things to note about VS project is that I did it as Visual C++ Windows Metro style Static Library although DLL should also work but then you must also define macro to export MiniZip API (I haven't tried this, don't know which macro you have to use). I think I also had to set "Consume Windows Runtime Extension" (/ZW), set "Not Using Precompiled Headers" and add _CRT_SECURE_NO_WARNINGS and _CRT_NONSTDC_NO_WARNINGS to Preprocessor Definitions.
As for iowinrt itself, I've split it in two files. One holds two sealed ref classes -- reader and writer objects; they accept StorageFile^. Reader implements Read, Tell, SeekFromBeginning, SeekFromCurrent and SeekFromEnd (the reason for 3 Seek methods is because ref sealed classes have to stick with RT types and that apparently excludes enums so I just took the easy route). Writer implements just Write at the moment, haven't used it yet.
This is FileReader code:
#include "pch.h"
#include "FileAccess.h" // FileReader and FileWriter
using namespace Concurrency;
using namespace Windows::Security::Cryptography;
using namespace CFileAccess;
FileReader::FileReader(StorageFile^ archive)
{
if (nullptr != archive)
{
create_task(archive->OpenReadAsync()).then([this](IRandomAccessStreamWithContentType^ archiveStream)
{
if (nullptr != archiveStream)
{
_readStream = archiveStream;
}
}).wait();
}
} // end of constructor
int32 FileReader::Read(WriteOnlyArray<byte>^ fileData)
{
int32 bytesRead = 0;
if ((nullptr != _readStream) && (fileData->Length > 0))
{
try
{
auto inputStreamReader = ref new DataReader(_readStream);
create_task(inputStreamReader->LoadAsync(fileData->Length)).then([&](task<unsigned int> dataRead)
{
try
{
bytesRead = dataRead.get();
if (bytesRead)
{
inputStreamReader->ReadBytes(fileData);
}
}
catch (Exception^ e)
{
bytesRead = -1;
}
inputStreamReader->DetachStream();
}).wait();
}
catch (Exception^ e)
{
bytesRead = -1;
}
}
return (bytesRead);
} // end of method Read()
int64 FileReader::Tell(void)
{
int64 ret = -1;
if (nullptr != _readStream)
{
ret = _readStream->Position;
}
return (ret);
} // end of method Tell()
int64 FileReader::SeekFromBeginning(uint64 offset)
{
int64 ret = -1;
if ((nullptr != _readStream) && (offset < _readStream->Size))
{
_readStream->Seek(offset);
ret = 0;
}
return (ret);
} // end of method SeekFromBeginning()
int64 FileReader::SeekFromCurrent(uint64 offset)
{
int64 ret = -1;
if ((nullptr != _readStream) && ((_readStream->Position + offset) < _readStream->Size))
{
_readStream->Seek(_readStream->Position + offset);
ret = 0;
}
return (ret);
} // end of method SeekFromCurrent()
int64 FileReader::SeekFromEnd(uint64 offset)
{
int64 ret = -1;
if ((nullptr != _readStream) && ((_readStream->Size - offset) >= 0))
{
_readStream->Seek(_readStream->Size - offset);
ret = 0;
}
return (ret);
} // end of method SeekFromEnd()
iowinrt sits between MiniZip and FileReader (and FileWriter). It's too long to give everything here but this should be sufficient to reconstruct the rest since it's mostly just more of the same with different function names, plus a bunch of fill_winRT_filefuncxxx() which are obvious:
#include "zlib.h"
#include "ioapi.h"
#include "iowinrt.h"
#include "FileAccess.h"
using namespace Windows::Security::Cryptography;
using namespace Platform;
using namespace CFileAccess;
static FileReader^ g_fileReader = nullptr;
static FileWriter^ g_fileWriter = nullptr;
static StorageFile^ g_storageFile = nullptr;
[...]
static voidpf winRT_translate_open_mode(int mode)
{
if (nullptr != g_storageFile)
{
if ((mode & ZLIB_FILEFUNC_MODE_READWRITEFILTER)==ZLIB_FILEFUNC_MODE_READ)
{
g_fileWriter = nullptr;
g_fileReader = ref new FileReader(g_storageFile);
}
else if (mode & ZLIB_FILEFUNC_MODE_EXISTING)
{
g_fileReader = nullptr;
g_fileWriter = ref new FileWriter(g_storageFile);
}
else if (mode & ZLIB_FILEFUNC_MODE_CREATE)
{
g_fileReader = nullptr;
g_fileWriter = ref new FileWriter(g_storageFile);
}
}
return (nullptr != g_fileReader ? reinterpret_cast<voidpf>(g_fileReader) : reinterpret_cast<voidpf>(g_fileWriter));
}
voidpf ZCALLBACK winRT_open64_file_func (voidpf opaque,const void* storageFile,int mode)
{
g_storageFile = reinterpret_cast<StorageFile^>(const_cast<void*>(storageFile));
return (winRT_translate_open_mode(mode));
}
[...]
Long ZCALLBACK winRT_read_file_func (voidpf opaque, voidpf stream, void* buf,uLong size)
{
uLong bytesRead = 0;
if (nullptr != g_fileReader)
{
auto fileData = ref new Platform::Array<byte>(size);
bytesRead = g_fileReader->Read(fileData);
memcpy(buf, fileData->Data, fileData->Length);
}
return (bytesRead);
}
uLong ZCALLBACK winRT_write_file_func (voidpf opaque,voidpf stream,const void* buf,uLong size)
{
uLong bytesWritten = 0;
if (nullptr != g_fileWriter)
{
auto bytes = ref new Array<uint8>(reinterpret_cast<uint8*>(const_cast<void*>(buf)), size);
IBuffer ^writeBuffer = CryptographicBuffer::CreateFromByteArray(bytes);
bytesWritten = g_fileWriter->Write(writeBuffer);
}
return (bytesWritten);
}
long ZCALLBACK winRT_tell_file_func (voidpf opaque,voidpf stream)
{
long long ret = 0;
if (nullptr != g_fileReader)
{
ret = g_fileReader->Tell();
}
return (static_cast<long>(ret));
}
ZPOS64_T ZCALLBACK winRT_tell64_file_func (voidpf opaque, voidpf stream)
{
ZPOS64_T ret = 0;
if (nullptr != g_fileReader)
{
ret = g_fileReader->Tell();
}
return (ret);
}
[...]
long ZCALLBACK winRT_seek64_file_func (voidpf opaque, voidpf stream,ZPOS64_T offset,int origin)
{
long long ret = -1;
if (nullptr != g_fileReader)
{
switch (origin)
{
case ZLIB_FILEFUNC_SEEK_CUR :
ret = g_fileReader->SeekFromCurrent(offset);
break;
case ZLIB_FILEFUNC_SEEK_END :
ret = g_fileReader->SeekFromEnd(offset);
break;
case ZLIB_FILEFUNC_SEEK_SET :
ret = g_fileReader->SeekFromBeginning(offset);
break;
default:
// should never happen!
ret = -1;
break;
}
}
return (static_cast<long>(ret));
}
int ZCALLBACK winRT_close_file_func (voidpf opaque, voidpf stream)
{
g_fileWriter = nullptr;
g_fileReader = nullptr;
return (0);
}
int ZCALLBACK winRT_error_file_func (voidpf opaque,voidpf stream)
{
/// #todo Get errors from FileAccess
return (0);
}
This is enough to get MiniZip going (at least for reading) but you have to take care how you call MiniZip functions -- since Metro is all about async and blocking UI thread will end up with exception, you must wrap access in tasks:
FileOpenPicker^ openPicker = ref new FileOpenPicker();
openPicker->ViewMode = PickerViewMode::List;
openPicker->SuggestedStartLocation = PickerLocationId::ComputerFolder;
openPicker->FileTypeFilter->Append(".zip");
task<IVectorView<StorageFile^>^>(openPicker->PickMultipleFilesAsync()).then([this](IVectorView<StorageFile^>^ files)
{
if (files->Size > 0)
{
std::for_each(begin(files), end(files), [this](StorageFile ^file)
{ // open selected zip archives
create_task([this, file]()
{
OpenArchive(file);
[...]
});
});
}
else
{
rootPage->NotifyUserBackgroundThread("No files were returned.", NotifyType::ErrorMessage);
}
});
[...]
bool OpenArchive(StorageFile^ archive)
{
bool isArchiveOpened = false;
if (nullptr != archive)
{ // open ZIP archive
zlib_filefunc64_def ffunc;
fill_winRT_filefunc64(&ffunc);
unzFile archiveObject = NULL;
create_task([this, &ffunc, archive]()
{
archiveObject = unzOpen2_64(reinterpret_cast<const void*>(archive), &ffunc);
}).wait();
if (NULL != archiveObject)
{
[...]

Is there an invalid pthread_t id?

I would like to call pthread_join for a given thread id, but only if that thread has been started. The safe solution might be to add a variable to track which thread where started or not. However, I wonder if checking pthread_t variables is possible, something like the following code.
pthread_t thr1 = some_invalid_value; //0 ?
pthread_t thr2 = some_invalid_value;
/* thread 1 and 2 are strated or not depending on various condition */
....
/* cleanup */
if(thr1 != some_invalid_value)
pthread_join(&thr1);
if(thr2 != some_invalid_value)
pthread_join(&thr2);
Where some_invalid_value could be 0, or an implementation dependant 'PTHREAD_INVALID_ID' macro
PS :
My assumption is that pthread_t types are comparable and assignable, assumption based on
PPS :
I wanted to do this, because I thought calling pthread_join on invalid thread id was undefinde behaviour. It is not. However, joining a previously joined thread IS undefined behaviour. Now let's assume the above "function" is called repeatedly.
Unconditionnally calling pthread_join and checking the result might result in calling pthread_join on a previously joined thread.
Your assumption is incorrect to start with. pthread_t objects are opaque. You cannot compare pthread_t types directly in C. You should use pthread_equal instead.
Another consideration is that if pthread_create fails, the contents of your pthread_t will be undefined. It may not be set to your invalid value any more.
My preference is to keep the return values of the pthread_create calls (along with the thread IDs) and use that to determine whether each thread was started correctly.
As suggested by Tony, you can use pthread_self() in this situation.
But do not compare thread_ts using == or !=. Use pthread_equal.
From the pthread_self man page:
Therefore, variables of type pthread_t can't portably be compared using the C equality operator (==); use pthread_equal(3) instead.
I recently ran into this same issue. If pthread_create() failed, I ended up with a undefined, invalid value stored in my phtread_t structure. As a result, I keep a boolean associated with each thread that gets set to true if pthread_create() succeeded.
Then all I need to do is:
void* status;
if (my_thread_running) {
pthread_join(thread, &status);
my_thread_running = false;
}
Unfortunately, on systems where pthread_t is a pointer, pthread_equal() can return equality even though the two args refer to different threads, e.g. a thread can exit and a new thread can be created with the same pthread_t pointer value.
I was porting some code that used pthreads into a C++ application, and I had the same question. I decided it was easier to switch to the C++ std::thread object, which has the .joinable() method to decide whether or not to join, i.e.
if (t.joinable()) t.join();
I found that just calling pthead_join on a bad pthread_t value (as a result of pthread_create failing) caused a seg fault, not just an error return value.
Our issue was that we couldn't know if a pthread had been started or not, so we make it a pointer and allocate/de-allocate it and set it to NULL when not in use:
To start:
pthread_t *pth = NULL;
pth = malloc(sizeof(pthread_t));
int ret = pthread_create(pth, NULL, mythread, NULL);
if( ret != 0 )
{
free(pth);
pth = NULL;
}
And later when we need to join, whether or not the thread was started:
if (pth != NULL)
{
pthread_join(*pth, NULL);
free(pth);
pth = NULL;
}
If you need to respawn threads quickly then the malloc/free cycle is undesirable, but it works for simple cases like ours.
This is an excellent question that I really wish would get more discussion in C++ classes and code tests.
One option for some systems-- which may seem like overkill to you, but has come in handy for me-- is to start a thread which does nothing other than efficiently wait for a tear-down signal, then quit. This thread stays running for the life of the application, going down very late in the shutdown sequence. Until that point, the ID of this thread can effectively be used as an "invalid thread" value-- or, more likely, as an "uninitialized" sentinel-- for most purposes. For example, my debug libraries typically track the threads from which mutexes were locked. This requires initialization of that tracking value to something sensible. Because POSIX rather stupidly declined to require that platforms define an INVALID_THREAD_ID, and because my libraries allow main() to lock things (making the pthread_self checks that are a good solution pthread_create unusable for lock tracking), this is the solution I have come to use. It works on any platform.
Note, however, that you have a little more design work to do if you want this to be able to initialize static thread references to invalid values.
For C++ (not really sure what language the OP was asking about) another simple option (but I think this one is still the simplest as long as you don't need to treat pthread_self() as valid anywhere) would be to use std::optional<pthread_t> (or boost's version if you can't swing C++17 or later, or implement something similar yourself).
Then use .has_value() to check if values are valid, and be happy:
// so for example:
std::optional<pthread_t> create_thread (...) {
pthread_t thread;
if (pthread_create(&thread, ...))
return std::optional<pthread_t>();
else
return thread;
}
// then:
std::optional<pthread_t> id = create_thread(...);
if (id.has_value()) {
pthread_join(id.value(), ...);
} else {
...;
}
You still need to use pthread_equal when comparing valid values, so all the same caveats apply. However, you can reliably compare any value to an invalid value, so stuff like this will be fine:
// the default constructed optional has no value
const std::optional<pthread_t> InvalidID;
pthread_t id1 = /* from somewhere, no concept of 'invalid'. */;
std::optional<pthread_t> id2 = /* from somewhere. */;
// all of these will still work:
if (id1 == InvalidID) { } // ok: always false
if (id1 != InvalidID) { } // ok: always true
if (id2 == InvalidID) { } // ok: true if invalid, false if not.
if (id2 != InvalidID) { } // ok: true if valud, false if not.
Btw, if you also want to give yourself some proper comparison operators, though, or if you want to make a drop-in replacement for a pthread_t (where you don't have to call .value()), you'll have to write your own little wrapper class to handle all the implicit conversions. It's pretty straightforward, but also getting off-topic, so I'll just drop some code here and give info in the comments if anybody asks. Here are 3 options depending on how old of a C++ you want to support. They provide implicit conversion to/from pthread_t (can't do pthread_t* though), so code changes should be minimal:
//-----------------------------------------------------------
// This first one is C++20 only:
#include <optional>
struct thread_id {
thread_id () =default;
thread_id (const pthread_t &t) : t_(t) { }
operator pthread_t () const { return value(); }
friend bool operator == (const thread_id &L, const thread_id &R) {
return (!L.valid() && !R.valid()) || (L.valid() && R.valid() && pthread_equal(L.value(), R.value()));
}
friend bool operator == (const pthread_t &L, const thread_id &R) { return thread_id(L) == R; }
bool valid () const { return t_.has_value(); }
void reset () { t_.reset(); }
pthread_t value () const { return t_.value(); } // throws std::bad_optional_access if !valid()
private:
std::optional<pthread_t> t_;
};
//-----------------------------------------------------------
// This works for C++17 and C++20. Adds a few more operator
// overloads that aren't needed any more in C++20:
#include <optional>
struct thread_id {
// construction / conversion
thread_id () =default;
thread_id (const pthread_t &t) : t_(t) { }
operator pthread_t () const { return value(); }
// comparisons
friend bool operator == (const thread_id &L, const thread_id &R) {
return (!L.valid() && !R.valid()) || (L.valid() && R.valid() && pthread_equal(L.value(), R.value()));
}
friend bool operator == (const thread_id &L, const pthread_t &R) {return L==thread_id(R);}
friend bool operator == (const pthread_t &L, const thread_id &R) {return thread_id(L)==R;}
friend bool operator != (const thread_id &L, const thread_id &R) {return !(L==R);}
friend bool operator != (const thread_id &L, const pthread_t &R) {return L!=thread_id(R);}
friend bool operator != (const pthread_t &L, const thread_id &R) {return thread_id(L)!=R;}
// value access
bool valid () const { return t_.has_value(); }
void reset () { t_.reset(); }
pthread_t value () const { return t_.value(); } // throws std::bad_optional_access if !valid()
private:
std::optional<pthread_t> t_;
};
//-----------------------------------------------------------
// This works for C++11, 14, 17, and 20. It replaces
// std::optional with a flag and a custom exception.
struct bad_pthread_access : public std::runtime_error {
bad_pthread_access () : std::runtime_error("value() called, but !valid()") { }
};
struct thread_id {
thread_id () : valid_(false) { }
thread_id (const pthread_t &t) : thr_(t), valid_(true) { }
operator pthread_t () const { return value(); }
friend bool operator == (const thread_id &L, const thread_id &R) {
return (!L.valid() && !R.valid()) || (L.valid() && R.valid() && pthread_equal(L.value(), R.value()));
}
friend bool operator == (const thread_id &L, const pthread_t &R) { return L==thread_id(R); }
friend bool operator == (const pthread_t &L, const thread_id &R) { return thread_id(L)==R; }
friend bool operator != (const thread_id &L, const thread_id &R) { return !(L==R); }
friend bool operator != (const thread_id &L, const pthread_t &R) { return L!=thread_id(R); }
friend bool operator != (const pthread_t &L, const thread_id &R) { return thread_id(L)!=R; }
bool valid () const { return valid_; }
void reset () { valid_ = false; }
pthread_t value () const { // throws bad_pthread_access if !valid()
if (!valid_) throw bad_pthread_access();
return thr_;
}
private:
pthread_t thr_;
bool valid_;
};
//------------------------------------------------------
/* some random notes:
- `std::optional` doesn't let you specify custom comparison
functions, which would be convenient here.
- You can't write `bool operator == (pthread_t, pthread_t)`
overloads, because they'll conflict with default operators
on systems where `pthread_t` is a primitive type.
- You have to write overloads for all the combos of
pthread_t/thread_id in <= C++17, otherwise resolution is
ambiguous with the implicit conversions.
- *Really* sloppy but thorough test: https://godbolt.org/z/GY639ovzd

Resources