sem_post() fail with valid semaphore - linux

I have a queue with a semaphore. At certain point all the calls to sem_post() always return 'Invalid argument' error although the semaphore itself is valid
The semaphore is a private member of C++ object which is never deleted, it also can be inspected in gdb. I added sem_getvalue() just before the sem_post() - the value reads OK and then it fails on sem_post(). What could be wrong?
CThreadQueue::CThreadQueue(int MaxSize) :
_MaxSize(MaxSize)
{
sem_init(&_TaskCount, 0, 0)
pthread_mutex_init(&_Mutex, 0);
pthread_create(&_Thread, NULL, CThreadQueue::StartThread, this);
}
CThreadQueue::~CThreadQueue()
{
pthread_kill(_Thread, SIGKILL);
sem_destroy(&_TaskCount);
}
int CThreadQueue::AddTask(CThreadTask Task)
{
pthread_mutex_lock(&_Mutex);
_Queue.push_back(TempTask);
sem_post(&_TaskCount)
pthread_mutex_unlock(&_Mutex);
return 0;
}
void *CThreadQueue::StartThread(void *Obj)
{
((CThreadQueue*)Obj)->RunThread();
return NULL;
}
//runs in a separate thread
void CThreadQueue::RunThread()
{
CThreadQueue::CTask Task;
while(1) {
sem_wait(&_TaskCount);
pthread_mutex_lock(&_Mutex);
Task = _Queue.front();
_Queue.pop_front();
pthread_mutex_unlock(&_Mutex);
if (Task.Callee != NULL)
Task.Callee->CallBackFunc(NULL, Task.CallParam);
}
}

What could be wrong? Any number of things. Something else could be destroying the semaphore or overwriting the memory used to store it or the pointer to it. Another possibility is that you're calling sem_post() too many times and the counter overflows. A code sample would help.

We had the same problem, and after a long time figuring out what could be happening, we discovered that the problem occurred because the semaphore was defined inside a struct that had its byte alignment changed to 1 (in this case using the pragma pack(1) directive).
The POSIX semaphores implementation on Linux uses the futex syscall. According to the futex man page, EINVAL is returned when "An operation was not defined or error in page alignment."
In our case, either removing the pragma pack(1) directive or defining the semaphore as the first element of the struct, solved the problem.

Related

pthread_mutex_lock() always returns EINVAL with PTHREAD_PRIO_PROTECT mutex even with ceiling properly defined

So I'm working on a multi-process real-time application on Linux and there are a few operations we have protected by Mutexes. I'm attempting to switch our mutexes to the PTHREAD_PRIO_PROTECT protocol, but I must be missing something, as I always get EINVAL on the pthread_mutex_lock() calls.
All of the below code uses similar error reporting checks. I'll just mark that with // Error Reporting to simplify the code listings. Each has some variant of the following blob:
{
char message[256];
throw std::runtime_error( std::string("Error locking Mutex: ") +
strerror_r( errno, message, sizeof(message) ) );
}
Here's the basic mutex initialization code, setting a ceiling of 40:
pthread_mutex_t thelock;
pthread_mutexattr_t attrib;
pthread_mutexattr_init(&attrib);
if ( pthread_mutexattr_setprotocol(&attrib, PTHREAD_PRIO_PROTECT) )
{
// Error Reporting
}
if ( pthread_mutexattr_setprioceiling(&attrib, 40) )
{
// Error Reporting
}
pthread_mutex_init(&thelock, &attrib);
Setting the schedule on the current thread with a priority of 20 (well below ceiling):
struct sched_param param;
param.sched_priority = 20;
if ( -1 == sched_setscheduler(0, SCHED_FIFO, &param) )
{
// Error Reporting
}
I also tried threads with no explicit scheduler set, and setting the scheduler/parameter on the thread attribute before creation.
No matter which way I do it, I end up with EINVAL for the following code:
if ( pthread_mutex_lock(&thelock) )
{
// Error Reporting
}
I've set the capabilities on my test program before running so it is able to properly change priorities:
sudo setcap 'cap_sys_nice=eip' threadpriotest
So it appears my previous testing was in error; I forgot to use pthread_attr_setinheritsched(&threaddr, PTHREAD_EXPLICIT_SCHED) when I changed the priority of the first thread that claimed the mutex. The method of using sched_setscheduler() does work, but I wasn't using that on the first thread at the time.
The bottom line is that it appears the thread must already be using SCHED_FIFO before the mutex will work with PTHREAD_PRIO_PROTECT. This is contrary to PTHREAD_PRIO_INHERIT, which seems to work fine with non-FIFO threads.

C++ usrsctp callback parameters null

I'm currently creating a network application that uses the usrsctp library on windows and I'm having an odd problem with parameters appearing as null when they shouldn't be on a callback function. I'm not sure if this is a specific usrsctp issue or something I'm doing wrong so I wanted to check here first.
When creating a new sctp socket you pass a function as one of the parameters that you want to be called when data is received as shown in the code below
static int receive_cb(struct socket *sock, union sctp_sockstore addr, void *data,
size_t datalen, struct sctp_rcvinfo rcv, int flags, void *ulp_info)
{
if (data == NULL) {
printf("receive_cb - Data NULL, closing socket...\n");
done = 1;
usrsctp_close(sock);
}
else {
_write(_fileno(stdout), data, datalen);
free(data);
}
return (1);
}
...
//Create SCTP socket
if ((sctpsock = usrsctp_socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP, receive_cb, NULL, 0, NULL)) == NULL) {
perror("usrsctp_socket");
return -1;
}
Tracing through the library I can see that before the call back is called all the parameters are correct
As soon as I step into it they become null
I've no idea what would cause this, the callback function was taken straight from the official examples so nothing should be wrong there.
Ok, worked out the issue, it seems that the parameter before 'union sctp_sockstore addr' was causing the stack to be pushed by 0x1c and moving the rest of the parameters away from where they should be. I've never come across this issue before but changing the parameter to a pointer fixed it.
I had the same Issue, in my case the reason was a missing define for INET.
Since the size of 'union sctp_sockstore' depends on this define.
So you have to ensure, that you use the same defines as you used when compiling the library.

How to implement a re-usable thread barrier with std::atomic

I have N threads performing various task and these threads must be regularly synchronized with a thread barrier as illustrated below with 3 thread and 8 tasks. The || indicates the temporal barrier, all threads have to wait until the completion of 8 tasks before starting again.
Thread#1 |----task1--|---task6---|---wait-----||-taskB--| ...
Thread#2 |--task2--|---task5--|-------taskE---||----taskA--| ...
Thread#3 |-task3-|---task4--|-taskG--|--wait--||-taskC-|---taskD ...
I couldn’t find a workable solution, thought the little book of Semaphores http://greenteapress.com/semaphores/index.html was inspiring. I came up with a solution using std::atomic shown below which “seems” to be working using three std::atomic.
I am worried about my code breaking down on corner cases hence the quoted verb. So can you share advise on verification of such code? Do you have a simpler fool proof code available?
std::atomic<int> barrier1(0);
std::atomic<int> barrier2(0);
std::atomic<int> barrier3(0);
void my_thread()
{
while(1) {
// pop task from queue
...
// and execute task
switch(task.id()) {
case TaskID::Barrier:
barrier2.store(0);
barrier1++;
while (barrier1.load() != NUM_THREAD) {
std::this_thread::yield();
}
barrier3.store(0);
barrier2++;
while (barrier2.load() != NUM_THREAD) {
std::this_thread::yield();
}
barrier1.store(0);
barrier3++;
while (barrier3.load() != NUM_THREAD) {
std::this_thread::yield();
}
break;
case TaskID::Task1:
...
}
}
}
Boost offers a barrier implementation as an extension to the C++11 standard thread library. If using Boost is an option, you should look no further than that.
If you have to rely on standard library facilities, you can roll your own implementation based on std::mutex and std::condition_variable without too much of a hassle.
class Barrier {
int wait_count;
int const target_wait_count;
std::mutex mtx;
std::condition_variable cond_var;
Barrier(int threads_to_wait_for)
: wait_count(0), target_wait_count(threads_to_wait_for) {}
void wait() {
std::unique_lock<std::mutex> lk(mtx);
++wait_count;
if(wait_count != target_wait_count) {
// not all threads have arrived yet; go to sleep until they do
cond_var.wait(lk,
[this]() { return wait_count == target_wait_count; });
} else {
// we are the last thread to arrive; wake the others and go on
cond_var.notify_all();
}
// note that if you want to reuse the barrier, you will have to
// reset wait_count to 0 now before calling wait again
// if you do this, be aware that the reset must be synchronized with
// threads that are still stuck in the wait
}
};
This implementation has the advantage over your atomics-based solution that threads waiting in condition_variable::wait should get send to sleep by your operating system's scheduler, so you don't block CPU cores by having waiting threads spin on the barrier.
A few words on resetting the barrier: The simplest solution is to just have a separate reset() method and have the user ensure that reset and wait are never invoked concurrently. But in many use cases, this is not easy to achieve for the user.
For a self-resetting barrier, you have to consider races on the wait count: If the wait count is reset before the last thread returned from wait, some threads might get stuck in the barrier. A clever solution here is to not have the terminating condition depend on the wait count variable itself. Instead you introduce a second counter, that is only increased by the thread calling the notify. The other threads then observe that counter for changes to determine whether to exit the wait:
void wait() {
std::unique_lock<std::mutex> lk(mtx);
unsigned int const current_wait_cycle = m_inter_wait_count;
++wait_count;
if(wait_count != target_wait_count) {
// wait condition must not depend on wait_count
cond_var.wait(lk,
[this, current_wait_cycle]() {
return m_inter_wait_count != current_wait_cycle;
});
} else {
// increasing the second counter allows waiting threads to exit
++m_inter_wait_count;
cond_var.notify_all();
}
}
This solution is correct under the (very reasonable) assumption that all threads leave the wait before the inter_wait_count overflows.
With atomic variables, using three of them for a barrier is simply overkill that only serves to complicate the issue. You know the number of threads, so you can simply atomically increment a single counter every time a thread enters the barrier, and then spin until the counter becomes greater or equal to N. Something like this:
void barrier(int N) {
static std::atomic<unsigned int> gCounter = 0;
gCounter++;
while((int)(gCounter - N) < 0) std::this_thread::yield();
}
If you don't have more threads than CPU cores and a short expected waiting time, you might want to remove the call to std::this_thread::yield(). This call is likely to be really expensive (more than a microsecond, I'd wager, but I haven't measured it). Depending on the size of your tasks, this may be significant.
If you want to do repeated barriers, just increment the N as you go:
unsigned int lastBarrier = 0;
while(1) {
switch(task.id()) {
case TaskID::Barrier:
barrier(lastBarrier += processCount);
break;
}
}
I would like to point out that in the solution given by #ComicSansMS ,
wait_count should be reset to 0 before executing cond_var.notify_all();
This is because when the barrier is called a second time the if condition will always fail, if wait_count is not reset to 0.

pthread_cond_wait never unblocking - thread pools

I'm trying to implement a sort of thread pool whereby I keep threads in a FIFO and process a bunch of images. Unfortunately, for some reason my cond_wait doesn't always wake even though it's been signaled.
// Initialize the thread pool
for(i=0;i<numThreads;i++)
{
pthread_t *tmpthread = (pthread_t *) malloc(sizeof(pthread_t));
struct Node* newNode;
newNode=(struct Node *) malloc(sizeof(struct Node));
newNode->Thread = tmpthread;
newNode->Id = i;
newNode->threadParams = 0;
pthread_cond_init(&(newNode->cond),NULL);
pthread_mutex_init(&(newNode->mutx),NULL);
pthread_create( tmpthread, NULL, someprocess, (void*) newNode);
push_back(newNode, &threadPool);
}
for() //stuff here
{
//...stuff
pthread_mutex_lock(&queueMutex);
struct Node *tmpNode = pop_front(&threadPool);
pthread_mutex_unlock(&queueMutex);
if(tmpNode != 0)
{
pthread_mutex_lock(&(tmpNode->mutx));
pthread_cond_signal(&(tmpNode->cond)); // Not starting mutex sometimes?
pthread_mutex_unlock(&(tmpNode->mutx));
}
//...stuff
}
destroy_threads=1;
//loop through and signal all the threads again so they can exit.
//pthread_join here
}
void *someprocess(void* threadarg)
{
do
{
//...stuff
pthread_mutex_lock(&(threadNode->mutx));
pthread_cond_wait(&(threadNode->cond), &(threadNode->mutx));
// Doesn't always seem to resume here after signalled.
pthread_mutex_unlock(&(threadNode->mutx));
} while(!destroy_threads);
pthread_exit(NULL);
}
Am I missing something? It works about half of the time, so I would assume that I have a race somewhere, but the only thing I can think of is that I'm screwing up the mutexes? I read something about not signalling before locking or something, but I don't really understand what's going on.
Any suggestions?
Thanks!
Firstly, your example shows you locking the queueMutex around the call to pop_front, but not round push_back. Typically you would need to lock round both, unless you can guarantee that all the pushes happen-before all the pops.
Secondly, your call to pthread_cond_wait doesn't seem to have an associated predicate. Typical usage of condition variables is:
pthread_mutex_lock(&mtx);
while(!ready)
{
pthread_cond_wait(&cond,&mtx);
}
do_stuff();
pthread_mutex_unlock(&mtx);
In this example, ready is some variable that is set by another thread whilst that thread holds a lock on mtx.
If the waiting thread is not blocked in the pthread_cond_wait when pthread_cond_signal is called then the signal will be ignored. The associated ready variable allows you to handle this scenario, and also allows you to handle so-called spurious wake-ups where the call to pthread_cond_wait returns without a corresponding call to pthread_cond_signal from another thread.
I'm not sure, but I think you don't have to (you must not) lock the mutex in the thread pool before calling pthread_cond_signal(&(tmpNode->cond)); , otherwise, the thread which is woken up won't be able to lock the mutex as part of pthread_cond_wait(&(threadNode->cond), &(threadNode->mutx)); operation.

How to asynchronously read to std::string using Boost::asio?

I'm learning Boost::asio and all that async stuff. How can I asynchronously read to variable user_ of type std::string? Boost::asio::buffer(user_) works only with async_write(), but not with async_read(). It works with vector, so what is the reason for it not to work with string? Is there another way to do that besides declaring char user_[max_len] and using Boost::asio::buffer(user_, max_len)?
Also, what's the point of inheriting from boost::enable_shared_from_this<Connection> and using shared_from_this() instead of this in async_read() and async_write()? I've seen that a lot in the examples.
Here is a part of my code:
class Connection
{
public:
Connection(tcp::acceptor &acceptor) :
acceptor_(acceptor),
socket_(acceptor.get_io_service(), tcp::v4())
{ }
void start()
{
acceptor_.get_io_service().post(
boost::bind(&Connection::start_accept, this));
}
private:
void start_accept()
{
acceptor_.async_accept(socket_,
boost::bind(&Connection::handle_accept, this,
placeholders::error));
}
void handle_accept(const boost::system::error_code& err)
{
if (err)
{
disconnect();
}
else
{
async_read(socket_, boost::asio::buffer(user_),
boost::bind(&Connection::handle_user_read, this,
placeholders::error, placeholders::bytes_transferred));
}
}
void handle_user_read(const boost::system::error_code& err,
std::size_t bytes_transferred)
{
if (err)
{
disconnect();
}
else
{
...
}
}
...
void disconnect()
{
socket_.shutdown(tcp::socket::shutdown_both);
socket_.close();
socket_.open(tcp::v4());
start_accept();
}
tcp::acceptor &acceptor_;
tcp::socket socket_;
std::string user_;
std::string pass_;
...
};
The Boost.Asio documentation states:
A buffer object represents a contiguous region of memory as a 2-tuple consisting of a pointer and size in bytes. A tuple of the form {void*, size_t} specifies a mutable (modifiable) region of memory.
This means that in order for a call to async_read to write data to a buffer, it must be (in the underlying buffer object) a contiguous block of memory. Additionally, the buffer object must be able to write to that block of memory.
std::string does not allow arbitrary writes into its buffer, so async_read cannot write chunks of memory into a string's buffer (note that std::string does give the caller read-only access to the underlying buffer via the data() method, which guarantees that the returned pointer will be valid until the next call to a non-const member function. For this reason, Asio can easily create a const_buffer wrapping an std::string, and you can use it with async_write).
The Asio documentation has example code for a simple "chat" program (see http://www.boost.org/doc/libs/1_43_0/doc/html/boost_asio/examples.html#boost_asio.examples.chat) that has a good method of overcoming this problem. Basically, you need to have the sending TCP send along the size of a message first, in a "header" of sorts, and your read handler must interpret the header to allocate a buffer of a fixed size suitable for reading the actual data.
As far as the need for using shared_from_this() in async_read and async_write, the reason is that it guarantees that the method wrapped by boost::bind will always refer to a live object. Consider the following situation:
Your handle_accept method calls async_read and sends a handler "into the reactor" - basically you've asked the io_service to invoke Connection::handle_user_read when it finishes reading data from the socket. The io_service stores this functor and continues its loop, waiting for the asynchronous read operation to complete.
After your call to async_read, the Connection object is deallocated for some reason (program termination, an error condition, etc.)
Suppose the io_service now determines that the asynchronous read is complete, after the Connection object has been deallocated but before the io_service is destroyed (this can occur, for example, if io_service::run is running in a separate thread, as is typical). Now, the io_service attempts to invoke the handler, and it has an invalid reference to a Connection object.
The solution is to allocate Connection via a shared_ptr and use shared_from_this() instead of this when sending a handler "into the reactor" - this allows io_service to store a shared reference to the object, and shared_ptr guarantees that it won't be deallocated until the last reference expires.
So, your code should probably look something like:
class Connection : public boost::enable_shared_from_this<Connection>
{
public:
Connection(tcp::acceptor &acceptor) :
acceptor_(acceptor),
socket_(acceptor.get_io_service(), tcp::v4())
{ }
void start()
{
acceptor_.get_io_service().post(
boost::bind(&Connection::start_accept, shared_from_this()));
}
private:
void start_accept()
{
acceptor_.async_accept(socket_,
boost::bind(&Connection::handle_accept, shared_from_this(),
placeholders::error));
}
void handle_accept(const boost::system::error_code& err)
{
if (err)
{
disconnect();
}
else
{
async_read(socket_, boost::asio::buffer(user_),
boost::bind(&Connection::handle_user_read, shared_from_this(),
placeholders::error, placeholders::bytes_transferred));
}
}
//...
};
Note that you now must make sure that each Connection object is allocated via a shared_ptr, e.g.:
boost::shared_ptr<Connection> new_conn(new Connection(...));
Hope this helps!
This isn't intended to be an answer per se, but just a lengthy comment: a very simple way to convert from an ASIO buffer to a string is to stream from it:
asio::streambuf buff;
asio::read_until(source, buff, '\r'); // for example
istream is(&buff);
is >> targetstring;
This is a data copy, of course, but that's what you need to do if you want it in a string.
You can use a std:string with async\_read() like this:
async_read(socket_, boost::asio::buffer(&user_[0], user_.size()),
boost::bind(&Connection::handle_user_read, this,
placeholders::error, placeholders::bytes_transferred));
However, you'd better make sure that the std::string is big enough to accept the packet that you're expecting and padded with zeros before calling async\_read().
And as for why you should NEVER bind a member function callback to a this pointer if the object can be deleted, a more complete description and a more robust method can be found here: Boost async_* functions and shared_ptr's.
Boost Asio has two styles of buffers. There's boost::asio::buffer(your_data_structure), which cannot grow, and is therefore generally useless for unknown input, and there's boost::asio::streambuf which can grow.
Given a boost::asio::streambuf buf, you turn it into a string with std::string(std::istreambuf_iterator<char>(&buf), {});.
This is not efficient as you end up copying data once more, but that would require making boost::asio::buffer aware of growable containers, i.e. containers that have a .resize(N) method. You can't make it efficient without touching Boost code.

Resources