Reuse of threads with pthreads on Linux

I have a program which constantly gets work to do (something like a server) and a few threads. What is the right way to reuse threads from the pthread library, or am I forced to create a new thread every time? I want to reuse at least the pthread_t structures. I am thinking of something like this:
int main() {
    pthread_t threads[some value];
    while (1) {
        get work;
        find a free thread;
        pthread_create(free thread, do work);
        pthread_join(done threads);
    }
}
But I don't know how to properly free a thread or how to check if it is free.

Just code the thread to do whatever work needs to be done. Don't keep creating and joining threads. The simplest way is to use a thread pool -- a collection of threads and a thread-safe queue of jobs. Each thread in the pool takes a job from the queue, does that job, and then waits for another job.
With POSIX threads, usually a mutex is used to protect the queue and a condition variable to allow threads to wait for work. You may want a boolean variable to track whether the program is shutting down.
In pseudo-code, each thread does this:
1. Acquire the mutex.
2. Check whether the program is shutting down; if so, release the mutex and terminate.
3. Check whether the queue is empty. If so, block on the condition variable and go to step 2.
4. Take the top job from the queue.
5. Release the mutex.
6. Do the job.
7. Go to step 1.
To ask a thread to do a job, do this:
1. Allocate a new work object.
2. Fill it in with the work to be done (which can be a pointer to a function and a parameter for that function).
3. Acquire the mutex.
4. Add the job to the queue.
5. Signal the condition variable.
6. Release the mutex.
To shut down:
1. Acquire the mutex.
2. Set the shutting-down boolean to true.
3. Broadcast the condition variable.
4. Release the mutex.
5. Join all threads.
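Sketched in C with POSIX threads, the whole pattern might look like this (job_t, submit, the LIFO queue and the drain-before-exit shutdown are illustrative choices, and error handling is omitted):

#include <pthread.h>
#include <stdlib.h>

typedef struct job {
    void (*func)(void *);            /* the work to perform */
    void *arg;                       /* parameter for func */
    struct job *next;
} job_t;

static job_t *queue_head = NULL;     /* job queue (LIFO here, for brevity) */
static int shutting_down = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t work_available = PTHREAD_COND_INITIALIZER;

/* Each pool thread runs this loop. */
static void *worker(void *unused)
{
    (void)unused;
    for (;;) {
        pthread_mutex_lock(&lock);
        /* Wait until there is a job or we are told to shut down. */
        while (queue_head == NULL && !shutting_down)
            pthread_cond_wait(&work_available, &lock);
        if (queue_head == NULL) {    /* shutting down and queue drained */
            pthread_mutex_unlock(&lock);
            return NULL;
        }
        job_t *job = queue_head;     /* take the top job */
        queue_head = job->next;
        pthread_mutex_unlock(&lock);
        job->func(job->arg);         /* do the job without holding the lock */
        free(job);
    }
}

/* Ask the pool to do a job. */
static void submit(void (*func)(void *), void *arg)
{
    job_t *job = malloc(sizeof *job);
    job->func = func;
    job->arg = arg;
    pthread_mutex_lock(&lock);
    job->next = queue_head;          /* a real pool would append at the tail
                                        to preserve submission order */
    queue_head = job;
    pthread_cond_signal(&work_available);
    pthread_mutex_unlock(&lock);
}

/* Shut the pool down and reclaim the threads. */
static void shutdown_pool(pthread_t *threads, int n)
{
    pthread_mutex_lock(&lock);
    shutting_down = 1;
    pthread_cond_broadcast(&work_available);
    pthread_mutex_unlock(&lock);
    for (int i = 0; i < n; i++)
        pthread_join(threads[i], NULL);
}

Unlike the pseudo-code above, this variant lets workers drain any queued jobs before exiting, which avoids leaking the work objects.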

Related

Why does std::condition_variable wait() require a std::unique_lock arg?

My thread does not need any locking. std::unique_lock locks the mutex on construction. I am simply using cond_var.wait() as a way to avoid busy waiting. I have essentially circumvented the auto-locking by putting the unique_lock inside a tiny scope, so the unique_lock is destroyed when it leaves that scope. Additionally, there is only a single consumer thread, if that's relevant.
{
    std::unique_lock<std::mutex> dispatch_ul(dispatch_mtx);
    pq_cond.wait(dispatch_ul);
}
Is there possibly a better option to avoid the unnecessary auto-lock functionality of the unique_lock? I'm looking for a mutexless option to simply signal the thread. I am aware of std::condition_variable_any, but that requires a lock of sorts, which is again unnecessary in my case.
You need a lock to prevent this common newbie mistake:
Producer thread produces something,
Producer thread calls some_condition.notify_all(),
Producer thread goes idle for a while,
meanwhile:
Consumer thread calls some_condition.wait(...)
Consumer thread waits,...
And waits,...
And waits.
A condition variable is not a flag. It does not remember that it was notified. If the producer calls notify_one() or notify_all() before the consumer has entered the wait() call, then the notification is "lost."
In order to prevent lost notifications, there must be some shared data that tells the consumer whether or not it needs to wait, and there must be a lock to protect the shared data.
The producer should:
Lock the lock,
update the shared data,
notify the condition variable,
release the lock.
The consumer must then:
Lock the lock,
check the shared data to see if it needs to wait,
wait if needed,
consume whatever,
release the lock.
The consumer needs to pass the lock in to the wait(...) call so that wait(...) can temporarily unlock it, and then re-lock it before returning. If wait(...) did not unlock the lock, then the producer would never be able to reach the notify() call.
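For concreteness, here is that protocol sketched with POSIX primitives (the std::condition_variable version is structurally identical); the data_ready flag stands in for whatever shared data the condition is really about:

#include <pthread.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int data_ready = 0;               /* the shared data protected by mtx */

void producer(void)
{
    pthread_mutex_lock(&mtx);            /* lock the lock */
    data_ready = 1;                      /* update the shared data */
    pthread_cond_signal(&cond);          /* notify the condition variable */
    pthread_mutex_unlock(&mtx);          /* release the lock */
}

void consumer(void)
{
    pthread_mutex_lock(&mtx);            /* lock the lock */
    while (!data_ready)                  /* check whether we need to wait */
        pthread_cond_wait(&cond, &mtx);  /* atomically unlocks, waits, relocks */
    data_ready = 0;                      /* consume */
    pthread_mutex_unlock(&mtx);          /* release the lock */
}

Because the consumer checks data_ready under the lock before waiting, a notification sent before the consumer reaches the wait is never lost: the consumer simply sees the flag already set and skips the wait.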

Is it ok to use a semaphore as a global pause for worker threads?

I'm thinking of using a semaphore as a pause mechanism for a pool of worker threads like so:
// main thread
for N jobs:
    semaphore.release()
    create and start worker

// worker thread
while (not done)
    semaphore.acquire()
    do_work
    semaphore.release()
Now, if I want to pause all workers, I can acquire the entire count available in the semaphore. I'm wondering if that is better than:
if (paused)
    paused_mutex.lock
    wait for condition (paused_mutex)
do_work
Or is there a better solution?
I guess one downside of doing it with the semaphore is that the main thread will block until all workers release. In my case, the unit of work per iteration is very small so that probably won't be a problem.
Update: to clarify, my workers are database backups that act like file copies. The while(not quit) loop quits when the file has been successfully copied. So to relate it to the traditional worker-waits-for-condition to get work: my workers wait for a needed file copy and the while loop you see is doing the work requested. You could think of my do_work above as do_piece_of_work.
The problem with the semaphore approach is that every worker thread has to go through the semaphore on every iteration, paying synchronization overhead for each small unit of work even when no pause is requested. It is better to use a mutex and a condition (signalling) variable (as in your second example) so that the threads are woken up only when they have something to do.
It is also better to hold the mutex for as short a time as possible. The traditional way to do this is to create a WORK QUEUE and to use the mutex to synchronize queue inserts and removals. The main thread inserts into the work queue and wakes up a worker. The worker acquires the mutex, removes an item from the queue, then releases the mutex. NOW the worker performs the action. This maximizes the concurrency between the worker threads and the main thread.
Here is an example:
// main thread
create condition variable
create mutex
for N workers:
    create and start worker
while (wait for work)
    // we have something to do
    create work item
    mutex.acquire()
    insert_work_into_queue(item)
    mutex.release()
    // tell the workers
    signal_condition_variable()

// worker thread
while (not done)
    mutex.acquire()
    while (queue is empty)
        wait on condition (mutex)  // atomically releases and re-acquires the mutex
    work = remove_item_from_queue()
    mutex.release()
    if (work) do(work)
This is a simple example where all the worker threads are awakened, even though only one worker will actually succeed in getting work off of the queue. If you want even more efficiency, use an array of condition variables, one per worker thread and then just signal the "next" one, using an algorithm for "next" that is as simple or as complex as you want.
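For the pause mechanism the question actually asks about, a condition-variable sketch in POSIX C might look like this (the paused flag and the function names are illustrative):

#include <pthread.h>

static pthread_mutex_t pause_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t resume_cond = PTHREAD_COND_INITIALIZER;
static int paused = 0;

/* Called by a worker before each piece of work; blocks while paused. */
void wait_if_paused(void)
{
    pthread_mutex_lock(&pause_mtx);
    while (paused)
        pthread_cond_wait(&resume_cond, &pause_mtx);
    pthread_mutex_unlock(&pause_mtx);
}

/* Called by the main thread to pause or resume all workers. It does not
   wait for workers to finish their current piece of work. */
void set_paused(int value)
{
    pthread_mutex_lock(&pause_mtx);
    paused = value;
    if (!paused)
        pthread_cond_broadcast(&resume_cond);  /* wake every waiting worker */
    pthread_mutex_unlock(&pause_mtx);
}

Pausing this way costs one uncontended mutex lock per piece of work, and the main thread never blocks on the workers the way it would when draining a semaphore.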

Can possible deadlocks in multithreaded code be avoided this way?

We know that multi-threaded code is susceptible to deadlock if a thread acquires a mutex lock but, before it gets a chance to release it, is suspended by the main thread or pre-empted by the scheduler.
I am a beginner with the pthread library, so please bear with me if my query/proposed solution below is unfeasible or outright wrong.
void main()
{
    thread_create(T1, NULL, thr_function, NULL);
    suspend_thread(T1);
    acquire_lock(Lock1);  // <-- possible deadlock if thr_function acquired
                          // Lock1 before main, and main suspended T1
                          // before it released the lock
    // do something further
}
void *thr_function(void *val)
{
    // do something
    acquire_lock(Lock1);
    // do some more things
    // do some more things
    release_lock(Lock1);
}
For the pseudo-code segment above: can't the thread run-time and compiler work together to ensure that a thread which has acquired a mutex lock and is then suspended or pre-empted executes some 'cleanup code' that releases all the locks it holds before it goes out? The compiler/linker can identify the places inside a thread function where a lock is acquired and released; when a thread is suspended between those two points (i.e. after the acquire but before the release), execution in the thread function could jump, via some kind of 'goto label;' inserted by the runtime, to a label where the thread releases the lock before it gets blocked or the context switch happens. [I know that if a thread holds more than one lock it might get messy to jump across those points to release them all...]
But the basic idea/question is: can the thread function not release its acquired mutex locks and semaphores before it gets blocked or leaves the running state to wait?
No. The reason a thread holds a lock is so that it can make data temporarily inconsistent or see a consistent view of that data itself. If some scheme were to automatically release that lock before the thread made the data consistent again, other threads would acquire the lock, see the inconsistent data, and fail. Or when that thread was resumed, it would either not have the lock or have the lock and see inconsistent data itself. This is why you can only reliably suspend a thread with that thread's cooperation.
Consider this logic to add an object to a linked list protected by a mutex:
1. Acquire the lock protecting a linked list.
2. Modify the list's head pointer.
3. Modify the object's next pointer.
4. Release the lock.
Now imagine if something were to suspend the thread between steps 2 and 3. If the lock were released, other threads would see the list's head pointer pointing to an object that had not been linked into the list. And when the thread resumed, it might set the object's next pointer incorrectly because the list had changed.
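In C, that critical section might look like the sketch below (node_t and list_push are illustrative names). The window between steps 2 and 3, where the head points at a node whose next pointer is still garbage, is exactly what the lock protects:

#include <pthread.h>
#include <stddef.h>

typedef struct node {
    struct node *next;
    /* payload ... */
} node_t;

static node_t *list_head = NULL;
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

void list_push(node_t *obj)
{
    pthread_mutex_lock(&list_lock);    /* step 1 */
    node_t *old_head = list_head;
    list_head = obj;                   /* step 2: head now points at obj,    */
    obj->next = old_head;              /* step 3: ...unlinked until this line */
    pthread_mutex_unlock(&list_lock);  /* step 4 */
}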
The general consensus is that suspending threads is so evil that even a feeling that you might want to suspend a thread suggests an incorrect application design. There is practically no reason a properly-designed application would ever want to suspend a thread. (If you didn't want that thread to continue doing the work it was doing, why did you code it to continue doing that work in the first place?)
By the way, scheduler pre-emption is not a problem. Eventually, the thread will be scheduled again and release the lock. So long as there are other threads that can make forward progress, no harm is done. And if there are no other threads that can make forward progress, the only thing the system can do is schedule the thread that was pre-empted.
One way to avoid this kind of deadlock is to have a global, mutexed variable should_stop_thread which eventually gets set to true by the master thread.
The child thread checks the variable regularly and terminates in a controlled manner if it is true. "Controlled" in this sense means that all data (pointers) are valid (again) and mutex locks are released.
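A minimal sketch of that stop-flag pattern (the names are illustrative):

#include <pthread.h>
#include <stdbool.h>

static bool should_stop_thread = false;
static pthread_mutex_t stop_mtx = PTHREAD_MUTEX_INITIALIZER;

static bool stop_requested(void)
{
    pthread_mutex_lock(&stop_mtx);
    bool stop = should_stop_thread;
    pthread_mutex_unlock(&stop_mtx);
    return stop;
}

void *child(void *arg)
{
    (void)arg;
    while (!stop_requested()) {
        /* do one bounded piece of work; acquire and release any
           other locks entirely within this iteration */
    }
    return NULL;   /* exits with all locks released and data consistent */
}

void request_stop(pthread_t t)
{
    pthread_mutex_lock(&stop_mtx);
    should_stop_thread = true;
    pthread_mutex_unlock(&stop_mtx);
    pthread_join(t, NULL);   /* wait for the controlled termination */
}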

Advantages of using condition variables over mutexes

I was wondering what is the performance benefit of using condition variables over mutex locks in pthreads.
What I found is : "Without condition variables, the programmer would need to have threads continually polling (possibly in a critical section), to check if the condition is met. This can be very resource consuming since the thread would be continuously busy in this activity. A condition variable is a way to achieve the same goal without polling." (https://computing.llnl.gov/tutorials/pthreads)
But it also seems that mutex calls are blocking (unlike spin-locks). Hence, if a thread (T1) fails to get a lock because some other thread (T2) has it, T1 is put to sleep by the OS and is woken up only when T2 releases the lock and the OS gives the lock to T1. Thread T1 does not really poll to get the lock. From this description, it seems that there is no performance benefit to using condition variables: in either case there is no polling involved, and the OS anyway provides the benefit that the condition-variable paradigm is supposed to provide.
Can you please explain what actually happens.
A condition variable allows a thread to be signaled when something of interest to that thread occurs.
By itself, a mutex doesn't do this.
If you just need mutual exclusion, then condition variables don't do anything for you. However, if you need to know when something happens, then condition variables can help.
For example, if you have a queue of items to work on, you'll have a mutex to ensure the queue's internals are consistent when accessed by the various producer and consumer threads. However, when the queue is empty, how will a consumer thread know when something is in there for it to work on? Without something like a condition variable it would need to poll the queue, taking and releasing the mutex on each poll (otherwise a producer thread could never put something on the queue).
Using a condition variable lets the consumer find that when the queue is empty it can just wait on the condition variable indicating that the queue has had something put into it. No polling - that thread does nothing until a producer puts something in the queue, then signals the condition that the queue has a new item.
You're looking for too much overlap in two separate but related things: a mutex and a condition variable.
A common implementation approach for a mutex is to use a flag and a queue. The flag indicates whether the mutex is held by anyone (a single-count semaphore would work too), and the queue tracks which threads are in line waiting to acquire the mutex exclusively.
A condition variable is then implemented as another queue bolted onto that mutex. Threads that got in line to wait to acquire the mutex can—usually once they have acquired it—volunteer to get out of the front of the line and get into the condition queue instead. At this point, you have two separate sets of waiters:
Those waiting to acquire the mutex exclusively
Those waiting for the condition variable to be signaled
When a thread holding the mutex exclusively signals the condition variable, for which we'll assume for now that it's a singular signal (unleashing no more than one waiting thread) and not a broadcast (unleashing all the waiting threads), the first thread in the condition variable queue gets shunted back over into the front (usually) of the mutex queue. Once the thread currently holding the mutex—usually the thread that signaled the condition variable—relinquishes the mutex, the next thread in the mutex queue can acquire it. That next thread in line will have been the one that was at the head of the condition variable queue.
There are many complicated details that come into play, but this sketch should give you a feel for the structures and operations in play.
If you are looking for performance, then start reading about "non-blocking / non-locking" thread synchronization algorithms. They are based upon atomic operations, which gcc is kind enough to provide. Look up gcc atomic operations. Our tests showed we could increment a global value with multiple threads using atomic operations orders of magnitude faster than locking with a mutex. The same techniques can be used to add and remove items from a linked list from multiple threads at the same time without locking.
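As a sketch of that atomic-increment comparison, using GCC's __atomic builtins (C11's stdatomic.h atomic_fetch_add works the same way; the relaxed memory order is an illustrative choice that suits a plain counter):

#include <pthread.h>

static long counter = 0;
static pthread_mutex_t counter_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Lock-free: one atomic read-modify-write per increment. */
void *atomic_worker(void *arg)
{
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++)
        __atomic_fetch_add(&counter, 1, __ATOMIC_RELAXED);
    return NULL;
}

/* Mutex-based: a lock/unlock pair around every increment. */
void *mutex_worker(void *arg)
{
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++) {
        pthread_mutex_lock(&counter_mtx);
        counter++;
        pthread_mutex_unlock(&counter_mtx);
    }
    return NULL;
}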
For sleeping and waking threads, signals are much faster than condition variables. You use pthread_kill to send the signal and sigwait to put the thread to sleep. We tested this too, with the same kind of performance benefits.
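A sketch of that signal-based sleep/wake; the chosen signal must be blocked in every thread so that only sigwait consumes it (SIGUSR1 here is an illustrative choice):

#include <pthread.h>
#include <signal.h>

static sigset_t wake_set;

void *sleeper(void *arg)
{
    (void)arg;
    int sig;
    sigwait(&wake_set, &sig);   /* sleep until SIGUSR1 is delivered */
    /* do work, then exit */
    return NULL;
}

int main(void)
{
    sigemptyset(&wake_set);
    sigaddset(&wake_set, SIGUSR1);
    /* Block SIGUSR1 in this thread; threads created afterwards inherit
       the mask, so the signal is only ever consumed by sigwait. */
    pthread_sigmask(SIG_BLOCK, &wake_set, NULL);

    pthread_t t;
    pthread_create(&t, NULL, sleeper, NULL);
    pthread_kill(t, SIGUSR1);   /* wake the sleeping thread */
    pthread_join(t, NULL);
    return 0;
}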

Synchronising threads with mutexes

In Qt, I have a method which contains a mutex lock and unlock. The problem is that when the mutex is unlocked, it sometimes takes long before the other thread gets the lock. In other words, it seems the same thread can get the lock back (the method is called in a loop) even though another thread is waiting for it. What can I do about this? One thread is a QThread and the other is the main thread.
You can have the thread that just unlocked the mutex relinquish the processor. On POSIX, you do that by calling sched_yield() (often exposed as the non-standard pthread_yield()), and on Windows by calling Sleep(0).
That said, there is no guarantee that the thread waiting on the lock will be scheduled before your thread wakes up again.
In principle, releasing a lock gives a waiting thread a chance to acquire it, but most mutexes make no fairness guarantee, so a thread that releases a lock can reacquire it before the waiter is scheduled. Still, check the basics first.
Check that you are actually releasing the lock when you think you do. Check that the waiting thread actually waits (and doesn't spin in a loop of trylock tests and sleeps; I actually did that once and was very puzzled at first :)).
Or, if the waiting thread really never gets time to even reach the locking code, try QThread::yieldCurrentThread(). This stops the current thread and gives the scheduler a chance to hand execution to somebody else. It might cause unnecessary switching, depending on the tightness of your loop.
If you want to make sure that one thread has priority over the others, an option is to use a QReadWriteLock. It is adapted to a typical scenario where n threads read a value in an infinite loop while only one thread updates it, which I think is the scenario you described.
QReadWriteLock offers two ways to lock: lockForRead() and lockForWrite(). The threads depending on the value use the former, while the thread updating the value (typically from the GUI) uses the latter (lockForWrite()) and gets priority. You won't need to sleep or yield or anything.
Example code
Let's say you have a QReadWriteLock lock; declared somewhere.
"Reader" thread
forever {
    lock.lockForRead();
    if (condition) {
        do_stuff();
    }
    lock.unlock();
}
"Writer" thread
// external input (e.g. from the user) changes the condition
lock.lockForWrite(); // blocks until the reader releases the lock
update_condition();
lock.unlock();
