What is the difference between pthread_yield and p_thread_may yield .
pthread_yield() causes the calling thread to relinquish the CPU immediately.
Whereas, pthread_may_yield() will yield if the current thread has used up the given quantum.
pthread_may_yield ()
{
/* Whether this thread has used up its allocated time slice?
Set Variable to True if yes. */
if (thread has used up given quantum) {
pthread_yield();
}
}
Related
My question is from this implementation of a ThreadPool class in C++11. Following is relevant parts from the code:
whenever enqueue is called on the threadPool object, it binds the passed function with all passed arguments, to create a shared_ptr of std::packaged_task:
auto task = std::make_shared< std::packaged_task<return_type()> >(
std::bind(std::forward<F>(f), std::forward<Args>(args)...)
);
extracts the future from this std::packaged_taskto return to the caller and stores this task in a std::queue<std::function<void()>> tasks;.
In the constructor, it waits for the task in queue, and if it finds one, it executes the task:
for(size_t i = 0;i<threads;++i)
workers.emplace_back(
[this]
{
for(;;)
{
std::function<void()> task;
{
std::unique_lock<std::mutex> lock(this->queue_mutex);
this->condition.wait(lock,[this]{ return !this->tasks.empty(); });
task = std::move(this->tasks.front());
this->tasks.pop();
}
task();
}
}
);
Now, based on this, following is my questions:
If std::packaged_task was stored in a std::queue<std::function<void()>>, then it just becomes a std::function object, right? then how does it still write to the shared state of std::future extracted earlier?
If stored std::packaged_task was not just a std::function object but still a std::packaged_taskthen when a std::thread executes task() through a lambda (code inside constructor), then why doesn't it run on another thread? as std::packaged_task are supposed to run on another thread, right?
As my questions suggest, I am unable to understand the conversion of std::packaged_task into std::function and the capability of std::function to write to the shared state of std::future. Whenever I tested this code with n threads, the maximum number of thread ids I could get was n but never more than n. Here is the complete code (including that of ThreadPool and it also includes a main function which counts the number of threads created).
Is calling the Wait() method of sync.Cond safe when according to the documentation, it performs an Unlock() first?
Assume we are checking for a condition to be met:
func sample() {
cond = &sync.Cond{L: &sync.Mutex{}} // accessible by other parts of program
go func() {
cond.L.Lock()
for !condition() {
cond.Wait()
}
// do stuff ...
cond.L.Unlock()
}()
go func() {
cond.L.Lock()
mutation()
cond.L.Unlock()
cond.Signal()
}()
}
And:
func condition() bool {
// assuming someSharedState is a more complex state than just a bool
return someSharedState
}
func mutation() {
// assuming someSharedState is a more complex state than just a bool
// (A) state mutation on someSharedState
}
Since Wait() performs an Unlock, should (A) has a locking of it's own? Or being atomic?
Yes it is safe to call Wait even when it calls L.Unlock() first but it is essential that you acquired the lock before calling Wait and before checking your condition since, in this situation, both are not thread safe.
Wait atomically unlocks c.L and suspends execution of the calling goroutine. After later resuming execution, Wait locks c.L before returning.
The goroutine that calls Wait acquired the lock, checked the condition and found it to be unsatisfactory.
Now it waits but in order to allow for changes of the condition it needs to give the lock back. Wait does that automatically for you and then suspends the goroutine.
Now changes of the condition can happen and eventually the goroutine is awoken by Broadcast or Signal. It then acquires the lock to check the condition once again (This must happen one-by-one for each waiting goroutine or else there would be no way telling how many goroutines are running around freely now).
I have N threads performing various task and these threads must be regularly synchronized with a thread barrier as illustrated below with 3 thread and 8 tasks. The || indicates the temporal barrier, all threads have to wait until the completion of 8 tasks before starting again.
Thread#1 |----task1--|---task6---|---wait-----||-taskB--| ...
Thread#2 |--task2--|---task5--|-------taskE---||----taskA--| ...
Thread#3 |-task3-|---task4--|-taskG--|--wait--||-taskC-|---taskD ...
I couldn’t find a workable solution, thought the little book of Semaphores http://greenteapress.com/semaphores/index.html was inspiring. I came up with a solution using std::atomic shown below which “seems” to be working using three std::atomic.
I am worried about my code breaking down on corner cases hence the quoted verb. So can you share advise on verification of such code? Do you have a simpler fool proof code available?
std::atomic<int> barrier1(0);
std::atomic<int> barrier2(0);
std::atomic<int> barrier3(0);
void my_thread()
{
while(1) {
// pop task from queue
...
// and execute task
switch(task.id()) {
case TaskID::Barrier:
barrier2.store(0);
barrier1++;
while (barrier1.load() != NUM_THREAD) {
std::this_thread::yield();
}
barrier3.store(0);
barrier2++;
while (barrier2.load() != NUM_THREAD) {
std::this_thread::yield();
}
barrier1.store(0);
barrier3++;
while (barrier3.load() != NUM_THREAD) {
std::this_thread::yield();
}
break;
case TaskID::Task1:
...
}
}
}
Boost offers a barrier implementation as an extension to the C++11 standard thread library. If using Boost is an option, you should look no further than that.
If you have to rely on standard library facilities, you can roll your own implementation based on std::mutex and std::condition_variable without too much of a hassle.
class Barrier {
int wait_count;
int const target_wait_count;
std::mutex mtx;
std::condition_variable cond_var;
Barrier(int threads_to_wait_for)
: wait_count(0), target_wait_count(threads_to_wait_for) {}
void wait() {
std::unique_lock<std::mutex> lk(mtx);
++wait_count;
if(wait_count != target_wait_count) {
// not all threads have arrived yet; go to sleep until they do
cond_var.wait(lk,
[this]() { return wait_count == target_wait_count; });
} else {
// we are the last thread to arrive; wake the others and go on
cond_var.notify_all();
}
// note that if you want to reuse the barrier, you will have to
// reset wait_count to 0 now before calling wait again
// if you do this, be aware that the reset must be synchronized with
// threads that are still stuck in the wait
}
};
This implementation has the advantage over your atomics-based solution that threads waiting in condition_variable::wait should get send to sleep by your operating system's scheduler, so you don't block CPU cores by having waiting threads spin on the barrier.
A few words on resetting the barrier: The simplest solution is to just have a separate reset() method and have the user ensure that reset and wait are never invoked concurrently. But in many use cases, this is not easy to achieve for the user.
For a self-resetting barrier, you have to consider races on the wait count: If the wait count is reset before the last thread returned from wait, some threads might get stuck in the barrier. A clever solution here is to not have the terminating condition depend on the wait count variable itself. Instead you introduce a second counter, that is only increased by the thread calling the notify. The other threads then observe that counter for changes to determine whether to exit the wait:
void wait() {
std::unique_lock<std::mutex> lk(mtx);
unsigned int const current_wait_cycle = m_inter_wait_count;
++wait_count;
if(wait_count != target_wait_count) {
// wait condition must not depend on wait_count
cond_var.wait(lk,
[this, current_wait_cycle]() {
return m_inter_wait_count != current_wait_cycle;
});
} else {
// increasing the second counter allows waiting threads to exit
++m_inter_wait_count;
cond_var.notify_all();
}
}
This solution is correct under the (very reasonable) assumption that all threads leave the wait before the inter_wait_count overflows.
With atomic variables, using three of them for a barrier is simply overkill that only serves to complicate the issue. You know the number of threads, so you can simply atomically increment a single counter every time a thread enters the barrier, and then spin until the counter becomes greater or equal to N. Something like this:
void barrier(int N) {
static std::atomic<unsigned int> gCounter = 0;
gCounter++;
while((int)(gCounter - N) < 0) std::this_thread::yield();
}
If you don't have more threads than CPU cores and a short expected waiting time, you might want to remove the call to std::this_thread::yield(). This call is likely to be really expensive (more than a microsecond, I'd wager, but I haven't measured it). Depending on the size of your tasks, this may be significant.
If you want to do repeated barriers, just increment the N as you go:
unsigned int lastBarrier = 0;
while(1) {
switch(task.id()) {
case TaskID::Barrier:
barrier(lastBarrier += processCount);
break;
}
}
I would like to point out that in the solution given by #ComicSansMS ,
wait_count should be reset to 0 before executing cond_var.notify_all();
This is because when the barrier is called a second time the if condition will always fail, if wait_count is not reset to 0.
I have a main thread which creates another thread to perform some job.
main thread has a reference to that thread. How do I kill that thread forcefully some time later, even if thread is still operating. I cant find a proper function call that does that.
any help would be appreciable.
The original problem that I want to solve is I created a thread a thread to perform a CPU bound operation that may take 1 second to complete or may be 10 hours. I cant predict how much time it is going to take. If it is taking too much time, I want it to gracefully abandon the job when/ if I want. can I somehow communicate this message to that thread??
Assuming you're talking about a GLib.Thread, you can't. Even if you could, you probably wouldn't want to, since you would likely end up leaking a significant amount of memory.
What you're supposed to do is request that the thread kill itself. Generally this is done by using a variable to indicate whether or not it has been requested that the operation stop at the earliest opportunity. GLib.Cancellable is designed for this purpose, and it integrates with the I/O operations in GIO.
Example:
private static int main (string[] args) {
GLib.Cancellable cancellable = new GLib.Cancellable ();
new GLib.Thread<int> (null, () => {
try {
for ( int i = 0 ; i < 16 ; i++ ) {
cancellable.set_error_if_cancelled ();
GLib.debug ("%d", i);
GLib.Thread.usleep ((ulong) GLib.TimeSpan.MILLISECOND * 100);
}
return 0;
} catch ( GLib.Error e ) {
GLib.warning (e.message);
return -1;
}
});
GLib.Thread.usleep ((ulong) GLib.TimeSpan.SECOND);
cancellable.cancel ();
/* Make sure the thread has some time to cancel. In an application
* with a UI you probably wouldn't need to do this artificially,
* since the entire application probably wouldn't exit immediately
* after cancelling the thread (otherwise why bother cancelling the
* thread? Just exit the program) */
GLib.Thread.usleep ((ulong) GLib.TimeSpan.MILLISECOND * 150);
return 0;
}
I'm trying to implement a sort of thread pool whereby I keep threads in a FIFO and process a bunch of images. Unfortunately, for some reason my cond_wait doesn't always wake even though it's been signaled.
// Initialize the thread pool
for(i=0;i<numThreads;i++)
{
pthread_t *tmpthread = (pthread_t *) malloc(sizeof(pthread_t));
struct Node* newNode;
newNode=(struct Node *) malloc(sizeof(struct Node));
newNode->Thread = tmpthread;
newNode->Id = i;
newNode->threadParams = 0;
pthread_cond_init(&(newNode->cond),NULL);
pthread_mutex_init(&(newNode->mutx),NULL);
pthread_create( tmpthread, NULL, someprocess, (void*) newNode);
push_back(newNode, &threadPool);
}
for() //stuff here
{
//...stuff
pthread_mutex_lock(&queueMutex);
struct Node *tmpNode = pop_front(&threadPool);
pthread_mutex_unlock(&queueMutex);
if(tmpNode != 0)
{
pthread_mutex_lock(&(tmpNode->mutx));
pthread_cond_signal(&(tmpNode->cond)); // Not starting mutex sometimes?
pthread_mutex_unlock(&(tmpNode->mutx));
}
//...stuff
}
destroy_threads=1;
//loop through and signal all the threads again so they can exit.
//pthread_join here
}
void *someprocess(void* threadarg)
{
do
{
//...stuff
pthread_mutex_lock(&(threadNode->mutx));
pthread_cond_wait(&(threadNode->cond), &(threadNode->mutx));
// Doesn't always seem to resume here after signalled.
pthread_mutex_unlock(&(threadNode->mutx));
} while(!destroy_threads);
pthread_exit(NULL);
}
Am I missing something? It works about half of the time, so I would assume that I have a race somewhere, but the only thing I can think of is that I'm screwing up the mutexes? I read something about not signalling before locking or something, but I don't really understand what's going on.
Any suggestions?
Thanks!
Firstly, your example shows you locking the queueMutex around the call to pop_front, but not round push_back. Typically you would need to lock round both, unless you can guarantee that all the pushes happen-before all the pops.
Secondly, your call to pthread_cond_wait doesn't seem to have an associated predicate. Typical usage of condition variables is:
pthread_mutex_lock(&mtx);
while(!ready)
{
pthread_cond_wait(&cond,&mtx);
}
do_stuff();
pthread_mutex_unlock(&mtx);
In this example, ready is some variable that is set by another thread whilst that thread holds a lock on mtx.
If the waiting thread is not blocked in the pthread_cond_wait when pthread_cond_signal is called then the signal will be ignored. The associated ready variable allows you to handle this scenario, and also allows you to handle so-called spurious wake-ups where the call to pthread_cond_wait returns without a corresponding call to pthread_cond_signal from another thread.
I'm not sure, but I think you don't have to (you must not) lock the mutex in the thread pool before calling pthread_cond_signal(&(tmpNode->cond)); , otherwise, the thread which is woken up won't be able to lock the mutex as part of pthread_cond_wait(&(threadNode->cond), &(threadNode->mutx)); operation.