How to create monitoring thread? - linux

I have a question that came up while working on a ray tracer.
I split the image across multiple threads, and each thread processes its allocated part. The threads work as intended. I would like to monitor the progress in real time.
To do this, I created one more thread that monitors the current state.
Here is the monitoring pseudo-code:
/* Global var */
int cnt = 0; // count the number of rows processed
void* render_disp(void* arg){ // thread for monitoring current render-processing
    /* monitor the global variable and calculate the percentage to display */
    double result = 100.*cnt/(h-1);
    fprintf(stderr,"\r%3.2f%% of image is processed!", result);
}
void* process(void* arg){ // multiple threads work here
    // Rendering process
    for(........)
        pthread_mutex_lock(&lock);
        cnt++;
        pthread_mutex_unlock(&lock);
    for(........)
}
The pthread and mutex initialization is in main().
I expected this monitoring thread to keep displaying the current state, but it seems to run only once and then quit.
How do I change the code so that the monitoring function keeps running until the whole rendering is finished?
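One possible way to get that behaviour, sketched under assumptions: a thread function is entered only once per pthread_create(), so the monitor has to loop by itself until the counter reaches the row count. ROWS, WORKERS, rows_done() and render_with_monitor() are names made up for this sketch, the nanosleep() in process() stands in for the real rendering work, and the percentage uses ROWS as the denominator rather than h-1.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define ROWS 200     /* hypothetical image height */
#define WORKERS 4    /* hypothetical number of rendering threads */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int cnt = 0;  /* rows finished so far, guarded by lock */

/* Read the shared counter under the mutex. */
static int rows_done(void)
{
    pthread_mutex_lock(&lock);
    int n = cnt;
    pthread_mutex_unlock(&lock);
    return n;
}

/* Monitor: keep polling the counter until every row has been processed. */
static void *render_disp(void *arg)
{
    struct timespec poll_interval = {0, 10 * 1000 * 1000};  /* 10 ms */
    (void)arg;
    for (;;) {
        int n = rows_done();
        fprintf(stderr, "\r%6.2f%% of image is processed!", 100.0 * n / ROWS);
        if (n >= ROWS)
            break;
        nanosleep(&poll_interval, NULL);
    }
    fputc('\n', stderr);
    return NULL;
}

/* Worker: handles rows first, first + WORKERS, first + 2 * WORKERS, ... */
static void *process(void *arg)
{
    struct timespec fake_work = {0, 100 * 1000};  /* placeholder for rendering */
    int first = *(int *)arg;
    for (int row = first; row < ROWS; row += WORKERS) {
        nanosleep(&fake_work, NULL);
        pthread_mutex_lock(&lock);
        cnt++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Start the monitor and the workers, wait for all; returns rows processed. */
int render_with_monitor(void)
{
    pthread_t workers[WORKERS], monitor;
    int first_row[WORKERS];
    int rc = pthread_create(&monitor, NULL, render_disp, NULL);
    assert(rc == 0);
    for (int i = 0; i < WORKERS; i++) {
        first_row[i] = i;
        rc = pthread_create(&workers[i], NULL, process, &first_row[i]);
        assert(rc == 0);
    }
    (void)rc;
    for (int i = 0; i < WORKERS; i++)
        pthread_join(workers[i], NULL);
    pthread_join(monitor, NULL);
    return rows_done();
}
```

Polling every few milliseconds is a deliberate simplification here; a condition variable signalled by the workers would avoid the polling but needs more code.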

Related

Executing GTK functions from other threads

This question is about GTK and threads.
You may find it useful if your application crashes, freezes or you want to have a multithreaded GTK application.
Main Loop
In order to understand GTK you must understand 2 concepts.
All contemporary GUIs are single-threaded. They have one thread which processes events from the window system (such as button and mouse events).
Such a thread is called the main event loop, or main loop.
GTK is also single-threaded and not MT-safe. This means that you must not call any GTK functions from other threads, as doing so leads to undefined behaviour.
As Gtk documentation states,
Like all GUI toolkits, GTK+ uses an event-driven programming model. When the user is doing nothing, GTK+ sits in the “main loop” and waits for input. If the user performs some action - say, a mouse click - then the main loop “wakes up” and delivers an event to GTK+. GTK+ forwards the event to one or more widgets.
Gtk is event-based and asynchronous. It reacts to button clicks not in the exact moment of clicking, but a bit later.
It can be very roughly written like this (don't try this at home):
static list *pollable;
int main_loop (void)
{
    while (run)
    {
        lock_mutex();
        event_list = poll (pollable); // check whether there are some events to react to
        unlock_mutex();
        dispatch (event_list); // react to events.
    }
}
void schedule (gpointer function)
{
    lock_mutex();
    add_to_list (pollable, something);
    unlock_mutex();
}
I want a delayed action in my app
For example, hide a tooltip in several seconds or change button text.
Assuming your application is single-threaded, a call to sleep() is executed in the main loop.
sleep() suspends the calling thread for the specified number of seconds. No work is done in the meantime.
If that thread is the main thread, GTK cannot redraw or react to user interaction, and the application freezes.
What you should do instead is schedule a function call, using g_timeout_add or g_idle_add.
In the first case, poll() from the snippet above returns the event once the timeout has elapsed. In the latter case, it is returned when there are no events of higher priority.
static int count;
gboolean change_label (gpointer data)
{
    GtkButton *button = data;
    gchar *text = g_strdup_printf ("%i seconds left", --count);
    gtk_button_set_label (button, text); // show the new text
    g_free (text);                       // g_strdup_printf allocates the string
    if (count == 0)
        return G_SOURCE_REMOVE;
    return G_SOURCE_CONTINUE;
}
void button_clicked (GtkButton *button)
{
    gtk_button_set_label (button, "clicked");
    count = 5;
    g_timeout_add (1000, change_label, button); // interval is in milliseconds
}
Returning a value from the callback is very important. If you don't, the behaviour is undefined: your task may be called again or removed.
I have a long-running task
Long-running tasks are no different from calling sleep. While a thread is busy with such a task, it obviously can't do anything else. If that thread is the GUI thread, it can't redraw the interface. That's why you should move all long-running tasks to other threads. There is an exception, non-blocking IO, but it's outside the scope of this answer.
I have additional threads and my app crashes
As already mentioned, GTK is not MT-safe. You must not call GTK functions from other threads.
Instead, you must schedule execution. g_timeout_add and g_idle_add are MT-safe, unlike other GTK functions.
Those callbacks will be executed in the main loop. If the callback and the thread share resources, you must read and write them atomically or protect them with a mutex.
static int shared_value; // shared between thread_func and change_label
static GMutex mutex;
gboolean change_label (gpointer data)
{
    GtkButton *button = data;
    int value;
    gchar *text;
    // retrieve data
    g_mutex_lock (&mutex);
    value = shared_value;
    g_mutex_unlock (&mutex);
    // update widget
    text = g_strdup_printf ("Current data value: %i", value);
    gtk_button_set_label (button, text);
    g_free (text);
    return G_SOURCE_REMOVE;
}
gpointer thread_func (gpointer data)
{
    GtkButton *button = data;
    while (TRUE)
    {
        sleep (rand_time);
        g_mutex_lock (&mutex);
        ++shared_value;
        g_mutex_unlock (&mutex);
        g_idle_add (change_label, button);
    }
}
Make sure mutexes are held for as short a time as possible. Imagine you lock a mutex in another thread and do some IO: the main loop will be stuck until the mutex is released. There is g_mutex_trylock(), which returns immediately, but it can bring additional synchronization problems because you can't guarantee that the mutex will be unlocked when the main loop tries to lock it.
Follow up: but Python is single-threaded and GIL et cetera?
You can think of Python as a multi-threaded application running on a single-core machine.
You never know when the threads will be switched. You call a GTK function, but you don't know which state the main loop is in. Maybe it freed resources just a moment before. Always schedule.
What is not discussed and further reading
Detailed documentation on glib main loop can be found here
GSource as a more low-level primitive.
GTask

How to pthread_kill synchronously?

I am starting a bunch of joinable worker threads, and main() waits for them to complete with pthread_join(). However, a user may hit CTRL+C on the terminal before the worker threads have completed their task. My understanding is that any thread could get the signal, so all my worker threads call pthread_sigmask() on start-up and block SIGINT (the CTRL+C signal). This causes the signal to go to the other threads and main() instead. This way I know that at least main() will definitely get the signal.
I have defined a signal handler function on main() so that main() gets the signal and can kill all the worker threads and free their resources from one place. The problem is that this happens asynchronously. I call pthread_kill() from main() and then try to free() resources the worker thread is using and it's still running because the signal is dispatched asynchronously.
If I call pthread_kill(SIGTERM, ...) from main() to kill the thread main() gets killed too and do_thread_cleanup(i) is never called:
int main () {
    signal (SIGINT, signal_handler);
    for (i = 0; i < num_thd; i++) {
        pthread_create(thread_init, ...);
    }
    for (i = 0; i < num_thd; i++) {
        pthread_join(...);
    }
    return 0;
}
void signal_handler(int signal) {
    for (i = 0; i < num_thd; i++) {
        pthread_kill(pthread_t, SIGINT);
        pthread_join(pthread_t, ...);
        do_thread_cleanup(i); // Calls functions like free() and close()
    }
}
void thread_init() {
    sigset_t sigset;
    sigemptyset(&sigset);
    sigaddset(&sigset, SIGINT);
    pthread_sigmask(SIG_BLOCK, &sigset, NULL);
    do_stuff_in_a_loop();
}
How can I send SIGKILL to a thread without main() receiving that signal and killing itself? Alternatively, how can I wait for the thread to exit?
I have read the other SO posts; they talk about using pthread_cleanup_push() and pthread_cleanup_pop(), but that doesn't allow me to check from one central place that all threads are killed and their resources released.
The short answer is that you can’t; but you can do something close.
free(), malloc() and thus all paths leading to them are not async-signal-safe, so you can’t call them from a signal handler. These functions rarely detect that they have been re-entered from a signal handler, so unpredictable behaviour is the likely result.
A good pattern is to have the main thread notice signals have occurred, and perform the processing for them within it. You can do this, safely, by having the main thread employ a pthread_cond_t,pthread_mutex_t pair to watch a counter, and have the signal handler use the same pair to update the counter and notify the change.
Thus the main thread can treat signals as simple inputs to transition between states, such as Normal/SIGINT -> Quitting, Quitting/SIGALRM -> HardStop.
free() is probably a bit heavy-handed, as it can cause your program to make sporadic memory references, which may be exploitable as an attack surface.
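A minimal sketch of the "let the main thread do the processing" idea, with one deliberate swap: pthread_cond_signal() is not on the POSIX list of async-signal-safe functions, so instead of the condition-variable pair this version has the handler set only an atomic flag, which the workers poll. It assumes atomic_int is lock-free on the target platform. NUM_THD, buffers, start_workers() and join_and_cleanup() are hypothetical names, and the nanosleep() stands in for do_stuff_in_a_loop().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define NUM_THD 4                      /* hypothetical worker count */

static atomic_int stop_requested;      /* set by the handler, polled by workers */
static char *buffers[NUM_THD];         /* per-worker resource, freed by main */

/* The handler only records that SIGINT arrived; no free(), no locking. */
static void on_sigint(int sig)
{
    (void)sig;
    atomic_store(&stop_requested, 1);
}

static void *worker(void *arg)
{
    struct timespec nap = {0, 1000 * 1000};   /* 1 ms of placeholder work */
    (void)arg;
    while (!atomic_load(&stop_requested))
        nanosleep(&nap, NULL);
    return NULL;
}

/* Install the handler and start the workers; returns 0 on success. */
int start_workers(pthread_t *tids)
{
    struct sigaction sa;
    sigset_t sigint_only;
    int rc, failed = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;

    /* Block SIGINT while creating the workers: they inherit the mask,
     * so the signal can only be delivered to the main thread. */
    sigemptyset(&sigint_only);
    sigaddset(&sigint_only, SIGINT);
    rc = pthread_sigmask(SIG_BLOCK, &sigint_only, NULL);
    assert(rc == 0);
    for (int i = 0; i < NUM_THD && !failed; i++) {
        buffers[i] = malloc(64);
        if (buffers[i] == NULL ||
            pthread_create(&tids[i], NULL, worker, NULL) != 0)
            failed = 1;
    }
    rc = pthread_sigmask(SIG_UNBLOCK, &sigint_only, NULL);
    assert(rc == 0);
    (void)rc;
    return failed ? -1 : 0;
}

/* Called from main, never from the handler: join, then free in one place. */
int join_and_cleanup(pthread_t *tids)
{
    int cleaned = 0;
    for (int i = 0; i < NUM_THD; i++) {
        pthread_join(tids[i], NULL);
        free(buffers[i]);
        buffers[i] = NULL;
        cleaned++;
    }
    return cleaned;
}
```

In main(), the usual pthread_join() loop then doubles as the "wait for the thread to exit" step: once CTRL+C sets the flag, the workers return on their own and the cleanup runs only after each join has completed.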

How to queue the same workqueue work multiple times in Linux?

I see that when the schedule_work function is invoked it will not put the work task into the queue if it is already queued. However I want to queue the same task to be run multiple times even if it is already on the queue. How can I do this?
From workqueue.h:
/**
 * schedule_work - put work task in global workqueue
 * @work: job to be done
 *
 * Returns %false if @work was already on the kernel-global workqueue and
 * %true otherwise.
 *
 * This puts a job in the kernel-global workqueue if it was not already
 * queued and leaves it in the same position on the kernel-global
 * workqueue otherwise.
 */
static inline bool schedule_work(struct work_struct *work)
A workqueue expects every work structure to represent a single "task" that needs to run once.
So the simplest way to run a task several times is to create a new work structure each time.
Alternatively, since repeating work while it is still running is unusual for a workqueue, you may create your own kernel thread that executes some function repeatedly:
DECLARE_WAIT_QUEUE_HEAD(repeat_wq); // Kernel thread will wait on this wait queue.
int n_works = 0; // Number of work requests to process.
// Thread function
int repeat_work(void* unused)
{
    spin_lock_irq(&repeat_wq.lock); // Reuse the wait queue's spinlock for our needs
    while(1) {
        // Wait until there is a work request or the thread should be stopped
        wait_event_interruptible_locked_irq(repeat_wq,
            n_works || kthread_should_stop());
        if(kthread_should_stop()) break;
        spin_unlock_irq(&repeat_wq.lock);
        <do the work>
        // Reacquire the lock to decrement the count and recheck the condition
        spin_lock_irq(&repeat_wq.lock);
        n_works--;
    }
    // Finally release the lock
    spin_unlock_irq(&repeat_wq.lock);
    return 0;
}
// Request new work.
void add_work(void)
{
    unsigned long flags;
    spin_lock_irqsave(&repeat_wq.lock, flags);
    n_works++;
    wake_up_locked(&repeat_wq);
    spin_unlock_irqrestore(&repeat_wq.lock, flags);
}
Workqueues are kernel threads too, with a specific thread function kthread_worker_fn().

lio_listio: How to wait until all requests complete?

In my C++ program, I use the lio_listio call to send many (up to a few hundred) write requests at once. After that, I do some calculations, and when I'm done I need to wait for all outstanding requests to finish before I can submit the next batch of requests. How can I do this?
Right now, I am just calling aio_suspend in a loop, with one request per call, but this seems ugly. It looks like I should use the struct sigevent *sevp argument to lio_listio. My current guess is that I should do something like this:
In the main thread, create a mutex and lock it just before the call to lio_listio.
In the call to lio_listio, specify a notification function / signal handler that unlocks this mutex.
This should give me the desired behavior, but will it work reliably? Is it allowed to manipulate mutexes from the signal handler context? I read that pthread mutexes can provide error detection and fail if you try to lock them again from the same thread or unlock them from a different thread, yet this solution relies on deadlocking.
Example code, using a signal handler:
void notify(int, siginfo_t *info, void *) {
    pthread_mutex_unlock((pthread_mutex_t *) info->si_value.sival_ptr);
}
void output() {
    pthread_mutex_t iomutex = PTHREAD_MUTEX_INITIALIZER;
    struct sigaction act;
    memset(&act, 0, sizeof(struct sigaction));
    act.sa_sigaction = &notify;
    act.sa_flags = SA_SIGINFO;
    sigaction(SIGUSR1, &act, NULL);
    for (...) {
        pthread_mutex_lock(&iomutex);
        // do some calculations here...
        struct aiocb *cblist[];
        int cbno;
        // set up the aio request list - omitted
        struct sigevent sev;
        memset(&sev, 0, sizeof(struct sigevent));
        sev.sigev_notify = SIGEV_SIGNAL;
        sev.sigev_signo = SIGUSR1;
        sev.sigev_value.sival_ptr = &iomutex;
        lio_listio(LIO_NOWAIT, cblist, cbno, &sev);
    }
    // ensure that the last queued operation completes
    // before this function returns
    pthread_mutex_lock(&iomutex);
    pthread_mutex_unlock(&iomutex);
}
Example code, using a notification function - possibly less efficient, since an extra thread is created:
void output() {
    pthread_mutex_t iomutex = PTHREAD_MUTEX_INITIALIZER;
    for (...) {
        pthread_mutex_lock(&iomutex);
        // do some calculations here...
        struct aiocb *cblist[];
        int cbno;
        // set up the aio request list - omitted
        struct sigevent sev;
        memset(&sev, 0, sizeof(struct sigevent));
        sev.sigev_notify = SIGEV_THREAD;
        sev.sigev_notify_function = &pthread_mutex_unlock;
        sev.sigev_value.sival_ptr = &iomutex;
        lio_listio(LIO_NOWAIT, cblist, cbno, &sev);
    }
    // ensure that the last queued operation completes
    // before this function returns
    pthread_mutex_lock(&iomutex);
    pthread_mutex_unlock(&iomutex);
}
If you set the sigevent argument in the lio_listio() call, you will be notified with a signal (or function call) when all the jobs in that one particular call complete. You would still need to:
wait until you receive as many notifications as you have made lio_listio() calls, to know when they're all done.
use some safe mechanism to communicate from your signal handler to your main thread, probably via a global variable (to be portable).
If you're on Linux, I would recommend tying an eventfd to your sigevent instead and waiting on that. That's a lot more flexible since you don't need to involve signal handlers. On BSD (but not Mac OS), you can wait on aiocbs using kqueue, and on Solaris/illumos you can use a port to get notified of aiocb completions.
Here's an example of how to use eventfds on linux:
As a side note, I would use caution when issuing jobs with lio_listio. You're not guaranteed that it supports taking more than 2 jobs, and some systems have very low limits of how many you can issue at a time. Default on Mac OS for instance is 16. This limit may be defined as the AIO_LISTIO_MAX macro, but it isn't necessarily. In which case you need to call sysconf(_SC_AIO_LISTIO_MAX) (see docs). For details, see the lio_listio documentation.
You should at least check error conditions from your lio_listio() call.
Also, your solution of using a mutex is sub-optimal, since you synchronize on every iteration of the for loop and so run just one batch at a time (unless it's a recursive mutex, but in that case its state could be corrupted if your signal handler happens to land on a different thread).
A more appropriate primitive may be a semaphore, which is released in the handler, and then (after your for loop) acquired the same number of times as you looped, calling lio_listio(). But, I would still recommend an eventfd if it's OK to be linux specific.
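The semaphore variant could look roughly like the sketch below. It uses a SIGEV_THREAD notification function rather than a signal handler (sem_post() is async-signal-safe, so a handler would also be possible), and it keeps the control blocks in static storage so they outlive the requests. BATCHES, PER_BATCH, CHUNK, batch_done() and write_batches() are made-up names, the payload is dummy data, and on older glibc versions the program may need to be linked with -lrt -pthread.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BATCHES 3     /* hypothetical: number of lio_listio() calls */
#define PER_BATCH 4   /* hypothetical: requests per call */
#define CHUNK 16      /* hypothetical: bytes per request */

/* Runs once per batch, after every request of that batch has completed. */
static void batch_done(union sigval sv)
{
    sem_post((sem_t *)sv.sival_ptr);
}

/* Queue all batches on fd, then wait for them; returns bytes written or -1. */
long write_batches(int fd)
{
    /* Static storage: control blocks and buffers must outlive the requests. */
    static struct aiocb cbs[BATCHES][PER_BATCH];
    static struct aiocb *list[BATCHES][PER_BATCH];
    static char data[BATCHES][PER_BATCH][CHUNK];
    static sem_t done;
    int queued = 0;
    long total = 0;

    assert(fd >= 0);
    if (sem_init(&done, 0, 0) != 0)
        return -1;
    memset(cbs, 0, sizeof cbs);
    memset(data, 'x', sizeof data);

    for (int b = 0; b < BATCHES; b++) {
        struct sigevent sev;
        for (int i = 0; i < PER_BATCH; i++) {
            struct aiocb *cb = &cbs[b][i];
            cb->aio_fildes = fd;
            cb->aio_buf = data[b][i];
            cb->aio_nbytes = CHUNK;
            cb->aio_offset = (off_t)(b * PER_BATCH + i) * CHUNK;
            cb->aio_lio_opcode = LIO_WRITE;
            cb->aio_sigevent.sigev_notify = SIGEV_NONE; /* notify per batch only */
            list[b][i] = cb;
        }
        memset(&sev, 0, sizeof sev);
        sev.sigev_notify = SIGEV_THREAD;
        sev.sigev_notify_function = batch_done;
        sev.sigev_value.sival_ptr = &done;
        if (lio_listio(LIO_NOWAIT, list[b], PER_BATCH, &sev) != 0)
            break;                    /* e.g. EAGAIN: stop queuing */
        queued++;
        /* ... calculations for the next batch would go here ... */
    }

    /* Acquire the semaphore once per successfully queued batch. */
    for (int b = 0; b < queued; b++)
        while (sem_wait(&done) == -1 && errno == EINTR)
            ;

    for (int b = 0; b < queued; b++)
        for (int i = 0; i < PER_BATCH; i++) {
            ssize_t n;
            if (aio_error(&cbs[b][i]) != 0)
                return -1;
            n = aio_return(&cbs[b][i]);
            if (n < 0)
                return -1;
            total += n;
        }
    return queued == BATCHES ? total : -1;
}
```

Unlike the mutex version, nothing here serializes the batches: all of them can be in flight while the calculations run, and the only blocking happens in the final sem_wait() loop.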

pthread_cond_wait never unblocking - thread pools

I'm trying to implement a sort of thread pool whereby I keep threads in a FIFO and process a bunch of images. Unfortunately, for some reason my cond_wait doesn't always wake even though it's been signaled.
// Initialize the thread pool
for(i=0;i<numThreads;i++)
{
    pthread_t *tmpthread = (pthread_t *) malloc(sizeof(pthread_t));
    struct Node* newNode;
    newNode=(struct Node *) malloc(sizeof(struct Node));
    newNode->Thread = tmpthread;
    newNode->Id = i;
    newNode->threadParams = 0;
    pthread_cond_init(&(newNode->cond),NULL);
    pthread_mutex_init(&(newNode->mutx),NULL);
    pthread_create( tmpthread, NULL, someprocess, (void*) newNode);
    push_back(newNode, &threadPool);
}
for() //stuff here
{
    //...stuff
    pthread_mutex_lock(&queueMutex);
    struct Node *tmpNode = pop_front(&threadPool);
    pthread_mutex_unlock(&queueMutex);
    if(tmpNode != 0)
    {
        pthread_mutex_lock(&(tmpNode->mutx));
        pthread_cond_signal(&(tmpNode->cond)); // Not starting mutex sometimes?
        pthread_mutex_unlock(&(tmpNode->mutx));
    }
    //...stuff
}
destroy_threads=1;
//loop through and signal all the threads again so they can exit.
//pthread_join here
}
void *someprocess(void* threadarg)
{
    do
    {
        //...stuff
        pthread_mutex_lock(&(threadNode->mutx));
        pthread_cond_wait(&(threadNode->cond), &(threadNode->mutx));
        // Doesn't always seem to resume here after signalled.
        pthread_mutex_unlock(&(threadNode->mutx));
    } while(!destroy_threads);
    pthread_exit(NULL);
}
Am I missing something? It works about half of the time, so I would assume that I have a race somewhere, but the only thing I can think of is that I'm screwing up the mutexes? I read something about not signalling before locking or something, but I don't really understand what's going on.
Any suggestions?
Thanks!
Firstly, your example shows you locking the queueMutex around the call to pop_front, but not round push_back. Typically you would need to lock round both, unless you can guarantee that all the pushes happen-before all the pops.
Secondly, your call to pthread_cond_wait doesn't seem to have an associated predicate. Typical usage of condition variables is:
pthread_mutex_lock(&mtx);
while(!ready)
{
    pthread_cond_wait(&cond,&mtx);
}
do_stuff();
pthread_mutex_unlock(&mtx);
In this example, ready is some variable that is set by another thread whilst that thread holds a lock on mtx.
If the waiting thread is not blocked in the pthread_cond_wait when pthread_cond_signal is called then the signal will be ignored. The associated ready variable allows you to handle this scenario, and also allows you to handle so-called spurious wake-ups where the call to pthread_cond_wait returns without a corresponding call to pthread_cond_signal from another thread.
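To make the predicate idea concrete for a pool worker, here is a small sketch. The struct slot, its pending and quit fields, and the function names are all hypothetical and only loosely modelled on the Node from the question. Because the worker re-checks pending under the mutex, a signal sent before it reaches pthread_cond_wait() is not lost, and a spurious wake-up just loops back into the wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-worker slot. */
struct slot {
    pthread_mutex_t mtx;
    pthread_cond_t  cond;
    int  pending;     /* predicate: jobs handed over but not yet taken */
    bool quit;        /* predicate: the pool is shutting down */
    int  jobs_done;
};

static void *pool_worker(void *arg)
{
    struct slot *s = arg;
    pthread_mutex_lock(&s->mtx);
    for (;;) {
        /* Wait only while there is really nothing to do. */
        while (s->pending == 0 && !s->quit)
            pthread_cond_wait(&s->cond, &s->mtx);
        if (s->pending == 0)
            break;                      /* quit requested and queue drained */
        s->pending--;
        pthread_mutex_unlock(&s->mtx);
        /* ... the real work would run here, outside the lock ... */
        pthread_mutex_lock(&s->mtx);
        s->jobs_done++;
    }
    pthread_mutex_unlock(&s->mtx);
    return NULL;
}

/* Hand one job to the worker: change the predicate, then signal. */
static void submit(struct slot *s)
{
    pthread_mutex_lock(&s->mtx);
    s->pending++;
    pthread_cond_signal(&s->cond);
    pthread_mutex_unlock(&s->mtx);
}

/* Start one worker, submit `jobs` jobs, shut down; returns jobs completed. */
int run_pool_demo(int jobs)
{
    struct slot s;
    pthread_t tid;
    int rc, done;

    pthread_mutex_init(&s.mtx, NULL);
    pthread_cond_init(&s.cond, NULL);
    s.pending = 0;
    s.quit = false;
    s.jobs_done = 0;

    rc = pthread_create(&tid, NULL, pool_worker, &s);
    assert(rc == 0);
    (void)rc;

    for (int i = 0; i < jobs; i++)
        submit(&s);                     /* may run before the worker waits */

    pthread_mutex_lock(&s.mtx);
    s.quit = true;
    pthread_cond_signal(&s.cond);
    pthread_mutex_unlock(&s.mtx);

    pthread_join(tid, NULL);
    done = s.jobs_done;
    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.mtx);
    return done;
}
```

The same shape handles shutdown: setting quit under the mutex and signalling replaces the global destroy_threads flag, and the worker still finishes any jobs that were queued before the quit request.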
I'm not sure, but I think you don't have to (you must not) lock the mutex in the thread pool before calling pthread_cond_signal(&(tmpNode->cond)); , otherwise, the thread which is woken up won't be able to lock the mutex as part of pthread_cond_wait(&(threadNode->cond), &(threadNode->mutx)); operation.

Resources