kthread stopped without running - multithreading

If I create a kernel thread with kthread_run then kthread_stop it immediately, the kernel thread might be stopped without running. I checked the source code of kthread_run and kthread_stop in Linux-5.4.73
/**
* kthread_run - create and wake a thread.
* @threadfn: the function to run until signal_pending(current).
* @data: data ptr for @threadfn.
* @namefmt: printf-style name for the thread.
*
* Description: Convenient wrapper for kthread_create() followed by
* wake_up_process(). Returns the kthread or ERR_PTR(-ENOMEM).
*/
#define kthread_run(threadfn, data, namefmt, ...) \
({ \
struct task_struct *__k \
= kthread_create(threadfn, data, namefmt, ## __VA_ARGS__); \
if (!IS_ERR(__k)) \
wake_up_process(__k); \
__k; \
})
/**
* kthread_stop - stop a thread created by kthread_create().
* @k: thread created by kthread_create().
*
* Sets kthread_should_stop() for @k to return true, wakes it, and
* waits for it to exit. This can also be called after kthread_create()
* instead of calling wake_up_process(): the thread will exit without
* calling threadfn().
*
* If threadfn() may call do_exit() itself, the caller must ensure
* task_struct can't go away.
*
* Returns the result of threadfn(), or %-EINTR if wake_up_process()
* was never called.
*/
int kthread_stop(struct task_struct *k)
{
struct kthread *kthread;
int ret;
trace_sched_kthread_stop(k);
get_task_struct(k);
kthread = to_kthread(k);
set_bit(KTHREAD_SHOULD_STOP, &kthread->flags);
kthread_unpark(k);
wake_up_process(k);
wait_for_completion(&kthread->exited);
ret = k->exit_code;
put_task_struct(k);
trace_sched_kthread_stop_ret(ret);
return ret;
}
It seems that the kernel thread should have been woken up before kthread_stop returns, yet sometimes it never runs. I am really confused; could anybody help me?
My test code is as follows.
test1.c
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/delay.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/semaphore.h>
#include <linux/spinlock.h>
MODULE_LICENSE("GPL");
static struct task_struct *t1;
static struct task_struct *t2;
static struct task_struct *t3;
static int func(void *__para)
{
const char *msg = (const char *)__para;
printk("%s %s\n", __func__, msg);
/* Wait for kthread_stop */
set_current_state(TASK_INTERRUPTIBLE);
while (!kthread_should_stop()) {
schedule();
set_current_state(TASK_INTERRUPTIBLE);
}
set_current_state(TASK_RUNNING);
printk("%s %s return\n", __func__, msg);
return 0;
}
static int __init start_init(void)
{
printk(KERN_INFO "Thread Creating...\n");
t1 = kthread_run(func, "t1", "t1");
if (IS_ERR(t1)) {
WARN_ON(1);
return 0;
}
t2 = kthread_run(func, "t2", "t2");
if (IS_ERR(t2)) {
WARN_ON(1);
return 0;
}
printk("Stopping t2\n");
kthread_stop(t2);
printk("t2 stopped\n");
t3 = kthread_run(func, "t3", "t3");
if (IS_ERR(t3)) {
WARN_ON(1);
return 0;
}
return 0;
}
static void __exit end_exit(void)
{
printk(KERN_INFO "Cleaning Up...\n");
if (IS_ERR(t1))
return;
printk("Stopping t1\n");
kthread_stop(t1);
printk("t1 stopped\n");
printk("Stopping t3\n");
kthread_stop(t3);
printk("t3 stopped\n");
}
module_init(start_init)
module_exit(end_exit)
Makefile
obj-m += test1.o
all:
$(MAKE) -C /lib/modules/$(shell uname -r)/build M=`pwd`
clean:
$(MAKE) -C /lib/modules/$(shell uname -r)/build M=`pwd` clean
The command to run
sudo insmod test1.ko
sudo rmmod test1
The dmesg of the first run
[10914.046211] Thread Creating...
[10914.046515] func t1
[10914.046530] Stopping t2
[10914.046531] func t2
[10914.046533] func t2 return
[10914.046538] t2 stopped
[10914.046555] func t3
[10938.895544] Cleaning Up...
[10938.895545] Stopping t1
[10938.895552] func t1 return
[10938.895561] t1 stopped
[10938.895562] Stopping t3
[10938.895566] func t3 return
[10938.895587] t3 stopped
This time t2 executed before it was stopped.
The dmesg of the second run
[10940.775771] Thread Creating...
[10940.776109] func t1
[10940.776126] Stopping t2
[10940.776138] t2 stopped
[10940.776162] func t3
[10956.375606] Cleaning Up...
[10956.375607] Stopping t1
[10956.375613] func t1 return
[10956.375674] t1 stopped
[10956.375674] Stopping t3
[10956.375678] func t3 return
[10956.375697] t3 stopped
This time t2 was stopped without running, while t1 and t3, which are not stopped immediately, do run before being stopped.

(This answer corresponds to Linux kernel version 5.4.)
The newly created kernel thread task executes the function kthread in "kernel/kthread.c". If all is well kthread calls the thread function referred to by the kthread_run's (or kthread_create's) threadfn parameter. However, the final test before calling the threadfn function pointer is to check the kernel thread's KTHREAD_SHOULD_STOP bit. If the KTHREAD_SHOULD_STOP bit is set, the threadfn function pointer will not be called and the new kernel thread task will call do_exit with the exit code -EINTR. The relevant piece of code at the end of function kthread is as follows:
ret = -EINTR;
if (!test_bit(KTHREAD_SHOULD_STOP, &self->flags)) {
cgroup_kthread_ready();
__kthread_parkme(self);
ret = threadfn(data);
}
do_exit(ret);
Although kthread_run wakes up the newly created kernel thread task before returning, it is possible for the kthread_stop function to be called and set the kernel thread's KTHREAD_SHOULD_STOP bit before the kernel thread reaches the final check of its KTHREAD_SHOULD_STOP bit prior to calling the threadfn function pointer. In that case, the kernel thread will exit without the threadfn function pointer ever being called.
OP's original code can be changed to print the return value of kthread_stop as follows:
int exit_code;
/* ... */
printk("Stopping t2\n");
exit_code = kthread_stop(t2);
printk("t2 stopped, exit code %d\n", exit_code);
Then it should show that thread t2 exited with code -EINTR (probably -4) if it was stopped before the entry function func was called.
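The ordering described above can be illustrated with a small user-space model. This is only a sketch: struct model_kthread, model_request_stop, model_kthread_entry, count_calls and model_run are hypothetical names invented for illustration, not kernel APIs, and the model runs sequentially so that the two possible orderings can be forced by hand.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's struct kthread. */
struct model_kthread {
    atomic_bool should_stop;        /* plays the role of KTHREAD_SHOULD_STOP */
    int (*threadfn)(void *data);
    void *data;
};

/* Models the flag-setting half of kthread_stop(). */
static void model_request_stop(struct model_kthread *k)
{
    atomic_store(&k->should_stop, true);
}

/* Models the tail of kthread(): the flag is tested once, right before
 * threadfn would be called. If it is already set, threadfn is skipped. */
static int model_kthread_entry(struct model_kthread *k)
{
    int ret = -EINTR;

    assert(k->threadfn != NULL);
    if (!atomic_load(&k->should_stop))
        ret = k->threadfn(k->data);
    return ret;
}

/* Example thread function: counts how often it was entered. */
static int count_calls(void *data)
{
    ++*(int *)data;
    return 0;
}

/* Runs one scenario and returns the exit code a stopper would observe.
 * stop_first == true forces the "stop wins the race" ordering. */
static int model_run(bool stop_first, int *calls)
{
    struct model_kthread k = { .threadfn = count_calls, .data = calls };

    atomic_init(&k.should_stop, false);
    if (stop_first)
        model_request_stop(&k);
    return model_kthread_entry(&k);
}
```

With stop_first set, model_run returns -EINTR and the call counter stays untouched, which corresponds to the second dmesg log; otherwise the thread function runs and its own return value is reported.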

Related

Calling sem_post before sem_wait in multithreaded environment

The behavior of the sem_post() function is not clear for a binary semaphore based implementation.
What happens when you call sem_wait() after calling sem_post()?
Will it work?
Code example :
Thread 1 :
do_something_critical()
sem_post();
Thread 2 :
sem_wait()
Proceed()
Here, if sem_post() somehow gets called before the call to sem_wait(),
will it work? Or must sem_wait() be called before sem_post()?
sem_post() merely increments the semaphore and, if a thread is blocked waiting on it, wakes that thread up. Otherwise it does nothing more.
sem_wait() merely decrements the semaphore. The caller blocks only if the current value of the semaphore is 0, and then only until another thread posts. So calling sem_post() first is fine: the later sem_wait() returns immediately.
Here is an example program where the main thread initializes a semaphore to 0 and calls sem_trywait() to verify that the semaphore is busy (i.e. value is 0). Then, it calls sem_post() to release the semaphore (i.e. value is 1) before creating a thread. The thread calls sem_wait() (this decrements the semaphore to 0) and returns. The main thread waits for the end of the thread and verifies that the semaphore is 0 with a call to sem_trywait():
#include <pthread.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <semaphore.h>
#include <stdio.h>
#include <errno.h>
static sem_t *sem;
void *thd_entry(void *p)
{
int rc;
printf("Thread is starting...\n");
// This decrements the semaphore
rc = sem_wait(sem);
if (0 != rc) {
perror("sem_wait()");
return NULL;
}
printf("Thread is exiting...\n");
return NULL;
}
int main(int ac, char *av[])
{
int rc;
pthread_t thd;
// Create a semaphore with an initial value set to 0
sem = sem_open("/example", O_CREAT|O_RDWR, 0777, 0);
if (sem == SEM_FAILED) {
perror("sem_open()");
return 1;
}
// After creation the value of the semaphore is 0
rc = sem_trywait(sem);
if (-1 == rc) {
if (errno == EAGAIN) {
printf("Semaphore is busy (i.e. value is 0)\n");
} else {
perror("sem_trywait()");
return 1;
}
}
// Increment the semaphore
rc = sem_post(sem);
if (0 != rc) {
perror("sem_post()");
return 1;
}
// Create a thread
rc = pthread_create(&thd, NULL, thd_entry, 0);
if (0 != rc) {
errno = rc;
perror("pthread_create()");
return 1;
}
rc = pthread_join(thd, NULL);
if (0 != rc) {
errno = rc;
perror("pthread_join()");
return 1;
}
// The semaphore is 0 as the thread decremented it
rc = sem_trywait(sem);
if (-1 == rc) {
if (errno == EAGAIN) {
printf("Semaphore is busy (i.e. value is 0)\n");
} else {
perror("sem_trywait()");
return 1;
}
}
return 0;
}
Here is a test run:
$ ls -l /dev/shm
total 0
$ gcc sema.c -o sema -lpthread
$ ./sema
Semaphore is busy (i.e. value is 0)
Thread is starting...
Thread is exiting...
Semaphore is busy (i.e. value is 0)
$ ls -l /dev/shm
total 4
-rwxrwxr-x 1 xxxxx xxxxx 32 janv. 5 16:24 sem.example
$ rm /dev/shm/sem.example
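The same point can be checked without a named semaphore. The following is a minimal sketch, assuming a platform that supports unnamed semaphores via sem_init() (Linux does); post_before_wait is a hypothetical helper name. It posts first, waits afterwards, and reads the remaining count with sem_getvalue().

```c
#include <assert.h>
#include <semaphore.h>

/* Posts `posts` times before any wait, then waits `waits` times.
 * Returns the remaining semaphore value, or -1 on error.
 * Requires waits <= posts, so no sem_wait() call can block. */
static int post_before_wait(int posts, int waits)
{
    sem_t sem;
    int value = -1;
    int i;

    assert(waits <= posts);
    if (sem_init(&sem, 0, 0) != 0)      /* process-private, initial value 0 */
        return -1;

    for (i = 0; i < posts; i++)
        if (sem_post(&sem) != 0)        /* each post increments the count */
            goto out;
    for (i = 0; i < waits; i++)
        if (sem_wait(&sem) != 0)        /* count > 0, so this returns at once */
            goto out;
    if (sem_getvalue(&sem, &value) != 0)
        value = -1;
out:
    sem_destroy(&sem);
    return value;
}
```

Because a POSIX semaphore is a counter rather than a strictly binary flag, several posts before a wait accumulate: post_before_wait(3, 1) leaves a value of 2. Nothing is left behind in /dev/shm, since the semaphore is unnamed.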

How to cancel a thread from another thread (Glib threads)?

I am developing an application library using GTK and the functions for threads in GLib. I have a thread (from now on will be called thread A) that is created when I hit an "ok" button in a certain graphical window. Thread A starts doing some heavy tasks. Another button named "cancel" is available to stop and finish thread A at any moment.
My aim is to code a function for the thread created when I hit the "cancel" button (thread B) that has the ability to end the thread A.
I create thread A with the function g_thread_create. However I cannot find any function similar to g_thread_cancel to stop thread A using thread B. Is this possible or cannot be done?
Thank you so much for any kind of information provided.
You might want to consider using GTask to run your task in a thread, rather than using a manually-created thread. If you use g_task_run_in_thread(), the operation will run in a separate thread automatically.
GTask is integrated with GCancellable, so to cancel the operation you would call g_cancellable_cancel() in the callback from your ‘Cancel’ button.
As OznOg says, you should treat the GCancellable as a way of gently (and thread-safely) telling your task that it should cancel. Depending on how your long-running task is written, you could either check g_cancellable_is_cancelled() once per loop iteration, or you could add the GSource from g_cancellable_source_new() to a poll loop in your task.
The advice about using threads with GLib is probably also relevant here.
I have written code that cancels one thread from another, both of them created by the main thread. The code works correctly according to my tests:
#include <pthread.h>
#include <stdio.h>
/* these variables are references to the first and second threads */
pthread_t inc_x_thread, inc_y_thread;
/* this function is run by the first thread */
void *inc_x(void *x_void_ptr)
{
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);
/* increment x to 100000000 */
int *x_ptr = (int *)x_void_ptr;
while(++(*x_ptr) < 100000000);
printf("x increment finished\n");
/* the function must return something - NULL will do */
return NULL;
}
/* this function is run by the second thread */
void *inc_y(void *x_void_ptr)
{
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);
/* increment y to 100 */
int *x_ptr = (int *)x_void_ptr;
pthread_cancel(inc_x_thread);
while(++(*x_ptr) < 100);
printf("y increment finished\n");
return NULL;
}
/* this is the main thread */
int main()
{
int x = 0, y = 0;
void *res;
/* show the initial values of x and y */
printf("x: %d, y: %d\n", x, y);
/* create a first thread */
if(pthread_create(&inc_x_thread, NULL, inc_x, &x)) {
fprintf(stderr, "Error creating thread\n");
return 1;
}
/* create a second thread */
if(pthread_create(&inc_y_thread, NULL, inc_y, &y)) {
fprintf(stderr, "Error creating thread\n");
return 1;
}
/* wait for the first thread to finish */
if(pthread_join(inc_x_thread, &res)) {
fprintf(stderr, "Error joining thread\n");
return 2;
}
if (res == PTHREAD_CANCELED)
printf(" thread was canceled\n");
else
printf(" thread wasn't canceled\n");
/* wait for the second thread to finish */
if(pthread_join(inc_y_thread, &res)) {
fprintf(stderr, "Error joining thread\n");
return 2;
}
if (res == PTHREAD_CANCELED)
printf(" thread was canceled\n");
else
printf(" thread wasn't canceled\n");
/* show the results*/
printf("x: %d, y: %d\n", x, y);
return 0;
}
You can compile the code by using: gcc example.c -lpthread
However, as OznOg and Philip Withnall have said, this is not the correct way of cancelling a thread. It is only a quick way of doing it that might not work in some specific situations. A better and safer way is to gently ask the thread to stop itself.
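As a rough illustration of "gently asking the thread to stop itself", here is a minimal sketch using a shared atomic flag instead of pthread_cancel. It is plain pthreads rather than GLib, and struct job, worker and run_and_cancel are hypothetical names; in a GLib program the flag's role would be played by a GCancellable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical job descriptor shared by the worker and its canceller. */
struct job {
    atomic_bool cancel_requested;
    unsigned long iterations;       /* written by the worker, read after join */
};

static void *worker(void *arg)
{
    struct job *job = arg;

    assert(job != NULL);
    /* One unit of work per pass; the flag is polled between units. */
    while (!atomic_load(&job->cancel_requested))
        job->iterations++;
    /* Normal return: any cleanup code placed here is guaranteed to run. */
    return job;
}

/* Starts a worker, requests cancellation, and joins it.
 * Returns true if the worker left through its normal return path. */
static bool run_and_cancel(void)
{
    struct job job = { .iterations = 0 };
    pthread_t tid;
    void *res = NULL;

    atomic_init(&job.cancel_requested, false);
    if (pthread_create(&tid, NULL, worker, &job) != 0)
        return false;

    atomic_store(&job.cancel_requested, true);  /* what "Cancel" would do */

    if (pthread_join(tid, &res) != 0)
        return false;
    return res == &job && res != PTHREAD_CANCELED;
}
```

The worker decides where it is safe to stop, so locks and allocations are released by ordinary code rather than by cancellation cleanup handlers. The cost is that a long unit of work delays the reaction to the request.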

Trigger multiple pthreads by pthread_cond_broadcast

Since examples of waking pthreads with pthread_cond_broadcast are sparse, I wrote one, but I am unsure whether it is correctly synchronized and whether this is the right way to do it:
Do all threads share the same c and mtx variables?
Is it necessary, when pthread_cond_wait returns, to test whether some condition is actually met? In my case every call to broadcast should wake up every thread exactly once, and nothing else should wake them. (Do I prevent spurious wakeups?)
The program currently does not exit despite the asynchronous cancel type. Deferred cancellation, as tried in the example code, does not work either, even though pthread_cond_wait is a cancellation point.
Overall, does it work the way I expect it to?
#include <pthread.h>
#include <iostream>
#include <unistd.h>
struct p_args{
int who;
};
pthread_cond_t c; //share between compilation units
pthread_mutex_t mtx;
void *threadFunc(void *vargs){
//pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS,NULL);
struct p_args * args = (struct p_args *) vargs;
while(true){
//wait for trigger one loop
pthread_mutex_lock(&mtx);
pthread_cond_wait(&c, &mtx);
pthread_mutex_unlock(&mtx);
//should be entangled output showing concurrent execution
std::cout << "t " << args->who << std::endl;
/* expensive work */
}
delete args;
}
int main(int argc, char* argv[])
{
pthread_cond_init(&c, NULL);
pthread_mutex_init(&mtx, NULL);
pthread_t thread_id[2];
struct p_args *args0 = new p_args();
struct p_args *args1 = new p_args();
args0->who = 0;
args1->who = 1;
pthread_create(&thread_id[0], NULL, threadFunc, args0);
pthread_create(&thread_id[1], NULL, threadFunc, args1);
sleep(3);
pthread_mutex_lock(&mtx);
pthread_cond_broadcast(&c);
pthread_mutex_unlock(&mtx);
sleep(3);//test if thread waits
pthread_cancel(thread_id[0]);
pthread_cancel(thread_id[1]);
pthread_join (thread_id[0], NULL);
pthread_join (thread_id[1], NULL);
//could perform cleanup here
return 0;
}
Regarding exiting deferred:
thread_id[0] exits fine, but I am stuck at the line `pthread_join (thread_id[1], NULL);`. The debugger shows the thread as (Exiting), but it seems to be stuck on a lock.
EDIT: final solution I came up with:
#include <pthread.h>
#include <iostream>
#include <unistd.h>
struct p_args{
int who;
};
pthread_cond_t c;
pthread_mutex_t mtx;
bool doSome[2];
bool exitFlag;
void *threadFunc(void *vargs){
struct p_args * args = (struct p_args *) vargs;
while(true){
//wait for trigger one loop
pthread_mutex_lock(&mtx);
do {
pthread_cond_wait(&c, &mtx);
if(exitFlag) {
std::cout << "return " << args->who << std::endl;
delete args;
pthread_mutex_unlock(&mtx);
return NULL;
}
} while(doSome[args->who] == false);
doSome[args->who] = false;
pthread_mutex_unlock(&mtx);
std::cout << "t " << args->who << std::endl;
}
}
int main(int argc, char* argv[])
{
pthread_cond_init(&c, NULL);
pthread_mutex_init(&mtx, NULL);
pthread_t thread_id[2];
struct p_args *args0 = new p_args();
struct p_args *args1 = new p_args();
args0->who = 0;
args1->who = 1;
doSome[0] = doSome[1] = true;
exitFlag = false;
pthread_create(&thread_id[0], NULL, threadFunc, args0);
pthread_create(&thread_id[1], NULL, threadFunc, args1);
doSome[0] = doSome[1] = true;
pthread_cond_broadcast(&c);
sleep(3);
doSome[0] = doSome[1] = true;
pthread_cond_broadcast(&c);
sleep(3);
exitFlag = true;
pthread_cond_broadcast(&c);
pthread_join (thread_id[0], NULL);
pthread_join (thread_id[1], NULL);
return 0;
}
do all threads share the same c and mtx variable?
Yes, just like any other global variable. You could print their addresses from each thread to confirm it.
is it necessary upon pthread_cond_wait return to test if some condition is actually met?
Yes, all wait interfaces are subject to spurious wakeups, and you're always responsible for checking your own predicate. See the documentation or a good book.
the program currently does not exit ...
pthread_cancel is uniformly horrible and should never be used. It's really hard to get right. If you want to tell your thread to exit, write a notification mechanism - build it into the existing predicate loop - and signal/broadcast to make sure all threads wake up and realize it's time to die.
Regarding exiting deferred: thread_id[0] exits fine and i am stuck in line pthread_join (thread_id[1], NULL);, it says (Exiting) but seems stuck on a lock
One of the hard things about pthread_cancel is cleanup. If cancellation occurs while you're holding a lock, you need to have used pthread_cleanup_push to emulate cancel-compatible RAII semantics. Otherwise the first thread may (and in this case, did) die with the mutex still locked.
In this case the second thread is trying to exit from pthread_cond_wait due to cancellation, but it needs to regain the lock first and can't.
The usual form of a condition variable loop is this (and a good reference book should show something similar):
void *thread(void *data)
{
struct Args *args = (struct Args *)data;
/* this lock protects both the exit and work predicates.
* It should probably be part of your argument struct,
* globals are not recommended.
* Error handling omitted for brevity,
* but you should really check the return values.
*/
pthread_mutex_lock(&args->mutex);
while (!exit_predicate(args)) {
while (!work_predicate(args)) {
/* check the return value here too */
pthread_cond_wait(&args->condition, &args->mutex);
}
/* work_predicate() is true and we have the lock */
do_work(args);
}
/* unlock (explicitly) only once.
* If you need to cope with cancellation, you do need
* pthread_cleanup_push/pop instead.
*/
pthread_mutex_unlock(&args->mutex);
return data;
}
where your custom code can just go in bool exit_predicate(struct Args*), bool work_predicate(struct Args*) and void do_work(struct Args*). The loop structure itself rarely needs much alteration.
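To make the loop structure concrete for the broadcast case, here is one possible filled-in version. It is a sketch under assumptions, not the only way to do it: struct pool, pool_worker, run_rounds and MAX_WORKERS are hypothetical names, a generation counter serves as the work predicate so that each broadcast is handled exactly once per thread, and the "work" is a counter increment done under the lock for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WORKERS 8

struct pool {
    pthread_mutex_t mtx;
    pthread_cond_t wake;    /* main -> workers: new round, or exit */
    pthread_cond_t idle;    /* workers -> main: round finished */
    unsigned generation;    /* bumped once per broadcast round */
    int pending;            /* workers that have not finished this round */
    bool exiting;
    int work_done;
};

static void *pool_worker(void *arg)
{
    struct pool *p = arg;
    unsigned seen = 0;      /* last generation this worker handled */

    pthread_mutex_lock(&p->mtx);
    for (;;) {
        /* Predicate loop: spurious or early wakeups just re-check. */
        while (!p->exiting && p->generation == seen)
            pthread_cond_wait(&p->wake, &p->mtx);
        if (p->generation == seen)
            break;                      /* exiting and no round left */
        seen = p->generation;
        p->work_done++;                 /* stand-in for the real work */
        if (--p->pending == 0)
            pthread_cond_signal(&p->idle);
    }
    pthread_mutex_unlock(&p->mtx);
    return NULL;
}

/* Runs `rounds` broadcast rounds over `nworkers` threads and returns
 * the total number of work items executed. */
static int run_rounds(int nworkers, int rounds)
{
    struct pool p = { .generation = 0 };
    pthread_t tid[MAX_WORKERS];
    int i, rc;

    assert(nworkers > 0 && nworkers <= MAX_WORKERS);
    pthread_mutex_init(&p.mtx, NULL);
    pthread_cond_init(&p.wake, NULL);
    pthread_cond_init(&p.idle, NULL);
    for (i = 0; i < nworkers; i++) {
        rc = pthread_create(&tid[i], NULL, pool_worker, &p);
        assert(rc == 0);
        (void)rc;
    }

    pthread_mutex_lock(&p.mtx);
    for (i = 0; i < rounds; i++) {
        p.generation++;                 /* change the predicate first... */
        p.pending = nworkers;
        pthread_cond_broadcast(&p.wake);/* ...then wake everyone */
        while (p.pending > 0)           /* wait until the round is done */
            pthread_cond_wait(&p.idle, &p.mtx);
    }
    p.exiting = true;                   /* exit notification, no cancel */
    pthread_cond_broadcast(&p.wake);
    pthread_mutex_unlock(&p.mtx);

    for (i = 0; i < nworkers; i++)
        pthread_join(tid[i], NULL);
    pthread_cond_destroy(&p.idle);
    pthread_cond_destroy(&p.wake);
    pthread_mutex_destroy(&p.mtx);
    return p.work_done;
}
```

A worker that starts late still sees generation != seen and does the round, so no wakeup is lost, and the main thread only modifies shared state while holding the mutex. For expensive work, the worker would unlock around the work and relock before updating pending.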

Incrementing a global variable with a thread

I just wrote the following code to understand better how Threads work:
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
int globalVariable = 1;
void *myfunc (void *myvar);
int main (void) {
pthread_t thread1;
int waitms;
while(globalVariable <= 50){
printf("Main function: %d \n", globalVariable);
if (globalVariable==9) {
pthread_create(&thread1, NULL, myfunc, NULL);
pthread_join(thread1, NULL);
}
usleep(300000);
globalVariable++;
}
return 0;
}
void *myfunc (void *myvar){
int waitms;
while(globalVariable<=50) {
printf("Thread1: %d \n", globalVariable);
usleep(300000);
globalVariable++;
}
return 0;
}
The code must print a value of a global variable that is incremented in the main function. When this variable has the value 9, the main function calls a thread, that does the same as the original main function, but without calling another thread.
In the Output I get the first 9 prints of the main function and all the following ones are from the thread. Shouldn't they be mixed? What have I done wrong?
No, because you are joining thread1 right after creating it, so the main thread blocks until thread1 finishes. Once thread1 has finished, the main thread resumes, but by then thread1 has incremented globalVariable past the point where the main thread's while loop exits.
If you remove the join you will see mixed output. Better still, move the join outside the while loop, so that the main thread waits if thread1 is still alive when the loop ends. It will most likely be finished by that time, but you should make sure your child threads have finished before exiting the main thread.
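A minimal sketch of that restructuring follows, with two changes that are assumptions beyond the original code: the shared counter is protected by a mutex (the original increments it from two threads without synchronization), and the usleep and printf calls are left out. The names struct counter, step, helper and count_with_helper are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct counter {
    pthread_mutex_t mtx;
    int value;
    int limit;
};

/* Increments the counter once if it is still <= limit.
 * Returns 1 if it incremented, 0 if counting is finished. */
static int step(struct counter *c)
{
    int stepped = 0;

    pthread_mutex_lock(&c->mtx);
    if (c->value <= c->limit) {
        c->value++;
        stepped = 1;
    }
    pthread_mutex_unlock(&c->mtx);
    return stepped;
}

/* Second thread: does the same counting as the main loop. */
static void *helper(void *arg)
{
    while (step(arg))
        ;
    return NULL;
}

/* Counts from 1 past `limit`; after `spawn_after` increments made by
 * the caller, a helper thread joins in. Returns the final value. */
static int count_with_helper(int limit, int spawn_after)
{
    struct counter c = { .value = 1, .limit = limit };
    pthread_t tid;
    int spawned = 0, own_steps = 0, rc;

    pthread_mutex_init(&c.mtx, NULL);
    while (step(&c)) {
        if (!spawned && ++own_steps == spawn_after) {
            rc = pthread_create(&tid, NULL, helper, &c);
            assert(rc == 0);
            (void)rc;
            spawned = 1;    /* no join here: both threads keep counting */
        }
    }
    if (spawned)
        pthread_join(tid, NULL);    /* join once, after the loop */
    pthread_mutex_destroy(&c.mtx);
    return c.value;
}
```

Calling count_with_helper(50, 8) mirrors the original program: the helper starts once the counter reaches 9, both threads then compete for increments, and the final value is 51 regardless of how the increments were split between them.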

Polling a loop device through a kernel module

I am trying to read a loopback device every 200 ms from a kernel module that I wrote, but it crashes the kernel when I try to insert the module.
I think the problem is in my read function, but it works fine without the timer.
I am new to kernel programming, please help.
Thank you in advance :D
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/timer.h>
#include<linux/fs.h>
#include <linux/init.h>
#include <asm/segment.h>
#include <asm/uaccess.h>
#include <linux/buffer_head.h>
static struct timer_list my_timer;
static void read_file(char *filename)
{
struct file *fd;
char buf[1];
unsigned long long offset=0;
mm_segment_t old_fs = get_fs();
set_fs(KERNEL_DS);
fd = filp_open(filename, O_RDONLY, 0);
if (fd >= 0) {
printk(KERN_DEBUG);
while (vfs_read(fd, buf, 1,&offset) == 1)
{
if((0 <= buf[0]) && (buf[0] <=255))
printk("%c", buf[0]);
}
printk(KERN_ALERT "Loop Ran\n");
filp_close(fd,NULL);
}
set_fs(old_fs);
}
void my_timer_callback( unsigned long data )
{
int ret;
printk( "my_timer_callback called (%ld).\n", jiffies );
printk( "Starting timer to fire in 200ms (%ld)\n", jiffies );
read_file("/dev/loop0");
ret = mod_timer( &my_timer, jiffies + msecs_to_jiffies(3000) );
if(ret)
printk("Error in mod_timer\n");
}
int init_module( void )
{
int ret;
printk("Timer module installing\n");
setup_timer( &my_timer, my_timer_callback, 0 );
printk( "Starting timer to fire in 200ms (%ld)\n", jiffies );
ret = mod_timer( &my_timer, jiffies + msecs_to_jiffies(200) );
if(ret)
printk("Error in mod_timer\n");
return 0;
}
void cleanup_module( void )
{
int ret;
ret = del_timer( &my_timer );
if(ret)
printk("The timer is still in use...\n");
printk("Timer module uninstalling\n");
return;
}
MODULE_LICENSE("GPL");
My Make file:
obj-m := timer2.o
all:
make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
Reading files is fairly complex, as there are many corner cases to handle. (What if the VM mappings need to be extended? What if you have to suspend the thread while waiting for the disk? etc.)
This article talks about what you should do instead: http://www.linuxjournal.com/article/8110
Unfortunately, the article gives some sample code for hacking around the problem which gives people hope. But the sample code ONLY works in the context of "user process calls into the kernel". In this case, the kernel can re-use the current user process context, but it's a hack.
In the general case (interrupts, timers, etc.), you can't just "grab a random user context" because that will lead to massive problems.
Instead, you should make a user-space process that hands the kernel the data it needs.
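Kernel timer functions must be atomic: they run in interrupt context and are not allowed to sleep. File operations can sleep and therefore need a process context. Your crash is caused by the file operations in your read function, which is called from the timer callback.
Chapter 7 of Linux Device Drivers should get you going on kernel timers.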
