pthread_kill in a multithreaded app causes a segmentation fault (Linux)

I found several references to this problem. Some suggested that pthread_kill dereferences the pthread_t structure and that this causes the crash; other articles said this is not a problem as long as the pthread_t structure is created via pthread_create.
Then I found a multithreaded example of how to do this correctly:
How to send a signal to a process in C?
However, I am still getting segfaults, so here is my code example:
static pthread_t GPUthread;
static void GPUsigHandler(int signo)
{
fprintf(stderr, "Queue waking\n");
}
void StartGPUQueue()
{
sigset_t sigmask;
pthread_attr_t attr_obj; /* a thread attribute variable */
struct sigaction action;
/* set up signal mask to block all in main thread */
sigfillset(&sigmask); /* to turn on all bits */
pthread_sigmask(SIG_BLOCK, &sigmask, (sigset_t *)0);
/* set up signal handlers for SIGINT & SIGUSR1 */
action.sa_flags = 0;
action.sa_handler = GPUsigHandler;
sigaction(SIGUSR1, &action, (struct sigaction *)0);
pthread_attr_init(&attr_obj); /* init it to default */
pthread_attr_setdetachstate(&attr_obj, PTHREAD_CREATE_DETACHED);
GPUthread = pthread_create(&GPUthread, &attr_obj, ProcessGPUqueue, NULL);
if (GPUthread != 0)
{
fprintf(stderr, "Cannot start GPU thread\n");
}
}
void ProcessGPUqueue(void *ptr)
{
int sigdummy;
sigset_t sigmask;
sigfillset(&sigmask); /* will unblock all signals */
pthread_sigmask(SIG_UNBLOCK, &sigmask, (sigset_t *)0);
fprintf(stderr, "GPU queue alive\n");
while(queueActive)
{
fprintf(stderr, "Processing GPU queue\n");
while(GPUqueue != NULL)
{
// process stuff
}
sigwait(&sigmask, &sigdummy);
}
}
void QueueGPUrequest(unsigned char cmd, unsigned short p1, unsigned short p2, unsigned short p3, unsigned short p4)
{
// Add request to queue logic ...
fprintf(stderr, "About to Wake GPU queue\n");
pthread_kill(GPUthread, SIGUSR1);// Earth shattering KA-BOOM!!!
}

Related

Question about the interrupt handler for a circular buffer in Linux Device Drivers, 3rd edition

I am reading LDD3 and have a question about p. 271, in the section "Implementing a Handler".
The code I am asking about is at the bottom.
The writer (the ISR) and the reader (which is woken up by the ISR) both touch the same buffer (short_buffer). Since the buffer is a shared resource, isn't there a risk that short_i_read gets interrupted by the writer ISR while it is working on the buffer?
I can understand that the ISR writer won't get interrupted, since it is an ISR and the IRQ is normally disabled until it completes. But for the reader, short_i_read, I don't see anything that guarantees atomic operation.
The one thing I notice is:
the buffer writer (ISR) only advances short_head
the buffer reader only advances short_tail
Does that mean the code lets the writer and the reader touch different variables so as to achieve a kind of lock-free circular buffer?
irqreturn_t short_interrupt(int irq, void *dev_id, struct pt_regs *regs) {
struct timeval tv;
int written;
do_gettimeofday(&tv);
/* Write a 16 byte record. Assume PAGE_SIZE is a multiple of 16 */
written = sprintf((char *)short_head,"%08u.%06u\n", (int)(tv.tv_sec % 100000000), (int)(tv.tv_usec));
BUG_ON(written != 16);
short_incr_bp(&short_head, written);
wake_up_interruptible(&short_queue);
/* awake any reading process */
return IRQ_HANDLED;
}
static inline void short_incr_bp(volatile unsigned long *index, int delta) {
unsigned long new = *index + delta;
barrier(); /* Don't optimize these two together */
*index = (new >= (short_buffer + PAGE_SIZE)) ? short_buffer : new;
}
ssize_t short_i_read (struct file *filp, char __user *buf, size_t count, loff_t *f_pos)
{
int count0;
DEFINE_WAIT(wait);
while (short_head == short_tail) {
prepare_to_wait(&short_queue, &wait, TASK_INTERRUPTIBLE);
if (short_head == short_tail)
schedule();
finish_wait(&short_queue, &wait);
if (signal_pending (current)) /* a signal arrived */
return -ERESTARTSYS; /* tell the fs layer to handle it */
}
/* count0 is the number of readable data bytes */
count0 = short_head - short_tail;
if (count0 < 0) /* wrapped */
count0 = short_buffer + PAGE_SIZE - short_tail;
if (count0 < count) count = count0;
if (copy_to_user(buf, (char *)short_tail, count))
return -EFAULT;
short_incr_bp (&short_tail, count);
return count;
}
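The idea raised in the question, one index written only by the producer and one written only by the consumer, can be sketched in user space with C11 atomics. This is not the LDD3 code: struct ring, ring_push, ring_pop and RING_SIZE are hypothetical names, and unlike the short driver this sketch refuses to overwrite unread data when the buffer is full. It assumes exactly one producer and one consumer.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8   /* hypothetical capacity; one slot is always left empty */

struct ring {
    int slots[RING_SIZE];
    _Atomic size_t head;   /* next slot to write; stored only by the producer */
    _Atomic size_t tail;   /* next slot to read; stored only by the consumer */
};

/* Producer side: returns false if the ring is full. */
bool ring_push(struct ring *r, int value)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t next = (head + 1) % RING_SIZE;

    if (next == atomic_load_explicit(&r->tail, memory_order_acquire))
        return false;                       /* full */

    r->slots[head] = value;
    /* Release: the slot contents become visible before the new head. */
    atomic_store_explicit(&r->head, next, memory_order_release);
    return true;
}

/* Consumer side: returns false if the ring is empty. */
bool ring_pop(struct ring *r, int *out)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);

    if (tail == atomic_load_explicit(&r->head, memory_order_acquire))
        return false;                       /* empty */

    *out = r->slots[tail];
    /* Release: the slot is handed back only after it has been read. */
    atomic_store_explicit(&r->tail, (tail + 1) % RING_SIZE,
                          memory_order_release);
    return true;
}
```

Because each index has a single writer, neither side needs a lock; the release/acquire pairs play roughly the role that barrier() and the volatile index play in short_incr_bp.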

Linux thread priority: behaviour is abnormal

In the code snippet below, I am creating 6 threads, each with a different priority. The priorities are listed in the global priority array. Each thread continuously increments a global counter selected by its thread index. I was expecting the count to be higher for threads with higher priority, but my output does not follow the priorities; please refer to the output shown below. I am trying this on Ubuntu 16.04 with Linux kernel 4.10.
Output:
Thread=0
Thread=3
Thread=2
Thread=5
Thread=1
Thread=4
pid=32155 count=4522138740
pid=32155 count=4509082289
pid=32155 count=4535088439
pid=32155 count=4517943246
pid=32155 count=4522643905
pid=32155 count=4519640181
Code:
#include <stdio.h>
#include <pthread.h>
#define FAILURE -1
#define MAX_THREADS 15
long int global_count[MAX_THREADS];
/* priority of each thread */
long int priority[]={1,20,40,60,80,99};
void clearGlobalCounts()
{
int i=0;
for(i=0;i<MAX_THREADS;i++)
global_count[i]=0;
}
/**
thread parameter is thread index
**/
void funcDoNothing(void *threadArgument)
{
int count=0;
int index = *((int *)threadArgument);
printf("Thread=%d\n",index);
clearGlobalCounts();
while(1)
{
count++;
if(count==100)
{
global_count[index]++;
count=0;
}
}
}
int main()
{
int i=0;
for(int i=0;i<sizeof(priority)/sizeof(long int);i++)
create_thread(funcDoNothing, i,priority[i]);
sleep(3600);
for(i=0;i<sizeof(priority)/sizeof(long int);i++)
{
printf("pid=%d count=%ld\n",getpid(),
global_count[i]);
}
}
create_thread(void *func,int thread_index,int priority)
{
pthread_attr_t attr;
struct sched_param schedParam;
void *pParm=NULL;
int id;
int * index = malloc(sizeof(int));
*index = thread_index;
void *res;
/* Initialize the thread attributes */
if (pthread_attr_init(&attr))
{
printf("Failed to initialize thread attrs\n");
return FAILURE;
}
if(pthread_attr_setschedpolicy(&attr, SCHED_FIFO))
{
printf("Failed to pthread_attr_setschedpolicy\n");
return FAILURE;
}
if (pthread_attr_setschedpolicy(&attr, SCHED_FIFO))
{
printf("Failed to setschedpolicy\n");
return FAILURE;
}
/* Set the capture thread priority */
pthread_attr_getschedparam(&attr, &schedParam);;
schedParam.sched_priority = sched_get_priority_max(SCHED_FIFO) - 1;
schedParam.sched_priority = priority;
if (pthread_attr_setschedparam(&attr, &schedParam))
{
printf("Failed to setschedparam\n");
return FAILURE;
}
pthread_create(&id, &attr, (void *)func, index);
}
The documentation for pthread_attr_setschedparam says:
In order for the parameter setting made by pthread_attr_setschedparam() to have effect when calling pthread_create(3), the caller must use pthread_attr_setinheritsched(3) to set the inherit-scheduler attribute of the attributes object attr to PTHREAD_EXPLICIT_SCHED.
So you have to call pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED), for example:
if (pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED) != 0) {
perror("pthread_attr_setinheritsched");
}
pthread_create(&id, &attr, (void *)func, index);
Note: Your code produces a lot of compiler warnings; you need to fix those. You do not want to test code that has a lot of undefined behavior, as indicated by some of the warnings. You should probably lower the sleep(3600) to just a few seconds, since once your threads are running under SCHED_FIFO they will hog the CPU, and the machine will appear frozen while they run.
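Put together, the attribute setup might look like the sketch below. The helper name init_fifo_attr is hypothetical. The policy is set before the priority, so that the priority can be checked against the SCHED_FIFO range. Creating a thread with these attributes normally requires root or CAP_SYS_NICE; without it, pthread_create is expected to fail with EPERM.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: prepare attributes for a SCHED_FIFO thread with an
   explicit priority. Returns 0 on success or a pthread error code. */
int init_fifo_attr(pthread_attr_t *attr, int priority)
{
    struct sched_param param = { .sched_priority = priority };
    int rc;

    rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    /* Without this, the new thread inherits the creator's scheduling
       settings and the policy/priority below are ignored. */
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc != 0)
        goto fail;

    rc = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
    if (rc != 0)
        goto fail;

    rc = pthread_attr_setschedparam(attr, &param);
    if (rc != 0)
        goto fail;

    return 0;

fail:
    pthread_attr_destroy(attr);
    return rc;
}
```

The caller would then pass the attribute object to pthread_create, check its return value, and call pthread_attr_destroy afterwards.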

Windows Thread Synchronization with create semaphore

I am working on a writer/reader problem, using the Windows semaphore functionality.
It is very simple, as follows:
char n[200];
volatile HANDLE hSem=NULL; // handle to semaphore
The write function reads from the console and releases the semaphore for the read thread.
void * write_message_function ( void *ptr )
{
/* do the work */
while(1){
printf("Enter a string");
scanf("%s",n);
ReleaseSemaphore(hSem,1,NULL); // increase the count by 1, releasing one waiting thread
}
pthread_exit(0); /* exit */
}
The print function waits for the release from the write function and then prints the message.
void * print_message_function ( void *ptr )
{
while(1){
WaitForSingleObject(hSem,INFINITE);
printf("The string entered is :");
printf("==== %s\n",n);
}
pthread_exit(0); /* exit */
}
The main function launches the application.
int main(int argc, char *argv[])
{
hSem=CreateSemaphore(NULL,0,1,NULL);
pthread_t thread1, thread2; /* thread variables */
/* create threads 1 and 2 */
pthread_create (&thread1, NULL, print_message_function, NULL);
pthread_create (&thread2, NULL, write_message_function, NULL);
pthread_join(thread1, NULL);
pthread_join(thread2, NULL);
/* exit */
CloseHandle(hSem);
}
The program executes but does not show the input string on the console.
ReleaseSemaphore in write_message_function causes the following:
print_message_function starts its output, and
write_message_function prints its prompt and scans for input.
These two things happen at the same time. Using the semaphore to trigger the output is fine. However, with MaximumCount=1 the semaphore cannot count more than one pending input, even though multiple inputs may arrive before an output occurs.
But the main problem here is that the I/O resources and the use of char n[200]; are not thread-safe. See What is meant by “thread-safe” code? for details. You still have to protect these resources, for example with a mutex or a critical section.
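One way to make the shared buffer safe is a two-semaphore handoff, sketched below. This is a different technique from the mutex or critical section mentioned above, and it uses POSIX sem_t as a stand-in for the Windows semaphore so that it matches the pthread calls already in the question; on Windows the same shape would use two semaphore handles. All names (shared_buf, slot_free, data_ready, collected, run_handoff) are hypothetical, and fixed strings replace console input so the sketch runs unattended.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <string.h>

#define ROUNDS 3

static char shared_buf[200];           /* plays the role of n[200] */
static char collected[ROUNDS][200];    /* what the reader saw, per round */
static sem_t slot_free;                /* 1 when the writer may fill the buffer */
static sem_t data_ready;               /* 1 when the reader may consume it */

static void *writer_main(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        sem_wait(&slot_free);          /* wait until the reader is done */
        snprintf(shared_buf, sizeof shared_buf, "message %d", i);
        sem_post(&data_ready);         /* hand the buffer to the reader */
    }
    return NULL;
}

static void *reader_main(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        sem_wait(&data_ready);         /* wait for a complete message */
        strcpy(collected[i], shared_buf);
        sem_post(&slot_free);          /* let the writer reuse the buffer */
    }
    return NULL;
}

/* Returns 0 on success, -1 if setup fails. */
int run_handoff(void)
{
    pthread_t writer, reader;

    if (sem_init(&slot_free, 0, 1) != 0 || sem_init(&data_ready, 0, 0) != 0)
        return -1;
    if (pthread_create(&reader, NULL, reader_main, NULL) != 0)
        return -1;
    if (pthread_create(&writer, NULL, writer_main, NULL) != 0)
        return -1;

    pthread_join(writer, NULL);
    pthread_join(reader, NULL);
    sem_destroy(&slot_free);
    sem_destroy(&data_ready);
    return 0;
}
```

Because the writer cannot touch the buffer until the reader posts slot_free, the two threads never access it at the same time, and the next prompt would not be printed while the previous message is still being output.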

How to soft reboot from a non-monolithic kernel module in an IRQ scope?

I need to reboot upon handling an IRQ in kernel land.
I want to call the /sbin/reboot binary, but I am limited by the IRQ context.
Code follows :
#define MY_IRQ_ID 42
void __init rebootmodule_init(void) {
request_any_context_irq(MY_IRQ_ID, rebootmodule_irq_handler, IRQF_TRIGGER_FALLING, "irq-name", NULL);
}
irqreturn_t rebootmodule_irq_handler(int irq, void *dev_id) {
my_reboot();
return IRQ_HANDLED;
}
void my_reboot(void) {
int ret;
char *argv[2], *envp[4];
argv[0] = "/sbin/reboot";
argv[1] = NULL;
envp[0] = "HOME=/";
envp[1] = "PWD=/";
envp[2] = "PATH=/sbin";
envp[3] = NULL;
ret = call_usermodehelper(argv[0], argv, envp, 0);
printk(KERN_INFO "trying to reboot (ret = %d)", ret);
}
I can see the printk(...) output when the IRQ is triggered, but I get errors, even if I replace /sbin/reboot with /bin/rm /tmp/its-not-working.
I tested other ways to reboot, such as mvBoardReset(), machine_halt(), arm_pm_restart(), pm_power_off(), kill(1, SIGTSTP), reboot() and handle_sysrq('b'); I always get errors that I don't get outside the IRQ context.
I really want to call /sbin/reboot, since it does a clean soft reset.
Thank you for your time.
Just an idea: you can start a kernel thread with kthread_run(), put it to sleep with wait_event(), wake it up in the IRQ handler with wake_up(), and do your work (run /sbin/reboot or whatever you want) in the kernel thread. Something like this (completely untested):
#define MY_IRQ_ID 42
static DECLARE_WAIT_QUEUE_HEAD(wq);
static volatile int showtime = 0;
void my_reboot(void) {
int ret;
char *argv[2], *envp[4];
argv[0] = "/sbin/reboot";
argv[1] = NULL;
envp[0] = "HOME=/";
envp[1] = "PWD=/";
envp[2] = "PATH=/sbin";
envp[3] = NULL;
ret = call_usermodehelper(argv[0], argv, envp, 0);
printk(KERN_INFO "trying to reboot (ret = %d)", ret);
}
static int my_thread(void *arg) {
wait_event(wq, showtime); /* the macro takes the wait queue head itself, not a pointer */
my_reboot();
return 0;
}
irqreturn_t rebootmodule_irq_handler(int irq, void *dev_id) {
showtime = 1;
wake_up(&wq);
return IRQ_HANDLED;
}
void __init rebootmodule_init(void) {
kthread_run(my_thread, NULL, "my_module");
request_any_context_irq(MY_IRQ_ID, rebootmodule_irq_handler, IRQF_TRIGGER_FALLING, "irq-name", NULL);
}
Don't forget to handle module __exit and the case where the interrupt arrives before the kernel thread has gone to sleep.

Debugging segmentation fault in a multi-threaded (using clone) program

I wrote code that creates some threads and, whenever one of the threads finishes, creates a new thread to replace it. As I was not able to create a very large number of threads (>450) using pthreads, I used the clone system call instead. (Please note that I am aware of the implications of having such a huge number of threads, but this program is only meant to stress the system.)
As clone() requires the stack space for the child thread to be specified as a parameter, I malloc the required chunk of stack space for each thread and free it when the thread finishes. When a thread finishes, I send a signal to the parent to notify it.
The code is given below:
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>
#include <errno.h>
#define NUM_THREADS 5
unsigned long long total_count=0;
int num_threads = NUM_THREADS;
static int thread_pids[NUM_THREADS];
static void *thread_stacks[NUM_THREADS];
int ppid;
int worker() {
int i;
union sigval s={0};
for(i=0;i!=99999999;i++);
if(sigqueue(ppid, SIGUSR1, s)!=0)
fprintf(stderr, "ERROR sigqueue");
fprintf(stderr, "Child [%d] done\n", getpid());
return 0;
}
void sigint_handler(int signal) {
char fname[35]="";
FILE *fp;
int ch;
if(signal == SIGINT) {
fprintf(stderr, "Caught SIGINT\n");
sprintf(fname, "/proc/%d/status", getpid());
fp = fopen(fname,"r");
while((ch=fgetc(fp))!=EOF)
fprintf(stderr, "%c", (char)ch);
fclose(fp);
fprintf(stderr, "No. of threads created so far = %llu\n", total_count);
exit(0);
} else
fprintf(stderr, "Unhandled signal (%d) received\n", signal);
}
int main(int argc, char *argv[]) {
int rc, i; long t;
void *chld_stack, *chld_stack2;
siginfo_t siginfo;
sigset_t sigset, oldsigset;
if(argc>1) {
num_threads = atoi(argv[1]);
if(num_threads<1) {
fprintf(stderr, "Number of threads must be >0\n");
return -1;
}
}
signal(SIGINT, sigint_handler);
/* Block SIGUSR1 */
sigemptyset(&sigset);
sigaddset(&sigset, SIGUSR1);
if(sigprocmask(SIG_BLOCK, &sigset, &oldsigset)==-1)
fprintf(stderr, "ERROR: cannot block SIGUSR1 \"%s\"\n", strerror(errno));
printf("Number of threads = %d\n", num_threads);
ppid = getpid();
for(t=0,i=0;t<num_threads;t++,i++) {
chld_stack = (void *) malloc(148*512);
chld_stack2 = ((char *)chld_stack + 148*512 - 1);
if(chld_stack == NULL) {
fprintf(stderr, "ERROR[%ld]: malloc for stack-space failed\n", t);
break;
}
rc = clone(worker, chld_stack2, CLONE_VM|CLONE_FS|CLONE_FILES, NULL);
if(rc == -1) {
fprintf(stderr, "ERROR[%ld]: return code from pthread_create() is %d\n", t, errno);
break;
}
thread_pids[i]=rc;
thread_stacks[i]=chld_stack;
fprintf(stderr, " [index:%d] = [pid:%d] ; [stack:0x%p]\n", i, thread_pids[i], thread_stacks[i]);
total_count++;
}
sigemptyset(&sigset);
sigaddset(&sigset, SIGUSR1);
while(1) {
fprintf(stderr, "Waiting for signal from childs\n");
if(sigwaitinfo(&sigset, &siginfo) == -1)
fprintf(stderr, "- ERROR returned by sigwaitinfo : \"%s\"\n", strerror(errno));
fprintf(stderr, "Got some signal from pid:%d\n", siginfo.si_pid);
/* A child finished, free the stack area allocated for it */
for(i=0;i<NUM_THREADS;i++) {
fprintf(stderr, " [index:%d] = [pid:%d] ; [stack:%p]\n", i, thread_pids[i], thread_stacks[i]);
if(thread_pids[i]==siginfo.si_pid) {
free(thread_stacks[i]);
thread_stacks[i]=NULL;
break;
}
}
fprintf(stderr, "Search for child ended with i=%d\n",i);
if(i==NUM_THREADS)
continue;
/* Create a new thread in its place */
chld_stack = (void *) malloc(148*512);
chld_stack2 = ((char *)chld_stack + 148*512 - 1);
if(chld_stack == NULL) {
fprintf(stderr, "ERROR[%ld]: malloc for stack-space failed\n", t);
break;
}
rc = clone(worker, chld_stack2, CLONE_VM|CLONE_FS|CLONE_FILES, NULL);
if(rc == -1) {
fprintf(stderr, "ERROR[%ld]: return code from clone() is %d\n", t, errno);
break;
}
thread_pids[i]=rc;
thread_stacks[i]=chld_stack;
total_count++;
}
fprintf(stderr, "Broke out of infinite loop. [total_count=%llu] [i=%d]\n",total_count, i);
return 0;
}
I have used a couple of arrays to keep track of the child processes' pids and the stack area base addresses (for freeing them).
When I run this program it terminates after some time. Running it under gdb tells me that one of the threads gets a SIGSEGV (segmentation fault). But it doesn't give me any location; the output is similar to the following:
Program received signal SIGSEGV, Segmentation fault.
[Switching to LWP 15864]
0x00000000 in ?? ()
I tried running it under valgrind with the following commandline:
valgrind --tool=memcheck --leak-check=yes --show-reachable=yes -v --num-callers=20 --track-fds=yes ./a.out
But it keeps running without any issues under valgrind.
I am puzzled as to how to debug this program. I suspected a stack overflow or something similar, but increasing the stack size (up to 74KB) didn't solve the problem.
My only question is why and where the segmentation fault occurs, or how to debug this program.
Found the actual issue.
When the worker thread signals the parent process using sigqueue(), the parent sometimes gets control immediately and frees the stack before the child executes its return statement. When the child then returns, it segfaults because its stack has already been freed (and possibly reused).
To solve this, in worker() I replaced
return 0;
with
exit(0);
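Another way to close that race, sketched here as an alternative rather than as what the poster did, is to free a child's stack only after the kernel reports that the child has terminated. The sketch assumes Linux with glibc; it adds SIGCHLD to the clone flags so that a plain waitpid() can reap the child. The names worker, run_one_child, shared_value and STACK_SIZE are hypothetical.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)   /* hypothetical size, a multiple of 16 */

static int shared_value = 0;     /* visible to the child through CLONE_VM */

static int worker(void *arg)
{
    (void)arg;
    shared_value = 42;
    return 0;                    /* safe: the stack is still allocated here */
}

/* Runs one child and returns the value it stored, or -1 on error. */
int run_one_child(void)
{
    char *stack = malloc(STACK_SIZE);
    pid_t pid;
    int status;

    if (stack == NULL)
        return -1;

    /* The stack grows downwards on common architectures, so pass the top
       of the block; malloc alignment plus a 16-byte-multiple size keeps
       that address aligned. */
    pid = clone(worker, stack + STACK_SIZE,
                CLONE_VM | CLONE_FS | CLONE_FILES | SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    /* Do not free the stack on failure: the child may still be using it. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    free(stack);                 /* the child is gone, nothing uses the stack */
    return shared_value;
}
```

In the original program the sigwaitinfo() loop could stay as the notification mechanism, with waitpid() on siginfo.si_pid inserted just before the free().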
I think I found the answer.
Step 1
Replace this:
static int thread_pids[NUM_THREADS];
static void *thread_stacks[NUM_THREADS];
By this:
static int *thread_pids;
static void **thread_stacks;
Step 2
Add this in the main function (after checking arguments):
thread_pids = malloc(sizeof(int) * num_threads);
thread_stacks = malloc(sizeof(void *) * num_threads);
Step 3
Replace this:
chld_stack2 = ((char *)chld_stack + 148*512 - 1);
By this:
chld_stack2 = ((char *)chld_stack + 148*512);
In both places where you use it.
I don't know if it's really your problem, but after testing it I didn't get any segmentation fault. By the way, I only got segmentation faults when using more than 5 threads.
Hope I helped!
edit: tested with 1000 threads and it runs perfectly
edit2: Explanation why the static allocation of thread_pids and thread_stacks causes an error.
The best way to do this is with an example.
Assume num_threads = 10;
The problem occurs in the following code:
for(t=0,i=0;t<num_threads;t++,i++) {
...
thread_pids[i]=rc;
thread_stacks[i]=chld_stack;
...
}
Here you try to access memory which does not belong to you (0 <= i <= 9, but both arrays have a size of 5). That can cause either a segmentation fault or data corruption. Data corruption may happen if both arrays are allocated one after the other, so that writing past the end of one writes into the other. A segmentation fault can happen if you write to memory you have not allocated (statically or dynamically).
You may be lucky and have no errors at all, but the code is certainly not safe.
About the non-aligned pointer: I think I don't have to explain more than in my comment.

Resources