Why does pthread_key_create return EAGAIN? - linux

Sometimes when I try to create a key with pthread_key_create I get an EAGAIN error code. Is it possible to know exactly why?
Documentation says:
The system lacked the necessary resources to create another thread-specific data key, or the system-imposed limit on the total number of keys per process [PTHREAD_KEYS_MAX] would be exceeded.
How can I check whether it was the limit on keys? Is there some kind of monitoring tool that shows how many keys are already in use and how many could still be created?
One important thing about our code: we use fork() and have multiple processes running. And each process could have multiple threads.
I found that processes created with fork() do not seem to get an independent limit for thread keys. Here is a small example.
#include <stdio.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>
size_t create_keys(pthread_key_t *keys, size_t number_of_keys)
{
size_t counter = 0;
for (size_t i = 0; i < number_of_keys; i++)
{
int e = pthread_key_create(keys + i, NULL);
if (e)
{
printf("ERROR (%d): index: %zu, pthread_key_create (%d)\n", getpid(), i, e);
break;
}
counter++;
}
return counter;
}
int main(int argc, char const *argv[])
{
printf("maximum number of thread keys: %ld\n", sysconf(_SC_THREAD_KEYS_MAX));
printf("process id: %d\n", getpid());
const size_t number_of_keys = 1024;
pthread_key_t keys_1[number_of_keys];
memset(keys_1, 0, number_of_keys * sizeof(pthread_key_t));
printf("INFO (%d): number of active keys: %zu\n", getpid(), create_keys(keys_1, number_of_keys));
pid_t p = fork();
if (p == 0)
{
printf("process id: %d\n", getpid());
pthread_key_t keys_2[number_of_keys];
memset(keys_2, 0, number_of_keys * sizeof(pthread_key_t));
printf("INFO (%d): number of active keys: %zu\n", getpid(), create_keys(keys_2, number_of_keys));
}
return 0;
}
When I run this example on Ubuntu 16.04, the child process cannot create any new thread key if the parent has already created as many keys as the limit allows (1024). If I use 512 keys in each of the parent and child processes, it runs without error.

As you know, fork() traditionally works by copying the process in memory and then continuing execution from the same point within each copy as parent and child. This is what the return code of fork() indicates.
In order to perform fork(), the internals of the process must be duplicated: memory, stack, open files, and probably the table of thread local storage keys. Each system is different in its implementation of fork(). Some systems allow you to customise the areas of the process that get copied (see the Linux clone(2) interface). However, the concept remains the same.
So, on to your example code: if you allocate 1024 keys in the parent, every child process inherits a full key table and has no spare keys to work with, resulting in the errors. If you allocate only 512 keys in the parent, then every child inherits a half-empty keys table and has 512 spare keys to play with, hence no errors arise.
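The inheritance described above can be observed with a small experiment. This is only a sketch: the helper name child_free_keys is made up for illustration, and it assumes that keys deleted in the child become reusable there, which is how glibc behaves as far as I know. The child optionally deletes its inherited copies of the parent's keys, then creates keys until pthread_key_create fails, and reports the count through a pipe.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child and return how many keys the child was
 * able to create, or -1 on error. If delete_first is non-zero the child
 * first deletes the n keys it inherited from the parent. */
long child_free_keys(const pthread_key_t *inherited, size_t n, int delete_first)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        long created = 0;
        pthread_key_t k;
        close(fds[0]);
        /* Deleting here only affects the child's copy of the key table. */
        if (delete_first)
            for (size_t i = 0; i < n; i++)
                pthread_key_delete(inherited[i]);
        /* Create keys until the table is full (EAGAIN). */
        while (pthread_key_create(&k, NULL) == 0)
            created++;
        if (write(fds[1], &created, sizeof created) != (ssize_t)sizeof created)
            _exit(1);
        _exit(0);
    }
    long result = -1;
    close(fds[1]);
    if (read(fds[0], &result, sizeof result) != (ssize_t)sizeof result)
        result = -1;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return result;
}
```

If the parent holds n keys, the count reported with delete_first set should be n higher than without it, while the parent's own keys stay valid throughout.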

Maximum value:
#include <unistd.h>
#include <stdio.h>
int main ()
{
printf ("%ld\n", sysconf(_SC_THREAD_KEYS_MAX));
return 0;
}
Consider using pthread_key_delete.
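I don't know of a monitoring tool for this, but a process can probe its own remaining capacity. A rough sketch, assuming nothing else in the process creates or deletes keys while the probe runs (the helper name count_free_keys and the fallback of 1024 are my own assumptions): create keys until pthread_key_create fails, count them, then delete them again.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns how many more keys the calling process could
 * create right now. Not safe against other threads creating keys concurrently. */
size_t count_free_keys(void)
{
    long max = sysconf(_SC_THREAD_KEYS_MAX);
    if (max <= 0)
        max = 1024;                 /* assumed fallback if the limit is unknown */

    pthread_key_t *probe = malloc((size_t)max * sizeof *probe);
    if (probe == NULL)
        return 0;

    size_t n = 0;
    /* Stop at the first failure (EAGAIN when the table is full). */
    while (n < (size_t)max && pthread_key_create(&probe[n], NULL) == 0)
        n++;

    /* Give the probe keys back so the process is left unchanged. */
    for (size_t i = 0; i < n; i++)
        pthread_key_delete(probe[i]);

    free(probe);
    return n;
}
```

Calling count_free_keys() right before the failing pthread_key_create, in the parent and in each child, should show whether the key limit is what you are hitting.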

Related

need to know how to interrupt all pthreads

In Linux, I am emulating an embedded system that has one thread that gets messages delivered to the outside world. If some thread detects an insurmountable problem, my goal is to stop all the other threads in their tracks (leaving useful stack traces) and allow only the message delivery thread to continue. So in my emulation environment, I want to "pthread_kill(tid, SIGnal)" each "tid". (I have a list. I'm using SIGTSTP.) Unfortunately, only one thread is getting the signal. "sigprocmask()" is not able to unmask the signal. Here is my current (non-working) handler:
void
wait_until_death(int sig)
{
sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, sig);
sigprocmask(SIG_UNBLOCK, &mask, NULL);
for (;;)
pause();
}
I get verification that all the pthread_kill()'s get invoked, but only one thread has the handler in the stack trace. Can this be done?
This minimal example seems to function in the manner you want - all the threads except the main thread end up waiting in wait_until_death():
#include <stdio.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>
#define NTHREADS 10
pthread_barrier_t barrier;
void
wait_until_death(int sig)
{
sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, sig);
sigprocmask(SIG_UNBLOCK, &mask, NULL);
for (;;)
pause();
}
void *thread_func(void *arg)
{
pthread_barrier_wait(&barrier);
for (;;)
pause();
}
int main(int argc, char *argv[])
{
const int thread_signal = SIGTSTP;
const struct sigaction sa = { .sa_handler = wait_until_death };
int i;
pthread_t thread[NTHREADS];
pthread_barrier_init(&barrier, NULL, NTHREADS + 1);
sigaction(thread_signal, &sa, NULL);
for (i = 0; i < NTHREADS; i++)
pthread_create(&thread[i], NULL, thread_func, NULL);
pthread_barrier_wait(&barrier);
for (i = 0; i < NTHREADS; i++)
pthread_kill(thread[i], thread_signal);
fprintf(stderr, "All threads signalled.\n");
for (;;)
pause();
return 0;
}
Note that unblocking the signal in wait_until_death() isn't required: the signal mask is per-thread, and the thread that is executing the signal handler isn't going to be signalled again.
Presumably the problem is in how you are installing the signal handler, or setting up thread signal masks.
This is impossible. The problem is that some of the threads you stop may hold locks that the thread you want to continue running requires in order to continue making forward progress. Just abandon this idea entirely. Trust me, this will only cause you great pain.
If you literally must do it, have all the other threads call a conditional yielding point at known safe places where they hold no lock that can prevent any other thread from reaching its next conditional yielding point. But this is very difficult to get right and is very prone to deadlock and I strongly advise not trying it.
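For what it's worth, a conditional yielding point could look roughly like the sketch below. All names here are invented, and it only works under the assumption stated above: workers call safe_point() exclusively at places where they hold no locks.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static atomic_int stop_requested;   /* set once by the thread that detects the problem */
static int parked;                  /* number of workers parked so far */
static pthread_mutex_t park_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t park_cond = PTHREAD_COND_INITIALIZER;

/* Workers call this only where they hold no locks. */
void safe_point(void)
{
    if (!atomic_load(&stop_requested))
        return;
    pthread_mutex_lock(&park_lock);
    parked++;
    pthread_cond_broadcast(&park_cond);   /* let the stopper re-check the count */
    for (;;)                              /* never returns once a stop was requested */
        pthread_cond_wait(&park_cond, &park_lock);
}

/* Called by the detecting thread; returns once nworkers threads are parked. */
void stop_the_world(int nworkers)
{
    atomic_store(&stop_requested, 1);
    pthread_mutex_lock(&park_lock);
    while (parked < nworkers)
        pthread_cond_wait(&park_cond, &park_lock);
    pthread_mutex_unlock(&park_lock);
}

/* Example worker: one unit of work, then a yielding point. */
void *example_worker(void *arg)
{
    (void)arg;
    for (;;) {
        sched_yield();   /* stands in for real work */
        safe_point();
    }
    return NULL;
}
```

A worker blocked in a system call or waiting on a lock never reaches safe_point(), which is exactly why this is hard to get right in a real program.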

Writing to eventfd from kernel module

I have created an eventfd instance in a userspace program using eventfd(). Is there a way in which I can pass some reference (a pointer to its struct or pid+fd pair) to this created instance of eventfd to a kernel module so that it can update the counter value?
Here is what I want to do:
I am developing a userspace program which needs to exchange data and signals with a kernel space module which I have written.
For transferring data, I am already using ioctl. But I want the kernel module to be able to signal the userspace program whenever new data is ready for it to consume over ioctl.
To do this, my userspace program will create a few eventfds in various threads. These threads will wait on these eventfds using select() and whenever the kernel module updates the counts on these eventfds, they will go on to consume the data by requesting for it over ioctl.
The problem is, how do I resolve the "struct file *" pointers to these eventfds from kernelspace? What kind of information about the eventfds can I send to the kernel module so that it can get the pointers to the eventfds? What functions would I use in the kernel module to get those pointers?
Is there better way to signal events to userspace from kernelspace?
I cannot let go of using select().
I finally figured out how to do this. I realized that each open file on a system can be identified by the pid of one of the processes which opened it and the fd corresponding to that file (within that process's context). So if my kernel module knows the pid and fd, it can look up the struct task_struct * of the process, from that the struct files_struct *, and finally, using the fd, acquire the pointer to the eventfd's struct file *. Then, using this last pointer, it can write to the eventfd's counter.
Here is the code for the userspace program and the kernel module that I wrote to demonstrate the concept (both now work):
Userspace C code (efd_us.c):
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h> //Definition of uint64_t
#include <sys/select.h> //select(), fd_set
#include <sys/eventfd.h>
int efd; //Eventfd file descriptor
uint64_t eftd_ctr;
int retval; //for select()
fd_set rfds; //for select()
int s;
int main() {
//Create eventfd
efd = eventfd(0,0);
if (efd == -1){
printf("\nUnable to create eventfd! Exiting...\n");
exit(EXIT_FAILURE);
}
printf("\nefd=%d pid=%d",efd,getpid());
//Watch efd
FD_ZERO(&rfds);
FD_SET(efd, &rfds);
printf("\nNow waiting on select()...");
fflush(stdout);
retval = select(efd+1, &rfds, NULL, NULL, NULL);
if (retval == -1){
printf("\nselect() error. Exiting...");
exit(EXIT_FAILURE);
} else if (retval > 0) {
printf("\nselect() says data is available now. Exiting...");
printf("\nreturned from select(), now executing read()...");
s = read(efd, &eftd_ctr, sizeof(uint64_t));
if (s != sizeof(uint64_t)){
printf("\neventfd read error. Exiting...");
} else {
printf("\nReturned from read(), value read = %llu",(unsigned long long)eftd_ctr);
}
} else if (retval == 0) {
printf("\nselect() says that no data was available");
}
printf("\nClosing eventfd. Exiting...");
close(efd);
printf("\n");
exit(EXIT_SUCCESS);
}
Kernel Module C code (efd_lkm.c):
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/pid.h>
#include <linux/sched.h>
#include <linux/fdtable.h>
#include <linux/rcupdate.h>
#include <linux/eventfd.h>
//Received from userspace. Process ID and eventfd's File descriptor are enough to uniquely identify an eventfd object.
int pid;
int efd;
//Resolved references...
struct task_struct * userspace_task = NULL; //...to userspace program's task struct
struct file * efd_file = NULL; //...to eventfd's file struct
struct eventfd_ctx * efd_ctx = NULL; //...and finally to eventfd context
//Increment Counter by 1
static uint64_t plus_one = 1;
int init_module(void) {
printk(KERN_ALERT "~~~Received from userspace: pid=%d efd=%d\n",pid,efd);
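userspace_task = pid_task(find_vpid(pid), PIDTYPE_PID);
if (!userspace_task) { //No such process: bail out instead of dereferencing NULL below
printk(KERN_ALERT "~~~No task found for pid=%d\n",pid);
return -ESRCH;
}
printk(KERN_ALERT "~~~Resolved pointer to the userspace program's task struct: %p\n",userspace_task);
printk(KERN_ALERT "~~~Resolved pointer to the userspace program's files struct: %p\n",userspace_task->files);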
rcu_read_lock();
efd_file = fcheck_files(userspace_task->files, efd);
rcu_read_unlock();
printk(KERN_ALERT "~~~Resolved pointer to the userspace program's eventfd's file struct: %p\n",efd_file);
efd_ctx = eventfd_ctx_fileget(efd_file);
if (!efd_ctx) {
printk(KERN_ALERT "~~~eventfd_ctx_fileget() Jhol, Bye.\n");
return -1;
}
printk(KERN_ALERT "~~~Resolved pointer to the userspace program's eventfd's context: %p\n",efd_ctx);
eventfd_signal(efd_ctx, plus_one);
printk(KERN_ALERT "~~~Incremented userspace program's eventfd's counter by 1\n");
eventfd_ctx_put(efd_ctx);
return 0;
}
void cleanup_module(void) {
printk(KERN_ALERT "~~~Module Exiting...\n");
}
MODULE_LICENSE("GPL");
module_param(pid, int, 0);
module_param(efd, int, 0);
To run this, carry out the following steps:
Compile the userspace program (efd_us.out) and the kernel module (efd_lkm.ko)
Run the userspace program (./efd_us.out) and note the pid and efd values that it prints (e.g. "pid=2803 efd=3"). The userspace program will wait endlessly on select().
Open a new terminal window and insert the kernel module passing the pid and efd as params: sudo insmod efd_lkm.ko pid=2803 efd=3
Switch back to the userspace program window and you will see that the userspace program has broken out of select and exited.
Consult the kernel source here:
http://lxr.free-electrons.com/source/fs/eventfd.c
Basically, send your userspace file descriptor, as produced by eventfd(), to your module via ioctl() or some other path. From the kernel, call eventfd_ctx_fdget() to get an eventfd context, then eventfd_signal() on the resulting context. Don't forget eventfd_ctx_put() when you're done with the context.
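The userspace half of this can be tried without any kernel module. In the sketch below (helper names are mine), efd_post() stands in for the kernel-side eventfd_signal(): it adds to the counter with a plain write(), and efd_wait() blocks in select() and then reads the counter, which resets it to zero for an eventfd created with flags 0.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: add n to the eventfd counter. Returns 0 or -1. */
int efd_post(int efd, uint64_t n)
{
    return write(efd, &n, sizeof n) == (ssize_t)sizeof n ? 0 : -1;
}

/* Hypothetical helper: block in select() until the counter is non-zero,
 * then read it into *value. Returns 0 on success, -1 on error. */
int efd_wait(int efd, uint64_t *value)
{
    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(efd, &rfds);
    if (select(efd + 1, &rfds, NULL, NULL, NULL) <= 0)
        return -1;
    return read(efd, value, sizeof *value) == (ssize_t)sizeof *value ? 0 : -1;
}
```

Several posts that arrive before the reader wakes up are summed into a single value, so the ioctl that fetches the data should be prepared to drain more than one item per wakeup.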
how do I resolve the "struct file *" pointers to these eventfds from kernelspace
You must resolve those pointers into data structures that this interface you've created has published (create new types and read the fields you want from struct file into it).
Is there better way to signal events to userspace from kernelspace?
Netlink sockets are another convenient way for the kernel to communicate with userspace. "Better" is in the eye of the beholder.

how to set CPU affinity of a particular pthread?

I'd like to specify the cpu-affinity of a particular pthread. All the references I've found so far deal with setting the cpu-affinity of a process (pid_t) not a thread (pthread_t). I tried some experiments passing pthread_t's around and as expected they fail. Am I trying to do something impossible? If not, can you send a pointer please? Thanks a million.
This is a wrapper I've made to make my life easier. Its effect is that the calling thread gets "stuck" to the core with id core_id:
// core_id = 0, 1, ... n-1, where n is the system's number of cores
int stick_this_thread_to_core(int core_id) {
int num_cores = sysconf(_SC_NPROCESSORS_ONLN);
if (core_id < 0 || core_id >= num_cores)
return EINVAL;
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
CPU_SET(core_id, &cpuset);
pthread_t current_thread = pthread_self();
return pthread_setaffinity_np(current_thread, sizeof(cpu_set_t), &cpuset);
}
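The same call also accepts the pthread_t of a different thread, which is what the question asks for. A small sketch (function names are my own) that pins an arbitrary thread and reads the mask back with pthread_getaffinity_np() to confirm it took effect:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* needed for CPU_* macros and the *_np calls; must come before any include */
#endif
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: pin thread (any pthread_t, not just the caller) to cpu.
 * Returns 0 or an error number, as pthread_setaffinity_np() does. */
int pin_thread(pthread_t thread, int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(thread, sizeof set, &set);
}

/* Hypothetical helper: 1 if thread may run on cpu and nowhere else, else 0. */
int thread_is_pinned_to(pthread_t thread, int cpu)
{
    cpu_set_t set;
    if (pthread_getaffinity_np(thread, sizeof set, &set) != 0)
        return 0;
    return CPU_COUNT(&set) == 1 && CPU_ISSET(cpu, &set);
}
```

For example, after pthread_create(&t, ...) the creating thread can call pin_thread(t, 2). Expect EINVAL if the CPU is not available to the process, for instance inside a restricted container.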
Assuming linux:
The interface to setting the affinity is - as you've probably already discovered:
int sched_setaffinity(pid_t pid,size_t cpusetsize,cpu_set_t *mask);
Passing 0 as the pid applies the call to the current thread only. Alternatively, have other threads report their kernel thread ID with the Linux-specific call pid_t gettid(void); and pass that in as the pid.
Quoting the man page:
The affinity mask is actually a per-thread attribute that can be adjusted independently for each of the threads in a thread group. The value returned from a call to gettid(2) can be passed in the argument pid. Specifying pid as 0 will set the attribute for the calling thread, and passing the value returned from a call to getpid(2) will set the attribute for the main thread of the thread group. (If you are using the POSIX threads API, then use pthread_setaffinity_np(3) instead of sched_setaffinity().)
//compilation: gcc -o affinity affinity.c -lpthread
#define _GNU_SOURCE
#include <sched.h> //cpu_set_t , CPU_SET
#include <pthread.h> //pthread_t
#include <stdio.h>
void *th_func(void * arg);
int main(void) {
pthread_t thread; //the thread
pthread_create(&thread,NULL,th_func,NULL);
pthread_join(thread,NULL);
return 0;
}
void *th_func(void * arg)
{
//we can set one or more bits here, each one representing a single CPU
cpu_set_t cpuset;
//the CPU we want to use
int cpu = 2;
CPU_ZERO(&cpuset); //clears the cpuset
CPU_SET( cpu , &cpuset); //set CPU 2 on cpuset
/*
* cpu affinity for the calling thread
* first parameter is the pid, 0 = calling thread
* second parameter is the size of your cpuset
* third param is the cpuset in which your thread will be
* placed. Each bit represents a CPU
*/
sched_setaffinity(0, sizeof(cpuset), &cpuset);
while (1)
; //burns CPU 2
return 0;
}
On Linux you can use CPU sets (cpu_set_t) to control which CPUs a process or thread may run on. This type of control is called CPU affinity.
The function sched_setaffinity() takes a kernel thread ID (as returned by gettid(), not a pthread_t) and a cpuset as parameters.
When you pass 0 as the first parameter, the calling thread is affected.
Below is an example program that sets the CPU affinity of a particular pthread.
Add the appropriate headers (at least <pthread.h>, <sched.h>, <stdio.h>, <string.h> and <math.h>).
double waste_time(long n)
{
double res = 0;
long i = 0;
while (i <n * 200000) {
i++;
res += sqrt(i);
}
return res;
}
void *thread_func(void *param)
{
cpu_set_t mask;
int rc;
/* bind thread to processor 0 */
CPU_ZERO(&mask);
CPU_SET(0, &mask);
/* pthread_setaffinity_np returns an error number and does not set errno */
rc = pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
if (rc != 0) {
fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(rc));
}
/* waste some time so the work is visible with "top" */
printf("result: %f\n", waste_time(2000));
/* thread switches to processor 1 now */
CPU_ZERO(&mask);
CPU_SET(1, &mask);
rc = pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
if (rc != 0) {
fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(rc));
}
/* waste some more time to see the processor switch */
printf("result: %f\n", waste_time(2000));
return NULL;
}
int main(int argc, char *argv[])
{
pthread_t my_thread;
int rc = pthread_create(&my_thread, NULL, thread_func, NULL);
if (rc != 0) {
fprintf(stderr, "pthread_create: %s\n", strerror(rc));
}
pthread_exit(NULL);
}
Compile the above program with the -D_GNU_SOURCE flag, and link with -lm and -pthread.
The scheduler will move a thread between the CPUs its affinity mask allows as it sees fit. To constrain tasks persistently at the system level, see cpuset:
http://man7.org/linux/man-pages/man7/cpuset.7.html
Or you can write a short program that sets the cpu affinity periodically (every few seconds) with sched_setaffinity

Can I set the process group of an existing process?

I have a bunch of mini-server processes running. They're in the same process group as a FastCGI server I need to stop. The FastCGI server will kill everything in its process group, but I need those mini-servers to keep running.
Can I change the process group of a running, non-child process (they're children of PID 1)? setpgid() fails with "No such process" though I'm positive it's there.
This is on Fedora Core 10.
NOTE the processes are already running. New servers do setsid(). These are some servers spawned by older code which did not.
One thing you could try is to do setsid() in the miniservers. That will make them session and process group leaders.
Also, keep in mind that you can't change the process group id to one from another session, and that you have to do the call to change the process group either from within the process that you want to change the group of, or from the parent of the process.
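The parent case can be demonstrated with a short sketch. The function name is made up, and it relies on the target being the caller's own child, in the same session, that has not yet called exec: the parent forks a child and then moves it into a new process group whose ID equals the child's pid.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that just sleeps, then put it into a
 * process group of its own. Returns the child's pid, or -1 on error. */
pid_t spawn_in_own_group(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        for (;;)
            pause();          /* stand-in for a long-running mini-server */
    }
    /* Permitted because the target is our own child in our own session. */
    if (setpgid(child, child) != 0) {
        kill(child, SIGKILL);
        waitpid(child, NULL, 0);
        return -1;
    }
    return child;
}
```

Afterwards getpgid(child) returns the child's pid, so a signal sent to the parent's process group no longer reaches it. The same setpgid() call made by an unrelated process on that pid is what produces the "No such process" error.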
I've recently written some test code to periodically change the process group of a set of processes for a very similar task. You need not change the group id periodically, it's just that I thought I might evade a certain script that periodically checked for a group that runs for longer than a certain amount of time. It may also help you track down the error that you get with setpgid():
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <string.h>
void err(const char *msg);
void prn(const char *msg);
void mydaemon();
int main(int argc, char *argv[]) {
mydaemon();
if (setsid() < 0)
err("setsid");
int secs = 5*60;
/* creating a pipe for the group leader to send changed
group ids to the child */
int pidx[2];
if (pipe(pidx))
err("pipe");
fcntl(pidx[0], F_SETFL, O_NONBLOCK);
fcntl(pidx[1], F_SETFL, O_NONBLOCK);
prn("begin");
/* here the child forks, it's a stand in for the set of
processes that need to have their group ids changed */
int child = fork();
switch (child) {
case -1: err("fork3");
case 0:
close(pidx[1]);
while(1) {
sleep(7);
secs -= 7;
if (secs <= 0) { prn("end child"); exit(0); }
int pid;
/* read new pid if available */
if (read(pidx[0], &pid, sizeof pid) != sizeof pid) continue;
/* set new process group id */
if (setpgid(getpid(), pid)) err("setpgid2");
prn("child group changed");
}
default: break;
}
close(pidx[0]);
/* here the group leader is forked every 20 seconds so that
a new process group can be sent to the child via the pipe */
while (1) {
sleep(20);
secs -= 20;
int pid = fork();
switch (pid) {
case -1: err("fork2");
case 0:
pid = getpid();
/* set process group leader for this process */
if (setpgid(pid, pid)) err("setpgid1");
/* inform child of change */
if (write(pidx[1], &pid, sizeof pid) != sizeof pid) err("write");
prn("group leader changed");
break;
default:
close(pidx[1]);
_exit(0);
}
if (secs <= 0) { prn("end leader"); exit(0); }
}
}
void prn(const char *msg) {
char buf[256];
strcpy(buf, msg);
strcat(buf, "\n");
write(2, buf, strlen(buf));
}
void err(const char *msg) {
char buf[256];
strcpy(buf, msg);
strcat(buf, ": ");
strcat(buf, strerror(errno));
prn(buf);
exit(1);
}
void mydaemon() {
int pid = fork();
switch (pid) {
case -1: err("fork");
case 0: break;
default: _exit(0);
}
close(0);
close(1);
/* close(2); let's keep stderr */
}
After some research I figured it out. Inshalla got the essential problem, "you can't change the process group id to one from another session" which explains why my setpgid() was failing (with a misleading message). However, it seems you can change it from any other process in the group (not necessarily the parent).
These processes were started by a FastCGI server, and that FastCGI server was still running and in the same process group. Hence the problem: I couldn't restart the FastCGI server without killing the servers it had spawned. I wrote a new CGI program which did a setpgid() on the running servers, executed it through a web request, and the problem was solved!
It sounds like you actually want to daemonise the process rather than move process groups. (Note: one can move process groups, but I believe you need to be in the same session and the target needs to already be a process group.)
But first, see if daemonising works for you:
#include <unistd.h>
#include <stdio.h>
int main() {
if (fork() == 0) {
setsid();
if (fork() == 0) {
printf("I'm still running! pid:%d", getpid());
sleep(10);
}
_exit(0);
}
return 0;
}
Obviously you should actually check for errors and such in real code, but the above should work.
The inner process will continue running even when the main process exits. Looking at the status of the inner process from /proc we find that it is, indeed, a child of init:
Name: a.out
State: S (sleeping)
Tgid: 21513
Pid: 21513
PPid: 1
TracerPid: 0
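What setsid() does to the session and process group can also be checked directly. A minimal sketch (the function name is hypothetical): a forked child calls setsid() and reports through its exit status whether it is now the leader of both a new session and a new process group.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: 1 if a forked child became session and process
 * group leader after setsid(), 0 if not, -1 on error. */
int child_becomes_session_leader(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* A freshly forked child is not a process group leader,
         * so setsid() is allowed here. */
        if (setsid() < 0)
            _exit(2);
        _exit(getsid(0) == getpid() && getpgrp() == getpid() ? 0 : 1);
    }
    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

setsid() fails with EPERM when the caller is already a process group leader, which is one reason the daemonising code forks before calling it.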

Multithreading Semaphore

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <semaphore.h>
void *thread_function(void *arg);
sem_t bin_sem;
#define WORK_SIZE 1024
char work_area[WORK_SIZE];
int main() {
int res;
pthread_t a_thread;
void *thread_result;
res = sem_init(&bin_sem, 0, 0);
if (res != 0) {
perror("Semaphore initialization failed");
exit(EXIT_FAILURE);
}
res = pthread_create(&a_thread, NULL, thread_function, NULL);
if (res != 0) {
perror("Thread creation failed");
exit(EXIT_FAILURE);
}
printf("Input some text. Enter 'end' to finish\n");
while(strncmp("end", work_area, 3) != 0) {
fgets(work_area, WORK_SIZE, stdin);
sem_post(&bin_sem);
}
printf("\nWaiting for thread to finish...\n");
res = pthread_join(a_thread, &thread_result);
if (res != 0) {
perror("Thread join failed");
exit(EXIT_FAILURE);
}
printf("Thread joined\n");
sem_destroy(&bin_sem);
exit(EXIT_SUCCESS);
}
void *thread_function(void *arg) {
sem_wait(&bin_sem);
while(strncmp("end", work_area, 3) != 0) {
printf("You input %zu characters\n", strlen(work_area) - 1);
sem_wait(&bin_sem);
}
pthread_exit(NULL);
}
In the program above, when the semaphore is released using sem_post(), is it possible that fgets and the counting code in thread_function execute simultaneously? I also think this program does not guarantee that the second thread counts the characters before the main thread reads the keyboard again. Is that right?
The second thread will only read characters after sem_wait has returned, signaling that a sem_post has been called somewhere, so I think that is fine.
As for fgets and the counting function, those two could be running simultaneously.
I would recommend a mutex lock on the work_area variable in this case, because if the user is editing the variable in one thread while it is being read in another thread, problems will occur.
You can either use a mutex or you can use a semaphore and set the initial count on it to 1.
If you use a mutex or a semaphore like that, though, make sure to call pthread_mutex_lock() after sem_wait(), or else a deadlock may occur.
In this example you want to have a mutex around the read & writes of the shared memory.
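A mutex alone keeps the two threads out of work_area at the same time, but it does not make the reader wait until the previous line has been counted. One possible alternative, sketched here with names of my own choosing and using a second semaphore in place of the mutex, is a strict hand-off: the producer waits until the buffer is free, and the consumer posts when it has finished with it.

```c
#include <pthread.h>
#include <semaphore.h>
#include <string.h>

#define WORK_SIZE 1024
static char work_area[WORK_SIZE];
static sem_t filled;        /* posted by the producer: work_area holds a new line */
static sem_t drained;       /* posted by the consumer: work_area may be overwritten */
static size_t total_chars;  /* running total kept by the consumer */

void handoff_init(void)
{
    sem_init(&filled, 0, 0);
    sem_init(&drained, 0, 1);   /* the buffer starts out free */
}

/* Producer side: wait until the buffer is free, copy the line in, wake the consumer. */
void submit_line(const char *line)
{
    sem_wait(&drained);
    strncpy(work_area, line, WORK_SIZE - 1);
    work_area[WORK_SIZE - 1] = '\0';
    sem_post(&filled);
}

/* Consumer thread: count the characters of each line until it sees "end". */
void *counter_thread(void *arg)
{
    (void)arg;
    for (;;) {
        sem_wait(&filled);
        if (strncmp("end", work_area, 3) == 0)
            break;
        total_chars += strlen(work_area);
        sem_post(&drained);     /* only now may the producer write again */
    }
    return NULL;
}
```

In the original program, main would call sem_wait(&drained) before fgets() and sem_post(&filled) after it, so every line is counted exactly once before the next one is read.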
I know this is an example, but the following code:
fgets(work_area, WORK_SIZE, stdin);
Should really be:
fgets(work_area, sizeof(work_area), stdin);
If you change the size of work_area in the future (to some other constant, etc), it's quite likely that changing this second WORK_SIZE could be missed.
