Linux kernel timer init_timer(): what happens if it is called many times?

I am trying to hunt down a bug that causes an intermittent crash with the PC around the get_next_timer_interrupt() code and sometimes in run_timer_softirq().
I found a driver that potentially calls init_timer() often with the same static argument (a timer_list) passed to it.
Will this cause an issue?
What exactly does init_timer() do, and is there a function that does the reverse to destroy it?
Thanks

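Calling init_timer() many times should not by itself cause a problem, provided the timer is not pending when it is re-initialized (the code below resets entry.next without unlinking the timer from any list). The code which is eventually invoked is: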
static void do_init_timer(struct timer_list *timer, unsigned int flags,
                          const char *name, struct lock_class_key *key)
{
    struct tvec_base *base = __raw_get_cpu_var(tvec_bases);

    timer->entry.next = NULL;
    timer->base = (void *)((unsigned long)base | flags);
    timer->slack = -1;
#ifdef CONFIG_TIMER_STATS
    timer->start_site = NULL;
    timer->start_pid = -1;
    memset(timer->start_comm, 0, TASK_COMM_LEN);
#endif
    lockdep_init_map(&timer->lockdep_map, name, key, 0);
}
It is reached through a few macros that call down to it, starting from init_timer().

Related

printf is producing the segmentation fault?

I'm learning threads, and my code runs up to the last print statement. Why does it give a segmentation fault at that print? I thought a possible reason could be a nonexistent address passed as an argument to printf, but that is not the reason; I'm passing a valid address.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

void *thread (void *vargp) {
    int arg = *((int*)vargp);
    return &arg;
}

int main () {
    pthread_t tid;
    int thread_arg = 0x7ffdbc32fa34;
    int *ret_value;
    pthread_create(&tid, NULL, thread, &thread_arg);
    pthread_join(tid, (void **)(&ret_value));
    printf("hello\n");
    printf("%X\n", *ret_value);
    return 0;
}
It gives the following output:
hello
Segmentation fault (core dumped)
Is it because I'm returning the address of a local variable, which gets destroyed once the thread returns? I don't think so, because changing to the following code also gives me a segmentation fault!
void *thread (void *vargp) {
    int * arg = malloc(sizeof(int));
    *arg = *((int*)vargp);
    return &arg;
}
"Is it because I'm returning an address of a local variable, which gets destroyed once thread is returned?"
Yes, it is.
"I don't think so, because changing to following code is also giving me segmentation fault!"
This code is also returning the address of a local variable (return &arg;). Instead, you should be returning the pointer value that malloc() returned (return arg;):
void *thread (void *vargp)
{
    int * arg = malloc(sizeof(int));
    *arg = *((int*)vargp);
    return arg;
}
You also should not be casting the address of ret_value to type void ** in main(): the variable is of type int *, not void *, so it shouldn't be written to through a void ** pointer (although, in practice, this will usually work). Instead, you should use a void * variable to hold the return value, then either cast this value to int * or assign it to a variable of type int *:
void *ret_value;
pthread_create(&tid, NULL, thread, &thread_arg);
pthread_join(tid, &ret_value);
printf("%X\n", *(int *)ret_value);
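Putting both fixes together, here is a minimal sketch of the whole round trip. The names copy_arg_thread and run_and_collect are hypothetical, chosen only for this illustration; the sketch assumes the joining side is responsible for freeing the allocation.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Thread body: copies its int argument into heap storage and returns
 * that pointer. Heap memory outlives the thread, unlike its locals. */
static void *copy_arg_thread(void *vargp)
{
    int *result = malloc(sizeof *result);

    if (result == NULL)
        return NULL;
    *result = *(int *)vargp;
    return result;              /* the pointer itself, not &result */
}

/* Hypothetical helper: runs one thread and returns the int it produced.
 * Returns -1 if the thread could not be created, joined, or allocated. */
int run_and_collect(int input)
{
    pthread_t tid;
    void *ret = NULL;           /* pthread_join() stores a void * here */
    int value;

    if (pthread_create(&tid, NULL, copy_arg_thread, &input) != 0)
        return -1;
    if (pthread_join(tid, &ret) != 0 || ret == NULL)
        return -1;

    value = *(int *)ret;        /* convert back to int * only when reading */
    free(ret);                  /* the joiner owns the allocation */
    return value;
}
```

A caller could then write printf("%X\n", (unsigned)run_and_collect(0x1234)); without touching freed or out-of-scope memory.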

Is popen really thread safe?

Recently I ran into trouble calling popen; it does not seem to be thread safe.
The following is the code snippet from the source code link: http://androidxref.com/9.0.0_r3/xref/bionic/libc/upstream-netbsd/lib/libc/gen/popen.c
static struct pid {
    struct pid *next;
    FILE *fp;
    int fd;
    pid_t pid;
} *pidlist;

static rwlock_t pidlist_lock = RWLOCK_INITIALIZER;

FILE *
popen(const char *command, const char *type)
{
    struct pid *cur, *old;
    ...
    pipe2(pdes, flags) // A
    ...
    (void)rwlock_rdlock(&pidlist_lock); // C
    ...
    switch (pid = vfork()) { // C.1
    case 0: /* Child. */
        ...
        _exit(127);
        /* NOTREACHED */
    }
    /* Parent; */
    ...
    /* Link into list of file descriptors. */
    cur->fp = iop;
    cur->pid = pid;
    cur->next = pidlist; // D
    pidlist = cur; // E
    (void)rwlock_unlock(&pidlist_lock); // F
    ...
}
In the above code, a read lock is acquired at C, but inside that scope there is a write at E. So multiple threads holding the read lock could be writing to the variable "pidlist" at the same time.
Does anyone know whether this is a real issue or not?
Looking at the code, it is not how you outline it.
Here is a paste from the given link http://androidxref.com/9.0.0_r3/xref/bionic/libc/upstream-netbsd/lib/libc/gen/popen.c
FILE *
popen(const char *command, const char *type)
{
    struct pid *cur, *old;
    ...
    if (strchr(xtype, '+')) {
    ...
    } else {
        ...
        if (pipe2(pdes, flags) == -1)
            return NULL;
    }

    ...

    (void)rwlock_rdlock(&pidlist_lock);
    (void)__readlockenv();
    switch (pid = vfork()) {
    case -1: /* Error. */
    ...
    case 0: /* Child. */
    ...
        execl(_PATH_BSHELL, "sh", "-c", command, NULL);
        _exit(127);
        /* NOTREACHED */
    }
    (void)__unlockenv();

    /* Parent; assume fdopen can't fail. */
    if (*xtype == 'r') {
    ...
    } else {
    ...
    }
    ...
    pidlist = cur;
    (void)rwlock_unlock(&pidlist_lock);

    return (iop);
}
The sequence is:
1. do the file stuff (pipe2() or socketpair())
2. acquire the lock (this will avoid the child doing silly things)
3. vfork()
4. in the child, reorganize the file descriptors and execl()
5. in the parent, do some file descriptor stuff, unlock, and return
I do not see what the problem would be.

Large-size kmalloc in the Linux kernel

I am looking at Linux version 4.9.31 and at the kmalloc() function for slab and slub.
The following is the kmalloc() function from include/linux/slab.h:
static __always_inline void *kmalloc(size_t size, gfp_t flags)
{
    if (__builtin_constant_p(size)) {
        if (size > KMALLOC_MAX_CACHE_SIZE)
            return kmalloc_large(size, flags);
#ifndef CONFIG_SLOB
        if (!(flags & GFP_DMA)) {
            int index = kmalloc_index(size);

            if (!index)
                return ZERO_SIZE_PTR;

            return kmem_cache_alloc_trace(kmalloc_caches[index],
                    flags, size);
        }
#endif
    }
    return __kmalloc(size, flags);
}
In the above code, kmalloc_large() is called when __builtin_constant_p(size) is true.
First question: what is the relationship between __builtin_constant_p(size) and kmalloc_large()? Shouldn't kmalloc_large() be called at runtime, not at compile time?
The following are __kmalloc() and __do_kmalloc() from mm/slab.c:
static __always_inline void *__do_kmalloc(size_t size, gfp_t flags,
                                          unsigned long caller)
{
    struct kmem_cache *cachep;
    void *ret;

    cachep = kmalloc_slab(size, flags);
    if (unlikely(ZERO_OR_NULL_PTR(cachep)))
        return cachep;
    ret = slab_alloc(cachep, flags, caller);

    kasan_kmalloc(cachep, ret, size, flags);
    trace_kmalloc(caller, ret,
                  size, cachep->size, flags);

    return ret;
}

void *__kmalloc(size_t size, gfp_t flags)
{
    return __do_kmalloc(size, flags, _RET_IP_);
}
The following is __kmalloc() from mm/slub.c:
void *__kmalloc(size_t size, gfp_t flags)
{
    struct kmem_cache *s;
    void *ret;

    if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
        return kmalloc_large(size, flags);

    s = kmalloc_slab(size, flags);

    if (unlikely(ZERO_OR_NULL_PTR(s)))
        return s;

    ret = slab_alloc(s, flags, _RET_IP_);

    trace_kmalloc(_RET_IP_, ret, size, s->size, flags);

    kasan_kmalloc(s, ret, size, flags);

    return ret;
}
Second question: why does SLUB's __kmalloc() check "size > KMALLOC_MAX_CACHE_SIZE" and call kmalloc_large() at runtime?
Your two questions are actually parts of a single question:
What is __builtin_constant_p(size)?
The operator __builtin_constant_p is a gcc-specific extension that checks whether its argument can be evaluated at compile time. E.g., if you call
p = kmalloc(100, GFP_KERNEL);
then the operator returns true.
But with
size_t size = 100;
p = kmalloc(size, GFP_KERNEL);
the operator returns false*.
Knowing that a function's argument is a compile-time constant makes it possible to check it at compile time and to perform some optimizations.
if (__builtin_constant_p(size)) {
if (size > KMALLOC_MAX_CACHE_SIZE)
While size > KMALLOC_MAX_CACHE_SIZE looks like a runtime check here, it is actually a compile-time check, because the outer condition guarantees that size is known at compile time. With that knowledge, the compiler can optimize out the inner branch if its condition is false (and if it is true, the compiler can optimize out the other branches).
E.g.,
p = kmalloc(100000, GFP_KERNEL);
will be compiled into
kmalloc_large(100000, GFP_KERNEL);
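and
p = kmalloc(100, GFP_KERNEL);
will be compiled into a direct allocation from the matching kmalloc cache (per the code above, when CONFIG_SLOB is not set and the flags do not include GFP_DMA; otherwise it falls through to __kmalloc(100, GFP_KERNEL)):
kmem_cache_alloc_trace(kmalloc_caches[kmalloc_index(100)], GFP_KERNEL, 100);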
But
size_t size = 100000;
p = kmalloc(size, GFP_KERNEL);
will be compiled into
size_t size = 100000;
__kmalloc(size, GFP_KERNEL);
because compiler cannot predict the branch at compile time.
The implementation of the fallback function __kmalloc() checks its parameters anyway, for the case when the compile-time checks cannot be performed.
* In my recent tests the compiler actually doesn't try to predict the value of a size variable that has been assigned directly from a constant. But this may change in future gcc versions.
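The compile-time versus runtime split can be tried outside the kernel with a small sketch. Everything named here (SMALL_LIMIT, alloc_path, runtime_path, PICK_PATH) is a hypothetical stand-in, not kernel code. It is written as a macro rather than an inline function so that the result of __builtin_constant_p does not depend on the optimization level; the kernel's inline version relies on inlining and constant propagation instead.

```c
#include <stddef.h>

/* Hypothetical stand-in for KMALLOC_MAX_CACHE_SIZE. */
#define SMALL_LIMIT 8192

enum alloc_path { PATH_LARGE, PATH_CACHE };

/* Runtime fallback, analogous in spirit to SLUB's __kmalloc(): it has to
 * repeat the size check because it may be reached with any value. */
static enum alloc_path runtime_path(size_t size)
{
    return size > SMALL_LIMIT ? PATH_LARGE : PATH_CACHE;
}

/* Nonzero when the compiler can evaluate `size` at compile time. */
#define DECIDED_AT_COMPILE_TIME(size) __builtin_constant_p(size)

/* Dispatch: a constant size is classified by a constant expression that
 * the compiler can fold away; any other size goes to the runtime check. */
#define PICK_PATH(size)                                        \
    (__builtin_constant_p(size)                                \
        ? ((size) > SMALL_LIMIT ? PATH_LARGE : PATH_CACHE)     \
        : runtime_path(size))
```

With a literal such as PICK_PATH(100000) the whole expression reduces to PATH_LARGE at compile time, while a value read from a volatile variable always takes the runtime_path() route, which is why the fallback still needs its own size check.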

Unbound workqueue's kthreads CPU affinity

Is there a way to set the CPU affinity for an unbound workqueue's kthreads (the ones named kworker/uXX:y)? Something like the cpu mask for regular workqueues.
Is it a good idea to set it for each kthread using taskset?
The workqueue subsystem exports a sysfs attribute for setting the cpu affinity of unbound workers.
The code can be found in kernel/workqueue.c:
static ssize_t wq_unbound_cpumask_store(struct device *dev,
        struct device_attribute *attr, const char *buf, size_t count)
{
    cpumask_var_t cpumask;
    int ret;

    if (!zalloc_cpumask_var(&cpumask, GFP_KERNEL))
        return -ENOMEM;

    ret = cpumask_parse(buf, cpumask);
    if (!ret)
        ret = workqueue_set_unbound_cpumask(cpumask);

    free_cpumask_var(cpumask);
    return ret ? ret : count;
}

static struct device_attribute wq_sysfs_cpumask_attr =
    __ATTR(cpumask, 0644, wq_unbound_cpumask_show,
           wq_unbound_cpumask_store);
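So a user space application with sufficient privileges (the attribute is created with mode 0644, so writing normally requires root) can write to this sysfs attribute to set the unbound workqueue cpu mask.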
I hope this answers your query.
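A minimal user-space sketch of such a write follows. The path /sys/devices/virtual/workqueue/cpumask is an assumption about where this attribute is exposed and should be verified on the running kernel; format_cpumask and write_cpumask are hypothetical helper names, and the sketch only handles masks covering up to 32 CPUs.

```c
#include <stdio.h>

/* Assumed location of the attribute; check it on your system. */
#define WQ_UNBOUND_CPUMASK "/sys/devices/virtual/workqueue/cpumask"

/* Formats a mask of up to 32 CPUs as a plain hex string, the form that
 * cpumask_parse() reads. Returns the string length, or -1 if buf is too
 * small. */
int format_cpumask(unsigned int mask, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%x", mask);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Writes the mask to the given sysfs file. Returns 0 on success and -1
 * on any failure (missing file, no permission, write rejected). */
int write_cpumask(const char *path, unsigned int mask)
{
    char text[16];
    FILE *f;
    int ok;

    if (format_cpumask(mask, text, sizeof text) < 0)
        return -1;

    f = fopen(path, "w");
    if (f == NULL)
        return -1;

    ok = fputs(text, f) >= 0;
    if (fclose(f) != 0)         /* the kernel's verdict arrives on flush */
        ok = 0;
    return ok ? 0 : -1;
}
```

For example, write_cpumask(WQ_UNBOUND_CPUMASK, 0x3) would ask the kernel to restrict unbound workers to CPUs 0 and 1, assuming the path exists and the caller is allowed to write to it.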

Shared memory across processes on Linux/x86_64

I have a few questions on using shared memory with processes. I looked at several previous posts and couldn't glean the answers precisely enough. Thanks in advance for your help.
1. I'm using shm_open + mmap like below. This code works as intended with parent and child alternating to increment g_shared->count (the synchronization is not portable; it works only for certain memory models, but good enough for my case for now). However, when I change MAP_SHARED to MAP_ANONYMOUS | MAP_SHARED, the memory isn't shared and the program hangs since the 'flag' doesn't get flipped. Removing the flag confirms what's happening with each process counting from 0 to 10 (implying that each has its own copy of the structure and hence the 'count' field). Is this the expected behavior? I don't want the memory to be backed by a file; I really want to emulate what might happen if these were threads instead of processes (they need to be processes for other reasons).
2. Do I really need shm_open? Since the processes belong to the same hierarchy, can I just use mmap alone instead? I understand this would be fairly straightforward if there wasn't an 'exec,' but how do I get it to work when there is an 'exec' following the 'fork?'
3. I'm using kernel version 3.2.0-23 on x86_64 (Intel i7-2600). For this implementation, does mmap give the same behavior (correctness as well as performance) as shared memory with pthreads sharing the same global object? For example, does the MMU map the segment with 'cacheable' MTRR/TLB attributes?
4. Is the cleanup_shared() code correct? Is it leaking any memory? How could I check? For example, is there an equivalent of System V's 'ipcs?'
thanks,
/Doobs
shmem.h:
#ifndef __SHMEM_H__
#define __SHMEM_H__

//includes

#define LEN 1000
#define ITERS 10
#define SHM_FNAME "/myshm"

typedef struct shmem_obj {
    int count;
    char buff[LEN];
    volatile int flag;
} shmem_t;

extern shmem_t* g_shared;
extern char proc_name[100];
extern int fd;

void cleanup_shared() {
    munmap(g_shared, sizeof(shmem_t));
    close(fd);
    shm_unlink(SHM_FNAME);
}

static inline
void init_shared() {
    int oflag;

    if (!strcmp(proc_name, "parent")) {
        oflag = O_CREAT | O_RDWR;
    } else {
        oflag = O_RDWR;
    }

    fd = shm_open(SHM_FNAME, oflag, (S_IREAD | S_IWRITE));
    if (fd == -1) {
        perror("shm_open");
        exit(EXIT_FAILURE);
    }

    if (ftruncate(fd, sizeof(shmem_t)) == -1) {
        perror("ftruncate");
        shm_unlink(SHM_FNAME);
        exit(EXIT_FAILURE);
    }

    g_shared = mmap(NULL, sizeof(shmem_t),
                    (PROT_WRITE | PROT_READ),
                    MAP_SHARED, fd, 0);
    if (g_shared == MAP_FAILED) {
        perror("mmap");
        cleanup_shared();
        exit(EXIT_FAILURE);
    }
}

static inline
void proc_write(const char* s) {
    fprintf(stderr, "[%s] %s\n", proc_name, s);
}

#endif // __SHMEM_H__
shmem1.c (parent process):
#include "shmem.h"

int fd;
shmem_t* g_shared;
char proc_name[100];

void work() {
    int i;
    for (i = 0; i < ITERS; ++i) {
        while (g_shared->flag);
        ++g_shared->count;
        sprintf(g_shared->buff, "%s: %d", proc_name, g_shared->count);
        proc_write(g_shared->buff);
        g_shared->flag = !g_shared->flag;
    }
}

int main(int argc, char* argv[], char* envp[]) {
    int status, child;
    strcpy(proc_name, "parent");
    init_shared(argv);
    fprintf(stderr, "Map address is: %p\n", g_shared);

    if (child = fork()) {
        work();
        waitpid(child, &status, 0);
        cleanup_shared();
        fprintf(stderr, "Parent finished!\n");
    } else { /* child executes shmem2 */
        execvpe("./shmem2", argv + 2, envp);
    }
}
shmem2.c (child process):
#include "shmem.h"

int fd;
shmem_t* g_shared;
char proc_name[100];

void work() {
    int i;
    for (i = 0; i < ITERS; ++i) {
        while (!g_shared->flag);
        ++g_shared->count;
        sprintf(g_shared->buff, "%s: %d", proc_name, g_shared->count);
        proc_write(g_shared->buff);
        g_shared->flag = !g_shared->flag;
    }
}

int main(int argc, char* argv[], char* envp[]) {
    int status;
    strcpy(proc_name, "child");
    init_shared(argv);
    fprintf(stderr, "Map address is: %p\n", g_shared);
    work();
    cleanup_shared();
    return 0;
}
1. Passing MAP_ANONYMOUS causes the kernel to ignore your file descriptor argument and give you a fresh anonymous mapping instead of one backed by the shared memory object. Because each process calls mmap() separately after the exec, each one gets its own unrelated region. That's not what you want.
2. Yes, you can create an anonymous shared mapping in a parent process, fork, and have the child process inherit the mapping, sharing the memory with the parent and any other children. That obviously doesn't survive an exec() though.
3. I don't understand this question; pthreads doesn't allocate memory. The cacheability will depend on the file descriptor you mapped. If it's a disk file or anonymous mapping, then it's cacheable memory. If it's a video framebuffer device, it's probably not.
4. That's the right way to call munmap(), but I didn't verify the logic beyond that. All processes need to unmap; only one should call unlink.
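The fork-only case described above (an anonymous shared mapping inherited by the child, with no exec) can be sketched as follows. The function name share_int_with_child is hypothetical, and waitpid() is used as the only synchronization, which is enough for this one-shot illustration but not for the alternating counter in the question.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS under strict -std modes */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Maps one int as shared anonymous memory, forks, lets the child store
 * `value` into it, and returns what the parent reads back afterwards.
 * Returns -1 if mmap(), fork(), or waitpid() fails. */
int share_int_with_child(int value)
{
    int *shared;
    int seen, status;
    pid_t pid;

    /* No file descriptor is involved: fd is -1 and the offset is 0. */
    shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid = fork();
    if (pid == -1) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        *shared = value;        /* child writes through the inherited mapping */
        _exit(0);
    }

    /* Parent: wait for the child so its write has completed. */
    if (waitpid(pid, &status, 0) == -1) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    seen = *shared;
    munmap(shared, sizeof *shared);
    return seen;
}
```

If the mapping were MAP_PRIVATE instead, the parent would read back 0, since the child's write would land in its own copy of the page.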
2b) as a middle-ground of a sort, it is possible to call:
int const shm_fd = shm_open(fn,...);
shm_unlink(fn);
in a parent process, and then pass fd to a child process created by fork()/execve() via argp or envp. since open file descriptors of this type will survive the fork()/execve(), you can mmap the fd in both the parent process and any derived processes. here's a more complete code example copied and simplified/sanitized from code i ran successfully under Ubuntu 12.04 / linux kernel 3.13 / glibc 2.15:
int create_shm_fd( void ) {
    int oflags = O_RDWR | O_CREAT | O_TRUNC;
    string const fn = "/some_shm_fn_maybe_with_pid";
    int fd;
    neg_one_fail( fd = shm_open( fn.c_str(), oflags, S_IRUSR | S_IWUSR ), "shm_open" );
    if( fd == -1 ) { rt_err( strprintf( "shm_open() failed with errno=%s", str(errno).c_str() ) ); }
    // for now, we'll just pass the open fd to our child process, so
    // we don't need the file/name/link anymore, and by unlinking it
    // here we can try to minimize the chance / amount of OS-level shm
    // leakage.
    neg_one_fail( shm_unlink( fn.c_str() ), "shm_unlink" );
    // by default, the fd returned from shm_open() has FD_CLOEXEC
    // set. it seems okay to remove it so that it will stay open
    // across execve.
    int fd_flags = 0;
    neg_one_fail( fd_flags = fcntl( fd, F_GETFD ), "fcntl" );
    fd_flags &= ~FD_CLOEXEC;
    neg_one_fail( fcntl( fd, F_SETFD, fd_flags ), "fcntl" );
    // resize the shm segment for later mapping via mmap()
    neg_one_fail( ftruncate( fd, 1024*1024*4 ), "ftruncate" );
    return fd;
}
it's not 100% clear to me if it's okay spec-wise to remove the FD_CLOEXEC and/or assume that after doing so the fd really will survive the exec. the man page for exec is unclear; it says: "POSIX shared memory regions are unmapped", but to me that's redundant with the general comments earlier that mappings are not preserved, and doesn't say that a shm_open()'d fd will be closed. and of course there's the fact that, as i mentioned, the code does seem to work in at least one case.
the reason i might use this approach is that it would seem to reduce the chance of leaking the shared memory segment / filename, and it makes it clear that i don't need persistence of the memory segment.
