Once the following code is running, it will eat all my memory and cause OOM issue on my ARM 64-bit processor. it seems that some 'delete' operations do not work... it confused me. Could any experts help to analysis what is the root cause? Ttoolchain or libc?
I run the same code in another two platforms, one is x64, one is another ARM 32-bit chip; they all fine except my IPQ ARM chip.
environment:
libc version: musl libc 1.1.16
gcc version: 5.2.0
cpu chip: IPQ ARM 64bit
os: linux 4.1
#include <iostream>
#include <cstring>
#include <list>
#include <memory>
#include <pthread.h>
void *h(void *param)
{
while(1)
{
char *p = new char[4096];
delete[] p;
}
return NULL;
}
int main(int argc, char *argv)
{
pthread_t th[50];
int thread_cnt = 2;
for(int i = 0; i < thread_cnt; i++)
pthread_create(th+i, NULL, h, NULL);
for(int i = 0; i < thread_cnt; i++)
pthread_join(th+i, NULL);
return 0;
}
If I use a mutex in the header-file thread callback, the issue will disappear; there is no memory leak.
std::mutex thlock;
void *h(void *param)
{
while(1)
{
std::lock_guard<std::mutex> lk(thlock);
char *p = new char[4096];
delete[] p;
}
return NULL;
}
If I change the memory allocation size to 1024, then a weird thing happed. 2 threads do not leak memory, but 20 threads will cause memory leak. Whatever the allocation size, there's no issue with only 1 thread.
void *h(void *param)
{
while(1)
{
//char *p = new char[4096];
char *p = new char[1024];
delete[] p;
}
return NULL;
}
If I replace new/delete to std::malloc/std::free, the issue will disappear; no memory leak.
void *h(void *param)
{
while(1)
{
//char *p = new char[4096];
char *p = (char *)std::malloc(4096);
std::free(p);
//delete[] p;
}
return NULL;
}
So I thought if I overload operator new/delete and my "new/delete" allocate memory by using std::malloc/free, maybe the issue could be solved. But
the weird thing happened again; the memory leak still exists.
void* operator new(std::size_t sz) // no inline, required by [replacement.functions]/3
{
std::printf("global op new called, size = %zu\n", sz);
if (sz == 0)
++sz; // avoid std::malloc(0) which may return nullptr on success
if (void *ptr = std::malloc(sz))
return ptr;
throw std::bad_alloc{}; // required by [new.delete.single]/3
}
void operator delete(void* ptr) noexcept
{
std::puts("global op delete called");
std::free(ptr);
}
void *h(void *param)
{
while(1)
{
char *p = new char[4096];
delete[] p;
}
return NULL;
}
So I thought the memory leak may be caused by my operator new/delete rather than malloc/free, so I changed the code to look like what's shown below. I don't really call malloc/free but return a pre-allocated buffer. But the memory leak issue disappeared, so it seems the issue not only comes from new/delete...
char pp[4096];
void* operator new(std::size_t sz) // no inline, required by [replacement.functions]/3
{
std::printf("global op new called, size = %zu\n", sz);
if (sz == 0)
++sz; // avoid std::malloc(0) which may return nullptr on success
return (void *)pp;
//if (void *ptr = std::malloc(sz))
//return ptr;
throw std::bad_alloc{}; // required by [new.delete.single]/3
}
void operator delete(void* ptr) noexcept
{
std::puts("global op delete called");
//std::free(ptr);
}
So, what happened? Could any experts give me some idea? I guess it maybe caused musl libc thread lib; it seems that thread synchronization does not work fine... or is there a patch that already fixed this issue?
this issue is involved by musl libc, and has been fixed by the following patch.
https://git.musl-libc.org/cgit/musl/commit/src/malloc?id=3e16313f8fe2ed143ae0267fd79d63014c24779f
above patch involved a buffur overflow issue of realloc and has been fixed by below 2 patches:
https://git.musl-libc.org/cgit/musl/commit/src/malloc?id=cb5babdc8d624a3e3e7bea0b4e28a677a2f2fc46
https://git.musl-libc.org/cgit/musl/commit/src/malloc?id=fca7428c096066482d8c3f52450810288e27515c
Related
Below is the code:
#define FILENAME "kernel.code"
#define kernel_name "hello_world"
#define THREADS 4
std::vector<char> load_file()
{
std::ifstream file(FILENAME, std::ios::binary | std::ios::ate);
std::streamsize fsize = file.tellg();
file.seekg(0, std::ios::beg);
std::vector<char> buffer(fsize);
if (!file.read(buffer.data(), fsize)) {
failed("could not open code object '%s'\n", FILENAME);
}
return buffer;
}
struct joinable_thread : std::thread
{
template <class... Xs>
joinable_thread(Xs&&... xs) : std::thread(std::forward<Xs>(xs)...) // NOLINT
{
}
joinable_thread& operator=(joinable_thread&& other) = default;
joinable_thread(joinable_thread&& other) = default;
~joinable_thread()
{
if(this->joinable())
this->join();
}
};
void run(const std::vector<char>& buffer) {
CUdevice device;
CUDACHECK(cuDeviceGet(&device, 0));
CUcontext context;
CUDACHECK(cuCtxCreate(&context, 0, device));
CUmodule Module;
CUDACHECK(cuModuleLoadData(&Module, &buffer[0]));
...
}
void run_multi_threads(uint32_t n) {
{
auto buffer = load_file();
std::vector<joinable_thread> threads;
for (uint32_t i = 0; i < n; i++) {
threads.emplace_back(std::thread{[&, i, buffer] {
run(buffer);
}});
}
}
}
int main() {
CUDACHECK(cuInit(0));
run_multi_threads(THREADS);
}
And the code kernel.cu used for ptx is as follows:
#include "cuda_runtime.h"
extern "C" __global__ void hello_world(float* a, float* b) {
int tx = threadIdx.x;
b[tx] = a[tx];
}
I m generating the ptx in this way
nvcc --ptx kernel.cu -o kernel.code
Im using a machine with GeForce GTX TITAN X.
And Im facing this "PTX JIT compilation failed" from cuModuleLoadData error, only when I m trying to use this with multiple threads. If i remove the multi-threading part and run normally, this error doesn't occur.
Can anyone please tell me what is going wrong and how to overcome this.
As mentioned in the comments, I was able to get it to work by moving the load_file() call to the main, so that the buffer read from the file is valid, and then pass only the buffer to all the threads.
Actually in the original code, the buffer will be deconstructed once it leaves the '{...}' scope. So when thread starts, you may read the invalid buffer.
If you put your buffer in the main, it will not be deconstructed or freed until the program exits.
So yes, it's because you pass the invalid buffer (which may have already been freed) to the cu code.
me and my friend are trying to develop custom memory allocator in linux ubuntu 16.04.
We got stuck because of an error, btw its our first time
that we are trying to code something like that so we are not the best debuggers the error is : Segmentation fault (core dumped)
and here is the code.
can anybody help us understand whats wrong ?
Thank you!
#include <unistd.h>
#include <string.h>
#include <pthread.h>
#include <stdio.h>
struct header_t {
size_t size;
unsigned is_free;
struct header_t *next; };
struct header_t *head = NULL, *tail = NULL;
pthread_mutex_t global_malloc_lock;
struct header_t *get_free_block(size_t size)
{
struct header_t *curr = head;
while(curr) {
/* see if there's a free block that can accomodate requested size */
if (curr->is_free && curr->size >= size)
return curr;
curr = curr->next;
}
return NULL;
}
void free(void *block)
{
struct header_t *header, *tmp;
/* program break is the end of the
process's data segment */
void *programbreak;
if (!block)
return;
pthread_mutex_lock(&global_malloc_lock);
header = (struct header_t*)block - 1;
/* sbrk(0) gives the current program break address */
programbreak = sbrk(0);
/*
Check if the block to be freed is the last one in the
linked list. If it is, then we could shrink the size of the
heap and release memory to OS. Else, we will keep the block
but mark it as free. */
if ((char*)block + header->size == programbreak) {
if (head == tail) {
head = tail = NULL;
} else {
tmp = head;
while (tmp) {
if(tmp->next == tail) {
tmp->next = NULL;
tail = tmp;
}
tmp = tmp->next;
}
}
/* sbrk() with a negative argument decrements the program break.
So memory is released by the program to OS. */
sbrk(0 - header->size - sizeof(struct header_t));
/* Note: This lock does not really assure thread
safety, because sbrk() itself is not really
thread safe. Suppose there occurs a foregin sbrk(N)
after we find the program break and before we decrement
it, then we end up realeasing the memory obtained by
the foreign sbrk(). */
pthread_mutex_unlock(&global_malloc_lock);
return;
}
header->is_free = 1;
pthread_mutex_unlock(&global_malloc_lock);
}
void *malloc(size_t size)
{
size_t total_size;
void *block;
struct header_t *header;
if (!size)
return NULL;
pthread_mutex_lock(&global_malloc_lock);
header = get_free_block(size);
if (header) {
/* Woah, found a free block to accomodate requested memory. */
header->is_free = 0;
pthread_mutex_unlock(&global_malloc_lock);
return (void*)(header + 1);
}
/* We need to get memory to fit in the requested block and header
from OS. */
total_size = sizeof(struct header_t) + size;
block = sbrk(total_size);
if (block == (void*) -1) {
pthread_mutex_unlock(&global_malloc_lock);
return NULL;
}
header = block;
header->size = size;
header->is_free = 0;
header->next = NULL;
if (!head)
head = header;
if (tail)
tail->next = header;
tail = header;
pthread_mutex_unlock(&global_malloc_lock);
return (void*)(header + 1);
}
void *calloc(size_t num, size_t nsize)
{
size_t size;
void *block;
if (!num || !nsize)
return NULL;
size = num * nsize;
/* check mul overflow */
if (nsize != size / num)
return NULL;
block = malloc(size);
if (!block)
return NULL;
memset(block, 0, size);
return block;
}
void *realloc(void *block, size_t size)
{
struct header_t *header;
void *ret;
if (!block || !size)
return malloc(size);
header = (struct header_t*)block - 1;
if (header->size >= size)
return block;
ret = malloc(size);
if (ret) {
/* Relocate contents to the new bigger block */
memcpy(ret, block, header->size);
/* Free the old memory block */
free(block);
}
return ret;
}
The problem occurred because the functions were not prototyped [decalred].
Once I added functions prototype. The code worked.
For more information about prototyping: http://www.trytoprogram.com/c-programming/function-prototype-in-c/
mutex variable should be initialized before using it for applying lock. your global_malloc_lock is not initialized.
you can't initialize mutex variable as of normal variable.
pthread_mutex_t global_malloc_lock = 0 ;// invalid .. you may thinking since it's it declared as global it's initialized with 0 which is wrong
Initialize the mutex variable by calling pthread_mutex_init() or using PTHREAD_MUTEX_INITIALIZER ;
for your code add this
pthread_mutex_t global_malloc_lock = pthread_mutex_t global_malloc_lock;
I've been whittling down this segfault for a while, and here's a pretty minimal reproducible example on my machine (below). I have the sinking feeling that it's a driver bug, but I'm very unfamiliar with OpenGL, so it's more likely I'm just doing something wrong.
Is this correct OpenGL 3.3 code? Should be fine regardless of platform and compiler and all that?
Here's the code, compiled with gcc -ggdb -lGL -lSDL2
#include <stdio.h>
#include "GL/gl.h"
#include "GL/glext.h"
#include "SDL2/SDL.h"
// this section is for loading OpenGL things from later versions.
typedef void (APIENTRY *GLGenVertexArrays) (GLsizei n, GLuint *arrays);
typedef void (APIENTRY *GLGenBuffers) (GLsizei n, GLuint *buffers);
typedef void (APIENTRY *GLBindVertexArray) (GLuint array);
typedef void (APIENTRY *GLBindBuffer) (GLenum target, GLuint buffer);
typedef void (APIENTRY *GLBufferData) (GLenum target, GLsizeiptr size, const GLvoid* data, GLenum usage);
typedef void (APIENTRY *GLBufferSubData) (GLenum target, GLintptr offset, GLsizeiptr size, const GLvoid* data);
typedef void (APIENTRY *GLGetBufferSubData) (GLenum target, GLintptr offset, GLsizeiptr size, const GLvoid* data);
typedef void (APIENTRY *GLFlush) (void);
typedef void (APIENTRY *GLFinish) (void);
GLGenVertexArrays glGenVertexArrays = NULL;
GLGenBuffers glGenBuffers = NULL;
GLBindVertexArray glBindVertexArray = NULL;
GLBindBuffer glBindBuffer = NULL;
GLBufferData glBufferData = NULL;
GLBufferSubData glBufferSubData = NULL;
GLGetBufferSubData glGetBufferSubData = NULL;
void load_gl_pointers() {
glGenVertexArrays = (GLGenVertexArrays)SDL_GL_GetProcAddress("glGenVertexArrays");
glGenBuffers = (GLGenBuffers)SDL_GL_GetProcAddress("glGenBuffers");
glBindVertexArray = (GLBindVertexArray)SDL_GL_GetProcAddress("glBindVertexArray");
glBindBuffer = (GLBindBuffer)SDL_GL_GetProcAddress("glBindBuffer");
glBufferData = (GLBufferData)SDL_GL_GetProcAddress("glBufferData");
glBufferSubData = (GLBufferSubData)SDL_GL_GetProcAddress("glBufferSubData");
glGetBufferSubData = (GLGetBufferSubData)SDL_GL_GetProcAddress("glGetBufferSubData");
}
// end OpenGL loading stuff
#define CAPACITY (1 << 8)
// return nonzero if an OpenGL error has occurred.
int opengl_checkerr(const char* const label) {
GLenum err;
switch(err = glGetError()) {
case GL_INVALID_ENUM:
printf("GL_INVALID_ENUM");
break;
case GL_INVALID_VALUE:
printf("GL_INVALID_VALUE");
break;
case GL_INVALID_OPERATION:
printf("GL_INVALID_OPERATION");
break;
case GL_INVALID_FRAMEBUFFER_OPERATION:
printf("GL_INVALID_FRAMEBUFFER_OPERATION");
break;
case GL_OUT_OF_MEMORY:
printf("GL_OUT_OF_MEMORY");
break;
case GL_STACK_UNDERFLOW:
printf("GL_STACK_UNDERFLOW");
break;
case GL_STACK_OVERFLOW:
printf("GL_STACK_OVERFLOW");
break;
default: return 0;
}
printf(" %s\n", label);
return 1;
}
int main(int nargs, const char* args[]) {
printf("initializing..\n");
SDL_Init(SDL_INIT_EVERYTHING);
SDL_GL_SetAttribute(SDL_GL_CONTEXT_MAJOR_VERSION, 3);
SDL_GL_SetAttribute(SDL_GL_CONTEXT_MINOR_VERSION, 3);
SDL_GL_SetAttribute(SDL_GL_CONTEXT_PROFILE_MASK, SDL_GL_CONTEXT_PROFILE_CORE);
SDL_Window* const w =
SDL_CreateWindow(
"broken",
SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED,
1, 1,
SDL_WINDOW_OPENGL
);
if(w == NULL) {
printf("window was null\n");
return 0;
}
SDL_GLContext context = SDL_GL_CreateContext(w);
if(context == NULL) {
printf("context was null\n");
return 0;
}
load_gl_pointers();
if(opengl_checkerr("init")) {
return 1;
}
printf("GL_VENDOR: %s\n", glGetString(GL_VENDOR));
printf("GL_RENDERER: %s\n", glGetString(GL_RENDERER));
float* const vs = malloc(CAPACITY * sizeof(float));
memset(vs, 0, CAPACITY * sizeof(float));
unsigned int i = 0;
while(i < 128000) {
GLuint vertex_array;
GLuint vertex_buffer;
glGenVertexArrays(1, &vertex_array);
glBindVertexArray(vertex_array);
glGenBuffers(1, &vertex_buffer);
glBindBuffer(GL_ARRAY_BUFFER, vertex_buffer);
if(opengl_checkerr("gen/binding")) {
return 1;
}
glBufferData(
GL_ARRAY_BUFFER,
CAPACITY * sizeof(float),
vs, // initialize with `vs` just to make sure it's allocated.
GL_DYNAMIC_DRAW
);
// verify that the memory is allocated by reading it back into `vs`.
glGetBufferSubData(
GL_ARRAY_BUFFER,
0,
CAPACITY * sizeof(float),
vs
);
if(opengl_checkerr("creating buffer")) {
return 1;
}
glFlush();
glFinish();
// segfault occurs here..
glBufferSubData(
GL_ARRAY_BUFFER,
0,
CAPACITY * sizeof(float),
vs
);
glFlush();
glFinish();
++i;
}
return 0;
}
When I bump the iterations from 64k to 128k, I start getting:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff754c859 in __memcpy_sse2_unaligned () from /usr/lib/libc.so.6
(gdb) bt
#0 0x00007ffff754c859 in __memcpy_sse2_unaligned () from /usr/lib/libc.so.6
#1 0x00007ffff2ea154d in ?? () from /usr/lib/xorg/modules/dri/i965_dri.so
#2 0x0000000000400e5c in main (nargs=1, args=0x7fffffffe8d8) at opengl-segfault.c:145
However, I can more than double the capacity (keeping the number of iterations at 64k) without segfaulting.
GL_VENDOR: Intel Open Source Technology Center
GL_RENDERER: Mesa DRI Intel(R) Haswell Mobile
I had a very similar issue when calling glGenTextures and glBindTexture. I tried debugging and when i would try to step through these lines I would get something like:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff26eaaa8 in ?? () from /usr/lib/x86_64-linux-gnu/dri/i965_dri.so
Note that prior to adding textures, I could successfully run programs with vbos and vaos and generate meshes fine. After looking into the answer suggesting switching from xf86-video-intel driver to xf86-video-fbdev driver, I would advise against it(There really isn't that much info on this issue or users facing segfaults on linux with integrated intel graphics cards. perhaps a good question to ask the folks over at Intel OpenSource).
The solution I found was to stop using freeglut. Switch to glfw instead. Whether there actually is some problem with the intel linux graphics stack is besides the matter, it seems the solvable problem is freeglut. If you want to use glfw with your machines most recent opengl core profile you need just the following:
glfwWindowHint (GLFW_CONTEXT_VERSION_MAJOR, 3);
glfwWindowHint (GLFW_CONTEXT_VERSION_MINOR, 0);
glfwWindowHint (GLFW_OPENGL_FORWARD_COMPAT, GL_TRUE);
Setting forward compat(although ive seen lots of post argueing you shouldnt do this) means mesa is free to select a core context permitted one sets the minimum context to 3.0 or higher. I guess freeglut must be going wrong somewhere in its interactions with mesa, if anyone can share some light on this that would be great!
This is a bug in the intel graphics drivers for Linux. Switching from the xf86-video-intel driver to xf86-video-fbdev driver solves the problem.
Edit: I'm not recommending switching to fbdev, just using it as an experiment to see whether the segfault goes away.
Can anyone shed light on the reason that when the below code is compiled and run on OSX the 'bartender' thread skips through the sem_wait() in what seems like a random manner and yet when compiled and run on a Linux machine the sem_wait() holds the thread until the relative call to sem_post() is made, as would be expected?
I am currently learning not only POSIX threads but concurrency as a whole so absoutely any comments, tips and insights are warmly welcomed...
Thanks in advance.
#include <stdio.h>
#include <stdlib.h>
#include <semaphore.h>
#include <fcntl.h>
#include <unistd.h>
#include <pthread.h>
#include <errno.h>
//using namespace std;
#define NSTUDENTS 30
#define MAX_SERVINGS 100
void* student(void* ptr);
void get_serving(int id);
void drink_and_think();
void* bartender(void* ptr);
void refill_barrel();
// This shared variable gives the number of servings currently in the barrel
int servings = 10;
// Define here your semaphores and any other shared data
sem_t *mutex_stu;
sem_t *mutex_bar;
int main() {
static const char *semname1 = "Semaphore1";
static const char *semname2 = "Semaphore2";
pthread_t tid;
mutex_stu = sem_open(semname1, O_CREAT, 0777, 0);
if (mutex_stu == SEM_FAILED)
{
fprintf(stderr, "%s\n", "ERROR creating semaphore semname1");
exit(EXIT_FAILURE);
}
mutex_bar = sem_open(semname2, O_CREAT, 0777, 1);
if (mutex_bar == SEM_FAILED)
{
fprintf(stderr, "%s\n", "ERROR creating semaphore semname2");
exit(EXIT_FAILURE);
}
pthread_create(&tid, NULL, bartender, &tid);
for(int i=0; i < NSTUDENTS; ++i) {
pthread_create(&tid, NULL, student, &tid);
}
pthread_join(tid, NULL);
sem_unlink(semname1);
sem_unlink(semname2);
printf("Exiting the program...\n");
}
//Called by a student process. Do not modify this.
void drink_and_think() {
// Sleep time in milliseconds
int st = rand() % 10;
sleep(st);
}
// Called by a student process. Do not modify this.
void get_serving(int id) {
if (servings > 0) {
servings -= 1;
} else {
servings = 0;
}
printf("ID %d got a serving. %d left\n", id, servings);
}
// Called by the bartender process.
void refill_barrel()
{
servings = 1 + rand() % 10;
printf("Barrel refilled up to -> %d\n", servings);
}
//-- Implement a synchronized version of the student
void* student(void* ptr) {
int id = *(int*)ptr;
printf("Started student %d\n", id);
while(1) {
sem_wait(mutex_stu);
if(servings > 0) {
get_serving(id);
} else {
sem_post(mutex_bar);
continue;
}
sem_post(mutex_stu);
drink_and_think();
}
return NULL;
}
//-- Implement a synchronized version of the bartender
void* bartender(void* ptr) {
int id = *(int*)ptr;
printf("Started bartender %d\n", id);
//sleep(5);
while(1) {
sem_wait(mutex_bar);
if(servings <= 0) {
refill_barrel();
} else {
printf("Bar skipped sem_wait()!\n");
}
sem_post(mutex_stu);
}
return NULL;
}
The first time you run the program, you're creating named semaphores with initial values, but since your threads never exit (they're infinite loops), you never get to the sem_unlink calls to delete those semaphores. If you kill the program (with ctrl-C or any other way), the semaphores will still exist in whatever state they are in. So if you run the program again, the sem_open calls will succeed (because you don't use O_EXCL), but they won't reset the semaphore value or state, so they might be in some odd state.
So you should make sure to call sem_unlink when the program STARTS, before calling sem_open. Better yet, don't use named semaphores at all -- use sem_init to initialize a couple of unnamed semaphores instead.
I just found out that someone is calling - from a signal handler - a definitely not async-signal-safe function that I wrote.
So, now I'm curious: how to circumvent this situation from happening again? I'd like to be able to easily determine if my code is running in signal handler context (language is C, but wouldn't the solution apply to any language?):
int myfunc( void ) {
if( in_signal_handler_context() ) { return(-1) }
// rest of function goes here
return( 0 );
}
This is under Linux.
Hope this isn't an easy answer, or else I'll feel like an idiot.
Apparently, newer Linux/x86 (probably since some 2.6.x kernel) calls signal handlers from the vdso. You could use this fact to inflict the following horrible hack upon the unsuspecting world:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>
uintmax_t vdso_start = 0;
uintmax_t vdso_end = 0; /* actually, next byte */
int check_stack_for_vdso(uint32_t *esp, size_t len)
{
size_t i;
for (i = 0; i < len; i++, esp++)
if (*esp >= vdso_start && *esp < vdso_end)
return 1;
return 0;
}
void handler(int signo)
{
uint32_t *esp;
__asm__ __volatile__ ("mov %%esp, %0" : "=r"(esp));
/* XXX only for demonstration, don't call printf from a signal handler */
printf("handler: check_stack_for_vdso() = %d\n", check_stack_for_vdso(esp, 20));
}
void parse_maps()
{
FILE *maps;
char buf[256];
char path[7];
uintmax_t start, end, offset, inode;
char r, w, x, p;
unsigned major, minor;
maps = fopen("/proc/self/maps", "rt");
if (maps == NULL)
return;
while (!feof(maps) && !ferror(maps)) {
if (fgets(buf, 256, maps) != NULL) {
if (sscanf(buf, "%jx-%jx %c%c%c%c %jx %u:%u %ju %6s",
&start, &end, &r, &w, &x, &p, &offset,
&major, &minor, &inode, path) == 11) {
if (!strcmp(path, "[vdso]")) {
vdso_start = start;
vdso_end = end;
break;
}
}
}
}
fclose(maps);
printf("[vdso] at %jx-%jx\n", vdso_start, vdso_end);
}
int main()
{
struct sigaction sa;
uint32_t *esp;
parse_maps();
memset(&sa, 0, sizeof(struct sigaction));
sa.sa_handler = handler;
sa.sa_flags = SA_RESTART;
if (sigaction(SIGUSR1, &sa, NULL) < 0) {
perror("sigaction");
exit(1);
}
__asm__ __volatile__ ("mov %%esp, %0" : "=r"(esp));
printf("before kill: check_stack_for_vdso() = %d\n", check_stack_for_vdso(esp, 20));
kill(getpid(), SIGUSR1);
__asm__ __volatile__ ("mov %%esp, %0" : "=r"(esp));
printf("after kill: check_stack_for_vdso() = %d\n", check_stack_for_vdso(esp, 20));
return 0;
}
SCNR.
If we can assume your application doesn't manually block signals using sigprocmask() or pthread_sigmask(), then this is pretty simple: get your current thread ID (tid). Open /proc/tid/status and get the values for SigBlk and SigCgt. AND those two values. If the result of that AND is non-zero, then that thread is currently running from inside a signal handler. I've tested this myself and it works.
There are two proper ways to deal with this:
Have your co-workers stop doing the wrong thing. Good luck pulling this off with the boss, though...
Make your function re-entrant and async-safe. If necessary, provide a function with a different signature (e.g. using the widely-used *_r naming convention) with the additional arguments that are necessary for state preservation.
As for the non-proper way to do this, on Linux with GNU libc you can use backtrace() and friends to go through the caller list of your function. It's not easy to get right, safe or portable, but it might do for a while:
/*
* *** Warning ***
*
* Black, fragile and unportable magic ahead
*
* Do not use this, lest the daemons of hell be unleashed upon you
*/
int in_signal_handler_context() {
int i, n;
void *bt[1000];
char **bts = NULL;
n = backtrace(bt, 1000);
bts = backtrace_symbols(bt, n);
for (i = 0; i < n; ++i)
printf("%i - %s\n", i, bts[i]);
/* Have a look at the caller chain */
for (i = 0; i < n; ++i) {
/* Far more checks are needed here to avoid misfires */
if (strstr(bts[i], "(__libc_start_main+") != NULL)
return 0;
if (strstr(bts[i], "libc.so.6(+") != NULL)
return 1;
}
return 0;
}
void unsafe() {
if (in_signal_handler_context())
printf("John, you know you are an idiot, right?\n");
}
In my opinion, it might just be better to quit rather than be forced to write code like this.
You could work out something using sigaltstack. Set up an alternative signal stack, get the stack pointer in some async-safe way, if within the alternative stack go on, otherwise abort().
I guess you need to do the following. This is a complex solution, which combines the best practices not only from coding, but from software engineering as well!
Persuade your boss that naming convention on signal handlers is a good thing. Propose, for example, a Hungarian notation, and tell that it was used in Microsoft with great success.
So, all signal handlers will start with sighnd, like sighndInterrupt.
Your function that detects signal handling context would do the following:
Get the backtrace().
Look if any of the functions in it begin with sighnd.... If it does, then congratulations, you're inside a signal handler!
Otherwise, you're not.
Try to avoid working with Jimmy in the same company. "There can be only one", you know.
for code optimized at -O2 or better (istr) have found need to add -fno-omit-frame-pointer
else gcc will optimize out the stack context information