multithreaded using _beginthreadex passing parameters - multithreading

Visual Studio C++ 2008
I am using threads. However, I get these warnings and am not sure what I am doing wrong.
1. unsigned int (__stdcall *)(void *) differs in levels of indirection from 'void *(__stdcall *)(void *)'
2. _beginthreadex different types for formal and actual parameters 3
/* Function prototype */
void* WINAPI get_info(void *arg_list)
DWORD thread_hd;
DWORD thread_id;
/* Value to pass to the function */
int th_sock_desc = 0;
/* Create thread */
thread_hd = _beginthreadex(NULL, 0, get_info, (void*) th_sock_desc, 0, &thread_id);
if(thread_hd == 0)
{
/* Failed to create thread */
fprintf(stderr, "[ %s ] [ %s ] [ %d ]\n", strerror(errno), __func__, __LINE__);
return 1;
}

The thread function that you pass to _beginthreadex has a different prototype from the one you pass to _beginthread:
uintptr_t _beginthread(
void( *start_address )( void * ),
unsigned stack_size,
void *arglist
);
uintptr_t _beginthreadex(
void *security,
unsigned stack_size,
unsigned ( *start_address )( void * ),
void *arglist,
unsigned initflag,
unsigned *thrdaddr
);
It's the same as what CreateThread expects,
DWORD WINAPI ThreadProc(
__in LPVOID lpParameter
);
So you need to change the function signature of your thread proc to
unsigned WINAPI get_info(void *arg_list)
Keep WINAPI and change only the return type.
Edit:
An earlier version of this answer said to remove WINAPI. It is actually needed: the docs show the prototype for _beginthreadex without a calling convention, but they explicitly state that __stdcall is required. Your problem is just the return type. The error message also says that __stdcall is expected, so that settles it.

Related

Syscall Hooking on Kali Linux (kernel version 5)

I am trying to set up a hook for the bind() system call on Kali Linux 2021-W1 (Linux kernel version 5), but for some reason, calling the original system call fails and an error occurs.
Here is my code:
/* includes, license, author... */
void **sys_call_table_addr = (void **) 0xffffffff9e0002c0;
int enable_page_rw(void *ptr){
unsigned int level;
pte_t *pte = lookup_address((unsigned long) ptr, &level);
if(pte->pte &~_PAGE_RW){
pte->pte |=_PAGE_RW;
}
return 0;
}
int disable_page_rw(void *ptr){
unsigned int level;
pte_t *pte = lookup_address((unsigned long) ptr, &level);
pte->pte = pte->pte &~_PAGE_RW;
return 0;
}
asmlinkage int (*original_bind) (int, const struct sockaddr *, int);
asmlinkage int log_bind(int sockfd, const struct sockaddr *addr, int addrlen) {
int ret;
printk(KERN_INFO SOCKETLOG "bind was called");
return (*original_bind)(sockfd, addr, addrlen);
}
static int __init socketlog_init(void) {
printk(KERN_INFO SOCKETLOG "socketlog module has been loaded\n");
enable_page_rw(sys_call_table_addr);
original_bind = sys_call_table_addr[__NR_bind];
if (!original_bind) return -1;
sys_call_table_addr[__NR_bind] = log_bind;
disable_page_rw(sys_call_table_addr);
printk(KERN_INFO SOCKETLOG "original_bind = %p", original_bind);
return 0;
}
static void __exit socketlog_exit(void) {
printk(KERN_INFO SOCKETLOG "socketlog module has been unloaded\n");
enable_page_rw(sys_call_table_addr);
sys_call_table_addr[__NR_bind] = original_bind;
disable_page_rw(sys_call_table_addr);
}
module_init(socketlog_init);
module_exit(socketlog_exit);
After executing sudo insmod socketlog.ko, I can see the expected output:
[ +0.000488] [SOCKETLOG] socketlog module has been loaded
[ +0.000002] [SOCKETLOG] original_bind = 00000000bbf288f1
But every time a bind() is called, I get the weird behavior:
[ +0.000488] [SOCKETLOG] bind was called
[ +0.000005] BUG: unable to handle page fault for address: 0000000040697fb8
[ +0.000002] #PF: supervisor read access in kernel mode
[ +0.000001] #PF: error_code(0x0000) - not-present page
As expected 0x0000000040697fb8 is the address pointed to by 0x00000000bbf288f1: the content of the original system call. What am I missing?
Perhaps the way you are wrapping the system call does not work. For example, on Linux 5.4.0-59-generic x86_64 architecture, a system call in the kernel is called through a common wrapper called do_syscall_64(). It passes the parameters through the pt_regs structure to the entry in sys_call_table[]:
__visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
{
struct thread_info *ti;
enter_from_user_mode();
local_irq_enable();
ti = current_thread_info();
if (READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY)
nr = syscall_trace_enter(regs);
if (likely(nr < NR_syscalls)) {
nr = array_index_nospec(nr, NR_syscalls);
regs->ax = sys_call_table[nr](regs); /* <-- call to the entry with the pt_regs structure */
#ifdef CONFIG_X86_X32_ABI
} else if (likely((nr & __X32_SYSCALL_BIT) &&
(nr & ~__X32_SYSCALL_BIT) < X32_NR_syscalls)) {
nr = array_index_nospec(nr & ~__X32_SYSCALL_BIT,
X32_NR_syscalls);
regs->ax = x32_sys_call_table[nr](regs);
#endif
}
syscall_return_slowpath(regs);
}
The pt_regs structure holds the parameters that the user passed to the system call. This may explain the crash: the printk(..."bind was called") works because it does not access the parameters, but the subsequent call to the original system call entry passes three separate arguments where the entry expects a single pointer to a pt_regs structure.
If you look at the source code of bind() system call in net/socket.c, it is defined as:
SYSCALL_DEFINE3(bind, int, fd, struct sockaddr __user *, umyaddr, int, addrlen)
{
return __sys_bind(fd, umyaddr, addrlen);
}
The above macro SYSCALL_DEFINE3() expands into some wrappers which extract the parameters from the pt_regs structure.
So, as an example, here is a fixed version of your module, which works on my 5.4.0-60-generic Ubuntu x86_64:
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <asm/ptrace.h>
#include <linux/socket.h>
#include <linux/kallsyms.h>
MODULE_LICENSE("Dual BSD/GPL");
typedef int (* syscall_wrapper)(struct pt_regs *);
unsigned long sys_call_table_addr;
#define SOCKETLOG "[SOCKETLOG]"
int enable_page_rw(void *ptr){
unsigned int level;
pte_t *pte = lookup_address((unsigned long) ptr, &level);
if(pte->pte &~_PAGE_RW){
pte->pte |=_PAGE_RW;
}
return 0;
}
int disable_page_rw(void *ptr){
unsigned int level;
pte_t *pte = lookup_address((unsigned long) ptr, &level);
pte->pte = pte->pte &~_PAGE_RW;
return 0;
}
syscall_wrapper original_bind;
//asmlinkage int log_bind(int sockfd, const struct sockaddr *addr, int addrlen) {
int log_bind(struct pt_regs *regs) {
printk(KERN_INFO SOCKETLOG "bind was called");
return (*original_bind)(regs);
}
static int __init socketlog_init(void) {
printk(KERN_INFO SOCKETLOG "socketlog module has been loaded\n");
sys_call_table_addr = kallsyms_lookup_name("sys_call_table");
printk(KERN_INFO SOCKETLOG "sys_call_table#%lx\n", sys_call_table_addr);
enable_page_rw((void *)sys_call_table_addr);
original_bind = ((syscall_wrapper *)sys_call_table_addr)[__NR_bind];
if (!original_bind) return -1;
((syscall_wrapper *)sys_call_table_addr)[__NR_bind] = log_bind;
disable_page_rw((void *)sys_call_table_addr);
printk(KERN_INFO SOCKETLOG "original_bind = %p", original_bind);
return 0;
}
static void __exit socketlog_exit(void) {
printk(KERN_INFO SOCKETLOG "socketlog module has been unloaded\n");
enable_page_rw((void *)sys_call_table_addr);
((syscall_wrapper *)sys_call_table_addr)[__NR_bind] = original_bind;
disable_page_rw((void *)sys_call_table_addr);
}
module_init(socketlog_init);
module_exit(socketlog_exit);
With a test:
$ sudo insmod ./bind_ovl.ko
$ dmesg
[ 2253.201888] [SOCKETLOG]socketlog module has been loaded
[ 2253.209486] [SOCKETLOG]sys_call_table#ffffffff88c013a0
[ 2253.209489] [SOCKETLOG]original_bind = 00000000f54304a9
After reloading a web page, for example, I get:
$ dmesg
[ 2136.946042] [SOCKETLOG]socketlog module has been unloaded
[ 2253.201888] [SOCKETLOG]socketlog module has been loaded
[ 2253.209486] [SOCKETLOG]sys_call_table#ffffffff88c013a0
[ 2253.209489] [SOCKETLOG]original_bind = 00000000f54304a9
[ 2281.716581] [SOCKETLOG]bind was called
[ 2295.607476] [SOCKETLOG]bind was called
[ 2301.947866] [SOCKETLOG]bind was called
[ 2304.088116] [SOCKETLOG]bind was called
[ 2309.599634] [SOCKETLOG]bind was called
[ 2310.946833] [SOCKETLOG]bind was called
After unloading the module:
$ sudo rmmod bind_ovl
$ dmesg
[...]
[ 2390.908456] [SOCKETLOG]bind was called
[ 2398.921475] [SOCKETLOG]bind was called
[ 2398.928855] [SOCKETLOG]socketlog module has been unloaded
You can of course enhance the wrapper by displaying the parameters passed to the system call. On x86_64, a system call receives at most 6 parameters, passed in processor registers. We can retrieve them from the pt_regs structure, which is defined in arch/x86/include/asm/ptrace.h as:
struct pt_regs {
/*
* C ABI says these regs are callee-preserved. They aren't saved on kernel entry
* unless syscall needs a complete, fully filled "struct pt_regs".
*/
unsigned long r15;
unsigned long r14;
unsigned long r13;
unsigned long r12;
unsigned long bp;
unsigned long bx;
/* These regs are callee-clobbered. Always saved on kernel entry. */
unsigned long r11;
unsigned long r10;
unsigned long r9;
unsigned long r8;
unsigned long ax;
unsigned long cx;
unsigned long dx;
unsigned long si;
unsigned long di;
/*
* On syscall entry, this is syscall#. On CPU exception, this is error code.
* On hw interrupt, it's IRQ number:
*/
unsigned long orig_ax;
/* Return frame for iretq */
unsigned long ip;
unsigned long cs;
unsigned long flags;
unsigned long sp;
unsigned long ss;
/* top of stack page */
};
The parameter passing convention for a system call is: param#0 to param#5 are respectively passed into the RDI, RSI, RDX, R10, R8 and R9 registers.
According to this rule, for bind() system call, the parameters are in the following registers:
RDI = int (socket descriptor)
RSI = struct sockaddr *addr
RDX = socklen_t addrlen
You can then enhance the log function with something like:
int log_bind(struct pt_regs *regs) {
printk(KERN_INFO SOCKETLOG "bind was called(%d, %p, %u)", (int)(regs->di), (void *)(regs->si), (unsigned int)(regs->dx));
return (*original_bind)(regs);
}
The traces from the module become more detailed:
[ 3259.589915] [SOCKETLOG]socketlog module has been loaded
[ 3259.594631] [SOCKETLOG]sys_call_table#ffffffff88c013a0
[ 3259.594634] [SOCKETLOG]original_bind = 00000000f54304a9
[ 3274.368906] [SOCKETLOG]bind was called(149, 0000000091c163d5, 12)
[ 3276.040330] [SOCKETLOG]bind was called(149, 0000000075b17cb4, 12)
[ 3278.203942] [SOCKETLOG]bind was called(188, 0000000091c163d5, 12)
[ 3287.014980] [SOCKETLOG]bind was called(214, 0000000075b17cb4, 12)
[ 3287.021167] [SOCKETLOG]bind was called(214, 0000000091c163d5, 12)
[ 3298.395713] [SOCKETLOG]bind was called(3, 000000008c2a9103, 12)
[ 3298.403249] [SOCKETLOG]socketlog module has been unloaded

pthread_kill in multithread'd app causes segmentation fault

I found several references to this problem. Some suggested that pthread_kill dereferences the pthread_t structure and that this causes the crash; other articles said this is not a problem as long as the pthread_t structure is created via pthread_create.
Then I found a multi thread example of how to do this correctly :
How to send a signal to a process in C?
However I am still getting seg faults so here is my code example:
static pthread_t GPUthread;
static void GPUsigHandler(int signo)
{
fprintf(stderr, "Queue waking\n");
}
void StartGPUQueue()
{
sigset_t sigmask;
pthread_attr_t attr_obj; /* a thread attribute variable */
struct sigaction action;
/* set up signal mask to block all in main thread */
sigfillset(&sigmask); /* to turn on all bits */
pthread_sigmask(SIG_BLOCK, &sigmask, (sigset_t *)0);
/* set up signal handlers for SIGINT & SIGUSR1 */
action.sa_flags = 0;
action.sa_handler = GPUsigHandler;
sigaction(SIGUSR1, &action, (struct sigaction *)0);
pthread_attr_init(&attr_obj); /* init it to default */
pthread_attr_setdetachstate(&attr_obj, PTHREAD_CREATE_DETACHED);
GPUthread = pthread_create(&GPUthread, &attr_obj, ProcessGPUqueue, NULL);
if (GPUthread != 0)
{
fprintf(stderr, "Cannot start GPU thread\n");
}
}
void ProcessGPUqueue(void *ptr)
{
int sigdummy;
sigset_t sigmask;
sigfillset(&sigmask); /* will unblock all signals */
pthread_sigmask(SIG_UNBLOCK, &sigmask, (sigset_t *)0);
fprintf(stderr, "GPU queue alive\n");
while(queueActive)
{
fprintf(stderr, "Processing GPU queue\n");
while(GPUqueue != NULL)
{
// process stuff
}
sigwait(&sigmask, &sigdummy);
}
}
void QueueGPUrequest(unsigned char cmd, unsigned short p1, unsigned short p2, unsigned short p3, unsigned short p4)
{
// Add request to queue logic ...
fprintf(stderr, "About to Wake GPU queue\n");
pthread_kill(GPUthread, SIGUSR1);// Earth shattering KA-BOOM!!!
}

Where could I find the code of "sched_getcpu()"

I have recently been using the function sched_getcpu() from the header file sched.h on Linux.
Where can I find the source code of this function?
Thanks.
Under Linux, the sched_getcpu() function is a glibc wrapper to sys_getcpu() system call, which is architecture specific.
For the x86_64 architecture, it is defined under arch/x86/include/asm/vgtod.h as __getcpu() (tree 4.x):
#ifdef CONFIG_X86_64
#define VGETCPU_CPU_MASK 0xfff
static inline unsigned int __getcpu(void)
{
unsigned int p;
/*
* Load per CPU data from GDT. LSL is faster than RDTSCP and
* works on all CPUs. This is volatile so that it orders
* correctly wrt barrier() and to keep gcc from cleverly
* hoisting it out of the calling function.
*/
asm volatile ("lsl %1,%0" : "=r" (p) : "r" (__PER_CPU_SEG));
return p;
}
#endif /* CONFIG_X86_64 */
This function is called by __vdso_getcpu(), defined in arch/x86/entry/vdso/vgetcpu.c:
notrace long
__vdso_getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *unused)
{
unsigned int p;
p = __getcpu();
if (cpu)
*cpu = p & VGETCPU_CPU_MASK;
if (node)
*node = p >> 12;
return 0;
}
(See vDSO for details regarding what vdso prefix is).
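From user space none of this machinery is visible: a program simply calls sched_getcpu(), or issues the getcpu system call directly when it also wants the NUMA node. A minimal sketch, assuming Linux with glibc; the helper names current_cpu and cpu_and_node are invented here.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: CPU number as reported by glibc's sched_getcpu(),
 * or -1 on error. */
int current_cpu(void)
{
    return sched_getcpu();
}

/* Hypothetical helper: fetch CPU and NUMA node through the raw getcpu
 * system call. Returns 0 on success, -1 with errno set on failure. */
int cpu_and_node(unsigned *cpu, unsigned *node)
{
    /* The third argument is the unused getcpu_cache pointer. */
    return (int)syscall(SYS_getcpu, cpu, node, NULL);
}
```

The value is only a snapshot: unless the thread is pinned to one CPU, the scheduler may migrate it right after the call returns.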
EDIT 1: (in reply to arm code location)
ARM code location
It can be found in the arch/arm/include/asm/thread_info.h file:
static inline struct thread_info *current_thread_info(void)
{
return (struct thread_info *)
(current_stack_pointer & ~(THREAD_SIZE - 1));
}
This function is used by raw_smp_processor_id() that is defined in the file arch/arm/include/asm/smp.h as:
#define raw_smp_processor_id() (current_thread_info()->cpu)
It is called by the getcpu system call, defined in the file kernel/sys.c:
SYSCALL_DEFINE3(getcpu, unsigned __user *, cpup, unsigned __user *, nodep, struct getcpu_cache __user *, unused)
{
int err = 0;
int cpu = raw_smp_processor_id();
if (cpup)
err |= put_user(cpu, cpup);
if (nodep)
err |= put_user(cpu_to_node(cpu), nodep);
return err ? -EFAULT : 0;
}

Compilation error for multi threaded programs compiled using riscv64-unknown-elf-gcc

/* Includes */
#include <unistd.h> /* Symbolic Constants */
#include <sys/types.h> /* Primitive System Data Types */
#include <errno.h> /* Errors */
#include <stdio.h> /* Input/Output */
#include <stdlib.h> /* General Utilities */
#include <pthread.h> /* POSIX Threads */
#include <string.h> /* String handling */
/* prototype for thread routine */
void print_message_function ( void *ptr );
/* struct to hold data to be passed to a thread
this shows how multiple data items can be passed to a thread */
typedef struct str_thdata
{
int thread_no;
char message[100];
} thdata;
int main()
{
pthread_t thread1, thread2; /* thread variables */
thdata data1, data2; /* structs to be passed to threads */
/* initialize data to pass to thread 1 */
data1.thread_no = 1;
strcpy(data1.message, "Hello!");
/* initialize data to pass to thread 2 */
data2.thread_no = 2;
strcpy(data2.message, "Hi!");
/* create threads 1 and 2 */
pthread_create (&thread1, NULL, (void *) &print_message_function, (void *) &data1);
pthread_create (&thread2, NULL, (void *) &print_message_function, (void *) &data2);
/* Main block now waits for both threads to terminate, before it exits
If main block exits, both threads exit, even if the threads have not
finished their work */
pthread_join(thread1, NULL);
pthread_join(thread2, NULL);
/* exit */
exit(0);
} /* main() */
/**
* print_message_function is used as the start routine for the threads used
* it accepts a void pointer
**/
void print_message_function ( void *ptr )
{
thdata *data;
data = (thdata *) ptr; /* type cast to a pointer to thdata */
/* do the work */
printf("Thread %d says %s \n", data->thread_no, data->message);
pthread_exit(0); /* exit */
} /* print_message_function ( void *ptr ) */
The errors I got were:
thread-ex.c:24:5: error: unknown type name 'pthread_t'
pthread_t thread1, thread2; /* thread variables */
^
thread-ex.c:36:5: warning: implicit declaration of function 'pthread_create' [-Wimplicit-function-declaration]
pthread_create (&thread1, NULL, (void *) &print_message_function, (void *) &data1);
^
thread-ex.c:42:5: warning: implicit declaration of function 'pthread_join' [-Wimplicit-function-declaration]
pthread_join(thread1, NULL);
^
thread-ex.c: In function 'print_message_function':
thread-ex.c:61:5: warning: implicit declaration of function 'pthread_exit' [-Wimplicit-function-declaration]
pthread_exit(0); /* exit */
Pthreads are not available with the newlib-based RISC-V compiler. You will need to use the Linux-based RISC-V compiler (riscv64-unknown-linux-gnu-gcc). You can find instructions to build and install the Linux-based compiler in the same toolchain repo (https://github.com/riscv/riscv-gnu-toolchain).
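Separately from the toolchain problem, note that pthread_create() expects a start routine of type void *(*)(void *), so the (void *) casts in the program above hide a signature mismatch. Here is a sketch of the routine with the expected signature, reusing the thdata struct from the question; run_two_threads is a made-up driver, and the code still needs a toolchain that provides pthreads.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

typedef struct str_thdata {
    int thread_no;
    char message[100];
} thdata;

/* Start routine with the signature pthread_create() expects: no cast needed. */
void *print_message_function(void *ptr)
{
    thdata *data = ptr;

    printf("Thread %d says %s\n", data->thread_no, data->message);
    return NULL;   /* same effect as pthread_exit(NULL) */
}

/* Hypothetical driver: runs two threads and returns 0 if both were
 * created and joined successfully, -1 otherwise. */
int run_two_threads(void)
{
    pthread_t t1, t2;
    thdata d1, d2;
    int r1, r2;

    d1.thread_no = 1;
    strcpy(d1.message, "Hello!");
    d2.thread_no = 2;
    strcpy(d2.message, "Hi!");

    if (pthread_create(&t1, NULL, print_message_function, &d1) != 0)
        return -1;
    if (pthread_create(&t2, NULL, print_message_function, &d2) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }

    /* d1 and d2 live on this stack frame, so join before returning. */
    r1 = pthread_join(t1, NULL);
    r2 = pthread_join(t2, NULL);
    return (r1 == 0 && r2 == 0) ? 0 : -1;
}
```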

Check signature of Linux shared-object before load

Goal: Load .so or executable that has been verified to be signed (or verified against an arbitrary algorithm).
I want to be able to verify a .so/executable and then load/execute that .so/executable with dlopen/...
The wrench in this is that there seems to be no programmatic way to check-then-load. One could check the file manually and then load it afterwards; however, there is a window of opportunity within which someone could swap out that file for another.
One possible solution that I can think of is to load the binary, check the signature and then dlopen/execve the /proc/$PID/fd/... path; however, I do not know if that is a viable solution.
Since filesystem locks are advisory in Linux they are not so useful for this purpose... (well, there's mount -o mand ... but this is something for userlevel, not root use).
Many dynamic linkers (including Glibc's) support setting LD_AUDIT environment variable to a colon-separated list of shared libraries. These libraries are allowed to hook into various locations in the dynamic library loading process.
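#define _GNU_SOURCE
#include <dlfcn.h>
#include <link.h>
#include <stdint.h>
#include <stdlib.h>
/* Placeholder for your own verification routine; not a real library function. */
extern int some_custom_check_on_name_and_contents(const char *name, ElfW(Addr) addr);
unsigned int la_version(unsigned int v) { return v; }
unsigned int la_objopen(struct link_map *l, Lmid_t lmid, uintptr_t *cookie) {
    /* Refuse to continue if the object being loaded fails the check. */
    if (!some_custom_check_on_name_and_contents(l->l_name, l->l_addr))
        abort();
    return 0;
}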
Compile this with cc -shared -fPIC -o test.so test.c or similar.
You can see glibc/elf/tst-auditmod1.c or latrace for more examples, or read the Linkers and Libraries Guide.
Very very specific to Glibc's internals, but you can still hook into libdl at runtime.
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
extern struct dlfcn_hook {
void *(*dlopen)(const char *, int, void *);
int (*dlclose)(void *);
void *(*dlsym)(void *, const char *, void *);
void *(*dlvsym)(void *, const char *, const char *, void *);
char *(*dlerror)(void);
int (*dladdr)(const void *, Dl_info *);
int (*dladdr1)(const void *, Dl_info *, void **, int);
int (*dlinfo)(void *, int, void *, void *);
void *(*dlmopen)(Lmid_t, const char *, int, void *);
void *pad[4];
} *_dlfcn_hook;
static struct dlfcn_hook *old_dlfcn_hook, my_dlfcn_hook;
static int depth;
static void enter(void) { if (!depth++) _dlfcn_hook = old_dlfcn_hook; }
static void leave(void) { if (!--depth) _dlfcn_hook = &my_dlfcn_hook; }
void *my_dlopen(const char *file, int mode, void *dl_caller) {
void *result;
fprintf(stderr, "%s(%s, %d, %p)\n", __func__, file, mode, dl_caller);
enter();
result = dlopen(file, mode);
leave();
return result;
}
int my_dlclose(void *handle) {
int result;
fprintf(stderr, "%s(%p)\n", __func__, handle);
enter();
result = dlclose(handle);
leave();
return result;
}
void *my_dlsym(void *handle, const char *name, void *dl_caller) {
void *result;
fprintf(stderr, "%s(%p, %s, %p)\n", __func__, handle, name, dl_caller);
enter();
result = dlsym(handle, name);
leave();
return result;
}
void *my_dlvsym(void *handle, const char *name, const char *version, void *dl_caller) {
void *result;
fprintf(stderr, "%s(%p, %s, %s, %p)\n", __func__, handle, name, version, dl_caller);
enter();
result = dlvsym(handle, name, version);
leave();
return result;
}
char *my_dlerror(void) {
char *result;
fprintf(stderr, "%s()\n", __func__);
enter();
result = dlerror();
leave();
return result;
}
int my_dladdr(const void *address, Dl_info *info) {
int result;
fprintf(stderr, "%s(%p, %p)\n", __func__, address, info);
enter();
result = dladdr(address, info);
leave();
return result;
}
int my_dladdr1(const void *address, Dl_info *info, void **extra_info, int flags) {
int result;
fprintf(stderr, "%s(%p, %p, %p, %d)\n", __func__, address, info, extra_info, flags);
enter();
result = dladdr1(address, info, extra_info, flags);
leave();
return result;
}
int my_dlinfo(void *handle, int request, void *arg, void *dl_caller) {
int result;
fprintf(stderr, "%s(%p, %d, %p, %p)\n", __func__, handle, request, arg, dl_caller);
enter();
result = dlinfo(handle, request, arg);
leave();
return result;
}
void *my_dlmopen(Lmid_t nsid, const char *file, int mode, void *dl_caller) {
void *result;
fprintf(stderr, "%s(%ld, %s, %d, %p)\n", __func__, nsid, file, mode, dl_caller);
enter();
result = dlmopen(nsid, file, mode);
leave();
return result;
}
static struct dlfcn_hook my_dlfcn_hook = {
.dlopen = my_dlopen,
.dlclose = my_dlclose,
.dlsym = my_dlsym,
.dlvsym = my_dlvsym,
.dlerror = my_dlerror,
.dladdr = my_dladdr,
.dladdr1 = my_dladdr1,
.dlinfo = my_dlinfo,
.dlmopen = my_dlmopen,
.pad = {0, 0, 0, 0},
};
__attribute__((constructor))
static void init(void) {
old_dlfcn_hook = _dlfcn_hook;
_dlfcn_hook = &my_dlfcn_hook;
}
__attribute__((destructor))
static void fini(void) {
_dlfcn_hook = old_dlfcn_hook;
}
$ cc -shared -fPIC -o hook.so hook.c
$ cat > a.c
#include <dlfcn.h>
int main() { dlopen("./hook.so", RTLD_LAZY); dlopen("libm.so", RTLD_LAZY); }
^D
$ cc a.c -ldl
$ ./a.out
my_dlopen(libm.so, 1, 0x80484bd)
Unfortunately, my investigations are leading me to conclude that even if you could hook into glibc/elf/dl-load.c:open_verify() (which you can't), it's not possible to make this race-free against somebody writing over segments of your library.
The problem is essentially unsolvable in the form you've given, because shared objects are loaded by mmap()ing to process memory space. So even if you could make sure that the file that dlopen() operated on was the one you'd examined and declared OK, anyone who can write to the file can modify the loaded object at any time after you've loaded it. (This is why you don't upgrade running binaries by writing to them - instead you delete-then-install, because writing to them would likely crash any running instances).
Your best bet is to ensure that only the user you are running as can write to the file, then examine it, then dlopen() it. Your user (or root) can still sneak different code in, but processes with those permissions could just ptrace() you to do their bidding anyhow.
The DigSig project supposedly solves this at the kernel level.
DigSig currently offers:
run time signature verification of ELF binaries and shared libraries.
support for file's signature revocation.
a signature caching mechanism to enhance performances.
I propose the following solution that should work without libraries *)
int memfd = memfd_create("for-debugging.library.so", MFD_CLOEXEC | MFD_ALLOW_SEALING);
assert(memfd != -1);
// Use any way to read the library from disk and copy the content into memfd
// e.g. write(2) or ftruncate(2) and mmap(2)
// Important! if you use mmap, you have to unmap it before the next step
// fcntl( , , F_SEAL_WRITE) will fail if there exists a writeable mapping
int seals_to_set = F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE | F_SEAL_SEAL;
int sealing_err = fcntl(memfd, F_ADD_SEALS, seals_to_set);
assert(sealing_err == 0);
// Only now verify the contents of the loaded file
// then you can safely *) dlopen("/proc/self/fd/<memfd>");
*) I have not actually tested it against attacks. Do not use in production without further investigation.
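The copy-and-seal step from the fragment above could be filled in roughly as follows. This is a sketch under the same caveat, assuming Linux with a glibc that provides memfd_create(); the function name load_into_sealed_memfd and the memfd label are made up, and the signature check itself is left to the caller.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the file at path into a sealed memfd.
 * Returns the memfd on success, -1 on failure. */
int load_into_sealed_memfd(const char *path)
{
    char buf[65536];
    ssize_t n;
    int src, memfd;

    src = open(path, O_RDONLY | O_CLOEXEC);
    if (src < 0)
        return -1;

    memfd = memfd_create("sealed-copy", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (memfd < 0) {
        close(src);
        return -1;
    }

    /* Plain read/write copy: no mapping to tear down before sealing. */
    while ((n = read(src, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(memfd, buf + off, (size_t)(n - off));
            if (w < 0)
                goto fail;
            off += w;
        }
    }
    if (n < 0)
        goto fail;
    close(src);

    /* Once sealed, the contents can no longer change, so verify after this. */
    if (fcntl(memfd, F_ADD_SEALS,
              F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE | F_SEAL_SEAL) != 0) {
        close(memfd);
        return -1;
    }
    return memfd;

fail:
    close(src);
    close(memfd);
    return -1;
}
```

The caller would then verify the bytes behind the returned descriptor and, if they pass, build the /proc/self/fd/<memfd> path for dlopen() as described above.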
