why is library loaded via LD_PRELOAD operating before initialization? - linux

In the following minimal example, a library loaded via LD_PRELOAD with functions to intercept fopen and openat is apparently operating before its initialization. (Linux is CentOS 7.3). Why??
library file comm.c:
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdarg.h>
#include <stdio.h>
#include <fcntl.h>
typedef FILE *(*fopen_type)(const char *, const char *);
// initialize to invalid value (non-NULL)
// init() should initialize this correctly
fopen_type g_orig_fopen = (fopen_type) 1;
typedef int (*openat_type)(int, const char *, int, ...);
openat_type g_orig_openat;
void init() {
g_orig_fopen = (fopen_type)dlsym(RTLD_NEXT,"fopen");
g_orig_openat = (openat_type)dlsym(RTLD_NEXT,"openat");
}
FILE *fopen(const char *filename, const char *mode) {
// have to do this here because init is not called yet???
FILE * const ret = ((fopen_type)dlsym(RTLD_NEXT,"fopen"))(filename, mode);
printf("g_orig_fopen %p fopen file %s\n", g_orig_fopen, filename);
return ret;
}
int openat(int dirfd, const char* pathname, int flags, ...) {
int fd;
va_list ap;
printf("g_orig_fopen %p openat file %s\n", g_orig_fopen, pathname);
if (flags & (O_CREAT)) {
va_start(ap, flags);
fd = g_orig_openat(dirfd, pathname, flags, va_arg(ap, mode_t));
}
else
fd = g_orig_openat(dirfd, pathname, flags);
return fd;
}
compiled with:
gcc -shared -fPIC -Wl,-init,init -ldl comm.c -o comm.so
I have an empty subdirectory subdir. Then it appears the library function fopen is called before init:
#LD_PRELOAD=./comm.so find subdir
g_orig_fopen 0x1 fopen file /proc/filesystems
g_orig_fopen 0x1 fopen file /proc/mounts
subdir
g_orig_fopen 0x7f7b2e574620 openat file subdir

Obviously, fopen is called before initialization of comm.so. It is interesting to place a breakpoint in fopen() in order to understand (check this link in order to get debug symbols of various packages). I get this backtrace:
(gdb) bt
#0 fopen (filename=0x7ffff79cd2e7 "/proc/filesystems", mode=0x7ffff79cd159 "r") at comm.c:28
#1 0x00007ffff79bdb0e in selinuxfs_exists_internal () at init.c:64
#2 0x00007ffff79b5d98 in init_selinuxmnt () at init.c:99
#3 init_lib () at init.c:154
#4 0x00007ffff7de88aa in call_init (l=<optimized out>, argc=argc#entry=1, argv=argv#entry=0x7fffffffdf58, env=env#entry=0x7fffffffdf68) at dl-init.c:72
#5 0x00007ffff7de89bb in call_init (env=0x7fffffffdf68, argv=0x7fffffffdf58, argc=1, l=<optimized out>) at dl-init.c:30
#6 _dl_init (main_map=0x7ffff7ffe170, argc=1, argv=0x7fffffffdf58, env=0x7fffffffdf68) at dl-init.c:120
#7 0x00007ffff7dd9c5a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#8 0x0000000000000001 in ?? ()
#9 0x00007fffffffe337 in ?? ()
#10 0x0000000000000000 in ?? ()
It is obvious, comm.so depends on other libraries (libdl.so that requires libselinux.so). And comm.so is not the only library that declare an init function. libdl.so and libselinux.so also declare ones.
So, comm.so is the first library to be loaded (because it is declared with LD_PRELOAD) but, comm.so depends on libdl.so (because of -ldl during compilation) and libdl.so depends on libselinux.so. So, in order to load comm.so, init functions from libdl.so and libselinux.so are called before. And finally, init function from libselinux.so call fopen()
Personally, I usually resolve dynamic symbols during first call to the symbol. Like this:
FILE *fopen(const char *filename, const char *mode) {
static FILE *(*real_fopen)(const char *filename, const char *mode) = NULL;
if (!real_fopen)
real_fopen = dlsym(RTLD_NEXT, "fopen");
return real_fopen(filename, mode);
}

Related

Is it safe to call execl on a anonymous elf file?

First create an anonymous memory block by
int fd = memfd_create("", MFD_CLOEXEC);
Note that I pass MFD_CLOEXEC flag.
Then I copy elf file content into this anonymous memory.
The elf is executed like this:
char cmd[128];
sprintf(cmd, "/proc/self/fd/%i", fd);
execl(cmd, "dummy", NULL);
MFD_CLOEXEC means that fd will be closed after execl, but here execl need to load elf content from fd. I do a simple test and it seems OK. But I am not sure it is safe or not.
update:
#define _GNU_SOURCE
#include <sys/mman.h>
#include <unistd.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <fcntl.h>
#include <stdio.h>
extern uint8_t foo_data[] asm("_binary_htop_start");
extern uint8_t foo_data_size[] asm("_binary_htop_size");
extern uint8_t foo_data_end[] asm("_binary_htop_end");
int main(int argc, char **argv)
{
int exefd = memfd_create("", MFD_CLOEXEC);
printf("%p %d %ld\n", foo_data, exefd, write(exefd, foo_data, foo_data_end-foo_data));
char * const vv[] = {"htopp", NULL};
//execveat(exefd, NULL, vv, NULL, AT_EMPTY_PATH);
exefd = syscall(__NR_execveat, exefd, NULL, vv, NULL, AT_EMPTY_PATH);
perror("");
return 0;
}
I try with execveat but fail. syscall set errno to "Bad address", don't know the reason. elf content is generated by objcopy.
It's perfectly fine with real binaries (like ELF files), whether statically or dynamically linked.
MFD_CLOEXEC will not work with executable scripts (eg. files starting with #! /bin/sh). You'll have to omit that flag and leave the fd open in that case.
Instead of your sprintf / execl trick, you should look into using execveat(fd, "", argv, env, AT_EMPTY_PATH) (which unfortunately has the same problem with executable scripts opened with O_CLOEXEC, but doesn't rely on a /proc fs being mounted).
And of course, you should always use (void*)0 instead of NULL with variadic functions like execl() ;-)

Linux addr2line command returns ??:0

I created a simple code in c++ that need to be crashed. I am getting back a backtrace with this error:
/prbsft/bins/Main(_Z5FuncCv+0x14)[0x5571ea64dd80]
Now I'm trying to use addr2line to get the error line in the function.
So I used:
addr2line -e /prbsft/bins/prbMain 0x5594262a8d80
But all I got is 0:??.
I have also tried to use 0x14 address instead of 0x5594262a8d80 but it returns the same result.
I'm using Ubuntu. addr2line version is:
GNU addr2line (GNU Binutils for Ubuntu) 2.30
Any idea?
Here is the output:
Program received signal SIGSEGV, Segmentation fault. 0x0000555555554d80 in FuncC () at main.cpp:34
warning: Source file is more recent than executable.
34 std::cout << k->n << std::endl;
(gdb) bt
#0 0x0000555555554d80 in FuncC () at main.cpp:34
#1 0x0000555555554db1 in FuncB () at main.cpp:39
#2 0x0000555555554dbd in FuncA () at main.cpp:44
#3 0x0000555555554dda in main () at main.cpp:53
The 0x55XXXXXXXXXX address you got is most likely the memory address of the function after the EXE is loaded from disk. addr2line only recognizes VMA addresses like the ones that you got from disassembling with objdump.
Let's call your function Foo. addr2line expects FooVMA or if you're using --section=.text, then Foofile - textfile. Functions like backtrace returns Foomem. One easy way that works most of the time is to calculate FooVMA = Foofile = Foomem - ELFmem. But that assumes VMAbase = 0, which is not true for all linkers (i.e. linker scripts). Examples would be the GCC 5.4 on Ubuntu 16 (0x400000) and clang 11 on MacOS (0x100000000). Here's an example that uses dladdr & dladdr1 to translate it to the VMA address.
#include <execinfo.h>
#include <link.h>
#include <stdlib.h>
#include <stdio.h>
// converts a function's address in memory to its VMA address in the executable file. VMA is what addr2line expects
size_t ConvertToVMA(size_t addr)
{
Dl_info info;
link_map* link_map;
dladdr1((void*)addr,&info,(void**)&link_map,RTLD_DL_LINKMAP);
return addr-link_map->l_addr;
}
void PrintCallStack()
{
void *callstack[128];
int frame_count = backtrace(callstack, sizeof(callstack)/sizeof(callstack[0]));
for (int i = 0; i < frame_count; i++)
{
char location[1024];
Dl_info info;
if(dladdr(callstack[i],&info))
{
// use addr2line; dladdr itself is rarely useful (see doc)
char command[256];
size_t VMA_addr=ConvertToVMA((size_t)callstack[i]);
//if(i!=crash_depth)
VMA_addr-=1; // https://stackoverflow.com/questions/11579509/wrong-line-numbers-from-addr2line/63841497#63841497
snprintf(command,sizeof(command),"addr2line -e %s -Ci %zx",info.dli_fname,VMA_addr);
system(command);
}
}
}
void Foo()
{
PrintCallStack();
}
int main()
{
Foo();
return 0;
}

How to get the reference count on Linux driver level?

In the Linux kernel the opened file is indicated by struct file, and the file descriptor table contains a pointers which is point to struct file. f_count is an important member in the struct file. f_count, which means Reference Count. The system call dup() and fork() make other file descriptor point to same struct file.
As shown in the picture (sorry, my reputation is too low, the picture can not be uploaded), fd1 and fd2 point to the struct file, so the Reference Count is equal to 2, thus f_count = 2.
My question is how can i get the value of the f_count by programming.
UPDATE:ok,In order to make myself more clear i will show my code, both the char device driver,Makefile and my application.:D
deviceDriver.c
#include "linux/kernel.h"
#include "linux/module.h"
#include "linux/fs.h"
#include "linux/init.h"
#include "linux/types.h"
#include "linux/errno.h"
#include "linux/uaccess.h"
#include "linux/kdev_t.h"
#define MAX_SIZE 1024
static int my_open(struct inode *inode, struct file *file);
static int my_release(struct inode *inode, struct file *file);
static ssize_t my_read(struct file *file, char __user *user, size_t t, loff_t *f);
static ssize_t my_write(struct file *file, const char __user *user, size_t t, loff_t *f);
static char message[MAX_SIZE] = "-------congratulations--------!";
static int device_num = 0;//device number
static int counter = 0;
static int mutex = 0;
static char* devName = "myDevice";//device name
struct file_operations pStruct =
{ open:my_open, release:my_release, read:my_read, write:my_write, };
/* regist the module */
int init_module()
{
int ret;
/ **/
ret = register_chrdev(0, devName, &pStruct);
if (ret < 0)
{
printk("regist failure!\n");
return -1;
}
else
{
printk("the device has been registered!\n");
device_num = ret;
printk("<1>the virtual device's major number %d.\n", device_num);
printk("<1>Or you can see it by using\n");
printk("<1>------more /proc/devices-------\n");
printk("<1>To talk to the driver,create a dev file with\n");
printk("<1>------'mknod /dev/myDevice c %d 0'-------\n", device_num);
printk("<1>Use \"rmmode\" to remove the module\n");
return 0;
}
}
void cleanup_module()
{
unregister_chrdev(device_num, devName);
printk("unregister it success!\n");
}
static int my_open(struct inode *inode, struct file *file)
{
if(mutex)
return -EBUSY;
mutex = 1;//lock
printk("<1>main device : %d\n", MAJOR(inode->i_rdev));
printk("<1>slave device : %d\n", MINOR(inode->i_rdev));
printk("<1>%d times to call the device\n", ++counter);
try_module_get(THIS_MODULE);
return 0;
}
/* release */
static int my_release(struct inode *inode, struct file *file)
{
printk("Device released!\n");
module_put(THIS_MODULE);
mutex = 0;//unlock
return 0;
}
static ssize_t my_read(struct file *file, char __user *user, size_t t, loff_t *f)
{
if(copy_to_user(user,message,sizeof(message)))
{
return -EFAULT;
}
return sizeof(message);
}
static ssize_t my_write(struct file *file, const char __user *user, size_t t, loff_t *f)
{
if(copy_from_user(message,user,sizeof(message)))
{
return -EFAULT;
}
return sizeof(message);
}
Makefile:
# If KERNELRELEASE is defined, we've been invoked from the
# kernel build system and can use its language.
ifeq ($(KERNELRELEASE),)
# Assume the source tree is where the running kernel was built
# You should set KERNELDIR in the environment if it's elsewhere
KERNELDIR ?= /lib/modules/$(shell uname -r)/build
# The current directory is passed to sub-makes as argument
PWD := $(shell pwd)
modules:
$(MAKE) -C $(KERNELDIR) M=$(PWD) modules
modules_install:
$(MAKE) -C $(KERNELDIR) M=$(PWD) modules_install
clean:
rm -rf *.o *~ core .depend .*.cmd *.ko *.mod.c .tmp_versions
.PHONY: modules modules_install clean
else
# called from kernel build system: just declare what our modules are
obj-m := devDrv.o
endif
application.c:
#include <sys/types.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#define MAX_SIZE 1024
int main(void)
{
int fd;
char buf[MAX_SIZE];
char get[MAX_SIZE];
char devName[20], dir[50] = "/dev/";
system("ls /dev/");
printf("Please input the device's name you wanna to use :");
gets(devName);
strcat(dir, devName);
fd = open(dir, O_RDWR | O_NONBLOCK);
if (fd != -1)
{
read(fd, buf, sizeof(buf));
printf("The device was inited with a string : %s\n", buf);
/* test fot writing */
printf("Please input a string :\n");
gets(get);
write(fd, get, sizeof(get));
/* test for reading */
read(fd, buf, sizeof(buf));
system("dmesg");
printf("\nThe string in the device now is : %s\n", buf);
close(fd);
return 0;
}
else
{
printf("Device open failed\n");
return -1;
}
}
Any idea to get the struct file's(char device file) f_count? Is it poassible get it by the way of printk?
You should divide reference counter to module from other module and from user-space application. lsmod show how many modules use your module.
sctp 247143 4
libcrc32c 12644 1 sctp
It is impossible load sctp without libcrc32c because sctp use exported function from libcrc32 to calculate control sum for packets.
The reference counter itself is embedded in the module data structure and can be obtained with the function uint module_refcount(struct module* module);
You can use:
printk("Module reference counter: %d\n", (int)module_refcount(THIS_MODULE));
THIS_MODULE it is a reference to current loadable module (for build-in module it is NULL) inside .owner field (inside struct file_operations).
If there is a need manually modify module's counter use:
int try_module_get(struct module* module);
void module_put(struct module* module);
If module unloaded it will return false. You also can move thru all modules. Modules linked via list.
File opening inside kernel it is bad idea. Inside kernel you can get access to inode. Try read man pages for dup and fork. In you system you can investigate lsof tools.

How my custom module on linux 3.2.28 can make a call to print_cpu_info?

I'm going to implement my custom module in which information for CPU is printed using print_cpu_info(). In order to call print_cpu_info(), I have included the needed header file, but it doesn't work. Here is my module.
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <asm/alternative.h>
#include <asm/bugs.h>
#include <asm/processor.h>
#include <asm/mtrr.h>
#include <asm/cacheflush.h>
extern struct cpuinfo_x86 boot_cpu_data;
int cpuinfox86_init(void)
{
print_cpu_info(&boot_cpu_data);
return 0;
}
void cpuinfox86_exit(void)
{
printk("good bye cpu\n");
}
module_init(cpuinfox86_init);
module_exit(cpuinfox86_exit);
MODULE_LICENSE("GPL");
After compiling this module, I get
make -C /lib/modules/3.2.28-2009720166/build SUBDIRS=/home/tracking/1031_oslab modules
make[1]: Entering directory `/usr/src/linux-3.2.28'
CC [M] /home/tracking/1031_oslab/module.o
Building modules, stage 2.
MODPOST 1 modules
WARNING: "print_cpu_info" [/home/tracking/1031_oslab/module.ko] undefined!
CC /home/tracking/1031_oslab/module.mod.o
LD [M] /home/tracking/1031_oslab/module.ko
make[1]: Leaving directory `/usr/src/linux-3.2.28'
any idea?
"print_cpu_info" is not exported symbol, so it can not be used by modules. However, you can use "kallsyms_lookup_name", which is exported, to get the address of "print_cpu_info" and execute the function call using a function pointer.
static void* find_sym( const char *sym ) { // find address kernel symbol sym
static unsigned long faddr = 0; // static !!!
// ----------- nested functions are a GCC extension ---------
int symb_fn( void* data, const char* sym, struct module* mod, unsigned long addr ) {
if( 0 == strcmp( (char*)data, sym ) ) {
faddr = addr;
return 1;
}
else return 0;
};
// --------------------------------------------------------
kallsyms_on_each_symbol( symb_fn, (void*)sym );
return (void*)faddr;
}
static void (*cpuInfo)(struct cpuinfo_x86 *);
How to use it:
unsigned int cpu = 0;
struct cpuinfo_x86 *c;
cpuInfo = find_sym("print_cpu_info");
for_each_online_cpu(cpu)
{
c = &cpu_data(cpu);
cpuInfo(c);
}

ELF entry field and actual program entry

I use the readelf utility to check (-h) an executable file, and i see the e_entry field has the value: 0x8048530 . Then i recompile the checked program and have it to print its own program entry by adding the line: printf("%p\n", (void*)main) and outputs: 0x80485e4. Why do i have this difference? (OS: Linux 32-bit)
The entry point of an executable is usually not main itself but a platform specific function (that we'll call _start) which performs initialization before calling main.
Answering the question "Can i access the _start label from the main body?":
#include <stdio.h>
int main()
{
void* res;
#if defined(__i386__)
asm("movl _start, %%eax" : "=a" (res));
#elif defined(__x86_64__)
asm("movq _start, %%rax" : "=a" (res));
#else
#error Unsupported architecture
#endif
printf("%p\n", res);
return 0;
}

Resources